uhd_tx_streamer_send, uhd_tx_streamer_recv_async_msg - Crashes #171

Closed
nitha2000 opened this issue Apr 23, 2018 · 8 comments

Comments

@nitha2000

Hi,
It looks like this issue is related, or at least similar, to the issues addressed by #134 and #144.
I can reproduce it with very little effort and can provide more info as needed.

CLUES:
I see this issue most often when the frequency separation between TX and RX is large (a non-standard band),
e.g. 1600 MHz (DL) and 700 MHz (UL).

It also happens very often when the bandwidth is 20 MHz.

ENV:
B210 with "linux; GNU C++ version 4.8.4; Boost_105400; UHD_003.010.003.000-0-unknown"
Ubuntu 14.04.5 LTS

CPU:
x64, Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz, Quad-core

   ./srsue() [0x5a70ba]
    /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) [0x7f244be56cb0]
    /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37) [0x7f244be56c37]
    /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) [0x7f244be5a028]
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x155) [0x7f244c45b535]
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e6d6) [0x7f244c4596d6]
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e703) [0x7f244c459703]
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e922) [0x7f244c459922]
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZSt20__throw_length_errorPKc+0x67) [0x7f244c4ab3a7]
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xba262) [0x7f244c4b5262]
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZNSs9_M_mutateEmmm+0x56) [0x7f244c4b53d6]
    /usr/local/lib/libuhd.so.003(uhd_tx_streamer_send+0x3c) [0x7f244a0e0d0c]
    /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/build/lib/src/phy/rf/libsrslte_rf.so(rf_uhd_send_timed_multi+0x194) [0x7f244cc14634]
    /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/build/lib/src/phy/rf/libsrslte_rf.so(srslte_rf_send_timed_multi+0x2e) [0x7f244cc11cce]
    ./srsue(_ZN6srslte5radio2txEPPvj18srslte_timestamp_t+0xd1) [0x613741]
    ./srsue(_ZN6srslte5radio9tx_singleEPvj18srslte_timestamp_t+0x2e) [0x61380e]
    ./srsue(_ZN5srsue11phch_common10worker_endEjbPCfj18srslte_timestamp_t+0xaf) [0x5352bf]
    ./srsue(_ZN5srsue11phch_worker8work_impEv+0x6bc) [0x54301c]
    ./srsue(_ZN6srslte11thread_pool6worker10run_threadEv+0x31) [0x570431]
    ./srsue(_ZN6thread21thread_function_entryEPv+0x9) [0x508bc9]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184) [0x7f244d5e3184]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f244bf1a37d]


    ./srsue() [0x5a70ba]
    /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) [0x7f244be56cb0]
    /usr/local/lib/libuhd.so.003(uhd_tx_streamer_recv_async_msg+0x62) [0x7f244a0e10d2]
    /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/build/lib/src/phy/rf/libsrslte_rf.so(+0x8e25) [0x7f244cc11e25]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184) [0x7f244d5e3184]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f244bf1a37d]
@michael-west
Contributor

@nitha2000 Yes, this looks like a duplicate of #134. It looks like srsLTE may be passing in a bad parameter to the functions or there is some problem with the uhd_tx_streamer_handle (i.e. race condition between initialization and use). If you compile the Debug version of UHD, it should give a much better stack trace. Is that possible for you to do?

@nitha2000
Author

I recompiled the UHD driver with the option "cmake -DCMAKE_BUILD_TYPE=Debug ../". When it crashed, I still got the same kind of output, not the much better stack trace you mentioned. Am I missing anything?

    /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) [0x7f39b5304cb0]
    /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37) [0x7f39b5304c37]
    /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) [0x7f39b5308028]
    /lib/x86_64-linux-gnu/libc.so.6(+0x2fbf6) [0x7f39b52fdbf6]
    /lib/x86_64-linux-gnu/libc.so.6(+0x2fca2) [0x7f39b52fdca2]
    /usr/local/lib/libuhd.so.003(+0x2b5e2b) [0x7f39b3469e2b]
    /usr/local/lib/libuhd.so.003(+0x49fc90) [0x7f39b3653c90]
    /usr/local/lib/libuhd.so.003(uhd_tx_streamer_send+0xdb) [0x7f39b34194cc]
    /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/build/lib/src/phy/rf/libsrslte_rf.so(rf_uhd_send_timed_multi+0x194) [0x7f39b62ca634]
    /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/build/lib/src/phy/rf/libsrslte_rf.so(srslte_rf_send_timed_multi+0x2e) [0x7f39b62c7cce]
    ./srsue(_ZN6srslte5radio2txEPPvj18srslte_timestamp_t+0xd1) [0x615271]
    ./srsue(_ZN6srslte5radio9tx_singleEPvj18srslte_timestamp_t+0x2e) [0x61533e]
    ./srsue(_ZN5srsue11phch_common10worker_endEjbPCfj18srslte_timestamp_t+0xfb) [0x53636b]
    ./srsue(_ZN5srsue11phch_worker8work_impEv+0x6bc) [0x54484c]
    ./srsue(_ZN6srslte11thread_pool6worker10run_threadEv+0x31) [0x571f61]
    ./srsue(_ZN6thread21thread_function_entryEPv+0x9) [0x5099b9]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184) [0x7f39b6c99184]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f39b53c837d]


    ./srsue() [0x5a8bea]
    /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) [0x7f39b5304cb0]
    /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37) [0x7f39b5304c37]
    /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) [0x7f39b5308028]
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x155) [0x7f39b5909535]
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e6d6) [0x7f39b59076d6]
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e703) [0x7f39b5907703]
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5f1bf) [0x7f39b59081bf]
    /usr/local/lib/libuhd.so.003(uhd_tx_streamer_send+0x6d) [0x7f39b341945e]
    /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/build/lib/src/phy/rf/libsrslte_rf.so(rf_uhd_send_timed_multi+0x194) [0x7f39b62ca634]
    /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/build/lib/src/phy/rf/libsrslte_rf.so(srslte_rf_send_timed_multi+0x2e) [0x7f39b62c7cce]
    ./srsue(_ZN6srslte5radio2txEPPvj18srslte_timestamp_t+0xd1) [0x615271]
    ./srsue(_ZN6srslte5radio9tx_singleEPvj18srslte_timestamp_t+0x2e) [0x61533e]
    ./srsue(_ZN5srsue11phch_common10worker_endEjbPCfj18srslte_timestamp_t+0xaf) [0x53631f]
    ./srsue(_ZN5srsue11phch_worker8work_impEv+0x6bc) [0x54484c]
    ./srsue(_ZN6srslte11thread_pool6worker10run_threadEv+0x31) [0x571f61]
    ./srsue(_ZN6thread21thread_function_entryEPv+0x9) [0x5099b9]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184) [0x7f39b6c99184]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f39b53c837d]


    ./srsue() [0x5a8bea]
    /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) [0x7f39b5304cb0]
    /usr/local/lib/libuhd.so.003(uhd_tx_streamer_recv_async_msg+0x76) [0x7f39b341959f]
    /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/build/lib/src/phy/rf/libsrslte_rf.so(+0x8e25) [0x7f39b62c7e25]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184) [0x7f39b6c99184]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f39b53c837d]

@nitha2000
Author

With gdb, I got this backtrace:

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffd52c2700 (LWP 20619)]
0x00007ffff622bc37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt

#0 0x00007ffff622bc37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007ffff622f028 in __GI_abort () at abort.c:89
#2 0x00007ffff6224bf6 in __assert_fail_base (fmt=0x7ffff63753b8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7ffff485c861 "px != 0",
file=file@entry=0x7ffff487bf88 "/usr/include/boost/smart_ptr/intrusive_ptr.hpp", line=line@entry=162,
function=function@entry=0x7ffff48bcec0 <_ZZNK5boost13intrusive_ptrIN3uhd9transport19managed_send_bufferEEptEvE19__PRETTY_FUNCTION__> "T* boost::intrusive_ptr<T>::operator->() const [with T = uhd::transport::managed_send_buffer]") at assert.c:92
#3 0x00007ffff6224ca2 in __GI___assert_fail (assertion=0x7ffff485c861 "px != 0", file=0x7ffff487bf88 "/usr/include/boost/smart_ptr/intrusive_ptr.hpp", line=162,
function=0x7ffff48bcec0 <_ZZNK5boost13intrusive_ptrIN3uhd9transport19managed_send_bufferEEptEvE19__PRETTY_FUNCTION__> "T* boost::intrusive_ptr<T>::operator->() const [with T = uhd::transport::managed_send_buffer]")
at assert.c:101
#4 0x00007ffff4390e2b in boost::intrusive_ptr<uhd::transport::managed_send_buffer>::operator-> (this=) at /usr/include/boost/smart_ptr/intrusive_ptr.hpp:162
#5 0x00007ffff457ac90 in convert_to_in_buff (index=0, this=0x738fb98) at /home/user/Downloads/uhd_debug/uhd-maint/host/lib/usrp/device3/../../transport/super_send_packet_handler.hpp:465
#6 send_one_packet (buffer_offset_bytes=0, timeout=3, if_packet_info=..., nsamps_per_buff=2044, buffs=..., this=0x738fb98)
at /home/user/Downloads/uhd_debug/uhd-maint/host/lib/usrp/device3/../../transport/super_send_packet_handler.hpp:424
#7 send (timeout=3, metadata=..., nsamps_per_buff=2044, buffs=..., this=0x738fb98) at /home/user/Downloads/uhd_debug/uhd-maint/host/lib/usrp/device3/../../transport/super_send_packet_handler.hpp:278
#8 uhd::transport::sph::send_packet_streamer::send (this=0x738fb90, buffs=..., nsamps_per_buff=2044, metadata=..., timeout=3)
at /home/user/Downloads/uhd_debug/uhd-maint/host/lib/usrp/device3/../../transport/super_send_packet_handler.hpp:498
#9 0x00007ffff43404cc in uhd_tx_streamer_send (h=0x730dbf0, buffs=0x7fffd52c15b0, samps_per_buff=2044, md=0x7308780, timeout=3, items_sent=0x7fffd52c1570)
at /home/user/Downloads/uhd_debug/uhd-maint/host/lib/usrp/usrp_c.cpp:236
#10 0x00007ffff71f9619 in rf_uhd_send_timed_multi (h=0x7308750, data=0x7fffd52c16f0, nsamples=11520, secs=91, frac_secs=0.38894716530721063, has_time_spec=true, blocking=true, is_start_of_burst=false,
is_end_of_burst=false) at /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/lib/src/phy/rf/rf_uhd_imp.c:890
#11 0x00007ffff71f70b4 in srslte_rf_send_timed_multi (rf=0x7fffde561080, data=0x7fffd52c16f0, nsamples=11520, secs=91, frac_secs=0.38894716530721063, blocking=true, is_start_of_burst=false, is_end_of_burst=false)
at /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/lib/src/phy/rf/rf_imp.c:311
#12 0x00000000006b77cc in srslte::radio::tx (this=0x7fffde561080, buffer=0x7fffd52c16f0, nof_samples=11520, tx_time=...) at /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/lib/src/radio/radio.cc:227
#13 0x00000000006b75cf in srslte::radio::tx_single (this=0x7fffde561080, buffer=0x2923280 <srsue::zeros>, nof_samples=11520, tx_time=...)
at /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/lib/src/radio/radio.cc:199
#14 0x00000000005b5e7f in srsue::phch_common::worker_end (this=0x7fffecfc1508, tti=0, tx_enable=false, buffer=0x7fffd59d1100, nof_samples=11520, tx_time=...)
at /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/srsue/src/phy/phch_common.cc:250
#15 0x00000000005c23c8 in srsue::phch_worker::work_imp (this=0x7ffff7f0e010) at /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/srsue/src/phy/phch_worker.cc:371
#16 0x00000000005fdaab in srslte::thread_pool::worker::run_thread (this=0x7ffff7f0e010) at /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/lib/src/common/thread_pool.cc:61
#17 0x0000000000580173 in thread::thread_function_entry (_this=0x7ffff7f0e010) at /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/lib/include/srslte/common/threads.h:79
#18 0x00007ffff7bc4184 in start_thread (arg=0x7fffd52c2700) at pthread_create.c:312
#19 0x00007ffff62ef37d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

@michael-west
Contributor

@nitha2000 That is extremely helpful. It looks like the "px != 0" assertion fails when the managed_send_buffer is dereferenced, i.e. the buffer pointer is null at that point. Looking over the source code (specifically host/lib/transport/super_send_packet_handler.hpp), it looks like the only way that can happen is if multiple threads are calling send() on the same streamer object. Is your application calling rf_uhd_send_timed() or rf_uhd_send_timed_multi() from multiple threads?

@nitha2000
Author

From frames #10 and #11 below, it looks like srsLTE is calling rf_uhd_send_timed_multi().
What is the difference between these two calls, and do you have any suggestions to prevent this from happening?
Thanks,

#10 0x00007ffff71f9619 in rf_uhd_send_timed_multi (h=0x7308750, data=0x7fffd52c16f0, nsamples=11520, secs=91, frac_secs=0.38894716530721063, has_time_spec=true, blocking=true, is_start_of_burst=false,
is_end_of_burst=false) at /home/user/Downloads/srsLTE_clone_Mar_08_2018/srsLTE/lib/src/phy/rf/rf_uhd_imp.c:890

#11 0x00007ffff71f70b4 in srslte_rf_send_timed_multi (rf=0x7fffde561080, data=0x7fffd52c16f0, nsamples=11520, secs=91, frac_secs=0.38894716530721063, blocking=true, is_start_of_burst=false, is_end_of_burst=false)

@michael-west
Contributor

Those are two different function calls. The function srslte_rf_send_timed_multi (#11) calls rf_uhd_send_timed_multi (#10). That is OK and is probably what is supposed to happen.

The backtrace is only looking at one thread. I suspect a concurrent thread also called into a send function. It's really the only way this type of error can occur.
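
One way to check this suspicion before changing any locking is to count how many callers are inside the send path at once. The sketch below is only an illustration: entry_guard, callers_inside, overlap_seen and guarded_send are hypothetical names, not part of UHD or srsLTE, and the real send call would have to be placed where the comment indicates.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical diagnostic state (assumption: not part of UHD or srsLTE).
std::atomic<int> callers_inside{0};     // callers currently inside the guarded region
std::atomic<bool> overlap_seen{false};  // set once two callers were inside at the same time

// RAII guard: increments the counter on entry and decrements it on exit.
struct entry_guard {
    entry_guard() {
        // fetch_add returns the previous value; non-zero means someone else is already inside.
        if (callers_inside.fetch_add(1) != 0) {
            overlap_seen.store(true);
        }
    }
    ~entry_guard() { callers_inside.fetch_sub(1); }
};

// Hypothetical wrapper around the send path.
void guarded_send() {
    entry_guard guard;
    // The real call (for example the one made in rf_uhd_send_timed_multi) would go here.
}
```

If overlap_seen ever becomes true during a run, at least two threads were inside the send path simultaneously. The guard only detects overlap; it does not prevent it.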

@nitha2000
Author

I meant to ask what the difference is between these two calls: rf_uhd_send_timed() and rf_uhd_send_timed_multi().
I need to do more tests to confirm, but I think you were right that multiple threads call the send function.
After I reduced the number of worker threads to 1, things look more stable.
Is it possible to call send() or async_receive() from multiple threads? I think it is. How can I make sure, and what mechanism should I use to avoid this dereferencing issue? Thanks,

@michael-west
Contributor

The uhd_tx_streamer_send() and uhd_tx_streamer_recv_async_msg() functions are not thread safe, so it is up to the application to do serialization where necessary. I would recommend putting a lock/mutex in the srsLTE code where those functions are called.
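
A minimal sketch of that kind of serialization, assuming a single mutex guards every call on a shared streamer. The names fake_streamer, unsafe_send, locked_send and run_two_senders are hypothetical stand-ins so the example compiles without UHD; unsafe_send plays the role of uhd_tx_streamer_send(). Whether send and recv_async_msg need the same lock or separate ones is not stated in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a streamer with unsynchronised internal state (NOT the UHD API).
struct fake_streamer {
    std::size_t samples_sent = 0;
};

// Stand-in for the non-thread-safe call.
void unsafe_send(fake_streamer* s, std::size_t nsamples) {
    assert(s != nullptr);
    s->samples_sent += nsamples;  // read-modify-write: a data race if left unguarded
}

// One mutex for the shared streamer; every caller goes through locked_send().
std::mutex tx_mutex;

void locked_send(fake_streamer* s, std::size_t nsamples) {
    std::lock_guard<std::mutex> lock(tx_mutex);  // released automatically on return
    unsafe_send(s, nsamples);
}

// Two worker threads share one streamer; returns the total number of samples recorded.
std::size_t run_two_senders(std::size_t calls_per_thread) {
    fake_streamer s;
    auto worker = [&s, calls_per_thread] {
        for (std::size_t i = 0; i < calls_per_thread; ++i) {
            locked_send(&s, 1);
        }
    };
    std::thread a(worker);
    std::thread b(worker);
    a.join();
    b.join();
    return s.samples_sent;
}
```

With the lock in place, only one thread at a time is inside unsafe_send, so no update to the shared state is lost. In a C source file such as rf_uhd_imp.c, the same pattern would use a pthread mutex instead of std::mutex.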
