Segfault when creating streamers with N310 #520

Open
andrepuschmann opened this issue Nov 4, 2021 · 2 comments

@andrepuschmann
Contributor

Issue Description

The issue only appears when using the N310 in a two-channel configuration. It happens only occasionally, but it is annoying nonetheless because it causes many tests to fail: the eNB/gNB doesn't start up in the first place.

We are using the N310 to test an NSA configuration that uses 2x channels at 15.35 Msps. I've compiled UHD 4.1 in debug mode and got the following backtrace. Unfortunately, not all symbols are present and line numbers aren't shown.

(launched: 2021-11-03_15:12:21.510889)
---  Software Radio Systems LTE eNodeB  ---

Reading configuration file /osmo-gsm-tester-srsenb/srsenb_rfci-slave4-n310_10.12.1.214/srsenb.conf...

Built in Debug mode using commit bcb4b594c on branch disable_backward.

Opening 2 channels in RF device=uhd with args=type=n3xx,tx_subdev_spec=A:0 B:0,rx_subdev_spec=A:0 B:0,None
Available RF device list: UHD  zmq
[INFO] [UHD] linux; GNU C++ version 9.3.0; Boost_107100; UHD_4.1.0.HEAD-0-g25d617ca
[INFO] [LOGGING] Fastpath logging disabled at runtime.
Opening USRP channels=2, args: type=n3xx,tx_subdev_spec=A:0 B:0,rx_subdev_spec=A:0 B:0,None=,master_clock_rate=122.88e6
[INFO] [UHD RF] RF UHD Generic instance constructed
[INFO] [MPMD] Initializing 1 device(s) in parallel with args: mgmt_addr=192.168.20.2,type=n3xx,product=n310,serial=317F537,fpga=HG,claimed=False,addr=192.168.20.2,None=,master_clock_rate=122.88e6
[WARNING] [MPM.RPCServer] A timeout event occured!
[INFO] [MPM.PeriphManager] init() called with device args `None=,fpga=HG,master_clock_rate=122.88e6,mgmt_addr=192.168.20.2,product=n310,clock_source=internal,time_source=internal'.
[WARNING] [RFNOC::GRAPH] One or more blocks timed out during flush!
[INFO] [UHD RF] Setting tx_subdev_spec to 'A:0 B:0'
[INFO] [UHD RF] Setting rx_subdev_spec to 'A:0 B:0'
[INFO] [MULTI_USRP]     1) catch time transition at pps edge
[INFO] [MULTI_USRP]     2) set times next pps (synchronously)
--- command='/osmo-gsm-tester-srsenb/srslte/bin/srsenb /osmo-gsm-tester-srsenb/srsenb_rfci-slave4-n310_10.12.1.214/srsenb.conf' version=21.10.0 signal=11 date='03/11/2021 14:12:31' ---
	/osmo-gsm-tester-srsenb/srslte/bin/srsenb(+0xd88926) [0x55e128d46926]
	/lib/x86_64-linux-gnu/libc.so.6(+0x46210) [0x7f1bcc148210]
	/opt/uhd-4.1/lib/libuhd.so.4.1.0(_ZN3uhd5rfnoc4chdr12mgmt_payload11deserializeEPKmmRKSt8functionIFmmEE+0x31f) [0x7f1bcb501497]
	/opt/uhd-4.1/lib/libuhd.so.4.1.0(+0x2d07aa) [0x7f1bcb56e7aa]
	/opt/uhd-4.1/lib/libuhd.so.4.1.0(+0x2d1497) [0x7f1bcb56f497]
	/opt/uhd-4.1/lib/libuhd.so.4.1.0(+0x2d8e6e) [0x7f1bcb576e6e]
	/opt/uhd-4.1/lib/libuhd.so.4.1.0(+0x26d090) [0x7f1bcb50b090]
	/opt/uhd-4.1/lib/libuhd.so.4.1.0(+0x7c9d4b) [0x7f1bcba67d4b]
	/opt/uhd-4.1/lib/libuhd.so.4.1.0(+0x28682d) [0x7f1bcb52482d]
	/opt/uhd-4.1/lib/libuhd.so.4.1.0(+0x28a12f) [0x7f1bcb52812f]
	/opt/uhd-4.1/lib/libuhd.so.4.1.0(+0x2c17a8) [0x7f1bcb55f7a8]
	/opt/uhd-4.1/lib/libuhd.so.4.1.0(+0x3ccaae) [0x7f1bcb66aaae]
	/osmo-gsm-tester-srsenb/srslte/lib/libsrsran_rf.so.0(_ZN14rf_uhd_generic13get_rx_streamERm+0x1a6) [0x7f1bcc8e11be]
	/osmo-gsm-tester-srsenb/srslte/lib/libsrsran_rf.so.0(+0x77073) [0x7f1bcc8c9073]
	/osmo-gsm-tester-srsenb/srslte/lib/libsrsran_rf.so.0(rf_uhd_open_multi+0x14c) [0x7f1bcc8c9d65]
	/osmo-gsm-tester-srsenb/srslte/lib/libsrsran_rf.so.0(srsran_rf_open_devname+0x141) [0x7f1bcc8c4f42]
	/osmo-gsm-tester-srsenb/srslte/bin/srsenb(_ZN6srsran5radio8open_devERKjRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESA_+0x109) [0x55e128f1c61f]
	/osmo-gsm-tester-srsenb/srslte/bin/srsenb(_ZN6srsran5radio4initERKNS_9rf_args_tEPNS_19phy_interface_radioE+0x4c8) [0x55e128f1a8aa]
	/osmo-gsm-tester-srsenb/srslte/bin/srsenb(_ZN6srsenb3enb4initERKNS_10all_args_tE+0x55f) [0x55e1289daf39]
	/osmo-gsm-tester-srsenb/srslte/bin/srsenb(main+0xb09) [0x55e1289b48e6]
	/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7f1bcc1290b3]
	/osmo-gsm-tester-srsenb/srslte/bin/srsenb(_start+0x2e) [0x55e1289ac95e]
srsRAN crashed. Please send this backtrace to the developers ...
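
Most libuhd frames in this backtrace are raw offsets, but a few frames carry mangled C++ names that can be decoded. The following is a minimal sketch, assuming GCC or Clang with the Itanium C++ ABI; the helper name `demangle` is made up for this example and is not part of UHD or srsRAN.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <memory>
#include <string>

// Decode an Itanium-ABI mangled symbol (as printed in the backtrace).
// Returns the input unchanged if demangling fails.
std::string demangle(const std::string& mangled)
{
    int status = 0;
    // __cxa_demangle returns a malloc()'d buffer, so release it with free().
    std::unique_ptr<char, void (*)(void*)> out(
        abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status),
        std::free);
    return (status == 0 && out) ? std::string(out.get()) : mangled;
}
```

Passing the symbol from the topmost libuhd frame, `_ZN3uhd5rfnoc4chdr12mgmt_payload11deserializeEPKmmRKSt8functionIFmmEE`, yields a name beginning with `uhd::rfnoc::chdr::mgmt_payload::deserialize(`, so the fault is inside the CHDR management payload deserializer.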

Setup Details

  • srsRAN 21.10
  • 2x RF channel
  • UHD_4.1.0.HEAD-0-g25d617ca

Expected Behavior

No UHD crash when starting eNB.

Actual Behaviour

UHD segfaults occasionally.

Steps to reproduce the problem

I have not been able to reproduce the issue with the UHD examples, but the srsRAN appnote for running COTS UEs here contains all config steps. The UHD device args for the N310 are shown at the end of that document.

Note that you don't need a COTS UE or even a core network. Just starting the eNB with this config crashes UHD every so often.
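
Because the crash is intermittent, it can help to start the eNB repeatedly and count how often it dies. The following is a rough POSIX sketch, not something from the srsRAN or UHD trees: `run_once` and `count_segfaults` are hypothetical helpers, and the code assumes the process is actually terminated by SIGSEGV. A binary that catches the signal and exits with a status code would need its output checked instead.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <csignal>
#include <string>
#include <vector>

// Launch argv once. Returns the terminating signal number, 0 if the child
// exited on its own (including exec failure) or survived the startup window,
// and -1 on fork/wait errors.
int run_once(const std::vector<std::string>& argv, int timeout_s)
{
    std::vector<char*> args;
    for (const auto& a : argv) args.push_back(const_cast<char*>(a.c_str()));
    args.push_back(nullptr);

    const pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execvp(args[0], args.data());
        _exit(127); // only reached if exec failed
    }
    // Poll every 100 ms until the child ends or the startup window elapses.
    for (int waited_ms = 0;; waited_ms += 100) {
        int status = 0;
        const pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid) return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
        if (r < 0) return -1;
        if (waited_ms >= timeout_s * 1000) break;
        usleep(100 * 1000);
    }
    // Startup succeeded: ask the process to stop and reap it.
    // Assumes the child honours SIGTERM; otherwise this wait blocks.
    kill(pid, SIGTERM);
    waitpid(pid, nullptr, 0);
    return 0;
}

// Count how many of `runs` launches ended with a segmentation fault.
int count_segfaults(const std::vector<std::string>& argv, int runs, int timeout_s)
{
    int n = 0;
    for (int i = 0; i < runs; ++i)
        if (run_once(argv, timeout_s) == SIGSEGV) ++n;
    return n;
}
```

A call such as `count_segfaults({"srsenb", "enb.conf"}, 50, 20)` would then give a crash rate that can be compared across UHD builds; the run count and the 20-second window are arbitrary placeholders.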

Additional Information

Let me know if you need further details or want me to compile with different flags to maybe get more debug info.

@andrepuschmann
Contributor Author

Here is another segfault with a stack trace, which I believe is the same issue:

---  Software Radio Systems LTE eNodeB  ---

Reading configuration file enb.conf...

Built in Release mode using commit 0967cda04 on branch dev.

Opening 2 channels in RF device=uhd with args=type=n3xx,tx_subdev_spec=A:0 B:0,rx_subdev_spec=A:0 B:0
Available RF device list: UHD  soapy  zmq
[INFO] [UHD] linux; GNU C++ version 9.3.0; Boost_107100; UHD_4.1.0.2-1-gceac1bdd
[INFO] [LOGGING] Fastpath logging disabled at runtime.
Opening USRP channels=2, args: type=n3xx,tx_subdev_spec=A:0 B:0,rx_subdev_spec=A:0 B:0,master_clock_rate=122.88e6
[INFO] [UHD RF] RF UHD Generic instance constructed
[INFO] [MPMD] Initializing 1 device(s) in parallel with args: mgmt_addr=192.168.20.2,type=n3xx,product=n310,serial=317F537,fpga=HG,claimed=False,addr=192.168.20.2,master_clock_rate=122.88e6
[WARNING] [MPM.RPCServer] A timeout event occured!
[INFO] [MPM.PeriphManager] init() called with device args `fpga=HG,master_clock_rate=122.88e6,mgmt_addr=192.168.20.2,product=n310,clock_source=internal,time_source=internal'.
[WARNING] [RFNOC::GRAPH] One or more blocks timed out during flush!
[INFO] [UHD RF] Setting tx_subdev_spec to 'A:0 B:0'
[INFO] [UHD RF] Setting rx_subdev_spec to 'A:0 B:0'
[INFO] [MULTI_USRP]     1) catch time transition at pps edge
[INFO] [MULTI_USRP]     2) set times next pps (synchronously)
Stack trace (most recent call last):
#19   Object "", at 0xffffffffffffffff, in
#18   Object "/home/anpu/src/srsLTE/build_release/srsenb/src/srsenb", at 0x56079edc00bd, in _start
#17   Source "../csu/libc-start.c", line 308, in __libc_start_main [0x7f9c743730b2]
#16   Object "/home/anpu/src/srsLTE/build_release/srsenb/src/srsenb", at 0x56079edbd888, in main
#15   Object "/home/anpu/src/srsLTE/build_release/srsenb/src/srsenb", at 0x56079eddc917, in srsenb::enb::init(srsenb::all_args_t const&)
#14   Object "/home/anpu/src/srsLTE/build_release/srsenb/src/srsenb", at 0x56079f208fd7, in srsran::radio::init(srsran::rf_args_t const&, srsran::phy_interface_radio*)
#13   Object "/home/anpu/src/srsLTE/build_release/srsenb/src/srsenb", at 0x56079f200391, in srsran::radio::open_dev(unsigned int const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#12   Object "/home/anpu/src/srsLTE/build_release/lib/src/phy/rf/libsrsran_rf.so.21.10.0", at 0x7f9c74b1b1fb, in rf_uhd_open_multi
#11   Object "/home/anpu/src/srsLTE/build_release/lib/src/phy/rf/libsrsran_rf.so.21.10.0", at 0x7f9c74b19cc9, in uhd_init(rf_uhd_handler_t*, char*, unsigned int)
#10   Object "/home/anpu/src/srsLTE/build_release/lib/src/phy/rf/libsrsran_rf.so.21.10.0", at 0x7f9c74b27952, in rf_uhd_generic::get_rx_stream(unsigned long&)
#9    Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c739fdc57, in multi_usrp_rfnoc::get_rx_stream(uhd::stream_args_t const&)
#8    Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c73907475, in rfnoc_graph_impl::connect(uhd::rfnoc::block_id_t const&, unsigned long, std::shared_ptr<uhd::rx_streamer>, unsigned long, unsigned long)
#7    Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c738d0234, in graph_stream_manager_impl::create_device_to_host_data_stream(std::pair<unsigned short, unsigned short>, uhd::rfnoc::sw_buff_t, uhd::rfnoc::sw_buff_t, unsigned long, uhd::device_addr_t const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#6    Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c738ce2f5, in link_stream_manager_impl::create_device_to_host_data_stream(std::pair<unsigned short, unsigned short>, uhd::rfnoc::sw_buff_t, uhd::rfnoc::sw_buff_t, uhd::device_addr_t const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#5    Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c73d33b3e, in uhd::mpmd::mpmd_mboard_impl::mpmd_mb_iface::make_rx_data_transport(uhd::rfnoc::mgmt::mgmt_portal&, std::pair<std::pair<unsigned short, unsigned short>, std::pair<unsigned short, unsigned short> > const&, std::pair<unsigned short, unsigned short> const&, uhd::rfnoc::sw_buff_t, uhd::rfnoc::sw_buff_t, uhd::device_addr_t const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#4    Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c738b8019, in uhd::rfnoc::chdr_rx_data_xport::configure_sep(std::shared_ptr<uhd::transport::io_service>, std::shared_ptr<uhd::transport::recv_link_if>, std::shared_ptr<uhd::transport::send_link_if>, uhd::rfnoc::chdr::chdr_packet_factory const&, uhd::rfnoc::mgmt::mgmt_portal&, std::pair<unsigned short, unsigned short> const&, uhd::rfnoc::sw_buff_t, uhd::rfnoc::sw_buff_t, uhd::rfnoc::stream_buff_params_t const&, uhd::rfnoc::stream_buff_params_t const&, uhd::rfnoc::stream_buff_params_t const&, bool, std::function<void ()>)
#3    Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c7391c0a4, in uhd::rfnoc::mgmt::mgmt_portal_impl::config_local_rx_stream_commit(uhd::rfnoc::chdr_ctrl_xport&, unsigned short const&, double, bool)
#2    Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c739164c6, in uhd::rfnoc::mgmt::mgmt_portal_impl::_get_ostrm_status(uhd::rfnoc::chdr_ctrl_xport&, std::vector<std::pair<uhd::rfnoc::mgmt::node_id_t, int>, std::allocator<std::pair<uhd::rfnoc::mgmt::node_id_t, int> > > const&)
#1    Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c73910f54, in uhd::rfnoc::mgmt::mgmt_portal_impl::_send_recv_mgmt_transaction(uhd::rfnoc::chdr_ctrl_xport&, uhd::rfnoc::chdr::mgmt_payload const&, double) [clone .constprop.0]
#0    Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c738ae695, in uhd::rfnoc::chdr::mgmt_payload::deserialize(unsigned long const*, unsigned long, std::function<unsigned long (unsigned long)> const&)
Segmentation fault (Address not mapped to object [0x5607ba1e9000])
Segmentation fault

@wkunice
wkunice commented Oct 11, 2022

I am seeing a similar crash while testing 4.2. It is in deserialize. It looks like the message length is very large.
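
To illustrate why an oversized length field would produce an "Address not mapped" fault, here is a small sketch of a parser that validates the length against the received buffer before reading. The packet layout and the function `parse_words` are hypothetical and chosen only for illustration; this is not the real CHDR management payload format, nor UHD's `mgmt_payload::deserialize`.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical layout (NOT the real CHDR format): the low 16 bits of word 0
// hold the number of payload words that follow.
std::vector<uint64_t> parse_words(const uint64_t* buff, std::size_t buff_words)
{
    if (buff == nullptr || buff_words == 0) {
        throw std::invalid_argument("empty buffer");
    }
    const std::size_t claimed = static_cast<std::size_t>(buff[0] & 0xFFFF);
    // The length comes off the wire, so compare it with what was actually
    // received. Without this check the copy below would read past the end
    // of the buffer whenever the field is corrupt or stale.
    if (claimed > buff_words - 1) {
        throw std::length_error("length field exceeds received buffer");
    }
    return std::vector<uint64_t>(buff + 1, buff + 1 + claimed);
}
```

With a check of this kind, a bogus length surfaces as an exception that the caller can log or retry, rather than as a segfault during streamer creation.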
