OpenThread/Multi-Protocol addon starts erroring and bootlooping after a couple days #2848

Closed
dcmeglio opened this issue Jan 23, 2023 · 29 comments

@dcmeglio

Not entirely sure of the root cause. I had everything working last night, woke up, and it's all broken. This is the third time this has happened, and the only solution I've found is a full power-down (even an RPi reboot doesn't help). Here are the add-on logs from a fresh add-on restart. It just repeats the last two lines over and over until the watchdog restarts it, continually.

Full log after a fresh restart:

s6-rc: info: service mdns: starting
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service mdns successfully started
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
cont-init: info: running /etc/cont-init.d/check-cpcd-shm.sh
[15:18:59] INFO: Starting mDNS Responder...
Default: mDNSResponder (Engineering Build) (Jan 12 2023 14:23:29) starting
Default: mDNS_AddDNSServer: Lock not held! mDNS_busy (0) mDNS_reentrancy (0)
cont-init: info: /etc/cont-init.d/check-cpcd-shm.sh exited 0
cont-init: info: running /etc/cont-init.d/config.sh
[15:19:00] INFO: Generating cpcd configuration.
cont-init: info: /etc/cont-init.d/config.sh exited 0
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service banner: starting
-----------------------------------------------------------
 Add-on: Silicon Labs Multiprotocol
 Zigbee and OpenThread multiprotocol add-on
-----------------------------------------------------------
 Add-on version: 0.11.4
 You are running the latest version of this add-on.
 System: Home Assistant OS 9.4  (aarch64 / raspberrypi4-64)
 Home Assistant Core: 2023.1.7
 Home Assistant Supervisor: 2023.01.0
-----------------------------------------------------------
 Please, share the above information when looking for help
 or support in, e.g., GitHub, forums or the Discord chat.
-----------------------------------------------------------
s6-rc: info: service banner successfully started
s6-rc: info: service universal-silabs-flasher: starting
[15:19:03] INFO: Flashing firmware is disabled
s6-rc: info: service universal-silabs-flasher successfully started
s6-rc: info: service cpcd: starting
[15:19:03] INFO: Starting cpcd...
WARNING in function 'main' in file /usr/src/cpc-daemon/main.c at line #188 : Running CPCd as 'root' is not recommended. Proceed at your own risk.
[15:19:03:992] Info : [CPCd v4.2.0.0] [Library API v3] [RCP Protocol v3]
[15:19:03:992] Info : Git commit: 2036da8fa5aa7bd42b127b5bb603cab7a49e6fcd / branch: 
[15:19:03:992] Info : Sources hash: 5454a76205641ee86fd40f458cb0920c0d010fec5702a25faa3c902b7e119596
[15:19:03:992] WARNING : In function 'main' in file /usr/src/cpc-daemon/main.c at line #188 : Running CPCd as 'root' is not recommended. Proceed at your own risk.
[15:19:03:992] Info : Reading cli arguments
[15:19:03:992] Info : /usr/local/bin/cpcd 
[15:19:03:998] Info : Reading configuration
[15:19:03:998] Info : file_path = /usr/local/etc/cpcd.conf
[15:19:03:998] Info : instance_name = cpcd_0
[15:19:03:998] Info : socket_folder = /dev/shm
[15:19:03:998] Info : operation_mode = MODE_NORMAL
[15:19:03:998] Info : use_encryption = false
[15:19:03:998] Info : binding_key_file = /etc/binding-key.key
[15:19:03:998] Info : binding_key_override = false
[15:19:03:998] Info : binding_method = 
[15:19:03:998] Info : stdout_tracing = false
[15:19:03:998] Info : file_tracing = false
[15:19:03:998] Info : lttng_tracing = false
[15:19:03:998] Info : enable_frame_trace = false
[15:19:03:998] Info : traces_folder = /dev/shm/cpcd-traces
[15:19:03:998] Info : bus = UART
[15:19:03:998] Info : uart_baudrate = 115200
[15:19:03:998] Info : uart_hardflow = true
[15:19:03:998] Info : uart_file = /dev/ttyUSB2
[15:19:03:999] Info : spi_file = /dev/spidev0.0
[15:19:03:999] Info : spi_bitrate = 1000000
[15:19:03:999] Info : spi_mode = SPI_MODE_0
[15:19:03:999] Info : spi_bit_per_word = 8
[15:19:03:999] Info : spi_cs_chip = gpiochip0
[15:19:03:999] Info : spi_cs_pin = 8
[15:19:03:999] Info : spi_irq_chip = gpiochip0
[15:19:03:999] Info : spi_irq_pin = 22
[15:19:03:999] Info : fu_reset_chip = gpiochip0
[15:19:03:999] Info : fu_spi_reset_pin = 23
[15:19:03:999] Info : fu_wake_chip = gpiochip0
[15:19:03:999] Info : fu_spi_wake_pin = 24
[15:19:03:999] Info : fu_recovery_enabled = false
[15:19:03:999] Info : fu_connect_to_bootloader = false
[15:19:03:999] Info : fu_enter_bootloader = false
[15:19:03:999] Info : fu_file = 
[15:19:03:999] Info : fu_restart_daemon = false
[15:19:03:999] Info : board_controller_ip_addr = 
[15:19:03:999] Info : application_version_validation = false
[15:19:03:999] Info : print_secondary_versions_and_exit = false
[15:19:03:999] Info : use_noop_keep_alive = false
[15:19:03:999] Info : reset_sequence = true
[15:19:03:999] Info : uart_validation_test_option = 
[15:19:03:999] Info : stats_interval = 0
[15:19:03:999] Info : rlimit_nofile = 2000
[15:19:03:999] Info : ENCRYPTION IS DISABLED 
[15:19:03:999] Info : Starting daemon in normal mode
[15:19:04:017] Info : Connecting to Secondary...
[15:19:06:018] Info : Failed to connect, secondary seems unresponsive
[15:19:06:018] Info : Connecting to Secondary...
[15:19:08:018] Info : Failed to connect, secondary seems unresponsive
[15:19:08:018] Info : Connecting to Secondary...
[15:19:10:018] Info : Failed to connect, secondary seems unresponsive
[15:19:10:018] Info : Connecting to Secondary...
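For anyone trying to confirm they are hitting the same loop, here is a minimal Python sketch that counts the "Failed to connect" lines in a saved copy of the add-on log. The function names and the threshold of three failures are assumptions made for illustration; they are not part of cpcd or the add-on.

```python
import re

# Matches cpcd lines such as:
# [15:19:06:018] Info : Failed to connect, secondary seems unresponsive
FAILED_CONNECT = re.compile(
    r"^\[\d{2}:\d{2}:\d{2}:\d{3}\] Info : "
    r"Failed to connect, secondary seems unresponsive"
)


def count_failed_connects(log_text: str) -> int:
    """Count how many times cpcd reported an unresponsive secondary."""
    return sum(1 for line in log_text.splitlines() if FAILED_CONNECT.match(line))


def looks_stuck(log_text: str, threshold: int = 3) -> bool:
    """Hypothetical heuristic: `threshold` or more failures means a stuck radio."""
    return count_failed_connects(log_text) >= threshold


# Tail of the log posted above.
sample = """\
[15:19:04:017] Info : Connecting to Secondary...
[15:19:06:018] Info : Failed to connect, secondary seems unresponsive
[15:19:06:018] Info : Connecting to Secondary...
[15:19:08:018] Info : Failed to connect, secondary seems unresponsive
[15:19:08:018] Info : Connecting to Secondary...
[15:19:10:018] Info : Failed to connect, secondary seems unresponsive
[15:19:10:018] Info : Connecting to Secondary...
"""

print(count_failed_connects(sample), looks_stuck(sample))
```

This only recognizes the symptom in the log text; it says nothing about why the secondary stopped responding.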
@puddly
Collaborator

puddly commented Jan 23, 2023

Is this with a SkyConnect?

@dcmeglio
Author

Yes, sorry. For the record, I do NOT have a Zigbee network set up on the SkyConnect; I am using it solely for Thread.

@agners agners transferred this issue from home-assistant/addons-development Jan 23, 2023
@dcmeglio
Author

dcmeglio commented Jan 23, 2023

I can't say if this is a related issue or just coincidence, but after updating the addon to 0.12.0, it restarted the addon. It is now stuck in a loop saying:

[18:48:23:499] *** ASSERT *** : FATAL in function 'protocol_version_check' in file /usr/src/cpc-daemon/server_core/server_core.c at line #610 : Secondary Protocol v3 doesn't match CPCd Protocol v2
[18:48:24:500] Info : Daemon exiting with status EXIT_FAILURE
Logger buffer size = 28672, highwater mark = 2765 : 9.64%. Lost logs : 0
[18:48:24] INFO: CPC ended with exit code 1 (signal 0)...
[18:48:24] INFO: Starting cpcd...
FATAL in function 'protocol_version_check' in file /usr/src/cpc-daemon/server_core/server_core.c at line #610 : Secondary Protocol v3 doesn't match CPCd Protocol v2

Edit: I think this part may be because I had the auto firmware updater shut off? A better error message might be helpful there.
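As a rough illustration of what a friendlier message could look like, the Python sketch below pulls the two protocol versions out of the assert line and rephrases it. The helper names and the wording of the hint are assumptions; this is not how cpcd or the add-on actually report the error, and the suggested cause is only the guess from the edit above.

```python
import re
from typing import Optional, Tuple

# Matches the tail of the cpcd assert line quoted above.
MISMATCH = re.compile(r"Secondary Protocol v(\d+) doesn't match CPCd Protocol v(\d+)")


def parse_mismatch(line: str) -> Optional[Tuple[int, int]]:
    """Return (secondary_version, cpcd_version), or None if the line does not match."""
    m = MISMATCH.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def explain(line: str) -> Optional[str]:
    """Hypothetical helper: turn the assert into a more readable hint."""
    versions = parse_mismatch(line)
    if versions is None:
        return None
    secondary, daemon = versions
    return (
        f"The radio firmware speaks CPC protocol v{secondary}, but cpcd expects "
        f"v{daemon}. Possible cause (unconfirmed): automatic firmware flashing "
        f"is disabled, so the stick and the add-on are out of sync."
    )


line = (
    "[18:48:23:499] *** ASSERT *** : FATAL in function 'protocol_version_check' "
    "in file /usr/src/cpc-daemon/server_core/server_core.c at line #610 : "
    "Secondary Protocol v3 doesn't match CPCd Protocol v2"
)
print(explain(line))
```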

@puddly
Collaborator

puddly commented Jan 24, 2023

> Edit: I think this part may be because I had the auto firmware updater shut off? A better error message might be helpful there.

I'd leave it at the default value. Allowing auto-flashing to be disabled is really there for debugging purposes or if you run the multi-PAN addon with another Zigbee stick that needs its own firmware.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Feb 23, 2023
@github-actions github-actions bot closed this as not planned Mar 2, 2023
@daveliang

I also encountered a similar problem. You need to flash the bootloader + RCP application.

@daveliang

Did you flash the bootloader?

@cmatte

cmatte commented Apr 5, 2023

@daveliang can you expand on what the solution was? I see the same issue as above and have followed all the steps.

@dvhub

dvhub commented May 6, 2023

I have the same issue. @daveliang , what was your solution with the SkyConnect?

@daveliang

daveliang commented May 8, 2023

> I have the same issue. @daveliang , what was your solution with the SkyConnect?

I am using the Silicon Labs OTBR solution:
https://openthread.google.cn/vendors/silicon-labs
https://www.silabs.com/documents/public/application-notes/an1333-concurrent-protocols-with-802-15-4-rcp.pdf

@marcelmah

I have the same... posted on the forums, no reply unfortunately...
https://community.home-assistant.io/t/skyconnect-checksum-errors-in-silicon-labs-multiprotocol-logs/612984

Will this get any attention, given that the issue is marked as closed?

@marcelmah

Pfff, I just noticed a document from Silicon Labs mentioning an incompatibility of Bluetooth devices with Thread.
I removed my USB passthrough from my Home Assistant VM running in ESXi 7 U3o and restarted the Silicon Labs Multiprotocol add-on. The error is gone... so this is not a 'fix', but it might point us in the right direction.

@fuomag9

fuomag9 commented Nov 3, 2023

> Pfff, I just noticed a document from Silicon Labs mentioning an incompatibility of Bluetooth devices with Thread. I removed my USB passthrough from my Home Assistant VM running in ESXi 7 U3o and restarted the Silicon Labs Multiprotocol add-on. The error is gone... so this is not a 'fix', but it might point us in the right direction.

Disabling Bluetooth on my Pi 4 did not help.

@marcelmah

> > Pfff, I just noticed a document from Silicon Labs mentioning an incompatibility of Bluetooth devices with Thread. I removed my USB passthrough from my Home Assistant VM running in ESXi 7 U3o and restarted the Silicon Labs Multiprotocol add-on. The error is gone... so this is not a 'fix', but it might point us in the right direction.

> Disabling Bluetooth on my Pi 4 did not help.

Yeah, did not last very long for me either. Think I just had some brief luck...

@fuomag9

fuomag9 commented Nov 3, 2023

> > > Pfff, I just noticed a document from Silicon Labs mentioning an incompatibility of Bluetooth devices with Thread. I removed my USB passthrough from my Home Assistant VM running in ESXi 7 U3o and restarted the Silicon Labs Multiprotocol add-on. The error is gone... so this is not a 'fix', but it might point us in the right direction.

> > Disabling Bluetooth on my Pi 4 did not help.

> Yeah, did not last very long for me either. Think I just had some brief luck...

For now I'm trying to see if things work better in Thread-only mode (so no Silicon Labs add-on), since I was using a ConBee II before and can keep the devices separate.

Seems to be working for now, but we'll see in a few hours.

@kvithayathil

Looks like I'm having the same issue right now with a new SkyConnect.

@gpranzo

gpranzo commented Dec 3, 2023

Same for me. Grrrr....

In the Supervisor logs I get these messages looping:
23-12-03 02:04:12 WARNING (MainThread) [supervisor.addons.addon] Watchdog found addon OpenThread Border Router is failed, restarting...
23-12-03 02:04:12 INFO (SyncWorker_4) [supervisor.docker.manager] Cleaning addon_core_openthread_border_router application
23-12-03 02:04:13 INFO (MainThread) [supervisor.docker.addon] Starting Docker add-on homeassistant/aarch64-addon-otbr with version 2.3.2

In OpenThread Border Router Logs I get this message:
2023-12-03 02:17:19 GPhomeassistant universal_silabs_flasher.cpc[150] WARNING Failed to parse buffer bytearray(b'\x14\x0c\x03\x00\x97`h\x01!\x10'): ValueError('Unsupported frame type: <FrameType.SUPERVISORY: 2>')

In SiliconLabs Multiprotocol Logs I get this:
WARNING : In function 'core_process_rx_s_frame' in file /usr/src/cpc-daemon/server_core/core/core.c at line #818 : Remote received a packet with an invalid checksum
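To see how often the watchdog is kicking in, and for which add-on, a saved Supervisor log can be scanned with something like the Python sketch below. It assumes the "Watchdog found addon ... is failed, restarting" wording shown above; the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches Supervisor lines such as:
# ... Watchdog found addon OpenThread Border Router is failed, restarting...
WATCHDOG = re.compile(r"Watchdog found addon (?P<name>.+?) is failed, restarting")


def watchdog_restarts(log_text: str) -> Counter:
    """Count watchdog-triggered restarts per add-on name."""
    counts = Counter()
    for line in log_text.splitlines():
        m = WATCHDOG.search(line)
        if m:
            counts[m.group("name")] += 1
    return counts


# The three Supervisor lines posted above.
sample = """\
23-12-03 02:04:12 WARNING (MainThread) [supervisor.addons.addon] Watchdog found addon OpenThread Border Router is failed, restarting...
23-12-03 02:04:12 INFO (SyncWorker_4) [supervisor.docker.manager] Cleaning addon_core_openthread_border_router application
23-12-03 02:04:13 INFO (MainThread) [supervisor.docker.addon] Starting Docker add-on homeassistant/aarch64-addon-otbr with version 2.3.2
"""

for name, count in watchdog_restarts(sample).items():
    print(f"{name}: {count} restart(s)")
```

A high count for one add-on only shows that it keeps failing, not what is causing the failure.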

@puddly
Collaborator

puddly commented Dec 3, 2023

@gpranzo "OpenThread Border Router" and "Silicon Labs Multiprotocol" conflict with one another. Uninstall the first one.

@marcelmah

> @gpranzo "OpenThread Border Router" and "Silicon Labs Multiprotocol" conflict with one another. Uninstall the first one.

But you need an OTBR? It gets installed by itself... how sure are you?

@gpranzo

gpranzo commented Dec 5, 2023

Thank you @puddly, I will try that. Though I think I ended up adding OTBR after I could not get SL Multiprotocol to work either.
I may need to disconnect the SkyConnect, remove the appropriate integrations and start from scratch.

@gpranzo

gpranzo commented Dec 5, 2023

@puddly, thanks again. Backing everything out, including reflashing the SkyConnect to remove the Multiprotocol firmware and doing a migration to the SkyConnect Zigbee controller, cleared everything up.

@gpranzo

gpranzo commented Dec 7, 2023

@marcelmah, you are right that the SkyConnect instructions imply that OTBR should be installed automatically when SL Multiprotocol is configured. https://skyconnect.home-assistant.io/procedures/enable-multiprotocol/
That part does not work for me. But at this point I'm just happy my Zigbee network is back. I'll keep working on Thread later. Enough time spent on this.

@sergeykad

sergeykad commented Dec 21, 2023

I have similar issues with the SONOFF Zigbee 3.0 USB Dongle V2. Physically disconnecting and reconnecting the dongle seems to fix the problem for a few days.

 Add-on: Silicon Labs Multiprotocol
 Zigbee and OpenThread multiprotocol add-on
-----------------------------------------------------------
 Add-on version: 2.3.2
 You are running the latest version of this add-on.
 System: Home Assistant OS 11.2  (amd64 / qemux86-64)
 Home Assistant Core: 2023.12.3
 Home Assistant Supervisor: 2023.12.0

There is a similar discussion on the HA forum: https://community.home-assistant.io/t/messed-up-ha-by-installing-sonoff-zigbee-3-0-usb-dongle-plus-e-and-silicon-labs-multiprotocol/625826

@Waghead

Waghead commented Jan 17, 2024

Got the same problem. I bought a new SONOFF Zigbee 3.0 USB Dongle V2 stick because my SkyConnect didn't work. Now I've got the same problem: Multiprotocol won't work.

@PatrickSteiner

Did anyone get rid of the issue with the "secondary"? I have the same with my SONOFF Zigbee 3.0 USB Dongle V2, which I want to use solely for Matter/Thread, while the SkyConnect only serves Zigbee.

@marcelmah

I switched my SkyConnect to the Thread-only firmware and now the OTBR works fine. Now I have IPv6 issues :)

@PatrickSteiner

@marcelmah I just tried that and still have the error...

@IncendioMoussy

Hello, have you found a solution? I still have this problem too.

@AntoineGS

Adding to this, I have been having the same issue, but disconnecting and reconnecting the SkyConnect fixes it.
This is obviously still an issue, but it is much less painful than a full shutdown (a reboot did nothing).
