Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

openthread border router add-on consuming 100% CPU. #3490

Closed
kluner opened this issue Feb 28, 2024 · 24 comments
Closed

openthread border router add-on consuming 100% CPU. #3490

kluner opened this issue Feb 28, 2024 · 24 comments

Comments

@kluner
Copy link

kluner commented Feb 28, 2024

Describe the issue you are experiencing

Since the last update of HAOS and Homeassistant to below versions, the open thread border router add on has started using 100% CPU. Turning overal VM cpu usage from 3% to 27%. (enough to trigger the fans).

-----------------------------------------------------------
 Add-on: OpenThread Border Router
 OpenThread Border Router add-on
-----------------------------------------------------------
 Add-on version: 2.4.7
 You are running the latest version of this add-on.
 System: Home Assistant OS 12.0  (amd64 / qemux86-64)
 Home Assistant Core: 2024.2.4
 Home Assistant Supervisor: 2024.02.0

logs show no remarkable information:

s6-rc: info: service banner successfully started
s6-rc: info: service universal-silabs-flasher: starting
[09:11:58] INFO: Checking /dev/ttyUSB0 identifying SkyConnect v1.0 from Nabu Casa.
[09:11:58] INFO: Starting universal-silabs-flasher with /dev/ttyUSB0
2024-02-28 09:11:59.070 homeassistant universal_silabs_flasher.flash INFO Extracted GBL metadata: NabuCasaMetadata(metadata_version=1, sdk_version='4.4.0', ezsp_version=None, ot_rcp_version='SL-OPENTHREAD/2.4.0.0_GitHub-7074a43e4' (2.4.0.0), cpc_version=None, fw_type=<FirmwareImageType.OT_RCP: 'ot-rcp'>, baudrate=460800)
2024-02-28 09:11:59.070 homeassistant universal_silabs_flasher.flasher INFO Probing ApplicationType.GECKO_BOOTLOADER at 115200 baud
2024-02-28 09:12:01.073 homeassistant universal_silabs_flasher.flasher INFO Probing ApplicationType.SPINEL at 460800 baud
2024-02-28 09:12:05.381 homeassistant universal_silabs_flasher.flasher INFO Probing ApplicationType.CPC at 460800 baud

What type of installation are you running?

Home Assistant OS

Which operating system are you running on?

Home Assistant Operating System

Which add-on are you reporting an issue with?

CEC Scanner

What is the version of the add-on?

2.4.7

Steps to reproduce the issue

  1. start or reboot the whole instance
  2. cpu usage for add-on goes to 100%
    ...

System Health information

System Information

version core-2024.2.4
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.12.1
os_name Linux
os_version 6.6.16-haos
arch x86_64
timezone Europe/Amsterdam
config_dir /config
Home Assistant Cloud
logged_in true
subscription_expiration 9 January 2025 at 01:00
relayer_connected true
relayer_region eu-central-1
remote_enabled true
remote_connected true
alexa_enabled true
google_enabled true
remote_server eu-central-1-13.ui.nabu.casa
certificate_status ready
instance_id 3b1746964c94474db743e9ce4d6d388e
can_reach_cert_server ok
can_reach_cloud_auth ok
can_reach_cloud ok
Home Assistant Supervisor
host_os Home Assistant OS 12.0
update_channel stable
supervisor_version supervisor-2024.02.0
agent_version 1.6.0
docker_version 24.0.7
disk_total 30.8 GB
disk_used 8.4 GB
healthy true
supported true
board ova
supervisor_api ok
version_api ok
installed_addons Terminal & SSH (9.9.0), File editor (5.8.0), Glances (0.21.0), Silicon Labs Multiprotocol (2.4.4), Matter Server (5.2.0), OpenThread Border Router (2.4.7), Node-RED (17.0.7)
Dashboards
dashboards 5
resources 0
views 4
mode storage
Recorder
oldest_recorder_run 22 February 2024 at 10:39
current_recorder_run 28 February 2024 at 09:11
estimated_db_size 1329.48 MiB
database_engine sqlite
database_version 3.44.2

Anything in the Supervisor logs that might be useful for us?

�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/amd64-hassio-multicast with version 2023.06.2�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.plugins.multicast] Starting Multicast plugin�[0m
�[32m24-02-28 09:11:57 INFO (SyncWorker_0) [supervisor.docker.manager] Cleaning hassio_multicast application�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.docker.multicast] Starting Multicast ghcr.io/home-assistant/amd64-hassio-multicast with version 2023.06.2 - Host�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.homeassistant.secrets] Loaded 1 Home Assistant secrets�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/qemux86-64-homeassistant with version 2024.2.4�[0m
�[33m24-02-28 09:11:57 WARNING (MainThread) [supervisor.homeassistant.core] Watchdog found Home Assistant failed, restarting...�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.os.manager] Detect Home Assistant Operating System 12.0 / BootSlot B�[0m
�[32m24-02-28 09:11:57 INFO (SyncWorker_0) [supervisor.docker.manager] Starting homeassistant�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.homeassistant.core] Wait until Home Assistant is ready�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.store.git] Loading add-on /data/addons/git/a0d7b954 repository�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.store.git] Loading add-on /data/addons/core repository�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.store.git] Loading add-on /data/addons/git/5c53de3b repository�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.store] Loading add-ons from store: 77 all - 77 new - 0 remove�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.addons.manager] Found 7 installed add-ons�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.docker.interface] Attaching to ghcr.io/hassio-addons/glances/amd64 with version 0.21.0�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.docker.interface] Attaching to homeassistant/amd64-addon-ssh with version 9.9.0�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.docker.interface] Attaching to homeassistant/amd64-addon-matter-server with version 5.2.0�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.docker.interface] Attaching to homeassistant/amd64-addon-otbr with version 2.4.7�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.docker.interface] Attaching to homeassistant/amd64-addon-configurator with version 5.8.0�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.docker.interface] Attaching to homeassistant/amd64-addon-silabs-multiprotocol with version 2.4.4�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.docker.interface] Attaching to ghcr.io/hassio-addons/node-red/amd64 with version 17.0.7�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.backups.manager] Found 1 backup files�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.discovery] Loaded 2 messages�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.ingress] Loaded 0 ingress sessions�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state setup�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.resolution.check] System checks complete�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state setup�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.jobs] 'ResolutionFixup.run_autofix' blocked from execution, system is not running - setup�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state setup�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [__main__] Running Supervisor�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.os.manager] Rauc: B - marked slot kernel.1 as good�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.addons.manager] Phase 'initialize' starting 0 add-ons�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.addons.manager] Phase 'system' starting 0 add-ons�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.addons.manager] Phase 'services' starting 5 add-ons�[0m
�[32m24-02-28 09:11:57 INFO (SyncWorker_6) [supervisor.docker.manager] Cleaning addon_core_ssh application�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.docker.addon] Starting Docker add-on homeassistant/amd64-addon-ssh with version 9.9.0�[0m
�[33m24-02-28 09:11:57 WARNING (MainThread) [supervisor.docker.addon] Glances running with disabled protected mode!�[0m
�[32m24-02-28 09:11:57 INFO (SyncWorker_0) [supervisor.docker.manager] Cleaning addon_a0d7b954_glances application�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.docker.addon] Starting Docker add-on ghcr.io/hassio-addons/glances/amd64 with version 0.21.0�[0m
�[32m24-02-28 09:11:57 INFO (SyncWorker_1) [supervisor.docker.manager] Cleaning addon_core_silabs_multiprotocol application�[0m
�[32m24-02-28 09:11:57 INFO (MainThread) [supervisor.docker.addon] Starting Docker add-on homeassistant/amd64-addon-silabs-multiprotocol with version 2.4.4�[0m
�[32m24-02-28 09:11:57 INFO (SyncWorker_3) [supervisor.docker.manager] Cleaning addon_core_matter_server application�[0m
�[32m24-02-28 09:11:58 INFO (MainThread) [supervisor.docker.addon] Starting Docker add-on homeassistant/amd64-addon-matter-server with version 5.2.0�[0m
�[32m24-02-28 09:11:58 INFO (SyncWorker_3) [supervisor.docker.manager] Cleaning addon_core_openthread_border_router application�[0m
�[32m24-02-28 09:11:58 INFO (MainThread) [supervisor.docker.addon] Starting Docker add-on homeassistant/amd64-addon-otbr with version 2.4.7�[0m
�[32m24-02-28 09:11:58 INFO (MainThread) [supervisor.api.middleware.security] /network/info access from core_matter_server�[0m
�[32m24-02-28 09:11:59 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state startup�[0m
�[32m24-02-28 09:11:59 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete�[0m
�[32m24-02-28 09:12:01 INFO (MainThread) [supervisor.api.middleware.security] /network/info access from core_silabs_multiprotocol�[0m
�[32m24-02-28 09:12:02 INFO (MainThread) [supervisor.homeassistant.api] Updated Home Assistant API token�[0m
�[32m24-02-28 09:12:02 INFO (MainThread) [supervisor.homeassistant.core] Home Assistant Core state changed to NOT_RUNNING�[0m
�[33m24-02-28 09:12:04 WARNING (MainThread) [supervisor.addons.addon] core_silabs_multiprotocol is already running!�[0m
�[32m24-02-28 09:12:07 INFO (MainThread) [supervisor.core] Skipping start of Home Assistant�[0m
�[32m24-02-28 09:12:07 INFO (MainThread) [supervisor.addons.manager] Phase 'application' starting 2 add-ons�[0m
�[32m24-02-28 09:12:07 INFO (SyncWorker_5) [supervisor.docker.manager] Cleaning addon_core_configurator application�[0m
�[32m24-02-28 09:12:08 INFO (MainThread) [supervisor.docker.addon] Starting Docker add-on homeassistant/amd64-addon-configurator with version 5.8.0�[0m
�[32m24-02-28 09:12:08 INFO (SyncWorker_4) [supervisor.docker.manager] Cleaning addon_a0d7b954_nodered application�[0m
�[32m24-02-28 09:12:08 INFO (MainThread) [supervisor.docker.addon] Starting Docker add-on ghcr.io/hassio-addons/node-red/amd64 with version 17.0.7�[0m
�[32m24-02-28 09:12:12 INFO (MainThread) [supervisor.homeassistant.core] Home Assistant Core state changed to RUNNING�[0m
�[32m24-02-28 09:12:12 INFO (MainThread) [supervisor.homeassistant.core] Detect a running Home Assistant instance�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.misc.tasks] All core tasks are scheduled�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.core] Supervisor is up and running�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.host.info] Updating local host information�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.updater] Fetching update data from https://version.home-assistant.io/stable.json�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state running�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.checks.base] Run check for free_space/system�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.checks.base] Run check for trust/supervisor�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.checks.base] Run check for multiple_data_disks/system�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.checks.base] Run check for security/core�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.checks.base] Run check for dns_server_failed/dns_server�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.checks.base] Run check for dns_server_ipv6_error/dns_server�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.checks.base] Run check for docker_config/system�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.checks.base] Run check for pwned/addon�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.checks.base] Run check for ipv4_connection_problem/system�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.checks.base] Run check for no_current_backup/system�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.check] System checks complete�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state running�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state running�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.host.services] Updating service information�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.host.network] Updating local network information�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.host.sound] Updating PulseAudio information�[0m
�[32m24-02-28 09:12:38 INFO (MainThread) [supervisor.host.manager] Host information reload completed�[0m
�[32m24-02-28 09:42:10 INFO (MainThread) [supervisor.homeassistant.api] Updated Home Assistant API token�[0m
�[32m24-02-28 09:51:48 INFO (MainThread) [supervisor.store.git] Update add-on https://github.com/esphome/home-assistant-addon repository�[0m
�[32m24-02-28 09:51:48 INFO (MainThread) [supervisor.store.git] Update add-on https://github.com/home-assistant/addons repository�[0m
�[32m24-02-28 09:51:48 INFO (MainThread) [supervisor.store.git] Update add-on https://github.com/hassio-addons/repository repository�[0m
�[32m24-02-28 09:51:48 INFO (MainThread) [supervisor.updater] Fetching update data from https://version.home-assistant.io/stable.json�[0m
�[32m24-02-28 09:51:49 INFO (MainThread) [supervisor.store] Loading add-ons from store: 77 all - 0 new - 0 remove�[0m
�[32m24-02-28 09:51:49 INFO (MainThread) [supervisor.store] Loading add-ons from store: 77 all - 0 new - 0 remove�[0m
�[33m24-02-28 10:00:45 WARNING (MainThread) [supervisor.api.ingress] No valid ingress session None�[0m
�[33m24-02-28 10:00:45 WARNING (MainThread) [supervisor.api.ingress] No valid ingress session None�[0m
�[33m24-02-28 10:00:45 WARNING (MainThread) [supervisor.api.ingress] No valid ingress session None�[0m
�[33m24-02-28 10:00:45 WARNING (MainThread) [supervisor.api.ingress] No valid ingress session None�[0m
�[33m24-02-28 10:00:45 WARNING (MainThread) [supervisor.api.ingress] No valid ingress session None�[0m
�[33m24-02-28 10:00:45 WARNING (MainThread) [supervisor.api.ingress] No valid ingress session None�[0m

Anything in the add-on logs that might be useful for us?

[09:11:58] INFO: The otbr-web is disabled.
s6-rc: info: service mdns: starting
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service banner: starting
s6-rc: info: service mdns successfully started
[09:11:58] INFO: Starting mDNS Responder...
Default: mDNSResponder (Engineering Build) (Feb 17 2024 11:16:43) starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started

-----------------------------------------------------------
 Add-on: OpenThread Border Router
 OpenThread Border Router add-on
-----------------------------------------------------------
 Add-on version: 2.4.7
 You are running the latest version of this add-on.
 System: Home Assistant OS 12.0  (amd64 / qemux86-64)
 Home Assistant Core: 2024.2.4
 Home Assistant Supervisor: 2024.02.0
-----------------------------------------------------------
 Please, share the above information when looking for help
 or support in, e.g., GitHub, forums or the Discord chat.
-----------------------------------------------------------
s6-rc: info: service banner successfully started
s6-rc: info: service universal-silabs-flasher: starting
[09:11:58] INFO: Checking /dev/ttyUSB0 identifying SkyConnect v1.0 from Nabu Casa.
[09:11:58] INFO: Starting universal-silabs-flasher with /dev/ttyUSB0
2024-02-28 09:11:59.070 homeassistant universal_silabs_flasher.flash INFO Extracted GBL metadata: NabuCasaMetadata(metadata_version=1, sdk_version='4.4.0', ezsp_version=None, ot_rcp_version='SL-OPENTHREAD/2.4.0.0_GitHub-7074a43e4' (2.4.0.0), cpc_version=None, fw_type=<FirmwareImageType.OT_RCP: 'ot-rcp'>, baudrate=460800)
2024-02-28 09:11:59.070 homeassistant universal_silabs_flasher.flasher INFO Probing ApplicationType.GECKO_BOOTLOADER at 115200 baud
2024-02-28 09:12:01.073 homeassistant universal_silabs_flasher.flasher INFO Probing ApplicationType.SPINEL at 460800 baud
2024-02-28 09:12:05.381 homeassistant universal_silabs_flasher.flasher INFO Probing ApplicationType.CPC at 460800 baud

Additional information

No response

@agners
Copy link
Member

agners commented Feb 29, 2024

Hm, so this is the last line of the startup log?

2024-02-28 09:12:05.381 homeassistant universal_silabs_flasher.flasher INFO Probing ApplicationType.CPC at 460800 baud

Sounds like the universal-silabs-flasher is hanging then 🤔 Is this reproducible, as in when you restart the add-on it hangs on the same line and CPU goes to 100% on one CPU?

@puddly do you happen to have an idea?

@kluner
Copy link
Author

kluner commented Feb 29, 2024

Hi,

Yeah, that was the last line.

I had already rebooted my HA VM, but no to no avail, it spun up to 100% (1 of 4 cores assigned to it) right away after HA had finished starting.

I disabled the addon for now, but let me check again.

@kluner
Copy link
Author

kluner commented Feb 29, 2024

Check, I just spun up the addon again, and it immediately goes to 100%


Add-on version: 2.4.7
 You are running the latest version of this add-on.
 System: Home Assistant OS 12.0  (amd64 / qemux86-64)
 Home Assistant Core: 2024.2.4
 Home Assistant Supervisor: 2024.02.0
-----------------------------------------------------------
 Please, share the above information when looking for help
 or support in, e.g., GitHub, forums or the Discord chat.
-----------------------------------------------------------
s6-rc: info: service banner successfully started
s6-rc: info: service universal-silabs-flasher: starting
[13:26:39] INFO: Checking /dev/ttyUSB0 identifying SkyConnect v1.0 from Nabu Casa.
[13:26:39] INFO: Starting universal-silabs-flasher with /dev/ttyUSB0
2024-02-29 13:26:39.551 homeassistant universal_silabs_flasher.flash INFO Extracted GBL metadata: NabuCasaMetadata(metadata_version=1, sdk_version='4.4.0', ezsp_version=None, ot_rcp_version='SL-OPENTHREAD/2.4.0.0_GitHub-7074a43e4' (2.4.0.0), cpc_version=None, fw_type=<FirmwareImageType.OT_RCP: 'ot-rcp'>, baudrate=460800)
2024-02-29 13:26:39.551 homeassistant universal_silabs_flasher.flasher INFO Probing ApplicationType.GECKO_BOOTLOADER at 115200 baud
2024-02-29 13:26:41.556 homeassistant universal_silabs_flasher.flasher INFO Probing ApplicationType.SPINEL at 460800 baud
2024-02-29 13:26:45.866 homeassistant universal_silabs_flasher.flasher INFO Probing ApplicationType.CPC at 460800 baud

@puddly
Copy link
Collaborator

puddly commented Feb 29, 2024

The flasher has explicit timeouts for every CPC command so it should never stall. The only thing I can imagine is if it's being overwhelmed by a continuous stream of data, which I think I've seen happen once.

Do we have a way to enable verbose logging for the flasher within the addon?

@agners
Copy link
Member

agners commented Feb 29, 2024

@puddly I don't think so. 😢

@kluner what happens if you remove the stick at that point?

@agners
Copy link
Member

agners commented Feb 29, 2024

@kluner from the system console (I assume you have access to it on your VM), can you use login to get OS shell access, then use top, press f, s, q to sort by CPU usage, and verify that it is indeed the universal-silabs-flasher appearing at the top?

@kluner
Copy link
Author

kluner commented Feb 29, 2024

@kluner from the system console (I assume you have access to it on your VM), can you use login to get OS shell access, then use top, press f, s, q to sort by CPU usage, and verify that it is indeed the universal-silabs-flasher appearing at the top?

yep, confirmed:

image

@kluner
Copy link
Author

kluner commented Feb 29, 2024

@puddly I don't think so. 😢

@kluner what happens if you remove the stick at that point?

no change. I had a tail on the docker container (docker logs -f ) and it did not show anything, the flasher is still doing 100%.

the kernel spewed 2 lines of USB device error due to the unplug, and that's it.

@agners
Copy link
Member

agners commented Feb 29, 2024

Can you try run it manually with verbose option?

docker exec -it addon_core_openthread_border_router /bin/bash
kill $(pidof python3)
universal-silabs-flasher --verbose --device /dev/ttyUSB0 flash --ensure-exact-version --allow-cross-flashing --firmware "/root/NabuCasa_SkyConnect_OpenThread_RCP_v2.4.0.0_ot-rcp_hw_460800.gbl"

@kluner
Copy link
Author

kluner commented Feb 29, 2024

and unfortunately no strace in haos, or I could have a peak at what is happening at the syscall level.

@kluner
Copy link
Author

kluner commented Feb 29, 2024

Can you try run it manually with verbose option?

docker exec -it addon_core_openthread_border_router /bin/bash
kill $(pidof python3)
universal-silabs-flasher --verbose --device /dev/ttyUSB0 flash --ensure-exact-version --allow-cross-flashing --firmware "/root/NabuCasa_SkyConnect_OpenThread_RCP_v2.4.0.0_ot-rcp_hw_460800.gbl"

yeah, but give me a bit to see if I can get a real ssh into haos going, so I can copy/paste.

@agners
Copy link
Member

agners commented Feb 29, 2024

yeah, but give me a bit to see if I can get a real ssh into haos going, so I can copy/paste.

👍 , fwiw, there is a guide how to do this in our developer docs: https://developers.home-assistant.io/docs/operating-system/debugging#ssh-access-to-the-host

@kluner
Copy link
Author

kluner commented Feb 29, 2024

oooh, great suggestion.

ok, so I half anticipate that kill on python would kill the whole container and cause it to restart, but it seems something did happen:

s6-rc: info: service banner successfully started
s6-rc: info: service universal-silabs-flasher: starting
[19:55:15] INFO: Checking /dev/ttyUSB0 identifying SkyConnect v1.0 from Nabu Casa.
[19:55:15] INFO: Starting universal-silabs-flasher with /dev/ttyUSB0
2024-02-29 19:55:15.766 homeassistant universal_silabs_flasher.flash INFO Extracted GBL metadata: NabuCasaMetadata(metadata_version=1, sdk_version='4.4.0', ezsp_version=None, ot_rcp_version='SL-OPENTHREAD/2.4.0.0_GitHub-7074a43e4' (2.4.0.0), cpc_version=None, fw_type=<FirmwareImageType.OT_RCP: 'ot-rcp'>, baudrate=460800)
2024-02-29 19:55:15.766 homeassistant universal_silabs_flasher.flasher INFO Probing ApplicationType.GECKO_BOOTLOADER at 115200 baud
2024-02-29 19:55:18.388 homeassistant universal_silabs_flasher.flasher INFO Detected bootloader version '2.1.1'
2024-02-29 19:55:18.389 homeassistant universal_silabs_flasher.flasher INFO Detected ApplicationType.GECKO_BOOTLOADER, version '2.1.1' at 115200 baudrate (bootloader baudrate 115200)
2024-02-29 19:55:18.389 homeassistant universal_silabs_flasher.flash INFO Firmware baudrate 115200 differs from expected baudrate 460800
NabuCasa_SkyConnect_OpenThread_RCP_v2.4.0.0_ot-rcp_hw_460800.gbl
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/universal_silabs_flasher/gecko_bootloader.py", line 71, in probe
    return await self.ebl_info()
  File "/usr/local/lib/python3.9/dist-packages/universal_silabs_flasher/gecko_bootloader.py", line 81, in ebl_info
    await self._state_machine.wait_for_state(State.IN_MENU)
  File "/usr/local/lib/python3.9/dist-packages/universal_silabs_flasher/common.py", line 115, in wait_for_state
    return await future
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/universal-silabs-flasher", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/universal_silabs_flasher/flash.py", line 40, in inner
    return asyncio.run(f(*args, **kwargs))
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.9/dist-packages/universal_silabs_flasher/flash.py", line 423, in flash
    await flasher.flash_firmware(
  File "/usr/local/lib/python3.9/dist-packages/universal_silabs_flasher/flasher.py", line 289, in flash_firmware
    await gecko.probe()
  File "/usr/local/lib/python3.9/dist-packages/universal_silabs_flasher/gecko_bootloader.py", line 71, in probe
    return await self.ebl_info()
  File "/usr/local/lib/python3.9/dist-packages/async_timeout/__init__.py", line 141, in __aexit__
    self._do_exit(exc_type)
  File "/usr/local/lib/python3.9/dist-packages/async_timeout/__init__.py", line 228, in _do_exit
    raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError
s6-rc: warning: unable to start service universal-silabs-flasher: command exited 1
/run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.
/run/s6/basedir/scripts/rc.init: fatal: stopping the container.
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service banner: stopping
s6-rc: info: service mdns: stopping
s6-rc: info: service banner successfully stopped
Default: mDNSResponder (Engineering Build) (Feb 17 2024 11:16:43) stopping
[19:55:20] INFO: mDNS ended with exit code 4 (signal 0)...
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service mdns successfully stopped
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped

@kluner
Copy link
Author

kluner commented Feb 29, 2024

followed by

[19:56:32] INFO: The otbr-web is disabled.
s6-rc: info: service mdns: starting
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service mdns successfully started
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service banner: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
[19:56:32] INFO: Starting mDNS Responder...
Default: mDNSResponder (Engineering Build) (Feb 17 2024 11:16:43) starting

-----------------------------------------------------------
 Add-on: OpenThread Border Router
 OpenThread Border Router add-on
-----------------------------------------------------------
 Add-on version: 2.4.7
 You are running the latest version of this add-on.
 System: Home Assistant OS 12.0  (amd64 / qemux86-64)
 Home Assistant Core: 2024.2.4
 Home Assistant Supervisor: 2024.02.1
-----------------------------------------------------------
 Please, share the above information when looking for help
 or support in, e.g., GitHub, forums or the Discord chat.
-----------------------------------------------------------
s6-rc: info: service banner successfully started
s6-rc: info: service universal-silabs-flasher: starting
[19:56:33] INFO: Checking /dev/ttyUSB0 identifying SkyConnect v1.0 from Nabu Casa.
[19:56:33] INFO: Starting universal-silabs-flasher with /dev/ttyUSB0
2024-02-29 19:56:33.342 homeassistant universal_silabs_flasher.flash INFO Extracted GBL metadata: NabuCasaMetadata(metadata_version=1, sdk_version='4.4.0', ezsp_version=None, ot_rcp_version='SL-OPENTHREAD/2.4.0.0_GitHub-7074a43e4' (2.4.0.0), cpc_version=None, fw_type=<FirmwareImageType.OT_RCP: 'ot-rcp'>, baudrate=460800)
2024-02-29 19:56:33.342 homeassistant universal_silabs_flasher.flasher INFO Probing ApplicationType.GECKO_BOOTLOADER at 115200 baud
2024-02-29 19:56:34.358 homeassistant universal_silabs_flasher.flasher INFO Detected bootloader version '2.1.1'
2024-02-29 19:56:34.358 homeassistant universal_silabs_flasher.flasher INFO Detected ApplicationType.GECKO_BOOTLOADER, version '2.1.1' at 115200 baudrate (bootloader baudrate 115200)
2024-02-29 19:56:34.358 homeassistant universal_silabs_flasher.flash INFO Firmware baudrate 115200 differs from expected baudrate 460800
NabuCasa_SkyConnect_OpenThread_RCP_v2.4.0.0_ot-rcp_hw_460800.gbl

which seems to run stable currently, at no excessive cpu consumption.

@kluner
Copy link
Author

kluner commented Feb 29, 2024

possible (likely) side-effect: zigbee integration is stuck initialising. let's see if I can shake it loose.

@agners
Copy link
Member

agners commented Feb 29, 2024

Uh, do you use the same device in the ZHA integration maybe? 🤔 The ZHA integration should not point to the that serial port. This would create havoc.

We currently discover the device still as ZHA device, but that will change in the future. Currently you have to explicitly ignore the discovered ZHA entry (see https://skyconnect.home-assistant.io/procedures/enable-thread/).

@kluner
Copy link
Author

kluner commented Mar 1, 2024

Uh, do you use the same device in the ZHA integration maybe? 🤔 The ZHA integration should not point to the that serial port. This would create havoc.

We currently discover the device still as ZHA device, but that will change in the future. Currently you have to explicitly ignore the discovered ZHA entry (see https://skyconnect.home-assistant.io/procedures/enable-thread/).

You know, that would make sense in it causing absolute chaos. But no, I use the socket in ZHA.

IMG_1102

@kluner
Copy link
Author

kluner commented Mar 1, 2024

btw, it came back. It’s doing 100% again.

@agners
Copy link
Member

agners commented Mar 1, 2024

Do you have the Silicon Labs Multiprotocol add-on enabled at the same time? If that accesses the serial port at the same time it would explain the problem as well...

@kluner
Copy link
Author

kluner commented Mar 1, 2024

Actually,I do.

Funny thing: I have had this configuration since I got the skyconnect in. It never caused problems.

So what is the recommended configuration here?

Only multi, and matter server and ZHA to handle the protocol stacks? I do not have matter things at the moment, but I do expect that to change shortly.

@agners
Copy link
Member

agners commented Mar 1, 2024

Unfortunately, the Silicon Labs Multiprotocol add-on showed problems for a lot of folks especially when they started to add devices to the Thread side 😢

So currently we only recommend dedicated setups: Use a radio for Zigbee and one radio for Thread. Maybe you already have a second radio available? 🤔

Alternatively, if you have Google or Apple BR, our Matter stack can make use of those as well (the devices still will be directly associated with Home Assistant on the application/Matter level).

@kluner
Copy link
Author

kluner commented Mar 1, 2024

Well, I can easily turn off all the thread and matter stuff currently, it's not really getting used. It's just an obstacle for the future.

What do you mean with using the Apple BR? The HomeKit bridge? Can the matter addon use that? That would kinda fix the whole problem anyway.

@agners
Copy link
Member

agners commented Mar 1, 2024

What do you mean with using the Apple BR? The HomeKit bridge? Can the matter addon use that? That would kinda fix the whole problem anyway.

No, HomePod and such, see https://www.home-assistant.io/integrations/thread#list-of-thread-border-router-devices.

@kluner
Copy link
Author

kluner commented Mar 1, 2024

ah, right check. Well I was looking for an excuse to buy a matter capable one anyway. ;-)

I reset the firmware on skyconnect to zigbee, and hooked ZHA to it again. Seems to work fine again now.

@agners agners closed this as not planned Won't fix, can't repro, duplicate, stale Mar 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants