Z-Wave.me UZB dongle not fully recognized #2995

Open
jires opened this issue Dec 16, 2023 · 60 comments
Labels
board/generic-x86-64 Generic x86-64 Boards (like Intel NUC) board/raspberrypi Raspberry Pi Boards board/yellow Home Assistant Yellow bug stable-kernel-regression Issue which appears to be an upstream stable kernel regression

Comments

@jires

jires commented Dec 16, 2023

Describe the issue you are experiencing

After updating to 11.1, my USB dongle for Z-Wave became unresponsive, and the Z-Wave JS UI integration hangs or fails to start from time to time.
When I turn on the monitor connected to the Intel NUC, I can see that the dongle generates errors when it is connected to a USB port.

It says "device descriptor read/64, error -32".

It looks like a missing driver or something similar. The issue persists in version 11.2. For now, I have had to move my Z-Wave network to an Aeotec Z-Stick 7.
[Attached image: Inked2023-12-07 22 04 08]

What operating system image do you use?

generic-x86-64 (Generic UEFI capable x86-64 systems)

What version of Home Assistant Operating System is installed?

6.1.63-haos - Home Assistant OS 11.2

Did you upgrade the Operating System.

Yes

Steps to reproduce the issue

  1. Start the system up
  2. Plug a ZMEEUZB USB dongle from Z-Wave.Me into a USB port

...

Anything in the Supervisor logs that might be useful for us?

Not really

Anything in the Host logs that might be useful for us?

Not Really

System information

version core-2023.10.3
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.11.5
os_name Linux
os_version 6.1.63-haos
arch x86_64
timezone Europe/Copenhagen
config_dir /config
Home Assistant Community Store
GitHub API ok
GitHub Content ok
GitHub Web ok
GitHub API Calls Remaining 5000
Installed Version 1.33.0
Stage running
Available Repositories 1353
Downloaded Repositories 22
Home Assistant Cloud
logged_in true
subscription_expiration September 7, 2024 at 02:00
relayer_connected true
relayer_region eu-central-1
remote_enabled true
remote_connected true
alexa_enabled false
google_enabled true
remote_server eu-central-1-11.ui.nabu.casa
certificate_status ready
can_reach_cert_server ok
can_reach_cloud_auth ok
can_reach_cloud ok
Home Assistant Supervisor
host_os Home Assistant OS 11.2
update_channel stable
supervisor_version supervisor-2023.11.6
agent_version 1.6.0
docker_version 24.0.7
disk_total 916.2 GB
disk_used 64.5 GB
healthy true
supported true
board generic-x86-64
supervisor_api ok
version_api ok
installed_addons Advanced SSH & Web Terminal (16.0.1), InfluxDB (4.8.0), Grafana (9.1.1), Node-RED (16.0.2), File editor (5.7.0), Studio Code Server (5.14.2), Mosquitto broker (6.4.0), Home Assistant Google Drive Backup (0.112.1), Network UPS Tools (0.12.2), ESPHome (2023.11.6), Samba share (12.2.0), Filebrowser (2.23.0_7), Portainer (2.19.3), Log Viewer (0.16.0), Glances (0.20.0), Frigate (Full Access) (0.12.1), Z-Wave JS UI (3.0.2), Zigbee2MQTT (1.34.0-1)
Dashboards
dashboards 7
resources 5
views 41
mode storage
Recorder
oldest_recorder_run March 17, 2023 at 20:34
current_recorder_run December 7, 2023 at 19:12
estimated_db_size 17118.81 MiB
database_engine sqlite
database_version 3.41.2

Additional information

No response

@jires jires added the bug label Dec 16, 2023
@mcolyer

mcolyer commented Dec 17, 2023

I'm having a similar issue: after upgrading to 11.2, my USB Z-Wave hub is no longer recognized.

@sblogoshs

In my case, it is the SkyConnect stick that stops working after a few hours and the problem can only be solved by unplugging it and plugging it in again. A downgrade to 11.1 solved the problem for me.

@mcolyer

mcolyer commented Dec 18, 2023

Can confirm that downgrading to 11.1 and then unplugging and replugging in the dongle (Aeotec z-stick Gen 5) fixes the issue.

@ohessel

ohessel commented Dec 18, 2023

On my RPI 3b+ downgrading to 11.1 (ha os update --version 11.1) directly fixed the issue with Z-Wave USB Stick (ZMEEUZB1).

@bnounours

Same as everyone else: after I upgraded to OS 11.2, Z-Wave stopped working because the stick was no longer detected. I downgraded to OS 11.1 and the stick is back online.

@GHGiampy

Also related #2977

@sairon sairon added board/generic-x86-64 Generic x86-64 Boards (like Intel NUC) stable-kernel-regression Issue which appears to be an upstream stable kernel regression labels Dec 19, 2023
@sairon
Member

sairon commented Dec 19, 2023

We suspect it could be caused by a regression in the stable kernel, because similar issues started to appear on RPi and x86 platforms at the same time. There are a bunch of changes in the USB subsystem in the newer stable releases, which are currently in the development branch. If anyone is willing to switch to the dev channel and test it, that would help a lot (but make sure you have backups, since it switches all components to dev and things can go south much more often).

ha supervisor options --channel=dev
ha supervisor reload
ha supervisor update
ha os update

Otherwise, there should be another beta release soon; we would appreciate early feedback after that release too.

@sairon
Member

sairon commented Jan 2, 2024

Since OS 11.3.rc1 has been out for a while already, has anyone been able to check whether the problem persists there?

@sblogoshs

I also see the problem with OS 11.3.rc1; for me it just took longer until the error reappeared. It seems to depend on how long the system runs without a restart.

@llroelj

llroelj commented Jan 2, 2024

> We suspect it could be caused by a regression in the stable kernel, because similar issues started to appear on RPi and x86 platforms at the same time. There are a bunch of changes in the USB subsystem in the newer stable releases, which are currently in the development branch. If anyone is willing to switch to the dev channel and test it, that would help a lot (but make sure you have backups, since it switches all components to dev and things can go south much more often).
>
> ha supervisor options --channel=dev
> ha supervisor reload
> ha supervisor update
> ha os update
>
> Otherwise, there should be another beta release soon; we would appreciate early feedback after that release too.

I will try this tomorrow morning and will keep you posted if it works.

@agners
Member

agners commented Jan 3, 2024

@mcolyer @sblogoshs @ohessel @bnounours @llroelj do you see the same device descriptor read errors as the original poster?

@bnounours which stick are you using?

@llroelj

llroelj commented Jan 3, 2024

> @mcolyer @sblogoshs @ohessel @bnounours @llroelj do you see the same device descriptor read errors as the original poster?
>
> @bnounours which stick are you using?

After going through the steps mentioned above, I get the same error as the original poster, which is also the one I had before downgrading to 11.1:

usb 1-1.1.2: device descriptor read/64, error -32

I also see the same problem showing up in the Z-Wave JS add-on: "Driver: Failed to open the serial port: Error: No such file or directory, cannot open /dev/serial/by-id/usb-0658_0200-if00 (ZW0100)"

Currently I am running the following versions:

Core: 2023.12.3
Supervisor: 2024.01.0.dev0201
Operating System: 11.4.dev20231226
Frontend: 20231208.2
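
For readers who want to check the same symptom on their own host, here is a minimal Python sketch. It only tests whether the by-id symlink named in the Z-Wave JS error above exists and, if so, which device node it points to. The helper name `resolve_serial_device` is made up for illustration, and a missing path is only a hint that the stick did not enumerate, not proof of it.

```python
import os


def resolve_serial_device(by_id_path):
    """Return the real device node behind a by-id symlink, or None if it is missing."""
    if not os.path.exists(by_id_path):
        return None
    return os.path.realpath(by_id_path)


# Path taken from the Z-Wave JS error message quoted above.
path = "/dev/serial/by-id/usb-0658_0200-if00"
device = resolve_serial_device(path)
if device is None:
    print(f"{path} is missing (the stick may not have enumerated)")
else:
    print(f"{path} -> {device}")
```

On a working system the second branch would be expected to print a node such as the ttyACM0 device seen in the kernel logs elsewhere in this thread.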

@agners
Member

agners commented Jan 3, 2024

@llroelj so you are saying you had the device descriptor read error with 11.2 as well as with the OS version from the dev channel?

What (exact) hardware are you running HAOS on?

@llroelj

llroelj commented Jan 3, 2024

> @llroelj so you are saying you had the device descriptor read error with 11.2 as well as with the OS version from the dev channel?
>
> What (exact) hardware are you running HAOS on?

Indeed, I had the same message in 11.2 and now in the dev channel. With both, the Z-Wave stick cannot be found by the system itself (no serial folder).
If needed, I can share my dmesg log with all the information.

I am running HAOS on an RPi 3 B+ (1 GB memory), installed on a SanDisk Ultra 32GB microSDHC card.
The Z-Wave USB stick is a Z-Wave.me Stick (ZMEEUZB1).

@mnorrsken

My USB problems are solved now after upgrading to HAOS 11.3. I had problems with a ConBee II Zigbee dongle not being detected. This is possibly a different issue, but 11.3 is worth trying, as I see a couple of kernel USB fixes.

@sblogoshs

With the final version of OS 11.3 the error seems to be fixed, but due to various updates I haven't yet had 24 hours without a restart. I hope it keeps working.

@jires
Author

jires commented Jan 7, 2024

I can report that the upgrade to OS 11.3 did not fix (or did it?) the issue with the Z-Wave.me UZB. The descriptor error still appears; however, it looks like the dongle is working as expected.
Log output from the host after plugging the dongle into the USB port (when the system is rebooted with the dongle already plugged in, there are no errors in the log):

Jan 07 12:04:09 homeassistanttest kernel: usb 1-1.3: new full-speed USB device number 6 using xhci_hcd
Jan 07 12:04:09 homeassistanttest kernel: usb 1-1.3: device descriptor read/64, error -32
Jan 07 12:04:09 homeassistanttest kernel: usb 1-1.3: device descriptor read/64, error -32
Jan 07 12:04:10 homeassistanttest kernel: usb 1-1.3: new full-speed USB device number 7 using xhci_hcd
Jan 07 12:04:10 homeassistanttest kernel: usb 1-1.3: device descriptor read/64, error -32
Jan 07 12:04:10 homeassistanttest kernel: usb 1-1.3: device descriptor read/64, error -32
Jan 07 12:04:10 homeassistanttest kernel: usb 1-1-port3: attempt power cycle
Jan 07 12:04:11 homeassistanttest kernel: usb 1-1.3: new full-speed USB device number 8 using xhci_hcd
Jan 07 12:04:11 homeassistanttest kernel: usb 1-1.3: New USB device found, idVendor=0658, idProduct=0200, bcdDevice= 0.00
Jan 07 12:04:11 homeassistanttest kernel: usb 1-1.3: New USB device strings: Mfr=0, Product=0, SerialNumber=0
Jan 07 12:04:11 homeassistanttest kernel: cdc_acm 1-1.3:1.0: ttyACM0: USB ACM device
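
As an aid for comparing logs like the one above across OS versions, here is a small Python sketch that counts the descriptor-read errors and reports whether the attempt ended with the device being found or with the kernel giving up. The function name and the returned dictionary layout are made up for illustration; the matched message fragments are the ones quoted in this thread.

```python
import re

# Message fragments copied from the kernel log lines quoted in this thread.
DESCRIPTOR_ERROR = re.compile(r"device descriptor read/(\d+), error (-?\d+)")
ENUMERATED = "New USB device found"
GAVE_UP = "unable to enumerate USB device"


def summarize_enumeration(lines):
    """Count descriptor-read errors and report how the attempt ended."""
    error_codes = []
    outcome = "unknown"
    for line in lines:
        match = DESCRIPTOR_ERROR.search(line)
        if match:
            error_codes.append(int(match.group(2)))
        elif ENUMERATED in line:
            outcome = "enumerated"
        elif GAVE_UP in line:
            outcome = "failed"
    return {
        "descriptor_errors": len(error_codes),
        "error_codes": sorted(set(error_codes)),
        "outcome": outcome,
    }


# Shortened copy of the log above (journal timestamp prefix removed).
sample = [
    "usb 1-1.3: new full-speed USB device number 6 using xhci_hcd",
    "usb 1-1.3: device descriptor read/64, error -32",
    "usb 1-1.3: device descriptor read/64, error -32",
    "usb 1-1.3: new full-speed USB device number 7 using xhci_hcd",
    "usb 1-1.3: device descriptor read/64, error -32",
    "usb 1-1.3: device descriptor read/64, error -32",
    "usb 1-1-port3: attempt power cycle",
    "usb 1-1.3: new full-speed USB device number 8 using xhci_hcd",
    "usb 1-1.3: New USB device found, idVendor=0658, idProduct=0200, bcdDevice= 0.00",
    "cdc_acm 1-1.3:1.0: ttyACM0: USB ACM device",
]
print(summarize_enumeration(sample))
```

The sketch assumes the input covers a single plug-in event; with several devices in one log, the last matching line decides the reported outcome.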

@kvandt

kvandt commented Jan 8, 2024

I have the same problem on my HA Yellow. After upgrading to 11.2 or 11.3, my UZB Z-Wave stick can no longer be found: no USB device is available. After downgrading to 11.1, everything works again. I have tried the upgrade three times, but no luck. I hope this can be fixed in the next release.

@ohessel

ohessel commented Jan 8, 2024

@agners I did not download the logs, and the journal containing the error has already been rotated away.
If it helps, I could update again (or, better, set up a clone, as I don't want to break my system) and see whether those exact logs appear.

@afaucogney

I'm in the same situation, with 2 different Z-Wave devices.
Now running OS 11.4.

@ohessel

ohessel commented Jan 11, 2024

@afaucogney So you're saying 11.4 has the same issue?

@afaucogney

@ohessel Yes, it seems to be the same issue.
I rebooted to be certain that everything is clear, but I still cannot find my USB stick in the all_hardware list,
and Z-Wave JS is not able to talk to the controller. Should it work with that version?

@kvandt

kvandt commented Jan 12, 2024

@sairon can you please add the board/yellow label as well? I have this issue on the yellow (see earlier post).

@afaucogney

Is there any activity on this topic, or do you need more information or logs to help you?

@Taraman17

Taraman17 commented Mar 1, 2024

> Anyway, if there's someone who's got the stick that can be used to reproducibly trigger the issue (especially on x86), I'll be interested in getting it - that way I can try reproducing it in my environment and bisect the kernel to find the exact commit that causes it. I'll strongly prefer someone from the EU, to avoid high shipping costs and customs nuisances 😇 Of course I'll pay for the shipping and for a new locally-sourced device.

Hi, I would be willing to send you my device, which has this problem on my RPi 3. I'm not sure that guarantees the same behavior on x86. However, I would first need to get a replacement and do a backup and restore, since this is my production system and my home relies on it.

@FredrikFornstad

OK, here goes. A lot of interesting info:
I managed to find a 10+ year old HP Pavilion laptop with an Intel i7-3610QM CPU. I downloaded a Debian "LiveCD" today (debian-live-12.5.0-amd64-standard.iso), put the image on a USB stick, and booted the live image.

Here is the first part of dmesg, showing the OS and computer info:

[ 0.000000] microcode: microcode updated early to revision 0x21, date = 2019-02-13
[ 0.000000] Linux version 6.1.0-18-amd64 ([email protected]) (gcc-12 (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC Debian 6.1.76-1 (2024-02-01)
[ 0.000000] Command line: BOOT_IMAGE=/live/vmlinuz-6.1.0-18-amd64 boot=live components quiet splash findiso=
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000087fff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000000088000-0x00000000000bffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000001fffffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000020000000-0x00000000201fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000020200000-0x0000000039dbefff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000039dbf000-0x000000003a1befff] type 20
[ 0.000000] BIOS-e820: [mem 0x000000003a1bf000-0x000000003aebefff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000003aebf000-0x000000003afbefff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x000000003afbf000-0x000000003affefff] ACPI data
[ 0.000000] BIOS-e820: [mem 0x000000003afff000-0x000000003affffff] usable
[ 0.000000] BIOS-e820: [mem 0x000000003b000000-0x000000003f9fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000feb00000-0x00000000feb03fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fed10000-0x00000000fed19fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000ffb80000-0x00000000ffffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x00000004bf5fffff] usable
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] efi: EFI v2.31 by INSYDE Corp.
[ 0.000000] efi: ACPI=0x3affe000 ACPI 2.0=0x3affe014 SMBIOS=0x3aebdf18 MOKvar=0x3a264000
[ 0.000000] secureboot: Secure boot disabled
[ 0.000000] SMBIOS 2.7 present.
[ 0.000000] DMI: Hewlett-Packard HP Pavilion dv6 Notebook PC/181B, BIOS F.22 11/02/2012
[ 0.000000] tsc: Fast TSC calibration using PIT
[ 0.000000] tsc: Detected 2294.742 MHz processor

This laptop has 4 USB ports: 3 marked "SS", which I guess are USB3, and 1 port which is USB2. I had the live USB stick in one of the "SS" ports. I then plugged the UZB into the USB2 port and got this:

[ 545.624330] usb 1-1.2: new full-speed USB device number 5 using ehci-pci
[ 545.704323] usb 1-1.2: device descriptor read/64, error -32
[ 545.892340] usb 1-1.2: device descriptor read/64, error -32
[ 546.080327] usb 1-1.2: new full-speed USB device number 6 using ehci-pci
[ 546.160330] usb 1-1.2: device descriptor read/64, error -32
[ 546.348330] usb 1-1.2: device descriptor read/64, error -32
[ 546.456552] usb 1-1-port2: attempt power cycle
[ 547.268330] usb 1-1.2: new full-speed USB device number 7 using ehci-pci
[ 547.297571] usb 1-1.2: device descriptor read/8, error -32
[ 547.425569] usb 1-1.2: device descriptor read/8, error -32
[ 547.612330] usb 1-1.2: new full-speed USB device number 8 using ehci-pci
[ 547.641567] usb 1-1.2: device descriptor read/8, error -32
[ 547.769604] usb 1-1.2: device descriptor read/8, error -32
[ 547.876542] usb 1-1-port2: unable to enumerate USB device

Exactly the same behaviour as when trying the UZB in an rpi3b on HAOS 12.0 etc., but with one difference: since this is another type of USB controller, it is using ehci-pci instead of dwc_otg, which is used on the rpi3b.

So now we know two things:

  1. The changed behaviour was not introduced by Raspbian, but in Debian
  2. The changed behaviour is not limited to ARM. x86-64 is also affected.

I then rebooted and made another try, connecting the UZB (to the SAME laptop, running the same live CD...) to one of the USB ports marked "SS" (and thereby to another internal USB controller):

[ 38.244292] usb 2-3: new full-speed USB device number 3 using xhci_hcd
[ 38.372319] usb 2-3: device descriptor read/64, error -71
[ 38.608317] usb 2-3: device descriptor read/64, error -71
[ 38.844287] usb 2-3: new full-speed USB device number 4 using xhci_hcd
[ 38.972317] usb 2-3: device descriptor read/64, error -71
[ 39.208325] usb 2-3: device descriptor read/64, error -71
[ 39.316405] usb usb2-port3: attempt power cycle
[ 39.936295] usb 2-3: new full-speed USB device number 5 using xhci_hcd
[ 39.957228] usb 2-3: New USB device found, idVendor=0658, idProduct=0200, bcdDevice= 0.00
[ 39.957241] usb 2-3: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 39.999591] cdc_acm 2-3:1.0: ttyACM0: USB ACM device
[ 39.999639] usbcore: registered new interface driver cdc_acm
[ 39.999641] cdc_acm: USB Abstract Control Model driver for USB modems and ISDN adapters

To those who wonder which USB controllers are in the laptop, here is some info:

[ 3.356305] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 6.01
[ 3.356310] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 3.356313] usb usb1: Product: EHCI Host Controller
[ 3.356315] usb usb1: Manufacturer: Linux 6.1.0-18-amd64 ehci_hcd
[ 3.356317] usb usb1: SerialNumber: 0000:00:1a.0
[ 3.356511] hub 1-0:1.0: USB hub found
[ 3.356521] hub 1-0:1.0: 2 ports detected
[ 3.356730] xhci_hcd 0000:00:14.0: xHCI Host Controller
[ 3.356739] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 2
[ 3.357802] xhci_hcd 0000:00:14.0: hcc params 0x20007181 hci version 0x100 quirks 0x000000000000b930
[ 3.357931] ehci-pci 0000:00:1d.0: EHCI Host Controller
[ 3.357938] ehci-pci 0000:00:1d.0: new USB bus registered, assigned bus number 3
[ 3.357953] ehci-pci 0000:00:1d.0: debug port 2
[ 3.357959] xhci_hcd 0000:00:14.0: xHCI Host Controller
[ 3.357966] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 4
[ 3.357970] xhci_hcd 0000:00:14.0: Host supports USB 3.0 SuperSpeed
[ 3.358013] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 6.01
[ 3.358016] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 3.358017] usb usb2: Product: xHCI Host Controller
[ 3.358019] usb usb2: Manufacturer: Linux 6.1.0-18-amd64 xhci-hcd
[ 3.358020] usb usb2: SerialNumber: 0000:00:14.0
[ 3.358149] hub 2-0:1.0: USB hub found
[ 3.358165] hub 2-0:1.0: 4 ports detected
[ 3.358592] usb usb4: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 6.01
[ 3.358596] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 3.358598] usb usb4: Product: xHCI Host Controller
[ 3.358599] usb usb4: Manufacturer: Linux 6.1.0-18-amd64 xhci-hcd
[ 3.358600] usb usb4: SerialNumber: 0000:00:14.0
[ 3.358698] hub 4-0:1.0: USB hub found
[ 3.358713] hub 4-0:1.0: 4 ports detected
[ 3.361860] ehci-pci 0000:00:1d.0: irq 23, io mem 0x74618000
[ 3.365636] libata version 3.00 loaded.
[ 3.376225] ehci-pci 0000:00:1d.0: USB 2.0 started, EHCI 1.00
[ 3.376297] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 6.01
[ 3.376305] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 3.376307] usb usb3: Product: EHCI Host Controller
[ 3.376309] usb usb3: Manufacturer: Linux 6.1.0-18-amd64 ehci_hcd
[ 3.376311] usb usb3: SerialNumber: 0000:00:1d.0
[ 3.376532] hub 3-0:1.0: USB hub found
[ 3.376540] hub 3-0:1.0: 2 ports detected

So, to my surprise, it actually works (after the usual hiccup at startup, although now with a different error) when using the "SS" USB ports with the xhci_hcd driver.
So now we know a bit more:

  3. Some types of USB controllers/drivers (here, the xhci_hcd driver) may work with the UZB and the newer Debian releases.
  4. Everything suggests that (some batches of?) the UZB are not fully USB compliant during startup, but depending on how the USB driver handles it, the stick can still work (and has done so until recently).
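
To make the two error numbers in these logs easier to read: on Linux, -32 corresponds to EPIPE and -71 to EPROTO. The sketch below maps them to the meanings the kernel's USB error-code documentation gives for them, as far as I recall it (a stalled endpoint for EPIPE, a low-level protocol problem such as no response from the device for EPROTO). The table is hard-coded with Linux errno values and covers only the two codes seen in this thread; the function name is made up.

```python
# Linux errno values for the two codes seen in the logs above.
# ASSUMPTION: the short descriptions paraphrase the kernel's USB
# error-code documentation; they are not copied from this thread.
USB_ERROR_MEANINGS = {
    -32: ("EPIPE", "endpoint stalled"),
    -71: ("EPROTO", "protocol error, for example no response from the device"),
}


def describe_usb_error(code):
    """Return a readable description for a kernel USB error code."""
    name, meaning = USB_ERROR_MEANINGS.get(code, ("unknown", "not covered by this table"))
    return f"error {code}: {name} ({meaning})"


for code in (-32, -71):
    print(describe_usb_error(code))
```

Read this way, the EHCI port (error -32) and the xHCI port (error -71) report the failed descriptor read differently, which fits the observation above that the outcome depends on the host controller driver.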

@FredrikFornstad

Forgot to say: I also booted the same laptop as in the previous post with Windows 10 Home (22H2). There, the laptop could successfully access the UZB in all the USB ports without any problem, exposing it as a COM port using the built-in Microsoft "usbser" driver.

@FredrikFornstad

FredrikFornstad commented Mar 2, 2024

@sairon: I did some more testing with plain Debian live CDs on my HP Pavilion laptop (Intel i7-3610QM CPU). The USB3 port always works; the USB2 port behaves as follows:

Debian 12.0.0 Works (Kernel 6.1.0-9-amd64 as reported by the cmd "uname -r"), released June 10, 2023
Debian 12.1.0 Works (Kernel 6.1.0-10-amd64), released July 22, 2023
Debian 12.2.0 Does NOT work (Kernel 6.1.0-13-amd64), released Oct 7, 2023
Debian 12.4.0 Does NOT work (Kernel 6.1.0-15-amd64), released Dec 10, 2023 (Debian 12.3.0 was never released)
Debian 12.5.0 Does NOT work
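
The test results above already narrow the regression down to a window between two releases. A minimal Python sketch of that step follows, with the table transcribed by hand; the 12.5.0 kernel name is taken from the dmesg output quoted earlier in the thread, and the function name is made up for illustration.

```python
# (Debian release, kernel reported by `uname -r`, USB2 port works?)
RESULTS = [
    ("12.0.0", "6.1.0-9-amd64", True),
    ("12.1.0", "6.1.0-10-amd64", True),
    ("12.2.0", "6.1.0-13-amd64", False),
    ("12.4.0", "6.1.0-15-amd64", False),
    ("12.5.0", "6.1.0-18-amd64", False),
]


def regression_window(results):
    """Return (last good, first bad), assuming results are ordered oldest first."""
    last_good = None
    for entry in results:
        if entry[2]:
            last_good = entry
        else:
            return last_good, entry
    return last_good, None


last_good, first_bad = regression_window(RESULTS)
print("last good:", last_good[0], last_good[1])
print("first bad:", first_bad[0], first_bad[1])
```

The change therefore has to be among the kernel updates that went in between Debian 12.1.0 and 12.2.0, which is the range worth searching in the changelog.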

@FredrikFornstad

I have been looking at the changelog for Debian 12.2 for "suspects". I found these commits in the Linux kernel that were introduced in Debian 12.2:
- USB: core: Unite old scheme and new scheme descriptor reads
- USB: core: Change usb_get_device_descriptor() API
- USB: core: Fix race by not overwriting udev->descriptor in hub_port_init()
- USB: core: Fix oversight in SuperSpeed initialization

My guess, but I might be wrong, is that "Unite old scheme and new scheme descriptor reads" is the reason why the UZB no longer works. There is also a fix for USB3 (SuperSpeed), which was found not to work after the schemes update.

Here you can read a bit about these patches and how they were introduced: https://lore.kernel.org/linux-usb

@sairon
Member

sairon commented Mar 4, 2024

@FredrikFornstad Very interesting findings again, thanks! I've created a build with those four patches reverted for Raspberry Pi 3 (32-bit and 64-bit builds), could you test it? You can find the build here: https://github.com/home-assistant/operating-system/actions/runs/8141367477

Just ignore the failure, it's because the image of the target for testing wasn't built. Also note you must be logged in to download and the files are ZIP compressed when downloaded (it's a limitation of GH artifacts).

@sairon
Member

sairon commented Mar 4, 2024

FWIW, these changes were introduced to Linux stable in 6.1.53 - which means that last version working should be HAOS 10.5 for x86 (and other non-RPi boards, with 11.0.rc1 being the breaking version, bumping 6.1.52 -> 6.1.53), and HAOS 11.1 for Raspberry Pi (breaking version 11.2.rc1, bumping 6.1.21 -> 6.1.58). This issue is a bit of a mixed bag of reports from both platforms without any further details, and while there's clear consensus about reverting to 11.1 fixing the issue, it's not clear if it's related to x86, or those all are about RPi (which should have been reported in #2977).
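
A rough way to apply this information to a running system is to compare its upstream kernel version against 6.1.53. The Python sketch below does that for the 6.1.y series only. It is an approximation: a build with the patches reverted (such as the test build discussed in this thread) would still be flagged, and the input must be the upstream version (for example "6.1.63-haos"), not a Debian ABI name such as "6.1.0-18-amd64". The function names are made up for illustration.

```python
import re

# First 6.1.y stable release containing the changes, per the comment above.
FIRST_AFFECTED = (6, 1, 53)


def parse_kernel_version(release):
    """Extract (major, minor, patch) from strings such as '6.1.63-haos'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def has_descriptor_read_changes(release):
    """True/False for 6.1.y kernels, None for series this sketch does not cover."""
    version = parse_kernel_version(release)
    if version[:2] != FIRST_AFFECTED[:2]:
        return None
    return version >= FIRST_AFFECTED


for release in ("6.1.21", "6.1.52", "6.1.53", "6.1.58", "6.1.63-haos"):
    print(release, has_descriptor_read_changes(release))
```

For the version bumps named above, this marks 6.1.52 and 6.1.21 as unaffected and 6.1.53 and 6.1.58 as affected, matching the breaking releases listed for x86 and Raspberry Pi.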

@FredrikFornstad

FredrikFornstad commented Mar 4, 2024

Tested your special build (64-bit). Result: it works with the UZB on the rpi3b.

Dmesg reports Linux version 6.1.73-haos-raspi with a build date of this afternoon.

It is exactly as in HAOS 11.1: 2x2 error messages, and then directly after the "attempt power cycle" it finds the UZB, does the handshake, and cdc_acm assigns it to ttyACM0.

I have a broken internet connection today, so I only have my mobile. I will skip the 32-bit build for now, but I am sure it will also work.

Next step? Report to the Linux usb team?

@sairon
Member

sairon commented Mar 5, 2024

@FredrikFornstad Perfect, I have started a discussion in the linux-usb mailing list and reported the issue to the regressions list as well. Also, putting all the pieces together, I realized the "USB 2" port on my mini PC I used for the testing before, is in fact only black, but it's a SuperSpeed port anyway 🤦 That's why I wasn't able to reproduce it before (just like you, I got -71 errors instead and the driver recovers). And using RPi's USB 2 ports, I am reproducibly getting the USB enumeration error too 🎉

@sairon sairon added the board/raspberrypi Raspberry Pi Boards label Mar 5, 2024
@kvandt

kvandt commented Mar 5, 2024

Putting the UZB stick in a USB3 port is not a solution for the HA Yellow. This device only has USB2 ports (in USB-A format). I just tried the update from 11.1 to 11.5 and 12.0 again, but the UZB stick is not found. I switched USB ports (both are USB2), with no effect, as expected. After downgrading to 11.1, the UZB is recognised again.

I hope this can be solved upstream and, if not, in a special build for HA Yellow devices.

@sairon can you also add the HA Yellow label to this topic?

@sairon sairon added the board/yellow Home Assistant Yellow label Mar 6, 2024
sairon added a commit that referenced this issue Mar 6, 2024
Revert changes in the USB driver causing Z-Wave sticks (Z-Wave.me a
and Aeotec at least) failing to enumerate. Issue is reported upstream
but reverting the patches is a feasible workaround for the time being.

Refs #2995
sairon added a commit that referenced this issue Mar 6, 2024
Revert changes in the USB driver causing Z-Wave sticks (Z-Wave.me a
and Aeotec at least) failing to enumerate. Issue is reported upstream
but reverting the patches is a feasible workaround for the time being.

Refs #2995
@kvandt

kvandt commented Mar 6, 2024

Thnx @sairon !

@FredrikFornstad

FredrikFornstad commented Mar 11, 2024

> @FredrikFornstad Perfect, I have started a discussion in the linux-usb mailing list and reported the issue to the regressions list as well. Also, putting all the pieces together, I realized the "USB 2" port on my mini PC I used for the testing before, is in fact only black, but it's a SuperSpeed port anyway 🤦 That's why I wasn't able to reproduce it before (just like you, I got -71 errors instead and the driver recovers). And using RPi's USB 2 ports, I am reproducibly getting the USB enumeration error too 🎉

@sairon, I saw that Alan Stern asked for Windows Wireshark logs of the UZB being attached. This one is from my HP Pavilion laptop, running Windows 10 Home 22H2. The log starts exactly when I plug in the UZB; before that, there were 30 seconds of complete silence on the USB bus. The UZB is plugged into "Port 2", which becomes destination 1.3.0 once Windows has managed to start talking to it. One notable difference (I think) is that Windows does 5 port resets before it manages to start communicating with the UZB. I guess the UZB "problem" could be timing related, and that Linux gives up too soon. Also, I seem to remember reading somewhere that Windows uses the "new" scheme, just as Linux has done since the "troublesome" patches that caused the problems with the UZB, so if Linux made a few more attempts before giving up, it might also work with the new scheme. Anyway, maybe the attached log file is of some help. Windows assigns the UZB to COM3 in this particular case.

One thing that puzzles me is that with the Linux patches, USB3 ports actually work but USB2 ports do not, even though Alan says that the USB init code is now shared between the two... This leads me to think there has to be some difference in behaviour between the USB2 and USB3 code.

usbPcap1_USB2-port.txt

@FredrikFornstad

FredrikFornstad commented Mar 13, 2024

@sairon, I looked in detail at your usbmon traces and compared them side by side. I understand Alan has already looked at them in detail, and as an amateur I probably cannot provide much more useful input. But I will stick my neck out a little and offer my observations, for what they may be worth:

  1. With the "troublesome" patches reverted (the working example), the sequence executes faster. Assuming the timestamps (second column) are in usec, the working example is 1871 usec faster than the non-working example from the start of the logs up to the power-cycle instruction. Most of this comes from two places early in the logs: 800 usec + 1000 usec. I guess this could just be a coincidence caused by some CPU interrupts that have nothing to do with the USB bus. Repeated runs of usbmon would tell whether this was a coincidence or whether the troublesome patches actually made the whole init procedure slower than before.
  2. Approximately 1.28 seconds (I guess) after the start of the logs comes an interrupt input (I guess this is the power cycle). In the working example this took approx 1750 usec longer than in the non-working example. Why does the "interrupt input" come later (take longer?) in the working example? I guess this could be significant for the troubleshooting.
    Non-working example (timestamp diff = 42449):
    ffff9f9a29b3f300 298538893 C Co:1:002:0 0 0
    ffff9f9a012cae40 298581342 C Ii:1:002:1 0:2048 1 = 04
    Working example (timestamp diff = 44202):
    ffff8fc4ee367240 368298823 C Co:1:002:0 0 0
    ffff8fc4c0c5ac00 368343025 C Ii:1:002:1 0:2048 1 = 04
  3. After the interrupt input follows a control input submission and, directly after that, a control input callback in which, for the first time, the bit sequence differs. If this is where the init has been successful, then the cause of the difference lies before this point, which indicates a potential timing issue, unless there are things going on that are not captured by usbmon. Below is the first difference I see in the sequence apart from timing:
    Non-working example: ffff9f9a29b3f300 298742459 C Ci:1:002:0 0 4 = 03011000
    Working example: ffff8fc4ee367240 368502372 C Ci:1:002:0 0 4 = 01011100
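
The timing comparison in item 2 and the byte difference in item 3 can be reproduced with a short Python sketch. It reads the second usbmon column as a timestamp in microseconds and, as a loudly labelled assumption, interprets the 4-byte control-in replies as hub GetPortStatus answers (little-endian wPortStatus followed by wPortChange, with bit names from the USB 2.0 hub class definition). Nothing in the thread confirms that this is what those replies are, so treat the decoded flags as a hypothesis, not a finding. All function names are made up.

```python
# usbmon text records copied from the comment above.
NON_WORKING = [
    "ffff9f9a29b3f300 298538893 C Co:1:002:0 0 0",
    "ffff9f9a012cae40 298581342 C Ii:1:002:1 0:2048 1 = 04",
]
WORKING = [
    "ffff8fc4ee367240 368298823 C Co:1:002:0 0 0",
    "ffff8fc4c0c5ac00 368343025 C Ii:1:002:1 0:2048 1 = 04",
]

# Bit names from the USB 2.0 hub class definition of wPortStatus / wPortChange.
STATUS_BITS = {0: "CONNECTION", 1: "ENABLE", 2: "SUSPEND", 3: "OVER_CURRENT",
               4: "RESET", 8: "POWER", 9: "LOW_SPEED", 10: "HIGH_SPEED"}
CHANGE_BITS = {0: "C_CONNECTION", 1: "C_ENABLE", 2: "C_SUSPEND",
               3: "C_OVER_CURRENT", 4: "C_RESET"}


def timestamp_us(record):
    """Return the second usbmon column, the timestamp in microseconds."""
    return int(record.split()[1])


def delta_us(records):
    """Time between the first and last record of a short excerpt."""
    return timestamp_us(records[-1]) - timestamp_us(records[0])


def flag_names(value, names):
    """List the names of all bits that are set in value."""
    return [name for bit, name in sorted(names.items()) if value & (1 << bit)]


def decode_port_status(hex_data):
    """Decode 4 bytes as little-endian wPortStatus + wPortChange.

    ASSUMPTION: the reply is a hub GetPortStatus answer.
    """
    raw = bytes.fromhex(hex_data)
    status = int.from_bytes(raw[0:2], "little")
    change = int.from_bytes(raw[2:4], "little")
    return flag_names(status, STATUS_BITS), flag_names(change, CHANGE_BITS)


print(delta_us(NON_WORKING), delta_us(WORKING))
print("non-working:", decode_port_status("03011000"))
print("working:    ", decode_port_status("01011100"))
```

For the records quoted above, the two gaps come out as 42449 and 44202 microseconds, a difference of 1753. Under the GetPortStatus assumption, the two replies differ in the ENABLE status bit and in the connection-change bit, which may be worth raising with the linux-usb developers.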

@henriklund

Upgrade to v12.1 did not change the USB disconnect behaviour. Using it from USB3 improved stability (from disconnecting permanently every 2nd time to every 4-5 times ZWave JS was restarted).

@diablodale

diablodale commented Mar 14, 2024

As a datapoint... my Z-Wave.Me UZB has none of these failures that I've seen. Looking now I also see nothing in various logs.

  • It is on a RPI4 model B and plugged into a black USB 2 port on the top left.
  • My UZB has firmware 5.26. This is the newest firmware possible on my specific UZB stick.

Note: There are many sub-types of UZB sticks and each has distinct firmware upgrade paths that DO NOT all reach the same numbers.

@MUN0X

MUN0X commented Mar 21, 2024

I updated my HA Yellow from 11.1 to 12.1.
My UZB stick works, unlike on 11.2-11.4.

Looking at the dmesg output, I still get some errors while it attempts to read the descriptor; after a single power cycle it recognizes the stick.

@FredrikFornstad

@sairon, I saw that you are thinking about contacting the Z-Wave.Me team about a possible correction of the UZB firmware. In case you are not aware, the combination of bootloader version and firmware version might be relevant information for them. The bootloader version is easily obtained using their Z-Way software via the Expert UI (no license is needed to view the info or to upgrade the bootloader and/or firmware).

I do not recall exactly what versions my UZB sticks had when I bought them many years ago. But at least one of them was on 5.04 or 5.05. The other one must have been 5.06 or earlier.

Today, both my UZB sticks are running bootloader 40196 and firmware 5.39 (after several steps of upgrading both bootloader and firmware). As @diablodale says, the bootloader/firmware upgrade paths are not straightforward. Actually, I think it is possible to upgrade his UZB to 5.39 as well, but it requires filtering on "all" upgrade paths and not only the "active" ones. However, I do not know why some paths (the non-active ones) are normally hidden, so maybe this is not recommended.

See here for the very complicated upgrade path map: [https://service.z-wave.me/expertui/uzb-stats/versions-graph.html?hw=277&with_hidden]

Given the amount of time that has passed since the 5.39 firmware was released, I somewhat doubt that the Z-Wave.Me team can still make any adjustments (if it is technically possible at all).

@kvandt

kvandt commented Mar 31, 2024

For the record: 12.1 fixed the issue for me. (on Home-Assistant Yellow)

@Taraman17

For me as well, the ZWAVE stick was recognized with 12.1

@PoltoS

PoltoS commented Apr 16, 2024

Thank you @sairon for calling for me. I'm from Z-Wave.Me, the manufacturer of the UZB1.

Before we dive in deeper, just to test it: does an intermediate old-school USB hub help? This might be an easy solution.

Based on the detailed analysis here and in the Linux kernel thread, I can deduce that the UZB1 is setting up its pull-ups too early. Even though this device is EOL and has not been produced for more than 4 years, we still have the ability to make new firmware. In theory ;) But I'll have to check with my colleagues whether we have access to the USB part of the code, as in the old days Sigma Designs did not share the sources of internal parts of the SDK, especially the CDC-ACM driver implementation (in contrast, we now control everything except for the very low-level RF driver called RAIL).

I'll come back soon with more info from our R&D.

Please note that we have released cool new hardware to replace the UZB1. Modern and more professional.

  • Z-Station dual-protocol Z-Wave & Zigbee/Thread/BLE USB dongle
  • mPCI-E dual-protocol Z-Wave & Zigbee/Thread/BLE mini PCI-Express extension card
  • RaZberry 7 Pro Z-Wave extension card for Raspberry Pi and similar

All of them use high-performance external antennas providing 3-4 times better range than the UZB1. And with Long Range support (if you are in the US; soon in the EU too)! We recently published our test results, and the range went up to 1.9 km line-of-sight! And of course a simple backup and restore can help you migrate without any re-inclusion. So, if you have been thinking about replacing the old 5th-gen UZB1 with new hardware, now is the right time: to compensate for the issue above, we are offering a 20 EUR discount on any of the three products listed above with the coupon 2CPLD4DT.
