Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CNTRLR Failed to execute controller command after X attempts. Scheduling next try in 1100 ms. #61502

Closed
sanderlv opened this issue Dec 11, 2021 · 10 comments

Comments

@sanderlv
Copy link

The problem

Constant problems with error:
CNTRLR Failed to execute controller command after 2/3 attempts. Scheduling next try in 1100 ms.

What version of Home Assistant Core has the issue?

core-2021.11.5

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

ZwaveJS

Link to integration documentation on our website

https://www.home-assistant.io/integrations/zwave_js/

Example YAML snippet

I started an issue in the addon section, but I think it is not the right place...
https://github.com/home-assistant/addons/issues/2298

Anything in the logs that might be useful for us?

https://github.com/home-assistant/addons/issues/2298

Additional information

home-assistant/addons#2298

@probot-home-assistant
Copy link

Hey there @home-assistant/z-wave, mind taking a look at this issue as it has been labeled with an integration (zwave_js) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)


zwave_js documentation
zwave_js source
(message by IssueLinks)

@MartinHjelmare
Copy link
Member

Please continue in this issue instead of the other issue. Please provide logs and a network dump and upload the dump as a text file here.

The dump tool is available in the configuration panel of the Z-Wave JS integration, reached from the integrations page in the Home Assistant GUI.

The network dump will help us troubleshoot your device.

Thanks!

@sanderlv
Copy link
Author

sanderlv commented Dec 11, 2021

@sanderlv
Copy link
Author

Any other thing, just ask!

@sanderlv
Copy link
Author

sanderlv commented Dec 12, 2021

I think I narrowed down the problem with the help of Tim (mwav3). I had an automation to do zwave_js.refresh_value every minute for 8 cover devices which do not reliably report instant status. I disable that and now I got only 1 error left (will open a new issue for that one since it is device specific I think).

My question now is, what is a safe and good way to poll the devices regularly since I need to have some statuses for automations... poll them in order? Or? What is the best and safest way to do so?
I have read: https://www.home-assistant.io/integrations/zwave_js/#what-about-polling-of-devices

On that matter... the issue is:

2021-12-12T19:55:15.133Z DRIVER   Dropping message with invalid payload
2021-12-12T19:55:15.133Z DRIVER « [Node 043] [REQ] [ApplicationCommand]
                                  └─[SecurityCCCommandEncapsulation] [INVALID]
                                      error: Nonce 0xab expired, cannot decode security encapsulated command.

for device ID43... The device works well but gives below error. Where is the right place to ask for support on that?

2021-12-12T19:55:13.842Z SERIAL « 0x010a0004002b029840ca00ca                                          (12 bytes)
2021-12-12T19:55:13.844Z SERIAL » [ACK]                                                                   (0x06)
2021-12-12T19:55:13.845Z DRIVER « [Node 043] [REQ] [ApplicationCommand]
                                  └─[SecurityCCNonceGet]
2021-12-12T19:55:13.849Z SERIAL » 0x011100132b0a9880ab73d66daa5072a90577f4                            (19 bytes)
2021-12-12T19:55:13.849Z DRIVER » [Node 043] [REQ] [SendData]
                                  │ transmit options: 0x05
                                  │ callback id:      119
                                  └─[SecurityCCNonceReport]
                                      nonce: 0xab73d66daa5072a9
2021-12-12T19:55:13.854Z SERIAL « [ACK]                                                                   (0x06)
2021-12-12T19:55:13.863Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2021-12-12T19:55:13.864Z SERIAL » [ACK]                                                                   (0x06)
2021-12-12T19:55:13.865Z DRIVER « [RES] [SendData]
                                    was sent: true
2021-12-12T19:55:14.010Z SERIAL « 0x011800137700000e027e7f7f7f7f0101034746000002010000f0              (26 bytes)
2021-12-12T19:55:14.011Z SERIAL » [ACK]                                                                   (0x06)
2021-12-12T19:55:14.012Z DRIVER « [REQ] [SendData]
                                    callback id:     119
                                    transmit status: OK
2021-12-12T19:55:15.101Z SERIAL « 0x012f0004002b279881ff526249e6d0594d059914e40c8d99fd29ec9cd79a2c3bc (49 bytes)
                                  cde8b529eabb2a9cfa237a507e3ca00db
2021-12-12T19:55:15.102Z SERIAL » [ACK]                                                                   (0x06)
2021-12-12T19:55:15.103Z CNTRLR   [Node 043] [~] [Meter] value[66049]: 3 => 3.1                     [Endpoint 1]
2021-12-12T19:55:15.104Z DRIVER « [Node 043] [REQ] [ApplicationCommand]
                                  └─[SecurityCCCommandEncapsulation]
                                    │ sequenced: false
                                    └─[MultiChannelCCCommandEncapsulation]
                                      │ source:      1
                                      │ destination: 0
                                      └─[MeterCCReport]
                                          type:        Electric
                                          scale:       W
                                          rate type:   Consumed
                                          value:       3.1
                                          time delta:  31 seconds
                                          prev. value: 3
2021-12-12T19:55:15.132Z SERIAL « 0x012f0004002b279881ff526249e6d0594d059914e40c8d99fd29ec9cd79a2c3bc (49 bytes)
                                  cde8b529eabb2a9cfa237a507e3ca00db
2021-12-12T19:55:15.133Z DRIVER   Dropping message with invalid payload
2021-12-12T19:55:15.133Z DRIVER « [Node 043] [REQ] [ApplicationCommand]
                                  └─[SecurityCCCommandEncapsulation] [INVALID]
                                      error: Nonce 0xab expired, cannot decode security encapsulated command.
2021-12-12T19:55:15.134Z SERIAL » [ACK]                                                                   (0x06)
2021-12-12T19:55:45.335Z SERIAL « 0x010a0004002b029840ca00ca                                          (12 bytes)
2021-12-12T19:55:45.336Z SERIAL » [ACK]                                                                   (0x06)
2021-12-12T19:55:45.337Z DRIVER « [Node 043] [REQ] [ApplicationCommand]
                                  └─[SecurityCCNonceGet]
2021-12-12T19:55:45.340Z SERIAL » 0x011100132b0a9880ed5e3544473f7d6005781e                            (19 bytes)
2021-12-12T19:55:45.341Z DRIVER » [Node 043] [REQ] [SendData]
                                  │ transmit options: 0x05
                                  │ callback id:      120
                                  └─[SecurityCCNonceReport]
                                      nonce: 0xed5e3544473f7d60
2021-12-12T19:55:45.346Z SERIAL « [ACK]                                                                   (0x06)
2021-12-12T19:55:45.361Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2021-12-12T19:55:45.361Z SERIAL » [ACK]                                                                   (0x06)
2021-12-12T19:55:45.362Z DRIVER « [RES] [SendData]
                                    was sent: true
2021-12-12T19:55:45.508Z SERIAL « 0x011800137800000e027e7f7f7f7f0101034746000002010000ff              (26 bytes)
2021-12-12T19:55:45.508Z SERIAL » [ACK]                                                                   (0x06)
2021-12-12T19:55:45.509Z DRIVER « [REQ] [SendData]
                                    callback id:     120
                                    transmit status: OK
2021-12-12T19:55:45.593Z SERIAL « 0x012f0004002b2798815a654dc7d7c58367ea33d7f63729cfad1169da4df44e124 (49 bytes)
                                  1f14718d5edea10d937e755cda97e002a
2021-12-12T19:55:45.593Z SERIAL » [ACK]                                                                   (0x06)
2021-12-12T19:55:45.594Z CNTRLR   [Node 043] [~] [Meter] value[66049]: 3.1 => 3                     [Endpoint 1]
2021-12-12T19:55:45.595Z DRIVER « [Node 043] [REQ] [ApplicationCommand]
                                  └─[SecurityCCCommandEncapsulation]
                                    │ sequenced: false
                                    └─[MultiChannelCCCommandEncapsulation]
                                      │ source:      1
                                      │ destination: 0
                                      └─[MeterCCReport]
                                          type:        Electric
                                          scale:       W
                                          rate type:   Consumed
                                          value:       3
                                          time delta:  31 seconds
                                          prev. value: 3.1
2021-12-12T19:56:16.800Z SERIAL « 0x010a0004002b029840ca00ca                                          (12 bytes)
2021-12-12T19:56:16.801Z SERIAL » [ACK]                                                                   (0x06)
2021-12-12T19:56:16.802Z DRIVER « [Node 043] [REQ] [ApplicationCommand]
                                  └─[SecurityCCNonceGet]
2021-12-12T19:56:16.807Z SERIAL » 0x011100132b0a98802554d943d621c36905790e                            (19 bytes)
2021-12-12T19:56:16.808Z DRIVER » [Node 043] [REQ] [SendData]
                                  │ transmit options: 0x05
                                  │ callback id:      121
                                  └─[SecurityCCNonceReport]
                                      nonce: 0x2554d943d621c369
2021-12-12T19:56:16.811Z SERIAL « [ACK]                                                                   (0x06)
2021-12-12T19:56:16.825Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2021-12-12T19:56:16.825Z SERIAL » [ACK]                                                                   (0x06)
2021-12-12T19:56:16.826Z DRIVER « [RES] [SendData]
                                    was sent: true
2021-12-12T19:56:16.976Z SERIAL « 0x011800137900000d02dd7f7f7f7f01010347460000020100005e              (26 bytes)
2021-12-12T19:56:16.976Z SERIAL » [ACK]                                                                   (0x06)
2021-12-12T19:56:16.977Z DRIVER « [REQ] [SendData]
                                    callback id:     121
                                    transmit status: OK
2021-12-12T19:56:17.065Z SERIAL « 0x012f0004002b279881b6b62714807075fb859988e4490743c54b5a9ee84211b91 (49 bytes)
                                  f7d23542325eb49126db1f29cf67e0020
2021-12-12T19:56:17.066Z SERIAL » [ACK]                                                                   (0x06)
2021-12-12T19:56:17.067Z CNTRLR   [Node 043] [~] [Meter] value[66049]: 3 => 3                       [Endpoint 1]
2021-12-12T19:56:17.067Z DRIVER « [Node 043] [REQ] [ApplicationCommand]
                                  └─[SecurityCCCommandEncapsulation]
                                    │ sequenced: false
                                    └─[MultiChannelCCCommandEncapsulation]
                                      │ source:      1
                                      │ destination: 0
                                      └─[MeterCCReport]
                                          type:        Electric
                                          scale:       W
                                          rate type:   Consumed
                                          value:       3
                                          time delta:  31 seconds
                                          prev. value: 3
2021-12-12T19:56:25.056Z SERIAL « 0x0116000400350e600d010032022134000000000000c1005e                  (24 bytes)

image

Think it's this one:
https://devices.zwave-js.io/?jumpTo=0x0299:0x0003:0x1a91:0.0

@mwav3
Copy link

mwav3 commented Dec 12, 2021

Glad I could help. Yeah you definitely don't want to poll 8 devices at once every minute. Not sure the best way automation wise to space that out - that could be a whole new thread you might want to open on the forum.

As far as your node 43, do you need S0 security? S0 security uses a tremendous amount of bandwith, and should really only be used if abosolutely necessary, like a lock. I would drop security for that node if you're ok with that (it will still have zwave security and unless someone has some sort of advance hacking tool and is at your house they won't be able to control it. It also looks like it is sending a meter report every 30 seconds. I would definitely change the report interval at parameter 2 to do that less frequently. I'd say the S0 security with 30 second reporting is way too much traffic.

Zwave is very low bandwidth and is very easily overwhelmed by too much traffic. I think the key to your issues is reducing traffic - avoid S0 security, decrease polling, and decrease frequency of power management reports.

@sanderlv
Copy link
Author

sanderlv commented Dec 12, 2021

Dropping S0 is not an issue, do not need that no. But... is excluding and including the only option? Or is there some trick to re-interview insecure (normal)?

Yes report every 30 secs. But what's that for? Only for power? I do not need that at all... once an hour is ok then.
image

Changed to 600 (1 hour):
image

@mwav3
Copy link

mwav3 commented Dec 12, 2021

Zwave Security is only established at inclusion, so you'll have to exclude and reinclude to drop s0 security.

You can reuse the old entity name when you reinclude so you won't have to change your automations. The Node id number will change though so if any automation uses that it will need updating.

@sanderlv
Copy link
Author

Hi all, sorry to chime in again. I still have these vague "major" issues. When firing multiple commands I have a huge lag. Devices on battery power, sending their status just do not get seen by the controller. also I still have these regular messages:

in between:

2022-01-25T07:09:35.937Z CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try in 100 ms.
2022-01-25T07:09:36.044Z CNTRLR   Failed to execute controller command after 2/3 attempts. Scheduling next try in 1100 ms.
2022-01-25T04:35:59.592Z CNTRLR « [Node 057] received wakeup notification
2022-01-25T04:35:59.593Z CNTRLR   [Node 057] The node is now awake.
2022-01-25T04:36:00.593Z CNTRLR » [Node 057] Sending node back to sleep...
2022-01-25T04:36:00.659Z CNTRLR   [Node 057] The node is now asleep.
2022-01-25T04:42:52.922Z CNTRLR « [Node 012] received wakeup notification
2022-01-25T04:52:23.237Z CNTRLR « [Node 030] received wakeup notification
2022-01-25T04:52:23.238Z CNTRLR   [Node 030] The node is now awake.
2022-01-25T04:52:24.238Z CNTRLR » [Node 030] Sending node back to sleep...
2022-01-25T04:52:24.284Z CNTRLR   [Node 030] The node is now asleep.
2022-01-25T04:58:37.151Z CNTRLR « [Node 029] received wakeup notification
2022-01-25T04:58:37.152Z CNTRLR   [Node 029] The node is now awake.
2022-01-25T04:58:38.153Z CNTRLR » [Node 029] Sending node back to sleep...
2022-01-25T04:58:38.213Z CNTRLR   [Node 029] The node is now asleep.
2022-01-25T05:02:38.029Z CNTRLR « [Node 103] received wakeup notification
2022-01-25T05:02:38.030Z CNTRLR   [Node 103] The node is now awake.
2022-01-25T05:02:39.031Z CNTRLR » [Node 103] Sending node back to sleep...
2022-01-25T05:02:39.084Z CNTRLR   [Node 103] The node is now asleep.
2022-01-25T05:11:24.385Z CNTRLR   [Node 062] Timed out while waiting for a response from the node (ZW0201)
2022-01-25T05:14:52.546Z CNTRLR « [Node 104] received wakeup notification
2022-01-25T05:14:52.546Z CNTRLR   [Node 104] The node is now awake.
2022-01-25T05:14:53.547Z CNTRLR » [Node 104] Sending node back to sleep...
2022-01-25T05:14:53.586Z CNTRLR   [Node 104] The node is now asleep.
2022-01-25T05:29:41.982Z CNTRLR « [Node 014] received wakeup notification
2022-01-25T05:29:41.982Z CNTRLR   [Node 014] The node is now awake.
2022-01-25T05:29:42.983Z CNTRLR » [Node 014] Sending node back to sleep...
2022-01-25T05:29:43.081Z CNTRLR   [Node 014] The node is now asleep.
2022-01-25T05:36:21.260Z CNTRLR « [Node 057] received wakeup notification
2022-01-25T05:36:21.261Z CNTRLR   [Node 057] The node is now awake.
2022-01-25T05:36:22.262Z CNTRLR » [Node 057] Sending node back to sleep...
2022-01-25T05:36:22.318Z CNTRLR   [Node 057] The node is now asleep.
2022-01-25T05:42:51.911Z CNTRLR « [Node 012] received wakeup notification
2022-01-25T05:57:25.040Z CNTRLR   [Node 066] Mapping unsolicited report from root device to endpoint #1
2022-01-25T05:58:13.218Z CNTRLR   [Node 066] Mapping unsolicited report from root device to endpoint #1
2022-01-25T06:00:52.947Z CNTRLR « [Node 021] received wakeup notification
2022-01-25T06:00:52.948Z CNTRLR   [Node 021] The node is now awake.
2022-01-25T06:00:53.949Z CNTRLR » [Node 021] Sending node back to sleep...
2022-01-25T06:00:54.009Z CNTRLR   [Node 021] The node is now asleep.
2022-01-25T06:09:11.509Z CNTRLR « [Node 103] received wakeup notification
2022-01-25T06:09:11.509Z CNTRLR   [Node 103] The node is now awake.
2022-01-25T06:09:12.511Z CNTRLR » [Node 103] Sending node back to sleep...
2022-01-25T06:09:12.577Z CNTRLR   [Node 103] The node is now asleep.
2022-01-25T06:15:51.732Z CNTRLR « [Node 104] received wakeup notification
2022-01-25T06:15:51.732Z CNTRLR   [Node 104] The node is now awake.
2022-01-25T06:15:52.733Z CNTRLR » [Node 104] Sending node back to sleep...
2022-01-25T06:15:52.780Z CNTRLR   [Node 104] The node is now asleep.
2022-01-25T06:22:39.938Z CNTRLR « [Node 052] received wakeup notification
2022-01-25T06:22:39.939Z CNTRLR   [Node 052] The node is now awake.
2022-01-25T06:22:40.940Z CNTRLR » [Node 052] Sending node back to sleep...
2022-01-25T06:22:40.988Z CNTRLR   [Node 052] The node is now asleep.
2022-01-25T06:26:47.708Z CNTRLR   [Node 066] Mapping unsolicited report from root device to endpoint #1
2022-01-25T06:27:28.227Z CNTRLR   [Node 066] Mapping unsolicited report from root device to endpoint #1
2022-01-25T06:30:51.112Z CNTRLR   [Node 063] Mapping unsolicited report from root device to endpoint #1
2022-01-25T06:32:17.922Z CNTRLR   [Node 054] Mapping unsolicited report from root device to endpoint #1
2022-01-25T06:34:48.788Z CNTRLR « [Node 039] received wakeup notification
2022-01-25T06:34:48.789Z CNTRLR   [Node 039] The node is now awake.
2022-01-25T06:34:49.791Z CNTRLR » [Node 039] Sending node back to sleep...
2022-01-25T06:34:49.891Z CNTRLR   [Node 039] The node is now asleep.
2022-01-25T06:36:42.098Z CNTRLR « [Node 057] received wakeup notification
2022-01-25T06:36:42.099Z CNTRLR   [Node 057] The node is now awake.
2022-01-25T06:36:43.100Z CNTRLR » [Node 057] Sending node back to sleep...
2022-01-25T06:36:43.152Z CNTRLR   [Node 057] The node is now asleep.
2022-01-25T06:38:39.861Z CNTRLR « [Node 044] received wakeup notification
2022-01-25T06:38:39.861Z CNTRLR   [Node 044] The node is now awake.
2022-01-25T06:38:40.861Z CNTRLR » [Node 044] Sending node back to sleep...
2022-01-25T06:38:40.902Z CNTRLR   [Node 044] The node is now asleep.
2022-01-25T06:42:50.897Z CNTRLR « [Node 012] received wakeup notification
2022-01-25T06:43:34.427Z CNTRLR   [Node 048] Mapping unsolicited report from root device to endpoint #1
2022-01-25T06:43:56.768Z CNTRLR   [Node 048] Mapping unsolicited report from root device to endpoint #1
2022-01-25T07:03:26.456Z CNTRLR   [Node 054] Mapping unsolicited report from root device to endpoint #1
2022-01-25T07:07:06.717Z CNTRLR   [Node 048] Mapping unsolicited report from root device to endpoint #1
2022-01-25T07:08:02.157Z DRIVER   Dropping message with invalid payload
2022-01-25T07:08:03.170Z DRIVER   Dropping message with invalid payload
2022-01-25T07:08:18.143Z DRIVER   Dropping message with invalid payload
2022-01-25T07:09:35.937Z CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                                  n 100 ms.
2022-01-25T07:09:36.044Z CNTRLR   Failed to execute controller command after 2/3 attempts. Scheduling next try i
                                  n 1100 ms.
2022-01-25T07:09:47.598Z CNTRLR   [Node 048] Mapping unsolicited report from root device to endpoint #1
2022-01-25T07:09:47.641Z CNTRLR   [Node 048] Mapping unsolicited report from root device to endpoint #1
2022-01-25T07:09:47.807Z CNTRLR   [Node 048] Mapping unsolicited report from root device to endpoint #1
2022-01-25T07:10:04.056Z CNTRLR   [Node 021] treating BasicCC::Set as a report
2022-01-25T07:15:46.987Z CNTRLR « [Node 103] received wakeup notification
2022-01-25T07:15:46.987Z CNTRLR   [Node 103] The node is now awake.
2022-01-25T07:15:47.987Z CNTRLR » [Node 103] Sending node back to sleep...
2022-01-25T07:15:48.062Z CNTRLR   [Node 103] The node is now asleep.
2022-01-25T07:16:50.879Z CNTRLR « [Node 104] received wakeup notification
2022-01-25T07:16:50.879Z CNTRLR   [Node 104] The node is now awake.
2022-01-25T07:16:51.881Z CNTRLR » [Node 104] Sending node back to sleep...
2022-01-25T07:16:51.939Z CNTRLR   [Node 104] The node is now asleep.
2022-01-25T07:20:23.952Z CNTRLR   [Node 021] treating BasicCC::Set as a report
2022-01-25T07:20:35.658Z CNTRLR   [Node 048] Timed out while waiting for a response from the node (ZW0201)
2022-01-25T07:20:42.251Z CNTRLR   [Node 021] treating BasicCC::Set as a report

@sanderlv
Copy link
Author

I tried tom manually "zwavejs_refresh value" multiple times after each other but I can't reproduce the "failed to execute" with that...

@github-actions github-actions bot locked and limited conversation to collaborators Feb 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants