Segfaults in v2.0.17 #2885

Closed
sauvant opened this issue Aug 26, 2023 · 11 comments
Comments

@sauvant

sauvant commented Aug 26, 2023

I keep getting segfaults with v2.0.17 running in Docker on Plesk on Debian:
mosquitto[24889]: segfault at 500000000 ip 00007fba45142fc1 sp 00007fff4bc3e338 error 4 in ld-musl-x86_64.so.1[7fba45104000+4c000]

I have not found a way to reproduce it yet.

Environment:
Debian 10.13
Plesk Obsidian v18.0.54_build1800230824.08 os_Debian 10.0
Docker version 24.0.5, build ced0996
Mosquitto version 2.0.17

@NorbertHeusser
Contributor

Hi Keith,
Can you provide any more information about your setup and usage of Mosquitto, or even some log files?
Ideally these would be log files created with log_type all.

If this is not possible, any additional information might help, such as:

  • Are any plugins used (e.g. dynamic security)?
  • Is persistence enabled on the broker?
  • Are clients using persistent sessions (clean session = false)?
  • Are there any messages with retain/qos1/qos2?

Without any kind of detail it will be hard to pin down the problem.

@tbclark3

tbclark3 commented Sep 7, 2023

I'm getting a similar segfault on 2.0.17 almost every time I restart Home Assistant:
mosquitto[608511]: segfault at 200000000 ip 00007f0b8cefffc1 sp 00007ffff9d82498 error 4 in ld-musl-x86_64.so.1[7f0b8cec1000+4c000] likely on CPU 20 (core 16, socket 0)

Home Assistant has several hundred mqtt entities, all of which have to update at startup, but I don't think that's an unusual number. I have been experimenting with configs but haven't been able to find anything consistent.

Persistence is enabled, and I am not using any plugins. There are many messages with retain set, from Home Assistant and other clients.
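
For anyone comparing the two kernel lines reported so far, here is a minimal sketch of how they can be decoded. It assumes the usual Linux x86-64 log layout (`ip` followed by `name[base+size]`) and the standard x86 page-fault error-code bits; the helper names `fault_offset` and `decode_pf_error` are made up for this illustration and are not part of any tool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offset of the faulting instruction inside the mapped object:
 * the "ip" value minus the base printed in "name[base+size]". */
uint64_t fault_offset(uint64_t ip, uint64_t map_base)
{
    assert(ip >= map_base);
    return ip - map_base;
}

/* The three low bits of the x86 page-fault error code. */
struct pf_error {
    bool protection_violation; /* bit 0: 0 = page not present, 1 = protection fault */
    bool write;                /* bit 1: 0 = read access, 1 = write access */
    bool user_mode;            /* bit 2: 1 = fault happened in user mode */
};

struct pf_error decode_pf_error(unsigned int code)
{
    struct pf_error e;
    e.protection_violation = (code & 0x1u) != 0;
    e.write = (code & 0x2u) != 0;
    e.user_mode = (code & 0x4u) != 0;
    return e;
}
```

Applied to the two reports above, both `ip` values land at offset 0x3efc1 inside ld-musl-x86_64.so.1, and `error 4` decodes to a user-mode read of a page that is not mapped. That suggests, but does not prove, that both crashes are the same bug reached through the same libc routine.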

@RoboMagus

Been seeing the same thing lately.
At first I thought it had something to do with the Saving in-memory database stuff. See #2726.
But after monitoring the Docker status of the Mosquitto container for a while, I've seen exit code 139 come by. Looking at the log, this occurs both on new connections (not just HA, but other clients as well) and on the DB writes.
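
As a side note on why exit code 139 points at a segfault: by the common shell convention, which Docker also appears to follow, a process killed by signal N is reported with status 128 + N, and SIGSEGV is 11 on Linux. A minimal sketch, with `signal_from_exit_status` being a hypothetical helper name:

```c
#include <assert.h>
#include <signal.h>

/* Returns the signal number encoded in a shell-style exit status
 * (128 + signal), or 0 if the status does not look signal-related. */
int signal_from_exit_status(int status)
{
    assert(status >= 0 && status <= 255);
    return status > 128 ? status - 128 : 0;
}
```

With this, `signal_from_exit_status(139)` yields the value of `SIGSEGV` on Linux, which matches the kernel segfault lines quoted earlier in the thread.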

@LucidityCrash

Wish I could add more than a "Me too". I'm not noticing it for connects/disconnects with MQTT Explorer on Windows, but restarting Home Assistant causes a segfault in Mosquitto 2.1.7. Reverting to 2.1.5 (what I was previously running) makes the issue go away.

@tbclark3

I can add that the crash happens during the startup phase of Home Assistant (and not during the shutdown phase). It is not instant; the MQTT crash happens several seconds into the startup, but I'm not sure whether it's a single transaction that causes it or the cumulative effect of several startup transactions.

@halfgaar

Probably the same thing I found in #2881. I have a stack trace there. It started in version 2.0.16, and I reverted our servers back to 2.0.15.

@ralight
Contributor

ralight commented Sep 12, 2023

We've finally managed to replicate this once after running on test.mosquitto.org for a good while, so hopefully should have an update soon.

@ralight
Contributor

ralight commented Sep 12, 2023

Could you please try out the master branch if convenient? I intend to make a new release in the next 12 hours.

@halfgaar

halfgaar commented Sep 13, 2023

I haven't been able to reproduce it; I only saw it on the live system, so I don't know what to do.

Does it make sense that this behavior was introduced after 2.0.15?

This seems to directly revert 18ea97c "Fixes sub_count is not decreased when client ubsubscribe" BTW.

@ralight
Contributor

ralight commented Sep 13, 2023

@halfgaar Yes, it does directly revert that commit. It was an incorrect backport from develop, where the case is slightly different. In develop we have sub_capacity, which is the size of the subscription array, and sub_count, which is the count of subscriptions we have. In 2.0, we just have sub_count, which refers to the size of the array. sub_count is also accessible to plugins through the mosquitto_client_sub_count() call. In the case of 2.0, this call may produce incorrect values, hence the change in develop.
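
To make the difference between the two fields concrete, here is a minimal sketch. It is a hypothetical, simplified model and not Mosquitto's actual code: the struct `client_subs` and the helpers `subs_add` and `subs_remove` are invented for illustration, and only the field names sub_count and sub_capacity come from the explanation above. It assumes a slot array in which NULL marks a free slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a client's subscription slots (not Mosquitto code). */
struct client_subs {
    const char **subs;  /* slot array; NULL means the slot is free */
    int sub_capacity;   /* number of slots allocated in the array */
    int sub_count;      /* number of slots currently in use */
};

/* Store a topic in the first free slot, growing the array by one if full. */
int subs_add(struct client_subs *c, const char *topic)
{
    assert(c != NULL && topic != NULL);
    for (int i = 0; i < c->sub_capacity; i++) {
        if (c->subs[i] == NULL) {
            c->subs[i] = topic;
            c->sub_count++;
            return 0;
        }
    }
    const char **grown = realloc(c->subs, sizeof(*grown) * (size_t)(c->sub_capacity + 1));
    if (grown == NULL) {
        return -1;
    }
    c->subs = grown;
    c->subs[c->sub_capacity] = topic;
    c->sub_capacity++;
    c->sub_count++;
    return 0;
}

/* Free the slot holding the topic. Only the in-use count shrinks;
 * the array bound (sub_capacity) stays the same. */
int subs_remove(struct client_subs *c, const char *topic)
{
    assert(c != NULL && topic != NULL);
    for (int i = 0; i < c->sub_capacity; i++) {
        if (c->subs[i] != NULL && strcmp(c->subs[i], topic) == 0) {
            c->subs[i] = NULL;
            c->sub_count--;
            return 0;
        }
    }
    return -1;
}
```

In this model, adding three topics and then removing the first leaves sub_count at 2 while sub_capacity is still 3 and the last slot is still occupied. Code that used sub_count as the loop bound would then stop short of a live entry. If a single field has to serve as the array size, as described for 2.0, decrementing it on unsubscribe would presumably cause this kind of mismatch, though the exact failure path in the broker is not spelled out in this thread.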

@tbclark3

@ralight, thanks for your work on this! You mentioned a new release within 12 hours, but that was several days ago. Do you have a timeframe for a new release?

github-actions bot locked as resolved and limited conversation to collaborators on Dec 17, 2023
7 participants