mosquitto.db contains only null bytes -> "Unable to restore persistent database. Unrecognised file format." #2639

Open
svet-b opened this issue Sep 24, 2022 · 1 comment

svet-b commented Sep 24, 2022

I'm experiencing a surprisingly frequent corruption issue with the persistent database. It looks a bit like the previously reported and fixed #189 and #424, but it isn't the same.

Context

This is happening on 2.0.14. I'm conscious that there have been some updates to the persistence mechanism in 2.0.15, but unfortunately I can't try that due to #2634. Based on the commit history, I don't feel that these updates in persist behavior play a role here.

We have around 100 Raspberry Pi-based dataloggers running in the field. A couple of weeks ago we installed Mosquitto on them (bridged to a central broker), with the persistence setting

persistence true
autosave_interval 300

Since then, this corruption issue has cropped up on 5-10% of them, which seems quite high. These dataloggers do experience occasional power outages, but in most cases those shouldn't occur more than once or twice per day. So I find it unlikely that over such a short period there would have been many power outages that occurred exactly within the (cumulative) few seconds of the day when persistence is taking place. But that's veering into speculation.
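
As a rough back-of-envelope check of that intuition, the sketch below estimates the chance that at least one outage lands inside a save window on a single device. The 10 ms write duration and the 14-day observation period are assumptions for illustration, not measurements; the 300 s interval and the two outages per day come from the description above. It also assumes outages are independent and uniformly distributed over the day.

```python
# Back-of-envelope estimate; write_seconds and days are assumed values.
SECONDS_PER_DAY = 86_400

def p_outage_during_save(autosave_interval, write_seconds, outages_per_day, days):
    """Probability that at least one outage falls inside a save window."""
    saves_per_day = SECONDS_PER_DAY / autosave_interval
    # Fraction of each day during which a save is in progress.
    exposed_fraction = saves_per_day * write_seconds / SECONDS_PER_DAY
    total_outages = outages_per_day * days
    return 1 - (1 - exposed_fraction) ** total_outages

p = p_outage_during_save(
    autosave_interval=300,  # autosave_interval from the config above
    write_seconds=0.01,     # assumption: a few-KB write takes ~10 ms
    outages_per_day=2,      # upper end of "once or twice per day"
    days=14,                # assumption: "a couple of weeks"
)
print(f"{p:.4%}")
```

Under these assumptions the per-device probability comes out at roughly 0.1%, far below the observed 5-10%. The estimate only covers the time the write call itself is running, so it is a lower bound on the real exposure rather than a precise figure.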

The issue itself

The symptoms of database corruption that I'm experiencing are consistently the following:

  • On startup, Mosquitto logs the following messages and then exits:
Error: Unable to restore persistent database. Unrecognised file format.
Error: Couldn't open database.

  • A mosquitto.db.new file exists and contains a valid, non-corrupted database. That is, simply renaming mosquitto.db.new to mosquitto.db allows Mosquitto to run successfully.
  • A mosquitto.db file exists and has the same size as the mosquitto.db.new file, but is blank: it contains nothing but NULL bytes.
  • The two files (obviously) have different inodes.
  • The size of the persistence database is in most cases just a few kilobytes. So the persistence operation shouldn't take a meaningful amount of time (which would increase the chance of it being in progress as a power outage hits).
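
The symptoms above can be checked, and the rename workaround applied, with a small script along these lines. This is only a sketch: the function names are made up for illustration, it does not verify that the .new file is a valid Mosquitto database (only that it is not all NULL bytes), and it should be run while Mosquitto is stopped.

```python
import os

def is_all_null(path):
    """True if the file is non-empty and contains only 0x00 bytes."""
    with open(path, "rb") as f:
        data = f.read()
    return len(data) > 0 and data.count(0) == len(data)

def matches_symptoms(db_path):
    """True if <db> is all NULL bytes and <db>.new has the same size but real data."""
    new_path = db_path + ".new"
    if not (os.path.isfile(db_path) and os.path.isfile(new_path)):
        return False
    return (
        is_all_null(db_path)
        and not is_all_null(new_path)
        and os.path.getsize(db_path) == os.path.getsize(new_path)
    )

def restore_from_new(db_path):
    """Rename <db>.new over <db>, but only when the symptoms match.

    Returns True if the rename was performed, False otherwise.
    """
    if not matches_symptoms(db_path):
        return False
    os.replace(db_path + ".new", db_path)
    return True
```

Calling restore_from_new() with the full path to mosquitto.db (which depends on the persistence_location setting in mosquitto.conf) leaves the files untouched unless all of the conditions hold.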

I briefly looked at https://github.com/eclipse/mosquitto/blob/v2.0.14/src/persist_write.c but can't figure out what could be causing the behavior we're observing - specifically a mosquitto.db full of NULL bytes. Any ideas?

I suppose it could be device or OS-specific behavior (or behavior that's specific to writing to SD cards) but haven't found any clues there either. Also no luck reproducing it in a controlled environment yet.

itdap77 commented Sep 28, 2022

Hi, I am having the same problem. Were you able to figure it out? I am trying to find the location of the DB; do you know where it is?
