Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Combination of max_inflight_messages 1 and max_queued_messages 0 causes "Bad socket read/write" error #3088

Open
mossblaser opened this issue Jul 26, 2024 · 0 comments

Comments

@mossblaser
Copy link

Mosquitto v2.0.17 - v2.0.18 (latest at time of writing)

Since mosquitto v2.0.17, or more specifically, commit bfb373d ("Fix max_queued_message 0 stopping clients from receiving messages."), when the following configuration is used:

max_inflight_messages 1
max_queued_messages 0
log_type all  # Not part of bug, just to see what's happening

Sending two QoS 2 PUBLISH messages from a client in quick succession produces a "Bad socket read/write" error and disconnects the client.

Full mosquitto output (click to expand)
$ mosquitto -c mosquitto.conf 
1721319767: mosquitto version 2.0.18 starting
1721319767: Config loaded from my_mosquitto.conf.
1721319767: Starting in local only mode. Connections will only be possible from clients running on this machine.
1721319767: Create a configuration file which defines a listener to allow remote access.
1721319767: For more details see https://mosquitto.org/documentation/authentication-methods/
1721319767: Opening ipv4 listen socket on port 1883.
1721319767: Opening ipv6 listen socket on port 1883.
1721319767: mosquitto version 2.0.18 running
1721319769: New connection from ::1:51211 on port 1883.
1721319769: New client connected from ::1:51211 as auto-44FED64F-E9D6-546E-F356-DC5F36A5111C (p2, c1, k60).
1721319769: No will message specified.
1721319769: Sending CONNACK to auto-44FED64F-E9D6-546E-F356-DC5F36A5111C (0, 0)
1721319769: Received PUBLISH from auto-44FED64F-E9D6-546E-F356-DC5F36A5111C (d0, q2, r0, m1, 'foo', ... (3 bytes))
1721319769: Sending PUBREC to auto-44FED64F-E9D6-546E-F356-DC5F36A5111C (m1, rc0)
1721319769: Received PUBLISH from auto-44FED64F-E9D6-546E-F356-DC5F36A5111C (d0, q2, r0, m2, 'foo', ... (3 bytes))
1721319769: Sending PUBREC to auto-44FED64F-E9D6-546E-F356-DC5F36A5111C (m2, rc151)
1721319769: Bad socket read/write on client auto-44FED64F-E9D6-546E-F356-DC5F36A5111C: Unknown error.

The following minimal paho-mqtt based Python client can be used to reproduce this issue:

repro.py script (click to expand)
import paho.mqtt.client as mqtt

mqttc = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
mqttc.connect("localhost", 1883, 60)
mqttc.on_connect = lambda *_: print("on_connect")
mqttc.on_publish = lambda *_: print("on_publish")
mqttc.loop_start()

mqttc.publish("foo", b"bar", qos=2)
mqttc.publish("foo", b"bar", qos=2)

input("press enter to exit...")

Run using:

$ python3 -m venv venv
$ venv/bin/pip install paho-mqtt
$ venv/bin/python repro.py

It is the combination of both max_* options which triggers this bug -- remove either one and the client can connect and publish successfully.

Similarly, if a lower QoS than 2 is used, the bug does not manifest.

Likewise, don't send the two PUBLISH messages in quick succession and the problem goes away too. For example, just send one or put a time.sleep(1) between them and the bug does not manifest.

Mosquitto 2.0.16 (related, earlier bug)

Though the bug above was first released in mosquitto 2.0.17, the configuration shown also breaks mosquitto 2.0.16 due to a different bug introduced in commit 6113eac ("Fix for CVE-2023-28366"). This bug caused all messages to be dropped when max_queued_messages 0 is used with the log message "Outgoing messages are being dropped".

Full mosquitto 2.0.16 output (click to expand)
1721320706: mosquitto version 2.0.16 starting
1721320706: Config loaded from my_mosquitto.conf.
1721320706: Starting in local only mode. Connections will only be possible from clients running on this machine.
1721320706: Create a configuration file which defines a listener to allow remote access.
1721320706: For more details see https://mosquitto.org/documentation/authentication-methods/
1721320706: Opening ipv4 listen socket on port 1883.
1721320706: Opening ipv6 listen socket on port 1883.
1721320706: mosquitto version 2.0.16 running
1721320708: New connection from ::1:40175 on port 1883.
1721320708: New client connected from ::1:40175 as auto-F893A8DD-E931-3880-A59C-5BE2C381C192 (p2, c1, k60).
1721320708: No will message specified.
1721320708: Sending CONNACK to auto-F893A8DD-E931-3880-A59C-5BE2C381C192 (0, 0)
1721320708: Outgoing messages are being dropped for client auto-F893A8DD-E931-3880-A59C-5BE2C381C192.
1721320708: Received PUBLISH from auto-F893A8DD-E931-3880-A59C-5BE2C381C192 (d0, q2, r0, m1, 'foo', ... (3 bytes))
1721320708: Sending PUBREC to auto-F893A8DD-E931-3880-A59C-5BE2C381C192 (m1, rc0)
1721320708: Received PUBLISH from auto-F893A8DD-E931-3880-A59C-5BE2C381C192 (d0, q2, r0, m2, 'foo', ... (3 bytes))
1721320708: Sending PUBREC to auto-F893A8DD-E931-3880-A59C-5BE2C381C192 (m2, rc151)
1721320708: Bad socket read/write on client auto-F893A8DD-E931-3880-A59C-5BE2C381C192: Unknown error.

In this case, the bug is entirely caused by the max_queued_messages 0 option and prevents clients from even connecting.

It appears that the fix for this bug -- the commit bfb373d ("Fix max_queued_message 0 stopping clients from receiving messages.") mentioned at the top of this report -- may have been some way incomplete since it introduced a compatibility issue with the max_queued_messages 0 option.

Mosquitto 2.0.15 and earlier (works)

Mosquitto 2.0.15 works correctly with both max_inflight_messages 1 and max_queued_messages 0.

Other details

I have tested this on Arch Linux using Mosquitto binaries built directly from the repository with default settings. (The same bug does, however, appear in Arch's packaged versions).

The commits identified in this report were found using git bisect rather than an exhaustive scan! It is possible (though hopefully unlikely) I've identified the wrong changes. Sorry I haven't had the time to dig any deeper!

In case it is relevant, the reason for my use of both max_inflight_messages 1 and max_queued_messages 0 is that my application requires guaranteed, in-order delivery of messages, even during bursts of messages. (Naturally, all clients are trusted and any bursts in messages are strictly bounded -- I'm aware of the caveats of using these options!).

As ever, thank you so much to everyone for all the hard work in making and maintaining mosquitto!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant