Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broker doesn't honor Receive Maximum value set in a bridged broker's CONNACK #3002

Open
calliday opened this issue Feb 12, 2024 · 3 comments

Comments

@calliday
Copy link

calliday commented Feb 12, 2024

UPDATE

The root cause has been found in the comment below, and the details listed in this comment are out of date, though the logs give good context.

Environment Details

OS: Debian Bullseye
Mosquitto Version: 2.0.18

Issue

A Mosquitto broker with >20 queued messages destined for a bridged HiveMQ broker connects, attempts to send 20 queued messages, closes the bridge, and repeats.

I have Mosquitto and one MQTT client running on an IoT device. Mosquitto is configured to bridge to a HiveMQ broker running on a remote machine. I conducted a durability test where I disconnect the network between the bridged brokers, send ~100 messages from the IoT client to the IoT mosquitto broker, restart the mosquitto broker, and then reconnect the network. Once I reconnect the network, the mosquitto broker repeatedly attempts to send 20 messages over the bridge, receives a retained message, disconnects the bridge, and repeats.

Related issues

#2165 - appears related but pastebins have expired and I can't tell if bridge disconnections were happening.

To Replicate

  • Configure a mosquitto broker to:
    • listen on a local port
    • bridge a topic in both directions with QoS 2 to another broker
    • (must be able to terminate network connection between the two brokers)
  • Start the mosquitto broker
  • Send messages on mapped topic to mosquitto broker
  • Disconnect network
  • Continue to send >20 messages to mosquitto broker
  • Restart mosquitto broker
  • Reconnect network

Mosquitto Config

persistence true
persistence_location /var/lib/mosquitto/
autosave_on_changes true
autosave_interval 1

log_dest file /var/log/mosquitto/mosquitto.log

include_dir /etc/mosquitto/conf.d

# Local listener
listener 1883
allow_anonymous true

# Bridging to HiveMQ
connection bridge-mosquitto-to-hivemq
address <omitted>
topic durability-test both 2
cleansession false
notifications false
start_type automatic
try_private false
remote_clientid edge-bridge
remote_username <omitted>
remote_password <omitted>
bridge_insecure false
bridge_cafile <omitted>
bridge_protocol_version mqttv50
bridge_attempt_unsubscribe true

Relevant Logs (debug)

1707769589: Received PUBLISH from edge-client-pub (d0, q2, r0, m99, 'durability-test', ... (40 bytes))
1707769589: Sending PUBREC to edge-client-pub (m99, rc0)
1707769589: Received PUBREL from edge-client-pub (Mid: 99)
1707769589: Sending PUBCOMP to edge-client-pub (m99)
1707769589: Saving in-memory database to /var/lib/mosquitto//mosquitto.db.
1707769589: Received PUBLISH from edge-client-pub (d0, q2, r0, m100, 'durability-test', ... (40 bytes))
1707769589: Sending PUBREC to edge-client-pub (m100, rc0)
1707769589: Received PUBREL from edge-client-pub (Mid: 100)
1707769589: Sending PUBCOMP to edge-client-pub (m100)
1707769589: Saving in-memory database to /var/lib/mosquitto//mosquitto.db.
1707769591: Received DISCONNECT from edge-client-pub
1707769591: Client edge-client-pub disconnected.
1707769591: mosquitto version 2.0.18 terminating
1707769591: Saving in-memory database to /var/lib/mosquitto//mosquitto.db.
1707769591: mosquitto version 2.0.18 starting
1707769591: Config loaded from /etc/mosquitto/mosquitto.conf.
1707769591: Opening ipv4 listen socket on port 1883.
1707769591: Opening ipv6 listen socket on port 1883.
1707769591: Bridge local.edge-bridge doing local SUBSCRIBE on topic durability-test
1707769591: Connecting bridge bridge-mosquitto-to-hivemq (<URL-OMITTED>:8883)
1707769591: Error creating bridge: Temporary failure in name resolution.
1707769591: Warning: Unable to connect to bridge bridge-mosquitto-to-hivemq.
1707769591: mosquitto version 2.0.18 running
1707769639: Saving in-memory database to /var/lib/mosquitto//mosquitto.db.
1707769647: Bridge local.edge-bridge doing local SUBSCRIBE on topic durability-test
1707769647: Connecting bridge bridge-mosquitto-to-hivemq (<URL-OMITTED>:8883)
1707769647: Error creating bridge: Temporary failure in name resolution.
1707769683: Bridge local.edge-bridge doing local SUBSCRIBE on topic durability-test
1707769683: Connecting bridge bridge-mosquitto-to-hivemq (<URL-OMITTED>:8883)
1707769683: Error creating bridge: Temporary failure in name resolution.
1707769737: Bridge local.edge-bridge doing local SUBSCRIBE on topic durability-test
1707769737: Connecting bridge bridge-mosquitto-to-hivemq (<URL-OMITTED>:8883)
1707769737: Bridge edge-bridge sending CONNECT
1707769739: Received CONNACK on connection local.edge-bridge.
1707769739: Bridge local.edge-bridge sending SUBSCRIBE (Mid: 561, Topic: durability-test, QoS: 2, Options: 0x0c)
1707769739: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m461, 'durability-test', ... (38 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m462, 'durability-test', ... (38 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m463, 'durability-test', ... (38 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m464, 'durability-test', ... (38 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m465, 'durability-test', ... (38 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m466, 'durability-test', ... (38 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m467, 'durability-test', ... (38 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m468, 'durability-test', ... (38 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m469, 'durability-test', ... (38 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m470, 'durability-test', ... (38 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d0, q2, r0, m471, 'durability-test', ... (40 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d0, q2, r0, m472, 'durability-test', ... (40 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d0, q2, r0, m473, 'durability-test', ... (40 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d0, q2, r0, m474, 'durability-test', ... (40 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d0, q2, r0, m475, 'durability-test', ... (40 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d0, q2, r0, m476, 'durability-test', ... (40 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d0, q2, r0, m477, 'durability-test', ... (40 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d0, q2, r0, m478, 'durability-test', ... (40 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d0, q2, r0, m479, 'durability-test', ... (40 bytes))
1707769739: Sending PUBLISH to local.edge-bridge (d0, q2, r0, m480, 'durability-test', ... (40 bytes))
1707769739: Saving in-memory database to /var/lib/mosquitto//mosquitto.db.
1707769739: Received SUBACK from local.edge-bridge
1707769739: Received PUBLISH from local.edge-bridge (d0, q2, r1, m51, 'durability-test', ... (32 bytes))
1707769739: Sending PUBREC to local.edge-bridge (m51, rc0)
1707769739: Received DISCONNECT from local.edge-bridge
1707769739: Client local.edge-bridge disconnected.
1707769745: Bridge local.edge-bridge doing local SUBSCRIBE on topic durability-test
1707769745: Connecting bridge bridge-mosquitto-to-hivemq (<URL-OMITTED>:8883)
1707769745: Bridge edge-bridge sending CONNECT
1707769747: Received CONNACK on connection local.edge-bridge.
1707769747: Bridge local.edge-bridge sending SUBSCRIBE (Mid: 563, Topic: durability-test, QoS: 2, Options: 0x0c)
1707769747: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m461, 'durability-test', ... (38 bytes))
1707769747: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m462, 'durability-test', ... (38 bytes))
1707769747: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m463, 'durability-test', ... (38 bytes))
1707769747: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m464, 'durability-test', ... (38 bytes))
1707769747: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m465, 'durability-test', ... (38 bytes))
1707769747: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m466, 'durability-test', ... (38 bytes))
1707769747: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m467, 'durability-test', ... (38 bytes))
1707769747: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m468, 'durability-test', ... (38 bytes))
1707769747: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m469, 'durability-test', ... (38 bytes))
1707769747: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m470, 'durability-test', ... (38 bytes))
1707769747: Sending PUBLISH to local.edge-bridge (d1, q2, r0, m471, 'durability-test', ... (40 bytes))
@calliday
Copy link
Author

After further digging, the response from the server was 0x93: "Receive Maximum exceeded". The assumption is that this is related to mosquitto's 20 message default for the max_inflight_messages setting. Looking at the logs, those 20 messages are sent before any PUBACK or PUBCOMP messages were received.

I restarted the mosquitto broker, configuring max_inflight_messages to be set to 1. Now the bridge stays up, but the retained messages received during each of previous disconnect/reconnects were regurgitated back over the bridge, resulting in hundreds of repeated retained messages being sent back to the Hive server.

It appears that the original bug has been resolved, but is the echo of retained messages over a topic (mapped in both directions) the expected behavior of a Mosquitto broker using try_private set to false?

@calliday
Copy link
Author

After further review, it looks like there is a bug in the Mosquitto broker. Mosquitto doesn't use the Receive Maximum setting found in the CONNACK packet sent by the broker on the other end of a bridge. Here is the payload of the CONNACK being sent by HiveMQ to my Mosquitto broker:

Screenshot 2024-02-13 at 4 18 37 PM

The ReceiveMaximum established by HiveMQ is 0x0a (decimal 10) and Mosquitto continues to use it's configured value of 20, resulting in the disconnections listed above.

@calliday calliday changed the title Bridged broker with >20 queued messages can't bring up bridge properly Broker doesn't honor Receive Maximum value set in a bridged broker's CONNACK Feb 13, 2024
@calliday
Copy link
Author

After more debugging with gdb, here's what I've found:

The order of operations of Mosquitto starting up and connecting a bridge causes this issue:

  • Mosquitto starts
  • Imports config
    • including max_inflight_messages
  • Loads data from local database (see main() call of db__open)
    • context__init imports db.config->max_inflight_messages into the context->msgs_out.inflight_maximum and .inflight_quota (see context.c:89)
    • context->msgs_out.inflight_quota is used as the limit for populating QoS 1+ messages into the msgs_out.inflight linked list (see persist_read.c:157)
  • Connects bridge
  • Handles CONNACK from broker on other end of bridge
    • imports Receive Maximum value in request from CONNACK packet into context.msgs_out.inflight_maximum (see handle_connack.c:88)
  • Publishes messages in inflight linked list

The problem is that the import of the Receive Maximum value doesn't impact the already allocated context->msgs_out.inflight linked list. This issue only occurs on a restart, because that list gets populated from the database based on the config value BEFORE the context->msgs_out.inflight_maximum value is overwritten by the bridge's CONNACK.

This probably won't become priority for us to fix for a few weeks, but I'll plan on writing a patch around that time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant