Mosquitto connectivity over 3g #1133

Open
jezlukasz opened this issue Jan 30, 2019 · 8 comments

jezlukasz commented Jan 30, 2019

We are experiencing an issue:

We have a device connecting with MQTT broker over 3g (with poor signal quality).
We are seeing some kind of deadlock in the reconnect mechanism, and a growing number of sockets stuck in the CLOSE_WAIT state with a Recv-Q of 3642. The number of sockets in CLOSE_WAIT seems to increase as the keepalive parameter is lowered.
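
To track how the CLOSE_WAIT count evolves over time on the device, something like the following could be run periodically. It is only a sketch: it assumes a Linux target with IPv4 sockets listed in /proc/net/tcp, and the function names `parse_close_wait` and `count_close_wait` are made up for illustration. In that file the state column is hexadecimal (08 is CLOSE_WAIT) and the receive queue is the second half of the `tx_queue:rx_queue` column, also in hex.

```c
#include <assert.h>
#include <stdio.h>

#define TCP_STATE_CLOSE_WAIT 0x08 /* kernel TCP state code for CLOSE_WAIT */

/* Parse one line of /proc/net/tcp. Returns 1 and stores the receive
 * queue size in *rx_queue if the socket is in CLOSE_WAIT, else 0. */
int parse_close_wait(const char *line, unsigned int *rx_queue)
{
    unsigned int state = 0, txq = 0, rxq = 0;

    assert(line != NULL && rx_queue != NULL);
    /* Fields: "sl: local_addr rem_addr st tx_queue:rx_queue ..." */
    if (sscanf(line, " %*d: %*[^ ] %*[^ ] %x %x:%x", &state, &txq, &rxq) != 3)
        return 0; /* header line or unexpected format */
    if (state != TCP_STATE_CLOSE_WAIT)
        return 0;
    *rx_queue = rxq;
    return 1;
}

/* Count CLOSE_WAIT sockets listed in the given file (normally
 * "/proc/net/tcp"). Returns -1 if the file cannot be opened. */
int count_close_wait(const char *path)
{
    char line[512];
    unsigned int rxq = 0;
    int count = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_close_wait(line, &rxq)) {
            printf("CLOSE_WAIT socket, Recv-Q=%u\n", rxq);
            count++;
        }
    }
    fclose(f);
    return count;
}
```

Calling `count_close_wait("/proc/net/tcp")` from the application's main loop and logging the result next to the reconnect attempts would show whether each failed reconnect leaves one more socket behind.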

The disconnect callback eventually fires with rc = 1 (MOSQ_ERR_NOMEM).

Has anyone ever experienced something like this?

ralight commented Feb 2, 2019

Could you provide some example code of what you are trying to do, plus what version you are working with?

jezlukasz commented Feb 4, 2019

The version we are working with is 1.5.5 on OpenWrt (k4.1.23), built with GCC 5.3.0 and musl 1.1.14.
As for the sources, I can't provide them, but I can describe the mechanism:

Our MQTT client runs single-threaded. It is an event-driven state machine with these separate states:

  • MqttConnect (sets iConnected flag to true)
  • MqttDisconnect (resets iConnected flag to false)
  • MqttPublish (depends on the iConnected flag)

The MqttPublish event depends on the iConnected flag, so mosquitto_publish() is never called while this flag is false.
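
Since the real sources are not available, here is a rough sketch of what the described state machine could look like. Everything in it is hypothetical: the type and function names (`mqtt_fsm`, `mqtt_fsm_handle`, `counting_publish`) are invented, and the publish step goes through a function pointer where a real client would presumably call mosquitto_publish(). The `connected` field plays the role of iConnected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The three events named in the description above. */
typedef enum { EV_MQTT_CONNECT, EV_MQTT_DISCONNECT, EV_MQTT_PUBLISH } mqtt_event;

/* Stand-in for the real publish call; hypothetical signature. */
typedef int (*publish_fn)(const char *topic, const char *payload, void *ctx);

typedef struct {
    bool connected;      /* plays the role of iConnected */
    publish_fn publish;  /* would wrap mosquitto_publish() in a real client */
    void *publish_ctx;   /* opaque pointer handed back to publish */
    unsigned dropped;    /* publish events skipped while disconnected */
} mqtt_fsm;

void mqtt_fsm_init(mqtt_fsm *fsm, publish_fn publish, void *ctx)
{
    assert(fsm != NULL && publish != NULL);
    fsm->connected = false;
    fsm->publish = publish;
    fsm->publish_ctx = ctx;
    fsm->dropped = 0;
}

/* Handle one event. Returns true only if a publish was attempted. */
bool mqtt_fsm_handle(mqtt_fsm *fsm, mqtt_event ev,
                     const char *topic, const char *payload)
{
    assert(fsm != NULL);
    switch (ev) {
    case EV_MQTT_CONNECT:
        fsm->connected = true;
        return false;
    case EV_MQTT_DISCONNECT:
        fsm->connected = false;
        return false;
    case EV_MQTT_PUBLISH:
        if (!fsm->connected) {
            fsm->dropped++;  /* gate: no publish while the flag is false */
            return false;
        }
        fsm->publish(topic, payload, fsm->publish_ctx);
        return true;
    }
    return false;
}

/* Example publisher that only counts calls; ctx must point to an int. */
int counting_publish(const char *topic, const char *payload, void *ctx)
{
    (void)topic;
    (void)payload;
    ++*(int *)ctx;
    return 0;
}
```

In this shape the flag is written only by the connect and disconnect events, so there is a single place to look when it ends up stale.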

The reconnect mechanism is also in place, tested, and works fine over more reliable connectivity.

Also, the iConnected flag depends on the response (or its absence) to the PINGREQ and CONNECT messages (PINGRESP and CONNACK, respectively).

As I mentioned before, the connectivity medium is a poor-quality 3G network.
I believe that somewhere between PINGREQ and PINGRESP there is a connection gap, which triggers the reconnect mechanism and resets the iConnected flag to false.

There is also the problem with sockets in the CLOSE_WAIT state. Our TCP keepalive time is set to 120 seconds, but it does not seem to be relevant at all:
cat /proc/sys/net/ipv4/tcp_keepalive_time
120
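
For context on why this setting may appear to have no effect: tcp_keepalive_time only applies to sockets that have SO_KEEPALIVE enabled, and a socket in CLOSE_WAIT has already received the peer's FIN, so it stays in that state until the local process calls close(), whatever the keepalive settings are. Whether the client socket in this setup has SO_KEEPALIVE enabled is not stated in the thread. The sketch below shows how keepalive is normally switched on and tuned per socket on Linux; the helper names `enable_tcp_keepalive` and `tcp_keepalive_enabled` are hypothetical.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable TCP keepalive on fd and override the system-wide defaults:
 * idle_s     seconds of silence before the first probe (TCP_KEEPIDLE)
 * interval_s seconds between probes (TCP_KEEPINTVL)
 * probes     unanswered probes before the connection is dropped (TCP_KEEPCNT)
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int enable_tcp_keepalive(int fd, int idle_s, int interval_s, int probes)
{
    int on = 1;

    assert(fd >= 0);
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof idle_s) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof interval_s) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof probes) < 0)
        return -1;
    return 0;
}

/* Returns 1 if SO_KEEPALIVE is set on fd, 0 if not, -1 on error. */
int tcp_keepalive_enabled(int fd)
{
    int on = 0;
    socklen_t len = sizeof on;

    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, &len) < 0)
        return -1;
    return on != 0;
}
```

Even with keepalive enabled, the CLOSE_WAIT entries would point at file descriptors that were never closed on the device side, which is a separate thing to check.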

To manage the iConnected flag we rely on the native reconnection mechanism. In addition, to get the most reliable information about the connection, we use mosquitto_log_callback_set() to check that PINGRESP arrives within the keepalive interval, together with the connect and disconnect callbacks.
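
A minimal sketch of the PINGRESP check described here might look as follows. It assumes that the debug log lines contain the substrings "PINGREQ" and "PINGRESP"; the exact log wording is not guaranteed by the library and could differ between versions, so matching on it is fragile. The names `ping_watch`, `ping_watch_on_log` and `ping_watch_alive` are invented for illustration, and timestamps are passed in explicitly so the logic can be exercised without a broker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

typedef struct {
    time_t last_pingreq;   /* when the last PINGREQ log line was seen */
    time_t last_pingresp;  /* when the last PINGRESP log line was seen */
    int keepalive_s;       /* MQTT keepalive interval in seconds */
} ping_watch;

void ping_watch_init(ping_watch *w, int keepalive_s, time_t now)
{
    assert(w != NULL && keepalive_s > 0);
    w->last_pingreq = now;
    w->last_pingresp = now;
    w->keepalive_s = keepalive_s;
}

/* Feed one log message; intended to be called from the log callback
 * with the message string it receives. */
void ping_watch_on_log(ping_watch *w, const char *msg, time_t now)
{
    assert(w != NULL && msg != NULL);
    if (strstr(msg, "PINGREQ") != NULL)
        w->last_pingreq = now;
    else if (strstr(msg, "PINGRESP") != NULL)
        w->last_pingresp = now;
}

/* True if no ping is outstanding, or the outstanding one is still
 * within the keepalive interval. */
bool ping_watch_alive(const ping_watch *w, time_t now)
{
    assert(w != NULL);
    if (difftime(w->last_pingresp, w->last_pingreq) >= 0)
        return true; /* last PINGREQ has been answered */
    return difftime(now, w->last_pingreq) <= (double)w->keepalive_s;
}
```

A design note: deriving the flag from log text duplicates information that the connect and disconnect callbacks already deliver, so two sources can disagree about whether the client is connected.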

I hope my description is sufficient. If not, I can provide more details.

ralight commented Feb 5, 2019

Sorry, that doesn't really give me anything to go on. I'm not even sure whether you're saying that the built-in libmosquitto reconnection code isn't working, or whether you are implementing your own :)

@jezlukasz

Heh 😄.

That is a problem 😉.

The reconnection mechanism we use is the one built into libmosquitto.

For connection validation (the iConnected flag) we use the message string passed to the log callback (registered with mosquitto_log_callback_set()).

ralight commented Feb 5, 2019

Ok, that sounds odd.

Why not use the on_connect callback?

jezlukasz commented Feb 5, 2019

We do use it. The connect and disconnect callbacks also manage the iConnected flag.

@jezlukasz

I've updated my initial description. Has anyone experienced anything like that?

ralight commented Feb 26, 2019

I'm afraid there isn't enough detail about what you're doing for me to comment any further. The devil is likely to be in the detail of what you are doing.
