mosquitto loop exits on TCP handshake failure for TLS #2825
Comments
I can confirm that commit e979a46 introduces this issue, because SSL_ERROR_SYSCALL is no longer handled during connection setup.
Yes, we face the same issue. For now we have downgraded to 2.0.14.
On SSL_get_error man page, it says:
As stated in other issues, such as #2767, the library appears to have broken reconnection on EOF in e979a46. It is broken from v2.0.15 to v2.0.18. Looking at the library implementation in v2.0.18, net__handle_ssl in lib/net_mosq.c could check for non-fatal SSL errors (SSL_ERROR_ZERO_RETURN, SSL_ERROR_NONE, SSL_ERROR_SSL, or SSL_ERROR_SYSCALL with errno 0 on OpenSSL < 3.0.0, and possibly others) and return MOSQ_ERR_CONN_LOST for them, which would trigger the automatic reconnection. Currently, if an SSL error is caused on the server side (proxy down, EOF, empty reply, a newly added firewall rule), errno is set to EPROTO and mosquitto_loop_forever exits as if the error were fatal. Treating these non-fatal SSL errors as MOSQ_ERR_CONN_LOST and the fatal ones as MOSQ_ERR_TLS could potentially allow removing the 'errno == EPROTO' check in mosquitto_loop_forever, which I find tricky to handle in async implementations, though I am unsure whether it serves another purpose that I'm not aware of.
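To make the proposed split concrete, here is a minimal sketch of how SSL error codes could be sorted into "retry", "connection lost" and "fatal TLS error". It is not the library's code: the `SK_*` enums and `sketch_classify()` are hypothetical stand-ins so the snippet compiles on its own, and a real patch would work with the `SSL_ERROR_*` values from OpenSSL and the `MOSQ_ERR_*` codes from mosquitto. Which bucket each code belongs in is exactly the open question above. Treating `SSL_ERROR_SSL` as fatal here is just one assumption.

```c
/* Stand-ins so the sketch compiles alone. A real patch would use the
 * SSL_ERROR_* values from OpenSSL and the MOSQ_ERR_* codes from mosquitto. */
enum sketch_ssl_error {
    SK_SSL_ERROR_SSL,          /* stands in for SSL_ERROR_SSL */
    SK_SSL_ERROR_WANT_READ,    /* stands in for SSL_ERROR_WANT_READ */
    SK_SSL_ERROR_WANT_WRITE,   /* stands in for SSL_ERROR_WANT_WRITE */
    SK_SSL_ERROR_SYSCALL,      /* stands in for SSL_ERROR_SYSCALL */
    SK_SSL_ERROR_ZERO_RETURN   /* stands in for SSL_ERROR_ZERO_RETURN */
};

enum sketch_result {
    SK_RETRY,      /* call the I/O function again later */
    SK_CONN_LOST,  /* would map to MOSQ_ERR_CONN_LOST: reconnect */
    SK_TLS_FATAL   /* would map to MOSQ_ERR_TLS: give up */
};

/* Hypothetical classifier: decide what the caller should do for one
 * SSL error code. */
static enum sketch_result sketch_classify(enum sketch_ssl_error err)
{
    switch (err) {
    case SK_SSL_ERROR_WANT_READ:
    case SK_SSL_ERROR_WANT_WRITE:
        /* Non-blocking I/O simply needs another pass. */
        return SK_RETRY;
    case SK_SSL_ERROR_ZERO_RETURN:
    case SK_SSL_ERROR_SYSCALL:
        /* Peer closed the connection or the socket failed (e.g. RST,
         * ECONNREFUSED): treat as a lost connection, not a fatal error. */
        return SK_CONN_LOST;
    case SK_SSL_ERROR_SSL:
    default:
        /* Assumption of this sketch: everything else is a TLS failure. */
        return SK_TLS_FATAL;
    }
}
```

With a mapping like this, a refused or reset connection would reach the loop as a lost connection rather than as a generic errno failure.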
mosquitto v2.0.15
platform: linux
Using client library in threaded async mode with TLS connection to broker.
If, during the initial TCP connection setup, the server responds with a TCP RST (e.g. broker application down),
then quite often the mosquitto thread exits, thus requiring a manual re-start of the client,
which is unfeasible because the lib does not even notify the host application about this premature exit.
The exit point is in mosquitto_loop_forever() when dealing with mosquitto_loop() retcode:
https://github.com/eclipse/mosquitto/blob/master/lib/loop.c#L276
(rc is MOSQ_ERR_ERRNO and errno is EPROTO).
It seems to me that this all originates in net__handle_ssl() (called by net__read()):
the SSL error SSL_ERROR_SYSCALL (returned in this scenario because the underlying TCP socket got an ECONNREFUSED) is no longer handled (it was, before the commit mentioned below),
so the function returns -1 and sets errno to EPROTO, which triggers the loop exit condition above.
This bug was possibly introduced by this commit: e979a46
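Until the library handles this case, one possible workaround is for the host application to drive the network loop itself and reconnect on failure instead of relying on mosquitto_loop_forever(). The sketch below shows only that control flow. `sketch_run()`, the callback type and the fake broker are hypothetical names introduced for illustration. In a real client the `step` callback would presumably wrap mosquitto_loop() and the `reconnect` callback would wrap mosquitto_reconnect(). The sketch treats every failure as retryable, which a real application would want to narrow down.

```c
#include <errno.h>

/* Hypothetical callback type: returns 0 on success, non-zero on failure. */
typedef int (*sketch_step_fn)(void *ctx);

/* Hypothetical helper: run the loop for max_steps iterations and reconnect
 * after every failed step instead of exiting. Returns the number of
 * reconnect attempts made. */
static int sketch_run(void *ctx, sketch_step_fn step,
                      sketch_step_fn reconnect, int max_steps)
{
    int reconnects = 0;
    for (int i = 0; i < max_steps; i++) {
        if (step(ctx) != 0) {
            /* A failed step (for example the reported MOSQ_ERR_ERRNO with
             * errno == EPROTO) does not end the loop. A real client would
             * sleep with some backoff before retrying. */
            (void)reconnect(ctx);
            reconnects++;
        }
    }
    return reconnects;
}

/* Test double: a broker that refuses the first few attempts. */
struct sketch_fake_broker {
    int failures_left;
    int reconnect_calls;
};

static int sketch_fake_step(void *ctx)
{
    struct sketch_fake_broker *b = ctx;
    if (b->failures_left > 0) {
        b->failures_left--;
        errno = EPROTO;  /* mimic the errno described in this issue */
        return -1;
    }
    return 0;
}

static int sketch_fake_reconnect(void *ctx)
{
    struct sketch_fake_broker *b = ctx;
    b->reconnect_calls++;
    return 0;
}
```

Because the application owns the loop, it also sees every failure, which addresses the complaint that the library thread exits without notifying the host.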