Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenSSL Error - called a function you should not call #383

Closed
wiebeytec opened this issue Feb 17, 2017 · 21 comments
Closed

OpenSSL Error - called a function you should not call #383

wiebeytec opened this issue Feb 17, 2017 · 21 comments

Comments

@wiebeytec
Copy link

wiebeytec commented Feb 17, 2017

I'm working on a C++ project using QMQTT and the connections keep failing with:

1487331321: New connection from 1.2.3.4 on port 8883.
1487331321: New client connected from 1.2.3.4 as MyProgram_975197543 (c1, k300, u'[email protected]').
1487331321: MyProgram_975197543 0 N/847e4062cb99/#
1487331321: OpenSSL Error: error:140F3042:SSL routines:ssl_undefined_const_function:called a function you should not call
1487331321: Socket error on client MyProgram_975197543, disconnecting.

I subsequently get 'server closed connection' errors in the client.

This has been found before, but seemingly without fix. I found something about it on StackOverflow:

Someone on the openssl-users mailing list helped me figure this out; the problem was that I was setting up my SSL session with SSLv23_method(), and when using SSLv23_method(), you mustn't call SSL_pending() until after the SSL handshake has finished negotiating which protocol (SSLv2, SSLv3, TLSv1, etc) it's actually going to use.

Since my application doesn't require compatibility with older versions of SSL, the quick work-around for me is to call SSLv3_method() during setup instead of SSLv23_method(). If backwards compatibility was needed, then I'd need to figure out some way of detecting when the protocol negotiation had completed and avoid calling SSL_pending() until then; but I'm going to ignore the issue for now since I don't need that functionality.

I'm not sure where the fault lies, and what the fix should be.

BTW: SSLv23 is actually the deprecated way of saying any SSL/TLS version, right?

Version: 1.4.8-1build1
Ubuntu 16.04

@ralight
Copy link
Contributor

ralight commented Feb 17, 2017

In your log we can see that the client has connected successfully and sent a subscribe request. It couldn't have done that without having the SSL handshake complete.

From what I've read of the openssl source, calling SSL_pending() on a connection that hasn't completed the handshake will cause no problems anyway. Maybe that was different with 0.9.8, but 0.9.8 is ancient now.

I don't know what is causing the problem - something like https://bz.apache.org/bugzilla/show_bug.cgi?id=58956 is a good candidate, but I think that is fixed in the version of openssl that is in ubuntu 16.04..

Have you got a minimal client example that you could share that displays this behaviour?

SSLv23_server_method() may well be deprecated in openssl 1.1.0, but openssl 1.0.2 does not provide an alternative and is supported until the end of 2019.

@wiebeytec
Copy link
Author

I should be able to make a small example program. I hope to get back to you later this week.

@wiebeytec
Copy link
Author

I have created this tool. You can download a binary with Qt libs. I compiled it with Qt 5.5, should it be relevant.

I have encountered both the scenario of an immediate disconnect, but also a 'server closed connection' after having received some publishes:

$ ./QmqttMosquittoSocketError  --username user --password password --hostname mqtt.example.com
qt.network.ssl: QSslSocket: cannot resolve SSLv2_client_method
qt.network.ssl: QSslSocket: cannot resolve SSLv2_server_method
"Connection established. Subscribing to N/847e4062cb99/#. Publishing R/847e4062cb99/system/0/Serial"
"N/847e4062cb99/system/0/Dc/Battery/ConsumedAmphours: {\"value\": null}"
...
...
"N/847e4062cb99/settings/0/Settings/Alarm/Vebus/VoltageSenseError: {\"value\": 1}"
"N/847e4062cb99/settings/0/Settings/CGwacs/AcPowerSetPoint: {\"value\": 1000.0}"
"Error occurred. QMQTT error code 2"
^C

This produces the result shown in the log messages in the first message.

(See Qmqtt error codes. Code 2 is SocketRemoteHostClosedError)

It seems to be dependent on from where I do this. My Linux Mint 17 from the office, based on Ubuntu 14.04, has the scenario described in the previous paragraph (error after getting publishes). But, from Ubuntu 14.04 servers, with the same Openssl 1.0.1f-1ubuntu2.22, i can get SocketRemoteHostClosedError, but never after receiving publishes. I haven't figured out why this is. Perhaps it's related to network latency or something.

@ralight
Copy link
Contributor

ralight commented Feb 23, 2017

Thanks, I'll give it a go.

@wiebeytec
Copy link
Author

I've been testing Mosquitto 1.4.11 for a few hours and it seems to behave very well. The test server isn't quite as loaded as the production server (still dozens of messages per second), but I haven't got a disconnect once. I dug up logs from our backups to see how long the 1.4.8 version has been giving errors, and that is quite a while. So I think it's safe to say it's fixed.

I think we'll just upgrade.

The version included with Ubuntu 16.04 seems unusable. Between this, and #303, it's really hard to get anything useful out of it.

@wiebeytec
Copy link
Author

I've upgraded our production server to version 1.4.11, and the problem persists. It seems to be related to how busy it is. Our Nagios monitoring is going into failure because of a rejected connection more and more. It simply runs a mosquitto_pub, and sometimes, it fails.

I just tried running mosquitto_pub -d several consecutive times, and the SSL error appears at the same moment I get the error, so it really seems to be related to that. The debug output from mosquitto_pub -d is not helpful:

Client mosqpub/18912-sentry sending CONNECT
Error: The connection was lost.

There is some variation in SSL errors. The one mentioned in the first post is still there, but I also get:

OpenSSL Error: error:140940E5:SSL routines:ssl3_read_bytes:ssl handshake failure

Our server (still, until rewrite in C++ is done) uses the JPMens' auth plugin with HTTP backend, so when a user logs in, a HTTP request is made to a small server running on localhost doing the auth. Could the delay in that be the cause, since the behavior seems to be related to server load?

@wiebeytec
Copy link
Author

I'm currently experimenting with an Nginx stream proxy that'll do the TCP SSL for me. Looks promising so far.

@ralight
Copy link
Contributor

ralight commented Mar 23, 2017

Could the delay in that be the cause, since the behavior seems to be related to server load?

Yes that is a very good possibility and it gives me something to work with, thanks.

@gysel
Copy link

gysel commented Apr 24, 2017

Any update on this? We're having the same issues. Should I consider using the Nginx stream proxy approach as well?

@wiebeytec
Copy link
Author

I've been using the Nginx proxy for a month. It works really well, except that the Mosquitto logs are unfortunate, as in that every connection comes from localhost.

@gysel
Copy link

gysel commented May 15, 2017

I switched to NGINX as well. And it is very stable now...

In case someone else wants to do the switch as well, I documented my NGINX config on my blog: https://mgysel.ch/mosquitto-tls-instability/

@mm-ark
Copy link

mm-ark commented Aug 8, 2017

I also observed this on my broker, client can not reconnect anymore :-(

1501574820: Received PUBLISH from Client-116 (d0, q1, r0, m9867, 'Client-116/TelemetryData', ... (261088 bytes))
1501574820: Sending PUBACK to Client-116 (Mid: 9867)
1501574821: Received PUBLISH from Client-116 (d0, q1, r0, m9868, 'Client-116/TelemetryData', ... (261115 bytes))
1501574821: Sending PUBACK to Client-116 (Mid: 9868)
1501574822: Received PUBLISH from Client-116 (d0, q1, r0, m9869, 'Client-116/TelemetryData', ... (261088 bytes))
1501574822: Sending PUBACK to Client-116 (Mid: 9869)
1501574822: New connection from 192.168.2.24 on port 8883.
1501574822: OpenSSL Error: error:140F3042:SSL routines:SSL_UNDEFINED_CONST_FUNCTION:called a function you should not call
1501574822: Socket error on client Client-116, disconnecting.

broker : mosquitto version 1.4.12 (build date Sat, 27 May 2017 21:38:19 +0100)
Client: using mosquitto_lib version 1.4.11

@Gussoh
Copy link

Gussoh commented Nov 7, 2018

Our server (still, until rewrite in C++ is done) uses the JPMens' auth plugin with HTTP backend, so when a user logs in, a HTTP request is made to a small server running on localhost doing the auth. Could the delay in that be the cause, since the behavior seems to be related to server load?

We are experiencing the same issue and we also have an authentication module. That could be the cause.

@ralight
Copy link
Contributor

ralight commented Nov 7, 2018

@Gussoh What version of mosquitto and platform are you on?

@Gussoh
Copy link

Gussoh commented Nov 7, 2018

@Gussoh What version of mosquitto and platform are you on?
Version: 1.4.8-1ubuntu0.16.04.4
Ubuntu 16.04

I tried the nginx workaround with tls termination outside of mosquitto but have some issues still. I rather go back to do it in mosquitto but it breaks after just a few hours it seems like.

@ralight
Copy link
Contributor

ralight commented Nov 7, 2018

Are you able to try out 1.5.3 from https://launchpad.net/~mosquitto-dev/+archive/ubuntu/mosquitto-ppa ?

@Gussoh
Copy link

Gussoh commented Nov 7, 2018

Yes, I can. I hope all the config works the same. I'll try tomorrow. Thanks!

@Gussoh
Copy link

Gussoh commented Nov 7, 2018

I couldn't wait. It seems to be working. No errors in the log yet and no obvious disconnections.
(Also super great with the per_listener_settings which closed this issue: jpmens/mosquitto-auth-plug#305 )

@ralight
Copy link
Contributor

ralight commented Nov 7, 2018

Two birds with one stone, great. I'll keep this open for a while, please do report back if it's still a problem.

@Gussoh
Copy link

Gussoh commented Nov 19, 2018

Hi, sorry to say that we still have to restart the server to make it work. It doesn't give this specific error message any more so it might be caused by another issue now. I haven't figured out anything more. Just that it starts working again after a mosquitto restart.

@ralight
Copy link
Contributor

ralight commented Nov 20, 2018

Thanks for the feedback. I think it seems as though this specific issue is fixed, so I'm going to close it. If you get more idea about what is happening with what you're seeing now then please open a new issue and provide details. Maybe it is best to do that anyway, so we can see what your setup is to try and reproduce it.

@ralight ralight closed this as completed Nov 20, 2018
@lock lock bot locked as resolved and limited conversation to collaborators Aug 7, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants