Websocket connections are not accepted after docker images update to eclipse-mosquitto:1.5.3 or eclipse-mosquitto:1.5 #1004

Closed
sergey-lukin opened this issue Oct 30, 2018 · 32 comments

Comments

@sergey-lukin

After today's update of the Docker images eclipse-mosquitto:1.5.3, eclipse-mosquitto:1.5 and eclipse-mosquitto:latest, the Docker container no longer accepts websocket connections.

How I start the Docker container:

docker run --detach -p 1883:1883 -p 8000:8000 -v mosquitto/config:/mosquitto/config eclipse-mosquitto

Contents of my mosquitto.conf:

log_dest syslog
port 1883
listener 8000
protocol websockets

As a temporary workaround, I start the Docker container with the tag "1.4.12":

docker run --detach -p 1883:1883 -p 8000:8000 -v mosquitto/config:/mosquitto/config eclipse-mosquitto:1.4.12
@thrust15

thrust15 commented Oct 30, 2018

Hi,

I'm experiencing the same issue. Rolling back to 1.4.12 resolves it.

Docker run command:

docker run -itd --name=mosquitto \
--restart=always \
-v /opt/mosquitto/config/mosquitto.conf:/mosquitto/config/mosquitto.conf:ro \
-v /opt/mosquitto/data:/mosquitto/data:rw \
-v /opt/mosquitto/log:/mosquitto/log:rw \
-u 996:995 \
-p 1883:1883 \
-p 9001:9001 \
-e TZ=Europe/Amsterdam \
eclipse-mosquitto

Mosquitto Config

user mosquitto
persistence true
persistence_location /mosquitto/data/
persistence_file mosquitto.db
log_dest syslog
log_dest stdout
log_dest topic
log_type error
log_type warning
log_type notice
log_type information
log_dest file /mosquitto/log/mosquitto.log
connection_messages true
log_timestamp true
allow_anonymous false

listener 1883

listener 9001
protocol websockets

@ralight
Contributor

ralight commented Nov 1, 2018

I've tested this out just now and it works fine for me. This is an asciicast of me doing that test on Ubuntu:

https://asciinema.org/a/KgVrmp0o8TigYQpdCfMNzDSMy

What do your logs show with respect to opening the websockets listener?

@sergey-lukin
Author

sergey-lukin commented Nov 1, 2018 via email

@ralight
Contributor

ralight commented Nov 1, 2018

How peculiar. I don't know what to suggest next. I'm testing on Ubuntu.

@sirockin

sirockin commented Nov 3, 2018

I have exactly the same issue running 1.5 docker images on linux or windows.

@stoinov

stoinov commented Nov 3, 2018

I am having the same issue, running on arm64v8 with Traefik as a proxy. Here's what I was able to find while troubleshooting:
The listener actually starts and is able to accept connections, as netstat shows:

Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp      601      0 mqtt:9001               traefik:49793      ESTABLISHED
tcp       73      0 mqtt:9001               grafana:51309      ESTABLISHED
tcp        0      0 mqtt:1883               192.168.1.11:20403    ESTABLISHED
Active UNIX domain sockets (w/o servers)
Proto RefCnt Flags       Type       State         I-Node Path

There are packets in the receive queue (Recv-Q) for the WS connections. To see whether this is a proxy issue, I ran a connection test from another Docker host (grafana in the table). The connection times out, and again there are packets in the queue.

So it seems that something is preventing the broker from answering the WS connections. Note that the http_dir option also times out, which is expected.
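
To tell "the listener accepts the TCP connection but never answers" apart from a working listener, a bare WebSocket upgrade request can be sent by hand. This is only a sketch using the Python standard library: the function name `probe_websocket` and the request path `/mqtt` are assumptions made for illustration, not something taken from this thread.

```python
import base64
import os
import socket


def probe_websocket(host, port, timeout=5.0):
    """Send a WebSocket upgrade request and return the HTTP status line.

    Returns None when the TCP connection succeeds but no reply arrives
    before the timeout, which is the symptom described in this thread.
    """
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    request = (
        "GET /mqtt HTTP/1.1\r\n"          # path is an assumption
        f"Host: {host}:{port}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "Sec-WebSocket-Protocol: mqtt\r\n"
        "\r\n"
    )
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request.encode("ascii"))
        try:
            reply = sock.recv(4096)
        except socket.timeout:
            return None  # connected, request sent, but nothing came back
    # First line of the response, e.g. a "101 Switching Protocols" status line
    return reply.split(b"\r\n", 1)[0].decode("ascii", "replace")
```

Calling `probe_websocket("127.0.0.1", 9001)` against a healthy websockets listener should return a status line containing 101; a return value of `None` would match the queued, unanswered packets shown above.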

@stoinov

stoinov commented Nov 3, 2018

@ralight and @sergey-lukin, can you check with netstat -a and compare? Mine is:

Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 127.0.0.11:40442        0.0.0.0:*               LISTEN
tcp        0      0 mqtt:1883               0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:9001            0.0.0.0:*               LISTEN
tcp      601      0 mqtt:9001               traefik.:49880      ESTABLISHED
tcp        0      0 mqtt:1883               192.168.1.11:20403    ESTABLISHED
udp        0      0 127.0.0.11:45898        0.0.0.0:*
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node Path

You can see the LISTEN addresses are different, and this might be the reason.
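
For comparing outputs between machines, it may help to reduce the netstat text to "which local addresses are in LISTEN state on which port". The following is a rough sketch, not a robust parser: the function name `listening_addresses` is made up for this example, and it assumes the six-column layout shown above.

```python
from collections import defaultdict


def listening_addresses(netstat_output):
    """Map each listening TCP port to the local addresses bound to it."""
    listeners = defaultdict(list)
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "LISTEN":
            continue
        # Split on the last colon so IPv6 forms such as ":::1883" work too.
        address, _, port = fields[3].rpartition(":")
        listeners[port].append(address)
    return dict(listeners)


SAMPLE = """\
tcp        0      0 127.0.0.11:40442        0.0.0.0:*               LISTEN
tcp        0      0 mqtt:1883               0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:9001            0.0.0.0:*               LISTEN
tcp      601      0 mqtt:9001               traefik.:49880      ESTABLISHED
"""

for port, addresses in listening_addresses(SAMPLE).items():
    print(port, addresses)
```

Ports are kept as strings because netstat without `-n` may print service names instead of numbers.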

@ralight
Contributor

ralight commented Nov 4, 2018

Is anybody able to use wireshark to capture the communication flow when attempting to connect and failing? I can take the capture file by private email if need be.

@stoinov

stoinov commented Nov 4, 2018

I am not sure what exactly I should dump, but I was able to spin up an Alpine Docker container and install tcpdump and 1.5.3. Here's what I see just running it:

10:31:02.030703 IP mqtt.9001 > 192.168.1.11.56027: Flags [R.], seq 1, ack 555, win 237, length 0
10:31:02.043970 IP 192.168.1.11.56057 > mqtt.9001: Flags [S], seq 1677818646, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
10:31:02.044083 IP mqtt.9001 > 192.168.1.11.56057: Flags [S.], seq 1340804362, ack 1677818647, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
10:31:02.045817 IP 192.168.1.11.56057 > mqtt.9001: Flags [.], ack 1, win 256, length 0
10:31:02.045834 IP 192.168.1.11.56057 > mqtt.9001: Flags [P.], seq 1:555, ack 1, win 256, length 554
10:31:02.045993 IP mqtt.9001 > 192.168.1.11.56057: Flags [.], ack 555, win 237, length 0
10:31:23.074382 IP mqtt.9001 > 192.168.1.11.56057: Flags [R.], seq 1, ack 555, win 237, length 0
10:31:23.088113 IP 192.168.1.11.56087 > mqtt.9001: Flags [S], seq 3814909019, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
10:31:23.088237 IP mqtt.9001 > 192.168.1.11.56087: Flags [S.], seq 824927371, ack 3814909020, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
10:31:23.089864 IP 192.168.1.11.56087 > mqtt.9001: Flags [.], ack 1, win 256, length 0
10:31:23.090668 IP 192.168.1.11.56087 > mqtt.9001: Flags [P.], seq 1:555, ack 1, win 256, length 554
10:31:23.090736 IP mqtt.9001 > 192.168.1.11.56087: Flags [.], ack 555, win 237, length 0

I left only the WS client running, and this is the only traffic, over and over again. If you want me to run a specific command and send the result, let me know.
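
Reading the dump above: the TCP handshake completes, the client sends 554 bytes, the broker acknowledges them, and roughly 21 seconds later the broker resets the connection without having sent any data. The sketch below measures that gap per client from tcpdump text. The function name `stalled_connections` is hypothetical, the regular expression only covers the line format shown above, and treating the 554-byte segment as the HTTP upgrade request is an assumption, since the payload is not visible here.

```python
import re
from datetime import datetime

LINE = re.compile(
    r"^(?P<time>\d\d:\d\d:\d\d\.\d{6}) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[(?P<flags>[^\]]+)\].*length (?P<length>\d+)$"
)


def stalled_connections(capture, server="mqtt.9001"):
    """Return {client: seconds} from the client's first payload to the server's RST."""
    sent_at = {}
    stalls = {}
    for raw in capture.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue
        when = datetime.strptime(match["time"], "%H:%M:%S.%f")
        src, dst = match["src"], match["dst"]
        if dst == server and int(match["length"]) > 0:
            sent_at.setdefault(src, when)  # first data segment from this client
        elif src == server and "R" in match["flags"] and dst in sent_at:
            stalls[dst] = (when - sent_at[dst]).total_seconds()
    return stalls


SAMPLE = """\
10:31:02.045834 IP 192.168.1.11.56057 > mqtt.9001: Flags [P.], seq 1:555, ack 1, win 256, length 554
10:31:02.045993 IP mqtt.9001 > 192.168.1.11.56057: Flags [.], ack 555, win 237, length 0
10:31:23.074382 IP mqtt.9001 > 192.168.1.11.56057: Flags [R.], seq 1, ack 555, win 237, length 0
"""

print(stalled_connections(SAMPLE))
```

A reset with no preceding data from the client (like the first line of the dump) is ignored, and a connection that has not been reset yet simply does not appear in the result.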

@ralight
Contributor

ralight commented Nov 4, 2018

Yes, sorry, I should have been more specific. If you could run wireshark instead of tcpdump, then I can see the actual traffic being transmitted and check whether anything differs from what I see here.

The procedure would be:

1. Run wireshark.
2. Set it to capture on the network interface where the traffic is flowing.
3. Run the test.
4. Stop the wireshark capture.
5. Use File > Export Specified Packets to save the capture file.

On the export side, if you want to ensure that only packets related to the test are included in the export file, you can filter in the wireshark GUI and then choose to export only the displayed packets. You can probably use the filter (tcp.dstport == 9001) || (tcp.srcport == 9001), which goes in the "Apply a display filter" box in the GUI.

@stoinov

stoinov commented Nov 4, 2018

Ran tcpdump -s 0 port 9001 -w mycap.pcap and captured three requests (18 packets total). How should I send it?

@ralight
Contributor

ralight commented Nov 4, 2018

That's also fine, thank you. Please email to [email protected]

@thelordoflite

thelordoflite commented Nov 5, 2018

Hi ralight,

I tried connecting to mosquitto from within the container itself, and that also fails for me. I'm curious whether that behavior is different for you: in your asciicast you seem to be connecting to the broker from the host system instead. In my limited understanding of Docker containers, connecting from inside the container should rule out environment-based differences, so you should see the same error.

Thanks

@sergey-lukin
Author

@gamble09 @ralight
I've also tried connecting to mosquitto from within the container itself and it fails for me too.
Here is the output of netstat -a from within the container:

/ # netstat -a
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       
tcp        0      0 0.0.0.0:9001            0.0.0.0:*               LISTEN      
tcp        0      0 0.0.0.0:1883            0.0.0.0:*               LISTEN      
tcp        0      0 :::1883                 :::*                    LISTEN      
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node Path

tcpdump -v -s 0 port 9001 doesn't give any results:

/ # tcpdump -v -s 0 port 9001
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes

Here is my asciinema record:
https://asciinema.org/a/h2jBWiDZVljMD4l47cr1J9tQW

@karlp
Contributor

karlp commented Nov 6, 2018

I suspect you're running into a DNS issue. Note that your MQTT listener is listening on both v4 & v6, but WS on 9001 is only on v4. I've had issues with some DNS resolvers in minimal systems returning ::1 for "localhost", which would not work. You just get odd, unexpected "couldn't connect" type failures.
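
One way to check this theory is to ask the resolver what "localhost" maps to inside the container and in which order. A small sketch using `socket.getaddrinfo` from the standard library; the helper name `resolve_families` and its `resolver` parameter (there only so the function can be exercised without a real resolver) are assumptions for illustration.

```python
import socket

FAMILY_NAMES = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}


def resolve_families(host, port, resolver=socket.getaddrinfo):
    """Return (family, address) pairs in the order the resolver offers them."""
    results = []
    for family, _type, _proto, _canon, sockaddr in resolver(
        host, port, type=socket.SOCK_STREAM
    ):
        # sockaddr[0] is the address for both IPv4 and IPv6 entries
        results.append((FAMILY_NAMES.get(family, str(family)), sockaddr[0]))
    return results
```

If `resolve_families("localhost", 9001)` lists an IPv6 entry such as `::1` first while the websockets listener is bound only to 0.0.0.0, a client that tries just the first result would fail to connect.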

@sergey-lukin
Author

Thank you @karlp for your comment. Sounds reasonable. However, I've also tried connecting to 127.0.0.1 rather than localhost (/paho.mqtt.python/examples/client_sub-ws.py > mqttc.connect("127.0.0.1", 9001, 60)) from within the container. Still no success.

In any case, it would be good to set up mosquitto so that WS listens on both v4 & v6, but I couldn't find out how to do it.

@ralight
Contributor

ralight commented Nov 7, 2018

I've reproduced this in a manjaro vm, so I'm trying to solve it now.

@ralight
Contributor

ralight commented Nov 7, 2018

Could you please try again with the docker file from master?

git clone https://github.com/eclipse/mosquitto
cd mosquitto/docker/1.5
docker build . -t mosq
docker run ... mosq

@sergey-lukin
Author

Hi @ralight! Now it works.

@ralight
Contributor

ralight commented Nov 7, 2018

Excellent. Let's see what the others say.

@loosfoos

loosfoos commented Nov 8, 2018

Tested with the latest tag; it still did not work for me. I had to roll back to version 1.4.12.

@stoinov

stoinov commented Nov 10, 2018

The latest 1.5.4 Docker image now works. I've noticed that a WS connection produces a slightly different log line:
New client connected from ::ffff:192.168.1.2 as ws-client

I tried setting the local address explicitly, but I always get this. The TCP connections just show the IPv4 address without the IPv6 part. Is this expected?

@Codelica

I also tried the latest tag (1.5.4) and the Dockerfile from master, but both failed to bring up the websockets listener.
The host does have IPv6 disabled; not sure if that's involved.

1542004771: mosquitto version 1.5.4 starting,
1542004771: Config loaded from /mosquitto/config/mosquitto.conf.,
1542004771: Opening ipv6 listen socket on port 1883.,
1542004771: Warning: Address family not supported by protocol,
1542004771: Opening ipv4 listen socket on port 1883.,
1542004771: libuv support not compiled in,
1542004771:  Using non-SSL mode,
1542004771: ERROR opening socket,
1542004771: Failed to create default vhost,
1542004771: init server failed,
1542004771: Error: Unable to create websockets listener on port 9001.

Running same config with 1.4.12 tagged image works fine.
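
The "Address family not supported by protocol" warning in the log above is what the operating system reports when an IPv6 socket cannot be created at all. To check whether a host or container is in that state, something like the following sketch could be run there. The function name `ipv6_socket_supported` is made up for this example, and it only tests socket creation, not whether any interface has an IPv6 address configured.

```python
import socket


def ipv6_socket_supported():
    """Return True if an IPv6 TCP socket can be created on this host."""
    if not socket.has_ipv6:
        return False  # this Python build has no IPv6 support at all
    try:
        socket.socket(socket.AF_INET6, socket.SOCK_STREAM).close()
    except OSError:
        # Typically "Address family not supported by protocol"
        return False
    return True


print("IPv6 sockets available:", ipv6_socket_supported())
```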

@ralight
Contributor

ralight commented Nov 13, 2018

@stoinov That is down to the way in which libwebsockets treats IPv6. I don't think there is anything that can be done about it other than completely disabling IPv6.

@ralight
Contributor

ralight commented Nov 13, 2018

@Codelica I think you've probably hit the nail on the head mentioning that IPv6 is disabled. It turns out that libwebsockets assumes everything is IPv6 (if support is compiled in) unless you tell it you don't want IPv6 support. Mosquitto asks for both IPv4 and IPv6 and uses the one that works.

Are you in a position to test with IPv6 enabled, even if the interface isn't configured?

For fixing it in Mosquitto, on one hand we could disable IPv6 in lws completely, on the other hand we could add an option to disable IPv6 for a listener, or on the gripping hand we could write our own websockets library. I'd like to go for the second option in 1.5.5.
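
To illustrate what "asks for both IPv4 and IPv6 and uses the one that works" can look like in general, here is a sketch of the pattern in Python. It is not Mosquitto's or libwebsockets' actual code; the function name `open_listeners` is an assumption, and the point is only that a family which fails is skipped instead of aborting the whole listener.

```python
import socket


def open_listeners(port, host=None):
    """Try every address family the resolver offers; keep the ones that bind."""
    listeners = []
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_UNSPEC, type=socket.SOCK_STREAM,
        flags=socket.AI_PASSIVE,
    )
    for family, socktype, proto, _canon, sockaddr in infos:
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError:
            continue  # family unavailable, e.g. IPv6 disabled on the host
        try:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            if family == socket.AF_INET6:
                # Keep the IPv6 socket from also claiming the IPv4 port.
                sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
            sock.bind(sockaddr)
            sock.listen()
        except OSError:
            sock.close()
            continue
        listeners.append(sock)
    return listeners
```

On a host with IPv6 disabled, `open_listeners(1883)` would be expected to return only an IPv4 socket, which mirrors the behavior of the plain MQTT listener in the log above. A library that insists on IPv6 has no such fallback and fails outright.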

@stoinov

stoinov commented Nov 13, 2018

That's fine, as long as it's a visual artifact and does not affect any functionality.

@thelordoflite

Thanks @ralight for the fix. Works for me.

@sergey-lukin
Author

> For fixing it in Mosquitto, on one hand we could disable IPv6 in lws completely, on the other hand we could add an option to disable IPv6 for a listener, or on the gripping hand we could write our own websockets library. I'd like to go for the second option in 1.5.5.

Option 2 (add an option to disable IPv6 for a listener) sounds like the most reasonable one and can make everyone happy. Those who need IPv6 can use it, and those who have problems because of IPv6 can just disable it with an option.

@Codelica

> @Codelica I think you've probably hit the nail on the head mentioning that IPv6 is disabled. It turns out that libwebsockets assumes everything is IPv6 (if support is compiled in) unless you tell it you don't want IPv6 support. Mosquitto asks for both IPv4 and IPv6 and uses the one that works.
>
> Are you in a position to test with IPv6 enabled, even if the interface isn't configured?

@ralight I enabled IPv6 (un-configured) on one of our hosts, and it did resolve the issue. The websockets listener came up using :latest.

> For fixing it in Mosquitto, on one hand we could disable IPv6 in lws completely, on the other hand we could add an option to disable IPv6 for a listener, or on the gripping hand we could write our own websockets library. I'd like to go for the second option in 1.5.5.

I would agree with @sergey-lukin, option 2 would be best for us also. I actually looked for that option before joining this issue, as most of our hosts have IPv6 disabled.

@thrust15

I've just pulled the latest image again and I'm able to connect without issue.

@ralight
Contributor

ralight commented Dec 5, 2018

This can be controlled in the upcoming 1.5.5 by setting socket_domain ipv4 in the config for the websockets listeners.

@wsw70

wsw70 commented Jun 3, 2019

Having had the same problem, the working configuration per @ralight's comment above is:

listener 1883
listener 1884
protocol websockets
socket_domain ipv4
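
For anyone auditing several config files, a quick check for websockets listeners that do not set socket_domain might look like the sketch below. It is a deliberately simplified reader, not a full mosquitto.conf parser: the function name `websocket_listeners_without_domain` is hypothetical, and it only understands the `listener`, `protocol` and `socket_domain` lines that appear in this thread.

```python
def websocket_listeners_without_domain(config_text):
    """Return ports of websockets listeners that do not set socket_domain."""
    blocks = []      # one dict per "listener" line, in file order
    current = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if key == "listener" and value:
            # A new listener block starts; the protocol defaults to mqtt.
            current = {"port": value.split()[0], "protocol": "mqtt",
                       "socket_domain": None}
            blocks.append(current)
        elif current is not None and key in ("protocol", "socket_domain"):
            current[key] = value
    return [b["port"] for b in blocks
            if b["protocol"] == "websockets" and b["socket_domain"] is None]


FIXED = """\
listener 1883
listener 1884
protocol websockets
socket_domain ipv4
"""

print(websocket_listeners_without_domain(FIXED))
```

An empty result means every websockets listener in the file pins its address family. Whether pinning to ipv4 is needed at all depends on the host; it matters when IPv6 is disabled there.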

@lock lock bot locked as resolved and limited conversation to collaborators Sep 1, 2019