New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Broker bridge 2.0.10: Corruption of pollfds on Windows #2173
Comments
Another way to make it work would be static void bridge_check_pending(struct mosquitto *context)
But it feels like it is dangerous because alot of code calls net__socket_close which doesn't immediately removes the socket from the poll array |
Thanks, you're right it needs some careful consideration of where it should happen. I'll take a look. |
Cool, I was wrong that it solves #2172, that problem is much more serious, it seems when you have more than 1 bridge configured only one of them will ever work, all the other (even if reachable) will never connect up, I will investigate. I suspect this must all work on Linux but I am on Windows |
net__socket_close is called which deletes/closes the socket from db.contexts_by_sock, but doesn't call mux__delete(mosq) before trying another connect, when connect is called the socket fd is changed in the context, but the pollfd_index points to the old spot (which also isn't cleared)
This happens in bridge_check which calls mosquitto__check_keepalive which closes the socket, which triggers a call to rc = bridge__connect(context); in the same loop
I am not sure what the proper solution is, because it doesn't seem like you are supposed to call mux__delete from net__socket_close, rather that maybe bridge_check_pending should detect the socket as stale and call mux__delete
however adding mux__delete here (in net_mosq::net__socket_close) works
This also seems to solve the issue where the broker is "stalled" if there is a bridge that isn't reachable (#2172 )
The text was updated successfully, but these errors were encountered: