-
Notifications
You must be signed in to change notification settings - Fork 12.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix connect timeout for non-linux targets, read readiness of socket connection, Read readiness to detect errors. Fixes #127018
#127300
base: master
Are you sure you want to change the base?
Conversation
…onnection before returning success
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Nilstrieb (or someone else) some time within the next two weeks. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
|
Not sure why I was explicitly requested as a reviewer. I don't have the necessary context/knowledge in any situation. Edit: sorry for the spam. GitHub mobile didn't show the reassignments for whatever reason :/ |
I don't know enough about this area either. maybe you do? |
// Check if the connnection actually succeeded and return ok only when | ||
// the socket is ready and no errors were found | ||
if let Some(e) = self.take_error()? { | ||
return Err(e); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it makes sense to change this for all platforms, since #127272 (precursor) and #127018 seem to indicate this is only on vxworks (or are there others you know of?)
// Check if the connnection actually succeeded and return ok only when | |
// the socket is ready and no errors were found | |
if let Some(e) = self.take_error()? { | |
return Err(e); | |
} | |
if cfg!(target_os = "vxworks") { | |
// [comment about vxworks not using POLLHUP] | |
if let Some(e) = self.take_error()? { | |
return Err(e); | |
} | |
} else { | |
/* original impl */ | |
} |
Also just for reference, if you put Fixes #127018
in a line of your PR description then it will close the issue when this merges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tgross35 I don't think that changes I made, will impact other platforms. For code simplicity, I did not make these changes specific for Vxworks. Not sure which approach is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also do not understand the details here but to me it seems like the difference between returning Err
on all errors vs. only on hangups must be intentional. So I am hesitant to change this without having a better picture of what is going on.
@sfackler it looks like you authored this bit of code in #43062 and updated it in #45269. What is the reasoning to check for POLLOUT
/POLLHUP
rather than just always returning an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@the8472 might have a better idea here too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote that code 7 years ago - I do not remember the specific context around it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth a shot, thanks for taking a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few relevant things from https://man7.org/linux/man-pages/man2/poll.2.html:
The field events is an input parameter ...
This field may be specified as zero, in which case the only
events that can be returned in revents are POLLHUP, POLLERR, and
POLLNVAL (see below).
We probably don't need to set POLLOUT
anymore as we aren't checking on it since #45269.
POLLERR
Error condition (only returned in revents; ignored in
events). This bit is also set for a file descriptor
referring to the write end of a pipe when the read end has
been closed.
POLLHUP
Hang up (only returned in revents; ignored in events).
Note that when reading from a channel such as a pipe or a
stream socket, this event merely indicates that the peer
closed its end of the channel. Subsequent reads from the
channel will return 0 (end of file) only after all
outstanding data in the channel has been consumed.
POLLNVAL
Invalid request: fd not open (only returned in revents;
ignored in events).
From #45265 (comment), it sounds like in reality it sets all three.
The manpage doesn't say anything about setting an error to SO_ERROR
. I would assume this is always true, but it does seem safer to have the fallback in place. It also seems a bit weird to me that we are checking only against POLLHUP and not a POLLHUP | POLLNVAL | POLLERR
mask, since these are all documented error conditions (as you pointed out below. Possibly relevant only for files, but without checking kernel source seems uncertain).
Perhaps I just don't have all the context here, if you find docs on the poll
/ SO_ERROR
relationship then please do link it.
So: I wouldn't really want to change the existing Linux handling to only check take_error
because it means going from documented behavior to empirical behavior, even if has proven correct. In which case I would prefer the cfg
suggestion above to the current proposal.
However I think this code could be cleaned up a bit with the following, which should work on your system too:
-
Change the
libc::pollfd
struct constructed above to have events as 0 rather thanPOLLOUT
, since we are no longer checkingPOLLOUT
(since Fix TcpStream::connect_timeout on linux #45269) -
Because we are no longer have a possible
POLLOUT
revent
, the possible results oflibc::poll
are:- -1: error via errno
- 0: success
- 3 (Linux), something else (vxworks): emitted revents,
POLLHUP | POLLNVAL | POLLERR
-
Since we can no longer have any successful revents (previously
POLLOUT
), the last match arm can be changed to unconditionally emit an error (without checkingrevents
orSome(e) = take_error
). So just remove theif
from the existing code so it always emitsself.take_error()?.unwrap_or_else(|| { ... })
.That is: that branch of the
match
will now always try to read theSO_ERROR
, which works on both Linux and vxworks. If that fails, meaning revents were set but there is noSO_ERROR
for some unexpected reason, it will emit anUncategorized
error (can never happen on vxworks per std::net::TcpStream::connect_timeout on vxworks #127018, but maybe possible on linux?). -
If you feel like it,
debug_assert!
that the revents are something we expect so we learn if this isn't accurate.
If that tests well, then I think it might be a more robust option on all systems. If you make the changes please comment it well, somebody else is going to have to reverse engineer this one day...
Doesn't matter if the above works, but does Vxworks set POLLNVAL
or POLLERR
rather than a mask including POLLHUP
?
@rustbot author |
The original implementation also calls @rustbot review |
Fixes #127018
r? @tgross35 |
r? @tgross35 |
Fixes #127018
![Screenshot 2024-07-04 105820](https://private-user-images.githubusercontent.com/88673422/345707023-5ef5a87f-f2af-4fb7-98da-7612d5e27e9a.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjE3OTk4NTUsIm5iZiI6MTcyMTc5OTU1NSwicGF0aCI6Ii84ODY3MzQyMi8zNDU3MDcwMjMtNWVmNWE4N2YtZjJhZi00ZmI3LTk4ZGEtNzYxMmQ1ZTI3ZTlhLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MjQlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzI0VDA1MzkxNVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTA3MjUyNmFkOGFlZDJkNWE2NTMzNDI3ODU5MjJiZWU5MTAyZWUyNTZlNDE3MDY0NjRlMDFkN2E5ZWMxODRiMTUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.pi7xoAkyD8XiKdp05cnyzcp93BjnZs9ycixQt-8oVmw)
Connect_timeout would call
poll
and checkpollfd.revents
for POLLHUP error, rather that checking readiness. This behavior was meant for Linux as it returns POLLHUP | POLLOUT | POLLERR in case of errors. But on targets that do not return POLLHUP inpollfd.revents
, this would indicate a false success and result in this issue. To resolve this we will check readiness of socket usinggetsockopt():
and return success from connect_timeout when there are no errors.Changes were tested on Linux and an rtos.
Thank you.