Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement read_ahead option for inet_drv.c to not overconsume kTLS data #8690

Conversation

RaimoNiskanen
Copy link
Contributor

Closes #8561.

This PR adds a boolean option read_ahead to gen_tcp that affects if inet_drv.c is greedy when reading packet data.

The default is the backwards compatible {read_ahead,true} but for a connection that will switch over to kTLS it should be {read_ahead,false} or else inet_drv.c will probably, when switching over to kTLS in greediness have received some data that was for the OS TCP stack to decrypt, so the connection's TLS packet framing derails.

I changed SSL code to use the option, but I guess it is used in too many places for the SSL distribution. @IngelaAndin will have to educate me on that.

This PR lacks handling the option in the socket backend. Documentation is also missing, for now.

@RaimoNiskanen RaimoNiskanen added this to the OTP-26.2.5.3 milestone Jul 25, 2024
@RaimoNiskanen RaimoNiskanen self-assigned this Jul 25, 2024
@zzydxm
Copy link
Contributor

zzydxm commented Jul 27, 2024

Thanks a lot for the fix!
I think it would be enough to change those places in inet_tls_dist, we will do a test on our side too.

(I still hope ssl application can be changed to use the native socket module instead of the gen_tcp_socket, but it is unrelated to this issue.)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we mean to commit this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I work inside the OTP team I can commit that file.

That file has to be on the branch for the final merge to work, but we are not allowed to take a preloaded .beam file from an external contributor, for security reasons.

External contributors cannot have a preloaded file in a PR branch, and then someone in the OTP team has to build it on top of the branch, internally, cannot use the "Testing" label and has to add the branch manually to our integration branch.

But on the other hand, a security aware external entity might not like to take a preloaded file for granted from single, probably, OTP developer... Maybe I should put it in the last commit so it would be easy to back out. Although I prefer to have it in the commit that changes it's source code, which facilitate bisecting.

@miriampena
Copy link

Hi @RaimoNiskanen . I tested this out (many times) with our code and the bugfix seems to remove the race condition as I am not able to trigger it anymore. Thanks for the fix!

@RaimoNiskanen
Copy link
Contributor Author

@miriampena: Great to hear!

Unfortunately it is not quite done yet; the fix passes a NULL pointer to memmove in some odd case, so it causes segfaults in our daily runs.

That, and read_ahead option support is missing from the socket backend...

@RaimoNiskanen
Copy link
Contributor Author

RaimoNiskanen commented Aug 16, 2024

SHA 405c4dd looks better in our daily builds; no more core dumps. Please re-verify!

I am now working on gen_tcp_socket, and I have to write some documentation too...

@RaimoNiskanen
Copy link
Contributor Author

RaimoNiskanen commented Aug 20, 2024

My changes in gen_tcp_socket clashed in a train wreck like manner with the rewrites in OTP 27, creating a nightmare-ish conflict when merging forward.

I think I will release this as-is, an undocumented option used by SSL's kTLS distribution, and implement the read_ahead option fully in OTP 27.

I hope that will be sufficient...?

I also wait for confirmation that my new version 405c4dd fixing the memmove NULL pointer problem still works as expected.

Edit I cleaned up the branch to f7a4fff, with the same content.

@RaimoNiskanen RaimoNiskanen force-pushed the raimo/inet_drv-active-1-no-read-ahead/GH-8561/OTP-19175 branch 2 times, most recently from f7a4fff to b3c4eaf Compare August 21, 2024 12:57
@RaimoNiskanen
Copy link
Contributor Author

New update; corrected where in SSL the read_ahead option should be used.

@RaimoNiskanen RaimoNiskanen marked this pull request as ready for review August 21, 2024 12:59
@RaimoNiskanen RaimoNiskanen requested review from miriampena and removed request for IngelaAndin August 21, 2024 12:59
@miriampena
Copy link

Thanks @RaimoNiskanen . Will test this soon :)

@miriampena
Copy link

Unfortunately I can trigger the race condition with this code.

@RaimoNiskanen
Copy link
Contributor Author

Interesting... I will investigate!

@RaimoNiskanen RaimoNiskanen force-pushed the raimo/inet_drv-active-1-no-read-ahead/GH-8561/OTP-19175 branch from b3c4eaf to fbafeda Compare August 30, 2024 13:36
@RaimoNiskanen
Copy link
Contributor Author

Unfortunately our test cases does not trigger the race, and the obvious bug that inet_drv.c reads encrypted data ahead of decryption should be fixed. At least from what I can see.

May there be some other bug lurking under the fixed one...

I am thinking about merging this next week, and take it from there.

@RaimoNiskanen RaimoNiskanen linked an issue Sep 2, 2024 that may be closed by this pull request
@RaimoNiskanen
Copy link
Contributor Author

I have merged this PR (commit fbafeda) forward to 'maint' and 'master', and it will be released in OTP-26.2.5.3 hopefully later this week.

The read ahead problem for kTLS in SSL should be fixed, but there is probably some other bug that we can handle in a new Issue/PR...

@garazdawi garazdawi merged commit a78208f into erlang:maint-26 Sep 5, 2024
1 check passed
@RaimoNiskanen
Copy link
Contributor Author

Continuation, full implementation, in PR #8785

@miriampena
Copy link

Thanks a lot! Ill try to create a test that trigger the bug I am seeing.

@miriampena
Copy link

miriampena commented Sep 8, 2024

Hi, I pulled the latest code from maintenance and all seems to work fine and the bug is no longer seen. Thanks a million! ❤️

@RaimoNiskanen
Copy link
Contributor Author

@miriampena: Oh, so the bug is fixed on OTP 26.2.5.3 and 'maint', but I see some other bug or testcase problem in our daily runs. Bummer!...

Anyway; good for you! One bug less.
Thank you for finding this bug!

@RaimoNiskanen RaimoNiskanen deleted the raimo/inet_drv-active-1-no-read-ahead/GH-8561/OTP-19175 branch September 10, 2024 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Race condition with OTP ktls implementation and inet_driver
4 participants