Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(ext/web): Prevent TextDecoderStream resource leak on stream cancellation #21074

Merged

Conversation

egfx-notifications
Copy link
Contributor

@egfx-notifications egfx-notifications commented Nov 3, 2023

This PR uses the new cancel method of TransformStream to properly clean up the internal TextDecoder used in TextDecoderStream if the stream is cancelled.

Fixes #13142

@CLAassistant
Copy link

CLAassistant commented Nov 3, 2023

CLA assistant check
All committers have signed the CLA.

@mmastrac
Copy link
Contributor

mmastrac commented Nov 3, 2023

Could you add a test case for this?

This should be sufficient (it'll trigger the resource sanitizer for tests):

new TextDecoder('utf8').decode(new Uint8Array(), { stream: true });

@egfx-notifications
Copy link
Contributor Author

Could you add a test case for this?

I can, though I'm not satisfied with the fix yet as this recreates the internal decoder resource for every chunk of the stream. I'm going to look further into the root cause of this, hence the draft status of the PR. I will update this PR and add a test once I have (hopefully) come up with a better solution.

From what I've found out so far the problem is that the decoder from encoding_rs requires a final call of a decode_* function with the last argument set to true. The deno TextDecoder provides the last argument as the negated value of the stream option, so we have to call TextDecoder.decode() at least once without the stream option to free the resource. This is what TextDecoderStream does in its flush method, I'm yet to find the proper way of handling this for an aborted async iterator.

@crowlKats
Copy link
Member

@egfx-notifications FYI as of recently, there is a TransformStream cancel option (#20741)

@egfx-notifications
Copy link
Contributor Author

@egfx-notifications FYI as of recently, there is a TransformStream cancel option (#20741)

Thank you! That saved me quite some time.

I just pushed a version using the cancel option and it works fine for my test case. Feel free to have a look, this is my first deep dive into deno code, so I may be missing something.

I'm going to test this properly and also add a minimal test case sometime next week. The test case suggested by @mmastrac is a bit too minimal as I consider the problem to be a cancelled stream not cleaning up its internal resources. Explicitly calling TextDecoder.decode() with the stream option without actually using it in a stream is not a case I want to solve, especially as the deno test output already mentions how to close the TextDecoder resource properly.

@egfx-notifications egfx-notifications changed the title fix(ext/web): Prevent TextDecoderStream resource leak fix(ext/web): Prevent TextDecoderStream resource leak on stream cancellation Nov 4, 2023
@egfx-notifications egfx-notifications force-pushed the fix/textdecoderstream-resource-leak branch from ea78619 to 38b8e1e Compare November 6, 2023 12:52
@egfx-notifications egfx-notifications force-pushed the fix/textdecoderstream-resource-leak branch from 38b8e1e to d2088a8 Compare November 6, 2023 14:28
@egfx-notifications egfx-notifications marked this pull request as ready for review November 6, 2023 14:28
@egfx-notifications
Copy link
Contributor Author

I'm done with this unless you request any changes.

This may also fix #19946 from how I understand the issue.

Comment on lines +304 to +311
cancel: (_reason) => {
try {
const _ = this.#decoder.decode();
return PromiseResolve();
} catch (err) {
return PromiseReject(err);
}
},
Copy link
Contributor

@mmastrac mmastrac Nov 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe something like this can be used to simplify the construction:

Suggested change
cancel: (_reason) => {
try {
const _ = this.#decoder.decode();
return PromiseResolve();
} catch (err) {
return PromiseReject(err);
}
},
// deno-lint-ignore require-await
cancel: async (_reason) => {
const _ = this.#decoder.decode();
},

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that this entire class is using the other style -- I'm not sure why so let's leave as-is. This is probably a good follow-up issue. I'll merge it!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I tried to follow the existing style. Not sure why it is preferred here, though.

const file = await Deno.open(filename);
const readable = file.readable.pipeThrough(new TextDecoderStream());
const chunks = [];
for await (const chunk of readable) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great test, thanks

@mmastrac mmastrac merged commit 3a7abe6 into denoland:main Nov 12, 2023
13 checks passed
@egfx-notifications egfx-notifications deleted the fix/textdecoderstream-resource-leak branch November 13, 2023 13:48
@mmastrac
Copy link
Contributor

This is causing a test flake:

[text_encoding_test 001.57] textDecoderStreamCleansUpOnCancel => ./cli/tests/unit/text_encoding_test.ts:323:6
[text_encoding_test 001.57] error: Leaking async ops:
[text_encoding_test 001.57]   - 1 async operation to op_read was started in this test, but never completed.
[text_encoding_test 001.57] To get more details where ops were leaked, run again with --trace-ops flag.

@egfx-notifications
Copy link
Contributor Author

Yeah, just noticed this as well. Thanks for the fixed test. I was going to suggest deferring the sanitizers as suggested here, but your fix seems cleaner.

kt3k pushed a commit that referenced this pull request Nov 17, 2023
…llation (#21074)

This PR uses the new `cancel` method of `TransformStream` to properly
clean up the internal `TextDecoder` used in `TextDecoderStream` if the
stream is cancelled.

Fixes #13142

Co-authored-by: Bartek Iwańczuk <[email protected]>
kt3k pushed a commit that referenced this pull request Nov 17, 2023
zifeo pushed a commit to metatypedev/deno that referenced this pull request Nov 22, 2023
…llation (denoland#21074)

This PR uses the new `cancel` method of `TransformStream` to properly
clean up the internal `TextDecoder` used in `TextDecoderStream` if the
stream is cancelled.

Fixes denoland#13142

Co-authored-by: Bartek Iwańczuk <[email protected]>
zifeo pushed a commit to metatypedev/deno that referenced this pull request Nov 22, 2023
mmastrac pushed a commit that referenced this pull request Feb 13, 2024
…cancellation (#21199)

Based on #21074 and #20741 I was looking for further potential use cases
of `TransformStream` `cancel()` method, so here go `CompressionStream`
and `DecompressionStream`.

Fixes #14212
littledivy pushed a commit that referenced this pull request Feb 15, 2024
…cancellation (#21199)

Based on #21074 and #20741 I was looking for further potential use cases
of `TransformStream` `cancel()` method, so here go `CompressionStream`
and `DecompressionStream`.

Fixes #14212
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TextDecoderStream leaks a native decoder resource if its stream errors
5 participants