Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(ext/web): Add fast path for non-streaming TextDecoder #14217

Merged
merged 10 commits into from
May 17, 2022

Conversation

randomicon00
Copy link
Contributor

First attempt at adding the fast path.

@CLAassistant
Copy link

CLAassistant commented Apr 5, 2022

CLA assistant check
All committers have signed the CLA.

input = new Uint8Array(input);
}

if (!options.stream) {
Copy link
Contributor

@andreubotella andreubotella Apr 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!options.stream) {
if (!options.stream && this.#rid === null) {

stream: true makes the current decoding context (i.e. the current Decoder instance) be kept around for the next call to TextDecoder.prototype.decode, rather than end with this call. Using op_encoding_decode_single when there is an open decoding context means that any state kept from the last call (for example, bytes at the end of the last input that couldn't be decoded without having access to the remainder of the input) would be discarded.

const decoder = new TextDecoder();

// Without `{stream: true}` this would return "🙈�", because the last two bytes
// aren't valid UTF-8 on their own.
decoder.decode(
  new Uint8Array([0xF0, 0x9F, 0x99, 0x88, 0xF0, 0x9F]),
  {stream: true}
);  // "🙈"

// With your PR as it is now, this would return "�🙊", because the first two
// bytes aren't valid UTF-8 on their own, but they are when appended to the
// last input.
decoder.decode(new Uint8Array([0x99, 0x89, 0xF0, 0x9F, 0x99, 0x8A])); // "🙉🙊"

With this change, the assumption that options.stream must be true in the code below no longer holds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @andreubotella!
The failed tests were reduced to 6 now. Will have a closer look at the remaining failed tests and your comments in a few hours.

ext/web/lib.rs Outdated
.max_utf16_buffer_length(data.len())
.ok_or_else(|| range_error("Value too large to decode."))?;

let mut output = vec![0; max_buffer_length];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please run cargo fmt (or ./tools/format.js to also format the JS code).

@randomicon00
Copy link
Contributor Author

randomicon00 commented Apr 6, 2022

The last 6 errors are all due to this panic.

thread 'worker-0' panicked at 'Must not use a decoder that has finished.',
encoding_rs-0.8.29/src/lib.rs:4203:43

Basically, it happens when a decoder wants to query a worst-case UTF-16 length in order for the dst buffer to allocate the required space.

Its state is DecoderLifeCycle::Finished when performing the query which causes the panic.

DecoderLifeCycle::Finished => panic!("Must not use a decoder that has finished."),

I am still in the process of trying to isolate the problem.

@randomicon00
Copy link
Contributor Author

All tests are passing now. The first version of this refactoring is complete. Please have a look.

Copy link
Contributor

@andreubotella andreubotella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. But please rename this PR with a title following conventional commits as with other PRs and commits. See also the contributing section of the manual.

@@ -97,14 +97,6 @@

// TODO(lucacasonato): add fast path for non-streaming decoder & decode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please get rid of this comment

@bartlomieju bartlomieju changed the title Fix#13829 (TextDecoder): Add fast path for TextDecoder non streaming (Draft) perf(ext/web): Add fast path for non-streaming TextDecoder Apr 26, 2022
Copy link
Member

@littledivy littledivy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Can we get a benchmark for this in https://github.com/denoland/deno/blob/main/cli/bench/deno_common.js ? Before and after comparison will be helpful.

@randomicon00
Copy link
Contributor Author

Looks great! Can we get a benchmark for this in https://github.com/denoland/deno/blob/main/cli/bench/deno_common.js ? Before and after comparison will be helpful.

Thanks @littledivy! I don't know if your message was addressed to me. But just in case, I would need just a little bit more information on how to perform the before/after benchmark.

@randomicon00
Copy link
Contributor Author

What benchmark tests should I run to move this PR towards completion and validation? Thanks.

@littledivy
Copy link
Member

@randomicon00 Added benchmarks, results look good:

# main
test bench_encode_12kb ... bench:   5,024,752 ns/iter (+/- 91,769)
# this branch
test bench_encode_12kb ... bench:   4,749,543 ns/iter (+/- 68,506)

Copy link
Member

@littledivy littledivy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@littledivy littledivy merged commit e58f77e into denoland:main May 17, 2022
sigmaSd pushed a commit to sigmaSd/deno that referenced this pull request May 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants