Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(ext/streams): optimize streams #20649

Merged
merged 25 commits into from
Oct 13, 2023
Merged

Conversation

marcosc90
Copy link
Contributor

This PR introduces several optimizations to streams

Highlights:

  • ReadableStream constructor: +20% iter/s.
  • WritableStream constructor: +50% iter/s.
  • TransformStream constructor: +30% iter/s.
  • ReadableStream iterator (both 2 and 20 chunks): +42% and +25% iter/s.
  • ReadableByteStream iterator (both 2 and 20 chunks): +39% and +20% iter/s.

Benchmarks

main

cpu: 13th Gen Intel(R) Core(TM) i9-13900H
runtime: deno 1.37.0 (x86_64-unknown-linux-gnu)

benchmark                                      time (avg)        iter/s             (min … max)       p75       p99      p995
----------------------------------------------------------------------------------------------- -----------------------------
ReadableStream constructor                    294.52 ns/iter   3,395,392.9 (277.92 ns … 618.26 ns) 292.66 ns 353.87 ns 618.26 ns
WritableStream constructor                    235.51 ns/iter   4,246,065.3 (213.04 ns … 306.35 ns) 236.77 ns 279.08 ns 281.32 ns
TransformStream constructor                   672.52 ns/iter   1,486,938.7 (652.15 ns … 880.74 ns) 670.11 ns 880.74 ns 880.74 ns
ReadableStream - iterator (2 chunks)           10.44 µs/iter      95,757.9   (8.97 µs … 830.91 µs)  10.22 µs  14.74 µs  18.93 µs
ReadableStream - iterator (20 chunks)          21.93 µs/iter      45,593.4   (18.8 µs … 864.97 µs)  20.57 µs  57.15 µs 137.16 µs
ReadableStream - reader (2 chunks)              7.09 µs/iter     140,987.2     (7.03 µs … 7.18 µs)   7.13 µs   7.18 µs   7.18 µs
ReadableStream - reader (20 chunks)            18.41 µs/iter      54,324.2    (15.7 µs … 252.7 µs)  17.14 µs  68.88 µs  94.08 µs
ReadableByteStream - iterator (2 chunks)       11.06 µs/iter      90,375.1   (9.75 µs … 404.69 µs)  10.88 µs   16.6 µs  29.69 µs
ReadableByteStream - iterator (20 chunks)      26.71 µs/iter      37,435.0  (22.98 µs … 508.34 µs)  25.25 µs  85.28 µs 155.65 µs
ReadableByteStream - reader (2 chunks)          7.99 µs/iter     125,131.1     (7.92 µs … 8.13 µs)   8.01 µs   8.13 µs   8.13 µs
ReadableByteStream - reader (20 chunks)        23.46 µs/iter      42,618.5  (20.28 µs … 414.66 µs)  21.94 µs  90.52 µs 147.38 µs

this PR

cpu: 13th Gen Intel(R) Core(TM) i9-13900H
runtime: deno 1.37.0 (x86_64-unknown-linux-gnu)

benchmark                                      time (avg)        iter/s             (min … max)       p75       p99      p995
----------------------------------------------------------------------------------------------- -----------------------------
ReadableStream constructor                    235.48 ns/iter   4,246,584.3 (223.12 ns … 504.65 ns)  234.3 ns 290.84 ns 311.12 ns
WritableStream constructor                    156.31 ns/iter   6,397,537.3 (148.54 ns … 211.13 ns) 157.49 ns 199.82 ns 208.23 ns
TransformStream constructor                   471.29 ns/iter   2,121,815.3 (452.53 ns … 791.41 ns) 468.62 ns 540.36 ns 791.41 ns
ReadableStream - iterator (2 chunks)            7.32 µs/iter     136,705.4   (6.35 µs … 639.97 µs)    7.1 µs  12.12 µs  20.98 µs
ReadableStream - iterator (20 chunks)          17.48 µs/iter      57,195.1  (14.48 µs … 289.06 µs)  16.06 µs  76.98 µs 114.61 µs
ReadableStream - reader (2 chunks)              6.86 µs/iter     145,847.9      (6.8 µs … 6.97 µs)   6.88 µs   6.97 µs   6.97 µs
ReadableStream - reader (20 chunks)            16.88 µs/iter      59,227.7  (14.04 µs … 311.29 µs)  15.39 µs  74.95 µs  97.45 µs
ReadableByteStream - iterator (2 chunks)        7.94 µs/iter     125,881.2   (6.86 µs … 811.16 µs)   7.69 µs  11.43 µs   16.6 µs
ReadableByteStream - iterator (20 chunks)      22.23 µs/iter      44,978.2  (18.98 µs … 590.11 µs)  20.73 µs  45.13 µs  159.8 µs
ReadableByteStream - reader (2 chunks)           7.4 µs/iter     135,206.9     (7.36 µs … 7.42 µs)    7.4 µs   7.42 µs   7.42 µs
ReadableByteStream - reader (20 chunks)        21.03 µs/iter      47,555.6  (17.75 µs … 357.66 µs)  19.52 µs  98.69 µs  146.5 µs

@mmastrac
Copy link
Contributor

This is awesome.

Can you see if this test passes w/1000000 writes? It was too slow to pass for me before your PR:

https://github.com/denoland/deno/blob/main/cli/tests/unit/streams_test.ts#L252

@marcosc90
Copy link
Contributor Author

This is awesome.

Can you see if this test passes w/1000000 writes? It was too slow to pass for me before your PR:

https://github.com/denoland/deno/blob/main/cli/tests/unit/streams_test.ts#L252

Just rebased main, I'll try it out.

@marcosc90
Copy link
Contributor Author

marcosc90 commented Sep 23, 2023

@mmastrac the optimizations are much better than I thought.

This PR with: 1_000_000

readableStreamVeryLongReadAll ... ok (795ms)

In main > 5 minutes

readableStreamVeryLongReadAll ... ok (5m47s)

@mmastrac
Copy link
Contributor

I tested this patch with my web-stream serving benchmark and it's ~2% faster vs main. Great work!

57171 -> 58495 for large chunk streams

63183 -> 63183 for small chunk streams

Copy link
Contributor

@mmastrac mmastrac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I think @lucacasonato should take a look as well.

Copy link
Member

@lucacasonato lucacasonato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall - some comments

Comment on lines -360 to +408
return new ReadableStreamDefaultReader(stream);
const reader = new ReadableStreamDefaultReader(_brand);
setUpReadableStreamDefaultReader(reader, stream);
return reader;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a spec fix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a performance optimization but also aligns with the spec: https://streams.spec.whatwg.org/#acquire-readable-stream-reader.

  1. Let reader be a new ReadableStreamDefaultReader.
  2. Perform ? SetUpReadableStreamDefaultReader(reader, stream).
  3. Return reader.

@@ -4734,6 +4757,32 @@ const asyncIteratorPrototype = ObjectGetPrototypeOf(AsyncGeneratorPrototype);
const _iteratorNext = Symbol("[[iteratorNext]]");
const _iteratorFinished = Symbol("[[iteratorFinished]]");

class ReadableStreamAsyncIteratorReadRequest {
constructor(reader, promise) {
this.reader = reader;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is vulnerable if a user declares an object getter/setter. Explicitly define the property on the class (maybe private is even faster?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with making them private, just pushed it. But how is it vulnerable? The class is not exposed and only uses internal fields.

Copy link
Member

@lucacasonato lucacasonato Sep 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a user adds a reader setter/getter to Object.prototype, this assignment will call that setter rather than creating an object property on this.

Copy link
Member

@lucacasonato lucacasonato Sep 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To fix it, just declaring the field on the class is enough (class X { reader; constructor() { ... } }). This causes the locally declared field to shadow the possible setter/getter on the object prototype

Comment on lines -4774 to -4870
return PromisePrototypeThen(promise.promise, (result) => {
reader[_iteratorNext] = null;
if (result.done === true) {
reader[_iteratorFinished] = true;
return { value: undefined, done: true };
}
return result;
}, (reason) => {
reader[_iteratorNext] = null;
reader[_iteratorFinished] = true;
throw reason;
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did this code go?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here

deno/ext/web/06_streams.js

Lines 4766 to 4782 in 6369ad4

chunkSteps(chunk) {
this.reader[_iteratorNext] = null;
this.promise.resolve({ value: chunk, done: false });
}
closeSteps() {
this.reader[_iteratorNext] = null;
this.reader[_iteratorFinished] = true;
readableStreamDefaultReaderRelease(this.reader);
this.promise.resolve({ value: undefined, done: true });
}
errorSteps(e) {
this.reader[_iteratorNext] = null;
this.reader[_iteratorFinished] = true;
readableStreamDefaultReaderRelease(this.reader);
this.promise.reject(e);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment for future readers so they know where to change things if the spec changes? Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @lucacasonato I was OOO, just read this.

AFAIK these state variables are not part of the spec, but needed to pass WPT.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are defined in this spec: https://webidl.spec.whatwg.org/#es-default-asynchronous-iterator-object

They handle cases like ensuring that calling next() after it has returned done won't throw

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are defined in this spec: https://webidl.spec.whatwg.org/#es-default-asynchronous-iterator-object

Thanks, wasn't aware of that.

Added a comment @lucacasonato, feel free to edit it.

Copy link
Member

@lucacasonato lucacasonato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@bartlomieju bartlomieju merged commit 7599990 into denoland:main Oct 13, 2023
13 checks passed
@marcosc90 marcosc90 deleted the perf-web-streams branch October 13, 2023 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants