perf(core.js): introduce promise ring #9979

Merged
merged 6 commits into denoland:main from ops/promise-ring on Apr 7, 2021

Conversation

AaronO
Contributor

@AaronO AaronO commented Apr 3, 2021

This is another optimization to help improve the baseline overhead of async ops. It shaves off ~55ns/op or ~7% of the current total async op overhead.

Though it's only 7% of the total async-op overhead, by my estimates it reduces the overhead of promise storing/lookup by somewhere between 33-50%.

It achieves these gains by taking advantage of the sequential nature of promise IDs: promises are optimistically stored sequentially in a pre-allocated circular buffer, falling back to the promise Map for slow-to-resolve promises.

The promise ring will add some constant memory overhead, ~4 B per slot, so ~16 KB for 4K slots, etc...
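
Roughly, the idea looks like this (an illustrative sketch with simplified names and structure, not the exact core.js implementation):

```js
const RING_SIZE = 4 * 1024;
const NO_PROMISE = null; // sentinel for an empty ring slot
const promiseRing = new Array(RING_SIZE).fill(NO_PROMISE);
const promiseMap = new Map(); // fallback for promises that outlive one lap of the ring

let nextPromiseId = 1;

function setPromise(promiseId, promise) {
  const idx = promiseId % RING_SIZE;
  const old = promiseRing[idx];
  if (old !== NO_PROMISE) {
    // The previous occupant is exactly RING_SIZE ids behind;
    // evict it to the Map so it can still be resolved later.
    promiseMap.set(promiseId - RING_SIZE, old);
  }
  promiseRing[idx] = promise;
}

function getPromise(promiseId) {
  // Ids more than one lap behind the newest id can only live in the fallback Map.
  if (promiseId < nextPromiseId - RING_SIZE) {
    const promise = promiseMap.get(promiseId);
    promiseMap.delete(promiseId);
    return promise;
  }
  const idx = promiseId % RING_SIZE;
  const promise = promiseRing[idx];
  promiseRing[idx] = NO_PROMISE;
  return promise;
}

// Each async op takes the next sequential id and parks its promise in the ring;
// the response handler later retrieves it via getPromise(promiseId) and resolves it.
function registerOp() {
  const promiseId = nextPromiseId++;
  let resolve;
  const promise = new Promise((r) => (resolve = r));
  promise.resolve = resolve; // stash the resolver for the response handler
  setPromise(promiseId, promise);
  return promiseId;
}
```

The fast path only touches a fixed-size array; the Map is hit only by promises that take longer than RING_SIZE subsequent ops to resolve.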

Benches

Before (main):
test bench_op_async   ... bench:     774,428 ns/iter (+/- 46,444)

After:
test bench_op_async   ... bench:     719,575 ns/iter (+/- 17,933)

Todo

  • Implement basic promise ring
  • Double check ring arithmetic
  • Test idea of breaching sectors to batch "old promise" checks out of setPromise

@AaronO AaronO changed the title from core.js: introduce promise ring to perf(core.js): introduce promise ring on Apr 3, 2021
Since breaching sectors in batches doesn't seem useful performance wise
@AaronO
Contributor Author

AaronO commented Apr 3, 2021

Breaching ring sectors in batches ultimately doesn't seem to yield any performance gains, so I've dropped the references to ring sectors. This should otherwise be good to merge, assuming it's correct.

@bartlomieju bartlomieju requested a review from ry April 3, 2021 14:06
…g bounds

Given that nextPromiseId is an increment ahead of the cursor last written to
@ry
Member

ry commented Apr 3, 2021

I see about a 3% real-world improvement with this optimization in the HTTP throughput benchmark. However, to my eye, this optimization seems rather complicated for such minimal benefit. If you don't mind, I'd rather leave this one queued up for the future and search for lower-hanging fruit for now.

@leoc11
Contributor

leoc11 commented Apr 5, 2021

Though it's not directly related to this PR, it seems that when promiseId reaches Number.MAX_SAFE_INTEGER, the next asyncOp might fail.

@AaronO
Contributor Author

AaronO commented Apr 5, 2021

Though it's not directly related to this PR, it seems that when promiseId reaches Number.MAX_SAFE_INTEGER, the next asyncOp might fail.

Yeah, I thought of that. It would be relatively easy to handle, but it needs to be coordinated on the Rust side. We could easily wrap around or fall back to BigInts up to 2^64.

In practice, it should be hard to hit that limit. Number.MAX_SAFE_INTEGER is 2^53 - 1 (an f64 has 53 bits of significand precision).

A Deno program doing 10M opcalls/s non-stop would take ~28.5 years to reach that limit:

2**53 / (1e7 * 86400 * 365)
> 28.561641472415626

Also, given that in 1.8.2 and prior, the cheapest op-baseline was ~1000ns/op, you couldn't do more than 1M opcalls/s, so you would be good for ~285 years ...
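
If we ever wanted to guard against it on the JS side, the wrap-around itself would be tiny (a hypothetical sketch, not part of this PR; the Rust side and the in-flight promise bookkeeping would need to agree on the same wrapped id space):

```js
// Hypothetical: wrap the promise id instead of letting it grow past MAX_SAFE_INTEGER.
const MAX_PROMISE_ID = Number.MAX_SAFE_INTEGER;

function nextWrappedPromiseId(current) {
  return current >= MAX_PROMISE_ID ? 1 : current + 1;
}
```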

@AaronO
Contributor Author

AaronO commented Apr 6, 2021

I see about a 3% real-world improvement with this optimization in the HTTP throughput benchmark. However, to my eye, this optimization seems rather complicated for such minimal benefit. If you don't mind, I'd rather leave this one queued up for the future and search for lower-hanging fruit for now.

I'm working on other async op improvements, but this improvement is pretty significant: it's a 10% reduction in the async-op baseline at current values, and it substantially helps improve tail latencies and close the tail-latency gap with Node on our TCP latency benches. These improvements come at a relatively small cost: ~20 lines of moderately complex JS. Ring buffers are a relatively common and well-understood data structure, nothing too exotic.

A promise ring is naturally less straightforward than a plain Map or Object, but it significantly reduces the GC pressure of promises whilst being faster than both. Unless we have a better alternative for holding promises, I would be in favor of landing this.

| Implementation | Op Baseline | Tail Latencies |
| -------------- | ----------- | -------------- |
| Ring           | 👍          | 👍             |
| Map            | 👍          | 👎             |
| Object         | 👎          | 👍             |

The promise Map has a better op baseline than an Object but higher tail latencies due to GC pressure; the Ring is better than both.

Member

@bartlomieju bartlomieju left a comment


The throughput increase and tail latency decrease from this PR are too good to let it wait. LGTM!

-const promiseTable = new Map();
+const promiseMap = new Map();
+const RING_SIZE = 4 * 1024;
+const NO_PROMISE = null; // Alias to null is faster than plain nulls
Member


What, really??!

Contributor Author


Yeah, it appears to save ~10ns/opcall in the benches I ran.

I haven't double-checked the V8-generated bytecode, but I assume it could be faster because equality is checked referentially against a captured constant instead of by value against the null literal.
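
For reference, these are the two variants being compared (an illustrative pair, not the exact core.js lines):

```js
const NO_PROMISE = null;

function isOccupiedAlias(slot) {
  return slot !== NO_PROMISE; // compare against the captured alias
}

function isOccupiedLiteral(slot) {
  return slot !== null; // compare against the null literal
}
```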

Contributor Author

@AaronO AaronO Apr 7, 2021


I will definitely need to double check this and narrow it down, to see if there is a true performance difference or if it was just a statistical fluke in my local benches.

But it's easy to change either way and doesn't negatively impact performance or correctness in its current form.

Contributor Author


@piscisaureus I've dumped the V8 bytecode and diffed the bytecode implementations of getPromise() and setPromise().

setPromise():

[bytecode screenshot]

getPromise():

[bytecode screenshot]

@bartlomieju bartlomieju merged commit 2865f39 into denoland:main Apr 7, 2021
@AaronO AaronO deleted the ops/promise-ring branch August 16, 2021 21:30