Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TransformStream cleanup using "Transformer.cancel" #1283

Merged
merged 15 commits into from
Sep 30, 2023

Conversation

lucacasonato
Copy link
Member

@lucacasonato lucacasonato commented Jun 8, 2023

This commit adds a cancel hook to Transformer. This allows users to perform resource cleanup when the readable side of the TransformStream is cancelled, or the writable side is aborted.

To preserve existing behavior, when the readable side is cancelled with a reason, the writable side is always immediately aborted with that same reason. The same is true in the reverse case. This means that the status of both sides is always either closed, erroring, or erroring when the cancel hook is called.

flush and cancel are never both called. As per existing behaviour, when the writable side is closed the flush hook is called. If the readable side is cancelled while a promise returned from flush is still pending, cancel is not called. In this scenario the readable side ends up in the "errored" state, while the writable side ends up in the closed state.

I have opted for a cancel hook instead of a finally hook, to mirror the API in WritableStream - it has one hook for successful completion (close), and one hook for errored completion (abort). Transformer already has a flush hook for successful completion. The logical addition is an cancel hook for errored completion.

Closes #1212


Open questions:

  • Is cancel the right name? Mirrors ReadableStream, but we use abort for WritableStream.
  • Is the immediate cancellation behaviour correct?
    • Ie should call the cancel hook before or after cancelling the underlying streams?
    • Calling it before means that current behaviour is changed (number of microticks from readable.cancel() to the writable actually being aborted). This is the current behaviour.
    • Calling it after means that you can not modify the reason passed to the other side using the cancel hook. This is different to abort and cancel hooks in WritableStream and ReadableStream respectively.


Preview | Diff

This commit adds a "cancel" hook to "Transformer". This allows users to
perform resource cleanup when the readable side of the TransformStream
is cancelled, or the writable side is aborted.

To preserve existing behavior, when the readable side is cancelled with
a reason, the writable side is always immediately aborted with that same
reason. The same is true in the reverse case. This means that the
status of both sides is always either "closed", "erroring", or
"erroring" when the "cancel" hook is called.

"flush" and "cancel" are never both called. As per existing behaviour,
when the writable side is closed the "flush" hook is called. If the
readable side is cancelled while a promise returned from "flush" is
still pending, "cancel" is not called. In this scenario the readable
side ends up in the "errored" state, while the writable side ends up in
the "closed" state.
@ricea
Copy link
Collaborator

ricea commented Jun 9, 2023

  • Is cancel the right name? Mirrors ReadableStream, but we use abort for WritableStream.

Since it's used for both the cancel and abort cases, it might be good to come up with a third name to avoid confusion. stop is one possibility.

  • Is the immediate cancellation behaviour correct?

    • Ie should call the cancel hook before or after cancelling the underlying streams?
    • Calling it before means that current behaviour is changed (number of microticks from readable.cancel() to the writable actually being aborted). This is the current behaviour.
    • Calling it after means that you can not modify the reason passed to the other side using the cancel hook. This is different to abort and cancel hooks in WritableStream and ReadableStream respectively.

Interesting. Calling it before would be better in terms of consistency with the other APIs. I personally think that changing the number of microticks is an acceptable breakage, but others may disagree.

@lucacasonato
Copy link
Member Author

Interesting. Calling it before would be better in terms of consistency with the other APIs. I personally think that changing the number of microticks is an acceptable breakage, but others may disagree.

For reference, exactly one WPT is broken by this.

@lucacasonato
Copy link
Member Author

Since it's used for both the cancel and abort cases, it might be good to come up with a third name to avoid confusion. stop is one possibility.

What about terminate? This aligns with controller.terminate

@ricea
Copy link
Collaborator

ricea commented Jun 9, 2023

What about terminate? This aligns with controller.terminate

controller.terminate is rather a niche and subtle operation, which is why we gave it a long name.

This operation, however, is something we'd like people to implement if they need to do cleanup, so a short friendly name seems appropriate.

@saschanaz
Copy link
Member

dispose? That's the name "owning" proposal uses by "dispose steps" and also by the upcoming Symbol.dispose proposal which might ultimately call this (if we implement it for stream classes)

index.bs Outdated Show resolved Hide resolved
@lucacasonato
Copy link
Member Author

I don't particularly like dispose, as it implies it is called both on successful and errored close (because that is what Symbol.dispose does. This however is only called on the error case, because successful disposal is already handled through flush.

I think we need something that implies early unsuccessful termination. abort and cancel fit the bill here, except that we already use them elsewhere so using the same word may be confusing. I'm not super strongly concerned with confusion here tho, as the use case is very similar to those hooks in RS and WS.

@lucacasonato
Copy link
Member Author

@domenic @MattiasBuelens @saschanaz Do you have any thoughts on the "Is the immediate cancellation behaviour correct?" point? I'm leaning towards the currently implemented behaviour because it minimizes chance of breakage, but I do also sympathize with @ricea's points above.

@domenic
Copy link
Member

domenic commented Jun 12, 2023

What about cancelOrAbort for the name? Since it's literally called when either the readable side is canceled, or the writable side is aborted? It's not short though.

I think calling it before is better. I agree that changing the number of microtasks is acceptable.

@lucacasonato
Copy link
Member Author

lucacasonato commented Jun 12, 2023

I think calling it before is better.

Ok, I'll change that. This opens up a question on error handling:

const ts = new TransformStream({
  async cancel(reason) {
    console.log("t.cancel start with", reason);
    await delay(100);
    console.log("t.cancel end");
    throw "t.cancel error";
  },
});
console.log("ts constructed");
ts.readable.cancel("rs.cancel error")
  .then(() => console.log("rs.cancel fulfilled"))
  .catch((err) => console.log("rs.cancel rejected with", err));
ts.writable.abort("ws.abort error")
  .then(() => console.log("ws.abort fulfilled"))
  .catch((err) => console.log("ws.abort rejected with", err));
console.log("started rs.cancel and ws.abort");

I think the logical ordering of logs is:

# a-1
ts constructed
t.cancel start with rs.cancel error
started rs.cancel and ws.abort
t.cancel end
rs.cancel rejected with t.cancel error
ws.abort rejected with t.cancel error

But the following are also possible:

# a-2
ts constructed
t.cancel start with rs.cancel error
started rs.cancel and ws.abort
ws.abort fulfilled
t.cancel end
rs.cancel rejected with t.cancel error
# a-3
ts constructed
t.cancel start with rs.cancel error
started rs.cancel and ws.abort
t.cancel end
rs.cancel fulfilled
ws.abort fulfilled

Also consider:

const ts = new TransformStream({
  flush() { console.log("t.flush") },
  async cancel(reason) {
    console.log("t.cancel start with", reason);
    await delay(100);
    console.log("t.cancel end");
    throw "t.cancel error";
  },
});
console.log("ts constructed");
ts.readable.cancel("rs.cancel error")
  .then(() => console.log("rs.cancel fulfilled"))
  .catch((err) => console.log("rs.cancel rejected with", err));
ts.writable.close()
  .then(() => console.log("ws.close fulfilled"))
  .catch((err) => console.log("ws.close rejected with", err));
console.log("started rs.cancel and ws.close");

Here similar options are possible:

# b-1
ts constructed
t.cancel start with rs.cancel error
started rs.cancel and ws.close
t.cancel end
rs.cancel rejected with t.cancel error
ws.close rejected with t.cancel error
# b-2
ts constructed
t.cancel start with rs.cancel error
started rs.cancel and ws.close
ws.close fulfilled
t.cancel end
rs.cancel rejected with t.cancel error
# b-3
ts constructed
t.cancel start with rs.cancel error
started rs.cancel and ws.close
t.cancel end
rs.cancel fulfilled
ws.close fulfilled

These each also have effect on whether the error is visible in writer.closed or not. In a-1 and b-1 examples writer.closed would reject, in a-2, a-3, b-2, and b-3 the it resolves to undefined.

Does anyone have preferences here? I think a-1/b-1 make the most sense.

index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
Comment on lines +5577 to +5581
<p>(Note that there is no need to call
{{TransformStreamDefaultController/terminate()|controller.terminate()}} inside
{{Transformer/cancel|cancel()}}; the stream is already in the process of cancelling/aborting, and
terminating it would be counterproductive.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<p>(Note that there is no need to call
{{TransformStreamDefaultController/terminate()|controller.terminate()}} inside
{{Transformer/cancel|cancel()}}; the stream is already in the process of cancelling/aborting, and
terminating it would be counterproductive.)

This part is redundant after 1c65d61.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My train of thought is that you could extract the controller in start and assign it to a local variable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, hmm, okay then. Maybe also worth having tests for that btw.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously it was impossible because of the immediate close semantics. I just pushed the new semantics, and will add tests for this case.

index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
@saschanaz
Copy link
Member

Does anyone have preferences here? I think a-1/b-1 make the most sense.

I fail to see why any of the resulting promises should not reject, so I also prefer a-1/b-1 (and I also think it's okay to have extra microtask).

@lucacasonato
Copy link
Member Author

What about:

const ts = new TransformStream({
  flush() { console.log("t.flush") },
  async cancel(reason) {
    console.log("t.cancel start with", reason);
    await delay(100);
    console.log("t.cancel end");
    throw "t.cancel error";
  },
});
console.log("ts constructed");
ts.writable.close()
  .then(() => console.log("ws.close fulfilled"))
  .catch((err) => console.log("ws.close rejected with", err));
ts.readable.cancel("rs.cancel error")
  .then(() => console.log("rs.cancel fulfilled"))
  .catch((err) => console.log("rs.cancel rejected with", err));
console.log("started rs.cancel and ws.close");
ts constructed
t.flush
started ws.close and rs.cancel
ws.close fulfilled
rs.cancel fulfilled

I'm trying to figure out if it's observable from the outside that the RS was closed through a cancellation rather than a "clean close" (ie is "rs.cancel error" exposed through a reader.read() promise rejection)?

@lucacasonato
Copy link
Member Author

I have now implemented a v2 that runs the cancel algorithm prior to cancel / abort of the other side. I have opted for the approach where all of rs.cancel ws.close and ws.abort will all either resolve or reject (never one resolves one rejects), and that the first one always wins.

5b00d08

This required updating some test expectations (the ones making the assumption that ws.abort and rs.cancel are sync, which due to micotask changes is not the case anymore).

web-platform-tests/wpt@4156716

PTAL!

Copy link
Member

@saschanaz saschanaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing looks obviously off, which means, LGTM.

For the test part, could we replace await delay(0); with something else?

index.bs Outdated
1. Perform !
[$ReadableStreamDefaultControllerClose$](|readable|.[=ReadableStream/[[controller]]=]).
1. Perform ! [$ReadableStreamDefaultControllerClose$](|readable|.[=ReadableStream/[[controller]]=]).
1. [=Resolve=] |controller|.[=TransformStreamDefaultController/[[finishPromise]]=] with undefined.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(At the first glance it looked like it was ignoring the error, but in that case it would have finishPromise and thus would return at the above check.)

@lucacasonato
Copy link
Member Author

For the test part, could we replace await delay(0); with something else?

I could replace them with a fixed set of queueMicrotask, but I instead went with the approach used in other streams tests. It exists only to drain the current microtask queue before the next statement, not because the tests are timing sensitive.

@lucacasonato
Copy link
Member Author

@ricea @domenic Could you take another look?

@lucacasonato
Copy link
Member Author

lucacasonato commented Sep 15, 2023

I have one editor approval - does anyone else still want to take a look, or is this good to merge? @MattiasBuelens do we usually do a "final comment period" for stream spec PRs to help keep things moving?

@domenic
Copy link
Member

domenic commented Sep 15, 2023

This needs implementer interest before it's ready to merge. Although I trust @MattiasBuelens , I and other editors might be holding off on further review until we're clear on whether this has a cross-browser future.

@lucacasonato
Copy link
Member Author

Ok, would be great to start getting some explicit feedback on interest from Chrome / Firefox / Safari then.

I would urge folks to look at this with some urgency - it's a very serious omission in TransformStream that makes implementations of many transformer algorithms impossible unless cleanup is performed exclusively via GC. There is significant user interest, both from us at Deno, from folks using Cloudflare Workers (cc @jasnell), and from other interested parties (see the original issue #1212).

@domenic is Chrome interested in shipping this? It sounded in this issue as if you may be interested? Looking it over, it seems like a relatively trivial implementation for Chromium :)

@saschanaz can I note down your LGTM as "interested in shipping" for Gecko?

@annevk is WebKit interested? (if you are the wrong person to ping, please forward :) )

@saschanaz
Copy link
Member

LGTM spec-wise, but not actively pursuing implementation-wise. If others are interested to ship it then Mozilla will too.

@domenic
Copy link
Member

domenic commented Sep 15, 2023

@ricea is the relevant Chromium implementer

@lucacasonato
Copy link
Member Author

@ricea Is Chrome interested in shipping this?

@youennf
Copy link
Contributor

youennf commented Sep 28, 2023

This addition makes sense to me as we have more and more web platform objects that benefit from being explicitly closed.

@ricea
Copy link
Collaborator

ricea commented Sep 28, 2023

Yes, Chrome is interested. Sorry for the delay.

@lucacasonato
Copy link
Member Author

Fantastic, thank you folks! I'll open implementation bugs shortly so we can get this landed 😃

Copy link
Member

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a few editorial nits, but looks good! Thanks so much for working on this.

As you may or may not be aware, because of the reference implementation landing things in the streams repo is a bit annoying. First we have to land the tests PR, then we roll the submodule, then we land this PR. I'll go merge the tests PR now.

index.bs Outdated Show resolved Hide resolved
index.bs Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
@lucacasonato
Copy link
Member Author

lucacasonato commented Sep 30, 2023

All implementation bugs and the MDN issue have been filed, and the WPT submodule has been updated. This PR is now ready to land :)

@domenic
Copy link
Member

domenic commented Sep 30, 2023

I think all your issues have the wrong titles, using "cleanup" instead of "cancel" :). I'll merge this though.

@domenic domenic merged commit 007d729 into whatwg:main Sep 30, 2023
3 checks passed
@lucacasonato
Copy link
Member Author

@domenic The cancel hook allows for cleanup of transformers backing TransformStreams, so I think it makes some sense?

@lucacasonato lucacasonato deleted the Transformer_cancel branch September 30, 2023 06:48
@domenic
Copy link
Member

domenic commented Sep 30, 2023

There is no "transformer.cleanup" property though.

@lucacasonato
Copy link
Member Author

Oh I realize my mistake now. Will fix. Thx

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Handling stream errors in a TransformStream transformer
6 participants