fix: make serde_v8::V8Slice sound #18452

Draft
wants to merge 1 commit into
base: main

lucacasonato
Member

@lucacasonato lucacasonato commented Mar 27, 2023

Note: this PR is not complete - it requires more changes to deno_core and many code-paths touching V8Slice / ZeroCopyBuf in ext/ and runtime/.

This commit attempts to make serde_v8::V8Slice properly sound according to Rust's memory model (never both a &mut and a & to a given value). This soundness fix will also enable using resizable ArrayBuffers in ops.

There is however a catch: as part of this fix, the patch prevents callers from holding the byte slices represented by the V8Slice across await points. This means that all async ops will now need to copy data out before suspending and back in after resuming. This may or may not be a performance problem. Note that in practice this copying already happens for FS ops and TLS sockets, but not for TCP sockets.

This leaves us in a precarious position with two options:

  1. maybe fine, but not safe according to Rust, and no RAB (resizable ArrayBuffer)
    • we disable resizable ArrayBuffer as an argument to all async ops forever
    • we ignore Rust's memory model (never both &mut and & a given value). note: this is what we do now
    • leaves the possibility of very weird bugs in other code that explicitly relies on the assumption that &[u8] can never be modified. this may result in very undefined behaviour that could result in crashes, especially in crypto code. note: we are currently susceptible to this, just no-one has made a POC for it yet
    • fast, because we can directly let the kernel / other libs write into the V8Slice asynchronously
  2. safe, RAB, but possibly slower
    • we prevent V8Slice access from other threads
    • we prevent JS execution while a & or &mut is held to the byte slice represented by the V8Slice
    • requires copying data prior/post passing it to async operations
    • completely within Rust's memory model. & and &mut are never held at the same time
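
A minimal sketch of what option 2 could look like from an op's point of view. Everything here is hypothetical and for illustration only: `MockSlice` stands in for `V8Slice` (an `Rc<RefCell<Vec<u8>>>` plays the backing store), and `fake_socket_read`, `op_read`, and `block_on` are not names from this PR or from deno_core. The point it tries to show is that a callback-scoped `open` cannot hand out a slice that lives across an `.await`, so the async part has to work on an owned buffer.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for `V8Slice`; not the type from this PR.
/// Holding an `Rc` makes it neither `Send` nor `Sync`.
#[derive(Clone)]
struct MockSlice {
    store: Rc<RefCell<Vec<u8>>>,
}

impl MockSlice {
    fn new(bytes: Vec<u8>) -> Self {
        Self { store: Rc::new(RefCell::new(bytes)) }
    }

    /// Lends the bytes to a synchronous callback. The borrow ends when the
    /// callback returns, so it cannot be held across an `.await`.
    fn open<R>(&self, f: impl FnOnce(&mut [u8]) -> R) -> R {
        let mut bytes = self.store.borrow_mut();
        f(bytes.as_mut_slice())
    }

    /// Copies the contents into an owned buffer.
    fn to_vec(&self) -> Vec<u8> {
        self.store.borrow().clone()
    }
}

/// Hypothetical async source that fills an owned buffer.
async fn fake_socket_read(mut buf: Vec<u8>) -> Vec<u8> {
    for (i, b) in buf.iter_mut().enumerate() {
        *b = i as u8 + 1;
    }
    buf
}

/// Async op in the style of option 2: an owned scratch buffer is used across
/// the await, followed by a short synchronous copy back into the slice.
async fn op_read(slice: MockSlice) -> usize {
    let len = slice.open(|bytes| bytes.len());
    let scratch = fake_socket_read(vec![0; len]).await;
    slice.open(|bytes| {
        bytes.copy_from_slice(&scratch);
        scratch.len()
    })
}

/// Tiny single-threaded executor, only here so the sketch can be driven
/// without an async runtime.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

Calling `block_on(op_read(slice.clone()))` on a four-byte `MockSlice` fills it through the scratch buffer. The extra copy in `op_read` is the cost that the TCP benchmarks discussed in this thread would need to measure.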

The safety constraints for full soundness are described in the V8Slice rustdoc comment in this PR.

I'd like to go with option 2, but I want feedback from the rest of the team. cc @littledivy @piscisaureus @bartlomieju. Additionally @andreubotella is likely interested.

Fixes #16756

Comment on lines +11 to +76
/// [V8Slice] encapsulates a borrowed byte slice from V8, in the form of a
/// [v8::BackingStore]. The allocation backing the [v8::BackingStore] is safe
/// from garbage collection until the [V8Slice] is collected.
///
/// If the underlying [v8::BackingStore] comes from a [v8::ArrayBuffer] wrapped
/// in a [v8::ArrayBufferView], the current start and end range of the view is
/// captured upon creation of the [V8Slice]. The [V8Slice] only exposes the data
/// contained within this range. The [v8::BackingStore] must not come from a
/// [v8::SharedArrayBuffer].
///
/// To access the data backing a [V8Slice], one can call [V8Slice::to_vec] to
/// fully copy the data into a [Vec<u8>], or [V8Slice::open] with a synchronous
/// callback to get access to a `&mut [u8]` representing the data.
///
/// ### Cloning
///
/// Cloning a V8Slice does not clone the contents of the underlying backing
/// store. Rather it clones the underlying smart-pointer.
///
/// To actually clone the contents of the buffer, use [V8Slice::to_vec].
///
/// ### Growing and shrinking ArrayBuffers
///
/// Since V8 11.2, ArrayBuffer is both growable and shrinkable. Both ArrayBuffer
/// growth and shrinkage are implemented in V8 without re-alloc. The maximum
/// length of the buffer must be specified up-front and is reserved in virtual
/// address space by [v8::BackingStore]. When the [v8::BackingStore] is grown,
/// the underlying buffer is grown to the specified size by allocating physical
/// pages for the relevant existing virtual address space. When a buffer is
/// shrunk, the physical pages storing the excess bytes are de-allocated. In
/// both cases the length of the reserved virtual address space stays fixed.
///
/// [V8Slice] can safely handle resizable buffers (safety is explained below).
/// When the underlying [v8::BackingStore] is shrunk below the `range` of this
/// [V8Slice], the length of any exposed byte slices is truncated to fit within
/// the new bounds.
///
/// ### Safety
///
/// To make [V8Slice] fit within Rust's safety guarantees, the following two
/// constraints must always be upheld (especially in light of buffer resizing):
///
/// - There MUST never exist a mutable reference and a read-only reference to
/// a byte slice at the same time (this is Rust's memory model).
/// - While a `&[u8]` or `&mut [u8]` pointing to an underlying allocation exists,
/// that allocation MUST NEVER be deallocated (doing so may result in a
/// use-after-free).
///
/// JavaScript execution has the ability to get a `&mut [u8]` for the underlying
/// bytes at any time. JavaScript execution can also resize the allocation at
/// any time. As such, it is never safe to expose a `&[u8]` or `&mut [u8]` while
/// JavaScript is executing, as this would violate the above constraints.
///
/// To ensure that these constraints can not be violated, this type never
/// exposes `&[u8]` or `&mut [u8]` pointing to the underlying bytes while
/// JavaScript is executing. This is done through two mechanisms:
///
/// - [V8Slice] is not [Send] or [Sync]: it can not be sent to a different
/// thread. This means that no `&[u8]` or `&mut [u8]` can be created from a
/// different thread, out of the purview of the JavaScript executing thread
/// which would possibly cause a constraint violation.
/// - [V8Slice] never exposes a `&[u8]` or `&mut [u8]` that can be held
/// asynchronously across a point causing JavaScript execution. This is
/// enforced through the API design for asynchronous Rust. Users MUST take
/// care to not execute JavaScript within a [V8Slice::open] or
/// [V8Slice::open_mut] callback.
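
The truncation rule from the "Growing and shrinking ArrayBuffers" part of the rustdoc above can be illustrated with a small sketch. This is only an assumption about how such clamping might be computed, not the code in this PR: `clamp_range` and `visible` are hypothetical helpers, and a plain byte slice stands in for the current contents of the backing store.

```rust
use std::ops::Range;

/// Hypothetical helper: clamps the view range captured at creation time to
/// the current byte length of the backing store.
fn clamp_range(range: &Range<usize>, current_len: usize) -> Range<usize> {
    let end = range.end.min(current_len);
    let start = range.start.min(end);
    start..end
}

/// Hypothetical helper: returns the bytes that are still visible through the
/// captured range. `store` is the backing store at its current length.
fn visible<'a>(store: &'a [u8], range: &Range<usize>) -> &'a [u8] {
    &store[clamp_range(range, store.len())]
}
```

With a captured range of `4..12`, a store of 16 bytes exposes all eight bytes of the range, a store shrunk to 8 bytes exposes `4..8`, and a store shrunk to 2 bytes exposes an empty slice rather than reading out of bounds.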
Member Author

This is the part I want you all to read and provide feedback on please :)

Member

Yeah, all of that makes sense. Very good write-up. I'm in favor of option 2 as well, as that would put the safety issues with V8Slice to bed for the foreseeable future. I'm just worried about the HTTP performance due to these constraints.

Can we evaluate this approach once we're able to run some HTTP/TCP benchmarks?

Member

@bartlomieju bartlomieju left a comment

@lucacasonato is this far enough along that we can measure the performance hit on TCP sockets?

@andreubotella
Contributor

andreubotella commented Apr 5, 2023

I haven't yet looked at this PR in depth, but my understanding is that a follow-up PR could add a detaching version of op_read which would allow BYOB ReadableStreams to recover the performance losses from this PR. Is that right?

impl AsMut<[u8]> for DetachedBuffer {
  fn as_mut(&mut self) -> &mut [u8] {
    self.0.as_mut()
impl DetachedBuffer {
Contributor

@andreubotella andreubotella Apr 5, 2023

It seems like the V8Slice constraints don't make sense for DetachedBuffer: since ABs uniquely own their backing memory (unlike SABs) and the only way to obtain a DetachedBuffer is by detaching an AB, it should be perfectly safe to get slices to the backing store memory, as long as the Rust memory model is otherwise followed.

Edit: I guess one issue with this is that calling to_v8 would need to invalidate the DetachedBuffer so that getting slices from it after that panics (or returns None/Err). This is a consequence of the fact that ToV8 creates a JS-visible AB with the same backing store while not dropping the DetachedBuffer. But IMO this is fine, since it would most often happen when returning a DetachedBuffer from an op. And since to_v8 takes &mut self, this can't pull the rug from under any existing slices, since there can't be any slices alive at the same time.
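
The invalidation idea in the comment above could be sketched roughly as follows. This is a hypothetical illustration, not the `DetachedBuffer` from serde_v8: `MockDetached` uses an `Option<Vec<u8>>` in place of the backing store, and its `to_v8` returns the bytes instead of creating a JS-visible ArrayBuffer. It assumes the "returns None" variant rather than panicking.

```rust
/// Hypothetical stand-in for `DetachedBuffer`; a `Vec<u8>` plays the role of
/// the uniquely owned backing store.
struct MockDetached {
    store: Option<Vec<u8>>,
}

impl MockDetached {
    fn new(bytes: Vec<u8>) -> Self {
        Self { store: Some(bytes) }
    }

    /// Returns `None` once the buffer has been handed back to JavaScript.
    fn as_slice(&self) -> Option<&[u8]> {
        self.store.as_deref()
    }

    /// Mutable counterpart of `as_slice`, with the same invalidation rule.
    fn as_mut_slice(&mut self) -> Option<&mut [u8]> {
        self.store.as_deref_mut()
    }

    /// Stand-in for `to_v8`. Because it takes `&mut self`, the borrow checker
    /// guarantees that no slice obtained from this buffer is still alive.
    /// Taking the store invalidates all later slice access.
    fn to_v8(&mut self) -> Option<Vec<u8>> {
        self.store.take()
    }
}
```

After `to_v8` has been called once, `as_slice`, `as_mut_slice`, and a second `to_v8` all return `None`, which matches the "invalidate after handing the store back to JS" behaviour described in the comment.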

@bartlomieju
Member

Discussed during the CLI working group meeting. We should migrate some examples from core/ to use this new API to see how involved the changes required by this PR are. We'll reevaluate after that.

@bartlomieju
Member

I need to rebase this PR and move it to the deno_core repo.

Successfully merging this pull request may close these issues.

Proposal for safe async I/O with ArrayBuffers
3 participants