Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(streams): add single-character fast path for DelimiterStream() #3739

Conversation

aapoalas
Copy link
Collaborator

DelimiterStream is often used to delimit formats that use a single char as the separator such as most commonly used flavours of CSV, new-line delimited streamable JSON and others. Multi-character delimiters require extra handling that is entirely unnecessary for char delimiters. As such, it makes sense to special-case char delimiter streams for extra performance.

The performance benefit is at most a few percentage points.

@aapoalas aapoalas force-pushed the perf/streams/delimiter-stream-single-byte-delimiter branch from 3b68470 to e74e501 Compare October 28, 2023 11:57
@aapoalas aapoalas force-pushed the perf/streams/delimiter-stream-single-byte-delimiter branch from e74e501 to 4ce46fe Compare October 28, 2023 11:59
Copy link
Collaborator

@iuioiua iuioiua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! I think we can make this a little simpler.

Comment on lines 65 to 100
let transform: (
chunk: Uint8Array,
controller: TransformStreamDefaultController<Uint8Array>,
) => void;
const flush = (
controller: TransformStreamDefaultController<Uint8Array>,
) => {
const bufs = this.#bufs;
const length = bufs.length;
if (length === 0) {
controller.enqueue(new Uint8Array());
} else if (length === 1) {
controller.enqueue(bufs[0]);
} else {
controller.enqueue(concat(...bufs));
}
};

let lps: Uint8Array | null = null;

if (delimiter.length === 1) {
// Delimiter is a single char
transform = (chunk, controller) => this.#handleChar(chunk, controller);
} else {
transform = (chunk, controller) => this.#handle(chunk, controller);
lps = createLPS(delimiter);
}
super({
transform: (chunk, controller) => {
this.#handle(chunk, controller);
},
flush: (controller) => {
controller.enqueue(concat(...this.#bufs));
},
transform,
flush,
});

this.#delimiter = delimiter;
this.#delimLPS = createLPS(delimiter);
this.#delimLPS = lps;
this.#disp = options?.disposition ?? "discard";
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can simplify the constructor by doing something like:

constructor(
  delimiter: Uint8Array,
  options?: DelimiterStreamOptions,
) {
  super({
    transform: (chunk, controller) =>
      delimiter.length === 1
        ? this.#handleChar(chunk, controller)
        : this.#handle(chunk, controller),
    flush: (controller) => this.#flush(controller),
  });

  this.#delimiter = delimiter;
  this.#delimLPS = delimiter.length > 1 ? createLPS(delimiter) : null;
  this.#disp = options?.disposition ?? "discard";
}

I think we should also factor out the flush() logic into its own separate #flush() method to match the style of the rest of the class. WDYT?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gentle ping @aapoalas

streams/delimiter_stream_test.ts Show resolved Hide resolved
@iuioiua iuioiua changed the title perf(DelimiterStream): Implement fast path for char delimiter streams perf(streams): add single-character fast path for DelimiterStream() Oct 29, 2023
Copy link
Collaborator

@iuioiua iuioiua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've gone ahead and applied my suggestion by factoring the flush() logic into it's own method and simplifying the constructor. Now, LGTM! Thanks for this, @aapoalas!

@iuioiua iuioiua merged commit 1d99bac into denoland:main Nov 10, 2023
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants