feat(encoding/csv/streaming): Added skipFirstRow and columns options #3184

Merged (8 commits) on Feb 17, 2023

@ayame113 ayame113 commented Feb 10, 2023

Closes #2559.

This PR adds support for parsing CSV with headers to the Web Streams-based CSV parser.

The new options (skipFirstRow and columns) behave exactly as they do in the existing synchronous CSV parser.

How it works
```ts
import { parse } from "./csv.ts";
import { CsvStream } from "./csv/stream.ts";
import { readableStreamFromIterable } from "../streams/readable_stream_from_iterable.ts";

const csv = `a,b,c
d,e,f
`;

// It shows how each option works.
const readable1 = readableStreamFromIterable(csv)
  .pipeThrough(new CsvStream({ skipFirstRow: true }));
for await (const data of readable1) {
  console.log(data); // { a: "d", b: "e", c: "f" }
}

const readable2 = readableStreamFromIterable(csv)
  .pipeThrough(new CsvStream({ columns: ["aaa", "bbb", "ccc"] }));
for await (const data of readable2) {
  console.log(data); // { aaa: "a", bbb: "b", ccc: "c" }, { aaa: "d", bbb: "e", ccc: "f" }
}

const readable3 = readableStreamFromIterable(csv)
  .pipeThrough(new CsvStream({ skipFirstRow: true, columns: ["aaa", "bbb", "ccc"] }));
for await (const data of readable3) {
  console.log(data); // { aaa: "d", bbb: "e", ccc: "f" }
}

// The output is the same as the existing synchronous parse function.
console.log(parse(csv, { skipFirstRow: true }));
// [ { a: "d", b: "e", c: "f" } ]
console.log(parse(csv, { columns: ["aaa", "bbb", "ccc"] }));
// [ { aaa: "a", bbb: "b", ccc: "c" }, { aaa: "d", bbb: "e", ccc: "f" } ]
console.log(parse(csv, { skipFirstRow: true, columns: ["aaa", "bbb", "ccc"] }));
// [ { aaa: "d", bbb: "e", ccc: "f" } ]
```
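
For readers curious how the two options could interact inside a stream, here is a minimal sketch using only the standard TransformStream API. It is not the code from this PR: `headerTransform`, `fromRows`, and `collect` are hypothetical names, the input is assumed to be rows that are already split into fields, and raising an error on a field-count mismatch is an assumption of this sketch.

```typescript
// Illustrative sketch only; not the implementation merged in this PR.
interface HeaderOptions {
  skipFirstRow?: boolean;
  columns?: string[];
}

// Maps already-parsed rows (string[]) to records keyed by header name.
function headerTransform(
  options: HeaderOptions,
): TransformStream<string[], Record<string, string>> {
  if (!options.skipFirstRow && options.columns === undefined) {
    // Assumption: this sketch only covers the header-aware modes.
    throw new Error("set skipFirstRow and/or columns");
  }
  let headers: string[] | undefined = options.columns;
  let isFirstRow = true;

  return new TransformStream<string[], Record<string, string>>({
    transform(row, controller) {
      if (isFirstRow) {
        isFirstRow = false;
        if (options.skipFirstRow) {
          // Without explicit columns, the first row supplies the keys.
          if (headers === undefined) headers = row;
          return; // the first row itself is not emitted as data
        }
      }
      const keys = headers;
      if (keys === undefined || keys.length !== row.length) {
        // Throwing here puts the stream into an errored state.
        throw new Error("row length does not match the number of columns");
      }
      const record: Record<string, string> = {};
      for (let i = 0; i < keys.length; i++) {
        record[keys[i]] = row[i];
      }
      controller.enqueue(record);
    },
  });
}

// Helper: a ReadableStream that emits the given rows and then closes.
function fromRows(rows: string[][]): ReadableStream<string[]> {
  return new ReadableStream<string[]>({
    start(controller) {
      for (const row of rows) controller.enqueue(row);
      controller.close();
    },
  });
}

// Helper: drains a stream into an array.
async function collect<T>(readable: ReadableStream<T>): Promise<T[]> {
  const reader = readable.getReader();
  const out: T[] = [];
  while (true) {
    const result = await reader.read();
    if (result.done) break;
    out.push(result.value);
  }
  return out;
}
```

Piping `fromRows([["a", "b", "c"], ["d", "e", "f"]])` through `headerTransform({ skipFirstRow: true })` and passing the result to `collect` gives one record keyed by the first row, which mirrors the first case in the example above.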

Comment on lines +270 to +276
{
name: "simple",
input: "a,b,c",
output: [["a", "b", "c"]],
skipFirstRow: false,
},
{
@ayame113 ayame113 Feb 10, 2023

These test cases were copied from csv_test.ts.

ayame113 commented Feb 10, 2023

@kt3k kt3k left a comment

LGTM. Nice feature!

@kt3k kt3k merged commit bb4d6b2 into denoland:main Feb 17, 2023
@ayame113 ayame113 deleted the headerd-csv-streaming branch February 17, 2023 09:41
Successfully merging this pull request may close these issues.

Feature request: Webstream-based parser for CSV with headers