Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not convert to string when skipping unused fields. #203

Merged
merged 1 commit into from
Nov 27, 2020

Conversation

michalsrb
Copy link
Contributor

Hi. This is reopening of #198, but with just one of the two commits. Turns out that this is still needed for the suggested solution to work.

If some fields are not required by serde, the ByteRecord::deserialize will attempt to skip them by calling next_field, which will fail if the skipped field is not valid utf-8. So instead lets skip unused fields using next_field_bytes. This affects also StringRecord::deserialize, but getting byte slice from string should have no cost. I didn't see any changes in the included benchmarks.

When the deserializer skips unused fields, we don't have to convert them
to valid Utf-8 string. This will prevent errors when unused fields are
not Utf-8 (maybe they are supposed to be bytes). Removing the check may also
improve performance.
@hniksic
Copy link

hniksic commented Nov 26, 2020

Hi @BurntSushi,

Could you please look into this PR?

It sems like a reasonable change, skipped fields be validated as UTF-8? As I understand it, if you use ByteRecord to deserialize something that has a Vec<u8> field, it doesn't have to be valid UTF-8, but if you completely ignore that field, it has to be. Eliminating the validation seems faster while equally correct.

Copy link
Owner

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@BurntSushi BurntSushi merged commit b08b3e6 into BurntSushi:master Nov 27, 2020
@BurntSushi
Copy link
Owner

This PR is on crates.io in csv 1.1.5.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants