Panic on climate data file from Canadian government #75

Closed
lithiumfrost opened this issue Oct 1, 2021 · 9 comments
Labels
bug Something isn't working enhancement New feature or request

Comments

@lithiumfrost

Expected: Formatted columns

Got: Panic

❯ tv < en_climate_hourly_AB_3012209_05-2021_P1H.csv
thread 'main' panicked at 'a csv record: Error(UnequalLengths { pos: Some(Position { byte: 65424, line: 361, record: 361 }), expected_len: 30, len: 9 })', src/main.rs:168:20
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Attached csv.

en_climate_hourly_AB_3012209_05-2021_P1H.csv

@alexhallam
Owner

alexhallam commented Oct 1, 2021

Awesome! Thanks for posting.

The hint is in the error: line 361. You may be thinking that tv should fill the missing cells with NA. That is not possible, because the file as it stands is not valid CSV: record 361 has 9 fields where 30 are expected.

For NAs to be populated, the missing cells would need to be delimited with commas so that every record has the same number of fields.


@alexhallam
Owner

alexhallam commented Oct 1, 2021

Let me keep looking at this. It seems there is more to it than just that line. I also tried piping head into tv and still got an error. Disregard the above comment for now.

@alexhallam
Owner

xsv hits the same issue on that line, but reports a clearer error. I can improve tv's error message. I still think there may be more here, and I am still looking.

xsv select 1-2 en_climate_hourly_AB_3012209_05-2021_P1H.csv

CSV error: record 361 (line: 361, byte: 65424): found record with 9 fields, but the previous record has 30 fields

@alexhallam
Owner

alexhallam commented Oct 1, 2021

Alright, from what I can tell these are two issues.

  1. Bad csv after line 361
  2. A bug at src/datatype/sigfig.rs:242:9

I can fix the second point, but the first problem will remain. The most I can do for problem one is provide a more informative error.
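
As a rough illustration of what a more informative error for problem one could look like, here is a minimal std-only sketch. It is not tv's actual code: the `RaggedRecord` type and `check_field_counts` function are hypothetical names, and the naive split on `,` ignores quoted fields, which a real CSV parser would have to handle.

```rust
use std::fmt;

/// Hypothetical error type: the first record whose field count
/// differs from that of the first record in the input.
#[derive(Debug, PartialEq)]
struct RaggedRecord {
    line: usize,     // 1-based line number in the input
    expected: usize, // field count of the first record
    found: usize,    // field count of the offending record
}

impl fmt::Display for RaggedRecord {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "line {}: found record with {} fields, but the first record has {} fields",
            self.line, self.found, self.expected
        )
    }
}

/// Returns the common field count, or the first ragged record found.
/// Assumption: fields are separated by bare commas (no quoting).
fn check_field_counts(input: &str) -> Result<usize, RaggedRecord> {
    let mut expected: Option<usize> = None;
    for (idx, line) in input.lines().enumerate() {
        let found = line.split(',').count();
        if let Some(n) = expected {
            if n != found {
                return Err(RaggedRecord { line: idx + 1, expected: n, found });
            }
        } else {
            // The first record fixes the expected width.
            expected = Some(found);
        }
    }
    Ok(expected.unwrap_or(0))
}
```

Returning a `Result` instead of panicking lets the caller print the message and exit cleanly, similar in spirit to the xsv output quoted earlier in the thread.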

@lithiumfrost
Author

I, too, had xsv bail on me for this file. It's regrettable that the Canadian climate data comes this way.

I fixed my problem by switching to Miller, which has an --allow-ragged-csv-input option.

Perhaps you might consider something like this? For just viewing, a more permissive approach would be great for visualizing dirty csv files.
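
To make the idea concrete, a permissive mode could pad short records up to the widest one before display. The sketch below is only an assumption about how that might work, not Miller's or tv's implementation: `pad_ragged` is a hypothetical helper, it uses only the standard library, and it splits on bare commas without handling quoted fields.

```rust
/// Pads every record to the width of the widest record using `filler`.
/// Assumption: fields are separated by bare commas (no quoting).
fn pad_ragged(input: &str, filler: &str) -> Vec<Vec<String>> {
    // Split each line into owned fields.
    let rows: Vec<Vec<String>> = input
        .lines()
        .map(|line| line.split(',').map(|f| f.to_string()).collect())
        .collect();

    // The widest record decides the table width, so nothing is truncated.
    let width = rows.iter().map(|r| r.len()).max().unwrap_or(0);

    rows.into_iter()
        .map(|mut row| {
            // Append filler values until the row reaches the table width.
            row.resize(width, filler.to_string());
            row
        })
        .collect()
}
```

For example, `pad_ragged("a,b,c\n1,2,3\n4\n", "NA")` yields a third row of `4, NA, NA`. If tv reads input with the Rust csv crate, as the `UnequalLengths` error suggests, that crate's `ReaderBuilder::flexible(true)` option would be the natural way to accept such records before padding them.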

@alexhallam
Owner

I would like to build a feature like that for when things break. Thanks for finding this issue!

@alexhallam
Owner

alexhallam commented Oct 1, 2021

Okay. I patched something up that works for the first 360 lines. To test it, I took your file and deleted every row from 361 through the last row.


@alexhallam
Owner

alexhallam commented Oct 1, 2021

I am going to add a test csv and then open a PR.

Again, this only fixes issue 2. The first problem will have to be a larger issue that I will take on soon. Just not tonight.

@alexhallam alexhallam mentioned this issue Oct 1, 2021
@alexhallam alexhallam added bug Something isn't working enhancement New feature or request labels Oct 1, 2021
@alexhallam
Owner

I am going to close this issue and make a new issue with the focus on dirty csv files.
