feat: support to copy from orc format #1814

Merged
merged 6 commits into from
Jun 25, 2023

Conversation

WenyXu
Member

@WenyXu WenyXu commented Jun 23, 2023

I hereby agree to the terms of the GreptimeDB CLA

What's changed and what's your intention?

  • Support copying from the ORC format
  • Supported types
    • Boolean
    • String
    • Integers (i16, i32, i64)
    • Floats (f32, f64)
    • Timestamp (Nanosecond)
    • Date
  • Compression
    • None
    • ZLIB
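
To illustrate what the ZLIB option involves on the reader side, here is a minimal sketch of parsing the 3-byte chunk header that the ORC specification places before every chunk of a compressed stream. The header stores `(length << 1) | is_original` as a little-endian value. The names `ChunkHeader` and `parse_chunk_header` are hypothetical and are not taken from orc-rust or this PR.

```rust
/// Parsed form of the 3-byte header that precedes each chunk of a
/// compressed ORC stream. (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
pub struct ChunkHeader {
    /// Number of bytes in the chunk body that follows the header.
    pub length: usize,
    /// True when the body was stored as-is because compressing it did not help.
    pub is_original: bool,
}

/// Decodes the little-endian 24-bit header: the low bit is the
/// "original" flag and the remaining 23 bits are the body length.
pub fn parse_chunk_header(bytes: [u8; 3]) -> ChunkHeader {
    let raw = u32::from(bytes[0]) | (u32::from(bytes[1]) << 8) | (u32::from(bytes[2]) << 16);
    ChunkHeader {
        length: (raw >> 1) as usize,
        is_original: raw & 1 == 1,
    }
}
```

With the two examples given in the ORC specification, `[0x40, 0x0d, 0x03]` describes a 100,000-byte compressed chunk and `[0x0b, 0x00, 0x00]` describes 5 bytes stored uncompressed. When the file is written with compression set to None, streams carry no chunk headers at all.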

Note: the Run Length Encoding defined in ORC spec v0 is not currently supported; the writer should use Run Length Encoding V2 instead.

See also orc-rust

TL;DR

I originally planned to support importing ORC data on top of the orc-format crate, but I found many bugs while adding unit tests. In the end, I rewrote the entire RLE v2 algorithm based on ORC, added decoding support for the timestamp and date types, and added an async stream reader.
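
For readers unfamiliar with RLE v2, the following is a minimal sketch of its simplest sub-encoding, Short Repeat, as described in the ORC specification. It is not the code from orc-rust; the names `SubEncoding`, `sub_encoding`, `zigzag_decode`, and `decode_short_repeat` are hypothetical. The top two bits of the first header byte select one of four sub-encodings. For Short Repeat, the next three bits hold the value width in bytes minus one, and the low three bits hold the repeat count minus three. Signed values are zigzag-encoded.

```rust
/// The four RLE v2 sub-encodings, selected by the top two bits of the header.
#[derive(Debug, PartialEq, Eq)]
pub enum SubEncoding {
    ShortRepeat,
    Direct,
    PatchedBase,
    Delta,
}

/// Reads the sub-encoding from the first header byte.
pub fn sub_encoding(first_byte: u8) -> SubEncoding {
    match first_byte >> 6 {
        0 => SubEncoding::ShortRepeat,
        1 => SubEncoding::Direct,
        2 => SubEncoding::PatchedBase,
        _ => SubEncoding::Delta,
    }
}

/// Maps a zigzag-encoded unsigned value back to a signed one
/// (0 -> 0, 1 -> -1, 2 -> 1, 3 -> -2, ...).
pub fn zigzag_decode(n: u64) -> i64 {
    ((n >> 1) as i64) ^ -((n & 1) as i64)
}

/// Decodes one Short Repeat run. Returns None if the input is truncated
/// or the header selects a different sub-encoding.
pub fn decode_short_repeat(input: &[u8], signed: bool) -> Option<Vec<i64>> {
    let header = *input.first()?;
    if sub_encoding(header) != SubEncoding::ShortRepeat {
        return None;
    }
    // Bits 5..3: value width in bytes, stored as width - 1 (1 to 8 bytes).
    let width = usize::from((header >> 3) & 0x07) + 1;
    // Bits 2..0: repeat count, stored as count - 3 (3 to 10 repeats).
    let count = usize::from(header & 0x07) + 3;
    // The repeated value follows the header in big-endian byte order.
    let body = input.get(1..1 + width)?;
    let raw = body.iter().fold(0u64, |acc, b| (acc << 8) | u64::from(*b));
    let value = if signed { zigzag_decode(raw) } else { raw as i64 };
    Some(vec![value; count])
}
```

Using the example from the ORC specification, the unsigned bytes `[0x0a, 0x27, 0x10]` decode to five copies of 10000. The Direct, Patched Base, and Delta sub-encodings need bit-unpacking and are considerably more involved.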

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.

Refer to a related PR or issue link (optional)

@WenyXu WenyXu force-pushed the feat/copy-from-orc-format branch 2 times, most recently from 2c48df2 to e7b9521 Compare June 23, 2023 15:40
@WenyXu WenyXu force-pushed the feat/copy-from-orc-format branch from e7b9521 to 47484d6 Compare June 23, 2023 15:51
@codecov

codecov bot commented Jun 23, 2023

Codecov Report

Merging #1814 (f44d299) into develop (5ab0747) will decrease coverage by 0.26%.
The diff coverage is 76.19%.

@@             Coverage Diff             @@
##           develop    #1814      +/-   ##
===========================================
- Coverage    86.38%   86.13%   -0.26%     
===========================================
  Files          584      585       +1     
  Lines        95458    95502      +44     
===========================================
- Hits         82462    82259     -203     
- Misses       12996    13243     +247     

@WenyXu
Member Author

WenyXu commented Jun 25, 2023

@fengjiachun PTAL

@fengjiachun fengjiachun merged commit 223cf31 into GreptimeTeam:develop Jun 25, 2023
paomian pushed a commit to paomian/greptimedb that referenced this pull request Oct 19, 2023
* feat: support to copy from orc format

* test: add copy from orc test

* chore: add license header

* refactor: remove unimplemented macro

* chore: apply suggestions from CR

* chore: bump orc-rust to 0.2.3
Labels
docs-required This change requires docs update.
3 participants