entire .lz4 files must be read before processing #293

Open
jtmoon79 opened this issue May 7, 2024 · 2 comments
Labels
code improvement enhancement not seen by the user difficult A difficult problem; a major coding effort or difficult algorithm to perfect enhancement New feature or request P1 important

Comments

@jtmoon79
Owner

jtmoon79 commented May 7, 2024

Summary

The entire .lz4 file is decompressed just to get its uncompressed size before processing can begin.

Current behavior

The lz4_flex crate does not provide an API for getting the uncompressed size of a .lz4 file. See PSeitz/lz4_flex#159

So the entire file is decompressed to get the uncompressed file size.
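
A minimal sketch of what "decompress everything just to learn the size" amounts to. The function name `decompressed_size` is hypothetical and this is not the project's actual code. It accepts any `std::io::Read`; the assumption is that a streaming decoder such as `lz4_flex::frame::FrameDecoder` would be passed in, since that type implements `Read`.

```rust
use std::io::{self, Read};

/// Drain a reader through a fixed-size buffer and return how many bytes it
/// yielded. With a streaming LZ4 decoder as `decoder`, the result is the
/// uncompressed size, at the cost of decompressing the whole file once.
/// (Hypothetical helper for illustration only.)
fn decompressed_size<R: Read>(mut decoder: R) -> io::Result<u64> {
    let mut buf = [0u8; 8192];
    let mut total: u64 = 0;
    loop {
        match decoder.read(&mut buf) {
            // A zero-length read signals end of stream.
            Ok(0) => return Ok(total),
            Ok(n) => total += n as u64,
            // Retry reads that were interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

Memory use stays constant because the decompressed bytes are discarded, but the CPU and I/O cost is still a full pass over the file before any processing starts.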

Suggested behavior

One or both:

  • Submit a PR to lz4_flex.
  • Refactor blockreader (and all readers) to not need a file size (difficult).
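
The second option could look roughly like the sketch below. It is not the project's blockreader; `LazyBlockReader`, its fields, and `next_block` are hypothetical names used only to illustrate a reader that hands out blocks on demand and learns the total size only when the underlying stream reaches EOF.

```rust
use std::io::{self, Read};

/// Hypothetical block reader that does not need the total size up front.
struct LazyBlockReader<R: Read> {
    source: R,
    block_size: usize,
    /// Bytes handed out so far.
    bytes_seen: u64,
    /// `None` until EOF has been observed, then the total stream size.
    total_size: Option<u64>,
}

impl<R: Read> LazyBlockReader<R> {
    fn new(source: R, block_size: usize) -> Self {
        assert!(block_size > 0, "block_size must be non-zero");
        LazyBlockReader { source, block_size, bytes_seen: 0, total_size: None }
    }

    /// Return the next block, or `Ok(None)` once the stream is exhausted.
    /// The final block may be shorter than `block_size`.
    fn next_block(&mut self) -> io::Result<Option<Vec<u8>>> {
        if self.total_size.is_some() {
            return Ok(None);
        }
        let mut block = vec![0u8; self.block_size];
        let mut filled = 0;
        while filled < block.len() {
            match self.source.read(&mut block[filled..]) {
                Ok(0) => {
                    // EOF: the total size is now known.
                    self.total_size = Some(self.bytes_seen + filled as u64);
                    break;
                }
                Ok(n) => filled += n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            }
        }
        self.bytes_seen += filled as u64;
        block.truncate(filled);
        Ok(if block.is_empty() { None } else { Some(block) })
    }
}
```

The difficulty noted above is that every consumer that currently asks for the file size or the block count before reading would have to tolerate `total_size` being `None` until the last block has been read.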

Also see

Similar to #300 regarding .bz2
Meta-Issue #182

@jtmoon79 jtmoon79 added bug Something isn't working enhancement New feature or request code improvement enhancement not seen by the user P1 important difficult A difficult problem; a major coding effort or difficult algorithm to perfect labels May 7, 2024
@jtmoon79
Owner Author

jtmoon79 commented May 7, 2024

Follow-on from #291

@jtmoon79
Owner Author

jtmoon79 commented Jun 1, 2024

By default, .lz4 files compressed by lz4c do not include a content_size.

$ lz4c --help
*** LZ4 command line interface 64-bits v1.9.3, by Yann Collet ***
Usage :
      lz4c [arg] [input] [output]

...
--content-size : compressed frame includes original size (default:not present)

So even if I could access content_size in the FrameHeader, it would most often be absent. The current implementation, which first reads the entire file, would still be required as a fallback.
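
To illustrate why the field cannot be relied on, here is a sketch that reads the optional content size straight from the raw frame header bytes, following the LZ4 Frame Format layout (4-byte magic number, FLG byte, BD byte, then an 8-byte little-endian content size only when FLG bit 3 is set). The function `frame_content_size` is a hypothetical helper, not part of lz4_flex, and it skips version and header-checksum validation.

```rust
use std::convert::TryInto;

/// LZ4 frame magic number, stored little-endian at the start of a frame.
const LZ4_MAGIC: u32 = 0x184D_2204;
/// FLG bit 3: set when the frame descriptor carries the content size.
const FLG_CONTENT_SIZE: u8 = 0b0000_1000;

/// Return the content size recorded in an LZ4 frame header, or `None` if the
/// bytes are not an LZ4 frame, are too short, or the frame omits the field.
/// (Hypothetical helper; no version or header-checksum validation.)
fn frame_content_size(header: &[u8]) -> Option<u64> {
    let magic = u32::from_le_bytes(header.get(0..4)?.try_into().ok()?);
    if magic != LZ4_MAGIC {
        return None;
    }
    let flg = *header.get(4)?;
    if flg & FLG_CONTENT_SIZE == 0 {
        // Field absent: the size is only known after full decompression.
        return None;
    }
    // Layout: bytes 0..4 magic, byte 4 FLG, byte 5 BD, bytes 6..14 content size.
    let size: [u8; 8] = header.get(6..14)?.try_into().ok()?;
    Some(u64::from_le_bytes(size))
}
```

For files written without `--content-size` this returns `None`, so a caller would still need the full-read fallback described above.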
