zstd: Improve better compression #364

Merged 1 commit, Apr 23, 2021

@klauspost klauspost commented Apr 23, 2021

Try to find a better match by searching for a long match at the end of the current best match.

Results are given as before/after pairs. The speed comparison is not reliable because the runs used different Go versions; the speed loss appears to be ~1%.

| run | file | out | level | insize | outsize | millis | mb/s |
|---|---|---|---|---|---|---|---|
| before | silesia.tar | zskp | 3 | 211947520 | 65177448 | 1899 | 106.44 |
| after | silesia.tar | zskp | 3 | 211947520 | 64595893 | 2007 | 100.68 |
| before | gob-stream | zskp | 3 | 1911399616 | 185792019 | 9324 | 195.48 |
| after | gob-stream | zskp | 3 | 1911399616 | 175034659 | 9636 | 189.17 |
| before | enwik9 | zskp | 3 | 1000000000 | 294540704 | 11725 | 81.34 |
| after | enwik9 | zskp | 3 | 1000000000 | 292243069 | 12162 | 78.41 |
| before | github-june-2days-2019.json | zskp | 3 | 6273951764 | 537511906 | 29252 | 204.54 |
| after | github-june-2days-2019.json | zskp | 3 | 6273951764 | 524340691 | 34043 | 175.75 |
| before | rawstudio-mint14.tar | zskp | 3 | 8558382592 | 3224594213 | 71751 | 113.75 |
| after | rawstudio-mint14.tar | zskp | 3 | 8558382592 | 3158085214 | 77675 | 105.08 |
| before | nyc-taxi-data-10M.csv | zskp | 3 | 3325605752 | 538490114 | 25683 | 123.49 |
| after | nyc-taxi-data-10M.csv | zskp | 3 | 3325605752 | 530289687 | 25239 | 125.66 |

@klauspost klauspost merged commit 9bb6b77 into master Apr 23, 2021
@klauspost klauspost deleted the zstd-better-improve-compr branch April 23, 2021 11:23
mostynb added a commit to mostynb/zstdpool-syncpool that referenced this pull request Apr 28, 2021
This includes the following zstd improvements since v1.12.1:

* Add helpers to compress/decompress zstd inside zip files
  klauspost/compress#363
* Improve best compression
  klauspost/compress#360
* Improve better compression
  klauspost/compress#364
* Improve compression with dictionaries too
  klauspost/compress#365
mostynb added a commit to mostynb/go-grpc-compression that referenced this pull request Apr 28, 2021
This includes the following zstd improvements since v1.12.1:

* Add helpers to compress/decompress zstd inside zip files
  klauspost/compress#363
* Improve best compression
  klauspost/compress#360
* Improve better compression
  klauspost/compress#364
* Improve compression with dictionaries too
  klauspost/compress#365