Comparing changes
base repository: klauspost/compress
base: v1.15.7
head repository: klauspost/compress
compare: v1.15.8
- 8 commits
- 11 files changed
- 2 contributors
Commits on Jun 29, 2022
- Commit: fa9e24a
Commits on Jul 4, 2022
-
zstd: Optimize seqdeq amd64 asm (#636)
copyMemoryPrecise now generates a loop over 16-byte blocks with a single branchless 16-byte fixup after it. This is a tiny bit faster on the whole and quite a bit faster for some inputs. Benchmark results on Intel Core i7-3770K: name old speed new speed delta Decoder_DecoderSmall/kppkn.gtb.zst-8 369MB/s ± 0% 374MB/s ± 1% +1.56% (p=0.008 n=5+5) Decoder_DecoderSmall/geo.protodata.zst-8 977MB/s ± 0% 1056MB/s ± 1% +8.17% (p=0.008 n=5+5) Decoder_DecoderSmall/plrabn12.txt.zst-8 291MB/s ± 0% 289MB/s ± 0% -0.74% (p=0.008 n=5+5) Decoder_DecoderSmall/lcet10.txt.zst-8 329MB/s ± 1% 333MB/s ± 0% +1.23% (p=0.008 n=5+5) Decoder_DecoderSmall/asyoulik.txt.zst-8 310MB/s ± 0% 310MB/s ± 1% ~ (p=1.000 n=5+5) Decoder_DecoderSmall/alice29.txt.zst-8 291MB/s ± 0% 291MB/s ± 1% ~ (p=0.421 n=5+5) Decoder_DecoderSmall/html_x_4.zst-8 2.07GB/s ± 0% 2.15GB/s ± 2% +4.05% (p=0.008 n=5+5) Decoder_DecoderSmall/paper-100k.pdf.zst-8 3.58GB/s ± 3% 3.74GB/s ± 1% +4.31% (p=0.008 n=5+5) Decoder_DecoderSmall/fireworks.jpeg.zst-8 8.57GB/s ± 0% 8.60GB/s ± 0% ~ (p=0.056 n=5+5) Decoder_DecoderSmall/urls.10K.zst-8 474MB/s ± 1% 507MB/s ± 1% +6.80% (p=0.008 n=5+5) Decoder_DecoderSmall/html.zst-8 745MB/s ± 0% 803MB/s ± 0% +7.68% (p=0.008 n=5+5) Decoder_DecoderSmall/comp-data.bin.zst-8 399MB/s ± 1% 400MB/s ± 0% ~ (p=0.841 n=5+5) Decoder_DecodeAll/kppkn.gtb.zst-8 521MB/s ± 0% 521MB/s ± 0% ~ (p=0.841 n=5+5) Decoder_DecodeAll/geo.protodata.zst-8 1.27GB/s ± 1% 1.29GB/s ± 0% +1.19% (p=0.008 n=5+5) Decoder_DecodeAll/plrabn12.txt.zst-8 429MB/s ± 0% 427MB/s ± 0% -0.51% (p=0.032 n=5+5) Decoder_DecodeAll/lcet10.txt.zst-8 435MB/s ± 0% 439MB/s ± 0% +0.94% (p=0.008 n=5+5) Decoder_DecodeAll/asyoulik.txt.zst-8 438MB/s ± 0% 436MB/s ± 0% -0.39% (p=0.008 n=5+5) Decoder_DecodeAll/alice29.txt.zst-8 423MB/s ± 0% 420MB/s ± 1% -0.72% (p=0.008 n=5+5) Decoder_DecodeAll/html_x_4.zst-8 1.59GB/s ± 0% 1.59GB/s ± 1% +0.54% (p=0.032 n=5+5) Decoder_DecodeAll/paper-100k.pdf.zst-8 4.53GB/s ± 1% 4.54GB/s ± 1% ~ (p=0.310 n=5+5) 
Decoder_DecodeAll/fireworks.jpeg.zst-8 9.64GB/s ± 1% 9.57GB/s ± 0% ~ (p=0.151 n=5+5) Decoder_DecodeAll/urls.10K.zst-8 683MB/s ± 0% 681MB/s ± 0% ~ (p=0.056 n=5+5) Decoder_DecodeAll/html.zst-8 1.04GB/s ± 1% 1.06GB/s ± 0% +1.77% (p=0.008 n=5+5) Decoder_DecodeAll/comp-data.bin.zst-8 398MB/s ± 1% 399MB/s ± 1% ~ (p=1.000 n=5+5) Decoder_DecodeAllFiles/Mark.Twain-Tom.Sawyer.txt/fastest-8 439MB/s ± 0% 437MB/s ± 0% -0.39% (p=0.016 n=5+5) Decoder_DecodeAllFiles/Mark.Twain-Tom.Sawyer.txt/default-8 448MB/s ± 0% 448MB/s ± 0% ~ (p=0.841 n=5+5) Decoder_DecodeAllFiles/Mark.Twain-Tom.Sawyer.txt/better-8 478MB/s ± 0% 477MB/s ± 0% ~ (p=0.151 n=5+5) Decoder_DecodeAllFiles/Mark.Twain-Tom.Sawyer.txt/best-8 463MB/s ± 0% 460MB/s ± 0% -0.57% (p=0.008 n=5+5) Decoder_DecodeAllFiles/e.txt/fastest-8 9.62GB/s ± 3% 9.66GB/s ± 1% ~ (p=0.841 n=5+5) Decoder_DecodeAllFiles/e.txt/default-8 394MB/s ± 0% 395MB/s ± 0% ~ (p=0.056 n=5+5) Decoder_DecodeAllFiles/e.txt/better-8 438MB/s ± 0% 442MB/s ± 0% +0.82% (p=0.008 n=5+5) Decoder_DecodeAllFiles/e.txt/best-8 501MB/s ± 0% 506MB/s ± 0% +1.07% (p=0.008 n=5+5) Decoder_DecodeAllFiles/fse-artifact3.bin/fastest-8 1.04GB/s ± 0% 1.05GB/s ± 1% ~ (p=0.056 n=5+5) Decoder_DecodeAllFiles/fse-artifact3.bin/default-8 1.20GB/s ± 1% 1.20GB/s ± 1% ~ (p=0.095 n=5+5) Decoder_DecodeAllFiles/fse-artifact3.bin/better-8 1.01GB/s ± 0% 1.00GB/s ± 1% -0.82% (p=0.008 n=5+5) Decoder_DecodeAllFiles/fse-artifact3.bin/best-8 386MB/s ± 0% 383MB/s ± 0% -0.57% (p=0.008 n=5+5) Decoder_DecodeAllFiles/gettysburg.txt/fastest-8 271MB/s ± 1% 275MB/s ± 1% +1.59% (p=0.008 n=5+5) Decoder_DecodeAllFiles/gettysburg.txt/default-8 224MB/s ± 1% 223MB/s ± 1% ~ (p=0.222 n=5+5) Decoder_DecodeAllFiles/gettysburg.txt/better-8 228MB/s ± 0% 226MB/s ± 0% -0.89% (p=0.008 n=5+5) Decoder_DecodeAllFiles/gettysburg.txt/best-8 223MB/s ± 1% 221MB/s ± 1% -1.03% (p=0.016 n=5+5) Decoder_DecodeAllFiles/html.txt/fastest-8 592MB/s ± 1% 611MB/s ± 0% +3.20% (p=0.008 n=5+5) Decoder_DecodeAllFiles/html.txt/default-8 597MB/s ± 0% 
607MB/s ± 0% +1.71% (p=0.008 n=5+5) Decoder_DecodeAllFiles/html.txt/better-8 623MB/s ± 0% 633MB/s ± 0% +1.57% (p=0.008 n=5+5) Decoder_DecodeAllFiles/html.txt/best-8 603MB/s ± 0% 610MB/s ± 0% +1.25% (p=0.008 n=5+5) Decoder_DecodeAllFiles/pi.txt/fastest-8 9.59GB/s ± 1% 9.70GB/s ± 1% +1.16% (p=0.032 n=5+5) Decoder_DecodeAllFiles/pi.txt/default-8 391MB/s ± 0% 393MB/s ± 0% +0.62% (p=0.008 n=5+5) Decoder_DecodeAllFiles/pi.txt/better-8 437MB/s ± 1% 441MB/s ± 2% ~ (p=0.087 n=5+5) Decoder_DecodeAllFiles/pi.txt/best-8 501MB/s ± 0% 507MB/s ± 0% +1.22% (p=0.008 n=5+5) Decoder_DecodeAllFiles/pngdata.bin/fastest-8 1.66GB/s ± 1% 1.70GB/s ± 0% +2.49% (p=0.008 n=5+5) Decoder_DecodeAllFiles/pngdata.bin/default-8 1.49GB/s ± 0% 1.51GB/s ± 0% +1.18% (p=0.008 n=5+5) Decoder_DecodeAllFiles/pngdata.bin/better-8 1.87GB/s ± 0% 1.90GB/s ± 1% ~ (p=0.056 n=5+5) Decoder_DecodeAllFiles/pngdata.bin/best-8 1.44GB/s ± 1% 1.46GB/s ± 0% +1.75% (p=0.008 n=5+5) Decoder_DecodeAllFiles/sharnd.out/fastest-8 9.64GB/s ± 1% 9.66GB/s ± 1% ~ (p=0.841 n=5+5) Decoder_DecodeAllFiles/sharnd.out/default-8 9.70GB/s ± 1% 9.70GB/s ± 2% ~ (p=1.000 n=5+5) Decoder_DecodeAllFiles/sharnd.out/better-8 9.71GB/s ± 1% 9.79GB/s ± 1% ~ (p=0.151 n=5+5) Decoder_DecodeAllFiles/sharnd.out/best-8 9.76GB/s ± 0% 9.80GB/s ± 0% ~ (p=0.056 n=5+5) Decoder_DecodeAllFilesP/Mark.Twain-Tom.Sawyer.txt/fastest-8 1.85GB/s ± 0% 1.85GB/s ± 0% -0.31% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/Mark.Twain-Tom.Sawyer.txt/default-8 1.86GB/s ± 0% 1.85GB/s ± 0% -0.47% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/Mark.Twain-Tom.Sawyer.txt/better-8 2.00GB/s ± 0% 2.00GB/s ± 0% -0.32% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/Mark.Twain-Tom.Sawyer.txt/best-8 1.93GB/s ± 0% 1.93GB/s ± 0% -0.22% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/e.txt/fastest-8 37.7GB/s ± 0% 37.5GB/s ± 0% -0.38% (p=0.016 n=5+5) Decoder_DecodeAllFilesP/e.txt/default-8 1.68GB/s ± 0% 1.69GB/s ± 0% +0.55% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/e.txt/better-8 1.91GB/s ± 0% 1.92GB/s ± 0% +0.96% (p=0.008 
n=5+5) Decoder_DecodeAllFilesP/e.txt/best-8 2.22GB/s ± 0% 2.25GB/s ± 0% +1.50% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/fse-artifact3.bin/fastest-8 5.18GB/s ± 0% 5.05GB/s ± 2% -2.50% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/fse-artifact3.bin/default-8 5.50GB/s ± 1% 5.34GB/s ± 1% -2.86% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/fse-artifact3.bin/better-8 5.11GB/s ± 0% 5.14GB/s ± 0% +0.57% (p=0.016 n=5+5) Decoder_DecodeAllFilesP/fse-artifact3.bin/best-8 2.36GB/s ± 0% 2.37GB/s ± 0% +0.20% (p=0.032 n=5+5) Decoder_DecodeAllFilesP/gettysburg.txt/fastest-8 1.16GB/s ± 0% 1.16GB/s ± 0% ~ (p=0.056 n=5+5) Decoder_DecodeAllFilesP/gettysburg.txt/default-8 1.09GB/s ± 0% 1.08GB/s ± 0% -1.19% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/gettysburg.txt/better-8 1.09GB/s ± 0% 1.08GB/s ± 1% -0.96% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/gettysburg.txt/best-8 1.03GB/s ± 3% 1.02GB/s ± 0% ~ (p=0.151 n=5+5) Decoder_DecodeAllFilesP/html.txt/fastest-8 2.50GB/s ± 1% 2.56GB/s ± 0% +2.39% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/html.txt/default-8 2.51GB/s ± 0% 2.55GB/s ± 0% +1.69% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/html.txt/better-8 2.61GB/s ± 0% 2.66GB/s ± 0% +1.93% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/html.txt/best-8 2.53GB/s ± 0% 2.56GB/s ± 0% +1.13% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/pi.txt/fastest-8 37.8GB/s ± 0% 37.6GB/s ± 0% -0.44% (p=0.016 n=5+5) Decoder_DecodeAllFilesP/pi.txt/default-8 1.67GB/s ± 0% 1.68GB/s ± 0% +0.61% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/pi.txt/better-8 1.91GB/s ± 0% 1.93GB/s ± 0% +0.82% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/pi.txt/best-8 2.23GB/s ± 0% 2.26GB/s ± 0% +1.35% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/pngdata.bin/fastest-8 6.99GB/s ± 0% 7.00GB/s ± 0% ~ (p=0.690 n=5+5) Decoder_DecodeAllFilesP/pngdata.bin/default-8 6.88GB/s ± 0% 6.87GB/s ± 0% ~ (p=0.222 n=5+5) Decoder_DecodeAllFilesP/pngdata.bin/better-8 8.49GB/s ± 0% 8.44GB/s ± 1% ~ (p=0.310 n=5+5) Decoder_DecodeAllFilesP/pngdata.bin/best-8 6.59GB/s ± 1% 6.53GB/s ± 1% -0.96% (p=0.032 n=5+5) 
Decoder_DecodeAllFilesP/sharnd.out/fastest-8 37.8GB/s ± 0% 37.5GB/s ± 0% -0.86% (p=0.008 n=5+5) Decoder_DecodeAllFilesP/sharnd.out/default-8 37.9GB/s ± 1% 38.0GB/s ± 1% ~ (p=0.310 n=5+5) Decoder_DecodeAllFilesP/sharnd.out/better-8 37.9GB/s ± 0% 37.8GB/s ± 2% ~ (p=0.841 n=5+5) Decoder_DecodeAllFilesP/sharnd.out/best-8 37.8GB/s ± 0% 38.0GB/s ± 1% ~ (p=0.310 n=5+5) Decoder_DecodeAllParallel/kppkn.gtb.zst-8 2.20GB/s ± 0% 2.20GB/s ± 0% ~ (p=1.000 n=5+5) Decoder_DecodeAllParallel/geo.protodata.zst-8 5.37GB/s ± 0% 5.39GB/s ± 0% +0.35% (p=0.008 n=5+5) Decoder_DecodeAllParallel/plrabn12.txt.zst-8 1.77GB/s ± 0% 1.76GB/s ± 0% -0.19% (p=0.008 n=5+5) Decoder_DecodeAllParallel/lcet10.txt.zst-8 1.90GB/s ± 0% 1.92GB/s ± 0% +0.80% (p=0.008 n=5+5) Decoder_DecodeAllParallel/asyoulik.txt.zst-8 1.83GB/s ± 0% 1.83GB/s ± 0% ~ (p=0.841 n=5+5) Decoder_DecodeAllParallel/alice29.txt.zst-8 1.74GB/s ± 0% 1.74GB/s ± 0% ~ (p=0.548 n=5+5) Decoder_DecodeAllParallel/html_x_4.zst-8 6.55GB/s ± 0% 6.49GB/s ± 0% -0.97% (p=0.008 n=5+5) Decoder_DecodeAllParallel/paper-100k.pdf.zst-8 18.3GB/s ± 0% 18.3GB/s ± 0% ~ (p=0.056 n=5+5) Decoder_DecodeAllParallel/fireworks.jpeg.zst-8 37.4GB/s ± 0% 37.2GB/s ± 1% -0.57% (p=0.016 n=4+5) Decoder_DecodeAllParallel/urls.10K.zst-8 2.97GB/s ± 0% 2.96GB/s ± 0% ~ (p=0.310 n=5+5) Decoder_DecodeAllParallel/html.zst-8 4.42GB/s ± 1% 4.43GB/s ± 0% ~ (p=0.556 n=5+4) Decoder_DecodeAllParallel/comp-data.bin.zst-8 1.69GB/s ± 1% 1.70GB/s ± 0% +0.84% (p=0.008 n=5+5) [Geo mean] 1.77GB/s 1.78GB/s +0.57%
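The copy strategy described in this commit message can be sketched in plain Go. This is only an illustration of one plausible reading of "a loop over 16-byte blocks with a single branchless 16-byte fixup", not the generated amd64 assembly: the function name `copyPrecise`, the short-input fallback, and the choice to implement the fixup as an overlapping copy of the last 16 bytes are all assumptions.

```go
package main

import "fmt"

// copyPrecise is a hypothetical helper: it copies exactly len(src) bytes
// from src to dst. dst must hold at least len(src) bytes and must not
// overlap src.
func copyPrecise(dst, src []byte) {
	n := len(src)
	if n < 16 {
		// Assumption: short inputs take a separate path, because the
		// overlapping tail copy below needs at least 16 bytes.
		copy(dst, src)
		return
	}
	// Main loop: whole 16-byte blocks only.
	i := 0
	for ; i+16 <= n; i += 16 {
		copy(dst[i:i+16], src[i:i+16])
	}
	// Fixup: one unconditional 16-byte copy that ends exactly at n.
	// It may overlap the last full block, which only rewrites bytes
	// that already have the right value, so no length check is needed.
	copy(dst[n-16:n], src[n-16:n])
}

func main() {
	src := []byte("0123456789abcdefghijklmnopqrstuvwxyz") // 36 bytes
	dst := make([]byte, len(src))
	copyPrecise(dst, src)
	fmt.Println(string(dst) == string(src))
}
```

The point of the overlapping tail is that the remainder (here 4 bytes) never needs its own loop or a chain of size tests; the cost is a few bytes written twice.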
Commit: bf3f0fd
-
zstd: Improve decoder memcopy (#637)
Up to 25% faster decodes, depending on contents. Use s2 memcopier and eliminate a zero check. ``` benchmark old MB/s new MB/s speedup Benchmark_seqdec_execute/n-12286-lits-13914-prev-9869-1990358-3296656-win-4194304.blk-32 1284.77 1493.64 1.16x Benchmark_seqdec_execute/n-12485-lits-6960-prev-976039-2250252-2463561-win-4194304.blk-32 1107.87 1580.86 1.43x Benchmark_seqdec_execute/n-14746-lits-14461-prev-209-8-1379909-win-4194304.blk-32 3947.25 4163.99 1.05x Benchmark_seqdec_execute/n-1525-lits-1498-prev-2009476-797934-2994405-win-4194304.blk-32 10281.12 10375.47 1.01x Benchmark_seqdec_execute/n-3478-lits-3628-prev-895243-2104056-2119329-win-4194304.blk-32 8115.99 8862.70 1.09x Benchmark_seqdec_execute/n-8422-lits-5840-prev-168095-2298675-433830-win-4194304.blk-32 1578.08 2306.80 1.46x Benchmark_seqdec_execute/n-1000-lits-1057-prev-21887-92-217-win-8388608.blk-32 17079.65 15875.41 0.93x Benchmark_seqdec_execute/n-15134-lits-20798-prev-4882976-4884216-4474622-win-8388608.blk-32 2020.09 2077.16 1.03x Benchmark_seqdec_execute/n-2-lits-0-prev-620601-689171-848-win-8388608.blk-32 35781.31 35736.03 1.00x Benchmark_seqdec_execute/n-90-lits-67-prev-19498-23-19710-win-8388608.blk-32 33125.43 32874.37 0.99x Benchmark_seqdec_execute/n-931-lits-1179-prev-36502-1526-1518-win-8388608.blk-32 19394.38 19785.45 1.02x Benchmark_seqdec_execute/n-2898-lits-4062-prev-335-386-751-win-8388608.blk-32 10494.30 10229.09 0.97x Benchmark_seqdec_execute/n-4056-lits-12419-prev-10792-66-309849-win-8388608.blk-32 7425.77 8034.31 1.08x Benchmark_seqdec_execute/n-8028-lits-4568-prev-917-65-920-win-8388608.blk-32 2855.17 3336.71 1.17x BenchmarkDecoder_DecoderSmall/kppkn.gtb.zst-32 537.74 653.10 1.21x BenchmarkDecoder_DecoderSmall/geo.protodata.zst-32 1500.59 1610.70 1.07x BenchmarkDecoder_DecoderSmall/plrabn12.txt.zst-32 410.13 508.09 1.24x BenchmarkDecoder_DecoderSmall/lcet10.txt.zst-32 467.83 602.22 1.29x BenchmarkDecoder_DecoderSmall/asyoulik.txt.zst-32 434.53 528.57 1.22x 
BenchmarkDecoder_DecoderSmall/alice29.txt.zst-32 433.95 544.60 1.25x BenchmarkDecoder_DecoderSmall/html_x_4.zst-32 2860.31 3199.64 1.12x BenchmarkDecoder_DecoderSmall/paper-100k.pdf.zst-32 5336.43 5422.59 1.02x BenchmarkDecoder_DecoderSmall/fireworks.jpeg.zst-32 12327.10 12324.96 1.00x BenchmarkDecoder_DecoderSmall/urls.10K.zst-32 660.52 769.09 1.16x BenchmarkDecoder_DecoderSmall/html.zst-32 1076.67 1286.06 1.19x BenchmarkDecoder_DecoderSmall/comp-data.bin.zst-32 569.30 574.46 1.01x BenchmarkDecoder_DecodeAll/kppkn.gtb.zst-32 812.16 822.43 1.01x BenchmarkDecoder_DecodeAll/geo.protodata.zst-32 1943.14 1906.88 0.98x BenchmarkDecoder_DecodeAll/plrabn12.txt.zst-32 712.27 723.91 1.02x BenchmarkDecoder_DecodeAll/lcet10.txt.zst-32 688.23 781.85 1.14x BenchmarkDecoder_DecodeAll/asyoulik.txt.zst-32 702.87 714.37 1.02x BenchmarkDecoder_DecodeAll/alice29.txt.zst-32 717.44 738.78 1.03x BenchmarkDecoder_DecodeAll/html_x_4.zst-32 1960.55 1975.63 1.01x BenchmarkDecoder_DecodeAll/paper-100k.pdf.zst-32 5981.50 6118.97 1.02x BenchmarkDecoder_DecodeAll/fireworks.jpeg.zst-32 13140.18 13126.95 1.00x BenchmarkDecoder_DecodeAll/urls.10K.zst-32 983.71 979.34 1.00x BenchmarkDecoder_DecodeAll/html.zst-32 1624.80 1585.31 0.98x BenchmarkDecoder_DecodeAll/comp-data.bin.zst-32 569.84 572.56 1.00x BenchmarkDecoder_DecodeAllFiles/.tracker-unpacked.bin/fastest-32 504.31 623.48 1.24x BenchmarkDecoder_DecodeAllFiles/.tracker-unpacked.bin/default-32 564.68 723.22 1.28x BenchmarkDecoder_DecodeAllFiles/.tracker-unpacked.bin/better-32 615.18 781.33 1.27x BenchmarkDecoder_DecodeAllFiles/.tracker-unpacked.bin/best-32 786.17 862.88 1.10x BenchmarkDecoder_DecodeAllFiles/.tracker.bin/fastest-32 12860.99 12908.39 1.00x BenchmarkDecoder_DecodeAllFiles/.tracker.bin/default-32 619.06 626.95 1.01x BenchmarkDecoder_DecodeAllFiles/.tracker.bin/better-32 630.33 628.85 1.00x BenchmarkDecoder_DecodeAllFiles/.tracker.bin/best-32 609.12 616.50 1.01x BenchmarkDecoder_DecodeAllFiles/Mark.Twain-Tom.Sawyer.txt/fastest-32 
658.22 669.16 1.02x BenchmarkDecoder_DecodeAllFiles/Mark.Twain-Tom.Sawyer.txt/default-32 723.60 741.86 1.03x BenchmarkDecoder_DecodeAllFiles/Mark.Twain-Tom.Sawyer.txt/better-32 735.73 750.40 1.02x BenchmarkDecoder_DecodeAllFiles/Mark.Twain-Tom.Sawyer.txt/best-32 745.43 764.97 1.03x BenchmarkDecoder_DecodeAllFiles/e.txt/fastest-32 12801.86 13043.13 1.02x BenchmarkDecoder_DecodeAllFiles/e.txt/default-32 680.29 683.65 1.00x BenchmarkDecoder_DecodeAllFiles/e.txt/better-32 739.23 748.08 1.01x BenchmarkDecoder_DecodeAllFiles/e.txt/best-32 820.16 828.45 1.01x BenchmarkDecoder_DecodeAllFiles/fse-artifact3.bin/fastest-32 1186.63 1177.03 0.99x BenchmarkDecoder_DecodeAllFiles/fse-artifact3.bin/default-32 1384.74 1383.55 1.00x BenchmarkDecoder_DecodeAllFiles/fse-artifact3.bin/better-32 1104.17 1114.92 1.01x BenchmarkDecoder_DecodeAllFiles/fse-artifact3.bin/best-32 409.59 409.66 1.00x BenchmarkDecoder_DecodeAllFiles/gettysburg.txt/fastest-32 392.32 390.94 1.00x BenchmarkDecoder_DecodeAllFiles/gettysburg.txt/default-32 296.47 295.87 1.00x BenchmarkDecoder_DecodeAllFiles/gettysburg.txt/better-32 296.52 296.60 1.00x BenchmarkDecoder_DecodeAllFiles/gettysburg.txt/best-32 299.85 298.91 1.00x BenchmarkDecoder_DecodeAllFiles/html.txt/fastest-32 988.75 999.28 1.01x BenchmarkDecoder_DecodeAllFiles/html.txt/default-32 987.11 1018.97 1.03x BenchmarkDecoder_DecodeAllFiles/html.txt/better-32 1027.64 1030.76 1.00x BenchmarkDecoder_DecodeAllFiles/html.txt/best-32 973.41 989.37 1.02x BenchmarkDecoder_DecodeAllFiles/pi.txt/fastest-32 12976.96 12976.25 1.00x BenchmarkDecoder_DecodeAllFiles/pi.txt/default-32 678.88 680.77 1.00x BenchmarkDecoder_DecodeAllFiles/pi.txt/better-32 746.38 751.28 1.01x BenchmarkDecoder_DecodeAllFiles/pi.txt/best-32 823.52 833.27 1.01x BenchmarkDecoder_DecodeAllFiles/pngdata.bin/fastest-32 2115.58 2106.14 1.00x BenchmarkDecoder_DecodeAllFiles/pngdata.bin/default-32 1767.98 1767.57 1.00x BenchmarkDecoder_DecodeAllFiles/pngdata.bin/better-32 2306.86 2288.16 0.99x 
BenchmarkDecoder_DecodeAllFiles/pngdata.bin/best-32 1660.52 1667.53 1.00x BenchmarkDecoder_DecodeAllFiles/sharnd.out/fastest-32 13027.08 13044.50 1.00x BenchmarkDecoder_DecodeAllFiles/sharnd.out/default-32 13054.18 13081.06 1.00x BenchmarkDecoder_DecodeAllFiles/sharnd.out/better-32 13067.23 13066.65 1.00x BenchmarkDecoder_DecodeAllFiles/sharnd.out/best-32 13079.77 13061.36 1.00x BenchmarkDecoder_DecodeAllFilesP/.tracker-unpacked.bin/fastest-32 10354.84 11876.83 1.15x BenchmarkDecoder_DecodeAllFilesP/.tracker-unpacked.bin/default-32 11557.12 13415.35 1.16x BenchmarkDecoder_DecodeAllFilesP/.tracker-unpacked.bin/better-32 12644.67 14515.52 1.15x BenchmarkDecoder_DecodeAllFilesP/.tracker-unpacked.bin/best-32 15934.00 17307.06 1.09x BenchmarkDecoder_DecodeAllFilesP/.tracker.bin/fastest-32 35354.57 35307.64 1.00x BenchmarkDecoder_DecodeAllFilesP/.tracker.bin/default-32 11392.27 11353.17 1.00x BenchmarkDecoder_DecodeAllFilesP/.tracker.bin/better-32 11793.77 11733.41 0.99x BenchmarkDecoder_DecodeAllFilesP/.tracker.bin/best-32 11203.91 11174.37 1.00x BenchmarkDecoder_DecodeAllFilesP/Mark.Twain-Tom.Sawyer.txt/fastest-32 12089.54 12097.65 1.00x BenchmarkDecoder_DecodeAllFilesP/Mark.Twain-Tom.Sawyer.txt/default-32 12604.67 12647.83 1.00x BenchmarkDecoder_DecodeAllFilesP/Mark.Twain-Tom.Sawyer.txt/better-32 13265.79 13275.92 1.00x BenchmarkDecoder_DecodeAllFilesP/Mark.Twain-Tom.Sawyer.txt/best-32 13078.85 13130.26 1.00x BenchmarkDecoder_DecodeAllFilesP/e.txt/fastest-32 52477.17 51848.17 0.99x BenchmarkDecoder_DecodeAllFilesP/e.txt/default-32 11947.06 11922.24 1.00x BenchmarkDecoder_DecodeAllFilesP/e.txt/better-32 13184.17 13223.10 1.00x BenchmarkDecoder_DecodeAllFilesP/e.txt/best-32 14630.26 14702.42 1.00x BenchmarkDecoder_DecodeAllFilesP/fse-artifact3.bin/fastest-32 3013.25 3025.30 1.00x BenchmarkDecoder_DecodeAllFilesP/fse-artifact3.bin/default-32 3125.61 2976.92 0.95x BenchmarkDecoder_DecodeAllFilesP/fse-artifact3.bin/better-32 3181.68 3162.28 0.99x 
BenchmarkDecoder_DecodeAllFilesP/fse-artifact3.bin/best-32 3351.22 3372.69 1.01x BenchmarkDecoder_DecodeAllFilesP/gettysburg.txt/fastest-32 1188.15 1147.96 0.97x BenchmarkDecoder_DecodeAllFilesP/gettysburg.txt/default-32 1215.39 1156.01 0.95x BenchmarkDecoder_DecodeAllFilesP/gettysburg.txt/better-32 1219.20 1177.16 0.97x BenchmarkDecoder_DecodeAllFilesP/gettysburg.txt/best-32 1216.72 1170.21 0.96x BenchmarkDecoder_DecodeAllFilesP/html.txt/fastest-32 16901.32 17180.70 1.02x BenchmarkDecoder_DecodeAllFilesP/html.txt/default-32 16819.66 16997.40 1.01x BenchmarkDecoder_DecodeAllFilesP/html.txt/better-32 17805.12 17946.54 1.01x BenchmarkDecoder_DecodeAllFilesP/html.txt/best-32 16916.87 17294.25 1.02x BenchmarkDecoder_DecodeAllFilesP/pi.txt/fastest-32 52314.15 52657.88 1.01x BenchmarkDecoder_DecodeAllFilesP/pi.txt/default-32 11878.94 11796.12 0.99x BenchmarkDecoder_DecodeAllFilesP/pi.txt/better-32 13303.16 13216.13 0.99x BenchmarkDecoder_DecodeAllFilesP/pi.txt/best-32 14622.76 14697.47 1.01x BenchmarkDecoder_DecodeAllFilesP/pngdata.bin/fastest-32 34134.48 36542.10 1.07x BenchmarkDecoder_DecodeAllFilesP/pngdata.bin/default-32 33589.32 34982.31 1.04x BenchmarkDecoder_DecodeAllFilesP/pngdata.bin/better-32 43754.89 44323.18 1.01x BenchmarkDecoder_DecodeAllFilesP/pngdata.bin/best-32 32422.22 33882.10 1.05x BenchmarkDecoder_DecodeAllFilesP/sharnd.out/fastest-32 52706.00 52863.28 1.00x BenchmarkDecoder_DecodeAllFilesP/sharnd.out/default-32 52527.76 52319.50 1.00x BenchmarkDecoder_DecodeAllFilesP/sharnd.out/better-32 52177.25 52506.60 1.01x BenchmarkDecoder_DecodeAllFilesP/sharnd.out/best-32 52443.28 52402.30 1.00x BenchmarkDecoder_DecodeAllParallel/kppkn.gtb.zst-32 13992.47 14134.26 1.01x BenchmarkDecoder_DecodeAllParallel/geo.protodata.zst-32 34107.95 33812.99 0.99x BenchmarkDecoder_DecodeAllParallel/plrabn12.txt.zst-32 12012.34 12123.74 1.01x BenchmarkDecoder_DecodeAllParallel/lcet10.txt.zst-32 12630.22 13586.02 1.08x BenchmarkDecoder_DecodeAllParallel/asyoulik.txt.zst-32 
12327.02 12374.31 1.00x BenchmarkDecoder_DecodeAllParallel/alice29.txt.zst-32 11932.73 12059.89 1.01x BenchmarkDecoder_DecodeAllParallel/html_x_4.zst-32 31233.38 36076.61 1.16x BenchmarkDecoder_DecodeAllParallel/paper-100k.pdf.zst-32 97435.31 100702.06 1.03x BenchmarkDecoder_DecodeAllParallel/fireworks.jpeg.zst-32 62247.22 61824.88 0.99x BenchmarkDecoder_DecodeAllParallel/urls.10K.zst-32 18659.58 18502.10 0.99x BenchmarkDecoder_DecodeAllParallel/html.zst-32 28464.78 28500.16 1.00x BenchmarkDecoder_DecodeAllParallel/comp-data.bin.zst-32 3114.03 3132.86 1.01x BenchmarkDecoderSilesia/multithreaded-writer-32 1099.69 1059.67 0.96x BenchmarkDecoderSilesia/multithreaded-writer-himem-32 1093.10 1054.67 0.96x BenchmarkDecoderSilesia/singlethreaded-writer-32 803.85 819.16 1.02x BenchmarkDecoderSilesia/singlethreaded-writerto-32 812.83 828.44 1.02x BenchmarkDecoderSilesia/singlethreaded-himem-32 813.14 824.41 1.01x BenchmarkDecoderEnwik9/multithreaded-writer-32 877.55 981.68 1.12x BenchmarkDecoderEnwik9/multithreaded-writer-himem-32 961.20 1013.19 1.05x BenchmarkDecoderEnwik9/singlethreaded-writer-32 632.07 629.32 1.00x BenchmarkDecoderEnwik9/singlethreaded-writerto-32 634.62 635.76 1.00x BenchmarkDecoderEnwik9/singlethreaded-himem-32 763.68 755.70 0.99x BenchmarkDecoderWithCustomFiles/github-june-2days-2019.json.zst/multithreaded-writer-32 1626.86 1658.42 1.02x BenchmarkDecoderWithCustomFiles/github-june-2days-2019.json.zst/multithreaded-writer-himem-32 2299.80 2305.08 1.00x BenchmarkDecoderWithCustomFiles/github-june-2days-2019.json.zst/singlethreaded-writer-32 1221.34 1207.19 0.99x BenchmarkDecoderWithCustomFiles/github-june-2days-2019.json.zst/singlethreaded-writerto-32 1236.18 1224.88 0.99x BenchmarkDecoderWithCustomFiles/github-june-2days-2019.json.zst/singlethreaded-himem-32 1749.21 1729.03 0.99x BenchmarkDecoderWithCustomFiles/github-ranks-backup.bin.zst/multithreaded-writer-32 839.51 922.30 1.10x 
BenchmarkDecoderWithCustomFiles/github-ranks-backup.bin.zst/multithreaded-writer-himem-32 1055.54 1093.19 1.04x BenchmarkDecoderWithCustomFiles/github-ranks-backup.bin.zst/singlethreaded-writer-32 574.91 614.02 1.07x BenchmarkDecoderWithCustomFiles/github-ranks-backup.bin.zst/singlethreaded-writerto-32 579.19 618.97 1.07x BenchmarkDecoderWithCustomFiles/github-ranks-backup.bin.zst/singlethreaded-himem-32 780.67 863.05 1.11x ```
Commit: b16a9af
Commits on Jul 8, 2022
-
huff0: Pass a single bitReader pointer to asm (#634)
This makes the context object smaller and frees up three registers, which we can use to replace the limitPtr and bufferOrigin stack variables. Benchmark results show a tiny win (Go 1.19beta, Core i7-3770K): name old speed new speed delta Decompress1XTable/digits-8 347MB/s ± 0% 347MB/s ± 0% ~ (p=0.650 n=8+10) Decompress1XTable/gettysburg-8 268MB/s ± 0% 268MB/s ± 0% ~ (p=0.400 n=9+9) Decompress1XTable/twain-8 327MB/s ± 0% 327MB/s ± 1% ~ (p=0.339 n=7+9) Decompress1XTable/low-ent.10k-8 385MB/s ± 0% 385MB/s ± 1% ~ (p=0.510 n=9+10) Decompress1XTable/superlow-ent-10k-8 376MB/s ± 0% 376MB/s ± 0% ~ (p=0.712 n=8+10) Decompress1XTable/crash2-8 17.3MB/s ± 1% 17.3MB/s ± 1% ~ (p=0.926 n=10+10) Decompress1XTable/endzerobits-8 52.9MB/s ± 1% 52.4MB/s ± 0% -0.94% (p=0.000 n=10+10) Decompress1XTable/endnonzero-8 11.4MB/s ± 0% 11.4MB/s ± 1% ~ (p=0.343 n=10+10) Decompress1XTable/case1-8 22.0MB/s ± 0% 22.0MB/s ± 0% ~ (p=0.618 n=9+9) Decompress1XTable/case2-8 18.1MB/s ± 0% 18.1MB/s ± 0% ~ (p=0.348 n=9+9) Decompress1XTable/case3-8 19.1MB/s ± 0% 19.1MB/s ± 0% +0.21% (p=0.048 n=10+10) Decompress1XTable/pngdata.001-8 374MB/s ± 0% 374MB/s ± 0% ~ (p=0.861 n=9+10) Decompress1XTable/normcount2-8 54.3MB/s ± 1% 54.5MB/s ± 1% ~ (p=0.093 n=10+10) Decompress1XNoTable/digits/100-8 279MB/s ± 0% 280MB/s ± 0% +0.30% (p=0.003 n=10+9) Decompress1XNoTable/digits/10000-8 366MB/s ± 0% 365MB/s ± 0% ~ (p=0.113 n=10+9) Decompress1XNoTable/digits/262143-8 347MB/s ± 0% 347MB/s ± 1% ~ (p=0.739 n=10+10) Decompress1XNoTable/gettysburg/100-8 278MB/s ± 1% 277MB/s ± 1% ~ (p=0.676 n=10+9) Decompress1XNoTable/gettysburg/10000-8 363MB/s ± 1% 362MB/s ± 0% -0.50% (p=0.001 n=10+9) Decompress1XNoTable/gettysburg/262143-8 350MB/s ± 0% 347MB/s ± 0% -0.90% (p=0.000 n=10+8) Decompress1XNoTable/twain/100-8 268MB/s ± 0% 267MB/s ± 0% ~ (p=0.384 n=9+8) Decompress1XNoTable/twain/10000-8 363MB/s ± 0% 362MB/s ± 0% -0.32% (p=0.000 n=9+9) Decompress1XNoTable/twain/262143-8 328MB/s ± 0% 329MB/s ± 0% ~ (p=0.063 n=9+10) 
Decompress1XNoTable/low-ent.10k/100-8 180MB/s ± 0% 181MB/s ± 0% ~ (p=0.225 n=10+10) Decompress1XNoTable/low-ent.10k/10000-8 385MB/s ± 0% 385MB/s ± 0% ~ (p=0.289 n=10+10) Decompress1XNoTable/low-ent.10k/262143-8 389MB/s ± 1% 389MB/s ± 1% ~ (p=0.971 n=10+10) Decompress1XNoTable/superlow-ent-10k/262143-8 389MB/s ± 0% 390MB/s ± 0% +0.27% (p=0.017 n=9+10) Decompress1XNoTable/crash2/100-8 278MB/s ± 0% 279MB/s ± 1% ~ (p=0.163 n=9+10) Decompress1XNoTable/crash2/10000-8 373MB/s ± 1% 373MB/s ± 0% ~ (p=0.370 n=10+8) Decompress1XNoTable/crash2/262143-8 375MB/s ± 0% 375MB/s ± 0% ~ (p=0.604 n=9+10) Decompress1XNoTable/endzerobits/100-8 180MB/s ± 0% 181MB/s ± 0% +0.26% (p=0.005 n=10+9) Decompress1XNoTable/endzerobits/10000-8 384MB/s ± 0% 385MB/s ± 0% ~ (p=0.914 n=8+10) Decompress1XNoTable/endzerobits/262143-8 389MB/s ± 0% 390MB/s ± 0% ~ (p=0.739 n=10+10) Decompress1XNoTable/endnonzero/100-8 180MB/s ± 1% 180MB/s ± 1% ~ (p=0.926 n=10+10) Decompress1XNoTable/endnonzero/10000-8 384MB/s ± 0% 384MB/s ± 0% ~ (p=0.965 n=10+8) Decompress1XNoTable/endnonzero/262143-8 390MB/s ± 0% 390MB/s ± 0% ~ (p=0.633 n=8+10) Decompress1XNoTable/case1/100-8 282MB/s ± 0% 283MB/s ± 0% +0.34% (p=0.005 n=10+10) Decompress1XNoTable/case1/10000-8 372MB/s ± 0% 373MB/s ± 0% ~ (p=0.113 n=9+9) Decompress1XNoTable/case1/262143-8 374MB/s ± 0% 374MB/s ± 0% ~ (p=0.448 n=10+10) Decompress1XNoTable/case2/100-8 274MB/s ± 1% 274MB/s ± 0% ~ (p=0.927 n=10+10) Decompress1XNoTable/case2/10000-8 376MB/s ± 0% 376MB/s ± 0% ~ (p=0.408 n=10+8) Decompress1XNoTable/case2/262143-8 376MB/s ± 1% 377MB/s ± 0% ~ (p=1.000 n=10+10) Decompress1XNoTable/case3/100-8 266MB/s ± 0% 265MB/s ± 0% ~ (p=0.113 n=9+10) Decompress1XNoTable/case3/10000-8 372MB/s ± 0% 372MB/s ± 0% ~ (p=0.075 n=10+9) Decompress1XNoTable/case3/262143-8 374MB/s ± 0% 374MB/s ± 0% ~ (p=0.172 n=10+10) Decompress1XNoTable/pngdata.001/100-8 238MB/s ± 0% 238MB/s ± 0% ~ (p=0.438 n=9+8) Decompress1XNoTable/pngdata.001/10000-8 384MB/s ± 0% 384MB/s ± 0% ~ (p=0.448 n=10+10) 
Decompress1XNoTable/pngdata.001/262143-8 378MB/s ± 0% 378MB/s ± 0% ~ (p=0.836 n=10+10) Decompress1XNoTable/normcount2/100-8 281MB/s ± 0% 282MB/s ± 1% ~ (p=0.122 n=8+10) Decompress1XNoTable/normcount2/10000-8 369MB/s ± 1% 369MB/s ± 0% ~ (p=0.912 n=10+10) Decompress1XNoTable/normcount2/262143-8 370MB/s ± 0% 370MB/s ± 1% ~ (p=0.342 n=10+10) Decompress4XNoTable/digits/100-8 197MB/s ± 0% 197MB/s ± 1% ~ (p=0.764 n=10+9) Decompress4XNoTable/digits/10000-8 594MB/s ± 0% 602MB/s ± 1% +1.35% (p=0.000 n=10+10) Decompress4XNoTable/digits/262143-8 570MB/s ± 1% 578MB/s ± 0% +1.30% (p=0.000 n=10+8) Decompress4XNoTable/gettysburg/100-8 258MB/s ± 1% 260MB/s ± 0% +0.59% (p=0.001 n=10+10) Decompress4XNoTable/gettysburg/10000-8 638MB/s ± 0% 641MB/s ± 0% +0.44% (p=0.000 n=9+9) Decompress4XNoTable/gettysburg/262143-8 573MB/s ± 1% 574MB/s ± 0% ~ (p=0.353 n=10+10) Decompress4XNoTable/twain/100-8 214MB/s ± 2% 214MB/s ± 2% ~ (p=0.853 n=10+10) Decompress4XNoTable/twain/10000-8 634MB/s ± 1% 638MB/s ± 0% +0.62% (p=0.000 n=10+10) Decompress4XNoTable/twain/262143-8 513MB/s ± 1% 517MB/s ± 0% +0.85% (p=0.000 n=10+10) Decompress4XNoTable/low-ent.10k/100-8 195MB/s ± 0% 194MB/s ± 0% ~ (p=0.130 n=9+9) Decompress4XNoTable/low-ent.10k/10000-8 635MB/s ± 0% 642MB/s ± 0% +1.19% (p=0.000 n=10+10) Decompress4XNoTable/low-ent.10k/262143-8 675MB/s ± 0% 685MB/s ± 0% +1.51% (p=0.000 n=10+10) Decompress4XNoTable/superlow-ent-10k/262143-8 673MB/s ± 1% 684MB/s ± 0% +1.70% (p=0.000 n=10+10) Decompress4XNoTable/case1/100-8 206MB/s ± 1% 206MB/s ± 0% ~ (p=0.189 n=10+9) Decompress4XNoTable/case1/10000-8 593MB/s ± 0% 601MB/s ± 0% +1.47% (p=0.000 n=10+10) Decompress4XNoTable/case1/262143-8 603MB/s ± 0% 613MB/s ± 0% +1.64% (p=0.000 n=10+10) Decompress4XNoTable/case2/100-8 201MB/s ± 0% 202MB/s ± 1% ~ (p=0.053 n=9+10) Decompress4XNoTable/case2/10000-8 610MB/s ± 0% 618MB/s ± 0% +1.30% (p=0.000 n=9+10) Decompress4XNoTable/case2/262143-8 622MB/s ± 1% 634MB/s ± 0% +1.90% (p=0.000 n=9+8) Decompress4XNoTable/case3/100-8 197MB/s ± 
1% 198MB/s ± 0% +0.53% (p=0.001 n=9+10) Decompress4XNoTable/case3/10000-8 606MB/s ± 0% 615MB/s ± 0% +1.49% (p=0.000 n=8+10) Decompress4XNoTable/case3/262143-8 613MB/s ± 1% 622MB/s ± 0% +1.48% (p=0.000 n=10+10) Decompress4XNoTable/pngdata.001/100-8 212MB/s ± 1% 211MB/s ± 0% ~ (p=0.136 n=9+9) Decompress4XNoTable/pngdata.001/10000-8 645MB/s ± 1% 649MB/s ± 1% +0.65% (p=0.000 n=9+10) Decompress4XNoTable/pngdata.001/262143-8 640MB/s ± 1% 649MB/s ± 0% +1.44% (p=0.000 n=10+10) Decompress4XNoTable/normcount2/100-8 260MB/s ± 1% 261MB/s ± 1% ~ (p=0.211 n=10+9) Decompress4XNoTable/normcount2/10000-8 584MB/s ± 1% 591MB/s ± 0% +1.33% (p=0.000 n=9+9) Decompress4XNoTable/normcount2/262143-8 588MB/s ± 1% 596MB/s ± 1% +1.39% (p=0.000 n=10+9) Decompress4XNoTableTableLog8/digits-8 583MB/s ± 1% 592MB/s ± 0% +1.48% (p=0.000 n=10+10) Decompress4XTable/digits-8 580MB/s ± 0% 588MB/s ± 0% +1.33% (p=0.000 n=8+10) Decompress4XTable/gettysburg-8 368MB/s ± 1% 370MB/s ± 0% +0.59% (p=0.017 n=10+9) Decompress4XTable/twain-8 510MB/s ± 0% 515MB/s ± 0% +0.99% (p=0.000 n=9+10) Decompress4XTable/low-ent.10k-8 657MB/s ± 0% 665MB/s ± 0% +1.24% (p=0.000 n=10+10) Decompress4XTable/superlow-ent-10k-8 608MB/s ± 0% 617MB/s ± 1% +1.48% (p=0.000 n=8+10) Decompress4XTable/case1-8 21.1MB/s ± 1% 21.0MB/s ± 2% ~ (p=0.223 n=10+10) Decompress4XTable/case2-8 17.6MB/s ± 0% 17.6MB/s ± 0% ~ (p=0.199 n=9+10) Decompress4XTable/case3-8 18.7MB/s ± 0% 18.7MB/s ± 0% ~ (p=0.557 n=10+8) Decompress4XTable/pngdata.001-8 633MB/s ± 1% 645MB/s ± 0% +1.90% (p=0.000 n=9+10) Decompress4XTable/normcount2-8 49.9MB/s ± 1% 49.5MB/s ± 1% -0.64% (p=0.002 n=10+10) [Geo mean] 270MB/s 271MB/s +0.36%
Commit: 4b3cc06
-
s2: Add Index header trim/restore (#638)
Add `RemoveIndexHeaders`, which removes the 20 header+trailer bytes from an index for cases where the storage can be relied upon. `RestoreIndexHeaders` restores the index header+trailer so the index can be loaded.
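The trim/restore idea can be shown with a self-contained Go sketch. It deliberately does not use the s2 package or its wire format: the 10-byte `header` and `trailer` values and the functions `trimFraming` and `restoreFraming` are invented for illustration, chosen only so that the framing adds up to 20 bytes as in the description above.

```go
package main

import (
	"bytes"
	"fmt"
)

// Hypothetical framing bytes, NOT the real s2 index header or trailer.
var (
	header  = []byte("HDR-MAGIC!") // 10 bytes
	trailer = []byte("!CIGAM-RDH") // 10 bytes
)

// trimFraming strips the header and trailer and returns the body.
// It returns nil if the input is not framed as expected.
func trimFraming(b []byte) []byte {
	if len(b) < len(header)+len(trailer) ||
		!bytes.HasPrefix(b, header) ||
		!bytes.HasSuffix(b, trailer) {
		return nil
	}
	return b[len(header) : len(b)-len(trailer)]
}

// restoreFraming re-adds the header and trailer around a trimmed body
// so that a loader expecting the framed form can read it again.
func restoreFraming(body []byte) []byte {
	out := make([]byte, 0, len(header)+len(body)+len(trailer))
	out = append(out, header...)
	out = append(out, body...)
	return append(out, trailer...)
}

func main() {
	framed := restoreFraming([]byte("index-body"))
	trimmed := trimFraming(framed)
	fmt.Println(len(framed)-len(trimmed), "bytes saved")
	fmt.Println(bytes.Equal(restoreFraming(trimmed), framed))
}
```

The trade-off is that the trimmed form carries no self-describing markers, so it only makes sense where the storage layer already identifies the bytes as an index and can be trusted to return them intact.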
Commit: 9d7fe70
-
Commit: 08efe28
Commits on Jul 12, 2022
-
zstd: Branchless getBits for amd64 w/o BMI2 (#640)
This produces the same number of instructions, while requiring less generating code. Benchmarks on the Intel Core i7-3770K show a tiny speedup:

```
name                                                        old speed      new speed      delta
Decoder_DecoderSmall/kppkn.gtb.zst-8                         430MB/s ± 1%   437MB/s ± 1%   +1.60%  (p=0.000 n=10+9)
Decoder_DecoderSmall/geo.protodata.zst-8                    1.11GB/s ± 1%  1.13GB/s ± 0%   +1.37%  (p=0.000 n=9+9)
Decoder_DecoderSmall/plrabn12.txt.zst-8                      334MB/s ± 1%   339MB/s ± 1%   +1.41%  (p=0.000 n=9+10)
Decoder_DecoderSmall/lcet10.txt.zst-8                        392MB/s ± 2%   404MB/s ± 1%   +3.05%  (p=0.000 n=10+10)
Decoder_DecoderSmall/asyoulik.txt.zst-8                      355MB/s ± 2%   357MB/s ± 1%     ~     (p=0.315 n=10+9)
Decoder_DecoderSmall/alice29.txt.zst-8                       344MB/s ± 1%   350MB/s ± 1%   +1.69%  (p=0.000 n=10+10)
Decoder_DecoderSmall/html_x_4.zst-8                         2.34GB/s ± 1%  2.37GB/s ± 1%   +1.10%  (p=0.000 n=10+10)
Decoder_DecoderSmall/paper-100k.pdf.zst-8                   3.75GB/s ± 0%  3.76GB/s ± 1%     ~     (p=0.182 n=9+10)
Decoder_DecoderSmall/fireworks.jpeg.zst-8                   8.59GB/s ± 1%  8.58GB/s ± 1%     ~     (p=0.842 n=10+9)
Decoder_DecoderSmall/urls.10K.zst-8                          561MB/s ± 1%   556MB/s ± 1%   -0.82%  (p=0.019 n=10+10)
Decoder_DecoderSmall/html.zst-8                              900MB/s ± 1%   913MB/s ± 1%   +1.42%  (p=0.000 n=10+9)
Decoder_DecoderSmall/comp-data.bin.zst-8                     399MB/s ± 1%   395MB/s ± 1%   -0.99%  (p=0.000 n=10+10)
Decoder_DecodeAll/kppkn.gtb.zst-8                            518MB/s ± 0%   526MB/s ± 0%   +1.52%  (p=0.000 n=10+9)
Decoder_DecodeAll/geo.protodata.zst-8                       1.28GB/s ± 0%  1.27GB/s ± 2%     ~     (p=0.739 n=10+10)
Decoder_DecodeAll/plrabn12.txt.zst-8                         427MB/s ± 1%   433MB/s ± 1%   +1.24%  (p=0.000 n=10+10)
Decoder_DecodeAll/lcet10.txt.zst-8                           480MB/s ± 1%   490MB/s ± 1%   +2.06%  (p=0.000 n=10+10)
Decoder_DecodeAll/asyoulik.txt.zst-8                         435MB/s ± 0%   447MB/s ± 0%   +2.70%  (p=0.000 n=7+9)
Decoder_DecodeAll/alice29.txt.zst-8                          422MB/s ± 0%   438MB/s ± 1%   +3.96%  (p=0.000 n=8+9)
Decoder_DecodeAll/html_x_4.zst-8                            1.60GB/s ± 0%  1.61GB/s ± 0%   +0.99%  (p=0.000 n=9+10)
Decoder_DecodeAll/paper-100k.pdf.zst-8                      4.55GB/s ± 1%  4.44GB/s ± 1%   -2.42%  (p=0.000 n=10+10)
Decoder_DecodeAll/fireworks.jpeg.zst-8                      9.52GB/s ± 1%  9.47GB/s ± 2%     ~     (p=0.143 n=10+10)
Decoder_DecodeAll/urls.10K.zst-8                             678MB/s ± 1%   684MB/s ± 0%   +0.83%  (p=0.000 n=10+10)
Decoder_DecodeAll/html.zst-8                                1.05GB/s ± 0%  1.07GB/s ± 1%   +2.11%  (p=0.000 n=10+10)
Decoder_DecodeAll/comp-data.bin.zst-8                        397MB/s ± 1%   391MB/s ± 1%   -1.37%  (p=0.000 n=10+10)
Decoder_DecodeAllFiles/Mark.Twain-Tom.Sawyer.txt/fastest-8   437MB/s ± 0%   436MB/s ± 1%   -0.21%  (p=0.025 n=9+9)
Decoder_DecodeAllFiles/Mark.Twain-Tom.Sawyer.txt/default-8   448MB/s ± 0%   451MB/s ± 0%   +0.70%  (p=0.000 n=9+9)
Decoder_DecodeAllFiles/Mark.Twain-Tom.Sawyer.txt/better-8    478MB/s ± 0%   475MB/s ± 0%   -0.53%  (p=0.000 n=10+10)
Decoder_DecodeAllFiles/Mark.Twain-Tom.Sawyer.txt/best-8      461MB/s ± 0%   470MB/s ± 0%   +2.07%  (p=0.000 n=8+9)
Decoder_DecodeAllFiles/e.txt/fastest-8                      9.62GB/s ± 3%  9.62GB/s ± 2%     ~     (p=1.000 n=10+10)
Decoder_DecodeAllFiles/e.txt/default-8                       391MB/s ± 0%   406MB/s ± 0%   +3.81%  (p=0.000 n=10+8)
Decoder_DecodeAllFiles/e.txt/better-8                        438MB/s ± 0%   448MB/s ± 0%   +2.39%  (p=0.000 n=8+10)
Decoder_DecodeAllFiles/e.txt/best-8                          500MB/s ± 0%   500MB/s ± 0%     ~     (p=0.119 n=9+9)
Decoder_DecodeAllFiles/fse-artifact3.bin/fastest-8          1.07GB/s ± 1%  1.04GB/s ± 1%   -2.61%  (p=0.000 n=10+10)
Decoder_DecodeAllFiles/fse-artifact3.bin/default-8          1.21GB/s ± 1%  1.19GB/s ± 1%   -1.33%  (p=0.000 n=10+10)
Decoder_DecodeAllFiles/fse-artifact3.bin/better-8            994MB/s ± 0%   990MB/s ± 0%   -0.42%  (p=0.002 n=10+9)
Decoder_DecodeAllFiles/fse-artifact3.bin/best-8              389MB/s ± 0%   381MB/s ± 0%   -2.00%  (p=0.000 n=8+10)
Decoder_DecodeAllFiles/gettysburg.txt/fastest-8              274MB/s ± 1%   274MB/s ± 1%     ~     (p=1.000 n=10+10)
Decoder_DecodeAllFiles/gettysburg.txt/default-8              224MB/s ± 1%   223MB/s ± 1%   -0.64%  (p=0.015 n=10+10)
Decoder_DecodeAllFiles/gettysburg.txt/better-8               228MB/s ± 1%   227MB/s ± 1%   -0.40%  (p=0.041 n=10+10)
Decoder_DecodeAllFiles/gettysburg.txt/best-8                 225MB/s ± 1%   223MB/s ± 0%   -0.52%  (p=0.008 n=10+6)
Decoder_DecodeAllFiles/html.txt/fastest-8                    599MB/s ± 1%   614MB/s ± 1%   +2.41%  (p=0.000 n=10+10)
Decoder_DecodeAllFiles/html.txt/default-8                    601MB/s ± 0%   613MB/s ± 0%   +2.01%  (p=0.000 n=8+9)
Decoder_DecodeAllFiles/html.txt/better-8                     626MB/s ± 1%   638MB/s ± 0%   +1.99%  (p=0.000 n=10+10)
Decoder_DecodeAllFiles/html.txt/best-8                       601MB/s ± 0%   612MB/s ± 0%   +1.87%  (p=0.000 n=10+10)
Decoder_DecodeAllFiles/pi.txt/fastest-8                     9.64GB/s ± 2%  9.66GB/s ± 1%     ~     (p=0.529 n=10+10)
Decoder_DecodeAllFiles/pi.txt/default-8                      390MB/s ± 0%   403MB/s ± 0%   +3.48%  (p=0.000 n=10+10)
Decoder_DecodeAllFiles/pi.txt/better-8                       439MB/s ± 0%   451MB/s ± 0%   +2.65%  (p=0.000 n=10+10)
Decoder_DecodeAllFiles/pi.txt/best-8                         500MB/s ± 0%   499MB/s ± 0%   -0.27%  (p=0.009 n=7+10)
Decoder_DecodeAllFiles/pngdata.bin/fastest-8                1.70GB/s ± 1%  1.69GB/s ± 1%   -0.63%  (p=0.013 n=10+9)
Decoder_DecodeAllFiles/pngdata.bin/default-8                1.52GB/s ± 1%  1.51GB/s ± 0%   -0.75%  (p=0.000 n=10+9)
Decoder_DecodeAllFiles/pngdata.bin/better-8                 1.92GB/s ± 0%  1.90GB/s ± 0%   -1.02%  (p=0.000 n=10+10)
Decoder_DecodeAllFiles/pngdata.bin/best-8                   1.47GB/s ± 0%  1.46GB/s ± 0%   -0.88%  (p=0.000 n=10+9)
Decoder_DecodeAllFiles/sharnd.out/fastest-8                 9.60GB/s ± 1%  9.67GB/s ± 1%   +0.67%  (p=0.029 n=10+10)
Decoder_DecodeAllFiles/sharnd.out/default-8                 9.65GB/s ± 2%  9.71GB/s ± 1%     ~     (p=0.353 n=10+10)
Decoder_DecodeAllFiles/sharnd.out/better-8                  9.67GB/s ± 1%  9.66GB/s ± 0%     ~     (p=0.549 n=10+9)
Decoder_DecodeAllFiles/sharnd.out/best-8                    9.70GB/s ± 1%  9.61GB/s ± 0%   -0.91%  (p=0.010 n=10+9)
[Geo mean]                                                   935MB/s        940MB/s        +0.57%
```
Commit 9a048c1
Commits on Jul 13, 2022
-
gzip: fix stack exhaustion bug in Reader.Read (#641)
Replace recursion with iteration in Reader.Read to avoid stack exhaustion when there are a large number of files.

Fixes CVE-2022-30631
Upstream: golang/go#53168
Commit 5a16edc
You can try running this command locally to see the comparison on your machine:
git diff v1.15.7...v1.15.8