Compression ratio worse than zlib (deflate) #256
using v0.7.4
$>zstd -22 24bit.bpt -o 24bit.zs
$>zstd -22 32bit.bpt -o 32bit.zs |
I agree it feels strange, but I believe it's in line with how the algorithm works. For this kind of data, though, I would recommend having a look at @FrancescAlted 's Blosc filter, which combined with zstd provides excellent performance. See this blog post for detailed information. |
Yes, the kind of behavior that @hansinator exposes is quite typical in Blosc, but to be frank I was not expecting to see it in Zstd. It would indeed be interesting to see how Blosc+Zstd performs on this file. Is there a place where it can be accessed? |
Here are both files: Blosc sounds interesting, though I didn't fully understand it. Will it increase speed, compression ratio, or both? Will decompression be fast on an embedded RISC processor running at 50 MHz? |
Generally, Blosc will increase speed (it is multithreaded) and can increase compression ratios, depending on whether its filters (shuffle and bitshuffle) can rearrange the data in a way that makes it easier for the codec to find duplicates. Using Blosc on a RISC processor is indeed feasible, but you won't get the extra speed that SIMD instructions provide on Intel (SSE2, AVX2) or ARM (NEON, only in Blosc2). Another constraint is that binary size normally matters on RISC processors, and the Blosc library will necessarily be fatter (although you can always disable codecs and keep only the ones that are important for you). At any rate, I gave your files a go on my laptop, and got interesting results. The original files were:
and I processed them with bloscpack (a high level interface to Blosc):
[note that compression level 9 in Blosc translates to 22 in Zstd] And the sizes for the compressed files are:
so, when not using the shuffle filter, I can almost reproduce the plain zstd figures (expected). However, when the shuffle filter is used, the 32bit version can be reduced down to 8 KB. As I was quite surprised by this result, I double-checked that the latter file reproduces the original one:
so yup, it does. |
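For readers unfamiliar with the shuffle filter, here is a minimal sketch (plain Python/NumPy, not Blosc's actual implementation) of the byte transposition it performs on 32-bit elements whose high bytes are zero:

```python
import numpy as np

# Hypothetical sample resembling the 32-bit file: small 24-bit values stored
# as little-endian uint32, so every fourth byte is zero.
values = np.array([0x0102A0, 0x0102A5, 0x0103B0, 0x0103B7], dtype='<u4')
raw = values.tobytes()

# Shuffle with typesize 4: gather byte 0 of every element, then byte 1, and so on.
# Similar bytes (including all the zero MSBs) end up in long runs that an
# LZ-style codec can match easily.
shuffled = raw[0::4] + raw[1::4] + raw[2::4] + raw[3::4]

print(raw.hex(' '))       # a0 02 01 00 a5 02 01 00 b0 03 01 00 b7 03 01 00
print(shuffled.hex(' '))  # a0 a5 b0 b7 02 02 03 03 01 01 01 01 00 00 00 00
```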
I suspect : |
Thank you a lot. This opens up new possibilities to look into. I need to transfer look-up tables for image processing to an FPGA via a 115 kbps serial line, and it has limited storage and memory resources. I have been trying different compression algorithms for a couple of weeks with mixed results. Algorithms like XZ or LZHAM take too long to decompress; deflate and similar take too long to transfer. Now Blosc with Zstd seems to be the perfect solution. The only codec that achieved a 10 kB compressed size was RAR, which I can't use without WinRAR. @Cyan4973 @FrancescAlted |
@Cyan4973 Unfortunately -t 3 is not supported, and the only values that are supported for shuffle are 2, 4, 8, 16 and 16 + (any of the previous ones). @hansinator Zstd support was added to python-blosc (a dependency of bloscpack) last week, so you may need to install the latest version of it (1.4.0). |
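A minimal python-blosc (>= 1.4.0) sketch of selecting the zstd codec and toggling the shuffle filter; the array here is a synthetic stand-in for the actual look-up tables, so the exact sizes will differ:

```python
import blosc       # python-blosc >= 1.4.0 ships the zstd codec
import numpy as np

# Synthetic stand-in for the 32-bit look-up table.
table = np.arange(0, 1 << 22, 7, dtype=np.uint32)
raw = table.tobytes()

# clevel=9 is Blosc's maximum compression level; typesize=4 tells the
# shuffle filter how wide each element is.
no_shuffle = blosc.compress(raw, typesize=4, clevel=9,
                            shuffle=blosc.NOSHUFFLE, cname='zstd')
with_shuffle = blosc.compress(raw, typesize=4, clevel=9,
                              shuffle=blosc.SHUFFLE, cname='zstd')

print(len(raw), len(no_shuffle), len(with_shuffle))
assert blosc.decompress(with_shuffle) == raw   # round-trip check
```

The bitshuffle variant mentioned earlier in the thread is selected the same way, by passing blosc.BITSHUFFLE instead of blosc.SHUFFLE.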
@FrancescAlted stupid question, but would it be helpful, for the compression ratio, to have a "pre-filter" that expands 24-bit integers into 32 bits (MSB padded with 0) to improve compression using the SIMD-accelerated shuffle? |
Your suggestions made me think twice, and
which is another 20% smaller than the 32bit ones. But for 3 bytes, a regular shuffle function in C is used, not a SIMD-accelerated codepath, so perhaps your idea of adding padding would be a good one. I added a ticket for C-Blosc2. |
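A rough NumPy sketch of what such a padding pre-filter (and its inverse) could look like; the function names and the little-endian assumption are mine, not anything that exists in Blosc:

```python
import numpy as np

def pad_24_to_32(buf: bytes) -> bytes:
    """Widen packed little-endian 24-bit integers to 32-bit by adding a zero MSB,
    so that the 4-byte SIMD shuffle path can be used."""
    triplets = np.frombuffer(buf, dtype=np.uint8).reshape(-1, 3)
    padded = np.zeros((len(triplets), 4), dtype=np.uint8)
    padded[:, :3] = triplets        # copy the three payload bytes; byte 3 stays 0
    return padded.tobytes()

def unpad_32_to_24(buf: bytes) -> bytes:
    """Inverse transform: drop the zero MSB again after decompression."""
    quads = np.frombuffer(buf, dtype=np.uint8).reshape(-1, 4)
    return quads[:, :3].tobytes()
```

Padding would be applied before blosc.compress (with typesize=4) and reversed after blosc.decompress; as noted below, the extra buffer copies may eat into the gain.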
Just for completeness, apparently the improvement in compression for the 24bit case is not just that we are removing the 0s in MSB. Taking the 32bit file and choosing a chunksize that is close to the length of the file (for some reason, the size of the 32bit file is not exactly divisible by 4), we get:
which is actually a bit smaller than the 24bit base. In addition, using the accelerated SIMD path for shuffle still provides a good advantage in performance (Python code ahead):
So, decompression happens 2.7x faster (at more than 5 GB/s) when using the padded version. Although if this were an actual padding pre-filter, the additional copies of internal buffers might defeat the advantage. Well, I suppose the only way to know is to experiment. |
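The Python snippet itself did not survive the copy into this thread; a hypothetical reconstruction of such a decompression benchmark with python-blosc and timeit might look like this (run once per variant to compare them):

```python
import timeit
import blosc
import numpy as np

# Synthetic stand-in for the padded 32-bit table.
raw = np.arange(0, 1 << 22, 7, dtype=np.uint32).tobytes()
packed = blosc.compress(raw, typesize=4, clevel=9,
                        shuffle=blosc.SHUFFLE, cname='zstd')

n = 100
seconds = timeit.timeit(lambda: blosc.decompress(packed), number=n)
print(f"{len(raw) * n / seconds / 1e9:.2f} GB/s decompression")
```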
Is there anything else that can be done on this issue? |
As far as I understood, Zstd is intended to be a better replacement for deflate, so my intent for this issue was to point out a weird case where deflate performs better, hoping that I could help bring Zstd closer to its goal. It turns out to be a complex issue with no "easy" one-size-fits-all solution. I appreciate your replies and the effort you have put into experimenting. You've helped me get a better understanding of this and showed me how to get better performance in my concrete compression scenario. If there is nothing tangible we can learn here to improve Zstd, I'd say there's nothing more to be done on this issue. |
Thanks @hansinator. We'll try to analyze what happens in such cases, because while I'm not surprised that Regards |
Hi,
I have an input file on which zstd gives a worse compression ratio than zlib does with deflate. Basically, my file is mostly a list of 24-bit integers. If I store all integers as 32-bit with the MSB set to zero, then my input file is of course larger, but it compresses with more than double the ratio.
I find this rather odd. Maybe you can have a look at it?
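For anyone wanting to reproduce this kind of comparison in Python, here is a minimal sketch using the third-party zstandard bindings; the array is a synthetic stand-in for the original file, so it will not necessarily show the same anomaly:

```python
import zlib
import numpy as np
import zstandard  # third-party bindings for zstd, assumed installed

# Synthetic stand-in: a list of 24-bit integers packed as 3 bytes each, little-endian.
values = np.arange(0, 1 << 22, 7, dtype='<u4')
data_24bit = values.view(np.uint8).reshape(-1, 4)[:, :3].tobytes()

deflated = zlib.compress(data_24bit, 9)
zstd_out = zstandard.ZstdCompressor(level=22).compress(data_24bit)

print(f"original {len(data_24bit)}, deflate {len(deflated)}, zstd {len(zstd_out)}")
```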