Switch MapBlock compression to zstd #10788

Merged: 20 commits, Sep 1, 2021

Conversation

@lhofhansl (Contributor) commented Jan 5, 2021

This adds zstd to Minetest.

The code is backward, but not forward, compatible:

  • blocks are compressed with zstd on the wire if the client understands it; otherwise zlib is used as before
  • all new blocks written to the DB are compressed with zstd
  • old blocks are still decompressed with zlib

@sfan5's brotli and @est31's earlier zstd branches helped a lot, although I'm using different zstd APIs.

Questions:

  • What about the compression levels? Right now these are zlib-specific. For now I simply mapped them to zstd levels 1-10, which really should be good enough. Alternatively I could remove the compression level config, since zstd's default is pretty good.
  • Is forward compatibility needed? I.e., should we (somehow) allow an old server to still read old maps, even if a newer server has written to them?
  • What should the default compression level be? Right now I set it to 0 (which uses zstd's default, currently 3). That seems pretty good: better compression than zlib, and faster too.

Some stats: (scene with viewing_range of 600, 3700 blocks visible)
Zlib-6 (default), 54MB DB, 90s to generate (dominated by compression time on save)
Zlib-3, 69MB DB, 70s to generate

Zstd-15, 50MB, 108s to load
Zstd-10, 50MB, 70s to load
Zstd-6, 54MB, 67s to load
Zstd-3 (default), 56MB DB, 60s to load
Zstd-1, 55MB DB, 60s to load
Zstd -1, 62MB, 60s to load
Zstd -10, 72MB, 60s to load

This shows - as expected - that we get more compression per unit of time from zstd. It also shows that for tiny inputs like blocks, higher zstd levels do not buy much. Perhaps we should default to 1. I'm concluding that zstd below level 3 is no longer the bottleneck. (Should confirm with a run with no compression.)
For zlib, levels other than 3 or 6 do not make much sense, while for zstd, level 1 seems to be optimal.

See @ShadowNinja's comment below for more ways to optimize this.

I sporadically watched the map save times. They grew up to 5s for zlib-6 and stayed below 2s for zlib-3.
Zstd-10 stayed below 2s, Zstd-1 below 1s.
Similarly, Zstd-15 stayed below 5s. Not much improvement there for levels -1 or -10.

(Negative zstd levels are faster still, but with less compression - see the -1 and -10 rows above.)

@lhofhansl added the WIP and Testing needed labels Jan 5, 2021
if(version >= 29)
{
// map the zlib levels [-1,9]
// the range 1-10 is good enough, 0 is the default (currently 3)

@lhofhansl (Author)

You may find this questionable.
Perhaps, since compress can now call multiple schemes, we ought to come up with a general compression scale, perhaps -1 to 100, and then map that to whatever range the specific algorithm supports.

Or remove the config option altogether and use the defaults....

@lhofhansl (Author)

Yet another option is to always use -1 for zlib, and use the zstd spectrum for the config instead.
zstd has a wide range of compression levels, from very fast to very slow, and perhaps we'll never need a new one again.
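
A minimal sketch of such a mapping (a hypothetical helper, not the PR's actual code; the clamp to 1-10 follows the diff comment above):

#include <algorithm>
#include <zstd.h>

// Hypothetical helper: map the zlib-oriented config level [-1, 9] onto zstd.
// 0 and -1 fall back to zstd's default level; positive values are clamped
// into the 1-10 range discussed above (zstd itself accepts levels up to
// ZSTD_maxCLevel()).
static int configLevelToZstd(int config_level)
{
    if (config_level <= 0)
        return 3; // zstd's current default level
    return std::min(config_level, 10);
}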

@lhofhansl (Contributor)

Oops. Removed some unrelated changes.

infostream << "Test: Testing zstd wrappers with a large amount "
"of pseudorandom data" << std::endl;

u32 size = 500000;

@lhofhansl (Author)

500,000 because zstd's default buffer size is 128k.

@lhofhansl (Author)

No longer needed since I changed the bufsize to 64, but it's still a good test, and it takes only a few ms anyway.

@lhofhansl (Contributor)

I can clearly use some help with the build issues. It builds fine on my Linux box.

@lhofhansl (Contributor) commented Jan 5, 2021

Looks like ZSTD_c_compressionLevel is not defined in older versions of zstd, and neither is ZSTD_e_end, which is needed here. Hmmm.

Edit: Yeah, I should have used the legacy APIs. Meh. I will see when I can get to this again.
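
For context, the two streaming APIs differ roughly as follows (a sketch with error handling and output flushing omitted; the advanced API requires zstd >= 1.4.0, while the legacy one also works with older releases such as 1.3.x):

#include <zstd.h>

// Advanced API: needs ZSTD_c_compressionLevel and ZSTD_e_end.
void compressAdvanced(ZSTD_CCtx *cctx, ZSTD_outBuffer *out, ZSTD_inBuffer *in, int level)
{
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, level);
    size_t remaining;
    do {
        remaining = ZSTD_compressStream2(cctx, out, in, ZSTD_e_end);
        // flush *out to the destination here when it fills up
    } while (remaining != 0);
}

// Legacy API: available in old releases.
void compressLegacy(ZSTD_CStream *zcs, ZSTD_outBuffer *out, ZSTD_inBuffer *in, int level)
{
    ZSTD_initCStream(zcs, level);
    while (in->pos < in->size) {
        ZSTD_compressStream(zcs, out, in);
        // flush *out here when it fills up
    }
    while (ZSTD_endStream(zcs, out) != 0) {
        // keep flushing until the frame epilogue is fully written
    }
}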

@lhofhansl (Contributor)

OK. At least it now compiles with the legacy API used...
I have no clue why the unit test fails. It passes on my machine, but I have a newer version of libzstd.

@lhofhansl (Contributor)

Hah. Older versions of zstd require initializing a dstream before each decompression. Newer versions do not require that.
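
In code, the portable pattern looks roughly like this (a sketch; output draining and error handling are elided):

#include <zstd.h>

// Re-initialize the DStream before every frame so this also works on old
// libzstd (e.g. 1.3.x); newer versions reset the stream automatically when
// a frame completes.
void decompressFrame(ZSTD_DStream *ds, ZSTD_outBuffer *out, ZSTD_inBuffer *in)
{
    ZSTD_initDStream(ds); // required per decompression on old versions
    while (in->pos < in->size) {
        size_t ret = ZSTD_decompressStream(ds, out, in);
        if (ZSTD_isError(ret) || ret == 0)
            break; // error, or frame fully decoded
        // drain *out to the destination here when it fills up
    }
}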

@ShadowNinja (Member)

I was actually looking into implementing this myself, but didn't finish it. Some discussion on other optimizations to the MapBlock format: #3946

Something that might be worth looking into with zstd would be a custom dictionary -- it should improve compression on the small bits that we pass to the compressor. We compress each part of the mapblock separately, so the inputs are generally 8KiB or less, which isn't much for a compressor to learn from without a precomputed dictionary. It might also help to compress an entire mapblock instead of compressing nodes, meta, etc separately. That might also improve performance since you only have to include the compressor startup time once (although I think zstd has some significant optimizations in startup time over zlib).

I wrote a C++ app that converted Minecraft worlds to a Minetest database, profiled it, and found that it spent most of its CPU time in zlib compressing Minetest blocks, so compression performance is definitely important to database performance.
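
The custom-dictionary idea could look roughly like this with zstd's training API (a sketch; the sample plumbing and the 64 KiB target size are illustrative, and as noted below, the exact same dictionary is needed again for decompression):

#include <vector>
#include <zdict.h>
#include <zstd.h>

// Train a shared dictionary from many small serialized mapblock samples;
// the samples are concatenated in `samples` with their individual sizes
// listed in `sample_sizes`.
std::vector<char> trainDictionary(const std::vector<char> &samples,
        const std::vector<size_t> &sample_sizes)
{
    std::vector<char> dict(64 * 1024);
    size_t n = ZDICT_trainFromBuffer(dict.data(), dict.size(),
            samples.data(), sample_sizes.data(),
            (unsigned)sample_sizes.size());
    if (ZDICT_isError(n))
        return {};
    dict.resize(n);
    return dict;
}

// Small inputs would then be compressed with e.g.
// ZSTD_compress_usingDict(cctx, dst, dstCap, src, srcSize,
//         dict.data(), dict.size(), level);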

@lhofhansl (Contributor) commented Jan 6, 2021

Thanks @ShadowNinja. There's also #10671 (you've seen it). It turns out we write a lot of blocks to disk when the only thing changed is the timestamp.

I've improved startup somewhat by making the stream context thread local.
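
Roughly like this (a sketch of the idea, not the PR's exact code):

#include <zstd.h>

// One compression stream per thread, created lazily and reused for every
// block instead of paying the allocation cost each time. It lives for the
// thread's lifetime and is re-initialized per use.
static ZSTD_CStream *getThreadLocalCStream()
{
    thread_local ZSTD_CStream *zcs = ZSTD_createCStream();
    return zcs;
}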

A pre-computed dictionary is tricky... The data can only be decompressed with that exact same dictionary (unless it is written with the data).

In addition to compression, I find that ostringstream is unreasonably slow - especially construction; reducing the number of those will also help a lot.

Let me think this over. I'd prefer not to have too many changes in a single PR; on the other hand, each time we change the format we need a protocol bump, so it would be good to group those.

@lhofhansl (Contributor)

I do not know how to fix the Android and Windows builds. It seems we're downloading pre-compiled artifacts, and I do not have the necessary environments to build one for libzstd.

@lhofhansl (Contributor) commented Jan 7, 2021

@ShadowNinja See second commit. For version >= 29, node data and metadata are now compressed together. Can't say I like the code - it's littered with if-statements - but it works.

Edit: This saves another 10% of the size at Zstd-3.
Edit2: I could write the entire block into an uncompressed ostringstream first, then compress that complete object into the final output.
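
That Edit2 idea, roughly (a sketch; the serializer and compressor helpers are hypothetical stand-ins):

#include <sstream>
#include <string>

void serializeNodeData(std::ostream &os);  // hypothetical
void serializeMetadata(std::ostream &os);  // hypothetical
void compressZstd(const std::string &data, std::ostream &os, int level); // hypothetical

// Serialize the whole block uncompressed first, then run the compressor
// exactly once over the result.
void serializeBlockCompressed(std::ostream &final_os, int level)
{
    std::ostringstream raw(std::ios_base::binary);
    serializeNodeData(raw);
    serializeMetadata(raw);
    compressZstd(raw.str(), final_os, level);
}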

@ShadowNinja (Member)

Yeah, I ended up using a Buffer<u8> to avoid the overhead of std::ostringstream.

I've pushed what I had completed here: https://github.com/ShadowNinja/minetest/commits/mapblock-v29

Only the serialization is completed, and there are a bunch of other changes on top of the compression, but it might be useful.

@lhofhansl (Contributor)

I like that better. (BTW, as I found out the hard way, you cannot use the new compress2 zstd API if you want this to work with zstd 1.3.3.)
Do you want to continue with your branch, or should I incorporate this into my branch? Fine either way, no ego on my side :)

@ShadowNinja (Member)

You should continue with your branch. Mine is far from complete, and I'm not working on it right now.

@lhofhansl (Contributor)

I removed the node data/metadata combined compression from this patch, since I did not like it.
Perhaps it's best to add zstd first, and then optimize other things around it.

@lhofhansl (Contributor)

@ShadowNinja new version, that compresses the entire block.

@sfan5 self-requested a review January 10, 2021 18:40

@lhofhansl (Contributor)

Another idea for the metadata length: instead of writing a separate ostringstream, one could just leave a 4-byte placeholder, remember the stream positions before and after writing the metadata (via tellp()), and then poke that length into the resulting std::string before it is written into the compressed stream.
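
Sketched out (the serializer is a hypothetical stand-in; error handling omitted):

#include <cstdint>
#include <sstream>
#include <string>

void serializeMetadata(std::ostream &os); // hypothetical

std::string writeWithLengthPlaceholder()
{
    std::ostringstream os(std::ios_base::binary);
    const size_t placeholder = (size_t)os.tellp();
    os.write("\0\0\0\0", 4);              // 4-byte length placeholder
    const std::streampos before = os.tellp();
    serializeMetadata(os);
    const uint32_t len = (uint32_t)(os.tellp() - before);

    std::string s = os.str();
    // poke the big-endian length into the placeholder bytes
    s[placeholder + 0] = (char)((len >> 24) & 0xff);
    s[placeholder + 1] = (char)((len >> 16) & 0xff);
    s[placeholder + 2] = (char)((len >> 8) & 0xff);
    s[placeholder + 3] = (char)(len & 0xff);
    return s; // this then goes into the compressed stream
}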

@sfan5 (Member) commented Aug 29, 2021

If you can make that work, that'd be good, but I thought ostringstreams did not support seeking.

@lhofhansl (Contributor) commented Aug 29, 2021

According to the C++ standard, stringbuf does support seeking.
And tellp() seems to work. Verified by comparing the size with the difference of the offsets, on GCC with -std=c++98.
(By the way, I noticed that in most cases the metadata size is just 1 - i.e. no metadata - and in that case the extra ostringstream is incredibly wasteful.)

I'll post an update soon.

@lhofhansl (Contributor)

Yet another alternative is to write the metadata last. That way it can simply be skipped by parsing out the rest (all fixed-size) and then skipping to the end.

@lhofhansl (Contributor) commented Aug 29, 2021

Now the macOS build is broken. :(
(Can't find the zstd package; not sure what changed...)

@sfan5 (Member) commented Aug 29, 2021

Yet another alternative is to write the metadata last.

Ahhhh, of course - why didn't I think of that? We can change the format freely anyway.
Sorry to make you do more work, but at this point I suggest the following format changes:

  • Move the timestamp after lighting_complete (not very relevant but who knows when it'll be useful)
  • Move the node/id mapping just before content_width
  • Remove the metadata length field again

This is even easier for minetestmapper and similar tools because they can just stop reading before metadata, static objects or anything else that is uninteresting.

Now the macOS build is broken

Did you add it here?

pkgs=(cmake freetype gettext gmp hiredis jpeg jsoncpp leveldb libogg libpng libvorbis luajit)

@lhofhansl (Contributor) commented Aug 29, 2021

Thanks @sfan5. Update coming soon :)
And no problem at all about the extra work. Good to figure this out; I did not even think about other tools that might want to read the map data.

Update: Moving the timestamp makes for slightly uglier code, since it's only written to disk, not to the network.

@lhofhansl (Contributor)

I still need to update world_format.txt.

@lhofhansl (Contributor)

Alternatively we could document version >= 29 separately and leave <= 28 the way it is.

@sfan5 (Member) left a comment

Works, tested:

  • old map loadability
  • network compatibility, both ways
  • format conforming to expectations

@sfan5 added this to the 5.5.0 milestone Aug 31, 2021

@lhofhansl (Contributor)

Changes look good.
