Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redo serialize.lua #11427

Merged
merged 31 commits into from
Jun 11, 2022
Merged

Redo serialize.lua #11427

merged 31 commits into from
Jun 11, 2022

Conversation

appgurueu
Copy link
Contributor

@appgurueu appgurueu commented Jul 7, 2021

Alternative to #11380. Not yet thoroughly tested, but passes the (few) unit tests provided, as well as my own tests. Closes #8719.

Features:

Benchmarks

LuaJIT. Write speeds, Luon is the name of the modlib module, Bluon is a binary format, Minetest is the old minetest.serialize, Minetest (JSON) is minetest.write_json. The times shown are what 100 serializations take. Data from
https://github.com/miloyip/nativejson-benchmark/tree/master/data.

OBJECT: canada.json
Luon    13.800655s
Bluon   21.086339s
Minetest        16.716192s
Minetest (JSON) 41.147817s
OBJECT: citm_catalog.json
Luon    3.456392s
Bluon   3.230237s
Minetest        3.811432s
Minetest (JSON) 5.631833s
OBJECT: twitter.json
Luon    1.186978s
Bluon   0.79831700000001s
Minetest        1.015294s
Minetest (JSON) 1.718552s

("reopened" #11425)

@sfan5 sfan5 added @ Builtin Bugfix 🐛 PRs that fix a bug Maintenance Tasks to keep the codebase and related parts in order, including architectural improvements labels Jul 7, 2021
Copy link
Member

@SmallJoker SmallJoker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall needs more comments to describe what each step does for easier maintenance and better understanding.
Erroneous code style
See comments

builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
@appgurueu
Copy link
Contributor Author

appgurueu commented Sep 16, 2021

This still closes #8719

@rubenwardy
Copy link
Member

rubenwardy commented Sep 16, 2021

This should be unit tested to cover compatibility and edge cases

See how vectors have unit tests in https://github.com/minetest/minetest/blob/master/builtin/common/tests/vector_spec.lua

@appgurueu
Copy link
Contributor Author

This should be unit tested to cover compatibility and edge cases

See how vectors are unit tests in builtin/common/tests/vector_spec.lua

Minetest already has unit tests for serialization and this passes them, plus my modlib unit tests (as already said). I should definitely add FaceDeer's example to the unit tests though.

@rubenwardy
Copy link
Member

Minetest already has unit tests for serialization and this passes them

oh right, I wrote them. It would be good to add tests for previous shortcomings, such as "Support for arbitrary references, including self-referencing"

Copy link
Member

@Desour Desour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should work. Didn't test though.
Some more comments and empty lines wouldn't hurt.
(If you want to further improve the performance, I'd suggest using luajit's profiler.)

builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
@rubenwardy rubenwardy added Roadmap: Needs approval The change is not part of the current roadmap and needs to be approved by coredevs beforehand and removed Roadmap: Needs approval The change is not part of the current roadmap and needs to be approved by coredevs beforehand labels Oct 31, 2021
@appgurueu
Copy link
Contributor Author

Bump. 5.5 could use this.

@sfan5 sfan5 added this to the 5.6.0 milestone Dec 30, 2021
@appgurueu
Copy link
Contributor Author

appgurueu commented Jan 1, 2022

It looks like I'm duplicating part of the is_valid_identifier implementation (which is used by dump) with my latest changes. I'll see how I can deduplicate this (most likely I'll just expose is_valid_identifier and localize it for performance).

@ghost
Copy link

ghost commented Jan 13, 2022

From a brief non-thorough test with one of my mods that serializes a fair amount of data, this seems to work as expected. Notably, the resulting data was significantly smaller with this patch due to the way keys are handled. The code itself also seems fine, I had to slow down reading it but it's still understandable.

I'd call this good to merge with a couple caveats

  • This could probably do with some "real" use-case testing with mods/games that rely on serialization over multiple play sessions
  • I haven't checked code style yet, although I don't expect to find anything since SmallJoker did a pass for that already

I think the best time would be directly after 5.5 is released though, the potential harm that a missed bug could do here is very high and this will give time for us to see more real-world behavior during the 5.6 release cycle.

@appgurueu
Copy link
Contributor Author

appgurueu commented Jan 13, 2022

I think the best time would be directly after 5.5 is released though, the potential harm that a missed bug could do here is very high and this will give time for us to see more real-world behavior during the 5.6 release cycle.

Agreed. Get 5.5 out first, it's already late. I'll see where I can find good real-world test cases. It'd be nice if there were some voluntary "testers". I believe @erlehmann could be one of them ;).

@appgurueu
Copy link
Contributor Author

The latest commit massively improves the coverage, testing thousands of primitives and tables through fuzzing. I had to write my own assert_same because busted's deepcompare (1) only compares by depth, not by reference "mappability" and (2) simply didn't complete (poor performance, possibly exponential?).

.github/workflows/lua.yml Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
builtin/common/serialize.lua Outdated Show resolved Hide resolved
@sfan5
Copy link
Member

sfan5 commented Jun 6, 2022

Can we cut down on trailing commas?
grafik

Also these should probably be declared as locals:
r1="param1";r2="default:desert_stone";r3="default:desert_sand";r4="default:cactus"

Also² this breaks WorldEdit's deserialize workaround on LuaJIT. It was bound to happen of course but it's not fun to figure out a fix. (Don't worry about this)

@appgurueu
Copy link
Contributor Author

appgurueu commented Jun 6, 2022

Can we cut down on trailing commas?

We can and we did. Desour seems to prefer trailing commas though.

Also these should probably be declared as locals: r1="param1";r2="default:desert_stone";r3="default:desert_sand";r4="default:cactus"

Locals are capped at 255. I don't want to increase the code complexity (or impose arbitrary limits) for a negligible performance benefit.

(Don't worry about this)

Now I will worry about this.

@sfan5
Copy link
Member

sfan5 commented Jun 6, 2022

We can and we did. #11427 (comment)

Trailing commas definitely do hurt.

Locals are capped at 255. I don't want to increase the code complexity (or impose arbitrary limits) for a negligible performance benefit.

Wasn't even thinking of (theoretical) performance benefits, but sure, keep it like this.

Co-authored-by: sfan5 <[email protected]>
Copy link
Member

@sfan5 sfan5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested mainly with WorldEdit schematic serialization

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bugfix 🐛 PRs that fix a bug @ Builtin Maintenance Tasks to keep the codebase and related parts in order, including architectural improvements >= Two approvals ✅ ✅
Projects
None yet
6 participants