Shuffle Coding

This is accompanying code for our workshop paper "Entropy Coding of Unordered Data Structures", which was accepted for oral presentation at the ICML 2023 Neural Compression Workshop.

Workshop paper | Presentation (13 minutes)

Shuffle Coding

Shuffle coding is a general method for optimal compression of sequences of unordered objects using bits-back coding. Data structures that can be compressed using shuffle coding include multisets, graphs, hypergraphs, and others. This implementation can easily be adapted to different data types and statistical models, and achieves state-of-the-art compression rates on a range of graph datasets including molecular data.

Run

Apart from Rust, you need a C compiler to build this project due to dependency on nauty and Traces.

Use the following commands to replicate all experiments from the paper:

cargo run --release -- --symstats --stats --er --shuffle AIDS..mit_ct2 reddit_threads..TRIANGLES
cargo run --release -- --symstats --stats --er --eru --pu --pur --shuffle MUTAG PTC_MR
cargo run --release -- --symstats --stats --er --eru --pu --pur --shuffle --plain MUTAG PTC_MR ZINC_full..ZINC_val PROTEINS IMDB-BINARY IMDB-MULTI
cargo run --release -- --symstats --stats --er --pu --shuffle SZIP

Graph datasets are downloaded automatically as needed (from TU and SZIP). Full results can be found here.

To retrieve standard deviations for the stochastic Polya urn models, you can set a random seed (0, 1 and 2 were used for the paper):

cargo run --release -- MUTAG PTC_MR ZINC_full..ZINC_val PROTEINS IMDB-BINARY IMDB-MULTI --plain --pu --shuffle --seed=0
cargo run --release -- MUTAG PTC_MR --pu --shuffle --seed=0 
cargo run --release -- SZIP --pu --shuffle --seed=0

Use cargo run --help to see all available options.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
src		src
.gitignore		.gitignore
Cargo.toml		Cargo.toml
LICENCE		LICENCE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Workshop paper | Presentation (13 minutes)

Shuffle Coding

Run

About

Releases

Packages

Contributors 2

Languages

License

juliuskunze/shuffle-coding

Folders and files

Latest commit

History

Repository files navigation

Workshop paper | Presentation (13 minutes)

Shuffle Coding

Run

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages