Implementation of an archiver using Huffman coding

How to use

compress("compress-me.txt", "compressed.bin")

decompress("compressed.bin", "decompressed.txt")

Result

encoded huffman tree code: 0010010000001011001010001011100001001011000101100011...
encoded text code: 000000001111011011010010001111010011101101111110111001110100...
before: 7528bytes, after: 4109bytes, compression 45.4%

Compression can be negative if the text is too short, because the binary file with the encoded text also contains the Huffman tree data, which is needed to decode the text.
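As an illustration of how the percentage can be computed and why it can go negative, here is a small sketch. The function name `compression_percent` is hypothetical and not taken from huffman_compressor.py; it only assumes the percentage is the relative size reduction.

```python
def compression_percent(before_bytes, after_bytes):
    """Relative size reduction in percent (hypothetical helper, not from the repo)."""
    return (1 - after_bytes / before_bytes) * 100


# Sizes from the sample output above: 7528 bytes before, 4109 bytes after.
print(round(compression_percent(7528, 4109), 1))

# A tiny input can grow, because the stored tree outweighs the savings;
# the result is then negative.
print(compression_percent(10, 25))
```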

The average result is about 45%; depending on the character frequencies, it can reach up to 90%.

Note: it does not work with Cyrillic characters. It's a simple archiver made for learning purposes.

How it's implemented

Compression:

read the text from the input file
create a frequency table using collections.Counter
create Huffman nodes using queue.PriorityQueue
merge the nodes to get the Huffman tree
create a code table (dictionary) mapping chars to their codes
encode the text using the code table
encode the Huffman tree (each internal node is 0, each leaf is 1 followed by the encoded ASCII char)
place the encoded tree before the encoded text
save it to a new binary file
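The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual code from huffman_compressor.py: the names `build_tree`, `build_codes`, `encode_tree` and `compress_to_bits` are hypothetical, tree nodes are plain tuples, each leaf char is assumed to take 8 bits, and the result stays a string of '0'/'1' chars (packing into bytes and writing the file are left out).

```python
from collections import Counter
from itertools import count
from queue import PriorityQueue


def build_tree(text):
    """Merge the two least frequent nodes until a single root remains."""
    if not text:
        raise ValueError("cannot build a Huffman tree for empty text")
    tie = count()  # tie-breaker so the queue never has to compare nodes
    pq = PriorityQueue()
    for char, freq in Counter(text).items():
        pq.put((freq, next(tie), char))  # a leaf is just the char itself
    while pq.qsize() > 1:
        f1, _, left = pq.get()
        f2, _, right = pq.get()
        pq.put((f1 + f2, next(tie), (left, right)))  # internal node = pair
    return pq.get()[2]


def build_codes(node, prefix="", table=None):
    """Walk the tree: left edge adds '0', right edge adds '1'."""
    table = {} if table is None else table
    if isinstance(node, str):
        table[node] = prefix or "0"  # assumption: lone symbol gets code '0'
    else:
        build_codes(node[0], prefix + "0", table)
        build_codes(node[1], prefix + "1", table)
    return table


def encode_tree(node):
    """Pre-order dump: internal node -> '0', leaf -> '1' + 8-bit char code."""
    if isinstance(node, str):
        # Chars above code 255 would not fit in 8 bits (ASCII-only sketch).
        return "1" + format(ord(node), "08b")
    return "0" + encode_tree(node[0]) + encode_tree(node[1])


def compress_to_bits(text):
    """Encoded tree first, then the encoded text, as one bit string."""
    tree = build_tree(text)
    codes = build_codes(tree)
    return encode_tree(tree) + "".join(codes[c] for c in text)


bits = compress_to_bits("aaab")
print(bits, len(bits))
```

Keeping the tree in front of the text is what lets the decoder rebuild the code table without any side channel, at the cost of a fixed overhead that dominates for very short inputs.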

Decompression:

read the encoded data from the binary file
decode the Huffman tree
decode the text using the tree
save it to the output file
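A minimal sketch of the decoding side, under the same assumptions as the tree format described earlier (internal node is 0, leaf is 1 followed by an 8-bit char code). The names `decode_tree`, `decode_text` and `decompress_bits` are hypothetical, the input is a string of '0'/'1' chars rather than a real binary file, and any padding bits the actual file may contain are ignored.

```python
def decode_tree(bits, pos=0):
    """Rebuild the tree from its pre-order dump; return (node, next position)."""
    if bits[pos] == "1":  # leaf: the next 8 bits hold the char code
        char = chr(int(bits[pos + 1:pos + 9], 2))
        return char, pos + 9
    left, pos = decode_tree(bits, pos + 1)  # internal node: two subtrees follow
    right, pos = decode_tree(bits, pos)
    return (left, right), pos


def decode_text(bits, root, pos):
    """Follow '0' to the left child and '1' to the right until a leaf is hit."""
    if isinstance(root, str):
        # Assumption: a single-symbol tree uses one bit per char.
        return root * (len(bits) - pos)
    out = []
    node = root
    for bit in bits[pos:]:
        node = node[0] if bit == "0" else node[1]
        if isinstance(node, str):  # reached a leaf: emit it, restart at root
            out.append(node)
            node = root
    return "".join(out)


def decompress_bits(bits):
    """The tree comes first; the encoded text starts where the tree ends."""
    root, pos = decode_tree(bits)
    return decode_text(bits, root, pos)


# Hand-built example: tree ('b', 'a') followed by the codes for "aaab".
example = "0" + "1" + "01100010" + "1" + "01100001" + "1110"
print(decompress_bits(example))
```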
