oliverbaileysmith/zlib-decompress

zlib-decompress

Disclaimer: This project is for learning purposes only and is not a specification-compliant ZLIB decompressor.

A Python implementation of a decompressor for ZLIB-compressed data.

decompress.py contains the user-facing function decompress, which takes data compressed by zlib.compress and returns the decompressed data.

Running py main.py shows an example of compressing input data using zlib.compress and decompressing using decompress.
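The round trip can be sketched as follows. This is not the contents of main.py; it is a minimal illustration that uses the standard library's zlib.decompress as a stand-in, and the commented-out import assumes that decompress accepts and returns bytes, which this README does not spell out.

```python
import zlib

original = b"hello hello hello hello"

# Compress with the standard library, as described above.
compressed = zlib.compress(original)

# Assumed usage of this project (signature not confirmed here):
#   from decompress import decompress
#   restored = decompress(compressed)
# Stand-in so that the sketch runs on its own:
restored = zlib.decompress(compressed)

print(restored == original)
```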

ZLIB format

The ZLIB format (RFC 1950) wraps the compressed data in a two-byte header that describes the compression method and its parameters. The only compression method the specification defines is DEFLATE, so ZLIB is essentially a thin wrapper around the DEFLATE format. The header can also signal a "preset dictionary"; the specification does not define any such dictionaries, so this implementation does not handle them. Finally, a ZLIB stream ends with an Adler-32 checksum of the uncompressed data to verify integrity, which this implementation does not calculate.
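To make the header and checksum concrete, here is a minimal sketch based on RFC 1950. The names parse_zlib_header and adler32 are hypothetical helpers written for this illustration; they are not taken from decompress.py.

```python
import zlib


def parse_zlib_header(data: bytes) -> dict:
    """Decode the two header bytes (CMF and FLG) of a ZLIB stream."""
    cmf, flg = data[0], data[1]
    return {
        "method": cmf & 0x0F,                   # 8 means DEFLATE
        "window_size": 1 << ((cmf >> 4) + 8),   # LZ77 window size in bytes
        "has_preset_dict": bool(flg & 0x20),    # FDICT flag
        "level_hint": flg >> 6,                 # FLEVEL, informational only
        # The two bytes, read as a big-endian 16-bit number,
        # must be a multiple of 31.
        "check_ok": (cmf * 256 + flg) % 31 == 0,
    }


def adler32(data: bytes) -> int:
    """Compute Adler-32: two running sums, each modulo 65521."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % 65521
        b = (b + a) % 65521
    return (b << 16) | a


payload = b"hello hello hello"
stream = zlib.compress(payload)

header = parse_zlib_header(stream)
# The last four bytes of the stream hold the checksum, big-endian.
stored_checksum = int.from_bytes(stream[-4:], "big")

print(header)
print(stored_checksum == adler32(payload))
```

The checksum covers the uncompressed data, so a decompressor can only verify it after decoding the whole DEFLATE payload.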

DEFLATE uses Huffman coding to replace fixed-width 8-bit bytes with variable-length bit strings, in which the most frequently occurring symbols use fewer bits and less frequently occurring symbols use more. This coding can significantly reduce the size of data in which some symbols are much more common than others, such as written English.
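The idea can be illustrated with a small sketch that derives Huffman code lengths from byte frequencies. This is the textbook construction, not the exact procedure DEFLATE uses (DEFLATE codes a combined literal/length alphabet and caps code lengths at 15 bits), and huffman_code_lengths is a hypothetical helper name.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data: bytes) -> dict:
    """Return a mapping of byte value to Huffman code length in bits."""
    freq = Counter(data)
    if len(freq) == 1:
        # A lone symbol still needs one bit.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, tiebreak, symbols in this subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, tiebreak, s1 + s2))
        tiebreak += 1
    return lengths


text = b"this is an example of a huffman tree"
lengths = huffman_code_lengths(text)
encoded_bits = sum(lengths[b] for b in text)

print(f"fixed-width: {8 * len(text)} bits, Huffman: {encoded_bits} bits")
print(f"space: {lengths[ord(' ')]} bits, 'x': {lengths[ord('x')]} bits")
```

The frequent space character ends up with a code no longer than that of the rare "x", which is where the saving comes from.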

DEFLATE also uses LZ77 to compress repeated strings of data by replacing a repeated string with a <length, backward distance> pair referring to a copy of the same string in the decompressed data.
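Resolving such a pair during decompression can be sketched as below. The helper copy_match is hypothetical and not taken from decompress.py. It copies one byte at a time because the length may exceed the distance, in which case the copy reads bytes it has just written.

```python
def copy_match(output: bytearray, length: int, distance: int) -> None:
    """Append `length` bytes starting `distance` bytes back from the end."""
    if not 0 < distance <= len(output):
        raise ValueError("distance points outside the decompressed data")
    start = len(output) - distance
    for i in range(length):
        # output grows as we append, so overlapping copies repeat the pattern.
        output.append(output[start + i])


out = bytearray(b"abc")
copy_match(out, length=6, distance=3)
print(out)  # bytearray(b'abcabcabc')

run = bytearray(b"a")
copy_match(run, length=4, distance=1)
print(run)  # bytearray(b'aaaaa')
```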