huffman_coding

Huffman Coding is a simple Rust-based implementation of Huffman encoding and decoding.

This project was inspired by a challenge from coding challenges.fyi and the algorithm implementation details were referenced from huffman coding challenge - Huffman Coding. The implementation details are from Huffman coding

The challenges I faced in this project are -

Handling Multiple References: A major challenge was dealing with multiple references to the TreeNode and multiple owners in Rust. Traditional methods of using references didn't work due to Rust's ownership model, requiring the use of Rc<HuffNode> to create multiple references.

Debugging BinaryHeap Comparison: Debugging the comparison in BinaryHeap was challenging because it wasn't working correctly when comparing the weights and elements. This issue arose when both elements were None, leading to non-deterministic ordering due to the use of serde_json, which changes the ordering of HashMap. As a result, the heap provided the wrong order during decoding, affecting the correctness of the prefixes.

// Old Implementation

#[derive(Debug, Clone)]
 struct HuffNode {
   weight: u32,
   element: Option<char>,
   left: Option<TreeNodeRef>,
   right: Option<TreeNodeRef>
 }

impl PartialEq for HuffNode {
    fn eq(&self, other: &Self) -> bool {
        if other.weight().eq(&self.weight()) {
           self.element.eq(&other.element)
        } else {
           other.weight().eq(&self.weight())
        }
    }
 }
impl PartialOrd for HuffNode {
   fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
      if other.weight().eq(&self.weight()) {
         match (self.element, other.element) {
             (None, None) => Option::from(Ordering:Equal),
             (Some(_), None) => { Option::from(Ordering::Greater) }
             (None, Some(_)) => { Option::from(Ordering::Less) }
             (Some(el1), Some(el2)) => Option::from(el1.cmp(&el2))
         }
     } else {
         Some(other.weight().cmp(&self.weight()))
     }
  }
}

impl Ord for HuffNode {
    fn cmp(&self, other: &Self) -> Ordering {
       if other.weight().eq(&self.weight()) {
           match (self.element, other.element) {
               (None, None) => {Ordering::Equal},
               (Some(_), None) => { Ordering::Greater }
               (None, Some(_)) => { Ordering::Less }
               (Some(el1), Some(el2)) => el1.cmp(&el2)
          }
      } else {
         other.weight().cmp(&self.weight())
      }
    }
 }

// new implementation
#[derive(Debug, Clone)]
struct HuffNode {
   weight: u32,
   element: Option<char>,
   left: Option<TreeNodeRef>,
   right: Option<TreeNodeRef>,
   id: u32
}

impl PartialEq for HuffNode {
    fn eq(&self, other: &Self) -> bool {
        if other.weight().eq(&self.weight()) {
            self.element.eq(&other.element)
        } else {
        other.weight().eq(&self.weight())
        }
    }
}

impl PartialOrd for HuffNode {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        if other.weight().eq(&self.weight()) {
            match (self.element, other.element) {
                (None, None) => Some(other.id.cmp(&self.id)),
                (Some(_), None) => { Option::from(Ordering::Greater) }
                (None, Some(_)) => { Option::from(Ordering::Less) }
                (Some(el1), Some(el2)) => Option::from(el1.cmp(&el2))
            }
        } else {
            Some(other.weight().cmp(&self.weight()))
        }
    }
}

impl Ord for HuffNode {
    fn cmp(&self, other: &Self) -> Ordering {
        if other.weight().eq(&self.weight()) {
            match (self.element, other.element) {
                (None, None) => other.id.cmp(&self.id),
                (Some(_), None) => { Ordering::Greater }
                (None, Some(_)) => { Ordering::Less }
                (Some(el1), Some(el2)) => el1.cmp(&el2)
            }
        } else {
            other.weight().cmp(&self.weight())
        }
    }
}

In the new implementation, an id field was added, generated incrementally, to ensure deterministic behavior when comparing nodes where the element is None.

Encoding and Decoding: Ensuring the use of compressed bits for file I/O was also challenging because Rust doesn't provide direct bit manipulation. Conversion from bytes u8 to bits had to be implemented.

Assumptions-

The implementation assumes that the unique characters in the input string are greater than or equal to 2.
It uses serde_json to store mappings, which may be less efficient compared to Canonical Encoding, where mappings can be stored in $B*2^B$ bits of information (where B is the number of bits per symbol).
Custom error handling using the thiserror package was not implemented extensively, except for handling read file errors.

Running the code

You can run the encoding algorithm using -

cargo run -- /absolute-path-to-file

You can run the decoding algorithm using -

cargo run -- /absolute-path-to-huf-file -d

Extra Dependencies

Additional dependencies used in this project:

thiserror = "1.0.56" # custom error handling package
clap = { version = "4.5.0", features = ["derive"] } # command line argument parser packages
serde_json = "1.0.115" # serializing DataStructures to json, used for serializing hashmap to json

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
.idea		.idea
src		src
tests		tests
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
README.md		README.md
huffman.txt		huffman.txt
huffman_decode.txt		huffman_decode.txt
rust-toolchain.toml		rust-toolchain.toml
small.txt		small.txt
small_decode.txt		small_decode.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

huffman_coding

The challenges I faced in this project are -

Assumptions-

Running the code

Extra Dependencies

About

Releases

Packages

Contributors 2

Languages

Dhruvmehta18/huffman_coding

Folders and files

Latest commit

History

Repository files navigation

huffman_coding

The challenges I faced in this project are -

Assumptions-

Running the code

Extra Dependencies

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages