anhsirk0 / Huffman_Encoding_Decoding

Huffman coding

Huffman coding is a lossless data compression algorithm. It assigns variable-length codes to input characters, with the length of each code based on the character's frequency: more frequent characters get shorter codes. The assigned codes are prefix codes, meaning that no character's code (bit sequence) is a prefix of the code assigned to any other character, so an encoded bit stream can be decoded unambiguously.
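
The idea above can be illustrated with a minimal sketch. It is not the code from huffman_main.cpp; the names `Node`, `buildCodes` and `isPrefixFree` are hypothetical and chosen only for this example. It counts character frequencies, repeatedly merges the two least frequent nodes using a min-heap, and then reads each character's code off the path from the root to its leaf.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree node for this sketch; children are indices into a vector.
struct Node {
    std::size_t freq;
    int left;              // -1 for a leaf
    int right;             // -1 for a leaf
    unsigned char symbol;  // meaningful only for leaves
};

// Build a code table (character -> string of '0'/'1') for the given text.
std::map<unsigned char, std::string> buildCodes(const std::string& text) {
    std::map<unsigned char, std::size_t> freq;
    for (unsigned char c : text) ++freq[c];

    std::vector<Node> nodes;
    typedef std::pair<std::size_t, int> Item;  // (frequency, node index)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item> > heap;
    for (const auto& kv : freq) {
        nodes.push_back(Node{kv.second, -1, -1, kv.first});
        heap.push(Item(kv.second, static_cast<int>(nodes.size()) - 1));
    }

    std::map<unsigned char, std::string> codes;
    if (nodes.empty()) return codes;
    if (nodes.size() == 1) {  // a single distinct character still needs one bit
        codes[nodes[0].symbol] = "0";
        return codes;
    }

    // Merge the two least frequent nodes until only the root remains.
    while (heap.size() > 1) {
        Item a = heap.top(); heap.pop();
        Item b = heap.top(); heap.pop();
        nodes.push_back(Node{a.first + b.first, a.second, b.second, 0});
        heap.push(Item(a.first + b.first, static_cast<int>(nodes.size()) - 1));
    }

    // Walk the tree: going left appends '0', going right appends '1'.
    std::vector<std::pair<int, std::string> > stack;
    stack.push_back(std::make_pair(heap.top().second, std::string()));
    while (!stack.empty()) {
        std::pair<int, std::string> cur = stack.back();
        stack.pop_back();
        const Node& n = nodes[cur.first];
        if (n.left < 0) {
            codes[n.symbol] = cur.second;
        } else {
            stack.push_back(std::make_pair(n.left, cur.second + "0"));
            stack.push_back(std::make_pair(n.right, cur.second + "1"));
        }
    }
    return codes;
}

// Check the prefix property: no code is a prefix of another character's code.
bool isPrefixFree(const std::map<unsigned char, std::string>& codes) {
    for (const auto& a : codes)
        for (const auto& b : codes)
            if (a.first != b.first &&
                b.second.compare(0, a.second.size(), a.second) == 0)
                return false;
    return true;
}
```

For the text `abracadabra`, `buildCodes` gives the most frequent character, `a`, a 1-bit code, and the whole string encodes to 23 bits instead of 88. The exact bit patterns depend on how ties between equal frequencies are broken, so they may differ from what the repo's encoder produces.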

Usage

clone the repo

git clone https://github.com/amandeepsirohi/Huffman_Encoding_Decoding.git --depth=1

Building

g++ huffman_main.cpp encode_text.cpp -o huff-encode

Test on random text file

create a text file with 500000 lines of random 32-character strings

cat /dev/urandom | tr -dc 'a-z0-9' | fold -w 32 | head -n 500000 > input.txt

compress input.txt file

./huff-encode input.txt out.txt

compare size

du -h input.txt
16M    input.txt
du -h out.txt
11M    out.txt
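
These sizes are about what one would expect. As a rough estimate, assuming each line is 32 characters plus a newline and that the 37 distinct byte values (a-z, 0-9, newline) are roughly equally likely:

$$
500000 \times 33 = 1.65 \times 10^{7}\ \text{bytes} \approx 15.7\ \text{MiB}
$$

$$
1.65 \times 10^{7} \times \frac{\log_2 37}{8} \approx 1.65 \times 10^{7} \times \frac{5.21}{8} \approx 1.07 \times 10^{7}\ \text{bytes} \approx 10.2\ \text{MiB}
$$

Huffman codes use whole bits per character and the output also has to store the code table, so the real file is a little larger than this lower bound. Random text like this is close to a worst case; natural-language text, with its uneven character frequencies, usually compresses better.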

decompress file

build decoder

g++ huffman_main.cpp decode_text.cpp -o huff-decode

decompress out.txt file

./huff-decode out.txt decoded.txt
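
Since the compression is lossless, the decompressed file should be byte-for-byte identical to input.txt. The repo does not ship a check for this; the following is a small sketch of one, where `filesIdentical` is a hypothetical helper written for this example and the file paths are whatever names were used for the original and decompressed files.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Return true only if both files can be opened and contain the same bytes.
bool filesIdentical(const std::string& pathA, const std::string& pathB) {
    std::ifstream a(pathA.c_str(), std::ios::binary);
    std::ifstream b(pathB.c_str(), std::ios::binary);
    if (!a || !b) return false;  // treat a missing file as a mismatch

    std::istreambuf_iterator<char> ia(a), ib(b), end;
    while (ia != end && ib != end) {
        if (*ia != *ib) return false;  // first differing byte
        ++ia;
        ++ib;
    }
    // Equal only if both files ended at the same position.
    return ia == end && ib == end;
}
```

Calling `filesIdentical` with the path of the original file and the path of the decoder's output returns true when the round trip preserved every byte.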

Languages

Language:C++ 100.0%