w-i-l / huffman-coding

An algorithm to encode characters with information theory

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Huffman code

An algorithm to encode data optimally



About it

The program takes as input the number of symbols, followed by them and their probabilities.
It displays the encoded version along with the Shanon's entropy and the current entropy.

It can use any ASCII symbols and any word as symbol.

How to use it

All the function flow is implemented in the main function. The coding standard is to assign 0 to the left node and 1 to right, if you wish to change it, just modify the 3rd parameter of encode function to 1.

How to run

g++ main.c util.c -o main

How it works

All the data are read from the user from the STDIN and store in a dynamic allocated array of type data declared in the .h file. The data struct stores the symbols, encoded version, symbol probablity and if it's read from the user in via the original attribute.

After it sorts all the symbols ascending based on their probability, it appends all the missing "inter_symbols" nodes.(i.e. the user choose 'A' and 'B' as symbols, the program will append "AB")

Having all symbols now, it sorts them descending and starts building the edges with their values(1 or 0).

Note that at this point the array has a structure of a heap, stored as breadth-first.

About

An algorithm to encode characters with information theory


Languages

Language:C 100.0%