Huffman coding is a very effective technique for compressing data. It is a greedy algorithm. Normal storage uses 8 bits per character (ASCII form); the Huffman algorithm compresses the file and stores it compactly. It is a lossless data compression technique, and it generates prefix codes.
Prefix codes
The code of one character is never a prefix of the code of another character.
Example
a=00
b=11
c=01
d=10
No code is a prefix of any other code, so this is a prefix code.
Example
a=0
b=00
c=10
d=1
This is not a prefix code: the code of a (0) is a prefix of the code of b (00).
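Both examples above can be checked mechanically. A minimal sketch (the helper name `is_prefix_code` is ours, not from the text):

```python
def is_prefix_code(codes):
    """Return True if no code word is a prefix of another code word."""
    words = sorted(codes.values())
    # After sorting, any prefix sorts immediately before its extensions,
    # so it is enough to compare adjacent pairs.
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))

print(is_prefix_code({"a": "00", "b": "11", "c": "01", "d": "10"}))  # True
print(is_prefix_code({"a": "0", "b": "00", "c": "10", "d": "1"}))    # False
```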
Benefits of compressing data
- Less disk space.
- Faster file transfer.
- Reduced file size.
- Compressed data can be read and written faster than the original data.
Types of character code
- Fixed-length code.
- Variable-length code.
Fixed length code
Each letter is represented by an equal number of bits.
Variable length code
A variable-length code gives frequent characters short code words and infrequent characters long code words. Huffman codes are variable-length codes.
Steps to find the average length
- Arrange the characters in increasing order of frequency.
- Build the Huffman tree (left < right).
- Assign codes to the branches (left = 0, right = 1).
- Count the number of bits in each character's code.
- Find the average length = ∑(number of bits × frequency).
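The steps above can be sketched in Python with a min-heap. This is a minimal sketch; the names `huffman_codes` and `avg_length` are our own, not from the text.

```python
import heapq
import itertools

def huffman_codes(freqs):
    """Build Huffman codes for a {char: frequency} map."""
    counter = itertools.count()  # tie-breaker so dicts are never compared
    # Each heap entry: (total frequency, tie-breaker, {char: code_so_far})
    heap = [(f, next(counter), {ch: ""}) for ch, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # smallest frequency -> left branch
        f2, _, right = heapq.heappop(heap)  # next smallest -> right branch
        merged = {ch: "0" + code for ch, code in left.items()}   # left = 0
        merged.update({ch: "1" + code for ch, code in right.items()})  # right = 1
        heapq.heappush(heap, (f1 + f2, next(counter), merged))
    return heap[0][2]

def avg_length(codes, freqs):
    """Average length = sum(number of bits * frequency)."""
    return sum(len(codes[ch]) * freqs[ch] for ch in freqs)
```

For the example worked below (A=0.08, B=0.40, C=0.25, D=0.15, E=0.12) this produces the same codes and an average length of 2.15.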
Main points
- Characters are always leaf nodes.
- Characters with a high probability/frequency get the shortest codes.
Example
A text is made up of the characters A, B, C, D, E, occurring with probabilities 0.08, 0.40, 0.25, 0.15, 0.12 respectively. Find the average length of the optimal coding.
Solution
A B C D E
.08 .40 .25 .15 .12
Arrange them in increasing order
A E D C B
.08 .12 .15 .25 .40
Build Huffman tree
Conditions
- Left < Right.
- Characters are always leaf nodes.
Merge the two smallest, A = 0.08 and E = 0.12, into a node of weight 0.20:
D     (A,E)  C     B
0.15  0.20   0.25  0.40
Merge D = 0.15 and (A,E) = 0.20 into a node of weight 0.35:
C     (D,A,E)  B
0.25  0.35     0.40
Merge C = 0.25 and (D,A,E) = 0.35 into a node of weight 0.60:
B     (C,D,A,E)
0.40  0.60
Finally merge 0.40 and 0.60 into the root (1.00).
Assign codes to the branches
Left branch = 0
Right branch = 1
So
A=1110
B=0
C=10
D=110
E=1111
Average length of coding = ∑(number of bits × frequency of character)
Average length of coding = 4×0.08 + 1×0.40 + 2×0.25 + 3×0.15 + 4×0.12 = 2.15 bits per character.
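The arithmetic can be verified directly from the codes and probabilities found above:

```python
codes = {"A": "1110", "B": "0", "C": "10", "D": "110", "E": "1111"}
probs = {"A": 0.08, "B": 0.40, "C": 0.25, "D": 0.15, "E": 0.12}

# Average length = sum(number of bits * probability) over all characters
avg = sum(len(codes[ch]) * probs[ch] for ch in probs)
print(round(avg, 2))  # 2.15
```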