
Huffman Presentation

Date post: 09-Apr-2018
Upload: muhammad-zia-shahid

8/8/2019 Huffman Presantation

http://slidepdf.com/reader/full/huffman-presantation 1/24

DATA STRUCTURE

Huffman Tree: Project

Submitted to: Sir Abdul Wahab

Submitted by:
Muzmmal Hussain
Muhammad Zia Shahid
Riasat Ali


What Is a Huffman Tree?

"In computer science and information theory, Huffman coding is an entropy encoding algorithm used for lossless data compression."


Need of Huffman Coding

A Huffman tree is an elegant form of data compression. It is based on minimum-redundancy coding: we need to represent the data in a way that requires less space.


Is This Data Structure Derived from Any Data Structure?

A Huffman tree is derived from the binary tree: it is a full binary tree whose leaves hold the symbols, built so that frequent symbols sit close to the root.

Are There Any Data Structures That Are Derived from It?

Adaptive Huffman coding.


Advantages of Huffman

- Huffman coding is used to compress files.
- Huffman coding is used to minimize the length of the binary code.
- Huffman coding is used in compression tools and also in fax machines.
- It reduces the storage needed.
- It reduces the size of data, e.g. images, audio, video, and text.
- It reduces transmission cost and bandwidth.


Disadvantages of Huffman

Changing ensemble
- If the ensemble changes, the frequencies and probabilities change, and so the optimal coding changes; e.g. in text compression, symbol frequencies vary with context.
- Re-computing the Huffman code by running through the entire file in advance?!
- Saving/transmitting the code too?!

Does not consider 'blocks of symbols'
- After 'strings_of_ch', the next nine symbols 'aracters_' are predictable, but bits are used without conveying any new information.


Cost of Huffman

The run-time complexity of Huffman compression is O(n), where n is the number of symbols in the original data. Each pass over the data takes O(n) time.

The time to build the Huffman tree does not affect the complexity of Huffman compression, because the running time of that step depends only on the number of different symbols in the data, which in this implementation is a constant.
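As a rough illustration of one O(n) pass, the sketch below counts symbol frequencies in a single left-to-right scan. The slides do not show an implementation, so the function name `symbol_frequencies` and the use of Python's `collections.Counter` are assumptions made for this example.

```python
from collections import Counter

def symbol_frequencies(data: str) -> Counter:
    """Count every symbol in one pass over the data (O(n) in len(data))."""
    return Counter(data)

# Hypothetical input: the word used later in the slides.
freqs = symbol_frequencies("ABRACADABRA")
print(dict(freqs))
print("distinct symbols:", len(freqs))
```

The number of distinct symbols (5 here) is what the tree-building step depends on, not the length of the input.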


Text Compression

On a computer: changing the representation of a file so that it takes less space to store and/or less time to transmit.
- The original file can be reconstructed exactly from the compressed representation.

Different from data compression in general:
- Text compression has to be lossless.
- Compare with sound and images, where small changes and noise are tolerated.


Construction of Huffman

We can construct a lossless compression with the following algorithm.

Take the word ABRACADABRA.

What is the most economical way to write this string in a binary representation?

Generally speaking, if a text consists of N different characters, we need ⌈log₂ N⌉ bits to represent each one using a fixed-length encoding.
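The fixed-length bound can be checked for ABRACADABRA with a few lines of Python. This is only a sketch: `fixed_length_bits` is a hypothetical helper, and it computes ⌈log₂ N⌉ with integer arithmetic, `(N - 1).bit_length()`, to avoid floating-point rounding.

```python
def fixed_length_bits(text: str) -> tuple[int, int]:
    """Return (bits per symbol, total bits) for a fixed-length encoding."""
    n_symbols = len(set(text))
    # (N - 1).bit_length() equals ceil(log2(N)) for N >= 1; use at least 1 bit.
    bits_per_symbol = max(1, (n_symbols - 1).bit_length())
    return bits_per_symbol, bits_per_symbol * len(text)

per_symbol, total = fixed_length_bits("ABRACADABRA")
print(per_symbol, total)  # 3 33
```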


Thus, it would require 3 bits for each of the 5 different letters, or 33 bits for the 11 letters.

Can we do better?


YES!!!!

We can do better, provided:
- Some characters are more frequent than others.
- Characters may have different bit lengths, so that, for example, in the English alphabet the letter a may use only one or two bits, while the letter y may use several.
- We have a unique way of decoding the bit stream.


Using Variable-length Encoding (1)

Magic word: ABRACADABRA

Let
A = 0
B = 100
C = 1010
D = 1011
R = 11

Thus, ABRACADABRA = 01001101010010110100110

So 11 letters demand 23 bits < 33 bits, an improvement of about 30%.
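The bit count above can be reproduced by concatenating the code words. The sketch below uses the slide's code table; the names `CODE` and `encode` are assumptions made for this example.

```python
# Code table taken from the slide.
CODE = {"A": "0", "B": "100", "C": "1010", "D": "1011", "R": "11"}

def encode(text: str, code: dict[str, str]) -> str:
    """Concatenate the code word of every character in the text."""
    return "".join(code[ch] for ch in text)

bits = encode("ABRACADABRA", CODE)
print(bits, len(bits))                            # 01001101010010110100110 23
print(f"saving vs 33 bits: {1 - len(bits) / 33:.0%}")
```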


Using Variable-length Encoding (2)

However, there is a serious danger: how do we ensure unique reconstruction?

Let A -> 01 and B -> 0101

How to decode 010101?
- AB?
- BA?
- AAA?
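The ambiguity can be made concrete by enumerating every way to split the bit string into code words. This is a minimal brute-force sketch; `all_decodings` is a hypothetical helper, not something from the slides.

```python
def all_decodings(bits: str, code: dict[str, str]) -> list[str]:
    """Return every way to split `bits` into code words from `code`."""
    if not bits:
        return [""]  # one way to decode the empty string
    results = []
    for symbol, word in code.items():
        if bits.startswith(word):
            for rest in all_decodings(bits[len(word):], code):
                results.append(symbol + rest)
    return results

ambiguous = {"A": "01", "B": "0101"}
print(sorted(all_decodings("010101", ambiguous)))  # ['AAA', 'AB', 'BA']
```

Three different decodings of the same bits mean the receiver cannot recover the original text.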


No Problem...

...if we use prefix codes: no code word is a prefix of another code word.

Any prefix code can be represented by a binary tree; for an optimal code such as a Huffman code, the tree is full.
- Each leaf stores a symbol.
- Each internal node has two children: the left branch means 0, the right means 1.
- Code word = the path from the root to the leaf, reading each left branch as 0 and each right branch as 1.
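The prefix property is easy to test mechanically. The sketch below (with a hypothetical helper `is_prefix_free`) sorts the code words; after sorting, a word that is a prefix of any other word is always immediately followed by a word it prefixes, so comparing neighbours is enough.

```python
def is_prefix_free(code: dict[str, str]) -> bool:
    """True if no code word is a prefix of another code word."""
    words = sorted(code.values())
    # Only adjacent pairs in sorted order need to be compared.
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))

slide_code = {"A": "0", "B": "100", "C": "1010", "D": "1011", "R": "11"}
ambiguous = {"A": "01", "B": "0101"}
print(is_prefix_free(slide_code))  # True
print(is_prefix_free(ambiguous))   # False
```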


Prefix Codes (2)

ABRACADABRA

A = 0
B = 100
C = 1010
D = 1011
R = 11


Decoding is unique and simple!

Read the bit stream from left to right, starting from the root; whenever a leaf is reached, write down its symbol and return to the root.
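The decoding rule can be sketched as a walk over a small tree. In this example the tree is a nested dictionary keyed by "0" and "1", with symbols stored at the leaves; that representation and the names `build_tree` and `decode` are assumptions, since the slides only show the tree as a picture.

```python
def build_tree(code: dict[str, str]) -> dict:
    """Build a nested-dict binary tree; each leaf is a symbol string."""
    root: dict = {}
    for symbol, word in code.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})  # descend, creating inner nodes
        node[word[-1]] = symbol              # last bit leads to the leaf
    return root

def decode(bits: str, root: dict) -> str:
    """Follow one branch per bit; emit a symbol and restart at each leaf."""
    out, node = [], root
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):  # reached a leaf
            out.append(node)
            node = root
    return "".join(out)

code = {"A": "0", "B": "100", "C": "1010", "D": "1011", "R": "11"}
print(decode("01001101010010110100110", build_tree(code)))  # ABRACADABRA
```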


Constructing a Huffman Code (1)

Assume that the frequencies of the symbols are:
- A: 40, B: 20, C: 10, D: 10, R: 20

The smallest numbers are 10 and 10 (C and D), so connect them.


Constructing a Huffman Code (2)

C and D have already been used, and the new node above them (call it C+D) has value 20.

The smallest values are B, C+D, and R, all of which have value 20.
- Connect any two of these.

It is clear that the algorithm does not construct a unique tree, but even if we had chosen the other possible connection, the code would be optimal too!
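The claim that a different tie-break is equally good can be checked by comparing weighted code lengths. In the sketch below, `slide_code` is the code used throughout the slides (B joined with C+D), while `other_code` is a hypothetical alternative worked out for this example by joining B with R first and then C+D with A.

```python
FREQ = {"A": 40, "B": 20, "C": 10, "D": 10, "R": 20}

def weighted_length(freq: dict[str, int], code: dict[str, str]) -> int:
    """Sum of frequency times code-word length over all symbols."""
    return sum(freq[s] * len(code[s]) for s in freq)

# Code from the slides: B is connected to C+D.
slide_code = {"A": "0", "B": "100", "C": "1010", "D": "1011", "R": "11"}
# Assumed alternative: B is connected to R, then C+D is connected to A.
other_code = {"A": "00", "B": "10", "C": "010", "D": "011", "R": "11"}

print(weighted_length(FREQ, slide_code))  # 220
print(weighted_length(FREQ, other_code))  # 220
```

Both trees give the same total cost even though the individual code words differ.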


Constructing a Huffman Code (3)

The smallest value is R, while A and B+C+D have value 40.

Connect R to either of the others.


Constructing a Huffman Code (4)

Connect the final two nodes, adding 0 and 1 to each left and right branch respectively.
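The four construction steps can be sketched with a priority queue. This is one possible implementation, not the presenters' own: the name `huffman_code`, the use of `heapq`, and the counter-based tie-breaking are all assumptions. It also assumes a non-empty frequency table with string symbols. Because ties are broken differently, the code words may differ from those in the slides, but the total cost is the same.

```python
import heapq
from itertools import count

def huffman_code(freq: dict[str, int]) -> dict[str, str]:
    """Repeatedly connect the two smallest nodes, then label branches 0/1."""
    order = count()  # unique sequence numbers so the heap never compares trees
    heap = [(weight, next(order), symbol) for symbol, weight in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # smallest node
        w2, _, right = heapq.heappop(heap)  # second smallest node
        heapq.heappush(heap, (w1 + w2, next(order), (left, right)))

    code: dict[str, str] = {}

    def assign(node, prefix: str) -> None:
        if isinstance(node, tuple):          # internal node: (left, right)
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:                                # leaf: a symbol
            code[node] = prefix or "0"       # lone symbol still gets one bit

    assign(heap[0][2], "")
    return code

freq = {"A": 40, "B": 20, "C": 10, "D": 10, "R": 20}
code = huffman_code(freq)
print(code)
print(sum(freq[s] * len(code[s]) for s in freq))  # 220
```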


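Frequencies: A: 40, B: 20, C: 10, D: 10, R: 20

Resulting codes:
A = 0
B = 100
C = 1010
D = 1011
R = 11

[Figure: the completed Huffman tree, with leaves 20 (B), 10 (C), 10 (D), 20 (R) and A, the internal node C+D with value 20, and each left branch labelled 0 and each right branch labelled 1.]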

