
Huffman Algorithm

Date post: 21-Feb-2018
Upload: ines-sassi

7/24/2019 Huffman Algorithm

http://slidepdf.com/reader/full/huffman-algorithm 1/44

Huffman Coding: Trees and Priority Queues

CS 102

Encoding and Compression of Data

• Fax Machines
• ASCII
• Variations on ASCII
– min number of bits needed
– cost of savings
– patterns
– modifications

Purpose of Huffman Coding

• Proposed by Dr. David A. Huffman in 1952
– “A Method for the Construction of Minimum-Redundancy Codes”
• Applicable to many forms of data transmission
– Our example: text files

The Basic Algorithm

• Huffman coding is a form of statistical coding
• Not all characters occur with the same frequency
• Yet all characters are allocated the same amount of space
– 1 char = 1 byte, be it e or x

The Basic Algorithm

• Any savings in tailoring codes to frequency of character?
• Code word lengths are no longer fixed like ASCII.
• Code word lengths vary and will be shorter for the more frequently used characters.

The (Real) Basic Algorithm

1. Scan text to be compressed and tally occurrence of all characters.
2. Sort or prioritize characters based on number of occurrences in text.
3. Build Huffman code tree based on prioritized list.
4. Perform a traversal of tree to determine all code words.
5. Scan text again and create new file using the Huffman codes.

Building a Tree
Scan the original text

• Consider the following short text:
Eerie eyes seen near lake.
• Count up the occurrences of all characters in the text

Building a Tree
Scan the original text

Eerie eyes seen near lake.
• What characters are present?

E e r i space y s n a l k .

Building a Tree
Scan the original text

Eerie eyes seen near lake.
• What is the frequency of each character in the text?

Char   Freq.   Char   Freq.   Char   Freq.
E      1       y      1       k      1
e      8       s      2       .      1
r      2       n      2
i      1       a      2
space  4       l      1
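The tally in the table above can be reproduced with a few lines of Java. This is a minimal sketch, not the course's code: the class name FrequencyTally and the method tally are made up for illustration, and a LinkedHashMap is assumed only so that characters print in order of first appearance.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Step 1 sketch: count how often each character occurs in the text. */
final class FrequencyTally {

    /** Returns character counts, keyed in order of first appearance. */
    static Map<Character, Integer> tally(String text) {
        Map<Character, Integer> counts = new LinkedHashMap<>();
        for (char c : text.toCharArray()) {
            counts.merge(c, 1, Integer::sum); // store 1, or add 1 to the existing count
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Character, Integer> counts = tally("Eerie eyes seen near lake.");
        counts.forEach((c, n) -> System.out.println("'" + c + "' " + n));
    }
}
```

Upper-case E and lower-case e are counted separately, which is why the example text yields 12 distinct characters.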

Building a Tree
Prioritize characters

• Create binary tree nodes with character and frequency of each character
• Place nodes in a priority queue
– The lower the occurrence, the higher the priority in the queue

Building a Tree
Prioritize characters

• Uses binary tree nodes

public class HuffNode {
    public char myChar;
    public int myFrequency;
    public HuffNode myLeft, myRight;
}

PriorityQueue<HuffNode> myQueue;
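One way to get "lowest occurrence first" from java.util.PriorityQueue is to give it a comparator on the frequency field. The sketch below assumes an extra field, mySeq, that is not on the slide: PriorityQueue makes no promise about the order of equal-priority elements, so an insertion number is used to break ties. The class name QueueDemo and the method dequeueOrder are also illustrative.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Sketch: HuffNode objects ordered so the lowest frequency is dequeued first. */
final class QueueDemo {

    static final class HuffNode {
        final char myChar;
        final int myFrequency;
        final int mySeq;            // assumption: insertion number, used only to break ties
        HuffNode myLeft, myRight;   // null for leaves

        HuffNode(char c, int frequency, int seq) {
            myChar = c;
            myFrequency = frequency;
            mySeq = seq;
        }
    }

    /** Lower frequency first; equal frequencies leave in insertion order. */
    static final Comparator<HuffNode> BY_PRIORITY =
        Comparator.<HuffNode>comparingInt(n -> n.myFrequency).thenComparingInt(n -> n.mySeq);

    /** Enqueues one node per character and returns the characters in dequeue order. */
    static List<Character> dequeueOrder(char[] chars, int[] freqs) {
        PriorityQueue<HuffNode> myQueue = new PriorityQueue<>(BY_PRIORITY);
        for (int i = 0; i < chars.length; i++) {
            myQueue.add(new HuffNode(chars[i], freqs[i], i));
        }
        List<Character> order = new ArrayList<>();
        while (!myQueue.isEmpty()) {
            order.add(myQueue.poll().myChar);
        }
        return order;
    }

    public static void main(String[] args) {
        // Characters in order of first appearance in "Eerie eyes seen near lake."
        char[] chars = {'E', 'e', 'r', 'i', ' ', 'y', 's', 'n', 'a', 'l', 'k', '.'};
        int[] freqs  = { 1,   8,   2,   1,   4,   1,   2,   2,   2,   1,   1,   1 };
        System.out.println(dequeueOrder(chars, freqs));
    }
}
```

With these inputs the characters leave the queue as E i y l k . r s n a space e, lowest frequency first.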

Building a Tree

• The queue after inserting all nodes
• Null pointers are not shown

E1 i1 y1 l1 k1 .1 r2 s2 n2 a2 sp4 e8

Building a Tree

• While priority queue contains two or more nodes
– Create new node
– Dequeue node and make it left subtree
– Dequeue next node and make it right subtree
– Frequency of new node equals sum of frequency of left and right children
– Enqueue new node back into queue
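The loop above can be sketched in Java as follows. This is an illustration under stated assumptions rather than the course's implementation: the class name TreeBuilder, the method build, and the tie-breaking field mySeq are all hypothetical. mySeq makes nodes of equal frequency leave the queue in the order they were created, which keeps the result deterministic.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.PriorityQueue;

/** Sketch of the merge loop: repeatedly join the two lowest-frequency nodes. */
final class TreeBuilder {

    static final class HuffNode {
        final char myChar;          // meaningful only for leaves
        final int myFrequency;
        final int mySeq;            // assumption: creation number, breaks frequency ties
        final HuffNode myLeft, myRight;

        HuffNode(char c, int frequency, int seq, HuffNode left, HuffNode right) {
            myChar = c;
            myFrequency = frequency;
            mySeq = seq;
            myLeft = left;
            myRight = right;
        }

        boolean isLeaf() { return myLeft == null && myRight == null; }
    }

    static HuffNode build(String text) {
        if (text.isEmpty()) throw new IllegalArgumentException("text must not be empty");

        // Tally characters in order of first appearance.
        Map<Character, Integer> counts = new LinkedHashMap<>();
        for (char c : text.toCharArray()) counts.merge(c, 1, Integer::sum);

        PriorityQueue<HuffNode> queue = new PriorityQueue<>(
            Comparator.<HuffNode>comparingInt(n -> n.myFrequency).thenComparingInt(n -> n.mySeq));
        int seq = 0;
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            queue.add(new HuffNode(e.getKey(), e.getValue(), seq++, null, null));
        }

        while (queue.size() >= 2) {
            HuffNode left = queue.poll();    // first dequeue becomes the left subtree
            HuffNode right = queue.poll();   // next dequeue becomes the right subtree
            queue.add(new HuffNode('\0', left.myFrequency + right.myFrequency, seq++, left, right));
        }
        return queue.poll();                 // the single remaining node is the root
    }

    public static void main(String[] args) {
        HuffNode root = build("Eerie eyes seen near lake.");
        System.out.println(root.myFrequency); // prints 26
    }
}
```

For the example text the root has a left child of frequency 10 and a right child of frequency 16. A text with a single distinct character produces a root that is itself a leaf.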

Building a Tree

Queue after each merge, front of the queue at the left. A bracketed entry is an internal node: its frequency, then the leaves beneath it.

• Start: E1 i1 y1 l1 k1 .1 r2 s2 n2 a2 sp4 e8
• Merge E1 and i1 (new node 2): y1 l1 k1 .1 r2 s2 n2 a2 [2: E i] sp4 e8
• Merge y1 and l1 (new node 2): k1 .1 r2 s2 n2 a2 [2: E i] [2: y l] sp4 e8
• Merge k1 and .1 (new node 2): r2 s2 n2 a2 [2: E i] [2: y l] [2: k .] sp4 e8
• Merge r2 and s2 (new node 4): n2 a2 [2: E i] [2: y l] [2: k .] sp4 [4: r s] e8
• Merge n2 and a2 (new node 4): [2: E i] [2: y l] [2: k .] sp4 [4: r s] [4: n a] e8
• Merge [2: E i] and [2: y l] (new node 4): [2: k .] sp4 [4: r s] [4: n a] [4: E i y l] e8
• Merge [2: k .] and sp4 (new node 6): [4: r s] [4: n a] [4: E i y l] [6: k . sp] e8

What is happening to the characters with a low number of occurrences?

• Merge [4: r s] and [4: n a] (new node 8): [4: E i y l] [6: k . sp] e8 [8: r s n a]
• Merge [4: E i y l] and [6: k . sp] (new node 10): e8 [8: r s n a] [10: E i y l k . sp]
• Merge e8 and [8: r s n a] (new node 16): [10: E i y l k . sp] [16: e r s n a]
• Merge [10: E i y l k . sp] and [16: e r s n a] (new node 26): [26: all twelve characters]
• After enqueueing this node there is only one node left in priority queue.

Building a Tree

Dequeue the single node left in the queue.
This tree contains the new code words for each character.
Frequency of root node should equal number of characters in text.

Eerie eyes seen near lake. 26 characters

Final tree (left child listed first, children indented under their parent):

26
  10
    4
      2: E1 i1
      2: y1 l1
    6
      2: k1 .1
      sp4
  16
    e8
    8
      4: r2 s2
      4: n2 a2

Encoding the File
Traverse Tree for Codes

• Perform a traversal of the tree to obtain new code words
• Going left is a 0, going right is a 1
• Code word is only completed when a leaf node is reached
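The traversal rule (left is 0, right is 1, a code word ends at a leaf) can be sketched as a recursive walk. In this illustration the final tree for "Eerie eyes seen near lake." is written out by hand so the block stands alone; the names CodeTable, leaf, join, collect, and codes are hypothetical helpers, and code words are kept as strings of '0' and '1' for readability rather than as packed bits.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: derive code words by walking a Huffman tree. */
final class CodeTable {

    static final class HuffNode {
        final char myChar;              // meaningful only for leaves
        final HuffNode myLeft, myRight;

        HuffNode(char c) { myChar = c; myLeft = null; myRight = null; }
        HuffNode(HuffNode left, HuffNode right) { myChar = '\0'; myLeft = left; myRight = right; }

        boolean isLeaf() { return myLeft == null && myRight == null; }
    }

    static HuffNode leaf(char c) { return new HuffNode(c); }
    static HuffNode join(HuffNode left, HuffNode right) { return new HuffNode(left, right); }

    /** The example's final tree, built by hand (frequencies omitted). */
    static HuffNode exampleTree() {
        HuffNode ten = join(join(join(leaf('E'), leaf('i')), join(leaf('y'), leaf('l'))),
                            join(join(leaf('k'), leaf('.')), leaf(' ')));
        HuffNode sixteen = join(leaf('e'),
                                join(join(leaf('r'), leaf('s')), join(leaf('n'), leaf('a'))));
        return join(ten, sixteen);
    }

    /** Depth-first walk: append 0 going left, 1 going right, record the path at each leaf. */
    static void collect(HuffNode node, String path, Map<Character, String> codes) {
        if (node.isLeaf()) {
            // A tree that is a single leaf has an empty path; give it one bit.
            codes.put(node.myChar, path.isEmpty() ? "0" : path);
            return;
        }
        collect(node.myLeft, path + "0", codes);
        collect(node.myRight, path + "1", codes);
    }

    static Map<Character, String> codes(HuffNode root) {
        Map<Character, String> codes = new LinkedHashMap<>();
        collect(root, "", codes);
        return codes;
    }

    public static void main(String[] args) {
        codes(exampleTree()).forEach((c, code) -> System.out.println("'" + c + "' " + code));
    }
}
```

Characters that sit deeper in the tree, the rare ones, end up with longer code words; e, the most frequent, gets the two-bit code 10.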

Encoding the File
Traverse Tree for Codes

Char   Code
E      0000
i      0001
y      0010
l      0011
k      0100
.      0101
space  011
e      10
r      1100
s      1101
n      1110
a      1111

Encoding the File

• Rescan text and encode file using new code words

Eerie eyes seen near lake.

Encoded bit stream, one line per word:

0000101100000110011
100010101101011
110110101110011
11101011111100011
001111110100100101

• Why is there no need for a separator character?
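Encoding is a table lookup per character, and the separator question comes down to the prefix property: no code word is the start of another. The sketch below checks both. It is illustrative only: Encoder, slideCodes, encode, and isPrefixFree are made-up names, the table is the one derived for "Eerie eyes seen near lake.", and the output is a string of '0' and '1' characters instead of real packed bits.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: encode text with a fixed code table and check the prefix property. */
final class Encoder {

    /** Code table for the example text. */
    static Map<Character, String> slideCodes() {
        String symbols = "Eiylk. ersna";
        String[] words = {"0000", "0001", "0010", "0011", "0100", "0101",
                          "011", "10", "1100", "1101", "1110", "1111"};
        Map<Character, String> codes = new HashMap<>();
        for (int i = 0; i < words.length; i++) codes.put(symbols.charAt(i), words[i]);
        return codes;
    }

    /** Concatenates the code word of every character, with nothing in between. */
    static String encode(String text, Map<Character, String> codes) {
        StringBuilder bits = new StringBuilder();
        for (char c : text.toCharArray()) {
            String code = codes.get(c);
            if (code == null) throw new IllegalArgumentException("no code for '" + c + "'");
            bits.append(code);
        }
        return bits.toString();
    }

    /** True when no code word is a prefix of a different symbol's code word. */
    static boolean isPrefixFree(Map<Character, String> codes) {
        for (Map.Entry<Character, String> a : codes.entrySet()) {
            for (Map.Entry<Character, String> b : codes.entrySet()) {
                if (!a.getKey().equals(b.getKey()) && b.getValue().startsWith(a.getValue())) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<Character, String> codes = slideCodes();
        String bits = encode("Eerie eyes seen near lake.", codes);
        System.out.println(bits);
        System.out.println(bits.length() + " bits, prefix-free: " + isPrefixFree(codes));
    }
}
```

Because every character sits at a leaf of the tree, a decoder always knows where one code word ends and the next begins.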

Encoding the File
Results

• Have we made things any better?
• 84 bits to encode the text
• ASCII would take 8 * 26 = 208 bits
• If a modified fixed-length code is used, 4 bits per character are needed (12 distinct characters). Total bits 4 * 26 = 104. Savings not as great.
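The three totals can be checked with a little arithmetic: the variable-length total is the sum over all characters of frequency times code word length. A minimal sketch, with illustrative names (BitCount, variableLengthBits, fixedWidth) and the frequencies and code lengths of the "Eerie eyes seen near lake." example typed in by hand:

```java
/** Sketch: compare encoded sizes for the example text. */
final class BitCount {

    /** Total encoded length: each symbol's frequency times its code word length. */
    static int variableLengthBits(int[] freqs, int[] codeLengths) {
        int bits = 0;
        for (int i = 0; i < freqs.length; i++) bits += freqs[i] * codeLengths[i];
        return bits;
    }

    /** Smallest width w such that 2^w fixed-length codes cover all distinct symbols. */
    static int fixedWidth(int distinctSymbols) {
        int w = 1;
        while ((1 << w) < distinctSymbols) w++;
        return w;
    }

    public static void main(String[] args) {
        // Order: E i y l k . space e r s n a
        int[] freqs   = {1, 1, 1, 1, 1, 1, 4, 8, 2, 2, 2, 2};
        int[] lengths = {4, 4, 4, 4, 4, 4, 3, 2, 4, 4, 4, 4};
        int chars = 26;
        System.out.println("Huffman:     " + variableLengthBits(freqs, lengths));   // 84
        System.out.println("Fixed 4-bit: " + fixedWidth(freqs.length) * chars);     // 104
        System.out.println("ASCII:       " + 8 * chars);                            // 208
    }
}
```

These figures count only the encoded text; they leave out whatever it costs to give the receiver the code table or tree.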

Decoding the File

• How does receiver know what the codes are?
• Tree constructed for each text file.
– Considers frequency for each file
– Tree must be sent with the file, which costs space, especially for smaller files
• Tree predetermined
– based on statistical analysis of text files or file types
• Data transmission is bit based versus byte based

Decoding the File

• Once receiver has tree it scans incoming bit stream
• 0 ⇒ go left
• 1 ⇒ go right

101000110111101111
01111110000110101
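Decoding walks the tree one bit at a time and restarts at the root whenever a leaf is reached. The sketch below is an illustration under assumptions: the class Decoder and its helpers are hypothetical, the tree for "Eerie eyes seen near lake." is rebuilt by hand, bits arrive as a string of '0' and '1' characters, and the tree is assumed to have at least two leaves.

```java
/** Sketch: decode a bit string by walking a Huffman tree. */
final class Decoder {

    static final class HuffNode {
        final char myChar;              // meaningful only for leaves
        final HuffNode myLeft, myRight;

        HuffNode(char c) { myChar = c; myLeft = null; myRight = null; }
        HuffNode(HuffNode left, HuffNode right) { myChar = '\0'; myLeft = left; myRight = right; }

        boolean isLeaf() { return myLeft == null && myRight == null; }
    }

    static HuffNode leaf(char c) { return new HuffNode(c); }
    static HuffNode join(HuffNode left, HuffNode right) { return new HuffNode(left, right); }

    /** The example's final tree, built by hand (frequencies omitted). */
    static HuffNode exampleTree() {
        HuffNode ten = join(join(join(leaf('E'), leaf('i')), join(leaf('y'), leaf('l'))),
                            join(join(leaf('k'), leaf('.')), leaf(' ')));
        HuffNode sixteen = join(leaf('e'),
                                join(join(leaf('r'), leaf('s')), join(leaf('n'), leaf('a'))));
        return join(ten, sixteen);
    }

    /** 0 goes left, 1 goes right; a leaf emits its character and resets to the root. */
    static String decode(String bits, HuffNode root) {
        StringBuilder text = new StringBuilder();
        HuffNode node = root;
        for (char bit : bits.toCharArray()) {
            if (bit != '0' && bit != '1') {
                throw new IllegalArgumentException("not a bit: " + bit);
            }
            node = (bit == '0') ? node.myLeft : node.myRight;
            if (node.isLeaf()) {
                text.append(node.myChar);
                node = root;
            }
        }
        if (node != root) {
            throw new IllegalArgumentException("bit stream ends inside a code word");
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String bits = "101000110111101111" + "01111110000110101";
        System.out.println(decode(bits, exampleTree())); // prints: eel snarl.
    }
}
```

The 35 bits shown above decode to "eel snarl." with this tree.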

Summary

• Huffman coding is a technique used to compress files for transmission
• Uses statistical coding
– more frequently used symbols have shorter code words
• Works well for text and fax transmissions
• An application that uses several data structures

