Date post: | 05-Jan-2016 |
Upload: | anis-fleming |
Huffman Code and Data Decomposition
Pranav Shah CS157B
Why Data Compression?
Fixed-length data is inefficient for transfer and storage.
Types of Compression
Lossless compression: the exact original data is reconstructed from the compressed data. Nothing is lost.
Examples: Zip archives, bank account records.
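As a small illustration of the lossless property, the sketch below compresses a byte string with Python's standard `zlib` module (which implements DEFLATE, the method commonly used inside Zip archives) and checks that decompression gives back the exact original. The sample record is made up for this example.

```python
import zlib

# Hypothetical sample data: a repetitive record, which compresses well.
original = b"account=12345;balance=100.00;" * 50

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

# Lossless: every byte of the original is recovered.
print(len(original), len(compressed), restored == original)
```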
Types of Compression
Lossy compression: only an approximation of the original data is reconstructed from the compressed data.
Example: JPEG, where image quality degrades after repeated compression.
[Figure: two images, file sizes 87 KB and 26 KB]
Variable Length Bit Coding
Maps source symbols to a variable number of bits.
Allows sources to be compressed and decompressed with zero error.
Examples: Huffman coding, Lempel-Ziv coding, and arithmetic coding.
Variable Bit Coding Rules
Use the minimum number of bits. This speeds up transfers and reduces the storage required.
Variable Bit Coding Rules
No code may be a prefix of another code. Example: assume A has the code 01. Then B cannot have the code 010, because it begins with A's code.
Enable unambiguous left-to-right decoding. Example: if you read 01, you know that it is A and not any other character (not B!).
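The two rules above can be checked mechanically. The following is a minimal sketch, not part of the original slides: `is_prefix_free` and `decode` are hypothetical helper names, and the three-symbol code table `good` is invented for illustration.

```python
def is_prefix_free(codes):
    """Return True if no codeword is a prefix of a different codeword."""
    words = list(codes.values())
    return not any(
        i != j and words[j].startswith(words[i])
        for i in range(len(words))
        for j in range(len(words))
    )


def decode(bits, codes):
    """Decode a bit string left to right using a prefix-free code table."""
    reverse = {code: symbol for symbol, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:  # a complete codeword has been read
            out.append(reverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)


bad = {"A": "01", "B": "010"}              # B begins with A's code
good = {"A": "01", "B": "10", "C": "11"}   # hypothetical prefix-free table

print(is_prefix_free(bad))     # False
print(is_prefix_free(good))    # True
print(decode("011011", good))  # ABC
```

Because no codeword is a prefix of another, the decoder can emit a symbol as soon as the bits read so far match a codeword, without looking ahead.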
Huffman Code
An entropy encoding algorithm used for lossless data compression.
A variable-length code with average length L = l_1 p_1 + l_2 p_2 + … + l_M p_M, where l_1, l_2, …, l_M are the codeword lengths and p_1, p_2, …, p_M are the probabilities of the source symbols A_1, A_2, …, A_M being generated.
Uses a binary tree.
The code is generated with the binary Huffman code construction method. When all symbols are equally likely and their number is a power of two, it is equivalent to simple binary block encoding (example: ASCII).
Algorithm
(1) Make a leaf node for each code symbol. Add the generation probability of each symbol to its leaf node.
(2) Take the two nodes with the smallest probabilities and connect them into a new node. Add 1 or 0 to each of the two branches. The probability of the new node is the sum of the probabilities of the two connected nodes.
(3) If there is only one node left, the code construction is complete. If not, go back to (2).
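The three steps can be sketched in Python with a priority queue. This is one possible implementation under stated assumptions, not the author's code: the function name `huffman_codes` is hypothetical, symbols are assumed to be strings, and the lighter of the two merged nodes is arbitrarily given the branch 0. The input probabilities are the ones used in this document's example.

```python
import heapq
from itertools import count


def huffman_codes(probabilities):
    """Build a binary Huffman code from a {symbol: probability} mapping."""
    tie = count()  # tie-breaker so the heap never has to compare tree nodes
    # Step 1: one leaf per symbol, weighted by its probability.
    heap = [(p, next(tie), symbol) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    if not heap:
        return {}
    if len(heap) == 1:  # a single symbol still needs one bit
        return {heap[0][2]: "0"}
    # Steps 2 and 3: merge the two lightest nodes until only one remains.
    while len(heap) > 1:
        p0, _, left = heapq.heappop(heap)
        p1, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p0 + p1, next(tie), (left, right)))
    # Read the codes off the tree: 0 for the left branch, 1 for the right.
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


probs = {"A": 0.19, "B": 0.28, "C": 0.13, "D": 0.30, "E": 0.10}
print(huffman_codes(probs))
```

Swapping the 0 and 1 labels at any node gives a different but equally short code, so other implementations may print different bit patterns with the same codeword lengths.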
Example
Characters Frequency
A 19% (0.19)
B 28% (0.28)
C 13% (0.13)
D 30% (0.30)
E 10% (0.10)
Step 1
Take the two lowest frequencies and connect them into a node.
E (0.10) + C (0.13) = 0.23
Step 2
Take the next two lowest and connect them into a node.
A (0.19) + node (0.23) = 0.42
Step 3
Continue in the same way with the remaining nodes.
B (0.28) + D (0.30) = 0.58
Nodes left: 0.42 and 0.58
Completed Tree
1.0
  0.42
    0.19 (A)
    0.23
      0.10 (E)
      0.13 (C)
  0.58
    0.28 (B)
    0.30 (D)
Add 0 or 1 to each branch
1.0
  0: 0.42
    0: 0.19 (A)
    1: 0.23
      0: 0.10 (E)
      1: 0.13 (C)
  1: 0.58
    0: 0.28 (B)
    1: 0.30 (D)
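Reading a code off the labelled tree is a walk from the root to each leaf, collecting the branch bits on the way. A minimal sketch, assuming the tree is written as nested (0-branch, 1-branch) pairs with the symbols as leaves; `assign_codes` is a hypothetical helper name.

```python
# The labelled tree as nested (0-branch, 1-branch) pairs; leaves are symbols.
tree = (("A", ("E", "C")), ("B", "D"))


def assign_codes(node, prefix=""):
    """Collect the branch bits on the path from the root to every leaf."""
    if isinstance(node, str):  # leaf: the path so far is its code
        return {node: prefix}
    zero_branch, one_branch = node
    codes = assign_codes(zero_branch, prefix + "0")
    codes.update(assign_codes(one_branch, prefix + "1"))
    return codes


print(assign_codes(tree))
```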
Generated Code
Character   Frequency    Code
A           19% (0.19)   00
B           28% (0.28)   10
C           13% (0.13)   011
D           30% (0.30)   11
E           10% (0.10)   010
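With the code table in hand, the average length formula L = l_1 p_1 + … + l_M p_M can be evaluated directly and compared with a fixed-length code. The sketch below is illustrative only; the sample message "BADE" and the helper `encode` are made up for this example.

```python
import math

codes = {"A": "00", "B": "10", "C": "011", "D": "11", "E": "010"}
probs = {"A": 0.19, "B": 0.28, "C": 0.13, "D": 0.30, "E": 0.10}

# Average code length: sum of (codeword length * symbol probability).
average_length = sum(len(codes[s]) * probs[s] for s in codes)

# A fixed-length code for M symbols needs ceil(log2(M)) bits per symbol.
fixed_length = math.ceil(math.log2(len(codes)))


def encode(text):
    """Concatenate the codewords of the characters in text."""
    return "".join(codes[ch] for ch in text)


print(round(average_length, 2), fixed_length)  # 2.23 3
print(encode("BADE"))                          # 100011010
```

For these frequencies the Huffman code averages 2.23 bits per character, against 3 bits per character for a fixed-length code over five symbols.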