Page 1: Image Processing Architecture

ECE-C490 Winter 2004
Image Processing Architecture
Lecture 2, 1/8/2004
Lossless Coding (More)
Oleh Tretiak
Drexel University

Page 2: Review: Need for Compression

• Example: Fax document
  - 8.5 x 11 in., 200 dpi: 8.5 x 11 x 200^2 = 3.74 Mbits
  - At 28.8 kbits/sec: 3,740,000/28,800 = 130 sec
  - Typical compression ratio = 15; with compression, 130/15 = 8.65 sec
• Example: Video
  - 640 x 480 pictures, 24 bits/sample, 30 frames/sec
  - 640 x 480 x 24 x 30 = 2.21E+08 bits/sec = 2.76E+07 bytes/sec
  - A CD-ROM stores 650 Mbytes -> playing time = 650/27.6 = 23.5 sec
  - With compression, 74 minutes of low (VHS) quality video can be stored on a CD-ROM

Page 3: Review: How is Compression Possible?

• Statistical Redundancy
  - Adjacent pixels are similar (spatial correlation)
  - Color components are similar (spectral correlation)
  - Successive frames are similar (temporal correlation)
• Perceptual Redundancy
  - Data can be eliminated from the signal with no visible changes.
• Lossy compression:
  - Send an (inferior) picture that requires fewer bits.

Page 4: Example: Compression

• We have 1000 symbols from the alphabet {a b c d}
• Fixed-length coding: 2 bits per symbol, total bits = 2 x 1000 = 2000
• Variable-length coding, since some symbols are more frequent:

Symbol  Number  Code  Bits
a       900     0     900
b       50      10    100
c       25      110   75
d       25      111   75

Total bits = 900 + 100 + 75 + 75 = 1150
Average bits/symbol = 1150/1000 = 1.15 < 2
Compression = source bits/code bits = 2/1.15 = 1.74
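As a quick check of the arithmetic on this slide, here is a minimal Python sketch (the counts and code words are those in the table; the variable names are mine):

```python
# Compare fixed-length (2 bits/symbol) with the variable-length code above.
counts = {"a": 900, "b": 50, "c": 25, "d": 25}
code = {"a": "0", "b": "10", "c": "110", "d": "111"}

total_bits = sum(counts[s] * len(code[s]) for s in counts)  # 1150
avg_bits = total_bits / sum(counts.values())                # 1.15 bits/symbol
print(total_bits, avg_bits, 2 / avg_bits)                   # compression ~1.74
```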

Page 5: Review: Typical Encoder System

[Block diagram: Signal -> Preprocessor -> Entropy Coder -> Data]

• Issues
  - Constant Bit Rate vs. Variable Bit Rate
    o In a lossless encoder, bit rate depends on compression efficiency
    o Variable bit rate is undesirable in real-time applications
    o In a lossy encoder, bit rate can be kept constant by varying quality
  - Single or Multiple Sample encoding
    o Multiple sample encoding is usually more efficient, but also more complex.

Page 6: Review: Information Theory

• Let the i-th symbol have probability p_i. The information of this symbol is defined as log2(1/p_i) bits. The average information contributed by this symbol is p_i log2(1/p_i) bits. The entropy of the symbol set is defined as

  H = \sum_{i=1}^{N} p_i \log_2(1/p_i)  bits

• Shannon Coding Theorem: it is possible to encode a (long) sequence of symbols with H + ε bits per symbol, for any ε > 0.
• How to do it?
• If the symbols are statistically independent, it is impossible to encode them with fewer than H bits per symbol.
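The entropy formula translates directly into code; a minimal sketch (using the relative frequencies from the Page 4 example is my own choice):

```python
import math

def entropy(probs):
    """H = sum of p * log2(1/p), in bits per symbol."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# Relative frequencies of a, b, c, d from the Page 4 example:
print(entropy([0.9, 0.05, 0.025, 0.025]))  # ~0.62 bits/symbol
```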

Page 7: Review: Entropy and Huffman Codes

• Theory
  - Information
  - Entropy
  - Shannon Compression Theorem
• Practice
  - Deriving symbols from signals
  - Huffman encoding
    o Coder construction
    o Encoders
    o Decoders

Page 8: Review: Variable Length Codes

• Symbol set: s_i, i = 1 … N; p_i — symbol probability
• Code: (c_i, l_i), where c_i is a sequence of 1's and 0's of length l_i
• The code words must be decodable: the transmitted bit stream is just a string of 1's and 0's; code word boundaries are not indicated. For a decodable code, the code word boundaries can be found uniquely from the transmitted sequence.
• Sufficient condition for decodability: a code is decodable if no code word is the beginning (prefix) of another code word. This is called the prefix condition.
• Average code word length:

  L_ave = \sum_{i=1}^{N} p_i l_i

• For a decodable code, L_ave ≥ H. (A prefix-condition check is sketched below.)
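The prefix condition is easy to test mechanically; a small sketch (sorting makes any prefix adjacent to a word it begins):

```python
def is_prefix_free(codewords):
    """True if no code word is a prefix of another (the prefix condition)."""
    words = sorted(codewords)
    # After lexicographic sorting, a prefix sorts immediately before
    # some word that starts with it, so adjacent checks suffice.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

print(is_prefix_free(["0", "10", "110", "111"]))  # True  (Page 4 code)
print(is_prefix_free(["0", "01", "11"]))          # False ("0" begins "01")
```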

Page 9: This Lecture

• Huffman codes
  - How to construct codes
  - Encoding and decoding algorithms
  - Huffman codes with constrained length
• Golomb and Rice coding
• Arithmetic coding
  - Coding when H < 1

Page 10: Construction of a Huffman Code

• Tree construction
  - Order the symbols according to probabilities
  - Apply a contraction process to the two symbols with lowest probability
    o Assign a new (hypothetical) symbol to these two, with probability equal to the sum of the two symbol probabilities
  - Repeat until one symbol is left
• Code construction
  - For the two initial branches of the tree, attach bit '0' and '1' at the end of the code word
  - For both branches:
    o If the branch is a source symbol, done
    o Else repeat the above process
(A runnable sketch of this construction follows.)
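A minimal Python sketch of the construction (heapq-based; the tie-breaking counter is my own device, so individual code words may differ from the slides even though the lengths are optimal):

```python
import heapq
import itertools

def huffman_code(probs):
    """Build a Huffman code. probs: dict symbol -> probability.
    Returns a dict symbol -> code word string."""
    counter = itertools.count()  # tie-breaker so equal probabilities never compare dicts
    heap = [(p, next(counter), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, code0 = heapq.heappop(heap)  # contract the two lowest-probability nodes
        p1, _, code1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in code0.items()}        # attach '0' to one branch
        merged.update({s: "1" + c for s, c in code1.items()})  # attach '1' to the other
        heapq.heappush(heap, (p0 + p1, next(counter), merged)) # new hypothetical symbol
    return heap[0][2]
```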

Page 11: Code construction example

Symbol  Probability
?       0.10
e       0.30
k       0.05
l       0.20
r       0.20
u       0.10
w       0.05

Contraction steps (symbols sorted by probability; A … E denote merged symbols):

ST 1: e 0.30, l 0.20, r 0.20, ? 0.10, u 0.10, k 0.05, w 0.05
ST 2: e 0.30, l 0.20, r 0.20, ? 0.10, u 0.10, A 0.10
ST 3: e 0.30, l 0.20, r 0.20, B 0.20, ? 0.10
ST 4: e 0.30, C 0.30, l 0.20, r 0.20
ST 5: D 0.40, e 0.30, C 0.30
ST 6: D 0.40, E 0.60

Merged symbols: A = {k, w}; B = {u, A}; C = {B, ?}; D = {l, r}; E = {e, C}

[Code tree: root -> D (0), E (1); D -> r (00), l (01); E -> e (10), C (11); C -> B (110), ? (111); B -> u (1100), A (1101); A -> k (11010), w (11011)]

Resulting code:

Symbol  Code   Length
r       00     2
l       01     2
e       10     2
u       1100   4
k       11010  5
w       11011  5
?       111    3

H = 2.55

L_ave = \sum_{i=1}^{N} p_i l_i = 0.10×3 + 0.30×2 + 0.05×5 + ... + 0.05×5 = 2.6
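Running the huffman_code sketch from Page 10 on this alphabet reproduces the average length (the exact bit patterns depend on how ties are broken, but every Huffman code for a given source has the same L_ave):

```python
probs = {"?": 0.10, "e": 0.30, "k": 0.05, "l": 0.20,
         "r": 0.20, "u": 0.10, "w": 0.05}
code = huffman_code(probs)
lave = sum(probs[s] * len(code[s]) for s in probs)
print(code)  # code words may differ from the slide; lengths are equivalent
print(lave)  # 2.6 bits/symbol, vs. H = 2.55
```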

Page 12: Variable Length Code: Encoding

• Source sequence: werule?

Symbol  Code   Length
?       111    3
e       10     2
k       11010  5
l       01     2
r       00     2
u       1100   4
w       11011  5

Encoded output: 11011 10 00 1100 01 10 111 -> 11011100011000110111
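Encoding with a prefix code is plain concatenation; a one-line sketch using the table above:

```python
# Code table from this slide.
code = {"?": "111", "e": "10", "k": "11010", "l": "01",
        "r": "00", "u": "1100", "w": "11011"}
print("".join(code[s] for s in "werule?"))  # 11011100011000110111
```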

Page 13: Prefix code: bit-serial decoding

• Walk the code tree from the root, one input bit at a time; when a leaf is reached, output its symbol and return to the root.
• Algorithm steps for input 11011100011000110111 (bold denotes output symbols): E C B A w, E e, D r, E C B u, D l, ...

[Code tree as on Page 11]
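A bit-serial decoder can be sketched without an explicit tree by growing a window until it matches a code word (this works because the code satisfies the prefix condition):

```python
def decode_serial(bits, code):
    """Decode a prefix code one bit at a time."""
    inverse = {c: s for s, c in code.items()}
    out, word = [], ""
    for b in bits:
        word += b
        if word in inverse:      # reached a leaf of the code tree
            out.append(inverse[word])
            word = ""            # return to the root
    return "".join(out)

code = {"?": "111", "e": "10", "k": "11010", "l": "01",
        "r": "00", "u": "1100", "w": "11011"}
print(decode_serial("11011100011000110111", code))  # werule?
```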

Page 14: Prefix code: table decoding

• Let k be the maximum code word length. Construct a table with 2^k entries. Each table location contains an output symbol and the code word length. Order the code words by binary value. A code word of length l occupies 2^(k-l) entries.

Symbol  Code
r       00
l       01
e       10
?       111
u       1100
k       11010
w       11011

Since k = 5, we use 32 table entries. Code word '00' uses 2^(5-2) = 8 entries; each holds output symbol 'r' and length 2. The next 8 entries are for 'l', the following 8 for 'e', the following 4 for '?', etc.

Decoding: take k = 5 bits from the encoded sequence and decode them by table lookup. From the table, find the symbol length l, discard the first l bits of the word used for lookup, and take l additional bits from the encoded sequence. (A sketch follows.)
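A sketch of the table build and the lookup loop described above (padding the tail with zeros so the final lookup is full width is my own convenience):

```python
def build_table(code, k):
    """2^k-entry table: index = next k bits, entry = (symbol, code length)."""
    table = [None] * (1 << k)
    for sym, word in code.items():
        base = int(word, 2) << (k - len(word))  # first index covered by this word
        for i in range(1 << (k - len(word))):   # a length-l word fills 2^(k-l) slots
            table[base + i] = (sym, len(word))
    return table

def decode_table(bits, code, k):
    table, out, pos = build_table(code, k), [], 0
    padded = bits + "0" * k                     # so the last lookup has k bits
    while pos < len(bits):
        sym, l = table[int(padded[pos:pos + k], 2)]
        out.append(sym)
        pos += l                                # discard l bits, refill from stream
    return "".join(out)

code = {"r": "00", "l": "01", "e": "10", "?": "111",
        "u": "1100", "k": "11010", "w": "11011"}
print(decode_table("11011100011000110111", code, 5))  # werule?
```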

Page 15: Lookup table decoding example

Input stream: 11011100011000110111 (code table as on Page 14)

• First lookup code: 11011. Output = 'w', l = 5; read 5 bits from the input stream.
• Lookup code: 10001. Output = 'e', l = 2; discard the initial '10' from the lookup code, read two more bits.
• Lookup code: 00110. Output = 'r', l = 2; discard the initial '00', read two more bits.
• Lookup code: 11000. Output = 'u', l = 4; discard the initial '1100', read four more bits.
• Lookup code: 01101. Output = 'l', ...

Page 16: Huffman Codes With Constrained Length

• Some codes used in practice have longest code words around 20 bits: table decoding would require a huge (2^20-entry) lookup table.
• Solution concept 1: 'escape' sequences
• Solution concept 2: limit the maximum code word length

Page 17: Code construction - Escape

• Escape sequence approach
  - High-probability symbols are decoded with a lookup table of modest size.
  - All low-probability symbols are lumped together: they are assigned one symbol (the escape symbol) in the high-probability table.
  - To encode a low-probability symbol, send the escape symbol plus a low-probability code word (this will be another Huffman code).
  - Decoding: use the high-probability table; if the escape symbol is encountered, switch to the low-probability table.
• This approach uses a hierarchical set of lookup tables for decoding.

Page 18: Rule of Thumb

• If a symbol has probability p, then the length of its VLC code word should be about l = -log2(p)
• Examples
  - p = 0.5, l = 1 bit
  - p = 0.1, l = 3 bits
  - p = 0.01, l = 7 bits

Page 19: Two level hierarchy: method

• Let S be the source and L the maximum code word length.
• Sort the symbols by probability, so that p_1 ≥ p_2 ≥ … ≥ p_N.
• Split the source into two sets:

  S_1 = \{ s_i \mid p_i > 1/2^L \} = \{ s_1, s_2, \ldots, s_{t-1} \}
  S_2 = \{ s_i \mid p_i \le 1/2^L \} = \{ s_t, s_{t+1}, \ldots, s_N \}

• Create a special symbol Q with probability

  q = \sum_{i=t}^{N} p_i,   where P(S_2) = q ≥ 1/2^L.

• Augment S_1 by Q to form the new set W. Design a Huffman code for this set.
• Encoding: for symbols in S_1, output the code word. For symbols in S_2, send Q, then the symbol without encoding (this requires ⌈log2 |S_2|⌉ bits). (Sketched below.)
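A sketch of the split and code design, reusing the huffman_code function from Page 10 (the fixed-bit numbering of the S2 symbols is one arbitrary choice):

```python
import math

def two_level_code(probs, L):
    """Constrained-length code: Huffman over S1 + escape Q, raw bits for S2."""
    thresh = 1.0 / (1 << L)
    s1 = {s: p for s, p in probs.items() if p > thresh}
    s2 = sorted(s for s, p in probs.items() if p <= thresh)
    q = sum(probs[s] for s in s2)            # probability of the escape symbol Q
    top = huffman_code({**s1, "Q": q})       # code for W = S1 + {Q}
    fixed = math.ceil(math.log2(len(s2)))    # ceil(log2 |S2|) raw bits per S2 symbol
    return {**{s: top[s] for s in s1},
            **{s: top["Q"] + format(i, f"0{fixed}b") for i, s in enumerate(s2)}}
```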

Page 20: Example

• H = 2.65, L_ave = 2.69
• Max l is 9; the decoding table requires 512 entries
• Target max length 5: min p > 1/2^5 = 1/32 = 0.03125
• S1 = (a, b, c, d, e, f), S2 = (g … p)
• P(S2) = 0.0559; min p in S1 = 0.0513
• Both are greater than 1/32
• Expect the longest code word to be 5 bits or less

Unconstrained Huffman code:

Symbol  p_i     l_i  Codeword
a       0.2820  2    11
b       0.2786  2    10
c       0.1419  3    011
d       0.1389  3    010
e       0.0514  4    0011
f       0.0513  4    0010
g       0.0153  5    00011
h       0.0153  5    00010
i       0.0072  6    000011
j       0.0068  6    000010
k       0.0038  7    0000011
l       0.0032  7    0000010
m       0.0019  8    00000011
n       0.0013  8    00000010
o       0.0007  9    000000011
p       0.0004  9    000000010

Page 21: Example - Continued

• Build a Huffman code for a-f plus S2
• If a-f needs encoding, send the code word
• If g-p needs encoding, send the code for S2, followed by a binary number (one of 10) -> 4 bits

Symbol  p_i     l_i  Codeword
a       0.2820  2    00
b       0.2786  2    10
c       0.1419  3    010
d       0.1389  3    011
e       0.0514  4    1100
f       0.0513  4    1101
S2      0.0559  3    111

Symbol  p_i     l_i  Codeword
g       0.0153  4    111 + 0000
h       0.0153  4    111 + 0001
i       0.0072  4    111 + 0010
j       0.0068  4    111 + 0011
k       0.0038  4    111 + 0100
l       0.0032  4    111 + 0101
m       0.0019  4    111 + 0110
n       0.0013  4    111 + 0111
o       0.0007  4    111 + 1000
p       0.0004  4    111 + 1001

Example: encode "cam" -> 010 00 111 0110 (checked in the sketch below)
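The "cam" example can be verified with a hand-built dict of the two tables above (a hypothetical helper matching the slide's assignments):

```python
code = {"a": "00", "b": "10", "c": "010", "d": "011",
        "e": "1100", "f": "1101"}
code.update({s: "111" + format(i, "04b")   # escape + 4 raw bits
             for i, s in enumerate("ghijklmnop")})
print(" ".join(code[s] for s in "cam"))    # 010 00 1110110
```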

Page 22: Performance Analysis

• What is L_ave?
• L_ave = (average bits to send the VLC) + P(S2)×4 = 2.5421 + 0.0559×4 = 2.766
• How does this compare with the unconstrained Huffman code?
• For the Huffman code, L_ave = 2.694
• Other ideas?

Page 23: Constrained Length - 2nd method

• Long code words are produced by low-probability symbols
  - Idea: modify the probabilities, increasing the low probability values
  - Design a Huffman code for the modified probabilities
• This approach produces codes with lower maximum length, but larger average length
• It leads to simpler encoder and decoder structures than the escape (hierarchical) approach, but performance may not be as good

Page 25: Huffman vs. Arithmetic Code

• The lowest L_ave for a Huffman code is 1 bit/symbol. Suppose H << 1?
  - One option: use one code symbol for several source symbols
  - Another option: arithmetic coding
• Idea behind arithmetic coding:
  - Represent the probability of a sequence by a binary number

Page 27: Arithmetic Encoding

• Assume the source alphabet has values 0 and 1, with p_0 = p, p_1 = 1 - p.
• A sequence of symbols s_1, s_2, …, s_m is represented by a probability interval found as follows:
  - Initialize: lo = 0; range = 1
  - for i = 1 to m
    o if s_i = 0
      § range = range*p
    o else // s_i = 1
      § lo = lo + range*p
      § range = range*(1 - p)
    o end
  - end
• Send a binary fraction x such that lo ≤ x < lo + range. This requires ⌈-log2(range)⌉ bits. (Sketched below.)
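A direct transcription of this procedure in Python (choosing x as the smallest ⌈-log2 range⌉-bit fraction ≥ lo is my own way of picking x):

```python
import math

def arith_encode(symbols, p):
    """Interval computation above; p is the probability of symbol 0.
    Returns the bits of a fraction x with lo <= x < lo + range."""
    lo, rng = 0.0, 1.0
    for s in symbols:
        if s == 0:
            rng *= p
        else:
            lo += rng * p
            rng *= 1.0 - p
    nbits = math.ceil(-math.log2(rng))
    x = math.ceil(lo * (1 << nbits))    # smallest nbits-bit fraction >= lo
    return format(x, f"0{nbits}b")

print(arith_encode([1, 1, 0, 1], 0.2))  # '0111' = 0.4375, as on the next slide
```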

Page 29: Arithmetic coding: example

• p_0 = 0.2, source sequence is 1101

symbol  low     range
        0       1
1       0.2000  0.8000
1       0.3600  0.6400
0       0.3600  0.1280
1       0.3856  0.1024

Number of bits = ⌈-log2(0.1024)⌉ = 4
low (base 2) = .01100010; (low + range) (base 2) = .01111100
Bits sent: 0111

Page 30: Arithmetic Decoding (lo/hi form)

• We receive x, a binary fraction
• lo = 0; hi = 1
• for i = 1 to m
  - if (x - lo) < p*(hi - lo)
    o s_i = 0
    o hi = lo + (hi - lo)*p
  - else
    o s_i = 1
    o lo = lo + (hi - lo)*p
  - end
• end

Page 31: Arithmetic Decoding (lo/range form)

• We receive x, a binary fraction
• lo = 0; range = 1
• for i = 1 to m
  - if (x - lo) < p*range
    o s_i = 0
    o range = p*range
  - else
    o s_i = 1
    o lo = lo + range*p
    o range = range*(1 - p)
  - end
• end (transcribed into Python below)
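The same loop in Python, reading back the fraction (a sketch with float arithmetic; practical coders use fixed-point and renormalization):

```python
def arith_decode(bits, p, m):
    """lo/range decoder above: recover m symbols from the fraction bits."""
    x = int(bits, 2) / (1 << len(bits))  # '0111' -> 0.4375
    lo, rng, out = 0.0, 1.0, []
    for _ in range(m):
        if x - lo < p * rng:
            out.append(0)
            rng *= p
        else:
            out.append(1)
            lo += rng * p
            rng *= 1.0 - p
    return out

print(arith_decode("0111", 0.2, 4))      # [1, 1, 0, 1]
```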

Page 32: Arithmetic Decoding (rescaling form)

• We receive x, a binary fraction
• for i = 1 to m
  - if x < p
    o s_i = 0
    o x = x/p
  - else // x ≥ p
    o s_i = 1
    o x = (x - p)/(1 - p)
  - end
• end

Receive x = 0111 = 0.4375, p = 0.2:

symbol  x       range
        0.4375  1
1       0.2969  0.8000
1       0.1211  0.6400
0       0.6055  0.1280
1       0.5068  0.1024

Page 33: Arithmetic decoding example

• Receive 0111 (0.4375), decode 4 bits, p_0 = 0.2

x = 0.4375

low    high   bit
0      1      1
0.2    1      1
0.36   1      0
0.36   0.488  1

Page 34: Arithmetic decoding example

• Receive 0111 (0.4375), decode 4 bits, p_0 = 0.2

symbol  low     range
        0       1
1       0.2000  0.8000
1       0.3600  0.6400
0       0.3600  0.1280
1       0.3856  0.1024

Page 35: Magic Features of Arithmetic Coding

• Remember I (information) = -log2(p)
  - p = 0.5, I = 1
  - p = 0.125, I = 3
  - p = 0.99, I = 0.0145 (wow!)
• For a high-probability symbol: less than 1 code bit per symbol!
• In the encoder, hi - lo = ∏ p(symbols), so the number of bits sent, ⌈-log2(hi - lo)⌉, is about ∑ I(symbols).

Page 36: Summary: Arithmetic Coding

• Complexity: requires arithmetic (multiplications, divisions) rather than just table lookups
• Algorithms are complex; accuracy (number of significant bits) is tricky
• Can be made to operate incrementally
  - Both encoder and decoder can output symbols with limited internal memory
• Provides important compression savings in certain settings
• Part of standards

Page 37: Summary: VLC

• Lowest level (basis) of image compression
• We talked about
  - Huffman
  - Golomb/Rice
  - Arithmetic
• Two phases
  - Code design
  - Encoding/decoding
• More about VLC
  - Adaptive coding: estimate probabilities
  - There are universal coders (good non-adaptive coding) such as Lempel-Ziv (zip)

