Transform Coding
Heejune AHN
Embedded Communications Laboratory
Seoul National Univ. of Technology
Fall 2013
Last updated 2013. 9. 30
Heejune AHN: Image and Video Compression p. 2
Agenda
• Transform Coding Concept
• Transform Theory Review
• DCT (Discrete Cosine Transform)
• DCT in Video Coding
• DCT Implementation & Fast Algorithms
• Appendix: KL Transform
1. Transform Coding
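• X1 = lum(2n), X2 = lum(2n+1): neighboring pixels, X1 ~ U(0, 255), X2 ~ U(0, 255)
• Quantizing X1 and X2 separately codes nearly the same data twice, because X1 and X2 are strongly cross-correlated
• Y1, Y2: 45-degree rotation of (X1, X2), up to a scale factor
  • Y1 = (X1 + X2)/2: average or DC value
  • Y2 = (X2 - X1)/2: difference or AC value
  • Y1 ~ F(0, 255), Y2 ~ F(-255, 255) (the range of the unscaled difference X2 - X1)
[Figure: sample pairs (x1, x2) in the X1-X2 plane with the rotated Y1-Y2 axes]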
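The effect of the sum/difference transform can be checked numerically. The sketch below is only an illustration: the pixel model (X2 equals X1 plus small Gaussian noise) and all variable names are assumptions, not data from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model of neighboring pixels: X2 is X1 plus small noise,
# so the pair is strongly correlated (an assumption for illustration).
x1 = rng.uniform(0, 255, size=10_000)
x2 = np.clip(x1 + rng.normal(0, 10, size=x1.size), 0, 255)

# Sum/difference transform from the slide.
y1 = (x1 + x2) / 2   # average (DC) value
y2 = (x2 - x1) / 2   # difference (AC) value

var_x = (np.var(x1), np.var(x2))
var_y = (np.var(y1), np.var(y2))
corr_x = np.corrcoef(x1, x2)[0, 1]
corr_y = np.corrcoef(y1, y2)[0, 1]

print("variance of X1, X2:", var_x)
print("variance of Y1, Y2:", var_y)
print("correlation before / after:", corr_x, corr_y)
```

Under this model almost all of the variance ends up in Y1, so Y2 can be quantized with far fewer levels than X2.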
Which ones are easier to encode (quantize)?
[Figure: distributions f(X1) and f(X2) over 0 to 255, compared with f(Y1) over 0 to 255 and f(Y2) over -255 to 255]
Origins of Transform Coding Benefits
Signal Theory
• Makes the representation easier to manipulate
• Energy concentration
Image and HVS Properties
• The HVS (human visual system) is more sensitive to low frequencies
• Use a denser (finer) quantizer for low frequencies
[Equation garbled in extraction: an expression for $b_{k,l}$ in terms of $B$, $N$ and a $\log_2$ of quantities indexed by $(k,l)$]
[Photo: Vilfredo Pareto, economist, 1848-1923]
2. Transform Theory Review
Definition of Transform
• M-to-N mapping: [Y1, Y2, ..., YN] = F[X1, X2, ..., XM]
Linear Transform (cf. Non-Linear Transform)
• If [Y11, Y12] = F[X11, X12] and [Y21, Y22] = F[X21, X22], then [Y11 + Y21, Y12 + Y22] = F[X11 + X21, X12 + X22]
Matrix representation of Linear Transform
• Forward: $y = T x$
• Inverse: $x = T^{-1} y$
• $y$: N transform coefficients, arranged as a vector
• $T$: transform matrix of size NxN
• $x$: input signal block of size N, arranged as a vector
Basis Vectors
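Orthogonal
• $V_l \cdot V_m = 0$ for $l \neq m$, for basis vectors $V_1, V_2, \ldots, V_N$
• The basis vectors are disjoint: none has a component along another.
Orthonormal
• $\|V_l\| = 1$ for basis vectors $V_1, V_2, \ldots, V_N$
Parseval's Theorem
• Signal power/energy is conserved between the signal domain and the transform domain
• $T^{-1} = T^T \Rightarrow x = T^{-1} y = T^T y$
• $\|y\|^2 = y^T y = x^T T^T T x = \|x\|^2$
[Figure: basis vectors $v_1, v_2, v_3, \ldots, v_N$ of the transform matrix]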
Example of Orthonormal transform
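$T(45^\circ\ \text{rotation}) = \begin{bmatrix} \cos 45^\circ & \sin 45^\circ \\ -\sin 45^\circ & \cos 45^\circ \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$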
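The orthonormality of the 45-degree rotation and Parseval's theorem can be verified in a few lines. This is a minimal sketch; the input pair `x` is a hypothetical example, and the sign placement in `T` assumes the convention Y2 proportional to X2 - X1.

```python
import numpy as np

theta = np.deg2rad(45)
# 45-degree rotation; second row gives (X2 - X1) / sqrt(2).
T = np.array([[ np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

x = np.array([100.0, 110.0])   # hypothetical pair of neighboring pixel values
y = T @ x                      # forward transform

is_orthonormal = np.allclose(T @ T.T, np.eye(2))   # T^-1 = T^T
energy_x = float(x @ x)
energy_y = float(y @ y)        # Parseval: same energy in both domains
x_back = T.T @ y               # inverse transform

print(y, is_orthonormal, energy_x, energy_y)
```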
2D Transform
Data
• 2D pixel value matrix, 2D transform coefficient matrix
• 2D matrix => 1D vector
Forward transform: $y = T x$
Inverse transform: $x = T^{-1} y$
• $y$: NxN transform coefficients, arranged as a vector
• $T$: transform matrix of size $N^2 \times N^2$
• $x$: input signal block of size NxN, arranged as a vector
$F(k,l) = \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} f(n,m)\, t_{k,l}(n,m)$
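For a separable transform, the $N^2 \times N^2$ matrix acting on the flattened block is the Kronecker product of the 1D matrix with itself. The sketch below checks this; the 1D matrix `A` is a hypothetical orthonormal matrix obtained from a QR factorization, used only for illustration, and row-major flattening is assumed.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4

# Hypothetical 1D orthonormal transform (Q factor of a random matrix).
A, _ = np.linalg.qr(rng.normal(size=(N, N)))

X = rng.normal(size=(N, N))   # N x N input block

# Separable 2D transform in matrix form.
Y = A @ X @ A.T

# The same transform as a single N^2 x N^2 matrix applied to the
# row-major flattened block.
T = np.kron(A, A)
y = T @ X.reshape(-1)

print(T.shape, np.allclose(y.reshape(N, N), Y))
```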
3. Transforms
Various transforms in image compression
• DFT (Discrete Fourier Transform)
• DCT (Discrete Cosine Transform)
• DST (Discrete Sine Transform)
• Hadamard Transform
• Discrete Wavelet Transform
• and more (Haar, etc.)
Hadamard transform
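Core matrix (1-dimensional):
$H_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$
N-dimensional:
$H_n = H_{n-1} \otimes H_1 = H_1 \otimes H_{n-1} = \frac{1}{\sqrt{2}} \begin{bmatrix} H_{n-1} & H_{n-1} \\ H_{n-1} & -H_{n-1} \end{bmatrix}$
where $\otimes$ is the Kronecker product, $N = 2^n$, $n = \log_2 N$, $n = 1, 2, \ldots$
$H = H^{*t} = H^{-1}$
2-dimensional transform:
$Y = H_N X H_N^{t} = H_N X H_N$
Example for $n = 2$:
$H_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} H_1 & H_1 \\ H_1 & -H_1 \end{bmatrix} = \frac{1}{\sqrt{4}} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$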
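The recursive Kronecker construction is short to write down in code. The following is a minimal sketch; the function name `hadamard` and the 4x4 test block are assumptions for illustration.

```python
import numpy as np

def hadamard(n: int) -> np.ndarray:
    """Orthonormal 2^n x 2^n Hadamard matrix built by repeated Kronecker products."""
    H1 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    H = np.array([[1.0]])
    for _ in range(n):
        H = np.kron(H, H1)   # H_n = H_{n-1} (Kronecker product) H_1
    return H

H2 = hadamard(2)
X = np.arange(16, dtype=float).reshape(4, 4)   # hypothetical 4x4 block
Y = H2 @ X @ H2.T                              # 2D forward transform
X_back = H2 @ Y @ H2                           # inverse: H^-1 = H^T = H

print(H2 * 2)   # entries are +1 / -1 after removing the 1/sqrt(4) factor
```

Because every entry is +1 or -1 up to a common scale, the transform needs only additions and subtractions.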
DCT Transform
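1D Forward DCT (pixel domain to frequency domain)
$F(k) = C(k) \sum_{n=0}^{N-1} f(n) \cos\frac{(2n+1)k\pi}{2N}, \quad k = 0, 1, \ldots, N-1$
$C(0) = \sqrt{1/N}, \quad C(k) = \sqrt{2/N} \ \text{for}\ k \neq 0$
1D Inverse DCT (frequency domain to pixel domain)
$f(n) = \sum_{k=0}^{N-1} C(k) F(k) \cos\frac{(2n+1)k\pi}{2N}, \quad 0 \le n \le N-1$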
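The forward and inverse formulas can be written as one matrix and its transpose. This is a minimal sketch assuming the orthonormal scaling $C(0) = \sqrt{1/N}$, $C(k) = \sqrt{2/N}$; the helper name `dct_matrix` and the sample row `f` are illustrative.

```python
import numpy as np

def dct_matrix(N: int) -> np.ndarray:
    """N x N matrix A with A[k, n] = C(k) * cos((2n + 1) k pi / (2N))."""
    k = np.arange(N).reshape(-1, 1)
    n = np.arange(N).reshape(1, -1)
    A = np.cos((2 * n + 1) * k * np.pi / (2 * N))
    A[0, :] *= np.sqrt(1.0 / N)    # C(0)
    A[1:, :] *= np.sqrt(2.0 / N)   # C(k) for k != 0
    return A

f = np.array([198, 202, 194, 179, 180, 184, 196, 168], dtype=float)  # sample row of 8 pixels
A = dct_matrix(8)
F = A @ f          # forward DCT: pixel domain to frequency domain
f_back = A.T @ F   # inverse DCT: A is orthonormal, so A^-1 = A^T

print(np.round(F, 1))
```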
2D DCT
• 2D DCT basis functions
• Coefficient distribution: DC ~ uniform distribution, AC ~ Laplacian distribution
Properties
• Orthonormal transform
• Separable transform
• Real-valued coefficients
DCT performance closely resembles that of the KLT for image input
• Image input model (first-order Markov chain)
• x(n+1) = rho * x(n) + e(n)
DCT complexity
• 2D DCT = 1D DCT (vertical) * 1D DCT (horizontal)
• Not used in 3D (because of delay and memory size)
• DCT size (4x4, 8x8, 16x16, 32x32, ...)
  • Larger: better performance, but blocking artifacts (?) and HW complexity
Coding Performance of DCT
Comparison of 1-D basis functions for block size N = 8:
• Karhunen-Loève transform [1948/1960]
• Haar transform [1910]
• Walsh-Hadamard transform [1923]
• Slant transform [Enomoto, Shibata, 1971]
• Discrete Cosine Transform (DCT) [Ahmed, Natarajan, Rao, 1974]
Energy concentration
• Performance measured for typical natural images, block size 1x32
• KLT is optimum
• DCT performs only slightly worse than KLT
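The comparison can be reproduced for a synthetic source. The sketch below assumes a first-order Markov model with correlation `rho = 0.95` and block size 8 (both are illustrative choices, not the measurement setup of the slide) and compares the fraction of energy in the `k` largest-variance coefficients.

```python
import numpy as np

def dct_matrix(N: int) -> np.ndarray:
    """Orthonormal N x N DCT matrix."""
    k = np.arange(N).reshape(-1, 1)
    n = np.arange(N).reshape(1, -1)
    A = np.cos((2 * n + 1) * k * np.pi / (2 * N))
    A[0, :] *= np.sqrt(1.0 / N)
    A[1:, :] *= np.sqrt(2.0 / N)
    return A

N, rho, k = 8, 0.95, 2                            # assumed model parameters
idx = np.arange(N)
R = rho ** np.abs(idx[:, None] - idx[None, :])    # covariance of a unit-variance AR(1) source

# KLT: coefficient variances are the eigenvalues of R.
klt_var = np.sort(np.linalg.eigvalsh(R))[::-1]

# DCT: coefficient variances are the diagonal of A R A^T.
A = dct_matrix(N)
dct_var = np.sort(np.diag(A @ R @ A.T))[::-1]

klt_frac = klt_var[:k].sum() / klt_var.sum()
dct_frac = dct_var[:k].sum() / dct_var.sum()
print(f"energy in top {k} coefficients: KLT {klt_frac:.4f}, DCT {dct_frac:.4f}")
```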
Complexity Performance of DCT
Separation of 2D DCT
• Cascading 1-D DCTs
• Reduces the complexity (multiplications) from $O(N^4)$ to $O(N^3)$
• 8x8 DCT
  • Direct: for each of the 64 coefficients, 64 multiplications (64 x 64 = 4096)
  • Separable: 2 times 64 coefficients x 8 multiplications (2 x 64 x 8 = 1024). Can you derive this?
[Figure: NxN block of pixels $x$ -> column-wise N-transform -> $Ax$ -> row-wise N-transform -> $AxA^T$, the NxN block of transform coefficients]
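The multiplication counts can be derived by counting inside explicit loops. This is a sketch for illustration only: the function names are hypothetical, the direct version assumes the 2D basis values are precomputed, and real implementations would not use Python loops.

```python
import numpy as np

def dct_matrix(N: int) -> np.ndarray:
    """Orthonormal N x N DCT matrix."""
    k = np.arange(N).reshape(-1, 1)
    n = np.arange(N).reshape(1, -1)
    A = np.cos((2 * n + 1) * k * np.pi / (2 * N))
    A[0, :] *= np.sqrt(1.0 / N)
    A[1:, :] *= np.sqrt(2.0 / N)
    return A

def dct2_direct(X, A):
    """Direct 2D transform: each coefficient sums over all N*N input samples."""
    N = X.shape[0]
    basis = np.einsum("kn,lm->klnm", A, A)   # precomputed 2D basis values
    Y, mults = np.zeros((N, N)), 0
    for k in range(N):
        for l in range(N):
            for n in range(N):
                for m in range(N):
                    Y[k, l] += basis[k, l, n, m] * X[n, m]
                    mults += 1
    return Y, mults

def dct2_separable(X, A):
    """Column-wise 1D transforms (A X) followed by row-wise ones ((A X) A^T)."""
    N = X.shape[0]
    tmp, Y, mults = np.zeros((N, N)), np.zeros((N, N)), 0
    for k in range(N):
        for m in range(N):
            for n in range(N):
                tmp[k, m] += A[k, n] * X[n, m]
                mults += 1
    for k in range(N):
        for l in range(N):
            for m in range(N):
                Y[k, l] += tmp[k, m] * A[l, m]
                mults += 1
    return Y, mults

A = dct_matrix(8)
X = np.random.default_rng(3).uniform(0, 255, size=(8, 8))   # hypothetical block
Y_direct, mults_direct = dct2_direct(X, A)
Y_sep, mults_sep = dct2_separable(X, A)
print(mults_direct, mults_sep)
```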
4. Transform in Image Coding
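Transform coding procedure
• Transform $T(x)$: usually invertible
• Quantization: not invertible, introduces distortion
• Combination of encoder $C$ and decoder $C^{-1}$: lossless
Encoder: image $x$ -> transform $y = T(x)$ -> samples $y$ -> quantizer $q = Q(y)$ -> indices $q$ -> encoder $c = C(q)$ -> bit-stream $c$
Decoder: bit-stream $c$ -> decoder $q = C^{-1}(c)$ -> indices $q$ -> dequantizer $\hat{y} = Q^{-1}(q)$ -> reconstructed samples $\hat{y}$ -> inverse transform $\hat{x} = T^{-1}(\hat{y})$ -> reconstructed image $\hat{x}$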
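The chain transform, quantizer, dequantizer, inverse transform can be sketched as follows. The 8x8 block is the original block used in the DCT example of these slides; the rounding quantizer and the step size of 8 are assumptions for illustration, and the lossless entropy coder $C$ is omitted because it does not change the indices.

```python
import numpy as np

def dct_matrix(N: int) -> np.ndarray:
    """Orthonormal N x N DCT matrix."""
    k = np.arange(N).reshape(-1, 1)
    n = np.arange(N).reshape(1, -1)
    A = np.cos((2 * n + 1) * k * np.pi / (2 * N))
    A[0, :] *= np.sqrt(1.0 / N)
    A[1:, :] *= np.sqrt(2.0 / N)
    return A

x = np.array([
    [198, 202, 194, 179, 180, 184, 196, 168],
    [187, 196, 192, 181, 182, 185, 189, 174],
    [188, 185, 193, 179, 188, 188, 187, 170],
    [184, 188, 182, 187, 183, 186, 195, 174],
    [194, 193, 189, 187, 180, 183, 181, 185],
    [193, 195, 193, 192, 170, 189, 187, 181],
    [181, 185, 183, 180, 175, 184, 185, 176],
    [195, 185, 177, 178, 170, 179, 195, 175],
], dtype=float)

A = dct_matrix(8)
step = 8.0                     # hypothetical uniform quantizer step size

y = A @ x @ A.T                # transform:         y = T(x)
q = np.round(y / step)         # quantizer:         q = Q(y), not invertible
y_hat = q * step               # dequantizer:       y^ = Q^-1(q)
x_hat = A.T @ y_hat @ A        # inverse transform: x^ = T^-1(y^)

max_coef_err = np.max(np.abs(y - y_hat))
mse = np.mean((x - x_hat) ** 2)
print(round(y[0, 0], 1), round(y[0, 1], 1), max_coef_err, mse)
```

Because the transform is orthonormal, the mean squared error in the pixel domain equals the mean squared quantization error of the coefficients.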
DCT in Image Coding
Original 8x8 block:
198 202 194 179 180 184 196 168
187 196 192 181 182 185 189 174
188 185 193 179 188 188 187 170
184 188 182 187 183 186 195 174
194 193 189 187 180 183 181 185
193 195 193 192 170 189 187 181
181 185 183 180 175 184 185 176
195 185 177 178 170 179 195 175
DCT => Transformed 8x8 block:
1480 26.0 9.5 8.9 -26.4 15.1 -8.1 0.3
11.0 8.3 -8.2 3.8 -8.4 -6.0 -2.8 10.6
-5.5 4.5 9.0 5.3 -8.0 4.0 -5.1 4.9
10.7 9.8 4.9 -8.3 -2.1 -1.9 2.8 -8.1
1.6 1.4 8.2 4.3 3.4 4.1 -7.9 1.0
-4.5 -5.0 -6.4 4.1 -4.4 1.8 -3.2 2.1
5.9 5.8 2.4 2.8 -2.0 5.9 3.2 1.1
-3.0 2.5 -1.0 0.7 4.1 -6.1 6.0 5.7
Q => Quantized 8x8 block:
185 3 1 1 -3 2 -1 0
1 1 -1 0 -1 0 0 1
0 0 1 0 -1 0 0 0
1 1 0 -1 0 0 0 -1
0 0 1 0 0 0 -1 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
Zig-zag scan and run-level coding:
Mean of Block: 185
(0,3) (0,1) (1,1) (0,1) (0,1) (0,1) (0,-1) (1,1)
(1,1) (0,1) (1,-3) (0,2) (0,-1) (6,1) (0,-1) (0,-1)
(1,-1) (14,1) (9,-1) (0,-1) EOB
Transmission
Inverse zig-zag scan (the decoder recovers the same quantized block), then scaling and inverse DCT => Reconstructed 8x8 block:
192 201 195 184 177 184 193 174
189 191 195 182 182 187 190 171
188 185 190 181 185 187 189 171
189 188 185 183 183 182 190 175
191 192 186 189 179 182 188 178
190 191 189 190 177 186 184 179
189 188 185 184 175 186 187 179
189 188 178 176 173 183 193 180
DCT in Image Coding
Uniform deadzone quantizer
• Transform coefficients that fall below a threshold are discarded.
Entropy coding
• Positions of non-zero transform coefficients are transmitted in addition to their amplitude values.
• Efficient encoding of the positions of non-zero transform coefficients: zig-zag scan + run-level coding
[Figure: uniform deadzone quantizer characteristic, quantizer output versus quantizer input]
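A minimal sketch of a deadzone quantizer, the zig-zag scan and run-level coding is given below. The truncating quantizer is one common form of deadzone quantizer and is an assumption here, as are all function names; the block `q` is the quantized 8x8 block from the example slide. The pair list produced by this sketch is not guaranteed to match the list printed on the slide entry for entry.

```python
import numpy as np

def deadzone_quantize(y, step):
    """Truncate toward zero, so every |y| < step maps to index 0 (the deadzone)."""
    return (np.sign(y) * np.floor(np.abs(y) / step)).astype(int)

def zigzag_order(N):
    """(row, col) pairs in zig-zag order, low frequencies first."""
    return sorted(((r, c) for r in range(N) for c in range(N)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def run_level_encode(block):
    """Return the DC value and (run of zeros, level) pairs for the AC coefficients."""
    scan = [int(block[r, c]) for r, c in zigzag_order(block.shape[0])]
    pairs, run = [], 0
    for level in scan[1:]:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return scan[0], pairs          # trailing zeros are implied by EOB

def run_level_decode(dc, pairs, N):
    """Rebuild the N x N block from the DC value and the (run, level) pairs."""
    scan = [dc]
    for run, level in pairs:
        scan.extend([0] * run + [level])
    scan.extend([0] * (N * N - len(scan)))
    block = np.zeros((N, N), dtype=int)
    for value, (r, c) in zip(scan, zigzag_order(N)):
        block[r, c] = value
    return block

q = np.array([
    [185, 3, 1, 1, -3, 2, -1, 0],
    [1, 1, -1, 0, -1, 0, 0, 1],
    [0, 0, 1, 0, -1, 0, 0, 0],
    [1, 1, 0, -1, 0, 0, 0, -1],
    [0, 0, 1, 0, 0, 0, -1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
])

dc, pairs = run_level_encode(q)
restored = run_level_decode(dc, pairs, 8)
print(dc, pairs)
```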
DCT Examples
Note that only a few coefficients have sizable values.
[Figure: surface plots (axes 0 to 6 in each direction, amplitude -30 to 30) of the image block, the DCT coefficients of the block, the quantized DCT coefficients of the block, and the block reconstructed from the quantized coefficients]
DCT coding with increasingly coarse quantization, block size 8x8
quantizer stepsize for AC coefficients: 25
quantizer stepsize for AC coefficients: 100
quantizer stepsize for AC coefficients: 200
5. Implementation
Implementation issues
• HW or SW
• Computational cost, speed, implementation size
• Performance, cost, implementation complexity
SW implementation decision factors
• Computational cost of multiplication
• Whether fixed-point or floating-point operations are used (esp. multiplication)
• Special coprocessor and instruction set (e.g. MMX)
Fast DCT Algorithm
Original DCT/IDCT
• Computation load (8-point 1D DCT): 64 additions + 64 multiplications
  • 8 (7) additions + 8 multiplications per coefficient (from the equation)
• Scaling
  • Input range [0, 255] => output range [-2024, 2024]
Fast DCT
• Similar to the fast DFT
• Shares the same computation between nodes
• O(N x N) => O(N log2 N)
  • N: width (number of coefficients)
  • log2 N: steps of the algorithm
• Several versions: Chen, Lee, Arai, etc.
Chen’s FDCT
See Code at http://www.cmlab.csie.ntu.edu.tw/~chenhsiu/tech/fastdct.cpp
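How does the fast algorithm work? By exploiting the symmetry of the cosine function.
For N = 8:
$2F(2) = f_0 \cos\frac{\pi}{8} + f_1 \cos\frac{3\pi}{8} + f_2 \cos\frac{5\pi}{8} + f_3 \cos\frac{7\pi}{8} + f_4 \cos\frac{9\pi}{8} + f_5 \cos\frac{11\pi}{8} + f_6 \cos\frac{13\pi}{8} + f_7 \cos\frac{15\pi}{8}$
$2F(2) = (f_0 - f_4 + f_7 - f_3) \cos\frac{\pi}{8} + (f_1 - f_2 - f_5 + f_6) \cos\frac{3\pi}{8}$
$2F(6) = (f_0 - f_4 + f_7 - f_3) \cos\frac{3\pi}{8} - (f_1 - f_2 - f_5 + f_6) \cos\frac{\pi}{8}$
STEP 1
$D_1 = f_0 - f_4 + f_7 - f_3, \quad D_2 = f_1 - f_2 - f_5 + f_6$
STEP 2
$2F(2) = D_1 \cos\frac{\pi}{8} + D_2 \cos\frac{3\pi}{8}, \quad 2F(6) = D_1 \cos\frac{3\pi}{8} - D_2 \cos\frac{\pi}{8}$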
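The two steps can be checked against the direct definition. This is a minimal sketch for the coefficients F(2) and F(6) of an 8-point DCT only, not a complete fast DCT; the sample row `f` and the helper name `two_F_direct` are assumptions.

```python
import numpy as np

f = np.array([198, 202, 194, 179, 180, 184, 196, 168], dtype=float)  # sample 8-pixel row
n = np.arange(8)

def two_F_direct(k):
    """2F(k) for k >= 1 straight from the definition: 8 multiplications."""
    return np.sum(f * np.cos((2 * n + 1) * k * np.pi / 16))

# STEP 1: additions and subtractions only, using the symmetry of the cosine.
D1 = f[0] - f[4] + f[7] - f[3]
D2 = f[1] - f[2] - f[5] + f[6]

# STEP 2: two multiplications per coefficient, shared between F(2) and F(6).
c1, c3 = np.cos(np.pi / 8), np.cos(3 * np.pi / 8)
two_F2 = D1 * c1 + D2 * c3
two_F6 = D1 * c3 - D2 * c1

print(two_F2, two_F_direct(2))
print(two_F6, two_F_direct(6))
```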
HW Implementation
2D DCT using a 1D DCT function block
[Figure: block diagram with input sample, MUX, 1-D DCT, 8x8 RAM (row-order input, column-order output) and output coefficient]
Distributed Arithmetic DCT
• Multiplier-less architecture
• Lookup, shift and accumulators only
[Figure: 4 bits from the input u -> LUT (ROM) -> add or subtract -> accumulator with shift ($2^{-1}$) -> output coefficient Fx]
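The idea of distributed arithmetic can be sketched in software as a bit-serial inner product. The sketch below is an illustration under assumptions: four signed 8-bit inputs, a 16-entry table, and a left shift of each partial sum instead of the right-shifting accumulator shown in the figure; the function name is hypothetical.

```python
import numpy as np

def da_inner_product(coeffs, xs, bits=8):
    """Compute sum(coeffs[k] * xs[k]) using a lookup table, shifts and adds.

    xs must be signed integers that fit in `bits`-bit two's complement.
    """
    K = len(coeffs)
    # LUT (ROM): for every K-bit address, the sum of the coefficients
    # whose address bit is set.
    lut = [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
           for addr in range(1 << K)]
    acc = 0.0
    for j in range(bits):
        # Address: bit j taken from each of the K inputs.
        addr = 0
        for k, x in enumerate(xs):
            addr |= ((int(x) >> j) & 1) << k
        term = lut[addr] * (1 << j)   # a shift in hardware, not a multiplier
        # The two's complement sign bit has weight -2^(bits-1): subtract it.
        acc = acc - term if j == bits - 1 else acc + term
    return acc

coeffs = [np.cos((2 * n + 1) * np.pi / 16) for n in range(4)]  # example cosine weights
xs = [100, -3, 27, -128]
da_result = da_inner_product(coeffs, xs)
reference = float(np.dot(coeffs, xs))
print(da_result, reference)
```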
IDCT Mismatch
DCT x IDCT = I ?
• The DCT is defined in "floating point" and "direct form."
• An integer implementation induces an error after the inverse DCT.
• Different FDCT implementations have different errors.
DCT mismatch in MC-DCT
• Different reference images at encoder and decoder
• Very small error, but it accumulates.
[Figure: encoder E_org -> DCT -> Q -> VLC, with local reconstruction IQ -> IDCT_E -> E_rec; decoder VLD -> IQ -> IDCT_D -> D_rec. E_rec and D_rec should be equal, but mismatch!]
IDCT Mismatch control
• Minimum accuracy of the DCT algorithm is defined in the SPEC.
H.261/3, MPEG-1/2
• Restrict the sum of coefficient values
  • Oddification rule for the sum of all DCT coefficients
  • Adjust the LSB of F[63], the last coefficient, so that the sum becomes odd
  • The decoder checks and corrects the values
H.264
• (Modified) integer DCT is used
• Adds random error cancelation
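The oddification rule can be sketched as follows. This is a simplified illustration in the style of the sum-based rule described above, not the normative text of any standard; the function name `mismatch_control` and the test blocks are assumptions.

```python
import numpy as np

def mismatch_control(F):
    """Force the sum of all coefficients to be odd by toggling the LSB of F[7][7]."""
    F = F.copy()
    if int(F.sum()) % 2 == 0:
        # Odd last coefficient: subtract 1; even last coefficient: add 1.
        F[7, 7] += -1 if F[7, 7] % 2 else 1
    return F

even_block = np.zeros((8, 8), dtype=int)
even_block[0, 0] = 10                      # sum is even, so the last coefficient is adjusted
fixed = mismatch_control(even_block)

odd_block = np.zeros((8, 8), dtype=int)
odd_block[0, 0] = 11                       # sum is already odd, so nothing changes
unchanged = mismatch_control(odd_block)

print(fixed[7, 7], int(fixed.sum()), int(unchanged.sum()))
```

Encoder and decoder apply the same rule to the dequantized coefficients before the IDCT, so both sides feed identical input to their inverse transforms.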
Optimal Transform
Optimality
• (No) redundancy in the input signal => (no) redundant quantization
Result
• No cross-correlation between different components (coefficients)
K-L (Karhunen-Loeve) transform
Assumption
• Input covariance is given: $R_{X,X} = E[X X^{*t}]$
Problem definition
• Find a transform ($Y = T X$) such that $R_{Y,Y} = T R_{X,X} T^{*t}$ is a diagonal matrix (i.e., completely uncorrelated $Y$); for real $T$, $T^{*t} = T^T$
$R_{Y,Y} = E[Y Y^{*t}] = E[T X (T X)^{*t}] = T R_{X,X} T^{*t} = \mathrm{diag}\{\lambda_0, \lambda_1, \ldots, \lambda_{N-1}\}$
Optimal Transform
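Solution
• Build $T$ with the eigenvectors of $R_{X,X}$ as basis vectors
• Then, by the definition of eigenvectors and eigenvalues (of $R_{X,X}$):
  $R_{X,X}\, \phi_k = \lambda_k \phi_k, \quad k = 0, 1, \ldots, N-1$
  $R_{X,X} [\phi_0\ \phi_1 \cdots \phi_{N-1}] = [\phi_0\ \phi_1 \cdots \phi_{N-1}]\, \mathrm{diag}\{\lambda_0, \lambda_1, \ldots, \lambda_{N-1}\}$
• So:
  $R_{Y,Y} = [\phi_0 \cdots \phi_{N-1}]^{*t} R_{X,X} [\phi_0 \cdots \phi_{N-1}] = [\phi_0 \cdots \phi_{N-1}]^{*t} [\phi_0 \cdots \phi_{N-1}]\, \mathrm{diag}\{\lambda_0, \ldots, \lambda_{N-1}\} = I\, \mathrm{diag}\{\lambda_0, \ldots, \lambda_{N-1}\}$
  $T_{optimal} = [\phi_0\ \phi_1 \cdots \phi_{N-1}]^{*t}$
Issue in KLT
• $R_{X,X}$ varies from image to image: a new $T$ must be calculated and transmitted to the decoder
• Not separable (vertical, horizontal)
• But good for benchmarking the performance of other transforms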
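The solution can be tried on sample data. The sketch below assumes a synthetic first-order Markov source with `rho = 0.95` and estimates $R_{X,X}$ from samples; these choices and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N, rho, samples = 8, 0.95, 20_000

# Hypothetical input: unit-variance first-order Markov blocks of length N.
X = np.empty((samples, N))
X[:, 0] = rng.normal(size=samples)
for n in range(1, N):
    X[:, n] = rho * X[:, n - 1] + np.sqrt(1 - rho**2) * rng.normal(size=samples)

R_xx = (X.T @ X) / samples                # estimate of E[X X^T]
eigvals, eigvecs = np.linalg.eigh(R_xx)   # columns of eigvecs are eigenvectors
T = eigvecs.T                             # rows of T are the basis vectors

Y = X @ T.T                               # y = T x for every block
R_yy = (Y.T @ Y) / samples                # equals T R_xx T^T
off_diag = R_yy - np.diag(np.diag(R_yy))

print(np.round(np.diag(R_yy), 3))
print("largest off-diagonal magnitude:", np.max(np.abs(off_diag)))
```

The diagonal of `R_yy` holds the eigenvalues of `R_xx`, and the off-diagonal terms vanish up to floating-point error, which is the decorrelation property stated above.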