Image Processing Architecture, © 2001, 2002 Oleh Tretiak Page 1Lecture 15
ECEC-453 Image Processing Architecture
3/11/2004, Exam Review
Oleh Tretiak, Drexel University
Announcements
• Final on March 20
• Cumulative
• Extra credit problem: write a plugin for ImageJ (everybody does a different plugin)
Architecture for the DCT
• Separable DCT: $Y = T X T^T = (T (T X)^T)^T$
• Options
  - Fast DCT ~ conventional computer
  - Vector DCT ~ parallel hardware
• 8x8 1-D DCT: $Z = T X$
• Unit operation: multiply an 8x8 matrix by an 8x1 vector ~ 64 ops
Computational Complexity
• 1D DCT
  - N input and output samples ~ $N^2 = 64$ operations (additions + multiplications)
• 2D DCT, direct implementation
  - $M = N^2$ input values, $M$ output values -> $M^2 = N^4$ operations
• 2D DCT, separable implementation
  - $Y = T X T^T = Z T^T$, where $Z = T X$ and all matrices are NxN -> $2N^3$ operations
• For N = 8
  - 2D DCT direct: 4096 operations, 64 operations per pixel
  - 2D DCT separable: 1024 operations, 16 operations per pixel
• Big savings due to separable transform
• Inverse DCT: same story.
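The separable computation can be sketched in a few lines of plain Python. This is only an illustration, assuming an orthonormal DCT-II matrix for T; the helper names (`dct_matrix`, `matmul`, `dct2_separable`, `op_counts`) are made up for this example. The operation counts follow the slide's convention of one operation per multiply-add.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix T (n x n); row k is the k-th basis vector."""
    t = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        t.append([scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                  for j in range(n)])
    return t

def matmul(a, b):
    """Plain triple-loop product; an n x n by n x n product costs n^3 multiply-adds."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def dct2_separable(x):
    """Y = T X T^T computed as two matrix products: Z = T X, then Y = Z T^T."""
    t = dct_matrix(len(x))
    z = matmul(t, x)
    return matmul(z, transpose(t))

def op_counts(n):
    """Operation counts as on the slide: direct N^4, separable 2 N^3."""
    return n ** 4, 2 * n ** 3

direct, separable = op_counts(8)
print(direct, separable)            # 4096 1024
print(direct // 64, separable // 64)  # operations per pixel: 64 16
```

A constant 8x8 block is a quick sanity check: all of its energy should land in Y[0][0].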
DCT: Encoding in JPEG, MPEG
• Take 8x8 blocks of pixels
• Subtract range mean value
• Compute 8x8 DCT
• Quantize the DCT coefficients
  - Typically, many of the samples are equal to zero
• Lossless entropy coding of the quantized samples
• Different quantization step is used for different DCT coefficients
  - $y_{kl}$: DCT coefficients, $q_{kl}$: quantizer steps
  - $z_{kl}$: quantized values

$$z_{kl} = \mathrm{round}\left(\frac{y_{kl}}{q_{kl}}\right)$$
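As a small illustration of the quantization rule, the sketch below applies z = round(y / q) element by element. The coefficient and step values are made up, and rounding halves away from zero is an assumption of this sketch (the slide only says "round").

```python
import math

def quantize(y, q):
    """z_kl = round(y_kl / q_kl); halves are rounded away from zero (assumption)."""
    def rnd(v):
        return int(math.floor(abs(v) + 0.5)) * (1 if v >= 0 else -1)
    return [[rnd(y[k][l] / q[k][l]) for l in range(len(y[0]))]
            for k in range(len(y))]

def dequantize(z, q):
    """Decoder side: the coefficient is reconstructed as z_kl * q_kl."""
    return [[z[k][l] * q[k][l] for l in range(len(z[0]))]
            for k in range(len(z))]

# Made-up 2x2 corner of a coefficient block and of a step table.
y = [[-415.0, 30.0], [12.0, -3.0]]
q = [[16, 11], [12, 14]]
z = quantize(y, q)
print(z)                 # [[-26, 3], [1, 0]]
print(dequantize(z, q))  # [[-416, 33], [12, 0]]
```

The small coefficient (-3 with step 14) quantizes to zero, which is the effect the slide refers to when it says many samples end up equal to zero.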
Optimized (fast) DCT
• 1-D Chen DCT diagram.
[Figure: 1-D Chen DCT flow graph with inputs x0 ... x7 and outputs y0 ... y7. Dashed lines indicate subtraction; separate graphic symbols mark multiplication by a constant and multiplication by 0.5 (shift).]

Characteristics of optimized DCT algorithms (operation counts for DCT or IDCT)

                              Multiplications    Additions
Method                        1-D     2-D        1-D     2-D
1-D Chen                      16      256        26      416
1-D Lee                       12      192        29      464
1-D Loeffler, Ligtenberg      11      176        29      464
2-D Kamangar, Rao             -       128        -       430
2-D Cho, Lee                  -       96         -       466
Huffman Coding - Block Diagram
Coding AC Coefficients
• AC coefficients are coded in zig-zag order (called ZZ in the standard) to maximize possible runs of zeros.
• Code unit consists of run length followed by coefficient size.
• Baseline coding of size category is the same as for DC differences (Table 2.9)
• Example: run of 6 zeros followed by a coefficient of value -18. In the table, -18 is in category 5. Code is (6/5, 01101). If the Huffman code for 6/5 is 1101, codeword = 110101101
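The worked example can be reproduced with a short sketch. It assumes the JPEG baseline convention that a negative value in size category s is sent as the s low-order bits of (value + 2^s - 1); the one-entry Huffman table is hypothetical and only holds the code "1101" that the slide supposes for the 6/5 symbol.

```python
def size_category(value):
    """Number of bits needed for |value|; category 5 covers 16..31 and -31..-16."""
    return abs(value).bit_length()

def amplitude_bits(value):
    """Extra bits that identify the value inside its size category."""
    size = size_category(value)
    if size == 0:
        return ""
    code = value if value >= 0 else value + (1 << size) - 1
    return format(code, "0{}b".format(size))

def ac_codeword(run, value, huffman_table):
    """Huffman code of the (run/size) symbol followed by the amplitude bits."""
    symbol = (run, size_category(value))
    return huffman_table[symbol] + amplitude_bits(value)

# Hypothetical table containing only the entry assumed in the example.
table = {(6, 5): "1101"}
print(size_category(-18))          # 5
print(amplitude_bits(-18))         # 01101
print(ac_codeword(6, -18, table))  # 110101101
```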
Example of JPEG compression
• Very high quality: compression = 2.33 (Photoshop image)
• Very low quality: compression = 115 (produced by MATLAB with Quality = 0)
[Figure: JPEG and JPEG2000 results side by side, compression = 64]
Predictive Coding of Video
• E(x, y, t) = I(x, y, t) - P(x, y, t)
  - I ~ image, P ~ prediction, E ~ error
• P(x, y, t+1) = P(x, y, t) + Code(E(x, y, t))
• At receiver, Ie(x, y, t) = P(x, y, t+1)
  - Ie(x, y, t) ~ estimate of image at time t
[Figure: encoder and decoder block diagrams. Encoder: quantizer and predictor, with signals $x_i$, $p_i$, $q_i$, $\hat{x}_i$. Decoder: predictor, with signals $q_i$, $p_i$, $\hat{x}_i$.]
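The prediction loop can be sketched for a single pixel position, dropping the (x, y) arguments. This is a minimal illustration, assuming that Code(E) is a uniform quantizer with a made-up step size and that both sides start from the same prediction p0. The point it shows is that the decoder repeats the encoder's update P(t+1) = P(t) + Code(E(t)), so the two predictions never drift apart.

```python
import math

def code(error, step):
    """Stand-in for Code(E): uniform quantizer returning the reconstructed error."""
    return step * math.floor(error / step + 0.5)

def encode(samples, step, p0=0):
    """Encoder loop: E = I - P, transmit Code(E), update P <- P + Code(E)."""
    p = p0
    sent = []
    for i_t in samples:
        c = code(i_t - p, step)
        sent.append(c)
        p = p + c
    return sent

def decode(sent, p0=0):
    """Decoder loop: same update as the encoder; Ie(t) is the updated prediction."""
    p = p0
    estimates = []
    for c in sent:
        p = p + c
        estimates.append(p)
    return estimates

samples = [100, 104, 109, 107, 90]   # made-up values of one pixel over time
sent = encode(samples, step=4)
estimates = decode(sent)
print(sent)       # [100, 4, 4, 0, -16]
print(estimates)  # [100, 104, 108, 108, 92]
```

Because the encoder predicts from its own reconstruction rather than from the original samples, the reconstruction error stays within half a quantizer step instead of accumulating.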
Generic Encoder - simplified
[Figure: block diagram with inputs I(x, y, t-1) and I(x, y, t); blocks Motion Estimation, Motion Compensation, DCT coding, Transmit]
• Motion vector: (u, v)
• e(x, y, t) = I(x, y, t) - I(x-u, y-v, t-1)
Motion Estimation Methods
No compensation
Full search
Logarithmic search
3-level hierarchical
Full-Search Method
• Compute for $(2p+1)^2$ values of (i, j).
• Each location requires 3MN operations
• Picture dimensions IxJ, F pictures per second
  - $3IJF(2p+1)^2$ operations per second
  - I = 720, J = 480, F = 30, p = 15 -> 30 GOPS
• Guaranteed to find best (MAE) displacement
• How to do it?
  - Special computers
  - Smaller p
  - Faster (suboptimal) algorithm
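A minimal full-search sketch for one block is given below. It uses the sum of absolute differences, which has the same minimizer as the mean absolute error; the three operations per pixel (subtract, absolute value, accumulate) are presumably where the 3MN count comes from. Function names, the synthetic test frames, and the sign convention (the reference block is read at +dx, +dy) are assumptions of this example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of cur at (bx, by)
    and the block of ref displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total

def full_search(cur, ref, bx, by, n, p):
    """Try all (2p+1)^2 displacements that stay inside ref; return (cost, dx, dy)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            inside = (0 <= by + dy and by + dy + n <= height
                      and 0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

# Synthetic 16x16 reference frame and a current frame that is a shifted copy.
ref = [[x * x + 3 * y * y + x * y for x in range(16)] for y in range(16)]
cur = [[ref[y + 1][x + 2] if y + 1 < 16 and x + 2 < 16 else 0
        for x in range(16)] for y in range(16)]
best = full_search(cur, ref, bx=4, by=4, n=4, p=3)
print(best)  # (0, 2, 1)
```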
Hierarchical Search
• Prepare downsampled versions of current and reference images
  - Full: macroblock 16x16
  - Down 2: macroblock 8x8
  - Down 4: macroblock 4x4
• Full search in Down 4 reference image
  - 16x speedup, smaller macroblock
  - 16x speedup, fewer displacement vectors
    o p = ±16, p' = ±4
• Around point of best match, do local search in Down 2 reference image (3x3 search zone)
• Repeat for Full reference image (3x3 search zone)
[Figure: image pyramid with levels Full, Down 2, Down 4]
Comparison
Operations for video: 720x480 at 30 fps, in GOPS

Search method   Operations per macroblock                        p = 15    p = 7
Full search     $3(2p+1)^2 NM$                                   29.89     6.99
Logarithmic     $3(8\lceil\log_2 p\rceil + 1) NM$                1.02      0.78
PHODS           $3(4\lceil\log_2 p\rceil + 1) NM$                0.53      0.40
Hierarchical    $3[(2\lceil p/4\rceil + 1)^2 + 180] NM / 16$     0.51      0.40
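The per-macroblock formulas can be turned into operations per second for 720x480 video at 30 fps. In this sketch the pairing of each formula with a method name is inferred from which pairing reproduces the tabulated numbers, and the function names are made up. With p = 15 (the value used on the full-search slide) and p = 7 the results agree with the tabulated figures to within about 0.01 GOPS.

```python
import math

def ops_per_macroblock(method, p, nm=256):
    """Operations to find one motion vector; nm = N*M pixels per macroblock."""
    log_p = math.ceil(math.log2(p))
    if method == "full":
        return 3 * (2 * p + 1) ** 2 * nm
    if method == "logarithmic":
        return 3 * (8 * log_p + 1) * nm
    if method == "phods":
        return 3 * (4 * log_p + 1) * nm
    if method == "hierarchical":
        # 180 = 9 * 4 + 9 * 16: two 3x3 refinements, counted in units of NM/16.
        return 3 * ((2 * math.ceil(p / 4) + 1) ** 2 + 180) * nm / 16
    raise ValueError(method)

def gops(method, p, width=720, height=480, fps=30, nm=256):
    """Giga-operations per second when every macroblock of every picture is searched."""
    macroblocks_per_second = width * height * fps / nm
    return ops_per_macroblock(method, p, nm) * macroblocks_per_second / 1e9

for method in ("full", "logarithmic", "phods", "hierarchical"):
    print(method, round(gops(method, 15), 3), round(gops(method, 7), 3))
```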
MPEG-1: ‘1.5’ Mbps
• Sample rate reduction in spatial and temporal domains
• Spatial
  - Block-based DCT
  - Huffman coding (no arithmetic coding) of motion vectors and quantized DCT coefficients
    o 352 x 240 pixels, 12 bits per pixel, picture rate 30 pictures per second -> 30.4 Mbps
    o Coded bit stream 1.15 Mbps (must leave bandwidth for audio)
    o Compression 26:1
    o Quality better than VHS!
• Temporal
  - Block-based motion compensation
  - Interframe coding (two kinds)
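The bit-rate figures can be checked with a few lines of arithmetic. The sketch assumes the SIF picture size of 352 x 240, which is the size that reproduces the 30.4 Mbps raw rate; the function name is made up.

```python
def raw_bit_rate(width, height, bits_per_pixel, pictures_per_second):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * pictures_per_second

raw = raw_bit_rate(352, 240, 12, 30)
coded = 1.15e6                       # coded video bit stream, bits per second
ratio = raw / coded
print(raw)              # 30412800, i.e. about 30.4 Mbps
print(round(ratio, 1))  # 26.4, quoted on the slide as 26:1
```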
Picture Types
• MPEG-1 is designed to support random access & editing
  - I: intraframe coding only
  - P: predictive coding
  - B: bi-directional coding
[Figure: sequence of pictures numbered 1 to 8, each labelled I, P, or B]
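Picture of Layers
[Figure: MPEG-1 bit-stream hierarchy]
• Sequence layer: GOP-1, GOP-2, ..., GOP-N
• GOP layer: I B B P B B ... P
• Picture layer: Slice-1, Slice-2, ..., Slice-N
• Slice layer: mb-1, mb-2, ..., mb-n
• Macroblock layer: Y blocks 0 to 3, plus Cr and Cb blocks
• Block layer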
Coding Image Blocks
• B pictures
  - Inter or intra?
  - Forward, backward, interpolational?
  - Code block or skip?
  - Quantization step?

Macroblock counts by picture type (macroblock type columns: I, P, B, Zero MV, Skipped, Total)
  I pictures: 3300; total 3300
  P pictures: 897, 8587, 5128, 568; total 15180
  B pictures: 60, 7356, 22845, 429; total 30690
MPEG-1 Wrap-up
• Data below for decoder, SIF pictures, 2 B pictures per P
• IDCT must be precise, because of inter-frame coding
• MPEG-1 does not deliver quality acceptable for broadcast -> MPEG-2

Decoding function                        Load (%)
Bit-stream header parsing                0.44
Huffman decoding and dequantization      19.00
Inverse DCT                              22.10
Motion compensation                      38.64
Color transformation and display         19.82
Typical MPEG coding parameters
• Typical sequence
  - IPBBPBBPBBPBBPBB (16 frames)

Picture   Average size   Compression
I         156000         6.5
P         62000          16.4
B         15000          67.6

$$\mathrm{Compression(GOP)} = \frac{\mathrm{BitsPerFrameU} \times \mathrm{NFramesPerGOP}}{\mathrm{BitsPerCodedGOP}}$$

$$\mathrm{BitsPerCodedGOP} = N_I \times (\mathrm{Bits/Iframe}) + N_P \times (\mathrm{Bits/Pframe}) + N_B \times (\mathrm{Bits/Bframe})$$

$$\mathrm{Bits/Iframe} = \frac{\mathrm{BitsPerFrameU}}{C_I}, \quad \mathrm{Bits/Pframe} = \frac{\mathrm{BitsPerFrameU}}{C_P}, \quad \mathrm{Bits/Bframe} = \frac{\mathrm{BitsPerFrameU}}{C_B}$$

$$\mathrm{Compression(GOP)} = \frac{\mathrm{NFramesPerGOP}}{N_I / C_I + N_P / C_P + N_B / C_B} = \frac{16}{1/6.5 + 5/16.4 + 10/67.6} = 26.4$$
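The GOP compression formula is easy to evaluate for other picture patterns. The sketch below counts the I, P, and B pictures in a pattern string and applies the last formula above; the function name is made up, and the per-type ratios are the ones from the table.

```python
def gop_compression(pattern, ratios):
    """NFramesPerGOP / (N_I / C_I + N_P / C_P + N_B / C_B) for a pattern such as 'IPBB...'."""
    counts = {kind: pattern.count(kind) for kind in ratios}
    return len(pattern) / sum(counts[kind] / ratios[kind] for kind in ratios)

pattern = "IPBBPBBPBBPBBPBB"                  # 1 I, 5 P, 10 B pictures
ratios = {"I": 6.5, "P": 16.4, "B": 67.6}     # per-type compression ratios
print(round(gop_compression(pattern, ratios), 1))  # 26.4
```

Note that the single I picture costs about as many bits as all ten B pictures together, so shortening the GOP lowers the overall compression noticeably.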
MPEG-2 Goals
• Compatibility with MPEG-1
• Good picture quality
• Flexibility in input format
• Random access capability (I pictures)
• Capability for fast forward, fast reverse play, stop frame
• Bit stream scalability
• Low delay for 2-way communications (videoconferencing)
• Resilience to bit errors
MPEG-2 profiles
• A profile is a subset of the entire MPEG-2 bit-stream syntax
  - Simple
  - Main
  - 4:2:2
  - SNR
  - Spatial
  - High
  - Multiview
• Each profile has several levels (resolution quality)
  - Low: MPEG1
  - Main: CCIR 601
  - High-1440 (Video Editing)
  - High (HDTV)
MPEG2 - Alternate Scan
Zig-zag scan Alternate scan
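The zig-zag order can be generated rather than tabulated. This is a sketch with made-up function names: positions are visited one anti-diagonal at a time, reversing direction on each diagonal. The alternate scan is defined by a table in the standard and is not reproduced here.

```python
def zigzag_order(n=8):
    """(row, column) positions of an n x n block in zig-zag order."""
    positions = [(r, c) for r in range(n) for c in range(n)]

    def key(pos):
        r, c = pos
        s = r + c                      # index of the anti-diagonal
        # Odd diagonals run downwards (row increasing), even ones upwards.
        return (s, r if s % 2 else -r)

    return sorted(positions, key=key)

def zigzag_scan(block):
    """Read a square block into a 1-D list in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

# Label each position of an 8x8 block with its row-major index.
block = [[8 * r + c for c in range(8)] for r in range(8)]
print(zigzag_scan(block)[:10])  # [0, 1, 8, 16, 9, 2, 3, 10, 17, 24]
```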
MPEG2: Subsampling
• Suppose picture is 720x480
  - 4:4:4
    o Luminance and chrominance @ 720x480
  - 4:2:2
    o Luminance @ 720x480, chrominance 360x480
  - 4:2:0
    o Luminance 720x480, chrominance 360x240
• Weird terminology
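The chrominance sizes for the three formats can be written as a small lookup. The function name and the table of subsampling factors are made up for this sketch; the factors themselves are the ones implied by the sizes listed above.

```python
def chroma_size(width, height, fmt):
    """Size of each chrominance plane for a given luminance size and format."""
    # (horizontal factor, vertical factor) by which chrominance is subsampled.
    factors = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}
    fx, fy = factors[fmt]
    return width // fx, height // fy

for fmt in ("4:4:4", "4:2:2", "4:2:0"):
    print(fmt, chroma_size(720, 480, fmt))
# 4:4:4 (720, 480)
# 4:2:2 (360, 480)
# 4:2:0 (360, 240)
```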
Teleconferencing Standards
• Digital video areas
  - Broadcast television
  - Recorded programs
  - Two-way communications
Review: Video Telephone System
[Figure: video telephone system block diagram, labelled with standards H.320, H.200/AV.250-Series, H.221, H.261]
Review: H.261 Features
• Common Interchange Format
  - Interoperability between 25 fps and 30 fps countries
  - 352 pix/line, 288 lines, 30 fps noninterlaced
  - Terminal equipment converts frame and line numbers
  - Y Cb Cr components, color sub-sampled by a factor of 2 in both directions
• Coding
  - DCT, 8x8, 4 Y and 2 chrominance blocks per macroblock
  - I and P frames only, P blocks can be skipped
  - Motion compensation optional, only integer compensation
  - (Optional) forward error correction coding
Picture Formats for H.263

             Image size
Format       Y              Cb, Cr
sub-QCIF     128 x 96       64 x 48
QCIF         176 x 144      88 x 72
CIF          352 x 288      176 x 144
4CIF         704 x 576      352 x 288
16CIF        1408 x 1152    704 x 576
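Encoder: Where’s the meat?
[Figure: encoder block diagram from Video input to Bitstream output. Blocks: Color, Down-sample; Motion Estimation; Motion Compensation; Reference Memory; Predicted Picture; DCT; Quantization; Inv. quantization; IDCT; Entropy coding; Buffer. Three blocks carry load annotations of 63%, 10%, and 10%.]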
Advanced Video Coding
• H.263 and MPEG-4 based on ~1995 technology
• After 1995, MPEG and VCEG (video coding) started working on a new low-rate standard (H.26L)
• Rec H.264 released in September 2002
• Information on http://www.vcodex.com/ (some is on our web site)
• Site maintained by Ian Richardson, who has written books about video coding
New Features
• Prediction in I pictures
• Different block transform
• Different block sizes
• Changes in motion compensation
• VLC and arithmetic coding
I Picture Prediction
• System operates with 4x4 blocks and 16x16 macroblocks
9 Prediction Modes for 4x4 Blocks
4 Modes for 16x16 Macroblocks
• Mode 0: Vertical, extrapolate from upper samples
• Mode 1: Horizontal, extrapolate from left samples
• Mode 2: DC, mean of upper and left-hand samples
• Mode 3: Plane, linear fit to left and upper samples
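Modes 0 to 2 are simple enough to sketch directly. The function below is an illustration only: the block size is taken from the length of the neighbour lists so that a 4x4 demo fits on screen, the DC mean is rounded to the nearest integer as an assumption, and the plane mode is left out because the slide does not give its fitting rule.

```python
def predict(mode, upper, left):
    """Intra prediction of an n x n block from the n samples above it (upper)
    and the n samples to its left (left)."""
    n = len(upper)
    if mode == 0:    # vertical: every row copies the samples above
        return [list(upper) for _ in range(n)]
    if mode == 1:    # horizontal: every column copies the samples to the left
        return [[left[r]] * n for r in range(n)]
    if mode == 2:    # DC: one value, the rounded mean of all neighbours
        mean = (sum(upper) + sum(left) + n) // (2 * n)
        return [[mean] * n for _ in range(n)]
    raise NotImplementedError("plane mode (3) is omitted from this sketch")

upper = [10, 20, 30, 40]   # made-up neighbouring samples
left = [1, 2, 3, 4]
print(predict(0, upper, left)[2])  # [10, 20, 30, 40]
print(predict(1, upper, left)[2])  # [3, 3, 3, 3]
print(predict(2, upper, left)[2])  # [14, 14, 14, 14]
```

The encoder would form each candidate prediction, subtract it from the actual block, and keep the mode whose residual is cheapest to code.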
Different Block Transform
• Basically, 4x4 DCT
• Scanning sequence for 16x16 macroblock is shown below
• 4x4 and 2x2 DC coefficients transformed (again)
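4x4 DCT Tricks
• $Y = A X A^T$
• $a = 1/2$, $b = 0.707\cos(\pi/8)$, $c = 0.707\cos(3\pi/8)$
• Trick: $Y = (C X C^T) \mathbin{.*} E$, where .* is element-by-element multiplication and the ratio $c/b$ is approximated by 1/2 so that $C$ has integer entries

$$A = \begin{bmatrix} a & a & a & a \\ b & c & -c & -b \\ a & -a & -a & a \\ c & -b & b & -c \end{bmatrix}$$

$$C = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix}$$

$$E = \begin{bmatrix} a^2 & ab/2 & a^2 & ab/2 \\ ab/2 & b^2/4 & ab/2 & b^2/4 \\ a^2 & ab/2 & a^2 & ab/2 \\ ab/2 & b^2/4 & ab/2 & b^2/4 \end{bmatrix}$$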
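The factorization can be checked numerically. The sketch below assumes the standard integer core matrix C with second row (2, 1, -1, -2), and it assumes that b is adjusted to sqrt(2/5) so that the scaled rows of C stay orthonormal; that value differs slightly from 0.707 cos(π/8) ≈ 0.653, which is why the result is an approximation of the DCT rather than the DCT itself. All names are made up for this example.

```python
import math

A_VAL = 0.5
B_VAL = math.sqrt(2.0 / 5.0)   # assumption: keeps the scaled rows of C orthonormal

C = [[1, 1, 1, 1],
     [2, 1, -1, -2],
     [1, -1, -1, 1],
     [1, -2, 2, -1]]

ROW_SCALE = [A_VAL, B_VAL / 2, A_VAL, B_VAL / 2]
# E[i][j] = ROW_SCALE[i] * ROW_SCALE[j]: entries a^2, ab/2, b^2/4 as in the matrix E.
E = [[si * sj for sj in ROW_SCALE] for si in ROW_SCALE]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def core_transform(x):
    """C X C^T: integer additions, subtractions, and doublings only."""
    return matmul(matmul(C, x), transpose(C))

def forward(x):
    """(C X C^T) .* E; in a codec the scaling by E can be folded into quantization."""
    w = core_transform(x)
    return [[w[i][j] * E[i][j] for j in range(4)] for i in range(4)]

# Reference: the same transform written as one scaled matrix, Y = A' X A'^T.
A_APPROX = [[ROW_SCALE[i] * C[i][j] for j in range(4)] for i in range(4)]

x = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]  # made-up block
y_trick = forward(x)
y_ref = matmul(matmul(A_APPROX, x), transpose(A_APPROX))
print(max(abs(y_trick[i][j] - y_ref[i][j]) for i in range(4) for j in range(4)))
```

The printed difference is at rounding-error level, confirming that the element-wise scaling by E is equivalent to scaling the rows of C.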
Motion Compensation Ideas
• Adaptive motion compensation blocks:
  - 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4
Coding Ideas
• Constant quantizer value
• Zig-zag scan with novel run-length code
• Arithmetic coding an option
• Motion vectors to 1/4 pixel
Loop Filter
• Concept to overcome block artifacts
• Average across inter-block lines if difference is too big
• Difference threshold depends on coding mode (intra or inter) and quantization step size
Example of Loop Filter