Post on 05-Jan-2016
Introduction to H.264 Video Standard
Anurag Jain, Texas Instruments
H.264 Background
Jointly developed by ITU-T and MPEG.
Up to 50% more efficient at the same visual quality compared to MPEG-4 ASP
Supports a wide range of applications (interlaced, progressive, low bit-rate, studio quality, digital cinema, etc.).
Multiple profiles (Baseline, Main, Extended, High, FRExt).
Good results obtained from interoperability tests, making it suitable for wide deployment in a short span of time.
H.264 Encoder Block Diagram
[Figure: encoder block diagram. Blocks: Video Source, Coding Control, Transform, Quantization, Entropy Coding, Inverse Quantization, Inverse Transform, Loop Filter, Frame Store, Motion Estimation, Motion Compensation, Intra Prediction, Intra/Inter switch. Signals: Predicted Frame, Quantized Transform Coefficients, Motion Vectors, Bit Stream Out.]
Diagram annotations:
Entropy coding: [Single Universal VLC and Context Adaptive VLC] OR [Context-Based Adaptive Binary Arithmetic Coding]
Intra prediction modes: 9 4x4 modes + 4 16x16 modes = 13 modes
Motion estimation and compensation:
• Seven block sizes and shapes
• Multiple reference picture selection
• 1/4-pel motion estimation accuracy
• Referenced B-frames
Transform: integer 16-bit fixed-point transform with no mismatch
Quantization: more step resolution for finer control of bit rate
Common elements with other standards
Macroblocks: 16x16 luma + 2 x 8x8 chroma samples
Input: association of luma and chroma and conventional block motion displacement
Motion vectors over picture boundaries
Block Transform
Variable block-size motion
I, P and B picture coding types
Common Elements
High Level Coding Tools
Sequence and Picture Parameter Sets (SPS & PPS)
Picture Order Count (POC)
Decoded Picture Buffer (DPB)
Slice group map (FMO)
Multiple slices and arbitrary arrangements (ASO)
Supplemental Enhancement Information (SEI)
Hypothetical Reference Decoder (HRD)
Video Usability Information (VUI)
A coded sequence contains one or more access units
An access unit is a set of NAL units that contains all necessary information for decoding exactly one (primary) coded picture
A coded picture is divided into slices (VCL NAL units)
A slice contains a slice header and a set of macroblocks
A macroblock contains a 16x16 luma block and two chroma blocks
An I-slice contains a set of INTRA-coded macroblocks
A P-slice contains a set of INTRA- and INTER-coded macroblocks
An IDR (instantaneous decoding refresh) picture contains only I-slices (SI-slices too in Extended profile)
High Level Tools: Coding Hierarchy
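To make the hierarchy above a little more concrete: every NAL unit begins with a one-byte header, and its low five bits carry the unit type, which is how a decoder tells parameter sets apart from coded slices. The following is a minimal sketch only; the function name `parse_nal_header` and the partial type table are illustrative choices and do not come from the slides.

```python
# Minimal sketch: split the one-byte H.264 NAL unit header into its fields.
# The function name and this (partial) type table are illustrative only.
NAL_TYPES = {
    1: "coded slice (non-IDR)",
    5: "coded slice (IDR)",
    6: "SEI",
    7: "sequence parameter set",
    8: "picture parameter set",
    9: "access unit delimiter",
}


def parse_nal_header(first_byte):
    """Return the three header fields plus a readable type name."""
    forbidden_zero_bit = (first_byte >> 7) & 0x1  # must be 0 in a valid stream
    nal_ref_idc = (first_byte >> 5) & 0x3         # 0 means not used for reference
    nal_unit_type = first_byte & 0x1F             # 5-bit type code
    return {
        "forbidden_zero_bit": forbidden_zero_bit,
        "nal_ref_idc": nal_ref_idc,
        "nal_unit_type": nal_unit_type,
        "name": NAL_TYPES.get(nal_unit_type, "other"),
    }
```

For example, a header byte of 0x67 decodes to type 7 (a sequence parameter set), and 0x65 to type 5 (an IDR slice).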
Profile @ Level indicator
Profile constraint indicator
Sequence parameter set ID (0..31)
Picture order count type and info
DPB (Decoded Picture Buffer) info
Picture size
Frame/field coding flag
Method for vector derivation of B-direct mode
Frame cropping parameters
VUI_parameters (Annex E, Video usability information)
Sequence Parameter Set
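The "Profile @ Level indicator" and "Profile constraint indicator" items occupy the first three bytes of the SPS payload (profile_idc, the constraint_set flags, level_idc). The sketch below reads just those bytes. It assumes the input starts right after the NAL header byte, and the helper name `parse_sps_prefix` is hypothetical. The special case of level 1b is not handled.

```python
# Minimal sketch: read profile and level from the first 3 SPS payload bytes.
# Assumes sps_payload starts immediately after the one-byte NAL header.
PROFILE_NAMES = {66: "Baseline", 77: "Main", 88: "Extended", 100: "High"}


def parse_sps_prefix(sps_payload):
    profile_idc = sps_payload[0]
    flags_byte = sps_payload[1]
    level_idc = sps_payload[2]
    # constraint_set0_flag .. constraint_set2_flag are the top three bits
    constraint_flags = [(flags_byte >> (7 - i)) & 1 for i in range(3)]
    return {
        "profile_idc": profile_idc,
        "profile": PROFILE_NAMES.get(profile_idc, "other"),
        "constraint_set_flags": constraint_flags,
        "level": level_idc / 10,  # level_idc 30 means Level 3.0 (1b not handled)
    }
```

With the bytes 0x42 0xC0 0x1E this reports Baseline profile at Level 3.0 with constraint_set0 and constraint_set1 set.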
Picture Parameter Set
Picture parameter ID (0..255)
Sequence parameter ID (0..31)
Entropy coding mode flag (CABAC/CAVLC)
Slice POC info presence flag
Slice group map parameters
Max. number (1..16) of ref. frames used for decoding slices
Weighted prediction flags
Quantization scales (qp minus 26, range -26 ..+ 25)
Chroma QP offset for loop-filter (-12 ..+12)
Slice loop-filter control flag (Alpha/Beta table offsets)
INTRA prediction using pixels of INTER neighboring MBs?
Slice redundant pic. parameters presence flag
Starting macroblock address
Slice type (I, P, B, SI, SP)
Temporal reference (frame_num)
Picture parameter set ID (0..255)
Interlaced frame/field coding, top/bottom field indicators
IDR picture ID (0..65535)
Slice POC parameters
Redundant picture count (0..127, 0 for baseline)
B-slice temporal or spatial direct mode indicator
Max. number (1..32) of ref. pictures for decoding current slice
Reference picture reordering parameters (DPB)
Weighted prediction parameters
DPB marking parameters (e.g. short term, long term pred. pics)
Slice delta QP (-26..25)
SP switch flag and SP/SI slice QP
Loop-filter indicator (disable_deblocking_filter_idc; 0: enabled, 1: disabled, 2: enabled but filtering across slice boundaries disabled)
Loop-filter alpha/beta table access offset (-6..+6)
Slice group change cycle (derives the no. of MBs in slice group 0)
Slice Header
Slice Group Maps
For error resilience
Ordering of Slices within Slice Groups
Low Level Coding Tools
Motion compensated prediction
Additional intra modes for spatial compensation
Transform: 4x4 Integer transform (Baseline, Main Profiles)
Transform: 8x8 Integer transform (High Profile)
Quantization: Scalar quantization
Entropy Coding : CABAC / CAVLC
In-loop deblocking filter
Enhanced MC (Inter Prediction)
Every macroblock can be split in one of 7 ways for improved motion estimation
Accuracy of motion compensation = 1/4 pixel
Up to 5 reference frames for SDTV size @ L3
Weighted predictions
Reference B pictures
Trade off between accuracy and side information
[Figure: current macroblock, partition, or block with its neighboring blocks A, B, C, and D]
B Slice - Direct Mode
Direct mode
Forward / backward pair of bi-directional prediction
Prediction signal is calculated by a linear combination of two blocks that are determined by the forward and backward motion vectors pointing to two reference pictures.
Spatial Direct mode
Temporal Direct mode
[Figure: temporal direct mode. The co-located partition in the List 1 reference has motion vector mvCol pointing into the List 0 reference; the direct-mode partition in the current picture uses mvL0 and mvL1. tb is the temporal distance from the List 0 reference to the current picture, and td is the temporal distance from the List 0 reference to the List 1 reference.]
mvL0 = tb * mvCol / td
mvL1 = -(td - tb) * mvCol / td
where mvCol is the MV used in the co-located MB of the subsequent picture
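The temporal direct mode formulas above can be sketched in a few lines. This is an illustration of the slide's formulas only: it uses plain division, whereas the standard specifies a fixed-point scaling, and the function name `temporal_direct_mvs` is hypothetical.

```python
# Minimal sketch of the temporal direct mode scaling from the slide.
# Uses plain division for clarity; the standard defines a fixed-point version.
def temporal_direct_mvs(mv_col, tb, td):
    """Derive (mvL0, mvL1) from the co-located vector mv_col = (x, y)."""
    mv_x, mv_y = mv_col
    mv_l0 = (tb * mv_x / td, tb * mv_y / td)
    mv_l1 = (-(td - tb) * mv_x / td, -(td - tb) * mv_y / td)
    return mv_l0, mv_l1
```

With the current picture halfway between the two references (tb = 1, td = 2) and mvCol = (8, -4), this gives mvL0 = (4, -2) and mvL1 = (-4, 2), that is, two vectors of equal length pointing in opposite directions.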
B Slice: Multi-picture Reference Mode
Generalized bidirectional prediction
Multiple reference pictures mode
Two forward references : proper for a region just before scene change
Two backward references : proper for a region just after scene change
[Figure: previous pictures, current picture, and next pictures, showing three cases: 2 forward MVs; 2 backward MVs; 1 forward MV + 1 backward MV (traditional bidirectional)]
H.264 Intra Prediction
4 modes for 16x16 intra prediction
9 modes for 4x4 blocks
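As a rough illustration of how intra prediction builds a block from its neighbors, the sketch below implements three of the nine 4x4 modes (vertical, horizontal, DC) for the case where both the row above and the column to the left are available. The function name and the string mode labels are illustrative assumptions; the six directional modes are omitted.

```python
# Minimal sketch of three of the nine 4x4 intra prediction modes.
# Assumes both the 4 samples above and the 4 samples to the left are available.
def intra4x4_predict(mode, above, left):
    """Return a 4x4 predicted block (list of rows)."""
    if mode == "vertical":      # copy the row above downwards
        return [list(above) for _ in range(4)]
    if mode == "horizontal":    # copy the left column across each row
        return [[left[row]] * 4 for row in range(4)]
    if mode == "dc":            # rounded mean of the 8 neighbors
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("mode not covered by this sketch: %r" % mode)
```

An encoder would typically try each mode and keep the one with the cheapest residual.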
Luma Sub-Pixel Interpolation
Chroma Sub-pel Calculation
If (vx, vy) is the luma motion vector, then xFracC = vx & 0x7, yFracC = vy & 0x7
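The fractional offsets feed a bilinear interpolation between four neighboring chroma samples. The sketch below assumes 4:2:0 video, where the quarter-pel luma vector is reused as an eighth-pel chroma vector; the function names and the sample names a, b, c, d (top-left, top-right, bottom-left, bottom-right) are illustrative.

```python
# Minimal sketch of chroma sub-pel prediction, assuming 4:2:0 sampling.
def chroma_fraction(vx, vy):
    """Eighth-sample fractional offsets taken from the vector components."""
    return vx & 0x7, vy & 0x7


def chroma_sample(a, b, c, d, x_frac, y_frac):
    """Bilinear blend of four integer-position chroma samples.

    a = top-left, b = top-right, c = bottom-left, d = bottom-right.
    The weights sum to 64, hence the +32 rounding term and the shift by 6.
    """
    return ((8 - x_frac) * (8 - y_frac) * a
            + x_frac * (8 - y_frac) * b
            + (8 - x_frac) * y_frac * c
            + x_frac * y_frac * d
            + 32) >> 6
```

When both fractions are zero the result is simply sample a, so integer-position vectors need no filtering.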
Block Scanning Order in a MB
One more extraction of correlation among sub-blocks
Integer 4x4 DCT approximation. 8x8
Cost of transformed differences (i.e. residual coefficients) for 4x4 block using 4 x 4 Hadamard-Transformation for INTRA_16x16 coded macroblocks.
Scalar quantization.
Transform & Quant
[Figure: transform and quantization data flow. Labels: Hadamard; 8x8 Luma-Chroma; 4x4 Luma/Chroma AC]
All integers!
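To show what "all integers" means in practice, here is a minimal sketch of the 4x4 forward core transform, Y = Cf * X * Cf^T, using only integer additions and multiplications. The scaling that makes this approximate a DCT is folded into quantization and is not shown. The helper names are illustrative.

```python
# Minimal sketch of the 4x4 integer core transform (scaling step omitted).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul4(a, b):
    """Plain 4x4 integer matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose4(m):
    return [list(row) for row in zip(*m)]


def forward_core_transform(block):
    """Compute CF * block * CF^T for a 4x4 residual block."""
    return matmul4(matmul4(CF, block), transpose4(CF))
```

A flat residual block (all samples equal) produces a single non-zero DC coefficient, which is the energy compaction the transform is meant to provide.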
Deblocking filter
Frame / Field Adaptation
Picture Adaptive Frame Field (PicAFF).
Macroblock Adaptive Frame Field (MBAFF)
Field scan and zig-zag scan options
Interlaced Coding
[Figure: zig-zag frame scan and field scan]
Universal Variable Length Coding (UVLC) using Exp-Golomb codes.
Context Adaptive VLC (CAVLC)
Context Adaptive Binary Arithmetic Coding (CABAC)
Entropy Coding
CAVLC
• TotalCoeff = 7 : # of non-zeros
• Trailing 1s = 2 : 1, -1
• Sign Trail = 1 0 (reverse order) : minus, plus
• Levels = 5 20 27 33 50 (reverse order) : 7 - 2 = 5 levels
• TotalZeros = 3 (# of zeros before the last non-zero coefficient)
• RunsBefore = 0 2 1 : 0 before -1, 2 before 1, and 1 before 5
Zigzag order: 50 33 27 20 0 5 0 0 1 -1 0 0 0 0 0 0
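The symbols in the CAVLC example above can be derived mechanically from the zig-zag coefficient list. The sketch below computes the symbols only, not the variable-length codewords themselves; the function name `cavlc_symbols` and the dictionary keys are illustrative.

```python
# Minimal sketch: derive CAVLC syntax symbols from zig-zag ordered coefficients.
# Only the symbols are computed; mapping them to VLC tables is not shown.
def cavlc_symbols(zigzag):
    nonzero_idx = [i for i, c in enumerate(zigzag) if c != 0]
    total_coeff = len(nonzero_idx)
    if total_coeff == 0:
        return {"total_coeff": 0, "trailing_ones": 0, "signs": [],
                "levels": [], "total_zeros": 0, "runs_before": []}

    # Coefficients are processed in reverse (high frequency first).
    rev_idx = list(reversed(nonzero_idx))
    rev_coeffs = [zigzag[i] for i in rev_idx]

    # Up to three consecutive +/-1 values at the high-frequency end.
    trailing_ones = 0
    for c in rev_coeffs:
        if abs(c) == 1 and trailing_ones < 3:
            trailing_ones += 1
        else:
            break
    signs = [1 if c < 0 else 0 for c in rev_coeffs[:trailing_ones]]
    levels = rev_coeffs[trailing_ones:]

    # Zeros located before the last non-zero coefficient.
    total_zeros = nonzero_idx[-1] + 1 - total_coeff

    # Run of zeros preceding each coefficient, until no zeros remain.
    runs_before = []
    zeros_left = total_zeros
    for cur, nxt in zip(rev_idx, rev_idx[1:]):
        if zeros_left == 0:
            break
        run = cur - nxt - 1
        runs_before.append(run)
        zeros_left -= run

    return {"total_coeff": total_coeff, "trailing_ones": trailing_ones,
            "signs": signs, "levels": levels,
            "total_zeros": total_zeros, "runs_before": runs_before}
```

Running it on the zig-zag sequence from the slide reproduces TotalCoeff = 7, two trailing ones with signs 1 0, levels 5 20 27 33 50, TotalZeros = 3, and runs 0 2 1.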
Exp Golomb Coding
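Exp-Golomb codes, as used for most header syntax elements, can be generated without any table: write codeNum + 1 in binary and prefix it with one fewer zeros than it has bits. The sketch below returns the codes as bit strings for readability; the function names mirror the ue(v) and se(v) descriptors, but the code itself is only an illustration.

```python
# Minimal sketch of Exp-Golomb encoding, returning bit strings for readability.
def ue(code_num):
    """Unsigned Exp-Golomb code for a non-negative integer."""
    bits = bin(code_num + 1)[2:]          # binary of code_num + 1
    return "0" * (len(bits) - 1) + bits   # zero prefix, then the value


def se(value):
    """Signed Exp-Golomb: positive v maps to 2v - 1, others to -2v."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return ue(code_num)
```

Small values get short codes: 0 is coded as "1", 1 as "010", 2 as "011", and 3 as "00100".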
Loop filter
[Figure: edges filtered within a 16x16 macroblock. Panels: vertical edges (luma), vertical edges (chroma), horizontal edges (luma), horizontal edges (chroma)]
Check whether the boundary is a real edge in the picture or a blocking artifact
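The real-edge versus blocking-artifact check can be sketched as a threshold test on the samples on either side of the edge (p1, p0 | q0, q1). In the standard the thresholds alpha and beta come from QP-indexed tables and bS is the boundary strength; here they are plain parameters, and the values used in any call are arbitrary illustrations.

```python
# Minimal sketch of the deblocking filter's per-edge decision.
# alpha and beta are passed in directly; the standard derives them from QP.
def should_filter_edge(p1, p0, q0, q1, alpha, beta, bs):
    """True if the step across the edge looks like a blocking artifact."""
    return (bs > 0
            and abs(p0 - q0) < alpha    # step across the edge is small
            and abs(p1 - p0) < beta     # left side is locally smooth
            and abs(q1 - q0) < beta)    # right side is locally smooth
```

A small step between smooth regions is treated as an artifact and filtered, while a large step is assumed to be real picture content and left untouched.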
Profiles and Tools
[Figure: diagram of coding tools by profile (Baseline Profile, Main Profile, Extended Profile). Tools shown: I slice, P slice, B slice, SI slice, SP slice, CAVLC, CABAC, weighted prediction, data partition, arbitrary slice order, flexible macroblock order, redundant slice]
H.264 Profiles and Tools: Graphical Representation
Lossless representation
Allows more than 8 bits per sample (up to 12 bits)
Higher resolution for color representation (4:2:2, 4:4:4)
Source editing function like alpha blending
Very high bit-rates (often with constant quality)
Very high-resolution
Color space transformation (YCgCo, YCbCr, RGB)
RGB color representation
Adaptive block transform sizes
Quantization matrices
FRExt: Fidelity Range Extension
Coding Efficiency
Comparison of Standards
Feature/Standard | MPEG-1 | MPEG-2 | MPEG-4 part 2 (visual) | H.264/MPEG-4 part 10
Macroblock size | 16x16 | 16x16 (frame mode), 16x8 (field mode) | 16x16 | 16x16
Block size | 8x8 | 8x8 | 16x16, 16x8, 8x8 | 16x16, 8x16, 16x8, 8x8, 4x8, 8x4, 4x4
Transform | 8x8 DCT | 8x8 DCT | 8x8 DCT/Wavelet | 4x4, 8x8 Int DCT; 4x4, 2x2 Hadamard
Quantization | Scalar quantization with step size of constant increment | Scalar quantization with step size of constant increment | Vector quantization | Scalar quantization with step size increasing at the rate of 12.5%
Entropy coding | VLC | VLC | VLC | VLC, CAVLC, CABAC
Motion estimation & compensation | Yes | Yes | Yes | Yes, more flexible; up to 16 MVs per MB
Playback & random access | Yes | Yes | Yes | Yes
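The H.264 quantization entry in the table above (step size growing by roughly 12.5% per step) corresponds to a step size that doubles every 6 QP values. The sketch below reproduces that relationship from the six base step sizes for QP 0 to 5; the function name `qstep` is illustrative.

```python
# Minimal sketch: quantizer step size as a function of QP (0..51).
# The six base values repeat, doubling with every increase of 6 in QP.
QSTEP_BASE = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def qstep(qp):
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return QSTEP_BASE[qp % 6] * (1 << (qp // 6))
```

This gives 0.625 at QP 0, 1.0 at QP 4, 2.0 at QP 10, and 224 at QP 51, so the full QP range spans a ratio of several hundred between the finest and coarsest step.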
Comparison of Standards (cont'd)
Feature/Standard | MPEG-1 | MPEG-2 | MPEG-4 part 2 (visual) | H.264/MPEG-4 part 10
Pel accuracy | Integer, ½-pel | Integer, ½-pel | Integer, ½-pel, ¼-pel | Integer, ½-pel, ¼-pel
Profiles | No | 5 | 8 | 3
Reference picture | one | one | one | multiple
Bidirectional prediction mode | forward/backward | forward/backward | forward/backward | forward/forward, forward/backward, backward/backward
Picture types | I, P, B, D | I, P, B | I, P, B | I, P, B, SP, SI
Error robustness | Synchronization & concealment | Data partitioning, FEC for important packet transmission | Synchronization, data partitioning, header extension, reversible VLCs | Data partitioning, parameter setting, flexible macroblock ordering, redundant slice, switched slice
Transmission rate | Up to 1.5 Mbps | 2-15 Mbps | 64 kbps - 2 Mbps | 64 kbps - 150 Mbps
Compatibility with previous standards | n/a | Yes | Yes | No
Encoder complexity | Low | Medium | Medium | High
References
– Related groups
• MPEG website: http://www.mpeg.org
• JVT website: ftp://ftp.imtc-files.org/jvt-experts
• www.mpegif.org
– Test software
• H.264/AVC JM Software: http://bs.hhi.de/~suehring/tml/download
– Test sequences
• http://ise.stanford.edu/video.html
• http://kbs.cs.tu-berlin.de/~stewe/vceg/sequences.htm
• http://www.its.bldrdoc.gov/vqeg
• ftp.tnt.uni-hannover.de/pub/jvt/sequences/
• http://trace.eas.asu.edu/yuv/yuv.html
THANKS