chuong 1&2.pdf

Date post: 24-Oct-2014
Pham Van Tien, Dr. rer. nat., Embedded Networking Research Group
Email: [email protected]
Faculty of Elec. and Telecom, Hanoi University of Science and Technology
C9-411, Dai Co Viet str. 1, Hanoi

Video Coding

Tien Pham Van, Dr. rer. nat.

Hanoi University of Technology


Agenda

• Video coding process

• Video coding standards

• Future development


Introduction (1/2)

• Why is video compression important?
• One movie without compression:
– 720 x 480 pixels per frame
– 30 frames per second
– 90 minutes in total
– Full color (24 bits per pixel)
– Total data = 167.96 GB
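The 167.96 GB figure can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes "full color" means 3 bytes per pixel and that 1 GB is 10^9 bytes; the variable names are illustrative.

```python
# Raw (uncompressed) size of the example movie.
# Assumption: "full color" = 24 bits (3 bytes) per pixel, 1 GB = 1e9 bytes.
WIDTH, HEIGHT = 720, 480
BYTES_PER_PIXEL = 3
FPS = 30
DURATION_S = 90 * 60

bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL
total_bytes = bytes_per_frame * FPS * DURATION_S
raw_bit_rate = bytes_per_frame * FPS * 8  # bits per second

print(f"{total_bytes / 1e9:.2f} GB")       # 167.96 GB
print(f"{raw_bit_rate / 1e6:.1f} Mbit/s")  # 248.8 Mbit/s
```

The raw bit rate of roughly 249 Mbit/s is the number a codec has to bring down by two orders of magnitude for DVD or broadcast use.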


Introduction (2/2)

• What is the difference between video compression and image compression?
– Video also has temporal redundancy (similarity between successive frames)
• Coding methods to remove redundancy
– Intraframe coding: removes spatial redundancy
– Interframe coding: removes temporal redundancy


Desired Features

• Better compression

• Improved quality

• Interactivity and Manipulation of Content

• Error Resilience

• Processing of content in the compressed domain

• Identification and selective coding/decoding of the object of interest

• Facilitate Search / Indexing (MPEG-7)


Time table

[Figure: timeline of standards by year, 1986 to 2010: JPEG, H.261, MPEG1, MPEG2/H.262, H.263, MPEG4, H.26L/H.264, VC-1/VC-2, H.265]


Evolution of Video Compression Standards

ITU-T:
• H.261: video telephony
• H.263: video conferencing

MPEG:
• MPEG-1: Video-CD
• MPEG-4 Visual: object-based coding

Joint ITU-T / MPEG:
• H.262/MPEG-2: digital TV/DVD
• H.264/MPEG-4 AVC


Where used?

– MPEG-1:
• Video-CD
• Usually .mpg or .mpeg files are MPEG-1
• DAB digital radio uses MP2 (MPEG-1 Layer 2)
• MP3 files (MPEG-1 Layer 3)
– MPEG-2:
• .vob, .m2v, rarely .mpg files
• Anything to do with DVD: camcorders, DVD players, DVD recorders
• Digital TV (DVB)
– MPEG-4:
• High-quality AVI files
• Video phones
• DivX
• Some advanced audio players support MPEG-4 Advanced Audio Coding (AAC)


Where used?

– H.263/+/++:
• NetMeeting and similar video chat
• Network streaming applications, video phones, …
– H.264:
• Video conferencing over different networks
• Multimedia streaming: live and on-demand
• Multimedia Messaging Services (MMS)
• Blu-ray, Digital Video Broadcasting, iPod video, HD DVD
– VC-1, VC-2:
• Video on the Internet
• HDTV broadcast, UHDTV


R-D Performance of MPEG Codecs

[Figure: rate-distortion curves for MPEG-1, MPEG-2, MPEG-4 and H.264. Horizontal axis: bit rate (kbps), 350 to 1050. Vertical axis: PSNR (Y), 32 to 50.]
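The vertical axis of the R-D plot is the PSNR of the luma (Y) component. As a minimal sketch, assuming 8-bit samples (peak value 255) and frames given as plain 2D lists, PSNR can be computed from the mean squared error as follows; the function name `psnr` and the sample values are illustrative.

```python
import math

def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally sized 2D lists of samples."""
    squared_error = 0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for o, d in zip(row_o, row_d):
            squared_error += (o - d) ** 2
            count += 1
    if squared_error == 0:
        return math.inf  # identical pictures
    mse = squared_error / count
    return 10 * math.log10(peak ** 2 / mse)

luma = [[100, 102], [104, 106]]
decoded = [[101, 101], [105, 105]]  # every sample is off by 1, so MSE = 1
print(round(psnr(luma, decoded), 2))  # 48.13
```

A codec with better R-D performance reaches the same PSNR at a lower bit rate, which is what the curves compare.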


Questions

• What are video/audio codecs? Name some popular codecs that your media player supports. What are the disadvantages of using specific codecs?
• What is a container format? Name some examples.
• Codecs and formats


Compression...

[Figure: two consecutive movie pictures, picture 1 and picture 2]


Motion estimation

[Figure: “Horse ride” sequence. Panels: pixel-wise difference without motion compensation; residue after motion compensation]


Motion Prediction

• Motion vector: a two-dimensional pointer that tells the decoder how far left/right and up/down the prediction macroblock lies from the position of the current macroblock

• Motion estimation: the process, performed by the encoder, of finding the motion vector that points to the best prediction macroblock in a reference frame or field

• Motion compensation: the prediction obtained by applying the motion vectors to the reference frame
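The relationship between a motion vector, the motion-compensated prediction and the residue can be shown on a toy frame. This is a sketch under stated assumptions: frames are small 2D lists, the motion vector is a (dy, dx) integer pair, and the helper names `predict_block` and `residual` are hypothetical.

```python
def predict_block(reference, top, left, mv, size):
    """Motion compensation: fetch the block the motion vector points to.
    mv = (dy, dx) is the displacement from the current block position."""
    dy, dx = mv
    return [row[left + dx: left + dx + size]
            for row in reference[top + dy: top + dy + size]]

def residual(current_block, prediction):
    """Sample-wise difference that remains to be coded."""
    return [[c - p for c, p in zip(rc, rp)]
            for rc, rp in zip(current_block, prediction)]

# Reference frame with a bright 2x2 object at rows 1-2, columns 1-2.
reference = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
# In the current frame the object has moved two pixels to the right,
# so the current block at (top=1, left=3) contains the object.
current_block = [[9, 9], [9, 9]]

prediction = predict_block(reference, 1, 3, (0, -2), 2)
print(residual(current_block, prediction))                                # all zeros
print(residual(current_block, predict_block(reference, 1, 3, (0, 0), 2)))  # all nines
```

With the right vector the residue is zero; with a zero vector (plain frame differencing) the whole object shows up in the residue.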


Motion Estimation

• Helps understand the content of an image sequence
– For surveillance
• Helps reduce the temporal redundancy of video
– For compression
• Stabilizes video by detecting and removing small, noisy global motions
– For building the stabilizer in a camcorder


Motion Compensation

• It aims to reduce the data transmitted by detecting the motion of objects
– Use the previous frame as reference
– In steps:
• Split the current frame into blocks. For each one:
• Find the best-matching block in the reference frame
• Code and transmit the motion vector to the best-matching block, together with the remaining residual
– The next frame can be used as a reference too
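The block-matching step listed above can be sketched as an exhaustive (full) search. The slides do not name a matching criterion or search strategy, so the sum of absolute differences (SAD), the search range and the function names here are assumptions chosen for illustration.

```python
def sad(frame_a, top_a, left_a, frame_b, top_b, left_b, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(frame_a[top_a + y][left_a + x]
                         - frame_b[top_b + y][left_b + x])
    return total

def full_search(current, reference, top, left, size, search_range):
    """Try every displacement within +/- search_range and keep the cheapest."""
    height, width = len(reference), len(reference[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ref_top, ref_left = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if (ref_top < 0 or ref_left < 0
                    or ref_top + size > height or ref_left + size > width):
                continue
            cost = sad(current, top, left, reference, ref_top, ref_left, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost

reference = [
    [0, 0, 0, 0, 0, 0],
    [0, 5, 6, 0, 0, 0],
    [0, 7, 8, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
current = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 5, 6, 0],
    [0, 0, 0, 7, 8, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
mv, cost = full_search(current, reference, top=2, left=3, size=2, search_range=2)
print(mv, cost)  # (-1, -2) 0
```

Full search is the simplest strategy and the most expensive one; practical encoders usually replace it with faster search patterns.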


Picture type

• Slice
– One or more contiguous macroblocks. The macroblocks within a slice are ordered from left to right and top to bottom.
• Macroblock
– A 16-pixel by 16-line section of the luminance component and the corresponding 8-pixel by 8-line sections of the two chrominance components.
• Block
– An 8-pixel by 8-line set of values of a luminance or a chrominance component.
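To make these definitions concrete, the following sketch counts macroblocks and 8x8 blocks for the 720 x 480 frame used in the introduction. It assumes the frame dimensions are multiples of 16; the function name `macroblock_layout` is illustrative.

```python
def macroblock_layout(width, height, mb_size=16):
    """Return (macroblocks per row, macroblock rows, total macroblocks).
    Assumes width and height are multiples of mb_size."""
    mbs_per_row = width // mb_size
    mb_rows = height // mb_size
    return mbs_per_row, mb_rows, mbs_per_row * mb_rows

per_row, rows, total = macroblock_layout(720, 480)

# One macroblock = 16x16 luma (four 8x8 blocks)
# plus one 8x8 block for each of the two chroma components.
blocks_per_mb = (16 // 8) ** 2 + 2

print(per_row, rows, total)   # 45 30 1350
print(total * blocks_per_mb)  # 8100 blocks per frame
```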


CODEC Design


Coding functions

• Achieve high compression performance while keeping good picture quality
• Principles: each kind of redundancy has its own tools
– Spatial redundancy: DCT, DFT, subband, wavelet
– Temporal redundancy: MC/ME
– Statistical redundancy: VLC, entropy coding
– Perceptual redundancy: VQ


Tradeoffs in lossy compression


DCT

• Uses the JPEG technique
– DCT-based coding scheme
• 2D DCT
• 3D DCT?
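A direct implementation of the 2D DCT on an 8 x 8 block makes the energy-compaction idea visible. This is an unoptimized sketch that assumes the orthonormal DCT-II scaling; real codecs use fast, separable implementations, and the function name `dct_2d` is illustrative.

```python
import math

N = 8

def dct_2d(block):
    """Orthonormal 2D DCT-II of an N x N block (direct, unoptimized)."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

flat = [[128] * N for _ in range(N)]  # a perfectly flat block
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))  # 1024.0: all the energy sits in the DC coefficient
```

For a flat block every AC coefficient is zero up to floating-point error, so only one value needs to be coded instead of 64.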


Discrete cosine transform

• Use the technique of the JPEG

– Discrete cosine transform


DCT Transformation


Steps

Image
→ 8 x 8 DCT: spatial-to-DCT domain transformation
→ Quantization: discard unimportant DCT domain samples
→ Entropy coding: lossless coding of DCT domain samples


Quantization

• Quantization
– The eye is insensitive to high-frequency components
– A larger quantizer step means greater loss
– Low-frequency components get a smaller quantizer step; high-frequency components get a larger one
– The quantization tables in the encoder and decoder are the same
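The effect of frequency-dependent quantizer steps can be sketched on a small coefficient block. The 4 x 4 table and the coefficient values below are made up for illustration and are not taken from any standard; the only property that matters is that the steps grow towards the high-frequency (bottom-right) corner.

```python
def quantize(coeffs, table):
    """Encoder side: divide each coefficient by its step and round."""
    return [[round(c / q) for c, q in zip(row_c, row_q)]
            for row_c, row_q in zip(coeffs, table)]

def dequantize(levels, table):
    """Decoder side: multiply back with the same table."""
    return [[l * q for l, q in zip(row_l, row_q)]
            for row_l, row_q in zip(levels, table)]

# Hypothetical table: small steps for low frequencies, large for high.
table = [[8, 12, 20, 32],
         [12, 16, 28, 40],
         [20, 28, 40, 56],
         [32, 40, 56, 80]]

coeffs = [[624, -45, 18, 6],
          [-38, 22, -9, 3],
          [14, -8, 4, -2],
          [5, 3, -2, 1]]

levels = quantize(coeffs, table)
print(levels)
# [[78, -4, 1, 0], [-3, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
print(dequantize(levels, table)[0])
# [624, -48, 20, 0]
```

Most high-frequency levels become zero, which is what the entropy coder exploits, and the decoder can only recover multiples of the step, which is where the loss comes from.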


Picture type

• Video bit stream


Picture type

• Intra picture

– Coded using only information present in the picture itself
– I-pictures provide potential random access points into the compressed video data.


Picture type

• Predicted picture

– Coded with respect to the nearest previous I- or P-picture
– P-pictures use motion compensation
– Unlike I-pictures, P-pictures can propagate coding errors


Picture type

• Bidirectional picture

– Coded using both a past and a future picture as references
– B-pictures provide the most compression and do not propagate errors, since in MPEG-1/2 they are never used as a reference


Picture type

• Typical display order of picture types
• Video stream composition
– The MPEG encoder reorders pictures in the video stream to present them to the decoder in the most efficient sequence
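The reordering exists because a B-picture cannot be decoded until the future reference it depends on has arrived. As a sketch, assuming pictures are labeled by type and display index (the labels and the function name are illustrative), the encoder-side reordering looks like this:

```python
def display_to_coded_order(display):
    """Move each run of B-pictures behind the reference picture that follows it."""
    coded, pending_b = [], []
    for pic in display:
        if pic[0] == "B":
            pending_b.append(pic)    # must wait for the next I- or P-picture
        else:
            coded.append(pic)        # the reference goes out first
            coded.extend(pending_b)  # then the B-pictures that needed it
            pending_b = []
    coded.extend(pending_b)          # trailing Bs with no later reference in this list
    return coded

display = ["I1", "B2", "B3", "P4", "B5", "B6", "P7"]
print(display_to_coded_order(display))
# ['I1', 'P4', 'B2', 'B3', 'P7', 'B5', 'B6']
```

The decoder undoes this by holding each reference picture back until the B-pictures that precede it in display order have been output.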


Hybrid MC-DCT Video Encoder

• Intra-frame: encoded without prediction

• Inter-frame: predictively encoded; the encoder forms the residue against reconstructed (quantized and decoded) frames, so that its reference matches the one the decoder has
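Why the reference must be the reconstructed frame rather than the original can be seen on a one-sample "frame". This toy sketch assumes a uniform quantizer with a made-up step of 5; in the open-loop variant the encoder predicts from the original sample, which the decoder never has, so the error accumulates.

```python
def quantize(value, step):
    """Quantize and immediately dequantize (what the decoder will see)."""
    return round(value / step) * step

def encode_decode(samples, step, closed_loop):
    """Predict each sample from the previous one and code the quantized residue.
    closed_loop=True: encoder reference = decoder's reconstruction.
    closed_loop=False: encoder reference = original previous sample."""
    enc_ref = 0
    dec_ref = 0
    decoded = []
    for s in samples:
        residue = quantize(s - enc_ref, step)
        dec_ref = dec_ref + residue  # decoder adds the residue to its reference
        decoded.append(dec_ref)
        enc_ref = dec_ref if closed_loop else s
    return decoded

samples = [3, 6, 9, 12, 15, 18]
print(encode_decode(samples, 5, closed_loop=False))  # [5, 10, 15, 20, 25, 30]
print(encode_decode(samples, 5, closed_loop=True))   # [5, 5, 10, 10, 15, 20]
```

In the closed-loop case the error stays within half a quantizer step, while in the open-loop case it grows with every predicted frame.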


MPEG-1 = JPEG + Motion Prediction + Rate Control

• Early motivation: to encode motion video at 1.5 Mbit/s for transport over T1 data circuits and for replay from CD-ROM
• Defines the decoder but not the encoder (note A21)
• Frames (pictures)
– Intra-coded using JPEG (note A22)
– Inter-coded using (interpolated) ME & MC, with JPEG for the residuals
• Macroblocks (MBs)
– 16×16 pixel blocks
• Rate control
– Buffer at each end
– Test Model 5 (TM5)

Notes on the MPEG-1 slide:

A22: Intra-coding of MBs in MPEG is the same as described for JPEG, except that (1) unless otherwise specified in the sequence header, MPEG defines two quantization tables, one used for intra-coding and the other used to code any residuals from prediction by motion estimation, and (2) the quantization scale factor, or MQuant, is different. (Author, 6/17/2004)

A21: MPEG does not define the encoder. A valid encoder produces a syntactically correct bit stream that yields the desired output when fed to a compliant decoder. An MPEG-1 compliant decoder, however, is required to decode all valid MPEG-1 bit streams. (Author, 6/17/2004)


MPEG-2 = MPEG-1 +

• Improvements
– Color space: supports 4:2:2 and 4:4:4 coding
– Quantization: DC coefficients can have 9- or 10-bit precision
– Concealment motion vectors: used when an intra-MB is lost
– Pan and scan: supports display at different aspect ratios, e.g., 16:9
• Profiles and levels
– Profiles: define the tools or syntactical elements
– Levels: define the permissible ranges of parameters


MPEG-2 = MPEG-1 +

• Interlace tools
• Scalable coding profiles
• System layer: defines two bit stream constructs
– Program stream (PS): modeled on MPEG-1 (backward compatibility)
– Transport stream (TS): more robust, does not need a common time base, designed for use in error-prone environments


MPEG-4 = MPEG-2 + Objects + Other Enhancements

• Object-oriented
– Video (texture + shape), image, audio, speech, text, etc.
– Encoded using different techniques
– Transmitted independently
– Composited at the decoder using the BInary Format for Scenes (BIFS)
• Improvements in MPEG-4 version 2
– Global motion compensation (GMC)
– Quarter-pixel motion compensation
– Shape-adaptive DCT
• Why is MPEG-4 not as successful as MPEG-2?
– Not substantially better than MPEG-2
– Suffers from its sheer size and flexibility
– Licensing issues


MPEG-4 – Error Resilience Tools

• Video packet resynchronization
– Previous coding standards: resynchronization markers are fixed at the beginning of each row of MBs
– MPEG-4: resynchronization markers are inserted every K bits
• Data partitioning
– Partitions the data in a video packet into a motion part and a texture part, separated by a motion boundary marker (MBM)

[Figure: a video packet: Resync. marker | MB No. | QP | HEC | Repeated header info. | Motion data | MBM | DCT data. Labels below the packet: use, discard, use. Data partitioning, I-VOP: VP header | DC DCT data | AC DCT data. P-VOP: VP header | Motion data | Texture data.]

MPEG-4 – Error Resilience Tools

• Reversible variable length codes (RVLC)
– After an error, the decoder finds the next resynchronization marker and decodes backwards from it
• Header extension code (HEC)
– The header information is repeated after the 1-bit HEC
• Unequal error protection (UEP)

[Figure: video packet layout, repeated from the previous slide]


New Features of H.264

• Multi-mode, multi-reference MC

• Motion vector can point out of image border

• 1/4-, 1/8-pixel motion vector precision

• B-frame prediction weighting

• 4×4 integer transform

• Multi-mode intra-prediction

• In-loop de-blocking filter

• UVLC (Universal Variable Length Coding)

• NAL (Network Abstraction Layer)

• SP-slices


Profiles and Levels

• Profiles: Baseline, Main, and X
– Baseline: progressive video, videoconferencing and wireless
– Main: esp. broadcast
– X: mobile networks
• The Baseline profile is the minimum implementation
– No CABAC, 1/8-pixel MC, B-frames, or SP-slices
• 11 levels
– Constrain resolution, capability, bit rate, buffer size, number of reference frames
– Built to match popular international production and emission formats
– From QCIF to D-Cinema


Basic Macroblock Coding Structure

[Figure: block diagram of the encoder. The input video signal is split into macroblocks of 16x16 pixels. Blocks shown: Coder Control, Transform/Scal./Quant., Scaling & Inv. Transform, De-blocking Filter, Intra-frame Prediction, Motion Compensation, Motion Estimation, Intra/Inter selection, Entropy Coding, and the embedded Decoder that produces the output video signal. Signals shown: control data, quantized transform coefficients, motion data.]


Variable block size

• A fixed block size may not suit all moving objects
– Variable block sizes make matching more flexible
– They reduce the matching error
• 7 block types to select from
– 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, 4×4
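Smaller blocks follow motion more closely but cost more motion vectors. As a rough sketch, assuming for simplicity that a whole 16×16 macroblock is tiled with a single block size (in practice sizes can be mixed within a macroblock), the number of vectors per macroblock is:

```python
PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]

def vectors_per_macroblock(width, height, mb_size=16):
    """Motion vectors needed if one 16x16 macroblock is tiled with this size."""
    return (mb_size // width) * (mb_size // height)

counts = {f"{w}x{h}": vectors_per_macroblock(w, h) for w, h in PARTITIONS}
print(counts)
# {'16x16': 1, '16x8': 2, '8x16': 2, '8x8': 4, '8x4': 8, '4x8': 8, '4x4': 16}
```

The encoder therefore has to weigh the smaller residue of fine partitions against the extra bits spent on up to 16 vectors.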


Multiple Reference Frames

• The neighboring frames are not always the most similar ones
• A B-frame can be a reference frame
– The B-frame is close to the target frame in many situations


Spatial Prediction for Intra-Coded MBs

• Luma
– 4x4: 9 modes
– 16x16: 4 modes
• Chroma
– 8x8: 4 modes
– The same prediction mode is always applied to both chroma blocks
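Three of the nine 4x4 luma modes are simple enough to sketch: vertical, horizontal and mean (DC). The code below is an illustration under assumptions: neighbors are named as in the slide figure (A to D above, I to L to the left), the DC value is the rounded mean of those eight samples, and mode selection by smallest SAD is a stand-in for whatever decision rule an encoder actually uses.

```python
def predict_vertical(above):
    """Copy the row above (A, B, C, D) down into every row."""
    return [list(above) for _ in range(4)]

def predict_horizontal(left):
    """Copy the left column (I, J, K, L) across every row."""
    return [[v] * 4 for v in left]

def predict_dc(above, left):
    """Fill the block with the rounded mean of the 8 neighbors."""
    mean = (sum(above) + sum(left) + 4) >> 3
    return [[mean] * 4 for _ in range(4)]

def sad(block, prediction):
    return sum(abs(b - p)
               for row_b, row_p in zip(block, prediction)
               for b, p in zip(row_b, row_p))

def best_mode(block, above, left):
    """Pick the mode with the smallest residue (assumed criterion: SAD)."""
    candidates = {
        "vertical": predict_vertical(above),
        "horizontal": predict_horizontal(left),
        "dc": predict_dc(above, left),
    }
    return min(candidates, key=lambda mode: sad(block, candidates[mode]))

# A block with vertical stripes that continue from the row above.
block = [[10, 20, 30, 40] for _ in range(4)]
above = [10, 20, 30, 40]
left = [10, 10, 10, 10]
print(best_mode(block, above, left))  # vertical
```

Only the mode index and the (here zero) residue need to be coded, which is why intra prediction helps on structured content.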

[Figure: intra prediction modes. 4x4 luma blocks are predicted from the neighboring samples A to H, I to L and M, with one panel labeled "Mean (A-D, I-M)". Larger blocks are predicted from the neighbors labeled H and V, with one panel labeled "Mean (H, V)".]


Deblocking filter

• The picture is filtered using an adaptive deblocking filter.
• The filter removes visible block structures at the edges of the 4×4 blocks, caused by block-based transform coding and motion estimation


Deblocking Filters

• A boundary-strength (BS) parameter is assigned to every 4×4 block
– BS = 0: no filtering
– BS = 1-3: slight filtering
– BS = 4: strong filtering
• Filters only when all of the following hold
– |P0-Q0| < α
– |P1-P0| < β
– |Q1-Q0| < β
• Thresholds α and β depend on the average quantization parameter (QP)
• The deblocking filter accounts for 1/3 of the computational complexity of a decoder.

Block modes and conditions | BS
One of the blocks is intra-coded and the edge is a MB edge | 4
One of the blocks is intra-coded | 3
One of the blocks has coded residuals | 2
Difference of block motion ≥ one luma sample distance | 1
Motion compensation from different reference frames | 1
Else | 0

[Figure: samples P3 P2 P1 P0 | Q0 Q1 Q2 Q3 on either side of a block edge]
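The boundary-strength table and the three threshold tests translate directly into code. In this sketch the function and parameter names are hypothetical, and the α and β values in the example are made up; in a real decoder they are looked up from tables indexed by QP.

```python
def boundary_strength(p_intra, q_intra, mb_edge, coded_residuals,
                      motion_diff_ge_one_sample, different_refs):
    """Boundary strength for the edge between blocks P and Q, per the table."""
    if (p_intra or q_intra) and mb_edge:
        return 4
    if p_intra or q_intra:
        return 3
    if coded_residuals:
        return 2
    if motion_diff_ge_one_sample or different_refs:
        return 1
    return 0

def should_filter(bs, p1, p0, q0, q1, alpha, beta):
    """Filter only small steps: large ones are likely real image edges."""
    return (bs > 0
            and abs(p0 - q0) < alpha
            and abs(p1 - p0) < beta
            and abs(q1 - q0) < beta)

bs = boundary_strength(p_intra=False, q_intra=False, mb_edge=True,
                       coded_residuals=True,
                       motion_diff_ge_one_sample=False, different_refs=False)
print(bs)                                                       # 2
print(should_filter(bs, 60, 61, 66, 67, alpha=10, beta=4))      # True: blocking artifact
print(should_filter(bs, 60, 61, 120, 121, alpha=10, beta=4))    # False: real edge
```

The threshold tests are what make the filter adaptive: the same edge position is smoothed at coarse quantization and left alone when the step across it is too large to be a coding artifact.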


SP and SI-Frame Design

• SP and SI-frames
– Allow identical reconstruction even when coded using different references
– Subtract the reference in the coder and add it back in the decoder
• Bitstream switching
– In previous coding standards, perfect (mismatch-free) switching only happens at intra-frames.
• Other applications
– Bitstream splicing
– Error recovery/resilience
– Video redundancy coding

[Figure: switching between two streams. Stream 1: P1,n-2, P1,n-1, P1,n, P1,n+1, P1,n+2. Stream 2: P2,n-2, P2,n-1, SP2,n, P2,n+1, P2,n+2. The switching frame SP12,n links the two streams at time n.]


Transformation

• H.264 employs a 4×4 integer transform
• The transform is an approximation of the DCT
– It has a coding gain similar to that of the DCT.
– Since the integer transform has an exact inverse, there is no mismatch between the encoder and the decoder, which was a problem in all earlier DCT-based codecs
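The exact-inverse property can be checked with integer arithmetic. The sketch below uses the widely documented 4×4 forward core matrix of H.264; the inverse shown here is not the standard's own inverse transform (which uses a different integer matrix with the scaling folded into quantization) but a demonstration, using exact fractions, that the rows are orthogonal and the transform loses nothing.

```python
from fractions import Fraction

# Forward core transform matrix (integer approximation of the 4-point DCT).
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]

# Squared row lengths: CF * CF^T = diag(4, 10, 4, 10).
ROW_NORMS = [4, 10, 4, 10]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def forward(block):
    """Y = CF * X * CF^T, using integer additions and multiplications only."""
    return matmul(matmul(CF, block), transpose(CF))

def inverse(coeffs):
    """X = CF^T * (Y scaled by 1/(norm_i * norm_j)) * CF, computed exactly."""
    scaled = [[Fraction(coeffs[i][j], ROW_NORMS[i] * ROW_NORMS[j])
               for j in range(4)] for i in range(4)]
    return matmul(matmul(transpose(CF), scaled), CF)

block = [[5, 11, 8, 10],
         [9, 8, 4, 12],
         [1, 10, 11, 4],
         [19, 6, 15, 7]]

coeffs = forward(block)
print(coeffs[0][0])              # 140, the sum of all 16 samples
print(inverse(coeffs) == block)  # True: exact reconstruction
```

Because every step is exact, encoder and decoder built on different hardware arrive at bit-identical results, unlike floating-point DCT implementations that only agree within a tolerance.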


Network friendliness

• H.264 structure

– Video coding layer (VCL)

– Network abstraction layer (NAL)

Scope of H.264 standard


H.264 Over IP

• Network Abstraction Layer Unit (NALU)
– A byte stream of variable length
– 1-byte header
• NALU type (T)
• NALU importance (R)
• Error indication (F)
• RTP packetization
– Simple packetization
• One NALU in one RTP packet
• The NALU header doubles as the RTP payload header
– NALU fragmentation
– NALU aggregation
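The 1-byte NALU header packs the three fields as 1 bit of F, 2 bits of R and 5 bits of T, from the most significant bit down. A minimal parsing sketch follows; the function name and dictionary keys are illustrative, and the example bytes are typical first bytes of NAL units.

```python
def parse_nal_header(byte):
    """Split the first byte of a NAL unit into its three fields."""
    return {
        "F": (byte >> 7) & 0x01,  # error indication (must be 0 in a valid stream)
        "R": (byte >> 5) & 0x03,  # importance: 0 means not used for reference
        "T": byte & 0x1F,         # NALU type
    }

print(parse_nal_header(0x65))  # {'F': 0, 'R': 3, 'T': 5}
print(parse_nal_header(0x41))  # {'F': 0, 'R': 2, 'T': 1}
print(parse_nal_header(0x01))  # {'F': 0, 'R': 0, 'T': 1}
```

A network element can read R without parsing the payload, for example to drop units with R = 0 first under congestion.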

OSI/RM layer | Protocols and specifications for H.264
Application layer | RTP (Real-Time Transport Protocol). Header size: IP/UDP/RTP = 20 + 8 + 12 = 40 bytes. Media-unaware RTP payload specifications to reduce the loss rates observed by the decoder: packet duplication, packet-based FEC, audio redundancy coding. Control protocols: H.245, SIP (Session Initiation Protocol), SDP (Session Description Protocol), RTSP (Real-Time Streaming Protocol)
Presentation layer |
Session layer |
Transport layer | UDP (User Datagram Protocol)
Network layer | IP: best-effort service (note A1)

[Figure: the NALU header byte with its fields F, R and T]

Note on the H.264 over IP slide:

A1: The IP header is 20 bytes in size and protected by a checksum. The payload is not protected. (Author, 8/24/2011)


Comparison


H265 outlook

• About half the bit rate of H.264
• Tree-structured prediction and residual difference block segmentation
• Extended prediction block sizes (up to 64x64)
• Tile and slice picture segmentation for loss resilience and parallelism
• Wavefront processing structure for decoder parallelism
• Mode-dependent sine/cosine transform type switching
• Adaptive motion vector predictor selection
• Temporal motion vector prediction


3D video coding

• Left and right eye views
• Depth sensation
• Resolving 2D viewing ambiguity
• Additional features:
– Free viewpoints
– Depth-controlled object insertion


Multiview Frame Structure

[Figure: multiview frame structure, a grid of frames indexed by time (1 to 7, …) and by view]


Predictions based on H.264/AVC JM95


Homework 1

• Download the open-source tool x264 from the VideoLAN website
• Capture a video sequence via webcam or from the Internet
• Use FFmpeg to encode and transcode the video sequence with different standards (MPEG-2, MPEG-4, H.263, H.264, etc.) and parameters
• Play back the encoded video and comment on the results
• Store the encoded video sequence in an MP4 container


Homework 2

• Draw decoding diagrams for MPEG-1, MPEG-2, MPEG-4, H.264 and 3D video


Future development

• Future coding/presentation standards:

– H265, VC-1, VC-2

– MPEG-21, MHEG

• Computer vision

– Game

– Graphics

• Multimedia retrieval

– Segmentation

– Search (Google)

• Multi-camera system

– 3D cinema

– Realistic broadcasting

