IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2008

A Fast MB Mode Decision Algorithm for MPEG-2 to H.264 P-Frame Transcoding

Pedro Cuenca, Member, IEEE, Luis Orozco-Barbosa, Member, IEEE, Gerardo Fernández-Escribano, Antonio Garrido, Hari Kalva

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2008

Outline
  Introduction
  Fast MB Mode Decision Using Machine Learning
  Performance Evaluation
  Conclusion

Introduction 1/3

Motivation: make transcoding from MPEG-2 to H.264 seamless.

Hypothesis: the MB mode decision in H.264 is correlated with the distribution of the motion-compensated residual in MPEG-2 video.

Introduction 2/3

Fig. 1. Relationship between MPEG-2 MB residual and H.264 MB coding mode.

The H.264 MB mode computation problem is posed as a data classification problem.

The MPEG-2 MB coding mode and residual have to be classified into one of the several H.264 coding modes.

Introduction 3/3

Method: use machine learning tools to exploit the correlation and construct decision trees that classify the MPEG-2 MBs into one of the H.264 coding modes.

Fast MB Mode Decision Using Machine Learning 1/14

Fig. 2. Process for building decision trees for MPEG-2 to H.264 transcoding.

Fast MB Mode Decision Using Machine Learning 2/14

WEKA data mining tool: machine learning software written in Java that supports several standard data mining tasks.

J48 algorithm: the J48 algorithm implemented in the WEKA data mining tool was used to create the WEKA decision trees. J48 is an implementation of the C4.5 algorithm, which is widely used as a reference for building decision trees.

Fast MB Mode Decision Using Machine Learning 3/14

Attribute-Relation File Format (ARFF): the file format used by the WEKA data mining program. An ARFF file describes the relationship that exists between a set of attributes.

An ARFF file has two sections:
(1) Header: contains the name of the relation, the attributes, and their types.
(2) Data: contains the data.
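To make the two-section layout concrete, here is a minimal Python sketch that writes a small ARFF text. The relation name, attribute names, and data rows are hypothetical and chosen only for illustration; they are not the attribute set used in the paper.

```python
def write_arff(relation, attributes, rows):
    """Return ARFF text: a header (relation and attributes), then the data section."""
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        # A list of labels becomes a nominal attribute, e.g. {skip,intra}.
        if isinstance(kind, (list, tuple)):
            kind = "{" + ",".join(kind) + "}"
        lines.append(f"@attribute {name} {kind}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(v) for v in row))
    return "\n".join(lines) + "\n"


# Hypothetical attributes and rows, for illustration only.
attributes = [
    ("mean0", "numeric"),
    ("variance0", "numeric"),
    ("mpeg2_mode", ["intra", "inter", "skip"]),
    ("h264_mode", ["skip", "intra", "inter16x16", "inter8x8"]),
]
rows = [
    (1.5, 12.25, "inter", "inter16x16"),
    (0.0, 0.0, "skip", "skip"),
]
arff_text = write_arff("mb_mode", attributes, rows)
print(arff_text)
```

The last attribute plays the role of the class label here; which attribute is treated as the class is a choice made when the file is loaded into the learning tool.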

Fast MB Mode Decision Using Machine Learning 4/14

Training sets:
The MPEG-2 training sequences are encoded at high quality, since no B-frames are used.
The H.264 encoder is run with a QP of 25 and R-D optimization enabled.

Goal: develop a single, generalized decision tree to be used for the MPEG-2 to H.264 transcoding process. The Flower sequence was found to work well as training data for a large number of videos.

Fast MB Mode Decision Using Machine Learning 5/14

The decision tree for the proposed transcoder is a hierarchical decision tree consisting of three different WEKA trees.

Fig. 3. Decision tree.

Fast MB Mode Decision Using Machine Learning 6/14

A. Creating the Training Files

Attributes recorded for each MB:
(1) The mean and variance of each of the 4x4 residual subblocks.
(2) The MB mode in MPEG-2.
(3) The coded block pattern (CBPC) used in MPEG-2.
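The mean and variance attributes can be illustrated with a short Python sketch. It assumes the residual of one MB is available as a 16x16 list of integers and uses the population variance (dividing by 16); the slides do not say which variance definition or subblock ordering the authors used, so both are assumptions.

```python
def subblock_stats(residual):
    """Return 16 (mean, variance) pairs, one per 4x4 subblock, in raster order."""
    assert len(residual) == 16 and all(len(row) == 16 for row in residual)
    stats = []
    for by in range(0, 16, 4):          # top row of each subblock
        for bx in range(0, 16, 4):      # left column of each subblock
            pixels = [residual[y][x]
                      for y in range(by, by + 4)
                      for x in range(bx, bx + 4)]
            mean = sum(pixels) / 16
            # Population variance (assumption: the slides do not specify).
            variance = sum((p - mean) ** 2 for p in pixels) / 16
            stats.append((mean, variance))
    return stats


# Toy residual: zero everywhere except a +2/-2 checkerboard in the top-left subblock.
residual = [[0] * 16 for _ in range(16)]
for y in range(4):
    for x in range(4):
        residual[y][x] = 2 if (x + y) % 2 == 0 else -2

stats = subblock_stats(residual)
print(stats[0], stats[1])
```

A 16x16 MB yields 16 means and 16 variances; an 8x8 block yields four of each, which matches the "four means and four variances" mentioned for Node 3.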

Fast MB Mode Decision Using Machine Learning 7/14

B. Decision Tree

The decision tree works as follows.

Node 1
Input: MPEG-2 MB information.
Output: first-level decision that classifies the MB as Skip, Intra, Inter-8x8, or Inter-16x16.
Rule:
  MPEG-2 MB mode    H.264 MB mode
  MC not coded      Inter-16x16
  Intra             Intra or Inter-8x8
  Skip              Skip

Fast MB Mode Decision Using Machine Learning 8/14

B. Decision Tree

The decision tree works as follows.

Node 2
Input: 16x16 MBs classified by Node 1.
Output: 16x16 submode decision used for coding the MB as 16x16, 16x8, or 8x16.
Rule: this tree examines whether there are continuous 16x8 or 8x16 subblocks that might result in a better prediction.

Fast MB Mode Decision Using Machine Learning 9/14

B. Decision Tree

The decision tree works as follows.

Node 3
Input: the MBs classified by Node 1 as 8x8.
Output: 8x8 submode decision used for coding the MB as 8x8, 8x4, 4x8, or 4x4.
Rule:
(1) Evaluates only the H.264 8x8 modes using the third WEKA tree and selects the best option.
(2) This node differs from the others in that it uses only four means and four variances to make the decision.

Fast MB Mode Decision Using Machine Learning 10/14

B. Decision Tree

The decision tree works as follows.

Node 4
Input:
(1) Skip-mode MBs in the MPEG-2 bit stream, as classified by Node 1.
(2) The 16x16 MBs classified by Node 2.
Output: selects Skip or Inter-16x16.
Rule: evaluates only the H.264 16x16 mode (without the submodes 16x8 or 8x16), then selects the best option.
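The way Nodes 1 to 4 hand macroblocks to each other can be sketched as a small Python dispatcher. This is only an outline of the control flow described on the preceding slides: each node is passed in as a hypothetical callable that returns a label, the label strings are invented for this example, and the actual WEKA trees and mode evaluations are not reproduced.

```python
def classify_mb(features, node1, node2, node3, node4):
    """Walk the four-node hierarchy and return the final H.264 mode label."""
    first = node1(features)          # "skip", "intra", "inter8x8" or "inter16x16"
    if first == "intra":
        return "intra"
    if first == "inter8x8":
        return node3(features)       # "8x8", "8x4", "4x8" or "4x4"
    if first == "inter16x16":
        sub = node2(features)        # "16x16", "16x8" or "8x16"
        if sub in ("16x8", "8x16"):
            return sub
        return node4(features)       # 16x16 MBs from Node 2 go on to Node 4
    if first == "skip":
        return node4(features)       # skip MBs from Node 1 go on to Node 4
    raise ValueError(f"unexpected Node 1 label: {first!r}")


# Stub nodes with fixed answers, for illustration only.
example_a = classify_mb({}, lambda f: "inter16x16", lambda f: "16x8",
                        lambda f: "8x8", lambda f: "skip")
example_b = classify_mb({}, lambda f: "skip", lambda f: "16x16",
                        lambda f: "4x4", lambda f: "16x16")
print(example_a, example_b)
```

Passing the nodes as callables keeps the sketch independent of how each decision is made, whether by a learned tree or by evaluating a restricted set of modes.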

Fast MB Mode Decision Using Machine Learning 11/14

The MB mode decision and the thresholds used in the decision tree depend on the QP used in the H.264 encoding stage.

The mean and variance thresholds therefore have to be different at each QP.

Fast MB Mode Decision Using Machine Learning 12/14

Solution (1):
Method: develop the decision trees for each QP and use the appropriate decision tree depending on the QP selected.
Drawback: it is complex, since it implies switching between 52 different decision trees (one per H.264 QP value), each made of three WEKA trees, resulting in 156 WEKA trees for a transcoder.

Fast MB Mode Decision Using Machine Learning 13/14

Solution (2):
Method: develop a single decision tree and adjust the mean and variance thresholds used by the trees, taking a QP of 25 as the baseline. For QP values higher than 25, the thresholds are decreased, and for QP values lower than 25, the thresholds are proportionally increased. The thresholds are adjusted by 2.5% for a change in QP of 1.
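The threshold adjustment can be written as a one-line rule. The sketch below assumes the 2.5% step is applied linearly relative to the QP 25 baseline; the slides do not say whether the adjustment is linear or compounded per step, so this is one plausible reading, and the function and constant names are hypothetical.

```python
BASE_QP = 25      # QP at which the single decision tree was trained
STEP = 0.025      # 2.5% change in threshold per unit change in QP


def adjust_threshold(base_threshold, qp):
    """Scale a QP-25 threshold for another QP (linear scaling is an assumption)."""
    if not 0 <= qp <= 51:
        raise ValueError("H.264 QP must be in the range 0..51")
    return base_threshold * (1.0 - STEP * (qp - BASE_QP))


# Thresholds shrink above QP 25 and grow below it.
for qp in (20, 25, 30):
    print(qp, adjust_threshold(100.0, qp))
```

Under this reading, a threshold of 100 at QP 25 becomes 87.5 at QP 30 and 112.5 at QP 20.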

Fast MB Mode Decision Using Machine Learning 14/14

Fig. 2. Process for building decision trees for MPEG-2 to H.264 transcoding.

Fig. 4. Proposed transcoder.

Performance Evaluation 1/8

Input:
(1) A high-quality MPEG-2 video.
(2) QP ranging from 5 to 45 in steps of 5.
(3) The GOP size is 12 frames, where the first frame is an I-frame and the rest are P-frames.
(4) The rate control and CABAC algorithms were disabled for all the simulations.
(5) The number of reference frames for P-frames was set to 1.
(6) The motion search range was set to 16 pels with an MV resolution of 1/4 pel.

Performance Evaluation 2/8

Fig. 6. MB mode decisions generated by the proposed algorithm for the first P-frame in the Ayersroc, Paris, and Foreman sequences.
Panels: full estimation of H.264; proposed algorithm.

Performance Evaluation 3/8

Test sequences: Martin, Ayersroc, Paris, Tempete, News, Foreman

RD results: R-D cost without FME option or R-D cost with FME option

Format: CCIR, CIF, QCIF

Performance Evaluation 4/8

Performance Evaluation 5/8

RD results: SAE cost without FME option or SAE cost with FME option

Performance Evaluation 6/8

Performance Evaluation 7/8

Reference transcoder

Proposed transcoder WIN

Performance Evaluation 8/8

Conclusion

The proposed algorithm uses machine learning techniques to develop decision trees that decide the H.264 coding mode in MPEG-2 to H.264 transcoding, considerably reducing the computational complexity.

It can be applied to develop other transcoders as well.
