
1

Computational Complexity Analysis of MPEG-4 Decoder

Student : Chung-Yen Tsai

Advisor : Prof. David W. Lin

Date : 2005/06/08

2

Outline

Corrections of the Computational Complexity in MPEG-4 Encoder (MoMuSys)

Analysis of Computational Complexity in MPEG-4 Decoder (MoMuSys)

Summary

Future Work

Reference

3

Outline

Corrections of the Computational Complexity in MPEG-4 Encoder (MoMuSys)

Analysis of Computational Complexity in MPEG-4 Decoder (MoMuSys)

Summary

Future Work

Reference

4

Profile of MoMuSys Encoder in Previous Group Meeting

                        Execution Time (cycles)   Contribution
Motion Estimation       3,902,482                 41.05%
DCT                     297,113                   3.13%
IDCT                    249,057                   2.63%
Quantization            42,494                    0.45%
Inverse Quantization    16,441                    0.17%
VLC                     16,130                    0.15%

The contribution of ME is not accurate.

The information will be gathered in another unit.

The sum of the contributions is not 100%.

5

The Cause of The Fault

SAD_Macroblock
  Execution time (samples): 18642
  Percentage: 42.66%
  Computational complexity: 366,785*(257-comp, 67-data_=, 256-abs, 560-data_+)

Subsample_alpha_with_modes
  Execution time (samples): 14092
  Percentage: 36.26%
  Computational complexity: 643,500*(5-mem_shift, 3-mem_*, 4-mem_+, 2-data_=, 1-data_+, 1614-comp)

Redundant part for the frame-based coding scheme.
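The 256-abs term in the SAD_Macroblock count corresponds to one absolute difference per pixel of a 16x16 macroblock. A minimal Python sketch of such a sum of absolute differences follows. The function name `sad_macroblock`, the nested-list block layout, and the sample values are assumptions for illustration only. It is not the MoMuSys source and it omits any early-termination logic the real routine may have.

```python
def sad_macroblock(cur, ref, size=16):
    """Sum of absolute differences between two size x size pixel blocks.

    Hypothetical helper for illustration; not taken from MoMuSys.
    """
    total = 0
    for y in range(size):
        for x in range(size):
            # One subtraction, one abs and one accumulation per pixel.
            total += abs(cur[y][x] - ref[y][x])
    return total


# Two flat 16x16 blocks whose pixels differ by 3 everywhere.
cur = [[10] * 16 for _ in range(16)]
ref = [[7] * 16 for _ in range(16)]
print(sad_macroblock(cur, ref))  # prints 768 (256 pixels * 3)
```

Motion estimation evaluates this sum once per candidate position in the search window, which is consistent with the large call count reported for SAD_Macroblock.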

6

Correction of The Profile

                          Execution Time (samples)   Contribution
Motion Estimation         32739                      74.92%
DCT                       186                        0.43%
IDCT                      271                        0.62%
Quantization              232                        0.53%
Inverse Quantization      78                         0.18%
VLC                       14                         0.03%
File I/O, Mem, others     10051                      ~23%

※ Samples are the unit used in VTune Performance Analyzer (msec).

7

Correction of The Computational Complexity

Computational Complexity

Motion Estimation
  Frame: 366,785*(257-comp, 67-data_=, 256-abs, 560-data_+)
  Alpha: 643,500*(5-mem_shift, 3-mem_*, 4-mem_+, 2-data_=, 1-data_+, 1614-comp)

DCT
  594*(408-data_*, 520-data_+, 520-data_=, 128-if, 64-floor, 64-mem_+, 64-mem_=, 64-mem_*)

IDCT
  594*(256-data_*, 544-data_+, 576-data_=, 64-floor, 64-mem_+, 64-mem_*)

Quan
  130-comp, 128-data_=, 128-shift, 128-data_*

DeQuan
  256-data_=, 320-comp, 128-shift, 192-data_+, 64-data_*

8

Outline

Corrections of the Computational Complexity in MPEG-4 Encoder (MoMuSys)

Analysis of Computational Complexity in MPEG-4 Decoder (MoMuSys)

Summary

Future Work

Reference

9

Block Diagram of MPEG-4 Decoder

10

Profile of The Original MoMuSys Decoder

                File_I/O     Non_File_I/O
Contribution    over 95%     under 5%

11

Distribution of Execution Time (Cont’d)

DecodeVopCombinedMotionShapeTextureInterErrRes    54.9%
DecodeVideoPacketCombinedInterErrRes              34.3%
VopMotionCompensate                               33.6%
VopTextureUpdate                                  30.8%
DecodeCombinedPacketInfoInterErrRes               98.9%
fprintf                                           91.1%
GetPred_Advanced                                  3.5%
AllocImage                                        2.4%
PrintOutMBData                                    93.9%

12

Distribution of Execution Time (Cont’d)

DecodeVopCombinedMotionShapeTextureInterErrRes    61.1%
DecodeVideoPacketCombinedInterErrRes              68.5%
VopMotionCompensate                               15.6%
VopTextureUpdate                                  9.0%
DecodeCombinedPacketInfoInterErrRes               98.4%
InterpolateImage                                  20.1%
GetPred_Advanced                                  29.9%
AllocImage                                        28.6%
fprintf                                           15.5%

13

Distribution of Execution Time (Cont’d) – For Texture Decoding

DecodeCombinedPacketInfoInterErrRes    98.9%
GetMBblockdataNoDataPartErrRes         88.0%
GetMBheaderNoDataPartInterErrRes       5.4%
GetMBvectorsNoDataPartErrRes           3.6%
PrintOutMBData                         75.1%
VlcGetBlock                            16.1%
BlockIDCT                              3.6%
BlockDequantH263                       0.4%

14

Distribution of Execution Time (Cont’d) – For Texture Decoding

DecodeCombinedPacketInfoInterErrRes    98.4%
GetMBblockdataNoDataPartErrRes         76.1%
GetMBheaderNoDataPartInterErrRes       8.3%
GetMBvectorsNoDataPartErrRes           7.1%
PrintOutMBData                         0%
VlcGetBlock                            69.3%
BlockIDCT                              15.3%
BlockDequantH263                       5.9%

15

File_I/O and Non_File_I/O

                    Original    Modification of Tsai
_calloc             3.81%       15.3%
fwrite              1.68%       7.17%
_output             26.86%      7.12%
VopTextureUpdate    1.85%       6.7%
Memset              1.4%        5.43%
BlockIDCT           0.97%       3.53%

16

Modification of MoMuSys Decoder

After removing PrintOutMBData, the contributions of IDCT and VLC became larger.

Some files were also written to trace information in “Debug” mode; these writes are removable.

VopTextureUpdate adds the decoded texture to the motion-compensated (M.C.) image. Thus, fprintf is not removable here.

17

Profile in MoMuSys Decoder

                        Execution Time (samples)   Contribution (%)
Motion Compensation     2072                       21.69
DCT                     0                          0
IDCT                    358                        3.75
Quantization            82                         0.86
Inverse Quantization    42                         0.44
VLC Decoding            448                        4.69
File I/O                2003                       20.97
Mem                     2616                       27.39
others                  1171                       12.26
ErrRes                  759                        7.95
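The contribution column can be reproduced from the sample counts alone: each entry is the row's samples divided by the sum of all rows. The short Python sketch below does that arithmetic. The dictionary layout and variable names are illustrative choices, while the sample counts are the ones listed above.

```python
# Sample counts from the decoder profile above (VTune samples).
samples = {
    "Motion Compensation": 2072,
    "DCT": 0,
    "IDCT": 358,
    "Quantization": 82,
    "Inverse Quantization": 42,
    "VLC Decoding": 448,
    "File I/O": 2003,
    "Mem": 2616,
    "others": 1171,
    "ErrRes": 759,
}

total = sum(samples.values())

# Contribution of each unit as a percentage of all samples, two decimals.
shares = {name: round(100.0 * count / total, 2) for name, count in samples.items()}

for name, share in shares.items():
    print(f"{name:22s} {samples[name]:5d} {share:6.2f}")
```

Here the rows sum to 9551 samples, and the computed shares match the listed percentages, so this table (unlike the earlier encoder profile) adds up to 100%.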

18

Definition of DCT and IDCT in MPEG-4 Standard

19

Theoretical Computational Complexity

We use the 8x8 block-based DCT and IDCT.

DCT: 64*(7_mult, 4_div, 2_cos), 2_mult, 1_div

IDCT: 64*(9_mult, 4_div, 2_cos), 1_mult, 1_div
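To make the counts above more concrete, here is a minimal Python sketch of a direct (non-fast) 8x8 DCT and IDCT. It assumes the usual separable form with a 2/N scale factor and C(0) = 1/sqrt(2); the formula on the definition slide is not reproduced in this transcript, so this is an assumption rather than a quotation from the standard. The function names are illustrative, and the sketch makes no claim to match the exact multiply and divide counts quoted above. It only shows the structure: each output value is a sum over 64 terms, and each term involves two cosines.

```python
import math

N = 8  # block size


def c(k):
    """Normalization factor: 1/sqrt(2) for the zero frequency, 1 otherwise."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


def basis(i, k):
    """Cosine basis value for sample index i and frequency index k."""
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))


def dct_8x8(f):
    """Direct forward DCT: every coefficient is a 64-term sum."""
    F = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += f[x][y] * basis(x, u) * basis(y, v)
            F[u][v] = (2.0 / N) * c(u) * c(v) * s
    return F


def idct_8x8(F):
    """Direct inverse DCT: every pixel is a 64-term sum."""
    f = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    s += c(u) * c(v) * F[u][v] * basis(x, u) * basis(y, v)
            f[x][y] = (2.0 / N) * s
    return f


# Round trip on an arbitrary test block: the IDCT should undo the DCT.
block = [[(7 * x + 3 * y) % 256 for y in range(N)] for x in range(N)]
restored = idct_8x8(dct_8x8(block))
max_error = max(abs(block[x][y] - restored[x][y])
                for x in range(N) for y in range(N))
print(max_error < 1e-9)
```

A production codec would use a fast factorization instead of this direct form, which is one reason measured operation counts can be far below the theoretical ones.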

20

Computational Complexity of Each Frame

Computational Complexity (Practical and Theoretical)

Motion Compensation
  (no entry)

DCT
  Theoretical: 594*(64*(7_mult, 4_div, 2_cos), 2_mult, 1_div)

IDCT
  Practical: 594*(256-data_*, 544-data_+, 576-data_=, 64-floor, 64-mem_+, 64-mem_*)
  Theoretical: 594*(64*(9_mult, 4_div, 2_cos), 1_mult, 1_div)

Quan
  Practical: 594*(130-comp, 128-data_=, 128-shift, 128-data_*)

DeQuan
  Practical: 594*(256-data_=, 320-comp, 128-shift, 192-data_+, 64-data_*)
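The per-frame expressions above are a block count (594) multiplied by a per-block operation tally. The Python sketch below expands the practical IDCT entry into per-frame totals. The variable names are illustrative. The slides do not state the frame size; the QCIF 4:2:0 check in the code is an assumption, included only because 594 happens to equal the number of 8x8 blocks in such a frame.

```python
BLOCKS_PER_FRAME = 594  # multiplier used in the per-frame expressions above

# Practical IDCT tally per 8x8 block, as listed above.
idct_per_block = {
    "data_*": 256,
    "data_+": 544,
    "data_=": 576,
    "floor": 64,
    "mem_+": 64,
    "mem_*": 64,
}

# Per-frame totals: every per-block count scaled by the number of blocks.
idct_per_frame = {op: BLOCKS_PER_FRAME * n for op, n in idct_per_block.items()}

for op, n in idct_per_frame.items():
    print(f"{op:8s} {n:8d}")

# Assumption, not stated in the slides: a QCIF (176x144) 4:2:0 frame has
# 11 x 9 macroblocks with 6 blocks each (4 luma + 2 chroma).
qcif_blocks = (176 // 16) * (144 // 16) * 6
print(qcif_blocks == BLOCKS_PER_FRAME)
```

The same scaling applies to the Quan and DeQuan rows.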

21

Summary

In the MoMuSys encoder, Motion Estimation is the main contributor to execution time (74.92%). The SAD calculation alone accounts for 42.66% of the execution time.

In the MoMuSys decoder, however, the execution time of Motion Compensation and Texture Decoding is less than that of File I/O and Memory operations.

Since the VTune Performance Analyzer is run under Debug mode, some redundant code is executed, which increases file I/O.

22

Future Work

Complete the analysis of the object-based MoMuSys codec.

Run some simple simulations on the PAC simulator.

23

Reference

MPEG-4 standard: Text of ISO/IEC 14496-2:2001 (Unifying N2502, N3307, N3056, and N3664)

