Rate-distortion Optimization for MP3 and AAC Audio Coding with Complete Decoder Compatibility

Jingming Xu
Multimedia Communications Lab
University of Waterloo

September 16th, 2005

Outline

Introduction and motivation
MP3, AAC, and Two-nested-loop Search
Rate-distortion optimization for MP3
Rate-distortion optimization for AAC
Conclusions and Future Research


Introduction

Audio coding - different from universal data compression
  Long term correlations
  Multi-channel correlations
  Subject to natural noises
  Subjective perceptual quality judgement

Audio coding methods - for both lossy and lossless
  Linear prediction
  Time-frequency mapping (DCT, FFT, MDCT, etc.)
  Parameter coding
  ….


Introduction (2)

MPEG - the most successful audio coding standard series so far

MPEG-1 (1992) - T/F mapping based, 3 Layers with increased complexity

MPEG-2 BC (1994) - backward compatible with MPEG-1, with multi-channel and sampling frequency extensions

MPEG-2 AAC (1997) - introducing more coding tools and giving up backward compatibility to improve quality

MPEG-4 AAC (1999) - inherited from MPEG-2 AAC with TwinVQ and bitrate scalability extensions

MPEG-1 Layer 3 and MPEG-2 BC Layer 3 define the popular “MP3”


Introduction (3)

Motivations

The MP3 and AAC standards leave the design of the structured encoding blocks open, which leaves room for performance enhancement.

The state-of-the-art quantization and entropy coding scheme for MP3 and AAC, Two-nested-loop Search (TNLS), is essentially incapable of exploiting the full standard-constrained flexibility to reach the best rate-distortion tradeoff.

The huge success of MP3 and AAC in the digital audio industry.


Introduction (4)

Quality evaluation of compressed audio

Most widely used objective measure - noise-to-mask ratio

Most widely used subjective measure - ITU listening test (ITU-R Recommendation BS.1116)

Triple stimulus (sources A, B, C) with hidden reference, double blind

5-grade impairment score scale
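
As a rough illustration of the objective measure named above, the noise-to-mask ratio (NMR) of a band compares the quantization noise energy with the masking threshold supplied by a psychoacoustic model. The sketch below is an assumption-laden toy: the function names are hypothetical, and the averaging convention (mean of linear ratios over the bands of one frame, then dB) is only one of several in use and is not taken from the slides.

```python
import math

def nmr_db(noise_energy, mask_threshold):
    """NMR of one band in dB; negative means the noise lies below the mask.
    Assumes both arguments are strictly positive."""
    return 10.0 * math.log10(noise_energy / mask_threshold)

def average_nmr_db(noise, mask):
    """Average the per-band linear ratios of one frame, then convert to dB.
    This averaging convention is an assumption made for the sketch."""
    ratios = [n / m for n, m in zip(noise, mask)]
    return 10.0 * math.log10(sum(ratios) / len(ratios))

# Hypothetical per-band noise energies and masking thresholds.
noise = [0.5, 2.0, 1.0]
mask = [1.0, 1.0, 4.0]
frame_nmr = average_nmr_db(noise, mask)
```

Only the second band has audible noise here (ratio above 1), yet the frame average is still slightly negative, which is why a per-band view is often inspected alongside the average.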


MP3 and AAC audio coding standards

Encoding process

Window switching

Stereo coding

Pre-processing in AAC: gain control, prediction, noise shaping and substitution, etc.


MP3 and AAC audio coding standards (2)

Quantization and entropy coding in MP3

Scale factor bands and non-uniform quantization

scale_factor values are encoded with a fixed number of bits in the side information and a variable number of bits in the main_data stream
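
The non-uniform quantization mentioned above is a power-law quantizer: magnitudes are compressed with an exponent of 3/4 before rounding, and the decoder expands them with an exponent of 4/3. The sketch below is a simplified illustration, not the standard's full formula: `step_exp` is a hypothetical single exponent standing in for the combined effect of the global gain and the band's scale factor, and the 0.0946 rounding offset follows the common reference-encoder convention.

```python
import math

def quantize(x, step_exp):
    """Hard-decision power-law quantizer.
    step_exp is a hypothetical combined step exponent (step = 2^(step_exp/4))."""
    step = 2.0 ** (step_exp / 4.0)
    mag = int((abs(x) / step) ** 0.75 - 0.0946 + 0.5)
    return -mag if x < 0 else mag

def dequantize(i, step_exp):
    """Decoder-side reconstruction: sign(i) * |i|^(4/3) * step."""
    step = 2.0 ** (step_exp / 4.0)
    return math.copysign(abs(i) ** (4.0 / 3.0) * step, i)

index = quantize(100.0, 0)          # index for a coefficient of 100 at unit step
reconstruction = dequantize(index, 0)
```

Because of the 4/3 expansion, the spacing between reconstruction levels grows with magnitude, so large coefficients carry larger absolute error than small ones.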


MP3 and AAC audio coding standards (3)

Quantization and entropy coding in MP3

Huffman coding

34 fixed Huffman codebooks

Huffman coding region division: each region is coded with a different codebook that best matches the statistics of that region (big_value, count_1, zero, ….)
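
The region division above can be pictured by scanning a quantized spectrum from the high-frequency end: trailing zeros form the zero region, the preceding run of quadruples whose magnitudes are all at most 1 forms the count_1 region, and everything below that is the big_value region. The function below is a minimal sketch of that scan; the name `split_regions` is hypothetical, the input length is assumed to be even, and the further split of the big_value region into sub-regions is left out.

```python
def split_regions(q):
    """Return (big_value_end, count1_end) for a list of quantized integers.

    q[:big_value_end]            -> big_value region (coded in pairs)
    q[big_value_end:count1_end]  -> count_1 region (quadruples, |value| <= 1)
    q[count1_end:]               -> zero region (not transmitted)
    Assumes len(q) is even.
    """
    end = len(q)
    # Strip trailing zeros two at a time.
    while end >= 2 and q[end - 1] == 0 and q[end - 2] == 0:
        end -= 2
    count1_end = end
    # Walk down over quadruples whose magnitudes never exceed 1.
    start = end
    while start >= 4 and all(abs(v) <= 1 for v in q[start - 4:start]):
        start -= 4
    return start, count1_end

example = [5, -3, 2, 2, 1, 0, -1, 1, 0, 0, 0, 0]
boundaries = split_regions(example)   # (4, 8)
```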


MP3 and AAC audio coding standards (4)

Quantization and entropy coding in AAC

Non-uniform quantizer: same as in MP3

scale_factor values are differentially encoded relative to that of the preceding band, using a fixed Huffman codebook

Huffman coding

12 fixed Huffman codebooks

Huffman coding region division: section boundaries can only be at scale factor band boundaries

For each section, the length of the section in scale factor bands and the index of the codebook used for that section are transmitted with a fixed number of bits.
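
Because each section pays a fixed side-information price, switching codebooks from band to band is not free, and this is one source of the inter-band dependency exploited later. The sketch below counts that side information for a given per-band codebook assignment. The field widths (a 4-bit codebook index and a 5-bit length field with an all-ones escape value, as used for long windows) are stated here as assumptions of the sketch, and the function names are hypothetical.

```python
def sections(codebooks):
    """Merge consecutive bands that share a codebook into (codebook, length) pairs."""
    out = []
    for cb in codebooks:
        if out and out[-1][0] == cb:
            out[-1] = (cb, out[-1][1] + 1)
        else:
            out.append((cb, 1))
    return out

def section_side_bits(codebooks, len_bits=5, cb_bits=4):
    """Side-information bits for the sectioning implied by a per-band codebook list.
    A length at or above the escape value needs additional length fields."""
    esc = (1 << len_bits) - 1
    total = 0
    for _, length in sections(codebooks):
        total += cb_bits + (length // esc + 1) * len_bits
    return total

alternating = section_side_bits([1, 5, 1, 5])   # four sections
uniform = section_side_bits([1, 1, 1, 1])       # one section
```

With these assumed widths, the alternating assignment costs four times the side information of the uniform one, so a codebook that is slightly worse for one band can still win overall.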


Two-nested-loop Search algorithm

[Figure labels: Inner Loop, Outer Loop]
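
For readers without the figure, the two loops can be sketched roughly as follows: the inner (rate) loop coarsens a global step size until the frame fits the bit budget, and the outer (distortion) loop amplifies the bands whose noise exceeds the masking threshold, then re-runs the inner loop. This is a heavily simplified sketch and not the reference encoder: `toy_bits` is a hypothetical rate model standing in for Huffman coding, and all names and the stopping rule are assumptions.

```python
import math

def quantize(x, step):
    return [int((abs(v) / step) ** 0.75 + 0.4054) * (1 if v >= 0 else -1) for v in x]

def dequantize(q, step):
    return [math.copysign(abs(i) ** (4.0 / 3.0) * step, i) for i in q]

def toy_bits(q):
    """Hypothetical rate model: zeros are free, larger magnitudes cost more."""
    return sum(0 if i == 0 else 1 + 2 * abs(i).bit_length() for i in q)

def rate_loop(bands, amp, bit_budget):
    """Inner loop: raise the global step exponent g until the frame fits."""
    g = 0
    while True:
        step = 2.0 ** (g / 4.0)
        q = [quantize([v * 2.0 ** (a / 2.0) for v in band], step)
             for band, a in zip(bands, amp)]
        bits = sum(toy_bits(b) for b in q)
        if bits <= bit_budget:
            return g, q, bits
        g += 1

def noisy_bands(bands, amp, g, q, masks):
    """Indices of bands whose quantization noise exceeds the allowed noise."""
    step = 2.0 ** (g / 4.0)
    out = []
    for k, (band, a) in enumerate(zip(bands, amp)):
        scale = 2.0 ** (a / 2.0)
        rec = [r / scale for r in dequantize(q[k], step)]
        if sum((v - r) ** 2 for v, r in zip(band, rec)) > masks[k]:
            out.append(k)
    return out

def tnls(bands, masks, bit_budget, max_outer=20):
    """Outer loop: amplify noisy bands, then re-run the rate loop."""
    amp = [0] * len(bands)
    for _ in range(max_outer):
        g, q, bits = rate_loop(bands, amp, bit_budget)
        noisy = noisy_bands(bands, amp, g, q, masks)
        if not noisy or len(noisy) == len(bands):
            return g, amp, q, bits, noisy
        for k in noisy:
            amp[k] += 1
    g, q, bits = rate_loop(bands, amp, bit_budget)
    return g, amp, q, bits, noisy_bands(bands, amp, g, q, masks)

# Hypothetical two-band frame with per-band allowed noise energies.
g, amp, q, bits, noisy = tnls([[10.0, -8.0, 3.0], [1.0, 0.5, -0.2]], [1.0, 0.01], 20)
```

Note that the loop simply gives up when every band is noisy or the iteration cap is reached; nothing in it minimizes a joint cost.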


Two-nested-loop Search algorithm (2)

Problems in TNLS

Quantization, scale factor adaptation and Huffman coding are considered separately.

Has no convergence guarantee.

Does not aim to minimize the overall distortion.

Disregards the inter-band correlations of scale factors and Huffman codebook selection in AAC.


Rate-distortion optimization for MP3

Problem formulation: Lagrangian RD cost minimization

Optimization variables:
- quantized coefficients
- scale factors
- Huffman coding region division
- Huffman codebook selection

Terms in the cost:
- non-uniform de-quantizer defined in MP3
- noise-to-mask ratio
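
The symbols of the cost function did not survive in this transcript. One way to write a Lagrangian cost that is consistent with the quantities listed above is given below; the notation is assumed here for illustration and is not necessarily the author's: x is the input spectrum, y the quantized coefficients, s the scale factors, P the region division, H the codebook selection, Q^{-1} the MP3 de-quantizer, D the noise-to-mask-ratio distortion, R the bit rate and lambda the fixed slope.

```latex
\min_{\mathbf{y},\,\mathbf{s},\,P,\,H}\;
J_\lambda(\mathbf{y},\mathbf{s},P,H)
  = D\bigl(\mathbf{x},\, Q^{-1}(\mathbf{y},\mathbf{s})\bigr)
  + \lambda\, R(\mathbf{y},\mathbf{s},P,H),
\qquad \lambda > 0 .
```

In this form the distortion depends only on what the decoder reconstructs, while the rate depends on all four parameter sets, which is what makes a joint search worthwhile.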


Rate-distortion optimization for MP3 (2)

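Problem formulation: soft-decision quantization

In conventional hard-decision quantization, the quantized spectrum is solely determined by the given scale factors, i.e., it is a fixed function of them. In the soft-decision quantization scenario, however, the quantized spectrum is treated as a free coding parameter and selected such that the actual RD cost is minimized. Therefore, the RD cost under soft decision can never exceed the cost under hard decision.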
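
The difference between the two modes can be shown on a single coefficient. In the sketch below the hard decision simply rounds, while the soft decision compares a few candidate indices and keeps the one with the lowest distortion plus lambda times rate. Everything here is a toy under stated assumptions: `toy_bits` is a hypothetical rate model, squared error replaces the perceptual distortion, and the candidate set (hard index, one step toward zero, and zero) is chosen for brevity.

```python
import math

def dequantize(i, step):
    return math.copysign(abs(i) ** (4.0 / 3.0) * step, i)

def hard_decision(x, step):
    mag = int((abs(x) / step) ** 0.75 + 0.4054)
    return -mag if x < 0 else mag

def toy_bits(i):
    """Hypothetical rate model: zero is free, larger magnitudes cost more bits."""
    return 0 if i == 0 else 1 + 2 * abs(i).bit_length()

def rd_cost(x, i, step, lam):
    """Squared error plus lam times the toy rate."""
    return (x - dequantize(i, step)) ** 2 + lam * toy_bits(i)

def soft_decision(x, step, lam):
    """Among a few candidates near the hard decision, keep the cheapest in RD cost."""
    hard = hard_decision(x, step)
    sign = -1 if x < 0 else 1
    candidates = {hard, sign * max(abs(hard) - 1, 0), 0}
    # Sorting by magnitude makes ties resolve toward the smaller index.
    return min(sorted(candidates, key=abs), key=lambda i: rd_cost(x, i, step, lam))

low_lambda = soft_decision(100.0, 1.0, 0.0)    # rate is free: keeps the hard index
high_lambda = soft_decision(100.0, 1.0, 1e9)   # rate dominates: collapses to zero
```

Since the hard-decision index is always among the candidates, the selected index can never have a higher RD cost than the hard decision.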


Rate-distortion optimization for MP3 (3)

Fixed-slope graph-based iterative RD optimization

Step 1: Initialize a set of scale factors from the given frame of spectrum, together with a Huffman codebook (HCB) selection. Set t = 0, and specify a tolerance as the convergence criterion.

Step 2: Given the scale factors and HCB selection at iteration t, for any t ≥ 0, find the optimal quantized spectrum and HCB region division by searching a standard-constrained graph, i.e. the pair that achieves the minimum RD cost. Denote this minimum cost as the cost at iteration t.


Rate-distortion optimization for MP3 (4)

Graph Search for MP3 Quantized Spectrum and Region Division


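Rate-distortion optimization for MP3 (5)

Fixed-slope graph-based iterative RD optimization

Step 3: Holding three of the four parameter sets fixed, update one of the two sets initialized in Step 1 (scale factors or HCB selection) so that the RD cost achieves its minimum.

Step 4: Holding the other three parameter sets fixed, update the remaining set in the same way, so that the RD cost achieves its minimum.

Step 5: Repeat Steps 2, 3 and 4 for t = 0, 1, 2, …. until the decrease in RD cost between iterations falls within the tolerance, then output the final parameter sets.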
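
The iteration in Steps 2 to 5 has the shape of an alternating minimization: each parameter group is re-optimized with the others held fixed, and the loop stops once the cost no longer drops by more than the tolerance. The sketch below shows only that control flow. It is not the author's graph search: exhaustive search over small candidate lists stands in for each update, and the cost function, the parameter names `s` and `h`, and the candidate ranges are hypothetical.

```python
def alternating_minimization(cost, init, candidates, tol=1e-9, max_iter=100):
    """Minimize cost(params) one parameter group at a time.

    init: dict of starting values (each must appear in its candidate list,
          so that no update can increase the cost).
    candidates: dict mapping each group name to its allowed values.
    Returns the final parameters and the cost after each sweep.
    """
    params = dict(init)
    prev = cost(params)
    history = [prev]
    for _ in range(max_iter):
        for name, options in candidates.items():
            params[name] = min(options, key=lambda v: cost({**params, name: v}))
        cur = cost(params)
        history.append(cur)
        if prev - cur <= tol:      # converged: the sweep gained nothing
            break
        prev = cur
    return params, history

# Hypothetical coupled cost over two parameter groups.
def toy_cost(p):
    return (p["s"] - 5) ** 2 + (p["h"] - 2) ** 2 + abs(p["s"] - 2 * p["h"])

final, history = alternating_minimization(
    toy_cost, {"s": 0, "h": 0}, {"s": range(8), "h": range(4)})
```

The recorded cost never increases from one sweep to the next, which is the convergence property TNLS lacks, but the end point can still be a local optimum that depends on the starting values.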


Rate-distortion optimization for MP3 (6)

Simulation results: ANMR (implementation based on ISO MP3 reference codec)

[Figure panels: violin.wav, spme50_1.wav]


Rate-distortion optimization for MP3 (7)

Simulation results: ANMR (implementation based on LAME 3.96.1, best-quality mode)

[Figure panels: violin.wav, spme50_1.wav]


Rate-distortion optimization for MP3 (8)

Simulation results: ITU listening test (80kb/s)


Rate-distortion optimization for MP3 (9)

Remarks

The iteration process may only achieve local optimality, so a well-chosen initial state is preferable when one aims for the best possible RD performance.

The fixed-slope graph-based iterative algorithm we proposed provides a feasible solution to the problems in TNLS.

One can adaptively adjust the value of the Lagrangian multiplier λ to meet rate or distortion constraints in real audio compression applications.
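
One common way to adjust the multiplier toward a rate target is bisection, relying on the fact that the rate selected by a Lagrangian minimization does not increase as the multiplier grows. The sketch below illustrates this on a toy problem where each unit independently picks one of a few (rate, distortion) options. The data, function names and search bounds are hypothetical, and this is a generic technique rather than a procedure taken from the slides.

```python
def choose(options, lam):
    """Option minimizing distortion + lam * rate; options are (rate, distortion)."""
    return min(options, key=lambda rd: rd[1] + lam * rd[0])

def total_rate(units, lam):
    return sum(choose(u, lam)[0] for u in units)

def find_lambda(units, rate_budget, lo=0.0, hi=1e6, iters=60):
    """Bisection on lam. Assumes the budget is reachable at the initial hi;
    the returned value always satisfies the budget in that case."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total_rate(units, mid) > rate_budget:
            lo = mid        # still too many bits: penalize rate more
        else:
            hi = mid        # budget met: try a smaller penalty
    return hi

# Hypothetical operating points for two coding units.
units = [
    [(10, 1.0), (6, 3.0), (2, 9.0)],
    [(8, 0.5), (4, 2.0), (1, 6.0)],
]
lam = find_lambda(units, rate_budget=10)
```

In an encoder, each evaluation of the total rate would be a full run of the optimization at that multiplier, so the number of bisection steps is the main cost of rate control done this way.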


Rate-distortion optimization for AAC

Problem formulation: Lagrangian RD cost minimization

- scale factor sequence

- Huffman codebook index sequence

First-order inter-band dependency -> dynamic programming (Viterbi algorithm)


Rate-distortion optimization for AAC (2)

Fixed-slope trellis-based RD optimization

Step 1: Build up the trellis structure. For each state in the trellis, over all values of its three indices, find the best quantized coefficients to minimize its decomposed RD cost.

Step 2: Find the optimal path through the trellis with the Viterbi algorithm.

Step 3: Backtrack along the optimal path to obtain the optimal quantized spectrum, scale factors and Huffman codebook indices as the final output.
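
Steps 2 and 3 can be sketched with a generic Viterbi search: each band is a trellis stage, each state is a (scale factor, codebook) pair, the node cost is the band's own RD cost in that state, and the transition cost captures what depends on the previous band. The sketch below is a toy under stated assumptions: the node costs are made-up numbers standing in for Step 1, the transition cost is a hypothetical stand-in for differential scale factor bits and section side information, and a brute-force check is included only to confirm the search on this small case.

```python
import itertools

def viterbi(node_cost, trans_cost):
    """node_cost[i][s]: cost of state s in band i.
    trans_cost(p, s): cost of moving from state p (band i-1) to state s (band i).
    Returns (minimum total cost, list of state indices, one per band)."""
    n_states = len(node_cost[0])
    cost = list(node_cost[0])
    back = []
    for i in range(1, len(node_cost)):
        new_cost, ptr = [], []
        for s in range(n_states):
            p = min(range(n_states), key=lambda k: cost[k] + trans_cost(k, s))
            new_cost.append(cost[p] + trans_cost(p, s) + node_cost[i][s])
            ptr.append(p)
        cost = new_cost
        back.append(ptr)
    s = min(range(n_states), key=lambda k: cost[k])
    best = cost[s]
    path = [s]
    for ptr in reversed(back):      # backtrack from the last band to the first
        s = ptr[s]
        path.append(s)
    path.reverse()
    return best, path

# Hypothetical toy setup: 3 bands, states are (scale factor, codebook) pairs.
STATES = [(sf, cb) for sf in (0, 1, 2) for cb in (0, 1)]
LAM = 0.5

def trans_cost(p, s):
    (sf_p, cb_p), (sf_s, cb_s) = STATES[p], STATES[s]
    sf_bits = 1 + 2 * abs(sf_s - sf_p)          # toy differential scale factor code
    section_bits = 9 if cb_s != cb_p else 0     # toy price of opening a new section
    return LAM * (sf_bits + section_bits)

NODE_COST = [
    [4.0, 6.0, 3.0, 5.0, 2.5, 7.0],
    [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
    [3.0, 2.0, 1.5, 4.0, 5.0, 0.5],
]

def path_cost(path):
    total = sum(NODE_COST[i][s] for i, s in enumerate(path))
    return total + sum(trans_cost(p, s) for p, s in zip(path, path[1:]))

best_cost, best_path = viterbi(NODE_COST, trans_cost)
brute = min(path_cost(p) for p in
            itertools.product(range(len(STATES)), repeat=len(NODE_COST)))
```

The search visits every pair of adjacent states once per band, so its cost grows linearly with the number of bands, whereas the brute-force enumeration grows exponentially.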


Rate-distortion optimization for AAC (3)

Trellis Structure for AAC Quantization and Entropy Coding


Rate-distortion optimization for AAC (4)

Simulation results: ANMR

Implementation based on ISO AAC reference codec

Also compared with Aggarwal’s approach (Steps 2, 3 only)

[Figure panels: violin.wav, spme50_1.wav]


Rate-distortion optimization for AAC (5)

Simulation results: ITU listening test (64kb/s)


Rate-distortion optimization for AAC (6)

Remarks

The fixed-slope trellis-based algorithm we proposed achieves the globally optimal RD performance within the quantization and entropy coding stage under the AAC standard constraints.

Joint design of the pre-processing decisions with our proposed optimization can theoretically achieve the globally optimal performance over the entire standard-constrained parameter space, but with computational complexity exponential in the number of bands per frame.


Conclusions and Future Research

Conclusions

The fixed-slope approach converts the encoding problem into a search problem over a constrained space, which permits the use of efficient sequential search algorithms.

Soft-decision quantization completes our RD optimization frameworks and yields a significant performance enhancement.

Substantial performance improvement over the state-of-the-art encoders is achieved with complete decoder compatibility in each case.


Conclusions and Future Research (2)

Future research

Real-time implementations
Extension to scalable AAC
Joint pre-processing and optimization for AAC
Optimal lossy audio compression without syntax constraints
Optimal settings for transform (e.g. block lengths), quantization (e.g. stepsizes) and prediction
Joint design of quantization and entropy coding
….


Questions?

