Page 1: mpeg1 audio

Multimedia Signals and Systems

MP3 - MPEG 1, 2 Layer 1, 2, 3: Polyphase Filterbank

Kunio Takaya

Electrical and Computer Engineering

University of Saskatchewan

March 31, 2008

1

Page 2: mpeg1 audio

“A review of algorithms for perceptual coding of digital audio signals”

T. Painter and A. Spanias

Dept. of Electrical Engineering, Arizona State Univ., Tempe, AZ

http://ieeexplore.ieee.org/iel3/4961/13644/00628010.pdf?arnumber=628010

MP3’ Tech - Encoding engines source codes:

http://www.mp3-tech.org/programmer/encoding.html

“ECE-700 Filterbank Notes”, Why Filterbanks? Sub-band Processing:

Phil Schniter, Ohio State Univ., March 10, 2008.

http://www.ece.osu.edu/~schniter/ee700/handouts/filterbanks.pdf

** Go to full-screen mode now by hitting CTRL-L

2

Page 3: mpeg1 audio

1 Polyphase Filter Bank

References

1. Phil Schniter, ECE-700 Filterbank Notes

2. Davis Yen Pan, Digital Audio Compression

3. Davis Pan, A Tutorial on MPEG/Audio Compression

4. CD 11172-3, Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to about 1.5 Mbit/s, Part 3: Audio

5. Jong-Hwa Kim, Lossless Wideband Audio Compression:

Prediction and Transform, Ph.D. Thesis

3

Page 4: mpeg1 audio

• In MPEG audio coding, a psychoacoustic model is used to decide how much quantization error can be tolerated in each sub-band, while signals below the hearing threshold of a human listener are discarded.

• In the sub-bands that can tolerate more error, fewer bits are used for coding. The quantized sub-band signals can then be decoded and recombined to reconstruct (an approximate version of) the input signal.

• Such processing allows, on average, a 12-to-1 reduction in bit rate while still maintaining CD-quality audio.

• The psychoacoustic model takes into account the spectral masking phenomenon of the human ear: high energy in one spectral region limits the ear’s ability to hear details in nearby spectral regions. Therefore, when the energy in one sub-band is high, nearby sub-bands can be coded with fewer bits without degrading the perceived quality of the audio signal.

4

Page 5: mpeg1 audio

• The MPEG standard specifies a 32-channel sub-band filterbank.

5

5

Page 6: mpeg1 audio

1.1 Uniform Modulated Filterbank

Polyphase Filterbank

6

Page 7: mpeg1 audio

Uniform Modulated Filterbank

• A modulated filterbank is composed of analysis branches which

1. modulate the input to center the desired sub-band at DC,

2. lowpass filter the modulated signal to isolate the desired

sub-band, and

3. downsample the lowpass signal.

• The synthesis branches interpolate the sub-band signals by upsampling and lowpass filtering, then modulate each sub-band back to its original spectral location.

7

Page 8: mpeg1 audio

• In an M-branch critically-sampled uniformly-modulated filterbank, the kth analysis branch extracts the sub-band signal with center frequency ω_k = 2πk/M via modulation and lowpass filtering with a (one-sided) bandwidth of π/M radians, and then downsamples the result by the factor M.

• The output from the uniform modulated filterbank is time-domain data of a sub-band.
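As a concrete illustration, one analysis branch of this structure can be sketched in a few lines of Python. All sizes, signals, and the 4-tap averaging "lowpass" filter here are toy assumptions, and the sign convention of the modulator varies between texts:

```python
import cmath

M = 4
h = [0.25, 0.25, 0.25, 0.25]                  # toy lowpass: 4-tap moving average
x = [float(n % 7) for n in range(32)]         # toy input signal
k = 1                                         # branch (sub-band) index

# 1. modulate: shift sub-band k (centre frequency 2*pi*k/M) down to DC
mod = [x[n] * cmath.exp(-2j * cmath.pi * k * n / M) for n in range(len(x))]
# 2. lowpass filter the modulated signal to isolate the sub-band
filt = [sum(h[i] * mod[n - i] for i in range(len(h)) if n - i >= 0)
        for n in range(len(mod))]
# 3. downsample by M
sub = filt[::M]
print(len(sub))   # 8 sub-band samples from 32 input samples
```

Critical sampling is visible in the last step: each branch emits one output per M input samples.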

8

Page 9: mpeg1 audio

1.2 Polyphase/DFT Implementation of Uniform

Modulated Filterbank

Uniform Modulated Filterbank

9

Page 10: mpeg1 audio

The uniform modulated filterbank can be implemented using polyphase filterbanks and DFTs, resulting in huge computational savings. The figure illustrates the equivalent polyphase/DFT structures for analysis and synthesis.

• The impulse responses of the polyphase filters P_ℓ(z) and P̄_ℓ(z) can be defined in the time domain as p_ℓ[m] = p̄_ℓ[m] = h[mM + ℓ], where h[n] denotes the impulse response of the lowpass filter.

• Recall that the standard implementation performs modulation, filtering, and downsampling, in that order.

• The polyphase/DFT implementation reverses the order of these operations; it performs downsampling, then filtering, then modulation (if we interpret the DFT as a bank of “modulators”).
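The polyphase definition p_ℓ[m] = h[mM + ℓ] is simply a stride-M split of the filter taps, which a minimal Python sketch with a toy impulse response makes explicit:

```python
M = 4
h = list(range(16))                  # toy impulse response h[n] = n
p = [h[l::M] for l in range(M)]      # p[l][m] == h[m*M + l]
print(p[1])                          # [1, 5, 9, 13]
```

Each sub-filter p_ℓ sees every Mth tap of h, offset by ℓ, so together the M polyphase filters contain exactly the taps of h with none repeated.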

10

Page 11: mpeg1 audio

We derive the polyphase/DFT implementation by exchanging the

order of modulation, filtering, and downsampling.

11

Page 12: mpeg1 audio

Reversing the modulation and filtering

We start by analyzing the kth filterbank branch, shown below. The first step is to reverse the modulation and filtering operations. To do this, we define a “modulated filter” H_k(z):

v_k[n] = Σ_i h[i] x[n−i] e^{j(2π/M)k(n−i)}    (1)

= ( Σ_i h[i] e^{−j(2π/M)ki} x[n−i] ) e^{j(2π/M)kn}    (2)

= ( Σ_i h_k[i] x[n−i] ) e^{j(2π/M)kn}    (3)

12

Page 13: mpeg1 audio

• where h_k[i] = h[i] e^{−j(2π/M)ki} is the impulse response of the modulated filter. The equation above indicates that x[n] is convolved with the modulated filter and that the filter output is then modulated.

• Now consider the downsampler. The only modulator outputs not discarded by the downsampler are those with time index n = mM. For those outputs, the modulator has the value e^{j(2π/M)kmM} = e^{j2πkm} = 1, and thus it can be ignored. The resulting system is portrayed in the bottom block diagram.

13

Page 14: mpeg1 audio

14

Page 15: mpeg1 audio

Reversing the order of filtering and downsampling

To apply the Noble identity, we must decompose H_k(z) into a bank of upsampled polyphase filters. The polyphase decomposition is derived as follows:

H_k(z) = Σ_{n=−∞}^{∞} h_k[n] z^{−n} = Σ_{ℓ=0}^{M−1} Σ_{m=−∞}^{∞} h_k[mM+ℓ] z^{−(mM+ℓ)}

Noting that the ℓth polyphase filter has impulse response

h_k[mM+ℓ] = h[mM+ℓ] e^{−j(2π/M)k(mM+ℓ)} = h[mM+ℓ] e^{−j(2π/M)kℓ} = p_ℓ[m] e^{−j(2π/M)kℓ}

where p_ℓ[m] is the ℓth polyphase filter obtained from the original (unmodulated) lowpass filter H(z) by M:1 downsampling.

15

Page 16: mpeg1 audio

We now obtain

H_k(z) = Σ_{ℓ=0}^{M−1} Σ_{m=−∞}^{∞} p_ℓ[m] e^{−j(2π/M)kℓ} z^{−mM−ℓ}

= Σ_{ℓ=0}^{M−1} e^{−j(2π/M)kℓ} z^{−ℓ} Σ_{m=−∞}^{∞} p_ℓ[m] (z^M)^{−m}

= Σ_{ℓ=0}^{M−1} e^{−j(2π/M)kℓ} z^{−ℓ} P_ℓ(z^M).    (4)

16

Page 17: mpeg1 audio

Derived filterbank structure - downsampler after the polyphase

branches

17

Page 18: mpeg1 audio

Derived filterbank structure - downsampler before the polyphase

branches

18

Page 19: mpeg1 audio

• The kth filterbank branch (now containing M polyphase branches) is illustrated. Because it is a linear operator, the downsampler can be moved through the adders and the (time-invariant) scalings e^{−j(2π/M)kℓ}. Finally, the Noble identity is employed to exchange the filtering and downsampling.

• Observe that the polyphase outputs {v_ℓ[m], ℓ = 0, …, M−1} are identical for each filterbank branch, while the scalings {e^{−j(2π/M)kℓ}, ℓ = 0, …, M−1} are different for each filterbank branch since they depend on the branch index k.

Page 20: mpeg1 audio

• Thus, we only need to calculate the polyphase outputs {v_ℓ[m], ℓ = 0, …, M−1} once. Using these outputs we can compute the branch outputs via

y_k[m] = Σ_{ℓ=0}^{M−1} v_ℓ[m] e^{−j(2π/M)kℓ}    (5)

• From this equation it is clear that y_k[m] corresponds to the kth DFT output given the M-point input sequence {v_ℓ[m], ℓ = 0, …, M−1}. Thus the M filterbank branches can be computed in parallel by taking an M-point DFT of the M polyphase outputs as shown.
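The equivalence of the standard branch computation and the polyphase/DFT structure can be checked numerically. The following Python sketch (toy sizes M = 4, N = 16, random data; all names are illustrative) computes one output vector both ways and compares:

```python
import cmath
import random

M, N = 4, 16                 # branches and lowpass filter length (toy sizes)
random.seed(0)
h = [random.uniform(-1, 1) for _ in range(N)]   # toy "lowpass" taps
x = [random.uniform(-1, 1) for _ in range(64)]  # toy input signal

def xval(n):                 # treat the signal as zero outside its support
    return x[n] if 0 <= n < len(x) else 0.0

m = 10                       # one decimated output index to check

# Direct branch computation: y_k[m] = sum_i h[i] e^{-j(2pi/M)ki} x[mM - i]
direct = []
for k in range(M):
    acc = 0j
    for i in range(N):
        acc += h[i] * cmath.exp(-2j * cmath.pi * k * i / M) * xval(m * M - i)
    direct.append(acc)

# Polyphase/DFT: v_l[m] = sum_{m'} p_l[m'] x[(m - m')M - l], then a DFT over l
v = []
for l in range(M):
    acc = 0.0
    for mp in range(N // M):                    # p_l[m'] = h[m'M + l]
        acc += h[mp * M + l] * xval((m - mp) * M - l)
    v.append(acc)
poly = []
for k in range(M):
    acc = 0j
    for l in range(M):
        acc += v[l] * cmath.exp(-2j * cmath.pi * k * l / M)
    poly.append(acc)

err = max(abs(a - b) for a, b in zip(direct, poly))
print(err < 1e-12)   # True: both structures produce identical branch outputs
```

The agreement is exact up to rounding because e^{−j(2π/M)k(m′M+ℓ)} = e^{−j(2π/M)kℓ}, which is precisely the step the derivation above exploits.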

20

Page 21: mpeg1 audio

Derived filterbank structure that incorporates the DFT block

21

Page 22: mpeg1 audio

1.3 Computational Savings of the Polyphase/DFT

Modulated Filterbank Implementation

Here we consider the analysis bank only; the synthesis bank can be

treated similarly.

standard structure Assume that the lowpass filter H(z) has impulse response length N. To calculate the sub-band output vector y_k[m], k = 0, …, M − 1 using the standard structure, we have

1. N multiplications for the modulated filter H_k(z) plus one multiply for the modulator, per filter output;

2. the filter runs at the full input rate, so M input samples are processed per decimated output;

3. M branches of the filterbank.

Thus, the total number of multiplications per output vector is M²(N + 1).

22

Page 23: mpeg1 audio

lowpass/downsampler If we implement the lowpass/downsampler in each filterbank branch with a polyphase decimator, the number of multiplications becomes

1. N multiplications (the filter taps, now computed at the decimated rate) in each of the M branches, i.e., N × M;

2. M × M multiplications for the M-point DFT.

Thus, NM + M² = (M + N)M.

23

Page 24: mpeg1 audio

FFT If a radix-2 FFT algorithm is used to implement the DFT, and the polyphase outputs v_ℓ[m] are computed once and shared by all branches, we have approximately

1. (M/2) log₂ M multiplications for the half-size radix-2 FFT;

2. N multiplications in total for the polyphase filters P_ℓ(z) (the N taps of H(z) split across the M branches).

Thus, the total number of multiplications is N + (M/2) log₂ M.

24

Page 25: mpeg1 audio

When M = 32 and the lowpass filter has N = 320 taps (10 per polyphase branch), the standard filterbank structure requires M²(N + 1) = 328704 multiplications per output vector, the polyphase/DFT structure performs (M + N)M = 11264 multiplications, and the shared-polyphase/FFT implementation requires only N + (M/2) log₂ M = 400 multiplications.
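These counts follow directly from the three formulas above; a small Python helper (the name `mult_counts` is illustrative, not from the lecture) reproduces the numbers:

```python
import math

def mult_counts(M, N):
    """Multiplications per output vector; M branches, N-tap lowpass filter."""
    standard = M * M * (N + 1)                        # full-rate filter + modulator, M branches
    polyphase_dft = (M + N) * M                       # per-branch polyphase + M-point DFT
    polyphase_fft = N + (M // 2) * int(math.log2(M))  # shared polyphase + radix-2 FFT
    return standard, polyphase_dft, polyphase_fft

print(mult_counts(32, 320))   # (328704, 11264, 400)
```

The ratio between the first and last figures, roughly 800:1, is the "huge computational savings" claimed earlier for the polyphase/FFT structure.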

25

Page 26: mpeg1 audio

2 The Analysis Subband Filter used by

MPEG-1 Layer-I and II

In the MPEG-1 audio encoder, there are two main processing branches in the block diagram. One branch is the psychoacoustic analyzer, and the other is the subband analysis filter bank, which produces the output of each subband (critical band) frequency-shifted to the baseband. The detailed processing steps of the subband analysis filter bank are shown in the figure below. The corresponding code from a MATLAB program, Matlab_MPEG_1_2_4.zip, is listed in the following: a few lines from the main program and all of the subroutine Analysis_subband_filter.m.

26

Page 27: mpeg1 audio

Block diagram of MPEG1 Layer-II

27

Page 28: mpeg1 audio

In the flow diagram shown in Fig. 2, the first block shows that a block of 512 data points is taken into a FIFO (First In, First Out) buffer. The data in the FIFO are processed by a polyphase filterbank. This FIFO buffer is updated every time the subband analysis is completed, by shifting in a set of 32 new data points as illustrated by Fig. ??.

The second block of Fig. 2 applies the low-pass filter function shown in Fig. ?? to the frame of 512 data points to be sent to the subband analysis by a polyphase filter bank. This low-pass filter is a band-limiting filter to suppress frequency aliasing. The pass band within a subband (cut-off frequency) is set to (fs/64) × 0.5824. This filter function can be designed by the window method of FIR filter design, briefly explained in a section to follow. The designed filter function is then multiplied by the Blackman window (not the Hanning window).

28

Page 29: mpeg1 audio

The total length of 512 data points is then divided into 8 segments of 64 data points. Alternating signs {−, +, −, +, −, +, −, +} are attached to each segment. This is to shift the pass-band to the center of a subband.

In order to gain insight into the processing details, we will review the concepts of the polyphase filter bank and the DCT in the following sections.
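The 512-to-64 folding with alternating segment signs described above can be sketched as follows. The frame contents are a toy assumption; note that in the MATLAB listing later in this document the signs appear to be absorbed into the window table C, so its folding loop carries no explicit signs:

```python
Z = [1.0] * 512                                  # toy windowed 512-point frame
sign = [(-1) ** (j + 1) for j in range(8)]       # -, +, -, +, -, +, -, +
Y = [sum(sign[j] * Z[i + 64 * j] for j in range(8)) for i in range(64)]
print(Y[0])   # 0.0 for this constant toy frame: the signs cancel pairwise
```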

29

Page 30: mpeg1 audio

Flow Diagram of the MPEG-1 Audio Encoder Layer-I and Layer II

30

Page 31: mpeg1 audio

Input data for the subband filterbank

31

Page 32: mpeg1 audio

Window function applied to a frame of 512 point data

32

Page 33: mpeg1 audio

% Load tables.
[TH, Map, LTq] = Table_absolute_threshold(1, fs, 128); % Threshold in quiet
CB = Table_critical_band_boundaries(1, fs);
C = Table_analysis_window;

% Analysis subband filtering [1, pp. 67].
for i = 0:11,
  S = [S; Analysis_subband_filter(x, OFFSET + 32 * i, C)];
end

% -----------------------------------------------
function S = Analysis_subband_filter(Input, n, C)
Common;
nmax = length(Input);

% Check input parameters
if (n + 31 > nmax || n < 1)
  error('Unexpected analysis index.');
end

% Build an input vector X of 512 elements. The most recent sample
% is at position 512 while the oldest element is at position 1.
% Pad with zeros if the input signal does not exist.
% ...........................................................
% |       480 samples       | 32 samples |
% n-480                     n            n+31
X = Input(max(1, n - 480):n + 31); % / 32768
X = X(:);
X = [zeros(512 - length(X), 1); X];

% Window vector X by vector C. This produces the Z buffer.
Z = X .* C;

% Partial calculation: 64 Y(i) coefficients
Y = zeros(1, 64);
for i = 1 : 64,
  for j = 0 : 7,
    Y(i) = Y(i) + Z(i + 64 * j);
  end
end

% Calculate the analysis filter bank coefficients
for i = 0 : 31,
  for k = 0 : 63,
    M(i + 1, k + 1) = cos((2 * i + 1) * (k - 16) * pi / 64);
  end
end

% Calculate the 32 subband samples S(i)
S = zeros(1, 32);
for i = 1 : 32,
  for k = 1 : 64,
    S(i) = S(i) + M(i, k) * Y(k);
  end
end

34

Page 35: mpeg1 audio

3 Application of Psychoacoustic Principles:

ISO 11172-3 (MPEG-1)

PSYCHOACOUSTIC MODEL 1

• It is useful to consider an example of how the psychoacoustic

principles described thus far are applied in actual coding

algorithms. The ISO/IEC 11172-3 (MPEG-1, layer 1)

psychoacoustic model 1 determines the maximum allowable

quantization noise energy in each critical band such that

quantization noise remains inaudible.

• In one of its modes, the model uses a 512-point DFT for high-resolution spectral analysis (86.13 Hz bin spacing at a 44.1 kHz sampling rate), then estimates for each input frame the individual simultaneous masking thresholds due to the presence of tone-like and noise-like maskers in the signal spectrum. A global masking threshold is then estimated for a

35

Page 36: mpeg1 audio

subset of the original 256 frequency bins by (power) additive

combination of the tonal and non-tonal individual masking

thresholds.

• This section describes the step-by-step model operations. The

five steps leading to computation of global masking thresholds

are as follows:

1. Spectral Analysis and SPL (Sound Pressure Level)

Normalization

2. Identification of Tonal and Noise Maskers

3. Decimation and Reorganization of Maskers

4. Calculation of Individual Masking Thresholds

5. Calculation of Global Masking Thresholds

36

Page 37: mpeg1 audio

3.1 Spectral Analysis and SPL Normalization

First, incoming audio samples s(n), b-bit signed integers, are normalized according to the FFT length, N, and the number of bits per sample, b, using the relation

x(n) = s(n) / (N · 2^{b−1})

Normalization references the power spectrum to a 0-dB maximum. The normalized input, x(n), is then segmented into 12-ms frames (512 samples) using a 1/16th-overlapped Hann window, such that each frame contains 10.9 ms of new data. A power spectral density (PSD) estimate, P(k), is then obtained using a 512-point FFT:

X(k) = Σ_{n=0}^{N−1} x(n) e^{−j2πnk/N}

37

Page 38: mpeg1 audio

With windowing, this becomes

X(k) = Σ_{n=0}^{N−1} x(n) w(n) e^{−j2πnk/N}.

The Hanning (Hann) window, defined by

w(n) = (1/2) [1 − cos(2πn/N)]

is used to reduce spectral leakage from other frequencies into the analysis frequency.
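The normalization and Hann window above translate directly into code; a toy-sized Python sketch (N = 8 rather than 512, 16-bit samples assumed):

```python
import math

N, b = 8, 16                         # toy FFT length; 16-bit signed samples
s = [1000, -2000, 3000, -4000, 5000, -6000, 7000, -8000]   # toy integer samples
x = [sn / (N * 2 ** (b - 1)) for sn in s]                  # x(n) = s(n) / (N 2^{b-1})
w = [0.5 * (1 - math.cos(2 * math.pi * n / N)) for n in range(N)]   # Hann window
xw = [xn * wn for xn, wn in zip(x, w)]                     # windowed frame
print(round(w[4], 6))   # 1.0: the window peaks at its centre
```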

38

Page 39: mpeg1 audio

Spectrum of

Rectangular (time) Window

39

Page 40: mpeg1 audio

Spectrum of the Hanning Window

40

Page 41: mpeg1 audio

A power spectral density (PSD) estimate, P(k), is then obtained from X(k) computed by a 512-point FFT (Fast Fourier Transform), a fast algorithm to compute the DFT (Discrete Fourier Transform). The PSD resulting from the 512-point FFT has 256 spectral components (harmonics).

P(k) = PN + 10 log₁₀ |X(k)|²  for 0 ≤ k ≤ N/2

where the power normalization term, PN, is the reference sound pressure level of 96 dB.
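Putting the pieces together, a toy PSD computation following P(k) = PN + 10 log₁₀|X(k)|² might look like the sketch below (direct DFT at a small size; the 1e-20 floor is an illustrative stand-in for the MIN_POWER clamp used in the MATLAB code later):

```python
import cmath
import math

N = 8
x = [math.sin(2 * math.pi * 2 * n / N) / N for n in range(N)]  # normalized tone at bin 2
PN = 96.0
X = [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
     for k in range(N // 2 + 1)]                               # bins 0 .. N/2
P = [PN + 10 * math.log10(max(abs(Xk) ** 2, 1e-20)) for Xk in X]
print(max(range(len(P)), key=P.__getitem__))   # the tone's bin (2) dominates the PSD
```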

41

Page 42: mpeg1 audio

Problem

Matlab_MPEG_1_2_4.zip contains a MATLAB program that simulates all of the MP3 psychoacoustic masking threshold calculations. A subroutine FFT_Analysis.m calculates the Power Spectral Density (PSD). The main program is Test_MPEG.m. Apply this program to a music piece in *.wav of your choice to see its PSD. Slide the time window of 512 samples to find the first block such that no zero padding is applied to the analysis. The PSD of “Eine Kleine Nachtmusik” by Mozart is shown below. The key part of the processing in FFT_Analysis.m is shown below.

42

Page 43: mpeg1 audio

% Compute the auditory spectrum using the Fast Fourier Transform.
% The spectrum X is expressed in dB. The size of the transform is 512 and
% it is centered on the 384 samples (12 samples per subband) used for the
% subband analysis. The first of the 384 samples is indexed by n:
% ................................................
% |      |      384 samples      |      |
% n-64   n                       n+383  n+447
% A Hanning window is applied before computing the FFT.

% Prepare the Hanning window
h = sqrt(8/3) * hanning(FFT_SIZE);

% Power density spectrum
X = max(20 * log10(abs(fft(s .* h)) / FFT_SIZE), MIN_POWER);

% Normalization to the reference sound pressure level of 96 dB
Delta = 96 - max(X);
X = X + Delta;

43

Page 44: mpeg1 audio

PSD of “Eine Kleine Nachtmusik” by Mozart

44

Page 45: mpeg1 audio

3.2 Identification of Tonal and Noise Maskers

After PSD estimation and SPL normalization, tonal and non-tonal

masking components are identified.

Tonal maskers

Local maxima in the sample PSD that exceed neighboring components within a certain Bark distance by at least 7 dB are classified as tonal. Specifically, the tonal set, S_T, is defined as

S_T = { P(k) : P(k) > P(k ± 1), P(k) > P(k ± ∆k) + 7 dB }

45

Page 46: mpeg1 audio

where

∆k ∈ { 2          for 2 < k < 63     (0.17–5.5 kHz)
       [2, 3]     for 63 ≤ k < 127   (5.5–11 kHz)
       [2, …, 6]  for 127 ≤ k < 256  (11–20 kHz) }

Tonal maskers, P_TM(k), are computed from the spectral peaks listed in S_T as follows:

P_TM(k) = 10 log₁₀ Σ_{j=−1}^{+1} 10^{0.1 P(k+j)}  dB
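A minimal sketch of the tonal-masker test and the P_TM sum, using a toy PSD and a fixed ∆k neighborhood instead of the k-dependent table above:

```python
import math

P = [60, 62, 80, 61, 59, 58, 70, 58, 57, 56]    # toy PSD values in dB
dk = 2                                           # assumed fixed neighbourhood
tonal = []
for k in range(dk, len(P) - dk):
    if (P[k] > P[k - 1] and P[k] > P[k + 1]
            and all(P[k] > P[k + d] + 7 for d in (-dk, dk))):
        # P_TM(k) = 10 log10( sum_{j=-1..1} 10^{0.1 P(k+j)} )
        ptm = 10 * math.log10(sum(10 ** (0.1 * P[k + j]) for j in (-1, 0, 1)))
        tonal.append((k, round(ptm, 2)))
print(tonal)
```

Each detected peak's SPL is slightly above its raw PSD value because the ±1 neighbors' power is folded into the masker.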

Noise maskers

A single noise masker for each critical band, P_NM(k̄), is then computed from the (remaining) spectral lines not within the ±∆k

46

Page 47: mpeg1 audio

neighborhood of a tonal masker using the sum

P_NM(k̄) = 10 log₁₀ Σ_j 10^{0.1 P(j)}  dB,  for all P(j) not in P_TM(k, k ± 1, k ± ∆k)

where k̄ = ( Π_{j=l}^{u} j )^{1/(u−l+1)} is the geometric mean spectral line of the critical band, and l and u are the lower and upper spectral line boundaries of the critical band, respectively.

47

Page 48: mpeg1 audio

(1) local maxima

48

Page 49: mpeg1 audio

(2) tonal components

49

Page 50: mpeg1 audio

(3) tonal and non-tonal components of Eine Kleine Nachtmusik

50

Page 51: mpeg1 audio

Problem

A subroutine Find_tonal_components.m contained in the MP3 psychoacoustic masking simulation program Matlab_MPEG_1_2_4.zip first calculates the local maxima of the Power Spectral Density (PSD). From the obtained local maxima of the PSD, tonal components are calculated based on the equations described above. Then, non-tonal components and the frequencies of the critical band are calculated. The main program is Test_MPEG.m. Apply this program to a music piece in *.wav chosen in the previous Problem to show the 3 figures generated by Find_tonal_components.m: (1) local maxima, (2) tonal components, and (3) tonal and non-tonal components.

51

Page 52: mpeg1 audio

3.3 Decimation and Reorganization of Maskers

In this step, the number of maskers is reduced using two criteria.

First, any tonal or noise maskers below the absolute threshold are

discarded, i.e., only maskers which satisfy

PTM,NM (k) ≥ Tq(k)

are retained, where Tq(k) is the SPL of the threshold in quiet at

spectral line k. Next, a sliding 0.5 Bark-wide window is used to

replace any pair of maskers occurring within a distance of 0.5 Bark

by the stronger of the two.

After the sliding window procedure, masker frequency bins are

52

Page 53: mpeg1 audio

reorganized according to the subsampling scheme

P_TM,NM(i) = { P_TM,NM(k)  if i = k
               0           if i ≠ k }

The net effect is 2:1 decimation of masker bins in critical bands 18–22 and 4:1 decimation of masker bins in critical bands 22–25, with no loss of masking components. This procedure reduces the total number of tone and noise masker frequency bins under consideration from 256 to 106. An example of the decimation for equal SPLs is shown in the table below.

53

Page 54: mpeg1 audio

k    i    decimate
50   50   keep
51   52   zero
52   52   keep
100  100  keep
101  104  zero
102  104  zero
103  104  zero
104  104  keep
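The keep/zero pattern in the table corresponds to snapping each masker bin up to the nearest even bin (2:1) or multiple-of-four bin (4:1) and keeping only maskers whose bin survives the snap. A sketch (the `snap` helper is illustrative, not from the standard):

```python
def snap(k, step):
    return -(-k // step) * step      # round k up to a multiple of step

rows = ([(k, snap(k, 2)) for k in (50, 51, 52)]          # 2:1 region
        + [(k, snap(k, 4)) for k in range(100, 105)])    # 4:1 region
for k, i in rows:
    print(k, i, 'keep' if i == k else 'zero')
```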

54

Page 55: mpeg1 audio

Problem

A subroutine Decimation.m contained in the MP3 psychoacoustic masking simulation program Matlab_MPEG_1_2_4.zip performs all of the decimation steps described in this sub-section. Apply this program to a music piece in *.wav chosen in the previous Problem to see whether any SPLs are eliminated because (1) a tonal or noise masker is below the absolute threshold, (2) a pair of maskers occurring within a distance of 0.5 Bark is replaced by the stronger of the two, or (3) 2:1 decimation of masker bins in critical bands 18–22 and 4:1 decimation of masker bins in critical bands 22–25 removes them.

55

Page 56: mpeg1 audio

Tonal and non-tonal maskers after decimation. Only one non-tonal

masker SPL under the absolute threshold was eliminated.

56

Page 57: mpeg1 audio

3.4 Calculation of Individual Masking Thresholds

Having obtained a decimated set of tonal and noise maskers,

individual tone and noise masking thresholds are computed next.

Each individual threshold represents a masking contribution at

frequency bin i due to the tone or noise masker located at bin j

(reorganized during step 3). Tonal masker thresholds, T_TM(i, j), are given by

T_TM(i, j) = P_TM(j) − 0.275 z(j) + SF(i, j) − 6.025  dB

where P_TM(j) denotes the SPL of the tonal masker in frequency bin j, z(j) denotes the Bark frequency of bin j,

57

Page 58: mpeg1 audio

and the spread of masking from masker bin j to maskee bin i, SF(i, j), is modeled by the expression

SF(i, j) = { 17∆z − 0.4 P_TM(j) + 11                 for −3 ≤ ∆z < −1
             (0.4 P_TM(j) + 6) ∆z                    for −1 ≤ ∆z < 0
             −17∆z                                   for  0 ≤ ∆z < 1
             (0.15 P_TM(j) − 17) ∆z − 0.15 P_TM(j)   for  1 ≤ ∆z < 8 }   dB
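The piecewise spreading function and the tonal threshold formula translate directly into code; a sketch with illustrative input values:

```python
def SF(dz, P):
    """Spread of masking, dz = z(i) - z(j) in Bark, P = masker SPL in dB."""
    if -3 <= dz < -1:
        return 17 * dz - 0.4 * P + 11
    if -1 <= dz < 0:
        return (0.4 * P + 6) * dz
    if 0 <= dz < 1:
        return -17 * dz
    if 1 <= dz < 8:
        return (0.15 * P - 17) * dz - 0.15 * P
    return float('-inf')             # outside the modeled 10-Bark neighbourhood

def T_TM(P_TM, z_j, dz):             # tonal threshold at maskee offset dz
    return P_TM - 0.275 * z_j + SF(dz, P_TM) - 6.025

print(round(T_TM(80.0, 10.0, 0.5), 3))   # 62.725
```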

58

Page 59: mpeg1 audio

Prototype spreading functions at z=10 as a function of masker level

59

Page 60: mpeg1 audio

SF(i, j) is a piecewise linear function of the masker level, P_TM(j), and the Bark maskee-masker separation, ∆z = z(i) − z(j). SF(i, j) approximates the basilar spreading (excitation pattern). As shown in the figure, the slope of T_TM(i, j) decreases with

increasing masker level. This is a reflection of psychophysical test

results, which have demonstrated that the ear’s frequency

selectivity decreases as stimulus levels increase. It is also noted

here that the spread of masking in this particular model is

constrained to a 10-Bark neighborhood for computational

efficiency. This simplifying assumption is reasonable given the very

low masking levels which occur in the tails of the basilar excitation

patterns modeled by SF (i, j).

60

Page 61: mpeg1 audio

Individual noise masker thresholds, T_NM(i, j), are given by

T_NM(i, j) = P_NM(j) − 0.175 z(j) + SF(i, j) − 2.025  dB

where P_NM(j) denotes the SPL of the noise masker in frequency bin j, z(j) denotes the Bark frequency of bin j, and SF(i, j) is obtained by replacing P_TM(j) with P_NM(j).

Problem

A subroutine Individual_masking_thresholds.m contained in the MP3 psychoacoustic masking simulation program Matlab_MPEG_1_2_4.zip calculates the individual masking thresholds of tonal maskers, T_TM(i, j), and non-tonal maskers, T_NM(i, j), using the spreading function SF(i, j). Apply this program to a music piece in *.wav chosen in the previous Problem to plot the individual masking thresholds of a frame.

61

Page 62: mpeg1 audio

3.5 Calculation of Global Masking Thresholds

In this step, individual masking thresholds are combined to

estimate a global masking threshold for each frequency bin in the

subset given by Eq. 3.4. The model assumes that masking effects

are additive. The global masking threshold, Tg(i), is therefore

obtained by computing the sum,

T_g(i) = 10 log₁₀ ( 10^{0.1 Tq(i)} + Σ_{l=1}^{L} 10^{0.1 T_TM(i,l)} + Σ_{m=1}^{M} 10^{0.1 T_NM(i,m)} )  dB

where Tq(i) is the absolute hearing threshold for frequency bin i, T_TM(i, l) and T_NM(i, m) are the individual masking thresholds, and L and M are the numbers of tonal and noise maskers, respectively, identified previously.
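The power-additive combination is only a few lines of code; a sketch with two equal maskers, showing the expected roughly 3 dB rise over either masker alone:

```python
import math

def T_g(Tq_i, T_TM_list, T_NM_list):
    """Global masking threshold (dB) at one bin, power-additive combination."""
    total = 10 ** (0.1 * Tq_i)                        # threshold in quiet
    total += sum(10 ** (0.1 * t) for t in T_TM_list)  # tonal contributions
    total += sum(10 ** (0.1 * t) for t in T_NM_list)  # noise contributions
    return 10 * math.log10(total)

print(round(T_g(20.0, [40.0], [40.0]), 2))   # 43.03
```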

62

Page 63: mpeg1 audio

In other words, the global threshold for each frequency bin

represents a signal dependent, power additive modification of the

absolute threshold due to the basilar spread of all tonal and noise

maskers in the signal power spectrum. The next Fig. shows global

masking threshold obtained by adding the power of the individual

tonal and noise maskers to the absolute threshold in quiet.

63

Page 64: mpeg1 audio

Individual masking thresholds for both tonal and non-tonal

maskers. The global masking threshold is the sum of all individual

masking thresholds.

64

Page 65: mpeg1 audio

4 End


65

