
COMPENSATION FOR NONLINEAR DISTORTION IN NOISE FOR ROBUST SPEECH RECOGNITION

Mark J. Harvilla Ph.D. Thesis Defense October 27, 2014

Introduction

Topic                                                                     Symbol        Fraction of thesis work
Dynamic range compression (DRC) and automatic speech recognition (ASR)    DRC & ASR     11%
Blind amplitude normalization (BAN)                                       BAN           14%
Blind amplitude reconstruction (BAR)                                      BAR           28%
Robust estimation of distortion (RED)                                     RED           28%
Artificially-matched training (AMT)                                       AMT           9%
The Big Picture                                                           Big Picture   10%

Dynamic Range Compression (DRC)
• A form of nonlinear distortion
  - Nonlinear systems are common (e.g., AM/FM radio, rectifiers)
• DRC is used extensively in audio engineering, typically for one of three reasons:
  1. To adhere to the dynamic range limitations of a signal transmission system while increasing average signal power
  2. To increase perceived signal loudness
  3. To eliminate drastic changes in volume (e.g., automatic gain control)
• Because DRC is ubiquitous, speech systems such as ASR are likely to encounter compressed speech

Dynamic Range Compression (DRC)
• DRC is characterized by two parameters, ratio (R) and threshold (τ).
[Figure: output amplitude vs. input amplitude for R = 1.5, R = 2.5, and R = ∞, with thresholds τ = 0.6 and τ = 0.1 marked.]
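As a rough illustration of how R and τ shape the input-output curve, the following Python sketch applies a static, sample-by-sample compressor. It assumes a piecewise-linear characteristic on linear amplitude (identity below τ, slope 1/R above it). The slides do not spell out the formula, so this exact form, the function name `apply_drc`, and the helper `threshold_from_percentile` are assumptions made for illustration.

```python
import numpy as np

def threshold_from_percentile(x, percentile):
    """Assumed convention: a threshold written 'P75' is the 75th percentile of |x|."""
    return float(np.percentile(np.abs(x), percentile))

def apply_drc(x, tau, ratio):
    """Static DRC sketch: identity below tau, slope 1/ratio above tau."""
    x = np.asarray(x, dtype=float)
    mag = np.abs(x)
    below = np.minimum(mag, tau)          # portion of |x| under the threshold
    excess = np.maximum(mag - tau, 0.0)   # portion of |x| above the threshold
    # ratio = np.inf makes excess / ratio equal to 0, i.e. hard clipping at tau
    return np.sign(x) * (below + excess / ratio)

x = np.array([0.05, 0.5, -1.0])
print(apply_drc(x, tau=0.1, ratio=2.0))     # non-saturating compression
print(apply_drc(x, tau=0.1, ratio=np.inf))  # saturating clipping
```

Under this assumed form, R = 1 leaves the signal unchanged and R = ∞ reduces to hard clipping, which matches the limiting cases named on the slides.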

[Figure: a waveform (amplitude vs. time in seconds) next to the input-output characteristic (output amplitude vs. input amplitude) for R = 1, R = 1.5, R = 2.5, and R = ∞.]
[Figure: a waveform (amplitude vs. time in seconds) next to SNR (dB) vs. τ, threshold (percentile), for R = 1.5, R = 2, R = 3, R = 6, and R = ∞.]

Some examples

Threshold (τ)   Ratio (R)   Crest factor   Word error rate (WER)   WER after processing
P100            1           17.1 dB        6.4%                    6.4%
P75             4           7.7 dB         20.3%                   6.4%
P75             ∞           4.1 dB         30.8%                   13.5%
P50             4           6.7 dB         30.2%                   6.4%
P50             ∞           2.2 dB         49.5%                   23.0%
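The crest factor column can be reproduced for any waveform. The sketch below assumes the usual definition, the peak-to-RMS ratio expressed in dB; the slides do not state the definition, and the function name `crest_factor_db` is made up for this example.

```python
import numpy as np

def crest_factor_db(x):
    """Peak-to-RMS ratio in dB (assumed definition of crest factor)."""
    x = np.asarray(x, dtype=float)
    peak = np.max(np.abs(x))
    rms = np.sqrt(np.mean(x ** 2))
    return 20.0 * np.log10(peak / rms)

n = np.arange(8)
sine = np.sin(2 * np.pi * n / 8)           # one full period of a sinusoid
square = np.array([1.0, -1.0, 1.0, -1.0])  # fully saturated waveform

print(crest_factor_db(sine))    # about 3.01 dB
print(crest_factor_db(square))  # 0 dB
```

Heavier compression pushes a waveform toward the square-wave extreme, which is consistent with the crest factor falling from 17.1 dB to 2.2 dB in the table.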

Measuring the effect of DRC on ASR
• Experiment 1 (no additive noise): clean speech signal → DRC with controlled parameter values (R, τ) → ASR with a clean acoustic model → measure word error rate (WER)
• Experiment 2 (additive, channel noise): clean speech signal → DRC with controlled parameter values (R, τ) → additive noise at controlled SNR → ASR with a clean acoustic model → measure WER
[Figure: Experiment 1, no additive noise. Word error rate (%) for R = ∞, 20, 10, 6, 4, 2, 1; x-axis ticks at 15, 35, 55, 75, 95, 100.]
[Figure: Experiment 2, additive noise at 20-dB SNR w.r.t. compressed signal. Word error rate (%) for R = ∞, 20, 10, 6, 4, 2, 1; same x-axis ticks.]
[Figure: Experiment 2, additive noise at 15-dB SNR w.r.t. compressed signal. Word error rate (%); same x-axis ticks.]
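To make "SNR w.r.t. compressed signal" concrete, the sketch below scales a noise sequence so that its power sits a chosen number of dB below the power of the already-compressed signal. White Gaussian noise is an assumption here (the slides only say "channel noise"), and `add_noise_at_snr` is a hypothetical helper name.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise scaled to the requested SNR relative to `signal`."""
    signal = np.asarray(signal, dtype=float)
    noise = rng.standard_normal(signal.shape)
    signal_power = np.mean(signal ** 2)
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    # Rescale using the measured noise power so the realized SNR is exact
    noise = noise * np.sqrt(target_noise_power / np.mean(noise ** 2))
    return signal + noise, noise

rng = np.random.default_rng(0)
compressed = np.clip(np.sin(2 * np.pi * np.arange(1000) / 50), -0.5, 0.5)
noisy, noise = add_noise_at_snr(compressed, 15.0, rng)
measured = 10 * np.log10(np.mean(compressed ** 2) / np.mean(noise ** 2))
print(measured)
```

Measuring the SNR against the compressed signal rather than the clean one means the noise level tracks whatever power the signal has after DRC.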

Counteracting the effects of DRC
• Saturating DRC ("clipping"): Blind amplitude reconstruction (BAR)
• Non-saturating DRC ("compression"): Blind amplitude normalization (BAN)
• Artificially-matched training (AMT)
• Robust estimation of nonlinear distortion function (RED)

Blind Amplitude Normalization (BAN) (Balchandran & Mammone; ICASSP 1998)
• Step 1: Obtain an estimate of the cumulative distribution function (CDF) of the observed speech and of clean, unadulterated reference speech.
[Figure: two CDF panels (cumulative probability vs. amplitude): observed speech (R = 10, τ = P50) and clean speech.]
• Step 2: For a given reference signal amplitude, find the amplitude in the observed CDF with the same cumulative probability.
  - Input amplitude of 0.061 maps to 0.2
• Step 3: Repeat for each input signal amplitude to obtain a full non-parametric estimate of the nonlinear mapping.
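The three steps amount to matching quantiles between two empirical CDFs. The sketch below is one minimal way to do that with sorted samples and linear interpolation; it is not taken from the cited paper, and the name `ban_inverse_map` and the synthetic square-root distortion are assumptions chosen only to keep the example self-contained.

```python
import numpy as np

def ban_inverse_map(observed, reference):
    """Build a non-parametric inverse mapping by matching empirical CDFs."""
    obs_sorted = np.sort(np.asarray(observed, dtype=float))
    ref_sorted = np.sort(np.asarray(reference, dtype=float))
    p_obs = np.linspace(0.0, 1.0, obs_sorted.size)  # empirical CDF of observed
    p_ref = np.linspace(0.0, 1.0, ref_sorted.size)  # empirical CDF of reference

    def inverse(y):
        prob = np.interp(y, obs_sorted, p_obs)      # cumulative probability of y
        return np.interp(prob, p_ref, ref_sorted)   # reference amplitude at that probability
    return inverse

rng = np.random.default_rng(0)
clean = rng.laplace(scale=0.05, size=2000)
observed = np.sign(clean) * np.abs(clean) ** 0.5    # hypothetical monotone distortion
restored = ban_inverse_map(observed, clean)(observed)
print(np.max(np.abs(restored - clean)))
```

In this toy case the reference is the clean signal itself, so the mapping inverts the distortion almost exactly. In practice the reference would be different clean speech, and the match is only as good as the similarity of the two amplitude distributions.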

How well does BAN work?
• Experiment 1 (no additive noise):
[Figure: word error rate (%) before BAN and after BAN for R = ∞, 20, 10, 6, 4, 2, 1; x-axis ticks at 15, 35, 55, 75, 95, 100.]
• Experiment 2 (additive, channel noise at 20-dB SNR):
[Figure: word error rate (%) before BAN and after BAN; same x-axis ticks.]
[Figure: a further before-BAN and after-BAN pair of word error rate (%) plots; the slide title did not survive extraction.]

Robust BAN (Harvilla & Stern; unpub.)
• Idea: Shift each input sample by the amount by which the centroid of that sample and its neighbors changes when the nonlinearity is inverted.
[Figure: two CDF panels (cumulative probability vs. amplitude): observed speech after low-pass filter (R = 10, τ = P50, SNR = 15 dB) and clean speech after low-pass filter.]
• Step 1: As before, for a given reference signal amplitude, find the amplitude in the observed CDF with the same cumulative probability.
• Step 2: The difference between the output and the input is the offset to be added to the original noisy, compressed waveform.
  - Offset = output − input = 0.2 − 0.061 = 0.139
• Step 3: Repeat for each input signal amplitude, always using the inverse mapping defined by the smoothed signals.

Robust BAN, sample by sample:
• Step 1: For each sample, find the centroid of the value and its surrounding 4 samples.
• Step 2: Pass the centroid value through the inverse nonlinearity estimate.
• Step 3: Find the difference ("offset") between the output of the inverse nonlinearity and the centroid.
• Step 4: Add the offset to the original noisy and compressed sample value from Step 1.
• Step 5: Repeat for each sample in the input signal.
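The five per-sample steps can be sketched in a few lines. This sketch assumes the "surrounding 4 samples" are two on each side, so the centroid is a 5-point moving average, and it takes the inverse nonlinearity as a callable supplied by the caller. The name `robust_ban` and the edge handling are illustrative choices, not details from the slides.

```python
import numpy as np

def robust_ban(observed, inverse_map, width=5):
    """Shift each sample by the change its local centroid undergoes under inverse_map."""
    observed = np.asarray(observed, dtype=float)
    kernel = np.ones(width) / width
    # Step 1: centroid of each sample and its neighbors
    # (mode="same" zero-pads, so the first and last samples see a smaller centroid)
    centroid = np.convolve(observed, kernel, mode="same")
    # Steps 2-3: pass the centroid through the inverse nonlinearity, take the offset
    offset = inverse_map(centroid) - centroid
    # Steps 4-5: add the offset to every original sample
    return observed + offset

x = np.ones(10)
print(robust_ban(x, lambda c: c))      # identity mapping leaves x unchanged
print(robust_ban(x, lambda c: 2 * c))  # interior samples move from 1 to 2
```

Because the offset is computed from the smoothed value but applied to the raw sample, sample-level noise is carried through unchanged rather than being amplified by the inverse mapping.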


[Figure: amplitude vs. time (seconds): original waveform, DRC + noise (SNR = 15 dB), and the waveform repaired using BAN.]
[Figure: amplitude vs. time (seconds): original waveform, DRC + noise (SNR = 15 dB), and the waveform repaired using Robust BAN.]

Results summary
• RBAN is more useful as R becomes large and SNR decreases:
[Figure: (RBAN − BAN) relative improvement (%) for R = 2, 4, 6, 10, 20 at 15-dB SNR and 20-dB SNR.]


Blind Amplitude Reconstruction (BAR)
• When R = ∞, BAN techniques are ineffective.
• All samples whose magnitude exceeds τ are completely lost ("clipping").
[Figure: waveform, amplitude vs. time (seconds).]

Consistent Iterative Hard Thresholding (Kitic et al.; ICASSP 2013)
• Kitic-IHT works by learning a sparse representation of the incoming clipped speech in terms of Gabor basis vectors.
• Learning is done using a modified version of the Iterative Hard Thresholding (IHT) algorithm.
• The learned sparse representation is then used to reconstruct the signal on a frame-by-frame basis.
• Kitic-IHT will be used as a baseline against which the performance of the novel declipping algorithms is compared.
[Diagram labels: Gabor basis vectors; sparse representation, learned from clipped observation; repaired signal frame.]

Constrained BAR (Harvilla & Stern; Interspeech 2014)
• Declip the signal by interpolating missing samples such that the energy in the second derivative is minimized (i.e., for smoothness).
• Ensure the interpolation matches the sign of the clipped signal and exceeds τ in absolute value.
[Figure: waveform, amplitude vs. time (seconds).]
• Explaining masking matrices: one isolates reliable samples, the other isolates clipped samples.
• CBAR objective function: [equation not recoverable from the extracted text; a minimization over xc subject to a constraint.]
• Because Constrained BAR (CBAR) imposes a hard constraint when minimizing the objective function, it is very slow.
• A line search algorithm is used to solve the constrained optimization separately for every frame.
• In the worst case, it is 400 times slower than real time.
• This motivates the development of a declipping algorithm that does not require a hard constraint.
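The prose description of CBAR can be written compactly as a constrained least-squares problem. The notation below is assumed for illustration and is not recovered from the slides: x_r and x_c are the reliable and clipped samples of a frame, S_r and S_c are the masking matrices that isolate them, D_2 is a second-difference operator, s holds the signs of the clipped observations, and ∘ is the elementwise product.

```latex
\begin{aligned}
\underset{x_c}{\text{minimize}} \quad
  & \left\lVert D_2 \left( S_r^{\mathsf{T}} x_r + S_c^{\mathsf{T}} x_c \right) \right\rVert_2^2 \\
\text{subject to} \quad
  & s \circ x_c \ge \tau \mathbf{1}
\end{aligned}
```

The objective is the energy of the second difference of the reassembled frame, and the constraint encodes both requirements at once: each interpolated sample has the sign of its clipped observation and a magnitude of at least τ.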

Regularized BAR (Harvilla & Stern; ICASSP 2015)
• Replace CBAR's hard constraint with regularization terms.
• CBAR objective function: [equation not recoverable from the extracted text.]
• RBAR objective function: [equation not recoverable from the extracted text.] xc can be solved for in closed form!
• Frame-specific solution: [equation not recoverable from the extracted text.] xc can be solved for in closed form!
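To show why swapping a hard constraint for a regularization term yields a closed-form solution, here is a generic regularized least-squares sketch. It is not the RBAR objective from the slides, which did not survive extraction. It assumes a single target vector `target` and a single weight `lam`, and minimizes the second-difference energy plus `lam` times the squared distance of the clipped samples from the target. All names are hypothetical.

```python
import numpy as np

def second_difference_matrix(n):
    """(n-2) x n matrix whose rows apply the stencil [1, -2, 1]."""
    d2 = np.zeros((n - 2, n))
    for i in range(n - 2):
        d2[i, i:i + 3] = [1.0, -2.0, 1.0]
    return d2

def regularized_declip(frame, clipped, target, lam):
    """Minimize ||D2 x||^2 + lam * ||x_c - target||^2 over the clipped samples x_c."""
    frame = np.asarray(frame, dtype=float)
    clipped = np.asarray(clipped, dtype=bool)
    d2 = second_difference_matrix(frame.size)
    a = d2[:, clipped]                      # columns acting on clipped samples
    b = d2[:, ~clipped] @ frame[~clipped]   # contribution of the reliable samples
    # Setting the gradient to zero gives (A^T A + lam I) x_c = lam * target - A^T b
    lhs = a.T @ a + lam * np.eye(int(clipped.sum()))
    rhs = lam * np.asarray(target, dtype=float) - a.T @ b
    out = frame.copy()
    out[clipped] = np.linalg.solve(lhs, rhs)
    return out

n = np.arange(40)
clean = 0.5 * np.sin(2 * np.pi * n / 40)
tau = 0.3
mask = np.abs(clean) > tau
frame = np.clip(clean, -tau, tau)
target = 1.3 * tau * np.sign(frame[mask])   # hypothetical target amplitude
repaired = regularized_declip(frame, mask, target, lam=0.1)
print(np.max(np.abs(repaired)))
```

The whole frame is repaired with one linear solve, with no iterative search, which is the property the slide highlights. The choice of target amplitude matters, since the regularizer pulls the interpolated samples toward it.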


• The t0 and t1 terms are target vectors.
• They "float" above the clipped segments at the target amplitude.
• They are defined as a function of the fraction of clipped samples in a frame.
[Figures: waveform examples, amplitude vs. time (seconds), with the target vectors t0 and t1 marked.]
• The target amplitudes underestimate the true peak (future research).
• Amplitude prediction
[Figure: P95 / τ vs. ρ, the fraction of clipped samples in a frame, with curves labeled exponential and power-law.]

Processing speed
[Figure: log(TRT), where TRT = run time / input duration, for CBAR, Kitic-IHT, and RBAR; x-axis ticks at 20, 40, 60, 80. The y-axis runs from −2 [TRT = 0.13] to 6 [TRT = 403].]

Declipping performance
• Experiment 1 (no additive noise):
[Figure: word error rate (%) for no declipping, RBAR, Kitic-IHT, and CBAR; x-axis ticks at 15, 35, 55, 75, 95, 100.]
• Experiment 1 (no additive noise), relative improvements:
[Figure: relative decrease in WER (%), two panels. CBAR: relative to no declipping, relative to RBAR, and relative to Kitic-IHT. RBAR: relative to no declipping and relative to Kitic-IHT.]
• Experiment 2 (additive noise):
[Figure: word error rate (%) vs. SNR (dB) for no declipping, RBAR, CBAR, and Kitic-IHT; panels for τ = P75 and τ = P95.]
• The location of all clipped samples is assumed known.
• Kitic-IHT is more robust to additive noise (future research).

Robust Estimation of Distortion (RED)
• Given a received speech signal, how does one determine whether declipping (BAR) or decompression (BAN) needs to be performed?
[Flowchart: Receive audio → Is audio exposed to DRC? If no, extract features. If yes → Is audio clipped? If yes, apply BAR; if no, apply BAN. Then extract features.]
• Is audio exposed to DRC? Search for peaks in the probability distribution of the waveform amplitudes. ✔
• Is audio clipped? Accurately estimate the value of R (recall: if R is "very" large, speech is effectively clipped). ✗
• Apply BAR: Requires estimation of which samples are clipped and must assume the possibility of noise (e.g., as in Experiment 2). ✔

Clipped speech detection & τ estimation (Harvilla & Stern; ICASSP 2015)
• Exposure to DRC significantly modifies the waveform amplitude distribution of the speech.
[Figure: probability vs. amplitude for uncompressed speech with noise at 15-dB SNR and for DRC'ed speech (R = 6, τ = 0.06) + noise at 15 dB.]
Clipping detection and τ estimation algorithm:
1. Detect peaks in the distribution.
2. Compute: [equation not recoverable from the extracted text.]
3. Output indicates clipping occurrence and amplitude value of τ (0.5*( |-τ| + 0 + |τ| )); if the output is ∞, no clipping.


Clipped signal detection accuracies
[Figure: clipped signal detection accuracy (%) vs. SNR (dB); panels for τ = P95 and τ = P75.]
• Because the amplitude distribution merges into one lobe (thus, one peak) with decreasing SNR and τ, detection accuracy correspondingly decreases.

τ-estimation accuracies for R = ∞
[Figure: estimated τ vs. actual τ; panels for SNR = 20 dB, 15 dB, 10 dB, and 5 dB.]

Clipped sample estimation (Harvilla & Stern; ICASSP 2015)
• Given the amplitude value of τ, how do we determine the location of clipped samples?
[Figure: amplitude vs. time (seconds), showing signal samples and the clipping threshold, for clipped speech with no noise and for clipped speech + noise at 10-dB SNR.]
• Solution: Given
  - the amplitude value of τ
  - the percentile value of τ
  - the variance of the additive noise (σw²)
  - the variance of the observed signal (σy²)
• Model the clean speech and noise with separate Gaussians.
• For each sample, classify as clipped if
  Pr(clipped | observed sample, τ, σw², σy²) > Pr(not clipped | observed sample, τ, σw², σy²)
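The decision rule above compares two posterior probabilities. The sketch below is one possible instantiation, not the densities used in the thesis. It assumes that clipped samples sit at +τ or −τ before Gaussian noise is added, that unclipped samples can be crudely modeled as zero-mean Gaussian with the observed variance, and that the prior probability of clipping is 1 − P/100 when τ is the P-th percentile of the absolute amplitude. The function names are hypothetical.

```python
import numpy as np

def gaussian_pdf(y, mean, var):
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def classify_clipped(y, tau, tau_percentile, noise_var, observed_var):
    """Return True where the 'clipped' posterior exceeds the 'not clipped' posterior."""
    y = np.asarray(y, dtype=float)
    prior_clipped = 1.0 - tau_percentile / 100.0
    # Assumption: a clipped sample equals +tau or -tau, then noise is added
    like_clipped = 0.5 * (gaussian_pdf(y, tau, noise_var)
                          + gaussian_pdf(y, -tau, noise_var))
    # Assumption: crude zero-mean Gaussian stand-in for unclipped speech plus noise
    like_unclipped = gaussian_pdf(y, 0.0, observed_var)
    return prior_clipped * like_clipped > (1.0 - prior_clipped) * like_unclipped

samples = np.array([0.0, 0.07, -0.07])
print(classify_clipped(samples, tau=0.07, tau_percentile=75,
                       noise_var=7.66e-5, observed_var=0.0025))
```

With these illustrative numbers (noise roughly 15 dB below the observed signal), the samples at ±0.07 are labeled clipped and the sample at 0 is not. Samples near the threshold become harder to separate as the noise variance grows.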


[Figure: probability density vs. amplitude for the clipped and not-clipped classes; speech clipped at τ = 0.07 and added to noise at 15-dB SNR.]
[Figure: mean classification accuracy vs. SNR (dB) for τ = P95, P75, P55, and P35.]
• Apply BAR stages: voice activity detection; estimation of noise variance; estimation of τ percentile; clipped sample estimation; declipping
[Figure: word error rate (%) vs. SNR (dB); panels for τ = P75 and τ = P95. The location of all clipped samples is assumed known.]
[Figure: word error rate (%) vs. SNR (dB) with the clipped signal detection accuracy overlaid; panels for τ = P75 and τ = P95. Clipping occurrence and location are detected using RED techniques.]

Artificially-Matched Training (AMT)
• So far, the developed techniques have sought to repair clipped, compressed, and noisy speech to "look like" clean speech: noisy observations → compensation → clean models
• Ultimately, what matters is that the acoustic model and the testing data conditions match. Neither needs to be "clean": noisy observations → noisy models
[Figure: word error rate (%) for R = ∞, 20, 10, 6, 4, 2, 1 under clean training and under DRC-matched training; x-axis ticks at 15, 35, 55, 75, 95, 100.]

Artificially-Matched Training with Acoustic Model Selection (AMT-AMS)
• One approach to achieving this in practice:
[Diagram: xn → MFCC → ASR → WER, with a regression on DRC parameters {R, τ} selecting from a bank of acoustic models {R0, τ0}, {R1, τ1}, …, {Rk-1, τk-1}.]
• The current implementation uses the following parameter sets: R = {∞}, τ = {P15, P35, P55, P75, P95}
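Model selection from a bank can be sketched as a lookup on the estimated threshold percentile. The nearest-neighbor rule below is an assumption, since the slides only say that a regression on the DRC parameters drives the choice. The bank values are the five τ percentiles listed above, and the function name is hypothetical.

```python
# Threshold percentiles of the acoustic models in the bank (R = inf for all of them)
MODEL_BANK_PERCENTILES = (15, 35, 55, 75, 95)

def select_acoustic_model(estimated_tau_percentile, bank=MODEL_BANK_PERCENTILES):
    """Pick the bank entry whose tau percentile is closest to the estimate."""
    return min(bank, key=lambda p: abs(p - estimated_tau_percentile))

print(select_acoustic_model(80))  # nearest bank entry is P75
print(select_acoustic_model(20))  # nearest bank entry is P15
```

A finer bank would reduce the mismatch between the estimated and the selected condition, at the cost of training and storing more acoustic models.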


[Figure: word error rate (%) for R = ∞, 20, 10, 6, 4, 2, 1 under clean training and under AMT-AMS; x-axis ticks at 15, 35, 55, 75, 95, 100.]

The Big Picture
• With no knowledge of the noise conditions and characteristics of the incoming speech, how well does the combination of algorithms from the thesis work in practice?
Compression:
[Diagram: x[n] → compress with probability pc (τ drawn uniformly in [τ0, τ1]; R drawn from a Gamma distribution with parameters [kR, θR]) → add noise w[n] with probability pn (SNR in dB drawn from N(µ, σ²)) → y[n].]
• pc = 0.9, t0 = 60, t1 = 98, pn = 0.75, µ = 20, σ² = 25, k = 3, θ = 2
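The randomized test condition can be simulated directly from the listed parameters. The sketch below makes several assumptions that the slides do not confirm: t0 and t1 are read as percentile bounds on the absolute amplitude, the compressor is a piecewise-linear static characteristic, the noise is white Gaussian, and Gamma draws below 1 are clamped to R = 1. The function name `degrade` is made up.

```python
import numpy as np

def degrade(x, rng, pc=0.9, tau_lo=60.0, tau_hi=98.0, pn=0.75,
            snr_mean=20.0, snr_var=25.0, k=3.0, theta=2.0):
    """Randomly compress and/or add noise to x, following the slide's parameters."""
    y = np.asarray(x, dtype=float).copy()
    if rng.random() < pc:
        # tau drawn uniformly between two percentiles of |y| (assumed reading of t0, t1)
        tau = np.percentile(np.abs(y), rng.uniform(tau_lo, tau_hi))
        # R drawn from Gamma(shape=k, scale=theta); clamping at 1 is an assumption
        ratio = max(1.0, rng.gamma(k, theta))
        mag = np.abs(y)
        y = np.sign(y) * (np.minimum(mag, tau) + np.maximum(mag - tau, 0.0) / ratio)
    if rng.random() < pn:
        # SNR in dB drawn from a normal distribution with the given mean and variance
        snr_db = rng.normal(snr_mean, np.sqrt(snr_var))
        noise = rng.standard_normal(y.shape)
        scale = np.sqrt(np.mean(y ** 2) / (10.0 ** (snr_db / 10.0) * np.mean(noise ** 2)))
        y = y + scale * noise
    return y

rng = np.random.default_rng(0)
x = rng.laplace(scale=0.05, size=16000)
y = degrade(x, rng)
print(y.shape)
```

Setting `pc=0` and `pn=0` returns the input unchanged, which is a quick sanity check on the two branches.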

[Figure: compression. Word error rate (%) for none, RBAN, and BAN.]

Clipping:
[Diagram: x[n] → clip with probability pc (τ drawn uniformly in [τ0, τ1]) → add noise w[n] with probability pn (SNR in dB drawn from N(µ, σ²)) → y[n].]
• pc = 0.9, t0 = 60, t1 = 98, pn = 0.75, µ = 20, σ² = 25
[Figure: clipping. Word error rate (%) for none, RBAR, CBAR, Kitic-IHT, AMT-AMS, and AMT-AMS (RBAR).]

Summary & Conclusions
• A previously unexplored problem in speech recognition, DRC, was introduced.
• Novel solutions to the two primary aspects of the problem, clipping and compression, were developed.
• Techniques for detecting the occurrence of DRC were considered.
• A comprehensive solution to DRC for speech recognition was proposed.
• DRC, especially in noise, is a very hard problem, but this thesis lays the groundwork for very promising future research.

Summary & Conclusions
• Areas of future research include:
  - Improving target amplitude estimates for RBAR [BAR]
  - Improving the robustness of BAR methods to additive noise [BAR]
  - Improving the robustness of clipped/compressed signal detection to low SNR and low τ [RED, Big Picture]
  - Developing an R-estimation algorithm [RED, Big Picture]
  - Further investigating the performance of AMT-AMS with an increasing granularity of acoustic model references [AMT]

Thank you!
• Questions?

