Digital Audio Signal Processing
Lecture-4: Noise Reduction
Marc Moonen / Alexander Bertrand
Dept. E.E./ESAT-STADIUS, KU Leuven
homes.esat.kuleuven.be/~moonen/
Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 2
Overview
• Spectral subtraction for single-microphone noise reduction
  – Single-microphone noise reduction problem
  – Spectral subtraction basics (= spectral filtering)
  – Features: gain functions, implementation, musical noise, …
  – Iterative Wiener filter based on speech signal model
• Multi-channel Wiener filter for multi-microphone noise reduction
  – Multi-microphone noise reduction problem
  – Multi-channel Wiener filter (= spectral + spatial filtering)
• Kalman filter based noise reduction
  – Kalman filters
  – Kalman filters for noise reduction
Single-Microphone Noise Reduction Problem
• Microphone signal is
    $y[k] = s[k] + n[k]$
  with $s[k]$ the desired signal contribution and $n[k]$ the noise contribution
• Goal: Estimate s[k] based on y[k]
• Applications:
  – Speech enhancement in conferencing, handsfree telephony, hearing aids, …
  – Digital audio restoration
• Will consider speech applications: s[k] = speech signal
[Figure: the desired signal s[k] and the noise signal(s) add up to the microphone signal y[k]; the unknown processing block (?) outputs the desired signal estimate $\hat{s}[k]$]
Spectral Subtraction Methods: Basics
• Signal $y[k] = s[k] + n[k]$ is chopped into `frames’ (e.g. 10..20 msec); for each frame a frequency domain representation is
    $Y_i(\omega) = S_i(\omega) + N_i(\omega)$   (i-th frame)
• However, speech signal is an on/off signal, hence
  – some frames have speech + noise, i.e. $Y_i(\omega) = S_i(\omega) + N_i(\omega)$
  – some frames have noise only, i.e. $Y_i(\omega) = 0 + N_i(\omega)$ for frame $i \in$ {`noise-only’ frames}
• A speech detection algorithm is needed to distinguish between these 2 types of frames (based on energy/dynamic range/statistical properties, …)
Spectral Subtraction Methods: Basics
• Definition: $\mu(\omega)$ = average amplitude of noise spectrum
    $\mu(\omega) = E\{ |N_i(\omega)| \}$
• Assumption: noise characteristics change slowly, hence estimate $\mu(\omega)$ by (long-time) averaging over (M) noise-only frames
    $\hat{\mu}(\omega) = \frac{1}{M} \sum_{\text{noise-only frames}} |Y_i(\omega)|$
• Estimate clean speech spectrum $S_i(\omega)$ (for each frame), using corrupted speech spectrum $Y_i(\omega)$ (for each frame, i.e. short-time estimate) + estimated $\hat{\mu}(\omega)$:
    $\hat{S}_i(\omega) = G_i(\omega) \cdot Y_i(\omega)$
  based on `gain function’
    $G_i(\omega) = f(|Y_i(\omega)|, \hat{\mu}(\omega))$
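The long-time noise estimate can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_noise_magnitude`, the list-of-lists layout for the frame magnitudes, and the boolean speech-detector output `is_noise_only` are assumptions made here, not part of the lecture.

```python
def estimate_noise_magnitude(frames_mag, is_noise_only):
    """Average |Y_i(w)| over the noise-only frames, per frequency bin.

    frames_mag    : list of frames, each a list of magnitudes |Y_i(w_n)|
    is_noise_only : list of booleans from a (hypothetical) speech detector
    """
    noise_frames = [f for f, flag in zip(frames_mag, is_noise_only) if flag]
    if not noise_frames:
        raise ValueError("need at least one noise-only frame")
    n_bins = len(noise_frames[0])
    m = len(noise_frames)  # M = number of noise-only frames
    return [sum(f[b] for f in noise_frames) / m for b in range(n_bins)]
```

In practice the average would typically be updated recursively as new noise-only frames arrive, so that slow changes in the noise are tracked.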
Spectral Subtraction: Gain Functions
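• Magnitude Subtraction:   $G_i(\omega) = 1 - \frac{\hat{\mu}(\omega)}{|Y_i(\omega)|}$
• Spectral Subtraction:    $G_i(\omega) = \sqrt{1 - \frac{\hat{\mu}^2(\omega)}{|Y_i(\omega)|^2}}$
• Wiener Estimation:       $G_i(\omega) = 1 - \frac{\hat{\mu}^2(\omega)}{|Y_i(\omega)|^2}$
• Maximum Likelihood:      $G_i(\omega) = \frac{1}{2} \left( 1 + \sqrt{1 - \frac{\hat{\mu}^2(\omega)}{|Y_i(\omega)|^2}} \right)$
• Non-linear Estimation:   $G_i(\omega) = f(\hat{\mu}(\omega), |Y_i(\omega)|)$
• Ephraim-Malah:           $G_i(\omega) = f(\mathrm{SNR}_{post}, \mathrm{SNR}_{prio})$   (see next slide)
  Ephraim-Malah = most frequently used in practice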
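As a minimal sketch, the four closed-form gain rules (magnitude subtraction, spectral subtraction, Wiener estimation, maximum likelihood) can be written as one Python function of the ratio between the estimated noise magnitude and the observed magnitude. The function name `gain` and the string rule labels are assumptions for illustration; clipping at zero is the half-wave rectification mentioned for the magnitude-subtraction and Wiener rules.

```python
import math

def gain(rule, y_mag, mu_hat):
    """Gain G_i(w) for one frequency bin.

    y_mag  : |Y_i(w)|, short-time magnitude of the noisy frame (must be > 0)
    mu_hat : estimated average noise magnitude for this bin
    """
    r = mu_hat / y_mag  # noise-to-observation magnitude ratio
    if rule == "magnitude":      # 1 - mu/|Y|
        g = 1.0 - r
    elif rule == "spectral":     # sqrt(1 - mu^2/|Y|^2)
        g = math.sqrt(max(0.0, 1.0 - r * r))
    elif rule == "wiener":       # 1 - mu^2/|Y|^2
        g = 1.0 - r * r
    elif rule == "ml":           # 0.5 * (1 + sqrt(1 - mu^2/|Y|^2))
        g = 0.5 * (1.0 + math.sqrt(max(0.0, 1.0 - r * r)))
    else:
        raise ValueError("unknown rule: " + rule)
    return max(0.0, g)           # half-wave rectification
```

For example, with |Y| = 2 and a noise estimate of 1, the magnitude rule gives 0.5 and the Wiener rule gives 0.75, so the Wiener rule attenuates this bin less.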
Spectral Subtraction: Gain Functions
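• Example 1: Ephraim-Malah Suppression Rule (EMSR)
    $G_i(\omega) = \frac{\sqrt{\pi}}{2} \sqrt{ \frac{1}{\mathrm{SNR}_{post}} \cdot \frac{\mathrm{SNR}_{prio}}{1 + \mathrm{SNR}_{prio}} } \cdot M\left[ \mathrm{SNR}_{post} \cdot \frac{\mathrm{SNR}_{prio}}{1 + \mathrm{SNR}_{prio}} \right]$
  with:
    $\mathrm{SNR}_{post}(\omega) = \frac{|Y_i(\omega)|^2}{\hat{\mu}^2(\omega)}$
    $\mathrm{SNR}_{prio}(\omega) = (1-\alpha) \cdot \max(\mathrm{SNR}_{post}(\omega) - 1, 0) + \alpha \cdot \frac{|G_{i-1}(\omega) \cdot Y_{i-1}(\omega)|^2}{\hat{\mu}^2(\omega)}$
    $M[\theta] = e^{-\theta/2} \left[ (1+\theta) \cdot I_0(\theta/2) + \theta \cdot I_1(\theta/2) \right]$
    $I_0, I_1$ = modified Bessel functions
  (skip formulas)
• This corresponds to a MMSE (*) estimation of the speech spectral amplitude $|S_i(\omega)|$ based on observation $Y_i(\omega)$ (estimate equal to $E\{ |S_i(\omega)| \,|\, Y_i(\omega) \}$), assuming Gaussian a priori distributions for $S_i(\omega)$ and $N_i(\omega)$ [Ephraim & Malah 1984].
• Similar formula for MMSE log-spectral amplitude estimation [Ephraim & Malah 1985].
(*) minimum mean squared error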
Spectral Subtraction: Gain Functions
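• Example 2: Magnitude Subtraction
  – Signal model:
      $Y_i(\omega) = S_i(\omega) + N_i(\omega) = |Y_i(\omega)| \cdot e^{j\theta_{y,i}(\omega)}$
  – Estimation of clean speech spectrum:
      $\hat{S}_i(\omega) = \left[ |Y_i(\omega)| - \hat{\mu}(\omega) \right] \cdot e^{j\theta_{y,i}(\omega)} = \underbrace{\left[ 1 - \frac{\hat{\mu}(\omega)}{|Y_i(\omega)|} \right]}_{G_i(\omega)} \cdot Y_i(\omega)$
  – PS: half-wave rectification
      $G_i(\omega) \leftarrow \max(0, G_i(\omega))$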
Spectral Subtraction: Gain Functions
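• Example 3: Wiener Estimation
  – Linear MMSE estimation: find linear filter $G_i(\omega)$ to minimize MSE
      $E\{ | S_i(\omega) - \underbrace{G_i(\omega) \cdot Y_i(\omega)}_{\hat{S}_i(\omega)} |^2 \}$
  – Solution:
      $G_i(\omega) = \frac{E\{ S_i(\omega) \cdot Y_i^*(\omega) \}}{E\{ Y_i(\omega) \cdot Y_i^*(\omega) \}} = \frac{P_{sy,i}(\omega)}{P_{yy,i}(\omega)}$
    (numerator: cross-correlation in i-th frame; denominator: auto-correlation in i-th frame)
    Assume speech s[k] and noise n[k] are uncorrelated, then...
      $G_i(\omega) = \frac{P_{ss,i}(\omega)}{P_{yy,i}(\omega)} = \frac{P_{yy,i}(\omega) - P_{nn,i}(\omega)}{P_{yy,i}(\omega)} \approx \frac{|Y_i(\omega)|^2 - \hat{\mu}^2(\omega)}{|Y_i(\omega)|^2} = 1 - \frac{\hat{\mu}^2(\omega)}{|Y_i(\omega)|^2}$
  – PS: half-wave rectification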
Spectral Subtraction: Implementation
→ Short-time Fourier Transform (= uniform DFT-modulated analysis filter bank)
    $Y[n,i] = \sum_{k=0}^{K-1} h[k] \cdot y[iM+k] \cdot e^{-j 2\pi n k / N}$
  = estimate for $Y(\omega_n)$ at time i (i-th frame)
  N = number of frequency bins (channels), n = 0..N-1
  M = downsampling factor
  K = frame length
  h[k] = length-K analysis window (= prototype filter)
→ frames with 50%...66% overlap (i.e. 2-, 3-fold oversampling, N = 2M..3M)
→ subband processing:
    $\hat{S}[n,i] = G[n,i] \cdot Y[n,i]$
→ synthesis bank: matched to analysis bank (see DSP-CIS)
[Figure: y[k] → short-time analysis → Y[n,i] → gain functions (using $\hat{\mu}[n,i]$) → $\hat{S}[n,i]$ → short-time synthesis → $\hat{s}[k]$]
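The analysis / gain / synthesis chain can be sketched in plain Python as follows. This is a toy illustration under several assumptions that are not from the lecture: frame length K equals the number of bins N, the overlap is 50% (M = N/2), the window is a periodic square-root Hann window used for both analysis and synthesis, a direct O(N²) DFT stands in for an FFT, and the gain is magnitude subtraction with half-wave rectification. The names `dft` and `stft_process` are hypothetical.

```python
import cmath
import math

def dft(x, inverse=False):
    """Direct DFT (or inverse DFT) of a short sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for b in range(n):
        acc = sum(x[k] * cmath.exp(sign * 2j * math.pi * b * k / n) for k in range(n))
        out.append(acc / n if inverse else acc)
    return out

def stft_process(y, mu_hat, N=16):
    """Magnitude subtraction per bin, frames of length N with 50% overlap.

    mu_hat : list of N noise magnitude estimates, one per frequency bin
    """
    M = N // 2  # downsampling factor (frame shift)
    # sqrt-Hann window: squared windows of overlapping frames add up to 1
    h = [math.sin(math.pi * k / N) for k in range(N)]
    s_hat = [0.0] * len(y)
    for start in range(0, len(y) - N + 1, M):
        frame = [h[k] * y[start + k] for k in range(N)]       # analysis window
        Y = dft(frame)
        S = []
        for n_bin in range(N):
            mag = abs(Y[n_bin])
            g = max(0.0, 1.0 - mu_hat[n_bin] / mag) if mag > 0 else 0.0
            S.append(g * Y[n_bin])                             # subband gain
        s = dft(S, inverse=True)
        for k in range(N):
            s_hat[start + k] += h[k] * s[k].real               # overlap-add
    return s_hat
```

With a zero noise estimate all gains equal 1 and the interior of the signal is reconstructed exactly, which is a convenient sanity check that the synthesis bank is matched to the analysis bank. The first and last M samples are covered by only one frame and are therefore attenuated by the window.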
Spectral Subtraction: Musical Noise
• Audio demo: car noise, y[k] → magnitude subtraction → $\hat{s}[k]$
• Artifact: musical noise
  What? Short-time estimates of $|Y_i(\omega)|$ fluctuate randomly in noise-only frames, resulting in random gains $G_i(\omega)$
  → statistical analysis shows that broadband noise is transformed into a signal composed of short-lived tones with randomly distributed frequencies (= musical noise)
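Spectral Subtraction: Musical Noise
• Solutions?
  – Magnitude averaging: replace $Y_i(\omega)$ in calculation of $G_i(\omega)$ by a local average over frames, i.e. in
      $\hat{S}_i(\omega) = G_i(\omega) \cdot Y_i(\omega)$
    the gain $G_i(\omega)$ is computed from the averaged magnitude and applied to the instantaneous $Y_i(\omega)$
  – EMSR (p.7)
  – Augment $G_i(\omega)$ with soft-decision VAD:
      $G_i(\omega) \rightarrow P(H_1 \,|\, Y_i(\omega)) \cdot G_i(\omega)$
    with $P(H_1 \,|\, Y_i(\omega))$ = probability that speech is present, given observation
  – …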
Spectral Subtraction: Iterative Wiener Filter
Example of signal model-based spectral subtraction…
• Basic: Wiener filtering based spectral subtraction (p.9), with (improved) spectrum estimation based on parametric models
• Procedure:
  1. Estimate parameters of a speech model from noisy signal y[k]
  2. Using estimated speech parameters, perform noise reduction (e.g. Wiener estimation, p.9)
  3. Re-estimate parameters of speech model from the speech signal estimate
  4. Iterate 2 & 3
Spectral Subtraction: Iterative Wiener Filter
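[Figure: a pulse train (period = pitch period, voiced speech) or a white noise generator (unvoiced speech) produces the excitation u[k], which is multiplied by the gain $g_s$ and passed through the all-pole filter $\frac{1}{1 - \sum_{m=1}^{M} \alpha_m e^{-jm\omega}}$ to produce the speech signal]
frequency domain:
    $S(\omega) = \frac{g_s}{1 - \sum_{m=1}^{M} \alpha_m e^{-jm\omega}} \cdot U(\omega)$
time domain:
    $s[k] = \sum_{m=1}^{M} \alpha_m \cdot s[k-m] + g_s \cdot u[k]$
$\boldsymbol{\alpha} = [\alpha_1 \; \ldots \; \alpha_M]^T$ = linear prediction parameters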
Spectral Subtraction: Iterative Wiener Filter
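For each frame (vector) y[m] (i = iteration nr.)
1. Estimate $g_{s,i}$ and $\boldsymbol{\alpha}_i = [\alpha_{1,i} \; \ldots \; \alpha_{M,i}]^T$
2. Construct Wiener Filter (p.9)
      $G(\omega) = \frac{P_{ss}(\omega)}{P_{ss}(\omega) + P_{nn}(\omega)}$
   with:
   • $P_{nn}(\omega)$ estimated during noise-only periods
   • $P_{ss}(\omega) = \frac{g_{s,i}^2}{\left| 1 - \sum_{m=1}^{M} \alpha_{m,i} e^{-jm\omega} \right|^2}$
3. Filter speech frame y[m] to obtain $\hat{s}_i[m]$
Repeat until some error criterion is satisfied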
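Step 2 of the procedure can be sketched as follows: the speech power spectrum is evaluated from the all-pole model parameters and combined with a noise power spectrum into a Wiener gain. The function names are assumptions for illustration, and the parameter estimation itself (step 1, e.g. by linear prediction) is deliberately left out.

```python
import cmath

def ar_power_spectrum(g_s, alpha, omega):
    """P_ss(w) = g_s^2 / |1 - sum_m alpha_m e^{-j m w}|^2 for an all-pole model.

    alpha : [alpha_1, ..., alpha_M], linear prediction parameters
    omega : normalized radial frequency in [0, pi]
    """
    denom = 1.0 - sum(a * cmath.exp(-1j * (m + 1) * omega)
                      for m, a in enumerate(alpha))
    return g_s ** 2 / abs(denom) ** 2

def wiener_gain(p_ss, p_nn):
    """G(w) = P_ss / (P_ss + P_nn); p_nn comes from noise-only periods."""
    return p_ss / (p_ss + p_nn)
```

For a first-order model with alpha_1 = 0.5 and g_s = 1, the speech spectrum is 4 at ω = 0 and about 0.44 at ω = π, so with white noise of power 1 the gain is 0.8 at low frequencies and roughly 0.31 at high frequencies.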
Multi-Microphone Noise Reduction Problem
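• Microphone signals:
    $y_i[k] = s_i[k] + n_i[k], \quad i = 1 \ldots M$
  with $s_i[k]$ the speech part and $n_i[k]$ the noise part
[Figure: speech source s[k] and noise source(s) n[k] are picked up by the M microphones; the unknown processing block (?) outputs (some) speech estimate $\hat{s}[k]$]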
Multi-Microphone Noise Reduction Problem
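• Microphone signals:
    $y_i[k] = s_i[k] + n_i[k], \quad i = 1 \ldots M$
• Will estimate speech part in microphone 1 (*) (**), i.e. $\hat{s}_1[k]$
(*) Estimating s[k] is more difficult, would include dereverberation...
(**) This is similar to single-microphone model (p.3), where additional microphones (m=2..M) help to get a better estimate
[Figure: speech source s[k] and noise source(s) n[k] are picked up by the M microphones; the unknown processing block (?) outputs $\hat{s}_1[k]$]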
Multi-Microphone Noise Reduction Problem
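• Data model:
    $\mathbf{Y}(\omega) = \begin{bmatrix} Y_1(\omega) \\ Y_2(\omega) \\ \vdots \\ Y_M(\omega) \end{bmatrix} = \begin{bmatrix} H_1(\omega) \\ H_2(\omega) \\ \vdots \\ H_M(\omega) \end{bmatrix} \cdot S(\omega) + \begin{bmatrix} N_1(\omega) \\ N_2(\omega) \\ \vdots \\ N_M(\omega) \end{bmatrix}$
    $\mathbf{Y}(\omega) = \mathbf{S}(\omega) + \mathbf{N}(\omega) = \mathbf{d}(\omega) \cdot S(\omega) + \mathbf{N}(\omega)$
  See Lecture-2 on multi-path propagation, with q left out for conciseness.
  $H_m(\omega)$ is the complete transfer function from the speech source position to the m-th microphone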
Multi-Channel Wiener Filter (MWF)
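• Data model:
    $\mathbf{Y}(\omega) = \mathbf{d}(\omega) \cdot S(\omega) + \mathbf{N}(\omega)$
• Will use linear filters to obtain speech estimate (as in Lecture-2)
    $\hat{S}_1(\omega) = \sum_{m=1}^{M} F_m^*(\omega) \cdot Y_m(\omega) = \mathbf{F}^H(\omega) \cdot \mathbf{Y}(\omega)$
• Wiener filter (= linear MMSE approach)
    $\min_{\mathbf{F}(\omega)} E\{ | S_1(\omega) - \mathbf{F}^H(\omega) \cdot \mathbf{Y}(\omega) |^2 \}$
  Note that (unlike in DSP-CIS) the `desired response’ signal $S_1(\omega)$ is unknown here (!), hence the solution will be `unusual’…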
Multi-Channel Wiener Filter (MWF)
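• Wiener filter solution is (see DSP-CIS)
    $\mathbf{F}(\omega) = \underbrace{E\{ \mathbf{Y}(\omega) \cdot \mathbf{Y}^H(\omega) \}^{-1}}_{\text{autocorrelation}} \cdot \underbrace{E\{ \mathbf{Y}(\omega) \cdot S_1^*(\omega) \}}_{\text{crosscorrelation}}$
    $\quad = \ldots$   (with $E\{ \mathbf{S}(\omega) \cdot N_1^*(\omega) \} = 0$)
    $\quad = E\{ \mathbf{Y}(\omega) \cdot \mathbf{Y}^H(\omega) \}^{-1} \cdot \left( E\{ \mathbf{Y}(\omega) \cdot Y_1^*(\omega) \} - E\{ \mathbf{N}(\omega) \cdot N_1^*(\omega) \} \right)$
  where $E\{ \mathbf{Y} \mathbf{Y}^H \}$ and $E\{ \mathbf{Y} Y_1^* \}$ are computed during speech+noise periods, and $E\{ \mathbf{N} N_1^* \}$ is computed during noise-only periods
  – All quantities can be computed!
  – Special case of this is the single-channel Wiener filter formula on p.9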
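For M = 2 microphones the solution can be evaluated by hand, which makes a compact numerical sketch possible without a linear algebra library. The code below assumes that the two correlation matrices (speech+noise and noise-only) have already been estimated at one frequency; the function names `inv2` and `mwf_2mic` are hypothetical. It uses the fact that E{Y·Y1*} is the first column of E{Y·Y^H}, and likewise for the noise.

```python
def inv2(a):
    """Inverse of a 2x2 (complex) matrix given as nested lists."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def mwf_2mic(phi_yy, phi_nn):
    """F = E{Y Y^H}^{-1} (E{Y Y1*} - E{N N1*}) for M = 2, one frequency.

    phi_yy : 2x2 matrix E{Y Y^H}, estimated during speech+noise periods
    phi_nn : 2x2 matrix E{N N^H}, estimated during noise-only periods
    """
    # first columns of the two matrices give E{Y Y1*} and E{N N1*}
    rhs = [phi_yy[0][0] - phi_nn[0][0], phi_yy[1][0] - phi_nn[1][0]]
    inv = inv2(phi_yy)
    return [inv[0][0] * rhs[0] + inv[0][1] * rhs[1],
            inv[1][0] * rhs[0] + inv[1][1] * rhs[1]]
```

As a check, take d = [1, 0.5j], speech power 2 and spatially white noise of power 1, so that E{Y Y^H} = [[3, -1j], [1j, 1.5]]. The filter then comes out as (2/3.5)·d, i.e. a scaled version of the steering vector, as expected for white noise.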
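Multi-Channel Wiener Filter (MWF)
• MWF combines spatial filtering (as in Lecture-2) with single-channel spectral filtering (as in single-channel noise reduction)
  if
    $\mathbf{Y}(\omega) = \begin{bmatrix} Y_1(\omega) \\ Y_2(\omega) \\ \vdots \\ Y_M(\omega) \end{bmatrix} = \mathbf{d}(\omega) \cdot S(\omega) + \mathbf{N}(\omega)$   ($\mathbf{d}(\omega)$ = steering vector, $\mathbf{N}(\omega)$ = noise)
    $E\{ \mathbf{N}(\omega) \cdot \mathbf{N}^H(\omega) \} = \mathbf{\Phi}_{NN}(\omega)$
  then…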
Multi-Channel Wiener Filter (MWF)
• …then it can be shown that
    $\mathbf{F}(\omega) = \mathbf{\Phi}_{NN}^{-1}(\omega) \cdot \mathbf{d}(\omega) \cdot (\text{scalar})$
• $\mathbf{\Phi}_{NN}^{-1}(\omega) \cdot \mathbf{d}(\omega)$ represents a spatial filtering (*)
  Compare to superdirective & delay-and-sum beamforming (Lecture-2)
  – Delay-and-sum beamforming maximizes array gain in white noise field
  – Superdirective beamforming maximizes array gain in diffuse noise field
  – MWF maximizes array gain in unknown (!) noise field.
    MWF is operated without invoking any prior knowledge (steering vector/noise field)! (the secret is in the voice activity detection… (explain))
(*) Note that spatial filtering can improve SNR, spectral filtering never improves SNR (at one frequency)
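Multi-Channel Wiener Filter (MWF)
• …then it can be shown that (continued)
    $\mathbf{F}(\omega) = \mathbf{\Phi}_{NN}^{-1}(\omega) \cdot \mathbf{d}(\omega) \cdot (\text{scalar})$
  with
    $\text{scalar} = \ldots = \frac{\sigma_S^2(\omega) \cdot H_1^*(\omega)}{\sigma_S^2(\omega) \cdot \mathbf{d}^H(\omega) \cdot \mathbf{\Phi}_{NN}^{-1}(\omega) \cdot \mathbf{d}(\omega) + 1}, \qquad \sigma_S^2(\omega) = E\{ |S(\omega)|^2 \}$
• The scalar represents an additional `spectral post-filter’, i.e. a single-channel Wiener estimate (p.9), applied to the output signal of the spatial filter (prove it!)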
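One possible route for the "prove it!" exercise is sketched below. It assumes the rank-1 data model Y = d·S + N with speech and noise uncorrelated, writes σ_S² = E{|S|²}, omits the frequency argument, and uses the matrix inversion lemma; the auxiliary symbols w, Z and σ_v² are introduced here only for this sketch.

```latex
% Assumptions: Y = d S + N, E{S N^H} = 0, \Phi_{NN} = E{N N^H} invertible,
% S_1 = H_1 S, \sigma_S^2 = E{|S|^2}. Frequency argument \omega omitted.
\begin{align*}
\Phi_{YY} &= E\{\mathbf{Y}\mathbf{Y}^H\} = \sigma_S^2\,\mathbf{d}\mathbf{d}^H + \Phi_{NN},
\qquad E\{\mathbf{Y} S_1^*\} = \sigma_S^2 H_1^*\,\mathbf{d} \\
\Phi_{YY}^{-1}\mathbf{d} &= \frac{\Phi_{NN}^{-1}\mathbf{d}}{1 + \sigma_S^2\,\mathbf{d}^H\Phi_{NN}^{-1}\mathbf{d}}
\quad\text{(matrix inversion lemma)} \\
\mathbf{F} &= \Phi_{YY}^{-1}\,E\{\mathbf{Y} S_1^*\}
 = \Phi_{NN}^{-1}\mathbf{d}\cdot\frac{\sigma_S^2 H_1^*}{\sigma_S^2\,\mathbf{d}^H\Phi_{NN}^{-1}\mathbf{d} + 1} \\
\text{With } \mathbf{w} &= \frac{\Phi_{NN}^{-1}\mathbf{d}}{\mathbf{d}^H\Phi_{NN}^{-1}\mathbf{d}}:\quad
Z = \mathbf{w}^H\mathbf{Y} = S + \mathbf{w}^H\mathbf{N},\qquad
\sigma_v^2 = \mathbf{w}^H\Phi_{NN}\mathbf{w} = \frac{1}{\mathbf{d}^H\Phi_{NN}^{-1}\mathbf{d}} \\
\hat{S}_1 &= \mathbf{F}^H\mathbf{Y} = H_1\cdot\frac{\sigma_S^2}{\sigma_S^2 + \sigma_v^2}\cdot Z
\end{align*}
```

The last line reads as a spatial filter w with unit response to the speech source, followed by the single-channel Wiener gain σ_S²/(σ_S² + σ_v²) acting on its output Z, and finally the transfer function H_1 that maps the source signal to the speech part in microphone 1.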
Multi-Channel Wiener Filter: Implementation
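• Implementation with short-time Fourier transform: see p.10
• Implementation with time-domain linear filtering:
    $\mathbf{y}_i[k] = \begin{bmatrix} y_i[k] & y_i[k-1] & \ldots & y_i[k-L+1] \end{bmatrix}^T$
    $\mathbf{y}_i[k] = \mathbf{s}_i[k] + \mathbf{n}_i[k]$
    $\hat{s}_1[k] = \sum_{i=1}^{M} \mathbf{f}_i^T \cdot \mathbf{y}_i[k]$   ($\mathbf{f}_i$ = filter coefficients)
[Figure: speech source s[k] and noise source(s) n[k] are picked up by the microphones ($y_1[k]$, …); each microphone signal is filtered and the filter outputs are summed to give $\hat{s}_1[k]$]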
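Multi-Channel Wiener Filter: Implementation
• Implementation with time-domain linear filtering:
    $\min_{\mathbf{f}} E\{ | s_1[k] - \mathbf{f}^T \cdot \mathbf{y}[k] |^2 \}$
  with
    $\mathbf{f} = \begin{bmatrix} \mathbf{f}_1^T & \mathbf{f}_2^T & \ldots & \mathbf{f}_M^T \end{bmatrix}^T$
    $\mathbf{y}[k] = \begin{bmatrix} \mathbf{y}_1^T[k] & \mathbf{y}_2^T[k] & \ldots & \mathbf{y}_M^T[k] \end{bmatrix}^T$
  Solution is…
    $\mathbf{f} = E\{ \mathbf{y}[k] \cdot \mathbf{y}^T[k] \}^{-1} \cdot E\{ \mathbf{y}[k] \cdot s_1[k] \}$
    $\quad = E\{ \mathbf{y}[k] \cdot \mathbf{y}^T[k] \}^{-1} \cdot \left( E\{ \mathbf{y}[k] \cdot y_1[k] \} - E\{ \mathbf{n}[k] \cdot n_1[k] \} \right)$
  where $E\{ \mathbf{y} \mathbf{y}^T \}$ and $E\{ \mathbf{y} y_1 \}$ are computed during speech+noise periods, and $E\{ \mathbf{n} n_1 \}$ is computed during noise-only periods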
Kalman Filter 1/12
State space model of a time-varying discrete-time system
    $x[k+1] = A[k] \cdot x[k] + B[k] \cdot u[k] + v[k]$   (v[k] = process noise)
    $y[k] = C[k] \cdot x[k] + D[k] \cdot u[k] + w[k]$   (w[k] = measurement noise)
with v[k] and w[k]: mutually uncorrelated, zero mean, white noises, with covariance matrices V[k] and W[k]
Then: given A[k], B[k], C[k], D[k], V[k], W[k] and input/output observations u[k], y[k], k=0,1,2,..., the Kalman filter produces MMSE estimates of the internal states x[k], k=0,1,...
PS: will use shorthand notation here, i.e. $x_k, y_k, \ldots$ instead of x[k], y[k], ...
Kalman Filter 2/12
Definition: $\hat{x}_{k|l}$ = MMSE-estimate of $x_k$ using all available data up until time l
• `FILTERING’ = estimate $\hat{x}_{k|k}$
• `PREDICTION’ = estimate $\hat{x}_{k+n|k}$ (n > 0)
• `SMOOTHING’ = estimate $\hat{x}_{k-n|k}$ (n > 0)
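Kalman Filter 3/12
‘Conventional Kalman Filter’ operation @ time k (k=0,1,2,..)
Initialization: $\hat{x}_{0|-1}$ and $P_{0|-1}$
Given $\hat{x}_{k|k-1}$ and corresponding error covariance matrix $P_{k|k-1}$, compute $\hat{x}_{k+1|k}$ and $P_{k+1|k}$ using $u_k$, $y_k$:
Step 1: Measurement Update (produces ‘filtered’ estimate $\hat{x}_{k|k}$ with $P_{k|k}$; compare to standard RLS!)
Step 2: Time Update (produces ‘1-step prediction’ $\hat{x}_{k+1|k}$ with $P_{k+1|k}$)
($P$ = error covariance matrix)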
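The two steps can be illustrated for a scalar state, where all matrices reduce to numbers. The sketch below uses the commonly published textbook form of the conventional Kalman filter for the model x[k+1] = a·x[k] + b·u[k] + v[k], y[k] = c·x[k] + d·u[k] + w[k]; the exact notation on the slide may differ, and the function and argument names are assumptions.

```python
def kalman_step(x_pred, p_pred, u, y, a, b, c, d, v_var, w_var):
    """One conventional KF step for a scalar state.

    x_pred, p_pred : prediction x[k|k-1] and its error variance
    v_var, w_var   : variances of process noise v[k] and measurement noise w[k]
    Returns (x[k|k], P[k|k], x[k+1|k], P[k+1|k]).
    """
    # Step 1: measurement update -> filtered estimate x[k|k]
    k_gain = p_pred * c / (c * p_pred * c + w_var)
    x_filt = x_pred + k_gain * (y - c * x_pred - d * u)
    p_filt = (1.0 - k_gain * c) * p_pred
    # Step 2: time update -> one-step prediction x[k+1|k]
    x_next = a * x_filt + b * u
    p_next = a * p_filt * a + v_var
    return x_filt, p_filt, x_next, p_next
```

Starting from x[0|-1] = 0 with variance 1, a measurement y = 2 with unit measurement noise and c = 1 gives a gain of 0.5, so the filtered estimate moves halfway to the measurement and its variance drops to 0.5.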
Kalman Filter 4/12
PS: ‘Standard RLS’ is a special case of ‘Conventional KF’
Internal state vector is the FIR filter coefficients vector, which is assumed to be time-invariant
Kalman Filter 5/12, 6/12 (technical detail…), 7/12
State estimation @ time k corresponds to an overdetermined set of linear equations, where the vector of unknowns contains all previous state vectors.
Kalman Filter 8/12, 9/12, 10/12
‘Square-Root’ Kalman Filter
• Propagated from time k-1 to time k (..hence requires only lower-right/lower part)
• Propagated from time k to time k+1
Kalman Filter 11/12
QRD-
=QRD-
Kalman Filter 12/12
can be derived from square-root KF equations
: can be worked into measurement update eq.
: can be worked into state update eq.
is .
[details omitted]
Kalman filter for Speech Enhancement
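• Assume AR model of speech and noise
    $s[k] = \sum_{m=1}^{M} \alpha_m \cdot s[k-m] + g_s \cdot u[k]$
    $n[k] = \sum_{m=1}^{N} \beta_m \cdot n[k-m] + g_n \cdot w[k]$
  u[k], w[k] = zero mean, unit variance, white noise
• Equivalent state-space model is…
    $\mathbf{x}[k+1] = \mathbf{A} \cdot \mathbf{x}[k] + \mathbf{v}[k]$
    $y[k] = \mathbf{c}^T \cdot \mathbf{x}[k]$
  y = microphone signal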
Kalman filter for Speech Enhancement
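with:
    $\mathbf{x}^T[k] = \begin{bmatrix} s[k-M+1] & \ldots & s[k] & n[k-N+1] & \ldots & n[k] \end{bmatrix}$
    $\mathbf{A} = \begin{bmatrix} \mathbf{A}_s & \mathbf{0} \\ \mathbf{0} & \mathbf{A}_n \end{bmatrix}; \quad \mathbf{c}^T = \begin{bmatrix} 0 & \ldots & 0 & 1 & 0 & \ldots & 0 & 1 \end{bmatrix}; \quad \mathbf{G} = \begin{bmatrix} \mathbf{g}_s & \mathbf{0} \\ \mathbf{0} & \mathbf{g}_n \end{bmatrix}$
    $\mathbf{A}_s = \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ \alpha_M & \alpha_{M-1} & \cdots & \alpha_1 \end{bmatrix}_{M \times M}; \quad \mathbf{A}_n = \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ \beta_N & \beta_{N-1} & \cdots & \beta_1 \end{bmatrix}_{N \times N}$
    $\mathbf{g}_s^T = \begin{bmatrix} 0 & \ldots & 0 & g_s \end{bmatrix}; \quad \mathbf{g}_n^T = \begin{bmatrix} 0 & \ldots & 0 & g_n \end{bmatrix}$
    $\mathbf{v}[k] = \mathbf{G} \cdot \begin{bmatrix} u[k] & w[k] \end{bmatrix}^T$
    $\mathbf{Q} = \mathbf{G} \cdot \mathbf{G}^T$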
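To make the block structure concrete, the matrices can be assembled from the AR parameters with nested Python lists. This is a sketch under the state ordering x[k] = [s[k-M+1] … s[k], n[k-N+1] … n[k]]; the names `companion` and `build_model`, and the use of `beta` for the noise AR coefficients, are assumptions for illustration.

```python
def companion(coeffs):
    """Companion matrix for state [s[k-M+1], ..., s[k]]; coeffs = [a_1, ..., a_M]."""
    m = len(coeffs)
    # upper rows shift the state: row i picks element i+1
    rows = [[1.0 if j == i + 1 else 0.0 for j in range(m)] for i in range(m - 1)]
    # last row holds [a_M, ..., a_1] and produces the new sample
    rows.append([float(c) for c in reversed(coeffs)])
    return rows

def build_model(alpha, beta, g_s, g_n):
    """Return A, c, G and Q = G G^T for the speech+noise AR state-space model."""
    M, N = len(alpha), len(beta)
    A_s, A_n = companion(alpha), companion(beta)
    A = [row + [0.0] * N for row in A_s] + [[0.0] * M + row for row in A_n]
    c = [0.0] * (M - 1) + [1.0] + [0.0] * (N - 1) + [1.0]   # y[k] = s[k] + n[k]
    G = [[0.0, 0.0] for _ in range(M + N)]
    G[M - 1][0] = g_s          # u[k] drives the newest speech sample
    G[M + N - 1][1] = g_n      # w[k] drives the newest noise sample
    Q = [[sum(G[i][t] * G[j][t] for t in range(2)) for j in range(M + N)]
         for i in range(M + N)]
    return A, c, G, Q
```

With alpha = [0.5, -0.2] and beta = [0.3], multiplying A by the state [s[k-1], s[k], n[k]] = [1, 2, 4] gives [2, 0.8, 1.2], i.e. the shifted speech sample, the AR prediction 0.5·2 - 0.2·1 of the next speech sample, and the AR prediction 0.3·4 of the next noise sample.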
Kalman filter for Speech Enhancement
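Iterative algorithm
[Figure: y[k] → split signal in frames → loop over iterations { estimate parameters ($\hat{g}_{s,i}$, $\hat{\alpha}_{m,i}$; $\hat{g}_{n,i}$, $\hat{\beta}_{m,i}$) ↔ Kalman Filter ($\hat{s}_i[m]$, $\hat{n}_i[m]$) } → reconstruct signal → $\hat{s}[k]$]
Disadvantages of the iterative approach:
• complexity
• delay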
Kalman filter for Speech Enhancement
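Sequential algorithm
iteration index → time index (no iterations)
[Figure: y[k] feeds a State Estimator (Kalman Filter) and a Parameters Estimator (Kalman Filter). The state estimator outputs $\hat{s}[k|k]$, $\hat{n}[k|k]$; the parameters estimator outputs $\hat{\alpha}_m[k|k]$, $\hat{\beta}_m[k|k]$, $\hat{g}_s[k|k]$, $\hat{g}_n[k|k]$. Through delay elements (D), the previous estimates $\hat{s}[k-1|k-1]$, $\hat{n}[k-1|k-1]$ and $\hat{\alpha}_m[k-1|k-1]$, $\hat{\beta}_m[k-1|k-1]$, $\hat{g}_s[k-1|k-1]$, $\hat{g}_n[k-1|k-1]$ are exchanged between the two estimators]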
CONCLUSIONS
• Single-channel noise reduction
  – Basic system is spectral subtraction
  – Only spectral filtering, hence can only exploit differences in spectra between noise and speech signal:
    • noise reduction at expense of speech distortion
    • achievable noise reduction may be limited
• Multi-channel noise reduction
  – Basic system is MWF
  – Provides spectral + spatial filtering (links with beamforming!)
• Iterative Wiener filter & Kalman filtering
  – Signal model based approach