
A Priori SNR Estimation Using Weibull Mixture Model

12. ITG Fachtagung Sprachkommunikation

Aleksej Chinaev, Jens Heitkaemper, Reinhold Haeb-Umbach

Department of Communications Engineering, Paderborn University

7 October 2016

Computer Science, Electrical Engineering and Mathematics

Communications Engineering (NT), Prof. Dr.-Ing. Reinhold Häb-Umbach

Table of contents

1 Problem formulation and motivation

2 A priori SNR estimation based on Weibull mixture model

3 Experimental evaluation

4 Conclusions and outlook

A Priori SNR Estimation Using Weibull Mixture Model

A. Chinaev, J. Heitkaemper, R. Haeb-Umbach 1 / 10

NT


Problem formulation and motivation

Single-channel clean speech s(t) contaminated by additive noise n(t):

$$ y(t) = s(t) + n(t) \quad \xrightarrow{\text{STFT}} \quad Y(k,\ell) = S(k,\ell) + N(k,\ell) $$

k: frequency bin, ℓ: frame index

[Block diagram of the enhancement system: y(t) → STFT → Y(k, ℓ) → |·|² → |Y(k, ℓ)|² → noise PSD tracker → λN(k, ℓ) → a priori SNR estimator → ξ(k, ℓ) → gain function → G(k, ℓ) → estimate of S(k, ℓ) → ISTFT → estimate of s(t)]

A priori SNR, a key component of the enhancement system:

$$ \xi(k,\ell) = \frac{\lambda_S(k,\ell)}{\lambda_N(k,\ell)} $$

with the clean speech PSD $\lambda_S(k,\ell) = E[|S(k,\ell)|^2]$ and the noise power spectral density (PSD) $\lambda_N(k,\ell) = E[|N(k,\ell)|^2]$

Motivated by generalized spectral subtraction (GSS): denoising of $|Y(k,\ell)|^\alpha$ for $\alpha \in \mathbb{R}_{>0}$, not restricted to $\alpha = 1$ or $\alpha = 2$, under the assumption

$$ |Y(k,\ell)|^\alpha = |S(k,\ell)|^\alpha + |N(k,\ell)|^\alpha $$
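The additivity assumption above can be illustrated with a minimal Python sketch of generalized spectral subtraction for a single time-frequency bin. The function name `gss_magnitude`, the flooring at zero, and the use of a known noise magnitude are assumptions made for illustration only; the slides do not specify these details.

```python
def gss_magnitude(y_mag, n_mag, alpha):
    """Estimate |S| from |Y| and |N| assuming |Y|^a = |S|^a + |N|^a."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    diff = y_mag ** alpha - n_mag ** alpha
    # Flooring at zero is an assumption; it keeps the estimate real-valued.
    return max(diff, 0.0) ** (1.0 / alpha)


# alpha = 2 is power subtraction, alpha = 1 is magnitude subtraction.
for alpha in (0.64, 1.0, 2.0):
    print(alpha, gss_magnitude(2.0, 1.0, alpha))
```

Smaller exponents subtract more of the noisy magnitude for the same inputs, which is one reason the exponent is worth treating as a tunable parameter.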


Normalized α-order magnitude (NAOM) domain

[Block diagram of the a priori SNR estimator: inputs |Y(k, ℓ)|² and λN(k, ℓ) → "Estimate PSα(k) and go into NAOM domain" → Yα(k, ℓ), λNα(k, ℓ) → "Estimate parameters of WMM pSα(s)" → λm(k, ℓ), πm(k, ℓ) → "Estimate clean speech NAOMs" → Sα(k, ℓ) → "Calculate a priori SNR" → ξ(k, ℓ)]

Normalize $|Y(k,\ell)|^\alpha$ to the root of the averaged power $P_{S_\alpha}(k)$ of $|S(k,\ell)|^\alpha$:

$$ Y_\alpha(k,\ell) = \frac{|Y(k,\ell)|^\alpha}{\sqrt{P_{S_\alpha}(k)}} = S_\alpha(k,\ell) + N_\alpha(k,\ell) \quad \text{with} \quad P_{S_\alpha}(k) = \frac{1}{L} \sum_{\ell=1}^{L} |S(k,\ell)|^{2\alpha} $$

Statistical models independent of speaker loudness

Normalized energy of clean speech NAOMs: $E[S_\alpha^2(k)] = 1$

$S_\alpha(k,\ell)$ and $N_\alpha(k,\ell)$ are realizations of the random variables $S_\alpha(k)$ and $N_\alpha(k)$

Estimate $S_\alpha(k,\ell)$ from $Y_\alpha(k,\ell)$ given models for $S_\alpha(k)$ and $N_\alpha(k)$
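The normalization into the NAOM domain can be sketched in a few lines of Python. The helper name `naom_normalize` is hypothetical, and the sketch uses the clean magnitudes directly to compute the averaged power, which is an idealization: in a real system that quantity has to be estimated from noisy data.

```python
import math


def naom_normalize(y_mag, s_mag, alpha):
    """Map magnitudes of one frequency bin into the NAOM domain.

    y_mag, s_mag: per-frame magnitudes |Y(k, l)| and |S(k, l)| for a fixed k.
    Using the clean magnitudes s_mag is an idealization for illustration.
    """
    # Averaged power of |S|^alpha over the L frames.
    p_s_alpha = sum(s ** (2 * alpha) for s in s_mag) / len(s_mag)
    root = math.sqrt(p_s_alpha)
    return [y ** alpha / root for y in y_mag], p_s_alpha


s_mag = [0.1, 2.0, 0.5, 3.0]
s_alpha, p_s_alpha = naom_normalize(s_mag, s_mag, alpha=0.64)
# Average energy of the normalized clean-speech NAOMs (expected to be one).
print(sum(v * v for v in s_alpha) / len(s_alpha))

# Scaling the signal (a louder speaker) leaves the NAOMs unchanged.
louder = [10.0 * s for s in s_mag]
s_alpha_loud, _ = naom_normalize(louder, louder, alpha=0.64)
print(max(abs(a - b) for a, b in zip(s_alpha, s_alpha_loud)))
```

The second print illustrates the loudness independence mentioned on the slide: a global gain cancels in the ratio.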


Modeling of noise NAOM coefficients Nα(k, ℓ)

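$N(k,\ell) \sim \mathcal{N}_c(n; 0, \lambda_N(k,\ell))$

$N_\alpha(k,\ell)$ is Weibull distributed:

$$ p_{N_\alpha(k,\ell)}(n) = \mathrm{Weib}(n; \lambda_{N_\alpha}(k,\ell), \alpha) $$

Shape parameter: $\alpha \in \mathbb{R}_{>0}$

Scale parameter: $\lambda_{N_\alpha}(k,\ell) = \frac{\lambda_N(k,\ell)^{\alpha}}{P_{S_\alpha}(k)} \in \mathbb{R}_{>0}$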

[Figure: Weibull PDF Weib(n; 1, α) for λ = 1 and α = 0.5, 1, 1.5, 2]

Model $N_\alpha(k)$ with the Weibull PDF

$$ p_{N_\alpha(k)}(n) = \mathrm{Weib}(n; \lambda_{N_\alpha}(k), \alpha) \quad \text{with} \quad \lambda_{N_\alpha}(k) = \frac{1}{L} \sum_{\ell=1}^{L} \lambda_{N_\alpha}(k,\ell) $$

[Figure: NAOM coefficients of a white noise signal and the estimated $p_{N_\alpha(k)}(n)$: histogram of noise NAOMs and Weibull PDF for α = 0.7]
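That a power of a complex Gaussian magnitude is Weibull distributed can be checked empirically. The sketch below uses the textbook two-parameter Weibull (shape, scale), which may be parametrized differently from the Weib(·; λ, α) notation of the slides, and it omits the NAOM normalization, which only rescales the samples. In the textbook form, |N|^α has shape 2/α and scale λN^(α/2). All function names are hypothetical.

```python
import math
import random


def weibull_cdf(x, scale, shape):
    """CDF of a standard two-parameter Weibull distribution."""
    return 1.0 - math.exp(-((x / scale) ** shape)) if x > 0 else 0.0


def noise_alpha_magnitudes(noise_psd, alpha, count, seed=0):
    """Draw |N|^alpha for circular complex Gaussian N with variance noise_psd."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_psd / 2.0)  # std of real and imaginary parts
    return [
        math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) ** alpha
        for _ in range(count)
    ]


alpha, noise_psd = 0.7, 2.0
samples = noise_alpha_magnitudes(noise_psd, alpha, count=20000)
# Textbook parametrization: shape 2/alpha and scale noise_psd^(alpha/2).
shape, scale = 2.0 / alpha, noise_psd ** (alpha / 2.0)
for x in (0.5, 1.0, 1.5):
    empirical = sum(v <= x for v in samples) / len(samples)
    print(x, round(empirical, 3), round(weibull_cdf(x, scale, shape), 3))
```

The empirical and model CDF values printed per row should agree up to sampling noise.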


Modeling of NAOM coefficients of clean speech Sα(k, ℓ)

$S(k,\ell) \sim \mathcal{N}_c(n; 0, \lambda_S(k,\ell))$

Bimodal Weibull mixture model (WMM) to model $S_\alpha(k)$:

$$ p_{S_\alpha(k)}(s) = \sum_{m=1}^{2} \pi_m(k) \cdot \mathrm{Weib}(s; \lambda_m(k), \beta) $$

m = 1: silence

m = 2: activity

$\pi_m(k) \in [0, 1]$: weights

$\lambda_m(k)$: scale parameters

$\beta$: shape parameter

$\beta \neq \alpha$: additional degree of freedom in the model

[Figure: clean speech NAOMs and estimated WMM (α = 0.7; β = 2.5): histogram of clean speech NAOMs, bimodal WMM, and its m = 1 and m = 2 components; horizontal axis s, vertical axis $p_{S_\alpha}(s)$]
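A two-component Weibull mixture density with a shared shape parameter can be written down directly. The sketch below uses the textbook Weibull PDF; the parameter values and the mapping from β to a textbook shape of 2/β (chosen by analogy with the noise model) are assumptions for illustration, not values taken from the slides.

```python
import math


def weibull_pdf(x, scale, shape):
    """PDF of a standard two-parameter Weibull distribution."""
    if x <= 0:
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))


def wmm_pdf(s, weights, scales, shape):
    """Mixture of Weibull components that share one shape parameter."""
    return sum(w * weibull_pdf(s, lam, shape) for w, lam in zip(weights, scales))


# Hypothetical parameters: component 1 (silence) has the smaller scale.
weights, scales = (0.6, 0.4), (0.05, 1.2)
shape = 2.0 / 2.5  # assumed mapping of beta = 2.5 to a textbook shape
for s in (0.01, 0.1, 0.5, 1.5):
    print(s, wmm_pdf(s, weights, scales, shape))
```

With a small scale for the silence component, the mixture concentrates mass near zero while the activity component covers the larger values.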


Estimation of WMM parameters and clean speech NAOMs


Set $\lambda_1(k)$ according to the $\xi_{\min}$ usually used in a priori SNR estimation [Cappe 94]

Expectation Maximization (EM) algorithm to estimate $\lambda_2(k)$ and $\pi_m(k)$

After EM, the weights $\pi_m(k)$ are corrected using the constraint $E[S_\alpha^2(k)] = 1$
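The estimation steps listed above can be sketched as a batch EM for a two-component Weibull mixture in which the shape and the first scale are held fixed. This is an illustrative sketch using the textbook Weibull parametrization, not the authors' implementation. In particular, the slides do not say how the weights are corrected after EM; solving the unit-second-moment constraint for the weight, as `corrected_weight2` does, is only one possible reading. All names and parameter values are hypothetical.

```python
import math
import random


def weibull_pdf(x, scale, shape):
    z = x / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))


def em_wmm(data, scale1, shape, scale2=1.0, weight2=0.5, iterations=50):
    """EM for a two-component Weibull mixture with fixed shape and scale1."""
    for _ in range(iterations):
        # E-step: posterior probability of the "activity" component.
        resp = []
        for x in data:
            p1 = (1.0 - weight2) * weibull_pdf(x, scale1, shape)
            p2 = weight2 * weibull_pdf(x, scale2, shape)
            resp.append(p2 / (p1 + p2))
        # M-step: closed-form updates for the weight and the scale.
        total = sum(resp)
        weight2 = total / len(data)
        moment = sum(r * x ** shape for r, x in zip(resp, data)) / total
        scale2 = moment ** (1.0 / shape)
    return scale2, weight2


def corrected_weight2(scale1, scale2, shape):
    """Weight of component 2 that yields a unit second moment."""
    g = math.gamma(1.0 + 2.0 / shape)  # E[X^2] = scale^2 * Gamma(1 + 2/shape)
    m1, m2 = scale1 ** 2 * g, scale2 ** 2 * g
    return (1.0 - m1) / (m2 - m1)


# Synthetic data: 40 % "silence" (scale 0.1) and 60 % "activity" (scale 1.0).
rng = random.Random(1)
shape, scale1 = 1.5, 0.1
data = [rng.weibullvariate(scale1 if rng.random() < 0.4 else 1.0, shape)
        for _ in range(5000)]
scale2, weight2 = em_wmm(data, scale1, shape)
print(scale2, weight2, corrected_weight2(scale1, scale2, shape))
```

With a known shape, the scale update is the weighted form of the closed-form maximum likelihood estimate, so no inner numerical optimization is needed. The corrected weight only lies in [0, 1] when the second moment of component 1 is below one and that of component 2 is above one.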


Maximum a posteriori (MAP) estimation:

$$ S_\alpha^{\mathrm{MAP}}(k,\ell) = \arg\max_{s} \; p_{S_\alpha(k) \mid Y_\alpha(k,\ell)}(s \mid y) $$

$Y_\alpha(k,\ell)$ is a realization of the random variable $Y_\alpha(k) = S_\alpha(k) + N_\alpha(k)$

Approximate, computationally efficient solution for $\beta = \alpha = 1$
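For general shape parameters, the MAP estimate can be approximated numerically. The sketch below is a brute-force grid search, not the efficient approximation mentioned on the slide. It assumes that the speech and noise NAOMs are statistically independent, so that the posterior is proportional to p_S(s) · p_N(y − s) on 0 < s < y, and it uses textbook Weibull densities. Function and parameter names are hypothetical.

```python
import math


def weibull_pdf(x, scale, shape):
    z = x / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))


def map_estimate(y, weights, scales, speech_shape, noise_scale, noise_shape,
                 grid_size=1000):
    """Grid search for argmax_s p_S(s) * p_N(y - s) over 0 < s < y."""
    best_s, best_value = 0.0, -1.0
    for i in range(1, grid_size):
        s = y * i / grid_size  # interior points only; PDFs may diverge at 0
        prior = sum(w * weibull_pdf(s, lam, speech_shape)
                    for w, lam in zip(weights, scales))
        value = prior * weibull_pdf(y - s, noise_scale, noise_shape)
        if value > best_value:
            best_s, best_value = s, value
    return best_s


# Hypothetical two-component speech prior and noise model.
s_map = map_estimate(y=1.0, weights=(0.6, 0.4), scales=(0.05, 1.2),
                     speech_shape=0.8, noise_scale=0.3, noise_shape=2.8)
print(s_map)
```

A grid search costs one density evaluation per grid point and bin, which hints at why a closed-form or approximate solution is attractive for real-time use.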


Calculation of a priori SNR and causal implementation


Go back into the power spectral density domain by calculating

$$ \xi(k,\ell) = \max\left( \frac{\left[ S_\alpha(k,\ell) \cdot \sqrt{P_{S_\alpha}(k)} \right]^{2/\alpha}}{\lambda_N(k,\ell)}, \; \xi_{\min} \right) $$

Causal implementation of WMM-based a priori SNR estimators

Calculate $P_{S_\alpha}(k)$ and $\lambda_{N_\alpha}(k)$ in a causal way

Causal EM for $\lambda_2(k)$ and $\pi_2(k)$ with one EM iteration per time frame

Note: the parameters α and β have to be set appropriately → optimization
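The back-transformation from the NAOM domain to the a priori SNR amounts to undoing the normalization and the α-th power, dividing by the noise PSD, and flooring the result. A minimal sketch follows; the function name is hypothetical, and the default floor of −18 dB is the value quoted in the experimental setup of this talk.

```python
import math


def a_priori_snr(s_alpha, p_s_alpha, noise_psd, alpha, xi_min_db=-18.0):
    """Map a clean-speech NAOM estimate back to a floored a priori SNR."""
    xi_min = 10.0 ** (xi_min_db / 10.0)
    # Undo the normalization, then convert the alpha-order magnitude to power.
    speech_power = (s_alpha * math.sqrt(p_s_alpha)) ** (2.0 / alpha)
    return max(speech_power / noise_psd, xi_min)


# Round trip: a clean magnitude of 2 and a noise PSD of 0.5 give |S|^2 / 0.5.
alpha, p_s_alpha = 0.64, 3.0
s_alpha = 2.0 ** alpha / math.sqrt(p_s_alpha)
print(a_priori_snr(s_alpha, p_s_alpha, noise_psd=0.5, alpha=alpha))
```

The round trip shows that the exponent 2/α exactly inverts the forward mapping when the NAOM estimate is perfect.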


Experimental evaluation

Data and setup

Clean speech: Wall Street Journal database 16 kHz (male and female)

7 different noise types of Noisex92 database: white, pink, f16, hfchannel, factory-1, factory-2, babble

Input global SNR from −5 dB up to 25 dB in 5 dB steps

Spectral speech enhancement framework

Noise PSD tracking using Minimum statistics approach [Martin 01]

A priori SNR estimation with ξmin = −18 dB [Cappe 94]

Proposed WMM-based approach with Wiener filter

Reference approach: Decision Directed [Ephraim 84]
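For orientation, the decision-directed reference estimator combined with a Wiener gain can be sketched for a single frequency bin as follows. This is a common textbook form, not the exact configuration used in the experiments: the smoothing factor of 0.98 is an assumed typical value, and normalizing the previous speech power by the current noise PSD is a simplification. The function name is hypothetical.

```python
def decision_directed(noisy_power, noise_psd, a=0.98, xi_min_db=-18.0):
    """Decision-directed a priori SNR and Wiener gains for one frequency bin.

    noisy_power: |Y(k, l)|^2 per frame; noise_psd: noise PSD per frame.
    """
    xi_min = 10.0 ** (xi_min_db / 10.0)
    xi_track, gains = [], []
    prev_speech_power = 0.0  # enhanced speech power of the previous frame
    for y2, lam_n in zip(noisy_power, noise_psd):
        gamma = y2 / lam_n  # a posteriori SNR
        xi = a * prev_speech_power / lam_n + (1.0 - a) * max(gamma - 1.0, 0.0)
        xi = max(xi, xi_min)
        gain = xi / (1.0 + xi)  # Wiener gain
        prev_speech_power = gain * gain * y2
        xi_track.append(xi)
        gains.append(gain)
    return xi_track, gains


noisy_power = [1.0, 1.2, 9.0, 16.0, 12.0, 1.1]
noise_psd = [1.0] * len(noisy_power)
xi_track, gains = decision_directed(noisy_power, noise_psd)
for xi, gain in zip(xi_track, gains):
    print(round(xi, 4), round(gain, 4))
```

The heavy weight on the previous frame makes the estimate smooth but slow to react to speech onsets, which is the behavior a model-based estimator aims to improve on.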


Optimization of α and β

Speech quality maximization in terms of wide-band mean opinion score listening quality objective (MOS-LQO) with

$$ \Delta\text{MOS-LQO} = \max\left( \text{MOS-LQO}_{\mathrm{WMM}} - \text{MOS-LQO}_{\mathrm{DD}},\; 0 \right) $$

Averaging over genders, noise types and input global SNR values

$(\alpha_{\mathrm{opt}}, \beta_{\mathrm{opt}}) = (0.64, 2.7)$

[Figure: ∆MOS-LQO (0 to 0.1) plotted over α (0.4 to 1) and β (2 to 4)]


Final experimental results

Clean speech: WSJ database signals other than those used for optimization

Estimation error: Itakura-Saito distance (ISD); estimator's variance: logarithmic error variance (LEV). For both, the smaller the better.

Resulting ISD, LEV and MOS-LQO values averaged over noise types, per input global SNR in dB:

Measure   Method   −5     0      5      10     15     20     25     AVG
ISD       DD       48.8   44.0   39.6   34.9   30.2   24.5   19.1   34.4
ISD       WMM      42.6   38.1   34.1   30.4   27.3   23.0   18.9   30.6
LEV       DD       53.1   49.0   46.4   45.1   45.5   47.4   50.5   48.1
LEV       WMM      45.6   43.9   42.6   41.1   39.0   37.0   35.9   40.7
MOS-LQO   DD       1.11   1.30   1.63   2.09   2.57   3.00   3.39   2.16
MOS-LQO   WMM      1.18   1.46   1.77   2.13   2.62   3.16   3.61   2.28
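The AVG column can be reproduced from the per-SNR values with a short Python check. The dictionary layout and variable names are illustrative only; the numbers are copied from the table.

```python
# Per-SNR results (-5 dB to 25 dB in 5 dB steps) copied from the table.
results = {
    ("ISD", "DD"): [48.8, 44.0, 39.6, 34.9, 30.2, 24.5, 19.1],
    ("ISD", "WMM"): [42.6, 38.1, 34.1, 30.4, 27.3, 23.0, 18.9],
    ("LEV", "DD"): [53.1, 49.0, 46.4, 45.1, 45.5, 47.4, 50.5],
    ("LEV", "WMM"): [45.6, 43.9, 42.6, 41.1, 39.0, 37.0, 35.9],
    ("MOS-LQO", "DD"): [1.11, 1.30, 1.63, 2.09, 2.57, 3.00, 3.39],
    ("MOS-LQO", "WMM"): [1.18, 1.46, 1.77, 2.13, 2.62, 3.16, 3.61],
}
# AVG column as printed on the slide, for comparison.
listed_avg = {
    ("ISD", "DD"): 34.4, ("ISD", "WMM"): 30.6,
    ("LEV", "DD"): 48.1, ("LEV", "WMM"): 40.7,
    ("MOS-LQO", "DD"): 2.16, ("MOS-LQO", "WMM"): 2.28,
}

averages = {key: sum(vals) / len(vals) for key, vals in results.items()}
for key, avg in averages.items():
    print(key, round(avg, 2), listed_avg[key])

# Relative change of WMM with respect to DD for each measure.
for measure in ("ISD", "LEV", "MOS-LQO"):
    dd, wmm = averages[(measure, "DD")], averages[(measure, "WMM")]
    print(measure, f"{100.0 * (wmm - dd) / dd:+.1f} %")
```

The recomputed means agree with the listed averages up to rounding, and the relative changes put the three measures on a comparable scale.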


Conclusions and outlook

Conclusions

Novel causal a priori SNR estimator based on a bimodal Weibull mixture model for the normalized α-order spectral magnitudes (NAOMs)

Optimization of the proposed approach by maximization of speech quality

Power exponent αopt = 0.64 smaller than 1 (spectral magnitudes)

Shape factor βopt = 2.7 – a heavier tailed Weibull distribution

Compared to the wide-spread Decision Directed approach:

Reduced error and variance of the WMM-based a priori SNR estimator

Improvement of speech quality of the enhanced signals

Higher computational effort

Outlook

Reduction of computational effort – fixed speaker-independent models

Development of model-based spectral enhancement using a generalized (arbitrary) power exponent in the spirit of generalized spectral subtraction


Thank you for your attention!

Questions?

Paderborn University

Department of Communications Engineering

Web: nt.upb.de


Resulting WMM parameters and audio samples

[Figure: Resulting WMM parameters over frequency bins k (50 to 250). Upper panel: log(λ) for $\lambda_1^{\mathrm{mean}}(k)$ and $\lambda_2^{\mathrm{mean}}(k)$. Lower panel: π for $\pi_1^{\mathrm{mean}}(k)$ and $\pi_2^{\mathrm{mean}}(k)$.]

Example speech samples: Noisy, DD, WMM
