HIWIRE MEETING
Trento, January 11-12, 2007
José C. Segura, Javier Ramírez
2 HIWIRE Meeting – Trento, 11 -12 January, 2007
Schedule
PEQ
HAFE
IS07 setup
New improvements in robust VAD
  Revised multiple observation LRT (MO-LRT)
  Improve noise reduction and frame-dropping
PEQ
Evaluation
  AURORA2, AURORA3, AURORA4
  Compared to HEQ
PEQ shows better performance on all databases
Results using Loquendo recognizer
  Improved results
  Slight degradation on clean conditions
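For context, the HEQ baseline is usually described as mapping each feature component through its empirical CDF onto a reference distribution. The sketch below assumes a standard Gaussian reference and an order-statistics CDF estimate; the function name `heq` is hypothetical, and PEQ itself is not sketched because its formulation is not given on these slides.

```python
from statistics import NormalDist

def heq(values):
    """Equalize one feature component of an utterance to a N(0, 1) reference.

    Assumption: the CDF at the r-th smallest value (r = 1..T) is
    estimated as (r - 0.5) / T, a common order-statistics choice.
    """
    n = len(values)
    # Indices of the samples sorted by value (rank order).
    order = sorted(range(n), key=lambda i: values[i])
    ref = NormalDist(0.0, 1.0)
    out = [0.0] * n
    for rank, i in enumerate(order, start=1):
        out[i] = ref.inv_cdf((rank - 0.5) / n)
    return out
```

The mapping is monotonic, so the ordering of the frames along each component is preserved while the histogram is reshaped.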
PEQ / HEQ comparative results
HAFE
In collaboration with TUC-NTUA
Released two C modules, integrated in HAFE V1.0
Basic Analysis
  VAD (LTSD)
  Wiener filter (optional)
  Output: WAV / MFCC / FB
Post-Processing
  PEQ (optional)
  Regression computation (optional)
  Frame-Dropping (optional)
  CMS / CMVN (optional)
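A minimal sketch of three of the optional post-processing steps listed above (regression computation, frame-dropping, CMVN). It is not the HAFE C code: the function names are hypothetical, the regression formula is the usual HTK-style delta with edge replication, and features are assumed to be lists of per-frame float vectors.

```python
import math

def deltas(frames, window=2):
    """Regression (delta) coefficients, HTK-style, replicating edge frames."""
    n, dim = len(frames), len(frames[0])
    denom = 2 * sum(t * t for t in range(1, window + 1))
    out = []
    for i in range(n):
        row = []
        for d in range(dim):
            num = 0.0
            for t in range(1, window + 1):
                nxt = frames[min(i + t, n - 1)][d]
                prv = frames[max(i - t, 0)][d]
                num += t * (nxt - prv)
            row.append(num / denom)
        out.append(row)
    return out

def drop_frames(frames, vad_flags):
    """Keep only the frames that the VAD marked as speech."""
    return [f for f, keep in zip(frames, vad_flags) if keep]

def cmvn(frames):
    """Cepstral mean and variance normalization over one utterance."""
    n, dim = len(frames), len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = []
    for d in range(dim):
        var = sum((f[d] - means[d]) ** 2 for f in frames) / n
        stds.append(math.sqrt(var) if var > 0 else 1.0)  # guard constant dims
    return [[(f[d] - means[d]) / stds[d] for d in range(dim)] for f in frames]
```

Applying the steps in the order they are listed (regression, frame-dropping, then CMS/CMVN) means the normalization statistics are computed on the retained frames only; that ordering is read off the list and is an assumption about the actual module.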
IS07 setup
Prepared an HTK setup for evaluation on the HIWIRE database
  Training scripts based on the LORIA ones
  Test scripts include MLLR adaptation with a variable number of utterances
Baseline results
  Only for clean data
  With and without adaptation
IS07 setup (without adaptation)
IS07 setup (with adaptation)
[Figure: WER and SER (Error, %) versus number of adaptation utterances (0, 2, 5, 10, 20, 50); vertical axis from 0 to 25%]
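The two error measures can be sketched as follows, assuming WER is the word-level edit distance divided by the number of reference words and SER is the fraction of sentences with at least one error. The helper names are hypothetical; an HTK setup would normally take these figures from HResults.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two word lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]

def wer_ser(pairs):
    """pairs: list of (reference, hypothesis) strings. Returns (WER %, SER %)."""
    errors = words = bad_sentences = 0
    for ref, hyp in pairs:
        r, h = ref.split(), hyp.split()
        d = edit_distance(r, h)
        errors += d
        words += len(r)
        bad_sentences += d > 0
    return 100.0 * errors / words, 100.0 * bad_sentences / len(pairs)
```

Running `wer_ser` once per adaptation size (0, 2, 5, 10, 20, 50 utterances) would give the two curves of the plot.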
A review of MO-LRT VAD
Multiple observation likelihood ratio test:
  Given 2N+1 independent observations of the noisy speech
Hypothesis test:
  G0 : All the observations in the buffer are non-speech
  G1 : All the observations in the buffer are noisy speech
Gaussian model: [equations not preserved]
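For reference, a sketch of the form this test usually takes, assuming the complex-Gaussian DFT model of Sohn et al. The symbols below (frame index l, per-frame hypotheses H_0 and H_1, bin index j, noise and speech variances λ_N and λ_S, a priori SNR ξ, a posteriori SNR γ, threshold η) follow the common convention and are not taken from the slide.

```latex
% Multiple-observation LRT over the 2N+1 frames centred on frame l
\ell_{l,N}
  = \ln \frac{p(\mathbf{x}_{l-N},\dots,\mathbf{x}_{l+N} \mid G_1)}
             {p(\mathbf{x}_{l-N},\dots,\mathbf{x}_{l+N} \mid G_0)}
  = \sum_{k=l-N}^{l+N} \ln \frac{p(\mathbf{x}_k \mid H_1)}{p(\mathbf{x}_k \mid H_0)}
  \;\underset{G_0}{\overset{G_1}{\gtrless}}\; \eta

% Gaussian model for the J DFT coefficients X_j of one frame
p(\mathbf{x} \mid H_0) = \prod_{j=0}^{J-1} \frac{1}{\pi \lambda_N(j)}
    \exp\!\left( -\frac{|X_j|^2}{\lambda_N(j)} \right)
\qquad
p(\mathbf{x} \mid H_1) = \prod_{j=0}^{J-1} \frac{1}{\pi \left[ \lambda_N(j) + \lambda_S(j) \right]}
    \exp\!\left( -\frac{|X_j|^2}{\lambda_N(j) + \lambda_S(j)} \right)

% Resulting per-frame log-likelihood ratio
\ln \frac{p(\mathbf{x} \mid H_1)}{p(\mathbf{x} \mid H_0)}
  = \sum_{j=0}^{J-1} \left[ \frac{\gamma_j \, \xi_j}{1 + \xi_j} - \ln(1 + \xi_j) \right],
\qquad
\xi_j = \frac{\lambda_S(j)}{\lambda_N(j)}, \quad
\gamma_j = \frac{|X_j|^2}{\lambda_N(j)}
```

The factorization into a sum over frames relies on the independence of the 2N+1 observations stated above.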
Hangover analysis
Revised MO-LRT
Given 2N+1 independent observations of the noisy speech:
All the possible hypotheses on the individual observations:
  hk = 0 : xk = n
  hk = 1 : xk = s + n
Hypothesis subsets
We assume that only a single speech-to-non-speech or non-speech-to-speech transition can occur in h.
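The single-transition restriction can be sketched by enumerating the admissible label vectors h and grouping them by the label of the central frame. Everything here is an assumption made for illustration: the function names are hypothetical, the all-speech and all-non-speech vectors are included as the zero-transition cases, and the two subsets are compared by their best-scoring hypothesis (a generalized LRT), which may differ from the combination rule used in the actual method.

```python
def single_transition_hypotheses(n_obs):
    """All 0/1 label vectors of length n_obs with at most one transition."""
    hyps = set()
    for cut in range(n_obs + 1):
        hyps.add((0,) * cut + (1,) * (n_obs - cut))  # non-speech -> speech
        hyps.add((1,) * cut + (0,) * (n_obs - cut))  # speech -> non-speech
    return sorted(hyps)

def revised_mo_lrt(frame_llrs):
    """Decision statistic for the central frame of a 2N+1 frame buffer.

    frame_llrs[k] is the per-frame log-likelihood ratio
    ln p(x_k | h_k = 1) / p(x_k | h_k = 0).
    """
    n_obs = len(frame_llrs)
    centre = n_obs // 2
    best = {0: float("-inf"), 1: float("-inf")}
    for h in single_transition_hypotheses(n_obs):
        # Log-likelihood of h relative to the all-non-speech vector.
        score = sum(llr for llr, hk in zip(frame_llrs, h) if hk)
        best[h[centre]] = max(best[h[centre]], score)
    # Positive values favour "central frame is speech".
    return best[1] - best[0]
```

For a buffer of 2N+1 frames this enumerates 4N+2 hypotheses instead of the 2^(2N+1) unrestricted ones, which is what keeps the test tractable.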
Compared to Sohn et al. VAD.
ROC curves in quiet noise conditions (stopped car, engine running) with the close-talking microphone.
ROC curves in high noise conditions (high speed over a good road) with the distant-talking microphone.
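A sketch of how ROC operating points for a VAD can be obtained, assuming each point is the speech hit rate against the false alarm rate on non-speech frames as the decision threshold is swept. The function name and the frame-level reference labels are hypothetical.

```python
def roc_points(scores, labels, thresholds):
    """Return (false alarm rate %, speech hit rate %) for each threshold.

    scores: per-frame VAD statistic; labels: 1 = speech, 0 = non-speech.
    A frame is classified as speech when its score exceeds the threshold.
    """
    n_speech = sum(labels)
    n_non_speech = len(labels) - n_speech
    points = []
    for thr in thresholds:
        decisions = [s > thr for s in scores]
        hits = sum(d and y == 1 for d, y in zip(decisions, labels))
        false_alarms = sum(d and y == 0 for d, y in zip(decisions, labels))
        points.append((100.0 * false_alarms / n_non_speech,
                       100.0 * hits / n_speech))
    return points
```

Sweeping the threshold from low to high moves the operating point from (100, 100) towards (0, 0); a better detector stays closer to the upper-left corner.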
Presented at ICASSP 2007:
Javier Ramírez, José C. Segura, Juan M. Górriz, “Revised contextual LRT for voice activity detection”, ICASSP 2007.
Under review:
Javier Ramírez, José C. Segura, Juan M. Górriz and Luz García, “Improved Voice Activity Detection Using Contextual Multiple Hypothesis Testing for Robust Speech Recognition”, IEEE Transactions on Audio, Speech and Language Processing.