HIWIRE MEETING
Paris, February 11, 2005
JOSÉ C. SEGURA LUNA
GSTC UGR
2 HIWIRE Meeting – Paris, 11 February, 2005 José C. Segura Luna
Schedule
AURORA 4 HTK-based setup
Baseline results (AURORA databases): MFCC with C0 and CMN; AFE
Additional results: CMVN, HEQ
Work in progress: WP1 (improved HEQ), WP2 (user independence & robustness)
AURORA 4 HTK-based setup
ETSI AURORA 4 evaluation: the baseline system is based on the ISIP speech recognition system.
Main drawbacks: CPU time for experiments (especially for decoding); the scripts are excessively complex to use.
Described in:
N. Parihar and J. Picone, "DSR Front End LVCSR Evaluation - AU/384/02," Aurora Working Group, ETSI, December 06, 2002.
G. Hirsch, "Experimental Framework for the Performance Evaluation of Speech Recognition Front-ends on a Large Vocabulary Task, Version 2.0," ETSI STQ-Aurora DSR Working Group, November 19, 2002.
AURORA 4 HTK-based setup
HTK-based setup for AURORA 4 evaluations
Features: 12 MFCC + C0 (CMS) + Δ + ΔΔ
Cross-word tree-based tied-state triphones: 3 states / 6 Gaussians per state
Back-off bigram language model (same as used in the ISIP setup)
Pruning is performed as in the ISIP setup
Available for partners at: http://www.hiwire.org
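The feature pipeline above (12 MFCCs + C0 with cepstral mean subtraction, plus Δ and ΔΔ) can be sketched with plain NumPy. This is a minimal sketch, assuming HTK-style delta regression with a ±2 frame window (the window size is not stated on the slide); MFCC extraction itself is taken as given:

```python
import numpy as np

def cms(feats):
    """Cepstral mean subtraction: remove the per-utterance mean of each coefficient."""
    return feats - feats.mean(axis=0, keepdims=True)

def deltas(feats, theta=2):
    """HTK-style delta regression over a +/-theta frame window."""
    padded = np.pad(feats, ((theta, theta), (0, 0)), mode="edge")
    num = sum(t * (padded[theta + t:len(feats) + theta + t] -
                   padded[theta - t:len(feats) + theta - t])
              for t in range(1, theta + 1))
    return num / (2 * sum(t * t for t in range(1, theta + 1)))

def make_features(static):
    """static: (frames, 13) array of 12 MFCCs + C0. Returns 39-dim features."""
    s = cms(static)
    d = deltas(s)
    return np.hstack([s, d, deltas(d)])
```

On a linear ramp the interior delta values come out as the slope, which is a quick sanity check for the regression weights.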
AURORA 4 HTK-based setup
Performance comparison (HTK-based setup vs. ISIP). Training clean models from scratch takes 3 h 52 min on a 2.66 GHz CPU.

Features: 12 MFCCs + C0 (CMS) + Δ + ΔΔ

                        Word error rate        Decoding time (s)
                        ISIP     HTK           ISIP               HTK
Test 01 (clean data)    16.2%    13.22%        7580 (6.16×RT)     3428 (2.78×RT)
Test 02 (car noise)     49.6%    24.68%        22195 (18.03×RT)   8002 (6.50×RT)
Test 03 (babble noise)  62.2%    46.00%        33203 (26.9×RT)    13747 (11.17×RT)
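The ×RT figures in the table are simply decoding time divided by the duration of the test audio. A minimal sketch; the audio duration below is not stated on the slide, it is back-derived from the ISIP Test 01 row, so it is an approximation:

```python
def rt_factor(decode_seconds, audio_seconds):
    """Real-time factor: seconds of CPU per second of audio."""
    return decode_seconds / audio_seconds

# Test-set duration implied by the ISIP Test 01 row (7580 s at 6.16xRT):
audio = 7580 / 6.16   # ~1230.5 s of audio (back-derived, approximate)
print(round(rt_factor(3428, audio), 2))
```

The result lands close to the 2.78×RT reported for the HTK setup on Test 01; small differences come from rounding in the slide's figures.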
AURORA 4 Baseline results
Averages are word error rates; the last three columns give the relative error reduction vs. the clean/166/none baseline.

PARAMETERS    TRAIN  TEST  LATTICE  Averages (WER)         Relative Error Reduction
              MODE   SIZE  SIZE     01-07  08-14  01-14    01-07    08-14    01-14
MFCC_0_D_A_Z  clean  166   none     40.53  50.60  45.57    ---      ---      ---
MFCC_0_D_A_Z  clean  166   sml      26.53  33.57  30.05
MFCC_0_D_A_Z  clean  166   mid      27.98  35.02  31.50
MFCC_0_D_A_Z  clean  330   none     40.72  50.78  45.75    -0.47%   -0.36%   -0.40%
MFCC_0_D_A_Z  clean  330   sml      25.75  32.93  29.34
MFCC_0_D_A_Z  clean  330   mid      27.18  34.25  30.71
MFCC_0_D_A_Z  multi  166   none     24.58  29.88  27.23    39.36%   40.96%   40.25%
MFCC_0_D_A_Z  multi  166   sml      17.32  18.87  18.09
MFCC_0_D_A_Z  multi  166   mid      18.83  20.16  19.50
MFCC_0_D_A_Z  multi  330   none     24.74  29.73  27.24    38.97%   41.24%   40.23%
MFCC_0_D_A_Z  multi  330   sml      16.70  17.80  17.25
MFCC_0_D_A_Z  multi  330   mid      18.26  19.33  18.79
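The relative error reduction columns follow directly from the WER averages. A quick check, reproducing the 40.25% figure for the multicondition 01-14 row against the clean baseline:

```python
def rer(wer_baseline, wer_system):
    """Relative error reduction vs. the baseline, in percent."""
    return 100.0 * (wer_baseline - wer_system) / wer_baseline

# Multicondition training vs. the clean 01-14 baseline:
print(round(rer(45.57, 27.23), 2))  # -> 40.25, as in the table
```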
AURORA 4 Additional results
Averages are word error rates; the last three columns give the relative error reduction vs. the clean MFCC baseline.

PARAMETERS                TRAIN  TEST  LATTICE  Averages (WER)         Relative Error Reduction
                          MODE   SIZE  SIZE     01-07  08-14  01-14    01-07    08-14    01-14
MFCC_0_D_A_Z              clean  166   none     40.53  50.60  45.57    ---      ---      ---
MFCC_0_D_A_Z              multi  166   none     24.58  29.88  27.23    39.36%   40.96%   40.25%
MFCC_0_D_A_Z MV           clean  166   none     36.12  48.50  42.31    10.88%   4.15%    7.14%
MFCC_0_D_A_Z MV DELTAS    clean  166   none     34.73  47.35  41.04    14.31%   6.43%    9.94%
AFE                       clean  166   none     27.57  34.99  31.28    31.99%   30.85%   31.36%
AFE noFD                  clean  166   none     27.69  35.26  31.48    31.67%   30.31%   30.92%
AFE noFD                  multi  166   none     22.33  27.67  25.00    44.90%   45.32%   45.13%
ECDF_WSJ_MULTI            clean  166   none     32.81  43.77  38.29    19.06%   13.50%   15.97%
ECDF_TID_MULTI            clean  166   none     31.36  40.87  36.12    22.61%   19.24%   20.74%
ECDF_WSJ_CLEAN            clean  166   none     32.19  42.75  37.47    20.58%   15.53%   17.78%
ECDF_TID_CLEAN            clean  166   none     31.75  41.95  36.85    21.67%   17.09%   19.13%
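CMVN (the MV rows above) extends cepstral mean subtraction with per-utterance variance normalization, so every coefficient ends up zero-mean and unit-variance. A minimal sketch:

```python
import numpy as np

def cmvn(feats, eps=1e-8):
    """Per-utterance cepstral mean and variance normalization.
    feats: (frames, dim) array of cepstral features."""
    mu = feats.mean(axis=0, keepdims=True)
    sigma = feats.std(axis=0, keepdims=True)
    return (feats - mu) / (sigma + eps)
```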
Baseline results
HIWIRE baseline results: 12 MFCCs + C0 (CMS) + Δ + ΔΔ

AURORA 2 small vocabulary, absolute word accuracy. If an HTK output is WORD: %Corr=99.14, Acc=98.68 [H=……..], the value to enter is 98.68.

Multicondition training, multicondition testing (sets A, B, C):

SNR      Subway  Babble  Car     Exhib.  Avg A   Rest.   Street  Airport Station Avg B   Subw.M  Str.M   Avg C   Avg
Clean    98.46   98.46   98.36   98.73   98.50   98.46   98.46   98.36   98.73   98.50   98.53   98.46   98.50   98.50
20 dB    97.79   97.67   98.21   97.22   97.72   97.21   97.82   97.88   97.41   97.58   97.88   97.49   97.69   97.66
15 dB    97.11   97.13   97.67   97.13   97.26   96.81   96.89   97.02   96.42   96.79   97.05   97.13   97.09   97.04
10 dB    95.52   96.07   96.24   94.17   95.50   95.76   95.41   95.88   94.72   95.44   94.96   95.07   95.02   95.38
5 dB     90.30   90.24   87.41   87.75   88.93   89.75   89.06   90.69   87.87   89.34   90.11   88.18   89.15   89.14
0 dB     69.85   66.02   48.79   65.84   62.63   70.49   62.36   72.62   57.39   65.72   68.81   63.00   65.91   64.52
-5 dB    28.98   28.14   19.21   27.00   25.83   34.45   24.73   33.91   23.82   29.23   28.25   26.45   27.35   27.49
Average  90.11   89.43   85.66   88.42   88.41   90.00   88.31   90.82   86.76   88.97   89.76   88.17   88.97   88.75

Clean training, multicondition testing (sets A, B, C):

SNR      Subway  Babble  Car     Exhib.  Avg A   Rest.   Street  Airport Station Avg B   Subw.M  Str.M   Avg C   Avg
Clean    99.14   99.09   98.99   99.17   99.10   99.14   99.09   98.99   99.17   99.10   99.17   99.12   99.15   99.11
20 dB    96.22   97.64   97.70   96.42   97.00   98.10   97.01   98.03   98.06   97.80   96.53   97.16   96.85   97.29
15 dB    90.70   93.83   92.01   90.40   91.74   95.18   92.17   94.72   93.80   93.97   90.82   91.93   91.38   92.56
10 dB    71.23   79.47   68.77   69.27   72.19   83.60   74.18   84.58   78.28   80.16   71.91   74.21   73.06   75.55
5 dB     38.19   47.28   32.84   34.80   38.28   54.38   42.26   52.94   44.80   48.60   38.07   42.68   40.38   42.82
0 dB     21.40   23.34   19.95   18.45   20.79   26.19   22.52   27.97   23.14   24.96   21.89   22.07   21.98   22.69
-5 dB    13.82   12.48   12.38   10.18   12.22   13.23   12.15   15.39   13.88   13.66   13.79   11.88   12.84   12.92
Average  63.55   68.31   62.25   61.87   64.00   71.49   65.63   71.65   67.62   69.10   63.84   65.61   64.73   66.18
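The note on absolute word accuracy can be made concrete: with H hits, D deletions, S substitutions and I insertions, HTK's %Corr = H/N ignores insertions while Acc = (H-I)/N penalizes them, where N = H+D+S is the number of reference words. The counts below are hypothetical, chosen only to reproduce the %Corr=99.14 / Acc=98.68 example:

```python
def corr_and_acc(H, D, S, I):
    """HTK scoring: N = H + D + S reference words.
    %Corr ignores insertions; Acc subtracts them."""
    N = H + D + S
    return 100.0 * H / N, 100.0 * (H - I) / N

# Hypothetical counts consistent with the slide's example:
corr, acc = corr_and_acc(H=9914, D=50, S=36, I=46)
print(round(corr, 2), round(acc, 2))  # -> 99.14 98.68
```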
Baseline results
AFE front end, AURORA 2 small vocabulary, absolute word accuracy. If an HTK output is WORD: %Corr=99.14, Acc=98.68 [H=……..], the value to enter is 98.68.

Multicondition training, multicondition testing (sets A, B, C):

SNR      Subway  Babble  Car     Exhib.  Avg A   Rest.   Street  Airport Station Avg B   Subw.M  Str.M   Avg C   Avg
Clean    99.08   98.85   99.02   99.38   99.08   99.08   98.85   99.02   99.38   99.08   98.89   98.94   98.92   99.05
20 dB    98.74   98.28   98.78   98.92   98.68   98.50   98.13   98.54   99.07   98.56   98.62   98.25   98.44   98.58
15 dB    98.10   97.88   98.33   98.27   98.15   97.79   97.64   97.82   98.21   97.87   98.10   97.70   97.90   97.98
10 dB    95.64   96.16   97.08   96.17   96.26   95.98   95.83   96.66   96.95   96.36   95.55   95.65   95.60   96.17
5 dB     91.96   91.05   93.80   90.77   91.90   90.70   90.72   92.51   91.79   91.43   90.70   89.09   89.90   91.31
0 dB     77.13   71.10   81.54   76.06   76.46   72.18   75.51   79.54   77.85   76.27   71.11   70.31   70.71   75.23
-5 dB    44.01   35.68   43.15   45.56   42.10   37.36   42.10   46.47   45.57   42.88   35.83   36.10   35.97   41.18
Average  92.31   90.89   93.91   92.04   92.29   91.03   91.57   93.01   92.77   92.10   90.82   90.20   90.51   91.86

Clean training, multicondition testing (sets A, B, C):

SNR      Subway  Babble  Car     Exhib.  Avg A   Rest.   Street  Airport Station Avg B   Subw.M  Str.M   Avg C   Avg
Clean    99.39   99.00   99.28   99.51   99.30   99.39   99.00   99.28   99.51   99.30   99.20   99.24   99.22   99.28
20 dB    98.31   98.16   98.81   98.36   98.41   98.50   97.82   98.75   98.64   98.43   97.91   98.13   98.02   98.34
15 dB    96.90   96.74   98.00   96.91   97.14   95.92   96.55   97.52   97.19   96.80   96.65   96.49   96.57   96.89
10 dB    93.09   92.17   95.97   93.55   93.70   91.80   92.90   94.72   94.72   93.54   92.26   92.17   92.22   93.34
5 dB     85.26   81.47   89.29   84.91   85.23   80.78   84.16   86.04   86.45   84.36   83.42   82.56   82.99   84.43
0 dB     65.34   53.87   69.25   63.84   63.08   56.86   61.09   65.14   65.69   62.20   58.15   57.45   57.80   61.67
-5 dB    32.53   23.45   31.09   31.71   29.70   24.90   29.50   31.94   33.09   29.86   26.90   27.34   27.12   29.25
Average  87.78   84.48   90.26   87.51   87.51   84.77   86.50   88.43   88.54   87.06   85.68   85.36   85.52   86.93
Baseline results
AURORA 3 word error rates
MFCC + C0 (CMS) + Δ + ΔΔ:

             Italian  Spanish  German   Average
Well (×40%)  5.58%    10.69%   8.86%    8.38%
Mid (×35%)   12.98%   16.82%   18.81%   16.20%
High (×25%)  53.25%   34.50%   20.31%   36.02%
Overall      20.09%   18.79%   15.21%   18.03%

AFE:

             Italian  Spanish  German   Average
Well (×40%)  3.29%    3.39%    4.87%    3.85%
Mid (×35%)   7.47%    6.21%    10.40%   8.03%
High (×25%)  11.00%   9.23%    8.70%    9.64%
Overall      6.68%    5.84%    7.76%    6.76%
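The AURORA 3 overall figures are the weighted mix indicated by the ×40/×35/×25 labels on the well-, mid- and high-mismatch rows. A quick check against the Italian column of both tables:

```python
def overall_wer(well, mid, high):
    """AURORA 3 overall WER: weighted mix of the three mismatch conditions."""
    return 0.40 * well + 0.35 * mid + 0.25 * high

print(round(overall_wer(5.58, 12.98, 53.25), 2))  # Italian MFCC -> 20.09
print(round(overall_wer(3.29, 7.47, 11.00), 2))   # Italian AFE  -> 6.68
```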
Work in progress (WP1)
Improved equalization
Modeling Speech & Noise separately
First results with Gaussian models: very promising on AURORA 4; still to be evaluated on AURORA 2 & 3
Next steps: use more detailed / nonparametric models; incorporate dynamic features
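Plain histogram equalization, the starting point for the improved HEQ above, maps each feature component through its empirical CDF onto a reference distribution. A minimal sketch assuming a standard Gaussian reference; the slide's improved method, which models speech and noise separately, is not reproduced here:

```python
import numpy as np
from statistics import NormalDist

def heq(x):
    """Histogram equalization of one feature component:
    map the empirical CDF of x onto a standard normal reference."""
    ranks = np.argsort(np.argsort(x))      # rank of each sample
    p = (ranks + 0.5) / len(x)             # empirical CDF in (0, 1)
    ref = NormalDist()                     # reference distribution (Gaussian assumed)
    return np.array([ref.inv_cdf(pi) for pi in p])
```

The mapping is order-preserving, so it normalizes the distribution of each coefficient without reordering the frames.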
Preliminary results
Averages are word error rates; the last three columns give the relative error reduction vs. the clean MFCC baseline.

PARAMETERS                TRAIN  TEST  LATTICE  Averages (WER)         Relative Error Reduction
                          MODE   SIZE  SIZE     01-07  08-14  01-14    01-07    08-14    01-14
MFCC_0_D_A_Z              clean  166   none     40.53  50.60  45.57    ---      ---      ---
MFCC_0_D_A_Z              multi  166   none     24.58  29.88  27.23    39.36%   40.96%   40.25%
MFCC_0_D_A_Z (MV)         clean  166   none     36.12  48.50  42.31    10.88%   4.15%    7.14%
MFCC_0_D_A_Z (MV DELTAS)  clean  166   none     34.73  47.35  41.04    14.31%   6.43%    9.94%
AFE                       clean  166   none     27.57  34.99  31.28    31.99%   30.85%   31.36%
AFE noFD                  clean  166   none     27.69  35.26  31.48    31.67%   30.31%   30.92%
AFE noFD                  multi  166   none     22.33  27.67  25.00    44.90%   45.32%   45.13%
ECDF_WSJ_MULTI            clean  166   none     32.81  43.77  38.29    19.06%   13.50%   15.97%
ECDF_TID_MULTI            clean  166   none     31.36  40.87  36.12    22.61%   19.24%   20.74%
ECDF_WSJ_CLEAN            clean  166   none     32.19  42.75  37.47    20.58%   15.53%   17.78%
ECDF_TID_CLEAN            clean  166   none     31.75  41.95  36.85    21.67%   17.09%   19.13%
CLASIF N20 ref01          clean  166   none     28.29  33.87  31.08    30.19%   33.06%   31.79%
Work in progress (WP1)
VAD & Noise reduction
Baseline evaluations: AURORA 2 & 3 already done; AURORA 4 to be ready by June
Integration with parametric techniques: speech & noise equalization
Work in progress (WP2)
HEQ-based user robustness
Ready for AURORA 4; working on the WSJ1 baseline
HEQ-based user adaptation
MLLR baseline; estimation of MLLR transformations using HEQ; working on the WSJ1 baseline
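MLLR adapts the Gaussian means with an affine transform μ̂ = Aμ + b. Below is a least-squares sketch of estimating W = [A b] from mean/target pairs; the real MLLR estimate is occupancy-weighted and covariance-aware, which is omitted here, and the function names are illustrative:

```python
import numpy as np

def estimate_mllr_means(means, targets):
    """Least-squares fit of W = [A b] such that A @ mu + b ~ target.
    means, targets: (n_gaussians, dim) arrays of original and adapted means."""
    n, d = means.shape
    ext = np.hstack([means, np.ones((n, 1))])   # extended means [mu; 1]
    W, *_ = np.linalg.lstsq(ext, targets, rcond=None)
    return W.T                                  # (d, d+1): [A b]

def apply_mllr(W, mu):
    """Apply the estimated transform to one mean vector."""
    return W[:, :-1] @ mu + W[:, -1]
```

With enough Gaussians (more than dim+1), an exactly affine relation between means and targets is recovered exactly, which is a useful unit test for the estimator.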
HIWIRE MEETING
Paris, February 11, 2005
JOSÉ C. SEGURA LUNA
GSTC UGR