Confidence Measures for Automatic Speech Recognition
Presented by Tzan-Hwei Chen
National Taiwan Normal University, Spoken Language Processing Lab
Advisors: Hsin-Min Wang, Berlin Chen
2
Outline
• Introduction
• Categories of estimation methods for confidence measures (CM)
– Feature based
– Posterior probability based
– Explicit model based
– Incorporation of high-level information for CM*
• The application of CM to improve speech recognition
• Summary
3
Introduction (1/9)
• It is extremely important to be able to make an appropriate and reliable judgement based on the error-prone ASR result.
• Researchers have proposed computing a score (preferably between 0 and 1), called a confidence measure (CM), to indicate the reliability of any recognition decision made by an ASR system.
4
Introduction (2/9)
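[Figure: ASR block diagram. Speech signal → feature extraction → feature vector → decoding (with acoustic model, language model, and lexicon) → recognized word sequence → confidence measure / verification. Example: the recognized sequence 臺北 到 魚籃 ("Taipei to fish basket") is verified against the candidates 1. 臺北到魚籃 and 2. 臺北到宜蘭 ("Taipei to Yilan"). Caption: Some applications of CM]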
5
Introduction (3/9)
• Early research on CM can be traced back to rejection in word-spotting systems.
• Other early CM-related work addressed the automatic detection of new words in LVCSR.
• Over the past few years, CM has been applied to more and more research areas, e.g.,
– To improve speech recognition
– Look-ahead algorithms in LVCSR
– To guide the system in performing unsupervised learning
– …
6
Introduction (4/9)
• The general procedure of CM for verification
[Figure: recognized units → confidence estimation → confidence of unit → judgment against a predefined threshold; confidence > threshold: acceptance; confidence < threshold: rejection]
7
Introduction (5/9)
• Four situations when judging a hypothesis
– ref 宜蘭 (Yilan), hyp 宜蘭, accept: correct acceptance
– ref 宜蘭, hyp 魚籃 (fish basket), reject: correct rejection
– ref 宜蘭, hyp 宜蘭, reject: false rejection
– ref 宜蘭, hyp 魚籃, accept: false acceptance
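The threshold test and the four outcomes can be sketched in a few lines of Python. This is only an illustration: the function name `judge`, the confidence values, and the threshold 0.5 are hypothetical, and treating a confidence exactly equal to the threshold as a rejection is an assumption the slides do not settle.

```python
def judge(confidence: float, threshold: float, hyp: str, ref: str) -> str:
    """Classify one verification decision into one of the four situations."""
    accepted = confidence > threshold   # assumption: equality counts as rejection
    correct = hyp == ref                # the hypothesis matches the reference word
    if accepted:
        return "correct acceptance" if correct else "false acceptance"
    return "false rejection" if correct else "correct rejection"


# Hypothetical confidences for the 宜蘭 / 魚籃 example, threshold 0.5.
results = [
    judge(0.9, 0.5, "宜蘭", "宜蘭"),
    judge(0.2, 0.5, "魚籃", "宜蘭"),
    judge(0.2, 0.5, "宜蘭", "宜蘭"),
    judge(0.9, 0.5, "魚籃", "宜蘭"),
]
```

Only the two "false" outcomes count as errors of the confidence measure.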
8
Introduction (6/9)
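• The evaluation metric:
– Confidence error rate (CER):
$$\mathrm{CER} = \frac{\text{number of false acceptances} + \text{number of false rejections}}{\text{total number of recognized words}}$$
ref: 有 三名 候選人 通過 審查
hyp: 三民 候選人 通過 審查 了
label (per hyp word): FA CA FR CA FA
$$\mathrm{CER} = \frac{1 + 2}{5}$$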
9
Introduction (7/9)
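• The evaluation metric:
– Confidence error rate baseline (it is assumed that every recognized word is correct):
$$\text{baseline} = \frac{\text{number of insertions} + \text{number of substitutions}}{\text{total number of recognized words}}$$
ref: 有 三名 候選人 通過 審查
hyp: 三民 候選人 通過 審查 了
label (per hyp word): FA CA CA CA FA
$$\text{baseline} = \frac{1 + 1}{5}$$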
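As a quick check of the arithmetic, the two rates can be computed from the per-word labels of the example. This is a minimal sketch; the function names and the list encoding of the labels are assumptions made for illustration.

```python
from collections import Counter


def confidence_error_rate(outcomes):
    """CER = (false acceptances + false rejections) / recognized words."""
    counts = Counter(outcomes)
    return (counts["FA"] + counts["FR"]) / len(outcomes)


def baseline_error_rate(word_is_correct):
    """Baseline: every recognized word is accepted, so each wrong word is an error."""
    return sum(not ok for ok in word_is_correct) / len(word_is_correct)


# Labels of the five hypothesized words in the example.
outcomes = ["FA", "CA", "FR", "CA", "FA"]
# 三民 is a substitution and 了 is an insertion; the other three words are correct.
word_is_correct = [False, True, True, True, False]

cer = confidence_error_rate(outcomes)
baseline = baseline_error_rate(word_is_correct)
```

In this toy example the CER (3/5) is worse than the baseline (2/5) because a correct word was rejected while both wrong words were still accepted.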
10
Introduction (8/9)
• The evaluation metric (cont):
– Receiver operating characteristic (ROC) curve: simply a plot of the false acceptance rate against the detection rate.
$$\text{detection rate} = 1 - \text{false rejection rate}$$
$$\text{false acceptance rate} = \frac{\text{num. of false acceptances}}{\text{number of incorrectly recognized words}}$$
$$\text{false rejection rate} = \frac{\text{num. of false rejections}}{\text{number of correctly recognized words}}$$
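The ROC curve is traced by sweeping the threshold and recomputing the two rates. The sketch below assumes hypothetical confidence scores, accepts a word when its score is strictly above the threshold, and assumes the data contains at least one correct and one incorrect word.

```python
def roc_points(scores, is_correct, thresholds):
    """Return (threshold, false acceptance rate, detection rate) triples."""
    n_correct = sum(is_correct)
    n_incorrect = len(is_correct) - n_correct  # assumed non-zero, like n_correct
    points = []
    for th in thresholds:
        # Wrong words that were accepted, and correct words that were rejected.
        fa = sum(1 for s, ok in zip(scores, is_correct) if s > th and not ok)
        fr = sum(1 for s, ok in zip(scores, is_correct) if s <= th and ok)
        far = fa / n_incorrect
        detection = 1.0 - fr / n_correct
        points.append((th, far, detection))
    return points


# Hypothetical confidence scores and correctness labels for five words.
scores = [0.9, 0.8, 0.6, 0.4, 0.3]
is_correct = [True, True, False, True, False]
points = roc_points(scores, is_correct, thresholds=[0.0, 0.5, 1.0])
```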
11
Introduction (9/9)
• All methods proposed for computing CMs can be roughly classified into three major categories [7]:
– Feature based
– Posterior probability based
– Explicit model based (utterance verification, UV)
– Incorporation of high-level information for CM*
12
Feature-based confidence measure
13
Feature-based confidence measure (1/8)
• Features can be collected during the decoding procedure and may include acoustic, language, and syntactic information.
• Any feature can be called a predictor if its p.d.f. for correctly recognized words is clearly distinct from that for misrecognized words.
[Figure: p(f_w) plotted against f_w, with separate densities for misrecognized words and correctly recognized words]
14
Feature-based confidence measure (2/8)
• Some common predictor features
– Pure normalized likelihood score related: acoustic score per frame.
– N-best related: count in the N-best list, N-best homogeneity score
$$\frac{\sum_{[w_m,s_m,e_m]_1^M \in \text{N-best},\; w \in [w_m,s_m,e_m]_1^M} p([w_m,s_m,e_m]_1^M \mid X)}{\sum_{[w_n,s_n,e_n]_1^N \in \text{N-best}} p([w_n,s_n,e_n]_1^N \mid X)}$$
– Duration related: word duration divided by its number of phones
where
[w_m, s_m, e_m]_1^M: a word sequence whose number of words is M
s_m: the start time of the m-th word
e_m: the end time of the m-th word
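The two N-best features can be illustrated on a toy list. This sketch assumes the homogeneity score is the share of the N-best score mass carried by the hypotheses that contain the word; the N-best entries and their scores are hypothetical.

```python
def nbest_count(nbest, word):
    """Number of N-best hypotheses that contain the word."""
    return sum(1 for words, _ in nbest if word in words)


def nbest_homogeneity(nbest, word):
    """Score mass of the hypotheses containing the word over the total mass."""
    total = sum(score for _, score in nbest)
    return sum(score for words, score in nbest if word in words) / total


# Hypothetical 3-best list: (word sequence, posterior-like score).
nbest = [
    (["a", "b"], 0.5),
    (["a", "c"], 0.3),
    (["d", "c"], 0.2),
]
count_a = nbest_count(nbest, "a")
homogeneity_a = nbest_homogeneity(nbest, "a")
homogeneity_c = nbest_homogeneity(nbest, "c")
```

A word that survives in most high-scoring hypotheses, like "a" here, gets a score close to 1.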
15
Feature-based confidence measure (3/8)
• Some common predictor features (cont)
– Hypothesis density:
$$HD(a{:}[w_a; s_a, e_a]) = \frac{1}{e_a - s_a + 1} \sum_{t=s_a}^{e_a} D(t)$$
a:[w_a; s_a, e_a]: a word arc in the word graph
D(t'): the number of different word arcs at time t'
[Figure: example word graph with competing arcs for 靜音 (silence), 結果, 建國, 有, 由, 又, 三名, 候選人, 沒有, 審查, 通過; at the marked time frame, D(t) = 2]
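Hypothesis density can be computed directly from a list of arcs. In this sketch an arc is a hypothetical `(word, start, end)` tuple with inclusive frame indices, and D(t) is taken to be the number of distinct word labels among the arcs that span frame t, which is one reading of "different word arcs".

```python
def density_at(arcs, t):
    """D(t): number of distinct words among the arcs that span frame t."""
    return len({word for word, start, end in arcs if start <= t <= end})


def hypothesis_density(arcs, arc):
    """Average of D(t) over the frames covered by the given arc."""
    _, start, end = arc
    total = sum(density_at(arcs, t) for t in range(start, end + 1))
    return total / (end - start + 1)


# Hypothetical word graph arcs: (word, start frame, end frame).
arcs = [("A", 0, 4), ("B", 0, 2), ("C", 1, 2)]
hd_a = hypothesis_density(arcs, arcs[0])  # D(t) over frames 0..4 is 2, 3, 3, 1, 1
hd_b = hypothesis_density(arcs, arcs[1])  # D(t) over frames 0..2 is 2, 3, 3
```

A high density means many competitors were alive during the arc, which usually signals a less reliable word.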
16
Feature-based confidence measure (4/8)
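• Some common predictor features (cont)
– Acoustic stability
[Figure: word sequences hypothesized for the same utterance under three differently weighted scores of the form p(X|W)P(W), indexed 1 to 3. The hypothesized word sequences shown include 今天 天氣 很好 ("today's weather is very good"), 今天 天氣, and 今天 天氣 不佳 ("today's weather is poor"); words that stay the same across the alternatives, such as 今天 and 天氣, are the stable ones]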
17
Feature-based confidence measure (6/8)
• We can combine the above features with any one of the following classifiers
– Linear discriminant function
– Generalized linear model
– Neural networks
– Decision tree
– Support vector machine
– Boosting
– Naïve Bayes classifier
18
Feature-based confidence measure (7/8)
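• Naïve Bayes classifier [3]
$$P(C_1 \mid f_w, w) = \frac{P(C_1 \mid w)\, P(f_w \mid C_1, w)}{\sum_{C' \in \{C_1, C_2\}} P(C' \mid w)\, P(f_w \mid C', w)}$$
$$P(f_w \mid C_i, w) = \prod_{d=1}^{K} P(f_{w_d} \mid C_i, w)$$
$$P(C_i \mid w) = \frac{N(C_i, w)}{N(w)}, \qquad P(f_{w_d} \mid C_i, w) = \frac{N(f_{w_d}, C_i, w)}{N(C_i, w)}$$
C_1: the recognized word is correct
C_2: the recognized word is wrong
f_w: the K-dimension predictor feature vector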
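The count-based estimates above translate into a small classifier. The sketch below is an illustration under stated assumptions: features are discrete values, no smoothing is applied, the class name `NaiveBayesConfidence` and the toy training data are hypothetical, and 0.5 is returned when the word gives no evidence at all.

```python
from collections import defaultdict


class NaiveBayesConfidence:
    """Per-word naive Bayes over discrete predictor features (no smoothing)."""

    def __init__(self):
        self.n_cw = defaultdict(int)   # N(C_i, w), keyed by (class, word)
        self.n_fcw = defaultdict(int)  # N(f_d, C_i, w), keyed by (d, value, class, word)

    def add(self, word, features, is_correct):
        c = 1 if is_correct else 2     # C_1: correct, C_2: wrong
        self.n_cw[(c, word)] += 1
        for d, value in enumerate(features):
            self.n_fcw[(d, value, c, word)] += 1

    def confidence(self, word, features):
        """P(C_1 | f_w, w) from relative frequencies."""
        n_w = self.n_cw[(1, word)] + self.n_cw[(2, word)]
        joint = {}
        for c in (1, 2):
            n_c = self.n_cw[(c, word)]
            p = n_c / n_w if n_w else 0.0                # P(C_i | w)
            for d, value in enumerate(features):         # product over K features
                p *= self.n_fcw[(d, value, c, word)] / n_c if n_c else 0.0
            joint[c] = p
        total = joint[1] + joint[2]
        return joint[1] / total if total else 0.5        # assumption: no evidence -> 0.5


# Hypothetical training tokens of one word with two discrete features.
model = NaiveBayesConfidence()
model.add("yes", ["high", "short"], True)
model.add("yes", ["high", "long"], True)
model.add("yes", ["low", "short"], True)
model.add("yes", ["low", "short"], False)
conf = model.confidence("yes", ["low", "short"])
```

With these counts, P(C_1 | w) = 3/4 and the feature likelihoods give 1/6 for the correct class against 1/4 for the wrong class, so the confidence is 0.4. A practical system would smooth the counts to avoid zero probabilities.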
19
Feature-based confidence measure (8/8)
• Experiments [3]
• Corpus : an Italian speech corpus of phone calls to the front desk of a hotel
feature                 CER (%)   relative reduction (%)
acoustic stability      16.3      22.4
language model score    18.8      10
hypothesis density      18.9      10.5
duration                19.3      8.1
acoustic score          19.6      6.7
baseline                21
20
Posterior probability based confidence measure
21
Posterior probability based confidence measure (1/11)
• Posterior probability of a word sequence:
$$P([w_m,s_m,e_m]_1^M \mid X) = \frac{p(X \mid [w_m,s_m,e_m]_1^M)\, P([w_m,s_m,e_m]_1^M)}{p(X)} = \frac{p(X \mid [w_m,s_m,e_m]_1^M)\, P([w_m,s_m,e_m]_1^M)}{\sum_{[w_n,s_n,e_n]_1^N} p(X \mid [w_n,s_n,e_n]_1^N)\, P([w_n,s_n,e_n]_1^N)}$$
The denominator, a sum over all possible word sequences, is impossible to estimate in a precise manner.
• Approximation methods are therefore adopted.
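Posterior probability based confidence measure (2/11)
• Word graph based approximation
$$p(X) = \sum_{[w_n,s_n,e_n]_1^N} p(X \mid [w_n,s_n,e_n]_1^N)\, P([w_n,s_n,e_n]_1^N) \approx \sum_{[w_m,s_m,e_m]_1^M \in \text{word graph}} p(X \mid [w_m,s_m,e_m]_1^M)\, P([w_m,s_m,e_m]_1^M)$$
[Figure: word graph of competing word arcs (靜音, 結果, 建國, 有, 由, 又, 三名, 候選人, 沒有, 通過); the sum over all word sequences is replaced by a sum over the paths of the graph]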
23
Posterior probability based confidence measure (3/11)
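• Posterior probability of a word arc:
$$p(a{:}[w_a; s_a, e_a] \mid X) = \frac{\sum_{\{[w_m; s_m, e_m]\}_1^M \in \text{word graph},\; a \in \{[w_m; s_m, e_m]\}_1^M} \prod_{m=1}^{M} p(x_{s_m}^{e_m} \mid w_m)\, P(w_m \mid h_m)}{\sum_{\{[w_n; s_n, e_n]\}_1^N \in \text{word graph}} \prod_{n=1}^{N} p(x_{s_n}^{e_n} \mid w_n)\, P(w_n \mid h_n)}$$
where h_m is the language model history of w_m.
– Some issues are addressed and the word posterior probability is generalized
• Reduced search space
• Relaxed time registration
• Optimal acoustic and language model weights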
24
Posterior probability based confidence measure (4/11)
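• Posterior probability of a word arc [6]:
$$C_{normal}(a{:}[w_a; s_a, e_a]) = \frac{\sum_{\{[w_m; s_m, e_m]\}_1^M \in \text{word graph},\; a \in \{[w_m; s_m, e_m]\}_1^M} \prod_{m=1}^{M} p(x_{s_m}^{e_m} \mid w_m)\, P(w_m \mid h_m)}{\sum_{\{[w_n; s_n, e_n]\}_1^N \in \text{word graph}} \prod_{n=1}^{N} p(x_{s_n}^{e_n} \mid w_n)\, P(w_n \mid h_n)}$$
[Figure: word graph of competing word arcs (靜音, 結果, 建國, 有, 由, 又, 三名, 候選人, 沒有, 通過) with a single arc highlighted]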
25
Posterior probability based confidence measure (5/11)
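• Posterior probability of a word arc [6]:
$$C_{med}(a{:}[w_a; s_a, e_a]) = \sum_{r:[w_r; s_r, e_r]:\; w_r = w_a,\; s_r \le (s_a + e_a)/2 \le e_r} C_{normal}(r{:}[w_r; s_r, e_r] \mid X)$$
[Figure: word graph of competing word arcs (靜音, 結果, 建國, 有, 由, 又, 三名, 候選人, 沒有, 通過); the arcs with the same word that cover the median time frame of the arc are highlighted]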
26
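Posterior probability based confidence measure (6/11)
• Posterior probability of a word arc [6]:
$$C_{max}(a{:}[w_a; s_a, e_a]) = \max_{t \in \{s_a, \ldots, e_a\}} \sum_{r:[w_r; s_r, e_r]:\; w_r = w_a,\; s_r \le t \le e_r} C_{normal}(r{:}[w_r; s_r, e_r] \mid X)$$
[Figure: word graph of competing word arcs (靜音, 結果, 建國, 有, 由, 又, 三名, 候選人, 沒有, 通過); the maximum is taken over the time frames of the arc 三名]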
27
Posterior probability based confidence measure (7/11)
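• Posterior probability of a word arc [6]:
$$C_{sec}(a{:}[w_a; s_a, e_a]) = \sum_{r:[w_r; s_r, e_r]:\; w_r = w_a,\; (s_r, e_r) \cap (s_a, e_a) \ne \emptyset} C_{normal}(r{:}[w_r; s_r, e_r] \mid X)$$
[Figure: word graph of competing word arcs (靜音, 結果, 建國, 有, 由, 又, 三名, 候選人, 沒有, 通過); all arcs with the same word that overlap the arc in time are highlighted]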
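The three relaxed measures differ only in which same-word arcs they pool. The sketch below assumes the arc posteriors C_normal have already been computed (for example by a forward-backward pass over the word graph) and are given as hypothetical `(word, start, end, posterior)` tuples with inclusive frame indices.

```python
def same_word(arcs, a):
    """All arcs that carry the same word as arc a (including a itself)."""
    return [r for r in arcs if r[0] == a[0]]


def c_sec(arcs, a):
    """Sum the posteriors of same-word arcs that overlap a in time."""
    _, s_a, e_a, _ = a
    return sum(p for _, s, e, p in same_word(arcs, a) if s <= e_a and s_a <= e)


def c_med(arcs, a):
    """Sum the posteriors of same-word arcs covering the median frame of a."""
    mid = (a[1] + a[2]) / 2
    return sum(p for _, s, e, p in same_word(arcs, a) if s <= mid <= e)


def c_max(arcs, a):
    """Maximum over the frames of a of the summed same-word posteriors."""
    return max(
        sum(p for _, s, e, p in same_word(arcs, a) if s <= t <= e)
        for t in range(a[1], a[2] + 1)
    )


# Hypothetical arcs: three time-shifted copies of word "w" and one competitor "v".
arcs = [("w", 0, 9, 0.3), ("w", 2, 11, 0.2), ("w", 8, 12, 0.1), ("v", 0, 9, 0.4)]
a = arcs[0]
scores = (a[3], c_sec(arcs, a), c_med(arcs, a), c_max(arcs, a))
```

Here C_normal is 0.3, C_med is 0.5, and C_sec and C_max are both 0.6: pooling arcs that differ only in their time boundaries recovers probability mass that the strict definition splits between them.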
28
Posterior probability based confidence measure (8/11)
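• The drawback of the above methods: they all need an additional pass.
• In [8], the “local word confidence measure” is proposed:
$$C([w,s,e]) = \frac{\max_{[w,s',e'] \in E} \left( p(x_{s'}^{e'} \mid w) \right) p(w)}{\sum_{w'} \max_{[w',s',e'] \in E} \left( p(x_{s'}^{e'} \mid w') \right) p(w')}$$
E: the set of the alternate words which realize the time and length constraints, given a relaxation rate.
[Figure: several alternative segmentations of the word 今天 (today), from which \max_{[w,s',e']} p(x_{s'}^{e'} \mid w) is taken]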
29
Posterior probability based confidence measure (8/11)
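• Local word confidence measure (cont)
$$C([w,s,e]) = \frac{\max_{[w,s',e'] \in E} \left( p(x_{s'}^{e'} \mid w) \right) p(w)}{\sum_{w'} \max_{[w',s',e'] \in E} \left( p(x_{s'}^{e'} \mid w') \right) p(w')}$$
Bigram applied, with w_h the preceding word:
$$C([w,s,e]) = \frac{\max_{[w,s',e'] \in E} \left( p(x_{s'}^{e'} \mid w) \right) p(w \mid w_h)}{\sum_{w'} \max_{[w',s',e'] \in E} \left( p(x_{s'}^{e'} \mid w') \right) p(w' \mid w_h)}$$
Forward/backward bigram applied, with w_f the following word:
$$C([w,s,e]) = \frac{\max_{[w,s',e'] \in E} \left( p(x_{s'}^{e'} \mid w) \right) \{ p(w \mid w_h)\, p(w \mid w_f) \}}{\sum_{w'} \max_{[w',s',e'] \in E} \left( p(x_{s'}^{e'} \mid w') \right) \{ p(w' \mid w_h)\, p(w' \mid w_f) \}}$$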
30
Posterior probability based confidence measure (9/11)
• Impact of word graph density on the quality of posterior probability [9]
Baseline 27.3 15.4
$$\mathrm{WGD} = \frac{\text{total number of word arcs}}{\text{the number of spoken words}}$$
31
Posterior probability based confidence measure (10/11)
• Experiments [6]
corpus baseline Cnormal Csec Cmed Cmax
ARISE 13.6 11.5 8.9 8.8 8.9
Verbmobil 27.3 23.3 19.0 20.0 18.9
NAB 20k 11.3 10.3 9.2 9.2 9.2
NAB 64k 9.2 8.4 7.2 7.2 7.2
Broadcast news 27.7 23.7 20.6 20.4 20.6
32
Explicit model based confidence measure (1/10)
• The CM problem is formulated as a statistical hypothesis testing problem.
• Under the framework of binary hypothesis testing, there are two complementary hypotheses
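• We test H_0 against H_1:
H_0 (null hypothesis): X is correctly recognized and truly comes from model W
H_1 (alternative hypothesis): X is wrongly recognized and is NOT from model W
Likelihood ratio testing (LRT):
$$LRT = \frac{P(X \mid H_0)}{P(X \mid H_1)} \;\underset{H_1}{\overset{H_0}{\gtrless}}\; \tau$$
where τ is the decision threshold.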
33
Explicit model based confidence measure (2/10)
• The above LRT score can be transformed to a CM based on a monotonic 1-1 mapping function.
• The major difficulty with LRT is how to model the alternative hypothesis.
• In practice, the same HMM structure is adopted to model the alternative hypothesis.
• A discriminative training procedure plays a crucial role in improving modeling performance.
34
Explicit model based confidence measure (3/10)
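• Two-pass procedure:
– Observation score: P(x | s^c)
– Transition score: p(s_j^c | s_i^c)
– LRT for the segment X_s^e recognized as 今天 (today):
$$LRT = \frac{P(X_s^e \mid \lambda^c)}{P(X_s^e \mid \lambda^a)}$$
λ^c: the correct model of 今天
λ^a: the anti-model of 今天
[Figure: the first pass decodes 今天 天氣 很好 with the correct models; the second pass rescores each word segment with the correct model and the anti-model]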
35
Explicit model based confidence measure (4/10)
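• One-pass procedure
– Observation score:
$$\frac{P(x \mid s^c)}{P(x \mid s^a)}$$
– Transition score:
$$\frac{p(s_j^c \mid s_i^c)}{p(s_j^a \mid s_i^a)}$$
– LRT for the segment X_s^e recognized as 今天 (today):
$$LRT = \frac{P(X_s^e \mid \lambda^c)}{P(X_s^e \mid \lambda^a)}$$
[Figure: decoding of 今天 天氣 很好 in a single pass, with each frame t aligned to a pair of states s_t^c and s_t^a from the correct model and the anti-model]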
36
Explicit model based confidence measure (5/10)
• How to calculate the confidence of a recognized word?
The unweighted unit level likelihood score for a subword unit u decoded over a segment of observation vectors, X_u, can be obtained as
$$\log LR(u) = \frac{1}{N_u} \log LR(X_u, \lambda_u^c, \lambda_u^a) = \frac{1}{N_u} \log \frac{P(X_u \mid \lambda_u^c)}{P(X_u \mid \lambda_u^a)}$$
where N_u is the number of frames in the decoded segment.
To limit the dynamic range, the subword confidence measure is manipulated by a sigmoid function
$$U(u) = \frac{1}{1 + \exp(-\gamma (\log LR(u) - \beta))}$$
where γ defines the slope of the sigmoid function and β is a shift.
37
Explicit model based confidence measure (6/10)
• How to calculate the confidence of a recognized word (cont)?
Word level confidence measures corresponding to the unweighted unit level likelihood ratio scores, LR(), and the sigmoid weighted unit level likelihood ratio scores, U(), are compared.
The following measures are defined for a word w_n composed of N_n subword units u_{n,i}, i = 1, …, N_n:
Arithmetic mean of the LR():
$$W_1(w_n) = \frac{1}{N_n} \sum_{i=1}^{N_n} LR(u_{n,i})$$
Geometric mean of the LR():
$$W_2(w_n) = \exp\left( \frac{1}{N_n} \sum_{i=1}^{N_n} \log LR(u_{n,i}) \right)$$
Arithmetic mean of the U():
$$W_3(w_n) = \frac{1}{N_n} \sum_{i=1}^{N_n} U(u_{n,i})$$
Geometric mean of the U():
$$W_4(w_n) = \exp\left( \frac{1}{N_n} \sum_{i=1}^{N_n} \log U(u_{n,i}) \right)$$
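The chain from unit-level likelihood ratios to the four word-level measures can be sketched as follows. The log-likelihoods, frame counts, and the sigmoid slope and shift values are hypothetical, and the sketch assumes the sigmoid is applied to the frame-normalized log likelihood ratio of each unit.

```python
import math


def unit_log_lr(log_p_correct, log_p_anti, n_frames):
    """Frame-normalized log likelihood ratio of one subword unit."""
    return (log_p_correct - log_p_anti) / n_frames


def unit_sigmoid(log_lr, slope=1.0, shift=0.0):
    """U(u): sigmoid of the log ratio; slope and shift values are assumptions."""
    return 1.0 / (1.0 + math.exp(-slope * (log_lr - shift)))


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    """exp of the mean log; requires strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical word of three units: (log P under correct model, under anti-model, frames).
units = [(-100.0, -110.0, 10), (-80.0, -80.0, 8), (-60.0, -54.0, 6)]
log_lrs = [unit_log_lr(c, a, n) for c, a, n in units]   # 1.0, 0.0, -1.0
lrs = [math.exp(x) for x in log_lrs]                     # LR(u) on the ratio scale
us = [unit_sigmoid(x) for x in log_lrs]                  # U(u) in (0, 1)

w1 = arithmetic_mean(lrs)   # arithmetic mean of LR()
w2 = geometric_mean(lrs)    # geometric mean of LR()
w3 = arithmetic_mean(us)    # arithmetic mean of U()
w4 = geometric_mean(us)     # geometric mean of U()
```

The geometric means are never larger than the arithmetic means, so they penalize a word more strongly when a single unit scores poorly.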
38
Explicit model based confidence measure (7/10)
• Discriminative training [10]
– The goal of the training procedure is to increase the average value of LR(X, λ^c, λ^a) for correct hypotheses and decrease the average value of LR(X, λ^c, λ^a) for false acceptances.
A frame based distance is defined for each state transition in the sequence obtained by the decoder:
$$r(x_t) = \log\left( a_{ij}^c\, b_j^c(x_t) \right) - \log\left( a_{ij}^a\, b_j^a(x_t) \right), \qquad q_{t-1} = i,\; q_t = j$$
The segment based distance is obtained by averaging the frame based distances as
$$R(X_u) = \frac{1}{t_{f_u} - t_{i_u} + 1} \sum_{t = t_{i_u}}^{t_{f_u}} r(x_t)$$
where t_{i_u} and t_{f_u} are the initial and final frame of the speech segment decoded as unit u, over the segment X_u = \{x_{t_{i_u}}, \ldots, x_{t_{f_u}}\}.
39
Explicit model based confidence measure (8/10)
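• Discriminative training (cont)
Define the cost function F(X_u, λ_u^c, λ_u^a) for unit u using a sigmoid function
$$F(X_u, \lambda_u^c, \lambda_u^a) = \frac{1}{1 + \exp\left( \gamma\, \delta(u)\, (R(X_u) - \beta) \right)}$$
where the indicator function δ(u) is defined as
$$\delta(u) = \begin{cases} 1, & \text{correct} \\ -1, & \text{imposter} \end{cases}$$
A gradient update is performed on the expected cost E\{F(X_u, λ_u^c, λ_u^a)\}, which is estimated over the N_u training segments of unit u as
$$E\{F(X_u, \lambda_u^c, \lambda_u^a)\} \approx \frac{1}{N_u} \sum_{i=1}^{N_u} F(X_{u_i}, \lambda_u^c, \lambda_u^a)$$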
40
Explicit model based confidence measure (9/10)
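Why does discriminative training work?
$$F(X_u, \lambda_u^c, \lambda_u^a) = \frac{1}{1 + \exp\left( \gamma\, \delta(u)\, (R(X_u) - \beta) \right)}$$
[Figure: the cost F(·) plotted against R(X_u)]
– If u is correct, F(·) decreases as R(X_u) increases.
– If u is an imposter, F(·) decreases as R(X_u) decreases.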
41
Explicit model based confidence measure (10/10)
• Experiments [10] • This task, referred to as the “movie locator”,
42
Incorporation of high-level information for CM
43
Incorporation of high-level information for CM (1/4)
• LSA
– The key property of LSA is that words whose vectors are close to each other are semantically similar words.
– These similarities can be used to provide an estimate of the likelihood of the words co-occurring within the same utterance.
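[Figure: decomposition of the word-document matrix A, with rows w_1, w_2, …, w_m and columns d_1, d_2, …, d_n, into U (rows w_1, w_2, …, w_m) and V^T (columns d_1, d_2, …, d_n)]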
44
Incorporation of high-level information for CM (2/4)
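• LSA (cont)
– The entry of matrix A:
$$a_{ij} = (1 - E_i) \log\left( 1 + \frac{c_{ij}}{n_j} \right)$$
$$E_i = -\frac{1}{\log_2 N} \sum_{j=1}^{N} f_{ij} \log_2 f_{ij}, \qquad f_{ij} = \frac{c_{ij}}{t_i}$$
c_{ij}: the count of term i in document j
n_j: the size of document j
t_i: the count of term i in all documents
– The confidence of a recognized word w_i:
$$\frac{1}{N} \sum_{j=1}^{N} \mathrm{Cosine}(U(w_i), U(w_j))$$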
45
Incorporation of high-level information for CM (3/4)
• Inter-word mutual information:
Assume N(w_1, w_2) denotes the co-occurrence times of word w_1 and word w_2 in the training documents; the joint probability P(w_1, w_2) is:
$$P(w_1, w_2) = \frac{N(w_1, w_2)}{\sum_{w_1, w_2} N(w_1, w_2)}$$
Mutual information between any two words w_1 and w_2 can be calculated as
$$MI(w_1, w_2) = \log\left( \frac{P(w_1, w_2)}{p(w_1)\, p(w_2)} \right)$$
The CM of each recognized word is calculated as the average mutual information of this word with the remaining recognized words.
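A toy version of the mutual information CM is sketched below. Several choices here are assumptions made for illustration: co-occurrence is counted once per document for each ordered word pair, the marginals p(w) are derived from the pair counts, pairs never seen together receive a hypothetical floor value `unseen_mi`, and the recognized utterance has at least two words.

```python
import math
from collections import Counter
from itertools import permutations


def cooccurrence_counts(documents):
    """N(w1, w2): number of documents in which both words occur (ordered pairs)."""
    pair_counts = Counter()
    for doc in documents:
        for w1, w2 in permutations(set(doc), 2):
            pair_counts[(w1, w2)] += 1
    return pair_counts


def mutual_information(pair_counts, w1, w2, unseen_mi=-5.0):
    """log(P(w1, w2) / (p(w1) p(w2))); unseen pairs get the assumed floor."""
    total = sum(pair_counts.values())
    joint = pair_counts.get((w1, w2), 0) / total
    if joint == 0.0:
        return unseen_mi
    p1 = sum(c for (a, _), c in pair_counts.items() if a == w1) / total
    p2 = sum(c for (a, _), c in pair_counts.items() if a == w2) / total
    return math.log(joint / (p1 * p2))


def mi_confidence(pair_counts, words):
    """CM of each word: average MI with the remaining recognized words."""
    scores = []
    for i, w in enumerate(words):
        others = words[:i] + words[i + 1:]
        scores.append(sum(mutual_information(pair_counts, w, o) for o in others) / len(others))
    return scores


# Hypothetical training documents.
documents = [
    ["taipei", "train", "ticket"],
    ["taipei", "train"],
    ["fish", "basket"],
    ["train", "ticket"],
]
counts = cooccurrence_counts(documents)
good = mi_confidence(counts, ["taipei", "train", "ticket"])
bad = mi_confidence(counts, ["taipei", "train", "basket"])
```

The word "basket" never co-occurs with the other recognized words, so it receives the floor score and stands out as a likely misrecognition.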
46
Incorporation of high-level information for CM (4/4)
• Experiments [14]
CM Switchboard Mandarin dictation
LSA 44.7 38.5
MI 41.0 33.7
Cmed 24.4 17.5
N-best count 28.3 21.1
MI+Cmed 23.9 16.2
47
The application of CM to improve speech recognition
48
The application of CM to improve speech recognition (1/10)
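• Statistical decision theory aims at minimizing the expected cost of making an error
$$\left( [w_n,s_n,e_n]_1^N \right)^{*} = \arg\max_{[w_n,s_n,e_n]_1^N} P([w_n,s_n,e_n]_1^N \mid X)$$
[Figure: word graph of competing word arcs (靜音, 結果, 建國, 有, 由, 又, 三名, 候選人, 沒有, 通過) with the maximum posterior path highlighted]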
49
The application of CM to improve speech recognition (2/10)
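• Method 1 [16]:
$$p([w_n,s_n,e_n]_1^N \mid X) = p([w_n,s_n,e_n]_1^N \mid x_1^T) = \prod_{n=1}^{N} p([w_n,s_n,e_n] \mid [w_k,s_k,e_k]_1^{n-1}, x_1^T) \approx \prod_{n=1}^{N} p([w_n,s_n,e_n] \mid x_1^T)$$
$$\left( [w_n,s_n,e_n]_1^N \right)^{*} = \arg\max_{[w_n,s_n,e_n]_1^N} P([w_n,s_n,e_n]_1^N \mid x_1^T)$$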
50
The application of CM to improve speech recognition (3/10)
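• Method 2 [18]:
$$\left( [w_n,s_n,e_n]_1^N \right)^{*} = \arg\min_{[w_n,s_n,e_n]_1^N} \mathrm{WER}([w_n,s_n,e_n]_1^N \mid X)$$
$$\mathrm{WER}([w_n,s_n,e_n]_1^N \mid X) = 1.0 - \frac{1}{N} \sum_{n=1}^{N} P(w_n\ \text{correct})\, P(w_n \mid X)$$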
51
The application of CM to improve speech recognition (4/10)
• Method 3 (Time Frame Error decoding) [17]:
– In minimum Bayes risk decoding
$$\left( [w_n,s_n,e_n]_1^N \right)^{*} = \arg\min_{[w_n,s_n,e_n]_1^N} \sum_{[v_m,s_m,e_m]_1^M} C([w_n,s_n,e_n]_1^N, [v_m,s_m,e_m]_1^M)\, P([v_m,s_m,e_m]_1^M \mid X)$$
– if
$$C([w_n,s_n,e_n]_1^N, [v_m,s_m,e_m]_1^M) = \begin{cases} 0, & [w_n,s_n,e_n]_1^N = [v_m,s_m,e_m]_1^M \\ 1, & [w_n,s_n,e_n]_1^N \ne [v_m,s_m,e_m]_1^M \end{cases}$$
52
The application of CM to improve speech recognition (5/10)
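• Method 3 (cont)
$$\begin{aligned} \left( [w_n,s_n,e_n]_1^N \right)^{*} &= \arg\min_{[w_n,s_n,e_n]_1^N} \sum_{[v_m,s_m,e_m]_1^M} C([w_n,s_n,e_n]_1^N, [v_m,s_m,e_m]_1^M)\, P([v_m,s_m,e_m]_1^M \mid X) \\ &= \arg\min_{[w_n,s_n,e_n]_1^N} \Big( 1 - \sum_{[v_m,s_m,e_m]_1^M = [w_n,s_n,e_n]_1^N} P([v_m,s_m,e_m]_1^M \mid X) \Big) \\ &= \arg\min_{[w_n,s_n,e_n]_1^N} \left( 1 - P([w_n,s_n,e_n]_1^N \mid X) \right) \\ &= \arg\max_{[w_n,s_n,e_n]_1^N} P([w_n,s_n,e_n]_1^N \mid X) \end{aligned}$$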
53
The application of CM to improve speech recognition (6/10)
• Method 3 (cont)
– We are now faced with a conceptual mismatch between the decision rule and the evaluation criterion for speech recognizers, the word error rate.
– The easiest way to overcome this mismatch is to use the same cost function as the evaluation: the Levenshtein distance.
– In (Stolcke et al., 1997), the pairwise alignment is restricted to the N-best list.
– Let us assume that substitutions were the only type of error.
• A dynamic programming alignment would thus not be necessary.
54
The application of CM to improve speech recognition (7/10)
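• Method 3 (cont)
$$C([w_n,s_n,e_n]_1^N, [v_m,s_m,e_m]_1^M) = \sum_{n=1}^{N} \frac{1}{1 + \alpha (e_n - s_n)} \sum_{t=s_n}^{e_n} \left( 1 - \delta(w_n, v_t) \right)$$
v_t: the word identity of the word hypothesis in word sequence [v_m, s_m, e_m]_1^M which intersects time frame t
If α = 1, the maximum cost of every word is approximately 1.
[Figure: two hypothesized word sequences, 今天 天氣 很好 (three words) and 今天 天氣 (two words), whose maximum costs are about 3 and about 2]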
55
The application of CM to improve speech recognition (8/10)
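• Method 3 (cont)
$$\begin{aligned} \left( [w_n,s_n,e_n]_1^N \right)^{*} &= \arg\min_{[w_n,s_n,e_n]_1^N} \sum_{[v_m,s_m,e_m]_1^M} C([w_n,s_n,e_n]_1^N, [v_m,s_m,e_m]_1^M)\, P([v_m,s_m,e_m]_1^M \mid X) \\ &= \arg\min_{[w_n,s_n,e_n]_1^N} \sum_{[v_m,s_m,e_m]_1^M} \sum_{n=1}^{N} \sum_{t=s_n}^{e_n} \frac{1 - \delta(w_n, v_t)}{1 + \alpha (e_n - s_n)}\, P([v_m,s_m,e_m]_1^M \mid X) \\ &= \arg\min_{[w_n,s_n,e_n]_1^N} \sum_{n=1}^{N} \sum_{t=s_n}^{e_n} \frac{\sum_{[v_m,s_m,e_m]_1^M} P([v_m,s_m,e_m]_1^M \mid X) - \sum_{[v_m,s_m,e_m]_1^M} \delta(w_n, v_t)\, P([v_m,s_m,e_m]_1^M \mid X)}{1 + \alpha (e_n - s_n)} \\ &= \arg\min_{[w_n,s_n,e_n]_1^N} \sum_{n=1}^{N} \sum_{t=s_n}^{e_n} \frac{1 - \sum_{[v_m,s_m,e_m]_1^M} \delta(w_n, v_t)\, P([v_m,s_m,e_m]_1^M \mid X)}{1 + \alpha (e_n - s_n)} \end{aligned}$$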
56
The application of CM to improve speech recognition (9/10)
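• Method 3 (cont)
$$\begin{aligned} \sum_{[v_m,s_m,e_m]_1^M} \delta(w_n, v_t)\, P([v_m,s_m,e_m]_1^M \mid X) &= \sum_{[v_m,s_m,e_m]_1^M} \; \sum_{m:\, s_m \le t \le e_m} \delta(w_n, v_m)\, P([v_m,s_m,e_m]_1^M \mid X) \\ &= \sum_{[v_m,s_m,e_m]_1^M :\; \exists m,\; v_m = w_n,\; s_m \le t \le e_m} P([v_m,s_m,e_m]_1^M \mid X) \\ &= p(w_n \mid t, X) \end{aligned}$$
The cost of word w_n therefore becomes
$$\sum_{t=s_n}^{e_n} \frac{1 - p(w_n \mid t, X)}{1 + \alpha (e_n - s_n)}$$
which can be interpreted as the normalized probability of the word w_n being incorrect.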
57
The application of CM to improve speech recognition (10/10)
• Experiments (Method 3)
corpus baseline Time frame error
ARISE 15.8 15.0
Verbmobil 33.6 32.5
NAB 20k 13.2 12.9
NAB 64k 11.1 10.8
broadcast news 33.3 32.3
58
Summary
• Almost all CMs rely almost entirely on a single information source: how much the underlying decision can overtake other possible competitors.
• We believe it is critical to improve the performance of CMs by
– Taking the segmentation issue into account.
– Deciding a dynamic threshold for different words.
59
Reference (1/4)
• Main reference
– [1] H. Jiang, “Confidence Measures for Speech Recognition: A Survey”, Speech Communication, 2005.
• Feature based confidence measure
– [2] S. Cox and R. Rose, “Confidence Measures for the Switchboard Database”, ICASSP 1996.
– [3] T. Schaaf and T. Kemp, “Confidence Measures for Spontaneous Speech Recognition”, ICASSP 1997.
– [4] A. Sanchis, A. Juan and E. Vidal, “New Features based on Multiple Word Graphs for Utterance Verification”, ICSLP 2003.
– [5] R. Zhang and A. I. Rudnicky, “Word Level Confidence Annotation Using Combinations of Features”, EuroSpeech 2001.
• Posterior based confidence measure
– [6] F. Wessel, R. Schluter, K. Macherey and H. Ney, “Confidence Measures for Large Vocabulary Continuous Speech Recognition”, IEEE SAP 2001.
– [7] F. K. Soong and W. K. Lo, “Generalized Posterior Probability for Minimum Error Verification of Recognized Sentences”, ICASSP 2005.
60
Reference (2/4)
• Posterior based confidence measure
– [8] J. Razik, O. Mella, D. Fohr and J.-P. Haton, “Local Word Confidence Measure Using Word Graph and N-Best List”.
– [9] T. Fabian, R. Lieb, G. Ruske and M. Thomae, “Impact of Word Graph Density on the Quality of Posterior Probability Based Confidence Measures”.
• Explicit model based confidence measure
– [10] E. Lleida and R. C. Rose, “Utterance Verification in Continuous Speech Recognition: Decoding and Training Procedures”, IEEE SAP 2000.
– [11] M. G. Rahim and C.-H. Lee, “String-based Minimum Verification Error (SB-MVE) Training for Speech Recognition”, Computer Speech and Language, 1997.
– [12] H. Jiang, F. K. Soong and C.-H. Lee, “A Dynamic In-Search Data Selection Method With Its Applications to Acoustic Modeling and Utterance Verification”, IEEE SAP 2005.
61
Reference (3/4)
• Incorporation of high-level information for CM
– [13] R. Sarikaya, Y. Gao, M. Picheny and H. Erdogan, “Semantic Confidence Measurement for Spoken Dialog Systems”, IEEE SAP 2005.
– [14] G. Guo, C. Huang, H. Jiang and R.-H. Wang, “A Comparative Study on Various Confidence Measures in Large Vocabulary Speech Recognition”, ISCSLP 2004.
• Some applications of CM
– [15] M. Afify, F. Liu, H. Jiang and O. Siohan, “A New Verification-based Fast-Match for Large Vocabulary Continuous Speech Recognition”, IEEE SAP 2005.
– [16] F. Wessel, R. Schluter and H. Ney, “Using Posterior Word Probabilities for Improved Speech Recognition”, ICASSP 2000.
– [17] F. Wessel, R. Schluter and H. Ney, “Explicit Word Error Minimization Using Word Hypothesis Posterior Probabilities”, ICASSP 2001.
62
Reference (4/4)
• Some applications of CM
– [18] A. Kobayashi, K. Onoe, S. Sato and T. Imai, “Word Error Minimization Using an Integrate Confidence Measure”, INTERSPEECH 2005.
– [19] Y. Qian, T. Lee and F. K. Soong, “Tone Information as a Confidence Measure for Improving Cantonese LVCSR”, ICSLP 2004.