NLPRS2001
1
Grapheme-to-Phoneme for Thai
Pongthai Tarsaku
Content
- Introduction
- Grapheme-to-Phoneme in TTS system
- Problems in Thai
- PGLR Approach
- Experiment & Results & Discussion
- Conclusion
Introduction
- Grapheme-to-Phoneme (G2P) is a module in a TTS system.
- Grapheme-to-Phoneme approaches:
  - Dictionary-based.
  - Rule-based.
  - Statistical.
- The Probabilistic Generalized LR (PGLR) parser is a statistical approach.
G2P in TTS system
Text: ผมขอขอบคุณทุกท่านที่มาเยี่ยมชมงาน
-> Text Segmentation: the same sentence with word boundaries marked by "/"
-> Grapheme-to-Phoneme: /phom4/kh@:4/kh@:p1/khun0/thuk3/tha:n2/thi:2/ma:0/ji:am2/chom0/nga:n0/
-> Prosody Generation
-> Speech Signal Synthesis
-> Speech Waveform
Problems in Thai (1)
- Ambiguity in grapheme-phoneme mapping: “มณฑา” is pronounced /mon0/tha:0/, but “มณฑป” is pronounced /mon0/dop1/.
- Homographs: “เพลา” (axle) is pronounced /phlaw0/, while “เพลา” (time) is pronounced /phe:0/la:0/.
- Vowel length: “น้ำ” is phonologically /nam3/ but is usually pronounced /na:m3/.
Problems in Thai (2)
- Linking syllable pronunciation: “วิท” in “วิทยา” is pronounced /wit2/tha2/.
- Ambiguity in consonantal functionality: “อัฐิ” is pronounced /?at1/thi1/.
- Word boundary: “ตากลม” can be segmented into “ตา|กลม” (round eye) or “ตาก|ลม” (to expose to the wind), which are pronounced /ta:0/klom0/ and /ta:k1/lom0/ respectively.
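The word-boundary problem can be made concrete with a small sketch. It assumes a toy lexicon containing only the four words from the example (the name TOY_LEXICON and the function are illustrative, not part of the system described here) and enumerates every way to split the string into lexicon words.

```python
def segmentations(text, lexicon):
    """Return every way to split `text` into words found in `lexicon`."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in lexicon:
            # Keep this word and segment the remainder recursively.
            for rest in segmentations(text[end:], lexicon):
                results.append([head] + rest)
    return results


# Toy lexicon (assumption): only the four words used in the example.
TOY_LEXICON = {"ตา", "กลม", "ตาก", "ลม"}

candidates = segmentations("ตากลม", TOY_LEXICON)
for words in candidates:
    print("|".join(words))
```

Both segmentations are valid dictionary splits, so the choice between /ta:0/klom0/ and /ta:k1/lom0/ cannot be made from the spelling alone.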
PGLR Approach
- PGLR: Probabilistic Generalized LR parsing.
- PGLR has an advantage in context sensitivity.
- PGLR is able to capture two levels of context:
  - Global context: over structures from the CFG rules.
  - Local n-gram context.
Context-Free Grammar Rules
- CFG rules are prepared for Thai syllable construction.
- The CFG rules are grouped by Thai vowel unit (21 groups and 3 special groups).
- The CFG rules are able to cover both monosyllabic and polysyllabic words.
Example rules:
<Grp_1> -> <init> า | <init> า <final>
<Grp_2> -> เ <init> | เ <init> <final>
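As a rough illustration of how the two example groups constrain a written syllable, the sketch below checks a string against the four right-hand sides shown. It assumes that <init> and <final> each stand for a single Thai consonant letter, which is a simplification made here. The function name vowel_group is hypothetical.

```python
# Assumption: <init> and <final> are each one Thai consonant letter.
CONSONANTS = set("กขฃคฅฆงจฉชซฌญฎฏฐฑฒณดตถทธนบปผฝพฟภมยรลวศษสหฬอฮ")


def vowel_group(syllable):
    """Return 'Grp_1', 'Grp_2', or None for a written syllable."""
    chars = list(syllable)
    if len(chars) not in (2, 3):
        return None
    # An optional third character must be a consonant (<final>).
    final_ok = len(chars) == 2 or chars[2] in CONSONANTS
    # <Grp_1>: <init> า  or  <init> า <final>
    if chars[0] in CONSONANTS and chars[1] == "า" and final_ok:
        return "Grp_1"
    # <Grp_2>: เ <init>  or  เ <init> <final>
    if chars[0] == "เ" and chars[1] in CONSONANTS and final_ok:
        return "Grp_2"
    return None


print(vowel_group("มา"), vowel_group("งาน"), vowel_group("เท"))
```

A real grammar would also need rules for consonant clusters, tone marks and the remaining vowel groups. This sketch covers only the two groups shown.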
Thai Grapheme-to-Phoneme system
Input: สมชาย
-> PGLR parser (uses the PGLR table, built from the CFG rules): candidate parse trees rooted at W with Syl nodes over ส ม ช า ย, e.g. one with p = 0.3 and one with p = 0.7
-> Most probable parse tree: the tree with p = 0.7
-> G-P Mapping (uses the G-P table): /som/chaj/
-> Toneme Generation: /som4/chaj0/
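The selection of the most probable parse tree can be sketched as follows. In the usual formulation of PGLR, a tree's probability is the product of the probabilities of the LR actions (shifts and reductions) used to build it. The candidate names and action probabilities below are hypothetical values chosen for illustration, not figures from this system.

```python
import math


def tree_probability(action_probs):
    """Product of per-action probabilities, computed in log space."""
    return math.exp(sum(math.log(p) for p in action_probs))


# Hypothetical candidates: name -> probabilities of the LR actions used.
candidates = {
    "tree_a": [0.9, 0.5, 0.8],
    "tree_b": [0.9, 0.9, 0.8],
}

scores = {name: tree_probability(ps) for name, ps in candidates.items()}
best = max(scores, key=scores.get)  # candidate with the highest probability
print(best, round(scores[best], 3))
```

Summing logarithms instead of multiplying raw probabilities avoids underflow when a parse involves many actions.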
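Grapheme-Phoneme Mapping
Examples (each word W is parsed into typed syllable nodes, Syl typeA to Syl typeE in the figure, and each grapheme is mapped to a phoneme):
- ก า ร -> k a: n
- เ ช ษ ฐ า -> ch e: t th a:
- ส ม พ ร -> s o m ph @: n (the vowels o and @: have no grapheme of their own)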
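A G-P table can be pictured as a lookup keyed on a grapheme together with the role the parse tree assigns to it. The sketch below is a minimal illustration under that assumption. The role labels ("init", "vowel", "final") and the table layout are hypothetical, and only the entries for การ come from the example above.

```python
# Hypothetical G-P table: (grapheme, role in the syllable) -> phoneme.
GP_TABLE = {
    ("ก", "init"): "k",
    ("า", "vowel"): "a:",
    ("ร", "final"): "n",   # ร in final position sounds as /n/
    ("ร", "init"): "r",    # illustrative contrast: ร as an initial
}


def map_syllable(tagged_graphemes):
    """Map (grapheme, role) pairs from a parse tree to phonemes."""
    return [GP_TABLE[pair] for pair in tagged_graphemes]


kan = map_syllable([("ก", "init"), ("า", "vowel"), ("ร", "final")])
print(" ".join(kan))
```

Keying on the role as well as the grapheme is what lets the same letter map to different phonemes, which is why the mapping step follows parsing.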
Experiment Database
- LEXiTRON, the Thai electronic dictionary, is used for training and testing: ~23,000 Thai words with pronunciation.
- Training: four-fifths of the database.
- Testing: one-fifth of the database.
- The system is tested against the rule-based [Wiboon, 1999] and the decision-tree-based [Chotimongkol, 2000] systems.
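A four-fifths / one-fifth split of a word list can be sketched as below. How the entries were actually assigned to each part is not stated, so the seeded random shuffle, the function name and the placeholder entries are assumptions made for illustration.

```python
import random


def split_four_fifths(entries, seed=0):
    """Shuffle reproducibly, then cut at four-fifths of the list."""
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]


# Placeholder entries: integers stand in for ~23,000 dictionary words.
train, test = split_four_fifths(range(23000))
print(len(train), len(test))
```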
Result
Conversion (word) accuracy (%)
Model           Exact match   Ignoring vowel length
PGLR            72.87         90.44
Rule-based      67.14         83.81
Decision tree   68.76         86.94
Discussion
- The vowel-length problem is dominant (90.44 -> 72.87).
- Half of all errors (~5%) come from the linking-syllable problem.
- To improve accuracy, more training data is required.
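The figures in this discussion can be checked with a few lines of arithmetic. The sketch reads "half of all errors (~5%)" as half of the errors that remain once vowel length is ignored. That reading is an assumption, although it is consistent with the reported PGLR accuracies.

```python
exact_match = 72.87        # PGLR, exact match (%)
ignoring_length = 90.44    # PGLR, ignoring vowel length (%)

# Accuracy lost to vowel-length errors alone, in percentage points.
vowel_length_loss = round(ignoring_length - exact_match, 2)

# Errors that remain even when vowel length is ignored.
remaining_errors = round(100.0 - ignoring_length, 2)

# Assumed reading: half of those remaining errors are linking-syllable errors.
linking_errors = round(remaining_errors / 2, 2)

print(vowel_length_loss, remaining_errors, linking_errors)
```

Under this reading, vowel length accounts for about 17.6 points of the gap, and linking syllables for roughly 4.8 points, in line with the "~5%" quoted above.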
Conclusion
- The PGLR approach has an advantage in context sensitivity (both global and local context).
- The efficiency of the PGLR parser depends on carefully written CFG rules.
- This approach can be applied in a syllable segmentation framework or a soundex conversion framework.
Thank you
Tone in Thai
- There are 5 tone levels (tonemes) in Thai:
  - mid: 0
  - low: 1
  - falling: 2
  - high: 3
  - rising: 4
- The toneme depends on the consonant class, the syllable type and the tone marker.
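The transcriptions in these slides write the toneme as a digit at the end of each syllable, as in /phom4/. A minimal sketch of reading that notation, assuming every syllable ends in exactly one tone digit (the function name is illustrative):

```python
TONE_NAMES = {0: "mid", 1: "low", 2: "falling", 3: "high", 4: "rising"}


def parse_transcription(text):
    """Split '/syl1/syl2/' into (phonemes, tone name) pairs."""
    syllables = [s for s in text.split("/") if s]
    # The last character of each syllable is its toneme digit.
    return [(s[:-1], TONE_NAMES[int(s[-1])]) for s in syllables]


parsed = parse_transcription("/phom4/kh@:4/kh@:p1/")
print(parsed)
```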
Tonemic Generation
Tone Markers and Tonemes
Consonant class   Syllable type         unmarked   -่   -้   -๊   -๋
High              Live syllable         4          1    2    3    4
High              Dead syllable         1          1    2    3    4
Middle            Live syllable         0          1    2    3    4
Middle            Dead syllable         1          1    2    3    4
Low               Live syllable         0          2    3    3    4
Low               Dead short syllable   3          2    2    3    4
Low               Dead long syllable    2          1    3    3    4

Consonant classes:
- High class: ขฃฉฐถผฝศษสห
- Middle class: กจฎฏดตบปอ
- Low class: คฅฆงชซฌญฑฒณทธนพฟภมยรลวฬฮ

Syllable types:
- Live syllable:
  - the final consonant is n, ng, m, or j.
  - long vowel with no final consonant (z).
- Dead short syllable:
  - short vowel with final consonant k, t, or p.
  - short vowel with no final consonant (z).
- Dead long syllable:
  - long vowel with final consonant k, t, or p.
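The table can be read as a direct lookup from consonant class, syllable type and tone marker to a toneme. The sketch below encodes it that way. The function names and the string labels for syllable types are illustrative choices, and the syllable type is taken as given rather than derived from the spelling.

```python
HIGH = set("ขฃฉฐถผฝศษสห")
MIDDLE = set("กจฎฏดตบปอ")
LOW = set("คฅฆงชซฌญฑฒณทธนพฟภมยรลวฬฮ")

# Column order: unmarked, mai ek, mai tho, mai tri, mai chattawa.
MARKERS = ["", "\u0e48", "\u0e49", "\u0e4a", "\u0e4b"]

TONE_TABLE = {
    ("high", "live"): (4, 1, 2, 3, 4),
    ("high", "dead"): (1, 1, 2, 3, 4),
    ("middle", "live"): (0, 1, 2, 3, 4),
    ("middle", "dead"): (1, 1, 2, 3, 4),
    ("low", "live"): (0, 2, 3, 3, 4),
    ("low", "dead_short"): (3, 2, 2, 3, 4),
    ("low", "dead_long"): (2, 1, 3, 3, 4),
}


def consonant_class(ch):
    """Return the class of an initial consonant letter."""
    if ch in HIGH:
        return "high"
    if ch in MIDDLE:
        return "middle"
    if ch in LOW:
        return "low"
    raise ValueError(f"not a Thai consonant: {ch!r}")


def toneme(initial, syllable_type, marker=""):
    """Look up the toneme (0-4) for one syllable."""
    cls = consonant_class(initial)
    # Only the low class distinguishes dead short from dead long.
    if cls != "low" and syllable_type.startswith("dead"):
        syllable_type = "dead"
    return TONE_TABLE[(cls, syllable_type)][MARKERS.index(marker)]


print(toneme("ผ", "live"), toneme("ท", "dead_short"), toneme("ท", "live", "\u0e48"))
```

The three calls correspond to ผม, ทุก and ท่าน, and the results agree with the tonemes in the transcription /phom4/ ... /thuk3/tha:n2/ shown earlier.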
Parse Tree Selection
Input: สมชาย
-> GLR parser (uses the GLR table, built from the CFG rules): produces parse tree[i], a tree rooted at W with Syl nodes over ส ม ช า ย
-> G-P Mapping (uses the G-P table)
-> Toneme Generation
-> Phoneme Comparison
   - mismatch: increase i and try the next parse tree
   - match: the selected parse tree is used for training
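The selection loop can be sketched as follows. It assumes that the comparison is made against a reference pronunciation such as the dictionary entry. The candidate trees, the toy pronunciation table and the function names are all hypothetical stand-ins for the GLR parser output and the G-P mapping and toneme generation steps.

```python
def select_parse_tree(parse_trees, to_phonemes, reference):
    """Return (i, tree) for the first tree whose phonemes match, else None."""
    for i, tree in enumerate(parse_trees):
        if to_phonemes(tree) == reference:
            return i, tree  # match: this tree is kept for training
        # mismatch: continue with parse tree[i + 1]
    return None


# Hypothetical stand-in for G-P mapping plus toneme generation.
TOY_PRONUNCIATION = {"สม": "som4", "ชาย": "chaj0"}


def toy_to_phonemes(tree):
    """Join per-syllable phonemes; unknown chunks become '?'."""
    return "/" + "/".join(TOY_PRONUNCIATION.get(s, "?") for s in tree) + "/"


# Hypothetical candidate trees, written as tuples of syllable chunks.
trees = [("สมช", "าย"), ("สม", "ชาย")]
selected = select_parse_tree(trees, toy_to_phonemes, "/som4/chaj0/")
print(selected)
```

Only the tree that reproduces the reference pronunciation contributes to training, so wrong but grammatical parses do not influence the learned probabilities.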