Date post: 18-Jan-2016
Page 1: Letter to Phoneme Alignment Using Graphical Models N. Bolandzadeh, R. Rabbany Dept of Computing Science University of Alberta 1 1.

Letter to Phoneme Alignment

Using Graphical Models

N. Bolandzadeh, R. Rabbany

Dept of Computing Science
University of Alberta

Page 2

Text to Speech Problem

Conversion of Text to Speech (TTS)

◦ Automated Telecom Services
◦ E-mail by Phone
◦ Banking Systems
◦ Handicapped People

Page 3

Pronunciation

Pronunciation of the words
◦ Dictionary Words
◦ Non-Dictionary Words

Phonetic analysis
◦ Dictionary lookup?
◦ Language is alive, new words are added
◦ Proper Nouns

Machine Learning: higher accuracy
◦ L2P alignment is needed

Page 4

Problem: Letter to Phoneme Alignment

◦ Letter: c a k e
◦ Phoneme: k ei k

L2P: Automatic Speech Recognition & Spelling Correction

Page 5

It's not Trivial! Why?

No Consistency
◦ City / s /
◦ Cake / k /
◦ Kid / k /

No Transparency
◦ K i d (3) / k i d / (3)
◦ S i x (3) / s i k s / (4)
◦ Q u e u e (5) / k j u: / (3)
◦ A x e (3) / a k s / (3)

Page 6

Framework

Dictionary entries (word, phonemes):

Brick        brIk
Brightening  br2tHIN
British      brItIS
Bronx        brQNks
Bugle        bjugP
Buoy         b4

Aligned entries (letters, phonemes):

b|r|i|ck|            b|r|I|k|
b|r|ig|ht|en|i|ng|   b|r|2|t|H|I|N|
b|r|i|t|i|sh|        b|r|I|t|I|S|
b|r|o|n|x|           b|r|Q|N|ks|
b|u|g|le|            b|ju|g|P|
bu|oy|               b|4|
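The aligned entries above can be read as two strings of "|"-terminated chunks that pair up position by position. A minimal Python sketch of that reading follows; the function name `parse_alignment` is hypothetical, and the assumption that every chunk ends with "|" is taken from the slide's notation rather than from any stated file format.

```python
from typing import List, Tuple


def parse_alignment(letters: str, phonemes: str) -> List[Tuple[str, str]]:
    """Pair the '|'-terminated chunks of an aligned entry, position by position."""

    def chunks(side: str) -> List[str]:
        parts = side.split("|")
        # A trailing '|' leaves one empty string at the end; drop only that one.
        if parts and parts[-1] == "":
            parts.pop()
        return parts

    letter_chunks = chunks(letters)
    phoneme_chunks = chunks(phonemes)
    if len(letter_chunks) != len(phoneme_chunks):
        raise ValueError("both sides must contain the same number of chunks")
    return list(zip(letter_chunks, phoneme_chunks))


# Entry taken from the slide: "brick" aligned with "brIk".
pairs = parse_alignment("b|r|i|ck|", "b|r|I|k|")
print(pairs)
```

Each resulting pair is one sub-alignment, for example the letter chunk "ck" mapped to the single phoneme "k".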

Page 7

Evaluation

◦ No Aligned Dictionary
◦ Unsupervised Learning
◦ Previously, the aligner was tied with a generator
◦ Evaluation on the percentage of correctly predicted phonemes and words
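The two percentages named above can be sketched as follows. The slides do not say how phonemes are matched, so this example assumes a position-by-position comparison against the reference pronunciation and counts a word as correct only when its whole phoneme sequence matches. The function name `accuracy` and the sample prediction are hypothetical.

```python
from typing import List, Sequence, Tuple


def accuracy(predicted: Sequence[List[str]],
             reference: Sequence[List[str]]) -> Tuple[float, float]:
    """Return (phoneme accuracy %, word accuracy %) over paired pronunciations."""
    total_phonemes = 0
    correct_phonemes = 0
    correct_words = 0
    for pred, ref in zip(predicted, reference):
        total_phonemes += len(ref)
        # Assumption: phonemes are compared position by position.
        correct_phonemes += sum(p == r for p, r in zip(pred, ref))
        # A word counts only if the full sequence is identical.
        correct_words += int(pred == ref)
    return (100.0 * correct_phonemes / total_phonemes,
            100.0 * correct_words / len(reference))


# References from the framework slide; the first prediction has one made-up error.
reference = [["b", "r", "I", "k"], ["b", "4"]]
predicted = [["b", "r", "i", "k"], ["b", "4"]]
phoneme_acc, word_acc = accuracy(predicted, reference)
print(phoneme_acc, word_acc)
```

Word accuracy is always the stricter number, since a single wrong phoneme makes the whole word count as wrong.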

Page 8

Model of our problem

$L = l_1 l_2 \ldots l_n, \quad P = p_1 p_2 \ldots p_m$

$A = a_1 a_2 \ldots a_k$

$a_i = (L_i, P_i), \quad |L_i|, |P_i| \le 2$

$A_{best} = \arg\max_A P(A \mid L, P)$

Example:

B | r | i | t | i | sh |
B | r | I | t | I | S |
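As a rough illustration of the argmax over alignments, the sketch below enumerates every way to cut a word and its phoneme string into paired chunks of length 1 or 2 and keeps the highest-scoring one. Several things here are assumptions and not taken from the slides: chunks are non-empty, each phoneme is a single character, an alignment is scored as a product of per-chunk probabilities, and the `toy_prob` table and `unseen` default are invented for the example.

```python
from math import prod
from typing import Dict, Iterator, List, Tuple

Pair = Tuple[str, str]


def alignments(letters: str, phonemes: str, max_len: int = 2) -> Iterator[List[Pair]]:
    """Yield every alignment whose chunks (L_i, P_i) have 1 <= length <= max_len."""
    if not letters and not phonemes:
        yield []
        return
    for i in range(1, min(max_len, len(letters)) + 1):
        for j in range(1, min(max_len, len(phonemes)) + 1):
            for rest in alignments(letters[i:], phonemes[j:], max_len):
                yield [(letters[:i], phonemes[:j])] + rest


def best_alignment(letters: str, phonemes: str,
                   prob: Dict[Pair, float], unseen: float = 1e-6) -> List[Pair]:
    """Return the alignment with the highest product of chunk probabilities."""
    return max(alignments(letters, phonemes),
               key=lambda a: prod(prob.get(pair, unseen) for pair in a))


# Hypothetical probabilities, chosen only to make the example concrete.
toy_prob = {("b", "b"): 0.9, ("r", "r"): 0.9, ("i", "I"): 0.8,
            ("t", "t"): 0.9, ("sh", "S"): 0.9}
best = best_alignment("british", "brItIS", toy_prob)
print(best)
```

Exhaustive enumeration is only practical for short words; it is used here because it maps directly onto the definition of $A_{best}$.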

Page 9

Static Model, Structure

Independent sub alignments

$P(A) = \prod_{i=1}^{k} P(a_i \mid L_i, P_i)$

Figure (graphical model): each sub-alignment node is linked to its own letters and phonemes.

a1: l1 l2, p1 p2
a2: l3 l4, p3 p4
...
ak: ln-1 ln, pm-1 pm

Page 10

Static Model, Learning

EM
◦ Initialize Parameters
◦ Expectation Step: Parameters → Alignments
◦ Maximization Step: Alignments → Parameters
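The EM loop above can be sketched in a few lines for a tiny dictionary. This is a simplified illustration, not the authors' implementation: it assumes a single multinomial over chunk pairs, non-empty chunks of at most two symbols, one character per phoneme, and exhaustive enumeration of alignments in the E-step. The names `alignments` and `em` are hypothetical.

```python
from collections import defaultdict
from math import prod
from typing import Dict, Iterator, List, Tuple

Pair = Tuple[str, str]


def alignments(letters: str, phonemes: str, max_len: int = 2) -> Iterator[List[Pair]]:
    """Yield every alignment with chunk lengths between 1 and max_len."""
    if not letters and not phonemes:
        yield []
        return
    for i in range(1, min(max_len, len(letters)) + 1):
        for j in range(1, min(max_len, len(phonemes)) + 1):
            for rest in alignments(letters[i:], phonemes[j:], max_len):
                yield [(letters[:i], phonemes[:j])] + rest


def em(dictionary: List[Tuple[str, str]], iterations: int = 10) -> Dict[Pair, float]:
    # Initialize parameters: every chunk pair starts with the same weight.
    prob: Dict[Pair, float] = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        counts: Dict[Pair, float] = defaultdict(float)
        # Expectation step: parameters -> posterior weights over alignments.
        for letters, phonemes in dictionary:
            candidates = list(alignments(letters, phonemes))
            scores = [prod(prob[pair] for pair in a) for a in candidates]
            total = sum(scores)
            if total == 0.0:
                continue  # no admissible alignment for this entry
            for alignment, score in zip(candidates, scores):
                for pair in alignment:
                    counts[pair] += score / total
        # Maximization step: expected counts -> renormalized parameters.
        norm = sum(counts.values())
        prob = defaultdict(float, {pair: c / norm for pair, c in counts.items()})
    return prob


# Entries from the framework slide.
dictionary = [("brick", "brIk"), ("british", "brItIS"), ("bronx", "brQNks"),
              ("bugle", "bjugP"), ("buoy", "b4"), ("brightening", "br2tHIN")]
prob = em(dictionary)
print(sorted(prob.items(), key=lambda kv: -kv[1])[:5])
```

The E-step turns the current parameters into a weighting over candidate alignments, and the M-step turns the resulting expected counts back into parameters, mirroring the two arrows on the slide.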

Page 11

Result of Static Model

Method         Letters   Words
Static Model   81.34%    43.5%

Page 12

Dynamic Model

◦ Sequence of data
◦ Unrolled model for T=3 slices

Figure (unrolled graphical model), one slice per sub-alignment:

a1: l1 l2, p1 p2
a2: l3 l4, p3 p4
ak: l5 l6, p5 p6

Page 13

Questions

