8/18/2019 Improvements in Speech Synthesis
[Figure: two panels, spectrum (dB, 0 to 80) versus frequency (Hz, 0 to 8000)]
[Figure, panels (a) to (c): level (dB, −24 to −8) versus basic frequency (Hz, 100 to 300)]
[Figure: two panels, level (dB, 0 to 60) and amplitude (−5000 to 10000) versus time (s, 0 to 1.2)]
[Figure: analysis block diagram. Blocks: pitch marking, harmonic/stochastic decomposition, PS-modulation, LPC analysis, PS-ABS, WSS discrete cepstrum. Parameter labels: T0, w0, An, P (npol), a (nlpc), dct, ncep]
[Figure: two signal panels, vertical range −4000 to 6000, horizontal range 0 to 450]
[Figure: two panels, magnitude (dB, −20 to 80) and phase (rad, −3 to 3) versus frequency (Hz, 0 to 8000)]
[Figure: three panels labelled AK, LSP, and DCT, level (dB, −20 to 40) versus frequency (kHz, 0 to 6)]
[Figure: two signal panels (vertical range −2 × 10^4 to 2 × 10^4) and two panels with vertical range 0 to 80]
[Figure, panel (a): sonagram, frequency (Hz, 0 to 8000) versus time (s, 0 to 1.2)]
[Figure, panel (b): sinusoidal sonagram, frequency (Hz, 0 to 8000) versus time (s, 0 to 1.2)]
[Figure, panels (c) to (e): three signal plots with axes scaled by 10^4]
[Figure: two signal panels, vertical range −2000 to 2000, horizontal range 0.505 to 0.525; the second panel carries a row of + markers]
[Figure: three panels, amplitude (−5000 to 5000) versus time (s, 0.1 to 1)]
[Figure: synthesis block diagram. Blocks: white noise generator, PS-modulation, LPC filter, adder (+), trilinear interpolation. Parameter labels (primed): T0, w0, An, P (npol), dct, ncep]
[Figure: block diagram. Blocks: analysis, prosodic deviations, covariation model, synthesis. Labels: original parametric representation, modified parametric representation, off-line]
[Figure: annotated plot for the word SENTIMENTALISER with an Hz axis (0 to 200); labels: pca, Syl, Inf, PHR, PRNCGN, seg, ech]
[Figure: three panels labelled Distortion (0 to 200), SSC output, and Target (both −1 × 10^4 to 1 × 10^4)]
[Figure: scatter plot, first component versus second component. Point labels: ORIGIN, FD, EM, AT, 30DB0, 20DB0, 10DB0, c1_0, c1_1, c2_0, c2_1, c3_0, c4_0, c4_1, VO, TDP, TDPICP0]
[Figure, panels (a) to (c): spectrograms, frequency (Hz, 0 to 8000) versus time (s, 0 to 0.6), with phonetic segment labels]
[Figure, panel (a): voiced residual signal versus samples (0 to 600), with the energy of the voiced residual samples]
[Figure, panel (b): unvoiced residual signal versus samples (0 to 600), with the energy of the unvoiced residual samples]
[Figure: signal versus time (s, 0.12 to 0.138)]
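[Figure: synthesis block diagram. An inventory (example units "bla", "blI") supplies the residual signal s_res(t), with excitation periods of 1/f0(t) taken from the f0 contour f0(t); an LPC filter produces the output signal s_out(t)]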
[Figure, panels (a) to (d): waveform and spectrogram (frequency in Hz, 0 to 8000) versus time (s, 0 to 0.16) for the segment sequence "@ d i t"]
Italian, Spanish, Portuguese
French
Aucune de ces raisons ne regardaient son épouse
[Figure: syllable-by-syllable plot of the sentence above]
Italian
Nessuna di queste ragioni riguardava la moglie.
[Figure: syllable-by-syllable plot of the sentence above]
Spanish
Ninguno de estos motivos concernía a su mujer
[Figure: syllable-by-syllable plot of the sentence above]
Portuguese
Nenhuma destas razões dizia respeito a sua mulher
[Figure: syllable-by-syllable plot of the sentence above]
Natural
Aucune de ces raisons ne regardaient son épouse
[Figure: syllable-by-syllable plot of the sentence above]
MONS
Aucune de ces raisons ne regardaient son épouse
[Figure: syllable-by-syllable plot of the sentence above]
Bell Labs
Aucune de ces raisons ne regardaient son épouse
[Figure: syllable-by-syllable plot of the sentence above]
Elan
Aucune de ces raisons ne regardaient son épouse
[Figure: syllable-by-syllable plot of the sentence above]
LATL
Aucune de ces raisons ne regardaient son épouse
[Figure: syllable-by-syllable plot of the sentence above]
LAIPTTS
Aucune de ces raisons ne regardaient son épouse
[Figure: syllable-by-syllable plot of the sentence above]
SYNTAIX
Aucune de ces raisons ne regardaient son épouse
[Figure: syllable-by-syllable plot of the sentence above]
L & H
Aucune de ces raisons ne regardaient son épouse
[Figure: syllable-by-syllable plot of the sentence above]
Natural
Un groupe de chercheurs allemands a résolu l'énigme.
[Figure: syllable-by-syllable plot of the sentence above]
MONS
Un groupe de chercheurs allemands a résolu l'énigme.
[Figure: syllable-by-syllable plot of the sentence above]
Bell Labs
Un groupe de chercheurs allemands a résolu l'énigme.
[Figure: syllable-by-syllable plot of the sentence above]
Elan
Un groupe de chercheurs allemands a résolu l'énigme.
[Figure: syllable-by-syllable plot of the sentence above]
LATL
Un groupe de chercheurs allemands a résolu l'énigme.
[Figure: syllable-by-syllable plot of the sentence above]
LAIPTTS
Un groupe de chercheurs allemands a résolu l'énigme.
[Figure: syllable-by-syllable plot of the sentence above]
SYNTAIX
Un groupe de chercheurs allemands a résolu l'énigme.
[Figure: syllable-by-syllable plot of the sentence above]
L & H
Un groupe de chercheurs allemands a résolu l'énigme.
[Figure: syllable-by-syllable plot of the sentence above]
Natural
Alcuni edifici si sono rivelati pericolosi
[Figure: syllable-by-syllable plot of the sentence above]
Bell Labs
Alcuni edifici si sono rivelati pericolosi
[Figure: syllable-by-syllable plot of the sentence above]
ELAN
Alcuni edifici si sono rivelati pericolosi
[Figure: syllable-by-syllable plot of the sentence above]
L & H
Alcuni edifici si sono rivelati pericolosi
[Figure: syllable-by-syllable plot of the sentence above]
Natural
Un grupo de investigadores alemanes ha resuelto l'enigma.
[Figure: syllable-by-syllable plot of the sentence above]
Elan
Un grupo de investigadores alemanes ha resuelto l'enigma.
[Figure: syllable-by-syllable plot of the sentence above]
L & H
Un grupo de investigadores alemanes ha resuelto l'enigma.
[Figure: syllable-by-syllable plot of the sentence above]
[Figure: standard deviation in % (0 to 30) by position of the word in the phrase (isolated, beginning, middle, end) and position of the tonic in the word (beginning, middle, end)]
[Figure: % of duration (0 to 450) for twelve cases, grouped as isolated word (1 to 3), word in the beginning (4 to 6), word in the middle (7 to 9), and word at the end (10 to 12); within each group the tonic is at the beginning, middle, or end of the word]
[Figure: level in dB (−10 to 40) for twelve cases, grouped as isolated word (1 to 3), word in the beginning (4 to 6), word in the middle (7 to 9), and word at the end (10 to 12); within each group the tonic is at the beginning, middle, or end of the word]