Modeling of Speech Signal for Analysis Purposes
Yannis Stylianou
Outline of the talk
Modeling
Synthesis: Jitter and Shimmer
Analysis: Jitter and Shimmer
Acknowledgments
References
Modeling of Speech Signal for Analysis Purposes
or Mathematical modeling of jitter and shimmer
Yannis Stylianou
University of Crete, Computer Science Dept., Multimedia Informatics Lab, [email protected]
Limsi, France, 2008 August 13th
Multimedia Informatics Lab
4 Professors:
  1 T. Mouchtaris (Audio and Speech Processing)
  2 Y. Stylianou (Speech and Signal Processing)
  3 P. Tsakalides (Signal Processing and Sensor Networks)
  4 G. Tziritas (Image and Video Processing)
3 postdocs, 8 Ph.D. students, and many M.Sc. students
Strong connections with a Research Center: FORTH.
My current research topics
Speech Processing
  - Voice Quality Assessment
  - Algorithms for Speech Pathology
  - Non-linear speech modeling and processing
  - Inverse Filtering
  - Voice Transformation
  - Multimodal User identification
Music Signal Processing
Marine Mammals Acoustics
Multimedia Informatics Lab
Selected Recent Projects:
  - FP6-IST NoE SIMILAR: Human-computer interaction similar to the way humans do it.
  - FP6-IST Strep PISTE: Personalized, Immersive Sports TV Experience
  - FP6-Marie Curie TOK: Collaborative Signal Processing for Efficient Wireless Sensor Networks
  - GSRT Wireless Sensor Networks: Theory and Applications in Structural Health Monitoring
  - GSRT AKMON: Advanced Algorithms for Voice Quality Assessment
  - GSRT TV++: Multimedia processing for Broadcast News
Industrial Partners: France Telecom, British Telecom, FORTH-Net
1 Modeling
2 Synthesis: Jitter and Shimmer
    Definitions and Estimators
    Mathematical Modeling of Jitter
    Mathematical Modeling of Shimmer
3 Analysis: Jitter and Shimmer
    Time-Frequency Representations
    Time-Frequency Analysis
    Modeling Jitter and Shimmer
4 Acknowledgments
5 References
Modeling
Modeling for ...
Coding
Modifications
Synthesis and Analysis
Synthesis: Jitter and Shimmer
Defining Jitter and Shimmer
Definition (Jitter)
Jitter is defined as perturbations of the glottal source signal that occur during vowel phonation and affect the glottal pitch period.
Definition (Shimmer)
Shimmer is defined as perturbations of the glottal source signal that occur during vowel phonation and affect the glottal energy.
Some Estimators ...
Let u[n] be a pitch period sequence of N samples. Absolute jitter:
\[
\frac{1}{N-1} \sum_{n=1}^{N-1} \left| u[n+1] - u[n] \right|
\]
Let u[n] be a peak amplitude sequence of N samples. Absolute shimmer:
\[
\frac{1}{N-1} \sum_{n=1}^{N-1} \left| u[n+1] - u[n] \right|
\]
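Both estimators are the same mean absolute first difference, applied to a different sequence. A minimal sketch in Python follows, assuming the per-cycle pitch periods and peak amplitudes have already been extracted by some other means; the function name and the sample values are hypothetical and only illustrate the formula.

```python
import numpy as np

def absolute_perturbation(u):
    """Mean absolute difference between consecutive values of u.

    With pitch periods this is the absolute jitter; with peak
    amplitudes it is the absolute shimmer.
    """
    u = np.asarray(u, dtype=float)
    if u.size < 2:
        raise ValueError("need at least two cycles")
    # np.diff yields u[n+1] - u[n]; there are N-1 such differences.
    return float(np.mean(np.abs(np.diff(u))))

# Hypothetical measurements, for illustration only.
periods_ms = [8.0, 8.2, 7.9, 8.1]   # pitch periods in milliseconds
peaks = [0.90, 0.84, 0.92, 0.86]    # peak amplitudes per cycle

abs_jitter = absolute_perturbation(periods_ms)
abs_shimmer = absolute_perturbation(peaks)
```

The result carries the unit of the input sequence, so the jitter above is in milliseconds.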
Jitter: Aperiodicity through periodicity [1]
[Figure: unit-amplitude glottal impulse train, amplitude vs. time (samples); successive impulse spacings alternate between P − ε and P + ε.]
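In mathematical terms
We model the glottal impulse train as:
\[
p[n] = \sum_{k=-\infty}^{+\infty} \delta[n - (2k)P] + \sum_{k=-\infty}^{+\infty} \delta[n + \varepsilon - (2k+1)P]
\]
We may show that its power spectrum is then:
\[
|P(\omega)|^2 = \frac{2}{P^2} \left( 1 + \cos[(\varepsilon - P)\omega] \right) \left[ \delta_{l\omega_0}(\omega) + \delta_{(l+\frac{1}{2})\omega_0}(\omega) \right] = H(\varepsilon, \omega) + S(\varepsilon, \omega)
\]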
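The envelope 1 + cos[(ε − P)ω] can be checked numerically on a finite impulse train. The sketch below assumes ω0 = 2π/P, an integer ε, and reads H as the lines at lω0 and S as the lines at (l + 1/2)ω0; these readings and all function names are assumptions made for illustration. The FFT normalization differs from the slide, so the comparison drops the constant 1/P² factor.

```python
import numpy as np

def jittered_train(P, eps, cycles):
    """Impulses at n = 2kP and n = (2k+1)P - eps, for k = 0..cycles-1."""
    p = np.zeros(2 * P * cycles)
    k = np.arange(cycles)
    p[2 * k * P] = 1.0
    p[(2 * k + 1) * P - eps] = 1.0
    return p

def line_powers(P, eps, cycles=8):
    """Power of the spectral lines at multiples of omega0/2 = pi/P."""
    X = np.fft.fft(jittered_train(P, eps, cycles))
    # The train has period 2P, so only every `cycles`-th FFT bin is nonzero.
    return np.abs(X[::cycles]) ** 2 / cycles ** 2

P, eps = 40, 2
lines = line_powers(P, eps)
omega = np.arange(2 * P) * np.pi / P              # line frequencies m*pi/P
envelope = 2.0 * (1.0 + np.cos((eps - P) * omega))

harmonic = lines[0::2]      # lines at l*omega0
subharmonic = lines[1::2]   # lines at (l + 1/2)*omega0
```

With eps = 0 the subharmonic lines vanish, which is consistent with a strictly periodic train of period P.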
Examples of power spectrum
On synthetic glottal signal
[Figure: power (dB) vs. radian frequency (ω), showing H(ε, ω) and S(ε, ω) for ε = 0, 1, 2.]
Experiments
Goal: discriminate pathological from normal voices, based on jitter
Database: Massachusetts Eye and Ear Infirmary (MEEI) [2]
  - Sustained vowels
  - 53 subjects with normal voice
  - 657 subjects with a wide variety of pathological conditions
Jitter estimation methods:
  - PRAAT, 2007 (P. Boersma and D. Weenink) [3]
  - Multi-Dimensional Voice Program (MDVP) (Kay-Pentax Elemetrics, 2007) [4]
  - Our approach [1]
Results in ROC curves
Shimmer: Aperiodicity through periodicity [1]
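In mathematical terms
We model the glottal impulse train as:
\[
g[n] = A(1 + \Delta)\, \delta_{(2k)P}[n] + A(1 - \Delta)\, \delta_{(2k+1)P}[n]
\]
We may show that its Fourier Transform is then:
\[
G(\omega) = A \left[ (1 + \Delta) + (1 - \Delta) e^{-j 2\pi \frac{\omega}{\omega_0}} \right] \frac{\omega_0}{4\pi} \sum_{k=-\infty}^{+\infty} \delta\!\left( \omega - k \frac{\omega_0}{2} \right)
\]
Splitting
\[
G(l\omega_0) = A \frac{\omega_0}{2\pi}
\]
\[
G\!\left( (l + 1/2)\omega_0 \right) = A \frac{\omega_0}{2\pi} \Delta
\]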
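The split suggests that the ratio of subharmonic to harmonic line magnitude equals Δ. A small numerical sketch of that reading follows, on a finite train with impulses of amplitude A(1 + Δ) at even multiples of P and A(1 − Δ) at odd multiples. The function names are hypothetical, and using this ratio as an estimator is an illustration of the formulas rather than the procedure reported in [1].

```python
import numpy as np

def shimmer_train(P, A, delta, cycles):
    """Impulse train with alternating amplitudes A(1+delta), A(1-delta)."""
    g = np.zeros(2 * P * cycles)
    g[0::2 * P] = A * (1.0 + delta)   # impulses at n = 2kP
    g[P::2 * P] = A * (1.0 - delta)   # impulses at n = (2k+1)P
    return g

def estimate_delta(g, P):
    """Ratio of mean subharmonic to mean harmonic line magnitude."""
    cycles = len(g) // (2 * P)
    # The train has period 2P: keep only the bins at multiples of omega0/2.
    lines = np.abs(np.fft.fft(g))[::cycles]
    return float(lines[1::2].mean() / lines[0::2].mean())

delta_hat = estimate_delta(shimmer_train(P=50, A=1.0, delta=0.1, cycles=6), P=50)
```

Because magnitudes are used, the sketch recovers |Δ| and does not depend on A.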
Examples of spectrum
On synthetic glottal signal
Experiment at 8kHz
Experiment at 16kHz
Analysis: Jitter and Shimmer
Short-Time Fourier Transform
Time-Frequency Distributions [5]
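Modeling the Periodic part of Speech
Sum of simple exponential functions
\[
h_1(t) = \Re \left\{ \sum_{k=1}^{L} a_k\, e^{j 2\pi k \frac{f_0}{f_s} t} \right\}
\]
Sum of exponential functions with complex slope (HNM2 [6])
\[
h_2(t) = \Re \left\{ \sum_{k=1}^{L} A_k(t) \exp\!\left( j 2\pi k \frac{f_0}{f_s} t \right) \right\}
\]
where
\[
A_k(t) = a_k + t\, b_k
\]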
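The two models differ only in the slope term, so one routine can synthesize both: setting every b_k to zero reduces h2 to h1. The sketch below assumes t is a sample index (which is why f0/fs appears in the exponent); the function name and the coefficient values are hypothetical.

```python
import numpy as np

def harmonic_part(a, b, f0, fs, n):
    """Real part of sum_k (a_k + n*b_k) * exp(j*2*pi*k*(f0/fs)*n), k = 1..L."""
    a = np.asarray(a, dtype=complex)[:, None]     # shape (L, 1)
    b = np.asarray(b, dtype=complex)[:, None]     # shape (L, 1)
    n = np.asarray(n, dtype=float)[None, :]       # shape (1, T)
    k = np.arange(1, a.shape[0] + 1)[:, None]     # harmonic numbers 1..L
    A = a + n * b                                 # time-varying complex amplitude
    return np.real(np.sum(A * np.exp(2j * np.pi * k * f0 / fs * n), axis=0))

n = np.arange(160)
# h1: constant complex amplitudes (all slopes zero).
h1 = harmonic_part([1.0, 0.5], [0.0, 0.0], f0=100.0, fs=8000.0, n=n)
# h2: same amplitudes with small, arbitrary complex slopes.
h2 = harmonic_part([1.0, 0.5], [0.002, -0.001j], f0=100.0, fs=8000.0, n=n)
```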
Revisiting HNM2
We recall that the periodic part of HNM2 is given by:
\[
s(t) = \left( \sum_{k=-L}^{L} A_k(t)\, e^{2\pi j k f_0 t} \right) w(t)
\]
with $A_k(t) = a_k + t b_k$, or in frequency domain:
\[
S(f) = \sum_{k=-L}^{L} \left( a_k W(f - k f_0) + j b_k W'(f - k f_0) \right)
\]
where $W(f)$ is the Fourier Transform of window $w(t)$
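Time-Domain Properties of HNM2
Instantaneous Amplitude:
\[
m_k(t) = |a_k + t b_k| = \sqrt{ (a_k^R + t b_k^R)^2 + (a_k^I + t b_k^I)^2 }
\]
Instantaneous Phase:
\[
\phi_k(t) = 2\pi k f_0 t + \angle(a_k + t b_k) = 2\pi k f_0 t + \operatorname{atan} \frac{a_k^I + t b_k^I}{a_k^R + t b_k^R}
\]
Instantaneous Frequency:
\[
f_k(t) = \frac{1}{2\pi} \phi_k'(t) = k f_0 + \frac{1}{2\pi} \frac{a_k^R b_k^I - a_k^I b_k^R}{m_k^2(t)}
\]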
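The closed-form instantaneous frequency can be compared against a numerical derivative of the phase. The sketch below evaluates the three quantities for one harmonic; the coefficient values, time grid, and function name are arbitrary choices made for illustration.

```python
import numpy as np

def hnm2_instantaneous(a, b, k, f0, t):
    """Instantaneous amplitude, phase and frequency of (a + t*b)*exp(j*2*pi*k*f0*t)."""
    c = a + t * b                                  # complex amplitude A_k(t)
    m = np.abs(c)                                  # instantaneous amplitude
    phase = 2 * np.pi * k * f0 * t + np.angle(c)   # instantaneous phase
    # Closed form of phase'(t) / (2*pi)
    freq = k * f0 + (a.real * b.imag - a.imag * b.real) / (2 * np.pi * m ** 2)
    return m, phase, freq

t = np.linspace(-0.01, 0.01, 2001)                 # seconds, centered at 0
a, b = 1.0 + 0.5j, 3.0 - 2.0j                      # hypothetical a_k and b_k
m, phase, freq = hnm2_instantaneous(a, b, k=3, f0=100.0, t=t)

# Finite-difference derivative of the phase, for comparison with freq.
numeric = np.gradient(np.unwrap(phase), t) / (2 * np.pi)
```

At interior points `numeric` and `freq` should agree up to finite-difference error.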
Frequency Domain properties of HNM2
Let $\vec{a}_k$ and $\vec{b}_k$ denote the vectors corresponding respectively to the complex $a_k$ and $b_k$, and let's decompose $\vec{b}_k$ into two components:
  - one collinear to $\vec{a}_k$, and
  - one perpendicular to $\vec{a}_k$.
Thus, $\vec{b}_k$ is given by
\[
\vec{b}_k = \rho_{1,k}\, \vec{a}_k + \rho_{2,k}\, \vec{a}_k^{\perp}
\]
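In complex arithmetic the decomposition is a projection. The sketch below assumes the perpendicular vector is a_k rotated by 90 degrees, that is j·a_k, which matches the form b_k = ρ1,k a_k + ρ2,k j a_k used later in the talk. The function name and the numbers are hypothetical.

```python
def decompose_slope(a, b):
    """Return (rho1, rho2) such that b = rho1*a + rho2*(1j*a)."""
    a, b = complex(a), complex(b)
    inner = a.conjugate() * b          # equals |a|^2 * (rho1 + 1j*rho2)
    norm2 = abs(a) ** 2
    return inner.real / norm2, inner.imag / norm2

a_k = 2.0 + 1.0j
b_k = 0.3 * a_k + (-0.7) * (1j * a_k)  # built from known rho values
rho1, rho2 = decompose_slope(a_k, b_k)
```

Since the imaginary part of conj(a_k)·b_k is also the numerator of the instantaneous-frequency expression, ρ2,k gives the frequency offset at t = 0 (divided by 2π), while ρ1,k scales the amplitude slope.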
Let's look at the kth component
The kth component can be written as:
\[
S_k(f) = a_k \left[ W(f - k f_0) - \rho_{2,k} W'(f - k f_0) + j \rho_{1,k} W'(f - k f_0) \right]
\]
For small values of $\rho_{2,k}$, using a first-order approximation of the Taylor series of $W(f)$, we have:
\[
W(f - k f_0) - \rho_{2,k} W'(f - k f_0) \approx W(f - k f_0 - \rho_{2,k})
\]
and then:
\[
S_k(f) \approx a_k \left[ W(f - k f_0 - \rho_{2,k}) + j \rho_{1,k} W'(f - k f_0) \right]
\]
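How small ρ2,k needs to be depends on the window. As a rough illustration, the sketch below measures the error of the first-order step for a Gaussian-shaped W(f); the Gaussian, its width, and the ρ values are assumptions chosen only because the derivative is available in closed form, not the window used in the talk.

```python
import numpy as np

def W(f, sigma=1.0):
    """Hypothetical window transform: a Gaussian of width sigma."""
    return np.exp(-f ** 2 / (2 * sigma ** 2))

def dW(f, sigma=1.0):
    """Derivative of W with respect to f."""
    return -f / sigma ** 2 * W(f, sigma)

def shift_error(rho, f):
    """Largest gap between W(f) - rho*W'(f) and the shifted window W(f - rho)."""
    return float(np.max(np.abs(W(f) - rho * dW(f) - W(f - rho))))

f = np.linspace(-5.0, 5.0, 2001)
err_small = shift_error(0.01, f)
err_large = shift_error(0.02, f)
```

Doubling ρ roughly quadruples the error, as expected when the neglected term is second order.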
Time-Frequency Analysis using HNM2: Healthy voice
[Figure: time-frequency representation; horizontal axis Samples (1000 to 3000), vertical axis Frequency (0 to 8000).]
Time-Frequency Analysis using HNM2: Pathologic voice
[Figure: time-frequency representation; horizontal axis Samples (1000 to 3000), vertical axis Frequency (0 to 8000).]
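Sinusoidal model
\[
s(t) = \sum_{k=1}^{K(t)} A_k(t) \cos[\theta_k(t)]
\]
where
\[
A_k(t) = \underbrace{a_k(t)}_{\text{excitation}} \cdot \underbrace{M_k(t)}_{\text{vocal tract}}
\]
and
\[
\theta_k(t) = \underbrace{\phi_k(t)}_{\text{excitation}} + \underbrace{\Phi_k(t)}_{\text{vocal tract}}
\]
\[
\phi_k(t) = 2\pi k \int_0^t f_0(\tau)\, d\tau + \phi_k
\]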
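Jitter and Shimmer
Jitter:
\[
f_0(t) = f_0 - \delta \sin(\pi f_0 t)
\]
Shimmer:
\[
a_k(t) = a_k \left[ 1 + \gamma_k \cos(\pi f_0 t) \right]
\]
so then:
\[
s(t) = \sum_{k=-K}^{K} A_k \left[ 1 + \gamma_k \cos(\pi f_0 t) \right] e^{j (2\pi k f_0 t + \delta_k \cos(\pi f_0 t) + \theta_k)}\, w(t)
\]
and by writing $e^{j \delta_k \cos(\pi f_0 t)} \approx 1 + j \delta_k \cos(\pi f_0 t)$, then:
\[
s(t) \approx \sum_{k=-K}^{K} A_k e^{j\theta_k} \left[ 1 + (\gamma_k + j\delta_k) \cos(\pi f_0 t) \right] e^{j 2\pi k f_0 t}\, w(t)
\]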
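The step from the jittered f0(t) to the phase term δk cos(πf0t) is not spelled out on the slide. One way to fill it in, assuming the excitation phase is the integral given in the sinusoidal model and that constant offsets are absorbed into θk (the symbol φk^0 for the initial phase is introduced here only for illustration):

```latex
\begin{align}
\phi_k(t) &= 2\pi k \int_0^t \left[ f_0 - \delta \sin(\pi f_0 \tau) \right] d\tau + \phi_k^0 \\
          &= 2\pi k f_0 t + \frac{2 k \delta}{f_0} \left[ \cos(\pi f_0 t) - 1 \right] + \phi_k^0 ,
\end{align}
% so that \delta_k = 2k\delta / f_0, and the constant \phi_k^0 - \delta_k joins \theta_k.
\begin{align}
\left[ 1 + \gamma_k c(t) \right] \left[ 1 + j \delta_k c(t) \right]
  &= 1 + (\gamma_k + j\delta_k)\, c(t) + j \gamma_k \delta_k\, c^2(t),
  \qquad c(t) = \cos(\pi f_0 t) .
\end{align}
```

Under this reading δk grows linearly with the harmonic number k, and the final approximation also drops the second-order product jγkδk cos²(πf0t), which is small when both perturbations are small.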
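Jitter and Shimmer in HNM2
Suggesting:
\[
s(t) = \sum_{k=-K}^{K} \left[ a_k + b_k \cos(\pi f_0 t) \right] e^{j 2\pi k f_0 t}\, w(t)
\]
and by letting $b_k = \rho_{1,k} a_k + \rho_{2,k}\, j a_k$, then:
\[
s(t) = \sum_{k=-K}^{K} a_k \left[ 1 + (\rho_{1,k} + j\rho_{2,k}) \cos(\pi f_0 t) \right] e^{j 2\pi k f_0 t}\, w(t)
\]
comparing to what we would like to have:
\[
s(t) \approx \sum_{k=-K}^{K} A_k e^{j\theta_k} \left[ 1 + (\gamma_k + j\delta_k) \cos(\pi f_0 t) \right] e^{j 2\pi k f_0 t}\, w(t)
\]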
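Matching the two expressions term by term suggests a_k ↔ A_k e^{jθk}, ρ1,k ↔ γk (shimmer) and ρ2,k ↔ δk (jitter). The sketch below checks this on a single synthetic complex harmonic with a rectangular window: it fits a_k and b_k by ordinary least squares and reads γ and δ off the ratio b_k/a_k. The least-squares fit, the single-harmonic signal, and all parameter values are assumptions made for illustration and are not claimed to be the estimation procedure used in the talk.

```python
import numpy as np

fs, f0 = 8000.0, 100.0
gamma, delta_k = 0.05, 0.03            # hypothetical shimmer and jitter depths
A, theta = 0.8, 0.4                    # hypothetical amplitude and phase

t = np.arange(640) / fs                # four full periods of cos(pi*f0*t)
c = np.cos(np.pi * f0 * t)             # subharmonic modulation
carrier = np.exp(2j * np.pi * f0 * t)  # harmonic k = 1

# Exact jittered and shimmered harmonic (no first-order approximation).
s = A * np.exp(1j * theta) * (1 + gamma * c) * np.exp(1j * delta_k * c) * carrier

# Least-squares fit of s(t) ~ [a + b*cos(pi*f0*t)] * exp(j*2*pi*f0*t).
basis = np.column_stack([carrier, c * carrier])
(a_hat, b_hat), *_ = np.linalg.lstsq(basis, s, rcond=None)

rho = b_hat / a_hat
rho1, rho2 = rho.real, rho.imag        # compare with gamma and delta_k
```

The recovered values match γ and δk only up to the neglected higher-order terms, so the agreement degrades as the perturbations grow.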
Modeling Shimmer
[Figure: three panels over 0 to 400 samples. Panel 1: Original vs. Reconstructed1. Panel 2: Original vs. Reconstructed2. Panel 3: Error1 and Error2.]
Modeling Jitter
[Figure: three panels over 0 to 400 samples. Panel 1: Original vs. Reconstructed1. Panel 2: Original vs. Reconstructed2. Panel 3: Error1 and Error2.]
Acknowledgments
I wish to thank T. Quatieri and Prentice Hall for giving me permission to use figures from Tom's book [7].
I also thank my students, Miltos Vassilakis and Yannis Pantazis, for their work on jitter and shimmer and on HNM2.
[1] M. Vasilakis and Y. Stylianou, “A mathematical model for accurate measurement of jitter,” in MAVEBA 2007, (Florence, Italy), 2007.
[2] K. Elemetrics, “Disordered Voice Database (Version 1.03),” 1994.
[3] P. Boersma and D. Weenink, “Praat: doing phonetics by computer (Version 4.6.24) [Computer program],” 2007.
[4] K. Elemetrics, “Multi-Dimensional Voice Program (MDVP) [Computer program],” 2007.
[5] L. Cohen, Time-Frequency Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1995.
[6] Y. Stylianou, “Modeling speech based on harmonic plus noise models,” in Nonlinear Speech Modeling and Applications (G. Chellot, A. Esposito, M. Faundez, and M. M, eds.), pp. 244–260, Springer-Verlag, 2005.
[7] T. F. Quatieri, Discrete-Time Speech Signal Processing. Englewood Cliffs, NJ: Prentice Hall, 2002.