
Voice source characterisation

Date post: 25-Feb-2016
Transcript
Page 1: Voice source characterisation

Voice source characterisation

Gerrit Bloothooft

UiL-OTS Utrecht University

Page 2: Voice source characterisation

Emasters School Leuven 2002 Voice Source Characterization 2

Voice research

To describe and model the properties of the vocal sound source from the viewpoints of:
– Physiology
– Acoustics
– Perception

Page 3: Voice source characterisation

Importance of the voice

• Speech synthesis
– Towards natural sounding synthesis
• Speech recognition
– Using source properties in recognition
• Speaker recognition/identification
– Voice source characteristics are essential
• Diagnosis
– Pathologies, voice classifications

Page 4: Voice source characterisation

Voice possibilities

Limited use of voice in speech
• Range of the fundamental frequency
• Vocal intensity range
• Spectral variation

Page 5: Voice source characterisation

Focus in this presentation

How do acoustic voice source characteristics vary as a function of F0 and vocal intensity?

Page 6: Voice source characterisation

Voice profile measurement

Thirties: intensity range as a function of various pitches
– manual measurement

Eighties: automatic computation of F0 and intensity
– computer measurement
– visual feedback
– additional parameters

Page 7: Voice source characterisation

Measurement unit

• One decibel
• One semi-tone
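
With a measurement unit of one semitone by one decibel, each (F0, intensity) measurement falls into one cell of the voice profile. The following is a minimal sketch of that mapping. The 55 Hz reference pitch and the names `semitone_index` and `cell` are assumptions for illustration; the slides do not specify them.

```python
import math

REF_HZ = 55.0  # assumed reference pitch (A1); not given in the slides


def semitone_index(f0_hz: float, ref_hz: float = REF_HZ) -> int:
    """Nearest whole semitone relative to the reference pitch."""
    return round(12.0 * math.log2(f0_hz / ref_hz))


def cell(f0_hz: float, level_db: float) -> tuple[int, int]:
    """Map one measurement to a (semitone, dB) cell of 1 semitone x 1 dB."""
    # Intensity is floored so that each cell covers [n, n + 1) dB.
    return semitone_index(f0_hz), math.floor(level_db)


print(cell(110.0, 72.4))
```

Using semitones rather than Hz keeps the cells equally wide in musical terms across the whole pitch range.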

Page 8: Voice source characterisation

Measurement procedure

• Subject in front of computer screen
• Microphone on headset (30 cm)
• Just phonate, sing, and see the result immediately
• Best results with recording protocol
• Feedback stimulates extreme phonations

Page 9: Voice source characterisation

[Figure: Voice profile / density. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted quantity: sample density.]

Page 10: Voice source characterisation

[Figure: Voice profile / speech area. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted quantity: sample density.]

Page 11: Voice source characterisation

Acoustic voice quality parameters

• Jitter
– Stability of periodicity
– Asymmetry in vocal folds
• Crest factor
– Max amplitude divided by average energy
– Relates to spectral slope
• Many more …
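
The two parameters named above can be sketched as follows. The definitions used here are assumptions, since the slides do not give formulas: crest factor is taken as peak amplitude over RMS amplitude expressed in dB, and jitter as the mean absolute difference between consecutive periods relative to the mean period, in percent.

```python
import math


def crest_factor_db(samples):
    """Peak amplitude divided by RMS amplitude, in dB (assumed definition)."""
    peak = max(abs(x) for x in samples)
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)


def jitter_percent(periods):
    """Mean absolute period-to-period difference over mean period, in percent."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_diff / mean_period


# One full cycle of a sine wave, sampled at 1000 points.
sine = [math.sin(2 * math.pi * n / 1000) for n in range(1000)]
print(round(crest_factor_db(sine), 2))
print(round(jitter_percent([10.0, 10.2, 10.0, 10.2]), 2))
```

Under this definition a pure sine has a crest factor of about 3 dB, which is the lowest value a periodic voiced signal is likely to approach; more pulse-like waveforms give higher values.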

Page 12: Voice source characterisation

[Figure: Crest factor. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted quantity: crest factor.]

Page 13: Voice source characterisation

[Figure: Jitter. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); scale from regular to irregular.]

Page 14: Voice source characterisation

Real time presentation

Screen presentation
• One data point per F0-I cell

Advanced data storage [new]
• Full audio signal
• Full distribution of data per F0-I cell
• Data for screen presentation
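
The difference between the screen presentation (one data point per cell) and the advanced storage (the full distribution per cell) can be illustrated with a small container. This is a sketch under assumptions: the class name `CellStore`, its methods, and the use of the median as the displayed value are hypothetical and not taken from the slides.

```python
from collections import defaultdict
from statistics import median


class CellStore:
    """Keeps every parameter value measured in each F0-I cell."""

    def __init__(self):
        self._values = defaultdict(list)

    def add(self, cell, value):
        """Record one measured value (for example a crest factor) for a cell."""
        self._values[cell].append(value)

    def distribution(self, cell):
        """All values stored for a cell, in recording order."""
        return list(self._values.get(cell, []))

    def screen_value(self, cell):
        """One value per cell for display; the median is an assumed choice."""
        vals = self._values.get(cell)
        return median(vals) if vals else None


store = CellStore()
for v in (5.0, 7.0, 6.0):
    store.add((12, 70), v)
print(store.screen_value((12, 70)), store.distribution((12, 70)))
```

Keeping the raw values rather than only the displayed point is what makes later per-cell statistics possible without re-recording.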

Page 15: Voice source characterisation

Advantages

• Reusability of recordings
• Statistical analysis per F0-I cell
• Study of time-varying behavior

Page 16: Voice source characterisation

[Figure: Crest factor. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted quantity: crest factor.]

Page 17: Voice source characterisation

[Figure: Median smoothing of crest factor. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted quantity: crest factor, median smoothed.]
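
Median smoothing over the F0-I grid can be sketched as below. The window size is an assumption: the slides do not state it, so a 3 x 3 neighbourhood is used here, and cells without data are simply left out of the median.

```python
from statistics import median


def median_smooth(grid, radius=1):
    """Replace each cell value by the median over its filled neighbourhood.

    grid maps (semitone, dB) cells to a parameter value such as crest factor.
    radius=1 gives a 3 x 3 window (an assumed size).
    """
    smoothed = {}
    for (s, d) in grid:
        neighbours = [
            grid[(s + ds, d + dd)]
            for ds in range(-radius, radius + 1)
            for dd in range(-radius, radius + 1)
            if (s + ds, d + dd) in grid
        ]
        smoothed[(s, d)] = median(neighbours)
    return smoothed


# A flat 3 x 3 patch with one outlier in the centre cell.
grid = {(s, d): 5.0 for s in range(3) for d in range(3)}
grid[(1, 1)] = 20.0
print(median_smooth(grid)[(1, 1)])
```

A median is preferred over a mean for this kind of map because a single outlying cell does not leak into its neighbours.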

Page 18: Voice source characterisation

Vocal registers

Different movement patterns of the vocal folds
• Pulse register (creaky voice)
• Modal register
• Falsetto register

Page 19: Voice source characterisation

Pulse register

• Less than 50 Hz
• Irregular
• Long closed period

Page 20: Voice source characterisation

[Figure: Pulse register. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL).]

Page 21: Voice source characterisation

Modal register

• “Normal” use of voice
• Active role of M. Vocalis
• Vocal folds thick and completely vibrating
• Wide range in F0 and intensity
• Flat spectrum

Page 22: Voice source characterisation

[Figure: Modal register. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL).]

Page 23: Voice source characterisation

Falsetto register

• Higher pitches
• M. Vocalis passive, vocal ligaments tensed through M. cricothyroideus
• Edge vibration of vocal folds
• Sound poor in higher harmonics (in untrained subjects)

Page 24: Voice source characterisation

[Figure: Falsetto register. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL).]

Page 25: Voice source characterisation

[Figure: Register overlap. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL).]

Page 26: Voice source characterisation

Chest and head voice

Refer to secondary vibratory sensations in the body

• Chest voice: loud modal register
• Head voice:
– males: higher, softer modal register in overlap area with falsetto register
– women: falsetto register

Page 27: Voice source characterisation

[Figure: Chest voice and head voice. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); labelled areas: chest, head.]

Page 28: Voice source characterisation

Registers and voice profiles

With a description using
• Iso-crest factor lines
• Iso-jitter lines

Page 29: Voice source characterisation

[Figure: Iso-crest factor lines at 4 dB and 6 dB. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted quantity: crest factor.]

Page 30: Voice source characterisation

[Figure: Iso-jitter lines at 3 %. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted quantity: jitter (%).]

Page 31: Voice source characterisation

New representation

• Areas defined by iso-parameter lines
– crest factor < 4 dB
– crest factor > 4 dB, < 6 dB
– crest factor > 6 dB
– jitter < 3 %
– [relative rise time < 6 %]
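
The thresholds listed above partition the voice profile into areas. A minimal sketch of assigning one cell to its area follows. The function names are hypothetical, the handling of values exactly on a threshold is an assumption, and the bracketed relative rise time criterion is left out because the slides give no further detail on it.

```python
def crest_band(crest_db):
    """Crest factor band, using the 4 dB and 6 dB iso-lines as borders."""
    if crest_db < 4.0:
        return "crest factor < 4 dB"
    if crest_db < 6.0:  # a value of exactly 4 dB is placed here (assumption)
        return "crest factor 4-6 dB"
    return "crest factor > 6 dB"


def describe_cell(crest_db, jitter_pct):
    """Area description for one F0-I cell from its two parameter values."""
    return {
        "crest_band": crest_band(crest_db),
        "jitter_below_3pct": jitter_pct < 3.0,
    }


print(describe_cell(3.2, 1.0))
print(describe_cell(7.5, 4.1))
```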

Page 32: Voice source characterisation

[Figure: Areas in the phonetogram. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); labelled areas: jitter > 3 % (unstable), RRT < 6 % (pressed-like), crest factor < 4 dB (sine-like).]

Page 33: Voice source characterisation

[Figure: Vocal registers in the phonetogram. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); marked: falsetto upper boundary, modal lower boundary, chest voice boundary.]

Page 34: Voice source characterisation

Comparison of voice profiles

Characterisation of
• Voice pathologies
• Voice classifications

Reuse stored voice profiles of subjects with known voice history

Page 35: Voice source characterisation

Important features

• Contour has limited value
– but most research goes in that direction (norm profiles)
• Distribution of acoustical parameters across the voice profile tells much more

Page 36: Voice source characterisation

We need

• Unit for comparison
– Voice profile unit defined by small range of F0 and vocal intensity
• Distributions of acoustic voice parameters per unit
– Probability density function per parameter
• Model
– Hidden Markov Model

Page 37: Voice source characterisation

Unit model

[Diagram: unit model with IN and OUT]

Two unconnected states per phonetogram unit
• vocal registers
• start and end of phonation

Page 38: Voice source characterisation

Correspondences

Speech               Voice profile
• phoneme model      F0/I unit model
• not labeled        labeled by F0 and I
• spectral envelope  acoustic voice parameters
• language model     unrestricted transitions

“forced alignment recognition”
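
Because every frame is already labelled by its F0 and intensity, the unit sequence is known in advance, which is why scoring resembles forced alignment. The sketch below illustrates that idea only and is not a full Hidden Markov Model: it assumes histogram densities with 1 dB bins for a single parameter (crest factor), add-one smoothing, independent frames, and no transition term. All names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(frames, bins=range(0, 21)):
    """Estimate a smoothed log-probability per 1 dB bin for each F0-I unit.

    frames is an iterable of (cell, crest_db) pairs.
    """
    counts = defaultdict(Counter)
    for cell, crest_db in frames:
        counts[cell][int(crest_db)] += 1
    model = {}
    for cell, c in counts.items():
        total = sum(c.values()) + len(bins)  # add-one smoothing (assumption)
        model[cell] = {b: math.log((c[b] + 1) / total) for b in bins}
    return model


def score(model, frames):
    """Average log-likelihood of frames; units unseen in training are skipped."""
    logps = [
        model[cell][int(v)]
        for cell, v in frames
        if cell in model and int(v) in model[cell]
    ]
    return sum(logps) / len(logps) if logps else float("-inf")


# Toy data for two hypothetical subjects in the same F0-I unit.
subject_a = [((12, 70), 5.2)] * 20 + [((12, 70), 6.1)] * 5
subject_b = [((12, 70), 9.3)] * 20 + [((12, 70), 8.7)] * 5
model_a, model_b = train(subject_a), train(subject_b)
test_frames = [((12, 70), 5.5), ((12, 70), 5.9)]
print(score(model_a, test_frames) > score(model_b, test_frames))
```

Comparing the score of a test recording under several subject models is one way to use the stored per-unit distributions; the actual model and distance measure used by the author are not specified in the slides.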

Page 39: Voice source characterisation

[Figure: Crest factor distributions, four panels: training subject 1, test subject 1, training subject 2, test subject 2. Horizontal axis from 4 to 15, vertical axis from 0 to 500.]

Page 40: Voice source characterisation

[Figure: Most distinctive states. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted quantity: distinctiveness.]

Page 41: Voice source characterisation

Conclusions

• Voice profiles can enhance our understanding of vocal behaviour in a visually attractive way
• Current data storage opens a series of important research topics
• Market opportunities for “light” versions

