7/21/2017
Department of Electrical Engineering, IIT Bombay
EE679: Speech Processing
A preview
Dept. of Electrical Engineering, I.I.T. Bombay
Why do we need a special course for signal processing of speech?
“Signal processing” is concerned with the mathematical representation of the signal and the algorithmic operations carried out to modify the signal or to extract information from it.
The representation and the algorithms are application-domain specific, i.e. there are no “generic” methods.
An understanding of the signal and of the application is crucial to the success of the signal processing methods.
Human communication
• Vocal, visual, gestural
• Language is used for communication and is independent of the modality (writing, signing, speaking)
• Speech Communication is the transfer of information from one person to another via speech
Understanding speech communication
Acoustic waves
Speed = wavelength x frequency
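As a quick check of the relation above, the wavelength of a tone follows from dividing the speed of sound by its frequency. The sketch below assumes a speed of sound of 343 m/s (air at roughly room temperature), a value that is not given on the slide; the function name is illustrative only.

```python
# Assumed value for air at about 20 degrees C; not taken from the slides.
SPEED_OF_SOUND_M_PER_S = 343.0


def wavelength_m(frequency_hz, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Return the wavelength in metres, from speed = wavelength x frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed_m_per_s / frequency_hz


for f in (100.0, 300.0, 1000.0):
    print(f"{f:7.1f} Hz -> {wavelength_m(f):.3f} m")
```

Under this assumption a 100 Hz tone is a few metres long, while a 1000 Hz tone is well under half a metre.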
[Figure: air pressure variation over time for two tones]
Low pitch tone: T0 = 10 msec, frequency (F0) = 1/T0 = 100 Hz
High pitch tone: T0 = 3.3 msec, frequency = 300 Hz
1 Hertz = 1 vibration/sec
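The two tones above can be checked numerically, since the frequency is the reciprocal of the period. This is a minimal sketch with an illustrative function name; note that a period of 3.3 msec gives roughly 303 Hz, which the slide rounds to 300 Hz.

```python
def frequency_hz(period_s):
    """Fundamental frequency F0 = 1 / T0, with the period T0 in seconds."""
    if period_s <= 0:
        raise ValueError("period must be positive")
    return 1.0 / period_s


low_tone_hz = frequency_hz(10e-3)    # T0 = 10 msec
high_tone_hz = frequency_hz(3.3e-3)  # T0 = 3.3 msec

print(f"low pitch tone:  {low_tone_hz:.1f} Hz")
print(f"high pitch tone: {high_tone_hz:.1f} Hz")
```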
Speech “waveform”
“Information” in speech?
• Linguistic (message -> sentences -> words -> phonemes)
The speech signal is characterised by an enormous range of elementary perceptually contrasting sounds!
• Paralinguistic:
  -- expressive (emotions, mood)
  -- speaker-based (age, gender, accent and style)
“Everyday” speech technology
• Mobile telephony (speech compression)
• Human-computer interfaces (speech recognition/synthesis)
• Security (speaker identification in biometrics, forensics)
• Speech enhancement (improving intelligibility or quality)
• Behavioural analytics
Generating speech*
Respiration->phonation->articulation
Vibrating vocal cords create puffs of air giving rise to air pressure variations which reach our ears.
*HyperPhysics, Sound and Hearing, Georgia State University
Vocal tract: Acoustic resonances*
$f_1 = \frac{c}{4L};\quad f_2 = \frac{3c}{4L};\quad f_3 = \frac{5c}{4L};\quad \ldots$
*HyperPhysics, Sound and Hearing, Georgia State University
(http://hyperphysics.phy-astr.gsu.edu/hbase/sound/)
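A uniform tube closed at one end and open at the other resonates at odd multiples of c/(4L), where c is the speed of sound and L is the tube length. The sketch below evaluates the first few resonances. The tube length of 17 cm and the speed of sound of 340 m/s are illustrative assumptions, not values from the slide, and a real vocal tract is not a uniform tube.

```python
def tube_resonances_hz(length_m, speed_m_per_s=340.0, count=3):
    """Resonances of a uniform tube closed at one end: f_n = (2n - 1) * c / (4 * L)."""
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    return [
        (2 * n - 1) * speed_m_per_s / (4.0 * length_m)
        for n in range(1, count + 1)
    ]


# Assumed tube length of 0.17 m; hypothetical value chosen for illustration.
resonances = tube_resonances_hz(0.17)
print([round(f) for f in resonances])
```

With these assumed values the resonances fall close to 500, 1500 and 2500 Hz, i.e. they are spaced at odd multiples of the lowest one.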
Articulation: producing the various sounds of speech*
[Figure: diagram of the speech production apparatus, with labels: trachea connection to lungs; vocal cords; pharyngeal cavity; velum; nasal cavity; oral cavity; vocal cavity; static cavity; dynamic cavity; oral sound output; nasal sound output]
Articulators (vocal cords, tongue, jaw, lips, teeth, velum): moving muscles which alter the resonant cavities
*Securivox tutorial
Vocal tract “filter”*
• The sound spectrum is modified by the shape of the vocal tract.
• The resonant frequencies of the vocal tract cause peaks in the spectrum called formants.
*Childers, Speech Overview
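To illustrate how a resonance produces a spectral peak, the sketch below evaluates the magnitude response of a two-pole digital resonator, a common textbook stand-in for a single formant. This model is not taken from the slides, and the sampling rate, centre frequency and bandwidth are hypothetical values chosen only for illustration.

```python
import cmath
import math


def resonator_gain(freq_hz, centre_hz, bandwidth_hz, sample_rate_hz):
    """Magnitude response at freq_hz of the two-pole resonator
    H(z) = 1 / (1 - 2 r cos(theta) z^-1 + r^2 z^-2)."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate_hz)  # pole radius
    theta = 2.0 * math.pi * centre_hz / sample_rate_hz      # pole angle
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate_hz)
    denom = 1.0 - 2.0 * r * math.cos(theta) * z_inv + (r * r) * z_inv * z_inv
    return 1.0 / abs(denom)


SAMPLE_RATE_HZ = 8000.0  # assumed sampling rate
CENTRE_HZ = 500.0        # hypothetical formant frequency
BANDWIDTH_HZ = 80.0      # hypothetical formant bandwidth

# Evaluate the gain on a 10 Hz grid and locate the largest value.
grid = [float(f) for f in range(100, 4000, 10)]
gains = [resonator_gain(f, CENTRE_HZ, BANDWIDTH_HZ, SAMPLE_RATE_HZ) for f in grid]
peak_hz = grid[gains.index(max(gains))]
print(f"largest gain on the 10 Hz grid is at {peak_hz:.0f} Hz")
```

A vocal tract has several such resonances at once; cascading one resonator per formant is one simple way to approximate that, under the same assumptions.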
Von Kempelen's talking machine
1791
"Briefly, the device was operated in the following manner. The right arm rested on the main bellows and
1875
• Alexander Bell invents the method of, and apparatus for, “transmitting vocal or other sounds telegraphically ... by causing electrical undulations, similar in form to the vibrations of the air accompanying the said vocal or other sound”.
=> Major impetus to modern speech processing.
• 1930s: Electrical synthesis of speech by Dudley’s vocoder
Sound -> electrical form*
*The Physics Classroom: http://www.glenbrook.k12.il.us/gbssci/phys/Class/sound/u11l2a.html
Speech Waveforms from “my speech”
(a) start of “y” vowel
(b) “ee” vowel
(c) “s” consonant
Components of sound
A sound is usually composed of several frequency components.
Depending on the relationships of the frequency components, the sound can elicit a sensation of pitch.
[Figure: waveforms of 300 Hz, 600 Hz and 900 Hz tones, and of the sums 300 Hz + 600 Hz and 300 Hz + 600 Hz + 900 Hz]
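When the components are integer multiples of a common fundamental, as with 300, 600 and 900 Hz here, their sum repeats at the period of the fundamental, which is what gives rise to a clear pitch. The sketch below adds the three sine waves and checks that the sum repeats every 1/300 s. The sampling rate, the unit amplitudes and the zero phases are assumptions made for illustration.

```python
import math

SAMPLE_RATE_HZ = 48000           # assumed sampling rate
COMPONENTS_HZ = (300, 600, 900)  # the three components from the figure


def harmonic_sum(n_samples, components_hz=COMPONENTS_HZ, sample_rate_hz=SAMPLE_RATE_HZ):
    """Sum of unit-amplitude, zero-phase sine waves, one per listed frequency."""
    return [
        sum(math.sin(2.0 * math.pi * f * n / sample_rate_hz) for f in components_hz)
        for n in range(n_samples)
    ]


period_samples = SAMPLE_RATE_HZ // 300  # samples in one 300 Hz period
samples = harmonic_sum(3 * period_samples)

# Compare the signal with itself shifted by one 300 Hz period.
max_diff = max(
    abs(samples[n] - samples[n + period_samples])
    for n in range(2 * period_samples)
)
print(f"period: {period_samples} samples, largest mismatch after one period: {max_diff:.2e}")
```

The mismatch is at the level of floating-point rounding, so the combined waveform has the same period as the 300 Hz component alone.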
Classification of speech sounds
Vowels and Consonants
• Vowels: steady sounds specified by position of the articulators (typically, tongue)
• Consonants: (dynamic) sounds classified by place and manner of articulation
Place of articulation (constriction of vocal tract)
Basic sounds of speech: Phones
• The speech signal can be divided into sound segments with fixed articulation and acoustics over short intervals, i.e. articulatory configuration <=> acoustic properties
Smallest meaningful sound unit: “phone” (i.e. set of distinctive sounds of a language)
In Indian written scripts, one symbol represents one phone.
PRAAT examples
Physiology (articulator motion)
Sound with specific acoustic characteristics (seen in waveform and spectrum)
Perception of certain sound qualities
Speech production basics
• Vocal cords (larynx) modulate the airflow from the lungs by rapid opening-closing; the rate of vibration is determined by their mass and tension. Pitch frequency ranges: male: 80-160 Hz; female: 160-320 Hz; singers: over 2 octaves.
• Vocal tract shapes the vocal cord vibrations into the intricate sounds of speech via changes in shape to produce various acoustic resonances.
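An octave is a doubling of frequency, so the width of a pitch range in octaves is the base-2 logarithm of the ratio of its end points. The sketch below applies this to the ranges quoted above; the function name is illustrative.

```python
import math


def octaves(low_hz, high_hz):
    """Width of the range [low_hz, high_hz] in octaves: log2(high / low)."""
    if low_hz <= 0 or high_hz <= 0:
        raise ValueError("frequencies must be positive")
    return math.log2(high_hz / low_hz)


print("male range (80-160 Hz):    ", octaves(80.0, 160.0), "octave(s)")
print("female range (160-320 Hz): ", octaves(160.0, 320.0), "octave(s)")
print("combined (80-320 Hz):      ", octaves(80.0, 320.0), "octave(s)")
```

Each quoted speaking range spans one octave, which puts the "over 2 octaves" figure for singers in context.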
• Glottal folds in action…
The interdisciplinary nature… *
* Fant, G. (1990). Speech research in perspective. Speech Communication.
Outline
• Speech production (physiology)
• Classification of sounds: articulatory, acoustic
• Speech analysis (signal processing methods for information extraction)
• Hearing, and speech perception
• Speech technology (compression, ASR, TTS, …)
• Audio/music technology
Text / References
• Douglas O'Shaughnessy, Speech Communications: Human and Machine, Universities Press (India) Ltd., 2001
• Rabiner and Schafer, Digital Processing of Speech Signals
• IITB Moodle for all course-related hand-outs
Evaluation
• Computing assignments (Python or Scilab) (30%)
• Exams: mid semester + end semester (70%)
• Attendance is compulsory (<80% => XX, even before midsem)