Date post: | 07-Apr-2018 |
Upload: | bhavik-patel |
8/4/2019 Final Ppt on Speech Processing
Speech is message information that is converted into a set of neural signals. These signals control the articulatory mechanism, which generates an acoustic waveform containing the information in the original message.
Message information -> Neural signals -> Articulatory mechanism -> Acoustic waveform (speech)
Concatenation of elements from a finite set of phonemes
Each language has a distinct set of phonemes, typically in the range of 30 to 50
English has around 42 phonemes
A six-bit numerical code is sufficient for numbering them
Average of 10 phonemes per second
Total of about 60 bits per second as the average information rate
Concerns:
Representation of the message content
Representation in a form convenient for transmission or storage
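The arithmetic behind the 60 bits per second figure can be checked with a short sketch. The function names below are illustrative and not from the slides. It assumes every phoneme is given a fixed-length code, which is the simplest reading of the "six-bit code" statement.

```python
import math


def bits_per_symbol(alphabet_size: int) -> int:
    """Smallest whole number of bits that can number every symbol."""
    return math.ceil(math.log2(alphabet_size))


def information_rate(alphabet_size: int, symbols_per_second: float) -> float:
    """Bits per second when each symbol uses a fixed-length code."""
    return bits_per_symbol(alphabet_size) * symbols_per_second


# Figures taken from the slide: about 42 English phonemes, 10 per second.
print(bits_per_symbol(42))        # bits needed per phoneme
print(information_rate(42, 10))   # bits per second
```

Any phoneme inventory between 33 and 64 symbols needs six bits under this scheme, which is why the same code length covers most of the 30 to 50 range quoted above.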
Speech processing is the study of speech signals and of the methods used to process them.
The signals are usually processed in a digital representation, so speech processing can be regarded as a special case of digital signal processing.
Information source -> Measurement or observation -> Signal processing (signal representation, signal transformation) -> Extraction and utilization of information
SPEECH RECOGNITION
SPEAKER RECOGNITION
SPEECH CODING
VOICE ANALYSIS
SPEECH SYNTHESIS
SPEECH ENHANCEMENT
AUTOMATIC SPEECH RECOGNITION CONVERTS SPOKEN WORDS TO TEXT
TEXT DEPENDENT
TEXT INDEPENDENT
In a system using text-dependent speech, the individual presents either a fixed phrase (a password) or a prompted phrase ("Please say the numbers 33-54-63") that is programmed into the system. This can improve performance, especially with cooperative users.
A text-independent system has no advance knowledge of the presenter's phrasing. It is much more flexible in situations where the individual submitting the sample may be unaware of the collection or unwilling to cooperate, which presents a more difficult challenge.
Speech coding is the application of data compression to digital audio signals containing speech.
Speech coding uses speech-specific parameter estimation, based on audio signal processing techniques, to model the speech signal. It combines this with generic data compression algorithms to represent the resulting model parameters in a compact bit stream.
The two most important applications of speech coding are mobile telephony and Voice over IP.
Voice analysis is the study of speech sounds for purposes other than the linguistic content that speech recognition targets.
Such studies mostly involve medical analysis of the voice, i.e. phoniatrics, but also speaker identification.
More controversially, some believe that the truthfulness or emotional state of speakers can be determined using Voice Stress Analysis or Layered Voice Analysis.
Speech synthesis is the artificial production of human speech.
A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware.
A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech.
Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database.
Systems differ in the size of the stored speech units; a system that stores phones or diphones provides the largest output range, but may lack clarity.
The quality of a speech synthesizer is judged by its similarity to the human voice and by its ability to be understood.
An intelligible text-to-speech program allows people with visual impairments or reading disabilities to listen to written works on a home computer.
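The idea of concatenating stored units can be illustrated with a toy sketch. Everything here is hypothetical: the unit names, the tiny sample lists standing in for recordings, and the `synthesize` function are invented for illustration. A real concatenative synthesizer also has to select units and smooth the joins, which this sketch leaves out.

```python
from typing import Dict, List


def synthesize(unit_sequence: List[str],
               database: Dict[str, List[float]]) -> List[float]:
    """Join the stored waveform of each requested unit, in order."""
    output: List[float] = []
    for unit in unit_sequence:
        # A KeyError here means the database has no recording for the unit.
        output.extend(database[unit])
    return output


# Hypothetical database: each unit maps to a few made-up samples.
unit_database = {
    "h": [0.0, 0.1],
    "e": [0.3, 0.2, 0.1],
    "l": [0.05, 0.0],
    "o": [0.2, 0.4, 0.2],
}

waveform = synthesize(["h", "e", "l", "o"], unit_database)
print(len(waveform))
```

Small units such as phones can be recombined into any word, which is the "largest output range" noted above, while larger stored units cover less material but need fewer joins.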
Speech enhancement aims to improve speech quality by using various algorithms.
The objective of enhancement is to improve the intelligibility and/or overall perceptual quality of a degraded speech signal using audio signal processing techniques.
Enhancement of speech degraded by noise, or noise reduction, is the most important field of speech enhancement. It is used in many applications, such as mobile phones, VoIP, teleconferencing systems, speech recognition, and hearing aids.
As basic parameters in speech processing we regard:
Pitch
Duration
Intensity
Voice quality
Signal-to-noise ratio
Voice activity detection
Strength of the Lombard effect
Speech recognition, speech synthesis and speaker characterization all need basic parameters that are crucial for good system performance.
There are two sets of parameters. The first is related to prosody:
Pitch
Duration
Intensity
The second characterizes the acoustic properties of the environment, including its impact on the speaker's voice:
Voice quality
Signal-to-noise ratio
Voice activity detection
Strength of the Lombard effect
When adverse conditions are also taken into account, the performance of many published algorithms for extracting these parameters automatically from the speech signal is not known. A framework based on competitive evaluation is proposed to drive algorithmic research and to make progress comparable.
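As a rough illustration of the prosodic parameters named above, the sketch below estimates intensity, duration and pitch for a single frame. These are common textbook estimators and not a method given in the slides. The function names, the 60 to 400 Hz search range and the synthetic test tone are assumptions, and plain autocorrelation is known to be fragile under the adverse conditions the text mentions.

```python
import math


def intensity_db(frame):
    """Root-mean-square level of the frame in dB relative to full scale 1.0."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def duration_seconds(frame, sample_rate):
    """Length of the frame in seconds."""
    return len(frame) / sample_rate


def estimate_pitch_hz(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Pick the lag with the largest autocorrelation inside [fmin, fmax]."""
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_value = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        value = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    # None means no positive correlation was found (for example, silence).
    return sample_rate / best_lag if best_lag else None


# Synthetic stand-in for voiced speech: a 200 Hz tone, 50 ms at 8 kHz.
sample_rate = 8000
frame = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]

print(estimate_pitch_hz(frame, sample_rate))
print(intensity_db(frame))
print(duration_seconds(frame, sample_rate))
```

Restricting the lag search to a plausible pitch range keeps the estimator from locking onto lag zero, where the autocorrelation is always largest.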
Personal voice qualities differ in the speaker's use of temporal structures, articulation precision, vocal effort and type of phonation.
Temporal structures can be measured directly in the acoustic signal, and conclusions about articulation precision can be drawn from the formant structure.
These voice quality percepts are a combination of several acoustic voice quality parameters.
An investigation of emotionally loaded speech material showed that the named acoustic parameters are useful for differentiating between the emotions happiness, sadness, anger, fear and boredom.
The signal-to-noise ratio (SNR) is an important feature in determining the quality of audio data.
This is particularly important in speech recognition technology, since it is well known that recognition performance is strongly influenced by the SNR.
In most applications the SNR cannot be easily derived, since the noise energy is not known.
Further, the question arises as to what is "signal" and what is "noise".
For example, would a cough or breath noise be considered part of the "signal" in spontaneous speech? Does it convey information?
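For reference, the usual definition of SNR in decibels can be sketched as below. It assumes the clean signal and the noise are available as separate sample lists, which is exactly the situation the text says rarely holds in practice, so this shows the definition and not a practical estimator. The function names and the toy sample values are illustrative.

```python
import math


def mean_power(samples):
    """Average squared amplitude of the samples."""
    return sum(x * x for x in samples) / len(samples)


def snr_db(signal, noise):
    """Ratio of signal power to noise power, expressed in decibels."""
    return 10 * math.log10(mean_power(signal) / mean_power(noise))


# Toy data: the noise has one tenth of the signal's amplitude.
signal = [1.0, -1.0, 1.0, -1.0]
noise = [0.1, -0.1, 0.1, -0.1]
print(snr_db(signal, noise))
```

Whether a cough or breath belongs in `signal` or in `noise` is a labelling decision that has to be made before this formula can be applied at all.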
Voice activity detection (VAD), also known as speech activity detection or speech detection, is a technique used in speech processing in which the presence or absence of human speech is detected.
The main uses of VAD are in speech coding and speech recognition.
It can facilitate speech processing, and can also be used to deactivate some processes during the non-speech sections of an audio session.
It can avoid unnecessary coding and transmission of silence packets in Voice over Internet Protocol applications, saving on computation and on network bandwidth.
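One of the simplest ways to make a speech or non-speech decision is to threshold the short-time energy of each frame. The sketch below assumes a fixed, hand-picked threshold and non-overlapping frames; the function names and values are illustrative. Deployed detectors typically use adaptive thresholds and additional features, so this only shows the basic idea.

```python
def frame_energies(samples, frame_length):
    """Mean squared amplitude of each complete, non-overlapping frame."""
    last_start = len(samples) - frame_length
    return [
        sum(x * x for x in samples[start:start + frame_length]) / frame_length
        for start in range(0, last_start + 1, frame_length)
    ]


def detect_speech(samples, frame_length, threshold):
    """Flag each frame as speech (True) when its energy exceeds the threshold."""
    return [energy > threshold
            for energy in frame_energies(samples, frame_length)]


# Toy signal: silence, then a burst of activity, then silence again.
silence = [0.0] * 160
burst = [0.5 if n % 2 == 0 else -0.5 for n in range(160)]
decisions = detect_speech(silence + burst + silence,
                          frame_length=160, threshold=0.01)
print(decisions)
```

In a VoIP setting, frames flagged False are the ones a sender could skip encoding and transmitting, which is where the computation and bandwidth savings come from.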
The Lombard effect, or Lombard reflex, is the involuntary tendency of speakers to increase the intensity of their voice when speaking in loud noise, in order to enhance its audibility.
This change includes not only loudness but also other acoustic features, such as pitch, rate, and duration of syllables.
This compensation effect results in an increase in the auditory signal-to-noise ratio of the speaker's spoken words.
The effect is linked to the needs of effective communication: it is reduced when words are repeated or lists are read, where communication intelligibility is not important.
Since the effect is also involuntary, it is used as a means to detect malingering in those simulating hearing loss.
The effect was discovered in 1909 by Étienne Lombard, a French otolaryngologist.
Health care
Military
High-performance fighter aircraft
Helicopters
Battle management
Training air traffic controllers
Telephony and other domains