
Representing Acoustics with Mel Frequency Cepstral Coefficients

Date post: 25-Feb-2016
Lecture 7, Spoken Language Processing. Prof. Andrew Rosenberg.

Representing Acoustic Information
16-bit samples at a 44.1 kHz sampling rate: ~86 kB/sec, ~5 MB/min (one channel).
Waves repeat, so much of this data is redundant.
A good representation of speech (for recognition):
  keeps all of the information needed to discriminate between phones;
  is compact, i.e. gets rid of everything else.

Frame-Based Analysis
Using a short analysis window, analyze the waveform every 10 ms (or at another analysis rate).
Usually performed with overlapping windows, e.g. FFT and spectrogram.

Overlapping Frames
Spectrograms allow for visual inspection of spectral information.
We are looking for a compact, numerical representation.
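
The frame-based analysis described above can be sketched in a few lines of Python. This is only an illustration: the slides give the 10 ms analysis rate, but the 25 ms window length, the 16 kHz sampling rate, and the helper name `frame_signal` are assumptions made here.

```python
def frame_signal(samples, sample_rate, window_ms=25.0, step_ms=10.0):
    """Split a list of samples into overlapping frames.

    step_ms is the 10 ms analysis rate from the slide; window_ms is an
    assumed frame length (the slides do not give one).
    """
    window = int(round(sample_rate * window_ms / 1000.0))
    step = int(round(sample_rate * step_ms / 1000.0))
    frames = []
    start = 0
    # Keep only frames that fit entirely inside the signal.
    while start + window <= len(samples):
        frames.append(samples[start:start + window])
        start += step
    return frames


# One second of placeholder samples at an assumed 16 kHz sampling rate.
signal = list(range(16000))
frames = frame_signal(signal, 16000)
```

Because the window is longer than the step, consecutive frames share samples; with these assumed settings each frame overlaps the next by 15 ms.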

(Diagram: successive overlapping frames, spaced 10 ms apart.)

Example Spectrogram
(Figure: example spectrogram from Praat.)

Standard Representation in the Field
Mel Frequency Cepstral Coefficients (MFCC)

(Diagram of the MFCC pipeline: pre-emphasis -> window -> FFT -> mel filter bank -> log -> FFT^-1 -> deltas, plus energy. Output: 12 MFCC and 1 energy, each with delta and double-delta versions.)

Pre-emphasis
In the spectrum of voiced segments, there is more energy at lower frequencies than at higher frequencies.
Boosting the high frequencies makes high-frequency information more available.
Pre-emphasis uses a first-order high-pass filter.
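
A first-order high-pass filter of the kind mentioned above is commonly written as y[n] = x[n] - alpha * x[n-1]. The sketch below assumes that form and a coefficient of 0.97; the slide gives neither, and `pre_emphasize` is a hypothetical helper name.

```python
def pre_emphasize(samples, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1] to a list of samples.

    alpha = 0.97 is an assumed, commonly used value; the slide does not
    state a coefficient.
    """
    if not samples:
        return []
    # The first sample has no predecessor, so it is passed through.
    out = [float(samples[0])]
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out


slow = pre_emphasize([1.0, 1.0, 1.0, 1.0])    # constant (low-frequency) input
fast = pre_emphasize([1.0, -1.0, 1.0, -1.0])  # alternating (high-frequency) input
```

After the first sample, the constant input is almost cancelled while the alternating input is nearly doubled, which is the high-frequency boost the slide describes.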

(Figure 9.9)

Windowing
Overlapping windows allow the analysis to be centered at a frame point while using more of the surrounding signal.

(Figure 9.10)

Hamming Windowing
Discontinuities at the edges of the window can cause problems for the FFT.
A Hamming window smooths out the edges.

(Figure 9.11, Figure 9.12)
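
To make the edge smoothing concrete, here is a minimal sketch of a Hamming window. It assumes the common symmetric definition w[n] = 0.54 - 0.46 cos(2 pi n / (N - 1)); some texts divide by N instead, and the slides do not give a formula. The function names are hypothetical.

```python
import math


def hamming(length):
    """Hamming window, assuming the symmetric (N - 1) definition."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def apply_window(frame, window):
    """Multiply a frame by a window, sample by sample."""
    return [s * w for s, w in zip(frame, window)]


window = hamming(5)
# A constant frame is tapered toward its edges instead of ending abruptly.
tapered = apply_window([1.0] * 5, window)
```

The window is close to 0.08 at both ends and 1.0 in the middle, so samples near the frame boundary contribute little and the abrupt cut-off is softened.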

Discrete Fourier Transform
The Discrete Fourier Transform (DFT) is usually computed with the Fast Fourier Transform (FFT) algorithm.
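
The relationship between the two can be illustrated by computing the same transform both ways. The sketch below compares a direct evaluation of the DFT sum with a recursive radix-2 FFT; the radix-2 variant is one common FFT algorithm, chosen here for brevity, and is not necessarily the one used in the lecture.

```python
import cmath


def dft(x):
    """Direct O(N^2) evaluation of X[k] = sum_n x[n] * exp(-2j*pi*k*n/N)."""
    size = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / size)
            for n in range(size))
        for k in range(size)
    ]


def fft(x):
    """Recursive radix-2 FFT; the input length must be a power of two."""
    size = len(x)
    if size == 0 or size & (size - 1):
        raise ValueError("length must be a power of two")
    if size == 1:
        return [complex(x[0])]
    even = fft(x[0::2])   # transform of even-indexed samples
    odd = fft(x[1::2])    # transform of odd-indexed samples
    half = size // 2
    out = [0j] * size
    for k in range(half):
        twiddle = cmath.exp(-2j * cmath.pi * k / size) * odd[k]
        out[k] = even[k] + twiddle
        out[k + half] = even[k] - twiddle
    return out


samples = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
slow_result = dft(samples)
fast_result = fft(samples)
```

Both functions return the same spectrum up to rounding error; the FFT simply reuses the half-length transforms instead of recomputing every term of the sum.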


(Figure: Australian male /i:/ from "heed"; FFT analysis window 12.8 ms. Source: http://clas.mq.edu.au/acoustics/speech_spectra/fft_lpc_settings.html)

Mel Filter Bank and Log
Human hearing is not equally sensitive in all frequency regions.
Modeling human hearing sensitivity helps phone recognition.
MFCC approach: warp frequencies from Hz to the mel frequency scale.
Mel: a unit of pitch defined so that pairs of sounds that are perceptually equidistant in pitch are separated by an equal number of mels.

Mel Frequency Filter Bank
Create a bank of filters, each collecting energy from one frequency band: 10 filters spaced linearly below 1000 Hz, and the rest spread logarithmically above 1000 Hz.
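
A filter bank along these lines can be sketched as follows. Two assumptions are made that are not in the slides: the Hz-to-mel warp uses the widely quoted formula mel = 2595 * log10(1 + f / 700), and the filters are triangles whose edges are equally spaced in mels. That spacing is roughly linear below 1000 Hz and roughly logarithmic above it, so it approximates the layout described above rather than reproducing it exactly. All function names are hypothetical.

```python
import math


def hz_to_mel(hz):
    """Assumed warp: mel = 2595 * log10(1 + hz / 700)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(num_filters, fft_size, sample_rate):
    """Triangular filters over FFT bins 0..fft_size//2, equally spaced in mels."""
    top_mel = hz_to_mel(sample_rate / 2.0)
    # num_filters + 2 edge frequencies: each filter spans three of them.
    edges = [mel_to_hz(top_mel * i / (num_filters + 1))
             for i in range(num_filters + 2)]
    bin_hz = [b * sample_rate / fft_size for b in range(fft_size // 2 + 1)]
    bank = []
    for m in range(1, num_filters + 1):
        low, center, high = edges[m - 1], edges[m], edges[m + 1]
        weights = []
        for f in bin_hz:
            if low < f <= center:
                weights.append((f - low) / (center - low))    # rising slope
            elif center < f < high:
                weights.append((high - f) / (high - center))  # falling slope
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


def log_mel_energies(power_spectrum, bank, floor=1e-10):
    """Log of the energy collected by each filter (floored to avoid log 0)."""
    return [math.log(max(floor, sum(w * p for w, p in zip(weights, power_spectrum))))
            for weights in bank]


# Assumed settings: 26 filters, 512-point FFT, 16 kHz sampling rate.
bank = mel_filterbank(26, 512, 16000)
energies = log_mel_energies([1.0] * 257, bank)
```

Each row of `bank` weights the power spectrum of one frame; taking the log of each weighted sum gives the values that the next pipeline stage transforms into cepstral coefficients.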

(Figure 9.13)

Cepstrum
Separation of source and filter.
Source differences are speaker dependent.
Filter differences are phone dependent.
The cepstrum is the spectrum of the log of the spectrum: the inverse DFT of the log magnitude of the DFT of the signal.

Cepstrum Visualization
The peak at 120 samples represents the glottal pulse, corresponding to F0.
Large values closer to zero correspond to the vocal tract filter (tongue position, jaw opening, etc.).
It is common to take the first 12 coefficients.
(Figure 9.14)
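
The definition of the cepstrum and the reading of its peak can be tried out on a toy signal. The sketch below is not speech: it uses a single pulse plus a half-strength copy 120 samples later as a crude stand-in for excitation that repeats every 120 samples. The 16 kHz sampling rate is an assumption (the slide does not state one), and the plain cepstrum is shown rather than the mel-warped version used for MFCCs.

```python
import cmath
import math


def dft(values, inverse=False):
    """Direct DFT (or inverse DFT) of a list of numbers."""
    size = len(values)
    sign = 1.0 if inverse else -1.0
    scale = 1.0 / size if inverse else 1.0
    return [
        scale * sum(values[n] * cmath.exp(sign * 2j * cmath.pi * k * n / size)
                    for n in range(size))
        for k in range(size)
    ]


def real_cepstrum(frame, floor=1e-12):
    """Inverse DFT of the log magnitude of the DFT of the frame."""
    log_magnitude = [math.log(max(abs(v), floor)) for v in dft(frame)]
    return [v.real for v in dft(log_magnitude, inverse=True)]


# Toy signal: one pulse and a half-strength repeat 120 samples later.
frame = [0.0] * 512
frame[0] = 1.0
frame[120] = 0.5

cepstrum = real_cepstrum(frame)

# Largest value away from index 0; only the first half is searched
# because the real cepstrum is symmetric.
peak = max(range(1, 256), key=lambda n: cepstrum[n])

sample_rate = 16000                 # assumed; not given on the slide
f0_estimate = sample_rate / peak    # repetition rate implied by the peak

# The low-index values; whether index 0 is counted among the "first 12"
# is a convention the slide does not specify.
first_12 = cepstrum[:12]
```

The repetition shows up as a cepstral peak at the repeat distance, which is how a peak at 120 samples can be read as the pitch period; dividing the sampling rate by that distance gives the corresponding frequency.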

Deltas and Energy
Energy within a frame is just the sum of the power of the samples.

The spectrum of some phones changes over time, e.g. from stop closure to stop burst, or in the slope of a formant.
Taking the delta (velocity) and double delta (acceleration) of the features incorporates this information.
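
Frame energy and the delta features can be sketched as below. The slides do not give a delta formula, so this assumes the simple two-point difference d[t] = (c[t+1] - c[t-1]) / 2, with edge frames reusing their nearest neighbour; both choices and all function names are assumptions for illustration.

```python
def frame_energy(frame):
    """Sum of the squared samples in one frame."""
    return sum(s * s for s in frame)


def deltas(sequence):
    """Frame-to-frame rate of change, d[t] = (c[t+1] - c[t-1]) / 2.

    sequence holds one feature vector per frame. At the first and last
    frame the missing neighbour is replaced by the frame itself (an
    assumed padding choice).
    """
    last = len(sequence) - 1
    out = []
    for t in range(len(sequence)):
        before = sequence[max(t - 1, 0)]
        after = sequence[min(t + 1, last)]
        out.append([(b - a) / 2.0 for a, b in zip(before, after)])
    return out


def add_dynamics(static):
    """Append deltas and double deltas to each frame's static features."""
    velocity = deltas(static)
    acceleration = deltas(velocity)
    return [s + v + a for s, v, a in zip(static, velocity, acceleration)]


# Placeholder static features: 13 values per frame (standing in for
# 12 cepstral coefficients plus 1 energy), rising by 1 each frame.
static = [[float(t)] * 13 for t in range(5)]
features = add_dynamics(static)
```

With 13 static values per frame, appending the deltas and double deltas yields 39 values per frame.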

Summary: MFCC
Commonly, MFCCs have 39 features:
12 cepstral coefficients
12 delta cepstral coefficients
12 delta-delta cepstral coefficients
1 energy coefficient
1 delta energy coefficient
1 delta-delta energy coefficient

Next Class
Introduction to Statistical Modeling and Classification
Reading: J&M 9.4, optional 6.6

