
Analysis of the effects of signal distance on spectrograms caused by the sudden pressure change that...

Date post: 14-Apr-2018
Upload: vandat
2014 SGHA 8/19/2014 Analysis of the effects of signal distance on spectrograms

Contents

Introduction

Scope

Data Comparisons

Results

Recommendations

References

Introduction

This article summarizes background data collected to measure how a spectrogram of speech changes as the distance between the speaker and the microphone increases. In effect, we are decreasing the signal-to-noise ratio by making the signal weaker. The phrase "The quick brown fox jumped over the lazy dog" was spoken next to the microphone. The same phrase was then repeated at 10-foot intervals, up to a final distance of 60 feet from the recorder. A DMR-40 four-channel recorder was used with a Shure microphone to record the audio. The spectrograms were analyzed with three programs: PRAAT, Voice Analyzer and Sonic Visualiser.
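
As a rough illustration of why a weaker signal means a lower signal-to-noise ratio, the sketch below computes the ratio in decibels for two amplitudes over the same noise floor. The amplitude values are hypothetical and were not measured in this study.

```python
import math

def snr_db(signal_rms, noise_rms):
    """Signal-to-noise ratio in decibels from two RMS amplitudes."""
    return 20.0 * math.log10(signal_rms / noise_rms)

# Hypothetical amplitudes, not measurements from this study.
noise_floor = 0.01
near = snr_db(0.5, noise_floor)    # speaker next to the microphone
far = snr_db(0.05, noise_floor)    # same speech arriving ten times weaker

print(round(near, 1), round(far, 1))
```

The noise floor stays where it is, so every tenfold drop in signal amplitude costs 20 dB of signal-to-noise ratio.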

Scope

A spectrogram is a visual representation of the frequency content of a signal. A spectrogram shows how the quantity of energy in different frequency regions varies as a function of time. On a spectrogram, the signal is divided into many small time sections and each section is analyzed in terms of what frequency components are present in the section. This analysis is called spectral analysis because the spectrum of each section is calculated and the quantity of each frequency component (that is, each sinusoid) is measured from the spectrum. The quantity of each component is then converted to a grey level in which (normally) low energy components are converted to a white color, while high energy components are converted to a black color. These colors are then plotted on a vertical strip corresponding to the time at which the original signal segment occurred. The height of the colored element on this vertical strip represents the frequency of the component.

Thus a spectrogram is a 3-dimensional analysis of a signal: the horizontal dimension is time, the vertical dimension is frequency, and the grey-scale shows the amount of energy occurring in the signal at each time and frequency.
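
The procedure described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the method used by PRAAT, Voice Analyzer or Sonic Visualiser. The function name, the frame length, the hop size and the test tone are all assumptions made for the example.

```python
import cmath
import math

def spectrogram(samples, frame_len=64, hop=32):
    """Return a list of time sections; each section is a list of
    magnitudes, one per frequency bin from 0 Hz up to half the sample rate."""
    # A Hann window tapers each section so its edges do not smear the spectrum.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        section = [samples[start + n] * window[n] for n in range(frame_len)]
        magnitudes = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of one bin: correlate the section with a sinusoid.
            acc = sum(section[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames

# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(512)]
frames = spectrogram(tone)

# The strongest bin of the first section, converted back to Hz.
peak_bin = max(range(len(frames[0])), key=lambda k: frames[0][k])
peak_hz = peak_bin * rate / 64
print(len(frames), len(frames[0]), peak_hz)
```

Each inner list is one vertical strip of the picture. Mapping larger magnitudes to darker greys gives the display described above.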

If you study a wide-band spectrogram of, say, a couple of words of speech, you should be able to see some of the following events:

Larynx excitation pulses: these appear as vertical dark lines at intervals of between 5 and 10 ms or so. These are also called striations. Each one of these is caused by the sudden pressure change that arises above the larynx when the vocal folds close suddenly, cutting off the flow of air from the lungs. This change is so sudden it creates a kind of pressure "pulse" which contains energy at a wide range of frequencies, commonly up to 4 kHz or more.

Formant vibrations: between the striations you will see dark regions which only occur at particular frequencies. These regions, which often appear several hundred Hz across because of the limitations of the wide-band frequency analysis of the spectrograph, are caused by the ringing of the vocal tract resonances (or formants) as each pulse from the larynx excites them. If you look carefully, you may see that these vibrations are larger (darker) just after the pulse, and get paler as the energy in the vibrations is lost from the vocal tract.

Changes in formant frequency: You should see that the dark regions caused by formant vibration change in frequency through the utterance. You may see that the resonances have a kind of continuity in time, and slowly rise and fall in frequency through a syllable, and from one syllable to the next. These slow and smooth changes occur because the frequencies of the vocal tract resonances are set by the shape of the vocal tract tube, which in turn is controlled by the position of the articulators. Since the articulators move relatively slowly (a few syllables per second), the formant frequencies appear to move slowly too.

Turbulent sounds: In regions where there is no larynx vibration and hence no striations you should see some "speckled" rather noisy unstructured regions of dark color, often towards the high frequency end of the picture. These are "noise" sounds caused by turbulence in the vocal tract, for instance: bursts, aspiration and frication. Bursts are often short vertical bars, a lot like a striation, caused by the sudden pressure change in the vocal tract when a stop articulation is released. Aspiration is turbulence that occurs in the larynx, caused by a narrowing of the airway from the lungs made by the vocal folds coming close together. Frication is turbulence that occurs at other points of narrowing in the vocal tract, made with the tongue or the lips. If you look carefully you may see differences in the frequency content of bursts or fricatives originating from different places of articulation. This is because the different articulator configurations shape the sound generated by the turbulence in different ways depending on the size and shape of the vocal tract tube in front of the constriction.
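
Two of the numbers in the list above follow from simple reciprocal arithmetic, sketched here. The striation spacing gives the rate of larynx pulses. The analysis window length gives a rule-of-thumb frequency resolution, whose exact factor depends on the window shape. The 3 ms window is an assumed, typical wide-band setting and is not taken from this study.

```python
def pulse_rate_hz(interval_ms):
    """Rate of larynx pulses for a given spacing between striations."""
    return 1000.0 / interval_ms

def approx_bandwidth_hz(window_ms):
    """Rule-of-thumb analysis bandwidth: the reciprocal of the window length."""
    return 1000.0 / window_ms

slow = pulse_rate_hz(10.0)            # striations 10 ms apart
fast = pulse_rate_hz(5.0)             # striations 5 ms apart
wide_band = approx_bandwidth_hz(3.0)  # assumed 3 ms wide-band window

print(slow, fast, round(wide_band))
```

Striations 5 to 10 ms apart therefore correspond to 100 to 200 larynx pulses per second, and a window of a few milliseconds blurs each formant across a few hundred Hz.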

In the vowels, F1 can vary from 300 Hz to 1000 Hz. The lower it is, the closer the tongue is to the roof of the mouth. The vowel /i:/ as in the word 'beet' has one of the lowest F1 values - about 300 Hz; in contrast, the vowel /A/ as in the word 'bought' (or 'Bob' in speakers who distinguish the vowels in the two words) has the highest F1 value - about 950 Hz. Pronounce these two vowels and try to determine how your tongue is configured for each.

F2 can vary from 850 Hz to 2500 Hz; the F2 value tracks the frontness or backness of the highest part of the tongue during the production of the vowel: the further front the tongue, the higher F2.

Data Comparisons

The first comparison of the data was made using Voice Analyzer software.

Spectrogram with the speaker next to the recorder.

Spectrogram with the speaker 10' away from the recorder.

Spectrogram with the speaker 20' away from the recorder.

Spectrogram with the speaker 30' away from the recorder.

Spectrogram with the speaker 40' away from the recorder.

Spectrogram with the speaker 50' away from the recorder.

Spectrogram with the speaker 60' away from the recorder.

As the speaker moves away from the recorder and the signal weakens, the higher and mid-range frequencies are the first to be affected.

The next image is a comparison between the 0' sample and the 60' sample in Sonic Visualiser. Some formant vibrations are still visible in the weaker signal.

The next analysis was performed using PRAAT software. The samples used for comparison are the samples taken at 0' and 60' from the recorder.

When speech is corrupted by stationary noise, features go missing from the spectrogram. Most of the higher frequencies vanish first, followed by the midrange frequencies that are at lower decibel levels.

The next test replicated the standard analysis technique used by most ghost hunters and paranormal researchers. The sample that was recorded at 60' from the microphone was amplified by 60 dB. A noise profile was then captured and noise reduction was applied to the audio file.

The spectrogram of the new file was then analyzed in PRAAT.
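
For reference, a gain in decibels converts to an amplitude multiplier as sketched below. The sketch also shows that plain amplification scales the speech and the noise together, so on its own it does not improve the signal-to-noise ratio. The RMS values are hypothetical and are not taken from the recordings.

```python
import math

def db_to_amplitude_ratio(gain_db):
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)

def snr_db(signal_rms, noise_rms):
    """Signal-to-noise ratio in decibels from two RMS amplitudes."""
    return 20.0 * math.log10(signal_rms / noise_rms)

gain = db_to_amplitude_ratio(60.0)

# Hypothetical weak recording: speech barely above the noise floor.
signal_rms, noise_rms = 0.002, 0.001
before = snr_db(signal_rms, noise_rms)
after = snr_db(signal_rms * gain, noise_rms * gain)

print(gain, round(before, 2), round(after, 2))
```

A 60 dB boost multiplies every sample by 1000, noise included, which is why noise reduction is usually applied as a second step.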

PRAAT analysis showing the missing segments of speech in the 60' sample (bottom), which has been amplified and had noise reduction applied. The sample recorded at the microphone (0') is on top.

Large amounts of speech have been lost because of application of noise reduction.
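
One way to see why this happens is a simple spectral gate, sketched below. It is an assumed, simplified stand-in for profile-based noise reduction, not necessarily the algorithm used by the software in this test. The function names, the threshold and all magnitudes are hypothetical.

```python
def noise_profile(noise_frames):
    """Average magnitude per frequency bin over noise-only sections."""
    n_bins = len(noise_frames[0])
    return [sum(frame[k] for frame in noise_frames) / len(noise_frames)
            for k in range(n_bins)]

def spectral_gate(frames, profile, threshold=2.0):
    """Zero every bin whose magnitude is below threshold * noise profile."""
    return [[m if m >= threshold * p else 0.0
             for m, p in zip(frame, profile)]
            for frame in frames]

# Hypothetical magnitudes for four frequency bins, low to high.
noise_only = [[1.0, 1.0, 1.0, 1.0],
              [1.2, 0.8, 1.0, 1.0]]
profile = noise_profile(noise_only)

# One section of distant speech: strong at low frequencies,
# close to the noise floor at high frequencies.
speech = [[9.0, 3.0, 1.5, 1.1]]
gated = spectral_gate(speech, profile)

print(gated)
```

The two high-frequency bins hold real speech energy, but they sit too close to the noise profile to pass the gate, so they are removed along with the noise.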

Below is the same comparison, but the sample recorded at 60' has not been amplified or had noise reduction applied. More segments of speech are visible, which would increase the accuracy of determining what was spoken.

Results

The study clearly shows that as the signal becomes weaker and sinks into the noise floor, segments of speech are lost. The techniques used by ghost hunters and other paranormal enthusiasts compound the problem by applying noise reduction in an attempt to hear the voice more clearly. This process also destroys essential elements of the speech, which increases the probability of pareidolia when attempting to identify words.

Recommendations

By using software designed for the analysis of spectrograms, it is possible to identify vowels and other features of speech without applying noise reduction. This is accomplished by selecting a section in the middle of the formant and measuring the frequencies of the F1 and F2 formants. These frequencies are then plotted against a vowel triangle to determine the most probable vowel that is in the audio file.

Using our data, we were able to identify the same vowels in the sample recorded at the recorder and the sample recorded 60' from the recorder. There were some minor differences in the frequencies of the F1 and F2 formants but they were within the tolerance of the frequencies typically assigned to the vowels on the triangle.
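
The vowel lookup can be sketched as a nearest-point search on the F1/F2 plane. The reference table below is hypothetical: the F1 values for /i:/ and /A/ follow the figures quoted in the Scope section, while the F2 values and the /u:/ entry are placeholders chosen only for illustration. The measured pairs are also invented, not taken from our data. Straight-line distance in Hz is a simplification of reading a vowel triangle by eye.

```python
import math

# Hypothetical corner points of a vowel triangle as (F1 Hz, F2 Hz).
REFERENCE_VOWELS = {
    "i:": (300.0, 2300.0),
    "A": (950.0, 1100.0),
    "u:": (300.0, 900.0),
}

def nearest_vowel(f1, f2, references=REFERENCE_VOWELS):
    """Return the reference vowel closest to the measured (F1, F2) pair."""
    return min(references,
               key=lambda vowel: math.dist((f1, f2), references[vowel]))

# Invented measurements of the same vowel at two distances.
at_recorder = nearest_vowel(310.0, 2250.0)
at_60_feet = nearest_vowel(340.0, 2180.0)

print(at_recorder, at_60_feet)
```

Small shifts in F1 and F2 between the two recordings leave the nearest reference vowel unchanged, which is the sense in which the differences were within tolerance.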

References

http://www.speechandhearing.net/faq/faq1.php

