Page 1: Auditory Perception

Auditory Perception

April 9, 2009

Page 2: Auditory Perception

Auditory vs. Acoustic

• So far, we’ve seen two different auditory measures:

1. Mels (unit of perceived pitch)

• Auditory correlate of Hertz (frequency)

2. Sones (unit of perceived loudness)

• Auditory correlate of decibels (intensity)

• Both were derived from pitch and loudness estimation experiments…

Page 3: Auditory Perception

Masking

• Another scale for measuring auditory frequency emerged in the 1960s.

• This scale was inspired by the phenomenon of auditory masking.

• One sound can “mask”, or obscure, the perception of another.

• Unmasked: [audio demo]

• Masked: [audio demo]

• Q: How narrow can we make the bandwidth of the noise before the sine wave becomes perceptible?

• A: Masking bandwidth is narrower at lower frequencies.
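A minimal sketch of how such masking stimuli could be generated (assuming Python with NumPy/SciPy; the tone frequency, level, and bandwidths below are illustrative, not the original experiment’s values):

```python
import numpy as np
from scipy.signal import butter, lfilter

fs = 16000                                  # sampling rate (Hz)
t = np.arange(0, 1.0, 1 / fs)               # one second of samples
tone = 0.1 * np.sin(2 * np.pi * 1000 * t)   # the sine wave to be masked

def masker(bandwidth, center=1000.0):
    """Noise band-pass filtered around the tone's frequency."""
    b, a = butter(4, [center - bandwidth / 2, center + bandwidth / 2],
                  btype="band", fs=fs)
    return lfilter(b, a, np.random.randn(len(t)))

# Narrow the noise band step by step; at some bandwidth the tone
# becomes audible again, and that width estimates the critical band.
stimuli = {bw: tone + masker(bw) for bw in (800, 400, 200, 100, 50)}
```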

Page 4: Auditory Perception

Critical Bands

• Using this methodology, researchers eventually determined that there were 24 critical bands of hearing.

• The auditory system integrates all acoustic energy within each band.

• Two tones within the same critical band of frequencies sound like one tone.

• Ex: critical band #9 ranges from 920-1080 Hz

• F1 and F2 for a given vowel might merge together.

• Each critical band spans about 0.9 mm on the basilar membrane.

• The auditory system thus acts like a bank of 24 band-pass filters.

• Each filter corresponds to one unit on the Bark scale.

Page 5: Auditory Perception

Bark Scale of Frequency

• The Bark scale converts acoustic frequencies into numbers for each critical band

Page 6: Auditory Perception

Bark Table

Band   Center (Hz)   Bandwidth (Hz)      Band   Center (Hz)   Bandwidth (Hz)
  1        50            20-100           13       1850         1720-2000
  2       150           100-200           14       2150         2000-2320
  3       250           200-300           15       2500         2320-2700
  4       350           300-400           16       2900         2700-3150
  5       450           400-510           17       3400         3150-3700
  6       570           510-630           18       4000         3700-4400
  7       700           630-770           19       4800         4400-5300
  8       840           770-920           20       5800         5300-6400
  9      1000           920-1080          21       7000         6400-7700
 10      1170          1080-1270          22       8500         7700-9500
 11      1370          1270-1480          23      10500         9500-12000
 12      1600          1480-1720          24      13500        12000-15500
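The table can also be approximated in closed form. A minimal sketch, using Traunmüller’s (1990) Hz-to-Bark formula together with a band lookup built from the edges above:

```python
def hz_to_bark(f):
    """Traunmüller's (1990) closed-form approximation of the Bark scale."""
    return 26.81 * f / (1960.0 + f) - 0.53

# Critical band edges (Hz), read off the table above:
EDGES = [20, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480,
         1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700,
         9500, 12000, 15500]

def critical_band(f):
    """Return the Bark band number (1-24) that contains frequency f."""
    for band, (lo, hi) in enumerate(zip(EDGES, EDGES[1:]), start=1):
        if lo <= f < hi:
            return band
    raise ValueError("frequency outside the 20-15500 Hz range")

# Two tones inside one band should perceptually fuse into one:
assert critical_band(950) == critical_band(1050) == 9
print(round(hz_to_bark(1000), 2))   # -> 8.53, i.e. near band 9's center
```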

Page 7: Auditory Perception

Your Grandma’s Spectrograph

• Originally, spectrographic analyzing filters were constructed to have either wide or narrow bandwidths.

Page 8: Auditory Perception

Spectral Differences

• Acoustic vs. auditory spectra of F1 and F2

Page 9: Auditory Perception

Cochleagrams

• Cochleagrams are spectrogram-like representations which incorporate auditory transformations for both pitch and loudness perception.

• Acoustic spectrogram vs. auditory cochleagram representation of a Cantonese word.

• Check out Peter’s vowels in Praat.

Page 10: Auditory Perception

Cochlear Implants

• Cochlear implants transmit sound directly to the cochlea through a series of band-pass filters…

• like the critical bands in our native auditory system.

• These devices can benefit profoundly deaf listeners with nerve deafness.

• = loss of working hair cells in the inner ear.

• Contrast with: a hearing aid, which is simply an amplifier.

• Old style: amplifies all frequencies

• New style: amplifies specific frequencies, based on a listener’s particular hearing capabilities.

Page 11: Auditory Perception

Cochlear Implants

A cochlear implant artificially stimulates the nerves which are connected to the cochlea.

Page 12: Auditory Perception

Nuts and Bolts

• The cochlear implant chain of events:

1. Microphone

2. Speech processor

3. Electrical stimulation

• What the CI user hears is entirely determined by the code in the speech processor.

• The number of electrodes stimulating the cochlea ranges from 8 to 22.

• The result: poor frequency resolution.

• Also: cochlear implants cannot stimulate the low-frequency regions of the auditory nerve.

Page 13: Auditory Perception

Noise Vocoding

• The speech processor operates like a series of critical bands.

• It divides up the frequency scale into 8 (or 22) bands and stimulates each electrode according to the average intensity in each band.

This results in what sounds (to us) like a highly degraded version of natural speech.
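A minimal sketch of this kind of noise vocoder (assuming Python with NumPy/SciPy; the number of channels, band edges, filter order, and envelope smoothing are illustrative choices, and real CI processors differ in detail):

```python
import numpy as np
from scipy.signal import butter, sosfilt, sosfiltfilt

def noise_vocode(signal, fs, edges):
    """Split the spectrum into bands, extract each band's amplitude
    envelope, and use it to modulate noise filtered into the same band."""
    out = np.zeros(len(signal))
    env_sos = butter(2, 50.0, btype="low", fs=fs, output="sos")  # envelope smoother
    for lo, hi in zip(edges, edges[1:]):
        sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfilt(sos, signal)                      # one channel of speech
        envelope = np.clip(sosfiltfilt(env_sos, np.abs(band)), 0, None)
        carrier = sosfilt(sos, np.random.randn(len(signal)))  # noise in same band
        out += envelope * carrier
    return out / np.max(np.abs(out))                     # normalize

# e.g. an 8-channel version with log-spaced bands from 100 Hz to 6 kHz:
# vocoded = noise_vocode(speech, fs, np.geomspace(100, 6000, 9))
```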

Page 14: Auditory Perception

What CIs Sound Like

• Check out some nursery rhymes which have been processed through a CI simulator:

Page 15: Auditory Perception

CI Perception

• One thing that is missing from vocoded speech is F0.

• …It only encodes spectral change.

• Last year, Aaron Byrnes put together an experiment testing intonation perception in CI-simulated speech for his honors thesis.

• Tested: discrimination of questions vs. statements

• And identification of most prominent word in a sentence.

• 8 channels:

• 22 channels:

Page 16: Auditory Perception

The Findings

• CI User:

• Excellent identification of the most prominent word.

• At chance (50%) when distinguishing between statements and questions.

• Normal-hearing listeners (hearing simulated speech):

• Good (90-95%) identification of the prominent word.

• Not too shabby (75%) at distinguishing statements and questions.

• Conclusion 1: F0 information doesn’t get through the CI.

• Conclusion 2: Noise-vocoded speech might not be a completely accurate CI simulation.

Page 17: Auditory Perception

Mitigating Factors

• Success with cochlear implants is highly variable.

• Works best for those who had hearing before they became deaf.

• The earlier a person receives an implant, the better they can function with it later in life.

• Works best for (in order):

• Environmental Sounds

• Speech

• Speaking on the telephone (bad)

• Music (really bad)

Page 18: Auditory Perception

Practical Considerations

• It is largely unknown how well anyone will perform with a cochlear implant before they receive it.

• Possible predictors:

• lipreading ability

• rapid cues for place are largely obscured by the noise vocoding process.

• fMRI scans of brain activity during presentation of auditory stimuli.

Page 19: Auditory Perception

Infrared Implants?

• Some very recent research has shown that cells in the inner ear can be activated through stimulation by infrared light.

• This may enable the eventual development of cochlear implants with very precise frequency and intensity tuning.

• Another research strategy is that of trying to regrow hair cells in the inner ear.

Page 20: Auditory Perception

One Last Auditory Thought

• Frequency coding of sound is found all the way up in the auditory cortex.

• Also: some neurons only fire when sounds change.

Page 21: Auditory Perception

A Philosophical Interlude

• Q: What’s a category?

• A classical answer:

• A category is defined by properties.

• All members of the category exhibit the same properties.

• No non-members of the category exhibit all of those properties.

The properties of any member of the category may be split into:

• Definitive properties

• Incidental properties

Page 22: Auditory Perception

Classical Example

• A rectangle (in Euclidean geometry) may be defined as having the following properties:

1. Four-sided, two-dimensional figure (quadrilateral)

2. Four right angles

This is a rectangle.

Page 23: Auditory Perception

Classical Example

• Adding a third property gives the figure a different category classification:

1. Four-sided, two-dimensional figure (quadrilateral)

2. Four right angles

3. Four equally long sides

This is a square.

Page 24: Auditory Perception

Classical Example

• Altering other properties does not change the category classification:

1. Four-sided, two-dimensional figure (quadrilateral)

2. Four right angles

3. Four equally long sides

A. Is red.

This is still a square. (Properties 1-3 are the definitive properties; “is red” is an incidental property.)
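In code, classical categorization is just a check of the definitive properties; incidental properties are never consulted. A toy Python sketch:

```python
# Membership is decided entirely by the definitive properties.
def classify(figure):
    if figure["sides"] == 4 and figure["right_angles"] == 4:
        if figure["equal_sides"]:
            return "square"      # quadrilateral + 4 right angles + equal sides
        return "rectangle"       # quadrilateral + 4 right angles
    return "other"

red_square = {"sides": 4, "right_angles": 4, "equal_sides": True,
              "color": "red"}    # color is incidental: never consulted
assert classify(red_square) == "square"
```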

Page 25: Auditory Perception

Classical Linguistic Categories

• Formal phonology traditionally defined all possible speech sounds in terms of a limited number of properties, known as “distinctive features”. (Chomsky & Halle, 1968)

[d] = [CORONAL, +voice, -continuant, -nasal, etc.]

[n] = [CORONAL, +voice, -continuant, +nasal, etc.]

• Similar approaches have been applied in syntactic analysis. (Chomsky, 1974)

Adjectives = [+N, +V]

Prepositions = [-N, -V]
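Feature bundles translate naturally into attribute-value maps, and a natural class is then just a partial bundle that members must match. A toy Python sketch (feature inventory abbreviated):

```python
# Segments as feature bundles; class membership = matching the
# class's defining feature values (all other features are ignored).
d = {"place": "CORONAL", "voice": True, "continuant": False, "nasal": False}
n = {"place": "CORONAL", "voice": True, "continuant": False, "nasal": True}

def in_class(segment, **features):
    return all(segment.get(k) == v for k, v in features.items())

# Both are voiced coronal stops, but only [n] is in the nasal class:
assert in_class(d, place="CORONAL", voice=True, continuant=False)
assert in_class(n, nasal=True) and not in_class(d, nasal=True)
```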

Page 26: Auditory Perception

Prototypes

• The psychological reality of classical categories was called into question by a series of studies conducted by Eleanor Rosch in the 1970s.

• Rosch claimed that categories were organized around privileged category members, known as prototypes.

• (instead of being defined by properties)

• Evidence for this theory initially came from linguistic tasks:

1. Semantic verification (Rosch, 1975)

• Is a robin a bird?

• Is a penguin a bird?

2. Category member naming.

Page 27: Auditory Perception

Prototype Category Example: “Bird”

Page 28: Auditory Perception

Exemplar Categories

• Cognitive psychologists in the late ‘70s (e.g., Medin & Schaffer, 1978) questioned the need for prototypes.

• Phenomena explained by prototype theory could be explained without recourse to a category prototype.

• The basic idea:

• Categories are defined by extension.

• Neither prototypes nor properties are necessary.

• Categorization works by comparing new tokens to all exemplars in memory.

• Generalization happens on the fly.

Page 29: Auditory Perception

A Category, Exemplar-style

“square”

Page 30: Auditory Perception

Back to Perception

• When people used to talk about categorical perception, they meant perception of classical categories.

• A stop is either a [b] or a [g]

• (no in between)

• Remember: in classical categories, there are:

• definitive properties

• incidental properties

• Q: What are the properties that define a stop category?

• The definitive properties must be invariant.

• (shared by all category members)

• So…what are the invariant properties of stop categories?

Page 31: Auditory Perception

The Acoustic Hypothesis

• People have looked long and hard for invariant acoustic properties of stops, with little success.

• (and some people are still looking)

• Frequency values of compact (synthetic) bursts cueing different places of articulation, in various vowel contexts.

(Liberman et al., 1952)

Page 32: Auditory Perception

Theoretical Revision

• Since invariant acoustic properties could not be found (especially for velars)…

• It was assumed that listeners perceived (articulatory) gestures, not (acoustic) sounds.

• Q: What invariant articulatory properties define stop categories?

• A: If they exist, they’re hard to find.

• Motor Theory Revision #2: Listeners perceive “intended” gestures.

• Note: “intentions” are kind of impossible to observe.

• But they must be invariant…right?

Page 33: Auditory Perception

Another Brick in the Wall

• Another problem for motor theory:

• Perception of speech sounds isn’t always categorical.

• In particular: vowels are perceived in a more gradient fashion than stops.

• However, vowel perception becomes more categorical when the vowels are extremely short.

Page 34: Auditory Perception

• It’s also hard to identify any invariant acoustic properties for vowels.

• Variation is rampant across:

• tokens

• speakers

• genders

• dialects

• age groups, etc.

• Variability = a huge problem for speech perception.

Page 35: Auditory Perception

More Problems

• Also: infants exhibit categorical perception, too…

• Even though they don’t know category labels.

• Chinchillas can do it, too!

Page 36: Auditory Perception

An Alternative

• It has been proposed that phoneme categories are defined by prototypes…

• which we use to identify vowels in speech.

• One relevant finding: the perceptual magnet effect.

• Part 1: play listeners a continuum of synthetic vowels in the neighborhood of [i].

• Task: judge how much each one sounds like [i].

• Some are better = prototypical

• Others are worse = non-prototypes

Page 37: Auditory Perception

Perceptual Magnets

• Part 2: define either a prototype or a non-prototype as a category center.

• Task: determine whether other vowels on the continuum belong to those categories.

• Result: more same responses when the category center is a prototype.

• Prototype = a “perceptual magnet”

Same? Different?

Page 38: Auditory Perception

Prototypes, continued

• The perceptual magnet prototypes are usually located at a listener’s average F1 and F2 values for [i].

• 4-month-olds exhibit the perceptual magnet effect…

• but monkeys do not.

• Note: the prototype is the only thing that has to be “invariant” about the category.

• particular properties aren’t important.

• Testing a prototype model on the Peterson & Barney data yielded 51% correct classification.

• (Human listeners got 94% correct)

• Variability is still hard to deal with.
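A sketch of the general shape of such a prototype model (an illustration, not the specific implementation behind the 51% figure): each vowel category is reduced to its mean (F1, F2), and each token is assigned to the nearest prototype.

```python
import numpy as np

def fit_prototypes(formants, labels):
    """formants: (n, 2) array of (F1, F2) values; labels: n vowel labels.
    The prototype of each category is simply its mean formant vector."""
    labels = np.asarray(labels)
    return {v: formants[labels == v].mean(axis=0) for v in set(labels)}

def classify(token, prototypes):
    """Assign a token to the category with the nearest prototype."""
    token = np.asarray(token, dtype=float)
    return min(prototypes, key=lambda v: np.linalg.norm(token - prototypes[v]))
```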

Page 39: Auditory Perception

Flipping the Script

• Another approach to speech perception is to preserve all variability that we hear…

• Rather than boiling it down to properties or prototypes.

• In this model, speech categories are defined by extension.

• = consist of exemplars

• So, your mental representation of /b/ consists of every token of /b/ you’ve ever heard in your life.

• …rather than any particular acoustic or articulatory properties.

• Analogy: phonetics field project notes

• (your mind is a pack rat)

Page 40: Auditory Perception

Exemplar Categorization

1. Stored memories of speech experiences are known as traces.

• Each trace is linked to a category label.

2. Incoming speech tokens are known as probes.

3. A probe activates the traces it is similar to.

• Note: amount of activation is proportional to similarity between trace and probe.

• Traces that closely match a probe are activated a lot;

• Traces that have no similarity to a probe are not activated much at all.
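One standard way to cash out “activation proportional to similarity” is an exponential decay over distance, as in many exemplar models; the kernel and its scale parameter here are modeling assumptions:

```python
import numpy as np

def activation(trace, probe, c=0.01):
    """Exponential-decay similarity: a trace identical to the probe
    gets activation 1.0; activation falls toward 0 with distance."""
    dist = np.linalg.norm(np.asarray(trace, float) - np.asarray(probe, float))
    return float(np.exp(-c * dist))
```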

Page 41: Auditory Perception

• A (pretend) example: traces = vowels from the Peterson & Barney data set.

• The activation of each trace depends on its distance (in vowel space) from the probe: traces close to the probe are highly activated, while distant traces receive little activation.

Page 42: Auditory Perception

Echoes from the Past

• The activations of all exemplars in memory are combined into a weighted sum, creating an echo returned by the perceptual system.

• This echo has more general features than either the traces or the probe.

• Inspiration: Francis Galton’s composite photographs.

Page 43: Auditory Perception

Exemplar Categorization II

• For each category label…

• The activations of the traces linked to it are summed up.

• The category with the most total activation wins.

• Note: we use all exemplars in memory to help us categorize new tokens.

• Also: any single trace can be linked to different kinds of category labels.

• Test: Peterson & Barney vowel data

• Exemplar model classified 81% of vowels correctly.
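Putting the pieces together, a self-contained sketch of exemplar categorization (the exponential similarity kernel is an assumption, and the stored formant values below are made up for illustration):

```python
import numpy as np

def categorize(probe, traces, labels, c=0.01):
    """Sum each trace's activation under its category label;
    the label with the most total activation wins."""
    probe = np.asarray(probe, dtype=float)
    dists = np.linalg.norm(np.asarray(traces, dtype=float) - probe, axis=1)
    acts = np.exp(-c * dists)              # activation of every trace
    totals = {}
    for act, label in zip(acts, labels):
        totals[label] = totals.get(label, 0.0) + act
    return max(totals, key=totals.get)

# Three stored /i/ tokens and three /ae/ tokens (hypothetical F1, F2 in Hz):
traces = [(270, 2290), (300, 2250), (280, 2300),
          (660, 1720), (700, 1650), (690, 1700)]
labels = ["i", "i", "i", "ae", "ae", "ae"]
print(categorize((290, 2280), traces, labels))   # -> "i"
```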

Page 44: Auditory Perception

Exemplar Predictions

• Point: all properties of all exemplars play a role in categorization…

• Not just the “definitive” ones.

• Prediction: non-invariant properties of speech categories should have an effect on speech perception.

• E.g., the voice in which a [b] is spoken.

• Or even the room in which a [b] is spoken.

• Is this true?

• Let’s find out…

Page 45: Auditory Perception

Another Experiment!

• Circle whether each word is a new or old word in the list.

[Answer sheet: items 1-24]

Page 46: Auditory Perception

Another Experiment!

• Circle whether each word is a new or old word in the list.

[Answer sheet: items 25-40]

Page 47: Auditory Perception

Continuous Word Recognition

• In a “continuous word recognition” task, listeners hear a long sequence of words…

• some of which are new words in the list, and some of which are repeats.

• Task: decide whether each word is new or a repeat.

• Twist: some repeats are presented in a new voice;

• others are presented in the old (same) voice.

• Finding: repetitions are identified more quickly and more accurately when they’re presented in the old voice. (Palmeri et al., 1993)

• Implication: we store voice + word info together in memory.

