Project proposal presentations

In class on 2016/10/07

10 minute presentation, 5 minutes for questions

You can use your own computer or the classroom computer

Should describe
- The problem you are trying to solve
- Why it is an interesting / important problem
- What data you are going to use
- What tools you are going to use
- How you will evaluate the success of your approach

Michael Mandel (83060 SAU) Auditory Perception 1 / 44


CSC 83060: Speech & Audio Understanding

Lecture 4: Auditory Perception

Michael Mandel

CUNY Graduate Center, Computer Science Program
http://mr-pc.org/t/csc83060

With much content from Dan Ellis’ EE 6820 course

1 Motivation: Why & how
2 Auditory physiology
3 Psychophysics: Detection & discrimination
4 Pitch perception
5 Speech perception
6 Auditory organization & Scene analysis


Why study perception?

Perception is messy: can we avoid it?
No!

Audition provides the ‘ground truth’ in audio
- what is relevant and irrelevant
- subjective importance of distortion (coding etc.)
- (there could be other information in sound...)

Some sounds are ‘designed’ for audition
- co-evolution of speech and hearing

The auditory system is very successful
- we would do extremely well to duplicate it

We are now able to model complex systems
- faster computers, bigger memories


How to study perception? Three different approaches

Analyze the example: physiology

- dissection & nerve recordings

Black box input/output: psychophysics
- fit simple models of simple functions

Information processing models
- investigate and model complex functions, e.g. scene analysis, speech perception


Physiology

Auditory Transduction animation: https://www.youtube.com/watch?v=PeTriGTENoc

Processing chain from air to brain:

Outer ear → Middle ear → Inner ear (cochlea) → Auditory nerve → Midbrain → Cortex

Study via:
- anatomy
- nerve recordings

Signals flow in both directions


Outer & middle ear

[Figure: outer and middle ear, labeling the pinna, ear canal, eardrum (tympanum), middle ear bones, and cochlea (inner ear)]

Pinna ‘horn’
- complex reflections give spatial (elevation) cues

Ear canal
- acoustic tube

Middle ear
- bones provide impedance matching


Inner ear: Cochlea

[Figure: the cochlea, labeling the oval window (from ME bones), the Basilar Membrane (BM), and the travelling wave; resonant frequency against position runs from 16 kHz at 0 mm to 50 Hz at 35 mm]

Mechanical input from middle ear starts traveling wave moving down Basilar membrane

Varying stiffness and mass of BM results in continuous variation of resonant frequency

At resonance, traveling wave energy is dissipated in BM vibration
- Frequency (Fourier) analysis
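As a rough illustration of the place-to-frequency map, the sketch below interpolates logarithmically between the two endpoints labeled on the slide (16 kHz at 0 mm, 50 Hz at 35 mm). The log-linear shape and the name `place_to_freq` are assumptions made for illustration; the slide gives only the endpoints, and the real map need not be exactly exponential.

```python
def place_to_freq(x_mm, f_base_hz=16000.0, f_apex_hz=50.0, length_mm=35.0):
    """Resonant frequency at distance x_mm along the BM (assumed log-linear map)."""
    if not 0.0 <= x_mm <= length_mm:
        raise ValueError("position must lie on the membrane")
    # Interpolate in log-frequency between the two labeled endpoints.
    return f_base_hz * (f_apex_hz / f_base_hz) ** (x_mm / length_mm)


for x in (0.0, 17.5, 35.0):
    print(f"{x:5.1f} mm -> {place_to_freq(x):8.1f} Hz")
```

Under this assumed map, equal steps along the membrane correspond to equal frequency ratios, which is one way to read the membrane as a bank of resonators spaced evenly in log frequency.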


Cochlea hair cells

Ear converts sound to BM motion
- each point on BM corresponds to a frequency

[Figure: cross-section of the cochlea, labeling the basilar membrane, tectorial membrane, Inner Hair Cell (IHC), Outer Hair Cell (OHC), and auditory nerve]

Hair cells on BM convert motion ↔ nerve impulses (firings)

Inner Hair Cells detect motion

Outer Hair Cells are active amplifiers (pushing the swing)


Inner Hair Cells

IHCs convert BM vibration into nerve firings

Human ear has ∼3500 IHCs
- each IHC has ∼7 connections to Auditory Nerve

Each nerve fires (sometimes) near peak displacement

[Figure: local BM displacement and a typical nerve signal (mV) against time / ms; a histogram of firing count against cycle angle gives the firing probability]
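The histogram step can be sketched as follows: each spike time is folded onto the phase of the stimulus cycle, the phases are binned, and the counts are normalized into a firing probability per bin. The function name, the bin count, and the spike times are hypothetical values chosen for illustration, not data from the lecture.

```python
def period_histogram(spike_times_s, freq_hz, n_bins=4):
    """Fold spike times onto one stimulus cycle and histogram them by phase."""
    counts = [0] * n_bins
    for t in spike_times_s:
        phase = (t * freq_hz) % 1.0  # fraction of a cycle, in [0, 1)
        counts[min(int(phase * n_bins), n_bins - 1)] += 1
    total = sum(counts)
    probs = [c / total for c in counts] if total else [0.0] * n_bins
    return counts, probs


# Hypothetical spikes for a 100 Hz tone: the fibre skips some cycles, but
# when it does fire it tends to do so at a similar point in the cycle.
spikes = [0.003, 0.013, 0.033, 0.043, 0.073, 0.096]
counts, probs = period_histogram(spikes, freq_hz=100.0)
print(counts, [round(p, 2) for p in probs])
```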


Auditory nerve (AN) signals

Single nerve measurements

[Figure: three panels. Tone burst histogram: spike count against time for a 100 ms tone burst. Frequency threshold: dB SPL (20 to 80) against tone burst (log) frequency, 100 Hz to 10 kHz. Rate vs intensity: spikes/sec (up to 300) against intensity / dB SPL.]

One fiber: ~ 25 dB dynamic range

Hearing dynamic range > 100 dB

Hard to measure: probe living ANs?

AN population response

All the information the brain has about sound

average rate & spike timings on 30,000 fibers

Not unlike a (constant-Q) spectrogram

[Figure: ‘PatSla rectsmoo on bbctmp2 (2001-02-18)’; freq / 8ve re 100 Hz (0 to 5) against time / ms (0 to 60)]


Beyond the auditory nerve

Ascending and descending

Tonotopic × ?
- modulation, position, source??

Source: http://www.cochlea.eu/en/auditory-brain

Periphery models

[Diagram: Sound → Outer/middle ear filtering → Cochlea filterbank → IHCs]

Modeled aspects
- outer / middle ear
- hair cell transduction
- cochlea filtering
- efferent feedback?

[Figure: ‘SlaneyPatterson 12 chans/oct from 180 Hz, BBC1tmp (20010218)’; channel (10 to 60) against time / s (0 to 0.5)]

Results: ‘neurogram’ / ‘cochleagram’
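To make the channel layout in the figure caption concrete, the sketch below lists center frequencies for a filterbank with 12 channels per octave starting at 180 Hz. The strictly geometric spacing, the channel count of 60 (read off the figure axis), and the name `center_freqs` are assumptions; the slide does not give the filterbank's exact design.

```python
def center_freqs(f_low_hz=180.0, chans_per_octave=12, n_chans=60):
    """Geometrically spaced center frequencies (assumed layout)."""
    return [f_low_hz * 2.0 ** (k / chans_per_octave) for k in range(n_chans)]


cfs = center_freqs()
print(f"{cfs[0]:.0f} Hz ... {cfs[-1]:.0f} Hz over {len(cfs)} channels")
```

With these assumptions the frequency doubles every 12 channels, so 60 channels span just under five octaves.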


Psychophysics

Physiology looks at the implementation
Psychology looks at the function/behavior

Analyze audition as signal detection: p(θ | x)
- psychological tests reflect internal decisions
- assume optimal decision process
- infer nature of internal representations, noise, ...
→ lower bounds on more complex functions

Different aspects to measure
- time, frequency, intensity
- tones, complexes, noise
- binaural
- pitch, detuning


Basic psychophysics

Relate physical and perceptual variables
e.g. intensity → loudness
     frequency → pitch

Methodology: subject tests
- just noticeable difference (JND)
- magnitude scaling e.g. “adjust to twice as loud”

Results for Intensity vs Loudness:
Weber’s law: $\Delta I \propto I \Rightarrow \log(L) = k \log(I)$

[Figure: Hartmann (1993) classroom loudness scaling data; log(loudness rating) (1.4 to 2.6) against sound level / dB (-20 to 10)]

Power law fit: $L \propto I^{0.22}$

Textbook figure: $L \propto I^{0.3}$

\[
\log_2 L = 0.3 \log_2 I = 0.3\,\frac{\log_{10} I}{\log_{10} 2} = \frac{0.3}{\log_{10} 2}\cdot\frac{\mathrm{dB}}{10} \approx \frac{\mathrm{dB}}{10}
\]

since $\mathrm{dB} = 10 \log_{10} I$ and $0.3 / \log_{10} 2 \approx 1$; i.e. loudness roughly doubles for each 10 dB increase.
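The power-law arithmetic can be checked numerically. The sketch below converts a level change in dB to an intensity ratio and raises it to the power-law exponent; the function name `loudness_ratio` is illustrative, and the two exponents are the ones quoted on the slide (0.3 for the textbook figure, 0.22 for the classroom fit).

```python
def loudness_ratio(delta_db, exponent=0.3):
    """Loudness ratio for a level change of delta_db, assuming L ∝ I**exponent."""
    # delta_db = 10 * log10(I2 / I1), so I2 / I1 = 10 ** (delta_db / 10).
    intensity_ratio = 10.0 ** (delta_db / 10.0)
    return intensity_ratio ** exponent


for exponent in (0.3, 0.22):
    print(f"exponent {exponent}: +10 dB -> loudness x {loudness_ratio(10.0, exponent):.2f}")
```

With the textbook exponent a 10 dB increase gives very nearly a doubling of loudness; with the classroom fit the same step gives a smaller ratio.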


Loudness as a function of frequency

Fletcher-Munson equal-loudness curves

[Figure: equal-loudness curves, intensity / dB SPL (0 to 120) against freq / Hz (100 to 10,000). Two further panels plot equivalent loudness @ 1 kHz against intensity / dB, one for 100 Hz and one for 1 kHz, with a region of rapid loudness growth marked.]


Loudness as a function of bandwidth

Same total energy, different distribution
e.g. 2 channels at −6 dB (not −10 dB)

[Figure: a spectrogram (freq against time) and two magnitude spectra (mag against freq), labeled with intensities I0, I1 and bandwidths B0, B1: same total energy I·B ... but wider perceived as louder. A further plot shows loudness against bandwidth B, with the ‘critical’ bandwidth marked.]

Critical bands: independent frequency channels
- ∼25 total (4-6 / octave)


Simultaneous masking

A louder tone can ‘mask’ the perception of a second tone nearby in frequency:

[Figure: intensity / dB against log freq, showing the absolute threshold, the masking tone, and the masked threshold]

Suggests an ‘internal noise’ model:

[Figure: distributions p(x | I) and p(x | I+∆I) of the decision variable x, spread by internal noise of width σn]
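One common way to turn the internal-noise picture into numbers is the equal-variance Gaussian model of signal detection theory, sketched below. This is an assumption layered on the slide, which only draws the two distributions: here p(x | I) and p(x | I+∆I) are taken to be Gaussians with the same standard deviation σn, the sensitivity index is d' = ∆I / σn (with ∆I measured in decision-variable units), and a two-interval forced-choice observer picks the interval with the larger x. The function names are illustrative.

```python
import math


def d_prime(delta_i, sigma_n):
    """Separation of the two distributions in units of the internal noise."""
    return delta_i / sigma_n


def percent_correct_2ifc(dp):
    """Proportion correct in 2IFC for an unbiased observer: Phi(d' / sqrt(2))."""
    # Phi(z) = 0.5 * (1 + erf(z / sqrt(2))), so Phi(d'/sqrt(2)) = 0.5 * (1 + erf(d'/2)).
    return 0.5 * (1.0 + math.erf(dp / 2.0))


for delta in (0.0, 0.5, 1.0, 2.0):
    dp = d_prime(delta, sigma_n=1.0)
    print(f"d' = {dp:.1f} -> {100 * percent_correct_2ifc(dp):.1f}% correct")
```

Under these assumptions, a difference equal to one standard deviation of the internal noise is detected correctly in roughly three trials out of four, which is one conventional way to define a threshold.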


Sequential masking

Backward/forward in time:

[Figure: intensity / dB against time, showing the masker envelope and the masked threshold: simultaneous masking ~10 dB, backward masking ~5 ms, forward masking ~100 ms]

→ Time-frequency masking ‘skirt’:

[Figure: intensity over time and freq, showing the masking tone and the masked threshold around it]


What we do and don’t hear

[Figure: “two-interval forced-choice” test: sounds A, B, X presented in sequence over time; X = A or B?]

Timing: 2 ms attack resolution, 20 ms discrimination
- but: spectral splatter

Tuning: ∼1% discrimination
- but: beats

Spectrum: profile changes, formants
- variable time-frequency resolution

Harmonic phase?

Noisy signals & texture

(Trace vs categorical memory)


Pitch perception: a classic argument in psychophysics

Harmonic complexes are a pattern on AN

[Figure: AN response to a harmonic complex, freq. chan. (10 to 70) against time / s (0 to 0.1)]

- but give a fused percept (ecological)

What determines the pitch percept?
- not the fundamental

How is it computed?
Two competing models: place and time


Place model of pitch

AN excitation pattern shows individual peaks

‘Pattern matching’ method to find pitch

[Figure: AN excitation against frequency channel, showing resolved harmonics; broader HF channels cannot resolve harmonics]

Correlate with harmonic ‘sieve’:

[Figure: pitch strength against frequency channel]

Support: Low harmonics are very important

But: Flat-spectrum noise can carry pitch
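A harmonic sieve can be sketched as a scoring function over candidate fundamentals: each spectral peak earns credit according to how close it lies to an integer multiple of the candidate. The triangular weighting, the 3% tolerance, the tie-break toward the highest candidate, and the function names are all illustrative choices, not the specific model from the lecture.

```python
def sieve_score(peaks_hz, f0_hz, tol=0.03):
    """Credit each peak by its closeness to the nearest harmonic of f0_hz."""
    score = 0.0
    for p in peaks_hz:
        n = max(1, round(p / f0_hz))       # nearest harmonic number
        dev = abs(n * f0_hz - p) / p       # relative mistuning from that harmonic
        score += max(0.0, 1.0 - dev / tol)
    return score


def sieve_pitch(peaks_hz, candidates_hz, tol=0.03):
    """Best-scoring candidate; ties go to the highest f0 to avoid subharmonics."""
    return max(candidates_hz, key=lambda f0: (sieve_score(peaks_hz, f0, tol), f0))


# Resolved harmonics 2, 3 and 4 of a 200 Hz tone, with no energy at 200 Hz itself.
peaks = [400, 600, 800]
print(sieve_pitch(peaks, range(50, 501)))
```

The example has no peak at the fundamental, yet the sieve still lines up best at 200 Hz, which is consistent with the point that low resolved harmonics carry the pitch.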


Time model of pitch

Timing information is preserved in AN down to ∼1 ms scale

Extract periodicity by e.g. autocorrelation and combine across frequency channels

[Figure: per-channel autocorrelation (freq against lag / ms, 0 to 30) and the summary autocorrelation, with the common period (pitch) marked]

But: HF gives weak pitch (in practice)
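The time model can be sketched in a few lines: autocorrelate each channel, sum the autocorrelations across channels, and read the pitch from the lag of the largest peak. As a loudly simplified stand-in for cochlear filterbank outputs, each "channel" below is a single pure harmonic; the sample rate, lag search range, and function names are assumptions for illustration.

```python
import math


def autocorr(x, lag):
    """Unnormalized autocorrelation of x at one lag."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))


def summary_acf_pitch(channels, fs, min_lag_s=0.002, max_lag_s=0.02):
    """Sum per-channel autocorrelations and return fs / (lag of the largest peak)."""
    lo, hi = int(min_lag_s * fs), int(max_lag_s * fs)
    summary = {lag: sum(autocorr(ch, lag) for ch in channels)
               for lag in range(lo, hi + 1)}
    best_lag = max(summary, key=summary.get)
    return fs / best_lag


fs = 8000          # sample rate in Hz (assumed)
n_samples = 400    # 50 ms of signal
# Harmonics 2, 3 and 4 of 200 Hz, one per channel; the fundamental is absent.
channels = [[math.sin(2 * math.pi * h * 200 * t / fs) for t in range(n_samples)]
            for h in (2, 3, 4)]
pitch = summary_acf_pitch(channels, fs)
print(f"estimated pitch: {pitch:.1f} Hz")
```

Each channel on its own is periodic at a shorter lag, but only the 5 ms lag is a whole number of periods in every channel, so the summary peaks at the common period.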


Alternate & competing cues

Pitch perception could rely on various cues
- average excitation pattern
- summary autocorrelation
- more complex pattern matching

Relying on just one cue is brittle
- e.g. missing fundamental

→ Perceptual system appears to use a flexible, opportunistic combination

Optimal detector justification?

\begin{align*}
\arg\max_\theta\, p(\theta \mid x) &= \arg\max_\theta\, p(x \mid \theta)\, p(\theta) \\
&= \arg\max_\theta\, p(x_1 \mid \theta)\, p(x_2 \mid \theta)\, p(\theta)
\end{align*}

- if $x_1$ and $x_2$ are conditionally independent
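The factorized posterior can be illustrated with a toy example in which two cues each assign a likelihood to a handful of candidate pitches and the estimate is the candidate that maximizes the product with the prior. All numbers and names below are hypothetical; they are chosen so that one cue alone is ambiguous while the combination is not.

```python
import math


def map_estimate(cue_likelihoods, prior):
    """argmax over theta of p(theta) * prod_i p(x_i | theta), computed in logs."""
    def log_posterior(theta):
        return math.log(prior[theta]) + sum(math.log(cue[theta])
                                            for cue in cue_likelihoods)
    return max(prior, key=log_posterior)


# Hypothetical likelihoods p(x_i | theta) for candidate pitches in Hz.
place_cue = {100: 0.2, 200: 0.4, 400: 0.4}   # cannot separate 200 from 400
time_cue = {100: 0.4, 200: 0.5, 400: 0.1}    # only weakly separates 100 from 200
prior = {100: 1 / 3, 200: 1 / 3, 400: 1 / 3}

best = map_estimate([place_cue, time_cue], prior)
print(best)
```

Multiplying the likelihoods is only justified under the conditional independence assumption stated above; correlated cues would be counted twice by this product.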


Speech perception

Highly specialized function
- subsequent to source organization?
  ... but also can interact

Kinds of speech sounds

[Figure: spectrogram of an utterance, freq / Hz (0 to 4000) against time / s (1.4 to 2.6), level / dB (20 to 60); word labels: “watch thin as a dime”, “a”, “has”; marked sound types: stop burst, fricative, vowel, nasal, glide]


Cues to phoneme perception

Linguists describe speech with phonemes

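[Figure: the same utterance (word labels: “watch thin as a dime”, “a”, “has”) aligned with its phoneme labels]

Acoustic-phoneticians describe phonemes by
• formants & transitions
• bursts & onset times

[Figure: schematic spectrogram (freq against time) marking vowel formants, transition, stop burst, and voicing onset time]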


Categorical perception

(Some) speech sounds perceived categorically rather than analogically (e.g. Liberman et al., 1952)
- e.g. stop-burst and timing:

[Figure: perceived stop (P, T, or K) as a function of burst freq f_b / Hz (1000 to 4000) and the following vowel; schematic (freq against time) of a stop burst at f_b followed by vowel formants]

- tokens within category are hard to distinguish
- category boundaries are very sharp

Categories are learned for native tongue
- “merry” / “Mary” / “marry”


Where is the information in speech?

‘Articulation’ of high/low-pass filtered speech:

[Figure: articulation / % (20 to 80) against freq / Hz (1000 to 4000) for high-pass and low-pass filtered speech; the two curves sum to more than 1...]

Speech message is highly redundant

e.g. constraints of language, context

→ listeners can understand with very few cues


Top-down influences: Phonemic restoration (Warren, 1970)

What if a noise burst obscures speech?

[Figure: spectrogram, freq / Hz (0 to 4000) against time / s (1.4 to 2.6)]

auditory system ‘restores’ the missing phoneme
... based on semantic content
... even in retrospect

Subjects are typically unaware of which sounds are restored


A predisposition for speech: Sinewave replicas

Replace each formant with a single sinusoid (Remez et al., 1981)

[Figure: two spectrograms, labeled Speech and Sines, freq / Hz (0 to 5000) against time / s (0.5 to 3)]
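The construction of a sinewave replica can be sketched as follows: each formant track drives one sinusoid whose frequency follows the track, and the sinusoids are summed. This is a simplified illustration, not the published procedure: it assumes the formant frequencies have already been estimated frame by frame, holds each frequency constant within a frame, and gives every sinusoid the same amplitude. The function name and the formant values are hypothetical.

```python
import math


def sinewave_replica(formant_tracks_hz, fs=8000, frame_s=0.01):
    """Sum one sinusoid per formant, each following its frame-by-frame frequency."""
    n_per_frame = int(round(fs * frame_s))
    n_tracks = len(formant_tracks_hz[0])
    phases = [0.0] * n_tracks
    out = []
    for frame in formant_tracks_hz:
        for _ in range(n_per_frame):
            sample = 0.0
            for k, f in enumerate(frame):
                # Accumulate phase so frequency changes do not cause jumps.
                phases[k] += 2.0 * math.pi * f / fs
                sample += math.sin(phases[k])
            out.append(sample / n_tracks)
    return out


# Hypothetical (F1, F2, F3) values in Hz for three 10 ms frames.
tracks = [(500, 1500, 2500), (520, 1450, 2500), (540, 1400, 2500)]
signal = sinewave_replica(tracks)
print(len(signal), "samples")
```

A fuller version would also scale each sinusoid by its formant's amplitude and interpolate frequencies between frames.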

speech is (somewhat) intelligible

people hear both whistles and speech (“duplex”)

processed as speech despite being un-speech-like

What does it take to be speech?


Simultaneous vowels

Mix synthetic vowels with different f0s

[Figure: spectra (dB against freq) of /iy/ @ 100 Hz and /ah/ @ 125 Hz, and of their sum]

Pitch difference helps (though not necessarily)

[Figure: DV identification vs. ∆f0 (200 ms) (Culling & Darwin 1993); % both vowels correct (25 to 75) against ∆f0 (semitones): 0, 1/4, 1/2, 1, 2, 4]
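For reference, the f0 difference between the two example vowels (/iy/ at 100 Hz and /ah/ at 125 Hz) can be converted to the semitone units used on the plot's axis. The helper name `semitones` is illustrative.

```python
import math


def semitones(f_a_hz, f_b_hz):
    """Interval from f_a_hz up to f_b_hz in equal-tempered semitones."""
    return 12.0 * math.log2(f_b_hz / f_a_hz)


print(f"{semitones(100.0, 125.0):.2f} semitones")
```

The 100 Hz / 125 Hz pair is a little under 4 semitones apart, near the upper end of the ∆f0 range plotted.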


Computational models of speech perception

Various theoretical-practical models of speech comprehension
e.g.

Speech → Phoneme recognition → Lexical access → Grammar constraints → Words

Open questions:
- mechanism of phoneme classification
- mechanism of lexical recall
- mechanism of grammar constraints

ASR is a practical implementation (?)


Auditory organization

Detection model is huge simplification

The real role of hearing is much more general:
Recover useful information from the outside world

→ Sound organization into events and sources

[Figure: spectrogram, frq/Hz (0 to 4000) against time/s (0 to 4), with events labeled Voice, Stab, and Rumble]

Research questions:
- what determines perception of sources?
- how do humans separate mixtures?
- how much can we tell about a source?


Auditory scene analysis: simultaneous fusion

Harmonics are distinct on AN, but perceived as one sound (“fused”)

[Figure: harmonics, freq against time]

- depends on common onset
- depends on harmonicity (common period)

Methodologies:
- ask subject how many ‘objects’
- match attributes e.g. object pitch
- manipulate high level e.g. vowel identity


Sequential grouping: streaming

Pattern / rhythm: property of a set of objects
- subsequent to fusion ∵ employs fused events?

[Figure: tone sequence, frequency against time; labels: ∆f, 1 kHz, –2 octaves, TRT: 60-150 ms]

Measure by relative timing judgments
- cannot compare between streams

Separate ‘coherence’ and ‘fusion’ boundaries

Can interact and compete with fusion


Continuity and restoration

Tone is interrupted by noise burst: what happened?

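[Figure: schematic spectrograms (freq against time) of a tone interrupted by a noise burst, and the candidate explanations for what happened]

- masking makes tone undetectable during noise

Need to infer most probable real-world events
- observation equally likely for either explanation
- prior on continuous tone much higher ⇒ choose continuous tone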
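The inference above can be written as a posterior odds ratio. The event labels "cont" (the tone continues through the noise) and "int" (the tone is interrupted) are notation introduced here for illustration; x is the observation.

```latex
\[
\frac{p(\text{cont} \mid x)}{p(\text{int} \mid x)}
  = \frac{p(x \mid \text{cont})}{p(x \mid \text{int})} \cdot \frac{p(\text{cont})}{p(\text{int})}
  = \frac{p(\text{cont})}{p(\text{int})} > 1
\]
```

The likelihood ratio is 1 because masking makes the observation equally likely under either explanation, so the decision rests entirely on the prior.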

Top-down influence on perceived events. . .


Models of auditory organization

Psychological accounts suggest bottom-up

[Diagram: input mixture → Front end → signal features (maps: onset, period, frq.mod) → Object formation → discrete objects → Grouping rules → Source groups; Brown and Cooke (1994)]

Complicated in practice

formation of separate elements

contradictory cues

influence of top-down constraints (context, expectations, . . . )


Summary

Auditory perception provides the ‘ground truth’ underlying audio processing

Physiology specifies information available

Psychophysics measures basic sensitivities

Sound sources require further organization

Strong contextual effects in speech perception

[Diagram: Sound → Transduce → Multiple represent'ns → Scene analysis → High-level recognition]

Parting thought

Is pitch central to communication? Why?


References

Alvin M. Liberman, Pierre Delattre, and Franklin S. Cooper. The role of selected stimulus-variables in the perception of the unvoiced stop consonants. The American Journal of Psychology, 65(4):497–516, 1952.

Richard M. Warren. Perceptual restoration of missing speech sounds. Science, 167(3917):392–393, January 1970.

R. E. Remez, P. E. Rubin, D. B. Pisoni, and T. D. Carrell. Speech perception without traditional speech cues. Science, 212(4497):947–949, May 1981.

G. J. Brown and M. Cooke. Computational auditory scene analysis. Computer Speech & Language, 8(4):297–336, 1994.

Brian C. J. Moore. An Introduction to the Psychology of Hearing. Academic Press, fifth edition, April 2003. ISBN 0125056281.

James O. Pickles. An Introduction to the Physiology of Hearing. Academic Press, second edition, January 1988. ISBN 0125547544.
