From Sound to Sense and back again: The integration of lexical and speech processes

Post on 17-Jan-2016


David Gow

Massachusetts General Hospital

Bob McMurray

Dept. of Brain and Cognitive Sciences

University of Rochester

Complex computations from sound to sense must be broken up for study.

The Speech Chain

Sound → Sense

Assume intermediate representations: Phonemes… Words… Syntactic Phrases…

The Standard Paradigm

[Diagram: Sound, Phonemes, Words, Sense, with Phonology]

Delimited fields of study:

•Speech Perception

•Spoken Word Recognition

•Phonology

Phonemes* essential

* or other sublexical category

Why? Categorical Perception (CP)

•Sharp identification of tokens on a continuum.

[Figure: identification (% /pa/, 0 to 100) and discrimination plotted along a VOT continuum from B to P]

•Discrimination poor within a phonetic category.

Continuous Acoustic Detail => Discrete Categories

Does CAD affect speech categorization?
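The "sharp identification" pattern can be sketched numerically. The block below is only an illustration, assuming a logistic identification function; the 20 ms boundary and the slope are hypothetical values, not estimates from any of the studies cited here.

```python
import math

def p_identify_p(vot_ms, boundary_ms=20.0, slope=1.0):
    """Probability of a /p/ response under an assumed logistic identification function.

    boundary_ms and slope are hypothetical illustration values.
    """
    return 1.0 / (1.0 + math.exp(-slope * (vot_ms - boundary_ms)))

def labeling_difference(vot_a, vot_b):
    """How differently two tokens are labeled; a crude stand-in for discriminability."""
    return abs(p_identify_p(vot_a) - p_identify_p(vot_b))

if __name__ == "__main__":
    for vot in range(0, 45, 5):
        print(f"VOT {vot:>2} ms: {100 * p_identify_p(vot):5.1f}% /p/ responses")
    # Same 10 ms acoustic difference, within a category vs. across the boundary
    print("within category (0 vs 10 ms): ", labeling_difference(0, 10))
    print("across boundary (15 vs 25 ms):", labeling_difference(15, 25))
```

Under these assumptions, two tokens 10 ms apart are labeled almost identically when both fall inside a category, and very differently when they straddle the boundary, which is the classic CP pattern.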

Categorical Perception (CP)

Defined fundamental computational problems.

CP is the output of:

•Speech perception

CP is the input to:

•Phonology

•Word recognition

But… • Not all speech contrasts are categorical.

• Lots of tasks show non-categorical perception.

Fry, Abramson, Eimas & Liberman (1962); Pisoni & Tash (1974); Pisoni & Lazarus (1974); Carney, Widden & Viemeister (1977); Hary & Massaro (1982); Pisoni, Aslin, Perey & Hennessy (1982); Healy & Repp (1982); Massaro & Cohen (1983); Miller (1997); Samuel (1997)…

CP

Categorical Perception is about phonetic classification.

Why has the Standard Paradigm persisted?

The minimal computational problem: compute meaning from sound (Sound → Sense).

CP tasks don’t necessarily tap a stage of this problem.

CP? Words?

Lexical activation… seems a good bet.

Even when continuous acoustic detail affects word recognition, it is seen as outside of core word recognition.

Why has the Standard Paradigm persisted?

Example: Word Segmentation

• Vowel Length

• Stress/Meter

• Coarticulation

Cue extra-segmental process.

[Diagram: CAD, Phonemes, Segmentation, Words, Word Recognition]


Why has the Standard Paradigm persisted?

Does continuous acoustic detail affect interpretation via core word-recognition processes?

No. Standard Paradigm is fine…

Yes. Hmm…

Need to use stimuli with:

•Precise control over CAD

Need to use tasks that:

•reflect only the minimal computational problem: meaning.

•are sensitive to acoustic detail.

[Diagram label: Sublexical Filter (phonemes)]

Visual World Paradigm


•Subjects hear spoken language and manipulate objects in a visual world.

•Visual world includes set of objects with interesting linguistic properties (names)

•Eye-movements to each object are monitored throughout the task.

Tanenhaus, Spivey-Knowlton, Eberhard & Sedivy (1995); Allopenna, Magnuson & Tanenhaus (1998)

•Meaning based, natural task: Subjects must interpret speech to perform task.

•Eye-movements fast and time-locked to speech.

•Fixation probability maps onto dynamics of lexical activation.

•Context is controlled: meaning lexical activation.
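As an illustration of how fixation records become the curves used to index lexical activation, here is a minimal sketch. It assumes each trial has already been coded as one fixated-object label per time bin; the function name, the bin coding, and the four toy trials are all hypothetical.

```python
from collections import Counter

def fixation_proportions(trials, objects):
    """Proportion of trials fixating each object in each time bin.

    trials: list of trials, each a list with one fixated-object label per time bin
            (hypothetical coding; "none" means no object was fixated).
    objects: object labels to compute curves for.
    """
    n_bins = min(len(trial) for trial in trials)
    curves = {obj: [] for obj in objects}
    for b in range(n_bins):
        counts = Counter(trial[b] for trial in trials)
        for obj in objects:
            curves[obj].append(counts[obj] / len(trials))
    return curves

# Toy data: 4 trials, 5 time bins each, target "bear", competitor "pear"
trials = [
    ["none", "pear", "bear", "bear", "bear"],
    ["none", "bear", "bear", "bear", "bear"],
    ["none", "none", "pear", "bear", "bear"],
    ["none", "lamp", "bear", "bear", "bear"],
]
curves = fixation_proportions(trials, ["bear", "pear", "lamp", "ship"])

if __name__ == "__main__":
    for obj, curve in curves.items():
        print(obj, curve)
```

Averaging over trials in this way is what turns discrete looks on single trials into a continuous fixation-proportion curve over time.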

Does continuous acoustic detail affect interpretation?

Is lexical activation sensitive to continuous acoustic detail?

Combine tools of:

• speech perception: 9-step VOT continuum.

• spoken word recognition: visual world paradigm.

McMurray, Tanenhaus & Aslin (2003)
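For concreteness, the 9-step continuum can be written out as a list of VOT values. This sketch assumes evenly spaced steps from 0 to 40 ms, which matches the VOT labels in the results figure of this talk; the function name and defaults are illustrative only.

```python
def vot_continuum(start_ms=0, end_ms=40, n_steps=9):
    """Evenly spaced VOT values (ms); the 0 to 40 ms range is taken from the results figure."""
    step = (end_ms - start_ms) / (n_steps - 1)
    return [start_ms + i * step for i in range(n_steps)]

if __name__ == "__main__":
    print(vot_continuum())
```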

Methods

A moment to view the items

500 ms later

Bear

Repeat 1080 times…

Target = Bear

Competitor = Pear

Unrelated = Lamp, Ship

[Figure: fixations on individual trials (1 to 5) over time, with a 200 ms interval marked, and fixation proportion (0 to 0.9) against time (0 to 1600 ms) for VOT = 0]

Predictions

What would lexical sensitivity to CAD look like?

Systematic effect on competitor dynamics: fixations to the competitor.

[Figure: two predicted patterns of fixation proportion over time, "Categorical Results" and "Gradient Effect", each with target and competitor curves]

Results

[Figure: competitor fixations (0 to 0.16) against time since word onset (ms), in two panels by response; first panel: VOTs of 0, 5, 10, and 15 ms; second panel: VOTs of 20, 25, 30, 35, and 40 ms]

Task?

Phoneme ID (response labels: P, B, Sh, L)

Not part of minimal computational problem.

Same stimuli in metalinguistic task…

…more categorical pattern of fixations

Continuous acoustic detail is not helpful in metalinguistic tasks…

Summary

Word recognition shows gradient sensitivity to continuous acoustic detail.

Not extra-segmental: VOT

CAD affects higher-level processes.

Consistent with other studies:

Andruski, Blumstein & Burton (1994); Marslen-Wilson & Warren (1994); Utman, Blumstein & Burton (2000); Dahan, Magnuson, Tanenhaus & Hogan (2001); McMurray, Clayards, Aslin & Tanenhaus (2004); McMurray, Aslin, Tanenhaus, Spivey & Subik (in prep)

The Standard Paradigm?

[Diagram: Sense, Phonology, Words, Phonemes, Continuous Acoustic Detail]

CAD affects higher-level processes.

From other work:

Lexical activation influences sublexical representations.

Samuel & Pitt (2003); Magnuson, McMurray, Tanenhaus & Aslin (2003); Samuel (1997); Elman & McClelland (1988)


Phonological regularity affects signal interpretation.

Massaro & Cohen (1983); Halle, Segui, Frauenfelder & Meunier (1998); Pitt (1998); Dupoux, Kakehi, Hirose, Pallier & Mehler (1999)

[Diagram: Sense, Phonology, Words, Phonemes, Continuous Acoustic Detail, marked with "?"]

Perhaps interaction and integration make sense.

Do they help solve sticky problems?

YES

The Emerging Paradigm

Integration of work in:

• spoken word recognition

• speech perception

• phonology

New computations simplify old problems and solve new ones.

•Cognitive processes: Lexical activation & competition.

•Perceptual processes: sensitivity to CAD & perceptual grouping.

CAD is helpful in language comprehension.

• Word segmentation

• Coping with lawful variability due to assimilation

Combination of approaches helps solve both problems.

Some lexical processes can’t work in the Standard Paradigm

Lexical Segmentation

The SWR Solution

[ ]

active department

act of dip art mint

a part depart in

are par

Standard Paradigm: Template matching overgenerates

•Overgeneration resolved through competition in TRACE (McClelland & Elman 1986)
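The overgeneration problem can be made concrete with a toy template matcher. This is a sketch only: it matches spellings rather than phonemes (the phonetic transcription from the slide is not available), and the lexicon is a small hypothetical one chosen for illustration.

```python
def embedded_words(signal, lexicon):
    """Return (start, end, word) for every lexicon entry matching some span of the input."""
    hits = []
    for start in range(len(signal)):
        for end in range(start + 1, len(signal) + 1):
            if signal[start:end] in lexicon:
                hits.append((start, end, signal[start:end]))
    return hits

def full_parses(signal, lexicon, start=0):
    """Enumerate every way to tile the whole input with lexicon entries."""
    if start == len(signal):
        return [[]]
    parses = []
    for end in range(start + 1, len(signal) + 1):
        word = signal[start:end]
        if word in lexicon:
            for rest in full_parses(signal, lexicon, end):
                parses.append([word] + rest)
    return parses

# Hypothetical toy lexicon and an unsegmented orthographic stand-in for the input
lexicon = {"active", "department", "act", "depart", "part", "art", "a", "men"}
signal = "activedepartment"

if __name__ == "__main__":
    for start, end, word in embedded_words(signal, lexicon):
        print(f"{word!r} matches positions {start} to {end}")
    print("complete parses:", full_parses(signal, lexicon))
```

Even with this tiny lexicon, many more words match somewhere in the input than the speaker intended. In a competition account, the words that fit into a complete parse end up winning; the matcher alone has no way to choose.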

Problem: What if the speaker is trying to say “suck seeds”?

[Figure: activation by cycle for succeed, suck, and seed, given the input ‘ k s I d -]

Frauenfelder & Peeters (1990)

Cues shown to affect segmentation:

•Initial strong syllable

•Initial lengthening

•Increased aspiration

•Increased glottalization

Lehiste, 1960; Garding, 1967; Lehiste, 1972; Umeda, 1975; Nakatani & Dukes, 1977; Nakatani & Schaffer, 1978; Cutler & Norris, 1988…

Implied processing model requires separate segmentation process

[Diagram: CAD, Phonemes, Segmentation, Recognition, Words]

The Speech Solution

Problem: cues are subtle and varied, extra-segmental processes are inelegant

Is there a better mechanism?

The proposal had a strange syntax that nobody liked. ^

The proposal had a strange sin tax that nobody liked. ^

Gow & Gordon (1995)

After “syntax”: Syntax: GRAMMAR primed; Tax: INCOME inhibited

After “sin tax”: Syntax: GRAMMAR primed; Tax: INCOME primed

• CAD affects interpretation.

• does not trigger segmentation.

•Observation: All segmentation cues happen to enhance word-initial features

• Strengthened cues facilitate activation, making intended words stronger competitors

Incorporating CAD:

• Solves overgeneration problem.

•No extra-segmental segmentation process.

Good Start Model

Gow & Gordon (1995)
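The logic of the Good Start idea can be sketched with a toy competition among candidates. This is not the published model: the Luce choice rule is used here as a simple stand-in for lexical competition, and the support values and the boost factor are hypothetical.

```python
def compete(support):
    """Turn bottom-up support into relative activations (Luce choice rule, a stand-in)."""
    total = sum(support.values())
    return {word: s / total for word, s in support.items()}

def with_onset_boost(support, intended, boost):
    """Scale up support for words whose onsets carry strengthened word-initial cues."""
    boosted = dict(support)
    for word in intended:
        boosted[word] *= boost
    return compete(boosted)

# Hypothetical support values for the "suck seeds" / "succeed" example
support = {"succeed": 1.0, "suck": 0.6, "seed": 0.6}
baseline = compete(support)
good_start = with_onset_boost(support, ["suck", "seed"], boost=1.5)

if __name__ == "__main__":
    print("baseline:  ", baseline)
    print("good start:", good_start)
```

In this sketch the same competition mechanism does the segmentation work once word-initial cues feed activation directly, so no separate segmentation stage is needed.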

Summary

When continuous acoustic detail affects lexical activation, speech and SWR models can be integrated and simplified.

The emerging paradigm reframes computational problems.

Assimilation

English coronal place assimilation

/coronal # labial/ → [labial # labial]

/coronal # velar/ → [velar # velar]

Standard Paradigm: Change is

• discrete

• phonemically neutralizing

Redefining Computational Problems

[ ] # berries: nonword? right berries? ripe berries?

Standard Paradigm solution: Phonological inference (Gaskell & Marslen-Wilson, 1996; 1998; 2001)

Knowledge driven inference:

If [labial # labial] infer /coronal # labial/
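The inference rule above can be written as a small function. This is a sketch of the rule as stated on the slide (extended to the velar case from the assimilation pattern), not an implementation of any published model; the segment inventory and the mapping to coronal counterparts are simplified assumptions.

```python
# Simplified, hypothetical segment inventory (stops and nasals only)
PLACE = {
    "p": "labial", "b": "labial", "m": "labial",
    "t": "coronal", "d": "coronal", "n": "coronal",
    "k": "velar", "g": "velar",
}
# Coronal segment with the same manner and voicing as each non-coronal segment
TO_CORONAL = {"p": "t", "b": "d", "m": "n", "k": "t", "g": "d"}

def infer_underlying(final_seg, next_onset):
    """Candidate underlying word-final segments, given the surface segment and what follows.

    If the surface form is [labial # labial] or [velar # velar], an underlying
    coronal is added as a candidate; otherwise the surface segment stands alone.
    """
    candidates = [final_seg]
    place = PLACE.get(final_seg)
    if place in ("labial", "velar") and PLACE.get(next_onset) == place:
        candidates.append(TO_CORONAL[final_seg])
    return candidates

if __name__ == "__main__":
    print(infer_underlying("m", "b"))  # final [m] before "beans"
    print(infer_underlying("p", "b"))  # final [p] before "berries"
    print(infer_underlying("p", "d"))  # final [p] before a coronal onset
```

Note that the rule operates on discrete segments only: it adds a coronal candidate whenever the context licenses it, and cannot use how much of the coronal survives in the signal.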

greem beans → green (Gaskell & Marslen-Wilson, 1996; Gow, 2001)

ripe berries → right (Gaskell & Marslen-Wilson, 2001; Gow, 2002)

Moreover: Assimilation effects dissociated from linguistic knowledge (Gow & Im, in press)


Assimilatory modification is acoustically continuous

This is not discrete feature change!

Assimilation Produces CAD

[Figure: F2 transitions in /æC/ contexts; frequency (Hz, 1550 to 1850) by pitch period for coronal, assimilated, and labial tokens]

[Figure: F3 transitions in /æC/ contexts; frequency (Hz, 2550 to 2800) by pitch period for coronal, assimilated, and labial tokens]

Select the cat/p box

Regressive Context Effects

[Figure: fixation proportion (0 to 0.6) over time (0 to 1600 ms) to the Coronal (cat) and Non-Coronal (cap) items; panel 1: subject hears Assim Non-Coronal (cat/p box); panel 2: subject hears Assim Non-Coronal (cat/p drawing)]

[Figure: looks to the final non-coronal (box); fixation proportion (0 to 0.7) over time (0 to 1600 ms) for Assim Non-Coronal, Coronal, and Non-Coronal conditions]

Progressive Context Effects

Progressive effect in the same experiment

Assimilation: Use of CAD

Assimilation is resolved through phonological context.

Partially-assimilated items show:

• regressive context effects (Gow, 2002; 2003)

• progressive context effects (Gow, 2001; 2003)

Fully assimilated items show neither* (Gaskell & Marslen-Wilson, 2001; Gow, 2002; 2003)

assimilation # context

Infinite regress (eternal ambiguity)…. or something more interesting?

A Perceptual Account

Continuous acoustic detail is subject to basic perceptual processes.

Feature cue parsing (Gow, 2003)

[Figure: speech signal display; time axis 0 to 0.760454 s, vertical axis 0 to 3000]

•Features are encoded by multiple cues that are integrated.

•Assimilation creates cues consistent with multiple places.

•Extract feature cues.

•Group feature cues by similarity and resolve ambiguity.

example: eight….

cat/p # box

cat/p # drawing

cat/p #

[cor] [cor] [COR] [cor] [lab] [LAB] [lab] [lab]


Feature cue parsing (Gow, 2003)

Progressive and regressive effects fall out of grouping

SWR problem (eternal ambiguity) replaced by simpler perceptual problem

CAD important in solution: processing obstacle facilitates perception.

Integration of continuous perceptual features facilitates higher-level processes.

Facilitation via core word-recognition mechanisms; no extra-segmental routines required.
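One toy reading of the grouping idea is sketched below. It is not Gow's (2003) model: cue strengths are hypothetical numbers, "similarity" is reduced to having the same place label, and the function name is invented for illustration.

```python
def parse_feature_cues(final_cues, context_place):
    """Assign place cues found on a word-final segment to that segment or to the context.

    final_cues: dict mapping a place label to the (hypothetical) strength of its cues
                on the word-final segment.
    context_place: place of the following word's onset.
    Returns (place perceived for the final segment, extra support passed to the context).
    """
    # Cues that match the following onset's place are grouped with that onset.
    shared = final_cues.get(context_place, 0.0)
    remaining = {p: s for p, s in final_cues.items() if p != context_place}
    if not remaining:
        # Every cue matches the context, so there is nothing to regroup.
        return context_place, 0.0
    # What is left over determines the place perceived on the final segment.
    perceived = max(remaining, key=remaining.get)
    return perceived, shared

# A partially assimilated "cat/p" carries both coronal and labial cues
assimilated = {"cor": 0.5, "lab": 0.5}

if __name__ == "__main__":
    print("before box (labial):     ", parse_feature_cues(assimilated, "lab"))
    print("before drawing (coronal):", parse_feature_cues(assimilated, "cor"))
```

In this sketch a single grouping step produces both effects: the leftover cues fix the identity of the final segment (the regressive effect), and the regrouped cues add support for the following onset (the progressive effect).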

Summary

Standard paradigm

•Created artificial boundaries that misframed issues.

•Continuous acoustic detail is variability to be conquered.

The Standard Paradigm

The basis of the standard paradigm is undercut.

•Meaning-based processes are affected by CAD.

•CAD is an essential component of word recognition.

The emerging paradigm

•Emphasis on methodologies that tap the minimal computational problem: meaning.

•Stresses integration of speech and spoken word recognition, questions methods and theory.

•Continuous acoustic detail is useful signal, not noise.

The Emerging Paradigm
