Can Sensorimotor Learning Drive Changes Relevant for Communication?
Increases Are Not Due to “Clear Speech”
References
1. Houde, J. F., & Jordan, M. I. (1998). Sensorimotor adaptation in speech production. Science, 279(5354), 1213–1216.
2. Lametti, D. R., Rochet-Capellan, A., Neufeld, E., Shiller, D. M., & Ostry, D. J. (2014). Plasticity in the human speech motor system drives changes in speech perception. Journal of Neuroscience, 34(31), 10339–10346.
3. Cai, S., Boucek, M., Ghosh, S., Guenther, F. H., & Perkell, J. (2008). A system for online dynamic perturbation of formant trajectories and results from perturbations of the Mandarin triphthong /iau/. In Proceedings of the 8th International Seminar on Speech Production (pp. 65–68). Strasbourg, France.
• Altered feedback paradigms can be leveraged to increase speakers’ vowel space area and the acoustic contrast between vowels—changes that have the potential to improve intelligibility.
• Increased vowel contrast persisted after a washout period and a 10-minute silent interval, evidence of potential longer-term changes.
• Speakers simultaneously learned multiple vowel-specific changes in order to compensate for the altered feedback.
• Vowel contrast increases were not the result of a “clear speech” mode, and occurred without conscious awareness or strategy, strengthening the promise of this technique for clinical use.
Conclusions (TL;DR)
Sensorimotor Adaptation Increases Vowel Space Area and Vowel Contrast
Increased vowel contrast induced by adaptation to a non-uniform auditory perturbation in speech
Caroline A. Niziolek* and Benjamin Parrell*
University of Wisconsin–Madison, Dept. of Communication Sciences and Disorders
*equal contribution
When auditory feedback is perturbed in a consistent way, speakers learn to adjust their productions to compensate, a process known as sensorimotor adaptation [1, 2]. While this paradigm has informed our understanding of speech sensorimotor control, its ability to induce behaviorally relevant changes in speech remains unclear. Here, we examine speakers’ ability to compensate for a non-uniform auditory perturbation field that was explicitly designed to reduce vowel distinctiveness by shifting all vowels towards the center of the vowel space.
A: Perturbation field applied to speech. B: Example spectrograms with produced (blue) and perturbed (red) formants, and the vowel space center (yellow). C: Perturbation magnitude throughout the experiment. In the adapt session (red), the hold phase perturbation is 50% of the 2D distance (in F1/F2 space) between the current formant values and the vowel center.
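The hold-phase perturbation described in this caption can be sketched geometrically. This is a minimal illustration of the stated rule (a shift of 50% of the F1/F2 distance towards the vowel center), not the Audapter implementation; the function name `centralize`, the `gain` parameter, and the formant values are assumptions made for the example.

```python
from typing import Tuple

Formants = Tuple[float, float]  # (F1, F2), here in mels


def centralize(formants: Formants, center: Formants, gain: float = 0.5) -> Formants:
    """Move (F1, F2) towards `center` by `gain` times the distance to it.

    gain=0.5 corresponds to the hold-phase value given in the caption;
    gain=0.0 leaves feedback unaltered.
    """
    f1, f2 = formants
    c1, c2 = center
    return (f1 + gain * (c1 - f1), f2 + gain * (c2 - f2))


# Hypothetical values in mels, for illustration only.
center = (700.0, 1500.0)
produced_i = (300.0, 1900.0)
heard_i = centralize(produced_i, center)  # lands halfway to the center
```

Because the shift is proportional to each token's distance from the center, peripheral vowels receive larger shifts than central ones, and each vowel is shifted in a different direction, which is what makes the field non-uniform.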
Speakers achieved these increases in speech contrast by increasing the distance between each vowel and the center of the vowel space (p < 0.0001) in all three test phases (A-C below). Post-hoc tests showed that /i/ was farther from the center in both the adapt and washout phases, and that /ɑ/ was farther from the center in all three test phases (all p < 0.05).
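A per-vowel distance measure of this kind could be computed roughly as follows. The poster does not state how the center or the normalization is defined, so this sketch assumes that the center is the mean of the baseline vowel means and that "normalized" means the baseline distance is subtracted; all names and values are hypothetical.

```python
import math
from typing import Dict, Tuple

Formants = Tuple[float, float]  # (F1, F2) in mels


def vowel_center(vowels: Dict[str, Formants]) -> Formants:
    """Mean F1 and mean F2 across vowels (assumed definition of the center)."""
    n = len(vowels)
    return (sum(f1 for f1, _ in vowels.values()) / n,
            sum(f2 for _, f2 in vowels.values()) / n)


def distance_change(baseline: Dict[str, Formants],
                    test: Dict[str, Formants]) -> Dict[str, float]:
    """Change in each vowel's distance to the baseline center (test minus baseline)."""
    center = vowel_center(baseline)
    return {v: math.dist(test[v], center) - math.dist(baseline[v], center)
            for v in baseline}


# Hypothetical vowel means in mels, chosen for round numbers.
baseline = {"i": (400.0, 1800.0), "ae": (1000.0, 1800.0),
            "a": (1000.0, 1000.0), "u": (400.0, 1000.0)}
hold = dict(baseline, i=(340.0, 1880.0))  # only /i/ moves outward
change = distance_change(baseline, hold)
```

A positive value means the vowel moved away from the center relative to baseline, which is the direction of change reported above.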
[Figure, panels A-C. A: perturbation field in F1/F2 space (mels) for /i/, /æ/, /u/, and /ɑ/. B: spectrograms (frequency in mels over time in s) for “bead”, “bad”, “bod”, and “booed”. C: perturbation magnitude (0 to 50%) by block number (40 trials/block) across the baseline, ramp, hold, washout, and retention phases, for the adapt and control sessions.]
Brain, Language, and Acoustic Behavior Lab
[Figure: vowel spaces in F1/F2 space (mels) for example participants S1, S11, S12, and S14, showing baseline and adaptation in the adapt and control sessions.]
Example vowel space areas after exposure to perturbed feedback in the adapt session (red) or unperturbed feedback in the control session (blue) compared with baseline (dashed black).
[Figure, panels A-D: normalized duration (ms), normalized intensity (a.u.), normalized maximum pitch (Hz), and normalized pitch range (Hz) across the baseline, ramp, hold, washout, and retention phases, for the adapt and control sessions.]
Formant changes were not accompanied by changes to other acoustic parameters associated with a clear mode of speaking. Vowel duration (A) decreased slightly over the course of the experiment, and there were only minimal changes in other speech parameters (B: peak intensity, C: maximum pitch, D: pitch range) that did not differ across sessions.
English-speaking participants (n=25) read aloud words with corner vowels (bead, bad, bod, and booed) while being exposed to a “vowel centralization” perturbation. In this adapt session, a modified version of Audapter [3] was used to shift the first two formant frequencies (F1 and F2) towards the center of each participant’s vowel space, making all vowels sound more like schwa. Auditory feedback was unaltered in the baseline phase; the magnitude of the perturbation was then ramped up to reach a maximum in the hold phase, before feedback was returned to normal in the washout phase. Ten minutes later, a retention phase again tested speech with normal feedback. Each participant also completed a control session with an identical procedure but with no alteration to feedback. The order of these sessions was counterbalanced.
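The phase structure of the adapt session can be sketched as a per-trial gain schedule. The poster gives neither the number of trials per phase nor the shape of the ramp, so the phase lengths are left as hypothetical parameters and a linear ramp is assumed; only the 0.5 maximum comes from the poster.

```python
def perturbation_gain(trial: int, n_baseline: int, n_ramp: int, n_hold: int,
                      max_gain: float = 0.5) -> float:
    """Perturbation gain for a 0-indexed trial in the adapt session.

    Baseline: 0. Ramp: linear increase (assumed shape) up to max_gain.
    Hold: max_gain. Washout and retention: 0 (feedback unaltered).
    """
    if trial < n_baseline:
        return 0.0
    pos = trial - n_baseline
    if pos < n_ramp:
        return max_gain * (pos + 1) / n_ramp
    if pos < n_ramp + n_hold:
        return max_gain
    return 0.0


# Hypothetical, deliberately tiny phase lengths for illustration.
schedule = [perturbation_gain(t, n_baseline=2, n_ramp=4, n_hold=2) for t in range(10)]
```

In the control session the same schedule would apply with `max_gain=0.0`, so the procedure is identical but feedback is never altered.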
Method: Vowel Feedback Centralization
[Figure, panels A-D: normalized Vowel Space Area (VSA) and normalized Average Vowel Spacing (AVS), shown across the baseline, ramp, hold, washout, and retention phases and compared between the adapt and control sessions in the adaptation, washout, and retention phases.]
[Figure, panels A-F: per-vowel formant changes in F1/F2 space (mels) and normalized distance to center (mels) for the adapt and control sessions, in the adaptation, washout, and retention phases.]
Speakers responded to the perturbation by expanding VSA and increasing AVS in the adapt session relative to the control session (VSA: p = 0.03; AVS: p = 0.003). AVS changes persisted throughout the washout and retention phases (the latter following a 10-min. silent period).
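The two contrast measures can be illustrated with a short sketch. The poster does not spell out their definitions, so this assumes common ones: VSA as the area of the polygon formed by the four corner vowels in F1/F2 space, and AVS as the mean pairwise Euclidean distance between vowels, each expressed as a ratio to baseline. Function names and formant values are hypothetical.

```python
import math
from itertools import combinations
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (F1, F2) in mels


def vowel_space_area(corners: Sequence[Point]) -> float:
    """Polygon area by the shoelace formula; corners must be in perimeter order."""
    n = len(corners)
    total = 0.0
    for k in range(n):
        x1, y1 = corners[k]
        x2, y2 = corners[(k + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def average_vowel_spacing(vowels: Sequence[Point]) -> float:
    """Mean Euclidean distance over all pairs of vowels."""
    pairs = list(combinations(vowels, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


# Hypothetical vowel means (mels), ordered /i/, /ae/, /a/, /u/ around the perimeter.
baseline = [(400.0, 1800.0), (1000.0, 1800.0), (1000.0, 1000.0), (400.0, 1000.0)]
hold = [(370.0, 1840.0), (1030.0, 1840.0), (1030.0, 960.0), (370.0, 960.0)]

norm_vsa = vowel_space_area(hold) / vowel_space_area(baseline)
norm_avs = average_vowel_spacing(hold) / average_vowel_spacing(baseline)
```

With these made-up values, a uniform 10% expansion of the vowel space raises normalized AVS by 10% and normalized VSA by 21%, since area scales with the square of linear distance.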
Speakers Adapted Through Multiple Vowel-Specific Changes
Preprint available here: https://psyarxiv.com/abq65/