Workshop on Sociophonetic Methodology, LSA Summer Institute, Boulder, USA July 2011
Anne Fabricius, Roskilde University, Denmark Tyler Kendall, University of Oregon, USA Dominic Watt, University of York, UK
Part 1: Plotting
1.1 A quick overview of its role in sociophonetics
Part 2: Normalization
2.1 Why and wherefore
2.2 Defining a normalization algorithm typologically
2.3 Evaluating an algorithm’s performance
▪ Fabricius, Watt & Johnson (2009)
▪ Flynn (2011)
▪ Flynn & Foulkes (2011)
▪ Fruehwald (ms)
Part 3: Metrics and the vowel space
3.1 Fabricius (2007)
3.2 Watt & Fabricius (2011)
3.3 Fridland & Kendall (under review)
Question & discussion time
graphical formats (i.e. plots) provide “a front line of attack, revealing intricate structure in data that cannot be absorbed in any other way. We discover unimagined effects, and we challenge imagined ones.” (Cleveland 1993: 1)
“nothing beats a picture” (K. Johnson 2008).
Joos 1948
Peterson & Barney 1952
Labov 1963
Labov, Yaeger, Steiner 1972
Thomas 2001
Labov 2007
NORM (sample data)
vowels.R (+ durplot plug-in)
NORM (& Vowels R library): http://ncslaap.lib.ncsu.edu/tools/norm/
Plotnik
Akustyk (Praat plug-in)
Origin, SigmaPlot, Excel…
normalization: Chiefly Math. and Physics. To multiply (a series, function, variable, etc.) by a factor that makes the norm or some associated quantity (such as an integral) equal to a particular value, usually one. [OED online]
here: factoring out of physical (anatomical > acoustic) differences between samples
listeners unconsciously compensate for absolute formant frequency differences for any given vowel category
cognitive processes underlying this faculty still not well understood (Johnson & Mullennix 1997; Wong et al. 2004; Ames & Grossberg 2008; Monahan & Idsardi 2010)
not the aim of our own work to simulate this process directly
rather, it is to enable qualitative (visual) and quantitative comparisons of speakers’ and groups’ vowel productions
cross-gender and cross-age comparisons
same-gender comparisons
same speaker over time (age-dependent)
sociophonetic community studies will encompass at least some of these
choice not to normalize should be justified
RP data for 3 older RP-speaking men (Hawkins & Midgley 2005)
reflecting the stages of processing carried out on an incoming acoustic signal:
the peripheral auditory system (transform)
the auditory processing centers of the brain (normalization proper)
3-way cross-cutting set of terms
defining where the algorithm derives its information from:
speaker extrinsic vs. intrinsic
vowel extrinsic vs. intrinsic
formant extrinsic vs. intrinsic
psychoperceptual scales (Bark, ERB, mel)
approximate the non-linear frequency response of the inner ear
the ear is much more sensitive to changes in frequency at the lower end of the spectrum
1 critical band = 100 Hz between 150 and 250 Hz, but = 350 Hz between 2150 and 2500 Hz
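A minimal sketch of a Hertz-to-Bark conversion, assuming the Traunmüller (1990) approximation with its low- and high-frequency end corrections left out. The function name `hz_to_bark` is illustrative only. It checks that the two bands quoted above, 100 Hz and 350 Hz wide, each come out at roughly one Bark.

```python
def hz_to_bark(f_hz):
    """Approximate Bark value for a frequency in Hz.

    Assumes the Traunmüller (1990) formula; end corrections are omitted.
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


# Width, in Bark, of the two bands mentioned on the slide.
low_band = hz_to_bark(250) - hz_to_bark(150)
high_band = hz_to_bark(2500) - hz_to_bark(2150)
print(round(low_band, 2), round(high_band, 2))
```

Other conversions (ERB, mel) follow the same pattern with a different formula.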
Bark Difference Metric: Syrdal and Gopal (1986)
two slightly different versions of the same idea:
Hertz values converted into Bark
Z3-Z2 or Z2-Z1 modelling advancement
Z1-Z0 modelling vowel height (NORM uses Z3-Z1)
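The Bark-difference idea can be sketched as follows. This is an illustration under assumptions, not NORM's code: the Hertz-to-Bark step uses the Traunmüller (1990) approximation without end corrections, and the function and key names are hypothetical. Because no F0 is needed, the height dimension is taken as Z3-Z1, as noted for NORM above.

```python
def hz_to_bark(f_hz):
    """Hz to Bark (Traunmüller 1990 approximation, assumed here)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_differences(f1, f2, f3):
    """Return Bark-scaled formant differences for one vowel token."""
    z1, z2, z3 = (hz_to_bark(f) for f in (f1, f2, f3))
    return {
        "Z3-Z1": z3 - z1,  # height dimension (in place of Z1-Z0)
        "Z3-Z2": z3 - z2,  # advancement dimension
        "Z2-Z1": z2 - z1,  # alternative advancement dimension
    }


token = bark_differences(500.0, 1500.0, 2500.0)
print({name: round(value, 2) for name, value in token.items()})
```

Each token is normalized using only its own formants, which is what makes the method vowel intrinsic.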
most successful category for sociophonetic purposes (Adank 2003, Flynn 2011)
ranges (Gerstman)
mean / standard deviation (Lobanov)
individual log-means (Nearey CLIHi4)
centroids (W&F, mW&F, Bigham)
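Two of the methods listed above can be sketched in a few lines. This is a hedged illustration: each function is applied to one speaker's values for one formant, the function names are hypothetical, and the use of the sample standard deviation for Lobanov is an assumption (implementations may differ).

```python
from statistics import mean, stdev


def lobanov(values):
    """z-score one speaker's values for a single formant."""
    m, s = mean(values), stdev(values)  # sample SD: an assumption
    return [(v - m) / s for v in values]


def gerstman(values):
    """Rescale one speaker's values for a single formant to the range 0-999."""
    lo, hi = min(values), max(values)
    return [999.0 * (v - lo) / (hi - lo) for v in values]


speaker_f1 = [300.0, 500.0, 700.0]  # invented F1 values in Hz
print(lobanov(speaker_f1))
print(gerstman(speaker_f1))
```

Both draw on all of a speaker's vowels (vowel extrinsic) but treat each formant separately (formant intrinsic).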
Nordström & Lindblom’s (1975) vocal tract scaling transformation
Nearey’s shared log-mean model (‘Constant Log Interval Hypothesis’ or CLIHs4/s2)
ANAE/Telsur G value (also speaker extrinsic) (Labov, Ash & Boberg 2006: 39-40)
conceptualized as a ‘sliding template’ approach using a scaling factor
Nearey 1 (NORM): CLIHi2 – formant intrinsic
Nearey 2 (NORM): CLIHs2 – formant extrinsic
Adank (2003) rates CLIHi4 (≈ Nearey 1) as more successful than CLIHs4
Nearey 1 is of the same typology as Lobanov
Nearey 2 is the basis for the ANAE method and is implemented in Plotnik
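The formant-intrinsic versus formant-extrinsic contrast can be sketched as below, for a single speaker and two formants. The function names are hypothetical, and the sketch leaves results on the log scale; it is an illustration of the log-mean idea rather than NORM's or Plotnik's code.

```python
from math import log


def nearey1(f1, f2):
    """Formant intrinsic: each formant is centred on its own log-mean."""
    def centre(values):
        logs = [log(v) for v in values]
        log_mean = sum(logs) / len(logs)
        return [x - log_mean for x in logs]
    return centre(f1), centre(f2)


def nearey2(f1, f2):
    """Formant extrinsic: one log-mean is shared across F1 and F2."""
    logs1 = [log(v) for v in f1]
    logs2 = [log(v) for v in f2]
    shared = (sum(logs1) + sum(logs2)) / (len(logs1) + len(logs2))
    return [x - shared for x in logs1], [x - shared for x in logs2]


f1 = [300.0, 500.0, 750.0]    # invented values in Hz
f2 = [2300.0, 1500.0, 1100.0]
print(nearey1(f1, f2))
print(nearey2(f1, f2))
```

With the shared mean, F1 and F2 are shifted by the same amount, so the distance between them on the log scale is preserved.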
http://normtable.notlong.com
a recent renewed interest in this area
earlier work with emphasis more on speech perception (Disner 1980, Deterding 1990, but see Hindle 1978)
now, comparisons tending towards sociophonetically relevant parameters, both visual and quantitative
Adank (2003) used measures of sorting efficiency using linear discriminant analysis
more evaluation possibilities emerging all the time… (see also Clopper 2009)
[Figure: F1/F2 plane showing three numbered cases, 1-3]
• 1 = disparity in area agreement and poor overlap
• 2 = good area agreement but poor overlap
• 3 = good fit on both counts
tested W&F, mW&F against Lobanov and Nearey1 on the following parameters:
reduction of variance in area ratios of vowel polygons
improvement of intersection of vowel polygons
conservation of angular relationships between selected points on the F1/F2 plane, after normalization
area: proportional reduction in variance; Pitman-Morgan test of homogeneity of variance between correlated samples (Cohen 1990)
intersection: intersection of two vowel polygons divided by the union of the same polygons → intersection values compared statistically
vowel juxtapositions: planar locations compared across methods (DRESS-LOT, TRAP-STRUT and LOT-FOOT)
tested on data from RP and Aberdeen English
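The intersection measure above (intersection divided by union) can be sketched with the standard library alone. This is an illustration under a strong assumption: both polygons are convex, which real vowel polygons need not be. It uses Sutherland-Hodgman clipping, a swapped-in technique not named in the source, and all function names are hypothetical.

```python
def signed_area(points):
    """Shoelace formula; positive when vertices run counter-clockwise."""
    pairs = zip(points, points[1:] + points[:1])
    return sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs) / 2.0


def ccw(points):
    """Return the vertices in counter-clockwise order."""
    points = list(points)
    return points if signed_area(points) > 0 else points[::-1]


def clip_convex(subject, clip):
    """Sutherland-Hodgman: part of `subject` inside the convex polygon `clip`."""
    output, clip = ccw(subject), ccw(clip)
    for i in range(len(clip)):
        a, b = clip[i], clip[(i + 1) % len(clip)]

        def inside(p):
            # Point lies on or to the left of the directed edge a -> b.
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

        def crossing(p, q):
            # Intersection of segment p-q with the line through a and b.
            dx1, dy1 = q[0] - p[0], q[1] - p[1]
            dx2, dy2 = b[0] - a[0], b[1] - a[1]
            t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
            return (p[0] + t * dx1, p[1] + t * dy1)

        previous, output = output, []
        for j, cur in enumerate(previous):
            prev = previous[j - 1]
            if inside(cur):
                if not inside(prev):
                    output.append(crossing(prev, cur))
                output.append(cur)
            elif inside(prev):
                output.append(crossing(prev, cur))
        if not output:
            break
    return output


def overlap_ratio(poly_a, poly_b):
    """Area of intersection divided by area of union (convex polygons only)."""
    common = clip_convex(poly_a, poly_b)
    inter = abs(signed_area(common)) if len(common) >= 3 else 0.0
    union = abs(signed_area(list(poly_a))) + abs(signed_area(list(poly_b))) - inter
    return inter / union


square_a = [(0, 0), (2, 0), (2, 2), (0, 2)]
square_b = [(1, 1), (3, 1), (3, 3), (1, 3)]
print(overlap_ratio(square_a, square_b))
```

For non-convex polygons a general-purpose geometry library would be the safer choice.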
Test 1: Equalizing vowel space areas
Test 2: Improving vowel space overlap
Overall:
[Figure: F1/F2 vowel plot showing KIT/BIT, DRESS/BET, TRAP/BAT, STRUT/BUT, LOT/BOT and FOOT/PUT]
Fabricius, Watt & Johnson (2009: 429)
compares 20 normalization methods (6 vowel intrinsic, 14 vowel extrinsic)
▪ includes some innovative normalization techniques
▪ Bigham (2008)
▪ and additional new possible methods
20 speakers of Nottingham English (age, gender); 180 vowel tokens per speaker
method for equalising vowel space areas: squared coefficient of variation (SCV) of the speakers’ vowel space areas (Fabricius et al. 2009)
Python v2.6.4 incorporating the Shapely v1.2.6 package used to determine intersection and union of all 20 speaker vowel space areas
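A minimal sketch of the area test, assuming each speaker's vowel space is a polygon whose vertices are vowel means in (F2, F1) order. The function names and speaker data are invented for illustration, and the use of the sample standard deviation is an assumption. A lower SCV after normalization means the speakers' areas have become more alike.

```python
from statistics import mean, stdev


def polygon_area(vertices):
    """Shoelace formula for a polygon given as ordered (x, y) vertices."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def squared_cv(values):
    """Squared coefficient of variation: (standard deviation / mean) ** 2."""
    return (stdev(values) / mean(values)) ** 2


# Hypothetical per-speaker vowel polygons (F2, F1) in Hz.
speakers = {
    "speaker_a": [(2300, 300), (1500, 800), (900, 350)],
    "speaker_b": [(2700, 350), (1700, 950), (1000, 400)],
    "speaker_c": [(2100, 280), (1400, 700), (850, 320)],
}
areas = [polygon_area(polygon) for polygon in speakers.values()]
print(squared_cv(areas))
```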
Method      SCV       Rank
(Hertz      0.06212   N/A)
Gerstman    0.01020   1
LCE         0.01487   2
Lobanov     0.02032   3
Bigham      0.02556   4
1mW&F       0.02587   5
Letter      0.02637   6
origW&F     0.02671   7
2mW&F       0.02818   8
ERB         0.03233   9
Nearey1     0.03250   =10
NeareyGM    0.03250   =10
Log         0.03250   =10
Ln          0.03250   =10
Flynn (2011: 16) Results of testing equalization of vowel space areas, 10 best methods only
Method         % overlapping   Rank
Bigham         45.8%           1
2mW&F          43.8%           2
origW&F        43.4%           3
1mW&F          42.3%           4
Gerstman       30.0%           5
Lobanov        29.2%           6
Nordstrom      28.7%           7
exp{Nearey1}   27.6%           8
Nearey1        27.1%           9
exp{NGM}       26.9%           10
Bladon         25.9%           11
NeareyGM       25.7%           12
Letter         24.1%           13
LCE            23.1%           14
Bark-diff      13.5%           15
Bark           13.2%           16
Mel            13.1%           17
ERB            12.8%           18
Ln             12.2%           =19
Log            12.2%           =19
Hertz          12.6%
Flynn (2011: 17). Results of testing overlap in vowel space areas
Method      SCV rank   Area rank   Total   Overall rank
Bigham      4          1           5       1
Gerstman    1          5           6       2
1mW&F       5          4           9       =3
Lobanov     3          6           9       =3
2mW&F       8          2           10      =5
origW&F     7          3           10      =5
LCE         2          14          16      7
Letter      6          13          19      =8
Nearey1     =10        9           19      =8
NeareyGM    =10        12          22      10
Flynn (2011: 21). Overall rankings converted to points and then ranked. 10 best performing methods only
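The points-and-rank step can be sketched as follows, using the two rank columns from the table above. The tie-handling rule (tied totals share the best available rank) is an assumption inferred from the "=3", "=5" and "=8" entries, and the function name is hypothetical.

```python
# Rank on the SCV test and rank on the overlap test, from the table above.
scv_rank = {"Bigham": 4, "Gerstman": 1, "1mW&F": 5, "Lobanov": 3, "2mW&F": 8,
            "origW&F": 7, "LCE": 2, "Letter": 6, "Nearey1": 10, "NeareyGM": 10}
overlap_rank = {"Bigham": 1, "Gerstman": 5, "1mW&F": 4, "Lobanov": 6, "2mW&F": 2,
                "origW&F": 3, "LCE": 14, "Letter": 13, "Nearey1": 9, "NeareyGM": 12}


def competition_rank(scores):
    """Rank by ascending score; tied scores share the best rank (1, 2, 3, 3, 5...)."""
    ordered = sorted(scores.items(), key=lambda item: item[1])
    ranks = {}
    for position, (name, score) in enumerate(ordered, start=1):
        if position > 1 and score == ordered[position - 2][1]:
            ranks[name] = ranks[ordered[position - 2][0]]  # reuse tied rank
        else:
            ranks[name] = position
    return ranks


totals = {name: scv_rank[name] + overlap_rank[name] for name in scv_rank}
overall = competition_rank(totals)
for name in sorted(overall, key=overall.get):
    print(name, totals[name], overall[name])
```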
condensed version of Flynn (2011) illustrations of Hertz/Bigham/ERB
‘These results demonstrate the possibility of methods performing to different levels of effectiveness depending on the method of comparison used, and suggest evaluation of methods should ideally be based on a range of comparative tests.’
to evaluate normalization algorithms’ efficacy in reducing male-female vowel space differences
compares density functions for F1, F2, F3 i.e. the likelihood of a formant appearing at a particular frequency
uses some known methods and introduces adjustments to these by varying scaling factors
data: 17 Philadelphia speakers (12 women, 5 men)
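One way to compare formant distributions for two groups is through their empirical cumulative distribution functions. The sketch below is illustrative only: the maximum vertical gap between the two curves is an assumed summary statistic, not necessarily the measure used in the manuscript, and the function names and data are invented.

```python
from bisect import bisect_right


def ecdf(values):
    """Return a function giving the proportion of `values` at or below x."""
    ordered = sorted(values)

    def cdf(x):
        return bisect_right(ordered, x) / len(ordered)

    return cdf


def max_ecdf_gap(group_a, group_b):
    """Largest vertical distance between the two empirical CDFs."""
    cdf_a, cdf_b = ecdf(group_a), ecdf(group_b)
    return max(abs(cdf_a(x) - cdf_b(x)) for x in set(group_a) | set(group_b))


# Hypothetical F1 values in Hz for two groups of speakers.
women_f1 = [420.0, 510.0, 640.0, 780.0, 860.0]
men_f1 = [350.0, 430.0, 520.0, 610.0, 700.0]
print(max_ecdf_gap(women_f1, men_f1))
```

A gap closer to zero after normalization would indicate that the two groups' distributions have been brought closer together.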
Fruehwald (ms, p. 2): Distribution of formants on Hz scale
Fruehwald (ms, p. 3): Empirical cumulative density function (Hz)
Fruehwald (ms, p. X): variant of W&F using a difference metric (two-factor scaling)
what matters in the choice of a normalization algorithm:
explicitly testing a range of normalization procedures using a range of test types
finding arguments to support a choice based on a range of factors
▪ suitability of the typological choice (vowel extrinsic, formant intrinsic preferred)
▪ purposes for which the data is being analyzed
▪ optimal performance within those boundaries
▪ time and budget parameters
aiming also to optimize comparability of results
the choice will thus be (to some extent) individual
Phoneticians’ and sociolinguists’ practices in this area have tended to differ somewhat:
Measuring formant differences one formant at a time OR
Describing the vowel space two-dimensionally
Example: Hawkins and Midgley (2005: 190)
Questions:
Can we bring in some of the benefits of Labovian-type holistic views?
Can we make two-dimensional comparisons of vowel plots more stringent and replicable by using mathematical methods?
Labov 1994: 167
“In Andersen’s speech, the membership of the New York City /æh/ class is quite regular, in that words like man, pass, half are lengthened and more peripheral than other words; but the raising to mid and high position characteristic of younger speakers has not actually begun, and (oh) is equally conservative. The conservative orientation of (ay) and (aw) is equally clear. They are both squarely located in central position, with no tendency toward fronting or backing.”
Figure: Labov 1994: 168
Pillai-Bartlett statistic/Pillai scores (Hall-Lew 2009, Hay, Warren & Drager 2006)
Quantifying angular relations between vowel points (Fabricius 2007) and using the centroid of the vowel space (Watt & Fabricius 2011)
Mahalanobis distances (e.g. Esling 1986) or scaling to enable use of Euclidean distance (Fridland & Kendall, submitted)
Procrustean analysis, relating acoustic and articulatory data (Geng & Mooshammer 2009)
Another recommendation: Harrington, J. (2010) The Phonetic Analysis of Speech Corpora. Wiley-Blackwell.
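As an illustration of the first metric in the list above, the Pillai-Bartlett trace for two vowel categories measured on F1 and F2 can be computed by hand from the between-group (H) and within-group (E) sums-of-squares-and-cross-products matrices as trace(H(H+E)^-1). This is a minimal sketch with invented data and hypothetical function names; in practice the value would normally come from a MANOVA routine in a statistics package.

```python
def _mean_point(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def _add_outer(matrix, d, weight=1.0):
    """Add weight * (d d^T) to a 2x2 matrix in place."""
    for i in range(2):
        for j in range(2):
            matrix[i][j] += weight * d[i] * d[j]


def pillai_score(group_a, group_b):
    """Pillai trace for two groups of (F1, F2) points: 0 = same, 1 = distinct."""
    grand = _mean_point(group_a + group_b)
    h = [[0.0, 0.0], [0.0, 0.0]]  # between-group SSCP
    e = [[0.0, 0.0], [0.0, 0.0]]  # within-group SSCP
    for group in (group_a, group_b):
        centre = _mean_point(group)
        _add_outer(h, (centre[0] - grand[0], centre[1] - grand[1]), len(group))
        for p in group:
            _add_outer(e, (p[0] - centre[0], p[1] - centre[1]))
    t = [[h[i][j] + e[i][j] for j in range(2)] for i in range(2)]
    det = t[0][0] * t[1][1] - t[0][1] * t[1][0]
    inv = [[t[1][1] / det, -t[0][1] / det], [-t[1][0] / det, t[0][0] / det]]
    # trace(H @ inv(T))
    return sum(h[i][j] * inv[j][i] for i in range(2) for j in range(2))


# Invented tokens for two vowel categories, as (F1, F2) in Hz.
vowel_x = [(300.0, 2200.0), (320.0, 2250.0), (310.0, 2180.0), (290.0, 2230.0)]
vowel_y = [(600.0, 1800.0), (620.0, 1850.0), (610.0, 1780.0), (590.0, 1830.0)]
print(pillai_score(vowel_x, vowel_y))
```

Values near 0 indicate heavily overlapping distributions and values near 1 indicate well separated ones.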
TRAP and STRUT relative to the horizontal
(1) $\tan\theta = \dfrac{F1_{\mathrm{TRAP}} - F1_{\mathrm{STRUT}}}{F2_{\mathrm{TRAP}} - F2_{\mathrm{STRUT}}}$
LOT and FOOT relative to the vertical
(2) $\tan\theta = \dfrac{F2_{\mathrm{FOOT}} - F2_{\mathrm{LOT}}}{F1_{\mathrm{LOT}} - F1_{\mathrm{FOOT}}}$
Euclidean distance
(3) $d(x, y) = \sqrt{(F1_x - F1_y)^2 + (F2_x - F2_y)^2}$
Fabricius, Anne H. 2007. Using angle calculations to demonstrate vowel shifts: A diachronic investigation of the short vowel system in 20th-century RP (UK). Acta Linguistica Hafniensis 40: 7-21.
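Formulas (1) to (3) can be sketched as small functions. The vowel means below are invented, the function names are hypothetical, and `atan2` is used in place of a plain arctangent of the ratio so that a zero denominator does not raise an error. That choice is an assumption, and it gives a different angle from the ratio form when the denominator is negative.

```python
from math import atan2, degrees, hypot


def angle_to_horizontal(trap, strut):
    """Formula (1): angle of the TRAP-STRUT line; points are (F1, F2)."""
    return degrees(atan2(trap[0] - strut[0], trap[1] - strut[1]))


def angle_to_vertical(lot, foot):
    """Formula (2): angle of the LOT-FOOT line; points are (F1, F2)."""
    return degrees(atan2(foot[1] - lot[1], lot[0] - foot[0]))


def euclidean_distance(x, y):
    """Formula (3): straight-line distance between two (F1, F2) points."""
    return hypot(x[0] - y[0], x[1] - y[1])


# Invented speaker means as (F1, F2).
trap, strut = (800.0, 1500.0), (700.0, 1400.0)
lot, foot = (600.0, 1000.0), (400.0, 1200.0)
print(angle_to_horizontal(trap, strut))
print(angle_to_vertical(lot, foot))
print(euclidean_distance(trap, strut))
```

Because Euclidean distance mixes F1 and F2, it is only meaningful when both are on comparable scales, for example after normalization.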
Watt, Dominic and Anne H. Fabricius. 2011. A measure of variable planar locations anchored on the centroid of the vowel space: a sociophonetic research tool. To appear in ICPhS 17 Proceedings.
Derives its methodology from the S-centroid calculated by the normalization algorithm, located at (1,1)
(Lobanov normalization also provides a centroid, at (0,0))
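A hedged sketch of the centroid idea, assuming the original W&F construction in which S is the mean of three corner points: FLEECE, TRAP, and a hypothetical [u′] whose F1 and F2 both equal the F1 of FLEECE. Dividing each formant by its S value places the centroid at (1,1). The function names, the vowel means and the angle convention in `polar_from_centroid` are illustrative assumptions, not the published procedure.

```python
from math import atan2, degrees, hypot


def watt_fabricius(vowel_means, fleece="FLEECE", trap="TRAP"):
    """Scale (F1, F2) means by the S-centroid of the speaker's vowel triangle."""
    i_f1, i_f2 = vowel_means[fleece]
    a_f1, a_f2 = vowel_means[trap]
    u_f1 = u_f2 = i_f1  # hypothetical [u'] corner (assumed construction)
    s_f1 = (i_f1 + a_f1 + u_f1) / 3.0
    s_f2 = (i_f2 + a_f2 + u_f2) / 3.0
    scaled = {v: (f1 / s_f1, f2 / s_f2) for v, (f1, f2) in vowel_means.items()}
    return scaled, (s_f1, s_f2)


def polar_from_centroid(point, centroid=(1.0, 1.0)):
    """Distance and angle (degrees) of a normalized vowel from the centroid."""
    d_f1, d_f2 = point[0] - centroid[0], point[1] - centroid[1]
    return hypot(d_f1, d_f2), degrees(atan2(d_f1, d_f2))


# Invented speaker means in Hz, as (F1, F2).
means = {"FLEECE": (300.0, 2300.0), "TRAP": (800.0, 1500.0), "FOOT": (400.0, 1200.0)}
scaled, s = watt_fabricius(means)
print(polar_from_centroid(scaled["FOOT"]))
```

The same polar measurement could be taken from (0,0) for Lobanov-normalized data by passing a different `centroid`.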
work examining the relationship between perception and production of vowel categories during regional vowel shifts
Southern Vowel Shift (SVS)
Northern Cities Shift (NCS)
Diagrams from Gordon, “Do You Speak American?” http://www.pbs.org/speak/ahead/change/changin/
Euclidean distance relies on scale equivalence: Lobanov handles this nicely (W & F actually performs slightly better, at least on datasets tested)
Not all speakers within a region participate in the regional shift, and speakers who do participate do so to varying degrees
/e/ - /ɛ/ onset distances ordered by region
/e/ - /ɛ/ onset distances ordered by distance
Perception is related to an individual’s production, seen in the non-linear (fit by a 2nd order polynomial) effect for Euclidean distance
normalization (and plotting) decisions are crucial components of any sociophonetic vowel (and beyond?) study
not normalizing is a valid decision, but crucially is still a decision
as sociophonetics matures, it is important that we strive to develop both shared best practices and an innovative eye toward rigorous and appropriate quantitative techniques
Adank, Patti. 2003. Vowel Normalization: A Perceptual-Acoustic Study of Dutch Vowels. PhD thesis, University of Nijmegen.
Ames, Heather and Stephen Grossberg. 2008. Speaker normalization using cortical strip maps: a neural model for steady-state vowel categorization. Journal of the Acoustical Society of America 124(6): 3918-3936.
Bigham, Douglas. 2008. Dialect contact and accommodation among emerging adults in a University Setting. PhD thesis, University of Texas at Austin.
Cleveland, W.S. 1993. Visualizing Data. Summit, NJ: Hobart Press.
Clopper, Cynthia. 2009. Computational methods for normalization of acoustic vowel data for talker differences. Language and Linguistics Compass 3(6): 1430-1442.
Cohen, Ayala. 1990. Graphical methods for testing the equality of several correlated variances. The Statistician 39(1): 43-52.
Deterding, David. 1990. Speaker Normalisation for Automatic Speech Recognition. Unpublished PhD thesis, University of Cambridge.
Disner, Sandra Ferrari. 1980. Evaluation of vowel normalization procedures. Journal of the Acoustical Society of America 67: 253-261.
Esling, John. 1986. Some analyses of vowels by social group in the Survey of Vancouver English. Working Papers of the Linguistic Circle 5(1): 21-32.
Fabricius, Anne H. 2007a. Vowel formants and angle measurements in diachronic sociophonetic studies: FOOT-fronting in RP. Proceedings of the 16th ICPhS, Saarbrücken.
Fabricius, Anne H. 2007b. Variation and change in the TRAP and STRUT vowels of RP: a real time comparison of five acoustic data sets. Journal of the International Phonetic Association 37(3): 293-320.
Fabricius, Anne H. 2007c. Using angle calculations to demonstrate vowel shifts: a diachronic investigation of the short vowel system in 20th-century RP (UK). Acta Linguistica Hafniensis 40: 7-21.
Fabricius, Anne H., Dominic Watt and Daniel Ezra Johnson. 2009. A comparison of three speaker-intrinsic vowel formant frequency normalization algorithms for sociophonetics. Language Variation and Change 21(3): 413-435.
Flynn, Nicholas. 2011. Comparing vowel formant normalisation procedures. York Working Papers in Linguistics (Series 2) 11: 1-28.
Flynn, Nicholas and Paul Foulkes. 2011. Comparing vowel formant normalisation procedures. To appear in Proceedings of the 17th ICPhS, Hong Kong.
Fridland, Valerie and Tyler Kendall. Under review. Exploring the relationship between production and perception in the mid front vowels of U.S. English.
Fruehwald, Josef. ms. Evaluating normalization procedures’ effectiveness at eliminating sex differences. Unpublished manuscript, University of Pennsylvania.
Geng, Christian and Christine Mooshammer. 2009. How to stretch and shrink vowel systems: results from a vowel normalization procedure. Journal of the Acoustical Society of America 125(5): 3278-3288.
Gerstman, Louis. 1968. Classification of self-normalized vowels. IEEE Transactions on Audio and Electroacoustics AU-16: 78-80.
Hall-Lew, Lauren. 2009. Ethnicity and Phonetic Variation in a San Francisco Neighborhood. PhD thesis, Stanford University.
Harrington, Jonathan. 2010. The Phonetic Analysis of Speech Corpora. Oxford: Wiley-Blackwell.
Hay, Jennifer, Paul Warren and Katie Drager. 2006. Factors influencing speech perception in the context of a merger-in-progress. Journal of Phonetics 34: 458-484.
Hawkins, Sarah and Jonathan Midgley. 2005. Formant frequencies of RP monophthongs in four age groups of speakers. Journal of the International Phonetic Association 35(2): 183-199.
Hindle, Donald. 1978. Approaches to vowel normalization in the study of natural speech. In D. Sankoff (ed.), Linguistic Variation: Models and Methods. New York: Academic Press, pp. 161-171.
Johnson, Keith. 2008. Quantitative Methods in Linguistics. Oxford: Blackwell.
Johnson, Keith and John Mullennix (eds.). 1997. Talker Variability in Speech Processing. San Diego: Academic Press.
Joos, Martin. 1948. Acoustic phonetics. Language 24(2). Language Monograph 23: 5-136.
Kamata, Miho. 2008. A socio-phonetic study of the DRESS, TRAP and STRUT vowels in London English. Leeds Working Papers in Linguistics and Phonetics 11. [Online: <http://www.leeds.ac.uk/linguistics/WPL/WP2006/6.pdf>]
Labov, William. 1963. The social motivation of a sound change. Word 19: 273-309.
Labov, William. 1984. Field methods on the project on linguistic change and variation. In John Baugh and Joel Sherzer (eds.), Language in Use: Readings in Sociolinguistics. Englewood Cliffs, NJ: Prentice Hall, pp. 22-86.
Labov, William. 1994. Principles of Linguistic Change, vol. 1: Internal Factors. Oxford: Blackwell.
Labov, William. 2007. Transmission and diffusion. Language 83(2): 344-387.
Labov, William, Sharon Ash and Charles Boberg. 2006. The Atlas of North American English: Phonetics, Phonology, and Sound Change. New York: Mouton de Gruyter.
Labov, William, Malcah Yaeger and Richard Steiner. 1972. A Quantitative Study of Sound Change in Progress, Vol. 1. Philadelphia: US Regional Survey.
Lobanov, Boris M. 1971. Classification of Russian vowels spoken by different speakers. Journal of the Acoustical Society of America 49(2B): 606-608.
Monahan, Philip and William Idsardi. 2010. Auditory sensitivity to formant ratios: toward an account of vowel normalisation. Language and Cognitive Processes 25(6): 808-839.
Nearey, Terrance. 1977/8. Phonetic Feature Systems for Vowels. Indiana University Linguistics Club. [Online: <http://www.ualberta.ca/~tnearey/Nearey1978_compressed.pdf>]
Nearey, Terrance and Peter Assmann. 2007. Probabilistic ‘sliding template’ models for indirect vowel normalization. In Maria-Josep Solé, Patrice Beddor and Manjari Ohala (eds.), Experimental Approaches to Phonology. Oxford: Oxford University Press, pp. 246-269.
Nordström, P.-E. and Björn Lindblom. 1975. A normalization procedure for vowel formant data. Proceedings of the 8th ICPhS, Leeds, p. 212.
Peterson, Gordon and Harold Barney. 1952. Control methods used in a study of the vowels. Journal of the Acoustical Society of America 24(2): 175-184.
Pisoni, David. 1997. Some thoughts on ‘normalization’ in speech perception. In Keith Johnson and John Mullennix (eds.), Talker Variability in Speech Processing. San Diego: Academic Press, pp. 9-32.
Stevens, Stanley and John Volkman. 1940. The relation of pitch to frequency: a revised scale. American Journal of Psychology 53: 329-353.
Syrdal, A.K. and H.S. Gopal. 1986. A perceptual model of vowel recognition based on the auditory representation of American English vowels. Journal of the Acoustical Society of America 79: 1086-1100.
Thomas, Erik R. 2001. An Acoustic Analysis of Vowel Variation in New World English. Durham, NC: Duke University Press.
Thomas, Erik R. and Tyler Kendall. 2007. NORM: The Vowel Normalization and Plotting Suite. [Online: <http://ncslaap.lib.ncsu.edu/tools/norm/>]
Traunmüller, Hartmut. 1990. Analytical expressions for the tonotopic sensory scale. Journal of the Acoustical Society of America 88: 97-100.
Traunmüller, Hartmut. 1997. Auditory Scales of Frequency Representation. [Online: <http://www.ling.su.se/staff/hartmut/bark.htm>]
Watt, Dominic and Anne Fabricius. 2002. Evaluation of a technique for improving the mapping of multiple speakers' vowel spaces in the F1~F2 plane. Leeds Working Papers in Linguistics and Phonetics 9: 159-73.
Watt, Dominic, Anne Fabricius and Tyler Kendall. 2011. More on vowels: plotting and normalization. In Marianna di Paolo and Malcah Yaeger-Dror (eds.), Sociophonetics: A Student’s Guide. Routledge, pp. 107-118.
Watt, Dominic and Anne Fabricius. 2011. A measure of variable planar locations anchored on the centroid of the vowel space: a sociophonetic research tool. Proceedings of the 17th ICPhS, Hong Kong.
Wong, Patrick, Howard Nussbaum and Steven Small. 2004. Neural bases of talker normalization. Journal of Cognitive Neuroscience 16(7): 1173-1184.
See also http://ncslaap.lib.ncsu.edu/tools/norm/biblio1.php
Questions and discussion?
Anne H. Fabricius, [email protected] Tyler Kendall, [email protected] Dom Watt, [email protected]