Page 1: Characterisation of individuals’ formant dynamics using polynomial equations Kirsty McDougall Department of Linguistics University of Cambridge kem37@cam.ac.uk.

Characterisation of individuals’ formant dynamics using polynomial equations

Kirsty McDougall
Department of Linguistics
University of Cambridge

kem37@cam.ac.uk

IAFPA 2006

Speaker characteristics and static features of speech

• Most previous research has focussed on static features (instantaneous or average measures)

• Straightforward to measure

• Natural progression from other research areas – delineation of different languages and language varieties

• Reflect certain anatomical dimensions of a speaker, e.g. formant frequencies relate to the length and configuration of the vocal tract (VT)

• Instantaneous and average measures demonstrate speaker differences, but are unable to distinguish all members of a population

So: look to dynamic (time-varying) features

Dynamic features of speech

• Carry more information than static features

• Reflect the movement of a person’s speech organs as well as their dimensions: people move in individual ways in skilled motor activities such as walking, running, … and speech

• Speech can be viewed as the achievement of a series of linguistic ‘targets’

• Speakers are likely to exhibit similar properties at ‘targets’ (e.g. segment midpoints), but to move between them in individual ways

So: examine formant frequency dynamics

Formant dynamics

[Figure: /aɪ/ in ‘bike’ uttered by two male speakers of Australian English, one panel per speaker. Axes: Time (s) against Frequency (Hz); panel annotation: 10%.]

Research Questions

• How do speakers’ formant dynamics reflect individual differences in the production of the sequence /aɪk/?

• How can this dynamic information be captured to characterise individual speakers?

Target words (each containing /aɪk/):

bike    /baɪk/
hike    /haɪk/
like    /laɪk/
mike    /maɪk/
spike   /spaɪk/

Data set

e.g. I don’t want the scooter, I want the bike now. Later won’t do, I want the bike now.

5 repetitions
× 5 words (bike, hike, like, mike, spike)
× 2 stress levels (nuclear, non-nuclear)
× 2 speaking rates (normal, fast)
= 100 tokens per subject
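The token count follows from crossing the four design factors. A minimal sketch that enumerates the design for one subject; the variable names are illustrative and not taken from the study:

```python
from itertools import product

words = ["bike", "hike", "like", "mike", "spike"]
stress_levels = ["nuclear", "non-nuclear"]
speaking_rates = ["normal", "fast"]
repetitions = range(1, 6)  # 5 repetitions of each word in each condition

# One entry per token recorded from a single subject.
tokens = list(product(words, stress_levels, speaking_rates, repetitions))

# Tokens falling in one rate-stress condition, e.g. fast-nuclear.
fast_nuclear = [t for t in tokens if t[1] == "nuclear" and t[2] == "fast"]

print(len(tokens))        # 100 tokens per subject
print(len(fast_nuclear))  # 25 tokens per subject in each condition
```

Each of the four rate-stress conditions therefore contributes 25 tokens per speaker.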

Subjects

• 5 adult male native speakers of Australian English (A, B, C, D, E)

• aged 22-28

• Brisbane/Gold Coast, Queensland

[Figure: Speaker A, “bike” (normal-nuclear). Annotations: points 1 and 2; steps at 10, 20, 30, 40, 50, 60, 70, 80 and 90%; formant tracks F1, F2 and F3.]
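As a rough illustration of sampling a formant track at 10% steps between two marked points, the sketch below computes the step times and reads off values by linear interpolation. The function names, the track format and the use of linear interpolation are assumptions made for this example; the slides do not say how the formant values were extracted, and the numbers are made up.

```python
def step_times(start, end, steps=range(10, 100, 10)):
    """Times at the given percentage steps between two marked points."""
    return [start + (end - start) * pct / 100 for pct in steps]


def interpolate(track, t):
    """Linearly interpolate a formant track [(time_s, freq_hz), ...] at time t."""
    for (t0, f0), (t1, f1) in zip(track, track[1:]):
        if t0 <= t <= t1:
            return f0 + (f1 - f0) * (t - t0) / (t1 - t0)
    raise ValueError("t lies outside the track")


# Hypothetical F1 track (made-up values, not data from the study).
f1_track = [(0.00, 500.0), (0.10, 700.0), (0.20, 400.0)]

times = step_times(0.00, 0.20)  # 10%, 20%, 30% and so on up to 90%
f1_samples = [interpolate(f1_track, t) for t in times]
```

Because the steps are proportions of the interval rather than absolute times, tokens of different durations yield the same number of measurements.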

[Figure: F1 contours, normal-nuclear, Speakers A to E. x-axis: +10% step of /aɪ/ (0 to 100); y-axis: Frequency (Hz), 300 to 800.]

[Figure: F2 contours, normal-nuclear, Speakers A to E. x-axis: +10% step of /aɪ/ (0 to 100); y-axis: Frequency (Hz), 800 to 2000.]

[Figure: F3 contours, normal-nuclear, Speakers A to E. x-axis: +10% step of /aɪ/ (0 to 100); y-axis: Frequency (Hz), 2000 to 2800.]

Discriminant Analysis

Multivariate technique used to determine whether a set of predictors (formant frequency measurements) can be combined to predict group (speaker) membership

(ref. Tabachnick and Fidell 1996)
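To make the classification step concrete: with equal prior probabilities, a linear discriminant analysis assigns a token to the group whose centroid is nearest in Mahalanobis distance, measured with the pooled within-group covariance. The sketch below implements that rule for two predictors in plain Python. The toy data, the restriction to two predictors and three speakers, and all function names are assumptions for illustration only; the slides do not state which software or settings were used.

```python
def mean_vector(rows):
    """Centroid of a list of 2-dimensional observations."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(2)]


def pooled_covariance(groups, centroids):
    """Pooled within-group covariance matrix (2 x 2) across all groups."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    n_total = 0
    for label, rows in groups.items():
        m = centroids[label]
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        n_total += len(rows)
    dof = n_total - len(groups)
    return [[s[i][j] / dof for j in range(2)] for i in range(2)]


def invert_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def mahalanobis_sq(x, centre, inv_cov):
    d = [x[0] - centre[0], x[1] - centre[1]]
    return sum(d[i] * inv_cov[i][j] * d[j] for i in range(2) for j in range(2))


def classify(x, centroids, inv_cov):
    """Assign x to the group with the nearest centroid (equal priors)."""
    return min(centroids, key=lambda g: mahalanobis_sq(x, centroids[g], inv_cov))


# Toy predictors (two made-up measurements in Hz) for three made-up speakers.
groups = {
    "A": [(500, 1500), (520, 1480), (510, 1530)],
    "B": [(600, 1300), (620, 1340), (590, 1310)],
    "C": [(450, 1700), (470, 1690), (440, 1720)],
}
centroids = {g: mean_vector(rows) for g, rows in groups.items()}
inv_cov = invert_2x2(pooled_covariance(groups, centroids))

# Classification rate: percentage of tokens assigned to their own speaker.
tokens = [(x, g) for g, rows in groups.items() for x in rows]
correct = sum(classify(x, centroids, inv_cov) == g for x, g in tokens)
rate = 100 * correct / len(tokens)
```

Here the same tokens are used both to build and to test the rule; whether the percentages reported in the talk were cross-validated is not stated on the slides.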

[Figure: discriminant analysis scatterplot, fast-nuclear condition. Axes: Function 1 against Function 2; legend: Speakers A to E.]

Each datapoint represents 1 token

Each speaker’s tokens are represented with a different colour, e.g. Speaker E’s 25 tokens of /aɪk/

DA constructs discriminant functions which maximise differences between speakers (each function is a linear combination of the formant frequency predictors)

Assess how well the predictors distinguish speakers by the extent of clustering of tokens, plus the classification percentage: 95%

Discriminant Analysis

[Figure: four discriminant analysis scatterplots (normal-nuclear, fast-nuclear, normal-non-nuclear, fast-non-nuclear). Axes: Function 1 against Function 2; legend: Speakers A to E and group centroids.]

Classification percentages shown: 95%, 88%, 95%, 89%

Discussion

• DA scatterplots and classification rates promising

• However, not very efficient – method essentially based on a series of instantaneous measurements, probably containing dependent information

• Recall: individuals’ F1 contours of /aɪk/…

[Figure: F1 contours, normal-nuclear, Speakers A to E. x-axis: +10% step of /aɪ/ (0 to 100); y-axis: Frequency (Hz), 300 to 800.]

A new approach…

• Differences in location in frequency range

• Differences in curvature – location of turning points, convex/concave, steep/shallow

• Need to capture the most defining aspects of the contours efficiently

So: use linear regression to parameterise the curves with polynomial equations

Linear regression

• Technique for determining the equation of a line or curve which approximates the relationship between a set of (x, y) points

[Figure: scatterplot of (x, y) points with a fitted straight line; both axes run from 0 to 20.]

$y = a_0 + a_1 x$, where $a_0$ is the y-intercept and $a_1$ is the gradient
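For a straight line, the least-squares estimates of the y-intercept and gradient have a closed form. The following is a minimal sketch with made-up points; the function name `fit_line` is illustrative:

```python
def fit_line(xs, ys):
    """Least-squares estimates of a0 (y-intercept) and a1 (gradient)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a1 = sxy / sxx             # gradient
    a0 = mean_y - a1 * mean_x  # y-intercept
    return a0, a1


# Made-up points lying exactly on y = 2 + 0.5x, so the fit recovers
# a0 = 2 and a1 = 0.5 (up to floating-point rounding).
xs = [0, 4, 8, 12, 16, 20]
ys = [2 + 0.5 * x for x in xs]
a0, a1 = fit_line(xs, ys)
```

With noisy points the same formulas give the line that minimises the sum of squared vertical distances to the data.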

• Can also be used for curvilinear relationships

[Figure: scatterplot of (x, y) points following a curve, with a fitted quadratic; both axes run from 0 to 20.]

Quadratic: $y = a_0 + a_1 x + a_2 x^2$, where $a_0$ is the y-intercept and $a_1$, $a_2$ determine the shape and direction of the curve

Polynomial Equations

Cubic: $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3$

Quartic: $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4$

Quintic: $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5$

[Figure: an example curve of each type, plotted as y against x.]

/aɪk/ data

• Fit F1, F2, F3 contours with polynomial equations

• Test the reliability of the polynomial coefficients in distinguishing speakers

Quadratic: $y = a_0 + a_1 t + a_2 t^2$

Cubic: $y = a_0 + a_1 t + a_2 t^2 + a_3 t^3$
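Fitting a polynomial is still linear regression, because the equation is linear in the coefficients. The sketch below obtains the coefficients by solving the normal equations with Gaussian elimination in plain Python. It is one possible way to do the fit, not necessarily the procedure used in the study; the function names and the contour values are made up for illustration.

```python
def fit_polynomial(ts, ys, degree):
    """Least-squares coefficients [a0, a1, ...] via the normal equations."""
    n = degree + 1
    # Normal equations (X^T X) a = X^T y, with design matrix X[i][j] = ts[i] ** j.
    a = [[sum(t ** (i + j) for t in ts) for j in range(n)] for i in range(n)]
    b = [sum(y * t ** i for t, y in zip(ts, ys)) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]

    # Back substitution on the upper-triangular system.
    coeffs = [0.0] * n
    for row in reversed(range(n)):
        rest = sum(a[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (b[row] - rest) / a[row][row]
    return coeffs


# Made-up contour generated from a known quadratic, sampled at t = 0, 1, ..., 9.
ts = list(range(10))
ys = [400 + 80 * t - 6 * t ** 2 for t in ts]

quadratic = fit_polynomial(ts, ys, 2)  # close to [400, 80, -6]
cubic = fit_polynomial(ts, ys, 3)      # cubic term close to 0 for this data
```

Each contour is thus reduced to three (quadratic) or four (cubic) numbers, however many points were measured along it.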

[Figure: F1 contour of “bike”, Speaker A (normal-nuclear token 1), with quadratic and cubic fits overlaid on the actual data points. x-axis: normalised time t (0 to 9); y-axis: Frequency y (Hz), 0 to 800.]

Quadratic fit: $y = 420.68 + 79.26t - 5.92t^2$, R = 0.879

Cubic fit: $y = 478.85 - 46.07t + 35.62t^2 - 3.46t^3$, R = 0.978
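The reported F1 equations can be evaluated directly. This sketch evaluates both fits at the normalised time points t = 0 to 9 (the range shown on the figure's axis) and locates the turning point of the quadratic, which for $y = a_0 + a_1 t + a_2 t^2$ lies at $t = -a_1 / (2 a_2)$. The coefficients are copied from the slide; the variable and function names are illustrative.

```python
quadratic = [420.68, 79.26, -5.92]      # a0, a1, a2 from the slide
cubic = [478.85, -46.07, 35.62, -3.46]  # a0, a1, a2, a3 from the slide


def evaluate(coeffs, t):
    """Value of a0 + a1*t + a2*t**2 + ... at time t."""
    return sum(c * t ** k for k, c in enumerate(coeffs))


for t in range(10):
    print(t, round(evaluate(quadratic, t), 1), round(evaluate(cubic, t), 1))

# Turning point of the quadratic fit.
a0, a1, a2 = quadratic
t_peak = -a1 / (2 * a2)
f1_peak = evaluate(quadratic, t_peak)
```

For the quadratic fit the turning point falls at roughly t = 6.7, where the fitted F1 is about 686 Hz.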

[Figure: F2 contour of “bike”, Speaker A (normal-nuclear token 1), with quadratic and cubic fits overlaid on the actual data points. x-axis: normalised time t (0 to 9); y-axis: Frequency y (Hz), 600 to 2000.]

Quadratic fit: $y = 876.01 - 53.24t + 22.46t^2$, R = 0.985

Cubic fit: $y = 825.49 + 55.64t - 13.63t^2 + 3.01t^3$, R = 0.991

DA on polynomial coefficients

• Quadratic: 3 formants × 3 coefficients = 9 predictors

• Cubic: 3 formants × 4 coefficients = 12 predictors

• Cubic + duration of /aɪ/: 12 + 1 = 13 predictors
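The predictor counts can be illustrated by assembling one token's predictor vector. In this sketch the F1 and F2 cubic coefficients are those reported for Speaker A's “bike” token; the F3 coefficients, the duration value and the function name are placeholders invented for the example.

```python
def predictor_vector(coeffs_by_formant, duration=None):
    """Concatenate F1, F2, F3 coefficients (and optionally duration) for one token."""
    vector = []
    for formant in ("F1", "F2", "F3"):
        vector.extend(coeffs_by_formant[formant])
    if duration is not None:
        vector.append(duration)
    return vector


cubic_token = {
    "F1": [478.85, -46.07, 35.62, -3.46],  # reported cubic fit for F1
    "F2": [825.49, 55.64, -13.63, 3.01],   # reported cubic fit for F2
    "F3": [0.0, 0.0, 0.0, 0.0],            # placeholder: no F3 fit is reported
}

cubic_predictors = predictor_vector(cubic_token)
cubic_dur_predictors = predictor_vector(cubic_token, duration=0.2)  # placeholder duration

print(len(cubic_predictors))      # 12
print(len(cubic_dur_predictors))  # 13
```

A quadratic fit supplies three coefficients per formant, so the same function would return 9 predictors.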

Comparison of Classification Rates

[Figure: bar chart of % correct classification (0 to 100) for the four conditions (normal-nuclear, fast-nuclear, normal-non-nuclear, fast-non-nuclear), comparing four predictor sets.]

No. of predictors: quadratic (9), cubic (12), cubic + dur (13), direct measurements (20)

Percentages labelled on the chart: 96%, 92%, 89%, 90%

Summary of findings

• Comparing polynomial-based tests with direct measurement-based tests: the reduction in classification accuracy is small, in return for a much smaller number of predictors

• Future: aim to develop this approach to include additional information, parameterising other dynamic aspects of speech to capture a dense amount of speaker-specific information with a small number of predictors

Conclusion

• Differences in formant dynamics reflect differences in articulatory strategies (and VT dimensions) among speakers

e.g. speaker-specificity of /aɪk/ formant dynamics

- differences in shape and frequency for F1, F2 and F3
- preserved across changes in speaking rate and stress

• Trialled a new technique for characterising individuals’ formant contours using polynomial equations on /aɪk/ data

• Able to capture almost the same amount of speaker-specific information with far fewer predictors

The polynomial approach using formant dynamics should make an important contribution to speaker characterisation techniques in future
