A Spectral-Temporal Method for Pitch Tracking

Stephen A. Zahorian*, Princy Dikshit, Hongbing Hu*

Department of Electrical and Computer Engineering

Old Dominion University, Norfolk, VA 23529, USA.

* Currently at Binghamton University

09/17/2006


Outline

Introduction
Algorithm
  Algorithm overview
  The use of nonlinear processing
  Pitch tracking from the spectrum
Experimental evaluation
Conclusion


Introduction

Pitch (the fundamental frequency, F0) applications
  Automatic speech recognition (ASR), speech synthesis, speech articulation training aids, etc.

Pitch detection algorithms
  “Robust and accurate fundamental frequency estimation based on dominant harmonic components,” Nakatani et al. => High accuracy for noisy speech reported using the harmonic dominance spectrum
  “Yet another algorithm for pitch tracking (YAAPT),” Zahorian et al. => Hybrid spectral-temporal processing for pitch tracking


Algorithm Overview

[Figure: block diagram of the algorithm. Processing blocks: nonlinear processing, FFT, pitch tracking, F0 candidates estimation (two instances), candidates refinement, final F0 determination using dynamic programming. Signals: original speech, squared value of speech, spectrum, spectral F0 track, F0 candidates (original speech), F0 candidates (squared value), refined F0 candidates, final F0.]


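The Use of Nonlinear Processing

Restoration of the missing fundamental in telephone speech

A periodic sound is characterized by the spectrum of its harmonics

The signal with the fundamental missing can be approximated as

$$y(t) = b_2 \cos(2\omega t) + b_3 \cos(3\omega t)$$

where $b_2 \cos(2\omega t)$ is the 1st harmonic, $b_3 \cos(3\omega t)$ is the 2nd harmonic, and the fundamental $b_1 \cos(\omega t)$ is missing

After squaring and applying trigonometric identities

$$y^2(t) = \frac{b_2^2 + b_3^2}{2} + b_2 b_3 \cos(\omega t) + \frac{b_2^2}{2} \cos(4\omega t) + b_2 b_3 \cos(5\omega t) + \frac{b_3^2}{2} \cos(6\omega t)$$

The fundamental reappears, as the term $b_2 b_3 \cos(\omega t)$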
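The effect of squaring can be checked numerically. The sketch below is illustrative only: the sample rate, the F0 value, the amplitudes and the helper `tone_amplitude` are assumptions chosen for the example, not values or code from the slides. It builds a signal containing only the components at 2·F0 and 3·F0, squares it, and measures the component at F0 before and after.

```python
import math


def tone_amplitude(signal, freq_hz, sample_rate_hz):
    """Amplitude of the sinusoidal component of `signal` at `freq_hz`.

    Projects the signal onto a cosine and a sine at that frequency,
    i.e. evaluates a single bin of the discrete Fourier transform.
    """
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n


# Illustrative values (assumptions, not taken from the slides).
F0 = 100.0         # fundamental frequency in Hz
FS = 8000.0        # sample rate in Hz
B2, B3 = 1.0, 0.5  # amplitudes of the components at 2*F0 and 3*F0
NUM_SAMPLES = 800  # 0.1 s, a whole number of periods of F0

# Signal with the fundamental missing: y(t) = b2 cos(2wt) + b3 cos(3wt)
y = [B2 * math.cos(2 * math.pi * 2 * F0 * i / FS)
     + B3 * math.cos(2 * math.pi * 3 * F0 * i / FS)
     for i in range(NUM_SAMPLES)]

# Nonlinear processing: square each sample.
y_squared = [v * v for v in y]

amp_before = tone_amplitude(y, F0, FS)         # component at F0 in y(t)
amp_after = tone_amplitude(y_squared, F0, FS)  # component at F0 in y^2(t)
```

With these values the component at F0 is absent from y(t) and has amplitude b2·b3 in the squared signal, which is what the cos(ωt) term of the expansion predicts.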


Illustration of Nonlinear Processing

The telephone speech signal (top panel) and squared telephone signal (bottom panel) for one frame


Illustration of Nonlinear Processing

The magnitude spectrum for the telephone signal (top panel) and the nonlinear processed signal (bottom panel)


Spectral Effects from Nonlinear Processing

The missing fundamental in the telephone speech (top panel) is restored in the squared signal (bottom panel)

[Figure: two spectrograms, Frequency (Hz) versus Time (Seconds), covering 18 to 23 seconds with frequency ticks from 100 to 400 Hz. Top panel: spectrum of the telephone speech. Bottom panel: spectrum of the nonlinear processed signal.]


Pitch Tracking From the Spectrum

The pitch track from the spectrum refines the pitch candidates estimated from the temporal method

To achieve a noise-robust pitch track from the spectrum, an autocorrelation type of function is proposed


Autocorrelation Type of Function

The function takes into account multiple harmonics

Equation

$$y(k) = \sum_{i=-WL/2}^{WL/2} \; \prod_{n=1}^{N+1} f(nk + i)$$

$f(i)$: the spectrum
$WL$: window length (20 Hz)
$N$: the number of harmonics (3)
$k$: frequency index, $k_{F0\_min} \le k \le k_{F0\_max}$

[Figure: top panel, spectrum versus Frequency (Hz), 0 to 1000 Hz, with windows of width WL marked at k, 2k, 3k and 4k and multiplication signs between them. Bottom panel, autocorrelation type of function versus Frequency (Hz), 0 to 400 Hz.]
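The autocorrelation type of function can be sketched directly from its definition. This is a minimal sketch under stated assumptions, not the authors' implementation: the spectrum is taken to have 1 Hz per bin (so WL = 20 Hz is 20 bins), bins outside the spectrum are treated as zero, the product is read as covering the fundamental plus N harmonics (n = 1 to N+1, matching the four windows at k, 2k, 3k and 4k in the slide figure), and the function names and the synthetic test spectrum are hypothetical.

```python
def autocorrelation_type_function(spectrum, k, window_length=20, num_harmonics=3):
    """Sum over a window of the product of spectrum values at multiples of k.

    spectrum: magnitude spectrum f, indexed by frequency bin
    k: candidate fundamental frequency, as a bin index
    window_length: WL, in bins
    num_harmonics: N; the product runs over n = 1 .. N+1 (assumption)
    """
    half = window_length // 2
    total = 0.0
    for i in range(-half, half + 1):
        product = 1.0
        for n in range(1, num_harmonics + 2):
            index = n * k + i
            if 0 <= index < len(spectrum):
                product *= spectrum[index]
            else:
                # Out-of-range bins count as zero (assumption).
                product = 0.0
                break
        total += product
    return total


def spectral_f0_bin(spectrum, k_min, k_max, **kwargs):
    """Bin in [k_min, k_max] where the function is largest."""
    return max(range(k_min, k_max + 1),
               key=lambda k: autocorrelation_type_function(spectrum, k, **kwargs))


# Synthetic magnitude spectrum, 1 Hz per bin: narrow triangular peaks at
# 120, 240, 360 and 480 Hz, zero elsewhere (made-up test data).
spectrum = [0.0] * 1001
for harmonic in (120, 240, 360, 480):
    for offset in (-2, -1, 0, 1, 2):
        spectrum[harmonic + offset] = 1.0 - abs(offset) / 3.0

best_bin = spectral_f0_bin(spectrum, k_min=60, k_max=400)
```

Because every factor in the product must be nonzero, a candidate such as k = 60 or k = 240 scores zero on this spectrum: at least one of its windows falls between the peaks. Only candidates whose multiples all line up with spectral peaks score highly, which is how the function takes multiple harmonics into account.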


Peaks in Autocorrelation Type of Function

A very prominent peak is observed in the proposed function

[Figure: top panel, spectrum, Amplitude versus Frequency (Hz), 0 to 1200 Hz. Bottom panel, peaks in the autocorrelation type of function, Amplitude versus Frequency (Hz), 0 to 450 Hz.]


Candidate Insertion to Reduce Pitch Doubling/Halving

If all candidates are larger than a threshold (typically 150 Hz), an additional candidate is inserted at half the frequency of the highest-ranking candidate

Similar logic is used to reduce pitch halving

[Figure: peaks in the autocorrelation type of function, Amplitude versus Frequency (Hz), 0 to 400 Hz, showing a peak P1 and an inserted candidate P2 (Hz) = P1 (Hz) / 2.]
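The insertion rule can be sketched as follows. This is a hedged illustration, not the authors' code: the function name is hypothetical, the candidate list is assumed to be ordered best first, and the mirrored rule (inserting a candidate at double the frequency when all candidates fall below a low threshold) is only a guess at the "similar logic". The slides give no threshold for that case, so it is disabled unless the caller supplies one.

```python
def insert_octave_candidates(candidates_hz, high_threshold_hz=150.0,
                             low_threshold_hz=None):
    """Return a copy of the candidate list with one extra candidate if needed.

    candidates_hz: F0 candidates for one frame, best first (assumption)
    high_threshold_hz: if every candidate is above this, add best / 2
    low_threshold_hz: if given and every candidate is below it, add best * 2
                      (hypothetical mirror of the rule stated on the slide)
    """
    if not candidates_hz:
        return []
    result = list(candidates_hz)
    best = candidates_hz[0]
    if all(c > high_threshold_hz for c in candidates_hz):
        # All candidates are high: offer the lower octave as well.
        result.append(best / 2.0)
    elif low_threshold_hz is not None and all(
            c < low_threshold_hz for c in candidates_hz):
        # All candidates are low: offer the upper octave as well.
        result.append(best * 2.0)
    return result


with_half = insert_octave_candidates([220.0, 330.0])
unchanged = insert_octave_candidates([120.0, 240.0])
```

The inserted value is only an additional candidate. Whether it ends up in the final track is left to the later stages (candidate refinement and dynamic programming in the algorithm overview).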


Experimental Evaluation

Database
  Keele pitch extraction database
  5 male and 5 female speakers, about 35 seconds per speaker
  High quality speech and telephone speech
  Additive Gaussian noise

Controls (reference pitch)
  Control C1: supplied in Keele database
  Control C2: computed from the laryngograph signal with the proposed algorithm


Definition of Error Measures

Gross error
  The percentage of frames such that the pitch estimate of the tracker deviates significantly (typically 20%) from the reference pitch (control)
  Only evaluated in the voiced sections of the reference
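A minimal sketch of the gross error measure, under assumptions the slides do not spell out: pitch tracks are per-frame lists in Hz, an unvoiced frame is marked by 0, and the deviation is measured relative to the reference value. The function name is hypothetical.

```python
def gross_error_percent(estimated_hz, reference_hz, threshold=0.20):
    """Percentage of voiced reference frames with a gross pitch error.

    estimated_hz: tracker output per frame, in Hz
    reference_hz: reference (control) pitch per frame, 0 for unvoiced frames
    threshold: relative deviation counted as a gross error (0.20 means 20%)
    """
    # Keep only frames that the reference marks as voiced.
    voiced = [(est, ref) for est, ref in zip(estimated_hz, reference_hz)
              if ref > 0]
    if not voiced:
        return 0.0
    errors = sum(1 for est, ref in voiced if abs(est - ref) > threshold * ref)
    return 100.0 * errors / len(voiced)


reference = [100.0, 0.0, 200.0, 150.0, 120.0]
estimate = [105.0, 90.0, 100.0, 150.0, 0.0]
error_20 = gross_error_percent(estimate, reference)
```

In this example the second frame is ignored because the reference is unvoiced there. Of the four voiced frames, two deviate by more than 20% (100 Hz against 200 Hz, and 0 Hz against 120 Hz), so the gross error is 50%. An estimate of 0 Hz in a voiced frame is counted as an error here, which is a design choice of this sketch.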


Experiment 1 Results

Individual performance of the proposed algorithm

Method           Control  Studio, Clean (%)  Studio, 5dB Noise (%)  Telephone, Clean (%)  Telephone, 5dB Noise (%)
YAAPT            C1       4.26               7.62                   8.14                  17.85
YAAPT*           C1       1.59               1.99                   2.69                  4.48
Spectral method  C1       4.23               4.45                   6.52                  6.95
NCCF             C1       3.58               4.52                   8.00                  16.61

YAAPT*: Using control C1 for the spectral pitch track
NCCF: Normalized cross correlation function, used as the temporal method in YAAPT


Experiment 2 Results

The results of the new method with various error thresholds

Error Threshold  Control  Studio, Clean (%)  Studio, 5dB Noise (%)  Telephone, Clean (%)  Telephone, 5dB Noise (%)
10%              C1       5.46               7.31                   9.39                  16.14
10%              C2       4.18               6.06                   7.77                  14.78
20%              C1       2.90               3.65                   4.86                  7.45
20%              C2       1.56               2.16                   3.27                  5.85
40%              C1       2.25               2.44                   2.75                  3.63
40%              C2       0.91               1.06                   0.99                  2.05


Comparisons

Method           Control  Studio, Clean (%)  Studio, 5dB Noise (%)  Telephone, Clean (%)  Telephone, 5dB Noise (%)
Proposed Method  C1       2.90               3.65                   4.86 (4.52*)          7.45 (5.90*)
DASH             C1       2.81               2.32                   3.73*                 4.15*
REPS             C1       2.68               2.98                   6.91*                 8.49*
YIN              C1       2.57               7.22                   7.55*                 14.6*

DASH, REPS, YIN: the results are reported in “Robust and accurate fundamental frequency estimation ... ,” Nakatani et al.
*: SRAEN filter simulated telephone speech


Conclusion

A new pitch-tracking algorithm has been developed which combines multiple information sources to enable accurate, robust F0 tracking

An analysis of errors indicates better performance, for both high-quality and telephone speech, than previously reported for pitch tracking

Acknowledgements

This work was partially supported by JWFC 900

