AUDL4007: 26 Feb 2015. A. Faulkner. See Plack CJ “The ... · Theories of pitch perception have...


1

Perception of pitch

AUDL4007: 26 Feb 2015. A. Faulkner.

See Plack CJ, “The Sense of Hearing”, Lawrence Erlbaum, 2005, Chapter 7

or Moore BCJ, “Introduction to the Psychology of Hearing”, Chapter 5.

2

Definitions

Perception: Pitch is the perceptual property of sound that conveys melody

Acoustics: Pitch is closely related to frequency and periodicity

Pitch is a perceptual property of periodic and approximately periodic sounds – these have spectra that contain harmonics of a common fundamental frequency.

Pitch should be distinguished from “timbre”, which is a perceptual quality relating to the sharpness or dullness of a sound. Timbre is mainly related to spectral shape.

The pitch of a sound is defined, for the purposes of measurement, as being equivalent to the frequency of a simple sine wave that has the same pitch as the sound. Hence pitch is expressed in Hz.

3

Singing voice

4

Why is pitch important?

• In speech

– Pitch variations signal differences between child, adult male and adult female speakers.

– Pitch variation conveys intonation, which indicates lexical stress and aspects of syntax.

• e.g. “It’s raining?” – a “checking” question usually shows a final pitch rise

• “No, I mean the BLUE shirt!” – emphasis on BLUE would lead to a pitch rise

– In tone languages, pitch movement is lexically contrastive

5

[Figure: pitch contours for four tone-language words distinguished by pitch movement: hemp, horse, scold, mother]

6

Importance of pitch: 2

• Music

• Separating sources of sound

– Pitch is rather like a carrier frequency that we can tune in to

• Much studied in examining roles of spectral and temporal coding and processing in hearing

7

Auditory coding of frequency and pitch

Information in spectral/place and time domains

Theories of pitch perception have been largely concerned with contrasting the contributions of spectral and temporal cues to the perception of pitch.

– Place representation - pitch is related to place of basilar membrane vibration

– Temporal representation - neural firing pattern preserves periodicity of the signal

8

Place and time coding of sine-wave frequency

Place of maximum response varies with frequency

Pitch discrimination for sine waves

Practiced listeners can hear differences of less than 1 Hz for a 200 Hz sinusoid (precision better than 0.5%)

At 1000 Hz, differences of 2 Hz can be detected (precision of about 0.2%)

NB Scales here chosen to fit data to a straight line: square root of frequency, and threshold frequency difference on a log scale

200 Hz – differences of 1, 2, 4, 8, 16 Hz

1 kHz – differences of 2, 4, 8, 16, 32, 64 Hz
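The percentages quoted on this slide are the just-detectable frequency difference divided by the tone frequency. A minimal sketch of that arithmetic, using only the two thresholds given above (the function name is illustrative, not from the lecture):

```python
def relative_dlf_percent(delta_f_hz, f_hz):
    """Just-detectable frequency difference as a percentage of the tone frequency."""
    return 100.0 * delta_f_hz / f_hz

# Threshold values quoted on the slide: (tone frequency in Hz, detectable difference in Hz)
thresholds = [(200.0, 1.0), (1000.0, 2.0)]

for f_hz, delta_f_hz in thresholds:
    print(f"{f_hz:.0f} Hz: {relative_dlf_percent(delta_f_hz, f_hz):.1f}%")
```

Since listeners can hear differences of less than 1 Hz at 200 Hz, the 0.5% figure is an upper bound there.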

10

Can we account for pure tone discrimination on the basis of place cues?

Deriving excitation patterns for a 1 kHz sinusoid from filter frequency responses

Note shallower slope to lower frequencies (left) for frequency responses

[Figure: filter responses with centre frequencies running from 1400 to 600 Hz, plotted against acoustic frequency (Hz), 300 to 1900 Hz]

[Figure: excitation level for a 1 kHz tone by centre frequency of filter; x-axis: filter centre frequency (Hz) = place on BM]

Shift of excitation pattern with change of frequency

[Figure: response to one frequency in a series of filters, against acoustic frequency (Hz), and the overall pattern of excitation over filter centre frequency (Hz) = place on BM]

14

Excitation pattern coding of frequency difference

Intensity discrimination thresholds are about 1 dB.

At 1000 Hz, for excitation levels to differ by 1 dB requires a frequency difference of about 10 Hz – yet we can hear a frequency difference of 2 Hz.
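The argument here is a comparison of two numbers. A rough sketch of it, assuming (as a simplification not stated on the slide) that excitation level changes linearly with frequency over this small range, so that the quoted 1 dB per 10 Hz can be scaled down to a 2 Hz shift:

```python
# Figures quoted on the slide for a 1000 Hz tone
INTENSITY_JND_DB = 1.0        # smallest detectable change in level
SHIFT_FOR_1_DB_HZ = 10.0      # frequency shift needed to change excitation by 1 dB
DETECTABLE_SHIFT_HZ = 2.0     # frequency difference listeners can actually hear

# Assumption: level change scales linearly with frequency shift over this range
db_per_hz = INTENSITY_JND_DB / SHIFT_FOR_1_DB_HZ
level_change_db = db_per_hz * DETECTABLE_SHIFT_HZ

print(f"Excitation change for a {DETECTABLE_SHIFT_HZ:.0f} Hz shift: {level_change_db:.1f} dB")
print("Detectable from place cues alone:", level_change_db >= INTENSITY_JND_DB)
```

Under that assumption the 2 Hz shift changes excitation by well under the 1 dB intensity threshold, which is the mismatch the slide points to.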

15

What other cues are there?

A just detectable pitch change at 3 kHz and below leads to a change in excitation level that is too small to be detected.

Therefore - acuity for pitch differences for low frequency sinusoids cannot be explained by place cues.

16

Neural temporal coding

Interval histogram from recordings of auditory nerve responses to 1100 Hz sinewave. The common intervals are at 1/1100 seconds, 2/1100 seconds, etc.
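Why the intervals cluster at multiples of the period can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not a model of a real auditory nerve fibre: the fibre is assumed to fire at most once per cycle, near a fixed phase, and to skip cycles at random. The firing probability and jitter values are invented for illustration.

```python
import random
from collections import Counter

FREQ_HZ = 1100.0             # tone frequency from the slide
PERIOD_S = 1.0 / FREQ_HZ
FIRE_PROBABILITY = 0.3       # assumed chance of a spike on any given cycle
JITTER_S = 0.00005           # assumed timing jitter (standard deviation, seconds)

rng = random.Random(1)       # fixed seed so the sketch is repeatable

# Toy phase-locked fibre: at most one spike per cycle, near a fixed phase
spike_times = [
    cycle * PERIOD_S + rng.gauss(0.0, JITTER_S)
    for cycle in range(20000)
    if rng.random() < FIRE_PROBABILITY
]

# Inter-spike intervals, expressed in units of the stimulus period
intervals = [(b - a) / PERIOD_S for a, b in zip(spike_times, spike_times[1:])]

# Histogram of intervals rounded to the nearest whole number of periods
histogram = Counter(round(i) for i in intervals)
for n_periods in sorted(histogram)[:5]:
    print(f"{n_periods}/1100 s: {histogram[n_periods]} intervals")

# Average distance of an interval from a whole number of periods
mean_offset = sum(abs(i - round(i)) for i in intervals) / len(intervals)
print(f"Mean offset from a whole number of periods: {mean_offset:.3f} periods")
```

Because cycles are skipped at random, the shortest interval (one period) is the most common and longer multiples become progressively rarer.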

17

Synchrony of nerve firing times to sine-wave period: very precise up to about 1.5 kHz – then declines and is lost at 5 kHz and above – so timing cues to pitch decline in accuracy above 1.5 kHz

18

19

What about effects of duration?

• If pitch discrimination is based on time intervals between nerve firings then as more intervals occur, discrimination is likely to be more accurate in a way that depends on the statistics of timing of nerve firing.

20

Effects of duration on spectrum

• But duration also affects spectrum, and hence place coding - width of excitation pattern grows with inverse of duration

21

Effects of duration on sine wave spectrum – spectrum spreads at shorter durations which limits place coding of pitch

Sequence from 2 to 128 cycles
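The inverse relation between duration and spectral spread can be put into numbers. A small sketch, assuming the tone is switched on and off abruptly (rectangular gating, which the slide does not specify): a burst of duration T then has a main spectral lobe 2/T wide between its first nulls, and a burst of N cycles at frequency f lasts T = N/f. The 1 kHz tone frequency is chosen only for illustration.

```python
def main_lobe_width_hz(freq_hz, n_cycles):
    """Null-to-null main-lobe width of a rectangular-gated tone burst."""
    duration_s = n_cycles / freq_hz
    return 2.0 / duration_s

FREQ_HZ = 1000.0   # assumed tone frequency, chosen only for illustration

for n_cycles in (2, 4, 8, 16, 32, 64, 128):
    width = main_lobe_width_hz(FREQ_HZ, n_cycles)
    duration_ms = 1000 * n_cycles / FREQ_HZ
    print(f"{n_cycles:3d} cycles ({duration_ms:5.1f} ms): main lobe about {width:6.1f} Hz wide")
```

Each doubling of duration halves the spread, so the shortest bursts in the sequence are the ones where place coding is most limited.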

Effects of signal duration: place vs. temporal coding of sine wave frequency

Data from Moore (1972, 1973)

Above 4 kHz there are only place cues – duration has a relatively small effect which can be explained by the spectral spread arising for shorter tones.

Below ~ 4 kHz, pitch discrimination for longer signals is too fine to be explained by place (shifts in excitation pattern)

Effects of signal duration (different curves) are larger at low frequencies. They cannot be explained by spread of excitation pattern but can be explained by statistics of temporal coding, which depend on the number of inter-spike intervals.

23

Relative discriminability of pitch

Typically pitch discrimination is expressed relative to frequency. Expressed this way the relative Difference Limen for Frequency (DLF) is smallest at 2 kHz.

24

Coding pure tone frequency

• Only by place of excitation above 4 kHz

• Place information not good enough at lower frequencies – coding is dominated by temporal coding below ~ 1.5 kHz

• Between 1.5 and 4 kHz both types of cue are available.

25

Pitch of complex sounds

• A complex harmonic sound such as a pulse train has a pitch that is equivalent to that of a sinusoid at the fundamental frequency (F0) of the pulse signal.

• This information is present in the acoustic signal both in the spectrum, as the frequency of the component at F0, and in the time domain, as the period of the pulse train.

26

27

Ohm’s other law:

“Every motion of the air, then, which corresponds to a composite mass of musical tones, is, according to Ohm’s Law, capable of being analysed into a sum of simple vibrations, and to each such simple vibration corresponds a simple tone, sensible to the ear, and having a pitch determined by the periodic time of the corresponding motion of the air.”

(Helmholtz, 1885; “On the Sensations of Tone”)

28

Auditory filter bandwidth increases with frequency (while harmonics are evenly spaced). For F0 of 200 Hz, bandwidth exceeds harmonic spacing above about 1.6 kHz

29

Cochlear frequency selectivity and resolution of harmonics

[Figure: a missing-F0 harmonic complex tone passed through the cochlear filter bank; the AN excitation pattern is plotted against cochlear place (CF) and harmonic number (F/F0), with resolved and unresolved harmonics marked]

30

Excitation patterns: complex sounds

Lower harmonics are clearly resolved – For 200 Hz F0, above 1.6 kHz filter bandwidth is wider than 200 Hz spacing between harmonics and these higher harmonics are not resolved.

Similar limits apply at other F0s
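The 1.6 kHz figure can be checked against a standard bandwidth formula. The sketch below assumes the equivalent rectangular bandwidth (ERB) formula of Glasberg and Moore (1990), which is not given on the slides, and applies the slide's own criterion that a harmonic stops being resolved once the filter bandwidth reaches the harmonic spacing. Real resolvability degrades gradually, so the harmonic numbers are indicative only.

```python
def erb_hz(centre_freq_hz):
    """Equivalent rectangular bandwidth (Glasberg & Moore 1990 formula, assumed here)."""
    return 24.7 * (4.37 * centre_freq_hz / 1000.0 + 1.0)

def first_unresolved_harmonic(f0_hz, max_harmonic=50):
    """Lowest harmonic number whose filter bandwidth is at least the harmonic spacing."""
    for n in range(1, max_harmonic + 1):
        if erb_hz(n * f0_hz) >= f0_hz:
            return n
    return None

print(f"ERB at 1600 Hz: {erb_hz(1600.0):.1f} Hz")

for f0 in (100.0, 200.0, 400.0):
    n = first_unresolved_harmonic(f0)
    print(f"F0 = {f0:.0f} Hz: bandwidth reaches the spacing at harmonic {n} ({n * f0:.0f} Hz)")
```

With this formula the bandwidth at 1600 Hz is just under 200 Hz, which agrees with the limit quoted for a 200 Hz F0.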

31

Classical Place account of pitch

• Pitch of a complex sound determined by position of peak in excitation pattern due to basilar membrane response to fundamental frequency (F0) component

32

The missing fundamental

• Schouten (1938, 1940) made a crucial test of the place theory that is based on Ohm’s Law

• He presented a pulse signal, with a complete harmonic series. A place account would claim that the pitch is due to the lowest frequency component, at the fundamental frequency.

• This signal is compared to a signal modified to remove the fundamental frequency component. According to place theory, the pitch should change.

33

For most listeners, pitch is unaffected by deletion of harmonic at fundamental frequency

Schouten called this “residue pitch” – attributing the low pitch percept to the periodicity shown in the auditory nerve response to the unresolved higher harmonics

Audio demonstration from “Audio Demonstrations on Compact Disc” (ASA, 1989).

The first sound is a 200 Hz harmonic complex tone comprising the 1st 10 harmonics. Succeeding sounds have the 1st, 1st and 2nd, 1st thru 3rd, and then 1st thru 4th harmonics deleted.
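The stimuli in this demonstration can be sketched numerically to show that removing the low harmonics takes away the spectral component at F0 but leaves the waveform period unchanged. The sample rate, equal amplitudes, and cosine phases are assumptions, and the waveform autocorrelation is used only as a simple stand-in for periodicity, not as a model of auditory processing.

```python
import math

FS = 8000          # sample rate in Hz (assumed)
F0 = 200           # fundamental of the complex, as in the demonstration
N = 800            # 0.1 s of signal = exactly 20 periods of F0

def harmonic_complex(harmonics):
    """Sum of equal-amplitude cosines at the given harmonic numbers of F0."""
    return [sum(math.cos(2 * math.pi * h * F0 * n / FS) for h in harmonics)
            for n in range(N)]

def amplitude_at(signal, freq_hz):
    """Amplitude of the component at freq_hz (single-bin discrete Fourier transform)."""
    re = sum(x * math.cos(2 * math.pi * freq_hz * n / FS) for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * n / FS) for n, x in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

def period_from_autocorrelation(signal):
    """Shortest lag (in samples) at which the signal nearly repeats itself."""
    n = len(signal)
    energy = sum(x * x for x in signal)
    for lag in range(1, n // 2):
        # Circular autocorrelation; valid because N holds a whole number of periods
        r = sum(signal[i] * signal[(i + lag) % n] for i in range(n))
        if r > 0.99 * energy:
            return lag
    return None

for removed in range(0, 5):
    tone = harmonic_complex(range(removed + 1, 11))
    lag = period_from_autocorrelation(tone)
    print(f"harmonics {removed + 1}-10: amplitude at F0 = {amplitude_at(tone, F0):.2f}, "
          f"waveform period = {1000 * lag / FS:.1f} ms")
```

In every case the waveform still repeats every 5 ms, the period of the 200 Hz fundamental, even when there is no energy at 200 Hz.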

34

Auditory frequency analysis of a pulse train

Higher harmonics are closely spaced relative to filter bandwidths and are not resolved. The filter output shows the fundamental periodicity of the pulse train

Lower harmonics are completely resolved (1st 5 to 8 harmonics depending on F0)

35

Role of auditory non-linearity?

• Additional frequency components are introduced when a signal is passed through a non-linear system – for harmonic complex tones this could include a distortion component at F0.

• Can a component introduced at the fundamental frequency explain “The case of the missing fundamental”?
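How a non-linearity can create a component at F0 can be shown with a toy example. The sketch below passes two adjacent harmonics of a 200 Hz fundamental through a simple quadratic distortion; the quadratic form, its strength, and the sample rate are assumptions chosen for illustration, not a description of cochlear mechanics.

```python
import math

FS = 16000                   # sample rate in Hz (assumed)
N = 1600                     # 0.1 s, a whole number of cycles of every component
F_LOW, F_HIGH = 1800, 2000   # two adjacent harmonics of a 200 Hz fundamental
DISTORTION = 0.1             # assumed strength of the quadratic term

def amplitude_at(signal, freq_hz):
    """Amplitude of the component at freq_hz (single-bin discrete Fourier transform)."""
    re = sum(x * math.cos(2 * math.pi * freq_hz * n / FS) for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * n / FS) for n, x in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

# Input: two equal-amplitude harmonics, with no energy at the fundamental
linear = [math.cos(2 * math.pi * F_LOW * n / FS) + math.cos(2 * math.pi * F_HIGH * n / FS)
          for n in range(N)]

# Toy non-linearity: output = input + DISTORTION * input squared
distorted = [x + DISTORTION * x * x for x in linear]

f0 = F_HIGH - F_LOW          # difference tone frequency, equal to F0
print(f"Amplitude at {f0} Hz before distortion: {amplitude_at(linear, f0):.3f}")
print(f"Amplitude at {f0} Hz after distortion:  {amplitude_at(distorted, f0):.3f}")
```

The squared term contains the product of the two harmonics, which includes a difference tone at 2000 - 1800 = 200 Hz, so the distorted signal has energy at F0 even though the input has none.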

36

Is distortion product responsible for low pitch?

• Patterson (1976) Low frequency noise will mask a distortion component at F0 – (e.g. a difference tone arising from two adjacent harmonics)

– but LF noise does not mask the low pitch at F0

– therefore the low pitch is not due to distortion

Audio demo – A simple melody is heard played by a series of sine waves and complex tones comprising 3 higher harmonics with the same F0 as the sine wave. Both the sine and complex tones sound the same melody. Then a low pass noise is added – this masks the sine wave and would mask any auditory distortion product at F0. The low pitch is still heard from the complex tones.

37

Contributions of resolved and unresolved harmonics

The pitch of the residue suggests that higher unresolved harmonics are important in determining the pitch of complex tones. Both Ritsma and Plomp in 1967 published studies that challenged this.

Plomp used stimuli in which the higher and lower harmonics were shifted in frequency in opposite directions. E.g., Harmonics 1 to 4 were shifted down by 10% and harmonics 5 upwards were shifted up by 10%.
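The component frequencies of such a stimulus are easy to list. A minimal sketch, assuming a 200 Hz fundamental and eight harmonics (both chosen only for illustration, not taken from Plomp's study):

```python
F0_HZ = 200.0      # assumed fundamental, chosen only for illustration
SHIFT = 0.10       # 10% shift, as in the example above

def shifted_harmonics(f0_hz, n_harmonics=8):
    """Harmonics 1-4 shifted down by SHIFT, harmonics 5 and above shifted up."""
    freqs = []
    for k in range(1, n_harmonics + 1):
        factor = 1.0 - SHIFT if k <= 4 else 1.0 + SHIFT
        freqs.append(round(k * f0_hz * factor, 1))
    return freqs

freqs = shifted_harmonics(F0_HZ)
print("Low group  (consistent with an F0 of", round(F0_HZ * (1 - SHIFT), 1), "Hz):", freqs[:4])
print("High group (consistent with an F0 of", round(F0_HZ * (1 + SHIFT), 1), "Hz):", freqs[4:])
```

The two groups point to different fundamentals, so the direction in which the heard pitch moves indicates which group of harmonics dominates.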

38

Contributions of resolved and unresolved harmonics

Generally, and especially in the speech F0 range, it is harmonics 4 to 8 that dominate pitch

At very high F0 – above 1.5 kHz, the fundamental frequency component is dominant.

Contributions of unresolved high harmonics never dominate over contributions of resolved harmonics.

39

Resolved harmonics produce higher precision of pitch than unresolved harmonics

[Figure (Bernstein & Oxenham, 2003): pitch discrimination, from better to worse, for a harmonic complex tone of 12 successive harmonics, across resolved and unresolved harmonics]

ALSO: Pitch discrimination for complex tones generally better than for the sine wave at F0 (Henning and Grosberg, 1968)

40

Where are cues to pitch?

The filter output shows the fundamental periodicity – weak cue to pitch

Lower harmonics are completely resolved – their frequencies coded in time (at different places) are primary cues to pitch – DOMINANT AND MOST PRECISE

Harmonic at fundamental frequency not a necessary cue for pitch

Pitch cues across frequency in speech

[Figure: frequency axis marked at 1000, 2000, 3000 and 4000 Hz]

42

Primary cues for pitch of complex sounds

• Pitch is most effectively determined by temporally-encoded representations of the frequencies of resolved harmonics (temporal code needed to explain the precision of pitch discrimination)

• The temporal encoding of F0 from the unresolved higher harmonics is not a primary cue

• Nor is the harmonic component at F0 except when F0 > 1.5 kHz.

43

Pitch without spectral information

• White noise that is amplitude modulated at rates up to 1000 Hz has a weak pitch (Burns and Viemeister, 1976). The spectrum of the noise is flat, and only temporal cues to pitch are present

• E.g., below shows white noise (lower trace) amplitude modulated by half-wave rectified sine wave
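A stimulus of this kind can be sketched in a few lines. The sample rate, the 100 Hz modulation rate, and the use of squared amplitude as a crude envelope are all assumptions made for illustration. The waveform itself does not repeat from one modulation period to the next, while its envelope does.

```python
import math
import random

FS = 8000                     # sample rate in Hz (assumed)
MOD_RATE_HZ = 100             # modulation rate (assumed for illustration)
N = 8000                      # one second of signal
PERIOD = FS // MOD_RATE_HZ    # modulation period in samples

rng = random.Random(0)        # fixed seed so the sketch is repeatable

# White Gaussian noise multiplied by a half-wave rectified sine modulator
modulator = [max(0.0, math.sin(2 * math.pi * MOD_RATE_HZ * n / FS)) for n in range(N)]
am_noise = [rng.gauss(0.0, 1.0) * m for m in modulator]

def correlation_at(values, lag):
    """Mean product of the sequence with a copy of itself delayed by `lag` samples."""
    return sum(a * b for a, b in zip(values, values[lag:])) / (len(values) - lag)

power = [x * x for x in am_noise]   # instantaneous power, a crude envelope

# Fine structure: normalised correlation of the waveform one modulation period apart
waveform_repeat = correlation_at(am_noise, PERIOD) / correlation_at(am_noise, 0)

# Envelope: correlation of the power at one period and at half a period
envelope_at_period = correlation_at(power, PERIOD)
envelope_at_half_period = correlation_at(power, PERIOD // 2)

print(f"Waveform correlation at one period:  {waveform_repeat:.3f}")
print(f"Envelope correlation at one period:  {envelope_at_period:.3f}")
print(f"Envelope correlation at half period: {envelope_at_half_period:.3f}")
```

The only regularity is in the timing of the envelope, which is the sense in which the cue is purely temporal.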

Purely temporal pitches, although weak, can convey melody information for rates up to 300 or 500 Hz - but very weak above 200 Hz.

Monaural temporal pitch is perceived from the temporal nerve firing pattern, which will be affected by amplitude modulation.

Also DICHOTIC temporal pitches – where a pitch is heard that changes with inter-aural phase.

[Audio/figure labels: unmodulated noise; noise amplitude modulated by sine wave gliding from 40 up to 100 Hz (left) and down from 100 to 40 Hz (right); 100 Hz AM noise]

44

Current theories of pitch perception

• Pitch perception is based on the pattern of information over a range of frequencies. The major contributing information is the frequencies of the dominant resolved harmonics.

• This information is conveyed in the temporal firing pattern of the auditory nerve across frequency channels.

• Pattern processing identifies intervals between nerve firings that are common across frequency channels. For a series of resolved harmonics, nerve firings show a related series of time intervals.

• Periodicity information from higher frequency unresolved harmonics or from the modulation envelope of noise is another source of input to this pattern processing, but is a relatively weak cue.
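The idea of intervals that are common across channels can be illustrated with idealised numbers. The sketch assumes three channels phase-locked to harmonics 4, 5 and 6 of a 200 Hz F0 (an invented example), each producing intervals at whole multiples of its own period with no jitter. It is an arithmetic illustration of the principle, not a model of the pattern processor.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def common_interval_s(harmonic_freqs_hz):
    """Shortest interval that is a whole number of periods of every harmonic."""
    return Fraction(1, reduce(gcd, harmonic_freqs_hz))

def interval_multiples(freq_hz, limit_s):
    """Intervals (1, 2, 3, ... periods) expected in a channel phase-locked to freq_hz."""
    period = Fraction(1, freq_hz)
    return {k * period for k in range(1, int(limit_s / period) + 1)}

harmonics_hz = [800, 1000, 1200]      # assumed: harmonics 4, 5 and 6 of a 200 Hz F0
limit = Fraction(1, 100)              # look at intervals up to 10 ms

shared = set.intersection(*(interval_multiples(f, limit) for f in harmonics_hz))
print("Intervals common to all channels (ms):", sorted(float(1000 * t) for t in shared))
print("Implied pitch (Hz):", 1 / common_interval_s(harmonics_hz))
```

The shortest interval shared by all three channels is 5 ms, the period of the 200 Hz fundamental, even though no channel responds to 200 Hz itself.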

Figure from Moore and Glasberg (1986)

45

Auditory nerve responses to pulse train: nerves responding to resolved harmonics show periodicity of each harmonic

46

Summary: Simple signals

• While pitch is broadly correlated with period, human pitch processing is complex

• Sine waves up to a few kHz - pitch is temporally coded – place coding is too poor to account for performance

• Sine waves above 4 kHz, only place cues are present to code sine wave frequency

47

Summary: Complex signals

• The period indicated by temporal cues alone from unresolved high harmonics in a single auditory filter can signal pitch at F0.

– And a weak pitch can be heard from purely temporal cues with amplitude modulated noise

• However, pitch of complex tones is dominated by resolved harmonics (range 4 to 8 for F0 in speech range). Here pitch processing depends on pattern extraction operating on time intervals between nerve firings

48

How might impaired hearing affect pitch perception?

• Wider auditory filters due to OHC damage

– Fewer harmonics resolved

• Impaired temporal coding

– Would limit phase-locking and hence temporal coding of frequency

Today’s lab

Measuring fundamental frequency from whistles and speech.

Aimed only at Audiology students

49