Perception of pitch
AUDL4007: 26 Feb 2015. A. Faulkner.
See Plack CJ, “The Sense of Hearing”, Lawrence Erlbaum, 2005, Chapter 7,
or Moore BCJ, “Introduction to the Psychology of Hearing”, Chapter 5.
Definitions
Perception: Pitch is the perceptual property of sound that conveys melody
Acoustics: Pitch is closely related to frequency and periodicity
Pitch is a perceptual property of periodic and approximately periodic sounds – these have spectra that contain harmonics of a common fundamental frequency.
Pitch should be distinguished from “timbre”, which is a perceptual quality relating to the sharpness or dullness of a sound. Timbre is mainly related to spectral shape.
The pitch of a sound is defined, for the purposes of measurement, as the frequency of a simple sine wave that has the same pitch as the sound. Hence pitch is expressed in Hz.
Singing voice
Why is pitch important?
• In speech
– Pitch variations signal differences between child, adult male and adult female speakers.
– Pitch variation conveys intonation, which indicates lexical stress and aspects of syntax.
• e.g. “It’s raining?” – a “checking” question usually shows a final pitch rise
• “No, I mean the BLUE shirt!” – emphasis on BLUE would lead to a pitch rise
– In tone languages, pitch movement is lexically contrastive
[Figure: four words of a tone language distinguished by pitch movement, glossed “hemp”, “horse”, “scold” and “mother”]
Importance of pitch: 2
• Music
• Separating sources of sound
– Pitch is rather like a carrier frequency that we can tune in to
• Much studied in examining roles of spectral and temporal coding and processing in hearing
Auditory coding of frequency and pitch
Information in spectral/place and time domains
Theories of pitch perception have been largely concerned with contrasting the contributions of spectral and temporal cues to the perception of pitch.
– Place representation - pitch is related to place of basilar membrane vibration
– Temporal representation - neural firing pattern preserves periodicity of the signal
Place and time coding of sine-wave frequency
Place of maximum response varies with frequency
Pitch discrimination for sine waves
Practiced listeners can hear differences of less than 1 Hz for a 200 Hz sinusoid (precision better than 0.5%)
At 1000 Hz, differences of 2 Hz can be detected (precision of about 0.2%)
NB Scales here are chosen to fit the data to a straight line: square root of frequency, and threshold frequency difference on a log scale
200 Hz – differences of 1, 2, 4, 8, 16 Hz
1 kHz – differences of 2, 4, 8, 16, 32, 64 Hz.
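The percentages quoted above are just the detectable difference divided by the frequency. A minimal sketch of that arithmetic, using the approximate thresholds from this slide (the function name is made up for illustration):

```python
# Relative difference limen = just-detectable change / frequency.
# The two data points are the approximate values quoted on the slide.
thresholds_hz = {200.0: 1.0, 1000.0: 2.0}  # frequency (Hz) -> detectable difference (Hz)

def relative_dlf_percent(frequency_hz, difference_hz):
    """Return the detectable difference as a percentage of the frequency."""
    return 100.0 * difference_hz / frequency_hz

for frequency, difference in thresholds_hz.items():
    percent = relative_dlf_percent(frequency, difference)
    print(f"{frequency:.0f} Hz: {difference:.0f} Hz difference = {percent:.1f}%")
```

Expressing thresholds this way makes it easy to compare discrimination across frequencies.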
Can we account for pure tone discrimination on the basis of place cues?
Deriving excitation patterns for a 1 kHz sinusoid from filter frequency responses
Note the shallower slope towards lower frequencies (left) for the frequency responses
[Figure: filter responses with centre frequencies running from 1400 down to 600 Hz, plotted against acoustic frequency (Hz) from 300 to 1900 Hz]
[Figure: excitation level for a 1 kHz tone by centre frequency of filter, plotted against filter centre frequency (Hz) = place on BM]
Shift of excitation pattern with change of frequency
[Figure: response to one frequency in a series of filters, plotted against acoustic frequency (Hz)]
[Figure: overall pattern of excitation over filter centre frequency, plotted against filter centre frequency (Hz) = place on BM]
Excitation pattern coding of frequency difference
Intensity discrimination thresholds are about 1 dB.
At 1000 Hz, for excitation levels to differ by 1 dB requires a frequency difference of about 10 Hz – yet we can hear a frequency difference of 2 Hz.
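A rough numerical check of this argument, as a sketch only. It assumes a symmetric rounded-exponential, roex(p), filter shape and the Glasberg and Moore (1990) bandwidth formula. Neither appears on these slides, and the symmetric shape ignores the shallower low-frequency slope noted earlier. All function names are illustrative.

```python
import math

def erb_hz(fc_hz):
    """Equivalent rectangular bandwidth (Glasberg & Moore, 1990 formula; assumed here)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def filter_gain_db(tone_hz, fc_hz):
    """Gain (dB) of a symmetric roex(p) filter centred at fc_hz for a tone at tone_hz."""
    p = 4.0 * fc_hz / erb_hz(fc_hz)
    g = abs(tone_hz - fc_hz) / fc_hz          # normalised distance from the centre frequency
    weight = (1.0 + p * g) * math.exp(-p * g)
    return 10.0 * math.log10(weight)

def max_level_change_db(f1_hz, f2_hz, centre_freqs_hz):
    """Largest change in excitation level at any filter when the tone moves from f1 to f2."""
    return max(abs(filter_gain_db(f2_hz, fc) - filter_gain_db(f1_hz, fc))
               for fc in centre_freqs_hz)

# Filter centre frequencies from 600 to 1400 Hz in 10 Hz steps (illustrative range).
centre_freqs = [600.0 + 10.0 * i for i in range(81)]
for shift_hz in (2.0, 10.0):
    change = max_level_change_db(1000.0, 1000.0 + shift_hz, centre_freqs)
    print(f"{shift_hz:.0f} Hz shift: largest level change {change:.2f} dB")
```

Under these assumptions a 2 Hz shift changes the level at every filter by well under 1 dB, while a 10 Hz shift exceeds 1 dB at some filters, which is the gap the slide points to.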
What other cues are there?
A just detectable pitch change at 3 kHz and below leads to a change in excitation level that is too small to be detected.
Therefore - acuity for pitch differences for low frequency sinusoids cannot be explained by place cues.
Neural temporal coding
Interval histogram from recordings of auditory nerve responses to 1100 Hz sinewave. The common intervals are at 1/1100 seconds, 2/1100 seconds, etc.
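Why the intervals cluster at multiples of the period can be illustrated with a toy simulation. This is a sketch, not a model of real auditory nerve fibres: the firing probability, the jitter and all names are illustrative assumptions. The neuron fires at most once per cycle, always near the same phase, and skips cycles at random.

```python
import random

def simulate_spike_times(freq_hz, n_cycles, fire_prob, jitter_fraction, seed=0):
    """Toy phase-locked neuron: at most one spike per stimulus cycle.

    fire_prob is the chance of firing on any cycle; jitter_fraction bounds
    the timing jitter as a fraction of the period (both are assumptions).
    """
    rng = random.Random(seed)
    period = 1.0 / freq_hz
    spikes = []
    for cycle in range(n_cycles):
        if rng.random() < fire_prob:
            jitter = rng.uniform(-jitter_fraction, jitter_fraction) * period
            spikes.append(cycle * period + jitter)
    return spikes

def interval_histogram(spike_times, period):
    """Count inter-spike intervals by the nearest whole number of periods."""
    counts = {}
    for earlier, later in zip(spike_times, spike_times[1:]):
        n_periods = round((later - earlier) / period)
        counts[n_periods] = counts.get(n_periods, 0) + 1
    return counts

freq = 1100.0
spikes = simulate_spike_times(freq, n_cycles=5000, fire_prob=0.3, jitter_fraction=0.05)
histogram = interval_histogram(spikes, 1.0 / freq)
for n_periods in sorted(histogram)[:5]:
    interval_ms = 1000.0 * n_periods / freq
    print(f"{n_periods}/1100 s ({interval_ms:.2f} ms): {histogram[n_periods]} intervals")
```

Because spikes are tied to one phase of the cycle, every interval is close to a whole number of periods, so the stimulus period can be read from the interval statistics.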
Synchrony of nerve firing times to sine-wave period: very precise up to about 1.5 kHz – then declines and is lost at 5 kHz and above – so timing cues to pitch decline in accuracy above 1.5 kHz
What about effects of duration?
• If pitch discrimination is based on time intervals between nerve firings then, as more intervals occur, discrimination is likely to be more accurate in a way that depends on the statistics of timing of nerve firing.
Effects of duration on spectrum
• But duration also affects spectrum, and hence place coding - width of excitation pattern grows with inverse of duration
Effects of duration on sine wave spectrum – spectrum spreads at shorter durations, which limits place coding of pitch
Sequence from 2 to 128 cycles
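The inverse relation between duration and spectral spread can be put in numbers. The sketch below assumes an abruptly gated (rectangular) tone burst, whose spectrum has a main lobe with nulls at 1/duration either side of the tone frequency. The 1 kHz tone frequency and the function name are illustrative choices, not taken from the slide.

```python
def main_lobe_width_hz(freq_hz, n_cycles):
    """Null-to-null width of the spectral main lobe of a gated sine wave.

    Assumes a rectangular on/off gate, so the nulls fall at
    +/- 1/duration around the tone frequency.
    """
    duration_s = n_cycles / freq_hz
    return 2.0 / duration_s

freq = 1000.0  # illustrative tone frequency (Hz)
for n_cycles in (2, 4, 8, 16, 32, 64, 128):
    duration_ms = 1000.0 * n_cycles / freq
    width = main_lobe_width_hz(freq, n_cycles)
    print(f"{n_cycles:3d} cycles = {duration_ms:6.1f} ms, main lobe about {width:7.1f} Hz wide")
```

Each doubling of the number of cycles halves the spread, so only the shortest tones are smeared by amounts comparable to auditory filter bandwidths.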
Effects of signal duration: place vs. temporal coding of sine wave frequency
Data from Moore (1972, 1973)
Above 4 kHz there are only place cues – duration has a relatively small effect which can be explained by the spectral spread arising for shorter tones.
Below ~ 4 kHz, pitch discrimination for longer signals is too fine to be explained by place (shifts in excitation pattern)
Effects of signal duration (different curves) are larger at low frequencies. They cannot be explained by spread of excitation pattern but can be explained by the statistics of temporal coding, which depend on the number of inter-spike intervals.
Relative discriminability of pitch
Typically pitch discrimination is expressed relative to frequency. Expressed this way, the relative Difference Limen for Frequency (DLF) is smallest at 2 kHz.
Coding pure tone frequency
• Only by place of excitation above 4 kHz
• Place information not good enough at lower frequencies – coding is dominated by temporal coding below ~ 1.5 kHz
• Between 1.5 and 4 kHz both types of cue are available.
Pitch of complex sounds
• A complex harmonic sound such as a pulse train has a pitch that is equivalent to that of a sinusoid at the fundamental frequency (F0) of the pulse signal.
• This information is present in the acoustic signal both in the spectrum, as the frequency of the component at F0, and in the time domain, as the period of the pulse train.
Ohm’s other law:
“Every motion of the air, then, which corresponds to a composite mass of musical tones, is, according to Ohm’s Law, capable of being analysed into a sum of simple vibrations, and to each such simple vibration corresponds a simple tone, sensible to the ear, and having a pitch determined by the periodic time of the corresponding motion of the air.”
(Helmholtz, 1885, “On the Sensations of Tone”)
Auditory filter bandwidth increases with frequency (while harmonics are evenly spaced). For F0 of 200 Hz, bandwidth exceeds harmonic spacing above about 1.6 kHz
Cochlear frequency selectivity and resolution of harmonics
[Figure: a missing-F0 harmonic complex tone passed through the cochlear filter bank (CF by cochlear place), giving an AN excitation pattern with resolved and unresolved harmonics; axis: harmonic number (F/F0)]
Excitation patterns: complex sounds
Lower harmonics are clearly resolved – for 200 Hz F0, above 1.6 kHz filter bandwidth is wider than the 200 Hz spacing between harmonics and these higher harmonics are not resolved.
Similar limits apply at other F0s
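The 1.6 kHz figure can be reproduced approximately by comparing filter bandwidth with harmonic spacing. This sketch assumes the Glasberg and Moore (1990) ERB formula as the measure of auditory filter bandwidth. The slides do not say which bandwidth estimate was used, and the function names are illustrative.

```python
def erb_hz(fc_hz):
    """Auditory filter bandwidth (ERB), Glasberg & Moore (1990) formula; an assumption here."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def crossover_frequency_hz(f0_hz):
    """Frequency at which the ERB equals the harmonic spacing (= F0)."""
    return (f0_hz / 24.7 - 1.0) * 1000.0 / 4.37

def first_unresolved_harmonic(f0_hz, max_harmonic=50):
    """Return (harmonic number, frequency) of the first harmonic whose
    auditory filter is wider than the spacing between harmonics."""
    for n in range(1, max_harmonic + 1):
        if erb_hz(n * f0_hz) > f0_hz:
            return n, n * f0_hz
    return None

for f0 in (100.0, 200.0, 400.0):
    crossover = crossover_frequency_hz(f0)
    harmonic = first_unresolved_harmonic(f0)
    print(f"F0 {f0:.0f} Hz: bandwidth = spacing at {crossover:.0f} Hz, first wider filter at {harmonic}")
```

Bandwidth exceeding spacing is only a rule of thumb for loss of resolution, but with this formula the limit falls at a similar harmonic number across F0s, consistent with the note above.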
Classical Place account of pitch
• Pitch of a complex sound determined by position of peak in excitation pattern due to basilar membrane response to fundamental frequency (F0) component
The missing fundamental
• Schouten (1938, 1940) made a crucial test of the place theory that is based on Ohm’s Law
• He presented a pulse signal, with a complete harmonic series. A place account would claim that the pitch is due to the lowest frequency component, at the fundamental frequency.
• This signal is compared to a signal modified to remove the fundamental frequency component. According to place theory, the pitch should change
For most listeners, pitch is unaffected by deletion of the harmonic at the fundamental frequency
Schouten called this “residue pitch” – attributing the low pitch percept to the periodicity shown in the auditory nerve response to the unresolved higher harmonics
Audio demonstration from “Audio Demonstrations on Compact Disc” (ASA, 1989).
The first sound is a 200 Hz harmonic complex tone comprising the first 10 harmonics. Succeeding sounds have the 1st, 1st and 2nd, 1st to 3rd, and then 1st to 4th harmonics deleted.
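The stimuli in this demonstration can be sketched numerically to show that the waveform still repeats every 1/F0 seconds once the lowest harmonics are removed. Autocorrelation is used here only as a convenient measure of waveform periodicity, not as a claim about how the ear extracts pitch. The sample rate, analysis window and function names are illustrative assumptions.

```python
import math

SAMPLE_RATE = 8000  # Hz; chosen so that one 200 Hz period is exactly 40 samples

def harmonic_complex(f0_hz, harmonics, n_samples):
    """Sum of equal-amplitude cosine harmonics of f0_hz."""
    return [sum(math.cos(2.0 * math.pi * h * f0_hz * n / SAMPLE_RATE) for h in harmonics)
            for n in range(n_samples)]

def repetition_rate_hz(signal, min_lag, max_lag):
    """Repetition rate (Hz) at the lag with the largest autocorrelation."""
    window = len(signal) - max_lag  # same analysis window for every lag

    def autocorr(lag):
        return sum(signal[n] * signal[n + lag] for n in range(window))

    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return SAMPLE_RATE / best_lag

f0 = 200.0
n_samples = 380  # 320-sample window (8 periods) plus the 60-sample maximum lag
for lowest in (1, 2, 3, 4, 5):
    tone = harmonic_complex(f0, range(lowest, 11), n_samples)
    rate = repetition_rate_hz(tone, min_lag=10, max_lag=60)
    print(f"harmonics {lowest} to 10: waveform repeats at {rate:.0f} Hz")
```

The repetition rate stays at F0 for every version of the tone, even though the spectrum no longer contains a component at F0.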
Auditory frequency analysis of a pulse train
Higher harmonics are closely spaced relative to filter bandwidths and are not resolved. The filter output shows the fundamental periodicity of the pulse train
Lower harmonics are completely resolved (first 5 to 8 harmonics, depending on F0)
Role of auditory non-linearity?
• Additional frequency components are introduced when a signal is passed through a non-linear system – for harmonic complex tones this could include a distortion component at F0.
• Can a component introduced at the fundamental frequency explain “The case of the missing fundamental”?
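How a non-linearity could reintroduce energy at F0 can be shown with a small sketch. It assumes a simple quadratic distortion with an arbitrary coefficient, which is not a model of cochlear non-linearity, and the names are illustrative. Harmonics 3, 4 and 5 of 200 Hz are distorted, and the amplitude at 200 Hz is measured before and after.

```python
import math

SAMPLE_RATE = 8000  # Hz
N_SAMPLES = 400     # 50 ms: a whole number of cycles of every component involved

def amplitude_at(signal, freq_hz):
    """Amplitude of the component at freq_hz, from a single-frequency DFT."""
    re = sum(x * math.cos(2.0 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, x in enumerate(signal))
    im = sum(x * math.sin(2.0 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / len(signal)

f0 = 200.0
# Harmonics 3, 4 and 5 only: the input has no energy at F0.
tone = [sum(math.cos(2.0 * math.pi * h * f0 * n / SAMPLE_RATE) for h in (3, 4, 5))
        for n in range(N_SAMPLES)]
# Hypothetical quadratic non-linearity; the 0.1 coefficient is arbitrary.
distorted = [x + 0.1 * x * x for x in tone]

print(f"amplitude at F0 before distortion: {amplitude_at(tone, f0):.3f}")
print(f"amplitude at F0 after distortion:  {amplitude_at(distorted, f0):.3f}")
```

The squared term produces difference tones between adjacent harmonics, and these fall at F0. This is the kind of distortion product that a masking experiment would need to rule out.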
Is a distortion product responsible for low pitch?
• Patterson (1976): Low frequency noise will mask a distortion component at F0 (e.g. a difference tone arising from two adjacent harmonics)
– but LF noise does not mask the low pitch at F0
– therefore the low pitch is not due to distortion
Audio demo – A simple melody is heard played by a series of sine waves and complex tones comprising 3 higher harmonics with the same F0 as the sine wave. Both the sine and complex tones sound the same melody. Then a low pass noise is added – this masks the sine wave and would mask any auditory distortion product at F0. The low pitch is still heard from the complex tones.
Contributions of resolved and unresolved harmonics
The pitch of the residue suggests that higher unresolved harmonics are important in determining the pitch of complex tones. Both Ritsma and Plomp in 1967 published studies that challenged this.
Plomp used stimuli in which the higher and lower harmonics were shifted in frequency in opposite directions. E.g., harmonics 1 to 4 were shifted down by 10% and harmonics 5 upwards were shifted up by 10%.
Contributions of resolved and unresolved harmonics
Generally, and especially in the speech F0 range, it is harmonics 4 to 8 that dominate pitch
At very high F0 (above 1.5 kHz), the fundamental frequency component is dominant.
Contributions of unresolved high harmonics never dominate over contributions of resolved harmonics.
Resolved harmonics produce higher precision of pitch than unresolved harmonics
[Figure (Bernstein & Oxenham, 2003): pitch discrimination, from better to worse, for a harmonic complex tone of 12 successive harmonics, in the resolved and unresolved regions]
ALSO: Pitch discrimination for complex tones is generally better than for the sine wave at F0 (Henning and Grosberg, 1968)
Where are cues to pitch?
The filter output for unresolved higher harmonics shows the fundamental periodicity – weak cue to pitch
Lower harmonics are completely resolved – their frequencies, coded in time (at different places), are primary cues to pitch – DOMINANT AND MOST PRECISE
Harmonic at fundamental frequency not a necessary cue for pitch
Pitch cues across frequency in speech
[Figure: frequency axis marked at 1000, 2000, 3000 and 4000 Hz]
Primary cues for pitch of complex sounds
• Pitch is most effectively determined by temporally-encoded representations of the frequencies of resolved harmonics (temporal code needed to explain the precision of pitch discrimination)
• The temporal encoding of F0 from the unresolved higher harmonics is not a primary cue
• Nor is the harmonic component at F0, except when F0 > 1.5 kHz.
Pitch without spectral information
• White noise that is amplitude modulated at rates up to 1000 Hz has a weak pitch (Burns and Viemeister, 1976). The spectrum of the noise is flat, and only temporal cues to pitch are present
• E.g., the figure below shows white noise (lower trace) amplitude modulated by a half-wave rectified sine wave
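A stimulus of this kind can be sketched in a few lines. The sample rate, modulation rate, duration and function name are illustrative assumptions, not the parameters used by Burns and Viemeister. Gaussian noise is multiplied by a half-wave rectified sine, so the noise is switched off during the negative half of every modulator cycle.

```python
import math
import random

def am_noise(mod_rate_hz, duration_s, sample_rate=16000, seed=0):
    """White (Gaussian) noise multiplied by a half-wave rectified sine wave."""
    rng = random.Random(seed)
    samples = []
    for n in range(round(duration_s * sample_rate)):
        modulator = max(0.0, math.sin(2.0 * math.pi * mod_rate_hz * n / sample_rate))
        samples.append(modulator * rng.gauss(0.0, 1.0))
    return samples

signal = am_noise(mod_rate_hz=100.0, duration_s=0.1)

# Compare energy in the "on" and "off" halves of each 100 Hz modulator cycle.
period = 160  # samples per modulator cycle at 16 kHz
energy_on = sum(x * x for n, x in enumerate(signal) if n % period < period // 2)
energy_off = sum(x * x for n, x in enumerate(signal) if n % period >= period // 2)
print(f"energy in 'on' half-cycles:  {energy_on:.1f}")
print(f"energy in 'off' half-cycles: {energy_off:.1g}")
```

The modulation rate is carried only by the timing of the envelope, which is why such stimuli are used to isolate temporal cues to pitch.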
Purely temporal pitches, although weak, can convey melody information for rates up to 300 or 500 Hz - but very weak above 200 Hz.
Monaural temporal pitch is perceived from the temporal nerve firing pattern, which will be affected by amplitude modulation.
Also DICHOTIC temporal pitches – where a pitch is heard that changes with inter-aural phase.
[Figure: unmodulated noise; noise amplitude modulated by a sine wave gliding from 40 up to 100 Hz (left) and down from 100 to 40 Hz (right); 100 Hz AM noise]
Current theories of pitch perception
• Pitch perception is based on the pattern of information over a range of frequencies. The major contributing information is the frequencies of the dominant resolved harmonics.
• This information is conveyed in the temporal firing pattern of the auditory nerve across frequency channels.
• Pattern processing identifies intervals between nerve firings that are common across frequency channels. For a series of resolved harmonics, nerve firings show a related series of time intervals
• Periodicity information from higher frequency unresolved harmonics or from the modulation envelope of noise is another source of input to this pattern processing, but is a relatively weak cue.
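The idea of intervals that are common across channels can be illustrated with exact arithmetic. This is a sketch under strong simplifications: each channel is assumed to be perfectly phase-locked to one resolved harmonic, so its intervals are whole-number multiples of that harmonic's period. The function names and the choice of harmonics are illustrative.

```python
from fractions import Fraction

def firing_intervals(freq_hz, max_interval_s):
    """Intervals (s) expected in a channel phase-locked to freq_hz:
    whole-number multiples of the period, up to max_interval_s."""
    period = Fraction(1, freq_hz)
    intervals = set()
    k = 1
    while k * period <= max_interval_s:
        intervals.add(k * period)
        k += 1
    return intervals

def shortest_common_interval(freqs_hz, max_interval_s=Fraction(1, 50)):
    """Shortest interval present in every channel, or None if there is none."""
    common = set.intersection(*(firing_intervals(f, max_interval_s) for f in freqs_hz))
    return min(common) if common else None

resolved = [800, 1000, 1200]  # harmonics 4 to 6 of a 200 Hz F0 (illustrative choice)
interval = shortest_common_interval(resolved)
print(f"common interval {float(interval) * 1000:.1f} ms, i.e. {float(1 / interval):.0f} Hz")
```

The shortest interval shared by all three channels is the period of the missing 200 Hz fundamental, even though no channel responds to 200 Hz itself.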
Figure from Moore and Glasberg (1986)
Auditory nerve responses to pulse train: nerves responding to resolved harmonics show periodicity of each harmonic
Summary: Simple signals
• While pitch is broadly correlated with period, human pitch processing is complex
• Sine waves up to a few kHz - pitch is temporally coded – place coding is too poor to account for performance
• Sine waves above 4 kHz - only place cues are present to code sine wave frequency
Summary: Complex signals
• The period indicated by temporal cues alone from unresolved high harmonics in a single auditory filter can signal pitch at F0.
– And a weak pitch can be heard from purely temporal cues with amplitude modulated noise
• However, pitch of complex tones is dominated by resolved harmonics (range 4 to 8 for F0 in speech range). Here pitch processing depends on pattern extraction operating on time intervals between nerve firings
How might impaired hearing affect pitch perception?
• Wider auditory filters due to OHC damage
– Fewer harmonics resolved
• Impaired temporal coding
– Would limit phase-locking and hence temporal coding of frequency
Today’s lab
Measuring fundamental frequency from whistles and speech.
Aimed only at Audiology students