CCRMA MIR Workshop 2013 Pitch and Chroma Analysis Steve Tjoa June 25, 2013 [email protected] Tuesday, June 25, 2013
Page 1: CCRMA MIR Workshop 2013 Pitch and Chroma Analysis · 2013. 6. 25. · Score Discrete, high level abstraction, explicit structure, no performance info MIDI Discrete, medium level of

CCRMA MIR Workshop 2013
Pitch and Chroma Analysis

Steve Tjoa
June 25, 2013

[email protected]

Tuesday, June 25, 2013

Copyright 2011 G. Tzanetakis

Traditional Music Representations

Pitch content

Harmony, melody = pitch concepts

Music theory

Score = music

Bridge to symbolic MIR

Automatic music transcription

Split the octave into discrete, logarithmically spaced intervals

MIDI

Musical Instrument Digital Interface

Hardware interface

File format

Note events: duration, discrete pitch, "instrument"

Extensions: General MIDI

Notation, OMR, continuous pitch

Representations

Score: discrete, high level of abstraction, explicit structure, no performance info

MIDI: discrete, medium level of abstraction, explicit time but less structure, targeted to keyboard performance

Audio: continuous, low level of abstraction, timing and structure implicit

Pitch Detection

Time-domain, frequency-domain, perceptual

Rhythm -> ~20 Hz -> Pitch (courtesy of R. Dannenberg, Nyquist)

Pitch is a PERCEPTUAL attribute, correlated with but not equivalent to fundamental frequency

Time Domain

[Figure panels: C4 clarinet note, C4 sine wave]

The number of zero-crossings is sensitive to noise, so the signal needs a low-pass filter (LPF) first
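
The zero-crossing idea can be sketched in a few lines. This is only an illustration under assumptions not on the slide: the test tone is a synthetic C4 sine, the noise level is arbitrary, and the "LPF" is a crude moving average (the names `zero_crossing_f0` and `moving_average` are hypothetical).

```python
import math
import random

def zero_crossing_f0(x, srate):
    """Estimate f0 from the number of sign changes (two per period)."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a < 0) != (b < 0))
    return crossings * srate / (2.0 * len(x))

def moving_average(x, width):
    """Crude low-pass filter: running mean over the last `width` samples."""
    out, acc = [], 0.0
    for i, v in enumerate(x):
        acc += v
        if i >= width:
            acc -= x[i - width]
        out.append(acc / min(i + 1, width))
    return out

srate = 22050
f0 = 261.63                      # C4
clean = [math.sin(2 * math.pi * f0 * t / srate) for t in range(srate)]
rng = random.Random(0)           # fixed seed so the run is repeatable
noisy = [s + 0.3 * rng.gauss(0, 1) for s in clean]

est_clean = zero_crossing_f0(clean, srate)
est_noisy = zero_crossing_f0(noisy, srate)
est_filtered = zero_crossing_f0(moving_average(noisy, 16), srate)
print(est_clean, est_noisy, est_filtered)
```

Noise adds spurious sign changes near every true crossing, which inflates the estimate; smoothing before counting removes most of them.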

Autocorrelation

Efficient computation is possible for power-of-2 lengths using the FFT:

F(f) = FFT(X(t))

S(f) = F(f) F*(f)

R(l) = IFFT(S(f))
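
A minimal sketch of the three steps above, assuming a small recursive radix-2 FFT written here for self-containment (a real system would use a library FFT). The zero-padding, the 50-1000 Hz search range and the function names are assumptions for illustration, not part of the slide.

```python
import cmath
import math

def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of 2 (unnormalized)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def autocorrelation(x):
    """R(l) = IFFT(F(f) F*(f)), zero-padded to avoid circular wrap-around."""
    n = 1
    while n < 2 * len(x):
        n *= 2
    F = fft(list(x) + [0.0] * (n - len(x)))
    S = [c * c.conjugate() for c in F]          # power spectrum
    R = fft(S, inverse=True)
    return [r.real / n for r in R[:len(x)]]     # normalize the inverse FFT

def pitch_from_autocorrelation(x, srate, fmin=50.0, fmax=1000.0):
    """Pick the lag with the largest autocorrelation in the allowed range."""
    R = autocorrelation(x)
    lo = int(srate / fmax)
    hi = min(int(srate / fmin), len(R) - 1)
    lag = max(range(lo, hi + 1), key=lambda l: R[l])
    return srate / lag

srate, f0 = 22050, 261.63   # C4
tone = [math.sin(2 * math.pi * f0 * t / srate)
        + 0.5 * math.sin(2 * math.pi * 2 * f0 * t / srate) for t in range(1024)]
estimate = pitch_from_autocorrelation(tone, srate)
print(estimate)
```

The estimate is quantized to integer lags, so it lands near, not exactly on, the true fundamental.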

Average Magnitude Difference Function (AMDF)

No multiplies, so more efficient for fixed-point arithmetic
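
As a rough sketch, the average magnitude difference at lag l is the mean of |x[n] - x[n + l]|, and the pitch period shows up as a deep minimum. The search range and function names below are assumptions; the division is only a normalization and could be dropped on fixed-point hardware.

```python
import math

def amdf(x, max_lag):
    """d(l) = mean |x[n] - x[n + l]|, using only subtractions and abs()."""
    n = len(x) - max_lag
    return [sum(abs(x[i] - x[i + lag]) for i in range(n)) / n
            for lag in range(max_lag + 1)]

def pitch_from_amdf(x, srate, fmin=100.0, fmax=1000.0):
    """Pick the lag with the smallest average magnitude difference."""
    lo = int(srate / fmax)
    hi = int(srate / fmin)
    d = amdf(x, hi)
    lag = min(range(lo, hi + 1), key=lambda l: d[l])
    return srate / lag

srate, f0 = 22050, 261.63   # C4
tone = [math.sin(2 * math.pi * f0 * t / srate)
        + 0.5 * math.sin(2 * math.pi * 2 * f0 * t / srate) for t in range(1024)]
estimate = pitch_from_amdf(tone, srate)
print(estimate)
```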

Frequency Domain

Fundamental frequency (as well as pitch) will correspond to peaks in the Spectrum. The fundamental does not necessarily have the highest amplitude.

[Figure panels: sine C4, clarinet C4]
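
To illustrate that the strongest spectral peak need not be the fundamental, the sketch below builds a hypothetical clarinet-like tone (weak fundamental, strong third harmonic; the amplitudes are invented) and locates the largest bin with a naive DFT. No window is applied, so peak positions are only accurate to about one bin.

```python
import cmath
import math

def magnitude_spectrum(x):
    """Naive DFT magnitudes for bins 0 .. N/2 (slow but dependency-free)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

srate, n, f0 = 22050, 512, 261.63   # C4
# Assumed harmonic amplitudes, chosen only to make the point visible.
x = [0.3 * math.sin(2 * math.pi * f0 * t / srate)
     + 1.0 * math.sin(2 * math.pi * 3 * f0 * t / srate) for t in range(n)]

mags = magnitude_spectrum(x)
strongest_bin = max(range(1, len(mags)), key=lambda k: mags[k])
strongest_hz = strongest_bin * srate / n
print(strongest_hz)   # near 3 * f0, not f0
```

A detector that simply picks the highest peak would report the third harmonic here, which is why frequency-domain methods usually look at the spacing of peaks rather than a single maximum.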

Polyphonic

Original, Transcribed

Klapuri et al., DAFX 00

Mixture signal -> Noise suppression -> Predominant pitch estimation -> Remove detected sound -> Estimate number of voices -> iterate

Musical Pitch

Tuning = different ways of subdividing the octave logarithmically (as ratios) into intervals

Tension between harmonic ratios, modulation to different keys, regularity, pure fifths (ratio of 1.5 or 3:2)

=> Many tuning systems have been explored throughout history

Tuning systems

Just intonation (1:1, 9:8, 5:4, 4:3, 3:2, 5:3, 15:8, 2:1)

Pythagorean tuning: all notes derive from 3:2 (1:1, 256:243, 9:8,…)

Equal temperament

All notes are spaced by logarithmically equal distances (100 cents). Each step is higher than the previous one by a factor of 2^(1/12) (about 1.0594).
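
A short worked comparison, using the just-intonation ratios listed above and the standard definition of the cent (1200 * log2 of the frequency ratio). Pairing the ratios with major-scale degrees in semitones is an assumption made for the example.

```python
import math

def cents(ratio):
    """Interval size in cents: 1200 * log2(ratio)."""
    return 1200.0 * math.log2(ratio)

# Just-intonation ratios from the slide, read as a major scale.
just = [(1, 1), (9, 8), (5, 4), (4, 3), (3, 2), (5, 3), (15, 8), (2, 1)]
# Assumed equal-tempered scale degrees, in semitones above the tonic.
semitones = [0, 2, 4, 5, 7, 9, 11, 12]

semitone_ratio = 2 ** (1 / 12)   # about 1.0594

for (num, den), st in zip(just, semitones):
    ji = cents(num / den)
    et = 100.0 * st
    print(f"{num}:{den}  just = {ji:7.2f}  equal = {et:6.1f}  diff = {ji - et:+6.2f} cents")
```

The pure fifth (3:2) comes out about 2 cents wider than the equal-tempered fifth, while the just major third (5:4) is about 14 cents narrower than its equal-tempered counterpart.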

Notation

A, B, C, D, E, F, G

Numbers indicate the octave

A4 is 440 Hz and MIDI number 69

Do, Re, Mi, Fa, Sol, La, Ti

MIDI note numbers (0-127)

m = 69 + 12 log2(f/440)
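
The conversion formula can be checked with a few lines. The helper names are hypothetical, and the octave labeling (MIDI 60 = C4) is the common convention that is consistent with A4 = 69; sharps are used for all black keys for simplicity.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def hz_to_midi(f):
    """m = 69 + 12 log2(f / 440); fractional values indicate detuning."""
    return 69.0 + 12.0 * math.log2(f / 440.0)

def midi_to_hz(m):
    """Inverse mapping: f = 440 * 2^((m - 69) / 12)."""
    return 440.0 * 2.0 ** ((m - 69) / 12.0)

def midi_to_name(m):
    """Nearest note name with octave number, assuming MIDI 60 = C4."""
    m = int(round(m))
    return f"{NOTE_NAMES[m % 12]}{m // 12 - 1}"

print(hz_to_midi(440.0), midi_to_name(69))
print(midi_to_hz(60), midi_to_name(hz_to_midi(261.63)))
```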

Pitch Histograms

[Figure: pitch histograms with the C and G bins labeled]

Circle of fifths mapping: c' = (7 * c) mod 12
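
A quick check of what the (7 * c) mod 12 mapping does to the twelve pitch classes. Pitch class 0 is taken to be C here, which is an assumption; the same reordering works from any starting note.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def to_circle_of_fifths(c):
    """Map a chromatic pitch-class index to its circle-of-fifths position."""
    return (7 * c) % 12

order = [None] * 12
for c in range(12):
    order[to_circle_of_fifths(c)] = NOTE_NAMES[c]

print(order)
```

After the mapping, pitch classes a fifth apart (such as C and G) sit in adjacent histogram bins, so harmonically related notes cluster together.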

Chroma

Calculating Pitch

Calculate FFT of a signal segment

Map each FFT bin to Hertz

512 time-domain samples -> 256 FFT bins @ 22050 Hz. Each bin is 11025/256 ~= 43 Hz wide.

f = k * (srate / fft_size)

Map each bin (in Hertz) to MIDI:

m = 69 + 12 log2(f/440)
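
The two mappings chain together directly. The sketch below uses the slide's numbers (512-point FFT at 22050 Hz); the function names are hypothetical, and bin 0 (DC) is skipped because log2(0) is undefined.

```python
import math

def bin_to_hz(k, srate, fft_size):
    """f = k * (srate / fft_size)"""
    return k * srate / fft_size

def bin_to_midi(k, srate, fft_size):
    """m = 69 + 12 log2(f / 440); returns None for the DC bin."""
    f = bin_to_hz(k, srate, fft_size)
    if f <= 0:
        return None
    return 69.0 + 12.0 * math.log2(f / 440.0)

srate, fft_size = 22050, 512
for k in (1, 6, 7, 10, 20):
    print(k, round(bin_to_hz(k, srate, fft_size), 1),
          round(bin_to_midi(k, srate, fft_size), 2))
```

Because the bins are linearly spaced but pitch is logarithmic, neighboring low bins can be more than a semitone apart, while many high bins fall onto the same MIDI note.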

Pitch Histogram

Average the amplitudes of the bins that map to the same MIDI note number

(different averaging shapes can be used)

If desired, fold the resulting histogram, collapsing bins that belong to the same pitch class into one

Frequently more than 12 bins per octave are used to account for tuning/performance variations
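
A minimal sketch of the histogram and folding steps, assuming a plain (rectangular) average per MIDI note and 12 bins per octave. The function names are hypothetical, and index 0 of the folded result is C here because it follows MIDI numbering; other conventions start at A.

```python
import math

def pitch_histogram(magnitudes, srate, fft_size):
    """Average the FFT magnitudes that round to the same MIDI note (0-127)."""
    sums, counts = [0.0] * 128, [0] * 128
    for k, mag in enumerate(magnitudes):
        if k == 0:
            continue                      # skip DC: log2(0) is undefined
        f = k * srate / fft_size
        m = int(round(69 + 12 * math.log2(f / 440.0)))
        if 0 <= m <= 127:
            sums[m] += mag
            counts[m] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

def fold(hist):
    """Collapse notes of the same pitch class into 12 chroma bins."""
    chroma = [0.0] * 12
    for m, v in enumerate(hist):
        chroma[m % 12] += v
    return chroma

# Toy spectrum: a single unit peak at bin 10 (about 431 Hz, nearest note A4).
srate, fft_size = 22050, 512
magnitudes = [0.0] * (fft_size // 2 + 1)
magnitudes[10] = 1.0

hist = pitch_histogram(magnitudes, srate, fft_size)
chroma = fold(hist)
print(chroma)
```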

Chroma Profiles

[Figure panels: sine C4, clarinet C4]

Bin 0 is A and the spacing is chromatic

Chromagrams

Time Alignment

Two sequences of energy contours corresponding to two performances of the same symphony

We are given two pitch sequences of the same melody sung by different singers

How can we find out whether they match?

Dynamic Time Warping
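
Dynamic time warping (DTW) answers the matching question by finding the cheapest monotonic alignment between two sequences. The sketch below is the textbook formulation with unit steps and absolute difference as the local cost, both of which are assumptions; the melodies are invented MIDI note sequences.

```python
def dtw(a, b, dist=lambda x, y: abs(x - y)):
    """Return (total cost, alignment path) between sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # D[i][j] = cost of the best alignment of a[:i] with b[:j]
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(a[i - 1], b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    # Backtrack from the end to recover the optimal path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(D[i - 1][j - 1], i - 1, j - 1),
                 (D[i - 1][j], i - 1, j),
                 (D[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return D[n][m], path

melody = [60, 62, 64, 65, 67]                 # one rendition
slower = [60, 60, 62, 62, 64, 65, 65, 67]     # same melody, different timing
other = [60, 61, 63, 65, 68]                  # a different melody

cost_same, path_same = dtw(melody, slower)
cost_other, _ = dtw(melody, other)
print(cost_same, cost_other)
```

The two renditions of the same melody align at zero cost despite their different lengths, while the different melody leaves a positive residual cost.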

Music Representations

Symbolic representation: easy to manipulate, “flat” performance

Audio representation: expressive performance, opaque & unstructured

Align

POLYPHONIC AUDIO AND MIDI ALIGNMENT

Similarity Matrix

Similarity matrix for Beethoven’s 5th Symphony, first movement

Optimal alignment path

Oboe solo: acoustic recording, audio from MIDI

Axis durations: 6:17 and 7:49

POLYPHONIC AUDIO AND MIDI ALIGNMENT
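
A similarity matrix of this kind can be sketched by comparing every frame of one feature sequence with every frame of the other. Cosine similarity on 12-bin chroma vectors is an assumption here (the slide does not name the measure), and the frames are toy one-hot vectors.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_matrix(X, Y):
    """S[i][j] = similarity between frame i of X and frame j of Y."""
    return [[cosine_similarity(x, y) for y in Y] for x in X]

def one_hot(pitch_class):
    """Toy chroma frame with all energy in one pitch class (0 = C)."""
    return [1.0 if i == pitch_class else 0.0 for i in range(12)]

recording = [one_hot(0), one_hot(7)]               # C, then G
from_midi = [one_hot(0), one_hot(0), one_hot(7)]   # C held longer, then G

S = similarity_matrix(recording, from_midi)
for row in S:
    print(row)
```

High-similarity cells trace out a roughly diagonal ridge, and the optimal alignment path follows that ridge.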

Structural Analysis

Similarity matrix

Representations: notes, chords, chroma

Greedy hill-climbing algorithm

Recognize repeated patterns

Result = AABA (explanation)

Dannenberg & Hu, ISMIR 2002

Tzanetakis, Dannenberg & Hu, WIAMIS 03

“Classic” multi-stage approach

Stages: time-frequency representation, grouping cue 1, grouping cue 2

Short Time Fourier Transform (discrete basis: windowed sine waves)

Partial tracking (McAulay & Quatieri)

Sound source formation: grouping of partials based on harmonicity

PROBLEMS: difficult to decide ordering, brittle

Sound Source Separation using Spectral Clustering

Comparison with partial tracking

McAulay and Quatieri: tracking of partials

Proposed Approach

“Real world” separation

MIREX database

Live U2

More examples: http://opihi.cs.uvic.ca/NormCutAudio http://opihi.cs.uvic.ca/Dafx2007
