
Identifying Repeated Patterns in Music Using Sparse Convolutive Non-Negative Matrix Factorization

ISMIR 2010

Ron Weiss, Juan Bello
{ronw,jpbello}@nyu.edu

Music and Audio Research Lab
New York University

August 10, 2010

Ron Weiss, Juan Bello (MARL, NYU) Identifying Repeated Patterns in Music Using Sparse Convolutive Non-Negative Matrix Factorization August 10, 2010 1 / 17


Repetitive patterns in music

Repetition is ubiquitous in music

long-term verse-chorus structure

repeated motifs

Can we identify this structure directly from audio?

What about the repeated units?


Proposed approach

Treat song as concatenation of short, repeated template patterns

Inspired by source separation / text topic modeling

Convolutive Non-negative Matrix Factorization (NMF)


Beat-synchronous chroma features [Ellis and Poliner, 2007]

[Figure: beat-synchronous chromagram of "Day Tripper". x-axis: time (beats), 0 to 250; y-axis: pitch class (A to G); color scale: 0.0 to 1.0.]

Summarize energy at each pitch class during each beat

Normalize frame energy to ignore dynamics
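The two steps above can be sketched in a few lines of NumPy. This is an illustration only, not the feature extractor of Ellis and Poliner: the function name `beat_sync_chroma`, the `beat_frames` input, and the choice of unit-maximum normalization are all assumptions, since the slides only say that frame energy is normalized.

```python
import numpy as np

def beat_sync_chroma(chroma, beat_frames):
    """Average a (12, n_frames) chromagram between consecutive beats.

    beat_frames: increasing frame indices of detected beats, each within
    [0, n_frames] (hypothetical input; any beat tracker could supply them).
    """
    chroma = np.asarray(chroma, dtype=float)
    # Interval edges: start of the signal, every beat, end of the signal.
    edges = np.unique(np.concatenate(([0], beat_frames, [chroma.shape[1]])))
    # One column per inter-beat interval: mean energy of each pitch class.
    cols = [chroma[:, a:b].mean(axis=1) for a, b in zip(edges[:-1], edges[1:])]
    synced = np.stack(cols, axis=1)
    # Scale every beat column to unit maximum so loudness changes are ignored
    # (assumed convention, not stated on the slide).
    peak = synced.max(axis=0, keepdims=True)
    return synced / np.maximum(peak, 1e-12)
```

Calling `beat_sync_chroma(chroma, beat_frames)` returns one 12-dimensional column per beat interval, which plays the role of the matrix V in the decomposition that follows.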


SI-PLCA [Smaragdis and Raj, 2007]

Shift-invariant Probabilistic Latent Component Analysis

i.e. probabilistic convolutive NMF

$V \approx \sum_k W_k * h_k \, z_k$

Decompose matrix V into weighted (by Z) sum of latent components

each component is convolution of basis W with activations H

Short-term structure in W, long-term structure in H

Must specify number, length of patterns

Iterative EM learning algorithm
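To make the decomposition concrete, here is a minimal NumPy sketch of the reconstruction and of one EM iteration. It follows the standard PLCA-style updates without any priors and is not taken from the authors' released code. The array layout (`W` as K x F x L, `H` as K x T, `z` of length K) and the names `reconstruct` and `em_step` are assumptions made for this illustration.

```python
import numpy as np

def reconstruct(W, z, H):
    """Approximate V as sum_k z_k (W_k * h_k), convolving along time.

    W: (K, F, L) patterns, z: (K,) mixing weights, H: (K, T) activations.
    """
    K, F, L = W.shape
    T = H.shape[1]
    R = np.zeros((F, T))
    for tau in range(L):
        # Column tau of every pattern, delayed by tau frames.
        R[:, tau:] += W[:, :, tau].T @ (z[:, None] * H[:, :T - tau])
    return R

def em_step(V, W, z, H, eps=1e-12):
    """One EM iteration; returns updated (W, z, H), each normalized to sum to 1."""
    K, F, L = W.shape
    T = V.shape[1]
    Q = V / (reconstruct(W, z, H) + eps)      # data over current model
    W_new = np.zeros_like(W)
    H_acc = np.zeros_like(H)
    for tau in range(L):
        Qs = Q[:, tau:]                       # Q[f, t + tau]
        Hs = H[:, :T - tau]                   # H[k, t]
        W_new[:, :, tau] = W[:, :, tau] * (Hs @ Qs.T)
        H_acc[:, :T - tau] += W[:, :, tau] @ Qs
    W_new *= z[:, None, None]
    H_new = z[:, None] * H * H_acc
    z_new = W_new.sum(axis=(1, 2))            # expected mass of each pattern
    W_new /= np.maximum(z_new, eps)[:, None, None]
    H_new /= np.maximum(H_new.sum(axis=1, keepdims=True), eps)
    z_new = z_new / max(z_new.sum(), eps)
    return W_new, z_new, H_new
```

Starting from random non-negative `W`, `H` and uniform `z`, repeated calls to `em_step` should not decrease the likelihood term sum(V * log(reconstruction)). The number of patterns K and the pattern length L are fixed by the shapes chosen at initialization, which is the limitation noted on the slide.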


Learning algorithm example – Initialization


Learning algorithm example – Converged

[Figure: converged decomposition (iteration 199). Panels: V, Reconstruction, Basis 0 to Basis 3 reconstructions, mixing weights Z, bases W0 to W3, activations H0 to H3.]


Sparsity

Encourage sparse (mostly zero) parameters using prior distributions

Use entropic prior over activations H [Smaragdis et al., 2008]

low entropy ⇒ less uniform

Leads to more meaningful patterns

but reduces temporal information in activations

sparse H ⇒ dense W
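The link between entropy and sparsity can be checked numerically. The sketch below compares a uniform and a peaky activation vector under one common form of entropic prior, log P(h) = -beta * entropy(h) + const. The weight `beta` and both function names are assumptions for illustration; the exact parameterization used by Smaragdis et al. may differ.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of a non-negative vector, after normalization."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    return float(-(p * np.log(p + eps)).sum())

def entropic_log_prior(h, beta):
    """Unnormalized log prior: larger when h has lower entropy (beta > 0)."""
    return -beta * entropy(h)

uniform = np.full(8, 1 / 8)                 # every beat equally active
peaky = np.array([0.93] + [0.01] * 7)       # one dominant activation

# The peaky vector has lower entropy, so the prior favors it.
print(entropy(uniform), entropy(peaky))
print(entropic_log_prior(uniform, 2.0), entropic_log_prior(peaky, 2.0))
```

A prior of this shape pushes each row of H toward a few isolated spikes, which is why the corresponding patterns in W have to absorb more of the signal.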


Automatic relevance determination [Tan and Fevotte, 2009]

Avoid having to specify number of patterns in advance

Initialize decomposition with large number of patterns

Sparse Dirichlet distribution over mixing weights Z

Discard unused patterns

[Figure: effective rank (K) versus iteration. x-axis: iteration, 0 to 200; y-axis: effective rank (K), 0 to 16.]
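The three steps above might look as follows in code. This is a sketch under stated assumptions: the mixing weights are updated with the usual MAP estimate for a symmetric Dirichlet prior with `alpha` below 1, negative values are clipped to zero, and patterns whose weight falls under a tolerance are dropped. The names `map_mixing_weights`, `prune`, `alpha` and `tol` are hypothetical and do not come from the slides.

```python
import numpy as np

def map_mixing_weights(counts, alpha):
    """MAP estimate of z under a symmetric Dirichlet(alpha) prior.

    counts: expected mass assigned to each pattern by the E-step.
    alpha < 1 gives a sparse prior; clipping at zero is an assumption.
    """
    z = np.maximum(np.asarray(counts, dtype=float) + alpha - 1.0, 0.0)
    total = z.sum()
    if total <= 0:
        raise ValueError("all patterns were pruned; try a larger alpha")
    return z / total

def prune(W, z, H, tol=1e-8):
    """Discard patterns with negligible weight and renormalize z."""
    keep = z > tol
    return W[keep], z[keep] / z[keep].sum(), H[keep]
```

For example, `map_mixing_weights([5, 0.2, 3], alpha=0.5)` sets the second weight to zero, and a following `prune` call reduces the effective rank from 3 to 2.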


Sparse learning example – Initialization

[Figure: initial decomposition (iteration 0) with 16 patterns. Panels: V, Reconstruction, Basis 0 to Basis 15 reconstructions, mixing weights Z, bases W0 to W15, activations H0 to H15.]


Sparse learning example – Converged

[Figure: converged sparse decomposition (iteration 199) with 4 patterns remaining. Panels: V, Reconstruction, Basis 0 to Basis 3 reconstructions, mixing weights Z, bases W0 to W3, activations H0 to H3.]


Applications: Riff identification / Thumbnailing

Reconstruct song using a single pattern

Sparse activations

Riff length known in advance (for now)

Thumbnail corresponds to largest activation in H

[Figure: two examples, each showing the learned pattern (x-axis: time (beats), 0 to 14; y-axis: pitch class, A to G) and its activations H over the whole song (x-axis: time (beats)).]
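Picking the thumbnail from the activations is a one-line search. A minimal sketch, assuming `h` is the activation vector of the single pattern (one value per beat) and `pattern_len` is the known riff length in beats; the function name is hypothetical:

```python
import numpy as np

def thumbnail_span(h, pattern_len):
    """Return (start, end) beat indices of the excerpt at the largest activation.

    The end index is exclusive and clipped to the length of h.
    """
    h = np.asarray(h, dtype=float)
    start = int(np.argmax(h))
    end = min(start + pattern_len, h.size)
    return start, end
```

The returned beat range can then be mapped back to audio time using the beat positions from the feature extraction stage.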


Applications: Structure segmentation

Identify long-term song structure (verse, chorus, bridge, etc.)

Assume one-to-one mapping between chroma patterns and segments

Use SI-PLCA decomposition with longer patterns

no prior on activations
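The slides do not spell out how the decomposition is turned into segment labels. One plausible reading of the one-to-one assumption, sketched below, is to label every beat with the pattern that contributes the most energy to the reconstruction at that beat and then merge runs of equal labels. The labeling rule and all function names here are assumptions, and the array layout (`W` as K x F x L, `H` as K x T) is chosen for this illustration.

```python
import numpy as np

def component_energy(W, z, H):
    """Energy each pattern contributes to every frame: shape (K, T)."""
    K, F, L = W.shape
    T = H.shape[1]
    energy = np.zeros((K, T))
    col_energy = W.sum(axis=1)            # (K, L): energy per pattern column
    for tau in range(L):
        energy[:, tau:] += col_energy[:, [tau]] * H[:, :T - tau]
    return z[:, None] * energy

def label_frames(W, z, H):
    """Assign each frame to the pattern with the largest contribution."""
    return np.argmax(component_energy(W, z, H), axis=0)

def to_segments(labels):
    """Collapse runs of equal labels into (start, end, label), end exclusive."""
    labels = np.asarray(labels)
    change = np.flatnonzero(np.diff(labels)) + 1
    starts = np.concatenate(([0], change))
    ends = np.concatenate((change, [labels.size]))
    return [(int(s), int(e), int(labels[s])) for s, e in zip(starts, ends)]
```

In practice the raw per-beat labels would probably need smoothing before the runs are merged; no smoothing is applied in this sketch.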


Structure segmentation example

Estimated: intro, refrain, verse, refrain, verse, refrain, verse, refrain, refrain, outro

Ground truth: intro, refrain, verse, refrain, vs/break, refrain, verse, refrain, refrain, outro


Structure segmentation example 2

segments tend to be broken into multiple motifs

Est: verse1, verse2, verse1, verse2, refrain, verse1, verse2, refrain, verse1, outro, verse1, refrain, verse1, outro

GT: verse, verse, refrain, verse, refrain, 1/2 verse, inst., 1/2 verse, refrain, outro


Experiments

Evaluate on 180 songs from The Beatles catalog

System                     f-meas  prec  recall  over-seg  under-seg
[Mauch et al., 2009]       0.66    0.61  0.77    0.76      0.64
SI-PLCA (sparse Z)         0.60    0.58  0.68    0.61      0.56
SI-PLCA (rank=4)           0.58    0.60  0.59    0.56      0.59
[Levy and Sandler, 2008]   0.54    0.58  0.53    0.50      0.57
Random                     0.30    0.36  0.26    0.07      0.24

Compare to systems based on self-similarity and HMM clustering

middle of the pack performance

sparse Z gives ∼10% improvement in recall over fixed rank

Needs better post-processing?
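The slide does not define the columns of the results table. Assuming that f-meas, prec and recall are the pairwise frame-clustering scores commonly used for structure segmentation, they could be computed as in this sketch. The function name is hypothetical, and the over-seg and under-seg scores are not covered here.

```python
import numpy as np

def pairwise_scores(est, ref):
    """Pairwise precision, recall and F-measure between two frame labelings.

    A pair of frames counts as a hit when both labelings put the two frames
    in the same segment class.
    """
    est = np.asarray(est)
    ref = np.asarray(ref)
    if est.shape != ref.shape:
        raise ValueError("labelings must have the same length")
    i, j = np.triu_indices(est.size, k=1)     # all frame pairs with i < j
    same_est = est[i] == est[j]
    same_ref = ref[i] == ref[j]
    hits = int(np.sum(same_est & same_ref))
    precision = hits / max(int(same_est.sum()), 1)
    recall = hits / max(int(same_ref.sum()), 1)
    if hits == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```

Because only label agreement between pairs of frames matters, the estimated labels do not need to use the same names as the ground truth.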


Summary

Novel algorithm for identifying repeated harmonic patterns in music

Use sparsity to minimize number of fixed parameters, control structure

Applications to thumbnailing and structure segmentation

Future work

Adaptive model of pattern length, better downbeat alignment

2D convolution to compensate for key changes

Time-warp invariance (beat-tracking errors, fixed hop size)

Open source Python/Matlab implementation available: http://ronw.github.com/siplca-segmentation


References

Ellis, D. and Poliner, G. (2007). Identifying ‘cover songs’ with chroma features and dynamic programming beat tracking. In Proc. ICASSP, pages IV–1429–1432.

Levy, M. and Sandler, M. (2008). Structural Segmentation of Musical Audio by Constrained Clustering. IEEE Trans. Audio, Speech, and Language Processing, 16(2).

Mauch, M., Noland, K. C., and Dixon, S. (2009). Using musical structure to enhance automatic chord transcription. In Proc. ISMIR, pages 231–236.

Smaragdis, P. and Raj, B. (2007). Shift-Invariant Probabilistic Latent Component Analysis. Technical Report TR2007-009, MERL.

Smaragdis, P., Raj, B., and Shashanka, M. (2008). Sparse and shift-invariant feature extraction from non-negative data. In Proc. ICASSP, pages 2069–2072.

Tan, V. and Fevotte, C. (2009). Automatic Relevance Determination in Nonnegative Matrix Factorization. In Proc. SPARS.
