Geometry/Topology of Musical Audio Data
Chris Tralie
Duke University
October 23, 2014
Chris Tralie Geometry/Topology of Musical Audio Data
Overview
- View music as a geometric curve in high dimensions
- Express timbral and thematic musical ideas in geometric language
- Use geometric features for classification of musical audio
Joint work with Paul Bendich, John Harer, Marshall Ratlif, and Derrick Nowak
Special thanks to the summer Data+ undergraduate research program and IID!
Motivating Demo
- DEMO
Talk Overview
- Sampled Audio Basics/Spectrograms (current section)
- MFCC/Chroma Features
- PCA and LoopDitty
- Topological Data Analysis Bird’s Eye View
- Learning on Persistence Diagrams
- Genre Classification Results
- Future Work: Artist20 / Bridge Detection
Digital Audio Basics: Representation/Sampling
- 1D time series x[n], sampled at 44100 Hz
- Need for dimension reduction
  - A 1-second chunk lives in R^44100
  - A 3-second chunk lives in R^132300!
Digital Audio Basics: Sliding Window
For a window length of M samples:
\[
x_M[n] = \begin{bmatrix} x[n] \\ x[n+1] \\ x[n+2] \\ \vdots \\ x[n+M-1] \end{bmatrix} \in \mathbb{R}^M
\]
Can summarize each window with the Fourier Transform
- Want each window to have roughly stationary frequency statistics
- Leads to the Short-Time Fourier Transform (STFT), or a Spectrogram
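As a rough illustration of the sliding window and spectrogram idea, here is a minimal standard-library sketch. The function names, the hop size, the rectangular (unweighted) window, and the toy sinusoid are all assumptions made for this example; they are not taken from the talk.

```python
import cmath
import math

def sliding_windows(x, M, hop):
    """Windows (x[n], ..., x[n + M - 1]) for n = 0, hop, 2 * hop, ..."""
    return [x[n:n + M] for n in range(0, len(x) - M + 1, hop)]

def dft_magnitudes(window):
    """Magnitudes of the DFT of one window, for bins 0 .. M // 2."""
    M = len(window)
    return [
        abs(sum(window[n] * cmath.exp(-2j * math.pi * k * n / M) for n in range(M)))
        for k in range(M // 2 + 1)
    ]

def spectrogram(x, M, hop):
    """One row of DFT magnitudes per sliding window (rectangular window)."""
    return [dft_magnitudes(w) for w in sliding_windows(x, M, hop)]

# Toy signal (hypothetical): a sinusoid with a period of 8 samples,
# so in a 32-sample window its energy lands in bin 32 / 8 = 4.
x = [math.sin(2 * math.pi * n / 8) for n in range(128)]
S = spectrogram(x, M=32, hop=16)
```

Each row of S is one column of the spectrogram picture. A practical implementation would use an FFT and a tapered window; the direct sum here is only meant to make the definition concrete.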
Digital Audio Basics: Spectrogram Examples
Plucked String
Digital Audio Basics: Spectrogram Examples
Coltrane / Aerosmith
Digital Audio Basics: Spectrogram Examples
Coltrane / Aerosmith Zoomed
Talk Overview
- Sampled Audio Basics/Spectrograms
- MFCC/Chroma Features (current section)
- PCA and LoopDitty
- Topological Data Analysis Bird’s Eye View
- Learning on Persistence Diagrams
- Genre Classification Results
- Future Work: Artist20 / Bridge Detection
MFCC/Chroma Features
- Purpose: To perform a nonlinear dimension reduction from R^44100 to R^50
- Reduction is severe but it should retain important perceptual information
  - Pitches / Overall spectral shape
- Reduction will increase robustness to noise and address the curse of dimensionality
Chroma Features
- Would like a feature that picks up on the notes in the music
- Frequency doubles for every increase in octave
  - Ex) A is 220 Hz, 440 Hz, 880 Hz, 1760 Hz, etc.
- Measuring strength of all frequencies in a pitch class leads to 12 distinct features
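The octave-folding idea can be sketched in a few lines. This is only an illustration: the function names, the choice of A as pitch class 0, the 440 Hz reference, and the nearest-semitone rounding are assumptions for this example and may differ from the chroma features actually used in the talk.

```python
import math

def pitch_class(freq_hz, a4_hz=440.0):
    """Map a frequency to one of 12 pitch classes (0 = A, 1 = A#, ..., 11 = G#)."""
    semitones_from_a4 = 12 * math.log2(freq_hz / a4_hz)
    return round(semitones_from_a4) % 12

def chroma(magnitudes, sample_rate, window_len):
    """Fold DFT bin magnitudes (bins 0 .. window_len // 2) into 12 pitch-class bins."""
    bins = [0.0] * 12
    for k, mag in enumerate(magnitudes):
        if k == 0:
            continue  # skip the DC bin, which has no pitch
        freq = k * sample_rate / window_len
        bins[pitch_class(freq)] += mag
    return bins

# Every A lands in the same class, regardless of octave.
a_classes = [pitch_class(f) for f in (220.0, 440.0, 880.0, 1760.0)]
```

Because doubling the frequency adds exactly 12 semitones, all octaves of a note collapse onto one of the 12 features.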
Chroma Features: Piano Chromatic Scale Example
Chroma Features: When Doves Cry (Sound Example)
MFCC Features
- Something’s Missing (no absolute octave reference)
- Design a feature that picks up on spectral envelope
  1. Multiply STFT with mel-spaced triangular filterbank
  2. Take the log of the amplitude in each mel bin
  3. Perform the discrete cosine transform (keep the first 13 coefficients)
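The three steps above can be sketched for a single STFT frame as follows. This is a hedged illustration, not the talk's implementation: the mel formula (a common 2595 * log10(1 + f / 700) variant), the 40 filters, the small `eps` guard, and all function names are assumptions.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters with centers equally spaced on the mel scale.
    n_bins counts the DFT bins from 0 Hz up to and including Nyquist."""
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    hz_pts = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * nyquist / (n_bins - 1) for k in range(n_bins)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = hz_pts[j - 1], hz_pts[j], hz_pts[j + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising edge
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling edge
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def dct2(values, n_keep):
    """First n_keep coefficients of the unnormalized DCT-II."""
    N = len(values)
    return [
        sum(v * math.cos(math.pi * k * (n + 0.5) / N) for n, v in enumerate(values))
        for k in range(n_keep)
    ]

def mfcc(magnitudes, sample_rate, n_filters=40, n_keep=13, eps=1e-10):
    bank = mel_filterbank(n_filters, len(magnitudes), sample_rate)          # step 1
    mel_energy = [sum(w * m for w, m in zip(row, magnitudes)) for row in bank]
    log_energy = [math.log(e + eps) for e in mel_energy]                    # step 2
    return dct2(log_energy, n_keep)                                         # step 3
```

Keeping only the low-order DCT coefficients retains the smooth shape of the log-mel spectrum (the spectral envelope) and discards fine harmonic detail.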
MFCC Features: When Doves Cry (Sound Example)
Analysis/Texture Windows
- Take mean/variance of many small STFT windows in a larger window
- (Chroma (12) + MFCC (13)) x 2 = 50
- Reduction from R^N to R^50
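A minimal sketch of the texture-window summary, assuming each small analysis window has already been turned into a 25-dimensional frame (12 chroma values followed by 13 MFCC values). The function names, the window and hop sizes, and the use of population variance are assumptions for illustration.

```python
def mean_and_variance(frames):
    """Per-component mean and population variance over equal-length frames."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    variances = [sum((f[d] - means[d]) ** 2 for f in frames) / n for d in range(dim)]
    return means, variances

def texture_windows(frames, win, hop):
    """Summarize each run of `win` consecutive frames as means followed by variances."""
    points = []
    for start in range(0, len(frames) - win + 1, hop):
        means, variances = mean_and_variance(frames[start:start + win])
        points.append(means + variances)
    return points

# Hypothetical input: 100 frames of dimension 25, giving points in R^50.
frames = [[float(i + d) for d in range(25)] for i in range(100)]
curve = texture_windows(frames, win=40, hop=20)
```

With 25 features per frame, each texture window yields 25 means and 25 variances, which is where the 50 comes from.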
Talk Overview
- Sampled Audio Basics/Spectrograms
- MFCC/Chroma Features
- PCA and LoopDitty (current section)
- Topological Data Analysis Bird’s Eye View
- Learning on Persistence Diagrams
- Genre Classification Results
- Future Work: Artist20 / Bridge Detection
Curves in R^50
- First big conceptual breakthrough: each texture window is a point in R^50
- Transition from analysis to geometry
- Do PCA on these curves to visualize in 3D
- First time we see dimension reduction is important!
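To make the PCA step concrete, here is a small standard-library sketch that projects a point cloud onto its top principal axes using power iteration with deflation. This is a swapped-in teaching technique, not necessarily how the talk computes PCA; the function names, the fixed start vector, and the iteration count are assumptions, and a numerical library would normally be used instead.

```python
import math

def center(points):
    n, dim = len(points), len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(dim)]
    return [[p[d] - mean[d] for d in range(dim)] for p in points]

def covariance(centered):
    n, dim = len(centered), len(centered[0])
    return [[sum(p[i] * p[j] for p in centered) / n for j in range(dim)]
            for i in range(dim)]

def top_eigenvector(C, iters=500):
    """Power iteration from a fixed start vector (assumed not orthogonal to the answer)."""
    dim = len(C)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def pca(points, n_components=3):
    """Coordinates of each point along the top n_components principal axes."""
    X = center(points)
    C = covariance(X)
    dim = len(C)
    axes = []
    for _ in range(n_components):
        v = top_eigenvector(C)
        lam = sum(v[i] * C[i][j] * v[j] for i in range(dim) for j in range(dim))
        axes.append(v)
        # Deflate: remove the direction just found so the next one can surface.
        C = [[C[i][j] - lam * v[i] * v[j] for j in range(dim)] for i in range(dim)]
    return [[sum(x[d] * a[d] for d in range(dim)) for a in axes] for x in X]
```

Applied to the sequence of texture-window points in time order, the three output coordinates give a curve that can be drawn in 3D.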
LoopDitty: Synchronize PCA with Audio
- In-depth live demo: Michael Jackson “Bad”
LoopDitty.net
- I created a web site that can draw these curves for any sound on SoundCloud. Try it out and let me know what you find!
Talk Overview
- Sampled Audio Basics/Spectrograms
- MFCC/Chroma Features
- PCA and LoopDitty
- Topological Data Analysis Bird’s Eye View (current section)
- Learning on Persistence Diagrams
- Genre Classification Results
- Future Work: Artist20 / Bridge Detection
TDA Motivation
- Need a quantitative way to measure geometric features in high dimensions
- Don’t care about coordinates
- Do care about multi-scale features (from small wiggles to big clusters and loops in verse/chorus)
- Topological Data Analysis (TDA) is good for the job
  - Lets us measure clusters, cycles, and critical points in high dimensions
Finding Cycles: 1D Rips Filtration
- First motivation for looking at music: tons of cycles
- Information encoded in a Persistence Diagram
- Birth time (time of formation) of loop on x-axis
- Death time (time of getting filled in) of loop on y-axis
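For readers who want to see birth and death times computed, the following is a brute-force sketch of 1-dimensional Rips persistence for a handful of points, using the standard boundary-matrix reduction over Z/2. It is an illustration under stated assumptions: the function name is hypothetical, the complex is built only up to triangles, and the cubic number of triangles makes it unsuitable for real songs, where dedicated TDA software would be used.

```python
import itertools
import math

def rips_loops(points):
    """(birth, death) pairs of loops in the Rips filtration of a small point set."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]

    # Each simplex enters at the length of its longest edge (vertices at 0).
    simplices = [(0.0, 0, (i,)) for i in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        simplices.append((d[i][j], 1, (i, j)))
    for i, j, k in itertools.combinations(range(n), 3):
        simplices.append((max(d[i][j], d[i][k], d[j][k]), 2, (i, j, k)))
    simplices.sort()  # ties are broken by dimension, so faces precede cofaces
    index = {verts: idx for idx, (_, _, verts) in enumerate(simplices)}

    reduced = {}      # column index -> reduced boundary (set of row indices)
    pivot_owner = {}  # largest row index of a reduced column -> that column
    diagram = []
    for col, (value, dim, verts) in enumerate(simplices):
        if dim == 0:
            continue
        boundary = {index[face] for face in itertools.combinations(verts, dim)}
        while boundary and max(boundary) in pivot_owner:
            boundary ^= reduced[pivot_owner[max(boundary)]]  # add columns mod 2
        if boundary:
            low = max(boundary)
            reduced[col] = boundary
            pivot_owner[low] = col
            if dim == 2 and value > simplices[low][0]:
                # This triangle fills in the loop that edge `low` created.
                diagram.append((simplices[low][0], value))
    return diagram

# Four corners of a unit square: one loop, born when the sides appear
# and filled in when the diagonals appear.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
diagram = rips_loops(square)
```

Loops with zero lifetime (born and filled in at the same scale) are dropped, since they would sit on the diagonal of the persistence diagram.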
Finding Critical Points: Morse Filtrations
- For a general curve, no reason to privilege any direction
- Filter along many directions equally spaced on S^N
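One way to read "filtering along a direction" is as sublevel-set persistence of the height function on the piecewise linear curve: project each vertex onto the direction, sweep upward, and record when a local minimum's component merges into an older one. The sketch below follows that reading; the function name and the union-find bookkeeping are assumptions, and how the talk picks its equally spaced directions is not shown here.

```python
def height_persistence(curve, direction):
    """(birth, death) pairs of the height function <p, direction> along a curve.
    The component of the global minimum never dies and is not reported."""
    h = [sum(a * b for a, b in zip(p, direction)) for p in curve]
    order = sorted(range(len(h)), key=lambda i: h[i])
    parent = {}  # union-find over the vertices swept so far
    birth = {}   # component root -> height of that component's minimum

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i] = i
        birth[i] = h[i]
        for j in (i - 1, i + 1):          # neighbors along the curve
            if j not in parent:
                continue
            ri, rj = find(i), find(j)
            if ri == rj:
                continue
            # Elder rule: the component with the higher minimum dies here.
            old, young = (ri, rj) if birth[ri] <= birth[rj] else (rj, ri)
            if h[i] > birth[young]:
                pairs.append((birth[young], h[i]))
            parent[young] = old
    return pairs

# A zig-zag in the plane, filtered along the vertical direction.
zigzag = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
pairs = height_persistence(zigzag, (0.0, 1.0))
```

Repeating this for many directions gives one diagram per direction, so no single axis is privileged.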
Talk Overview
- Sampled Audio Basics/Spectrograms
- MFCC/Chroma Features
- PCA and LoopDitty
- Topological Data Analysis Bird’s Eye View
- Learning on Persistence Diagrams (current section)
- Genre Classification Results
- Future Work: Artist20 / Bridge Detection
Learning on Persistence Diagrams
- Original Diagram
Learning on Persistence Diagrams
- Transform to lifetime (death time - birth time)
Learning on Persistence Diagrams
- Sort in descending order of lifetimes to make feature vector
- Take a sparse summary of this feature vector with Fourier coefficients
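The diagram-to-feature-vector recipe might be sketched as follows. The slides do not say whether magnitudes or complex values of the Fourier coefficients are kept, or how diagrams of different sizes are aligned, so taking DFT magnitudes of the raw sorted lifetimes is an assumption, as are the function names.

```python
import cmath
import math

def sorted_lifetimes(diagram):
    """Lifetimes (death - birth) in descending order."""
    return sorted((death - birth for birth, death in diagram), reverse=True)

def fourier_summary(values, n_coeffs):
    """Magnitudes of the first n_coeffs DFT coefficients of a sequence."""
    N = len(values)
    if N == 0:
        return [0.0] * n_coeffs  # empty diagram: nothing to summarize
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * n / N) for n, v in enumerate(values)))
        for k in range(n_coeffs)
    ]

# Hypothetical diagram with three (birth, death) pairs.
diagram = [(0.0, 1.0), (0.5, 3.5), (1.0, 1.5)]
lifetimes = sorted_lifetimes(diagram)
features = fourier_summary(lifetimes, n_coeffs=5)
```

The result is a fixed-length vector no matter how many points the diagram has, which is what a standard classifier needs.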
Talk Overview
- Sampled Audio Basics/Spectrograms
- MFCC/Chroma Features
- PCA and LoopDitty
- Topological Data Analysis Bird’s Eye View
- Learning on Persistence Diagrams
- Genre Classification Results (current section)
- Future Work: Artist20 / Bridge Detection
Tzanetakis Genre Dataset
- Old dataset (2002), benchmark for early work in genre classification
- 1000 song clips, 30 seconds each
- 10 genres x 100 songs
  - blues, classical, country, disco, hiphop, jazz, metal, pop, reggae, rock
- Well-documented problems, but 100+ papers cite it
Genre Classification Pipeline
- Scale each song so that it lies in the unit cube
- Perform 1D Rips filtration, throwing away all notion of time
- Perform 50 Morse filtrations on the piecewise linear curve, in 50 equally spaced directions on S^49
- Grab the lifetimes from each diagram, sort them, and summarize with the first 5 Fourier coefficients
- For Chroma/MFCC, take mean/variance of each component over the whole song to embed in R^100
- Do k-nearest neighbor 10-fold cross validation, with k = 5
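The first and last steps of the pipeline can be sketched with the standard library alone. Several details are assumptions rather than facts from the talk: the scaling is uniform across axes (so the curve keeps its shape), distances are Euclidean, folds are assigned by interleaving indices with no shuffling or stratification, and all names are hypothetical.

```python
import math
from collections import Counter

def scale_to_unit_cube(points):
    """Translate and uniformly scale a point cloud so every coordinate is in [0, 1]."""
    dim = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dim)]
    span = max(max(p[d] for p in points) - lows[d] for d in range(dim))
    if span == 0:
        span = 1.0  # all points identical; avoid dividing by zero
    return [[(p[d] - lows[d]) / span for d in range(dim)] for p in points]

def knn_predict(train_x, train_y, query, k=5):
    """Majority label among the k training points nearest to `query`."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

def cross_validate(xs, ys, folds=10, k=5):
    """Fraction of held-out items classified correctly over all folds."""
    correct = 0
    for fold in range(folds):
        test_idx = set(range(fold, len(xs), folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        for i in test_idx:
            if knn_predict(train_x, train_y, xs[i], k) == ys[i]:
                correct += 1
    return correct / len(xs)
```

Here each `xs[i]` would be one song's feature vector (for example, the concatenated Fourier summaries) and `ys[i]` its genre label.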
Genre Classification: Chroma/MFCC
Genre Classification: Morse Filtrations
Genre Classification: 1D Rips Filtration
Genre Classification: Morse + 1D Rips
Genre Classification: Chroma/MFCC + Morse + 1D Rips
Genre Forensics: Mean MFCC Morse
Genre Forensics: Mean MFCC 1D Rips
Genre Forensics: Mean Chroma Morse
Genre Forensics: Mean Chroma 1D Rips
Next Steps: Dan Ellis Artist20 Dataset
- Slightly more recent standard dataset (2007)
- 20 artists, 5 albums each
- 4 albums used for training, 1 used for test
- Believe TDA can say something more about global structure
Chorus/Verse Loop on Whole Song
Last Example: LoopDitty Bridge
- Motorhead: Ace of Spades
Questions?
Thank You!