Bayesian Nonparametric Matrix Factorization for Recorded Music
Matthew D. Hoffman, David M. Blei, Perry R. Cook
Presented by Lu Ren
Electrical and Computer Engineering
Duke University
Outline
Introduction
GaP-NMF Model
Variational Inference
Evaluation
Related Work
Conclusions
Introduction
Breaking audio spectrograms into separate sources of sound
Identifying individual instruments and notes
Predicting hidden or distorted signals
Source separation
Previous work: the number of sources must be specified in advance
Bayesian nonparametric Gamma Process Nonnegative Matrix Factorization (GaP-NMF)
Computational challenge: non-conjugate pairs of distributions
• distributions chosen to fit spectrogram data, not for computational convenience
• a larger variational family still admits an analytic coordinate-ascent algorithm
GaP-NMF Model
Observation: Fourier power spectrogram of an audio signal
X: an M by N matrix of nonnegative reals
X_mn: power at time window n and frequency bin m
A window of 2(M-1) samples → DFT → squared magnitude in each frequency bin → keep only the first M bins
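The pipeline above can be sketched as follows. The Hann window and hop size are illustrative assumptions (the slides do not fix them); NumPy's rfft of a 2(M-1)-sample frame conveniently returns exactly the first M bins.

```python
import numpy as np

def power_spectrogram(signal, M=257, hop=256):
    """Compute an M x N power spectrogram: slide a window of 2(M-1)
    samples, take the DFT, square the magnitudes, keep the first M bins."""
    win_len = 2 * (M - 1)
    window = np.hanning(win_len)  # assumed window shape; not specified in the slides
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        frame = signal[start:start + win_len] * window
        spectrum = np.fft.rfft(frame)          # rfft of 2(M-1) samples -> M bins
        frames.append(np.abs(spectrum) ** 2)   # squared magnitude per frequency bin
    return np.array(frames).T                  # rows: frequency m, columns: time n
```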
Assume K static sound sources
W (M by K): describes these sources
H (K by N): amplitude of each source changing over time
W_mk is the average amount of energy source k exhibits at frequency m
H_kn is the gain of source k at time n
GaP-NMF Model
Mixing K sound sources in the time domain (under certain assumptions), the spectrogram is distributed¹
X_mn ~ Exponential with mean Σ_k θ_k W_mk H_kn
with priors W_mk ~ Gamma(a, a), H_kn ~ Gamma(b, b), θ_k ~ Gamma(α/K, αc)
Infer both the characters and the number of latent audio sources
K: truncation level
¹Abdallah & Plumbley (2004) and Fevotte et al. (2009)
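A minimal sketch of sampling from this generative model. The hyperparameter defaults are placeholders, and note NumPy parameterizes gamma/exponential by shape and scale, so rates are inverted:

```python
import numpy as np

def sample_gap_nmf(M=100, N=50, K=30, a=0.1, b=0.1, alpha=1.0, c=1.0, seed=None):
    """Draw one spectrogram from the (truncated) GaP-NMF generative model."""
    rng = np.random.default_rng(seed)
    W = rng.gamma(a, 1.0 / a, size=(M, K))                   # W_mk ~ Gamma(a, a)
    H = rng.gamma(b, 1.0 / b, size=(K, N))                   # H_kn ~ Gamma(b, b)
    theta = rng.gamma(alpha / K, 1.0 / (alpha * c), size=K)  # theta_k ~ Gamma(alpha/K, alpha*c)
    mean = np.einsum('mk,k,kn->mn', W, theta, H)             # sum_k theta_k W_mk H_kn
    X = rng.exponential(mean)                                # X_mn ~ Exponential with that mean
    return X, W, H, theta
```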
GaP-NMF Model
As K goes to infinity, θ approximates an infinite sequence drawn from a gamma process. The number of elements greater than some ε > 0 is finite almost surely:
E[#{k : θ_k > ε}] ≤ E[Σ_k θ_k] / ε = 1/(εc) < ∞ (Markov's inequality)
If K is sufficiently large relative to α, only a few elements of θ are substantially greater than 0. Setting c controls the expected total gain: E[Σ_k θ_k] = 1/c
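A quick simulation (with illustrative hyperparameters) of why this yields sparsity: as K grows, the shape α/K shrinks, so the number of θ_k substantially greater than zero stays small even though K grows without bound.

```python
import numpy as np

def count_active(K, alpha=1.0, c=1.0, eps=1e-2, seed=0):
    """Number of theta_k substantially greater than 0 under
    theta_k ~ Gamma(shape=alpha/K, rate=alpha*c)."""
    rng = np.random.default_rng(seed)
    theta = rng.gamma(alpha / K, 1.0 / (alpha * c), size=K)
    return int(np.sum(theta > eps))

# The active count stays roughly constant as the truncation K grows.
for K in (10, 100, 1000):
    print(K, count_active(K))
```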
Variational Inference
Variational distribution: an expanded family
Generalized Inverse-Gaussian (GIG):
q(x; γ, ρ, τ) = (ρ/τ)^(γ/2) x^(γ−1) exp(−ρx − τ/x) / (2 K_γ(2√(ρτ))), x > 0
K_γ denotes a modified Bessel function of the second kind
The Gamma family is a special case of the GIG family where τ = 0 and γ > 0
Variational Inference
Lower bound of the GaP-NMF model (ELBO):
log p(X) ≥ E_q[log p(X | W, H, θ)] + E_q[log p(W, H, θ)] − E_q[log q(W, H, θ)]
Evaluating the bound requires the expected sufficient statistics of q
GIG family sufficient statistics: E_q[x] = √(τ/ρ) K_{γ+1}(2√(ρτ)) / K_γ(2√(ρτ)),  E_q[1/x] = √(ρ/τ) K_{γ−1}(2√(ρτ)) / K_γ(2√(ρτ))
Gamma family sufficient statistics: E_q[x] = γ/ρ,  E_q[log x] = ψ(γ) − log ρ
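A sketch of the GIG expectations using SciPy's modified Bessel function kv. As a sanity check, with τ near 0 the GIG mean approaches the Gamma mean γ/ρ:

```python
import numpy as np
from scipy.special import kv  # modified Bessel function of the second kind

def gig_expectations(gamma, rho, tau):
    """E[x] and E[1/x] under GIG(gamma, rho, tau) with density
    proportional to x^(gamma-1) * exp(-rho*x - tau/x)."""
    s = 2.0 * np.sqrt(rho * tau)
    Ex = np.sqrt(tau / rho) * kv(gamma + 1, s) / kv(gamma, s)
    Einvx = np.sqrt(rho / tau) * kv(gamma - 1, s) / kv(gamma, s)
    return Ex, Einvx

# tau -> 0: GIG(2, 3, tau) approaches Gamma(2, 3), whose mean is 2/3.
Ex, Einvx = gig_expectations(2.0, 3.0, 1e-8)
print(Ex)
```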
Variational Inference
The likelihood term expands to:
E_q[log p(X | W, H, θ)] = Σ_{m,n} ( −E_q[log Σ_k θ_k W_mk H_kn] − X_mn E_q[1 / (Σ_k θ_k W_mk H_kn)] )
With Jensen's inequality (1/x is convex; any φ_mnk ≥ 0 with Σ_k φ_mnk = 1):
E_q[1 / (Σ_k θ_k W_mk H_kn)] ≤ Σ_k φ_mnk² E_q[1/θ_k] E_q[1/W_mk] E_q[1/H_kn]
Variational Inference
With a first-order Taylor approximation (log is concave, expanded around ω_mn):
E_q[log Σ_k θ_k W_mk H_kn] ≤ log ω_mn + (Σ_k E_q[θ_k] E_q[W_mk] E_q[H_kn] − ω_mn) / ω_mn
ω_mn: an arbitrary positive point
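Both inequalities can be checked numerically on arbitrary positive terms (the values below are synthetic stand-ins for t_k = θ_k W_mk H_kn):

```python
import numpy as np

rng = np.random.default_rng(1)
t = rng.gamma(2.0, 1.0, size=5)  # synthetic positive terms t_k

# Jensen: 1/sum_k t_k <= sum_k phi_k^2 / t_k, for any phi_k >= 0 summing to 1.
phi = np.full(5, 0.2)
assert 1.0 / t.sum() <= np.sum(phi ** 2 / t)

# First-order Taylor (concavity of log): log(sum_k t_k) <= log(w) + (sum_k t_k - w)/w
# for any positive expansion point w.
w = 3.0
assert np.log(t.sum()) <= np.log(w) + (t.sum() - w) / w
print("both bounds hold")
```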
Variational Inference
Tightening the likelihood bound: the bounds are tightest at
φ_mnk ∝ (E_q[1/θ_k] E_q[1/W_mk] E_q[1/H_kn])⁻¹ (normalized over k),  ω_mn = Σ_k E_q[θ_k] E_q[W_mk] E_q[H_kn]
Optimizing the variational distributions by coordinate ascent
For example, q(W_mk) = GIG(γ^W_mk, ρ^W_mk, τ^W_mk) with
γ^W_mk = a,  ρ^W_mk = a + Σ_n E_q[θ_k] E_q[H_kn] / ω_mn,  τ^W_mk = E_q[1/θ_k] Σ_n X_mn φ_mnk² E_q[1/H_kn]
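The optimal φ and ω can be sketched directly from expected statistics of q. The array shapes are assumptions for illustration: E_th and Einv_th have length K, E_W/Einv_W are M×K, E_H/Einv_H are K×N.

```python
import numpy as np

def tighten_bound(E_th, E_W, E_H, Einv_th, Einv_W, Einv_H):
    """Optimal auxiliary quantities for the likelihood bound:
    omega_mn = sum_k E[theta_k] E[W_mk] E[H_kn], and
    phi_mnk proportional to 1 / (E[1/theta_k] E[1/W_mk] E[1/H_kn])."""
    omega = np.einsum('k,mk,kn->mn', E_th, E_W, E_H)
    phi = 1.0 / np.einsum('k,mk,kn->kmn', Einv_th, Einv_W, Einv_H)
    phi /= phi.sum(axis=0, keepdims=True)  # normalize over k for each (m, n)
    return phi, omega
```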
Evaluation
Compare GaP-NMF to two variations:
1. Finite Bayesian model
2. Finite non-Bayesian model
Itakura-Saito Nonnegative Matrix Factorization (IS-NMF): maximize the likelihood in the above formula
Compare with another two NMF algorithms:
EU-NMF: minimize the sum of the squared Euclidean distance
KL-NMF: minimize the generalized KL-divergence
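For contrast with the Bayesian treatment, here is a minimal sketch of IS-NMF using the standard multiplicative updates; the random initialization and iteration count are arbitrary choices.

```python
import numpy as np

def is_nmf(X, K=5, n_iter=100, seed=0):
    """Multiplicative-update NMF under the Itakura-Saito divergence
    (the non-Bayesian baseline)."""
    rng = np.random.default_rng(seed)
    M, N = X.shape
    W = rng.random((M, K)) + 0.1
    H = rng.random((K, N)) + 0.1
    for _ in range(n_iter):
        V = W @ H
        W *= ((X / V ** 2) @ H.T) / ((1.0 / V) @ H.T)   # IS update for W
        V = W @ H
        H *= (W.T @ (X / V ** 2)) / (W.T @ (1.0 / V))   # IS update for H
    return W, H
```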
Evaluation
1. Synthetic Data
Evaluation
2. Marginal Likelihood & Bandwidth Expansion
Evaluation
3. Blind Monophonic Source Separation
Related Work
Conclusions
GaP-NMF: a Bayesian nonparametric model that infers the number of latent sources
Applicable to other types of audio