Date posted: 15-Dec-2015
Learning Measurement Matrices for Redundant Dictionaries
Richard Baraniuk
Rice University
Chinmay Hegde
MIT
Aswin Sankaranarayanan
CMU
Sparse Recovery
• Sparsity rocks, etc.
• Previous talk focused mainly on signal inference (ex: classification, NN search)
• This talk focuses on signal recovery
Compressive Sensing
• Sensing via randomized dimensionality reduction
[Figure: random measurements; sparse signal; nonzero entries]
• Recovery: solve an ill-posed inverse problem
– exploit the geometrical structure of sparse/compressible signals
• Gaussian measurements incoherent with any fixed orthonormal basis (with high probability)
• Ex: frequency domain:
General Sparsifying Bases
Sparse Modeling: Approach 1
• Step 1: Choose a signal model with structure
– e.g. bandlimited, smooth with r vanishing moments, etc.
• Step 2: Analytically design a sparsifying basis/frame that exploits this structure
– e.g. DCT, wavelets, Gabor, etc.
[Figure: DCT, wavelets, Gabor]
Sparse Modeling: Approach 2
• Learn the sparsifying basis/frame from training data
• Problem formulation: given a large number of training signals, design a dictionary D that simultaneously sparsifies the training data
• Called sparse coding / dictionary learning
Dictionaries
• Dictionary: an NxQ matrix whose columns are used as basis functions for the data
• Convention: assume columns are unit-norm
• More columns than rows, so dictionary is redundant / overcomplete
Dictionary Learning
• Rich vein of theoretical and algorithmic work: Olshausen and Field ['97], Lewicki and Sejnowski ['00], Elad ['06], Sapiro ['08]
• Typical formulation: Given training data
Solve:
• Several efficient algorithms, ex: K-SVD
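One common way to write the dictionary learning problem is sketched below. The symbols $X$, $A$, $T$ and the sparsity level $K$ are assumed notation for illustration, not taken from the slides, and the exact objective used in the talk may differ (for example, a penalized form with an $\ell_1$ term).

```latex
% Training data: X = [x_1, ..., x_T], each x_t in R^N
% Dictionary:    D in R^{N x Q};  sparse codes: A = [a_1, ..., a_T] in R^{Q x T}
\min_{D,\,A} \; \| X - D A \|_F^2
\quad \text{subject to} \quad \| a_t \|_0 \le K, \quad t = 1, \dots, T
```

Algorithms such as K-SVD alternate between updating the sparse codes $A$ with $D$ fixed and updating the columns of $D$ with the supports of $A$ fixed.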
Dictionary Learning
• Successfully applied to denoising, deblurring, inpainting, demosaicking, super-resolution, …
– State-of-the-art results in many of these problems
Aharon and Elad ['06]
Dictionary Coherence
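• Suppose that the learned dictionary is normalized to have unit $\ell_2$-norm columns: $\|d_i\|_2 = 1$ for every column $d_i$ of D
• The mutual coherence of D is defined as $\mu(D) = \max_{i \neq j} |\langle d_i, d_j \rangle|$
• Geometrically, $\mu(D)$ represents the cosine of the minimum angle between the columns of D; smaller is better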
• Crucial parameter in analysis as well as practice (line of work starting with Tropp [04])
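The coherence computation can be made concrete with a few lines of NumPy. This is a minimal sketch that assumes the standard definition (largest absolute inner product between distinct unit-norm columns); the function name `mutual_coherence` and the toy dictionary are illustrative only.

```python
import numpy as np

def mutual_coherence(D):
    """Largest absolute inner product between distinct normalized columns of D."""
    D = np.asarray(D, dtype=float)
    Dn = D / np.linalg.norm(D, axis=0, keepdims=True)  # enforce unit-norm columns
    G = np.abs(Dn.T @ Dn)                              # |Gram matrix| of the columns
    np.fill_diagonal(G, 0.0)                           # ignore the i == j entries
    return float(G.max())

# Toy example (assumption): the 2x2 identity plus one diagonal atom.
D_redundant = np.column_stack([np.eye(2), [1.0, 1.0]])
print(mutual_coherence(np.eye(2)), mutual_coherence(D_redundant))
```

An orthonormal basis has coherence 0, while adding the diagonal atom creates a 45-degree angle with each basis vector, so the coherence becomes cos(45°).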
Dictionaries and CS
• Can extend CS to work with non-orthonormal, redundant dictionaries
• Coherence of $\Phi D$ (the measurement matrix $\Phi$ applied to the dictionary D) determines recovery success: Rauhut et al. [08], Candès et al. [10]
• Fortunately, a random $\Phi$ guarantees low coherence
Holographic basis
Geometric Intuition
• Columns of D: points on the unit sphere
• Coherence: minimum angle between the vectors
• J-L Lemma: Random projections approximately preserve angles between vectors
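The angle-preservation intuition can be checked numerically. The sketch below assumes a hypothetical random 64x128 dictionary and an i.i.d. Gaussian measurement matrix; the function names are illustrative, and the printed values depend on the random seed.

```python
import numpy as np

def coherence(A):
    """Largest |cosine| of the angle between distinct columns of A."""
    A = A / np.linalg.norm(A, axis=0, keepdims=True)
    G = np.abs(A.T @ A)
    np.fill_diagonal(G, 0.0)
    return float(G.max())

def projected_coherence(D, M, seed=0):
    """Coherence of Phi @ D for an M x N i.i.d. Gaussian Phi."""
    rng = np.random.default_rng(seed)
    Phi = rng.standard_normal((M, D.shape[0])) / np.sqrt(M)
    return coherence(Phi @ D)

rng = np.random.default_rng(1)
D = rng.standard_normal((64, 128))   # hypothetical random dictionary
D /= np.linalg.norm(D, axis=0)       # unit-norm columns

for M in (8, 16, 32, 64):
    print(M, round(coherence(D), 3), round(projected_coherence(D, M), 3))
```

One would expect the coherence of the projected dictionary to move closer to that of D as M grows, and to degrade sharply when M is very small.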
Q: Can we do better than random projections for dictionary-based CS?
Q restated: For a given dictionary D, find the best CS measurement matrix
Optimization Approach
• Assume that a good dictionary D has been provided.
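• Goal: Learn the best measurement matrix $\Phi$ for this particular D
• As before, want the “shortest” matrix $\Phi$ (fewest rows) such that the coherence of $\Phi D$ is at most some parameter $\delta$
• To avoid degeneracies caused by a simple scaling, also want that $\Phi$ does not shrink the columns of D much, i.e., each $\|\Phi d_i\|_2$ stays close to $\|d_i\|_2 = 1$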
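Written out, the design problem might look like the sketch below. The tolerance symbols $\delta$ and $\eta$, and the exact form of the norm constraint, are assumptions made for illustration; the slides do not preserve the original equations.

```latex
% Phi: M x N measurement matrix; d_i: i-th unit-norm column of D
% delta, eta: assumed tolerance symbols (not given in the slides)
\min_{\Phi \in \mathbb{R}^{M \times N}} \; M
\quad \text{subject to} \quad
\bigl| \langle \Phi d_i, \Phi d_j \rangle \bigr| \le \delta \;\; (i \ne j),
\qquad
\| \Phi d_i \|_2^2 \ge 1 - \eta \;\; \text{for all } i
```

Both constraints are quadratic in $\Phi$, which is what makes the problem hard in this form.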
A NuMax-like Framework
• Convert quadratic constraints in $\Phi$ into linear constraints in $P = \Phi^T \Phi$ (via the “lifting trick”)
• Use a nuclear-norm relaxation of the rank
• Simplified problem:
• Alternating Direction Method of Multipliers (ADMM)
– solve for P using spectral thresholding
– solve for L using least-squares
– solve for q using “squishing”
• Convergence rate depends on the size of the dictionary (since the number of constraints grows with the number of column pairs of D)
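The lifting trick and the spectral step can be illustrated in NumPy. This is a sketch under stated assumptions, not the NuMax-Dict implementation: it takes the lifted variable to be P = Φ^T Φ, uses eigenvalue soft-thresholding as a stand-in for “spectral thresholding”, and symmetrizes P before the eigendecomposition as one simple way to avoid complex eigenvalues. All function names are hypothetical.

```python
import numpy as np

def lifted_gram(D, P):
    """All pairwise values d_i^T P d_j; each entry is linear in P."""
    return D.T @ P @ D

def soft_threshold_spectrum(P, tau):
    """Shrink the eigenvalues of the symmetric part of P by tau, clipping at zero."""
    w, V = np.linalg.eigh((P + P.T) / 2.0)
    w = np.maximum(w - tau, 0.0)
    return (V * w) @ V.T              # V diag(w) V^T

def factor(P, tol=1e-10):
    """Recover a matrix Phi with Phi^T Phi = P from a PSD matrix P."""
    w, V = np.linalg.eigh((P + P.T) / 2.0)
    keep = w > tol                    # discard numerically zero eigenvalues
    return (V[:, keep] * np.sqrt(w[keep])).T

rng = np.random.default_rng(0)
D = rng.standard_normal((8, 16))      # small hypothetical dictionary
D /= np.linalg.norm(D, axis=0)
Phi = rng.standard_normal((3, 8))     # arbitrary 3-row measurement matrix
P = Phi.T @ Phi                       # lifted variable

# Quadratic in Phi, linear in P: <Phi d_i, Phi d_j> = d_i^T P d_j.
print(np.allclose(lifted_gram(D, P), (Phi @ D).T @ (Phi @ D)))
```

The number of rows of the recovered Φ equals the rank of P, which is why a low-rank P (encouraged by the nuclear norm) corresponds to few measurements.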
Algorithm: “NuMax-Dict”
[HSYB12]
NuMax vs. NuMax-Dict
• Same intuition, trick, algorithm, etc.
• Key enabler is that coherence is intrinsically a quadratic function of the data
• Key difference: the (linearized) constraints are no longer symmetric
– We have constraints of the form $d_i^T P d_j$ with $i \neq j$, where $P = \Phi^T \Phi$
– This might result in intermediate P estimates having complex eigenvalues, so the notion of spectral thresholding needs to be slightly modified
Experimental Results
Expt 1: Synthetic Dictionary
• Generic dictionary: random with unit-norm columns
• Dictionary size: 64x128
• We construct different measurement matrices:
– Random
– NuMax-Dict
– Algorithm by Elad [06]
– Algorithm by Duarte-Carvajalino & Sapiro [08]
• We generate K=3 sparse signals with Gaussian amplitudes, add 30dB measurement noise
• Recovery using OMP
• Measure recovery SNR, plot as a function of M
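A single trial of this setup can be sketched as follows. Only the random measurement matrix baseline is shown; NuMax-Dict and the other designs are not implemented here. The OMP routine is a plain textbook version, and reading “30dB measurement noise” as a 30 dB signal-to-noise ratio on the measurements is an assumption. Function and parameter names are hypothetical.

```python
import numpy as np

def omp(A, y, k):
    """Greedy k-sparse approximation of y using the columns of A."""
    residual, support, coef = y.copy(), [], np.zeros(0)
    col_norms = np.linalg.norm(A, axis=0)
    for _ in range(k):
        scores = np.abs(A.T @ residual) / col_norms
        if support:
            scores[support] = 0.0                 # never pick an atom twice
        support.append(int(np.argmax(scores)))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

def run_trial(M, N=64, Q=128, K=3, noise_db=30.0, seed=0):
    """Recovery SNR (dB) for one K-sparse signal and a random M x N Phi."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((N, Q))
    D /= np.linalg.norm(D, axis=0)                # unit-norm columns
    Phi = rng.standard_normal((M, N)) / np.sqrt(M)
    alpha = np.zeros(Q)
    alpha[rng.choice(Q, K, replace=False)] = rng.standard_normal(K)
    x = D @ alpha
    y_clean = Phi @ x
    noise = rng.standard_normal(M)
    noise *= np.linalg.norm(y_clean) / np.linalg.norm(noise) * 10 ** (-noise_db / 20)
    x_hat = D @ omp(Phi @ D, y_clean + noise, K)  # recover in the dictionary domain
    return 10 * np.log10(np.sum(x**2) / np.sum((x - x_hat) ** 2))

for M in (10, 20, 30, 40):
    print(M, np.mean([run_trial(M, seed=s) for s in range(20)]))
```

Swapping a designed matrix in place of the random `Phi` and averaging over more trials would give the kind of SNR-versus-M comparison described above.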
Expt 1: Synthetic Dictionary
Expt 2: Practical Dictionaries
• 2x overcomplete DCT dictionary, same parameters
• 2x overcomplete dictionary learned on 8x8 patches of a real-world image (Barbara) using K-SVD
• Recovery using OMP
Analysis
• Exact problem seems to be hard to analyze
• But, as in NuMax, can provide analytical bounds in the special case where the measurement matrix is further constrained to be orthonormal
Orthogonal Sensing of Dictionary-Sparse Signals
• Given a dictionary D, find the orthonormal measurement matrix that provides the best possible coherence
• From a geometric perspective, ortho-projections cannot improve coherence, so necessarily
Semidefinite Relaxation
• The usual trick: Lifting and trace-norm relaxation
Theoretical Result
• Theorem: For any given redundant dictionary D, denote its mutual coherence by $\mu(D)$.
Denote the optimum of the (nonconvex) problem as
Then, there exists a method to produce a rank-2M ortho matrix $\Phi$ such that the coherence of $\Phi D$ is at most
i.e., we can obtain close-to-optimal performance, but pay a price of a factor of 2 in the number of measurements
Conclusions
• NuMax-Dict performance comparable to the best existing algorithms
• Principled convex optimization framework
• Efficient ADMM-type algorithm that exploits the rank-1 structure of the problem
• Upshot: possible to incorporate other structure into the measurement matrix, such as positivity, sparsity, etc.
Open Question
• Above framework assumes a two-step approach: first construct a redundant dictionary (analytically or from data) and then construct a measurement matrix
• Given a large amount of training data, how to efficiently solve jointly for both the dictionary and the sensing matrix? (Approach introduced in Duarte-Carvajalino & Sapiro [08])