Page 1:

Learning Measurement Matrices for Redundant Dictionaries

Richard Baraniuk

Rice University

Chinmay Hegde

MIT

Aswin Sankaranarayanan

CMU

Page 2:

Sparse Recovery

• Sparsity rocks, etc.

• Previous talk focused mainly on signal inference (ex: classification, NN search)

• This talk focuses on signal recovery

Page 3:

Compressive Sensing

• Sensing via randomized dimensionality reduction

[Figure: measurement model y = Φx, with M random measurements of an N-dimensional sparse signal x having K nonzero entries]

• Recovery: solve an ill-posed inverse problem

exploit the geometrical structure of sparse/compressible signals
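The talk does not commit to a particular solver here; as one standard instance of such a recovery problem, below is a minimal ℓ1-minimization (basis pursuit) sketch in numpy/scipy. The sizes N, M, K are illustrative assumptions, not values from the talk.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
N, M, K = 128, 40, 5                     # illustrative signal length, measurements, sparsity

# K-sparse signal and random Gaussian measurements y = Phi @ x
x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Phi @ x

# Basis pursuit: min ||x||_1  s.t.  Phi x = y,
# written as a linear program with x = u - v, u >= 0, v >= 0
c = np.ones(2 * N)
A_eq = np.hstack([Phi, -Phi])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
x_hat = res.x[:N] - res.x[N:]
print("recovery error:", np.linalg.norm(x - x_hat))
```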

Page 4:

General Sparsifying Bases

• Gaussian measurements incoherent with any fixed orthonormal basis (with high probability)

• Ex: frequency domain

Page 5:

Sparse Modeling: Approach 1

• Step 1: Choose a signal model with structure
– e.g., bandlimited, smooth with r vanishing moments, etc.

• Step 2: Analytically design a sparsifying basis/frame that exploits this structure
– e.g., DCT, wavelets, Gabor, etc.

[Figure: example atoms from DCT, wavelet, and Gabor dictionaries]

Page 6:

Sparse Modeling: Approach 2

• Learn the sparsifying basis/frame from training data

• Problem formulation: given a large number of training signals, design a dictionary D that simultaneously sparsifies the training data

• Called sparse coding / dictionary learning

Page 7:

Dictionaries

• Dictionary: an N×Q matrix whose columns are used as basis functions for the data

• Convention: assume columns are unit-norm

• More columns than rows (Q > N), so the dictionary is redundant / overcomplete

Page 8:

Dictionary Learning

• Rich vein of theoretical and algorithmic work: Olshausen and Field ['97], Lewicki and Sejnowski ['00], Elad ['06], Sapiro ['08]

• Typical formulation: Given training data $X = [x_1, \ldots, x_T]$, solve for a dictionary D and sparse codes $A = [\alpha_1, \ldots, \alpha_T]$:

$$\min_{D,\,A}\; \|X - DA\|_F^2 \quad \text{s.t.} \quad \|\alpha_i\|_0 \le K \;\;\forall i, \qquad \|d_j\|_2 = 1 \;\;\forall j$$

• Several efficient algorithms, ex: K-SVD
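As a concrete but non-authoritative illustration of this formulation, here is a minimal sketch using scikit-learn's MiniBatchDictionaryLearning as a stand-in for K-SVD. The sizes and parameters are my own choices, and the training data is random noise purely to keep the snippet self-contained.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
T, N, Q, K = 2000, 64, 128, 3          # illustrative sizes (e.g., 8x8 patches -> N = 64)
X = rng.standard_normal((T, N))        # stand-in for real training patches, one per row

# Learn a Q-atom dictionary whose atoms sparsify X (K nonzeros per signal via OMP coding)
dl = MiniBatchDictionaryLearning(
    n_components=Q,
    transform_algorithm="omp",
    transform_n_nonzero_coefs=K,
    random_state=0,
)
A = dl.fit_transform(X)                # sparse codes, shape (T, Q)
D = dl.components_.T                   # dictionary, shape (N, Q)
D = D / np.linalg.norm(D, axis=0, keepdims=True)   # enforce unit-norm columns

print(D.shape, int((A != 0).sum(axis=1).max()))    # (64, 128), at most K nonzeros per code
```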

Page 9:

Dictionary Learning

• Successfully applied to denoising, deblurring, inpainting, demosaicking, super-resolution, …
– State-of-the-art results in many of these problems

[Image: denoising results, Aharon and Elad '06]

Page 10:

Dictionary Coherence

• Suppose that the learned dictionary is normalized to have unit ℓ2-norm columns: $\|d_i\|_2 = 1$, $i = 1, \ldots, Q$

• The mutual coherence of D is defined as

$$\mu(D) = \max_{i \ne j} \, |\langle d_i, d_j \rangle|$$

• Geometrically, μ(D) represents the cosine of the minimum angle between the columns of D; smaller is better

• Crucial parameter in analysis as well as practice (line of work starting with Tropp [04])
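A minimal numpy sketch of this definition (my own illustration, not code from the talk):

```python
import numpy as np

def mutual_coherence(D):
    """Largest absolute inner product between distinct unit-norm columns of D."""
    D = D / np.linalg.norm(D, axis=0, keepdims=True)   # enforce unit-norm columns
    G = np.abs(D.T @ D)                                 # |<d_i, d_j>| for all pairs
    np.fill_diagonal(G, 0.0)                            # ignore the i == j terms
    return G.max()

# Example: a random 64 x 128 dictionary (same size as in the experiments later)
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
print(mutual_coherence(D))
```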

Page 11:

Dictionaries and CS

• Can extend CS to work with non-orthonormal, redundant dictionaries

• Coherence of ΦD determines recovery success Rauhut et al. [08], Candes et al. [10]

• Fortunately, a random Φ guarantees low coherence of ΦD

[Figure: measurement model with effective dictionary ΦD ("holographic basis")]

Page 12:

Geometric Intuition

• Columns of D: points on the unit sphere

• Coherence: cosine of the minimum angle between the vectors

• J-L Lemma: Random projections approximately preserve angles between vectors
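To make this concrete (my own sketch, not from the talk): with a random Gaussian Φ, the coherence of ΦD stays close to that of D. The sizes N, Q, M below are illustrative assumptions.

```python
import numpy as np

def coh(A):
    """Mutual coherence of the columns of A (after normalization)."""
    A = A / np.linalg.norm(A, axis=0, keepdims=True)
    G = np.abs(A.T @ A)
    np.fill_diagonal(G, 0.0)
    return G.max()

rng = np.random.default_rng(1)
N, Q, M = 64, 128, 32

D = rng.standard_normal((N, Q))
D /= np.linalg.norm(D, axis=0, keepdims=True)

Phi = rng.standard_normal((M, N)) / np.sqrt(M)     # random Gaussian measurement matrix

print("coherence of D:      ", coh(D))
print("coherence of Phi @ D:", coh(Phi @ D))
```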

Page 13:

Q: Can we do better than random projections for dictionary-based CS?

Q restated: For a given dictionary D, find the best CS measurement matrix Φ

Page 14:

Optimization Approach

• Assume that a good dictionary D has been provided.

• Goal: Learn the best Φ for this particular D

• As before, want the "shortest" (fewest-row) matrix Φ such that the coherence of ΦD is at most some parameter μ₀

• To avoid degeneracies caused by a simple scaling, also want that Φ does not shrink the columns of D much (see the formulation sketched below)
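In symbols, the design problem reads roughly as follows (my reconstruction from the bullets above; μ₀ is the coherence target from the previous bullet, and δ is a small shrinkage tolerance I am introducing):

$$
\min_{\Phi \in \mathbb{R}^{M \times N}} \; M
\quad \text{s.t.} \quad
|\langle \Phi d_i, \Phi d_j \rangle| \le \mu_0 \;\; (i \ne j),
\qquad
\|\Phi d_i\|_2^2 \ge 1 - \delta \;\; \forall i .
$$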

Page 15:

A NuMax-like Framework

• Convert quadratic constraints in Φ into linear constraints in P = ΦᵀΦ (via the "lifting trick")

• Use a nuclear-norm relaxation of the rank

• Simplified problem:
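(Reconstructed from the description above rather than copied from the talk: with P = ΦᵀΦ, coherence target μ₀, and shrinkage tolerance δ, the lifted and relaxed problem is roughly)

$$
\min_{P \succeq 0} \; \|P\|_*
\quad \text{s.t.} \quad
|d_i^\top P\, d_j| \le \mu_0 \;\; (i \ne j),
\qquad
d_i^\top P\, d_i \ge 1 - \delta \;\; \forall i ,
$$

with a measurement matrix Φ recovered from the optimal P via a factorization $P = \Phi^\top \Phi$ (e.g., an eigendecomposition), and the number of measurements M given by the rank of P.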

Page 16:

• Alternating Direction Method of Multipliers (ADMM)

- solve for P using spectral thresholding
- solve for L using least-squares
- solve for q using "squishing"

• Convergence rate depends on the size of the dictionary (since #constraints = O(Q²))

Algorithm: “NuMax-Dict”

[HSYB12]

Page 17:

NuMax vs. NuMax-Dict

• Same intuition, trick, algorithm, etc.

• Key enabler is that coherence is intrinsically a quadratic function of the data

• Key difference: the (linearized) constraints are no longer symmetric

– We have constraints of the form $d_i^\top P\, d_j = \langle P,\, d_i d_j^\top \rangle$, where the rank-one matrices $d_i d_j^\top$ are not symmetric

– This might result in intermediate P estimates having complex eigenvalues, so the notion of spectral thresholding needs to be slightly modified
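For intuition, here is a minimal sketch of a spectral-thresholding proximal step for the nuclear norm, with an explicit symmetrization to handle non-symmetric intermediate iterates. This is my own illustrative guess at the kind of modification meant above, not the authors' NuMax-Dict code.

```python
import numpy as np

def spectral_threshold(P, tau):
    """Proximal step for tau * ||P||_* restricted to symmetric PSD matrices.

    P may be a non-symmetric intermediate ADMM iterate; symmetrize it first so
    that its eigenvalues are real, then soft-threshold the eigenvalues.
    """
    P_sym = 0.5 * (P + P.T)                    # nearest symmetric matrix (Frobenius norm)
    eigvals, eigvecs = np.linalg.eigh(P_sym)   # real eigenvalues, orthonormal eigenvectors
    shrunk = np.maximum(eigvals - tau, 0.0)    # soft-threshold (also enforces PSD)
    return (eigvecs * shrunk) @ eigvecs.T

# Tiny usage example with a deliberately non-symmetric input
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
P_next = spectral_threshold(A, tau=0.1)
print(np.allclose(P_next, P_next.T), np.linalg.eigvalsh(P_next).min() >= -1e-12)
```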

Page 18:

Experimental Results

Page 19:

Expt 1: Synthetic Dictionary

• Generic dictionary: random with unit-norm columns

• Dictionary size: 64×128

• We construct different measurement matrices:
– Random
– NuMax-Dict
– Algorithm by Elad [06]
– Algorithm by Duarte-Carvajalino & Sapiro [08]

• We generate K=3 sparse signals with Gaussian amplitudes, add 30dB measurement noise

• Recovery using OMP

• Measure recovery SNR, plot as a function of M
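A simplified, non-authoritative version of this experiment in numpy/scikit-learn, using only a random Φ (the NuMax-Dict, Elad, and Duarte-Carvajalino & Sapiro matrices would simply replace Phi below). The dictionary size, K=3, 30 dB noise, and OMP recovery follow the description above; everything else is my own stand-in.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
N, Q, K, M, n_trials = 64, 128, 3, 32, 200

# Random dictionary with unit-norm columns
D = rng.standard_normal((N, Q))
D /= np.linalg.norm(D, axis=0, keepdims=True)

# Random Gaussian measurement matrix (baseline)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
PhiD = Phi @ D

snrs = []
for _ in range(n_trials):
    # K-sparse coefficient vector with Gaussian amplitudes
    alpha = np.zeros(Q)
    alpha[rng.choice(Q, K, replace=False)] = rng.standard_normal(K)

    x = D @ alpha                          # signal
    y = Phi @ x                            # noiseless measurements
    noise = rng.standard_normal(M)
    noise *= np.linalg.norm(y) / (np.linalg.norm(noise) * 10 ** (30 / 20))  # 30 dB SNR
    y_noisy = y + noise

    # Recover the sparse code over the effective dictionary Phi @ D using OMP
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=K, fit_intercept=False)
    omp.fit(PhiD, y_noisy)
    x_hat = D @ omp.coef_

    snrs.append(20 * np.log10(np.linalg.norm(x) / np.linalg.norm(x - x_hat)))

print("mean recovery SNR (dB):", np.mean(snrs))
```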

Page 20:

Expt 1: Synthetic Dictionary

[Figure: recovery SNR vs. number of measurements M for the four measurement matrices]

Page 21:

Expt 2: Practical Dictionaries

• 2× overcomplete DCT dictionary, same parameters

• 2× overcomplete dictionary learned on 8×8 patches of a real-world image (Barbara) using K-SVD

• Recovery using OMP

Page 22:

Analysis

• Exact problem seems to be hard to analyze

• But, as in NuMax, can provide analytical bounds in the special case where the measurement matrix is further constrained to be orthonormal

Page 23:

Orthogonal Sensing of Dictionary-Sparse Signals

• Given a dictionary D, find the orthonormal measurement matrix Φ that provides the best possible coherence for ΦD

• From a geometric perspective, ortho-projections cannot improve coherence, so necessarily μ(ΦD) ≥ μ(D)

Page 24:

Semidefinite Relaxation

• The usual trick: Lifting and trace-norm relaxation
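One plausible form of this relaxation, reconstructed from the lifting described earlier (this is my reading, not a formula taken from the talk): with P = ΦᵀΦ and an M-row orthonormal Φ, P is a rank-M orthogonal projector, which relaxes to the spectral constraints below.

$$
\min_{P,\ \mu} \;\; \mu
\quad \text{s.t.} \quad
|d_i^\top P\, d_j| \le \mu \;\; (i \ne j),
\qquad
0 \preceq P \preceq I,
\qquad
\operatorname{trace}(P) = M .
$$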

Page 25:

Theoretical Result

• Theorem: For any given redundant dictionary D, denote its mutual coherence by μ(D).

Denote the optimum of the (nonconvex) problem as μ*.

Then, there exists a method to produce a rank-2M orthonormal matrix Φ such that the coherence of ΦD is close to the optimum μ*.

i.e., we can obtain close-to-optimal performance, but pay a price of a factor of 2 in the number of measurements

Page 26:

Conclusions

• NuMax-Dict performance comparable to the best existing algorithms

• Principled convex optimization framework

• Efficient ADMM-type algorithm that exploits the rank-1 structure of the problem

• Upshot: possible to incorporate other structure into the measurement matrix, such as positivity, sparsity, etc.

Page 27:

Open Question

• Above framework assumes a two-step approach: first construct a redundant dictionary (analytically or from data) and then construct a measurement matrix

• Given a large amount of training data, how can we efficiently solve jointly for both the dictionary and the sensing matrix? (Approach introduced in Duarte-Carvajalino & Sapiro [08])

