omp-and-k-svd - Gdc2013

Talk from GDC 2013.

Transcript
Page 1: omp-and-k-svd - Gdc2013

Orthogonal Matching Pursuit and K-SVD for Sparse Encoding Manny Ko Senior Software Engineer, Imagination Technologies

Robin Green SSDE, Microsoft Xbox ATG

Page 2: omp-and-k-svd - Gdc2013

Outline

● Signal Representation

● Orthonormal Bases vs. Frames

● Dictionaries

● The Sparse Signal Model

● Matching Pursuit

● Implementing Orthogonal Matching Pursuit

● Learned Dictionaries and KSVD

● Image Processing with Learned Dictionaries

● OMP for GPUs

Page 3: omp-and-k-svd - Gdc2013

Representing Signals

● We represent most signals as linear combinations of functions we already know; finding the weights is called a projection

x = b₀ × α₀ + b₁ × α₁ + b₂ × α₂ + b₃ × α₃ + ⋯

Page 4: omp-and-k-svd - Gdc2013

Representing Signals

● Each function we use is a basis function and the scalar weights are coefficients

● The reconstruction x̂ is an approximation to the original x

● We can measure and control the error ‖x − x̂‖₂

x̂(t) = Σ_{i=0}^{N} α_i b_i(t)

Page 5: omp-and-k-svd - Gdc2013

Orthonormal Bases (ONBs)

● The simplest way to represent signals is using an orthonormal basis

∫_{−∞}^{+∞} b_i(t) b_j(t) dt = { 0 if i ≠ j
                                 1 if i = j }
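The discrete analogue of this condition is easy to verify numerically. A minimal sketch in NumPy; the DCT-II basis and the length N = 8 are illustrative choices, not examples from the talk:

```python
import numpy as np

# Build an orthonormal DCT-II basis for length-8 signals.
N = 8
t = np.arange(N)
B = np.array([np.cos(np.pi * (t + 0.5) * k / N) for k in range(N)])
B[0] *= np.sqrt(1.0 / N)    # normalization for the k = 0 row
B[1:] *= np.sqrt(2.0 / N)   # normalization for the k > 0 rows

# <b_i, b_j> should be 0 for i != j and 1 for i == j.
gram = B @ B.T
orthonormal = np.allclose(gram, np.eye(N))

# With an ONB, projection then reconstruction is exact.
x = np.random.randn(N)
alpha = B @ x          # coefficients alpha_i = <b_i, x>
x_hat = B.T @ alpha    # reconstruction
exact = np.allclose(x, x_hat)
print(orthonormal, exact)
```

Because the basis is orthonormal, analysis and synthesis are just a transpose pair, which is exactly the "fast algorithms for projection" benefit mentioned on the next slide.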

Page 6: omp-and-k-svd - Gdc2013

Example ONBs

● Fourier Basis

b_k(t) = e^{i 2π k t}

● Wavelets

b_{m,n}(t) = a^{−m/2} ψ(a^{−m} t − n b)

● Gabor Functions

b_{k,n}(t) = ω(t − b n) e^{i 2π k t}

● Contourlet

b_{j,k,n}(t) = λ_{j,k}(t − 2^{j−1} S_k n)

Page 7: omp-and-k-svd - Gdc2013

Benefits of ONB

● Analytic formulations

● Well understood mathematical properties

● Fast algorithms for projection

Page 8: omp-and-k-svd - Gdc2013

Limitations

● Orthonormal bases are optimal only for specific synthetic signals

● If your signal looks exactly like your basis, you only need one coefficient

● Limited expressiveness: all signals behave the same

● Real-world signals often take a lot of coefficients

● Just truncating the series leads to artifacts like aliasing

Page 9: omp-and-k-svd - Gdc2013

Smooth vs. Sharp

Haar Wavelet Basis

● Sharp edges

● Local support

Discrete Cosine Transform

● Smooth signals

● Global support

Page 10: omp-and-k-svd - Gdc2013

Overcomplete Bases

● Frames are overcomplete bases

● There is now more than one way to represent a signal

● By relaxing the ONB rules on minimal span, we can better approximate signals using more coefficients

Φ = [ e₁ | e₂ | e₃ ] = [ 1  0   1
                         0  1  −1 ]

Φ̃ = [ ẽ₁ | ẽ₂ | ẽ₃ ] = [ 2  −1  −1
                          0   1   0 ]
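The non-uniqueness is easy to demonstrate with the first frame above. A sketch; the signal x and the two coefficient vectors are made-up examples:

```python
import numpy as np

# The 2x3 frame from the slide: three atoms spanning R^2.
Phi = np.array([[1.0, 0.0,  1.0],
                [0.0, 1.0, -1.0]])

x = np.array([1.0, 2.0])

# Two different coefficient vectors that reconstruct the same signal.
a1 = np.array([1.0, 2.0, 0.0])   # uses only the first two atoms
a2 = np.array([0.0, 3.0, 1.0])   # leans on the third atom instead

same1 = np.allclose(Phi @ a1, x)
same2 = np.allclose(Phi @ a2, x)
print(same1, same2)
```

With more atoms than dimensions there is a whole family of exact representations; sparse coding is about picking the one with the fewest nonzero coefficients.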

Page 11: omp-and-k-svd - Gdc2013

Dictionaries

● A dictionary is an overcomplete basis made of atoms

● A signal is represented using a linear combination of only a few atoms

● Atoms work best when zero-mean and normalized

π‘‘π‘–π‘–βˆˆπΌ

𝛼𝑖 = π‘₯

𝑫𝛼 = π‘₯

Page 12: omp-and-k-svd - Gdc2013

Dictionaries

D α = x

Page 13: omp-and-k-svd - Gdc2013

Mixed Dictionaries

● A dictionary of Haar + DCT gives the best of both worlds. But now, how do we pick which coefficients to use?

Page 14: omp-and-k-svd - Gdc2013

The Sparse Signal Model

D α = x

● D: a fixed dictionary of size N × K

● α: a sparse vector of K coefficients

● x: the resulting signal of length N

Page 15: omp-and-k-svd - Gdc2013

The Sparse Signal Model

It's Simple
● Every result is built from a combination of a few atoms

It's Rich
● It's a general model; signals are a union of many low-dimensional parts

It's Used Everywhere
● The same model has been used for years in Wavelets, JPEG compression, anything where we've been throwing away coefficients

Page 16: omp-and-k-svd - Gdc2013

Solving for Sparsity

What is the minimum number of coefficients we can use?

1. Sparsity Constrained: keep adding atoms until we reach a maximum count

α̂ = argmin_α ‖Dα − x‖₂²  s.t.  ‖α‖₀ ≤ K

2. Error Constrained: keep adding atoms until we reach a certain accuracy

α̂ = argmin_α ‖α‖₀  s.t.  ‖Dα − x‖₂² ≤ ε

Page 17: omp-and-k-svd - Gdc2013

Naïve Sparse Methods

● We can directly find 𝛼 using Least Squares

● Given K=1000 and L=10 at one LS per nanosecond this would complete in ~8 million years.

1. set L = 1
2. generate S = { all size-L supports 𝒫_L(D) }
3. for each support S_i solve the least-squares problem min_α ‖Dα − x‖₂² where supp(α) ∈ S_i
4. if the LS error ≤ ε, finish!
5. set L = L + 1
6. goto 2
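The "~8 million years" figure checks out; a quick sketch of the arithmetic (the year conversion is approximate):

```python
import math

K, L = 1000, 10
num_subsets = math.comb(K, L)   # number of size-L support sets to test

# At one least-squares solve per nanosecond:
seconds = num_subsets * 1e-9
years = seconds / (365.25 * 24 * 3600)
print(f"{years:.2e} years")     # roughly 8e6 years
```

C(1000, 10) is about 2.6 × 10²³ supports, which is why exhaustive search is hopeless and greedy methods are needed.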

Page 18: omp-and-k-svd - Gdc2013

Matching Pursuit

Greedy Methods

1. Set the residual r = x
2. Find an unselected atom that best matches the residual: min ‖Dα − r‖
3. Re-calculate the residual from the matched atoms: r = x − Dα
4. Repeat until ‖r‖ ≤ ε

Page 19: omp-and-k-svd - Gdc2013

Problems with Matching Pursuit (MP)

● If the dictionary contains atoms that are very similar, they tend to match the residual over and over

● Similar atoms do not help the basis span the space of representable values quickly, wasting coefficients in a sparsity constrained solution

● Similar atoms may match strongly but will not have a large effect in reducing the absolute error in an error constrained solution

Page 20: omp-and-k-svd - Gdc2013

Orthogonal Matching Pursuit (OMP)

● Add an Orthogonal Projection to the residual calculation

1. set I := ∅, r := x, γ := 0
2. while (stopping test false) do
3.   k := argmax_k |d_k^T r|
4.   I := (I, k)
5.   γ_I := D_I⁺ x
6.   r := x − D_I γ_I
7. end while
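A minimal sketch of the loop above, using a least-squares solve for the pseudo-inverse step. The dictionary and test signal are made up, and this is an illustration rather than the talk's implementation:

```python
import numpy as np

def omp(D, x, eps=1e-6, max_atoms=None):
    """OMP: after each selection, re-fit ALL selected coefficients
    by orthogonal projection (least squares) onto the chosen atoms."""
    n, k = D.shape
    I = []                                   # selected atom indices
    r = x.copy()
    gamma = np.zeros(0)
    max_atoms = max_atoms or n
    for _ in range(max_atoms):
        j = int(np.argmax(np.abs(D.T @ r)))  # best-matching atom
        if j in I:                           # orthogonality makes this rare
            break
        I.append(j)
        # gamma_I := D_I^+ x  (least-squares orthogonal projection)
        gamma, *_ = np.linalg.lstsq(D[:, I], x, rcond=None)
        r = x - D[:, I] @ gamma
        if np.linalg.norm(r) <= eps:
            break
    alpha = np.zeros(k)
    alpha[I] = gamma
    return alpha

rng = np.random.default_rng(1)
D = rng.standard_normal((16, 32))
D /= np.linalg.norm(D, axis=0)
x0 = D[:, [3, 7]] @ np.array([2.0, -1.0])
alpha = omp(D, x0)
print(np.linalg.norm(x0 - D @ alpha))        # near zero
```

The projection step is what makes the residual orthogonal to every selected atom, so their correlations drop to zero and they are never picked again.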

Page 21: omp-and-k-svd - Gdc2013

Uniqueness and Stability

● OMP has guaranteed reconstruction (provided the dictionary is overcomplete)

● By projecting the input into the range-space of the atoms, we know that the residual will be orthogonal to the selected atoms

● Unlike Matching Pursuit (MP), that atom, and all similar ones, will not be reselected, so more of the space is spanned per iteration

Page 22: omp-and-k-svd - Gdc2013

Orthogonal Projection

● If the dictionary D were square, we could use an inverse

● Instead we use the pseudo-inverse D⁺ = (D^T D)⁻¹ D^T

Page 23: omp-and-k-svd - Gdc2013

Pseudoinverse is Fragile

● In floating point, the expression (D^T D)⁻¹ is notoriously numerically troublesome, a classic FP example

● Picture mapping all the points on a sphere using D^T D, then inverting: forming D^T D squares the condition number of D

Page 24: omp-and-k-svd - Gdc2013

Implementing the Pseudoinverse

● To avoid this, and reduce the cost of inversion, we can note that 𝐃T𝐃 is always symmetric and positive definite

● We can break the matrix into two triangular matrices using Cholesky

Decomposition 𝐀 = 𝐋𝐋𝑇

● Incremental Cholesky Decomp reuses the results of the previous iteration, adding a single new row and column each time

L_new = [ L     0
          w^T   √(1 − w^T w) ]   where w solves L w = D_I^T d_k

Page 25: omp-and-k-svd - Gdc2013

OMP-Cholesky

1. set I := ∅, L := [1], r := x, γ := 0, α := D^T x, n := 1
2. while (stopping test false) do
3.   k := argmax_k |d_k^T r|
4.   if n > 1 then
       w := Solve for w { L w = D_I^T d_k }
       L := [ L     0
              w^T   √(1 − w^T w) ]
5.   I := (I, k)
6.   γ_I := Solve for c { L L^T c = α_I }
7.   r := x − D_I γ_I
8.   n := n + 1
9. end while
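Putting the incremental factorization into the loop gives a sketch like the following. It assumes unit-norm atoms (so the Gram diagonal is 1), the demo dictionary is made up, and it follows the pseudocode above rather than any reference implementation:

```python
import numpy as np

def omp_cholesky(D, x, eps=1e-6, max_atoms=None):
    """OMP with an incrementally grown Cholesky factor of the Gram
    matrix, avoiding a fresh pseudo-inverse each iteration."""
    n, K = D.shape
    alpha0 = D.T @ x                  # correlations with x, computed once
    I = []
    L = np.ones((1, 1))
    r = x.copy()
    gamma = np.zeros(0)
    max_atoms = max_atoms or n
    for nidx in range(max_atoms):
        k = int(np.argmax(np.abs(D.T @ r)))
        if nidx > 0:
            # Grow L by one row/column: solve L w = D_I^T d_k.
            w = np.linalg.solve(L, D[:, I].T @ D[:, k])
            d = np.sqrt(max(1.0 - w @ w, 1e-12))
            L = np.block([[L, np.zeros((len(I), 1))],
                          [w[None, :], np.array([[d]])]])
        I.append(k)
        # Solve L L^T c = alpha_I with two triangular solves.
        gamma = np.linalg.solve(L.T, np.linalg.solve(L, alpha0[I]))
        r = x - D[:, I] @ gamma
        if np.linalg.norm(r) <= eps:
            break
    out = np.zeros(K)
    out[I] = gamma
    return out

rng = np.random.default_rng(2)
D = rng.standard_normal((16, 32))
D /= np.linalg.norm(D, axis=0)        # unit-norm atoms are required here
x0 = D[:, [3, 7]] @ np.array([2.0, -1.0])
alpha = omp_cholesky(D, x0)
print(np.linalg.norm(x0 - D @ alpha))
```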

Page 26: omp-and-k-svd - Gdc2013

OMP compression of Barbara

2 atoms 3 atoms 4 atoms

Page 27: omp-and-k-svd - Gdc2013
Page 28: omp-and-k-svd - Gdc2013
Page 29: omp-and-k-svd - Gdc2013
Page 30: omp-and-k-svd - Gdc2013

Batch OMP (BOMP)

● By pre-computing matrices, Batch OMP can speed up OMP on large numbers (>1000) of inputs against one dictionary

● To avoid computing D^T r at each iteration, precompute D^T x and the Gram matrix G = D^T D

D^T r = D^T (x − D_I (D_I)⁺ x)
      = D^T x − G_I (D_I)⁺ x
      = D^T x − G_I (D_I^T D_I)⁻¹ D_I^T x
      = D^T x − G_I (G_{I,I})⁻¹ D_I^T x

Page 31: omp-and-k-svd - Gdc2013

Learned Dictionaries and K-SVD

● OMP works well for a fixed dictionary, but it would work better if we could optimize the dictionary to fit the data

D A ≈ X

Page 32: omp-and-k-svd - Gdc2013

Sourcing Enough Data

● For training you will need a large number of samples compared to the size of the dictionary.

● Take blocks from all integer offsets on the pixel grid

π‘₯ = …

Page 33: omp-and-k-svd - Gdc2013

1. Sparse Encode

● Sparse-encode all entries in 𝐗. Collect these sparse vectors into an array 𝐀


Page 34: omp-and-k-svd - Gdc2013

2. Dictionary Update

● Find all signals in 𝐗 that use the atom in column d_k

Page 35: omp-and-k-svd - Gdc2013

2. Dictionary Update

● Find all signals in 𝐗 that use the atom in column d_k

● Calculate the error without d_k: E = X_I − Σ_{i≠k} d_i A_I^(i)

● Solve the LS problem:

d, a = argmin_{d,a} ‖E − d a^T‖_F²  s.t.  ‖d‖₂ = 1

● Update d_k with the new d and the k-th row of A_I with the new a

Page 36: omp-and-k-svd - Gdc2013

Atoms after K-SVD Update

Page 37: omp-and-k-svd - Gdc2013

How many iterations of update?

[Chart: objective value (y-axis, 200,000 to 600,000) against iteration count (x-axis, 0 to 70), comparing Batch OMP and K-SVD]

Page 38: omp-and-k-svd - Gdc2013

Sparse Image Compression

● As we have seen, we can control the number of atoms used per block

● We can also specify the exact size of the dictionary and optimize it for each data source

● The resulting coefficient stream can be coded using an entropy coder like Huffman or Arithmetic coding

Page 39: omp-and-k-svd - Gdc2013
Page 40: omp-and-k-svd - Gdc2013

Domain Specific Compression

● Using just 550 bytes per image

1. Original

2. JPEG

3. JPEG2000

4. PCA

5. KSVD per block

Page 41: omp-and-k-svd - Gdc2013

Sparse Denoising

● Uniform noise is incompressible and OMP will reject it

● KSVD can train a denoising dictionary from noisy image blocks

[Images: source, noisy input, and denoised result at 30.829 dB]

Page 42: omp-and-k-svd - Gdc2013

Sparse Inpainting

● Missing values in 𝐱 mean missing rows in 𝐃

● Remove these rows and refit α to recover 𝐱

● If α was sparse enough, the recovery will be perfect
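A toy demonstration of the idea; the dictionary, the 2-sparse code, and the even/odd mask are made up, and the support is assumed known to keep the sketch short (a real inpainter would run OMP on the observed rows):

```python
import numpy as np

rng = np.random.default_rng(5)
n, K = 16, 32
D = rng.standard_normal((n, K))
D /= np.linalg.norm(D, axis=0)

alpha = np.zeros(K)
alpha[[3, 7]] = [2.0, -1.0]          # a 2-sparse code
x = D @ alpha                        # the full signal

mask = np.arange(n) % 2 == 0         # pretend the odd samples are missing
D_obs, x_obs = D[mask], x[mask]      # missing values in x => missing rows in D

# Refit the code using only the observed rows (known support, for clarity).
support = [3, 7]
coef, *_ = np.linalg.lstsq(D_obs[:, support], x_obs, rcond=None)

x_rec = D[:, support] @ coef         # reconstructs the missing entries too
print(np.allclose(x_rec, x))
```

The recovery is exact because two coefficients are fit from eight observed samples; once α is recovered, multiplying by the full dictionary fills in the missing rows for free.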

Page 43: omp-and-k-svd - Gdc2013

Sparse Inpainting

Original 80% missing Result

Page 44: omp-and-k-svd - Gdc2013

Super Resolution

Page 45: omp-and-k-svd - Gdc2013

Super Resolution

The Original Bicubic Interpolation SR result

Page 46: omp-and-k-svd - Gdc2013

Block compression of Voxel grids

● β€œA Compression Domain output-sensitive volume rendering architecture based on sparse representation of voxel blocks” Gobbetti, Guitian and Marton [2012]

● COVRA sparsely represents each voxel block as a dictionary of 8x8x8 blocks and three coefficients

● The voxel patch is reconstructed only inside the GPU shader so voxels are decompressed just-in-time

● Huge bandwidth improvements, larger models and faster rendering

Page 47: omp-and-k-svd - Gdc2013

Thank you to:

● Ron Rubinstein & Michael Elad

● Marc LeBrun

● Enrico Gobbetti

