Learning with Structured Sparsity

Transcript
Page 1: Learning with Structured  Sparsity

Learning with Structured Sparsity

Authors: Junzhou Huang, Tong Zhang, Dimitris Metaxas

Zhennan Yan

Page 2: Learning with Structured  Sparsity

Introduction

Fixed set of $p$ basis vectors $\{x_1, \ldots, x_p\}$, where $x_j \in \mathbb{R}^n$ for each $j$; write $X = [x_1, \ldots, x_p] \in \mathbb{R}^{n \times p}$.

Given a random observation $y = [y_1, \ldots, y_n] \in \mathbb{R}^n$, which depends on an underlying coefficient vector $\bar{\beta} \in \mathbb{R}^p$: $y \approx X\bar{\beta}$.

Assume the target coefficient $\bar{\beta}$ is sparse. Throughout the paper, assume $X$ is fixed, and randomization is w.r.t. the noise in the observation $y$.
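To make the setup concrete, here is a minimal sketch of this observation model in Python/NumPy; the sizes and noise level are illustrative choices, not values from the paper.

import numpy as np

rng = np.random.default_rng(0)
n, p, k = 128, 512, 32                  # illustrative dimensions
X = rng.standard_normal((n, p))         # fixed set of p basis vectors (columns)
beta_bar = np.zeros(p)                  # underlying sparse coefficient vector
support = rng.choice(p, size=k, replace=False)
beta_bar[support] = rng.choice([-1.0, 1.0], size=k)
y = X @ beta_bar + 0.01 * rng.standard_normal(n)   # randomness enters only via the noise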

Page 3: Learning with Structured  Sparsity

Introduction

Define the support of a vector $\beta \in \mathbb{R}^p$ as $\mathrm{supp}(\beta) = \{j : \beta_j \neq 0\}$, so $\|\beta\|_0 = |\mathrm{supp}(\beta)|$.

A natural method for sparse learning is $L_0$ regularization for a desired sparsity $s$:

$\hat{\beta}_{L_0} = \arg\min_{\beta} \hat{Q}(\beta)$ subject to $\|\beta\|_0 \leq s$

Here, only the least squares loss is considered: $\hat{Q}(\beta) = \|X\beta - y\|_2^2$
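A minimal sketch of what the $L_0$-constrained problem asks for: fit restricted least squares on every candidate support of size at most $s$ and keep the best. Enumerating supports is only feasible for toy sizes; this combinatorial blow-up is what the next slide refers to.

import numpy as np
from itertools import combinations

def l0_least_squares(X, y, s):
    # Brute-force L0-constrained least squares: try every support of size <= s
    # and keep the one with the smallest residual Q_hat.
    n, p = X.shape
    best_err, best_beta = np.inf, np.zeros(p)
    for size in range(1, s + 1):
        for F in combinations(range(p), size):
            idx = list(F)
            coef, *_ = np.linalg.lstsq(X[:, idx], y, rcond=None)
            err = np.sum((X[:, idx] @ coef - y) ** 2)
            if err < best_err:
                best_err = err
                best_beta = np.zeros(p)
                best_beta[idx] = coef
    return best_beta

# Feasible only for toy sizes: 10 columns gives at most C(10,2) + 10 supports.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 10))
y = X[:, [1, 7]] @ np.array([1.0, -1.0]) + 0.01 * rng.standard_normal(20)
print(np.nonzero(l0_least_squares(X, y, s=2))[0])   # expected: [1 7]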

Page 4: Learning with Structured  Sparsity

Introduction

NP-hard!

Standard approaches:
- Relaxation of $L_0$ to $L_1$ (Lasso)
- Greedy algorithms (such as OMP)

In practical applications, we often know a structure on $\bar{\beta}$ in addition to sparsity:
- Group sparsity: variables in the same group tend to be zero or nonzero together
- Tonal and transient structures: sparse decomposition of audio signals
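Both standard baselines are available off the shelf; a minimal sketch using scikit-learn, assuming it is installed (the data and the regularization value alpha are arbitrary illustrations):

import numpy as np
from sklearn.linear_model import Lasso, OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
X = rng.standard_normal((128, 512))
beta = np.zeros(512)
beta[:32] = 1.0
y = X @ beta + 0.01 * rng.standard_normal(128)

lasso = Lasso(alpha=0.01).fit(X, y)                              # L1 relaxation
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=32).fit(X, y)    # greedy OMP
print(np.count_nonzero(lasso.coef_), np.count_nonzero(omp.coef_))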

Page 5: Learning with Structured  Sparsity

Structured Sparsity

Denote the index set of coefficients by $I = \{1, \ldots, p\}$. For any sparse subset $F \subseteq I$, assign a coding length $\mathrm{cl}(F)$.

Coding complexity of $F$ is defined as: $c(F) = |F| + \mathrm{cl}(F)$

Page 6: Learning with Structured  Sparsity

Structured Sparsity

If a coefficient vector has a small coding complexity, it can be efficiently learned. Why?
- Number of bits to encode $F$ is $\mathrm{cl}(F)$
- Number of bits to encode the nonzero coefficients in $F$ is $O(|F|)$

Page 7: Learning with Structured  Sparsity

General Coding Scheme

Block Coding: Consider a small number of base blocks $\mathcal{B}$ (each element of $\mathcal{B}$ is a subset of $I$); every subset $F \subseteq I$ can be expressed as a union of blocks in $\mathcal{B}$.

Define a code length $\mathrm{cl}_0(B)$ on $\mathcal{B}$,

where $\sum_{B \in \mathcal{B}} 2^{-\mathrm{cl}_0(B)} \leq 1$.

So $\mathrm{cl}(F) = \min\left\{\, \sum_{j=1}^{b} \left(\mathrm{cl}_0(B_j) + 1\right) \;:\; F = \bigcup_{j=1}^{b} B_j \,\right\}$
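A small brute-force sketch of the block-coding length just defined; the toy block set and its $\mathrm{cl}_0$ values are made up purely for illustration.

from itertools import combinations
import math

def block_coding_length(F, blocks, cl0, max_blocks=3):
    # cl(F) = min over covers F = B_1 u ... u B_b of sum_j (cl0(B_j) + 1)
    F = frozenset(F)
    best = math.inf
    for b in range(1, max_blocks + 1):
        for combo in combinations(range(len(blocks)), b):
            if frozenset().union(*(blocks[i] for i in combo)) == F:
                best = min(best, sum(cl0[i] + 1 for i in combo))
    return best

# Toy example: p = 8 with two groups of four plus the eight singletons.
blocks = [frozenset({0, 1, 2, 3}), frozenset({4, 5, 6, 7})] + \
         [frozenset({j}) for j in range(8)]
cl0 = [1.0, 1.0] + [3.0] * 8      # log2(2 groups) = 1, log2(8 singletons) = 3
print(block_coding_length({0, 1, 2, 3}, blocks, cl0))   # 2.0: one group, cl0 + 1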

Page 8: Learning with Structured  Sparsity

General Coding Scheme

A structured greedy algorithm that can take advantage of block structures is efficient: instead of searching over all subsets of $I$ up to a fixed coding complexity $s$ (exponential), we greedily add blocks from $\mathcal{B}$ one at a time.

$\mathcal{B}$ is supposed to contain only a manageable number of base blocks.

Page 9: Learning with Structured  Sparsity

General Coding Scheme

Standard Sparsity: $\mathcal{B}$ consists only of single-element sets, and each base block has coding length $\mathrm{cl}_0 = \log_2 p$. This uses $k \log_2(2p)$ bits to code each subset of cardinality $k$.

Group Sparsity:
Graph Sparsity:

Page 10: Learning with Structured  Sparsity

General Coding Scheme

Standard Sparsity:

Group Sparsity: Consider a partition of $I$ into $m$ groups. Let $\mathcal{B}_1$ contain the $m$ groups, and $\mathcal{B}_0$ contain the $p$ single-element blocks. Each element in $\mathcal{B}_0$ has $\mathrm{cl}_0$ of $\infty$, and each element in $\mathcal{B}_1$ has $\mathrm{cl}_0$ of $\log_2 m$; this scheme only looks for signals consisting of the groups.

The resulting coding length is $\mathrm{cl}(F) = g \log_2(2m)$ if $F$ can be represented as a union of $g$ disjoint groups.

Graph Sparsity:
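A quick numeric check of the two coding lengths above, with dimensions chosen only for illustration: a support of $k$ nonzeros that forms $g$ whole groups costs $g \log_2(2m)$ bits under group coding versus $k \log_2(2p)$ under standard coding.

import math

p, m = 512, 64        # illustrative: 512 coefficients split into 64 groups of 8
k, g = 32, 4          # 32 nonzeros forming 4 whole groups

standard_bits = k * math.log2(2 * p)   # k log2(2p): code each index on its own
group_bits = g * math.log2(2 * m)      # g log2(2m): code whole groups
print(standard_bits, group_bits)       # 320.0 vs 28.0 bits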

Page 11: Learning with Structured  Sparsity

General Coding Scheme

Standard Sparsity:
Group Sparsity:

Graph Sparsity: Generalization of group sparsity. Employs a directed graph structure $G$ on $I$: each element of $I$ is a node of $G$, but $G$ may contain additional nodes.

At each node $v$, we define a coding length $\mathrm{cl}_v(S)$ on the subsets $S$ of the neighborhood $N_v$ of $v$, as well as $\mathrm{cl}_v(u)$ for any other single node $u$, such that $\sum_{S \subseteq N_v} 2^{-\mathrm{cl}_v(S)} + \sum_{u} 2^{-\mathrm{cl}_v(u)} \leq 1$

Page 12: Learning with Structured  Sparsity

General Coding Scheme

Example for Graph Sparsity: an image processing problem. Each pixel has 4 adjacent pixels, so the number of subsets in its neighborhood is $2^4 = 16$, each with an $O(1)$ coding length. All other pixels are encoded using random jumping, with coding length $\log_2 p + O(1)$.

If the connected region $F$ is composed of $g$ sub-regions, then the coding length is $g \log_2 p + O(|F|)$, while the standard sparse coding length is $|F| \log_2 p$.
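Plugging in the dimensions of the 2D experiment from the later slides ($p = 48 \times 48$, $|F| = 160$, $g = 4$) gives a feel for the gap; taking the $O(|F|)$ term as 1 bit per pixel is an arbitrary illustrative choice.

import math

p = 48 * 48           # image size from the 2D experiment slide
F_size, g = 160, 4    # support size and number of connected sub-regions

graph_bits = g * math.log2(p) + F_size * 1.0   # g log2 p + O(|F|), assuming ~1 bit/pixel
standard_bits = F_size * math.log2(p)          # |F| log2 p
print(round(graph_bits), round(standard_bits)) # ~205 vs ~1787 bits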

Page 13: Learning with Structured  Sparsity

Algorithms for Structured Sparsity

The problem to solve: $\hat{\beta}_{L_0} = \arg\min_{\beta} \hat{Q}(\beta)$ subject to $\|\beta\|_0 \leq s$

For structured sparsity, the sparsity constraint is replaced by a budget on the coding complexity: subject to $c(\beta) \leq s$.

Page 14: Learning with Structured  Sparsity

Algorithms for Structured Sparsity

Extend forward greedy algorithms by using the block structure, which is used only to limit the search space.

Page 15: Learning with Structured  Sparsity

Algorithms for Structured Sparsity

Maximize the gain ratio: $\phi(B) = \dfrac{\hat{Q}(\hat{\beta}_F) - \hat{Q}(\hat{\beta}_{F \cup B})}{c(F \cup B) - c(F)}$

Using least squares regression, $\hat{Q}(\hat{\beta}_F) = \|(I - P_F)\,y\|_2^2$, where $P_F$ is the projection matrix onto the subspace generated by the columns of $X_F$.

Select $B$ by $B^{(k)} = \arg\max_{B \in \mathcal{B}} \phi(B)$
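A minimal sketch of this structured greedy (StructOMP-style) step under the definitions above: each candidate block is scored by the drop in squared error per unit increase in coding complexity, where $c(F) = |F| + \mathrm{cl}(F)$ and $\mathrm{cl}(F)$ is tracked greedily as $\sum (\mathrm{cl}_0 + 1)$ over the chosen blocks. This is a simplified reading of the slides, not the paper's exact implementation.

import numpy as np

def struct_greedy(X, y, blocks, cl0, s):
    # Greedy block selection: maximize phi(B) = (drop in Q_hat) / (growth of c(F)),
    # subject to the coding complexity budget c(F) = |F| + cl(F) <= s.
    n, p = X.shape
    F, cl_F = set(), 0.0

    def fit(support):
        idx = sorted(support)
        coef, *_ = np.linalg.lstsq(X[:, idx], y, rcond=None)
        resid = float(np.sum((X[:, idx] @ coef - y) ** 2))
        return idx, coef, resid

    q = float(np.sum(y ** 2))              # Q_hat on the empty support
    while True:
        best_ratio, best = -np.inf, None
        for B, c0 in zip(blocks, cl0):
            new_F = F | set(B)
            if new_F == F:
                continue
            if len(new_F) + cl_F + c0 + 1 > s:        # would exceed the budget s
                continue
            delta_c = (len(new_F) - len(F)) + (c0 + 1)
            _, _, resid = fit(new_F)
            ratio = (q - resid) / delta_c             # gain ratio phi(B)
            if ratio > best_ratio:
                best_ratio, best = ratio, (new_F, c0, resid)
        if best is None:
            break
        F, c0, q = best
        cl_F += c0 + 1

    beta = np.zeros(p)
    if F:
        idx, coef, _ = fit(F)
        beta[idx] = coef
    return beta

Refitting the full least squares problem for every candidate block is the simplest way to evaluate the gain; a faster variant would update the projection incrementally.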

Page 16: Learning with Structured  Sparsity

Experiments-1D

1D structured sparse signal with values $\pm 1$, $p = 512$, $k = 32$, $g = 2$.

Zero-mean Gaussian noise with standard deviation $\sigma$ is added to the measurements.

$n = 4k = 128$

Recovery results by Lasso, OMP and StructOMP:

Page 17: Learning with Structured  Sparsity


Experiments-1D

Page 18: Learning with Structured  Sparsity

Experiments-2D

Generate a 2D structured sparse image by putting four letters in random locations.

$p = H \times W = 48 \times 48$, $k = 160$, $g = 4$, $n = 4k = 640$

Strongly sparse signal: Lasso is better than OMP!

Page 19: Learning with Structured  Sparsity


Experiments-2D

Page 20: Learning with Structured  Sparsity


Experiments for sample size

Page 21: Learning with Structured  Sparsity

Experiment on Tree-structured Sparsity

2D wavelet coefficients; weakly sparse signal

Page 22: Learning with Structured  Sparsity


Experiments-Background Subtracted Images

Page 23: Learning with Structured  Sparsity


Experiments for sample size

