
Support Recovery for Orthogonal Matching Pursuit: Upper and Lower Bounds

Raghav Somani, Chirag Gupta, Prateek Jain and Praneeth Netrapalli

September 24, 2018


Sparse Regression

x̄ = arg min_{‖x‖_0 ≤ s∗} f(x)    (1.1)

where x ∈ ℝ^d and s∗ ≪ d. The ℓ_0 norm counts the number of non-zero elements.

Applications

Resource-constrained Machine Learning
High-dimensional Statistics
Bioinformatics


Sparse Linear Regression (SLR)

Sparse Linear Regression is a representative problem; results typically extend easily to the general case. With f(x) = ‖Ax − y‖_2^2, SLR's objective is to find

x̄ = arg min_{‖x‖_0 ≤ s∗} ‖Ax − y‖_2^2    (2.1)

where A ∈ ℝ^{n×d}, x ∈ ℝ^d and y ∈ ℝ^n. Unconditionally, it is NP-hard (by reduction from the 3-set cover problem).
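For intuition on why (2.1) is hard in general, exact ℓ_0-constrained least squares amounts to trying every support of size s∗. Below is a minimal NumPy sketch of that exhaustive search; the helper name best_l0_fit and the toy sizes are mine, not from the slides, and the loop is only feasible for tiny d.

```python
from itertools import combinations
import numpy as np

def best_l0_fit(A, y, s_star):
    """Exhaustive l0-constrained least squares: try every support of size s_star.
    Cost is C(d, s_star) least-squares solves, so this is only feasible for tiny d."""
    n, d = A.shape
    best_err, best_x = np.inf, None
    for support in combinations(range(d), s_star):
        cols = list(support)
        # Least squares restricted to the chosen columns.
        coef, *_ = np.linalg.lstsq(A[:, cols], y, rcond=None)
        err = np.linalg.norm(A[:, cols] @ coef - y) ** 2
        if err < best_err:
            best_err = err
            best_x = np.zeros(d)
            best_x[cols] = coef
    return best_x, best_err

# Tiny example: d = 8, s* = 2, noiseless measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 8))
x_bar = np.zeros(8); x_bar[[1, 5]] = [1.0, -2.0]
x_hat, err = best_l0_fit(A, A @ x_bar, s_star=2)
print(np.flatnonzero(x_hat), err)
```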


Assumptions of interest

Despite being NP-hard, SLR is tractable under certain assumptions.

Incoherence - If Σ = A^T A, then max_{i≠j} |Σ_{ij}| ≤ M (see the sketch after this list).
If M ≤ 1/(2s∗ − 1) and y = Ax̄, then x̄ is the unique sparsest solution, and OMP can recover x̄ in s∗ steps.

Restricted Isometry Property (RIP) - ‖A_S^T A_S − I‖_2 ≤ δ_{|S|} (δ_s ≤ M(s − 1) ∀ s ≥ 2)
⟹ (1 − δ_s) ‖v‖_2^2 ≤ ‖Av‖_2^2 ≤ (1 + δ_s) ‖v‖_2^2 ∀ v s.t. ‖v‖_0 ≤ s.

Null space property - ∀ S ⊆ [d] s.t. |S| ≤ s, if v ∈ Null(A) \ {0}, then ‖v_S‖_1 ≤ ‖v_{S^c}‖_1
⟹ {v ∈ ℝ^d | Av = 0} ∩ {v ∈ ℝ^d | ‖v_{S^c}‖_1 ≤ ‖v_S‖_1} = {0}

Restricted Strong Convexity (RSC) - ‖Ax − Az‖_2^2 ≥ ρ^-_s ‖x − z‖_2^2 ∀ x, z ∈ ℝ^d s.t. ‖x − z‖_0 ≤ s

Incoherence ⟹ RIP ⟹ Null space property ⟹ RSC. RSC is the weakest and the most popular assumption.
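For small matrices these quantities can be computed directly, which helps build intuition for how they relate. A brute-force sketch follows; the helper names mutual_incoherence and rip_constant are mine, and the subset enumeration is exponential in s, so this is for illustration only.

```python
from itertools import combinations
import numpy as np

def mutual_incoherence(A):
    """M = max_{i != j} |<A_i, A_j>| for columns normalized to unit l2 norm."""
    A = A / np.linalg.norm(A, axis=0, keepdims=True)
    G = A.T @ A
    return np.abs(G - np.diag(np.diag(G))).max()

def rip_constant(A, s):
    """Brute-force delta_s = max over |S| = s of ||A_S^T A_S - I||_2 (unit-normalized columns)."""
    A = A / np.linalg.norm(A, axis=0, keepdims=True)
    d = A.shape[1]
    delta = 0.0
    for S in combinations(range(d), s):
        eigs = np.linalg.eigvalsh(A[:, list(S)].T @ A[:, list(S)])
        # Spectral norm deviation from the identity.
        delta = max(delta, max(abs(eigs[0] - 1), abs(eigs[-1] - 1)))
    return delta

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
M = mutual_incoherence(A)
# delta_s <= M (s - 1): check the claimed relation numerically for s = 2.
print("M =", M, " delta_2 =", rip_constant(A, 2), " M*(2-1) =", M)
```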


Goals of SLR

SLR can be modelled as

y = Ax̄ + η    (2.2)

where η ∼ N(0, σ^2 I_{n×n}), supp(x̄) = S∗ and |S∗| = s∗.

⟹ y = A_{S∗} x̄_{S∗} + η    (2.3)

The model with deterministic conditions on η can also be analyzed.

Goals of SLR (a small numerical sketch follows below)
1 Bounding Generalization error - Upper bound G(x) := (1/n) ‖A(x − x̄)‖_2^2, where the rows of A are i.i.d.
2 Support Recovery - Recover the true features of A, i.e., find an S ⊇ S∗.
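To make the model (2.2) and the two goals concrete, here is a small sketch assuming Gaussian design and noise; the names generalization_error and recovers_support and the toy dimensions are mine, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, s_star, sigma = 200, 50, 5, 0.1

# Design with i.i.d. rows, true s*-sparse signal, Gaussian noise (model 2.2).
A = rng.standard_normal((n, d))
x_bar = np.zeros(d)
S_star = rng.choice(d, size=s_star, replace=False)
x_bar[S_star] = rng.standard_normal(s_star)
y = A @ x_bar + sigma * rng.standard_normal(n)

def generalization_error(A, x, x_bar):
    """Goal 1: G(x) = (1/n) ||A (x - x_bar)||_2^2."""
    return np.linalg.norm(A @ (x - x_bar)) ** 2 / A.shape[0]

def recovers_support(x, S_star):
    """Goal 2: the estimated support should contain the true support S*."""
    return set(S_star.tolist()) <= set(np.flatnonzero(x).tolist())

x_ls, *_ = np.linalg.lstsq(A, y, rcond=None)   # dense least squares, for comparison only
print(generalization_error(A, x_ls, x_bar), recovers_support(x_ls, S_star))
```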


Algorithms to solve SLR

The literature mainly studies 3 classes of algorithms.

Existing SLR algorithms

ℓ_1 minimization based (LASSO based), e.g. the Dantzig selector
Non-convex penalty based, e.g. IHT, the SCAD penalty, the log-sum penalty
Greedy methods, e.g. Orthogonal Matching Pursuit (OMP)

We study SLR under the RSC assumption for the OMP algorithm.


Orthogonal Matching Pursuit for SLR

Set the initial support set S_0 = ∅ and x_0 = 0, so the residual r_0 = y − Ax_0 = y. At the kth iteration (k ≥ 1):

From the left-over columns of A (those in A_{S^c_{k−1}}), find the column with maximum absolute inner product with r_{k−1}:
[ |⟨A_{i_1}, r_{k−1}⟩|  |⟨A_{i_2}, r_{k−1}⟩|  …  |⟨A_{i_j}, r_{k−1}⟩|  …  |⟨A_{i_{d−k+1}}, r_{k−1}⟩| ]

Include i_j into the set: S_k = S_{k−1} ∪ {i_j}.
Fully optimize on S_k: x_k = arg min_{supp(x) ⊆ S_k} ‖y − Ax‖_2^2 (simple least squares).
Update the residual: r_k = y − Ax_k.


Orthogonal Matching Pursuit for SLR

Result: OMP sparse estimate x̂^OMP_s = x_s
S_0 = ∅, x_0 = 0, r_0 = y
for k = 1, 2, …, s do
    j ← arg max_{i ∉ S_{k−1}} |A_i^T r_{k−1}|    (Greedy selection)
    S_k ← S_{k−1} ∪ {j}
    x_k ← arg min_{supp(x) ⊆ S_k} ‖Ax − y‖_2^2
    r_k ← y − A x_k
end
Algorithm 1: Orthogonal Matching Pursuit (OMP) for SLR

Note that A_i^T r_{k−1} ∝ [∇f(x_{k−1})]_i for f(x) = ‖Ax − y‖_2^2.
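Algorithm 1 maps directly onto a few lines of NumPy. Below is a minimal sketch of OMP for SLR as stated above (the function name omp and the toy example are mine): each iteration selects the column most correlated with the residual, then re-solves least squares on the enlarged support.

```python
import numpy as np

def omp(A, y, s):
    """Orthogonal Matching Pursuit (Algorithm 1): run s greedy iterations."""
    n, d = A.shape
    support = []                      # S_k
    x = np.zeros(d)                   # x_k
    r = y.copy()                      # r_0 = y
    for _ in range(s):
        # Greedy selection: column with largest |A_i^T r_{k-1}| outside the support.
        corr = np.abs(A.T @ r)
        corr[support] = -np.inf
        j = int(np.argmax(corr))
        support.append(j)
        # Fully re-optimize on the current support (simple least squares).
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        x = np.zeros(d)
        x[support] = coef
        # Update the residual.
        r = y - A @ x
    return x, support

# Usage: recover a 3-sparse signal from noiseless measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 30))
x_bar = np.zeros(30); x_bar[[2, 7, 19]] = [1.5, -2.0, 0.7]
x_hat, S = omp(A, A @ x_bar, s=3)
print(sorted(S), np.allclose(x_hat, x_bar, atol=1e-8))
```

On a well-conditioned random design with noiseless measurements, this typically recovers the true support in exactly s∗ iterations, which is the regime the incoherence result on the earlier slide describes.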


Orthogonal Matching Pursuit

Orthogonal Matching Pursuit for general f(x)

Result: OMP sparse estimate x̂^OMP_s = x_s
S_0 = ∅, x_0 = 0
for k = 1, 2, …, s do
    j := arg max_{i ∉ S_{k−1}} |[∇f(x_{k−1})]_i|
    S_k := S_{k−1} ∪ {j}
    x_k := arg min_{supp(x) ⊆ S_k} f(x)
end
Algorithm 2: OMP for a general function f(x)
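Algorithm 2 only needs gradient access to f plus an inner solver restricted to the current support. A sketch under that reading, using scipy.optimize.minimize for the inner step; the helper name omp_general is mine, and for f(x) = ‖Ax − y‖_2^2 this reduces to Algorithm 1.

```python
import numpy as np
from scipy.optimize import minimize

def omp_general(f, grad_f, d, s):
    """OMP for a general smooth f (Algorithm 2): select by gradient magnitude,
    then minimize f over the coordinates in the current support."""
    support = []
    x = np.zeros(d)
    for _ in range(s):
        g = np.abs(grad_f(x))
        g[support] = -np.inf
        support.append(int(np.argmax(g)))
        # Inner solve: minimize f over x restricted to supp(x) within S_k.
        def restricted(z):
            full = np.zeros(d)
            full[support] = z
            return f(full)
        res = minimize(restricted, x[support], method="L-BFGS-B")
        x = np.zeros(d)
        x[support] = res.x
    return x, support

# With f(x) = ||Ax - y||_2^2 this behaves like Algorithm 1.
rng = np.random.default_rng(0)
A = rng.standard_normal((80, 20)); x_bar = np.zeros(20); x_bar[[3, 11]] = [1.0, -1.0]
y = A @ x_bar
f = lambda x: np.linalg.norm(A @ x - y) ** 2
grad_f = lambda x: 2 * A.T @ (A @ x - y)
print(sorted(omp_general(f, grad_f, d=20, s=2)[1]))
```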


Sparse Linear Regression (SLR) for OMP

Key quantities

Restricted Smoothness (ρ^+) & Restricted Strong Convexity (ρ^-)

ρ^-_s ‖x − z‖_2^2 ≤ ‖Ax − Az‖_2^2 ≤ ρ^+_s ‖x − z‖_2^2    (4.1)

∀ x, z ∈ ℝ^d s.t. ‖x − z‖_0 ≤ s.

Restricted condition number (κ̃_s)

κ̃_s = ρ^+_1 / ρ^-_s    (4.2)

We also define

κ_s = ρ^+_s / ρ^-_s ≥ κ̃_s    (4.3)
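For a small design matrix, ρ^-_s and ρ^+_s in (4.1) are just the extreme eigenvalues of A_S^T A_S over all supports of size s, so κ̃_s and κ_s can be computed by brute force. A sketch follows; the helper restricted_constants is mine and the enumeration is exponential in s, so it is for illustration only.

```python
from itertools import combinations
import numpy as np

def restricted_constants(A, s):
    """rho_minus_s, rho_plus_s: min/max eigenvalues of A_S^T A_S over all |S| = s."""
    d = A.shape[1]
    rho_minus, rho_plus = np.inf, -np.inf
    for S in combinations(range(d), s):
        eigs = np.linalg.eigvalsh(A[:, list(S)].T @ A[:, list(S)])
        rho_minus = min(rho_minus, eigs[0])
        rho_plus = max(rho_plus, eigs[-1])
    return rho_minus, rho_plus

rng = np.random.default_rng(0)
A = rng.standard_normal((60, 12))
s = 4
rho_minus_s, rho_plus_s = restricted_constants(A, s)
rho_plus_1 = restricted_constants(A, 1)[1]          # rho^+_1 = max_i ||A_i||_2^2
kappa_tilde_s = rho_plus_1 / rho_minus_s            # (4.2)
kappa_s = rho_plus_s / rho_minus_s                  # (4.3), always >= kappa_tilde_s
print(kappa_tilde_s, kappa_s)
```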


Lower bounds for Fast rates

If x̂_{ℓ_0} is the best ℓ_0 estimate in the set of s∗-sparse vectors, then one can show

sup_{‖x̄‖_0 ≤ s∗} (1/n) E[ ‖A(x̂_{ℓ_0} − x̄)‖_2^2 ] ≲ σ^2 s∗ / n    (4.4)

This is not tractable, since computing x̂_{ℓ_0} involves searching all (d choose s∗) subsets.

(Y. Zhang, Wainwright & Jordan'15) ∃ A ∈ ℝ^{n×d} s.t. any poly-time algorithm satisfies

sup_{‖x̄‖_0 ≤ s∗} (1/n) E[ ‖A(x̂_poly − x̄)‖_2^2 ] ≳ σ^2 s∗^{1−δ} κ̃_{s∗} / n    ∀ δ > 0    (4.5)

Consequence - Any estimator x̂ achieving the fast rate must either not be poly-time or must return an x̂_poly that is not s∗-sparse.

≲ and ≳ are inequalities up to constant and poly-log d factors.


Sparse Linear Regression (SLR) for OMP: Upper bounds

Upper bounds on Generalization error

The tightest known upper bounds for poly-time algorithms like IHT, OMP and Lasso were at least κ̃ times worse than the known lower bounds (Jain'14, T. Zhang'10, Y. Zhang'17).

(T. Zhang'10) If x̂_s is the output of OMP after s ≳ s∗ κ̃_{s+s∗} log κ_{s+s∗} iterations, then with high probability

(1/n) ‖A(x̂_s − x̄)‖_2^2 ≲ (1/n) σ^2 s∗ κ̃^2_{s+s∗} log κ_{s+s∗}.    (4.6)

With a slight modification to T. Zhang's analysis we get

Generalization error for OMP
If x̂_s is the output of OMP after s ≳ s∗ κ̃_{s+s∗} log κ_{s+s∗} iterations, then with high probability

(1/n) ‖A(x̂_s − x̄)‖_2^2 ≲ (1/n) σ^2 s∗ κ̃_{s+s∗} log κ_{s+s∗}.    (4.7)

This matches the fast-rate lower bound up to log factors.


Support Recovery upper bound

Support recovery results are known for SCAD/MCP penalty based methods under bounded incoherence (Loh'14).
For greedy algorithms like HTP and PHT, known support recovery results require a poor dependence on κ̃ in the condition on |x̄_min| (Shen'17).
If S is the support set of the sth OMP iterate x̂_s, and if S∗ \ S ≠ ∅, then there is a large additive decrease in the objective if |x̄_min| is larger than the appropriate noise level.

Large decrease in objective

If x̂_s is the output of OMP after s ≳ s∗ κ̃_{s+s∗} log κ_{s+s∗} iterations s.t. S∗ \ S ≠ ∅ and |x̄_min| ≳ σ γ √(ρ^+_1) / ρ^-_{s+s∗}, then with high probability

‖Ax̂_s − y‖_2^2 − ‖Ax̂_{s+1} − y‖_2^2 ≳ σ^2    (4.8)

where ‖A_{S∗\S}^T A_S (A_S^T A_S)^{−1}‖_∞ ≤ γ and S = supp(x̂_s).

γ is similar to the standard incoherence condition (see the sketch below).
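The quantity γ can be evaluated directly for a given pair (S, S∗), which is useful for checking the |x̄_min| condition on small instances. A sketch; the helper gamma_bound and the toy index sets are mine.

```python
import numpy as np

def gamma_bound(A, S, S_star):
    """||A_{S* \\ S}^T A_S (A_S^T A_S)^{-1}||_inf, the incoherence-like quantity gamma."""
    missing = sorted(set(S_star) - set(S))
    if not missing:
        return 0.0
    A_S = A[:, sorted(S)]
    A_miss = A[:, missing]
    # Rows correspond to missed true features, columns to selected features.
    B = A_miss.T @ A_S @ np.linalg.inv(A_S.T @ A_S)
    # Matrix infinity norm = maximum absolute row sum.
    return np.abs(B).sum(axis=1).max()

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 20))
print(gamma_bound(A, S=[0, 1, 2, 5], S_star=[0, 1, 3]))
```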


Support Recovery upper bound

Since ‖Ax − y‖_2^2 ≥ 0 ∀ x ∈ ℝ^d, the number of extra iterations cannot be too large.

Support recovery and infinity norm bound

If x̂_s is the output of OMP after s ≳ s∗ κ̃_{s+s∗} log κ_{s+s∗} iterations, s.t. |x̄_min| ≳ σ γ √(ρ^+_1) / ρ^-_{s+s∗}, then with high probability

1 S∗ ⊆ supp(x̂_s)
2 ‖x̂_s − x̄‖_∞ ≲ σ √(log s / ρ^-_s)

where ‖A_{S∗\S}^T A_S (A_S^T A_S)^{−1}‖_∞ ≤ γ and S = supp(x̂_s).

The condition on |x̄_min| scales as 1/√n, since both ρ^-_{s+s∗} and ρ^+_1 carry a factor of n. It is also better by at least a √κ̃ factor than the corresponding conditions in other recent works.

γ is allowed to be very large.


Sparse Linear Regression (SLR) for OMP: Lower bounds

Lower bound instance construction

(Y. Zhang'15)'s lower bounds were for algorithms that output s∗-sparse solutions, which does not apply to OMP when it is run for more than s∗ iterations.
We provide matching lower bounds for support recovery as well as generalization error for OMP.
The idea is to fool OMP into picking incorrect indices; a large support size ⟹ large generalization error.
Construct an evenly distributed x̄:

x̄_i = 1/√s∗ if 1 ≤ i ≤ s∗, and x̄_i = 0 if i > s∗  ⟹  supp(x̄) = {1, 2, …, s∗}

Construct M(ε) ∈ ℝ^{n×d} parameterized by ε (a numerical sketch follows below):

M(ε)_{1:s∗} are s∗ random orthogonal column vectors s.t. ‖M(ε)_i‖_2^2 = n ∀ i ∈ [s∗].

M(ε)_i = √(1−ε) [ (1/√s∗) Σ_{j=1}^{s∗} M(ε)_j ] + √ε g_i  ∀ i ∉ [s∗], where the g_i's are orthogonal to each other and to M(ε)_{1:s∗}, with ‖g_i‖_2^2 = n.
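The construction can be reproduced numerically: take an orthonormal basis of ℝ^n, scale the vectors to norm √n, use the first s∗ as the on-support columns and the next d − s∗ as the g_i, and mix them as specified. A sketch assuming d ≤ n; the function name make_lower_bound_instance is mine.

```python
import numpy as np

def make_lower_bound_instance(n, d, s_star, eps, rng):
    """Build M(eps) in R^{n x d} and the evenly spread s*-sparse x_bar from the construction above."""
    assert s_star <= d <= n
    # Orthonormal columns via QR of a Gaussian matrix.
    Q, _ = np.linalg.qr(rng.standard_normal((n, d)))
    cols = np.sqrt(n) * Q                      # each column now has squared norm n
    M = np.zeros((n, d))
    M[:, :s_star] = cols[:, :s_star]           # M(eps)_{1:s*}: orthogonal, squared norm n
    mean_dir = cols[:, :s_star].sum(axis=1) / np.sqrt(s_star)
    g = cols[:, s_star:]                       # g_i: orthogonal to each other and to M_{1:s*}
    M[:, s_star:] = np.sqrt(1 - eps) * mean_dir[:, None] + np.sqrt(eps) * g
    x_bar = np.zeros(d)
    x_bar[:s_star] = 1.0 / np.sqrt(s_star)
    return M, x_bar

rng = np.random.default_rng(0)
M, x_bar = make_lower_bound_instance(n=1000, d=100, s_star=10, eps=0.25, rng=rng)
print(np.allclose((M ** 2).sum(axis=0), 1000))   # every column has squared norm n
```

The final check confirms that every column has squared norm n, as the construction requires.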


Lower bound instance construction

[Figure: visualization of the construction in d = 3 with s∗ = 2 and ε = 0.25, showing the on-support columns M(0.25)_1 and M(0.25)_2, their normalized sum (1/√2) Σ_{i=1}^2 M(0.25)_i, the orthogonal direction g_3, and the off-support column M(0.25)_3 against the coordinate axes e_1, e_2, e_3.]

Smaller ε ⟹ more correlation. ∴ ε is a proxy for κ̃_s.


Lower bounds

Noiseless case
For s∗ ≤ d ≤ n, ∃ ε > 0 s.t. when OMP is executed on the SLR problem with y = M(ε)x̄ for s ≤ d − s∗ iterations:

κ̃_s(M(ε)) ≲ s/s∗ and γ ≤ √(2/3)
S∗ ∩ supp(x̂_s) = ∅

Noisy case
For s∗ ≤ s ≤ d^{1−α} where α ∈ (0, 1), ∃ ε > 0 s.t. when OMP is executed on the SLR problem with y = M(ε)x̄ + η, where η ∼ N(0, σ^2 I_{n×n}), then:

κ̃_s(M(ε)) ≲ s/s∗ and γ ≤ 1/2
with high probability, (1/n) ‖Ax̂_s − Ax̄‖_2^2 ≳ (1/n) σ^2 κ̃_{s+s∗} s
with high probability, S∗ ∩ supp(x̂_s) = ∅

⟹ s ≳ κ̃_s s∗ iterations are indeed necessary.

Addition of noise can only help.


Sparse Linear Regression (SLR) for OMP: Simulations

Simulations

We perform simulations on the lower bound instance class, with M(ε) ∈ ℝ^{1000×100} and s∗ = 10. A sketch of the experiment follows the figure caption below.

(a) Varying condition number    (b) Varying noise variance

Figure: Number of iterations required for recovering the full support of x̄ with respect to the restricted condition number (κ̃_{s+s∗}) of the design matrix and the variance of the noise (σ^2).
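A rough, self-contained version of this experiment can be sketched as follows: rebuild the lower bound instance, run OMP on y = M(ε)x̄ + η until S∗ ⊆ S_k, and record the number of iterations as ε varies (smaller ε plays the role of a larger restricted condition number). The function names and parameter values are mine.

```python
import numpy as np

def make_instance(n, d, s_star, eps, rng):
    """Lower bound instance M(eps) and x_bar, as in the construction slides above."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, d)))
    cols = np.sqrt(n) * Q
    M = np.zeros((n, d))
    M[:, :s_star] = cols[:, :s_star]
    mean_dir = cols[:, :s_star].sum(axis=1) / np.sqrt(s_star)
    M[:, s_star:] = np.sqrt(1 - eps) * mean_dir[:, None] + np.sqrt(eps) * cols[:, s_star:]
    x_bar = np.zeros(d)
    x_bar[:s_star] = 1.0 / np.sqrt(s_star)
    return M, x_bar

def iterations_to_recover(M, x_bar, sigma, rng):
    """Run OMP on y = M x_bar + eta; return the first k with S* contained in S_k."""
    n, d = M.shape
    y = M @ x_bar + sigma * rng.standard_normal(n)
    S_star = set(np.flatnonzero(x_bar).tolist())
    support, r = [], y.copy()
    for k in range(1, d + 1):
        corr = np.abs(M.T @ r)
        corr[support] = -np.inf
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(M[:, support], y, rcond=None)
        r = y - M[:, support] @ coef
        if S_star <= set(support):
            return k
    return d

rng = np.random.default_rng(0)
n, d, s_star = 1000, 100, 10
for eps in (0.9, 0.5, 0.25, 0.1):      # smaller eps: off-support columns are more correlated
    M, x_bar = make_instance(n, d, s_star, eps, rng)
    print(eps, iterations_to_recover(M, x_bar, sigma=0.05, rng=rng))
```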
