A Nearly-Linear Time Framework for Graph-Structured Sparsity

Chinmay Hegde Piotr Indyk Ludwig Schmidt

MIT

6 July 2015

ICML

Authors ordered alphabetically.

Structured sparsity

Sparsity is widely used in signal processing, machine learning, and statistics (compressive sensing, sparse linear regression, etc.).

Examples of sparsity

Cluster sparsity, tree sparsity, group sparsity

In many cases, there is rich structure in addition to sparsity.

→ How can we exploit this prior information?

Our focus: stable sparse recovery

Goal: Estimate an unknown, sparse vector β ∈ ℝ^d from observations of the form

y = Xβ + e.

X ∈ ℝ^{n×d} is the design / measurement matrix.

y ∈ ℝ^n are the observations / measurements.

e ∈ ℝ^n is an observation noise vector.

We are interested in the regime n ≪ d (i.e., X is a fat matrix).

→ Use structured sparsity to reduce the sample complexity n.
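The observation model y = Xβ + e can be made concrete with a small simulation. The sketch below is only an illustration: the function name `make_instance`, the default sizes, the 1/√n column scaling, and the choice of a uniformly random (unstructured) support are assumptions made here, not details from the talk.

```python
import random

def make_instance(d=200, n=60, s=8, noise_std=0.01, seed=0):
    """Draw y = X beta + e with an s-sparse beta and an i.i.d. Gaussian X (n < d)."""
    rng = random.Random(seed)
    # Pick s coordinates at random; a structured model would restrict this choice.
    support = sorted(rng.sample(range(d), s))
    beta = [0.0] * d
    for i in support:
        beta[i] = rng.gauss(0.0, 1.0)
    # Fat design matrix with i.i.d. Gaussian entries (scaled by 1/sqrt(n)).
    X = [[rng.gauss(0.0, 1.0) / n ** 0.5 for _ in range(d)] for _ in range(n)]
    # Observation noise vector e.
    e = [rng.gauss(0.0, noise_std) for _ in range(n)]
    # Only the support contributes to X beta, so the sum can skip zero entries.
    y = [sum(X[r][i] * beta[i] for i in support) + e[r] for r in range(n)]
    return X, y, beta, e, support

X, y, beta, e, support = make_instance()
```

A recovery algorithm would receive only `X` and `y` and try to return an estimate close to `beta`.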

Utilizing structured sparsity in sparse recovery

Large body of work: [Yuan, Lin, 2006], [Eldar, Mishali, 2009], [Jacob, Obozinski, Vert, 2009], [Baraniuk, Cevher, Duarte, Hegde, 2010], [Kim, Xing, 2010], [Bi, Kwok, 2011], [Huang, Zhang, Metaxas, 2011], [Bach, Jenatton, Mairal, Obozinski, 2012b], [Rao, Recht, Nowak, 2012], [Negahban, Ravikumar, Wainwright, Yu, 2012], [Simon, Friedman, Hastie, Tibshirani, 2013], [El Halabi, Cevher, 2015], etc.

Surveys: [Bach, Jenatton, Mairal, Obozinski, 2012a] and [Wainwright, 2014].

Main goals:

Generality: What sparsity structures does the approach apply to?
→ Generalize several previously studied sparsity models.

Statistical efficiency: What is the statistical performance improvement?
→ Asymptotically optimal sample complexity.

Computational efficiency: How fast are the resulting algorithms?
→ Nearly-linear time algorithms.

Generality

The Weighted Graph Model (WGM)

Structured sparsity models

Modeling approach: restrict the set of allowed supports [Baraniuk, Cevher, Duarte, Hegde, 2010].

So far: β is a vector (β1, β2, ..., β8).

Now: β corresponds to a graph, with one node per coefficient.

[Figure: the coefficients β1, ..., β8 drawn first as a vector and then as the nodes of a graph.]

Restrict size and number of connected components of supports.

Weighted Graph Model (simplified)

Parameters:
Graph G = ([d], E) defined on the index set [d].
Sparsity s.
Number of connected components g.

Examples for s = 3 and g = 2:

[Figure: four example supports on a graph, labeled "In the model", "Not in the model", "In the model", "Not in the model".]
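A membership test for the simplified model can be sketched in a few lines. The reading used here is an assumption: a support counts as "in the model" when it has at most s nodes and the subgraph of G induced by it has at most g connected components. The names `count_components` and `in_simplified_wgm` are hypothetical.

```python
def count_components(nodes, edges):
    """Number of connected components of the subgraph induced by `nodes`."""
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = len(parent)
    for u, v in edges:
        # Only edges with both endpoints in the support matter.
        if u in parent and v in parent:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                components -= 1
    return components

def in_simplified_wgm(support, edges, s, g):
    """Assumed membership rule: |support| <= s and at most g components."""
    support = set(support)
    return len(support) <= s and count_components(support, edges) <= g

# Path graph on six nodes: 0 - 1 - 2 - 3 - 4 - 5.
path = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
print(in_simplified_wgm({0, 1, 4}, path, s=3, g=2))  # two components: {0, 1} and {4}
print(in_simplified_wgm({0, 2, 4}, path, s=3, g=2))  # three isolated nodes
```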

Generality

We can encode several sparsity structures via the graph G.

No edges: standard s-sparsity

Tree: hierarchical / tree sparsity

(Almost) line graph: block sparsity

Grid graph: 2D cluster sparsity
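To make the encoding concrete, the edge sets of some of these graphs can be generated directly. This is a minimal sketch with hypothetical helper names; it builds a plain path graph rather than the "(almost) line graph" mentioned above, whose exact construction the slide does not spell out, and it uses a complete binary tree as one example of a tree.

```python
def no_edges(d):
    """Graph without edges: every node is its own component (standard sparsity)."""
    return []

def path_edges(d):
    """Path graph on nodes 0, ..., d-1."""
    return [(i, i + 1) for i in range(d - 1)]

def binary_tree_edges(d):
    """Complete binary tree in heap order: the parent of node i is (i - 1) // 2."""
    return [((i - 1) // 2, i) for i in range(1, d)]

def grid_edges(rows, cols):
    """2D grid graph; node (r, c) has index r * cols + c."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                edges.append((i, i + 1))      # right neighbor
            if r + 1 < rows:
                edges.append((i, i + cols))   # neighbor below
    return edges
```

For an image with `rows × cols` pixels, `grid_edges(rows, cols)` gives the graph on which 2D clusters are connected sets of pixels.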

Weighted Graph Model (full version)

Our structured sparsity model also supports edge weights.

Additional parameter: B, bound on the sum of weights in the support.

E.g., s = 3, g = 2, and B = 5:

[Figure: the same edge-weighted graph shown twice with different supports highlighted, one labeled "In the model" and one labeled "Not in the model".]

Allows further generalizations, e.g., encoding the EMD model (a model for correlated supports in adjacent columns).
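One way to check the weight budget is sketched below, under an assumed reading of the model: B bounds the total weight of the edges of a forest that connects the support into at most g trees, using only edges inside the support, and all weights are non-negative. Under that reading the cheapest such forest is found by running Kruskal's algorithm and stopping early. The function names are hypothetical.

```python
def min_connection_weight(support, weighted_edges, g):
    """Cheapest total edge weight that joins `support` into at most g components,
    using only edges (u, v, w) with both endpoints in the support.
    Returns None if that is impossible. Assumes non-negative weights."""
    parent = {v: v for v in support}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components, total = len(parent), 0
    inside = sorted((w, u, v) for u, v, w in weighted_edges
                    if u in parent and v in parent)
    for w, u, v in inside:          # Kruskal: cheapest edges first
        if components <= g:
            break                   # enough merging done, stop early
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
            total += w
    return total if components <= g else None

def in_wgm(support, weighted_edges, s, g, B):
    """Assumed membership rule for the (G, s, g, B)-WGM."""
    support = set(support)
    if len(support) > s:
        return False
    weight = min_connection_weight(support, weighted_edges, g)
    return weight is not None and weight <= B

# Weighted path 0 -(1)- 1 -(10)- 2 -(2)- 3 with s = 3, g = 2, B = 5.
edges = [(0, 1, 1), (1, 2, 10), (2, 3, 2)]
print(in_wgm({0, 1, 3}, edges, s=3, g=2, B=5))
print(in_wgm({1, 2, 3}, edges, s=3, g=1, B=5))
```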

Statistical efficiency

Sample complexity of sparse recovery with the WGM

Cardinality of the WGM

Key quantity: |M|, the number of allowed supports in the WGM.

→ Counting argument: how many subgraphs with size s and g connected components does G contain?

|M| depends on the graph G and the parameters s and g.

Useful graph parameter: ρ(G), the maximum degree of a node in G.

[Figure: example graph with ρ(G) = 4.]

Sample complexity

Let β ∈ ℝ^d be in the (G, s, g, B)-weighted graph model. Then

$$ n = O\left( s \left( \log \rho(G) + \log \frac{B}{s} \right) + g \cdot \log \frac{d}{g} \right) $$

i.i.d. Gaussian observations suffice to find an estimate β̂ such that

$$ \| \beta - \hat{\beta} \| \le C \, \| e \| . $$

Unweighted case: n = O(s log ρ(G) + g · log(d/g)).

“Standard” stable sparse recovery: n = O(s · log(d/s)).

Asymptotically optimal sample complexity n = O(s) for
Block sparsity.
Tree sparsity.
Cluster sparsity in constant-degree graphs (for g = O(s / log d)).
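To get a feel for the gap between the bounds, the expressions inside the O(·) can be evaluated numerically. The sketch below drops all hidden constants, so the numbers are only indicative; the function names and the example parameters (a grid-like graph with ρ(G) = 4) are assumptions chosen for illustration, and the weighted variant assumes B ≥ s so that the log term is non-negative.

```python
import math

def wgm_bound_unweighted(s, g, d, rho):
    """s * log(rho) + g * log(d / g), constants omitted."""
    return s * math.log(rho) + g * math.log(d / g)

def wgm_bound_weighted(s, g, d, rho, B):
    """s * (log(rho) + log(B / s)) + g * log(d / g), constants omitted.
    Assumes B >= s so that log(B / s) >= 0."""
    return s * (math.log(rho) + math.log(B / s)) + g * math.log(d / g)

def standard_bound(s, d):
    """s * log(d / s), the usual sparse-recovery expression, constants omitted."""
    return s * math.log(d / s)

d, s, g, rho = 10**6, 1000, 10, 4
structured = wgm_bound_unweighted(s, g, d, rho)
standard = standard_bound(s, d)
print(f"structured: {structured:.0f}, standard: {standard:.0f}")
```

With these example values the structured expression is roughly 1.5 · 10^3 and the standard one roughly 6.9 · 10^3. When ρ(G) is constant and g = O(s / log d), the structured expression is O(s), as stated above.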

Computational efficiency

Nearly-linear time model projection for the WGM

Model projection

Goal: Given b ∈ ℝ^d and a sparsity model M, find

$$ \Omega^* = \arg\min_{\Omega \in M} \| b - b_\Omega \| . $$

For the (G, s, g)-WGM: Find the subgraph of G with size s and g connected components that maximizes the sum of node weights.

[Figure: example graph whose nodes carry the weights 3, 5, 7, 2, 6, 8, 10, shown without and with a selected subgraph.]

This problem is NP-hard.
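For very small graphs the projection can still be computed exactly by enumeration, which is useful as a reference when testing approximate methods. The sketch below is a brute-force baseline, not the algorithm from the talk. It assumes node weights b_i² (minimizing ‖b − b_Ω‖ is the same as maximizing the sum of b_i² over Ω) and it allows supports with at most s nodes and at most g components. Its running time is exponential in s.

```python
from itertools import combinations

def count_components(nodes, edges):
    """Number of connected components of the subgraph induced by `nodes`."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = len(parent)
    for u, v in edges:
        if u in parent and v in parent:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                components -= 1
    return components

def brute_force_projection(b, edges, s, g):
    """Exact model projection by enumerating all candidate supports."""
    best, best_weight = frozenset(), 0.0
    for k in range(1, s + 1):
        for subset in combinations(range(len(b)), k):
            if count_components(subset, edges) <= g:
                weight = sum(b[i] ** 2 for i in subset)  # captured energy
                if weight > best_weight:
                    best, best_weight = frozenset(subset), weight
    return best, best_weight

path = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
b = [3.0, 0.0, 1.0, 0.0, 2.0, 2.0]
print(brute_force_projection(b, path, s=3, g=1))
print(brute_force_projection(b, path, s=3, g=2))
```

Allowing a second component lets the support pick up the isolated large entry at index 0 together with the pair at indices 4 and 5.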

Approximation to the rescue!

Approximation-tolerant model-based sparse recovery [HIS’14].
→ Approximate projections suffice, but two types are necessary.

Tail-approximation oracle T(b)

Find a support Ω ∈ M such that

$$ \| b - b_\Omega \| \le c_T \cdot \min_{\Omega' \in M} \| b - b_{\Omega'} \| . $$

Head-approximation oracle H(b)

Find a support Ω ∈ M such that

$$ \| b_\Omega \| \ge c_H \cdot \max_{\Omega' \in M} \| b_{\Omega'} \| . $$

[Figure: b split into the head b_Ω and the tail b − b_Ω; the tail oracle minimizes the tail, the head oracle maximizes the head.]
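The two guarantees can be checked directly on a toy model given as an explicit list of supports. This is a small sketch with hypothetical names; enumerating the model is only feasible for tiny examples. The example also shows why the two notions differ: a support can capture a constant fraction of the best head and still leave a tail much larger than the best tail.

```python
import math

def head_norm(b, support):
    """Norm of b restricted to the support, i.e. ||b_Omega||."""
    return math.sqrt(sum(b[i] ** 2 for i in support))

def tail_norm(b, support):
    """Norm of what the support misses, i.e. ||b - b_Omega||."""
    support = set(support)
    return math.sqrt(sum(x ** 2 for i, x in enumerate(b) if i not in support))

def satisfies_tail(b, omega, model, c_T, tol=1e-12):
    best = min(tail_norm(b, o) for o in model)
    return tail_norm(b, omega) <= c_T * best + tol

def satisfies_head(b, omega, model, c_H, tol=1e-12):
    best = max(head_norm(b, o) for o in model)
    return head_norm(b, omega) >= c_H * best - tol

b = [4.0, 3.0, 0.0, 1.0]
model = [{0, 1}, {1, 2}, {2, 3}]                 # contiguous pairs
print(satisfies_head(b, {1, 2}, model, c_H=0.5))  # head 3 vs. best head 5
print(satisfies_tail(b, {1, 2}, model, c_T=2.0))  # tail sqrt(17) vs. best tail 1
```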

The prize-collecting Steiner tree problem (PCST)

Generalization of the classical Steiner tree problem.

Goal: Given a graph with edge costs c and node prizes π, find a subtree T minimizing c(T) + π(T̄) (T̄: nodes not in T).

[Figure: example graph with node prizes and edge costs.]

The Goemans-Williamson (GW) scheme produces a tree T with

$$ c(T) + 2\pi(\bar{T}) \le 2 \min_{T' \text{ is a tree}} \left( c(T') + \pi(\bar{T'}) \right) $$

and runs in time O(|V|^2 log |V|) [Goemans, Williamson, 1995].
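The PCST objective itself is easy to evaluate for a given candidate tree: pay for the edges that are used and for the prizes of the nodes that are left out. The sketch below only evaluates the objective and does not implement the GW scheme; the function name and the dictionary-based graph representation are assumptions, and the function does not verify that the edges actually form a tree.

```python
def pcst_objective(tree_nodes, tree_edges, cost, prize):
    """Edge cost of the tree plus the prizes of all nodes not in the tree."""
    tree_nodes = set(tree_nodes)
    edge_cost = sum(cost[e] for e in tree_edges)
    missed_prize = sum(p for v, p in prize.items() if v not in tree_nodes)
    return edge_cost + missed_prize

# Path a - b - c with a low-prize middle node.
prize = {"a": 5, "b": 1, "c": 4}
cost = {("a", "b"): 2, ("b", "c"): 2}

print(pcst_objective({"a"}, [], cost, prize))                          # 0 + (1 + 4)
print(pcst_objective({"a", "b"}, [("a", "b")], cost, prize))           # 2 + 4
print(pcst_objective({"a", "b", "c"}, list(cost), cost, prize))        # 4 + 0
```

In this example, paying for both edges to reach node c is cheaper than forfeiting its prize, so the tree through the low-prize node b has the smallest objective.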

Our algorithmic contributions

1. Generalize GW to the prize-collecting Steiner forest (PCSF) problem.

We find a forest F with g components such that:

$$ c(F) + 2\pi(\bar{F}) \le 2 \min_{F' \text{ has } g \text{ components}} \left( c(F') + \pi(\bar{F'}) \right) $$

2. Give a nearly-linear time and practical variant of GW.

Building on the dynamic edge splitting idea introduced in [Cole, Hariharan, Lewenstein, Porat, 2001].

3. Reduce WGM-projection to a sequence of PCSF problems.

Lagrangian relaxation + binary search and graph post-processing.
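The "Lagrangian relaxation + binary search" pattern can be illustrated on a much simpler problem than PCSF. In the sketch below, a hard size constraint is replaced by a per-node penalty λ, an oracle solves the penalized problem, and a binary search over λ finds the smallest penalty whose solution respects the budget. Everything here is a stand-in chosen for illustration: the toy oracle handles plain cardinality constraints, whereas in the talk the oracle role is played by a PCSF solver and graph post-processing follows; the function names are hypothetical.

```python
def binary_search_penalty(oracle, size_budget, hi=1.0, iters=60):
    """Find a small penalty lam such that len(oracle(lam)) <= size_budget.
    Assumes larger penalties never produce larger supports."""
    lo = 0.0
    while len(oracle(hi)) > size_budget:   # grow hi until it is feasible
        lo, hi = hi, hi * 2.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if len(oracle(mid)) <= size_budget:
            hi = mid                        # feasible: try a smaller penalty
        else:
            lo = mid                        # infeasible: penalty must grow
    return hi, oracle(hi)

def make_toy_oracle(prizes):
    """Solves max over S of sum_{i in S} (prize_i - lam): keep prizes above lam."""
    def oracle(lam):
        return {i for i, p in enumerate(prizes) if p > lam}
    return oracle

prizes = [9.0, 1.0, 4.0, 16.0, 2.0]
lam, support = binary_search_penalty(make_toy_oracle(prizes), size_budget=2)
print(lam, sorted(support))
```

The search keeps the invariant that `hi` is always feasible, so the returned support always respects the budget even if the search is stopped early.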

Running time

Theorem: On a graph with |E| edges and d nodes, GRAPH-COSAMP runs in time

$$ O\left( (T_X + |E| \log^3 d) \log d \right), $$

where T_X is the cost of a matrix-vector multiplication with the design / measurement matrix X.

Model            Reference   Previous time    Our time
1D-cluster       [CIHB09]    O(d log^2 d)     O(d log^4 d)
Trees            [HIS14a]    O(d log^2 d)     O(d log^4 d)
EMD              [HIS14b]    O(d^2 log d)     O(d^{3/2} log^4 d)
Graph clusters   [HZM11]     O(d^c)           O(d log^4 d)

Experiments

Sparse recovery experiments

[Figure: three plots of the probability of recovery (0 to 1) against the oversampling ratio n/s (2 to 7) for Graph-CoSaMP, StructOMP, LaMP, CoSaMP, and Basis Pursuit.]

StructOMP: [HZM11], LaMP: [CDHB09], CoSaMP: [NT09], BP: [CD92].

Running times

Angiogram image, n = 6s observations, subsampled Fourier matrix.

[Figure: two plots of recovery time (sec) against problem size d (0 to 4·10^4), one with a linear time axis (0 to 100) and one with a logarithmic time axis (10^-2 to 10^2), for Graph-CoSaMP, StructOMP, LaMP, CoSaMP, and Basis Pursuit.]

Graph-CoSaMP is about 20× faster than StructOMP for d = 10^4 and scales nearly-linearly.

Constant factor: solving more than 20 PCSF instances per recovery.

Conclusions

Further applications, e.g., in seismic image processing.

We introduced the Weighted Graph Model.
Generalizes several structured sparsity models.
Asymptotically optimal sample complexity in many cases.
Nearly-linear time approximate model projections.

Open problems / future directions
Fast measurement matrix for all sparsity levels.
Recovery guarantees beyond RIP.
Learning sparsity models.

[Figure: image panels labeled "Noisy input", "Human labels", and "Automatic".]
