Page 1: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.

An Efficient Projection for l1,∞ Regularization

Ariadna Quattoni, Xavier Carreras, Michael Collins, Trevor Darrell

MIT CSAIL

Page 2:

Joint Sparsity

Goal: Efficient training of jointly sparse models in high-dimensional spaces.

Why?

Learn from fewer examples.

Build more efficient classifiers.

Interpretability.

Page 3:

[Figure: grid of coefficients w_{k,j}, k = 1..4, j = 1..5, for four tasks: Church, Airport, Grocery Store, Flower-Shop]

Page 4:

[Figure: grid of coefficients w_{k,j} for the tasks Church, Airport, Grocery Store, Flower-Shop; w_{3,1} and w_{3,3} no longer appear]

Page 5:

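[Figure: grid of coefficients w_{k,j} for the tasks Church, Airport, Grocery Store, Flower-Shop; remaining entries: w_{1,2}, w_{1,4}, w_{2,2}, w_{2,3}, w_{2,4}, w_{3,2}, w_{3,4}, w_{3,5}, w_{4,2}, w_{4,4}]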

Page 6:

l1,∞ Regularization

How do we promote joint (i.e. row) sparsity?

[Figure: coefficient matrix with one row highlighted as "Coefficients for feature 2" and one column highlighted as "Coefficients for task 2"]

An l1 norm on the maximum absolute values of the coefficients across tasks promotes sparsity: use few features.

The l∞ norm on each row promotes non-sparsity within each row: share parameters.
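As a small illustration of the norm described above, the following sketch computes it with NumPy, assuming the layout suggested by the slide (rows are features, columns are tasks). The function names `l1_inf_norm` and `active_features` and the example matrix are hypothetical, chosen only for this example.

```python
import numpy as np

def l1_inf_norm(W):
    """l1 norm of the per-row l-infinity norms: sum over features of max |w| across tasks."""
    W = np.asarray(W, dtype=float)
    return np.abs(W).max(axis=1).sum()

def active_features(W, tol=1e-12):
    """Indices of rows (features) that are used by at least one task."""
    W = np.asarray(W, dtype=float)
    return np.flatnonzero(np.abs(W).max(axis=1) > tol)

# Hypothetical 4-feature, 3-task matrix: features 0 and 2 are unused by every task.
W = np.array([[0.0,  0.0, 0.0],
              [1.5, -2.0, 0.5],
              [0.0,  0.0, 0.0],
              [-0.3, 0.3, 0.1]])

norm_value = l1_inf_norm(W)    # row maxima are 0, 2.0, 0, 0.3
used = active_features(W)      # rows with a nonzero maximum
```

Only the largest coefficient in each row contributes to the norm, so once a feature is used by one task, the other tasks can use it at no extra cost up to that magnitude.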

Page 7:

Contributions

An efficient projected gradient method for l1,∞ regularization.

Our projection runs in O(n log n) time, the same cost as l1 projection.

Experiments on multitask image classification problems.

We can discover jointly sparse solutions.

l1,∞ regularization leads to better performance than l2 and l1 regularization.

Page 8:

Multitask Application

Collection of Tasks: $D = \{D_1, D_2, \ldots, D_m\}$

[Figure: the task datasets $D_1, D_2, \ldots, D_m$ and a block labeled "Joint Sparse Approximation"]

Page 9:

l1,∞ Regularization: Constrained Convex Optimization Formulation

[Equation: the objective is annotated "A convex function" and the constraint set "Convex constraints"]

We use a Projected SubGradient method. Main advantages: simple, scalable, guaranteed convergence rates.

Projected SubGradient methods have recently been proposed for:

l2 regularization, i.e. SVM [Shalev-Shwartz et al. 2007]

l1 regularization [Duchi et al. 2008]

Note (Ariadna): Computing the subgradients is trivial, so the question is whether the projection can be computed efficiently.
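To make the role of the projection concrete, here is a minimal sketch of a generic projected subgradient loop. It is not the authors' code: the function names, the 1/sqrt(t) step size, and the toy objective are assumptions, and an l2-ball projection stands in for the l1,∞ projection purely to keep the example short.

```python
import numpy as np

def projected_subgradient(subgradient, project, w0, steps=200, eta0=1.0):
    """Repeat: take a subgradient step, then project back onto the feasible set."""
    w = project(np.asarray(w0, dtype=float))
    for t in range(1, steps + 1):
        eta = eta0 / np.sqrt(t)              # assumed diminishing step size
        w = project(w - eta * subgradient(w))
    return w

def project_l2_ball(w, radius=1.0):
    """Stand-in projection (Euclidean ball), not the l1,inf projection."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)

# Toy problem: minimize 0.5 * ||w - a||^2 subject to ||w||_2 <= 1.
a = np.array([3.0, 4.0])
w_star = projected_subgradient(lambda w: w - a, project_l2_ball, np.zeros(2))
```

Swapping `project_l2_ball` for a projection onto the l1,∞ ball is the only change the constrained formulation requires, which is why the cost of that projection determines whether the method is practical.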
Page 10:

Euclidean Projection onto the l1,∞ ball
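The slide's equation is not reproduced here. A common way to write this projection, given as an assumption rather than as the slide's exact notation, takes a matrix $A$ with nonnegative entries (signs can be restored afterwards) whose l1,∞ norm exceeds the bound $C$, and introduces one variable $\mu_i$ per row for the row maximum:

```latex
\begin{aligned}
\min_{B,\,\mu}\quad & \tfrac{1}{2}\sum_{i,j}\bigl(B_{i,j}-A_{i,j}\bigr)^{2} \\
\text{s.t.}\quad & B_{i,j}\le\mu_i \quad \forall i,j, \\
& \sum_{i}\mu_i = C, \\
& B_{i,j}\ge 0 \quad \forall i,j, \qquad \mu_i\ge 0 \quad \forall i .
\end{aligned}
```

Here $B$ is the projected matrix and $\sum_i \mu_i$ plays the role of its l1,∞ norm; the symbols $B$, $\mu$ and $C$ are names assumed for this sketch.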

Page 11:

Characterization of the solution

Page 12:

Characterization of the solution

[Figure: four bar charts, one per feature: Feature I, Feature II, Feature III, Feature IV]

Page 13:

Mapping to a simpler problem

We can map the projection problem to the following problem, which finds the optimal maxima μ:
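The following is a minimal sketch of how the optimal maxima can be found numerically. It assumes the optimality condition that follows from the KKT conditions of the projection problem: there is a constant θ ≥ 0 such that every row i with μ_i > 0 loses exactly θ of its mass, i.e. the sum over j of max(|A_ij| - μ_i, 0) equals θ, and every row whose absolute sum is at most θ gets μ_i = 0. The sketch uses plain bisection on θ, which is simpler than, and not the same as, the sort-based O(n log n) procedure the slides refer to. All function names are hypothetical.

```python
import numpy as np

def row_maxima(S, cs, theta):
    """Per-row mu_i with sum_j max(s_ij - mu_i, 0) = theta, or 0 if the row sum <= theta.

    S holds each row of |A| sorted in descending order, cs its row-wise cumulative sums.
    """
    d, m = S.shape
    k = np.arange(1, m + 1)
    T = (cs - theta) / k                            # candidate thresholds using the k largest entries
    rho = np.maximum((S > T).sum(axis=1), 1)        # number of entries above the threshold
    tau = T[np.arange(d), rho - 1]
    return np.maximum(tau, 0.0)

def project_l1_inf(A, C, iters=100):
    """Euclidean projection of A onto {B : sum_i max_j |B_ij| <= C}, assuming C > 0."""
    A = np.asarray(A, dtype=float)
    absA = np.abs(A)
    if absA.max(axis=1).sum() <= C:                 # already inside the ball
        return A.copy()
    S = -np.sort(-absA, axis=1)
    cs = np.cumsum(S, axis=1)
    lo, hi = 0.0, cs[:, -1].max()                   # sum of maxima is > C at lo and 0 at hi
    for _ in range(iters):                          # bisection: the sum decreases as theta grows
        theta = 0.5 * (lo + hi)
        if row_maxima(S, cs, theta).sum() > C:
            lo = theta
        else:
            hi = theta
    mu = row_maxima(S, cs, 0.5 * (lo + hi))
    return np.sign(A) * np.minimum(absA, mu[:, None])   # clip each row at its maximum

A = np.array([[3.0, -1.0],
              [0.5,  0.2]])
B = project_l1_inf(A, C=2.0)
```

In this small example the second row has too little mass to survive and is set to zero entirely, while the first row is clipped at its new maximum; this is the joint sparsity effect the projection is meant to produce.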

Page 14:

Efficient Algorithm

[Figure: four bar charts]

Page 15:

Efficient Algorithm

[Figure: four bar charts]

Page 16:

Complexity

The total cost of the algorithm is dominated by sorting the entries of A.

The total cost is on the order of O(n log n), where n is the number of entries of A.

Page 17:

Synthetic Experiments

Generate a jointly sparse parameter matrix W: [Equation]

For every task we generate pairs: [Equation], where: [Equation]

We compared three different types of regularization:

l1,∞ projection

l1 projection

l2 projection
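As a rough illustration of this kind of setup, the sketch below generates a jointly sparse parameter matrix and per-task labeled pairs. Every specific in it is an assumption made for the example and not taken from the slides: the dimensions, the Gaussian draws for weights and inputs, the sign-based labels, and the function name `make_jointly_sparse_tasks`.

```python
import numpy as np

def make_jointly_sparse_tasks(d=50, m=10, n_relevant=5, n_examples=20, seed=0):
    """Hypothetical generator: W has the same few nonzero rows (features) for all tasks."""
    rng = np.random.default_rng(seed)
    W = np.zeros((d, m))                                   # rows: features, columns: tasks
    relevant = rng.choice(d, size=n_relevant, replace=False)
    W[relevant, :] = rng.normal(size=(n_relevant, m))      # assumed Gaussian weights
    tasks = []
    for k in range(m):
        X = rng.normal(size=(n_examples, d))               # assumed Gaussian inputs
        y = np.sign(X @ W[:, k])                           # assumed sign labels
        tasks.append((X, y))
    return W, relevant, tasks

W, relevant, tasks = make_jointly_sparse_tasks()
nonzero_rows = np.flatnonzero(np.abs(W).max(axis=1) > 0)
```

With data of this shape, the three projections can be compared by how well each recovers the shared set of relevant rows as the number of training examples per task varies.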

Page 18:

Synthetic Experiments

Page 19:

Dataset: Image Annotation

40 top content words.

Raw image representation: Vocabulary Tree (Nister and Stewenius 2006), 11000 dimensions.

Example content words: president, actress, team.

Page 20:

Results

Most of the differences are statistically significant

Page 21:

Results

Page 22:

Dataset: Indoor Scene Recognition

67 indoor scenes. Raw image representation: similarities to a set of unlabeled images. 2000 dimensions.

Example scenes: bakery, bar, train station.

Page 23:

Results

Page 24:

Conclusions

We proposed an efficient global optimization algorithm for l1,∞ regularization.

We presented experiments on image classification tasks and showed that our method can recover jointly sparse solutions.

A simple and efficient tool to implement an l1,∞ penalty, similar to standard l1 and l2 penalties.

