Page 1

Reporter: Zhihui Lai

Supervised by Prof. Zhong Jin

2011-6

Regression Shrinkage for

Sparse Projection Learning

------Graduate Celebration Report

Page 2

Outline

A review
Recommendations
Regressions: basic sparse learning methods
My works
Conclusions
Future works: possible hot topics in the future
Some suggestions for younger researchers

Page 3

A review

Fast algorithm

Sparse visual attention system

Sparseness for one class problem

Sparse representation and explanation for gene data

Super-resolution images and dictionary learning

Feature extraction and classification

Sparse subspace learning ------- reported in June 2009

Jieping Ye 2010

Cairong Zhao and I

Jian Yang, Zhenghong GU, and I

Lei Zhang,

Lili Wang and Guangwei Gao

Chunhou Zheng, Lei Zhang

Page 4

10 Recommended References (1)

P.N. Belhumeur, J.P. Hespanha, D.J. Kriegman, "Eigenfaces vs. Fisherfaces: recognition using class specific linear projection," IEEE Trans. Pattern Anal. Mach. Intelligence 19 (7) (1997) 711-720.

X.F. He, S. Yan, Y. Hu, P. Niyogi, H.J. Zhang, "Face recognition using Laplacianfaces," IEEE Trans. Pattern Anal. Mach. Intelligence 27 (3) (2005) 328-340. +++++ and its related papers

2DPCA, UDP (T-PAMI)

ULDA, OLDA (PR), NLDA

Graph embedding (T-PAMI)

Page 5

10 Recommended References (2)

J. Wright, A.Y. Yang, ..., Yi Ma, "Robust face recognition via sparse representation," T-PAMI 2009. ++++++ and its 20 related references!

B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, "Least angle regression," Annals of Statistics, vol. 32, 2004, pp. 407-499.

R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 58, 1996, pp. 267-288.

Zou, H. (Stanford), Hastie, T., & Tibshirani, R. (2004). Sparse principal component analysis (Technical Report). Statistics Department, Stanford University.

D. Cai, X. He, J. Han, "Spectral Regression: A Unified Approach for Sparse Subspace Learning," Proc. 2007 Int. Conf. on Data Mining (ICDM 07), Omaha, NE, Oct. 2007.

Page 6

Background --- sparseness is needed

One key drawback of PCA is its lack of sparseness.

Sparse representations are generally desirable: they reduce computational cost and promote better generalization in learning algorithms.

In many applications, the coordinate axes involved in the factors have a direct physical interpretation. In financial or biological applications, each axis might correspond to a specific asset or gene.

Page 7

The methods for sparse solutions

CVX, L1-magic, L1_eq
SDP, QCQP
GPSR, SLEP
Lasso, Glasso
Elastic net

Page 8

Regressions

Gaussian Process Regression,
Support Vector Regression,
Regression Trees,
and Nearest Neighbor Regression

OMP --- Orthogonal Matching Pursuit: UNSOLVED!!
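
For reference, here is a compact textbook sketch of Orthogonal Matching Pursuit, a greedy solver for K-sparse regression; it is written for illustration only, and the function and variable names are mine, not from the talk.

```python
import numpy as np

def omp(D, y, K):
    """Orthogonal Matching Pursuit: D has unit-norm columns; returns K-sparse x."""
    m, n = D.shape
    support, x = [], np.zeros(n)
    residual = y.copy()
    for _ in range(K):
        j = int(np.argmax(np.abs(D.T @ residual)))     # most correlated atom
        support.append(j)
        Ds = D[:, support]
        coef, *_ = np.linalg.lstsq(Ds, y, rcond=None)  # orthogonal projection
        residual = y - Ds @ coef                       # update the residual
    x[support] = coef
    return x
```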

Page 9

Why L1 norm learning?
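
A minimal numerical illustration (my own, not from the slides): for the scalar problem min_b (b - a)^2 + lam * penalty(b), the L1 penalty yields the soft-thresholding operator, which sets small coefficients exactly to zero, while the L2 (ridge) penalty only shrinks them toward zero.

```python
import numpy as np

def soft_threshold(a, lam):
    """L1 solution of min_b (b - a)^2 + lam*|b|: sign(a) * max(|a| - lam/2, 0)."""
    return np.sign(a) * np.maximum(np.abs(a) - lam / 2.0, 0.0)

def ridge_shrink(a, lam):
    """L2 solution of min_b (b - a)^2 + lam*b^2: a / (1 + lam)."""
    return a / (1.0 + lam)

a = np.array([3.0, 0.8, -0.3, 0.05])
print(soft_threshold(a, 1.0))  # small entries become exactly 0 -> sparsity
print(ridge_shrink(a, 1.0))    # every entry shrinks but stays nonzero
```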

Page 10

Some useful journals:
Comm. Pure and Applied Math.
SIAM Rev.
J. Am. Statistical Assoc.
IEEE Trans. Information Theory
Theoretical Computer Science
Foundations of Computational Math

Page 11

Basic projection theory and algorithms ---- PCA

Idea: minimize the reconstruction error while preserving the maximal variance.

$$\min_{\varphi} \sum_{i=1}^{M} \left\| x_i - \varphi \varphi^T x_i \right\|^2, \qquad S_t = \frac{1}{M} \sum_{i=1}^{M} (x_i - \bar{x})(x_i - \bar{x})^T \in R^{n \times n}$$

$$J(\varphi) = \varphi^T S_t \varphi, \qquad \varphi_{PCA} = \arg\max J(\varphi) = [\varphi_1, \varphi_2, \ldots, \varphi_d]$$

Geometric meaning: maximize the total scatter of the features obtained after projection.
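
A minimal NumPy sketch of the PCA criterion above (names are illustrative): the optimal projections are the d leading eigenvectors of the total scatter matrix S_t.

```python
import numpy as np

def pca(X, d):
    """X: M x n data matrix with samples as rows; returns the n x d projection."""
    Xc = X - X.mean(axis=0)              # center the data
    St = Xc.T @ Xc / X.shape[0]          # total scatter matrix S_t
    vals, vecs = np.linalg.eigh(St)      # eigh returns ascending eigenvalues
    return vecs[:, ::-1][:, :d]          # the d leading eigenvectors

X = np.random.randn(100, 20)
Phi = pca(X, 5)                          # [phi_1, ..., phi_d]
Y = X @ Phi                              # projected features, maximal scatter
```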

Page 12

Basic projection theory and algorithms ---- SPCA (1)

Idea: under the principle of rotation invariance, minimize the projection error between subspaces.

SVD decomposition: $X = UDV^T$

$$(A^*, B^*) = \arg\min_{A,B} \sum_{i=1}^{m} \left\| x_i - AB^T x_i \right\|^2 + \lambda \sum_{j=1}^{d} \left\| B(:,j) \right\|^2 \quad \text{s.t. } A^T A = I_d$$

Then $B^*(:,j) \propto V(:,j)$.

Geometric meaning: minimize, between the two subspaces, the difference between the image of the same pattern point and its original image.

Page 13

Basic projection theory and algorithms ---- SPCA (2)

Idea: under the principle of rotation invariance, minimize the projection error between sparse subspaces.

$$(A^*, B^*) = \arg\min_{A,B} \sum_{i=1}^{m} \left\| x_i - AB^T x_i \right\|^2 + \lambda \sum_{j=1}^{d} \left\| B(:,j) \right\|^2 + \sum_{j=1}^{d} \lambda_{1,j} \left\| B(:,j) \right\|_1 \quad \text{s.t. } A^T A = I_d$$

Geometric meaning: find a sparse linear transform that minimizes the difference between the image of each pattern point in the sparse subspace and its image in the original space.
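
A rough sketch of the alternating scheme behind the two SPCA criteria above, under my reading of Zou, Hastie and Tibshirani: with A fixed, each column of B solves an elastic-net regression; with B fixed, A is updated from an SVD (a Procrustes step). The parameter names and the mapping of (lam1, lam2) onto scikit-learn's ElasticNet are illustrative, not exact.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def spca(X, d, lam2=0.1, lam1=0.01, n_iter=20):
    """X: n x m data matrix; returns m x d sparse loadings."""
    A = np.linalg.svd(X, full_matrices=False)[2][:d].T   # init: leading PCs
    B = np.zeros_like(A)
    for _ in range(n_iter):
        for j in range(d):                               # elastic-net step for B
            enet = ElasticNet(alpha=lam1 + lam2,
                              l1_ratio=lam1 / (lam1 + lam2),
                              fit_intercept=False, max_iter=5000)
            B[:, j] = enet.fit(X, X @ A[:, j]).coef_
        U, _, Vt = np.linalg.svd(X.T @ (X @ B), full_matrices=False)
        A = U @ Vt                                       # SVD (Procrustes) step for A
    return B / (np.linalg.norm(B, axis=0) + 1e-12)       # normalized sparse loadings
```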

Page 14

Basic projection theory and algorithms ---- SDA (1)

Idea: treat the categorical variable as a quantitative variable and write the problem in regression form.

Optimal scoring:

$$(\hat{\theta}, \hat{\beta}) = \arg\min_{\theta, \beta} m^{-1} \left\| Y\theta - X\beta \right\|^2 \quad \text{s.t. } m^{-1} \left\| Y\theta \right\|^2 = 1$$

Penalized discriminant analysis (with penalty matrix $\Omega$):

$$(\hat{\theta}, \hat{\beta}) = \arg\min_{\theta, \beta} m^{-1} \left( \left\| Y\theta - X\beta \right\|^2 + \lambda\, \beta^T \Omega \beta \right)$$

Y is an m×c matrix containing only 0-1 values whose columns encode the class attributes.

Geometric meaning: approximate the class-related quantitative variables in a low-dimensional subspace.

Page 15

Basic projection theory and algorithms ---- SDA (2)

Idea: treat the categorical variable as a quantitative variable and write the problem as a regression containing the L1 norm.

$$(\hat{\theta}, \hat{\beta}) = \arg\min_{\theta, \beta} m^{-1} \left( \left\| Y\theta - X\beta \right\|^2 + \lambda_2 \left\| \beta \right\|_2^2 + \lambda_1 \left\| \beta \right\|_1 \right) \quad \text{s.t. } m^{-1} \left\| Y\theta \right\|^2 = 1$$

The optimal sparse projections are obtained by iterating the Elastic Net and the SVD decomposition.

Geometric meaning: approximate the class-related quantitative variables in a low-dimensional subspace.
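
A minimal sketch (my own, not the thesis code) of one alternating step for the sparse optimal-scoring criterion above: with the score vector theta fixed, beta solves an elastic-net regression of Y @ theta on X; theta is then updated by least squares and renormalized. The helper names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def make_indicator(labels, c):
    """m x c 0-1 indicator matrix Y encoding class membership."""
    Y = np.zeros((len(labels), c))
    Y[np.arange(len(labels)), labels] = 1.0
    return Y

def sda_step(X, Y, theta, lam1=0.01, lam2=0.1):
    m = X.shape[0]
    target = Y @ theta                            # quantified class variable
    enet = ElasticNet(alpha=lam1 + lam2, l1_ratio=lam1 / (lam1 + lam2),
                      fit_intercept=False, max_iter=5000)
    beta = enet.fit(X, target).coef_              # sparse discriminant direction
    theta, *_ = np.linalg.lstsq(Y, X @ beta, rcond=None)      # update the scores
    theta /= np.sqrt((Y @ theta) @ (Y @ theta) / m) + 1e-12   # ||Y theta||^2/m = 1
    return theta, beta
```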

Page 16

Graph-based sparse projection learning models

The existing sparse learning model (USSL):

$$\max_{\varphi}\ \varphi^T X W X^T \varphi \quad \text{s.t. } \varphi^T X D X^T \varphi = 1,\ \mathrm{Card}(\varphi) \le K$$

i.e., the generalized eigenproblem $X W X^T \varphi = \lambda\, X D X^T \varphi$ with $\mathrm{Card}(\varphi) \le K$.

The sparse discriminant projection (SLDP) learning model proposed in this thesis:

$$\max_{\varphi}\ J^b(\varphi) = \varphi^T X (D^b - W^b) X^T \varphi, \qquad \min_{\varphi}\ J^w(\varphi) = \varphi^T X (D^w - W^w) X^T \varphi$$

$$\text{s.t. } \varphi^T X X^T \varphi = 1,\ \mathrm{Card}(\varphi) \le K$$
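
For intuition, a simple truncation heuristic for the Card(phi) <= K constraint (illustration only, not necessarily the solver developed in the thesis): take the leading generalized eigenvector of (XWX^T, XDX^T) and keep only its K largest-magnitude entries.

```python
import numpy as np
from scipy.linalg import eigh

def sparse_projection(XWXt, XDXt, K):
    """Leading generalized eigenvector of (XWXt, XDXt), truncated to K nonzeros."""
    vals, vecs = eigh(XWXt, XDXt)          # XDXt assumed positive definite
    phi = vecs[:, -1].copy()               # leading eigenvector
    idx = np.argsort(np.abs(phi))[:-K]     # all but the K largest entries
    phi[idx] = 0.0                         # enforce Card(phi) <= K
    return phi / np.linalg.norm(phi)
```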

Page 17

Comparison of the sparse projection vectors and their semantic interpretation

Binary images of the sparse face subspaces obtained by the SLDP (left) and USSL (right) algorithms, with K = 400; white points denote nonzero elements and black regions denote zero elements.

A face image from the AR face dataset.

Experiments and analysis (AR face dataset)

Page 18

Summary of vector-based sparse projection learning

Advantages: sparse feature extraction methods can also give a semantic interpretation at the feature level; they can discover the most effective discriminative features for classification, telling us which features actually play the key role.

Disadvantages: the computational complexity is high, and when there are many nonzero elements these algorithms tend to be time-consuming. A large number of projections is needed to separate the classes effectively, which further increases the computational burden. When these methods are applied to face (image) recognition, the resulting projection axes are still hard to interpret in an intuitive, reasonable, face-semantic way, and the projection vectors largely no longer carry the attributes of the image object. The theoretical connection between sparse discriminant projections and non-sparse (compact) discriminant projections has still not been established.

Page 19

A framework for sparse 2D feature extraction based on manifold learning

2D compact projection learning methods based on image matrices:

2DLPP: $X(L \otimes I_{n_1}) X^T \varphi = \lambda\, X(D \otimes I_{n_1}) X^T \varphi$

2DLGEDA: $X(L^b \otimes I_{n_1}) X^T \varphi = \lambda\, X(L^w \otimes I_{n_1}) X^T \varphi$

The sparse projection learning framework proposed in this thesis:

$$X(L^b \otimes I_{n_1}) X^T \varphi = \lambda\, X(L^w \otimes I_{n_1}) X^T \varphi \quad \text{subject to } \mathrm{Card}(\varphi) \le K$$

Page 20

Fast graph spectral eigendecomposition

These two theorems provide the ideas for fast sparse regression!

Page 21

2D regression extensions based on image matrices

The 2D ridge regression, 2D Lasso regression, and 2D Elastic Net regression based on image matrices are, respectively:

$$\hat{\beta} = \arg\min_{\beta} \left( \sum_{i=1}^{n} \sum_{h=1}^{m} \big( X_i(h,:)\,\beta - y_{ih} \big)^2 + \lambda \sum_{j=1}^{n_2} \beta_j^2 \right)$$

$$\hat{\beta} = \arg\min_{\beta} \left( \sum_{i=1}^{n} \sum_{h=1}^{m} \big( X_i(h,:)\,\beta - y_{ih} \big)^2 + \lambda \sum_{j=1}^{n_2} |\beta_j| \right)$$

$$\hat{\beta} = \arg\min_{\beta} \left( \sum_{i=1}^{n} \sum_{h=1}^{m} \big( X_i(h,:)\,\beta - y_{ih} \big)^2 + \lambda_2 \sum_{j=1}^{n_2} \beta_j^2 + \lambda_1 \sum_{j=1}^{n_2} |\beta_j| \right)$$
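
A minimal sketch of the 2D ridge regression above (names are illustrative): each image matrix X_i contributes its rows X_i(h,:) as regression samples, so the problem reduces to ordinary ridge regression on the stacked rows. The 2D Lasso and 2D Elastic Net reduce in the same way, with an L1-capable solver in place of the closed-form solve.

```python
import numpy as np

def ridge_2d(images, targets, lam):
    """images: list of m x n2 matrices X_i; targets: list of length-m responses."""
    A = np.vstack(images)                  # stack all rows X_i(h,:)
    t = np.concatenate(targets)            # matching per-row responses y_ih
    n2 = A.shape[1]
    # closed-form ridge solution: (A^T A + lam*I)^{-1} A^T t
    return np.linalg.solve(A.T @ A + lam * np.eye(n2), A.T @ t)
```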

Page 22

Sparsefaces: the unsupervised S2DLPP algorithm

The objective function of S2DLPP:

$$X(W \otimes I_{n_1}) X^T \varphi = \lambda\, X(D \otimes I_{n_1}) X^T \varphi \quad \text{subject to } \mathrm{Card}(\varphi) \le K$$

The procedure of the S2DLPP algorithm:
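
A hedged outline of a two-step procedure consistent with the objective above (a spectral step followed by a regression-shrinkage step); the exact steps in the thesis may differ. The inputs Sw = X(W kron I)X^T and Sd = X(D kron I)X^T are assumed precomputed, and the parameter names are illustrative.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.linear_model import ElasticNet

def s2dlpp(X, Sw, Sd, d, lam1=0.01, lam2=0.1):
    """X: n1 x N matrix of stacked image columns; returns d sparse projections."""
    vals, vecs = eigh(Sw, Sd)                        # spectral step (dense phi)
    V = vecs[:, ::-1][:, :d]
    Phi = np.zeros_like(V)
    for j in range(d):                               # regression-shrinkage step:
        enet = ElasticNet(alpha=lam1 + lam2,         # find a sparse phi_j with
                          l1_ratio=lam1 / (lam1 + lam2),  # X^T phi_j ~ X^T v_j
                          fit_intercept=False, max_iter=5000)
        Phi[:, j] = enet.fit(X.T, X.T @ V[:, j]).coef_
    return Phi
```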

Page 23

Comparison of the algorithms' time and space complexity

[Tables: time and space complexity of Sparsefaces, USSL, and 2DLPP, in terms of the image size and the number of training samples; for Sparsefaces and USSL the complexity can be further reduced by keeping only K nonzero elements.]

Greatly improves the learning speed! Saves space!

Page 24

The transformation matrices of the Sparsefaces method

From left to right: 2DPCA "faces", 2DLDA "faces", 2DLPP "faces", USSL "faces".

Sparse "face" images learned by S2DLPP, where K = 2:2:10.

Binary "face" images of the sparse faces; white points denote zero elements and black regions denote nonzero elements.

Experiments and analysis on the Yale face dataset

Page 25

Properties of the unsupervised S2DLPP algorithm

Saves 20% of the time.

Fast!

Page 26

Effectiveness of S2DLPP under variations in time, illumination, and expression

Robustness of S2DLPP to variations in illumination, expression, and time.

The first 10 images from the first session were used for training, and the first 10 images from the second session for testing.

Fast!

Experimental comparison on the AR face dataset of the effect of the proposed S2DLPP algorithm

Page 27

Experiments with S2DLPP on the FERET database

1400 face images of 200 subjects; the first 5 images per subject were used for training and the last 2 for testing; the image size is 40×40.

Nearly 100 times faster than the vector-based sparse learning methods!

Page 28

The supervised S2DLDP algorithm

The objective function of S2DLDP:

$$X(L^b \otimes I_{n_1}) X^T \varphi = \lambda\, X(L^w \otimes I_{n_1}) X^T \varphi \quad \text{subject to } \mathrm{Card}(\varphi) \le K$$

The procedure of the S2DLDP algorithm:

Page 29

Properties of the transformation matrices of S2DLDP

From left to right: 2DPCA "faces", 2DLDA "faces", 2DLPP "faces", 2DLGEDA "faces".

Sparse "faces" learned by S2DLDP, K = 2:2:10.

Binary "faces" of S2DLDP; white points denote nonzero elements and black regions denote zero elements.

Experiments on the Yale face dataset

Page 30

Robustness of S2DLDP

Recognition rate versus dimension for each method on the AR face database (with variations in illumination, expression, and time).

Recognition rate of S2DLDP versus the number of nonzero elements and the dimension on the Yale face database (with variations in illumination and expression).

Page 31

Mutually orthogonal sparse projection learning model

The existing sparse learning model (USSL):

$$\max_{\varphi}\ \varphi^T X W X^T \varphi \quad \text{s.t. } \varphi^T X D X^T \varphi = 1,\ \mathrm{Card}(\varphi) \le K$$

The proposed model with mutually orthogonal projections:

$$\max_{\varphi}\ \varphi^T X W X^T \varphi \quad \text{s.t. } \varphi^T X D X^T \varphi = 1,\ \mathrm{Card}(\varphi) \le K,\ \varphi_j^T \varphi_i = 0 \text{ for } i \ne j$$

The mutual orthogonality constraint!

It took me more than half a year to find its solution!
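
One standard way to impose the phi_j^T phi_i = 0 constraint (shown purely for illustration; not necessarily the solution found in the thesis) is deflation: after each sparse projection is extracted, restrict the problem to the orthogonal complement of the projections found so far.

```python
import numpy as np

def deflate(Sw, Sd, found):
    """Project both scatter matrices onto the complement of span(found)."""
    Q, _ = np.linalg.qr(found)               # orthonormalize found projections
    P = np.eye(Sw.shape[0]) - Q @ Q.T        # projector onto the complement
    return P @ Sw @ P, P @ Sd @ P            # the next eigenvector is orthogonal
```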

Page 32

Multilinear sparse regression: MSPCA

Given tensor samples $\mathcal{X}_i \in R^{m_1 \times m_2 \times \cdots \times m_n}$ ($i = 1, 2, \ldots, N$), the multilinear projections are

$$\mathcal{Y}_i = \mathcal{X}_i \times_1 U_1^T \times_2 U_2^T \cdots \times_n U_n^T$$

$$J(U_1, U_2, \ldots, U_n) = \sum_{i=1}^{N} \left\| \mathcal{X}_i - \mathcal{X}_i \times_1 B_1 U_1^T \times_2 B_2 U_2^T \cdots \times_n B_n U_n^T \right\|_F^2 + \sum_{j} \lambda_j \left\| U_j \right\|_F^2 + \sum_{j} \sum_{h} \lambda_{j,h} \left\| u_j^h \right\|_1$$

$$\text{subject to } B_1^T B_1 = I_1,\ \ldots,\ B_n^T B_n = I_n$$

$$U_j^* = \arg\min_{U_j} J(U_1, U_2, \ldots, U_n) \quad \text{(with the other modes fixed)}$$

where $U_i \in R^{m_i \times d_i}$, $d_i < m_i$, $i = 1, 2, \ldots, n$.

Page 33

MSPCA algorithm
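
A hedged outline of the alternating idea behind MSPCA, shown for order-2 tensors (image matrices) to keep it short; the thesis algorithm may differ, and all names are illustrative. Each mode's unfolding receives an SPCA-style elastic-net + SVD update in turn.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def spca_mode(M, d, lam1, lam2, n_iter=5):
    """SPCA-style sparse loadings for unfolding M (samples x dim)."""
    A = np.linalg.svd(M, full_matrices=False)[2][:d].T
    U = np.zeros_like(A)
    for _ in range(n_iter):
        for j in range(d):                        # elastic-net step per column
            enet = ElasticNet(alpha=lam1 + lam2, l1_ratio=lam1 / (lam1 + lam2),
                              fit_intercept=False, max_iter=5000)
            U[:, j] = enet.fit(M, M @ A[:, j]).coef_
        Us, _, Vt = np.linalg.svd(M.T @ (M @ U), full_matrices=False)
        A = Us @ Vt                               # SVD (Procrustes) step
    return U

def mspca_2d(Xs, d1, d2, lam1=0.01, lam2=0.1):
    """Xs: list of m1 x m2 images; returns sparse U1 (m1 x d1) and U2 (m2 x d2)."""
    mode1 = np.hstack(Xs).T                       # mode-1 unfolding, samples x m1
    mode2 = np.hstack([X.T for X in Xs]).T        # mode-2 unfolding, samples x m2
    return (spca_mode(mode1, d1, lam1, lam2),
            spca_mode(mode2, d2, lam1, lam2))
```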

Page 34

Multilinear sparse regression on manifolds

Given tensor samples $\mathcal{X}_i \in R^{m_1 \times m_2 \times \cdots \times m_n}$ ($i = 1, 2, \ldots, N$), the multilinear projections $\mathcal{Y}_i = \mathcal{X}_i \times_1 U_1^T \times_2 U_2^T \cdots \times_n U_n^T$, and a graph weight matrix $W$ on the manifold:

$$J(U_1, U_2, \ldots, U_n) = \sum_{i,j} \left\| (\mathcal{X}_i - \mathcal{X}_j) \times_1 B_1 U_1^T \times_2 B_2 U_2^T \cdots \times_n B_n U_n^T \right\|_F^2 W_{ij} + \sum_{j} \lambda_j \left\| U_j \right\|_F^2 + \sum_{j} \sum_{h} \lambda_{j,h} \left\| u_j^h \right\|_1$$

$$\text{subject to } B_1^T B_1 = I_1,\ \ldots,\ B_n^T B_n = I_n$$

$$U_j^* = \arg\min_{U_j} J(U_1, U_2, \ldots, U_n), \qquad U_i \in R^{m_i \times d_i},\ d_i < m_i,\ i = 1, 2, \ldots, n$$

Graph on manifolds

Page 35

Conclusions

Sparseness might be necessary!
Sparseness can be more efficient!
Fewer atoms (loadings), higher accuracy!

Page 36

Possible hot topics in the future!

Effective dictionary learning for classification

Classifier (classification) based optimal dimensionality reduction

Information theory (entropy) based discriminant analysis (such as AIDA)

Game theory based discriminant analysis

(Multilinear) sparse projections and their applications to biometrics and interpretation (such as gene data)

Page 37

Some suggestions for younger researchers

Elements: step by step, from smaller to bigger.

Writing: going faster is more harmful! Careful rewriting! Details decide success or failure! 3~4 papers per year!

Submissions: address the comments and just do it!

Paper (40%) + writing (30%) + reviewers (30%) = 1

Our visual angle decides our height!

Page 38

Thanks!

