
K-means Clustering via Principal Component Analysis

Date post: 05-Jan-2016
Upload: mahdis
Description:
K-means Clustering via Principal Component Analysis. According to the paper by Chris Ding and Xiaofeng He from Int’l Conf. Machine Learning, Banff, Canada, 2004.
Transcript
Page 1: K-means Clustering via Principal Component Analysis

K-means Clustering via Principal Component Analysis

According to the paper by Chris Ding and Xiaofeng He from Int’l Conf. Machine Learning, Banff, Canada, 2004

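Page 2: K-means Clustering via Principal Component Analysis

Traditional K-means Clustering

Minimizing the sum of squared errors

$$J_K = \sum_{k=1}^{K} \sum_{i \in C_k} (\mathbf{x}_i - \mathbf{m}_k)^2$$

Where data matrix

$$X = (\mathbf{x}_1, \ldots, \mathbf{x}_n), \qquad \mathbf{x}_i = (x_1, \ldots, x_d)$$

Centroid of cluster $C_k$

$$\mathbf{m}_k = \frac{1}{n_k} \sum_{i \in C_k} \mathbf{x}_i$$

$n_k$ is the number of points in $C_k$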
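The sum of squared errors can be evaluated directly for any given assignment of points to clusters. The following is a minimal sketch, assuming NumPy and a data matrix with one point per column; the function name `kmeans_objective` and the toy data are illustrative and do not come from the slides.

```python
import numpy as np

def kmeans_objective(X, labels):
    """Sum of squared distances of each point to its cluster centroid.

    X has shape (d, n): one column per data point, as in X = (x_1, ..., x_n).
    labels has shape (n,) and holds the cluster index of each point.
    """
    total = 0.0
    for k in np.unique(labels):
        members = X[:, labels == k]                      # points in cluster C_k
        centroid = members.mean(axis=1, keepdims=True)   # m_k
        total += np.sum((members - centroid) ** 2)
    return total

# Toy data (an assumption for illustration): four 2-D points in two tight pairs.
X = np.array([[0.0, 0.0, 10.0, 10.0],
              [0.0, 2.0, 0.0, 2.0]])
good = np.array([0, 0, 1, 1])   # groups points that are close together
bad = np.array([0, 1, 0, 1])    # groups points that are far apart

print(kmeans_objective(X, good), kmeans_objective(X, bad))  # 4.0 for good, 100.0 for bad
```

The partition that keeps nearby points together gives the smaller objective, which is what K-means searches for.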

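Page 3: K-means Clustering via Principal Component Analysis

Principal Component Analysis (PCA)

Centered data matrix

$$Y = (\mathbf{y}_1, \ldots, \mathbf{y}_n), \qquad \mathbf{y}_i = \mathbf{x}_i - \bar{\mathbf{x}}, \qquad \bar{\mathbf{x}} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{x}_i$$

Covariance matrix

$$\frac{1}{n-1} \sum_{i=1}^{n} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^T = \frac{1}{n-1} Y Y^T$$

Factor $\frac{1}{n-1}$ is ignored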
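Centering and the covariance matrix take only a few lines to compute. This is a sketch assuming NumPy and one data point per column; `center_columns` and the sample matrix are illustrative names and values, not taken from the slides.

```python
import numpy as np

def center_columns(X):
    """Subtract the mean point x_bar from every column of X (shape d x n)."""
    x_bar = X.mean(axis=1, keepdims=True)
    return X - x_bar

# Illustrative data: d = 2 features, n = 4 points.
X = np.array([[1.0, 2.0, 3.0, 6.0],
              [2.0, 0.0, 4.0, 2.0]])
n = X.shape[1]

Y = center_columns(X)          # centered data matrix
scatter = Y @ Y.T              # Y Y^T, the covariance without the 1/(n-1) factor
covariance = scatter / (n - 1)

# np.cov treats rows as variables and divides by n-1 by default,
# so it should agree with the explicit formula.
print(np.allclose(covariance, np.cov(X)))  # True
```

Dropping the factor 1/(n-1) rescales every eigenvalue by the same constant and leaves the eigenvectors unchanged, which is why the slides can ignore it.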

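Page 4: K-means Clustering via Principal Component Analysis

PCA - continuation

Eigenvalues and eigenvectors

$$Y Y^T \mathbf{u}_k = \lambda_k \mathbf{u}_k, \qquad Y^T Y \mathbf{v}_k = \lambda_k \mathbf{v}_k, \qquad \mathbf{v}_k = Y^T \mathbf{u}_k / \lambda_k^{1/2}$$

Singular value decomposition (SVD)

$$Y = \sum_k \lambda_k^{1/2} \mathbf{u}_k \mathbf{v}_k^T$$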
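The relations between the eigenvectors of Y Y^T, the eigenvectors of Y^T Y and the SVD of Y can be checked numerically. The sketch below assumes NumPy and uses arbitrary random data with a fixed seed; the eigenvalues are taken as the squared singular values returned by `np.linalg.svd`.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, arbitrary toy data
X = rng.normal(size=(3, 8))             # d = 3 features, n = 8 points
Y = X - X.mean(axis=1, keepdims=True)   # centered data matrix

# Thin SVD: Y = U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
V = Vt.T
lam = s ** 2                            # eigenvalues lambda_k

# u_k are eigenvectors of Y Y^T, v_k are eigenvectors of Y^T Y.
assert np.allclose(Y @ Y.T @ U, U * lam)
assert np.allclose(Y.T @ Y @ V, V * lam)

# v_k = Y^T u_k / lambda_k^{1/2} (valid because every lambda_k > 0 here).
assert np.allclose(Y.T @ U / s, V)

# Y is the sum of rank-one terms lambda_k^{1/2} u_k v_k^T.
Y_rebuilt = sum(s[k] * np.outer(U[:, k], V[:, k]) for k in range(len(s)))
assert np.allclose(Y, Y_rebuilt)
```

Using the SVD avoids forming the n x n matrix Y^T Y explicitly, which matters when the number of points is large.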

Page 5: K-means Clustering via Principal Component Analysis

PCA - example

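Page 6: K-means Clustering via Principal Component Analysis

K-means → PCA

Indicator vectors

$$\mathbf{h}_k = (0, \ldots, 0, \underbrace{1, \ldots, 1}_{n_k}, 0, \ldots, 0)^T / n_k^{1/2}, \qquad H_K = (\mathbf{h}_1, \ldots, \mathbf{h}_K)$$

Criterion

$$J_K = \mathrm{Tr}(X^T X) - \mathrm{Tr}(H_K^T X^T X H_K)$$

Linear transform by K × K orthonormal matrix T

$$Q_K = (\mathbf{q}_1, \ldots, \mathbf{q}_K) = H_K T$$

Last column of T

$$\mathbf{t}_K = \left(\sqrt{n_1/n}, \ldots, \sqrt{n_K/n}\right)^T$$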
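The indicator-matrix form of the criterion can be verified on a small example. This sketch assumes NumPy, cluster labels 0..K-1 with no empty cluster, and one data point per column; `indicator_matrix`, `trace_objective` and the toy data are hypothetical names and values introduced for illustration.

```python
import numpy as np

def indicator_matrix(labels, K):
    """Return the n x K matrix H_K whose k-th column is h_k.

    Assumes labels take values 0..K-1 and every cluster is non-empty.
    """
    H = np.zeros((labels.shape[0], K))
    for k in range(K):
        members = labels == k
        H[members, k] = 1.0 / np.sqrt(members.sum())   # 1 / n_k^{1/2}
    return H

def trace_objective(X, H):
    """J_K = Tr(X^T X) - Tr(H^T X^T X H), with one data point per column of X."""
    G = X.T @ X                      # n x n Gram matrix
    return np.trace(G) - np.trace(H.T @ G @ H)

X = np.array([[0.0, 0.0, 10.0, 10.0],
              [0.0, 2.0, 0.0, 2.0]])
labels = np.array([0, 0, 1, 1])
n = labels.shape[0]
H = indicator_matrix(labels, K=2)

print(np.allclose(H.T @ H, np.eye(2)))    # True: columns of H are orthonormal
print(trace_objective(X, H))               # sum of squared errors for this partition

# Last column of T: t_K = (sqrt(n_1/n), ..., sqrt(n_K/n)).
t_K = np.sqrt(np.bincount(labels, minlength=2) / n)
print(np.allclose(H @ t_K, np.ones(n) / np.sqrt(n)))  # True: H t_K is constant
```

The last check shows why this particular column of T is singled out: it maps the indicators to a constant vector that carries no cluster information.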

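Page 7: K-means Clustering via Principal Component Analysis

K-means → PCA - continuation

Therefore

$$\mathbf{q}_K = \sqrt{\frac{n_1}{n}}\,\mathbf{h}_1 + \cdots + \sqrt{\frac{n_K}{n}}\,\mathbf{h}_K = \sqrt{\frac{1}{n}}\,\mathbf{e}$$

where $\mathbf{e}$ is the vector of all ones.

Criterion

$$J_K = \mathrm{Tr}(Y^T Y) - \mathrm{Tr}(Q_{K-1}^T Y^T Y Q_{K-1})$$

Optimization becomes

$$\max_{Q_{K-1}} \mathrm{Tr}(Q_{K-1}^T Y^T Y Q_{K-1})$$

Solution is the first K-1 principal components

$$Q_{K-1} = (\mathbf{v}_1, \ldots, \mathbf{v}_{K-1})$$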
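Relaxing the indicator vectors to arbitrary orthonormal vectors turns the maximization into an eigenvalue problem, so the relaxed solution comes straight from the SVD of the centered data. Because the relaxed maximum equals the sum of the K-1 largest eigenvalues of Y^T Y, subtracting that sum from Tr(Y^T Y) gives a lower bound on the K-means objective. The sketch below assumes NumPy; `pca_relaxation` and the toy data are illustrative, and the sign-based split for K = 2 follows the result stated in the Ding and He paper rather than anything shown on this slide.

```python
import numpy as np

def pca_relaxation(X, K):
    """Relaxed K-means solution: first K-1 principal components and a lower bound.

    X has shape (d, n), one data point per column.
    Returns Q (n x (K-1), columns v_1..v_{K-1}) and Tr(Y^T Y) - sum of top K-1 eigenvalues.
    """
    Y = X - X.mean(axis=1, keepdims=True)
    _, s, Vt = np.linalg.svd(Y, full_matrices=False)
    lam = s ** 2                                  # eigenvalues, descending
    Q = Vt[:K - 1].T                              # v_1, ..., v_{K-1}
    lower_bound = np.sum(Y ** 2) - lam[:K - 1].sum()
    return Q, lower_bound

X = np.array([[0.0, 0.0, 10.0, 10.0],
              [0.0, 2.0, 0.0, 2.0]])
Q, bound = pca_relaxation(X, K=2)

# For K = 2, assign points by the sign of their entry in v_1.
# The overall sign of v_1 is arbitrary, so cluster ids 0 and 1 may be swapped.
labels = (Q[:, 0] > 0).astype(int)
print(labels, bound)
```

On this toy data the bound coincides with the objective of the best two-cluster partition, so the relaxation loses nothing; in general the bound is only a lower limit.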

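Page 8: K-means Clustering via Principal Component Analysis

PCA → K-means

Clustering by PCA

$$C = \frac{1}{n}\,\mathbf{e}\mathbf{e}^T + \sum_{k=1}^{K-1} \mathbf{v}_k \mathbf{v}_k^T \approx \sum_{k=1}^{K} \mathbf{q}_k \mathbf{q}_k^T = \sum_{k=1}^{K} \mathbf{h}_k \mathbf{h}_k^T$$

Probability of connectivity between i and j

$$p_{ij} = \frac{c_{ij}}{c_{ii}^{1/2}\, c_{jj}^{1/2}}$$

$$p_{ij} = \begin{cases} 1, & \text{if } p_{ij} \ge \beta \\ 0, & \text{if } p_{ij} < \beta \end{cases} \qquad 0 < \beta < 1, \text{ usually } \beta = 0.5$$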
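Cluster structure can be read off the principal components by building a connectivity matrix, normalizing it by its diagonal and thresholding. The sketch below assumes NumPy and assumes that the connectivity matrix is C = e e^T / n plus the sum of v_k v_k^T over the first K-1 principal components, with threshold 0.5; the function name `connectivity_from_pca`, the use of "greater than or equal" at the threshold, and the toy data are assumptions made for illustration.

```python
import numpy as np

def connectivity_from_pca(X, K, beta=0.5):
    """Thresholded connectivity matrix built from the first K-1 principal components.

    X has shape (d, n), one data point per column.
    Entry (i, j) of the result is 1 when points i and j are judged to share a cluster.
    """
    n = X.shape[1]
    Y = X - X.mean(axis=1, keepdims=True)
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    V = Vt[:K - 1].T                              # v_1, ..., v_{K-1}
    C = np.ones((n, n)) / n + V @ V.T             # assumed form of C
    scale = np.sqrt(np.diag(C))                   # c_ii^{1/2}; diag(C) >= 1/n > 0
    P = C / np.outer(scale, scale)                # p_ij = c_ij / (c_ii c_jj)^{1/2}
    return (P >= beta).astype(int)

X = np.array([[0.0, 0.0, 10.0, 10.0],
              [0.0, 2.0, 0.0, 2.0]])
print(connectivity_from_pca(X, K=2))
```

For this toy data the result is block diagonal, with one block of ones per pair of nearby points, which is the pattern the indicator sum h_k h_k^T would produce for the correct partition.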

Page 17: K-means Clustering via Principal Component Analysis

Eigenvalues

• Case 1: 164030, 58, 5

• Case 2: 212920, 1892, 157

Page 20: K-means Clustering via Principal Component Analysis

Thank you for your attention

