
The Stability of a Good Clustering

Page 1: The Stability of a Good Clustering

The Stability of a Good Clustering

Marina Meila, University of Washington

[email protected]

Page 2: The Stability of a Good Clustering

[Slide diagram: Data → Objective → Algorithm; similarities as the data, spectral clustering and K-means as the algorithms]

Optimizing these criteria is NP-hard (the worst case)...

...but "spectral clustering, K-means work well when a good clustering exists" (the interesting case)

This talk:
- If a "good" clustering exists, it is "unique"
- If a "good" clustering is found, it is provably good

Page 3: The Stability of a Good Clustering

Results summary

Given:
- objective = NCut or the K-means distortion
- data and a clustering Y with K clusters

Results:
- a spectral lower bound on the distortion
- if the gap between the distortion of Y and the lower bound is small, then d(Y, Y_opt) is small, where Y_opt = the best clustering with K clusters

Page 4: The Stability of a Good Clustering

A graphical view

[Figure: the distortion over the space of clusterings, with the lower bound drawn beneath]

Page 5: The Stability of a Good Clustering

Overview

- Introduction
- Matrix representations for clusterings
  - Quadratic representation for the clustering cost
  - The misclassification error distance
- Results for NCut (easier)
- Results for the K-means distortion (harder)
- Discussion

Page 6: The Stability of a Good Clustering

Clusterings as matrices

A clustering of {1, 2, ..., n} with K clusters (C_1, C_2, ..., C_K) is represented by an n x K matrix:

- unnormalized: X_{ik} = 1 if i ∈ C_k, and 0 otherwise
- normalized: column k divided by √|C_k|, so every column has unit norm

All these matrices have orthogonal columns.
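As an illustration not found on the slides, here is a minimal NumPy sketch of both representations; the function name indicator_matrices is ours:

```python
import numpy as np

def indicator_matrices(labels, K):
    """Build the n x K clustering matrices for integer labels in {0, ..., K-1}."""
    n = len(labels)
    X = np.zeros((n, K))
    X[np.arange(n), labels] = 1.0   # unnormalized: X[i, k] = 1 iff i is in C_k
    sizes = X.sum(axis=0)           # cluster sizes |C_k|
    X_norm = X / np.sqrt(sizes)     # normalized: column k scaled by 1/sqrt(|C_k|)
    return X, X_norm

# Disjoint clusters make the columns orthogonal; normalized columns are orthonormal.
X, X_norm = indicator_matrices(np.array([0, 0, 1, 2, 2]), K=3)
assert np.allclose(X_norm.T @ X_norm, np.eye(3))
```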

Page 7: The Stability of a Good Clustering

Distortion is quadratic in X

Both costs can be written through the quadratic form trace(X^T A X):

- NCut: A holds the similarities, normalized by the degrees
- K-means: A is the Gram matrix of the data
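The slide's exact formulas were lost in transcription; the sketch below uses the standard identities NCut(C) = K - trace(Z^T D^{-1/2} A D^{-1/2} Z), with Z the volume-normalized indicators, and (kernel) K-means distortion = trace(G) - trace(X^T G X). Treat it as our reconstruction, not the slide's notation:

```python
import numpy as np

def ncut(A, labels, K):
    """NCut(C) = K - trace(Z^T Ahat Z) with Ahat = D^-1/2 A D^-1/2 and
    Z[:, k] proportional to sqrt(d_i) on C_k, scaled to unit length."""
    d = A.sum(axis=1)
    Z = np.zeros((len(labels), K))
    for k in range(K):
        mask = labels == k
        Z[mask, k] = np.sqrt(d[mask] / d[mask].sum())
    Ahat = A / np.sqrt(np.outer(d, d))
    return K - np.trace(Z.T @ Ahat @ Z)

def kmeans_distortion(points, X_norm):
    """Distortion = trace(G) - trace(X^T G X), with G the Gram matrix of the
    data and X_norm the size-normalized indicator matrix."""
    G = points @ points.T
    return np.trace(G) - np.trace(X_norm.T @ G @ X_norm)
```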

Page 8: The Stability of a Good Clustering

The confusion matrix

Two clusterings: (C_1, C_2, ..., C_K) with clustering matrix X, and (C'_1, C'_2, ..., C'_{K'}) with clustering matrix X'.

The confusion matrix M (K x K') has entries m_{kk'} = |C_k ∩ C'_{k'}|; for the unnormalized clustering matrices, M = X^T X'.
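A short sketch (names ours) of the confusion matrix; the identity M = X^T X' follows directly from the unnormalized indicators:

```python
import numpy as np

def confusion_matrix(labels, labels_p, K, K_p):
    """m[k, k'] = |C_k intersect C'_k'|; equivalently M = X^T X'."""
    M = np.zeros((K, K_p), dtype=int)
    for k, kp in zip(labels, labels_p):
        M[k, kp] += 1
    return M
```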

Page 9: The Stability of a Good Clustering

The misclassification error distance

d(C, C') = 1 - (1/n) max_π Σ_k m_{k, π(k)}

where the maximum runs over one-to-one matchings π between clusters k and clusters k'. It is computed from the confusion matrix by the maximal bipartite matching algorithm between clusters, and equals the classification error under the best matching.
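The matching can be computed with the Hungarian algorithm; a minimal sketch using SciPy's linear_sum_assignment (our choice of implementation, assuming K = K' for simplicity):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def misclassification_error(labels, labels_p, K):
    """d(C, C') = 1 - (1/n) * max over matchings of sum_k m[k, match(k)]."""
    n = len(labels)
    M = np.zeros((K, K))
    for k, kp in zip(labels, labels_p):
        M[k, kp] += 1
    rows, cols = linear_sum_assignment(-M)  # negate: the solver minimizes
    return 1.0 - M[rows, cols].sum() / n
```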

Page 10: The Stability of a Good Clustering

Results for NCut

Given: data A (an n x n similarity matrix) and a clustering X (n x K).

Lower bound for NCut (M02, YS03, BJ03):

NCut(X) ≥ K - (λ_1 + λ_2 + ... + λ_K)

with λ_1 ≥ λ_2 ≥ ... the largest eigenvalues of (the normalized) A.

Upper bound on the distance to the best clustering (MSX'05), holding whenever the gap NCut(X) - lower bound is small enough relative to the eigengap.
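A sketch of the lower bound, assuming (as in the spectral clustering literature) that the eigenvalues are those of the degree-normalized similarity D^{-1/2} A D^{-1/2}:

```python
import numpy as np

def ncut_lower_bound(A, K):
    """K minus the sum of the K largest eigenvalues of D^-1/2 A D^-1/2:
    a lower bound on NCut over all clusterings with K clusters."""
    d = A.sum(axis=1)
    Ahat = A / np.sqrt(np.outer(d, d))
    w = np.linalg.eigvalsh(Ahat)  # ascending order
    return K - w[-K:].sum()
```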

Page 11: The Stability of a Good Clustering

Proof idea (convexity)

Relaxed minimization for NCut: optimize over all n x K orthogonal matrices X. Solution: X* = the K principal eigenvectors of A.

- If the gap is small w.r.t. the eigengap λ_K - λ_{K+1}, then X is close to X*.
- Two clusterings X, X' close to X* implies trace(X^T X') is large.
- trace(X^T X') large implies d(X, X') is small.
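The relaxation drops the requirement that X come from a clustering and keeps only orthonormal columns; a sketch of its closed-form solution (function name ours):

```python
import numpy as np

def relaxed_optimum(Ahat, K):
    """Maximize trace(X^T Ahat X) over n x K matrices with orthonormal
    columns: the maximizer is the K principal eigenvectors of Ahat."""
    w, V = np.linalg.eigh(Ahat)  # eigenvalues in ascending order
    return V[:, -K:]             # eigenvectors of the K largest eigenvalues
```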

Page 12: The Stability of a Good Clustering

Why the eigengap matters

Example: A has 3 diagonal blocks and K = 2. Then gap(C) = gap(C') = 0 for two different clusterings C and C' (each merges a different pair of blocks), but C and C' are not close.
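A sketch of such a similarity matrix (block sizes are our choice): with three perfect blocks the normalized similarity has eigenvalue 1 with multiplicity 3, so for K = 2 the eigengap λ_2 - λ_3 is zero, and both merge-two-blocks clusterings attain zero gap while being far from each other:

```python
import numpy as np
from scipy.linalg import block_diag

B = np.ones((10, 10))
A = block_diag(B, B, B)  # 3 perfect diagonal blocks

d = A.sum(axis=1)
Ahat = A / np.sqrt(np.outer(d, d))
w = np.sort(np.linalg.eigvalsh(Ahat))[::-1]
print(w[:4])  # [1, 1, 1, 0]: for K = 2 the eigengap lambda_2 - lambda_3 is 0
```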

Page 13: The Stability of a Good Clustering

Remarks on stability results

- No explicit conditions on S.
- Different flavor from other stability results (e.g. Kannan et al. '00, Ng et al. '01), which assume S is "almost" block diagonal.
- But... the results apply only if a good clustering is found; there are S matrices for which no clustering satisfies the theorem.
- The bound depends on aggregate quantities such as K and the cluster sizes (= probabilities).
- Points are weighted by their volumes (degrees): good in some applications, and bounds for unweighted distances can be obtained.

Page 14: The Stability of a Good Clustering

Is the bound ever informative? An experiment: S perfect + additive noise
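A hedged sketch of such an experiment; the noise model and its magnitude are our assumptions for illustration, not the slide's exact setup:

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(0)
S_perfect = block_diag(*[np.ones((20, 20))] * 4)  # 4 perfect clusters
noise = rng.uniform(0.0, 0.3, size=S_perfect.shape)
S = S_perfect + (noise + noise.T) / 2             # symmetric additive noise
# Cluster S, then compare the achieved misclassification error to the bound.
```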

Page 15: The Stability of a Good Clustering

K-means distortion

We can do the same...

...but the K-th principal subspace is typically not stable.

[Figure: K = 4, dim = 30]

Page 16: The Stability of a Good Clustering

New approach: use K-1 vectors

- Non-redundant representation Y
- A new expression for the distortion
- ...and a new (relaxed) optimization problem

Page 17: The Stability of a Good Clustering

Solution of the new problem

Relaxed optimization problem: given A, optimize over the relaxed representations Y.

Solution, built from:
- U = the K-1 principal eigenvectors of A
- W = a K x K orthogonal matrix with a prescribed first row

Page 18: The Stability of a Good Clustering

Proof outline (as for NCut)

- Solve the relaxed minimization: if the gap is small, Y is close to Y*.
- Clusterings Y, Y' close to Y* implies ||Y^T Y'||_F is large.
- ||Y^T Y'||_F large implies d(Y, Y') is small.

Page 19: The Stability of a Good Clustering

Theorem

For any two clusterings Y, Y' with positive gaps, d(Y, Y') is bounded whenever an eigengap condition holds [formulas lost in transcription].

Corollary: a bound for d(Y, Y_opt).

Page 20: The Stability of a Good Clustering

Experiments

[Plot: true error and bound vs. p_min; K = 4, dim = 30, 20 replicates]

Page 21: The Stability of a Good Clustering
Page 22: The Stability of a Good Clustering


Page 23: The Stability of a Good Clustering

Conclusions

First (?) distribution-independent bounds on the clustering error:
- data dependent
- hold when the data is well clustered (this is the case of interest)
- tight? Not yet...

In addition:
- An improved variational bound for the K-means cost
- A local equivalence between the "misclassification error" distance and the "Frobenius norm distance" (also known as the χ² distance)

Related work:
- Bounds for mixtures of Gaussians (Dasgupta, Vempala)
- Nearest K-flat to n points (Tseng)
- Variational bounds for sparse PCA (Moghaddam)

