
1 Kunstmatige Intelligentie / RuG KI2 - 7 Clustering Algorithms Johan Everts.

Date post: 21-Dec-2015

Kunstmatige Intelligentie / RuG

KI2 - 7

Clustering Algorithms

Johan Everts


What is Clustering?

Find K clusters (or a classification that consists of K clusters) so that the objects of one cluster are similar to each other whereas objects of different clusters are dissimilar. (Bacher 1996)


The Goals of Clustering

Determine the intrinsic grouping in a set of unlabeled data.

What constitutes a good clustering? All clustering algorithms will produce clusters, regardless of whether the data contains them.

There is no gold standard; what counts as good depends on the goal: data reduction, “natural” clusters, “useful” clusters, or outlier detection.


Stages in clustering


Taxonomy of Clustering Approaches


Hierarchical Clustering

Agglomerative clustering treats each data point as a singleton cluster, and then successively merges clusters until all points have been merged into a single remaining cluster. Divisive clustering works the other way around: it starts with all points in one cluster and successively splits clusters.


Agglomerative Clustering: Single Link

In single-link hierarchical clustering, we merge in each step the two clusters whose two closest members have the smallest distance.


Agglomerative Clustering: Complete Link

In complete-link hierarchical clustering, we merge in each step the two clusters whose merger has the smallest diameter.
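
The two linkage criteria differ only in how a candidate merge is scored. The following is a minimal Python sketch of both, assuming points are tuples compared with Euclidean distance (the slides do not fix a metric); the function names and the toy data are illustrative, not taken from the slides.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def single_link(a, b):
    """Distance between the two closest members of clusters a and b."""
    return min(dist(p, q) for p in a for q in b)


def merged_diameter(a, b):
    """Diameter of the merged cluster: its largest pairwise distance."""
    return max(dist(p, q) for p, q in combinations(a + b, 2))


def best_pair(clusters, criterion):
    """Indices (i, j) of the pair of clusters that minimises the criterion."""
    return min(
        combinations(range(len(clusters)), 2),
        key=lambda ij: criterion(clusters[ij[0]], clusters[ij[1]]),
    )


# Hypothetical 1-D data: a chain of three points and two isolated points.
clusters = [[(0.0,), (2.0,), (4.0,)], [(5.5,)], [(10.0,)]]
print("single link merges:  ", best_pair(clusters, single_link))
print("complete link merges:", best_pair(clusters, merged_diameter))
```

On this toy data the two criteria pick different merges: single link attaches the nearby point to the long chain, while complete link prefers the merge that keeps the resulting cluster compact.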


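Example – Single Link AC

Distance matrix for six items. The smallest off-diagonal entry is 138 (MI and TO), so these two are merged first.

     BA   FI   MI   NA   RM   TO
BA    0  662  877  255  412  996
FI  662    0  295  468  268  400
MI  877  295    0  754  564  138
NA  255  468  754    0  219  869
RM  412  268  564  219    0  669
TO  996  400  138  869  669    0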

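Example – Single Link AC

After merging MI and TO (distance 138). The single-link distance from MI/TO to any other item is the smaller of that item's distances to MI and to TO.

        BA   FI  MI/TO   NA   RM
BA       0  662    877  255  412
FI     662    0    295  468  268
MI/TO  877  295      0  754  564
NA     255  468    754    0  219
RM     412  268    564  219    0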

Example – Single Link AC

After merging NA and RM (distance 219).

        BA   FI  MI/TO  NA/RM
BA       0  662    877    255
FI     662    0    295    268
MI/TO  877  295      0    564
NA/RM  255  268    564      0

Example – Single Link AC

After merging BA with NA/RM (distance 255).

          BA/NA/RM   FI  MI/TO
BA/NA/RM         0  268    564
FI             268    0    295
MI/TO          564  295      0

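Example – Single Link AC

After merging FI with BA/NA/RM (distance 268). The two remaining clusters are joined last, at distance 295.

             BA/FI/NA/RM  MI/TO
BA/FI/NA/RM            0    295
MI/TO                295      0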
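
The merge sequence in this example can be reproduced with a short Python sketch. It uses the distance matrix from the first table as given; the function names and the choice to record each merge as a (label, distance) pair are illustrative assumptions, and the quadratic search is written for clarity rather than efficiency.

```python
from itertools import combinations

LABELS = ["BA", "FI", "MI", "NA", "RM", "TO"]
ROWS = [
    [0, 662, 877, 255, 412, 996],
    [662, 0, 295, 468, 268, 400],
    [877, 295, 0, 754, 564, 138],
    [255, 468, 754, 0, 219, 869],
    [412, 268, 564, 219, 0, 669],
    [996, 400, 138, 869, 669, 0],
]
# Look-up table: DIST["BA", "FI"] == 662, and so on.
DIST = {
    (a, b): ROWS[i][j]
    for i, a in enumerate(LABELS)
    for j, b in enumerate(LABELS)
}


def single_link(c1, c2):
    """Smallest distance between a member of c1 and a member of c2."""
    return min(DIST[a, b] for a in c1 for b in c2)


def agglomerate(labels):
    """Merge the closest pair of clusters until one cluster remains."""
    clusters = [frozenset([label]) for label in labels]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: single_link(*p))
        merged = c1 | c2
        merges.append(("/".join(sorted(merged)), single_link(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
    return merges


merges = agglomerate(LABELS)
for name, distance in merges:
    print(f"{distance:>4}  {name}")
```

The recorded distances are the heights at which the branches of the corresponding dendrogram join.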

Square error
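
A common textbook form of the squared-error criterion for a partition into K clusters is sketched below. The notation is an assumption, not necessarily the slide's own: C_k is the k-th cluster, mu_k its centroid, and x_i a pattern.

```latex
E = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^{2},
\qquad
\mu_k = \frac{1}{\lvert C_k \rvert} \sum_{x_i \in C_k} x_i
```

A partition with small E has patterns lying close to the centroid of their own cluster.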


K-Means

Step 0: Start with a random partition into K clusters and compute the center of each cluster.

Step 1: Generate a new partition by assigning each pattern to its closest cluster center.

Step 2: Compute new cluster centers as the centroids of the clusters.

Step 3: Repeat steps 1 and 2 until the membership no longer changes (the cluster centers then also remain the same).
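
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: patterns are tuples, distance is Euclidean, an empty cluster simply keeps its previous center, and the names `k_means`, `centroid`, `seed` and `max_iter` are illustrative choices not found in the slides.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, seed=0, max_iter=100):
    """Return (labels, centers); requires len(points) >= k."""
    rng = random.Random(seed)
    # Step 0: random partition into k clusters (each cluster gets a member).
    labels = [i % k for i in range(len(points))]
    rng.shuffle(labels)
    centers = [
        centroid([p for p, lab in zip(points, labels) if lab == c])
        for c in range(k)
    ]
    for _ in range(max_iter):
        # Step 1: assign each pattern to its closest cluster center.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centers[c])) for p in points
        ]
        # Step 3: stop once the membership no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 2: recompute each center as the centroid of its cluster.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # assumption: an empty cluster keeps its old center
                centers[c] = centroid(members)
    return labels, centers


# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = k_means(points, k=2)
print(labels)
print(centers)
```

Because the result depends on the random starting partition, it is common to run the procedure several times with different seeds and keep the partition with the lowest squared error.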

K-Means – How many clusters (K)?

Locating the ‘knee’

The knee of a curve is defined as the point of maximum curvature.
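
One way to apply this definition to a curve of clustering error against K is to estimate the curvature with finite differences. The sketch below assumes equally spaced values of K and rescales both axes to [0, 1] first, since curvature depends on the units of the axes; the function name `knee` and the error values are hypothetical.

```python
def knee(ks, errors):
    """Return the k at which the normalised error curve bends most sharply."""
    # Rescale both axes to [0, 1] so the result does not depend on units.
    xs = [(k - ks[0]) / (ks[-1] - ks[0]) for k in ks]
    lo, hi = min(errors), max(errors)
    ys = [(e - lo) / (hi - lo) for e in errors]

    best_k, best_curvature = None, -1.0
    for i in range(1, len(ks) - 1):
        h = xs[i + 1] - xs[i]  # assumes equally spaced k values
        d1 = (ys[i + 1] - ys[i - 1]) / (2 * h)              # first derivative
        d2 = (ys[i + 1] - 2 * ys[i] + ys[i - 1]) / (h * h)  # second derivative
        curvature = abs(d2) / (1 + d1 * d1) ** 1.5
        if curvature > best_curvature:
            best_k, best_curvature = ks[i], curvature
    return best_k


# Hypothetical squared-error values for K = 1..6.
ks = [1, 2, 3, 4, 5, 6]
errors = [100.0, 40.0, 15.0, 12.0, 10.0, 9.0]
print(knee(ks, errors))
```

Finite differences are sensitive to noise, so on real error curves some smoothing before the curvature estimate may be needed.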


Leader - Follower

Online: instances are processed one at a time. Specify a threshold distance.

For each instance, find the closest cluster center.
Distance above threshold (Distance > Threshold)? Create a new cluster.
Or else (Distance < Threshold), add the instance to the cluster and update the cluster center.
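
A minimal Python sketch of this procedure follows. The slides do not say how the cluster center is updated; the running mean used here is one possible choice and is an assumption, as are the function name `leader_follower` and the sample data.

```python
from math import dist  # Euclidean distance, Python 3.8+


def leader_follower(stream, threshold):
    """Cluster instances one at a time; returns (centers, labels)."""
    centers, counts, labels = [], [], []
    for x in stream:
        # Find the closest cluster center (None while no cluster exists yet).
        j = min(
            range(len(centers)),
            key=lambda c: dist(x, centers[c]),
            default=None,
        )
        if j is None or dist(x, centers[j]) > threshold:
            # Distance above threshold: create a new cluster at x.
            centers.append(tuple(x))
            counts.append(1)
            labels.append(len(centers) - 1)
        else:
            # Otherwise add x to cluster j and update its center.
            # Assumption: the center is the running mean of its members.
            counts[j] += 1
            centers[j] = tuple(
                m + (xi - m) / counts[j] for m, xi in zip(centers[j], x)
            )
            labels.append(j)
    return centers, labels


# Hypothetical stream of 2-D instances.
stream = [(0, 0), (1, 0), (10, 0), (11, 0), (0, 1)]
centers, labels = leader_follower(stream, threshold=3.0)
print(labels)
print(centers)
```

Each instance is seen only once, so the result depends on both the threshold and the order in which the instances arrive.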


Kohonen SOMs

The Self-Organizing Map (SOM) is an unsupervised artificial neural network algorithm. It is a compromise between biological modeling and statistical data processing.

Each neuron's weight vector is representative of a certain input. Input patterns are shown to all neurons simultaneously. Competitive learning: the neuron with the largest response (the Best Matching Unit) is chosen.

Initialize weights
Repeat until convergence:
    Select next input pattern
    Find Best Matching Unit
    Update weights of winner and neighbours
    Decrease learning rate & neighbourhood size

Learning rate & neighbourhood size
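
The training loop above can be sketched in Python as follows. Several details are assumptions, not taken from the slides: the map is a 1-D chain of units, not a 2-D grid; the Best Matching Unit is the unit whose weight vector is closest to the input; the learning rate and neighbourhood width decay linearly; the neighbourhood is Gaussian; and a fixed number of epochs stands in for "until convergence". All names and parameter values are illustrative.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to input x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def train_som(data, n_units=5, epochs=50, lr0=0.5, sigma0=2.0, seed=0):
    """Train a 1-D chain of units; returns one weight vector per unit."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialize weights at random in [0, 1).
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # select next input pattern
            progress = step / total_steps
            lr = lr0 * (1.0 - progress)                  # decreasing learning rate
            sigma = max(sigma0 * (1.0 - progress), 0.1)  # shrinking neighbourhood
            bmu = best_matching_unit(weights, x)
            for i, w in enumerate(weights):
                # Gaussian neighbourhood over the grid distance |i - bmu|:
                # units near the winner on the map learn more.
                h = math.exp(-((i - bmu) ** 2) / (2 * sigma * sigma))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
            step += 1
    return weights


# Hypothetical RGB colours as inputs, mapped onto a chain of five units.
colours = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (0, 0, 0), (1, 1, 1)]
som = train_som(colours)
for colour in colours:
    print(colour, "->", best_matching_unit(som, colour))
```

The factor `h` is what makes the learning distance related: the further a unit lies from the winner on the map, the less its weights move.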

Distance related learning

Some nice illustrations

Kohonen SOM Demo (from ai-junkie.com): mapping a 3D colorspace on a 2D Kohonen map


Performance Analysis

K-Means
    Depends a lot on a priori knowledge (K)
    Very stable

Leader-Follower
    Depends a lot on a priori knowledge (Threshold)
    Faster but unstable

Self-Organizing Map
    Stability and convergence assured
    Principle of self-ordering
    Slow; many iterations needed for convergence
    Computationally intensive


Conclusion

No Free Lunch theorem
    Any elevated performance over one class is exactly paid for in performance over another class.

Ensemble clustering?
    Use SOM and basic Leader-Follower to identify clusters, and then use k-means clustering to refine.
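
A minimal Python sketch of the second half of this idea: a single Leader-Follower pass proposes the cluster centers (and thereby K), and k-means iterations then refine them. The SOM stage is left out for brevity, and the running-mean update, the function names, the threshold and the data are all illustrative assumptions.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def leader_follower_centers(stream, threshold):
    """One online pass; returns rough cluster centers (running means)."""
    centers, counts = [], []
    for x in stream:
        j = min(
            range(len(centers)),
            key=lambda c: dist(x, centers[c]),
            default=None,
        )
        if j is None or dist(x, centers[j]) > threshold:
            centers.append(tuple(x))  # too far from every center: new cluster
            counts.append(1)
        else:
            counts[j] += 1
            centers[j] = tuple(
                m + (xi - m) / counts[j] for m, xi in zip(centers[j], x)
            )
    return centers


def refine_with_k_means(points, initial_centers, max_iter=100):
    """K-means iterations started from given centers, not a random partition."""
    centers = list(initial_centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # membership unchanged: converged
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = centroid(members)
    return labels, centers


# Hypothetical data: two groups of three points each.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
rough = leader_follower_centers(points, threshold=5.0)
labels, centers = refine_with_k_means(points, rough)
print(len(rough), "clusters proposed")
print(labels)
```

In this arrangement the threshold replaces K as the a priori choice, and k-means removes some of the order dependence of the online pass.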


Any Questions ?
