
Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering

EE6540 Final Project
Devin Cornell & Sushruth Sastry

Outline

● problem statement
● background
● experiments
● results
● conclusions

Problem Statement

● basic clustering
● classification through distribution modeling

Figures from [1]

Background: GMM

● equations from [2]
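The slide points to the GMM equations in [2]; for quick reference, the standard mixture density (written in our own notation, not a reprint of the slide) is:

```latex
% GMM density with K components: mixing weights \pi_k (nonnegative,
% summing to 1), component means \mu_k and covariances \Sigma_k.
p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k \,
    \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)
```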

Background: EM

● equations, algorithm from [3]
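The slide defers to the equations and algorithm in [3]; as a reminder of the standard form (again our notation, not the slide's), each EM iteration for a GMM alternates an E-step that computes membership weights (responsibilities) and an M-step that re-estimates the parameters:

```latex
% E-step: responsibility of component k for point x_n
\gamma_{nk} = \frac{\pi_k \, \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}
                   {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j)}

% M-step: parameter updates, with N_k = \sum_{n} \gamma_{nk}
\boldsymbol{\mu}_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_{nk} \, \mathbf{x}_n
\qquad
\boldsymbol{\Sigma}_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_{nk}
    (\mathbf{x}_n - \boldsymbol{\mu}_k)(\mathbf{x}_n - \boldsymbol{\mu}_k)^{\top}
\qquad
\pi_k = \frac{N_k}{N}
```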

Background: EM-GMM

● Algorithm 2 reprint from [4]
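The project's demo code was written in Matlab and is not reproduced here; the following is a minimal NumPy sketch of the same EM-GMM iteration. The initialization, the 1e-6 covariance regularization, and the convergence tolerance are our own assumptions, not taken from [4].

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iter=100, tol=1e-6, seed=0):
    """Fit a K-component GMM to X (N x D) with EM; returns (weights, means, covariances)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    # Initialize: uniform weights, means at random data points, covariances at the data covariance.
    pi = np.full(K, 1.0 / K)
    mu = X[rng.choice(N, K, replace=False)]
    sigma = np.array([np.cov(X.T) + 1e-6 * np.eye(D) for _ in range(K)])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibilities gamma[n, k] proportional to pi_k * N(x_n | mu_k, sigma_k)
        dens = np.column_stack([
            pi[k] * multivariate_normal.pdf(X, mu[k], sigma[k]) for k in range(K)
        ])
        total = dens.sum(axis=1, keepdims=True)
        gamma = dens / total
        # M-step: re-estimate weights, means, and covariances from the responsibilities
        Nk = gamma.sum(axis=0)
        pi = Nk / N
        mu = (gamma.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            sigma[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(D)
        # Stop when the log-likelihood has converged
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return pi, mu, sigma
```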

Background: k-means

● special case of EM-GMM [2] with:
○ no per-component covariances (all K components share one fixed, spherical covariance)
○ fixed, equal priors for all K components
○ no membership weights; each point simply belongs to the class with the nearest mean

k-means Algorithm

● based on modifications to the algorithm described in [2]
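As a concrete illustration of the hard-assignment special case above, here is a minimal NumPy sketch of k-means; the random-point initialization and the means-stopped-moving convergence test are our own choices, not the project's Matlab code.

```python
import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    """Cluster X (N x D) into K groups; returns (means, labels)."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), K, replace=False)]  # initialize means at random data points
    for _ in range(n_iter):
        # Hard assignment: each point belongs to the class with the nearest mean
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # N x K squared distances
        labels = d2.argmin(axis=1)
        # Update: each mean becomes the centroid of its assigned points
        new_mu = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else mu[k]
            for k in range(K)
        ])
        if np.allclose(new_mu, mu):  # converged once the means stop moving
            break
        mu = new_mu
    return mu, labels
```

Note how the E-step's soft responsibilities collapse here into a hard argmin, which is exactly the "no membership weights" simplification listed above.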

Experiment 1: Separate GMM Data

Experiment 2: Intermixed GMM Data

Experiment 3: Concentric Gaussian with Large Covariance Differences

Experiment 4: Radial Poisson Distributions with Different Means
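The slides do not record the generating parameters for these datasets. As one hypothetical way to produce data in the spirit of Experiments 3 and 4 (concentric Gaussians with very different covariances, and ring-shaped clusters whose radii follow Poisson distributions with different means), the sketch below uses entirely illustrative numbers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Experiment 3 (illustrative): two concentric zero-mean Gaussians,
# one tight and one with a much larger covariance.
inner = rng.multivariate_normal([0, 0], 0.2 * np.eye(2), size=300)
outer = rng.multivariate_normal([0, 0], 9.0 * np.eye(2), size=300)
X3 = np.vstack([inner, outer])

# Experiment 4 (illustrative): radii drawn from Poisson distributions with
# different means, angles uniform, giving ring-shaped clusters.
def radial_poisson(lam, n):
    r = rng.poisson(lam, size=n).astype(float)
    theta = rng.uniform(0, 2 * np.pi, size=n)
    return np.column_stack([r * np.cos(theta), r * np.sin(theta)])

X4 = np.vstack([radial_poisson(3, 300), radial_poisson(10, 300)])
```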

Demonstration

● see the Matlab demo

Experiment 1: Results

Experiment 2: Results

Experiment 3: Results

Experiment 4a: Results

Experiment 4b: Results

Experiment 4c: Results

Experiment 4d: Results

Results Summary

Conclusions

● EM-GMM is much slower than k-means
● EM-GMM was more accurate in all experiments performed here
● These algorithms can be made more flexible by running them with different values of K
● With a way to map “fitted distributions” to “generating distributions”, GMM can estimate arbitrary distributions with fewer fitted distributions

References

[1] A. W. Moore, “Clustering with Gaussian Mixtures,” School of Computer Science, Carnegie Mellon University. http://cs.cmu.edu/awm

[2] K. P. Murphy, Machine Learning: A Probabilistic Perspective. MIT Press, 2012.

[3] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” Journal of the Royal Statistical Society, Series B (Methodological), pp. 1–38, 1977.

[4] D. Barber, Bayesian Reasoning and Machine Learning. Cambridge University Press, 2012.