Post on 14-Jul-2020
transcript
K-Means Clustering 3/3/17
Unsupervised Learning
• We have a collection of unlabeled data points.
• We want to find underlying structure in the data.
Examples:
• Identify groups of similar data points. (Clustering)
• Find a better basis to represent the data. (Principal component analysis)
• Compress the data to a shorter representation. (Auto-encoders)
Unsupervised Learning
• We have a collection of unlabeled data points.
• We want to find underlying structure in the data.
Applications:
• Generating the input representation for another AI or ML algorithm.
  • Clusters could lead to states in a state space search or MDP model.
  • A new basis could be the input to a classification or regression algorithm.
• Making data easier to understand, by identifying what's important and/or discarding what isn't.
The Goal of Clustering
Given a bunch of data, we want to come up with a representation that will simplify future reasoning.
Key idea: group similar points into clusters.
Examples:
• Identifying objects in sensor data
• Detecting communities in social networks
• Constructing phylogenetic trees of species
• Making recommendations from similar users
EM Algorithm
E step: "expectation" … terrible name
• Classify the data using the current model.
M step: "maximization" … slightly less terrible name
• Generate the best model using the current classification of the data.
Initialize the model, then alternate E and M steps until convergence.
Note: The EM algorithm has many variations, including some that have nothing to do with clustering.
K-Means Algorithm
Model: k clusters, each represented by a centroid.
E step:
• Assign each point to the closest centroid.
M step:
• Move each centroid to the mean of the points assigned to it.
Convergence: we ran an E step in which no point's assignment changed.
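The E/M loop above fits in a few lines. This is an illustrative NumPy sketch (all names are my own; it initializes by picking random data points as centroids), not the lecture's code:

```python
import numpy as np

def kmeans(points, k, max_iters=100, seed=0):
    """Plain k-means: alternate E and M steps until assignments stop changing."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as k random data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    assignments = None
    for _ in range(max_iters):
        # E step: assign each point to the closest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_assignments = dists.argmin(axis=1)
        if assignments is not None and np.array_equal(assignments, new_assignments):
            break  # convergence: an E step changed no assignments
        assignments = new_assignments
        # M step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = points[assignments == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids, assignments

# Two well-separated blobs should come back as two clusters.
pts = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                [5.0, 5.0], [5.1, 5.2], [4.9, 5.1]])
centroids, labels = kmeans(pts, k=2)
```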
K-Means Example
Initializing K-Means
Reasonable options:
1. Start with a random E step.
   • Randomly assign each point to a cluster in {1, 2, …, k}.
2. Start with a random M step.
   a) Pick random centroids within the maximum range of the data.
   b) Pick random data points to use as initial centroids.
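The three options might look like this on made-up data (the variable names and seeding are illustrative choices, not part of the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(20, 2))
k = 3

# Option 1: random E step — assign each point a random cluster in {0, ..., k-1}.
random_labels = rng.integers(0, k, size=len(points))

# Option 2a: random centroids within the range (bounding box) of the data.
lo, hi = points.min(axis=0), points.max(axis=0)
centroids_2a = rng.uniform(lo, hi, size=(k, points.shape[1]))

# Option 2b: random data points used as initial centroids.
centroids_2b = points[rng.choice(len(points), size=k, replace=False)]
```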
K-Means in Action
https://www.youtube.com/watch?v=BVFG7fd1H30
Another EM Example: GMMs
GMM: Gaussian mixture model
• A Gaussian distribution is a multivariate generalization of a normal distribution (the classic bell curve).
• A Gaussian mixture is a distribution made up of several independent Gaussians.
• If we model our data as a Gaussian mixture, we're saying that each data point was a random draw from one of several Gaussian distributions (but we may not know which).
EM for Gaussian Mixture Models
Model: data drawn from a mixture of k Gaussians
E step:
• Compute the (log) likelihood of the data: each point's probability of being drawn from each Gaussian.
M step:
• Update the mean and covariance of each Gaussian, weighted by how responsible that Gaussian was for each data point.
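The two steps can be sketched for a one-dimensional mixture (the lecture's setting is multivariate; 1-D keeps the updates readable). The function name and the quantile-based initialization are illustrative choices, not the lecture's:

```python
import numpy as np

def gmm_em_1d(x, k, iters=50):
    """EM for a 1-D Gaussian mixture: a minimal sketch of the E and M steps."""
    # Initialize means at spread-out quantiles so components start distinct.
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var())
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E step: each point's responsibility (posterior probability)
        # under each Gaussian.
        densities = (weights / np.sqrt(2 * np.pi * variances)
                     * np.exp(-(x[:, None] - means) ** 2 / (2 * variances)))
        resp = densities / densities.sum(axis=1, keepdims=True)
        # M step: update mean, variance, and weight of each Gaussian,
        # weighted by its responsibility for each point.
        nk = resp.sum(axis=0)
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-9
        weights = nk / len(x)
    return means, variances, weights

# Two tight groups near 0 and near 10 should yield means near 0 and 10.
x = np.concatenate([0.1 * np.arange(50) / 50,
                    10.0 + 0.1 * np.arange(50) / 50])
means, variances, weights = gmm_em_1d(x, k=2)
```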
How do we pick k?
There's no hard rule.
• Sometimes the application for which the clusters will be used dictates k.
• If k can be flexible, then we need to consider the tradeoffs:
  • Higher k will always decrease the error (increase the likelihood).
  • Lower k will always produce a simpler model.
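The "higher k always decreases the error" claim has a one-line justification: adding a centroid can only shrink each point's distance to its nearest centroid. A quick numeric check on made-up data, using nested centroid sets rather than full k-means runs:

```python
import numpy as np

def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

rng = np.random.default_rng(0)
points = rng.normal(size=(30, 2))

# Use the first k data points as centroids for k = 1..5; each added
# centroid can only shrink a point's nearest-centroid distance.
errors = [sse(points, points[:k]) for k in range(1, 6)]
```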
Hierarchical Clustering
• Organizes data points into a hierarchy.
• Every level of the binary tree splits the points into two subsets.
• Points in a subset should be more similar than points in different subsets.
• The resulting clustering can be represented by a dendrogram.
Direction of Clustering
Agglomerative (bottom-up)
• Each point starts in its own cluster.
• Repeatedly merge the two most-similar clusters until only one remains.
Divisive (top-down)
• All points start in a single cluster.
• Repeatedly split the data into the two most self-similar subsets.
Either version can stop early if a specific number of clusters is desired.
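The agglomerative loop, with the early stop at a desired cluster count, can be sketched as follows. This toy version measures cluster similarity by single linkage (distance between the closest pair of members), one of several reasonable choices not specified on the slide:

```python
import numpy as np

def agglomerative(points, target_k):
    """Bottom-up clustering: start with singleton clusters, repeatedly merge
    the two closest clusters, and stop when target_k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > target_k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(np.linalg.norm(points[i] - points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the two closest
        del clusters[b]
    return clusters

pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [9.0, 0.0]])
clusters = agglomerative(pts, target_k=3)
```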