
Local Classification Methods for Heterogeneous Classes

Julia Schiffner and Claus Weihs

Department of Statistics, Dortmund University of Technology
SFB 475 ‘Complexity Reduction in Multivariate Data Structures’

August 13, 2008

J. Schiffner and C. Weihs Local Classification Methods for Heterogeneous Classes

Outline

1 Introduction – Heterogeneous Classes

2 Three Classification Methods Based on Mixture Models

3 Local Fisher Discriminant Analysis – LFDA

4 Summary & Outlook


Introduction – Heterogeneous Classes

package klaR:

miscellaneous functions for classification and visualization

classification into K given classes $c_1, \dots, c_K$

underlying assumption for many classification methods: random feature x homogeneous within the classes and heterogeneous across the classes

problem: heterogeneous classes


[Figure: scatter plot of the observations, each plotted as its class label 1 or 2]

way out: local methods

classification methods based on mixture models, e.g. mixture discriminant analysis (MDA)

other prototype methods: K-means, learning vector quantization (LVQ)

k-nearest-neighbor classifier (kNN)

local likelihood methods: localized logistic regression, localized LDA (LLDA, in klaR)

local Fisher discriminant analysis (LFDA)

tree-based methods: CART, random forests


Mixture Models in Classification

marginal density:

$$f(x) = \sum_{k=1}^{K} p_k \, f(x \mid c_k)$$

model class conditional densities as mixtures

data are generated by J sources $s_j$

hierarchical mixture model (Titsias & Likas, 2002)

$$f(x) = \sum_{k=1}^{K} p_k \sum_{j=1}^{J} \pi_{jk} \, f(x \mid c_k, s_j)$$

in terms of the parameters $\theta$:

$$f(x \mid \theta) = \sum_{j=1}^{J} \pi_j \sum_{k=1}^{K} p_{kj} \, f(x \mid \mu_{kj}, \Sigma_{kj})$$

common components model (Titsias & Likas, 2001)

$$f(x) = \sum_{k=1}^{K} p_k \sum_{j=1}^{J} \pi_{jk} \, f(x \mid s_j)$$

in terms of the parameters $\theta$:

$$f(x \mid \theta) = \sum_{j=1}^{J} \pi_j \sum_{k=1}^{K} p_{kj} \, f(x \mid \mu_j, \Sigma_j) = \sum_{j=1}^{J} \pi_j \, f(x \mid \mu_j, \Sigma_j)$$
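The two ways of writing each model can be connected by a short derivation. The block below is a sketch that assumes the usual probabilistic reading of the symbols, which the slides do not state explicitly: $p_k = P(c_k)$, $\pi_{jk} = P(s_j \mid c_k)$, $\pi_j = P(s_j)$ and $p_{kj} = P(c_k \mid s_j)$. Under that reading it shows why the class-first and source-first forms agree, why the inner sum drops out in the common components model, and what the class posterior of the hierarchical model looks like by Bayes' rule.

```latex
% Assumed interpretation (not stated on the slides):
%   p_k = P(c_k),  \pi_{jk} = P(s_j \mid c_k),  \pi_j = P(s_j),  p_{kj} = P(c_k \mid s_j)
\begin{align*}
p_k \, \pi_{jk}
  &= P(c_k)\, P(s_j \mid c_k) = P(s_j)\, P(c_k \mid s_j) = \pi_j \, p_{kj}, \\[4pt]
\sum_{k=1}^{K} p_{kj} = 1
  &\;\Longrightarrow\;
  \sum_{j=1}^{J} \pi_j \sum_{k=1}^{K} p_{kj} \, f(x \mid \mu_j, \Sigma_j)
  = \sum_{j=1}^{J} \pi_j \, f(x \mid \mu_j, \Sigma_j), \\[4pt]
P(c_k \mid x)
  &= \frac{\sum_{j=1}^{J} \pi_j \, p_{kj} \, f(x \mid \mu_{kj}, \Sigma_{kj})}
          {\sum_{l=1}^{K} \sum_{j=1}^{J} \pi_j \, p_{lj} \, f(x \mid \mu_{lj}, \Sigma_{lj})}.
\end{align*}
```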


Hierarchical Mixture Classifier

class posterior estimation

step 1: estimate source posteriors assuming either a

simple mixture model (unsupervised, "hm1")

$$f(x \mid \varphi) = \sum_{j=1}^{J} \pi_j \, f(x \mid \mu_j, \Sigma_j)$$

EM algorithm ⇒ $P(s_j \mid x, \varphi)$

or a common components model (supervised, "hm2")

$$f(x \mid \varphi_k) = \sum_{j=1}^{J} \pi_{jk} \, f(x \mid \mu_j, \Sigma_j)$$

EM algorithm ⇒ $P(s_j \mid x, c(x), \varphi_{c(x)})$

step 2: ML estimation of $\pi_j$, $p_{kj}$, $\mu_{kj}$, and $\Sigma_{kj}$ depending on x and the source posteriors
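Step 1 in its unsupervised form ("hm1") can be illustrated with a small EM loop. This is a minimal sketch, not the klaR implementation: it assumes univariate Gaussian components, hand-picked starting values, and a variance floor of 1e-6 to avoid singular components. All function and variable names are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def e_step(xs, means, variances, weights):
    """Source posteriors P(s_j | x) for every observation x."""
    post = []
    for x in xs:
        joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        post.append([p / total for p in joint])
    return post

def em_source_posteriors(xs, means, variances, weights, n_iter=50):
    """EM for a simple J-component mixture; returns parameters and posteriors."""
    means, variances, weights = list(means), list(variances), list(weights)
    for _ in range(n_iter):
        post = e_step(xs, means, variances, weights)
        # M-step: ML updates weighted by the source posteriors
        for j in range(len(means)):
            nj = sum(p[j] for p in post)
            weights[j] = nj / len(xs)
            means[j] = sum(p[j] * x for p, x in zip(post, xs)) / nj
            var = sum(p[j] * (x - means[j]) ** 2 for p, x in zip(post, xs)) / nj
            variances[j] = max(var, 1e-6)  # assumed floor against singularities
    # final E-step so the posteriors match the returned parameters
    return means, variances, weights, e_step(xs, means, variances, weights)

# toy data from two well-separated sources (illustrative values only)
xs = [-2.1, -1.9, -2.0, -1.8, 2.0, 2.2, 1.9, 2.1]
means, variances, weights, post = em_source_posteriors(
    xs, means=[-1.0, 1.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
```

The rows of `post` are the quantities that step 2 would use as weights. A production version would work with log densities to avoid underflow for observations far from every component.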


Common Components Classifier

class posterior estimation

estimate πj , pkj , µj , and Σj by means of the EM algorithm

some details

initialization of the EM algorithm: repeated execution of kmeans, posterior deviance

number of sources J:

assumed to be known in advance

choice of J by means of a validation data set
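Once the parameters are estimated, the class posterior follows from the source posteriors. The sketch below assumes univariate Gaussian sources and reads $p_{kj}$ as $P(c_k \mid s_j)$, so that $P(c_k \mid x) = \sum_j p_{kj} \, P(s_j \mid x)$. The parameter values are made up for illustration and are not estimates from any data set.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def class_posteriors(x, pi, p, means, variances):
    """P(c_k | x) = sum_j p[k][j] * P(s_j | x) with sources shared by all classes."""
    joint = [pi[j] * normal_pdf(x, means[j], variances[j]) for j in range(len(pi))]
    total = sum(joint)
    source_post = [v / total for v in joint]
    return [sum(p_k[j] * source_post[j] for j in range(len(pi))) for p_k in p]

# hypothetical fitted parameters: J = 3 sources, K = 2 classes
pi = [0.3, 0.4, 0.3]            # source priors
means = [-3.0, 0.0, 3.0]        # source means
variances = [1.0, 1.0, 1.0]     # source variances
# p[k][j] = P(c_k | s_j); each column sums to one.
# The first class dominates the two outer sources, so it is heterogeneous.
p = [[0.9, 0.1, 0.9],
     [0.1, 0.9, 0.1]]

post_mid = class_posteriors(0.0, pi, p, means, variances)
post_outer = class_posteriors(3.0, pi, p, means, variances)
```

In this toy setting the second class receives the larger posterior near 0 and the first class receives it near 3 (and, by symmetry, near -3), which a single Gaussian per class could not reproduce.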


R Functions

hm.cc: generic function with methods for classes "data.frame", "matrix", and "formula"

hm.cc.start: initialization of the EM algorithm

arguments for hm.cc:

argument            explanation
formula, data       for class "formula"
x, grouping         required if no formula is given
J                   number of sources
method              "hm1", "hm2", "cc"
tries, iter, eps    for hm.cc.start and EM algorithm
threshold           for subclass pruning in "hm1" and "hm2"

predict-method for class "hm.cc"


Fisher Discriminant Analysis (FDA)

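supervised linear dimensionality reduction and classification

FDA transformation matrix:

$$T_{\mathrm{FDA}} = \arg\max_{T} \, \mathrm{tr}\left( (T' S_w T)^{-1} \, T' S_b T \right)$$

where $S_w$ and $S_b$ are the within-class and between-class scatter matrices

FDA projection: sample pairs in the same class are made close and sample pairs in different classes are separated from each other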

reduced dimension at most K − 1
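For K = 2 the reduced dimension is 1, and the maximizer of the FDA criterion is known in closed form: it is proportional to $S_w^{-1}(m_1 - m_2)$, with $m_1$ and $m_2$ the class means. The sketch below works through that special case for two-dimensional toy data. The helper names and the data are hypothetical, and the general case would need a generalized eigenvalue solver instead.

```python
def mean_vector(points):
    """Component-wise mean of a list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def within_scatter(classes):
    """S_w: sum over classes of the scatter around the respective class mean."""
    dim = len(classes[0][0])
    s = [[0.0] * dim for _ in range(dim)]
    for points in classes:
        m = mean_vector(points)
        for p in points:
            d = [p[a] - m[a] for a in range(dim)]
            for a in range(dim):
                for b in range(dim):
                    s[a][b] += d[a] * d[b]
    return s

def fda_direction_two_classes(class1, class2):
    """FDA direction for K = 2 and 2-D data: S_w^{-1} (m1 - m2), up to scaling."""
    s = within_scatter([class1, class2])
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    m1, m2 = mean_vector(class1), mean_vector(class2)
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

# two elongated, parallel classes (illustrative values only)
class1 = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (1.0, 1.0), (1.0, 3.0)]
class2 = [(2.0, 0.0), (3.0, 2.0), (4.0, 4.0), (3.0, 1.0), (3.0, 3.0)]
t = fda_direction_two_classes(class1, class2)

# one-dimensional projections T' x of both classes
proj1 = [t[0] * x + t[1] * y for x, y in class1]
proj2 = [t[0] * x + t[1] * y for x, y in class2]
```

The direction is tilted away from the line joining the class means because the within-class scatter is taken into account, and the projected classes do not overlap.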


Local FDA (LFDA) – Dimensionality Reduction

supervised linear dimensionality reduction (Sugiyama, 2007) into arbitrary dimensional spaces

heterogeneous classes: preserve the within-class local structure by introducing an affinity matrix A into the calculation of $S_w$ and $S_b$ ($A_{ij}$: affinity between $x_i$ and $x_j$) ⇒ downweight influence of far apart sample pairs in the same class

LFDA transformation matrix:

$$T_{\mathrm{LFDA}} = \arg\max_{T} \, \mathrm{tr}\left( (T' S_w^{A} T)^{-1} \, T' S_b^{A} T \right)$$

LFDA projection: only nearby sample pairs in the same class are made close and sample pairs in different classes are separated from each other
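How the affinity matrix enters the scatter matrices can be sketched with the pairwise formulation. The weights below follow my reading of Sugiyama (2007): same-class pairs get $A_{ij}/n_c$ in $S_w^{A}$ and $A_{ij}(1/n - 1/n_c)$ in $S_b^{A}$, while pairs from different classes get 0 and $1/n$. The Gaussian affinity with a fixed bandwidth is an assumption made for brevity and is not necessarily the affinity used in the talk. All names are hypothetical.

```python
import math

def gaussian_affinity(a, b, bandwidth=1.0):
    """Assumed affinity: decays with the squared distance between a and b."""
    sq_dist = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-sq_dist / bandwidth ** 2)

def local_scatter(xs, ys, affinity):
    """Affinity-weighted within-class and between-class scatter matrices."""
    n, dim = len(xs), len(xs[0])
    counts = {label: ys.count(label) for label in set(ys)}
    s_w = [[0.0] * dim for _ in range(dim)]
    s_b = [[0.0] * dim for _ in range(dim)]
    for i in range(n):
        for j in range(n):
            if ys[i] == ys[j]:
                n_c = counts[ys[i]]
                a = affinity(xs[i], xs[j])
                w_w = a / n_c
                w_b = a * (1.0 / n - 1.0 / n_c)
            else:
                w_w = 0.0
                w_b = 1.0 / n
            d = [xs[i][k] - xs[j][k] for k in range(dim)]
            for r in range(dim):
                for c in range(dim):
                    s_w[r][c] += 0.5 * w_w * d[r] * d[c]
                    s_b[r][c] += 0.5 * w_b * d[r] * d[c]
    return s_w, s_b

# toy data: two classes in the plane (illustrative values only)
xs = [(0.0, 0.0), (2.0, 0.0), (0.0, 4.0), (2.0, 4.0), (4.0, 4.0)]
ys = ["a", "a", "b", "b", "b"]

# constant affinity 1 recovers the ordinary FDA scatter matrices
s_w_plain, s_b_plain = local_scatter(xs, ys, lambda a, b: 1.0)
# a local affinity downweights far apart pairs within the same class
s_w_local, s_b_local = local_scatter(xs, ys, gaussian_affinity)
```

With all affinities equal to one, the result coincides with the ordinary $S_w$ and $S_b$, which is a convenient sanity check. With the local affinity, the within-class scatter shrinks because distant same-class pairs contribute little.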


LFDA – Classification

assumption: classes are composed of subclasses $c_{km}$

classification rule:

$$c(x) = \arg\min_{k} \min_{m} \left\| T_{\mathrm{LFDA}}' \, x - T_{\mathrm{LFDA}}' \, x_{km} \right\|$$

supervised case: subclasses are known

unsupervised case: subclasses are unknown

spectral clustering within the K classes

advantages: number of clusters is determined automatically, affinity matrix is used

two methods: eigenvalues, eigenvectors
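The classification rule amounts to a nearest-center search in the projected space. The sketch below assumes that $x_{km}$ denotes the center of subclass $c_{km}$ and that the transformation matrix is already available. The matrix `t` and the centers are made-up values, not the output of an LFDA fit.

```python
import math

def project(t, x):
    """Compute T' x, with T stored as a list of its columns."""
    return [sum(col[d] * x[d] for d in range(len(x))) for col in t]

def classify(x, t, centers):
    """Return the class whose nearest projected subclass center is closest to T' x.

    centers maps each class label k to the list of its subclass centers x_km.
    """
    z = project(t, x)
    best_label, best_dist = None, math.inf
    for label, subcenters in centers.items():
        for center in subcenters:
            zc = project(t, center)
            dist = math.sqrt(sum((u - v) ** 2 for u, v in zip(z, zc)))
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label

# hypothetical one-dimensional projection onto the first coordinate
t = [[1.0, 0.0]]
# class 1 consists of two subclasses, class 2 of a single one
centers = {1: [(-3.0, 0.0), (3.0, 0.0)],
           2: [(0.0, 0.0)]}

label_right = classify((2.5, 5.0), t, centers)
label_middle = classify((0.4, -7.0), t, centers)
label_left = classify((-2.0, 1.0), t, centers)
```

Observations on either side are assigned to class 1 through different subclass centers, while the one in the middle goes to class 2. The second coordinate is ignored because the assumed projection discards it.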


R Functions

lfda: generic function with methods for classes "data.frame", "matrix", and "formula"

arguments for lfda:

argument          explanation
formula, data     for class "formula"
x, grouping       required if no formula is given
subgrouping       subclass membership
dimension         desired dimensionality reduction
norm.method       method for normalizing the transformation matrix
aff.method        method for calculation of the affinity matrix
cluster.method    method for calculation of the subclass centers

predict-method for class "lfda"


Summary & Outlook

hierarchical mixture and common components classifiers

singularities in EM: variable selection, dimensionality reduction

automatic determination of the number of clusters

mixtures of other distributions

ML estimation of parameters: criteria better suited for classification

documentation of the fitting process (trace)

LFDA

metric for classification rule

kernel LFDA


References

I. Czogiel, K. Luebke, M. Zentgraf, and C. Weihs. Localized Linear Discriminant Analysis. In R. Decker and H.-J. Lenz, editors, Advances in Data Analysis, volume 33, pages 133–140, Heidelberg, 2007. Springer.

T. Hastie and R. Tibshirani. Discriminant Analysis by Gaussian Mixtures. Journal of the Royal Statistical Society B, 58(1):155–176, 1996.

M. Sugiyama. Dimensionality Reduction of Multimodal Labeled Data by Local Fisher Discriminant Analysis. Journal of Machine Learning Research, 8:1027–1061, 2007.

M. K. Titsias and A. C. Likas. Shared Kernel Models for Class Conditional Density Estimation. IEEE Transactions on Neural Networks, 12(5):987–997, September 2001.

M. K. Titsias and A. C. Likas. Mixture of Experts Classification Using a Hierarchical Mixture Model. Neural Computation, 14:2221–2244, 2002.

L. Zelnik-Manor and P. Perona. Self-Tuning Spectral Clustering. In L. K. Saul, Y. Weiss, and L. Bottou, editors, Advances in Neural Information Processing Systems, volume 17, pages 1601–1608. Cambridge, MA, 2005. MIT Press.
