Post on 09-Apr-2018
transcript
8/8/2019 Clustering for Semantic labels - Wk9
http://slidepdf.com/reader/full/clustering-for-semantic-labels-wk9 1/39
Computational Intelligence:
Lecture 20
Clustering to Form Semantic Concepts
Overview
• Interpretability of fuzzy representation
• Histogram analysis
• LVQ (Learning Vector Quantization)
• FCM (Fuzzy C-Means)
• FKP (Fuzzy Kohonen Partitioning)
Semantic Label Clustering
• Semantic properties of a linguistic variable
– A linguistic variable is characterised by a quintuple ⟨L, T(L), U, G, M⟩, where L is the name of the variable; T(L) is the linguistic term set of L; U is a universe of discourse; G is a syntactic rule which generates T(L); and M is a semantic rule that associates each term in T(L) with its meaning.
– Each linguistic term is characterized by a fuzzy set, which is described using a membership function.
[Figure: two example membership functions, μT(x) and μG(x), plotted over x ∈ [0, 10]; the right panel shows a Gaussian MF]
• Example: a linguistic variable x named L = “performance”
• It has five linguistic terms, where T(L) = {“very small”, “small”, “medium”, “large”, “very large”}.
• The semantic assignment M is shown in the figure: normal and convex membership functions with the ordering “very small” ≺ “small” ≺ “medium” ≺ “large” ≺ “very large”.
• The universe of discourse is U = [0, 100] of the base variable x.
[Figure: five membership functions μT(x), labelled very small, small, medium, large and very large, over x (performance) ∈ [0, 100]]
Criteria of Interpretability
• Coverage of the universe of discourse
• Normal: max_x μ_X(x) = 1
• Convex: x ≤ y ≤ z ⇒ μ_X(y) ≥ min(μ_X(x), μ_X(z))
• Ordered: X_1 ≺ X_2 ≺ … ≺ X_n, where X_1 ≺ X_2 denotes that X_1 precedes X_2
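The normality and convexity criteria can be checked numerically on a sampled membership function. A minimal sketch; the function names and the Gaussian example are illustrative, not from the lecture:

```python
import numpy as np

def is_normal(mu, tol=1e-9):
    """Normal: the membership degree reaches 1 somewhere."""
    return bool(abs(np.max(mu) - 1.0) < tol)

def is_convex(mu, tol=1e-9):
    """Fuzzy convexity on a sorted grid: for x <= y <= z,
    mu(y) >= min(mu(x), mu(z)); equivalent to mu rising to a
    single peak and then falling (unimodal)."""
    peak = int(np.argmax(mu))
    rising = np.all(np.diff(mu[:peak + 1]) >= -tol)
    falling = np.all(np.diff(mu[peak:]) <= tol)
    return bool(rising and falling)

x = np.linspace(0.0, 10.0, 201)
gauss = np.exp(-0.5 * ((x - 5.0) / 1.5) ** 2)  # Gaussian MF centred at 5
print(is_normal(gauss), is_convex(gauss))      # True True
```

A bimodal membership function would pass the normality check but fail the convexity check, since it falls and rises again after its first peak.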
Clustering
• Clustering is a method that organizes patterns into clusters such that patterns within a cluster are more similar to each other than to patterns in other clusters.
• When the crisp partition in classical clustering analysis is replaced with a fuzzy partition or a fuzzy pseudo-partition, it is referred to as fuzzy clustering.
• Examples: LVQ (Kohonen), FCM (Bezdek), MLVQ (Ang and Quek), DIC (Tung and Quek).
• Worked example: patterns from three classes P1, P2 and P3, described by two features.
[Figures: class-conditional histograms (Number of samples vs. feature value) for P1, P2 and P3 over the features Area and Perimeter]
• The data is divided into two sets:
– a design set, for designing a classifier
– a test set, for evaluating the obtained classifier
• The data can be arranged as an m by (n+1) matrix, where m is the number of patterns, n is the number of features, and the extra column holds the class label.
Area   Perimeter   Class
 3      6          P1
 5      7          P1
 4      4          P1
 7      6          P1
15     10          P2
14     12          P2
17     13          P2
14     19          P3
13     20          P3
15     22          P3
 …      …          …
Design set: odd-indexed entries. Test set: even-indexed entries.
Flowchart for Histogram Analysis
Feature extraction (from image to features) → Data reduction (none) → Probability estimate (histogram analysis)
• Example: 50 sample points drawn from a Gaussian distribution, binned into 3, 10 and 25 bins.
[Figure: three histograms of the same 50 samples, with 3, 10 and 25 bins]
Histogram Analysis
• Properties:
– Does not require explicit use of density functions
– Dilemma between the number of intervals vs. the number of points
– Rule of thumb: the number of intervals is equal to the square root of the number of points
– To convert to density functions, the total area must be unity
– Can be used with any number of features, but is subject to the curse of dimensionality
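The rule of thumb and the unit-area conversion above can be sketched in a few lines (the random sample is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(size=50)          # 50 points from a Gaussian

# rule of thumb: number of intervals = square root of number of points
n_bins = int(round(np.sqrt(len(samples))))   # sqrt(50) rounds to 7

# density=True rescales the counts so the total area under the
# histogram is unity, turning the histogram into a density estimate
density, edges = np.histogram(samples, bins=n_bins, density=True)
area = float(np.sum(density * np.diff(edges)))
```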
[Figure: kernel density estimates of the same data with smoothing widths σ = 0.1 and σ = 0.3]
Kernel Density Estimation
– Also known as the Parzen estimator
– Can be used in multi-feature estimation
– Normal optimal smoothing strategy:
  σ_opt = (4 / (3n))^(1/5) · σ
  where σ denotes the standard deviation of the distribution and n the number of samples.
A. W. Bowman and A. Azzalini, Applied Smoothing Techniques for Data Analysis: The Kernel Approach with S-Plus Illustrations. New York: Oxford University Press, 1997.
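A sketch of the Parzen estimator with the normal optimal smoothing width; the helper name and the sample data are illustrative:

```python
import numpy as np

def parzen_kde(x_eval, samples, width):
    """Parzen estimate: the average of Gaussian kernels of the given
    width, one centred on each sample."""
    z = (x_eval[:, None] - samples[None, :]) / width
    return np.exp(-0.5 * z ** 2).mean(axis=1) / (width * np.sqrt(2 * np.pi))

rng = np.random.default_rng(1)
samples = rng.normal(0.0, 1.0, size=200)
n = len(samples)

# normal optimal smoothing strategy: sigma_opt = (4 / (3n))^(1/5) * sigma
sigma_opt = (4.0 / (3.0 * n)) ** 0.2 * samples.std(ddof=1)

x = np.linspace(-4.0, 4.0, 401)
estimate = parzen_kde(x, samples, sigma_opt)  # density estimate on the grid
```

Because each kernel integrates to one, the estimate itself integrates to approximately one over a grid wide enough to cover the data.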
Learning Vector Quantization
• LVQ networks are unsupervised neural networks that determine the weights for cluster centres in an iterative and sequential manner.
• Each output neuron j has a weight vector v_j that is adjusted during learning.
• The winner, whose weight vector has the minimum distance from the input, updates its weights and moves toward the input.
• This is repeated until the weights are forced to stabilize through the specification of a learning rate.
[Figure: two-layer network with input layer x_1 … x_n, output layer y_1 … y_c, weights w_ji and cluster centres v_j; the winning neuron is marked]
LVQ – Cont’d
• Winner: ‖x − v_i^(T)‖ = min_{j=1..c} ‖x − v_j^(T)‖
• Update:
  v_j^(T+1) = v_j^(T) + α(x − v_j^(T))   if j = i (the winner)
  v_j^(T+1) = v_j^(T)                    if j ≠ i
where c is the number of clusters, x is the input vector, v_i is the i-th cluster centre and α is the learning constant.
Pseudo code:
(1) Define the number of clusters c and a small terminating condition ε
(2) Initialise the weights
(3) Determine the winning neuron based on distance
(4) Update the winner: v_i^(T) = v_i^(T−1) + α(x_k − v_i^(T−1))
(5) If ‖v_i^(T) − v_i^(T−1)‖ ≤ ε terminate; else repeat with a new vector
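The pseudo code above can be sketched as follows; the decaying learning-rate schedule and the synthetic two-blob data are illustrative assumptions, not from the lecture:

```python
import numpy as np

def lvq_cluster(data, c, alpha0=0.3, eps=1e-4, max_epochs=200, seed=0):
    """Sequential unsupervised LVQ: only the winning centre (minimum
    distance to the input) is moved toward the input."""
    rng = np.random.default_rng(seed)
    v = data[rng.choice(len(data), size=c, replace=False)].astype(float)
    for epoch in range(max_epochs):
        alpha = alpha0 / (1 + epoch)        # decreasing learning rate
        shift = 0.0
        for x in data:
            i = int(np.argmin(np.linalg.norm(x - v, axis=1)))  # winner
            step = alpha * (x - v[i])       # update the winner only
            v[i] += step
            shift = max(shift, float(np.linalg.norm(step)))
        if shift <= eps:                    # terminating condition
            break
    return v

rng = np.random.default_rng(2)
data = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(3, 0.3, (30, 2))])
centres = lvq_cluster(data, c=2)
```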
Fuzzy C-Means (FCM) – Bezdek
• A fuzzy pseudo-partition of a finite data set satisfies
  Σ_{i=1}^{c} μ_i(x_k) = 1   for all k = 1..n
  0 < Σ_{k=1}^{n} μ_i(x_k) < n   for all i = 1..c
• An objective function for fuzzy clustering (m defines the degree of fuzziness):
  J_m = Σ_{k=1}^{n} Σ_{i=1}^{c} (μ_i(x_k))^m ‖x_k − v_i‖²
FCM – Cont’d
• Pseudo code:
– Define the number of clusters (c), degree of fuzziness (m) and terminating condition (ε)
– Initialise T and the pseudo-partition P^(0)
– Compute the cluster centres v_1, v_2, …, v_c:
  v_i^(T) = Σ_{k=1}^{n} (μ_i(x_k))^m x_k / Σ_{k=1}^{n} (μ_i(x_k))^m   for i = 1..c
FCM – Cont’d
– Update the new pseudo-partition:
  μ_i^(T+1)(x_k) = [ Σ_{j=1}^{c} ( ‖x_k − v_i^(T)‖² / ‖x_k − v_j^(T)‖² )^{1/(m−1)} ]^{−1}   for i = 1..c, k = 1..n
– Compare the distance between the partitions:
  E = |P^(T+1) − P^(T)| = Σ_{i=1}^{c} Σ_{k=1}^{n} |μ_i^(T+1)(x_k) − μ_i^(T)(x_k)|
– If E ≤ ε stop; otherwise repeat.
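The alternating centre and membership updates fit in a few lines of NumPy. A sketch of standard FCM; the random initialisation and the two-blob data are illustrative:

```python
import numpy as np

def fcm(data, c, m=2.0, eps=1e-4, max_iter=100, seed=0):
    """FCM: alternate the centre and membership updates until the change
    in the pseudo-partition, E = |P(T+1) - P(T)|, falls below eps."""
    rng = np.random.default_rng(seed)
    u = rng.random((c, len(data)))
    u /= u.sum(axis=0)                                  # columns sum to 1
    for _ in range(max_iter):
        um = u ** m
        v = um @ data / um.sum(axis=1, keepdims=True)   # cluster centres
        d2 = ((data[None, :, :] - v[:, None, :]) ** 2).sum(axis=-1)
        inv = np.maximum(d2, 1e-12) ** (-1.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=0, keepdims=True)    # membership update
        done = np.abs(u_new - u).sum() <= eps           # E <= eps: stop
        u = u_new
        if done:
            break
    return v, u

rng = np.random.default_rng(3)
data = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(3, 0.3, (30, 2))])
v, u = fcm(data, c=2)
```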
FCM – Limitations
• An off-line, iterative optimization algorithm.
• Unable to perform on-line training.
• Performance depends on a good choice of the weighting exponent m.
[Figure: IRIS data set, FCM with m = 1.5, ε = 0.0001: membership degree μ(x) vs. sepal length (cm), sepal width (cm), petal length (cm) and petal width (cm) for the classes setosa, versicolor and virginica; the derived membership functions are trapezoidal-like]
FCM – Cont’d
[Figure: IRIS data set, FCM with a larger degree of fuzziness m (value illegible in source), ε = 0.0001: membership degree μ(x) vs. the four features for setosa, versicolor and virginica; the derived membership functions are Gaussian-like]
• The width of each Gaussian membership function is set from the distance to the closest cluster centre: σ_i = ‖v_i − v_closest‖ · σ
[Figure: IRIS data set, MLVQ with λ = 0.02, σ = 1.5, ε = 0.0001: membership degree μ(x) vs. the four features for setosa, versicolor and virginica]
MLVQ – Cont’d
[Figure: IRIS data set, MLVQ with λ = 0.02, σ = 3.0, ε = 0.0001: membership degree μ(x) vs. the four features for setosa, versicolor and virginica]
• A cluster can be described by a fuzzy interval with a centroid v, also known as a trapezoidal fuzzy number.
[Figure: trapezoidal membership function μ(x) with corner points α, β, γ, δ and centroid v along the x-axis]
Cont’d
• The subinterval where μ(x) = 1 is called the kernel of the fuzzy interval, and the subinterval [α, δ] is called the support:
– [β, γ] = kernel of the fuzzy interval
– [α, δ] = support of the fuzzy interval
• Clustering (e.g. LVQ or FCM) can be used to derive the centroid v, but it cannot derive the parameters (α, β, γ, δ) of the trapezoidal-shaped membership function:
  μ(x) = 0                  if x < α or x > δ
  μ(x) = (x − α)/(β − α)    if α ≤ x ≤ β
  μ(x) = 1                  if β ≤ x ≤ γ
  μ(x) = (δ − x)/(δ − γ)    if γ ≤ x ≤ δ
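The piecewise definition above maps directly to code. A small sketch; the function name is mine:

```python
def trapezoid_mf(x, a, b, g, d):
    """Trapezoidal membership function with support [a, d] and
    kernel [b, g] (a, b, g, d stand for alpha, beta, gamma, delta)."""
    if x < a or x > d:
        return 0.0                 # outside the support
    if a <= x < b:
        return (x - a) / (b - a)   # rising edge
    if b <= x <= g:
        return 1.0                 # kernel
    return (d - x) / (d - g)       # falling edge

print(trapezoid_mf(2.0, 1, 3, 5, 7))  # 0.5, halfway up the rising edge
```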
The Fuzzy Kohonen Partition algorithm – supervised
• Define:
– c as the number of classes,
– λ ≤ 1/Ω as the learning constant, where Ω = number of data vectors,
– η as the learning width and a small positive number ε as a stopping criterion; n = total number of data vectors
• Initialise the weights:
  v_i^(0) = min_k(x_k) + (2i − 1)(max_k(x_k) − min_k(x_k)) / (2c)   for i = 1..c, k = 1..n
• Determine the i-th cluster that the data x_k belongs to and update the weights v_i of the i-th cluster
The Fuzzy Kohonen Partition algorithm – supervised (cont’d)
• Compute the error to the clusters and the difference in error between iterations:
  e^(T+1) = Σ_{k=1}^{n} ‖x_k − v^(T+1)‖,   de^(T+1) = e^(T+1) − e^(T)
• Repeat while ¬(de^(T+1) ≤ ε)
– End of determining the centroids
The Fuzzy Kohonen Partition algorithm – supervised (cont’d)
• Initialise:
  α_i = β_i = γ_i = δ_i = φ_i = v_i   for i = 1..c
– where φ_i is the pseudo weight of v_i.
• Determine the i-th cluster that the data x_k belongs to and update the pseudo weight φ_i of the i-th cluster:
  φ_i = φ_i + λ(x_k − φ_i)
The Fuzzy Kohonen Partition algorithm – supervised (cont’d)
• Update the four points of the trapezoidal fuzzy number (TrFN):
  α_i = min(α_i, x_k)
  β_i = min(β_i, φ_i)
  γ_i = max(γ_i, φ_i)
  δ_i = max(δ_i, x_k)
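One way to read the four update rules: the support [α_i, δ_i] stretches to cover every data point assigned to the cluster, while the kernel [β_i, γ_i] stretches only as far as the pseudo weight φ_i. A small sketch with illustrative values (the helper name is mine):

```python
def update_trfn(trfn, phi, x_k):
    """One FKP-style update of the trapezoid points (alpha, beta, gamma,
    delta): the support grows to cover the data point x_k, the kernel
    to cover the pseudo weight phi of the winning cluster."""
    a, b, g, d = trfn
    return (min(a, x_k), min(b, phi), max(g, phi), max(d, x_k))

trfn = (5.0, 5.0, 5.0, 5.0)        # initialised at the cluster centre v
trfn = update_trfn(trfn, phi=4.8, x_k=3.9)
trfn = update_trfn(trfn, phi=5.3, x_k=6.4)
print(trfn)                         # (3.9, 4.8, 5.3, 6.4)
```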
The Fuzzy Kohonen Partition algorithm – Results
[Figure: IRIS data set, FKP with λ = 0.02, η = 0, ε = 0.0005: trapezoidal membership degree μ(x) vs. sepal length, sepal width, petal length and petal width (cm) for setosa, versicolor and virginica]
The Fuzzy Kohonen Partition algorithm – Results (cont’d)
[Figure: IRIS data set, FKP with λ = 0.02, η = 0.5, ε = 0.0005: membership degree μ(x) vs. the four features for setosa, versicolor and virginica]
[Figure: IRIS data set, PFKP with α = 0.02, η = 0, ε = 0.0005: membership degree μ(x) vs. sepal length, sepal width, petal length and petal width for setosa, versicolor and virginica]
PFKP – Cont’d
[Figure: IRIS data set, PFKP with λ = 0.02, η = 0.01, ε = 0.0005: membership degree μ(x) vs. the four features for setosa, versicolor and virginica]
Conclusions
• Two clustering techniques, the Fuzzy Kohonen Partition (FKP) and the Pseudo Fuzzy Kohonen Partition (PFKP), were proposed to directly derive appropriate membership functions from training data.
• Both algorithms directly derive trapezoidal membership functions that are convex and normal from training data, while the latter derives a pseudo-partition of the input space.