1 Neural Network-Based Clustering A. Selçuk MERCANLI Supervisor: Assist. Prof.Dr. Turgay İBRİKÇİ
Transcript
Page 1

Neural Network-Based Clustering

A. Selçuk MERCANLI

Supervisor: Assist. Prof.Dr. Turgay İBRİKÇİ

Page 2

Why NN?

• Neural networks have solved a wide range of problems and have good learning capabilities. Their strengths include adaptation, ease of implementation, parallelization, speed, and flexibility. NN-based clustering is closely related to the concept of competitive learning.

Page 3

w: weight, initially random
k: number of clusters

Similarity function: s(x, wj) = Σ(i=1..d) xi wji

Page 4

Updating Weights

wj(t+1) = wj(t) + η(t)(x(t) − wj(t))

η: learning rate. If η is zero there is no learning; if η is 1 learning is fast.

To avoid unlimited growth of the weights, the weight vector must be normalized whenever the input patterns are normalized.

Page 5

WTA - WTM

The competitive learning paradigm allows learning only for the particular winning neuron that best matches the given input pattern. It is therefore also known as winner-take-all (WTA).

On the other hand, learning can also occur in a cooperative way: not just the winning neuron adjusts its prototype, but all the other cluster prototypes may also be adapted, according to how close they are to the input pattern. This learning scheme is called soft competitive learning or winner-take-most (WTM).

Hard competition: only one neuron is activated.
Soft competition: neurons neighboring the true winner are also activated.

Page 6

HARD COMPETITIVE LEARNING CLUSTERING

• Online K-means Algorithm

• Leader Follower Clustering Algorithm

• Adaptive Resonance Theory

• Fuzzy ART

Page 7

Online K-means Algorithm

1. Initialize K cluster prototype vectors m1, …, mK ∈ ℝ^d randomly;
2. Present a normalized input pattern x ∈ ℝ^d;
3. Choose the winner J that has the smallest Euclidean distance to x:
   J = argmin_j ||x − mj||;
4. Update the winning prototype vector towards x:
   mJ(new) = mJ(old) + η(x − mJ(old)), where η is the learning rate;
5. Repeat steps 2 – 4 until the maximum number of steps is reached.
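The five steps above can be sketched in a few lines of Python (a minimal illustration; the data, K, learning rate, and step count are made up for the example):

```python
import numpy as np

def online_kmeans(X, K, eta=0.1, max_steps=1000, seed=0):
    """Online (sequential) K-means: each presented pattern updates only the winner."""
    rng = np.random.default_rng(seed)
    # Step 1: initialize K prototype vectors (here: random data points)
    m = X[rng.choice(len(X), K, replace=False)].astype(float)
    for step in range(max_steps):
        x = X[rng.integers(len(X))]                   # step 2: present a pattern
        J = np.argmin(np.linalg.norm(x - m, axis=1))  # step 3: winner = nearest prototype
        m[J] += eta * (x - m[J])                      # step 4: move the winner towards x
    return m                                          # step 5: loop until max_steps

# Two well-separated blobs; the prototypes should settle near (0,0) and (10,10)
X = np.vstack([np.zeros((50, 2)), np.full((50, 2), 10.0)])
protos = online_kmeans(X, K=2)
```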

Page 8

K-means Algorithm

iterate {
  Compute the distance from every point to all k centers
  Assign each point to the nearest center
  Compute the average of all points assigned to each center
  Replace the centers with the new averages
}

From Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet Summer 2007, Distributed Computing Seminar, p 12

Page 9

Disadvantages of K-means

• K-means requires the number of clusters to be determined in advance; it must be estimated through cluster analysis. An inappropriate choice of the number of clusters can distort the real clustering structure, which is why the leader-follower algorithm is needed.

Page 10

Disadvantages of K-means

• The learning rate η becomes very small in the later stages, which has the disadvantage that new patterns are not learned very well.

• The rate is decayed from η0 to η1, where η0 and η1 are the initial and final values of the learning rate, respectively, and t1 is the maximum number of iterations allowed.
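The decay-schedule equation itself did not survive extraction; a common exponential schedule consistent with the symbols η0, η1, and t1 (an assumption, not necessarily the slide's exact formula) is:

```latex
\eta(t) = \eta_0 \left( \frac{\eta_1}{\eta_0} \right)^{t/t_1}
```

so that η(0) = η0 and η(t1) = η1.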

Page 11

Leader - Follower Clustering Algorithm

1. Initialize the first cluster prototype vector m1 with the first input pattern;
2. Present a normalized input pattern x;
3. Choose the winner J that is closest to x based on the Euclidean distance:
   J = argmin_j ||x − mj||;
4. If ||x − mJ|| < θ, update the winning prototype vector:
   mJ(new) = mJ(old) + η(x − mJ(old)), where η is the learning rate.
   Otherwise, create a new cluster with the prototype vector equal to x;
5. Repeat steps 2 – 4 until the maximum number of steps is reached.
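A minimal Python sketch of the leader-follower steps above (the threshold, learning rate, and data are illustrative assumptions):

```python
import numpy as np

def leader_follower(X, theta=3.0, eta=0.1):
    """Leader-follower: the first pattern founds the first cluster; each later
    pattern either updates the nearest prototype or founds a new cluster."""
    protos = [X[0].astype(float)]                 # step 1: first pattern -> first prototype
    for x in X[1:]:                               # step 2: present patterns one by one
        d = [np.linalg.norm(x - m) for m in protos]
        J = int(np.argmin(d))                     # step 3: nearest prototype
        if d[J] < theta:
            protos[J] += eta * (x - protos[J])    # step 4a: follow the leader
        else:
            protos.append(x.astype(float))        # step 4b: found a new cluster
    return protos

X = np.array([[0.0, 0.0], [0.5, 0.0], [10.0, 10.0], [10.5, 10.0]])
clusters = leader_follower(X, theta=3.0)
# two well-separated pairs of points -> two prototypes
```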

Page 12

Leader - Follower

• Find the closest cluster center.
  – Distance above the threshold? Create a new cluster.
  – Otherwise, add the instance to the cluster and update the cluster center.

(Figure annotation: Distance > Threshold)

From Johan Everts, Clustering algorithms, Kunstmatige Intelligentie, p 31

Page 13

Performance Analysis

• K-Means
  – Depends heavily on a priori knowledge (K)
  – Very stable
• Leader Follower
  – Depends heavily on a priori knowledge (threshold)
  – Faster, but unstable

From Johan Everts, Clustering algorithms, Kunstmatige Intelligentie, p 39

Page 14

Adaptive Resonance Theory

• An important problem with competitive learning-based clustering is stability. An incremental clustering algorithm is stable in terms of two conditions:
• (1) No prototype vector can cycle, i.e., take on a value that it had at a previous time (provided it has changed in the meantime).
• (2) Only a finite number of clusters are formed with infinite presentation of the data.
The first condition concerns the stability of the individual prototype vectors of the clusters; the second concerns the stability of all the cluster vectors together.

Page 15

Adaptive Resonance Theory

• The K-means and leader-follower algorithms do not produce stable clusters. The plasticity of the two algorithms can cause the loss of previously learned rules.

• Adaptive resonance theory (ART) was developed by Carpenter and Grossberg (1987a, 1988).

• ART is not, as is popularly imagined, a neural network architecture. It is a learning theory hypothesizing that resonance in neural circuits can trigger fast learning.

Page 16

Adaptive Resonance Theory

• Stability-Plasticity Dilemma
• Stability: system behaviour does not change after irrelevant events.
• Plasticity: system adapts its behaviour in response to significant events.
• Dilemma: how to achieve stability without rigidity, and plasticity without chaos?
  – Ongoing learning capability
  – Preservation of learned knowledge

From: Arash Ashari, Ali Mohammadi, ART PowerPoint

Page 17

ART-1

• The basic ART1 architecture consists of two layers of nodes (neurons): the feature representation field F1 and the category representation field F2.

• The neurons in layer F1 are activated by the input pattern, while the prototypes of the formed clusters are stored in layer F2.

Page 18

ART-1 Architecture

Page 19

ART-1

• The two layers are connected via adaptive weights: a bottom-up weight matrix and a top-down weight matrix.

• F2 performs a winner-take-all competition between a certain number of committed neurons and one uncommitted neuron. The winning neuron feeds its template weights back to layer F1; this is known as top-down feedback expectancy. The template is then compared with the input pattern.

Page 20

ART-1

• If the match meets the vigilance criterion, weight adaptation occurs, where both bottom - up and top - down weights are updated simultaneously. This procedure is called resonance, which suggests the name of ART. On the other hand, if the vigilance criterion is not met, a reset signal is sent back to layer F2 to shut off the current winning neuron.

• This new expectation is then projected into layer F1 , and this process repeats until the vigilance criterion is met. If an uncommitted neuron is selected for coding, a new uncommitted neuron is created to represent a potential new cluster. It is clear that the vigilance parameter ρ has a function similar to that of the threshold parameter θ of the leader - follower algorithm.

Page 21

ART-1 Flowchart

Page 22

Fuzzy ART

• Fuzzy ART (FA) keeps an architecture and operations similar to ART1 while replacing the binary operators with fuzzy set operators so that it can work on real-valued data sets. We describe FA by emphasizing its main differences from ART1 in terms of five phases: preprocessing, initialization, category choice, category match, and learning.

• Preprocessing. Each component of a d-dimensional input pattern x = (x1, …, xd) must lie in the interval [0, 1].

Page 23

Fuzzy ART

• Initialization. The real-valued adaptive weights W = {wij}, representing the connection from the ith neuron in layer F2 to the jth neuron in layer F1, subsume both the bottom-up and top-down weights of ART1. Initially, the weights of an uncommitted node are set to one. Larger values may also be used; however, this biases the system towards selecting committed nodes.

Page 24

Fuzzy ART

• Category choice. After an input pattern is presented, the nodes in layer F2 compete by calculating the category choice function, defined as

Tj = |x ∧ wj| / |wj|,

where ∧ is the fuzzy AND operator, defined componentwise by

(x ∧ y)i = min(xi, yi),

and |·| denotes the L1 norm.

Page 25

Fuzzy ART

• Category match. The category match function of the winning neuron J is then tested against the vigilance criterion. If

|x ∧ wJ| / |x| ≥ ρ,

resonance occurs. Otherwise, the current winning neuron is disabled, and a new neuron in layer F2 is selected and examined against the vigilance criterion. This search process continues until the criterion is satisfied.

Page 26

Fuzzy ART

• Learning. The weight vector of the winning neuron that passes the vigilance test is updated using the learning rule

wJ(new) = β (x ∧ wJ(old)) + (1 − β) wJ(old),

where β ∈ [0, 1] is the learning rate parameter.
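One Fuzzy ART presentation step, combining the choice, match, and learning rules above, can be sketched as follows (the vigilance ρ, rate β, small choice constant alpha in the denominator, and the two input patterns are illustrative assumptions, not values from the slides):

```python
import numpy as np

def fuzzy_and(a, b):
    return np.minimum(a, b)          # (x ^ y)_i = min(x_i, y_i)

def fuzzy_art_step(x, W, rho=0.75, beta=1.0, alpha=0.001):
    """One presentation of pattern x (components in [0, 1]) to Fuzzy ART.
    W is a list of category weight vectors; returns (resonating category, W)."""
    # Category choice: T_j = |x ^ w_j| / (alpha + |w_j|), |.| = L1 norm
    T = [fuzzy_and(x, w).sum() / (alpha + w.sum()) for w in W]
    for j in np.argsort(T)[::-1]:                 # try categories best-first
        # Category match (vigilance test): |x ^ w_j| / |x| >= rho
        if fuzzy_and(x, W[j]).sum() / x.sum() >= rho:
            # Learning: w_new = beta*(x ^ w_old) + (1 - beta)*w_old
            W[j] = beta * fuzzy_and(x, W[j]) + (1 - beta) * W[j]
            return j, W
    W.append(x.copy())                            # no resonance: commit a new category
    return len(W) - 1, W

W = [np.ones(4)]                                  # one uncommitted node (weights = 1)
j1, W = fuzzy_art_step(np.array([0.9, 0.1, 0.8, 0.2]), W)   # resonates with node 0
j2, W = fuzzy_art_step(np.array([0.1, 0.9, 0.2, 0.8]), W)   # fails vigilance -> new node
```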

Page 27

SOFT COMPETITIVE LEARNING CLUSTERING

• Leaky Learning
• One of the major problems with hard competitive learning is the underutilized or dead-neuron problem: a neuron whose weight vector is initialized farther away from the input patterns than the other weight vectors may never win the competition and therefore never be trained. One solution is to allow both winning and losing neurons to move towards the presented input pattern, but with different learning rates.

• where ηw and ηl are the learning rates for the winning and losing neurons, respectively, and ηw >> ηl .
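The update rule the bullet refers to was lost with the slide image; reconstructed from the text (winner J moves at rate ηw, all losers at the much smaller rate ηl), it reads:

```latex
w_j(t+1) = w_j(t) +
\begin{cases}
\eta_w \,\bigl(x - w_j(t)\bigr), & j = J \ \text{(winner)}\\[2pt]
\eta_l \,\bigl(x - w_j(t)\bigr), & j \neq J \ \text{(losers)},
\end{cases}
\qquad \eta_w \gg \eta_l .
```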

Page 28

• Conscience Mechanism

To implement a conscience, the distance definition described above must be modified. DeSieno (1988) adds a bias term bj to the squared Euclidean distance:
x : input pattern
wj : weight of neuron j, j = 1, 2, …, K
bj : bias term

Page 29

• Rival Penalized Competitive Learning
• x : input pattern
• wj : weight of neuron j, j = 1, 2, …, K
• bj : bias term

Page 30

Learning Vector Quantization

• Learning vector quantization (LVQ) (Kohonen, 1990) is a supervised pattern classification method; its architecture is essentially the same as Kohonen's SOM. The LVQ algorithm finds the output unit that is closest to the input vector. If x and w belong to the same class, the weights are moved toward the input vector; if they belong to different classes, the weights are moved away from it. (Fundamentals of Neural Networks, L. Fausett)
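The attract/repel rule just described can be sketched as a single LVQ1-style update (the prototypes, labels, and learning rate below are illustrative assumptions):

```python
import numpy as np

def lvq_step(x, label, W, W_labels, eta=0.2):
    """One LVQ1 update: move the nearest prototype toward x if its class
    matches the label, away from x otherwise."""
    J = int(np.argmin(np.linalg.norm(W - x, axis=1)))  # nearest prototype
    if W_labels[J] == label:
        W[J] += eta * (x - W[J])      # same class: attract
    else:
        W[J] -= eta * (x - W[J])      # different class: repel
    return J

W = np.array([[0.0, 0.0], [10.0, 10.0]])   # one prototype per class
W_labels = [0, 1]
J = lvq_step(np.array([1.0, 1.0]), 0, W, W_labels)  # matches class 0: attract
```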

Page 31

Flowchart of LVQ

X: input pattern

J(w,x) : cost function

w: weights

Page 32

LVQ

J is the winning neuron, and the cost function is defined on the locally weighted error between x and w.

Page 33

LVQ

: Prespecified threshold

Page 34

LVQ Application

Ten data points clustered into two clusters, shown in red and cyan.

Page 35

SOM

• A competitive network: the output neurons of the network compete among themselves to be activated, or fired. The neighbourhood function usually shrinks over time; the lattice is typically linear, rectangular, or hexagonal.

Page 36

Neural Networks: A Comprehensive Foundation, Simon Haykin, Prentice Hall, p. 467

Page 37

SOM Neighboorhood

Application of Neural Networks and Other Learning Technologies in Process Engineering, I. M. Mujtaba, M. A. Hussain, Imperial College Press, 2001, p. 53

Page 38

SOM BMU

Find the best matching unit, then update the weights of the winner and its neighbours.

Decrease the learning rate and the neighbourhood size.

Page 39

Flowchart of SOFM

Page 40

Basic steps of SOFM

1. Determine the topology of the SOFM. Initialize the weight vectors wj(0) for j = 1, …, K randomly;
2. Present an input pattern x to the network. Choose the winning node J that has the minimum Euclidean distance to x, i.e.
   J = argmin_j ||x − wj||;
3. Calculate the current learning rate and the size of the neighborhood;
4. Update the weight vectors of all the neurons in the neighborhood of J using
   wj(t+1) = wj(t) + η(t) hjJ(t) (x − wj(t));
5. Repeat steps 2 to 4 until the change in neuron positions falls below a prespecified small positive number.
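The steps above can be sketched for a one-dimensional map as follows (the Gaussian neighbourhood, the linear decay schedules, a fixed iteration budget instead of a convergence test, and the uniform data are illustrative assumptions):

```python
import numpy as np

def train_sofm(X, K=10, T=2000, seed=0):
    """1-D SOFM: K neurons on a line; the winner and its lattice
    neighbours are pulled towards each presented pattern."""
    rng = np.random.default_rng(seed)
    w = rng.random((K, X.shape[1]))                    # 1. random initial weights
    for t in range(T):
        x = X[rng.integers(len(X))]                    # 2. present a pattern
        J = np.argmin(np.linalg.norm(x - w, axis=1))   #    winner = nearest neuron
        eta = 0.5 * (1 - t / T)                        # 3. decaying learning rate
        sigma = max(K / 2 * (1 - t / T), 0.5)          #    shrinking neighbourhood
        h = np.exp(-((np.arange(K) - J) ** 2) / (2 * sigma ** 2))
        w += eta * h[:, None] * (x - w)                # 4. update the neighbourhood
    return w                                           # 5. fixed budget stands in
                                                       #    for the convergence test

X = np.random.default_rng(1).random((200, 2))          # uniform unit square
w = train_sofm(X)
```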

Page 41

SOM Application

Learning a character

Page 42

SOM Application

Learning a circle with SOM

Page 43

SOM Application

SOM examples from Bernd Fritzke, Ruhr University, draft of 5 April 1997, p. 32

Page 44

Neural Gas

NG adaptively determines the neighborhood update by using a neighborhood ranking of the prototype vectors within the input space, rather than a neighborhood function on an output lattice.

Page 45

Neural Gas

• Prototype vectors are updated as

wj(t+1) = wj(t) + η(t) hλ(kj(x, W)) (x − wj(t)),

where hλ(kj(x, W)) = exp(−kj(x, W)/λ) is a bell-shaped curve and kj(x, W) is the distance rank of wj.

• The learning rate η and the characteristic decay constant λ are annealed during training:
  η0 and ηf : initial and final learning rates
  λ0 and λf : initial and final decay constants
  T : maximum number of iterations

Page 46

NG Algorithm

The major process of the NG algorithm is as follows:1. Initialize a set of prototype vectors W = { w1 , w2 , … ,

wK } randomly;2. Present an input pattern x to the network. Sort the index

list in order from the prototype vector with the smallest Euclidean distance from x to the one with the greatest distance from x ;

3. Calculate the current learning rate and hλ ( k j ( x, W )) (bell shaped curve). Adjust the prototype vectors using the learning rule

4. Repeat steps 2 and 3 until the maximum number of iterations is reached.

Page 47

NG Application

NG keeps adding new centers and stops when it reaches the maximum number of iterations.

Page 48

NG Application

NG examples from Bernd Fritzke, Ruhr University, draft of 5 April 1997, p. 22

Page 49

Growing Neural Gas

• A type of SOM. Neural gas is a simple algorithm for finding optimal data representations based on feature vectors. The algorithm was named "neural gas" because of the dynamics of the feature vectors during the adaptation process, which distribute themselves like a gas within the data space.

Page 50

Growing Neural Gas

• When prototype learning occurs, not only is the prototype vector of the winning neuron J1 updated towards x, but the prototypes within its topological neighborhood NJ1 are also adapted.

• Unlike NG, GCS, or SOFM, GNG is a self-organizing network that can dynamically grow (the usual case) and shrink the number of neurons in the network. New neurons are successively inserted into the network every λ iterations near the neuron with the maximum accumulated error. At the same time, a neuron-removal rule can be used to eliminate the neurons with the lowest utility for error reduction.

Page 51

GNG

GNG examples from Bernd Fritzke, Ruhr University, draft of 5 April 1997, p. 29

Page 52

Some Applications

Magnetic Resonance Imaging Segmentation

MRI provides a visualization of the internal tissues and organs of a living organism, which is valuable in disease diagnosis (such as cancer and heart and vascular disease), treatment, and surgical planning. MRI segmentation can be formulated as a clustering problem in which a set of feature vectors, obtained by transforming image measurements and positions, is grouped into a relatively small number of clusters.

Page 53

Magnetic Resonance Imaging Segmentation

• After the patient was given Gadolinium, the tumor on the T1 - weighted image (Fig. 5.17 (d)) becomes very bright and is isolated from surrounding tissue.

From N. Karayiannis and P. Pai, "Segmentation of magnetic resonance images using fuzzy algorithms for learning vector quantization," IEEE Transactions on Medical Imaging, vol. 18, pp. 172 – 180, 1999. Copyright © 1999 IEEE.

Page 54

Condition Monitoring of 3G Cellular Networks

• The 3G mobile networks combine new technologies such as WCDMA and UMTS and provide users with a wide range of multimedia services and applications with higher data rates (Laiho et al., 2005 ). At the same time, emerging new requirements make it more important to monitor the states and conditions of 3G cellular networks. Specifically, in order to detect abnormal behaviors in 3G cellular systems, four competitive learning neural networks, LVQ, FSCL, SOFM (see another application of SOFM in WCDMA network analysis in Laiho et al. (2005) ), and NG, were applied to generate abstractions or clustering prototypes of the input vectors under normal conditions, which are further used for network behavior prediction

Page 55

Condition Monitoring of 3G Cellular Networks

The clustering prototypes provide a good summary of the normal behaviors of the cellular networks, which can then be used to detect abnormalities.

Page 56

Summary

Neural network-based clustering is tightly related to the concept of competitive learning. Prototype vectors, associated with a set of neurons in the network and representing clusters in the feature or output space, compete with each other upon the presentation of an input pattern. The active neuron, or winner, reinforces itself (hard competitive learning) or its neighborhood within certain regions (soft competitive learning). More often, the neighborhood decreases monotonically with time.

One important problem that learning algorithms need to deal with is the stability-plasticity dilemma: a system should be capable of learning new and important patterns while maintaining stable cluster structures in response to irrelevant inputs.

