Analysis of some extensions of the
Self-Organizing Maps: Evolving SOMs (ESOM) (1),
Growing Hierarchical SOMs (GHSOM) (2),
Relative Density SOMs (ReDSOM) (3)
Domenico Leuzzi
Abstract.
We describe and analyze some interesting extensions of the Self-Organizing
Map (SOM) algorithm as proposed originally by Kohonen, namely the Evolving SOM (ESOM) and the Growing Hierarchical SOM (GHSOM), as well as a
visualization method for identifying changes of the cluster structures in temporal datasets, called ReDSOM. The ESOM algorithm differentiates itself from
its parent SOM because the topology of the map it creates is not fixed as in the
SOM but is adaptively built on the basis of the dataset distribution, thus reducing the number of map units required to achieve a given quantization error. The GHSOM algorithm tries to reproduce the hierarchy of the dataset
by creating a multilevel map in which each unit of one map level can be explained
more deeply by a map on the next level. The multilevel approach not only optimizes the use of the units (a lower-level map is added only if it is necessary to improve the quantization error) but also allows a quicker navigation of the maps
obtained.
The ReDSOM visualization method is useful when we have a dataset which
evolves in time and we want to compare, in the map space, the clustering
structures of dataset snapshots taken at two different time instants. This method
allows identifying visually, by means of different colorings, emerging/lost clusters,
cluster enlargement/shrinking, density increases/decreases, and cluster movements.
1 Introduction
We start in Section 2 with a brief presentation of the SOM algorithm as originally
developed by Kohonen (4; 5).
In Section 3 we describe the tools used for the experimental tests.
Section 4 is dedicated to the ReDSOM visualization method, Sections 5 and 6 to
the GHSOM and ESOM algorithms respectively.
2 Self-Organizing Maps
A SOM (also known as SOFM, Self-Organizing Feature Map, or Kohonen Map)
is an artificial neural network based on unsupervised competitive learning (4; 5).
A low-dimensional grid of neurons (aka units), usually 2-D, is built following a
fixed and predetermined topology (i.e. rectangular or hexagonal). This grid constitutes
the so-called map space (or output space). Whichever topology is used, each unit
is connected with a number of neighboring units which are equidistant in the map
space: in the rectangular topology each unit is surrounded by four equidistant neighboring units, and in the hexagonal topology it is surrounded by six units¹.
The grid units are initialized in the data space, that is, each unit weight vector (aka
the codebook vector, prototype vector or reference vector) is given an initial value
taken from the input space. The initialization can be random or linear. In the latter
case the initial values are chosen in an orderly fashion along the first $d$ principal components of the data, $d$ being the map space dimensionality.
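As an illustration only (not toolbox code), the following is a minimal MATLAB sketch of such a linear initialization for a 2-D map, assuming a plain data matrix X (n-by-dim), a map size msize, and the Statistics Toolbox function pca; the som_lininit function used later wraps equivalent logic:

% Hedged sketch: spread the initial codebook along the first two
% principal components of the data matrix X.
[coeff, ~, latent] = pca(X);          % principal directions and variances
mu = mean(X, 1);                      % center of the data cloud
[g1, g2] = ndgrid(linspace(-1, 1, msize(1)), linspace(-1, 1, msize(2)));
M0 = mu + (g1(:) * sqrt(latent(1))) * coeff(:,1)' ...
        + (g2(:) * sqrt(latent(2))) * coeff(:,2)';  % one codebook row per unit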
The map is then trained using a competitive unsupervised learning algorithm.
At each training step $t$, a data point $\mathbf{x}(t)$ is randomly selected from the dataset. Then the best matching unit (BMU) $c$ corresponding to $\mathbf{x}(t)$, i.e. the unit with the weight vector closest to $\mathbf{x}(t)$, is selected from the map in accordance with (1)

$\|\mathbf{x}(t) - \mathbf{m}_c(t)\| = \min_i \|\mathbf{x}(t) - \mathbf{m}_i(t)\|$ (1)

After that, not only the BMU but also all its neighbors are adjusted according to the adaptation rule

$\mathbf{m}_i(t+1) = \mathbf{m}_i(t) + h_{c,i}(t)\,[\mathbf{x}(t) - \mathbf{m}_i(t)]$, $i = 1, \ldots, N$ (2)

where

$h_{c,i}(t) = \alpha(t)\,\exp\left(-\frac{\|\mathbf{r}_c - \mathbf{r}_i\|^2}{2\sigma^2(t)}\right)$

is the neighborhood kernel function (here in its common Gaussian form), dependent on the map-space distance between the winning neuron $c$ and the neighbor unit $i$, as well as on the time $t$. The parameter $\sigma(t)$ controls the size of the zone of neurons around the winning one that are affected by the
update, while $\alpha(t)$ is a function decreasing in time (e.g. linearly or exponentially) that controls the strength of the adaptation.
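The whole training step (1)-(2) fits in a few lines of MATLAB. The following sketch is our own illustration (not toolbox code); M is the N-by-dim codebook matrix, R the N-by-2 map-space coordinates of the units:

function M = som_step(M, R, x, sigma, alpha)
% One online SOM step per (1)-(2): find the BMU, then move every unit
% toward the sample x proportionally to the neighborhood kernel.
[~, c] = min(sum((M - x).^2, 2));                        % (1) best matching unit
h = alpha * exp(-sum((R - R(c,:)).^2, 2) / (2*sigma^2)); % Gaussian kernel h_{c,i}
M = M + h .* (x - M);                                    % (2) adaptation rule
end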
¹ The border units are surrounded by fewer units, unless the lattice is wrapped in a cylindrical
or toroidal structure.
3 Tools used for the experimental tests
For implementing the ReDSOM visualization method we used the SOM Toolbox
2.0², a powerful package developed for MATLAB that allows managing
every aspect of the SOMs, from the initialization, to the training, up to the
visualization both in the map space and in the data space or the projected space. Detailed information can be obtained from the documentation and from the source code.
For the implementation of the GHSOM algorithm we used the package Java SOMToolbox. It contains several modules to train and visualize a SOM. In particular we used the module GHSOM to grow a hierarchy of maps, and SOMViewer
to visualize the results.
We also used the ESOM Toolbox³ for MATLAB to run some tests on the ESOM algorithm.
4 ReDSOM (3)
Suppose we have a temporal dataset $D(t)$, that is, a dataset whose distribution varies
with time. We want to compare the clustering distribution of the dataset at two different time instants $t_1$ and $t_2$, that is of the two datasets $D(t_1)$, $D(t_2)$. The clustering itself and the comparison are much simpler if carried out in the low-dimensional space of
two SOMs of equal topology, $M(t_1)$, $M(t_2)$, trained on such two datasets. In order to be able to compare the two maps directly, they must have
the same orientation, and the datasets on which they are trained have to be normalized
using the same normalization method and parameters. So the procedure to obtain two
maps that can be compared by the method concerned here is as follows:
1. Normalize both datasets $D(t_1)$, $D(t_2)$ using the same normalization method (e.g. the common z-score) and the same parameters.
2. Initialize map $M(t_1)$ using ordered values.
3. Train map $M(t_1)$ using the dataset $D(t_1)$.
4. Initialize map $M(t_2)$ using the codebook vectors of the previously trained map $M(t_1)$.
5. Train map $M(t_2)$ using the dataset $D(t_2)$.
The maps so obtained are directly comparable with each other. That is, if we define
a density function on the data space, related to the density of the units in the data
space (considering their prototype vectors), we can compare one-to-one the
density of units of the two maps (the two map units in the same position of the two
maps are compared together).
² The package is available at the URL http://www.cis.hut.fi/projects/somtoolbox/download/
³ The package is available at the URL http://www.aut.ac.nz/__data/assets/file/0015/10176/ecos_esom.zip
4.1 Area Density and Relative Density Definitions
We define the area density $d_k(\mathbf{v})$ of the map $M(t_k)$, calculated at the data-space
vector $\mathbf{v}$, as the sum of the values of a Gaussian kernel function centered on the
vector $\mathbf{v}$ and evaluated at the prototype vectors $\mathbf{m}_i$ of the map units, as shown in (3)

$d_k(\mathbf{v}) = \sum_{i=1}^{N} \exp\left(-\frac{\|\mathbf{v} - \mathbf{m}_i\|^2}{2r^2}\right)$ (3)

The radius $r$ defines the width of the kernel function, and its value should be chosen in accordance with the mean distance between neighboring units. It was observed
that a quartile of it (e.g. the third quartile) is a balanced choice.
Now we define the relative density $rd(\mathbf{v})$ of the map $M(t_2)$ with respect to the map $M(t_1)$ as the logarithm of the ratio between the area
density of the map $M(t_2)$ and the area density of the map $M(t_1)$, both calculated at the
same location $\mathbf{v}$, as shown in (4)

$rd(\mathbf{v}) = \log_2 \frac{d_2(\mathbf{v})}{d_1(\mathbf{v})}$ (4)

The use of the logarithm in (4) makes the values of the density ratio negative when the ratio is below 1 (decrease of density), and positive
when the ratio is above 1 (increase of density).
The base-two logarithm gives a convenient scale: for example, a value
of +2 indicates a density four times higher on the map $M(t_2)$, while a value of
-2 indicates a density four times lower on the map $M(t_2)$. Based on experimental observation, values of $rd(\mathbf{v})$ less than -3 indicate that the
location of vector $\mathbf{v}$ is no longer occupied on the next map $M(t_2)$ (it is lost), while
values greater than +3 indicate that the location of vector $\mathbf{v}$ was not occupied on the
previous map $M(t_1)$ but is on the next map $M(t_2)$ (it is new).
The relative density calculation is performed only on the prototype vectors of the
two maps $M(t_1)$, $M(t_2)$ and not on the actual data vectors. So the running time of the
calculation for a map is quadratic in the number of units $N$ and not in the number of data points $n$, where $N \ll n$.
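To make (3) and (4) concrete, here is a minimal MATLAB sketch of the two formulas as reconstructed above (our own illustration, not the som_density function used later); M1 and M2 are the codebook matrices of the two trained maps and r the kernel radius:

% Area density (3) of a map with codebook M, evaluated at the rows of V,
% and relative density (4) on the prototype vectors of the first map.
area_density = @(M, V, r) arrayfun(@(k) ...
    sum(exp(-sum((M - V(k,:)).^2, 2) / (2*r^2))), (1:size(V,1))');
d1 = area_density(M1, M1, r);   % density of map 1 at map-1 prototypes
d2 = area_density(M2, M1, r);   % density of map 2 at the same locations
rd1 = log2(d2 ./ d1);           % eq. (4): negative = lost/less dense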
4.2 Relative Density Visualization
As said in the previous section, the relative density calculation is performed
only on the prototype vectors of the two maps $M(t_1)$, $M(t_2)$. Let us use $rd_1$ and $rd_2$
as shorthand to indicate the values of the relative density calculated, respectively,
on the prototype vectors of the first map and on the prototype vectors of the second map.
We visualize $rd_1$ and $rd_2$ on the respective maps, in a gradation of blue for positive
values and in a gradation of red for negative values, as shown in Fig. 2.
The $rd_1$ visualization should be used to detect a density decrease of the vectors of
the first map (negative values of relative density). In fact, if we want to detect whether a
vector of the first map has been lost or has decreased its density in the second map,
we have to choose a location vector for the calculation of the relative density where
that vector is surely present, that is, among the reference vectors of the first map.
Similarly, the $rd_2$ visualization should be used to detect the vectors that in the second
map have increased their density with respect to the same vector location on the first
map.
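As a sketch of how such a coloring can be displayed with the SOM Toolbox (assuming the rd values computed in the next section; som_show accepts a per-unit color-coding vector through its 'color' argument):

% Show rd_1 on the lattice of the first trained map; with a diverging
% colormap, negative values appear red and positive ones blue.
som_show(sM_t{1}, 'color', {rd{1}, 'rd_1'});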
4.3 MATLAB Implementation using SOM Toolbox
To implement the Relative Density Visualization algorithm in MATLAB using
the SOM Toolbox, first of all it is necessary to initialize and train the two maps
relative to the two snapshots of the dataset we want to compare, following the procedure indicated in Section 4 to obtain two directly comparable maps. The MATLAB
code we used to do that is as follows:
sD{1} = som_normalize(sD{1}, 'var');  % normalize sD{1} using the 'var' method (z-score)
sD{2} = som_normalize(sD{2}, sD{1});  % normalize sD{2} using the same method and parameters as sD{1}
sM{1} = som_lininit(sD{1});           % initialize map 1 linearly
sM{1}.comp_names{1} = 'x';
sM{1}.comp_names{2} = 'y';
% Train map 1 (batch training); the trained map is stored in sM_t{1}
sTr1 = som_train_struct(sM{1}, sD{1}, 'algorithm', 'batch', 'phase', 'rough');
sM_t{1} = som_batchtrain(sM{1}, sD{1}, sTr1);
sTr1 = som_train_struct(sM_t{1}, sD{1}, 'algorithm', 'batch', 'phase', 'finetune');
sM_t{1} = som_batchtrain(sM_t{1}, sD{1}, sTr1);
sM{2} = sM{1};                        % map 2: same topology and codebook vectors as map 1
% Train map 2 (batch training)
sTr2 = som_train_struct(sM{2}, sD{2}, 'algorithm', 'batch', 'phase', 'rough');
sM_t{2} = som_batchtrain(sM{2}, sD{2}, sTr2);
sTr2 = som_train_struct(sM_t{2}, sD{2}, 'algorithm', 'batch', 'phase', 'finetune');
sM_t{2} = som_batchtrain(sM_t{2}, sD{2}, sTr2);
See the SOM Toolbox documentation for the meaning of each function
and each structure. The variables sD, sM, sM_t are cell arrays containing respectively the two datasets, the two untrained maps and the two trained maps. We preferred to keep the untrained and the trained maps in separate variables. We used the
batch training algorithm because it speeds up the training.
The MATLAB code we used to calculate the relative densities $rd_1$, $rd_2$ is as follows:

[c1, p1, err1, ind1] = kmeans_clusters(sM_t{1});  % cluster the codebook vectors
[density1, radius] = som_density(sM_t{1}, sM_t{1}.codebook, ...
    'kp', p1{dataset1_knum});   % dataset1_knum: chosen number of clusters
[density2] = som_density(sM_t{2}, sM_t{1}.codebook, 'radius', radius);
rd{1} = log2(density2 ./ density1);
[density1] = som_density(sM_t{1}, sM_t{2}.codebook, 'radius', radius);
[density2] = som_density(sM_t{2}, sM_t{2}.codebook, 'radius', radius);
rd{2} = log2(density2 ./ density1);
The first line calculates the clustering of the codebook vectors using the function
kmeans_clusters. The returned variable p1 is a cell array which contains in position $k$ the clustering information for a number of clusters equal to $k$. The partitioning
of the prototype vectors is needed for the calculation of the radius parameter present
in the Gaussian function used in the expression of the area density function. The radius is calculated in the next code line (first som_density invocation) for the first
area density calculation. The next three calculations of area densities use the radius
calculated by the first invocation and don't require the clustering information parameters obtained before, because they don't need to calculate the radius.
The function som_density is not part of the SOM Toolbox package; the salient
part of this function is the calculation of the radius, reported in the following
code fragment:

U = som_umat(M, sTopol, mode, 'mask', mask);
[mean_neighbors_dist_cluster] = neighbors_dist(U, ...
    sTopol.msize, sTopol.lattice, kp, knum);
mean_neighbors_dist = mean(mean_neighbors_dist_cluster);
r = quart * mean_neighbors_dist;
The first line calculates the U-distance matrix and the second line calculates the mean
distances between neighbors in each prototype-vector cluster; the parameter kp is a
vector containing the clustering information of each prototype vector and knum is
the number of clusters.
4.4 Results on synthetic datasets
First synthetic example
Fig. 1 shows the datasets used to demonstrate how the relative density visualization method performs when there are lost clusters, new clusters, and changes in cluster density. The two datasets consist of a superposition of four sets of normally
distributed (Gaussian) 2-D data points. The variance of each normally distributed
set was set to the common value 0.2, while the mean values were chosen
so that the resulting sets do not (practically) overlap. The figure shows both
the datasets and the denormalized codebook vectors of the trained maps.
Comparing the top portion of the figure with the bottom one, we can see that, going
from the first dataset toward the second, cluster A is lost, the new cluster E appears,
and the two clusters B and D change density: the first becomes denser and the second
less dense.
Fig. 1. Datasets used to show the capacity of the Relative Density Visualization to detect lost
clusters, new clusters, and clusters with a density variation. Besides the dataset
points (blue points), the figure shows the denormalized codebook vectors of the maps trained
on them (red crosses).
Both datasets consist of four sets of normally distributed (Gaussian) 2-D points. Each set
was given a common variance (0.2), while the mean values were set so that the four groups
are practically non-overlapping. Going from the first dataset (a) to the second one (b)
there is a lost cluster (A), a new cluster (E), a denser cluster (B) and a less dense cluster (D);
cluster C remains unchanged.
Fig. 2 shows the two trained maps using both the usual visualizations (component planes and U-distance matrix) and the relative density visualization. The
top portion of the figure shows the visualizations relative to the first map, whilst the
bottom portion is relative to the second map.
The $rd_1$ visualization shows clearly that there is a region of strong red coloring,
associated with a very low value of relative density ($rd_1 < -3$). This region
corresponds to cluster A, which is lost going toward the second map.
There are two further regions, one colored light blue and the other light
red. The light blue indicates an increase of density (cluster B), while the light red
indicates a decrease of density (cluster D).
Finally, there is a neutral zone (white color, cluster C) which corresponds to an unchanged cluster.
As mentioned before, $rd_1$ is able to show the changes of the clusters present
in the first map (A, B, C, D), but it cannot show the changes relative to the clusters that
are only present in the second map; that is, it cannot detect the creation of new clusters
(like cluster E). That kind of information is instead obtainable from $rd_2$.
The visualization of $rd_2$ in the bottom part of the figure shows clearly a zone with a
strong relative density value ($rd_2 > +3$) corresponding to the dark blue coloring.
This region is associated with the emerging cluster E. The other regions (light
blue, light red, white) are the same identified by the $rd_1$ visualization.
Second synthetic example
Fig. 3 shows the datasets used to demonstrate how the relative density visualization
method is able to highlight a shift of a cluster centroid. The two datasets are similar
to the ones used in Fig. 1. The difference is that in the second dataset cluster
A is not lost anymore but its centroid shifts, and the emerging cluster E is no
longer present. The figure shows both the datasets and the denormalized
codebook vectors of the trained maps. The denser and less dense clusters B and D are
the same as those of the datasets of Fig. 1.
Fig. 3. Datasets used to show the capacity of the Relative Density Visualization to detect a shift
of a cluster centroid. Besides the dataset points (blue points), the figure shows the denormalized codebook vectors of the maps trained on them (red crosses).
Both datasets consist of four sets of normally distributed (Gaussian) 2-D points. Each set
was given a common variance (0.2), while the mean values were set so that the four groups
are practically non-overlapping. Going from the first dataset (a) to the second one (b)
there is a shifted cluster (A), a denser cluster (B) and a less dense cluster (D);
cluster C remains unchanged.
Fig. 4 shows the two trained maps using both the usual visualizations (component planes and U-distance matrix) and the relative density visualization. The
top portion of the figure shows the visualizations relative to the first map, whilst the
bottom portion is relative to the second map.
We can draw the same conclusions about the clusters B, C, D as we did for the previous example shown in Fig. 2: in the visualizations $rd_1$ and $rd_2$, cluster B
has increased its density (light blue coloring), cluster D has decreased its density
(light red coloring) and cluster C is about unchanged (there is a very light red coloring
but it is negligible). See the previous example for more details.
What is worth our attention is the region of cluster A: there is a light red coloring on $rd_1$ and a light blue coloring on $rd_2$, both not covering the entire cluster. When
the border of the colored region crosses the inner part of a cluster, it means there is a cluster enlargement (blue coloring on $rd_2$ and no coloring on $rd_1$), a cluster shrinking
(red coloring on $rd_1$ and no coloring on $rd_2$), or a shift of the cluster centroid (both red
coloring on $rd_1$ and blue coloring on $rd_2$). Our case corresponds to the third configuration (both partial colorings): indeed cluster A undergoes a centroid
shift.
Fig. 4. Visualizations of the maps trained with the dataset shown in Fig. 3a (a) and in
Fig. 3b (b). The figure shows the two component planes, the U-distance matrix and the relative density calculated on the units (that is, on their codebook vectors) of the considered map.
The contours of the clusters, as obtained from the U-distance matrix, are outlined on the figure
(blue color indicates a high distance in the data space while yellow indicates a low
distance, so the units with a blue color represent a region of separation between clusters).
The relative densities $rd_1$, $rd_2$ give the same indication about the clusters B, C, D as the previous example shown in Fig. 2: see the previous example for more details. What is worth our attention is the region of cluster A: there is a light red coloring on $rd_1$ and a light blue coloring
on $rd_2$, both not covering the entire cluster; when the border of the colored region crosses the inner
part of a cluster it means there is an enlargement (blue coloring on $rd_2$), a shrinking (red
coloring on $rd_1$), or a shift of the cluster centroid (both).
5 GHSOM (2)
The topology preservation capability of the Self-Organizing Maps allows creating a
low-dimensional representation of a dataset, e.g. of a collection of documents, so as to
organize it and make it easy to search for the desired information. As the amount of information to be represented grows, the map needed to organize it becomes larger. A
large map, even if low-dimensional, used to represent the whole dataset makes it
hard to find a particular data item of interest. Moreover, in the single-map representation,
although the reduction of dimensionality simplifies the visualization of the data,
the hierarchical structure of the data itself is lost. The Growing Hierarchical
Self-Organizing Map is conceived with the idea of distributing the dataset to be represented over several distinct sub-maps, each specialized on a specific portion of the data
space and linked together by a hierarchical relationship. In addition, each sub-map
can grow in size to fit the detail of representation needed. This multilevel approach
not only performs the dimensionality reduction of a dataset without losing its topology, as the ordinary SOM does, but it also makes it possible to maintain to some
degree its hierarchical structure.
5.1 The algorithm
The key idea is to use multiple layers of distinct SOMs. The first layer contains only one SOM. Each unit of this map can be expanded into a finer SOM in the next (lower-level) layer. The same applies to the units of the maps of this new layer, and so on:
the algorithm goes ahead until a predetermined level of detail is reached (see Fig. 5).
In addition, for every map added to the structure we use an incrementally growing
version of the SOM: we start from a simple 2x2 map and eventually grow it if, after its
training, the mapping quality is not satisfying.
We start at layer zero, from a very rough representation of the data: just a
single map unit whose weight vector is set at the mean point of all the dataset vectors;
indeed this first unit has only the purpose of calculating the initial quantization error
associated with the data. In general the quantization error $qe_i$ of a unit $i$ is calculated as
the sum of all the distances between the weight vector of the unit and the data vectors mapped onto this unit; in particular $qe_0$ represents how far in total the dataset
vectors are from their mean vector location.
We proceed with the first true SOM at layer one, starting from a small 2x2 map
configuration, which is trained with the standard SOM algorithm.
For each SOM, the training process is repeated for a fixed number of iterations.
When the training process of a SOM is done, its mean quantization error $mqe_m$ is
calculated. The mean quantization error of a map is a mapping-quality index
defined as the average value of the quantization errors of all the units of that SOM.
If the $mqe_m$ of the map just added and trained is higher than a predefined fraction $\tau_1$
of the $qe_u$ of the unit $u$ in the preceding layer the map is linked to, a new row or a
new column of units is added to the SOM. The point of addition is set between the
map unit with the highest $qe$ (called the error unit) and its most dissimilar (in terms of its
weight vector) neighbor unit. The weights of the units added are initialized as the
average of their neighbors, and the training procedure is repeated as said above.
When the growth process is concluded, we can say that the new SOM represents
the preceding-layer unit from which it was expanded, but at a higher detail⁴.
The units of an added SOM which have a quantization error that is too high, higher than
a predefined threshold fraction $\tau_2$ of the initial quantization error $qe_0$ at layer zero, are
expanded into a SOM in the next lower-level layer. The parameter $\tau_2$ controls the
granularity of the data representation in each final unit of the hierarchy (not expanded into a
further map). The lower this parameter is, the more units require
expansion, and so the deeper the hierarchy produced.
Summing up, the structure can grow both in breadth and in depth. The shape of the
hierarchy is controlled by the two parameters $\tau_1$ and $\tau_2$. The size of each single map
tends to increase as the parameter $\tau_1$ is lower, while the depth of the hierarchy,
that is its expansion level, increases as the parameter $\tau_2$ is lower.
Fig. 5. Hierarchical structure of a GHSOM.
⁴ Actually the first-layer SOM provides the first level of detail in the representation of the
dataset, because the preceding layer is just a dummy single-unit map.
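The two growth tests just described can be sketched in a few lines. The following MATLAB fragment is our own illustration (the actual implementation used is the Java SOMToolbox, see Section 5.2); qe is the vector of per-unit quantization errors of a trained sub-map, qe_u the quantization error of its parent unit and qe_0 the layer-zero error:

mqe = mean(qe);                      % mapping quality of the trained sub-map
if mqe > tau1 * qe_u                 % breadth test against the parent unit
    [~, err_unit] = max(qe);         % the error unit
    % insert a new row or column of units between err_unit and its most
    % dissimilar neighbor, initialize them as neighbor averages, retrain
end
to_expand = find(qe > tau2 * qe_0);  % depth test: each of these units
                                     % gets a new 2x2 child map on the next layer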
5.2 Implementation of the GHSOM: Java SOMToolbox
We used the package Java SOMToolbox⁵ to implement the GHSOM algorithm. It
contains several modules to train and visualize a SOM. In particular we used the
module GHSOM to grow a hierarchy of maps, and SOMViewer to visualize the
results.
5.3 Experimental results
We used a dataset consisting of 101 animals described in a data space with a dimensionality of 20. The components of such a space are simply Boolean values corresponding to the following attributes:
hair, feathers, eggs, milk, airborne, aquatic, predator, toothed, backbone, breathes,
venomous, fins, 2_legs, 4_legs, 5_legs, 6_legs, 8_legs, tail, domestic, catsize.
We ran two tests to show how the hierarchical structure can be shaped by means of the
two parameters $\tau_1$ and $\tau_2$.
Test 1: three-layer hierarchy
Training the GHSOM with the parameters $\tau_1 = 0.070$ and $\tau_2 = 0.0035$ produced a hierarchical structure consisting of three layers, as depicted in Fig. 6.
The first-layer map has been expanded to form a 2×4 grid. Each unit of this layer is
further expanded into a map in the second layer. Some of the units in the second-layer
maps are even further expanded in a third layer. As can be seen from the figure, the
algorithm is able to organize the data in a meaningful way. For example the Aquatics,
Mammalians, Birds, etc. are each organized into a separate sub-map in the second layer.
In addition, the sub-maps representing related species are close together, like the second-layer sub-map representing the quadruped mammals and that representing
the mammals which are not quadrupeds.
Fig. 7 shows the component planes of the first-layer map units.
⁵ The package is available at the URL http://www.ifs.tuwien.ac.at/dm/somtoolbox/
[Figure: GHSOM hierarchy diagram; second-layer sub-maps carry category labels such as Quadruped Mammalians, Aquatic, and Birds, with several units expanded into third-layer maps listing individual species.]
Fig. 6. Hierarchy produced with the parameters $\tau_1 = 0.070$, $\tau_2 = 0.0035$. The first-layer map has grown to a 2×4 configuration and the hierarchy has reached a depth of 3 layers. The figure indicates some of the categories grouped by the GHSOM.
Fig. 7. Component planes of the first-layer map of the GHSOM trained in test 1.
Test 2: two-layer hierarchy
Training the GHSOM with the parameters $\tau_1 = 0.025$ and $\tau_2 = 0.0035$ produced a hierarchical structure consisting of two layers, as depicted in Fig. 8.
The first-layer map has been expanded up to a 4×5 grid. Each unit of this layer is
further expanded into a map in the second layer.
Fig. 9 shows the component planes of the first-layer map units.
[Figure: GHSOM hierarchy diagram; each first-layer unit expands into a second-layer sub-map listing individual species.]
Fig. 8. Hierarchy produced with the parameters $\tau_1 = 0.025$, $\tau_2 = 0.0035$. The first-layer map has grown to a 4×5 configuration and the hierarchy has reached a depth of 2 layers.
Fig. 9. Component planes of the first-layer map of the GHSOM trained in test 2.
6 ESOM (1)
In the context of data clustering and vector quantization, one of the major challenges is
the ability to deal with an online data stream characterized by unknown or time-dependent statistics. The simplest approach is the k-means in its online version,
where for each incoming input data vector $\mathbf{x}$, only the prototype vector closest to $\mathbf{x}$
is updated, by dragging it nearer to $\mathbf{x}$ (Winner-Takes-All scheme). This approach
is known as the local k-means algorithm (6). While this method is quite straightforward, it can suffer from confinement to local minima. The SOM algorithm (4; 5) is
able to overcome this problem because it uses a soft approach in which not only the
winner of the competition is updated but also its neighbors, depending on their proximity to the input vector. In addition it has the well-known topology-preserving ability
which arranges the prototype vectors so as to mirror the statistics of the data. The SOM algorithm uses a fixed predetermined topology of the units in the low-dimensional map
space (aka feature space; usually 2-D or 3-D) which defines their order and their
neighborhood relationship. When the original manifold is too complicated to be followed by a fixed-topology low-dimensional map space, this leads to a highly folded
feature map.
The topology constraint on the feature map is removed in the neural-gas model (7),
the dynamic cell structure (DCS-GCS) (8) and the growing neural gas (GNG) (9).
In all these methods the map structure is built dynamically to fit the incoming data, but
they need to calculate local resources for the prototypes, which increases the computational effort and thus reduces efficiency.
The ESOM model is similar to the GNG but it does not require local resource calculations, and its node insertion mechanism is more efficient than those of the DCS and GNG.
6.1 The ESOM algorithm
We start with an empty map, adding new nodes as the input vectors arrive.
We will use the following symbols:
$W(t) = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_N\}$ indicates the set of the prototype nodes at the $t$-th step;
$N$ is the current number of nodes;
$d$ is the dimension of the input manifold;
$C = \{(i, j)\}$ is the set of all the unordered pairs $(i, j)$ of nodes $\mathbf{w}_i$, $\mathbf{w}_j$ which are connected together.
The algorithm can be schematized as follows:
1. A new input $\mathbf{x}$ is presented to the network.
2. Consider the set $S(\mathbf{x})$ of prototype nodes that match the input vector within a predefined threshold $\epsilon$, i.e. $S(\mathbf{x}) = \{\mathbf{w}_i : \|\mathbf{x} - \mathbf{w}_i\| \le \epsilon\}$.
3. If $S(\mathbf{x})$ is empty go to step 4 (node insertion), otherwise go to step 5 (node updating).
4. Node insertion. Create a new node $\mathbf{w}_{N+1}$ that matches exactly the input vector $\mathbf{x}$, insert it into $W$ and increment $N$ by one:
$\mathbf{w}_{N+1} = \mathbf{x}$, $N = N + 1$.
Connect the new node with its two nearest neighbors $\mathbf{w}_{i_1}$, $\mathbf{w}_{i_2}$ (if they exist, that is if $W$ has at least two elements) and connect them to each other as well; if $W$ has only one element, connect only the new node with it:
$C = C \cup \{(N, i_1), (N, i_2), (i_1, i_2)\}$ if $W$ has at least two elements; $C = C \cup \{(N, i_1)\}$ if $W$ has only one element.
Go to step 6.
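A minimal MATLAB sketch of steps 1-4 above (our own illustration, covering only the fragment reported here); W is the N-by-d prototype matrix, C the two-column connection list and eps_thr the match threshold:

dist = sqrt(sum((W - x).^2, 2));          % distances of the input x to all prototypes
if isempty(dist) || all(dist > eps_thr)   % S(x) empty: no node matches x
    W = [W; x];  N = size(W, 1);          % step 4: insert a new node w_N = x
    [~, idx] = sort(dist);                % indices of the nearest existing nodes
    if numel(idx) >= 2                    % connect w_N to the two nearest, and them together
        C = [C; N idx(1); N idx(2); idx(1) idx(2)];
    elseif numel(idx) == 1                % only one node exists: connect just to it
        C = [C; N idx(1)];
    end
else
    % step 5 (node updating) would follow here; it is not part of this fragment
end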
Bibliography
1. D. Deng and N. Kasabov. On-line Pattern Analysis by Evolving Self-Organizing Maps.
2. M. Dittenbach, A. Rauber and D. Merkl. Business, Culture, Politics, and Sports - How to Find Your Way Through a Bulk of News? On Content-Based Hierarchical Structuring and Organization of Large Document Archives.
3. Denny, G. J. Williams and P. Christen. ReDSOM: Relative Density Visualization of Temporal Changes.
4. T. Kohonen. Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43, 1982, pp. 59-69.
5. T. Kohonen. The self-organizing map. Proceedings of the IEEE, vol. 78, no. 9, September 1990, pp. 1464-1480.
6. J. L. Marroquin and F. Girosi. Some extensions of the k-means algorithm for image segmentation and pattern classification. Technical Report 1390, 1993.
7. T. M. Martinetz, S. G. Berkovich and K. J. Schulten. "Neural-gas" network for vector quantization and its application to time-series prediction. IEEE Transactions on Neural Networks, 4, 1993.
8. J. Bruske and G. Sommer. Dynamic cell structure learns perfectly topology preserving map. Neural Computation, 7, 1995.
9. B. Fritzke. Growing cell structures - a self-organizing network for unsupervised and supervised learning. Neural Networks, 7, 1994.