Analysis of some extensions of the
Self-Organizing Maps: Evolving SOMs (ESOM) (1),
Growing Hierarchical SOMs (GHSOM) (2),
Relative Density SOMs (ReDSOM) (3)
Domenico Leuzzi
Abstract.
We describe and analyze some interesting extensions of the Self-Organizing
Map (SOM) algorithm as proposed originally by Kohonen, namely the Evolving SOM (ESOM) and the Growing Hierarchical SOM (GHSOM), as well as a
visualization method for identifying changes of the cluster structures in temporal datasets, called ReDSOM. The ESOM algorithm differentiates itself from
its parent SOM because the topology of the map it creates is not fixed as in the
SOM but is adaptively built on the basis of the dataset distribution, thus reducing the number of map units required to achieve a given quantization error. The GHSOM algorithm tries to reproduce the hierarchy of the dataset
by creating a multilevel map in which each unit of one map level can be explained
more deeply by a map on the next level. The multilevel approach not only optimizes the use of the units (a lower-level map is added only if it is necessary to improve the quantization error) but also allows a quicker navigation of the maps
obtained.
The ReDSOM visualization method is useful when we have a dataset which
evolves in time and we want to compare, in the map space, the clustering
structures of dataset snapshots taken at two different time instants. This method
allows identifying visually, by means of different colorings, emerging/lost clusters,
cluster enlargement/shrinking, density increases/decreases, and cluster movements.
1 Introduction
We start in Section 2 with a brief presentation of the SOM algorithm as originally
developed by Kohonen (4; 5).
In Section 3 we describe the tools used for the experimental tests.
Section 4 is dedicated to the ReDSOM visualization method, Sections 5 and 6 to
the GHSOM and ESOM algorithms respectively.
2 Self-Organizing Maps
A SOM (also known as SOFM, Self-Organizing Feature Map, or Kohonen Map)
is an artificial neural network based on unsupervised competitive learning (4; 5).
A low-dimensional grid of neurons (aka units), usually 2-D, is built following a
fixed and predetermined topology (i.e. rectangular or hexagonal). This grid constitutes
the so-called map space (or output space). Whichever topology is used, each unit
is connected with a number of neighboring units which are equidistant in the map
space: in the rectangular topology each unit is surrounded by four equidistant neighboring units, and in the hexagonal topology it is surrounded by six units¹.
The grid units are initialized in the data space, that is, each unit weight vector (aka
the codebook vector, prototype vector or reference vector) is given an initial value
taken from the input space. The initialization can be random or linear. In the latter
case the initial values are chosen in an orderly fashion along the first $d$ principal components of the data, $d$ being the map space dimensionality.
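As an illustration only (not toolbox code), the following is a minimal MATLAB sketch of such a linear initialization for a 2-D map, assuming a plain data matrix X (n-by-dim), a map size msize, and the Statistics Toolbox function pca; the som_lininit function used later wraps equivalent logic:

% Hedged sketch: spread the initial codebook along the first two
% principal components of the data matrix X.
[coeff, ~, latent] = pca(X);          % principal directions and variances
mu = mean(X, 1);                      % center of the data cloud
[g1, g2] = ndgrid(linspace(-1, 1, msize(1)), linspace(-1, 1, msize(2)));
M0 = mu + (g1(:) * sqrt(latent(1))) * coeff(:,1)' ...
        + (g2(:) * sqrt(latent(2))) * coeff(:,2)';  % one codebook row per unit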
The map is then trained using a competitive unsupervised learning algorithm.
At each training step $t$, a data point $\mathbf{x}(t)$ is randomly selected from the dataset. Then the best matching unit (BMU) $c$ corresponding to $\mathbf{x}(t)$, i.e. the unit with the weight vector closest to $\mathbf{x}(t)$, is selected from the map in accordance with (1)

$\|\mathbf{x}(t) - \mathbf{m}_c(t)\| = \min_i \|\mathbf{x}(t) - \mathbf{m}_i(t)\|$ (1)

After that, not only the BMU but also all its neighbors are adjusted according to the adaptation rule

$\mathbf{m}_i(t+1) = \mathbf{m}_i(t) + h_{c,i}(t)\,[\mathbf{x}(t) - \mathbf{m}_i(t)]$, $i = 1, \ldots, N$ (2)

where

$h_{c,i}(t) = \alpha(t)\,\exp\left(-\frac{\|\mathbf{r}_c - \mathbf{r}_i\|^2}{2\sigma^2(t)}\right)$

is the neighborhood kernel function (here in its common Gaussian form), dependent on the map-space distance between the winning neuron $c$ and the neighbor unit $i$, as well as on the time $t$. The parameter $\sigma(t)$ controls the size of the zone of neurons around the winning one that are affected by the
update, while $\alpha(t)$ is a function decreasing in time (e.g. linearly or exponentially) that controls the strength of the adaptation.
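The whole training step (1)-(2) fits in a few lines of MATLAB. The following sketch is our own illustration (not toolbox code); M is the N-by-dim codebook matrix, R the N-by-2 map-space coordinates of the units:

function M = som_step(M, R, x, sigma, alpha)
% One online SOM step per (1)-(2): find the BMU, then move every unit
% toward the sample x proportionally to the neighborhood kernel.
[~, c] = min(sum((M - x).^2, 2));                        % (1) best matching unit
h = alpha * exp(-sum((R - R(c,:)).^2, 2) / (2*sigma^2)); % Gaussian kernel h_{c,i}
M = M + h .* (x - M);                                    % (2) adaptation rule
end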
¹ The border units are surrounded by fewer units, unless the lattice is wrapped in a cylindrical
or toroidal structure.
3 Tools used for the experimental tests
For implementing the ReDSOM visualization method we used the SOM Toolbox
2.0², a powerful package developed for MATLAB that allows managing
every aspect of the SOMs, from the initialization, to the training, up to the
visualization both in the map space and in the data space or the projected space. Detailed information can be obtained from the documentation and from the source code.
For the implementation of the GHSOM algorithm we used the package Java SOMToolbox. It contains several modules to train and visualize a SOM. In particular we used the module GHSOM to grow a hierarchy of maps, and SOMViewer
to visualize the results.
We also used the ESOM Toolbox³ for MATLAB to run some tests on the ESOM algorithm.
4 ReDSOM (3)
Suppose we have a temporal dataset $D(t)$, that is, a dataset whose distribution varies
with time. We want to compare the clustering distribution of the dataset at two different time instants $t_1$ and $t_2$, that is of the two datasets $D(t_1)$, $D(t_2)$. The clustering itself and the comparison are much simpler if carried out in the low-dimensional space of
two SOMs of equal topology, $M(t_1)$, $M(t_2)$, trained on such two datasets. In order to be able to compare the two maps directly, they must have
the same orientation, and the datasets on which they are trained have to be normalized
using the same normalization method and parameters. So the procedure to obtain two
maps that can be compared by the method concerned here is as follows:
1. Normalize both datasets $D(t_1)$, $D(t_2)$ using the same normalization method (e.g. the common z-score) and the same parameters.
2. Initialize map $M(t_1)$ using ordered values.
3. Train map $M(t_1)$ using the dataset $D(t_1)$.
4. Initialize map $M(t_2)$ using the codebook vectors of the previously trained map $M(t_1)$.
5. Train map $M(t_2)$ using the dataset $D(t_2)$.
The maps so obtained are directly comparable with each other. That is, if we define
a density function on the data space, related to the density of the units in the data
space (considering their prototype vectors), we can compare one-to-one the
density of units of the two maps (the two map units in the same position of the two
maps are compared together).
² The package is available at the URL http://www.cis.hut.fi/projects/somtoolbox/download/
³ The package is available at the URL http://www.aut.ac.nz/__data/assets/file/0015/10176/ecos_esom.zip
4.1 Area Density and Relative Density Definitions
We define the area density $d_k(\mathbf{v})$ of the map $M(t_k)$, calculated at the data-space
vector $\mathbf{v}$, as the sum of the values of a Gaussian kernel function centered on the
vector $\mathbf{v}$ and evaluated at the prototype vectors $\mathbf{m}_i$ of the map units, as shown in (3)

$d_k(\mathbf{v}) = \sum_{i=1}^{N} \exp\left(-\frac{\|\mathbf{v} - \mathbf{m}_i\|^2}{2r^2}\right)$ (3)

The radius $r$ defines the width of the kernel function, and its value should be chosen in accordance with the mean distance between neighboring units. It was observed
that a quartile of it (e.g. the third quartile) is a balanced choice.
Now we define the relative density $rd(\mathbf{v})$ of the map $M(t_2)$ with respect to the map $M(t_1)$ as the logarithm of the ratio between the area
density of the map $M(t_2)$ and the area density of the map $M(t_1)$, both calculated at the
same location $\mathbf{v}$, as shown in (4)

$rd(\mathbf{v}) = \log_2 \frac{d_2(\mathbf{v})}{d_1(\mathbf{v})}$ (4)

The use of the logarithm in (4) makes the values of the density ratio negative when the ratio is below 1 (decrease of density), and positive
when the ratio is above 1 (increase of density).
The base-two logarithm gives a convenient scale: for example, a value
of +2 indicates a density four times higher on the map $M(t_2)$, while a value of
-2 indicates a density four times lower on the map $M(t_2)$. Based on experimental observation, values of $rd(\mathbf{v})$ less than -3 indicate that the
location of vector $\mathbf{v}$ is no longer occupied on the next map $M(t_2)$ (it is lost), while
values greater than +3 indicate that the location of vector $\mathbf{v}$ was not occupied on the
previous map $M(t_1)$ but is on the next map $M(t_2)$ (it is new).
The relative density calculation is performed only on the prototype vectors of the
two maps $M(t_1)$, $M(t_2)$ and not on the actual data vectors. So the running time of the
calculation for a map is quadratic in the number of units $N$ and not in the number of data points $n$, where $N \ll n$.
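To make (3) and (4) concrete, here is a minimal MATLAB sketch of the two formulas as reconstructed above (our own illustration, not the som_density function used later); M1 and M2 are the codebook matrices of the two trained maps and r the kernel radius:

% Area density (3) of a map with codebook M, evaluated at the rows of V,
% and relative density (4) on the prototype vectors of the first map.
area_density = @(M, V, r) arrayfun(@(k) ...
    sum(exp(-sum((M - V(k,:)).^2, 2) / (2*r^2))), (1:size(V,1))');
d1 = area_density(M1, M1, r);   % density of map 1 at map-1 prototypes
d2 = area_density(M2, M1, r);   % density of map 2 at the same locations
rd1 = log2(d2 ./ d1);           % eq. (4): negative = lost/less dense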
4.2 Relative Density Visualization
As said in the previous section, the relative density calculation is performed
only on the prototype vectors of the two maps $M(t_1)$, $M(t_2)$. Let us use $rd_1$ and $rd_2$
as shorthand to indicate the values of the relative density calculated, respectively,
on the prototype vectors of the first map and on the prototype vectors of the second map.
We visualize $rd_1$ and $rd_2$ on the respective maps, in a gradation of blue for positive
values and in a gradation of red for negative values, as shown in Fig. 2.
The $rd_1$ visualization should be used to detect a density decrease of the vectors of
the first map (negative values of relative density). In fact, if we want to detect whether a
vector of the first map has been lost or has decreased its density in the second map,
we have to choose a location vector for the calculation of the relative density where
that vector is surely present, that is, among the reference vectors of the first map.
Similarly, the $rd_2$ visualization should be used to detect the vectors that in the second
map have increased their density with respect to the same vector location on the first
map.
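As a sketch of how such a coloring can be displayed with the SOM Toolbox (assuming the rd values computed in the next section; som_show accepts a per-unit color-coding vector through its 'color' argument):

% Show rd_1 on the lattice of the first trained map; with a diverging
% colormap, negative values appear red and positive ones blue.
som_show(sM_t{1}, 'color', {rd{1}, 'rd_1'});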
4.3 MATLAB Implementation using SOM Toolbox
To implement the Relative Density Visualization algorithm in MATLAB using
the SOM Toolbox, first of all it is necessary to initialize and train the two maps
relative to the two snapshots of the dataset we want to compare, following the procedure indicated in Section 4 to obtain two directly comparable maps. The MATLAB
code we used to do that is as follows:
sD{1} = som_normalize(sD{1}, 'var');  % normalize sD{1} using the 'var' method (z-score)
sD{2} = som_normalize(sD{2}, sD{1});  % normalize sD{2} using the same method and parameters as sD{1}
sM{1} = som_lininit(sD{1});           % initialize map 1 linearly
sM{1}.comp_names{1} = 'x';
sM{1}.comp_names{2} = 'y';
% Train map 1 (batch training); the trained map is stored in sM_t{1}
sTr1 = som_train_struct(sM{1}, sD{1}, 'algorithm', 'batch', 'phase', 'rough');
sM_t{1} = som_batchtrain(sM{1}, sD{1}, sTr1);
sTr1 = som_train_struct(sM_t{1}, sD{1}, 'algorithm', 'batch', 'phase', 'finetune');
sM_t{1} = som_batchtrain(sM_t{1}, sD{1}, sTr1);
sM{2} = sM{1};                        % map 2: same topology and codebook vectors as map 1
% Train map 2 (batch training)
sTr2 = som_train_struct(sM{2}, sD{2}, 'algorithm', 'batch', 'phase', 'rough');
sM_t{2} = som_batchtrain(sM{2}, sD{2}, sTr2);
sTr2 = som_train_struct(sM_t{2}, sD{2}, 'algorithm', 'batch', 'phase', 'finetune');
sM_t{2} = som_batchtrain(sM_t{2}, sD{2}, sTr2);
See the SOM Toolbox documentation for the meaning of each function
and each structure. The variables sD, sM, sM_t are cell arrays containing respectively the two datasets, the two untrained maps and the two trained maps. We preferred to keep the untrained and the trained maps in separate variables. We used the
batch training algorithm because it speeds up the training.
The MATLAB code we used to calculate the relative densities $rd_1$, $rd_2$ is as follows:

[c1, p1, err1, ind1] = kmeans_clusters(sM_t{1});  % cluster the codebook vectors
[density1, radius] = som_density(sM_t{1}, sM_t{1}.codebook, ...
    'kp', p1{dataset1_knum});   % dataset1_knum: chosen number of clusters
[density2] = som_density(sM_t{2}, sM_t{1}.codebook, 'radius', radius);
rd{1} = log2(density2 ./ density1);
[density1] = som_density(sM_t{1}, sM_t{2}.codebook, 'radius', radius);
[density2] = som_density(sM_t{2}, sM_t{2}.codebook, 'radius', radius);
rd{2} = log2(density2 ./ density1);
The first line calculates the clustering of the codebook vectors using the function
kmeans_clusters. The returned variable p1 is a cell array which contains in position $k$ the clustering information for a number of clusters equal to $k$. The partitioning
of the prototype vectors is needed for the calculation of the radius parameter present
in the Gaussian function used in the expression of the area density function. The radius is calculated in the next code line (first som_density invocation) for the first
area density calculation. The next three calculations of area densities use the radius
calculated by the first invocation and don't require the clustering information parameters obtained before, because they don't need to calculate the radius.
The function som_density is not part of the SOM Toolbox package; the salient
part of this function is the calculation of the radius, reported in the following
code fragment:

U = som_umat(M, sTopol, mode, 'mask', mask);
[mean_neighbors_dist_cluster] = neighbors_dist(U, ...
    sTopol.msize, sTopol.lattice, kp, knum);
mean_neighbors_dist = mean(mean_neighbors_dist_cluster);
r = quart * mean_neighbors_dist;
The first line calculates the U-distance matrix and the second line calculates the mean
distances between neighbors in each prototype-vector cluster; the parameter kp is a
vector containing the clustering information of each prototype vector and knum is
the number of clusters.
4.4 Results on synthetic datasets
First synthetic example
Fig. 1 shows the datasets used to demonstrate how the relative density visualization method performs when there are lost clusters, new clusters, and changes in cluster density. The two datasets consist of a superposition of four sets of normally
distributed (Gaussian) 2-D data points. The variance of each normally distributed
set was set to the common value 0.2, while the mean values were chosen
so that the resulting sets do not (practically) overlap. The figure shows both
the datasets and the denormalized codebook vectors of the trained maps.
Comparing the top portion of the figure with the bottom one, we can see that, going
from the first dataset toward the second, cluster A is lost, the new cluster E appears,
and the two clusters B and D change density: the first becomes denser and the second
less dense.
Fig. 1. Datasets used to show the capacity of the Relative Density Visualization to detect lost
clusters, new clusters, and clusters with a density variation. Besides the dataset
points (blue points), the figure shows the denormalized codebook vectors of the maps trained
on them (red crosses).
Both datasets consist of four sets of normally distributed (Gaussian) 2-D points. Each set
was given a common variance (0.2), while the mean values were set so that the four groups
are practically non-overlapping. Going from the first dataset (a) to the second one (b)
there is a lost cluster (A), a new cluster (E), a denser cluster (B) and a less dense cluster (D);
cluster C remains unchanged.
Fig. 2 shows the two trained maps using both the usual visualizations (component planes and U-distance matrix) and the relative density visualization. The
top portion of the figure shows the visualizations relative to the first map, whilst the
bottom portion is relative to the second map.
The $rd_1$ visualization shows clearly that there is a region of strong red coloring,
associated with a very low value of relative density ($rd_1 < -3$). This region
corresponds to cluster A, which is lost going toward the second map.
There are two further regions, one colored light blue and the other light
red. The light blue indicates an increase of density (cluster B), while the light red
indicates a decrease of density (cluster D).
Finally, there is a neutral zone (white color, cluster C) which corresponds to an unchanged cluster.
As mentioned before, $rd_1$ is able to show the changes of the clusters present
in the first map (A, B, C, D), but it cannot show the changes relative to the clusters that
are only present in the second map; that is, it cannot detect the creation of new clusters
(like cluster E). That kind of information is instead obtainable from $rd_2$.
The visualization of $rd_2$ in the bottom part of the figure shows clearly a zone with a
strong relative density value ($rd_2 > +3$) corresponding to the dark blue coloring.
This region is associated with the emerging cluster E. The other regions (light
blue, light red, white) are the same identified by the $rd_1$ visualization.
Second synthetic example
Fig. 3 shows the datasets used to demonstrate how the relative density visualization
method is able to highlight a shift of a cluster centroid. The two datasets are similar
to the ones used in Fig. 1. The difference is that in the second dataset cluster
A is not lost anymore but its centroid shifts, and the emerging cluster E is no
longer present. The figure shows both the datasets and the denormalized
codebook vectors of the trained maps. The denser and less dense clusters B and D are
the same as those of the datasets of Fig. 1.
Fig. 3. Datasets used to show the capacity of the Relative Density Visualization to detect a shift
of a cluster centroid. Besides the dataset points (blue points), the figure shows the denormalized codebook vectors of the maps trained on them (red crosses).
Both datasets consist of four sets of normally distributed (Gaussian) 2-D points. Each set
was given a common variance (0.2), while the mean values were set so that the four groups
are practically non-overlapping. Going from the first dataset (a) to the second one (b)
there is a shifted cluster (A), a denser cluster (B) and a less dense cluster (D);
cluster C remains unchanged.
Fig. 4 shows the two trained maps using both the usual visualizations (component planes and U-distance matrix) and the relative density visualization. The
top portion of the figure shows the visualizations relative to the first map, whilst the
bottom portion is relative to the second map.
We can draw the same conclusions about the clusters B, C, D as we did for the previous example shown in Fig. 2: in the visualizations $rd_1$ and $rd_2$, cluster B
has increased its density (light blue coloring), cluster D has decreased its density
(light red coloring) and cluster C is about unchanged (there is a very light red coloring
but it is negligible). See the previous example for more details.
What is worth our attention is the region of cluster A: there is a light red coloring on $rd_1$ and a light blue coloring on $rd_2$, both not covering the entire cluster. When
the border of the colored region crosses the inner part of a cluster, it means there is a cluster enlargement (blue coloring on $rd_2$ and no coloring on $rd_1$), a cluster shrinking
(red coloring on $rd_1$ and no coloring on $rd_2$), or a shift of the cluster centroid (both red
coloring on $rd_1$ and blue coloring on $rd_2$). Our case corresponds to the third configuration (both partial colorings): indeed cluster A undergoes a centroid
shift.
Fig. 4. Visualizations of the maps trained with the dataset shown in Fig. 3a (a) and in
Fig. 3b (b). The figure shows the two component planes, the U-distance matrix and the relative density calculated on the units (that is, on their codebook vectors) of the considered map.
The contours of the clusters, as obtained from the U-distance matrix, are outlined on the figure
(blue color indicates a high distance in the data space while yellow indicates a low
distance, so the units with a blue color represent a region of separation between clusters).
The relative densities $rd_1$, $rd_2$ give the same indication about the clusters B, C, D as the previous example shown in Fig. 2: see the previous example for more details. What is worth our attention is the region of cluster A: there is a light red coloring on $rd_1$ and a light blue coloring
on $rd_2$, both not covering the entire cluster; when the border of the colored region crosses the inner
part of a cluster it means there is an enlargement (blue coloring on $rd_2$), a shrinking (red
coloring on $rd_1$), or a shift of the cluster centroid (both).
5 GHSOM (2)
The topology preservation capability of the Self-Organizing Maps allows creating a
low-dimensional representation of a dataset, e.g. of a collection of documents, so as to
organize it and make it easy to search for the desired information. As the amount of information to be represented grows, the map needed to organize it becomes larger. A
large map, even if low-dimensional, used to represent the whole dataset makes it
hard to find a particular data item of interest. Moreover, in the single-map representation,
although the reduction of dimensionality simplifies the visualization of the data,
the hierarchical structure of the data itself is lost. The Growing Hierarchical
Self-Organizing Map is conceived with the idea of distributing the dataset to be represented over several distinct sub-maps, each specialized on a specific portion of the data
space and linked together by a hierarchical relationship. In addition, each sub-map
can grow in size to fit the detail of representation needed. This multilevel approach
not only performs the dimensionality reduction of a dataset without losing its topology, as the ordinary SOM does, but it also makes it possible to maintain to some
degree its hierarchical structure.
5.1 The algorithm
The key idea is to use multiple layers of distinct SOMs. The first layer contains only one SOM. Each unit of this map can be expanded into a finer SOM in the next (lower-level) layer. The same applies to the units of the maps of this new layer, and so on:
the algorithm goes ahead until a predetermined level of detail is reached (see Fig. 5).
In addition, for every map added to the structure we use an incrementally growing
version of the SOM: we start from a simple 2x2 map and eventually grow it if, after its
training, the mapping quality is not satisfying.
We start at layer zero, from a very rough representation of the data: just a
single map unit whose weight vector is set at the mean point of all the dataset vectors;
indeed this first unit has only the purpose of calculating the initial quantization error
associated with the data. In general the quantization error $qe_i$ of a unit $i$ is calculated as
the sum of all the distances between the weight vector of the unit and the data vectors mapped onto this unit; in particular $qe_0$ represents how far in total the dataset
vectors are from their mean vector location.
We proceed with the first true SOM at layer one, starting from a small 2x2 map
configuration, which is trained with the standard SOM algorithm.
For each SOM, the training process is repeated for a fixed number of iterations.
When the training process of a SOM is done, its mean quantization error $mqe_m$ is
calculated. The mean quantization error of a map is a mapping-quality index
defined as the average value of the quantization errors of all the units of that SOM.
If the $mqe_m$ of the map just added and trained is higher than a predefined fraction $\tau_1$
of the $qe_u$ of the unit $u$ in the preceding layer the map is linked to, a new row or a
new column of units is added to the SOM. The point of addition is set between the
map unit with the highest $qe$ (called the error unit) and its most dissimilar (in terms of its
weight vector) neighbor unit. The weights of the units added are initialized as the
average of their neighbors, and the training procedure is repeated as said above.
When the growth process is concluded, we can say that the new SOM represents
the preceding-layer unit from which it was expanded, but at a higher detail⁴.
The units of an added SOM which have a quantization error that is too high, higher than
a predefined threshold fraction $\tau_2$ of the initial quantization error $qe_0$ at layer zero, are
expanded into a SOM in the next lower-level layer. The parameter $\tau_2$ controls the
granularity of the data representation in each final unit of the hierarchy (not expanded into a
further map). The lower this parameter is, the more units require
expansion, and so the deeper the hierarchy produced.
Summing up, the structure can grow both in breadth and in depth. The shape of the
hierarchy is controlled by the two parameters $\tau_1$ and $\tau_2$. The size of each single map
tends to increase as the parameter $\tau_1$ is lower, while the depth of the hierarchy,
that is its expansion level, increases as the parameter $\tau_2$ is lower.
Fig. 5. Hierarchical structure of a GHSOM.
⁴ Actually the first-layer SOM provides the first level of detail in the representation of the
dataset, because the preceding layer is just a dummy single-unit map.
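The two growth tests just described can be sketched in a few lines. The following MATLAB fragment is our own illustration (the actual implementation used is the Java SOMToolbox, see Section 5.2); qe is the vector of per-unit quantization errors of a trained sub-map, qe_u the quantization error of its parent unit and qe_0 the layer-zero error:

mqe = mean(qe);                      % mapping quality of the trained sub-map
if mqe > tau1 * qe_u                 % breadth test against the parent unit
    [~, err_unit] = max(qe);         % the error unit
    % insert a new row or column of units between err_unit and its most
    % dissimilar neighbor, initialize them as neighbor averages, retrain
end
to_expand = find(qe > tau2 * qe_0);  % depth test: each of these units
                                     % gets a new 2x2 child map on the next layer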
5.2 Implementation of the GHSOM: Java SOMToolbox
We used the package Java SOMToolbox⁵ to implement the GHSOM algorithm. It
contains several modules to train and visualize a SOM. In particular we used the
module GHSOM to grow a hierarchy of maps, and SOMViewer to visualize the
results.
5.3 Experimental results
We used a dataset consisting of 101 animals described in a data space with a dimensionality of 20. The components of such a space are simply Boolean values corresponding to the following attributes:
hair, feathers, eggs, milk, airborne, aquatic, predator, toothed, backbone, breathes,
venomous, fins, 2_legs, 4_legs, 5_legs, 6_legs, 8_legs, tail, domestic, catsize.
We ran two tests to show how the hierarchical structure can be shaped by means of the
two parameters $\tau_1$ and $\tau_2$.
Test 1: three-layer hierarchy
Training the GHSOM with the parameters $\tau_1 = 0.070$ and $\tau_2 = 0.0035$ produced a hierarchical structure consisting of three layers, as depicted in Fig. 6.
The first-layer map has been expanded to form a 2×4 grid. Each unit of this layer is
further expanded into a map in the second layer. Some of the units in the second-layer
maps are even further expanded in a third layer. As can be seen from the figure, the
algorithm is able to organize the data in a meaningful way. For example the Aquatics,
Mammalians, Birds, etc. are each organized into a separate sub-map in the second layer.
In addition, the sub-maps representing related species are close together, like the second-layer sub-map representing the quadruped mammals and that representing
the mammals which are not quadrupeds.
Fig. 7 shows the component planes of the first-layer map units.
⁵ The package is available at the URL http://www.ifs.tuwien.ac.at/dm/somtoolbox/
[Figure: GHSOM hierarchy diagram; second-layer sub-maps carry category labels such as Quadruped Mammalians, Aquatic, and Birds, with several units expanded into third-layer maps listing individual species.]
Fig. 6. Hierarchy produced with the parameters $\tau_1 = 0.070$, $\tau_2 = 0.0035$. The first-layer map has grown to a 2×4 configuration and the hierarchy has reached a depth of 3 layers. The figure indicates some of the categories grouped by the GHSOM.
Fig. 7. Component planes of the first-layer map of the GHSOM trained in test 1.
Test 2: two-layer hierarchy
Training the GHSOM with the parameters $\tau_1 = 0.025$ and $\tau_2 = 0.0035$ produced a hierarchical structure consisting of two layers, as depicted in Fig. 8.
The first-layer map has been expanded up to a 4×5 grid. Each unit of this layer is
further expanded into a map in the second layer.
Fig. 9 shows the component planes of the first-layer map units.
[Figure: GHSOM hierarchy diagram; each first-layer unit expands into a second-layer sub-map listing individual species.]
Fig. 8. Hierarchy produced with the parameters $\tau_1 = 0.025$, $\tau_2 = 0.0035$. The first-layer map has grown to a 4×5 configuration and the hierarchy has reached a depth of 2 layers.
Fig. 9. Component planes of the first-layer map of the GHSOM trained in test 2.
6 ESOM (1)
In the context of data clustering and vector quantization, one of the major challenges is
the ability to deal with an online data stream characterized by unknown or time-dependent statistics. The simplest approach is the k-means in its online version,
where for each incoming input data vector $\mathbf{x}$, only the prototype vector closest to $\mathbf{x}$
is updated, by dragging it nearer to $\mathbf{x}$ (Winner-Takes-All scheme). This approach
is known as the local k-means algorithm (6). While this method is quite straightforward, it can suffer from confinement to local minima. The SOM algorithm (4; 5) is
able to overcome this problem because it uses a soft approach in which not only the
winner of the competition is updated but also its neighbors, depending on their proximity to the input vector. In addition it has the well-known topology-preserving ability
which arranges the prototype vectors so as to mirror the statistics of the data. The SOM algorithm uses a fixed predetermined topology of the units in the low-dimensional map
space (aka feature space; usually 2-D or 3-D) which defines their order and their
neighborhood relationship. When the original manifold is too complicated to be followed by a fixed-topology low-dimensional map space, this leads to a highly folded
feature map.
The topology constraint on the feature map is removed in the neural-gas model (7),
the dynamic cell structure (DCS-GCS) (8) and the growing neural gas (GNG) (9).
In all these methods the map structure is built dynamically to fit the incoming data, but
they need to calculate local resources for the prototypes, which increases the computational effort and thus reduces efficiency.
The ESOM model is similar to the GNG but it does not require local resource calculations, and its node insertion mechanism is more efficient than those of the DCS and GNG.
6.1 The ESOM algorithm
We start with an empty map, adding new nodes as the input vectors arrive.
We will use the following symbols:
$W(t) = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_N\}$ indicates the set of the prototype nodes at the $t$-th step;
$N$ is the current number of nodes;
$d$ is the dimension of the input manifold;
$C = \{(i, j)\}$ is the set of all the unordered pairs $(i, j)$ of nodes $\mathbf{w}_i$, $\mathbf{w}_j$ which are connected together.
The algorithm can be schematized as follows:
1. A new input $\mathbf{x}$ is presented to the network.
2. Consider the set $S(\mathbf{x})$ of prototype nodes that match the input vector within a predefined threshold $\epsilon$, i.e. $S(\mathbf{x}) = \{\mathbf{w}_i : \|\mathbf{x} - \mathbf{w}_i\| \le \epsilon\}$.
3. If $S(\mathbf{x})$ is empty go to step 4 (node insertion), otherwise go to step 5 (node updating).
4. Node insertion. Create a new node $\mathbf{w}_{N+1}$ that matches exactly the input vector $\mathbf{x}$, insert it into $W$ and increment $N$ by one:
$\mathbf{w}_{N+1} = \mathbf{x}$, $N = N + 1$.
Connect the new node with its two nearest neighbors $\mathbf{w}_{i_1}$, $\mathbf{w}_{i_2}$ (if they exist, that is if $W$ has at least two elements) and connect them to each other as well; if $W$ has only one element, connect only the new node with it:
$C = C \cup \{(N, i_1), (N, i_2), (i_1, i_2)\}$ if $W$ has at least two elements; $C = C \cup \{(N, i_1)\}$ if $W$ has only one element.
Go to step 6.
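A minimal MATLAB sketch of steps 1-4 above (our own illustration, covering only the fragment reported here); W is the N-by-d prototype matrix, C the two-column connection list and eps_thr the match threshold:

dist = sqrt(sum((W - x).^2, 2));          % distances of the input x to all prototypes
if isempty(dist) || all(dist > eps_thr)   % S(x) empty: no node matches x
    W = [W; x];  N = size(W, 1);          % step 4: insert a new node w_N = x
    [~, idx] = sort(dist);                % indices of the nearest existing nodes
    if numel(idx) >= 2                    % connect w_N to the two nearest, and them together
        C = [C; N idx(1); N idx(2); idx(1) idx(2)];
    elseif numel(idx) == 1                % only one node exists: connect just to it
        C = [C; N idx(1)];
    end
else
    % step 5 (node updating) would follow here; it is not part of this fragment
end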
Bibliography
1. D. Deng and N. Kasabov. On-line Pattern Analysis by Evolving Self-Organizing Maps.
2. M. Dittenbach, A. Rauber and D. Merkl. Business, Culture, Politics, and Sports - How to Find Your Way Through a Bulk of News? On Content-Based Hierarchical Structuring and Organization of Large Document Archives.
3. Denny, G. J. Williams and P. Christen. ReDSOM: Relative Density Visualization of Temporal Changes.
4. T. Kohonen. Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43, 1982, pp. 59-69.
5. T. Kohonen. The self-organizing map. Proceedings of the IEEE, vol. 78, no. 9, September 1990, pp. 1464-1480.
6. J. L. Marroquin and F. Girosi. Some extensions of the k-means algorithm for image segmentation and pattern classification. Technical Report 1390, 1993.
7. T. M. Martinetz, S. G. Berkovich and K. J. Schulten. "Neural-gas" network for vector quantization and its application to time-series prediction. IEEE Transactions on Neural Networks, 4, 1993.
8. J. Bruske and G. Sommer. Dynamic cell structure learns perfectly topology preserving map. Neural Computation, 7, 1995.
9. B. Fritzke. Growing cell structures - a self-organizing network for unsupervised and supervised learning. Neural Networks, 7, 1994.