

Analysis of some extensions of the Self-Organizing Map:

Evolving SOM (ESOM) (1), Growing Hierarchical SOM (GHSOM) (2), Relative Density SOM (ReDSOM) (3)

Domenico Leuzzi

Abstract.

We describe and analyze some interesting extensions of the Self-Organizing Map (SOM) algorithm as originally proposed by Kohonen, namely the Evolving SOM (ESOM) and the Growing Hierarchical SOM (GHSOM), as well as a visualization method for identifying changes in the cluster structure of temporal datasets, called ReDSOM. The ESOM algorithm differentiates itself from its parent SOM because the topology of the map it creates is not fixed as in the SOM, but is built adaptively on the basis of the dataset distribution, thus reducing the number of map units required to achieve a given quantization error. The GHSOM algorithm tries to reproduce the hierarchy of the dataset by creating a multilevel map in which each unit of a map at one level can be explained in more detail by a map at the next level. The multilevel approach not only optimizes the use of the units (a lower-level map is added only if it is necessary to improve the quantization error) but also allows quicker navigation of the maps obtained.

The ReDSOM visualization method is useful when we have a dataset which evolves in time and we want to compare, in the map space, the clustering structure of dataset snapshots taken at two different time instants. This method allows one to identify visually, by means of different colorings, emerging/lost clusters, cluster enlargement/shrinking, denser/less dense clusters, and cluster movements.

1 Introduction

We start in Section 2 with a brief presentation of the SOM algorithm as originally developed by Kohonen (4; 5).

In Section 3 we describe the tools used for the experimental tests.

Section 4 is dedicated to the ReDSOM visualization method, Sections 5 and 6 to the GHSOM and ESOM algorithms respectively.


2 Self-Organizing Maps

A SOM (also known as SOFM, Self-Organizing Feature Map, or Kohonen map) is an artificial neural network based on unsupervised competitive learning (4; 5).

A low-dimensional grid of neurons (aka units), usually 2-D, is built following a fixed and predetermined topology (i.e. rectangular or hexagonal). This grid constitutes the so-called map space (or output space). Whichever topology is used, each unit is connected with a number of neighboring units which are equidistant in the map space: in the rectangular topology each unit is surrounded by four equidistant neighboring units, and in the hexagonal topology it is surrounded by six units1.

The grid's units are initialized in the data space, that is, each unit's weight vector (aka the codebook vector, prototype vector or reference vector) is given an initial value taken from the input space. The initialization can be random or linear. In the latter case the initial values are chosen in an orderly fashion along the first $d$ principal components, $d$ being the map space dimensionality.

    The map is then trained using a competitive unsupervised learning algorithm.

At each training step $t$, a data point $\mathbf{x}(t)$ is randomly selected from the dataset. Then the best matching unit (BMU) corresponding to $\mathbf{x}(t)$, i.e. the unit $c$ whose weight vector is closest to $\mathbf{x}(t)$, is selected from the map in accordance with (1):

$c(t) = \arg\min_i \| \mathbf{x}(t) - \mathbf{m}_i(t) \|$   (1)

After that, not only the BMU but also all its neighbors are adjusted according to the adaptation rule

$\mathbf{m}_i(t+1) = \mathbf{m}_i(t) + h_{c,i}(t) \, [\mathbf{x}(t) - \mathbf{m}_i(t)], \quad i = 1, \dots, N$   (2)

where

$h_{c,i}(t) = \alpha(t) \exp\!\left( -\| \mathbf{r}_c - \mathbf{r}_i \|^2 / 2\sigma^2(t) \right)$

is the neighborhood kernel function, dependent on the map-space distance between the winning neuron $c$ and the neighbor unit $i$, as well as on the time $t$. The parameter $\sigma(t)$ controls the size of the zone of neurons around the winning one that are affected by the update, while $\alpha(t)$ is a decreasing function of time (e.g. linear or exponential) controlling the strength of the adaptation.

    1 The border units are surrounded by fewer units, unless the lattice is wrapped in a cylindrical

    or toroidal structure.
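As a concrete illustration of equations (1) and (2), the following is a minimal MATLAB sketch of one online training step; the variable names (M, R, x, sigma, alpha) and the toy sizes are our own illustrative choices, not part of any toolbox.

rng(0);
N = 16; d = 2;                                 % a 4x4 map on 2-D data
M = rand(N, d);                                % codebook vectors m_i
[gx, gy] = meshgrid(1:4, 1:4);
R = [gx(:), gy(:)];                            % unit coordinates r_i in the map space
x = rand(1, d);                                % one input sample x(t)
sigma = 1.5; alpha = 0.1;                      % current kernel radius and learning rate
[~, c] = min(sum((M - x).^2, 2));              % (1): best matching unit c
h = alpha * exp(-sum((R - R(c,:)).^2, 2) / (2*sigma^2));  % kernel h_ci(t)
M = M + h .* (x - M);                          % (2): adapt the BMU and its neighbors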


3 Tools used for the experimental tests

For implementing the ReDSOM visualization method we used the SOM Toolbox 2.02, a powerful package developed for MATLAB that allows managing every aspect of SOMs, from the initialization, to the training, up to the visualization in the map space, in the data space and in the projected space. Detailed information can be obtained from the documentation and from the source code.

For the implementation of the GHSOM algorithm we used the package Java SOMToolbox3. It contains several modules to train and visualize a SOM. In particular we used the module GHSOM to grow a hierarchy of maps, and SOMViewer to visualize the results.

We also used the ESOM Toolbox for MATLAB to run some tests on the ESOM algorithm.

4 ReDSOM (3)

Suppose we have a temporal dataset $D(t)$, that is, a dataset whose distribution varies with time. We want to compare the clustering structure of the dataset at two different time instants $t_1$ and $t_2$, that is, of the two datasets $D(t_1)$, $D(t_2)$. The clustering itself, and the comparison, are much simpler if carried out in the low-dimensional space of two SOMs of equal topology, $M(t_1)$, $M(t_2)$, trained on those two datasets. In order to be able to compare two such maps directly, they must have the same orientation, and the datasets on which they are trained have to be normalized using the same normalization method and parameters. So the procedure to obtain two maps that can be compared by the method concerned here is as follows:

1. Normalize both datasets $D(t_1)$, $D(t_2)$ using the same normalization method (e.g. the common z-score) and the same parameters.
2. Initialize map $M(t_1)$ using ordered (linearly initialized) values.
3. Train map $M(t_1)$ using the dataset $D(t_1)$.
4. Initialize map $M(t_2)$ using the codebook vectors of the previously trained map $M(t_1)$.
5. Train map $M(t_2)$ using the dataset $D(t_2)$.

The maps so obtained are directly comparable with each other. That is, if we define a density function on the data space, related to the density of the units in the data space (that is, considering their prototype vectors), we can compare one-to-one the densities of the units of the two maps (the two map units in the same position on the two maps are compared together).

2 The package is available at the URL http://www.cis.hut.fi/projects/somtoolbox/download/

3 The package is available at the URL http://www.aut.ac.nz/__data/assets/file/0015/10176/ecos_esom.zip


4.1 Area Density and Relative Density Definitions

We define the area density $f_i(\mathbf{v})$ of the map $M(t_i)$, calculated at the data-space vector $\mathbf{v}$, as the sum of the values of Gaussian kernel functions centered on the vector $\mathbf{v}$ and calculated on the prototype vectors $\mathbf{m}_j$ of the map units, as shown in (3):

$f_i(\mathbf{v}) = \sum_j \exp\!\left( -\| \mathbf{v} - \mathbf{m}_j \|^2 / 2r^2 \right)$   (3)

The radius $r$ defines the width of the kernel function, and its value should be chosen in accordance with the mean distance between neighboring units. It was observed that a quartile of this distance (e.g. the third quartile) is a balanced choice.

Now we define the relative density $\mathrm{rd}(\mathbf{v})$ as the logarithm of the ratio between the area density of the map $M(t_2)$ and the area density of the map $M(t_1)$, both calculated at the same location $\mathbf{v}$, as shown in (4):

$\mathrm{rd}(\mathbf{v}) = \log_2 \left( f_2(\mathbf{v}) / f_1(\mathbf{v}) \right)$   (4)

The use of the logarithm in (4) makes the density ratio come out as negative values when the ratio is below 1 (decrease of density) and positive values when the ratio is above 1 (increase of density).

The base-two logarithm gives a more convenient scale. For example a value of +2 indicates a density four times higher on the second map $M(t_2)$, while a value of -2 indicates a density four times lower on $M(t_2)$.

Based on experimental observation, values of $\mathrm{rd}(\mathbf{v})$ less than -3 indicate that the location of vector $\mathbf{v}$ is no longer occupied on the second map $M(t_2)$ (it is lost), while values greater than +3 indicate that the location of vector $\mathbf{v}$ was not occupied on the first map $M(t_1)$ but is occupied on the second map $M(t_2)$ (it is new).

The relative density calculation is performed only on the prototype vectors of the two maps $M(t_1)$, $M(t_2)$ and not on the actual data vectors. So the running time of the calculation for a map is quadratic in the number of units $N$ rather than in the number of data points $n$, where $N \ll n$.
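The following minimal MATLAB sketch implements (3) and (4) directly at the prototype locations of the two maps; the matrices M1, M2 stand in for the two codebooks and the radius r is fixed by hand, all illustrative assumptions rather than the paper's code (pdist2 requires the Statistics and Machine Learning Toolbox).

rng(0);
M1 = rand(64, 2);                              % stand-in codebook of map M(t1)
M2 = rand(64, 2) + 0.1;                        % stand-in codebook of map M(t2)
r = 0.1;                                       % kernel radius (see the text above)
f = @(V, M) sum(exp(-pdist2(V, M).^2 / (2*r^2)), 2);  % area density (3)
rd1 = log2(f(M1, M2) ./ f(M1, M1));            % relative density (4) at map-1 prototypes
rd2 = log2(f(M2, M2) ./ f(M2, M1));            % and at map-2 prototypes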

4.2 Relative Density Visualization

As said in the previous section, the relative density calculation is performed only on the prototype vectors of the two maps $M(t_1)$, $M(t_2)$. Let us use $\mathrm{rd}_1$ and $\mathrm{rd}_2$ as shorthand for the values of the relative density calculated, respectively, on the prototype vectors of the first map and on the prototype vectors of the second map.


We visualize $\mathrm{rd}_1$ and $\mathrm{rd}_2$ on the respective maps, in a gradation of blue for positive values and a gradation of red for negative values, as shown in Fig. 2.

The $\mathrm{rd}_1$ visualization should be used to detect a density decrease of the vectors of the first map (negative values of relative density). In fact, if we want to detect whether a vector of the first map has been lost or has decreased in density in the second map, we have to choose a location vector for the calculation of the relative density where that vector is surely present, that is, among the reference vectors of the first map. Similarly, the $\mathrm{rd}_2$ visualization should be used to detect the vectors that in the second map have increased their density with respect to the same vector location on the first map.

4.3 MATLAB Implementation using SOM Toolbox

To implement the Relative Density Visualization algorithm in MATLAB using the SOM Toolbox, first of all it is necessary to initialize and train the two maps relative to the two snapshots of the dataset we want to compare, following the procedure indicated in Section 4 so as to obtain two directly comparable maps. The MATLAB code we used to do that is as follows:

sD{1} = som_normalize(sD{1}, 'var');  % normalize sD{1} using the 'var' method (z-score)
sD{2} = som_normalize(sD{2}, sD{1});  % normalize sD{2} using the same method and parameters as sD{1}

sM{1} = som_lininit(sD{1});           % initialize map 1 linearly
sM{1}.comp_names{1} = 'x';
sM{1}.comp_names{2} = 'y';

% Train map 1 (use batch training); the trained map is put in sM_t{1}
sTr1 = som_train_struct(sM{1}, sD{1}, 'algorithm', 'batch', 'phase', 'rough');
sM_t{1} = som_batchtrain(sM{1}, sD{1}, sTr1);
sTr1 = som_train_struct(sM_t{1}, sD{1}, 'algorithm', 'batch', 'phase', 'finetune');
sM_t{1} = som_batchtrain(sM_t{1}, sD{1}, sTr1);

sM{2} = sM{1};                        % map 2: same topology and codebook vectors as map 1

% Train map 2 (use batch training)
sTr2 = som_train_struct(sM{2}, sD{2}, 'algorithm', 'batch', 'phase', 'rough');
sM_t{2} = som_batchtrain(sM{2}, sD{2}, sTr2);
sTr2 = som_train_struct(sM_t{2}, sD{2}, 'algorithm', 'batch', 'phase', 'finetune');
sM_t{2} = som_batchtrain(sM_t{2}, sD{2}, sTr2);

See the SOM Toolbox documentation for the meaning of each function and each structure. The variables sD, sM and sM_t are cell arrays containing, respectively, the two datasets, the two untrained maps and the two trained maps. We preferred to keep the untrained and the trained maps in separate variables. We used the batch training algorithm because it speeds up the training.

The MATLAB code we used to calculate the relative densities $\mathrm{rd}_1$, $\mathrm{rd}_2$ is as follows:

[c1, p1, err1, ind1] = kmeans_clusters(sM_t{1});
[density1, radius] = som_density(sM_t{1}, sM_t{1}.codebook, 'kp', p1{dataset1_knum});
[density2] = som_density(sM_t{2}, sM_t{1}.codebook, 'radius', radius);
rd{1} = log2(density2 ./ density1);

[density1] = som_density(sM_t{1}, sM_t{2}.codebook, 'radius', radius);
[density2] = som_density(sM_t{2}, sM_t{2}.codebook, 'radius', radius);
rd{2} = log2(density2 ./ density1);

The first line calculates a clustering of the codebook vectors using the function kmeans_clusters. The returned variable p1 is a cell array which contains, in position k, the clustering information for a number of clusters equal to k. The partitioning of the prototype vectors is needed for the calculation of the radius parameter present in the Gaussian function used in the expression of the area density function. The radius is calculated in the next code line (first som_density invocation) for the first area density calculation. The next three calculations of area densities reuse the radius calculated by the first invocation and don't require the previously obtained clustering information, because they don't need to calculate the radius.

The function som_density is not part of the SOM Toolbox package; the salient part of this function is the calculation of the radius, reported in the following code fragment:

U = som_umat(M, sTopol, mode, 'mask', mask);
[mean_neighbors_dist_cluster] = neighbors_dist(U, sTopol.msize, sTopol.lattice, kp, knum);
mean_neighbors_dist = mean(mean_neighbors_dist_cluster);
r = quart * mean_neighbors_dist;


The first line calculates the U-distance matrix and the second line calculates the mean distance between neighbors in each cluster of prototype vectors; the parameter kp is a vector containing the clustering information of each prototype vector and knum is the number of clusters.

4.4 Results on synthetic datasets

First synthetic example

Fig. 1 shows the datasets used to demonstrate how the relative density visualization method performs when there are lost clusters, new clusters, and changes in cluster density. The two datasets are constituted by a superposition of four sets of normally-distributed (Gaussian) 2-D data points. The variance of each normally distributed set was set to the common value 0.2, while the mean values were chosen so that the resulting sets do not (practically) overlap. The figure shows both the datasets and the denormalized codebook vectors of the trained maps.

Comparing the top portion of the figure with the bottom one, we can see that going from the first to the second, cluster A is lost, the new cluster E appears, and the two clusters B, D change density: the first becomes denser and the second less dense.


Fig. 1. Datasets used to show the capacity of the Relative Density Visualization to detect lost clusters and new clusters, as well as clusters with a density variation. Besides showing the dataset points (blue points), the figure shows the denormalized codebook vectors of the maps trained on them (red crosses).

Both datasets are constituted by four sets of normally-distributed (Gaussian) 2D points. Each of these normally-distributed sets was given a common value of variance (0.2), while the mean values were set so that the four groups are practically non-overlapping. Going from the first dataset (a) to the second one (b) we can see that there is a lost cluster (A), a new cluster (E), a denser cluster (B) and a less dense cluster (D); cluster C remains unchanged.


Fig. 2 shows the two trained maps using both the usual visualizations (component planes and U-distance matrix) and the relative density visualization. The top portion of the figure shows the visualizations relative to the first map, whilst the bottom portion is relative to the second map.

The $\mathrm{rd}_1$ visualization clearly shows that there is a region of strong red coloring, which is associated with a very low value of relative density (< -3). This region corresponds to the cluster A, which is lost going toward the second map.

There are two further regions, one colored light blue and the other light red. The light blue indicates an increase of density (cluster B), while the light red indicates a decrease of density (cluster D).

At last there is a neutral zone (white color, cluster C) which corresponds to an unchanged cluster.

As mentioned before, $\mathrm{rd}_1$ is able to show the changes of the clusters present in the first map (A, B, C, D), but it cannot show the changes relative to the clusters that are only present in the second map, that is, it cannot detect the creation of new clusters (like the cluster E). That kind of information is instead obtainable from $\mathrm{rd}_2$.

The visualization of $\mathrm{rd}_2$ in the bottom part of the figure clearly shows a zone with a strong relative density value (> +3), corresponding to the dark blue coloring. This region is associated with the emerging cluster E. The other regions (light blue, light red, white colors) are the same identified by the $\mathrm{rd}_1$ visualization.


Second synthetic example

Fig. 3 shows the datasets used to demonstrate how the relative density visualization method is able to highlight a shift of a cluster centroid. The two datasets are similar to the ones used in Fig. 1. The difference is that in the second dataset cluster A is no longer lost but shifts its centroid, and the emerging cluster E is no longer present. The figure shows both the datasets and the denormalized codebook vectors of the trained maps. The denser and less dense clusters B and D are the same as those of the datasets of Fig. 1.

Fig. 3. Datasets used to show the capacity of the Relative Density Visualization to detect a shift of a cluster centroid. Besides showing the dataset points (blue points), the figure shows the denormalized codebook vectors of the maps trained on them (red crosses).

Both datasets are constituted by four sets of normally-distributed (Gaussian) 2D points. Each of these normally-distributed sets was given a common value of variance (0.2), while the mean values were set so that the four groups are practically non-overlapping. Going from the first dataset (a) to the second one (b) we can see that there is a shifted cluster (A), a denser cluster (B) and a less dense cluster (D); cluster C remains unchanged.


Fig. 4 shows the two trained maps using both the usual visualizations (component planes and U-distance matrix) and the relative density visualization. The top portion of the figure shows the visualizations relative to the first map, whilst the bottom portion is relative to the second map.

We can draw the same conclusions about the clusters B, C, D as we did for the previous example shown in Fig. 2: in the visualizations $\mathrm{rd}_1$ and $\mathrm{rd}_2$, cluster B has increased its density (light blue coloring), cluster D has decreased its density (light red coloring) and cluster C is about unchanged (there is a very light red coloring but it is negligible). See the previous example for more details.

What is worth our attention is the region of cluster A: there is a light red coloring on $\mathrm{rd}_1$ and a light blue coloring on $\mathrm{rd}_2$, both not covering the entire cluster. When the border of the colored region crosses the inner part of a cluster, it means there is a cluster enlargement (blue coloring on $\mathrm{rd}_2$ and no coloring on $\mathrm{rd}_1$), a cluster shrinking (red coloring on $\mathrm{rd}_1$ and no coloring on $\mathrm{rd}_2$), or a shift of the cluster centroid (both red coloring on $\mathrm{rd}_1$ and blue coloring on $\mathrm{rd}_2$). Our case corresponds to the third configuration (both partial colorings): indeed, cluster A undergoes a shift of its centroid.
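To make this decision rule concrete, here is a minimal MATLAB sketch that classifies a cluster's change from the signs of the relative densities sampled over its region; the function name, its inputs and the 0.5 coloring threshold are our own illustrative choices, not part of the ReDSOM paper or toolbox.

function change = classify_cluster_change(rd1_region, rd2_region)
% rd1_region, rd2_region: relative density values sampled at the units
% belonging to one cluster's region on the two maps (illustrative layout).
    red1  = any(rd1_region < -0.5);   % partial red coloring on rd1
    blue2 = any(rd2_region > 0.5);    % partial blue coloring on rd2
    if red1 && blue2
        change = 'centroid shift';    % both partial colorings
    elseif blue2
        change = 'enlargement';       % blue on rd2 only
    elseif red1
        change = 'shrinking';         % red on rd1 only
    else
        change = 'unchanged';         % neutral (white) zone
    end
end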


Fig. 4. Visualizations of the maps trained with the datasets (a) shown in Fig. 3a and (b) shown in Fig. 3b. In the figure are drawn the two component planes, the U-distance matrix and the relative density calculated on the units (that is, on their codebook vectors) of the considered map. On the figure are outlined the contours of the clusters as obtained from the U-distance matrix (blue color indicates a high distance in the data space while yellow indicates a low distance, so the units with a blue color represent a region of separation between clusters). The relative densities $\mathrm{rd}_1$, $\mathrm{rd}_2$ give the same indication about the clusters B, C, D as in the previous example shown in Fig. 2: see the previous example for more details. What is worth our attention is the region of cluster A: there is a light red coloring on $\mathrm{rd}_1$ and a light blue coloring on $\mathrm{rd}_2$, both not covering the entire cluster: when the border of the colored region crosses the inner part of a cluster it means there is an enlargement (blue coloring on $\mathrm{rd}_2$), a shrinking (red coloring on $\mathrm{rd}_1$), or a shift of the cluster centroid (both).


5 GHSOM (2)

The topology preservation capability of the Self-Organizing Map allows creating a low-dimensional representation of a dataset, e.g. of a collection of documents, so as to organize it and make it easy to search for the desired information. As the amount of information to be represented grows, the map needed to organize it becomes larger. A large map, even if low-dimensional, used to represent the whole dataset makes it hard to find a particular datum of interest. Moreover, in the single-map representation, although the reduction of dimensionality simplifies the visualization of the data, the hierarchical structure of the data itself is lost. The Growing Hierarchical Self-Organizing Map is conceived with the idea of distributing the dataset to be represented over several distinct sub-maps, each specialized in a specific portion of the data space, and linked together by a hierarchical relationship. In addition each sub-map can grow in size to fit the level of detail needed. This multilevel approach not only performs the dimensionality reduction without losing the topology of the dataset, as the ordinary SOM does, but also makes it possible to maintain to some degree its hierarchical structure.

5.1 The algorithm

The key idea is to use multiple layers of distinct SOMs. The first layer contains only one SOM. Each unit of this map can be expanded into a finer SOM in the next (lower level) layer. The same applies to the units of the maps of this new layer, and so on: the algorithm goes ahead until a predetermined level of detail is reached (see Fig. 5). In addition, for every map added to the structure we use an incrementally growing version of the SOM: we start from a simple 2x2 map and grow it if, after its training, the mapping quality is not satisfying.

We start at layer zero from a very rough representation of the data, just a single map unit whose weight vector is set at the mean point of all the dataset vectors; indeed this first unit has only the purpose of calculating the initial quantization error associated with the data. In general the quantization error $qe_i$ of a unit $i$ is calculated as the sum of all the distances between the weight vector of the unit and the data vectors mapped onto this unit; in particular the layer-zero error $qe_0$ represents how far in total the dataset vectors are from their mean vector location.

We proceed with the first true SOM at layer one, starting from a small 2x2 map configuration, which is trained with the standard SOM algorithm. For each SOM, the training process is repeated for a fixed number of iterations.

When the training process of a SOM is done, its mean quantization error $MQE$ is calculated. The mean quantization error of a map is a mapping quality index defined as the average value of the quantization errors of all the units of that SOM. If the $MQE$ of the map just added and trained is higher than a predefined fraction $\tau_1$ of the $qe$ of the unit in the preceding layer the map is linked to, a new row or a new column of units is added to the SOM. The point of addition is set between the map unit with the highest $qe$ (called the error unit) and its most dissimilar (in terms of weight vector) neighbor unit. The weights of the added units are initialized as the average of their neighbors, and the training procedure is repeated as said above. When the growth process is concluded, we can say that the newly added SOM represents the preceding-layer unit from which it is expanded, but at a higher level of detail4.

The units of an added SOM that still have a quantization error that is too high, namely higher than a predefined threshold fraction $\tau_2$ of the initial quantization error $qe_0$ at layer zero, are expanded into a SOM in the next lower layer. The parameter $\tau_2$ controls the granularity of the data representation in each terminal unit of the hierarchy (one not expanded into a further map). The lower this parameter, the more units require expansion, and so the deeper the hierarchy produced.

Summing up, the structure can grow both in breadth and in depth. The shape of the hierarchy is controlled by the two parameters $\tau_1$ and $\tau_2$. The size of each single map tends to increase as the parameter $\tau_1$ is lowered, while the depth of the hierarchy, that is, its expansion level, increases as the parameter $\tau_2$ is lowered.
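To make the two growth criteria concrete, here is a minimal MATLAB sketch of the breadth and depth checks on a toy 2x2 map; all names (qe0, qe_i, MQE, tau1, tau2) and the brute-force BMU search are our own illustrative choices, not the Java SOMToolbox API (pdist2 is from the Statistics and Machine Learning Toolbox).

rng(0);
X = rand(200, 3);                              % toy dataset
qe0 = sum(sqrt(sum((X - mean(X)).^2, 2)));     % layer-zero qe: distances to the mean
tau1 = 0.070; tau2 = 0.0035;                   % breadth and depth parameters

W = rand(4, 3);                                % codebook of a trained 2x2 map
[~, bmu] = min(pdist2(X, W), [], 2);           % BMU of each data vector
qe_i = arrayfun(@(i) sum(sqrt(sum((X(bmu == i, :) - W(i, :)).^2, 2))), (1:4)');

MQE = mean(qe_i);                              % mean quantization error of the map
grow_map = MQE > tau1 * qe0;                   % breadth check (for a first layer map
                                               % the parent unit's qe is qe0)
to_expand = find(qe_i > tau2 * qe0);           % depth check: units that get a child map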

    Fig. 5. Hierarchical structure of a GHSOM.

4 Actually the first layer SOM is the first level of detail in the representation of the dataset, because the preceding layer SOM is just a dummy map.


5.2 Implementation of the GHSOM: Java SOMToolbox

We used the package Java SOMToolbox5 to implement the GHSOM algorithm. It contains several modules to train and visualize a SOM. In particular we used the module GHSOM to grow a hierarchy of maps, and SOMViewer to visualize the results.

5.3 Experimental results

We used a dataset consisting of 101 animals described in a data space with a dimensionality of 20. The components of this space are simply Boolean values corresponding to the following attributes: hair, feathers, eggs, milk, airborne, aquatic, predator, toothed, backbone, breathes, venomous, fins, 2_legs, 4_legs, 5_legs, 6_legs, 8_legs, tail, domestic, catsize.

We ran two tests to show how the hierarchy structure can be shaped by means of the two parameters $\tau_1$ and $\tau_2$.

Test 1: three-layer hierarchy

Training the GHSOM with the parameters $\tau_1 = 0.070$ and $\tau_2 = 0.0035$ produced a hierarchy constituted by three layers, as depicted in Fig. 6.

The first layer map has been expanded to form a 2 × 4 grid. Each unit of this layer is further expanded into a map in the second layer. Some of the units in the second layer maps are expanded even further in a third layer. As can be seen from the figure, the algorithm is able to organize the data in a meaningful way. For example the aquatic animals, the mammals, the birds, etc. are each organized into a separate sub-map in the second layer. In addition, the sub-maps representing related species are close together, like the second layer sub-map representing the quadruped mammals and that representing the mammals which are not quadrupeds.

Fig. 7 shows the component planes of the first layer map units.

    5 The package is available at the URL http://www.ifs.tuwien.ac.at/dm/somtoolbox/


[Figure: the tree of maps produced in test 1, with the animal names mapped onto each unit of the first-, second- and third-layer maps; among the labeled groups are the quadruped mammals, the other mammals, the aquatic species and the birds.]

Fig. 6. Hierarchy produced with the parameters $\tau_1 = 0.070$, $\tau_2 = 0.0035$. The first layer map has grown to a 2 × 4 configuration and the hierarchy has reached a depth of 3 layers. In the figure are indicated some of the categories grouped by the GHSOM.


Fig. 7. Component planes of the first layer map of the GHSOM trained in test 1.

Test 2: two-layer hierarchy

Training the GHSOM with the parameters $\tau_1 = 0.025$ and $\tau_2 = 0.0035$ produced a hierarchy constituted by two layers, as depicted in Fig. 8.

The first layer map has been expanded up to a 4 × 5 grid. Each unit of this layer is further expanded into a map in the second layer.

Fig. 9 shows the component planes of the first layer map units.


[Figure: the tree of maps produced in test 2, with the animal names mapped onto each unit of the first- and second-layer maps.]

Fig. 8. Hierarchy produced with the parameters $\tau_1 = 0.025$, $\tau_2 = 0.0035$. The first layer map has grown to a 4 × 5 configuration and the hierarchy has reached a depth of 2 layers.


Fig. 9. Component planes of the first layer map of the GHSOM trained in test 2.

6 ESOM (1)

In the context of data clustering and vector quantization, one of the major challenges is the ability to deal with an online data stream characterized by unknown or time-dependent statistics. The simplest approach is k-means in its online version, where for each incoming input data vector x only the prototype vector closest to x is updated, by dragging it nearer to x (Winner-Takes-All scheme). This approach is known as the local k-means algorithm (6). While this method is quite straightforward, it can suffer from confinement to local minima. The SOM algorithm (4; 5) is able to overcome this problem because it uses a soft approach in which not only the winner of the competition is updated but also its neighbors, depending on their proximity to the input vector. In addition, it has the well-known topology-preserving ability, which places the prototype vectors so as to mirror the statistics of the data. The SOM algorithm uses a fixed, predetermined topology of the units in the low-dimensional map space (aka feature space; usually 2-D or 3-D), which defines their order and their neighborhood relationships. When the original manifold is too complicated to be followed by a fixed-topology low-dimensional map space, this leads to a highly folded feature map.
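As a concrete illustration, the Winner-Takes-All update of the online (local) k-means just described can be sketched in a few MATLAB lines; W, x and alpha are illustrative names of our own, not a library API.

rng(0);
W = rand(3, 2);                                % k prototype vectors
x = rand(1, 2);                                % one incoming input vector
alpha = 0.05;                                  % learning rate
[~, c] = min(sum((W - x).^2, 2));              % winner-takes-all selection
W(c, :) = W(c, :) + alpha * (x - W(c, :));     % drag only the winner toward x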

The topology constraint on the feature map is removed in the neural-gas model (7), the dynamic cell structure (DCS-GCS) (8) and the growing neural gas (GNG) (9). In all these methods the map structure is built dynamically to fit the incoming data, but they need to calculate local resources for the prototypes, which increases the computational effort and thus reduces efficiency.

The ESOM model is similar to the GNG, but it does not require local resource calculation, and its node insertion mechanism is more efficient than that of the DCS and GNG.

6.1 The ESOM algorithm

We start with an empty map, adding new nodes as the input vectors arrive. We will use the following symbols:

$W(t) = \{ \mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_N \}$ indicates the set of the prototype nodes at the $t$-th step;

$N$ is the current number of nodes;

$d$ is the dimension of the input manifold;

$C = \{ (i, j) \}$ is the set of all the unordered pairs $(i, j)$ of nodes $i$, $j$ which are connected together.

The algorithm can be schematized as follows:

1. A new input $\mathbf{x}$ is presented to the network.

2. Consider the set $S(\mathbf{x})$ of prototype nodes that match the input vector within a predefined threshold.

3. If $S(\mathbf{x})$ is empty go to step 4 (node insertion), otherwise go to step 5 (node updating).

4. Node insertion. Create a new node $\mathbf{w}_{N+1}$ that matches the input vector $\mathbf{x}$ exactly, insert it into $W$ and increment $N$ by one:

$\mathbf{w}_{N+1} = \mathbf{x}, \quad W = W \cup \{ \mathbf{w}_{N+1} \}, \quad N = N + 1$

Connect the new node with its two nearest neighbors $\mathbf{w}_a$, $\mathbf{w}_b$ (if they exist, that is, if $W$ has at least two other elements) and also connect those two to each other; if $W$ has only one other element, connect only the new node with it:

$C = C \cup \{ (N+1, a), (N+1, b), (a, b) \}$ if $W$ has at least two other elements; $C = C \cup \{ (N+1, a) \}$ if $W$ has only one other element.

Go to step 6.
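A minimal MATLAB sketch of the insertion logic above follows, assuming a plain data layout of our own choosing (W an N-by-d matrix of prototypes, C an N-by-N logical adjacency matrix, eps_th the matching threshold); it is not the ESOM Toolbox API.

function [W, C] = esom_insert(W, C, x, eps_th)
% One presentation of input x: insert a new node if no prototype matches.
    dists = sqrt(sum((W - x).^2, 2));       % distances to all prototypes
    if any(dists <= eps_th)                 % steps 2-3: S(x) is not empty
        return                              % step 5 (node updating) not shown
    end
    W = [W; x];                             % step 4: new node w_{N+1} = x
    N = size(W, 1);
    C(N, N) = false;                        % enlarge the adjacency matrix
    [~, order] = sort(dists);
    nb = order(1:min(2, numel(dists)));     % up to two nearest neighbors
    C(N, nb) = true;  C(nb, N) = true;      % connect the new node to them
    if numel(nb) == 2                       % and connect them to each other
        C(nb(1), nb(2)) = true;
        C(nb(2), nb(1)) = true;
    end
end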


    Bibliography

1. D. Deng and N. Kasabov. On-line Pattern Analysis by Evolving Self-Organizing Maps.

2. M. Dittenbach, A. Rauber and D. Merkl. Business, Culture, Politics, and Sports - How to Find Your Way Through a Bulk of News? On Content-Based Hierarchical Structuring and Organization of Large Document Archives.

3. Denny, G. J. Williams and P. Christen. ReDSOM: Relative Density Visualization of Temporal Changes.

4. T. Kohonen. Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43, 1982, pp. 59-69.

5. T. Kohonen. The self-organizing map. Proceedings of the IEEE, vol. 78, no. 9, September 1990, pp. 1464-1480.

6. J. L. Marroquin and F. Girosi. Some extensions of the k-means algorithm for image segmentation and pattern classification. Technical Report 1390, 1993.

7. T. M. Martinetz, S. G. Berkovich and K. J. Schulten. Neural-gas network for vector quantization and its application to time-series prediction. IEEE Transactions on Neural Networks, 4, 1993.

8. J. Bruske and G. Sommer. Dynamic cell structure learns perfectly topology preserving map. Neural Computation, 7, 1995.

9. B. Fritzke. Growing cell structures - a self-organizing network for unsupervised and supervised learning. Neural Networks, 7, 1994.