G. Folino, A. Forestiero, G. Spezzano {folino,forestiero,spezzano}@icar.cnr.it
Swarming Agents for Discovering Clusters in Spatial Data
Second International Symposium on Parallel and Distributed Computing
Ljubljana, Slovenia · 13-14 October 2003
Outline
Introduction
Swarm intelligence
Flocking algorithm
Clustering and spatial datasets
Sparrow-SNN
Experimental results
Conclusions and Future Works
Swarm Intelligence
Swarm Intelligence (SI) is the property of a system whereby the collective behaviors of (unsophisticated) agents interacting locally with their environment cause coherent functional global patterns to emerge.
A swarm has the following interesting properties:
Distributed, without central control
Ability to change the environment
Stigmergy (indirect communication via interaction with the environment)
Fault tolerance
Adaptivity and self-organization
Typical examples are ant colonies and flocks of birds.
Flocking algorithm
A typical example of emergent collective behavior.
No global control
Every agent has limited visibility
The collective behavior emerges only from local interaction, following these three simple rules:
Separation
Alignment
Cohesion
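The three rules above can be illustrated with a minimal sketch of one flock update. This is not the implementation used in the talk: the function name `flock_step`, the visibility `radius`, the `min_dist` separation distance, and the rule weights are all hypothetical choices made for illustration.

```python
import math

def flock_step(positions, velocities, radius=2.0, min_dist=0.5,
               w_sep=1.5, w_ali=1.0, w_coh=1.0, dt=0.1):
    """One synchronous 2D flock update using only local information.

    All parameter names and default values are illustrative assumptions.
    """
    new_velocities = []
    for i, (p, v) in enumerate(zip(positions, velocities)):
        # Limited visibility: an agent only sees neighbours within `radius`.
        neigh = [j for j, q in enumerate(positions)
                 if j != i and math.dist(p, q) <= radius]
        if not neigh:
            new_velocities.append(v)  # nobody in sight: keep going
            continue
        n = len(neigh)
        # Cohesion: steer towards the centre of the visible neighbours.
        cx = sum(positions[j][0] for j in neigh) / n - p[0]
        cy = sum(positions[j][1] for j in neigh) / n - p[1]
        # Alignment: steer towards the neighbours' average velocity.
        ax = sum(velocities[j][0] for j in neigh) / n - v[0]
        ay = sum(velocities[j][1] for j in neigh) / n - v[1]
        # Separation: move away from neighbours closer than `min_dist`.
        close = [j for j in neigh if math.dist(p, positions[j]) < min_dist]
        sx = sum(p[0] - positions[j][0] for j in close)
        sy = sum(p[1] - positions[j][1] for j in close)
        new_velocities.append((v[0] + w_sep * sx + w_ali * ax + w_coh * cx,
                               v[1] + w_sep * sy + w_ali * ay + w_coh * cy))
    new_positions = [(p[0] + dt * v[0], p[1] + dt * v[1])
                     for p, v in zip(positions, new_velocities)]
    return new_positions, new_velocities
```

No agent reads global state here: each update depends only on the neighbours inside the visibility radius, which is the property the slide emphasizes.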
Flocking algorithm
Agents can also show exploratory behavior:
First, agents search for a goal of particular interest.
Then, the other flock members are driven towards the goal in order to explore the interesting area more carefully.
Clustering
Clustering divides objects into groups (clusters) so that the members of a cluster are as similar as possible, whereas the members of different clusters differ as much as possible from each other.
Spatial clustering should identify clusters of different dimension, size, shape and density, which is particularly difficult.
Clustering
A spatial dataset with clusters of different density
SNN algorithm (1)
SNN is based on the well-known Jarvis-Patrick algorithm.
It identifies the K nearest neighbors of each object (data point) in the dataset.
Two objects i and j join the same cluster if:
1) i is one of the K nearest neighbors of j;
2) j is one of the K nearest neighbors of i;
3) i and j have at least Kmin of their K nearest neighbors in common;
where K and Kmin are user-defined parameters.
For each pair of points i and j, a link with an associated weight is defined.
The connectivity of a data point is computed as the sum of the weights associated with its outgoing links.
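The three conditions and the connectivity sum can be sketched as follows. The slides do not say how the link weight is computed, so this sketch assumes, purely for illustration, that the weight of a link is the number of nearest neighbors the two points share. The function names `k_nearest`, `snn_links` and `connectivity` are hypothetical.

```python
import math

def k_nearest(points, k):
    """For each point index, return the set of indices of its k nearest neighbours."""
    knn = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        knn.append(set(others[:k]))
    return knn

def snn_links(points, k, k_min):
    """Return {(i, j): weight} for pairs satisfying conditions 1) to 3).

    ASSUMPTION: weight = number of shared nearest neighbours.
    """
    knn = k_nearest(points, k)
    links = {}
    for i in range(len(points)):
        for j in knn[i]:
            # Conditions 1) and 2): i and j are in each other's K-neighbourhood.
            if i < j and i in knn[j]:
                shared = len(knn[i] & knn[j])
                # Condition 3): at least k_min neighbours in common.
                if shared >= k_min:
                    links[(i, j)] = shared
    return links

def connectivity(links, n_points):
    """Sum, for every point, the weights of the links attached to it."""
    conn = [0] * n_points
    for (i, j), w in links.items():
        conn[i] += w
        conn[j] += w
    return conn

# Four points on a unit square plus one far-away point.
pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
links = snn_links(pts, k=3, k_min=2)
print(connectivity(links, len(pts)))  # [6, 6, 6, 6, 0]
```

The isolated point gets connectivity 0 because none of the square's points count it among their K nearest neighbors, so the mutual-neighbor conditions fail.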
SNN algorithm (2)
For every node (data point), calculate the connectivity;
Identify representative points by choosing the points that have high connectivity (> core_threshold);
Identify noise points by choosing the points that have low connectivity (< noise_threshold) and remove them;
Remove all links between points that have a weight smaller than a threshold (merge_threshold);
Take connected components of points to form clusters, where every point in a cluster is either a representative point or is connected to a representative point.
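A minimal sketch of these steps, given precomputed link weights and connectivity values, might look like this. The function name `snn_clusters` and the input layout are assumptions. The sketch also interprets "connected to a representative point" as "directly linked to one", so clusters grow only through representative points.

```python
from collections import defaultdict

def snn_clusters(links, conn, core_threshold, noise_threshold, merge_threshold):
    """Label points with a cluster id, or -1 for noise and unassigned points.

    links: {(i, j): weight}, conn: list of connectivity values (assumed inputs).
    """
    n = len(conn)
    core = {i for i in range(n) if conn[i] > core_threshold}    # representatives
    noise = {i for i in range(n) if conn[i] < noise_threshold}  # removed points
    # Keep only links that are strong enough and do not touch a noise point.
    adj = defaultdict(set)
    for (i, j), w in links.items():
        if w >= merge_threshold and i not in noise and j not in noise:
            adj[i].add(j)
            adj[j].add(i)
    labels = [-1] * n
    next_label = 0
    for seed in sorted(core):
        if labels[seed] != -1:
            continue
        labels[seed] = next_label
        stack = [seed]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if labels[v] == -1:
                    labels[v] = next_label
                    # Only representative points propagate the cluster, so every
                    # member is a representative or directly linked to one.
                    if v in core:
                        stack.append(v)
        next_label += 1
    return labels
```

For example, with the illustrative inputs `links = {(0, 1): 3, (1, 2): 3, (2, 3): 1, (3, 4): 3}` and `conn = [5, 9, 5, 9, 5, 0]`, thresholds of 8 (core), 1 (noise) and 2 (merge) drop the weak link between points 2 and 3, producing two clusters and one noise point.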
SPARROW-SNN
Sparrow-SNN combines the stochastic search of an adaptive flocking algorithm with SNN to discover clusters in spatial data.
It uses a variant of the flocking algorithm:
First, agents search for a goal of particular interest.
Then, the other flock members are driven towards the goal in order to explore the interesting area more carefully.
We used Swarm, a software package for multi-agent simulation of complex systems, to implement Sparrow-SNN.
SPARROW-SNN
for i = 1..MaxGenerations
  foreach agent (yellow, green)
    if (not visited(current_point))
      conn = compute_conn();
      if (conn < noise_threshold)
        consider the point for removal from the clustering
      endif
    endif
    mycolor = color_agent();
  end foreach
  foreach agent (yellow, green)
    dir = compute_dir();
  end foreach
  foreach agent (all)
    switch (mycolor) {
      case yellow, green: move(dir, speed(mycolor)); break;
      case white: stop(); generate_new_boid(); break;
      case red: stop(); merge(); generate_new_close_boid(); break;
    }
  end foreach
end for
Pseudo-code of the algorithm
SPARROW-SNN
N agents are generated randomly in the search space.
When an agent lands on a data point not previously explored, it computes the connectivity of that point.
Using the connectivity, agents take different colors:
conn > core_threshold -> mycolor = red
noise_threshold < conn <= core_threshold -> mycolor = green
0 < conn < noise_threshold -> mycolor = yellow
conn = 0 -> mycolor = white
An agent's color thus indicates a representative point (red), a border point (green), noise (yellow), or an obstacle (white).
Red and white agents stop, signaling to the others the interesting and the deserted regions, respectively.
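The coloring rule can be written as a small function. This is only a sketch: the signature is hypothetical (the pseudo-code calls `color_agent()` without arguments), and the slide leaves the case conn = noise_threshold undefined, which this sketch arbitrarily treats as yellow.

```python
def color_agent(conn, core_threshold, noise_threshold):
    """Map the connectivity of the current data point to an agent color."""
    if conn > core_threshold:
        return "red"     # representative point: stop and merge
    if conn > noise_threshold:
        return "green"   # border point: keep moving, slowly
    if conn > 0:
        # ASSUMPTION: conn == noise_threshold also falls here.
        return "yellow"  # noise: keep moving, quickly
    return "white"       # nothing here: stop and repel the others
```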
SPARROW-SNN
Yellow and green agents move following the modified flocking rules (with repulsion from white agents and attraction towards red agents).
In addition, yellow agents move quickly (uninteresting zones), whereas green agents move slowly.
Red agents (placed on a representative point) run the merge procedure, which includes in the final cluster the discovered representative point together with the points that share with it a significant number (greater than Pmin) of neighbors.
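The modified movement rules for yellow and green agents can be sketched as a correction to the ordinary flocking direction. The slides give no formulas, so the visibility `radius`, the weights, the `SPEED` values and the function signatures below are hypothetical. Only the qualitative behavior is taken from the talk: attraction towards red, repulsion from white, yellow faster than green.

```python
import math

# ASSUMPTION: illustrative speeds; the talk only says yellow is faster than green.
SPEED = {"yellow": 2.0, "green": 1.0}

def compute_dir(pos, flock_dir, red_positions, white_positions,
                radius=5.0, w_attract=1.0, w_repel=1.0):
    """Combine the flocking direction with pulls from red and pushes from white agents."""
    dx, dy = flock_dir
    for rx, ry in red_positions:
        d = math.dist(pos, (rx, ry))
        if 0 < d <= radius:
            # Unit vector pointing towards the red agent.
            dx += w_attract * (rx - pos[0]) / d
            dy += w_attract * (ry - pos[1]) / d
    for wx, wy in white_positions:
        d = math.dist(pos, (wx, wy))
        if 0 < d <= radius:
            # Unit vector pointing away from the white agent.
            dx -= w_repel * (wx - pos[0]) / d
            dy -= w_repel * (wy - pos[1]) / d
    norm = math.hypot(dx, dy)
    return (dx / norm, dy / norm) if norm > 0 else (0.0, 0.0)

def move(pos, direction, color):
    """Advance a yellow or green agent along `direction` at its color's speed."""
    s = SPEED[color]
    return (pos[0] + s * direction[0], pos[1] + s * direction[1])
```

Normalizing the direction keeps the step length controlled by color alone, so a yellow agent in an uninteresting zone covers twice the ground of a green one under these assumed speeds.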
Experimental results (datasets)
Experimental results (clusters found)
Experimental results (random search vs Sparrow–SNN)
Figure: Core Points (y axis) versus Visited Points (x axis) for SPARROW-SNN and RANDOM search; panels a) GEORGE and b) North-East.
Experimental results (scalability)
Conclusions and Future Works
Sparrow-SNN is able to discover clusters of arbitrary shape, size and density in spatial data.
It performs approximate clustering well.
It is naturally distributed, fault tolerant and scalable.
We are working on implementing a new version of Sparrow using Anthill, a peer-to-peer multi-agent system based on JXTA.