UNIVERSITY OF JYVÄSKYLÄ
Resource Discovery in P2P Networks Using Evolutionary Neural Networks
Presentation for the International Conference on Advances in Intelligent Systems – Theory and Applications (AISTA 2004)
15.11.2004
Mikko Vapa, research student
Agora Center
http://tisu.it.jyu.fi/cheesefactory
With co-authors Niko Kotilainen, Annemari Auvinen, Heikki Kainulainen and Jarkko Vuori
2004
Peer-to-Peer Networks
• Peer-to-Peer networks (P2P) are formed by Transmission Control Protocol (TCP) connections between workstations
• Workstations, denoted as nodes, can share their resources, for example files or computing power
[Figure: Nodes 1 to 4 linked by five TCP connections]
Resource Discovery Problem
• In the peer-to-peer resource discovery problem, any node in the network can query resources from other nodes
[Figure: Node 1 asks where a resource is located in a network of Nodes 1 to 4]
A Simple Solution for the Problem
• The Gnutella P2P network, for example, uses a Breadth-First Search (BFS) flooding algorithm, which sends the query to all neighbors
• Problems: all resources in the network can be found, but the network gets congested and there are lots of useless packets
[Figure: Node 1's query is flooded over every connection between Nodes 1 to 4; Node 2 and Node 4 hold the resource and their replies travel back to Node 1]
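The flooding behavior described on this slide can be sketched roughly as follows. This is a minimal illustration, not Gnutella's actual implementation: the four-node topology, the resource name "file", and the `max_hops` limit are assumptions made for the example.

```python
from collections import deque

def flood_query(neighbors, resources, start, wanted, max_hops):
    """Flood a query breadth-first from `start`, at most `max_hops` hops away.

    Returns (nodes holding the resource, number of query packets sent).
    """
    visited = {start}
    frontier = deque([(start, None, 0)])  # (node, sender, hops travelled)
    hits = []
    packets = 0
    while frontier:
        node, sender, hops = frontier.popleft()
        if hops == max_hops:
            continue  # hop limit reached, do not forward further
        for peer in neighbors[node]:
            if peer == sender:
                continue  # never send the query back where it came from
            packets += 1  # every forwarded copy costs one packet
            if peer in visited:
                continue  # duplicate delivery: a useless packet
            visited.add(peer)
            if wanted in resources.get(peer, set()):
                hits.append(peer)
            frontier.append((peer, node, hops + 1))
    return hits, packets

# Hypothetical topology: four nodes, five connections
neighbors = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 4], 4: [2, 3]}
# Hypothetical placement: Nodes 2 and 4 hold the requested file
resources = {2: {"file"}, 4: {"file"}}

hits, packets = flood_query(neighbors, resources, start=1, wanted="file", max_hops=2)
print(hits, packets)
```

In this toy network both copies of the file are found, but half of the query packets arrive at nodes that have already seen the query, which is the congestion problem named above.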
Our solution: NeuroSearch
• The NeuroSearch resource discovery algorithm uses neural networks and evolution to adapt its behavior to a given environment
– a neural network for deciding whether to pass the query further down the connection or not
– evolution for breeding and finding the best neural network in a large class of local search algorithms
• To the authors' knowledge, this is the first time neural networks have been applied to the resource discovery problem
[Figure: an incoming query is forwarded to two neighbor nodes]
NeuroSearch’s Inputs
• The internal structure of the NeuroSearch algorithm
• Multiple layers enable the algorithm to express non-linear behavior
• With enough neurons the algorithm can universally approximate any decision function
[Figure: neural network diagram with layers labeled Tanh, Tanh and Threshold]
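A layered decision network of this kind might look like the sketch below. Only the layer labels (two tanh layers and a threshold output) come from the slide; the input features, the layer sizes, the bias handling, and the threshold at zero are assumptions made for illustration.

```python
import math
import random

def dot(weights, values):
    return sum(w * v for w, v in zip(weights, values))

def layer(rows, values, activation):
    # Each row holds one neuron's weights; the last weight acts as a bias
    extended = list(values) + [1.0]
    return [activation(dot(row, extended)) for row in rows]

def should_forward(inputs, network):
    """Return True if the query should be forwarded down this connection."""
    hidden1, hidden2, output = network
    h1 = layer(hidden1, inputs, math.tanh)
    h2 = layer(hidden2, h1, math.tanh)
    # Threshold output: assumed to fire when the weighted sum is non-negative
    (decision,) = layer(output, h2, lambda x: 1.0 if x >= 0.0 else 0.0)
    return decision == 1.0

def random_network(n_inputs, n_hidden1, n_hidden2, rng):
    """Create random weights for two tanh layers and one threshold neuron."""
    def rows(n_neurons, n_in):
        return [[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)]
                for _ in range(n_neurons)]
    return (rows(n_hidden1, n_inputs), rows(n_hidden2, n_hidden1), rows(1, n_hidden2))

# Hypothetical inputs, for example hops so far, neighbor degree, packets sent
network = random_network(3, 4, 2, random.Random(1))
print(should_forward([0.1, 0.2, 0.3], network))
```

The weights are random here; the point of the training process is to find weights for which these forward/do-not-forward decisions add up to an efficient search.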
NeuroSearch’s Training Program
• The neural network weights define how the neural network behaves, so they must be adjusted to the right values
• This is done using an iterative optimization process based on evolution and Gaussian mutation
[Flowchart: Define the P2P network conditions → Define the fitness requirements for the algorithm → Create candidate algorithms randomly → Select the best ones for the next generation → Breed a new population → Iterate for thousands of generations → Finally select the best algorithm for these conditions → Compare the best one against Breadth-First Search]
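The training loop on this slide can be approximated by a short sketch. The population size, number of survivors, mutation strength `sigma`, and the choice to carry the selected parents over unchanged are assumptions, and `toy_fitness` is a stand-in for the real evaluation, which would run queries in a simulated P2P network.

```python
import random

def evolve(fitness, n_weights, population_size=20, survivors=5,
           generations=200, sigma=0.1, seed=0):
    """Evolve a weight vector using selection and Gaussian mutation."""
    rng = random.Random(seed)
    # Create candidate weight vectors randomly
    population = [[rng.uniform(-1.0, 1.0) for _ in range(n_weights)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Select the best ones for the next generation
        population.sort(key=fitness, reverse=True)
        parents = population[:survivors]
        # Breed a new population by adding Gaussian noise to parent weights
        children = []
        while len(parents) + len(children) < population_size:
            parent = rng.choice(parents)
            children.append([w + rng.gauss(0.0, sigma) for w in parent])
        population = parents + children
    # Finally select the best candidate found
    return max(population, key=fitness)

# Stand-in fitness: closeness to an arbitrary target weight vector
TARGET = [0.5, -0.3, 0.8]

def toy_fitness(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))

best = evolve(toy_fitness, n_weights=len(TARGET))
print(best, toy_fitness(best))
```

Because the parents survive each generation in this sketch, the best fitness never decreases from one generation to the next.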
Well How Good Is The Algorithm?
• We defined a peer-to-peer network scenario where:
– 100 nodes form a power-law distributed P2P network with a few hubs and lots of low-connectivity nodes
– Resources are distributed based on the number of connections a node has, meaning that high-connectivity nodes are more likely to answer queries
– The topology is static, so the nodes are not moving
• Then we defined a fitness function for the algorithm stating that:
– An algorithm that stops is always better than an algorithm that does not
– The algorithm should locate half of the available resources for each query
– The algorithm should use as few packets as possible
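One way to turn the three fitness requirements into something comparable is a lexicographic key, sketched below. The slide gives only the requirements, not a formula, so the ordering of the criteria and the function name `fitness_key` are assumptions rather than the authors' actual fitness function.

```python
def fitness_key(stopped, found, available, packets):
    """Build a sortable key for one query; larger keys are better.

    Assumed priority order: stopping first, then reaching the target of
    half the available resources, then using fewer packets.
    """
    target = available / 2
    # Resources beyond the target earn no extra credit in this sketch
    return (stopped, min(found, target), -packets)

# A stopping algorithm beats a non-stopping one regardless of other numbers
print(fitness_key(True, 0, 10, 1000) > fitness_key(False, 10, 10, 1))
# Among algorithms that reach the target, fewer packets wins
print(fitness_key(True, 5, 10, 30) > fitness_key(True, 9, 10, 40))
```

Python compares tuples element by element, so a later criterion only matters when all earlier ones are tied.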
Well How Good Is The Algorithm?
• After two weeks we were ready to compare the algorithm evolved by NeuroSearch against Breadth-First Search in a 100-query test scenario
• The measurements indicate that the optimization process had developed an algorithm that:
– finds half of the resources in the network with high probability
– is more efficient than BFS with a maximum of three hops (BFS-3) and as efficient as BFS-2, while still locating the required 50% of resources
– has stable performance regardless of where the querier is located
• The conclusion is that the approach is feasible, but not yet optimal
Algorithm     Packets   Resources       Resources/Packets (Efficiency)
BFS-2         3000      619 (37.1%)     0.2063
BFS-3         12202     1295 (66.7%)    0.1061
NeuroSearch   4719      975 (53.2%)     0.2066
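The efficiency column is the number of located resources divided by the number of packets used. A short check, using only the packet and resource counts reported in the table (the variable and function names are chosen for this example):

```python
# (packets, resources located) per algorithm, taken from the table
results = {
    "BFS-2": (3000, 619),
    "BFS-3": (12202, 1295),
    "NeuroSearch": (4719, 975),
}

def efficiency(packets, resources):
    """Resources located per packet sent."""
    return resources / packets

for name, (packets, resources) in results.items():
    print(f"{name:<12} {efficiency(packets, resources):.4f}")
```

NeuroSearch and BFS-2 come out nearly equal in efficiency, while BFS-3 spends roughly twice as many packets per located resource.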
Evolution Of Neural Networks
The best neural network of the 85,736th generation was selected for testing
Performance of NeuroSearch – Hit Rate
NeuroSearch slightly misses the target of 50% of the resources in 8 queries
Performance of NeuroSearch - Resources
BFS locates more resources when the query starts from central nodes
Performance of NeuroSearch - Packets
NeuroSearch is stable and the performance does not depend on where the query is started
Typical query pattern of NeuroSearch
The maximum number of hops is 5
Future Work
• Now the first version of NeuroSearch is ready and analyzed
• The future work of NeuroSearch includes:
– Analysis of the effects of varying the neural network’s structure
  • New input types to feed NeuroSearch with more information
  • Adjusting the number of neurons to allow NeuroSearch to make wiser decisions
– Studying the scalability factors affecting NeuroSearch when the P2P network size grows
– Developing an optimal resource discovery algorithm using global knowledge, to be able to measure the best efficiency a resource discovery algorithm can achieve
– Speeding up the optimization process by parallelizing the evolutionary algorithm using distributed computing
Thank You!
Any questions?