Page 1: Improving Search in Peer-to-Peer Networks Beverly Yang Hector Garcia-Molina Presented by Shreeram Sahasrabudhe (sas4@lehigh.edu)

Improving Search in Peer-to-Peer Networks

Beverly Yang Hector Garcia-Molina

Presented by

Shreeram Sahasrabudhe

(sas4@lehigh.edu)

Page 2

Goals

Three search techniques:

1. Iterative Deepening
2. Directed BFS
3. Local Indices

Evaluation and extensive measurements of these techniques on the Gnutella network.

Ready-to-use results and recommendations.

Basically: just trying to reduce the number of nodes that handle a query.

Page 3

Current Techniques

Gnutella: Breadth-First Search (BFS) with depth limit D (typically 7).
Disadvantages:
• Wastage of resources
• Inefficient

Freenet: Depth-First Search (DFS).
Disadvantages:
• Poor response time

Page 4

Iterative Deepening

Required:
• System-wide policy P = {a, b, c}: the depths of the successive search iterations
• Time between successive iterations: W

Diagram (source S, policy P = {a, b, c}):
1. S sends the query with TTL a; nodes at depth a freeze it.
2. S waits for W.
3. If the query is not yet satisfied, S sends Resend [(TTL a) + query_id].
4. … the query is unfrozen and forwarded with TTL b-a, reaching depth b.
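The iteration scheme on this slide can be sketched as a small centralized simulation. This is only an illustration under stated assumptions: the names `nodes_within`, `iterative_deepening`, `has_result` and `wanted` are hypothetical, the graph is an adjacency dictionary, and the wait W and the Resend message are reduced to a comment because nothing is actually sent over a network here.

```python
from collections import deque

def nodes_within(graph, source, depth):
    """Return {node: hop distance} for every node within `depth` hops of `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == depth:
            continue  # TTL exhausted: do not forward further
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def iterative_deepening(graph, source, has_result, policy, wanted):
    """Search at the successive depths in `policy` until `wanted` results are found.

    Returns (set of nodes holding results, depth reached).
    """
    results = set()
    searched = 0  # depth already covered by earlier iterations
    for depth in policy:
        dist = nodes_within(graph, source, depth)
        # Only nodes beyond the previously frozen frontier process the query now.
        for node, d in dist.items():
            if searched < d <= depth and has_result(node):
                results.add(node)
        searched = depth
        if len(results) >= wanted:
            break  # query satisfied: no further iteration is started
        # A real client would wait W here and then send the Resend message.
    return results, searched
```

With a policy such as `[1, 2, 4]`, a query that is satisfied at depth 2 never reaches the nodes at depths 3 and 4, which is where the savings over a single deep BFS come from.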

Page 5

Directed BFS

Send queries to a subset of nodes.

Subset nodes are selected by heuristics like: select the node …
• that has returned the highest number of results for previous queries
• whose response messages have taken the lowest average number of hops
• that has forwarded the most messages to our client
• that has the shortest message queue
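The heuristics listed above amount to ranking neighbors by one locally observed statistic. A minimal sketch, assuming each client keeps per-neighbor counters; the `NeighborStats` fields, the heuristic names and `pick_neighbors` are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass

@dataclass
class NeighborStats:
    results_returned: int    # results received through this neighbor for past queries
    avg_hops: float          # average hops taken by its response messages
    messages_forwarded: int  # messages it has forwarded to our client
    queue_length: int        # current length of its message queue

# Each heuristic maps stats to a sort key; a smaller key means a better neighbor.
HEURISTICS = {
    "most_results": lambda s: -s.results_returned,
    "fewest_hops": lambda s: s.avg_hops,
    "most_messages": lambda s: -s.messages_forwarded,
    "shortest_queue": lambda s: s.queue_length,
}

def pick_neighbors(stats, heuristic, k=1):
    """Return the k neighbor ids ranked best under the named heuristic."""
    key = HEURISTICS[heuristic]
    return sorted(stats, key=lambda n: key(stats[n]))[:k]
```

The query is then sent only to the returned neighbors instead of to all of them.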

Page 6

Local Indices

• Each node n maintains an index of the data held by nodes within r hops.
• So a node can process a query on behalf of every node within r hops.
• Small r = less storage (e.g. 70 KB for r = 1).

Diagram: source S with policy P = {1, 5}. Nodes at depths 1 and 5 process the query; nodes at depths 2, 3 and 4 only forward it.
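To make the index idea concrete, here is a rough sketch that builds one node's index over everything within r hops and answers a query from it. The function names, the adjacency-dictionary graph and the `files` mapping (node id to file names) are assumptions made for illustration only.

```python
from collections import deque

def nodes_within(graph, source, r):
    """Return the set of nodes at most r hops from `source`, including `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == r:
            continue  # radius reached: do not expand further
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return set(dist)

def build_local_index(graph, files, node, r):
    """Map each file name to the nodes within r hops of `node` that hold it."""
    index = {}
    for other in nodes_within(graph, node, r):
        for name in files.get(other, ()):
            index.setdefault(name, set()).add(other)
    return index

def answer_query(index, name):
    """Answer on behalf of all indexed nodes without contacting them."""
    return index.get(name, set())
```

A larger r lets one node answer for more of the network, at the price of a larger index on every node.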

Page 7

More work

Node Join
• Sends a join message with TTL of r, containing metadata over its collection
• A node receiving a join message sends a return join message with its own metadata

Periodic refreshes

Cost ??
• QueryJoinRatio = average ratio of queries to join messages
• QueryUpdateRatio = average ratio of queries to update messages
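The join exchange described above can be sketched as a simulation in which the join message is a BFS bounded by TTL r. Everything here is an assumption for illustration: the `join` function, the `indices` layout (node id to a dictionary of indexed node ids and their metadata) and the in-memory graph. Refreshes and index updates between already connected nodes are left out.

```python
from collections import deque

def join(graph, indices, metadata, new_node, neighbors, r):
    """Connect `new_node` to `neighbors` and flood a join message with TTL r.

    indices[n] maps node id -> metadata for every node that n has indexed.
    """
    graph[new_node] = list(neighbors)
    for nb in neighbors:
        graph[nb].append(new_node)
    indices[new_node] = {new_node: metadata[new_node]}

    dist = {new_node: 0}
    queue = deque([new_node])
    while queue:
        node = queue.popleft()
        if dist[node] == r:
            continue  # TTL of the join message is used up
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
                # The receiver indexes the newcomer's metadata ...
                indices[nb][new_node] = metadata[new_node]
                # ... and its return join message gives the newcomer its own metadata.
                indices[new_node][nb] = metadata[nb]
    return indices
```

Each join therefore costs messages in both directions within r hops, which is why the ratio of queries to joins matters for the overall cost.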

Page 8

Experiment

Data Collection
• Observed Gnutella network traffic for 1 month
• Determined some general statistics like the average number of files shared per user, query strings, etc.

Iterative Deepening
• For each query Q sent: log the response messages arriving within 2 min.
• Ping messages to all neighbors: hops and IP address.
• Same data used for Local Indices

Directed BFS
• Same as above, but each query is sent to a single node.

Page 9

Cost

Bandwidth cost in BFS and processing cost (the formulas themselves are not preserved in this transcript). Terms labelled on the slide:
• Nodes at depth n
• Redundant edges between depths n-1 and n
• Size of query message
• Total records
• Response messages from nodes at depth n
• Size of header
• Size of record

Page 10

Results

Iterative Deepening
• Neighbors = 8
• Desired number of results Z = 50
• Policies P = {P_d = {d, d+1, …, D} for d = 1, 2, 3, …, D}
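The policy family above can be enumerated mechanically. A tiny sketch, where the function name `policies` is hypothetical: for a maximum depth D it returns each starting depth d together with its policy {d, d+1, …, D}.

```python
def policies(D):
    """Return {d: [d, d+1, ..., D]} for every starting depth d = 1..D."""
    return {d: list(range(d, D + 1)) for d in range(1, D + 1)}
```

For D = 3 this gives the policies [1, 2, 3], [2, 3] and [3]; the last one is a single search at full depth.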

• d affects cost
• W affects cost
• "overshooting"
• W affects time
• d affects time

Plot: cost

Page 11

Directed BFS

• Studied 8 heuristics
• 'Random neighbor' is the baseline for comparison

Plot: cost

Page 12

Local Indices

Page 13

Conclusions

• Three new search systems specified and tested.
• Recommend: Local Indices with r = 1.
• Savings: 61% bandwidth, 49% processing