replication strategies in unstructured peer-to-peer networks
![Page 1: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/1.jpg)
Replication Strategies in Unstructured Peer-to-Peer Networks
Edith Cohen Scott Shenker
This is a modified version of the original presentation by the authors
![Page 2: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/2.jpg)
Search in Basic P2P Architectures
• Centralized: central directory server (Napster).
• Decentralized:
  – Structured (DHTs): only exact-match queries; tightly controlled overlay.
  – Unstructured (Gnutella, FastTrack): search is “blind”; probed peers are unrelated to the query.
![Page 3: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/3.jpg)
Replication in P2P architectures
• No proactive replication (Gnutella)
  – Hosts store and serve only what they requested
  – A copy can be found only by probing a host with a copy
• Proactive replication of “keys” (= metadata + pointer) for search efficiency (FastTrack, DHTs)
• Proactive replication of “copies” for search and download efficiency, anonymity (Freenet)
![Page 4: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/4.jpg)
QUESTION: How can replication be used to improve search efficiency in unstructured networks with a proactive replication mechanism?
![Page 5: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/5.jpg)
Search and replication model
• Search: probe hosts uniformly at random until the query is satisfied (or the maximum search size is exceeded).
Goal: minimize the average search size (number of probes until the query is satisfied).
• Replication: each host can store up to ρ copies (or keys = metadata + pointer) of items.
Unstructured networks with replication of keys or copies: peers probed (in the search and replication process) are unrelated to the query/item, so the probe success likelihood cannot be better, on average, than that of random probes.
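
The search model above can be sketched as a small simulation. This is only an illustration under stated assumptions: probes are drawn with replacement, and the function names, host counts, and trial counts are hypothetical choices, not values from the presentation.

```python
import random


def search_size(n_hosts, holders, max_probes, rng):
    """Probe hosts uniformly at random until one holds the item.

    Returns the number of probes used, or max_probes if the search gave up.
    """
    for probe in range(1, max_probes + 1):
        if rng.randrange(n_hosts) in holders:
            return probe
    return max_probes


def average_search_size(n_hosts, n_holders, max_probes, trials, seed=0):
    """Average search size over many independent queries for one item."""
    rng = random.Random(seed)
    # Which hosts hold the item does not matter under uniform probing.
    holders = set(range(n_holders))
    total = sum(search_size(n_hosts, holders, max_probes, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # 10% of the hosts hold the item, so roughly 1 / 0.1 = 10 probes are expected.
    print(average_search_size(n_hosts=1000, n_holders=100, max_probes=10_000, trials=20_000))
```

Because every probe succeeds independently with the same probability, the search size is geometrically distributed, which is why the average comes out close to the reciprocal of the fraction of hosts holding the item.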
![Page 6: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/6.jpg)
Search size
What is the search size of a query?
• Insoluble queries: maximum search size
• Soluble queries: number of nodes a query needs to visit until the answer is found
We look at the Expected Search Size (ESS) of each item.
The ESS is inversely proportional to the fraction of peers with a copy of the item.
• A query is soluble if there are sufficiently many copies of the item.
• A query is insoluble if the item is rare or nonexistent.
![Page 7: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/7.jpg)
Search Example
[Figure: two example searches, labeled “2 probes” and “4 probes”.]
![Page 8: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/8.jpg)
Notations
• $m$ items with relative query rates
• $n$ nodes (peers), each with a uniform capacity $\rho$
• $R = \rho n$ is the total available space
• $r_i$ = number of copies of item $i$. Thus $p_i = r_i/R$ is the fraction of the total space allocated to item $i$, and $\sum_i p_i = 1$
• $q_i$ = normalized query rate for item $i$. Thus $\sum_i q_i = 1$
![Page 9: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/9.jpg)
Notations
• Allocation $p = (r_1/R, r_2/R, \ldots, r_m/R)$
• A replication strategy is a mapping from $q$ to $p$.
• Assumption: $R \ge m \ge \rho$. (If $m < \rho$, then one can copy every item to all the nodes. If $R < m$, then no allocation can store a copy of all $m$ objects.)
![Page 10: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/10.jpg)
Expected Search Size (ESS)
• Allocation: $p_1, p_2, p_3, \ldots, p_m$ with $\sum_i p_i = 1$
• $r_i/n = \rho p_i$ is the fraction of hosts storing a copy of item $i$
• $m$ items with relative query rates $q_1 > q_2 > q_3 > \ldots > q_m$, with $\sum_i q_i = 1$
• The search size for the $i$th item is a geometric r.v. with mean $A_i = 1/(\rho p_i)$.
• ESS is $\sum_i q_i A_i = (\sum_i q_i / p_i)/\rho$
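
As a hedged illustration of the ESS definition, the sketch below evaluates the sum of query rate times mean search size for a given allocation. Here `rho` stands for the per-host capacity (the number of copies each host can store); the function name and the sample numbers are hypothetical.

```python
def expected_search_size(q, p, rho):
    """ESS = sum_i q_i * A_i, with A_i = 1 / (rho * p_i).

    q:   normalized query rates (sum to 1)
    p:   allocation, fraction of the total space per item (sum to 1)
    rho: per-host capacity (copies each host can store)
    """
    if len(q) != len(p):
        raise ValueError("q and p must have the same length")
    if any(pi <= 0 for pi in p):
        raise ValueError("every item needs a positive allocation to be soluble")
    return sum(qi / (rho * pi) for qi, pi in zip(q, p))


# Hypothetical numbers, chosen only for illustration.
q = [0.5, 0.3, 0.2]
p = [0.4, 0.35, 0.25]
print(expected_search_size(q, p, rho=2))
```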
![Page 11: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/11.jpg)
Uniform and Proportional Replication
Two natural strategies:
• Uniform Allocation: pi = 1/m
  – Simple; resources are divided equally
• Proportional Allocation: pi = qi
  – “Fair”; resources per item are proportional to demand
  – Reflects current P2P practices
![Page 12: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/12.jpg)
Uniform and Proportional Replication
Example: 3 items, q1=1/2, q2=1/3, q3=1/6
q1 > q2 > q3
[Figure panels: Uniform, Proportional]
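
To make the three-item example concrete, the following sketch turns each allocation into copy counts. The total of 12 copies is a hypothetical value picked so that every count is an integer; it does not come from the slides.

```python
from fractions import Fraction


def copy_counts(p, total_copies):
    """Number of copies r_i = p_i * R for each item, given allocation p."""
    counts = [pi * total_copies for pi in p]
    if any(c.denominator != 1 for c in counts):
        raise ValueError("total_copies does not divide evenly under this allocation")
    return [int(c) for c in counts]


q = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # query rates from the example
R = 12  # hypothetical total number of copies

uniform = [Fraction(1, len(q))] * len(q)  # p_i = 1/m
proportional = list(q)                    # p_i = q_i

print(copy_counts(uniform, R))       # [4, 4, 4]
print(copy_counts(proportional, R))  # [6, 4, 2]
```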
![Page 13: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/13.jpg)
Basic Questions
• How do Uniform and Proportional allocations perform/compare?
• Which strategy minimizes the Expected Search Size (ESS)?
• Is there a simple protocol that achieves optimal replication in decentralized unstructured networks?
![Page 14: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/14.jpg)
Insoluble queries
• Search always extends to the maximum allowed search size.
• If we fix the available storage for copies, the query rate distribution, and the number of items that we wish to be “locatable”, then:
  – The maximum required search size depends on the smallest allocation of an item.
  – Thus, Uniform allocation minimizes this maximum and therefore the cost induced by insoluble queries.
What about the cost of soluble queries? The answer is more surprising …
![Page 15: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/15.jpg)
Uniform and Proportional Allocations (ESS for soluble queries)
Lemma: The ESS under either Uniform or Proportional allocation is $m/\rho$
– Independent of query rates (!!!)
– Same ESS for Proportional and Uniform (!!!)
![Page 16: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/16.jpg)
Proof outline
Proportional: Average Search Size is
$(\sum_i q_i / p_i)/\rho = (\sum_i q_i / q_i)/\rho = m/\rho$
Uniform: Average Search Size is
$(\sum_i q_i / p_i)/\rho = (\sum_i m\, q_i)/\rho = (m/\rho) \sum_i q_i = m/\rho$
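
The lemma can also be checked numerically. The sketch below uses exact rational arithmetic and a few arbitrary query-rate distributions; `rho` denotes the per-host capacity, and its value and the weight lists are hypothetical.

```python
from fractions import Fraction


def ess(q, p, rho):
    """Expected search size: (sum_i q_i / p_i) / rho."""
    return sum(qi / pi for qi, pi in zip(q, p)) / rho


def normalize(weights):
    """Turn positive integer weights into exact fractions that sum to 1."""
    total = sum(weights)
    return [Fraction(w, total) for w in weights]


rho = 4  # hypothetical per-host capacity
for weights in ([3, 2, 1], [10, 5, 3, 1, 1], [1, 1, 1, 1]):
    q = normalize(weights)  # an arbitrary query-rate distribution
    m = len(q)
    uniform = [Fraction(1, m)] * m
    proportional = q
    assert ess(q, uniform, rho) == Fraction(m, rho)
    assert ess(q, proportional, rho) == Fraction(m, rho)
print("Uniform and Proportional both give ESS = m / rho")
```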
![Page 17: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/17.jpg)
Space of Possible Allocations
Definition: Allocation $p_1, p_2, p_3, \ldots, p_m$ is “in-between” Uniform and Proportional if for $1 \le i < m$, $q_{i+1}/q_i < p_{i+1}/p_i < 1$.
Theorem 1: All (strictly) in-between strategies are (strictly) better than Uniform and Proportional.
Theorem 2: $p$ is worse than Uniform/Proportional if for all $i$, $p_{i+1}/p_i > 1$ (more popular gets less), OR for all $i$, $q_{i+1}/q_i > p_{i+1}/p_i$ (less popular gets less than its “fair share”).
(These are unreasonable strategies.)
Proportional and Uniform are the worst “reasonable” strategies (!!!)
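
A small numeric check can make the in-between definition and the two theorems tangible. The sketch compares the sum of q_i / p_i (the ESS up to a constant capacity factor) against m, the value shared by Uniform and Proportional. Both sample allocations are hypothetical.

```python
def ess_sum(q, p):
    """sum_i q_i / p_i, i.e. the ESS multiplied by the per-host capacity."""
    return sum(qi / pi for qi, pi in zip(q, p))


def is_strictly_in_between(q, p):
    """True if q[i+1]/q[i] < p[i+1]/p[i] < 1 for every consecutive pair."""
    return all(
        q[i + 1] / q[i] < p[i + 1] / p[i] < 1
        for i in range(len(q) - 1)
    )


q = [1 / 2, 1 / 3, 1 / 6]            # query rates from the earlier example
m = len(q)                           # Uniform and Proportional both score m
in_between = [0.42, 0.33, 0.25]      # hypothetical in-between allocation
reversed_alloc = [0.25, 0.33, 0.42]  # hypothetical: more popular items get less

print(is_strictly_in_between(q, in_between), ess_sum(q, in_between) < m)
print(is_strictly_in_between(q, reversed_alloc), ess_sum(q, reversed_alloc) > m)
```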
![Page 18: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/18.jpg)
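[Figure: Space of allocations on 2 items, with axes q2/q1 and p2/p1. Regions: worse than Proportional/Uniform (more popular item gets less); worse than Proportional/Uniform (more popular item gets more than its proportional share); better than Proportional/Uniform. Marked allocations: Uniform, Proportional, SR.]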
![Page 19: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/19.jpg)
So, what is the best strategy for soluble queries ?
![Page 20: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/20.jpg)
Square-Root Allocation
$p_i$ is proportional to the square root of $q_i$:
$$p_i = \frac{\sqrt{q_i}}{\sum_{j=1}^{m} \sqrt{q_j}}$$
• Lies “in-between” Uniform and Proportional
• Theorem: Square-Root allocation minimizes the ESS (on soluble queries)
Minimize $\sum_i q_i / p_i$ such that $\sum_i p_i = 1$
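
One standard way to see why the square-root form solves this minimization is a Lagrange-multiplier argument. The following is a sketch of that argument, not necessarily the proof used by the authors; $\lambda$ is the multiplier for the constraint that the allocation sums to 1.

```latex
\begin{align*}
\mathcal{L}(p,\lambda) &= \sum_{i=1}^{m} \frac{q_i}{p_i}
  + \lambda\Bigl(\sum_{i=1}^{m} p_i - 1\Bigr) \\
\frac{\partial \mathcal{L}}{\partial p_i} &= -\frac{q_i}{p_i^{2}} + \lambda = 0
  \;\Longrightarrow\; p_i = \sqrt{q_i/\lambda} \\
\sum_{i} p_i = 1 &\;\Longrightarrow\;
  p_i = \frac{\sqrt{q_i}}{\sum_{j=1}^{m}\sqrt{q_j}} \\
\sum_{i} \frac{q_i}{p_i} &= \Bigl(\sum_{i=1}^{m}\sqrt{q_i}\Bigr)^{2}
\end{align*}
```

Each term $q_i/p_i$ is convex for $p_i > 0$, so the stationary point is the minimum. The last line gives the minimum value of the objective, to be compared with $m$ for Uniform and Proportional.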
![Page 21: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/21.jpg)
How much can we gain by using SR?
Zipf-like query rates: $q_i \propto 1/i^{w}$
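
As a rough illustration of the possible gain, the sketch below computes the ratio of the Uniform (equivalently Proportional) ESS to the Square-Root ESS under Zipf-like query rates, with rates proportional to 1 / i**w. The item count and the exponents tried are hypothetical, and no output values are claimed here.

```python
import math


def zipf_rates(m, w):
    """Normalized Zipf-like query rates: q_i proportional to 1 / i**w."""
    weights = [1.0 / (i ** w) for i in range(1, m + 1)]
    total = sum(weights)
    return [x / total for x in weights]


def sr_gain(q):
    """Ratio ESS(Uniform) / ESS(Square-Root) for query rates q.

    Uniform (and Proportional) give sum_i q_i / p_i = m, while the
    square-root allocation gives (sum_i sqrt(q_i))**2. The per-host
    capacity cancels in the ratio.
    """
    s = sum(math.sqrt(qi) for qi in q)
    return len(q) / (s * s)


for w in (0.5, 1.0, 1.5, 2.0):
    print(w, round(sr_gain(zipf_rates(10_000, w)), 2))
```

With w = 0 all query rates are equal, the three allocations coincide, and the ratio is 1. The ratio grows as the distribution becomes more skewed.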
![Page 22: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/22.jpg)
What is the optimal strategy?
OK:
• SR is best for soluble queries
• Uniform minimizes the cost of insoluble queries
OPT is a hybrid of Uniform and SR, tuned to balance the cost of soluble and insoluble queries.
![Page 23: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/23.jpg)
[Figure: allocations ranging from Uniform to SR; 10^4 items, Zipf-like w = 1.5. Curves: All Soluble, 85% Soluble, All Insoluble.]
![Page 24: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/24.jpg)
We now know what we need.
How do we get there?
![Page 25: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/25.jpg)
Replication Algorithms
Desired properties of algorithm:
• Fully distributed, where peers communicate through random probes; minimal bookkeeping; and no more communication than what is needed for search.
• Converge to/obtain SR allocation when query rates remain steady.
Uniform and Proportional are “easy”:
– Uniform: when an item is created, replicate its key in a fixed number of hosts.
– Proportional: for each query, replicate the key in a fixed number of hosts.
![Page 26: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/26.jpg)
Model for Copy Creation/Deletion
• Creation: after a successful search of size s, C(s) new copies are created at random hosts.
• Deletion: independent of the identity of the item; copy survival chances are non-decreasing with creation time (i.e., FIFO at each node).
Property of the process:
$\langle C_i \rangle$ is the average value of C used to replicate the $i$th item.
Claim: If $\langle C_i \rangle / \langle C_j \rangle$ remains fixed over time, and $\langle C_i \rangle, \langle C_j \rangle > 0$, then $p_i/p_j \to q_i \langle C_i \rangle / (q_j \langle C_j \rangle)$.
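
A heuristic steady-state balance suggests why the allocation ratio tracks the product of query rate and average copies created per query. This is a sketch, not the authors' proof. The symbols $\lambda$ (total query rate) and $D$ (total deletion rate) are introduced here only for illustration.

```latex
\begin{align*}
\text{creation rate of item } i &= \lambda\, q_i \langle C_i \rangle \\
\text{deletion rate of item } i &= D\, p_i
  \quad\text{(deletion ignores item identity)} \\
\lambda\, q_i \langle C_i \rangle = D\, p_i
  &\;\Longrightarrow\;
  \frac{p_i}{p_j} = \frac{q_i \langle C_i \rangle}{q_j \langle C_j \rangle}
\end{align*}
```

In steady state the creation and deletion rates of each item must match, and since deleted copies are drawn without regard to the item, the share of deletions hitting item $i$ equals its share of the space.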
![Page 27: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/27.jpg)
Creation/Deletion Process
Corollary: If $\langle C_i \rangle \propto 1/\sqrt{q_i}$, then $p_i/p_j \to \sqrt{q_i}/\sqrt{q_j}$.
An algorithm for square-root allocation needs to have $\langle C_i \rangle$ equal to, or converging to, a value inversely proportional to $\sqrt{q_i}$.
![Page 28: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/28.jpg)
SR Replication Algorithms
• Path replication: the number of new copies C(s) is proportional to the size of the search (Freenet)
  – Converges to SR allocation (under reasonable conditions)
  – Convergence unstable with delayed creations
• Sibling memory: each copy remembers the number of sibling copies
  – Quickly “on target”
  – For “good estimates”, need to find several copies
• Probe memory: each peer records the number and combined search size of probes it sees for each item. C(s) is determined by collecting this info from a number of peers proportional to the search size.
  – Immediately “on target”
  – Extra communication (proportional to that needed for search)
![Page 29: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/29.jpg)
Algorithm 1: Path Replication
• Number of new copies produced per query, $\langle C_i \rangle$, is proportional to the search size $1/p_i$
• Creation rate is proportional to $q_i \langle C_i \rangle$
• Steady state: creation rate is proportional to allocation $p_i$, thus
$q_i \langle C_i \rangle \propto q_i / p_i \propto p_i$
$p_i \propto \sqrt{q_i}$
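
The steady-state argument for path replication can be illustrated with a deterministic, mean-field iteration. This is a simplified sketch under assumptions not in the slides: there is no randomness and no creation delay, and in each round a fixed fraction `alpha` of all copies (a hypothetical turnover rate) is deleted regardless of item and replaced by new copies created in proportion to q_i / p_i.

```python
import math


def path_replication_step(p, q, alpha):
    """One mean-field round of path replication.

    A fraction alpha of all copies is deleted regardless of item, and the
    freed space is refilled with new copies created in proportion to
    q_i * C_i, where C_i is proportional to the search size 1 / p_i.
    """
    creation = [qi / pi for qi, pi in zip(q, p)]
    total = sum(creation)
    return [(1 - alpha) * pi + alpha * ci / total for pi, ci in zip(p, creation)]


def square_root_allocation(q):
    """p_i = sqrt(q_i) / sum_j sqrt(q_j)."""
    roots = [math.sqrt(qi) for qi in q]
    total = sum(roots)
    return [r / total for r in roots]


q = [1 / 2, 1 / 3, 1 / 6]   # query rates from the earlier example
p = [1 / len(q)] * len(q)   # start from the Uniform allocation
for _ in range(300):
    p = path_replication_step(p, q, alpha=0.2)  # alpha is a hypothetical value

target = square_root_allocation(q)
print([round(x, 4) for x in p])
print([round(x, 4) for x in target])
```

In this idealized setting the iteration settles on the square-root allocation. It says nothing about the instability with delayed creations, which this sketch does not model.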
![Page 30: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/30.jpg)
Simulation
[Figure: hosts with copy over time for Path replication and Sibling number; delay = 0.25 × copy lifetime; 10000 hosts.]
![Page 31: Replication Strategies in Unstructured Peer-to-Peer Networks](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d3c550346895dcb4110/html5/thumbnails/31.jpg)
Summary
• Random search/replication model: probes to “random” hosts
• Proportional allocation: current practice
• Uniform allocation: best for insoluble queries
• Soluble queries:
  – Proportional and Uniform allocations are two extremes with the same average performance
  – Square-Root allocation minimizes the Average Search Size
• OPT (all queries) lies between SR and Uniform
• SR/OPT allocation can be realized by simple algorithms.