Unstructured P2P Networks
Achieving Robustness and Scalability in Data Dissemination Scenarios
Michael Mirold
Seminar on Advanced Topics in Distributed Computing 2007/2008
2
Introduction
Structured P2P: node graph with a predefined structure
– Examples: Chord, Pastry, CAN
Unstructured P2P: random node graph
– Examples: Gnutella, BitTorrent
3
Part I: Do Ut Des, Tit-for-Tat, or
“How Leech-Proof is BitTorrent Really?”
Based on the papers:
• “Incentives Build Robustness in BitTorrent”, Bram Cohen
• “Do Incentives Build Robustness in BitTorrent?”, Michael Piatek, Tomas Isdal, Thomas Anderson, Arvind Krishnamurthy, Arun Venkataramani
4
What is BitTorrent?
Used for distribution of single files File is split into pieces (32kB – 2MB) Pieces are distributed within a swarm
– denotes all nodes that are interested in the file
Downloaded pieces are redistributed No single “server”: True peer-to-peer
(except perhaps of tracker)
5
How Does BitTorrent Work? (1)
“I really want everyone to enjoy my new holiday pictures (HDR, some GB).”
Actors: Initial Seed, Tracker, “Ordinary” Web Server
1.1 Create the Torrent-File
1.2 Put the Torrent-File onto the web server
1.3 Register as “downloader” with the tracker
Torrent-File contents:
– URL of the tracker
– name: name of the file
– piece length
– pieces (concatenation of the SHA1 hashes of all pieces)
– file length
(a sketch of how these fields fit together follows below)
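To make the torrent-file contents concrete, here is a minimal Python sketch of how those fields could be assembled. The piece length, file name and tracker URL below are made-up example values, and a real client would additionally bencode the dictionary:

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # 256 kB, within the 32 kB - 2 MB range mentioned above

def build_metainfo(data: bytes, name: str, tracker_url: str) -> dict:
    """Assemble the fields of a torrent file: tracker URL, file name,
    piece length, concatenated SHA1 hashes of all pieces, and file length."""
    pieces = b""
    for offset in range(0, len(data), PIECE_LENGTH):
        piece = data[offset:offset + PIECE_LENGTH]
        pieces += hashlib.sha1(piece).digest()  # 20 bytes per piece
    return {
        "announce": tracker_url,          # URL of the tracker
        "info": {
            "name": name,                 # name of the file
            "piece length": PIECE_LENGTH, # size of each piece in bytes
            "pieces": pieces,             # concatenation of SHA1 hashes
            "length": len(data),          # total file length
        },
    }

# Example: metainfo for the holiday pictures archive (names are placeholders)
meta = build_metainfo(b"\x00" * (3 * PIECE_LENGTH + 100),
                      "holiday_pictures.tar", "http://example.org/announce")
print(len(meta["info"]["pieces"]) // 20, "pieces")
```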
6
How Does BitTorrent Work? (2)
Actors: Initial Seed, Tracker, “Ordinary” Web Server
2.1 Request the Torrent-File from the web server
2.2 Register as downloader with the tracker
2.3 Tracker sends the peer set (“local neighborhood”)
2.4 Open connections to peers in the peer set
2.5 Handshake
2.6 Request pieces
2.7 Send pieces
(a sketch of the handshake message from step 2.5 follows below)
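As an illustration of step 2.5, a client might construct the fixed-format handshake message roughly like this; the info_hash and peer_id below are placeholder values for illustration only:

```python
import hashlib

def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """BitTorrent handshake: length-prefixed protocol string, 8 reserved
    bytes, the 20-byte SHA1 of the torrent's info dictionary, and the
    20-byte peer id (steps 2.4/2.5 above)."""
    assert len(info_hash) == 20 and len(peer_id) == 20
    pstr = b"BitTorrent protocol"
    return bytes([len(pstr)]) + pstr + bytes(8) + info_hash + peer_id

# Placeholder values for illustration only
info_hash = hashlib.sha1(b"bencoded info dictionary would go here").digest()
peer_id = b"-XX0001-" + b"0123456789ab"  # 20 bytes total
handshake = build_handshake(info_hash, peer_id)
print(len(handshake))  # 68 bytes: 1 + 19 + 8 + 20 + 20
```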
7
At some Unspecial Moment…
[Diagram: the swarm consists of seeds and non-seeds; our node knows a peer set, from which it unchokes a statically sized active set.]
8
BitTorrent in Action
9
Achieving Fairness in BitTorrent
Basis: Tit-for-Tat strategy
– Famous in game theory (considered the “best” strategy for winning the iterated Prisoners’ Dilemma)
– Idea:
  • Be cooperative to others
  • Penalize defective behaviour, but don’t be too unforgiving
Put in the BitTorrent context:
– Grant upload capacity to the n best uploaders
  • n: size of the active set + number of optimistic unchokes
– “Choke”, i.e. stop uploading to, peers that don’t perform well
  • recompute choking every 10 seconds
– However: “optimistically unchoke” peers
  • twice every 30 seconds
Reward good uploaders with capacity (a minimal sketch of this choking loop follows below)
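A minimal sketch of this choking loop. The active-set size, the number of optimistic unchoke slots and the observed upload rates are illustrative assumptions; this follows the slide's description, not the reference implementation's exact code:

```python
import random

ACTIVE_SET_SIZE = 4        # statically sized active set (assumed value)
OPTIMISTIC_UNCHOKES = 1    # number of optimistic unchoke slots (assumed)

def recompute_unchokes(download_rate: dict, interested: set, round_no: int,
                       current_optimistic: list) -> tuple:
    """Tit-for-tat: unchoke the n best uploaders to us; everyone else is choked.
    Called every 10 seconds; optimistic unchokes are rotated every 30 seconds
    (i.e. every third call)."""
    # Rank interested peers by how fast they upload to us
    ranked = sorted(interested, key=lambda p: download_rate.get(p, 0.0), reverse=True)
    unchoked = set(ranked[:ACTIVE_SET_SIZE])

    # Rotate optimistic unchokes: give a random choked peer a chance to prove itself
    if round_no % 3 == 0 or not current_optimistic:
        candidates = [p for p in interested if p not in unchoked]
        current_optimistic = random.sample(candidates,
                                           min(OPTIMISTIC_UNCHOKES, len(candidates)))
    unchoked.update(current_optimistic)
    return unchoked, current_optimistic

# Toy usage: peers "a".."f" with observed upload rates to us (kB/s)
rates = {"a": 50, "b": 10, "c": 80, "d": 5, "e": 30, "f": 20}
unchoked, opt = recompute_unchokes(rates, set(rates), round_no=0, current_optimistic=[])
print(sorted(unchoked))
```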
10
Sounds good in theory, but…
What if you have LOTS of upload capacity?
– most of your peers are slower (see later)
– nevertheless you must choose a few
  • the fastest peers are probably already done
– AND you have split your upload capacity equally!
You give your capacity away for free. This is called “ALTRUISM”.
Altruism is THE source of unfairness in BitTorrent. Unfair peers only need to:
1. be slightly “better” than average
2. prefer “rich girls”, i.e. highly altruistic peers
Remember: the active set has a static size.
11
Unfairness/Altruism Illustrated
[Figure: measure of altruism]
12
Unfairness/Altruism Illustrated (2)
Altruism as wasted upload capacity
– Slow clients never get reciprocated: every byte is wasted
– Fast clients contribute more than necessary
13
Real World Observations
The reference implementation uses an active set size of …
This is what you are competing against: be a bit faster!
If you can upload > 14 kB/s, the reciprocation probability is > 99%!
14
The Optimal Active Set Size *
* for a peer with 300 kB/s UL capacity
[Figure: optimal active set size]
15
BitTyrant – A Selfish BitTorrent Client
Based on the Azureus client
– publicly available at http://bittyrant.cs.washington.edu
Exploits the unfairness in BitTorrent; minimizes altruism to improve efficiency
Mechanisms:
– choose only the “best” peers with respect to the ratio of download benefit to upload cost (see next slide)
– deviate from the equal split
– optimize the active set size
16
Peer Choice Optimization Algorithm
Step invariant: maintain u_p, d_p for each peer p
– d_p: estimated download performance from p
– u_p: estimated upload needed for reciprocation with p
Initially: set according to a theoretical distribution
Each step: rank order peers by the ratio d_p / u_p and choose the best peers for unchoking
After each round, update the estimates:
– Has p unchoked us?
– Has p not unchoked us?
– Has p unchoked us for the last r rounds?
(a sketch of this ranking and update loop follows below)
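A rough sketch of this ranking and update loop. The initial estimates, the threshold r and the adjustment factors below are illustrative assumptions (the paper derives initial values from a measured bandwidth distribution):

```python
def choose_unchokes(peers, d, u, upload_capacity):
    """BitTyrant-style selection: rank peers by expected benefit per cost
    d_p / u_p and unchoke greedily until the upload budget is spent."""
    ranked = sorted(peers, key=lambda p: d[p] / u[p], reverse=True)
    chosen, budget = [], upload_capacity
    for p in ranked:
        if u[p] <= budget:
            chosen.append(p)
            budget -= u[p]
    return chosen

def update_estimates(p, d, u, unchoked_us, observed_rate, reciprocation_streak,
                     r=3, raise_factor=1.2, lower_factor=0.9):
    """Per-round bookkeeping, following the three questions on the slide:
    - p unchoked us: refresh d_p with the observed download rate
    - p did not unchoke us: our offer u_p was apparently too low, raise it
    - p has unchoked us for the last r rounds: try lowering u_p to save capacity"""
    if unchoked_us:
        d[p] = observed_rate
        if reciprocation_streak >= r:
            u[p] *= lower_factor
    else:
        u[p] *= raise_factor

# Toy usage with hypothetical estimates (kB/s)
d = {"a": 40.0, "b": 25.0, "c": 60.0}
u = {"a": 10.0, "b": 14.0, "c": 30.0}
print(choose_unchokes(["a", "b", "c"], d, u, upload_capacity=30.0))
update_estimates("a", d, u, unchoked_us=True, observed_rate=45.0, reciprocation_streak=3)
```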
17
Experiences using BitTyrant
Multiple BitTyrant peers:
– results depend on a few factors and are not easily comparable
– strategic, i.e. peers use the adapted choking algorithm: swarm performance improves compared to BitTorrent
– strategic & selfish, i.e. peers don’t give back excess capacity: swarm performance decreases dramatically
One BitTyrant peer:
– median performance gain: 72%
18
Personal Opinion
The paper shows a nice “hack”
The paper shows that there is no perfect fairness
The paper shows a sensible optimization
But: I think the model is too restricted
– People’s goals are not considered
  • Altruistic people are often just that: altruistic (they don’t mind performing suboptimally)
  • Everyone is happy with BitTorrent, so why optimize?
And: will this paper make the world a better place?
19
Part II: Getting Almost Reliable Broadcast
with Almost No Pain: “Lightweight Probabilistic Broadcast”
Based on the paper:
• “Lightweight Probabilistic Broadcast”, 2003, Eugster, Guerraoui, Handurukande & Kouznetsov
20
Background & Motivation
Large-scale event dissemination
– Processes p1, …, pn subscribe to topic t
– Event e with topic t is delivered to p1, …, pn
Reliable Broadcast
– scales poorly
Network-level Broadcast/Multicast
– lacks reliability guarantees
– also has scalability problems
A complete view of the network leads to unsustainable memory demands
21
The lpbcast Protocol
The system contains n processes Π = {p1, …, pn}
– dynamically joining and leaving
Processes subscribe to a single topic
– easily extensible to multiple topics
– joining/leaving == subscribing/unsubscribing
Gossip is sent to F random nodes in view_i of process p_i
– F is the “fanout” of the process
– view_i is the subset of processes currently known by p_i
Gossips are sent out periodically (non-synchronized)
22
Gossips
A gossip is an all-in-one record containing:
– gossip.events: event notifications
– gossip.subs: subscriptions
– gossip.unsubs: unsubscriptions
– gossip.eventIds: history of event ids received so far
Containers don’t contain duplicates, i.e. they are set-like lists (a small sketch of this record follows below)
23
Processes
Every process p has several buffers:
– view (fixed maximum size ℓ):
  • contains processes that are known to / seen by p
– subs (fixed maximum size |subs|_M):
  • contains subscriptions received by p
– unsubs (fixed maximum size |unsubs|_M):
  • contains unsubscriptions received by p
– events (fixed maximum size |events|_M):
  • contains event notifications since the last gossip emission
– eventIds (fixed maximum size |eventIds|_M):
  • ids of events seen so far
– retrieveBuf:
  • contains ids of events seen in gossip.eventIds but not yet received
24
Example
Fanout F = 3, |view|_M = 8
[Figure: process p_i and its view]
25
lpbcast procedures
Upon receiving a gossip message:
1. Update view and unSubs with unsubscriptions
2. Update view with new subscriptions
3. Update events with new notifications
   a. deliver notifications
   b. update retrieveBuf for later retrieval of missing notifications
4. Perform housekeeping (truncate containers)
When sending a gossip message:
– fill the gossip message accordingly and send it to F random members of the view
(a sketch of both procedures follows below)
26
Example: Time T0
[Diagram: three processes with their view, subs, unSubs, events, and eventId buffers, exchanging GOSSIP messages that carry subscriptions (e.g. {1, 2}, {3, 2}), events (e3, e4), and event ids.]
27
Example: Time T1
[Diagram: the same processes one gossip round later; subscriptions have been merged into the views (e.g. subs {1, 2, 3}), new events (e5, e18) are being disseminated, and the eventId histories now contain e3, e4, e5.]
28
Analytical Evaluation of lpbcast
Assumptions:
– Π is constant during the evaluation
– synchronous rounds
– an upper bound on latency
– identical fanout F for all processes
– probability of message loss ε
– probability of crash τ
– random views, independently and uniformly distributed
(a small simulation sketch under these assumptions follows below)
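Under these assumptions the dissemination can be simulated directly. A small Monte Carlo sketch that estimates how many synchronous rounds are needed to infect all correct processes; the parameter values are arbitrary examples, and crashes are simplified to processes failing before the first round:

```python
import random

def rounds_to_infect_all(n=50, fanout=3, loss=0.05, crash=0.01, trials=1000):
    """Synchronous-round gossip under the stated model: each infected process
    gossips to F uniformly chosen processes; messages are lost with probability
    epsilon and processes crash with probability tau (here: before round 1)."""
    total = 0
    for _ in range(trials):
        alive = [p for p in range(n) if random.random() >= crash]
        infected = {alive[0]} if alive else set()
        rounds = 0
        while len(infected) < len(alive) and rounds < 10 * n:
            newly = set()
            for p in infected:
                for q in random.sample(alive, min(fanout, len(alive))):
                    if random.random() >= loss:      # message not lost
                        newly.add(q)
            infected |= newly
            rounds += 1
        total += rounds
    return total / trials

print(rounds_to_infect_all())
```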
29
Analytical Evaluation
It turns out that throughput is independent of the view size ℓ
– provided that views are uniformly distributed
Membership stability (no partitioning)
– increases with growing view size and/or system size
– the partitioning probability increases only slowly with the number of rounds (about 10^12 rounds for n = 50, ℓ = 3)
30
Practical Observations
Throughput does depend slightly on ℓ
Explanation:
– views are not as uniformly distributed as assumed
31
Time to Infect all Processes
The simulation matches the measurements quite well
[Figure: time to infect all processes; fanout 3, one message injected, system size varies]
32
Thank you for listening!