1 school of computing science simon fraser university cmpt 765/408: p2p systems instructor: dr....
DESCRIPTION
3 When Did P2P Start? Napster (Late 1990’s) -Court shut Napster down in 2001 Gnutella (2000) Then the killer FastTrack (Kazaa,...) BitTorrent, and many others Accompanied by significant research interest Claim -P2P is much older than Napster! Proof -The original Internet! -Remember UUCP (unix-to-unix copy)?TRANSCRIPT
![Page 1: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/1.jpg)
1
School of Computing ScienceSimon Fraser University
CMPT 765/408: P2P SystemsCMPT 765/408: P2P Systems
Instructor: Dr. Mohamed HefeedaInstructor: Dr. Mohamed Hefeeda
![Page 2: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/2.jpg)
2
P2P Computing: Definitions
Peers cooperate to achieve desired functions- Peers:
• End-systems (typically, user machines)• Interconnected through an overlay network• Peer ≡ Like the others (similar or behave in similar manner)
- Cooperate: • Share resources, e.g., data, CPU cycles, storage, bandwidth• Participate in protocols, e.g., routing, replication, …
- Functions: • File-sharing, distributed computing, communications,
content distribution, … Note: the P2P concept is much wider than file
sharing
![Page 3: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/3.jpg)
3
When Did P2P Start?
Napster (Late 1990’s)- Court shut Napster down in 2001
Gnutella (2000) Then the killer FastTrack (Kazaa, ...) BitTorrent, and many others Accompanied by significant research interest Claim
- P2P is much older than Napster! Proof
- The original Internet!- Remember UUCP (unix-to-unix copy)?
![Page 4: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/4.jpg)
4
What IS and IS NOT New in P2P?
What is not new- Concepts!
What is new- The term P2P (may be!)- New characteristics of
• Nodes which constitute the • System that we build
![Page 5: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/5.jpg)
5
What IS NOT New in P2P?
Distributed architectures Distributed resource sharing Node management (join/leave/fail) Group communications Distributed state management ….
![Page 6: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/6.jpg)
6
What IS New in P2P?
Nodes (Peers)- Quite heterogeneous
• Several order of magnitudes difference in resources• Compare the bandwidth of a dial-up peer versus a
high-speed LAN peer - Unreliable
• Failure is the norm!- Offer limited capacity
• Load sharing and balancing are critical- Autonomous
• Rational, i.e., maximize their own benefits!• Motivations should be provided to peers to cooperate
in a way that optimizes the system performance
![Page 7: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/7.jpg)
7
What IS New in P2P? (cont’d)
System - Scale
• Numerous number of peers (millions) - Structure and topology
• Ad-hoc: No control over peer joining/leaving• Highly dynamic
- Membership/participation• Typically open
- More security concerns• Trust, privacy, data integrity, …
- Cost of building and running• Small fraction of same-scale centralized systems• How much would it cost to build/run a super computer
with processing power of that 3 Million SETI@Home PCs?
![Page 8: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/8.jpg)
8
What IS New in P2P? (cont’d)
So what? We need to design new lighter-weight
algorithms and protocols to scale to millions (or billions!) of nodes given the new characteristics
Question: why now, not two decades ago?- We did not have such abundant (and
underutilized) computing resources back then!- And, network connectivity was very limited
![Page 9: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/9.jpg)
9
Why is it Important to Study P2P?
P2P traffic is a major portion of Internet traffic (50+%), current killer app
P2P traffic has exceeded web traffic (former killer app)!
Direct implications on the design, administration, and use of computer networks and network resources
- Think of ISP designers or campus network administrators
Many potential distributed applications
![Page 10: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/10.jpg)
10
Sample P2P Applications
File sharing- Gnutella, Kazaa, BitTorrent, …
Distributed cycle sharing- SETI@home, Gnome@home, …
File and storage systems- OceanStore, CFS, Freenet, Farsite, …
Media streaming and content distribution- PROMISE- SplitStream, CoopNet, PeerCast, Bullet, Zigzag,
NICE, …
![Page 11: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/11.jpg)
11
P2P vs. its Cousin (Grid Computing)
Common Goal:- Aggregate resources (e.g., storage, CPU
cycles, and data) into a common pool and provide efficient access to them
Differences along five axes [Foster & Imanitchi 03] - Target communities and applications- Type of shared resources- Scalability of the system- Services provided- Software required
![Page 12: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/12.jpg)
12
P2P vs Grid Computing (cont’d)
Issue Grid P2P
Communities and Applications
Established communities, e.g., scientific institutions Computationally-intensive problems
Grass-root communities (anonymous) Mostly, file-swapping
Resources Shared
Powerful and Reliable machines, clusters High-speed connectivity Specialized instruments
PCs with limited capacity and connectivity Unreliable Very diverse
![Page 13: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/13.jpg)
13
P2P vs Grid Computing (cont’d)
Issue Grid P2P
System Scalability
Hundreds to thousands of nodes
Hundreds of thousands to Millions of nodes
Services Provided
Sophisticated services: authentication, resource discovery, scheduling, access control, and membership control Members usually trust others
Limited services: resource discovery limited trust among peers
Software required
Sophisticated suit: e.g., Globus, Condor
Simple: (screen saver), e.g., Kazza, SETI@Home
![Page 14: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/14.jpg)
14
P2P vs Grid Computing: Discussion
The differences mentioned are based on the traditional view of each paradigm
- It is conceived that both paradigms will converge and will complement each other [e.g., Butt et al. 03]
Target communities and applications- Grid: is going open
Type of shared resources- P2P: is to include various and more powerful resources
Scalability of the system- Grid: is to increase number of nodes
Services provided- P2P: is to provide authentication, data integrity, trust
management, …
![Page 15: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/15.jpg)
15
P2P Systems: Simple Model
P2P Substrate
Operating System
Hardware
Middleware
P2P Application
Software architecture model on a peer
System architecture: Peers form an overlay according to the P2P
Substrate
![Page 16: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/16.jpg)
16
Overlay Network
An abstract layer built on top of physical network Neighbors in overlay can be several hops away in
physical network Why do we need overlays?
- Flexibility in • Choosing neighbors • Forming and customizing topology to fit application’s
needs (e.g., short delay, reliability, high BW, …) • Designing communication protocols among nodes
- Get around limitations in legacy networks- Enable new (and old!) network services
![Page 17: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/17.jpg)
17
Overlay Network (cont’d)
![Page 18: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/18.jpg)
18
Overlay Network (cont’d)
Some applications that use overlays- Application level multicast, e.g., ESM, Zigzag, NICE, …- Reliable inter-domain routing, e.g., RON- Content Distribution Networks (CDN)- Peer-to-peer file sharing
Overlay design issues- Select neighbors- Handle node arrivals, departures- Detect and handle failures (nodes, links)- Monitor and adapt to network dynamics- Match with the underlying physical network
![Page 19: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/19.jpg)
19
Overlay Network (cont’d) Recall: IP Multicast
source
![Page 20: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/20.jpg)
20
Overlay Network (cont’d)Application Level Multicast (ALM)
source
![Page 21: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/21.jpg)
21
Peer Software Model
A software client installed on each peer
Three components:- P2P Substrate- Middleware- P2P Application
P2P Substrate
Operating System
Hardware
Middleware
P2P Application
Software model on peer
![Page 22: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/22.jpg)
22
Peer Software Model (cont’d)
P2P Substrate (key component)- Overlay management
• Construction • Maintenance (peer join/leave/fail and network
dynamics)- Resource management
• Allocation (storage) • Discovery (routing and lookup)
Ex: Pastry, CAN, Chord, …
![Page 23: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/23.jpg)
23
Peer Software Model (cont’d)
Middleware- Provides auxiliary services to P2P applications:
• Peer selection • Trust management • Data integrity validation• Authentication and authorization• Membership management• Accounting (Economics and rationality) • …
- Ex: CollectCast, EigenTrust, Micro payment
![Page 24: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/24.jpg)
24
Peer Software Model (cont’d)
P2P Application- Potentially, there could be multiple applications
running on top of a single P2P substrate- Applications include
• File sharing• File and storage systems• Distributed cycle sharing • Content distribution
- This layer provides some functions and bookkeeping relevant to target application
• File assembly (file sharing)• Buffering and rate smoothing (streaming)
Ex: Promise, Bullet, CFS
![Page 25: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/25.jpg)
25
P2P Substrate
Key component, which - Manages the Overlay - Allocates and discovers objects
P2P Substrates can be - Structured- Unstructured- Based on the flexibility of placing objects at
peers
![Page 26: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/26.jpg)
26
P2P Substrates: Classification
Structured (or tightly controlled, DHT) − Objects are rigidly assigned to specific peers− Looks like as a Distributed Hash Table (DHT)− Efficient search & guarantee of finding− Lack of partial name and keyword queries− Maintenance overhead− Ex: Chord, CAN, Pastry, Tapestry, Kademila (Overnet)
Unstructured (or loosely controlled)− Objects can be anywhere− Support partial name and keyword queries− Inefficient search & no guarantee of finding− Some heuristics exist to enhance performance − Ex: Gnutella, Kazaa (super node), GIA [Chawathe et al. 03]
![Page 27: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/27.jpg)
27
Structured P2P Substrates
Objects are rigidly assigned to peers− Objects and peers have IDs (usually by
hashing some attributes)− Objects are assigned to peers based on IDs
Peers in overlay form specific geometrical shape, e.g.,
- tree, ring, hypercube, butterfly network Shape (to some extent) determines
− How neighbors are chosen, and − How messages are routed
![Page 28: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/28.jpg)
28
Structured P2P Substrates (cont’d)
Substrate provides a Distributed Hash Table (DHT)-like interface
− InsertObject (key, value), findObject (key), …− In the literature, many authors refer to
structured P2P substrates as DHTs It also provides peer management (join,
leave, fail) operations Most of these operations are done in O(log
n) steps, n is number of peers
![Page 29: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/29.jpg)
29
Structured P2P Substrates (cont’d)
DHTs: Efficient search & guarantee of finding
However, − Lack of partial name and keyword queries− Maintenance overhead, even O(log n) may be too
much in very dynamic environments Ex: Chord, CAN, Pastry, Tapestry, Kademila
(Overnet)
![Page 30: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/30.jpg)
30
Example: Content Addressable Network (CAN) [Ratnasamy 01]
− Nodes form an overlay in d-dimensional space− Node IDs are chosen randomly from the d-space− Object IDs (keys) are chosen from the same d-space
− Space is dynamically partitioned into zones− Each node owns a zone − Zones are split and merged as nodes join and
leave− Each node stores
− The portion of the hash table that belongs to its zone− Information about its immediate neighbors in the d-
space
![Page 31: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/31.jpg)
31
2-d CAN: Dynamic Space Division
n1
n2
n3
n4
0
0
7
7
n5
![Page 32: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/32.jpg)
32
2-d CAN: Key Assignment
n1
n2
n3
n4
0
0
7
7
K1K2
K3
K4
n5
![Page 33: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/33.jpg)
33
2-d CAN: Routing (Lookup)
n1
n2
n3
n4
0
0
7
7
K1K2
K3
K4
n5
K4?
K4?
![Page 34: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/34.jpg)
34
CAN: Routing
− Nodes keep 2d = O(d) state information (neighbor coordinates, IPs)
− Constant, does not depend on number of nodes n
− Greedy routing - Route to the node that is closest to the
destination- On average, is done in O(n1/d) = O(log n) when
d = log n /2
![Page 35: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/35.jpg)
35
CAN: Node Join
− New node finds a node already in the CAN− (bootstrap: one (or a few) dedicated nodes outside the
CAN maintain a partial list of active nodes)− It finds a node whose zone will be split
− Choose a random point P (will be its ID)− Forward a JOIN request to P through the existing node
− The node that owns P splits its zone and sends half of its routing table to the new node
− Neighbors of the split zone are notified
![Page 36: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/36.jpg)
36
CAN: Node Leave, Fail − Graceful departure
− The leaving node hands over its zone to one of its neighbors
− Failure− Detected by the absence of heart beat messages sent
periodically in regular operation− Neighbors initiate takeover timers, proportional to the
volume of their zones− Neighbor with smallest timer takes over zone of dead
node − notifies other neighbors so they cancel their timers (some
negotiation between neighbors may occur)− Note: the (key, value) entries stored at the failed node
are lost− Nodes that insert (key, value) pairs periodically refresh (or
re-insert) them
![Page 37: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/37.jpg)
37
CAN: Discussion − Scalable
− O(log n) steps for operations − State information is O(d) at each node
− Locality − Nodes are neighbors in the overlay, not in the physical
network− Suggestion (for better routing)
− Each node measure RTT between itself and its neighbors− Forward the request to the neighbor with maximum ratio
of progress to RTT− Maintenance cost
− Logarithmic− But, may still be too much for very dynamic P2P
systems
![Page 38: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/38.jpg)
38
Unstructured P2P Substrates
− Objects can be anywhere Loosely-controlled overlays
− The loose control− Makes overlay tolerate transient behavior of nodes
− For example, when a peer leaves, nothing needs to be done because there is no structure to restore
− Enables system to support flexible search queries− Queries are sent in plain text and every node runs a mini-
database engine− But, we loose on searching
− Usually using flooding, inefficient− Some heuristics exist to enhance performance− No guarantee on locating a requested object (e.g., rarely
requested objects)− Ex: Gnutella, Kazaa (super node), GIA [Chawathe et al.
03]
![Page 39: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/39.jpg)
39
Example: Gnutella
− Peers are called servents− All peers form an unstructured overlay− Peer join
− Find an active peer already in Gnutella (e.g., contact known Gnutella hosts)
− Send a Ping message through the active peer− Peers willing to accept new neighbors reply with Pong
− Peer leave, fail− Just drop out of the network!
− To search for a file− Send a Query message to all neighbors with a TTL (=7)− Upon receiving a Query message
− Check local database and reply with a QueryHit to requester
− Decrement TTL and forward to all neighbors if nonzero
![Page 40: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/40.jpg)
40
Flooding in Gnutella
Scalability Problem
![Page 41: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/41.jpg)
41
Heuristics for Searching [Yang and Garcia-Molina 02]
− Iterative deepening− Multiple BFS with increasing TTLs− Reduce traffic but increase response time
− Directed BFS− Send to “good” neighbors (subset of your neighbors
that returned many results in the past) need to keep history
− Local Indices− Keep a small index over files stored on neighbors
(within number of hops)− May answer queries on behalf of them− Save cost of sending queries over the network− Index currency?
![Page 42: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/42.jpg)
42
Heuristics for Searching: Super Node
− Used in Kazaa (signaling protocols are encrypted)
− Studied in [Chawathe 03] − Relatively powerful nodes play special role
− maintain indexes over other peers
![Page 43: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/43.jpg)
43
Unstructured Substrates with Super Nodes
Super Node (SN)
Ordinary Node (ON)
![Page 44: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/44.jpg)
44
Example: FastTrack Networks (Kazaa)
− Most of info/plots in following slides are from Understanding Kazaa by Liang et al.
− The most popular (~ 3 million active users in a typical day) sharing 5,000 Terabytes
− Kazaa traffic exceeds Web traffic− Two-tier architecture (with Super Nodes and
Ordinary Nodes)− SN maintains index on files stored at ONs
attached to it− ON reports to SN the following metadata on each file: − File name, file size, ContentHash, file descriptors (artist
name, album name, …)
![Page 45: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/45.jpg)
45
FastTrack Networks (cont’d)
− Mainly two types of traffic− Signaling
− Handshaking, connection establishment, uploading metadata, …
− Encrypted! (some reverse engineering efforts) − Over TCP connections between SN—SN and SN—ON − Analyzed in [Liang et al. 04]
− Content traffic − Files exchanged, not encrypted− All through HTTP between ON—ON − Detailed Analysis in [Gummadi et al. 03]
![Page 46: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/46.jpg)
46
Kazaa (cont’d)
− File search− ON sends a query to its SN− SN replies with a list of IPs of ONs that have
the file− SN may forward the query to other SNs
− Parallel downloads take place between supplying ONs and receiving ON
![Page 47: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/47.jpg)
47
FastTrack Networks (cont’d)
− Measurement study of Liang et al. − Hook three machines to Kazaa and wait till one
of them is promoted to be SN− Connect the other two (ONs) to that SN− Study several properties
− Topology structure and dynamics − Neighbor selection− Super node lifetime− ….
![Page 48: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/48.jpg)
48
Kazaa: Topology Structure [Liang et al. 04]
ON to SN: 100 - 160 connections Since there are ~3M nodes, we have ~30,000 SNs
SN to SN: 30 – 50 connections Each SN connects to ~0.1 % of total number of SNs
![Page 49: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/49.jpg)
49
Kazaa: Topology Dynamics [Liang et al. 04]
− Average ON – SN connection duration − Is ~ 1 hour, after removing very short-lived
connections (30 sec) used for shopping for SNs
− Average SN – SN connection duration− 23 min, which is short because of
− Connection shuffling between SNs to allow ONs to reach a larger set of objects
− SNs search for other SNs with smaller loads− SNs connect to each other from time to time to
exchange SN lists (each SN stores 200 other SNs in its cache)
![Page 50: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/50.jpg)
50
Kazaa: Neighbor Selection [Liang et al. 04]
− When ON first joins, it gets a list of 200 SNs − ON considers locality and SN workload in selecting its future SN
− Locality− 40% of ON-SN connections have RTT < 5 msec− 60% of ON-SN connections have RTT < 50 msec
− RTT: E. US Europe ~100 msec
![Page 51: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/51.jpg)
51
Kazaa: Lifetime and Signaling Overhead [Liang et al. 04]
− Super node average lifetime is ~2.5 hours− Overhead:
− 161 Kb/s upstream− 191 Kb/s downstream− Most of SNs are high-speed (campus network, or cable)
![Page 52: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/52.jpg)
52
Kazaa vs. Firewalls, NAT [Liang et al. 04]
− Default port WAS 1214− Easy for firewalls to filter out Kazaa traffic
− Now, Kazaa uses dynamic ports− Each peer chooses its random port
− ON reports its port to its SN− Ports of SNs are part of the SN refresh list exchanged among
peers− Too bad for firewalls!
− Network Address Translator (NAT)− A requesting peer can not establish a direct connection with a
serving peer behind NAT− Solution: connection reversal
− Send to SN of NATed peer, which already has a connection with it− SN tells NATed peer to establish a connection with requesting
peer! − Transfer occurs happily through the NAT− Both peers behind NATs?
![Page 53: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/53.jpg)
53
Kazaa: Lessons [Liang et al. 04]
− Distributed design− Exploit heterogeneity− Load balancing − Locality in neighbor selection− Connection Shuffling
− If a peer searches for a file and does not find it, it may try later and gets it!
− Efficient gossiping algorithms− To learn about other SNs and perform shuffling− Kazaa uses a “freshness” field in SN refresh list a
peer ignores stale data− Consider peers behind NATs and Firewalls
− They are everywhere!
![Page 54: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/54.jpg)
54
Summary
P2P is an active research area with many potential applications in industry and academia
In P2P computing paradigm:- Peers cooperate to achieve desired functions
New characteristics- heterogeneity, unreliability, rationality, scale, ad hoc new and lighter-weight algorithms are needed
Simple model for P2P systems:- Peers form an abstract layer called overlay- A peer software client may have three components
• P2P substrate, middleware, and P2P application• Borders between components may be blurred
![Page 55: 1 School of Computing Science Simon Fraser University CMPT 765/408: P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocument.in/reader034/viewer/2022051007/5a4d1b2c7f8b9ab059999659/html5/thumbnails/55.jpg)
55
Summary (cont’d)
P2P substrate: A key component, which - Manages the Overlay - Allocates and discovers objects
P2P Substrates can be - Structured (DHT)
• Example: CAN
- Unstructured• Example 1: Gnutella,• Example 2: Kazza