cs3516-10-p2p - worcester polytechnic institute · v assume a centralized index system that maps...
TRANSCRIPT
1
CS 3516: Advanced Computer Networks
Prof. Yanhua Li
Welcome to
Time: 9:00am –9:50am M, T, R, and F Location: Fuller 320
Fall 2017 A-term
Some slides are originally from the course materials of the textbook “Computer Networking: A Top Down Approach”, 7th edition, by
Jim Kurose, Keith Ross, Addison-Wesley March 2016. Copyright 1996-2017 J.F Kurose and K.W. Ross, All Rights Reserved.
Application Layer 2-2
Chapter 2: outline
2.6 P2P applications 1. P2P vs Client&Server 2. Unstructured Peer-to-Peer Networks BitTorrent 3. Structured Peer-to-Peer Networks Distributed Hash Table (DHT)
Application Layer 2-5
Chapter 2: outline
2.6 P2P applications 1. P2P vs Client&Server 2. Unstructured Peer-to-Peer Networks § Napster § Gnutella § BitTorrent
6
Peer-to-Peer Networks: How Did it Start?
v A killer application: Napster § Free music over the Internet
v Key idea: share the content, storage and bandwidth of individual (home) users
Internet
7
Challenges
v Scale: up to hundred of thousands or millions of machines
v Dynamicity: machines can come and go any time
v Each user stores a subset of files v Each user has access (can download) files from all
users in the system
Model
10
Peer-to-Peer Networks: Napster v Napster history: the rise
§ January 1999: Napster version 1.0 § May 1999: company founded § September 1999: first lawsuits § 2000: 80 million users
v Napster history: the fall § Mid 2001: out of business due to lawsuits § Mid 2001: dozens of P2P alternatives that were harder
to touch, though these have gradually been constrained § 2003: growth of pay services like iTunes
v Napster history: the resurrection § 2003: Napster reconstituted as a pay service § 2006: still lots of file sharing going on
Shawn Fanning,Northeastern freshman
11
Napster Technology: Directory Service v User installing the software
§ Download the client program § Register name, password, local directory, etc.
v Client contacts Napster (via TCP) § Provides a list of music files it will share § … and Napster’s central server updates the directory
v Client searches on a title or performer § Napster identifies online clients with the file § … and provides IP addresses
v Client requests the file from the chosen supplier § Supplier transmits the file to the client § Both client and supplier report status to Napster
12
Napster v Assume a centralized index system that maps files
(songs) to machines that are alive v How to find a file (song)
§ Query the index system à return a machine that stores the required file
• Ideally this is the closest/least-loaded machine § ftp the file
v Advantages: § Simplicity, easy to implement sophisticated search
engines on top of the index system v Disadvantages:
§ Robustness, scalability
13
Napster Technology: Properties v Server’s directory continually updated
§ Always know what music is currently available v Peer-to-peer file transfer
§ No load on the server v As a protocol
§ Login, search, upload, download, and status operations § No security: cleartext passwords and other vulnerability
v Bandwidth issues § Suppliers ranked by apparent bandwidth & response time
15
Napster: Limitations of Central Directory
v Single point of failure v Performance bottleneck v Copyright infringement
v So, later P2P systems were more distributed
File transfer is decentralized, but locating content is highly centralized
16
Peer-to-Peer Networks: Gnutella
v Gnutella history § 2000: J. Frankel &
T. Pepper released Gnutella
§ Soon after: many other clients (e.g., Morpheus, Limewire, Bearshare)
§ 2001: protocol enhancements, e.g., “ultrapeers”
v Query flooding § Join: contact a few nodes to
become neighbors § Publish: no need! § Search: ask neighbors, who
ask their neighbors § Fetch: get file directly from
another node
17
Gnutella v Distribute file location v Idea: flood the request v How to find a file:
§ Send request to all neighbors § Neighbors recursively multicast the request § Eventually a machine that has the file receives
the request, and it sends back the answer v Advantages:
§ Totally decentralized, highly robust v Disadvantages:
§ Not scalable; the entire network can be swamped with request (to alleviate this problem, each request has a TTL)
18
Gnutella v Ad-hoc topology v Queries are flooded for bounded number of hops v No guarantees on recall
Query: “xyz”
xyz
xyz
19
Gnutella: Query Flooding
v Fully distributed § No central server
v Public domain protocol
v Many Gnutella clients implementing protocol
Overlay network: graph v Edge between peer X and
Y if there’s a TCP connection
v All active peers and edges is overlay net
20
Gnutella: Protocol
Query QueryHit
Query QueryHit
File transfer: HTTP
Scalability: limited scope flooding
• Query message sent over existing TCP connections
• Peers forward Query message
• QueryHit sent over reverse path
21
Gnutella: Pros and Cons
v Advantages § Fully decentralized,
v Disadvantages § Search scope may be quite large § Search time may be quite long § High overhead and nodes come and go often
Application Layer 2-22
Chapter 2: outline
2.6 P2P applications 1. P2P vs Client&Server 2. Unstructured Peer-to-Peer Networks BitTorrent
23
BitTorrent: Simultaneous Downloading
v Divide large file into many pieces § Replicate different pieces on different peers § A peer with a complete piece can trade with other
peers § Peer can (hopefully) assemble the entire file
v Allows simultaneous downloading § Retrieving different parts of the file from different
peers at the same time
24
BitTorrent Components v Seed
§ Peer with entire file § Fragmented in pieces
v Leech § Peer with an incomplete copy of the file
v Torrent file § Passive component § Stores summaries of the pieces to allow peers to verify
their integrity
v Tracker § Allows peers to find each other § Returns a list of random peers
25
BitTorrent: Overall Architecture
Web page with link to .torrent
A
B
C
Peer
[Leech]
Downloader
“US”
Peer
[Seed]
Peer
[Leech]
Tracker Web Server
.torre
nt
26
BitTorrent: Overall Architecture
Web page with link to .torrent
A
B
C
Peer
[Leech]
Downloader
“US”
Peer
[Seed]
Peer
[Leech]
Tracker Web Server
27
BitTorrent: Overall Architecture
Web page with link
to .torrent
A
B
C
Peer
[Leech]
Downloader
“US”
Peer
[Seed]
Peer
[Leech]
Tracker Web Server
28
BitTorrent: Overall Architecture
Web page with link
to .torrent
A
B
C
Peer
[Leech]
Downloader
“US”
Peer
[Seed]
Peer
[Leech]
Tracker
Shake-hand
Web Server
29
BitTorrent: Overall Architecture
Web page with link
to .torrent
A
B
C
Peer
[Leech]
Downloader
“US”
Peer
[Seed]
Peer
[Leech]
Tracker
pieces
Web Server
30
BitTorrent: Overall Architecture
Web page with link
to .torrent
A
B
C
Peer
[Leech]
Downloader
“US”
Peer
[Seed]
Peer
[Leech]
Tracker
pieces
Web Server
31
BitTorrent: Overall Architecture
Web page with link
to .torrent
A
B
C
Peer
[Leech]
Downloader
“US”
Peer
[Seed]
Peer
[Leech]
Tracker
pieces
Web Server
32
Free-Riding Problem in P2P Networks v Vast majority of users are free-riders
§ Most share no files and answer no queries § Others limit # of connections or upload speed
v A few “peers” essentially act as servers § A few individuals contributing to the public good § Making them hubs that basically act as a server
v BitTorrent prevent free riding § Allow the fastest peers to download from you § Occasionally let some free loaders download
Application Layer 2-33
BitTorrent: requesting, sending file chunks
requesting chunks: v at any given time, different
peers have different subsets of file chunks
v periodically, Alice asks each peer for list of chunks that they have
v Alice requests missing chunks from peers, rarest first
sending chunks: tit-for-tat v Alice sends chunks to those
four peers currently sending her chunks at highest rate § other peers are choked by Alice
(do not receive chunks from her) § re-evaluate top 4 every10 secs
v every 30 secs: randomly select another peer, starts sending chunks § “optimistically unchoke” this peer § newly chosen peer may join top 4
Application Layer 2-34
BitTorrent: tit-for-tat (1) Alice “optimistically unchokes” Bob (2) Alice becomes one of Bob’s top-four providers; Bob reciprocates (3) Bob becomes one of Alice’s top-four providers
higher upload rate: find better trading partners, get file faster !
Application Layer 2-35
Chapter 2: summary
v application architectures § client-server § P2P
v application service requirements: § reliability, throughput,
delay, security v Internet transport service
model § connection-oriented,
reliable: TCP § unreliable, datagrams: UDP
our study of network apps now complete!
v specific protocols: § HTTP § SMTP, POP, IMAP § DNS § P2P: BitTorrent
Application Layer 2-36
Quiz 5 on Friday P2P networks
Napster Gnutella BitTorrent
Mid-term next Friday (Tentative)