cs3516-10-p2p - worcester polytechnic institute · (1) alice “optimistically unchokes” bob (2)...

53
1 CS 3516: Advanced Computer Networks Prof. Yanhua Li Welcome to Time: 9:00am –9:50am M, T, R, and F Location: Fuller 320 Fall 2016 A-term Some slides are originally from the course materials of the textbook “Computer Networking: A Top Down Approach”, 6th edition, by Jim Kurose, Keith Ross, Addison-Wesley March 2012. Copyright 1996-2013 J.F Kurose and K.W. Ross, All Rights Reserved.

Upload: buithuan

Post on 09-Sep-2018

216 views

Category:

Documents


0 download

TRANSCRIPT

1

CS 3516: Advanced Computer Networks

Prof. Yanhua Li

Welcome to

Time: 9:00am –9:50am M, T, R, and F Location: Fuller 320

Fall 2016 A-term

Some slides are originally from the course materials of the textbook “Computer Networking: A Top Down Approach”, 6th edition, by

Jim Kurose, Keith Ross, Addison-Wesley March 2012. Copyright 1996-2013 J.F Kurose and K.W. Ross, All Rights Reserved.

Application Layer 2-2

Chapter 2: outline

2.6 P2P applications 1. P2P vs Client&Server 2. Unstructured Peer-to-Peer Networks BitTorrent 3. Structured Peer-to-Peer Networks Distributed Hash Table (DHT)

Application Layer 2-3

Pure P2P architecture v  no always-on server v  arbitrary end systems

directly communicate v  peers are intermittently

connected and change IP addresses

examples: §  file distribution

(BitTorrent) §  Streaming (KanKan) §  VoIP (Skype)

Application Layer 2-4

File distribution: client-server vs P2P

Question: how much time to distribute file (size F) from one server to N peers? §  peer upload/download capacity is limited resource

us

uN

dN

server

network (with abundant bandwidth)

file, size F

us: server upload capacity

ui: peer i upload capacity

di: peer i download capacity u2 d2

u1 d1

di

ui

Application Layer 2-5

File distribution time: client-server

v  server transmission: must sequentially send (upload) N file copies: §  time to send one copy: F/us §  time to send N copies: NF/us

increases linearly in N

time to distribute F to N clients using

client-server approach Dc-s > max{NF/us,,F/dmin}

v  client: each client must download file copy §  dmin = min client download rate §  min client download time: F/dmin

us

network di

ui

F

Application Layer 2-6

File distribution time: P2P

v  server transmission: must upload at least one copy §  time to send one copy: F/us

time to distribute F to N clients using

P2P approach

us

network di

ui

F

DP2P > max{F/us,,F/dmin,,NF/(us + Σui)}

v  client: each client must download file copy §  min client download time: F/dmin

v  clients: as aggregate must download NF bits §  max upload rate (limting max download rate) is us + Σui

… but so does this, as each peer brings service capacity increases linearly in N …

Application Layer 2-7

0

0.5

1

1.5

2

2.5

3

3.5

0 5 10 15 20 25 30 35

N

Min

imum

Dis

tribu

tion

Tim

e P2PClient-Server

Client-server vs. P2P: example

client upload rate = u, F/u = 1 hour, us = 10u, dmin ≥ us

8

Peer-to-Peer Networks: Unstructured and Structured v  Unstructured Peer-to-Peer Networks

§  Napster §  Gnutella §  BitTorrent

v  Distributed Hash Tables (DHT) and Structured Networks

Application Layer 2-9

Chapter 2: outline

2.6 P2P applications 1. P2P vs Client&Server 2. Unstructured Peer-to-Peer Networks § Napster § Gnutella §  BitTorrent 3. Structured Peer-to-Peer Networks Distributed Hash Table (DHT)

10

Peer-to-Peer Networks: How Did it Start?

v  A killer application: Napster §  Free music over the Internet

v  Key idea: share the content, storage and bandwidth of individual (home) users

Internet

11

Model

v  Each user stores a subset of files v  Each user has access (can download) files from all

users in the system

12

Main Challenge

v  Find where a particular file is stored

A B

C

D

E

F

E?

13

Other Challenges

v  Scale: up to hundred of thousands or millions of machines

v  Dynamicity: machines can come and go any time

14

Peer-to-Peer Networks: Napster v  Napster history: the rise

§  January 1999: Napster version 1.0 §  May 1999: company founded §  September 1999: first lawsuits §  2000: 80 million users

v  Napster history: the fall §  Mid 2001: out of business due to lawsuits §  Mid 2001: dozens of P2P alternatives that were harder

to touch, though these have gradually been constrained §  2003: growth of pay services like iTunes

v  Napster history: the resurrection §  2003: Napster reconstituted as a pay service §  2006: still lots of file sharing going on

Shawn Fanning,Northeastern freshman

15

Napster Technology: Directory Service v  User installing the software

§  Download the client program §  Register name, password, local directory, etc.

v  Client contacts Napster (via TCP) §  Provides a list of music files it will share §  … and Napster’s central server updates the directory

v  Client searches on a title or performer §  Napster identifies online clients with the file §  … and provides IP addresses

v  Client requests the file from the chosen supplier §  Supplier transmits the file to the client §  Both client and supplier report status to Napster

16

Napster v  Assume a centralized index system that maps files

(songs) to machines that are alive v  How to find a file (song)

§  Query the index system à return a machine that stores the required file

•  Ideally this is the closest/least-loaded machine §  ftp the file

v  Advantages: §  Simplicity, easy to implement sophisticated search

engines on top of the index system v  Disadvantages:

§  Robustness, scalability

17

Napster: Example

A B

C

D

E

F

m1 m2

m3

m4

m5

m6

m1 A m2 B m3 C m4 D m5 E m6 F

E? m5

E? E

18

Napster Technology: Properties v  Server’s directory continually updated

§  Always know what music is currently available v  Peer-to-peer file transfer

§  No load on the server v  As a protocol

§  Login, search, upload, download, and status operations §  No security: cleartext passwords and other vulnerability

v  Bandwidth issues §  Suppliers ranked by apparent bandwidth & response time

19

Napster: Limitations of Central Directory

v  Single point of failure v  Performance bottleneck v  Copyright infringement

v  So, later P2P systems were more distributed

File transfer is decentralized, but locating content is highly centralized

20

Peer-to-Peer Networks: Gnutella

v  Gnutella history §  2000: J. Frankel &

T. Pepper released Gnutella

§  Soon after: many other clients (e.g., Morpheus, Limewire, Bearshare)

§  2001: protocol enhancements, e.g., “ultrapeers”

v  Query flooding §  Join: contact a few nodes to

become neighbors §  Publish: no need! §  Search: ask neighbors, who

ask their neighbors §  Fetch: get file directly from

another node

21

Gnutella v  Distribute file location v  Idea: flood the request v  How to find a file:

§  Send request to all neighbors §  Neighbors recursively multicast the request §  Eventually a machine that has the file receives

the request, and it sends back the answer v  Advantages:

§  Totally decentralized, highly robust v  Disadvantages:

§  Not scalable; the entire network can be swamped with request (to alleviate this problem, each request has a TTL)

22

Gnutella v  Ad-hoc topology v  Queries are flooded for bounded number of hops v  No guarantees on recall

Query: “xyz”

xyz

xyz

23

Gnutella: Query Flooding

v  Fully distributed §  No central server

v  Public domain protocol

v  Many Gnutella clients implementing protocol

Overlay network: graph v  Edge between peer X and

Y if there’s a TCP connection

v  All active peers and edges is overlay net

24

Gnutella: Protocol

Query QueryHit

Query QueryHit

File transfer: HTTP

Scalability: limited scope flooding

•  Query message sent over existing TCP connections

•  Peers forward Query message

•  QueryHit sent over reverse path

25

Gnutella: Pros and Cons

v  Advantages §  Fully decentralized,

v  Disadvantages §  Search scope may be quite large §  Search time may be quite long §  High overhead and nodes come and go often

26

BitTorrent: Simultaneous Downloading

v  Divide large file into many pieces §  Replicate different pieces on different peers §  A peer with a complete piece can trade with other

peers §  Peer can (hopefully) assemble the entire file

v  Allows simultaneous downloading §  Retrieving different parts of the file from different

peers at the same time

27

BitTorrent Components v  Seed

§  Peer with entire file §  Fragmented in pieces

v  Leech §  Peer with an incomplete copy of the file

v  Torrent file §  Passive component §  Stores summaries of the pieces to allow peers to verify

their integrity

v  Tracker §  Allows peers to find each other §  Returns a list of random peers

28

BitTorrent: Overall Architecture

Web page with link to .torrent

A

B

C

Peer

[Leech]

Downloader

“US”

Peer

[Seed]

Peer

[Leech]

Tracker Web Server

.torre

nt

29

BitTorrent: Overall Architecture

Web page with link to .torrent

A

B

C

Peer

[Leech]

Downloader

“US”

Peer

[Seed]

Peer

[Leech]

Tracker Web Server

30

BitTorrent: Overall Architecture

Web page with link

to .torrent

A

B

C

Peer

[Leech]

Downloader

“US”

Peer

[Seed]

Peer

[Leech]

Tracker Web Server

31

BitTorrent: Overall Architecture

Web page with link

to .torrent

A

B

C

Peer

[Leech]

Downloader

“US”

Peer

[Seed]

Peer

[Leech]

Tracker

Shake-hand

Web Server

32

BitTorrent: Overall Architecture

Web page with link

to .torrent

A

B

C

Peer

[Leech]

Downloader

“US”

Peer

[Seed]

Peer

[Leech]

Tracker

pieces

Web Server

33

BitTorrent: Overall Architecture

Web page with link

to .torrent

A

B

C

Peer

[Leech]

Downloader

“US”

Peer

[Seed]

Peer

[Leech]

Tracker

pieces

Web Server

34

BitTorrent: Overall Architecture

Web page with link

to .torrent

A

B

C

Peer

[Leech]

Downloader

“US”

Peer

[Seed]

Peer

[Leech]

Tracker

pieces

Web Server

35

Free-Riding Problem in P2P Networks v  Vast majority of users are free-riders

§  Most share no files and answer no queries §  Others limit # of connections or upload speed

v  A few “peers” essentially act as servers §  A few individuals contributing to the public good §  Making them hubs that basically act as a server

v  BitTorrent prevent free riding §  Allow the fastest peers to download from you §  Occasionally let some free loaders download

Application Layer 2-36

P2P file distribution: BitTorrent

tracker: tracks peers participating in torrent

torrent: group of peers exchanging chunks of a file

Alice arrives …

v  file divided into 256Kb chunks v  peers in torrent send/receive file chunks

… obtains list of peers from tracker … and begins exchanging file chunks with peers in torrent

Application Layer 2-37

v  peer joining torrent: §  has no chunks, but will

accumulate them over time from other peers

§  registers with tracker to get list of peers, connects to subset of peers (“neighbors”)

P2P file distribution: BitTorrent

v  while downloading, peer uploads chunks to other peers v  peer may change peers with whom it exchanges chunks v  churn: peers may come and go v  once peer has entire file, it may (selfishly) leave or

(altruistically) remain in torrent

Application Layer 2-38

BitTorrent: requesting, sending file chunks

requesting chunks: v  at any given time, different

peers have different subsets of file chunks

v  periodically, Alice asks each peer for list of chunks that they have

v  Alice requests missing chunks from peers, rarest first

sending chunks: tit-for-tat v  Alice sends chunks to those

four peers currently sending her chunks at highest rate §  other peers are choked by Alice

(do not receive chunks from her) §  re-evaluate top 4 every10 secs

v  every 30 secs: randomly select another peer, starts sending chunks §  “optimistically unchoke” this peer §  newly chosen peer may join top 4

Application Layer 2-39

BitTorrent: tit-for-tat (1) Alice “optimistically unchokes” Bob (2) Alice becomes one of Bob’s top-four providers; Bob reciprocates (3) Bob becomes one of Alice’s top-four providers

higher upload rate: find better trading partners, get file faster !

Application Layer 2-40

Chapter 2: outline

2.6 P2P applications 1. P2P vs Client&Server 2. Unstructured Peer-to-Peer Networks BitTorrent 3. Structured Peer-to-Peer Networks Distributed Hash Table (DHT)

Distributed Hash Table (DHT)

v  Hash table

v  DHT paradigm

v  Circular DHT and overlay networks

v  Peer churn

Key Value John Washington 132-54-3570 Diana Louise Jones 761-55-3791 Xiaoming Liu 385-41-0902 Rakesh Gopal 441-89-1956 Linda Cohen 217-66-5609 ……. ……… Lisa Kobayashi 177-23-0199

Simple database with(key, value) pairs: •  key: human name; value: social security #

Simple Database

•  key: movie title; value: IP address

Original Key Key Value John Washington 8962458 132-54-3570 Diana Louise Jones 7800356 761-55-3791 Xiaoming Liu 1567109 385-41-0902 Rakesh Gopal 2360012 441-89-1956 Linda Cohen 5430938 217-66-5609 ……. ……… Lisa Kobayashi 9290124 177-23-0199

•  More convenient to store and search on numerical representation of key •  key = hash(original key)

Hash Table

v  Distribute (key, value) pairs over millions of peers §  pairs are evenly distributed over peers

v  Any peer can query database with a key §  database returns value for the key §  To resolve query, small number of messages exchanged among

peers v  Each peer only knows about a small number of other

peers v  Robust to peers coming and going (churn)

Distributed Hash Table (DHT)

Naïve method: Random distribution of (key, value) pairs.

Assign key-value pairs to peers v  rule: assign key-value pair to the peer that has the

closest ID. v  convention: closest is the immediate successor of

the key. v  e.g., ID space {0,1,2,3,…,63} v  suppose 8 peers: 1,12,13,25,32,40,48,60

§  If key = 51, then assigned to peer 60 §  If key = 60, then assigned to peer 60 §  If key = 61, then assigned to peer 1

1

12

13

25

3240

48

60

Circular DHT

•  each peer only aware of immediate successor and predecessor.

“overlay network”

1

12

13

25

3240

48

60

Whatisthevalueassociatedwithkey53?

value

O(N) messages on average to resolve query, when there are N peers

Resolving a query

Circular DHT with shortcuts

•  each peer keeps track of IP addresses of predecessor, successor, short cuts.

•  reduced from 6 to 3 messages. •  possible to design shortcuts with O(log N) neighbors, O(log N)

messages in query

1

12

13

25

3240

48

60

Whatisthevalueforkey53value

Peer churn

example: peer 5 abruptly leaves

1

3

4

5

810

12

15

handling peer churn: v peers may come and go (churn) v each peer knows address of its two successors v each peer periodically pings its two successors to check aliveness v if immediate successor leaves, choose next successor as new immediate successor

Peer churn

example: peer 5 abruptly leaves v peer 4 detects peer 5’s departure; makes 8 its immediate successor v  4 asks 8 who its immediate successor is; makes 8’s immediate successor its second successor.

1

3

4

810

12

15

handling peer churn: v peers may come and go (churn) v each peer knows address of its two successors v each peer periodically pings its two successors to check aliveness v if immediate successor leaves, choose next successor as new immediate successor

How about node 3?

Application Layer 2-51

Chapter 2: summary

v  application architectures §  client-server §  P2P

v  application service requirements: §  reliability, throughput,

delay, security v  Internet transport service

model §  connection-oriented,

reliable: TCP §  unreliable, datagrams: UDP

our study of network apps now complete!

v  specific protocols: §  HTTP §  FTP §  SMTP, POP, IMAP §  DNS §  P2P: BitTorrent, DHT

Application Layer 2-52

v  typical request/reply message exchange: §  client requests info or

service §  server responds with

data, status code v  message formats:

§  headers: fields giving info about data

§  data: info being communicated

important themes: v  control vs. data msgs

§  in-band, out-of-band v  centralized vs. decentralized v  stateless vs. stateful v  reliable vs. unreliable msg

transfer v  “complexity at network

edge”

Chapter 2: summary most importantly: learned about protocols!

Questions?

Application Layer 2-53