challenges in p2p computing - semantic scholar · 2015. 7. 28. · 2 peer-to-peer model qp2p: a...



Challenges in P2P Computing

Lionel M. Ni

Department of Computer Science
Hong Kong Univ. of Science & Technology
[email protected]
http://www.cs.ust.hk/~ni


Peer-to-Peer Model

- P2P: a class of systems and applications that employ distributed resources to perform a critical function in a decentralized manner
  - Link the resources of all peers
  - Resources: storage, CPU cycles, content, etc.
  - All peers are servers and equal (highly scalable)
  - All peers are autonomous (different owners)
  - Peers are both clients and servers
- Examples: Napster, Gnutella, KaZaA, BitTorrent, eDonkey, …


Taxonomy of P2P File Sharing Networks

- Unstructured: file placement is unrelated to the overlay topology
  - With a central server: Napster
  - Fully decentralized: Gnutella
  - Hierarchical: KaZaA
- Structured: the overlay topology and the placement of files (or file indices) are tightly controlled
  - One-dimensional coordinate space: Chord


Pure Decentralized Model

- Gnutella
  - Flood queries within the horizon (TTL)
  - Query hit sent back along the request path
  - Select peer/file to download
  - Directly download from selected peers
- Disadvantages
  - Not efficient
  - Query loss problem
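The flooding step above can be sketched in a few lines of Python. This is a minimal simulation under stated assumptions, not the Gnutella wire protocol: the five-peer overlay, the peer names, and the `flood` helper are all hypothetical, and each peer simply forwards a query to every neighbor except the one it came from until the TTL is used up.

```python
from collections import deque

def flood(overlay, source, ttl):
    """Flood a query from `source` with a hop limit of `ttl`.

    Returns (reached_peers, messages_sent, duplicate_deliveries).
    """
    seen = {source}      # peers that have already handled the query
    messages = 0         # every transmission over an overlay link
    duplicates = 0       # transmissions that arrive at an already-seen peer
    queue = deque([(source, None, ttl)])
    while queue:
        peer, came_from, hops_left = queue.popleft()
        if hops_left == 0:
            continue     # horizon reached: do not forward any further
        for neighbor in overlay[peer]:
            if neighbor == came_from:
                continue
            messages += 1
            if neighbor in seen:
                duplicates += 1   # dropped by the receiver, bandwidth already spent
            else:
                seen.add(neighbor)
                queue.append((neighbor, peer, hops_left - 1))
    return seen, messages, duplicates

# Hypothetical overlay, given as undirected adjacency lists.
overlay = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

reached, messages, duplicates = flood(overlay, "A", ttl=2)
print(sorted(reached), messages, duplicates)
```

With TTL=2 the query costs 6 messages, 3 of them duplicates, and still never reaches peer E. That illustrates both disadvantages: wasted traffic, and content that lies beyond the horizon cannot be found.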


Hierarchical Model

- Hybrid architecture: KaZaA
  - A decentralized network of centralized clusters
  - Sacrifice anonymity to achieve efficiency
  - Cross between Napster and Gnutella
- Each node is either a super-peer or assigned to a super-peer
- Search mechanism among super-peers is blind flooding, same as in Gnutella


Existing P2Ps not Scalable

- P2P traffic contributes the largest portion of Internet traffic (ACM SIGCOMM’02)
- P2P dominates the campus network, consuming 43% of all bandwidth compared to 14% for WWW traffic (OSDI 2002)
- Given that 95% of node pairs are less than 7 hops apart and TTL=7, Gnutella’s current search generates 330 TB/month with only 50,000 nodes
- Gnutella has around 2 million users online at any time; KaZaA has 3-4 million users online at any time
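To put the 330 TB/month figure in per-node terms, the arithmetic can be worked out as below. The only inputs taken from the slide are 330 TB/month and 50,000 nodes; decimal terabytes and a 30-day month are assumptions made for the calculation.

```python
total_bytes_per_month = 330 * 10**12   # 330 TB, assuming decimal terabytes
nodes = 50_000

per_node_per_month = total_bytes_per_month / nodes
per_node_per_day = per_node_per_month / 30     # assuming a 30-day month

print(f"{per_node_per_month / 10**9:.1f} GB per node per month")   # 6.6 GB
print(f"{per_node_per_day / 10**6:.0f} MB per node per day")       # 220 MB
```

Under these assumptions each node carries roughly 6.6 GB of search traffic per month before any file is downloaded.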


P2P Computing

- Can P2P survive without a business model?
  - Low entrance barrier
  - Low maintenance cost
- How to grow a P2P network?
  - Traffic efficient
  - Attract more peers
    - Rich contents
      - Willing to contribute
    - Fast response time
    - Trustworthy environment
      - Anonymity; security; integrity; availability


Traffic Efficient P2P

- Topology Mismatch
- Efficient Search
- Multi-download
- Replication and Cache


Message Duplications in Overlay Connections

[Figure: overlay connections among nodes S, L, M, P, and Q, annotated with "3 copies", "2 copies", and "2 copies"]

- Implosion: duplicated messages are sent to the same node
- Same message traversed over the same channel twice (LM and MP)


Topology Mismatch Problem

S is the source. The longest physical link, SC, will be traversed three times when the overlay does not match the physical topology.
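The effect can be reproduced with a small sketch. The four-node physical network and the two overlays below are hypothetical, chosen only so that the long link S-C is crossed three times; they are not the topology drawn on the slide. Each overlay hop is expanded into its shortest physical path and the load on every physical link is counted.

```python
from collections import Counter, deque

# Hypothetical physical network: S and A sit on one side of the long
# link S-C, while C and D sit on the other side.
physical = {
    "S": ["A", "C"],
    "A": ["S"],
    "C": ["S", "D"],
    "D": ["C"],
}

def physical_path(src, dst):
    """Shortest physical path from src to dst (breadth-first search)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in physical[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def link_load(overlay_hops):
    """Count how many times each physical link carries the query."""
    load = Counter()
    for a, b in overlay_hops:
        path = physical_path(a, b)
        for u, v in zip(path, path[1:]):
            load[frozenset((u, v))] += 1
    return load

# In both overlays the query leaves S and reaches every peer exactly once.
mismatched = [("S", "D"), ("D", "A"), ("A", "C")]
matched = [("S", "A"), ("S", "C"), ("C", "D")]

long_link = frozenset(("S", "C"))
print(link_load(mismatched)[long_link])   # 3
print(link_load(matched)[long_link])      # 1
```

The same three overlay hops cost three crossings of the expensive link in the mismatched overlay and only one when overlay neighbors are also physical neighbors.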


Topology Mismatch Problem

Up to 70% of the query responses travel along mismatching paths


What Do We Know Now?

- Even given global knowledge of all the peering and non-peering nodes, with millions of nodes in the system randomly joining and leaving, it is difficult, if not impossible, to compute an optimal overlay topology.
- The reality is even worse:
  - we don’t have global knowledge;
  - we don’t have a central server.
- Therefore, we need to explore distributed approaches.


Our Contributions

- ACE (ICDCS04) and AOTO (Globecom03)
  - 1-hop neighbor information
  - The simplest and slowest
- LTM (Infocom04, IEEE TPDS)
  - Location-aware approach
  - The convergence speed of LTM is the fastest, but it needs synchronization.
- SBO (IPDPS04)
  - With half the overhead of ACE/AOTO, reduces the traffic cost the most.


Free Riding Problem

- Free riding
  - No individual is willing to contribute towards the cost of something (public goods) when he/she hopes that someone else will bear the cost instead
- Free riding leads to degradation of the performance of the system and adds vulnerability to the system
- Lacks incentives for cooperation
- In P2P, many people just download files contributed by others and never share any of their files
  - Files shared in P2P are ‘public goods’
  - Free rider statistics in Gnutella: 70% of users share no files


Free Riding Challenges

- What exactly is the free riding problem and what are the consequences?
- Could the problem be solved? How?
- How can an “incentive mechanism” be created?
  - Incentive policy: Maze (Peking U.)
  - Credit management: fully distributed
  - Anonymity
  - Availability of open source
- Other models: e.g., game theoretic model
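As a rough illustration of what credit management could look like, the sketch below lets peers earn credit by uploading and spend it by downloading. It is purely hypothetical: it is not Maze's actual policy, the `CreditLedger` class and its rules are invented for illustration, and it keeps all balances in one in-memory object, whereas the slide calls for a fully distributed scheme.

```python
class CreditLedger:
    """Toy credit account: uploading earns credit, downloading spends it."""

    def __init__(self, initial_credit=1):
        self.initial_credit = initial_credit   # hypothetical starting grant
        self.credit = {}

    def balance(self, peer):
        """Current credit of a peer; newcomers start with the initial grant."""
        return self.credit.setdefault(peer, self.initial_credit)

    def transfer(self, uploader, downloader, size=1):
        """Record a download; refuse it if the downloader lacks credit."""
        if self.balance(downloader) < size:
            return False
        self.credit[downloader] -= size
        self.credit[uploader] = self.balance(uploader) + size
        return True

ledger = CreditLedger(initial_credit=1)
print(ledger.transfer("sharer", "free_rider"))   # True: the initial credit is spent
print(ledger.transfer("sharer", "free_rider"))   # False: nothing left to spend
print(ledger.transfer("free_rider", "sharer"))   # True: sharer has earned credit
```

A peer that never uploads runs out of credit after its first download, which is the incentive. Making such a ledger tamper-proof without a central server is the open problem the slide points at.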


P2P Anonymity

- Should be an option to peers
- Different levels of anonymity
  - Publisher Anonymity
    - Hide publisher identity to resist censorship
    - Dilemma: Anonymity or Authenticity
  - Initiator (requester) Anonymity
  - Responder Anonymity
  - Mutual Anonymity


Path-based Initiator Anonymity

[Figure: a path connecting nodes R, X, Y, Z, and I]


Mix and Onion Routing

- x only knows B and y; but x does not know whether B is the initiator
- The identity of B is hidden from x, y, A.
- The message is hidden from x, y.
- B knows everything.
- Onion routing improves on Mix with symmetric keys

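[Figure: B sends a layered message along the path B, x, y, A. The packet from B to x carries "Send to y" around "Send to A" around the message; x forwards "Send to A" around the message to y; y forwards the message to A.]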
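The layering can be sketched as follows. This is a toy under loud assumptions: the XOR keystream built from SHA-256 is NOT a secure cipher and only stands in for real encryption, the per-relay keys are hypothetical values that B is assumed to share with x, y, and A already, and the JSON packet layout is invented for illustration.

```python
import hashlib
import json

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (NOT secure): XOR data with SHA-256(key || offset)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)

def wrap(message: bytes, route, keys):
    """Build the onion from the inside out; `route` is [x, y, A] as chosen by B."""
    # Innermost layer: only the final receiver can read the message.
    inner = {"next": None, "payload": message.hex()}
    packet = keystream_xor(keys[route[-1]], json.dumps(inner).encode())
    # Each outer layer tells one relay where to send the still-encrypted rest.
    for hop, next_hop in reversed(list(zip(route, route[1:]))):
        layer = {"next": next_hop, "payload": packet.hex()}
        packet = keystream_xor(keys[hop], json.dumps(layer).encode())
    return packet

def peel(node, packet, keys):
    """A node removes its own layer and learns only the next hop."""
    layer = json.loads(keystream_xor(keys[node], packet))
    return layer["next"], bytes.fromhex(layer["payload"])

keys = {"x": b"key-x", "y": b"key-y", "A": b"key-A"}   # hypothetical shared keys
packet = wrap(b"hello", ["x", "y", "A"], keys)

node = "x"
while node is not None:
    receiver = node
    node, packet = peel(node, packet, keys)
print(receiver, packet)
```

Relay x can recover only "send to y" plus an opaque blob, and y only "send to A" plus a smaller blob. Only A obtains the message, and only B, who built all the layers, knows the whole route.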


Mutual Anonymity

- Mutual Communication Anonymity: (1) hiding requester’s identity, (2) hiding responder’s identity, and (3) no others are able to guess the two parties and the shared document.
- Previous studies
  - Onion routing: initiator knows everything and responder knows nothing
  - APFS
  - P5


Mutual Anonymity: APFS

[Figure: peers A, B, C, and D with their tail nodes TA, TB, TC, and TD]

- A and B build paths, and publicize their tail nodes (agents).
- A and B connect to each other through their tails.
  - A and B anonymously send packets to their tails (TA, TB)
  - Their tails then overtly contact the responder’s tail node.
  - The responder’s tail passes the packets to the anonymous responder.


Problems with Mutual Anonymity

- Hard to achieve in a bidirectional way
  - Without knowledge of each other, how to accomplish the transaction? Anonymous but knowing he/she is good
  - How to trust the opposing party?
- High overhead
  - Cryptography cost added (Onion Routing)
  - Consuming a lot of capacity, bandwidth, storage… (splitting file into shares, multicasting…)
- Conflict with some design goals of P2P
  - Efficiency
  - Decentralization: agent management
  - Peer discovery: search is more complicated


Trustworthy P2P

- Provide a trustworthy P2P environment
  - Peer trustiness
  - Anonymity (option)
  - File integrity
  - Availability
- Malicious peers
  - Fake information
  - Create unnecessary traffic
  - Virus spreading
- Handling of malicious peers
  - Detection and prevention


Peer Trustiness

- Peer-to-peer is fully distributed
  - With no central coordination
  - No central database
  - No global view of the system
  - Peers are autonomous, and may be anonymous
  - Peers are unreliable
  - Transactions are performed between peers
- How can a peer trust another peer?


Four Approaches of “Authentic”

- Oldest Document
  - The oldest submission is considered authentic
  - Timestamping systems
    - Can you trust the timestamp?
    - Very difficult without a centralized CA
- Expert-based
  - After search results are returned, ask experts for opinion.
  - Authoritative nodes keep track of signatures
  - Expert group management
    - Many expert groups with different opinions
    - Experts get feedback from peers to update their opinion
    - Can experts trust feedback without a centralized CA?


Four Approaches of “Authentic” (cont’d)

- Voting-based
  - Votes of many experts
  - Voter/Expert management
  - Asking for votes from experts (not all experts will return the vote)
  - Experts may be humans
  - Spoofing of votes, nodes and files
- Reputation-based
  - Weight votes, some experts more trustworthy
  - Dynamically adjust the reputation
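The reputation-based variant, weighting votes and then adjusting the weights, might look like the sketch below. The expert names, starting weights, default weight, and the fixed adjustment step are all assumptions made for illustration; the slides do not prescribe a formula.

```python
DEFAULT_REPUTATION = 0.5   # assumed weight for an expert with no history

def weighted_verdict(votes, reputation):
    """Combine +1 (authentic) / -1 (bogus) votes, weighted by voter reputation."""
    score = sum(reputation.get(voter, DEFAULT_REPUTATION) * vote
                for voter, vote in votes.items())
    return score > 0

def update_reputation(votes, reputation, truth, step=0.1):
    """Raise the weight of voters who turned out right, lower the others."""
    for voter, vote in votes.items():
        old = reputation.get(voter, DEFAULT_REPUTATION)
        delta = step if (vote > 0) == truth else -step
        reputation[voter] = min(1.0, max(0.0, old + delta))

reputation = {"e1": 0.9, "e2": 0.2, "e3": 0.2}   # hypothetical expert weights
votes = {"e1": +1, "e2": -1, "e3": -1}           # e1 says authentic, the others bogus

verdict = weighted_verdict(votes, reputation)     # 0.9 - 0.2 - 0.2 > 0
update_reputation(votes, reputation, truth=True)  # the file later proves authentic
print(verdict, reputation)
```

A simple majority would have called the file bogus (two votes to one); weighting lets one well-regarded expert outvote two poorly regarded ones, and the update step then widens that gap.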


Three elements: Identity, Trust, Reputation

- Identity
  - Who is making a statement (responder)?
- Trust
  - Can I believe the person who is making the statement?
- Reputation
  - What is the history of trust in the person making the statement?
    - Build a reputation history for each pair of peers
  - Reputation management


Reputation-based Trust Management

- Trust Management
  - a mechanism that allows peers to establish mutual trust.
  - Peer A trusting B does not imply that B trusts A
- Reputation
  - a measure that is derived from direct or indirect knowledge of earlier transactions.
- Reputation-based trust management
  - one specific form of Trust Management.


Strategy of Reputation-based Trust Management

- Get opinion list based on self observation
- Propagate it by reputation collection
  - Voting
  - From Friends (Friends of Friends)
- Building a reputation matrix
  - A peer’s reputation may differ from one peer to another
  - Calculating own part
  - Cooperating to obtain a global matrix
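One simple way to picture the reputation matrix: every peer fills in its own row from direct observation, and the rows are then combined into a global view. The plain averaging used below is an assumption swapped in for illustration, not the aggregation rule of any particular system, and the ratings are made up.

```python
def global_reputation(local):
    """Average, for each peer, the opinions that *other* peers hold about it.

    `local[i][j]` is peer i's own rating of peer j in [0, 1]; a missing
    entry means i has never dealt with j.
    """
    peers = set(local) | {j for row in local.values() for j in row}
    result = {}
    for j in sorted(peers):
        opinions = [row[j] for i, row in local.items() if i != j and j in row]
        result[j] = sum(opinions) / len(opinions) if opinions else None
    return result

# Each peer computes its own row from self-observation (hypothetical values).
local = {
    "A": {"B": 1.0, "C": 0.0},
    "B": {"A": 1.0, "C": 0.5},
    "C": {"A": 0.0, "C": 1.0},   # C's rating of itself is ignored
}
print(global_reputation(local))
```

Peer A is rated 1.0 by B but 0.0 by C, so its reputation really does differ from peer to peer; the combined value of 0.5 only appears once the rows are pooled. In a real deployment the pooling itself would have to be done cooperatively, since no single node holds the whole matrix.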


More Issues

- Secure Score Management
  - Voting among multiple score managers
  - Peer score held by another peer
- Threat scenarios
  - Malicious individuals (always bad)
  - Malicious collectives (always bad, trust among bad peers)
  - Camouflaged collectives (sometimes good to trick people)
  - Malicious spies (good all the time but friends with bad folks)


File Authenticity

- Given a query, the authentic response has to be distinguished
  - Associated with the reputation of the provider
- How to guarantee the “authentic”?
- Including the file integrity
  - CRC, hashing, MACs, digital signatures
  - Provide a secure transfer
- File Authenticity
  - How do you know you have the right file?
  - Bogus copies
  - Corrupt copies
- Need detection/correction mechanisms
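Two of the integrity tools listed above, hashing and MACs, are available in Python's standard library (`hashlib` and `hmac`). The sketch below shows detection only; the shared key and file contents are hypothetical, and how two peers would agree on a key in a P2P setting is left open.

```python
import hashlib
import hmac

def file_digest(data: bytes) -> str:
    """Content hash that a provider could publish alongside a file."""
    return hashlib.sha256(data).hexdigest()

def make_tag(key: bytes, data: bytes) -> bytes:
    """MAC over the content; only holders of `key` can produce a valid tag."""
    return hmac.new(key, data, hashlib.sha256).digest()

def verify(key: bytes, data: bytes, tag: bytes) -> bool:
    """Constant-time check that the received data matches the tag."""
    return hmac.compare_digest(make_tag(key, data), tag)

key = b"hypothetical shared secret"
original = b"contents of the requested file"
tag = make_tag(key, original)

corrupt = original[:-1] + b"?"       # a bogus or damaged copy
print(verify(key, original, tag))    # True
print(verify(key, corrupt, tag))     # False
print(file_digest(original) == file_digest(corrupt))   # False
```

A plain hash catches accidental corruption, but an attacker who can replace the file can usually replace the published hash too. The MAC (or a digital signature) ties the check to a key the attacker does not hold.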


File Integrity

- Is it enough to find the trusted source?
- The attacker can modify the file content
  - Man-in-the-middle attacks
- How to provide a secure transfer?
- Need detection/correction mechanisms


Availability

- Nodes should be always up
- Overlay DoS & DDoS attacks
  - Flooding a node with messages
  - Malicious super-nodes in Gnutella
    - Claims that the victim has all files requested
- Attack CPU availability
  - Sending complex queries
- Attack file storage
  - Submit bogus documents
- Attack quality-of-service
  - Serve a file slowly
  - Send a different file


Why Defend against P2P Overlay DDoS?

- The flooding-based search mechanism makes overlay DDoS in P2Ps simple in design and operation
- The anonymity design of P2P helps the malicious nodes easily hide behind other peers


Challenges

- In the future, P2P protocols need to be designed to make it hard for adversaries to construct DDoS attacks by taking advantage of loosely constrained protocol features.
- More research is necessary to understand the effects of other types of DoS attacks in various P2P networks.
- It is important to design future P2P protocols such that they do not open up new opportunities for attackers to use as amplifiers and back-door communication channels.


Challenges

- To protect against man-in-the-middle attacks, nodes can authenticate other nodes. This can be achieved by obtaining a node's public key through a secure channel (e.g., a trusted party such as a certificate vendor, or a web of trust like PGP) and validating its fingerprint.
- Make firewalls smarter so that peer-to-peer applications can cooperate with the firewall to allow traffic the administrator wants. Firewalls must become more sophisticated, allowing systems behind the firewall to ask permission to run a particular peer-to-peer application.
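The fingerprint check from the first point can be sketched with the standard library alone. The key bytes are placeholders rather than real keys, the colon-separated SHA-256 format is just one common convention, and the trusted fingerprint is assumed to have been obtained out of band (from a certificate authority or a web of trust).

```python
import hashlib
import hmac

def fingerprint(public_key: bytes) -> str:
    """SHA-256 fingerprint of a public key, as colon-separated hex pairs."""
    digest = hashlib.sha256(public_key).hexdigest()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))

def key_is_trusted(received_key: bytes, trusted_fingerprint: str) -> bool:
    """Accept the key only if it matches the fingerprint obtained out of band."""
    return hmac.compare_digest(fingerprint(received_key), trusted_fingerprint)

genuine_key = b"-----hypothetical public key bytes-----"
trusted = fingerprint(genuine_key)   # learned through the secure channel

attacker_key = b"-----key substituted by a man in the middle-----"
print(key_is_trusted(genuine_key, trusted))    # True
print(key_is_trusted(attacker_key, trusted))   # False
```

The check is only as strong as the channel that delivered the trusted fingerprint, which is why the slide stresses a trusted party or a web of trust.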


Block Malicious or Misbehaving Peers

- How to isolate bad peers?
- Blacklist
  - Bad peer can change names
- Flow control and monitoring
- Special monitoring peers (police force)
  - Special privilege
- Remain open


Blocking Discipline-free P2P

- P2P must be trustworthy compliant
- Discipline-free P2P
  - Not trustworthy compliant
  - Illegal information
  - Unable or difficult to catch bad peers
- Block non-compliant P2P systems
- Special force to attack discipline-free P2P


Summary

- Why is securing P2P data sharing applications challenging?
- Open and autonomous nature
- Peers can join and leave freely
  - Peers cannot necessarily be trusted to:
    - route queries or respond correctly
    - store documents when asked to
    - serve documents when requested


Summary (cont’d)

- Develop techniques to deal with fail-stop and Byzantine failures that are acceptable from a performance and security standpoint in a P2P context.
- The lesson for P2P designers is that without accountability in a network, it is difficult to enforce rules of social responsibility.
- Use a security toolbox like JXTA (Project JXTA has security APIs and a library that implements RSA, RC4, MD5, SHA-1, a pseudo-random number generator, and digital signatures) to help P2P implementation.


Summary (cont’d)

- Research is required to explore a broad range of fundamental P2P issues such as: peer-node identity, naming, configuration and capabilities; P2P network organization and scope; resource discovery, content lookup, search and distribution; request routing and operation in the presence of mobility; adaptation to expected peer-node instability; monitoring of P2P operations; security of P2P systems involving reputation-based trust for ad-hoc systems or more centralized, CA-like approaches; etc.


Question?

Thank You!