lh* rs p2p : a scalable distributed data structure for p2p environment w. litwin ceria laboratory...

34
LH* RS P2P : A Scalable Distributed Data Structure for P2P Environment W. LITWIN CERIA Laboratory H.YAKOUBEN Paris Dauphine University [email protected] Paris Dauphine University Hanafi[email protected] T. SCHWARZ Santa Clara University (USA) [email protected]

Upload: hector-cain

Post on 28-Dec-2015

224 views

Category:

Documents


0 download

TRANSCRIPT

LH*RSP2P: A Scalable Distributed

Data Structure for P2P Environment

W. LITWIN

CERIA Laboratory

H.YAKOUBENParis Dauphine University

[email protected] Dauphine University

[email protected]

T. SCHWARZSanta Clara University (USA)

[email protected]

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 2June 8th, 2007

PlanObjective

Overview: SDDS & P2P

LH*RSP2P

Architecture Addressing Properties Churn Management

Conclusion

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 3June 8th, 2007

Objective

High availabilityto deal with churn

At most one forwarding message for key search or insert or scan

(fastest known performance)

Very Large Scalable Files

A File of records identified by keys SDDS client nodes face the applications

and send queries to SDDS server nodes No centralized addressing Servers contain application or parity data

In buckets Overflowing servers split on new servers Servers do not notify clients about splits

SDDS (1993)

Clients use images of the file state for addressing Key based Range queries Scans …

Images get adjusted towards the file state during queries by Image Adjustment Messages Triggered by incorrect addressing by the client

IAMs reflect the file evolution by splits or, rarely, merges.

IAMs reflect also the location changes because of failures and recovery

SDDS (1993)

6

SDDS Typology

LH*sa

SDDS(1993)

Data Structures

Classics

Tree

m-d Tree1-d Tree

RP*, k-RP*,DRT, DRT*, LH*, DDH, EH*,

Hash

High Availability

1-dimensional d-dimensional

IH*

LH*rs LH*s

k-Availability Security

LH*m LH*g

LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

P2P Data Structures withLogarithmic complexity

CHORD BATON VBI-Tree

New Peer

7LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Clients

Growth through splits under inserts

Peer

SDDS Expansion

8LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Clients

SDDS Client Image Evolution

Image Adjustm

ent Messagin

g

Available at CERIA site Announced at DbWorld Managing LH* RS and RP* files

In distributed RAM Uner Windows Over 1gbs Ethernet

Various functions Response time reaching 30 microsec Up to 300 times faster than disk files

SDDS 2007 Prototype

Autonomous nodes store and search data By flooding in early systems

Freenet, Napster, Gnutella…

Structured P2P reduce the flooding Using decentralized data structures

Distributed Hash Table (DHT) especially Few folks know the concept is due to B. Devine

FODO 93

Chord, P-tree, VBI, Baton…

Structured P2P schemes are specific SDDS schemes

P2P (1995 ?)

11

Client Address Calculus

a’ hi’(C ) ; /* a’ is the address of peer destination of the key C*/

if a’ < n’ then a hi’+1(C ) ; Algorithm

LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Global Addressing Rulea hi(C ) ; /* a is the address of peer destination of the key C*/

if a < n then a hi+1(C ) ; /* (i, n) state of an SDDS file, they are only known to the

file coordinator node

Algorithm

LH*RSP2P Addressing

hi (C ) = C mod 2i

File starts with i = 0 and n = 0 and a single data bucket 0

Every bucket m keeps the bucket level j of hash function hi last used to split, j = 0 initially.

Overflowing bucket m alerts the coordinator Coordinator notifies bucket n to split Bucket n applies hi + 1 About half of keys migrates to new bucket n + 2i

Bucket n and the new one set j = j + 1 Coordinator performs

n = n + 1 if n = 2i then i = i + 1 and n = 0

LH*RSP2P File Expansion

Architecture based on LH*RS

13

LH*RSP2P

j

i’n’

Client Part

Server Part

LH*P2P Peer

LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

LH*RSP2P Peer

LH*RS

ClientLH*RS

DB

Candidate Peer

Candidate Peer

Client &Spare

StorageLH*RS

P2P Peer

LH*RSP2P Peer

LH*RS

ClientLH*RS

PB

Pupils

14

i’ = j  ; /* i’ is presumed «i» in client’s part of peer image

n‘ = m +1  /* n’ is presumed value of «n» in client’s part of peer image

if n’ = 2i’ then i’ = j + 1 ;  n’ = 0 ; /* m is the address of peer that’s split /* j value is before the split

Algorithm

LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Local Client & Pupil Image Adjustment During Splits

15

Before splittingCoordinator Peer (CP)

P0

j=2

i’=1n’=1

P2P1i=1n=1

j=1

i’=1n’=0

j=2

i’=1n’=1

After splitting

j=2

i’=2n’=0

CP

P0

j=2

i’=1n’=1

P2P1i=2n=0

j=2

i’=2n’=0

j=2

i’=1n’=1

P3

i’= j =1;n’= m+1= 1+1;

If n’=21 then n’=0; i’= i’+1and

(i’, n’)= (2,0)

LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Example

16

a’ hj(C ) ; if a’ a then /* Addressing ERROR -> Forwarding*/a” hj-1(C ) ; if a”> a and a”<a’ then a’ a”; /* Send Key C to peer a’

Algorithm

LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Server Address Calculus

17

Peer Image Adjustment

i’ j-1, n’ a+1 ; /* a is the good address of pair*/if n’> 2i’ then n’0 ; i’ i’+1 ;

Algorithm

LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

PC

i=3n=2

P0

j=4

i’=3n’=1

Pairs

P4

j=3

i’=2n’=2

P9

j=4

i’=3n’=2

P1

j=4

i’=3n’=2

9

9

IAM

Checking and forward the key

using A2

Addressing Algorithm

18

Example of the File Expansion

PC

i=2n=2

P0

j=3

i’=2n’=1

Peers

P2

j=2

i’=1n’=1

P5

j=3

i’=2n’=2

Candidate Peer

i’=0n’=0

P6

j=3

i’=2n’=3

i=2

n=3

i’=2n’=3

j=3

Pupil

i’=2n’=1

LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Assign a Tutor for Candidate Peer:

LH-hash of the client IP Address

TUTOR, Update Pupil

LH*RSP2P

19

1. The maximal number of forwarding messages for the key search is one.

2. The maximal number of rounds for the scan search is two.

3. The worst case addressing performance of LH*RS

P2P as defined by Property 1 is the fastest possible for any SDDS or a practical structured P2P addressing scheme.

LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Properties of LH*RSP2P :

20LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Proofs

n

j=i+1j=i+1j=i

n+2i2i-1a’0 a

Property 1: Case 1 : i = i’Peer a use a client image (i’,n’) and didn’t receive any IAM since the last split

21LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Proofs

n

j=i+1j=i+1j=i

n+2i2i-1

a’

0 a

Property 1: Case 1 : Peer a use a client image (i’,n’) and didn’t receive any IAM since the last split

22LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Proofs

n

j=i+1j=i+1j=i

n+2i2i-1a0

Property 1: Case 2 : i = i’ + 1Peer a use a client image (i’,n’) and didn’t receive any IAM since the last split

1. n ≤ a’ a 2. a <  a’ < 2i’ - 1 3. a’<n

a’

23LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Proofs

Property 2:

• Peer a sends the scan to all buckets in its image• Those who split in the meantime resend to children• No child can have a child

• It would need to split also• But then peer a would first need to split again

as well• Then its client image would get re-adjusted

24LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Proofs

Property 3:

•The only better worst case performance is zero forwarding messages•This requires the notification of every split to every peer•It would be against the scalability goal of every SDDS & structured P2P scheme

25LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

LH*RSP2P Churn Management

Bucket reliability group with k parity buckets protect against up to k bucket failures per group

••••

•••••

••••• •

•••••

54321

0

•••••

•••••

Parity BucketData bucketRank

Parity RecordData Record Tutoring Records

The Tutoring records are stored and have a same treatment as data records

26LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Peer leaves with notice

Coordinator Peer

j

i’,n’

j

i’,n’

j

i’,n’

i’,n’

P0 PmPl Spare PeerNotification

… … Say

that’s OK

LH*RSP2P Churn Management

27LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Peer leaves without notice or fails

Coordinator Peer

j

i’,n’

j

i’,n’

j

i’,n’

i’,n’Pl-1 PmPl Parity Bucket

Query

Forward

LH*RS Bucket Recovery

LH*RSP2P Churn Management

28LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Peer leaves without notice or fails

Coordinator Peer

j

i’,n’

j

i’,n’

j

i’,n’

i’,n’Pl-1 Pm

Pl

Parity Bucket

Answer

LH*RS Bucket Recovery

LH*RSP2P Churn Management

29LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Sure Search : Protects against old data read (communication failure)

Coordinator Peer

j

i’,n’

j

i’,n’

j

i’,n’

i’,n’Pl-1 PmPl Parity Bucket

Query

Answer

LH*RSP2P Churn Management

j

i’,n’

Pl

30

Conclusion LH*RS

P2P require at most one forward message when addressing error occur

Is the fastest known SDDS and P2P key based addressing algorithm

Protects efficiently against churn

Allows to manage very large scalable files

LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

31

Current & Future Work

LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Implementation of the peer node architecture and of tutoring functionsUsing existing LH*RS prototypeCreated by Rim Moussa & shown at VLDB

2004 Performance Analysis Variants

32

THE END

LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Work Partly Supported by theIST eGov-Bus project

33

[1] Adina Crainiceanu, Prakash Linga, Johannes Gehrke, and Jayavel Shanmugasundaram. Querying Peer-to-Peer Networks Using P-Trees. In Proceedings of the Seventh International Workshop on the Web and Databases (WebDB 2004). , June 2004.

[2] Bolosky W. J, Douceur J. R, Howell J. The Farsite Project: A Retrospective. Operating System Review, April 2007, p.17-26

[3] Devine R. Design and Implementation of DDH: A Distributed Dynamic Hashing Algorithm, Proc. Of the 4th Intl. Foundation of Data Organisation and Algorithms –FODO, 1993.

[4] Litwin, W. Neimat, M-A., Schneider, D. LH*: Linear Hashing for Distributed Files. ACM-SIGMOD Int. Conf. On Management of Data, 93.

[5] Litwin, W., Neimat, M-A., Schneider, D. LH*: A Scalable Distributed Data Structure. ACM-TODS, (Dec., 1996).

[6] Litwin, W., Neimat, M-A. High Availability LH* Schemes with Mirroring, Intl. Conf on Cooperating systems, , IEEE Press 1996.

[7] Litwin, W. Moussa R, Schwarz T. LH*rs- A Highly Available Distributed Data Storage. Proc of 30th VLDB Conference, , 2004.

[8] Litwin, W. Moussa R, Schwarz T. LH*rs- A Highly Available Scalable Distributed Data Structure. ACM-TODS, Sept 2005.

[9] Steven D. Gribble, Eric A. Brewer, Joseph M. Hellerstein, and David Culler. Scalable, Distributed Data Structures for Internet Service Construction, Proceedings of the Fourth Symposium on Operating Systems Design and Implementation (OSDI 2000)

[10] Stoica, Morris, Karger, Kaashoek, Balakrishma. CHORD : A scalable Peer to Peer Lookup Service for Internet Application. SIGCOMM’O, August 27-31, 2001,

References

LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

34LH*RSP2P: A Scalable Distributed Data Structure for P2P EnvironmentJune 8th, 2007

Thanks' for your attention