Intrinsic References in Distributed Systems

ScaLAB seminar, 21st October 2002. Intrinsic References in Distributed Systems. Presented by: Nimish Pachapurkar

Page 1: Intrinsic References in  Distributed Systems

ScaLAB seminar 21st October 2002

Intrinsic References in Distributed Systems

Presented by: Nimish Pachapurkar

Page 2: Intrinsic References in  Distributed Systems

Snapshot:
  To contrast and compare Intrinsic References with Physical References.
  Storage and retrieval mechanism using intrinsic references: Elephant Store.
  Use of intrinsic references in hierarchical data structures.

Terminology:
  Collision resistance: It is extremely difficult to find two sequences with the same hash. This implies that the hash is unique (sufficiently so…).
  One-way hash: Given the hash of a sequence, it is difficult to reconstruct the sequence.
  Reference => hash, and Referent => byte sequence. (Other reference/referent pairs: memory addresses and data, URLs and web pages, etc.)
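The reference/referent terminology can be illustrated with a minimal sketch. The slides do not name a hash function, so the use of SHA-256 here is an assumption, and `intrinsic_reference` is a hypothetical helper name.

```python
import hashlib

def intrinsic_reference(referent: bytes) -> str:
    """Return the hex digest that serves as the reference for a byte sequence.

    SHA-256 is an assumed choice; any collision-resistant one-way hash fits.
    """
    return hashlib.sha256(referent).hexdigest()

a = intrinsic_reference(b"hello world")
b = intrinsic_reference(b"hello world")
c = intrinsic_reference(b"hello world!")

print(a == b)  # identical byte sequences always yield the same reference
print(a == c)  # a one-byte change yields a different reference
```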

Page 3: Intrinsic References in  Distributed Systems

Physical References –
  The relationship between reference and referent is defined by the state of the physical system.
  A change in that state changes the referent.
  All accesses to the referent have to go through the system, which makes it a bottleneck and a potential failure point.

Intrinsic References –
  The reference R is a collision-resistant (unique), one-way hash value of the byte sequence S.
  State independence: The relationship between S and R depends only on the hash function.
  Uniqueness: A given R refers only to the particular S from which it was obtained.
  Physical storage is still required to store and retrieve the referents.
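The contrast can be sketched in a few lines. This is an illustration only: the dictionary stands in for a physical system, the path is made up, and SHA-256 is an assumed hash function.

```python
import hashlib

# Physical reference: the mapping from name to data is mutable system state.
physical_store = {"/docs/report.txt": b"draft 1"}   # hypothetical path
physical_ref = "/docs/report.txt"
before = physical_store[physical_ref]
physical_store[physical_ref] = b"draft 2"            # the state changes
after = physical_store[physical_ref]                 # same reference, new referent

# Intrinsic reference: R is computed from S alone, with no system state involved.
def ref_of(s: bytes) -> str:
    return hashlib.sha256(s).hexdigest()

r1 = ref_of(b"draft 1")
r2 = ref_of(b"draft 2")   # different bytes, so a different reference
```

Here `physical_ref` silently starts pointing at new data, while `r1` can only ever refer to the bytes `b"draft 1"`.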

Page 4: Intrinsic References in  Distributed Systems

Intrinsic References and Distributed Storage –
  Useful for distributed, replicated storage mechanisms.
  No reference-referent inconsistency (the hash gives the reference).
  Simple hashing can check the correctness of the data.
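A minimal sketch of these points, assuming SHA-256 and a hypothetical in-memory `ContentStore` class standing in for one replica. It is not the Elephant Store interface, which the slides do not describe.

```python
import hashlib

class ContentStore:
    """Hypothetical replica: maps digest -> byte sequence."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        ref = hashlib.sha256(data).hexdigest()
        self._blobs[ref] = data
        return ref

    def get(self, ref: str) -> bytes:
        data = self._blobs[ref]
        # Re-hashing the retrieved bytes checks their correctness.
        if hashlib.sha256(data).hexdigest() != ref:
            raise ValueError("replica returned corrupted data")
        return data

replica_a, replica_b = ContentStore(), ContentStore()
ref = replica_a.put(b"payload")
same_ref = replica_b.put(b"payload")   # computed independently, yet identical

replica_b._blobs[ref] = b"tampered"    # simulate corruption on one replica
try:
    replica_b.get(ref)
    corruption_detected = False
except ValueError:
    corruption_detected = True
```

Because both replicas derive the reference from the bytes, they cannot disagree about which data a reference names, and any reader can detect a bad copy without consulting a central authority.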

Opaque Storage –
  Used for storing an instance of a data structure in Elephant Store.
  Serialize the data structure and store the byte sequence.
  Called an OPAQUE representation because the data structure is hidden behind the byte sequence.
  The hash of the sequence is the reference (digest).
  Retrieval: Retrieve the byte sequence from the store and de-serialize it.

[Diagram: Data Structure -> Serialization (makes the structure opaque) -> Opaque Reference (hash digest)]
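The store and retrieve steps above can be sketched as follows. The serialization format (JSON with sorted keys), the hash (SHA-256), and the names `put_opaque` and `get_opaque` are all assumptions made for illustration; the slides do not specify how Elephant Store serializes data.

```python
import hashlib
import json

store = {}  # digest -> opaque byte sequence

def put_opaque(structure) -> str:
    # Serialize deterministically so equal structures give equal bytes.
    blob = json.dumps(structure, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(blob).hexdigest()
    store[digest] = blob
    return digest

def get_opaque(digest: str):
    # Retrieve the byte sequence and de-serialize it.
    return json.loads(store[digest].decode("utf-8"))

mailbox = {"inbox": ["hello", "meeting at 3"], "sent": []}
ref = put_opaque(mailbox)
restored = get_opaque(ref)
```

The store itself sees only bytes and digests; the structure of `mailbox` is visible again only after de-serialization.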

Page 5: Intrinsic References in  Distributed Systems

HDAGs –
  Hash-based Directed Acyclic Graphs.
  Nodes are directories; arcs are directory/sub-directory relationships.
  The root digest of a rooted HDAG is used as the intrinsic reference to the whole HDAG.
  Application: Can be used to represent a file system or a mail system.
  The root digest uniquely represents the state of the whole directory structure, not just the root directory.
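A minimal sketch of how a root digest can cover a whole directory structure. The node encoding (a `file:` or `dir:` prefix, one `name:digest` line per entry) and SHA-256 are assumptions; the slides do not give the actual HDAG encoding.

```python
import hashlib

def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def hdag_digest(node) -> str:
    """node is bytes (a file) or a dict mapping names to nodes (a directory)."""
    if isinstance(node, bytes):
        return digest(b"file:" + node)
    # A directory's digest is computed over the digests of its children,
    # so it changes whenever anything beneath it changes.
    lines = [f"{name}:{hdag_digest(child)}" for name, child in sorted(node.items())]
    return digest(("dir:" + "\n".join(lines)).encode("utf-8"))

tree = {"inbox": {"msg1": b"hi"}, "notes.txt": b"todo"}
root = hdag_digest(tree)

# Changing a file deep in the tree changes the root digest,
# while untouched siblings keep their digests.
changed = {"inbox": {"msg1": b"hi!"}, "notes.txt": b"todo"}
changed_root = hdag_digest(changed)
```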

Page 6: Intrinsic References in  Distributed Systems

Versions and Change (Problems with OR) –
  For a file system, an example of an opaque representation is a tarball of the directory structure.
  A change in any file causes the opaque representation to change.
  The hash digest also changes.
  There is no relationship between the old and new representations.

Solution: Use HDAGs
  Adding a file to a directory is the same as a new mail arriving in the Inbox.
  The representation of all other files and directories is not changed.
  More efficient than the opaque representation.
  Saves communication cost among replicas in distributed storage.
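The effect of adding one mail can be sketched with a small node store. The encoding, SHA-256, and the helper name `put` are assumptions for illustration, not the format used by the presented system.

```python
import hashlib
import json

def put(store: dict, node) -> str:
    """Store a node (bytes = file, dict = directory) and return its digest."""
    if isinstance(node, bytes):
        blob = b"file:" + node
    else:
        # A directory node records only the digests of its children.
        children = {name: put(store, child) for name, child in node.items()}
        blob = b"dir:" + json.dumps(children, sort_keys=True).encode("utf-8")
    ref = hashlib.sha256(blob).hexdigest()
    store[ref] = blob
    return ref

store = {}
v1 = {"inbox": {"msg1": b"hi"}, "archive": {"old": b"2001 mail"}}
root1 = put(store, v1)
nodes_after_v1 = set(store)

# A new mail arrives in the inbox.
v2 = {"inbox": {"msg1": b"hi", "msg2": b"new mail"}, "archive": {"old": b"2001 mail"}}
root2 = put(store, v2)
new_nodes = set(store) - nodes_after_v1
```

Version 1 occupies five nodes. Version 2 adds only three: the new file, the changed inbox directory, and the new root. The archive subtree and `msg1` are reused unchanged, whereas a tarball-style opaque representation would be stored again in full.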

Page 7: Intrinsic References in  Distributed Systems

Advantages of HDAGs –
  Efficient for distributed systems (version management).
  Every version is represented by a unique intrinsic reference that is independent of the physical system.
  Replication and caching will never lead to inconsistencies.
  Two versions of an object share the majority of their storage and communication costs.
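The communication saving can be sketched as a replica that fetches only the nodes it lacks. This is an illustrative sketch under assumptions: the node encoding, SHA-256, and the `put` and `sync` helpers are hypothetical, and the replica is assumed to hold complete subtrees for every digest it already has.

```python
import hashlib
import json

def put(store: dict, node) -> str:
    """Store a node (bytes = file, dict = directory) and return its digest."""
    if isinstance(node, bytes):
        blob = b"file:" + node
    else:
        children = {name: put(store, child) for name, child in node.items()}
        blob = b"dir:" + json.dumps(children, sort_keys=True).encode("utf-8")
    ref = hashlib.sha256(blob).hexdigest()
    store[ref] = blob
    return ref

def sync(root: str, source: dict, replica: dict) -> int:
    """Copy nodes reachable from root that the replica lacks; return how many were sent."""
    if root in replica:
        return 0  # digest already known, so the whole subtree is already present
    blob = source[root]
    sent = 1
    if blob.startswith(b"dir:"):
        for child in json.loads(blob[4:].decode("utf-8")).values():
            sent += sync(child, source, replica)
    replica[root] = blob  # record the node only after its children have arrived
    return sent

primary, replica = {}, {}
v1 = put(primary, {"inbox": {"msg1": b"hi"}, "archive": {"old": b"2001 mail"}})
first = sync(v1, primary, replica)

v2 = put(primary, {"inbox": {"msg1": b"hi", "msg2": b"new mail"},
                   "archive": {"old": b"2001 mail"}})
second = sync(v2, primary, replica)
```

The first synchronization transfers all five nodes; the second transfers three, because the unchanged archive subtree and `msg1` are recognized by their digests.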

Conclusions –
  HDAGs promise to be a useful mechanism for building and maintaining distributed storage systems.

Page 8: Intrinsic References in  Distributed Systems

OS Support for P2P Programming: a Case for TPS

Presented by: Nimish Pachapurkar

Page 9: Intrinsic References in  Distributed Systems

Introduction –
  Need for an RPC-like interaction mechanism for P2P infrastructures.
  Must be decoupled: anonymous and asynchronous.
  Layers over RPC would certainly hamper performance.
  Type-based Publish/Subscribe (TPS) as a candidate.
  Abstraction of a low-level P2P library: JXTA.

What’s in the paper:
  Comparison of the implementation of TPS with pure JXTA.
  A “first” experience.
  Design and source code of applications.

Page 10: Intrinsic References in  Distributed Systems

JXTA
Three layers:
  Core Layer: Several protocols ensuring basic communication between peers, message routing, or peer group creation.
  Service Layer: Ready-made services such as a content management system and the wire service.
  Application Layer: All the code written by the programmer.

Six concepts:
  ID: For any resource (peer, pipe, peergroup, codat).
  Peer: Any device with an electronic pulse (normal and special). Special peers: rendez-vous and routers.
  Pipe: Virtual communication channel; asynchronous and uni-directional (wire for many-to-many); independent of IP.
  PeerGroup: Collection of peers.
  Advertisement: XML message with information about a new resource.
  Message: Any kind of communication (using XML).

Page 11: Intrinsic References in  Distributed Systems

Protocols for JXTA –
  PDP – Peer Discovery Protocol: Allows different peers to find each other.
  PRP – Peer Resolver Protocol: Sits just above the transport layer and dispatches each JXTA message to the right service.
  PIP – Peer Information Protocol: Reports the status of a peer (time the peer has been up, channels available).
  PMP – Peer Membership Protocol: Obtains group membership requirements (credentials, password, etc.).
  PBP – Pipe Binding Protocol: Keeps the different peers in a pipe bound together (even when they move).
  ERP – Endpoint Routing Protocol: Routes messages between the peers. Enables communication between two peers even when they do not know how to connect to each other (due to a firewall, etc.).

Page 12: Intrinsic References in  Distributed Systems

TPS over JXTA –
Publish/Subscribe paradigm:
  Time decoupling: Publisher and subscriber do not need to be up at the same time.
  Space decoupling: Publisher and subscriber do not need to know each other.
  Flow decoupling: Sending or receiving messages does not block the participants.
  This decoupling suits server-less architectures.

Subscription based on subject and content.
Type-based:
  Subject => event object type
  Content => state of an instance of that type
Type safety: The subscriber knows the event type in advance.
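The type-based subscription model can be sketched with a small in-process engine. This is not the TPS API from the paper and it does not use JXTA; `TPSEngine`, `Event`, `Offer`, and `Notice` are hypothetical names. The subject of a subscription is an event type and the content filter is a predicate over the state of an instance.

```python
from dataclasses import dataclass

@dataclass
class Event:
    pass

@dataclass
class Offer(Event):
    price: float

@dataclass
class Notice(Event):
    text: str

class TPSEngine:
    def __init__(self):
        self._subs = []  # entries of (event type, content filter, mailbox)

    def subscribe(self, event_type, content_filter=lambda e: True):
        mailbox = []  # the subscriber drains this list whenever it likes
        self._subs.append((event_type, content_filter, mailbox))
        return mailbox

    def publish(self, event):
        # The publisher never names a subscriber (space decoupling) and only
        # enqueues, so it does not wait for anyone to consume the event.
        for event_type, content_filter, mailbox in self._subs:
            if isinstance(event, event_type) and content_filter(event):
                mailbox.append(event)

engine = TPSEngine()
cheap_offers = engine.subscribe(Offer, lambda o: o.price < 20)
everything = engine.subscribe(Event)  # a supertype subscription receives subtypes too

engine.publish(Offer(price=15.0))
engine.publish(Offer(price=40.0))
engine.publish(Notice(text="closed Monday"))
```

Because the filter for `cheap_offers` is only ever called with `Offer` instances, it can read `price` safely. Time decoupling is not modelled here; it would need events to be retained for subscribers that join later.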

Page 13: Intrinsic References in  Distributed Systems

Example – Ski renting application
  Need to find ski rentals with reasonable rates.
  Must surf the net for a long time.
  Alternative: Use the TPS-based P2P infrastructure.
  Subscribe to the ski-rental type and wait for answers.

Publisher: (A new shop is opened)
  A search is launched for a ski-rental advertisement.
  If none is found, a new one is created.
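The ski-rental flow can be sketched with a stand-in for the P2P layer. Everything here is hypothetical: `Network`, `find_or_create_advertisement`, the shop names, and the price threshold of 25 are invented for illustration and are not JXTA or TPS calls. Delivery is a direct callback for brevity, so the sketch does not show asynchronous behaviour.

```python
from dataclasses import dataclass

@dataclass
class SkiRental:
    shop: str
    price_per_day: float

class Network:
    """Stand-in for the P2P layer: advertisements plus subscribers per type."""

    def __init__(self):
        self.advertisements = {}  # type name -> advertisement record
        self.subscribers = {}     # type name -> list of callbacks

    def find_or_create_advertisement(self, event_type):
        name = event_type.__name__
        created = name not in self.advertisements
        if created:
            self.advertisements[name] = {"type": name}
        return self.advertisements[name], created

    def subscribe(self, event_type, callback):
        self.subscribers.setdefault(event_type.__name__, []).append(callback)

    def publish(self, event):
        for callback in self.subscribers.get(type(event).__name__, []):
            callback(event)

net = Network()
offers = []

def on_offer(offer: SkiRental) -> None:
    if offer.price_per_day <= 25:  # "reasonable rate" threshold, assumed
        offers.append(offer)

net.subscribe(SkiRental, on_offer)  # the customer subscribes and waits

# A new shop opens: no advertisement is found, so one is created.
_, created_first = net.find_or_create_advertisement(SkiRental)
net.publish(SkiRental("Alpine Rentals", 22.0))

# A second shop finds the existing advertisement and reuses it.
_, created_second = net.find_or_create_advertisement(SkiRental)
net.publish(SkiRental("Summit Skis", 35.0))
```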

[Figure: Programming phases]

Page 14: Intrinsic References in  Distributed Systems

Performance –
  Invocation time: Time for sendMessage().
  Publisher produces 50 events.
  JXTA-WIRE is quicker.
  No difference between SR-JXTA and SR-TPS.
  Throughput: Similar trends!

Conclusion –
  TPS is a viable alternative abstraction to RPC for future Internet-wide operating systems to support P2P applications.
  Simple to use, type-safe, and preserves the decoupled nature of P2P.
  Makes programming easier than with pure JXTA.