

Page 1: 1 Grid vs. Peer-to-Peer Yin Chen s0231189@sms.ed.ac.uk 25 June 2003


Grid vs. Peer-to-Peer

Yin Chen, s0231189@sms.ed.ac.uk

25 June 2003


Content

Grid vs. P2P

What’s the request

Why P2P architecture

Issues of P2P

P2P case study - Freenet

Design


Grid vs. P2P

Grid

Standards-based

Persistent

Addresses security issues

Resources are more powerful, more diverse and better connected

Data-intensive

Faces problems of autonomic configuration and management

Not very scalable


Grid vs. P2P

P2P

Highly scalable

Fault tolerant

Self-configuring

Automatic problem determination

Highly variable behaviour

But lacks infrastructure

Security problems

Less concerned with quality of service


What’s the request

A user requests the car service and keeps a log recording whether each request succeeded or failed.

The user may ask all other users for their historical request records. From these statistics, we can estimate how well a particular service responds, which also helps predict the outcome of further requests.
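As a rough illustration of the statistic described here, the sketch below pools log records from several users and computes a per-service success rate. The record layout `(service name, success flag)` and the function name `success_rates` are assumptions made for this example; the slides do not specify them.

```python
from collections import defaultdict

def success_rates(records):
    """Return {service: fraction of requests that succeeded}."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for service, succeeded in records:
        totals[service] += 1
        if succeeded:
            successes[service] += 1
    return {service: successes[service] / totals[service] for service in totals}

# Hypothetical records pooled from several users: (service name, success flag).
pooled = [("car", True), ("car", False), ("car", True), ("car", True)]
rates = success_rates(pooled)
print(rates)  # {'car': 0.75}
```

The observed rate could then serve as a simple estimate of how likely the next request to that service is to succeed.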


Why P2P

Not run-time information

Better fault tolerance

Pull model is efficient and generates less network traffic


Issues of P2P - Topology


Issues of P2P - Response Modes


Issues of P2P

… It turns out to be a problem of querying distributed data stores, which is different from querying a central database …


Issues of P2P - Query Processing

Recursively Partitionable Query


Issues of P2P - Abort Timeout (1)

Problems

- The user is no longer interested in the query results
- Unless something stops it, the query will roam the network forever
- The query should fade away after some time
- A static timeout remains unchanged across hops

Solution -> Dynamic Abort Timeout

- Nodes further away from the originator time out earlier than nodes closer to the originator
- Decrease the timeout at each hop
- Exponential decay with halving
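A minimal sketch of exponential decay with halving, assuming the originator starts with a timeout in seconds and each hop halves what it received. The function name `timeout_at_hop` and the starting value of 8 seconds are illustrative only.

```python
def timeout_at_hop(initial_timeout: float, hop: int) -> float:
    """Timeout granted to a node `hop` hops away from the originator."""
    return initial_timeout / (2 ** hop)

# The originator (hop 0) waits longest; each further hop waits half as long.
timeouts = [timeout_at_hop(8.0, hop) for hop in range(4)]
print(timeouts)  # [8.0, 4.0, 2.0, 1.0]
```

Because the values shrink geometrically, nodes far from the originator give up first, so their partial answers can still reach nodes closer in before those time out.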


Issues of P2P - Abort Timeout (2)


Issues of P2P - Query Scope (1)

Problems

- There is no need to search the whole network
- The broadcast model will flood the network

Solutions -> Select a neighbour subset

- Search only a specific domain, host or owner
- Randomly select half of the neighbours
- In a tree-like topology, select all children and ignore all parents
- Find only a single result
- Specify the maximum number of results (maxResults) and bytes (maxResultBytes) to be returned
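Two of the scope limits above can be sketched briefly: picking a random half of the neighbours, and capping the returned results by count and by size. The helper names `select_half` and `cap_results` are hypothetical, and rounding the half upwards is an assumption made so that a node with one neighbour still forwards the query.

```python
import random

def select_half(neighbours, rng=random):
    """Pick a random half of the neighbours (rounded up)."""
    k = (len(neighbours) + 1) // 2
    return rng.sample(neighbours, k)

def cap_results(results, max_results, max_result_bytes):
    """Keep results in order until either limit would be exceeded."""
    kept, used = [], 0
    for result in results:
        if len(kept) >= max_results or used + len(result) > max_result_bytes:
            break
        kept.append(result)
        used += len(result)
    return kept

chosen = select_half(["n1", "n2", "n3", "n4"], random.Random(0))
capped = cap_results([b"aaaa", b"bb", b"cccc"], max_results=2, max_result_bytes=100)
```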


Issues of P2P - Query Scope (2)

Neighbour selection query

- Each node maintains statistics about its neighbours
- Only select neighbours that meet minimum requirements in terms of latency, bandwidth or historic results (maxLatency, minBandwidth, minHistoricResult)

Radius of a query: a measure of path length

- Set the maximum number of hops a query is allowed to travel
- The radius is decreased by one at each hop
- The roaming query and response fade away when the radius drops below zero
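The two mechanisms on this slide can be sketched as follows. The thresholds follow the slide's `maxLatency`, `minBandwidth` and `minHistoricResult`; the `NeighbourStats` fields, their units and the helper names are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class NeighbourStats:
    name: str
    latency_ms: float
    bandwidth_kbps: float
    historic_results: int

def eligible(neighbours, max_latency, min_bandwidth, min_historic_result):
    """Keep only neighbours that meet all minimum requirements."""
    return [
        n for n in neighbours
        if n.latency_ms <= max_latency
        and n.bandwidth_kbps >= min_bandwidth
        and n.historic_results >= min_historic_result
    ]

def next_radius(radius):
    """Decrease the radius by one; None means the query fades away."""
    radius -= 1
    return radius if radius >= 0 else None

stats = [
    NeighbourStats("fast", 20.0, 512.0, 10),
    NeighbourStats("slow", 400.0, 512.0, 10),
]
selected = [n.name for n in eligible(stats, 100.0, 256.0, 1)]
```

With an initial radius of 2, `next_radius` yields 1, then 0, then `None`, so the query is dropped on the third hop.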


Issues of P2P - Routing

Random forwarding (random walk)

Learning: nodes record the requests answered by other nodes. A request is forwarded to the peer that previously answered similar requests, or otherwise to a random peer.

Best neighbour: each node records the number of answers received from each peer. A request is forwarded to the peer that answered the largest number of requests.

Learning + best neighbour: identical to learning, except that when no relevant experience exists, the request is forwarded to the best neighbour.
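The four strategies can be compared side by side in a small sketch. It simplifies "similar requests" to an exact match on a request topic, which is an assumption; the `Router` class and its method names are likewise hypothetical.

```python
import random

class Router:
    def __init__(self, peers, rng=None):
        self.peers = list(peers)
        self.answered_by = {}                        # topic -> peer that answered it
        self.answer_count = {p: 0 for p in self.peers}
        self.rng = rng or random.Random()

    def record_answer(self, topic, peer):
        self.answered_by[topic] = peer
        self.answer_count[peer] += 1

    def random_walk(self, topic):
        return self.rng.choice(self.peers)

    def learning(self, topic):
        # Fall back to a random peer when there is no relevant experience.
        return self.answered_by.get(topic) or self.random_walk(topic)

    def best_neighbour(self, topic):
        return max(self.peers, key=lambda p: self.answer_count[p])

    def learning_plus_best(self, topic):
        # Fall back to the best neighbour when there is no relevant experience.
        return self.answered_by.get(topic) or self.best_neighbour(topic)

router = Router(["a", "b", "c"], random.Random(1))
router.record_answer("x", "b")
router.record_answer("y", "c")
router.record_answer("z", "c")
```

Here a repeat request for topic `"x"` goes to peer `"b"`, while an unseen topic goes to `"c"` under best neighbour, since `"c"` has answered the most requests.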


P2P Case Study - Freenet

Freenet provides a file-storage service

The network is entirely decentralised

Information publishers and consumers are anonymous

Communications are encrypted

Files in the data store are encrypted


Adding New File

A user assigns the file a GUID key and sends an insert message containing the file identifier (GUID) and a time-to-live (TTL) value.

A GUID is a location-independent globally unique identifier, computed by hashing the contents of the file.

On receiving an insert, a node checks whether the key already exists. If not, it stores it, creates a routing entry for it, looks up the closest key, and forwards the message to the corresponding node.

If the TTL expires, the final node returns an "all clear" message. The user then sends the data along the path.
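A minimal sketch of deriving a key from file contents. SHA-1 is assumed as the hash function here; the slide only says the GUID comes from hashing the contents, and the function name `content_guid` is illustrative.

```python
import hashlib

def content_guid(data: bytes) -> str:
    """Derive a location-independent key from the file contents alone."""
    return hashlib.sha1(data).hexdigest()

guid = content_guid(b"example file contents")
same = content_guid(b"example file contents")
other = content_guid(b"different contents")
```

Identical contents always give the same key, wherever the file is stored, which is what makes the identifier location-independent.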


Requesting File

Every node maintains a routing table listing the addresses of other nodes and the GUID keys they are thought to hold.

On receiving a query, a node first checks its own store. If it finds the file, it announces itself as the holder. Otherwise, it forwards the query to the node with the closest key.

If the file is found, each node passes the file along the chain and creates a new entry in its routing table.

Each node might also cache a copy locally.

The query maintains a TTL, decreased at each hop.

If a node runs out of candidates, it reports failure back to its predecessor, which then tries its second choice.
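The request steps above can be sketched as a depth-first search with backtracking. This is a toy model under stated assumptions, not Freenet's implementation: keys are plain integers, "closest" means smallest absolute difference, and the `Node` class is hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.store = {}    # key -> file data held locally
        self.routes = {}   # key -> neighbouring Node associated with that key

    def request(self, key, ttl, visited=None):
        """Return (data, holder) if the file is found, else None."""
        visited = visited if visited is not None else set()
        visited.add(self.name)
        if key in self.store:
            return self.store[key], self
        if ttl <= 0:
            return None
        # Try neighbours in order of key closeness: first choice, then second...
        candidates = sorted(self.routes.items(), key=lambda kv: abs(kv[0] - key))
        for _, neighbour in candidates:
            if neighbour.name in visited:
                continue
            found = neighbour.request(key, ttl - 1, visited)
            if found is not None:
                data, holder = found
                self.store[key] = data      # cache a copy locally
                self.routes[key] = holder   # new routing table entry
                return found
        return None  # out of candidates: report failure to the predecessor

a, b, c = Node("a"), Node("b"), Node("c")
a.routes = {47: b, 60: c}   # b looks closest to key 48 but is a dead end
c.store[48] = "file"
result = a.request(48, ttl=3)
```

Node `a` tries `b` first, receives a failure, then falls back to `c`, which holds the file. Afterwards `a` has both a cached copy and a routing entry pointing at `c`.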


Adding New Node

A new node sends an announcement, with a TTL, to an existing node.

The receiving node forwards the announcement to another node chosen randomly from its routing table.

The announcement continues to propagate until its TTL runs out.
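A small sketch of the announcement walk, assuming routing tables are plain lists of node names and the TTL counts remaining hops. The function name and data layout are illustrative.

```python
import random

def propagate_announcement(start, routing_tables, ttl, rng):
    """Return the nodes that saw the announcement, in order."""
    path = [start]
    current = start
    while ttl > 0 and routing_tables.get(current):
        current = rng.choice(routing_tables[current])  # random next hop
        path.append(current)
        ttl -= 1
    return path

# Each node knows exactly one other node, so the walk is deterministic here.
tables = {"a": ["b"], "b": ["c"], "c": ["a"]}
path = propagate_announcement("a", tables, ttl=3, rng=random.Random(0))
print(path)  # ['a', 'b', 'c', 'a']
```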


Training Routes

Nodes that reliably answer queries will be added to more routing tables.

Well-known nodes tend to see more requests and become better connected.

Similar keys tend to cluster in the nodes along the same path, because requests will be for similar files which have similar keys.


Managing Storage

Given finite disk space, a node sometimes needs to decide which files to keep.

Freenet decides by the frequency of requests per file and keeps the more popular files.

Frequently requested files have more copies in the network. The tree grows in that direction.

Unrequested files are subject to deletion. The tree shrinks in that direction.
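The policy described on this slide, keeping the more frequently requested files, can be sketched as below. Capacity is counted in files rather than bytes and is assumed to be at least one; both are simplifications, and the `DataStore` class is hypothetical.

```python
class DataStore:
    def __init__(self, capacity):
        self.capacity = capacity   # maximum number of files (assumed >= 1)
        self.files = {}            # key -> data
        self.requests = {}         # key -> number of requests seen

    def get(self, key):
        if key in self.files:
            self.requests[key] += 1
            return self.files[key]
        return None

    def put(self, key, data):
        if key not in self.files and len(self.files) >= self.capacity:
            # Evict the least requested file to make room.
            victim = min(self.files, key=lambda k: self.requests[k])
            del self.files[victim]
            del self.requests[victim]
        self.files[key] = data
        self.requests.setdefault(key, 0)

store = DataStore(capacity=2)
store.put("a", "data-a")
store.put("b", "data-b")
store.get("a")
store.get("a")
store.put("c", "data-c")   # "b" was never requested, so it is evicted
```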


Design

Tree Topology

Each node maintains a Log File

Each node also maintains a Local Data Store for storing query results.


Design

Adding New Node

- When a new node joins the network, it connects itself to only one existing node.

Adding Log Record

- When a user accesses a service, a log record is created
- Log records should provide the service name, the service access time and a success/fail flag
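One possible shape for such a log record, using the three fields listed above. The class and field names are assumptions; the slides do not fix a format.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class LogRecord:
    service_name: str
    access_time: datetime
    success: bool

# Each node appends one record per service access to its own log.
log = []
log.append(LogRecord("car", datetime(2003, 6, 25, 10, 0), True))
log.append(LogRecord("car", datetime(2003, 6, 25, 11, 30), False))
```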


Design - Query

Query

- When a node sets up a query, it first looks up its local data store to see whether the same query already exists.
- If it is a new query, the node multicasts a query message to all connected nodes. The query message contains the Query Conditions, a Maximum Data Volume value and a Dynamic Abort Timeout (DAT) value.
- Query Conditions may contain the time period the user is interested in, the service name, etc.
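The query message could be modelled roughly as follows. The field names, the units (records for data volume, seconds for the DAT) and the `forwarded` helper are assumptions for illustration. Frozen dataclasses are hashable, so the conditions can double as the lookup key for the "same query exists" check.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class QueryConditions:
    service_name: str
    start: datetime   # beginning of the time period of interest
    end: datetime     # end of the time period of interest

@dataclass(frozen=True)
class QueryMessage:
    conditions: QueryConditions
    max_data_volume: int   # assumed unit: number of records
    dat_seconds: float     # Dynamic Abort Timeout, assumed in seconds

    def forwarded(self):
        """Copy to send to the next hop, with the DAT halved."""
        return QueryMessage(self.conditions, self.max_data_volume, self.dat_seconds / 2)

conditions = QueryConditions("car", datetime(2003, 6, 1), datetime(2003, 6, 25))
msg = QueryMessage(conditions, max_data_volume=1000, dat_seconds=8.0)

local_data_store = {}                           # conditions -> saved query result
is_new_query = conditions not in local_data_store
next_hop_msg = msg.forwarded()
```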


Query

- On receiving the query message, a node first looks up its own local data store. If the same query is not there, it multicasts the query to all connected nodes.

- When the DAT expires, the final node begins to return data along the chain.

- Responses use the Routed Response mode.


Design - Query

- To reduce network traffic, the calculation is performed at each node, using a recursive query plan. The calculation result propagates up along the chain.
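A sketch of how in-network calculation might look in a tree topology: each node combines its own counts with the partial results of its children and passes a single pair upwards, instead of forwarding raw log records. The `(successes, total)` representation and the function name `aggregate` are assumptions.

```python
def aggregate(node, children, local_counts):
    """Return (successes, total) for the subtree rooted at `node`."""
    successes, total = local_counts[node]
    for child in children.get(node, []):
        child_successes, child_total = aggregate(child, children, local_counts)
        successes += child_successes
        total += child_total
    return successes, total

# Hypothetical tree: root has children a and b; a has child c.
children = {"root": ["a", "b"], "a": ["c"]}
local_counts = {"root": (1, 2), "a": (3, 4), "b": (0, 1), "c": (2, 2)}
combined = aggregate("root", children, local_counts)
print(combined)  # (6, 9)
```

Only one small pair travels over each link, whatever the size of the logs beneath it, which is where the traffic saving comes from.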


Design - Query

- To avoid data flooding, only the necessary volume of data is processed, as specified by the Maximum Data Volume.

- Each chain returns zero or one result.

- The Dynamic Abort Timeout (DAT) uses the exponential decay with halving model. The DAT decreases at each hop.


Design

Calculation

- By a particular statistical methodology

Showing Result

- The final result will be shown as a graph
- The query result will also be saved in the Local Data Store

Deleting log records

- To save disk space, old log records should be deleted after a period of time
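Deleting old log records could be as simple as the sketch below, which assumes each record is a tuple beginning with its access time and that the retention period is a configurable `timedelta`. Both are illustrative choices.

```python
from datetime import datetime, timedelta

def prune(records, now, max_age):
    """Keep only records whose access time is within max_age of now."""
    cutoff = now - max_age
    return [record for record in records if record[0] >= cutoff]

# Hypothetical records: (access time, service name, success flag).
records = [
    (datetime(2003, 1, 1), "car", True),
    (datetime(2003, 6, 20), "car", False),
]
kept = prune(records, now=datetime(2003, 6, 25), max_age=timedelta(days=30))
```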


Grid vs. P2P

Thanks !