1 resilient and coherence preserving dissemination of dynamic data using cooperating peers shetal...

51
1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant Shenoy, UMass

Upload: homer-whitehead

Post on 16-Jan-2016

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

1

Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers

Shetal Shah, IIT BombayKirthi Ramamritham, IIT

BombayPrashant Shenoy, UMass

Page 2: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

2

Streaming (Dynamic) Data

Rapid and unpredictable changes Used in on-line decision making/monitoring

Data gathered by (wireless sensor) networks Sensors that monitor light, humidity, pressure, and

heat Network traffic passing via switches Sports scores, stock prices

Query bw_police:

SELECT source_domain, number_of_packets FROM network_state;WHEN bw_used_exceeds_allocation

Page 3: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

3

Coherency Requirement (c )

Clients specify the bound on the tolerable imprecision of requested data. #packets sent, max incoherency = 100

ctStU )()(

Metric:

Fidelity: % of time when the coherency requirement is met

Goal: To achieve high fidelity and resiliency

SourceS(t)

RepositoryP(t)

ClientU(t)

Page 4: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

4

Point to Point Data Dissemination

Pull:Client pulls data from the source.

Push:Source pushes data of interest to the

client.

Page 5: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

5

Generic Architecture

Data sourcesEnd-hosts

servers

sensors

wired hosts

mobile hosts

Netw

ork

Netw

ork

Page 6: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

6

Generic Architecture

Data sources

Proxies/caches

End-hosts

servers

sensors

wired host

mobile host

Netw

ork

Netw

ork

Page 7: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

7

Where should the queries execute ?

At clients can’t optimize across clients, links

At source (where changes take place) Advantages

Minimum number of refresh messages, high fidelity Main challenge

Scalability Multiple sources hard to handle

At Data Aggregators -- DAs/proxies -- placed at edge of network Advantages

Allows scalability through consolidation, Multiple data sources Main challenge

Need mechanisms for maintaining data consistency at DAs

Page 8: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

8

The Basic Problem…

Executing CQs at the edge of the network -- at Data Aggegators will improve scalability and reduce

overheads but poses challenges for consistency

maintenance

Computation / data dissemination moving query processing on streaming/changing data from sources to execution points (edge servers -- proxies/caches/repositories)

Page 9: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

9

Focus of this talk

To create a resilient and efficient content distribution network

(CDN) for streaming/dynamic data.

Existing CDNs for static data and dynamic web pages: Akamai, Dynamai

Sources

CooperatingRepositories

Clients

Page 10: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

10

Overview

Architecture of our CDN Techniques for data

dissemination Techniques to build the

overlay network Resilient data dissemination -- suitable replication of

disseminated data

Sources

CooperatingRepositories

Clients

Page 11: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

11

Cooperative Repository Architecture

Source(s), repositories (and clients)

Each repository specifies its coherency requirement

Source pushes specific changes to selected repositories

Repositories cooperate with

each other the source

Page 12: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

12

Dissemination Graph: Example

Data Set: p,q,r,s Max # push connections : 2

Source

p:0.2, q:0.2 r:0.2

p:0.4, r: 0.3 q: 0.3

AB

D C

Clientq:0.3

Page 13: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

13

Overview

Architecture of our CDN Techniques for data

dissemination Techniques to build the

overlay network Resilient data dissemination -- suitable replication of

disseminated data

Sources

CooperatingRepositories

Clients

Page 14: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

14

Data Dissemination

Different users have different coherency req for the same data item.

Coherency requirement at a repository should be at least as stringent as that of the dependents.

Repositories disseminate only changes of interest.

Source

p:0.2, q:0.2 r:0.2

p:0.4, r: 0.3

q: 0.4

q: 0.3

A B

D C

Client

Page 15: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

15

QQP cxx

Data Dissemination

A repository P sends changes of interest to the dependent Q if

Page 16: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

16

Data dissemination -- must be done with care

Source Repository P Repository Q

3.0Pc 5.0Qc

should prevent missed updates!

1.2

1.5

1

1.41.4

1.7

1.4

1 1 1

1

1.7 1.7

11

Page 17: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

17

Dissemination Algorithms

Source Based (Centralized) Repository Based (Distributed)

Page 18: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

18

Source Based Dissemination Algorithm

For each data item, source maintains unique coherency requirements of

repositories the last update sent for that coherency

For every change, source finds the maximum coherency for which it must be disseminated tags the change with that coherency disseminates (changed data, tag)

Page 19: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

19

Source Based Dissemination Algorithm

Source Repository P Repository Q

3.0Pc 5.0Qc

1.2

1.5

1

1.41.4

1.7

1 1 1

1

1.5

1

1.51.5 1.5

Page 20: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

20

QQP cxx

Repository Based Dissemination Algorithm

A repository P sends changes of interest to the dependent Q if

Pc

Page 21: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

21

Repository BasedDissemination Algorithm

Source Repository P Repository Q

3.0Pc 5.0Qc

1.2

1.5

1

1.41.4

1.7

1.4

1 1 1

1

1.7 1.7

1.41.4

Page 22: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

22

Dissemination Algorithms – number of checks by source.

Repository based algorithm requires fewer checks at source

Page 23: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

23

Dissemination Algorithms – number of messages

Source based algorithm requires less messages

Page 24: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

24

Overview

Architecture of our CDN Techniques for data

dissemination Techniques to build the

overlay network Resilient data dissemination -- suitable replication of

disseminated data

Sources

CooperatingRepositories

Clients

Page 25: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

25

Logical Structure of the CDN

Repositories want many data items,

each with different coherency requirements.

Issues: Which repository

should serve what to whom?

Source

p:0.2, q:0.2 r:0.2,q:0.4

p:0.4, r: 0.3q: 0.3

A B

D C

Page 26: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

26

Constructing the Layout Network

Check level by level starting from the source Each level has a load controller. The load controller tries to find data

providers for the new repository(Q).

Insert repositories one by one

Page 27: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

27

Selecting Data Providers

Repositories with low preference factor are considered as potential data providers.

The most preferred repository with a needed data item is made the provider of that data item.

The most preferred repository is made to provide the remaining data items.

Page 28: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

28

Preference Factor

Resource Availability factor:

Can repository (P) be the provider for one more dependent?

Data Availability Factor:

#data items that P can provide for the new repository Q. Computational delay factor: #dependents P provides for. Communication delay factor: network delay between the 2 repositories.

QservecanPitemsdata

QPdelay

#

),(

Page 29: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

29

Experimental Methodology

Physical network: 4 servers, 600 routers, 100 repositories Communication delay: 20-30 ms Computation delay: 3-5 ms Real stock traces: 100-1000 Time duration of observations: 10,000 s Tight coherency range: 0.01 to 0.05 loose coherency range: 0.5 to 0.99

Page 30: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

30

Loss of fidelity for different coherency requirements

The less stringent the coherency requirement,

the better the fidelity.

For little/no cooperation, loss in fidely is high.

Too much cooperation?

T% of the data items have stringent coherency requirements

Degree of cooperation

Los

s in

fid

elity

%

Page 31: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

31

Controlled cooperation

dependentsdelaycompaverage

delaynetworkaverage

ncooperatioofActual

interested

degree

#

Page 32: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

32

Controlled cooperation is essential

Without controlled cooperation

With controlled cooperation

Degree of cooperationDegree of cooperation Max degree of cooperation

Page 33: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

33

But …

Loss in fidelity increases for large # data items.

Repositories with stringent coherence requirements should be closer to the source.

Page 34: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

34

If parents are not chosen judiciously

It may result in Uneven distribution

of load on repositories.

Increase in the number of messages in the system.

Increase in loss in fidelity!

Source

p:0.2, q:0.2 r:0.2

p:0.4, r: 0.3q: 0.3

AB

D C

Page 35: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

35

Data item at a Time Algorithm

A dissemination tree for each data item.

Source serving the data item is the root.

Repositories with more stringent coherency requirements are placed closer to the root.

Source

Cq: 0.3

q: 0.2 A

Dynamic Data Dissemination Tree - for q

When multiple data items are considered, repositories form a peer to peer network

Page 36: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

36

DiTA

Repository N needs data item x If the source has available push connections, or the source is the only node in the dissemination tree for x

N is made the child of the source Else

repository is inserted in most suitable subtree where

N’’s ancestors have more stringent coherency requirements

N is closest to the root

Page 37: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

37

Most Suitable Subtree?

l: smallest level in the subtree with coherency requirement less stringent than N’’s.

d: communication delay from the root of the subtree to N.

smallest (l x d ): most suitable subtree.

Essentially, minimize communication delays!

Page 38: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

38

Example

Source

Initially the network consists of the source.

q: 0.2 A

A and B request service of q with coherency requirement 0.2

q: 0.2B

C requests service of q with coherency requirement 0.1

q: 0.1 C

q: 0.2 A

Page 39: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

39

Experimental Evaluation

Page 40: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

40

Overview

Architecture of our CDN Techniques for data

dissemination Techniques to build the

overlay network Resilient data dissemination -- suitable replication of

disseminated data

Sources

CooperatingRepositories

Clients

Page 41: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

41

Handling Failures in the Network

Need to detect permanent/transient failures in the network and to recover from them

Resiliency is obtained by adding redundancy

Without redundancy, failures loss in fidelity

Adding redundancy can increase cost possible loss of fidelity! Handle failures such that cost of adding resiliency is low!

Page 42: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

42

Passive/Active Failure Handling

Passive failure detection: Parent sends I’m alive messages at the end

of every time interval. what should the time interval be?

Active failure handling: Always be prepared for failures. For example: 2 repositories can serve the

same data item at the same coherency to a child.

This means lots of work greater loss in fidelity.

Need to be clever

Page 43: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

43

Middle Path

A backup parent B is found for each data item that the repository needs

Let repository R want data item x with coherency c.

P

R

B serves R with coherency k × c

k × cc

B

At what coherency should B serve R ?

Page 44: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

44

If a parent fails

Detection: Child gets two consecutive updates from the backup parent with no updates from the parent

B

R

k x cc

P

c Recovery: Backup

parent is asked to serve at coherency c till we get an update from the parent

Page 45: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

45

Adding Resiliency to DiTA

A sibling of P is chosen as the backup parent of R.

If P fails, A serves B with coherency c change is local. If P has no siblings, a sibling

of nearest ancestor is chosen.

Else the source is made the backup parent.

B

R

k x cc

P

A

Page 46: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

46

Markov Analysis for k

Assumptions Data changes as a random walk along the line The probability of an increase is the same as

that of a decreaseNo assumptions made about the unit of change

or time taken for a change

Expected # misses for any k <= 2 k2 – 2

for k = 2, expected # misses <= 6

Page 47: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

47

Failure and Recovery Modelling

Failures and recovery modeled based on trends observed in practice

Analysis of link failures in an IP backbone by G. Iannaccone et al

Internet Measurement Workshop 2002

Recovery:10% > 20 min 40% > 1 min & < 20 min, 50% < 1 min

Trend for time between failure:

Page 48: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

48

In the Presence of Failures, Varying Recovery Times

Addition of resiliency does improve fidelity.

Page 49: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

49

In the Presence of Failures, Varying Data Items

Increasing Load

Fidelity improves with addition of resiliency even for large number of data items.

Page 50: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

50

In the Absence of Failures

Increasing Load

Often, fidelity improves with addition of resiliency, even in the absence of failures!

Page 51: 1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Kirthi Ramamritham, IIT Bombay Prashant

51

Ongoing Work

Mapping Clients to Repositories. Locality, coherency requirements,

load, mobility Reorganization of the CDN.

Snapshot based Dynamic Reorganization