divide and conquer algorithms for pub/sub overlay design

Post on 26-Jan-2016


Divide and Conquer Algorithms for Pub/Sub Overlay Design. Chen Chen (Department of Electrical and Computer Engineering, University of Toronto); joint work with Hans-Arno Jacobsen (Department of Electrical and Computer Engineering and Department of Computer Science, University of Toronto) and Roman Vitenberg (Department of Informatics, University of Oslo).

MIDDLEWARE SYSTEMS RESEARCH GROUP

Divide and Conquer Algorithms for Pub/Sub Overlay Design

Chen Chen 1

Joint work with Hans-Arno Jacobsen 1,2 and Roman Vitenberg 3

1 Department of Electrical and Computer Engineering, University of Toronto

2 Department of Computer Science, University of Toronto

3 Department of Informatics, University of Oslo

ICDCS’10, Genoa, Italy


Example: Pub/Sub

[Figure: three subscribers with interests "boy", "boy", and "girl"; publications are labeled with the topics "boy" and "girl".]

Pub/Sub

• A communication paradigm
  – Subscribers express their interests
  – Publishers disseminate messages

• Many applications and industry standards
  – Application integration, financial data dissemination, RSS feed distribution, business process management
  – WS-Notification, WS-Eventing, OMG’s Data Distribution Service for Real-Time Systems (DDS)

• Topic-based pub/sub
  – TIBCO RV
  – Google’s GooPS

Two components in pub/sub implementation

Design of routing protocols

• The design of protocols so that publications and subscriptions are sent most efficiently across the overlay network.
• G. Li et al., ICDCS’08
• M. Castro et al., JSAC’02

Construction of overlay

• The construction of the overlay topology such that network traffic is minimized.
• Chockler et al., PODC’07
• Onus et al., INFOCOM’09


Desirable properties for overlays

• Low average node degree
• Low fan-out of a node
• Low diameter
• Topic-connectivity
• Efficiency to construct
• Adaptability to churn
• Ease of distributed implementation

Our contributions

| Previous algorithm: GM | Our algorithms |
|---|---|
| High running time cost | Low running time cost |
| Full knowledge requirement | Partial knowledge requirement |
| Centralized operation (difficult to decentralize) | Centralized operation (easy to decentralize) |
| No support for dynamic changes | No direct support for dynamic changes |
| Constructing from scratch only (no support for incremental addition) | Constructing both from scratch and incrementally |

Topic-connectivity

[Figure: an overlay G over five nodes with interests V1 {b,c,d}, V2 {a}, V3 {b,d}, V4 {a,b}, V5 {a,c}. Suboverlay Ga, over the nodes interested in topic a (V2, V4, V5), is topic-connected. Suboverlay Gb, over the nodes interested in topic b (V1, V3, V4), is NOT topic-connected.]
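The topic-connectivity property can be checked mechanically. The following is a minimal sketch, assuming the usual definition (for every topic, the nodes interested in it induce a connected subgraph). The function name `is_topic_connected` and both edge lists are hypothetical: the edges drawn on the slide are not recoverable from this transcript, so the two edge sets below are made up for illustration.

```python
from collections import defaultdict

def is_topic_connected(interests, edges):
    """True if, for every topic, the nodes interested in it induce a connected subgraph."""
    adj = defaultdict(set)
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    for topic in set().union(*interests.values()):
        members = {v for v, ts in interests.items() if topic in ts}
        start = next(iter(members))
        seen, stack = {start}, [start]
        while stack:  # depth-first search restricted to the topic's subscribers
            u = stack.pop()
            for w in (adj[u] & members) - seen:
                seen.add(w)
                stack.append(w)
        if seen != members:
            return False
    return True

# Node interests taken from the five-node example.
interests = {"v1": {"b", "c", "d"}, "v2": {"a"}, "v3": {"b", "d"},
             "v4": {"a", "b"}, "v5": {"a", "c"}}
# Hypothetical edge sets (assumption: not the edges shown on the slide).
connected = [("v1", "v5"), ("v1", "v3"), ("v1", "v4"), ("v2", "v4"), ("v4", "v5")]
broken = [("v1", "v5"), ("v1", "v3"), ("v2", "v4"), ("v4", "v5")]  # v4 is cut off for topic b

print(is_topic_connected(interests, connected))  # True
print(is_topic_connected(interests, broken))     # False
```

In the second edge set, v4 has no neighbor that is also interested in topic b, which is the kind of situation the Gb suboverlay illustrates.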

MinAvg-TCO problem

[Figure: two topic-connected overlays over the same five nodes (V1 {b,c,d}, V2 {a}, V3 {b,d}, V4 {a,b}, V5 {a,c}). TCO1 has 5 edges; TCO2 has 10 edges.]

• A high-quality overlay
  – Topic-connectivity
  – Total number of edges

• Input:
  – a set of nodes V,
  – a set of topics T,
  – the interest function Int

• MinAvg-TCO(V,T,Int) (optimization version)

Construct a TCO(V,T,Int,E) such that |E| is minimum.

• Avg-TCO(V,T,Int,k) (decision version)

Is there a TCO(V,T,Int,E) such that |E|=k?

• Theorem: MinAvg-TCO is NP-complete
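For very small instances the optimization version can be solved by exhaustive search, which makes the problem statement concrete. The sketch below is illustrative only and is exponential in the number of node pairs; the names `is_tco` and `min_tco_bruteforce` are hypothetical, and the node interests are those of the five-node example.

```python
from itertools import combinations

def is_tco(interests, edges):
    """Check topic-connectivity with one union-find structure per topic."""
    for topic in set().union(*interests.values()):
        members = {v for v, ts in interests.items() if topic in ts}
        parent = {v: v for v in members}

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        for u, w in edges:
            if u in members and w in members:
                parent[find(u)] = find(w)
        if len({find(v) for v in members}) > 1:
            return False
    return True

def min_tco_bruteforce(interests):
    """Try edge sets in order of increasing size; the first TCO found is minimum."""
    candidates = list(combinations(sorted(interests), 2))
    for k in range(len(candidates) + 1):
        for edges in combinations(candidates, k):
            if is_tco(interests, edges):
                return list(edges)

interests = {"v1": {"b", "c", "d"}, "v2": {"a"}, "v3": {"b", "d"},
             "v4": {"a", "b"}, "v5": {"a", "c"}}
best = min_tco_bruteforce(interests)
print(len(best))  # 5
```

The result agrees with the 5-edge overlay TCO1 of the example: topics c and d each force one edge, topic b needs one more, and topic a needs two edges that no other topic can share.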

[Figure: the example overlay over nodes V1 {b,c,d}, V2 {a}, V3 {b,d}, V4 {a,b}, V5 {a,c}.]

Greedy-Merge (GM) algorithm

• Greedy: always making the choice that looks best at the moment

• GM for MinAvg-TCO: always adding an edge with maximum link contribution

• Running time: $O(|V|^2 |T|)$
• Approximation ratio: $O(\log(|V||T|))$
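The greedy rule can be sketched in a few lines. The slides do not define "link contribution", so this sketch assumes the definition commonly used for this problem: the contribution of an edge is the number of topics shared by its endpoints for which the two endpoints are still in different connected components. The function name `greedy_merge` is hypothetical, and this naive version rescans all node pairs in every round, so it does not achieve the running time stated above.

```python
from itertools import combinations

def greedy_merge(interests):
    """Naive GM sketch: repeatedly add the edge with maximum link contribution."""
    nodes = sorted(interests)
    topics = set().union(*interests.values())
    # comp[t][v] is the label of v's connected component in the subgraph of topic t.
    comp = {t: {v: v for v in nodes if t in interests[v]} for t in topics}
    edges = []

    def contribution(u, w):
        # Assumed definition: number of shared topics whose components this edge would merge.
        return sum(1 for t in interests[u] & interests[w] if comp[t][u] != comp[t][w])

    while True:
        best = max(combinations(nodes, 2), key=lambda e: contribution(*e), default=None)
        if best is None or contribution(*best) == 0:
            return edges  # no edge merges anything: every topic is connected
        u, w = best
        edges.append(best)
        for t in interests[u] & interests[w]:
            old, new = comp[t][u], comp[t][w]
            if old != new:
                for v in comp[t]:
                    if comp[t][v] == old:
                        comp[t][v] = new

interests = {"v1": {"b", "c", "d"}, "v2": {"a"}, "v3": {"b", "d"},
             "v4": {"a", "b"}, "v5": {"a", "c"}}
edges = greedy_merge(interests)
print(len(edges))  # 5
```

On the five-node example the first edge chosen is (v1, v3), the only pair sharing two topics; every later edge merges exactly one pair of components.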


TCO join problem

• Given p TCOs: $TCO_d(V_d, T_d, Int_d, E_d)$, $d = 1, \dots, p$

• MinAvg-TCO-Join(V,T,Int,p) (optimization version)

Construct a TCO(V,T,Int,E) such that |E| is minimum

• Avg-TCO-Join(V,T,Int,p,k) (decision version)

Is there a TCO(V,T,Int,E) such that |E|=k?

• MinAvg-TCO is a special case of MinAvg-TCO-Join:

Theorem: MinAvg-TCO-Join is NP-complete


Solving MinAvg-TCO-Join

• MinAvg-TCO-Join could be solved by GM, but this is NOT practical:
  – Tear down all existing links
  – Rebuild the overlay from scratch using GM

• It is better to preserve all existing edges and only add edges incrementally.

Bad case for incremental addition of edges

TCO0: $V = \{v_1, v_2, \dots, v_n\}$, $T = \{t_{i,j} \mid 1 \le i, j \le n\}$, $T_{v_i} = \{t_{ij} \mid j = 1, \dots, n\}$; $\Theta(n^2)$ edges

Vall: interested in all topics in T

[Figure: TCO0 over nodes V1, V2, ..., Vi, ..., Vn-1, Vn; the same nodes joined by Vall, once constructed incrementally (TCO1) and once constructed from scratch (TCO2).]

Constructing incrementally, TCO1: $\Theta(n^2)$ edges

Constructing from scratch, TCO2: $\Theta(n)$ edges
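The gap between the two constructions can be spelled out with a short count. This is a sketch under an assumption the slide does not state explicitly: $t_{ij} = t_{ji}$, so that every pair of nodes $v_i, v_j$ shares a topic that no other node in TCO0 subscribes to.

```latex
% Assumption (not stated on the slide): t_{ij} = t_{ji}.
\begin{align*}
|E(TCO_0)| &= \binom{n}{2} = \frac{n(n-1)}{2}
  && \text{(each pair needs its own edge)} \\
|E(TCO_1)| &\ge \frac{n(n-1)}{2}
  && \text{(all existing edges are preserved)} \\
|E(TCO_2)| &= n
  && \text{(a star centered at } V_{all} \text{ connects every topic)}
\end{align*}
```

In the star, the subscribers of any topic $t_{ij}$ are $v_i$, $v_j$, and Vall, and they are connected through Vall, so the $n$ star edges already form a topic-connected overlay.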


Naive Merge (NM) algorithm

GM algorithm

• Input: (V,T,Int)
• Output: one TCO
• Algorithm:
  - Start with an empty edge set;
  - Always add an edge with maximum link contribution.
• Running time: $O(|V|^2 |T|)$

NM algorithm

• Input: $(V_d, T_d, Int_d, E_d)$, $d = 1, \dots, p$
• Output: one TCO
• Algorithm:
  - Start with existing internal-TCO links;
  - Always add a cross-TCO edge with maximum link contribution.
• Running time: $O\left(|T| \sum_{i=1}^{p} \sum_{j>i}^{p} |V_i||V_j|\right)$

NM is based on the same greedy heuristic as GM.

Example of NM

[Figure: example of NM joining sub-TCOs over nodes V0 to V14, whose interests are subsets of the topics {a,b,c,d}.]

Still a prohibitively high running time!!!

$O\left(|T| \sum_{i=1}^{p} \sum_{j>i}^{p} |V_i||V_j|\right)$

Star set

Given a TCO (V,T,Int,E), a star set S is a subset of V that covers all of V’s topics.

[Figure: a topic-connected overlay over nodes V1 {b,c,d}, V2 {a}, V3 {b,d}, V4 {a,b}, V5 {a,c}.]

{v3, v5} is a star set which covers all topics {a,b,c,d}

{v2, v3, v4} is not a star set; it only covers {a,b,d}

Star set

• Star set nodes
  – Represent the interests of all the nodes
  – Can function as bridges to determine cross-TCO links

• Observation: minimal star sets tend to contain substantially fewer nodes than the whole overlay.

• How to find a minimum star set S* for (V,T,Int)?
  – Equivalent to the classic set cover problem: NP-complete
  – Can be approximated with a logarithmic approximation ratio
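One standard way to obtain the logarithmic approximation is the greedy set cover heuristic: repeatedly pick the node that covers the most still-uncovered topics. The sketch below assumes that heuristic (the slides do not say which approximation is used), and the name `greedy_star_set` is hypothetical.

```python
def greedy_star_set(interests):
    """Greedy set cover: pick the node covering the most uncovered topics until all are covered."""
    uncovered = set().union(*interests.values())
    star = []
    while uncovered:
        # Ties are broken by node name because max() keeps the first maximal element.
        best = max(sorted(interests), key=lambda v: len(interests[v] & uncovered))
        star.append(best)
        uncovered -= interests[best]
    return star

interests = {"v1": {"b", "c", "d"}, "v2": {"a"}, "v3": {"b", "d"},
             "v4": {"a", "b"}, "v5": {"a", "c"}}
star = greedy_star_set(interests)
print(star)  # ['v1', 'v2']
```

The greedy result differs from the star set {v3, v5} named in the example, but it has the same size and also covers all of {a,b,c,d}; star sets are generally not unique.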


Star Merge (SM) algorithm

NM algorithm

• Input: $(V_d, T_d, Int_d, E_d)$, $d = 1, \dots, p$
• Output: one TCO
• Algorithm:
  - Start with existing internal-TCO links;
  - // Do nothing;
  - Always add a cross-TCO edge with maximum link contribution.

SM algorithm

• Input: $(V_d, T_d, Int_d, E_d)$, $d = 1, \dots, p$
• Output: one TCO
• Algorithm:
  - Start with existing internal-TCO links;
  - Find a star set for each sub-TCO;
  - Always add a cross-Star edge with maximum link contribution.
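The three SM steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: star sets are found with the greedy set cover heuristic, link contribution is taken to be the number of shared topics whose components an edge merges, node names are unique across sub-TCOs, and every input sub-TCO is already topic-connected. The names `star_set` and `star_merge` are hypothetical.

```python
from itertools import combinations

def star_set(interests):
    """Greedy set cover over one sub-TCO's nodes (assumed heuristic)."""
    uncovered = set().union(*interests.values())
    star = []
    while uncovered:
        best = max(sorted(interests), key=lambda v: len(interests[v] & uncovered))
        star.append(best)
        uncovered -= interests[best]
    return star

def star_merge(sub_tcos):
    """sub_tcos: list of (interests, edges) pairs, each assumed topic-connected."""
    interests, edges, home = {}, [], {}
    for d, (ints, es) in enumerate(sub_tcos):
        interests.update(ints)
        edges.extend(es)                      # step 1: keep existing internal links
        home.update({v: d for v in ints})
    topics = set().union(*interests.values())
    # Per topic, the subscribers inside one sub-TCO already form a single component.
    comp = {t: {v: home[v] for v in interests if t in interests[v]} for t in topics}
    stars = [v for ints, _ in sub_tcos for v in star_set(ints)]   # step 2
    candidates = [(u, w) for u, w in combinations(stars, 2) if home[u] != home[w]]

    def contribution(e):
        u, w = e
        return sum(1 for t in interests[u] & interests[w] if comp[t][u] != comp[t][w])

    while candidates:                         # step 3: greedy over cross-star edges only
        best = max(candidates, key=contribution)
        if contribution(best) == 0:
            break
        u, w = best
        edges.append(best)
        for t in interests[u] & interests[w]:
            old, new = comp[t][u], comp[t][w]
            for v in comp[t]:
                if comp[t][v] == old:
                    comp[t][v] = new
    return edges

sub_a = ({"x1": {"a", "b"}, "x2": {"b"}}, [("x1", "x2")])
sub_b = ({"y1": {"a"}, "y2": {"a", "b"}}, [("y1", "y2")])
merged = star_merge([sub_a, sub_b])
print(merged)  # [('x1', 'x2'), ('y1', 'y2'), ('x1', 'y2')]
```

Restricting candidates to star nodes is what shrinks the search space: here only one cross edge is considered instead of the four that NM would examine.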

Example of SM

[Figure: example of SM joining sub-TCOs over nodes V0 to V14, whose interests are subsets of the topics {a,b,c,d}.]

Running time is largely improved because #stars << #nodes in most cases.

Divide and Conquer (DC) for MinAvg-TCO

• The number of nodes is a dominant factor for the running time of the GM algorithm.
• Divide-and-conquer
  – Divide the MinAvg-TCO problem into several sub-overlay construction problems
  – Conquer the sub-MinAvg-TCO problems independently and build sub-overlays into sub-TCOs
  – Combine these sub-TCOs into one TCO

Design of DC algorithm

• How to divide the node set V:
  – Node clustering vs. random partitioning
  – The number of partitions p

• The balance between conquer and combine
  – p = 1 (single partition): conquer only = GM
  – p = |V| (each node is a partition): combine only = GM

• How to decentralize DC:
  – Note that the DC algorithm as presented is fully centralized.
  – However, it is possible to decentralize it.

• Theoretical analysis: not straightforward.

Example of DC

[Figure: example of DC over nodes V0 to V14, whose interests are subsets of the topics {a,b,c,d}.]

- Divide the overlay based on V
- Conquer each sub-overlay into a sub-TCO by GM
- Combine the sub-TCOs into one TCO by SM
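The divide, conquer, and combine steps can be put together in one centralized sketch. It assumes random partitioning, the greedy set cover heuristic for star sets, and link contribution defined as the number of shared topics whose components an edge merges; the names `greedy_add`, `star_set`, and `divide_and_conquer` and the `seed` parameter are hypothetical.

```python
import random
from itertools import combinations

def greedy_add(interests, comp, candidates):
    """Repeatedly add the candidate edge that merges the most per-topic components."""
    added = []

    def gain(e):
        u, w = e
        return sum(1 for t in interests[u] & interests[w] if comp[t][u] != comp[t][w])

    while candidates:
        best = max(candidates, key=gain)
        if gain(best) == 0:
            break
        u, w = best
        added.append(best)
        for t in interests[u] & interests[w]:
            old, new = comp[t][u], comp[t][w]
            for v in comp[t]:
                if comp[t][v] == old:
                    comp[t][v] = new
    return added

def star_set(interests, nodes):
    """Greedy set cover restricted to one partition."""
    uncovered = set().union(*(interests[v] for v in nodes))
    star = []
    while uncovered:
        best = max(nodes, key=lambda v: len(interests[v] & uncovered))
        star.append(best)
        uncovered -= interests[best]
    return star

def divide_and_conquer(interests, p, seed=0):
    nodes = sorted(interests)
    random.Random(seed).shuffle(nodes)
    parts = [nodes[i::p] for i in range(p)]            # divide: random partitioning
    topics = set().union(*interests.values())
    comp = {t: {v: v for v in nodes if t in interests[v]} for t in topics}
    edges = []
    for part in parts:                                  # conquer: GM inside each partition
        edges += greedy_add(interests, comp, list(combinations(part, 2)))
    part_of = {v: i for i, part in enumerate(parts) for v in part}
    stars = [v for part in parts if part for v in star_set(interests, part)]
    cross = [(u, w) for u, w in combinations(stars, 2) if part_of[u] != part_of[w]]
    edges += greedy_add(interests, comp, cross)         # combine: SM over star nodes
    return edges

interests = {"v1": {"b", "c", "d"}, "v2": {"a"}, "v3": {"b", "d"},
             "v4": {"a", "b"}, "v5": {"a", "c"}}
print(len(divide_and_conquer(interests, p=1)))  # 5, same as GM on the whole node set
```

With p = 1 the combine step has nothing to do, and with p = |V| every node is its own star, so both extremes reduce to the plain greedy algorithm, matching the balance described for the design of DC.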


Experiment setting

• The number of nodes: |V| = 1000 by default, ranging from 1000 to 8000

• The number of topics: |T| = 100 by default, ranging from 100 to 1000

• The number of topics subscribed to by a node: NodeIntSize = 20 by default, ranging from 10 to 100

• Topic distribution: uniform, Zipf, exponential
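A workload with these parameters could be generated along the following lines. This is only a sketch: the slides do not describe how subscriptions were drawn, so the sampling procedure, the Zipf exponent of 1, the exponential decay base of 0.9, and the function name `generate_interests` are all assumptions.

```python
import random

def generate_interests(num_nodes=1000, num_topics=100, node_int_size=20,
                       distribution="zipf", seed=0):
    """Give every node node_int_size distinct topics drawn by topic popularity."""
    if node_int_size > num_topics:
        raise ValueError("node_int_size cannot exceed num_topics")
    rng = random.Random(seed)
    topics = list(range(num_topics))
    if distribution == "uniform":
        weights = [1.0] * num_topics
    elif distribution == "zipf":
        weights = [1.0 / (rank + 1) for rank in range(num_topics)]   # exponent 1: assumption
    elif distribution == "exponential":
        weights = [0.9 ** rank for rank in range(num_topics)]        # base 0.9: assumption
    else:
        raise ValueError(f"unknown distribution: {distribution}")
    interests = {}
    for v in range(num_nodes):
        chosen = set()
        while len(chosen) < node_int_size:   # rejection sampling until enough distinct topics
            chosen.add(rng.choices(topics, weights=weights)[0])
        interests[v] = chosen
    return interests

workload = generate_interests(num_nodes=50, num_topics=100, node_int_size=20)
print(len(workload), len(workload[0]))  # 50 20
```

The rejection loop is simple but slows down when NodeIntSize approaches |T| under a skewed distribution; a weighted sampling-without-replacement scheme would be a better fit for that corner of the parameter range.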


Experiment design

• Evaluation: average node degree, running time
  – Star Merge for MinAvg-TCO-Join
  – DC for MinAvg-TCO

• Random node partitioning

• The effects of the number of nodes

• The effects of the number of topics

• The effects of the average subscription size of a node

• Comparison with RingPT
  RingPT is an algorithm that mimics the common practice of building a separate overlay for each topic.
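A per-topic baseline of this kind can be sketched as below. The slides only say that RingPT builds a separate overlay for each topic; that each per-topic overlay is a ring (as the name suggests), that subscribers are ordered by name, and that an edge shared by several rings is counted once are assumptions of this sketch, as is the function name `ring_per_topic`.

```python
def ring_per_topic(interests):
    """Connect the subscribers of every topic in a ring and return the union of all ring edges."""
    edges = set()
    for topic in set().union(*interests.values()):
        members = sorted(v for v in interests if topic in interests[v])
        if len(members) < 2:
            continue  # a single subscriber needs no links
        for i, v in enumerate(members):
            w = members[(i + 1) % len(members)]
            edges.add((min(v, w), max(v, w)))  # normalize so shared edges are stored once
    return edges

interests = {"v1": {"b", "c", "d"}, "v2": {"a"}, "v3": {"b", "d"},
             "v4": {"a", "b"}, "v5": {"a", "c"}}
print(len(ring_per_topic(interests)))  # 7
```

On the five-node example this yields 7 edges, compared with the 5 edges of TCO1, which illustrates why building one overlay per topic tends to inflate node degree.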

Star Merge

[Figure: SM vs NM vs GM]

Divide-and-conquer

[Figure: The effect of the number of nodes]

Divide-and-conquer

[Figure: DC vs GM vs RingPT]

Algorithm summary

| Algorithm | Running time | Quality of overlay: #edges (avg node degree) | Required information | Potential to decentralize |
|---|---|---|---|---|
| RingPT | good | poor | full knowledge | good |
| GM | poor: $O(|V|^2 |T|)$ | good: $O(\log(|V||T|))$ | full knowledge | poor |
| NM | poor: 75% of GM | good | full knowledge | good |
| SM | good: 1.0% of GM | good: ≤ 0.15 compared to GM | partial knowledge | good |
| DC | good: 1.7% of GM | good: ≤ 2.12 compared to GM | partial knowledge | good |


Minimal Number of Links

• A typical pub/sub system combines a number of protocols, many of which maintain per-link state
  – A node must constantly monitor the availability of each of its neighbors (heartbeats and keep-alive state)
  – If the links are maintained using TCP, there is the cost of connection state for each link
  – The more links there are, the fewer topics can be routed over each individual link, thereby diminishing cross-topic aggregation benefits
  – If a sequential-diff-based compression scheme is used, there is an extra cost associated with a history table
