Sven Bittner and Annika Hinze

18 January 2006

Talk at the 29th Australasian Computer Science Conference (ACSC2006)

Pruning Subscriptions in Distributed Publish/Subscribe Systems

2/29

Motivation: Publish/Subscribe

• Subscribers register subscriptions
• Publishers send event messages
• System informs using notifications

[Figure labels: eBay, TradeMe, User, Distributed Pub/Sub System, pub(item, price, timeLeft, …), Subscription, pub(item, ...), Filtering, Notify about items of interest]

Annika Hinze – Pruning Subscriptions in Distributed Publish/Subscribe Systems

3/29

Motivation: Subscription Example

• A subscriber is interested in books whose title contains the phrase “Harry Potter”.
• According to the condition of the copy of the book (new, used), she wants to pay at most NZ$10.0 or NZ$15.0.
• To avoid unnecessary notifications, the subscriber will be notified not earlier than one day before the auction ends.

[Subscription tree]

title like “Harry Potter”
AND ((condition = NEW AND price < 15.0) OR (condition = USED AND price < 10.0))
AND endingWithin < 1 day
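The example subscription can be read as a Boolean filter over event messages. The following is a minimal sketch only: the event field names (`title`, `condition`, `price`, `ending_within_days`) are hypothetical, `like` is interpreted as substring containment, and the pairing of the price limits with the conditions (NEW with 15.0, USED with 10.0) is an assumption.

```python
def matches(event: dict) -> bool:
    """Return True if the event satisfies the example subscription.

    Field names are illustrative assumptions, not taken from the talk.
    """
    title_ok = "Harry Potter" in event["title"]
    # Assumed pairing: NEW copies up to 15.0, USED copies up to 10.0.
    new_ok = event["condition"] == "NEW" and event["price"] < 15.0
    used_ok = event["condition"] == "USED" and event["price"] < 10.0
    ending_ok = event["ending_within_days"] < 1.0
    return title_ok and (new_ok or used_ok) and ending_ok


if __name__ == "__main__":
    event = {
        "title": "Harry Potter and the Goblet of Fire",
        "condition": "NEW",
        "price": 12.0,
        "ending_within_days": 0.5,
    }
    print(matches(event))
```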

4/29

Motivation: Problem Sizes

• Online auctions
– Subscriptions: > 10^6 (no. of users)
– Events: > 20 / sec (new items and bids)
– Notifications: not time-critical, but events must be processed permanently

• Facility management
– Subscriptions: > 50,000 (today’s systems)
– Events: > 1,000 / sec (from sensors, switches)
– Notifications: delay < 0.1 sec

Time and space efficiency required

5/29

Structure

• Motivation
• Subscription Pruning
• Selectivity Estimation
• Evaluation of Subscription Pruning
• Summary and Outlook

7/29

Current Optimizations

• Work on conjunctive subscriptions only (−)
– Restricted subscription language
– Not applicable for general-purpose systems

• Strong assumptions (−)
– Similarities/relationships among subscriptions
– Evaluations too simplistic for high-level applications

We cannot generalize evaluation results

8/29

Subscription Generalization

• Routing optimization for arbitrary Boolean subscriptions (+)
• Optimizes subscriptions independently of each other (+)

Optimization potential regardless of individual and collective subscription structure

Favourable routing optimization for general-purpose pub/sub systems

9/29

Generalization by Pruning

• Goals of pruning
– Remove parts of subscription tree of non-local subscribers
– Create more general subscription

Fewer predicates to filter on (+)
Less complex subscriptions (+)
More events to filter (−)

More time and space efficient filtering

10/29

Application of Pruning (1)

• Forwarding of subscriptions for selective routing

Un-optimized routing:

[Figure: network of nodes, each labelled “Routing table”, and a node labelled “Subscriber”]

11/29

Application of Pruning (2)

• Forwarding of subscriptions for selective routing

[Figure: network of nodes, each labelled “Routing table”, and a node labelled “Subscriber”]

Less complex subscriptions
More time and space efficient filtering

12/29

Application of Pruning (3)

• Forwarding of subscriptions for selective routing

[Figure: network of nodes, each labelled “Routing table”, and a node labelled “Subscriber”]

But more general subscriptions
More forwarded event messages
More event messages to filter

13/29

Example of Pruning (1)

• Valid pruning - Remove child of conjunction

Original subscription:

title like “Harry Potter”
AND ((condition = NEW AND price < 15.0) OR (condition = USED AND price < 10.0))
AND endingWithin < 1 day

Pruned subscription (after removing unary operators):

title like “Harry Potter”
AND ((condition = NEW AND price < 15.0) OR condition = USED)
AND endingWithin < 1 day

14/29

Example of Pruning (2)

• Invalid pruning - Remove child of disjunction

Original subscription:

title like “Harry Potter”
AND ((condition = NEW AND price < 15.0) OR (condition = USED AND price < 10.0))
AND endingWithin < 1 day

Pruned subscription (after removing unary operators and summarizing consecutive operators):

title like “Harry Potter” AND endingWithin < 1 day AND condition = NEW AND price < 15.0

No filtering of used books anymore!
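The two pruning examples can be replayed on a small tree structure. This is a sketch under stated assumptions, not the authors' implementation: the classes `Pred` and `Node`, the helpers `prune` and `simplify`, the path-based addressing of children, and the event field names are all hypothetical. It shows that removing a child of a conjunction yields a more general subscription, while removing a child of a disjunction loses matches.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Pred:
    name: str
    test: Callable[[dict], bool]


@dataclass
class Node:
    op: str                    # "AND" or "OR"
    children: List["Tree"]


Tree = Union[Pred, Node]


def evaluate(t: Tree, event: dict) -> bool:
    if isinstance(t, Pred):
        return t.test(event)
    results = (evaluate(c, event) for c in t.children)
    return all(results) if t.op == "AND" else any(results)


def prune(t: Node, path: List[int]) -> Node:
    """Return a copy of t with the child addressed by path removed."""
    i, rest = path[0], path[1:]
    kids = list(t.children)
    if rest:
        kids[i] = prune(kids[i], rest)
    else:
        del kids[i]
    return Node(t.op, kids)


def simplify(t: Tree) -> Tree:
    """Remove unary operators and merge consecutive identical operators."""
    if isinstance(t, Pred):
        return t
    kids: List[Tree] = []
    for c in map(simplify, t.children):
        if isinstance(c, Node) and c.op == t.op:
            kids.extend(c.children)
        else:
            kids.append(c)
    return kids[0] if len(kids) == 1 else Node(t.op, kids)


# Example subscription; field names and price pairing are assumptions.
subscription = Node("AND", [
    Pred("title", lambda e: "Harry Potter" in e["title"]),
    Node("OR", [
        Node("AND", [Pred("new", lambda e: e["condition"] == "NEW"),
                     Pred("price15", lambda e: e["price"] < 15.0)]),
        Node("AND", [Pred("used", lambda e: e["condition"] == "USED"),
                     Pred("price10", lambda e: e["price"] < 10.0)]),
    ]),
    Pred("ending", lambda e: e["ending_within_days"] < 1.0),
])

valid = simplify(prune(subscription, [1, 1, 1]))   # drop price < 10.0
invalid = simplify(prune(subscription, [1, 1]))    # drop the USED branch
```

With these definitions, a used copy at 12.0 is matched by `valid` but not by `subscription` (extra traffic, no lost notification), whereas a used copy at 8.0 is matched by `subscription` but not by `invalid` (a lost notification).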

15/29

Challenges of Pruning

• Questions
1. Which subscription should be pruned first?
2. Which part of a subscription should be pruned?

• Answer
The subscription (1.) supporting a pruning (2.) that minimally influences the network traffic

Utilize selectivities of subscriptions to determine effects of pruning on network load

17/29

Selectivity of Subscriptions

• Calculation of selectivity for
– Original subscriptions - Counting
– Predicates - Counting/Approximation
– Pruned subscriptions - No suitable method

Estimate selectivities for subscriptions

18/29

Selectivity Estimation: Idea

• Using three easily computable estimates
– Minimal sel_min
  Worst case - smallest possible selectivity value for all distributions of events
– Average sel_avg
  Average case - assuming uniform distribution of all possible event messages and independent predicates in subscriptions
– Maximal sel_max
  Best case - largest possible selectivity value for all distributions of events

19/29

Selectivity Estimation: Example

• Selectivity of predicates via counting
• Selectivity of subscriptions via estimation

[Figure: subscription tree annotated with selectivities]

title like “Harry Potter”
AND ((condition = NEW AND price < 15.0) OR (condition = USED AND price < 10.0))
AND endingWithin < 1 day

Predicate selectivities shown: 0.2, 0.93, 0.9, 0.8, 0.01, 0.1
Estimated selectivities (minimal, average, maximal) shown: (0.7, 0.72, 0.8), (0.13, 0.19, 0.2), (0.7, 0.77, 1.0), (0.0, 7.7e-4, 0.01)
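The slides do not spell out how the three estimates are combined at the inner nodes. The sketch below assumes Fréchet-style bounds for the minimal and maximal values and independence for the average value; under that assumption the example numbers on the slide are reproduced up to rounding. The tuple-based tree encoding and the function name `estimate` are hypothetical, and only the grouping of the leaf values (0.2 with 0.93, 0.9 with 0.8) follows from the arithmetic; which predicate carries which value is not recoverable from the transcript.

```python
from math import prod


def estimate(node):
    """Return (sel_min, sel_avg, sel_max) for a subscription tree.

    A leaf is a float (predicate selectivity obtained by counting);
    an inner node is a tuple (op, children) with op "AND" or "OR".
    The combination rules are an assumption, not taken from the talk.
    """
    if isinstance(node, float):
        return (node, node, node)
    op, children = node
    mins, avgs, maxs = zip(*(estimate(c) for c in children))
    n = len(children)
    if op == "AND":
        return (max(0.0, sum(mins) - (n - 1)),   # least possible overlap
                prod(avgs),                      # independent predicates
                min(maxs))                       # most restrictive child
    if op == "OR":
        return (max(mins),                       # most permissive child
                1.0 - prod(1.0 - a for a in avgs),
                min(1.0, sum(maxs)))             # disjoint matches
    raise ValueError(f"unknown operator: {op}")


# Leaf values from the slide; their assignment to predicates is assumed.
tree = ("AND", [
    0.01,
    ("OR", [("AND", [0.2, 0.93]), ("AND", [0.9, 0.8])]),
    0.1,
])

if __name__ == "__main__":
    print(estimate(("AND", [0.2, 0.93])))
    print(estimate(("AND", [0.9, 0.8])))
    print(estimate(tree))
```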

20/29

Selectivity Degradation

• Absolute degradation when pruning s_x to s_y
– Describes expected influence on network load
– Maximal difference between three components

max(sel_min(s_y) - sel_min(s_x),
    sel_avg(s_y) - sel_avg(s_x),
    sel_max(s_y) - sel_max(s_x))
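The degradation measure is the largest of the three component-wise differences between the pruned and the original subscription. The sketch below computes it and then picks, among candidate prunings, the one with the smallest degradation, which is one possible reading of "minimally influences the network traffic". The function names, the candidate labels, and all numbers in the example are hypothetical.

```python
from typing import List, Tuple

Estimate = Tuple[float, float, float]   # (sel_min, sel_avg, sel_max)


def degradation(before: Estimate, after: Estimate) -> float:
    """Absolute degradation when pruning s_x (before) to s_y (after)."""
    return max(a - b for b, a in zip(before, after))


def pick_pruning(candidates: List[Tuple[str, Estimate, Estimate]]) -> str:
    """Return the label of the candidate with the smallest degradation.

    Each candidate is (label, estimate before pruning, estimate after).
    The greedy choice is an assumption about how the measure is used.
    """
    label, _, _ = min(candidates, key=lambda c: degradation(c[1], c[2]))
    return label


if __name__ == "__main__":
    s_x = (0.0, 0.00077, 0.01)          # illustrative original estimate
    candidates = [
        ("A", s_x, (0.0, 0.0008, 0.01)),   # made-up pruned estimates
        ("B", s_x, (0.0, 0.0077, 0.1)),
    ]
    print(pick_pruning(candidates))
```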

22/29

Experiments: Goal

• Evaluate general setting
– Real-world subscriptions
– Real-world attribute domains

• Initial set of experiments
– Evaluation of memory usage and real selectivity
– Real selectivity shows expected network load

23/29

Experiments: Setup

• E-commerce setting (online book auctions)
– Ten attributes, e.g., author, format and price

• Events
– Analysis of real-world distributions
– Average for 1,000,000 messages

• Subscriptions
– Three typical classes involving conjunctions and disjunctions
– 10,000 registered subscriptions

24/29

Experiments: Results (1)

• Setting involving all three subscription classes

[Figure: plot with labels “Cut-off point”, “Memory usage” and “Expected increase in network load”]

25/29

Experiments: Results (2)

• At cut-off point (Column 4)
– Slight increase in selectivity (Column 2)
– Strong reduction in memory usage (Column 3)

Subscription class   Increase in selectivity   Relief in memory   Cut-off point at percent of prunings
Class 1              0.009                     0.667              0.750
Class 2              0.012                     0.833              0.875
Class 3              0.016                     0.368              0.525
Class 1–3            0.026                     0.663              0.771

27/29

Summary (1)

• Motivation
– Publish/subscribe (pub/sub) systems
– Routing and optimizations in pub/sub

• Subscription pruning
– Drawbacks of current optimizations
– Prune/remove parts of subscription trees
– Pruning has to result in more general subscription

28/29

Summary (2)

• Selectivity estimation
– Three easily computable values
– Degradation measure: predicted influence of prunings

• Practical analysis
– Evaluation of real-world scenario
– Setting with all subscriptions
  • Space usage decreased by 66% of maximal reduction
  • Only slight increase in network load

29/29

Future Work

• Integrate pruning in pub/sub prototype

• Extended experiments
– Measure network load, throughput and memory usage
– Other real-world scenarios

Thank you for your attention!

Contact:

Annika Hinze
[email protected]@cs.waikato.ac.nz