Sven Bittner and Annika Hinze
18 January 2006
Talk at the 29th Australasian Computer Science Conference (ACSC2006)
Pruning Subscriptions in Distributed Publish/Subscribe Systems
2/29
Motivation: Publish/Subscribe
• Subscribers register subscriptions
• Publishers send event messages
• System informs using notifications
[Figure: EBay and TradeMe publish pub(item, price, timeLeft, …) to the distributed pub/sub system; the user registers a subscription, the system filters pub(item, ...) events and notifies the user about items of interest.]
Annika Hinze – Pruning Subscriptions in Distributed Publish/Subscribe Systems
3/29
Motivation: Subscription Example
• A subscriber is interested in books whose title contains the phrase “Harry Potter”.
• According to the condition of the copy of the book (new, used), she wants to pay at most NZ$10.0 or NZ$15.0.
• To avoid unnecessary notifications, the subscriber will be notified not earlier than one day before the auction ends.
Subscription tree:
AND
  title like “Harry Potter”
  OR
    AND
      condition = NEW
      price < 15.0
    AND
      condition = USED
      price < 10.0
  endingWithin < 1 day
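The subscription tree on this slide can be sketched as a nested structure that is evaluated against each event message. This is a minimal illustration, not the authors' implementation: the event field names (`title`, `condition`, `price`, `ending_within_days`) are hypothetical, `like` is read as substring containment, and the pairing of NEW with 15.0 and USED with 10.0 is inferred from the later pruning examples.

```python
def matches(node, event) -> bool:
    """Evaluate a Boolean subscription tree against one event message.

    A node is either a predicate (a callable taking the event) or a tuple
    ("AND" | "OR", child, child, ...).
    """
    if callable(node):
        return node(event)
    op, *children = node
    results = (matches(child, event) for child in children)
    return all(results) if op == "AND" else any(results)


# The example subscription (field names are assumptions for this sketch).
subscription = (
    "AND",
    lambda e: "Harry Potter" in e["title"],
    ("OR",
     ("AND", lambda e: e["condition"] == "NEW", lambda e: e["price"] < 15.0),
     ("AND", lambda e: e["condition"] == "USED", lambda e: e["price"] < 10.0)),
    lambda e: e["ending_within_days"] < 1,
)

new_copy = {
    "title": "Harry Potter and the Goblet of Fire",
    "condition": "NEW",
    "price": 12.5,
    "ending_within_days": 0.5,
}
used_copy = dict(new_copy, condition="USED")

print(matches(subscription, new_copy))   # True
print(matches(subscription, used_copy))  # False: 12.5 is not below 10.0
```

Keeping predicates as plain callables makes the tree easy to evaluate; a real system would more likely store predicates as data so that they can be indexed and forwarded between brokers.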
4/29
Motivation: Problem Sizes
• Online auctions
  – Subscriptions: > 10^6 (no. of users)
  – Events: > 20 / sec (new items and bids)
  – Notifications: not time-critical, but events must be processed permanently
• Facility management
  – Subscriptions: > 50,000 (today’s systems)
  – Events: > 1,000 / sec (from sensors, switches)
  – Notifications: delay < 0.1 sec
Time and space efficiency required
5/29
Structure
• Motivation
• Subscription Pruning
• Selectivity Estimation
• Evaluation of Subscription Pruning
• Summary and Outlook
7/29
Current Optimizations
• Work on conjunctive subscriptions only (−)
  Restricted subscription language
  Not applicable for general-purpose systems
• Strong assumptions (−)
  – Similarities/relationships among subscriptions
  – Evaluations too simplistic for high-level applications
We cannot generalize evaluation results
8/29
Subscription Generalization
• Routing optimization for arbitrary Boolean subscriptions (+)
• Optimizes subscriptions independently of each other (+)
Optimization potential regardless of individual and collective subscription structure
Favourable routing optimization for general-purpose pub/sub systems
9/29
Generalization by Pruning
• Goals of pruning
  – Remove parts of subscription tree of non-local subscribers
  – Create more general subscription
Fewer predicates to filter on (+)
Less complex subscriptions (+)
More events to filter (−)
More time and space efficient filtering
10/29
Application of Pruning (1)
• Forwarding of subscriptions for selective routing
[Figure: un-optimized routing; a network of brokers with routing tables and a subscriber.]
11/29
Application of Pruning (2)
• Forwarding of subscriptions for selective routing
[Figure: a network of brokers with routing tables and a subscriber.]
Less complex subscriptions
More time and space efficient filtering
12/29
Application of Pruning (3)
• Forwarding of subscriptions for selective routing
[Figure: a network of brokers with routing tables and a subscriber.]
But more general subscriptions
More forwarded event messages
More event messages to filter
13/29
Example of Pruning (1)
• Valid pruning - Remove child of conjunction
Original subscription:
AND
  title like “Harry Potter”
  OR
    AND
      condition = NEW
      price < 15.0
    AND
      condition = USED
      price < 10.0
  endingWithin < 1 day
After removing price < 10.0 (remove unary operators):
AND
  title like “Harry Potter”
  OR
    AND
      condition = NEW
      price < 15.0
    condition = USED
  endingWithin < 1 day
14/29
Example of Pruning (2)
• Invalid pruning - Remove child of disjunction
Original subscription:
AND
  title like “Harry Potter”
  OR
    AND
      condition = NEW
      price < 15.0
    AND
      condition = USED
      price < 10.0
  endingWithin < 1 day
After removing the conjunction on condition = USED (remove unary/summarize consecutive operators):
AND
  title like “Harry Potter”
  condition = NEW
  price < 15.0
  endingWithin < 1 day
No filtering of used books anymore!
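The two pruning examples can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' algorithm: predicates are plain strings, inner nodes are tuples `("AND" | "OR", child, ...)`, and the helper names `prune` and `simplify` as well as the child-index `path` argument are hypothetical. The sketch only accepts the removal of a child of a conjunction and rejects the removal of a child of a disjunction, since the latter would stop matching events the subscriber asked for.

```python
def simplify(node):
    """Remove unary operators and summarize consecutive operators of one kind."""
    if isinstance(node, str):
        return node
    op, *children = node
    flat = []
    for child in map(simplify, children):
        if not isinstance(child, str) and child[0] == op:
            flat.extend(child[1:])   # merge AND-in-AND or OR-in-OR
        else:
            flat.append(child)
    return flat[0] if len(flat) == 1 else (op, *flat)


def prune(node, path):
    """Remove the subtree reached by `path`, a list of 0-based child indices."""
    op, *children = node
    head, *rest = path
    if rest:
        children[head] = prune(children[head], rest)
    else:
        if op != "AND":
            raise ValueError("removing a child of a disjunction is not a generalization")
        del children[head]
    return (op, *children)


original = (
    "AND",
    'title like "Harry Potter"',
    ("OR",
     ("AND", "condition = NEW", "price < 15.0"),
     ("AND", "condition = USED", "price < 10.0")),
    "endingWithin < 1 day",
)

# Valid pruning: drop "price < 10.0" from the conjunction for used books.
pruned = simplify(prune(original, [1, 1, 1]))
print(pruned)

# Invalid pruning: dropping the whole used-book branch of the disjunction.
try:
    prune(original, [1, 1])
except ValueError as err:
    print("rejected:", err)
```

Because the trees here contain only AND and OR, dropping a conjunct can only make a subscription match more events, which is why the check on the parent operator is enough for this sketch.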
15/29
Challenges of Pruning
• Questions
  1. Which subscription should be pruned first?
  2. Which part of a subscription should be pruned?
• Answer
  The subscription (1.) supporting a pruning (2.) that minimally influences the network traffic
Utilize selectivities of subscriptions to determine effects of pruning on network load
16/29
Structure
• Motivation
• Subscription Pruning
• Selectivity Estimation
• Evaluation of Subscription Pruning
• Summary and Outlook
17/29
Selectivity of Subscriptions
• Calculation of selectivity for
  – Original subscriptions - Counting
  – Predicates - Counting/Approximation
  – Pruned subscriptions - No suitable method
Estimate selectivities for subscriptions
18/29
Selectivity Estimation: Idea
• Using three easily computable estimates
  – Minimal $sel_{min}$
    Worst case - smallest possible selectivity value for all distributions of events
  – Average $sel_{avg}$
    Average case - assuming uniform distribution of all possible event messages and independent predicates in subscriptions
  – Maximal $sel_{max}$
    Best case - largest possible selectivity value for all distributions of events
19/29
Selectivity Estimation: Example
• Selectivity of predicates via counting
• Selectivity of subscriptions via estimation
[Figure: the example subscription tree annotated with selectivities. Predicate selectivities: 0.2, 0.93, 0.9 and 0.8 for the condition and price predicates; 0.01 and 0.1 for the title and endingWithin predicates. Estimates ($sel_{min}$, $sel_{avg}$, $sel_{max}$): (0.7, 0.72, 0.8) and (0.13, 0.19, 0.2) for the two inner conjunctions, (0.7, 0.77, 1.0) for the disjunction, and (0.0, 7.7e-4, 0.01) for the whole subscription.]
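The slide does not spell out how the three estimates are combined at AND and OR nodes. The sketch below assumes standard probability bounds for the minimal and maximal values and independence for the average value; these rules are an assumption of this example, chosen because they reproduce the triples shown on the slide. The grouping of the four leaf values into two conjunctions is likewise inferred from those triples.

```python
from math import prod


def estimate(node):
    """Return (sel_min, sel_avg, sel_max) for a subscription tree.

    A leaf is a number: the predicate selectivity obtained by counting.
    An inner node is a tuple ("AND" | "OR", child, child, ...).
    """
    if isinstance(node, (int, float)):
        return (node, node, node)
    op, *children = node
    mins, avgs, maxs = zip(*map(estimate, children))
    if op == "AND":
        return (
            max(0.0, sum(mins) - (len(children) - 1)),  # least possible overlap
            prod(avgs),                                 # independent children
            min(maxs),                                  # full overlap
        )
    return (
        max(mins),                                      # full overlap
        1.0 - prod(1.0 - a for a in avgs),              # independent children
        min(1.0, sum(maxs)),                            # disjoint matches
    )


conj_a = ("AND", 0.9, 0.8)
conj_b = ("AND", 0.2, 0.93)
disjunction = ("OR", conj_a, conj_b)
tree = ("AND", 0.01, disjunction, 0.1)

for label, node in [("conj_a", conj_a), ("conj_b", conj_b),
                    ("disjunction", disjunction), ("subscription", tree)]:
    print(label, tuple(round(v, 5) for v in estimate(node)))
# The last line is close to the slide's (0.0, 7.7e-4, 0.01).
```

All three values are computed in one bottom-up pass over the tree, which fits the slide's description of the estimates as easily computable.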
20/29
Selectivity Degradation
• Absolute degradation when pruning $s_x$ to $s_y$
  – Describes expected influence on network load
  – Maximal difference between three components
$$\max\bigl(sel_{min}(s_y) - sel_{min}(s_x),\; sel_{avg}(s_y) - sel_{avg}(s_x),\; sel_{max}(s_y) - sel_{max}(s_x)\bigr)$$
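The degradation measure translates directly into code. In the sketch below, the `before` triple is the estimate for the whole example subscription from the previous slide, while the two `after` triples and the candidate labels are hypothetical values made up for illustration. The helper `least_harmful` is also an assumption of this example: it shows one way to use the measure to answer the question of which pruning to apply first, namely by picking the candidate with the smallest degradation.

```python
def degradation(sel_x, sel_y):
    """Absolute degradation when pruning s_x to s_y.

    Both arguments are (sel_min, sel_avg, sel_max) triples; the result is
    the maximal difference between corresponding components.
    """
    return max(y - x for x, y in zip(sel_x, sel_y))


def least_harmful(candidates):
    """Return the label of the candidate pruning with the smallest degradation.

    `candidates` maps a label to a (before, after) pair of triples.
    """
    return min(candidates, key=lambda label: degradation(*candidates[label]))


before = (0.0, 7.7e-4, 0.01)      # estimate for the unpruned subscription
after_a = (0.0, 9.0e-4, 0.01)     # hypothetical result of pruning A
after_b = (0.0, 7.7e-3, 0.1)      # hypothetical result of pruning B

candidates = {
    "pruning A": (before, after_a),
    "pruning B": (before, after_b),
}
print(least_harmful(candidates))  # pruning A
```

Taking the maximum over the three components is conservative: a pruning counts as harmful as soon as any one of the minimal, average, or maximal estimates grows noticeably.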
21/29
Structure
• Motivation
• Subscription Pruning
• Selectivity Estimation
• Evaluation of Subscription Pruning
• Summary and Outlook
22/29
Experiments: Goal
• Evaluate general setting
  – Real-world subscriptions
  – Real-world attribute domains
• Initial set of experiments
  – Evaluation of memory usage and real selectivity
  – Real selectivity shows expected network load
23/29
Experiments: Setup
• E-commerce setting (online book auctions)
  – Ten attributes, e.g., author, format and price
• Events
  – Analysis of real-world distributions
  – Average for 1,000,000 messages
• Subscriptions
  – Three typical classes involving conjunctions and disjunctions
  – 10,000 registered subscriptions
24/29
Experiments: Results (1)
• Setting involving all three subscription classes
[Figure: memory usage and expected increase in network load, with the cut-off point marked.]
25/29
Experiments: Results (2)
• At cut-off point (Column 4)
  – Slight increase in selectivity (Column 2)
  – Strong reduction in memory usage (Column 3)

Subscription class | Increase in selectivity | Relief in memory | Cut-off point at percent of prunings
Class 1            | 0.009                   | 0.667            | 0.750
Class 2            | 0.012                   | 0.833            | 0.875
Class 3            | 0.016                   | 0.368            | 0.525
Class 1–3          | 0.026                   | 0.663            | 0.771
26/29
Structure
• Motivation
• Subscription Pruning
• Selectivity Estimation
• Evaluation of Subscription Pruning
• Summary and Outlook
27/29
Summary (1)
• Motivation
  – Publish/subscribe (pub/sub) systems
  – Routing and optimizations in pub/sub
• Subscription pruning
  – Drawbacks of current optimizations
  – Prune/remove parts of subscription trees
  – Pruning has to result in more general subscription
28/29
Summary (2)
• Selectivity estimation
  – Three easily computable values
  – Degradation measure: predicted influence of prunings
• Practical analysis
  – Evaluation of real-world scenario
  – Setting with all subscriptions
    • Space usage decreased by 66% of maximal reduction
    • Only slight increase in network load
29/29
Future Work
• Integrate pruning in pub/sub prototype
• Extended experiments
  – Measure network load, throughput and memory usage
  – Other real-world scenarios
Thank you for your attention!
Contact:
Annika Hinze
[email protected] (cs.waikato.ac.nz)