combining proactive and retroactive processing for...

Combining Proactive and Retroactive Processingfor Distributed Complex Event Detection

Mert AkdereDepartment of Computer Science

Brown [email protected]

AbstractComplex Event Detection (CED) is a key capability

for many monitoring applications such as intrusion detec-tion, sensor-based activity/phenomenon tracking, and net-work/infrastructure monitoring. Existing CED solutionscommonly assume centralized availability and proactive pro-cessing of all relevant events, and thus incur significant over-head in distributed settings. In this paper, we present andevaluate efficient distributed CED techniques that reduceevent detection and transmission costs through a combina-tion of proactive and retroactive processing strategies. Thekey idea is to generate CED plans that leverage the temporaland spatial windowing constraints associated with complexevents to determine a multi-step acquisition order of con-stituent events that minimizes expected communication costswhile meeting application-defined latency bounds for eventdetection. We demonstrate the utility of the proposed tech-nique using extensive experimentation on a variety of work-load scenarios.

1. IntroductionIn this paper, we study the problem of complex event de-

tection (CED) in a distributed monitoring environment thatconsists of potentially a large number of distributed eventsources (such as hardware sensors or software receptors).CED is becoming a fundamental capability in many do-mains including network and software infrastructure secu-rity (e.g., denial of service attacks and intrusion detection),phenomenon and activity tracking (e.g., fire detection, stormdetection, tracking suspicious behavior in an airport). Itisoften the case that such sophisticated (or “complex”) eventscannot be detected by individual sources at a single time andlocation: complex events usually take place over a period oftime and region, thus require consolidation of many “simple”events from multiple sources.

The traditional means for CED (as exemplified in streamprocessing systems and traditional databases) is based on acentralized, push-based processing model. Sources generatesimple events, which are continually pushed to a base wherethe registered complex events are evaluated in the form ofcontinuous queries or triggers. This exclusively push-based,“proactive” model of processing is inefficient in distributed

environments as it leads to significant communication over-head that may deplete batteries or hog network pipes (espe-cially considering the fact that while many complex eventsare rare, some of the constituents elements may be generatedrelatively frequently).

Before we introduce our approach, we first make two ob-servations. (1)Local storage: Event sources usually havestorage capabilities (albeit limited) that enable them to keepsome of their data for short-medium periods of time. Clearly,the available storage capacity depends on the hardware plat-form, but even with tiny devices, storage is fast becoming anon-issue due to the advances in flash- and similar technolo-gies. (2)Delay tolerance: While timely detection of eventsis critical, applications often have varying timeliness require-ments. For example, fire or storm detection exhibits muchhigher tolerance to delay than network intrusion.

The key topic of this paper is an approach forcommunication-efficient complex event detection that lever-ages these two observations. Given a complex event, weproactively monitor only a subset of the simpler elements asthe first step, and only if they occur, we then “retroactively”check for the existence of others at the appropriate sourcesas the consequent step and iterate the algorithm. As such,our hybrid proactive-reactive algorithm generates a multi-step plan of event acquisition where rarer events are checkedbefore more frequent events, thus in many cases eliminatingthe need for communicating the latter. To make this approachwork, sources use their local storage to store their events fora pre-determined duration of time in case they need to beretroactively consulted. As each step in the algorithm intro-duces an additional delay, the algorithm also limits the num-ber of steps based on application-specified per-event latencybounds.

In addition to our basic CED algorithm, the other contri-butions of the paper are as follows:

• A simple but expressive set of event composition oper-ators decorated with time and space constraints (includ-ing usage examples).

• An extension that leverages temporal and spatial con-straints (when available and applicable) to further re-duce event transmissions.

• An extension that leverages shared sub-events that are

1

common to multiple complex events.• Extensive experimentation that characterizes and quan-

tifies the behavior and benefits of the algorithm and itsextensions on a variety of workloads.

The rest of the paper is structured as follows. An overviewof the system and its functionality is provided in Section 2.In Section 3, we present our event language together with us-age examples. Then, we describe our multi-step approach toevent detection that uses a cost model based on event occur-rence probabilities for estimating monitoring costs in Sec-tion 4. We provide experimental results in Section 5. Re-lated work is covered in Section 6 and Section 7 concludesthe paper.

2. System Overview

We present a complex event detection framework for dis-tributed monitoring applications. Our framework uses aplan-based approach to complex event detection and utilizesprobabilistic models of event occurrences in finding networkefficient event detection plans. Using this approach our sys-tem incurs low network cost during times of inactivity and isable to detect complex events quickly within user specifieddeadlines once they occur.

2.1 Complex Event Model

Events are defined as activities of interest in a system [8].Detection of a person in a room, the firing of a cpu timer, anda denial of service attack in a network are example eventsfrom various application domains. All events signify certainactivities, however their complexity degrees can be signifi-cantly different. For instance, the firing of a timer is instan-taneous and simple to detect whereas detection of a denial ofservice attack is an involved process that requires computa-tion over many simpler events. Correspondingly, events arecategorized as complex and primitive forming a hierarchy ofevents.

At the base of the hierarchy are the primitive events, de-picted bybottom layerevents in Figure 1. Primitive eventsare defined as atomic occurrences of interest in a system. Forexample, a temperature reading in a sensor network and de-tection of a book in an RFID enabled library are examplesof primitive events. Complex events form the upper levels ofthe hierarchy. They are built on top of simpler events, eitherprimitive or complex, using our event specification languagedefined in section 3. Middle and top layers in Figure 1 rep-resent the complex event layers.

All events are assigned a time interval that indicates theiroccurrence intervals. For primitive events, the time intervalrepresents a single timepoint where the event occurs. Forcomplex events, the assigned intervals contain the time in-tervals of all subevents. Hence, we use interval based se-mantics instead of timepoints. The reason is that intervalsemantics better represent the underlying structure and also

Top Layer

Middle Layer

Bottom Layer

Figure 1. Illustrating Event Hierarchies: Complex events mapto simpler events whereas primitive events lie at the bottom ofthe hierarchy.

solve certain problems that arise with timepoints. As an ex-ample, consider a complex event defined as the sequence ofeventsa and b (see Figure 2). If timepoint based seman-tics are used then we only know the endpoints of events andwould therefore detect the sequence complex event in Fig-ure 2 since b happens after a. On the other hand, if intervalsemantics are used then the start times indicate that b actu-ally started before a occurred which prevents the detectionofthe complex event. This is the required semantics if causalrelations between events are to be observed. This issue isfurther discussed in [7].

1 2 3 4 5

sequence

a b

a

b

Figure 2. Point based semantics cause incorrect detection of thecomplex eventa sequence b. With interval semantics, the com-plex event is not detected since eventb starts beforea occurs.

2.2 System ArchitectureThe main components in our system are the information

sources and the base node (see Figure 2.2). The informa-tion sources, which in a broad sense we refer to assensors,are the entry points of information into the system. For in-stance, routers and firewalls in a network monitoring applica-tion, and a wireless temperature sensor in a disaster monitor-ing application are example information sources. In additionto gathering information, sensors also take part in low levelprocessing of information. The processing done by sensorsinclude the generation of primitive events, the simplest oper-ational units in the system. Finally, we assume that sensorshave data logging capabilities. These data logs provide uswith the ability to reach historical data as well as current datawhich is crucial for retrospective event detection.

Base station is the central component of the system thatplans and executes the complex event detection. It generatesevent detection plans based on the hierarchical structure ofcomplex events, chooses a plan to execute using the infor-mation from cost model and coordinates the execution of thechosen plan among the sensors. For this reason, the base

2

Figure 3. Complex event detection framework: The base node plans and coordinates the event detection using low network cost eventdetection plans formed by utilizing event statistics. The event detection model is an event detection graph generated from the givenevent specifications. Information sources feed the system with primitive events and can operate both in pull and push based modes.

station is provided with the ability to manage the sensors.This ability is significant since sensors transmit the detectedevents on demand from the base station. Therefore our sys-tem combines the pull and push paradigms of data collectionto avoid the disadvantages of a push-based centralized sys-tem. We try to reduce the network traffic towards base stationby carefully choosing which sensors will transmit data basedon the information we have about frequency and constraints(such as spatial) of event types.

Our event detection model is based on an event detec-tion graph constructed from the user given event specifica-tions expressed in our language. For every event expressionwe construct an event detection tree and these event treesare then merged to form the event detection graph. Com-mon events in different event trees, the shared events, aremerged to form nodes with multiple parents. Nodes in anevent detection graph are either operator nodes or primitiveevent nodes. All the non-leaf nodes are operator nodes whichexecute the event language operators on their inputs. The in-puts to operator nodes are events (either complex or prim-itive) coming from their child nodes and their outputs arecomplex events. The leaf nodes in the graph are primitiveevent nodes. A primitive event node exists for each primi-tive event type and stores references to the instances of thatprimitive event type.

3. Complex Event Language

In this section, we describe a SQL-like declarative eventspecification language that is simple yet expressive enoughfor a variety of monitoring applications we have considered.Tha language includes event operators to express event cor-relations, similar to the specification of triggers in activedatabase systems, and also contains other features such asthe time windows from stream processing systems. At thispoint, we would like to emphasize that our main contribu-tion is not the language itself but our efficient complex eventdetection techniques which operate based on the event spec-ifications.

All the basic information in the system comes from theavailable information sources. The types and capabilitiesofinformation sources (referred to as sensors hereafter) dependon the application environment and could range from wire-less cameras in a visual sensor network to logs of a webserver. An output specification of each sensor type is nec-essary for the low level sensor information to be transformedinto primitive events. More specifically, sensor types needto be introduced into the system through our event languagewith a name and a schema describing their attributes.

Every event type (primitive or complex) is associatedwith a set of attributes, forming its schema, in its declara-tion. Certain attributes such as location and event identi-fier are required for all events. Those attributes,eventid,loc, start time, end time and nodeid, form a base schemathat must be extended by the schemas of every event type.eventid is an identifier assigned to every event instance. Itcan be made unique for every event instance or can be set to afunction of event attributes forsimilar event instances to getthe same id. For example, in an RFID enabled library appli-cation a book might be detected by multiple RFID receiversat the same time. Such readings can be discarded if they areassigned the same event identifier.loc attribute is for storingthe location of the event.start time andend time representthe time interval of the event and are assigned by the sys-tem based on the event operator semantics explained in Sec-tion 3.3. The last attribute,nodeid is the id of the node thatgenerated the event. All base schema attributes will be im-plicitly defined unless they are explicitly specified. Finally,a reserved but nonmandatory attribute, namedmaxlatency, isused to specify the latency deadline for an event type. Whena latency deadline is specified, the system will only considerthe plans satisfying the latency requirement.

3.1 Primitive Event Declaration

Primitive events, the simplest units in the event hierar-chy, are formed by annotating sensor readings with metadata.Primitive event declarations specify the details of the trans-formation from sensor readings into primitive events. The

3

syntax for primitive event declaration is given in Figure 4.

primitive nameon sensorlist

schema attribute list

Figure 4. Primitive events are defined using sensor information.

Each primitive event is assigned a unique name using thenamesymbol. The set of sensors used in a primitive event islisted in thesensorlist. Multiple sensors may be used in thislist given that they lie on the same platform. We provide thepseudo-sensornodewhich enables access to context infor-mation such as the location of the sensor node and the cur-rent value of node clock.schemasection is used to expressthe attributes of the primitive event type and the way they areassigned values. The attributes listed in the schema must bea super set of the base schema. An example primitive event,expressing a person detection, is given in Figure 5 togetherwith the declaration of apersondetectorsensor (such as aface detection algorithm running on a camera).

sensor persondetectorschema int id, double locx, double locy

primitive persondetectedon persondetector as PD, node

schema eventid as hash(persondetected, node.id, node.time, PD.id),loc as [ PD.locx, PD.locy ],personid as PD.id

Figure 5. The person detected primitive event is defined usingthe person detector and node sensors.

3.2 Complex Event Declaration

Complex events are specified on simpler subevents usingthe SQL-like template shown in Figure 6. Subevents of acomplex event type, which can be previously specified com-plex or primitive events, are listed in thesourcelist. Thesource list may contain thenodepseudo-sensor as well.

complex nameon sourcelist

schema attribute listwhere constraintlist

Figure 6. Complex events are specified using simpler events onwhich spatial, temporal or attribute-based constraints can alsobe imposed.

The attribute list contains the attributes of a complexevent type which together form a super set of the baseschema and also describes the way they are assigned values.In this sense, the schema section specifies the transformationfrom subevents to complex events. Constraints of a complexevent type are specified in theconstraintlist. We discussconstraint specification in more detail in Section 3.3.

3.3 Constraint Specification

In most applications, users will be interested in complexevents which impose constraints on their subevents. For in-stance, users may want to monitor events occurring in nearby

locations or same time intervals. In order to support suchconstraints, our system allows temporal, spatial, attribute-based and existential constraints to be specified in the whereclause of a complex event specification.

We borrowed event operators from active database re-search for easy specification of temporal correlations be-tween subevents which could otherwise be expressed as aset of attribute constraints on start and end times. Our eventoperators,and, or andseq, are alln-ary operators. We alsoextended the event operators with time windows for tempo-ral constraint specification. The time window argument,w,of an event operator specifies the maximum time betweenany two subevents of a complex event instance. Hence, allthe subevents are separated by at mostw time units. For-mal semantics of our operators are provided below where wedenote subevents withe1, e2, . . . , en and the start and endtimes of the output complex event witht1 andt2.

And operator: and(e1, e2, . . . , en; w)The and operator outputs a complex event witht1 =mini∈{1,..,n}(ei.start time), t2 = maxi∈{1,..,n}(ei.end time)if maxi,j∈{1,..,n}(ei.end time − ej .end time) <= w.

Sequence operator: seq(e1, e2, . . . , en; w)The seq operator outputs a complex event witht1 =e1.start time, t2 = en.end time if (a) ei.end time <

ei+1.start time for i = 1, . . . , n − 1, (b) en.end time −

e1.end time ≤ w. Hence,seqis a restricted form ofand whereoverlapping is not allowed and events need to occur in order.

Or operator: or(e1, e2, . . . , en)

Theor operator outputs a complex event whenever a subevent oc-curs. t1 andt2 are set to start and end times of the subevent. Ob-serve thator operator does not take a window argument.

Parametrized attribute-based constraints between eventsand value-based comparison constraints can be specified inthe where clause as well. Spatial constraints may be spec-ified in the where clause usingloc attribute of events andspatial functions such asdistance(loc x, loc y). Moreover,spatial regions can be defined in the system and constraintscan then be expressed using them. For instance, a region Rcan be expressed as a bounding box, and then the location ofan event can be required to be in the region withloc in R.

Nonexistence (negation) constraints can be specified us-ing the not exists (subquery)SQL construct. Subquery isspecified as aselect-from-whereclause wherefromsection isused to specify the subevent list and constraints are specifiedin thewhereclause. We illustrate the use of the constraintsthrough the unattended bag event given in Figure 7.

complex unattendedbagon BagDetected B, node

schema eventid as hash(unattendedbag, node.id, node.time, B.bagid),loc as B.loc,bagid as B.bagid

where not exists ( select * from persondetected Pwhere and(P,B;120) and distance(P.loc, B.loc)< 3 )

Figure 7. Unattended bag complex event specifies a bag as unat-tended when no person is detected 3m around it for 120 seconds.

4

4. Complex Event Detection Framework

A naive approach to event detection would be to con-stantly send all the events to base station where they wouldbe processed as soon as possible. However, this push-basedcentralized system would create a permanent hot spot loca-tion at the base station even at moderate incoming data rates.The described push-based data collection paradigm is com-mon in continuous query processing systems [10, 11] wherethe global view of data is important. However, event detec-tion systems only need the fraction of the data that generatesevents in the system. Therefore, continuous data collectioncan generally be avoided without missing event detectionsgiven that not all events cause complex events.

We construct event detection plans which specify efficientevent detection strategies to avoid continuous data collection.The simplest event detection plan consists of a single step inwhich all subevents are simultaneously monitored (i.e. thenaive plan). More complex plans have up ton (the number ofsubevents) steps in each of which a subset of the subeventsare monitored. The number of detection plans for a com-plex event with n subevents (primitive) is exponential in n asgiven by the recursive relationT (n) =

∑n

i=1

(

ni

)

T (n − i)where we defineT (0) to be1.

We design a cost model based on event occurrence prob-abilities to calculate the expected costs of event detectionplans. We define the expected cost of a plan as the expectednumber of events it sends to base per time unit. For exam-ple, the cost of the naive plan for detecting the complex eventand(e1, e2) would be the sum of unit costs ofe1 ande2. Onthe other hand, a two step plan, first monitoringe1 and look-ing up e2 whene1 occurs, could cost less but would incurhigher latency. Hence, one of the main goals of our systemis to try to find low network cost event detection plans meet-ing latency deadlines.

Latency of an event detection represents the time betweenthe occurrence of the event and its detection by the system.Event detection latencies are based on network latencies. Inour calculations, we do not consider the processing time orcost at the base station. However, since our system decreasesthe number of events sent to base, both the processing timeand cost should be reduced as well.

4.1 Event Detection Plans

Event detection plans specify monitoring orders for thesubevents of complex events. We represent the plans withfinite state machines in our system. Consider the complexeventand(e1, e2, e3;w) wheree1, e2, e3 are primitive eventsandw is the window size. State machines of the plans forthis complex event have at mostn = 3 states except the finalstate in each of which a subset of primitive events is moni-tored. One state machine of each size is given in Figure 8.For instance, the3-step monitoring plan: “(1) continuouslymonitore1, (2) one1 lookupe2, (3) one1 ande2 lookupe3”,

is illustrated in Figure 8c where the notatione1 → e2 → e3

is used to denote this plan.The finite state machines we use for representing plans are

nondeterministic(NFA) since they can have multiple activestates at a time. Every active state corresponds to a partialdetection of the complex event. For example, in stateSe1

of the plan given in Figure 8c, there can be active instancesof e1 primitive events waiting fore2 primitive events. Thenwhen an instance ofe2 is detected, in addition to the transi-tion to next state, a self transition will also occur so that an e1

instance can match multiple instances ofe2 (self-transitionsare not shown in the figure). Unlike the always active initialstate, intermediate states are active only as long as the eventwindow constraints allow.

start

startstart

(a) The naive plan:

(e1, e2)

(c) Plan e1 → e2 → e3:

(b) Plan e1 → e2, e3:

(e1)

(e1)

(e1, e2, e3)

Se1,e2Se1

Se1

(e1, e2, e3)

w of e1 w of e1, e2

e3 withine2 withine1

e1, e2, e3 e1

(e1, e2, e3)

e2, e3 withinw of e1

Figure 8. Event detection plans represented as finite state ma-chines (FSMs)

In Section 4.1.1, we describe the plan generation processwith the goal of optimizing the overall event detection cost.First, operator wise plan generation is explained where eachoperator forms a set of plans with different cost and latencycharacteristics as no single plan can be chosen that will guar-antee global minimum cost in advance (this will be explainedin more detail in the next section). Then, we describe howthese plans are used in the global optimization of all eventoperators forming the event detection graph.

4.1.1 Plan Generation

In generating the plans for each operator, enumeration ofthe plan space is not a viable option since its size is exponen-tial in the number of subevents as mentioned before. To ad-dress this issue we have come up with the following heuris-tics that together form a representative subset of all planswith distinct cost and latency characteristics:

Forward Stepwise Plan Generation: This heuristicstarts with the minimum latency plan (the naive plan with theminimum latency plan selected for each complex subevent)and repeatedly alters it to form lower cost plans until latencyconstraint is exceeded or no more alterations are possible.Ateach iteration, the current plan is transformed into a lowercost plan either by moving a subevent detection to a laterstate or changing the plan of a subevent with a cheaper plan.

Backward Stepwise Plan Generation: This heuristicstarts by finding the minimum cost plan (an n-step plan withthe minimum cost plans selected for each complex subevent).

5

It can be found in a greedy way when all subevents areprimitive, otherwise a nonexact greedy solution which or-ders the subevents in increasingcost × probability ordercan be used. At each iteration the plan is repeatedly trans-formed into lower latency plans either by moving an event toan earlier state or changing the plan of an event with a lowerlatency plan until no more alterations are possible.

Observe that the first heuristic starts with a single stateFSM and extends it in successive iterations whereas the sec-ond one shrinks down the initial n-state FSM. Moreover,both heuristics choose the move with the highest cost-latencygain at each iteration and both end in a finite number of it-erations since every move results in a better plan (lower costfor the first one and lower latency for the second one). Whilethe first heuristic aims to form low latency plans with reason-able costs, the other one tries to form low cost plans meetinglatency requirements.

All the plans are then merged into a feasible (i.e. meet-ing latency requirements) plan set. During the merge onlypareto optimalplans are kept. Pareto optimal plans are theplans for which there exist no other plan we can use to ei-ther reduce the cost or latency without increasing the other.Moreover, only a limited number of pareto optimal plans canbe stored by the operator node for use in the global optimiza-tion process (explained later in this section). In such a case,the choice is made so that plans with low latency, low costand low latency-cost (a linear combination of the two fac-tors) are equally represented.

We described the plan generation process for the caseswhere all the subevents of an operator node are primitiveevents. However, when complex subevents exist, the plangeneration becomes a hierarchical process where the plansfor the upper level nodes are built on the plans of the lowerlevel nodes. Hence, plan generation is a bottom-up processin which the plans of lower level nodes are generated first.

As mentioned before, choosing only the minimum latencyor cost plan at each node does not guarantee overall optimalsolutions since (a) a lower cost but higher latency plan maybe useful to reduce overall cost (e.g. when there are otherevents with higher latency plans such that the overall latencyis not increased when a higher latency plan is used for thisevent) and (b) a lower latency but higher cost plan may re-duce overall cost (because an event with a high cost plan maythen switch to a lower cost plan with higher latency). For thisreason, each node creates a set of plans with different latencyand cost characteristics in the plan generation process. How-ever, only a subset of these plans can be passed on to upperlevel nodes due to computational complexity. The size of thissubset is a parameter trading computation with the exploredplan space size. This process continues up to the root nodes,each of which then selects the minimum cost plan meetingits latency requirements. This in turn finalizes the genera-tion of the event detection plans to be used by all nodes ina top-down manner. Finally, if a node has multiple parents

requesting different plans, then it chooses the plan with theminimum latency.

The latency deadlines for complex events originate fromtwo different sources. First, as mentioned before we mayhave user specified, explicit latency deadlines. Second, la-tency deadlines can also stem from limited data logging ca-pabilities. More specifically, due to restricted storage someinformation sources may only be able to store the instancesof an event type for a limited time. Therefore, any plan thatrelies on storage of events for longer periods are not gonna beuseful. Our system considers both of the described latencyrequirements and uses the most strict one for each complexevent.

4.1.2 Execution of Plans

Once the selection of plans is completed, the set of prim-itive events to monitor are identified and activated. Whena primitive event arrives to the base station, it is directedto the corresponding primitive event node. The primitiveevent node stores the event and then forwards a pointer ofthe event to its active parents. An active parent is one whichhas expressed interest in the time interval the event arrivedin. The complex event detection proceeds similarly in thehigher level nodes. Each node acts according to its plan uponreceiving events either by activating subevents or by detect-ing a complex event and passing it along to its parents.

4.1.3 Modeling Event Detection Plans

In this section, we explain the cost and latency charac-teristics of event detection plans. In our cost model, we useprobabilistic models of event occurrences to derive expectedcosts of event detection plans. Our approach to cost model-ing is not strictly tied to any particular probability distribu-tion. Here, we derive the cost estimations for two differentprobability models:PoissonandBernoulli distributions. Inboth cases we assume that events occur independently.

Poisson distributions are widely used in modeling discreteoccurrences of events such as the receipt of a web request andarrival of a network packet. A Poisson distribution is char-acterized by a single parameterλ that expresses the averagenumber of events occurring in a given time interval. In ourcase, we defineλ to be the rate of occurrence for an eventtype, i.e. the average number of occurrences of an event typeper time unit. In addition, we assume that the number ofevents in disjoint time intervals are independent. Under theseconditions, the event occurrences follows a Poisson processwith rateλ. On the other hand, when modeling an event typewith the Bernoulli distribution, we assume that event occursindependently with probabilityp at every time step.

As described before, a complex event detection plan con-sists of a set of states each of which corresponds to moni-toring a set of events. The cost of a plan is the sum of thecosts of its states weighted by state reachability probabili-ties. Cost of a state depends on the cost of the subevents

6

in that state. We define the latency of an event detectionplan to be the maximum latency it could have so that we canguarantee latency deadlines. For this reason, we associateeach event type with a latency value that represents the max-imum latency its instances can have. Then, the latency ofan event detection plan can be derived using the latencies ofthe subevents. Here, we consider identical latencies for allevent types for simplicity. However, different latency valuescan be handled by the system as well. We will consider theexpected cost and latency of monitoring the complex eventand(e1, e2, e3) for describing the process in more detail.

We definee1, e2 and e3 to be primitive events with∆tlatency and use Poisson processes with ratesλe1

, λe2and

λe3respectively to model the events. When the naive plan

is used, all subevents will be monitored at all times. So thecost will be the sum of the expected occurrence rates of thesubevents:

∑3

i=1λei

. The latency of the naive plan, whichis simply the maximum latency among its subevents, is∆t.

The cost derivation for the three step plane1 → e2 → e3,given in Figure 8c, is more complex. We define the reach-ability probability of a state to be the probability of detect-ing the partial complex event that activates the state. Forinstance, the partial complex event which makes stateSe1

active ise1. State reachability probabilities are derived usinginterarrival distributions of events. When using a Poissonprocess with parameterλ, the interarrival time is exponen-tially distributed with the same parameter. Hence, the prob-ability of waiting time for the first occurrence of an event tobe greater thant is given bye−λt. On the other hand, whenusing the Bernoulli distribution, the interarrival times havegeometric distribution. The reachability probability forini-tial state is 1 since it is always active and the probability forfinal state is not required for cost estimation. Using the in-terarrival distributions to derive reachability probabilities thecost of the three step plan can be derived as:

cost fore1 → e2 → e3 = λe1+ (1 − e−λe1 )2Wλe2

+

((1 − e−λe1 )(1 − e−Wλe2 ) + (1 − e−λe2 )(1 − e−Wλe1 ))2Wλe3

In the cost equation above and for the rest of the paper,we assume the probability of more than one events to occurin the same time step to be negligible. However, if that isnot true, then the formula can be modified for the other case.Moreover, as more events are required to occur in a singletime step, the occurrence probability will quickly diminishwhich means the terms with many concurrent events can bediscarded as they will have negligible values.

The plan is assigned3∆t latency since this is the maxi-mum latency it exhibits (when the events occur in the ordere3, e2, e1 or e2, e3, e1). Actually, for the exact latency weneed to include the latency of sending pull requests for eventse2 ande3 in the equation. However, the pull requests willhave the same∆t latency and since we assumed all eventsto have the same latency it is not required to include them in

our calculations (they will not change the result when com-paring the cost of plans). For ease of presentation, we omitthem in the rest of the paper as well.

And Operator. Here we describe the cost estima-tion for the n-aryand operator. Given the complex eventand(e1, e2, . . . , en) with window sizeW, and a detection planwith m + 1 statesS1 throughSm and the final stateSm+1,we show the cost derivation using reachability probabilitiesboth for Poisson and Bernoulli distributions below. For eventej we represent the Poisson process parameter withλej

andthe Bernoulli parameter withpej

.The cost for and operator with n operands is given

by∑m

i=1PSi

∗ costSiwhere PSi

is the state reachabil-ity probability for state Si and costSi

represents thecost of monitoring subevents of stateSi for a period oflength 2W . In the case that all subevents are primitivecostSi

=∑

ej∈Si2Wλej

when Poisson processes are usedandcostSi

=∑

ej∈Si2Wpej

for Bernoulli distributions.PSi

, the reachability probability forSi, is equal to the oc-currence probability of the partial complex event that causesthe transition to stateSi. For this partial complex event tooccur in this time step, all its constituent events need to oc-cur within the last W time units with the last one occurring inthis time step (otherwise the event would have occurred be-fore). Then,PSi

is 1 wheni is 1 and form ≥ i > 1 is givenfor Poisson processes (i) and Bernoulli distributions (ii)by:

(i)X

ej∈Si−1

k=1Sk

(1 − e−λej )

Y

et 6=ej

et∈Si−1

k=1Sk

(1 − e−λetW )

(ii)X

ej∈Si−1

k=1Sk

pej

Y

et 6=ej

et∈Si−1

k=1Sk

(1 − (1 − pet )W )

Under the identical latency assumption, the latency of aplan forand operator is defined by the number of the statesin the plan (except the final state). Hence, the latency for theeventand(e1, e2, . . . , en) can range from∆t to n∆t.

Sequence Operator. We can consider the same set ofplans forsequenceoperator as well. However, sequence hasthe additional constraint that events have to occur in a spe-cific order and must not overlap. Therefore, the time intervalto monitor a subevent depends on the occurrence times ofother subevents and is at mostW time units.

Xep1Xepj

. . .ep1ep2

ept. . .epj+1

epj

Figure 9. subevents forseq(ep1, ep2

, . . . , ept)

Expected cost for monitoring the complex eventseq(e1, e2, . . . , en) with window sizeW using a plan withm + 1 states has the same form

∑m

i=1PSi

∗ costSi.

Let seq(ep1, ep2

, . . . , ept) with t ≤ n andp1 < p2 <

. . . < pt be the partial complex event consisting of the eventsbefore stateSi, i.e.∪i−1

k=1Sk = {ep1

, ep2, . . . , ept

}. Then

7

1. PSi, the reachability probability forSi, is equal to de-

tectingseq(ep1, ep2

, . . . , ept) at a time point. For this

complex event to occur subevents has to be detected insequence as in Figure 9 within W time units. We definethe random variableXepj

to be the time betweenepj+1

and the occurrence ofepjbeforeepj+1

(see Figure 9).Then,Xepj

is exponentially distributed withλepjif we

are using Poisson processes, or has geometric distribu-tion with pepj

when using Bernoulli distributions.

(i) For the Poisson process case, we havePSi= (1 −

e−λept )(1−R(W )) whereR(W ) = P (

Pt−1

j=1Xepj

≥

W ). Closed form expressions for sums of expo-nential random variables are studied in [9]. In thecase all exponential variables have distinct param-etersR(W ) has the following form:

R(W ) =Pt−1

j=1Aje

−λepjW where Aj =

t−1Y

k=1k 6=j

λepk

λepk− λepj

.

(ii) For the Bernoulli distributionPSi= pept

(1−R(W ))

whereR(W ) is defined on a sum of geometricrandom variables. In this case, there is no para-metric distribution forR(W ) unless the parame-ters of geometric random variables are identical.Hence, it has to be numerically calculated.

2. Any eventeikof stateSi should either occur (a) be-

tweenepjandepj+1

for some j or (b) beforeep1or af-

ter eptdepending on the order inseq(e1, e2, . . . , en).

In casea, we need to monitoreikbetweenepj

andepj+1

for Xepjtime units (see Figure 9). For case

b we need to monitor the event forW −Pt−1

j=1Xepj

time units. In the cost estimation, we use the ex-pectation valuesE[Xepj

|Pt−1

k=1Xepk

≤ W ] and W −

E[Pt−1

k=1Xepk

|Pt−1

k=1Xepk

≤ W ] for estimatingLeik, the

monitoring interval. ThencostSiis

P

eik∈Si

Leikλeik

.

The latency of a plan for sequence depends on the latencyof the last event (en) and the events in later states (afteren)of the plan. If the complex eventseq(e1, e2, . . . , en) is beingmonitored with an m-step plan where thejth step containsen, then its latency is(m−j +1)∆t. This latency differencebetweenand andsequenceoperators exists because unlikethesequence, with andoperator any of the subevents can bethe last event that causes the occurrence.

Negation Operator. In our system, negation can be usedon the primitive events insideand andseqoperators. Here,we consider the plans for complex events, with negatedterms, specified usingand operator over primitive events.The plans we consider for such events resemble a filteringapproach. First, we detect the partial complex event consist-ing of non-negated events only. When that complex eventis detected, we monitor the negated events. The detectionplan for the complex event defined by non-negated events

can be any arbitrary plan discussed forand operator. Sameset of detection plans can be considered for negated eventsas well. However, the execution has to be changed in a waythat the absence of an event is now what is aimed for. Thecost estimations discussed forand operator can be appliedhere by changing the occurrence probabilities with nonoc-currence probabilities.

Or Operator. As discussed before,or operator generatesa complex event for every event instance it receives. Hence,the only detection plan foror operator is thenaiveplan. Thecost of the naive plan is the sum of the costs of the subeventsand its latency is the highest latency among the subevents.

4.2 Optimizing for Shared SubeventsThe hierarchical nature of complex event specification

may introduce common subevents across complex events.For example, in a network monitoring application we couldhave thesynevent indicating the arrival of a TCPsynpacket.Various complex events could then be specified using thesynevent such as syn-flood (sending syn packets without match-ing acks to create half-open connections for overwhelmingthe receiver), a successfull TCP session and another eventdetecting port scans where the attacker looks for open ports.

The overall goal of shared optimization is to find the setof plans for which the total cost of monitoring all complexevents is minimized. Yet the base algorithm presented inSection 4.1.1 does not consider sharing between event ex-pressions as it runs independently for each expression. Here,we modify our plan generation algorithm for (1) calculat-ing the overall event detection cost correctly when sharedsubevents exist and (2) choosing plans that facilitate sharingto further reduce cost when available and applicable.

First, we need to identify the expected amount of sharingthat will happen on a shared node. However, the degree ofsharing depends on the plans selected by the ancestor nodesof the shared node. Since our base algorithm proceeds in abottom-up fashion, we cannot identify the amount of sharingunless the algorithm completes and the plans for all nodesare selected. Below, we present an iterative version of ouralgorithm to address these problems (for simplicity, modifiedalgorithm is presented for the case of a single shared complexevent):

1. run the base plan generation algorithm

2. find the expected amount of sharing with the current plan se-lections and recalculate the current plan costs for ancestors ofthe shared node

3. rerun the base algorithm starting at each parent of shared nodeutilizing the sharing probabilities

4. if overall cost is reduced then goto 2 else exit with the previousplans

After the first step, every node will have selected its planbut the total cost for the shared node will be incorrect. Inthe second step, we fix the overall cost by taking sharing intoaccount. This is possible because we can find the amount of

8

sharing after all nodes have selected their plans. We assumethat parents of the shared node function independently andfind the probability that they will monitor the shared eventin overlapping intervals. Third step runs the base algorithmstarting at each shared node parent. However, in this execu-tion of the algorithm, the sharing probabilities of generatedplans are calculated and utilized as well. Hence, ancestorsof shared node may now change their plans since increasedsharing may further reduce plan costs. Moreover, the planchanges made in third step are guaranteed to increase theamount of sharing because (1) Cost of the shared node canonly decrease due to sharing and (2) Ancestor nodes can onlyreduce their costs at each step if they choose plans whichmonitor the shared node in earlier states (and monitoring theshared node earlier means it will be shared more). The algo-rithm reiterates as long as the overall cost is reduced.

4.3 Constraint Optimization

In this section, we describe how the spatial and attribute-based constraints affect the occurrence probabilities ofevents and explain the additional optimizations we havemade to the plan selection and execution processes for fur-ther reducing cost using these constraints. First, we discussthe effects of spatial constraints on the plan generation pro-cess. Thespatial constraints we consider are defined interms of regional units. The space is divided into regionssuch that events in a region occur independently from eventsin other regions. The division of space into such independentregions is typical for some applications. For instance, in ase-curity application we could consider the rooms of a buildingas independent regions. In addition, it is also easy for usersto specify spatial constraints (by combining smaller regions)once regional units are provided.

When the spatial constraints are specified in the describedway, their effect on event occurrence probabilities can be in-corporated in our system with minor changes. First, we mod-ify our model to keep event occurrence statistics per each in-dependent region of an event type. Then, when a spatial con-straint on a complex event is given, we only need to combinethe information from corresponding regions to derive the as-sociated event occurrence probability. For example, if wehave Poisson processes with parametersλ1 andλ2 for tworegions, then the Poisson process associated with the com-bined region has the parameterλ1 + λ2. Hence, by com-bining Poisson processes we can easily construct the Pois-son process for any arbitrary combination of independent re-gions. However, this is only possible because the regionsare independent, otherwise we would have to derive jointdistributions. Hence, the spatial constraints alter the eventdetection process, such that different plans may be used formonitoring different spatial regions if doing so reduces theoverall cost. A related experiment is available in the experi-ments section.

Attribute-based constraints have been considered in

many query processing systems. The main approach, whichwe also adapt, has been to keep histograms for attributes.Histograms provide the information for deriving the selectiv-ity probabilities of attribute-based constraints which wecanthen use to derive the event occurrence probabilities. More-over, value based attribute constraints can be pushed down toinformation sources further reducing the number of transmit-ted events. Parametrized attribute constraints between eventscan also be pushed down whenever one of the events is mon-itored earlier than the other one.

5. Experiments

In this section, we analyze the performance of our sys-tem and investigate the effects of various parameters througha set of experiments. We have implemented the base nodefunctionality and generated specific event adapters for usein experiments. Our experiments involve both synthetic andreal data sets. Unless stated otherwiseZipfian distributionhas been used in synthetic data generation. Real data set isa collection of network traffic logs obtained from Planetflowweb site [12].

5.1 Experiments with Synthetic Data Sets

On window size and detection latency:We explore theeffects of window size and latency deadline on the eventdetection cost in this experiment. We defined the complexeventsand(e1, e2, e3) andseq(e1, e2, e3) wheree1, e2 ande3

are primitive events. Using Zipfian distribution with skew0.255 (other skew values are used in the varying skewnessexperiment) we generated event streams for these primitivetypes. Then, for different window values and latency dead-lines of both complex events we ran our event detection al-gorithm on the generated streams. The event detection costs,expressed as percentage of the primitive events sent to base,are provided in Figures 10(a) and 10(b).

Both figures show that as the allowed latency for event de-tection increases (from1 to 3 in this case) the event detectioncost reduces. The lines labeledoutput, which show the per-centage of primitive events output as parts of complex events,serve as a lower bound on cost. Becausesequenceoperatordoes not need to monitor all events unless the first events ofsequence occur, it can reduce cost even under hard latencyconstraints. Finally, the event detection cost increases withwindow size since larger window size means increased eventoccurrence probability.

Increasing the number of subevents: In this experi-ment, we investigate the cost performance under increasingnumber of subevents through a complex event specified witha singleandoperator. We randomly generated streams usingsimilar event frequencies for all event types (to rule out theeffect of frequency in the test). In Figure 10(c), we can seethat (1) increasing the number of operands tends to decreasethe number of detected complex events and (2) greater num-

9

(a) and operator with 3 operands (b) seq operator with 3 operands (c) and operator with increasing operands

(d) and operator with increasing skew (e) and operator with increasing negation(f) events with increasing number of operators

Figure 10. Operator wise experiments and complex event detection performance

ber of operands means we have a wider latency spectrum(therefore a larger plan space) to reduce cost.

Workloads with varying skewness: In this experiment,we use the complex eventand(e1, e2, e3) with a fixed win-dow parameter under workloads with varying skewness.Each workload stream is generated with a Zipfian distribu-tion and has around the same number of events. In Fig-ure 10(d), we see that in low skew streams a greater num-ber of complex events is detected and the cost is thereforehigher. Increasing the skew generates event types with lowfrequencies which our system uses to reduce the cost.

Negated subevents:To explore the cost performance forcomplex events involving negated subevents, we performedan experiment using theand(e1, e2, e3) event in which wevaried the number of negated subevents. In Figure 10(e),we can see that while the costs for the complex eventswith single and no negated terms are similar, the cost whentwo subevents are negated is high even though less com-plex events are detected. This is mainly because (1) mon-itoring of negated and non-negated events are not inter-leaved, that is we monitor the negated subevents after thenon-negated subevents occur (see Section 4.1.3) and (2) allthe detected non-negated subevents are discarded when anegated subevent that prevents them from forming a com-plex event is detected.

Increasing the number of operators: In this exper-iment, we consider the cost performance with increasingnumber of operators. We varied the number of operatorsused in complex events from 1 to 7 and for each operatorcount we generated 10 complex events based on event com-position rules. The average event detection cost for each

operator count is shown in Figure 10(f). As the number ofoperators in an expression is increased, generally its occur-rence probability decreases. Moreover, for similar event oc-currence probabilities the relative cost of event detection isalso similar irrespective of the operator number.

Shared subevents:To test the shared event optimization,we specified two complex events with a common subeventtree and compared the performance with and without sharedoptimization. In the experiment, we varied the frequency ofthe complex event that corresponds to the shared subtree. InFigure 11(a), we see that when the frequency of the sharedpart is low, both with and without sharing the system experi-ences similar cost since the shared part is chosen to be moni-tored earlier in both cases. When the frequency of the sharedpart is the same with or slightly higher than other parts, non-shared parts are monitored earlier without sharing optimiza-tion. In this case, shared optimization reduces cost by moni-toring the shared part first. Finally, when shared part has veryhigh frequency, non-shared parts are monitored first in bothcases. Even in this case shared optimization experiences lesscost, because it better estimates shared subevent costs whichcan cause better execution plans to be selected in some cases(since we are using heuristics for plan generation). When weused exhaustive plan generation, both with and without shar-ing the algorithm chose the exact same plans for this case.

Spatial constraints: In this experiment, we show theutilization of spatial constraints in reducing detection coststhrough the complex eventand(e1, e2) with constrainte1.loc= e2.loc. We assume that there are two regions X and Ywith event occurrence ratesλX

e1= 3λ, λY

e1= 7λ, λX

e2= 6λ

andλYe2

= 4λ. When localized information is available, i.e.

10

(a) shared optimization (b) spatial constraint between two events

Figure 11. Shared optimization and spatial constraints

frequencies of events are known for each region, the cost islowest (see Figure 11(b)). In this case, the system monitorse1 in region X ande2 in region Y. Then whene1 is detected inX (or e2 in Y), e2 is monitored in X (ore1 in Y). When local-ized information is not available, but the global selectivity ofthe spatial constraint is known (global info in Figure 11(b)),eithere1 or e2 is monitored (both have the same total fre-quency) in all regions. Finally, when no spatial constraintinformation is available, the system expects that the com-plex event will occur every time step and therefore choosesto execute the naive plan.

5.2 Experiments with Planetlab Data Set

The Planetlab data set we have used consists of 5 hoursof network logs for 49 Planetlab nodes we have obtainedfrom [12]. The network logs provide aggregated informa-tion on network connections between Planetlab nodes andother nodes in the Internet. The provided information in-cludes connection start/end times, amount of generated traf-fic and used network protocol. We have experimented withvarious complex events most of which can easily be foundon many network monitoring applications. Here, we presentthe results for three of the complex events.

Change of overall network load: We define a Planetlabnode asidle if its average network transfer speed (incomingand outgoing total) in the last minute is less than125KBpsand asactive if the average speed is greater than a thresh-old T . Given that, the complex event monitors for an overallnetwork load change from a situation where more than halfof all nodes are idle to more than half being active withina specified time interval. The complex event is defined asseq(count(idle)> %50 of all nodes, count(active)> %50of all nodes; W=30min ). The results are provided in Fig-ure 12(a) forT = 250, 500, and1250 KBps.

Diverse clusters: We define a cluster to be a set of ma-chines from the same/8 IP class. A diverse cluster is thendefined as a cluster with more thanC connections to Planet-lab nodes in total. To specify this complex event we first de-fine alocally diverse clusterevent which monitors the eventthat a Planetlab node has more thanC

N=49connections with

a cluster. The diverse cluster complex event is specified as

sum(conns)> C group by cluster. Then, it isand’ed with thelocally diverse cluster event which acts as a prerequisite forthe diverse cluster event and helps reduce monitoring cost.The results are given in Figure 12(b) forC = 250, 500, 1000,and2000.

Clusters with multiple active nodes:We define a node (outside of Planetlab) to beactiveif its

aggregate average network transfer rate to Planetlab nodesismore thanT in the last minute. In this complex event we areinterested in/8 clusters with more than one active nodes inthe last minute. Similar to the diverse cluster complex event,we first defined alocally active nodeevent which monitorsa node with an average network speed greater thanT

N=49

to a Planetlab node. Then the active node complex event isspecified assum(speed)> T group by nodeip and isand’edwith the locally active node event which acts as a prerequisiteevent. Finally, clusters with multiple active nodes is specifiedascount(active node)> 1 group by cluster. The results areprovided in Figure 12(c) forT = 500, 1000, and2000 KBps.

6. Related WorkIn continuous query processing systems such as

TinyDB [1] for wireless sensor networks, and Borealis [10]for stream processing applications queries are expected toconstantly produce results. Push based data transfer, eitherto a fixed node or to an arbitrary location in a decentralizedstructure, is characteristic of such continuous query process-ing systems. On the other hand, event detection systems areexpected to be silent as long as no events of interest occur.The aim in event systems is not continuous monitoring of thedata, but is the detection of events of interest.

In the active database community, ECA (event-condition-action) rules have been studied for building triggers [6].Triggers offer the event detection functionality throughwhich database applications can subscribe to in-databaseevents, e.g. the insertion of a tuple. However, most in-database events are simple whereas more complex eventscould be defined in the environments we consider. Manyactive database systems such as Samos [2], Ode ActiveDatabase [3], and Sentinel [4] have been produced as the re-sults of the studies in the active database area. Most systems

11

(a) network load change (b) diverse clusters (c) clusters with multiple active nodes

Figure 12. Experiments with Planetlab data set

provide their own event languages. These languages formthe base of the event language in our system. However, ourlanguage has additional features such as spatial and temporalconstructs which are important for the systems we consider.

In the join ordering problem, database query optimizerstry to find ordering of relations for which intermediate re-sult sizes is minimized [13]. Most query optimizers onlyconsider the orders corresponding to left-deep binary treesmainly for two reasons: (1) Available join algorithms suchas nested-loop joins tend to work well with left-deep trees,and (2) Number of possible left-deep trees is large but notas large as number of all trees. Our problem of constructingminimum cost monitoring plans is different from the join or-dering problem for the following reasons. First, we are notlimited to binary trees since multiple event types can be mon-itored in parallel. Second, our cost metric is the expectednumber of events sent to base. Finally, we have an additionalconstraint, i.e. the latency constraint, further limitingthe so-lution space.

In a recent study about high performance complex eventprocessing [5] optimization methods for efficient event pro-cessing are described. There the aim is to reduce processingcost at the base station. While our system also helps reducethe processing cost, our main goal is to minimize the networktraffic. Moreover, their system does not consider distributedevent processing, and simultaneous queries.

7. Conclusions and Future Work

CED is a critical capability for many monitoring appli-cations. While earlier work primarily focused on optimiz-ing processing requirements of complex events, we made aneffort towards optimizing communication needs when dis-tributed sources are involved.

The results support our premise that communication re-quirements can be significantly reduced by exploiting spatio-temporal constraints within the event specification and thefrequency skew among the relevant sub-events, at the ex-pense of additional detection delays. Specifically, the mainbenefit came from a novel multi-step planning technique thatcombined proactive and retroactive monitoring of events.

This is a rich research area with many open problems.

Our immediate future work will explore probabilistic plan-ning for sensor network applications and augmenting manualevent specifications with learning-based techniques.

References

[1] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong.Tinydb:An acquisitional query processing system for sensor networks. Trans-actions on Database Systems (TODS), 2005.

[2] S. Gatziu and K. R. Dittrich. Detecting composite events in activedatabase systems using petri nets. In Proc. 4. Intl. Workshopon Re-search Issues in Data Engineering, pages 2–9, Houston, USA,1994.

[3] S. Chakravarthy. et al. Composite Events for Active Databases: Se-mantics, Contexts and Detection, VLDB 1994.

[4] S. Chakravarthy and D. Mishra. Snoop: An Expressive Event Spec-ification Language for Active Databases. Data and KnowledgeEngi-neering, 14(10):1–26, 1994.

[5] Eugene Wu, Yanlei Diao, and Shariq Rizvi. High-Performance Com-plex Event Processing over Streams. SIGMOD 2006

[6] N. Paton and O. Diaz, ’Active Database Systems’, ACM Comp. Sur-veys, Vol. 31, No. 1, 1999.

[7] Zimmer, D. and Unland, R. On the Semantics of Complex Events inActive Database Management Systems. p.392, ICDE’99.

[8] The Power of Events: An Introduction to Complex Event Processingin Distributed Enterprise Systems, David Luckham, May 2002.

[9] S. V. Amaria and R. B. Misra, Closed-form expressions for distribu-tion of sum of exponential random variables, IEEE Trans. Reliability,vol. 46, no. 4, pp. 519-522, Dec. 1997.

[10] Daniel Abadi, et al. The Design of the Borealis Stream ProcessingEngine. CIDR’05.

[11] S. Chandrasekaran, et al. TelegraphCQ: Continuous Dataflow Process-ing. In ACM SIGMOD Conference, June 2003.

[12] http://planetflow.planet-lab.org/

[13] Selinger, P. G., et al. 1979. Access path selection in a relationaldatabase management system. SIGMOD ’79.

12

combining proactive and retroactive processing for...

Documents