Processing Nested Complex Sequence Pattern Queries over Event Streams

Mo Liu¹, Medhabi Ray¹, Elke A. Rundensteiner¹, Dan Dougherty¹, Chetan Gupta², Song Wang², Ismail Ari³, and Abhay Mehta²

¹ Worcester Polytechnic Institute, USA
² HP Labs, USA
³ Ozyegin University, Turkey

DMSN 2010, Singapore

Acknowledgements: This work is partly supported by HP Innovations Award, NSF 1018443 and NSF IIS 0917017, and Turkish National Science Foundation TUBITAK under career award 109E194.



Event Processing: The Big Picture

[Figure: data sources (RFID input) act as event producers and send data streams to event processing, which delivers query results to event consumers. Example query results: "Put on surgical gloves", "Wash your hands before touching next patients", "Put on mask for H1N1 contagious patients", "Aggregate statistics for a hospital".]

Hospital Disease and Hygiene Control

[Figure: RFID input is used to track workers and detect hygiene violations.]

D. Wang, E. Rundensteiner, R. Ellison III, Active complex event processing: applications in realtime health care, VLDB (demonstration paper), 2010.

CEP Basics

• A primitive event instance is an occurrence of interest at a point in time, written e(t) for time t.
• A composite event instance occurs over an interval, written e([t1, t2]) for the interval from time t1 to time t2.
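The two event kinds can be modelled as small data types. The following is a minimal sketch, not the system's actual representation; the class names `PrimitiveEvent` and `CompositeEvent` and the sample events are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PrimitiveEvent:
    """e(t): an occurrence of interest at a single time point t."""
    etype: str
    t: int


@dataclass(frozen=True)
class CompositeEvent:
    """e([t1, t2]): built from primitive events, so it spans an interval."""
    parts: Tuple[PrimitiveEvent, ...]

    @property
    def interval(self) -> Tuple[int, int]:
        # The interval runs from the earliest to the latest constituent event.
        times = [p.t for p in self.parts]
        return (min(times), max(times))


# Hypothetical composite event made of three primitive events.
match = CompositeEvent((PrimitiveEvent("Recycle", 1),
                        PrimitiveEvent("Washing", 2),
                        PrimitiveEvent("Operating", 18)))
print(match.interval)  # (1, 18)
```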

Outline

• Motivation
• NEEL: The Nested Complex Event Language
• Nested CEP Query Processing
• Performance Evaluation
• Nested Query Optimization with Evaluation
• Conclusion

Why Nested Queries?

• Compact
• Incremental
• Convenient

NEEL: The Nested Complex Event Language

• Specify time period
• Support nested SEQ, NEGATION, AND, OR
• Specify conditions on attributes
• Assume value-based comparison

NEEL: The Nested Complex Event Language for Real-Time Event Analytics, Mo Liu, Elke A. Rundensteiner, Dan Dougherty, Chetan Gupta, Song Wang, Ismail Ari, and Abhay Mehta, BIRTE 2010.

NEEL: The Nested Complex Event Language

[Figure: example NEEL query with the annotations "Time" and "Nested sub-query".]

Nested CEP Query Plan

[Figure: nested query plan over RFID readings. Outer operator: WinSeq(Recycle r, Washing w, , Operating o) with predicate (r.id = w.id = o.id). Inner operator: WinAND(Sharpening s, Disinfection d, Checking c) with predicate (s.id = d.id = c.id = o.id). Output: complex events.]

Outline

• Motivation
• NEEL: The Nested Complex Event Language
• Nested CEP Query Processing
  − Processing Nested Queries with Negation
  − Processing Nested Queries with Predicates
• Performance Evaluation
• Nested Query Optimization with Evaluation
• Conclusion

Nested CEP Query Processing

PATTERN SEQ(Recycle r, Washing w, SEQ(Sharpening s, Disinfection d, Checking c), Operating o)
WITHIN 10 minutes

[Figure: query plan. Outer operator: WinSeq(Recycle r, Washing w, , Operating o) over the Recycle, Washing and Operating inputs. Inner operator: WinSeq(Sharpening s, Disinfection d, Checking c) over the Sharpening, Disinfection and Checking inputs. Output: complex events.]

Nested CEP Query Processing

WinSeq (outer) inputs: Recycle: r1, r5; Washing: w2, w12; Operating: o18
WinSeq (inner) inputs: Sharpening: s3, s11; Disinfection: d10; Checking: c12, c16

Partial outer query results: <r1, w2, o18>, <r1, w12, o18>, <r5, w12, o18>

[ECUBE] M. Liu, E. A. Rundensteiner, K. Greenfield, C. Gupta, S. Wang, I. Ari, and A. Mehta, "E-Cube: Multi-Dimensional Event Sequence Processing Using Concept and Pattern Hierarchies," ICDE'10 (demo).

Nested CEP Query Processing

Partial outer query result: <r1, w2, o18>
Tightened sub-window: [2, 18]
Inner query results: <s3, d10, c12>, <s3, d10, c16>
Combined results: <r1, w2, s3, d10, c12, o18>, <r1, w2, s3, d10, c16, o18>
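The iterative strategy on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the ECube implementation: the helper names `seq_matches`, `nested_seq` and `label` are hypothetical, sub-window bounds are treated as exclusive, and the WITHIN clause is omitted. The timestamps are the ones shown on the slide.

```python
from itertools import product

# (event type, timestamp) pairs taken from the slide example.
STREAM = [("Recycle", 1), ("Washing", 2), ("Sharpening", 3), ("Recycle", 5),
          ("Disinfection", 10), ("Sharpening", 11), ("Washing", 12),
          ("Checking", 12), ("Checking", 16), ("Operating", 18)]


def seq_matches(stream, types, lo=float("-inf"), hi=float("inf")):
    """Every time-ordered match of `types` using events strictly inside (lo, hi)."""
    per_type = [[e for e in stream if e[0] == t and lo < e[1] < hi] for t in types]
    return [m for m in product(*per_type)
            if all(a[1] < b[1] for a, b in zip(m, m[1:]))]


def nested_seq(stream):
    results = []
    # Step 1: evaluate the outer pattern with the sub-query slot left open.
    for r, w, o in seq_matches(stream, ["Recycle", "Washing", "Operating"]):
        # Step 2: run the inner query inside the tightened sub-window [w, o].
        inner = seq_matches(stream, ["Sharpening", "Disinfection", "Checking"],
                            lo=w[1], hi=o[1])
        # Step 3: join each inner match into the partial outer result.
        results.extend((r, w, s, d, c, o) for s, d, c in inner)
    return results


def label(match):
    """Render a match in the slide notation, e.g. <r1, w2, o18>."""
    return "<" + ", ".join(f"{etype[0].lower()}{ts}" for etype, ts in match) + ">"


for m in nested_seq(STREAM):
    print(label(m))
```

On the slide data the sketch prints the two combined results for <r1, w2, o18>. The other two partial outer results contribute nothing, because their sub-window [12, 18] contains no complete inner match.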


Nested CEP Query Processing

Partial outer query result: <r1, w12, o18>
Tightened sub-window: [12, 18]
Inner query results: empty

Processing Nested Queries with Negation

Bounded by outer query.

PATTERN SEQ(Recycle r, Washing w, ! SEQ(Sharpening s, Disinfection d, Checking c), Operating o)
WITHIN 10 minutes

The negated sub-query is bounded by the interval [w, o].

Processing Nested Queries with Negation

PATTERN SEQ(Recycle r, Washing w, ! SEQ(Sharpening s, Disinfection d, Checking c), Operating o)

WinSeq (outer) inputs: Recycle: r1, r5; Washing: w2, w12; Operating: o18
WinSeq (inner) inputs: Sharpening: s3, s11; Disinfection: d10; Checking: c12, c16

Outer query results: <r1, w2, o18>, <r1, w12, o18>, <r5, w12, o18>

Processing Nested Queries with Negation

Outer query result: <r1, w2, o18>
Tightened sub-window: [2, 18]
Inner query results (not empty): <s3, d10, c12>, <s3, d10, c16>


Processing Nested Queries with Negation

Outer query result: <r1, w12, o18>
Tightened sub-window: [12, 18]
Inner query results: empty
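A negated sub-query bounded by the outer query acts as a filter: an outer match survives only if the inner query returns nothing inside its tightened sub-window. The sketch below illustrates that reading on the slide data. The helper names are hypothetical, bounds are assumed exclusive, and the WITHIN clause is omitted.

```python
from itertools import product

# (event type, timestamp) pairs taken from the slide example.
STREAM = [("Recycle", 1), ("Washing", 2), ("Sharpening", 3), ("Recycle", 5),
          ("Disinfection", 10), ("Sharpening", 11), ("Washing", 12),
          ("Checking", 12), ("Checking", 16), ("Operating", 18)]


def seq_matches(stream, types, lo=float("-inf"), hi=float("inf")):
    """Every time-ordered match of `types` using events strictly inside (lo, hi)."""
    per_type = [[e for e in stream if e[0] == t and lo < e[1] < hi] for t in types]
    return [m for m in product(*per_type)
            if all(a[1] < b[1] for a, b in zip(m, m[1:]))]


def seq_with_negated_subquery(stream):
    kept = []
    for r, w, o in seq_matches(stream, ["Recycle", "Washing", "Operating"]):
        # Evaluate the negated sub-query inside the sub-window [w, o].
        inner = seq_matches(stream, ["Sharpening", "Disinfection", "Checking"],
                            lo=w[1], hi=o[1])
        # Negation: keep the outer match only when the inner result is empty.
        if not inner:
            kept.append((r, w, o))
    return kept


def label(match):
    """Render a match in the slide notation, e.g. <r1, w12, o18>."""
    return "<" + ", ".join(f"{etype[0].lower()}{ts}" for etype, ts in match) + ">"


for m in seq_with_negated_subquery(STREAM):
    print(label(m))
```

The slides walk through <r1, w2, o18> (dropped, inner result not empty) and <r1, w12, o18> (kept). In this sketch <r5, w12, o18> is kept as well, since it shares the sub-window [12, 18].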

Processing Nested Queries with Negation

Bounded by adjacent query.

PATTERN SEQ(Recycle b, SEQ(Washing w, ! Sharpening s), SEQ(Disinfection d, Checking c), Operating o)
WITHIN 10 minutes

The negation is bounded by the interval [w, d].

Challenge: the bounds are not yet known at the time of sub-query processing, because the sub-queries are mutually dependent.

Processing Nested Queries with Negation

PATTERN SEQ(Recycle b, SEQ(Washing w, ! Sharpening s), SEQ(Disinfection d, Checking c), Operating o)

WinSeq (outer) inputs: Recycle: r1; Operating: o18
WinSeq (inner) inputs: Washing: w2, w12; Sharpening: s11
WinSeq (inner) inputs: Disinfection: d6; Checking: c12, c16

Outer query result: <r1, o18>

Processing Nested Queries with Negation

Outer query result: <r1, o18>
Tightened sub-window: [1, 18]
Inner query results: <d6, c12>, <d6, c16>
Potential query results: <w2> + (n. <s11>)
Resolve at upper level.

Output: <r1, w2, d6, c12, o18>, <r1, w2, d6, c16, o18>
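One way to read "resolve at upper level": the sub-query with negation returns its Washing candidates together with the negative Sharpening events, and the check that no Sharpening falls inside [w, d] is deferred until a Disinfection match supplies the right bound. The sketch below follows that reading on the slide data. It is an assumption about the mechanism, the helper names are hypothetical, and bounds are treated as exclusive.

```python
from itertools import product

# (event type, timestamp) pairs taken from the slide example.
STREAM = [("Recycle", 1), ("Washing", 2), ("Disinfection", 6), ("Sharpening", 11),
          ("Washing", 12), ("Checking", 12), ("Checking", 16), ("Operating", 18)]


def seq_matches(stream, types, lo=float("-inf"), hi=float("inf")):
    """Every time-ordered match of `types` using events strictly inside (lo, hi)."""
    per_type = [[e for e in stream if e[0] == t and lo < e[1] < hi] for t in types]
    return [m for m in product(*per_type)
            if all(a[1] < b[1] for a, b in zip(m, m[1:]))]


def adjacent_negation(stream):
    results = []
    for r, o in seq_matches(stream, ["Recycle", "Operating"]):
        lo, hi = r[1], o[1]  # tightened sub-window from the outer match
        # First sub-query: positive candidates plus the negative events.
        washings = seq_matches(stream, ["Washing"], lo, hi)
        negatives = [e for e in stream if e[0] == "Sharpening" and lo < e[1] < hi]
        # Second sub-query: ordinary inner results.
        inner = seq_matches(stream, ["Disinfection", "Checking"], lo, hi)
        # Upper level: the right bound d is now known, so resolve the negation.
        for (w,) in washings:
            for d, c in inner:
                if w[1] >= d[1]:
                    continue  # w must precede d
                if any(w[1] < s[1] < d[1] for s in negatives):
                    continue  # a Sharpening inside [w, d] invalidates the match
                results.append((r, w, d, c, o))
    return results


def label(match):
    """Render a match in the slide notation, e.g. <r1, o18>."""
    return "<" + ", ".join(f"{etype[0].lower()}{ts}" for etype, ts in match) + ">"


for m in adjacent_negation(STREAM):
    print(label(m))
```

On the slide data, s11 lies outside [w2, d6], so both combinations with w2 are output, while w12 is discarded because it does not precede d6.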

Nested CEP Query Processing with Predicates

PATTERN SEQ(Recycle r, Washing w, SEQ(Sharpening s, Disinfection d, Checking c, s.id = d.id = c.id = o.id), Operating o, r.id = w.id = o.id)
WITHIN 10 minutes

• Pass down interval attribute values from outer to inner.
• Resolve predicate correlation as early as possible.
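The two points above can be sketched as follows: the outer match hands its interval and the correlated value o.id to the inner query, which filters its inputs on that value before building any sequence. This is an illustrative sketch only. The id values attached to the events are invented for the example, the helper names are hypothetical, and the WITHIN clause is omitted.

```python
from itertools import product

# (event type, timestamp, id). Timestamps follow the earlier slides;
# the id values 7 and 9 are invented for this example.
STREAM = [("Recycle", 1, 7), ("Washing", 2, 7), ("Sharpening", 3, 7),
          ("Recycle", 5, 9), ("Disinfection", 10, 7), ("Sharpening", 11, 9),
          ("Washing", 12, 9), ("Checking", 12, 7), ("Checking", 16, 9),
          ("Operating", 18, 7)]


def seq_matches(events, types):
    """Every time-ordered match of `types` over the given events."""
    per_type = [[e for e in events if e[0] == t] for t in types]
    return [m for m in product(*per_type)
            if all(a[1] < b[1] for a, b in zip(m, m[1:]))]


def same_id(match):
    """True when all events in the match carry the same id."""
    return len({e[2] for e in match}) == 1


def nested_seq_with_predicates(stream):
    results = []
    # Outer predicate r.id = w.id = o.id.
    outer = [m for m in seq_matches(stream, ["Recycle", "Washing", "Operating"])
             if same_id(m)]
    for r, w, o in outer:
        # Pass the interval (w, o) and the value o.id down to the inner query,
        # and filter inner inputs early, before any sequence is constructed.
        candidates = [e for e in stream if w[1] < e[1] < o[1] and e[2] == o[2]]
        for s, d, c in seq_matches(candidates,
                                   ["Sharpening", "Disinfection", "Checking"]):
            results.append((r, w, s, d, c, o))
    return results


def label(match):
    return "<" + ", ".join(f"{e[0][0].lower()}{e[1]}" for e in match) + ">"


for m in nested_seq_with_predicates(STREAM):
    print(label(m))
```

Because every inner candidate already satisfies id = o.id, the inner predicate s.id = d.id = c.id = o.id holds by construction, and c16 (id 9) is never considered.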


Experimental Setup

• Implemented in the ECube Event Analytics System
  [ECube] M. Liu, E. A. Rundensteiner, K. Greenfield, C. Gupta, S. Wang, I. Ari, and A. Mehta, "E-Cube: Multi-dimensional event sequence processing using concept and pattern hierarchies," in ICDE, 2010, pp. 1097–1100.
• Use real stock trades data
  [Stock] I. INETATS. Stock trade traces. http://www.inetats.com/
• Sample queries

Performance Evaluation

Observation: The number of children, the query length, and the number of nesting levels all impact query performance.


Inefficiency in Nested CEP Query Processing

PATTERN SEQ(Recycle r, Washing w, SEQ(Sharpening s, Disinfection d), Checking c, Operating o)
WITHIN 10 minutes

The outer query block invokes the inner-query block SEQ(Sharpening s, Disinfection d) with <w, c>.

Observations:
1. For one outer invocation, the outer matches may share the same <w, c>.
2. Different outer invocations may invoke the inner sub-query with the same <w, c>.

Selective Result Caching helps to avoid duplicate invocations.

Nested Query Optimization

• Cache Design. A semantic descriptor interval [leftbound, rightbound] indicates the time validity of the current cache content.
• Cache Usage. Reuse cached results if the query interval matches the cache interval.
• Cache Maintenance.
  − Interval-driven Cache Expansion: stream insertion
  − Interval-driven Cache Reduction: window purge
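The cache usage rule can be illustrated with a simplified sketch. It keys cached inner results by the exact interval [leftbound, rightbound] and drops entries on window purge; it does not model the single expanding and shrinking descriptor interval described above. The class name `IntervalCache`, the helper names, and the event stream are all assumptions made for the example.

```python
from itertools import product

# Hypothetical stream: two Recycle and two Operating events around one <w, c> pair.
STREAM = [("Recycle", 1), ("Recycle", 2), ("Washing", 3), ("Sharpening", 4),
          ("Disinfection", 5), ("Checking", 6), ("Operating", 7), ("Operating", 8)]


def seq_matches(stream, types, lo=float("-inf"), hi=float("inf")):
    """Every time-ordered match of `types` using events strictly inside (lo, hi)."""
    per_type = [[e for e in stream if e[0] == t and lo < e[1] < hi] for t in types]
    return [m for m in product(*per_type)
            if all(a[1] < b[1] for a, b in zip(m, m[1:]))]


class IntervalCache:
    """Inner-query results keyed by their interval [leftbound, rightbound]."""

    def __init__(self, compute):
        self.compute = compute
        self.entries = {}
        self.hits = 0
        self.misses = 0

    def get(self, left, right):
        key = (left, right)
        if key in self.entries:
            self.hits += 1      # query interval matches a cached interval: reuse
        else:
            self.misses += 1    # otherwise invoke the inner query once
            self.entries[key] = self.compute(left, right)
        return self.entries[key]

    def purge(self, window_start):
        # Window purge: drop entries whose left bound has left the window.
        self.entries = {k: v for k, v in self.entries.items() if k[0] >= window_start}


def run(stream):
    cache = IntervalCache(
        lambda left, right: seq_matches(stream, ["Sharpening", "Disinfection"], left, right))
    results = []
    for r, w, c, o in seq_matches(stream, ["Recycle", "Washing", "Checking", "Operating"]):
        for s, d in cache.get(w[1], c[1]):
            results.append((r, w, s, d, c, o))
    return results, cache


results, cache = run(STREAM)
print(len(results), cache.misses, cache.hits)  # 4 1 3
```

All four outer matches share the same <w3, c6>, so the inner query runs once and the other three lookups are served from the cache.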

Preliminary Result: Evaluating Optimized Nested Execution

Observation: Selective result caching significantly improves performance.

Future work:
• Refine caching to consider full NEEL query features: negation.
• Design additional optimization strategies: query decorrelation.

Related Work

• Complex Event Processing Systems [SASE, CEDR, …]
• Query Decorrelation [Decorrelation96, Dayal87, …]
• Semantic Caching [Dar96, Chen02, …]

[SASE] E. Wu, Y. Diao, and S. Rizvi, "High-performance complex event processing over streams," in SIGMOD, 2006, pp. 407–418.
[CEDR] R. S. Barga, J. Goldstein, M. Ali, and M. Hong, "Consistent streaming through time: A vision for event stream processing," in CIDR, 2007, pp. 363–374.
[Decorrelation96] P. Seshadri, H. Pirahesh, and T. Y. C. Leung, "Complex query decorrelation," in ICDE '96: Proceedings of the Twelfth International Conference on Data Engineering. IEEE Computer Society, 1996, pp. 450–458.
[Dayal87] U. Dayal, "Of nests and trees: A unified approach to processing queries that contain nested subqueries, aggregates, and quantifiers," in VLDB, 1987, pp. 197–208.
[Dar96] S. Dar, M. J. Franklin, B. T. Jónsson, et al., "Semantic data caching and replacement," in VLDB, 1996, pp. 330–341.
[Chen02] L. Chen, E. Rundensteiner, et al., "XCache: A semantic caching system for XML queries," in ACM SIGMOD, 2002, pp. 618–618.

Conclusion

• Introduce an algebraic query plan for queries expressed in NEEL.
• Design an iterative execution strategy for NEEL.
• Evaluate our proposed execution strategy on real data streams.
• Demonstrate the promise of selective caching in NEEL execution.

Thank You!