Challenges, Approaches, and Solutions in Stream Reasoning


Upload: emanuele-della-valle

Post on 27-Jan-2015


DESCRIPTION

The presentation I gave at Semantic Days 2012 (https://www.posccaesar.org/wiki/PCA/SemanticDays2012) about Stream Reasoning. Its main goal is to give the most up-to-date, comprehensive view of Stream Reasoning.

TRANSCRIPT

Page 1: Challenges, Approaches, and Solutions in Stream Reasoning

Emanuele Della Valle - visit http://streamreasoning.org

Challenges, Approaches, and Solutions in Stream Reasoning

http://streamreasoning.org

Emanuele Della Valle, DEI - Politecnico di Milano

[email protected]

http://emanueledellavalle.org

Page 2: Challenges, Approaches, and Solutions in Stream Reasoning


Agenda

• Motivation

• Concept

• Achievements

• Applications

• Conclusions

Stavanger, 2012-5-9 2

Page 3: Challenges, Approaches, and Solutions in Stream Reasoning


Motivation

It's a Streaming World! [IEEE-IS2009] 1/4

• Oil operations

• Traffic

• Financial markets

• Social networks

… all generate data streams!

Page 4: Challenges, Approaches, and Solutions in Stream Reasoning

Motivation

It's a Streaming World! [IEEE-IS2009] 2/4

• … and we want to analyse data streams in real time

• If a well is starting to drown, how much time do I have, given its historical behavior?

• Is public transportation where the people are?

• Can we detect any intra-day correlation clusters among stock exchanges?

• Who is driving the discussion about the top 10 emerging topics?

Page 5: Challenges, Approaches, and Solutions in Stream Reasoning

Motivation

It's a Streaming World! [IEEE-IS2009] 3/4

• e.g., Real Time Rome (mobile network data streams)

[Figure: Real Time Rome snapshots for the afternoon and the evening, comparing a normal day with an exceptional day. Source: http://senseable.mit.edu/realtimerome/]

Page 6: Challenges, Approaches, and Solutions in Stream Reasoning

Motivation

It's a Streaming World! [IEEE-IS2009] 4/4

• e.g., Pulse of the Nation (social media streams)

[Figure: Pulse of the Nation snapshots at 12:00 and 23:00, with a scale labelled "happier". Source: http://www.ccs.neu.edu/home/amislove/twittermood/]

Page 7: Challenges, Approaches, and Solutions in Stream Reasoning

Motivation

What are data streams anyway?

• Formally:
– Data streams are unbounded sequences of time-varying data elements

• Less formally:
– an (almost) "continuous" flow of information
– with the recent information being more relevant, as it describes the current state of a dynamic system

[Figure: data elements laid out along a time axis]

Page 8: Challenges, Approaches, and Solutions in Stream Reasoning

Motivation

The continuous nature of streams

• The nature of streams requires a paradigmatic change*

– from persistent data
  • to be stored and queried on demand
  • a.k.a. one-time semantics

– to transient data
  • to be consumed on the fly by continuous queries
  • a.k.a. continuous semantics

* This paradigmatic change first arose in the DB community [Henzinger98]

Page 9: Challenges, Approaches, and Solutions in Stream Reasoning

Motivation – The continuous nature of streams

Continuous Semantics

• Continuous queries are registered over streams that, in most cases, are observed through windows

[Figure: input streams are observed through a window by a registered continuous query, which emits streams of answers]
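The window-plus-registered-query picture can be sketched in a few lines of Python. This is only a minimal illustration, not how any DSMS or the C-SPARQL engine is implemented; `continuous_query`, `width` and the count-based window are names and choices assumed here for the example.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, List


def continuous_query(stream: Iterable[float], width: int,
                     query: Callable[[List[float]], float]) -> Iterator[float]:
    """Re-evaluate `query` over the last `width` elements each time one arrives."""
    window: deque = deque(maxlen=width)  # older elements fall out automatically
    for element in stream:               # the input may be unbounded
        window.append(element)
        yield query(list(window))        # one answer per arrival: a stream of answers


# A finite list stands in for the stream so that the example terminates.
answers = list(continuous_query([1.0, 2.0, 3.0, 4.0], width=2, query=sum))
```

The query is registered once and keeps producing answers as data arrives, in contrast with one-time semantics, where each answer requires a new request over stored data.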

Page 10: Challenges, Approaches, and Solutions in Stream Reasoning

Motivation – The continuous nature of streams

Tools exist [Cugola2011]

• Types
– Data Stream Management Systems
– Complex Event Processors

• Research Prototypes
– Amazon/Cougar (Cornell) – sensors
– Aurora (Brown/MIT) – sensor monitoring, dataflow
– Gigascope (AT&T Labs) – network monitoring
– Hancock (AT&T) – telecom streams
– Niagara (OGI/Wisconsin) – Internet DBs & XML
– OpenCQ (Georgia) – triggers, view maintenance
– Stream (Stanford) – general-purpose DSMS
– Stream Mill (UCLA) – power & extensibility
– Tapestry (Xerox) – publish/subscribe filtering
– Telegraph (Berkeley) – adaptive engine for sensors
– Tribeca (Bellcore) – network monitoring

• High-tech startups
– Streambase, Coral8, Apama, Truviso

• Major DBMS vendors are all adding stream extensions as well
– IBM InfoSphere Streams
– Microsoft StreamInsight
– Oracle CEP

Page 11: Challenges, Approaches, and Solutions in Stream Reasoning

Motivation

New Requirements, New Challenges

Typical requirement → challenge

• Processing streams → Continuous semantics
• Large datasets → Scalable processing
• Reactivity → Real-time systems
• Fine-grained information access → Powerful query languages
• Modeling complex application domains → Rich ontology languages

Page 12: Challenges, Approaches, and Solutions in Stream Reasoning

Motivation

Are DSMS/CEP ready to address them?

[Diagram: the requirements and challenges of Page 11, with a "DSMS/CEP" label overlaid]

Page 13: Challenges, Approaches, and Solutions in Stream Reasoning

Motivation

Is Semantic Web ready to address them?

• The Semantic Web, the Web of Data, is doing fine
– RDF, RDF Schema, SPARQL, OWL, RIF
– well understood theory
– rapid increase in scalability

• BUT it pretends that the world is static, or at best has a low change rate, both in change-volume and change-frequency
– ontology versioning
– belief revision
– time stamps on named graphs

• It sticks to the traditional one-time semantics

Page 14: Challenges, Approaches, and Solutions in Stream Reasoning

Motivation

New Requirements, New Challenges

[Diagram: the requirements and challenges of Page 11, with a "Semantic Web" label overlaid]

Page 15: Challenges, Approaches, and Solutions in Stream Reasoning

Motivation

New Requirements call for Stream Reasoning

[Diagram: the requirements and challenges of Page 11, with a "Stream Reasoning" label overlaid]

Page 16: Challenges, Approaches, and Solutions in Stream Reasoning

Concept

Stream Reasoning Definition [IEEE-IS2010]

• Making sense
– in real time
– of multiple, heterogeneous, gigantic and inevitably noisy data streams
– in order to support the decision process of extremely large numbers of concurrent users

• Note: making sense of streams necessarily requires processing them against rich background knowledge, an unsolved problem in databases

Page 17: Challenges, Approaches, and Solutions in Stream Reasoning

Concept

Research Challenges

• Relation with DSMSs and CEPs
– Just as RDF relates to database systems?

• Data types and query languages for semantic streams
– Just RDF and SPARQL but with continuous semantics?

• Reasoning on Streams
– Theory
– Efficiency
– Scalability

• Dealing with incomplete & noisy data
– Even more than on the current Web of Data

• Distributed and parallel processing
– Streams are parallel in nature, …

• Engineering Stream Reasoning Applications
– Development Environment
– Integration with other technologies
– Benchmarks

Page 18: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

• RDF Streams
– Notion defined

• C-SPARQL
– Syntax and semantics defined as a SPARQL extension
– Engine designed and implemented

• Experiments with C-SPARQL under simple RDF entailment regimes
– window-based selection of C-SPARQL outperforms the standard FILTER-based selection
– algebraic optimizations of C-SPARQL queries are possible
– complex events can be detected using a network of C-SPARQL queries at high throughputs

• Experiments with C-SPARQL under RDFS++ entailment regimes
– efficient incremental updates of deductive closures investigated
– our approach outperforms the state of the art when updates come as a stream

• Streaming Linked Data Framework prototyped

Page 20: Challenges, Approaches, and Solutions in Stream Reasoning

Memo: Sensor Network Ontology

Page 21: Challenges, Approaches, and Solutions in Stream Reasoning

Memo: Sensor Network Ontology

[Figure: the Sensor Network Ontology; the legend distinguishes a streaming part and a static part]

Page 22: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

RDF Stream [WWW2009,EDBT2010,IJSC2010]

• RDF Stream Data Type
– Ordered sequence of pairs, where each pair is made of an RDF triple and its timestamp
– Timestamps are not required to be unique, but they must be non-decreasing

• E.g.,
(< :s1 ssn:generatedObservation :o1 >, 2010-02-12T13:34:41)
(< :o1 a weather:SnowfallObservation >, 2010-02-12T13:34:41)
(< :s1 om-owl:generatedObservation :o2 >, 2010-02-12T13:36:28)
(< :o2 a weather:WindSpeedObservation >, 2010-02-12T13:36:28)
(< :o2 ssn:result :a1 >, 2010-02-12T13:36:28)
(< :a1 ssn:floatValue "35.4"^^xsd:float >, 2010-02-12T13:36:28)
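As a rough illustration of this data type, the Python sketch below pairs each triple with a timestamp and enforces the non-decreasing rule. The names `TimestampedTriple` and `rdf_stream` are hypothetical, chosen for this example only, and triples are modelled as plain string tuples rather than with an RDF library.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings


@dataclass(frozen=True)
class TimestampedTriple:
    triple: Triple
    timestamp: datetime


def rdf_stream(pairs: Iterable[Tuple[Triple, str]]) -> Iterator[TimestampedTriple]:
    """Yield timestamped triples, rejecting any timestamp that goes backwards."""
    previous = None
    for triple, iso_time in pairs:
        timestamp = datetime.fromisoformat(iso_time)
        if previous is not None and timestamp < previous:
            raise ValueError(f"timestamp {iso_time} is earlier than its predecessor")
        previous = timestamp
        yield TimestampedTriple(triple, timestamp)


observations = [
    ((":o1", "rdf:type", "weather:SnowfallObservation"), "2010-02-12T13:34:41"),
    ((":s1", "om-owl:generatedObservation", ":o2"), "2010-02-12T13:36:28"),
    # Same timestamp as the previous pair: allowed, timestamps need not be unique.
    ((":o2", "rdf:type", "weather:WindSpeedObservation"), "2010-02-12T13:36:28"),
]
stream = list(rdf_stream(observations))
```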

Page 24: Challenges, Approaches, and Solutions in Stream Reasoning

MEMO: SPARQL

Page 25: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

Where C-SPARQL Extends SPARQL

Page 26: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

An Example of C-SPARQL Query

Which sensors have observed winds above 50 mph in the last half hour? What is the observed average wind speed?

REGISTER STREAM AvgWindSpeed AS
CONSTRUCT { ?sens w:avgWindSpeed ?avgWindSpeed }
FROM STREAM <.../streams/ssnmeteostream> [RANGE 1h STEP 1h]
WHERE {
  SELECT ?sens (AVG(?v) AS ?avgWindSpeed)
  WHERE {
    ?sens om-owl:generatedObservation ?o .
    ?o a weather:WindSpeedObservation .
    ?o om-owl:result ?r .
    ?r om-owl:floatValue ?v .
  }
  GROUP BY ?sens
  HAVING (?avgWindSpeed > 5)
}
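To see what the inner sub-query computes on a single window, the following Python sketch emulates its GROUP BY, AVG and HAVING steps. It is an illustration under assumptions: the function name `avg_wind_speed`, the (sensor, value) pairs and the sample readings are invented here, and the threshold 5 is simply the one written in the HAVING clause.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def avg_wind_speed(window: Iterable[Tuple[str, float]], threshold: float) -> Dict[str, float]:
    """Group readings by sensor, average them, keep averages above the threshold."""
    readings = defaultdict(list)
    for sensor, speed in window:          # GROUP BY ?sens
        readings[sensor].append(speed)
    averages = {sensor: mean(speeds) for sensor, speeds in readings.items()}  # AVG(?v)
    return {sensor: avg for sensor, avg in averages.items() if avg > threshold}  # HAVING


# Hypothetical content of one window: (sensor, wind speed value).
window = [(":s1", 4.0), (":s1", 8.0), (":s2", 3.0), (":s2", 5.0)]
result = avg_wind_speed(window, threshold=5)
```

In the registered query this computation is repeated for every window, and each result is emitted on the output stream as `w:avgWindSpeed` triples.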

Page 27: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

An Example of C-SPARQL Query

[The query of Page 26, annotated with the following callouts]

• REGISTER STREAM … AS: query registration (for continuous execution); RDF stream added as a new output format
• FROM STREAM clause
• [RANGE 1h STEP 1h]: WINDOW
• SPARQL 1.1 features: sub-queries, aggregates

Page 29: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

FROM STREAM Clause - Types of Window

• physical: a given number of triples

• logical: a variable number of triples which occur during a given time interval (e.g., 1 hour)
– Sliding: they are progressively advanced by a given STEP (e.g., 5 minutes)
– Tumbling: they are advanced by exactly their time interval
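The difference between sliding and tumbling logical windows can be made concrete with a small Python sketch. It rests on assumptions made for the example: `logical_windows` is a hypothetical helper, timestamps are plain integers (say, minutes), and a window covers the half-open interval from `start` up to but excluding `start + range_`. A tumbling window is the special case where the step equals the range.

```python
from typing import Iterator, List, Tuple


def logical_windows(timestamps: List[int], range_: int,
                    step: int) -> Iterator[Tuple[int, int, List[int]]]:
    """Yield (start, end, contents) for windows of width `range_` advanced by `step`."""
    if not timestamps:
        return
    start, last = timestamps[0], timestamps[-1]
    while start <= last:
        end = start + range_
        yield start, end, [t for t in timestamps if start <= t < end]
        start += step


ts = [0, 10, 20, 30, 40, 50]
tumbling = [content for _, _, content in logical_windows(ts, range_=30, step=30)]
sliding = [content for _, _, content in logical_windows(ts, range_=30, step=10)]
```

With the tumbling setting every element falls into exactly one window; with the sliding setting consecutive windows overlap, so the same element is seen several times.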

Page 30: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

Efficiency of Evaluation [IEEE-IS2010]

• window-based selection of C-SPARQL outperforms the standard FILTER-based selection

Page 32: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

Algebraic optimizations of C-SPARQL [EDBT2010]

• Several transformations can be applied to the algebraic representation of C-SPARQL

• some recall well-known results from classical relational optimization
– push of FILTERs and projections

• some are more specific to the domain of streams
– push of aggregates

Page 33: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

Algebraic optimizations of C-SPARQL [EDBT2010]

• Push of filters and projections

[Figure: time in ms (axis from 0 to 125) against window size (10 to 100000) for four configurations: None, Static Only, Streaming Only, Both]

Page 35: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

Complex Event Detection as stream compositions

• e.g., continuous detection of blizzards by analyzing multiple streams of data generated by weather sensors spread across a continental area

Blizzard: a severe snowstorm with high winds and low visibility lasting at least three hours

Page 36: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

Complex Event Detection as stream compositions

[Figure: Linked Sensor Data enters through an adapter into a network of C-SPARQL queries. The queries Count Snow Fall, AVG Wind Speed and AVG Temp each use a [1 HOUR][TUMBLING] window; the Blizzard query uses a [3 HOURS][STEP 1 HOUR] window.]

Page 37: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

Complex Event Detection as stream compositions

snowfall + strong winds + low temp → blizzards

Live demonstration at http://streamreasoning.org/demos/blizzard-detection

Page 38: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements High Throughputs [JWS2012a]

Page 40: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

Where's the Reasoning?

Example: can we measure the impact of a tweet? Twitter allows two traceable ways of discussing a tweet:

reply: a user replies to a tweet of another user (it always retweets the original tweet)

retweet: a user propagates an interesting tweet to his/her followers

For example

[Figure: tweets t1 to t8 linked by reply and retweet edges, placed on a timeline from 50 min ago to now in 10-minute steps]

Page 41: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

Example of C-SPARQL and Reasoning 1/2

What impact have I been creating with my tweets in the last hour? Let's count them …

REGISTER STREAM OpinionSpreading COMPUTED EVERY 30s AS
SELECT ?tweet (COUNT(?tweet) AS ?impact)
FROM STREAM <http://ex.org> [RANGE 60m STEP 10m]
WHERE {
  :t1 sr:discuss ?tweet
}

:reply rdfs:subPropertyOf :discuss .
:retweet rdfs:subPropertyOf :discuss .
:discuss a owl:TransitiveProperty .

[Figure: the tweet graph of Page 40, where every reply and retweet edge is also labelled discuss]

Answer: 7!
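The count only comes out as 7 if the reasoner applies both the sub-property axioms and the transitivity of `:discuss`. The Python sketch below shows that inference on a small graph. The edge list is hypothetical: the exact edges of the slide's figure are not recoverable from the transcript, so seven reply/retweet edges over t1 to t8 were chosen here, and `discussed_by` is a name invented for the example.

```python
from typing import Dict, List, Set, Tuple

# rdfs:subPropertyOf axioms from the slide.
SUB_PROPERTIES = {"reply": "discuss", "retweet": "discuss"}


def discussed_by(root: str, triples: List[Tuple[str, str, str]]) -> Set[str]:
    """Return every tweet linked to `root` by discuss, directly or transitively."""
    # Step 1 (RDFS): every reply or retweet edge is also a discuss edge.
    graph: Dict[str, List[str]] = {}
    for subject, prop, obj in triples:
        if SUB_PROPERTIES.get(prop, prop) == "discuss":
            graph.setdefault(subject, []).append(obj)
    # Step 2 (owl:TransitiveProperty): follow discuss edges to any depth.
    reached: Set[str] = set()
    frontier = [root]
    while frontier:
        node = frontier.pop()
        for successor in graph.get(node, []):
            if successor not in reached:
                reached.add(successor)
                frontier.append(successor)
    return reached


# Hypothetical edges, not the ones drawn on the slide.
edges = [
    ("t1", "retweet", "t3"), ("t3", "reply", "t5"), ("t5", "reply", "t8"),
    ("t1", "reply", "t2"), ("t2", "reply", "t4"), ("t4", "retweet", "t7"),
    ("t3", "reply", "t6"),
]
impact = len(discussed_by("t1", edges))
```

Without reasoning the stream contains no `discuss` triple at all, so the query pattern would match nothing; with the sub-property axioms alone it would match only the tweets directly linked to t1.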

Page 42: Challenges, Approaches, and Solutions in Stream Reasoning


Achievements

Our approach [ESWC2010]

• The algorithm
1. deletes all triples (asserted or inferred) that have just expired,
2. computes the entailments derived by the inserts,
3. annotates each entailed triple with an expiration time, and
4. eliminates from the current state all copies of derived triples except the one with the highest timestamp.
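The four steps can be sketched in Python as follows. This is a simplified illustration, not the implementation evaluated in [ESWC2010]. Assumptions made here: the only rule is transitivity of a single property, an inferred triple expires when its shortest-lived premise does, a triple is valid strictly before its expiration time, and the sketch re-joins the whole state instead of starting only from the inserts. The names `slide`, `Edge` and `State` are hypothetical.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]   # (subject, object) of one transitive property, e.g. :discuss
State = Dict[Edge, int]  # triple -> expiration time


def slide(state: State, inserts: State, now: int) -> State:
    """Advance the window to `now` and add `inserts` (triple -> expiration time)."""
    # 1. Drop every triple, asserted or inferred, that has expired.
    current = {edge: exp for edge, exp in state.items() if exp > now}
    # New asserted triples arrive already stamped with their expiration time.
    for edge, exp in inserts.items():
        current[edge] = max(exp, current.get(edge, exp))
    # 2. Compute entailments until nothing changes.
    changed = True
    while changed:
        changed = False
        for (a, b), exp_ab in list(current.items()):
            for (c, d), exp_cd in list(current.items()):
                if b != c or a == d:
                    continue
                # 3. Annotate the entailed triple with an expiration time.
                exp = min(exp_ab, exp_cd)
                # 4. Keep only the copy with the highest expiration time.
                if (a, d) not in current or current[(a, d)] < exp:
                    current[(a, d)] = exp
                    changed = True
    return current


first = slide({}, {("t1", "t2"): 60, ("t2", "t3"): 70}, now=0)
second = slide(first, {("t3", "t4"): 90}, now=60)
```

In the second call the inferred triple (t1, t3) disappears in step 1 because it carries expiration time 60, without re-deriving anything; this is the point of annotating entailments with expiration times.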

Page 43: Challenges, Approaches, and Solutions in Stream Reasoning

Implementation

Comparative Evaluation on Materialization

• baseline: re-computing the materialization from scratch
• state of the art [Ceri1994,Volz2005]
• our approach [ESWC2010]

[Figure: time in ms (axis from 10 to 10000) against the % of the materialization changed when the window slides (0% to 20%), for the series incremental-volz and incremental-stream]

Page 44: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

Comparative Evaluation on Query Answering

• comparison of the average time needed to answer a C-SPARQL query using
– a backward reasoner
– the naive approach of re-computing the materialization
– our approach

Average time (ms):

                  backward reasoning   naive approach   incremental-stream
query             5.82                 1.61             1.61
materialization   0                    15.91            0.28

Page 46: Challenges, Approaches, and Solutions in Stream Reasoning

Achievements

Streaming Linked Data Framework [JWS2012a]

Features
• Accessing raw data streams from C-SPARQL
• Publishing streams and C-SPARQL query results as Linked Data
• Connecting C-SPARQL queries in a network
• Recording and replaying portions of streams
• Supporting fast prototyping of applications

Page 47: Challenges, Approaches, and Solutions in Stream Reasoning

Applications

Location Based Social Media Analytics [JWS2012b]

Page 48: Challenges, Approaches, and Solutions in Stream Reasoning

Applications

Location Based Social Media Analytics [JWS2012b]

Live demonstration at http://streamreasoning.org/demos/bottari

http://youtu.be/XGOKe_lhSks

Page 49: Challenges, Approaches, and Solutions in Stream Reasoning

Applications

Blizzard Detection [JWS2012a]

Page 50: Challenges, Approaches, and Solutions in Stream Reasoning

Applications

Blizzard Detection [JWS2012a]

Live demonstration at http://streamreasoning.org/demos/blizzard-detection

Page 51: Challenges, Approaches, and Solutions in Stream Reasoning

Applications

Hurricane Detection [JWS2012a]

Page 52: Challenges, Approaches, and Solutions in Stream Reasoning

Applications

Hurricane Detection [JWS2012a]

Live demonstration at http://streamreasoning.org/demos/hurricane-detection

Page 53: Challenges, Approaches, and Solutions in Stream Reasoning

You can try C-SPARQL out!

A working prototype is available for download in a "ready to go pack": http://streamreasoning.org/download

The Streaming Linked Data Framework will be released soon too; ask me directly for a pre-release version.

Page 54: Challenges, Approaches, and Solutions in Stream Reasoning

Conclusions

Research Challenges vs. Achievements

• Relation with DSMSs and CEPs
– Notion of RDF stream :-| alternative solutions can be investigated

• Data types and query languages for semantic streams
– C-SPARQL :-D work in progress in FZI&AIFB [1,2], DERI [3], UPM [4]

• Reasoning on Streams
– Theory :-(
– Efficiency :-) work in progress in STI Innsbruck [5]
– Scalability :-| work in progress in IBM&VUA [6]

• Dealing with incomplete & noisy data
– Even more than on the current Web of Data :-( some initial joint work with SIEMENS only [IEEE-IS2010]

• Distributed and parallel processing
– Streams are parallel in nature, … :-| work in progress in IBM&VUA [6]

• Engineering Stream Reasoning Applications
– Development Environment :-) work in progress in UPM [7]
– Integration with other technologies :-)
– Benchmarks :-P work in progress in Planet Data [8]

Page 55: Challenges, Approaches, and Solutions in Stream Reasoning

Credits

• Politecnico di Milano colleagues
– Prof. Stefano Ceri, who had the initial intuition about the value of introducing data streams to the Semantic Web community
– Marco Balduini, Davide Barbieri, Daniele Braga, Stefano Ceri and Michael Grossniklaus, who helped conceive the C-SPARQL Engine and the Streaming Linked Data Framework
– once again Davide Barbieri, who engineered most of the C-SPARQL Engine as part of his PhD
– once again Marco Balduini, who engineered most of the Streaming Linked Data Framework as part of his M.Sc. thesis

• Politecnico di Milano Master students who assisted in the design and development of the prototypes
– Mirko Bratomi and Marco Regaldo

• Colleagues who helped in conceiving, designing, and prototyping the applications
– CEFRIEL: Irene Celino and Daniele Dell'Aglio
– SIEMENS: Yi Huang and Volker Tresp
– Saltlux: Seonho Kim and Tony Lee

Page 56: Challenges, Approaches, and Solutions in Stream Reasoning

References

My papers

[IEEE-IS2009] E. Della Valle, S. Ceri, F. van Harmelen, D. Fensel: It's a Streaming World! Reasoning upon Rapidly Changing Information. IEEE Intelligent Systems 24(6): 83-89 (2009)

[EDBT2010] D.F. Barbieri, D. Braga, S. Ceri, M. Grossniklaus: An Execution Environment for C-SPARQL Queries. EDBT 2010

[WWW2009] D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, M. Grossniklaus: C-SPARQL: SPARQL for continuous querying. WWW 2009: 1061-1062

[SIGMODRec2010] D.F. Barbieri, D. Braga, S. Ceri, M. Grossniklaus: Querying RDF streams with C-SPARQL. SIGMOD Record 39(1): 20-26 (2010)

[IEEE-IS2010] D. Barbieri, D. Braga, S. Ceri, E. Della Valle, Y. Huang, V. Tresp, A. Rettinger, H. Wermser: Deductive and Inductive Stream Reasoning for Semantic Social Media Analytics. IEEE Intelligent Systems, 30 Aug. 2010

[JWS2012a] E. Della Valle, M. Balduini: SLD: a Framework for Streaming Linked Data. JWS 2012, under review

[JWS2012b] M. Balduini, I. Celino, E. Della Valle, D. Dell'Aglio, Y. Huang, T. Lee, S. Kim, V. Tresp: BOTTARI: an Augmented Reality Mobile Application to deliver Personalized and Location-based Recommendations by Continuous Analysis of Social Media Streams. JWS 2012, to appear

[ESWC2010] D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, M. Grossniklaus: Incremental Reasoning on Streams and Rich Background Knowledge. ESWC 2010

Page 57: Challenges, Approaches, and Solutions in Stream Reasoning

References

Other groups' papers

[1] Darko Anicic, Paul Fodor, Sebastian Rudolph, Nenad Stojanovic: EP-SPARQL: a unified language for event processing and stream reasoning. WWW 2011: 635-644

[2] Danh Le Phuoc, Minh Dao-Tran, Josiane Xavier Parreira, Manfred Hauswirth: A Native and Adaptive Approach for Unified Processing of Linked Streams and Linked Data. International Semantic Web Conference (1) 2011: 370-388

[3] D. Anicic, S. Rudolph, P. Fodor, N. Stojanovic: Real-Time Complex Event Recognition and Reasoning - a Logic Programming Approach. Applied Artificial Intelligence 26(1-2): 6-57 (2012)

[4] Jean-Paul Calbimonte, Óscar Corcho, Alasdair J. G. Gray: Enabling Ontology-Based Access to Streaming Data Sources. ISWC (1) 2010: 96-111

[5] S. Komazec and D. Cerri: Towards Efficient Schema-Enhanced Pattern Matching over RDF Data Streams. First International Workshop on Ordering and Reasoning (OrdRing2011)

[6] Jesper Hoeksema, Spyros Kotoulas: High-performance Distributed Stream Reasoning using S4. First International Workshop on Ordering and Reasoning (OrdRing2011)

[7] A.J.G. Gray, R. Garcia-Castro, K. Kyzirakos, M. Karpathiotakis, J. Calbimonte, K.R. Page, J. Sadler, A. Frazer, I. Galpin, A.A.A. Fernandes, N.W. Paton, O. Corcho, M. Koubarakis, D. De Roure, K. Martinez, A. Gómez-Pérez: A Semantically Enabled Service Architecture for Mashups over Streaming and Stored Data. ESWC (2) 2011: 300-314

[8] PlanetData: D1.2 Benchmarking RDF Storage Engines. http://wiki.planet-data.eu/web/D1.2

Page 58: Challenges, Approaches, and Solutions in Stream Reasoning

References

Background papers

[Henzinger98] M. R. Henzinger, P. Raghavan: Computing on data streams. Systems Research (1998)

[Ceri1994] Stefano Ceri, Jennifer Widom: Deriving Incremental Production Rules for Deductive Data. Inf. Syst. 19(6): 467-490 (1994)

[Volz2005] Raphael Volz, Steffen Staab, Boris Motik: Incrementally Maintaining Materializations of Ontologies Stored in Logic Databases. J. Data Semantics 2: 1-34 (2005)

[Cugola2011] Alessandro Margara, Gianpaolo Cugola: Processing flows of information: from data stream to complex event processing. DEBS 2011: 359-360

Page 59: Challenges, Approaches, and Solutions in Stream Reasoning

Thank You! Questions?

Much More to Come! Keep an eye on http://www.streamreasoning.org