challenges, approaches, and solutions in stream reasoning
DESCRIPTION
Challenges, Approaches, and Solutions in Stream Reasoning http://streamreasoning.org. Emanuele Della Valle DEI - Politecnico di Milano [email protected] http://emanueledellavalle.org. Agenda. Motivation Concept Achievements Applications Conclusions. - PowerPoint PPT PresentationTRANSCRIPT
Emanuele Della Valle - visit http://streamreasoning.org
Challenges, Approaches, and Solutions in
Stream Reasoninghttp://streamreasoning.org
Emanuele Della Valle DEI - Politecnico di Milano
[email protected]://emanueledellavalle.org
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Agenda• Motivation• Concept • Achievements• Applications• Conclusions
Stavanger, 2012-5-9 2
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
MotivationIt‘s a streaming World! [IEEE-IS2009] 1/3
• Oil operations
• Traffic
• Financial markets
• Social networks
• Generate data streams!
Stavanger, 2012-5-9 3
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
MotivationIt‘s a streaming World! [IEEE-IS2009] 2/4
• … and want to analyse data streams in real time
• In a well in progress to drown, how long time do I have givenits historical behavior?
• Is public transportation where the people are?
• Can we detect any intra-daycorrelation clusters among stock exchanges?
• Who is driving the discussion about the top 10 emerging topics ?
Stavanger, 2012-5-9 4
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
MotivationIt‘s a streaming World! [IEEE-IS2009] 3/4
• e.g., Real Time Rome (mobile network data streams)
afte
rnoo
nev
enin
gNormal day Exceptional day
[sou
rce:
http
://se
nsea
ble.
mit.
edu/
real
timer
ome/
]
Stavanger, 2012-5-9 5
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
MotivationIt‘s a streaming World! [IEEE-IS2009] 4/4
• e.g., Pulse of the Nation (social media streams)
happier
12:00
23:00
[sou
rce:
http
://w
ww
.ccs
.neu
.edu
/hom
e/am
islo
ve/tw
itter
moo
d/ ]
Stavanger, 2012-5-9 6
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
MotivationWhat are data streams anyway?
• Formally: – Data streams are unbounded sequences of time-
varying data elements
• Less formally: – an (almost) “continuous” flow of information – with the recent information being more relevant as it
describes the current state of a dynamic system
time
Stavanger, 2012-5-9 7
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
MotivationThe continuous nature of streams
• The nature of streams requires a paradigmatic change*
– from persistent data • to be stored and queried on demand • a.k.a. one time semantics
– to transient data • to be consumed on the fly by continuous queries• a.k.a. continuous semantics
* This paradigmatic change first arose in DB community [Henzinger98]Stavanger, 2012-5-9 8
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Motivation – The continuous nature of streamsContinuous Semantics
• Continuous queries registered over streams that, in most of the cases, are observed trough windows
window
input streams streams of answerRegistered Continuous
Query
Stavanger, 2012-5-9 9
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Motivation – The continuous nature of streamsTools exists [Cugola2011]
• Types– Data Stream Management Systems– Complex Event Processors
• Research Prototypes– Amazon/Cougar (Cornell) – sensors– Aurora (Brown/MIT) – sensor monitoring, dataflow– Gigascope: AT&T Labs – Network Monitoring– Hancock (AT&T) – Telecom streams– Niagara (OGI/Wisconsin) – Internet DBs & XML– OpenCQ (Georgia) – triggers, view maintenance– Stream (Stanford) – general-purpose DSMS– Stream Mill (UCLA) - power & extensibility– Tapestry (Xerox) – publish/subscribe filtering– Telegraph (Berkeley) – adaptive engine for sensors– Tribeca (Bellcore) – network monitoring
• High-tech startups– Streambase, Coral8, Apama, Truviso
• Major DBMS vendors are all adding stream extensions as well– IBM InfoSphere Stream – Microsoft streaminsight– Oracle CEP
Stavanger, 2012-5-9 10
Emanuele Della Valle - visit http://streamreasoning.org
Motivation New Requirements New Challenges
Typical Requirements• Processing Streams• Large datasets• Reactivity• Fine-grained information
access• Modeling complex
application domains
• Continuous semantics• Scalable processing• Real-time systems• Powerful query
languages• Rich ontology
languages
Stavanger, 2012-5-9 11
Emanuele Della Valle - visit http://streamreasoning.org
MotivationAre DSMS/CEP ready to address them?
Typical Requirements• Processing Streams• Large datasets• Reactivity• Fine-grained information
access• Modeling complex
application domains
DSMS/CEP• Continuous semantics• Scalable processing• Real-time systems• Powerful query
languages• Rich ontology
languages
Stavanger, 2012-5-9 12
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Motivation Is Semantic Web ready to address them?
• The Semantic Web, the Web of Data is doing fine– RDF, RDF Schema, SPARQL, OWL, RIF– well understood theory, – rapid increase in scalability
• BUT it pretends that the world is staticor at best a low change rateboth in change-volume and change-frequency– ontology versioning– belief revision– time stamps on named graphs
• It sticks to the traditional one-time semantics
Stavanger, 2012-5-9 13
Emanuele Della Valle - visit http://streamreasoning.org
MotivationNew Requirements New Challenges
Typical Requirements• Processing Streams• Large datasets• Reactivity• Fine-grained information
access• Modeling complex
application domains
Semantic Web• Continuous semantics• Scalable processing• Real-time systems• Powerful query
languages• Rich ontology
languages
Stavanger, 2012-5-9 14
Emanuele Della Valle - visit http://streamreasoning.org
Motivation New Requirements call for Stream Reasoning
Typical Requirements• Processing Streams• Large datasets• Reactivity• Fine-grained information
access• Modeling complex
application domains
• Continuous semantics• Scalable processing• Real-time systems• Powerful query
languages• Rich ontology
languages
Stream Reasoning
Stavanger, 2012-5-9 15
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
ConceptStream Reasoning Definition [IEEE-IS2010]
• Making sense – in real time – of multiple, heterogeneous, gigantic and inevitably
noisy data streams – in order to support the decision process of
extremely large numbers of concurrent user
• Note: making sense of streams necessarily requires processing them against rich background knowledge, an unsolved problem in database
Stavanger, 2012-5-9 16
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Concept Research Challenges
• Relation with DSMSs and CEPs– Just as RDF relates to data-base systems?
• Data types and query languages for semantic streams– Just RDF and SPARQL but with continuous semantics?
• Reasoning on Streams– Theory– Efficiency– Scalability
• Dealing with incomplete & noisy data– Even more than on the current Web of Data
• Distributed and parallel processing– Streams are parallel in nature, …
• Engineering Stream Reasoning Applications– Development Environment– Integration with other technologies– Benchmarks
Stavanger, 2012-5-9 17
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Achievements• RDF Streams
– Notion defined• C-SPARQL
– Syntax and semantics defined as a SPARQL extension– Engine designed and implemented
• Experiments with C-SPARQL under simple RDF entailment regimes– window based selection of C-SPARQL outperforms the standard
FILTER based selection– algebraic optimizations of C-SPARQL queries are possible– Complex event can be detected using a network of C-SPARQL queries
at high throughputs• Experiment with C-SPARQL under RDFS++ entailment regimes
– efficient incremental updates of deductive closures investigated – our approach outperform state-of-the-art when updates comes as
stream• Streaming Linked Data Framework prototyped
Stavanger, 2012-5-9 18
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
AchievementsOutline
• RDF Streams– Notion defined
• C-SPARQL– Syntax and semantics defined as a SPARQL extension– Engine designed and implemented
• Experiments with C-SPARQL under simple RDF entailment regimes– window based selection of C-SPARQL outperforms the standard
FILTER based selection– algebraic optimizations of C-SPARQL queries are possible– Complex event can be detected using a network of C-SPARQL queries
at high throughputs• Experiment with C-SPARQL under RDFS++ entailment regimes
– efficient incremental updates of deductive closures investigated – our approach outperform state-of-the-art when updates comes as
stream• Streaming Linked Data Framework prototyped
Stavanger, 2012-5-9 19
Emanuele Della Valle - visit http://streamreasoning.org
Memo: Sensor Network Ontology
Stavanger, 2012-5-9• 20
Emanuele Della Valle - visit http://streamreasoning.org
Memo: Sensor Network Ontology[ ] streaming part [ ] static part
Stavanger, 2012-5-9 21
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Achievements RDF Stream [WWW2009,EDBT2010,IJSC2010]
• RDF Stream Data Type– Ordered sequence of pairs, where each pair is made of
an RDF triple and its timestamp
Timestamps are not required to be unique, they must be non-decreasing
• E.g.,(< :s1 ssn:generatedObservation :o1 >, 2010-02-12T13:34:41) (< :o1 a weather:SnowfallObservation >, 2010-02-12T13:34:41) (< :s1 om-owl:generatedObservation :o2 >, 2010-02-12T13:36:28)(< :o2 a weather:WindSpeedObservation >, 2010-02-12T13:36:28)(< :o2 ssn:result :a1 >, 2010-02-12T13:36:28)(< :a1 ssn:floatValue "35.4”^^xsd:float >, 2010-02-12T13:36:28)
Stavanger, 2012-5-9 22
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
AchievementsOutline
• RDF Streams– Notion defined
• C-SPARQL– Syntax and semantics defined as a SPARQL extension– Engine designed and implemented
• Experiments with C-SPARQL under simple RDF entailment regimes– window based selection of C-SPARQL outperforms the standard
FILTER based selection– algebraic optimizations of C-SPARQL queries are possible– Complex event can be detected using a network of C-SPARQL queries
at high throughputs• Experiment with C-SPARQL under RDFS++ entailment regimes
– efficient incremental updates of deductive closures investigated – our approach outperform state-of-the-art when updates comes as
stream• Streaming Linked Data Framework prototyped
Stavanger, 2012-5-9 23
Emanuele Della Valle - visit http://streamreasoning.org
MEMO: SPARQL
Stavanger, 2012-5-9 24
Emanuele Della Valle - visit http://streamreasoning.org
AchievementsWhere C-SPARQL Extends SPARQL
Stavanger, 2012-5-9 25
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Achievements An Example of C-SPARQL Query
Which are the sensors that have observing winds above 50 mph in the last half an hour? Which is the observed average wind?
REGISTER STREAM AvgWindSpeed AS CONSTRUCT { ?sens w:avgWindSpeed ?avgWindSpeed } FROM STREAM <.../streams/ssnmeteostream> [RANGE 1h STEP 1h] WHERE { SELECT ?sens (AVG(?v) as ?avgWindSpeed) WHERE { ?sens om-owl:generatedObservation ?o . ?o a weather:WindSpeedObservation . ?o om-owl:result ?r . ?r om-owl:floatValue ?v . } GROUP BY ?sens HAVING (?avgWindSpeed > 5)}
Stavanger, 2012-5-9 26
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Achievements An Example of C-SPARQL Query
Which are the sensors that have observing winds above 50 mph in the last half an hour? Which is the observed average wind?
REGISTER STREAM AvgWindSpeed AS CONSTRUCT { ?sens w:avgWindSpeed ?avgWindSpeed } FROM STREAM <.../streams/ssnmeteostream> [RANGE 1h STEP 1h] WHERE { SELECT ?sens (AVG(?v) as ?avgWindSpeed) WHERE { ?sens om-owl:generatedObservation ?o . ?o a weather:WindSpeedObservation . ?o om-owl:result ?r . ?r om-owl:floatValue ?v . } GROUP BY ?sens HAVING (?avgWindSpeed > 5)}
Query registration(for continuous execution)
RDF Stream added as new ouput format
FROM STREAM clause
WINDOW
SPARQ 1.1 features• Sub-queries• aggregates
Stavanger, 2012-5-9 27
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
AchievementsOutline
• RDF Streams– Notion defined
• C-SPARQL– Syntax and semantics defined as a SPARQL extension– Engine designed and implemented
• Experiments with C-SPARQL under simple RDF entailment regimes– window based selection of C-SPARQL outperforms the standard
FILTER based selection– algebraic optimizations of C-SPARQL queries are possible– Complex event can be detected using a network of C-SPARQL queries
at high throughputs• Experiment with C-SPARQL under RDFS++ entailment regimes
– efficient incremental updates of deductive closures investigated – our approach outperform state-of-the-art when updates comes as
stream• Streaming Linked Data Framework prototyped
Stavanger, 2012-5-9 28
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Achievements FROM STREAM Clause - Types of Window
• physical: a given number of triples• logical: a variable number of triples which occur during a
given time interval (e.g., 1 hour)– Sliding: they are progressively advanced of
a given STEP (e.g., 5 minutes)
– Tumbling: they are advanced of exactly their time interval
Stavanger, 2012-5-9 29
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Achievements Efficiency of Evaluation [IEEE-IS2010]
• window based selection of C-SPARQL outperforms the standard FILTER based selection
Stavanger, 2012-5-9 30
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
AchievementsOutline
• RDF Streams– Notion defined
• C-SPARQL– Syntax and semantics defined as a SPARQL extension– Engine designed and implemented
• Experiments with C-SPARQL under simple RDF entailment regimes– window based selection of C-SPARQL outperforms the standard
FILTER based selection– algebraic optimizations of C-SPARQL queries are possible– Complex event can be detected using a network of C-SPARQL queries
at high throughputs• Experiment with C-SPARQL under RDFS++ entailment regimes
– efficient incremental updates of deductive closures investigated – our approach outperform state-of-the-art when updates comes as
stream• Streaming Linked Data Framework prototyped
Stavanger, 2012-5-9 31
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Achievements Algebraic optimizations of C-SPARQL [EDBT2010]
• Several transformations can be applied to algebraic representation of C-SPARQL
• some recalling well known results from classical relational optimization– push of FILTERs and projections
• some being more specific to the domain of streams.– push of aggregates.
Stavanger, 2012-5-9 32
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Achievements algebraic optimizations of C-SPARQL [EDBT2010]
• Push of filters and projections
0
25
50
75
100
125
10 100 1000 10000 100000
ms
Window Size
None Static Only Streaming Only Both
Stavanger, 2012-5-9 33
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
AchievementsOutline
• RDF Streams– Notion defined
• C-SPARQL– Syntax and semantics defined as a SPARQL extension– Engine designed and implemented
• Experiments with C-SPARQL under simple RDF entailment regimes– window based selection of C-SPARQL outperforms the standard
FILTER based selection– algebraic optimizations of C-SPARQL queries are possible– Complex event can be detected using a network of C-SPARQL queries
at high throughputs• Experiment with C-SPARQL under RDFS++ entailment regimes
– efficient incremental updates of deductive closures investigated – our approach outperform state-of-the-art when updates comes as
stream• Streaming Linked Data Framework prototyped
Stavanger, 2012-5-9 34
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Achievements Complex Event Detection as stream compositions
• e.g., continuous detection of blizzards by analyzing multiple streams of data generated by weather sensors spread across a continental area
Blizzard: a severe snowstorm with high winds and low visibility lasting at least three hours
Stavanger, 2012-5-9 35
Emanuele Della Valle - visit http://streamreasoning.org
AchievementsComplex Event Detection as stream compositions
Linked Sensor Data
Adapter
C-SPARQL QueryLE
GE
ND
Most Liked POIs
Q
Count Snow Fall
Q [1 HOUR][TUMBLING]
AVG Wind Speed
Q
AVG Temp
Q
[1 HOUR][TUMBLING]
[1 HOUR][TUMBLING]
[3 HOURS][STEP 1 HOUR]
Stavanger, 2012-5-9 36
Emanuele Della Valle - visit http://streamreasoning.org
AchievementsComplex Event Detection as stream compositions
snowfall + strong winds+ low temp blizzards
Live demonstration at http://streamreasoning.org/demos/blizzard-detection
Stavanger, 2012-5-9 37
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Achievements High Throughputs [JWS2012a]
Stavanger, 2012-5-9 38
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
AchievementsOutline
• RDF Streams– Notion defined
• C-SPARQL– Syntax and semantics defined as a SPARQL extension– Engine designed and implemented
• Experiments with C-SPARQL under simple RDF entailment regimes– window based selection of C-SPARQL outperforms the standard
FILTER based selection– algebraic optimizations of C-SPARQL queries are possible– Complex event can be detected using a network of C-SPARQL queries
at high throughputs• Experiment with C-SPARQL under RDFS++ entailment regimes
– efficient incremental updates of deductive closures investigated – our approach outperform state-of-the-art when updates comes as
stream• Streaming Linked Data Framework prototyped
Stavanger, 2012-5-9 39
Emanuele Della Valle - visit http://streamreasoning.org
AchievementsWhere’s the Reasoning?
Example: can we measure the the impact of a tweet? Twitter allows two traceable ways of discussing a tweet:
reply: a user reply to a tweet of another user (it always retweet the original tweet)
retweet: a user propagates to his/her followers an interesting tweet
For example
t1 t3 t5 t8retweet reply reply
t2 t4 t7
t6
reply reply
retweet
reply
now10 min ago20 min ago30 min ago40 min ago50 min ago
Stavanger, 2012-5-9 40
Emanuele Della Valle - visit http://streamreasoning.org
AchievementsExample of C-SPARQL and Reasoning 1/2
What impact have I been creating with my tweets in the last hour? Let’s count them …REGISTER STREAM OpinionSpreading COMPUTED EVERY 30s AS SELECT ?tweet (count(?tweet) AS ?impact FROM STREAM <http://ex.org> [RANGE 60m STEP 10m] WHERE { :t1 sr:discuss ?tweet}
:reply rdfs:subPropertyOf :discuss .:retweet rdfs:subPropertyOf :discuss .
t1 t3 t5 t8retweet reply reply
t2 t4 t7
t6
reply reply
retweet
replydiscuss discuss discuss
discuss discuss
discuss
discuss
:discuss a owl:TransitiveProperty .
7!Stavanger, 2012-5-9 41
Emanuele Della Valle - visit http://streamreasoning.org
AchievementsOur approach [ESWC2010]
• The algorithm1. deletes all triples (asserted or inferred) that have just
expired2. computes the entailments derived by the inserts,3. annotates each entailed triple with a expiration time,
and4. eliminates from the current state all copies of derived
triples except the one with the highest timestamp.
Stavanger, 2012-5-9 42
Emanuele Della Valle - visit http://streamreasoning.org
ImplementationComparative Evaluation on Materialization
• base-line: re-computing the materialization from scratch• state-of-the-art [Ceri1994,Volz2005]• our approach [ESWC2010]
10
100
1000
10000
0,0% 2,0% 4,0% 6,0% 8,0% 10,0% 12,0% 14,0% 16,0% 18,0% 20,0%
ms.
% of the materialization changed when the window slides
incremental-volz incremental-streamStavanger, 2012-5-9 43
% of the materialization changed when the window slides
Emanuele Della Valle - visit http://streamreasoning.org
forward reasoning naive approach incremental-streamquery 5,82 1,61 1,61materialization 0 15,91 0,28
0
5
10
15
20
ms.
AchievementsComparative Evaluation on Query Answering
• comparison of the average time needed to answera C-SPARQL query using – backward reasoner– the naive approach of re-computing the materialization– our approach
Backward reasoning
Stavanger, 2012-5-9 44
Emanuele Della Valle - visit http://streamreasoning.org
AchievementsOutline
• RDF Streams– Notion defined
• C-SPARQL– Syntax and semantics defined as a SPARQL extension– Engine designed and implemented
• Experiments with C-SPARQL under simple RDF entailment regimes– window based selection of C-SPARQL outperforms the standard
FILTER based selection– algebraic optimizations of C-SPARQL queries are possible– Complex event can be detected using a network of C-SPARQL queries
at high throughputs• Experiment with C-SPARQL under RDFS++ entailment regimes
– efficient incremental updates of deductive closures investigated – our approach outperform state-of-the-art when updates comes as
stream• Streaming Linked Data Framework prototyped
Stavanger, 2012-5-9 45
Emanuele Della Valle - visit http://streamreasoning.org
Achievements
Streaming Linked Data Framework [JWS2012a]
Features• Accessing raw data
stream from C-SPARQL
• Publishing streams and C-SPARQL query results as Linked Data
• Connecting C-SPARQL queries in a network
• Recoding and replaying portions of stream
• Supporting fast prototyping of applications
Stavanger, 2012-5-9 46
Emanuele Della Valle - visit http://streamreasoning.org
ApplicationsLocation Based Social Media Analytics [JWS2012b]
Stavanger, 2012-5-9 47
Emanuele Della Valle - visit http://streamreasoning.org
ApplicationsLocation Based Social Media Analytics [JWS2012b]
Live demonstration at http://streamreasoning.org/demos/bottari
http://youtu.be/XGOKe_lhSks
Stavanger, 2012-5-9 48
Emanuele Della Valle - visit http://streamreasoning.org
ApplicationsBlizzard Detection [JWS2012a]
Stavanger, 2012-5-9 49
Emanuele Della Valle - visit http://streamreasoning.org
ApplicationsBlizzard Detection [JWS2012a]
Live demonstration at http://streamreasoning.org/demos/blizzard-detection
Stavanger, 2012-5-9 50
Emanuele Della Valle - visit http://streamreasoning.org
ApplicationsHurricane Detection [JWS2012a]
Stavanger, 2012-5-9 51
Emanuele Della Valle - visit http://streamreasoning.org
ApplicationsHurricane Detection [JWS2012a]
Live demonstration at http://streamreasoning.org/demos/hurricane-detection
Stavanger, 2012-5-9 52
Emanuele Della Valle - visit http://streamreasoning.org
You can try C-SPARQL out!
Working prototype available for download in a “ready to go pack” http://streamreasoning.org/download
The Streaming Linked Data Framework will be soon released too, ask me directly for a pre-release version.
Stavanger, 2012-5-9 53
Emanuele Della Valle - visit http://streamreasoning.org
ConclusionsResearch Challenges vs. Achievements Relation with DSMSs and CEPs
Notion of RDF stream :-| alternative solutions can be investigated Data types and query languages for semantic streams
C-SPARQL :-D work in progress in FZI&AIFB [1,2] DERI [3], UPM [4] Reasoning on Streams
Theory :-( Efficiency :-) work in progress in ISTI-Innsbruck [5] Scalability :-| work in progress in IBM&VUA [6]
Dealing with incomplete & noisy data Even more than on the current Web of Data :-( some initial joint work
with SIEMENS only [IEEE-IS2010] Distributed and parallel processing
Streams are parallel in nature, … :-| work in progress in IBM&VUA [6] Engineering Stream Reasoning Applications
Development Environment :-) work in progress in UPM [7] Integration with other technologies :-) Benchmarks :-P work in progress in Planet Data [8]
Stavanger, 2012-5-9 54
Emanuele Della Valle - visit http://streamreasoning.org
Credits Politecnico di Milano’s colleagues
Prof. Stefano Ceri who had the initial intuition about the value of introducing data streams to the semantic Web community
Marco Balduini, Davide Barbieri, Daniele Braga, Stefano Ceri and Michael Grossniklaus who helped concieving the C-SPARQL Engine and the Streaming Linked Data Framework
once again to Davide Barbieri who engineered most of the C-SPARQL Engine as part of his PhD
once again to Marco Balduini who engineered most of Streaming Linked Data Framework as part of his M.Sc. Thesis
Politecnico di Milano’s Master students that assisted in the design and development of the prototypes Mirko Bratomi, and Marco Regaldo
Colleagues that in helped in concieving, designing, and prototyping the applications CEFRIEL: Irene Celino, and Danile Dell’Aglio SIEMENS: Yi Huang, and Volker Tresp Saltlux: Seonho Kim, and Tony Lee
Stavanger, 2012-5-9 55
Emanuele Della Valle - visit http://streamreasoning.org
ReferencesMy papers [IEEE-IS2009] E. Della Valle, S. Ceri, F. van Harmelen, D. Fensel
It's a Streaming World! Reasoning upon Rapidly Changing Information. IEEE Intelligent Systems 24(6): 83-89 (2009)
[EDBT2010] D.F. Barbieri, D.Braga, S. Ceri and M. Grossniklaus. An Execution Environment for C-SPARQL Queries. EDBT 2010
[WWW2009] D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, M. Grossniklaus: C-SPARQL: SPARQL for continuous querying. WWW 2009: 1061-1062
[SIGMODRec2010] D.F. Barbieri, D.Braga, S. Ceri and M. Grossniklaus. : Querying RDF streams with C-SPARQL. SIGMOD Record 39(1): 20-26 (2010)
[IEEE-IS2010] D. Barbieri, D. Braga, S. Ceri, E. Della Valle, Y. Huang, V. Tresp, A.Rettinger, H. Wermser: Deductive and Inductive Stream Reasoning for Semantic Social Media Analytics IEEE Intelligent Systems, 30 Aug. 2010.
[JWS2012a] E. Della Valle, M. Balduini: SLD: a Framework for Streaming Linked Data. JWS. 2012 Under Review
[JWS2012b] M. Balduini; I.Celino; E. Della Valle; D.Dell'Aglio; Y. Huang; T. Lee; S. Kim; V. Tresp: BOTTARI: an Augmented Reality Mobile Application to deliver Personalized and Location-based Recommendations by Continuous Analysis of Social Media Streams. JWS. 2012. to appear.
[ESWC2010] D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, M. Grossniklaus. Incremental Reasoning on Streams and Rich Background Knowledge. ESWC 2010
56Stavanger, 2012-5-9
Emanuele Della Valle - visit http://streamreasoning.org
ReferencesOther groups’ papers[1] Darko Anicic, Paul Fodor, Sebastian Rudolph, Nenad Stojanovic: EP-SPARQL: a
unified language for event processing and stream reasoning. WWW 2011: 635-644[2] Danh Le Phuoc, Minh Dao-Tran, Josiane Xavier Parreira, Manfred Hauswirth: A Native
and Adaptive Approach for Unified Processing of Linked Streams and Linked Data. International Semantic Web Conference (1) 2011: 370-388
[3] D. Anicic, S. Rudolph, P. Fodor, N. Stojanovic: Real-Time Complex Event Recognition and Reasoning-a Logic Programming Approach. Applied Artificial Intelligence 26(1-2): 6-57 (2012)
[4] Jean-Paul Calbimonte, Óscar Corcho, Alasdair J. G. Gray: Enabling Ontology-Based Access to Streaming Data Sources. ISWC (1) 2010: 96-111
[5] S. Komazec and D. Cerri: Towards Efficient Schema-Enhanced Pattern Matching over RDF Data Streams. First International Workshop on Ordering and Reasoning (OrdRing2011)
[6] Jesper Hoeksema, Spyros Kotoulas: High-performance Distributed Stream Reasoning using S4. First International Workshop on Ordering and Reasoning (OrdRing2011)
[7] A.J.G. Gray, R.Garcia-Castro, K.Kyzirakos, M.Karpathiotakis, J.Calbimonte, K.R.Page, J.Sadler, A.Frazer, I.Galpin, A.A.A. Fernandes, N.W. Paton, O.Corcho, M.Koubarakis, D.De Roure, K. Martinez, A. Gómez-Pérez: A Semantically Enabled Service Architecture for Mashups over Streaming and Stored Data. ESWC (2) 2011: 300-314
[8] PlanetData. D1.2 Benchmarking RDF Storage Engines. http://wiki.planet-data.eu/web/D1.2
57Stavanger, 2012-5-9
Emanuele Della Valle - visit http://streamreasoning.org
ReferencesBackground papers [Henzinger98] Henzinger, M. R. & Raghavan, P. (1998). Computing on
data streams. Systems Research. [Ceri1994] Stefano Ceri, Jennifer Widom: Deriving Incremental
Production Rules for Deductive Data. Inf. Syst. 19(6): 467-490 (1994) [Volz2005] Raphael Volz, Steffen Staab, Boris Motik: Incrementally
Maintaining Materializations of Ontologies Stored in Logic Databases. J. Data Semantics 2: 1-34 (2005)
[Cugola2011] Alessandro Margara, Gianpaolo Cugola: Processing flows of information: from data stream to complex event processing. DEBS 2011: 359-360
58Stavanger, 2012-5-9
Emanuele Della Valle - visit http://streamreasoning.org Emanuele Della Valle - visit http://streamreasoning.org
Thank You! Questions?
Much More to Come!Keep an eye on
http://www.streamreasoning.org
Stavanger, 2012-5-9 59