Streaming Day: an overview of Stream Reasoning
by: Riccardo Tommasini
Scuola di Ingegneria Industriale e dell’Informazione Computer Science and Engineering
Master Degree Thesis – Riccardo Tommasini
Agenda
Get in Touch
Background
Stream Reasoning
SR Example
Heaven
GiT - Riccardo Tommasini
Master's Degree in Computer Science @ Politecnico di Milano
Master's thesis on Stream Reasoning
I will start my PhD in November 2015
GiT - Research Topic & Areas of Interest
RDF Stream Processing
• Stream Reasoning @ CEP
• Techniques and Methods for Stream Reasoners Benchmarking
Software Engineering
• RESTful API
• Software Testing
• Programming Languages
GiT - Stream Reasoning Research Group
Emanuele Della Valle, Advisor
Daniele Dell’Aglio, PhD
Marco Balduini, PhD
Background - Semantic Web
The Semantic Web is a WWW extension.
It provides a common framework to allow interoperability among applications.
The Semantic Web world involves several technologies.
Background - RDF
Let I, B and L be three pairwise disjoint sets, defined as IRIs, Blank Nodes and Literals, respectively. A triple
(s, p, o) ∈ (I ∪ B) × I × (I ∪ B ∪ L) is an RDF triple, where s is the subject, p the predicate and o the object, while a set of RDF triples is called an RDF graph.
RDF describes a conceptual model of information in any given domain.
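The RDF triple definition can be read as a simple membership check. The sketch below is only an illustration in Python: the class names IRI, BlankNode and Literal, the helper is_rdf_triple and the example.org identifiers are assumptions made here, not part of any RDF library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


def is_rdf_triple(s, p, o) -> bool:
    """Check that (s, p, o) is in (I ∪ B) × I × (I ∪ B ∪ L)."""
    return (
        isinstance(s, (IRI, BlankNode))              # subject: IRI or blank node
        and isinstance(p, IRI)                       # predicate: IRI only
        and isinstance(o, (IRI, BlankNode, Literal)) # object: any term
    )


# Hypothetical terms, used only for illustration.
alice = IRI("http://example.org/Alice")
is_in = IRI("http://example.org/isIn")
red_room = IRI("http://example.org/RedRoom")

# An RDF graph is a set of RDF triples.
graph = {(alice, is_in, red_room)}
```

A literal in subject position, or a blank node in predicate position, fails the check, which is exactly what the Cartesian product in the definition rules out.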
Background - OWL
• Web Ontology Language (OWL) is a language for writing ontologies for the Web
• An Ontology is a specification of a conceptualisation (Tom Gruber)
• OWL extends RDF, allowing more to be specified about properties and classes
• OWL extends RDF, enabling reasoning:
  • Check the logical correctness of statements
  • Infer implied statements w.r.t. a set of inference rules
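To make "infer implied statements w.r.t. a set of inference rules" concrete, here is a minimal forward-chaining sketch in Python. It applies only two RDFS-style rules (type propagation along rdfs:subClassOf and property propagation along rdfs:subPropertyOf), which is far less than an OWL reasoner does. The function name infer and the classes :Sensor, :Device and :Thing are hypothetical.

```python
def infer(triples):
    """Forward-chain two RDFS-style rules until nothing new is derived."""
    closure = set(triples)
    while True:
        new = set()
        for (s, p, o) in closure:
            for (s2, p2, o2) in closure:
                # (x rdf:type C), (C rdfs:subClassOf D) => (x rdf:type D)
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and o == s2:
                    new.add((s, "rdf:type", o2))
                # (x p y), (p rdfs:subPropertyOf q) => (x q y)
                if p2 == "rdfs:subPropertyOf" and p == s2:
                    new.add((s, o2, o))
        if new <= closure:  # fixpoint: every derived triple is already known
            return closure
        closure |= new


# Hypothetical data, for illustration only.
asserted = {
    (":RedSensor", "rdf:type", ":Sensor"),
    (":Sensor", "rdfs:subClassOf", ":Device"),
    (":Device", "rdfs:subClassOf", ":Thing"),
}
inferred = infer(asserted) - asserted
```

Here inferred contains the two implied statements that :RedSensor is a :Device and a :Thing. The nested loop is quadratic per pass and is meant for reading, not for performance.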
Background - SPARQL
SPARQL Protocol and RDF Query Language. A query has 3 main parts:
• Query form:
  • CONSTRUCT query: used to provide an RDF graph created directly from the results of the query.
  • SELECT query: used to extract a set of variables and their matching values, returned in table format as a set of mappings.
• Dataset clause: FROM or FROM NAMED
• WHERE clause: provides the graph pattern to match against the data graph.
Background - C-SPARQL
The C-SPARQL language extends SPARQL in all 3 parts of a query:
• Query form -> STREAM clause, to create an RDF stream as the query result
• Dataset clause -> FROM STREAM clause, added to let the engine get data from RDF streams specified by URI
• WHERE clause -> built-in timestamp function, to retrieve the timestamp of every single triple in the engine
Note: remember that the semantics changes!
Background - DSMS vs CEP
[Figure: a DSMS evaluating queries (Q) over Stream 1, Stream 2, … Stream n, with Store, Scratch and Throw; a Complex Event Processing Engine between Event Observers and Event Consumers]
DSMS
• Heterogeneous data stream processing
• Data semantics is up to the client
• Continuous query execution
CEP
• Incoming data are notifications of events
• Events are semantically evaluated through rules
• Pub/Sub Model
Source: Processing Flows of Information: From Data Stream to Complex Event Processing - Gianpaolo Cugola & Alessandro Margara
Background - Time Based Window
Tumbling Window
Sliding Window
Window Dimension ω [ms]
Slide Parameter β [ms]
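As a small worked example of the two parameters, the Python sketch below lists the time-based windows that contain a given timestamp. It assumes the usual reading in which a tumbling window has β = ω and a sliding window has β < ω, and it assumes windows open at multiples of β starting from 0. The function name windows_containing is hypothetical.

```python
from typing import List, Tuple


def windows_containing(t_ms: int, omega_ms: int, beta_ms: int) -> List[Tuple[int, int]]:
    """Return the [open, close) windows of width omega that contain t_ms.

    Assumption: window k opens at k * beta and closes at k * beta + omega.
    """
    if omega_ms <= 0 or beta_ms <= 0 or t_ms < 0:
        raise ValueError("omega and beta must be positive, t must be non-negative")
    last_k = t_ms // beta_ms                              # latest window opened at or before t
    first_k = max(0, (t_ms - omega_ms) // beta_ms + 1)    # earliest window still open at t
    return [(k * beta_ms, k * beta_ms + omega_ms) for k in range(first_k, last_k + 1)]


# Tumbling window (beta == omega): each timestamp falls in exactly one window.
tumbling = windows_containing(2500, omega_ms=1000, beta_ms=1000)

# Sliding window (beta < omega): windows overlap, so a timestamp falls in several.
sliding = windows_containing(2500, omega_ms=1000, beta_ms=250)
```

With these values, tumbling is [(2000, 3000)] and sliding holds four overlapping windows, from (1750, 2750) to (2500, 3500).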
Stream Reasoning (SR)
Reasoning upon heterogeneous and rapidly changing information flows.
-- S. Ceri, E. Della Valle, F. van Harmelen and H. Stuckenschmidt, 2010
SR - RSP Engine
RDF Stream Processing Engine
• heterogeneous (unbounded) data streams
• data stream integration through the RDF data model
• continuously infers implied triples w.r.t. an ontology T
• continuous query (Q) answering
SR - RSP Engine Execution Semantics
[Figure: RDF Stream -> S2R Operator -> Mappings -> R2R Operator -> Mappings -> R2S Operator -> RDF Stream]
• S2R Operator: Stream to Relation (Window)
• R2R Operator: Relation to Relation (SPARQL)
• R2S Operator: Relation to Stream (Rstream, Istream, Dstream)
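The three operator families can be sketched as three small functions. This is a toy Python illustration of the S2R / R2R / R2S pipeline, not the implementation of any RSP engine: the function names, the half-open window (now - width, now] and the predicate filter standing in for SPARQL evaluation are all assumptions made here.

```python
def s2r_window(stream, now, width):
    """S2R: keep the items whose timestamp lies in (now - width, now]."""
    return {item for (item, ts) in stream if now - width < ts <= now}


def r2r_filter(relation, predicate):
    """R2R: stand-in for SPARQL evaluation; project triples with one predicate."""
    return {(s, o) for (s, p, o) in relation if p == predicate}


def r2s(previous, current, mode):
    """R2S: turn the result of the latest evaluation back into stream output."""
    if mode == "Rstream":
        return current             # everything in the current result
    if mode == "Istream":
        return current - previous  # only what was inserted since last evaluation
    if mode == "Dstream":
        return previous - current  # only what was deleted since last evaluation
    raise ValueError(f"unknown R2S operator: {mode}")


# Hypothetical timestamped RDF stream: (triple, timestamp).
stream = [
    ((":Alice", ":isIn", ":RedRoom"), 1),
    ((":Bob", ":isIn", ":BlueRoom"), 5),
    ((":Carl", ":isIn", ":RedRoom"), 12),
]

first = r2r_filter(s2r_window(stream, now=10, width=10), ":isIn")
second = r2r_filter(s2r_window(stream, now=20, width=10), ":isIn")
```

Between the two evaluations, Istream reports only Carl's triple, Dstream reports Alice's and Bob's, and Rstream reports the whole second result.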
SR - C-SPARQL Engine
[Figure: RDF Stream -> DSMS (active window) -> Reasoner (T, Q) -> RDF Stream. The C-SPARQL Query is split into a Continuous Query and a SPARQL Query. Legend: Input Triple, Inferred Triple]
Running Example
[Figure: two rooms, BlueRoom and RedRoom, with BlueSensor and RedSensor; people: Alice, Bob, Carl, David, Elena. Legend: RFID, Foursquare, Facebook, "is with"]
Running Example - Which Data?
▪ Four ways to learn who is where

Sensor     Room     Person  Time-stamp
RedSensor  RedRoom  Alice   T1
…          …        …       …

Person  ChecksIn  Time-stamp
Bob     BlueRoom  T2
…       …         …

Person  IsIn     With   Time-stamp
Carl    null     Bob    T2
David   RedRoom  Elena  T3
…       …        …      …
Running Example - Data Model
[Figure: data model split into Streaming Data and Static Data; properties shown: isWith, isConnectedTo]
Running Example - Query
• Data
RDF graph                                                  Time-stamp  Stream
:RedSensor :observes [ :who :Alice ; :where :RedRoom ] .   t1          sensors
:Bob :posts [ :who :Bob ; :where :RedRoom ] .              t2          foursquare
• Query
REGISTER QUERY whoIsInWhichRoom? AS
PREFIX : <http://…/sr4ld2014-onto#>
SELECT ?x ?room ?person
FROM STREAM <http://…/fs> [RANGE 1m STEP 10s]
FROM STREAM <http://…/sensors> [RANGE 1m STEP 10s]
WHERE {
  ?x :observes [ :who ?person ; :where ?room ] .
}
• Results at t2+10s
?x          ?room     ?person
:RedSensor  :RedRoom  :Alice
:Bob        :RedRoom  :Bob
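The second result row is worth a closer look: the query pattern asks for :observes, while Bob's triple uses :posts. One way this row can appear is through reasoning, if the ontology states that :posts is a sub-property of :observes. That axiom is an assumption made for this sketch (the slide does not show the ontology), and the Python below ignores windowing by treating both events as inside the active 1m window.

```python
# ASSUMPTION: ontology axiom ":posts rdfs:subPropertyOf :observes" (not shown on the slide).
SUB_PROPERTY_OF = {":posts": ":observes"}

# Events from the two streams; the blank node is modelled as a dict.
events = [
    (":RedSensor", ":observes", {":who": ":Alice", ":where": ":RedRoom"}),  # sensors, t1
    (":Bob", ":posts", {":who": ":Bob", ":where": ":RedRoom"}),             # foursquare, t2
]


def with_inferred(window):
    """Add the triples implied by the assumed sub-property axiom."""
    out = list(window)
    for (subject, prop, node) in window:
        parent = SUB_PROPERTY_OF.get(prop)
        if parent is not None:
            out.append((subject, parent, node))
    return out


def who_is_in_which_room(window):
    """Match ?x :observes [ :who ?person ; :where ?room ] and project ?x ?room ?person."""
    return [
        (x, node[":where"], node[":who"])
        for (x, prop, node) in window
        if prop == ":observes"
    ]


without_reasoning = who_is_in_which_room(events)
with_reasoning = who_is_in_which_room(with_inferred(events))
```

Without the inference step only the :RedSensor row matches; with it, both rows of the results table are produced.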
Heaven - Research Question
Can we enable a Systematic Comparative Research Approach for RSP Engines, exploiting existing queries, datasets and metrics?
My contributions are:
• Test Stand
Heaven - Test Stand
Evaluate engines with Test Stands
In Aerospace engineering…
Experimental Environment
Evaluation of running systems
Reproducibility, Repeatability, Comparability
Heaven - Test Stand
[Figure: TestStand architecture with components Streamer, RSPEngine, ResultCollector, Disk and Analyser; an Experiment runs between Start and Stop, with MB labels on the diagram]
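The test stand idea (stream data into an engine, collect results and measurements, analyse them afterwards) can be sketched in a few lines of Python. This is not Heaven's actual API: the EchoEngine class, the process method and the run_experiment function are hypothetical stand-ins chosen here, and memory is measured with Python's tracemalloc only as a rough proxy.

```python
import time
import tracemalloc


class EchoEngine:
    """Hypothetical stand-in for an RSP engine: it returns what it receives."""

    def process(self, event):
        return [event]


def run_experiment(engine, events):
    """Feed events to the engine one by one and record latency and memory."""
    results = []
    tracemalloc.start()
    try:
        for event in events:                      # Streamer role
            start = time.perf_counter()
            output = engine.process(event)        # RSPEngine role
            latency_ms = (time.perf_counter() - start) * 1000.0
            _current, peak = tracemalloc.get_traced_memory()
            results.append({                      # ResultCollector role
                "event": event,
                "output": output,
                "latency_ms": latency_ms,
                "peak_bytes": peak,
            })
    finally:
        tracemalloc.stop()
    return results


records = run_experiment(EchoEngine(), ["e1", "e2", "e3"])
```

Keeping the streamer, the engine and the collector behind one loop is what makes runs repeatable: the same events and the same engine configuration yield records that can be compared across engines.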
Heaven - Analyser
I developed a layered investigation method, which tries to answer different possible questions about an RSP Engine:
L0 - How to choose an engine?
L1 - What distinguishes an engine?
L2 - When to choose an engine?
L3 - Why choose this engine?
Note: causality between the levels; it would be nice to always be able to say which engine is best.
Heaven - Research Question
Can we enable a Systematic Comparative Research Approach for RSP Engines, exploiting existing queries, datasets and metrics?
My contributions are:
• Test Stand
• Baselines
• Method
• Analysis
Heaven - Dashboard Example
[Figure: dashboard with four panels, each plotting Memory (MB) and Latency (ms) against increasing Window Dimension (ms)]
Heaven - Pattern Identification Example
[Figure from the thesis, Section 6.3 SOAK Test Evaluation Results: memory grids for (a) Graph Naive and (b) Graph Incremental, indexed by number of triples and window slots, each ranging over 1, 10, 100, 1000, 10000]
Table 6.11 – The figure shows the representation in the time domain of memory for GN (a) and GI (b).
Note: delete "graph", keep "naive".
Heaven - Visual Comparison Example
Thank You!
Contact
RiccardoTommasini+
@rictomm
tomma156
[email protected]
Riccardo Tommasini
riccardotommasini
Resources
Streamreasoning.org
StreamReasoning@GitHub
RDF Stream Processors
PhD CEP Course @Polimi
Stream Reasoning Tutorial
Esper
Jena
C-SPARQL Engine
• Quick start available
• Source code is released as open source under Apache 2.0
• https://github.com/streamreasoning/CSPARQL-engine
• https://github.com/streamreasoning/CSPARQL-ReadyToGoPack