ACQUA: APPROXIMATE CONTINUOUS QUERY ANSWERING OVER STREAMS AND DYNAMIC LINKED DATA SETS
Emanuele Della Valle
DEIB - Politecnico di Milano
http://emanueledellavalle.org
@manudellavalle
Schloss Dagstuhl, Germany - 26 June 2017
Stream Processing in a Nutshell
[Diagram labels: Stream data, Windows, Stream Processing Engine, Results]
Register query once and execute it continuously
Web Stream Processing
[Diagram labels: Web Streams, Windows, Join, Linked Data (Web), Stream Processing Engine, Results]
High Latency
Rate Limits
Losing Reactiveness
RDF Stream Processing (RSP) Engine
[Diagram labels: RDF Streams, Windows, Join, SPARQL endpoint (Web), RSP engine, Results]
An example
The clothing brand ACME wants to persuade influential social network
users to post commercial endorsements.
Every minute, give me the IDs of the users who were mentioned on the
social network in the last 10 minutes and whose number of followers is
greater than 100,000.
REGISTER STREAM <:InfluencersToContact> AS
CONSTRUCT {?user a :influentialUser}
FROM NAMED WINDOW W ON S [RANGE 10m STEP 1m]
WHERE {
WINDOW W {?user :hasMentions ?mentionsNumber}
SERVICE BKG {?user :hasFollowers ?followerCount }
FILTER (?followerCount > 100000)
}
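What the registered query computes can be sketched in plain Python. This is only an illustration under stated assumptions, not how an RSP engine is implemented: the stream is modelled as a sorted list of (timestamp in seconds, user) mention events, the background data as an in-memory dictionary of follower counts, and the names `run`, `influencers`, `WINDOW_SECONDS`, `STEP_SECONDS` and `FOLLOWER_THRESHOLD` are hypothetical.

```python
from collections import deque

WINDOW_SECONDS = 10 * 60      # mirrors RANGE 10m
STEP_SECONDS = 60             # mirrors STEP 1m
FOLLOWER_THRESHOLD = 100_000  # mirrors the FILTER condition


def influencers(window_mentions, followers):
    """Join the users mentioned in the window with follower counts, then filter."""
    users = {user for _, user in window_mentions}
    return {u for u in users if followers.get(u, 0) > FOLLOWER_THRESHOLD}


def run(stream, followers, end_time):
    """Evaluate every STEP_SECONDS over the mentions of the last WINDOW_SECONDS.

    `stream` is a list of (timestamp_seconds, user) pairs sorted by timestamp.
    Returns a list of (evaluation_time, set_of_users).
    """
    window = deque()
    results = []
    i = 0
    for now in range(STEP_SECONDS, end_time + 1, STEP_SECONDS):
        # Add the mentions that arrived up to the current evaluation time.
        while i < len(stream) and stream[i][0] <= now:
            window.append(stream[i])
            i += 1
        # Evict mentions that fell out of the window (now - 10m, now].
        while window and window[0][0] <= now - WINDOW_SECONDS:
            window.popleft()
        results.append((now, influencers(window, followers)))
    return results


# Illustrative data, not taken from the talk.
followers = {"alice": 250_000, "bob": 900, "carol": 120_000}
stream = [(30, "alice"), (90, "bob"), (700, "carol")]
results = run(stream, followers, end_time=720)
```

In this sketch the follower counts are always fresh. In the setting of the talk they sit behind a remote SERVICE, which is where latency and rate limits come into play.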
Problem Definition
[Diagram labels: RDF Streams, Windows, Join, SPARQL endpoint (Web), Local Replica, RSP engine, Results]
Data become stale if not refreshed
Define Refresh Budget to control reactiveness
Correct vs approximate answer
Problem Definition
[Diagram labels: RDF Streams, Windows, Join, SPARQL endpoint (Web), Local Replica, Maintenance Policy, RSP engine, Results]
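The gap between the correct and the approximate answer can be made concrete with a small Python sketch. It assumes the follower-count example from earlier in the talk. The Jaccard distance used here as the error measure is an assumption for illustration, and all names and values (`answer`, `jaccard_distance`, `remote`, `replica`) are hypothetical.

```python
def answer(window_users, follower_counts, threshold=100_000):
    """Users in the window whose follower count exceeds the threshold."""
    return {u for u in window_users if follower_counts.get(u, 0) > threshold}


def jaccard_distance(a, b):
    """0.0 when the two answers are identical, 1.0 when they are disjoint."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


# Remote data and a local replica copied from it at some earlier time.
remote = {"alice": 250_000, "bob": 90_000, "carol": 120_000}
replica = dict(remote)

# The remote data changes; the replica is not refreshed and becomes stale.
remote["bob"] = 110_000
remote["carol"] = 95_000

window_users = {"alice", "bob", "carol"}
correct = answer(window_users, remote)        # computed on fresh data
approximate = answer(window_users, replica)   # computed on the stale replica
error = jaccard_distance(correct, approximate)

# With a refresh budget of 1, only one replica entry can be refreshed.
replica["bob"] = remote["bob"]
refreshed = answer(window_users, replica)
error_after = jaccard_distance(correct, refreshed)
```

Spending the single refresh on "bob" lowers the error but does not remove it, because "carol" is still stale. Choosing which entries to refresh within the budget is the job of the maintenance policy.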
ACQUA approach
[Diagram labels: Stream data, WINDOW clause, SERVICE clause, JOIN, Local Replica, and three numbered steps: Proposer, Ranker, Maintainer]
C: Candidate set
E: Elected set, the top γ mappings of the Candidate set
WSJ: Filter out mappings that are not involved in the current evaluation
ACQUA: without FILTER clause
ACQUA.F: with FILTER clause
ACQUA [2]: RND, LRU, WBM
ACQUA.F [3]: Filter Update Policy, RND.F, LRU.F, WBM.F
ACQUA.F+/* [5]: LRU.F+, WBM.F+, WBM.F*
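The proposer, ranker and maintainer pipeline can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes that the proposer applies the WSJ idea of keeping only the replica entries involved in the current window, that LRU means "least recently refreshed first", and that γ is the number of refreshes allowed per evaluation. All function and variable names are hypothetical.

```python
def proposer_wsj(window_users, replica):
    """Candidate set C: replica entries involved in the current evaluation."""
    return [u for u in replica if u in window_users]


def rank_lru(candidates, last_refreshed):
    """Order candidates so that the least recently refreshed come first."""
    return sorted(candidates, key=lambda u: (last_refreshed[u], u))


def maintain(elected, replica, last_refreshed, fetch_remote, now):
    """Refresh the elected entries from the remote source."""
    for u in elected:
        replica[u] = fetch_remote(u)
        last_refreshed[u] = now


def maintenance_step(window_users, replica, last_refreshed, fetch_remote, gamma, now):
    candidates = proposer_wsj(window_users, replica)   # C
    ranked = rank_lru(candidates, last_refreshed)
    elected = ranked[:gamma]                           # E: top gamma of C
    maintain(elected, replica, last_refreshed, fetch_remote, now)
    return elected


# Illustrative data, not taken from the talk.
remote = {"alice": 260_000, "bob": 110_000, "carol": 95_000, "dave": 5}
replica = {"alice": 250_000, "bob": 90_000, "carol": 120_000, "dave": 1}
last_refreshed = {"alice": 5, "bob": 1, "carol": 3, "dave": 0}


def fetch_remote(user):
    """Stand-in for a remote SERVICE call."""
    return remote[user]


elected = maintenance_step(
    window_users={"alice", "bob", "carol"},
    replica=replica,
    last_refreshed=last_refreshed,
    fetch_remote=fetch_remote,
    gamma=2,
    now=10,
)
```

Here "dave" is never a candidate because he does not appear in the window, and "alice" stays stale because the budget of two refreshes is spent on the two least recently refreshed candidates.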
Where to read about ACQUA
1. Soheila Dehghanzadeh, Alessandra Mileo, Daniele Dell'Aglio, Emanuele Della Valle, Shen Gao, Abraham Bernstein: Online View Maintenance for Continuous Query Evaluation. WWW (Companion Volume) 2015: 25-26
2. Soheila Dehghanzadeh, Daniele Dell'Aglio, Shen Gao, Emanuele Della Valle, Alessandra Mileo, Abraham Bernstein: Approximate Continuous Query Answering over Streams and Dynamic Linked Data Sets. ICWE 2015: 307-325
3. Shima Zahmatkesh, Emanuele Della Valle, Daniele Dell'Aglio: When a FILTER Makes the Difference in Continuously Answering SPARQL Queries on Streaming and Quasi-Static Linked Data. ICWE 2016: 299-316
4. Shen Gao, Daniele Dell'Aglio, Soheila Dehghanzadeh, Abraham Bernstein, Emanuele Della Valle, Alessandra Mileo: Planning Ahead: Stream-Driven Linked-Data Access Under Update-Budget Constraints. International Semantic Web Conference (1) 2016: 252-270
5. Shima Zahmatkesh, Emanuele Della Valle, Daniele Dell'Aglio: Using Rank Aggregation in Continuously Answering SPARQL Queries on Streaming and Quasi-static Linked Data. DEBS 2017: 170-179
Experimental Results
[Chart: Performance (Worst to Best) across the Experiment Dimension]
For high selectivity, Filter Update Policy is better than WBM
For low selectivity, WBM is better than Filter Update Policy
Comparable to Best Result
Future work
• Broaden the class of queries
• Multiple filtering
• Filtering condition formulated as a ranking clause
• Pushing the FILTER clause into the SERVICE clause and considering caching instead of a local replica
• Study the effect of different trends in the data
ACQUA IN THE STREAM REASONING CONTEXT
Annex
Stream Reasoning in a nutshell
Tame data Variety and Velocity simultaneously
[Diagram labels: Traditional, Stream Reasoning]
Stream Reasoning in a nutshell
Tame data Variety and Velocity simultaneously without forgetting volume
[Diagram labels: Traditional, Stream Reasoning]
What if the analysis also includes data "at rest"?
What if the data "at rest" are massive and slowly evolving?
Stream Reasoning in a nutshell
ACQUA