Geospatial Stream Query Processing using Microsoft SQL Server StreamInsight
Seyed Jalal Kazemitabar (1)
Ugur Demiryurek (1)
Mohamed Ali (2)
Afsin Akdogan (1)
Cyrus Shahabi (1)
(1) Integrated Media Systems Center, University of Southern California
(2) Microsoft SQL Server, Microsoft Corporation
ICampus IWatch CT
Introduction
GeoInsight
• A real-world data-driven framework which enables:
– Fast query processing over stream data using Microsoft StreamInsight™
– Running spatial queries over geospatial data
– Online analysis and prediction based on historic data using our in-memory sketching technique

Streaming Engine
• StreamInsight Architecture
• Stream flow in demo
[Figure: stream flow in the demo. Labels: Input Adapter; queries Q1 to Q7; operators Value Filter, Spatial Filter, PCA, PCA Predict, Refine, Average; Output Adapter]
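The operator chain named in the stream flow figure (value filter, spatial filter, average) can be illustrated with plain Python generators. This is only a minimal sketch and does not use the StreamInsight API. The record fields (`lat`, `lon`, `speed`), the speed bounds, the bounding box, and the count-based window are all assumptions made for illustration.

```python
from collections import deque

def value_filter(stream, low=0.0, high=120.0):
    """Drop readings whose speed lies outside an assumed plausible range."""
    for r in stream:
        if low <= r["speed"] <= high:
            yield r

def spatial_filter(stream, box):
    """Keep readings inside a bounding box (min_lat, min_lon, max_lat, max_lon)."""
    min_lat, min_lon, max_lat, max_lon = box
    for r in stream:
        if min_lat <= r["lat"] <= max_lat and min_lon <= r["lon"] <= max_lon:
            yield r

def windowed_average(stream, size=3):
    """Emit the mean speed over the last `size` readings (count-based sliding window)."""
    window = deque(maxlen=size)
    for r in stream:
        window.append(r["speed"])
        yield sum(window) / len(window)

# Hypothetical sensor readings; the values are made up for the example.
readings = [
    {"lat": 34.03, "lon": -118.30, "speed": 60.0},
    {"lat": 34.03, "lon": -118.29, "speed": 250.0},  # rejected by the value filter
    {"lat": 40.71, "lon": -74.00, "speed": 30.0},    # rejected by the spatial filter
    {"lat": 34.04, "lon": -118.28, "speed": 50.0},
    {"lat": 34.05, "lon": -118.27, "speed": 40.0},
]
LA_BOX = (33.7, -118.7, 34.4, -117.6)  # rough, assumed bounding box

averages = list(windowed_average(spatial_filter(value_filter(readings), LA_BOX), size=2))
print(averages)
```

Each stage consumes the previous one lazily, which mirrors how a standing query processes events as they arrive rather than after the data has been stored.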
Application
Online Analytical Refinement and Prediction (OARP)
Hybrid queries over spatio-temporal windows provide great analysis functionality including:
• Refinement functions
– Smoothing noisy input data according to previously observed patterns
– Detection of anomalies characterized by sensor readings that are highly deviated from historical mean values
• Prediction functions
– Predicting near future trends based on previously observed patterns
– Responding to anomalies and deliberately attempting to change future conditions

Approach
Using In-memory Sketches
• Instead of storing the whole data in DB, store the sketches in memory
• Principal Component Analysis (PCA): a mathematical approach for analyzing correlated data
• A number of components with great influence selected as coordinates
• Improving PCA performance for aggregate queries by calculating the query result in transformed space
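The idea of answering an aggregate query in the transformed space can be sketched as follows. The poster does not describe how its sketches are built, so everything here is an assumption: the sketch is taken to be the mean profile, the top-k principal directions (found with a simple power iteration), and each row's k coefficients. Because averaging is linear, the average of the reconstructed rows equals the reconstruction of the averaged coefficients, so the query only touches k numbers per row.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def top_components(centered, k, iters=200):
    """Top-k principal directions via power iteration with deflation.
    Assumes the start vector is not orthogonal to the leading direction."""
    d, n = len(centered[0]), len(centered)
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)] for i in range(d)]
    comps = []
    for _comp in range(k):
        v = [1.0 / math.sqrt(d)] * d
        for _step in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left to explain
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
        comps.append(v)
        # Deflate: remove the variance explained by this direction.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return comps

def build_sketch(rows, k):
    """Sketch = (mean profile, k directions, k coefficients per row)."""
    mu = mean_vector(rows)
    centered = [[x - m for x, m in zip(r, mu)] for r in rows]
    comps = top_components(centered, k)
    coeffs = [[sum(c * x for c, x in zip(comp, r)) for comp in comps] for r in centered]
    return mu, comps, coeffs

def average_from_sketch(sketch, row_ids):
    """Average profile over the selected rows, computed from coefficients only."""
    mu, comps, coeffs = sketch
    k = len(comps)
    avg_c = [sum(coeffs[i][j] for i in row_ids) / len(row_ids) for j in range(k)]
    return [m + sum(avg_c[j] * comps[j][t] for j in range(k)) for t, m in enumerate(mu)]

# Hypothetical daily speed profiles (three time slots per day), made up for the example.
profiles = [[58, 26, 53], [59, 28, 54], [60, 30, 55], [61, 32, 56], [62, 34, 57]]
sketch = build_sketch(profiles, k=1)
approx = average_from_sketch(sketch, [3, 4])  # average over the last two days
print(approx)
```

In this toy data a single component captures all the variation, so the answer from the sketch matches the exact average. On real data the error depends on how much variance the retained components explain.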
Contribution/Experiments
PCA for Traffic Data
• High data compression rate
– 98% for highway data
• Extra short response time
– 2 milliseconds (compared to 58 sec.)
• Highly accurate for Traffic Data
– MSE for same query: 10^-4 Mph
[Figure: Real Data and Transformed Data panels; axis labels Speed, Time, % of Variance, Components; annotation 98%]

Challenges
Large Datasets and Spatial Queries
• Large response time caused by disk I/O limits the availability of hybrid queries in real-time streaming applications
– Example query: “What was the average speed in I-10 in LA county during summer 2009 from 4:00-5:00 pm?”
– Response time for the indexed table containing data of one year (150 GB): 58 seconds!
Conclusion and Future Work
• Limited support for geostreaming (continuous spatial queries) in current database technologies
• Demo application as a proof of concept for a system which runs spatial queries over real-time data
• Implementing the fundamentals of Clever Transportation (CT) project as a platform for monitoring, querying, and analyzing real-time Los Angeles traffic data
• Devising a scalable spatial alarm continuous query suitable for location-based services
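To make the notion of a spatial alarm concrete, here is a deliberately naive sketch. It is not the scalable design the poster proposes as future work. It assumes a single circular alarm region and a stream of `(object_id, lat, lon)` position updates, and fires once each time an object moves from outside the region to inside it. The region, the update format, and the function names are all hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def spatial_alarms(updates, center, radius_km):
    """Yield an object id each time that object enters the circular region."""
    inside = set()  # ids currently known to be inside the region
    for obj_id, lat, lon in updates:
        is_in = haversine_km(lat, lon, center[0], center[1]) <= radius_km
        if is_in and obj_id not in inside:
            inside.add(obj_id)
            yield obj_id            # entry event: fire the alarm
        elif not is_in:
            inside.discard(obj_id)  # object left, so a later entry fires again

# Hypothetical position updates around an assumed region in Los Angeles.
center = (34.0522, -118.2437)
updates = [
    ("car1", 34.20, -118.24),    # outside
    ("car1", 34.06, -118.25),    # enters: alarm
    ("car1", 34.055, -118.245),  # still inside: no new alarm
    ("car2", 33.90, -118.24),    # outside
    ("car1", 34.30, -118.24),    # leaves
    ("car1", 34.05, -118.24),    # re-enters: alarm
]
fired = list(spatial_alarms(updates, center, radius_km=5.0))
print(fired)
```

This version checks every update against the region, so its cost grows with the number of updates times the number of regions. A scalable variant would need some way to avoid most of those checks, for example a spatial index over the alarm regions.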