Telegraph: An Adaptive Global-Scale Query Engine Joe Hellerstein

Post on 21-Dec-2015


Page 1

Telegraph: An Adaptive Global-Scale Query Engine

Joe Hellerstein

Page 2

Scenarios

• Ubiquitous computing: more than clients!
  – sensors and their data feeds are key
    • smart dust, biomedical (MEMS sensors)
    • each consumer good records (mis)use
  – disposable computing
    • video from surveillance cameras, broadcasts, etc.
• Global Data Federation
  – all the data is online – what are we waiting for?
  – The plumbing is coming
    • XML/HTTP, etc. give lowest-common-denominator (LCD) communication
    • but how do you query robustly over many sites in the wide area?

Page 3

There’s a Data Flood Coming

Page 4

There’s a Data Flood Coming

• What does it look like?
  – Never ends: interactivity required
  – Big: data reduction/aggregation is key
  – Unpredictable: this scale of devices and nets will not behave nicely

Page 5

The Telegraph Query Engine

• Key technologies
  – Interactive Control
    • interactivity with early answers
    • online aggregation for data reduction
  – Continuously adaptive flow optimization
    • massively parallel, adaptive dataflow via Rivers and Eddies
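
To make "early answers" and "online aggregation" concrete, here is a minimal Python sketch of a running AVG that reports an estimate with a confidence interval while data is still arriving. It is an illustration only, not Telegraph code: the function name `online_avg`, the `report_every` parameter, and the 95% normal-approximation interval are assumptions, and the interval is only meaningful if rows arrive in random order.

```python
import math
import random


def online_avg(stream, z=1.96, report_every=1000):
    """Yield (rows_seen, running_mean, half_width) as rows arrive.

    Assumptions (not from the slides): rows arrive in random order and a
    CLT-based interval with z = 1.96 (about 95%) is acceptable.
    """
    n, mean, m2 = 0, 0.0, 0.0
    for x in stream:
        n += 1
        # Welford's update keeps mean and sum of squared deviations in one pass.
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n > 1 and n % report_every == 0:
            std_err = math.sqrt(m2 / (n - 1) / n)
            yield n, mean, z * std_err


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(50.0, 10.0) for _ in range(10_000)]
    for n, est, hw in online_avg(data):
        print(f"{n:>6} rows: avg ~ {est:.2f} +/- {hw:.2f}")
```

The interval narrows as more rows are seen, so a user can stop the query once the estimate is precise enough.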

Page 6

CONTROL
Continuous Output, Navigation & Transformation with Refinement On Line

• Data-intensive jobs are long-running. How to give early answers and interactivity?
  – online interactivity over feeds: data “juggle”
  – online query processing algs: ripple joins
  – statistical estimators, and their performance implications
• Appreciate interplay of massive data processing, stats, and UIs
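
As a rough illustration of the ripple-join idea named above, the following Python sketch draws tuples alternately from two inputs and joins each new tuple against everything already seen from the other side, so matches stream out long before either input is exhausted. This is a simplified "square" variant under stated assumptions: the names `ripple_join`, `key_r`, and `key_s` are hypothetical, and the running estimates and adaptive sampling rates of the published algorithm are omitted.

```python
def ripple_join(r_rows, s_rows, key_r, key_s):
    """Incrementally yield matching (r, s) pairs, each exactly once."""
    seen_r, seen_s = [], []
    it_r, it_s = iter(r_rows), iter(s_rows)
    done_r = done_s = False
    while not (done_r and done_s):
        if not done_r:
            try:
                r = next(it_r)
            except StopIteration:
                done_r = True
            else:
                # Join the new R tuple with every S tuple seen so far.
                for s in seen_s:
                    if key_r(r) == key_s(s):
                        yield r, s
                seen_r.append(r)
        if not done_s:
            try:
                s = next(it_s)
            except StopIteration:
                done_s = True
            else:
                # Join the new S tuple with every R tuple seen so far.
                for r_old in seen_r:
                    if key_r(r_old) == key_s(s):
                        yield r_old, s
                seen_s.append(s)


if __name__ == "__main__":
    R = [(1, "a"), (2, "b"), (3, "c")]
    S = [(2, "x"), (1, "y"), (2, "z")]
    for pair in ripple_join(R, S, key_r=lambda r: r[0], key_s=lambda s: s[0]):
        print(pair)
```

Because every pair is checked exactly once (when the later of its two tuples arrives), the final result equals an ordinary join; only the order of delivery changes.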

Page 7

CONTROL
Continuous Output and Navigation Technology with Refinement On Line

Page 8

CONTROL
Continuous Output and Navigation Technology with Refinement On Line

Page 9

River

• We built the world’s fastest sorting machine
  – On the “NOW”: 100 Sun workstations + SAN
  – But it only beat the record under ideal conditions!
• River: performance adaptivity for data flows on clusters
  – simplifies management and programming
  – perfect for sensor-based streams
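
The "ideal conditions" point can be illustrated with a toy simulation: when work is split evenly up front, one slow node sets the finish time, whereas letting each consumer pull its next item whenever it is free keeps the fast nodes busy. The Python sketch below is a deliberately simple model and not River's actual mechanism; the function names and the per-item cost figures are assumptions chosen for illustration.

```python
import heapq


def static_partition(n_items, costs):
    """Finish time when items are divided evenly among consumers up front."""
    share = n_items / len(costs)
    return max(share * c for c in costs)


def adaptive_queue(n_items, costs):
    """Finish time when each consumer pulls the next item as soon as it is free.

    Returns (finish_time, items_handled_per_consumer).
    """
    free_at = [(0.0, i) for i in range(len(costs))]  # (time free, consumer)
    heapq.heapify(free_at)
    handled = [0] * len(costs)
    finish = 0.0
    for _ in range(n_items):
        t, i = heapq.heappop(free_at)  # earliest-available consumer
        t_done = t + costs[i]
        handled[i] += 1
        finish = max(finish, t_done)
        heapq.heappush(free_at, (t_done, i))
    return finish, handled


if __name__ == "__main__":
    costs = [1.0, 1.0, 4.0]  # hypothetical: third node is 4x slower
    print("static  :", static_partition(90, costs))
    print("adaptive:", adaptive_queue(90, costs))
```

In the adaptive case the slow consumer simply ends up handling fewer items, which is the kind of performance adaptivity the slide describes.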

Page 10

Eddy

• How to order and reorder operators over time
  – based on performance, economic/admin feedback
• Vs. River:
  – River optimizes each operator “horizontally”
  – Eddies optimize a pipeline “vertically”
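
A minimal Python sketch of reordering operators over time: each tuple is routed through a set of filters, and the routing order is chosen per tuple from the pass rates observed so far, so the most selective filter drifts to the front. This is a simplification under stated assumptions: it handles filters only, and it uses a greedy lowest-pass-rate-first rule in place of whatever routing policy a real eddy uses. The class and method names are hypothetical.

```python
class Eddy:
    """Route tuples through filters, reordering them by observed pass rate."""

    def __init__(self, predicates):
        self.predicates = list(predicates)
        self.seen = [0] * len(self.predicates)
        self.passed = [0] * len(self.predicates)
        self.evaluations = 0  # total predicate calls, for comparison

    def _pass_rate(self, i):
        # Smoothed estimate so unseen operators start at 0.5.
        return (self.passed[i] + 1) / (self.seen[i] + 2)

    def route(self, tup):
        pending = set(range(len(self.predicates)))
        while pending:
            # Greedy choice: try the operator most likely to reject first.
            i = min(pending, key=lambda j: (self._pass_rate(j), j))
            pending.remove(i)
            self.seen[i] += 1
            self.evaluations += 1
            if not self.predicates[i](tup):
                return False
            self.passed[i] += 1
        return True

    def run(self, tuples):
        return [t for t in tuples if self.route(t)]


if __name__ == "__main__":
    eddy = Eddy([lambda x: x % 2 == 0, lambda x: x < 10])
    print(eddy.run(range(1000)), eddy.evaluations)
```

With a fixed order (even test first), this input costs 1500 predicate calls; the adaptive order switches to the `x < 10` filter once it starts rejecting tuples, which cuts the work substantially.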

Page 11

Telegraph: Putting it Together

• Scalable, adaptive dataflow infrastructure. Apps include…
  – sensor nets
  – massively parallel and wide-area query engines
  – net appliances: chaining xform8n/aggreg8n/etc. proxies
  – any unpredictable dataflow scenario
• Technology: a marriage of…
  – CONTROL, River & Eddy
    • Many research questions here
    • E.g. how to combine River and Eddy adaptivity
    • E.g. how to tune Eddies for statistical performance goals
  – Combinations of browse/query/mine at UI
  – Storage management to handle new hardware realities

Page 12

Integration with Endeavour

• Give
  – Be data-intensive backbone to diverse clients
  – Be replication dataflow engine for OceanStore
  – Telegraph Storage Manager provides storage (xactional/otherwise) for OceanStore
  – Provide platform for data-intensive “tacit info mining”
• Take
  – Leverage OceanStore to manage distributed metadata, security
  – Leverage protocols out of TinyOS for sensors

Page 13

Additional Slides

• For use in questions, etc.

Page 14

Connectivity & Heterogeneity

• Lots of folks working on data format translation, parsing
  – we will borrow, not build
  – currently using JDBC & Cohera Net Query
    • commercial tool, donated by Cohera Corp.
    • gateways XML/HTML (via http) to ODBC/JDBC
  – we may write “Teletalk” gateways from sensors
• Heterogeneity
  – never a simple problem
  – CONTROL project developed interactive, online data transformation tool: Potter’s Wheel

Page 15

Page 16

Potter’s Wheel Anomaly Detection