Charles Loboz, Slawek Smyl, Suman Nath, Microsoft Corporation
![Page 1: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/1.jpg)
Charles Loboz, Slawek Smyl, Suman Nath
Microsoft Corporation
![Page 2: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/2.jpg)
Monitoring Large Data Centers
Context Performance Data Design Goals DataGarage Query Processing Experiments
Management tasks: monitoring, planning, historical analysis
Performance data: CPU, memory, disk utilization, …; response time, queue length, …
![Page 3: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/3.jpg)
Monitoring Data Management
100K servers = 1TB data per day!
Storage challenge
• Store data over many months, years
• Petabytes of data
Query challenge
• Hours to run simple queries
![Page 4: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/4.jpg)
DataGarage
Performance data: CPU, memory, disk utilization, …; response time, queue length, …
Storage, query processing
Efficient, scalable, cheap
![Page 5: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/5.jpg)
Outline
• Context
• Performance data characteristics
• Design goals
• DataGarage design
• Query Processing
• Evaluation
• Conclusion
![Page 6: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/6.jpg)
Performance Data Collection
Monitoring process
Time CPU Mem Jobs Disk …
10:00 48 37 3 134 …
10:01 52 39 3 342 …
10:02 58 45 2 324 …
… … … … … …
Our Deployment
• Sampling period: 15 seconds
• 100-1000 counters/server
• 5-100 MB/server/day
• 0.01% CPU time
CPU utilization, memory usage, disk space, SQL queue length, app response time, cache hit rate, network bandwidth, …
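A rough back-of-the-envelope check of these figures, as a sketch only. The 4-byte sample size is an assumption (the slide does not state it), decimal units are assumed, and any storage overhead is ignored.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SAMPLING_PERIOD_S = 15      # from the slide
BYTES_PER_SAMPLE = 4        # assumption: one 4-byte value per counter sample

# Number of samples each counter produces per day.
samples_per_day = SECONDS_PER_DAY // SAMPLING_PERIOD_S


def raw_mb_per_server_day(counters):
    """Raw payload in MB/day for a server exposing `counters` counters."""
    return samples_per_day * counters * BYTES_PER_SAMPLE / 1e6


low = raw_mb_per_server_day(100)     # 100 counters/server
high = raw_mb_per_server_day(1000)   # 1000 counters/server

# Per-server average implied by "100K servers = 1TB data per day".
avg_mb_per_server = 1e12 / 100_000 / 1e6

print(samples_per_day, low, high, avg_mb_per_server)
```

Under these assumptions the raw payload is roughly 2 to 23 MB per server per day, the same order of magnitude as the 5-100 MB/server/day quoted above, and 1 TB/day over 100K servers works out to 10 MB per server.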
![Page 7: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/7.jpg)
Performance Data Characteristics
• Heterogeneous counter sets
– 30K different counters, 100-1000 per server
• Numeric, read-only, possibly-dirty
– Dirty data retained, may be ignored for query
• Hierarchical queries
– Selection, projection, aggregation, data mining
  • Fraction of hotmail.com servers in a given rack with CPU utilization > 50%
  • Average memory utilization trend of hotmail servers
![Page 8: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/8.jpg)
DataGarage Design Goals
• Small storage footprint
– Reduces storage and communication cost
– Small pay-as-you-go cost for Cloud systems
• Cheap
– Commodity hardware and off-the-shelf software
• Fast and robust query processing
– Allows fast decisions
– Tolerates faulty and slow hardware
• Simple and flexible query interface (SQL + UDF)
– Fast query writing
![Page 9: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/9.jpg)
Outline
• Context
• Performance data characteristics
• Design goals
• DataGarage design
• Query Processing
• Evaluation
• Conclusion
![Page 10: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/10.jpg)
Options
• TableStore: Relational table
– DB engine: single-node DBMS, parallel DBMS
– MapReduce: HadoopDB [Abouzeid et al. VLDB’09]
• FileStore: Files
– MapReduce: Hadoop, Dryad [Isard et al., EuroSys’07]
![Page 11: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/11.jpg)
Trade-offs
[Figure: the options below compared on performance, fault-tolerance, cost, and storage footprint]
• TableStore + Parallel DB Engine (DBMS-X)
• TableStore + MR + single node DB (HadoopDB)
• FileStore + MapReduce (Hadoop, Dryad)
• TableStore in files + MapReduce (DataGarage)
![Page 12: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/12.jpg)
Storage Inefficiency: TableStore
Key problem: heterogeneous counter sets
Total 30,000 unique counters, <1000/server
Wide table
Columns: Machine id | Timestamps | Counter 1 | Counter 2 | … | Counter n (all possible counters)
• Too many columns
• >95% sparse
Narrow table (key-value store)
Columns: Machine id | Timestamps | Counter id | Value
• Redundant keys (4x more expensive than raw data)
• Expensive joins needed
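The two inefficiencies can be made concrete with a little arithmetic. This is a sketch under stated assumptions: the counter totals come from the slide, while the 4-byte field width is hypothetical and the slide's own 4x figure may have been derived differently.

```python
TOTAL_COUNTERS = 30_000            # unique counters across all servers (slide)
MAX_COUNTERS_PER_SERVER = 1_000    # upper bound per server (slide)

# Wide table: every row has a column for every counter, but a server
# fills at most 1,000 of them, so the rest are NULL.
wide_sparsity = 1 - MAX_COUNTERS_PER_SERVER / TOTAL_COUNTERS

# Narrow table: each value is stored with its keys.
FIELD_BYTES = 4                    # assumption: every field is 4 bytes wide
narrow_row_bytes = 4 * FIELD_BYTES # machine id, timestamp, counter id, value
narrow_overhead = narrow_row_bytes / FIELD_BYTES

print(f"wide table sparsity: {wide_sparsity:.1%}")
print(f"narrow table bytes per value vs raw: {narrow_overhead:.0f}x")
```

With at most 1,000 of 30,000 columns populated, a wide table is at least about 96.7% empty, which agrees with the ">95% sparse" bullet. With equal-width fields, a narrow row spends three of its four fields on keys.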
![Page 13: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/13.jpg)
Storage Inefficiency: FileStore
• Heterogeneous counter sets
– Files need to maintain schema for each server
• No structure in data
– Compression cannot exploit data correlation
![Page 14: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/14.jpg)
Our Solution
• One wide-table per server
– Benefits of TableStore, without sparseness/redundancy
• Each wide-table in an embedded database file
– Benefits of FileStore
– Embedded database options: SQLite, MS SQL Server Compact Edition
[Figure: per-server counter sets (c1 c4 c6 c7 c8; c2 c4 c5 c8; c1 c2 c3), each stored as its own wide table in an .sdf file through the Microsoft SQL Server Compact Edition library]
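A minimal sketch of the one-wide-table-per-server idea. It uses Python's built-in `sqlite3` module as a stand-in for the embedded database, not SQL Server Compact Edition; the table name `perf`, the `.db` extension, the helper names, and the sample values are all hypothetical.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def write_server_file(directory, server, counters, samples):
    """Create <server>.db with one wide table holding only this server's counters."""
    path = os.path.join(directory, f"{server}.db")
    column_defs = ", ".join(f'"{name}" REAL' for name in counters)
    placeholders = ", ".join("?" for _ in range(len(counters) + 1))
    with closing(sqlite3.connect(path)) as conn:
        conn.execute(f"CREATE TABLE perf (ts TEXT, {column_defs})")
        conn.executemany(f"INSERT INTO perf VALUES ({placeholders})", samples)
        conn.commit()
    return path


def table_columns(path):
    """Return the column names of the perf table in one database file."""
    with closing(sqlite3.connect(path)) as conn:
        return [row[1] for row in conn.execute("PRAGMA table_info(perf)")]


workdir = tempfile.mkdtemp()
# Two servers with different counter sets, as in the figure; values are made up.
file_a = write_server_file(workdir, "serverA", ["c1", "c4", "c6", "c7", "c8"],
                           [("10:00", 48, 37, 3, 134, 7)])
file_b = write_server_file(workdir, "serverB", ["c2", "c4", "c5", "c8"],
                           [("10:00", 52, 39, 3, 342)])

print(table_columns(file_a))
print(table_columns(file_b))
```

Each file carries its own schema, so no row stores NULLs for counters the server does not have, and no value repeats a counter-id key.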
![Page 15: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/15.jpg)
DataGarage Architecture
[Figure: architecture diagram with the following components: data collectors, embedded database files, distributed file system, controller (query dissemination), query, summary database, data analysis tools]
![Page 16: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/16.jpg)
Data Compression
• Zipping files with PKZip is not effective
• Compress one column at a time
– Exploit strong correlation
– RLE, delta encoding not very effective
• Our idea: Bit-truncation + Byte-interleaving
[Figure: hexadecimal example of a column of 4-byte values laid out as byte rows 42424242, AEAEAEAE, 91832B39, A0E438C4; the last row is dropped if lossy; annotation: <1%]
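The bit-truncation and byte-interleaving idea can be sketched as follows. This is an illustration, not the authors' implementation: it assumes the hex rows in the figure are the interleaved output of four 32-bit values, it zeroes truncated bits rather than dropping them, and it uses `zlib` as a hypothetical back-end compressor.

```python
import zlib


def truncate_bits(words, drop_bits):
    """Lossy step: zero the lowest `drop_bits` bits of every 32-bit word."""
    mask = (0xFFFFFFFF << drop_bits) & 0xFFFFFFFF
    return [w & mask for w in words]


def interleave_bytes(words):
    """Emit byte 0 of every word, then byte 1 of every word, and so on."""
    raw = [w.to_bytes(4, "big") for w in words]
    return bytes(r[i] for i in range(4) for r in raw)


def deinterleave_bytes(blob):
    """Inverse of interleave_bytes."""
    n = len(blob) // 4
    return [int.from_bytes(bytes(blob[i * n + j] for i in range(4)), "big")
            for j in range(n)]


# Four 32-bit values reconstructed from the columns of the figure (assumption).
words = [0x42AE91A0, 0x42AE83E4, 0x42AE2B38, 0x42AE39C4]

interleaved = interleave_bytes(words)
lossy = interleave_bytes(truncate_bits(words, 8))   # drop the low byte
compressed = zlib.compress(lossy)

print(interleaved.hex())
print(lossy.hex())
```

After interleaving, the slowly changing high-order bytes of consecutive samples sit next to each other as long runs, which a general-purpose compressor handles well. Truncation removes the noisiest low-order bits when a small loss of precision is acceptable.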
![Page 17: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/17.jpg)
Storage Efficiency
![Page 18: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/18.jpg)
Outline
• Context
• Performance data characteristics
• Design goals
• DataGarage design
• Query Processing
• Evaluation
• Conclusion
![Page 19: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/19.jpg)
DataGarage Query
• DataGarage query: Three components
– On: filesystem path: /hotmail/dc1/*.10-.-2009.sdf
– Apply: a SQL query run on individual database files
– Combine: a SQL query to compute final result
• Enables map-reduce style execution
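The On / Apply / Combine structure can be sketched with Python's built-in `sqlite3` and `glob` modules. Everything here is an assumption made for illustration: SQLite stands in for the embedded database, and the function `run_query`, the tables `perf` and `partial`, the `.db` files, and the sample values are hypothetical names and data, not DataGarage's actual interface.

```python
import glob
import os
import sqlite3
import tempfile
from contextlib import closing


def run_query(on, apply_sql, combine_sql):
    """Run apply_sql inside every file matching `on`, then combine_sql over all outputs."""
    with closing(sqlite3.connect(":memory:")) as combined:
        created = False
        for path in sorted(glob.glob(on)):
            with closing(sqlite3.connect(path)) as db:      # Apply: per-file query
                cursor = db.execute(apply_sql)
                names = [d[0] for d in cursor.description]
                rows = cursor.fetchall()
            if not created:                                  # temporary table
                cols = ", ".join(f'"{n}"' for n in names)
                combined.execute(f"CREATE TABLE partial ({cols})")
                created = True
            marks = ", ".join("?" for _ in names)
            combined.executemany(f"INSERT INTO partial VALUES ({marks})", rows)
        if not created:
            return []                                        # no file matched `on`
        return combined.execute(combine_sql).fetchall()      # Combine: final result


def make_file(path, cpu_values):
    """Helper for the demo: one database file with a perf(ts, cpu) table."""
    with closing(sqlite3.connect(path)) as db:
        db.execute("CREATE TABLE perf (ts INTEGER, cpu REAL)")
        db.executemany("INSERT INTO perf VALUES (?, ?)", list(enumerate(cpu_values)))
        db.commit()


root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "hotmail", "dc1"))
make_file(os.path.join(root, "hotmail", "dc1", "server1.db"), [40, 60])
make_file(os.path.join(root, "hotmail", "dc1", "server2.db"), [80, 100])

result = run_query(
    on=os.path.join(root, "hotmail", "dc1", "*.db"),
    apply_sql="SELECT SUM(cpu) AS s, COUNT(*) AS n FROM perf",
    combine_sql="SELECT SUM(s) * 1.0 / SUM(n) FROM partial",
)
print(result)
```

Apply plays the role of the map step and Combine the reduce step. The average is computed from per-file sums and counts, so the combine query stays correct even when files hold different numbers of samples.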
![Page 20: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/20.jpg)
Query Execution
[Figure: the controller node resolves On against the distributed file system and disseminates Apply to the execution nodes; their outputs are collected in a temporary table, on which the controller runs Combine to produce the result]
![Page 21: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/21.jpg)
Query Execution Time
![Page 22: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/22.jpg)
Fault Tolerance
• DataGarage key technology:
– Decoupling of execution and storage
– Fine-grained data partitioning
• Data is replicated by the file system
• Slow execution nodes
– Assigned smaller jobs
– Faster nodes take additional load after they finish
• Execution node failures
– New nodes work on the remaining jobs of failed nodes
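One way to get this behaviour from fine-grained partitioning is pull-based scheduling, simulated below. This is a sketch of the general technique under assumed details, not DataGarage's actual scheduler: the function `simulate`, the node names, the per-file costs, and the failure time are all hypothetical.

```python
import heapq
from collections import deque


def simulate(files, node_cost, fail_at=None):
    """Simulate nodes pulling one file at a time from a shared queue.

    node_cost: seconds each node needs per file.
    fail_at:   optional map of node -> time after which the node is dead.
    Returns a map of file -> node that completed it.
    """
    fail_at = fail_at or {}
    pending = deque(files)
    done = {}
    idle = []   # live nodes that found the queue empty
    # Event = (time the node becomes free, node, file it was working on or None).
    events = [(0.0, node, None) for node in sorted(node_cost)]
    heapq.heapify(events)
    while events:
        now, node, finished = heapq.heappop(events)
        if now > fail_at.get(node, float("inf")):
            if finished is not None:
                pending.append(finished)       # re-queue the failed node's file
            if pending and idle:
                heapq.heappush(events, (now, idle.pop(), None))  # wake a live node
            continue                           # dead node pulls no more work
        if finished is not None:
            done[finished] = node
        if pending:
            heapq.heappush(events, (now + node_cost[node], node, pending.popleft()))
        else:
            idle.append(node)
    return done


files = [f"f{i}" for i in range(10)]
done = simulate(files,
                node_cost={"fast": 1.0, "slow": 4.0, "flaky": 2.0},
                fail_at={"flaky": 3.0})

per_node = {}
for name in done.values():
    per_node[name] = per_node.get(name, 0) + 1
print(per_node)
```

Because each unit of work is a single small file, a slow node simply completes fewer files, a fast node keeps pulling more, and a failed node loses only the one file it was holding, which another node picks up from the replicated file system.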
![Page 23: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/23.jpg)
Goals Revisited
• High performance: queries are pushed inside embedded database
• Storage efficient: compression
• Fault tolerant: fine partitioning of data and query processing, aggressive restarting, speculative execution
• Hierarchical queries: file system paths
• Simple interface: SQL queries
• Cheap: off-the-shelf tools, commodity machines
![Page 24: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/24.jpg)
Outline
• Context
• Performance data characteristics
• Design goals
• DataGarage design
• Query Processing
• Experience
• Conclusion
![Page 25: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/25.jpg)
Operational Experience
• Has been in operation for more than 1 year
– Warehousing data from Microsoft data centers
• Partitioning with fine granularity + compression is the key to storing massive data
– Previous implementation with narrow table
  • 30K server-days in 1TB disk
  • Slow queries
– Current implementation:
  • 1-3 million server-days/TB
  • Orders of magnitude faster queries
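The storage-density figures translate into per-server-day costs as follows. This is plain arithmetic on the numbers above, assuming decimal terabytes (1 TB = 10^12 bytes).

```python
TB = 10**12                       # assumption: decimal terabyte

old_server_days_per_tb = 30_000                       # narrow-table implementation
new_server_days_per_tb = (1_000_000, 3_000_000)       # current implementation

old_mb_per_server_day = TB / old_server_days_per_tb / 1e6
new_mb_per_server_day = [TB / d / 1e6 for d in new_server_days_per_tb]

# How many times more server-days fit in the same terabyte.
density_gain = [d / old_server_days_per_tb for d in new_server_days_per_tb]

print(f"old: {old_mb_per_server_day:.1f} MB per server-day")
print(f"new: {new_mb_per_server_day[1]:.2f} to {new_mb_per_server_day[0]:.2f} MB per server-day")
print(f"gain: {density_gain[0]:.0f}x to {density_gain[1]:.0f}x")
```

That is roughly 33 MB per server-day before versus about 0.33 to 1 MB now, a 33x to 100x gain in storage density.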
![Page 26: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/26.jpg)
Operational Experience
• Embedded database files give flexibility
– Placement, backup simplified
– Scavenge available storage on the fly
• Simple design helps
– Several thousand lines of C# code to glue together existing tools (FS, Embedded DB, R, …)
• Defer features until necessary: Parallel Combine
• Good fit with Cloud computing model
– Data and/or computation can be on the Cloud
– Cheap: only file storage needed, small footprint
![Page 27: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/27.jpg)
Conclusion
• Existing solutions are not efficient for warehousing performance data
• DataGarage: performance data warehouse
• Cheap, scalable, fault tolerant
– Combines benefits of DB, MapReduce, file systems
• Operational experience shows the benefits
Questions?
![Page 28: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/28.jpg)
Compression Overhead
![Page 29: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/29.jpg)
Related Work
• HadoopDB
– DataGarage has finer data partitioning
  • Improves fault tolerance and storage efficiency
– DataGarage uses embedded databases
  • Cheap, enables using hierarchical file system
– DataGarage uses data compression
![Page 30: Charles Loboz, Slawek Smyl, Suman Nath Microsoft Corporation](https://reader035.vdocument.in/reader035/viewer/2022081414/55147578550346494e8b629a/html5/thumbnails/30.jpg)
Query Processing
[Figure: the controller (query dissemination) takes <target>, <apply_script>, and <combine_script>; <apply_script> runs on the embedded databases in the distributed file system, their outputs go to a temporary table, and <combine_script> runs on that table to produce the result]