ActiveSLA: A Profit-Oriented Admission Control
Framework for Database-as-a-Service Providers
Pengcheng Xiong (Georgia Tech); Yun Chi (NEC Labs America); Shenghuo Zhu (NEC Labs America); Junichi Tatemura (NEC Labs America); Calton Pu (Georgia Tech); Hakan Hacigumus (NEC Labs America)
Overview
  Motivation
  Prediction module design
  Prediction module evaluation
  Decision module design
  Decision module evaluation
  Conclusions
Database-as-a-service (DaaS)
A DaaS provider consolidates multiple clients on shared infrastructure (multi-tenancy): greater economies of scale and fixed cost distribution.
Problem: system overload due to unpredictable and bursty workloads.
Countermeasures: dynamic provisioning, queuing and scheduling, and admission control.
Admission control related work
Macro level (feedback based): keep the mean query execution time at a specific level by tuning the multiprogramming level (MPL) for a given workload, e.g., ICDE 2006.
Micro level (query-by-query based): estimate every single query's execution time from the query type and query mix, e.g., WWW 2004, ICDE 2010.
None of these approaches directly addresses the problem of maximizing the DaaS provider's profit while satisfying the different SLAs of its clients!
First issue: Merely estimating the query execution time is not enough to make profit-oriented decisions. We need to know the probabilities of a query meeting and missing its deadline.
Second issue: Even when queries have the same deadline and the same probability of meeting it, we may have to make different admission control decisions because their SLAs differ.
System architecture of ActiveSLA
Prediction module design
What kind of models to use? Model selection between linear and nonlinear models, and between regression and classification models.
What features to use? The rich set of features available to DaaS providers.
Model selection
Linear vs. nonlinear: The execution time of a query depends on many factors in a nonlinear fashion, e.g., the isolation level and the available buffer size.
Regression vs. classification: From a machine learning point of view, a direct classification model usually outperforms a two-step regression-based approach (first predict the execution time, then compare it with the deadline).
Feature collection
Query type and mix (TYPE, Q-Cop, ActiveSLA)
Query features (ActiveSLA)
  E.g., the estimated number of sequential I/Os.
Database and system conditions (ActiveSLA)
  Buffer cache: the fraction of pages of each table that are currently in the database buffer pool.
  System cache: the fraction of pages of each table that are currently in the operating system cache.
  Transaction isolation level: Read Committed (FALSE) or Serializable (TRUE).
  CPU, memory, and disk status: the current status of CPU, memory, and disk in the operating system.
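The feature groups above could be collected into one numeric vector per incoming query. The following is a minimal sketch, not the authors' implementation: the class name, field names, units, and the one-hot encoding of the query type are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QueryFeatures:
    """Hypothetical container for the feature groups listed on the slide."""
    query_type: int                          # index of the query template
    query_mix: List[int]                     # running queries per type
    est_sequential_io: float                 # optimizer estimate (query feature)
    buffer_cache_fraction: Dict[str, float]  # table -> fraction in DB buffer pool
    system_cache_fraction: Dict[str, float]  # table -> fraction in OS cache
    serializable: bool                       # False = Read Committed, True = Serializable
    cpu_iowait_pct: float                    # assumed CPU status metric
    free_memory_mb: float                    # assumed memory status metric
    disk_util_pct: float                     # assumed disk status metric

    def to_vector(self, tables: List[str], num_types: int) -> List[float]:
        """Flatten all features into a fixed-order list of floats."""
        one_hot = [1.0 if i == self.query_type else 0.0 for i in range(num_types)]
        mix = [float(count) for count in self.query_mix]
        # Tables without a recorded fraction are treated as not cached.
        buf = [self.buffer_cache_fraction.get(t, 0.0) for t in tables]
        sysc = [self.system_cache_fraction.get(t, 0.0) for t in tables]
        status = [
            1.0 if self.serializable else 0.0,
            self.cpu_iowait_pct,
            self.free_memory_mb,
            self.disk_util_pct,
        ]
        return one_hot + mix + [self.est_sequential_io] + buf + sysc + status
```

A fixed table order and a fixed number of query types keep the vector length constant, which most learning algorithms require.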
Summary
Prediction module evaluation
Query sets with a PostgreSQL server
  TPC-W1 (browsing queries)
  TPC-W2 (mixture of browsing and administrative queries)
  TPC-W3 (mixture of browsing, administrative, and updating queries)
Prediction error = (False positives + False negatives) / Total number
Prediction module evaluation
Across different query sets (TPC-W2, TPC-W3) and different SLA settings (30 s, 45 s, 60 s), we observe a steady decrease in errors as we move to a non-linear model, then to a classification model, and then include more features.
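The prediction error used in this evaluation can be computed from paired predictions and outcomes. The sketch below assumes that "miss the deadline" is the positive class; the slides do not state which class is positive, and the function name is illustrative.

```python
from typing import Sequence


def prediction_error(predicted_miss: Sequence[bool], actual_miss: Sequence[bool]) -> float:
    """Return (false positives + false negatives) / total number of queries.

    Assumption: True means "misses its deadline" (the positive class).
    """
    if len(predicted_miss) != len(actual_miss):
        raise ValueError("predictions and outcomes must have the same length")
    if not actual_miss:
        raise ValueError("at least one query is required")
    pairs = list(zip(predicted_miss, actual_miss))
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    return (false_pos + false_neg) / len(pairs)
```

Because the two error types are summed, the total does not depend on which class is called positive; only the false positive and false negative breakdown does.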
Details on the Machine Learning Model
Positive value -> more likely to miss the deadline
Negative value -> unlikely to miss the deadline
CPU waiting % for I/O
Q10 is the update query that needs an exclusive lock
Lots of lock contention
Very likely to miss the deadline
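The annotations above read like an additive model in which each tree contributes a signed value and the sign of the sum gives the prediction. The sketch below illustrates that idea only. The split features, thresholds, leaf values, and the logistic link with factor 2 are all hypothetical and are not taken from the slides.

```python
import math
from typing import Callable, Dict, List

Features = Dict[str, float]


def tree_io_wait(features: Features) -> float:
    # Hypothetical split: a high CPU I/O-wait percentage pushes toward "miss".
    return 0.8 if features["cpu_iowait_pct"] > 40.0 else -0.6


def tree_q10_mix(features: Features) -> float:
    # Hypothetical split: many running Q10 update queries imply lock contention.
    return 1.2 if features["running_q10"] >= 3 else -0.4


TREES: List[Callable[[Features], float]] = [tree_io_wait, tree_q10_mix]


def miss_score(features: Features) -> float:
    """Sum of tree outputs; positive means more likely to miss the deadline."""
    return sum(tree(features) for tree in TREES)


def miss_probability(features: Features) -> float:
    """Map the score to a probability (assumed logistic link, factor 2)."""
    return 1.0 / (1.0 + math.exp(-2.0 * miss_score(features)))
```

With these made-up numbers, a system with heavy I/O wait and several running Q10 queries gets a positive score and a miss probability above 0.5, while an idle system gets a negative score.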
Details on the Machine Learning Model
Query Plan A
Query Plan B
Overhead and feature sensitivity
Overhead
  Training overhead: 72 ms to build an initial model from 12,000 samples.
  Evaluation overhead: 8 ms.
Feature sensitivity
  The more features, the better.
  The gain from using more features is smaller than the gain from using a better model.
Decision module design
General SLA and PDF of query execution time
Step-wise SLA and PDF of query execution time
The probability that the query can meet its deadline
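One way to read the two figure titles above is as an expected-profit calculation that weights the SLA by the distribution of the query execution time. The symbols s(t), f(t), g, p, and \tau below are assumptions introduced for illustration; the slides do not name them.

```latex
% s(t): SLA profit when the query finishes at time t (assumed symbol)
% f(t): probability density of the query execution time (assumed symbol)
\mathbb{E}[\text{profit}] = \int_{0}^{\infty} s(t)\, f(t)\, dt

% Step-wise SLA with assumed gain g up to the deadline \tau and penalty p after it:
s(t) = \begin{cases} g, & t \le \tau \\ -p, & t > \tau \end{cases}
\quad\Longrightarrow\quad
\mathbb{E}[\text{profit}] = g\, P(t \le \tau) - p\,\bigl(1 - P(t \le \tau)\bigr),
\qquad P(t \le \tau) = \int_{0}^{\tau} f(t)\, dt
```

Under a step-wise SLA the full PDF is not needed: the single probability of meeting the deadline determines the expected profit.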
Multiple query decision
  Admitting q into the database server may slow down other queries that are currently running in the server and make them miss their deadlines.
  Admitting q will consume system resources and change the system status. This may result in the rejection of the next query, which might otherwise have been admitted and brought in a higher profit.
  Model this as an opportunity cost o.
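A profit-oriented decision with an opportunity cost can be sketched as a comparison between the expected profit of admitting and the profit of rejecting. This is a minimal illustration under a step-wise SLA. The parameter names gain, penalty, and reject_penalty are assumptions; only the opportunity cost o comes from the slide.

```python
def should_admit(
    p_meet: float,
    gain: float,
    penalty: float,
    reject_penalty: float = 0.0,
    opportunity_cost: float = 0.0,
) -> bool:
    """Admit the query if the expected profit of admission beats rejection.

    p_meet: predicted probability that the query meets its deadline.
    gain / penalty: assumed step-wise SLA payoffs for meeting / missing.
    reject_penalty: assumed penalty paid when the query is rejected up front.
    opportunity_cost: the cost o of the effect on other and future queries.
    """
    if not 0.0 <= p_meet <= 1.0:
        raise ValueError("p_meet must be a probability")
    expected_admit = p_meet * gain - (1.0 - p_meet) * penalty - opportunity_cost
    expected_reject = -reject_penalty
    return expected_admit > expected_reject
```

For example, with the same p_meet of 0.7, an SLA with gain 1 and penalty 1 is admitted, while an SLA with gain 1 and penalty 5 is rejected. Adding an opportunity cost of 0.5 also turns the first case into a rejection.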
Decision module evaluation
Result with stationary workload (static Poisson arrival rate)
Result with non-stationary workload (dynamic Poisson arrival rate according to the 1998 World Cup trace)
  Single SLA
  Multiple SLAs (service differentiation)
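Both workload types can be simulated by drawing exponential inter-arrival times. The sketch below uses a piecewise-constant rate, so a stationary workload is the special case of a single rate. The function name and the per-interval rates are assumptions; the actual rates in the evaluation follow the 1998 World Cup trace, which is not reproduced here.

```python
import random
from typing import List, Sequence


def piecewise_poisson_arrivals(
    rates: Sequence[float], interval_sec: float, seed: int = 0
) -> List[float]:
    """Generate arrival times for a Poisson process with one rate per interval.

    rates[i] is the arrival rate (queries per second) during interval i.
    Restarting the draw at each interval boundary is valid because the
    exponential distribution is memoryless.
    """
    rng = random.Random(seed)
    arrivals: List[float] = []
    for i, rate in enumerate(rates):
        start = i * interval_sec
        end = start + interval_sec
        t = start
        while rate > 0.0:
            t += rng.expovariate(rate)
            if t >= end:
                break
            arrivals.append(t)
    return arrivals
```

Calling it with one rate, for example piecewise_poisson_arrivals([2.0], 60.0), gives a stationary workload. Passing a list of rates taken from a trace gives a non-stationary one.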
Result with stationary workload
Result with non-stationary workload
Single SLA summary
Multiple SLAs summary
Conclusion
We proposed ActiveSLA, a framework for admission control in cloud database systems.
  Prediction module: predicts the probability that a query will meet or miss its deadline.
  Decision module: makes the profit-oriented admission decision.
Future work
  Reduce the inaccuracy of query features, such as the number of sequential I/Os, that is caused by incorrect statistics and cardinality estimates in the query execution plan.
  Extend the prediction module to include the level of replication as one of the system variables.
  Extend ActiveSLA to other types of database systems that manage data and serve queries, e.g., NoSQL databases.