making database applications perform using program analysis alvin cheung samuel madden armando...
TRANSCRIPT
Making Database Applications Perform Using Program Analysis
Alvin CheungSamuel MaddenArmando Solar-
LezamaMIT
Owen ArdenAndrew C. Myers
Cornell
Developing Database Applications
2/1/13 NEDB '13 2
Application Server SQL Database
JavaApplication Logic
SQLQuery
Developing Database Applications
2/1/13 NEDB '13 3
SQL Database
JavaApplication Logic
PL/SQLStoredProcedures
SQLQuery
Application Server
Developing Database Applications
2/1/13 NEDB '13 4
SQL Database
JavaApplication Logic
PL/SQLStoredProcedures
SQLQuery
Application Server
Language Choice for Application Logic
Application Distribution
Program analysis to the rescue!
StatusQuo
• Express application logic in ways that programmers are comfortable with
• Job of compiler & runtime to determine the most efficient implementation
2/1/13 NEDB '13 5
Two Key Technologies
• Infer queries from imperative code
• Migrate computation between servers for optimal performance
2/1/13 NEDB '13 6
Relational Operations in Imperative Code
2/1/13 NEDB '13 7
List getUsersWithRoles () { List users = getUsersFromDB(); List roles = getRolesFromDB(); List results = new ArrayList(); for (User u : users) {
for (Role r : roles) { if (u.roleId == r.id) results.add(u); }} return results; }
SELECT * FROM userSELECT * FROM role
List getUsersWithRoles () {
return executeQuery(
“SELECT u FROM users u,
roles r WHERE u.roleId
== r.id
ORDER BY u.roleId,
r.id”; }
convert to
Relational Operations in Imperative Code
2/1/13 NEDB '13 8
List getUsersWithRoles () { List users = getUsersFromDB(); List roles = getRolesFromDB(); List results = new ArrayList(); for (User u : users) {
for (Role r : roles) { if (u.roleId == r.id) results.add(u); }} return results; }
List getUsersWithRoles () {
return executeQuery(
“SELECT u FROM users u,
roles r WHERE u.roleId
== r.id
ORDER BY u.roleId,
r.id”; }
convert to
Goal Find a variable that
we can rewrite into a SQL expression
output variableresults
Query By Synthesis (QBS)
• Identify potential code fragments– i.e., regions of code that fetches
persistent data and return values
• Find SQL expressions for output variables
• Try to prove that those expressions preserve program semantics– if so, convert the code!
2/1/13 NEDB '13 9
Search for Output Expressions
2/1/13 NEDB '13 11
List getUsersWithRoles () { List users = query(select * from users); List roles = query(select * from roles); List results = []; for (User u : users) {
for (Role r : roles) { if (u.roleId == r.id) results = results : [] }} return results; }Relations involved:users, roles
Possible expressions to consider for results: σf(users) topf(users) πf(users ⨝g roles)πf(σg(users) ⨝h roles) other expressions involving users, roles
Infinite search space size!
usersroles
results
Constraints for Output Expressions
2/1/13 NEDB '13 12
List getUsersWithRoles () { List users = query(select * from users); List roles = query(select * from roles); List results = []; for (User u : users) {
for (Role r : roles) { if (u.roleId == r.id) results = results : [] }} return results; }
usersroles
results
Hoare-style program verification
If
output expression
outer loop invariant
outer loop invariant outer loop terminates
thenoutput expression
is true and
is true
Still need a smarter way to search
results = πuser( users ⨝roleId = id roles )
results = πuser( users[0 .. i] ⨝roleId = id roles )
Search for Output Expressions
and Invariants• Use program synthesis as search
engine
2/1/13 NEDB '13 13
Program synthesizer
Symbolic desc. of search space
Solutionconstraints
Expression thatsatisfies all theconstraints
Symbolic manipulation
Counter-example driven search
Real-world Evaluation
2/1/13 NEDB '13 15
Wilos (project management application) – 62k LOC
Operation type # Fragments found
# Fragments converted
Projection 1 1
Selection 13 10
Join 7 7
Aggregation 11 10
Total 33 28
0 20K 40K 60K 80K 100K100
1K
10K
100K
1000Koriginal (lazy)
Number of roles / users in DB
Execu
tio
n t
ime (
ms)
Performance Evaluation:Join Query
2/1/13 NEDB '13 16
Nested-loop join Hash join! O(n2) O(n)
Developing Database Applications
2/1/13 NEDB '13 17
SQL Database
JavaApplication Logic
PL/SQLStoredProcedures
SQLQuery
Application Server
Application Distribution
Running Example
discount = executeQuery("select discount from customers where id = " + cid);
totalAmount = orderTotal * (1 – discount);
credit = executeQuery("select credit from customers where id = " +
cid);
if (credit < totalAmount)printToConsole("Only " + credit + " in account!");
elseexecuteUpdate("update customer set credit = " +
(credit – totalAmount) + " where id = " + cid);
2/1/13 18NEDB '13
Actual Execution
discount = executeQuery("select discount from customers where id = " + cid);
totalAmount = orderTotal * (1 – discount);
credit = executeQuery("select credit from customers where id = " +
cid);
if (credit < totalAmount)printToConsole("Only " + credit + " in account!");
elseexecuteUpdate("update customer set credit = " +
(credit – totalAmount) + " where id = " + cid);
2/1/13 19NEDB '13
DB
APP
APP
DB
DB
Actual Execution
discount = executeQuery("select discount from customers where id = " + cid);
totalAmount = orderTotal * (1 – discount);
credit = executeQuery("select credit from customers where id = " +
cid);
if (credit < totalAmount)printToConsole("Only " + credit + " in account!");
elseexecuteUpdate("update customer set credit = " +
(credit – totalAmount) + " where id = " + cid);
2/1/13 20NEDB '13
network communication
network communication
network communication
network communication
DB
APP
APP
DB
DB
Speeding up Execution
discount = executeQuery("select discount from customers where id = " + cid);
totalAmount = orderTotal * (1 – discount);
credit = executeQuery("select credit from customers where id = " +
cid);
if (credit < totalAmount)printToConsole("Only " + credit + " in account!");
elseexecuteUpdate("update customer set credit = " +
(credit – totalAmount) + " where id = " + cid);
2/1/13 21NEDB '13
DB
APP
DB
Speeding up Execution
discount = executeQuery("select discount from customers where id = " + cid);
totalAmount = orderTotal * (1 – discount);
credit = executeQuery("select credit from customers where id = " +
cid);
if (credit < totalAmount)printToConsole("Only " + credit + " in account!");
elseexecuteUpdate("update customer set credit = " +
(credit – totalAmount) + " where id = " + cid);
2/1/13 22NEDB '13
data dependencyDB
APP
DB
control dependency
Speeding up Execution
discount = executeQuery("select discount from customers where id = " + cid);
totalAmount = orderTotal * (1 – discount);
credit = executeQuery("select credit from customers where id = " +
cid);
if (credit < totalAmount)printToConsole("Only " + credit + " in account!");
elseexecuteUpdate("update customer set credit = " +
(credit – totalAmount) + " where id = " + cid);
2/1/13 23NEDB '13
DB Server
DB
APP
DB
control dependency
data dependency
Introducing Pyxis
• “Store-procedurizes” DB apps and pushes computation to the DB
• Adaptively controls the amount of computation pushed to DB for optimal performance
• No programmer intervention required
2/1/13 24NEDB '13
How Pyxis Works
2/1/13 NEDB '13 26
Instrument
Partition
Monitor
App Server DB Server
Deploy
JavaSQLJavaSQLJava
Java
Java
SQLJavaSQL
SQLJava
JavaSQLJavaSQLJavaSQLJava
SQL
Java
control transfer
How Pyxis Works
2/1/13 NEDB '13 27
Monitor
App Server DB Server
DeployJava
Java
SQLJava
SQL
SQL
Java
control transfer
Instrument
Partition
JavaSQLJavaSQLJavaSQLJava
JavaSQLJavaSQLJavaSQLJava
Generating Program Partitions
• Deploy and profile application as-is• Construct a dependence graph of
program statements– captures both control and data flow
• Formulate linear program from profile data and dependence graph– solution gives a partitioning of the
source code
2/1/13 NEDB '13 28
100 300 500 700 900 1100 13005
10
15
20
25
JDBC
Manual
Pyxis
Average Thruput (xact / s)
Ave
rag
e L
ate
ncy (
ms)
TPC-C on 16-core DB machine
2/1/13 32NEDB '13
Pyxis generated implementation:
3x latency reduction1.7x thruput increase
StatusQuo
Ease DB application development
Convert imperative program statements into declarative SQL
Fully automatic code partitioning using application and server characteristics
db.csail.mit.edu/statusquo
2/1/13 NEDB '13 33