Making Database Applications Perform Using Program Analysis

Alvin Cheung, Samuel Madden, Armando Solar-Lezama (MIT)

Owen Arden, Andrew C. Myers (Cornell)

Developing Database Applications

2/1/13 NEDB '13 2

[Diagram: Application Server (Java application logic), SQL query, SQL Database.]

Developing Database Applications

[Diagram: Application Server (Java application logic), SQL query, SQL Database (PL/SQL stored procedures).]

Developing Database Applications

[Diagram: Application Server (Java application logic), SQL query, SQL Database (PL/SQL stored procedures). Two challenges are called out: Language Choice for Application Logic, and Application Distribution.]

Program analysis to the rescue!

StatusQuo

• Express application logic in ways that programmers are comfortable with

• Job of compiler & runtime to determine the most efficient implementation


Two Key Technologies

• Infer queries from imperative code

• Migrate computation between servers for optimal performance


Relational Operations in Imperative Code

  List<User> getUsersWithRoles() {
    List<User> users = getUsersFromDB();   // SELECT * FROM user
    List<Role> roles = getRolesFromDB();   // SELECT * FROM role
    List<User> results = new ArrayList<>();
    for (User u : users) {
      for (Role r : roles) {
        if (u.roleId == r.id) results.add(u);
      }
    }
    return results;
  }

convert to

  List<User> getUsersWithRoles() {
    return executeQuery(
      "SELECT u FROM users u, roles r WHERE u.roleId = r.id " +
      "ORDER BY u.roleId, r.id");
  }

Relational Operations in Imperative Code

Goal: find a variable that we can rewrite into a SQL expression. In getUsersWithRoles(), the output variable is results.

Query By Synthesis (QBS)

• Identify potential code fragments
  – i.e., regions of code that fetch persistent data and return values

• Find SQL expressions for output variables

• Try to prove that those expressions preserve program semantics
  – if so, convert the code!

Search for Output Expressions

  List getUsersWithRoles () {
    List users = query(select * from users);
    List roles = query(select * from roles);
    List results = [];
    for (User u : users) {
      for (Role r : roles) {
        if (u.roleId == r.id) results = results : []
      }
    }
    return results;
  }

Relations involved: users, roles

Possible expressions to consider for results:
  σf(users)
  topf(users)
  πf(users ⨝g roles)
  πf(σg(users) ⨝h roles)
  other expressions involving users, roles

Infinite search space size!

Constraints for Output Expressions

Hoare-style program verification, applied to the outer loop of getUsersWithRoles():

If the outer loop invariant is true and the outer loop terminates, then the output expression is true.

  output expression:     results = πuser( users ⨝roleId = id roles )
  outer loop invariant:  results = πuser( users[0 .. i] ⨝roleId = id roles )

Still need a smarter way to search
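To make the invariant concrete, here is a minimal Java sketch that runs the nested loop and checks the candidate invariant at the top of every outer iteration, then checks the output expression at loop exit. It is only a runtime check on sample data, not the proof the slides describe. The class name, the method names, and the encoding of users and roles as arrays of ids are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class InvariantSketch {
    // Reference meaning of pi_user(users[0..n) join_{roleId = id} roles).
    // Assumption: each user is represented by its roleId, each role by its id,
    // and the result lists the indexes of the matching users.
    static List<Integer> joinPrefix(int[] userRoleIds, int n, int[] roleIds) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < n; i++)
            for (int id : roleIds)
                if (userRoleIds[i] == id) out.add(i);
        return out;
    }

    // Runs the imperative loop; checks the invariant before each outer
    // iteration and the output expression once the loop has terminated.
    static boolean invariantHolds(int[] userRoleIds, int[] roleIds) {
        List<Integer> results = new ArrayList<>();
        for (int i = 0; i < userRoleIds.length; i++) {
            // Invariant: results equals the join over the first i users.
            if (!results.equals(joinPrefix(userRoleIds, i, roleIds))) return false;
            for (int id : roleIds)
                if (userRoleIds[i] == id) results.add(i);
        }
        // At exit i == users.length, so the invariant is the output expression.
        return results.equals(joinPrefix(userRoleIds, userRoleIds.length, roleIds));
    }

    public static void main(String[] args) {
        int[] users = {1, 2, 9, 1};   // made-up roleIds
        int[] roles = {1, 2, 3};      // made-up role ids
        System.out.println(invariantHolds(users, roles));
        System.out.println(joinPrefix(users, users.length, roles));
    }
}
```

Passing on a few inputs does not establish the invariant; it only shows what the verification conditions assert about the loop state.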

Search for Output Expressions and Invariants

• Use program synthesis as search engine

[Diagram: a symbolic description of the search space and the solution constraints are fed to a program synthesizer, which uses symbolic manipulation and counter-example driven search to produce an expression that satisfies all the constraints.]
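The counter-example driven search can be illustrated with a toy Java sketch. It is not the synthesizer used in the talk: the candidate space here is a hard-coded list of three expressions rather than a symbolic description, and the "verifier" simply tests all small inputs. All names and the encoding of users and roles as id arrays are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class CegisSketch {
    interface Expr { List<Integer> eval(int[] users, int[] roles); }

    // Specification: the imperative nested loop (results hold user indexes).
    static List<Integer> spec(int[] users, int[] roles) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < users.length; i++)
            for (int id : roles)
                if (users[i] == id) out.add(i);
        return out;
    }

    static List<Integer> selectAll(int[] users, int[] roles) {   // sigma_true(users)
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < users.length; i++) out.add(i);
        return out;
    }

    static List<Integer> top1(int[] users, int[] roles) {        // top_1(users)
        List<Integer> out = new ArrayList<>();
        if (users.length > 0) out.add(0);
        return out;
    }

    static List<Integer> join(int[] users, int[] roles) {        // users join roles
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < users.length; i++) {
            int matches = 0;
            for (int id : roles) if (id == users[i]) matches++;
            for (int k = 0; k < matches; k++) out.add(i);
        }
        return out;
    }

    static final Expr[] CANDIDATES = {
        CegisSketch::selectAll, CegisSketch::top1, CegisSketch::join };

    // All arrays of length 0..2 over the values {0, 1}.
    static List<int[]> smallArrays() {
        List<int[]> out = new ArrayList<>();
        out.add(new int[] {});
        for (int a = 0; a < 2; a++) {
            out.add(new int[] {a});
            for (int b = 0; b < 2; b++) out.add(new int[] {a, b});
        }
        return out;
    }

    // Bounded verifier: returns {users, roles} where the candidate differs, or null.
    static int[][] findCounterexample(Expr candidate) {
        for (int[] users : smallArrays())
            for (int[] roles : smallArrays())
                if (!candidate.eval(users, roles).equals(spec(users, roles)))
                    return new int[][] {users, roles};
        return null;
    }

    // Returns the index of the first candidate that survives verification, or -1.
    static int synthesize() {
        List<int[][]> examples = new ArrayList<>();
        for (int c = 0; c < CANDIDATES.length; c++) {
            boolean consistent = true;
            for (int[][] ex : examples)
                if (!CANDIDATES[c].eval(ex[0], ex[1]).equals(spec(ex[0], ex[1])))
                    consistent = false;
            if (!consistent) continue;            // pruned by a stored counterexample
            int[][] cex = findCounterexample(CANDIDATES[c]);
            if (cex == null) return c;            // verified on the bounded domain
            examples.add(cex);                    // remember why this candidate failed
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(synthesize());
    }
}
```

The point of the loop is that each failed candidate contributes a counterexample, and later candidates are rejected against those stored inputs before the more expensive verification step is run.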

Experiments


Real-world Evaluation

Wilos (project management application) – 62k LOC

  Operation type   # Fragments found   # Fragments converted
  Projection               1                     1
  Selection               13                    10
  Join                     7                     7
  Aggregation             11                    10
  Total                   33                    28

Performance Evaluation: Join Query

[Plot: execution time (ms, log scale from 100 to 1000K) versus number of roles / users in DB (0 to 100K). Series visible in the transcript: original (lazy).]

Nested-loop join, O(n²), becomes a hash join, O(n)!
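The complexity gap can be seen in a small Java sketch that computes the same join two ways. This is not the database's join implementation; the class, the User and Role types, and the sample data are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class JoinSketch {
    static final class User {
        final String name; final int roleId;
        User(String name, int roleId) { this.name = name; this.roleId = roleId; }
    }
    static final class Role {
        final int id;
        Role(int id) { this.id = id; }
    }

    // Nested-loop join: compares every user with every role, O(n * m).
    static List<String> nestedLoopJoin(List<User> users, List<Role> roles) {
        List<String> results = new ArrayList<>();
        for (User u : users)
            for (Role r : roles)
                if (u.roleId == r.id) results.add(u.name);
        return results;
    }

    // Hash join: build a table on role id once, then probe it once per user,
    // O(n + m) expected time.
    static List<String> hashJoin(List<User> users, List<Role> roles) {
        Map<Integer, Integer> roleCounts = new HashMap<>();
        for (Role r : roles) roleCounts.merge(r.id, 1, Integer::sum);
        List<String> results = new ArrayList<>();
        for (User u : users) {
            int matches = roleCounts.getOrDefault(u.roleId, 0);
            for (int k = 0; k < matches; k++) results.add(u.name); // one row per matching role
        }
        return results;
    }

    public static void main(String[] args) {
        List<User> users = List.of(new User("ann", 1), new User("bob", 2), new User("cy", 9));
        List<Role> roles = List.of(new Role(1), new Role(2), new Role(3));
        System.out.println(nestedLoopJoin(users, roles));
        System.out.println(hashJoin(users, roles));
    }
}
```

Once the loop is expressed as a query, the choice between these strategies is left to the database's query optimizer instead of being fixed by the application code.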

Developing Database Applications

[Diagram: Application Server (Java application logic), SQL query, SQL Database (PL/SQL stored procedures). Highlighted challenge: Application Distribution.]

Running Example

  discount = executeQuery("select discount from customers where id = " + cid);
  totalAmount = orderTotal * (1 - discount);
  credit = executeQuery("select credit from customers where id = " + cid);
  if (credit < totalAmount)
    printToConsole("Only " + credit + " in account!");
  else
    executeUpdate("update customers set credit = " +
                  (credit - totalAmount) + " where id = " + cid);

Actual Execution

Where each statement of the running example executes:

  discount query              DB
  totalAmount computation     APP
  credit query                DB
  credit check and message    APP
  credit update               DB

Each switch between DB and APP requires network communication: four times for these five statements.
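As a rough illustration of why placement matters, the following Java sketch counts how many times control crosses the network for a given sequence of statement placements. It assumes the five statements of the running example alternate DB, APP, DB, APP, DB; the class and method names are hypothetical.

```java
class RoundTripSketch {
    // Counts how often two consecutive statements run on different servers.
    static int controlTransfers(String[] placement) {
        int transfers = 0;
        for (int i = 1; i < placement.length; i++)
            if (!placement[i].equals(placement[i - 1])) transfers++;
        return transfers;
    }

    public static void main(String[] args) {
        // Assumed placement of the five statements in the running example.
        System.out.println(controlTransfers(new String[] {"DB", "APP", "DB", "APP", "DB"}));
        // A coarser grouping with only three regions.
        System.out.println(controlTransfers(new String[] {"DB", "APP", "DB"}));
    }
}
```

Grouping adjacent statements on the same server reduces the count, which is the effect the following slides aim for.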

Speeding up Execution

[Annotated code: the running example is regrouped into three regions (DB, APP, DB). The slides mark a data dependency and a control dependency between statements, and place the grouped statements on the DB server.]

Introducing Pyxis

• “Store-procedurizes” DB apps and pushes computation to the DB

• Adaptively controls the amount of computation pushed to DB for optimal performance

• No programmer intervention required


Using Pyxis


How Pyxis Works

[Diagram: four stages, Instrument, Partition, Deploy, and Monitor. The application's interleaved Java and SQL code is partitioned into pieces that are deployed on the App Server and the DB Server, with control transfers between the two.]

Generating Program Partitions

• Deploy and profile application as-is

• Construct a dependence graph of program statements
  – captures both control and data flow

• Formulate linear program from profile data and dependence graph
  – solution gives a partitioning of the source code
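The partitioning step can be sketched in Java on a toy dependence graph. The slides say Pyxis formulates a linear program; this sketch swaps in exhaustive search over placements, which is only feasible for a handful of statements. The edge weights, per-statement load, DB budget, and the pinning of statements to a server are all made-up assumptions.

```java
class PartitionSketch {
    static final int APP = 0, DB = 1, FREE = -1;

    // Returns the placement (APP or DB per statement) that minimizes the total
    // weight of dependence edges crossing servers, subject to a DB load budget.
    // pinned[s] is APP, DB, or FREE; each edge is {from, to, weight}.
    static int[] partition(int[] pinned, int[][] edges, int[] load, int dbBudget) {
        int n = pinned.length;
        int[] best = null;
        int bestCost = Integer.MAX_VALUE;
        for (int mask = 0; mask < (1 << n); mask++) {
            int[] place = new int[n];
            int dbLoad = 0;
            boolean ok = true;
            for (int s = 0; s < n; s++) {
                place[s] = (mask >> s) & 1;
                if (pinned[s] != FREE && pinned[s] != place[s]) ok = false;
                if (place[s] == DB) dbLoad += load[s];
            }
            if (!ok || dbLoad > dbBudget) continue;   // infeasible placement
            int cost = 0;
            for (int[] e : edges)
                if (place[e[0]] != place[e[1]]) cost += e[2];  // edge crosses the network
            if (cost < bestCost) { bestCost = cost; best = place; }
        }
        return best;
    }

    public static void main(String[] args) {
        // Statements: 0 discount query, 1 totalAmount, 2 credit query,
        // 3 print message, 4 credit update. Queries pinned to DB, print to APP.
        int[] pinned = {DB, FREE, DB, APP, DB};
        int[][] edges = {{0, 1, 1}, {1, 3, 1}, {2, 3, 1}, {1, 4, 1}, {2, 4, 1}};
        int[] load = {1, 1, 1, 1, 1};
        System.out.println(java.util.Arrays.toString(partition(pinned, edges, load, 4)));
        System.out.println(java.util.Arrays.toString(partition(pinned, edges, load, 3)));
    }
}
```

With the larger budget the free statement (the totalAmount computation) moves to the DB server; with the tighter budget it stays on the app server. That mirrors, in miniature, the idea of adapting how much computation is pushed to the database.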

Experiments


TPC-C on 16-core DB machine

[Plot: average latency (ms, 5 to 25) versus average throughput (xact / s, 100 to 1300) for the JDBC, Manual, and Pyxis implementations.]

Pyxis generated implementation: 3x latency reduction, 1.7x throughput increase

StatusQuo

Ease DB application development

Convert imperative program statements into declarative SQL

Fully automatic code partitioning using application and server characteristics

db.csail.mit.edu/statusquo


Backup
