no false negatives: accepting all useful schedules in a...
TRANSCRIPT
No False Negatives: Accepting All Useful Schedulesin a Fast Serializable Many-Core System
Dominik Durner, Thomas Neumann
April 10, 2019
Technische Universitat Munchen
Motivation
I Concurrency control schemes only approximate the class of serializableschedules, such as 2PL, OCC, TicToc
I Therefore, unexpected behavior and also unnecessary aborts are introducedI Spurious aborts due to implementation artifacts that are hard to understand
I For example, 2PL cannot accept:
t1r(x) w(x) r(y) c
t2r(x) w(z) c
I Only Serialization Graph Testing (SGT) accepts all valid schedulesI SGT seems to be too expensive and not scalable
2
Motivation
I Concurrency control schemes only approximate the class of serializableschedules, such as 2PL, OCC, TicToc
I Therefore, unexpected behavior and also unnecessary aborts are introducedI Spurious aborts due to implementation artifacts that are hard to understandI For example, 2PL cannot accept:
t1r(x) w(x) r(y) c
t2r(x) w(z) c
I Only Serialization Graph Testing (SGT) accepts all valid schedulesI SGT seems to be too expensive and not scalable
2
Motivation
I Concurrency control schemes only approximate the class of serializableschedules, such as 2PL, OCC, TicToc
I Therefore, unexpected behavior and also unnecessary aborts are introducedI Spurious aborts due to implementation artifacts that are hard to understandI For example, 2PL cannot accept:
t1r(x) w(x) r(y) c
t2r(x) w(z) c
I Only Serialization Graph Testing (SGT) accepts all valid schedulesI SGT seems to be too expensive and not scalable
2
Motivation
I Concurrency control schemes only approximate the class of serializableschedules, such as 2PL, OCC, TicToc
I Therefore, unexpected behavior and also unnecessary aborts are introducedI Spurious aborts due to implementation artifacts that are hard to understandI For example, 2PL cannot accept:
t1r(x) w(x) r(y) c
t2r(x) w(z) c
I Only Serialization Graph Testing (SGT) accepts all valid schedulesI SGT seems to be too expensive and not scalable
2
Motivation: Desired Schedules
I Conflict graphs allow to accept all conflict serializable schedules
I Recoverability is independent of serializabilityI DBMS users expect to see committed changes
all schedules
CSR
OCSR
COCSR
RC
Note that S2PL ( COCSR ∩ RC
3
Motivation: Desired Schedules
I Conflict graphs allow to accept all conflict serializable schedulesI Recoverability is independent of serializability
I DBMS users expect to see committed changes
all schedules
CSR
OCSR
COCSR
RC
Note that S2PL ( COCSR ∩ RC
3
Motivation: Desired Schedules
I Conflict graphs allow to accept all conflict serializable schedulesI Recoverability is independent of serializabilityI DBMS users expect to see committed changes
all schedules
CSROCSR
COCSR
RC
Note that S2PL ( COCSR ∩ RC
3
Motivation: Desired Schedules
I Conflict graphs allow to accept all conflict serializable schedulesI Recoverability is independent of serializabilityI DBMS users expect to see committed changes
all schedules
CSROCSR
COCSR
RC
Note that S2PL ( COCSR ∩ RC
3
Contribution
Our approach leverages the conflict graph and1. accepts all useful COCSR ∩ RC schedules2. meets users’ expectations3. has low overhead for maintaining the graph4. scales to many-core systems
all schedules
CSROCSR
COCSR
RC
4
Serialization Graph Testing (SGT)
I Theorem: s ∈ CSR ⇔ CG(s) is acylicI Update CG(s) at operation arrival and allow if CG(s) is acyclicI Remove all outgoing edges of a node at its deletion
Example: s = r0[x ]w0[x ]
r1[x ] r2[x ]w2[x ]w2[y ] c2 c0 c1
t0
t1 t2
⇒ s ∈ CSR
5
Serialization Graph Testing (SGT)
I Theorem: s ∈ CSR ⇔ CG(s) is acylicI Update CG(s) at operation arrival and allow if CG(s) is acyclicI Remove all outgoing edges of a node at its deletion
Example: s = r0[x ]w0[x ]
r1[x ] r2[x ]w2[x ]w2[y ] c2 c0 c1
t0
t1 t2
⇒ s ∈ CSR
5
Serialization Graph Testing (SGT)
I Theorem: s ∈ CSR ⇔ CG(s) is acylicI Update CG(s) at operation arrival and allow if CG(s) is acyclicI Remove all outgoing edges of a node at its deletion
Example: s = r0[x ]w0[x ] r1[x ]
r2[x ]w2[x ]w2[y ] c2 c0 c1
t0
t1
t2
⇒ s ∈ CSR
5
Serialization Graph Testing (SGT)
I Theorem: s ∈ CSR ⇔ CG(s) is acylicI Update CG(s) at operation arrival and allow if CG(s) is acyclicI Remove all outgoing edges of a node at its deletion
Example: s = r0[x ]w0[x ] r1[x ] r2[x ]w2[x ]
w2[y ] c2 c0 c1
t0
t1 t2
⇒ s ∈ CSR
5
Serialization Graph Testing (SGT)
I Theorem: s ∈ CSR ⇔ CG(s) is acylicI Update CG(s) at operation arrival and allow if CG(s) is acyclicI Remove all outgoing edges of a node at its deletion
Example: s = r0[x ]w0[x ] r1[x ] r2[x ]w2[x ]w2[y ] c2 c0 c1
t0
t1 t2
⇒ s ∈ CSR
5
SGT Lacked Practical Relevance
I SGT has the best theoretical properties of accepting all valid schedulesI However, previous work fails to implement SGT efficiently in practice
We developed the first practical and scalable algorithm that leveragesthe theoretical superior concept of graph-based serialization testing
6
SGT Lacked Practical Relevance
I SGT has the best theoretical properties of accepting all valid schedulesI However, previous work fails to implement SGT efficiently in practice
We developed the first practical and scalable algorithm that leveragesthe theoretical superior concept of graph-based serialization testing
6
Prerequisites for Node Deletions
Pitfall: Deletion of a committed node tc
Example: s = r0[x ]w0[x ] r1[x ] r2[x ]w2[x ]w2[y ] c2
r0[y] c0 c1
t0
t1 t2
⇒ s 6∈ CSR, but not detectable if t2 was deleted
Deletion of committed node is only allowed if all incoming edges are removed
7
Prerequisites for Node Deletions
Pitfall: Deletion of a committed node tc
Example: s = r0[x ]w0[x ] r1[x ] r2[x ]w2[x ]w2[y ] c2
r0[y] c0 c1
t0
t1 t2
⇒ s 6∈ CSR, but not detectable if t2 was deleted
Deletion of committed node is only allowed if all incoming edges are removed
7
Prerequisites for Node Deletions
Pitfall: Deletion of a committed node tc
Example: s = r0[x ]w0[x ] r1[x ] r2[x ]w2[x ]w2[y ] c2
r0[y] c0 c1
t0
t1
t2
⇒ s 6∈ CSR, but not detectable if t2 was deleted
Deletion of committed node is only allowed if all incoming edges are removed
7
Prerequisites for Node Deletions
Pitfall: Deletion of a committed node tc
Example: s = r0[x ]w0[x ] r1[x ] r2[x ]w2[x ]w2[y ] c2 r0[y] c0 c1
t0
t1
t2
⇒ s 6∈ CSR, but not detectable if t2 was deleted
Deletion of committed node is only allowed if all incoming edges are removed
7
Prerequisites for Node Deletions
Pitfall: Deletion of a committed node tc
Example: s = r0[x ]w0[x ] r1[x ] r2[x ]w2[x ]w2[y ] c2 r0[y] c0 c1
t0
t1 t2
⇒ s 6∈ CSR, but not detectable if t2 was deleted
Deletion of committed node is only allowed if all incoming edges are removed
7
Prerequisites for Node Deletions
Pitfall: Deletion of a committed node tc
Example: s = r0[x ]w0[x ] r1[x ] r2[x ]w2[x ]w2[y ] c2 r0[y] c0 c1
t0
t1 t2
⇒ s 6∈ CSR, but not detectable if t2 was deleted
Deletion of committed node is only allowed if all incoming edges are removed
7
(Log-) Recoverability Constraints
Every transaction commit needs to wait until it is not dependent on in-flight results
Example: s = r0[x ]w0[x ] r1[x ] r2[x ]w2[x ]w2[y ] c2 c0 c1
t0
t1 t2
No incoming write-read, write-write edge from an uncommitted node allowed
8
(Log-) Recoverability Constraints
Every transaction commit needs to wait until it is not dependent on in-flight results
Example: s = r0[x ]w0[x ] r1[x ] r2[x ]w2[x ]w2[y ] c2���XXXc0 c1 a0 a1
t0
t1 t2
No incoming write-read, write-write edge from an uncommitted node allowed
8
(Log-) Recoverability Constraints
Every transaction commit needs to wait until it is not dependent on in-flight results
Example: s = r0[x ]w0[x ] r1[x ] r2[x ]w2[x ]w2[y ] c2���XXXc0 c1 a0 a1
t0
t1 t2
No incoming write-read, write-write edge from an uncommitted node allowed
8
Preserving the Commit Order
No (uncommitted) incoming edge at commit time to preserve the commit order
Example: s = r0[x ] w1[x ] c1
d1 r2[y ] c2 w0[y ] c0 c1
t0
t1 t2sorig = r0[x ] w1[x ] c1 r2[y ] c2 w0[y ] c0
with s′ = t2 t0 t1, but sorig /∈ COCSR
All useful COCSR ∩ RC schedules accepted due to commit delays
Committed nodes are deleted directly including all outgoing edges
9
Preserving the Commit Order
No (uncommitted) incoming edge at commit time to preserve the commit order
Example: s = r0[x ] w1[x ] c1
d1 r2[y ] c2 w0[y ] c0 c1
t0
t1
t2sorig = r0[x ] w1[x ] c1 r2[y ] c2 w0[y ] c0
with s′ = t2 t0 t1, but sorig /∈ COCSR
All useful COCSR ∩ RC schedules accepted due to commit delays
Committed nodes are deleted directly including all outgoing edges
9
Preserving the Commit Order
No (uncommitted) incoming edge at commit time to preserve the commit order
Example: s = r0[x ] w1[x ] ��ZZc1 d1
r2[y ] c2 w0[y ] c0 c1
t0
t1
t2sorig = r0[x ] w1[x ] c1 r2[y ] c2 w0[y ] c0
with s′ = t2 t0 t1, but sorig /∈ COCSR
All useful COCSR ∩ RC schedules accepted due to commit delays
Committed nodes are deleted directly including all outgoing edges
9
Preserving the Commit Order
No (uncommitted) incoming edge at commit time to preserve the commit order
Example: s = r0[x ] w1[x ] ��ZZc1 d1 r2[y ] c2
w0[y ] c0 c1
t0
t1 t2
sorig = r0[x ] w1[x ] c1 r2[y ] c2 w0[y ] c0
with s′ = t2 t0 t1, but sorig /∈ COCSR
All useful COCSR ∩ RC schedules accepted due to commit delays
Committed nodes are deleted directly including all outgoing edges
9
Preserving the Commit Order
No (uncommitted) incoming edge at commit time to preserve the commit order
Example: s = r0[x ] w1[x ] ��ZZc1 d1 r2[y ] c2 w0[y ]
c0 c1
t0
t1
t2sorig = r0[x ] w1[x ] c1 r2[y ] c2 w0[y ] c0
with s′ = t2 t0 t1, but sorig /∈ COCSR
All useful COCSR ∩ RC schedules accepted due to commit delays
Committed nodes are deleted directly including all outgoing edges
9
Preserving the Commit Order
No (uncommitted) incoming edge at commit time to preserve the commit order
Example: s = r0[x ] w1[x ] ��ZZc1 d1 r2[y ] c2 w0[y ] c0
c1
t0
t1
t2sorig = r0[x ] w1[x ] c1 r2[y ] c2 w0[y ] c0
with s′ = t2 t0 t1, but sorig /∈ COCSR
All useful COCSR ∩ RC schedules accepted due to commit delays
Committed nodes are deleted directly including all outgoing edges
9
Preserving the Commit Order
No (uncommitted) incoming edge at commit time to preserve the commit order
Example: s = r0[x ] w1[x ] ��ZZc1 d1 r2[y ] c2 w0[y ] c0 c1
t0
t1
t2sorig = r0[x ] w1[x ] c1 r2[y ] c2 w0[y ] c0
with s′ = t2 t0 t1, but sorig /∈ COCSR
All useful COCSR ∩ RC schedules accepted due to commit delays
Committed nodes are deleted directly including all outgoing edges
9
Preserving the Commit Order
No (uncommitted) incoming edge at commit time to preserve the commit order
Example: s = r0[x ] w1[x ] ��ZZc1 d1 r2[y ] c2 w0[y ] c0 c1
t0
t1 t2
sorig = r0[x ] w1[x ] c1 r2[y ] c2 w0[y ] c0
with s′ = t2 t0 t1, but sorig /∈ COCSR
All useful COCSR ∩ RC schedules accepted due to commit delays
Committed nodes are deleted directly including all outgoing edges
9
Preserving the Commit Order
No (uncommitted) incoming edge at commit time to preserve the commit order
Example: s = r0[x ] w1[x ] ��ZZc1 d1 r2[y ] c2 w0[y ] c0 c1
t0
t1 t2
sorig = r0[x ] w1[x ] c1 r2[y ] c2 w0[y ] c0
with s′ = t2 t0 t1, but sorig /∈ COCSR
All useful COCSR ∩ RC schedules accepted due to commit delays
Committed nodes are deleted directly including all outgoing edges
9
Scaling of our SGT-based Approach
I No incoming edges to commit simplifies cycle checkI Conflict graph is accessed concurrently by multiple threadsI No other transaction is allowed to modify a node during its final check
Transaction local shared/exclusive locks help to scale the graph
10
Scaling of our SGT-based Approach
I No incoming edges to commit simplifies cycle checkI Conflict graph is accessed concurrently by multiple threadsI No other transaction is allowed to modify a node during its final check
Transaction local shared/exclusive locks help to scale the graph
10
Example of our SGT-based Approach
t0r(x) w(x) cstart c
t1r(x) cstart c
wait for t0
t0
t1
sharedLocks: {}exclusiveLock: falsesharedLocks: {}exclusiveLock: truesharedLocks: {t0}exclusiveLock: falsesharedLocks: {}exclusiveLock: true
sharedLocks: {}exclusiveLock: false
sharedLocks: {t1}exclusiveLock: falsesharedLocks: {}exclusiveLock: falsesharedLocks: {}exclusiveLock: true
11
Example of our SGT-based Approach
t0r(x) w(x) cstart c
t1r(x) cstart c
wait for t0
t0 t1
sharedLocks: {}exclusiveLock: false
sharedLocks: {}exclusiveLock: truesharedLocks: {t0}exclusiveLock: falsesharedLocks: {}exclusiveLock: true
sharedLocks: {}exclusiveLock: false
sharedLocks: {t1}exclusiveLock: false
sharedLocks: {}exclusiveLock: falsesharedLocks: {}exclusiveLock: true
11
Example of our SGT-based Approach
t0r(x) w(x) cstart c
t1r(x) cstart c
wait for t0
t0 t1
sharedLocks: {}exclusiveLock: false
sharedLocks: {}exclusiveLock: true
sharedLocks: {t0}exclusiveLock: falsesharedLocks: {}exclusiveLock: true
sharedLocks: {}exclusiveLock: falsesharedLocks: {t1}exclusiveLock: false
sharedLocks: {}exclusiveLock: false
sharedLocks: {}exclusiveLock: true
11
Example of our SGT-based Approach
t0r(x) w(x) cstart c
t1r(x) cstart c
wait for t0
t0 t1
sharedLocks: {}exclusiveLock: falsesharedLocks: {}exclusiveLock: true
sharedLocks: {t0}exclusiveLock: false
sharedLocks: {}exclusiveLock: true
sharedLocks: {}exclusiveLock: falsesharedLocks: {t1}exclusiveLock: falsesharedLocks: {}exclusiveLock: false
sharedLocks: {}exclusiveLock: true
11
Example of our SGT-based Approach
t0r(x) w(x) cstart c
t1r(x) cstart c
wait for t0
t0
t1
sharedLocks: {}exclusiveLock: falsesharedLocks: {}exclusiveLock: truesharedLocks: {t0}exclusiveLock: false
sharedLocks: {}exclusiveLock: true
sharedLocks: {}exclusiveLock: falsesharedLocks: {t1}exclusiveLock: falsesharedLocks: {}exclusiveLock: falsesharedLocks: {}exclusiveLock: true
11
Experimental Evaluation
Setup:I 4-socket Intel Xeon server (60 cores) with 1TB DRAMI Every transaction is scheduled on one worker threadI Aborts require undos and restarts of the aborted transactions
Algorithms:I Our SGT-based approachI TicTocI 2PL with row based atomic read-write locks and deadlock prevention
12
SmallBank Medium Contention (1000 Customers)
0.0 × 10+0
2.5 × 10+6
5.0 × 10+6
7.5 × 10+6
0 20 40 60
OLTP threads
TX
/s
2PL
TicToc
SGT
0
2 × 10 2
4 × 10 2
6 × 10 2
8 × 10 2
0 20 40 60
OLTP threads
Ab
ort
Rate
SGTTicToc
2PL
-
-
-
-
13
SmallBank Medium Contention (1000 Customers)
0.0 × 10+0
2.5 × 10+6
5.0 × 10+6
7.5 × 10+6
0 20 40 60
OLTP threads
TX
/s
2PL
TicToc
SGT
0
2 × 10 2
4 × 10 2
6 × 10 2
8 × 10 2
0 20 40 60
OLTP threads
Ab
ort
Rate
SGTTicToc
2PL
-
-
-
-
13
YCSB-A, 50% writes, 16 queries/tx, 60 threads
100%
0.00 0.25 0.50 0.75
Theta
Ab
ort
Rate
(lo
g) 2PL
TicToc
SGT
10%
1%
0.1%
0.0 × 10+0
4.0 × 10+5
8.0 × 10+5
1.2 × 10+6
0.00 0.25 0.50 0.75
Theta
TX/s
TicToc
2PL SGT
Our SGT has competitive throughput while reducing aborts significantly!
14
YCSB-A, 50% writes, 16 queries/tx, 60 threads
100%
0.00 0.25 0.50 0.75
Theta
Ab
ort
Rate
(lo
g) 2PL
TicToc
SGT
10%
1%
0.1% 0.0 × 10+0
4.0 × 10+5
8.0 × 10+5
1.2 × 10+6
0.00 0.25 0.50 0.75
Theta
TX/s
TicToc
2PL SGT
Our SGT has competitive throughput while reducing aborts significantly!
14
Summary: Our graph-based concurrency control algorithm
accepts all usefulCOCSR ∩ RC schedules
all schedules
CSROCSR
COCSR
RC
0
2 × 10 2
4 × 10 2
6 × 10 2
8 × 10 2
0 20 40 60
OLTP threads
Abort
Rate
SGTTicToc
2PL
-
-
-
-
reduces aborted schedulesand meets users’ expectations
has low protocol overheadand scales to many-core systems
0.0 × 10+0
2.5 × 10+6
5.0 × 10+6
7.5 × 10+6
0 20 40 60
OLTP threads
TX
/s
2PL
TicToc
SGT
15
Summary: Our graph-based concurrency control algorithm
accepts all usefulCOCSR ∩ RC schedules
all schedules
CSROCSR
COCSR
RC
0
2 × 10 2
4 × 10 2
6 × 10 2
8 × 10 2
0 20 40 60
OLTP threads
Abort
Rate
SGTTicToc
2PL
-
-
-
-
reduces aborted schedulesand meets users’ expectations
has low protocol overheadand scales to many-core systems
0.0 × 10+0
2.5 × 10+6
5.0 × 10+6
7.5 × 10+6
0 20 40 60
OLTP threads
TX
/s
2PL
TicToc
SGT
15
Summary: Our graph-based concurrency control algorithm
accepts all usefulCOCSR ∩ RC schedules
all schedules
CSROCSR
COCSR
RC
0
2 × 10 2
4 × 10 2
6 × 10 2
8 × 10 2
0 20 40 60
OLTP threads
Abort
Rate
SGTTicToc
2PL
-
-
-
-
reduces aborted schedulesand meets users’ expectations
has low protocol overheadand scales to many-core systems
0.0 × 10+0
2.5 × 10+6
5.0 × 10+6
7.5 × 10+6
0 20 40 60
OLTP threads
TX
/s
2PL
TicToc
SGT
15