TRANSCRIPT
ASCAR: Automating Contention Management for High-Performance Storage Systems
(MSST ’15, June 5th, 2015)
Yan Li, Xiaoyuan Lu, Ethan Miller, Darrell Long
Storage Systems Research Center (SSRC), University of California, Santa Cruz
Contention in a modern shared-storage system
[Diagram: Clients A–E connect through network switches to four storage servers and their storage devices.]
Contention harms efficiency
[Chart: total throughput for 100, 150, and 200 clients.]
Contention causes fluctuation
[Chart: per-client throughput (MB/s) over 25 seconds for Nodes 1–5.]
Client throughput of a random write workload
5 nodes accessing 5 servers
High temporal and spatial I/O speed fluctuation
This is what’s happening at each server node
[Photo; picture credit: https://www.checkfelix.com/reiseblog/10-argumente-urlaub-schottland/]
Limiting client I/O rates can improve performance
… if done properly
[Chart: total throughput for 100, 150, and 200 clients, plus 200 clients with contention control.]
Challenges of distributed I/O rate control: 1. capability discovery
[Chart series: throughput of 200 clients with contention control under several rate limits; nonoptimal rate limits leave throughput on the table, while the optimal rate limit maximizes it.]
Capability discovery usually involves communication:
• between clients
• with a central controller
Challenges of distributed I/O rate control: 2. scalability
Capability discovery usually involves communication:
• between clients
• with a central controller
Inter-node communication grows at O(n²)
Adds overhead to an already congested network
Low responsiveness for highly dynamic workloads
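To see why all-to-all capability discovery scales poorly, count the communication channels involved (a simple counting sketch for illustration, not part of ASCAR):

```python
def pairwise_channels(n_clients):
    """All-to-all capability exchange needs one channel per client pair,
    so coordination traffic grows quadratically: n * (n - 1) / 2."""
    return n_clients * (n_clients - 1) // 2

# Doubling the client count roughly quadruples the coordination traffic.
```

For example, going from 100 to 200 clients grows the pair count from 4,950 to 19,900, a 4x increase for a 2x larger system.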
Our goal
Reduce congestion and improve overall system efficiency
• General purpose
• Scalability
• Reduced human involvement
• Reduced management cost
Target environments
• High-performance computing: homogeneous I/O among clients; bandwidth sensitive; requires fair bandwidth allocation
• Virtual machines accessing a shared image: performance isolation
• Enterprise computing: interactive workloads, backup, indexing, data sharing
• Future data center-wide storage systems that serve multiple customers: performance isolation
Core ideas
Client-side rule-based I/O rate control:
1. no need for central scheduling or coordination; nimble and highly responsive
2. no need to change server software or hardware
3. no scale-up bottleneck
Use machine learning and heuristics for rule generation and optimization:
no prior knowledge of the system or workload is required
Components of the ASCAR prototype
[Diagram: each client node runs an application, a file system client, and an ASCAR traffic controller that enforces traffic rules on I/O to the storage servers. The ASCAR Rule Manager holds a Workload Character & Rule Set Database, a Workload Classifier, and a Rule Generator & Optimizer (SHARP); it receives workload character markers from clients and sends traffic rules back.]
Rule-based Contention Control
Rules tell the controller how to react to congestion:
• shrink the congestion window if congestion is building up
• enlarge the congestion window if speed is increasing
Each rule maps a congestion state to an action:
(congestion state (CS) statistics) → <action>
Each client tracks three congestion state statistics: (ack_ewma, send_ewma, pt_ratio)
An action <m, b, τ> describes how to change the congestion window and rate limit:
new_cong_window = m × cong_window + b
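As a concrete sketch of the mechanism above (names and data layout are simplifications for illustration, not the actual Lustre client code), a rule lookup and action application might look like:

```python
from dataclasses import dataclass

@dataclass
class Action:
    m: float    # multiplier on the congestion window
    b: float    # additive term
    tau: int    # rate limit: minimum gap between requests

@dataclass
class Rule:
    ack_ewma_max: float   # rule applies while ack_ewma is below this bound
    pt_ratio_max: float   # ... and pt_ratio is below this bound
    action: Action

def apply_rules(rules, ack_ewma, pt_ratio, cong_window):
    """Look up the rule covering the observed congestion state and apply
    its action: new_cong_window = m * cong_window + b (floored at 1)."""
    for rule in rules:
        if ack_ewma <= rule.ack_ewma_max and pt_ratio <= rule.pt_ratio_max:
            a = rule.action
            return max(1.0, a.m * cong_window + a.b), a.tau
    raise ValueError("a rule set must cover the whole congestion state space")

# A one-rule set covering the whole state space:
rules = [Rule(float("inf"), float("inf"), Action(0.9, 1, 121))]
```

Because the lookup is purely local to the client, reacting to a congestion-state change costs one table scan and no network round trips.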
Rule set
A rule set contains many rules
A rule set covers the whole congestion state space
[Diagram: the congestion state space as a plane with axes ack_ewma and pt_ratio, spanning (0,0) to (INT_MAX, INT_MAX); each region maps to an Action <m, b, τ>.]
Generating a rule set for a certain application
Extract a short signature workload that covers the important features of the application’s I/O workload
• a signature workload is usually 20 to 30 seconds long
Generate candidate rules and benchmark them with the signature workload on the real system
• test possible combinations of action variables
Split the hottest rule in the set to generate more rules
Begin with one rule: the whole state space maps to one action
[Diagram: a single Action <m, b, τ> covers the whole state space.]
Try different values of <m, b, τ> with the workload
Try the Cartesian product of:
{m ± 0.01, m ± 0.02, m ± 0.04, … } ×
{b ± 0.01, b ± 0.02, b ± 0.04, … } ×
{ τ ± 1, τ ± 2, τ ± 4, … }
Find the rule that yields the highest performance
[Diagram: the whole state space now maps to Action <0.9, 1, 121>.]
Split the state space at the most observed state values
[Diagram: the state space is split at ack_ewma = 0.98312 s and pt_ratio = 1.8 into four regions, each initially keeping Action <0.9, 1, 121>.]
Run the workload; find the rules that were triggered most often
[Diagram: of the four regions (all Action <0.9, 1, 121>), one was used 32k times; the others 2k, 0.3k, and 120 times.]
Improve the most used rule by sweeping all possible values of <m, b, τ>
Try the Cartesian product of:
{m ± 0.01, m ± 0.02, m ± 0.04, … } ×
{b ± 0.01, b ± 0.02, b ± 0.04, … } ×
{ τ ± 1, τ ± 2, τ ± 4, … }
Find the action that works best
[Diagram: the most used region’s action becomes <1.1, 1.5, 80>; the other three regions keep Action <0.9, 1, 121>.]
Split the most used rule’s state space at the most observed state values
[Diagram: the region with Action <1.1, 1.5, 80> is further split into four sub-regions, each keeping Action <1.1, 1.5, 80>.]
Run the workload, find the most used rule, and improve it
[Diagram: the most used sub-region’s action becomes <0.5, -1, 30>.]
After finding the best action, split the most used rule
[Diagram: the region with Action <0.5, -1, 30> is split again into four sub-regions.]
Repeat this process
[Diagram: further iterations refine the rule set, producing regions with actions such as <1, 2, 10> and <2, 1, 10>.]
… until either we are satisfied with the results or we fail to find a better rule
[Diagram: the final rule set divides the state space into many regions, each with its own action.]
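The iterative process above boils down to a greedy loop: find the hottest rule, sweep candidate actions for it, then split its region at the most observed state values. A toy sketch, where `benchmark` and `hottest_state` are caller-supplied stand-ins for running the signature workload on the real system (everything here is illustrative, not the SHARP implementation):

```python
import itertools

def neighbors(action, steps=2):
    """Candidate actions: perturb each of <m, b, tau> by exponentially
    growing deltas and take the Cartesian product."""
    m, b, tau = action
    def ladder(center, base):
        return [center] + [center + s * base * (2 ** i)
                           for i in range(steps) for s in (-1, 1)]
    return list(itertools.product(ladder(m, 0.01), ladder(b, 0.01), ladder(tau, 1)))

def split(region, point):
    """Split a rectangular region of the state space into four at `point`."""
    (alo, ahi), (plo, phi) = region
    a, p = point
    return [((alo, a), (plo, p)), ((a, ahi), (plo, p)),
            ((alo, a), (p, phi)), ((a, ahi), (p, phi))]

def sharp(benchmark, hottest_state, rounds=2):
    """Greedy rule-set search: optimize the most used rule's action, then
    split its region at the most observed state values.
    benchmark(rules) -> (throughput, usage count per rule);
    hottest_state(region) -> most observed (ack_ewma, pt_ratio) in region."""
    whole = ((0.0, float("inf")), (0.0, float("inf")))
    rules = [(whole, (1.0, 0.0, 100))]       # start: one rule covers everything
    for _ in range(rounds):
        tp, usage = benchmark(rules)         # 1. find the hottest rule
        hot = max(range(len(rules)), key=lambda i: usage[i])
        region, action = rules[hot]
        best_tp, best = tp, action           # 2. sweep candidate actions
        for cand in neighbors(action):
            rules[hot] = (region, cand)
            cand_tp, _ = benchmark(rules)
            if cand_tp > best_tp:
                best_tp, best = cand_tp, cand
        # 3. split the hottest region; sub-regions inherit the best action
        rules[hot:hot + 1] = [(r, best) for r in split(region, hottest_state(region))]
    return rules
```

Each round multiplies the rule count only where the workload actually spends its time, which keeps the rule table small while refining exactly the hot part of the state space.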
Prototype and Evaluation
The ASCAR prototype works with Lustre, a popular high-performance file system used in HPC
The client controller works within the Lustre client: highly responsive and low overhead
Hardware: 5 servers, 5 clients; Intel Xeon CPU E3-1230 V2 @ 3.30 GHz, 16 GB RAM, Intel 330 SSD for the OS, a dedicated 7200 RPM HGST Travelstar Z7K500 hard drive for Lustre, Gigabit Ethernet
ASCAR is good at increasing throughput and decreasing speed variance
[Charts: client throughput (MB/s) over 25 seconds for Nodes 1–5, without ASCAR (large fluctuations) and with ASCAR (steadier and higher).]
Workload Throughput Improvements
[Scatter chart, shown over several slides: speed variance change (y, -60% to +30%) vs. throughput improvement (x, 0% to 40%) for Seq. read, Seq. write, Random read, Random write, Random r/w, BTIO (B4), and BTIO (C4); two series, “TP only” and “TP and Var”. Callouts highlight higher throughput, lower speed variance, and the Random write, BTIO (B4), and BTIO (C4) points.]
Our prototype shows that …
ASCAR increases the performance of all workloads by 2% to 36%
ASCAR works best at boosting write-heavy workloads: 25% to 36%
and it can lower speed variance at the same time by more than 50%
Handling a new workload without offline training
Measure the workload’s features
Compare its features with known workloads
Find the most similar known workload and use its contention control rules
Components of the ASCAR prototype
[Diagram: the ASCAR components again, highlighting the Workload Character & Rule Set Database and the Workload Classifier.]
Measuring workloads’ pressure on the system
Workloads can put radically different pressure on the underlying system
and they require different contention management rules
The combined workload from many applications is a mix of read/write and random/sequential
A 75% sequential + 25% random workload can be very different from another
[Diagram: two request streams with the same sequential/random mix but different offset patterns, e.g. short contiguous runs (p, p+1, p+2, p+3, p+4) mixed with random requests (p+100, p+200, p+351) vs. scattered pairs and triples (p, p+100, p+101, p+120, p+121, p+122, p+200, p+201).]
And there are many different 60% read + 40% write workloads out there
[Diagram: different interleavings of read and write requests that all yield a 60/40 read-to-write ratio.]
The feature set we are using now
Op type: read/write/metadata
Ratios between ops (read to write, read/write to metadata, etc.)
For each type of op, we measure:
1. average size of sequential ops
2. location relation between sequential ops: random or sequential
3. average temporal gap between sequential ops
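A rough sketch of extracting features like these from a request trace (the trace format, field names, and run detection are my assumptions, not ASCAR’s actual tracer):

```python
def workload_features(trace):
    """Extract a small feature vector from an I/O trace.
    trace: list of (op, offset, size) tuples, op in {'read', 'write'}.
    Features: read-to-write ratio and average size of sequential runs
    (consecutive same-op requests with contiguous offsets) per op type."""
    counts = {"read": 0, "write": 0}
    runs = {"read": [], "write": []}
    cur_op, cur_end, cur_size = None, None, 0

    def end_run():
        if cur_op is not None:
            runs[cur_op].append(cur_size)

    for op, offset, size in trace:
        counts[op] += 1
        if op == cur_op and offset == cur_end:
            cur_size += size              # extends the current sequential run
        else:
            end_run()                     # a new run starts here
            cur_op, cur_size = op, size
        cur_end = offset + size
    end_run()

    avg = {op: sum(r) / len(r) if r else 0.0 for op, r in runs.items()}
    return {"rw_ratio": counts["read"] / max(counts["write"], 1),
            "avg_seq_read": avg["read"],
            "avg_seq_write": avg["write"]}
```

Two traces with the same rw_ratio can still differ sharply in avg_seq_read and avg_seq_write, which is exactly the distinction the sample on the next slide illustrates.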
Sample: different 60% read + 40% write workloads
[Diagram: two request streams, both with a read-to-write ratio of 60/40; one has an average sequential read size of 60 MB and write size of 40 MB, the other 15 MB and 13 MB.]
Calculating similarity of workloads
Goal: determine whether two workloads can use the same contention control rules
How: start from a known workload and tweak its features, measuring the efficiency of existing rules on the changed workload
Result: we used the results of hundreds of benchmarks to generate a decision tree
[Decision tree: split on rw_ratio (≤7 vs. >7); for rw_ratio >7, split on read_size (≤75 vs. >75); each leaf holds a linear model (Linear model 1, 2, 3) predicting rule-set performance.]
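Written out as code, such a model tree routes each workload’s features to a per-leaf linear model (the split thresholds 7 and 75 come from the slide; the linear-model coefficients are invented placeholders):

```python
def predict_rule_performance(rw_ratio, read_size):
    """M5-style model tree: decision splits route a workload's features to a
    leaf linear model that predicts how well an existing rule set will work.
    Coefficients below are illustrative placeholders, not the trained models."""
    if rw_ratio <= 7:
        return 0.010 * rw_ratio + 0.50                      # Linear model 1
    if read_size <= 75:
        return 0.002 * read_size + 0.60                     # Linear model 2
    return 0.001 * read_size - 0.005 * rw_ratio + 0.70      # Linear model 3
```

Model trees keep the interpretability of a decision tree while letting each leaf fit a smooth trend, which suits a regression target like rule-set performance.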
This decision tree can precisely predict the performance of using a specific rule set with a new workload
Correlation coefficient: 0.8676
Mean absolute error: 0.038
Root mean squared error: 0.0462
Conclusion: ASCAR …
• Does not require knowledge of the system or workloads: fully unsupervised
• Only needs client-side controllers: no need to change hardware, applications, the network, or server software
• Can improve highly dynamic workloads like bursty I/O
• Work-conserving: no wasted bandwidth
Conclusion: ASCAR …
• We distilled a feature set and a method for calculating the similarity between workloads in terms of applying contention control rules
• The source code of our prototype is published as an open source project: http://www.ssrc.ucsc.edu/ascar.html
Future work: online rule optimization
• The current ASCAR prototype requires a lengthy offline learning process
• Use cloud computing services to evaluate at a larger scale
• Tweak rules online using random-restart hill climbing
• Evaluate the ASCAR algorithm on other workloads: databases, web services
Acknowledgments
This research was supported in part by the National Science Foundation under awards IIP-1266400, CCF-1219163, CNS-1018928, the Department of Energy under award DE-FC02-10ER26017/DESC0005417, a Symantec Graduate Fellowship, and industrial members of the Center for Research in Storage Systems. We would like to thank the sponsors of the Storage Systems Research Center (SSRC), including Avago Technologies, Center for Information Technology Research in the Interest of Society (CITRIS of UC Santa Cruz), Department of Energy/Office of Science, EMC, Hewlett Packard Laboratories, Intel Corporation, National Science Foundation, NetApp, Sandisk, Seagate Technology, Symantec, and Toshiba for their generous support.
Thanks!
ASCAR project: http://www.ssrc.ucsc.edu/ascar.html
Contact: Yan Li <[email protected]>