Bolt: I Know What You Did Last Summer… In the Cloud
Christina Delimitrou (Cornell University) and Christos Kozyrakis (Stanford University)
ASPLOS, April 12th 2017
Executive Summary
Problem: cloud resource sharing hides security vulnerabilities
Interference from co-scheduled apps leaks app characteristics
Enables severe performance attacks
Bolt: adversarial runtime in public clouds
Transparent app detection (5-10 sec)
Leverages practical machine learning techniques
DoS: up to 140x increase in latency
User study: 88% of applications correctly identified
Resource partitioning is helpful but insufficient
Motivation
[Figure: App1 and App2 sharing a server, annotated with containers, memory capacity, storage capacity/bw, network bw, LL cache, power]
Not all isolation techniques available
Not all used/configured correctly
Not all scale well
Mem bw/core resources not isolated
Bolt
Key idea: Leverage lack of isolation in public clouds to infer application characteristics
Programming framework, algorithm, load characteristics
Exploit: enable practical, effective, and hard-to-detect performance attacks
DoS, RFA (resource freeing attack), VM pinpointing
Use app characteristics (sensitive resource) against it
Avoid CPU saturation, so attacks are hard to detect
Threat Model
Impartial, neutral cloud provider
Active adversary but no control over VM placement
[Figure: adversary, victim, and cloud provider]
Bolt
[Figure: adversary and victim, with Bolt's five steps]
1. Contention injection
2. Interference impact measurement
3. App inference
4. Custom contention kernel
5. Performance attack
1. Contention Measurement
[Figure: adversary and victim; step 1, contention injection; step 2, interference impact measurement]
Set of contentious kernels (iBench):
Compute
L1/L2/L3
Memory bw
Storage bw
Network bw
(Memory/Storage capacity)
Sample 2-3 kernels, run in adversarial VM
Measure impact on performance of kernels vs. isolation
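The measurement step above can be sketched roughly as follows. This is a minimal illustration, not Bolt's implementation: the function names, the resource labels, the example timings, and the rule that maps a slowdown to a 0..1 pressure score are all assumptions made for the example.

```python
# Hypothetical sketch: turn microbenchmark timings into a sparse
# interference profile. All names and the scoring rule are illustrative
# assumptions, not Bolt's actual code.

def slowdown(isolated_s, colocated_s):
    """Ratio of co-located to isolated runtime (1.0 means no interference)."""
    if isolated_s <= 0:
        raise ValueError("isolated runtime must be positive")
    return colocated_s / isolated_s

def pressure_score(isolated_s, colocated_s):
    """Map a slowdown to a 0..1 score: 0 for no slowdown, nearing 1 as it grows."""
    s = slowdown(isolated_s, colocated_s)
    return 0.0 if s <= 1.0 else 1.0 - 1.0 / s

def sparse_profile(resources, measurements):
    """Build one profile row; resources that were not sampled stay None."""
    return [
        pressure_score(*measurements[r]) if r in measurements else None
        for r in resources
    ]

resources = ["compute", "l1", "l2", "l3", "mem_bw", "storage_bw", "net_bw"]
# (isolated seconds, co-located seconds) for the few kernels that were sampled
measurements = {"l3": (2.0, 4.0), "mem_bw": (1.0, 1.25)}
profile = sparse_profile(resources, measurements)
print(profile)
```

Only the sampled resources get a score; the remaining entries are left empty, which is the sparse input that the inference step has to fill in.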
2. Practical App Inference
[Figure: adversary and victim; step 3, practical app inference]
Hybrid recommender:
SGD (collaborative filtering)
Infer resource pressure in non-profiled resources
Sparse → dense information
Content-based recommendation
Classify unknown victim based on previously-seen applications
Label & determine resource sensitivity
Big Data to the Rescue
1. Infer pressure in non-profiled resources
Reconstruct sparse information
Stochastic Gradient Descent (SGD), O(mpk)
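[Figure: Bolt injects contention with uBench microbenchmarks; the resulting interference profile, together with data from previously-seen apps, is completed with SVD+SGD]
Sparse interference profiles (rows: apps, columns: resources r1 … rN):
r1  r2  r3  …  rN
a11 0   0   …  a1N
0   a22 0   …  0
…   …   …   …  …
aM1 0   aM3 …  0
Reconstructed dense profiles:
r1  r2  r3  …  rN
a11 a12 a13 …  a1N
a21 a22 a23 …  a2N
…   …   …   …  …
aM1 aM2 aM3 …  aMN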
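The sparse-to-dense reconstruction can be illustrated with a small matrix-factorization sketch trained by SGD. It is only a sketch under stated assumptions: it uses plain SGD with random initialization (no SVD step), and the function name, hyperparameters (`k`, `lr`, `reg`, `epochs`), and the toy data are all hypothetical rather than taken from Bolt.

```python
import random

def sgd_complete(rows, k=2, lr=0.05, reg=0.01, epochs=2000, seed=0):
    """Fill None entries of a matrix by factorizing it as P * Q^T with SGD.

    rows: list of lists with floats for observed entries and None otherwise.
    Returns a dense matrix of predictions for every entry.
    """
    rng = random.Random(seed)
    m, n = len(rows), len(rows[0])
    # Small positive random factors (an arbitrary initialization choice).
    P = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(m)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n)]
    observed = [(i, j, v) for i, row in enumerate(rows)
                for j, v in enumerate(row) if v is not None]
    for _ in range(epochs):
        rng.shuffle(observed)
        for i, j, v in observed:
            pred = sum(P[i][f] * Q[j][f] for f in range(k))
            err = v - pred
            for f in range(k):
                p, q = P[i][f], Q[j][f]
                # Gradient step on squared error with L2 regularization.
                P[i][f] += lr * (err * q - reg * p)
                Q[j][f] += lr * (err * p - reg * q)
    return [[sum(P[i][f] * Q[j][f] for f in range(k)) for j in range(n)]
            for i in range(m)]

# Toy data: 3 apps x 4 resources, scores in 0..1, None = not profiled.
sparse = [
    [0.90, None, 0.72, 0.36],
    [0.50, 0.30, None, 0.20],
    [None, 0.18, 0.24, 0.12],
]
dense = sgd_complete(sparse)
for row in dense:
    print([round(x, 2) for x in row])
```

Each pass touches only the observed entries and updates k factors per entry, which is where the linear dependence on the number of observed entries and on k comes from.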
Big Data to the Rescue
2. Classify and label victims
Weighted Pearson Correlation Coefficients
Output: distribution of similarity scores to app classes
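[Figure: Bolt compares the dense profiles against data from previously-seen apps using Pearson correlation coefficients, producing an app label & characteristics]
Dense profiles:
r1  r2  r3  …  rN
a11 a12 a13 …  a1N
a21 a22 a23 …  a2N
…   …   …   …  …
aM1 aM2 aM3 …  aMN
Example output:
Hadoop SVM: 65%
Spark ALS: 21%
memcached: 11%
…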
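A weighted Pearson correlation and its use for ranking candidate classes might look like the sketch below. The weights, the class names, the profile values, and the way scores are turned into a distribution (clamping negative correlations to zero, then normalizing) are assumptions for illustration; the slides do not specify them.

```python
import math

def weighted_pearson(x, y, w):
    """Weighted Pearson correlation coefficient of x and y under weights w."""
    if not (len(x) == len(y) == len(w)) or not x:
        raise ValueError("x, y, w must be non-empty and of equal length")
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / total
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / total
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / total
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / math.sqrt(vx * vy)

def similarity_distribution(victim, known, w):
    """Normalize non-negative correlations into a distribution over classes."""
    scores = {name: max(0.0, weighted_pearson(victim, prof, w))
              for name, prof in known.items()}
    total = sum(scores.values())
    if total == 0:
        return {name: 0.0 for name in scores}
    return {name: s / total for name, s in scores.items()}

# Hypothetical profiles over four resources, scores in 0..1.
victim = [0.9, 0.2, 0.7, 0.1]
known = {
    "class_a": [0.8, 0.3, 0.6, 0.2],
    "class_b": [0.1, 0.9, 0.2, 0.8],
}
weights = [1.0, 1.0, 2.0, 1.0]  # assumed per-resource importance
dist = similarity_distribution(victim, known, weights)
print(max(dist, key=dist.get), dist)
```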
Inference Accuracy
40-machine cluster (420 cores)
Training apps: 120 jobs (analytics, databases, webservers, in-memory caching, scientific, js), giving high coverage of the resource space
Testing apps: 108 apps (latency-critical webapps, analytics)
No overlap in algorithms/datasets between training and testing sets
Application class | Detection accuracy
In-memory caching (memcached) | 80%
Persistent databases (Cassandra, MongoDB) | 89%
Hadoop jobs | 92%
Spark jobs | 86%
Webservers | 91%
Aggregate | 89%
3. Practical Performance Attacks
1. Determine the resource bottleneck of the victim
2. Create custom contentious kernel that targets critical resource(s)
3. Inject kernel in Bolt
Several performance attacks (DoS, RFAs, VM pinpointing)
Target specific, critical resource → low CPU pressure
[Figure: adversary and victim; step 4, custom kernel injection]
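Step 1 above, picking the bottleneck from an inferred profile, amounts to ranking resources by their sensitivity score. The sketch below shows only that ranking; the function name, the `top_k` and `min_score` parameters, and the example scores are hypothetical.

```python
def critical_resources(profile, top_k=1, min_score=0.0):
    """Return the top_k resource names with the highest sensitivity scores.

    profile: dict mapping resource name to a 0..1 sensitivity score.
    Resources scoring at or below min_score are never reported.
    """
    ranked = sorted(profile.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, score in ranked[:top_k] if score > min_score]

# Hypothetical inferred profile for one application.
inferred = {"compute": 0.20, "l3": 0.85, "mem_bw": 0.60, "net_bw": 0.10}
print(critical_resources(inferred))
print(critical_resources(inferred, top_k=2))
```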
3. Practical DoS Attacks
Launched against same 108 applications as before
Execution time: 2.2x higher on average, up to 9.8x
Interactive services: tail latency 42x higher on average, up to 140x
Bolt does not saturate CPU
Naïve attacker gets migrated
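For reference, a tail-latency increase like the ones quoted above is a ratio of high percentiles of two latency samples. The sketch below assumes the 99th percentile and the nearest-rank definition; the slides do not say which percentile or method was used, and the sample data is made up.

```python
def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    if not samples or not 1 <= p <= 100:
        raise ValueError("need samples and 1 <= p <= 100")
    ordered = sorted(samples)
    rank = -(-p * len(ordered) // 100)  # ceiling of p * n / 100
    return ordered[rank - 1]

def tail_latency_increase(baseline, degraded, p=99):
    """Ratio of the p-th percentile latency with and without interference."""
    return percentile(degraded, p) / percentile(baseline, p)

# Made-up latencies in milliseconds.
baseline_ms = list(range(1, 101))
degraded_ms = [3 * x for x in baseline_ms]
print(tail_latency_increase(baseline_ms, degraded_ms))
```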
Demo
User Study
20 independent users from Stanford and Cornell
Cluster: 200 EC2 servers, c3.8xlarge (32 vCPUs, 60 GB memory)
Rules:
4 vCPUs per machine for Bolt
All users have equal priority
Users use thread pinning
Users can select specific instances
Training set: 120 apps incl. analytics, webapps, scientific, etc.
Accuracy of App Labeling
53 app classes (analytics, webapps, FS/OS, HLS/sim, other…)
Accuracy of App Characterization
Performance attack results in the paper
The Value of Isolation
Need more scalable, fine-grain, and complete isolation techniques
[Figure: chart with 45% and 14% callouts]
Conclusions
Bolt: highlights the security vulnerabilities from lack of isolation
Fast detection using online data mining techniques
Practical, hard-to-detect performance attacks
Current isolation helpful but insufficient
In the paper:
Sensitivity to Bolt parameters
Sensitivity to applications and platform parameters
User study details
More performance attacks (resource freeing, VM pinpointing)
Evolving Applications
Cloud applications change behavior
Users use the same cloud resources for several apps over time
Bolt periodically wakes up and checks whether the app profile has changed; if so, it reprofiles & reclassifies
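The periodic check can be pictured as comparing a fresh profile against the stored one and reclassifying only when they diverge. The sketch below is an assumption-laden illustration: the per-resource threshold test, the `threshold` value, and the function names are hypothetical, and the slides do not say how a change is detected.

```python
def profile_changed(previous, current, threshold=0.15):
    """True if any resource score moved by more than threshold."""
    if previous.keys() != current.keys():
        raise ValueError("profiles must cover the same resources")
    return any(abs(current[r] - previous[r]) > threshold for r in previous)

def recheck(previous, current, reclassify, threshold=0.15):
    """Call reclassify(current) only when the profile has drifted."""
    if profile_changed(previous, current, threshold):
        return reclassify(current)
    return None

# Hypothetical profiles taken at two wake-ups.
old = {"l3": 0.80, "mem_bw": 0.60, "net_bw": 0.10}
new = {"l3": 0.30, "mem_bw": 0.65, "net_bw": 0.70}
print(recheck(old, new, reclassify=lambda p: max(p, key=p.get)))
```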
Inference Within a Framework
Within a framework, dataset and choice of algorithm affect resource requirements
Bolt matches a new unknown application to apps in a framework by distinguishing their resource needs