TRANSCRIPT
Highly Scalable Distributed Dataflow Analysis
Joseph L. Greathouse
Advanced Computer Architecture Laboratory
University of Michigan
Chelsea LeBlanc, Todd Austin, Valeria Bertacco
CGO, Chamonix, France, April 6, 2011
Software Errors Abound
NIST: software errors cost the U.S. ~$60 billion/year as of 2002
FBI CCS: security issues cost $67 billion/year as of 2005; more than ⅓ from viruses, network intrusion, etc.
2
Goals of this Work
High-quality dynamic software analysis
Find difficult bugs that other analyses miss
Distribute tests to large populations
Low overhead, or users get angry
Accomplished by sampling the analyses
Each user only tests part of the program
3
Dynamic Dataflow Analysis
Associate meta-data with program values
Propagate/Clear meta-data while executing
Check meta-data for safety & correctness
Forms dataflows of meta/shadow information
4
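The associate/propagate/check/clear steps above can be sketched as a tiny shadow-memory scheme. This is a minimal illustration, not the talk's implementation: the names (`shadow`, `taint_binop`, etc.) are invented, and real tools such as TaintCheck shadow every byte of memory rather than a fixed set of variable ids.

```c
#include <stdint.h>

/* One shadow (taint) bit per variable, indexed by a variable id.
 * Illustrative sketch only; real tools shadow all of memory. */
enum { NVARS = 16 };
static uint8_t shadow[NVARS];

/* Associate: data arriving from input is untrusted. */
static void taint_input(int var) { shadow[var] = 1; }

/* Propagate: a result is tainted if either source is. */
static void taint_binop(int dst, int s1, int s2) {
    shadow[dst] = shadow[s1] | shadow[s2];
}

/* Clear: validation removes the taint. */
static void taint_clear(int var) { shadow[var] = 0; }

/* Check: returns nonzero if using var here would be unsafe. */
static int taint_check(int var) { return shadow[var]; }
```

Mirroring the slide's example: `x = read_input()` calls `taint_input(x)`, `y = x * 1024` calls `taint_binop(y, x, x)`, and `validate(x)` calls `taint_clear(x)`; a later `taint_check` on `y` still fires because y's dataflow was never validated.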
Example Dynamic Dataflow Analysis
5
[Figure: x = read_input() associates meta-data with the input; y = x * 1024 and w = x + 42 propagate it; validate(x) clears it; Check w, Check z, and Check a inspect the meta-data that flows alongside the data (a += y, z = y * 75).]
Distributed Dynamic Dataflow Analysis
6
Split analysis across large populations
Observe more runtime states
Report problems the developer never thought to test
[Figure: an instrumented program distributed to many users reports potential problems back to the developer. Example analyses: taint analysis (e.g., TaintCheck), dynamic bounds checking, FP accuracy verification.]
Problem: DDAs are Slow
7
[Chart: reported slowdowns for dynamic dataflow analyses range from 2x to 500x across tools (10-200x, 2-200x, 10-80x, 5-50x, 100-500x, 2-300x).]
Our Solution: Sampling
8
Lower overheads by skipping some analyses
[Figure: execution alternates between periods of complete analysis and no analysis.]
Sampling Allows Distribution
9
Developer → Beta Testers → End Users
Many users testing at little overhead see more errors than one user at high overhead.
Cannot Naïvely Sample Code
10
[Figure: skipping the instrumentation on validate(x) leaves stale meta-data on x, so Check w raises a false positive; skipping the instruction that propagates meta-data into y means Check z and Check a miss a real flow, a false negative.]
Sampling must be aware of meta-data
Remove meta-data from skipped dataflows
Prevents false positives
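The rule that skipped dataflows must have their meta-data removed, not merely left untouched, can be sketched as below. This is an illustrative sketch with invented names, not the talk's implementation:

```c
#include <stdint.h>

/* One shadow bit per variable id (illustrative only). */
enum { NVARS = 16 };
static uint8_t shadow[NVARS];

/* When sampling decides to skip an operation, the meta-data of the
 * values it touches is cleared rather than left stale. The worst
 * outcome is then a missed bug (false negative), never a spurious
 * report (false positive). */
static void sampled_binop(int dst, int s1, int s2, int skip) {
    if (skip) {
        shadow[dst] = shadow[s1] = shadow[s2] = 0; /* drop the dataflow */
    } else {
        shadow[dst] = shadow[s1] | shadow[s2];     /* normal propagation */
    }
}
```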
Our Solution: Sample Data, not Code
11
Dataflow Sampling Example
12
[Figure: an entire dataflow is skipped at its source (x = read_input()), so no meta-data is ever associated and the downstream checks (Check w, Check z, Check a) see nothing. Skipping whole dataflows can cause false negatives but never false positives.]
Mechanisms for Dataflow Sampling (1)
13
Start with demand analysis
[Figure: under demand analysis, the operating system switches execution between the native application, when no meta-data is touched, and the instrumented application running under a demand analysis tool (e.g., Valgrind), whenever meta-data is accessed.]
Mechanisms for Dataflow Sampling (2)
14
Remove dataflows if execution is too slow
[Figure: the sampling analysis tool adds an overhead (OH) threshold; when instrumented execution exceeds it, the tool clears the meta-data and the application returns to native execution.]
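The overhead-threshold decision can be sketched as follows. The struct fields and function name are invented for illustration; the talk's Overhead Manager works at whole-system granularity, not per a C struct like this:

```c
/* Track how much of the current interval is spent under
 * instrumentation, and stop analysis (clearing meta-data) once a
 * user-set threshold, e.g. 1%, is exceeded. Illustrative sketch. */
typedef struct {
    double analysis_time; /* seconds spent in instrumented execution */
    double total_time;    /* seconds elapsed in the current interval */
    double threshold;     /* allowed fraction, e.g. 0.01 for 1% */
} ohm_t;

static int ohm_over_threshold(const ohm_t *m) {
    return m->total_time > 0.0 &&
           m->analysis_time / m->total_time > m->threshold;
}
```

When `ohm_over_threshold` returns nonzero, the tool would clear all meta-data and fall back to native execution until the next sampling interval.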
Prototype Setup
Taint analysis sampling system; network packets are untrusted
Xen-based demand analysis
Whole-system analysis with modified QEMU
Overhead Manager (OHM) is user-controlled
15
[Figure: a Xen hypervisor runs Linux with applications, the network stack, and the OHM over a shadow page table; an admin VM performs the taint analysis in a modified QEMU.]
Benchmarks
Performance: network throughput (example: ssh_receive)
Accuracy of sampling analysis: real-world security exploits
16
Name Error Description
Apache Stack overflow in Apache Tomcat JK Connector
Eggdrop Stack overflow in Eggdrop IRC bot
Lynx Stack overflow in Lynx web browser
ProFTPD Heap smashing attack on ProFTPD Server
Squid Heap smashing attack on Squid proxy server
Performance of Dataflow Sampling
17
[Chart: ssh_receive throughput under dataflow sampling, compared to throughput with no analysis.]
Accuracy at Very Low Overhead
Max time in analysis: 1% of every 10 seconds
Analysis always stops once the threshold is reached
This setting gives the lowest probability of detecting exploits
Name Chance of Detecting Exploit
Apache 100%
Eggdrop 100%
Lynx 100%
ProFTPD 100%
Squid 100%
Accuracy with Background Tasks
ssh_receive running in background
19
Conclusion & Future Work
Dynamic dataflow sampling gives users a knob to control accuracy vs. performance
Future work: better methods of sample choice, combining static information, and new types of sampling analysis
20
BACKUP SLIDES
21
Outline
22
Software Errors and Security
Dynamic Dataflow Analysis
Sampling and Distributed Analysis
Prototype System
Performance and Accuracy
Detecting Security Errors
Static Analysis
Analyze source code with formal reasoning
+ Finds all reachable, defined errors
– Intractable, requires expert input, no system state
Dynamic Analysis
Observe and test runtime state
+ Finds deep errors as they happen
– Only along the traversed path; very slow
23
Security Vulnerability Example
Buffer overflows are a large class of security vulnerabilities
24
void foo() {
    int local_variables;
    int buffer[256];
    ...
    buffer = read_input();
    ...
    return;
}
[Figure: stack layout with the return address above the local variables and buffer. If read_input() reads 200 ints, the fill stays within buffer; if it reads >256 ints, it overwrites the local variables and the return address.]
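The overflow above happens because the fill is unchecked. A dynamic bounds checker effectively inserts a clamp like the one below at every write; `fill_bounded` and its signature are invented for illustration, not part of the talk's system:

```c
#include <stddef.h>

/* Bounded fill: never write past the end of the buffer.
 * Here the check is written by hand; a bounds-checking tool
 * would insert the equivalent check automatically. */
static size_t fill_bounded(int *buf, size_t cap,
                           const int *input, size_t n) {
    size_t limit = n < cap ? n : cap; /* clamp to capacity */
    for (size_t i = 0; i < limit; i++)
        buf[i] = input[i];
    return limit; /* number of ints actually stored */
}
```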
Performance of Dataflow Sampling (2)
25
[Chart: netcat_receive throughput under sampling, compared to throughput with no analysis.]
Performance of Dataflow Sampling (3)
26
[Chart: ssh_transmit throughput under sampling, compared to throughput with no analysis.]
Accuracy with Background Tasks
netcat_receive running with benchmark
27