![Page 1: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/1.jpg)
Hardening LLVM with
Random Testing
Xuejun Yang, Yang Chen
Eric Eide, John Regehr
{jxyang, chenyang, eeide, regehr}@cs.utah.edu
11/3/2010 1
School of Computing
University of Utah
![Page 2: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/2.jpg)
11/3/2010
School of Computing
University of Utah 211/3/2010
School of Computing
University of Utah 2
A LLVM Crash Bug
11/3/2010 2
…
int * p[2];
int i;
for (...) {
p[i] = &i;
…
}
LLVM crashed due to an over-aggressive assertion in
Loop Invariant Code Motion in a 2.8 revision
For some reason it thought the address-taken
variable “i” must not be part of expression on LHS
![Page 3: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/3.jpg)
11/3/2010
School of Computing
University of Utah 311/3/2010
School of Computing
University of Utah 3
Some LLVM Wrong Code Bugs
11/3/2010 3
A 2.4 revision simplifies "(a > 13) & (a == 15)" into
"(a > 13)“
A 2.8 revision folds "((x == x) | (y & z)) == 1” into 0
- thought sub-expression (y & z) is constant integer
A 2.8 revision reduces this loop into “i = 1”
for (i = 0; i < 5; i++)
{
if (i) continue;
if (i) break;
}
![Page 4: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/4.jpg)
11/3/2010
School of Computing
University of Utah 4
From March, 2008 to present:
•170 bugs fixed + 3 reported but not yet fixed • 57 wrong code bugs• 116 crash bugs
•Account for 2.6% of total valid bugs filed against LLVM in that period
Tally of Bugs
![Page 5: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/5.jpg)
Goal
• Use random differential testing to find LLVM bugs!
Aimed at deep optimization bugs
Systematically
Find bugs,
Report bugs
Automate the process as much as we can
Ultimate goal: improve LLVM quality
11/3/2010
School of Computing
University of Utah 5
![Page 6: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/6.jpg)
Random Generator
clang -O0 clang -O3 gcc -O …
vote minoritymajority
C program
results
School of Computing
University of Utah
![Page 7: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/7.jpg)
1. What we do
2. What we learned
3. How we do it
7
School of Computing
University of Utah
![Page 8: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/8.jpg)
An Experiment
Compiled and ran 1,000,000 random
programs
Using LLVM 1.9 – 2.8
-O0, -O1, -O2, -Os, -O3
x86 only
8
School of Computing
University of Utah
![Page 9: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/9.jpg)
Crash Error is the percentage of crashes triggered by 1,000,000
random test cases
School of Computing
University of Utah
![Page 10: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/10.jpg)
10
Distinct crash Errors is the number of distinct crashes triggered by
1,000,000 random test cases
School of Computing
University of Utah
![Page 11: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/11.jpg)
11
Wrong Code Error Rate is the percentage of wrong code errors
found by 1,000,000 random test cases
School of Computing
University of Utah
![Page 12: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/12.jpg)
11/3/2010
School of Computing
University of Utah 12
0 10 20 30 40
Dan Gohman
Chris Lattner
Evan Cheng
Eli Friedman
Devang Patel
Jakob S. Olesen
Anton Korobeynikov
Duncan Sands
Dale Johannesen
Nick Lewycky
Owen Anderson
Bill Wendling
Anders Carlsson
Jakub Staszak
Daniel Dunbar
Lang Hames
Bob Wilson
Who fixed the bugs we reported?
![Page 13: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/13.jpg)
Bug Distribution Across Stages
11/3/2010 13
School of Computing
University of Utah
frontend 10
middle-end 61
backend 67
Can’t identify 35
total 173
Classification is based on bugzilla records and/or source
files committed for bug fixes
Source files under Clang or gcc are considered frontend
Source files under Analysis and Transforms are considered
middle-end
Source files under CodeGen and Target are considered
backend
![Page 14: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/14.jpg)
Bug Distribution Across Files
11/3/2010 14
School of Computing
University of Utah
Cpp file Bugs found
InstructionCombining 13
SimpleRegisterCoalescing 10
DAGCombiner 6
LoopUnswitch 5
LoopStrengthReduce 4
JumpThreading 4
LICM 4
FastISel 4
llvm-convert 4
ExprConstant 4
X86InstrInfo 3
X86InstrInfo.td 3
X86ISelDAGToDAG 3
BranchFolding 3
… (and many more)
![Page 15: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/15.jpg)
Increased Coverage by Random Programs
74.54% 72.90%
59.22%
74.69% 72.95%
59.48%
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
70.00%
80.00%
Line Function Branch
Clang + LLVM test suite
Clang + LLVM test suite + 10,000 random programs
11/3/2010
School of Computing
University of Utah 15
![Page 16: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/16.jpg)
1. What we do
2. What we learned
3. How we do it
16
School of Computing
University of Utah
![Page 17: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/17.jpg)
Beyond a Random Generator
11/3/2010
School of Computing
University of Utah 17
• Ability to create random programs with unambiguous
meanings
undefined behaviors
unspecified behaviors
• Ability to check the meaning of a C program is
preserved by the compiler
Summarize the “meaning”
- the checksum of global variables
Test harness: automatically detect checksum
discrepancies
![Page 18: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/18.jpg)
Not a Bug #1
int foo (int x) {
return (x+1) > x;
}
int main (void) {
printf ("%d\n",
foo (INT_MAX));
return 0;
}
$ gcc -O1 foo.c -o foo
$ ./foo
0
$ clang -O2 foo.c -o foo
$ ./foo
1
18
School of Computing
University of Utah
![Page 19: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/19.jpg)
Not a Bug #2
int a;
void bar (int a, int b) {
}
int main (void) {
bar (a=1, a=2);
printf ("%d\n", a);
return 0;
}
$ gcc -O bar.c -o bar
$ ./bar
1
$ clang -O bar.c -o bar
$ ./bar
2
19
School of Computing
University of Utah
![Page 20: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/20.jpg)
Two Goals We Must Balance
11/3/2010
School of Computing
University of Utah 20
Be expressive on code generation- is easy if you don’t care about undefined / unspecified behavior
Avoid undefined behaviors and not depend on unspecified behaviors
- is easy if you don’t care about expressiveness
Expressive code that avoids undefined / unspecified behavior is hard
![Page 21: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/21.jpg)
Design Compromises
11/3/2010
School of Computing
University of Utah 21
Implementation-defined behavior is allowed- Avoiding it is too restrictive
- We cannot do differential testing of x86 Clang vs. MSP430
Clang
No ground truth- If all compilers generate the same wrong answer, we’ll never
know
- Generating self-checking random code restricts expressiveness
No attempt to generate terminating programs- Test harness uses timeouts
- In practice ~10% of random programs don’t terminate within a
few seconds
![Page 22: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/22.jpg)
Supported features:
Arithmetic, logical, and bit operations on integers
Loops
Conditionals
Function calls
Const and volatile
Structs and Bitfields
Pointers and arrays
Goto
Break, continue
Not supported:
Side-effecting expressions
Comma operator
Interesting type casts
Strings
Unions
Floating point
Nontrivial C++
Nonlocal jumps
Varargs
Recursive functions
Function pointers
Dynamic memory alloc
22
School of Computing
University of Utah
![Page 23: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/23.jpg)
Avoid Undefined Behaviors With “Extra” Code
11/3/2010
School of Computing
University of Utah 23
Use before initialization: declare all variables with
explicit initializers
Array OOB access: force all indices within bound using
modulus
Null pointer deference: check nullness before
dereference
![Page 24: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/24.jpg)
11/3/2010
School of Computing
University of Utah 24
Undefined Integer Behaviors
Problem: These are undefined in C
Divide by zero
INT_MIN % -1
Debatable in C99 standard but undefined in practice
Shift by negative, shift past bitwidth
Signed overflow
Etc.
![Page 25: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/25.jpg)
11/3/2010
School of Computing
University of Utah 25
Undefined Integer Behaviors
Solution: Wrap all potentially undefined
operationsint safe_signed_left_shift (int si1, int si2) {
if (si1 < 0 ||
si2 < 0 ||
si2 >= sizeof(int) ||
si1 > (INT_MAX >> si2)){
return si1;
} else {
return si1 << si2;
}
}
![Page 26: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/26.jpg)
Avoid Undefined/Unspecified Behaviors Statically
11/3/2010
School of Computing
University of Utah 26
Some undefined / unspecified are too
difficult to avoid structurally
Dereference dangling pointers
Order of evaluation problem:
-Two expressions writes to same variable
![Page 27: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/27.jpg)
Generate Statement
Code Generator Augmented with Static Analyzer
Generate Expression
Generate Block
Generate Function
Start
End
School of Computing
University of Utah
Pointer AnalysisSide Effect AnalysisCode Validation
![Page 28: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/28.jpg)
Future work
11/3/2010
School of Computing
University of Utah 28
For us:
Create a turnkey system :
Test harness needs a partial rewrite (7000 lines of Perl)
Improve test case reduction
Support more C features
Support more languages and formats (including IR)
For you:
Please keep fixing bugs we report
Tell us where random testing might be useful
![Page 29: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/29.jpg)
Conclusion
11/3/2010
School of Computing
University of Utah 29
Random testing is powerful
Quickly found bugs in all compilers we tested
Found deep optimization bugs unlikely to be
uncovered by other means
Can generate small test cases pinpoint the problem
But has limitations:
Never know when to stop testing
Coverage is not broad enough
Compilers need random testing
Fixed test suite is not enough
![Page 30: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/30.jpg)
11/3/2010
School of Computing
University of Utah 30
Thank you …
And thank Qualcomm for sponsoring my trip!
Questions?
![Page 31: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/31.jpg)
Test Case Reduction
11/3/2010 31
School of Computing
University of Utah
Is Necessary for
- Bug reporting
- Bug fixing
- Allow causal users find and report bugs using our
tool
Existing approach: Delta Debugging
- Repeatedly remove part of the program and see if
it remains interesting
- Works well for crash bugs
-Works poorly for wrong code bugs
- Problem: Throwing away part of a program
may introduce undefined behavior
![Page 32: Hardening LLVM with Random Testingllvm.org/devmtg/2010-11/Yang-HardenLLVM.pdfExisting approach: Delta Debugging - Repeatedly remove part of the program and see if it remains interesting](https://reader033.vdocument.in/reader033/viewer/2022052011/6026272f57b36e6af80e0198/html5/thumbnails/32.jpg)
Hierarchical Reduction
11/3/2010 32
School of Computing
University of Utah
CsmithRuntime
Verification
and Feedback
Function Reduction
Block Reduction
Statement Reduction
Expression Reduction
Variable Reduction
Initial results show reduced size of failure-inducing test
cases by 93% on average in a few minutes