[cb16] be a binary rockstar: an introduction to program analysis with binary ninja by sophia...
TRANSCRIPT
Be a Binary Rockstar: An Introduction to Program Analysis with Binary Ninja
What this talk is not about
What this talk is about
Fuzzing….
Current state of the art.
Binary.
Source Code.
Problem
Reading/scripting disassembly
Reading code
Analysis of Bitcode
Static Analysis with Bindead, REIL, BAP.
Dynamic Instrumentation
Static and Dynamic Analysis
Compilers
Source code analyzer
McSema
IDA isn’t perfect
Problems.
Binary.
Source Code.
Problem
● Lack of robust tooling options
● Reading code continues to be useful
● Increase in compiler strength and LLVM tooling (lots of cool projects in this area!)
● Most tools lack semantic reasoning
● Decompilers widely used but difficult to automatically reason over
● Majority of program analysis frameworks are hard to use - they lack usable frameworks for interaction with your own analysis
● No really good options to lift binaries to interactive, workable IL frameworks
Binary, interactive
IL frameworks.
Binary Ninja & Binja IL
Binja: Tree Based Structure
● Binary Ninja IL
Organized into expressions: LowLevelILInstruction
● LLILIs are infinite length tree-based instructions
● Infix notation. Destination operand is the left hand operand (e.g. x86 ``mov eax, 0`` vs. LLIL ``eax = 0``)
● Side effect free
● Recursive descent analysis
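The tree-based, infix shape described above can be sketched with a small stand-alone model. This is not the Binary Ninja API: `Reg`, `Const`, `BinOp`, `SetReg`, and `depth` are hypothetical names chosen for illustration, and the second instruction is an assumed example of a nested expression.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical stand-ins for IL expression nodes; not Binary Ninja classes.

@dataclass(frozen=True)
class Reg:
    name: str

    def __str__(self) -> str:
        return self.name

@dataclass(frozen=True)
class Const:
    value: int

    def __str__(self) -> str:
        return str(self.value)

@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"

    def __str__(self) -> str:
        # Infix rendering of a sub-expression.
        return f"({self.left} {self.op} {self.right})"

@dataclass(frozen=True)
class SetReg:
    dest: Reg
    src: "Expr"

    def __str__(self) -> str:
        # Destination is the left hand operand.
        return f"{self.dest} = {self.src}"

Expr = Union[Reg, Const, BinOp]

def depth(node) -> int:
    """Height of the expression tree rooted at `node`."""
    if isinstance(node, BinOp):
        return 1 + max(depth(node.left), depth(node.right))
    if isinstance(node, SetReg):
        return 1 + depth(node.src)
    return 1

# x86 `mov eax, 0` rendered in infix form.
mov = SetReg(Reg("eax"), Const(0))
# A nested source expression: operands are themselves expressions.
lea = SetReg(Reg("eax"), BinOp("+", Reg("ebx"), BinOp("*", Reg("ecx"), Const(4))))

print(mov)
print(lea)
```

Because operands are expressions rather than fixed slots, nothing in this model limits how deep one instruction's tree can grow.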
Binja: Tree Based Structure
● Symbolic analysis (abstract interpretation) to find bounds of a jump table
● Determine function ends, aborts, etc using disassembly and their own IL.
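How a range bound can size a jump table is sketched below, assuming the index is guarded by an unsigned compare before the indirect jump. This is an illustration of the idea only, not Binary Ninja's implementation; `refine_unsigned_le`, `jump_table_slots`, the table address, and the entry size are all made up for the example.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # inclusive (low, high)

def refine_unsigned_le(index: Interval, bound: int) -> Interval:
    """Interval of `index` on the path where the unsigned test `index <= bound` holds."""
    low, high = index
    return (max(low, 0), min(high, bound))

def jump_table_slots(table_base: int, entry_size: int, index: Interval) -> List[int]:
    """Addresses of the table entries the indirect jump may read."""
    low, high = index
    return [table_base + i * entry_size for i in range(low, high + 1)]

# Before the guard, a 32-bit index could be anything.
unknown = (0, 2**32 - 1)
# Models `cmp eax, 5; ja default`: the fall-through path has eax in [0, 5].
bounded = refine_unsigned_le(unknown, 5)
# Hypothetical table at 0x8049000 with 4-byte entries.
slots = jump_table_slots(0x8049000, 4, bounded)

print(bounded, len(slots))
```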
binja_memcpy.py: IL/bin/bash
Register States!
binja_memcpy.py: API
binja_memcpy.py: Output
Binja API
● Python, C and C++ API (idiomatic!)
● Missing some analysis features that are built into LLVM (i.e. integrated CFG traversal, uses, SSA, reg/var distinction)
● Branches: basic block/function edges (outgoing)
● Get the register states, some naive range analysis
● api.binary.ninja/search.html
Symbolic Execution
● Very accurate
● Takes time, data, and memory, often not feasible
● IDEA! Reason only about what we care about
● Apply complex data to abstract domains!
● Domains: type, sign, range, color, etc.
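A minimal sketch of what mapping concrete data into two of the listed domains could look like. The helper names `to_sign` and `to_range` and the sample values are assumptions for illustration, not taken from the talk's plugin.

```python
from typing import Iterable, Tuple

def to_sign(values: Iterable[int]) -> str:
    """Abstract a set of integers to a sign: "+", "-", "0", "top", or "bottom"."""
    signs = {"+" if v > 0 else "-" if v < 0 else "0" for v in values}
    if not signs:
        return "bottom"          # no concrete value at all
    if len(signs) == 1:
        return signs.pop()       # every value agrees on the sign
    return "top"                 # mixed signs: nothing is known

def to_range(values: Iterable[int]) -> Tuple[int, int]:
    """Abstract a non-empty set of integers to its inclusive (min, max) range."""
    vals = list(values)
    return (min(vals), max(vals))

observed = {3, 17, 42}
print(to_sign(observed), to_range(observed))
print(to_sign({-1, 5}), to_range({-1, 5}))
```

Each domain throws away different information: the sign domain forgets magnitudes, while the range domain forgets which values inside the bounds actually occur.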
Practical(Academia) & Program Analysis
● Sets of concrete values are abstracted imprecisely
● Galois Connection formalizes Concrete <-> Abstract
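For reference, the standard textbook statement of a Galois connection, written with an abstraction function α and a concretization function γ. The symbols C (concrete values) and A (abstract domain) and the sign-domain instance are notation chosen here, not taken from the slides.

```latex
\[
\begin{aligned}
&\alpha : \mathcal{P}(C) \to A, \qquad \gamma : A \to \mathcal{P}(C) \\
&\alpha(S) \sqsubseteq a \iff S \subseteq \gamma(a)
  \quad \text{for all } S \subseteq C,\ a \in A \\
&\text{Sign domain: } \alpha(\{3, 17\}) = +, \qquad
  \gamma(+) = \{\, n \in \mathbb{Z} \mid n > 0 \,\}
\end{aligned}
\]
```

The imprecision shows up as S ⊆ γ(α(S)): concretizing the abstraction of {3, 17} gives back every positive integer, not just the two that were observed.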
Abstraction!
int x = 5
int y = argc + x
int z
Abstract Interpretation
int x = 5
int y = argc + x
int z
Abstract domain: Type
[Figure: variables annotated with the abstract value int]
Abstract Interpretation
int x = 5
int y = argc + x
int z
[Figure: variables annotated with the abstract values int and +]
Sign Analysis
Practical(Academia) & Program Analysis
● x's value is imprecise
● Compilers perform imprecise abstraction
int x; int[] a = new int[10]; a[2 * x] = 3;
1. Add precision - i.e. declare abstract value [0, 9]
2. Symbolically execute with abstract domain/values
● Requires control-flow analysis
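The array example above can be sketched with a range (interval) domain, reading [0, 9] as the legal index range of `a`. The `Interval` class, its methods, and `store_is_safe` are hypothetical names for this illustration, and the input ranges for `x` are assumed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    low: int
    high: int

    def scale(self, k: int) -> "Interval":
        """Interval containing k * v for every v in this interval."""
        a, b = self.low * k, self.high * k
        return Interval(min(a, b), max(a, b))

    def within(self, other: "Interval") -> bool:
        """True when this interval lies entirely inside `other`."""
        return other.low <= self.low and self.high <= other.high

# Legal indices of `new int[10]`.
valid_index = Interval(0, 9)

def store_is_safe(x: Interval) -> bool:
    """Check `a[2 * x] = 3` abstractly: is 2 * x always a legal index?"""
    return x.scale(2).within(valid_index)

print(store_is_safe(Interval(0, 4)))   # 2 * x is in [0, 8]
print(store_is_safe(Interval(0, 9)))   # 2 * x is in [0, 18]
```

Knowing which range `x` has at the store is the part that requires control-flow analysis: the range comes from the guards and assignments on the paths that reach it.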
Abstract Domains & Sign Analysis
int a, b, c;
a = 42;
b = 87;
if (input) {
    c = a + b;
} else {
    c = a - b;
}
● Map variables to an abstract value
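A minimal sketch of sign analysis on the program above, mapping each variable to one of four abstract values. The function names and the choice of "T" for "unknown sign" are assumptions for illustration; this is not the talk's plugin code.

```python
POS, NEG, ZERO, TOP = "+", "-", "0", "T"   # "T" means the sign is unknown

def sign_of(n: int) -> str:
    """Abstract a concrete integer to its sign."""
    return POS if n > 0 else NEG if n < 0 else ZERO

def add(a: str, b: str) -> str:
    """Abstract addition on signs."""
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    if a == b and a in (POS, NEG):
        return a                 # (+) + (+) = +, (-) + (-) = -
    return TOP                   # mixed or unknown operands

def negate(a: str) -> str:
    return {POS: NEG, NEG: POS}.get(a, a)

def sub(a: str, b: str) -> str:
    """Abstract subtraction: a - b is a + (-b)."""
    return add(a, negate(b))

def join(a: str, b: str) -> str:
    """Merge the values arriving from two branches."""
    return a if a == b else TOP

a = sign_of(42)
b = sign_of(87)
c_then = add(a, b)               # c = a + b
c_else = sub(a, b)               # c = a - b
c_after = join(c_then, c_else)   # value of c after the if/else

print(a, b, c_then, c_else, c_after)
```

Concretely `42 - 87` is negative, but the sign domain only sees `(+) - (+)`, so the else branch and the merged result both come out unknown. That is the price of the abstraction.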
Abstract Domains & Sign Analysis
● Binary Ninja plugin
● Path sensitive - construct lattices of abstract values
● Under-approximate
● One abstract state per CFG node
● Avoid loss in precision for fractions.
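Keeping one abstract state per CFG node can be sketched as a small worklist loop that joins states where edges merge. This is a generic textbook-style sketch, not the plugin's code; the CFG, node names, and transfer functions are made up and mirror an if/else that assigns `c` on both branches.

```python
from collections import deque
from typing import Callable, Dict, List

State = Dict[str, str]   # variable -> sign ("+", "-", "0", or "T" for unknown)

def join_states(a: State, b: State) -> State:
    """Variable-wise join: keep a sign only when both states agree on it."""
    return {v: (a[v] if a.get(v) == b.get(v) else "T") for v in a.keys() | b.keys()}

def analyze(cfg: Dict[str, List[str]],
            transfer: Dict[str, Callable[[State], State]],
            entry: str) -> Dict[str, State]:
    """Compute one incoming abstract state per reachable CFG node."""
    state_in: Dict[str, State] = {entry: {}}
    work = deque([entry])
    while work:
        node = work.popleft()
        out = transfer[node](state_in[node])
        for succ in cfg[node]:
            merged = out if succ not in state_in else join_states(state_in[succ], out)
            if state_in.get(succ) != merged:
                state_in[succ] = merged      # state changed: revisit successor
                work.append(succ)
    return state_in

# Hypothetical diamond-shaped CFG.
cfg = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}
transfer = {
    "entry": lambda s: {**s, "a": "+", "b": "+"},   # a = 42; b = 87
    "then":  lambda s: {**s, "c": "+"},             # c = a + b
    "else":  lambda s: {**s, "c": "T"},             # c = a - b
    "exit":  lambda s: s,
}

states = analyze(cfg, transfer, "entry")
print(states["exit"])
```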
Demo!
● Analyze example program
● PHP CVE-2016-6289
Scripts!
● memcpy, headless python API script
● depth-first-search, path sensitive CFG template
● sign analysis, abstract domain plugin
https://github.com/quend/abstractanalysis
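The depth-first-search template mentioned above is not reproduced in this transcript. As a rough idea of the technique, here is a stand-alone sketch that enumerates CFG paths depth-first so that each path can be analyzed separately. The `dfs_paths` function and the example graph are hypothetical, and loops are cut by refusing to revisit a node already on the current path.

```python
from typing import Dict, Iterator, List, Tuple

def dfs_paths(cfg: Dict[str, List[str]], node: str,
              path: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Yield every acyclic path from `node` to a node with no unvisited successor."""
    path = path + (node,)
    # Skip successors already on this path so back edges do not recurse forever.
    successors = [s for s in cfg.get(node, []) if s not in path]
    if not successors:
        yield path
        return
    for succ in successors:
        yield from dfs_paths(cfg, succ, path)

# Hypothetical CFG: an if/else followed by a loop back to the entry block.
cfg = {
    "entry": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": ["entry"],
}

paths = list(dfs_paths(cfg, "entry"))
for p in paths:
    print(" -> ".join(p))
```

A path-sensitive analysis would carry its own abstract state along each yielded path instead of merging states where the branches meet.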
Contact me
● Sophia d’Antoine
○ IRC: @quend
○ [email protected]
○ Binary Ninja Slack
Conclusion
● Thanks!
○ Vector35
○ Trail of Bits
○ Ryan Stortz (@withzombies)
● Resources
○ binary.ninja/
○ github.com/quend/abstractanalysis
○ santos.cs.ksu.edu/schmidt/Escuela03/WSSA/talk1p.pdf
○ Static Program Analysis Book! cs.au.dk/~amoeller/spa/spa.pdf
remember: prune this before analysing
Agenda
1) IDA isn’t perfect
2) Binary Ninja IL
3) Practical(Academia) and program analysis
a) Abstract Interpretation
4) Binary Ninja plugin demo
5) Conclusion