Labeling Library Functions in Stripped Binaries
DESCRIPTION
Labeling Library Functions in Stripped Binaries. Emily R. Jacobson, Nathan Rosenblum, and Barton P. Miller, Computer Sciences Department, University of Wisconsin - Madison. Why Binary Code? Source code isn’t available. Source code isn’t the right representation.
TRANSCRIPT
![Page 1: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/1.jpg)
PASTE 2011
Szeged, Hungary
September 5, 2011
Labeling Library Functions in Stripped Binaries
Emily R. Jacobson, Nathan Rosenblum, and Barton P. Miller
Computer Sciences Department
University of Wisconsin - Madison
![Page 2: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/2.jpg)
Why Binary Code?
o Source code isn’t available
o Source code isn’t the right representation
![Page 3: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/3.jpg)
Binary Tools Need Symbol Tables
o Debugging Tools
  o GDB, IDA Pro, …
o Instrumentation Tools
  o PIN, Dyninst, …
o Static Analysis Tools
  o CodeSurfer/x86, …
o Security Analysis Tools
  o IDA Pro, …
![Page 4: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/4.jpg)
Restoring Information
Function locations
Complicated by:
o Missing symbol information
o Variability in function layout (e.g., code sharing, outlined basic blocks)
o High degree of indirect control flow
[Figure: program binary with recovered functions carrying placeholder names targ80c3bd0, targ80c3df4, targ80c3df4]
![Page 5: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/5.jpg)
Restoring Information
What about semantic information?
o Program’s interaction with the operating system (system calls) encapsulated by wrapper functions
Library fingerprinting: identify functions based on patterns learned from exemplar libraries
[Figure: program binary with recovered functions carrying placeholder names targ80c3bd0, targ80c3df4, targ80c3df4]
![Page 6: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/6.jpg)
unstrip
stripped binary parsing + library fingerprinting + binary rewriting
[Figure: program binary with placeholder names targ80c3bd0, targ80c3df4, targ80c3df4 and recovered labels getpid, accept]
![Page 7: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/7.jpg)
<accept>:
  mov %ebx,%edx           Save registers
  mov $0x66,%eax          Set up system call arguments
  mov $0x5,%ebx
  lea 0x4(%esp),%ecx
  int $0x80               Invoke a system call
  mov %edx,%ebx           Error check and return
  cmp $0xffffff83,%eax
  jae syscall_error
  ret
![Page 8: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/8.jpg)
<accept>:
  mov %ebx,%edx
  mov $0x66,%eax
  mov $0x5,%ebx
  lea 0x4(%esp),%ecx
  int $0x80
  mov %edx,%ebx
  cmp $0xffffff83,%eax
  jae syscall_error
  ret

<accept>:
  cmpl $0x0,%gs:0xc
  jne 80f669c
  mov %ebx,%edx
  mov $0x66,%eax
  mov $0x5,%ebx
  lea 0x4(%esp),%ecx
  call *0x814e93c
  mov %edx,%ebx
  cmp $0xffffff83,%eax
  jae syscall_error
  ret
  push %esi
  call enable_asyncancel
  mov %eax,%esi
  mov %ebx,%edx
  mov $0x66,%eax
  mov $0x5,%ebx
  lea 0x8(%esp),%ecx
  call *0x8181578
  mov %edx,%ebx
  xchg %eax,%esi
  call disable_asyncancel
  mov %esi,%eax
  pop %esi
  cmp $0xffffff83,%eax
  jae syscall_error
  ret

<accept>:
  cmpl $0x0,%gs:0xc
  jne 80f669c
  mov %ebx,%edx
  mov $0x66,%eax
  mov $0x5,%ebx
  lea 0x4(%esp),%ecx
  int $0x80
  mov %edx,%ebx
  cmp $0xffffff83,%eax
  jae syscall_error
  ret
  push %esi
  call enable_asyncancel
  mov %eax,%esi
  mov %ebx,%edx
  mov $0x66,%eax
  mov $0x5,%ebx
  lea 0x8(%esp),%ecx
  int $0x80
  mov %edx,%ebx
  xchg %eax,%esi
  call disable_asyncancel
  mov %esi,%eax
  pop %esi
  cmp $0xffffff83,%eax
  jae syscall_error
  ret
glibc 2.5 on RHEL with GCC 3.4.4
The same function can be realized in a variety of ways in the binary
glibc 2.5 on RHEL with GCC 4.1.2
glibc 2.2.4 on RHEL with GCC 2.95.3
![Page 9: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/9.jpg)
Binary-level Code Variations
o Function inlining
o Code reordering
o Minor code changes
o Alternative code sequences
![Page 10: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/10.jpg)
Semantic Descriptors
o Rather than recording byte patterns, we take a semantic approach
o Record information that is likely to be invariant across multiple versions of the function
<accept>:
  mov %ebx,%edx
  mov $0x66,%eax
  mov $0x5,%ebx
  lea 0x4(%esp),%ecx
  int $0x80
  mov %edx,%ebx
  cmp $0xffffff83,%eax
  jae 8048300
  ret
  mov %esi,%esi
Highlighted instructions: mov $0x66,%eax; mov $0x5,%ebx; int $0x80
Semantic descriptor: {<socketcall, 5>}
![Page 11: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/11.jpg)
Building Semantic Descriptors
We parse an input binary, locate system calls and wrapper function calls, and employ dataflow analysis.
reboot:
  push %ebp
  mov %esp,%ebp
  sub $0x10,%esp
  push %edi
  push %ebx
  mov 0x8(%ebp),%edx
  mov $0xfee1dead,%edi
  mov $0x28121969,%ecx
  push %ebx
  mov %edi,%ebx
  mov $0x58,%eax
  int $0x80
  …
SYSTEM CALL
  EAX = 0x58 (reboot)
  EBX = %edi = 0xfee1dead
  ECX = 0x28121969
{<reboot, 0xfee1dead, 0x28121969>}
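The dataflow step on this slide can be illustrated with a minimal sketch. It assumes a toy instruction format (opcode, source, destination), straight-line code only, and a hypothetical three-entry excerpt of the 32-bit Linux system call table; the names `resolve`, `describe_syscall`, and `build_descriptor` are invented for illustration and are not part of unstrip.

```python
# Toy model of the reboot wrapper: (opcode, source, destination), AT&T order.
REBOOT = [
    ("mov", "0x8(%ebp)", "%edx"),
    ("mov", "$0xfee1dead", "%edi"),
    ("mov", "$0x28121969", "%ecx"),
    ("push", "%ebx", None),
    ("mov", "%edi", "%ebx"),
    ("mov", "$0x58", "%eax"),
    ("int", "$0x80", None),
]

# Hypothetical excerpt of the i386 Linux system call table.
SYSCALL_NAMES = {0x14: "getpid", 0x58: "reboot", 0x66: "socketcall"}


def resolve(insns, index, reg):
    """Walk backwards from insns[index] looking for a constant held in reg.

    Returns the constant, or None when the value is not a known constant.
    Only straight-line code and mov instructions are modeled (assumption).
    """
    for i in range(index - 1, -1, -1):
        op, src, dst = insns[i]
        if op == "mov" and dst == reg:
            if src.startswith("$"):      # immediate operand
                return int(src[1:], 16)
            if src.startswith("%"):      # register copy: follow the source
                return resolve(insns, i, src)
            return None                  # memory operand: unknown
    return None


def describe_syscall(insns, index, arg_regs=("%ebx", "%ecx")):
    """Build one descriptor entry for the system call at insns[index]."""
    number = resolve(insns, index, "%eax")
    name = SYSCALL_NAMES.get(number, "?")
    return (name,) + tuple(resolve(insns, index, r) for r in arg_regs)


def build_descriptor(insns):
    """Collect one entry per int $0x80 instruction."""
    return [
        describe_syscall(insns, i)
        for i, (op, src, _dst) in enumerate(insns)
        if op == "int" and src == "$0x80"
    ]


print(build_descriptor(REBOOT))
```

On the toy listing, the value of EBX is found by following the copy from %edi back to the immediate 0xfee1dead, mirroring the arrows in the slide's diagram. A real analysis would also have to handle branches and stack operations.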
![Page 12: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/12.jpg)
Building Semantic Descriptors Recursively
sethostid:
  …
  call open
  …
  call write
  …
  mov $0x6,%eax
  int $0x80
  …
Direct system call in sethostid: {<close>}
open:
  …
  mov $0x5,%eax
  int $0x80
  …
From the call to open: {<open, “/etc/hostid”, 577, 420>}
write:
  …
  mov $0x4,%eax
  int $0x80
  …
From the call to write: {<write, ?, ?, 4>}
Descriptor for sethostid: {<close>, <open, “/etc/hostid”, 577, 420>, <write, ?, ?, 4>}
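The recursive construction can be sketched as follows. This is a simplified illustration, not the tool's code: the `SUMMARIES` table is hypothetical, and it assumes that the arguments resolved at a call site of a wrapper are exactly the arguments of the wrapper's system call.

```python
from collections import Counter

UNKNOWN = "?"  # placeholder for an argument the analysis could not resolve

# Hypothetical per-function summaries:
#   "direct": system calls the function invokes itself
#   "calls":  (callee, argument values resolved at the call site)
SUMMARIES = {
    "open": {"direct": [("open",)], "calls": []},
    "write": {"direct": [("write",)], "calls": []},
    "sethostid": {
        "direct": [("close",)],
        "calls": [
            ("open", ("/etc/hostid", 577, 420)),
            ("write", (UNKNOWN, UNKNOWN, 4)),
        ],
    },
}


def descriptor(name, call_args=(), seen=frozenset()):
    """Return the multiset of system call entries reachable from `name`."""
    if name in seen:                 # guard against cycles in the call graph
        return Counter()
    summary = SUMMARIES[name]
    result = Counter()
    for entry in summary["direct"]:
        # Assumption: call-site arguments become the system call arguments.
        result[entry + tuple(call_args)] += 1
    for callee, args in summary["calls"]:
        result += descriptor(callee, args, seen | {name})
    return result


print(descriptor("sethostid"))
```

A multiset (`Counter`) is used so that a function invoking the same system call twice keeps both entries.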
![Page 13: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/13.jpg)
Building a Descriptor Database
[Figure: a glibc reference library is fed to unstrip, which locates wrapper functions and builds semantic descriptors, producing the Descriptor Database]
<accept>:
  mov %ebx,%edx
  mov $0x66,%eax
  mov $0x5,%ebx
  lea 0x4(%esp),%ecx
  int $0x80
  …
Descriptor Database
  {<socketcall, 5>}: accept
  {<socketcall, 4>}: listen
  {<getpid>}: getpid
  …
![Page 14: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/14.jpg)
Building a Descriptor Database
[Figure: several glibc reference libraries are fed to unstrip; for each one it locates wrapper functions (e.g. <accept>) and builds semantic descriptors, and all resulting entries ({<socketcall, 5>}: accept, {<socketcall, 4>}: listen, {<getpid>}: getpid, …) go into one Descriptor Database]
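One way to picture the descriptor database is as a map from descriptors to the function names that produced them, merged over several reference libraries. The sketch below is an assumption about the data layout, not the tool's storage format; `GLIBC_A`, `GLIBC_B`, `freeze`, and `build_database` are hypothetical names, and the second accept descriptor is borrowed from the matching example later in the talk.

```python
from collections import defaultdict


def freeze(descriptor):
    """Turn a list of entries into an order-insensitive, hashable key."""
    return tuple(sorted(descriptor, key=repr))


def build_database(reference_libraries):
    """Map each descriptor to the set of function names that yield it."""
    database = defaultdict(set)
    for library in reference_libraries:
        for name, descriptor in library.items():
            database[freeze(descriptor)].add(name)
    return database


# Hypothetical descriptors extracted from two reference builds of glibc.
GLIBC_A = {
    "accept": [("socketcall", 5)],
    "listen": [("socketcall", 4)],
    "getpid": [("getpid",)],
}
GLIBC_B = {
    "accept": [("socketcall", 5), ("socketcall", 5), ("futex",)],
    "listen": [("socketcall", 4)],
    "getpid": [("getpid",)],
}

database = build_database([GLIBC_A, GLIBC_B])
for key, names in database.items():
    print(key, "->", sorted(names))
```

Storing a set of names per descriptor leaves room for functions that cannot be told apart by their descriptors.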
![Page 15: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/15.jpg)
Pattern Matching Criteria
o Two stages
  1) Exact matches
  2) Best match based on coverage criterion
o Handle minor code variations by allowing flexible matches
![Page 16: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/16.jpg)
Pattern Matching Criteria
$\mathrm{coverage}(A,B) = \dfrac{|A \cap B|}{|B|}$, where $A \cap B = \{\, b \in B \mid b \in A \,\}$
A: {<socketcall,5>} (fingerprint from the database)
B: {<socketcall,5>, <socketcall,5>, <futex>} (semantic descriptor from the code)
$A \cap B$ = {<socketcall,5>, <socketcall,5>}; <futex> is not covered
$\mathrm{coverage}(A,B) = 2/3$
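The coverage criterion can be tried out with a short sketch. It reads the slide's definition as "the fraction of the elements of B that also occur in A" and treats B as a multiset; that reading, and the function name `coverage`, are assumptions made for illustration.

```python
from collections import Counter


def coverage(a, b):
    """Fraction of the elements of multiset b that also occur in a."""
    b_counts = Counter(b)
    a_elements = set(a)
    matched = sum(n for elem, n in b_counts.items() if elem in a_elements)
    total = sum(b_counts.values())
    return matched / total if total else 0.0


A = [("socketcall", 5)]                                  # database fingerprint
B = [("socketcall", 5), ("socketcall", 5), ("futex",)]   # descriptor from code

print(coverage(A, B))
```

With the slide's example, two of the three elements of B are covered by A.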
![Page 17: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/17.jpg)
Multiple Matches
o It’s possible that two or more functions are indistinguishable
o Policy decision: return set of potential matches
o In practice, we’ve observed 8% of functions have multiple matches, but the size of the match set is small (≤ 3)
![Page 18: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/18.jpg)
Identifying Functions in a Stripped Binary
[Figure: stripped binary → unstrip (consulting the Descriptor Database) → unstripped binary]
For each wrapper function {
  1. Build the semantic descriptor.
  2. Search the database for a match (apply two-stage matching process).
  3. Add label to symbol table.
}
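The loop above, together with the two-stage matching process, might look roughly like the sketch below. It is an illustration under stated assumptions: descriptors are already built, the database is a plain dictionary, coverage is read as the fraction of the descriptor's elements found in the fingerprint, and the `threshold` parameter is hypothetical (the slides do not give a cut-off). The addresses reuse the placeholder names from the earlier figure.

```python
from collections import Counter


def freeze(descriptor):
    """Order-insensitive, hashable form of a descriptor."""
    return tuple(sorted(descriptor, key=repr))


def coverage(fingerprint, descriptor):
    """Fraction of descriptor elements that also occur in the fingerprint."""
    counts = Counter(descriptor)
    known = set(fingerprint)
    total = sum(counts.values())
    matched = sum(n for elem, n in counts.items() if elem in known)
    return matched / total if total else 0.0


def match(descriptor, database, threshold=0.5):
    """Stage 1: exact match. Stage 2: best coverage (hypothetical threshold)."""
    key = freeze(descriptor)
    if key in database:
        return set(database[key])
    best, names = 0.0, set()
    for fingerprint, candidates in database.items():
        score = coverage(fingerprint, descriptor)
        if score > best:
            best, names = score, set(candidates)
        elif score == best and score > 0:
            names |= candidates      # keep every equally good candidate
    return names if best >= threshold else set()


def label_functions(wrappers, database):
    """Return {address: sorted candidate names} for every matched wrapper."""
    symbols = {}
    for address, descriptor in wrappers.items():
        names = match(descriptor, database)
        if names:
            symbols[address] = sorted(names)
    return symbols


DATABASE = {
    (("socketcall", 5),): {"accept"},
    (("socketcall", 4),): {"listen"},
    (("getpid",),): {"getpid"},
}
WRAPPERS = {
    0x80C3BD0: [("getpid",)],
    0x80C3DF4: [("socketcall", 5), ("socketcall", 5), ("futex",)],
}

print(label_functions(WRAPPERS, DATABASE))
```

Returning a set of names rather than a single name follows the policy stated on the Multiple Matches slide.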
![Page 19: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/19.jpg)
Implementation
stripped binary parsing + library fingerprinting + binary rewriting
![Page 20: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/20.jpg)
Evaluation
o To evaluate across three dimensions of variation, we constructed three data sets:
  o GCC version
  o glibc version
  o distribution vendor
o In each set, compile statically-linked binaries, build a descriptor database (DDB), compare unstrip to IDA Pro’s FLIRT
o Evaluation measure is accuracy
![Page 21: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/21.jpg)
Evaluation Results: GCC Version Study
[Figure: bar chart, “GCC 3.4.4 Patterns Predicting Each Library”. x-axis: GCC version (3.4.4, 4.0.2, 4.1.2, 4.2.1); y-axis: accuracy (0 to 1); series: unstrip, IDA Pro]
![Page 22: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/22.jpg)
Evaluation Results: glibc Version Study
[Figure: bar chart, “glibc 2.2.4 Patterns Predicting Each Library”. x-axis: glibc version (2.2.4, 2.3.2, 2.3.4, 2.5, 2.11.1); y-axis: accuracy (0 to 1); series: unstrip, IDA Pro]
![Page 23: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/23.jpg)
Evaluation Results: Distribution Study
[Figure: bar chart, “Fedora Patterns Predicting Each Library”. x-axis: distribution (Fedora, Mandriva, OpenSuse, Ubuntu); y-axis: accuracy (0 to 1); series: unstrip, IDA Pro]
![Page 24: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/24.jpg)
unstrip is available at http://www.paradyn.org/html/tools/unstrip.html
![Page 25: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/25.jpg)
Backup slides follow
![Page 26: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/26.jpg)
Evaluation Results: GCC Version Study (Temporal: backwards)
[Figure: bar chart, “GCC 4.2.1 Patterns Predicting Each Library”. x-axis: GCC version (3.4.4, 4.0.2, 4.1.2, 4.2.1); y-axis: accuracy (0 to 1); series: unstrip, IDA Pro]
![Page 27: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/27.jpg)
Evaluation Results: glibc Version Study (Temporal: backwards)
[Figure: bar chart, “glibc 2.11.1 Patterns Predicting Each Library”. x-axis: glibc version (2.2.4, 2.3.2, 2.3.4, 2.5, 2.11.1); y-axis: accuracy (0 to 1); series: unstrip, IDA Pro]
![Page 28: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/28.jpg)
Evaluation Results: Distribution Study (one predicts the rest)
[Figure: bar chart, “Mandriva Patterns Predicting Each Library”. x-axis: distribution (Fedora, Mandriva, OpenSuse, Ubuntu); y-axis: accuracy (0 to 1); series: unstrip, IDA Pro]
![Page 29: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/29.jpg)
Evaluation Results: GCC Version Study (one predicts the rest)
[Figure: bar chart. x-axis: GNU C Compiler Version (3.4.4, 4.0.2, 4.1.2, 4.2.1); y-axis: Accuracy (0 to 1); series: unstrip, IDA Pro]
![Page 30: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/30.jpg)
Evaluation Results: glibc Version Study (one predicts the rest)
[Figure: bar chart. x-axis: glibc version (2.2.4, 2.3.2, 2.3.4, 2.5, 2.11.1); y-axis: Accuracy (0 to 1); series: unstrip, IDA Pro]
![Page 31: Labeling Library Functions in Stripped Binaries](https://reader036.vdocument.in/reader036/viewer/2022081604/56816386550346895dd4719a/html5/thumbnails/31.jpg)
Evaluation Results: Distribution Study (one predicts the rest)
[Figure: bar chart. x-axis: Distribution Vendor (Fedora, Mandriva, OpenSuse, Ubuntu); y-axis: Accuracy (0 to 1); series: unstrip, IDA Pro]