
Address Obfuscation: An Efficient Approach to Combat a Broad Range of Memory Error Exploits

Sandeep Bhatkar, Daniel C. DuVarney, and R. Sekar

Stony Brook University Department of Computer Science

USENIX Security Symposium, 2003

Tracy Wagner
CDA 6938
March 22, 2007

Outline

Introduction
Attack Goals and Methods
Address Obfuscation
Transformations
Implementation Concerns
Effectiveness
Performance
Conclusion
Strengths/Weaknesses/Future Work

Introduction

Exploits of memory programming errors
  Stack smashing
  Integer overflow
  Heap overflow
  Double-free vulnerability
C/C++ low level control
Require precise knowledge of victim program

Introduction

Address Obfuscation
  Make relative and absolute addresses of program code and data unpredictable
  On execution, virtual addresses are randomized
Exploits become non-deterministic
Effective against large-scale attacks

Randomization strategies for both data and code locations

Attack Goals

Cause the target program to execute attack code

Attack Code
  Injected Code – provided by attacker
  Existing Code – already part of program

How Do We Attack?

Direct: Change Control Flow of Program – change a code pointer
  Return address – stack
  Function pointers – stack, heap, static area
  Global offset table (GOT)
Indirect: Change Security-Critical Data used in the course of execution
  Arguments to system calls
  Variables holding sensitive data

How Do We Attack?

Absolute Address-Dependent Attacks
  Can corrupt a code pointer or a data pointer
  Overwrite pointer value with the absolute address of attacker-defined data or code
Relative Address-Dependent Attacks
  Corrupt non-pointer data
  Need to know relative distance between buffer and location of item to corrupt
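
To illustrate the relative case, here is a minimal sketch that models memory as a Python bytearray. The layout (a 16-byte buffer followed directly by a 1-byte flag) and the names `unchecked_copy`, `relative_payload`, `BUF_OFFSET` and `FLAG_OFFSET` are assumptions made for this example, not taken from the paper. The payload only works if the attacker knows the distance from the buffer to the flag.

```python
def unchecked_copy(memory: bytearray, dest: int, payload: bytes) -> None:
    """Copy payload into memory at dest without checking the buffer size."""
    memory[dest:dest + len(payload)] = payload


def relative_payload(distance: int, new_value: bytes) -> bytes:
    """Filler bytes up to the target's relative offset, then the new value."""
    return b"A" * distance + new_value


# Hypothetical layout: a 16-byte buffer followed directly by a 1-byte flag.
BUF_OFFSET = 0
FLAG_OFFSET = 16

payload = relative_payload(FLAG_OFFSET - BUF_OFFSET, b"\x01")

predictable = bytearray(32)
unchecked_copy(predictable, BUF_OFFSET, payload)   # flag byte is overwritten

# Same payload against a layout where the flag sits 8 bytes further away.
shifted_flag_offset = FLAG_OFFSET + 8
shifted = bytearray(32)
unchecked_copy(shifted, BUF_OFFSET, payload)       # flag byte is untouched
```

In the second layout the same payload stops short of the flag, which is the effect that randomizing relative distances aims for.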

Address Obfuscation

Goal is to Randomize:
  Absolute locations of all code and data
  Relative distances between data items
Transformations
  Randomize Base Addresses of Memory Regions
  Permute the Order of Variables/Routines
  Introduce Random Gaps Between Objects

Base Address Randomization

Change the base addresses of code and data regions by a random amount
A large range (1 to 100 million) results in highly unpredictable virtual addresses
Does not increase the physical memory requirements
Some virtual address space becomes unusable
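
A minimal sketch of the idea, under stated assumptions: the shift is drawn in whole pages, `PAGE_SIZE` is 4096, and `DEFAULT_BASE` is a hypothetical load address. None of these values or names come from the slides. Only the starting virtual address moves, so nothing extra is allocated, and distances inside the region are unchanged.

```python
import random

PAGE_SIZE = 4096            # assumed page size; real systems vary
DEFAULT_BASE = 0x08048000   # hypothetical default base of a region


def randomize_base(default_base: int, max_shift: int, rng: random.Random) -> int:
    """Return default_base moved up by a random, page-aligned amount."""
    pages = rng.randint(1, max_shift // PAGE_SIZE)
    return default_base + pages * PAGE_SIZE


rng = random.Random()  # seeded from OS entropy when no seed is given
base = randomize_base(DEFAULT_BASE, 100_000_000, rng)

# Absolute addresses move with the base; the distance between them does not.
addr_a = base + 0x10
addr_b = base + 0x40
distance = addr_b - addr_a
```

Because `distance` is the same on every run, base address randomization alone does not stop relative address-dependent attacks.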

Base Address Randomization

Base address of stack
  All stack addresses randomized
  Subtract a large random value from stack pointer
Base address of heap
  Randomizes absolute locations of data
  Allocate a large block of random size
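
The heap case can be sketched with a toy allocator. `BumpHeap` and `start_randomized_heap` are hypothetical names, and the allocator ignores alignment and freeing, so this is a model of the idea and not how a real malloc works. A throwaway block of random size is requested first, which shifts every later allocation.

```python
import random


class BumpHeap:
    """Toy heap that hands out consecutive addresses from a start address."""

    def __init__(self, start: int) -> None:
        self.next_free = start

    def malloc(self, size: int) -> int:
        addr = self.next_free
        self.next_free += size
        return addr


def start_randomized_heap(start: int, max_pad: int, rng: random.Random) -> BumpHeap:
    """Create a heap whose first block is a never-used random-size pad."""
    heap = BumpHeap(start)
    heap.malloc(rng.randint(1, max_pad))  # only purpose: shift later blocks
    return heap


HEAP_START = 0x10000000  # hypothetical heap start address
heap = start_randomized_heap(HEAP_START, 100_000_000, random.Random())
first_user_block = heap.malloc(64)  # lands at an unpredictable address
```

The stack transformation is analogous: subtracting a random value from the stack pointer at startup shifts every later frame by the same amount.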

Base Address Randomization

Starting address of DLLs
  Randomize location of all code and static data
  Defends against existing-code attacks and static data corruption
Locations of routines and static data
  Randomize all functions and associated static data in the executable
  Similar to DLL randomization

Variable/Routine Permutations

Three possible transformations
  Order of local variables in stack frame
  Order of static variables
  Order of routines in shared libraries or in executable
Defends against relative distance exploits
Difficult to predict distances
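
A minimal sketch of a permutation, assuming variables are packed back to back with no alignment padding. The function name `permuted_layout` and the variable names and sizes are made up for the example. Shuffling the order changes the offset of each variable, and therefore the distance between any two of them.

```python
import random
from typing import Dict


def permuted_layout(sizes: Dict[str, int], rng: random.Random) -> Dict[str, int]:
    """Assign each variable an offset after shuffling the declaration order."""
    names = list(sizes)
    rng.shuffle(names)
    offsets: Dict[str, int] = {}
    cursor = 0
    for name in names:
        offsets[name] = cursor
        cursor += sizes[name]
    return offsets


# Hypothetical local variables of one routine, with sizes in bytes.
frame_vars = {"buf": 64, "count": 4, "is_admin": 4}
layout = permuted_layout(frame_vars, random.Random())
```

With only three variables there are six possible orders, so a permutation on its own gives limited unpredictability for small frames.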

Random Gaps Between Objects

Used when relative order of objects cannot be changed
Add random padding:
  Within stack frames
  Between malloc requests
  Between variables in static area
  Gaps within routines (add jump instructions to skip over them)
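
A minimal sketch of random padding, assuming gaps are drawn uniformly between 0 and a hypothetical `max_gap` bytes. The name `padded_layout` and the object list are illustrative only. The order of objects is kept, but the distance between neighbours varies from run to run.

```python
import random
from typing import Dict, List, Tuple


def padded_layout(
    objects: List[Tuple[str, int]], max_gap: int, rng: random.Random
) -> Dict[str, int]:
    """Keep the given order but put a random gap before each object."""
    offsets: Dict[str, int] = {}
    cursor = 0
    for name, size in objects:
        cursor += rng.randint(0, max_gap)  # random padding before the object
        offsets[name] = cursor
        cursor += size
    return offsets


# Hypothetical objects (name, size in bytes) whose order must stay fixed.
static_vars = [("config", 32), ("buffer", 128), ("handler", 8)]
gapped = padded_layout(static_vars, 64, random.Random())
```

Unlike a permutation, padding costs memory, so `max_gap` trades space for unpredictability.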

Implementation - Timing

Performing Transformations
  Compile-time, link-time, installation-time, load-time
  Higher performance when closer to compile-time
  Delaying transformations:
    Can apply to third party software
    System tools not modified

Implementation - Timing

Determining Randomization Amounts
  Transformation time
  Beginning of program execution
  Continuously, during execution
Continuously is most secure
Performance or compatibility issues may force other choices
Transformation time – necessary to re-transform code periodically

Effectiveness

Not foolproof, but increases the attacker's work
Defends against attacks which involve overwriting a single value without ability to read memory contents
Can be defeated in specific instances
  Program allows reading of memory contents
  Double pointer attack
  Partial overwrite attack
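
As a rough way to quantify "increases the attacker's work", the sketch below assumes the target is equally likely to be at any of `num_positions` locations, that each attempt faces a fresh independent randomization, and that the attacker cannot read memory. These assumptions and the function name `success_probability` are illustrative and not taken from the paper.

```python
def success_probability(num_positions: int, attempts: int) -> float:
    """Chance that at least one of `attempts` independent guesses is right."""
    miss = 1.0 - 1.0 / num_positions
    return 1.0 - miss ** attempts


# Hypothetical figure: one million equally likely positions for the target.
one_try = success_probability(1_000_000, 1)
thousand_tries = success_probability(1_000_000, 1_000)
```

Under these assumptions a single attempt succeeds with probability 1/num_positions, and the expected number of attempts before the first success equals num_positions. Programs that leak memory contents break the model, which is why the attacks listed above can still succeed.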

Performance

Static relocation at link-time (1)
Dynamic relocation at load-time (2)
Transformations
  Relocate base of stack, heap, and code regions
  Introduce random gaps within the stack frame of each routine and at the end of each malloc-requested block

Performance

Static relocation at link-time has essentially no runtime overhead
Dynamic relocation at load-time has noticeable overhead but provides broad protection for DLL distribution

Conclusion

Addresses root cause of buffer overflow exploits
  Predictable location of data

Generic mechanism with a wide range of applications

Causes attacker to start from scratch for each system attacked

Slows the spread of self-replicating attacks

Strengths

Developed and analyzed a range of transformations that can be implemented with low runtime overheads

Permits randomization of variable and routine locations

Protects against a wide range of attacks

Easily applied to existing code; can be applied selectively to individual applications

Weaknesses

Transformation time randomizations introduce opportunity for local attacker

Adding jump instructions to skip over inserted gaps within routines would provide more information to attacker

Programs have to be periodically re-obfuscated because some randomizations are fixed at transformation time

Vulnerable to some specially crafted attacks in current implementation

Future Work

Further implementation of described transformations
Improve randomization at binary level
  Tool to work with existing binaries
  Add information section to binary
Move towards randomizing at start of program execution and continuously changing during execution to avoid necessity of re-obfuscation
