Ben Agre - Adding Another Level of Hell to Reverse Engineering

Upload: source-conference

Post on 29-Nov-2014


Page 1: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Adding Another Level of Hell to Reverse Engineering

OR

Static Binary Obfuscation using Opaque Predicates and Semi-Junk Code

Ben Agre (@sboxkid)

MIT, Raytheon SI

Page 2: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Who am I

• Ben Agre
• Reverse Engineer
• Worked random places
  – Currently work for Raytheon SI
• Done random things
• Kind of an asshole
• Currently a student at MIT

Page 3: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Obligatory term slide

• SDLC
• Sandbox
• APT
• Cyber Pompeii
  – Cyber Eyjafjallajökull (Credit to Jon Oberheide)
• Stuxnet

Page 4: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Overview

• Introduction to x86
• Overview of current packers
• Overview of current ways to beat packers
• Why this is different/why I’m an asshole

Page 5: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Assumptions

• We assume that it is 32-bit x86 assembly
  – This can be extended and would work better with 64 bits, but was originally written for 32
• All functions are assumed to use the cdecl calling convention
• I don’t like my friends, that’s why I built this tool

Page 6: Ben Agre - Adding Another Level of Hell to Reverse Engineering

X86 Assembly

• I apologize to those of you who know assembly; this is going to be review at best, and boring to tears at worst
• Instructions are variable length and not aligned, hence the byte offset at which decoding starts matters
• The smallest instruction is one byte, the largest is 15; anything past that will throw a general-protection (#GP) exception
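To make the alignment point concrete, here is a minimal sketch of a linear-sweep decoder. The opcode table is a tiny hand-picked subset (the opcodes and lengths are real 32-bit encodings, but `OPCODES` and `linear_sweep` are names made up for this illustration, and a real decoder also handles prefixes, ModRM and SIB bytes).

```python
# Toy length table for a handful of one-byte x86 opcodes (32-bit mode).
# Illustration only: not a real disassembler.
OPCODES = {
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
    0xB8: ("mov eax, imm32", 5),
    0xE8: ("call rel32", 5),
    0xEB: ("jmp rel8", 2),
}

def linear_sweep(code, start):
    """Decode sequentially from `start`, as a linear-sweep disassembler would."""
    out, pc = [], start
    while pc < len(code):
        name, length = OPCODES.get(code[pc], ("db 0x%02x" % code[pc], 1))
        if pc + length > len(code):
            out.append("(truncated) " + name)
            break
        out.append(name)
        pc += length
    return out

code = bytes([0xB8, 0x90, 0x90, 0x90, 0xC3, 0x90])
print(linear_sweep(code, 0))  # decoding from the first byte
print(linear_sweep(code, 1))  # decoding one byte later
```

The same six bytes read as `mov eax, imm32; nop` from offset 0 and as three `nop`s, a `ret` and a `nop` from offset 1, which is why stray bytes in front of real code can derail a disassembler.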

Page 7: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Eflags

• EFLAGS is essentially the status register
• It contains 32 bits and can be broken down into individual flags that are used for conditional jumps
• Important flags
  – ZF = Zero Flag
  – SF = Sign Flag
  – OF = Overflow Flag
  – CF = Carry Flag

Page 8: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Basics

• mov r1, r2/imm
  – Move register r2, or an immediate, into r1
• add/sub r1, r2
  – Adds r2 to (or subtracts r2 from) r1, and stores the result in r1
  – Modifies EFLAGS appropriately
• xor r1, r2
  – eXclusive OR r1 and r2, and store the result in r1
  – Modifies EFLAGS appropriately
• jmp
  – Jump to a chunk of code
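As a rough model of "modifies EFLAGS appropriately", the sketch below computes the four flags from the Eflags slide for 32-bit add, sub and xor. `flags_after` is a hypothetical helper written for this transcript, not part of any tool mentioned in the talk.

```python
MASK = 0xFFFFFFFF
SIGN = 0x80000000

def flags_after(op, a, b):
    """Return (result, flags) for a 32-bit add, sub, or xor of a and b."""
    a &= MASK
    b &= MASK
    if op == "add":
        full = a + b
        res = full & MASK
        cf = full > MASK                        # carry out of bit 31
        of = ((a ^ res) & (b ^ res) & SIGN) != 0
    elif op == "sub":
        res = (a - b) & MASK
        cf = a < b                              # borrow
        of = ((a ^ b) & (a ^ res) & SIGN) != 0
    elif op == "xor":
        res = a ^ b
        cf = of = False                         # xor always clears CF and OF
    else:
        raise ValueError(op)
    flags = {"ZF": res == 0, "SF": bool(res & SIGN), "CF": cf, "OF": of}
    return res, flags

print(flags_after("xor", 5, 5))
print(flags_after("add", 0x7FFFFFFF, 1))
print(flags_after("sub", 0, 1))
```

Note that `mov` is absent on purpose: it does not touch the flags at all.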

Page 9: Ben Agre - Adding Another Level of Hell to Reverse Engineering

More Commands

• imul, idiv
  – Signed multiply and divide (mul and div are the unsigned forms)
  – Affect edx:eax; imul sets CF and OF, while idiv leaves the flags undefined
• call addr
  – Call a function

Page 10: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Conditional Jumps

• JS
• JE
• JG
• JLE
• JZ
  – Jump if the zero flag is set
• JNZ
  – Jump if the zero flag is not set
• These all jump on the state of EFLAGS
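The jumps listed above can be written as predicates over the flags. This is a small sketch using the documented conditions for each mnemonic; the `JCC` table and `taken` helper are illustrative names only.

```python
# Each conditional jump is just a boolean test over EFLAGS bits.
JCC = {
    "jz":  lambda f: f["ZF"],
    "je":  lambda f: f["ZF"],                        # same opcode as jz
    "jnz": lambda f: not f["ZF"],
    "js":  lambda f: f["SF"],
    "jg":  lambda f: not f["ZF"] and f["SF"] == f["OF"],
    "jle": lambda f: f["ZF"] or f["SF"] != f["OF"],
}

def taken(mnemonic, flags):
    """True if the conditional jump would be taken with these flags."""
    return JCC[mnemonic](flags)

# Flags as they would be after comparing two equal values.
equal = {"ZF": True, "SF": False, "OF": False, "CF": False}
for m in sorted(JCC):
    print(m, taken(m, equal))
```

If the flags are known ahead of time, the outcome of every one of these jumps is known too, which is what the rest of the talk exploits.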

Page 11: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Now that we’re out of Narnia, let’s shake it up

• Packers were originally trying to make executables smaller

• They are now used to be an ass to reverse engineers

• People have their favorites

Page 12: Ben Agre - Adding Another Level of Hell to Reverse Engineering

General Packer Magic

• Mangle the IAT
  – Make it so on each outside function call it’s hard to figure out where things are going
• Do some operation to all data
  – Uncompress it
• Usually add some anti-debugging magic
  – Armadillo: parent-child debugging
  – Themida: anything it can think of

Page 13: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Current direction

• Currently there is a large push towards making virtual machines

• This approach leads to near-generic defeats: one learns the VM’s language and deals with it

• Tracing is a pain

Page 14: Ben Agre - Adding Another Level of Hell to Reverse Engineering

ASProtect

• Some opaque predicates
• Creates stack madness
• Virtualizes many things

Page 15: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Themida

• “State of the art”
• Uses highly virtualized systems
• Locks the binary in every way it can be
• CISC architecture
• Hates VMs

Page 16: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Both have been kicked badly

• Themida has had its full VM reversed by a pair of Chinese hackers
  – Apparently a modified CISC architecture, or RISC for older versions
  – Softworm did amazing things in this respect
• ASProtect
  – A thousand tutorials on how to beat it
• These systems make a high initial bar to entry but do not give continued protection

Page 17: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Destroying Them

• There is currently a pair of IDA modules for Themida decompiling being sold on the black market

• This shows how broken this model can be at times

• Packing, for all intents and purposes, is deterministic

• Not IND-CCA secure

Page 18: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Terms

• This seems random but is important
• Functionally isomorphic
  – Two functions that do the same thing but look different
• State isomorphic
  – Two states that do the same thing, but look different
• Opaque predicate
  – A question which you know the answer to before you ask it
• If a term doesn’t make sense, ask

Page 19: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Let’s create a way that is different

• Instead of virtualizing the entire system, let’s stay in x86

• Instead of making one high bar of entry, let’s play against the tools

• We can actually modify these binaries to the point where they won’t look the same

• Example

Page 20: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Previous work

• Kenshoto
  – MathIsHard
    • Binary is public, packer is not
  – Does more function rearranging than function obfuscation
• Some packers employ basic junk code, but it’s always actual junk
  – We use semi-junk

Page 21: Ben Agre - Adding Another Level of Hell to Reverse Engineering

What this is

• It’s a packer which is state aware and uses that to its advantage
• It adds little pieces of assembly to be executed
• Also adds bytes from /dev/urandom in order to mess up instruction alignment
• Non-deterministic
• Always executes no matter how things change on the OS

Page 22: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Why you care

• It’s a bit different than the normal way
• Instead of creating a high startup cost we create a continued use cost
• It’s still straight x86 assembly no matter what
• It uses the junk, so it’s hard to determine real from fake code

Page 23: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Mode of operation

• I take some function or group of functions from a fully compiled binary; let’s call the function A
• I take A and I reassemble it into A’
• A’ is functionally isomorphic to A
• However, A’ can look nothing like A
• Opaque predicates are added, as well as the random bytes
• Original function is nopped out
• Functions become longer and have to get rewritten to the end of the program
• Call indirection added

Page 24: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Objectives

• Create a non-deterministic obfuscator
• Make IDA DIAF
• Make a semi-extensible intermediate representation of the assembly
• Make my friends hate me
• ???
• Profit on the tears of my friends?

Page 25: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Why This is different

• Randomization
  – In cryptography, to make it harder for an adversary you randomize your plaintext, making it plaintext aware
• What this means
  – I can pass in a binary twice and get two completely different results

Page 26: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Design Decisions

• There are two separate ways we analyze the program

• Previous state engine
  – Analyze the program, look for opaque predicates
    • xor eax,eax is awesome for this
• Created state engine
  – AKA dynamic state engine
  – Can modify elements, and will use them until they change

Page 27: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Call indirection

• So in our dynamic engine at times we have to fix things up

• We also may not want to actually place function addresses for calls

• IDA uses these to recursively find functions

Page 28: Ben Agre - Adding Another Level of Hell to Reverse Engineering

What is a call

• call 0xdeadb33f
  – push eip (conceptually: the pushed value is the return address, the address of the next instruction)
  – jmp 0xdeadb33f
• What could a call be
  – push eip
  – push 0xdeadb33f
  – retn
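A quick way to convince yourself the two forms are equivalent is to model the stack as a list and compare the end states. This is only a sketch; `native_call` and `push_push_ret` are made-up names and the addresses are placeholders.

```python
def native_call(stack, next_ip, target):
    """call target: push the return address, then jump."""
    stack.append(next_ip)
    return target                 # new eip

def push_push_ret(stack, next_ip, target):
    """push ret_addr; push target; retn."""
    stack.append(next_ip)         # return address the callee will use
    stack.append(target)          # where we actually want to go
    return stack.pop()            # retn pops the top of stack into eip

s1, s2 = [], []
eip1 = native_call(s1, 0x00401005, 0xDEADB33F)
eip2 = push_push_ret(s2, 0x00401005, 0xDEADB33F)
print(hex(eip1), hex(eip2), s1 == s2)
```

Both leave the return address on top of the stack and execution at the target, but the second form contains no `call` instruction for a disassembler to follow.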

Page 29: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Now how do we rewrite this with stubs

– F(retnOffset, callAddress)
  • Switch(retnOffset)
    – Case x:
      » Ret = retnOffset[x]
      » Push ret
      » Push callAddress
      » return

• Each stub is essentially a mini function with a switch table
  – We pregenerate a lookup table (retnOffset)
  – Based on value, push the parent return address
  – Then push address of function to call
  – Return
• This calls callAddress and will then return to the parent function, bypassing the stub on return
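The stub logic can be modeled the same way, with a list standing in for the stack. The table contents and the names `RETN_TABLE` and `stub` are hypothetical; the real tool's table layout is not shown in the slides.

```python
# Pregenerated lookup table: index -> return address inside the parent.
# The addresses here are placeholders for illustration.
RETN_TABLE = {0: 0x00401005, 1: 0x0040102A}

def stub(stack, retn_index, call_address):
    """Model of one call-indirection stub; returns the new eip."""
    ret = RETN_TABLE[retn_index]  # switch on the index
    stack.append(ret)             # push the parent return address
    stack.append(call_address)    # push the function to call
    return stack.pop()            # 'return' transfers control to the callee

stack = []
eip = stub(stack, 1, 0xDEADB33F)  # we are now "inside" the callee
eip_after_callee_ret = stack.pop()  # the callee's own retn
print(hex(eip), hex(eip_after_callee_ret))
```

The callee's return lands directly back in the parent at the table entry, so the stub never appears on the return path.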

Page 30: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Other debated way to do this

• Short call that pushes eip
• Push function to go to
• Retn
• Issue with this is that call is easy to find

Page 31: Ben Agre - Adding Another Level of Hell to Reverse Engineering

A third way

• Push value to jmp to, either offset or address
• Do essentially xchg [esp+4],[esp]
• Retn
• Else do something like
  – Pop eax
  – Jmp eax

Page 32: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Finding opaque predicates

• Some actions have definitive outcomes before they are ever used
  – xor r1,r1
  – sub r2,r2
• These will always set EFLAGS in one specific way, or throw an exception
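A minimal sketch of how such predicates might be found in a straight-line run of instructions follows. It works on assembly text rather than on vtrace objects, only tracks a few flag-neutral mnemonics, and ignores incoming jump targets, so treat `find_opaque` as an illustration of the idea rather than the tool's analysis.

```python
import re

# xor r,r / sub r,r leave ZF=1 and SF=0, so these outcomes are fixed.
SELF_ZERO = re.compile(r"^(xor|sub)\s+(\w+)\s*,\s*(\w+)$")
ALWAYS = {"jz": True, "je": True, "jnz": False, "jne": False,
          "js": False, "jns": True}
# Mnemonics that do not modify EFLAGS (deliberately incomplete).
FLAG_NEUTRAL = {"mov", "push", "pop", "lea", "nop"}

def find_opaque(instrs):
    """Return (index, mnemonic, always_taken) for jumps with a fixed outcome."""
    known = False
    found = []
    for i, ins in enumerate(instrs):
        ins = ins.strip().lower()
        op = ins.split()[0]
        m = SELF_ZERO.match(ins)
        if m and m.group(2) == m.group(3):
            known = True                  # flags now have known values
        elif op in ALWAYS:
            if known:
                found.append((i, op, ALWAYS[op]))
        elif op not in FLAG_NEUTRAL:
            known = False                 # flags may have changed
    return found

block = ["xor eax, eax", "mov ebx, 1", "jz target", "add eax, 1", "jnz other"]
print(find_opaque(block))
```

Here only the `jz` is reported: the `add` in between invalidates what we knew about the flags before the `jnz`.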

Page 33: Ben Agre - Adding Another Level of Hell to Reverse Engineering

However these are not the only predicates

• JZ
  – If the jump is taken we know that the zero flag is set
  – Else it’s not
• Hence we can reason below it
• Add a JNZ, and then throw in some junk
• We know that the jump will be taken, a valid code path followed, and our junk will still mess up IDA
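As a sketch of that rewrite on NASM-style text: on the fall-through path of a `jz` the zero flag is clear, so a `jnz` placed there is always taken and the bytes after it never execute. The function name, label scheme and `rand` parameter are assumptions made for this example; `os.urandom` stands in for reading /dev/urandom.

```python
import os

def add_junk_after_jz(lines, junk_len=4, rand=os.urandom):
    """After each jz, emit an always-taken jnz over some random bytes."""
    out, n = [], 0
    for line in lines:
        out.append(line)
        parts = line.split()
        if parts and parts[0].lower() == "jz":
            label = "real_%d" % n
            n += 1
            junk = ", ".join("0x%02x" % b for b in rand(junk_len))
            out.append("jnz " + label)   # ZF is 0 here, so always taken
            out.append("db " + junk)     # never executed, confuses decoding
            out.append(label + ":")
    return out

original = ["cmp eax, ebx", "jz equal", "mov ecx, 1"]
for line in add_junk_after_jz(original):
    print(line)
```

Running it twice gives different `db` bytes each time, while the executed path stays the same.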

Page 34: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Still too easy

• JZ then JNZ is fairly easy to spot
• Well, we could add some do-nothing instructions if we wanted
  – If we know that after the item is used, there is nothing pertaining to EAX until a mov eax, [edx], we can throw in some instructions
    • Add eax,ecx
    • Xor eax, eax

• These do not change the flow of the program, yet still make RE harder

• Creates an isomorphic state
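The "nothing pertaining to EAX until a mov eax, [edx]" condition is a liveness check. Below is a simplified sketch over assembly text; it ignores sub-registers (ax, al), implicit uses by calls, and the fact that the inserted instructions also change the flags, so `dead_window` is an assumption-heavy illustration, not the tool's analysis.

```python
import re

def overwrites_without_reading(ins, reg):
    """True for 'mov reg, src' where src does not mention reg."""
    m = re.match(r"^mov\s+(\w+)\s*,\s*(.+)$", ins.strip().lower())
    return bool(m) and m.group(1) == reg and reg not in m.group(2)

def mentions(ins, reg):
    return re.search(r"\b%s\b" % reg, ins.lower()) is not None

def dead_window(instrs, start, reg="eax"):
    """Index of the overwrite ending a window where reg is dead, else None."""
    for i in range(start, len(instrs)):
        if overwrites_without_reading(instrs[i], reg):
            return i
        if mentions(instrs[i], reg):
            return None          # reg is read first, so it is live
    return None

code = ["mov ebx, ecx", "add ebx, 4", "mov eax, [edx]", "ret"]
end = dead_window(code, 0)
if end is not None:
    junk = ["add eax, ecx", "xor eax, eax"]   # the slide's example
    code = junk + code                        # safe only if flags are dead too
print(end, code)
```

Anything written to eax inside the window is thrown away by the later `mov eax, [edx]`, so the program's result is unchanged.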

Page 35: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Adding little stubs

• So now that we have some instructions we can throw, we can actually make little sub funcs essentially

• We do some calculation with eax, push it onto the stack and since we controlled the last few things we did, undo it

Page 36: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Looks kinda like

• JNZ (program logic)
• Inc eax (makes eax not zero; compare and jump left out due to space constraints)
• Add eax,edx (edx can be whatever, we don’t care)
• Push eax
• Mov eax,[esp+88]
• JNZ our code
  – After JNZ, random bytes
• Pop eax
• Their code
  – Before any item using eax, overwrites eax

Page 37: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Well so we’re still pretty easy

• Let’s bend the program to our will
  – Dynamic state isomorphisms
  – Calling conventions are awesome
• CDECL means that the program makes some assumptions on function calls
  – EBX stays static
  – However, on call, there are no assumptions about eax, ecx, edx. This means we can mess with these before and after the program executes, except eax after (it holds the return value)
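The cdecl rule can be written down as a small table. This sketch assumes direct calls with stack-passed arguments (an indirect `call eax` would obviously still need eax); `clobberable` and its `position` strings are names invented here.

```python
# cdecl on 32-bit x86: the caller must assume these are trashed by a call.
CALLER_SAVED = {"eax", "ecx", "edx"}
# The callee must preserve these.
CALLEE_SAVED = {"ebx", "esi", "edi", "ebp"}

def clobberable(position, returns_value=True):
    """Registers that inserted code may overwrite at a call boundary."""
    if position == "before_call":
        return set(CALLER_SAVED)
    if position == "after_call":
        # eax carries the return value, so leave it alone if one is used.
        return CALLER_SAVED - ({"eax"} if returns_value else set())
    raise ValueError(position)

print(sorted(clobberable("before_call")))
print(sorted(clobberable("after_call")))
```

These are the registers the semi-junk can freely use around a call without the original code noticing.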

Page 38: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Now we’re getting somewhere

• We can change items before and after the code executes.

• We can also do things like change items in the middle of execution

• So we can do some items where we know how they will modify EFLAGS, on a register that gets changed a bit later without being used
  – Xor eax,eax

• We can add a jump that goes where we want, and just add junk afterwards

Page 39: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Now why is this Semi-Junk

• Because we can fix items up inside of these random little stubs

• If we fix things up inside of these little stubs, then when people look for completely dead code to remove, it won’t be flagged

• It also means that during execution a trace will get a lot of chaff from our items

• Hard to distinguish differences between our code and program code

Page 40: Ben Agre - Adding Another Level of Hell to Reverse Engineering

We’re not deterministic

• There are a lot of things that make this nondeterministic

• Our semi-junk can take one of many different forms

• Our prologue junk can be as long as we want and can redo or undo anything in a short or long version
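A minimal sketch of a non-deterministic stub generator, emitting NASM-style text. The stub shape (save a scratch register, zero it, take an always-true `jz` over random bytes, restore) is one possible form chosen for this example, not necessarily what the tool emits, and it is only safe where the flags are not live. `semi_junk_stub` is a made-up name.

```python
import random

def semi_junk_stub(rng, label):
    """Build one stub; every call may pick a different register and idiom."""
    reg = rng.choice(["eax", "ecx", "edx"])
    zero = rng.choice(["xor {r}, {r}", "sub {r}, {r}"]).format(r=reg)
    junk = ", ".join("0x%02x" % rng.randrange(256)
                     for _ in range(rng.randint(1, 8)))
    return [
        "push " + reg,    # save the scratch register
        zero,             # ZF is now known to be 1
        "jz " + label,    # opaque predicate: always taken
        "db " + junk,     # never executed; desynchronizes disassembly
        label + ":",
        "pop " + reg,     # undo our change to the register
    ]

rng = random.SystemRandom()   # OS randomness, in the spirit of /dev/urandom
for line in semi_junk_stub(rng, "skip_0"):
    print(line)
```

Because the register, the zeroing idiom, and the junk bytes are all drawn at random, packing the same input twice gives different bytes with the same behaviour.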

Page 41: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Hence

• Things look different every time we ever do packing

• This means that each time that a person wants to fix it up, they need to redo the entire process by hand

• If we rearrange functions, and then reapply the packer, then the reverse engineer has to do it all again from scratch

Page 42: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Other Features Not Discussed

• Max length of basic blocks
  – No more than, let’s say, 5 lines can appear together; this is just a parameter
• Tunable parameters for semi-junk code
  – Hence one can have the preambles be short or long
  – Also can tell it to prefer registers
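The max-length parameter could work roughly like the sketch below: after every `max_len` original instructions, a stub is spliced in. `split_blocks` and `make_stub` are hypothetical names, and a real implementation would also have to check that each insertion point is safe for the stub being inserted.

```python
def split_blocks(instrs, max_len, make_stub):
    """Insert a stub so no more than max_len original lines appear together."""
    out, run = [], 0
    for ins in instrs:
        if run == max_len:
            out.extend(make_stub())   # break up the run
            run = 0
        out.append(ins)
        run += 1
    return out

body = ["ins_%d" % i for i in range(7)]
result = split_blocks(body, 5, lambda: ["; semi-junk stub goes here"])
for line in result:
    print(line)
```

With seven instructions and a limit of five, one stub lands between the fifth and sixth instruction.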

Page 43: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Future Work

• Add other architectures
• Move from nasm to my own assembler
  – Yet to be built

• Maybe add some anti debugging foo just for lulz

Page 44: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Added bonus

• FLIRT
  – FLIRT is based on signatures of functions
  – Heavily relies on prologues, hence if we randomize the prologues FLIRT no longer picks up the signatures
  – Makes static binaries so much worse than the amount that they already suck
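To illustrate why a changed prologue defeats byte signatures, here is a toy prefix matcher. It is not FLIRT's actual signature format, just the general idea; the byte sequences are real encodings (`8B EC` and `89 E5` both encode `mov ebp, esp`), while `matches` and `SIG` are made up for the example.

```python
def matches(sig, code):
    """sig is a list of byte values; None is a wildcard byte."""
    return len(code) >= len(sig) and all(
        s is None or s == c for s, c in zip(sig, code))

# push ebp; mov ebp, esp; sub esp, <any imm8>
SIG = [0x55, 0x8B, 0xEC, 0x83, 0xEC, None]

original  = bytes([0x55, 0x8B, 0xEC, 0x83, 0xEC, 0x10])
# Same instructions, but mov ebp, esp uses the other encoding.
rewritten = bytes([0x55, 0x89, 0xE5, 0x83, 0xEC, 0x10])

print(matches(SIG, original), matches(SIG, rewritten))
```

Even this one-instruction re-encoding is enough to miss; inserting semi-junk into the prologue changes far more bytes than that.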

Page 45: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Field tests

• Two groups: 2 highly skilled, 1 skilled, 1 novice in each group
• One group got the program before packing
• One got the program after packing
• Calculated sum of a Fibonacci sequence with memory, using two arrays; non-trivial but not the hardest
  – Also had some other random functions to mess with them
  – Dropped privileges, changed prologues, some other red herrings

Page 46: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Results

• Without packing
  – Around half an hour
• With
  – Around 9
  – Novice gave up

Page 47: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Tool Design

• This tool is based on vtrace
  – Thank you kenshoto

• Uses nasm for assembling the instructions required

• Functions are rewritten at the end of the program, will add pages if necessary

Page 48: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Tool Release

• This tool will most likely be released in the next month after finals
  – I added a feature three weeks ago and it borked so many things
  – Based on vtrace, so one must download it separately
• I’ll probably tweet it or something

Page 49: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Thanks

• For helping me design and build
  – Thing1
• Design
  – d4s, Visi, Psifertex, Metr0, Nitrik
• For just being epic
  – Draugr
  – Raid
  – Gynophage
  – Bliss
  – Hates Irony
  – Kenshoto
  – Prof Zeldovich and Rivest
    • Both of whose classes were awesome
  – The busticati, forever busticating
  – The NY Crew, who are too many to name
  – And all not enumerated herein

Page 50: Ben Agre - Adding Another Level of Hell to Reverse Engineering

Release Addendum

• Will probably be released after my finals, so around May 28th

• I will most likely announce via twitter, @sboxkid

• Email me at [email protected] if you want to know anything else.