esil - universal il (intermediate language) for radare2

45
esil - universal il ESIL - Intermediate Language for radare2 toolset Anton Kochkov (@akochkov) June 23, 2016 ZeroNights 11-2015

Upload: anton-kochkov

Post on 19-Feb-2017

311 views

Category:

Software


8 download

TRANSCRIPT

Page 1: ESIL - Universal IL (Intermediate Language) for Radare2

esil - universal ilESIL - Intermediate Language for radare2 toolset

Anton Kochkov (@akochkov)

June 23, 2016

ZeroNights 11-2015

Page 2: ESIL - Universal IL (Intermediate Language) for Radare2

anton kochkov

• Moscow, Russia

• Love Reverse Engineering, foreign languages and travel

• Member of R2 crew, radare2 evangelist

• Security Code Ltd.

2

Page 3: ESIL - Universal IL (Intermediate Language) for Radare2

intermediate languages

Page 4: ESIL - Universal IL (Intermediate Language) for Radare2

what is intermediate language?

• Intermediate language is the language of an abstract machinedesigned to aid in the analysis of computer programs. il-wikipedia

• Heavily used for academic research and real world tools

• Vital for decompilation process

• Base for various kind of applications - SMT, AEG, AEP, etc

4

Page 5: ESIL - Universal IL (Intermediate Language) for Radare2

reil1

• Invented by Zynamics company

• BinNavi, BinDiff were both based on top of REIL

• x86, ARM, PowerPC architectures supported

• VM - Infinite memory

• VM - Infinite range of registers

• Missing Floating Point

• Written in Java

1reil.

5

Page 6: ESIL - Universal IL (Intermediate Language) for Radare2

reil

• 17 instructions

• Register aliases (eax, ebx, r0, . . . )2

2reil-zynamics.

6

Page 7: ESIL - Universal IL (Intermediate Language) for Radare2

bap

• BAP - Binary Analysis Platform3

• IL itself called BIL

• Well-maintained framework, various tools included

• Integration with another tools like: TEMU, libVEX, IDA Pro, Qira,. . .

• Targeted for x86, ARM

• Missing Floating Point

3bap-handbook.

7

Page 8: ESIL - Universal IL (Intermediate Language) for Radare2

bitblaze (vineil/vex)

• BitBlaze4 - also a platform, like BAP

• Uses two intermediate languages

• Vine IL - “low-level” language

• VEX IL (libVEX from valgrind) - “high-level” language

• Written in OCaml + C++

4bitblaze.

8

Page 9: ESIL - Universal IL (Intermediate Language) for Radare2

vine il56

• Implicitly require description of all side-effects

• Quite similar to ESIL

• Very useful for direct code emulation

• Too much unused information

5vine-manual.6vine-vuln-analysis.

9

Page 10: ESIL - Universal IL (Intermediate Language) for Radare2

vex il

• Infinite memory

• Infinite range of registers

• Support types

• “Variable scope” support

• Used in Valgrind and a few other programs

• Well-tested and actively maintained

10

Page 11: ESIL - Universal IL (Intermediate Language) for Radare2

rreil, openreil, mail

• RREIL7 - modern and flexible alternative to REIL

• RREIL - supports types

• RREIL - unique conceipt of “domains”

• MAIL - IL, constructed specifically for Malware analysis

• MAIL - the only IL, which allows polymorph programs

• RREIL and MAIL - both lacks Floating Point support

7rreil.

11

Page 12: ESIL - Universal IL (Intermediate Language) for Radare2

rreil, openreil, mail

• OpenREIL8 - reinvention and attempt to spruce up REIL

• OpenREIL - self-contained and ready to use framework, like BAP

• OpenREIL has major differencies from the original REIL

• Based on libVEX and has embedded support of SMT-solving

8openreil.

12

Page 13: ESIL - Universal IL (Intermediate Language) for Radare2

esil - what’s different and what’s com-mon

Page 14: ESIL - Universal IL (Intermediate Language) for Radare2

short description

• Evaluable Strings Intermediate Language9

• Based on RPN (Reverse Polish Notation)(for the sake of speed)

• Designed for evaluation and emulation, not human-reading

• Low-level, pretty much alike Vine IL

• Small set of the instructions

• Implicit specification of all side-effects for each instruction

9r2esil.

14

Page 15: ESIL - Universal IL (Intermediate Language) for Radare2

short description

• Designed with a wide range of different architectures in mind

• Infinite memory

• Infinite set of registers

• Register aliases (“native” names, like “eax” or “cpsr”)

• Ability to call external functions (+syscalls)

• Ability to implement “custom ops” easily

• Without Floating Point support (planned, though)

15

Page 16: ESIL - Universal IL (Intermediate Language) for Radare2

esil operands

Table 1: ESIL Operands10

ESIL Opcode Operands Name Desription$ src Syscall syscall$$ src Instruction address Get address of current instruction== src,dst Compare v = dst - src ; update_eflags(v)< src,dst Smaller stack = (dst <src)<= src,dst Smaller or Equal stack = (dst <= src)> src,dst Bigger stack = (dst >src)|= src,reg OR eq reg = reg | src

10r2esil-instruction-set.

16

Page 17: ESIL - Universal IL (Intermediate Language) for Radare2

practical usage

Page 18: ESIL - Universal IL (Intermediate Language) for Radare2

radare

11

11r2advert.

18

Page 19: ESIL - Universal IL (Intermediate Language) for Radare2

radare2 tools

• rax2

• rabin2

• rasm2

• radiff2

• rafind2

• rahash2

• radare2

• r2pm

• rarun2/ragg2/ragg2-cc

19

Page 20: ESIL - Universal IL (Intermediate Language) for Radare2

1 command <—>1 reverse-engineering’notion

1. Every character of the command has some meaning (w = write, p =print)

2. Usually they’re simple abbreviations pdf = p <->print d<->disassemble f <->function

3. Short usage message for each command can be printed with cmd?,e.g. pdf?,?, ???, ???, ?$?, ?@?

20

Page 21: ESIL - Universal IL (Intermediate Language) for Radare2

radare2 — important commands of cli-mode

1. r2 -A or r2 + aaaa : run analysis

2. s : seek to the address or flag

3. pdf : print disassembly for function

4. af? : perform function analysis

5. ax? : do analysis for XREF

6. /? : various kinds of search

7. ps? : print string

8. C? : comments

9. w? : write bytes (hex, assembly, etc)

21

Page 22: ESIL - Universal IL (Intermediate Language) for Radare2

radare2 — visual mode

Page 23: ESIL - Universal IL (Intermediate Language) for Radare2

radare2 — important commands of visual mode

1. V? or just ? : Hotkeys help

2. p/P : circle between various visual modes

3. hjkl/arrows navigation

4. o : go to the offset/address

5. e : show all config variables

6. v : show the functions list

7. _ : HUD

8. V : ASCII Graph

9. 0-9 : jump to the corresponding function

10. u : Undo

23

Page 24: ESIL - Universal IL (Intermediate Language) for Radare2

emulation using esil

• ae* - evaluate and show r2 commands

• aei - VM initialization

• aeim - VM memory/stack setup

• aeip - to set IP (Instruction Pointer) for VM

• aes - step in ESIL emulation

• aec[u] - continue [until]

• aef - emulate whole function by name

24

Page 25: ESIL - Universal IL (Intermediate Language) for Radare2

emulation using esil

• DEMO

25

Page 26: ESIL - Universal IL (Intermediate Language) for Radare2

embedded controller - 8051 - esil vm12

• r2 -a 8051 ite_it8502.rom

• . ite_it8502.r2

• e io.cache=true to cache IO while emulating

• run aei

• run aeim

• run aeip to start from current IP

• aecu [addr] to emulate until IP = [addr]

12r2esiltv.

26

Page 27: ESIL - Universal IL (Intermediate Language) for Radare2

full fledged emulation in vm

• Allows to start ESIL VM with predefined properties

• Allows to run the code (e.g. unpack) in this VM

• Allows to set callbacks on some opcodes

• Allows to translate syscalls into real ones (optional)

• Good example - unpacking Baleful code13

13esil-baleful.

27

Page 28: ESIL - Universal IL (Intermediate Language) for Radare2

using esil emulation in analysis

• Using emulation of the code to find indirect jumps

• Using emulation to find some strings addresses

• Started by aae

• aae is a part of aaaa command

28

Page 29: ESIL - Universal IL (Intermediate Language) for Radare2

using background emulation to improve disassembly output

• Shows possible registers and memory values in comments

• Run ESIL VM with default properties in background

• Show “likely/unlikely” for conditional jumps

• e asm.emu=true

29

Page 30: ESIL - Universal IL (Intermediate Language) for Radare2

using background emulation to improve disassembly output

• DEMO

30

Page 31: ESIL - Universal IL (Intermediate Language) for Radare2

converting into another ils - openreil

• OpenREIL - actively maintained framework

• Ability to perform symbolic execution using SMT solver

• Radare2 has a special command to translate ESIL into OpenREIL

• aetr

31

Page 32: ESIL - Universal IL (Intermediate Language) for Radare2

converting into another ils - openreil

• DEMO

32

Page 33: ESIL - Universal IL (Intermediate Language) for Radare2

embedded controller - 8051 - esil2reil

• r2 -a 8051 ite_it8502.rom

• . ite_it8502.r2

• run pae 36 to show ESIL of the function ’set_SMBus_frequency’

• run aetr `pae 36` to translate ESIL into REIL14

• Save this output into the file/pipe and send as OpenREIL input

• Could be easily automated using r2pipe script

14openreil.

33

Page 34: ESIL - Universal IL (Intermediate Language) for Radare2

radeco il and radeco decompiler

Page 35: ESIL - Universal IL (Intermediate Language) for Radare2

esil -> radeco16

• Uses ESIL as input

• Request more metadata from radare2 (xrefs, functions, etc)

• Using r2pipe to talk to radare2

• Written in pure Rust

• Biggest part of it written during GSoC 2015

• Authors are: Sushant Dinesh and David Kreuter15

• GSoC 2015 was done under the Openwall’s project umbrella

15radeco-gsoc.16radeco.

35

Page 36: ESIL - Universal IL (Intermediate Language) for Radare2

why radeco?

• Existent FOSS decompilers are old and not use modern research

• Interesting methods to decompile rarely implemented in FOSS tools

• Radare2 as a RE suite need a good decompiler

• Challenging still viable task for Google Summer of Code

36

Page 37: ESIL - Universal IL (Intermediate Language) for Radare2

radeco il description

• Based purely on graphs

• Based on some concepts from RREIL and MAIL

• Simplification to SSA form while lifting ESIL -> Radeco IL

• Automatically performed DCE (Dead Code Elimination)

• Types inference17

17tie.

37

Page 38: ESIL - Universal IL (Intermediate Language) for Radare2

radeco demo

• DEMO

38

Page 39: ESIL - Universal IL (Intermediate Language) for Radare2

future improvements

Page 40: ESIL - Universal IL (Intermediate Language) for Radare2

supported architectures

• Currently supported: x86, ARM, GameBoy, 8051, etc

• Goal is to add support for all architectures in radare2

• CPU/SoC/chip profiles for a slight differences between them

40

Page 41: ESIL - Universal IL (Intermediate Language) for Radare2

instruction sets

• Floating point (LLVM/McSema)18

• SIMD instructions (SSE, AVX, Neon, etc)

• VLIW and parallel execution (for DSP architectures emulation)

18floating-point-SO.

41

Page 42: ESIL - Universal IL (Intermediate Language) for Radare2

visual debugging and tracing

• General UI improvements

• Simple representation of diffs between emulation and nativeexecution

• Auto removing dead ways/blocks from ASCII graphs

• Integration with WebUI and Bokken19

19bokken.

42

Page 43: ESIL - Universal IL (Intermediate Language) for Radare2

radeco improvements

• pseudo-C code emission

• Wider support for various: native and custom types

• Recalculating results on the fly, depending from debugging results

• Types inference and function/classes autorecognition2021

20tie.21il-optimization.

43

Page 44: ESIL - Universal IL (Intermediate Language) for Radare2

references

Page 45: ESIL - Universal IL (Intermediate Language) for Radare2

a lot of them I

45