
Lecture 2: Assembly Language

Computer and Network Security
8th of October 2018

Computer Science and Engineering Department

CSE Dep, ACS, UPB Lecture 2, Assembly Language 1/38

Recap: Exploration Tools

- assembly and C language
- scripting languages (Bash, Python, Perl)
- hexadecimal
- executable exploration: strings, xxd, objdump, IDA
- process exploration: strace, ltrace, lsof, pmap
- Capture the Flag (CTF) contests: http://ctftime.org


More Info on this Lecture

- https://ocw.cs.pub.ro/courses/iocla


Outline

Introduction to Assembly Language

Assembly Language Basics

x86 Assembly

Dealing with Binary Files

Summary


Evolution of Programming Languages

- machine code (punch cards)
- assembly language (architecture dependent)
- high-level languages (portable, compilers and interpreters)


Current Need for Assembly Language

- low-level optimizations
- features unavailable in the C language
- security (binary analysis, offensive security)
- learning how the machine works


Mnemonics

- building blocks of assembly language
- keywords for assembly instructions
- direct mapping to machine code


Sample Instruction

Assembly to machine code mapping

NASM syntax: add dword [0xdeadbeef], 42

hex: 83 05 ef be ad de 2a

binary: [1000 0011][0000 0101][1110 1111 1011 1110 1010 1101 1101 1110][0010 1010]

Fields, left to right:

- opcode (83): add a sign-extended 8-bit immediate to a 32-bit register or memory operand
- opcode modifiers (05, the ModR/M byte):
    - 2 bits = addressing mode (00)
    - 3 bits = register/opcode modifier (000 selects add)
    - 3 bits = r/m field (101: a 32-bit address follows)
- memory address (ef be ad de): 0xdeadbeef (note the endianness: least significant byte first)
- immediate (2a): 42
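The field breakdown above can be checked with a few lines of Python. This is a minimal sketch that handles only this one instruction form; the function name `decode_add_imm8` is made up for illustration and is not part of any tool mentioned in the lecture.

```python
import struct

# Bytes of `add dword [0xdeadbeef], 42`, as listed on the slide.
code = bytes.fromhex("8305efbeadde2a")

def decode_add_imm8(code: bytes) -> dict:
    """Split the 7-byte instruction into its fields (this one form only)."""
    opcode, modrm = code[0], code[1]
    mod = modrm >> 6              # top 2 bits: addressing mode
    reg = (modrm >> 3) & 0b111    # middle 3 bits: register/opcode modifier
    rm = modrm & 0b111            # low 3 bits: r/m field
    (disp32,) = struct.unpack_from("<I", code, 2)  # little-endian 32-bit address
    (imm8,) = struct.unpack_from("<b", code, 6)    # signed 8-bit immediate
    return {"opcode": opcode, "mod": mod, "reg": reg, "rm": rm,
            "disp32": disp32, "imm8": imm8}

fields = decode_add_imm8(code)
for name, value in fields.items():
    print(f"{name:7} = {value:#x}")
```

Reading the address with the `<I` (little-endian) format is what turns the bytes `ef be ad de` back into 0xdeadbeef.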


Computer Architecture

- instruction set architecture (ISA)
- register set
- addressing methods


Instruction Set Architecture (ISA)

- the types of assembly instructions
- addressing
- moving data
- control flow
- multiple processors may implement the same instruction set
- x86, x86_64, ARM, ARM64, MIPS, PowerPC


The Memory Hierarchy

- registers (used in assembly)
- cache memory (controlled by hardware)
- RAM (used in assembly)
- flash/USB, hard drive
- tape backup



Simple Assembly Program

extern puts

section .data
helloStr: db 'Hello, world!', 0

section .text
global main
main:
    push helloStr       ; argument for puts: address of the string
    call puts


Assembling

Using nasm for assembling

$ nasm -f elf32 hello.asm

Using objdump for inspecting

$ objdump -M intel -d hello.o

[...]

Disassembly of section .text:

00000000 <main>:

0: 68 00 00 00 00 push 0x0

5: e8 fc ff ff ff call 6 <main+0x6>

$ objdump -M intel -r hello.o

[...]

RELOCATION RECORDS FOR [.text]:

OFFSET TYPE VALUE

00000001 R_386_32 .data

00000006 R_386_PC32 puts


Linking

Using ld for linking

$ ld -s -lc -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -e main hello.o -o hello-min

Using objdump for inspecting

$ objdump -M intel -d hello-min

[...]

Disassembly of section .plt:

08048170 <puts@plt-0x10>:

8048170: ff 35 40 92 04 08 push DWORD PTR ds:0x8049240

8048176: ff 25 44 92 04 08 jmp DWORD PTR ds:0x8049244

804817c: 00 00 add BYTE PTR [eax],al

...

08048180 <puts@plt>:

8048180: ff 25 48 92 04 08 jmp DWORD PTR ds:0x8049248

8048186: 68 00 00 00 00 push 0x0

804818b: e9 e0 ff ff ff jmp 8048170 <puts@plt-0x10>

Disassembly of section .text:

08048190 <.text>:

8048190: 68 4c 92 04 08 push 0x804924c

8048195: e8 e6 ff ff ff call 8048180 <puts@plt>
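The `call` and `jmp` encodings in this listing store a signed 32-bit displacement relative to the address of the next instruction, not an absolute address. A minimal Python sketch of that calculation, using the bytes and addresses from the listing (the helper name `rel32_target` is hypothetical):

```python
import struct

def rel32_target(insn_addr: int, insn: bytes) -> int:
    """Target of a 5-byte call/jmp rel32: next instruction address + signed displacement."""
    (rel,) = struct.unpack_from("<i", insn, 1)   # skip the 1-byte opcode
    return (insn_addr + len(insn) + rel) & 0xFFFFFFFF

# 8048195: e8 e6 ff ff ff    call 8048180 <puts@plt>
call_target = rel32_target(0x8048195, bytes.fromhex("e8e6ffffff"))
# 804818b: e9 e0 ff ff ff    jmp 8048170 <puts@plt-0x10>
jmp_target = rel32_target(0x804818B, bytes.fromhex("e9e0ffffff"))

print(hex(call_target))  # 0x8048180
print(hex(jmp_target))   # 0x8048170
```

Both results match the targets objdump prints next to the instructions.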


Another Program

extern printf

section .data
sum_str: db 'Sum is %d.', 10, 0

section .text
global main
main:
    xor eax, eax        ; Initialize sum register to 0.

    mov ecx, 100        ; Start from value and decrement.
add_number:
    add eax, ecx        ; Add value to sum register.
    dec ecx             ; Decrement value.
    test ecx, ecx       ; Test if value is 0.
    jnz add_number      ; If value is 0, quit loop; otherwise jump to label.
    ;loop add_number    ; Does what the above three instructions do (loopnz would also require ZF = 0).

    ; Print value.
    push eax
    push sum_str
    call printf
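To check what the loop computes, here is a rough Python model of the same instruction sequence. It is a sketch of the register arithmetic only, with 32-bit wraparound modelled by masking; the function name `sum_loop` is made up for illustration.

```python
MASK32 = 0xFFFFFFFF

def sum_loop(start: int = 100) -> int:
    """Mirror the add/dec/test/jnz loop from the program above."""
    eax = 0                              # xor eax, eax
    ecx = start                          # mov ecx, 100
    while True:
        eax = (eax + ecx) & MASK32       # add eax, ecx
        ecx = (ecx - 1) & MASK32         # dec ecx
        if ecx == 0:                     # test ecx, ecx / jnz add_number
            break
    return eax

print("Sum is %d." % sum_loop())         # Sum is 5050.
```

The body runs before the test, so the loop behaves like a do-while: it always executes at least once.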


Computer Registers

- used for storing and managing data
- CPU/assembly instructions deal with registers
- the register size gives the architecture size/type (e.g. 32-bit, 64-bit)
- may be orthogonal (general purpose), or may have specific roles
- are referenced by names: eax, ebp, eflags (x86) or r0, r1, r2 (ARM)


CPU Instructions

- instruction mnemonic: what the instruction does
- instruction operands: what the instruction uses


Addressing Modes

- ways for instructions to identify operands
- for code
    - absolute addressing: in the instruction
    - relative addressing: in the instruction (+ current offset)
    - register indirect: address in the register
- for data
    - register: data in register
    - base plus offset: add offset to base value
    - immediate: in the instruction


CISC vs. RISC Architectures

- Complex/Reduced Instruction Set Computing
- CISC: variable instruction size, multi-clock complex instructions, memory-to-memory
- RISC: load-store architecture, focus on software



Assembly Language Syntax

Intel Syntax

xor eax,eax

mov ecx,0x64

add eax,ecx

dec ecx

test ecx,ecx

jne 7 <add_number>

push eax

push 0x0

call 15 <add_number+0xe>

AT&T Syntax

xor %eax,%eax

mov $0x64,%ecx

add %ecx,%eax

dec %ecx

test %ecx,%ecx

jne 7 <add_number>

push %eax

push $0x0

call 15 <add_number+0xe>


Tools of the Trade

- NASM: assembler (Intel syntax)
- GNU as (gas): assembler (AT&T syntax by default)
- GCC (gcc) and ld: compiler/linker
- objdump: disassembler (multiple syntaxes)


x86 Registers

- eax: accumulator, used in arithmetic operations
- ebx: base pointer in memory operations (e.g. arrays)
- ecx: loop counters
- edx: also used in arithmetic operations
- esi: source addresses in memory operations
- edi: destination addresses in memory operations
- ebp: frame base pointer
- esp: stack pointer
- named rax, rbx etc. in x86_64


Addressing

x86 Addressing Modes

mov eax, [0xcafebab3] ; direct (displacement)

mov eax, [esi] ; register indirect (base)

mov eax, [ebp-8] ; based (base + displacement)

mov eax, [ebx*4 + 0xdeadbeef] ; indexed (index*scale + displacement)

mov eax, [edx + ebx + 12] ; based-indexed w/o scale (base + index + displacement)

mov eax, [edx + ebx*4 + 42] ; based-indexed w/ scale (base + index*scale + displacement)
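All of the modes above are special cases of one formula: base + index*scale + displacement, truncated to 32 bits. A minimal Python sketch, assuming made-up register values (the `regs` dictionary and the function name `effective_address` are illustrative only):

```python
MASK32 = 0xFFFFFFFF

def effective_address(base: int = 0, index: int = 0, scale: int = 1, disp: int = 0) -> int:
    """Compute base + index*scale + disp with 32-bit wraparound."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only allows a scale of 1, 2, 4 or 8")
    return (base + index * scale + disp) & MASK32

# Hypothetical register contents, chosen only for the example.
regs = {"ebp": 0x1000, "ebx": 3, "edx": 0x2000}

based = effective_address(base=regs["ebp"], disp=-8)                       # [ebp-8]
indexed = effective_address(index=regs["ebx"], scale=4, disp=0xDEADBEEF)   # [ebx*4 + 0xdeadbeef]
based_indexed = effective_address(base=regs["edx"], index=regs["ebx"],
                                  scale=4, disp=42)                        # [edx + ebx*4 + 42]

print(hex(based), hex(indexed), hex(based_indexed))
```

The same calculation is what `lea` performs, except that `lea` stores the address itself instead of reading memory at it.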


Data Transfer

- mov 〈dest〉, 〈src〉: move
- xchg 〈dest〉, 〈src〉: exchange (swap)
- movzx 〈dest〉, 〈src〉: move with zero extend
- movsx 〈dest〉, 〈src〉: move with sign extend
- movsb: move byte from the location pointed to by esi to the location pointed to by edi
- movsw: similar, move word (2 bytes)
- lea 〈dest〉, 〈src〉: load effective address (calculate address of 〈src〉 and load it to 〈dest〉)
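The difference between movzx and movsx only shows when the top bit of the source is set. A small Python sketch of both for an 8-bit source and a 32-bit destination (the function names are made up for illustration):

```python
def movzx8(value: int) -> int:
    """Zero-extend an 8-bit value to 32 bits: upper bits become 0."""
    return value & 0xFF

def movsx8(value: int) -> int:
    """Sign-extend an 8-bit value to 32 bits: upper bits copy bit 7."""
    value &= 0xFF
    if value & 0x80:
        return value | 0xFFFFFF00
    return value

for byte in (0x7F, 0x80, 0xFF):
    print(f"{byte:#04x}: movzx -> {movzx8(byte):#010x}, movsx -> {movsx8(byte):#010x}")
```

For 0xff, zero extension yields 255 while sign extension yields the 32-bit pattern for -1, which is why the choice depends on whether the source is treated as unsigned or signed.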


Control Flow

- Control Instructions:
    - jmp 〈addr〉: loads 〈addr〉 into eip
    - call 〈addr〉: pushes the return address (the eip of the next instruction) on the stack, and loads 〈addr〉 into eip
    - ret 〈val〉: pops the top of the stack into eip, then pops 〈val〉 more bytes off the stack
    - loop 〈addr〉: decrements ecx, and jumps to 〈addr〉 if ecx != 0
- Conditional Jump Flags:
    - ZF (zero flag): previous arithmetic operation resulted in zero
    - SF (sign flag): previous result's most significant bit
    - CF (carry flag): previous result produced a carry out of (or a borrow into) the most significant bit, i.e. unsigned overflow
    - OF (overflow flag): previous result does not fit the signed range of the register, i.e. signed overflow
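As an illustration of how the four flags differ, the following Python sketch models a 32-bit add and derives ZF, SF, CF and OF from it. It covers addition only, and the function name `add32_flags` is hypothetical.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000

def add32_flags(a: int, b: int):
    """Return (result, flags) for a 32-bit add, modelling ZF, SF, CF and OF."""
    a &= MASK32
    b &= MASK32
    full = a + b                  # unbounded Python integer
    result = full & MASK32        # what actually lands in the register
    flags = {
        "ZF": result == 0,
        "SF": bool(result & SIGN32),
        "CF": full > MASK32,      # carry out of bit 31 (unsigned overflow)
        # Signed overflow: operands share a sign, but the result's sign differs.
        "OF": bool(~(a ^ b) & (a ^ result) & SIGN32),
    }
    return result, flags

print(add32_flags(0x7FFFFFFF, 1))   # largest positive + 1: OF and SF set, CF clear
print(add32_flags(0xFFFFFFFF, 1))   # wraps to 0: ZF and CF set, OF clear
```

The two calls show that CF and OF are independent: the first overflows only as a signed addition, the second only as an unsigned one.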


Arithmetic/Logical

- Arithmetic Instructions:
    - add 〈dest〉, 〈src〉: addition
    - sub 〈dest〉, 〈src〉: subtraction
    - mul 〈arg〉: unsigned multiplication with the same-size part of eax (i.e. 〈arg〉 = dh: ax = al * dh)
    - imul 〈arg〉: signed multiplication
    - imul 〈dest〉, 〈src〉: signed multiplication (dest = dest * src)
    - imul 〈dest〉, 〈src〉, 〈aux〉: signed multiplication (dest = src * aux)
    - div 〈arg〉: unsigned division
    - idiv 〈arg〉: signed division
    - neg 〈arg〉: 2's complement negation
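The one-operand mul is the least obvious of these, because the other operand and the destination are implicit. A hedged Python sketch of the 8-bit and 32-bit forms, plus neg (function names are made up for illustration):

```python
MASK32 = 0xFFFFFFFF

def mul8(al: int, src: int) -> int:
    """mul r/m8: ax = al * src (unsigned), returned as the 16-bit ax value."""
    return ((al & 0xFF) * (src & 0xFF)) & 0xFFFF

def mul32(eax: int, src: int):
    """mul r/m32: edx:eax = eax * src (unsigned), returned as (edx, eax)."""
    product = (eax & MASK32) * (src & MASK32)
    return product >> 32, product & MASK32

def neg32(value: int) -> int:
    """neg: two's complement negation within 32 bits."""
    return (-value) & MASK32

print(hex(mul8(0x10, 0x20)))
print(tuple(hex(part) for part in mul32(0xFFFFFFFF, 2)))
print(hex(neg32(1)))
```

Because the 32-bit product can need 64 bits, the upper half goes to edx, which is one reason edx is listed alongside eax as an arithmetic register.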


Arithmetic/Logical (2)

- Shifts and Rotations:
    - shr, shl (logical shift right/left)
    - sar, sal (arithmetic shift right/left)
    - shld, shrd (double-shift)
    - ror, rol (rotate)
    - rcr, rcl (rotate with carry)
- Logical Instructions:
    - and, or, xor, not
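A short Python sketch of how a logical shift, an arithmetic shift and a rotate differ on the same 32-bit value. It models only the register result, not the flags, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def shr32(value: int, count: int) -> int:
    """Logical shift right: vacated high bits are filled with 0."""
    return (value & MASK32) >> (count & 31)

def sar32(value: int, count: int) -> int:
    """Arithmetic shift right: vacated high bits copy the sign bit."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32          # reinterpret as a negative Python integer
    return (value >> (count & 31)) & MASK32

def rol32(value: int, count: int) -> int:
    """Rotate left: bits shifted out on the left re-enter on the right."""
    value &= MASK32
    count &= 31
    return ((value << count) | (value >> (32 - count))) & MASK32

print(hex(shr32(0x80000000, 4)))  # 0x8000000
print(hex(sar32(0x80000000, 4)))  # 0xf8000000
print(hex(rol32(0x80000001, 4)))  # 0x18
```

The shift count is masked to 5 bits, as the processor does for 32-bit operands.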


Function Calls

More in Lecture 4: The Stack. Buffer Management


System Calls

- the interface that allows user applications to request services from the OS kernel
- the mechanism is invoked by triggering an interrupt (int 0x80)
- conventions for invoking a syscall on Linux:
    - eax contains the syscall ID
    - parameters are passed in ebx, ecx, edx, esi, edi, ebp (in this order)
    - the syscall is responsible for saving and restoring all registers (eax carries the return value)
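The register convention can be pictured as a simple mapping from a syscall ID and its arguments to registers. The sketch below only builds that mapping; it does not invoke anything. The syscall numbers (1 for exit, 4 for write) are the 32-bit x86 Linux values and are an addition not taken from the slides; the buffer address and length are borrowed from the hello-min example for illustration.

```python
# Argument registers, in the order given above.
ARG_REGS = ("ebx", "ecx", "edx", "esi", "edi", "ebp")

# Assumption: 32-bit x86 Linux syscall numbers, not listed in the slides.
SYS_EXIT = 1
SYS_WRITE = 4

def syscall_registers(number: int, *args: int) -> dict:
    """Return the register contents expected just before `int 0x80`."""
    if len(args) > len(ARG_REGS):
        raise ValueError("at most six arguments fit in registers")
    regs = {"eax": number}
    regs.update(zip(ARG_REGS, args))
    return regs

# write(1, buf, 13): fd 1 is stdout; 0x804924c and 13 are illustrative values.
write_regs = syscall_registers(SYS_WRITE, 1, 0x804924C, 13)
exit_regs = syscall_registers(SYS_EXIT, 0)
print(write_regs)
print(exit_regs)
```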



Disassembling

- checking the assembly code in object/executable files
- use disassemblers; no need for source code
- useful for reverse engineering
- objdump, IDA, GDB, radare2, Hopper, Immunity Debugger


Disassembling (2)

- disassemble machine code stored in non-object (raw binary) files
- objdump -D -b binary -m i386 binary-file


Using NOPs

- when altering binary machine code
- you can't remove bytes; doing so would shift all the following offsets
- use a hex editor (hexedit, bless) and replace code with NOP instructions (0x90 in x86 assembly)
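The same patching can be scripted. A minimal Python sketch that overwrites a byte range with NOPs while keeping the total length unchanged; the helper name `nop_out` is made up, and the sample bytes are the two `.text` instructions of hello-min from the linking slide.

```python
NOP = 0x90

def nop_out(code: bytes, offset: int, length: int) -> bytes:
    """Replace `length` bytes starting at `offset` with NOPs; the size never changes."""
    if offset < 0 or length < 0 or offset + length > len(code):
        raise ValueError("patch range falls outside the code")
    patched = bytearray(code)
    patched[offset:offset + length] = bytes([NOP]) * length
    return bytes(patched)

# push 0x804924c ; call puts@plt
text = bytes.fromhex("684c920408" "e8e6ffffff")

# Overwrite the 5-byte call, leaving the push untouched.
patched = nop_out(text, 5, 5)
print(patched.hex(" "))
```

Whole instructions should be replaced, not parts of them; otherwise the leftover bytes decode as something unintended.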



Keywords

- assembly
- mnemonics
- instructions
- architecture
- ISA
- registers
- addressing modes
- CISC and RISC
- memory-to-memory
- load-store
- assembling
- linking
- control flow
- arithmetic/logical
- data transfer
- function calls
- system calls
- disassembling
- objdump
- NOP


Useful Links

- https://ocw.cs.pub.ro/courses/iocla
- http://en.wikibooks.org/wiki/X86_Assembly
- http://www.nasm.us/xdoc/2.11.05/html/nasmdoc0.html
- http://timelessname.com/elfbin/
- http://www.cs.umd.edu/~jkatz/security/s12/lecture22.ppt
- http://gcc.godbolt.org/
