x86-64assembly · x86history seven8-bitregisters 1971:intel8008 eight16-bitregisters:...
TRANSCRIPT
![Page 1: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/1.jpg)
x86-64 assembly
1
![Page 2: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/2.jpg)
x86 history
seven 8-bit registers1971: Intel 8008
eight 16-bit registers:1978: Intel 80861982: Intel 80286
eight 32-bit registers:1985: Intel 803861989: Intel 804861993: Intel Pentium1997: Intel Pentium II1998: Intel Pentium III2000: Intel Pentium IV/Xeon
sixteen 64-bit registers:2003: AMD64 Opteron2004: Intel Pentium IV/Xeon(and most more recentAMD/Intel/Via chips)
2
![Page 3: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/3.jpg)
two syntaxes
there are two ways of writing x86 assemblyAT&T syntax (default on Linux, OS X)Intel syntax (default on Windows)
different operand order, way of writing addresses, punctuation, etc.
we mostly show Intel syntax
3
![Page 4: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/4.jpg)
different directives
non-instruction parts of assembly are called directives
IBCM example: one dw 1there is no IBCM instruction called “dw”
these differ a lot between assemblers
our main assember: NASM
our compiler’s assembler: GAS
4
![Page 5: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/5.jpg)
x86 registers
ALAHAX
EAXRAX
BLBHBX
EBXRBX
CLCHCX
ECXRCX
DLDHDX
EDXRDX
BPLBPEBPRBP
SILSIESIRSI
DILDIEDIRDI
SPLSPESPRSP
R8BR8WR8DR8
R9BR9WR9DR9
R10BR10WR10DR10
R11BR11WR11DR11
R12BR12WR12DR12
R13BR13WR13DR13
R14BR14WR14DR14
R15BR15WR15DR15
1978 – Intel 8086 — 8 16-bit registers
← AX, etc. — “general purpose”(but some instructions use AX or BX only)
← “base pointer”← “source index”← “destination index”← “stack pointer” — push/pop instrs.
special forsome instrs.
1988 – Intel 386 — 8 32-bit registers“Extended” versions of each register2003 – AMD64 — 16 64-bit registersnew registers just numberedname for bottom byte of each register
5
![Page 6: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/6.jpg)
x86 registers
ALAHAX
EAXRAX
BLBHBX
EBXRBX
CLCHCX
ECXRCX
DLDHDX
EDXRDX
BPLBPEBPRBP
SILSIESIRSI
DILDIEDIRDI
SPLSPESPRSP
R8BR8WR8DR8
R9BR9WR9DR9
R10BR10WR10DR10
R11BR11WR11DR11
R12BR12WR12DR12
R13BR13WR13DR13
R14BR14WR14DR14
R15BR15WR15DR15
1978 – Intel 8086 — 8 16-bit registers
← AX, etc. — “general purpose”(but some instructions use AX or BX only)
← “base pointer”← “source index”← “destination index”← “stack pointer” — push/pop instrs.
special forsome instrs.
1988 – Intel 386 — 8 32-bit registers“Extended” versions of each register
2003 – AMD64 — 16 64-bit registersnew registers just numberedname for bottom byte of each register
5
![Page 7: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/7.jpg)
x86 registers
ALAHAX
EAXRAX
BLBHBX
EBXRBX
CLCHCX
ECXRCX
DLDHDX
EDXRDX
BPLBPEBPRBP
SILSIESIRSI
DILDIEDIRDI
SPLSPESPRSP
R8BR8WR8DR8
R9BR9WR9DR9
R10BR10WR10DR10
R11BR11WR11DR11
R12BR12WR12DR12
R13BR13WR13DR13
R14BR14WR14DR14
R15BR15WR15DR15
1978 – Intel 8086 — 8 16-bit registers
← AX, etc. — “general purpose”(but some instructions use AX or BX only)
← “base pointer”← “source index”← “destination index”← “stack pointer” — push/pop instrs.
special forsome instrs.
1988 – Intel 386 — 8 32-bit registers“Extended” versions of each register
2003 – AMD64 — 16 64-bit registersnew registers just numberedname for bottom byte of each register
5
![Page 8: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/8.jpg)
some registers not shown
floating point/“vector” registers (ST(0), XMM0, YMM0, ZMM0, …)
the program counter (RIP/EIP/IP — “instruction pointer”)
“flags” (used by conditional jumps)
registers for the operating system
…
6
![Page 9: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/9.jpg)
x86 fetch/execute cycle
while (true) {IR <- memory[PC]execute(IR)if (instruction didn't change PC)
PC <- PC + length-of-instruction(IR)}
same as IBCM
(execpt instructions are variable-length)
7
![Page 10: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/10.jpg)
declaring variables/constants
(NASM-only syntax)section .dataa DB 23b DW ?c DD 3000d DQ −800x DD 1, 2, 3y TIMES 8 DB 0
“.data” — data (not code) part of memoryDB: declare byteDW: word (2 byte)DD: doubleword (4 bytes)DQ: quadword (8 byte)? — don’t care about valueeight 0 bytes (e.g. 8-byte array)
8
![Page 11: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/11.jpg)
a note on labels
NASM allows labels like:LABEL add RAX, RBX
or like:LABEL: add RAX, RBX
other assemblers: require : always
I recommend :what if label name = instruction name?
9
![Page 12: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/12.jpg)
declaring variables/constants (GAS)
(GAS-only syntax).dataa: .byte 23b: .short 0c: .long 3000d: .quad −800x: .long 1, 2, 3y .fill 8, 1, 0
“.data” — data (not code) part of memory
short — 2 byteslong — 4 bytesquad — 8 bytes
eight 0 bytes (e.g. 8-byte array)(1 is length of value to repeat)
10
![Page 13: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/13.jpg)
mov
mov DEST, SRC
possible DEST and SRC:register: RAX, EAX, …constant: 0x1234, 42, …label name: someLabel, …memory address: [0x1234], [RAX], [someLabel]…
special rule: no moving from memory to memory
11
![Page 14: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/14.jpg)
instruction operands generally
if we don’t specify otherwise…
same as mov:destination: register or memory locationsource: register or constant or memory location
and same special rule: both can’t be memory location
12
![Page 15: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/15.jpg)
mov example
mov rcx, raxmov rdx, [rbx]mov rsi, [rdx+24]mov [rsi], 45mov [a], 15
rax 100rbx 108rcxrdxrsirdi…
registers…
100108 100116124 200132
…200208
a: 300308
…
memory
13
![Page 16: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/16.jpg)
mov example
mov rcx, raxmov rdx, [rbx]mov rsi, [rdx+24]mov [rsi], 45mov [a], 15
rax 100rbx 108rcx 100rdxrsirdi…
registers…
100108 100116124 200132
…200208
a: 300308
…
memory
13
![Page 17: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/17.jpg)
mov example
mov rcx, raxmov rdx, [rbx]mov rsi, [rdx+24]mov [rsi], 45mov [a], 15
rax 100rbx 108rcx 100rdx 100rsirdi…
registers…
100108 100116124 200132
…200208
a: 300308
…
memory
13
![Page 18: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/18.jpg)
mov example
mov rcx, raxmov rdx, [rbx]mov rsi, [rdx+24]mov [rsi], 45mov [a], 15
rax 100rbx 108rcx 100rdx 100rsi 200rdi…
registers…
100108 100116124 200132
…200208
a: 300308
…
memory
13
![Page 19: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/19.jpg)
mov example
mov rcx, raxmov rdx, [rbx]mov rsi, [rdx+24]mov [rsi], 45mov [a], 15
rax 100rbx 108rcx 100rdx 100rsi 200rdi…
registers…
100108 100116124 200132
…200 45208
a: 300308
…
memory
13
![Page 20: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/20.jpg)
mov example
mov rcx, raxmov rdx, [rbx]mov rsi, [rdx+24]mov [rsi], 45mov [a], 15
rax 100rbx 108rcx 100rdx 100rsi 200rdi…
registers…
100108 100116124 200132
…200 45208
a: 300 15308
…
memory
13
![Page 21: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/21.jpg)
later: what types of addresses?
[rdx] allowed
[someLabel] allowed
[rdx+24] allowed
what else?
not everything — has to be encoded in machine code
explain rules: later
14
![Page 22: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/22.jpg)
push/pop
RSP — “top” of stack which grows downpush RBX
RSP ← RSP − 8memory[RSP] ← RBX
pop RBXRBX ← memory[RSP]RSP ← RSP + 8
also okay:push [RAX], etc.push 42, etc.
...
memory[RSP + 16]
memory[RSP + 8]
memory[RSP]
memory[RSP - 8]
memory[RSP - 16]
value to popwhere to push
stackgrowth
15
![Page 23: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/23.jpg)
push/pop replacement
instead of:push RAX
could write:sub RSP, 8mov [RSP], RAX
push/pop instructions are for convenience
16
![Page 24: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/24.jpg)
add/sub
add first, secondadd RAX, RBXadd [RDX], RBX...sub first, secondsub RSP, 16...
first ← first + second (add), or first ← first − second (sub)support same operands as mov:
can use registers, constants, locations in memorycan’t use two memory locations (mov to a register instead)destination can’t be constant
17
![Page 25: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/25.jpg)
jmp
jmp foo
foo: ...
jmp — go to instruction at label
18
![Page 26: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/26.jpg)
conditon testing
cmp <first>, <second>
compare first and second(compute first - second, compare to 0)
set flags AKA machine status word based on result
je label
if (compare result was equal) go to label
19
![Page 27: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/27.jpg)
conditional jmp example
if (RAX > 4)stuff();
cmp RAX, 4jle skip_callcall stuff
skip_call: ...
20
![Page 28: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/28.jpg)
jump conditions and cmp
cmp A, BjXX label
R = A−Bje equal R = 0 or A = Bjz zero R = 0 or A = Bjne not equal R 6= 0 or A 6= Bjl less than A < B (signed)jle less than or equal A ≤ B (signed)jg greater than A > B (signed)jb less than (unsigned) A < B (unsigned)ja greater than (unsigned) A > B (unsigned)js sign bit set R < 0jns sign bit unset R ≥ 0… … …
21
![Page 29: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/29.jpg)
C to assembly example
int n = 5;int i = 1;int sum = 0;...
while (i <= n) {sum += i;i++;
}
section .datan DQ 5i DQ 1sum DQ 0section .text...loop: mov RCX, [i]
cmp RCX, [n]jg endOfLoopadd [sum], RCXadd QWORD PTR [i], 1jmp loop
endOfLoop:
cmp [i], [n] is not allowedonly one memory operand per (most) instructions
QWORD PTR[i] 8 bytes at location iotherwise, no way to know how big otherwise(more on this later)
22
![Page 30: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/30.jpg)
C to assembly example
int n = 5;int i = 1;int sum = 0;...
while (i <= n) {sum += i;i++;
}
section .datan DQ 5i DQ 1sum DQ 0section .text...loop: mov RCX, [i]
cmp RCX, [n]jg endOfLoopadd [sum], RCXadd QWORD PTR [i], 1jmp loop
endOfLoop:
cmp [i], [n] is not allowedonly one memory operand per (most) instructions
QWORD PTR[i] 8 bytes at location iotherwise, no way to know how big otherwise(more on this later)
22
![Page 31: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/31.jpg)
C to assembly example
int n = 5;int i = 1;int sum = 0;...
while (i <= n) {sum += i;i++;
}
section .datan DQ 5i DQ 1sum DQ 0section .text...loop: mov RCX, [i]
cmp RCX, [n]jg endOfLoopadd [sum], RCXadd QWORD PTR [i], 1jmp loop
endOfLoop:
cmp [i], [n] is not allowedonly one memory operand per (most) instructions
QWORD PTR[i] 8 bytes at location iotherwise, no way to know how big otherwise(more on this later)
22
![Page 32: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/32.jpg)
C to assembly example
int n = 5;int i = 1;int sum = 0;...
while (i <= n) {sum += i;i++;
}
section .datan DQ 5i DQ 1sum DQ 0section .text...loop: mov RCX, [i]
cmp RCX, [n]jg endOfLoopadd [sum], RCXadd QWORD PTR [i], 1jmp loop
endOfLoop:
cmp [i], [n] is not allowedonly one memory operand per (most) instructions
QWORD PTR[i] 8 bytes at location iotherwise, no way to know how big otherwise(more on this later)
22
![Page 33: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/33.jpg)
C to assembly example
int n = 5;int i = 1;int sum = 0;...
while (i <= n) {sum += i;i++;
}
section .datan DQ 5i DQ 1sum DQ 0section .text...loop: mov RCX, [i]
cmp RCX, [n]jg endOfLoopadd [sum], RCXadd QWORD PTR [i], 1jmp loop
endOfLoop:
cmp [i], [n] is not allowedonly one memory operand per (most) instructions
QWORD PTR[i] 8 bytes at location iotherwise, no way to know how big otherwise(more on this later)
22
![Page 34: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/34.jpg)
C to assembly example
int n = 5;int i = 1;int sum = 0;...
while (i <= n) {sum += i;i++;
}
section .datan DQ 5i DQ 1sum DQ 0section .text...loop: mov RCX, [i]
cmp RCX, [n]jg endOfLoopadd [sum], RCXadd QWORD PTR [i], 1jmp loop
endOfLoop:
cmp [i], [n] is not allowedonly one memory operand per (most) instructions
QWORD PTR[i] 8 bytes at location iotherwise, no way to know how big otherwise(more on this later)
22
![Page 35: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/35.jpg)
call
call LABEL...
is about the same as:push after_this_calljmp LABEL
after_this_call:...
pushed address called the “return address”
23
![Page 36: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/36.jpg)
call/ret
call LABELpush next instruction address (“return address”) to stackjump to LABEL
ret — opposite of callpop address from the stackjump to that address
24
![Page 37: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/37.jpg)
return addresses using a stackmax: ...
...ret
main: ......call max
after: ...ret
return address for main (OS)temporary storage for mainother things related to call???return address for max (after:)temporary storage for max
stack when main starts:← RSP
stack in the middle of main:
← RSP
stack just before call max:
← RSP
stack just after call max:
← RSP
stack in the middle of max:
← RSP
stack just before max’s ret:
← RSP
stack just after max’s ret:
← RSP
stack just before main’s ret:← RSP
smaller addresses
25
![Page 38: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/38.jpg)
return addresses using a stackmax: ...
...ret
main: ......call max
after: ...ret
return address for main (OS)temporary storage for mainother things related to call???return address for max (after:)temporary storage for max
stack when main starts:← RSP
stack in the middle of main:
← RSP
stack just before call max:
← RSP
stack just after call max:
← RSP
stack in the middle of max:
← RSP
stack just before max’s ret:
← RSP
stack just after max’s ret:
← RSP
stack just before main’s ret:← RSP
smaller addresses
25
![Page 39: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/39.jpg)
return addresses using a stackmax: ...
...ret
main: ......call max
after: ...ret
return address for main (OS)temporary storage for mainother things related to call???return address for max (after:)temporary storage for max
stack when main starts:← RSP
stack in the middle of main:
← RSP
stack just before call max:
← RSP
stack just after call max:
← RSP
stack in the middle of max:
← RSP
stack just before max’s ret:
← RSP
stack just after max’s ret:
← RSP
stack just before main’s ret:← RSP
smaller addresses
25
![Page 40: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/40.jpg)
return addresses using a stackmax: ...
...ret
main: ......call max
after: ...ret
return address for main (OS)temporary storage for mainother things related to call???return address for max (after:)temporary storage for max
stack when main starts:← RSP
stack in the middle of main:
← RSP
stack just before call max:
← RSP
stack just after call max:
← RSP
stack in the middle of max:
← RSP
stack just before max’s ret:
← RSP
stack just after max’s ret:
← RSP
stack just before main’s ret:← RSP
smaller addresses
25
![Page 41: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/41.jpg)
return addresses using a stackmax: ...
...ret
main: ......call max
after: ...ret
return address for main (OS)temporary storage for mainother things related to call???return address for max (after:)temporary storage for max
stack when main starts:← RSP
stack in the middle of main:
← RSP
stack just before call max:
← RSP
stack just after call max:
← RSP
stack in the middle of max:
← RSP
stack just before max’s ret:
← RSP
stack just after max’s ret:
← RSP
stack just before main’s ret:← RSP
smaller addresses
25
![Page 42: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/42.jpg)
return addresses using a stackmax: ...
...ret
main: ......call max
after: ...ret
return address for main (OS)temporary storage for mainother things related to call???return address for max (after:)temporary storage for max
stack when main starts:← RSP
stack in the middle of main:
← RSP
stack just before call max:
← RSP
stack just after call max:
← RSP
stack in the middle of max:
← RSP
stack just before max’s ret:
← RSP
stack just after max’s ret:
← RSP
stack just before main’s ret:← RSP
smaller addresses
25
![Page 43: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/43.jpg)
return addresses using a stackmax: ...
...ret
main: ......call max
after: ...ret
return address for main (OS)temporary storage for mainother things related to call???return address for max (after:)temporary storage for max
stack when main starts:← RSP
stack in the middle of main:
← RSP
stack just before call max:
← RSP
stack just after call max:
← RSP
stack in the middle of max:
← RSP
stack just before max’s ret:
← RSP
stack just after max’s ret:
← RSP
stack just before main’s ret:← RSP
smaller addresses
25
![Page 44: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/44.jpg)
return addresses using a stackmax: ...
...ret
main: ......call max
after: ...ret
return address for main (OS)temporary storage for mainother things related to call???return address for max (after:)temporary storage for max
stack when main starts:← RSP
stack in the middle of main:
← RSP
stack just before call max:
← RSP
stack just after call max:
← RSP
stack in the middle of max:
← RSP
stack just before max’s ret:
← RSP
stack just after max’s ret:
← RSP
stack just before main’s ret:← RSP
smaller addresses
25
![Page 45: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/45.jpg)
return addresses using a stackmax: ...
...ret
main: ......call max
after: ...ret
return address for main (OS)temporary storage for mainother things related to call???return address for max (after:)temporary storage for max
stack when main starts:← RSP
stack in the middle of main:
← RSP
stack just before call max:
← RSP
stack just after call max:
← RSP
stack in the middle of max:
← RSP
stack just before max’s ret:
← RSP
stack just after max’s ret:
← RSP
stack just before main’s ret:← RSP
smaller addresses
25
![Page 46: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/46.jpg)
function calls use the stack
“the” stackconvention: RSP points to topgrows ‘down’ (towards address 0)used by pop, push, call, ret
used to implement function calls
main reason: support recursive calls
where do (place to return/arguments/local variables/etc.) go?when in doubt — use the stackoptimization: sometimes use registers
26
![Page 47: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/47.jpg)
calling convention preview
call FUNC and RET instructions
…but where do arguments, local variables, etc. go?
what registers can a function call change?
compiler/OS choice! — much more detail later
27
![Page 48: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/48.jpg)
Linux calling convention preview
return value: RAX
argument 1: RDI; argument 2: RSIargument 3: RDX; argument 4: RCX; argument 5: R8; argument 6: R9
local variables: stack or “free” registers
value of RBP, RBX, R12, R13, R14, R15 can’t be changed byfunction call
can use them, but must save/restore
28
![Page 49: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/49.jpg)
simple recursion (C++)
long sum(long count) {if (count > 0) {
long partial_sum = sum(count − 1);return partial_sum + count;
} else {return 0;
}}
29
![Page 50: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/50.jpg)
simple recursion (assembly)
# RDI (arg 1) is countsum:
cmp RDI, 0jle base_case // if count <= 0 --> do base casepush RDI // save a copy of original RDIsub RDI, 1call sum // sum(count-1)pop RDI // restore copy of original RDIadd RAX, RDI // ret val = sum(count-1) + countret
base_case:mov RAX, 0ret
the stackreturn address for sum(100)saved RDI: 100return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(1)saved RDI: 1return address for sum(0)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)saved RDI: 1
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)
return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(100)
30
![Page 51: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/51.jpg)
simple recursion (assembly)
# RDI (arg 1) is countsum:
cmp RDI, 0jle base_case // if count <= 0 --> do base casepush RDI // save a copy of original RDIsub RDI, 1call sum // sum(count-1)pop RDI // restore copy of original RDIadd RAX, RDI // ret val = sum(count-1) + countret
base_case:mov RAX, 0ret
the stackreturn address for sum(100)saved RDI: 100
return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(1)saved RDI: 1return address for sum(0)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)saved RDI: 1
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)
return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(100)
30
![Page 52: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/52.jpg)
simple recursion (assembly)
# RDI (arg 1) is countsum:
cmp RDI, 0jle base_case // if count <= 0 --> do base casepush RDI // save a copy of original RDIsub RDI, 1call sum // sum(count-1)pop RDI // restore copy of original RDIadd RAX, RDI // ret val = sum(count-1) + countret
base_case:mov RAX, 0ret
the stack
return address for sum(100)saved RDI: 100
return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(1)saved RDI: 1return address for sum(0)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)saved RDI: 1
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)
return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(100)
30
![Page 53: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/53.jpg)
simple recursion (assembly)
# RDI (arg 1) is countsum:
cmp RDI, 0jle base_case // if count <= 0 --> do base casepush RDI // save a copy of original RDIsub RDI, 1call sum // sum(count-1)pop RDI // restore copy of original RDIadd RAX, RDI // ret val = sum(count-1) + countret
base_case:mov RAX, 0ret
the stack
return address for sum(100)saved RDI: 100return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(1)saved RDI: 1return address for sum(0)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)saved RDI: 1
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)
return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(100)
30
![Page 54: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/54.jpg)
simple recursion (assembly)
# RDI (arg 1) is countsum:
cmp RDI, 0jle base_case // if count <= 0 --> do base casepush RDI // save a copy of original RDIsub RDI, 1call sum // sum(count-1)pop RDI // restore copy of original RDIadd RAX, RDI // ret val = sum(count-1) + countret
base_case:mov RAX, 0ret
the stack
return address for sum(100)saved RDI: 100return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(1)saved RDI: 1return address for sum(0)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)saved RDI: 1
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)
return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(100)
30
![Page 55: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/55.jpg)
simple recursion (assembly)
# RDI (arg 1) is countsum:
cmp RDI, 0jle base_case // if count <= 0 --> do base casepush RDI // save a copy of original RDIsub RDI, 1call sum // sum(count-1)pop RDI // restore copy of original RDIadd RAX, RDI // ret val = sum(count-1) + countret
base_case:mov RAX, 0ret
the stack
return address for sum(100)saved RDI: 100return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(1)saved RDI: 1return address for sum(0)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)saved RDI: 1
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)
return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(100)
30
![Page 56: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/56.jpg)
simple recursion (assembly)
# RDI (arg 1) is countsum:
cmp RDI, 0jle base_case // if count <= 0 --> do base casepush RDI // save a copy of original RDIsub RDI, 1call sum // sum(count-1)pop RDI // restore copy of original RDIadd RAX, RDI // ret val = sum(count-1) + countret
base_case:mov RAX, 0ret
the stack
return address for sum(100)saved RDI: 100return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(1)saved RDI: 1return address for sum(0)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)saved RDI: 1
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)
return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(100)
30
![Page 57: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/57.jpg)
simple recursion (assembly)
# RDI (arg 1) is countsum:
cmp RDI, 0jle base_case // if count <= 0 --> do base casepush RDI // save a copy of original RDIsub RDI, 1call sum // sum(count-1)pop RDI // restore copy of original RDIadd RAX, RDI // ret val = sum(count-1) + countret
base_case:mov RAX, 0ret
the stack
return address for sum(100)saved RDI: 100return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(1)saved RDI: 1return address for sum(0)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)saved RDI: 1
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)
return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(100)
30
![Page 58: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/58.jpg)
simple recursion (assembly)
# RDI (arg 1) is countsum:
cmp RDI, 0jle base_case // if count <= 0 --> do base casepush RDI // save a copy of original RDIsub RDI, 1call sum // sum(count-1)pop RDI // restore copy of original RDIadd RAX, RDI // ret val = sum(count-1) + countret
base_case:mov RAX, 0ret
the stack
return address for sum(100)saved RDI: 100return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(1)saved RDI: 1return address for sum(0)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)saved RDI: 1
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)
return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(100)
30
![Page 59: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/59.jpg)
simple recursion (assembly)
# RDI (arg 1) is countsum:
cmp RDI, 0jle base_case // if count <= 0 --> do base casepush RDI // save a copy of original RDIsub RDI, 1call sum // sum(count-1)pop RDI // restore copy of original RDIadd RAX, RDI // ret val = sum(count-1) + countret
base_case:mov RAX, 0ret
the stack
return address for sum(100)saved RDI: 100return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(1)saved RDI: 1return address for sum(0)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)saved RDI: 1
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)
return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100
return address for sum(100)
30
![Page 60: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/60.jpg)
simple recursion (assembly)
# RDI (arg 1) is countsum:
cmp RDI, 0jle base_case // if count <= 0 --> do base casepush RDI // save a copy of original RDIsub RDI, 1call sum // sum(count-1)pop RDI // restore copy of original RDIadd RAX, RDI // ret val = sum(count-1) + countret
base_case:mov RAX, 0ret
the stack
return address for sum(100)saved RDI: 100return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(1)saved RDI: 1return address for sum(0)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)saved RDI: 1
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2return address for sum(1)
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)saved RDI: 2
return address for sum(100)saved RDI: 100return address for sum(99)saved RDI: 99return address for sum(98)saved RDI: 98…return address for sum(2)
return address for sum(100)saved RDI: 100return address for sum(99)
return address for sum(100)saved RDI: 100
return address for sum(100)
30
![Page 61: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/61.jpg)
specifying pointers
[RAX + 2 * RBX + 0x1234]
optional 64-bit base register plusexample: RAX
optional 64-bit index register times 1 (default), 2, 4, or 8 plusexample: RBX times 2
optional 32-bit signed constantlabels count as constants
31
![Page 62: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/62.jpg)
example valid movs
mov rax, rbx // RAX ← RBXmov rax, [rbx] // RAX ← memory[RBX]mov [someLabel], rbx // memory[someLabel] ← RBXmov rax, [r13 − 4] // RAX ← memory[R13 + (-4)]mov [rsi + rax], cl // memory[RSI + RAX] ← CLmov rdx, [rsi + 4*rbx] // RDX ← memory[RSI + 4 * RBX]
32
![Page 63: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/63.jpg)
INVALID movs
mov rax, [r11 − rcx]can’t subtract register
mov [rax + r5 + rdi], rbx
mov [4*rax + 2*rbx], rcxonly multiply one register
33
![Page 64: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/64.jpg)
memory access lengths
move one byte:mov bl, [rax]mov [rax], blmov BYTE PTR [rax], blmov BYTE PTR [rbx], 42
move four bytes:mov ebx, [rax]mov [rax], ebxmov DWORD PTR [rax], ebxmov DWORD PTR [rbx], 10
(BYTE, WORD (2 bytes), DWORD (4 bytes), QWORD (8 bytes))34
![Page 65: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/65.jpg)
inc/dec
dec RAXinc QWORD PTR [RBX + RCX]
increment or decrement
register or memory operand
(same effect as add/sub 1)
35
![Page 66: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/66.jpg)
multiply
imul <first>, <second>imul RAX, RBXimul RAX, [RCX + RDX]
first ← first × secondfirst operand must be registerimul <first>, <second>, <third>imul RAX, RBX, 42imul RAX, [RCX + RDX], 42
first ← second × thirdfirst: must be register; third: must be constant
36
![Page 67: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/67.jpg)
multiply (with big result)
imul <first>imul RBXimul QWORD PTR [RCX + RDX]
{RDX, RAX} ← RAX × firstRDX gets most significant 64 bitsRAX gets least signiciant 64 bits
imul EBXimul DWORD PTR [RCX + RDX]
{EDX, EAX} ← EAX × firstEDX gets most significant 32 bitsEAX gets least signiciant 32 bits
37
![Page 68: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/68.jpg)
multiply — signed/unsigned
with result size = source size:signed and unsigned multiply is the same
with bigger results:imul — signed multiplymul — unsigned multiply
38
![Page 69: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/69.jpg)
divide
idiv <first>idiv RBXidiv QWORD PTR [RCX + RDX]
RAX ← {RDX,RAX} ÷ first
RDX ← {RDX,RAX} mod first
128-bit divided by 64-bitor 64-bit by 32-bit with 32-bit first operand, etc.
also div <first> — same, but unsigned division
39
![Page 70: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/70.jpg)
on LEA
LEA = Load Effective Addresseffective address = computed address for memory access
syntax looks like a mov from memory, but…
skips the memory access — just uses the address(sort of like & operator in C?)
lea RAX, [RAX + 4] ≈ add RAX, 4
“address of memory[rax + 4]” = rax + 4
40
![Page 71: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/71.jpg)
on LEA
LEA = Load Effective Addresseffective address = computed address for memory access
syntax looks like a mov from memory, but…
skips the memory access — just uses the address(sort of like & operator in C?)
lea RAX, [RAX + 4] ≈ add RAX, 4
“address of memory[rax + 4]” = rax + 4
40
![Page 72: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/72.jpg)
LEA tricks
lea RAX, [RAX + RAX * 4]
rax ← rax × 5
rax ← address-of(memory[rax + rax * 4])
lea RDX, [RBX + RCX]
rdx ← rbx + rcx
rdx ←address-of(memory[rbx + rcx])
41
![Page 73: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/73.jpg)
call example
int max(int x, int y) {int theMax;if (x > y)
theMax = x;else
theMax = y;return theMax;
}
int main() {int maxVal, a = 5, b = 6;maxVal = max(a, b);cout << "max␣value:␣" << maxVal << endl;return 0;
}
where do arguments go?
where do local variables go?
where does the return value go?
how does return know where to go?
42
![Page 74: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/74.jpg)
call example
int max(int x, int y) {int theMax;if (x > y)
theMax = x;else
theMax = y;return theMax;
}
int main() {int maxVal, a = 5, b = 6;maxVal = max(a, b);cout << "max␣value:␣" << maxVal << endl;return 0;
}
where do arguments go?
where do local variables go?
where does the return value go?
how does return know where to go?
42
![Page 75: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/75.jpg)
calling conventions
calling convention: rules about how function calls work
choice of compiler and OS NOT the processor itself
…but processor might make instructions to helpx86-64: call, ret, push, pop
43
![Page 76: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/76.jpg)
basic calling convention questions (1)
how does return know where to go?
x86-64: on the stack (otherwise can’t use call/ret)
where do arguments go?
Linux+x86-64: arguments 1-6: RDI, RSI, RDX, RCX, R8, R9Linux+x86-64: arguments 7-: push on the stack (last argument first)last argument first: so arguments are pop’d in order(exceptions: objects that don’t fit in a register, floating point, …)
44
![Page 77: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/77.jpg)
basic calling convention questions (1)
how does return know where to go?x86-64: on the stack (otherwise can’t use call/ret)
where do arguments go?
Linux+x86-64: arguments 1-6: RDI, RSI, RDX, RCX, R8, R9Linux+x86-64: arguments 7-: push on the stack (last argument first)last argument first: so arguments are pop’d in order(exceptions: objects that don’t fit in a register, floating point, …)
44
![Page 78: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/78.jpg)
basic calling convention questions (1)
how does return know where to go?x86-64: on the stack (otherwise can’t use call/ret)
where do arguments go?Linux+x86-64: arguments 1-6: RDI, RSI, RDX, RCX, R8, R9Linux+x86-64: arguments 7-: push on the stack (last argument first)last argument first: so arguments are pop’d in order(exceptions: objects that don’t fit in a register, floating point, …)
44
![Page 79: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/79.jpg)
basic calling convention questions (2)
where do local variables go?
Linux+x86-64: in registers (if room) or on the stackcaveat: what registers can function calls change?
where does the return value go?
Linux+x86-64: RAX
45
![Page 80: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/80.jpg)
basic calling convention questions (2)
where do local variables go?Linux+x86-64: in registers (if room) or on the stackcaveat: what registers can function calls change?
where does the return value go?
Linux+x86-64: RAX
45
![Page 81: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/81.jpg)
basic calling convention questions (2)
where do local variables go?Linux+x86-64: in registers (if room) or on the stackcaveat: what registers can function calls change?
where does the return value go?Linux+x86-64: RAX
45
![Page 82: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/82.jpg)
basic calling convention questions (2)
where do local variables go?Linux+x86-64: in registers (if room) or on the stackcaveat: what registers can function calls change?
where does the return value go?Linux+x86-64: RAX
45
![Page 83: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/83.jpg)
saved registers
what registers can function calls change?Linux+x86-64: RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, floatingpoint registersif using for local variables — be careful about function calls
other registers: must have same value when function returnsif using for local variables — save old value and restore before returning
46
![Page 84: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/84.jpg)
caller versus callee
void foo() {...
}
int main() {foo();return 0;
}
main is caller
foo is callee
47
![Page 85: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/85.jpg)
a function call
...globalVar =
foo(1, 2, 3, 4,5, 6, 7, 8);
...
// assuming R11// used for// local var// in callerpush R11
save important registersfoo might change
mov RDI, 1mov RSI, 2mov RDX, 3mov RCX, 4mov R8, 5mov R9, 6push 8push 7
place arguments in registersand (if necessary) on stack
call foo
← and actually call function
add RSP, 16
← and pop args from stack (if any)
pop R11
…and restore saved regs
mov [globalVar], RAX
…and use return value
48
![Page 86: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/86.jpg)
a function call
...globalVar =
foo(1, 2, 3, 4,5, 6, 7, 8);
...
// assuming R11// used for// local var// in callerpush R11
save important registersfoo might change
mov RDI, 1mov RSI, 2mov RDX, 3mov RCX, 4mov R8, 5mov R9, 6push 8push 7
place arguments in registersand (if necessary) on stack
call foo
← and actually call function
add RSP, 16
← and pop args from stack (if any)
pop R11 …and restore saved regsmov [globalVar], RAX
…and use return value
48
![Page 87: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/87.jpg)
a function call
...globalVar =
foo(1, 2, 3, 4,5, 6, 7, 8);
...
// assuming R11// used for// local var// in callerpush R11
save important registersfoo might change
mov RDI, 1mov RSI, 2mov RDX, 3mov RCX, 4mov R8, 5mov R9, 6push 8push 7
place arguments in registersand (if necessary) on stack
call foo
← and actually call function
add RSP, 16
← and pop args from stack (if any)
pop R11 …and restore saved regsmov [globalVar], RAX
…and use return value
48
![Page 88: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/88.jpg)
a function call
...globalVar =
foo(1, 2, 3, 4,5, 6, 7, 8);
...
// assuming R11// used for// local var// in callerpush R11
save important registersfoo might change
mov RDI, 1mov RSI, 2mov RDX, 3mov RCX, 4mov R8, 5mov R9, 6push 8push 7
place arguments in registersand (if necessary) on stack
call foo ← and actually call functionadd RSP, 16
← and pop args from stack (if any)
pop R11 …and restore saved regsmov [globalVar], RAX
…and use return value
48
![Page 89: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/89.jpg)
a function call
...globalVar =
foo(1, 2, 3, 4,5, 6, 7, 8);
...
// assuming R11// used for// local var// in callerpush R11
save important registersfoo might change
mov RDI, 1mov RSI, 2mov RDX, 3mov RCX, 4mov R8, 5mov R9, 6push 8push 7
place arguments in registersand (if necessary) on stack
call foo ← and actually call functionadd RSP, 16 ← and pop args from stack (if any)pop R11 …and restore saved regsmov [globalVar], RAX
…and use return value
48
![Page 90: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/90.jpg)
a function call
...globalVar =
foo(1, 2, 3, 4,5, 6, 7, 8);
...
// assuming R11// used for// local var// in callerpush R11
save important registersfoo might change
mov RDI, 1mov RSI, 2mov RDX, 3mov RCX, 4mov R8, 5mov R9, 6push 8push 7
place arguments in registersand (if necessary) on stack
call foo ← and actually call functionadd RSP, 16 ← and pop args from stack (if any)pop R11 …and restore saved regsmov [globalVar], RAX
…and use return value48
![Page 91: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/91.jpg)
caller task summarized
save registers that the function might change (consult list)
place parameters in registers, stack
call
remove any parameters from stack
restore registers that the function might change
use return value in RAX
49
![Page 92: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/92.jpg)
callee code example (naive version)long myFunc(long a, long b, long c) {
long result = 0;result += a;result += b;result += c;return result;
}
myFunc:// allocate space for resultsub RSP, 8mov QWORD PTR [RSP], 0 // result = 0add QWORD PTR [RSP], RDI // result += aadd QWORD PTR [RSP], RSI // result += badd QWORD PTR [RSP], RDX // result += cmov RAX, QWORD PTR [RSP] // ret val = result// deallocate spaceadd RSP, 8ret
address value…0xF0000000 (caller’s stuff)0xEFFFFFF8 return address for myFunc0xEFFFFFF0 value of result0xEFFFFFE8 (next stack allocation)…
one policy:local vars (result) lives on stackaccesses arguments directly
50
![Page 93: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/93.jpg)
callee code example (naive version)long myFunc(long a, long b, long c) {
long result = 0;result += a;result += b;result += c;return result;
}
myFunc:// allocate space for resultsub RSP, 8mov QWORD PTR [RSP], 0 // result = 0add QWORD PTR [RSP], RDI // result += aadd QWORD PTR [RSP], RSI // result += badd QWORD PTR [RSP], RDX // result += cmov RAX, QWORD PTR [RSP] // ret val = result// deallocate spaceadd RSP, 8ret
address value…0xF0000000 (caller’s stuff)0xEFFFFFF8 return address for myFunc0xEFFFFFF0 value of result0xEFFFFFE8 (next stack allocation)…
one policy:local vars (result) lives on stackaccesses arguments directly
50
![Page 94: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/94.jpg)
callee code example (animated)myFunc:
// allocate space for resultsub RSP, 8mov QWORD PTR [RSP], 0 // result = 0add QWORD PTR [RSP], RDI // result += aadd QWORD PTR [RSP], RSI // result += badd QWORD PTR [RSP], RDX // result += cmov RAX, QWORD PTR [RSP] // ret val = result// deallocate spaceadd RSP, 8ret
RSP 0x7FFF8RDI 2RSI 3RDX 4RAX…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
RSP→ 0
2
2
3
5
4
99
9
0x7FFF8
RSP→
0x80000
RSP→
51
![Page 95: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/95.jpg)
callee code example (animated)myFunc:
// allocate space for resultsub RSP, 8mov QWORD PTR [RSP], 0 // result = 0add QWORD PTR [RSP], RDI // result += aadd QWORD PTR [RSP], RSI // result += badd QWORD PTR [RSP], RDX // result += cmov RAX, QWORD PTR [RSP] // ret val = result// deallocate spaceadd RSP, 8ret
RSP 0x7FFF8RDI 2RSI 3RDX 4RAX…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
RSP→ 0
2
2
3
5
4
99
9
0x7FFF8
RSP→
0x80000
RSP→
51
![Page 96: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/96.jpg)
callee code example (animated)myFunc:
// allocate space for resultsub RSP, 8mov QWORD PTR [RSP], 0 // result = 0add QWORD PTR [RSP], RDI // result += aadd QWORD PTR [RSP], RSI // result += badd QWORD PTR [RSP], RDX // result += cmov RAX, QWORD PTR [RSP] // ret val = result// deallocate spaceadd RSP, 8ret
RSP 0x7FFF8RDI 2RSI 3RDX 4RAX…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
RSP→ 0
2
2
3
5
4
99
9
0x7FFF8
RSP→
0x80000
RSP→
51
![Page 97: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/97.jpg)
callee code example (animated)myFunc:
// allocate space for resultsub RSP, 8mov QWORD PTR [RSP], 0 // result = 0add QWORD PTR [RSP], RDI // result += aadd QWORD PTR [RSP], RSI // result += badd QWORD PTR [RSP], RDX // result += cmov RAX, QWORD PTR [RSP] // ret val = result// deallocate spaceadd RSP, 8ret
RSP 0x7FFF8RDI 2RSI 3RDX 4RAX…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
RSP→ 0
2
2
3
5
4
99
9
0x7FFF8
RSP→
0x80000
RSP→
51
![Page 98: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/98.jpg)
callee code example (animated)myFunc:
// allocate space for resultsub RSP, 8mov QWORD PTR [RSP], 0 // result = 0add QWORD PTR [RSP], RDI // result += aadd QWORD PTR [RSP], RSI // result += badd QWORD PTR [RSP], RDX // result += cmov RAX, QWORD PTR [RSP] // ret val = result// deallocate spaceadd RSP, 8ret
RSP 0x7FFF8RDI 2RSI 3RDX 4RAX…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
RSP→ 0
2
2
3
5
4
99
9
0x7FFF8
RSP→
0x80000
RSP→
51
![Page 99: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/99.jpg)
callee code example (animated)myFunc:
// allocate space for resultsub RSP, 8mov QWORD PTR [RSP], 0 // result = 0add QWORD PTR [RSP], RDI // result += aadd QWORD PTR [RSP], RSI // result += badd QWORD PTR [RSP], RDX // result += cmov RAX, QWORD PTR [RSP] // ret val = result// deallocate spaceadd RSP, 8ret
RSP 0x7FFF8RDI 2RSI 3RDX 4RAX…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
RSP→ 0
2
2
3
5
4
99
9
0x7FFF8
RSP→
0x80000
RSP→
51
![Page 100: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/100.jpg)
callee code example (animated)myFunc:
// allocate space for resultsub RSP, 8mov QWORD PTR [RSP], 0 // result = 0add QWORD PTR [RSP], RDI // result += aadd QWORD PTR [RSP], RSI // result += badd QWORD PTR [RSP], RDX // result += cmov RAX, QWORD PTR [RSP] // ret val = result// deallocate spaceadd RSP, 8ret
RSP 0x7FFF8RDI 2RSI 3RDX 4RAX…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
RSP→ 0
2
2
3
5
4
99
9
0x7FFF8
RSP→
0x80000
RSP→
51
![Page 101: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/101.jpg)
callee code example (animated)myFunc:
// allocate space for resultsub RSP, 8mov QWORD PTR [RSP], 0 // result = 0add QWORD PTR [RSP], RDI // result += aadd QWORD PTR [RSP], RSI // result += badd QWORD PTR [RSP], RDX // result += cmov RAX, QWORD PTR [RSP] // ret val = result// deallocate spaceadd RSP, 8ret
RSP 0x7FFF8RDI 2RSI 3RDX 4RAX…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
RSP→ 0
2
2
3
5
4
99
9
0x7FFF8
RSP→
0x80000
RSP→
51
![Page 102: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/102.jpg)
callee code example (animated)myFunc:
// allocate space for resultsub RSP, 8mov QWORD PTR [RSP], 0 // result = 0add QWORD PTR [RSP], RDI // result += aadd QWORD PTR [RSP], RSI // result += badd QWORD PTR [RSP], RDX // result += cmov RAX, QWORD PTR [RSP] // ret val = result// deallocate spaceadd RSP, 8ret
RSP 0x7FFF8RDI 2RSI 3RDX 4RAX…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
RSP→ 0
2
2
3
5
4
99
9
0x7FFF8
RSP→
0x80000
RSP→
51
![Page 103: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/103.jpg)
callee code example (allocate registers)long myFunc(long a, long b, long c) {
long result = 0;result += a; result += b; result += c;return result;
}myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
address value…0xFF000 (caller’s stuff)0xEFFF8 return address …0xEFFF0 saved RBX0xEFFE8 saved R12…
another policy:allocate new registers for local vars…and aren’t a, b, c local vars?
using registers for variables?if callee-saved, save and restore old
using registers for variables?if caller-saved, it’s okay to overwrite w/o saving
52
![Page 104: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/104.jpg)
callee code example (allocate registers)long myFunc(long a, long b, long c) {
long result = 0;result += a; result += b; result += c;return result;
}myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
address value…0xFF000 (caller’s stuff)0xEFFF8 return address …0xEFFF0 saved RBX0xEFFE8 saved R12…
another policy:allocate new registers for local vars…and aren’t a, b, c local vars?
using registers for variables?if callee-saved, save and restore old
using registers for variables?if caller-saved, it’s okay to overwrite w/o saving
52
![Page 105: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/105.jpg)
callee code example (allocate registers)long myFunc(long a, long b, long c) {
long result = 0;result += a; result += b; result += c;return result;
}myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
address value…0xFF000 (caller’s stuff)0xEFFF8 return address …0xEFFF0 saved RBX0xEFFE8 saved R12…another policy:
allocate new registers for local vars…and aren’t a, b, c local vars?
using registers for variables?if callee-saved, save and restore old
using registers for variables?if caller-saved, it’s okay to overwrite w/o saving
52
![Page 106: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/106.jpg)
callee code example (allocate registers)long myFunc(long a, long b, long c) {
long result = 0;result += a; result += b; result += c;return result;
}myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
address value…0xFF000 (caller’s stuff)0xEFFF8 return address …0xEFFF0 saved RBX0xEFFE8 saved R12…
another policy:allocate new registers for local vars…and aren’t a, b, c local vars?
using registers for variables?if callee-saved, save and restore old
using registers for variables?if caller-saved, it’s okay to overwrite w/o saving
52
![Page 107: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/107.jpg)
callee code example (allocate registers)long myFunc(long a, long b, long c) {
long result = 0;result += a; result += b; result += c;return result;
}myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
address value…0xFF000 (caller’s stuff)0xEFFF8 return address …0xEFFF0 saved RBX0xEFFE8 saved R12…
another policy:allocate new registers for local vars…and aren’t a, b, c local vars?
using registers for variables?if callee-saved, save and restore old
using registers for variables?if caller-saved, it’s okay to overwrite w/o saving
52
![Page 108: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/108.jpg)
callee code example (animated)myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
RSP 0x7FFF8RDI 2RSI 3RDX 4R8 4R9 4R12 0x5678RAXRBX 0x1234…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
0x1234
0x1234RSP→
0x7FFE8
0x5678
0x5678RSP→
2
2
3
3
4
4
0
4
437
2
999
0x7FFF0
0x5678
RSP→
0x7FFE8
0x1234
RSP→RSP→
53
![Page 109: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/109.jpg)
callee code example (animated)myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
RSP 0x7FFF8RDI 2RSI 3RDX 4R8 4R9 4R12 0x5678RAXRBX 0x1234…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
0x1234
0x1234RSP→
0x7FFE8
0x5678
0x5678RSP→
2
2
3
3
4
4
0
4
437
2
999
0x7FFF0
0x5678
RSP→
0x7FFE8
0x1234
RSP→RSP→
53
![Page 110: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/110.jpg)
callee code example (animated)myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
RSP 0x7FFF8RDI 2RSI 3RDX 4R8 4R9 4R12 0x5678RAXRBX 0x1234…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
0x1234
0x1234RSP→
0x7FFE8
0x5678
0x5678RSP→
2
2
3
3
4
4
0
4
437
2
999
0x7FFF0
0x5678
RSP→
0x7FFE8
0x1234
RSP→RSP→
53
![Page 111: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/111.jpg)
callee code example (animated)myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
RSP 0x7FFF8RDI 2RSI 3RDX 4R8 4R9 4R12 0x5678RAXRBX 0x1234…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
0x1234
0x1234RSP→
0x7FFE8
0x5678
0x5678RSP→
2
2
3
3
4
4
0
4
437
2
999
0x7FFF0
0x5678
RSP→
0x7FFE8
0x1234
RSP→RSP→
53
![Page 112: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/112.jpg)
callee code example (animated)myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
RSP 0x7FFF8RDI 2RSI 3RDX 4R8 4R9 4R12 0x5678RAXRBX 0x1234…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
0x1234
0x1234RSP→
0x7FFE8
0x5678
0x5678RSP→
2
2
3
3
4
4
0
4
437
2
999
0x7FFF0
0x5678
RSP→
0x7FFE8
0x1234
RSP→RSP→
53
![Page 113: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/113.jpg)
callee code example (animated)myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
RSP 0x7FFF8RDI 2RSI 3RDX 4R8 4R9 4R12 0x5678RAXRBX 0x1234…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
0x1234
0x1234RSP→
0x7FFE8
0x5678
0x5678RSP→
2
2
3
3
4
4
0
4
437
2
999
0x7FFF0
0x5678
RSP→
0x7FFE8
0x1234
RSP→RSP→
53
![Page 114: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/114.jpg)
callee code example (animated)myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
RSP 0x7FFF8RDI 2RSI 3RDX 4R8 4R9 4R12 0x5678RAXRBX 0x1234…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
0x1234
0x1234RSP→
0x7FFE8
0x5678
0x5678RSP→
2
2
3
3
4
4
0
4
437
2
999
0x7FFF0
0x5678
RSP→
0x7FFE8
0x1234
RSP→RSP→
53
![Page 115: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/115.jpg)
callee code example (animated)myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
RSP 0x7FFF8RDI 2RSI 3RDX 4R8 4R9 4R12 0x5678RAXRBX 0x1234…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
0x1234
0x1234RSP→
0x7FFE8
0x5678
0x5678RSP→
2
2
3
3
4
4
0
4
437
2
999
0x7FFF0
0x5678
RSP→
0x7FFE8
0x1234
RSP→RSP→
53
![Page 116: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/116.jpg)
callee code example (animated)myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
RSP 0x7FFF8RDI 2RSI 3RDX 4R8 4R9 4R12 0x5678RAXRBX 0x1234…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
0x1234
0x1234RSP→
0x7FFE8
0x5678
0x5678RSP→
2
2
3
3
4
4
0
4
437
2
999
0x7FFF0
0x5678
RSP→
0x7FFE8
0x1234
RSP→RSP→
53
![Page 117: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/117.jpg)
callee code example (animated)myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
RSP 0x7FFF8RDI 2RSI 3RDX 4R8 4R9 4R12 0x5678RAXRBX 0x1234…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
0x1234
0x1234RSP→
0x7FFE8
0x5678
0x5678RSP→
2
2
3
3
4
4
0
4
437
2
999
0x7FFF0
0x5678
RSP→
0x7FFE8
0x1234
RSP→RSP→
53
![Page 118: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/118.jpg)
callee code example (animated)myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
RSP 0x7FFF8RDI 2RSI 3RDX 4R8 4R9 4R12 0x5678RAXRBX 0x1234…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
0x1234
0x1234RSP→
0x7FFE8
0x5678
0x5678RSP→
2
2
3
3
4
4
0
4
437
2
999
0x7FFF0
0x5678
RSP→
0x7FFE8
0x1234
RSP→RSP→
53
![Page 119: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/119.jpg)
callee code example (animated)myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
RSP 0x7FFF8RDI 2RSI 3RDX 4R8 4R9 4R12 0x5678RAXRBX 0x1234…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
0x1234
0x1234RSP→
0x7FFE8
0x5678
0x5678RSP→
2
2
3
3
4
4
0
4
437
2
999
0x7FFF0
0x5678
RSP→
0x7FFE8
0x1234
RSP→RSP→
53
![Page 120: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/120.jpg)
callee code example (animated)myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
RSP 0x7FFF8RDI 2RSI 3RDX 4R8 4R9 4R12 0x5678RAXRBX 0x1234…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
0x1234
0x1234RSP→
0x7FFE8
0x5678
0x5678RSP→
2
2
3
3
4
4
0
4
437
2
999
0x7FFF0
0x5678
RSP→
0x7FFE8
0x1234
RSP→RSP→
53
![Page 121: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/121.jpg)
callee code example (animated)myFunc:
push RBX // save old RBX, which we've decided to use for cpush R12 // save old R12, to be used for resultmov R8, RDI // store a in R8 (not callee-saved)mov R9, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, R8 // result += aadd R12, R9 // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultpop R12 // restore old R12pop RBXret
RSP 0x7FFF8RDI 2RSI 3RDX 4R8 4R9 4R12 0x5678RAXRBX 0x1234…
…0x7FFF8 (ret address)
0x7FFF00x7FFE80x7FFE00x7FFD80x7FFD0…
RSP→
0x7FFF0
0x1234
0x1234RSP→
0x7FFE8
0x5678
0x5678RSP→
2
2
3
3
4
4
0
4
437
2
999
0x7FFF0
0x5678
RSP→
0x7FFE8
0x1234
RSP→RSP→
53
![Page 122: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/122.jpg)
what do compilers do?
must:deallocate any allocated stack spacesave/restore certain registerslook for arguments in certain placesput return value in certain place
but lots of policies for where to put locals…
what do compilers actually do?
it depends…
54
![Page 123: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/123.jpg)
callee code example (no optimizations)myFunc:
// allocate memory for a, b, c, resultsub rsp, 32mov qword ptr [rsp + 24], rdi // copy a from argmov qword ptr [rsp + 16], rsi // copy b from argmov qword ptr [rsp + 8], rdx // copy c from argmov qword ptr [rsp], 0 // result = 0mov rdx, qword ptr [rsp + 24] // rdx = aadd rdx, qword ptr [rsp] // rdx += resultmov qword ptr [rsp], rdx // result = rdxmov rdx, qword ptr [rsp + 16] // rdx = badd rdx, qword ptr [rsp] // rdx += resultmov qword ptr [rsp], rdx // result = rdxmov rdx, qword ptr [rsp + 8] // rdx = cadd rdx, qword ptr [rsp] // ...mov qword ptr [rsp], rdxmov rax, qword ptr [rsp] // ret val = result// dealocate memory for a, b, c, resultadd rsp, 32ret
address value…0xF000 (caller’s stuff)0xEFF8 return address …0xEFF0 value of a0xEFE8 value of b0xEFE0 value of c0xEFD8 value of result…
pretty inefficient — but obeys calling conventionone thing clang can generate without optimizations
55
![Page 124: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/124.jpg)
callee code example (no optimizations)myFunc:
// allocate memory for a, b, c, resultsub rsp, 32mov qword ptr [rsp + 24], rdi // copy a from argmov qword ptr [rsp + 16], rsi // copy b from argmov qword ptr [rsp + 8], rdx // copy c from argmov qword ptr [rsp], 0 // result = 0mov rdx, qword ptr [rsp + 24] // rdx = aadd rdx, qword ptr [rsp] // rdx += resultmov qword ptr [rsp], rdx // result = rdxmov rdx, qword ptr [rsp + 16] // rdx = badd rdx, qword ptr [rsp] // rdx += resultmov qword ptr [rsp], rdx // result = rdxmov rdx, qword ptr [rsp + 8] // rdx = cadd rdx, qword ptr [rsp] // ...mov qword ptr [rsp], rdxmov rax, qword ptr [rsp] // ret val = result// dealocate memory for a, b, c, resultadd rsp, 32ret
address value…0xF000 (caller’s stuff)0xEFF8 return address …0xEFF0 value of a0xEFE8 value of b0xEFE0 value of c0xEFD8 value of result…
pretty inefficient — but obeys calling conventionone thing clang can generate without optimizations
55
![Page 125: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/125.jpg)
callee code example (no optimizations)myFunc:
// allocate memory for a, b, c, resultsub rsp, 32mov qword ptr [rsp + 24], rdi // copy a from argmov qword ptr [rsp + 16], rsi // copy b from argmov qword ptr [rsp + 8], rdx // copy c from argmov qword ptr [rsp], 0 // result = 0mov rdx, qword ptr [rsp + 24] // rdx = aadd rdx, qword ptr [rsp] // rdx += resultmov qword ptr [rsp], rdx // result = rdxmov rdx, qword ptr [rsp + 16] // rdx = badd rdx, qword ptr [rsp] // rdx += resultmov qword ptr [rsp], rdx // result = rdxmov rdx, qword ptr [rsp + 8] // rdx = cadd rdx, qword ptr [rsp] // ...mov qword ptr [rsp], rdxmov rax, qword ptr [rsp] // ret val = result// dealocate memory for a, b, c, resultadd rsp, 32ret
address value…0xF000 (caller’s stuff)0xEFF8 return address …0xEFF0 value of a0xEFE8 value of b0xEFE0 value of c0xEFD8 value of result…
pretty inefficient — but obeys calling conventionone thing clang can generate without optimizations
55
![Page 126: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/126.jpg)
optimizations versus no
things that always work:allocate stack space for local variablesalways put values in their variable right awaydon’t reuse argument/return value registers
things clever compilers can doplace some local variables in registersskip storing values that aren’t usedreuse argument/return value registers when not calling/returning
56
![Page 127: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/127.jpg)
callee code example (better version)long myFunc(long a, long b, long c) {
long result = 0;result += a;result += b;result += c;return result;
}
myFunc:mov RAX, 0add RAX, RSIadd RAX, RDIadd RAX, RDXret
address value…0xF0000000 (caller’s stuff)0xEFFFFFF8 return address for myFunc0xEFFFFFE8 (next stack allocation)…
optimization: place result in RAX — avoid copy at endcaller can’t tell — RAX will be overwritten anyways
optimization: use argument registers directlyavoid copy at beginning (caller can’t tell)
57
![Page 128: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/128.jpg)
callee code example (better version)long myFunc(long a, long b, long c) {
long result = 0;result += a;result += b;result += c;return result;
}
myFunc:mov RAX, 0add RAX, RSIadd RAX, RDIadd RAX, RDXret
address value…0xF0000000 (caller’s stuff)0xEFFFFFF8 return address for myFunc0xEFFFFFE8 (next stack allocation)…
optimization: place result in RAX — avoid copy at endcaller can’t tell — RAX will be overwritten anyways
optimization: use argument registers directlyavoid copy at beginning (caller can’t tell)
57
![Page 129: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/129.jpg)
callee code example (better version)long myFunc(long a, long b, long c) {
long result = 0;result += a;result += b;result += c;return result;
}
myFunc:mov RAX, 0add RAX, RSIadd RAX, RDIadd RAX, RDXret
address value…0xF0000000 (caller’s stuff)0xEFFFFFF8 return address for myFunc0xEFFFFFE8 (next stack allocation)…
optimization: place result in RAX — avoid copy at endcaller can’t tell — RAX will be overwritten anyways
optimization: use argument registers directlyavoid copy at beginning (caller can’t tell)
57
![Page 130: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/130.jpg)
callee code example (better version)long myFunc(long a, long b, long c) {
long result = 0;result += a;result += b;result += c;return result;
}
myFunc:mov RAX, 0add RAX, RSIadd RAX, RDIadd RAX, RDXret
address value…0xF0000000 (caller’s stuff)0xEFFFFFF8 return address for myFunc0xEFFFFFE8 (next stack allocation)…
optimization: place result in RAX — avoid copy at endcaller can’t tell — RAX will be overwritten anyways
optimization: use argument registers directlyavoid copy at beginning (caller can’t tell)
57
![Page 131: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/131.jpg)
callee code example (good version)
long myFunc(long a, long b, long c) {long result = 0;result += a;result += b;result += c;return result;
}
myFunc:lea rax, [rdi + rsi] // return value = a + badd rax, rdx // return value += cret
address value…0xF0000000 (caller’s stuff)0xEFFFFFF8 return address for myFunc0xEFFFFFE8 (next stack allocation)…
what clang generates with optimizations
58
![Page 132: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/132.jpg)
callee code example (good version)
long myFunc(long a, long b, long c) {long result = 0;result += a;result += b;result += c;return result;
}
myFunc:lea rax, [rdi + rsi] // return value = a + badd rax, rdx // return value += cret
address value…0xF0000000 (caller’s stuff)0xEFFFFFF8 return address for myFunc0xEFFFFFE8 (next stack allocation)…
what clang generates with optimizations
58
![Page 133: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/133.jpg)
callee code example (good version)
long myFunc(long a, long b, long c) {long result = 0;result += a;result += b;result += c;return result;
}
myFunc:lea rax, [rdi + rsi] // return value = a + badd rax, rdx // return value += cret
address value…0xF0000000 (caller’s stuff)0xEFFFFFF8 return address for myFunc0xEFFFFFE8 (next stack allocation)…
what clang generates with optimizations
58
![Page 134: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/134.jpg)
writing called functions
save any callee-saved registers function usesRBP, RBX, R12-R15,
allocate stack space for local variables or temporary storage
(actual function body)
place return value in RAX
deallocate stack space
restore any saved registers
59
![Page 135: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/135.jpg)
callee code example (save registers weirdly)long myFunc(long a, long b, long c) {
long result = 0;result += a; result += b; result += c;return result;
}
myFunc:mov R8, RBX // save old RBX, but to R8mov R9, RBP // save old RBP, but to R9push R12 // save old R12, which we've decided to use for resultmov RAX, RDI // store a in RAXmov RBP, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, RAX // result += aadd R12, RBP // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultmov RBX, R8 // restore old RBXpop R12 // restore old R12mov RBP, R9 // restore old RBPret
calling convention doesn’t specify how you save/restore registersanything is fine as long as values are restored
60
![Page 136: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/136.jpg)
callee code example (save registers weirdly)long myFunc(long a, long b, long c) {
long result = 0;result += a; result += b; result += c;return result;
}
myFunc:mov R8, RBX // save old RBX, but to R8mov R9, RBP // save old RBP, but to R9push R12 // save old R12, which we've decided to use for resultmov RAX, RDI // store a in RAXmov RBP, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, RAX // result += aadd R12, RBP // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultmov RBX, R8 // restore old RBXpop R12 // restore old R12mov RBP, R9 // restore old RBPret
calling convention doesn’t specify how you save/restore registersanything is fine as long as values are restored
60
![Page 137: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/137.jpg)
callee code example (save registers weirdly)long myFunc(long a, long b, long c) {
long result = 0;result += a; result += b; result += c;return result;
}
myFunc:mov R8, RBX // save old RBX, but to R8mov R9, RBP // save old RBP, but to R9push R12 // save old R12, which we've decided to use for resultmov RAX, RDI // store a in RAXmov RBP, RSI // store b in RBPmov RBX, RDX // store c in RBXmov R12, 0 // result = 0add R12, RAX // result += aadd R12, RBP // result += badd R12, RBX // result += cmov RAX, R12 // ret val = resultmov RBX, R8 // restore old RBXpop R12 // restore old R12mov RBP, R9 // restore old RBPret
calling convention doesn’t specify how you save/restore registersanything is fine as long as values are restored 60
![Page 138: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/138.jpg)
activation records
calling subroutine puts some things on stack:saved register valuesparameters (if not in registers)local variablesreturn address
together called theactivation recordfor the subroutine
…caller saved registersreturn address of foolocal variables of foocallee saved registerscaller saved registersreturn address of barlocal variables of barcallee saved registers…
foo’sactivation
record
bar’sactivation
record
61
![Page 139: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/139.jpg)
missing calling conv. parts
floating point arguments/return values?floating point registers…
arguments/return values too big for registerarguments: passed on stackreturn value: caller allocates space, passes pointer
class methodsimplicit this argument, usuallyextra stuff for inheritence
62
![Page 140: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/140.jpg)
calling convention complete version (C)
https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI
section 3.2 covers calling convention
63
![Page 141: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/141.jpg)
C++ calling convention
https://itanium-cxx-abi.github.io/cxx-abi/
64
![Page 142: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/142.jpg)
and/or/xor
and <first>, <second>xor <first>, <second>or <first>, <second>
bit-by-bit and, or, xor
e.g. if RAX = 1110TWO and RBX = 0101TWOand RAX, RBX → RAX becomes 0100TWOxor RAX, RBX → RAX becomes 1011TWOor RAX, RBX → RAX becomes 1111TWO
65
![Page 143: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/143.jpg)
cmp+jmp
earlier idea: pair of compare + conditional jump
actually CMP one of many instruction that sets flags
66
![Page 144: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/144.jpg)
other flag setting instructions
compilers omit CMP by using subtraction, etc.implicit compare result to 0 (almost)e.g.:loop: add RBX, RBX
sub RAX, 1jne loop
is the same asloop: add RBX, RBX
sub RAX, 1cmp RAX, 0jne loop
67
![Page 145: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/145.jpg)
TEST/CMP
TEST instruction:performs bitwise and, set flags, discard result
TEST RAX, RAX ≈ CMP RAX, 0
TEST RAX, RAX ≈ AND RAX, RAX
CMP instruction:perform subtraction, set flags, discard result
CMP RAX, RBX ≈ PUSH RBX; SUB RAX, RBX; POP RBX
68
![Page 146: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/146.jpg)
program memory (x86-64 Linux)
Used by OS0xFFFF FFFF FFFF FFFF
0xFFFF 8000 0000 0000
Stack0x7F…
Heap / other dynamicWritable data
Code + Constants0x0000 0000 0040 0000
← activation records go here
← new uses space here
stack grows towards heap (activation records)heap grows towards stack (allocations with new)hopefully never meet
69
![Page 147: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/147.jpg)
program memory (x86-64 Linux)
Used by OS0xFFFF FFFF FFFF FFFF
0xFFFF 8000 0000 0000
Stack0x7F…
Heap / other dynamicWritable data
Code + Constants0x0000 0000 0040 0000
← activation records go here
← new uses space here
stack grows towards heap (activation records)heap grows towards stack (allocations with new)hopefully never meet
69
![Page 148: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/148.jpg)
godbolt.org
“compiler explorer”
many, many C++ compilers
does work of extracting just the relevant assembly
also does “demangling”translate ‘mangled’ assembly names to C++ names
70
![Page 149: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/149.jpg)
optimizing away
int foo() { return 42; }int example() { return 1 + foo(); }
possible generated asm:..._Z8example1v:
mov EAX, 43ret
int foo();int example() { return 1 + foo(); }
possible asm:_Z8example1v:
push RAXcall _Z4foo1vadd EAX, 1pop RCXret 71
![Page 150: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/150.jpg)
getting assembly output from clang
clang++ -S ... file.cpp — write assembly to file.sin machine’s AT&T assembly syntaxnot the syntax you will be coding
clang++ -mllvm --x86-asm-syntax=intel -S ...file.cpp — …in Intel-like syntax
much closer to syntax you will be codingbut won’t work with nasm
72
![Page 151: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/151.jpg)
test_abs.cpp
#include <iostream>using namespace std;extern "C" long absolute_value(long x);
long absolute_value(long x) {if (x<0) // if x is negative
x = −x; // negate xreturn x; // return x
}
int main() {long theValue=0;cout << "Enter␣a␣value:␣" << endl;cin >> theValue;long theResult = absolute_value(theValue);cout << "The␣result␣is:␣" << theResult << endl;return 0;
}73
![Page 152: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/152.jpg)
absolute_value
clang++ -S: (AT&T syntax)…absolute_value:
movq %rdi, −8(%rsp)cmpq $0, −8(%rsp)jge .LBB1_2xorl %eax, %eaxmovl %eax, %ecxsubq −8(%rsp), %rcxmovq %rcx, −8(%rsp)
.LBB1_2:movq −8(%rsp), %raxretq
…74
![Page 153: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/153.jpg)
AT&T syntax
destination last% = registerdisp(base) same as memory[disp + base]
disp(base, index, scale) same asmemory[disp + base + index * scale]
can omit disp /or omit base (defaults to 0) and/or scale (defualts to 1)
$ means constant/numberplain number/label means value in memorymovq/addq/… — 8 bytes (quad) mov/add
movl — 4 bytes (long); movw — 2 bytes (word); movb 1 byte75
![Page 154: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/154.jpg)
absolute_value (unoptimized)
clang++ -S --mllvm --x86-asm-syntax=intel -S -fomit-frame-pointer:
absolute_value:mov qword ptr [rsp − 8], rdicmp qword ptr [rsp − 8], 0jge .LBB1_2xor eax, eaxmov ecx, eaxsub rcx, qword ptr [rsp − 8]mov qword ptr [rsp − 8], rcx
.LBB1_2:mov rax, qword ptr [rsp − 8]ret
76
![Page 155: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/155.jpg)
absolute_value_int (unoptimized)
longs replaced with intsclang++ -S --mllvm --x86-asm-syntax=intel -S -fomit-frame-pointer:
absolute_value_int:mov dword ptr [rsp − 4], edicmp dword ptr [rsp − 4], 0jge .LBB0_2xor eax, eaxsub eax, dword ptr [rsp − 4]mov dword ptr [rsp − 4], eax
.LBB0_2:mov eax, dword ptr [rsp − 4]ret
77
![Page 156: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/156.jpg)
absolute_value (optimized)
clang++ -S -O2 --mllvm --x86-asm-syntax=intel -S -fomit-frame-pointer:
absolute_value:mov rax, rdineg raxcmovl rax, rdiret
(cmovl — mov if flags say less than;and negate sets those flags)my recommendation: use some optimization option when generatingassembly to look at
78
![Page 157: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/157.jpg)
absolute value without cmov (1)
what if we didn’t know about cmovXX…?// NASM syntax:global absolute_value// GNU assembler syntax: .global absolute_value
absolute_value:mov rax, rdi // x = return value ← arg 1cmp rax, 0 // x == 0?jge end_of_procedureneg rax // NEGate
end_of_procedure:ret
79
![Page 158: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/158.jpg)
absolute value without cmov (2)
what if we didn’t know about cmovXX and neg…?// NASM syntax:global absolute_value// GNU assembler syntax: .global absolute_value
absolute_value:mov rax, rdi // x = return value ← arg 1cmp rax, 0 // x == 0?jge end_of_proceduremov rax, 0sub rax, rdi
end_of_procedure:ret
80
![Page 159: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/159.jpg)
rest of the .s file
I’ve shown you a little bit of the .s file
there’s alot of extra stuff in there…
81
![Page 160: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/160.jpg)
in context (1)
“text segment” (code)file information:
.text
.intel_syntax noprefix
.file "test_abs.cpp"
82
![Page 161: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/161.jpg)
in context (2)
.section .text.startup,"ax",@progbits
.align 16, 0x90
.type __cxx_global_var_init,@function__cxx_global_var_init: # @__cxx_global_var_init
.cfi_startproc# BB#0:
push rax.Ltmp0:
.cfi_def_cfa_offset 16movabs rdi, _ZStL8__ioinitcall _ZNSt8ios_base4InitC1Evmovabs rdi, _ZNSt8ios_base4InitD1Evmovabs rsi, _ZStL8__ioinitmovabs rdx, __dso_handlecall __cxa_atexitmov dword ptr [rsp + 4], eax # 4-byte Spillpop raxret
__cxx_global_var_init —function to call global variable constructors/etc.
_ZStL8__ioinit = std::__ioinit (global var.)_ZNSt8ios_base4InitC1Ev = ios_base::Init::Init()(constructor)
.cfi_…— for debugger/exceptions
83
![Page 162: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/162.jpg)
in context (2)
.section .text.startup,"ax",@progbits
.align 16, 0x90
.type __cxx_global_var_init,@function__cxx_global_var_init: # @__cxx_global_var_init
.cfi_startproc# BB#0:
push rax.Ltmp0:
.cfi_def_cfa_offset 16movabs rdi, _ZStL8__ioinitcall _ZNSt8ios_base4InitC1Evmovabs rdi, _ZNSt8ios_base4InitD1Evmovabs rsi, _ZStL8__ioinitmovabs rdx, __dso_handlecall __cxa_atexitmov dword ptr [rsp + 4], eax # 4-byte Spillpop raxret
__cxx_global_var_init —function to call global variable constructors/etc.
_ZStL8__ioinit = std::__ioinit (global var.)_ZNSt8ios_base4InitC1Ev = ios_base::Init::Init()(constructor)
.cfi_…— for debugger/exceptions
83
![Page 163: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/163.jpg)
in context (2)
.section .text.startup,"ax",@progbits
.align 16, 0x90
.type __cxx_global_var_init,@function__cxx_global_var_init: # @__cxx_global_var_init
.cfi_startproc# BB#0:
push rax.Ltmp0:
.cfi_def_cfa_offset 16movabs rdi, _ZStL8__ioinitcall _ZNSt8ios_base4InitC1Evmovabs rdi, _ZNSt8ios_base4InitD1Evmovabs rsi, _ZStL8__ioinitmovabs rdx, __dso_handlecall __cxa_atexitmov dword ptr [rsp + 4], eax # 4-byte Spillpop raxret
__cxx_global_var_init —function to call global variable constructors/etc.
_ZStL8__ioinit = std::__ioinit (global var.)_ZNSt8ios_base4InitC1Ev = ios_base::Init::Init()(constructor)
.cfi_…— for debugger/exceptions
83
![Page 164: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/164.jpg)
in context (2)
.section .text.startup,"ax",@progbits
.align 16, 0x90
.type __cxx_global_var_init,@function__cxx_global_var_init: # @__cxx_global_var_init
.cfi_startproc# BB#0:
push rax.Ltmp0:
.cfi_def_cfa_offset 16movabs rdi, _ZStL8__ioinitcall _ZNSt8ios_base4InitC1Evmovabs rdi, _ZNSt8ios_base4InitD1Evmovabs rsi, _ZStL8__ioinitmovabs rdx, __dso_handlecall __cxa_atexitmov dword ptr [rsp + 4], eax # 4-byte Spillpop raxret
__cxx_global_var_init —function to call global variable constructors/etc.
_ZStL8__ioinit = std::__ioinit (global var.)_ZNSt8ios_base4InitC1Ev = ios_base::Init::Init()(constructor)
.cfi_…— for debugger/exceptions
83
![Page 165: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/165.jpg)
in context (3)
.text
.globl absolute_value
.align 16, 0x90
.type absolute_value,@functionabsolute_value: # @absolute_value
.cfi_startproc# BB#0:
mov qword ptr [rsp − 8], rdicmp qword ptr [rsp − 8], 0jge .LBB1_2
# BB#1:xor eax, eaxmov ecx, eaxsub rcx, qword ptr [rsp − 8]mov qword ptr [rsp − 8], rcx
.LBB1_2:mov rax, qword ptr [rsp − 8]ret
.Lfunc_end1:.size absolute_value, .Lfunc_end1−absolute_value.cfi_endproc
.globl — make this label accessible in other files
.type — help linker/debugger/etc.
84
![Page 166: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/166.jpg)
in context (3)
.text
.globl absolute_value
.align 16, 0x90
.type absolute_value,@functionabsolute_value: # @absolute_value
.cfi_startproc# BB#0:
mov qword ptr [rsp − 8], rdicmp qword ptr [rsp − 8], 0jge .LBB1_2
# BB#1:xor eax, eaxmov ecx, eaxsub rcx, qword ptr [rsp − 8]mov qword ptr [rsp − 8], rcx
.LBB1_2:mov rax, qword ptr [rsp − 8]ret
.Lfunc_end1:.size absolute_value, .Lfunc_end1−absolute_value.cfi_endproc
.globl — make this label accessible in other files
.type — help linker/debugger/etc.
84
![Page 167: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/167.jpg)
in context (4)
.globl main
.align 16, 0x90
.type main,@functionmain: # @main
.cfi_startproc# BB#0:
sub rsp, 56.Ltmp1:
.cfi_def_cfa_offset 64movabs rdi, _ZSt4coutmovabs rsi, .L.strmov dword ptr [rsp + 52], 0mov qword ptr [rsp + 40], 0call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKcmovabs rsi, _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_mov rdi, ra_end1−absolute_value...
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc =ostream& operator<<(ostream&, char const*)
85
![Page 168: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/168.jpg)
in context (4)
.globl main
.align 16, 0x90
.type main,@functionmain: # @main
.cfi_startproc# BB#0:
sub rsp, 56.Ltmp1:
.cfi_def_cfa_offset 64movabs rdi, _ZSt4coutmovabs rsi, .L.strmov dword ptr [rsp + 52], 0mov qword ptr [rsp + 40], 0call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKcmovabs rsi, _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_mov rdi, ra_end1−absolute_value...
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc =ostream& operator<<(ostream&, char const*)
85
![Page 169: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/169.jpg)
extern "C"
#include <iostream>using namespace std;extern "C" long absolute_value(long x);
long absolute_value(long x) {if (x<0) // if x is negative
x = −x; // negate xreturn x; // return x
}
int main() {long theValue=0;cout << "Enter␣a␣value:␣" << endl;cin >> theValue;long theResult = absolute_value(theValue);cout << "The␣result␣is:␣" << theResult << endl;return 0;
}86
![Page 170: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/170.jpg)
extern "C" — name mangling
with extern "C":absolute_value:
...
without extern "C":_Z14absolute_valuel:
...
87
![Page 171: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/171.jpg)
extern C — different args
This not allowed:extern "C" long absolute_value(long x);extern "C" int absolute_value(int x);
because C doesn’t allow it, and extern "C" means ‘C-compatible’.
This is fine:long absolute_value(long x);int absolute_value(int x);
because C++ allows functions with different args, but same nameassembly on Linux:
_Z14absolute_valuel, and_Z14absolute_valuei
88
![Page 172: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/172.jpg)
c++filt
c++filt — command line program to translate C++ symbolnames$ c++filtThe function is _Z14absolute_valuellll^DOutput: The function is absolute_value(long, long,long, long)
89
![Page 173: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/173.jpg)
frame pointers
stack pointer: points to “top” of stackx86 register RSP used for thisi.e. lowest address on stacki.e. location of next stack allocation
frame pointer: pointer to allocation record AKA “stack frame”x86 register RBP intended for this
not required by the calling conventionfunction can use RSP instead
90
![Page 174: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/174.jpg)
frame pointer defaults
some systems default to using frame pointerseasier to deallocate stack space (mov RSP, RBP)can support “dynamic” stack allocations (alloca())easier to write debuggers
our lab machines don’tat least with optimizations
clang/GCC flags:-fomit-frame-pointer/-fno-omit-frame-pointer(clang only) -mno-omit-leaf-frame-pointer(“leaf” = function that doesn’t call anything)
91
![Page 175: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/175.jpg)
frame pointer code
someFunction:push RBP // save old frame pointermov RBP, RSP // top of stack is frame pointersub RSP, 32 // allocate 32 bytes for local variables...add [RBP − 8], 1 // someLocalVar += 1...mov RSP, RBP // restore old stack pointer
// instead of: add RSP, 32pop RBPret
92
![Page 176: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/176.jpg)
int max(int x, int y)
int max(int x, int y) {int theMax;if (x > y) // if x > y then x is max
theMax = x;else // else y is the max
theMax = y;return theMax; // return the max
}
93
![Page 177: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/177.jpg)
max assembly (unoptimized)
max:mov dword ptr [rsp − 4], edimov dword ptr [rsp − 8], esimov esi, dword ptr [rsp − 4]cmp esi, dword ptr [rsp − 8]jle .LBB1_2mov eax, dword ptr [rsp − 4]mov dword ptr [rsp − 12], eaxjmp .LBB1_3
.LBB1_2:mov eax, dword ptr [rsp − 8]mov dword ptr [rsp − 12], eax
.LBB1_3:mov eax, dword ptr [rsp − 12]ret
32-bit int, not 64-bit longs in C code32-bit ints: stack allocations 4 bytes, not 8‘dumb’ variable allocation: every variable on the stackdoesn’t make intelligent use of registers…
94
![Page 178: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/178.jpg)
max assembly (unoptimized)
max:mov dword ptr [rsp − 4], edimov dword ptr [rsp − 8], esimov esi, dword ptr [rsp − 4]cmp esi, dword ptr [rsp − 8]jle .LBB1_2mov eax, dword ptr [rsp − 4]mov dword ptr [rsp − 12], eaxjmp .LBB1_3
.LBB1_2:mov eax, dword ptr [rsp − 8]mov dword ptr [rsp − 12], eax
.LBB1_3:mov eax, dword ptr [rsp − 12]ret
32-bit int, not 64-bit longs in C code
32-bit ints: stack allocations 4 bytes, not 8‘dumb’ variable allocation: every variable on the stackdoesn’t make intelligent use of registers…
94
![Page 179: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/179.jpg)
max assembly (unoptimized)
max:mov dword ptr [rsp − 4], edimov dword ptr [rsp − 8], esimov esi, dword ptr [rsp − 4]cmp esi, dword ptr [rsp − 8]jle .LBB1_2mov eax, dword ptr [rsp − 4]mov dword ptr [rsp − 12], eaxjmp .LBB1_3
.LBB1_2:mov eax, dword ptr [rsp − 8]mov dword ptr [rsp − 12], eax
.LBB1_3:mov eax, dword ptr [rsp − 12]ret
32-bit int, not 64-bit longs in C code
32-bit ints: stack allocations 4 bytes, not 8
‘dumb’ variable allocation: every variable on the stackdoesn’t make intelligent use of registers…
94
![Page 180: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/180.jpg)
max assembly (unoptimized)
max:mov dword ptr [rsp − 4], edimov dword ptr [rsp − 8], esimov esi, dword ptr [rsp − 4]cmp esi, dword ptr [rsp − 8]jle .LBB1_2mov eax, dword ptr [rsp − 4]mov dword ptr [rsp − 12], eaxjmp .LBB1_3
.LBB1_2:mov eax, dword ptr [rsp − 8]mov dword ptr [rsp − 12], eax
.LBB1_3:mov eax, dword ptr [rsp − 12]ret
32-bit int, not 64-bit longs in C code32-bit ints: stack allocations 4 bytes, not 8
‘dumb’ variable allocation: every variable on the stackdoesn’t make intelligent use of registers…94
![Page 181: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/181.jpg)
max assembly (optimized)
max:cmp edi, esicmovge esi, edimov eax, esiret
cmovge: mov if greater than or equal
95
![Page 182: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/182.jpg)
max assembly (optimized)
max:cmp edi, esicmovge esi, edimov eax, esiret
cmovge: mov if greater than or equal
95
![Page 183: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/183.jpg)
compare_string
bool compare_string (const char *theStr1,const char *theStr2) {
// while *theStr1 is not nul terminator// and the current corresponding bytes are equalwhile( (*theStr1 != '\0')
&& (*theStr1 == *theStr2) ) {theStr1++; // increment the pointers totheStr2++; // the next char / byte
}return (*theStr1==*theStr2);
}
96
![Page 184: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/184.jpg)
compare_string (optimized; part 1)
compare_string:mov al, byte ptr [rdi]test al, alje .LBB0_4inc rdi
.LBB0_2:movzx ecx, byte ptr [rsi]movzx edx, alcmp edx, ecxjne .LBB0_5inc rsimov al, byte ptr [rdi]inc rditest al, aljne .LBB0_2...
jump if al == 0al is *theStr1movzx = mov zero-extendrdi = theStr1, rsi = theStr2
97
![Page 185: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/185.jpg)
compare_string (optimized; part 1)
compare_string:mov al, byte ptr [rdi]test al, alje .LBB0_4inc rdi
.LBB0_2:movzx ecx, byte ptr [rsi]movzx edx, alcmp edx, ecxjne .LBB0_5inc rsimov al, byte ptr [rdi]inc rditest al, aljne .LBB0_2...
jump if al == 0al is *theStr1
movzx = mov zero-extendrdi = theStr1, rsi = theStr2
97
![Page 186: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/186.jpg)
compare_string (optimized; part 1)
compare_string:mov al, byte ptr [rdi]test al, alje .LBB0_4inc rdi
.LBB0_2:movzx ecx, byte ptr [rsi]movzx edx, alcmp edx, ecxjne .LBB0_5inc rsimov al, byte ptr [rdi]inc rditest al, aljne .LBB0_2...
jump if al == 0al is *theStr1
movzx = mov zero-extend
rdi = theStr1, rsi = theStr2
97
![Page 187: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/187.jpg)
compare_string (optimized; part 1)
compare_string:mov al, byte ptr [rdi]test al, alje .LBB0_4inc rdi
.LBB0_2:movzx ecx, byte ptr [rsi]movzx edx, alcmp edx, ecxjne .LBB0_5inc rsimov al, byte ptr [rdi]inc rditest al, aljne .LBB0_2...
jump if al == 0al is *theStr1movzx = mov zero-extend
rdi = theStr1, rsi = theStr297
![Page 188: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/188.jpg)
compare_string (optimized; part 2)
.LBB0_4:xor eax, eax
.LBB0_5:movzx ecx, byte ptr [rsi]movzx eax, alcmp eax, ecxsete alret
sete = set al to 1 if equal, 0 otherwise
98
![Page 189: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/189.jpg)
compare_string (optimized; part 2)
.LBB0_4:xor eax, eax
.LBB0_5:movzx ecx, byte ptr [rsi]movzx eax, alcmp eax, ecxsete alret
sete = set al to 1 if equal, 0 otherwise98
![Page 190: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/190.jpg)
fib
long fib(unsigned int n) {if ((n==0) || (n==1))
return 1;return fib(n−1) + fib(n−2);
}
99
![Page 191: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/191.jpg)
fib
long fib(unsigned int n) {if ((n==0) || (n==1))
return 1;return fib(n−1) + fib(n−2);
}
99
![Page 192: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/192.jpg)
fib (optimized; part 1)
fib:push r14push rbxpush raxmov ebx, edimov eax, ebxor eax, 1mov r14d, 1cmp eax, 1je .LBB0_3...
save two callee-saved registers
x86-64 rule: RSP must be multiple of 16when call happens(rax not actually restored)
if n is 0 or 1…jumps to code that returns R14edi, ebx both copies of n
100
![Page 193: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/193.jpg)
fib (optimized; part 1)
fib:push r14push rbxpush raxmov ebx, edimov eax, ebxor eax, 1mov r14d, 1cmp eax, 1je .LBB0_3...
save two callee-saved registers
x86-64 rule: RSP must be multiple of 16when call happens(rax not actually restored)
if n is 0 or 1…jumps to code that returns R14edi, ebx both copies of n
100
![Page 194: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/194.jpg)
fib (optimized; part 1)
fib:push r14push rbxpush raxmov ebx, edimov eax, ebxor eax, 1mov r14d, 1cmp eax, 1je .LBB0_3...
save two callee-saved registers
x86-64 rule: RSP must be multiple of 16when call happens(rax not actually restored)
if n is 0 or 1…jumps to code that returns R14edi, ebx both copies of n
100
![Page 195: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/195.jpg)
fib (optimized; part 1)
fib:push r14push rbxpush raxmov ebx, edimov eax, ebxor eax, 1mov r14d, 1cmp eax, 1je .LBB0_3...
save two callee-saved registers
x86-64 rule: RSP must be multiple of 16when call happens(rax not actually restored)
if n is 0 or 1…jumps to code that returns R14
edi, ebx both copies of n
100
![Page 196: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/196.jpg)
fib (optimized; part 1)
fib:push r14push rbxpush raxmov ebx, edimov eax, ebxor eax, 1mov r14d, 1cmp eax, 1je .LBB0_3...
save two callee-saved registers
x86-64 rule: RSP must be multiple of 16when call happens(rax not actually restored)
if n is 0 or 1…jumps to code that returns R14
edi, ebx both copies of n
100
![Page 197: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/197.jpg)
fib (optimized; part 2)
add ebx, −2mov r14d, 1
.LBB0_2:lea edi, [rbx + 1]call fibadd r14, raxmov eax, ebxor eax, 1add ebx, −2cmp eax, 1jne .LBB0_2
.LBB0_3:mov rax, r14add rsp, 8pop rbxpop r14ret
return r14undo stack adjustmentrestore rbx, r14
ebx previously set to n=edifib(n-1)trick: replace fib(n-2) call with loop
101
![Page 198: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/198.jpg)
fib (optimized; part 2)
add ebx, −2mov r14d, 1
.LBB0_2:lea edi, [rbx + 1]call fibadd r14, raxmov eax, ebxor eax, 1add ebx, −2cmp eax, 1jne .LBB0_2
.LBB0_3:mov rax, r14add rsp, 8pop rbxpop r14ret
return r14undo stack adjustmentrestore rbx, r14
ebx previously set to n=edifib(n-1)trick: replace fib(n-2) call with loop
101
![Page 199: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/199.jpg)
fib (optimized; part 2)
add ebx, −2mov r14d, 1
.LBB0_2:lea edi, [rbx + 1]call fibadd r14, raxmov eax, ebxor eax, 1add ebx, −2cmp eax, 1jne .LBB0_2
.LBB0_3:mov rax, r14add rsp, 8pop rbxpop r14ret
return r14undo stack adjustmentrestore rbx, r14
ebx previously set to n=edifib(n-1)
trick: replace fib(n-2) call with loop
101
![Page 200: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/200.jpg)
fib (optimized; part 2)
add ebx, −2mov r14d, 1
.LBB0_2:lea edi, [rbx + 1]call fibadd r14, raxmov eax, ebxor eax, 1add ebx, −2cmp eax, 1jne .LBB0_2
.LBB0_3:mov rax, r14add rsp, 8pop rbxpop r14ret
return r14undo stack adjustmentrestore rbx, r14
ebx previously set to n=edifib(n-1)
trick: replace fib(n-2) call with loop
101
![Page 201: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/201.jpg)
a vulnerable function
void vulnerable() {char buffer[100];cin >> buffer;
}
sub rsp, 120mov rsi, rspmov edi, /* cin */call /* operator>>(istream,char*) */add rsp, 120ret
102
![Page 202: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/202.jpg)
buffer overflows
incr
easin
gad
dres
ses
highest address (stack started here)
lowest address (stack grows here)
return address for vulnerable:0x401040
unused space (20 bytes)
buffer (100 bytes)
return address for cin::operator<<
an input…(part of buffer)a longer input…(part of buffer)an input that fills the buffer…
an input that fillsmore than the buffer
return address for vulnerable:(now part of input)
machine code somewhere
103
![Page 203: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/203.jpg)
buffer overflows
incr
easin
gad
dres
ses
highest address (stack started here)
lowest address (stack grows here)
return address for vulnerable:0x401040
unused space (20 bytes)
buffer (100 bytes)
return address for cin::operator<<
an input…(part of buffer)
a longer input…(part of buffer)an input that fills the buffer…
an input that fillsmore than the buffer
return address for vulnerable:(now part of input)
machine code somewhere
103
![Page 204: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/204.jpg)
buffer overflows
incr
easin
gad
dres
ses
highest address (stack started here)
lowest address (stack grows here)
return address for vulnerable:0x401040
unused space (20 bytes)
buffer (100 bytes)
return address for cin::operator<<
an input…(part of buffer)
a longer input…(part of buffer)
an input that fills the buffer…
an input that fillsmore than the buffer
return address for vulnerable:(now part of input)
machine code somewhere
103
![Page 205: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/205.jpg)
buffer overflows
incr
easin
gad
dres
ses
highest address (stack started here)
lowest address (stack grows here)
return address for vulnerable:(part of input)
unused space (20 bytes)
buffer (100 bytes)
return address for cin::operator<<
an input…(part of buffer)a longer input…(part of buffer)
an input that fills the buffer…
an input that fillsmore than the buffer
return address for vulnerable:(now part of input)
machine code somewhere
103
![Page 206: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/206.jpg)
buffer overflows
incr
easin
gad
dres
ses
highest address (stack started here)
lowest address (stack grows here)
return address for vulnerable:(part of input)
unused space (20 bytes)
buffer (100 bytes)
return address for cin::operator<<
an input…(part of buffer)a longer input…(part of buffer)an input that fills the buffer…
an input that fillsmore than the buffer
return address for vulnerable:(now part of input)
machine code somewhere
103
![Page 207: x86-64assembly · x86history seven8-bitregisters 1971:Intel8008 eight16-bitregisters: 1978:Intel8086 1982:Intel80286 eight32-bitregisters: 1985:Intel80386 1989:Intel80486](https://reader036.vdocument.in/reader036/viewer/2022062917/5ed968fcf59b0f56f45f7079/html5/thumbnails/207.jpg)
variable argument functions
C++ — multiple versions of functions — different assembly names:
long foo(long a) becomes _Z3foollong foo(long a, long b) becomes _Z3fooll
can also have variable argument functions — more common in Cexample: void printf(const char *format, ...) (C equiv.of cout)
printf("The␣number␣is␣%d.\n", 42);
mov edi, .L.strmov esi, 42xor eax, eax // # of floating point argscall printf...
.L.str:.asciz "This␣number␣is␣%d."
104