ARM Procedure Calling Conventions and Recursion


Page 1: ARM procedure calling conventions and recursion

Page 1© David Brailsford 2011

Professor David Brailsford ([email protected]), School of Computer Science

University of Nottingham

Extra material courtesy of:

Jaume Bacardit, Thorsten Altenkirch and Liyang Hu — School of CS, Univ. of Nottm.

Steve Furber, Jim Garside and Pete Jinks — School of CS, Univ. of Manchester

Lecture 08: ARM Procedure Calling Conventions and Recursion


What is a procedure?

◆ A portion of code within a larger program. Often called:
    - a subroutine or procedure in imperative languages like C
    - a method in OO languages like Java
    - a function in functional languages like Haskell

◆ Functions return a value. So some purists would say that a C function returning void is actually a procedure!

◆ Procedures are necessary for:
    - reducing duplication of code and enabling re-use
    - decomposing complex programs into manageable parts

◆ Procedures can call each other and can even call themselves

◆ What happens when we call a procedure?
    - The caller is suspended; control is handed over to the callee
    - The callee performs the requested task
    - The callee returns control to the caller


An Example in C

    #include <math.h>
    #include <stdio.h>

    int f (int x, int y) {
        return sqrt(x * x + y * y);
    }

    int main ( ) {
        printf ("f(5,12) = %d\n", f(5, 12));
        return 0;
    }

[Diagram: main calls function f. The call jumps to a new piece of code but keeps track of where we were before; the return goes to the next instruction of the original code.]


Basic procedure calls on ARM

◆ We already know that the BL instruction uses R14 as the link register (LR). This is where it stores the return address

◆ So in simple cases, at the end of the procedure, all we need to do is MOV PC, LR

◆ In simple cases routines may be able to do their job solely with registers

◆ We’ve seen some simple examples of this with the strcpy and strchr procedures in courseworks.

◆ But we need conventions for register usage to avoid over-writing and misunderstandings

◆ Thus we have the APCS (ARM Procedure Call Standard) to guide us


APCS Register Use Convention

Register   APCS name   APCS role
--------   ---------   ----------------------------------------------
 0         a1          Argument 1 / integer result / scratch register
 1         a2          Argument 2 / scratch register
 2         a3          Argument 3 / scratch register
 3         a4          Argument 4 / scratch register
 4         v1          Register variable 1
 5         v2          Register variable 2
 6         v3          Register variable 3
 7         v4          Register variable 4
 8         v5          Register variable 5
 9         sb/v6       Static base / Register variable 6
10         sl/v7       Stack limit / Register variable 7
11         fp          Frame pointer
12         ip          Scratch register / specialist use by linker
13         sp          Lower end of current stack frame
14         lr          Link address / scratch register
15         pc          Program counter


Caller Saved Registers

◆ R0–R3 used to pass arguments into a function

◆ But inside the function they may be used for any purpose (they are scratch registers). R0 often delivers back the result

◆ Caller must expect R0–R3 contents to be trashed (i.e. over-written) when a function call returns.

◆ If caller doesn’t want this to happen then it must save R0–R3 contents beforehand (typically in memory).

◆ A typical simple leaf function, e.g. strlen (i.e. one which does not call any other function), provided it uses only R0–R3, only needs BL to jump in and MOV PC, LR to return
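As a rough illustration of the kind of leaf function meant here, the C sketch below counts the characters in a string without calling anything else. The name leaf_strlen is invented for this example; whether a given compiler really keeps everything in R0–R3 depends on the compiler and its settings, so treat that as a plausible outcome and not a guarantee.

```c
#include <stddef.h>

/* Hypothetical leaf function: it calls no other function, so under
   APCS it could typically do all its work in the scratch registers
   and return with a plain MOV PC, LR. */
size_t leaf_strlen(const char *s)
{
    size_t n = 0;              /* running count; would end up in R0 */

    while (s[n] != '\0') {     /* stop at the terminating NUL */
        n++;
    }
    return n;
}
```

Calling leaf_strlen("Hello") gives 5, and an empty string gives 0.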


Callee Saved Registers

◆ R4–R8 (R4–R10 in some variants of APCS) are registers whose contents any called function is required to preserve.

◆ Therefore they must have unchanged values when control returns to the calling routine (e.g. the main program)

◆ So if the called function needs these registers for extra workspace then it must save them first (hence: callee saved)

◆ Of course, if they have been saved then they must be restored before returning to the caller.

◆ Registers are limited in number. Memory has much larger capacity

◆ We need a disciplined way to save stuff in memory. The best solution is a stack


The Stack Concept

◆ A stack provides last in, first out storage

◆ It is a most important data structure in Computer Science

◆ Placing words on the stack is termed pushing

◆ Taking words off the stack is called popping


Stack Implementation Choices

◆ Do we grow the stack downwards (descending addresses) or upwards (ascending addresses) in memory?

◆ We need a stack pointer register (SP) to hold the address of the top of stack (this SP is R13 on the ARM)

◆ But should R13 point to the topmost filled location (stack full)?

◆ Or should it point to the next empty location just beyond the top of stack (stack empty)?

◆ No single ‘right answer’. But ARM, like many other systems, uses a “full descending” approach
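The "full descending" choice can be sketched in C with a small array standing in for memory. This is a toy model under stated assumptions: FdStack, fd_init, fd_push and fd_pop are names made up here, sp is a word index instead of a byte address, and there is no overflow or underflow checking.

```c
#include <stdint.h>

#define MEM_WORDS 8

typedef struct {
    uint32_t mem[MEM_WORDS];   /* toy word-addressed memory */
    int sp;                    /* index of the topmost FILLED word ("full") */
} FdStack;

/* An empty stack has sp just beyond the highest word. */
void fd_init(FdStack *s)
{
    s->sp = MEM_WORDS;
}

/* Push: move sp DOWN first, then store (decrement before). */
void fd_push(FdStack *s, uint32_t w)
{
    s->sp -= 1;
    s->mem[s->sp] = w;
}

/* Pop: load from sp first, then move sp UP (increment after). */
uint32_t fd_pop(FdStack *s)
{
    uint32_t w = s->mem[s->sp];
    s->sp += 1;
    return w;
}
```

Pushing 10 and then 20 leaves sp two words below its starting point, pointing at the 20; popping twice returns 20 and then 10, which is the last in, first out behaviour.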


Standard ARM C address space

◆ ARM C compilers generally arrange the memory address space as follows (listed from the top of memory downwards):

    top of memory
        stack
    stack pointer (sp)
        unused
    stack limit (sl)
    top of heap
        heap
    top of application
        static data         (part of the application’s image)
    static base (sb)
        code                (part of the application’s image)
    application base address


Multiple Loads and Stores

◆ If we want to store register values on the stack in memory it’s good to do this en bloc

◆ This is much more efficient than lots of individual STR and LDR instructions

◆ ARM supplies Load and Store Multiple instructions (LDM and STM) for just this purpose

◆ Just like the pre-index modes for single LDR/STR instructions we can use a base register as the indexer, with an option for write-back

◆ In a stack-based discipline we use SP (R13) as the memory indexer

◆ ARM assemblers support a range of suffixes for different stack regimes

◆ But the APCS uses ‘full descending’ STMFD and LDMFD options


Addressing modes and stack suffix options

◆ There are four addressing modes for multiple load/store instructions

    - IA: Increment After
    - IB: Increment Before
    - DA: Decrement After
    - DB: Decrement Before

Stack Orientated Suffixes

Stack type          Push             Pop
----------------    -------------    -------------
Full descending     STMFD (STMDB)    LDMFD (LDMIA)
Full ascending      STMFA (STMIB)    LDMFA (LDMDA)
Empty descending    STMED (STMDA)    LDMED (LDMIB)
Empty ascending     STMEA (STMIA)    LDMEA (LDMDB)

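[Figure: four memory diagrams, high addresses at the top and low addresses at the bottom, showing where LDMxx R10, {R0, R1, R4} or STMxx R10, {R0, R1, R4} places the registers relative to the base register for each mode: (a) IA, (b) IB, (c) DA, (d) DB. In every panel the registers appear in the order R4, R1, R0 from top to bottom.]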

◆ We need only the first line of the above table (and diagrams (a) and (d))
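The pairing of STMDB with LDMIA for a full descending stack can be checked with a little address arithmetic. The C sketch below is only a model of the address calculation, not an emulator; LsmMode, lsm_lowest and lsm_writeback are names invented for this illustration, and each register is assumed to occupy one 4-byte word.

```c
#include <stdint.h>

typedef enum { LSM_IA, LSM_IB, LSM_DA, LSM_DB } LsmMode;

/* Lowest address touched when n registers are transferred.
   The lowest-numbered register always uses this address. */
uint32_t lsm_lowest(LsmMode mode, uint32_t base, uint32_t n)
{
    switch (mode) {
    case LSM_IA: return base;               /* base .. base + 4n - 4     */
    case LSM_IB: return base + 4;           /* base + 4 .. base + 4n     */
    case LSM_DA: return base - 4 * n + 4;   /* base - 4n + 4 .. base     */
    default:     return base - 4 * n;       /* DB: base - 4n .. base - 4 */
    }
}

/* Value written back to the base register when "!" is used. */
uint32_t lsm_writeback(LsmMode mode, uint32_t base, uint32_t n)
{
    if (mode == LSM_IA || mode == LSM_IB) {
        return base + 4 * n;
    }
    return base - 4 * n;
}
```

With a base of 0x1000 and three registers, a DB store starts at 0x0FF4 and writes 0x0FF4 back; an IA load from 0x0FF4 reads the same three words and writes 0x1000 back, so the push and the pop undo each other.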


Multiple Loads and Stores — Details

◆ In the Full Descending scheme a multiple store (STMFD) corresponds to pushing register contents onto the stack

◆ Conversely a multiple load (LDMFD) corresponds to a pop from the stack

◆ These operations could use the mnemonics STMDB and LDMIA if preferred

◆ Let’s assume we want to retrieve data from the stack into registers

◆ Consider LDMFD SP, {R0-R3}. Here the SP holds the base address

◆ The overall effect is equivalent to:

        LDR R0, [SP]
        LDR R1, [SP, #4]
        LDR R2, [SP, #8]
        LDR R3, [SP, #12]

◆ But notice, in the above sequence, that SP itself has not been changed

◆ If we want SP to be altered (and we usually will) we write LDMFD SP!, {R0-R3}
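The difference between LDMFD SP, {...} and LDMFD SP!, {...} can be mimicked in C. In this sketch, model_ldmfd is a hypothetical helper, memory is a plain array of words, and sp is a word index (so one step of the index corresponds to 4 bytes).

```c
#include <stdint.h>

/* Copy `count` consecutive words, starting at mem[sp], into regs[0..].
   Returns the new stack pointer: unchanged without write-back,
   moved up past the loaded words with write-back (the "!" form). */
int model_ldmfd(const uint32_t *mem, int sp,
                uint32_t *regs, int count, int writeback)
{
    int i;

    for (i = 0; i < count; i++) {
        regs[i] = mem[sp + i];    /* like LDR Ri, [SP, #4*i] */
    }
    return writeback ? sp + count : sp;
}
```

Loading four words from index 0 without write-back returns 0; the same load with write-back returns 4, which is the "pop" that actually releases the stack space.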


Stack Frames and Link Registers — Details

◆ Data stored on the stack as part of a function call forms part of the stack frame for that function invocation.

◆ A stack frame can have stored register values, and also allocated space for local variables declared within the function

◆ The stack frame also stores ‘housekeeping’ information e.g. the current value of the LR. (We’ll see why shortly)

◆ When a procedure is exited and we return to the caller of the function, then the whole stack frame content must be popped.

◆ This is why local variables vanish once a function is exited

◆ When doing Load/Store Multiple we generally give a list of registers in curly braces e.g. LDMFD SP, {R1-R4, LR}

◆ Remember: lowest-address item goes to the lowest numbered register


Storing the Link Register — Details

◆ Recall: if we are in a leaf function (which doesn’t call anything else) we don’t need to store the LR. But in all other cases we do! Why?

main            func1                           func2
  . . .           STMFD sp!, {regs, lr}           . . .
  BL func1        . . .                           . . .
  . . .           BL func2                        MOV pc, lr
                  . . .
                  LDMFD sp!, {regs, pc}

◆ The BL func1 in main stores the return address in LR (R14)

◆ But then the BL func2 inside func1 overwrites it

◆ So func2 returns to func1 OK but if func1 returns to main using MOV PC, LR then LR would be wrong!
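The overwriting problem can be imitated in C by treating the link register as an ordinary variable. This is a deliberately crude model: the variable lr, the two return-address tags and all the function names are assumptions made for the illustration, and real return addresses are of course code addresses, not small integers.

```c
enum { RET_TO_MAIN = 1, RET_TO_FUNC1 = 2 };

static int lr;              /* stands in for R14 */
static int stack_mem[4];    /* tiny stack for saved LR values */
static int sp = 4;          /* full descending: starts just above the array */

/* Leaf function: nothing to save, it simply returns via LR. */
void func2(void)
{
}

/* func1 WITHOUT saving LR: reports where MOV PC, LR would go. */
int func1_unsafe(void)
{
    lr = RET_TO_FUNC1;      /* BL func2 overwrites LR */
    func2();
    return lr;              /* MOV PC, LR uses the overwritten value */
}

/* func1 WITH the push/pop pair from the diagram. */
int func1_safe(void)
{
    int pc;

    stack_mem[--sp] = lr;   /* STMFD sp!, {lr} */
    lr = RET_TO_FUNC1;      /* BL func2 overwrites LR */
    func2();
    pc = stack_mem[sp++];   /* LDMFD sp!, {pc} */
    return pc;
}

/* main's side of the call: BL func1 sets LR, then func1 runs. */
int call_from_main(int (*func1)(void))
{
    lr = RET_TO_MAIN;
    return func1();
}
```

In this model call_from_main(func1_unsafe) yields RET_TO_FUNC1, i.e. func1 would "return" into itself, whereas call_from_main(func1_safe) yields RET_TO_MAIN as intended.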


Storing the Link Register — More Details

◆ We definitely need to stack the LR value for all non-leaf functions!

◆ Note the stack frame push and pop instructions at start and end of func1

◆ Note how the LDMFD asks that the stored LR value be put back into PC

◆ This causes instantaneous return to main. Cute!

◆ This kind of trick can be used for ‘tail continued’ functions

◆ However, we usually have some ‘clearing up’ to do before we can return

◆ Let’s look at a real example of the situation in the previous diagram

◆ We’ll use strchr (see later slide) as our ‘leaf function’

◆ This program is a pin-number generator using a character as the ‘seed’


The leaf function version of strchr

◆ The index of the first occurrence of a given character within a string is found using strchr

◆ For example the index of ‘o’ in ‘Hello’ is 4 (indexing from 0)

◆ The final coursework gives you a C version of strchr and asks you to convert it to ARM assembler.

◆ Let’s assume that this routine has been written and that it expects, on entry, that R1 contains the start address of the string

◆ Also assume that R2 contains the character to be searched for

◆ The index value will be returned in R0
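For orientation, an index-returning search of this kind might look as follows in C. This is not the coursework version, which is not reproduced in these slides; the name str_index and the choice of -1 for "not found" are assumptions made here. (Note that the standard C library's strchr returns a pointer, not an index.)

```c
/* Hypothetical index-returning search: returns the position of the
   first occurrence of ch in s, counting from 0. */
int str_index(const char *s, char ch)
{
    int i;

    for (i = 0; s[i] != '\0'; i++) {
        if (s[i] == ch) {
            return i;
        }
    }
    return -1;   /* assumed convention for "character not present" */
}
```

With this sketch str_index("Hello", 'o') is 4, matching the example above, and str_index("Hello", 'l') is 2 because only the first occurrence counts.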


The func1 function

◆ We save, on the stack frame, R4-R8 (which APCS says we must preserve) and also LR

◆ Main program. PIN code issued is current year-number (2011) plus input character’s index position in the chosen string. Returned in R0

func1   STMFD   SP!, {R4-R8, LR}
        ; strchr trashes R4 and lots of other stuff may be added
        ; here, later, that may well trash R5-R8 (which APCS says
        ; we must save). We now get ready to call strchr
        ; R1-3 untouched so should be OK
        BL      strchr              ; expects str. address in R1 and ch. in R2
        ADD     R0, R0, R5
        LDMFD   SP!, {R4-R8, PC}    ; restore R4-R8 and return result in R0


Global strings and main prog.

◆ Here are the global string declarations and the main program

stack   EQU     0x1000
        B       main
mesg1   DEFB    "the quick brown fox jumps over the lazy dog\0"
mesg2   DEFB    "Please type a single lower-case alphabetic character: \0"
mesg3   DEFB    "\nOK - your pin number is \0"
        ALIGN
main    ADR     R0, mesg2
        SWI     3
        SWI     1                   ; get the character from keyboard
        MOV     R2, R0              ; seed char now in R2
        ADR     R1, mesg1
        ADR     R0, mesg3
        SWI     3                   ; OK - your pin number is
        LDR     R5, =2011           ; not possible with a MOV
        MOV     SP, #stack
        BL      func1
        SWI     4                   ; print out pin number
        SWI     2
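To see what the assembler program computes, here is a rough C equivalent of the PIN calculation only (no keyboard input or printing). It is a sketch under assumptions: make_pin is a name invented here, it uses the standard library strchr (which returns a pointer, so the index is obtained by pointer subtraction), and returning -1 for a character that is not in the string is a convention chosen for this example.

```c
#include <string.h>

#define BASE_YEAR 2011   /* the value loaded into R5 by the main program */

/* PIN = year-number + index of the seed character in the pangram. */
int make_pin(char seed)
{
    static const char text[] = "the quick brown fox jumps over the lazy dog";
    const char *hit = strchr(text, seed);

    if (hit == NULL) {
        return -1;       /* assumed behaviour when the seed is absent */
    }
    return BASE_YEAR + (int)(hit - text);
}
```

For the seed 't' (index 0) this gives 2011, and for 'q' (index 4) it gives 2015.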


Notes (+ the stack picture)

◆ Registers R1, R2 and R5 contain vital info. for func1

◆ Notice that R1 and R2 are passed over into strchr

◆ Returned value from strchr (in R0) is added to the year-number in R5 inside func1, leaving the result in R0

◆ Be clear that after the STMFD SP!, {R4-R8, LR} ‘push’ the stack looks like:

            High addresses
              . . .
              LR
              R8
              R7
              R6
              R5
    SP ->     R4
            Low addresses


Coping with recursion

◆ A recursive function is one that calls itself.

◆ Recursive function theory is of enormous importance for Maths and CS

◆ There has to be a way of escaping from the recursion. Otherwise it will go on for ever (consuming CPU time and memory)

◆ The classic example is the factorial function defined as follows:

        factorial (n) = n × factorial (n − 1)
        factorial (0) = 1

◆ Thus, factorial(4) = 4 × 3 × 2 × 1 × 1 = 24

◆ Here’s how it is expressed in C:

        int factorial (int n)
        {
            if (n == 0)
                return 1;
            else
                return n * factorial (n - 1);
        }


More about recursion

◆ For more information see my ‘Notes on Recursion’ handout

◆ Let’s look at how to do recursion in ARM assembler

◆ And then afterwards be very thankful that the C compiler lets us write the version that was on the last slide!

◆ One of the simplest examples is factorial so let’s do that

◆ The stack will build up a lot of instances of n in separate stack frames waiting to be consumed and multiplied together

◆ If a function calls itself it has to be written with extraordinary care to be general enough to cope with:
    - Initial case when called from main
    - Final case when local instance of n has value 0

◆ Program we give next takes input argument in R1 and delivers result in R0


The factorial program

stack       EQU     0x1000
input       EQU     6
            B       main                ; skip over the string data
result      DEFB    " factorial is "
            ALIGN

factorial   CMP     R1, #0
            MOVEQ   R1, #1
            BEQ     exit                ; base case -- no need for new frame
            STMFD   SP!, {R1, LR}
            SUB     R1, R1, #1
            BL      factorial
            LDMFD   SP!, {R1, LR}       ; restore R1 and LR
exit        MUL     R0, R0, R1          ; answer builds up in R0
            MOV     PC, LR

main        MOV     R1, #input
            MOV     SP, #stack
            MOV     R0, R1
            SWI     4                   ; print the input value
            ADR     R0, result
            SWI     3                   ; print " factorial is "
            MOV     R0, #1              ; running product starts at 1
            BL      factorial
            SWI     4                   ; print the result
            SWI     2
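One way to follow what the assembler routine does is to mirror it in C with an explicit array playing the part of the stack. This is an interpretation offered as a sketch, not a translation produced by any compiler: factorial_explicit and MAX_N are names invented here, LR is not modelled, and inputs above 12 are rejected (returning 0 by assumption) because 13! no longer fits in 32 bits.

```c
#include <stdint.h>

#define MAX_N 12   /* 12! = 479001600 is the largest factorial below 2^32 */

uint32_t factorial_explicit(uint32_t n)
{
    uint32_t frames[MAX_N];   /* one saved copy of n per "stack frame" */
    int sp = MAX_N;           /* full descending: starts above the array */
    uint32_t r0 = 1;          /* MOV R0, #1 in main */
    uint32_t r1 = n;          /* input argument arrives in R1 */

    if (n > MAX_N) {
        return 0;             /* assumed error value for this sketch */
    }

    /* Going down: each non-zero n is pushed, then n - 1 is "called". */
    while (r1 != 0) {         /* CMP R1, #0 */
        frames[--sp] = r1;    /* STMFD SP!, {R1, LR} */
        r1 -= 1;              /* SUB R1, R1, #1 ; BL factorial */
    }

    /* Base case: n == 0 contributes a factor of 1. */
    r1 = 1;                   /* MOVEQ R1, #1 */
    r0 *= r1;                 /* exit: MUL R0, R0, R1 */

    /* Coming back up: pop each saved n and multiply it in. */
    while (sp < MAX_N) {
        r1 = frames[sp++];    /* LDMFD SP!, {R1, LR} */
        r0 *= r1;             /* exit: MUL R0, R0, R1 */
    }
    return r0;
}
```

For the program's input of 6 this returns 720; the saved values are multiplied in the order 1, 2, 3, 4, 5, 6 as the frames are popped.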


Example stack frames

◆ Diagrams below show:
    - (a) build up of simple stack frames for factorial
    - (b) more general block diagram of typical stack frame

(a)         High addresses
              . . .
              LR
              3
              LR
              2
              LR
    SP ->     1
            Low addresses

(b)         High addresses
              . . .
    FP ->     LR
              Saved Registers
    SP ->     Local variables
              . . .
            Low addresses


More about stack management

◆ Note the factorial stack contains different instances of n

◆ Generating correct code for stack-frame handling is the compiler’s job

◆ Things like factorial, Fibonacci and Ackermann are increasingly tough tests of your compiler’s handling of recursion!

◆ Stack frames can be cleared down by LDMFD ‘pop’ operations

◆ But also useful to have a Frame Pointer (FP) to start of current frame (FP is R11 in the APCS scheme)

◆ Quick clear down of a frame can be done with MOV SP, FP

◆ If arguments and local variables are kept on stack frames what about global (and static) variables? Answer: you need something like DEFW

◆ Start point of static variable area can be kept in the static base register (R9 on ARM)
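The distinction between stack-frame storage and static storage is visible from C itself. In the sketch below (function names invented for the illustration) the static counter lives in the static data area and survives between calls, while the ordinary local is recreated in a fresh stack frame, or a register, on every call.

```c
/* Static storage: one copy for the whole program run. */
int count_with_static(void)
{
    static int calls = 0;   /* initialised once, kept between calls */

    calls++;
    return calls;
}

/* Automatic storage: a new copy for each invocation. */
int count_with_local(void)
{
    int calls = 0;          /* re-initialised on every call */

    calls++;
    return calls;
}
```

Calling count_with_static twice gives 1 and then 2, whereas count_with_local gives 1 both times, because its variable vanishes when the frame is popped.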