![Page 1: ARM Memory - University of New Brunswickowen/courses/2253-2017/slides/04-arm-memory.pdf · ARM Memory Owen Kaser, CS2253 Mostly corresponds to book Chapter 5. Overview Loads and Stores](https://reader030.vdocument.in/reader030/viewer/2022040720/5e2b8c2e3ea41c5d87225d4e/html5/thumbnails/1.jpg)
ARM Memory
Owen Kaser, CS2253
Mostly corresponds to book Chapter 5.
Overview
● Loads and Stores
● Memory Maps
● Register-Indirect Addressing
● Post- and Pre-indexed Addressing
16 Registers is Not Enough
● So far, the only places discussed for data are the ARM's CPU registers
● Most interesting programs need more data.
● We need memory outside the CPU for our bulk data storage.
● Also, memory can contain pre-computed tables (e.g., of trig functions) that are never altered.
● For your toaster's software, the machine code can be set at the factory. Fancy toaster: you can “flash” your toaster with improved software.
Loads and Stores
● Recall that ARM is a “load/store” architecture. Cannot directly do calculations on values in memory. Have to load them into a CPU register to use them as inputs.
● Similarly, calculations put results into registers. Then you can use a store instruction to put them into memory.
● Loads and stores need to specify where in memory things should go. This will be a numeric “memory address”.
● (Memory) addressing modes are small built-in calculations the CPU can do, to compute the memory address.
● Simple case: value in, say, R3 is to be used as the address.
System Memory Maps
● A system built around an ARM7TDMI processor uses 32-bit values as memory addresses. Each address would correspond to a byte (oops, octet).
● The overall “memory address space” ranges from 0 to 0xFFFFFFFF.
● But the overall memory address space is further subdivided (boundaries are often small multiples of powers of 2)
● RAM, ROM, flash, and I/O devices can be given their own subdivisions.
● More on I/O devices later in the course. For now, just realize that some memory addresses accept stores, and some ignore them.
Ex. Memory Map (extracts from book Table 5.1)

| Start | End | Description |
|---|---|---|
| 0x00000000 | 0x0003FFFF | On-chip flash |
| 0x00040000 | 0x00FFFFFF | reserved |
| 0x01000000 | 0x1FFFFFFF | ROM |
| 0x20000000 | 0x20007FFF | (Static) RAM |
| … | … | … |
| 0x4000C000 | 0x4000CFFF | UART 0 (a “serial port”) device |
| … | … | … |
| 0xE0001000 | 0xE0001FFF | “data watchpoint and trace” (DWT) facility |
| … | … | … |
| 0xE0004000 | 0xFFFFFFFF | reserved |
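
To make the idea of a subdivided address space concrete, here is a small Java sketch that looks up which region an address falls in. It only knows the rows listed in the table above (the elided rows are left out, not guessed), and the class and method names are made up for illustration.

```java
final class MemoryMapDemo {
    // Inclusive {start, end} pairs, copied from the listed rows of the table.
    static final long[][] RANGES = {
        {0x00000000L, 0x0003FFFFL},
        {0x00040000L, 0x00FFFFFFL},
        {0x01000000L, 0x1FFFFFFFL},
        {0x20000000L, 0x20007FFFL},
        {0x4000C000L, 0x4000CFFFL},
        {0xE0001000L, 0xE0001FFFL},
        {0xE0004000L, 0xFFFFFFFFL},
    };
    static final String[] DESCRIPTIONS = {
        "On-chip flash", "reserved", "ROM", "(Static) RAM",
        "UART 0", "DWT facility", "reserved",
    };

    // Returns the description of the region containing the address,
    // or "not listed" if the address is in a row the extract omits.
    static String describe(long address) {
        for (int i = 0; i < RANGES.length; i++) {
            if (address >= RANGES[i][0] && address <= RANGES[i][1]) {
                return DESCRIPTIONS[i];
            }
        }
        return "not listed";
    }

    public static void main(String[] args) {
        System.out.println(describe(0x20000100L)); // an address in static RAM
        System.out.println(describe(0x4000C004L)); // an address in the UART 0 block
    }
}
```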
For Simplicity....
● Let's only mess with addresses in a range that corresponds to RAM.
● Then, loads and stores both make sense.
Register-Indirect Addressing Mode
● Let's suppose you want to load the byte at address 0x00005000 into register R3.
● An 8-bit value goes into a 32-bit register. If we want the 8-bit value to be zero-extended, use the LDRB instruction.
● If you want it sign-extended, use LDRSB.
● Simplest case: a register stores the address of some data you care about. Let's go for R1.
● Assembler:
        MOV  R1, #0x00005000   ; address to R1
        LDRB R3, [R1]          ; memory value to R3
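
The difference between zero-extension (LDRB) and sign-extension (LDRSB) can be seen in Java, whose `byte` type is signed. This is only a sketch: memory is modelled as a `byte[]`, and the stored value 0x9C is an arbitrary example.

```java
final class ByteExtendDemo {
    // Models LDRB: the byte lands in the low 8 bits, the upper 24 bits are zero.
    static int loadByteZeroExtended(byte[] memory, int address) {
        return memory[address] & 0xFF;
    }

    // Models LDRSB: bit 7 of the byte is copied into the upper 24 bits.
    // Java's byte-to-int widening already sign-extends.
    static int loadByteSignExtended(byte[] memory, int address) {
        return memory[address];
    }

    public static void main(String[] args) {
        byte[] memory = new byte[0x6000];
        memory[0x5000] = (byte) 0x9C;   // arbitrary example value with bit 7 set
        System.out.println(Integer.toHexString(loadByteZeroExtended(memory, 0x5000)));
        System.out.println(Integer.toHexString(loadByteSignExtended(memory, 0x5000)));
    }
}
```

For a byte with bit 7 clear (say 0x1C) the two loads give the same result; they only differ when the byte would be negative as a signed value.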
Looping Through Memory
● Let's suppose you want to wipe clear (to 0) the contents of all memory locations from 0x00005000 to 0x00005FFF.
● A loop will work nicely.
        MOV  R1, #0x00005000   ; starting location
        MOV  R2, #0x00006000   ; when to stop
        MOV  R3, #0
    LP  STRB R3, [R1]          ; wipe clear current location's value
        ADD  R1, R1, #1        ; advance to next location
        TEQ  R1, R2            ; has R1 hit the stopping location?
        BNE  LP
….
Speeding It Up
● If the area to be cleared is properly aligned (starts on a multiple of 4) and is the right size (a multiple of 4) we can clear out 4 consecutive addresses with one STR (store word) instruction.
● Recall that a 32-bit word is stored across 4 addresses: A, A+1, A+2, A+3.
Faster Code
        MOV  R1, #0x00005000   ; starting location
        MOV  R2, #0x00006000   ; when to stop
        MOV  R3, #0            ; 4 bytes of zeros
    LP  STR  R3, [R1]          ; wipe clear current location's value AND the next 3 locations' values
        ADD  R1, R1, #4        ; advance to location of next group of 4 bytes
        TEQ  R1, R2            ; has R1 hit the stopping location?
        BNE  LP
● Loop runs only ¼ as many times now.
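
A quick way to check the “¼ as many times” claim is to model both loops and count iterations. The Java sketch below treats memory as a `byte[]` and mirrors the store/advance/test-at-the-bottom shape of the assembly; the method names are invented for this example.

```java
final class ClearLoopDemo {
    // Byte-at-a-time loop (STRB, then ADD #1). Returns the iteration count.
    static int clearByBytes(byte[] memory, int start, int stop) {
        int address = start;
        int iterations = 0;
        do {
            memory[address] = 0;        // STRB R3, [R1]
            address += 1;               // ADD R1, R1, #1
            iterations++;
        } while (address != stop);      // TEQ R1, R2 / BNE LP
        return iterations;
    }

    // Word-at-a-time loop (STR, then ADD #4). Needs an aligned start and a
    // length that is a multiple of 4, as the slide requires.
    static int clearByWords(byte[] memory, int start, int stop) {
        if (start % 4 != 0 || (stop - start) % 4 != 0) {
            throw new IllegalArgumentException("range must be word-aligned");
        }
        int address = start;
        int iterations = 0;
        do {
            for (int i = 0; i < 4; i++) {
                memory[address + i] = 0; // one STR covers A, A+1, A+2, A+3
            }
            address += 4;               // ADD R1, R1, #4
            iterations++;
        } while (address != stop);
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(clearByBytes(new byte[0x6000], 0x5000, 0x6000));
        System.out.println(clearByWords(new byte[0x6000], 0x5000, 0x6000));
    }
}
```

For the range 0x5000 to 0x5FFF the byte loop runs 4096 times and the word loop 1024 times.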
Even Faster
● The pattern of “use a register to provide a memory address, then update the register in preparation for the next loop” is extremely common.
● ARM designers created an addressing mode that does BOTH of these operations in a single instruction. It is called “post-indexed” addressing.
● STR R3, [R1], #4 is equivalent to
STR R3, [R1]
ADD R1, R1, #4
![Page 13: ARM Memory - University of New Brunswickowen/courses/2253-2017/slides/04-arm-memory.pdf · ARM Memory Owen Kaser, CS2253 Mostly corresponds to book Chapter 5. Overview Loads and Stores](https://reader030.vdocument.in/reader030/viewer/2022040720/5e2b8c2e3ea41c5d87225d4e/html5/thumbnails/13.jpg)
Textbook Figure 5.2
Even Faster Code
        MOV  R1, #0x00005000   ; starting location
        MOV  R2, #0x00006000   ; when to stop
        MOV  R3, #0            ; 4 bytes of zeros
    LP  STR  R3, [R1], #4      ; wipe 4, then advance “pointer” R1
                               ; (the separate ADD R1, R1, #4 is no longer needed)
        TEQ  R1, R2            ; has R1 hit the stopping location?
        BNE  LP
Java Pre- vs Post-Increment
● Can draw a parallel to Java's ++ operators.
● Recall, v = M[p++] in Java
– it uses the current value of p to index M
– then it increments p: post-increment.
● Versus v = M[++p] in Java
– it first increments p: pre-increment
– then the new value of p is used to index into M
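
The two Java forms can be tried directly. In this small sketch each helper returns both the value read and the final value of p, so the difference is visible; the array contents are arbitrary.

```java
final class IncrementDemo {
    // v = M[p++]: index with the old p, then increment. Returns {v, p}.
    static int[] postIncrement(int[] m, int p) {
        int v = m[p++];
        return new int[] {v, p};
    }

    // v = M[++p]: increment first, then index with the new p. Returns {v, p}.
    static int[] preIncrement(int[] m, int p) {
        int v = m[++p];
        return new int[] {v, p};
    }

    public static void main(String[] args) {
        int[] m = {10, 20, 30};
        int[] post = postIncrement(m, 0);
        int[] pre = preIncrement(m, 0);
        System.out.println("post: v=" + post[0] + " p=" + post[1]);
        System.out.println("pre:  v=" + pre[0] + " p=" + pre[1]);
    }
}
```

Both versions leave p at 1, but the post-increment form reads M[0] while the pre-increment form reads M[1].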
Post-Indexed Addressing
● In ARM, post-indexed addressing takes a base register. (Should not be R15.)
● Uses that base register's value to go to memory
● Then updates the base register's value by a little computation
– adding/subtracting a constant (earlier example)
– adding/subtracting a register
  ● which is allowed to be modified by the barrel shifter
  ● can be shifted/rotated by a constant amount
  ● (unlike data-processing instructions, loads and stores cannot take the shift amount from a register)
● Usefulness of fanciest of these seems doubtful
● LDR R1, [R2], R3, ROR #3 ; is this useful???
Useful? Example
● Java, for an int array M, variable x:
    j = 0;
    while (….) {
        sum += M[j];
        j += x;
    }
● ARM: suppose x in R2, start of M in R1
● In loop body: LDR R3, [R1], R2, LSL #2
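
To see why the shifted-register post-index matches the Java loop, the sketch below keeps a byte-offset “pointer” the way R1 would, and advances it by x shifted left by 2 (x times 4 bytes per int) after each load. The fixed iteration count stands in for the elided loop condition and is an assumption of this example only.

```java
final class StridedSumDemo {
    // Sums M[0], M[x], M[2x], ... for 'count' elements, pointer-style.
    static int stridedSum(int[] m, int x, int count) {
        int r1 = 0;                 // byte offset of M[0]; stands in for the address in R1
        int sum = 0;
        for (int i = 0; i < count; i++) {
            int r3 = m[r1 >> 2];    // load the word that R1 points at
            r1 += x << 2;           // post-index: R1 = R1 + (R2 LSL #2)
            sum += r3;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] m = {1, 2, 3, 4, 5, 6, 7};
        System.out.println(stridedSum(m, 3, 3)); // adds M[0], M[3], M[6]
    }
}
```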
Pre-Indexed Addressing
● There are two flavours of pre-indexed addressing. Both do a little computation and use the computed effective address to go to memory. In one, the base register is updated. Other flavour does not update.
● In assembly language, the ! symbol means to update the base register. Don't use R15 as the base register with !
● OK to use R15 without !. The value of R15 is 8 bytes beyond the start of the current instruction's machine code. [Details of why are a bit advanced.]
Rationale for the “little computations”
● PC-relative addressing for constants
● Getting a field of an object, given the start of the object.
● Indexing into array of objects, selecting a field (if the object size is a power of two)
● (Selected largely by analyzing what compilers for HLLs would find useful, I think...rather than focussing on assembly language programmers)
Pre-indexed Figure (Textbook)
● Instruction is STR r0, [r1, #12]
● Add ! to update r1 when finished:
    STR r0, [r1, #12]! ; r1 ← 0x20c
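
The three forms seen so far differ only in which address goes to memory and what the base register holds afterwards. The Java sketch below models just that bookkeeping. The starting base value 0x200 is an assumption chosen so that base + 12 gives 0x20C; it is not taken from the slide text.

```java
final class AddressingModeDemo {
    // [Rn, #offset] or [Rn, #offset]! : returns {addressUsed, baseAfter}.
    static int[] preIndexed(int base, int offset, boolean writeback) {
        int effectiveAddress = base + offset;
        int baseAfter = writeback ? effectiveAddress : base;
        return new int[] {effectiveAddress, baseAfter};
    }

    // [Rn], #offset : memory is accessed at the old base, then the base is updated.
    static int[] postIndexed(int base, int offset) {
        return new int[] {base, base + offset};
    }

    public static void main(String[] args) {
        int r1 = 0x200; // assumed starting value of the base register
        int[][] results = {
            preIndexed(r1, 12, false),  // STR r0, [r1, #12]
            preIndexed(r1, 12, true),   // STR r0, [r1, #12]!
            postIndexed(r1, 12),        // STR r0, [r1], #12
        };
        for (int[] r : results) {
            System.out.println("address=0x" + Integer.toHexString(r[0])
                    + " base after=0x" + Integer.toHexString(r[1]));
        }
    }
}
```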
Some Pre-indexed Examples
● MOV R1, #0x12345678 fails. The constant is not a rotation of an 8-bit value.
● Instead, initialize a memory location with your constant. Then use PC-relative addressing to load it.
●           LDR R1, myConst ; pseudo-op
            … 1000 bytes later...
    myConst DCD 0x12345678
● The LDR instruction is actually something like
            LDR R1, [PC, #996] ; PC was already 8 ahead
● 996 is close enough to PC. Must be within 4 KiB.
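
Both rules on this slide are easy to check mechanically. The Java sketch below tests whether a constant is an 8-bit value rotated right by an even amount (the classic ARM data-processing immediate rule), and computes a PC-relative offset given that the PC reads as the instruction's address plus 8. The instruction address 0x100 in the usage is made up, and reading “1000 bytes later” as 1000 bytes sitting between the 4-byte LDR and the constant is an assumption that happens to reproduce the 996.

```java
final class ImmediateDemo {
    // True if 'value' can be written as an 8-bit number rotated right by
    // an even amount (0, 2, ..., 30).
    static boolean isMovImmediate(int value) {
        for (int rotation = 0; rotation < 32; rotation += 2) {
            // Rotating left undoes a rotate-right by the same amount.
            if ((Integer.rotateLeft(value, rotation) & ~0xFF) == 0) {
                return true;
            }
        }
        return false;
    }

    // Offset to use in LDR Rd, [PC, #offset]; the PC reads as instruction + 8.
    static int pcRelativeOffset(int instructionAddress, int targetAddress) {
        return targetAddress - (instructionAddress + 8);
    }

    public static void main(String[] args) {
        System.out.println(isMovImmediate(0x00005000)); // 0x50 rotated: fits
        System.out.println(isMovImmediate(0x12345678)); // does not fit
        // Hypothetical layout: LDR at 0x100, then 1000 bytes, then the constant.
        System.out.println(pcRelativeOffset(0x100, 0x100 + 4 + 1000));
    }
}
```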
Ex: Field Access for an Object
● In HLLs, the fields of an object occupy consecutive memory addresses (possibly with padding)
● Let's suppose that an object starts at 1000. There are two 32-bit fields, then a 16-bit halfword field that we want to load into R2.
● Let's suppose that R1 contains the starting address of the object.
● Use LDRH R2, [R1, #8] ; immediate offset is 8
(Desired field starts 8 bytes later: gotta skip over first two words.)
● (Minor point: LDRH requires the offset to be within ±255)
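
The same layout can be imitated in Java with a `ByteBuffer` standing in for memory. This is a sketch under assumptions: little-endian byte order is assumed, the field values are arbitrary, and the helper names are invented.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class FieldAccessDemo {
    // Models LDRH Rd, [Rn, #offset]: read 16 bits at base + offset, zero-extended.
    static int loadHalfword(ByteBuffer memory, int base, int offset) {
        return memory.getShort(base + offset) & 0xFFFF;
    }

    // Builds an object at address 1000 (two 32-bit fields, then a 16-bit
    // field) and loads the halfword field.
    static int demo() {
        ByteBuffer memory = ByteBuffer.allocate(2048).order(ByteOrder.LITTLE_ENDIAN);
        int objectStart = 1000;                            // what R1 would hold
        memory.putInt(objectStart, 11);                    // first word field
        memory.putInt(objectStart + 4, 22);                // second word field
        memory.putShort(objectStart + 8, (short) 0xBEEF);  // the halfword we want
        return loadHalfword(memory, objectStart, 8);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(demo()));
    }
}
```

The offset 8 is just the combined size of the fields that come before the one being loaded, which is the kind of constant a compiler can work out from the class layout.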
Ex: Array Access
● Suppose R1 contains the starting address of an array.
● Suppose the array's elements are 4 bytes each
● To load the wth array element, we want address R1 + 4*w
● Suppose value w is in R2
● LDR R5, [R1, R2, LSL #2] loads desired value.
No ADR Pseudo-op
● The Crossware assembler does not seem to support ADR, which is used to put an address into a register (that you will then use as a base register). For instance, summing values in array…
                MOV  R0, #0            ; accumulate answer
                ADR  R1, MyArr         ; Keil pseudo-op
                ADR  R2, AfterMyArr    ; past last valid address
    LP          LDR  R3, [R1], #4
                ADD  R0, R0, R3
                TEQ  R1, R2
                BNE  LP
                …..
    MyArr       DCD  34, 23, 56, 78, 12345566, ……...
    AfterMyArr  DCB  0
Instead of ADR
● Instead of ADR, you should be able to do the following:
                MOV  R0, #0            ; accumulate answer
                LDR  R1, =MyArr
                LDR  R2, =AfterMyArr   ; past last valid address
    LP          LDR  R3, [R1], #4
                ADD  R0, R0, R3
                TEQ  R1, R2
                BNE  LP
                …..
    MyArr       DCD  34, 23, 56, 78, 12345566, ……...
    AfterMyArr  DCD  0                 ; wasted word, could avoid...
LDR As Pseudoinstruction
● LDR Rx, =value works for any 32-bit value (address or constant).
● It sets aside space in a “constant pool”, preinitialized to value. This constant pool is (by default) at the end of the current AREA.
● Then it generates machine code for a PC-relative LDR into Rx from this preinitialized location.
● Like a convenient DCD and LDR Rx, [PC, #something]
● See textbook Chapter 6.
![Page 27: ARM Memory - University of New Brunswickowen/courses/2253-2017/slides/04-arm-memory.pdf · ARM Memory Owen Kaser, CS2253 Mostly corresponds to book Chapter 5. Overview Loads and Stores](https://reader030.vdocument.in/reader030/viewer/2022040720/5e2b8c2e3ea41c5d87225d4e/html5/thumbnails/27.jpg)
Machine-Code Formats: LDR/STR/LDRB/STRB
● From reference manual:
![Page 28: ARM Memory - University of New Brunswickowen/courses/2253-2017/slides/04-arm-memory.pdf · ARM Memory Owen Kaser, CS2253 Mostly corresponds to book Chapter 5. Overview Loads and Stores](https://reader030.vdocument.in/reader030/viewer/2022040720/5e2b8c2e3ea41c5d87225d4e/html5/thumbnails/28.jpg)
Meaning of Some Bits (Ref Man)
Exercise/Example
● Determine machine code for
LDR R3, [R1], #4
and also
STRB R3, [R1, R2, LSR #5]!
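
One way to check a hand-worked answer is to assemble the fields in code. The Java sketch below packs the bits according to my reading of the classic ARM single-data-transfer format (cond in bits 31-28, 01 in bits 27-26, then I, P, U, B, W, L, Rn, Rd and a 12-bit offset field); treat the layout as an assumption and compare it against the reference manual figure. The condition is fixed at “always”, and the class and method names are invented.

```java
final class LoadStoreEncoder {
    static final int LSL = 0, LSR = 1, ASR = 2, ROR = 3;

    private static int bit(boolean set, int position) {
        return set ? (1 << position) : 0;
    }

    // Fields shared by both offset forms: cond=AL, bits 27-26 = 01, P U B W L, Rn, Rd.
    private static int common(boolean pre, boolean up, boolean byteAccess,
                              boolean writeback, boolean load, int rn, int rd) {
        return (0xE << 28) | (1 << 26)
                | bit(pre, 24) | bit(up, 23) | bit(byteAccess, 22)
                | bit(writeback, 21) | bit(load, 20)
                | ((rn & 0xF) << 16) | ((rd & 0xF) << 12);
    }

    // Immediate offset form (I = 0): low 12 bits hold the unsigned offset.
    static int encodeImmediate(boolean pre, boolean up, boolean byteAccess,
                               boolean writeback, boolean load,
                               int rn, int rd, int offset12) {
        return common(pre, up, byteAccess, writeback, load, rn, rd)
                | (offset12 & 0xFFF);
    }

    // Scaled register form (I = 1): shift amount, shift type, 0, Rm.
    static int encodeScaledRegister(boolean pre, boolean up, boolean byteAccess,
                                    boolean writeback, boolean load,
                                    int rn, int rd, int rm,
                                    int shiftType, int shiftAmount) {
        return common(pre, up, byteAccess, writeback, load, rn, rd)
                | (1 << 25)
                | ((shiftAmount & 0x1F) << 7) | ((shiftType & 0x3) << 5)
                | (rm & 0xF);
    }

    public static void main(String[] args) {
        // LDR R3, [R1], #4 : post-indexed, add, word, load
        int ldr = encodeImmediate(false, true, false, false, true, 1, 3, 4);
        // STRB R3, [R1, R2, LSR #5]! : pre-indexed, add, byte, writeback, store
        int strb = encodeScaledRegister(true, true, true, true, false, 1, 3, 2, LSR, 5);
        System.out.println(Integer.toHexString(ldr));
        System.out.println(Integer.toHexString(strb));
    }
}
```

Under that layout the two instructions come out as 0xE4913004 and 0xE7E132A2.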
Load and Store Multiple
● There are instructions LDM and STM that load or store a number of registers.
● With LDM, a bit vector in the machine code indicates which registers to load. They are loaded from consecutive addresses.
● STM works similarly
● They are especially useful in storing things on the runtime stack, and will be looked at when we cover that topic.
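
The bit-vector idea can be sketched in Java: bit r of a 16-bit mask is set when register Rr is in the list, and the listed registers take consecutive word addresses. The sketch assumes the “increment after” variant (LDMIA), where the lowest-numbered register gets the lowest address, starting at the base; other variants differ. Names and the base address are invented for the example.

```java
import java.util.Arrays;

final class LoadMultipleDemo {
    // Builds the register-list bit vector: bit r is set if Rr is listed.
    static int registerMask(int... registers) {
        int mask = 0;
        for (int r : registers) {
            mask |= 1 << r;
        }
        return mask;
    }

    // For an LDMIA-style load, returns the address each register is loaded
    // from, or -1 for registers that are not in the list.
    static int[] ldmiaAddresses(int base, int mask) {
        int[] addresses = new int[16];
        Arrays.fill(addresses, -1);
        int next = base;
        for (int r = 0; r < 16; r++) {
            if ((mask & (1 << r)) != 0) {
                addresses[r] = next;   // lowest register, lowest address
                next += 4;             // consecutive words
            }
        }
        return addresses;
    }

    public static void main(String[] args) {
        int mask = registerMask(0, 2, 5);   // register list {R0, R2, R5}
        int[] addresses = ldmiaAddresses(0x1000, mask);
        for (int r = 0; r < 16; r++) {
            if (addresses[r] != -1) {
                System.out.println("R" + r + " <- [0x" + Integer.toHexString(addresses[r]) + "]");
            }
        }
    }
}
```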