cpu risc example: armv1 arm1 arm arm2 armv2 arm2 … · apple iphone (original and 3g), apple ipod...


Post on 25-Apr-2020


Page 1: CPU RISC example: ARMv1 ARM1 ARM ARM2 ARMv2 ARM2 … · Apple iPhone (original and 3G), Apple iPod touch (1st and 2nd Generation) Motorola RIZR Z8, Motorola RIZR Z10 Nintendo 3DS

Reti Logiche Università degli studi di Udine

CPU RISC example:

ARM

ARM CPU families, architectures, and cores

Family               Architecture                        Cores
ARM1                 ARMv1                               ARM1 (April 1985; ~25K transistors)
ARM2                 ARMv2                               ARM2 (1986; ~30K transistors)
                     ARMv2a                              ARM250
ARM3                 ARMv2a                              ARM3
ARM6                 ARMv3                               ARM60, ARM600, ARM610
ARM7                 ARMv3                               ARM700, ARM710, ARM710a
ARM7TDMI             ARMv4T                              ARM7TDMI, ARM710T, ARM720T, ARM740T
ARM7EJ               ARMv5TEJ                            ARM7EJ-S
ARM8                 ARMv4                               ARM810
StrongARM            ARMv4                               SA-1
ARM9TDMI             ARMv4T                              ARM9TDMI, ARM920T, ARM922T, ARM940T
ARM9E                ARMv5TE                             ARM946E-S, ARM966E-S, ARM968E-S, ARM996HS
                     ARMv5TEJ                            ARM926EJ-S
ARM10E               ARMv5TE                             ARM1020E, ARM1022E
                     ARMv5TEJ                            ARM1026EJ-S
XScale               ARMv5TE                             XScale
ARM11                ARMv6                               ARM1136J-S
                     ARMv6T2                             ARM1156T2-S
                     ARMv6K                              ARM1176JZ-S, ARM11 MPCore
Cortex-A             ARMv7-A (Application profile)       Cortex-A5, Cortex-A8, Cortex-A9, Cortex-A15
Cortex-R             ARMv7-R (Real-time profile)         Cortex-R4, Cortex-R5, Cortex-R7
Cortex-M             ARMv6-M (Microcontroller profile)   Cortex-M0, Cortex-M1
                     ARMv7-M (Microcontroller profile)   Cortex-M3, Cortex-M4
Apple A6             ARMv7-A                             Apple A6
Qualcomm Snapdragon  ARMv7-A                             Scorpion, Krait
Cortex-A50           ARMv8-A                             Cortex-A53, Cortex-A57
Apple A7             ARMv8-A                             Apple A7
X-Gene               ARMv8-A                             X-Gene

The ARMv1, ARMv2, and ARMv3 architectures are obsolete.

Examples of ARM core applications

ARM1136J-S
• Kindle DX [Freescale i.MX31]
• Nokia phones (E63, E71, 5800, E51, 6700 Classic, 6120 Classic, 6210 Navigator, 6220 Classic, 6290, 6710 Navigator, 6720 Classic, E75, N97, N81) [Freescale MXC300-30]

ARM1176JZ(F)-S
• Apple iPhone (original and 3G), Apple iPod touch (1st and 2nd Generation)
• Motorola RIZR Z8, Motorola RIZR Z10
• Nintendo 3DS

Cortex-A8
• Apple iPhone (3GS and 4), Apple iPod touch (3rd and 4th Generation), Apple iPad [cpu: Apple A4]
• BeagleBoard
• Motorola (Droid, Droid X, Droid 2, Droid R2D2 Edition)
• Samsung (Omnia HD, Wave S8500, i9000 Galaxy S, P1000 Galaxy Tab)
• Sony Ericsson (Satio, Xperia X10)
• Nokia N900
• Google Nexus S

Cortex-A9
• Apple iPad 2 [cpu: Apple A5], Apple iPhone 4S [cpu: Apple A5]
• LG Optimus 2X
• Motorola (Atrix 4G, DROID BIONIC, Xoom)
• PandaBoard

Architecture overview

• 32-bit architecture size
  - Size of general-purpose registers
• 7-9 processor modes (depends on extensions)
  - 1 unprivileged mode
  - 6-8 privileged modes
• 31-34 general-purpose registers (depends on extensions)
• Special registers
  - Program status, memory management, processor configuration, ...

Instruction sets

• ARM
  - 32-bit instructions
  - Default instruction set
• Thumb
  - 16-bit instructions
  - High code density
  - Reduced performance
• Thumb-2
  - Thumb extended with 32-bit instructions
  - High code density
  - Good performance

[Figure: Thumb decompressor and ARM instruction decoder]

Extensions: instruction set extensions

Thumb
• 16-bit instructions
  - No conditional execution
  - Implicit operands
  - Only 8 registers available for many instructions
• Available in ARMv4T, ARMv5, ARMv6, ARMv7, ARMv8

Thumb-2
• Extends the Thumb instruction set
  - 32-bit instructions
  - 16-bit instructions
• Available in ARMv6T2, ARMv7, ARMv8

Jazelle Extension
• Java bytecode execution (Direct Bytecode Execution); each bytecode is handled in one of three ways
  - Hardware execution (~95%)
  - Interpreted as a short sequence of ARM instructions
  - Emulated via SW
• Required from ARMv6 onward, but a trivial implementation is allowed

ThumbEE Extension
• Extension of the Thumb instruction set
  - Required for ARMv7-A (backward compatibility)
  - Optional for ARMv7-R
• Suitable for JIT and AOT compilation
  - Java, C#, Perl, Python
• Meant as the successor of Jazelle
• Currently, usage is deprecated

VFP Extension
• VFPv1 (obsolete), VFPv2, VFPv3, VFPv4
• Optional
• Floating point support and registers
  - Single-precision floating point
  - Double-precision floating point

Advanced SIMD Extension (NEON)
• SIMDv1, SIMDv2
• Optional
• Additional SIMD instructions and registers
  - Integer
  - Single-precision floating point

Extensions: architecture extensions

Fast Context Switch Extension (FCSE)
• Modified address translation: VA → MVA → PA
  - Depends on the process ID
• Deprecated

Security Extension
• 2 security states (Secure, Non-secure)
• Additional processor mode: Monitor Mode (MON)
• Restrictions on changes of execution state
• Restrictions on memory accesses
• Available in ARMv6K, ARMv7-A
• Optional

Multiprocessing Extension
• New instruction: PLDW
• Changes to TLB and cache behavior
• Available in ARMv7-A, ARMv7-R, ARMv8
• Optional

Large Physical Address Extension
• Provides physical addresses of up to 40 bits (2^40 bytes)
• Requires the multiprocessing extension
• 3 levels of page tables
• Available in ARMv7, ARMv8
• Optional

Virtualization Extension
• Modified MMU behavior
• Additional instructions
• Additional processor mode: Hyp
• Requires the multiprocessing extension
• Requires the large physical address extension
• Requires the trivial Jazelle implementation
• Available in ARMv7-A, ARMv8
• Optional

Generic Timer Extension
• System timer (with low-latency access)
• Available in ARMv7-A, ARMv7-R
• Optional

Performance Monitor Extension
• Special registers with event counters (implementation dependent)
• Available in ARMv7
• Optional (recommended)

Processor modes

• USR: User Mode (for user level code execution)
• SVC: Supervisor Mode (for kernel level code execution)
  - activated on reset and when an SVC instruction is executed
• SYS: System Mode (similar to SVC, without banked registers)
  - used to read/modify SP and LR of User Mode from kernel code
• IRQ: Normal Interrupt Mode (for interrupt handling)
  - activated when the IRQ line is asserted
• FIQ: Fast Interrupt Mode (for fast interrupt handling)
  - activated when the FIQ line is asserted
• UND: Undefined Mode
  - activated when an invalid instruction is executed
• ABT: Abort Mode (for handling memory access faults)
  - activated when an instruction or data fetch is attempted from an invalid address (MMU fault): synchronous abort
  - activated on an external exception from the memory subsystem (e.g., parity error, unusable address): asynchronous abort
• Hyp: Hypervisor Mode (for management of virtualized systems)
  - only if virtualization extensions are present
• Monitor: Monitor Mode (for handling secure vs non-secure transitions)
  - only if security extensions are present

All modes except USR are privileged modes.

General Purpose registers

• Size: 32 bit
• Names: R0 - R15
  - R0 - R12: general purpose
      R8 - R12 are banked in FIQ mode
  - R13 (or SP): stack pointer (software rule)
      banked in all privileged modes except SYS
  - R14 (or LR): function return address
      banked in all privileged modes except SYS
  - R15 (or PC): program counter
      Points to current instruction + 8 (when executing ARM instructions)
      Points to current instruction + 4 (when executing Thumb instructions)
      Instructions can read and write PC
• Banked registers
  - duplicated copies of registers
  - available only in some processor modes

Register banks

Mode         Registers visible in the mode
User         R0-R12, SP, LR, PC, CPSR
System       R0-R12, SP, LR, PC, CPSR (same registers as User)
Supervisor   R0-R12, SP_svc, LR_svc, PC, CPSR, SPSR_svc
Abort        R0-R12, SP_abt, LR_abt, PC, CPSR, SPSR_abt
Undefined    R0-R12, SP_und, LR_und, PC, CPSR, SPSR_und
IRQ          R0-R12, SP_irq, LR_irq, PC, CPSR, SPSR_irq
FIQ          R0-R7, R8_fiq-R12_fiq, SP_fiq, LR_fiq, PC, CPSR, SPSR_fiq
Hyp          R0-R12, SP_hyp, LR, PC, CPSR, SPSR_hyp, ELR_hyp (with virtualization extension)
Monitor      R0-R12, SP_mon, LR_mon, PC, CPSR, SPSR_mon (with security extension)

Privileged modes: all modes except User.
Exception modes: Supervisor, Abort, Undefined, IRQ, FIQ, Hyp, Monitor.

Program status register

• Current Program Status Register: CPSR
  - Condition flags
  - Special flags
  - Exception mask bits
  - Execution state bits
  - Mode bits

Program status register format

Bits     Field
31       N
30       Z
29       C
28       V
27       Q
26:25    IT[1:0]
24       J
23:20    (reserved)
19:16    GE
15:10    IT[7:2]
9        E
8        A
7        I
6        F
5        T
4:0      M[4:0]

Program status register

Current Program Status Register: CPSR

• Condition flags: accessible (read and write) in all modes
  - N: negative flag
  - Z: zero flag
  - C: carry flag
  - V: overflow flag
• Special flags: accessible (read and write) in all modes
  - Q: overflow or saturation flag for saturating arithmetic
  - GE: Greater than or Equal flags for SIMD instructions
• Exception mask bits: accessible (read: in all modes; write: in privileged modes)
  - A: Asynchronous abort disable
      0: asynchronous abort exceptions enabled
      1: asynchronous abort exceptions masked
      An asynchronous abort is an external exception from the memory subsystem (e.g., parity error, unusable address). MMU exceptions are not asynchronous aborts: they are always synchronous.
  - I: Interrupt disable
      0: IRQ exceptions enabled
      1: IRQ exceptions masked
  - F: Fast interrupt disable
      0: FIQ exceptions enabled
      1: FIQ exceptions masked
• Execution state bits
  - IT[7:0]: If-Then (for the Thumb IT instruction)
      not accessible (reads as zero, writes are ignored or unpredictable)
  - J: Jazelle
      not accessible (reads as zero, writes are ignored or unpredictable)
  - E: Endianness
      accessible (read: in all modes; write: in privileged modes)
      access is deprecated
  - T: Thumb
      not accessible (reads as zero, writes are ignored or unpredictable)
• Mode bits: accessible (read: in all modes; write: in privileged modes)
  - M[4:0]: mode
      10000: USR
      10001: FIQ
      10010: IRQ
      10011: SVC
      10111: ABT
      11011: UND
      11111: SYS
      11010: HYP
      10110: MON
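The field layout and mode encodings listed above can be exercised with a small decoder. This is a minimal sketch in Python, not part of the original slides: the helper name `decode_cpsr` and the sample value `0x600001D3` are assumptions chosen for illustration.

```python
# Mode encodings M[4:0] as listed in the slides.
MODES = {
    0b10000: "USR", 0b10001: "FIQ", 0b10010: "IRQ", 0b10011: "SVC",
    0b10110: "MON", 0b10111: "ABT", 0b11010: "HYP", 0b11011: "UND",
    0b11111: "SYS",
}

def decode_cpsr(value):
    """Split a 32-bit CPSR value into named fields (hypothetical helper)."""
    def bit(n):
        return (value >> n) & 1
    it_low = (value >> 25) & 0b11        # IT[1:0] sits in bits 26:25
    it_high = (value >> 10) & 0b111111   # IT[7:2] sits in bits 15:10
    return {
        "N": bit(31), "Z": bit(30), "C": bit(29), "V": bit(28), "Q": bit(27),
        "J": bit(24),
        "GE": (value >> 16) & 0xF,
        "IT": (it_high << 2) | it_low,
        "E": bit(9), "A": bit(8), "I": bit(7), "F": bit(6), "T": bit(5),
        "mode": MODES.get(value & 0x1F, "reserved"),
    }

fields = decode_cpsr(0x600001D3)
print(fields["mode"], fields["Z"], fields["C"], fields["I"], fields["F"])
# prints: SVC 1 1 1 1
```

The sample value has Z and C set, all three exception mask bits set, and the mode field equal to 10011, so it decodes as Supervisor mode with interrupts masked.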

Program status register

• APSR
  - the portion of CPSR useful in USR mode
• E, A, I, F, M are readable in USR mode
  - the access is deprecated in USR mode
  - reading E is deprecated in all modes
• IT, J, T are readable in USR mode
  - reads return 0 in all modes

[Figure: program status register format (same bit layout as CPSR). IT, J, and T read as zero; reading E, A, I, F, and M is possible but deprecated.]

Program status register

• Saved Program Status Register: SPSR
  - only in privileged modes
  - one for each mode, except SYS
      SPSR_svc, SPSR_irq, SPSR_fiq, SPSR_und, SPSR_abt, SPSR_hyp, SPSR_mon
  - accessible (read and write)

ARM: program status register

User mode:
• only changes to flags
• no saved copy
• examples:
    MRS R0, CPSR      load R0 with the content of CPSR (the accessible bits)
    MSR CPSR_f, R0    write the flag portion of CPSR

Privileged modes:
• examples:
    MSR CPSR_f, R0    write the flag portion of CPSR
    MSR CPSR, R0      write CPSR
    MSR SPSR, R0      write SPSR (if it exists)
    MSR SPSR_c, R0    write the lowest byte of SPSR (if it exists)

ARM structure

[Figure: ARM datapath. Blocks: address register (A[31:0]), incrementer, register bank (with PC), barrel shifter, multiplier, ALU, data-out register, data-in register, instruction decode and control; connected by the A bus, B bus, and ALU bus.]

[Figure: ARM pipeline organization. Stages: fetch, instruction decode, execute, buffer/data, write-back. Labeled elements: I-cache, I-decode, register read, immediate fields, shift, ALU, forwarding paths, D-cache, byte replication, rotate/sign extend, register write. Next-pc sources: pc+4, pc+8, r15, B/BL, MOV pc, SUBS pc, LDR pc; load/store address with LDM/STM, post-index, and pre-index paths.]

ARM: exception handling

Exceptions

Exception               Cause
Reset                   External reset asserted
Undefined instruction   Undefined or invalid instruction executed
Supervisor call         SVC instruction executed
Prefetch abort          Invalid address during instruction fetch
Data abort              Invalid address during data fetch
IRQ                     External interrupt asserted
FIQ                     External fast interrupt (high priority) asserted

Exception handling

Other exceptions (not considered here)

Only if security extensions are present:
• Secure Monitor Call
  - SMC instruction executed

Only if virtualization extensions are present:
• Hypervisor Call
  - HVC instruction executed
• Hyp Trap
  - privileged instruction executed in a virtual machine
• Virtual Abort
  - external asynchronous abort within a virtual machine
• Virtual IRQ
  - Virtual IRQ generated in a virtual machine
• Virtual FIQ
  - Virtual FIQ generated in a virtual machine

ARM: exception handling

Behavior:
1. Change processor mode
2. Save CPSR (to SPSR of the new mode)
3. Save return address in LR of the new mode
4. Mask IRQ exception
5. Mask other exceptions if needed (depends on exception)
6. Jump to a fixed address (depends on exception)

ARM: exception handling

Behavior:

Exception               PC             New mode   Additional masking
Reset                   EBASE + 0x00   SVC        asynchronous abort and FIQ
Undefined instruction   EBASE + 0x04   UND        -
Supervisor call         EBASE + 0x08   SVC        -
Prefetch abort          EBASE + 0x0C   ABT        asynchronous abort
Data abort              EBASE + 0x10   ABT        asynchronous abort
IRQ                     EBASE + 0x18   IRQ        -
FIQ                     EBASE + 0x1C   FIQ        asynchronous abort and FIQ

ARM: exception handling

Behavior:
• Exception base address
  - Bit V of special register SCTLR (System Control Register)
      0: EBASE = 0 (default)
      1: EBASE = 0xFFFF0000 (Hivecs)
• Vectored interrupt support
  - Vendor implementation dependent
  - Several IRQ and FIQ lines
  - Each line has its own priority and exception address
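The vector offsets, target modes, and the EBASE rule from the exception handling slides can be combined into one lookup. The sketch below is illustrative only: the function name `exception_entry` and the exception keys are assumptions, and vendor-specific vectored interrupt controllers are not modeled.

```python
# Offsets and target modes as given in the exception handling slides.
VECTORS = {
    "reset":          (0x00, "SVC"),
    "undefined":      (0x04, "UND"),
    "supervisor":     (0x08, "SVC"),
    "prefetch_abort": (0x0C, "ABT"),
    "data_abort":     (0x10, "ABT"),
    "irq":            (0x18, "IRQ"),
    "fiq":            (0x1C, "FIQ"),
}

def exception_entry(kind, sctlr_v):
    """Return (new PC, new mode) for an exception (hypothetical helper).

    sctlr_v is the V bit of SCTLR: 0 selects EBASE = 0, 1 selects Hivecs.
    """
    ebase = 0xFFFF0000 if sctlr_v else 0x00000000
    offset, mode = VECTORS[kind]
    return ebase + offset, mode

pc, mode = exception_entry("irq", sctlr_v=1)
print(hex(pc), mode)
# prints: 0xffff0018 IRQ
```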

Memory model

Virtual Memory System Architecture (VMSA)
• MMU: Memory Management Unit
  - Address translation
  - Memory protection

Protected Memory System Architecture (PMSA) (not considered here)
• MPU: Memory Protection Unit
  - Memory protection
  - No address translation

Virtual Memory System Architecture

• Memory areas:
  - Supersections: 16 MB (support is optional)
  - Sections: 1 MB
  - Large pages: 64 KB
  - Small pages: 4 KB
• 2-level page table
  - pointed to by special registers
      TTBR0: Translation Table Base Register 0
      TTBR1: Translation Table Base Register 1
• First level table
  - Pointed to by TTBR0 or TTBR1
  - Contains first level descriptors, each holding one of
      2nd level page table address (22 bits)
      Section base address (12 bits)
      Supersection base address (8 bits)
• Second level table
  - Pointed to by a first level descriptor
  - Contains second level descriptors, each holding one of
      Large page base address (16 bits)
      Small page base address (20 bits)

ARM: Virtual address translation

Translation Table Base Register 1 (TTBR1) format
• bits 31:14: translation table address (Hi)
• bits 13:0: other fields (not detailed here)

Translation Table Base Register 0 (TTBR0) format
• bits 31:14-N: translation table address (Hi)
• bits 13-N:0: other fields (not detailed here)

Translation Table Control Register (TTBCR) format
• bits 2:0: N
• N: number of most significant virtual address bits to discard in translation

ARM: Virtual address translation

First level table 0
• Physical address of the first level page table: TTBR0 table address (Hi), 18+N bits (TTBR0[31:14-N]), followed by 14-N zero bits
• Used if N == 0 or VA[31:32-N] == 0

First level table 1
• Physical address of the first level page table: TTBR1 table address (Hi), 18 bits (TTBR1[31:14]), followed by 14 zero bits
• Used if N != 0 and VA[31:32-N] != 0

Virtual address space
• Addresses 0 to 2^(32-N) - 1: address translation through first level table 0 (TTBR0)
• Addresses 2^(32-N) to 2^32 - 1: address translation through first level table 1 (TTBR1)
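The rule for choosing between TTBR0 and TTBR1, and the way the first level descriptor address is assembled, can be written down compactly. This is a sketch under the assumptions stated in the slides (N taken from the 3 LSBs of TTBCR, 4-byte descriptors); the function names are hypothetical.

```python
def select_ttbr(va, n):
    """Pick the translation table base register for a 32-bit virtual address.

    TTBR0 is used when N == 0 or the top N bits of VA are all zero;
    otherwise TTBR1 is used.
    """
    if n == 0 or (va >> (32 - n)) == 0:
        return "TTBR0"
    return "TTBR1"

def first_level_descriptor_address(va, ttbr0, ttbr1, n):
    """Physical address of the first level descriptor for va."""
    if select_ttbr(va, n) == "TTBR0":
        base = ttbr0 & ~((1 << (14 - n)) - 1) & 0xFFFFFFFF  # TTBR0[31:14-N]
        index = (va & ((1 << (32 - n)) - 1)) >> 20           # VA[31-N:20]
    else:
        base = ttbr1 & 0xFFFFC000                            # TTBR1[31:14]
        index = va >> 20                                     # VA[31:20]
    return base | (index << 2)                               # 4 bytes per entry

print(select_ttbr(0x7FFFFFFF, 1), select_ttbr(0x80000000, 1))
# prints: TTBR0 TTBR1
```

With N = 1 the lower half of the address space goes through TTBR0 and the upper half through TTBR1, which matches the split at 2^(32-N) described above.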

Virtual Memory System Architecture

• Virtual address translation
  - Small pages
      12 bits (address[31:20]): first-level table index
        ignore N MSBs when TTBR0 is used
        N: 3 LSBs of special register TTBCR (Translation Table Control Register)
      8 bits (address[19:12]): second-level table index
      12 bits (address[11:0]): page offset

ARM: Virtual address translation (small page)

• Physical address of first level descriptor: table address (Hi) (18+N bits) : VA[X:20] (12-N bits) : 00
• The first level descriptor (page table) provides the 22-bit second level table address
• Physical address of second level descriptor: second level table address (22 bits) : VA[19:12] (8 bits) : 00
• The second level descriptor (small page) provides the 20-bit small page base address
• Physical address: small page base address (20 bits) : VA[11:0] (12 bits)
• X = 31-N when using TTBR0, X = 31 when using TTBR1

Virtual address translation example

Inputs
• TTBCR: N = 0
• TTBR0: translation table address (Hi) = 100000000000000000 (18 bits), i.e., first level table at 0x80000000
• VA = 0x00207004
  - VA[31:20] = 000000000010 (first-level table index 2)
  - VA[19:12] = 00000111 (second-level table index 7)
  - VA[11:0] = 000000000100 (page offset 4)

Step 1: read entry 2 from the first level table
• Physical address of first level descriptor: 0x80000000 + 2 * 4 = 0x80000008
• Access 32-bit word at 0x80000008
• Found:
  - type is page table
  - base is 1100111100000000000000 (22 bits)
  - second level table address: 0xCF000000

Step 2: read entry 7 from the second level table
• Physical address of second level descriptor: 0xCF000000 + 7 * 4 = 0xCF00001C
• Access 32-bit word at 0xCF00001C
• Found:
  - type is small page
  - base is 00010000000000000000 (20 bits)
  - page address: 0x10000000

Step 3: append the page offset
• physical address is 0x10000000 + 0x004 = 0x10000004
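The three steps of the example can be replayed in a few lines of Python. This is a sketch, not the slides' own material: physical memory is modeled as a dictionary holding only the two descriptors of the example, N = 0 is assumed, and the descriptor type encodings in the low two bits (01 for a page table, 1x for a small page) are taken from the ARMv7 short-descriptor format rather than from the slides.

```python
def translate_small_page(va, ttbr0, memory):
    """Two-level walk for a small page, assuming N = 0 (TTBR0 always used)."""
    # Step 1: first level descriptor at TTBR0[31:14] : VA[31:20] : 00
    l1_addr = (ttbr0 & 0xFFFFC000) | ((va >> 20) << 2)
    l1_desc = memory[l1_addr]
    if l1_desc & 0b11 != 0b01:
        raise ValueError("first level descriptor is not a page table")
    # Step 2: second level descriptor at base[31:10] : VA[19:12] : 00
    l2_addr = (l1_desc & 0xFFFFFC00) | (((va >> 12) & 0xFF) << 2)
    l2_desc = memory[l2_addr]
    if not l2_desc & 0b10:
        raise ValueError("second level descriptor is not a small page")
    # Step 3: physical address is page base[31:12] : VA[11:0]
    return (l2_desc & 0xFFFFF000) | (va & 0xFFF)

# Only the two words touched by the example are present (assumed contents).
memory = {
    0x80000008: 0xCF000001,  # page table descriptor, base 0xCF000000
    0xCF00001C: 0x10000002,  # small page descriptor, base 0x10000000
}
print(hex(translate_small_page(0x00207004, 0x80000000, memory)))
# prints: 0x10000004
```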

Virtual Memory System Architecture

• Virtual address translation
  - Large pages
      12 bits (address[31:20]): first-level table index
        ignore N MSBs when TTBR0 is used
        N: 3 LSBs of special register TTBCR (Translation Table Control Register)
      8 bits (address[19:12]): second-level table index
      16 bits (address[15:0]): page offset

The second-level table index and the page offset overlap in address[15:12]. For this reason the 2nd level page table must hold 16 repeated entries for each large page.
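The need for 16 repeated entries follows from the overlapping bit fields, and a short sketch makes the indexing explicit. The helper names and the descriptor value below are hypothetical; the second level table is modeled as a plain list of 256 words.

```python
def map_large_page(l2_table, va, descriptor):
    """Install one large page descriptor in a 256-entry second level table.

    The descriptor is written to the 16 consecutive entries whose index
    shares the upper four bits of VA[19:12].
    """
    first = ((va >> 12) & 0xFF) & ~0xF
    for i in range(first, first + 16):
        l2_table[i] = descriptor

def large_page_physical_address(l2_table, va):
    """Look up va and combine the 16-bit base with the 16-bit page offset."""
    descriptor = l2_table[(va >> 12) & 0xFF]
    return (descriptor & 0xFFFF0000) | (va & 0xFFFF)

table = [0] * 256
map_large_page(table, 0x00234000, 0xABCD0001)
print(hex(large_page_physical_address(table, 0x00230010)),
      hex(large_page_physical_address(table, 0x0023F010)))
# prints: 0xabcd0010 0xabcdf010
```

Both addresses fall in the same 64 KB page but select different table entries (index 0x30 and 0x3F), which is why every one of the 16 entries must hold the same descriptor.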

ARM: Virtual address translation (large page)

• Physical address of first level descriptor: table address (Hi) (18+N bits) : VA[X:20] (12-N bits) : 00
• The first level descriptor (page table) provides the 22-bit second level table address
• Physical address of second level descriptor: second level table address (22 bits) : VA[19:12] (8 bits) : 00
• The second level descriptor (large page) provides the 16-bit large page base address
• Physical address: large page base address (16 bits) : VA[15:0] (16 bits)
• X = 31-N when using TTBR0, X = 31 when using TTBR1

Virtual Memory System Architecture

• Virtual address translation
  - Sections
      12 bits (address[31:20]): first-level table index
        ignore N MSBs when TTBR0 is used
        N: 3 LSBs of special register TTBCR (Translation Table Control Register)
      20 bits (address[19:0]): section offset

ARM: Virtual address translation (section)

• Physical address of first level descriptor: table address (Hi) (18+N bits) : VA[X:20] (12-N bits) : 00
• The first level descriptor (section) provides the 12-bit section base address
• Physical address: section base address (12 bits) : VA[19:0] (20 bits)
• X = 31-N when using TTBR0, X = 31 when using TTBR1

Virtual Memory System Architecture

• Virtual address translation
  - Supersections
      12 bits (address[31:20]): first-level table index
        ignore N MSBs when TTBR0 is used
        N: 3 LSBs of special register TTBCR (Translation Table Control Register)
      24 bits (address[23:0]): supersection offset

The first-level table index and the supersection offset overlap in address[23:20]. For this reason, in the 1st level page table, supersection entries must be repeated 16 times.

ARM: Virtual address translation (supersection)

• Physical address of first level descriptor: table address (Hi) (18+N bits) : VA[X:20] (12-N bits) : 00
• The first level descriptor (supersection) provides the 8-bit supersection base address (16 bits if 40-bit physical addresses are supported, which is optional)
• Physical address: supersection base address (8 bits) : VA[23:0] (24 bits)
• X = 31-N when using TTBR0, X = 31 when using TTBR1

ARM: Virtual address translation (summary)

First level table (32-bit entries)
• Located at the table address from TTBR0 or TTBR1 (right padded), indexed by VA[X:20]
• Entry types:
  - Page table: points to a second level table
  - Section: 1 MB memory region
  - Supersection: 16 MB memory region (entry repeated 16 times)

Second level table (32-bit entries)
• Located at the table address from a first level page table entry, indexed by VA[19:12]
• Entry types:
  - Large page: 64 KB memory page (entry repeated 16 times)
  - Small page: 4 KB memory page

In both tables, entries also include control bits (entry type, access permissions, etc.).

X = 31-N when using TTBR0, X = 31 when using TTBR1

ARM instruction set

• 32-bit instructions
  - Aligned on a four-byte boundary
• 3-address instructions
• Very regular format
• Conditional execution of (almost) every instruction
• Shift and ALU operation in a single instruction
• Cannot specify an arbitrary 32-bit immediate constant
  - Not enough bits in a 32-bit instruction
• Data processing instructions use 12 bits for the immediate, exploiting the barrel shifter
  - 8 bits for base constant: imm8
  - 4 bits for rotation: rot
  - immediate = rotate_right(imm8, 2 * rot), so only even rotation amounts are available
  - Available constants
      0 - 255 (no rotation)
      256, 260, 264, …, 1020: (64, 65, …, 255) rotated right by 30
      1024, 1040, 1056, …, 4080: (64, 65, …, 255) rotated right by 28
      …

Conditional execution

• Conditional instructions
  - Branches
      Pipeline hazard
      Pipeline stalls: several cycles lost
      Branch prediction
      Speculative execution
  - Data processing instructions
      Execution proceeds in any case
      If the condition is false, results are discarded
      The conditional instruction behaves as a NOP
      1 cycle lost

Conditional execution

• A "conditional suffix" makes the ARM instruction conditional
• Example
  - Instruction: add r0, r1, r2
      Sum the content of registers r1 and r2, and store the result in r0 (r0 <= r1 + r2)
  - Conditional suffix: eq
      true if flag Z is 1
      usually a previous comparison (a subtraction) has provided 0 as result
  - ARM conditional instruction: addeq r0, r1, r2
      eq: conditional suffix
      Logically, the operation is performed only if Z = 1
      In hardware, the operation is always performed, but the result is stored only if Z = 1

Conditional execution

Conditional suffixes

Suffix      Meaning                          Flags
EQ          EQual                            Z == 1
NE          Not Equal                        Z == 0
CS          Carry Set                        C == 1
CC          Carry Clear                      C == 0
MI          MInus, negative                  N == 1
PL          PLus, positive or zero           N == 0
VS          Overflow (V Set)                 V == 1
VC          No overflow (V Clear)            V == 0
HI          Unsigned HIgher                  C == 1 and Z == 0
LS          Unsigned Lower or Same           C == 0 or Z == 1
GE          Signed Greater than or Equal     N == V
LT          Signed Less Than                 N != V
GT          Signed Greater Than              Z == 0 and N == V
LE          Signed Less than or Equal        Z == 1 or N != V
None (AL)   Always                           -

HI and LS are for tests on unsigned values; GE, LT, GT, and LE are for tests on signed values.
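The table of conditional suffixes translates directly into predicates over the N, Z, C, V flags. The sketch below is an illustration only; the function name `condition_passed` is an assumption, and an empty suffix is treated as AL.

```python
# One predicate per conditional suffix, following the table above.
CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,
    "NE": lambda n, z, c, v: z == 0,
    "CS": lambda n, z, c, v: c == 1,
    "CC": lambda n, z, c, v: c == 0,
    "MI": lambda n, z, c, v: n == 1,
    "PL": lambda n, z, c, v: n == 0,
    "VS": lambda n, z, c, v: v == 1,
    "VC": lambda n, z, c, v: v == 0,
    "HI": lambda n, z, c, v: c == 1 and z == 0,
    "LS": lambda n, z, c, v: c == 0 or z == 1,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: z == 0 and n == v,
    "LE": lambda n, z, c, v: z == 1 or n != v,
    "AL": lambda n, z, c, v: True,
}

def condition_passed(suffix, n, z, c, v):
    """Evaluate a conditional suffix against the current flags."""
    return CONDITIONS[suffix.upper() or "AL"](n, z, c, v)

# Flags as left by comparing 1 with 2: N=1, Z=0, C=0, V=0.
print(condition_passed("lt", 1, 0, 0, 0), condition_passed("ge", 1, 0, 0, 0))
# prints: True False
```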

Conditional execution and flags

• Many data processing instructions do not affect flags by default
• Suffix S means: Store Flags
• Examples
  - sub r1, r2, r3
      Subtract content of r3 from content of r2, store result in r1
      r1 <= r2 - r3
      Do NOT store flags in CPSR
  - subs r1, r2, r3
      As above
      DO store flags in CPSR
• cmp does not require S
  - cmp is only used to generate flags
  - cmp means: subtract, do not store result but store flags
  - e.g., cmp r1, r2: perform r1 - r2, discard result, store flags
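How a flag-setting subtraction produces N, Z, C, and V can be modeled on 32-bit values. This is a sketch with hypothetical function names; it relies on the ARM convention that after a subtraction the carry flag is the inverse of the borrow (C = 1 when no borrow occurred).

```python
MASK32 = 0xFFFFFFFF

def subs(a, b):
    """Model of 'subs rd, rn, N': return (result, (N, Z, C, V))."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    n = result >> 31                       # sign bit of the result
    z = int(result == 0)
    c = int(a >= b)                        # carry = NOT borrow
    v = ((a ^ b) & (a ^ result)) >> 31     # signed overflow
    return result, (n, z, c, v)

def cmp(a, b):
    """Model of 'cmp rn, N': same subtraction, result discarded."""
    return subs(a, b)[1]

print(cmp(1, 2), cmp(5, 5), cmp(0x80000000, 1))
# prints: (1, 0, 0, 0) (0, 1, 1, 0) (0, 0, 1, 1)
```

The last case subtracts 1 from the most negative signed value, so the signed result overflows and V is set while the unsigned comparison (C = 1) still reports "higher or same".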

Conditional execution

Example:

    if (a >= b) c = 0;
    else c = 1;

a, b: signed integers
a mapped on r1
b mapped on r2
c mapped on r3

With branches:

        cmp r1, r2      // generate flags
        blt else        // if a < b, branch to "else" code
        mov r3, #0      // this is the "then" code
        b endif
    else:
        mov r3, #1      // this is the "else" code
    endif:
        ...

With conditional execution (no branches):

        cmp r1, r2      // generate flags
        movge r3, #0    // "if" assignment
        movlt r3, #1    // "else" assignment

ARM instructions

• Data processing instructions
  - Data transfer
  - Arithmetic
  - Logical
  - Comparison
  - Shift
• Control flow
• Memory access
• System instructions

ARM instructions

Data processing instruction:

    OPCODE<flag suffix><conditional suffix> OPERANDS

• OPCODE: operation
• <flag suffix>: store flags
  - S: store flags
  - nothing: do not store flags
  - Some exceptions: CMP, TST, ...: always store flags
• <conditional suffix>: conditional execution
  - Instruction is executed only if condition is true
• OPERANDS: registers and immediate constants

Reti Logiche Università degli studi di Udine

Data processing instructions

• Only work on registers, NOT memory
• Second operand is sent to the ALU via barrel shifter
• Mostly 3-address instructions
    • First: destination register
    • Second: first operand (always a register)
    • Third: second operand (a register or a constant)


Data processing instructions

• Barrel shifter operations
    • LSL: Logical Shift Left
    • LSR: Logical Shift Right
    • ASR: Arithmetic Shift Right
    • ROR: Rotate Right
    • RRX: Rotate Right Extended
        • Same as ROR, but operand is 33-bit (carry flag is added)

[Figure: bit-movement diagrams of the shift operations on the operand; LSL and LSR shift in 0]
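The five barrel shifter operations can be sketched as Python functions on 32-bit values. This is a simplified model under stated assumptions: shift amounts are taken to be in the range 0..31 (the hardware's special cases for larger amounts are not modelled), and the function names are chosen here for illustration.

```python
MASK32 = 0xFFFFFFFF

def lsl(x, n):
    """Logical shift left: zeros enter from the right."""
    return (x << n) & MASK32

def lsr(x, n):
    """Logical shift right: zeros enter from the left."""
    return (x & MASK32) >> n

def asr(x, n):
    """Arithmetic shift right: the sign bit is replicated."""
    x &= MASK32
    signed = x - (1 << 32) if x & 0x80000000 else x
    return (signed >> n) & MASK32

def ror(x, n):
    """Rotate right: bits leaving on the right re-enter on the left."""
    x &= MASK32
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32

def rrx(x, carry):
    """Rotate right by 1 through carry: returns (result, new carry)."""
    x &= MASK32
    return ((carry << 31) | (x >> 1), x & 1)
```

In rrx the old carry becomes bit 31 and bit 0 becomes the new carry, which is what makes the operand effectively 33 bits wide.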


Data processing instructions

• Data movement
    • MOV, MVN
• Arithmetic
    • ADD, ADC, SUB, SBC, RSB, RSC
    • MUL
    • MLA, MLS, UMULL, UMLAL, SMULL, SMLAL, UMAAL (4-address instructions)
    • Some ARM cores also have instructions for integer division: UDIV, SDIV
• Logical
    • AND, ORR, EOR, BIC, MVN (MVN is actually a NOT)
• Comparison
    • CMP, CMN (arithmetic instructions without result storing)
    • TST, TEQ (logical instructions without result storing)
• Shift
    • LSL, LSR, ASR, ROR, RRX (not true instructions, actually mov with a shift applied)


Data movement instruction

• MOV: move data to a register
    • mov rd, N
        • rd: destination register
        • N: immediate or source register (and shift)
    • Examples
        • mov r0, r2              r0 <= r2
        • mov r0, #1              r0 <= 1
        • mov r0, r1, lsl #2      r0 <= r1 << 2
        • mov r0, r1, lsl r2      r0 <= r1 << r2
• MVN: move data negated to a register
    • mvn rd, N
        • N is bitwise inverted (NOT) before being stored in rd


Arithmetic instructions

• ADD: sum data
    • add rd, rn, N
        • rd: destination register
        • rn: first source register
        • N: immediate or source register (and shift)
    • Examples
        • add r0, r1, r2              r0 <= r1 + r2
        • add r0, r1, #2              r0 <= r1 + 2
        • add r0, r1, r2, lsl #2      r0 <= r1 + (r2 << 2)
        • add r0, r1, r2, lsl r3      r0 <= r1 + (r2 << r3)
• Others
    • SUB: subtract
        • e.g., sub r0, r1, r2        r0 <= r1 - r2
    • RSB: reverse subtract
        • e.g., rsb r0, r1, r2        r0 <= r2 - r1
    • ADC: add with carry
        • e.g., adc r0, r1, r2        r0 <= r1 + r2 + carry_flag
    • SBC: subtract with carry
        • e.g., sbc r0, r1, r2        r0 <= r1 - r2 - NOT(carry_flag)
    • RSC: reverse subtract with carry
        • e.g., rsc r0, r1, r2        r0 <= r2 - r1 - NOT(carry_flag)
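A typical use of the carry-aware instructions is multi-word arithmetic. The Python sketch below models a 64-bit addition as adds + adc and a 64-bit subtraction as subs + sbc on 32-bit halves. The helper names add64 and sub64 are hypothetical; the sketch assumes the ARM convention that, after a subtraction, carry = 1 means "no borrow".

```python
MASK32 = 0xFFFFFFFF

def adds(a, b):
    """adds: 32-bit add, returns (result, carry out)."""
    total = a + b
    return total & MASK32, total >> 32

def adc(a, b, carry):
    """adc: a + b + carry."""
    return (a + b + carry) & MASK32

def subs(a, b):
    """subs: 32-bit subtract, returns (result, carry), carry = 1 if no borrow."""
    return (a - b) & MASK32, int(a >= b)

def sbc(a, b, carry):
    """sbc: a - b - NOT(carry)."""
    return (a - b - (1 - carry)) & MASK32

def add64(a, b):
    lo, carry = adds(a & MASK32, b & MASK32)              # adds lo, a_lo, b_lo
    hi = adc((a >> 32) & MASK32, (b >> 32) & MASK32, carry)  # adc hi, a_hi, b_hi
    return (hi << 32) | lo

def sub64(a, b):
    lo, carry = subs(a & MASK32, b & MASK32)              # subs lo, a_lo, b_lo
    hi = sbc((a >> 32) & MASK32, (b >> 32) & MASK32, carry)  # sbc hi, a_hi, b_hi
    return (hi << 32) | lo
```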


Arithmetic instructions

• MUL: multiply
    • e.g., mul r0, r1, r2          r0 <= r1 * r2 {lowest 32 bits}
• MLA: multiply and accumulate
    • e.g., mla r0, r1, r2, r3      r0 <= r3 + r1 * r2 {lowest 32 bits}
• MLS: multiply and subtract
    • e.g., mls r0, r1, r2, r3      r0 <= r3 - r1 * r2 {lowest 32 bits}
• UMULL: unsigned multiply long
    • e.g., umull r0, r1, r2, r3    r1:r0 <= r2 * r3
• UMLAL: unsigned multiply and accumulate long
    • e.g., umlal r0, r1, r2, r3    r1:r0 <= r1:r0 + r2 * r3
• SMULL: signed multiply long
    • e.g., smull r0, r1, r2, r3    r1:r0 <= r2 * r3
• SMLAL: signed multiply and accumulate long
    • e.g., smlal r0, r1, r2, r3    r1:r0 <= r1:r0 + r2 * r3
• UMAAL: unsigned multiply accumulate accumulate long
    • e.g., umaal r0, r1, r2, r3    r1:r0 <= r1 + r0 + r2 * r3
• No barrel shifter for operands of these instructions
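The long multiplies return a 64-bit result split across two registers. A minimal Python sketch of that behaviour follows; the function names mirror the mnemonics but the (lo, hi) tuple return convention is an assumption made for this illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit register value as a two's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def split64(value):
    """Split a 64-bit value into (lo, hi) 32-bit halves, i.e. hi:lo."""
    value &= MASK64
    return value & MASK32, value >> 32

def umull(rn, rm):
    return split64((rn & MASK32) * (rm & MASK32))

def smull(rn, rm):
    return split64(to_signed32(rn) * to_signed32(rm))

def umlal(lo, hi, rn, rm):
    return split64(((hi << 32) | lo) + (rn & MASK32) * (rm & MASK32))

def umaal(lo, hi, rn, rm):
    return split64(lo + hi + (rn & MASK32) * (rm & MASK32))
```

Note that umaal cannot overflow 64 bits: even with all four inputs at 0xFFFFFFFF the sum is exactly 2^64 - 1.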


Logical instructions

• AND: bitwise and
    • e.g., and r0, r1, r2            r0 <= r1 and r2
• ORR: bitwise or
    • e.g., orr r0, r1, r2, lsl #1    r0 <= r1 or (r2 << 1)
• EOR: bitwise xor
    • e.g., eor r0, r1, r2, lsl r3    r0 <= r1 xor (r2 << r3)
• BIC: bit clear
    • Clear all bits of the first operand that are set in the second operand
    • e.g., bic r0, r2, r3            r0 <= r2 and not r3


Comparison instructions

• CMP: compare
    • e.g., cmp r4, r5      r4 - r5 {do not store result, always store flags}
• CMN: compare negative
    • e.g., cmn r4, r5      r4 + r5 {do not store result, always store flags}
• TST: test
    • e.g., tst r4, r5      r4 and r5 {do not store result, always store flags}
• TEQ: test equivalence
    • e.g., teq r4, r5      r4 xor r5 {do not store result, always store flags}


Shift instructions

• LSL: logical shift left
    • e.g., lsl r0, r1, #5      r0 <= r1 << 5
        • actually: mov r0, r1, lsl #5
    • e.g., lsl r0, r1, r2      r0 <= r1 << r2
        • actually: mov r0, r1, lsl r2
• LSR: logical shift right
    • e.g., lsr r0, r1, #5      r0 <= r1 >> 5 {zero fill}
        • actually: mov r0, r1, lsr #5
• ASR: arithmetic shift right
    • e.g., asr r0, r1, #5      r0 <= r1 >> 5 {sign bit is replicated}
        • actually: mov r0, r1, asr #5
• ROR: rotate right
    • e.g., ror r0, r1, #5      rotation right without carry
        • actually: mov r0, r1, ror #5
• RRX: rotate right with extend
    • only 1 bit rotation is available
    • e.g., rrx r0, r1          rotation right, 1 bit, with carry
        • actually: mov r0, r1, rrx


Data processing instructions

• Others
    • Count leading zeros
        • CLZ
    • Saturated arithmetic
        • QADD, QSUB, QDADD, QDSUB, …
    • Parallel arithmetic
        • SADD16, SSUB16, SADD8, SSUB8, …
    • Halfword multiply and multiply accumulate instructions
        • SMULWB, SMULWT, SMLABB, SMLABT, SMLATB, SMLATT, …
    • Floating-point data processing
    • Advanced SIMD instructions


Data processing instructions

• Notes
    • Due to immediate constant limitations, mov cannot load small negative values in registers
        • Use mvn
            • r0 <= -1      mvn r0, #0
            • r1 <= -3      mvn r1, #2
    • Fast multiplication for a small constant can be implemented exploiting the barrel shifter
        • e.g., r4 <= r3 * 35
            add r4, r3, r3, lsl #2      (r4 <= r3 * 5)
            rsb r4, r4, r4, lsl #3      (r4 <= r4 * 7)
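Both notes can be verified with a few lines of Python. This is a sketch with hypothetical helper names: mvn is modelled as a 32-bit bitwise NOT, and times35 replays the add/rsb pair from the slide with 32-bit wraparound.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit register value as a two's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mvn(n):
    """mvn rd, N: rd <= NOT N (32-bit)."""
    return ~n & MASK32

def times35(r3):
    """Multiply by 35 as 5 * 7 using shifted operands only."""
    r4 = (r3 + (r3 << 2)) & MASK32    # add r4, r3, r3, lsl #2   -> r3 * 5
    r4 = ((r4 << 3) - r4) & MASK32    # rsb r4, r4, r4, lsl #3   -> r4 * 7
    return r4
```

Since NOT n equals -n - 1 in two's complement, mvn with a small positive immediate yields a small negative value.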


Control flow instructions

• Branch instructions
    • Conditional or unconditional
    • With or without link
        • Link: save next instruction address in LR
        • For subroutine calls
    • Target address
        • Immediate constant (offset from PC)
        • Register
    • With or without instruction set changing
        • Switch between ARM and Thumb execution
• Implicit branches
    • Instructions that use PC as destination register
    • Instructions other than ldm are deprecated


ARM instructions

Control flow (branches):

OPCODE<conditional suffix> DESTINATION

OPCODE: operation

<conditional suffix>: conditional execution

DESTINATION: register or immediate constant


Branch instructions

• B: branch
    • e.g., b label         pc <= address of label
    • e.g., beq label       pc <= address of label if Z = 1
    • destination address is in an immediate constant
        • computed (by assembler) as PC relative immediate offset
• BL: branch with link
    • e.g., bl function     lr <= <return address> ; pc <= function address
    • destination address is in an immediate constant
        • computed (by assembler) as PC relative immediate offset
• BX: branch and exchange
    • e.g., bx lr           pc <= lr {change instruction set if needed}
    • destination address is in a register
• BLX: branch with link and exchange
    • e.g., blx function    lr <= <return address> ; pc <= function address {change instruction set if needed}
    • e.g., blx r0          lr <= <return address> ; pc <= r0 {change instruction set if needed}
    • destination address is in a register or in an immediate constant


Memory access instructions

• Single register transfers
    • Data types
        • 32-bit (word)
        • 16-bit (half-word)
        • 8-bit (byte)
    • Direction
        • Load: LD
        • Store: ST
    • Addressing
        • Pre/post increment/decrement
        • Allows efficient array access


ARM instructions

Memory access (single data):

OPCODE<size><conditional suffix> OPERANDS

OPCODE: operation

<size>:

B: byte

SB: signed byte (not for STR)

H: halfword (16-bit)

SH: signed halfword (16-bit) (not for STR)

Nothing: word (32-bit)

<conditional suffix>: conditional execution

OPERANDS: destination/source register, address and indexing specification


Single register transfers

• Load 32-bit from memory to register
    • ldr rd, [rn]
        • Use address stored in rn and load data from memory
        • Store data in rd
    • Example
        • ldr r0, [r1]      r0 <= MEM[r1]


Single register transfers

• Load 32-bit from memory to register
    • Pre-increment/decrement
        • Pre-: (step 1) compute address, (step 2) access memory
    • ldr rd, [rn, +/- rm, shift]
        • Use address stored in rn +- (shifted rm)
        • Example
            • ldr r0, [r1, r2, lsl #2]      r0 <= MEM[r1 + (r2 << 2)]
    • ldr rd, [rn, +/- #imm12]
        • Example
            • ldr r0, [r1, #12]             r0 <= MEM[r1 + 12]


Single register transfers

• Load 32-bit from memory to register
    • Pre-increment/decrement with pointer update
        • Pre-: (step 1) compute address, (step 2) access memory
        • Update of pointer is indicated by !
    • ldr rd, [rn, +/- rm, shift]!
        • Use address computed as rn +- (shifted rm)
        • Update rn
        • Example
            • ldr r0, [r1, r2, lsl #2]!     r0 <= MEM[r1 + (r2 << 2)]
                                            r1 <= r1 + (r2 << 2)
    • ldr rd, [rn, +/- #imm12]!
        • Use address computed as rn +- #imm12
        • Update rn
        • Example
            • ldr r0, [r1, #20]!            r0 <= MEM[r1 + 20]
                                            r1 <= r1 + 20


Single register transfers

• Load 32-bit from memory to register
    • Post-increment/decrement (with implicit pointer update)
        • Post-: (step 1) access memory, (step 2) compute address
        • Update is implicit (otherwise computation is meaningless): no !
    • ldr rd, [rn], +/- rm, shift
        • Use address stored in rn
        • Update rn with rn +- (shifted rm)
        • Example
            • ldr r0, [r1], r2, lsl #2      r0 <= MEM[r1]
                                            r1 <= r1 + (r2 << 2)
    • ldr rd, [rn], +/- #imm12
        • Use address stored in rn
        • Update rn with rn +- #imm12
        • Example
            • ldr r0, [r1], #16             r0 <= MEM[r1]
                                            r1 <= r1 + 16
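The three addressing forms differ only in when the offset is applied and whether the base register is updated. The Python sketch below models them with a dictionary as word-addressed memory and another dictionary as register file. Both representations, and the function names, are assumptions for illustration; the sketch also assumes rd and rn are different registers.

```python
def ldr_offset(mem, regs, rd, rn, offset):
    """ldr rd, [rn, #offset]: pre-indexed, base register unchanged."""
    regs[rd] = mem[regs[rn] + offset]

def ldr_pre(mem, regs, rd, rn, offset):
    """ldr rd, [rn, #offset]!: compute address, update base, then load."""
    regs[rn] = regs[rn] + offset
    regs[rd] = mem[regs[rn]]

def ldr_post(mem, regs, rd, rn, offset):
    """ldr rd, [rn], #offset: load from the base, then update the base."""
    regs[rd] = mem[regs[rn]]
    regs[rn] = regs[rn] + offset

def sum_words(mem, base, count):
    """Sum an array of words by walking a pointer with post-increment."""
    regs = {"r0": 0, "r1": base}
    total = 0
    for _ in range(count):
        ldr_post(mem, regs, "r0", "r1", 4)   # ldr r0, [r1], #4
        total += regs["r0"]
    return total
```

The sum_words loop shows why post-increment suits array traversal: each load also advances the pointer to the next element.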


Single register transfers

• Load 32-bit from memory to register (summary)
    • Pre-increment
        • ldr rd, [rn, offset]
            • ldr rd, [rn, +/- rm, shift]
            • ldr rd, [rn, +/- #imm12]
    • Pre-increment with pointer update
        • ldr rd, [rn, offset]!
            • ldr rd, [rn, +/- rm, shift]!
            • ldr rd, [rn, +/- #imm12]!
    • Post-increment (with pointer update)
        • ldr rd, [rn], offset
            • ldr rd, [rn], +/- rm, shift
            • ldr rd, [rn], +/- #imm12


Single register transfers

• Store 32-bit data to memory from register
    • Similar to ldr
    • Store data contained in rt
    • Pre-increment
        • str rt, [rn, offset]
    • Pre-increment with pointer update
        • str rt, [rn, offset]!
    • Post-increment (with pointer update)
        • str rt, [rn], offset


Single register transfers

• Other sizes
    • Same address specification
    • Use the lower part of the source/destination register
    • Load
        • Load byte (unsigned): LDRB
        • Load byte (signed): LDRSB
        • Load half-word (unsigned): LDRH
        • Load half-word (signed): LDRSH
    • Store
        • Store byte: STRB
        • Store half-word: STRH
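The difference between the unsigned and signed loads is how the upper bits of the 32-bit destination register are filled. A minimal Python sketch follows, assuming a little-endian byte-addressed memory held in a bytearray; the function names mirror the mnemonics and are otherwise hypothetical.

```python
MASK32 = 0xFFFFFFFF

def ldrb(mem, addr):
    """Load byte, zero-extended to 32 bits."""
    return mem[addr]

def ldrsb(mem, addr):
    """Load byte, sign-extended to 32 bits."""
    value = mem[addr]
    return (value - 0x100 if value & 0x80 else value) & MASK32

def ldrh(mem, addr):
    """Load half-word (little-endian), zero-extended to 32 bits."""
    return mem[addr] | (mem[addr + 1] << 8)

def ldrsh(mem, addr):
    """Load half-word (little-endian), sign-extended to 32 bits."""
    value = ldrh(mem, addr)
    return (value - 0x10000 if value & 0x8000 else value) & MASK32

def strb(mem, addr, reg):
    """Store the lowest byte of the register."""
    mem[addr] = reg & 0xFF

def strh(mem, addr, reg):
    """Store the lowest half-word of the register (little-endian)."""
    mem[addr] = reg & 0xFF
    mem[addr + 1] = (reg >> 8) & 0xFF
```

Stores need no signed variants because only the lower part of the register is written and no extension takes place.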


Single register transfers

• Others
    • Double register transfers
        • LDRD, STRD
        • Load/store a pair of registers
    • Load-linked, store conditional
        • LDREX, STREX, CLREX
        • Used for multiprocessing synchronization
        • Since ARMv6
    • Memory-register data swap
        • SWP
        • Double access
        • Deprecated since ARMv6
    • Extra load/store instructions, unprivileged
        • LDRT, LDRBT, LDRSBT, LDRHT, LDRSHT, STRT, STRBT, STRHT


Memory access instructions

• Multiple register transfers
    • Transfer a subset of registers to/from memory
    • Data types
        • 32-bit (word)
    • Direction
        • Load: LDM
        • Store: STM
    • Addressing
        • Increment/decrement before/after
        • Allows efficient stack access


ARM instructions

Memory access (multiple data):

OPCODE<conditional suffix><addressing mode> OPERANDS

OPCODE: operation

<conditional suffix>: conditional execution

<addressing mode>:
    DA: decrement after
    IA: increment after
    DB: decrement before
    IB: increment before
    Or (stack oriented addressing modes):
    FA: full ascending
    FD: full descending
    EA: empty ascending
    ED: empty descending

OPERANDS:
    pointer (register)
    List of destination/source registers and pointer update request

Registers are transferred in order
Lowest register number is always transferred to/from lowest memory location accessed.


Memory access (multiple data)

• Memory access (multiple data) examples:
    • ldmia r13!, {r0-r12, r14}     ; IA: increment after
        ; step-1: access memory, step-2 (After): increment address
        ; r0  <= MEM[r13]
        ; r1  <= MEM[r13 + 4]
        ; ...
        ; r12 <= MEM[r13 + 48]
        ; r14 <= MEM[r13 + 52]
        ; r13 <= r13 + 56           pointer update required (!)
    • stmib r13!, {r0, r2}          ; IB: increment before
        ; step-1 (Before): increment address, step-2: access memory
        ; MEM[r13 + 4] <= r0
        ; MEM[r13 + 8] <= r2
        ; r13 <= r13 + 8            pointer update required (!)
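The four addressing modes can be reduced to two numbers: the lowest address accessed and the value written back to the pointer. The Python sketch below models this, with registers identified by number and memory as a dictionary of word addresses. These representations and the function names are assumptions for illustration, and the case where the pointer register is itself in the register list is not modelled.

```python
def block_addresses(base, count, mode):
    """Return (lowest address accessed, updated pointer) for `count` registers."""
    size = 4 * count
    if mode == "IA":
        return base, base + size
    if mode == "IB":
        return base + 4, base + size
    if mode == "DA":
        return base - size + 4, base - size
    if mode == "DB":
        return base - size, base - size
    raise ValueError("unknown addressing mode: " + mode)

def stm(mem, regs, rn, reglist, mode, writeback=True):
    """Store the listed registers; lowest register goes to the lowest address."""
    start, new_base = block_addresses(regs[rn], len(reglist), mode)
    for i, r in enumerate(sorted(reglist)):
        mem[start + 4 * i] = regs[r]
    if writeback:                      # the "!" in the assembly syntax
        regs[rn] = new_base

def ldm(mem, regs, rn, reglist, mode, writeback=True):
    """Load the listed registers; lowest register comes from the lowest address."""
    start, new_base = block_addresses(regs[rn], len(reglist), mode)
    for i, r in enumerate(sorted(reglist)):
        regs[r] = mem[start + 4 * i]
    if writeback:
        regs[rn] = new_base
```

For instance, stm(mem, regs, 13, [0, 2], "IB") reproduces the stmib example above, and a later ldm with mode "DA" reads the same words back and restores the pointer.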


Stack

[Figure: stack holding the saved registers r4, r5, r11 (fp), r12 (ip), r14 (lr), r15 (pc), with sp shown before and after the transfer; annotation: restore r5 from stack]

Full Descending
    Store: decrement before (STMFD = STMDB)
    Load: increment after (LDMFD = LDMIA)

Full Descending:
    Full: sp points to a location with data
    Descending: sp must be decremented when pushing into the stack

Empty Descending:
    Empty: sp points to an empty location
    Descending: sp must be decremented when pushing into the stack

Empty Ascending:
    Empty: sp points to an empty location
    Ascending: sp must be incremented when pushing into the stack

Full Ascending:
    Full: sp points to a location with data
    Ascending: sp must be incremented when pushing into the stack


Stack

Stack type          Store                     Load
Full Descending     ST: decrement before      LD: increment after     (STMFD = STMDB, LDMFD = LDMIA)
Empty Descending    ST: decrement after       LD: increment before
Empty Ascending     ST: increment after       LD: decrement before
Full Ascending      ST: increment before      LD: decrement after

[Figure: for each stack type, the saved registers r4, r5, r11 (fp), r12 (ip), r14 (lr), r15 (pc) in memory, with sp shown before and after the transfer]

Other possible stacks: Empty Descending, Empty Ascending, Full Ascending

Functions must use a FD stack
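A Full Descending stack can be sketched in Python as a pair of helpers: push behaves like stmfd sp!, {...} (decrement before) and pop like ldmfd sp!, {...} (increment after). The helper names and the dictionary used as memory are hypothetical; values are passed in ascending register order, so the first value lands at the lowest address.

```python
def push_fd(mem, sp, values):
    """stmfd sp!, {...}: decrement sp first, then store; returns the new sp."""
    sp -= 4 * len(values)
    for i, value in enumerate(values):
        mem[sp + 4 * i] = value
    return sp

def pop_fd(mem, sp, count):
    """ldmfd sp!, {...}: load from sp upward, then increment sp."""
    values = [mem[sp + 4 * i] for i in range(count)]
    return values, sp + 4 * count
```

After a push, sp points at the last word stored (the stack is "full"); a matching pop returns the same values and the original sp.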


System instructions

• SVC: supervisor call (also: SWI)
    • e.g., svc #0
• MCR: move to coprocessor (special register) from register
• MRC: move to register from coprocessor (special register)
• MRS: move to register from status register
    • e.g., mrs r0, CPSR
    • e.g., mrs r0, SPSR
• MSR: move to status register
    • e.g., msr CPSR, R0
    • e.g., msr SPSR, R0
    • e.g., msr CPSR_f, R0              write only the flags portion
    • e.g., msr CPSR_f, #0x20000000     set the C flag
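The constant 0x20000000 in the last example corresponds to bit 29 of the CPSR, which holds the C flag (N, Z, C, V occupy bits 31 to 28). The Python sketch below models a write restricted to the flags field, taken here as bits 31..24. The function names are hypothetical, and the model ignores privilege checks.

```python
MASK32 = 0xFFFFFFFF
FLAG_BITS = {"N": 31, "Z": 30, "C": 29, "V": 28}
FLAGS_FIELD_MASK = 0xFF000000      # bits selected by the _f suffix

def msr_cpsr_f(cpsr, value):
    """msr CPSR_f, value: replace only the flags field, keep the other bits."""
    kept = cpsr & ~FLAGS_FIELD_MASK & MASK32
    return kept | (value & FLAGS_FIELD_MASK)

def read_flags(cpsr):
    """Extract N, Z, C, V from a CPSR value, as after `mrs r0, CPSR`."""
    return {name: (cpsr >> bit) & 1 for name, bit in FLAG_BITS.items()}
```

Writing an immediate to the whole flags field sets C and at the same time clears the other flag bits, so a read-modify-write via mrs is needed to change C alone.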


System instructions

• Others
    • Memory barriers
        • DSB, DMB, ISB
    • Other traps
        • HVC, SMC
    • Two registers core-coprocessor transfers
        • MCRR, MCRR2, MRRC, MRRC2
    • …


ARMv8

64-bit architecture

Backward compatible with 32-bit ARM architectures

ARMv7-A with:

Multiprocessing Extensions

Large Physical Address Extension

Virtualization Extensions

Security Extensions

VFPv4

SIMDv2


ARMv8

64-bit architecture

Backward compatible with 32-bit ARM architectures

2 execution states
    AArch64
        R0-R30: general purpose, 64-bit registers
        SP: 64-bit stack pointer
        PC: 64-bit program counter (not directly writable)
        V0-V31: SIMD and floating point, 128-bit registers
    AArch32
        ARMv7-A with
            A32 instruction set
                In former notation: ARM instruction set
            T32 instruction set
                In former notation: Thumb + Thumb-2 instruction sets
            No Jazelle; no ThumbEE


ARM: other info

ARM Architecture Reference Manual, ARMv7-A and ARMv7-R edition

ARMv7-M Architecture Reference Manual

ARM Architecture Reference Manual ARMv8, for ARMv8-A architecture profile