ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION ARM

Upload: norma-underwood

Post on 18-Jan-2016


Page 1: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Page 2: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

ARM architecture processors are popular in mobile phone systems

Page 3: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

ARM (Advanced RISC Machine) Features
• ARM has a 32-bit architecture but also supports 16-bit and 8-bit data types.
• ARM is programmable for little-endian or big-endian data alignment in memory.
• ARM provides the advantage of a CISC in terms of functionality, along with the advantage of a RISC in terms of faster program implementation as well as reduced code lengths.
• ARM processor has a RISC core for processing.
• Combination of RISC and CISC features: ARM supports an instruction set based on complex addressing modes.
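The endianness bullet above can be made concrete with a small sketch. It only shows how the same 32-bit word would be laid out in memory under each setting; the value 0x11223344 and the helper name `dump` are assumptions chosen for illustration and do not come from the slides.

```python
import struct

WORD = 0x11223344  # arbitrary 32-bit value chosen for illustration

little = struct.pack("<I", WORD)  # least significant byte at the lowest address
big = struct.pack(">I", WORD)     # most significant byte at the lowest address

def dump(label, data):
    """Print each byte with its offset from the word's base address."""
    cells = " ".join(f"[+{i}]={b:02X}" for i, b in enumerate(data))
    print(f"{label:>6}: {cells}")

dump("little", little)
dump("big", big)
```

In the little-endian layout the byte 0x44 sits at offset 0, while in the big-endian layout the byte 0x11 does.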

Page 4: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

In-built compilation unit
• Compiles the CISC instructions into RISC formats, which are then implemented by the RISC core of the processor.
• Internally, many instructions are implemented as in a RISC (without a micro-programmed unit).

Jazelle technology
• Faster execution of Java code

Page 5: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

ARM Thumb 16-bit instructions
• Thumb set is designed for 16-bit word lengths and instructions, which are executed internally by the same 32-bit core.
• Instruction fetch of 2 bytes in Thumb mode in place of 4 bytes in ARM mode.
• Data alignment at steps of 2 bytes in Thumb mode in place of 4 bytes in ARM mode.
• Memory savings of up to 35% over the equivalent 32-bit code, while retaining all the benefits of a 32-bit system (such as access to a full 32-bit address space).
• Enables 32-bit performance at the 8/16-bit system cost in terms of memory needs.
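The memory-saving figure above can be related to instruction sizes with a little arithmetic. The sketch below is only a worked example: the instruction counts (1000 ARM instructions replaced by 1300 Thumb instructions) are hypothetical, chosen because a routine that needs about 30% more Thumb instructions ends up at the 35% saving quoted on the slide.

```python
ARM_BYTES = 4    # bytes per ARM-mode instruction (from the slide)
THUMB_BYTES = 2  # bytes per Thumb-mode instruction (from the slide)

def code_size(instructions, bytes_per_instruction):
    """Code size in bytes for a given instruction count."""
    return instructions * bytes_per_instruction

def thumb_saving(arm_instructions, thumb_instructions):
    """Fractional memory saving of Thumb code relative to ARM code."""
    arm = code_size(arm_instructions, ARM_BYTES)
    thumb = code_size(thumb_instructions, THUMB_BYTES)
    return 1 - thumb / arm

# Hypothetical routine: 1000 ARM instructions that need 1300 Thumb instructions.
print(f"{thumb_saving(1000, 1300):.0%}")
```

If the Thumb version needed exactly as many instructions as the ARM version, the saving would be 50%; the extra instructions are what bring it down.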

Page 6: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Thumb and 32-bit ARM modes
• Switch from one mode to another.
• No overheads (in terms of time and memory) in moving between Thumb and the normal ARM state of the code. The two states are compatible on a normal basis.
• Gives the code designer complete control over performance and code-size optimization.

Page 7: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

ARM7 versions
• ARM7TDMI® (Integer Core)
• ARM7TDMI-S™ (Synthesisable version of ARM7TDMI)
• ARM7EJ-S™ (Synthesisable core with DSP and Jazelle technology)
• ARM720T™ (cached processor macrocell, 8K cached core with Memory Management Unit (MMU) supporting operating systems including Windows CE, Palm OS, Symbian OS and Linux)
• 130 MIPS using Dhrystone 2.1 benchmark in a typical 0.13 μm process

Page 8: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

ARM9 versions
• ARM920T (Dual 16k caches with MMU to support multiple OSs)
• ARM922T (Dual 8k caches for applications that support multiple OSs)
• ARM940T™ (Dual 4k caches for embedded control applications running an RTOS)
• 32-bit RISC processor core with a 5-stage integer pipeline. 8-entry write buffers to avoid blocking the processor on external memory writes.
• Achieves 1.1 MIPS/MHz, 300 MIPS (Dhrystone 2.1) in a typical 0.13 μm process

Page 9: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

ARM11 versions
• Families with the ARMv6 instruction set architecture, which includes the Thumb® extensions for code density, Jazelle™ technology for Java™ acceleration, ARM DSP extensions, and SIMD media processing extensions. MMU supporting operating systems, including Palm OS.
• 32-bit RISC processor core with 8-stage integer pipeline, static and dynamic branch prediction, and separate load-store and arithmetic pipelines to maximize instruction throughput
• Targets a performance range of 400 to 1200 Dhrystone MIPS

Page 10: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Memory Architecture

• ARM7 has Princeton memory architecture.
• ARM9 processor has Harvard architecture.

Page 11: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Faster implementation and Reduced code lengths
• Faster implementation: due to the instant availability of the register word to the execution unit.
• Reduced code lengths: most instructions use registers as operands.
• Only a few bits in the instruction specify a register as operand, whereas 8, 16 or 32 bits, along with the displacement bits in the instruction, specify a memory address as operand.

Page 12: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

ARM registers
• R0 to R15.
• R15 also functions as program counter.
• R14 functions as link register.
• R13 may be used as stack pointer.
• CPSR (current program status register).
• SPSR (saved program status register).

Page 13: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Processor Modes
• The ARM has seven basic operating modes:

– User : unprivileged mode under which most tasks run

– FIQ : entered when a high priority (fast) interrupt is raised

– IRQ : entered when a low priority (normal) interrupt is raised

– Supervisor : entered on reset and when a Software Interrupt instruction is executed

– Abort : used to handle memory access violations

– Undef : used to handle undefined instructions

– System : privileged mode using the same registers as user mode

Page 14: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

The ARM Register Set
[Figure: current visible registers and banked out registers, shown for User, FIQ, IRQ, SVC, Undef and Abort modes.]
• User mode: r0 to r12, r13 (sp), r14 (lr), r15 (pc) and cpsr
• FIQ mode: banked r8 to r12, r13 (sp) and r14 (lr), plus spsr
• IRQ, SVC, Undef and Abort modes: banked r13 (sp) and r14 (lr), plus spsr

Page 15: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

The Registers
• ARM has 37 registers, all of which are 32 bits long.
– 1 dedicated program counter
– 1 dedicated current program status register
– 5 dedicated saved program status registers
– 30 general purpose registers
• The current processor mode governs which of several banks is accessible. Each mode can access
– a particular set of r0-r12 registers
– a particular r13 (the stack pointer, sp) and r14 (the link register, lr)
– the program counter, r15 (pc)
– the current program status register, cpsr
• Privileged modes (except System) can also access
– a particular spsr (saved program status register)
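The register count above can be checked by modelling the banks directly. The following is a minimal sketch, assuming the usual banking scheme (FIQ banks r8 to r14, the other exception modes bank r13 and r14, and each of the five exception modes has its own spsr); the naming scheme such as `r13_irq` and the function names are assumptions for illustration.

```python
SHARED = [f"r{i}" for i in range(13)]               # r0-r12
BANKED_MODES = ["fiq", "irq", "svc", "abt", "und"]  # modes with private sp, lr, spsr

def physical_registers():
    """Return the names of all distinct physical registers."""
    regs = set(SHARED) | {"r13", "r14", "r15", "cpsr"}  # User/System bank
    for mode in BANKED_MODES:
        regs |= {f"r13_{mode}", f"r14_{mode}", f"spsr_{mode}"}
    regs |= {f"r{i}_fiq" for i in range(8, 13)}         # FIQ also banks r8-r12
    return regs

def visible(mode):
    """Map each architectural name to the physical register seen in `mode`."""
    view = {name: name for name in SHARED + ["r13", "r14", "r15", "cpsr"]}
    if mode in BANKED_MODES:
        view["r13"] = f"r13_{mode}"
        view["r14"] = f"r14_{mode}"
        view["spsr"] = f"spsr_{mode}"
    if mode == "fiq":
        for i in range(8, 13):
            view[f"r{i}"] = f"r{i}_fiq"
    return view

regs = physical_registers()
# General purpose registers: everything except pc, cpsr and the spsr copies.
general = [r for r in regs if not r.startswith(("r15", "cpsr", "spsr"))]
print(len(regs), len(general))
```

Under these assumptions the model yields 37 registers in total, 30 of them general purpose, which matches the breakdown on the slide.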

Page 16: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Program Status Registers

• Condition code flags
– N = Negative result from ALU
– Z = Zero result from ALU
– C = ALU operation Carried out
– V = ALU operation oVerflowed
• Sticky Overflow flag - Q flag
– Architecture 5TE/J only
– Indicates if saturation has occurred
• J bit
– Architecture 5TEJ only
– J = 1: Processor in Jazelle state
• Interrupt Disable bits
– I = 1: Disables the IRQ.
– F = 1: Disables the FIQ.
• T Bit
– Architecture xT only
– T = 0: Processor in ARM state
– T = 1: Processor in Thumb state
• Mode bits
– Specify the processor mode
• Bit layout
– Bits 31 to 28: N, Z, C, V
– Bit 27: Q
– Bit 24: J
– Bits 23 to 8: undefined
– Bits 7, 6, 5: I, F, T
– Bits 4 to 0: mode
– Fields: f = bits 31 to 24, s = bits 23 to 16, x = bits 15 to 8, c = bits 7 to 0
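The bit positions above can be exercised with a short decoder. This is a sketch rather than a reference implementation: the mode-bit encodings in `MODES` come from the ARM architecture and are not listed on the slide, and the example value 0x600000D3 is arbitrary.

```python
# Mode-bit encodings from the ARM architecture (not listed on the slide).
MODES = {
    0b10000: "User", 0b10001: "FIQ", 0b10010: "IRQ", 0b10011: "Supervisor",
    0b10111: "Abort", 0b11011: "Undef", 0b11111: "System",
}

def decode_cpsr(value):
    """Split a 32-bit program status register value into its named fields."""
    def bit(n):
        return (value >> n) & 1
    return {
        "N": bit(31), "Z": bit(30), "C": bit(29), "V": bit(28),
        "Q": bit(27), "J": bit(24),
        "I": bit(7), "F": bit(6), "T": bit(5),
        "mode": MODES.get(value & 0x1F, "reserved"),
    }

example = decode_cpsr(0x600000D3)  # arbitrary value chosen for illustration
print(example)
```

For this value the decoder reports Z and C set, both interrupts disabled, ARM state (T = 0) and Supervisor mode.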

Page 17: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Program Counter (r15)
• When the processor is executing in ARM state:
– All instructions are 32 bits wide
– All instructions must be word aligned
– Therefore the pc value is stored in bits [31:2] with bits [1:0] undefined (as instruction cannot be halfword or byte aligned).
• When the processor is executing in Thumb state:
– All instructions are 16 bits wide
– All instructions must be halfword aligned
– Therefore the pc value is stored in bits [31:1] with bit [0] undefined (as instruction cannot be byte aligned).
• When the processor is executing in Jazelle state:
– All instructions are 8 bits wide
– Processor performs a word access to read 4 instructions at once
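The alignment rules for the program counter can be sketched as bit operations. The function names below are hypothetical, and memory is modelled as a plain `bytes` object purely for illustration.

```python
def fetch_address(pc, state):
    """Clear the low pc bits that carry no information in the given state."""
    low_bits = {"arm": 2, "thumb": 1}[state]
    return (pc >> low_bits) << low_bits

def next_pc(pc, state):
    """Address of the following instruction: 4 bytes on in ARM, 2 in Thumb."""
    step = {"arm": 4, "thumb": 2}[state]
    return (fetch_address(pc, state) + step) & 0xFFFFFFFF

def jazelle_fetch(memory, pc):
    """Read the aligned word containing pc and return its four 8-bit instructions."""
    base = (pc >> 2) << 2
    return list(memory[base:base + 4])

print(hex(fetch_address(0x8003, "arm")), hex(fetch_address(0x8003, "thumb")))
```

In ARM state the two low bits are dropped, in Thumb state only bit 0 is dropped, and a Jazelle word access returns four byte-wide instructions at once.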

Page 19: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

ARM Codes
• ARM codes: forward compatible with higher versions.
• ARM7 codes: forward compatible with ARM9, ARM9E and ARM10 processors as well as Intel XScale micro-architecture.
• ARM9E and ARM10 families use a Vector Floating Point (VFP) ARM coprocessor, which adds full floating-point operations.
• VFP also provides fast development in SoC design when using tools like MatLab®.
• Applications are in image processing (scaling), 2D and 3D transformations, font generation and digital filters.

Page 20: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

ARM Intelligent Energy Manager (IEM) technology

• Advanced algorithms to optimally balance processor workload and energy consumption.

• Maximizes system responsiveness.
• IEM works with the operating system and mobile OS.
• Application running on a mobile phone dynamically adjusts the required CPU performance level.

Page 21: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

ARM processors AHB (AMBA Advanced High Performance Bus) interface
• AMBA is an established open specification for on-chip interconnects.
• AMBA serves as a framework for SoC designs and development of the IP library.
• AHB support in all new ARM cores.
• Provides a high-performance and fully synchronous back plane. (Back plane means an additional set of controllers that can access another common bus, distinct from the system bus, in a system with multilevel buses.)
• Multi-layer AHB in ARM926EJ-S and all members of the ARM10 family represents a significant advancement. It reduces access latencies and increases the bandwidth available to multi-master systems.

Page 25: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

ARM Instruction Set Features
• Two instruction sets: 16-bit Thumb and 32-bit ARM mode instructions
• Operations on 8-bit, 16-bit or 32-bit data types
• Data alignment in memory: two-byte words in Thumb set and four-byte words in 32-bit ARM mode

Page 26: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

ARM7 instruction set: Data Transfer Instructions

• Register-load a byte (LDRB).
• Register-store a byte (STRB).
• Register-store a half word (STRH). [A word in ARM is of 32 bits.]
• Register-load a half word, unsigned or signed (LDRH or LDRSH).
• Instructions for transfer between the registers and memory. The memory address is as per a register used as index or index-relative or post auto-index addressing mode.
• Register-load a word (LDR).
• Register-store a word (STR).
• Set a memory address into a register (ADR). Address is of 12 bits. [Alternative for 16-bit address setting in a register is using any register or r15 in an arithmetic operation.]
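The difference between the byte, half-word and word transfers listed above can be sketched with a toy memory. This is a simplified model, assuming little-endian memory and plain byte addresses instead of the register-based addressing modes; the 16-byte memory, the register list and the Python function names mirroring the mnemonics are all assumptions for illustration.

```python
import struct

memory = bytearray(16)  # toy little-endian memory
regs = [0] * 16         # r0-r15, each holding a 32-bit value

def STR(rd, addr):
    struct.pack_into("<I", memory, addr, regs[rd] & 0xFFFFFFFF)

def STRH(rd, addr):
    struct.pack_into("<H", memory, addr, regs[rd] & 0xFFFF)  # low half word only

def STRB(rd, addr):
    memory[addr] = regs[rd] & 0xFF                           # low byte only

def LDR(rd, addr):
    regs[rd] = struct.unpack_from("<I", memory, addr)[0]

def LDRH(rd, addr):
    regs[rd] = struct.unpack_from("<H", memory, addr)[0]     # zero-extended

def LDRSH(rd, addr):
    value = struct.unpack_from("<h", memory, addr)[0]        # signed half word
    regs[rd] = value & 0xFFFFFFFF                            # sign-extended to 32 bits

def LDRB(rd, addr):
    regs[rd] = memory[addr]                                  # zero-extended

regs[0] = 0x1234FEDC
STR(0, 0)     # store the whole word at address 0
LDRB(1, 0)    # r1 receives the lowest byte
LDRH(2, 0)    # r2 receives the low half word, zero-extended
LDRSH(3, 0)   # r3 receives the low half word, sign-extended
print([hex(r) for r in regs[:4]])
```

The signed load differs from the unsigned one only in how the upper 16 bits of the destination register are filled.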

Page 27: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Word transfer between registers
• Move (MOV).
• Move NOT, which moves the bitwise complement of the operand (MVN).

Conditional implementation of load, move or store instructions
• Conditions, signed number: LT (Less Than), GT (Greater Than), LE (Less or Equal), GE (Greater or Equal), EQ (Equal), NE (Not Equal), VS (overflow), VC (no overflow).
• Conditions, unsigned number: HI (higher), LS (lower or same), PL (plus, not negative), MI (minus), CC (carry bit reset), and CS (carry bit set).
• Example: MOVLT r3, #10 moves the immediate operand 10 to r3, provided a previous instruction for comparison showed the first source as less than the second.
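Conditional execution can be sketched by computing the flags of a comparison and then testing a condition code against them. The flag tests in `CONDITIONS` follow the ARM architecture definitions; the helper names `cmp_flags` and `passes` and the register values are assumptions for illustration.

```python
MASK = 0xFFFFFFFF

def cmp_flags(a, b):
    """Flags N, Z, C, V produced by CMP a, b, which computes a - b on 32 bits."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    n = result >> 31
    z = int(result == 0)
    c = int(a >= b)                      # carry set means no borrow occurred
    v = ((a ^ b) & (a ^ result)) >> 31   # signed overflow of the subtraction
    return n, z, c, v

CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,
    "NE": lambda n, z, c, v: z == 0,
    "CS": lambda n, z, c, v: c == 1,
    "CC": lambda n, z, c, v: c == 0,
    "MI": lambda n, z, c, v: n == 1,
    "PL": lambda n, z, c, v: n == 0,
    "VS": lambda n, z, c, v: v == 1,
    "VC": lambda n, z, c, v: v == 0,
    "HI": lambda n, z, c, v: c == 1 and z == 0,
    "LS": lambda n, z, c, v: c == 0 or z == 1,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: z == 0 and n == v,
    "LE": lambda n, z, c, v: z == 1 or n != v,
}

def passes(cond, flags):
    """True when the condition code holds for the given (N, Z, C, V) flags."""
    return CONDITIONS[cond](*flags)

regs = {"r1": 5, "r2": 9, "r3": 0}
flags = cmp_flags(regs["r1"], regs["r2"])  # CMP r1, r2
if passes("LT", flags):                    # MOVLT r3, #10
    regs["r3"] = 10
print(regs["r3"])
```

Because 5 is less than 9, the LT condition holds and r3 receives 10; with the operands swapped the move would be skipped and r3 would keep its old value.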

Page 28: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Bit Transfer or Manipulation Instructions
• Register-bits Logical Shift Left (LSL).
• Register-bits Arithmetic Shift Left (ASL).
• Register-bits Logical Shift Right (LSR).
• Register-bits Arithmetic Shift Right (ASR).
• Register-bits Rotate Right (ROR).
• Register-bits Rotate Right with the carry bit included in the rotation (RRX).
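The shift and rotate operations above can be sketched on 32-bit values as follows. The lowercase function names are assumptions; `rrx` returns both the rotated value and the bit shifted out, which would become the new carry.

```python
MASK = 0xFFFFFFFF

def lsl(x, n):
    """Logical shift left: zeros enter from the right."""
    return (x << n) & MASK

def lsr(x, n):
    """Logical shift right: zeros enter from the left."""
    return (x & MASK) >> n

def asr(x, n):
    """Arithmetic shift right: the sign bit is replicated from the left."""
    x &= MASK
    if x & 0x80000000:
        x -= 1 << 32          # reinterpret as a negative Python integer
    return (x >> n) & MASK

def ror(x, n):
    """Rotate right: bits leaving on the right re-enter on the left."""
    n %= 32
    x &= MASK
    return ((x >> n) | (x << (32 - n))) & MASK

def rrx(x, carry):
    """Rotate right by one through carry; returns (result, carry_out)."""
    x &= MASK
    return (carry << 31) | (x >> 1), x & 1

print(hex(lsr(0x80000000, 4)), hex(asr(0x80000000, 4)))
```

The last line contrasts the logical and arithmetic right shifts of the same negative value. An arithmetic shift left would produce the same bits as `lsl`.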

Page 29: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Arithmetical Instructions
• Three operands from the registers.
• One source may, however, be by immediate operand addressing in addition and subtraction.
• Add two words without carry; the result is in the third operand (ADD).
• Add two words with carry; the result is in the third operand (ADC).
• Subtract two words without carry; the result is in the third operand (SUB). [Carry bit used as borrow.]
• Subtract two words with carry; the result is in the third operand (SBC).

Page 30: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Arithmetical Instructions
• Subtract reverse (the first source is subtracted from the second) two words without carry; the result is in the third operand (RSB). [Carry bit used as borrow.]
• Subtract reverse two words with carry; the result is in the third operand (RSC).
• Multiply two different registers; the result is in the destination register (MUL).
• Multiply two source registers, add the product to the third source register and accumulate the new result in a destination register (MLA).
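The arithmetic instructions can be sketched as 32-bit operations. One detail worth stating as an assumption taken from the ARM architecture rather than the slides: after a subtraction the carry flag is the inverse of a borrow, so SBC and RSC subtract an extra 1 only when the carry is clear. The function names are hypothetical.

```python
MASK = 0xFFFFFFFF

def adds(rn, op2, carry_in=0):
    """Add and also return the carry out, as a flag-setting add would."""
    total = (rn & MASK) + (op2 & MASK) + carry_in
    return total & MASK, total >> 32

def add(rn, op2):
    return adds(rn, op2)[0]

def adc(rn, op2, c):
    return adds(rn, op2, c)[0]

def sub(rn, op2):
    return (rn - op2) & MASK

def sbc(rn, op2, c):
    return (rn - op2 - (1 - c)) & MASK   # carry clear means a borrow is pending

def rsb(rn, op2):
    return (op2 - rn) & MASK             # operands reversed

def rsc(rn, op2, c):
    return (op2 - rn - (1 - c)) & MASK

def mul(rm, rs):
    return (rm * rs) & MASK              # low 32 bits of the product

def mla(rm, rs, rn):
    return (rm * rs + rn) & MASK         # multiply, then accumulate

# 64-bit addition built from two 32-bit adds: 0x1_FFFFFFFF + 0x0_00000001
low, carry = adds(0xFFFFFFFF, 0x00000001)
high = adc(0x00000001, 0x00000000, carry)
print(hex(high), hex(low))
```

The usage lines show why the add-with-carry form exists: the carry out of the low word feeds the addition of the high word.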

Page 31: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Logic Instructions
• Bitwise OR two words; the result is in the third operand (ORR).
• Bitwise AND two words; the result is in the third operand (AND).
• Bitwise Exclusive OR two words; the result is in the third operand (EOR).
• Clear a bit (BIC). [There is one source for the bits, a second source for the mask, and the result is at the third operand.]
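The logic instructions map directly onto bitwise operators. A minimal sketch, with hypothetical function names (`and_` carries a trailing underscore only because `and` is a Python keyword):

```python
MASK = 0xFFFFFFFF

def orr(rn, op2):
    return (rn | op2) & MASK

def and_(rn, op2):
    return (rn & op2) & MASK

def eor(rn, op2):
    return (rn ^ op2) & MASK

def bic(rn, op2):
    """Clear in rn every bit that is set in the mask op2."""
    return rn & ~op2 & MASK

status = 0b11110110
print(bin(bic(status, 0b00000110)))  # clears bits 1 and 2
```

BIC is equivalent to an AND with the complement of the mask, which is why it takes one source for the bits and one for the mask.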

Page 32: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Conditional implementation of arithmetical or logical instructions
• Example: SUBGE r1, r3, r5. The operand in r5 is subtracted from r3 and the result is placed in r1 if the GE condition resulted earlier (N and V status bits equal on comparison of two signed numbers).
• Conditions can be the results of a comparison or test.

Page 33: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Compare and Test Instructions
• The result goes to the CPSR, which stores the four condition bits N, V, C and Z.
• Bitwise test (AND) of two words (TST).
• Bitwise test for equivalence (exclusive OR) of two words (TEQ).
• Compare two words; the result is at the CPSR condition bits (CMP).
• Compare negative: compare the first word with the negative of the second; the result is at the CPSR condition bits (CMN).
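The test instructions only update flags and discard the result. The sketch below follows the ARM architecture definitions (TST is a flag-setting AND, TEQ a flag-setting exclusive OR, CMN a flag-setting addition); it is simplified in that `tst` and `teq` return only N and Z, and the function names are assumptions.

```python
MASK = 0xFFFFFFFF

def nz(result):
    """N and Z flags for a 32-bit result."""
    return result >> 31, int(result == 0)

def tst(a, b):
    """Flags of a AND b: Z is set when no tested bit is set in a."""
    return nz(a & b & MASK)

def teq(a, b):
    """Flags of a EOR b: Z is set when the two words are identical."""
    return nz((a ^ b) & MASK)

def cmn(a, b):
    """Flags N, Z, C, V of a + b, i.e. a compared with the negative of b."""
    a &= MASK
    b &= MASK
    total = a + b
    result = total & MASK
    n, z = nz(result)
    c = total >> 32
    v = (~(a ^ b) & (a ^ result) & 0x80000000) >> 31
    return n, z, c, v

print(tst(0b1010, 0b0100), teq(7, 7), cmn(5, -5 & MASK))
```

In the last call Z comes out set, showing that the first operand equals the negative of the second.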

Page 34: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Program-Flow Control Instructions
• Branching (B) or branch conditional operations.
• Branch to an address relative to the PC word in r15 (B). 'B #1A8' means add 1A8 to the PC and change the program flow.
• 'BGE #100' means that if a GE condition resulted on a compare or test, add 100 to the PC.
• Similar instructions for different conditions of the processor status flags.
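How a PC-relative branch resolves to an address can be sketched from the ARM-state encoding. The details below are taken from the ARM architecture rather than the slides: the instruction holds a signed 24-bit word offset, and the PC reads as the branch address plus 8 because of the pipeline. The function names and the address 0x8000 are hypothetical.

```python
def branch_target(instr_addr, instr_word):
    """Target address of an ARM-state B instruction located at instr_addr."""
    offset = instr_word & 0x00FFFFFF
    if offset & 0x00800000:           # sign-extend the 24-bit field
        offset -= 1 << 24
    return (instr_addr + 8 + (offset << 2)) & 0xFFFFFFFF

def encode_branch(instr_addr, target, cond=0xE):
    """Build the B instruction word; cond 0xE is 'always', 0xA is GE."""
    offset = (target - (instr_addr + 8)) >> 2
    return (cond << 28) | (0b101 << 25) | (offset & 0x00FFFFFF)

word = encode_branch(0x8000, 0x8000)  # a branch to itself
print(hex(word), hex(branch_target(0x8000, word)))
```

A conditional branch such as BGE differs only in the top four condition bits; the offset arithmetic is the same.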

Page 35: ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION – ARM

Software Interrupt instruction
• SWI has an 8-bit opcode and the remaining bits are not used by the processor.
• Gives a single vector address of the ISR for SWI.
• Remaining bits in SWI are backtracked by the programmer to compute the ISR and ISR parameter pointers.
• This unique feature permits handling the large number of SWIs required in the OS and application functions or threads or tasks.
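The backtracking step described above can be sketched as follows: the single SWI handler reads the instruction that trapped and uses its low 24 bits to select a service. The sketch assumes ARM state, where the link register points just past the SWI so the instruction sits at lr minus 4; memory is modelled as a dictionary, and the service numbers and names in `SERVICES` are made up for illustration.

```python
def swi_comment(instr_word):
    """Return the low 24 bits of a SWI instruction, which the processor ignores."""
    return instr_word & 0x00FFFFFF

def is_swi(instr_word):
    """True when bits [27:24] are 1111, the SWI pattern in ARM state."""
    return ((instr_word >> 24) & 0xF) == 0xF

# Hypothetical service table; the numbers and names are made up for this sketch.
SERVICES = {
    0x01: lambda arg: f"write({arg})",
    0x11: lambda arg: f"exit({arg})",
}

def swi_handler(memory, lr, arg):
    """Single entry point: find the SWI that trapped and dispatch on its number."""
    instr = memory[lr - 4]        # in ARM state lr points just past the SWI
    assert is_swi(instr)
    return SERVICES[swi_comment(instr)](arg)

memory = {0x8000: 0xEF000011}     # SWI 0x11 with condition 'always', at 0x8000
print(swi_handler(memory, 0x8004, 0))
```

One vector address therefore serves every software interrupt, and adding a service means adding a table entry rather than a new vector.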