The ARM Architecture
Delivered at San Jose State University
28/4/10
Agenda
  Introduction to ARM Ltd
  ARM Architecture/Programmers Model
  Data Path and Pipelines
  AMBA
  Development Tools
ARM Ltd
Founded in November 1990
  Spun out of Acorn Computers
Designs the ARM range of RISC processor cores
Licenses ARM core designs to semiconductor partners who fabricate and sell to their customers
  ARM does not fabricate silicon itself
Also develops technologies to assist with the design-in of the ARM architecture
  Software tools, boards, debug hardware, application software, bus architectures, peripherals, etc.
ARM’s Activities
[Diagram: ARM's activities, surrounded by the Connected Community. Labels: Processors, System Level IP (Data Engines, Fabric, 3D Graphics), Physical IP, Software IP, Development Tools, memory, SoC.]
ARM Connected Community – 550+
Nokia N95 Multimedia Computer
OMAP™ 2420 Applications Processor
  ARM1136™ processor-based SoC, developed using Magma® Blast® family and winner of 2005 INSIGHT Award for 'Most Innovative SoC'
Symbian OS™ v9.2
  Operating System supporting ARM processor-based mobile devices, developed using ARM® RealView® Compilation Tools
S60™ 3rd Edition
  S60 Platform supporting ARM processor-based mobile devices
Mobiclip™ Video Codec
  Software video codec for ARM processor-based mobile devices
ST WLAN Solution
  Ultra-low power 802.11b/g WLAN chip with ARM9™ processor-based MAC
Connect. Collaborate. Create.
Applications
Architecture Versions
[Diagram: ARM processor cores grouped by architecture version (ARMv4, ARMv5, ARMv6, ARMv7-Cortex). Cores shown: ARM7TDMI(S)™, SC100™, ARM920T™, ARM922T™, ARM7EJ-S™, SC200™, ARM926EJ-S™, ARM946E-S™, ARM966E-S™, ARM968E-S™, ARM1026EJ-S™, ARM1136J(F)-S™, ARM1156T2(F)-S™, ARM1176JZ(F)-S™, ARM11™ MPCore™ (x1-4), Cortex-A8, Cortex-A9 (x1-4), Cortex-R4, Cortex-R4F, Cortex™-M3, SC300™, Cortex-M1.]
Relative Performance*
[Chart: attainable clock frequency (Freq, MHz, axis 0 to 1200) and power efficiency (mW/MHz) per processor. Processors: Cortex-A8, ARM1176JZ-S, ARM926EJ-S, ARM920T, ARM7TDMI, ARM1136J-S, ARM1026EJ-S. mW/MHz data labels: 0.43, 0.568, 0.36, 0.335, 0.235, 0.25, 0.35.]
*Represents attainable speeds in 130, 90 or 65nm processes
Cortex family
Cortex-A8
  Architecture v7A
  MMU
  AXI
  VFP & NEON support
Cortex-R4
  Architecture v7R
  MPU (optional)
  AXI
  Dual Issue
Cortex-M3
  Architecture v7M
  MPU (optional)
  AHB Lite & APB
Data Sizes and Instruction Sets
The ARM is a 32-bit architecture.
When used in relation to the ARM:
  Byte means 8 bits
  Halfword means 16 bits (two bytes)
  Word means 32 bits (four bytes)
Most ARMs implement two instruction sets
  32-bit ARM Instruction Set
  16-bit Thumb Instruction Set
Jazelle cores can also execute Java bytecode
ARM and Thumb Performance
[Chart: Dhrystone 2.1/sec @ 20MHz (axis 0 to 30000) for ARM and Thumb code, by memory width (zero wait state): 32-bit, 16-bit, 16-bit with 32-bit stack.]
Thumb-2 Instruction Set
Second generation of the Thumb architecture
  Blended 16-bit and 32-bit instruction set
  25% faster than Thumb
  30% smaller than ARM
Increases performance but maintains code density
Maximizes cache and tightly coupled memory usage
[Charts: EEMBC Analysis - Performance; EEMBC Analysis - Code Size]
Processor Modes
The ARM has seven basic operating modes:
  User: unprivileged mode under which most tasks run
  FIQ: entered when a high priority (fast) interrupt is raised
  IRQ: entered when a low priority (normal) interrupt is raised
  Supervisor: entered on reset and when a Software Interrupt instruction is executed
  Abort: used to handle memory access violations
  Undef: used to handle undefined instructions
  System: privileged mode using the same registers as user mode
The ARM Register Set
[Diagram, repeated for User, FIQ, IRQ, SVC, Undef and Abort modes: the current visible registers alongside the banked out registers of the other modes.]
Current Visible Registers (in every mode)
  r0 to r12
  r13 (sp)
  r14 (lr)
  r15 (pc)
  cpsr
  spsr (exception modes only)
Banked Registers
  FIQ: r8 to r12, r13 (sp), r14 (lr), spsr
  IRQ, SVC, Undef, Abort: r13 (sp), r14 (lr), spsr (a separate copy per mode)
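The banking scheme on this slide can be sketched as a small lookup. This is an illustrative model only, not how the hardware is built: the function names and the physical register labels such as `r13_irq` are assumptions made for this sketch.

```python
# Illustrative model of ARM register banking. Labels such as "r13_irq"
# are naming assumptions for this sketch, not identifiers from the slides.
FIQ_BANKED = {"r8", "r9", "r10", "r11", "r12", "r13", "r14"}
EXC_BANKED = {"r13", "r14"}  # IRQ, SVC, Undef and Abort bank only sp and lr
SUFFIX = {"FIQ": "fiq", "IRQ": "irq", "SVC": "svc", "Undef": "und", "Abort": "abt"}


def physical_register(mode: str, reg: str) -> str:
    """Return the physical register that `reg` refers to while in `mode`."""
    if mode in ("User", "System"):
        return reg  # System mode uses the same registers as User mode
    banked = FIQ_BANKED if mode == "FIQ" else EXC_BANKED
    return f"{reg}_{SUFFIX[mode]}" if reg in banked else reg


def has_spsr(mode: str) -> bool:
    """Only the exception modes have a saved program status register."""
    return mode in SUFFIX
```

For example, `physical_register("FIQ", "r8")` yields `"r8_fiq"`, while `physical_register("IRQ", "r8")` yields plain `"r8"`, which is why an FIQ handler can use r8 to r12 without saving them first.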
Exception Handling
When an exception occurs, the ARM:
  Copies CPSR into SPSR_<mode>
  Sets appropriate CPSR bits
    Change to ARM state
    Change to exception mode
    Disable interrupts (if appropriate)
  Stores the return address in LR_<mode>
  Sets PC to vector address
To return, exception handler needs to:
  Restore CPSR from SPSR_<mode>
  Restore PC from LR_<mode>
This can only be done in ARM state.
Vector Table
  0x1C  FIQ
  0x18  IRQ
  0x14  (Reserved)
  0x10  Data Abort
  0x0C  Prefetch Abort
  0x08  Software Interrupt
  0x04  Undefined Instruction
  0x00  Reset
Vector table can be at 0xFFFF0000 on ARM720T and on ARM9/10 family devices
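The entry and return sequence can be sketched as a toy state machine. This is a simplified model under stated assumptions: the `Cpu` class and function names are invented for illustration, the caller supplies the return address (real handlers adjust LR by an exception-specific offset), and the rule that every exception disables IRQ while only FIQ and Reset disable FIQ comes from the architecture rather than from the slide.

```python
from dataclasses import dataclass, field

# Vector offsets as listed in the vector table.
VECTORS = {
    "Reset": 0x00, "Undefined Instruction": 0x04, "Software Interrupt": 0x08,
    "Prefetch Abort": 0x0C, "Data Abort": 0x10, "IRQ": 0x18, "FIQ": 0x1C,
}
# Mode entered for each exception.
EXC_MODE = {
    "Reset": "SVC", "Undefined Instruction": "Undef", "Software Interrupt": "SVC",
    "Prefetch Abort": "Abort", "Data Abort": "Abort", "IRQ": "IRQ", "FIQ": "FIQ",
}


@dataclass
class Cpu:  # hypothetical container for the state this sketch tracks
    pc: int = 0
    mode: str = "User"
    thumb: bool = False
    irq_disabled: bool = False
    fiq_disabled: bool = False
    spsr: dict = field(default_factory=dict)  # SPSR_<mode>
    lr: dict = field(default_factory=dict)    # LR_<mode>


def take_exception(cpu, kind, return_address, high_vectors=False):
    new_mode = EXC_MODE[kind]
    # Copy CPSR into SPSR_<mode>.
    cpu.spsr[new_mode] = (cpu.mode, cpu.thumb, cpu.irq_disabled, cpu.fiq_disabled)
    # Set CPSR bits: ARM state, exception mode, interrupts disabled as appropriate.
    cpu.mode, cpu.thumb, cpu.irq_disabled = new_mode, False, True
    if kind in ("FIQ", "Reset"):
        cpu.fiq_disabled = True
    cpu.lr[new_mode] = return_address  # store return address in LR_<mode>
    cpu.pc = (0xFFFF0000 if high_vectors else 0) + VECTORS[kind]


def return_from_exception(cpu):
    mode = cpu.mode
    cpu.pc = cpu.lr[mode]  # restore PC from LR_<mode>
    cpu.mode, cpu.thumb, cpu.irq_disabled, cpu.fiq_disabled = cpu.spsr[mode]
```

Taking an IRQ from Thumb-state User code moves the model to IRQ mode in ARM state at address 0x18, and returning restores both the mode and the Thumb state from the saved status.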
Program Status Registers
Bit layout (fields: f = bits 31-24, s = bits 23-16, x = bits 15-8, c = bits 7-0)
  31  N
  30  Z
  29  C
  28  V
  27  Q
  24  J
  23-8  Undefined
  7  I
  6  F
  5  T
  4-0  mode
Condition code flags
  N = Negative result from ALU
  Z = Zero result from ALU
  C = ALU operation Carried out
  V = ALU operation oVerflowed
Sticky Overflow flag - Q flag
  Architecture 5TE/J only
  Indicates if saturation has occurred
J bit
  Architecture 5TEJ only
  J = 1: Processor in Jazelle state
Interrupt Disable bits
  I = 1: Disables the IRQ
  F = 1: Disables the FIQ
T Bit
  Architecture xT only
  T = 0: Processor in ARM state
  T = 1: Processor in Thumb state
Mode bits
  Specify the processor mode
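As a rough illustration of the layout, the fields can be pulled out of a 32-bit status word with shifts and masks. The function name `decode_psr` is an assumption for this sketch, and the mode encodings in `MODES` are the standard ARM values, which the slide itself does not list.

```python
# Standard ARM mode-field encodings (bits 4-0); not listed on the slide.
MODES = {
    0b10000: "User", 0b10001: "FIQ", 0b10010: "IRQ", 0b10011: "Supervisor",
    0b10111: "Abort", 0b11011: "Undef", 0b11111: "System",
}


def decode_psr(value: int) -> dict:
    """Split a 32-bit CPSR/SPSR value into its named fields."""
    def bit(n: int) -> int:
        return (value >> n) & 1

    return {
        "N": bit(31), "Z": bit(30), "C": bit(29), "V": bit(28),  # condition flags
        "Q": bit(27),                                           # sticky overflow
        "J": bit(24),                                           # Jazelle state
        "I": bit(7), "F": bit(6),                               # interrupt disables
        "T": bit(5),                                            # Thumb state
        "mode": MODES.get(value & 0x1F, "unknown"),
    }
```

For instance, `decode_psr(0x600000D3)` reports Z and C set, both interrupts disabled, ARM state, and Supervisor mode.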
Cortex-M3 Programmer’s Model
Fully programmable in C
Stack-based exception model
Only two processor modes
  Thread Mode for User tasks
  Handler Mode for OS tasks and exceptions
Vector table contains addresses
[Diagram: register set r0 to r12, sp (banked as Main and Process), lr, r15 (pc), xPSR]
Conditional Execution and Flags
ARM instructions can be made to execute conditionally by postfixing them with the appropriate condition code field.
  This improves code density and performance by reducing the number of forward branch instructions.
      CMP r3,#0                CMP r3,#0
      BEQ skip                 ADDNE r0,r1,r2
      ADD r0,r1,r2
    skip
By default, data processing instructions do not affect the condition code flags but the flags can be optionally set by using "S". CMP does not need "S".
    loop
      …
      SUBS r1,r1,#1      ; decrement r1 and set flags
      BNE loop           ; if Z flag clear then branch
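The two fragments on this slide can be mimicked in a few lines. This is a behavioural sketch, not an instruction-set simulator: the helper names are made up for illustration, operands are assumed to be unsigned 32-bit values, and only the EQ, NE and AL conditions are modelled.

```python
MASK = 0xFFFFFFFF


def subs(a: int, b: int):
    """Model SUBS/CMP on unsigned 32-bit operands: return (result, flags)."""
    result = (a - b) & MASK
    flags = {
        "N": result >> 31,
        "Z": int(result == 0),
        "C": int(a >= b),  # on ARM, carry set after a subtract means no borrow
        "V": ((a ^ b) & (a ^ result)) >> 31,  # signed overflow
    }
    return result, flags


# A condition code is a predicate over the flags.
CONDITIONS = {
    "EQ": lambda f: f["Z"] == 1,
    "NE": lambda f: f["Z"] == 0,
    "AL": lambda f: True,
}


def add_if_nonzero(r0: int, r1: int, r2: int, r3: int) -> int:
    """CMP r3,#0 followed by ADDNE r0,r1,r2."""
    _, flags = subs(r3, 0)       # CMP sets flags and discards the result
    if CONDITIONS["NE"](flags):  # ADDNE executes only when Z is clear
        r0 = (r1 + r2) & MASK
    return r0


def countdown(r1: int) -> int:
    """Count how many times the SUBS/BNE loop body runs."""
    iterations = 0
    while True:
        iterations += 1
        r1, flags = subs(r1, 1)          # SUBS r1,r1,#1
        if not CONDITIONS["NE"](flags):  # BNE loop falls through once Z is set
            return iterations
```

With r1 starting at 3, the loop body runs three times before the Z flag ends it.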
Classes of Instructions (v4T)
[Diagram: instruction classes]
  Data Operations
  Load/Store
  Miscellaneous
  Change of Flow
    MOV PC, Rm
    Bcc
    BL
    BLX
Data processing Instructions
Consist of:
  Arithmetic: ADD ADC SUB SBC RSB RSC
  Logical: AND ORR EOR BIC
  Comparisons: CMP CMN TST TEQ
  Data movement: MOV MVN
These instructions only work on registers, NOT memory.
Syntax:
  <Operation>{<cond>}{S} Rd, Rn, Operand2
  Comparisons set flags only - they do not specify Rd
  Data movement does not specify Rn
Second operand is sent to the ALU via barrel shifter.
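One way to see why ADC sits beside ADD in the arithmetic group is a 64-bit addition built from two 32-bit operations. The sketch below models that idiom; the function names and the register assignments in the comments are assumptions for illustration.

```python
MASK = 0xFFFFFFFF


def adds(a: int, b: int):
    """Model ADDS: return the 32-bit result and the carry flag."""
    total = a + b
    return total & MASK, total >> 32


def adc(a: int, b: int, carry: int) -> int:
    """Model ADC: add two registers plus the carry flag."""
    return (a + b + carry) & MASK


def add64(lo_a: int, hi_a: int, lo_b: int, hi_b: int):
    """Add two 64-bit values held as (low word, high word) pairs."""
    lo, carry = adds(lo_a, lo_b)  # ADDS r0, r0, r2  (sets C)
    hi = adc(hi_a, hi_b, carry)   # ADC  r1, r1, r3  (consumes C)
    return lo, hi
```

Adding 1 to 0x00000000FFFFFFFF this way carries out of the low word and gives a high word of 1.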
Using a Barrel Shifter: The 2nd Operand
Register, optionally with shift operation
  Shift value can be either:
    5 bit unsigned integer
    Specified in bottom byte of another register
  Used for multiplication by constant
Immediate value
  8 bit number, with a range of 0-255
    Rotated right through even number of positions
  Allows increased range of 32-bit constants to be loaded directly into registers
[Diagram: Operand 1 feeds the ALU directly; Operand 2 passes through the Barrel Shifter into the ALU, which produces the Result.]
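Both uses of the second operand can be checked with a short sketch. The helper names are hypothetical, and `encode_immediate` simply searches the sixteen even rotations for one that leaves an 8-bit value, which is one straightforward way to test whether a constant is encodable, not necessarily how an assembler does it.

```python
MASK = 0xFFFFFFFF


def ror(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK


def encode_immediate(constant: int):
    """Return (imm8, rotate) with ror(imm8, rotate) == constant, else None."""
    for rotate in range(0, 32, 2):  # only even rotations are available
        imm8 = ror(constant, 32 - rotate)  # rotating left undoes the ROR
        if imm8 <= 0xFF:
            return imm8, rotate
    return None


def times_five(r1: int) -> int:
    """ADD r0, r1, r1, LSL #2 computes r1 + r1*4."""
    return (r1 + (r1 << 2)) & MASK
```

0xFF000000 is encodable as 0xFF rotated right by 8, whereas 0x102 is not: its two set bits span eight positions, but only an odd rotation would bring them into the low byte.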
Single register data transfer
  LDR    STR    Word
  LDRB   STRB   Byte
  LDRH   STRH   Halfword
  LDRSB         Signed byte load
  LDRSH         Signed halfword load
Memory system must support all access sizes
Syntax:
  LDR{<cond>}{<size>} Rd, <address>
  STR{<cond>}{<size>} Rd, <address>
e.g. LDREQB
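The difference between the plain and signed loads is whether the loaded value is zero-extended or sign-extended to 32 bits. The sketch below shows this on a four-byte memory image; it assumes little-endian memory, and the function names simply mirror the mnemonics.

```python
import struct

MASK = 0xFFFFFFFF

# One word of little-endian memory (an assumption of this sketch):
# bytes at addresses 0..3 are 0x80, 0xFF, 0x01, 0x80.
memory = bytearray(struct.pack("<I", 0x8001FF80))


def ldr(addr: int) -> int:
    return struct.unpack_from("<I", memory, addr)[0]


def ldrb(addr: int) -> int:
    return memory[addr]  # zero-extended byte


def ldrh(addr: int) -> int:
    return struct.unpack_from("<H", memory, addr)[0]  # zero-extended halfword


def ldrsb(addr: int) -> int:
    # Signed byte, sign-extended into a 32-bit register value.
    return struct.unpack_from("<b", memory, addr)[0] & MASK


def ldrsh(addr: int) -> int:
    # Signed halfword, sign-extended into a 32-bit register value.
    return struct.unpack_from("<h", memory, addr)[0] & MASK
```

Loading the byte 0x80 with `ldrb` leaves 0x00000080 in the register, while `ldrsb` produces 0xFFFFFF80.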
The ARM7TDMI Core
[Block diagram. Blocks: Address Register, Address Incrementer, Register Bank, PC Update, Multiplier, Barrel Shifter, 32 Bit ALU, Write Data Register, Read Data Register, Instruction Decoder (Decode Stage, Instruction Decompression) and Control Logic, linked by the A bus, B bus and ALU bus. Signals: A[31:0], ABE, D[31:0], DBE, nWAIT, MCLK, BIGEND, nRESET, nIRQ, nFIQ, nRW, MAS[1:0], ISYNC, nMREQ, SEQ, ABORT, LOCK, nTRANS, nM[4:0], nCPI, CPA, CPB, nOPC.]
Pipeline changes for ARM9TDMI
ARM7TDMI
  FETCH: Instruction Fetch
  DECODE: Thumb→ARM decompress, ARM decode, Reg Select
  EXECUTE: Reg Read, Shift, ALU, Reg Write
ARM9TDMI
  FETCH: Instruction Fetch
  DECODE: ARM or Thumb Inst Decode, Reg Decode, Reg Read
  EXECUTE: Shift + ALU
  MEMORY: Memory Access
  WRITE: Reg Write
ARM10 vs. ARM11 Pipelines
ARM10
  FETCH: Branch Prediction, Instruction Fetch
  ISSUE: ARM or Thumb Instruction Decode
  DECODE: Reg Read
  EXECUTE: Shift + ALU, Multiply
  MEMORY: Memory Access, Multiply Add
  WRITE: Reg Write
ARM11
  Fetch 1
  Fetch 2
  Decode
  Issue
  Three parallel paths:
    Shift, ALU, Saturate
    MAC 1, MAC 2, MAC 3
    Address, Data Cache 1, Data Cache 2
  Write back
Full Cortex-A8 Pipeline Diagram
[Diagram: 13-Stage Integer Pipeline and 10-Stage NEON Pipeline, with the architectural register file and the NEON register file.]
An Example AMBA System
[Diagram: the AHB connects a High Performance ARM processor, High-bandwidth on-chip RAM, a High Bandwidth External Memory Interface, a DMA Bus Master and the APB Bridge. The APB connects UART, Timer, Keypad and PIO.]
AHB
  High Performance
  Pipelined
  Burst Support
  Multiple Bus Masters
APB
  Low Power
  Non-pipelined
  Simple Interface
AHB Structure
[Diagram: Master #1 to Master #3, Slave #1 to Slave #4, an Arbiter and a Decoder, connected by Address/Control (HADDR), Write Data (HWDATA) and Read Data (HRDATA) buses.]
Keil Development Tools for ARM
Includes ARM macro assembler, compilers (ARM RealView C/C++ Compiler, Keil CARM Compiler, or GNU compiler), ARM linker, Keil uVision Debugger and Keil uVision IDE
Keil uVision Debugger accurately simulates on-chip peripherals (I2C, CAN, UART, SPI, Interrupts, I/O Ports, A/D and D/A converters, PWM, etc.)
Evaluation Limitations
  16K byte object code + 16K data limitation
  Some linker restrictions such as base addresses for code/constants
  GNU tools provided are not restricted in any way
http://www.keil.com/demo/
University Resources
http://www.arm.com/support/university/
University@arm.com
Beagle Board
Targeting community development
  $149
  Personally affordable
  > 1000 participants and growing
  Active & technical community
  Wikis, blogs, promotion of community activity
  Freedom to innovate
  Open access to hardware documentation
  Instant access to >10 million lines of code
  Addressing open source community needs
  Free software
  Opportunity to tinker and learn
Fast, low power, flexible expansion
OMAP3530 Processor
  600MHz Cortex-A8
    NEON+VFPv3
    16KB/16KB L1$
    256KB L2$
  430MHz C64x+ DSP
    32K/32K L1$
    48K L1D
    32K L2
  PowerVR SGX GPU
  64K on-chip RAM
POP Memory
  128MB LPDDR RAM
  256MB NAND flash
Peripheral I/O
  DVI-D video out
  SD/MMC+
  S-Video out
  USB 2.0 HS OTG
  I2C, I2S, SPI, MMC/SD
  JTAG
  Stereo in/out
  Alternate power
  RS-232 serial
3" board
USB Powered
  2W maximum consumption
    OMAP is small % of that
  Many adapter options
    Car, wall, battery, solar, …
And more…
On-going collaboration at BeagleBoard.org
  Live chat via IRC for 24/7 community support
  Links to software projects to download
Other Features
  4 LEDs
    USR0
    USR1
    PMU_STAT
    PWR
  2 buttons
    USER
    RESET
  4 boot sources
    SD/MMC
    NAND flash
    USB
    Serial
Peripheral I/O
  DVI-D video out, SD/MMC+, S-Video out, USB HS OTG, I2C, I2S, SPI, MMC/SD, JTAG, Stereo in/out, Alternate power, RS-232 serial
Project Ideas Using Beagle
OS Projects
  OS porting to ARM/Cortex (TI OMAP), such as open source FreeBSD
  MythTV system
  "Super-Beagle" – stack of Beagles as compute engine and task distribution
NEON Optimization Projects
  Codec optimization in ffmpeg (pick your favorite codec)
  Voice and image recognition
  Open-source Flash player optimizations (swfdec)
Fin