
Instruction Set Architecture

Arquitectura de Computadoras
Arturo Díaz Pérez
Centro de Investigación y de Estudios Avanzados del IPN
Laboratorio de Tecnologías de Información
[email protected] (cinvestav.mx)

Instruction Set

"... the attributes of a [computing] system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls of logic design, and the physical implementation."

Amdahl, Blaauw, Brooks, 1964.

Instruction Set Design

[Figure: the instruction set shown as the interface between software and hardware]

Which is easier to change/design?

Instruction Set Architecture

"... the attributes of a [computing] system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls of logic design, and the physical implementation."

Amdahl, Blaauw, Brooks, 1964.

♦ Organization of programmable storage
♦ Data types & data structures: encodings and representations
♦ Instruction formats
♦ Instruction (or operation code) set
♦ Modes of addressing and accessing data items and instructions
♦ Exceptional conditions

ISA: What Must be Specified?

[Figure: instruction cycle: Instruction Fetch -> Instruction Decode -> Operand Fetch -> Execute -> Result Store -> Next Instruction]

♦ Instruction format or encoding
  ■ how is it decoded?
♦ Location of operands and result
  ■ where other than memory?
  ■ how many explicit operands?
  ■ how are memory operands located?
  ■ which can or cannot be in memory?
♦ Data type and size
♦ Operations
  ■ what are supported
♦ Successor instruction
  ■ jumps, conditions, branches
  ■ fetch-decode-execute is implicit!

Evolution of Instruction Sets

♦ Single Accumulator (EDSAC, 1950)
♦ Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
♦ Separation of Programming Model from Implementation
♦ High-level Language Based (B5000, 1963)
♦ Concept of a Family (IBM 360, 1964)
♦ General Purpose Register Machines
♦ Complex Instruction Sets (VAX, Intel 432, 1977-80)
♦ Load/Store Architecture (CDC 6600, Cray 1, 1963-76)
♦ RISC (MIPS, SPARC, 88000, IBM RS6000, ... 1987)

Basic ISA Classes

Accumulator:
  1 address      add A           acc ← acc + mem[A]
  1+x address    addx A          acc ← acc + mem[A + x]

Stack:
  0 address      add             tos ← tos + next

General Purpose Register:
  2 address      add A B         EA(A) ← EA(A) + EA(B)
  3 address      add A B C       EA(A) ← EA(B) + EA(C)

Load/Store:
  3 address      add Ra Rb Rc    Ra ← Rb + Rc
                 load Ra Rb      Ra ← mem[Rb]
                 store Ra Rb     mem[Rb] ← Ra

Most real machines are hybrids of these.

Comparing Number of Instructions

Code sequence for (C = A + B) for four classes of instruction sets:

Stack      Accumulator    Register (register-memory)    Register (load-store)
Push A     Load A         Load R1,A                     Load R1,A
Push B     Add B          Add R1,B                      Load R2,B
Add        Store C        Store C,R1                    Add R3,R1,R2
Pop C                                                   Store C,R3

Comparison: Bytes per instruction? Number of instructions? Cycles per instruction?
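The difference between the stack and load-store sequences for C = A + B can be made concrete with two toy interpreters. This is only an illustrative sketch: the tuple-based program encoding and the names `run_stack`, `run_load_store`, `STACK_PROGRAM` and `LOAD_STORE_PROGRAM` are assumptions made for this example, not part of any real ISA or of the slides.

```python
# Hypothetical toy interpreters; only the mnemonics come from the slide.

def run_stack(program, mem):
    """Zero-address machine: operands are implicit on an evaluation stack."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(mem[args[0]])      # push mem[A]
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)             # tos <- tos + next
        elif op == "pop":
            mem[args[0]] = stack.pop()      # mem[C] <- tos
    return mem


def run_load_store(program, mem):
    """Load-store machine: only load/store touch memory, ALU ops use registers."""
    regs = {}
    for op, *args in program:
        if op == "load":
            regs[args[0]] = mem[args[1]]
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "store":
            mem[args[0]] = regs[args[1]]
    return mem


STACK_PROGRAM = [("push", "A"), ("push", "B"), ("add",), ("pop", "C")]
LOAD_STORE_PROGRAM = [
    ("load", "R1", "A"),
    ("load", "R2", "B"),
    ("add", "R3", "R1", "R2"),
    ("store", "C", "R3"),
]

print(run_stack(STACK_PROGRAM, {"A": 2, "B": 3})["C"])
print(run_load_store(LOAD_STORE_PROGRAM, {"A": 2, "B": 3})["C"])
```

Both programs use four instructions here, but the stack instructions name at most one operand each, which is why instruction count alone does not settle the bytes-per-instruction question.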

General Purpose Registers Dominate

♦ 1975-200x: all machines use general purpose registers
♦ Advantages of registers
  ■ registers are faster than memory
  ■ registers are easier for a compiler to use
    » e.g., (A*B) - (C*D) - (E*F) can do multiplies in any order vs. stack
  ■ registers can hold variables
    » memory traffic is reduced, so program is sped up (since registers are faster than memory)
    » code density improves (since register named with fewer bits than memory location)

Caches vs. Registers

♦ Register advantages
  ■ Faster (no addressing mode, no tags)
  ■ Deterministic (no misses)
  ■ Can duplicate for two ports
  ■ Short identifier (3-8 bits)
♦ Register disadvantages
  ■ Must save/restore on procedure calls
  ■ Can't take the address of a register
  ■ Fixed size (FP, strings, structures)
  ■ Compiler must control (?)

Caches vs. Registers (cont'd)

♦ How many registers? More means:
  + Hold operands longer (reducing memory traffic & potentially execution time)
  - Longer register specifiers (except with register windows)
  - Slower registers
  - More state slows context switches

MIPS I Registers

♦ Programmable storage
  ■ 2^32 x bytes of memory
  ■ 31 x 32-bit GPRs (R0 = 0)
  ■ 32 x 32-bit FP regs (paired DP)
  ■ HI, LO, PC

[Figure: register file r0 (always 0), r1, ..., r31, plus PC, lo, hi]

Typical Operations

Data Movement    Load (from memory)
                 Store (to memory)
                 memory-to-memory move
                 register-to-register move
                 input (from I/O device)
                 output (to I/O device)
                 push, pop (to/from stack)
Arithmetic       integer (binary + decimal) or FP
                 Add, Subtract, Multiply, Divide
Logical          not, and, or, set, clear
Shift            shift left/right, rotate left/right

Typical Operations (cont.)

Control (Jump/Branch)    unconditional, conditional
Subroutine Linkage       call, return
Interrupt                trap, return
Synchronization          test & set (atomic r-m-w)
String                   search, translate
Graphics (MMX)           parallel subword ops (4 16-bit add)

Little change since 1960.

Top 10 80x86 Instructions

Rank   Instruction               Integer average (% of total executed)
1      load                      22%
2      conditional branch        20%
3      compare                   16%
4      store                     12%
5      add                        8%
6      and                        6%
7      sub                        5%
8      move register-register     4%
9      call                       1%
10     return                     1%
       Total                     96%

♦ Simple instructions dominate instruction frequency

Operation Summary

♦ Support these simple instructions, since they will dominate the number of instructions executed:
  ■ load, store, add, subtract, move register-register, and, shift, compare equal, compare not equal, branch, jump, call, return

Operands for ALU Instructions

♦ ALU instructions combine operands (e.g., ADD)
♦ Number of explicit operands
  ■ Two: destination equals one source
  ■ Three: orthogonal
♦ Operands in registers or memory
  ■ Any combination: VAX
    » (orthogonal, but variable instr. formats)
  ■ At least one register: much of 360
    » (not orthogonal)
  ■ All registers: CRAY, DLX, RISCs
    » (orthogonal, but needs loads/stores)

Memory Addressing

♦ Since 1980 almost every machine uses addresses to the level of 8 bits (byte)
♦ 2 questions for design of ISA:
  ■ Since one could read a 32-bit word as four loads of bytes from sequential byte addresses, or as one load word from a single byte address:
    » How do byte addresses map onto words?
    » Can a word be placed on any byte boundary?

Addressing Objects: Endian Wars

♦ Big Endian: address of most significant byte = word address (xx00 = big end of word)
  ■ IBM 360/370, Motorola 68k, MIPS, SPARC, HP PA
♦ Little Endian: address of least significant byte = word address (xx00 = little end of word)
  ■ Intel 80x86, DEC VAX, DEC Alpha (Windows NT)
♦ Mode selectable
  ■ becoming more common: PowerPC, MIPS R10000

Byte numbering within a word (msb on the left, lsb on the right):
  little endian:   3  2  1  0    (byte 0 at the lsb end)
  big endian:      0  1  2  3    (byte 0 at the msb end)
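The two byte orders can be observed directly with Python's built-in integer conversions. This is a small sketch; the sample value `0x0A0B0C0D` and the helper name `load_word` are arbitrary choices for illustration.

```python
value = 0x0A0B0C0D

# Byte 0 (lowest address) holds the most significant byte in big endian
# and the least significant byte in little endian.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")


def load_word(memory, address, byteorder):
    """Read a 32-bit word from a bytes-like memory image (hypothetical helper)."""
    return int.from_bytes(memory[address:address + 4], byteorder=byteorder)


print(big.hex(), little.hex())
# Reading big-endian data with the wrong byte order scrambles the value.
print(hex(load_word(big, 0, "little")))
```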

Addressing Objects: Alignment

♦ Alignment: require that objects fall on an address that is a multiple of their size.

[Figure: byte lanes 0 1 2 3, showing aligned and not-aligned objects]
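The alignment rule reduces to a remainder test. The following sketch uses two hypothetical helper names, `is_aligned` and `align_up`, and assumes the object size is a positive integer number of bytes.

```python
def is_aligned(address, size):
    """True when an object of `size` bytes starts on a multiple of its size."""
    return address % size == 0


def align_up(address, size):
    """Smallest aligned address that is >= address (what a compiler would pad to)."""
    return (address + size - 1) // size * size


print(is_aligned(8, 4), is_aligned(6, 4), align_up(6, 4))
```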

Alignment

♦ No restrictions
  ■ Simpler software
  ■ Hardware must detect misalignment and make 2 memory accesses
  ■ Expensive logic, slows down all references
  ■ Sometimes required for backward compatibility
♦ Restricted alignment
  ■ Software must guarantee alignment
  ■ Hardware only detects misalignment and traps
  ■ Trap handler does it
♦ Middle ground
  ■ Misaligned data OK but requires multiple instructions
  ■ Compiler must still know
  ■ Still trap on misaligned access

A "typical" RISC

♦ 32-bit fixed format instruction (3 formats)
♦ 32 32-bit GPR (R0 contains zero, DP takes a pair)
♦ 3-address, reg-reg arithmetic instruction
♦ Single address mode for load/store: base + displacement
  ■ no indirection
♦ Simple branch conditions
♦ Delayed branch

see: SPARC, MIPS, MC88100, AMD 29000, i960, i860, PA-RISC, DEC Alpha, Clipper, CDC 6600, CDC 7600, Cray-1, Cray-2, Cray-3, ...

VAX-11

♦ Variable format, 2 and 3 address instruction
♦ 32-bit word size, 16 GPR (four reserved)
♦ Rich set of addressing modes (apply to any operand)
♦ Rich set of operations
  ■ bit-field, stack, call, case, loop, string, poly, system
♦ Rich set of data types (B, W, L, Q, O, F, D, G, H)
♦ Condition codes

Instruction layout:
  OpCode | A/M | A/M | A/M
  Byte:  0     1     n     m

VAX-11: Addressing Modes

 1. Register               Ri
 2. Base + Displacement    M[Ri + v]
 3. Immediate              v
 4. Register Indirect      M[Ri]
 5. Direct (absolute)      M[v]
 6. Base + Index           M[Ri + Rj]
 7. Scaled Index           M[Ri + Rj*d + v]
 8. Autoincrement          M[Ri++]
 9. Autodecrement          M[Ri--]
10. Memory Indirect        M[ M[Ri] ]
11. [Indirection chains]

Modes 1-4 account for 93% of all operands on the VAX.

[Figure: Memory and Register File]
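The memory-referencing modes in the list differ only in how the effective address is formed. The sketch below computes that address for six of them; the function name `effective_address`, the mode strings, and the dictionary-based register file and memory are assumptions made for illustration, not VAX encodings.

```python
def effective_address(mode, regs, mem, ri=None, rj=None, v=0, d=1):
    """Return the memory address selected by an operand specifier.

    Hypothetical helper: regs maps register names to values, mem maps
    addresses to values, v is the displacement/constant, d the scale factor.
    """
    if mode == "base+displacement":
        return regs[ri] + v                    # M[Ri + v]
    if mode == "register indirect":
        return regs[ri]                        # M[Ri]
    if mode == "direct":
        return v                               # M[v]
    if mode == "base+index":
        return regs[ri] + regs[rj]             # M[Ri + Rj]
    if mode == "scaled index":
        return regs[ri] + regs[rj] * d + v     # M[Ri + Rj*d + v]
    if mode == "memory indirect":
        return mem[regs[ri]]                   # M[M[Ri]]: address fetched from memory
    raise ValueError(f"mode has no memory address: {mode}")


regs = {"R1": 100, "R2": 3}
mem = {100: 500}
print(effective_address("scaled index", regs, mem, ri="R1", rj="R2", v=8, d=4))
```

Register and immediate operands are left out on purpose, since they do not refer to memory at all.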

Addressing Mode Usage

♦ 3 programs measured on machine with all address modes (VAX)
  ■ Displacement: 42% avg, 32% to 55%
  ■ Immediate: 33% avg, 17% to 43%
  ■ Register deferred (indirect): 13% avg, 3% to 24%
  ■ Scaled: 7% avg, 0% to 16%
  ■ Memory indirect: 3% avg, 1% to 6%
  ■ Misc: 2% avg, 0% to 3%
♦ 75% displacement & immediate
♦ 88% displacement, immediate & register indirect

Displacement Address Size?

♦ Avg. of 5 SPECint92 programs v. avg. 5 SPECfp92 programs
♦ 1% of addresses > 16 bits
♦ 12-16 bits of displacement needed

[Figure: histogram of displacement sizes; x-axis: address bits (0 to 15); y-axis: frequency (0% to 30%); series: Int. Avg., FP Avg.]

Immediate Size?

♦ 50% to 60% fit within 8 bits
♦ 75% to 80% fit within 16 bits

Addressing Summary

♦ Data addressing modes that are important:
  ■ Displacement, Immediate, Register Indirect
♦ Displacement size should be 12 to 16 bits
♦ Immediate size should be 8 to 16 bits
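Whether a given constant fits in a field of a chosen width is a simple range check. The sketch below assumes the field is interpreted as a two's complement signed number, which is an assumption of this example; the helper names `fits_signed` and `bits_needed_signed` are hypothetical.

```python
def fits_signed(value, bits):
    """True when value is representable as a two's complement field of `bits` bits."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


def bits_needed_signed(value):
    """Smallest two's complement field width that can hold value."""
    bits = 1
    while not fits_signed(value, bits):
        bits += 1
    return bits


print(fits_signed(127, 8), fits_signed(128, 8), bits_needed_signed(4000))
```

Measurements like those summarized above come from running this kind of check over every displacement or immediate in a program trace.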

Generic Example of Instruction Format Widths

[Figure: instruction format widths for variable, fixed, and hybrid encodings]

Instruction Formats

♦ If code size is most important, use variable length instructions
♦ If performance is most important, use fixed length instructions
♦ Recent embedded machines (ARM, MIPS) added an optional mode to execute a subset of 16-bit wide instructions (Thumb, MIPS16); per procedure, decide performance or density
♦ Some architectures are actually exploring on-the-fly decompression for more density

Instruction Format

♦ If there are many memory operands per instruction and/or many addressing modes:
  => Need one address specifier per operand
♦ If it is a load-store machine with 1 address per instr. and one or two addressing modes:
  => Can encode addressing mode in the opcode

MIPS Addressing Modes/Instruction Formats

♦ All instructions 32 bits wide

  Register (direct)    op | rs | rt | rd       operand in register
  Immediate            op | rs | rt | immed    operand is immed
  Base+index           op | rs | rt | immed    operand in Memory[register + immed]
  PC-relative          op | rs | rt | immed    target in Memory[PC + immed]

♦ Register indirect?
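The fixed 32-bit formats can be illustrated by packing the fields into a word. This sketch assumes the standard MIPS field widths (6-bit op, 5-bit rs/rt/rd/shamt, 6-bit funct, 16-bit immediate) and the usual opcode values for `add` (op 0, funct 0x20) and `lw` (op 0x23); the function names `encode_r` and `encode_i` are hypothetical.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack an R-type instruction: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, immed):
    """Pack an I-type instruction: op(6) rs(5) rt(5) immed(16)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immed & 0xFFFF)


# add $1,$2,$3  ->  rd=1, rs=2, rt=3
add_word = encode_r(0, 2, 3, 1, 0, 0x20)
# lw $1,30($2)  ->  rt=1, base rs=2, displacement 30
lw_word = encode_i(0x23, 2, 1, 30)

print(f"{add_word:08x} {lw_word:08x}")
```

Because the register fields sit at the same bit positions in both formats, a decoder can read the registers before it knows which format it has.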

Most Popular ISA of all time: Intel 80x86

♦ 1971: Intel invents microprocessor 4004/8008, 8080 in 1974
♦ 1975: Gordon Moore realized one more chance for new ISA before ISA locked in for decades
  ■ hired CS people in Oregon
  ■ weren't ready in 1977 (CS people did 432 in 1980)
  ■ started crash effort for 16-bit microcomputer
♦ 1978: 8086 dedicated registers, segmented address, 16 bit
  ■ 8088: 8-bit external bus version of 8086
♦ 1980: IBM selects 8088 as basis for IBM PC
♦ 1980: 8087 floating point coprocessor: adds 60 instructions using hybrid stack/register scheme
♦ 1982: 80286 24-bit address, protection, memory mapping
♦ 1985: 80386 32-bit address, 32-bit GP registers, paging

Intel x86 (IA-32)

♦ 1989: 80486 & Pentium in 1992: faster + MP few instructions
♦ 1997: MMX multimedia extensions
♦ 200X: Superseded by IA-64 (Merced, McKinley, Itanium, etc.)
♦ "Difficult to explain and impossible to love"
  ■ See H&P Appendix D.8
♦ Eight 32-bit registers (EAX, EBX, ..., but also ESP, EBP)
  ■ Also 16- and 8-bit versions (AX, AH, AL)
♦ Most instructions have two operands, one possibly from memory
♦ One super-duper addressing mode w/ effective address =
  ■ base_reg + (index_reg * scaling_factor) + displacement
♦ Many formats: see H&P fig. D.8

Intel MMX

♦ MultiMedia eXtension to IA-32 [Peleg & Weiser, IEEE Micro, 8/96]
  ■ Multimedia data values often need much less than 32 bits
  ■ But are organized in groups (e.g., red/green/blue)
  ■ So in 64-bit FP registers: 2x32, 4x16, 8x8
♦ E.g., ADDB (for byte)

      17    87   100   ...  6 more
    + 17    13   200   ...  6 more
    ------------------------------
      34   100   255   ...  6 more

♦ MMX takes 16-element dot product (a0*b0 + a1*b1 + ... + a15*b15)
  ■ from 200 to 16 instructions & from 76 to 12 cycles (6x)
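In the ADDB example the last lane shows 100 + 200 = 255, which is consistent with a saturating unsigned byte add rather than a wrapping one. The sketch below models both behaviours on Python lists; the function names are hypothetical and the lists stand in for the packed lanes of one 64-bit register.

```python
def packed_add_saturate_u8(a, b):
    """Lane-wise unsigned byte add that clamps at 255 instead of wrapping."""
    return [min(x + y, 255) for x, y in zip(a, b)]


def packed_add_wrap_u8(a, b):
    """Lane-wise unsigned byte add with ordinary modulo-256 wraparound."""
    return [(x + y) & 0xFF for x, y in zip(a, b)]


lanes_a = [17, 87, 100]
lanes_b = [17, 13, 200]
print(packed_add_saturate_u8(lanes_a, lanes_b))
print(packed_add_wrap_u8(lanes_a, lanes_b))
```

Saturation is usually preferred for pixel data, since a wrapped sum would turn a very bright value into a dark one.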


Control Instructions

                       Taken or      Where is       Link return    Save or
                       not taken?    the target?    address        restore state
Conditional branches       X             X
Jumps                                    X
Procedure calls                          X              X               X
Procedure returns                        X                              X
O.S. calls                               X              X               X
O.S. returns                             X                              X

(1) Taken or not taken?

♦ Compare and branch instruction
  + No extra compare instruction
  + No state passed between instructions
  - Requires ALU operation
  - Restricts code scheduling opportunities
♦ Implicitly set condition codes (Z, N, V, C)
  + Can be set "for free"
  - Constrains code reordering
  - Extra state to save and restore
♦ Explicitly set condition codes (Z, N, V, C)
  + Can be set "for free"
  + Decouples branch/fetch from pipeline
  - Extra state to save and restore

(1) Taken or not taken?, cont.

♦ Condition in general-purpose register
  + No special state to save and implement, but uses up a register
  - Branch condition separated from branch logic in pipeline
♦ Some data for MIPS
  ■ > 80% of compares for branches use immediates
  ■ > 80% of these immediates are zero
  ■ 50% of compares for branches are = 0 or != 0
♦ Compromise used in MIPS
  ■ Have branch-if = 0 and branch-if != 0
  ■ Have compare instructions (r1 = r2, r1 != r2, r1 < r2, r1 <= r2, etc.)
♦ With pipelining, can we predict whether taken?
  ■ Statically?
  ■ Dynamically?

(2) Where is the target?

♦ Could use arbitrary specifier?
  + Orthogonal and powerful
  - More bits to specify, more time to decode
  - Branch execution and target separated in pipeline
♦ PC-relative with immediate
  + Position independence (helps linking), target computable in branch unit
  + Short immediate sufficient. MIPS word immediate: <= 4 bits: 47%; <= 8 bits: 94%; <= 12 bits: 100%
  - Target must be known statically (to link)
  - Can't jump arbitrarily far
  - Other techniques are required for returns and distant jumps
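A PC-relative target is formed by adding a sign-extended immediate to the program counter. The sketch below follows the MIPS convention, where the 16-bit immediate counts words and is added to the address of the instruction after the branch; the helper names `sign_extend` and `branch_target` are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def branch_target(pc, imm16):
    """MIPS-style target: (PC + 4) + sign_extend(imm16) * 4, wrapped to 32 bits."""
    return (pc + 4 + (sign_extend(imm16, 16) << 2)) & 0xFFFFFFFF


print(hex(branch_target(0x1000, 3)))        # forward branch
print(hex(branch_target(0x1000, 0xFFFF)))   # offset -1: branches to itself
```

With a 16-bit word offset the reach is roughly 128 KiB in either direction, which is why farther jumps and returns need a register or absolute form.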

(2) Where is the target?, cont.

♦ Register
  + Short specification
  + Can jump anywhere
  + Dynamic target okay (returns)
  - Extra instruction to load register
♦ (Vectored) trap: critical for O.S. calls
  + Protection
  - Implementation headache
♦ Common compromise
  ■ (Conditional) branches (pc-rel)
  ■ (Unconditional) jumps (pc-rel, reg)
  ■ Procedure calls (pc-rel, reg)
  ■ Procedure returns (reg)
  ■ O.S. calls (trap)
  ■ O.S. returns (reg)

(3) Link return address?

♦ Required for procedure calls and O.S. calls
♦ Implicit register
  + Fast, simple
  - SW must save register before next call
  - Surprise traps or interrupts?
♦ Explicit register
  - No important advantages over above
  - Register must be specified
♦ Processor stack
  + Recursion supported directly
  - Complex instruction
♦ Many recent architectures use implicit register

(4) Save or restore state?

♦ What state?
  ■ Procedure calls: registers
  ■ O.S. calls: registers and PSW (incl. CCs)
♦ Hardware need not save registers
  ■ Caller can save registers in use
  ■ Callee can save registers it will use
♦ Hardware register save
  ■ Which (IBM STM, VAX CALLS)?
  ■ Is the above faster?
♦ Register windows
♦ Many recent architectures do no register saving or do implicit saving with register windows

MIPS: Register State

♦ 32 integer registers
  ■ $0 is hardwired to 0
  ■ $31 is the return address register
  ■ software convention for other registers
♦ 32 single-precision FP registers or 16 double-precision FP registers
♦ PC and other special registers

MIPS I Operation Overview

♦ Arithmetic Logical:
  ■ Add, AddU, Sub, SubU, And, Or, Xor, Nor, SLT, SLTU
  ■ AddI, AddIU, SLTI, SLTIU, AndI, OrI, XorI, LUI
  ■ SLL, SRL, SRA, SLLV, SRLV, SRAV
♦ Memory Access:
  ■ LB, LBU, LH, LHU, LW, LWL, LWR
  ■ SB, SH, SW, SWL, SWR

Multiply / Divide

♦ Start multiply, divide
  ■ MULT rs, rt
  ■ MULTU rs, rt
  ■ DIV rs, rt
  ■ DIVU rs, rt
♦ Move result from multiply, divide
  ■ MFHI rd
  ■ MFLO rd
♦ Move to HI or LO
  ■ MTHI rs
  ■ MTLO rs
♦ Why not third field for destination?
  ■ (Hint: how many clock cycles for multiply or divide vs. add?)

[Figure: registers HI and LO]
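The role of HI and LO can be sketched by splitting a 64-bit product, or a quotient and remainder, into two 32-bit halves. This is a simplified model: the function names mirror the mnemonics but are plain Python helpers, each returns a `(hi, lo)` tuple, and signed division is omitted because its rounding differs from Python's `//`.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Reinterpret a 32-bit pattern as a two's complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mult(rs, rt):
    """Signed multiply: returns (hi, lo) halves of the 64-bit product."""
    product = to_signed32(rs) * to_signed32(rt)
    return (product >> 32) & MASK32, product & MASK32


def multu(rs, rt):
    """Unsigned multiply: returns (hi, lo) halves of the 64-bit product."""
    product = (rs & MASK32) * (rt & MASK32)
    return product >> 32, product & MASK32


def divu(rs, rt):
    """Unsigned divide: hi = remainder, lo = quotient."""
    rs, rt = rs & MASK32, rt & MASK32
    return rs % rt, rs // rt


print(multu(0xFFFFFFFF, 2), mult(0xFFFFFFFF, 2), divu(7, 2))
```

The same bit pattern 0xFFFFFFFF yields different HI values under `mult` and `multu`, which is why both forms exist.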

Data Types

♦ Bit: 0, 1
♦ Bit string: sequence of bits of a particular length
  ■ 4 bits is a nibble
  ■ 8 bits is a byte
  ■ 16 bits is a half-word
  ■ 32 bits is a word
  ■ 64 bits is a double-word
♦ Character:
  ■ ASCII 7-bit code
  ■ UNICODE 16-bit code
♦ Decimal:
  ■ digits 0-9 encoded as 0000b thru 1001b
  ■ two decimal digits packed per 8-bit byte
♦ Integers:
  ■ 2's complement
♦ Floating point:
  ■ Single precision, double precision, extended precision
  ■ M x R^E (M: mantissa, R: base, E: exponent)
  ■ How many +/- #'s? Where is decimal pt? How are +/- exponents represented?
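Two of the encodings above, packed decimal and 2's complement, are short enough to sketch directly. The helper names `pack_bcd`, `unpack_bcd` and `twos_complement` are hypothetical, and `pack_bcd` is limited to two digits, matching the "two decimal digits per byte" rule.

```python
def pack_bcd(n):
    """Pack a number 0..99 into one byte, one decimal digit per nibble."""
    if not 0 <= n <= 99:
        raise ValueError("two packed digits hold 0..99 only")
    return ((n // 10) << 4) | (n % 10)


def unpack_bcd(byte):
    """Inverse of pack_bcd."""
    return (byte >> 4) * 10 + (byte & 0xF)


def twos_complement(value, bits):
    """Bit pattern of a signed integer in a field of `bits` bits."""
    return value & ((1 << bits) - 1)


print(hex(pack_bcd(42)), unpack_bcd(0x42), hex(twos_complement(-1, 8)))
```

Note that packed decimal 42 is the byte 0x42, whereas binary 42 would be 0x2A.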

Operand Size Usage

Frequency of reference by size:

Size          Int Avg.    FP Avg.
Byte             7%          0%
Halfword        19%          0%
Word            74%         31%
Doubleword       0%         69%

♦ Support for these data sizes and types: 8-bit, 16-bit, 32-bit integers and 32-bit and 64-bit IEEE 754 floating point numbers

MIPS arithmetic instructions

Instruction          Example           Meaning                          Comments
add                  add $1,$2,$3      $1 = $2 + $3                     3 operands; exception possible
subtract             sub $1,$2,$3      $1 = $2 - $3                     3 operands; exception possible
add immediate        addi $1,$2,100    $1 = $2 + 100                    + constant; exception possible
add unsigned         addu $1,$2,$3     $1 = $2 + $3                     3 operands; no exceptions
subtract unsigned    subu $1,$2,$3     $1 = $2 - $3                     3 operands; no exceptions
add imm. unsign.     addiu $1,$2,100   $1 = $2 + 100                    + constant; no exceptions
multiply             mult $2,$3        Hi, Lo = $2 x $3                 64-bit signed product
multiply unsigned    multu $2,$3       Hi, Lo = $2 x $3                 64-bit unsigned product
divide               div $2,$3         Lo = $2 ÷ $3, Hi = $2 mod $3     Lo = quotient, Hi = remainder
divide unsigned      divu $2,$3        Lo = $2 ÷ $3, Hi = $2 mod $3     Unsigned quotient & remainder
move from Hi         mfhi $1           $1 = Hi                          Used to get copy of Hi
move from Lo         mflo $1           $1 = Lo                          Used to get copy of Lo

MIPS logical instructions

Instruction           Example          Meaning            Comment
and                   and $1,$2,$3     $1 = $2 & $3       3 reg. operands; Logical AND
or                    or $1,$2,$3      $1 = $2 | $3       3 reg. operands; Logical OR
xor                   xor $1,$2,$3     $1 = $2 ⊕ $3       3 reg. operands; Logical XOR
nor                   nor $1,$2,$3     $1 = ~($2 | $3)    3 reg. operands; Logical NOR
and immediate         andi $1,$2,10    $1 = $2 & 10       Logical AND reg, constant
or immediate          ori $1,$2,10     $1 = $2 | 10       Logical OR reg, constant
xor immediate         xori $1,$2,10    $1 = $2 ⊕ 10       Logical XOR reg, constant
shift left logical    sll $1,$2,10     $1 = $2 << 10      Shift left by constant
shift right logical   srl $1,$2,10     $1 = $2 >> 10      Shift right by constant
shift right arithm.   sra $1,$2,10     $1 = $2 >> 10      Shift right (sign extend)
shift left logical    sllv $1,$2,$3    $1 = $2 << $3      Shift left by variable
shift right logical   srlv $1,$2,$3    $1 = $2 >> $3      Shift right by variable
shift right arithm.   srav $1,$2,$3    $1 = $2 >> $3      Shift right arith. by variable

MIPS data transfer instructions

Instruction         Comment
SW  500(R4), R3     Store word
SH  502(R2), R3     Store half
SB  41(R3), R2      Store byte
LW  R1, 30(R2)      Load word
LH  R1, 40(R3)      Load halfword
LHU R1, 40(R3)      Load halfword unsigned
LB  R1, 40(R3)      Load byte
LBU R1, 40(R3)      Load byte unsigned
LUI R1, 40          Load Upper Immediate (16 bits shifted left by 16)

[Figure: LUI R5 places the 16-bit immediate in the upper half of R5; the lower 16 bits are 0000 … 0000.]
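The effect of Load Upper Immediate can be sketched as below. The helper names (`lui`, `ori`, `load_constant`) are hypothetical, and pairing LUI with an OR-immediate to build a full 32-bit constant is a common idiom assumed here rather than something stated on the slide.

```python
MASK32 = 0xFFFFFFFF


def lui(imm16):
    """Load Upper Immediate: imm16 lands in bits 31..16, bits 15..0 are zero."""
    return ((imm16 & 0xFFFF) << 16) & MASK32


def ori(reg, imm16):
    """OR immediate: the 16-bit immediate is zero extended before the OR."""
    return (reg | (imm16 & 0xFFFF)) & MASK32


def load_constant(value):
    """Build a 32-bit constant with an LUI followed by an ORI (assumed idiom)."""
    upper = (value >> 16) & 0xFFFF
    lower = value & 0xFFFF
    return ori(lui(upper), lower)
```

With these definitions, `lui(40)` yields 40 shifted left by 16 bits, matching the table entry for `LUI R1, 40`.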

When does MIPS sign extend?

♦ When a value is sign extended, its upper bit is copied into every added bit. Examples of sign extending 8 bits to 16 bits:
      00001010 ⇒ 00000000 00001010
      10001100 ⇒ 11111111 10001100
♦ When is an immediate value sign extended?
  ■ Arithmetic instructions (add, sub, etc.) sign extend immediates, even for the unsigned versions of the instructions!
  ■ Logical instructions do not sign extend
♦ Loads of a half or byte (lh, lb) sign extend, but the unsigned versions (lhu, lbu) do not.
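The two 8-bit to 16-bit examples can be reproduced with a short sketch. The function names `sign_extend` and `zero_extend` are illustrative assumptions, not MIPS terminology.

```python
def sign_extend(value, from_bits, to_bits):
    """Copy the top bit of a from_bits-wide value into the added upper bits."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):   # sign bit set: value is negative
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)


def zero_extend(value, from_bits):
    """Fill the added upper bits with zeros (what logical immediates do)."""
    return value & ((1 << from_bits) - 1)
```

The design choice is to go through a signed Python integer and mask back to the target width, which avoids building the run of 1 bits by hand.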

Methods of Testing Condition

♦ Condition Codes
  ■ Processor status bits are set as a side-effect of arithmetic instructions (possibly on Moves) or explicitly by compare or test instructions.
      ex:  add r1, r2, r3
           bz  label
♦ Condition Register
      ex:  cmp r1, r2, r3
           bgt r1, label
♦ Compare and Branch
      ex:  bgt r1, r2, label

Conditional Branch Distance

[Figure: bar chart of the percentage of branches (0% to 40%) against bits of branch displacement (0 to 15), for Int. Avg. and FP Avg.]

• 25% of integer branches are 2 to 4 instructions

Conditional Branch Addressing

♦ PC-relative since most branches are relatively close to the current PC
♦ At least 8 bits suggested (±128 instructions)
♦ Compare Equal/Not Equal most important for integer programs (86%)

Frequency of comparison types in branches:

Comparison   Int Avg.   FP Avg.
EQ/NE           86%        37%
GT/LE            7%        23%
LT/GE            7%        40%

MIPS Compare and Branch

♦ Compare and Branch
  ■ BEQ rs, rt, offset      if R[rs] == R[rt] then PC-relative branch
  ■ BNE rs, rt, offset      <>
♦ Compare to zero and Branch
  ■ BLEZ rs, offset         if R[rs] <= 0 then PC-relative branch
  ■ BGTZ rs, offset         >
  ■ BLTZ                    <
  ■ BGEZ                    >=
  ■ BLTZAL rs, offset       if R[rs] < 0 then branch and link (into R 31)
  ■ BGEZAL                  >=
♦ Remaining set of compare and branch ops take two instructions
♦ Almost all comparisons are against zero!

MIPS jump, branch, compare instructions

Instruction           Example           Meaning                          Comment
branch on equal       beq $1,$2,100     if ($1 == $2) go to PC+4+100     Equal test; PC relative branch
branch on not eq.     bne $1,$2,100     if ($1 != $2) go to PC+4+100     Not equal test; PC relative
set on less than      slt $1,$2,$3      if ($2 < $3) $1=1; else $1=0     Compare less than; 2's comp.
set less than imm.    slti $1,$2,100    if ($2 < 100) $1=1; else $1=0    Compare < constant; 2's comp.
set less than uns.    sltu $1,$2,$3     if ($2 < $3) $1=1; else $1=0     Compare less than; unsigned numbers
set l. t. imm. uns.   sltiu $1,$2,100   if ($2 < 100) $1=1; else $1=0    Compare < constant; unsigned numbers
jump                  j 10000           go to 10000                      Jump to target address
jump register         jr $31            go to $31                        For switch, procedure return
jump and link         jal 10000         $31 = PC + 4; go to 10000        For procedure call
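MIPS has no single branch-if-less-than instruction for two registers, so such a test is built from two of the instructions in this table: `slt` into a scratch register, then `bne` against zero. The sketch below models that pairing; the function names and the use of a scratch value called `at` are illustrative assumptions.

```python
def to_signed32(x):
    """Interpret a 32-bit pattern as a 2's complement integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def slt(rs, rt):
    """Set on less than: 1 if rs < rt as 2's complement values, else 0."""
    return 1 if to_signed32(rs) < to_signed32(rt) else 0


def bne_taken(rs, rt):
    """Branch on not equal: True when the branch would be taken."""
    return (rs & 0xFFFFFFFF) != (rt & 0xFFFFFFFF)


def blt_taken(rs, rt):
    """Models the pair: slt at, rs, rt ; bne at, zero, target."""
    at = slt(rs, rt)
    return bne_taken(at, 0)
```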

Signed vs. Unsigned Comparison

R1 = 0…00 0000 0000 0000 0001 (base two)
R2 = 0…00 0000 0000 0000 0010 (base two)
R3 = 1…11 1111 1111 1111 1111 (base two)

After executing these instructions:
    slt  r4,r2,r1   ; if (r2 < r1) r4=1; else r4=0
    slt  r5,r3,r1   ; if (r3 < r1) r5=1; else r5=0
    sltu r6,r2,r1   ; if (r2 < r1) r6=1; else r6=0
    sltu r7,r3,r1   ; if (r3 < r1) r7=1; else r7=0

What are values of registers r4 - r7? Why?
    r4 = ; r5 = ; r6 = ; r7 = ;
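One way to check an answer to the question above is to model the two comparisons directly. This is a sketch assuming 32-bit registers; `as_signed`, `slt`, and `sltu` are illustrative helper names. The only register whose reading changes between the two views is R3, which is -1 as a 2's complement value and 4294967295 as an unsigned value.

```python
MASK32 = 0xFFFFFFFF


def as_signed(x):
    """Interpret a 32-bit pattern as a 2's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & (1 << 31) else x


def slt(a, b):
    """Signed set-on-less-than."""
    return 1 if as_signed(a) < as_signed(b) else 0


def sltu(a, b):
    """Unsigned set-on-less-than."""
    return 1 if (a & MASK32) < (b & MASK32) else 0


r1, r2, r3 = 0x00000001, 0x00000002, 0xFFFFFFFF
r4 = slt(r2, r1)    # 2 < 1 is false
r5 = slt(r3, r1)    # r3 reads as -1, and -1 < 1 is true
r6 = sltu(r2, r1)   # 2 < 1 is false
r7 = sltu(r3, r1)   # r3 reads as 4294967295, which is not < 1
```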

Calls: Why Are Stacks So Great?

Stacking of Subroutine Calls & Returns and Environments:

    A:  CALL B
    B:  CALL C
        RET
    C:  RET

Stack contents over time:

    A
    A B
    A B C
    A B
    A

♦ Some machines provide a memory stack as part of the architecture (e.g., VAX)
♦ Sometimes stacks are implemented via software convention (e.g., MIPS)

Memory Stacks

♦ Useful for stacked environments/subroutine call & return even if operand stack not part of architecture
♦ Stacks that Grow Up vs. Stacks that Grow Down

[Figure: two stacks holding a, b, c drawn against memory addresses from 0 (Little) to inf. (Big), one that grows up and one that grows down; SP marks either the Next Empty or the Last Full location.]

♦ How is empty stack represented?

Little --> Big / Last Full
    POP:   Read from Mem(SP); Decrement SP
    PUSH:  Increment SP; Write to Mem(SP)

Little --> Big / Next Empty
    POP:   Decrement SP; Read from Mem(SP)
    PUSH:  Write to Mem(SP); Increment SP
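The two Little --> Big conventions can be written out as a pair of small classes. This is a sketch under stated assumptions: the class names are hypothetical, memory is a plain Python list, and there are no bounds checks, so popping an empty stack is not handled. It also shows one answer to how an empty stack is represented: in the Last Full convention SP sits one slot below the base, while in the Next Empty convention SP equals the base.

```python
class LastFullStack:
    """Grows toward bigger addresses; SP points at the last full slot."""

    def __init__(self, size, base=0):
        self.mem = [None] * size
        self.sp = base - 1              # empty: SP is one below the base

    def push(self, value):
        self.sp += 1                    # Increment SP
        self.mem[self.sp] = value       # Write to Mem(SP)

    def pop(self):
        value = self.mem[self.sp]       # Read from Mem(SP)
        self.sp -= 1                    # Decrement SP
        return value


class NextEmptyStack:
    """Grows toward bigger addresses; SP points at the next empty slot."""

    def __init__(self, size, base=0):
        self.mem = [None] * size
        self.sp = base                  # empty: SP is at the base

    def push(self, value):
        self.mem[self.sp] = value       # Write to Mem(SP)
        self.sp += 1                    # Increment SP

    def pop(self):
        self.sp -= 1                    # Decrement SP
        return self.mem[self.sp]        # Read from Mem(SP)
```

After pushing the same three items, the two stacks hold identical memory contents but their SP values differ by one.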

Call-Return Linkage: Stack Frames

[Figure: a stack frame drawn between High Mem (top) and Low Mem (bottom), containing in order ARGS, Callee Save Registers (old FP, RA), and Local Variables. FP marks the frame and SP marks the end of the stack.]

♦ Reference args and local variables at fixed (positive) offset from FP
♦ The region at SP grows and shrinks during expression evaluation
♦ Many variations on stacks possible (up/down, last pushed / next )
♦ Compilers normally keep scalar variables in registers, not memory!

MIPS: Software conventions for Registers

 0     zero   constant 0
 1     at     reserved for assembler
 2     v0     expression evaluation &
 3     v1     function results
 4     a0     arguments
 5     a1
 6     a2
 7     a3
 8     t0     temporary: caller saves
 . . .        (callee can clobber)
15     t7
16     s0     callee saves
 . . .        (callee must save)
23     s7
24     t8     temporary (cont’d)
25     t9
26     k0     reserved for OS kernel
27     k1
28     gp     Pointer to global area
29     sp     Stack pointer
30     fp     frame pointer
31     ra     Return Address (HW)

MIPS / GCC Calling Conventions

fact:
    addiu $sp, $sp, -32     # allocate a 32-byte stack frame
    sw    $ra, 20($sp)      # save the return address
    sw    $fp, 16($sp)      # save the old frame pointer
    addiu $fp, $sp, 32      # point $fp at the top of the new frame
    . . .
    sw    $a0, 0($fp)       # store argument register $a0 at 0($fp)
    ...
    lw    $31, 20($sp)      # restore the return address ($31 is $ra)
    lw    $fp, 16($sp)      # restore the old frame pointer
    addiu $sp, $sp, 32      # release the stack frame
    jr    $31               # return to the caller

[Figure: snapshots of the stack (toward low address) showing where FP and SP point and where ra and old FP are saved as the frame is created and released.]

♦ First four arguments passed in registers.

Details of the MIPS instruction set

♦ Register zero always has the value zero (even if you try to write it)
♦ Branch/jump and link put the return addr. PC+4 or 8 into the link register (R31) (depends on logical vs physical architecture)
♦ All instructions change all 32 bits of the destination register (including lui, lb, lh) and all read all 32 bits of sources (add, sub, and, or, …)
♦ Immediate arithmetic and logical instructions are extended as follows:
  ■ logical immediate ops are zero extended to 32 bits
  ■ arithmetic immediate ops are sign extended to 32 bits (including addiu)
♦ The data loaded by the instructions lb and lh are extended as follows:
  ■ lbu, lhu are zero extended
  ■ lb, lh are sign extended
♦ Overflow can occur in these arithmetic and logical instructions:
  ■ add, sub, addi
  ■ it cannot occur in addu, subu, addiu, and, or, xor, nor, shifts, mult, multu, div, divu
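The difference between `add` (overflow possible) and `addu` (no overflow) can be sketched as follows. `OverflowTrap` is a hypothetical stand-in for the hardware exception, and the helper names are assumptions; both functions produce the same 32-bit result whenever no signed overflow occurs.

```python
MASK32 = 0xFFFFFFFF


class OverflowTrap(Exception):
    """Hypothetical stand-in for the MIPS integer overflow exception."""


def _signed32(x):
    """Interpret a 32-bit pattern as a 2's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & (1 << 31) else x


def addu(a, b):
    """Add without overflow detection: the result simply wraps to 32 bits."""
    return (a + b) & MASK32


def add(a, b):
    """Add with overflow detection on the 2's complement result."""
    result = _signed32(a) + _signed32(b)
    if not -(1 << 31) <= result < (1 << 31):
        raise OverflowTrap("signed 32-bit overflow")
    return result & MASK32
```

For instance, adding 1 to 0x7FFFFFFF wraps to 0x80000000 under `addu` but raises `OverflowTrap` under `add`.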

Delayed Branches

♦ In the “Raw” MIPS, the instruction after the branch is executed even when the branch is taken
  ■ This is hidden by the assembler for the MIPS “virtual machine”
  ■ allows the compiler to better utilize the instruction pipeline (???)

        li   r3, #7
        sub  r4, r4, 1
        bz   r4, LL
        addi r5, r3, 1
        subi r6, r6, 2
    LL: slt  r1, r3, r5
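The effect of the delay slot on the code above can be traced with a small sketch. It is a deliberately simplified model, not a MIPS simulator: `execution_order` and `PROGRAM` are hypothetical names, the branch outcome is passed in rather than computed from r4, and branches inside a delay slot or backward branches are not handled.

```python
def execution_order(program, branch_taken, delay_slots=1):
    """Return the instruction texts in the order they execute.

    program is a list of (text, target) pairs, where target is the index
    branched to, or None for a non-branch instruction.
    """
    executed = []
    pc = 0
    while pc < len(program):
        text, target = program[pc]
        executed.append(text)
        if target is not None and branch_taken:
            # Instructions in the delay slot(s) run before control transfers.
            slot_end = min(pc + 1 + delay_slots, len(program))
            executed.extend(t for t, _ in program[pc + 1:slot_end])
            pc = target
        else:
            pc += 1
    return executed


PROGRAM = [
    ("li r3, #7", None),
    ("sub r4, r4, 1", None),
    ("bz r4, LL", 5),           # branch to index 5 (label LL)
    ("addi r5, r3, 1", None),   # delay slot
    ("subi r6, r6, 2", None),
    ("LL: slt r1, r3, r5", None),
]
```

With the branch taken, `addi` still executes but `subi` is skipped; setting `delay_slots=0` models a machine without delayed branches, where `addi` is skipped as well.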

Branch & Pipelines

                              Time -->
        li   r3, #7       ifetch  execute
        sub  r4, r4, 1            ifetch  execute
        bz   r4, LL                       ifetch  execute                     (Branch)
        addi r5, r3, 1                            ifetch  execute             (Delay Slot)
    LL: slt  r1, r3, r5                                   ifetch  execute     (Branch Target)

♦ By the end of Branch instruction, the CPU knows whether or not the branch will take place.
♦ However, it will have fetched the next instruction by then, regardless of whether or not a branch will be taken.
♦ Why not execute it?

Filling Delayed Branches

    Branch:                                     IF | DEC & OP fetch | Execute
    execute successor even if branch taken!:         IF | DEC & OP fetch | Execute
    Then branch target or continue:                       IF

♦ Single delay slot impacts the critical path
♦ Compiler can fill a single delay slot with a useful instruction 50% of the time.
  ■ try to move down from above jump
  ■ move up from target, if safe

        add r3, r1, r2
        sub r4, r4, 1
        bz  r4, LL
        NOP
        ...
    LL: add rd, ...

♦ Is this violating the ISA abstraction?

Miscellaneous MIPS I instructions

♦ break: A breakpoint trap occurs, transfers control to exception handler
♦ syscall: A system trap occurs, transfers control to exception handler
♦ coprocessor instrs.: Support for floating point
♦ TLB instructions: Support for virtual memory: discussed later
♦ restore from exception: Restores previous interrupt mask & kernel/user mode bits into status register
♦ load word left/right: Supports misaligned word loads
♦ store word left/right: Supports misaligned word stores

MIPS: Instruction Set Format

♦ load/store architecture with 3 explicit operands (ALU ops)
♦ fixed 32-bit instructions
♦ 3 instruction formats
    » R-Type
    » I-Type
    » J-Type
♦ 6 instruction set groups:
    » load/store - data movement operations
    » computational - arithmetic, logical, and shift operations
    » jump/branch - including call and returns
    » coprocessor - FP instructions
    » coprocessor0 - memory management and exception handling
    » special - accessing special registers, system calls, breakpoint instructions, etc.

R2000/3000 Instruction Formats

R-type (register)
    e.g. add $8, $17, $18    # $8 = $17 + $18

    Field    OpCode   rs      rt      rd      shamt   funct
    Bits     31-26    25-21   20-16   15-11   10-6    5-0
    Value    0        17      18      8       0       32
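The field layout above can be turned into a 32-bit word by packing the six fields from the most significant end. The sketch below is illustrative; `encode_r_type` and `decode_r_type` are assumed names, and the field widths (6, 5, 5, 5, 5, 6) are read off the bit positions in the table.

```python
R_TYPE_FIELDS = (("opcode", 6), ("rs", 5), ("rt", 5),
                 ("rd", 5), ("shamt", 5), ("funct", 6))


def encode_r_type(opcode, rs, rt, rd, shamt, funct):
    """Pack the six R-type fields into one 32-bit instruction word."""
    values = (opcode, rs, rt, rd, shamt, funct)
    word = 0
    for (name, width), value in zip(R_TYPE_FIELDS, values):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode_r_type(word):
    """Split a 32-bit word back into its R-type fields."""
    fields = {}
    shift = 32
    for name, width in R_TYPE_FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields
```

Encoding the slide's example `add $8, $17, $18` with the values from the table gives the word 0x02324020.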

R2000/3000 Instruction Formats

I-type (immediate)
    e.g. addi $8, $17, -44      # $8 = $17 - 44
         lw   $8, -44($17)      # $8 = M[$17 - 44]
         beq  $17, $8, label    # if ($8 == $17) go to label

    Field    OpCode   rs      rt      immediate
    Bits     31-26    25-21   20-16   15-0
    Value    “op”     17      8       -44
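A negative immediate such as -44 is stored in the 16-bit field as a 2's complement pattern. The sketch below packs an I-type word and recovers the signed immediate. The function names are assumptions, and the opcode numbers (8 for addi, 35 for lw) come from the MIPS I encoding rather than from the slide, which only shows “op”.

```python
ADDI_OPCODE = 8    # from the MIPS I encoding, not shown on the slide
LW_OPCODE = 35     # from the MIPS I encoding, not shown on the slide


def encode_i_type(opcode, rs, rt, immediate):
    """Pack opcode, rs, rt and a 16-bit immediate into a 32-bit word."""
    if not 0 <= opcode < 64 or not 0 <= rs < 32 or not 0 <= rt < 32:
        raise ValueError("opcode or register field out of range")
    if not -(1 << 15) <= immediate < (1 << 16):
        raise ValueError("immediate does not fit in 16 bits")
    return (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def decode_immediate(word):
    """Return the immediate field of an I-type word as a signed value."""
    imm = word & 0xFFFF
    return imm - (1 << 16) if imm & 0x8000 else imm
```

For `addi $8, $17, -44` this produces 0x2228FFD4, where the low half 0xFFD4 is the 16-bit 2's complement form of -44.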

R2000/3000 Instruction Formats

J-type (jump)
    e.g. jump label    # call label; $31 = $pc + 8

    Field    OpCode   target
    Bits     31-26    25-0
    Value    3        -44

Bhandarkar and Clark: RISC vs. CISC

♦ Compares the VAX 8700 vs MIPS M/2000 (R3000 chip)
♦ Combines three factors:
  ■ Architecture
  ■ Implementation
  ■ Compilers and OS
♦ Argues that:
  ■ Implementation effects are second order
  ■ Compilers are “similar”
  ■ RISCs are better than CISCs
♦ Is it a fair comparison of RISCs vs CISCs?

Bhandarkar and Clark, cont.

RISC factor = (CPI_VAX / CPI_MIPS) / (Instr_MIPS / Instr_VAX)

Benchmark    Inst. Ratio   CPI MIPS   CPI VAX   CPI Ratio   RISC factor
spice2g6         2.5          1.8        8.0       4.4          1.8
matrix300        2.4          3.0       13.8       4.5          1.9
nasa7            2.1          3.0       15.0       5.0          2.4
fpppp            2.9          1.5       15.2      10.5          2.7
tomcatv          2.9          2.1       17.5       8.2          2.9
doduc            2.7          1.7       13.2       7.9          3.0
espresso         1.7          1.1        5.4       5.1          3.0
eqntott          1.1          1.3        4.4       3.5          3.3
li               1.6          1.1        6.5       6.0          3.7
geo. mean        2.2          1.7        9.9       5.8          2.7
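The RISC factor column can be recomputed from the other columns as the CPI ratio (VAX over MIPS) divided by the instruction ratio (MIPS over VAX). The sketch below uses a hypothetical function name, `risc_factor`; because the table entries are rounded to one decimal, recomputed values can differ from the printed ones in the last digit for some rows.

```python
def risc_factor(inst_ratio, cpi_mips, cpi_vax):
    """Net advantage: CPI ratio (VAX/MIPS) divided by instruction ratio (MIPS/VAX)."""
    cpi_ratio = cpi_vax / cpi_mips
    return cpi_ratio / inst_ratio


# Rows taken from the table: (Inst. Ratio, CPI MIPS, CPI VAX)
spice2g6 = risc_factor(2.5, 1.8, 8.0)
matrix300 = risc_factor(2.4, 3.0, 13.8)
li = risc_factor(1.6, 1.1, 6.5)
```

A factor above 1 means the lower CPI of the MIPS machine more than offsets the larger number of instructions it executes.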

Bhandarkar and Clark, cont.

♦ Compensation Factors
  ■ Increase VAX CPI but decrease VAX instruction count
  ■ Increase MIPS instruction count
  ■ Example 1: Loads and stores vs. operand specifiers
  ■ Example 2: Necessary complex operations, e.g. loop branches
♦ Factors favoring VAX
  ■ Big immediate values
  ■ Not-taken branches incur no delay

Bhandarkar and Clark, cont.

♦ Factors favoring MIPS
  ■ Operand specifier decoding
  ■ Number of registers
  ■ Separating floating point unit
  ■ Simple jumps and branches (lower latency)
  ■ Fancy VAX instructions: Unnecessary functionality
  ■ Instruction scheduling
  ■ Translation buffer
  ■ Branch displacement size

Homework Assignment 3

♦ Review the ISA of a specific processor
♦ Write a three-page report explaining the following:
  1. Kind of IS Architecture (RISC, CISC or other; 16-bit, 32-bit, 64-bit)
  2. Classes of instructions (ALU, Memory Movement, Branches, etc.)
  3. Addressing Modes (immediate, base+displacement, indirect, etc.)
  4. Displacements in branches and control flow instructions (call, ret)
  5. Special instructions: system calls, traps, access to special purpose registers
  6. Instruction formats

Hw3: List of Processors

1   Claudia Méndez Garza            Xscale or ARM
2   José Alberto Ramírez Uresti     Opteron AMD 64 bits
3   Víctor Echeverría Ríos          Texas Instruments TMS320DM64x or a DSP

Due date: September 26th, 2008.

Supplementary slides not presented in class that may serve as supporting material

VAX-11

♦ Introduced by DEC in 1977: VAX-11/780
♦ Upward compatible from PDP-11
♦ 32-bit “word” and addresses
♦ Virtual memory is first-class
♦ 16 GPRs (r15 is PC, r14 is SP), CCs
♦ Extremely orthogonal, memory-memory
♦ Decode as byte stream
  ■ Opcode: operation, number of operands & operand type
  ■ Variable-length address specifiers

VAX-11, cont.

♦ Data types
  ■ 8-, 16-, 32-, 64-, 128-bit integers
  ■ F (32 bits), D (64), G (64), H (128) FP
  ■ Character string (8-bits/char)
  ■ Decimal (4-bits/digit)
  ■ Numeric string (8-bits/digit)
♦ Addressing modes include
  – Literal (6 bits)
  – 8-, 16-, 32-bit immediates
  – Register, register deferred
  – 8-, 16-, 32-bit displacements
  – 8-, 16-, 32-bit displacement deferred
  – Indexed
  – Autoincrement
  – Autodecrement
  – Autoincrement deferred

VAX-11, cont.

♦ Operations
  ■ Data Transfers (including string move)
  ■ Arithmetic and Logical (2 and 3 operands)
  ■ Control (Branch, Jump, etc.)
    » AOBLEQ (Add one and Branch if Less than or EQual)
  ■ Procedure (CALLs save state)
  ■ Bit Manipulation
  ■ Floating Point (Add/Sub/Mult/Divide)
    » POLYF -- Polynomial Evaluation
  ■ System (Exception, VM)
  ■ Other
    » CRC -- Cyclic redundancy check
    » INSQUE -- Insert entry in queue

VAX-11, cont.

♦ VAX has too many modes & formats
♦ Serial semantics limit parallel execution

Example instruction as a byte stream (each row is 8 bits wide):

    PC+0     New instruction: opcode calls for three operands
    PC+1     Specifier 1: four bits + register
             + four-byte displacement
    PC+6     Specifier 2
             + two-byte displacement
    PC+9     Specifier 3: index
             Specifier 3: indexed mode
             + four-byte displacement
    PC+15    next instruction

♦ The big deal with RISC is not REDUCED number of instructions; it’s few modes & formats to facilitate pipelining
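The PC offsets in the byte-stream example follow from adding up the specifier sizes one after another, which is why each specifier must be decoded before the next one can even be located. The sketch below reproduces the offsets; `specifier_offsets` and `EXAMPLE` are hypothetical names, and the per-specifier byte counts are assumptions read off the figure (one mode/register byte per specifier, one extra byte for the index).

```python
def specifier_offsets(specifier_sizes, opcode_bytes=1):
    """Return (start offset of each specifier, offset of the next instruction)."""
    offsets = []
    pc = opcode_bytes
    for size in specifier_sizes:
        offsets.append(pc)
        pc += size          # the next specifier starts where this one ends
    return offsets, pc


EXAMPLE = [
    1 + 4,      # specifier 1: mode/register byte + four-byte displacement
    1 + 2,      # specifier 2: mode/register byte + two-byte displacement
    1 + 1 + 4,  # specifier 3: index byte + mode byte + four-byte displacement
]
```

Calling `specifier_offsets(EXAMPLE)` places the specifiers at PC+1, PC+6, and PC+9 and the next instruction at PC+15, matching the figure.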

DEC Alpha

♦ Introduced by DEC in 1992
  ■ Ability to emulate VAX instructions important
  ■ Strongly influenced by Cray-1
♦ 64-bit architecture
♦ Load/Store -- only displacement addressing
♦ Standard datatypes
  ■ No byte loads/stores
♦ Registers
  ■ 32 64-bit GPRs (r31 = 0)
  ■ 32 64-bit FPRs
♦ VAX and IEEE floating point

DEC Alpha, cont.

♦ Four fixed-length instruction formats
  ■ Sub-formats for computation instructions
♦ 32-bit instructions
  ■ Designed with multiple-issue in mind
♦ No delayed branches
♦ Precise exceptions not automatic
♦ PAL code

DEC Alpha Instruction Formats

Memory Format
    Field    OpCode   src/dest   base    displacement
    Bits     31-26    25-21      20-16   15-0

PC-Relative Format
    Field    OpCode   src     displacement
    Bits     31-26    25-21   20-0

PAL-call Format
    Field    OpCode   PAL argument
    Bits     31-26    25-0

DEC Alpha Instruction Formats, cont.

Three-Register Integer Format
    Field    OpCode   src1    src2    000     0    function   dest
    Bits     31-26    25-21   20-16   15-13   12   11-5       4-0

Eight-bit Immediate Integer Format
    Field    OpCode   src1    const   1    function   dest
    Bits     31-26    25-21   20-13   12   11-5       4-0

Floating-Point Operate Format
    Field    OpCode   src1    src2    function   dest
    Bits     31-26    25-21   20-16   15-5       4-0

DEC Alpha Instruction Set

♦ Operate Instructions
  ■ Integer Arithmetic
  ■ Logical (AND, OR, conditional MOV)
  ■ Byte-manipulation
  ■ Floating-point arithmetic
  ■ Miscellaneous (memory prefetching, trap and memory barriers)
♦ Load/Store Instructions
  ■ Load/Store Quadwords (64-bits)
  ■ Load-Linked/Store Conditional (for MP synchronization)

DEC Alpha Instruction Set, cont.

♦ Control/Branching Instruction
  ■ Branch on condition (8 conditions) in integer register
  ■ Branch on condition (6 conditions) in FP register
  ■ Unconditional branches
  ■ Calculated jumps
♦ Branch Hints
  ■ Different hint/rule type of branch
♦ Supervision Instructions
  ■ PAL code for needed task