Download - ECE 486/586 Computer Architecture Lecture # 8web.cecs.pdx.edu/~zeshan/ece586_lec8.pdf•If condition or comparison is not met, branch is said to be ... • 1989 –1995: 80486, Pentium,

ECE 486/586

Computer Architecture

Lecture # 8

Spring 2019

Portland State University

Lecture Topics

• Instruction Set Principles

– MIPS

• Control flow instructions

• Dealing with constants

– IA-32

– Fallacies and Pitfalls

Reference:

• Appendix A: Sections A.6, A.9 and A.10

Branch Conditions

• Conditional branches are said to be “taken” based upon a “comparison” or “condition”– Next instruction to be executed is the target

• If condition or comparison is not met, branch is said to be “not taken”– Next instruction executed is instruction following the branch

instruction in the program• PC + <length of branch instruction in bytes>

Branch Condition Options

• Condition Codes– Special bits set as a consequence of ALU operations

• Z – last result was zero

• V – last result had over/under flow

• C – last result had a carry

• N – last result was negative

• Condition Register (MIPS)– Simple

– Uses register(s)• BEQZ R1, Target # Branch to Target if R1 == 0

• BEQ R1, R2, Target # Branch to Target if R1 == R2

• Compare and Branch (VAX)– BGEQ R1, R2, Target # Branch to target if R1 >= R2

Conditional Branch Instructions

• Condition is result of comparing two registers

• bne, beq

if(i == j) => bne r17, r18, Target

h = i + j; add r19, r17, r18

Target:

I op rs rt Immediate

op 10001 10010 0000000000000010

Conditional Branch Instructions

• Target address is PC-relative– Exploits spatial locality

• ½ of branches in SPEC2000 < 16 instructions away

– Saves bits of encoding (store offset, not absolute address)

– Facilitates position independent code

– Provides range of +215-1 to -215 instructions from PC

Unconditional Branch Instructions

• Jump instruction

• j, jr, jalif(i == j) => bne r17, r18, L1

h = i + j; add r19, r17, r18 else j L2

h = i – j; L1: sub r19, r17, r18

L2:

• Target address is pseudo-direct– Value is in words– Concatenate with PC[31:28] and “00”– Target must be in same 256 MB space as PC

Return from Procedure

• Set return value

• Restore state– At least register containing PC for return address

– Any saved registers

• Transfer control to caller

• MIPS implementation “JR” – “Jump Register”– JR R31

• PC R31

Summary: MIPS Control Flow Instructions

Constants

• Small constants are used frequently (50% of operands)

e.g., A = A + 5B = B - 1

• Design Principle: Make the common case fast

• Solution: immediate addressing mode

addi r18, r18, 4

andi r19, r20, 11slti r21, r20, 4

• But immediate field in an I-type instruction is only 16-bits wide• How to deal with larger constants (which need > 16 bits)?

Dealing with Larger Constants

• We’d like to be able to load a 32 bit constant into a register– For example, load register r5 with a 32-bit sequence of alternating 1s

and 0s (101010………….10)

• Must use two instructions– New “load upper immediate (lui)” instruction

lui r5, 1010101010101010

– Then must get the lower order bits right

ori r5, r5, 1010101010101010

1010101010101010 0000000000000000

Filled with zeroes

1010101010101010 0000000000000000

0000000000000000 1010101010101010

1010101010101010 1010101010101010

ori

Alternative ISAs

• Design Alternative– Provide more powerful instructions

– Goal is to reduce number of instructions executed

– Danger is a slower clock cycle time and/or higher CPI

• Example: Lets look briefly at IA-32

IA-32 History

• 1978: Intel 8086 is announced (16-bit architecture)

• 1980: 8087 floating point coprocessor is added

• 1982: 80286 increases address space to 24 bits

• 1985: 80386 extends address space to 32 bits, new addressing modes

• 1989 – 1995: 80486, Pentium, Pentium Pro add a few more instructions

• 1997: 57 new “MMX” instructions added to Pentium-II

• 1999: Pentium-III added another 70 instructions (SSE)

• 2001: Another 144 instructions (SSE2)

• 2004: Address space extended to 64 bits

• 2004: More multimedia extensions (SSE3)

IA-32 History

• “This history illustrates the impact of the “golden handcuffs” of compatibility”

• “Adding new features as someone might add clothing to a packed bag”

• “What the 80x86 lacks in style is made up in quantity, making it beautiful from the right perspective”

IA-32 Overview

• Complexity– Instructions from 1 to 17 bytes long

– One operand must act as both a source and destination

– One operand can come from memory

– Complex addressing modes• E.g., base or scaled index with 8 or 32 bit displacement

• Saving grace– Most frequently used instructions are not too difficult to build

– Compilers often avoid the portions of the ISA that are too slow

IA-32 Registers

IA-32 Typical Instructions

• Four major types of integer instructions• Data movement (move, push, pop)• Arithmetic and logical (destination can be either register or memory)• Control flow (use of condition code “CC” flags)• String instructions (such as string move and string compare)

ISA Summary

• Instruction set complexity affects the instruction count vs. CPI/clock rate tradeoff– RISC: higher instruction count but easier to decode and pipeline (higher

clock rate, lower CPI)

– CISC: lower instruction count but higher CPI

• Design principles– Simplicity favors regularity

– Smaller is faster

– Make the common case fast

Fallacy

• There is such a thing as a typical program

• Problems with the above argument:– Wide variations of instruction frequencies, addressing modes, operand

types etc. exist between applications, even belonging to the same class

Fallacy

• An architecture with flaws cannot be successful

• Counter-example:– Intel x86 (IA32)

• Several unpopular architectural decisions over the years– Segmentation vs. paging

– Extended accumulators vs. GPRs

– Stack for floating point operations

– Very complex ISA

• But enormously successful, because– Chosen by IBM for first PC

– Moore’s law provided resources to build an internal RISC-like core

– High volume permits higher R&D costs needed to support complexity

Fallacy

• You can design a flawless architecture

• Why not?– Technologies change with time; tradeoffs thought to be correct at

one time may look like mistakes with technology evolution

– Example: VAX (1975)• Emphasized the importance of code size efficiency

– Used compact but complex instructions to reduce code size

• Didn’t anticipate the future importance (five years later of)– On-chip caches

– Pipelining

Pitfall

• Innovating at the ISA level to reduce code size without accounting for the compiler

• Architects struggle to reduce code size (30% -- 40% code size reduction considered a major achievement at ISA level)

• Different compiler strategies can change code size by much larger factors (2x)

• Conclusion: Architect should start with the tightest code the compiler can produce before proposing ISA innovations to save space

Pitfall

• Designing a “high-level” instruction set feature specifically oriented to supporting a high-level language structure

• Example: VAX CALLS instruction

• Too general for the most frequent case, resulting in unneeded work and a slower instruction

Download - ECE 486/586 Computer Architecture Lecture # 8web.cecs.pdx.edu/~zeshan/ece586_lec8.pdf•If condition or comparison is not met, branch is said to be ... • 1989 –1995: 80486, Pentium,

Top Related