EECE476: Computer Architecture

Lecture 3: Instruction Set Architectures and Signed Number Review

Chapter 2, also 3.2

CPUs and Your Project

• … and why you should care!

Popularity of Embedded CPUs

• Embedded vs. servers & desktops
• NIOS II is a type of embedded CPU

[Chart: millions of computers, embedded vs. servers & desktops]

Why Altera NIOS II?

• Two types of embedded CPUs
  – Pre-made
      • Fixed CPU, comes in own chip package
      • Computer system designed by adding other chips
  – Cores
      • Described in VHDL or Verilog code
          – Can be modified or customized for each application
          – Synthesized into logic gates using software tools
      • Computer system designed by adding other cores
          – Designed into ONE custom-made chip
          – Lower total system cost!
• Pre-made versus cores: cores show significant growth!
  – 31% were cores in 1998
  – 56% were cores in 2002
• NIOS II is an embedded CPU core

Embedded CPUs

• Found in…
  – Your keyboard
  – Your hard disk
  – Your mouse
  – Your cell phone
  – Your microwave
  – Your car (10s of them!)
  – Your flat-panel monitor
  – Your iPod
  – Your cordless phone
  – Your coffee maker
  – Your alarm clock
  – Your TV
  – Your DVD player
  – Your TV/DVD remote control
  – Your digital camera
  – Your wristwatch
• One estimate says the average home in North America has 35 embedded CPUs!

Your Job !

• You might design/analyse/recommend/use an embedded CPU

– You will probably work with an embedded CPU
    • Either software or hardware!

– This will probably be a CPU core

– You will probably use an FPGA

– NIOS II is an embedded CPU designed specifically to fit into an FPGA

Which ISA?

[Chart: millions of processors]

CPUs in the "Other" category? Some possibilities:

• Intel 8051
• DSP chips (e.g., cell phones, digital cameras, DVD players)
• Sony PS2 (Emotion Engine)
• PIC, AVR microcontrollers
• Motorola 6811, 6809, etc.

Lecture Objectives (first half)

• Finish Instruction Set Architectures

– Summarize instruction formats and types

– Review addressing modes

– Working with Constant Numbers

Lecture Objectives (second half)

• Review signed binary numbers

– Foundation for CPU arithmetic

– Need this for tomorrow:
    • ALU design, add, multiply, floating-point

Review: Last Day

• Instruction Types

– Arithmetic: add, sub, addi, addu, addiu, slt, …
– Logical: and, or, ori, sll, srl, …
– Memory: lw, sw, lui
– Control: bne, beq, j, jr, jal

• Red instructions haven’t been covered yet…

Review: Last Day

Instruction             Meaning
add $s1,$s2,$s3         $s1 = $s2 + $s3
sub $s1,$s2,$s3         $s1 = $s2 – $s3
lw  $s1,100($s2)        $s1 = Mem[$s2+100]
sw  $s1,100($s2)        Mem[$s2+100] = $s1
bne $s4,$s5,Label       Next instr. is at Label if $s4 != $s5
beq $s4,$s5,Label       Next instr. is at Label if $s4 == $s5
j   Label               Next instr. is at Label

Review: Last Day

• Instruction Word Formats
  – Instructions are always 32 bits
  – R-type (Register): 3 registers
  – I-type (Immediate): 1 or 2 registers, 16-bit immediate
  – J-type (Jump): 0 registers, 26-bit immediate address

R-type:  op | rs | rt | rd | shamt | funct
I-type:  op | rs | rt | 16-bit immediate
J-type:  op | 26-bit address

R-type modifies register rd

I-type modifies register rt
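
To make the three formats concrete, here is a minimal Python sketch that slices a 32-bit word into the fields named above. The field widths (6/5/5/5/5/6 bits for R-type) and the register numbers follow the standard MIPS32 encoding; they are not spelled out on the slide, and the function name `decode` is only illustrative.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into the fields of all three formats.

    Every field is extracted regardless of format; the opcode decides
    which fields are actually meaningful for a given instruction.
    """
    return {
        "op":     (word >> 26) & 0x3F,   # bits 31..26, present in all formats
        "rs":     (word >> 21) & 0x1F,   # bits 25..21 (R-type, I-type)
        "rt":     (word >> 16) & 0x1F,   # bits 20..16 (R-type, I-type)
        "rd":     (word >> 11) & 0x1F,   # bits 15..11 (R-type only)
        "shamt":  (word >> 6) & 0x1F,    # bits 10..6  (R-type only)
        "funct":  word & 0x3F,           # bits 5..0   (R-type only)
        "imm16":  word & 0xFFFF,         # bits 15..0  (I-type only)
        "addr26": word & 0x3FFFFFF,      # bits 25..0  (J-type only)
    }


# add $s1, $s2, $s3 assembled by hand:
# op=0, rs=$s2 (18), rt=$s3 (19), rd=$s1 (17), shamt=0, funct=0x20
ADD_S1_S2_S3 = 0x02538820
fields = decode(ADD_S1_S2_S3)
```

Note that `rd` (the register an R-type instruction modifies) and `rt` (the register an I-type instruction modifies) sit in different bit positions, which is why the hardware needs to know the format before it can pick the destination register.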

Review: Last Day

• Spot the Instruction Word Format for each

addi $t0, $s0, 42       I-type
sub  $t0, $s0, $s1      R-type
lw   $t0, 4($s0)        I-type
beq  $s0, $s1, Label    I-type
j    Label              J-type

More Control Flow

• We have: beq, bne
• What about branch-if-less-than?
• New instruction:
    slt $t0, $s1, $s2       if $s1 < $s2 then $t0 = 1 else $t0 = 0
• Can use this instruction to build a pseudoinstruction:
    blt $s1, $s2, Label
• Assembler translates this into 2 real MIPS instructions:
    slt $at, $s1, $s2
    bne $zero, $at, Label
  – Note: the assembler needs a temporary register ($at) to do this
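
The blt expansion can be sketched in a few lines of Python. This is only an illustration of the idea: `expand_blt` and `blt_taken` are hypothetical helper names, not part of any real assembler.

```python
def expand_blt(rs, rt, label):
    """Return the two real instructions that stand in for `blt rs, rt, label`."""
    return [
        f"slt $at, {rs}, {rt}",        # $at = 1 if rs < rt, else 0
        f"bne $zero, $at, {label}",    # branch when $at is non-zero
    ]


def slt(a, b):
    """Model of slt: produce 1 if a < b, else 0."""
    return 1 if a < b else 0


def blt_taken(a, b):
    """Model of the pair slt + bne: the branch is taken when slt produced 1."""
    at = slt(a, b)
    return at != 0


expansion = expand_blt("$s1", "$s2", "Label")
```

The cost of the pseudoinstruction is visible here: one line of assembly turns into two executed instructions and clobbers `$at`.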

Addresses in Branches

• Instructions:
    bne $t4,$t5,Label       Next instruction is at Label if $t4 != $t5
    beq $t4,$t5,Label       Next instruction is at Label if $t4 == $t5
• Format (I-type):
    op | rs | rt | Imm16
• Imm16: a 16-bit immediate is not big enough to represent all addresses
  – How do we handle this with load and store instructions?
• Most branches are local (principle of locality)
  – Use the Imm16 value as an "offset" distance from the current address
      • Current address is stored in the Program Counter (PC)
      • Imm16 is added to PC+4 (Why +4?)
      • Imm16 is shifted left by 2 bits (Why?)
      • Imm16 is signed! Why?
  – PC = PC + 4 + (SignExtend(Imm16) << 2)
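
The branch-target formula PC = PC + 4 + (SignExtend(Imm16) << 2) can be checked with a short Python sketch. The function names and the sample PC value 0x1000 are assumptions made for illustration.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits of imm16 as a signed two's complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(pc, imm16):
    """Address of the next instruction when a beq/bne at `pc` is taken."""
    offset_bytes = sign_extend16(imm16) << 2      # word offset -> byte offset
    return (pc + 4 + offset_bytes) & 0xFFFFFFFF   # wrap to 32 bits


forward = branch_target(0x1000, 25)        # 25 words ahead of PC+4
backward = branch_target(0x1000, 0xFFFF)   # Imm16 = -1: one word back from PC+4
```

With Imm16 = 25 the target is PC + 4 + 100, matching the beq/bne rows in the summary table. With Imm16 = -1 the branch lands back on itself, which shows why the offset has to be signed: loops branch backwards.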

Addresses in Jumps

• Instruction:
    j Label                 Next instruction is at Label
• Format (J-type):
    op | Imm26
• Imm26: a 26-bit address is not big enough to represent all addresses
  – 26-bit value is shifted left two positions (28-bit value)
      • Higher-order bits: keep the same values as in the PC
  – Address boundaries of 256 MB
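
A small Python sketch of how the jump target is assembled from the 26-bit field. It assumes the upper four bits are taken from PC+4 (the address of the following instruction), which is how MIPS is usually described; the slide simply says "keep same values in PC". The function name and sample addresses are illustrative.

```python
def jump_target(pc, imm26):
    """Target of `j`: upper 4 bits from PC+4, lower 28 bits from Imm26 << 2."""
    upper = (pc + 4) & 0xF0000000          # bits 31..28 are kept
    lower = (imm26 & 0x3FFFFFF) << 2       # 26-bit word address -> 28-bit byte address
    return upper | lower


low_region = jump_target(0x00400000, 2500)    # jump within the first 256 MB region
high_region = jump_target(0x10000000, 2500)   # same Imm26, different region
```

Because only 28 bits come from the instruction, a `j` can reach any word inside the current 2^28-byte (256 MB) region but cannot leave it; crossing a boundary needs `jr` with a full 32-bit address in a register.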

Addressing Modes

• Recall: 6811 has 6 addressing modes
  – Immediate        LDAA #$32
  – Direct           ADDA $02 (located on page 0)
  – Extended         LDAB $100A
  – Indexed (X, Y)   ADDA 10,X
  – Inherent         INCB
  – Relative         BRA LABEL

• MIPS has 5 different addressing modes

5 Addressing Modes for MIPS

1. Immediate addressing: the operand is a constant inside the instruction
     Format:  op | rs | rt | Immediate
     Example: addi $t0, $s0, 42

2. Register addressing: the operands are registers
     Format:  op | rs | rt | rd | ... | funct
     Example: sub $t0, $s0, $s1

3. Base addressing: the operand is in memory (byte, halfword, or word) at register + Address
     Format:  op | rs | rt | Address
     Example: lw $t0, 4($s0)

4. PC-relative addressing: the target is in memory at PC + Address
     Format:  op | rs | rt | Address
     Example: beq $s0, $s1, Label

5. Pseudodirect addressing: the target is the Address field combined with the upper bits of the PC
     Format:  op | Address
     Example: j Label

Constants

• Small constants are used quite frequently (50% of operands), e.g.,
    A = A + 5;
    B = B + 1;
    C = C - 18;
• Poor solutions… why?
  – Put 'typical constants' in memory and load them.
  – Create hard-wired registers (like $zero) for constants like one.
• MIPS instructions (max. 16-bit signed constants):
    addi $29, $29, 4
    slti $8, $18, 10
    andi $29, $29, 6
    ori  $29, $29, 4
• How do we make this work for 32 bits?

How about larger constants?

• We'd like to be able to load a 32-bit constant into a register
• First, "load upper immediate" instruction:
    lui $t0, %1010101010101010
    $t0 = 1010101010101010 0000000000000000     (rest of register filled with zeros)
• Second, get the lower-order bits:
    ori $t0, $t0, %1111000110001111

      1010101010101010 0000000000000000     lui part
    | 0000000000000000 1111000110001111     ori part
    = 1010101010101010 1111000110001111     final $t0
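
The lui/ori pair can be modelled in Python to confirm the bit patterns above. The helper names `lui`, `ori`, and `split_constant` are illustrative; they model the register result only, not the instruction encoding.

```python
def lui(imm16):
    """Model of lui: place the 16-bit immediate in the upper half, zero the lower half."""
    return (imm16 & 0xFFFF) << 16


def ori(reg, imm16):
    """Model of ori: OR the register with the zero-extended 16-bit immediate."""
    return reg | (imm16 & 0xFFFF)


def split_constant(value):
    """Split a 32-bit constant into the (upper, lower) immediates for lui and ori."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF


upper = 0b1010101010101010
lower = 0b1111000110001111
t0 = ori(lui(upper), lower)   # the two-instruction sequence from the slide
```

The sequence works because `lui` leaves the lower 16 bits as zeros, so the OR simply drops the lower immediate into place. Using `addi` for the second step would be riskier, since its immediate is sign-extended.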

Assembly Language vs. Machine Language

• Assembly language
  – Convenient symbolic representation (human-readable form)
  – Can use simple arithmetic with constants (e.g., 2*4-1)
  – Can use symbols to represent constants (e.g., labels for branches)
• Machine language is the underlying reality
  – Binary form OR "disassembled binary" textual form
  – Arithmetic gone (performed at "compile-time")
  – Executed directly by machine
• Assembly may provide 'pseudoinstructions'
  – e.g., "move $t0, $t1" assembles to "add $t0, $t1, $zero"
  – e.g., "blt $s0, $s1, L" assembles to "slt $at, $s0, $s1 ; bne $at, $zero, L"
• When considering performance you should only count real instructions

Other Issues

• Things we are not going to cover today
  – support for procedures
  – linkers, loaders, memory layout
  – stacks, frames, recursion
  – manipulating strings and pointers
  – interrupts and exceptions
  – system calls and conventions
• Some of these we'll talk about later
• We've focused on architectural issues
  – basics of MIPS assembly language and machine code
  – building a processor to execute similar Altera NIOS II instructions

To summarize:

MIPS operands

Name | Example | Comments
32 registers | $s0-$s7, $t0-$t9, $zero, $a0-$a3, $v0-$v1, $gp, $fp, $sp, $ra, $at | Fast locations for data. In MIPS, data must be in registers to perform arithmetic. MIPS register $zero always equals 0. Register $at is reserved for the assembler to handle large constants.
2^30 memory words | Memory[0], Memory[4], ..., Memory[4294967292] | Accessed only by data transfer instructions. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers, such as those saved on procedure calls.

MIPS assembly language

Category | Instruction | Example | Meaning | Comments
Arithmetic | add | add $s1, $s2, $s3 | $s1 = $s2 + $s3 | Three operands; data in registers
Arithmetic | subtract | sub $s1, $s2, $s3 | $s1 = $s2 - $s3 | Three operands; data in registers
Arithmetic | add immediate | addi $s1, $s2, 100 | $s1 = $s2 + 100 | Used to add constants
Data transfer | load word | lw $s1, 100($s2) | $s1 = Memory[$s2 + 100] | Word from memory to register
Data transfer | store word | sw $s1, 100($s2) | Memory[$s2 + 100] = $s1 | Word from register to memory
Data transfer | load byte | lb $s1, 100($s2) | $s1 = Memory[$s2 + 100] | Byte from memory to register
Data transfer | store byte | sb $s1, 100($s2) | Memory[$s2 + 100] = $s1 | Byte from register to memory
Data transfer | load upper immediate | lui $s1, 100 | $s1 = 100 * 2^16 | Loads constant in upper 16 bits
Conditional branch | branch on equal | beq $s1, $s2, 25 | if ($s1 == $s2) go to PC + 4 + 100 | Equal test; PC-relative branch
Conditional branch | branch on not equal | bne $s1, $s2, 25 | if ($s1 != $s2) go to PC + 4 + 100 | Not equal test; PC-relative
Conditional branch | set on less than | slt $s1, $s2, $s3 | if ($s2 < $s3) $s1 = 1; else $s1 = 0 | Compare less than; for beq, bne
Conditional branch | set less than immediate | slti $s1, $s2, 100 | if ($s2 < 100) $s1 = 1; else $s1 = 0 | Compare less than constant
Unconditional jump | jump | j 2500 | go to 10000 | Jump to target address
Unconditional jump | jump register | jr $ra | go to $ra | For switch, procedure return
Unconditional jump | jump and link | jal 2500 | $ra = PC + 4; go to 10000 | For procedure call

Reading

• Chapter 1 (light reading)
• Chapter 2
  – 2.2 arithmetic operations
  – 2.3 memory operands
  – 2.4 instruction encoding
  – 2.5 logical
  – 2.6 control
  – 2.7 procedures (read lightly)
  – 2.8 load bytes/halfwords
  – 2.9 addressing in branches/jumps
  – 2.10 compilers (read lightly)
  – 2.11 compiler optimizations (read lightly)
  – 2.12 compiler introduction
  – 2.13 sorting program
  – 2.14 object-oriented
  – 2.15 arrays versus pointers
  – 2.16 IA-32

Signed Numbers

Representing numbers as binary data is important for arithmetic, which we’ll cover next class!

Signed Numbers (Review)

• Two types of binary numbers
  – Signed, unsigned
  – Sign bit: leftmost bit
      • If sign bit == 1, value is negative
• Difference is mainly a software choice
• Sometimes, software must tell hardware if signed
  – E.g., Add. Why?
  – E.g., Multiply. Why?

4 bits   Signed     Unsigned
0101     +5         +5
1101     -3         +13
Range    -8 to 7    0 to 15
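
The table can be reproduced with a short Python sketch that reads the same bit string both ways. The function names are illustrative, and the signed reading assumes two's complement, which the lecture goes on to adopt.

```python
def as_unsigned(bits):
    """Value of a bit string read as an unsigned binary number."""
    return int(bits, 2)


def as_signed(bits):
    """Value of a bit string read as a two's complement number."""
    value = int(bits, 2)
    width = len(bits)
    # If the sign bit (leftmost bit) is 1, the pattern stands for value - 2^width.
    return value - (1 << width) if bits[0] == "1" else value


rows = [(b, as_signed(b), as_unsigned(b)) for b in ("0101", "1101")]
```

The bits themselves carry no "signed" tag; the two functions differ only in how they interpret the leftmost bit, which is the sense in which the difference is a software choice.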

Signed Numbers

• Two’s complement form is the most common (almost universal)
• Conversion from +N to –N to +N is easy
• CONVERSION RULE
  – Invert and add 1

    00001101 (13, notice sign bit)
    11110010 (inverted 13)
    11110011 (add one, -13, notice sign bit)

    11110011 (-13, notice sign bit)
    00001100 (inverted -13)
    00001101 (add 1, 13, notice same as original bit pattern)
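
The invert-and-add-1 rule can be sketched as a Python function on fixed-width bit patterns. The name `negate` and the default width of 8 bits are choices made for this example.

```python
def negate(value, width=8):
    """Two's complement negation of a width-bit pattern: invert, then add 1."""
    mask = (1 << width) - 1
    inverted = ~value & mask          # flip every bit, keep only `width` bits
    return (inverted + 1) & mask      # add 1, discarding any carry out


minus_13 = negate(0b00001101)   # +13 -> -13
plus_13 = negate(minus_13)      # -13 -> +13 again
```

Applying the rule twice returns the original pattern, which is the "+N to –N to +N" round trip from the slide.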

Signed Numbers

• Increasing #bits (width) of two’s complement numbers is easy

• Called sign extension

• RULE: Replicate the leftmost bit

00001101 (+13)

0000000000001101 (+13)

11110011 (-13)

1111111111110011 (-13)

Unsigned Numbers

• Take Note:
  – Sign extending applies only to signed numbers
  – Hardware must know whether value is signed or unsigned

• Original: 1101 (+13, unsigned)
• Extended to 8 bits:
    00001101 (+13, unsigned; zero-extended, correct)
    11111101 (+253, unsigned; sign-extended, incorrect!)
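
Both widening rules fit in a few lines of Python. This is a sketch with illustrative function names; values are plain integers holding the bit patterns.

```python
def sign_extend(value, from_bits, to_bits):
    """Widen a two's complement pattern by replicating its leftmost bit."""
    value &= (1 << from_bits) - 1
    if (value >> (from_bits - 1)) & 1:
        # Sign bit is 1: fill bit positions from_bits..to_bits-1 with ones.
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


def zero_extend(value, from_bits):
    """Widen an unsigned pattern: the new upper bits are simply zeros."""
    return value & ((1 << from_bits) - 1)


pos = sign_extend(0b00001101, 8, 16)   # +13 stays +13
neg = sign_extend(0b11110011, 8, 16)   # -13 stays -13
wrong = sign_extend(0b1101, 4, 8)      # unsigned 13 mistakenly sign-extended
right = zero_extend(0b1101, 4)         # unsigned 13 correctly zero-extended
```

The last two lines reproduce the pitfall: the same 4-bit input gives 253 or 13 depending on which rule is applied, so the hardware has to be told which interpretation is intended.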

Signed Number Arithmetic

• Subtraction trick
    F = A – B = A + (–B) = A + (NOT B) + 1

• +/– logic used for unsigned numbers == +/– logic for signed numbers (using two’s complement)
  – No changes needed!
  – Note: not the same for multiply
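
The subtraction trick, A – B computed as A plus the bitwise inverse of B plus 1, can be tried out in Python. The function name `subtract` and the 4-bit default width are assumptions for this example.

```python
def subtract(a, b, width=4):
    """Compute a - b on width-bit patterns using only invert and add."""
    mask = (1 << width) - 1
    not_b = ~b & mask                  # bitwise inverse of b, width bits
    return (a + not_b + 1) & mask      # the +1 can be fed in as the adder's carry-in


five_minus_three = subtract(0b0101, 0b0011)
three_minus_five = subtract(0b0011, 0b0101)
```

The second result is the pattern 1110, which reads as -2 when signed and as 14 (that is, 3 - 5 modulo 16) when unsigned. The adder produces the same bits either way, which is why no separate signed hardware is needed for add and subtract.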

Signed Number Arithmetic

Assume values are signed…

• Add any two +’ve values
  – Sum is bigger: it may overflow!
      Example: 0101 + 0010 = 0111 (5 + 2 = 7)
      Example: 0101 + 0011 = 1000 (5 + 3 = -8?)
  – Answer is negative (wrong!) => overflow!

• Add any two –’ve values
  – |Sum| is bigger: it may overflow!
      Example: 1101 + 1011 = 1000 (-3 + -5 = -8)
      Example: 1101 + 1010 = 0111 (-3 + -6 = 7?)
  – Answer is positive (wrong!) => overflow!

Detecting Overflows

• Some languages care about overflows
  – FORTRAN cares
  – C language doesn’t care

• Overflow detection in Add operation
  – Compare sign of input operands to sign of result
  – If both operands have the same sign but the result's sign differs, there was an overflow:
      overflowPlus  = (A>=0) & (B>=0) & (A+B<0)
      overflowMinus = (A<0) & (B<0) & (A+B>=0)
      Overflow      = overflowPlus | overflowMinus

• What about Sub operation? Unsigned?

• MIPS design decision:
  – Unsigned instructions (addu, subu) do not detect overflow
  – Signed instructions (add, sub) raise an interrupt (exception) on overflow
  – Same +/– computation, but different side effects!
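
The sign-comparison test for signed addition can be sketched in Python on fixed-width bit patterns. The function name and the 4-bit default width are choices for this example, picked to match the 4-bit examples in the lecture.

```python
def add_with_overflow(a_bits, b_bits, width=4):
    """Add two width-bit two's complement patterns; return (result, overflow)."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    result = (a_bits + b_bits) & mask          # the adder output, carry-out dropped

    a_neg = bool(a_bits & sign)
    b_neg = bool(b_bits & sign)
    r_neg = bool(result & sign)

    overflow_plus = (not a_neg) and (not b_neg) and r_neg    # +ve + +ve gave -ve
    overflow_minus = a_neg and b_neg and (not r_neg)         # -ve + -ve gave +ve
    return result, overflow_plus or overflow_minus


ok_pos = add_with_overflow(0b0101, 0b0010)    # 5 + 2
bad_pos = add_with_overflow(0b0101, 0b0011)   # 5 + 3
ok_neg = add_with_overflow(0b1101, 0b1011)    # -3 + -5
bad_neg = add_with_overflow(0b1101, 0b1010)   # -3 + -6
```

Adding operands of opposite sign can never overflow, so neither condition fires in that case. In the terms of the MIPS design decision, `addu` would return only `result`, while `add` would additionally act on the overflow flag.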
