no slide titlejjohnson/2012-13/fall/cs281/... · 2012-08-21 · lec 15 systems architecture 2...


Lec 15 Systems Architecture 1

Systems Architecture

Lecture 15: A Simple Implementation of MIPS

Jeremy R. Johnson

Anatole D. Ruslanov

William M. Mongan

Some or all figures from Computer Organization and Design: The Hardware/Software Approach, Third Edition, by David Patterson and John Hennessy, are copyrighted material (COPYRIGHT 2004 MORGAN KAUFMANN PUBLISHERS, INC. ALL RIGHTS RESERVED).


Introduction

• Objective: To understand how to implement the MIPS instruction set.

• Combine components (registers, memory, ALU) and add control

• Fetch-Execute cycle

• Topics

– Sequential logic (elements with state) and timing (edge triggered)

• Memory

• Registers

– Datapath components: instruction memory, PC, adder, register file, ALU, data memory

– Implement a subset of MIPS in a single-cycle computer

– Shortcomings of a single-cycle computer


The Processor: Datapath & Control

• Implementation of MIPS

• Simplified to contain only:

– memory-reference instructions: lw, sw

– arithmetic-logical instructions: add, sub, and, or, slt

– control flow instructions: beq, j

• Generic Implementation:

– use the program counter (PC) to supply instruction address

– get the instruction from memory

– read registers

– use the instruction to decide exactly what to do


12/22/2011 Chapter 4 — The Processor 4

Instruction Execution

• PC → instruction memory, fetch instruction

• Register numbers → register file, read registers

• Depending on instruction class

– Use ALU to calculate

• Arithmetic result

• Memory address for load/store

• Branch target address

– Access data memory for load/store

– PC ← target address or PC + 4


Abstract View

• Two types of functional units:

– elements that operate on data values (combinational)

– elements that contain state (sequential)


Multiplexers

• Can’t just join wires together

• Use multiplexers


Control


Timing

• Clocks used in synchronous logic

– when should an element that contains state be updated?

• Edge-triggered timing

[Figure: clock waveform labeling the cycle time, the rising edge, and the falling edge]


Edge Triggered Timing

• State updated at clock edge

• Read contents of some state elements

• Send values through some combinational logic

• Write results to one or more state elements

[Figure: state element 1 → combinational logic → state element 2, all within one clock cycle]


Logic Design Basics

§4.2 Logic Design Conventions

• Information encoded in binary

– Low voltage = 0, High voltage = 1

– One wire per bit

– Multi-bit data encoded on multi-wire buses

• Combinational element

– Operate on data

– Output is a function of input

• State (sequential) elements

– Store information


Combinational Elements

• AND-gate

– Y = A & B

• Multiplexer

– Y = S ? I1 : I0

• Adder

– Y = A + B

• Arithmetic/Logic Unit

– Y = F(A, B)
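
The four combinational elements on this slide can be modeled as pure functions, since each output depends only on the current inputs. The following is a minimal sketch, not part of the original slides; the function names and the 32-bit bus width are assumptions made for illustration.

```python
# Sketch only: helper names and the 32-bit width are illustrative assumptions.
MASK32 = 0xFFFFFFFF


def and_gate(a: int, b: int) -> int:
    """Y = A & B, applied bit by bit across the bus."""
    return a & b


def mux(s: int, i0: int, i1: int) -> int:
    """Y = S ? I1 : I0 (select input 1 when S is 1, otherwise input 0)."""
    return i1 if s else i0


def adder(a: int, b: int) -> int:
    """Y = A + B, wrapped to 32 bits as a fixed-width bus would."""
    return (a + b) & MASK32


def alu(f, a: int, b: int) -> int:
    """Y = F(A, B), where F is whichever operation is currently selected."""
    return f(a, b) & MASK32
```

For example, `mux(1, 5, 9)` selects the second input, and `adder(0xFFFFFFFF, 1)` wraps around to 0 because the carry out of bit 31 is dropped.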


Sequential Elements

• Register: stores data in a circuit

– Uses a clock signal to determine when to update the stored value

– Edge-triggered: update when Clk changes from 0 to 1

[Figure: register with inputs D and Clk and output Q, with a timing diagram of Clk, D, and Q]


Sequential Elements

• Register with write control

– Only updates on clock edge when write control input is 1

– Used when stored value is required later

[Figure: register with inputs D, Clk, and Write and output Q, with a timing diagram of Clk, Write, D, and Q]
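
The behavior of an edge-triggered register with write control can be sketched in software by remembering the previous clock level and updating only on a 0-to-1 transition. This is an illustrative model, not the circuit from the slides; the class name `Register` and the method name `tick` are assumptions.

```python
class Register:
    """Edge-triggered register with a write-control input (illustrative model)."""

    def __init__(self, value: int = 0) -> None:
        self.q = value      # stored value, visible on output Q
        self._clk = 0       # previous clock level, used to detect edges

    def tick(self, clk: int, d: int, write: int = 1) -> int:
        """Present clock level, data input D, and Write; return output Q."""
        rising_edge = self._clk == 0 and clk == 1
        if rising_edge and write:
            self.q = d      # update only on a rising edge with Write = 1
        self._clk = clk
        return self.q


reg = Register()
reg.tick(0, 7)            # clock low: Q keeps its old value
reg.tick(1, 7)            # rising edge with Write = 1: Q becomes 7
reg.tick(0, 9)            # clock falls: no update
reg.tick(1, 9, write=0)   # rising edge but Write = 0: Q stays 7
```

Holding the clock high while D changes does not alter Q, which is the property that lets a state element be read and written in the same cycle.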


Clocking Methodology

• Combinational logic transforms data during clock cycles

– Between clock edges

– Input from state elements, output to state element

– Longest delay determines clock period


Components for Simple Implementation

• Functional Units needed for each instruction

[Figure: functional units: instruction memory, program counter (PC), adder, register file (three 5-bit register numbers, RegWrite control), ALU (3-bit ALU control input, Zero and ALU result outputs), data memory unit (MemRead and MemWrite controls), and sign-extension unit (16 bits to 32 bits)]


Instruction Fetch

• PC is a 32-bit register

• Increment by 4 for next instruction


R-Format Instructions

• Read two register operands

• Perform arithmetic/logical operation

• Write register result


Load/Store Instructions

• Read register operands

• Calculate address using 16-bit offset

– Use ALU, but sign-extend offset

• Load: Read memory and update register

• Store: Write register value to memory
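
The address calculation for loads and stores can be sketched as follows. This is a minimal illustration, assuming 32-bit addresses; the function names `sign_extend16` and `effective_address` are hypothetical and do not come from the slides.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit address width


def sign_extend16(imm16: int) -> int:
    """Interpret a 16-bit field as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def effective_address(base: int, offset16: int) -> int:
    """ALU add of the base register and the sign-extended 16-bit offset."""
    return (base + sign_extend16(offset16)) & MASK32
```

With a base register value of 0x1000, an offset field of 0x0100 gives address 0x1100, while an offset field of 0xFFFC (that is, -4) gives 0x0FFC.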


Branch Instructions

• Read register operands

• Compare operands

– Use ALU, subtract and check Zero output

• Calculate target address

– Sign-extend displacement

– Shift left 2 places (word displacement)

– Add to PC + 4

• Already calculated by instruction fetch
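
The branch steps above can be sketched as a next-PC computation. This is an illustrative model under the assumption of 32-bit addresses; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit address width


def sign_extend16(imm16: int) -> int:
    """Interpret a 16-bit field as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(pc: int, offset16: int) -> int:
    """Sign-extend the displacement, shift left 2, and add to PC + 4."""
    pc_plus_4 = (pc + 4) & MASK32
    return (pc_plus_4 + (sign_extend16(offset16) << 2)) & MASK32


def next_pc_beq(pc: int, rs_val: int, rt_val: int, offset16: int) -> int:
    """Subtract the operands in the ALU; branch only if the Zero output is set."""
    zero = ((rs_val - rt_val) & MASK32) == 0
    return branch_target(pc, offset16) if zero else (pc + 4) & MASK32
```

A word displacement of 25 therefore moves the PC 100 bytes past PC + 4 when the two registers are equal, and the PC simply advances by 4 otherwise.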


Branch Instructions

• Shift left 2: just re-routes wires

• Sign extend: sign-bit wire replicated


Composing the Elements

• First-cut data path does an instruction in one clock cycle

– Each datapath element can only do one function at a time

– Hence, we need separate instruction and data memories

• Use multiplexers where alternate data sources are used for different instructions


R-Type/Load/Store Datapath


Full Datapath


Adding Control

• Selecting the operations to perform (ALU, read/write, etc.)

• Controlling the flow of data (multiplexor inputs)

• Information comes from the 32 bits of the instruction

R-format:  op  rs  rt  rd  shamt  funct

I-format:  op  rs  rt  16-bit address

J-format:  op  26-bit address


MIPS Instructions

• add $t0,$s1,$s2

op      rs     rt     rd     shamt  funct
000000  10001  10010  01000  00000  100000

• lw $t0,256($t1)

op      rs     rt     offset
100011  01001  01000  0000 0001 0000 0000
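
The two encodings on this slide can be reproduced by packing the fields into a 32-bit word. This is a small sketch, not an assembler; the helper names `encode_r` and `encode_i` are assumptions, and the register numbers are the standard MIPS ones ($t0 = 8, $t1 = 9, $s1 = 17, $s2 = 18).

```python
def encode_r(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack an R-format word: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op: int, rs: int, rt: int, imm16: int) -> int:
    """Pack an I-format word: op(6) rs(5) rt(5) immediate(16)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm16 & 0xFFFF)


T0, T1, S1, S2 = 8, 9, 17, 18  # standard MIPS register numbers

add_word = encode_r(0, S1, S2, T0, 0, 0b100000)   # add $t0,$s1,$s2
lw_word = encode_i(0b100011, T1, T0, 256)         # lw $t0,256($t1)

# Print each word as 32 binary digits to compare with the slide.
print(format(add_word, "032b"))
print(format(lw_word, "032b"))
```

Note that the destination register of `add` goes in the rd field, while the destination of `lw` goes in the rt field, which is why the datapath needs a multiplexer on the write-register input.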


MIPS Instructions Continued

• beq $s1,$s2,25 => byte displacement 100 (25 words x 4)

op      rs     rt     offset
000100  10001  10010  0000 0000 0001 1001

• j 1024 => byte address 4096 (1024 words x 4), combined with PC+4[31-28]

op      address
000010  00 0000 0000 0000 0100 0000 0000


Determining ALU Control Bits

• ALUOp determined by instruction

Instruction opcode  ALUOp  Instruction operation  funct   ALU action  ALU control
LW                  00     load word              xxxxxx  add         010
SW                  00     store word             xxxxxx  add         010
BEQ                 01     branch eq              xxxxxx  sub         110
R-type              10     add                    100000  add         010
R-type              10     sub                    100010  sub         110
R-type              10     and                    100100  and         000
R-type              10     or                     100101  or          001
R-type              10     slt                    101010  slt         111

• Control Lines

000 and

001 or

010 add

110 sub

111 slt
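
The table above can be expressed as a small lookup, together with an ALU that interprets the five 3-bit control codes. This is a behavioral sketch under the assumption of 32-bit operands; the names `alu_control` and `alu` are illustrative, and real hardware would implement the same mapping with gates.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit operand width

# funct field -> 3-bit ALU control, taken from the R-type rows of the table
FUNCT_TO_CONTROL = {
    0b100000: 0b010,  # add
    0b100010: 0b110,  # sub
    0b100100: 0b000,  # and
    0b100101: 0b001,  # or
    0b101010: 0b111,  # slt
}


def alu_control(alu_op: int, funct: int = 0) -> int:
    """Compute the 3-bit ALU control input from ALUOp and the funct field."""
    if alu_op == 0b00:            # lw, sw: add for the address calculation
        return 0b010
    if alu_op == 0b01:            # beq: subtract and test Zero
        return 0b110
    if alu_op == 0b10:            # R-type: decided by the funct field
        return FUNCT_TO_CONTROL[funct]
    raise ValueError("ALUOp value not used in this subset")


def to_signed(x: int) -> int:
    """Reinterpret a 32-bit pattern as a signed value (needed for slt)."""
    return x - (1 << 32) if x & 0x80000000 else x


def alu(control: int, a: int, b: int) -> int:
    """Perform the operation selected by the 3-bit control code."""
    a &= MASK32
    b &= MASK32
    if control == 0b000:
        return a & b
    if control == 0b001:
        return a | b
    if control == 0b010:
        return (a + b) & MASK32
    if control == 0b110:
        return (a - b) & MASK32
    if control == 0b111:
        return 1 if to_signed(a) < to_signed(b) else 0
    raise ValueError("unsupported ALU control code")
```

For lw, sw, and beq the funct argument is ignored, which corresponds to the xxxxxx (don't care) entries in the table.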


• Must describe hardware to compute 3-bit ALU control input

– given instruction type (ALUOp, computed from instruction type)

• 00 = lw, sw

• 01 = beq

• 10 = arithmetic

– function code for arithmetic

• Describe it using a truth table (can turn into gates)

[Figure: truth table for the ALU control unit]


Datapath with Control

[Figure: single-cycle datapath with control unit. Instruction [31–26] feeds Control, which drives RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite. Instruction [25–21] and Instruction [20–16] select the read registers, a multiplexer chooses Instruction [20–16] or Instruction [15–11] as the write register, Instruction [15–0] is sign-extended from 16 to 32 bits, and Instruction [5–0] feeds ALU control.]


Control Line Settings

• 8 control lines (control read/write and multiplexors)

Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp
R-format     1       0       0         1         0        0         0       Func Code
lw           0       1       1         1         1        0         0       add
sw           X       1       X         0         0        1         0       add
beq          X       0       X         0         0        0         1       sub
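
The control table can be written as a lookup keyed by the 6-bit opcode. The sketch below is illustrative: the `Control` tuple and `decode_control` names are assumptions, `None` stands for a don't-care (X) entry, the ALUOp column uses the 2-bit codes from the ALU control slide (10 = R-type, 00 = add, 01 = sub), and the opcode values are the standard MIPS encodings.

```python
from typing import NamedTuple, Optional


class Control(NamedTuple):
    reg_dst: Optional[int]      # None marks a don't-care (X) entry
    alu_src: int
    mem_to_reg: Optional[int]
    reg_write: int
    mem_read: int
    mem_write: int
    branch: int
    alu_op: int                 # 2-bit ALUOp code


# Standard MIPS opcodes for the four instruction classes in the table.
CONTROL = {
    0b000000: Control(1, 0, 0, 1, 0, 0, 0, 0b10),        # R-format
    0b100011: Control(0, 1, 1, 1, 1, 0, 0, 0b00),        # lw
    0b101011: Control(None, 1, None, 0, 0, 1, 0, 0b00),  # sw
    0b000100: Control(None, 0, None, 0, 0, 0, 1, 0b01),  # beq
}


def decode_control(instruction: int) -> Control:
    """Look up the control lines using Instruction [31-26]."""
    return CONTROL[(instruction >> 26) & 0x3F]
```

Only lw asserts MemRead and only sw asserts MemWrite, while RegWrite is asserted for exactly the two classes that produce a register result.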


R-Type Instruction


Load Instruction


Branch-on-Equal Instruction


Implementing Jumps

• Jump uses word address

• Update PC with concatenation of

– Top 4 bits of old PC

– 26-bit jump address

– 00

• Need an extra control signal decoded from opcode

Jump format:  op = 2 (bits 31:26)  address (bits 25:0)
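
The concatenation that forms the jump target can be sketched with masks and shifts. This is an illustrative sketch with a hypothetical function name; it assumes the upper four bits are taken from PC + 4, following the `PC+4[31-28]` notation used on the earlier instruction-encoding slide.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit address width


def jump_target(pc: int, instruction: int) -> int:
    """Concatenate the top 4 bits of PC + 4, the 26-bit address, and 00."""
    pc_plus_4 = (pc + 4) & MASK32
    address26 = instruction & 0x03FFFFFF          # Instruction [25-0]
    return (pc_plus_4 & 0xF0000000) | (address26 << 2)


j_1024 = (0b000010 << 26) | 1024                  # encoding of j 1024
```

With the PC in the lowest 256 MB region, `jump_target(0, j_1024)` yields byte address 4096, matching the word address 1024 multiplied by 4.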


Datapath With Jumps Added


Shortcomings of a Single Cycle Implementation

• Limits reuse of hardware components

– each functional unit can be used only once per cycle

– e.g. instruction and data memory required

• Inefficient

– clock cycle determined by longest possible path in the machine

– e.g., assume time for:

• Memory units = 200 ps

• ALU and adders = 100 ps

• Register file (read or write) = 50 ps

Instruction class  Instruction memory  Register read  ALU operation  Data memory  Register write  Total
R-type             200                 50             100            0            50              400 ps
Load word          200                 50             100            200          50              600 ps
Store word         200                 50             100            200                          550 ps
Branch             200                 50             100            0                            350 ps
Jump               200                                                                            200 ps
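
The totals in the table follow from summing the delays of the units each instruction class uses, and the single-cycle clock period is the largest of those totals. The sketch below recomputes them from the assumed delays; the dictionary and key names are illustrative.

```python
# Unit delays in picoseconds, as assumed on the slide.
DELAY_PS = {
    "instr_mem": 200,
    "reg_read": 50,
    "alu": 100,
    "data_mem": 200,
    "reg_write": 50,
}

# Units on the path of each instruction class (from the table rows).
UNITS_USED = {
    "R-type": ["instr_mem", "reg_read", "alu", "reg_write"],
    "Load word": ["instr_mem", "reg_read", "alu", "data_mem", "reg_write"],
    "Store word": ["instr_mem", "reg_read", "alu", "data_mem"],
    "Branch": ["instr_mem", "reg_read", "alu"],
    "Jump": ["instr_mem"],
}

# Total latency per class, and the clock period set by the slowest class.
latency_ps = {
    cls: sum(DELAY_PS[unit] for unit in units)
    for cls, units in UNITS_USED.items()
}
clock_period_ps = max(latency_ps.values())
```

The load word path touches all five units, so it sets the clock period for every instruction in a single-cycle design.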


Single Cycle Model is inefficient!

• Assume 25% loads, 10% stores, 45% ALU instructions, 15% branches, and 5% jumps

CPU execution time = Instruction count x CPI x Clock cycle time

\[
\text{Performance ratio}
= \frac{\text{CPU Performance (Multicycle impl.)}}{\text{CPU Performance (Single cycle impl.)}}
= \frac{\text{CPU Exec. Time (Single cycle impl.)}}{\text{CPU Exec. Time (Multicycle impl.)}}
\]

\[
= \frac{600}{600 \times 25\% + 550 \times 10\% + 400 \times 45\% + 350 \times 15\% + 200 \times 5\%}
= \frac{600\ \text{ps}}{447.5\ \text{ps}}
\approx 1.34 \text{ times faster}
\]
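
The ratio can be checked numerically by weighting each class latency by its share of the instruction mix. This sketch assumes the per-class latencies from the previous slide and models the comparison implementation as one in which each instruction takes only as long as its own class requires; the variable names are illustrative.

```python
# Per-class latency in picoseconds (from the previous slide's table).
LATENCY_PS = {
    "Load word": 600,
    "Store word": 550,
    "R-type": 400,
    "Branch": 350,
    "Jump": 200,
}

# Instruction mix assumed on this slide (fractions sum to 1).
MIX = {
    "Load word": 0.25,
    "Store word": 0.10,
    "R-type": 0.45,
    "Branch": 0.15,
    "Jump": 0.05,
}

# Single-cycle design: every instruction pays for the slowest class.
single_cycle_ps = max(LATENCY_PS.values())

# Comparison design: average time per instruction, weighted by the mix.
average_ps = sum(MIX[cls] * LATENCY_PS[cls] for cls in MIX)

ratio = single_cycle_ps / average_ps
print(f"average = {average_ps:.1f} ps, ratio = {ratio:.2f}")
```

Because instruction count is the same for both designs and CPI is 1 in each, the execution-time ratio reduces to the ratio of the two per-instruction times.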