cs 61c: great ideas in computer architecture mips cpu

107
Instructor: Alan Christopher CS 61C: Great Ideas in Computer Architecture MIPS CPU Control, Pipelining 7/29/2014 1 Summer 2014-- Lecture #21

Upload: others

Post on 05-Feb-2022

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Instructor: Alan Christopher

CS 61C: Great Ideas in Computer Architecture

MIPS CPU Control,Pipelining

7/29/2014 1Summer 2014-- Lecture #21

Page 2: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Agenda

• Quick Datapath Review• Control Implementation• Administrivia• Clocking Methodology• Pipelined Execution• Pipelined Datapath

7/29/2014 2Summer 2014-- Lecture #21

Page 3: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Datapath Review

• Part of the processor; the hardware necessary to perform all operations required

– Depends on exact ISA, RTL of instructions

7/29/2014 3Summer 2014-- Lecture #21

Page 4: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Datapath Review

• Part of the processor; the hardware necessary to perform all operations required

– Depends on exact ISA, RTL of instructions

• Major components:– PC and Instruction Memory– Register File (RegFile holds registers)– Extender (sign/zero extend)– ALU for operations (on two operands)– Data Memory

7/29/2014 4Summer 2014-- Lecture #21

Page 5: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Five Stages of the Datapath

1. InstructionFetch

PC

inst

ruct

ion

mem

ory

+4

RegisterFilert

rsrd

ALU

Data

mem

ory

imm

MU

X

7/29/2014 5Summer 2014-- Lecture #21

Page 6: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Five Stages of the Datapath

1. InstructionFetch

2. Decode/ Register Read

PC

inst

ruct

ion

mem

ory

+4

RegisterFilert

rsrd

ALU

Data

mem

ory

imm

MU

X

1. InstructionFetch

7/29/2014 6Summer 2014-- Lecture #21

Page 7: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Five Stages of the Datapath

1. InstructionFetch

2. Decode/ Register Read

3. Execute

PC

inst

ruct

ion

mem

ory

+4

RegisterFilert

rsrd

ALU

Data

mem

ory

imm

MU

X

2. Decode/ Register Read

1. InstructionFetch

7/29/2014 7Summer 2014-- Lecture #21

Page 8: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Five Stages of the Datapath

1. InstructionFetch

2. Decode/ Register Read

3. Execute 4. Memory

PC

inst

ruct

ion

mem

ory

+4

RegisterFilert

rsrd

ALU

Data

mem

ory

imm

MU

X

2. Decode/ Register Read

3. Execute1. InstructionFetch

7/29/2014 8Summer 2014-- Lecture #21

Page 9: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Five Stages of the Datapath

1. InstructionFetch

2. Decode/ Register Read

3. Execute 4. Memory 5. Register Write

PC

inst

ruct

ion

mem

ory

+4

RegisterFilert

rsrd

ALU

Data

mem

ory

imm

MU

X

2. Decode/ Register Read

3. Execute 4. Memory1. InstructionFetch

7/29/2014 9Summer 2014-- Lecture #21

Page 10: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Datapath and Control

• Route parts of datapath based on ISA needs

– Add MUXes to select from multiple inputs– Add control signals for component inputs

and MUXes

• Analyze control signals– How wide does each one need to be?– For each instruction, assign appropriate

value for correct routing

7/29/2014 10Summer 2014-- Lecture #21

Page 11: CS 61C: Great Ideas in Computer Architecture MIPS CPU

MIPS-lite Instruction Fetch

imm16

CLKPC

4

PC

Ext

Adder

Adder

MU

X

0

1

32

InstructionAddr

InstructionMemory

Instr Fetch Unit

nPC_selzero

7/29/2014 11Summer 2014-- Lecture #21

Page 12: CS 61C: Great Ideas in Computer Architecture MIPS CPU

MIPS-lite Datapath Control SignalsExtOp: 0 → “zero”; 1 → “sign”ALUsrc: 0 → busB; 1 → imm16ALUctr: “ADD”, “SUB”, “OR”nPC_sel: 0 → +4; 1 → branch

MemWr: 1 → write memoryMemtoReg: 0 → ALU; 1 → MemRegDst: 0 → “rt”; 1 → “rd”RegWr: 1 → write register

32

ALUctr

CLK

busW

RegWr

32

32busA

32

busB

5 5

RW RA RB

rs

rt

rt

rdRegDst

Exte

nd

er 3216

imm16

ALUSrcExtOp

MemtoReg

CLK

Data In

32

MemWr

01

RegFile0

1

ALU

0

1DataMemory

WrEn Addr

5 zero

=

nPC_sel InstrFetchUnitCLK

Page 13: CS 61C: Great Ideas in Computer Architecture MIPS CPU

system

datapath control

stateregisters

combinationallogic

multiplexer comparatorcode

registers

register logic

switchingnetworks

Hardware Design Hierarchy

Today

7/29/2014 13Summer 2014-- Lecture #21

Page 14: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Agenda

• Quick Datapath Review• Control Implementation• Administrivia• Clocking Methodology• Pipelined Execution• Pipelined Datapath

7/29/2014 14Summer 2014-- Lecture #21

Page 15: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Processor Design Process

• Five steps to design a processor:1. Analyze instruction set →

datapath requirements

2. Select set of datapath components & establish clock methodology

3. Assemble datapath meeting the requirements

4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer

5. Assemble the control logic● Formulate Logic Equations● Design Circuits

Control

Datapath

Memory

ProcessorInput

Output

Now

7/29/2014 15Summer 2014-- Lecture #21

Page 16: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Purpose of Control

ALUctrRegDst ALUSrcExtOp MemtoRegMemWrnPC_sel

Datapath

RegWr

Instruction<31:0>

<2

0:1

6>

<1

5:1

1>

<5

:0>

<0

:15

>

imm16rdrs rt

InstrMemory

opcode

<2

5:2

1>

funct

<3

1:2

6>

7/29/2014 16Summer 2014-- Lecture #21

Page 17: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Purpose of Control

Controller

ALUctrRegDst ALUSrcExtOp MemtoRegMemWrnPC_sel

Datapath

RegWr

Instruction<31:0>

<2

0:1

6>

<1

5:1

1>

<5

:0>

<0

:15

>

imm16rdrs rt

InstrMemory

opcode

<2

5:2

1>

funct

<3

1:2

6>

7/29/2014 17Summer 2014-- Lecture #21

Page 18: CS 61C: Great Ideas in Computer Architecture MIPS CPU

MIPS-lite Instruction RTLInstr Register Transfer Language

addu R[rd]←R[rs]+R[rt]; PC←PC+4

subu R[rd]←R[rs]–R[rt]; PC←PC+4

ori R[rt]←R[rs]+zero_ext(imm16); PC←PC+4

lw R[rt]←MEM[R[rs]+sign_ext(imm16)];PC←PC+4

sw MEM[R[rs]+sign_ext(imm16)]←R[rs];PC←PC+4

beq if(R[rs]==R[rt])then PC←PC+4+[sign_ext(imm16)||00]else PC←PC+4

7/29/2014 18Summer 2014-- Lecture #21

Page 19: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Instr Control Signals

addu ALUsrc=RegB, ALUctr=“ADD”, RegDst=rd, RegWr,nPC_sel=“+4”

subu ALUsrc=RegB, ALUctr=“SUB”, RegDst=rd, RegWr, nPC_sel=“+4”

ori ALUsrc=Imm,  ALUctr=“OR”,  RegDst=rt, RegWr,ExtOp=“Zero”, nPC_sel=“+4”

lw ALUsrc=Imm,  ALUctr=“ADD”, RegDst=rt, RegWr,ExtOp=“Sign”, MemtoReg, nPC_sel=“+4”

sw ALUsrc=Imm,  ALUctr=“ADD”,            MemWr, ExtOp=“Sign”, nPC_sel=“+4”

beq ALUsrc=RegB, ALUctr=“SUB”, nPC_sel=“Br”  

MIPS-lite Control Signals (1/2)

7/29/2014 19Summer 2014-- Lecture #21

Page 20: CS 61C: Great Ideas in Computer Architecture MIPS CPU

MIPS-lite Control Signals (2/2)

add sub ori lw sw beq

RegDst

ALUSrc

MemtoReg

RegWrite

MemWrite

nPC_sel

ExtOp

ALUctr[1:0]

1

0

0

1

0

0

X

Add

1

0

0

1

0

0

XSubtract

0

1

0

1

0

0

0

Or

0

1

1

1

0

0

1

Add

X

1

X

0

1

0

1

Add

X

0

X

0

0

1

XSubtract

func

op 00 0000 00 0000 00 1101 10 0011 10 1011 00 0100Green Sheet10 0000See MIPS 10 0010 n/a

Con

trol S

ign

als

All Supported Instructions

• Now how do we implement this table with CL?7/29/2014 20Summer 2014-- Lecture #21

Page 21: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Generating Boolean Expressions

• Idea #1: Treat instruction names as Boolean variables!

7/29/2014 21Summer 2014-- Lecture #21

Page 22: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Generating Boolean Expressions

• Idea #1: Treat instruction names as Boolean variables!

– opcode and funct bits are available to us– Use gates to generate signals that are 1 when

it is a particular instruction and 0 otherwise

7/29/2014 22Summer 2014-- Lecture #21

Page 23: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Generating Boolean Expressions

• Idea #1: Treat instruction names as Boolean variables!

– opcode and funct bits are available to us– Use gates to generate signals that are 1 when

it is a particular instruction and 0 otherwise

• Examples:beq = op[5]’∙op[4]’∙op[3]’∙op[2]∙op[1]’∙op[0]’ 

Rtype = op[5]’∙op[4]’∙op[3]’∙op[2]’∙op[1]’∙op[0]’

add = Rtype∙funct[5]∙funct[4]’∙funct[3]’    ∙funct[2]’∙funct[1]’∙funct[0]’

7/29/2014 23Summer 2014-- Lecture #21

Page 24: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Generating Boolean Expressions

• Idea #2: Use instruction variables to generate control signals

– Make each control signal the combination of all instructions that need that signal to be a 1

7/29/2014 24Summer 2014-- Lecture #21

Page 25: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Generating Boolean Expressions

• Idea #2: Use instruction variables to generate control signals

– Make each control signal the combination of all instructions that need that signal to be a 1

• Examples:– MemWrite = sw– RegWrite = add + sub + ori + lw

Read from row of table

7/29/2014 25Summer 2014-- Lecture #21

Page 26: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Generating Boolean Expressions

• Idea #2: Use instruction variables to generate control signals

– Make each control signal the combination of all instructions that need that signal to be a 1

• Examples:– MemWrite = sw– RegWrite = add + sub + ori + lw

• What about don’t cares (X’s)?

Read from row of table

7/29/2014 26Summer 2014-- Lecture #21

Page 27: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Generating Boolean Expressions

• Idea #2: Use instruction variables to generate control signals

– Make each control signal the combination of all instructions that need that signal to be a 1

• Examples:– MemWrite = sw– RegWrite = add + sub + ori + lw

• What about don’t cares (X’s)?– Want simpler expressions; set to 0!

Read from row of table

7/29/2014 27Summer 2014-- Lecture #21

Page 28: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Controller Implementation

“AND” Logic

add

sub

ori

lw

sw

beq

“OR” Logic

 RegDst ALUSrcMemtoReg

 RegWriteMemWrite

 nPC_selExtOp

  ALUctr[0]  ALUctr[1]

opcode funct

• Use these two ideas to design controller:

Generate instruction

signals

Generate control signals

7/29/2014 28Summer 2014-- Lecture #21

Page 29: CS 61C: Great Ideas in Computer Architecture MIPS CPU

AND Control Logic in Logisim

7/29/2014 29Summer 2014-- Lecture #21

Page 30: CS 61C: Great Ideas in Computer Architecture MIPS CPU

OR Control Logic in Logisim

7/29/2014 30Summer 2014-- Lecture #21

Page 31: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Great Idea #1: Levels of Representation/Interpretation

7/29/2014 Summer 2014-- Lecture #21 31

lw $t0, 0($2)lw $t1, 4($2)sw $t1, 0($2)sw $t0, 4($2)

Higher-Level LanguageProgram (e.g. C)

Assembly Language Program (e.g. MIPS)

Machine Language Program (MIPS)

Compiler

Assembler

Machine Interpretation

temp = v[k];v[k] = v[k+1];v[k+1] = temp;

0000 1001 1100 0110 1010 1111 0101 10001010 1111 0101 1000 0000 1001 1100 0110 1100 0110 1010 1111 0101 1000 0000 1001 0101 1000 0000 1001 1100 0110 1010 1111

Architecture Implementation

Hardware Architecture Description(e.g. block diagrams)

Logic Circuit Description(Circuit Schematic Diagrams)

Page 32: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Great Idea #1: Levels of Representation/Interpretation

7/29/2014 Summer 2014-- Lecture #21 32

lw $t0, 0($2)lw $t1, 4($2)sw $t1, 0($2)sw $t0, 4($2)

Higher-Level LanguageProgram (e.g. C)

Assembly Language Program (e.g. MIPS)

Machine Language Program (MIPS)

Compiler

Assembler

Machine Interpretation

temp = v[k];v[k] = v[k+1];v[k+1] = temp;

0000 1001 1100 0110 1010 1111 0101 10001010 1111 0101 1000 0000 1001 1100 0110 1100 0110 1010 1111 0101 1000 0000 1001 0101 1000 0000 1001 1100 0110 1010 1111

Architecture Implementation

Hardware Architecture Description(e.g. block diagrams)

Logic Circuit Description(Circuit Schematic Diagrams)

CALL HOME, WE’VE MADE HARDWARE/SOFTWARE

CONTACT!!!

Page 33: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Question: Are the following statements TRUE or FALSE? Assume use of the AND-OR controller design.

1) Adding a new instruction will NOT require changing any of your existing control logic. (new logic OK though)

2) Adding a new control signal will NOT require changing any of your existing control logic. (new logic OK though)

F F(B)F T(G)T F(P)T T(Y)

1 2

33

Page 34: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Question: Are the following statements TRUE or FALSE? Assume use of the AND-OR controller design.

1) Adding a new instruction will NOT require changing any of your existing control logic. (new logic OK though)

2) Adding a new control signal will NOT require changing any of your existing control logic. (new logic OK though)

F F(B)F T(G)T F(P)T T(Y)

1 2

34

Page 35: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Agenda

• Quick Datapath Review• Control Implementation• Administrivia• Clocking Methodology• Pipelined Execution• Pipelined Datapath

7/29/2014 35Summer 2014-- Lecture #21

Page 36: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Administrivia

• HW 5 due Thursday• Project 2 due Sunday• No lab on Thursday

– Make up labs encouraged● Labs checked off in lab thursday treated as

though checked of on tuesday for lateness purposes

• Project 3: Pipelined Processor in Logisim will be released this week

7/29/2014 36Summer 2014-- Lecture #21

Page 37: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Agenda

• Quick Datapath Review• Control Implementation• Administrivia• Clocking Methodology• Pipelined Execution• Pipelined Datapath

7/29/2014 37Summer 2014-- Lecture #21

Page 38: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Register-Register Timing: One Complete Cycle for addu

Clk

PC

Rs, Rt, Rd,Op, Func

ALUctr

Instruction Memory Access Time

Old Value New Value

RegWr Old Value New Value

Delay through Control Logic

busA, BRegister File Access Time

Old Value New Value

busWALU Delay

Old Value New Value

Old Value New Value

New ValueOld Value

32

ALUctr

clk

busW

RegWr

32busA

32

busB

5 5

Rw Ra Rb

RegFile

Rs Rt

ALU

5Rd

Clk-to-Q

38

Page 39: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Register-Register Timing: One Complete Cycle for addu

Clk

PC

Rs, Rt, Rd,Op, Func

ALUctr

Instruction Memory Access Time

Old Value New Value

RegWr Old Value New Value

Delay through Control Logic

busA, BRegister File Access Time

Old Value New Value

busWALU Delay

Old Value New Value

Old Value New Value

New ValueOld Value

32

ALUctr

clk

busW

RegWr

32busA

32

busB

5 5

Rw Ra Rb

RegFile

Rs Rt

ALU

5Rd

Clk-to-Q

39

Page 40: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Register-Register Timing: One Complete Cycle for addu

Clk

PC

Rs, Rt, Rd,Op, Func

ALUctr

Instruction Memory Access Time

Old Value New Value

RegWr Old Value New Value

Delay through Control Logic

busA, BRegister File Access Time

Old Value New Value

busWALU Delay

Old Value New Value

Old Value New Value

New ValueOld Value

32

ALUctr

clk

busW

RegWr

32busA

32

busB

5 5

Rw Ra Rb

RegFile

Rs Rt

ALU

5Rd

Clk-to-Q

40

Page 41: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Register-Register Timing: One Complete Cycle for addu

Clk

PC

Rs, Rt, Rd,Op, Func

ALUctr

Instruction Memory Access Time

Old Value New Value

RegWr Old Value New Value

Delay through Control Logic

busA, B

Register File Access Time

Old Value New Value

busWALU Delay

Old Value New Value

Old Value New Value

New ValueOld Value

32

ALUctr

clk

busW

RegWr

32busA

32

busB

5 5

Rw Ra Rb

RegFile

Rs Rt

ALU

5Rd

Clk-to-Q

41

Page 42: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Register-Register Timing: One Complete Cycle for addu

Clk

PC

Rs, Rt, Rd,Op, Func

ALUctr

Instruction Memory Access Time

Old Value New Value

RegWr Old Value New Value

Delay through Control Logic

busA, B

Register File Access Time

Old Value New Value

busWALU Delay

Old Value New Value

Old Value New Value

New ValueOld Value

Register WriteOccurs Here32

ALUctr

clk

busW

RegWr

32busA

32

busB

5 5

Rw Ra Rb

RegFile

Rs Rt

ALU

5Rd

Clk-to-Q

Setup Time

42

Page 43: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Clocking Methodology

• Storage elements (RegFile, Mem, PC) triggered by same clock

Clk

.

.

.

.

.

.

.

.

.

.

.

.

7/29/2014 43Summer 2014-- Lecture #21

Page 44: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Clocking Methodology

• Storage elements (RegFile, Mem, PC) triggered by same clock

• Critical path determines length of clock period – This includes CLK-to-Q delay and setup delay

Clk

.

.

.

.

.

.

.

.

.

.

.

.

7/29/2014 44Summer 2014-- Lecture #21

Page 45: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Clocking Methodology

• Storage elements (RegFile, Mem, PC) triggered by same clock

• Critical path determines length of clock period – This includes CLK-to-Q delay and setup delay

• So far we have built a single cycle CPU – entire instructions are executed in 1 clock cycle

– Up next: pipelining to execute instructions in 5 clock cycles

Clk

.

.

.

.

.

.

.

.

.

.

.

.

7/29/2014 45Summer 2014-- Lecture #21

Page 46: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Single Cycle Performance• Assume time for actions are 100ps for

register read or write; 200ps for other events• Minimum clock period is?

Instr Instr fetch

Register read

ALU op Memory access

Register write

Total time

lw 200ps 100 ps 200ps 200ps 100 ps 800ps

sw 200ps 100 ps 200ps 200ps 700ps

R-format 200ps 100 ps 200ps 100 ps 600ps

beq 200ps 100 ps 200ps 500ps

7/29/2014 46Summer 2014-- Lecture #21

Page 47: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Single Cycle Performance• Assume time for actions are 100ps for

register read or write; 200ps for other events• Minimum clock period is?

Instr Instr fetch

Register read

ALU op Memory access

Register write

Total time

lw 200ps 100 ps 200ps 200ps 100 ps 800ps

sw 200ps 100 ps 200ps 200ps 700ps

R-format 200ps 100 ps 200ps 100 ps 600ps

beq 200ps 100 ps 200ps 500ps

7/29/2014 47Summer 2014-- Lecture #21

Page 48: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Single Cycle Performance• Assume time for actions are 100ps for

register read or write; 200ps for other events• Minimum clock period is?

Instr Instr fetch

Register read

ALU op Memory access

Register write

Total time

lw 200ps 100 ps 200ps 200ps 100 ps 800ps

sw 200ps 100 ps 200ps 200ps 700ps

R-format 200ps 100 ps 200ps 100 ps 600ps

beq 200ps 100 ps 200ps 500ps

• What can we do to improve clock rate?

• Will this improve performance as well?● Want increased clock rate to mean faster programs

7/29/2014 48Summer 2014-- Lecture #21

Page 49: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Agenda

• Quick Datapath Review• Control Implementation• Administrivia• Clocking Methodology• Pipelined Execution• Pipelined Datapath

7/29/2014 49Summer 2014-- Lecture #21

Page 50: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipeline Analogy: Doing Laundry

• Ann, Brian, Cathy, and Dave each have one load of clothes to wash, dry, fold, and put away

– Washer takes 30 minutes

– Dryer takes 30 minutes

– “Folder” takes 30 minutes

– “Stasher” takes 30 minutes to put clothes into drawers

A B C D

7/29/2014 50Summer 2014-- Lecture #21

Page 51: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Sequential Laundry

Task

Order

B

C

D

A

30Time

3030 3030 30 3030 3030 3030 3030 3030

6 PM 7 8 9 10 11 12 1 2 AM

7/29/2014 51Summer 2014-- Lecture #21

Page 52: CS 61C: Great Ideas in Computer Architecture MIPS CPU

• Sequential laundry takes 8 hours for 4 loads

Sequential Laundry

Task

Order

B

C

D

A

30Time

3030 3030 30 3030 3030 3030 3030 3030

6 PM 7 8 9 10 11 12 1 2 AM

7/29/2014 52Summer 2014-- Lecture #21

Page 53: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelined Laundry

Task

Order

B

C

D

A

12 2 AM6 PM 7 8 9 10 11 1

Time303030 3030 30 30

7/29/2014 53Summer 2014-- Lecture #21

Page 54: CS 61C: Great Ideas in Computer Architecture MIPS CPU

• Pipelined laundry takes 3.5 hours for 4 loads!

Pipelined Laundry

Task

Order

B

C

D

A

12 2 AM6 PM 7 8 9 10 11 1

Time303030 3030 30 30

7/29/2014 54Summer 2014-- Lecture #21

Page 55: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining Lessons (1/2)

• Pipelining doesn’t help latency of single task, just throughput of entire workload

6 PM 7 8 9

Time

B

C

D

A

3030 30 3030 30 30

Task

Order

7/29/2014 55Summer 2014-- Lecture #21

Page 56: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining Lessons (1/2)

• Pipelining doesn’t help latency of single task, just throughput of entire workload

• Multiple tasks operating simultaneously using different resources

6 PM 7 8 9

Time

B

C

D

A

3030 30 3030 30 30

Task

Order

7/29/2014 56Summer 2014-- Lecture #21

Page 57: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining Lessons (1/2)

• Pipelining doesn’t help latency of single task, just throughput of entire workload

• Multiple tasks operating simultaneously using different resources

• Potential speedup = number of pipeline stages

6 PM 7 8 9

Time

B

C

D

A

3030 30 3030 30 30

Task

Order

7/29/2014 57Summer 2014-- Lecture #21

Page 58: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining Lessons (1/2)

• Pipelining doesn’t help latency of single task, just throughput of entire workload

• Multiple tasks operating simultaneously using different resources

• Potential speedup = number of pipeline stages

• Speedup reduced by time to fill and drain the pipeline:8 hours/3.5 hours or 2.3X v. potential 4X in this example

6 PM 7 8 9

Time

B

C

D

A

3030 30 3030 30 30

Task

Order

7/29/2014 58Summer 2014-- Lecture #21

Page 59: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining Lessons (2/2)

• Suppose new Washer takes 20 minutes, new Stasher takes 20 minutes. How much faster is pipeline?

6 PM 7 8 9

Time

B

C

D

A

3030 30 3030 30 30

Task

Order

7/29/2014 59Summer 2014-- Lecture #21

Page 60: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining Lessons (2/2)

• Suppose new Washer takes 20 minutes, new Stasher takes 20 minutes. How much faster is pipeline?

– Pipeline rate limited by slowest pipeline stage

6 PM 7 8 9

Time

B

C

D

A

3030 30 3030 30 30

Task

Order

7/29/2014 60Summer 2014-- Lecture #21

Page 61: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining Lessons (2/2)

• Suppose new Washer takes 20 minutes, new Stasher takes 20 minutes. How much faster is pipeline?

– Pipeline rate limited by slowest pipeline stage

– Unbalanced lengths of pipeline stages reduces speedup

6 PM 7 8 9

Time

B

C

D

A

3030 30 3030 30 30

Task

Order

7/29/2014 61Summer 2014-- Lecture #21

Page 62: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Agenda

• Quick Datapath Review• Control Implementation• Administrivia• Clocking Methodology• Pipelined Execution• Pipelined Datapath

7/29/2014 62Summer 2014-- Lecture #21

Page 63: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Recall: 5 Stages of MIPS Datapath

1) IF: Instruction Fetch, Increment PC2) ID: Instruction Decode, Read Registers3) EX: Execution (ALU)

Load/Store: Calculate Address Others: Perform Operation

4) MEM: Load: Read Data from Memory Store: Write Data to Memory

5) WB: Write Data Back to Register

7/29/2014 63Summer 2014-- Lecture #21

Page 64: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelined Datapath

• Add registers between stages– Hold information produced in previous cycle

1. InstructionFetch

2. Decode/ Register Read

3. Execute 4. Memory 5. Write Back

PC

inst

ruct

ion

mem

ory

+4

RegisterFilert

rsrd

ALU

Data

mem

ory

imm

MU

X

7/29/2014 64Summer 2014-- Lecture #21

Page 65: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelined Datapath

• Add registers between stages– Hold information produced in previous cycle

• 5 stage pipeline– Clock rate potentially 5x faster

1. InstructionFetch

2. Decode/ Register Read

3. Execute 4. Memory 5. Write Back

PC

inst

ruct

ion

mem

ory

+4

RegisterFilert

rsrd

ALU

Data

mem

ory

imm

MU

X

7/29/2014 65Summer 2014-- Lecture #21

Page 66: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining Changes

• Registers affect flow of information– Name registers for adjacent stages (e.g.

IF/ID)

7/29/2014 66Summer 2014-- Lecture #21

Page 67: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining Changes

• Registers affect flow of information– Name registers for adjacent stages (e.g.

IF/ID)– Registers separate the information

between stages● You can still pass information around registers

7/29/2014 67Summer 2014-- Lecture #21

Page 68: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining Changes

• Registers affect flow of information– Name registers for adjacent stages (e.g.

IF/ID)– Registers separate the information

between stages● You can still pass information around registers

– At any instance of time, each stage working on a different instruction!

7/29/2014 68Summer 2014-- Lecture #21

Page 69: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining Changes

• Registers affect flow of information– Name registers for adjacent stages (e.g.

IF/ID)– Registers separate the information

between stages● You can still pass information around registers

– At any instance of time, each stage working on a different instruction!

• Will need to re-examine placement of wires and hardware in datapath

7/29/2014 69Summer 2014-- Lecture #21

Page 70: CS 61C: Great Ideas in Computer Architecture MIPS CPU

More Detailed Pipeline

• Examine flow through pipeline for lw

7/29/2014 70Summer 2014-- Lecture #21

Page 71: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Instruction Fetch (IF) for Load

Components in use are highlighted

For sequential logic, left half means write, right half means read

7/29/2014 71Summer 2014-- Lecture #21

Page 72: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Instruction Decode (ID) for Load

7/29/2014 72Summer 2014-- Lecture #21

Page 73: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Execute (EX) for Load

7/29/2014 73Summer 2014-- Lecture #21

Page 74: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Memory (MEM) for Load

7/29/2014 74Summer 2014-- Lecture #21

Page 75: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Write Back (WB) for Load

There’s something wrong here! (Can you spot it?)

7/29/2014 75Summer 2014-- Lecture #21

Page 76: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Write Back (WB) for Load

There’s something wrong here! (Can you spot it?)

Wrong register number!

7/29/2014 76Summer 2014-- Lecture #21

Page 77: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Corrected Datapath• Now any instruction that writes to a register will work properly

7/29/2014 77Summer 2014-- Lecture #21

Page 78: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Technology Break

7/29/2014 Summer 2014-- Lecture #21 78

Page 79: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Agenda

• Quick Datapath Review• Control Implementation• Administrivia• Clocking Methodology• Pipelined Execution• Pipelined Datapath (Continued)

7/29/2014 79Summer 2014-- Lecture #21

Page 80: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelined Execution Representation

IF

Time

7/29/2014 80Summer 2014-- Lecture #21

Page 81: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelined Execution Representation

IF ID

IF

Time

7/29/2014 81Summer 2014-- Lecture #21

Page 82: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelined Execution Representation

IF ID EX

IF ID

IF

Time

7/29/2014 82Summer 2014-- Lecture #21

Page 83: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelined Execution Representation

IF ID EX MEM

IF ID EX

IF ID

IF

Time

7/29/2014 83Summer 2014-- Lecture #21

Page 84: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelined Execution Representation

IF ID EX MEM WB

IF ID EX MEM

IF ID EX

IF ID

IF

Time

7/29/2014 84Summer 2014-- Lecture #21

Page 85: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelined Execution Representation

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM

IF ID EX

IF ID

IF

Time

7/29/2014 85Summer 2014-- Lecture #21

Page 86: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelined Execution Representation

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM

IF ID EX

IF ID

Time

7/29/2014 86Summer 2014-- Lecture #21

Page 87: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelined Execution Representation

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM

IF ID EX

Time

7/29/2014 87Summer 2014-- Lecture #21

Page 88: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelined Execution Representation

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM

Time

7/29/2014 88Summer 2014-- Lecture #21

Page 89: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelined Execution Representation

• Every instruction must take same number of steps, so some will idle

– e.g. MEM stage for any arithmetic instruction

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

Time

7/29/2014 89Summer 2014-- Lecture #21

Page 90: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Graphical Pipeline Diagrams

1. InstructionFetch

2. Decode/ Register Read

3. Execute 4. Memory 5. Write Back

PC

inst

ruct

ion

mem

ory

+4

RegisterFilert

rsrd

ALU

Data

mem

ory

imm

MU

X

7/29/2014 90Summer 2014-- Lecture #21

Page 91: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Graphical Pipeline Diagrams

• Use datapath figure below to represent pipeline:IF ID EX Mem WB

AL

U I$ Reg D$ Reg

1. InstructionFetch

2. Decode/ Register Read

3. Execute 4. Memory 5. Write Back

PC

inst

ruct

ion

mem

ory

+4

RegisterFilert

rsrd

ALU

Data

mem

ory

imm

MU

X

7/29/2014 91Summer 2014-- Lecture #21

Page 92: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Instr

Order

Load

Add

Store

Sub

Or

I$

Time (clock cycles)

I$

AL

U

Reg

Reg

I$

D$

AL

U

AL

U

Reg

D$

Reg

I$

D$

Reg

AL

U

Reg Reg

Reg

D$

Reg

D$

AL

U

• RegFile: right half is read, left half is write

Reg

I$

Graphical Pipeline Representation

7/29/2014 92Summer 2014-- Lecture #21

Page 93: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Instruction Level Parallelism (ILP)

• Pipelining allows us to execute parts of multiple instructions at the same time using the same hardware!

– This is known as instruction level parallelism

• Recall: Types of parallelism– DLP: same operation on lots of data

(SIMD)– TLP: executing multiple threads

“simultaneously” (OpenMP)

7/29/2014 93Summer 2014-- Lecture #21

Page 94: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipeline Performance (1/3)

• Use Tc (“time between completion of instructions”) to measure speedup

7/29/2014 94Summer 2014-- Lecture #21

Page 95: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipeline Performance (1/3)

• Use Tc (“time between completion of instructions”) to measure speedup

– Equality only achieved if stages are balanced (i.e. take the same amount of time) and register timing costs are negligible

• If not balanced, speedup is reduced

7/29/2014 95Summer 2014-- Lecture #21

Page 96: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipeline Performance (1/3)

• Use Tc (“time between completion of instructions”) to measure speedup

– Equality only achieved if stages are balanced (i.e. take the same amount of time) and register timing costs are negligible

• If not balanced, speedup is reduced• Speedup due to increased throughput

– Latency for each instruction does not decrease

7/29/2014 96Summer 2014-- Lecture #21

Page 97: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipeline Performance (2/3)

• Assume time for stages is– 100ps for register read or write– 200ps for other stages

• What is pipelined clock rate?– Compare pipelined datapath with single-cycle datapath

Instr Instr fetch

Register read

ALU op Memory access

Register write

Total time

lw 200ps 100 ps 200ps 200ps 100 ps 800ps

sw 200ps 100 ps 200ps 200ps 700ps

R-format 200ps 100 ps 200ps 100 ps 600ps

beq 200ps 100 ps 200ps 500ps

7/29/2014 97Summer 2014-- Lecture #21

Page 98: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipeline Performance (3/3)

Single-cycle

Pipelined

7/29/2014 98Summer 2014-- Lecture #21

Page 99: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipeline Performance (3/3)

Single-cycleTc = 800 ps

PipelinedTc = 200 ps

7/29/2014 99Summer 2014-- Lecture #21

Page 100: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining and ISA Design

• MIPS Instruction Set designed for pipelining!

7/29/2014 100Summer 2014-- Lecture #21

Page 101: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining and ISA Design

• MIPS Instruction Set designed for pipelining!• All instructions are 32-bits

– Easier to fetch and decode in one cycle

7/29/2014 101Summer 2014-- Lecture #21

Page 102: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining and ISA Design

• MIPS Instruction Set designed for pipelining!• All instructions are 32-bits

– Easier to fetch and decode in one cycle• Few and regular instruction formats, 2 source

register fields always in same place– Can decode and read registers in one step

7/29/2014 102Summer 2014-- Lecture #21

Page 103: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining and ISA Design

• MIPS Instruction Set designed for pipelining!• All instructions are 32-bits

– Easier to fetch and decode in one cycle• Few and regular instruction formats, 2 source

register fields always in same place– Can decode and read registers in one step

• Memory operands only in Loads and Stores– Can calculate address 3rd stage, access

memory 4th stage

7/29/2014 103Summer 2014-- Lecture #21

Page 104: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Pipelining and ISA Design

• MIPS Instruction Set designed for pipelining!• All instructions are 32-bits

– Easier to fetch and decode in one cycle• Few and regular instruction formats, 2 source

register fields always in same place– Can decode and read registers in one step

• Memory operands only in Loads and Stores– Can calculate address 3rd stage, access

memory 4th stage• Alignment of memory operands

– Memory access takes only one cycle

7/29/2014 104Summer 2014-- Lecture #21

Page 105: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Question: Assume the stage times shown below.Suppose we remove loads and stores from our ISA. Consider going from a single-cycle implementation to a 4-stage pipelined version.

1) The latency will be 1.25x slower.2) The throughput will be 3x faster.

F F(B)F T(G)T F(P)T T(Y)

1 2

105

Instr Fetch Reg Read ALU Op Mem Access Reg Write

200ps 100 ps 200ps 200ps 100 ps

Page 106: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Question: Assume the stage times shown below.Suppose we remove loads and stores from our ISA. Consider going from a single-cycle implementation to a 4-stage pipelined version.

1) The latency will be 1.25x slower.2) The throughput will be 3x faster.

F F(B)F T(G)T F(P)T T(Y)

1 2

106

Instr Fetch Reg Read ALU Op Mem Access Reg Write

200ps 100 ps 200ps 200ps 100 ps

Page 107: CS 61C: Great Ideas in Computer Architecture MIPS CPU

Summary

• Implementing controller for your datapath– Take decoded signals from instruction and

generate control signals– Use “AND” and “OR” Logic scheme

• Pipelining improves performance by exploiting Instruction Level Parallelism

– 5-stage pipeline for MIPS: IF, ID, EX, MEM, WB– Executes multiple instructions in parallel– Each instruction has the same latency– Be careful of signal passing (more on this next

lecture)

7/29/2014 Summer 2014-- Lecture #21 107