first implementation of the datapath - biuu.cs.biu.ac.il/~wiseman/co/co3.pdf · 2001. 10. 10. · 3...

Copyright 1995 by Coherence LTD., all rights reserved (Revised: Oct 97 by Rafi Lohev, Oct 99 by Yair Wiseman)

3 Single-Cycle Datapath

First implementation of the Datapath:

– One cycle for each instruction.
– All combinational logic must stabilize within one clock cycle.
– All state elements will be written exactly once at the end of the clock cycle.
– Simplified version; helps understand datapath operation.

● Architectural elements required for single cycle implementation:
– Memories to hold instructions and data.
– Register file.
– ALU for basic arithmetic and logical operations.
– Adders for computing instruction and jump addresses.
– Multiplexors will be used to choose the correct resources for each instruction.
☛ Note: Because everything must complete in one clock, we cannot use the ALU for more than one operation. Extra adders are required for computing addresses.

● Control requirements --
– ALU, multiplexors.


Instructions review

● R-type (register) instruction format:

    op       rs       rt       rd       shamt    funct
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

– op: operation to be performed by the instruction.
– rs: 1st register source operand.
– rt: 2nd register source operand.
– rd: destination register; gets result.
– shamt: shift amount.
– funct: selects operation more specifically than op does.

● Example: add $8, $17, $18

    op       rs       rt       rd       shamt    funct
    000000   10001    10010    01000    00000    100000
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
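The encoding of the example can be checked with a few shifts. The following is a minimal sketch; the helper name encode_r_type is made up for illustration and is not part of the slides.

```python
def encode_r_type(rs, rt, rd, shamt, funct, op=0):
    """Pack the six R-type fields into one 32-bit instruction word.

    Field positions: op 31-26, rs 25-21, rt 20-16, rd 15-11,
    shamt 10-6, funct 5-0.
    """
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $8, $17, $18  ->  rd = 8, rs = 17, rt = 18, funct = 100000 (binary)
word = encode_r_type(rs=17, rt=18, rd=8, shamt=0, funct=0b100000)
print(f"{word:032b}")  # the six fields concatenated
print(hex(word))
```

Splitting the printed 32-bit string into groups of 6, 5, 5, 5, 5 and 6 bits gives back the row shown in the example.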


Instructions review (Cont.)

● For load/store instructions we need a field which holds an address; 5 bits is too small.

● To preserve the principle of simplicity, we want all instructions to be the same length - 32 bits, so we have more than one instruction format. The op field distinguishes the instruction formats.

– I-type (immediate) instruction format

    op       rs       rt       address
    6 bits   5 bits   5 bits   16 bits

● Example: lw $8, Astart($19)

    op       rs       rt       address
    100011   10011    01000    Astart
    6 bits   5 bits   5 bits   16 bits


Instruction Formats Handled by the Datapath

R-type instruction

    field           0       rs      rt      rd      shamt   funct
    bit positions   31-26   25-21   20-16   15-11   10-6    5-0

Load Word (LW)/Store Word (SW) instruction

    field           35 or 43   rs      rt      address
    bit positions   31-26      25-21   20-16   15-0

Branch Equal (BEQ) instruction

    field           4       rs      rt      address
    bit positions   31-26   25-21   20-16   15-0
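The bit positions above translate directly into shift-and-mask operations. The sketch below decodes a 32-bit word into its fields; the function name decode and the offset 100 used for the lw example are assumptions for illustration (the slides only give the symbolic address Astart).

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fields."""
    fields = {
        "op": (word >> 26) & 0x3F,   # bits 31-26
        "rs": (word >> 21) & 0x1F,   # bits 25-21
        "rt": (word >> 16) & 0x1F,   # bits 20-16
    }
    if fields["op"] == 0:            # R-type
        fields["rd"] = (word >> 11) & 0x1F     # bits 15-11
        fields["shamt"] = (word >> 6) & 0x1F   # bits 10-6
        fields["funct"] = word & 0x3F          # bits 5-0
    else:                            # LW (35), SW (43), BEQ (4)
        fields["address"] = word & 0xFFFF      # bits 15-0
    return fields


add_fields = decode(0x02324020)   # add $8, $17, $18
lw_fields = decode(0x8E680064)    # lw $8, 100($19), offset 100 is hypothetical
print(add_fields)
print(lw_fields)
```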


The Single-Cycle Datapath - Progressively

FIGURE 3.1 Portion of the datapath needed for fetching instructions and incrementing the program counter. The fetched instruction is used by other parts of the datapath.
[Figure blocks: PC, Instruction memory (Read address, Instruction), Add (PC + 4).]

FIGURE 3.2 The datapath for R-type instructions. The ALU discussed in Chapter 2 can be controlled to provide all the basic ALU functions required for R-type instructions.
[Figure blocks: Registers (Read register 1, Read register 2, Write register, Write data, Read data 1, Read data 2), ALU (zero, result).]

FIGURE 3.3 The datapath for a load or store that does a register access. It is followed by a memory address calculation, then a read or write from memory and a write into the register file if the instruction is a load.

FIGURE 3.4 The datapath for a branch uses an ALU for evaluation of the branch condition and a separate adder for computing the branch target as the sum of the incremented PC and the sign-extended lower 16 bits of the instruction (the branch displacement), shifted left 2 bits. The unit labeled shift left 2 performs the shift by appending 00 (binary) to the bottom of the sign-extended offset field. Since we know that the offset was sign-extended from 16 bits, the shift will throw away only "sign bits". Control logic is used to decide whether the incremented PC or the branch target should replace the PC, based on the Zero output of the ALU.
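The branch target calculation described in the caption can be sketched in a few lines. The function names are made up for illustration, and the PC value 0x1000 is an arbitrary example.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset16):
    """Branch target = (PC + 4) + (sign-extended offset shifted left 2)."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF


forward = branch_target(0x1000, 3)        # skip 3 instructions past PC + 4
backward = branch_target(0x1000, 0xFFFF)  # offset -1: back to the branch itself
print(hex(forward), hex(backward))
```

An offset of -1 lands on the branch instruction itself because the displacement is relative to PC + 4, not to PC.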

FIGURE 3.5 Combining the datapaths for the memory instructions and the R-type instructions. This example shows how a single datapath can be assembled from the pieces. The multiplexors and their connections are highlighted.

FIGURE 3.6 The instruction fetch portion of the datapath from Figure 3.1 is appended to the datapath of Figure 3.5 that handles memory and ALU instructions. The addition is highlighted. The result is a datapath that supports many operations of the MIPS instruction set -- branches and jumps are the major missing pieces.

FIGURE 3.7 The simple datapath for the MIPS architecture combines the elements required by the different instruction classes. This datapath can execute the basic instructions (load/store word, ALU operations and branches) in a single clock cycle. The additions to Figure 3.6, which are needed to implement branches, are highlighted.


ALU Functions

● We implement a subset of the MIPS ISA -- 8 instructions, which cover all important types.

– ALUop is a new control signal we create to decide what ALU operation is required.

– Ld, Str and BEQ use ALUop to choose Add or Subtract; others use the Func code from the MIPS instruction.

    ALU control input   Operation
    000                 AND
    001                 OR
    010                 Add
    110                 Subtract
    111                 Set on less than

    OP      ALUop   Func     ALU action   ALU control
    Load    00      xxxxxx   Add          010
    Store   00      xxxxxx   Add          010
    BEQ     01      xxxxxx   Sub          110
    Add     10      100000   Add          010
    Sub     10      100010   Sub          110
    AND     10      100100   AND          000
    OR      10      100101   OR           001
    Slt     10      101010   Sub/shift    111

Column sources: OP is the MIPS instruction op-code; ALUop is the new logic signal; Func comes from the MIPS instruction; ALU control goes to the ALU control lines.
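To make the five ALU control codes concrete, here is a minimal behavioural sketch of a 32-bit ALU. It is an illustration only: the function names are made up, and set-on-less-than is modelled as a signed comparison rather than the subtract-and-shift hardware the table hints at.

```python
MASK = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit value as a signed two's-complement number."""
    return x - (1 << 32) if x & 0x80000000 else x


def alu(control, a, b):
    """Return (result, zero) for the 3-bit ALU control input."""
    if control == 0b000:
        result = a & b                                   # AND
    elif control == 0b001:
        result = a | b                                   # OR
    elif control == 0b010:
        result = (a + b) & MASK                          # Add
    elif control == 0b110:
        result = (a - b) & MASK                          # Subtract
    elif control == 0b111:
        result = 1 if to_signed(a) < to_signed(b) else 0  # Set on less than
    else:
        raise ValueError(f"unsupported ALU control {control:03b}")
    return result, int(result == 0)


print(alu(0b010, 5, 7))           # add
print(alu(0b110, 7, 7))           # subtract, sets the zero output
print(alu(0b111, 0xFFFFFFFF, 1))  # -1 < 1
```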


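Logic for ALU Control

● We must generate the logic signals A2, A1, A0 for controlling the ALU. ALUop will be derived later.

    Row   OP      ALUop   Func     A2   A1   A0
    0     Load    00      xxxxxx   0    1    0
    1     Store   00      xxxxxx   0    1    0
    2     BEQ     01      xxxxxx   1    1    0
    3     Add     10      100000   0    1    0
    4     Sub     10      100010   1    1    0
    5     AND     10      100100   0    0    0
    6     OR      10      100101   0    0    1
    7     Slt     10      101010   1    1    1

Inputs: ALU1, ALU0 are the two ALUop bits; F5 ... F0 are the Func bits. Outputs: the ALU control lines.

$$A_0 = (ALU_1 \cdot \overline{ALU_0} \cdot F_5 \cdot \overline{F_4} \cdot \overline{F_3} \cdot F_2 \cdot \overline{F_1} \cdot F_0) + (ALU_1 \cdot \overline{ALU_0} \cdot F_5 \cdot \overline{F_4} \cdot F_3 \cdot \overline{F_2} \cdot F_1 \cdot \overline{F_0})$$

A1 = An exercise.

$$A_2 = (\overline{ALU_1} \cdot ALU_0) + (ALU_1 \cdot \overline{ALU_0} \cdot F_5 \cdot \overline{F_4} \cdot \overline{F_3} \cdot \overline{F_2} \cdot F_1 \cdot \overline{F_0}) + (ALU_1 \cdot \overline{ALU_0} \cdot F_5 \cdot \overline{F_4} \cdot F_3 \cdot \overline{F_2} \cdot F_1 \cdot \overline{F_0})$$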
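The A0 and A2 equations can be checked against the table by brute force. The sketch below is only a check under the assumption that each eight-literal product term is true for exactly one Func pattern (OR, Sub or Slt), so it is written as an equality test; A1 is deliberately left out because the slide leaves it as an exercise. All names are made up for illustration.

```python
def alu_control_bits(aluop, funct):
    """Return (A2, A0) from the 2-bit ALUop and the 6-bit Func field."""
    alu1 = (aluop >> 1) & 1
    alu0 = aluop & 1
    r_type = alu1 == 1 and alu0 == 0   # ALUop = 10
    beq = alu1 == 0 and alu0 == 1      # ALUop = 01
    is_sub = funct == 0b100010         # one full product term over F5..F0
    is_or = funct == 0b100101
    is_slt = funct == 0b101010
    a0 = int(r_type and (is_or or is_slt))
    a2 = int(beq or (r_type and (is_sub or is_slt)))
    return a2, a0


# (name, ALUop, Func or None for don't care, expected A2, expected A0)
ROWS = [
    ("Load", 0b00, None, 0, 0),
    ("Store", 0b00, None, 0, 0),
    ("BEQ", 0b01, None, 1, 0),
    ("Add", 0b10, 0b100000, 0, 0),
    ("Sub", 0b10, 0b100010, 1, 0),
    ("AND", 0b10, 0b100100, 0, 0),
    ("OR", 0b10, 0b100101, 0, 1),
    ("Slt", 0b10, 0b101010, 1, 1),
]

for name, aluop, funct, want_a2, want_a0 in ROWS:
    # A don't-care Func must give the same answer for all 64 values.
    for f in (range(64) if funct is None else [funct]):
        assert alu_control_bits(aluop, f) == (want_a2, want_a0), name
print("A2 and A0 match all eight table rows")
```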


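Implementation of ALU Control

[Figure: gate-level implementation of A2 and A0 from the inputs F5-F0, ALU1 and ALU0; the product terms correspond to table rows 2, 4, 6 and 7.]

A1 is left as an exercise.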


The Single-Cycle Datapath without Control

FIGURE 3.8 The datapath of Figure 3.7 with all necessary multiplexors and all control lines identified. The ALU control block has also been added.
[Figure labels: Instruction [25-21], [20-16] and [15-11] feed the register file; Instruction [15-0] feeds the sign extend unit; Instruction [5-0] feeds ALU control. Control lines: MemWrite, ALUSrc, RegDst, RegWrite, PCSrc, MemtoReg, ALUOp.]


Control Line Functions

    Signal name   Effect when "off"                          Effect when "on"
    -----------   ----------------------------------------   ----------------------------------------
    MemWrite      None                                       Contents of memory at the write address
                                                             are replaced by the value on the write
                                                             data input.
    ALUSrc        The second ALU operand comes from the      The second ALU operand is the
                  second register file output.               sign-extended lower 16 bits of the
                                                             instruction.
    RegDst        The destination reg number for the         The destination reg number for the
                  Write register comes from the rt field.    Write register comes from the rd field.
    RegWrite      None                                       The register whose number is on the
                                                             Write register input is written with
                                                             the value on the write data input.
    PCSrc         The PC is replaced by the output of        The PC is replaced by the output of a
                  the adder: PC+4.                           special adder that computes the branch
                                                             target address.
    MemtoReg      The value to the register write data       The value to the register write data
                  input comes from the ALU.                  input comes from data memory.


The Single-Cycle Datapath

[Figure: the complete single-cycle datapath with the Control unit. Control reads Instruction [31-26] and drives RegDst, Branch, MemtoReg, ALUop, MemWrite, ALUSrc and RegWrite; ALU control reads Instruction [5-0]; PCSrc selects the next PC.]


MIPS op-codes for the 3 instruction classes in our datapath

                  op-code       op-code in binary
    Instruction   in decimal    op5  op4  op3  op2  op1  op0
    R-type        0             0    0    0    0    0    0
    LW            35            1    0    0    0    1    1
    SW            43            1    0    1    0    1    1
    BEQ           4             0    0    0    1    0    0


                          R-type   LW   SW   BEQ
    Inputs    Op5         0        1    1    0
              Op4         0        0    0    0
              Op3         0        0    1    0
              Op2         0        0    0    1
              Op1         0        1    1    0
              Op0         0        1    1    0

    Outputs   RegDst      1        0    x    x
              ALUsrc      0        1    1    0
              MemtoReg    0        1    x    x
              RegWrite    1        1    0    0
              MemWrite    0        0    1    0
              Branch      0        0    0    1
              ALUOp1      1        0    0    0
              ALUOp0      0        0    0    1

For the function of 6 inputs defined by the op-code, the table shows exactly those truth table rows that are true for each of the 8 control lines listed.
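The truth table maps directly onto a small decoder: one term per instruction class, and each output is the OR of the classes for which it is 1. The sketch below follows that structure; the function name main_control is made up, and the don't-care entries (x) are arbitrarily resolved to 0, which is an assumption and not something the table requires.

```python
def main_control(op):
    """Decode the 6-bit op-code into the control lines of the table."""
    r_type = op == 0b000000   # 0
    lw = op == 0b100011       # 35
    sw = op == 0b101011       # 43
    beq = op == 0b000100      # 4
    if not (r_type or lw or sw or beq):
        raise ValueError(f"op-code {op} is not handled by this datapath")
    return {
        "RegDst": int(r_type),        # x for SW and BEQ, resolved to 0 here
        "ALUSrc": int(lw or sw),
        "MemtoReg": int(lw),          # x for SW and BEQ, resolved to 0 here
        "RegWrite": int(r_type or lw),
        "MemWrite": int(sw),
        "Branch": int(beq),
        "ALUOp1": int(r_type),
        "ALUOp0": int(beq),
    }


for name, op in [("R-type", 0), ("LW", 35), ("SW", 43), ("BEQ", 4)]:
    print(name, main_control(op))
```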


Implementation of Control Logic

[Figure: control logic with inputs Op5-Op0; one product term each for R-type, LW, SW and BEQ; outputs RegDst, ALUsrc, MemtoReg, RegWrite, MemWrite, Branch, ALUop1, ALUop0.]


Logic Flow -- R-type Instruction

Add $x, $y, $z    (rd = $x, rs = $y, rt = $z)

1) The instruction is fetched from memory; PC is incremented.

2) Source operand registers $y and $z are read from the register file; control lines determine how the register file is read.

3) ALU operates on the source operands; control lines set by control unit + function code.

4) ALU result is written to the register file using bits 15-11 of the instruction to select the correct destination register $x.
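The four steps for an R-type add can be traced in software. This is a minimal sketch under simplifying assumptions: the register file is a Python list, only the add function is modelled, and the name execute_add and the start address 0x400000 are made up for illustration.

```python
def execute_add(word, regs, pc):
    """Run one R-type add through the steps of the logic flow."""
    next_pc = pc + 4                   # 1) fetch done by caller; PC incremented
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    rd = (word >> 11) & 0x1F           # bits 15-11 select the destination
    a, b = regs[rs], regs[rt]          # 2) read both source registers
    result = (a + b) & 0xFFFFFFFF      # 3) ALU adds the operands
    regs[rd] = result                  # 4) write the result back
    return next_pc


regs = [0] * 32
regs[17], regs[18] = 5, 7
new_pc = execute_add(0x02324020, regs, 0x400000)   # add $8, $17, $18
print(regs[8], hex(new_pc))
```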


R-type Instructions Flow

[Figure: the single-cycle datapath with the paths used by an R-type instruction.]
Control settings: RegDst (1), Branch (0), MemtoReg (0), ALUop (10) - R-TYPE, MemWrite (0), ALUSrc (0), RegWrite (1).


Logic Flow -- Ld/Str Instruction

LW $x, base ($y)    (rt = $x, rs = $y)

1) The instruction is fetched from memory; PC is incremented.

2) The source register value $y is read from the register file; control lines determine how the register file is read.

3) ALU computes the sum of $y and the base value from instruction bits 15-0, sign-extended to 32 bits.

4) Result of address computation goes to data memory.

5) Data from data memory is written to the register file; bits 20-16 of the instruction specify the destination register number $x.
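The load flow, and the matching store, can be traced the same way. In this sketch the data memory is a dict keyed by byte address, and the function name, the offset 100 and the memory contents are all hypothetical values chosen for illustration.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute_mem(word, regs, data_memory, pc):
    """Run one lw (op 35) or sw (op 43) through the logic flow."""
    op = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    offset = sign_extend16(word)                   # bits 15-0, sign-extended
    address = (regs[rs] + offset) & 0xFFFFFFFF     # ALU adds base register and offset
    if op == 35:
        regs[rt] = data_memory[address]            # load: memory -> register rt
    elif op == 43:
        data_memory[address] = regs[rt]            # store: register rt -> memory
    else:
        raise ValueError("not a load or store")
    return pc + 4


regs = [0] * 32
regs[19] = 0x1000
memory = {0x1064: 42}
execute_mem(0x8E680064, regs, memory, 0)   # lw $8, 100($19)
regs[19] = 0x2000
execute_mem(0xAE680064, regs, memory, 4)   # sw $8, 100($19)
print(regs[8], memory)
```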


Ld/Str Instructions Flow

[Figure: the single-cycle datapath with the paths used by a load or store instruction.]
Control settings: RegDst (0), Branch (0), MemtoReg (1), ALUop (00) - ADD, MemWrite (0 - Load, 1 - Store), ALUSrc (1), RegWrite (1 - Load, 0 - Store).


Logic Flow -- Branch Instruction

BEQ $x, $y, offset    (rs = $x, rt = $y)

1) The instruction is fetched from memory; PC is incremented.

2) Source operand registers $x and $y are read from the register file; control lines determine how the register file is read.

3) ALU subtracts the source operands to determine if the result is zero.

4) PC+4 is added to bits 15-0 of the instruction (the offset), sign-extended to 32 bits and shifted left 2 places. The result is the branch target address if the branch is taken.

5) Zero condition from ALU determines which next address to write into PC.
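The branch flow reduces to a subtract, a target calculation and a choice between two next-PC values. The sketch below assumes a list-based register file; the function names and the register numbers in the example are made up for illustration.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute_beq(word, regs, pc):
    """Return the next PC for a beq instruction located at address pc."""
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    zero = ((regs[rs] - regs[rt]) & 0xFFFFFFFF) == 0            # ALU subtracts
    target = (pc + 4 + (sign_extend16(word) << 2)) & 0xFFFFFFFF  # separate adder
    return target if zero else pc + 4                            # Zero selects next PC


beq_word = (4 << 26) | (1 << 21) | (2 << 16) | 3   # beq $1, $2, 3
regs = [0] * 32
regs[1], regs[2] = 9, 9
taken = execute_beq(beq_word, regs, 0x1000)
regs[2] = 10
not_taken = execute_beq(beq_word, regs, 0x1000)
print(hex(taken), hex(not_taken))
```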


Branch Instruction Flow

[Figure: the single-cycle datapath with the paths used by a branch instruction.]
Control settings: RegDst (x), Branch (1), MemtoReg (x), ALUop (01) - SUB, MemWrite (0), ALUSrc (0), RegWrite (0).


The Problems with a Single Cycle Datapath

1. Need for multiple ALUs, adders, etc.

2. Memory split (to data and instructions).

3. CPU Time = IC x CPI x Clock Cycle Time
● CPI = 1.
● IC is the same for all implementations of MIPS ISA.
● Everything depends on clock cycle time.

    Instruction type   stage1    stage2   stage3   stage4   stage5   total (ns)
    R-type             I-fetch   regs     ALU      regs              38
    Load               I-fetch   regs     ALU      mem      regs     48
    Store              I-fetch   regs     ALU      mem               39
    Branch             I-fetch   regs     ALU                        29
    Jump               I-fetch   regs     ALU                        29

Table assumes ALU, adders - 10ns; memory - 10ns; register file - 9ns.
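The totals in the delay table follow from summing the unit delays along each instruction's path. The sketch below recomputes them from the stated assumptions (ALU and adders 10 ns, memory 10 ns, register file 9 ns); the dictionary names are made up for illustration.

```python
# Unit delays in nanoseconds, as assumed by the table.
DELAY_NS = {"I-fetch": 10, "regs": 9, "ALU": 10, "mem": 10}

# Stages used by each instruction type, in order.
STAGES = {
    "R-type": ["I-fetch", "regs", "ALU", "regs"],
    "Load": ["I-fetch", "regs", "ALU", "mem", "regs"],
    "Store": ["I-fetch", "regs", "ALU", "mem"],
    "Branch": ["I-fetch", "regs", "ALU"],
    "Jump": ["I-fetch", "regs", "ALU"],
}

totals = {name: sum(DELAY_NS[s] for s in stages) for name, stages in STAGES.items()}
cycle_time = max(totals.values())   # the clock must fit the slowest instruction

for name, ns in totals.items():
    print(f"{name:7s} {ns} ns")
print("single-cycle clock period:", cycle_time, "ns")
```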

Example

Assume the following dynamic instruction counts for a typical program (gcc compiler):

    Instruction type   %
    Loads              22
    Stores             11
    R-type             49
    Branch             16
    Jump               2

Using the table of instruction delays from the previous slide, calculate the actual average time per instruction for gcc.

(.49 x 38ns) + (.22 x 48ns) + (.11 x 39ns) + (.16 x 29ns) + (.02 x 29ns) = 38.69ns

For a single cycle implementation we must use the longest period = 48ns; 48/38.69 = approx. 1.24, i.e. a slowdown of about 24%. Floating point or other longer instructions would make the slowdown much worse.
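The weighted average and the slowdown can be reproduced in a few lines. This is only a check of the arithmetic above, using the instruction mix and per-type delays given on the slides; the variable names are made up for illustration.

```python
# Dynamic instruction mix for gcc (fractions) and per-type delay (ns).
mix = {"R-type": 0.49, "Load": 0.22, "Store": 0.11, "Branch": 0.16, "Jump": 0.02}
latency_ns = {"R-type": 38, "Load": 48, "Store": 39, "Branch": 29, "Jump": 29}

average_ns = sum(mix[k] * latency_ns[k] for k in mix)   # ideal variable-length clock
single_cycle_ns = max(latency_ns.values())              # clock set by the slowest type
slowdown = single_cycle_ns / average_ns

print(f"average time per instruction: {average_ns:.2f} ns")
print(f"single-cycle clock period:    {single_cycle_ns} ns")
print(f"slowdown factor:              {slowdown:.2f}")
```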
