cs141-l4-1tarun soni, summer’03 single cycle cpu previously: built and alu. today: actually…
DESCRIPTION
CS141-L4-3Tarun Soni, Summer’03 CPU: Building blocks Adder MUX ALU 32 A B Sum Carry 32 A B Result OP 32 A B Y Select Adder MUX ALU CarryInTRANSCRIPT
![Page 1: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/1.jpg)
CS141-L4-1 Tarun Soni, Summer’03
Single Cycle CPU
Previously: built and ALU.Today: Actually build a CPU
Questions on CS140 ? Computer Arithmetic ?
•Attend office hours with TAs or me.•Do the exercises in the text.
![Page 2: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/2.jpg)
CS141-L4-2 Tarun Soni, Summer’03
Instruction Set Architectures Performance issues 2s complement, Addition, Subtraction Multiplication, Division, Floating Point numbers
The Story so far:
Basically ISA & ALU stuff
![Page 3: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/3.jpg)
CS141-L4-3 Tarun Soni, Summer’03
CPU: Building blocks
• Adder
• MUX
• ALU
32
32
A
B32
Sum
Carry
32
32
A
B32
Result
OP
32A
B32
Y32
Select
Adder
MU
XA
LU
CarryIn
![Page 4: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/4.jpg)
CS141-L4-4 Tarun Soni, Summer’03
CPU: Building blocks
OP
32A
B32
Y32
Select
MU
X
3232A[31..0]
B[31..0]32
Sum[31..0]
Carry
Adder
CarryIn
32A[63..32]
B[63..32]32
Sum[63..32]
Carry
Adder
CarryIn
32
• Building a 64-bit adder from 2x32-bit adders
![Page 5: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/5.jpg)
CS141-L4-5 Tarun Soni, Summer’03
CPU: Building blocks
32A
B32
Sum[63..32]32
Select
MU
X
32
32
A[31..0]
B[31..0]32
Sum[31..0]
Carry
Adder
CarryIn
32
32
A[63..32]
B[63..32]32
S
Cout
Adder
Cin=0
32
32
A[63..32]
B[63..32]32
S
CoutA
dder
Cin=11
A
B1
Cout1
Select
MU
X
• Silicon is cheap – sort-of
![Page 6: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/6.jpg)
CS141-L4-6 Tarun Soni, Summer’03
CPU
Single Cycle CPU
![Page 7: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/7.jpg)
CS141-L4-7 Tarun Soni, Summer’03
CPU
The Big Picture: Where are We Now?
• The Five Classic Components of a Computer
• Datapath Design, then Control Design
Control
Datapath
Memory
ProcessorInput
Output
![Page 8: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/8.jpg)
CS141-L4-8 Tarun Soni, Summer’03
CPU: The big picture
InstructionFetch
InstructionDecode
OperandFetch
Execute
ResultStore
NextInstruction ° Design hardware for each of these steps!!!
Execute anentire instruction
Fetc
h
Dec
ode
Fetc
h
Exec
ute
Stor
e
Nex
t
![Page 9: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/9.jpg)
CS141-L4-9 Tarun Soni, Summer’03
CPU: Clocking
Clk
Don’t CareSetup Hold
.
.
.
.
.
.
.
.
.
.
.
.
Setup Hold
• All storage elements are clocked by the same clock edge
![Page 10: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/10.jpg)
CS141-L4-10 Tarun Soni, Summer’03
CPU
The Big Picture: The Performance Perspective
• Execution Time = Insts * CPI * Cycle Time• Processor design (datapath and control) will determine:
– Clock cycle time– Clock cycles per instruction
• Starting today:– Single cycle processor:
• Advantage: One clock cycle per instruction• Disadvantage: long cycle time
Execute anentire instruction
![Page 11: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/11.jpg)
CS141-L4-11 Tarun Soni, Summer’03
CPU
• We're ready to look at an implementation of the MIPS• Simplified to contain only:
– memory-reference instructions: lw, sw – arithmetic-logical instructions: add, sub, and, or, slt– control flow instructions: beq
• Generic Implementation:– use the program counter (PC) to supply instruction address– get the instruction from memory– read registers– use the instruction to decide exactly what to do
• All instructions use the ALU after reading the registersmemory-reference? arithmetic? control flow?
CPI
Inst. Count Cycle Time
![Page 12: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/12.jpg)
CS141-L4-12 Tarun Soni, Summer’03
CPU
Review: The MIPS Instruction Formats
op target address02631
6 bits 26 bits
op rs rt rd shamt funct061116212631
6 bits 6 bits5 bits5 bits5 bits5 bits
op rs rt immediate016212631
6 bits 16 bits5 bits5 bits
°The different fields are:•op: operation of the instruction•rs, rt, rd: the source and destination register specifiers•shamt: shift amount•funct: selects the variant of the operation in the “op” field•address / immediate: address offset or immediate value•target address: target address of the jump instruction
![Page 13: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/13.jpg)
CS141-L4-13 Tarun Soni, Summer’03
CPU
• R-type– add rd, rs, rt– sub, and, or, slt
• LOAD and STORE– lw rt, rs, imm16– sw rt, rs, imm16
• BRANCH:– beq rs, rt, imm16
op rs rt rd shamt funct061116212631
6 bits 6 bits5 bits5 bits5 bits5 bits
op rs rt immediate016212631
6 bits 16 bits5 bits5 bits
op rs rt displacement016212631
6 bits 16 bits5 bits5 bits
![Page 14: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/14.jpg)
CS141-L4-14 Tarun Soni, Summer’03
CPU
• Memory– instruction & data
• Registers (32 x 32)– read RS– read RT– Write RT or RD
• PC• Extender• Add and Sub register or extended immediate• Add 4 or extended immediate to PC
Requirements to implement the ISA
![Page 15: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/15.jpg)
CS141-L4-15 Tarun Soni, Summer’03
CPU
• Combinational Elements• Storage Elements
– Clocking methodology
StateElement
clk
A
B
C = f(A,B,state){State[n] = f(A,B,state[n-1])}
CombinationalLogic
A
BC = f(A,B)
![Page 16: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/16.jpg)
CS141-L4-16 Tarun Soni, Summer’03
CPU: Storage unit
• The set-reset latch– output depends on present inputs and also on past inputs
![Page 17: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/17.jpg)
CS141-L4-17 Tarun Soni, Summer’03
CPU: D-flip flop
• Two inputs:– the data value to be stored (D)– the clock signal (C) indicating when to read & store D
• Two outputs:– the value of the internal state (Q) and it's complement
Q
C
D
_Q
D
C
Q
• Output changes only on the clock edge
_Q
Q
_Q
Dlatch
D
C
Dlatch
DD
C
C
D
C
Q
![Page 18: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/18.jpg)
CS141-L4-18 Tarun Soni, Summer’03
CPU: Clocking Methodology
• An edge triggered methodology• Typical execution:
– read contents of some state elements, – send values through some combinational logic– write results to one or more state elements
Clock cycle
Stateelement
1Combinational logic
Stateelement
2
![Page 19: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/19.jpg)
CS141-L4-19 Tarun Soni, Summer’03
CPU: Storage block
• Register– Similar to the D Flip Flop except
• N-bit input and output• Write Enable input
– Write Enable:• 0: Data Out will not change• 1: Data Out will become Data In (on the clock edge)
Clk
Data In
Write Enable
N N
Data Out
![Page 20: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/20.jpg)
CS141-L4-20 Tarun Soni, Summer’03
CPU: Register Files
• Register File consists of (32) registers:– Two 32-bit output buses:– One 32-bit input bus: busW
• Register is selected by:– RA selects the register to put on busA– RB selects the register to put on busB– RW selects the register to be written
via busW when Write Enable is 1• Clock input (CLK)
• Factor only during write-enable=1;• Otherwise, this unit acts just like combinational logic.
Clk
busW
Write Enable
3232
busA
32busB
5 5 5RW RA RB
32 32-bitRegisters
![Page 21: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/21.jpg)
CS141-L4-21 Tarun Soni, Summer’03
CPU: Register Files
Mux
Register 0Register 1
Register n – 1Register n
Mux
Read data 1
Read data 2
Read registernumber 1
Read registernumber 2
Read registernumber 1 Read
data 1
Readdata 2
Read registernumber 2
Register fileWriteregister
Writedata Write
n-to-1decoder
Register 0
Register 1
Register n – 1C
C
D
DRegister n
C
C
D
D
Register number
Write
Register data
01
n – 1
n
Built using D-flip flopsStill use the real clock (not shown here) to do the actual write
![Page 22: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/22.jpg)
CS141-L4-22 Tarun Soni, Summer’03
CPU: Memory
• Memory (idealized)– One input bus: Data In– One output bus: Data Out
• Memory word is selected by:– Address selects the word to put on Data Out– Write Enable = 1: address selects the memory
word to be written via the Data In bus• Clock input (CLK)
– The CLK input is a factor ONLY during write operation– During read operation, behaves as a combinational logic block:
• Address valid => Data Out valid after “access time.”
Clk
Data In
Write Enable
32 32DataOut
Address
![Page 23: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/23.jpg)
CS141-L4-23 Tarun Soni, Summer’03
CPU: RTL
• is a mechanism for describing the movement and manipulation of data between storage elements:
R[3] <- R[5] + R[7]PC <- PC + 4 + R[5]R[rd] <- R[rs] + R[rt]R[rt] <- Mem[R[rs] + immed]
Register Transfer Language (RTL)
![Page 24: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/24.jpg)
CS141-L4-24 Tarun Soni, Summer’03
CPU: More building blocks
PC
Instructionmemory
Instructionaddress
Instruction
a. Instruction memory b. Program counter
Add Sum
c. Adder
ALU control
RegWrite
RegistersWriteregister
Readdata 1
Readdata 2
Readregister 1
Readregister 2
Writedata
ALUresult
ALU
Data
Data
Registernumbers
a. Registers b. ALU
Zero5
5
5 3
16 32Sign
extend
b. Sign-extension unit
MemRead
MemWrite
Datamemory
Writedata
Readdata
a. Data memory unit
Address
![Page 25: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/25.jpg)
CS141-L4-25 Tarun Soni, Summer’03
CPU: The big picture
InstructionFetch
InstructionDecode
OperandFetch
Execute
ResultStore
NextInstruction ° Design hardware for each of these steps!!!
Execute anentire instruction
Fetc
h
Dec
ode
Fetc
h
Exec
ute
Stor
e
Nex
t
![Page 26: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/26.jpg)
CS141-L4-26 Tarun Soni, Summer’03
CPU: Instruction Fetch
• RTL version of the instruction fetch step: • Fetch the Instruction: mem[PC]– Update the program counter:
• Sequential Code: PC <- PC + 4 • Branch and Jump: PC <- “something else”
32
Instruction WordAddress
InstructionMemory
PCClk
Next AddressLogic
![Page 27: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/27.jpg)
CS141-L4-27 Tarun Soni, Summer’03
CPU: Binary arithmetic for PC
• In theory, the PC is a 32-bit byte address into the instruction memory:– Sequential operation: PC<31:0> = PC<31:0> + 4– Branch operation: PC<31:0> = PC<31:0> + 4 + SignExt[Imm16] * 4
• The magic number “4” always comes up because:– The 32-bit PC is a byte address– And all our instructions are 4 bytes (32 bits) long
• In other words:– The 2 LSBs of the 32-bit PC are always zeros– There is no reason to have hardware to keep the 2 LSBs
• In practice, we can simplify the hardware by using a 30-bit PC<31:2>:– Sequential operation: PC<31:2> = PC<31:2> + 1– Branch operation: PC<31:2> = PC<31:2> + 1 + SignExt[Imm16]– In either case: Instruction Memory Address = PC<31:2> concat “00”
![Page 28: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/28.jpg)
CS141-L4-28 Tarun Soni, Summer’03
CPU: Instruction Fetch unit
• The common RTL operations– Fetch the Instruction: inst <- mem[PC]– Update the program counter:
• Sequential Code: PC <- PC + 4 • Branch and Jump PC <- “something else”
![Page 29: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/29.jpg)
CS141-L4-29 Tarun Soni, Summer’03
CPU: Register-Register Operations (Add, Subtract etc.)
• R[rd] <- R[rs] op R[rt] Example: addU rd, rs, rt– Ra, Rb, and Rw come from instruction’s rs, rt, and rd fields– ALUctr and RegWr: control logic after decoding the instruction
32Result
ALUctr
Clk
busW
RegWr
3232
busA
32busB
5 5 5
Rw Ra Rb32 32-bitRegisters
Rs RtRd
ALU
op rs rt rd shamt funct061116212631
6 bits 6 bits5 bits5 bits5 bits5 bits
° Worry about instruction decode to generate ALUctr and RegWr later.
![Page 30: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/30.jpg)
CS141-L4-30 Tarun Soni, Summer’03
CPU: Register - Register Timing
32Result
ALUctr
Clk
busW
RegWr
3232
busA
32busB
5 5 5
Rw Ra Rb32 32-bitRegisters
Rs RtRd
AL
U
Clk
PC
Rs, Rt, Rd,Op, Func
Clk-to-Q
ALUctr
Instruction Memory Access Time
Old Value New Value
RegWr Old Value New Value
Delay through Control Logic
busA, B
Register File Access TimeOld Value New Value
busWALU Delay
Old Value New Value
Old Value New Value
New ValueOld Value
Register WriteOccurs Here
![Page 31: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/31.jpg)
CS141-L4-31 Tarun Soni, Summer’03
CPU: Logical Immediate Op.• R[rt] <- R[rs] op ZeroExt[imm16] ]
32
Result
ALUctr
Clk
busW
RegWr
3232
busA
32busB
5 5 5
Rw Ra Rb32 32-bitRegisters
Rs
RtRdRegDst
ZeroExt
Mux
Mux
3216imm16
ALUSrc
ALU
11
op rs rt immediate016212631
6 bits 16 bits5 bits5 bits rd?
immediate016 1531
16 bits16 bits0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Handle Rt as destination
HandleImmediate asoperand
![Page 32: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/32.jpg)
CS141-L4-32 Tarun Soni, Summer’03
CPU: Load Operations
• R[rt] <- Mem[R[rs] + SignExt[imm16]] Example: lw rt, rs, imm16
11
op rs rt immediate016212631
6 bits 16 bits5 bits5 bits rd
32
ALUctr
Clk
busW
RegWr
3232
busA
32busB
5 5 5
Rw Ra Rb32 32-bitRegisters
Rs
RtRdRegDst
Extender
Mux
Mux
3216
imm16
ALUSrc
ExtOp
Clk
Data InWrEn
32
Adr
DataMemory
32
ALU
MemWr Mux
W_Src
Need dataMemory!
Reg-Write could be from result or data memory
![Page 33: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/33.jpg)
CS141-L4-33 Tarun Soni, Summer’03
CPU: Store Operations
• Mem[ R[rs] + SignExt[imm16] <- R[rt] ] Example: sw rt, rs, imm16
32
ALUctr
Clk
busW
RegWr
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst
Extender
Mux
Mux
3216imm16
ALUSrcExtOp
Clk
Data InWrEn
32Adr
DataMemory
MemWr
ALU
op rs rt immediate016212631
6 bits 16 bits5 bits5 bits
32
Mux
W_SrcReg can
write to Data Memory
![Page 34: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/34.jpg)
CS141-L4-34 Tarun Soni, Summer’03
CPU: Branching
• beq rs, rt, imm16
– mem[PC] Fetch the instruction from memory
– Equal <- R[rs] == R[rt] Calculate the branch condition
– if (COND eq 0) Calculate the next instruction’s address• PC <- PC + 4 + ( SignExt(imm16) x 4 )
– else• PC <- PC + 4
op rs rt immediate016212631
6 bits 16 bits5 bits5 bits
![Page 35: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/35.jpg)
CS141-L4-35 Tarun Soni, Summer’03
CPU: Datapath for Branching
• beq rs, rt, imm16 Datapath generates condition (equal)
op rs rt immediate016212631
6 bits 16 bits5 bits5 bits
32
imm16
PCClk
00
Adder
Mux
Adder
4nPC_sel
Clk
busW
RegWr
32
busA
32busB
5 5 5
Rw Ra Rb32 32-bitRegisters
Rs Rt
Equa
l?
Cond
PC Ext
Inst Address
Calculate (PC+4) as well as (imm16+PC+4) and choose one
Calculate the “condition” part of the branch op.
![Page 36: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/36.jpg)
CS141-L4-36 Tarun Soni, Summer’03
CPU: The Aggregate Datapathim
m16
32
ALUctr
Clk
busW
RegWr
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst
Extender
Mux
3216imm16
ALUSrcExtOp
Mux
MemtoReg
Clk
Data InWrEn32 Adr
DataMemory
MemWrA
LU
Equal
Instruction<31:0>
0
1
0
1
01
<21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRtRs
=
Adder
Adder
PC
Clk
00Mux
4
nPC_sel
PC E
xt
Adr
InstMemory
Still need to worry about Instruction Decode
![Page 37: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/37.jpg)
CS141-L4-37 Tarun Soni, Summer’03
CPU: Datapath: High-level view• Register file and ideal memory:
– The CLK input is a factor ONLY during write operation– During read operation, behave as combinational logic:
• Address valid => Output valid after “access time.”
Critical Path (Load Operation) = PC’s Clk-to-Q + Instruction Memory’s Access Time + Register File’s Access Time + ALU to Perform a 32-bit Add + Data Memory Access Time + Setup Time for Register File Write + Clock Skew
Clk
5
Rw Ra Rb32 32-bitRegisters
RdA
LU
Clk
Data In
DataAddress Ideal
DataMemory
Instruction
InstructionAddress
IdealInstruction
Memory
Clk
PC
5Rs
5Rt
16Imm
32
323232
A
B
Nex
t Add
ress
![Page 38: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/38.jpg)
CS141-L4-38 Tarun Soni, Summer’03
CPU: Control Signals
ALUctrRegDst ALUSrcExtOp MemtoRegMemWr Equal
Instruction<31:0>
<21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
nPC_sel
Adr
InstMemory
DATA PATH
Control
Op
<21:25>
Fun
RegWr
![Page 39: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/39.jpg)
CS141-L4-39 Tarun Soni, Summer’03
CPU: Control Signals: Meaning
Adr
InstMemory
• Rs, Rt, Rd and Imed16 hardwired into datapath• nPC_sel: 0 => PC <– PC + 4; 1 => PC <– PC + 4 + SignExt(Im16) || 00
Adder
Adder
PC
Clk
00Mux
4
nPC_sel
PC Extim
m16
![Page 40: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/40.jpg)
CS141-L4-40 Tarun Soni, Summer’03
CPU: Control Signals: Meaning• ExtOp: “zero”, “sign”• ALUsrc: 0 => regB; 1 =>
immed• ALUctr: “add”, “sub”, “or”
32
ALUctr
Clk
busW
RegWr
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst
Extender
Mux
3216imm16
ALUSrcExtOp
Mux
MemtoReg
Clk
Data InWrEn32 Adr
DataMemory
MemWr
ALU
Equal
0
1
0
1
01
° MemWr: write memory° MemtoReg: 1 => Mem° RegDst: 0 => “rt”; 1 =>
“rd”° RegWr: write dest register
=
![Page 41: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/41.jpg)
CS141-L4-41 Tarun Soni, Summer’03
CPU: Control Signals for various operations
inst Register Transfer
ADD R[rd] <– R[rs] + R[rt]; PC <– PC + 4
ALUsrc = RegB, ALUctr = “add”, RegDst = rd, RegWr, nPC_sel = “+4”
SUB R[rd] <– R[rs] – R[rt]; PC <– PC + 4
ALUsrc = RegB, ALUctr = “sub”, RegDst = rd, RegWr, nPC_sel = “+4”
ORi R[rt] <– R[rs] + zero_ext(Imm16); PC <– PC + 4
ALUsrc = Im, Extop = “Z”, ALUctr = “or”, RegDst = rt, RegWr, nPC_sel = “+4”
LOAD R[rt] <– MEM[ R[rs] + sign_ext(Imm16)]; PC <– PC + 4
ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemtoReg, RegDst = rt, RegWr, nPC_sel = “+4”
STORE MEM[ R[rs] + sign_ext(Imm16)] <– R[rs]; PC <– PC + 4
ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemWr, nPC_sel = “+4”
BEQ if ( R[rs] == R[rt] ) then PC <– PC + sign_ext(Imm16)] || 00 else PC <– PC + 4
nPC_sel = EQUAL, ALUctr = “sub”
![Page 42: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/42.jpg)
CS141-L4-42 Tarun Soni, Summer’03
CPU: Control Signals: Logic Design
• nPC_sel <= if (OP == BEQ) then EQUAL else 0• ALUsrc <= if (OP == “000000”) then “regB” else “immed”• ALUctr <= if (OP == “000000”) then funct
elseif (OP == ORi) then “OR”elseif (OP == BEQ) then “sub”
else “add”• ExtOp <= _____________• MemWr <= _____________• MemtoReg <= _____________• RegWr: <=_____________• RegDst: <= _____________
![Page 43: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/43.jpg)
CS141-L4-43 Tarun Soni, Summer’03
CPU: Control Signals: Logic Design
• nPC_sel <= if (OP == BEQ) then EQUAL else 0• ALUsrc <= if (OP == “000000”) then “regB” else “immed”• ALUctr <= if (OP == “000000”) then funct
elseif (OP == ORi) then “OR” elseif (OP == BEQ) then “sub” else “add”
• ExtOp <= if (OP == ORi) then “zero” else “sign”• MemWr <= (OP == Store)• MemtoReg <= (OP == Load)• RegWr: <= if ((OP == Store) || (OP == BEQ)) then 0 else 1• RegDst: <= if ((OP == Load) || (OP == ORi)) then 0 else 1
![Page 44: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/44.jpg)
CS141-L4-44 Tarun Soni, Summer’03
CPU: Example: Load
32
ALUctr
Clk
busW
RegWr
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst
Extender
Mux
3216imm16
ALUSrcExtOp
Mux
MemtoReg
Clk
Data InWrEn32 Adr
DataMemory
MemWrA
LUEqual
Instruction<31:0>
0
1
0
1
01
<21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRtRs
=
imm
16
Adder
Adder
PC
Clk
00Mux
4
nPC_sel
PC Ext
Adr
InstMemory
sign ext
addrt+4
R[rt] <- Mem[R[rs] + SignExt[imm16]]Viz., lw rt, rs, imm16
![Page 45: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/45.jpg)
CS141-L4-45 Tarun Soni, Summer’03
CPU: The abstract version
• Logical vs. Physical Structure
DataOut
Clk
5
Rw Ra Rb32 32-bitRegisters
Rd
ALU
Clk
Data In
DataAddress Ideal
DataMemory
Instruction
InstructionAddress
IdealInstruction
Memory
Clk
PC
5Rs
5Rt
32
323232
A
B
Nex
t Add
ress
Control
Datapath
Control Signals Conditions
![Page 46: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/46.jpg)
CS141-L4-46 Tarun Soni, Summer’03
CPU: The real thing
![Page 47: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/47.jpg)
CS141-L4-47 Tarun Soni, Summer’03
CPU: 5 steps to design
• 5 steps to design a processor– 1. Analyze instruction set => datapath requirements– 2. Select set of datapath components & establish clock methodology– 3. Assemble datapath meeting the requirements– 4. Analyze implementation of each instruction to determine setting of control points that
effects the register transfer.– 5. Assemble the control logic
• MIPS makes it easier– Instructions same size– Source registers always in same place– Immediates same size, location– Operations always on registers/immediates
• Single cycle datapath => CPI=1, CCT => long
![Page 48: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/48.jpg)
CS141-L4-48 Tarun Soni, Summer’03
CPU: Control Section
• The Five Classic Components of a Computer
Control
Datapath
Memory
ProcessorInput
Output
![Page 49: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/49.jpg)
CS141-L4-49 Tarun Soni, Summer’03
CPU: Add Instruction
• add rd, rs, rt
– mem[PC] Fetch the instruction from memory
– R[rd] <- R[rs] + R[rt] The actual operation
– PC <- PC + 4 Calculate the next instruction’s address
op rs rt rd shamt funct061116212631
6 bits 6 bits5 bits5 bits5 bits5 bits
![Page 50: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/50.jpg)
CS141-L4-50 Tarun Soni, Summer’03
CPU: The Add Instruction
Instruction Fetch Unit at the Beginning of Add
PC E
xt
• Fetch the instruction from Instruction memory: Instruction <- mem[PC]– This is the same for all instructions
Adr
InstMemory
Adder
Adder
PC
Clk
00Mux
4
nPC_sel
imm
16Instruction<31:0>
![Page 51: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/51.jpg)
CS141-L4-51 Tarun Soni, Summer’03
CPU: The Add Instruction
The Single Cycle Datapath during Add
32
ALUctr = Add
Clk
busW
RegWr = 1
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst = 1
Extender
Mux
Mux
3216imm16
ALUSrc = 0
Mux
MemtoReg = 0
Clk
Data InWrEn
32Adr
DataMemory
32
MemWr = 0A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>• R[rd] <- R[rs] + R[rt]
0
1
0
1
01<21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
op rs rt rd shamt funct061116212631
nPC_sel= +4
![Page 52: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/52.jpg)
CS141-L4-52 Tarun Soni, Summer’03
CPU: The Add Instruction
Instruction Fetch Unit at the End of Add
• PC <- PC + 4– This is the same for all instructions except: Branch and Jump
Adr
InstMemory
Adder
Adder
PC
Clk
00Mux
4
nPC_sel
imm
16Instruction<31:0>
![Page 53: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/53.jpg)
CS141-L4-53 Tarun Soni, Summer’03
CPU: The Or Immediate Instruction
• R[rt] <- R[rs] or ZeroExt[Imm16]
op rs rt immediate016212631
32
ALUctr =
Clk
busW
RegWr =
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst =
Extender
Mux
Mux
3216imm16
ALUSrc =
ExtOp =
Mux
MemtoReg =
Clk
Data InWrEn
32Adr
DataMemory
32
MemWr = A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
0
1
0
1
01<21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
nPC_sel =
![Page 54: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/54.jpg)
CS141-L4-54 Tarun Soni, Summer’03
CPU: The Or Immediate Instruction
The Single Cycle Datapath during Or Immediate
32
ALUctr = Or
Clk
busW
RegWr = 1
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst = 0
Extender
Mux
Mux
3216imm16
ALUSrc = 1
ExtOp = 0
Mux
MemtoReg = 0
Clk
Data InWrEn
32Adr
DataMemory
32
MemWr = 0A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
• R[rt] <- R[rs] or ZeroExt[Imm16]
0
1
0
1
01<21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
op rs rt immediate016212631
nPC_sel= +4
![Page 55: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/55.jpg)
CS141-L4-55 Tarun Soni, Summer’03
CPU: The Load Instruction
The Single Cycle Datapath during Load
32
ALUctr = Add
Clk
busW
RegWr = 1
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst = 0
Extender
Mux
Mux
3216imm16
ALUSrc = 1
ExtOp = 1
Mux
MemtoReg = 1
Clk
Data InWrEn
32Adr
DataMemory
32
MemWr = 0A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
0
1
0
1
01<21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
• R[rt] <- Data Memory {R[rs] + SignExt[imm16]}
op rs rt immediate016212631
nPC_sel= +4
![Page 56: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/56.jpg)
CS141-L4-56 Tarun Soni, Summer’03
CPU: The Store Instruction
The Single Cycle Datapath during Store• Data Memory {R[rs] + SignExt[imm16]} <- R[rt]
op rs rt immediate016212631
32
ALUctr =
Clk
busW
RegWr =
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst =
Extender
Mux
Mux
3216imm16
ALUSrc =
ExtOp =
Mux
MemtoReg =
Clk
Data InWrEn
32Adr
DataMemory
32
MemWr = A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
0
1
0
1
01<21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
nPC_sel =
![Page 57: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/57.jpg)
CS141-L4-57 Tarun Soni, Summer’03
CPU: The Store Instruction
The Single Cycle Datapath during Store
32
ALUctr = Add
Clk
busW
RegWr = 0
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst = x
Extender
Mux
Mux
3216imm16
ALUSrc = 1
ExtOp = 1
Mux
MemtoReg = x
Clk
Data InWrEn
32Adr
DataMemory
32
MemWr = 1A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
0
1
0
1
01<21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
• Data Memory {R[rs] + SignExt[imm16]} <- R[rt]
op rs rt immediate016212631
nPC_sel= +4
![Page 58: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/58.jpg)
CS141-L4-58 Tarun Soni, Summer’03
CPU: Datapath during branch
32
ALUctr = Subtract
Clk
busW
RegWr = 0
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst = x
Extender
Mux
Mux
3216imm16
ALUSrc = 0
ExtOp = x
Mux
MemtoReg = x
Clk
Data InWrEn
32Adr
DataMemory
32
MemWr = 0A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
0
1
0
1
01<21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
• if (R[rs] - R[rt] == 0) then Zero <- 1 ; else Zero <- 0
op rs rt immediate016212631
nPC_sel= “Br”
![Page 59: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/59.jpg)
CS141-L4-59 Tarun Soni, Summer’03
CPU: Datapath during branch
Instruction Fetch Unit at the End of Branch
• if (Zero == 1) then PC = PC + 4 + SignExt[imm16]*4 ; else PC = PC + 4
op rs rt immediate016212631
Adr
InstMemory
Adder
Adder
PC
Clk
00Mux
4
nPC_sel
imm
16Instruction<31:0>
![Page 60: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/60.jpg)
CS141-L4-60 Tarun Soni, Summer’03
CPU: Creating control from Datapath
ALUctrRegDst ALUSrcExtOp MemtoRegMemWr Equal
<21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
nPC_sel
Adr
InstMemory
DATA PATH
Control
Op
<21:25>
Fun
RegWr
![Page 61: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/61.jpg)
CS141-L4-61 Tarun Soni, Summer’03
CPU: Control Signals
inst Register Transfer
ADD R[rd] <– R[rs] + R[rt]; PC <– PC + 4
ALUsrc = RegB, ALUctr = “add”, RegDst = rd, RegWr, nPC_sel = “+4”
SUB R[rd] <– R[rs] – R[rt]; PC <– PC + 4
ALUsrc = RegB, ALUctr = “sub”, RegDst = rd, RegWr, nPC_sel = “+4”
ORi R[rt] <– R[rs] + zero_ext(Imm16); PC <– PC + 4
ALUsrc = Im, Extop = “Z”, ALUctr = “or”, RegDst = rt, RegWr, nPC_sel = “+4”
LOAD R[rt] <– MEM[ R[rs] + sign_ext(Imm16)]; PC <– PC + 4
ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemtoReg, RegDst = rt, RegWr, nPC_sel = “+4”
STORE MEM[ R[rs] + sign_ext(Imm16)] <– R[rs]; PC <– PC + 4
ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemWr, nPC_sel = “+4”
BEQ if ( R[rs] == R[rt] ) then PC <– PC + sign_ext(Imm16)] || 00 else PC <– PC + 4
nPC_sel = “Br”, ALUctr = “sub”
![Page 62: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/62.jpg)
CS141-L4-62 Tarun Soni, Summer’03
CPU: Summary of Control Signals
add sub ori lw sw beq jumpRegDstALUSrcMemtoRegRegWriteMemWritenPCselJumpExtOpALUctr<2:0>
1001000x
Add
1001000x
Subtract
01010000
Or
01110001
Add
x1x01001
Add
x0x0010x
Subtract
xxx0001x
xxx
op target address
op rs rt rd shamt funct061116212631
op rs rt immediate
R-type
I-type
J-type
add, sub
ori, lw, sw, beq
jump
funcop 00 0000 00 0000 00 1101 10 0011 10 1011 00 0100 00 0010Appendix A
10 0000See 10 0010 We Don’t Care :-)
![Page 63: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/63.jpg)
CS141-L4-63 Tarun Soni, Summer’03
CPU: Summary of Control Signals
The Concept of Local Decoding
R-type ori lw sw beq jumpRegDstALUSrcMemtoRegRegWriteMemWriteBranchJumpExtOpALUop<N:0>
1001000x
“R-type”
01010000
Or
01110001
Add
x1x01001
Add
x0x0010x
Subtract
xxx0001x
xxx
op 00 0000 00 1101 10 0011 10 1011 00 0100 00 0010
MainControl
op6
ALUControl(Local)
func
N
6ALUop
ALUctr3
ALU
![Page 64: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/64.jpg)
CS141-L4-64 Tarun Soni, Summer’03
CPU: Encoding of ALUop
• In this exercise, ALUop has to be 2 bits wide to represent:– (1) “R-type” instructions– “I-type” instructions that require the ALU to perform:
• (2) Or, (3) Add, and (4) Subtract• To implement the full MIPS ISA, ALUop has to be 3 bits to represent:
– (1) “R-type” instructions– “I-type” instructions that require the ALU to perform:
• (2) Or, (3) Add, (4) Subtract, and (5) And (Example: andi)
MainControl
op6
ALUControl(Local)
func
N
6ALUop
ALUctr3
R-type ori lw sw beq jumpALUop (Symbolic) “R-type” Or Add Add Subtract xxx
ALUop<2:0> 1 00 0 10 0 00 0 00 0 01 xxx
![Page 65: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/65.jpg)
CS141-L4-65 Tarun Soni, Summer’03
CPU: Decoding of the ‘func’ field
R-type ori lw sw beq jumpALUop (Symbolic) “R-type” Or Add Add Subtract xxx
ALUop<2:0> 1 00 0 10 0 00 0 00 0 01 xxx
MainControl
op6
ALUControl(Local)
func
N
6ALUop
ALUctr3
op rs rt rd shamt funct061116212631
R-type
funct<5:0> Instruction Operation10 000010 001010 010010 010110 1010
addsubtractandorset-on-less-than
ALUctr<2:0> ALU Operation000001010110111
AddSubtract
AndOr
Set-on-less-than
Recall
ALUctr
ALU
![Page 66: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/66.jpg)
CS141-L4-66 Tarun Soni, Summer’03
CPU: Truth table for ALUctr
R-type ori lw sw beqALUop(Symbolic) “R-type” Or Add Add Subtract
ALUop<2:0> 1 00 0 10 0 00 0 00 0 01
ALUop funcbit<2> bit<1> bit<0> bit<2> bit<1> bit<0>bit<3>
0 0 0 x x x x
ALUctrALUOperation
Add 0 1 0bit<2> bit<1> bit<0>
0 x 1 x x x x Subtract 1 1 00 1 x x x x x Or 0 0 11 x x 0 0 0 0 Add 0 1 01 x x 0 0 1 0 Subtract 1 1 01 x x 0 1 0 0 And 0 0 01 x x 0 1 0 1 Or 0 0 11 x x 1 0 1 0 Set on < 1 1 1
funct<3:0> Instruction Op.00000010010001011010
addsubtractandorset-on-less-than
![Page 67: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/67.jpg)
CS141-L4-67 Tarun Soni, Summer’03
CPU: Logic Equation ALUctr[2]
The Logic Equation for ALUctr<2>
ALUop funcbit<2> bit<1> bit<0> bit<2> bit<1> bit<0>bit<3> ALUctr<2>
0 x 1 x x x x 11 x x 0 0 1 0 11 x x 1 0 1 0 1
• ALUctr<2> = !ALUop<2> & ALUop<0> + ALUop<2> & !func<2> & func<1> & !func<0>
This makes func<3> a don’t care
![Page 68: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/68.jpg)
CS141-L4-68 Tarun Soni, Summer’03
CPU: Logic Equation ALUctr[1]
The Logic Equation for ALUctr<1>
ALUop funcbit<2> bit<1> bit<0> bit<2> bit<1> bit<0>bit<3>
0 0 0 x x x x 1ALUctr<1>
0 x 1 x x x x 11 x x 0 0 0 0 11 x x 0 0 1 0 11 x x 1 0 1 0 1
• ALUctr<1> = !ALUop<2> & !ALUop<0> + ALUop<2> & !func<2> & !func<0>
![Page 69: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/69.jpg)
CS141-L4-69 Tarun Soni, Summer’03
CPU: Logic Equation ALUctr[0]
The Logic Equation for ALUctr<0>
ALUop funcbit<2> bit<1> bit<0> bit<2> bit<1> bit<0>bit<3> ALUctr<0>
0 1 x x x x x 11 x x 0 1 0 1 11 x x 1 0 1 0 1
• ALUctr<0> = !ALUop<2> & ALUop<0> + ALUop<2> & !func<3> & func<2> & !func<1> & func<0> + ALUop<2> & func<3> & !func<2> & func<1> & !func<0>
![Page 70: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/70.jpg)
CS141-L4-70 Tarun Soni, Summer’03
CPU: ALU Control block
The ALU Control Block
ALUControl(Local)
func
3
6ALUop
ALUctr3
• ALUctr<2> = !ALUop<2> & ALUop<0> + ALUop<2> & !func<2> & func<1> & !func<0>
• ALUctr<1> = !ALUop<2> & !ALUop<0> + ALUop<2> & !func<2> & !func<0>
• ALUctr<0> = !ALUop<2> & ALUop<0> + ALUop<2> & !func<3> & func<2> & !func<1> & func<0> + ALUop<2> & func<3> & !func<2> & func<1> & !func<0>
![Page 71: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/71.jpg)
CS141-L4-71 Tarun Soni, Summer’03
CPU: Main Control
R-type ori lw sw beq jumpRegDstALUSrcMemtoRegRegWriteMemWriteBranchJumpExtOpALUop (Symbolic)
1001000x
“R-type”
01010000
Or
01110001
Add
x1x01001
Add
x0x0010x
Subtract
xxx0001x
xxx
op 00 0000 00 1101 10 0011 10 1011 00 0100 00 0010
ALUop <2> 1 0 0 0 0 xALUop <1> 0 1 0 0 0 xALUop <0> 0 0 0 0 1 x
MainControl
op6
ALUControl(Local)
func
3
6
ALUop
ALUctr3
RegDstALUSrc
:
![Page 72: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/72.jpg)
CS141-L4-72 Tarun Soni, Summer’03
CPU: Main Control The “Truth Table” for RegWrite
R-type ori lw sw beq jumpRegWrite 1 1 1 0 0 0
op 00 0000 00 1101 10 0011 10 1011 00 0100 00 0010
• RegWrite = R-type + ori + lw= !op<5> & !op<4> & !op<3> & !op<2> & !op<1> & !op<0> (R-type) + !op<5> & !op<4> & op<3> & op<2> & !op<1> & op<0> (ori) + op<5> & !op<4> & !op<3> & !op<2> & op<1> & op<0> (lw)
op<0>
op<5>. .op<5>. .
<0>
op<5>. .
<0>
op<5>. .
<0>
op<5>. .
<0>
op<5>. .
<0>
R-type ori lw sw beq jumpRegWrite
![Page 73: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/73.jpg)
CS141-L4-73 Tarun Soni, Summer’03
CPU: Main Control PLA Implementation of the Main Control
op<0>
op<5>. .op<5>. .
<0>
op<5>. .
<0>
op<5>. .
<0>
op<5>. .
<0>
op<5>. .
<0>
R-type ori lw sw beq jumpRegWrite
ALUSrc
MemtoRegMemWrite
BranchJump
RegDst
ExtOp
ALUop<2>ALUop<1>ALUop<0>
![Page 74: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/74.jpg)
CS141-L4-74 Tarun Soni, Summer’03
CPU Putting it All Together: A Single Cycle Processor
32
ALUctr
Clk
busW
RegWr
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst
Extender
Mux
Mux
3216imm16
ALUSrc
ExtOp
Mux
MemtoReg
Clk
Data InWrEn
32Adr
DataMemory
32
MemWrA
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
0
1
0
1
01<21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
MainControl
op6
ALUControlfunc
6
3ALUop
ALUctr3
RegDst
ALUSrc:
Instr<5:0>
Instr<31:26>
Instr<15:0>
nPC_sel
![Page 75: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/75.jpg)
CS141-L4-75 Tarun Soni, Summer’03
CPU Worst Case Timing (Load)
Clk
PCRs, Rt, Rd,Op, Func
Clk-to-Q
ALUctr
Instruction Memoey Access Time
Old Value New Value
RegWr
Old Value New Value
Delay through Control Logic
busARegister File Access Time
Old Value New Value
busBALU Delay
Old Value New Value
Old Value New Value
New ValueOld Value
ExtOp Old Value New Value
ALUSrc Old Value New Value
MemtoReg
Old Value New Value
Address
Old Value New Value
busW Old Value New
Delay through Extender & Mux
RegisterWrite Occurs
Data Memory Access Time
![Page 76: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/76.jpg)
CS141-L4-76 Tarun Soni, Summer’03
CPU: Single Cycle Solution
• Long cycle time:– Cycle time must be long enough for the load instruction:
PC’s Clock -to-Q +Instruction Memory Access Time +Register File Access Time +ALU Delay (address calculation) +Data Memory Access Time +Register File Setup Time +Clock Skew
• Cycle time for load is much longer than needed for all other instructions
![Page 77: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/77.jpg)
CS141-L4-77 Tarun Soni, Summer’03
CPU: Single Cycle Solution
° Single cycle datapath => CPI=1, CCT => long
° 5 steps to design a processor• 1. Analyze instruction set => datapath requirements• 2. Select set of datapath components & establish clock methodology• 3. Assemble datapath meeting the requirements• 4. Analyze implementation of each instruction to determine setting of control points
that effects the register transfer.• 5. Assemble the control logic
° Control is the hard part
° MIPS makes control easier• Instructions same size• Source registers always in same place• Immediates same size, location• Operations always on registers/immediates
Control
Datapath
Memory
ProcessorInput
Output
![Page 78: CS141-L4-1Tarun Soni, Summer’03 Single Cycle CPU Previously: built and ALU. Today: Actually…](https://reader036.vdocument.in/reader036/viewer/2022081606/5a4d1be07f8b9ab0599df532/html5/thumbnails/78.jpg)
CS141-L4-78 Tarun Soni, Summer’03
CPU: Interrupts
° Datapath for interrupts
° Interrupt: basically hardware line requesting an immediate jump
° PC = Int[I] if Int[I] = 1;
° May or maynot save registers
° May or maynot be maskable.
° Useful for multitasking control & real-time processing
° Signal Processing
° Harder to implement in case of a multi-cycle/pipelines system !