CS222: Pipeline Processor · 2017-04-12
CS222: Pipeline Processor Design
Dr. A. Sahu
Dept. of Comp. Sc. & Engg.
Indian Institute of Technology Guwahati
Outline
• Mid Exam Answer Script: Friday Class
• Previous:
– Control unit for Multi‐Cycle design
• Pipeline processor
• Basic Structure of Pipeline
• Hazards
– Data Hazards (Data dependency)
– Resource Hazards (Same resource used in two stages)
– Control hazards (Branch instruction)
Course Structure
• Before Mid‐Semester
– Instruction set and Assembly programming
– Arithmetic unit design (Adder/Sub/Mult, Float)
– Processor Design: data path for both single cycle and multi cycle processor (RTL, control)
• After Mid‐Semester
– Pipeline processor, hazards, superscalar
– Memory hierarchy, Cache, Disk
– IO organization and controller
– Advanced topics related to Computer Architecture
Problems with single cycle design
• Slowest instruction pulls down the clock frequency
• Resource utilization is poor
• There are some instructions which are impossible to implement in this manner
– Think: which instructions are these?
1. Clock period in single cycle design
[Figure: delay bars for the instruction classes R‐class, lw, sw, beq and j, built from the component delays tI, tR, tA, tM and t+, with the clock period marked against them.]
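The clock-period argument can be sketched numerically. The slide gives only the symbols tI, tR, tA, tM and t+; the sketch below assumes they stand for instruction memory, register file, ALU, data memory and adder delays, and both the numeric values and the component path per instruction are hypothetical. It only illustrates that a single cycle design has to set its clock period to the slowest instruction.

```python
# Assumed component delays in nanoseconds (hypothetical values, not from the slides).
DELAY = {"tI": 2.0, "tR": 1.0, "tA": 2.0, "tM": 2.0, "t+": 1.0}

# Assumed component sequence on each instruction's path (an illustration only).
PATHS = {
    "R-class": ["tI", "tR", "tA", "tR"],
    "lw": ["tI", "tR", "tA", "tM", "tR"],
    "sw": ["tI", "tR", "tA", "tM"],
    "beq": ["tI", "tR", "tA"],
    "j": ["tI", "t+"],
}


def latency(instr):
    """Sum of the component delays along one instruction's path."""
    return sum(DELAY[c] for c in PATHS[instr])


def single_cycle_clock_period():
    """Every instruction must finish in one cycle, so the slowest one sets the period."""
    return max(latency(i) for i in PATHS)


if __name__ == "__main__":
    for instr in PATHS:
        print(f"{instr:8s} {latency(instr):4.1f} ns")
    print("clock period:", single_cycle_clock_period(), "ns")
```

Under these assumed numbers the fastest instruction (j) needs far less time than the clock period, which is the utilization problem the slide points at.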
1. Clock period in multi‐cycle design
[Figure: the same delay bars for R‐class, lw, sw, beq and j (component delays tI, tR, tA, tM, t+), now divided into clock periods of a multi‐cycle design.]
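A rough sketch of the multi-cycle timing, under stated assumptions: the delay values are hypothetical, the clock period is taken as the slowest single component, and the cycle counts per instruction class are read off the control state transition table that appears later in this deck.

```python
# Assumed component delays in nanoseconds (hypothetical values, not from the slides).
DELAY = {"tI": 2.0, "tR": 1.0, "tA": 2.0, "tM": 2.0, "t+": 1.0}

# Cycles per instruction class, read off the control state transition table.
CYCLES = {"R-class": 4, "sw": 4, "lw": 5, "beq": 3, "j": 3}


def multi_cycle_clock_period(delays=DELAY):
    """Assumes each cycle is bounded by one major component; the slowest sets the period."""
    return max(delays.values())


def instruction_time(instr, delays=DELAY):
    """Time for one instruction = number of cycles x clock period."""
    return CYCLES[instr] * multi_cycle_clock_period(delays)


if __name__ == "__main__":
    for instr in CYCLES:
        print(f"{instr:8s} {CYCLES[instr]} cycles, {instruction_time(instr):4.1f} ns")
```

With unbalanced delays, cycles that use a fast component still take a full clock period, so time is wasted in those cycles.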
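Single Cycle Datapath
[Figure: single cycle datapath with PC, IM (ad, ins), RF (rad1, rad2, wad, wd, rd1, rd2), ALU, DM (ad, rd, wd), two adders (+), sign extension (sx), shift‐by‐2 (s2) and multiplexers. Register addresses come from ins[25‐21], ins[20‐16] and ins[15‐11], the offset from ins[15‐0], and the jump address ja[31‐0] from PC+4[31‐28] and ins[25‐0].]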
Improve Resource Utilization
• Merge IM/DM
– Lw: IM/PC++ ‐ R ‐ ALU ‐ DM ‐ R
– Sw: IM/PC++ ‐ R ‐ ALU ‐ DM
• Eliminate 1st Adder and Use ALU
– As 1st adder is used in 1st Cycle and ALU is free in 1st Cycle
• Eliminate 2nd Adder and Use ALU
– As 2nd adder is used in 2nd Cycle and ALU is free in 2nd Cycle
Multi‐cycle data path
• Single cycle approach to multi‐cycle approach: improve performance and resource sharing
• Delays in different cycles should be balanced
• Single ALU and single memory used
• Additional registers and multiplexers required
Multi‐cycle Control Design
• Break instructions into cycles
• Put cycle sequences together
• Control signal groups and micro operations
• Control states and signal values
• Control state transitions
Put cycle sequences together

| Cycle | R‐class | sw | lw | beq | j |
|---|---|---|---|---|---|
| 1 | IR = Mem[ ]; PC = .. + .. | IR = Mem[ ]; PC = .. + .. | IR = Mem[ ]; PC = .. + .. | IR = Mem[ ]; PC = .. + .. | IR = Mem[ ]; PC = .. + .. |
| 2 | A = RF[..]; B = RF[..] | A = RF[..]; B = RF[..] | A = RF[..] | A = RF[..]; B = RF[..]; Res = ..+.. | PC = .. |
| 3 | Res = ..op.. | Res = ..+.. | Res = ..+.. | if(..==..) PC = .. | |
| 4 | RF[..] = .. | Mem[ ] = .. | DR = Mem[ ] | | |
| 5 | | | RF[..] = DR | | |

Slide annotation: these can be merged
With a common decoding cycle
Cycle 1 (all instructions): IR = Mem[ ]; PC = .. + ..
Cycle 2 (all instructions): A = RF[..]; B = RF[..]; Res = ..+..

| Cycle | R‐class | sw | lw | beq | j |
|---|---|---|---|---|---|
| 3 | Res = ..op.. | Res = ..+.. | Res = ..+.. | if(..==..) PC = .. | PC = .. |
| 4 | RF[..] = .. | Mem[ ] = .. | DR = Mem[ ] | | |
| 5 | | | RF[..] = DR | | |
lw, sw can split after third cycle
Cycle 1 (all instructions): IR = Mem[ ]; PC = .. + ..
Cycle 2 (all instructions): A = RF[..]; B = RF[..]; Res = ..+..

| Cycle | R‐class | sw / lw | beq | j |
|---|---|---|---|---|
| 3 | Res = ..op.. | Res = ..+.. | if(..==..) PC = .. | PC = .. |
| 4 | RF[..] = .. | sw: Mem[ ] = ..; lw: DR = Mem[ ] | | |
| 5 | | lw: RF[..] = DR | | |
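The merging of cycle sequences can be viewed as finding the longest prefix that several sequences have in common. The sketch below is an illustration only: the step labels are shorthand invented here for the micro operations on the slides, and `common_prefix_len` is a hypothetical helper name.

```python
# Shorthand labels for the per-cycle micro operations (labels are assumptions).
SEQUENCES = {
    "R-class": ["fetch", "decode", "Res = A op B", "RF = Res"],
    "sw": ["fetch", "decode", "Res = A + offset", "Mem = B"],
    "lw": ["fetch", "decode", "Res = A + offset", "DR = Mem", "RF = DR"],
    "beq": ["fetch", "decode", "if (A == B) PC = Res"],
    "j": ["fetch", "decode", "PC = jump address"],
}


def common_prefix_len(seqs):
    """Number of leading cycles in which all given sequences do the same thing."""
    n = 0
    for steps in zip(*seqs):
        if len(set(steps)) != 1:
            break
        n += 1
    return n


if __name__ == "__main__":
    print("all classes share", common_prefix_len(SEQUENCES.values()), "cycles")
    print("sw and lw share", common_prefix_len([SEQUENCES["sw"], SEQUENCES["lw"]]), "cycles")
```

All five classes share the fetch and decode cycles, while sw and lw also share the address computation, which is why they can split after the third cycle.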
Control signals in multi‐cycle DP
[Figure: multi‐cycle datapath (PC, Mem, IR, DR, RF, A, B, ALU, Res, sx, s2, jump address ja[31‐0] from PC[31‐28] and ins[25‐0]) annotated with the control signals PW, IorD, MR, MW, IW, DW, Rdst, M2R, RW, AW, BW, Asrc1, Asrc2, op, ReW and Psrc, and the Z output.]
Micro operations and control signals ‐ PC group

| Micro operation | Name | PWu | PWc | Psrc |
|---|---|---|---|---|
| PC = PC + 4 | PCinc | 1 | X | 1 |
| if (A == B) PC = Res | branch | 0 | 1 | 0 |
| PC = PC[31‐28] \|\| s2(IR[25‐0]) | jump | 1 | X | 2 |
| default | nop | 0 | 0 | X |

PW = PWu + Z . PWc
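The expression PW = PWu + Z . PWc reads as "write the PC unconditionally, or conditionally when Z is set". A minimal sketch in Python follows; the tuple layout and the helper names `pc_write` and `pw_for` are assumptions made for illustration, and don't-care entries (X) are modelled as `None`.

```python
# (PWu, PWc, Psrc) per micro operation; None stands for a don't-care (X).
PC_GROUP = {
    "PCinc": (1, None, 1),
    "branch": (0, 1, 0),
    "jump": (1, None, 2),
    "nop": (0, 0, None),
}


def pc_write(pwu, pwc, z):
    """PW = PWu + Z . PWc, with + as OR and . as AND on single bits."""
    return pwu | (z & pwc)


def pw_for(op, z):
    """PC write enable for a micro operation, checking that an X on PWc is harmless."""
    pwu, pwc, _psrc = PC_GROUP[op]
    candidates = [pwc] if pwc is not None else [0, 1]
    results = {pc_write(pwu, c, z) for c in candidates}
    assert len(results) == 1, "result must not depend on a don't-care input"
    return results.pop()


if __name__ == "__main__":
    for op in PC_GROUP:
        print(op, [pw_for(op, z) for z in (0, 1)])
```

Only the branch micro operation makes the PC write depend on Z; PCinc and jump always write, and nop never does.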
Micro operations and control signals ‐ Mem group

| Micro operation | Name | MW | MR | IorD | IW | DW |
|---|---|---|---|---|---|---|
| IR = Mem[PC] | fetch | 0 | 1 | 0 | 1 | 0 |
| DR = Mem[Res] | m_rd | 0 | 1 | 1 | 0 | 1 |
| Mem[Res] = B | m_wr | 1 | 0 | 1 | 0 | 0 |
| default | nop | 0 | 0 | X | 0 | 0 |
Micro operations and control signals ‐ RF group

| Micro operation | Name | RW | Rdst | M2R | AW | BW |
|---|---|---|---|---|---|---|
| A = RF[IR[25‐21]] | rs2A | 0 | X | X | 1 | 0 |
| B = RF[IR[20‐16]] | rt2B | 0 | X | X | 0 | 1 |
| RF[IR[15‐11]] = Res | res2rd | 1 | 1 | 0 | 0 | 0 |
| RF[IR[20‐16]] = DR | mem2rt | 1 | 0 | 1 | 0 | 0 |
| default | nop | 0 | X | X | 0 | 0 |
Micro operations and control signals ‐ ALU group

| Micro operation | Name | opc | Asrc1 | Asrc2 | ReW |
|---|---|---|---|---|---|
| PC = PC + 4 | PCinc | 0 | 0 | 1 | 0 |
| Res = A op B | arith | 2 | 1 | 0 | 1 |
| Res = A + sx(IR[15‐0]) | Maddr | 0 | 1 | 2 | 1 |
| Res = PC + s2(sx(IR[15‐0])) | Paddr | 0 | 0 | 3 | 1 |
| if (A == B) PC = Res | branch | 1 | 1 | 0 | 0 |
| default | nop | X | X | X | 0 |
Control states and micro operations
• cs0: fetch, PCinc (all instructions)
• cs1: rs2A, rt2B, Paddr (all instructions)
• cs2: arith (R‐class)
• cs3: res2rd (R‐class)
• cs4: Maddr (sw / lw)
• cs5: m_wr (sw)
• cs6: m_rd (lw)
• cs7: mem2rt (lw)
• cs8: branch (beq)
• cs9: jump (j)
Control states and signal values

| State | PC grp | Mem grp | RF grp | ALU grp |
|---|---|---|---|---|
| cs0 | PCinc | fetch | nop | PCinc |
| cs1 | nop | nop | rs2A, rt2B | Paddr |
| cs2 | nop | nop | nop | arith |
| cs3 | nop | nop | res2rd | nop |
| cs4 | nop | nop | nop | Maddr |
| cs5 | nop | m_wr | nop | nop |
| cs6 | nop | m_rd | nop | nop |
| cs7 | nop | nop | mem2rt | nop |
| cs8 | branch | nop | nop | branch |
| cs9 | jump | nop | nop | nop |
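The state table names a micro operation per signal group, and each group table turns that name into signal values. A minimal sketch of this two-level lookup for the Mem group only is shown below; the dictionary layout and the function name `mem_signals` are assumptions for illustration, with `None` standing for a don't-care.

```python
# Mem group: micro operation name -> control signal values (None = don't-care).
MEM_GROUP = {
    "fetch": {"MW": 0, "MR": 1, "IorD": 0, "IW": 1, "DW": 0},
    "m_rd": {"MW": 0, "MR": 1, "IorD": 1, "IW": 0, "DW": 1},
    "m_wr": {"MW": 1, "MR": 0, "IorD": 1, "IW": 0, "DW": 0},
    "nop": {"MW": 0, "MR": 0, "IorD": None, "IW": 0, "DW": 0},
}

# Control state -> Mem group micro operation, as in the state table.
STATE_MEM_OP = {
    "cs0": "fetch", "cs1": "nop", "cs2": "nop", "cs3": "nop", "cs4": "nop",
    "cs5": "m_wr", "cs6": "m_rd", "cs7": "nop", "cs8": "nop", "cs9": "nop",
}


def mem_signals(state):
    """Expand a control state into the Mem group signal values."""
    return MEM_GROUP[STATE_MEM_OP[state]]


if __name__ == "__main__":
    for state in STATE_MEM_OP:
        print(state, mem_signals(state))
```

The same pattern would apply to the PC, RF and ALU groups, each with its own pair of tables.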
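Control state transitions

| State | R‐class | sw | lw | beq | j |
|---|---|---|---|---|---|
| cs0 | cs1 | cs1 | cs1 | cs1 | cs1 |
| cs1 | cs2 | cs4 | cs4 | cs8 | cs9 |
| cs2 | cs3 | X | X | X | X |
| cs3 | cs0 | X | X | X | X |
| cs4 | X | cs5 | cs6 | X | X |
| cs5 | X | cs0 | X | X | X |
| cs6 | X | X | cs7 | X | X |
| cs7 | X | X | cs0 | X | X |
| cs8 | X | X | X | cs0 | X |
| cs9 | X | X | X | X | cs0 |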
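The transition table can be walked mechanically to get the state sequence, and hence the cycle count, of each instruction class. The sketch below is one possible encoding: X entries are simply left out of the dictionaries, and `state_sequence` is a hypothetical helper name.

```python
CLASSES = ["R-class", "sw", "lw", "beq", "j"]

# Next-state table; X (unreachable) entries from the slide are omitted.
NEXT = {
    "cs0": {c: "cs1" for c in CLASSES},
    "cs1": {"R-class": "cs2", "sw": "cs4", "lw": "cs4", "beq": "cs8", "j": "cs9"},
    "cs2": {"R-class": "cs3"},
    "cs3": {"R-class": "cs0"},
    "cs4": {"sw": "cs5", "lw": "cs6"},
    "cs5": {"sw": "cs0"},
    "cs6": {"lw": "cs7"},
    "cs7": {"lw": "cs0"},
    "cs8": {"beq": "cs0"},
    "cs9": {"j": "cs0"},
}


def state_sequence(instr):
    """States visited by one instruction, starting at cs0 and stopping before the return to cs0."""
    seq = ["cs0"]
    while True:
        nxt = NEXT[seq[-1]][instr]
        if nxt == "cs0":
            return seq
        seq.append(nxt)


if __name__ == "__main__":
    for c in CLASSES:
        seq = state_sequence(c)
        print(f"{c:8s} {len(seq)} cycles: {' -> '.join(seq)}")
```

Each state takes one clock cycle, so the length of the sequence is the number of cycles the instruction needs.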
Pipeline Design
• Single Cycle
– Poor resource utilization, TC >= longest instruction latency
• Multi Cycle
– TC >= longest stage delay, better utilization, but performance still needs to improve using a pipeline
– When decoding INSi you can fetch INSi+1
• Pipeline
Instruction Pipeline
[Figure: six instructions, each going through the stages IF, D, EX, Mem, WB, drawn staggered so that successive instructions overlap.]
Performance: 1 instruction per cycle
All the stages work in parallel; no resource can be shared between stages
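The overlap can be sketched as a simple schedule in which instruction i enters the first stage in cycle i and moves one stage per cycle. This is an idealised illustration that assumes no hazards or stalls; the function names are hypothetical.

```python
STAGES = ["IF", "D", "EX", "Mem", "WB"]


def schedule(n):
    """Map cycle number -> list of (instruction, stage) pairs, assuming no stalls."""
    table = {}
    for i in range(n):
        for s, stage in enumerate(STAGES):
            table.setdefault(i + s + 1, []).append((i + 1, stage))
    return table


def total_cycles(n):
    """Cycles to finish n instructions: fill the pipeline once, then one per cycle."""
    return n + len(STAGES) - 1


def render(n):
    """Text version of the staggered pipeline diagram, one row per instruction."""
    rows = []
    for i in range(n):
        rows.append(("     " * i + "".join(f"{s:<5}" for s in STAGES)).rstrip())
    return "\n".join(rows)


if __name__ == "__main__":
    print(render(6))
    print("total cycles for 6 instructions:", total_cycles(6))
```

Once the pipeline is full, all five stages are busy in the same cycle and one instruction completes per cycle.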
Single cycle datapath (abstract)
[Figure: abstract single cycle datapath with PC, the +4 adder, a second adder (+), IM (ad, ins), RF (rad, wad, wd, rd1, rd2), ALU and DM (ad, rd, wd).]
Pipelined datapath
[Figure: the abstract datapath (PC, +4 adder, second adder, IM, RF, ALU, DM) divided into the stages IF, ID, EX, Mem and WB by the pipeline registers IF/ID, ID/EX, EX/Mem and Mem/WB.]
Don’t share resources in Stages
• In Multi Cycle Design
– ALU used for PC++ and Offset Adding
– Used for 1st Adder and 2nd Adder
– Register File is used in 2nd and 4th Cycle
• In Pipeline
– Use separate resources: 1st Adder, 2nd Adder & ALU
– Register File is accessed in 1st Half of 2nd Cycle and 2nd Half of 4th Cycle
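Why sharing breaks down in a pipeline can be sketched as a resource conflict check: two stages that are active in the same cycle must not need the same resource. The stage-to-resource mappings below are assumptions made for illustration (in particular, placing the 2nd adder in the decode stage and modelling the half-cycle register file accesses as separate read and write resources).

```python
from itertools import combinations

# Multi-cycle style sharing: one ALU, one memory, one register file (assumed mapping).
SHARED = {
    "IF": {"Mem", "ALU"},
    "D": {"RF", "ALU"},
    "EX": {"ALU"},
    "Mem": {"Mem"},
    "WB": {"RF"},
}

# Pipeline style: separate adders and memories, RF split into read and write halves.
SEPARATE = {
    "IF": {"IM", "Adder1"},
    "D": {"RF.read", "Adder2"},
    "EX": {"ALU"},
    "Mem": {"DM"},
    "WB": {"RF.write"},
}


def conflicts(usage):
    """Pairs of stages that would need the same resource in the same cycle."""
    found = []
    for a, b in combinations(usage, 2):
        common = usage[a] & usage[b]
        if common:
            found.append((a, b, sorted(common)))
    return found


if __name__ == "__main__":
    print("shared resources:", conflicts(SHARED))
    print("separate resources:", conflicts(SEPARATE))
```

With the shared mapping several stage pairs collide on the ALU, the memory or the register file; with separate resources the list of conflicts is empty.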
Put back multiplexers
[Figure: pipelined datapath with stages IF, ID, EX, Mem, WB and pipeline registers IF/ID, ID/EX, EX/Mem, Mem/WB, with the multiplexers, sign extension (sx) and shift‐by‐2 (s2) units restored around PC, IM, RF (rad1, rad2, wad, wd, rd1, rd2), ALU and DM.]
Correction for WB stage
[Figure: pipelined datapath (stages IF, ID, EX, Mem, WB; pipeline registers IF/ID, ID/EX, EX/Mem, Mem/WB; PC, IM, RF, ALU, DM, sx, s2, multiplexers) with the WB stage connections corrected.]
Abstract: Adding control
[Figure: pipelined datapath (PC, IM, RF, ALU, DM, sx, s2, multiplexers) with a control block and ALU control (Actrl) added.]
Control signals with delays
[Figure: pipelined datapath with the control block and ALU control (Actrl), showing the control signals with delays.]
Correction for RF write signal
[Figure: pipelined datapath with the control block and ALU control (Actrl), with the RF write signal corrected.]