Review of CS 203A

Laxmi Narayan Bhuyan

http://www.cs.ucr.edu/~bhuyan

Lecture 2

Review CS 203A - Pipelining

[Figure: pipeline diagram of Instr 1 through Instr 4 (axis label "Instr.") against Time (clock cycles); stage labels M, Reg, ALU.]

• Can’t read same memory twice in same clock cycle: Structural Hazard

Other Hazards

• Data Hazards – Due to data dependencies
• Control Hazards – Due to branches

Getting CPI < 1: Issuing Multiple Instructions/Cycle
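As a rough illustration of the data-hazard bullet, the sketch below classifies the dependence between two instructions from their register usage. The `Instr` record and the `classify` function are hypothetical names introduced here, not part of the lecture; the RAW/WAR/WAW labels are the ones used later in these slides.

```python
from collections import namedtuple

# Hypothetical instruction record: one destination register, a tuple of sources.
Instr = namedtuple("Instr", ["op", "dest", "srcs"])

def classify(first, second):
    """Return the set of data hazards that `second` has on the earlier `first`."""
    hazards = set()
    if first.dest is not None and first.dest in second.srcs:
        hazards.add("RAW")   # second reads a register that first writes
    if second.dest is not None and second.dest in first.srcs:
        hazards.add("WAR")   # second writes a register that first reads
    if first.dest is not None and first.dest == second.dest:
        hazards.add("WAW")   # both write the same register
    return hazards

lw = Instr("LW", "r1", ("r2",))
add = Instr("ADD", "r7", ("r1", "r4"))
print(classify(lw, add))
```

Whether a detected dependence actually causes a stall depends on the pipeline (bypassing, issue order), which this sketch does not model.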

• Superscalar MIPS: 2 instructions, 1 FP & 1 anything
– Fetch 64 bits/clock cycle; Int on left, FP on right
– Can only issue 2nd instruction if 1st instruction issues
– More ports for FP registers to do FP load & FP op in a pair

Type              Pipe Stages
Int. instruction  IF  ID  EX  MEM WB
FP instruction    IF  ID  EX  MEM WB
Int. instruction      IF  ID  EX  MEM WB
FP instruction        IF  ID  EX  MEM WB
Int. instruction          IF  ID  EX  MEM WB
FP instruction            IF  ID  EX  MEM WB
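The pairing described above can be sketched as a small timing model. This is a minimal sketch that assumes an ideal pipeline (no stalls, one Int+FP pair fetched every cycle); the function names `dual_issue_schedule` and `ideal_cpi` are made up for illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def dual_issue_schedule(n_pairs):
    """Return rows (type, start_cycle, {cycle: stage}) for n_pairs Int+FP pairs.

    Assumption: ideal pipeline, no stalls, one pair issued per clock cycle.
    """
    rows = []
    for pair in range(n_pairs):
        start = pair + 1  # both instructions of a pair are fetched together
        for kind in ("Int. instruction", "FP instruction"):
            timing = {start + i: stage for i, stage in enumerate(STAGES)}
            rows.append((kind, start, timing))
    return rows

def ideal_cpi(n_pairs):
    """Total cycles divided by instruction count, including pipeline fill."""
    rows = dual_issue_schedule(n_pairs)
    last_cycle = max(max(timing) for _, _, timing in rows)
    return last_cycle / len(rows)

# Print the stage chart for three pairs, one column per clock cycle.
for kind, start, timing in dual_issue_schedule(3):
    cells = [timing.get(cycle, "") for cycle in range(1, 8)]
    print(f"{kind:<17}" + "".join(f"{cell:<5}" for cell in cells))
```

For long instruction streams the ratio approaches 0.5 cycles per instruction under these assumptions, which is the sense in which multiple issue gets CPI below 1.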

MIPS R4000 Pipeline

Comparison of Issue Capabilities

Courtesy of Susan Eggers; Used with Permission

VLIW and Superscalar

• Sequential stream of long instruction words
• Instructions scheduled statically by the compiler
• Number of simultaneously issued instructions is fixed at compile time
• Instruction issue is less complicated than in a superscalar processor
• Disadvantage: VLIW processors cannot react to dynamic events, e.g. cache misses, with the same flexibility as superscalars.
• The number of instructions in a VLIW instruction word is usually fixed.
• Padding VLIW instructions with no-ops is needed when the full issue bandwidth cannot be used. This increases code size. More recent VLIW architectures use a denser code format that allows the no-ops to be removed.
• VLIW is an architectural technique, whereas superscalar is a microarchitecture technique.
• VLIW processors take advantage of spatial parallelism.
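The no-op padding point can be made concrete with a small sketch. It assumes the compiler has already grouped independent operations; `pack_vliw`, `nop_fraction`, and the operation names are hypothetical and only illustrate fixed-width words, not any real VLIW encoding.

```python
def pack_vliw(groups, width):
    """Pack groups of independent operations into fixed-width VLIW words.

    Each group is a list of operations assumed to be schedulable together;
    unused slots in a word are padded with NOP.
    """
    words = []
    for group in groups:
        if len(group) > width:
            raise ValueError("group exceeds issue width")
        words.append(list(group) + ["NOP"] * (width - len(group)))
    return words

def nop_fraction(words):
    """Fraction of all slots that hold a NOP (a proxy for wasted code size)."""
    slots = [op for word in words for op in word]
    return slots.count("NOP") / len(slots)

groups = [["LW", "ADD", "FMUL"], ["SUB"], ["SW", "FADD"]]
words = pack_vliw(groups, width=4)
for word in words:
    print(word)
print(nop_fraction(words))
```

In this made-up example half of the slots are NOPs, which is the code-size cost that denser VLIW formats try to avoid.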

Multithreading

• How can we guarantee no dependencies between instructions in a pipeline?
– One way is to interleave execution of instructions from different program threads on same pipeline
– Micro context switching

Interleave 4 threads, T1-T4, on non-bypassed 5-stage pipe

T1: LW r1, 0(r2)
T2: ADD r7, r1, r4
T3: XORI r5, r4, #12
T4: SW 0(r7), r5
T1: LW r5, 12(r1)
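The interleaving above can be checked with a short sketch. It assumes a classic IF, ID, EX, MEM, WB pipe with register read in ID, register write in WB, and (as a strict assumption made here) no reading of a register in the same cycle it is written. The function names are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def interleave(threads, n_slots):
    """Round-robin fetch: cycle k+1 fetches from threads[k mod len(threads)]."""
    return [(slot + 1, threads[slot % len(threads)]) for slot in range(n_slots)]

def same_thread_safe(n_threads):
    """True if a thread's next instruction reads registers (ID) strictly after
    its previous instruction has written them (WB), with no bypassing."""
    prev_wb = 1 + STAGES.index("WB")              # previous instr fetched in cycle 1
    next_id = 1 + n_threads + STAGES.index("ID")  # next instr of the same thread
    return next_id > prev_wb

print(interleave(["T1", "T2", "T3", "T4"], 5))
print(same_thread_safe(4))
```

With four threads, T1's second instruction decodes in cycle 6, after its first instruction wrote back in cycle 5, so consecutive instructions in the pipe never depend on one another under these assumptions.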

HW Schemes: Instruction Parallelism

• Out-of-order execution divides ID stage:
1. Issue: decode instructions, check for structural hazards. Issue in order if the functional unit is free and there is no WAW hazard.
2. Read operands (RO): wait until no data hazards, then read operands. ADDD would stall at RO, and SUBD could proceed with no stalls.
• Scoreboards allow an instruction to execute whenever 1 & 2 hold, not waiting for prior instructions.

(WAR?)
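The two conditions above can be sketched as simple checks. This is a minimal illustration, not a full scoreboard: it ignores WAR handling and write-back, and the dict layout, the unit names `div`, `add1`, `add2`, and the register choices (taken from a commonly used DIVD/ADDD/SUBD example) are assumptions made here.

```python
def can_issue(instr, busy_units, pending_dests):
    """Issue stage: functional unit free (no structural hazard) and no WAW."""
    return instr["unit"] not in busy_units and instr["dest"] not in pending_dests

def can_read_operands(instr, pending_dests):
    """Read-operands stage: no source is still waiting to be written (no RAW)."""
    return all(src not in pending_dests for src in instr["srcs"])

# Assumed state: a long-running DIVD writing F0 is already in flight.
busy_units = {"div"}
pending_dests = {"F0"}

addd = {"op": "ADDD", "unit": "add1", "dest": "F10", "srcs": ("F0", "F8")}
subd = {"op": "SUBD", "unit": "add2", "dest": "F12", "srcs": ("F8", "F14")}

print(can_issue(addd, busy_units, pending_dests))   # ADDD can issue
print(can_read_operands(addd, pending_dests))       # but waits for F0 at RO

# Once ADDD has issued, its unit is busy and its destination is pending.
busy_units.add("add1")
pending_dests.add("F10")
print(can_issue(subd, busy_units, pending_dests))   # SUBD can issue
print(can_read_operands(subd, pending_dests))       # and reads operands at once
```

Under these assumptions ADDD waits at RO while the later SUBD goes ahead, which is the out-of-order behavior the slide describes.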

[Figure: IF and ISSUE stages feeding parallel functional-unit pipelines: RO EX1 … EXm; RO EX1 … EXn; RO EX1 … EXp.]

FP unit and load-store unit using Tomasulo’s alg.

Four Steps of Speculative Tomasulo Algorithm

1. Issue: get instruction from FP Op Queue
If reservation station and reorder buffer slot are free, issue instr & send operands & reorder buffer no. for destination (this stage is sometimes called “dispatch”).

2. Execution: operate on operands (EX)
When both operands are ready, execute; if not ready, watch CDB for result; when both are in reservation station, execute; checks RAW (this stage is sometimes called “issue”).

3. Write result: finish execution (WB)
Write on Common Data Bus to all awaiting FUs & reorder buffer; mark reservation station available.

4. Commit: update register with reorder result
When instr. is at head of reorder buffer & result is present, update register with result (or store to memory) and remove instr from reorder buffer (this stage is sometimes called “graduation”). A mispredicted branch flushes the reorder buffer.
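The commit step can be sketched with a queue standing in for the reorder buffer. This is a simplified model under stated assumptions: entries are plain dicts with hypothetical keys `dest`, `value`, `done`, and `mispredicted`; stores to memory and reservation stations are not modeled.

```python
from collections import deque

def commit(rob, regs):
    """Commit finished entries from the head of the reorder buffer, in order.

    Stops at the first entry whose result is not yet present. A mispredicted
    branch at the head flushes every younger (speculative) entry.
    Returns the number of entries retired.
    """
    committed = 0
    while rob and rob[0]["done"]:
        entry = rob.popleft()
        committed += 1
        if entry.get("mispredicted"):
            rob.clear()          # discard all speculative work behind the branch
            break
        if entry["dest"] is not None:
            regs[entry["dest"]] = entry["value"]
    return committed

regs = {}
rob = deque([
    {"dest": "F0", "value": 1.5, "done": True},
    {"dest": "F2", "value": None, "done": False},  # still executing
    {"dest": "F4", "value": 2.0, "done": True},    # finished out of order
])
first = commit(rob, regs)   # only F0 retires; F4 must wait behind F2
rob[0].update(value=3.0, done=True)
second = commit(rob, regs)  # now F2 and F4 retire in order
print(first, second, regs)
```

The point of the sketch is that results may finish out of order but registers are only updated from the head of the buffer, so speculative results can be thrown away cleanly.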
