main thesis

Lecture 1

Review of CS 161

What is a von Neumann computer? => The Stored Program Concept – sequential execution of a program – instructions encoded in binary and stored in memory
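The stored program concept can be illustrated with a very small sketch. The toy machine below is hypothetical and is not the MIPS ISA: it assumes 8-bit words with a 4-bit opcode and a 4-bit address, a single accumulator, and four made-up opcodes. The point is only that instructions and data sit in the same memory and that the program counter steps through them sequentially.

```python
# Hypothetical toy ISA (an assumption for illustration, not MIPS):
# each 8-bit word = 4-bit opcode | 4-bit memory address.
HALT, LOAD, ADD, STORE = 0, 1, 2, 3


def run(memory):
    """Execute the program stored in `memory`, starting at address 0."""
    pc = 0   # program counter
    acc = 0  # single accumulator register
    while True:
        word = memory[pc]            # fetch the instruction from memory
        pc += 1                      # sequential execution: next address
        opcode, addr = word >> 4, word & 0xF   # decode the binary fields
        if opcode == HALT:
            break
        elif opcode == LOAD:
            acc = memory[addr]
        elif opcode == ADD:
            acc = (acc + memory[addr]) & 0xFF  # keep results in 8 bits
        elif opcode == STORE:
            memory[addr] = acc
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return memory


# Program (addresses 0-3) and data (addresses 8-10) share one memory.
program = [0x18, 0x29, 0x3A, 0x00]   # LOAD 8; ADD 9; STORE 10; HALT
memory = program + [0] * 4 + [5, 7, 0]
run(memory)
print(memory[10])
```

Because the program is just data in memory, loading a different list of words runs a different program on the same hardware model.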

• Design of an Instruction Set - RISC vs. CISC, Examples

• Design of CPU Datapath

• CPU Control Design (Hardwired vs. Microprogramming)

• Memory Design (Main Memory, Cache Memory, Virtual Memory)

• Input-Output

MIPS ISA

1. All MIPS instructions are the same length
– simplifies fetch and decode (steps 1, 2)
– Intel 80x86 and IBM 360/370 instructions are variable length, 1-17 bytes

2. Few instruction formats in MIPS
– source register fields are in the same place in all instructions
– can read two registers and decode the instruction in the same cycle
– explicit Load/Store instructions for memory-register operations
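A minimal sketch of why fixed-length instructions with fixed field positions simplify decoding: every field can be extracted with the same shifts and masks before the opcode is even examined. The helper name `decode_fields` is an assumption for illustration; the field layout is the standard 32-bit MIPS one (6-bit opcode, 5-bit rs/rt/rd/shamt, 6-bit funct, 16-bit immediate).

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into its fixed-position fields.

    All fields are extracted unconditionally; which ones are meaningful
    depends on the format (R-type uses rd/shamt/funct, I-type uses imm).
    """
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,     # first source register, same place in all formats
        "rt": (word >> 16) & 0x1F,     # second source (or destination for loads)
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
        "imm": word & 0xFFFF,
    }


# add $t0, $s1, $s2  (R-type): rs = 17, rt = 18, rd = 8, funct = 0x20
add_fields = decode_fields(0x02324020)

# lw $t0, 32($s3)    (I-type): opcode = 35, rs = 19, rt = 8, imm = 32
lw_fields = decode_fields(0x8E680020)

print(add_fields)
print(lw_fields)
```

Since `rs` and `rt` occupy the same bits in both formats, the register file can be read while the opcode is still being decoded, which is the point made in item 2 above.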

Review of CPU Datapath Design

Instruction execution consists of 5 stages: Fetch (IF), Decode (ID), Execute (EX), Memory (DM), and Write-back (WB)

Single Cycle Design – Big Cycle – CPI = 1. Problems: (1) Low clock frequency, meaning fewer instructions executed per second (2) All instructions take the same one big cycle, even those that could finish sooner

Multicycle Design – Small Cycle – CPI < 5. Break the datapath into several stages, each taking one cycle. Frequency is increased and some instructions can finish earlier. Problems: Extra registers are needed to separate the stages, and the control must ensure that the right control signals are applied to the right stage at the right time => complex control design, but still manageable in hardware.
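The trade-off between the two designs can be checked with the usual relation: execution time = instruction count × CPI × cycle time. The sketch below uses made-up numbers: the stage latencies, the cycles per instruction class, and the instruction mix are all assumptions chosen for illustration, not measurements.

```python
def execution_time_ns(instruction_count, cpi, cycle_time_ns):
    """Execution time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_ns


# Hypothetical stage latencies in ns (assumed equal for simplicity).
STAGE_LATENCY_NS = {"IF": 2, "ID": 2, "EX": 2, "DM": 2, "WB": 2}

# Single cycle: one big cycle must cover all five stages, CPI = 1.
single_cycle_ns = sum(STAGE_LATENCY_NS.values())

# Multicycle: the cycle only has to cover the slowest stage.
multicycle_cycle_ns = max(STAGE_LATENCY_NS.values())

# Hypothetical cycles per instruction class and instruction mix (percent).
CYCLES = {"load": 5, "store": 4, "alu": 4, "branch": 3}
MIX_PERCENT = {"load": 25, "store": 10, "alu": 50, "branch": 15}

# Average CPI is the mix-weighted cycle count; it stays below 5.
multicycle_cpi = sum(MIX_PERCENT[k] * CYCLES[k] for k in CYCLES) / 100

n = 1000  # instructions executed
print("single cycle:", execution_time_ns(n, 1, single_cycle_ns), "ns")
print("multicycle:  ", execution_time_ns(n, multicycle_cpi, multicycle_cycle_ns), "ns")
```

With equal stage latencies the multicycle design comes out ahead under these assumptions. If the stages are unbalanced, the multicycle clock is set by the slowest stage and the advantage can shrink or disappear.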

Review: Datapath for MIPS

[Figure: MIPS datapath with PC, Instruction Memory (Imem), Registers, ALU, and Data Memory (Dmem), divided into five stages: Stage 1 IFtch, Stage 2 Dcd, Stage 3 Exec, Stage 4 Mem, Stage 5 WB]

• Use datapath figure to represent stages

[Figure: datapath icons for the five stages: IM, Reg, ALU, DM, Reg]

Pipelined Execution IPC = 1

• To simplify the pipeline, every instruction takes the same number of steps, called stages

• One clock cycle per stage






[Figure: pipelined execution diagram; axes: Program Flow, Time]
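The one-stage-per-clock rule above can be turned into a small timing sketch. It assumes an ideal five-stage pipeline with no hazards or stalls (an assumption, since real pipelines do stall); the function names are hypothetical. Instruction i enters IFtch in cycle i + 1, so n instructions finish in 5 + n - 1 cycles and IPC approaches 1 as n grows.

```python
STAGES = ["IFtch", "Dcd", "Exec", "Mem", "WB"]


def pipeline_schedule(num_instructions):
    """For each instruction, map clock cycle (1-based) -> stage occupied.

    Ideal pipeline: a new instruction starts every cycle, no stalls.
    """
    return [
        {i + s + 1: name for s, name in enumerate(STAGES)}
        for i in range(num_instructions)
    ]


def total_cycles(num_instructions):
    """Cycles to finish all instructions: fill the pipe, then one per cycle."""
    return len(STAGES) + num_instructions - 1


def print_diagram(num_instructions):
    """Print program flow (rows) against time (columns)."""
    schedule = pipeline_schedule(num_instructions)
    for i, row in enumerate(schedule):
        cells = [
            row.get(c, "").ljust(6)
            for c in range(1, total_cycles(num_instructions) + 1)
        ]
        print(f"instr {i + 1}: " + "".join(cells))


print_diagram(4)
n = 100
print("cycles:", total_cycles(n), "IPC:", n / total_cycles(n))
```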

Advanced Architectural Concepts

• Can we achieve CPI < 1 (i.e., can we have IPC > 1)? => State-of-the-Art Microprocessors

• “Superscalar” execution or Instruction Level Parallelism (ILP)

• “Deeper Pipeline” => Dynamic Branch Prediction => Speculation => Recovery

• “Out-of-order” Execution => Instruction Window and Prefetch => Reorder Buffers

• “VLIW” Ex: Intel/HP Itanium

Instruction Level Parallelism (ILP) IPC > 1


[Figure: superscalar pipeline diagram with ILP = 2: two instructions at a time pass through IFtch, Dcd, Exec, Mem, WB; axes: Program Flow, Time]

Ex: Pentium, SPARC, MIPS R10000, IBM PowerPC
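To see how issuing two instructions per cycle pushes IPC above 1, the ideal pipeline cycle count can be extended with an issue width. This is a rough sketch that assumes every group of instructions is independent and that there are no stalls, which real superscalar processors do not achieve; the function names are hypothetical.

```python
import math


def superscalar_cycles(num_instructions, issue_width=2, num_stages=5):
    """Ideal cycle count when `issue_width` instructions enter per cycle."""
    groups = math.ceil(num_instructions / issue_width)
    return num_stages + groups - 1


def ipc(num_instructions, issue_width=2, num_stages=5):
    """Instructions per cycle under the same ideal assumptions."""
    return num_instructions / superscalar_cycles(
        num_instructions, issue_width, num_stages
    )


n = 100
for width in (1, 2):
    print(f"width {width}: {superscalar_cycles(n, width)} cycles, "
          f"IPC = {ipc(n, width):.2f}")
```

With a width of 1 this reduces to the ordinary pipeline and IPC stays just under 1; with a width of 2 the ideal IPC approaches 2 for long instruction sequences.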

Very Large Instruction Word (VLIW) IPC > 1

[Figure: VLIW pipeline diagram: each instruction passes through IFtch, Dcd, Exec, Mem, WB, with multiple Exec units operating in parallel; axes: Program Flow, Time]

Ex: Itanium
