Lecture 1
Review of CS 161
What is a von Neumann computer? => The Stored Program Concept: instructions are encoded in binary and stored in memory, and the program is executed sequentially
• Design of an Instruction Set - RISC Vs. CISC, Examples
• Design of CPU Datapath
• CPU Control Design (Hardwired vs. Microprogramming)
• Memory Design (Main Memory, Cache Memory, Virtual Memory)
• Input-Output
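The stored program concept above can be sketched with a toy machine in which instructions and data share one memory and the program counter advances sequentially. The four-instruction set and its 16-bit encoding are hypothetical, made up only for illustration; they are not MIPS.

```python
# Toy stored-program machine: instructions and data live in the same memory.
# The instruction set and encoding below are hypothetical (illustration only).
LOAD, ADD, STORE, HALT = 0, 1, 2, 3

def encode(op, addr=0):
    """Pack an opcode (high 8 bits) and an address (low 8 bits) into one word."""
    return (op << 8) | addr

def run(memory):
    """Fetch, decode, and execute instructions sequentially until HALT."""
    pc, acc = 0, 0
    while True:
        word = memory[pc]                   # fetch from the shared memory
        op, addr = word >> 8, word & 0xFF   # decode
        pc += 1                             # sequential execution
        if op == LOAD:
            acc = memory[addr]
        elif op == ADD:
            acc += memory[addr]
        elif op == STORE:
            memory[addr] = acc
        elif op == HALT:
            return memory
        else:
            raise ValueError(f"unknown opcode {op}")

memory = [
    encode(LOAD, 4),    # 0: acc = memory[4]
    encode(ADD, 5),     # 1: acc += memory[5]
    encode(STORE, 6),   # 2: memory[6] = acc
    encode(HALT),       # 3: stop
    7, 35, 0,           # 4..6: data words
]
result = run(memory)
print(result[6])  # 7 + 35 = 42
```

The program and its operands sit in the same list, so the only thing that makes a word an instruction is that the program counter reaches it.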
MIPS ISA
1. All MIPS instructions are the same length
– simplifies fetch and decode (steps 1, 2)
– Intel 80x86 and IBM 360/370 instructions are variable length, 1-17 bytes
2. Few instruction formats in MIPS
– source register fields are in the same place in all instructions
– can read two registers and decode the instruction in the same cycle
– explicit Load/Store instructions for memory-register operations
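A minimal sketch of why fixed length and fixed field positions simplify decode, using the standard 32-bit MIPS field layout (opcode in bits 31-26, rs in 25-21, rt in 20-16, rd in 15-11, shamt in 10-6, funct in 5-0). The function names are illustrative assumptions, not part of any toolchain.

```python
def decode_r_type(word):
    """Split a 32-bit MIPS R-type instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs":     (word >> 21) & 0x1F,
        "rt":     (word >> 16) & 0x1F,
        "rd":     (word >> 11) & 0x1F,
        "shamt":  (word >> 6) & 0x1F,
        "funct":  word & 0x3F,
    }

def source_registers(word):
    """Read rs and rt without looking at the opcode first.

    This works because the two source fields sit in the same bit
    positions in every format, so register read can overlap decode.
    """
    return (word >> 21) & 0x1F, (word >> 16) & 0x1F

add_word = 0x02324020   # add $t0, $s1, $s2  (R-type)
lw_word = 0x8E280004    # lw  $t0, 4($s1)    (I-type)

fields = decode_r_type(add_word)
print(fields)
print(source_registers(add_word))  # (17, 18): $s1, $s2
print(source_registers(lw_word))   # (17, 8): base $s1, and $t0 in the rt field
```

For the load, the rt field names the destination rather than a source, but the hardware can still read both register fields unconditionally and let control decide later which values are used.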
Review of CPU Datapath Design
Instruction execution consists of 5 stages: Fetch (IF), Decode (ID), Execute (EX), Memory (DM), and Write-back (WB)
Single Cycle Design – Big Cycle – CPI = 1. Problems: (1) low clock frequency, meaning fewer instructions executed per second; (2) all instructions take the same one big cycle, which must be long enough for the slowest instruction.
Multicycle Design – Small Cycle – CPI < 5. Break the datapath into several stages, each taking one cycle. Frequency is increased and some instructions can finish earlier. Problems: extra registers are needed to separate the stages, and control must ensure that the right control signals are applied to the right stage at the right time => complex control design, but still manageable in hardware.
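The single-cycle vs. multicycle trade-off can be made concrete with time per instruction = CPI × cycle time. The sketch below uses hypothetical stage latencies, per-class cycle counts, and an assumed instruction mix; none of these numbers come from the lecture.

```python
# All numbers are hypothetical, chosen only to illustrate the trade-off.
STAGE_PS = {"IF": 200, "ID": 150, "EX": 200, "DM": 200, "WB": 150}

# Cycles used per instruction class in a multicycle design (assumed).
CYCLES = {"load": 5, "store": 4, "alu": 4, "branch": 3}

# Assumed instruction mix in percent (sums to 100).
MIX_PCT = {"load": 25, "store": 10, "alu": 50, "branch": 15}

def single_cycle_ps_per_instr():
    """One big cycle must cover every stage, and CPI = 1."""
    return sum(STAGE_PS.values())

def multicycle_cpi():
    """Average CPI, weighted by the instruction mix."""
    return sum(MIX_PCT[k] * CYCLES[k] for k in MIX_PCT) / 100

def multicycle_ps_per_instr():
    """Small cycle is set by the slowest stage; time = CPI x cycle time."""
    cycle_ps = max(STAGE_PS.values())
    return sum(MIX_PCT[k] * CYCLES[k] for k in MIX_PCT) * cycle_ps / 100

print(single_cycle_ps_per_instr())  # 900 ps per instruction
print(multicycle_cpi())             # average CPI below 5
print(multicycle_ps_per_instr())    # 820 ps per instruction
```

With less balanced stage latencies the multicycle design can come out slower than the single-cycle one, since every stage is charged the full small cycle.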
Review: Datapath for MIPS
[Figure: MIPS datapath with PC, Instruction Memory (Imem), Registers, ALU, and Data Memory (Dmem), divided into Stage 1 to Stage 5: IFtch, Dcd, Exec, Mem, WB]
• Use datapath figure to represent stages
[Figure: one instruction drawn as the sequence IM, Reg, ALU, DM, Reg]
Pipelined Execution (IPC = 1)
• To simplify pipeline, every instruction takes same number of steps, called stages
• One clock cycle per stage
Program Flow (top to bottom), Time (left to right):
IFtch Dcd   Exec  Mem   WB
      IFtch Dcd   Exec  Mem   WB
            IFtch Dcd   Exec  Mem   WB
                  IFtch Dcd   Exec  Mem   WB
                        IFtch Dcd   Exec  Mem   WB
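The overlap in the pipelined execution diagram can be sketched as follows, assuming an ideal 5-stage pipeline with no stalls or hazards: the first instruction takes 5 cycles and each later one completes one cycle after its predecessor. The function names are illustrative.

```python
STAGES = ["IFtch", "Dcd", "Exec", "Mem", "WB"]

def pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Ideal pipeline: fill once, then one instruction completes per cycle."""
    return n_stages + n_instructions - 1

def unpipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Each instruction runs all stages before the next one starts."""
    return n_stages * n_instructions

def pipeline_diagram(n_instructions):
    """One row per instruction, shifted right by one cycle per row."""
    rows = []
    for i in range(n_instructions):
        cells = [" " * 5] * i + [s.ljust(5) for s in STAGES]
        rows.append(" ".join(cells).rstrip())
    return rows

for row in pipeline_diagram(5):
    print(row)
print(pipelined_cycles(5), "cycles pipelined vs", unpipelined_cycles(5), "unpipelined")
```

As the instruction count grows, cycles per instruction approaches 1, which is the IPC = 1 in the slide title.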
Advanced Architectural Concepts
• Can we achieve CPI < 1? (i.e., can we have IPC > 1?)
State-of-the-Art Microprocessor
• “Superscalar” execution or Instruction Level Parallelism (ILP)
• Deeper Pipeline => Dynamic Branch Prediction => Speculation => Recovery
• “Out-of-order” Execution => Instruction Window and Prefetch => Reorder Buffers
• “VLIW” Ex: Intel/HP Itanium
Instruction Level Parallelism (ILP) (IPC > 1)
Program Flow (top to bottom), Time (left to right), ILP = 2:
IFtch Dcd   Exec  Mem   WB
IFtch Dcd   Exec  Mem   WB
      IFtch Dcd   Exec  Mem   WB
      IFtch Dcd   Exec  Mem   WB
            IFtch Dcd   Exec  Mem   WB
            IFtch Dcd   Exec  Mem   WB
Ex: Pentium, SPARC, MIPS R10000, IBM PowerPC
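A rough sketch of how issuing two instructions per cycle pushes IPC above 1. It assumes an idealized machine in which every issue slot is filled and there are no dependences, hazards, or branch mispredictions; real superscalar processors fall short of this bound. The function names and parameters are illustrative.

```python
import math

def ideal_superscalar_cycles(n_instructions, issue_width=2, n_stages=5):
    """Cycles when issue_width instructions enter the pipeline together each cycle."""
    groups = math.ceil(n_instructions / issue_width)
    return n_stages + groups - 1

def ideal_ipc(n_instructions, issue_width=2, n_stages=5):
    """Instructions per cycle under the idealized assumptions above."""
    return n_instructions / ideal_superscalar_cycles(n_instructions, issue_width, n_stages)

print(ideal_superscalar_cycles(6))   # the six instructions in the diagram: 7 cycles
print(ideal_ipc(1000))               # approaches 2 for long instruction streams
print(ideal_ipc(1000, issue_width=1))  # scalar pipeline stays just below 1
```

For very short runs the pipeline fill time dominates, so IPC only approaches the issue width once the instruction stream is long.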
Very Large Instruction Word (VLIW) (IPC > 1)
[Figure: pipeline diagram, Program Flow vs. Time. Three long instructions each pass through IFtch, Dcd, Exec, Mem, WB, with two Exec operations shown in parallel per instruction]
Ex: Itanium