CEG 360/560; EE 451/651 Digital System Design. Dr. Travis Doom, Assistant Professor, Department of Computer Science and Engineering, Wright State University. Section IV: Digital System Organization

Upload: philippa-ramsey

Post on 27-Dec-2015



Introduction

What is memory?
What is the difference between cache/processor memory and main/system memory?
What is a microoperation?
How are register-transfer-level operations controlled?
How is control organized and implemented in a microprocessor?
How does pipelining work?
What is the relationship between a C program and the instructions that are executed by the microprocessor?
What are the defining characteristics of modern RISC architectures?
Why is memory organized hierarchically?

Outline

Memory
– SRAM
– DRAM
Microprocessor Organization
– Data paths
  The Control Word
  Pipelining
– Control unit
  Microprogrammed Control
Simple Computer Architecture
– Computer Instructions
– Instruction Set Architecture
Issues in Computer Design
– CISC vs. RISC
– Contemporary processors

Read/Write Memory (RWM / RAM)

RWM = RAM (Random Access Memory)
Highly structured, like ROMs and PLDs
Can store and retrieve data at (relatively) the same speed
Static RAM (SRAM) retains data in latches (while powered)
Dynamic RAM (DRAM) stores data as capacitor charge; all capacitors must be recharged periodically.
Volatile Memory: Both Static and Dynamic RAM
Nonvolatile Memory: Data retained when power is lost
= ROMs, NVRAM (w/battery), Flash Memory

Basic Structure of SRAM

Address/Control/Data Out lines like a ROM (Reading)

+ Write Enable (WE) and Data In (DIN) (Writing)

[Figure: block symbol of a 2^n x b RAM with address inputs A0 … An-1, data inputs DIN0 … DINb-1, control inputs WE, OE, CS, and data outputs DOUT0 … DOUTb-1]

One Bit of SRAM

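SEL and WR asserted → IN data stored in D-latch (Write)
SEL only asserted → D-latch output enabled (Read)
SEL not asserted → No operation

[Figure: one-bit SRAM cell built around a D latch (D, C, Q) with inputs IN, /SEL, /WR and output OUT, shown next to its block symbol (IN, SEL, WR, OUT)]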

Structure of 8x4 SRAM

3-to-8 Decoder selects one row of cells for Read or Write

If ADDR + CS + OE are asserted, DOUT (3-0) is enabled. (Read)

If ADDR + DIN + CS + WE are asserted, DIN latches (Write)
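The read and write conditions above can be sketched as a small behavioral model. This is only an illustration: the class name `SRAM8x4` and the `cycle` method are hypothetical, and the control inputs are modeled as "asserted" booleans rather than active-low pins.

```python
class SRAM8x4:
    """Behavioral sketch of an 8-word x 4-bit SRAM (names are illustrative)."""

    def __init__(self):
        self.rows = [0] * 8  # eight rows of four latch bits each

    def cycle(self, addr, din=0, cs=False, oe=False, we=False):
        """Apply one set of inputs; return DOUT, or None when outputs are disabled."""
        if not cs:
            return None                    # chip not selected: no operation
        row = addr & 0b111                 # 3-to-8 decoder picks one row
        if we:
            self.rows[row] = din & 0b1111  # write: DIN is latched into the row
            return None
        if oe:
            return self.rows[row]          # read: DOUT(3-0) is enabled
        return None


ram = SRAM8x4()
ram.cycle(addr=5, din=0b1010, cs=True, we=True)
print(ram.cycle(addr=5, cs=True, oe=True))  # 10
```

Without CS asserted, neither OE nor WE has any effect, which is what lets several chips share the same data and control lines.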

[Figure: 8 x 4 SRAM. A 3-to-8 decoder driven by A2-A0 selects one of eight rows (0-7); each row holds four one-bit cells (IN, SEL, WR, OUT). Inputs: DIN3-DIN0, /WE, /CS, /OE. Outputs: DOUT3-DOUT0.]

Composite 128K x 8 Memory

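[Figure: two 64K x 8 RAMs (DATA, ADRS, CS, R/W') share the 8-bit data bus, the Read/Write line, and ADR 15-0 (16 lines); a decoder with an EnMem enable input decodes ADR 16 into outputs 0 and 1, which drive the two chip selects]

2^16 = 64K, 2^17 = 128K, 2^18 = 256K

How many 64K x 8 RAMs do we need to create a 256K x 8 memory?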
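The arithmetic behind the composite memory can be checked with a short sketch. The helper names `chips_needed` and `decode` are hypothetical; the split assumes the high-order address bits feed the chip-select decoder and the low 16 bits go to every chip, as in the 128K x 8 example.

```python
CHIP_WORDS = 64 * 1024  # each 64K x 8 RAM holds 2**16 words


def chips_needed(total_words, chip_words=CHIP_WORDS):
    """Number of chips required to cover total_words (rounded up)."""
    return -(-total_words // chip_words)


def decode(address, chip_words=CHIP_WORDS):
    """Split a system address into (chip-select index, address within the chip)."""
    return address // chip_words, address % chip_words


print(chips_needed(128 * 1024))  # 2
print(chips_needed(256 * 1024))  # 4
print(decode(0x1ABCD))           # (1, 43981)
```

Under these assumptions a 256K x 8 memory would take four chips, with ADR 17-16 driving a 2-to-4 decoder.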

Composite 64K x 16 Memory

[Figure: two 64K x 8 RAMs (DATA, ADRS, CS, R/W') share ADR 15-0 (16 lines), Read/Write, and CS; each chip supplies 8 of the 16 data lines]

Physical SRAM Array Should Be Square
Example: 16 x 1 SRAM → 4 x 4 Array

[Figure: 16 x 1 SRAM laid out as a 4 x 4 array of one-bit cells (IN, SEL, WR, OUT). Labels: DI, DO, /WE, /CS, /OE; a 2-to-4 decoder on A1-A0 with outputs 0-3; a second 2-to-4 decoder and a 4-to-1 mux on A3-A2.]
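A minimal sketch of the square-array idea follows. It assumes that A1-A0 select the row and A3-A2 select the column, which is one reading of the figure labels; the function names are hypothetical.

```python
ROW_BITS = 2  # assumed: A1-A0 drive the row decoder
COL_BITS = 2  # assumed: A3-A2 drive the column mux / write decoder

cells = [[0] * (1 << COL_BITS) for _ in range(1 << ROW_BITS)]  # 4 x 4 array


def split_address(addr):
    """Split a 4-bit address into (row, column) indices."""
    row = addr & ((1 << ROW_BITS) - 1)
    col = (addr >> ROW_BITS) & ((1 << COL_BITS) - 1)
    return row, col


def write_bit(addr, bit):
    row, col = split_address(addr)
    cells[row][col] = bit & 1


def read_bit(addr):
    row, col = split_address(addr)
    return cells[row][col]


write_bit(0b1101, 1)
print(split_address(0b1101))  # (1, 3)
print(read_bit(0b1101))       # 1
```

A linear 16 x 1 layout would need a 4-to-16 decoder and sixteen select lines; the square layout needs only four row lines and four column lines.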

SRAM Timing

During READ, outputs are combinational functions of ADDR, CS, OE (like ROM)
– Inputs can freely change without problems (except for propagation delay from last input change to output)
During WRITE, data is stored in latches, NOT FFs.
– Thus, setup & hold on Data IN are relative to the trailing edge of /WR
Address must be stable
– for setup time before /WR asserted, and
– for hold time after /WR deasserted, to prevent “spraying” data to multiple rows
/WR asserted when BOTH /CS and /WE asserted
/WR deasserted when EITHER /CS or /WE deasserted

READ Timing (SRAM)

Like a ROM!

[Figure: SRAM read timing diagram for ADDR, /CS, /OE, and DOUT, marking tAA, tACS, tOE, tOZ, and tOH; output becomes valid after max(tAA, tACS); one parameter is annotated "Primary Spec for SRAMs"]

WRITE Timing (SRAM)

[Figure: SRAM write timing diagram for ADDR, /CS, /WE, and DIN, marking tAS, tAH, tCSW, tWP, tDS, and tDH for a WE-controlled write and a CS-controlled write]

64K x 1 DRAM

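[Figure: 64K x 1 DRAM. A 1-bit DRAM cell sits at the crossing of a word line and a bit line. A row decoder drives a 256 x 256 array, followed by a row register and data mux/demux. An 8-bit multiplexed ADDR input supplies the row address and the column address; control inputs are /RAS, /CAS, /WE; data pins are Din and Dout.]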

DRAM Timing

Refresh cycle - RAS Only
– (RAS asserted) Entire row is latched in row-address register
– (RAS deasserted) The data in the row-address register is rewritten

Read cycle - RAS + CAS
– (RAS asserted) Entire row is latched in row-address register
– (CAS asserted) Data in register is multiplexed to output
– (RAS deasserted) Data in row-address register is rewritten
– (CAS deasserted) Output is released

Write cycle - RAS + WE + CAS
– (RAS asserted) Entire row is latched in row-address register
– (WE asserted) Data_in is stable
– (CAS asserted) Demultiplex Data_in to row-address register
– (WE deasserted) Data_in is no longer stable
– (RAS deasserted) Data in row-address register is rewritten
– (CAS deasserted) Operation complete
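The three cycles can be sketched as a behavioral model of a 64K x 1 DRAM organized as 256 x 256. The class and function names are hypothetical, timing is ignored, and the sketch assumes the upper 8 address bits are sent as the row address and the lower 8 as the column address.

```python
class DRAM64Kx1:
    """Behavioral sketch of a 256 x 256 DRAM array (no timing, no charge decay)."""

    def __init__(self):
        self.array = [[0] * 256 for _ in range(256)]
        self.row = None
        self.row_reg = None

    def ras_assert(self, row_addr):
        self.row = row_addr & 0xFF
        self.row_reg = list(self.array[self.row])  # entire row is latched

    def cas_read(self, col_addr):
        return self.row_reg[col_addr & 0xFF]       # mux one bit to Dout

    def cas_write(self, col_addr, bit):
        self.row_reg[col_addr & 0xFF] = bit & 1    # demux Din into the register

    def ras_deassert(self):
        self.array[self.row] = self.row_reg        # row is rewritten
        self.row_reg = None


def refresh(dram, row):
    dram.ras_assert(row)          # RAS only: latch, then rewrite
    dram.ras_deassert()


def read(dram, address):
    dram.ras_assert(address >> 8)
    bit = dram.cas_read(address & 0xFF)
    dram.ras_deassert()
    return bit


def write(dram, address, bit):
    dram.ras_assert(address >> 8)
    dram.cas_write(address & 0xFF, bit)
    dram.ras_deassert()


dram = DRAM64Kx1()
write(dram, 0xA55A, 1)
refresh(dram, 0xA5)
print(read(dram, 0xA55A))  # 1
```

Every cycle ends by rewriting the whole row, which is why a RAS-only cycle is enough to refresh it.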

Modern DRAM Timing

Fast-Page Mode (One RAS, multiple CAS)
– Multiple bits of a row can be written before rewrite
– Complex control, but much faster

Extended Data Out (One RAS, multiple CAS)
– Latches the column address so that the next address can be prepared while the output is read
– Saves ~10 ns/read, an increase of 10-15%
– Even more complex control!

SDRAM - Synchronous DRAM
– Unlike normal DRAM, SDRAM is clocked!
– Multiple signals and banks (row-address registers) allow “pipelined” operation

RAM Summary

SRAM: Fast, Simple Interface, Moderate bit density (4 gates → 4 to 6 transistors), Moderate cost/bit
– Used in small systems or very fast applications (cache memory)

DRAM (Dynamic RAM): Moderate speed, Complex interface, High bit density (1 transistor cell), Low cost/bit
– Used in large memories: PCs, mainframes


Computer Datapath

Computer Datapath

Control Word

The selection variables for the datapath select the microoperation to be executed within the datapath for any given clock pulse.

Control Word: the combined values of the datapath control inputs in a specified order.
– Control words consist of multiple fields, each of which represents part of the microoperation functionality.
  e.g. Destination Address, Operand A Address, Operand B Address, Read Immediate/Constant, Function Select, Write Immediate/Constant, Read/Write
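To make the idea of "fields in a specified order" concrete, here is a sketch that packs and unpacks a control word. The field names are shortened versions of the fields listed above, and the widths (3, 3, 3, 1, 4, 1, 1 bits) are assumptions chosen only for illustration; the slides do not give them.

```python
# (field name, width in bits), most significant field first. Widths are assumed.
FIELDS = [
    ("dest", 3),         # Destination Address
    ("a", 3),            # Operand A Address
    ("b", 3),            # Operand B Address
    ("use_const", 1),    # Read Immediate/Constant
    ("func", 4),         # Function Select
    ("use_data_in", 1),  # Write Immediate/Constant
    ("write", 1),        # Read/Write
]


def pack(**values):
    """Concatenate the field values into a single control word."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word):
    """Recover the individual fields from a control word."""
    fields = {}
    for name, width in reversed(FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields


word = pack(dest=1, a=2, b=3, func=0b0010, write=1)
print(f"{word:016b}")
print(unpack(word))
```

The datapath never sees the field names, only the bit positions, so the order in `FIELDS` plays the role of the "specified order" in the definition.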

Symbolic Notation

Symbolic Notation to Control Word

Timing Diagram

Pipelined Datapath

Datapath throughput can be increased by breaking up the datapath with registers and increasing the clock speed.
– This is known as pipelining the datapath.

Throughput vs. Turnaround time
– Throughput: The number of operations (units) per second (time period).
– Turnaround time: The amount of time which elapses from the beginning of the operation’s execution (unit processing) to its completion.
– Consider an assembly line (or fast-food drive-thru!)

Basic stages of a datapath microoperation:

– Operand Fetch (OF)

– Execute Function (EX)

– Write Back Results (WB)

Datapath Timing

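[Figure: datapath timing without and with pipeline platforms (registers) that divide the datapath into Stage One, Stage Two, and Stage Three]

Conventional datapath: Min. Clock Period = 12 ns (83.3 MHz)
Pipelined datapath: Min. Clock Period = 5 ns (200 MHz)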
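The two clock figures can be turned into throughput and turnaround numbers with a few lines of arithmetic. This sketch assumes one microoperation completes per clock once the pipeline is full and that the pipelined datapath has the three stages shown.

```python
def clock_mhz(period_ns):
    """Clock frequency in MHz for a given clock period in nanoseconds."""
    return 1000.0 / period_ns


CONVENTIONAL_PERIOD_NS = 12
PIPELINED_PERIOD_NS = 5
STAGES = 3  # assumed: OF, EX, WB

# Turnaround time for one microoperation.
conventional_latency_ns = CONVENTIONAL_PERIOD_NS   # one long clock
pipelined_latency_ns = STAGES * PIPELINED_PERIOD_NS  # three short clocks

print(f"conventional: {clock_mhz(CONVENTIONAL_PERIOD_NS):.1f} MHz, "
      f"{conventional_latency_ns} ns per operation")
print(f"pipelined:    {clock_mhz(PIPELINED_PERIOD_NS):.1f} MHz, "
      f"{pipelined_latency_ns} ns per operation")
```

Under these assumptions pipelining raises throughput from about 83.3 to 200 million microoperations per second while the turnaround time of a single microoperation grows from 12 ns to 15 ns.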

Pipeline Execution Pattern

Not all of the pipelined units are necessarily active at all times
– Filling and emptying the pipeline “wastes” time
– Hazards exist!
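The cost of filling and emptying the pipeline can be seen by tabulating which microoperation occupies each stage in each clock cycle. The sketch below assumes an ideal three-stage pipeline (OF, EX, WB) with no hazards; the function names are hypothetical.

```python
STAGES = ("OF", "EX", "WB")


def pipeline_schedule(n_ops, stages=STAGES):
    """Return one row per clock cycle; each entry is the operation number
    in that stage, or None when the stage is idle."""
    total_cycles = n_ops + len(stages) - 1
    rows = []
    for cycle in range(total_cycles):
        row = []
        for stage_index in range(len(stages)):
            op = cycle - stage_index
            row.append(op + 1 if 0 <= op < n_ops else None)
        rows.append(row)
    return rows


def utilization(n_ops, stages=STAGES):
    """Fraction of stage slots that do useful work."""
    return n_ops / (n_ops + len(stages) - 1)


for cycle, row in enumerate(pipeline_schedule(4), start=1):
    print(cycle, row)
print(utilization(4))
```

With n operations and k stages the run takes n + k - 1 cycles, so the idle slots at the start and end matter less as the sequence gets longer.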

The Control Unit

The control unit generates the signals for sequencing the operations in the datapath
– A sequential circuit with states that dictate the control signals for the system
– Using status conditions and control inputs, the sequential control unit determines the next state in which additional microoperations are activated.

Hardwired Control
– The control unit is implemented to provide a particular digital function

Microprogrammed Control
– The control unit’s binary control values are stored as words in a control memory (usually ROM).
– Each word in the control memory contains a microinstruction
– A sequence of microinstructions constitutes a microprogram
– Firmware!

Diagram of a Hardwired Control Unit

Programmable Control Units

The binary information stored in a digital computer can be classified as either data or control information (machine language instructions)
– Von Neumann - Stored Program Model

Non-programmable Control Unit
– A control unit that is not responsible for obtaining instructions from memory; it determines the operations to be performed and the sequence of those operations based only upon its inputs and status bits

Programmable Control Unit:
– A portion of the input to the system consists of a sequence of instructions.
– Each instruction contains the information necessary for the control unit to determine a sequence of microoperations or which instruction to execute next
– The address of the next instruction comes from a Program Counter (PC)
  The PC can count (+1 instruction, normal operation)
  The PC can load (branch instruction)

Control Units (Programmable or Non) can be single-cycle or multi-cycle
– If any instruction requires more than one microoperation the machine is multi-cycle.
  How does this complicate the design?

Diagram of an Instruction Decoder

The instruction decoder takes the inputs to the control unit (in this case an “instruction”) and creates a corresponding control word for a datapath microoperation.

This might be used in a single-cycle computer

Diagram of a (Hardwired) Programmable Computer

This is a single-cycle computer
One instruction = One microoperation

Data Memory

Diagram of a Microprogrammed Control Unit

The values of the control signals (and outputs) are determined by the contents of the Control Memory (a.k.a. the Control Store)

A portion of the contents of the Control Memory is used (along with the next set of inputs) to determine the next control memory address.
– The “next-address” field maintains internal state

A Simple Multi-cycle Computer

This PC only increments… usually the PC must also be able to load

Diagram of a Pipelined Control Unit

Basic stages of instruction execution:
– Instruction Fetch (IF)
– Decode & Operand Fetch (DOF)
– Execute Function (EX)
– Write Back Results (WB)

More stages are possible!

Relate this to the “Machine-cycle”

A Simple Computer

Instruction Format: Opcode [15:12], Op A [11:8], Op B [7:4], Op C [3:0]
Example Instruction: x2021
Opcode 2 Format: M[A] ← M[B]
Instruction: M[0] ← M[2]

[Figure: a μ-processor connected to a RAM. μ-processor signals: Instruction (16), Start, Reset, Clock, Ready, Address Out (4), Data Out (4), Write Enable, Data In (4). RAM signals: Address, Data In, Write, Data Out.]

RAM Contents (Before)
ADDR  DATA
x0    x4
x1    xF
x2    x9
…     ...

RAM Contents (After)
ADDR  DATA
x0    x9
x1    xF
x2    x9
…     ...
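The example instruction can be traced in a few lines. The sketch below decodes the four 4-bit fields of x2021 and applies the Opcode 2 format, M[A] ← M[B], to the RAM contents shown on the slide; the function names are hypothetical and only opcode 2 is modeled.

```python
def decode(instruction):
    """Split a 16-bit instruction into Opcode [15:12], A [11:8], B [7:4], C [3:0]."""
    return {
        "opcode": (instruction >> 12) & 0xF,
        "a": (instruction >> 8) & 0xF,
        "b": (instruction >> 4) & 0xF,
        "c": instruction & 0xF,
    }


def execute(instruction, ram):
    """Execute one instruction against a RAM modeled as an address -> data dict."""
    fields = decode(instruction)
    if fields["opcode"] == 0x2:
        ram[fields["a"]] = ram[fields["b"]]  # M[A] <- M[B]
    else:
        raise NotImplementedError("only opcode 2 is described on the slide")
    return ram


ram = {0x0: 0x4, 0x1: 0xF, 0x2: 0x9}  # RAM contents (before)
execute(0x2021, ram)
print({hex(addr): hex(data) for addr, data in ram.items()})
```

After the call, location x0 holds x9, matching the "after" table.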

A Simple Microprocessor

[Figure: the μ-processor split into a Control Unit and a Data Unit. External signals: Instruction (16), Start, Reset, Clock, Ready, Address Out (4), Data Out (4), Write Enable, Data In (4). Internal signals between the units: VNCZ (4), Control Word (14), Constant (4).]

Example Instruction: x2021
Opcode 2 Format: M[A] ← M[B]
Instruction: M[0] ← M[2]

• R0 ← Constant B
• Address Out ← R0, R1 ← Data In
• R0 ← Constant A
• Address Out ← R0, Data Out ← R1, Write Enable
• Return to IDLE state (Sequencing Instruction)

A (not so) Simple Control Unit

[Figure: control unit built around a control address register (CAR, with CLR, DATA, and LD/CNT' inputs) that addresses a CONTROL STORE (ROM). Blocks: Generate Data Signal Logic, Generate Load Signal Logic, Generate Ready. Microinstruction fields: Mode, Condition Code, Next Address, Operand Select (bit ranges 16:13 and 12:0 are marked). Signals: Instruction (16), Start, Reset, Clock, Ready, Write Enable, VNCZ (4), Control Word (14), Constant (4), Operand Select.]

• R0 ← Constant B
• Address Out ← R0, R1 ← Data In
• R0 ← Constant A
• Address Out ← R0, Data Out ← R1, Write Enable
• CAR ← x00 (IDLE STATE)

Microprogram Design


A simple computer architecture

Generic Computer System.
– Current architectures are performance driven, and vary widely.

Processor
– Uniprocessor systems

ASIC (Application Specific Integrated Circuit)
– Performs a specific task, not a general purpose processor (e.g. Voodoo)

I/O Device
– Accesses data devices (e.g. Graphics Adapter, Disk Controller, et al.)

[Figure: Processor, Memory, two ASICs, and two I/O Devices connected by a Bus]

A simple microprocessor

Central Processing Unit
– Control Unit, Integer Datapath (Load/Store, Integer ALU)

Floating Point Unit
– Floating Point Datapath

Internal Cache
– SRAM for Instruction Cache (i-cache) and Data Cache (d-cache)

Memory Management Unit
– Controls communication with Main Memory and other I/O

[Figure: block diagram with FPU, CPU, Internal Cache, MMU, and Bus]

Instruction Set Architecture

Microprocessors can only perform certain operations
– Users determine which operations will be performed and in what order through the use of a program.
– A program consists of a sequence of machine-executable instructions.
– An instruction is a collection of bits that instructs the computer to perform a specific operation.
– The set of instructions that a particular microprocessor can execute is its instruction set.
– Instruction set architecture: A thorough description of the instruction set of a computer.

Users cannot easily produce meaningful programs using the instruction set directly.
– Compilers convert programs specified in high-level languages into the instruction set equivalent (a machine language program).

Computer Instructions (1)

High-Level Language - C
  A = B + C;
– Memory-Transfer Equivalent
  Mem[A] ← Mem[B] + Mem[C]
  Mem[EA00] ← Mem[EA08] + Mem[EA10]

Machine-Level Equivalent
– Assembly (human readable)   Machine   RTL
  Load R2, B                  E2EA08    R2 ← M[xEA08]
  Load R3, C                  E3EA10    R3 ← M[xEA10]
  Add R2, R2, R3              0223      R2 ← R2 + R3
  Store A, R2                 F2EA00    M[xEA00] ← R2

The bits of a machine instruction are divided into fields
– e.g.: E2EA08
– E: Operation “Load”; 2: Destination Address R2; EA08: Address Field
– The operation field (opcode) defines the format for the instruction
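As a rough illustration of how the opcode determines the format, the sketch below interprets the four machine words from the table. The encodings (E = Load, F = Store, 0 = Add, one hex digit per register, four hex digits per address) are read off the example, and the initial values of B and C are made up for the demonstration.

```python
def run(program, memory):
    """Interpret machine words given as hex strings; returns the register file."""
    regs = [0] * 16
    for word in program:
        opcode = word[0]
        if opcode == "E":                      # Load Rd, address
            regs[int(word[1], 16)] = memory[int(word[2:], 16)]
        elif opcode == "F":                    # Store address, Rs
            memory[int(word[2:], 16)] = regs[int(word[1], 16)]
        elif opcode == "0":                    # Add Rd, Ra, Rb
            dest, src_a, src_b = (int(digit, 16) for digit in word[1:])
            regs[dest] = regs[src_a] + regs[src_b]
        else:
            raise ValueError(f"opcode {opcode} is not part of the example")
    return regs


# Hypothetical starting values for B (at xEA08) and C (at xEA10).
memory = {0xEA08: 5, 0xEA10: 7}
run(["E2EA08", "E3EA10", "0223", "F2EA00"], memory)
print(memory[0xEA00])  # 12
```

The Load and Store words carry an address field and are longer than the Add word, which carries only register numbers; the first hex digit tells the decoder which layout to expect.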

Computer Instructions (2)

There are three basic types of computer instructions
– Register Instructions: operate on values stored in registers
  Arithmetic, Shift, and Logic instructions
– Move Instructions: move data between memory and registers
  Load/Store instructions
  Move/Copy portions of memory
– Branch Instructions: select one of two possible next instructions to execute
  Branch on condition, Unconditional branch (Jump)
  Only one address is explicit, the other is implicit
  – e.g.: Beq R2, R3, A
  – If the contents of R2 = R3 then execute the instruction at location A next (explicit)
  – otherwise, execute the next instruction in the normal order (using the PC) (implicit)

Instructions vs. Microoperations

What is the difference between a computer instruction and a hardware microoperation?
– Computer instruction: an operation stored in binary in the computer’s memory

– The control unit uses the address or addresses provided by the program counter (PC) to retrieve the next instruction from memory

– The control unit then decodes the instruction fields to perform the required microoperations for the execution of the instruction.

– Thus, in microprogrammed control, each computer instruction corresponds to a microprogram!

Instruction Formats

Different instructions have different types of instruction formats
– Register, Implied, Immediate, Direct, Indirect, Relative, Indexed

Register: operands are hardware registers (e.g. Add R3, R2, R1)

  Bits 15-9   Bits 8-6               Bits 5-3            Bits 2-0
  Opcode      Destination Register   Source Register A   Source Register B

Immediate: one operand is a constant (e.g. Add R2, 3)

  Bits 15-9   Bits 8-6               Bits 5-3            Bits 2-0
  Opcode      Destination Register   Source Register A   Operand B
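The 16-bit layout (7-bit opcode in bits 15-9, then three 3-bit fields) can be sketched with shifts and masks. The opcode value used for Add below is hypothetical; the slides do not list opcode assignments.

```python
def encode(opcode, dest, src_a, src_b):
    """Pack the fields: opcode [15:9], dest [8:6], src_a [5:3], src_b [2:0]."""
    for name, value, width in (("opcode", opcode, 7), ("dest", dest, 3),
                               ("src_a", src_a, 3), ("src_b", src_b, 3)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
    return (opcode << 9) | (dest << 6) | (src_a << 3) | src_b


def decode(word):
    """Return (opcode, dest, src_a, src_b) from a 16-bit instruction word."""
    return (word >> 9) & 0x7F, (word >> 6) & 0x7, (word >> 3) & 0x7, word & 0x7


ADD = 0b0000010  # hypothetical opcode for Add

word = encode(ADD, 3, 2, 1)  # Add R3, R2, R1 in register format
print(f"{word:016b}")
print(decode(word))          # (2, 3, 2, 1)
```

In the immediate format the same three low bits are interpreted as the constant Operand B instead of a register number, so `Add R2, 3` can only carry constants that fit in 3 bits under this layout.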

The Instruction-execution Cycle

Instruction Fetch (IF) stage
– Get next instruction from the memory address referenced by the PC
– Place the new instruction in the Instruction Register
– Increment the PC to the next instruction address

Instruction decode (ID) stage
– The instruction is recognized, or decoded
– Determine the instruction format by examining the opcode

Operand fetch (OF) stage
– Perform any calculations necessary to fetch the operand values
– If necessary, fetch operands from memory to temporary registers

Execute operation (EX) stage
– Execute the operation specified in the opcode
  Branch instructions may update the PC

Writeback (WB) stage
– Store the result of the operation as determined by the instruction

Repeat the instruction-execution cycle (a.k.a. the machine cycle).


CISC

Complex Instruction Set Computers (CISC) - 1950’s +
– Made up of powerful primitives (close to the primitives of high-level languages).
– Had many computer instructions.
– Why?
  Early compilers did not produce the fastest, most memory efficient code!
  Most “respectable” programmers understood and occasionally programmed modules in the machine’s instruction set.
  Powerful primitives made this task user-friendly.
– Performance/Costs
  The memory footprint of the code was very small!
  The hardware to implement these instructions was very complicated.
  Due to the large number of different tasks, CISC machines are difficult to fully pipeline.

RISC

Reduced Instruction Set Computers (RISC) - 1980’s +
– 80% of the code uses only 20% of the instructions
– Minimal instruction set
  all instructions use simple addressing modes to reduce decoding difficulty
  most instructions require one clock tick to execute (simplifies pipelining)
– Why? Fast is Good!
  More chip space is devoted to making the most common instructions as fast as possible.
  Some less common instructions may be slower, but overall performance is increased.
  Compilers have improved to the point that well-optimized sequences of simple (very fast) commands often outperform the more complicated multi-cycle instructions.
  RISC designs could fit on a single chip, which reduced cost, increased reliability, and increased clock speeds!
– Performance/Costs
  Larger code footprints, more/larger instructions than CISC.
  Initially RISC ISAs could not include support for floating point instructions. They had to be performed in software and were very slow.
  As chip densities increased, floating point instructions were added to the RISC ISAs.

Modern processors

RISC is a design philosophy (Fast is good!) Most newer architectures as RISC-like Most older architectures (including x86) are CISC-like Processors of both designs now take advantage of many previously RISC-only

features, including:– Single chip implementation

– Instruction pipelining

– Data forwarding

– Delayed branchingBranch prediction Branches are not decided until the execute phase… anything in the pipeline after the branch

may have to be annulled if the “next address” is incorrect. If we executed a branch every fifth instruction and only half of our branches fell through, the

lost efficiency due to restarting the pipeline would be 20%! Branch delay slots increase pipeline efficiency by requiring an instruction in the branch delay

slot (which may be a No-Op) which be executed regardless of the branch decision. Branch prediction and the ability to annul instructions in the pipeline has greatly increased

the efficiency of branches.
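The 20% figure above can be reproduced with a short calculation. The sketch below assumes (this is not stated on the slide) that each branch that does not fall through annuls two instructions already in the pipeline, which is what happens if the branch is decided in the third stage of a four-stage pipeline. The function and parameter names are illustrative.

```python
def cycles_per_instruction(branch_fraction, restart_fraction,
                           penalty_cycles, base_cpi=1.0):
    """Average CPI when some branches force the pipeline to restart.

    branch_fraction  -- fraction of instructions that are branches
    restart_fraction -- fraction of branches that do not fall through
    penalty_cycles   -- pipeline slots annulled per restart (assumed value)
    """
    return base_cpi + branch_fraction * restart_fraction * penalty_cycles


# One branch every fifth instruction, half of them restart the pipeline,
# and an ASSUMED two annulled slots per restart.
cpi = cycles_per_instruction(branch_fraction=1 / 5,
                             restart_fraction=0.5,
                             penalty_cycles=2)
extra_cycles = cpi - 1.0          # extra cycles per instruction
relative_throughput = 1.0 / cpi   # compared with a pipeline that never restarts

print(f"CPI = {cpi:.2f}, extra cycles = {extra_cycles:.0%}, "
      f"throughput = {relative_throughput:.1%} of ideal")
```

With a deeper pipeline the assumed penalty grows, which is one reason branch prediction matters more as pipelines get longer.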

Page 54: CEG 360/560; EE 451/651 Digital System Design Dr. Travis Doom, Assistant Professor Department of Computer Science and Engineering Wright State University


Pipeline Hazards (1)

Basic Pipelining Hazards
– Data Dependency - occurs whenever one instruction generates a value that is used by one or more successive instructions.
    Data Forwarding - a technique to avoid delays due to data dependency hazards by allowing data to skip one or more pipeline stages

[Figure: pipeline diagram. Instructions 1-4 (Load A; Load B; Add C, A, B; Store C) each pass through the stages Fetch, Decode/Operand, Exec, Write.]
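The effect of data forwarding on the Load A / Load B / Add C, A, B / Store C sequence can be sketched with a toy timing model. This is a simplified illustration, not a model of any particular processor: it assumes a four-stage in-order pipeline (Fetch, Decode/Operand, Exec, Write), that every result is ready at the end of Exec, that without forwarding a value can be read only in the cycle after its Write stage, and that a stall simply delays the dependent instruction's fetch.

```python
# Each entry: (text, destination, sources). Names follow the slide's example.
PROGRAM = [
    ("Load A", "A", ()),
    ("Load B", "B", ()),
    ("Add C, A, B", "C", ("A", "B")),
    ("Store C", None, ("C",)),
]

STAGES = 4  # Fetch, Decode/Operand, Exec, Write (offsets 0..3 from fetch)


def total_cycles(program, forwarding):
    """Cycles needed to finish the program in the simplified pipeline."""
    producer_fetch = {}  # value name -> fetch cycle of the instruction producing it
    prev_fetch = -1
    for _text, dest, srcs in program:
        fetch = prev_fetch + 1  # at most one new instruction per cycle
        for src in srcs:
            if src in producer_fetch:
                p = producer_fetch[src]
                if forwarding:
                    # Producer's Exec (p+2) feeds consumer's Exec (fetch+2).
                    fetch = max(fetch, p + 1)
                else:
                    # Producer's Write (p+3) must precede consumer's operand
                    # read (fetch+1) by one full cycle.
                    fetch = max(fetch, p + 3)
        if dest is not None:
            producer_fetch[dest] = fetch
        prev_fetch = fetch
    return prev_fetch + STAGES


without = total_cycles(PROGRAM, forwarding=False)
with_fwd = total_cycles(PROGRAM, forwarding=True)
print(f"without forwarding: {without} cycles, with forwarding: {with_fwd} cycles")
```

In this model the Add waits for Load B and the Store waits for the Add when forwarding is off; with forwarding on, the four instructions finish in the ideal 4 + 3 cycles.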

Page 55: CEG 360/560; EE 451/651 Digital System Design Dr. Travis Doom, Assistant Professor Department of Computer Science and Engineering Wright State University


Pipeline Hazards (2)

– Control Dependency - occurs whenever a branch instruction is encountered; the address of the “next” instruction cannot be determined until the branch condition is evaluated during the execute phase.
    Branch Prediction - reduces delays due to control dependencies by making an educated guess and “annulling” the operations in the pipeline if the guess is incorrect.
        Most backward branches are “taken”; most forward branches are “not taken”.
        2-bit history (traditional technique): if the guess is wrong twice in a row, change the guess for that instruction in the future.
        Trace caches: proposed designs use instruction traces to keep track of past program flow. Some proposals use 50% of the chip area for this cache. Why?
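One common way to realize the 2-bit history idea is a saturating counter per branch. The sketch below is a minimal illustration under that assumption; the class name, the starting state, and the example loop are hypothetical and are not taken from the slides.

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, state=3):
        self.state = state  # 3 = strongly taken (assumed starting point)

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the actual outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes, predictor):
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


# Hypothetical backward loop branch: taken four times, then falls through,
# with the whole loop executed three times.
loop_outcomes = ([True] * 4 + [False]) * 3
misses = count_mispredictions(loop_outcomes, TwoBitPredictor())
print(f"{misses} mispredictions out of {len(loop_outcomes)} branches")
```

Because a single wrong guess at the loop exit only weakens the prediction, the branch is still predicted taken when the loop is entered again; the guess flips only after two wrong guesses in a row.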

[Figure: pipeline diagram for Instructions 1-5, each with the stages Fetch, Decode/Operand, Exec, Write. Labels: BRANCH, Guess, Guess, Guess, Sure!]

Page 56: CEG 360/560; EE 451/651 Digital System Design Dr. Travis Doom, Assistant Professor Department of Computer Science and Engineering Wright State University


A contemporary architecture

[Figure: system block diagram. Labels: Microprocessors; Front Side Bus (800 MHz); Back Side Bus; External Cache; Memory Chipset; Bridge; Main Memory; AGP; PCI (33 MHz); Graphics Adapter (non-AGP); Disk Interface (EIDE, uSCSI-2); Network Interface Card (NIC); Serial/Parallel I/O (keyboard, mouse, printer); Compatibility Bridge (South-side, ISA); ISA devices.]

Registers      1 ns
L1 On-chip     2 ns
L2 On-chip     4 ns
L3 Off-chip    10 ns
Memory         100 ns
Paged VM       > 1 ms!
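The access times above suggest why memory is organized hierarchically: if most accesses hit in the fast levels, the average access time stays close to the cache latency. The sketch below uses the latencies listed on this slide; the hit rates are hypothetical assumptions chosen only for illustration, each latency is treated as the total time when the data is found at that level, and paging is ignored.

```python
# (level name, latency in ns from the slide, ASSUMED hit rate at that level)
LEVELS = [
    ("L1 On-chip", 2.0, 0.90),
    ("L2 On-chip", 4.0, 0.80),
    ("L3 Off-chip", 10.0, 0.75),
    ("Memory", 100.0, 1.00),  # assume everything is resident in main memory
]


def average_access_time_ns(levels):
    """Weight each level's latency by the probability the access is served there."""
    reach = 1.0   # probability that an access gets as far as this level
    total = 0.0
    for _name, latency_ns, hit_rate in levels:
        total += reach * hit_rate * latency_ns
        reach *= 1.0 - hit_rate
    return total


average = average_access_time_ns(LEVELS)
print(f"average access time: {average:.2f} ns (main memory alone: 100 ns)")
```

With these assumed hit rates only 0.5% of accesses reach main memory, so the average stays under 3 ns even though main memory is 50 times slower than L1.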

Page 57: CEG 360/560; EE 451/651 Digital System Design Dr. Travis Doom, Assistant Professor Department of Computer Science and Engineering Wright State University


Contemporary Processors

Second-generation RISC processors have taken advantage of improved manufacturing processes to:
– 1. Make the clock rate faster.
– 2. Duplicate functional units to allow parallel execution of instructions. Superscalar!
– 3. Increase the number of stages in the pipeline. Superpipelined!
Modern RISC processors are now introducing:
– 1. More addressing modes
– 2. Specialized instructions (MMX)
– 3. Out-of-order speculative execution!
What will the next generation introduce?
– Speculative Data, Processor in Memory (PIM), Trace Caches, et al.
– ??