Chapter 11: CPU Structure and Function

Yonsei University


Page 1: Chapter 11 CPU Structure and Function

Yonsei University

Page 2: Contents

• Processor organization
• Register organization
• Instruction cycle
• The Pentium processor
• The PowerPC processor

Page 3: CPU Structures (Processor organization)

• Fetch instructions
  – The CPU reads an instruction from memory
• Interpret instructions
  – The instruction is decoded to determine what action is required
• Fetch data
  – Execution may require reading data from memory or an I/O module
• Process data
  – Execution may require performing an arithmetic or logical operation on data
• Write data
  – The result of execution may require writing data to memory or an I/O module

Page 4: CPU With The System Bus (Processor organization)

[Figure: CPU with the system bus]

Page 5: Internal Structure of the CPU (Processor organization)

[Figure: internal structure of the CPU]

Page 6: Registers (Register organization)

• User-visible registers
  – Enable the programmer to minimize main memory references
• Control and status registers
  – Enable the control unit to control the operation of the CPU
  – Enable OS programs to control the execution of programs

Page 7: User-Visible Registers (Register organization)

• General purpose
• Data
• Address
• Condition codes

Page 8: General-Purpose Registers (Register organization)

• Can be assigned to a variety of functions
• May be truly general purpose
• May be restricted
• May be used for data or addressing

Page 9: Data Registers (Register organization)

• Accumulator
• Used only to hold data
• Cannot be employed in the calculation of an operand address

Page 10: Address Registers (Register organization)

• Segment
• May be somewhat general purpose
• May be devoted to a particular addressing mode

Page 11: Examples of Address Registers (Register organization)

• Segment pointers
  – In a machine with segmented addressing, a segment register holds the address of the base of the segment
  – There may be multiple registers
• Index registers
  – Used for indexed addressing
  – May be autoindexed
• Stack pointer
  – If there is user-visible stack addressing, the stack is typically in memory and a dedicated register points to the top of the stack
  – This allows implicit addressing

Page 12: Design Issues (Register organization)

• Whether to use completely general-purpose registers or to specialize their use
• The number of registers, either general purpose or data plus address, to be provided
• Register length
  – Registers that hold addresses must be at least long enough to hold the largest address

Page 13: Condition Code Registers (Register organization)

• Individual bits set by the CPU hardware as the result of operations
  – e.g. the result of the last operation was zero
• Condition code bits are collected into one or more registers
• Usually form part of a control register
• Can be read (implicitly) by programs
  – e.g. jump if zero
• Generally cannot be altered directly by the programmer

Page 14: Control & Status Registers (Register organization)

• CPU registers that are employed to control the operation of the CPU
• Program Counter (PC)
  – Contains the address of the next instruction to be fetched
• Instruction Register (IR)
  – Contains the instruction most recently fetched
• Memory Address Register (MAR)
  – Contains the address of a location in memory
• Memory Buffer Register (MBR)
  – Contains a word of data to be written to memory or the word most recently read

Page 15: Program Status Word (Register organization)

• Status information
• Condition codes plus other status information

Page 16: Flags of the PSW (Register organization)

• Sign: contains the sign bit of the result of the last arithmetic operation
• Zero: set when the result is 0
• Carry: set if an operation resulted in a carry into or borrow out of a high-order bit
• Equal: set if a logical compare result is equality
• Overflow: used to indicate arithmetic overflow
• Interrupt enable/disable: used to enable or disable interrupts
• Supervisor: indicates whether the CPU is executing in supervisor or user mode
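As an illustration of how the sign, zero, carry and overflow flags might be derived, the sketch below adds two 8-bit values and reports the resulting flag bits. The function name `add8_flags` and the 8-bit word size are assumptions made for this example; they do not come from the slides.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, flags) as an ALU might set them."""
    a &= 0xFF
    b &= 0xFF
    raw = a + b
    result = raw & 0xFF
    flags = {
        "sign": (result >> 7) & 1,   # copy of the result's high-order bit
        "zero": int(result == 0),    # set when the result is 0
        "carry": int(raw > 0xFF),    # carry out of the high-order bit
        # Signed overflow: both operands have a sign that the result lacks.
        "overflow": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags


print(add8_flags(0x7F, 0x01))
print(add8_flags(0xFF, 0x01))
```

The first call overflows the signed range without producing a carry; the second produces a carry and a zero result without signed overflow.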

Page 17: Other Registers (Register organization)

• Pointer to a block of memory containing additional status information (process control block)
• Interrupt vector register
• System stack pointer
  – If a stack is used, a system stack pointer is needed
• Page table pointer
  – In a virtual memory system
• Registers for the control of I/O operations

Page 18: Design Issues (Register organization)

• Operating system support
  – Certain types of control information are of specific utility to the operating system
• Allocation of control information between registers and memory
  – It is common to dedicate the first few hundred or thousand words of memory to control purposes
  – How much control information should be in registers and how much in memory?

Page 19: MC68000 (Register organization)

[Figure: MC68000 register organization]

Page 20: 8086 (Register organization)

[Figure: 8086 register organization]

Page 21: 80386 - Pentium II (Register organization)

[Figure: 80386 to Pentium II register organization]

Page 22: Instruction Cycle (Instruction cycle)

[Figure: instruction cycle]

Page 23: Indirect Cycle (Instruction cycle)

• Executing an instruction may require memory accesses to fetch operands
• Indirect addressing requires additional memory accesses
• Can be thought of as an additional instruction subcycle

Page 24: Instruction Cycle (Instruction cycle)

[Figure: instruction cycle]

Page 25: Instruction Cycle with Indirect (Instruction cycle)

• Instruction fetch and instruction execution activities alternate
• After an instruction is fetched, it is examined to determine whether any indirect addressing is involved
  – If so, the required operands are fetched using indirect addressing
• Following execution, an interrupt may be processed before the next instruction fetch

Page 26: Instruction Cycle State Diagram (Instruction cycle)

[Figure: instruction cycle state diagram]

Page 27: Data Flow - Instruction Fetch (Instruction cycle)

• Depends on CPU design
• In general, for the fetch:
  – PC contains the address of the next instruction
  – Address is moved to MAR
  – Address is placed on the address bus
  – Control unit requests a memory read
  – Result is placed on the data bus, copied to MBR, then to IR
  – Meanwhile, PC is incremented by 1
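The fetch data flow can be sketched as a sequence of register transfers. This is a minimal illustration, assuming a word-addressed memory modelled as a Python dict and a CPU state held in a dict with keys `PC`, `MAR`, `MBR` and `IR`; the function name `fetch_cycle` is hypothetical.

```python
def fetch_cycle(cpu, memory):
    """Perform one instruction fetch as a series of register transfers."""
    cpu["MAR"] = cpu["PC"]           # address moved to MAR (and onto the address bus)
    cpu["MBR"] = memory[cpu["MAR"]]  # memory read: data bus contents copied to MBR
    cpu["PC"] += 1                   # meanwhile, PC is incremented by 1
    cpu["IR"] = cpu["MBR"]           # instruction copied from MBR to IR
    return cpu


cpu = {"PC": 100, "MAR": 0, "MBR": 0, "IR": 0}
memory = {100: 0x1234, 101: 0x5678}
fetch_cycle(cpu, memory)
print(cpu)
```

In hardware the PC increment overlaps with the memory read; the sequential statements here only fix an order for the simulation.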

Page 28: Data Flow - Fetch Cycle (Instruction cycle)

[Figure: data flow, fetch cycle]

Page 29: Data Flow - Data Fetch (Instruction cycle)

• IR is examined
• If indirect addressing is used, an indirect cycle is performed
  – Rightmost N bits of MBR are transferred to MAR
  – Control unit requests a memory read
  – Result (address of operand) is moved to MBR
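The indirect cycle can be sketched in the same register-transfer style. The address-field width `ADDRESS_BITS = 12` and the function name `indirect_cycle` are assumptions chosen for the example; the slides only say "rightmost N bits".

```python
ADDRESS_BITS = 12  # assumed width N of the address field


def indirect_cycle(cpu, memory):
    """Replace the address reference in MBR with the operand address it points to."""
    address_mask = (1 << ADDRESS_BITS) - 1
    cpu["MAR"] = cpu["MBR"] & address_mask  # rightmost N bits of MBR -> MAR
    cpu["MBR"] = memory[cpu["MAR"]]         # memory read: operand address -> MBR
    return cpu


cpu = {"MAR": 0, "MBR": 0x1234}  # hypothetical instruction word: opcode 0x1, address 0x234
memory = {0x234: 0x0456}         # location 0x234 holds the operand's address
indirect_cycle(cpu, memory)
print(hex(cpu["MAR"]), hex(cpu["MBR"]))
```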

Page 30: Data Flow - Indirect Cycle (Instruction cycle)

[Figure: data flow, indirect cycle]

Page 31: Data Flow - Execute (Instruction cycle)

• May take many forms
• Depends on the instruction being executed
• May include
  – Memory read/write
  – Input/output
  – Register transfers
  – ALU operations

Page 32: Data Flow - Interrupt (Instruction cycle)

• Simple and predictable
• Current PC is saved to allow resumption after the interrupt
• Contents of PC are copied to MBR
• Special memory location (e.g. stack pointer) is loaded to MAR
• MBR is written to memory
• PC is loaded with the address of the interrupt handling routine
• Next instruction (first of the interrupt handler) can be fetched
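A minimal sketch of the interrupt cycle, again as register transfers. It assumes the save location comes from a stack pointer `SP` and that the stack grows toward lower addresses; both choices, and the name `interrupt_cycle`, are illustrative assumptions rather than details given in the slides.

```python
def interrupt_cycle(cpu, memory, handler_address):
    """Save the current PC and redirect execution to an interrupt handler."""
    cpu["MBR"] = cpu["PC"]           # contents of PC copied to MBR
    cpu["SP"] -= 1                   # assumed: stack grows toward lower addresses
    cpu["MAR"] = cpu["SP"]           # save location (here, the stack pointer) -> MAR
    memory[cpu["MAR"]] = cpu["MBR"]  # MBR written to memory
    cpu["PC"] = handler_address      # PC loaded with the handler's address
    return cpu


cpu = {"PC": 105, "SP": 500, "MAR": 0, "MBR": 0}
memory = {}
interrupt_cycle(cpu, memory, handler_address=900)
print(cpu["PC"], memory)
```

After the call, the next fetch cycle reads the first instruction of the handler, and the saved PC is available for resumption.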

Page 33: Data Flow - Interrupt Cycle (Instruction cycle)

[Figure: data flow, interrupt cycle]

Page 34: Two-Stage Instruction Pipeline (Instruction pipelining)

[Figure: two-stage instruction pipeline]

Page 35: Instruction Pipelining - Prefetch (Instruction pipelining)

• Fetch accesses main memory
• Execution usually does not access main memory
• The next instruction can be fetched during execution of the current instruction
• Called instruction prefetch or fetch overlap

Page 36: Improved Performance (Instruction pipelining)

• Doubling of the execution rate is unlikely:
  – Fetch is usually shorter than execution
    • Prefetch more than one instruction?
  – Any jump or branch means that prefetched instructions are not the required instructions
    • To reduce the time loss, when a conditional branch instruction is passed from the fetch stage to the execute stage, the fetch stage fetches the next instruction in memory after the branch instruction
    • If the branch is not taken, no time is lost
    • Otherwise, the fetched instruction must be discarded and a new instruction fetched
• Add more stages to improve performance

Page 37: Pipelining (Instruction pipelining)

• Fetch instruction (FI)
  – Read the next expected instruction into a buffer
• Decode instruction (DI)
  – Determine the opcode and the operand specifiers
• Calculate operands (CO)
  – Calculate the effective address of each source operand
• Fetch operands (FO)
  – Fetch each operand from memory
• Execute instruction (EI)
  – Perform the operation and store the result
• Write operand (WR)
  – Store the result in memory

Page 38: Timing of Pipeline (Instruction pipelining)

[Figure: timing diagram for pipeline operation]

Page 39: Timing of Pipeline (Instruction pipelining)

• Each instruction is assumed to go through all six stages of the pipeline
  – This will not always be the case
• All stages are assumed to be performed in parallel
  – In particular, it is assumed that there are no memory conflicts
  – The desired value may be in cache, in which case a memory conflict will not slow down the pipeline
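Under these idealized assumptions (every instruction passes through all six stages, each stage takes one time unit, no branches and no memory conflicts), the timing diagram can be generated mechanically. The sketch below is an illustration only; the function name `pipeline_timing` and the choice of nine instructions are assumptions made for the example.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WR"]


def pipeline_timing(n_instructions, stages=STAGES):
    """Return rows[i][t]: the stage instruction i occupies in time unit t ('' if none)."""
    k = len(stages)
    total_time = k + (n_instructions - 1)
    rows = []
    for i in range(n_instructions):
        row = [""] * total_time
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i enters stage s in time unit i + s
        rows.append(row)
    return rows


for i, row in enumerate(pipeline_timing(9), start=1):
    print(f"I{i:<2}", " ".join(f"{cell:>2}" for cell in row))
```

Each row is shifted one time unit to the right of the previous one, so nine instructions complete in 6 + 8 = 14 time units instead of the 54 that strictly sequential execution would need.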

Page 40: Limiting Factors (Instruction pipelining)

• Memory conflicts arise if the cache is not used
• Unequal duration of stages
  – There will be some waiting involved at various stages
• A conditional branch instruction can invalidate several instruction fetches
  – A similar unpredictable event is an interrupt

Page 41: Conditional Branch in a Pipeline (Instruction pipelining)

[Figure: effect of a conditional branch on instruction pipeline operation]

Page 42: Limiting Factors (Instruction pipelining)

• The CO stage may depend on the contents of a register that could be altered by a previous instruction that is still in the pipeline
  – Other such register and memory conflicts could occur

Page 43: 6-Stage CPU Instruction Pipeline (Instruction pipelining)

[Figure: six-stage CPU instruction pipeline]

Page 44: Speed & The Number of Stages (Instruction pipelining)

• At each stage, overhead is involved in moving data from buffer to buffer and in performing various preparation and delivery functions
  – This overhead can lengthen the total execution time of a single instruction
  – This is significant when sequential instructions are logically dependent, either through heavy use of branching or through memory access dependencies
• The amount of control logic increases enormously with the number of stages
  – The logic controlling the gating between stages is more complex than the stages being controlled

Page 45: Pipeline Performance (Instruction pipelining)

• Cycle time
  – Time needed to advance a set of instructions one stage through the pipeline

$$\tau = \max_{1 \le i \le k}[\tau_i] + d = \tau_m + d$$

  – $\tau_i$ = time delay of stage $i$
  – $\tau_m$ = maximum stage delay
  – $k$ = number of stages in the instruction pipeline
  – $d$ = time delay of a latch

Page 46: Pipeline Performance (Instruction pipelining)

• In general, the time delay $d$ is equivalent to a clock pulse and $\tau_m \gg d$
• Suppose that $n$ instructions are processed
  – Total time required

$$T_k = [k + (n - 1)]\tau$$

  – Speedup factor

$$S_k = \frac{T_1}{T_k} = \frac{nk\tau}{[k + (n - 1)]\tau} = \frac{nk}{k + (n - 1)}$$
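To make the cycle-time, total-time and speedup relations on these two slides concrete, here is a small numerical sketch. It assumes the relations cycle time = maximum stage delay + latch delay, total time = [k + (n - 1)] x cycle time, and speedup = nk / (k + (n - 1)), where k is the number of stages and n the number of instructions. The function names and sample values are invented for the illustration.

```python
def cycle_time(stage_delays, latch_delay):
    """Cycle time: the maximum stage delay plus the latch delay."""
    return max(stage_delays) + latch_delay


def total_time(k, n, tau):
    """Time for a k-stage pipeline with cycle time tau to process n instructions."""
    return (k + (n - 1)) * tau


def speedup(k, n):
    """Speedup of a k-stage pipeline over unpipelined execution of n instructions."""
    return (n * k) / (k + (n - 1))


tau = cycle_time([1.0, 2.0, 3.0], latch_delay=0.5)
print(tau, total_time(6, 9, tau))
for n in (1, 9, 100, 10_000):
    print(n, round(speedup(6, n), 3))
```

As n grows, the speedup approaches k, which is why the benefit of a longer pipeline only shows up on long runs of instructions without disruptions.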

Page 47: Speedup Factors (Instruction pipelining)

• Speedup as a function of the number of instructions

[Figure: speedup factor versus number of instructions]

Page 48: Speedup Factors (Instruction pipelining)

• Speedup as a function of the number of stages

[Figure: speedup factor versus number of stages]

Page 49: Dealing with Branches (Instruction pipelining)

• Multiple streams
• Prefetch branch target
• Loop buffer
• Branch prediction
• Delayed branching

Page 50: Multiple Streams (Instruction pipelining)

• Replicate the initial portions of the pipeline and allow the pipeline to fetch both instructions, making use of two streams
• Two problems with this approach
  – Contention delays for access to the registers and to memory
  – Additional branch instructions may enter the pipeline before the original branch decision is resolved
    • Each such instruction needs an additional stream
• Despite these drawbacks, this strategy can improve performance

Page 51: Prefetch Branch Target (Instruction pipelining)

• The target of the branch is prefetched in addition to the instruction following the branch
• The target is kept until the branch is executed
• If the branch is taken, the target has already been prefetched
• Used by the IBM 360/91

Page 52: Loop Buffer (Instruction pipelining)

• A small, very-high-speed memory maintained by the fetch stage of the pipeline and containing the n most recently fetched instructions, in sequence
• If a branch is to be taken, the hardware first checks whether the branch target is within the buffer
• If so, the next instruction is fetched from the buffer
• Very good for small loops or jumps

Page 53: Benefits of Loop Buffer (Instruction pipelining)

• With the use of prefetching, the loop buffer will contain some instructions sequentially ahead of the current instruction fetch address
• If a branch occurs to a target a few locations ahead of the address of the branch instruction, the target will already be in the buffer
  – Useful for the rather common occurrence of IF-THEN and IF-THEN-ELSE sequences
• Well suited to dealing with loops, or iterations
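The lookup behaviour of a loop buffer can be sketched with a bounded queue. This is a simplified software model, not a description of any particular hardware: the class name `LoopBuffer`, its method names and the linear search are all assumptions made for the illustration.

```python
from collections import deque


class LoopBuffer:
    """Keeps the n most recently fetched instructions, in sequence."""

    def __init__(self, size):
        # A deque with maxlen silently drops the oldest entry when full.
        self.entries = deque(maxlen=size)

    def record_fetch(self, address, instruction):
        self.entries.append((address, instruction))

    def lookup(self, target_address):
        """Return the buffered instruction at target_address, or None on a miss."""
        for address, instruction in self.entries:
            if address == target_address:
                return instruction
        return None


buf = LoopBuffer(size=4)
for addr in range(100, 106):
    buf.record_fetch(addr, f"instr@{addr}")
print(buf.lookup(103))  # branch target still in the buffer
print(buf.lookup(100))  # branch target already evicted
```

A backward branch to a target within the last n fetched instructions, as in a short loop, hits in the buffer and avoids a memory access.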

Page 54: Loop Buffer (Instruction pipelining)

[Figure: loop buffer]

Page 55: Branch Prediction (Instruction pipelining)

• Static approaches
  – Do not depend on the execution history up to the time of the conditional branch instruction
    • Predict never taken
    • Predict always taken
    • Predict by opcode
• Dynamic approaches
  – Depend on the execution history
  – Improve the accuracy of prediction by recording the history of conditional branch instructions
    • Taken/not taken switch
    • Branch history table

Page 56: Static Approaches (Instruction pipelining)

• Predict never taken
  – Assume that the jump will not happen
  – Always fetch the next instruction
  – Used by the 68020 and VAX 11/780
  – VAX will not prefetch after a branch if a page fault would result (O/S v CPU design)
• Predict always taken
  – Conditional branches are taken more than 50% of the time
  – Assume that the jump will happen
  – Always fetch the target instruction
• Predict by opcode
  – Some instructions are more likely to result in a jump than others

Page 57: Dynamic Approaches (Instruction pipelining)

• Taken/not taken switch
  – History bits: one or more bits can be associated with each conditional branch instruction to reflect the recent history of the instruction
  – History bits are kept in temporary high-speed storage
    • Associate some history bits with any conditional branch instruction that is in a cache (when the instruction is replaced in the cache, its history is lost)
    • Or maintain a small table for recently executed branch instructions, with one or more bits in each entry

Page 58: Dynamic Approaches (Instruction pipelining)

• Taken/not taken switch (with a single bit)
  – With a single bit, all that is recorded is whether the last execution of this instruction resulted in a branch or not
    • A shortcoming appears in the case of a conditional branch instruction that is almost always taken, such as a loop instruction
    • An error in prediction will occur twice for each use of the loop: once on entering the loop and once on exiting
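The two-errors-per-loop behaviour of a single history bit can be checked with a short simulation. The function name `one_bit_mispredictions`, the initial prediction, and the sample branch trace are assumptions chosen for this sketch.

```python
def one_bit_mispredictions(outcomes, initial_prediction=False):
    """Count mispredictions when the prediction is simply the last outcome."""
    prediction = initial_prediction
    misses = 0
    for taken in outcomes:
        if prediction != taken:
            misses += 1
        prediction = taken  # the single bit remembers only the last outcome
    return misses


# A loop-closing branch: taken four times, then not taken on loop exit.
loop_run = [True] * 4 + [False]
print(one_bit_mispredictions(loop_run * 2))  # the loop is used twice
```

Each use of the loop costs one miss on the first iteration (the bit still says "not taken" from the previous exit) and one on the exit.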

Page 59: Dynamic Approaches (Instruction pipelining)

• Taken/not taken switch (with two bits)
  – With two bits, the results of the last two instances of the execution of the associated instruction can be recorded, or a state can be recorded in some other fashion
    • If the last two branches of the given instruction have taken the same path, the prediction is to take that path again
    • If the prediction is wrong, it remains the same the next time the instruction is encountered
    • If the prediction is wrong again, the next prediction will be to select the opposite path
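One way to encode the two-bit rule (change the prediction only after two consecutive wrong predictions) is with a prediction bit plus a "wrong once" bit. This encoding, the function name `two_bit_mispredictions`, and the sample trace are assumptions made for the sketch; other state assignments are possible.

```python
def two_bit_mispredictions(outcomes, initial_prediction=True):
    """Count mispredictions when the prediction flips only after two misses in a row."""
    prediction = initial_prediction
    wrong_once = False  # set after one miss that has not yet flipped the prediction
    misses = 0
    for taken in outcomes:
        if taken == prediction:
            wrong_once = False
        else:
            misses += 1
            if wrong_once:
                prediction = taken  # second consecutive miss: switch to the other path
                wrong_once = False
            else:
                wrong_once = True   # first miss: keep the same prediction next time
    return misses


# A loop-closing branch: taken four times, then not taken on loop exit.
loop_run = [True] * 4 + [False]
print(two_bit_mispredictions(loop_run * 2))  # the loop is used twice
```

For a branch that is almost always taken, only the loop exit is mispredicted, so the cost drops to one miss per use of the loop.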

Page 60: Branch Prediction State Diagram (Instruction pipelining)

[Figure: branch prediction state diagram]

Page 61: Dynamic Approaches (Instruction pipelining)

• Drawback of the use of history bits
  – If the decision is made to take the branch, the target instruction cannot be fetched until the target address is decoded
  – Greater efficiency could be achieved if the instruction fetch could be initiated as soon as the branch decision is made

Page 62: Dynamic Approaches (Instruction pipelining)

• Branch history table
  – A small cache memory associated with the instruction fetch stage of the pipeline
  – Each entry in the table consists of
    • The address of a branch instruction
    • Some number of history bits that record the state of use of that instruction
    • Information about the target instruction (the target address or the target instruction itself)
  – Storing the target instruction itself yields a shorter instruction fetch time, but a larger table, compared with storing the target address
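A branch history table can be modelled as a lookup keyed by branch address. The sketch below is deliberately simplified and rests on stated assumptions: it stores the target address (not the target instruction), keeps a single history bit per entry, and uses an unbounded dict where real hardware would use a small cache. The class and method names are hypothetical.

```python
class BranchHistoryTable:
    """Maps a branch instruction's address to its last outcome and target address."""

    def __init__(self):
        self.entries = {}  # branch address -> {"taken": bool, "target": int}

    def predict_next_fetch(self, branch_address, fall_through_address):
        """Choose the address to fetch next, before the branch is resolved."""
        entry = self.entries.get(branch_address)
        if entry is not None and entry["taken"]:
            return entry["target"]   # predicted taken: fetch from the stored target
        return fall_through_address  # no entry or predicted not taken

    def update(self, branch_address, taken, target_address):
        """Record the actual outcome once the branch has executed."""
        self.entries[branch_address] = {"taken": taken, "target": target_address}


bht = BranchHistoryTable()
print(bht.predict_next_fetch(200, 201))  # unknown branch: fall through
bht.update(200, taken=True, target_address=150)
print(bht.predict_next_fetch(200, 201))  # now predicted taken
```

Because the target address is stored in the table, the fetch from the predicted target can start without waiting for the branch instruction to be decoded.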

Page 63: Predict Never Taken Strategy (Instruction pipelining)

[Figure: predict never taken strategy]

Page 64: Branch History Table (Instruction pipelining)

[Figure: branch history table strategy]

Page 65: Delayed Branch (Instruction pipelining)

• Pipeline performance can be improved by automatically rearranging instructions within a program so that branch instructions occur later than actually desired

Page 66: Intel 80486 Pipelining (Instruction pipelining)

• Fetch: instructions are fetched from the cache or from external memory and placed into one of the two 16-byte prefetch buffers
• Decode stage 1 (D1): all opcode and addressing-mode information is decoded
• Decode stage 2 (D2): expands each opcode into control signals for the ALU
• Execute (EX): includes ALU operations, cache access and register update
• Write back (WB): if needed, updates registers and status flags modified during the preceding execute stage

Page 67: No Data Delay in the Pipeline (Instruction pipelining)

• No delay is introduced into the pipeline when a memory access is required

Page 68: Pointer Load Delay (Instruction pipelining)

• There is a delay for values used to compute a memory address

Page 69: Pointer Load Delay (Instruction pipelining)

• The processor accesses the cache in the EX stage of the first instruction and stores the value retrieved in the register during the WB stage
• The next instruction needs that register in the D2 stage
• When the D2 stage lines up with the WB stage of the previous instruction, bypass signal paths allow the D2 stage to have access to the same data being used by the WB stage for writing, saving one pipeline stage

Page 70: Branch Instruction Timing (Instruction pipelining)

• Assume that the branch is taken

Page 71: Branch Instruction Timing (Instruction pipelining)

• The compare instruction updates condition codes in the WB stage, and bypass paths make this available to the EX stage of the jump instruction at the same time
• In parallel, the processor runs a speculative fetch cycle to the target of the jump during the EX stage of the jump instruction
• If the processor determines a false branch condition, it discards this prefetch and continues execution with the next sequential instruction

Page 72: Pentium II Processor Registers (Pentium processor)

[Figure: Pentium II processor registers]

Page 73: EFLAGS Register (Pentium processor)

[Figure: EFLAGS register]

Page 74: EFLAGS Register (Pentium processor)

• 6 condition codes
• 7 flags that may be referred to as control bits
  – Trap flag (TF)
  – Interrupt enable flag (IF)
  – Direction flag (DF)
  – I/O privilege flag (IOPL)
  – Resume flag (RF)
  – Alignment check (AC)
  – Identification flag (ID)

Page 75: Control Registers (Pentium processor)

[Figure: control registers]

Page 76: Control Registers (Pentium processor)

• Flags
  – Protection enable (PE)
  – Monitor coprocessor (MP)
  – Emulation (EM)
  – Task switched (TS)
  – Extension type (ET)
  – Numeric error (NE)
  – Write protect (WP)
  – Alignment mask (AM)
  – Not write through (NW)
  – Cache disable (CD)
  – Paging (PG)

Page 77: Control Register 4 (CR4) (Pentium processor)

• Nine additional control bits
  – Virtual-8086 mode extension
  – Protected-mode virtual interrupts
  – Time stamp disable
  – Debugging extensions
  – Page size extensions
  – Physical address extension
  – Machine check enable
  – Page global enable
  – Performance counter enable

Page 78: MMX Registers (Pentium processor)

[Figure: mapping of MMX registers to floating-point registers]

Page 79: Features of MMX Registers (Pentium processor)

• For MMX operations, the floating-point registers are accessed directly
• The first time that an MMX instruction is executed after any floating-point operations, the FP tag word is marked valid
• The EMMS (Empty MMX State) instruction sets bits of the FP tag word to indicate that all registers are empty
  – The programmer inserts this instruction at the end of an MMX code block so that subsequent FP operations function properly
• When a value is written to an MMX register, bits [79:64] of the corresponding FP register are set to all ones

Page 80: Interrupt Processing (Pentium processor)

• Interrupts and exceptions
• Interrupt vector table
• Interrupt handling

Page 81: Interrupts (Pentium processor)

• Generated by a signal from hardware
• May occur at random times during the execution of a program
• Two sources of interrupts
  – Maskable interrupts
    • The processor does not recognize a maskable interrupt unless the interrupt enable flag (IF) is set
  – Nonmaskable interrupts
    • Recognition of such interrupts cannot be prevented

Page 82: Exceptions (Pentium processor)

• Generated from software
• Provoked by the execution of an instruction
• Two sources of exceptions
  – Processor-detected exceptions
    • Result when the processor encounters an error while attempting to execute an instruction
  – Programmed exceptions
    • These are instructions that generate an exception

Page 83: Exception and Interrupt Vector (Pentium processor)

[Table: exception and interrupt vector assignments]

Page 84: Interrupt Vector Table (Pentium processor)

• Every type of interrupt is assigned a number
  – This number is used to index into the interrupt vector table
• If more than one exception or interrupt is pending, the processor services them in a predictable order
• The order of priority
  – Class 1: Traps on the previous instruction
  – Class 2: External interrupts
  – Class 3: Faults from fetching the next instruction
  – Class 4: Faults from decoding the next instruction
  – Class 5: Faults on executing an instruction
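The priority rule can be illustrated with a small selection routine that picks the highest-priority class among the pending events. The string labels and the function name `next_to_service` are invented for this sketch; only the ordering of the five classes comes from the slide.

```python
# Highest priority first, following the class order listed on the slide.
PRIORITY_CLASSES = [
    "trap on previous instruction",
    "external interrupt",
    "fault fetching next instruction",
    "fault decoding next instruction",
    "fault executing instruction",
]


def next_to_service(pending):
    """Return the highest-priority pending class, or None if nothing is pending."""
    for event_class in PRIORITY_CLASSES:
        if event_class in pending:
            return event_class
    return None


pending = {"fault executing instruction", "external interrupt"}
print(next_to_service(pending))
```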

Page 85: Interrupt Handling (Pentium processor)

• When an interrupt occurs and is recognized by the processor
  – If the transfer involves a change of privilege level, the current stack segment register and the current extended stack pointer register are pushed onto the stack
  – The current value of the EFLAGS register is pushed onto the stack
  – Both the interrupt and trap flags are cleared
  – The current code segment pointer and the current instruction pointer are pushed onto the stack
  – If the interrupt is accompanied by an error code, the error code is pushed onto the stack
  – The interrupt vector contents are fetched and loaded into the CS and IP or EIP registers
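The sequence above can be traced with a small model in which the stack is a Python list and the CPU state is a dict. This is a sketch under stated assumptions: it ignores stack switching, descriptor lookups and operand sizes, and the function name `enter_interrupt` and the vector entry format `(cs, eip)` are hypothetical. The bit positions of IF (bit 9) and TF (bit 8) in EFLAGS follow the x86 architecture.

```python
IF_MASK = 1 << 9  # interrupt enable flag
TF_MASK = 1 << 8  # trap flag


def enter_interrupt(cpu, stack, vector_entry, privilege_change=False, error_code=None):
    """Apply the interrupt-entry steps in order; the list `stack` grows by appending."""
    if privilege_change:
        stack.append(cpu["SS"])   # current stack segment register
        stack.append(cpu["ESP"])  # current extended stack pointer
    stack.append(cpu["EFLAGS"])   # EFLAGS is saved before its flags are cleared
    cpu["EFLAGS"] &= ~(IF_MASK | TF_MASK)
    stack.append(cpu["CS"])       # current code segment pointer
    stack.append(cpu["EIP"])      # current instruction pointer
    if error_code is not None:
        stack.append(error_code)
    cpu["CS"], cpu["EIP"] = vector_entry  # handler location from the vector table
    return cpu


cpu = {"SS": 0x10, "ESP": 0x1000, "EFLAGS": 0x302, "CS": 0x08, "EIP": 0x4000}
stack = []
enter_interrupt(cpu, stack, vector_entry=(0x08, 0x9000), error_code=0)
print([hex(v) for v in stack], hex(cpu["EFLAGS"]), hex(cpu["EIP"]))
```

Saving EFLAGS before clearing IF and TF is what lets the return from the handler restore the interrupted program's original flag state.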

Page 86: PowerPC G3 Block Diagram (PowerPC processor)

[Figure: PowerPC G3 block diagram]

Page 87: User-Visible Registers (PowerPC processor)

[Figure: PowerPC user-visible registers]

Page 88: Register Organization (PowerPC processor)

• Fixed-point unit
  – General
    • 32 64-bit general-purpose registers
    • These may be used to load, store and manipulate data operands, and may also be used for register indirect addressing
  – Exception register
    • Includes 3 bits that report exceptions in integer arithmetic operations

Page 89: PowerPC Register Formats (PowerPC processor)

[Figure: PowerPC register formats]

Page 90: Register Organization (PowerPC processor)

• Floating-point unit contains additional user-visible registers
  – General
    • 32 64-bit general-purpose registers
    • These may be used for all floating-point operations
  – Floating-point status and control register
    • Includes bits that control the operation of the floating-point unit and bits that record the status resulting from floating-point operations

Page 91: FP Status and Control Register (PowerPC processor)

[Figure: floating-point status and control register]

Page 92: Register Organization (PowerPC processor)

• Branch processing unit
  – Condition register
    • 8 4-bit condition code fields
  – Link register
    • Can be used in a conditional branch instruction for indirect addressing of the target address
    • Also used for call/return behavior
  – Count register
    • Can be used to control an iteration loop

Page 93: Condition Register (PowerPC processor)

• Interpretation of bits in the condition register

[Table: interpretation of bits in the condition register]

Page 94: PowerPC Interrupt Table (PowerPC processor)

[Table: PowerPC interrupt table]

Page 95: Interrupt Processing (PowerPC processor)

• Machine state register
  – Fundamental to the interruption of a program is the ability to recover the state of the processor at the time of the interrupt

Page 96: Machine State Register (PowerPC processor)

[Figure: machine state register]

Page 97: Interrupt Handling (PowerPC processor)

• The processor places the address of the instruction to be executed next in the Save/Restore Register 0 (SRR0)
• The processor copies machine state information from the MSR to the SRR1
• The MSR is set to a hardware-defined value specific to the interrupt type
• The processor transfers control to the appropriate interrupt handler
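The four steps can be traced with a small state model. This is a simplified sketch: it copies the whole MSR into SRR1 (the slide says only that machine state information is copied), and the function name `powerpc_interrupt`, the key `PC`, and the sample register values are assumptions made for the illustration.

```python
def powerpc_interrupt(cpu, next_instruction_address, msr_for_interrupt, handler_address):
    """Apply the four interrupt-entry steps to a dict-based CPU state."""
    cpu["SRR0"] = next_instruction_address  # where execution resumes afterwards
    cpu["SRR1"] = cpu["MSR"]                # simplification: whole MSR is saved
    cpu["MSR"] = msr_for_interrupt          # hardware-defined value for this interrupt type
    cpu["PC"] = handler_address             # control transfers to the handler
    return cpu


cpu = {"PC": 0x2000, "MSR": 0x8000, "SRR0": 0, "SRR1": 0}
powerpc_interrupt(cpu, next_instruction_address=0x2004,
                  msr_for_interrupt=0x0000, handler_address=0x0500)
print({name: hex(value) for name, value in cpu.items()})
```

With SRR0 and SRR1 holding the resume address and the saved machine state, the handler has what it needs to restore the interrupted program when it finishes.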