Implementing Precise Interrupts in Pipelined
Processors
James E. Smith, Andrew R. Pleszkun
Presented by: Ravikumar
Source: http://www.cse.msu.edu/~kangfeng/precise_interrupt.ppt
What will be covered?
Interrupts in pipelined processors
Methods to implement precise interrupts in pipelined processors
Performance evaluation of those methods
Extensions
Architectural solutions
Interrupt
Interrupt: Stop and resume.
Precise interrupt:
Imprecise interrupt:
Types of interrupts
I/O device request
Invoking an operating system service from a user program
Tracing instruction execution
Breakpoint (programmer-requested interrupt)
Integer arithmetic overflow
FP arithmetic anomaly
Page fault
Misaligned memory access
Memory protection violation
Using an undefined or unimplemented instruction
Hardware malfunctions
Power failure
Example code
Statement                     Comments                 Execution time
0        R2 <- 0              Init. loop index
1        R0 <- 0              Init. loop count
2        R5 <- 1              Loop inc. value
3        R7 <- 100            Maximum loop count
4  Loop: R1 <- (R2 + A)       Load A(I)                11 clock cycles
5        R3 <- (R2 + B)       Load B(I)                11 clock cycles
6        R4 <- R1 +f R3       Floating add             6 clock cycles
7        R0 <- R0 + R5        Inc. loop count          2 clock cycles
8        (R0 + C) <- R4       Store C(I)
9        R2 <- R2 + R5        Inc. loop index          2 clock cycles
10       P = Loop: R0 != R7   Cond. branch not equal
Interrupt in sequential model processors
pc=5,R1,R2,R0,R5,R7
pc=6,R3,R1,R2,R0,R5,R7
pc=7,R4,R3,R1,R2,R0,R5,R7 (interrupt occurs)
1. Keep pc=7,R4,R3,R1,R2,R0,R5,R7
2. Program suspended
Interrupt program running
Interrupt program stop
1. Restore pc=7,R4,R3,R1,R2,R0,R5,R7
2. Program resume
pc=8,R4,R3,R1,R2,R0,R5,R7
4 R1 <- (R2+A)
5 R3 <- (R2+B)
6 R4 <- R1 +f R3
110 R2 <- 44
… …
150 R0 <- 100
7 R0 <- R0+R5
In sequential model processors, the interrupt is precise. This guarantees that the suspended program can be resumed.
Interrupt in pipelined processors
pc=5,R1,R2,R0,R5,R7
pc=6,R3,R1,R2,R0,R5,R7
Interrupt occurs
1. Keep pc=8,R3,R1,R2,R0,R5,R7
2. Program suspended
Interrupt program running
Interrupt program stop
1. Restore pc=8,R3,R1,R2,R0,R5,R7
2. Program resume
pc=8,R3,R1,R2,R0,R5,R7
4 R1 <- (R2+A)
5 R3 <- (R2+B)
110 R2 <- 44
… …
150 R0 <- 100
6 R4 <- R1 +f R3
7 R0 <- R0+R5
8 (R0+C) <- R4 (R4 isn't available)
9 R2 <- R2 + R5
In pipelined processors, the interrupt can be imprecise. An imprecise interrupt does not guarantee that the suspended program can be resumed.
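Why the pipelined case can go wrong is easy to see with a toy timing model. The sketch below is an illustration only, not something from the slides: `completion_order` is a hypothetical helper, and it assumes one instruction issues per clock cycle and writes its result after the latency listed in the example code.

```python
# Toy model (hypothetical, not from the slides): one instruction issues per
# clock cycle and writes its result `latency` cycles after it issues.
def completion_order(program):
    """Return statement numbers ordered by the cycle their result is written."""
    finish = []
    for issue_cycle, (stmt, latency) in enumerate(program):
        finish.append((issue_cycle + latency, stmt))
    return [stmt for _, stmt in sorted(finish)]

# Statements 6 and 7 of the example loop, with latencies from the table.
program = [(6, 6),   # R4 <- R1 +f R3, floating add, 6 cycles
           (7, 2)]   # R0 <- R0 + R5, integer add, 2 cycles

print(completion_order(program))  # [7, 6]
```

Statement 7 updates R0 before statement 6 has written R4, so if the floating add raises an exception, the saved state already contains the effect of a later instruction.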
Preliminaries
Model architecture
Register-register architecture
Load: Ri = (Rj + disp.)
Store: (Ri + disp.) = Rj
Function: Ri = Rj op Rk, or Ri = op Rk
Condition: P = disp : Ri op Rk
Process state
General purpose registers
Main memory
Program counter (PC)
Interrupts Prior to Instruction Issue
The interrupt occurs before an instruction is issued. Instruction issue is halted, and the processor waits until all previously issued instructions complete.
Precise interrupt methods in pipelined processors
In-Order Instruction Completion
Reorder Buffer
Reorder Buffer with Bypass Paths
History Buffer
Future File
In-Order Instruction Completion
Instructions modify the process state only when all previously issued instructions are known to be free of exception conditions.
In-Order Instruction Completion- result shift register
In-Order Instruction Completion- result shift register (cont’)
In-Order Instruction Completion- process state modification
Registers Main memory Program Counter
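The slide text for the result shift register is missing, so the following is a minimal sketch under stated assumptions rather than the presenters' design. It assumes that an instruction with latency i reserves stage i, and that in-order completion is enforced by also claiming every free earlier stage. The class name, method names and the "(blocked)" marker are hypothetical.

```python
class ResultShiftRegister:
    """Stage i holds the instruction whose result arrives in i cycles."""

    def __init__(self, stages):
        self.slots = [None] * (stages + 1)   # index 0 unused; stages 1..N

    def try_issue(self, name, latency):
        """Reserve stage `latency`; return False if the instruction must wait."""
        if self.slots[latency] is not None:
            return False                     # stage already reserved
        # Assumed in-order rule: claim every free earlier stage so that a
        # later, faster instruction cannot complete first.
        for i in range(1, latency):
            if self.slots[i] is None:
                self.slots[i] = "(blocked)"
        self.slots[latency] = name
        return True

    def tick(self):
        """Advance one cycle; return the instruction completing now, if any."""
        done = self.slots[1]
        self.slots = [None] + self.slots[2:] + [None]
        return None if done == "(blocked)" else done


rsr = ResultShiftRegister(11)
rsr.try_issue("fadd", 6)            # floating add, 6 cycles
rsr.tick()
print(rsr.try_issue("iadd", 2))     # False: stage 2 is blocked behind fadd
```

In this model the two-cycle integer add has to wait in the issue stage until the floating add is one cycle from completing, which is the hold-up that the out-of-order completion methods try to avoid.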
Out-of-order instruction completion methods
Limitation of in-order completion: fast instructions may be held up even when there is no dependency, and they in turn block other instructions.
6. R4 <- R1 +f R3 Floating add 6 clock periods
7. R0 <- R0 + R5 Inc. loop count 2 clock periods
Methods that allow out-of-order completion: basic reorder buffer, reorder buffer with bypass paths, history buffer, future file.
Basic reorder buffer method: Organization
Reorder buffer: separates the completion of instructions from instruction commit.
Out-of-order completion.
In-order commit.
The reorder buffer is used to rearrange instructions before they commit.
Basic reorder buffer method: Structure
Result shift register: the TAG field guides results and exception conditions to the reorder buffer.
Reorder buffer:
Tail: when an instruction issues, an entry is created.
Head: when it contains a valid result, the entry is checked and removed.
Example: the relative positions of two instructions in the two buffers.
Basic reorder buffer: Keeping the process state precise
Keep register values precise:
No exception at the head: results are written to the register file.
Exception at the head: issue is stopped to process the interrupt, and no further writes are made to the register file.
Keep memory precise:
Hold stores in the issue register until all previous instructions are known to be free of exceptions. The stores are then issued. A dummy entry is placed in the reorder buffer.
Keep the program counter precise:
The program counter is stored in a field of the reorder buffer as instructions are issued.
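The head/tail behaviour described above can be sketched in a few lines. This is a simplified model under assumptions, not the hardware organization from the slides: the class and field names are hypothetical, the entry object itself stands in for the tag, and younger entries are simply discarded on an exception.

```python
from collections import deque

class ReorderBuffer:
    def __init__(self):
        self.entries = deque()   # left end = head, right end = tail
        self.regs = {}           # register file, written only at commit

    def issue(self, dest, pc):
        """Create an entry at the tail; the entry stands in for the tag."""
        entry = {"dest": dest, "pc": pc, "valid": False,
                 "value": None, "exception": False}
        self.entries.append(entry)
        return entry

    def complete(self, entry, value=None, exception=False):
        """Record a result or exception; completion may be out of order."""
        entry.update(valid=True, value=value, exception=exception)

    def commit(self):
        """Retire valid entries from the head; return a faulting pc or None."""
        while self.entries and self.entries[0]["valid"]:
            head = self.entries.popleft()
            if head["exception"]:
                self.entries.clear()       # no further register writes
                return head["pc"]
            self.regs[head["dest"]] = head["value"]
        return None


rob = ReorderBuffer()
fadd = rob.issue("R4", pc=6)
iadd = rob.issue("R0", pc=7)
rob.complete(iadd, value=1)          # finishes first
print(rob.commit(), rob.regs)        # None {}: the head is not valid yet
rob.complete(fadd, exception=True)
print(rob.commit(), rob.regs)        # 6 {}: R0 was never written
```

Because the integer add's result waits in the buffer, the register file still reflects the state before statement 6 when the exception is reported.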
Reorder buffer with bypass paths
Limitation of the basic reorder buffer: operands are held in the reorder buffer, so instructions that depend on those operands cannot be issued.
A reorder buffer with bypass paths is proposed: bypass paths are provided from the reorder buffer to the register file output latches.
Reorder buffer with bypass paths: precise process state
Keep registers precise:
Operands are not actually written to the register file but to the register file output latches.
A register is not modified until the instruction reaches the head of the reorder buffer.
Keep memory and PC precise: same as before.
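The operand lookup that a bypass path performs can be sketched as a search of the reorder buffer before falling back to the register file. This is an illustrative model with hypothetical names (`read_operand`, the dictionary fields); it assumes that only the newest buffered writer of a register may supply the operand.

```python
def read_operand(reg, reorder_buffer, register_file):
    """Return (value, ready) for `reg`, preferring the newest buffered result."""
    for entry in reversed(reorder_buffer):    # tail (newest) to head (oldest)
        if entry["dest"] == reg:
            # Newest writer found; its result may still be in flight.
            return entry["value"], entry["valid"]
    return register_file[reg], True           # no pending writer


register_file = {"R0": 0, "R4": 0.0, "R5": 1}
reorder_buffer = [                             # index 0 = head
    {"dest": "R0", "value": 1, "valid": True},
    {"dest": "R4", "value": None, "valid": False},
]
print(read_operand("R0", reorder_buffer, register_file))  # (1, True)
print(read_operand("R4", reorder_buffer, register_file))  # (None, False)
print(read_operand("R5", reorder_buffer, register_file))  # (1, True)
```

In hardware each entry needs its own comparator for this search, which is the circuit cost that the history buffer and future file aim to reduce.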
Methods to reduce bypass circuitry
Limitation of the reorder buffer with bypass paths: the number of bypass comparators and the amount of circuitry needed for the multiple bypass check.
The history buffer and future file are proposed.
Basic idea: place computed results in a working register file, but retain enough state information so a precise state can be restored.
History buffer: Organization
Instruction issues: the current value of the destination register is stored in a history buffer entry.
Instruction completes: results on the result bus are written directly into the register file.
History buffer: Keeping the process state precise
Keep registers precise:
The tag field is used to guide exceptions to the history buffer.
Old values are kept when an instruction issues.
No exception: the head is removed.
Keep memory and PC precise: same as before.
Example: old values in entries 4 and 5.
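A minimal sketch of the history buffer idea follows. It is a simplified model with hypothetical names, and it assumes that no two in-flight instructions write the same register (real hardware would hold the second one at issue). The rollback order, from tail to head, is also an assumption of this sketch.

```python
class HistoryBuffer:
    def __init__(self, regs):
        self.regs = regs       # working register file, written on completion
        self.entries = []      # index 0 = head (oldest instruction)

    def issue(self, dest, pc):
        """Save the destination register's current value at issue time."""
        entry = {"dest": dest, "pc": pc, "old": self.regs[dest],
                 "done": False, "exception": False}
        self.entries.append(entry)
        return entry

    def complete(self, entry, value=None, exception=False):
        entry["done"] = True
        entry["exception"] = exception
        if not exception:
            self.regs[entry["dest"]] = value    # written directly

    def retire(self):
        """Drop finished heads; on an exception roll back and return its pc."""
        while self.entries and self.entries[0]["done"]:
            head = self.entries[0]
            if head["exception"]:
                for e in reversed(self.entries):    # tail to head
                    self.regs[e["dest"]] = e["old"]
                self.entries.clear()
                return head["pc"]
            self.entries.pop(0)
        return None


hb = HistoryBuffer({"R0": 0, "R4": 0.0})
fadd = hb.issue("R4", pc=6)
iadd = hb.issue("R0", pc=7)
hb.complete(iadd, value=1)
print(hb.regs["R0"])               # 1: written before the older fadd finishes
hb.complete(fadd, exception=True)
print(hb.retire(), hb.regs)        # 6 {'R0': 0, 'R4': 0.0}
```

The register file is briefly imprecise, but the saved old values are enough to restore the state that existed before statement 6.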
Future file
Similar to the history buffer method.
Keep registers precise: two register files.
Architecture file:
Future file:
Keep memory and PC precise: same as before.
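The slide leaves the roles of the two register files blank. The sketch below assumes the usual split: the future file is updated as soon as results arrive and supplies operands, while the architecture file is updated in program order through a reorder buffer and holds the precise state. All names are hypothetical.

```python
class FutureFile:
    def __init__(self, regs):
        self.architectural = dict(regs)   # updated in program order
        self.future = dict(regs)          # updated as results arrive
        self.pending = []                 # reorder buffer, index 0 = head

    def issue(self, dest, pc):
        entry = {"dest": dest, "pc": pc, "done": False,
                 "value": None, "exception": False}
        self.pending.append(entry)
        return entry

    def complete(self, entry, value=None, exception=False):
        entry.update(done=True, value=value, exception=exception)
        if not exception:
            self.future[entry["dest"]] = value

    def commit(self):
        """Update the architecture file in order; return a faulting pc or None."""
        while self.pending and self.pending[0]["done"]:
            head = self.pending.pop(0)
            if head["exception"]:
                self.pending.clear()
                self.future = dict(self.architectural)   # restore working copy
                return head["pc"]
            self.architectural[head["dest"]] = head["value"]
        return None


ff = FutureFile({"R0": 0, "R4": 0.0})
fadd = ff.issue("R4", pc=6)
iadd = ff.issue("R0", pc=7)
ff.complete(iadd, value=1)
print(ff.future["R0"], ff.architectural["R0"])   # 1 0
ff.complete(fadd, exception=True)
print(ff.commit(), ff.architectural)             # 6 {'R0': 0, 'R4': 0.0}
```

Later instructions can read the new R0 from the future file immediately, without bypass comparators, while the architecture file never sees a result from an instruction younger than the faulting one.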
Performance evaluation: Environment
CRAY-1S simulation system.
The first 14 Lawrence Livermore loops are used as the simulation workload.
The five methods are classified into three groups:
In-order completion.
Simple reorder buffer.
Reorder buffer with bypass, history buffer, future file.
Two evaluation cases are based on different methods of handling stores.
Performance evaluation (1)
Measurement condition: stores are blocked until the results pipeline is empty.
In-order completion is independent of the number of entries.
In-order completion is better if the buffer is small.
If the number of entries increases beyond 3, the other two are better.
Performance evaluation (2)
Measurement condition: stores are issued and held in the memory pipeline.
The second method of handling stores offers a clear improvement over the first.
Performance degradation for an eight-entry reorder buffer with bypass paths is only 3 percent.
Indication from the methods: once the number of entries in the reorder buffer exceeds a certain value, performance no longer improves. In both tables, that number is eight.
There is a tradeoff between performance degradation and the cost of implementing a method.
Outline
Extensions
Architectural Solutions
Summary
Extensions
Additional state information
Virtual memory
Cache memory
Other State Values
State register
Condition codes:
Virtual Memory
Load/store instructions pass through the address translation section in order.
They reserve time slots in the result pipeline and/or reorder buffer.
If an addressing fault occurs, the instruction and all subsequent loads/stores are cancelled.
Virtual Memory: Using ROB
The page fault is sent to the reorder buffer.
The tag guides the load/store to the correct reorder buffer entry.
The entry is removed when it reaches the head.
An exception causes all later entries to be discarded.
Cache Memory
Store-Through cache
Write-Back cache
Architectural Solutions
Freeze and dump
Save program counters
Save a sequence of instructions
Summary
In-order instruction completion
Reorder buffer
Bypass paths, history buffer and future file
Extensions
Architectural solutions