basic processing unit (chapter 7) iosup/courses/2011_ti1400_7.ppt
Post on 15-Jan-2016
215 views
TRANSCRIPT
Basic Processing Unit(Chapter 7)
http://www.pds.ewi.tudelft.nl/~iosup/Courses/2011_ti1400_7.ppt
TU-DelftTI1400/11-PDS
2
Problem: How to Implement Computers?
Circuit Design
Digital logicMemory elementsOther building blocks (Multiplexer,Decoder)Finite State Machines
Lecture 1
Programmable Devices
Memory organizationProgram sequencingvon Neumann archi.Instruction levels
Lecture 2
Why Computer Organization Matters?Lecture
0
ComputersLectures 3,4,5,6
Data representation, conversion, and op.Instruction repr. and use
History of Computing(1642-2011)
TU-DelftTI1400/11-PDS
3
Problem
f
yALU
y
Decoder
a
instruction
Reg
?
TU-DelftTI1400/11-PDS
4
Lecture 2Von Neumann Architecture
READ(X)READ(Y)ADD(X,Y,Z)WRITE(Z)
X: 1Y: 2Z: 3 • •
TEMP_A: TEMP_B: RESULT:
IR:
PC:
arithmeticunit
(Central) Processing Unit
CONTROL
Memory
Input
Output
TU-DelftTI1400/11-PDS
55
The Processing Unit
1. Basic Processing Cycle2. Types of Operations3. Control Mechanisms
TU-DelftTI1400/11-PDS
6
Basic Processing Cycle
• Assume an instruction occupies a 32 bit single word in byte addressable memory
• Basic cycle to be implemented:
1. Fetch instruction pointed to by PC and
put it into the Instruction Register (IR): [IR] M([PC])
2. Increment PC: [PC] [PC] + 4
3. Perform actions as specified in IR
TU-DelftTI1400/11-PDS
7
Organization
PC
CPU bus
IR
Decoder
control
R0
R1
R2
Rn-1
Register file
MAR
MDR memory bus
Y
Z
ALU
TU-DelftTI1400/11-PDS
8
Register gating
Ri
CPU bus
Y
Z
ALU
x
x
x
x
MUX
x
Const 4 Ri_in
Ri_out
Y_in
Select
Z_in
Z_out
TU-DelftTI1400/11-PDS
9
Register gating
Tri-state gate: high impedance iff Ri_out=0, Q iff Ri_out=1
R1_in
IR/W
C
IR/W IR/W
C D
Q
R2_out
C
C D
Q
R3_out
C
C D
Q
Edge-triggered D flip-flop
1 bit of common bus line
1
R1_out
0R2_in R3_in
Multiplexer
TU-DelftTI1400/11-PDS
10
Multiple Datapaths
R0
R1
R2
R3
Y
register file ALU
Bus C Bus B
Bus A
TU-DelftTI1400/11-PDS
1111
The Processing Unit
1. Basic Processing Cycle2. Types of Operations3. Control Mechanisms 1. Register Transfer
2. Fetch from Memory3. Store to Memory4. Arithmetic/Logic
Ops.5. Complete Example6. Branching Ops.
TU-DelftTI1400/11-PDS
12
2. Types of Operations
• Operation cycle includes:
- Transfer data from register to register or to ALU
- Fetch contents of memory location and put in one of the CPU registers
- Store contents of CPU register in memory location
- Perform arithmetic or logic operation
TU-DelftTI1400/11-PDS
13
2.1. Register Transfers
R0
R1
R2
R3
Y
Z register file
ALU
Address_in R_in
R_out CPU bus
Address_out
Copy contents of R1 to R3
1. Address_out=R12. R_out 3. Address_in=R34. R_in
1. R1_out 2. R3_in
TU-DelftTI1400/11-PDS
14
2.2. Fetch from Memory (1)
MDR
x
Internal processor bus
Memory busData lines
x
x
x
MDR_out
MDR_in
MDR_outE
MDR_inE
TU-DelftTI1400/11-PDS
15
MAR
MDR
2.2. Fetch from memory (2)
1. MAR [Ri]2. Start read on memory bus3. Wait for MFC response4. Load MDR from memory bus5. Rj [MDR]
MFC Memory
CPURead
Address
Data
e.g., Move (Ri),Rj
Memory Function Complete
Control Step 1
Control Step 2
Control Step 3
TU-DelftTI1400/11-PDS
16
Fetch from memory (3)
Control Step 1. Ri_out, MAR_in, ReadControl Step 2. MDR_inE, WMFCControl Step 3. MDR_out, Rj_in
Signal Activation Sequence
Ri
x
x
Ri_in
Ri_out
x
x
x
x
MDR_out
MDR_in
MDR_outE
MDR_inE
MDR
Internal processor bus
Memory busData lines
TU-DelftTI1400/11-PDS
17
2.2. Fetch from Memory (4)
Value
New Address
Timing of the Operation
CLK
MAR_in
MR
1 2 3
address
Read
MDR_inE
Data
MFC
MDR_out
1. Ri_out, MAR_in, Read2. MDR_inE, WMFC3. MDR_out, Rj_in
Mem.Fnc.Complete
Mem.Read Cmd.
Mem.Bus to MDR
MAR to Mem.Bus
TU-DelftTI1400/11-PDS
18
2.3. Store to Memory
1. Ri_out, MAR_in
2. Rj_out, MDR_in, Write3. MDR_outE, WMFC
Memory CPUWrite
Address
Data
MFC
e.g., Move Rj,(Ri)
TU-DelftTI1400/11-PDS
19
2.4. Arithmetic Operation
Step Action 1. Address_out R1
Y_inR_out
2. Address_out R2R_outF_alu “ADD”Z_in
Address_in R3Z_outR_in
ADD R3,R2,R1
TU-DelftTI1400/11-PDS
20
Register Transfers
R0
R1
R2
R3
Y
Z register file
ALU
Y_in
R_out CPU bus
Address_out
1. Address_out R1Y_inR_out
TU-DelftTI1400/11-PDS
21
Arithmetic Operation
Step Action 1. Address_out R1
Y_inR_out
2. Address_out R2R_outF_alu “ADD”Z_in
Address_in R3Z_outR_in
ADD R3,R2,R1
TU-DelftTI1400/11-PDS
22
Register Transfers
R0
R1
R2
R3
Y
Z register file
ALU
Y_in
Z_in
R_out CPU bus
Address_out
F_alu
2. Address_out R2R_outF_alu “ADD”Z_in
TU-DelftTI1400/11-PDS
23
Arithmetic Operation
Step Action 1. Address_out R1
Y_inR_out
2. Address_out R2R_outF_alu “ADD”Z_in
Address_in R3Z_outR_in
ADD R3,R2,R1
TU-DelftTI1400/11-PDS
24
Register Transfers
R0
R1
R2
R3
Y
Z register file
ALU
Z_out Address_in R_in
CPU bus
Address_in R3Z_outR_in
TU-DelftTI1400/11-PDS
25
Steps in time
Y_in
Z_in
Z_out
R_in
1 2 3 StepY
Z
ALU
Z_out
CPU bus
Z_in
Y_in
TU-DelftTI1400/11-PDS
2727
The Processing Unit
1. Basic Processing Cycle2. Types of Operations3. Control Mechanisms 1. Register Transfer
2. Fetch from Memory3. Store to Memory4. Arithmetic/Logic
Ops.5. Complete
Example6. Branching Ops.
TU-DelftTI1400/11-PDS
28
2.5. Execution of a Complete Instruction
1. Fetch instruction2. Fetch the operand3. Perform operation4. Store result
• Example ADD (R3),R1[R1] M([R3]) + [R1]
TU-DelftTI1400/11-PDS
29
Execution fetch (1)
Step Action1 PC_out, MAR_in, Read
Set carry-in ALUF_alu = “ADD”Z_in
Z_out, PC_inWait for MFC
3 MDR_out, IR_in
[PC] [PC ]+1
[IR] M([PC ])
Step 1-3: Instruction fetch and PC update
Note: for architectures having PC:=PC+4 a different scheme must be used
TU-DelftTI1400/11-PDS
30
Fetch instruction
PC
Z
ALU
Z_in
ADD
MAR
PC_out
carry
MAR_in
Read
WFMC
MDR
IR
1. PC_out, MAR_in, Read Set carry-in ALU F_alu = “ADD” Z_in
Q Why Set carry-in ALU?
Q Why MAR_in?
TU-DelftTI1400/11-PDS
31
Execution fetch (2)
Step Action1 PC_out, MAR_in, Read
Set carry-in ALUF_alu = “ADD”Z_in
Z_out, PC_inWait for MFC
3 MDR_out, IR_in
Step 1-3: instruction fetchand PCupdate
[PC] [PC ]+1
[IR] M([PC ])
TU-DelftTI1400/11-PDS
32
Fetch instruction
PC
Z
ALU
PC_in
Z_out
MAR MAR_in
Read
WFMC
MDR
IR
MDR_in
Z_out, PC_in Wait for MFC
Q What is read into MDR?
TU-DelftTI1400/11-PDS
33
Execution fetch (3)
Step Action1 PC_out, MAR_in, Read
Set carry-in ALUF_alu = “ADD”Z_in
Z_out, PC_inWait for MFC
3 MDR_out, IR_in
[PC ] [PC ]+1
[IR] M([PC ])
Step 1-3: instruction fetchand PCupdate
TU-DelftTI1400/11-PDS
34
Fetch instruction
PC
Z
ALU
MAR
Read
WFMC
MDR
IR_in
MDR_out
IR
3. MDR_out, IR_in
Q What is loaded into IR?
TU-DelftTI1400/11-PDS
35
Execute
Step Action4 Address_out=R3, R_out
MAR_inRead
Address_out=R1, R_outY_in, Wait for MFC
6 MDR_out, Z_inF_alu = “ADD”
7 Address_in=R1, R_inZ_out, End
Step 4 and 5: operand fetch
Perform addition
Store Result
TU-DelftTI1400/11-PDS
36
Execute
PC
CPU bus
IR
Decoder control
R0
R1
R2
R3
register file
MAR
MDR memory bus
Y
Z
ALU
Read
4. R3_out MAR_in Read
Q Role of Decoder?
TU-DelftTI1400/11-PDS
37
Execute Step Action4 Address_out=R3, R_out
MAR_inRead
Address_out=R1, R_outY_in, Wait for MFC
6 MDR_out, Z_inF_alu = “ADD”
7 Address_in=R1, R_inZ_out, End
Step 4 and 5: operand fetch
Perform addition
Store Result
TU-DelftTI1400/11-PDS
38
Execute
PC
CPU bus
IR
Decoder
control
R0
R1
R2
R3
register file
MAR
MDR memory bus
Y
Z
ALU
WFMC
R1_out Y_in, Wait for MFC
Q Where does MDR read from?
TU-DelftTI1400/11-PDS
39
Execute
Step Action4 Address_out=R3, R_out
MAR_inRead
Address_out=R1, R_outY_in, Wait for MFC
6 MDR_out, Z_inF_alu = “ADD”
7 Address_in=R1, R_inZ_out, End
Step 4 and 5: operand fetch
Perform addition
Store Result
TU-DelftTI1400/11-PDS
40
Execute
PC
CPU bus
IR
Decoder
control
R0
R1
R2
R3
register file
MAR
MDR memory bus
Y
Z
ALU
6. MDR_out, Z_in F_alu = “ADD”
Q Who sets F_alu to ADD?
Q Why Z_in?
TU-DelftTI1400/11-PDS
41
Execute Step Action4 Address_out=R3, R_out
MAR_inRead
Address_out=R1, R_outY_in, Wait for MFC
6 MDR_out, Z_inF_alu = “ADD”
7 Address_in=R1, R_inZ_out, End
Step 4 and 5: operand fetch
Perform addition
Store Result
TU-DelftTI1400/11-PDS
42
Execute
PC
CPU bus
IR
Decoder
control
R0
R1
R2
R3
register file
MAR
MDR memory bus
Y
Z
ALU
7. R1_in Z_out, End
Q Role of End?
TU-DelftTI1400/11-PDS
4343
The Processing Unit
1. Basic Processing Cycle2. Types of Operations3. Control Mechanisms 1. Register Transfer
2. Fetch from Memory3. Store to Memory4. Arithmetic/Logic
Ops.5. Complete Example6. Branching Ops.
TU-DelftTI1400/11-PDS
44
2.6. Branching
Step Action1-3 <instruction fetch
as in previous example>
PC_out, Y_in
5 Offset-field-IR_outF_alu = “ADD”Z_in
6 PC_inZ_out, End
Jump PC+Offset
TU-DelftTI1400/11-PDS
45
Branching
PC
CPU bus
IR
Decoder
control
R0
R1
R2
R3
register file
MAR
MDR memory bus
Y
Z
ALU
PC_out, Y_in
TU-DelftTI1400/11-PDS
46
Branching
Step Action1-3 <instruction fetch
as in previous example>
PC_out, Y_in
5 Offset-field-IR_outF_alu = “ADD”Z_in
6 PC_inZ_out, End
TU-DelftTI1400/11-PDS
47
Branching
PC
CPU bus
IR
Decoder
control
R0
R1
R2
R3
register file
MAR
MDR memory bus
Y
Z
ALU
5. Offset-field-IR_out F_alu = “ADD” Z_in
TU-DelftTI1400/11-PDS
48
Branching
Step Action1-3 <instruction fetch
as in previous example>
PC_out, Y_in
5 Offset-field-IR_outF_alu = “ADD”Z_in
6 PC_inZ_out, End
TU-DelftTI1400/11-PDS
49
Branching
PC
CPU bus
IR
Decoder
control
R0
R1
R2
R3
register file
MAR
MDR memory bus
Y
Z
ALU
6. PC_in Z_out, End
TU-DelftTI1400/11-PDS
50
Conditional branching
Step Action1-3 <instruction fetch
as in previous example>
PC_out, Y_inIf N=0 then End
5 Offset-field-IR_outF_alu = “ADD”Z_in
6 PC_inZ_out, End
JN PC+Offset
If not Negative
TU-DelftTI1400/11-PDS
5151
The Processing Unit
1. Basic Processing Cycle2. Types of Operations3. Control Mechanisms
1. Hardwired2. Micro-ProgrammedQ Who sets F_alu to ADD?
TU-DelftTI1400/11-PDS
52
3. Control Mechanisms
• There are two basic control organizations:
- Hardwired control
- Micro-programmed control
TU-DelftTI1400/11-PDS
5353
The Processing Unit
1. Basic Processing Cycle2. Types of Operations3. Control Mechanisms
1. Hardwired2. Micro-ProgrammedQ Who sets F_alu to ADD?
TU-DelftTI1400/11-PDS
54
3.1. Hardwired Control
Control Unit Organization
Status Flags
Condition Codes
Control step counter
Clock CLK
Encoder/Decoder
IR
Control signals
TU-DelftTI1400/11-PDS
55
3.1. Hardwired Control
Separating decoding/encoding
Status Flags
Condition Codes
End
Reset
Run
Control step counter
Clock
Step decoderT_1 T_n
Ins_1
Encoder
Instruction decoder
IRIns_n
Only one set to 1
Z_in
Q Role of Run?
TU-DelftTI1400/11-PDS
56
3.1. Hardwired Control
Generation of control signalsADD
T_6 T_5BRanch
T_1
Z_in
Z_in = T_1 + T_6 . ADD + T_5 . BR
time slot
TU-DelftTI1400/11-PDS
57
3.1. Hardwired Control
End signal
End = T_7 . ADD + T_6 . BR +(T_6 . N + T_4 . /N) .BRN
Other example:
TU-DelftTI1400/11-PDS
59
3.1. Hardwired Control
Performance
• Performance is dependent on:- Power of instructions- Cycle time- Number of cycles per instruction
• Performance improvement by:- Multiple datapaths- Instruction prefetching and pipelining- Caches
TU-DelftTI1400/11-PDS
60
Complete CPU
Instruction unit
Floating-pointunit
Integer unit
Data Cache
Instruction
Cache
BusInterface
MainMemory
Input/Output
Processor/CPU
System Bus
Integer unit
Integer unit
Floating-pointunit
Floating-pointunit
TU-DelftTI1400/11-PDS
61
3.2. Micro-programmed control
• All control bits are organized as memory
• Each memory location represents a control setting/word- The word represents the state (0/1) of each control signal
• Memory words are called micro-instructions
• Micro-routines are sequences of micro-instructions- Control store for all micro-routines
- Micro-program counter (uPC) to read control words sequentially
TU-DelftTI1400/11-PDS
62
3.2. Micro-Programmed Control Examples of Micro-Instructions
micro- PC_in MAR_in Addr_in Z_in ...instruction
1 0 1 00 1 ...2 1 0 00 0 ...3 0 0 01 0 .......
TU-DelftTI1400/11-PDS
63
3.2. Micro-Programmed Control Example of a Micro-routineAddress Micro-instruction
0 PC_out, MAR_in, Read, Set carry-in ALU, F_alu = “ADD”, Z_in
1 Z_out, PC_in, Wait for MFC
2 MDR_out, IR_in
3 Branch to starting address routine (here, 25)........................................................................................................25 PC_out, Y_in, if N=0 then goto address 0
26 Offset-field-of-IR_out, F_alu = “ADD”, Z_in
27 Z_out, PC_in, End
Fetch Instruction
Test N bit
New PC address
TU-DelftTI1400/11-PDS
64
3.2. Micro-Programmed Control Basic organization
IR
Startingaddress
generator
Clock micro-PC
Control Store
ControlSignals
Q Can this organization perform conditional branching operations?
TU-DelftTI1400/11-PDS
65
3.2. Micro-Programmed Control Detailed organization
IR
Starting/Branchingaddress
generator
Clock micro-PC
Control Store
ControlSignals
Status flags
Control codes
TU-DelftTI1400/11-PDS
66
3.2. Micro-Programmed Control micro-PC Operation• Micro-PC is incremented by 1, except:
- After loading IR• Micro-PC is set to first micro-instruction for
executing machine instruction- At End• Micro-PC is set to first micro-instruction of
instruction fetch routine (typically 0)- At Branch instruction • Micro-PC is set to the branch address
TU-DelftTI1400/11-PDS
67
3.2. Micro-Programmed Control Why micro-programming?
• Flexibility
- emulation of different instruction sets on
same hardware
• Support for powerful instructions
TU-DelftTI1400/11-PDS
68
3.2. Micro-Programmed Control Structure micro-instructions
• Most simple organization: 1 bit per control signal
• However,- Many bits needed (e.g., 80-120 bits)- For many signals, only one is needed per
cycle; hence they can be grouped- Coding is possible: e.g., an address instead
of a single control bit per register
TU-DelftTI1400/11-PDS
69
3.2. Micro-Programmed Control Example
Field 1(4 bits): Register address_inField 2(4 bits): Register address_outField 3(4 bits): Other registers_inField 4(4 bits): Function ALUField 5(2 bit) : Read/Write/NopField 6(1 bit) : Carry-in ALUField 7(1 bit) : WMFCField 8(1 bit) : End............ ..............
F1 F2 F3 F4 F5 F6 F7 F8
TU-DelftTI1400/11-PDS
70
3.2. Micro-Programmed Control Forms of organization• Little coding: horizontal organization
- Large words- Little decoding logic- Fast
• Much coding: vertical organization- Small control store- Much decoding logic- Slower
• Mixed organization
TU-DelftTI1400/11-PDS
71
3.2. Micro-Programmed Control Horizontal/Vertical
F0 F1 F2 F3
R0 R1 R2 R3
Horizontal
F0 F1
Decoder
R0 R1 R2 R3
Vertical
TU-DelftTI1400/11-PDS
72
3.2. Micro-Programmed Control Sequencing
• Thus far only branch after fetch
• No sharing of micro-code between micro-routines
• Micro-subroutines lead to more efficient control
store
TU-DelftTI1400/11-PDS
73
3.2. Micro-Programmed Control Multi-way branching
• Number of two-way branches
- disadvantage: slows down
• More than one branch address in micro-
instruction
- disadvantage: more bits required
• bit-ORing if specified branch address
TU-DelftTI1400/11-PDS
74
3.2. Micro-Programmed Control Example
x x x 0 0
y z
x x x y z
micro-instruction
part of IR
branchaddress
OR
actualbranch address
TU-DelftTI1400/11-PDS
75
3.2. Micro-Programmed Control Example microroutine (1)
ADD (Rsrc)+, Rdst
Instruction Format
OP code 010 Rsrc Rdst
Mode
034781011
IR
bit 8: direct/indirectbit 9,10: indexed (11)
autodecrement(10) autoincrement(01) register(00)
TU-DelftTI1400/11-PDS
76
3.2. Micro-Programmed Control Example microroutine (2)Address Micro-instruction
0 PC_out, MAR_in, Read, Set carry-in ALU, F_alu = “ADD”, Z_in Z_out, PC_in, Wait for MFC2 MDR_out, IR_in3 μBranch{μPC101 (from PLA); μPC_5,4[IR_10,9];
μPC_3{[not.IR_10].[not.IR_9].[IR_8]}...........................................................................................................121 Rsrc_out, MAR_in, Set carry-in ALU,Read, F_alu = “ADD”, Z_in122 Z_out, Rscr_in123 μBranch{μPC170; μPC_0[not.IR_8]}, WMFC
170 MDR_out, MAR_in, Read, WMFC171 MDR_out, Y_in172 Rdst_out, F_alu = “ADD”, Z_in173 Z_out, Rdst_in, End
FETCH
autoincrement
indirectdirect
use bits from IR for addressing mode
TU-DelftTI1400/11-PDS
77
3.2. Micro-Programmed Control Micro branch address
OP code 010 Rsrc Rdst
Mode
034781011
IR
0 0 1 0 1 0 0 0 1
/IR10./IR9.IR8
PLA
121
101
9
TU-DelftTI1400/11-PDS
78
3.2. Micro-Programmed Control Micro branch address
OP code 010 Rsrc Rdst
Mode
034781011
IR
0 0 1 1 1 1 0 0 1
/IR8
PLA
171
170
TU-DelftTI1400/11-PDS
79
3.2. Micro-Programmed Control Next-address field (1)
• Micro-instruction contains address next
micro-instruction
• Larger store needed
• Branch micro-instructions no longer needed
TU-DelftTI1400/11-PDS
80
Next-address field (2)
IR Statusflags
Conditioncodes
Decoding circuits
micro-AR
Control store
Next address
Microinstruction decoder
micro-IR
TU-DelftTI1400/11-PDS
81
3.2. Micro-Programmed Control Example
Field 0(8 bits): Next addressField 1(4 bits): Register address_inField 2(4 bits): Register address_outField 3(4 bits): Other registers_inField 4(4 bits): Function ALUField 5(2 bit) : Read/Write/NopField 6(1 bit) : Carry-in ALUField 7(1 bit) : WMFCField 8(1 bit) : End............ PLA/ORing etc
F1 F2 F3 F4 F5 F6 F7 F8 F0
TU-DelftTI1400/11-PDS
82
3.2. Micro-Programmed Control Emulation• A micro-program determines the machine
instructions of a computer
• Suppose we have two computers M1 and M2
with different instruction sets
• By adapting the micro-program of M1, we can
emulate M2
TU-DelftTI1400/11-PDS
83
3.2. Micro-Programmed Control Organization
• Micro-program is often placed in ROM on CPU chip
• Some machines had writable control store, i.e. user could change instruction set