ece 550d fundamentals of computer systems and engineering...
TRANSCRIPT
ECE 550D Fundamentals of Computer Systems and Engineering
Fall 2017
Storage and Clocking
Prof. John Board
Duke University
Slides are derived from work by Profs. Tyler Bletsch and Andrew Hilton (Duke)
2
VHDL: Behavioral vs Structural
• A few words about VHDL
• Structural:
• Spell out at (roughly) gate level
• Abstract piece into entities for abstraction/re-use
• Very easy to understand what synthesis does to it
• Behavioral:
• Spell out at higher level
• Sequential statements, for loops, process blocks
• Can be difficult to understand how it synthesizes
• Difficult to resolve performance issues
3
Last time…
• Who can remind us what we did last time?
• Add
• Subtract
• Bit shift
• Floating point
4
So far…
• We can make logic to compute “math”
• Add, subtract,… (we’ll see multiply/divide later)
• Bitwise: AND, OR, NOT,…
• Shifts
• Selection (MUX)
• …pretty much anything
• But processors need state, i.e. memory (hold value)
• Registers
• Main Memory
• Storage (disks)
5
Memory Elements
• All the circuits we looked at so far are combinational circuits: the output is (only) a Boolean function of the present inputs (a specific combination of inputs always results in exactly the same output)
• We need circuits that can remember values (registers, memory)
• For sequential circuits, the output of the circuit is a function of both the present inputs and the history (sequence of events) of how we got here, represented by a value which we call the state
• Circuits with storage are called sequential circuits
• Key to storage: feedback
6
Ideal Storage – Where We’re Headed
• Ultimately, we want something that can hold 1 bit and we want to control when it is re-written
• However, instead of just giving it to you as a magic black box, we’re going to first dig a bit into the box
“flip flop” =
device that
holds one
bit (0 or 1)
bit to be written bit currently being held
bit to control
when we write
7
FF Step #1: NOR-based Set-Reset (SR) Latch
R
S
Q
Q
0
1 0
1 0
0
R
S
Q
Q
0
0 1
0 1
0
If I am crazy enough to wire up
2 nor gates this way – fire??
No – two stable configurations
as shown here. But how to move
between them?
Let’s start with the left side, which
we consider to be the “reset” state
8
R
S
Q
Q
0
1 0
1 0
0
R
S
Q
Q
0
0 1
0 1
1
Set-Reset Latch (Continued)
Time
S 0
1
R 0
1
Q 0
1
We activate SET
for a brief time,
then lower it
again
9
R
S
Q
Q
0
1 0
1 0
0
R
S
Q
Q
0
0 1
0 1
1
Set-Reset Latch (Continued)
Time
S 0
1
R 0
1
Q 0
1
Set Signal Goes High
Output Signal Goes High
(2 gate delays later)
10
R
S
Q
Q
0
1 0
1 0
0
R
S
Q
Q
0
0 1
0 1
1
Set-Reset Latch (Continued)
Time
S 0
1
R 0
1
Q 0
1
Set Signal Goes Low
Output Signal Stays High
11
R
S
Q
Q
0
1 0
1 0
0
R
S
Q
Q
0
0 1
0 1
1
Set-Reset Latch (Continued)
Time
S 0
1
R 0
1
Q 0
1
Until Reset Signal
Goes High
Then Output Signal Goes Low
(again 2 gate delays later)
12
FF Step #1: NOR-based Set-Reset (SR) Latch
R
S
Q
Q
0
1 0
1 0
0
R
S
Q
Q
0
0 1
0 1
0
R S Q
0 0 Q
0 1 1
1 0 0
1 1 - Don’t set both S & R to 1.
Seriously, don’t do it.
R,S both low, remember the
previous command
S set the latch to 1 (even if it is
already 1)
R reset the latch to 0 (even if it
is already 0)
R,S together causes oscillation,
random value - bad
13
SR Latch
• Upside: IT HAS MEMORY! If we never invented something better/fancier, we could build computers with these!!!
• Upside: it’s cheap – just 2 nor gates
• Downside: S and R at once = chaos
• Downside: Bad interface, bad timing behavior
• So let’s build on it to do better
14
Building up to the D Flip-Flop and beyond
SR Latch
Q
Q
R
S D
E
Q
Q
R
S
D Latch
D
latch
D Q
E Q
D
latch
D Q
E
D
latch
D Q
E !Q !Q
Q D
C
DFF
D Q
E Q
D Flip-Flop
32 bit reg
D Q
E Q
Register
DFF
D Q
E Q
DFF
D Q
E Q
DFF
D Q
E Q
DFF
D Q
E Q
DFF
D Q
E Q
(too awkward) (bad timing) (okay but only one bit) (nice!)
15
FF Step #2: Data Latch (“D Latch”)
Starting with SR Latch
Q
Q
R
S
16
Data Latch (D Latch)
Starting with SR Latch
Change interface to
Data + Enable (D + E)
If E=0, then R=S=0.
If E=1, then S=D and R=!D
Data
Enable
Q
Q
R
S
17
Data Latch (D Latch)
Data
Enable
Q
Q
D E Q
0 1 0
1 1 1
- 0 Q
Time
D 0
1
E 0
1
Q 0
1
E goes high
D “latched”
Stays as output
R
S
18
Data Latch (D Latch)
Data
Enable
Q
Q
D E Q
0 1 0
1 1 1
- 0 Q
Time
D 0
1
E 0
1
Q 0
1
Does not
affect Output
E goes low
Output unchanged
By changes to D
R
S
19
Data Latch (D Latch)
Data
Enable
Q
Q
D E Q
0 1 0
1 1 1
- 0 Q
Time
D 0
1
E 0
1
Q 0
1
E goes high
D “latched”
Becomes new output
R
S
20
Data Latch (D Latch)
Data
Enable
Q
Q
D E Q
0 1 0
1 1 1
- 0 Q
Time
D 0
1
E 0
1
Q 0
1
Slight Delay
(Logic gates take
time)
- Here about 4
gate delays
worth
R
S
21
Logic Takes Time
• The D latch is a good next step but still has messy timing for using in a computer
• Logic takes time:
• Gate delays: delay to switch each gate
• Wire delays: delay for signal to travel down wire
• Other factors (not going into them here)
• Need to make sure that signal timing is right
• Don’t want to have races or unpredictable or unexpected conditions..
22
Clocks
• Processors have a clock:
• Alternates 0 1 0 1
• Like the processor’s internal metronome
• Latch logic latch in one clock cycle
• 3.4 GHz processor = 3.4 Billion clock cycles/sec
One clock cycle
23
FF Step #3: Using Level-Triggered D Latches
• First thoughts: Level Triggered
• Latch enabled when clock is high
• Hold value when clock is low
D
latch
D Q
E Q
D
latch
D Q
E Q
Logic
Clk
3 3
This slide describes how D-latches can malfunction because they were level triggered. Real D-flip-flops are edge-triggered, and we’re showing you why that’s important.
24
Strawman: Level Triggered
• How we’d like this to work
• Clock is low, all values stable
D
latch
D Q
E Q
D
latch
D Q
E Q
Logic
Clk
3 3
010 111 100 001
0
Clk
This slide describes how D-latches can malfunction because they were level triggered. Real D-flip-flops are edge-triggered, and we’re showing you why that’s important.
25
Strawman: Level Triggered
• How we’d like this to work
• Clock goes high, latches capture and xmit new val
D
latch
D Q
E Q
D
latch
D Q
E Q
Logic
Clk
3 3
010 010 100 100
0
Clk
This slide describes how D-latches can malfunction because they were level triggered. Real D-flip-flops are edge-triggered, and we’re showing you why that’s important.
26
Strawman: Level Triggered
• How we’d like this to work
• Signals work their way through logic w/ high clk
D
latch
D Q
E Q
D
latch
D Q
E Q
Logic
Clk
3 3
010 010 100 100
0
Clk
This slide describes how D-latches can malfunction because they were level triggered. Real D-flip-flops are edge-triggered, and we’re showing you why that’s important.
27
Strawman: Level Triggered
• How we’d like this to work
• Clock goes low after all logic but just before signals reach next latch
D
latch
D Q
E Q
D
latch
D Q
E Q
Logic
Clk
3 3
010 010 100 100
0
Clk
This slide describes how D-latches can malfunction because they were level triggered. Real D-flip-flops are edge-triggered, and we’re showing you why that’s important.
28
Strawman: Level Triggered
• How we’d like this to work
• Clock goes low before signals reach next latch
D
latch
D Q
E Q
D
latch
D Q
E Q
Logic
Clk
3 3
111 010 000 100
0
Clk
This slide describes how D-latches can malfunction because they were level triggered. Real D-flip-flops are edge-triggered, and we’re showing you why that’s important.
29
Strawman: Level Triggered
• How we’d like this to work
• Everything stable before clk goes high a second time
D
latch
D Q
E Q
D
latch
D Q
E Q
Logic
Clk
3 3
111 010 000 100
0
Clk
This slide describes how D-latches can malfunction because they were level triggered. Real D-flip-flops are edge-triggered, and we’re showing you why that’s important.
30
Strawman: Level Triggered
• How we’d like this to work
• Clk goes high again, repeat
D
latch
D Q
E Q
D
latch
D Q
E Q
Logic
Clk
3 3
111 111 000 000
0
Clk
This slide describes how D-latches can malfunction because they were level triggered. Real D-flip-flops are edge-triggered, and we’re showing you why that’s important.
31
Strawman: Level Triggered
• Problem: Really hard to time this! What if signal reaches latch too early while clk is still high the first time?
D
latch
D Q
E Q
D
latch
D Q
E Q
Logic
Clk
3 3
111 111 101 000
0
Clk
This slide describes how D-latches can malfunction because they were level triggered. Real D-flip-flops are edge-triggered, and we’re showing you why that’s important.
32
Strawman: Level Triggered
• Problem: What if signal reaches latch too early?
• Signal goes right through latch, into next stage..
D
latch
D Q
E Q
D
latch
D Q
E Q
Logic
Clk
3 3
111 111 101 101
0
Clk
This slide describes how D-latches can malfunction because they were level triggered. Real D-flip-flops are edge-triggered, and we’re showing you why that’s important.
33
That would be bad…
• Getting into a stage too early is bad
• Something else is going on there corrupted
• Also may be a loop with one latch
• And this assumed all signals take the same amount of time to get through the logic cloud – in reality some faster than others
• Consider incrementing counter (or PC)
• Too fast: increment twice? Eeek…
D
latch
D Q
E Q
+1
3
010
011
This slide describes how D-latches can malfunction because they were level triggered. Real D-flip-flops are edge-triggered, and we’re showing you why that’s important.
3 bit adder with one side wired
permanently to 001
34
Building up to the D Flip-Flop and beyond
SR Latch
Q
Q
R
S D
E
Q
Q
R
S
D Latch
D
latch
D Q
E Q
D
latch
D Q
E
D
latch
D Q
E !Q !Q
Q D
C
DFF
D Q
E Q
D Flip-Flop
32 bit reg
D Q
E Q
Register
DFF
D Q
E Q
DFF
D Q
E Q
DFF
D Q
E Q
DFF
D Q
E Q
DFF
D Q
E Q
(too awkward) (bad timing) (okay but only one bit) (nice!)
35
FF Step #4: Edge Triggered
• Instead of level triggered latch
• Latch a new value at a clock level (high or low)
• We use edge triggered flip flop
• Latch a value (only) at a clock edge (rising or falling)
Falling Edges
Rising Edges
36
Our Ultimate Goal: D Flip-Flop
• Rising edge triggered D Flip-flop
• Two D Latches w/ opposite clocking of enables
D
latch
D Q
E
D
latch
D Q
E Q Q
Q D
C
37
D Flip-Flop
• Rising edge triggered D Flip-flop
• Two D Latches w/ opposite clocking of enables
• On Low Clock, first latch enabled (propagates value)
• Second not enabled, maintains value
D
latch
D Q
E
D
latch
D Q
E Q Q
Q D
C
38
D Flip-Flop
• Rising edge triggered D Flip-flop
• Two D Latches w/ opposite clking of enables
• On Low Clock, first latch enabled (propagates value)
• Second not enabled, maintains value
• On High Clock, second latch enabled
• First latch not enabled, maintains value
• Value from first latch flips into second latch at instant of rising clock edge, stays constant until next rising edge
D
latch
D Q
E
D
latch
D Q
E Q Q
Q D
C
39
D Flip-Flop
• No possibility of “races” anymore
• Even if I put 2 DFFs back-to-back…
• By the time signal gets through 2nd latch of 1st DFF
1st latch of 2nd DFF is disabled
• Still must ensure signals reach DFF before clk rises
• Important concern in logic design “making timing”
• Having signals ready in time for next clock edge (need to be stable a few moments before the edge in fact)
D
latch
D Q
E
D
latch
D Q
E Q
D
C
D
latch
D Q
E
D
latch
D Q
E Q
C
40
Making Timing
• Making timing is important in a design • If you don’t make timing, your logic won’t compute right
• Synthesis tool (Quartus) tells you what maximum frequency your design will run safely
• If you run above this, your logic doesn’t “finish” in time
41
D Flip-flops (continued…)
• Could also do falling edge triggered
• Switch which latch has NOT on clock
• D Flip-flop is ubiquitous
• Must properly differentiate between latches (level sensitive) and flip-flops (edge sensitive)
• Which edge: doesn’t matter
• As long as consistent in entire design
• We’ll use rising edge
• And often both – rising for writing, falling for reading, to allow data transfer in a single clock cycle, but more on that later!
42
D flip flops
• Generally don’t draw clk input
• Have one global clk, assume it goes there
• Should always see “>” to indicate edge triggered device
• Maybe have explicit enable
• Might not want to write every cycle
• If no enable signal shown, implies always enabled
• Get output and NOT(output) for “free”
DFF
D Q
Q
DFF
D Q
>
Q
DFF
D Q
E Q
> >
43
DFFs in VHDL
x: dffe port map (
clk => clk, --the clock
d => someInput, --the input is d
q => theOutput, --the output is q
ena => en, --the enable
clrn => '1', --clear
prn => '1'); --set (PReset)
Also, comes in “dff” with no enable
44
A word of advice
signal x_q : std_logic;
signal x_d: std_logic;
x : dffe port map (…, d=> x_d, q=>x_q,…);
• Use naming convention: x_d, x_q
• Write x_d, read x_q
• Remember new value shows up next cycle
x x_d x_q
45
A few words about timing
• Homework 2: VGA Controller
• Requires certain clock frequency
• Else won’t control monitor properly
• Quartus will tell you what maximum timing your current circuit makes
• Fmax : how fast can this be clocked
• Tells you your worst timing paths
• From which dff to which dff
• Can see in schematic viewer (usually)
• Homework 2
• Should be plenty of slack
• But if not…
46
Fixing timing misses
• Typical approach: reduce logic (gate delays)
• Better adder?
• Rethink approach?
• Change “don’t care” behavior?
• Fix high fanout
• Duplicate high Fan Out/simple logic
• Also, feel free to ask for help from me/TAs
• Quartus’s tools to help you fix them aren’t the best
47
Building up to the D Flip-Flop and beyond
SR Latch
Q
Q
R
S D
E
Q
Q
R
S
D Latch
D
latch
D Q
E Q
D
latch
D Q
E
D
latch
D Q
E !Q !Q
Q D
C
DFF
D Q
E Q
D Flip-Flop
32 bit reg
D Q
E Q
Register
DFF
D Q
E Q
DFF
D Q
E Q
DFF
D Q
E Q
DFF
D Q
E Q
DFF
D Q
E Q
(too awkward) (bad timing) (okay but only one bit) (nice!)
48
Stick a bunch of DFFs together to make a register
32 bit reg
D Q
E Q
in0
in1
in2
in31
. . .
out0
out1
out2
out31
enable
DFF
D Q
E Q
>
DFF
D Q
E Q
>
DFF
D Q
E Q
>
DFF
D Q
E Q
>
49
Next evolution: multiple registers
32 bit reg
D Q
E Q
Register
DFF
D Q
E Q
DFF
D Q
E Q
DFF
D Q
E Q
DFF
D Q
E Q
DFF
D Q
E Q
(nice!)
En0
En1
En30
En31
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
WrData
En0
En1
En30
En31
…
Register File (Tremendous!)
> “Tristate”
buffers –
more in a
moment!
50
Register File
• Can store one value… How about many?
• Register File
• In processor, holds values it computes on
• MIPS, 32 32-bit registers
• How do we build a Register File using D Flip-Flops?
• What other components do we need?
51
Register File: Interface
• 4 inputs
• 3 register numbers (5 bit): 2 registers to read, 1 register to write
• 1 register write value (32 bits)
• 2 outputs
• 2 register values (32 bits)
Register File
r_numA
r_numB
r_numW Wval
Aval
Bval
32 5
5
5
32
32
52
Register file strawman
• Use a mux to pick read ?
• 32 input mux = slow
• (other regs not pictured)
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
… …
53
Register File Design
• Two problems: write and read
• Writing the registers
• Need to pick which reg
• Have reg num (e.g., 19)
• Need to make En19=1
• En0, En1,… = 0
• Read: Use a mux to pick?
• 32-input mux = slow
• Need a better method…
• Let’s talk about writing first.
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
… … WrData
En0
En1
En30
En31
54
First: A Decoder
• First task: convert binary number to “one hot”
• Saw this before
• Take register number
Decoder
3
101
0 0
0
0 0
1 0
0
55
Register File
• Now we know how to write:
• Use decoder to convert reg # to one hot
• Send write data to all regs
• Use one hot encoding of reg # to enable right reg
• Still need to fix read side
• 32 input mux (the way we’ve made it) not realistic
• To do this: expand our world from {1,0} to {1, 0, Z}
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
WrData
En0
En1
En30
En31
…
56
Tri-states have a third logic level
• Z simply means “high impedance” which for our purposes means high resistance, effectively disconnected from both Vcc and Ground
57
So this third option: Z
• There is a third possibility: Z (“high impedance”)
• Just closed off/blocked
• Prevents electricity from flowing through
• Gate that gives us Z : Tri-state
D E Q
0 1 0
1 1 1
- 0 Z D
Q
E
58
CMOS: Complementary MOS
• 2 inputs: E and D. What does this do?
• Write truth table for output
Vcc
Gnd
E Output
E
D
E D Output
0 0
0 1
1 0
1 1
59
CMOS: Complementary MOS
• 2 inputs: E and D. What does this do?
• Write truth table for output
• When E =1, straightforward
Vcc
Gnd
E Output
E
D
E D Output
0 0
0 1
1 0 1
1 1 0
60
CMOS: Complementary MOS
• 2 inputs: E and D. What does this do?
• Write truth table for output
• When E =1, straightforward
• When E= 0, no connection: Z
Vcc
Gnd
E Output
E
D
E D Output
0 0 Z
0 1 Z
1 0 1
1 1 0
When E=0, both enable
transistors are not conducting –
they are both effectively million
ohm resistors, isolating output
from either 1 or 0
61
High Impedance: Z
• Z = High Impedance
• No path to power or ground
• “Gate” does not produce a 1 or a 0
• Previous slide: tri-state inverter
• More commonly drawn: tri-state buffer
• E = enable, D = data
D E Out
0 1 0
1 1 1
X 0 Z D
Out
E
X=“Don’t care”
D Out
62
Remember this rule?
• Remember I told you not to connect two outputs?
• If one gate tries to drive a 1 and the other drives a 0
• One pumps water in.. The other sucks it out
• Except its electric charge, not water
• “Short circuit”—lots of current -> lots of heat
a b
c
d
BAD!
63
We’ve had this rule one day… and you break it
It’s ok to connect multiple outputs together
Under one circumstance:
All but at most one must be outputting Z at
any time
Then no shorts – no fire
One-hot circuits guarantee this
D0
E0
D1
E1
Dn-2
En-2
Dn-1
En-1
64
Mux, implemented with tri-states
• We can build effectively a mux from tri-states
• Much more efficient for large #s of inputs (e.g., 32)
Decoder
5
11110
0
0
1
0
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
… …
… …
65
En0
En1
En30
En31
Register File
• Now we can write and read in one clock cycle!
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
32 bit reg
D Q
E Q
WrData
En0
En1
En30
En31
…
66
Ports
• What we just saw: read port
• Ability to do one read / clock cycle
• May want more: read 2 source registers per instruction
• Maybe even more if we do many instructions at once
• This design: can just replicate port
• Another decoder
• Another set of tri-states
• Another output bus (wire connecting the tri-states)
• Ok to read same register on both ports – just driving two lines with same bits!
• Earlier: write port
• Ability to do one write/cycle
• Could add more: need muxes to pick write values
67
Minor Detail
• FYI: This is not how most real register files are implemented
• (Though it is how other things are implemented, and it would work for a register file, just not as well as other techniques we’ll learn later)
• Actually done with SRAM
• We’ll see how those work soon
68
Summary
Can layout logic to compute things
Add, subtract,…
Now can store things
SR, D latches
D flip-flops
Registers
Register files
Also begin to understand clocks