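Option-1
Assume array size is 256 (mult: 4ns, add: 2ns)
[Figure: one multiplier with inputs A[i] and B[i] feeding one adder; signals labeled temp and sum.]
• Path delay: 4 + 2 = 6 ns
• Cycle time: 6 ns
• Clock rate: 166.7 MHz
• Throughput: 166.7 x 10^6 MAC/sec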
Option-2
Assume array size is 256 (mult: 4ns, add: 2ns)
[Figure: four multipliers compute A[i]*B[i] through A[i+3]*B[i+3] in parallel, followed by three levels of adders (two adders, then one, then one); signals labeled temp and sum.]
• Path delay: 4 + 3*2 = 10 ns
• Cycle time: 10 ns
• Clock rate: 100 MHz
• Throughput: 400 x 10^6 MAC/sec (4 MACs per cycle)
Option-3
Assume array size is 256 (mult: 4ns, add: 2ns)
[Figure: 256 multipliers compute A[0]*B[0] through A[255]*B[255] in parallel and feed an adder tree (8 levels), followed by one more adder; signals labeled temp and sum.]
• Path delay: 4 + 8*2 + 2 = 22 ns
• Cycle time: 22 ns
• Clock rate: 45.5 MHz
• Throughput: 11.64 x 10^9 MAC/sec (256 MACs per 22 ns cycle)
Option-4
Assume array size is 256 (mult: 4ns, add: 2ns)
[Figure: four multipliers compute A[i]*B[i] through A[i+3]*B[i+3] in parallel, followed by three levels of adders; signals labeled sum and temp. The critical path is reduced to a single multiplier, consistent with pipeline registers between the stages.]
• Critical path delay: 4 ns
• Cycle time: 4 ns
• Clock rate: 250 MHz
• Throughput: 10^9 MAC/sec (4 MACs per cycle)
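The figures for the four options can be re-derived with a short script. This is only a sketch: the function and dictionary names are made up for illustration, register overhead (setup and clock-to-Q) is assumed to be zero as the slides appear to assume, and Option-4 is assumed to be limited by its slowest stage.

```python
MULT_NS = 4.0  # multiplier delay given on the slides
ADD_NS = 2.0   # adder delay given on the slides


def throughput(cycle_ns, macs_per_cycle):
    """Return (clock rate in MHz, MAC operations per second)."""
    clock_mhz = 1e3 / cycle_ns
    macs_per_sec = macs_per_cycle / (cycle_ns * 1e-9)
    return clock_mhz, macs_per_sec


# name -> (cycle time in ns, MACs completed per cycle)
options = {
    "Option-1": (MULT_NS + ADD_NS, 1),                  # 1 mult + 1 add
    "Option-2": (MULT_NS + 3 * ADD_NS, 4),              # 1 mult + 3 adder levels
    "Option-3": (MULT_NS + 8 * ADD_NS + ADD_NS, 256),   # mult + 8-level tree + 1 add
    "Option-4": (max(MULT_NS, ADD_NS), 4),              # assumed: slowest stage only
}

for name, (cycle_ns, macs) in options.items():
    mhz, rate = throughput(cycle_ns, macs)
    print(f"{name}: cycle {cycle_ns:g} ns, {mhz:.1f} MHz, {rate:.3e} MAC/sec")
```

Changing MULT_NS or ADD_NS shows how the ranking of the options shifts when the multiplier or adder delay changes.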
Exercise: Serial vs. Pipelined
Assume array size is N (add: 2ns)
[Figure: four adders compute A[i]+B[i] through A[i+3]+B[i+3] in parallel; two adders combine those sums pairwise and a final adder produces temp (three adder levels). The figure appears once for each case.]
N=48
• Serial: 12 cycles, cycle time: 6ns
• Pipelined: 3 + 11 = 14 cycles, cycle time: 2ns
• Speedup: (12 x 6) / (14 x 2) ≈ 2.57x
N=100,000
• Serial: 25,000 cycles, cycle time: 6ns
• Pipelined: 3 + 24999 = 25002 cycles, cycle time: 2ns
• Speedup: (25,000 x 6) / (25,002 x 2) ≈ 3.00x
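The cycle counts and speedups in this exercise can be checked with a small sketch. The names below are illustrative, not from the slides; it assumes four element pairs enter the adder tree per cycle and that pipeline registers add no delay.

```python
LANES = 4      # element pairs consumed per cycle (four input adders)
LEVELS = 3     # adder levels, i.e. pipeline stages
ADD_NS = 2.0   # adder delay given on the slide


def batches(n):
    """Number of groups of LANES elements (ceiling division)."""
    return -(-n // LANES)


def serial_cycles(n):
    # One batch per cycle; each cycle spans all adder levels.
    return batches(n)


def pipelined_cycles(n):
    # LEVELS cycles for the first result, then one result per cycle.
    return LEVELS + batches(n) - 1


def speedup(n):
    serial_ns = serial_cycles(n) * LEVELS * ADD_NS
    pipelined_ns = pipelined_cycles(n) * ADD_NS
    return serial_ns / pipelined_ns


for n in (48, 100_000):
    print(n, serial_cycles(n), pipelined_cycles(n), round(speedup(n), 4))
```

As N grows, the pipeline fill time becomes negligible and the speedup approaches the number of stages (3).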
Sequential Logic Design Non-Ideal Flip-Flop Behavior
• Flip-flop samples D at the clock edge, and D must be stable when sampled
• Similar to a photograph, D must be stable around the clock edge
• If not, metastability can occur
• Setup time: tsetup = time before the clock edge that data must be stable (i.e., not changing)
• Hold time: thold = time after the clock edge that data must be stable
• Aperture time: ta = time around the clock edge that data must be stable (ta = tsetup + thold)
[Figure: timing diagram of CLK and D marking tsetup before the clock edge, thold after it, and ta spanning both.]
• Propagation delay: tpcq = time after the clock edge by which the output Q is guaranteed to be stable (i.e., to have stopped changing)
• Contamination delay: tccq = time after the clock edge at which Q might become unstable (i.e., start changing)
[Figure: timing diagram of CLK and Q marking tccq and tpcq after the clock edge.]
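• The path between registers has a minimum and a maximum delay, which depend on the delays of the circuit elements
[Figure: (a) register R1 drives combinational logic CL, which drives register R2; both registers are clocked by CLK, with Q1 the output of R1 and D2 the input of R2. (b) Timing diagram of CLK, Q1, and D2 over one clock period Tc.]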
• The setup time constraint depends on the maximum delay from register R1 through combinational logic to R2
• The input to register R2 must be stable at least tsetup before the clock edge
[Figure: R1 drives combinational logic CL, which drives R2 (Q1 to D2), both clocked by CLK; timing diagram of CLK, Q1, and D2 showing tpcq, tpd, and tsetup fitting within one clock period Tc.]
Tc ≥ tpcq + tpd + tsetup
tpd ≤ Tc - (tpcq + tsetup)
[Figure: circuit with inputs A, B, C, and D, internal signals X' and Y', and outputs X and Y, with registers clocked by CLK on both sides of the combinational logic.]
Timing Characteristics
• tccq = 30 ps
• tpcq = 50 ps
• tsetup = 60 ps
• thold = 70 ps
• tpd = 35 ps (per gate)
• tcd = 25 ps (per gate)
Critical path: tpd = 3 x 35 ps = 105 ps
Setup time constraint:
Tc ≥ (50 + 105 + 60) ps = 215 ps
fc = 1/Tc = 4.65 GHz
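The arithmetic of this worked example can be reproduced with a minimal sketch. The function names are illustrative only, and all times are in picoseconds.

```python
def min_clock_period_ps(t_pcq, t_pd, t_setup):
    """Setup time constraint: Tc >= tpcq + tpd + tsetup."""
    return t_pcq + t_pd + t_setup


def max_clock_ghz(tc_ps):
    """Convert a clock period in ps to a frequency in GHz (1/ps = 1000 GHz)."""
    return 1e3 / tc_ps


GATE_TPD_PS = 35        # per-gate propagation delay from the slide
GATES_ON_PATH = 3       # gates on the critical path, as on the slide

t_pd = GATES_ON_PATH * GATE_TPD_PS          # 105 ps
tc = min_clock_period_ps(50, t_pd, 60)      # 215 ps
print(f"Tc >= {tc} ps, fc = {max_clock_ghz(tc):.2f} GHz")
```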
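Clock Skew
• The clock doesn’t arrive at all registers at the same time
• Skew: difference between two clock edges
• Perform worst-case analysis to guarantee that the dynamic discipline is not violated for any register (there are many registers in a system!)
[Figure: R1 clocked by CLK1 and R2 clocked by CLK2, with combinational logic CL between Q1 and D2 and a delay element on the CLK line; timing diagram shows CLK1 and CLK2 offset by tskew.]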
• In the worst case for the setup time constraint, CLK2 is earlier than CLK1, so the data has tskew less time to reach R2
[Figure: R1 clocked by CLK1 and R2 clocked by CLK2, with combinational logic CL between Q1 and D2; timing diagram of CLK1, CLK2, Q1, and D2 showing tpcq, tpd, tsetup, and tskew fitting within one clock period Tc.]
Tc ≥ tpcq + tpd + tsetup + tskew
tpd ≤ Tc - (tpcq + tsetup + tskew)
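To see how skew eats into the logic delay budget, the inequality above can be evaluated directly. This is a sketch with hypothetical numbers: the 250 ps clock period and 20 ps skew are made up for illustration, while tpcq = 50 ps and tsetup = 60 ps are borrowed from the earlier timing example. The function names are also illustrative.

```python
def max_logic_delay_ps(tc, t_pcq, t_setup, t_skew=0):
    """Largest tpd allowed by tpd <= Tc - (tpcq + tsetup + tskew)."""
    return tc - (t_pcq + t_setup + t_skew)


def meets_setup(tc, t_pcq, t_pd, t_setup, t_skew=0):
    """True if Tc >= tpcq + tpd + tsetup + tskew holds."""
    return tc >= t_pcq + t_pd + t_setup + t_skew


TC_PS = 250      # hypothetical clock period
SKEW_PS = 20     # hypothetical worst-case skew

print(max_logic_delay_ps(TC_PS, 50, 60))             # budget without skew
print(max_logic_delay_ps(TC_PS, 50, 60, SKEW_PS))    # budget with skew
print(meets_setup(TC_PS, 50, 130, 60, SKEW_PS))      # 130 ps of logic is too slow
```

Every picosecond of skew is removed one-for-one from the time available to the combinational logic.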
Sequential Logic Design Metastability
• Violating setup/hold time can lead to a bad situation known as a metastable state
• Metastable state: any flip-flop state other than a stable 1 or 0
• Eventually settles to one or the other, but we don’t know which
• For internal circuits, we can make sure setup time is observed
• But what if an input comes from an external (asynchronous) source, e.g., a button press?
• Partial solution
  • Insert a synchronizer flip-flop for the asynchronous input
  • Special flip-flop with very small setup/hold time
  • Doesn’t completely prevent metastability
[Figure: timing diagram of clk, D, and Q showing a setup time violation followed by a metastable state on Q; schematic of asynchronous input ai passing through a synchronizer flip-flop to produce a.]
• One flip-flop doesn’t completely solve the problem
• How about adding more synchronizer flip-flops?
  • Helps, but just decreases the probability of metastability
• So how do we solve it completely? We can’t! This may be unsettling to new designers, but we just can’t guarantee a design that won’t ever be metastable. We can only maximize the mean time between failures (MTBF), a number often given along with a circuit
[Figure: input ai passing through a chain of synchronizer flip-flops; the probability of a flip-flop being metastable falls from low to very low, very very low, and incredibly low along the chain.]
Exercise
The circuit shown below computes the 4-input AND function using 2-input AND gates. Each 2-input AND gate has a propagation delay of 100 ns and a contamination delay of 55 ns. Each flip-flop has a setup time of 30 ns, a hold time of 20 ns, a maximum clock-to-Q delay of 70 ns, and a minimum clock-to-Q delay of 50 ns.
a) If there is no clock skew, what is the maximum operating frequency of the circuit?
b) How much clock skew can the circuit tolerate if it must operate at 2 MHz?
[Figure: 4-input AND circuit built from 2-input AND gates and flip-flops.]
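Exercise-2
• Determine the critical path and clock frequency of the design provided.
• Assume the setup time of a D flip-flop is 10 ns.
• Assume the delay of a gate is estimated as 1 ns times the number of gate inputs.
• mux delay = 5 ns
• adder delay = 20 ns
[Figure: datapath with a 4-bit adder (inputs A and B), a mux controlled by sel that selects 1 (input 0) or -1 (input 1), and a 4-bit Cnt Reg with clr and ld inputs driven by cnt_clr and cnt_ld, producing Cnt; controller with a State Reg (next-state bits n1, n0 and state bits s1, s0) and signals up and en.]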