fundamental algorithms for system modeling, analysis, and...
TRANSCRIPT
![Page 1: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/1.jpg)
1
Fundamental Algorithms for System Modeling, Analysis, and OptimizationAnalysis, and Optimization
Edward A. Lee, Jaijeet Roychowdhury, Sanjit A. Seshia, Stavros TripakisUC BerkeleyEECS 144/244Spring 2013Spring 2013
Copyright © 2010-13, E. A. Lee, J. Roychowdhury, S. A. Seshia, S. Tripakis. All rights reserved
Lecture: Timing Analysis, Retiming
Thanks to Kurt Keutzer for several slides
RTL Synthesis Flow
RTLS
HDLHDL
Simulation/ Verification
FSM,Verilog,VHDL
Synthesis
netlist
logicoptimization
netlist
Library/modulegenerators
a
b
s
q0
1
d
clk
a
bq
0
1
d
Boolean circuit/network
Boolean circuit/network
EECS 144/244, UC Berkeley: 2
physicaldesign
layout
K. Keutzer
s clk
Graph / RectanglesTiming Analysis
![Page 2: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/2.jpg)
2
Timing Analysis / Verification
Verifying a property about system timing
Arises in many settings: Integrated circuits Integrated circuits
Embedded software
Distributed embedded systems
Biological systems
…
EECS 144/244, UC Berkeley: 3
Here we focus on circuits.
Chapter 15 of Lee-Seshia book contains a more broad discussion.
Timing Analysis for Digital ICs
(Clock) Speed is one of the major performance metrics for digital circuits
Timing Analysis = the process of verifying that a chip meets its speed requirementE.g., 1 GHz means that next-state function must be
computed within 1 ns
EECS 144/244, UC Berkeley: 4
![Page 3: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/3.jpg)
3
Static Timing Analysis for Circuits
Determine fastest permissible clock speed (e.g. 1 GHz)
clk
Combinationallogic
clk
Combinationallogic
clk
Combinationallogic
EECS 144/244, UC Berkeley: 5
p p ( g )by determining delay of longest path from register to register (e.g. 1ns.)
Cycle Time - Critical Path Delay
Cycle time (T) cannot be smaller than longest path delay (Tmax) data
cycle time
p y ( max)
Longest (critical) path delay is a function of:
Total gate, wire delays
Tclock1
data
maxT T
EECS 144/244, UC Berkeley: 6
clock
Q1 Q2
Tclock1 Tclock2critical path,
~5 logic levels
![Page 4: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/4.jpg)
4
Cycle Time - Setup Time
For FFs to correctly latch data, input data must be stable during: data
setup time
be stable during:
Setup time (Tsetup) beforeclock arrives
Tclock1
data
max setupT T T
EECS 144/244, UC Berkeley: 7
clock
Q1 Q2
Tclock1 Tclock2critical path,
~5 logic levels
Cycle Time - Clock-skew
Tclock2
T
dataIf clock network has unbalanced delay – clock skew Tclock1
Q2clock skew
Q2
skew
Cycle time is also a function of clock skew (Tskew)
max setup skewT T T T
EECS 144/244, UC Berkeley: 8
clock
Q1 Q2
Tclock1 Tclock2
critical path, ~5 logic levels
![Page 5: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/5.jpg)
5
Cycle Time - Clock to Q
Cycle time is also a function of propagation delay of
Tclock1
T l k2
data
p p g yFF (Tclk-to-Q)
Tclk-to-Q : time from arrival of clock signal till change at FF output)
Q
Tclock2
Q2clock-to-Q
Q
max setup skew clk to QT T T T T
EECS 144/244, UC Berkeley: 9
clock
Q1 Q2
Tclock1 Tclock2
Q2
critical path, ~5 logic levels
Min Path Delay - Hold Time
For FFs to correctly latch data, input data must be stable during Hold time T
data
hold time
stable during Hold time (Thold) after clock arrives
Determined by delay of shortest path in circuit (Tmin) and clock skew (Tskew)
Tclock1
min hold skewT T T
EECS 144/244, UC Berkeley: 10
clock
Q1 Q2
Tclock1 Tclock2short path, ~3
logic levels
![Page 6: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/6.jpg)
6
Summary
Need to be able to compute
Tmin: delay of “fastest” path in the circuit
Tmax: delay of “slowest” (critical) path in the circuit
EECS 144/244, UC Berkeley: 11
Modeling Timing in a Combinational Circuit
Arrival time in green A 2
1
0.20
10
X
Z
W
.05
C
B
f
2
2
1
1
0
.20
.20
.10
YZ
.15
.05
1
Interconnect delay in red
Gate delay in blue
What’s the right mathematical
EECS 144/244, UC Berkeley: 13
What s the right mathematical object to use to represent this physical object?
![Page 7: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/7.jpg)
7
Modeling - 1
Use a labeled directed graph
G = <V E>
A 21
0.20
10
X
Z
W
.05
X0.05A
2
G = <V,E>
Vertices represent gates, primary inputs and primary outputs
Edges represent wires
C
B
f
2
2
1
1
0
.20
.20
.10
YZ
.15
.05
1
EECS 144/244, UC Berkeley: 14
C
B
f
Y
W .05.1
1
.2
.15.20
.20
12
2
2
Z
Labels represent delays
What about arrival times?
0
0
1
s
Modeling - 2
C
X0
0
A
15.201
2
2
Find longest/shortest paths in a directedgraph G = <V,E>
C
B
f
Y
W .05.1
1
.2
0
1
.15
.202
2
Z
g ap G ,
What sort of directed graph do we have?
Is this in the standard form for a
s
EECS 144/244, UC Berkeley: 15
longest/shortest path problem?
![Page 8: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/8.jpg)
8
Split Nodes into Edges
C
X0
05.10
A
.15.201
2
2C
B
f
Y
W .05.1
1
.2
0
1.20
2
Z
22
s
EECS 144/244, UC Berkeley: 16
2 1
0.5 2.5
3
DAG with Weighted Edges
X0
A
Cf
Y
W1.050.1
1
0.2
0
0
1
2.15
2.2
2.2Zs
EECS 144/244, UC Berkeley: 17
B1
Problem: Find the longest (critical) path from source s to sink f.
![Page 9: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/9.jpg)
9
Naïve Approach: Enumerate Paths
A
How many paths in this example?In the worst case?
Cf
X
W
0
1.05.1
.2
0
0
A
2.15
2.2
2 2Zs
EECS 144/244, UC Berkeley: 18
B
Y11
2.2
Problem: Find the longest path from source s to sink f.
Algorithm 1: Longest path in a DAG
Critical Path Method [Kirkpatrick 1966, IBM JRD](dynamic programming)
Let w(u,v) denote weight of edge from u to v
Steps:
1. Topologically sort vertices (graph is acyclic)
order: v1, v2, …, vn v1 = s, vn = ?
2. For each vertex v, compute
EECS 144/244, UC Berkeley: 19
d(v) = length of longest path from source s to v
d(v1) = 0
For i = 2..n
d(vi) = maxall incoming edges (u, vi)d(u) + w(u,vi)
![Page 10: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/10.jpg)
10
Algorithm 1: Longest path in a DAG
Critical Path Method [Kirkpatrick 1966, IBM JRD]
Let w(u,v) denote weight of edge from u to v
Steps:
1. Topologically sort vertices
order: v1, v2, …, vn v1 = s, vn = f
2. For each vertex v, compute
d( ) l th f l t th f t
Time Complexity?O(m+n)
EECS 144/244, UC Berkeley: 20
d(v) = length of longest path from source s to v
d(v1) = 0
For i = 2..n
d(vi) = maxall incoming edges (u, vi)d(u) + w(u,vi)
Run the CPM on our example
Graph vs. Circuit
Delay of a combinational circuit depends on Circuit topology (graph model)
Delay model (e.g., fixed vs. variable)
Boolean behavior of gates
We have only considered circuit topology so far.
EECS 144/244, UC Berkeley: 21
Longest/shortest path found on the graph can be very pessimistic: Paths can be FALSE
Delay values are BOUNDS
![Page 11: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/11.jpg)
11
False Paths (consider Transition Mode)
A path is false if it cannot be responsible for the delay of a circuit
EECS 144/244, UC Berkeley: 22
Graph model implies path of length 6
False Paths
A path is false if it cannot be responsible for the delay of a circuit
EECS 144/244, UC Berkeley: 23
Graph model implies path of length 6
![Page 12: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/12.jpg)
12
False vs. True Paths
TRUE path = one that can be responsible for the delay of a circuita circuit
Need techniques to find whether a path is TRUE or FALSE
EECS 144/244, UC Berkeley: 24
The Fixed Delay Model: Constant delay for each gate (or wire) Typo: this should be 1
EECS 144/244, UC Berkeley: 25
![Page 13: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/13.jpg)
13
Paradoxical Behavior with the Transition Model?Typo: this should be 1
non-monotonic
EECS 144/244, UC Berkeley: 26
w.r.t. time delays!
Problems with Fixed Delay + Transition Model
1. Transition model can be tricky to reason about
Fi d d l li i d2. Fixed gate delays are unrealistic, due to manufacturing process variations
More realistic delay model: Lower and upper bounds
EECS 144/244, UC Berkeley: 27
Perform timing analysis for a whole family of circuits that share the same lower/upper bounds
![Page 14: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/14.jpg)
14
Fixed Delays Bounded Delays
Want algorithms that report the critical path delay
of the slowest circuit in the circuit familyof the slowest circuit in the circuit family
EECS 144/244, UC Berkeley: 28
Paper: Computation of Floating Mode Delay in Combinational
Circuits: Theory and Algorithms, Devadas, Keutzer, Malik, 1993
(on bspace)
RETIMING
EECS 144/244, UC Berkeley: 29
![Page 15: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/15.jpg)
15
Retiming Tradeoffs
aa
b 1
1
12
2
2
EECS 144/244, UC Berkeley: 30
[Shenoy, 1997]
6
4
clock period =
# registers =
Retiming Tradeoffs
aa
b 1
1
12
2
2
EECS 144/244, UC Berkeley: 31
5
4
clock period =
# registers =
[Shenoy, 1997]
![Page 16: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/16.jpg)
16
Retiming Tradeoffs
aa
b 1
1
12
2
2
EECS 144/244, UC Berkeley: 32
clock period =
# registers =
4
3
[Shenoy, 1997]
Retiming Tradeoffs
aa
b 1
1
12
2
2
EECS 144/244, UC Berkeley: 33
clock period =
# registers =
2
4
[Shenoy, 1997]
![Page 17: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/17.jpg)
17
Goals of Retiming
Possible goals:
• Minimize clock period (min-period retiming)
• Minimize number of registers (min-area retiming)Minimize number of registers (min area retiming)
• Minimize number of registers for a target clock period (constrained min-area retiming)
EECS 144/244, UC Berkeley: 34[Shenoy, 1997]
Abstraction:Circuit Graph
Nodes: circuit elementsNode weights: delay
EECS 144/244, UC Berkeley: 35
Dummy node for fanout
![Page 18: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/18.jpg)
18
Abstraction:Registers?
EECS 144/244, UC Berkeley: 36
Abstraction - Registers
0
1
1
1
0
00 0
0 0 0
0
000
00
EECS 144/244, UC Berkeley: 37
Arc weights indicate registers
1
![Page 19: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/19.jpg)
19
Abstraction - Environment
0
1
0
00 0
0 0 0
0
000
0
EECS 144/244, UC Berkeley: 38
11
1
0
Cutset – Divides the Graph in Two
0
1
0
00 0
0 0 0
0
000
0
EECS 144/244, UC Berkeley: 39
11
1
0
![Page 20: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/20.jpg)
20
Retiming: Add a register on all arcs crossing the cutset in one direction, and subtract a register from all arcs crossing the cutset in the other direction.
0 0+1=1
1
0
00
00+1=1
0
0
000
0
0+1=1
EECS 144/244, UC Berkeley: 40
11-1=0
1-1=0
0
Recall: Retiming Tradeoffs
aa
b 1
1
12
2
2
EECS 144/244, UC Berkeley: 41
[Shenoy, 1997]
6
4
clock period =
# registers =
![Page 21: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/21.jpg)
21
Recall: Retiming Tradeoffs
aa
b 1
1
12
2
2
EECS 144/244, UC Berkeley: 42
5
4
clock period =
# registers =
[Shenoy, 1997]
Simplest cutset surrounds one node
0
1
1-1=0
1-1=00
00
0+1=1 0
0
000
0
EECS 144/244, UC Berkeley: 43
10 0
0
![Page 22: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/22.jpg)
22
Recall: Retiming Tradeoffs
aa
b 1
1
12
2
2
EECS 144/244, UC Berkeley: 44
5
4
clock period =
# registers =
[Shenoy, 1997]
Recall: Retiming Tradeoffs
aa
b 1
1
12
2
2
EECS 144/244, UC Berkeley: 45
clock period =
# registers =
4
3
[Shenoy, 1997]
![Page 23: Fundamental Algorithms for System Modeling, Analysis, and ...embedded.eecs.berkeley.edu/eecsx44/lectures/Spring2013/timingAn… · Static Timing Analysis for Circuits Determine fastest](https://reader034.vdocument.in/reader034/viewer/2022042806/5f6bea7ac8f62457b50fd5a9/html5/thumbnails/23.jpg)
23
For each node v, define r (v) = # of registers moved from the outputs to the inputs.
0
r (v) = -1
A retiming is now an assignment r (v) for every node v such that the weight of every arc is non-negative.
1
1-1=0
1-1=00
00
0+1=1 0
0
000
0v
y g
EECS 144/244, UC Berkeley: 46
10 0
0
Rest of the story
Retiming operations = graph transformations
Each graph has a vector of costs: (clock period, # registers, …)
Formulate and solve optimization problem.
Papers (on bspace):
EECS 144/244, UC Berkeley: 47
Papers (on bspace):1. Leiserson, C. E. and J. B. Saxe (1983). "Optimizing synchronous systems."
Journal of VLSI and Computer Systems: pp. 41-67.
2. Leiserson, C. E. and J. B. Saxe (1991). "Retiming synchronous circuitry." Algorithmica 6(1): pp. 5-35.
3. Shenoy, N. (1997). "Retiming: Theory and practice." Integration, the VLSI Journal 22: pp. 1-21.