Mihai Budiu May 23, 2007


Based On

"Critical Path: A Tool for System-Level Timing Analysis", Girish Venkataramani, Tiberiu Chelcea, Mihai Budiu, and Seth C. Goldstein, Design Automation Conference (DAC), San Diego, CA, June 4-8, 2007

Girish Venkataramani: summer intern here in 2005

Now graduating from CMU

His Ph.D. thesis, "A System Level Timing Analysis and Optimization Methodology for Hardware Compilation", is based on the Global Critical Path (GCP)

Critical Path

Longest path between the source and the sink of a DAG
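The definition above can be sketched in a few lines. This is a minimal illustration, not code from the talk: the function name `longest_path`, the `(u, v, delay)` edge format, and the example graph are all assumptions made for this sketch. It orders the nodes topologically and then relaxes edges, keeping the maximum distance.

```python
from collections import defaultdict

def longest_path(edges, source, sink):
    """Return (length, path) of the longest source-to-sink path in a DAG.

    edges: iterable of (u, v, delay) tuples (hypothetical format);
    the graph must be acyclic.
    """
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set()
    for u, v, d in edges:
        succ[u].append((v, d))
        indeg[v] += 1
        nodes.update((u, v))

    # Kahn's algorithm: repeatedly emit nodes with no unprocessed predecessors.
    order = []
    ready = [n for n in nodes if indeg[n] == 0]
    while ready:
        u = ready.pop()
        order.append(u)
        for v, _ in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)

    # Relax edges in topological order, remembering the best predecessor.
    dist = {source: 0}
    pred = {}
    for u in order:
        if u not in dist:
            continue  # not reachable from the source
        for v, d in succ[u]:
            if dist[u] + d > dist.get(v, float("-inf")):
                dist[v] = dist[u] + d
                pred[v] = u

    if sink not in dist:
        return None
    path = [sink]
    while path[-1] != source:
        path.append(pred[path[-1]])
    return dist[sink], path[::-1]

# Made-up example graph.
edges = [("src", "a", 2), ("src", "b", 1), ("a", "snk", 2),
         ("b", "snk", 5), ("a", "b", 1)]
result = longest_path(edges, "src", "snk")
```

In this made-up graph the longest path goes through both `a` and `b`, even though the direct edge from `src` to `b` is shorter.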


Synchronous Combinational Circuits

[Figure: two latches driven by clk]

Longest signal-propagation path between two consecutive latches.

Clock period > critical path delay

Events

Circuit (V, E)

Events = signal transitions on edges E

Event = (n1, t1) → (n2, t2)

Chaining of Events

Circuit (V, E)

Timed Graph

[Figure: timed graph with nodes A and B repeated along a time axis t0, t1, t2, t3]

Event: signal from (A, t1) to (B, t3)

|| (n1, t1) → (n2, t2) || = t2 – t1

Dynamic Critical Path = longest path in Timed Graph

Note: easy to model node computation delay too.
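A small sketch of the timed graph may help here. It assumes events are given as `((n1, t1), (n2, t2))` pairs with t2 > t1, and that two events chain when the end vertex of one is exactly the start vertex of the next. The function name and the sample events are hypothetical; the times 0 to 3 stand in for t0 to t3.

```python
def dynamic_critical_path(events):
    """Longest chain of events in the timed graph.

    events: non-empty list of ((n1, t1), (n2, t2)) with t2 > t1 (assumed).
    Returns (length, list of timed-graph vertices on the path).
    """
    best = {}  # vertex -> (longest path length ending here, previous vertex)
    # Edges only go forward in time, so sorting by start time is a valid
    # processing order: everything ending at a vertex is seen before
    # anything that starts from it.
    for src, dst in sorted(events, key=lambda e: e[0][1]):
        length = best.get(src, (0, None))[0] + (dst[1] - src[1])
        if length > best.get(dst, (float("-inf"), None))[0]:
            best[dst] = (length, src)

    end = max(best, key=lambda v: best[v][0])
    path = [end]
    while path[-1] in best:          # walk back until a vertex with no predecessor
        path.append(best[path[-1]][1])
    return best[end][0], path[::-1]

# Made-up events; the second one is the slide's signal from (A, t1) to (B, t3).
events = [
    (("B", 0), ("A", 1)),
    (("A", 1), ("B", 3)),
    (("A", 0), ("B", 2)),
]
length, path = dynamic_critical_path(events)
```

Because each event has length t2 – t1, the length of a chain telescopes to the end time of its last event minus the start time of its first.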

Goal: Apply to Real Circuits

[Figure: three pipeline stages, each with blocks labeled +, reg, Delay, C, and H/S, connected by data, reqi, acki, reqo, and acko signals, with labels 1 to 4]

This work focuses on asynchronous 4-way handshake circuits.

Model Stages Using Behaviors

[Figure: one stage with blocks labeled +, reg, Delay, C, and H/S, and signals data, reqi, acki, acko, reqo]

Behavior              Input transitions (precondition)    Output transitions (postcondition)
Compute               reqi0↑, reqi1↑, acko↓               reqo↑, acki↑
Return to zero req    acko↑                               reqo↓
Return to zero ack    reqi0↓, reqi1↓                      acki↓
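One way to read the behavior table is as a set of firing rules: a behavior produces its output transitions once all of its input transitions have occurred. The sketch below encodes the three rows that way. The `Behavior` class, the `fire` function, and the use of the figure's signal names `reqo` and `acko` are assumptions made for illustration, not the tool's actual data structures.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Behavior:
    name: str
    inputs: frozenset   # precondition: input transitions that must all occur
    outputs: tuple      # postcondition: output transitions produced

# The three rows of the table, for a stage with two inputs.
STAGE = [
    Behavior("compute", frozenset({"reqi0↑", "reqi1↑", "acko↓"}), ("reqo↑", "acki↑")),
    Behavior("return to zero req", frozenset({"acko↑"}), ("reqo↓",)),
    Behavior("return to zero ack", frozenset({"reqi0↓", "reqi1↓"}), ("acki↓",)),
]

def fire(behaviors, seen):
    """Return the output transitions of every behavior whose precondition holds.

    seen: set of input transitions observed so far.
    """
    out = []
    for b in behaviors:
        if b.inputs <= seen:        # all required inputs have arrived
            out.extend(b.outputs)
    return out

waiting = fire(STAGE, {"reqi0↑", "reqi1↑"})            # acko↓ still missing
ready = fire(STAGE, {"reqi0↑", "reqi1↑", "acko↓"})     # compute can fire
```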

Behaviors can Handle Choice

mux: deterministic (unique) choice

arbiter: nondeterministic choice

In the absence of choice and non-deterministic delays, a static analysis can determine the GCP.

Runtime: Locally Critical Events

Behavior              Input transitions (precondition)    Output transitions (postcondition)
Compute               reqi0↑, reqi1↑, acko↓               reqo↑, acki↑
Return to zero req    acko↑                               reqo↓
Return to zero ack    reqi0↓, reqi1↓                      acki↓

[Timeline: reqi0↑, reqi1↑, acko↓, then reqo↑ and acki↑]
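A minimal sketch of what a node might record at run time, assuming the locally critical event of a behavior is its last-arriving input transition (which is how the timeline reads: the Compute outputs follow the last of the three inputs). The arrival times are invented for illustration.

```python
def locally_critical(arrivals):
    """Return the input transition that arrived last.

    arrivals: dict mapping input transition -> arrival time (hypothetical format).
    """
    return max(arrivals, key=arrivals.get)

# Made-up arrival times for the inputs of the Compute behavior.
compute_inputs = {"reqi0↑": 1.0, "reqi1↑": 2.5, "acko↓": 4.0}
critical = locally_critical(compute_inputs)
```

With these times the stage was waiting on the consumer's acknowledgment rather than on its input data.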

GCP Computation Algorithm

0. At run-time each node records locally critical events

1. Start from last node executed

2. Trace back along locally critical input event

3. Some transitions repeated
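The steps above can be sketched as a backward walk over a run-time log. The log format is an assumption: each event is a hypothetical `(transition, time)` pair mapped to the locally critical input event that enabled it, or `None` for the first event. The channel names are made up.

```python
from collections import Counter

def extract_gcp(critical_input, last_event):
    """Trace back from the last event along locally critical input events."""
    path = [last_event]
    while critical_input.get(path[-1]) is not None:
        path.append(critical_input[path[-1]])
    path.reverse()
    return path

# Step 0: what the nodes recorded at run time (made-up trace).
log = {
    ("reqAB↑", 1): None,
    ("reqBC↑", 2): ("reqAB↑", 1),
    ("ackCB↑", 3): ("reqBC↑", 2),
    ("reqBC↓", 4): ("ackCB↑", 3),
    ("ackCB↓", 5): ("reqBC↓", 4),
    ("reqBC↑", 6): ("ackCB↓", 5),
}

# Steps 1 and 2: start from the last event and follow the recorded links.
gcp = extract_gcp(log, ("reqBC↑", 6))

# Step 3: the same transition can appear several times on the path.
repeats = Counter(name for name, _ in gcp)
```

Keying events by transition and time keeps repeated occurrences of the same transition distinct during the walk.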

Possible Locally Critical Paths

[Figure: four locally critical paths, numbered 1 to 4, over the transitions reqi↑, reqi↓, acki↑, acki↓, reqo↑, reqo↓, acko↑, acko↓]

Chaining Events Backwards

[Figure: the locally critical paths 1 to 4 chained backwards through the transitions reqi, acki, reqo, acko]

Theorem

PATHdata = [req↑]*

PATHsync = [ack↑ → req↓ → ack↓]*

GCP = [PATHdata → PATHsync]*

What does this mean?

PATHdata = [req↑]*

Good: wait for data

PATHsync = [ack↑ → req↓ → ack↓]*

Maybe bad: synchronization problem

GCP = [PATHdata → PATHsync]*
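To make the reading concrete, a path can be split into data segments and synchronization segments with a regular expression. This is a sketch under assumptions: transitions are strings such as `reqDE↑`, the channel name is dropped so only the kind of transition matters, and the function names are hypothetical. The sample path has the same shape as the first path of the example slide.

```python
import re

# One letter per kind of transition.
KIND = {"req↑": "R", "req↓": "r", "ack↑": "A", "ack↓": "a"}

def encode(path):
    """Map e.g. 'reqDE↑' to 'R' by keeping only 'req'/'ack' and the arrow."""
    return "".join(KIND[t[:3] + t[-1]] for t in path)

def segments(path):
    """Split a GCP into ('data', n) and ('sync', n) runs of n transitions."""
    code = encode(path)
    # (R | Ara)* accepts the same strings as [PATHdata → PATHsync]*.
    if not re.fullmatch(r"(?:R|Ara)*", code):
        raise ValueError("path does not have the form [PATHdata → PATHsync]*")
    return [("data" if m.group()[0] == "R" else "sync", len(m.group()))
            for m in re.finditer(r"R+|(?:Ara)+", code)]

loop = ["reqDE↑", "reqEG↑", "ackGC↑", "reqCE↓", "ackED↓"]
path = ["reqAD↑"] + loop * 9 + ["reqDE↑", "reqEG↑", "reqGM↑", "reqMN↑"]
segs = segments(path)
sync_transitions = sum(n for kind, n in segs if kind == "sync")
```

The sync segments are the ones worth inspecting first, since they mark places where the path waits on an acknowledgment rather than on data.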

An Example

reqAD↑ → [reqDE↑ → reqEG↑ → ackGC↑ → reqCE↓ → ackED↓]^9 → reqDE↑ → reqEG↑ → reqGM↑ → reqMN↑

reqAD↑ → [reqDE↑ → reqEG↑ → ackGJ↑ → reqJA↑]^9 → reqDE↑ → reqEG↑ → reqGM↑ → reqMN↑

Critical Path Toolflow

[Figure: toolflow with boxes and labels: C, CASH core, Verilog back-end, Synopsys/Cadence P/R, asynchronous circuit layout, P/R model, input data, ModelSim, PLI calls, execution trace, GCP extraction, GCP, feedback path]

Effectiveness

[Figure: bar chart; x-axis: Mediabench kernels and GM; y-axis: Improvement (%), 0 to 80; series: Time, Energy-Delay]

Conclusions: Global Critical Path

• Is defined as a path on the timed graph.

• Tracks dependences.

• Can be computed by automatic tools.

• Summarizes concurrent computation bottlenecks.

• Can be incorporated in a feedback loop to drive optimizations and de-optimizations.

• Is a profiling (input-dependent) concept.