
ECE 667 - Synthesis & Verification - Lecture 5

ECE 697B (667), Spring 2006

Synthesis and Verification of Digital Circuits

Scheduling: Constructive Algorithms


Scheduling – a Combinatorial Optimization Problem

• NP-complete problem
• Optimal solutions for special cases and ILP
• Heuristics – iterative improvement
• Heuristics – constructive
• Various versions of the problem
  – Unconstrained minimum latency
  – Resource-constrained minimum latency
  – Timing-constrained minimum latency
  – Latency-constrained minimum resources
• If all resources are identical, the problem reduces to multiprocessor scheduling (Hu's algorithm)
• The minimum-latency multiprocessor problem is intractable


Scheduling - Iterative Improvement

• Kernighan-Lin (deterministic)
• Simulated Annealing
• Lottery Iterative Improvement
• Neural Networks
• Genetic Algorithms
• Tabu Search


Scheduling - Constructive Techniques

• Most Constrained
• Least Constraining


Force Directed Scheduling

• Goal is to reduce hardware by balancing concurrency
• Iterative algorithm, one operation scheduled per iteration
• Information (i.e. speed & area) fed back into the scheduler


The Force Directed Scheduling Algorithm


Step 1

• Determine ASAP and ALAP schedules

[Figure: ASAP and ALAP schedules of the example data-flow graph (operations *, +, -, <)]
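Step 1 can be sketched in Python as a forward and a backward pass over the precedence graph. This is a minimal sketch, not the lecture's implementation: the function name `asap_alap`, the four-operation graph, and the assumption that every operation takes exactly one C-step are all illustrative.

```python
from collections import defaultdict

def asap_alap(ops, edges, latency):
    """Return (asap, alap) dicts mapping each op to a 1-based C-step.

    ops:     operation names in topological order (assumed by this sketch)
    edges:   (pred, succ) precedence pairs; every op takes one C-step
    latency: number of C-steps available to the ALAP schedule
    """
    preds, succs = defaultdict(list), defaultdict(list)
    for p, s in edges:
        preds[s].append(p)
        succs[p].append(s)
    asap = {}
    for op in ops:  # forward pass: one step after the latest predecessor
        asap[op] = 1 + max((asap[p] for p in preds[op]), default=0)
    alap = {}
    for op in reversed(ops):  # backward pass: one step before the earliest successor
        alap[op] = min((alap[s] for s in succs[op]), default=latency + 1) - 1
    return asap, alap

# Hypothetical graph: m1 -> m2 -> s1 and a1 -> s1
ops = ["m1", "a1", "m2", "s1"]
edges = [("m1", "m2"), ("m2", "s1"), ("a1", "s1")]
asap, alap = asap_alap(ops, edges, latency=3)
```

In this toy graph only `a1` has any freedom: its ASAP step is 1 and its ALAP step is 2, while the three operations on the critical path get identical ASAP and ALAP steps.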


Step 2

• Determine the Time Frame of each op
  – Length of box ~ possible execution cycles
  – Width of box ~ probability of assignment
  – Uniform distribution, area assigned = 1

[Figure: Time Frames of the operations over C-steps 1 to 4; box widths of 1/2 and 1/3 are marked]
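The uniform-probability rule of Step 2 can be illustrated as follows. This is a sketch under assumptions: the helper name `probabilities` and the three example time frames are hypothetical, and each time frame is taken to run from the ASAP step to the ALAP step inclusive.

```python
from fractions import Fraction

def probabilities(asap, alap):
    """Uniform probability of each op over its time frame [asap, alap]."""
    prob = {}
    for op in asap:
        width = alap[op] - asap[op] + 1  # number of possible C-steps
        prob[op] = {step: Fraction(1, width)
                    for step in range(asap[op], alap[op] + 1)}
    return prob

# Hypothetical frames: m1 is fixed, m2 spans two steps, a1 spans three
asap = {"m1": 1, "m2": 1, "a1": 2}
alap = {"m1": 1, "m2": 2, "a1": 4}
prob = probabilities(asap, alap)
```

An operation with a two-step frame gets probability 1/2 in each step and one with a three-step frame gets 1/3, so the "area" of every box is 1.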


Step 3

• Create Distribution Graphs
  – Sum of probabilities of each op type
  – Indicates concurrency of similar ops

$DG(i) = \sum \text{Prob}(\text{Op}, i)$

[Figure: DG for Multiply; DG for Add, Sub, Comp]
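A distribution graph can be built by accumulating the uniform probabilities per operation type. The sketch below is illustrative only: the function name, the operation names, and the time frames are assumptions, with the multiply frames chosen so that the resulting values match the DG(1) = 2.833 and DG(2) = 2.333 quoted in the self-force example of this lecture.

```python
from collections import defaultdict
from fractions import Fraction

def distribution_graphs(frames, op_type):
    """DG[type][step] = sum of Prob(op, step) over all ops of that type.

    frames: op -> (first_step, last_step), inclusive time frame
    """
    dg = defaultdict(lambda: defaultdict(Fraction))
    for op, (first, last) in frames.items():
        p = Fraction(1, last - first + 1)  # uniform over the frame
        for step in range(first, last + 1):
            dg[op_type[op]][step] += p
    return dg

# Hypothetical frames for six multiplies and one add
frames = {"m1": (1, 1), "m2": (1, 1), "m3": (1, 2), "m4": (1, 3),
          "m5": (2, 2), "m6": (2, 3), "a1": (2, 4)}
op_type = {op: ("add" if op.startswith("a") else "mul") for op in frames}
dg = distribution_graphs(frames, op_type)
mul = [float(dg["mul"][i]) for i in range(1, 5)]
```

Here `mul` comes out as roughly 2.833, 2.333, 0.833 and 0 for C-steps 1 to 4, and the values sum to 6, the number of multiplies.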


Diff Eq Example: Precedence Graph Recalled


Diff Eq Example: Time Frame & Probability Calculation


Diff Eq Example: DG Calculation


Conditional Statements

• Operations in different branches are mutually exclusive
• Operations of the same type can be overlapped onto the DG
• Probability of the most likely operation is added to the DG

[Figure: fork/join conditional with + and - operations in each branch, and the resulting DG for Add]
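One way to read the rule above is that mutually exclusive operations of the same type contribute the maximum of their probabilities in each C-step instead of the sum. The sketch below encodes that reading under a simplifying assumption (at most one operation of the given type per branch); the function name and the numbers are hypothetical.

```python
def conditional_dg(shared_ops, exclusive_ops, steps):
    """DG for one op type when some ops sit in mutually exclusive branches.

    shared_ops:    list of {step: prob} dicts for unconditional ops
    exclusive_ops: list of {step: prob} dicts, one op per branch (assumption)
    """
    dg = {}
    for i in steps:
        shared = sum(p.get(i, 0.0) for p in shared_ops)
        # Only one branch executes, so count the most likely op only
        exclusive = max((p.get(i, 0.0) for p in exclusive_ops), default=0.0)
        dg[i] = shared + exclusive
    return dg

# Hypothetical adds: one unconditional, one in each of two branches
shared = [{1: 1.0}]
branches = [{2: 0.5, 3: 0.5}, {2: 1.0}]
dg_add = conditional_dg(shared, branches, steps=range(1, 4))
```

In C-step 2 the two branch adds have probabilities 0.5 and 1.0, and only the larger one is counted, so they can share one adder.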


Self Forces

Scheduling an operation will affect overall concurrency.
Every operation has a "self force" for every C-step of its time frame.
Analogous to the effect of a spring: f = K x

A desirable scheduling will have a negative self force.
It will achieve better concurrency (lower potential energy).

Force(i) = DG(i) * x(i)

DG(i) ~ current Distribution Graph value

x(i) ~ change in the operation's probability

$\text{Self Force}(j) = \sum_{i=t}^{b} \text{Force}(i)$

where t and b are the top and bottom C-steps of the operation's time frame.


Example

Attempt to schedule the multiply in C-step 1:

Self Force(1) = Force(1) + Force(2)
= ( DG(1) * x(1) ) + ( DG(2) * x(2) )
= [2.833 * (0.5) + 2.333 * (-0.5)] = +0.25

This is positive, so scheduling the multiply in the first C-step would be bad.

[Figure: time frames over C-steps 1 to 4 (box widths 1/2 and 1/3 marked) and the DG for Multiply]
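The self-force arithmetic of this example can be checked with a few lines of Python. This is a sketch, not the lecture's code: the function name `self_force` is hypothetical, and it assumes a uniform probability over a time frame given as (top, bottom).

```python
def self_force(dg, frame, step):
    """Self force of fixing an op at `step` within frame = (top, bottom)."""
    top, bottom = frame
    p = 1.0 / (bottom - top + 1)  # uniform probability before scheduling
    force = 0.0
    for i in range(top, bottom + 1):
        # x(i): probability becomes 1 at the chosen step and 0 elsewhere
        x = (1.0 if i == step else 0.0) - p
        force += dg[i] * x
    return force

dg_multiply = {1: 2.833, 2: 2.333}  # DG values quoted in the example
force_1 = self_force(dg_multiply, (1, 2), 1)
force_2 = self_force(dg_multiply, (1, 2), 2)
```

With these DG values, `force_1` is about +0.25 and `force_2` about -0.25, so by self force alone C-step 2 is the better choice.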


Diff Eq Example: Self Force for Node 4


Predecessor & Successor Forces

• Scheduling an operation may affect the time frames of other linked operations
• This may negate the benefits of the desired assignment
• Predecessor/Successor Forces = sum of the Self Forces of any implicitly scheduled operations

[Figure: example data-flow graph (operations *, +, -, <)]


Diff Eq Example: Successor Force on Node 4

• If node 4 is scheduled in step 1
  – no effect on the time frame of successor node 8
• Total force = Force4(1) = +0.25
• If node 4 is scheduled in step 2
  – causes node 8 to be scheduled into step 3
  – must calculate the successor force
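The step-2 case can be sketched numerically by treating each affected operation as a time frame that shrinks. Several inputs here are assumptions rather than values from the slides: that node 8 is a multiply with time frame (2, 3), and that the multiply DG in C-step 3 is 0.833. The helper names are hypothetical as well.

```python
def prob(frame, step):
    """Uniform probability of an op with frame = (top, bottom) at `step`."""
    top, bottom = frame
    return 1.0 / (bottom - top + 1) if top <= step <= bottom else 0.0

def frame_change_force(dg, old_frame, new_frame):
    """Sum of DG(i) * x(i) when an op's frame shrinks to new_frame."""
    return sum(dg[i] * (prob(new_frame, i) - prob(old_frame, i))
               for i in range(old_frame[0], old_frame[1] + 1))

dg_multiply = {1: 2.833, 2: 2.333, 3: 0.833}  # DG(3) is an assumed value

# Node 4: frame (1, 2) fixed to step 2; node 8 is then pushed to step 3
self_force_4 = frame_change_force(dg_multiply, (1, 2), (2, 2))
succ_force_8 = frame_change_force(dg_multiply, (2, 3), (3, 3))
total_force = self_force_4 + succ_force_8
```

Under these assumptions the self force is about -0.25 and the successor force about -0.75, for a total of about -1.0, which is lower than the +0.25 obtained for step 1.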


Diff Eq Example: Final Time Frame and Schedule


Diff Eq Example: Final DG


Lookahead

• Temporarily modify the constant DG(i) to include the effect of the iteration being considered

Force(i) = temp_DG(i) * x(i)
temp_DG(i) = DG(i) + x(i)/3

• Consider the previous example:

Self Force(1) = (DG(1) + x(1)/3) x(1) + (DG(2) + x(2)/3) x(2)
= 0.5 (2.833 + 0.5/3) - 0.5 (2.333 - 0.5/3)
= +0.41667

• This is even worse than before
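The lookahead variant only changes the DG term inside the force sum. The sketch below reproduces the number above; the function name is hypothetical and a uniform probability over the time frame (top, bottom) is assumed.

```python
def lookahead_self_force(dg, frame, step):
    """Self force using temp_DG(i) = DG(i) + x(i)/3 in place of DG(i)."""
    top, bottom = frame
    p = 1.0 / (bottom - top + 1)
    force = 0.0
    for i in range(top, bottom + 1):
        x = (1.0 if i == step else 0.0) - p  # change in probability
        temp_dg = dg[i] + x / 3.0            # lookahead correction
        force += temp_dg * x
    return force

dg_multiply = {1: 2.833, 2: 2.333}  # DG values from the previous example
force = lookahead_self_force(dg_multiply, (1, 2), 1)
```

The correction adds x(i)^2 / 3 for every C-step in the frame, so it always raises the force relative to the basic formula: here from +0.25 to about +0.41667.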


Minimization of Bus Costs

• Basic algorithm suitable for a narrow class of problems
• Algorithm can be refined to consider "cost" factors
• Number of buses ~ number of concurrent data transfers
• Number of buses = maximum number of transfers in any C-step
• Create a modified DG to include transfers: Transfer DG

$\text{Trans DG}(i) = \sum [\text{Prob}(\text{op}, i) \times \text{Opn\_No\_InOuts}]$

Opn_No_InOuts ~ combined distinct inputs/outputs for Op

• Calculate Force with this DG and add to Self Force
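The Transfer DG is the ordinary DG with each probability weighted by the operation's number of distinct inputs and outputs. A minimal sketch, with hypothetical operation names and weights:

```python
def transfer_dg(probs, in_outs, steps):
    """Trans DG(i) = sum over ops of Prob(op, i) * Opn_No_InOuts(op)."""
    return {i: sum(p.get(i, 0.0) * in_outs[op] for op, p in probs.items())
            for i in steps}

# Hypothetical ops: m1 fixed in step 1, m2 spread over steps 1 and 2
probs = {"m1": {1: 1.0}, "m2": {1: 0.5, 2: 0.5}}
in_outs = {"m1": 3, "m2": 2}  # assumed distinct inputs + outputs per op
trans = transfer_dg(probs, in_outs, steps=range(1, 3))
```

Here `trans` is `{1: 4.0, 2: 1.0}`: C-step 1 carries the expected transfers of `m1` (3) plus half of those of `m2` (1).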


Minimization of Register Costs

• The minimum number of registers required is given by the largest number of data arcs crossing a C-step boundary
• Create Storage Operations at the output of any operation that transfers a value to a destination in a later C-step
• Generate a Storage DG for these "operations"
• Length of a storage operation depends on the final schedule

[Figure: storage distribution for S between source s and destination d, showing the ASAP lifetime, MAX lifetime, and ALAP lifetime]


Minimization of Register Costs (contd.)

• $[\text{avg life}] = \dfrac{[\text{ASAP life}] + [\text{ALAP life}] + [\text{MAX life}]}{3}$
• $\text{storage DG}(i) = \dfrac{[\text{avg life}]}{[\text{max life}]}$ (no overlap between ASAP & ALAP)
• $\text{storage DG}(i) = \dfrac{[\text{avg life}] - [\text{overlap}]}{[\text{max life}] - [\text{overlap}]}$ (if overlap)
• Calculate and add "Storage" Force to Self Force

[Figure: ASAP vs. Force Directed schedules; panel labels "7 registers minimum" and "5 registers minimum"]
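The storage DG formulas can be written directly as a small helper. This sketch follows the reconstructed equations above; the function name and the lifetime values in the example are hypothetical.

```python
def storage_dg_value(asap_life, alap_life, max_life, overlap=0):
    """Storage DG contribution of one storage operation.

    All arguments are lifetimes measured in C-steps; `overlap` is the
    number of C-steps shared by the ASAP and ALAP lifetimes.
    """
    avg_life = (asap_life + alap_life + max_life) / 3
    if overlap == 0:
        return avg_life / max_life
    return (avg_life - overlap) / (max_life - overlap)

no_overlap = storage_dg_value(asap_life=2, alap_life=2, max_life=4)
with_overlap = storage_dg_value(asap_life=3, alap_life=3, max_life=4, overlap=2)
```

Both hypothetical cases evaluate to 2/3: in the first the average life is 8/3 out of a maximum of 4, in the second (10/3 - 2) / (4 - 2).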


Pipelining

[Figure: Functional Pipelining, with two overlapped instances of the data-flow graph (Instance, Instance') and the DG for Multiply; C-step labels 1, 2, 3/1', 4/2', 3', 4'. Structural Pipelining, with multiplies spanning C-steps 1 to 4]

• Functional Pipelining
  – Pipelining across multiple operations
  – Must balance distribution across groups of concurrent C-steps
  – Cut DG horizontally and superimpose
  – Finally perform regular Force Directed Scheduling
• Structural Pipelining
  – Pipelining within an operation
  – For non-data-dependent operations, only the first C-step need be considered
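"Cut DG horizontally and superimpose" can be read as folding the DG modulo the interval at which new instances start, so that C-steps executing concurrently in different instances are added together. The sketch below encodes that reading; the function name, the interval of 2, and the DG values are assumptions.

```python
def fold_dg(dg, interval):
    """Superimpose C-steps that run concurrently in overlapped instances.

    dg:       {step: value} with 1-based C-steps
    interval: number of C-steps between the starts of successive instances
    """
    folded = [0.0] * interval
    for step, value in dg.items():
        folded[(step - 1) % interval] += value
    return folded

dg_multiply = {1: 2.0, 2: 1.5, 3: 0.5, 4: 1.0}  # hypothetical DG
folded = fold_dg(dg_multiply, interval=2)
```

With a new instance every 2 C-steps, step 3 overlaps step 1' and step 4 overlaps step 2', so `folded` is `[2.5, 2.5]`; force directed scheduling would then balance this folded graph.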


Other Optimizations

• Local timing constraints
  – Insert dummy timing operations -> restricted time frames
• Multiclass FUs
  – Create a multiclass DG by summing probabilities of relevant ops
• Multistep/chained operations
  – Carry propagation delay information with the operation
  – Extend time frames into other C-steps as required
• Hardware constraints
  – Use Force as the priority function in list scheduling algorithms


Scheduling using Simulated Annealing

Reference: Devadas, S.; Newton, A.R. "Algorithms for hardware allocation in data path synthesis." IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, July 1989, Vol. 8, No. 7, pp. 768-81.


Simulated Annealing

[Figure: Local Search; axes: solution space, cost function]


Statistical Mechanics / Combinatorial Optimization

State $\{r_i\}$ (configuration: a set of atomic positions)

Weight $e^{-E(\{r_i\})/k_B T}$ (Boltzmann distribution)

$E(\{r_i\})$: energy of the configuration

$k_B$: Boltzmann constant

$T$: temperature

Low temperature limit?


Analogy

Physical System          Optimization Problem
State (configuration)    Solution
Energy                   Cost Function
Ground State             Optimal Solution
Rapid Quenching          Iterative Improvement
Careful Annealing        Simulated Annealing


Generic Simulated Annealing Algorithm

1. Get an initial solution S
2. Get an initial temperature T > 0
3. While not yet "frozen" do the following:
   3.1 For 1 ≤ i ≤ L, do the following:
       3.1.1 Pick a random neighbor S' of S
       3.1.2 Let Δ = cost(S') - cost(S)
       3.1.3 If Δ ≤ 0 (downhill move), set S = S'
       3.1.4 If Δ > 0 (uphill move), set S = S' with probability e^(-Δ/T)
   3.2 Set T = rT (reduce temperature)
4. Return S
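The generic loop translates almost line for line into Python. This is a sketch rather than a tuned implementation: the parameter defaults, the "frozen" test (temperature below a threshold), and the toy integer problem are all assumptions made for illustration.

```python
import math
import random

def simulated_annealing(initial, cost, neighbor, t0=10.0, r=0.9,
                        moves_per_temp=50, t_frozen=1e-3, rng=None):
    """Generic simulated annealing; returns the final solution S."""
    rng = rng or random.Random(0)
    s, t = initial, t0                        # steps 1 and 2
    while t > t_frozen:                       # step 3: not yet "frozen"
        for _ in range(moves_per_temp):       # step 3.1: L moves per temperature
            s_new = neighbor(s, rng)          # 3.1.1: random neighbor
            delta = cost(s_new) - cost(s)     # 3.1.2
            # 3.1.3: always accept downhill; 3.1.4: uphill with prob e^(-delta/T)
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                s = s_new
        t *= r                                # 3.2: reduce temperature
    return s                                  # step 4

# Toy problem (hypothetical): minimize (x - 3)^2 over the integers
result = simulated_annealing(
    initial=20,
    cost=lambda x: (x - 3) ** 2,
    neighbor=lambda x, rng: x + rng.choice((-1, 1)),
)
```

For scheduling, the solution would be an assignment of operations to C-steps, a neighbor would move one operation within its time frame, and the cost would reflect resource usage; those choices correspond to the ingredients of solution space, neighborhood structure, cost function, and annealing schedule.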


Basic Ingredients for S.A.

• Solution Space
• Neighborhood Structure
• Cost Function
• Annealing Schedule


Observation

• All scheduling algorithms we have discussed so far are critical path schedulers
• They can only generate schedules for an iteration period larger than or equal to the critical path
• They only exploit concurrency within a single iteration, and only utilize the intra-iteration precedence constraints


Example

• Can one do better than an iteration period of 4?
  – Pipelining + retiming can reduce the critical path to 3, and also the number of functional units
• Approaches
  – Transformations followed by scheduling
  – Transformations integrated with scheduling