Design & Co-design of Embedded Systems Distributed System Co-synthesis (2) Maziar Goudarzi

Post on 02-Jan-2016

Page 1: Design & Co-design of Embedded Systems

Design & Co-design of Embedded Systems

Distributed System Co-synthesis (2)

Maziar Goudarzi

Page 2: Design & Co-design of Embedded Systems

Fall 2005 Design & Co-design of Embedded Systems

Today's Program

Introduction
Preliminaries
Hardware/Software Partitioning
Distributed System Co-Synthesis (part 2)

References:

Wayne Wolf, “Hardware/Software Co-Synthesis Algorithms,” Chapter 2, Hardware/Software Co-Design: Principles and Practice, Eds: J. Staunstrup, W. Wolf, Kluwer Academic Publishers, 1997.

W. Wolf, “An architectural co-synthesis algorithm for distributed, embedded computing systems,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 5, no. 2, pp. 218-229, 1997.

Page 3: Design & Co-design of Embedded Systems

Topics

Introduction
An Integer Linear Programming Model
A Heuristic Algorithm
  On ordinary task graphs
  On an Object-Oriented model

Page 4: Design & Co-design of Embedded Systems

Co-Synthesis Algorithms: Distributed System Co-Synthesis

Wolf’s Heuristic Algorithm on Ordinary Task Graphs

Page 5: Design & Co-design of Embedded Systems

Wolf’s Heuristic Algorithm

As ever, topics of importance:
  System Specification Language/Model
  Target Architecture
  Functionality (Allocation/Scheduling) Quantum
  Allocation Strategy
  Scheduling Strategy
  Cost Estimation
  Performance Estimation
  Algorithm Details

Page 6: Design & Co-design of Embedded Systems

Wolf’s Heuristic Algorithm (cont’d)

System Specification Language/Model
  Algorithm input: single-rate task graph
Target Architecture
  Heterogeneous multiprocessor architecture
Allocation
  Primal approach: performance is the major objective
Scheduling?
Functionality Quantum
  Processes in a single-rate task graph

Page 7: Design & Co-design of Embedded Systems

Wolf’s Heuristic Algorithm (cont’d)

Performance Estimation
  Component Technology Library
  Run-time of each process on each available PE is assumed to be known
Cost Estimation
  Component Technology Library
  $\text{Total Cost} = \sum_i \text{Cost}(PE_i) + \sum_j \text{Cost}(Device_j) + \sum_k \text{Cost}(CommChannel_k)$
Algorithm Details
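The cost estimate above is a plain sum over the allocated components. A minimal sketch, assuming a hypothetical `Component` record and made-up library entries and prices (none of these names or numbers come from the slides):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One entry taken from a (hypothetical) component technology library."""
    name: str
    cost: float


def total_cost(pes, devices, channels):
    """Total cost = sum of PE costs + device costs + comm. channel costs."""
    return (sum(c.cost for c in pes)
            + sum(c.cost for c in devices)
            + sum(c.cost for c in channels))


# Illustrative allocation; the component names and costs are invented.
pes = [Component("cpu_fast", 10.0), Component("cpu_slow", 4.0)]
devices = [Component("adc", 1.5)]
channels = [Component("bus", 2.0)]
system_cost = total_cost(pes, devices, channels)
```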

Page 8: Design & Co-design of Embedded Systems

Wolf’s Heuristic Algorithm Details

Four major steps in co-design
  Partitioning: dividing the spec. into smaller parts (e.g. processes)
  Allocation: assigning each process to a multiprocessor node (PE)
  Scheduling: serializing processes assigned to each PE
  Mapping: selecting a particular component for each PE

Problem: These steps (especially allocation, scheduling, and mapping) have a circular relationship

Solution: Break the loop

Page 9: Design & Co-design of Embedded Systems

Wolf’s Heuristic Algorithm Details (cont’d)

Wolf:
  1. Give an initial allocation
  2. Refine it to reduce cost

Order of satisfying design criteria:
  1. Satisfy all deadlines
  2. Minimize PE cost
  3. Minimize comm. port cost
  4. Minimize device cost

Page 10: Design & Co-design of Embedded Systems

Wolf’s Heuristic Algorithm Details (cont’d)

First ignore communication costs. Later, take them into account

Steps:
  1. Create an initial feasible solution, and perform an initial scheduling on it.
     • Initial feasible solution: assign each process to a separate PE
  2. Reallocate processes to PEs to minimize total PE cost.
     • Possibly eliminate PEs from initial feasible solution
  3. Reallocate processes again to minimize the amount of communication required between PEs
  4. Allocate communication channels
  5. Allocate IO devices. (Internal or external to PEs)
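Step 1 above can be sketched in a few lines. The slide only says that each process gets its own PE; choosing the fastest PE type for each process is an assumption made here so that the starting point has the best chance of meeting deadlines. The function name, the `runtime` table layout, and the numbers are all hypothetical.

```python
def initial_allocation(processes, runtime):
    """Give every process its own PE instance.

    runtime[p][t] is the known run-time of process p on PE type t.
    Assumption (not stated on the slide): pick the fastest type per process.
    Returns {process: (pe_instance_name, pe_type)}.
    """
    allocation = {}
    for idx, p in enumerate(processes):
        fastest_type = min(runtime[p], key=runtime[p].get)
        allocation[p] = (f"pe{idx}", fastest_type)
    return allocation


# Illustrative run-time table for two PE types "A" and "B".
runtime = {
    "p1": {"A": 2.0, "B": 5.0},
    "p2": {"A": 3.0, "B": 1.0},
    "p3": {"A": 4.0, "B": 4.5},
}
alloc = initial_allocation(["p1", "p2", "p3"], runtime)
```

The later steps then only ever reduce cost from this starting point, which is why feasibility is established first.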

Page 11: Design & Co-design of Embedded Systems

Wolf’s Heuristic Algorithm Details (cont’d)

The most important step: 2. Initial reallocation
  Reason: PE cost is the dominant hardware cost

Initial reallocation
  1. PE cost reduction:
     1.1 Scan the PEs, starting with the least-utilized PE.
     1.2 Try to reallocate that PE’s processes to other existing PEs
     1.3 If no process is left on the PE, eliminate it; otherwise replace the PE with a suitable lower-cost one
  2. Pair-wise merge
     Merge a pair of PEs into a single, more powerful one
  3. Load balancing

Page 12: Design & Co-design of Embedded Systems

Wolf’s Heuristic Algorithm Details (cont’d)

Initial reallocation (cont’d)
  “PE cost reduction” phase tries to reallocate multiple processes at a time
  The above 3 phases are repeated as far as possible
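The PE cost reduction phase can be illustrated with a simplified sketch. It makes two assumptions that are not in the slides: a PE is treated as feasible when the summed run-times of its processes fit within the period (a stand-in for real scheduling), and moves are committed only when they empty the scanned PE. Replacing a PE by a cheaper type (step 1.3) is left out. All names are hypothetical.

```python
def pe_cost_reduction(alloc, runtime, pe_type, period):
    """Try to eliminate PEs, scanning from the least-utilized one.

    alloc:   {process: pe_instance}
    runtime: {process: {pe_type_name: run_time}}
    pe_type: {pe_instance: pe_type_name}
    """
    alloc = dict(alloc)  # work on a copy

    def load(a, pe):
        return sum(runtime[p][pe_type[pe]] for p, q in a.items() if q == pe)

    scan_order = sorted(set(alloc.values()), key=lambda q: load(alloc, q))
    for pe in scan_order:
        trial = dict(alloc)
        # Move several processes in one attempt (multiple-move strategy).
        for p in [p for p, q in alloc.items() if q == pe]:
            for target in sorted(set(trial.values()) - {pe}):
                if load(trial, target) + runtime[p][pe_type[target]] <= period:
                    trial[p] = target
                    break
        if pe not in trial.values():  # PE is now empty: eliminate it
            alloc = trial
    return alloc


# Illustrative data: three processes, one PE each, all PEs of type "A".
runtime = {"p1": {"A": 2}, "p2": {"A": 3}, "p3": {"A": 6}}
pe_type = {"pe0": "A", "pe1": "A", "pe2": "A"}
start = {"p1": "pe0", "p2": "pe1", "p3": "pe2"}
reduced = pe_cost_reduction(start, runtime, pe_type, period=10)
```

In this example the least-utilized PE (`pe0`) is emptied and dropped, while the other two cannot be merged without exceeding the period.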

Page 13: Design & Co-design of Embedded Systems

Wolf’s Heuristic Algorithm: Experimental Results

Example  #processes  Period  Impl. Cost     CPU time (sec)
                             Wolf   P&P     Wolf   P&P
pp1      4           2.5     14     14      0.05   11
                     3       14     13      0.05   24
                     4       7      7       0.05   28
                     7       5      5       0.05   37
pp2      9           5       15     15      0.7    3732
                     6       12     12      1.1    26710
                     7       8      8       1.6    32320
                     8       8      7       1.0    4511
                     15      5      5       1.1    385012

Page 14: Design & Co-design of Embedded Systems

Wolf’s Heuristic Algorithm Experimental Results (cont’d)

Finds optimal solutions to most of ILP-solved examples

Finds near-optimal solutions for the remaining examples

Showed good results on larger examples

Requires very little run-time
  Due to multiple-move strategy during PE cost minimization phase

Page 15: Design & Co-design of Embedded Systems

Co-Synthesis Algorithms: Distributed System Co-Synthesis

Wolf’s Heuristic Algorithm for Object-Oriented Models

Page 16: Design & Co-design of Embedded Systems

Introduction

Target
  Co-synthesis of a Distributed-System out of an Object-Oriented Specification

Significance
  OO is a promising approach in designing embedded systems at ESL

Reference:

W. Wolf, “Object-Oriented Co-Synthesis of Distributed Embedded Systems,” ACM Transactions on Design Automation of Electronic Systems, pp. 301-314, 1996

Page 17: Design & Co-design of Embedded Systems

OO Co-Synthesis Algorithm

Again, our eight topics
  System Specification Language/Model
  Target Architecture
  Functionality (Allocation/Scheduling) Quantum
  Allocation Strategy
  Scheduling Strategy
  Cost Estimation
  Performance Estimation
  Algorithm Details

Page 18: Design & Co-design of Embedded Systems

OO Co-Synthesis Algorithm (cont’d)

System Specification Model/Language
  An Object-Oriented Specification as input
  Method dataflow graph as model

Object O1
  method m1, variables v1, v2
  method m2, variables v2, v3

Object O2
  method m4, variables v10, v20

Object O3
  method m3, variables v8, v9
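The example specification above can be written down as a small data model. This is only a sketch: the `Method` and `Obj` classes and the `shared_variables` helper are hypothetical, and only the object, method, and variable names come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Method:
    name: str
    variables: frozenset  # data members the method reads or writes


@dataclass
class Obj:
    name: str
    methods: list = field(default_factory=list)


# The three objects shown on the slide.
spec = [
    Obj("O1", [Method("m1", frozenset({"v1", "v2"})),
               Method("m2", frozenset({"v2", "v3"}))]),
    Obj("O2", [Method("m4", frozenset({"v10", "v20"}))]),
    Obj("O3", [Method("m3", frozenset({"v8", "v9"}))]),
]


def shared_variables(obj):
    """Variables touched by more than one method of the same object."""
    seen, shared = set(), set()
    for m in obj.methods:
        shared |= seen & m.variables
        seen |= m.variables
    return shared


# Each (object, method) pair is one unit that can be allocated to a PE.
quanta = [(o.name, m.name) for o in spec for m in o.methods]
```

Here `m1` and `m2` of `O1` both use `v2`, which is the kind of coupling that makes it attractive to keep an object's methods on one PE.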

Page 19: Design & Co-design of Embedded Systems

OO Co-Synthesis Algorithm (cont’d)

Target Architecture
  Distributed System
    An arbitrary-topology network of PEs

Functionality Quantum
  Methods of Objects in an OO Specification
  As far as possible, keeps together all methods of an object
  Partitioning is done during algorithm execution

Page 20: Design & Co-design of Embedded Systems

OO Co-Synthesis Algorithm (cont’d)

Cost and Performance Estimation
  Pre-specified
    A technology description of available components is input to the algorithm

Allocation, Scheduling, and Algorithm Details
  Much like Wolf’s previous heuristic algorithm
  Includes modifications in order to:
    handle large sets of methods
    consider effects of splitting objects across PEs

Page 21: Design & Co-design of Embedded Systems

OO Co-Synthesis Algorithm (cont’d)

Allocation, Scheduling, and Algorithm Details
  1. Initial allocation and scheduling.
     Allocate processes to PEs such that all tasks are placed on PEs fast enough to ensure that all deadlines are met, keeping objects together as much as possible
  2. Minimize PE cost.
     Reallocate processes to PEs to minimize PE cost, splitting objects when necessary.
  3. Minimize communication.
     Reallocate processes again to minimize inter-PE communication, taking into account traffic generated by splitting objects across PEs

Page 22: Design & Co-design of Embedded Systems

OO Co-Synthesis Algorithm (cont’d)

Allocation, … Details (cont’d)
  4. Allocate channels.
     Allocate communication channels
  5. Allocate devices.
     Either as on-chip devices or external devices on communication channels

Page 23: Design & Co-design of Embedded Systems

OO Co-synthesis Details

Step 1 (initial allocation)
  One PE per object

Step 2 (minimize PE cost)
  oo_balance_load()
    Tries to redistribute methods to better balance the system load
  PE_replacement()
    Use a cheaper PE without disturbing the allocation
  oo_pairwise_merge()
    Tries to eliminate a PE by moving its methods to other PEs
  Step 2 is done repeatedly
    Methods are re-scheduled after each new allocation
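As an illustration of the replacement pass, the sketch below picks the cheapest PE type that can still run the methods already placed on a PE, leaving the allocation itself unchanged. Only the name `PE_replacement()` comes from the slide; the body, the signature, and the feasibility test (summed run-times within a deadline, in place of real scheduling) are assumptions.

```python
def pe_replacement(methods_on_pe, runtime, type_cost, deadline):
    """Return the cheapest PE type that keeps the current methods feasible.

    methods_on_pe: methods currently allocated to this PE (left unchanged)
    runtime:       {method: {pe_type: run_time}}
    type_cost:     {pe_type: cost}
    Returns None when no type in the library is fast enough.
    """
    feasible = [t for t in type_cost
                if sum(runtime[m][t] for m in methods_on_pe) <= deadline]
    return min(feasible, key=type_cost.get) if feasible else None


# Illustrative library with two PE types; all numbers are invented.
type_cost = {"fast": 10, "slow": 4}
runtime = {"m1": {"fast": 1, "slow": 3}, "m2": {"fast": 2, "slow": 5}}

loose = pe_replacement(["m1", "m2"], runtime, type_cost, deadline=8)
tight = pe_replacement(["m1", "m2"], runtime, type_cost, deadline=6)
impossible = pe_replacement(["m1", "m2"], runtime, type_cost, deadline=2)
```

With a loose deadline the cheaper type is chosen; as the deadline tightens, only the faster and more expensive type remains feasible.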

Page 24: Design & Co-design of Embedded Systems

OO Co-synthesis Details (cont’d)

Note: This operation may cause “hidden communication”.
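The figure for this slide is missing from the transcript. One plausible reading, consistent with the earlier remark about traffic generated by splitting objects across PEs, is that methods of one object that share variables end up on different PEs, so accesses to those variables become inter-PE traffic. The sketch below detects that situation; the function name and data layout are hypothetical.

```python
def hidden_communication(method_vars, placement):
    """Find variables that are accessed from more than one PE.

    method_vars: {method: set of variables it uses}
    placement:   {method: pe_instance}
    Returns {variable: set of PEs that access it}, limited to variables
    touched from at least two PEs.
    """
    pes_by_var = {}
    for method, variables in method_vars.items():
        for v in variables:
            pes_by_var.setdefault(v, set()).add(placement[method])
    return {v: pes for v, pes in pes_by_var.items() if len(pes) > 1}


# Two methods of one object that both use v2.
method_vars = {"m1": {"v1", "v2"}, "m2": {"v2", "v3"}}

together = hidden_communication(method_vars, {"m1": "pe0", "m2": "pe0"})
split = hidden_communication(method_vars, {"m1": "pe0", "m2": "pe1"})
```

Keeping both methods on one PE yields no cross-PE variables, while splitting them exposes `v2` as shared between two PEs.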

Page 25: Design & Co-design of Embedded Systems

OO Co-synthesis Details (cont’d)

Page 26: Design & Co-design of Embedded Systems

OO Co-Synthesis Algorithm (cont’d)

Experimental Results
  Algorithm implemented in C++
    Using NIH class library
    8600 lines of code
    Executed on SGI Indigo workstation
  Algorithm applied to examples from software engineering books on OO design

Example  #objects/methods  CPU Time
cfuge    2/3               0.05
dye      3/15              2.0
juice    3/4               0.05
train    5/6               0.05

Reason for highest CPU time (dye):
  Having the most methods => scheduling required in each inner loop of step 2
  This implementation had a simple, inefficient scheduler.

Page 27: Design & Co-design of Embedded Systems

OO Co-Synthesis Algorithm (cont’d)

Main contribution
  OO specification is an important aid to automatic partitioning
    The specification is naturally divided into two levels of granularity
      • System is composed of Objects
      • Objects are composed of data members and methods
  The heuristic:
    Preserve the specification’s partitioning as much as possible

Page 28: Design & Co-design of Embedded Systems

What we learned today

Distributed System Co-Synthesis
  A heuristic approach
    Non-OO algorithm
    Customization to OO specifications
  Heuristic: First minimize the PE cost since it is the dominant factor