A New Graph Structure for Hardware-Software Partitioning of Heterogeneous Systems

G. N. Khan and M. Jin
System-on-Chip Research Group
Electrical & Computer Engineering
Ryerson University, Toronto ON M5B 2K3
May 2004

Hardware-Software (HW/SW) Co-design

Objective:

To design hardware and software together early in the design cycle, producing a more reliable, efficient and first-time-right design within a reasonable time.

Hardware Software Partitioning

• Assignment of system parts to heterogeneous implementation units (hardware and software)

• Meet constraints (timing) and minimize cost (area, time to market)

• Directly affects the cost and performance of the final system

Specification

• Traditionally written in plain English

• Notations and languages such as MSC, SDL and SystemC were developed

• Both textual and graphical representations, such as a DAG (Directed Acyclic Graph), are used to describe a system.

What is DADGP?

• DADGP (Directed Acyclic Data dependency Graph with Precedence) is an extension of the DAG

• A DADGP is a superset of a DAG

• Two types of edges:

1) Weighted dependency edge

2) Precedence edge
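
One way to picture the structure described above is as an ordinary DAG container with a second edge set. The following is a minimal Python sketch, not the authors' implementation: the class name `DADGP`, its fields and its methods are all hypothetical, and storing a precedence edge as an ordered, unweighted pair is an assumption. The sketch does not check acyclicity.

```python
from dataclasses import dataclass, field


@dataclass
class DADGP:
    """Hypothetical container: a weighted DAG plus a set of precedence edges."""
    nodes: set = field(default_factory=set)
    # Weighted dependency edges: (src, dst) -> weight, e.g. amount of data transferred
    dependency: dict = field(default_factory=dict)
    # Precedence edges: unweighted (first, second) ordering pairs (assumed representation)
    precedence: set = field(default_factory=set)

    def add_dependency(self, src, dst, weight):
        self.nodes.update((src, dst))
        self.dependency[(src, dst)] = weight

    def add_precedence(self, first, second):
        self.nodes.update((first, second))
        self.precedence.add((first, second))

    def is_plain_dag(self):
        # With no precedence edges the structure reduces to an ordinary weighted DAG.
        return not self.precedence


g = DADGP()
g.add_dependency("A", "B", 3)   # data flows from A to B with weight 3
g.add_precedence("B", "C")      # B is ordered before C, with no data dependency
```

Keeping the two edge types in separate collections makes it cheap to drop a precedence edge later without touching the data dependencies.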

DADGP Example

• Arrow represents dependence relationship

• Precedence edge is represented with a line

• Precedence dependency captures the order of execution between nodes and such nodes can be executed in parallel.

• Only necessary parallelism is exposed

[Figure: example DADGP with nodes A, B, C and D; edge weights 1, 3, 10 and 5]

Overall System Partitioning Structure

[Flowchart: Specification -> Profiling -> LD Path Search -> Mapping -> Scheduling -> Valid Mapping? (Yes/No) -> Constraint Satisfied? (Yes/No) -> Finish]

System Partitioning Algorithm

i. Profile the system and build an initial DADGP

ii. Find the LD-path (longest delay path) in the DADGP

iii. Map LD-path nodes to hardware

iv. Schedule; if the mapping is invalid, go to Step iii

v. Update the DADGP and calculate the total execution time of the target system

vi. If the system constraints (specified by the user) are not met, go to Step ii; otherwise quit.
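
The steps above can be sketched as a greedy loop. This is a simplified illustration under stated assumptions, not the authors' algorithm: the function names and the example graph are hypothetical, edge (communication) weights are ignored, the total execution time is taken to be the length of the longest delay path, and the scheduling validity check of Step iv is omitted. The choice of which node to move follows the rule from the Mapping slide: keep the move that gives the shortest resulting longest delay path.

```python
def longest_delay_path(succ, delay):
    """Return (total_delay, path) for the longest path in a DAG.

    succ:  node -> list of successor nodes
    delay: node -> execution time under the current mapping
    """
    best = {}  # node -> (delay of the longest path starting here, that path)

    def visit(n):
        if n not in best:
            tails = [visit(s) for s in succ.get(n, [])]
            t, p = max(tails, key=lambda x: x[0], default=(0.0, []))
            best[n] = (delay[n] + t, [n] + p)
        return best[n]

    return max((visit(n) for n in delay), key=lambda x: x[0])


def partition(succ, sw_time, hw_time, deadline):
    """Move LD-path nodes to hardware until the deadline is met (assumed model)."""
    in_hw = set()

    def current_delay():
        return {n: hw_time[n] if n in in_hw else sw_time[n] for n in sw_time}

    total, path = longest_delay_path(succ, current_delay())      # Step ii
    while total > deadline:                                        # Step vi
        candidates = [n for n in path if n not in in_hw]
        if not candidates:
            break  # the LD path is already all hardware; deadline unreachable here

        def trial(n):
            # LD-path length if node n alone were moved to hardware
            d = current_delay()
            d[n] = hw_time[n]
            return longest_delay_path(succ, d)[0]

        in_hw.add(min(candidates, key=trial))                      # Step iii
        total, path = longest_delay_path(succ, current_delay())   # Steps v and ii
    return in_hw, total


# Hypothetical four-node graph: A -> B -> C and A -> D
succ = {"A": ["B", "D"], "B": ["C"]}
sw_time = {"A": 5, "B": 4, "C": 3, "D": 1}
hw_time = {"A": 1, "B": 1, "C": 1, "D": 1}
in_hw, total = partition(succ, sw_time, hw_time, deadline=7)
print(sorted(in_hw), total)  # ['A', 'B'] 5.0
```

In this example the all-software LD path A, B, C takes 12 time units; moving A and then B to hardware brings it down to 5, which meets the deadline of 7.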

Profiling

The profiler collects the following data:

• Execution time

• Amount of data transfer

• Execution order

• Data dependencies between nodes

Longest Delay Path Search

• Finding the longest delay path in the DADGP is like finding the bottleneck of the system

• Minimizes the search space for mapping

• The longest delay path is the path with the longest execution time

Mapping

• Maps a node to hardware

• Mapping can change the longest delay path, as well as the DADGP

• A mapping is valid if moving that node to hardware gives the shortest resulting longest delay path

Scheduling

• Uses a very simple list scheduling approach.

• Schedules the earliest node first without violating the resource limit.

• Exposes parallelism and changes the DADGP accordingly.
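
A generic list scheduler in the spirit of the bullets above might look like the following sketch. It is textbook list scheduling rather than the DADGP-specific scheduler: `list_schedule`, its arguments and the tie-breaking rule are assumptions, the processing elements (PEs) are treated as identical, and precedence edges and DADGP updates are not modelled. The graph is assumed to be acyclic.

```python
def list_schedule(pred, duration, num_pes):
    """Return node -> (start, finish, pe) for a DAG on num_pes identical PEs.

    pred:     node -> list of predecessor nodes (missing key means no predecessors)
    duration: node -> execution time
    """
    pe_free = [0.0] * num_pes   # time at which each PE becomes idle
    schedule = {}
    remaining = set(duration)

    def earliest(n):
        # A node may start once all of its predecessors have finished.
        return max((schedule[p][1] for p in pred.get(n, [])), default=0.0)

    while remaining:
        # Ready nodes: every predecessor has already been scheduled.
        ready = [n for n in remaining
                 if all(p in schedule for p in pred.get(n, []))]
        # Earliest node first; the node name is only a deterministic tie-break.
        node = min(ready, key=lambda n: (earliest(n), n))
        # Resource limit: use the PE that becomes idle first.
        pe = min(range(num_pes), key=lambda i: pe_free[i])
        start = max(earliest(node), pe_free[pe])
        finish = start + duration[node]
        schedule[node] = (start, finish, pe)
        pe_free[pe] = finish
        remaining.remove(node)
    return schedule


# Hypothetical example: c depends on a and b
pred = {"c": ["a", "b"]}
duration = {"a": 2, "b": 3, "c": 1}
sched = list_schedule(pred, duration, num_pes=2)
print(sched["c"])  # (3.0, 4.0, 0)
```

With two PEs, a and b run in parallel and c finishes at time 4; with a single PE the same graph finishes at time 6, which shows how the resource limit shapes the schedule.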

Summary of DADGP Scheduling

• Start scheduling from the root of the DADGP

• Traverse down the tree and schedule the node with the earliest starting time

• If the node is connected by a precedence dependency edge, check whether exposing parallelism can eliminate that edge. When an edge is eliminated, the DADGP may split into two DADGPs. The roots of the two DADGPs are combined to form a single DADGP with a dummy root node.

• In case of multiple descendants, schedule them forcibly by adding PEs

• Update the PE resource (HW-SW) library
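
The dummy-root step mentioned above can be illustrated with a small helper. This is only a sketch under assumptions: the function name `add_dummy_root`, the successor-map representation and the root label `"ROOT"` are hypothetical, node names are assumed to be strings, and the dummy node is assumed to carry no execution time.

```python
def add_dummy_root(succ, root="ROOT"):
    """Return a copy of succ in which one dummy root feeds every original root.

    succ: node -> list of successor nodes
    """
    targets = {d for dsts in succ.values() for d in dsts}
    nodes = set(succ) | targets
    roots = sorted(nodes - targets)      # nodes with no incoming edge
    if len(roots) <= 1:
        return {n: list(dsts) for n, dsts in succ.items()}  # already a single graph
    merged = {n: list(dsts) for n, dsts in succ.items()}
    merged[root] = roots                 # dummy root points at each former root
    return merged


# Two disconnected graphs: A -> C and B -> D
split = {"A": ["C"], "B": ["D"]}
merged = add_dummy_root(split)
print(merged["ROOT"])  # ['A', 'B']
```

Re-joining the pieces under one root lets the same traversal that starts "from the root" keep working after an edge has been eliminated.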

Constraints

• Deadline and cost constraints are given by the designer.

• Hardware cost is measured by gate count.

• A different granularity level should be explored if no solution is found.

Edge Detection Example

A pair of 3x3 masks is convolved to estimate the gradients (Gx and Gy) in the x and y directions

[Figure: DADGP for edge detection with nodes Gx, Gy, Gx², Gy² and Add; legend: data dependency, precedence dependency]

HW-SW Library

Operation             SW EXE (ms)   HW EXE (ms)   HW Area (gates)
Gradient (Gx or Gy)   9.4           1.4           1200
Square                5.2           0.9           500
Add                   3.88          0.3           100
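
As a worked check on the HW-SW library figures, the short script below adds up the gate counts as nodes are moved to hardware one at a time. It is an illustration only: the node names follow the later solutions slide, while the mapping order and the variable names are assumptions not stated in the slides. The software total ignores any communication cost.

```python
# Values copied from the HW-SW library table.
library = {
    "Gradient": {"sw_ms": 9.4, "hw_ms": 1.4, "gates": 1200},
    "Square":   {"sw_ms": 5.2, "hw_ms": 0.9, "gates": 500},
    "Add":      {"sw_ms": 3.88, "hw_ms": 0.3, "gates": 100},
}

# The five nodes of the edge-detection graph and the operation each one uses.
nodes = {"Gx": "Gradient", "Gy": "Gradient",
         "SqX": "Square", "SqY": "Square", "Add": "Add"}

# Assumed mapping order: the node with the largest software time first.
order = ["Gx", "Gy", "SqX", "SqY", "Add"]

areas, running = [], 0
for n in order:
    running += library[nodes[n]]["gates"]
    areas.append(running)          # cumulative hardware area after each move

# Sum of software execution times when nothing is in hardware.
all_sw_ms = sum(library[op]["sw_ms"] for op in nodes.values())

print(areas)  # [1200, 2400, 2900, 3400, 3500]
```

The cumulative areas 1200, 2400, 2900, 3400 and 3500 gates coincide with the non-zero HW area values on the horizontal axis of the performance chart, and the all-software sum comes to about 33.08 ms.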

Edge Detection Solutions

[Figure: five partitioning solutions for the edge-detection DADGP; each graph has nodes Gx, Gy, SqX, SqY and Add, with edge weights of 0.1]

Performance improvement vs. HW area

[Chart: execution time (vertical axis, labelled "Seconds") versus HW area (horizontal axis); HW area values 0, 1200, 2400, 2900, 3400, 3500; data labels 2.8, 6.38, 10.68, 15.88, 23.68, 33.8]

Conclusion

• HW-SW partitioning is an NP-hard problem

• Finding the optimal hardware-software partition is very difficult because many factors affect the partitioning decision.

• The DADGP structure exposes parallelism

• The complexity of the DADGP partitioning algorithm is approximately n^2 log(n).