reliable embedded multimedia systems?tbasten/presentations/wisc20090921.pdf · p40 145 p50 190 p60...

12
Reliable Embedded Multimedia Systems? ‘Knowing is not understanding.’ Charles Kettering Twan Basten Joint work with Marc Geilen, AmirHossein Ghamarian, Hamid Shojaei, Sander Stuijk, Bart Theelen, and others Funding: NWO PROMES EC FP6 Betsy, FP7 MNEMEE 2 Overview Embedded Multi-media Analysis of Synchronous Dataflow Graphs Throughput Analysis Throughput-Storage Trade-off Analysis Scenario-aware Analysis Run-time Adaptation Looking Forward 3 Trends Concurrency Connectedness Interaction Variability Embedded Multi-media emb. mm throughput storage adaptation the future scenarios 4 Embedded Multi-media emb. mm throughput storage adaptation the future scenarios

Upload: others

Post on 13-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Reliable Embedded Multimedia Systems?tbasten/presentations/wisc20090921.pdf · P40 145 P50 190 P60 235 P70 265 P80 310 P99 390 RC I 350 P0 0 P30, P 40, P 50 250 P60 300 P70, P 80,

Reliable Embedded Multimedia Systems?

‘Knowing is not understanding.’

Charles Kettering

Twan Basten

Joint work with

Marc Geilen, AmirHossein Ghamarian, Hamid Shojaei, Sander Stuijk, Bart Theelen, and others

Funding:

NWO PROMES

EC FP6 Betsy, FP7 MNEMEE

2 Overview

• Embedded Multi-media

• Analysis of Synchronous Dataflow Graphs

• Throughput Analysis

• Throughput-Storage Trade-off Analysis

• Scenario-aware Analysis

• Run-time Adaptation

• Looking Forward

3

Trends

Concurrency

Connectedness

Interaction

Variability

Embedded Multi-media

emb. mm throughput storage adaptation the futurescenarios

4 Embedded Multi-media

emb. mm throughput storage adaptation the futurescenarios

Page 2: Reliable Embedded Multimedia Systems?tbasten/presentations/wisc20090921.pdf · P40 145 P50 190 P60 235 P70 265 P80 310 P99 390 RC I 350 P0 0 P30, P 40, P 50 250 P60 300 P70, P 80,

5 Embedded Multi-media

emb. mm throughput storage adaptation the futurescenarios

6 Embedded Multi-media

emb. mm throughput storage adaptation the futurescenarios

7 Approach: Models of Computation (MoCs)

functionality

concurrency

timing

energy

quality…

Basic dimensions

processing

communication

storage

functionality

concurrency

timing

energy

quality…

Aspects Metrics

executiontime

energydissipation

perceivedquality

reconfigurationtime

essential for predictabilityon the interface between science and engineering

emb. mm throughput storage adaptation the futurescenarios

8 Essential Challenges

predictability and efficiency

… are contradictory

emb. mm throughput storage adaptation the futurescenarios

Page 3: Reliable Embedded Multimedia Systems?tbasten/presentations/wisc20090921.pdf · P40 145 P50 190 P60 235 P70 265 P80 310 P99 390 RC I 350 P0 0 P30, P 40, P 50 250 P60 300 P70, P 80,

9 Scenario-aware H.263 Decoder Model

1

1VLD IDCT

RCMCFD1

1

3

1 1

1

1

1

11

1 e

ddd

a

bc c

Detector

Kernel

Parameterized

rate

Control channelData

channel

Tokens

I P99 P99

P0

...

1

Fixed rate

Rate I Px P0

a 0 1 0

b 0 x 0

c 99 x 1

d 1 1 0

e 99 x 0

Execution time

VLDP0 0

others 40

IDCTP0 0

others 17

MC

I, P0 0

P30 90

P40 145

P50 190

P60 235

P70 265

P80 310

P99 390

RC

I 350

P0 0

P30, P40, P50 250

P60 300

P70, P80, P99 320

FD All 0

x={30,40,50,60,70,80,99}

(FSM-based Scenario-Aware DataFlow (SADF))

MEMOCODE 2006emb. mm throughput storage adaptation the futurescenarios

10 The Goal: Scenario-aware Flow

extract application

model

keep trade-offs

select operating

point and switch

dynamically

emb. mm throughput storage adaptation the futurescenarios

11 Overview

• Embedded Multi-media

• Analysis of Synchronous Dataflow Graphs

• Throughput Analysis

• Throughput-Storage Trade-off Analysis

• Scenario-aware Analysis

• Run-time Adaptation

• Looking Forward

ACSD 2006

12 Synchronous Data Flow Graphs

A,1 B,2 C,22 3 1 2

2 3 1 2

1 1

1

1 1

1

1 1

1

4 2

channelratetoken

execution time

actor

Throughput: average number of actor firings over time

(SDFGs [Lee 1986])

storage adaptation the futurescenariosemb. mm throughput

Iteration: smallest non-empty set of actor firings that does not change

the token distribution (example: A:3, B:2, C:1)

Page 4: Reliable Embedded Multimedia Systems?tbasten/presentations/wisc20090921.pdf · P40 145 P50 190 P60 235 P70 265 P80 310 P99 390 RC I 350 P0 0 P30, P 40, P 50 250 P60 300 P70, P 80,

13 Our Throughput Analysis Method

A

A

A, C

A, CB

B

B

B

A

Periodic PhaseTransient Phase

Throughput can be calculatedfrom the periodic phase

Efficient implementationConsider one designated firing per iteration only to detect a recurrent state

storage adaptation the futurescenariosemb. mm throughput

• Self-timed execution:

• Each actor fires as soon as it can fire

• Known to maximize throughput

• Execution state: distribution of tokens + remaining execution times

• Execution: transient + periodic phase

14 Throughput Analysis: Our Result

1. 10-31. 10-31. 10-31. 10-3MP3 dec.

>1800>1800>180010. 10-3H.263 decoder

>1800>1800>180056. 10-3Satellite

>1800>1800>18002. 10-3Sample Rate

81. 10-381. 10-382. 10-31.10-3Modem

Young

Tarjan

Orlin

HowardDasdan

Gupta

State

Space

Runtimes of various throughput analysis methods (in seconds)

Traditional methods, all using

conversion to Homogeneous SDFGs

storage adaptation the futurescenariosemb. mm throughput

15 Overview

• Embedded Multi-media

• Analysis of Synchronous Dataflow Graphs

• Throughput Analysis

• Throughput-Storage Trade-off Analysis

• Scenario-aware Analysis

• Run-time Adaptation

• Looking Forward

DAC 2006, IEEE T on Comp 2008

16

`

5 6 7 8 9 10 110

0.05

0.1

0.15

0.2

0.25

thro

ug

hp

ut

storage

Trade-off: Storage vs. Throughput

DSP synthesis

multi-media

applications

adaptation the futurescenariosemb. mm throughput storage

Page 5: Reliable Embedded Multimedia Systems?tbasten/presentations/wisc20090921.pdf · P40 145 P50 190 P60 235 P70 265 P80 310 P99 390 RC I 350 P0 0 P30, P 40, P 50 250 P60 300 P70, P 80,

17 Storage Distribution

• Separate memory for each channel.

• Storage distribution, for example, ⟨α, β⟩ → ⟨4,2⟩

A,1 B,2 C,22 3 1 2α β

Tokens on channels must be stored in memory.

1 1

1

1 1

1

1 1

1

adaptation the futurescenariosemb. mm throughput storage

18 Problem Definition

5 6 7 8 9 10 110

0.05

0.1

0.15

0.2

0.25

thro

ug

hp

ut

distribution size

⟨⟨⟨⟨4,2⟩⟩⟩⟩⟨⟨⟨⟨6,2⟩⟩⟩⟩

⟨⟨⟨⟨6,3⟩⟩⟩⟩

⟨⟨⟨⟨7,3⟩⟩⟩⟩

⟨⟨⟨⟨5,3⟩⟩⟩⟩

Find all minimal storage distributions for any possible throughput

1 1

1

1 1

1

1 1

1

A,1 B,2 C,22 3 1 2α β

adaptation the futurescenariosemb. mm throughput storage

19

tk(α)

tk(β)

sp(α)

sp(β)

tk(A)

tk(α)sp(α)

Dependency Graph

A3 A4

B0

C0

A2 B1

cycles denotestorage

dependencies

sp - space

tk - token

adaptation the futurescenariosemb. mm throughput storage

1 1

1

1 1

1

1 1

1

A,1 B,2 C,22 3 1 2α β

Storage distribution ⟨α, β⟩ → ⟨4,2⟩

dependency:an actor firing start

may depend on an actor firing end

firings in one

period

20

tk(α)

tk(β)

sp(α)

sp(β)

tk(A)

tk(α)sp(α)

Dependency Graph

A3 A4

B0

C0

A2 B1

cycles denotestorage

dependencies

sp - space

tk - token

adaptation the futurescenariosemb. mm throughput storage

resolving storage dependencies may increase throughput

dependency:an actor firing start

may depend on an actor firing end

firings in one

period

Page 6: Reliable Embedded Multimedia Systems?tbasten/presentations/wisc20090921.pdf · P40 145 P50 190 P60 235 P70 265 P80 310 P99 390 RC I 350 P0 0 P30, P 40, P 50 250 P60 300 P70, P 80,

21

tk(α)

tk(β)

sp(α)

sp(β)

tk(A)

tk(α)sp(α)

Abstract Dependency Graph

A3 A4

B0

C0

A2 B1

tk(α) tk(β)

sp(β)

tk(A)

sp(α)

A B C

abstract dependency graph contains all storage

dependencies

adaptation the futurescenariosemb. mm throughput storage

easily constructed from state-space exploration

dependency graphmay be very large

dependencies between actors

22 Design-Space Exploration Algorithm

Given an SDFG G with storage distribution δ

1. Compute throughput and abstract dependency graph ∆ for G with δ

2. For each channel c with a storage dependency in ∆

Create new storage distribution by enlarging c

3. Repeat steps 1-2

All minimal storage distributions are found

when starting from storage distribution ⟨0,...,0⟩

adaptation the futurescenariosemb. mm throughput storage

23 Experimental Results

2ms

3

3

16

1/132

12

1/137

12

13

MP3

53min

3254

3254

8006

1/10000

4753

1/21876

3

4

H.263

7ms

2

2

1544

0.23

1542

0.18

26

22

Satellite

1msExec. time

5#Distributions

4#Pareto points

10Distr. size

1/4Max throughput

6Distr. size

1/7min throughput > 0

2#channels

3#actors

Example

Approximation

increase buffers with n times the minimal step size

adaptation the futurescenariosemb. mm throughput storage

24 Conclusions Trade-off Analysis

• Exact minimum memory requirements for any possible throughput

• Technique can be combined with heuristics to prune the search space if necessary

• Technique can be generalized to other types of resources and performance metrics

adaptation the futurescenariosemb. mm throughput storage

Page 7: Reliable Embedded Multimedia Systems?tbasten/presentations/wisc20090921.pdf · P40 145 P50 190 P60 235 P70 265 P80 310 P99 390 RC I 350 P0 0 P30, P 40, P 50 250 P60 300 P70, P 80,

25 Overview

• Embedded Multi-media

• Analysis of Synchronous Dataflow Graphs

• Throughput Analysis

• Throughput-Storage Trade-off Analysis

• Scenario-aware Analysis

• Run-time Adaptation

• Looking Forward

26 Scenario-aware H.263 Decoder Model

1

1VLD IDCT

RCMCFD1

1

3

1 1

1

1

1

11

1 e

ddd

a

bc c

Detector

Kernel

Parameterized

rate

Control channelData

channel

Tokens

I P99 P99

P0

...

1

Fixed rate

Rate I Px P0

a 0 1 0

b 0 x 0

c 99 x 1

d 1 1 0

e 99 x 0

Execution time

VLDP0 0

others 40

IDCTP0 0

others 17

MC

I, P0 0

P30 90

P40 145

P50 190

P60 235

P70 265

P80 310

P99 390

RC

I 350

P0 0

P30, P40, P50 250

P60 300

P70, P80, P99 320

FD All 0

x={30,40,50,60,70,80,99}

(FSM-based Scenario-Aware DataFlow (SADF))

MEMOCODE 2006adaptation the futurescenariosemb. mm throughput storage

27 Scenario-aware Performance Analysis

• SDF-based throughput analysis considers worst-case actor times

• Scenario-aware throughput analysis considers worst-case actor times per scenario, but it must consider scenario transitions

• Analyzing all scenario transitions separately can be avoided

• Separate scenario transitions with an invariant reference schedule

ACM TECSadaptation the futurescenariosemb. mm throughput storage

28 Overview

• Embedded Multi-media

• Analysis of Synchronous Dataflow Graphs

• Throughput Analysis

• Throughput-Storage Trade-off Analysis

• Scenario-aware Analysis

• Run-time Adaptation

• Looking Forward

DAC 2009

Page 8: Reliable Embedded Multimedia Systems?tbasten/presentations/wisc20090921.pdf · P40 145 P50 190 P60 235 P70 265 P80 310 P99 390 RC I 350 P0 0 P30, P 40, P 50 250 P60 300 P70, P 80,

29

Energy

Completion time

(3,ck1)

(4,ck1)

(5,ck1)

(1,ck2)

(2,ck2)

(3,ck2)

(3,ck1)

(4,ck1)

(5,ck1)

(1,ck2)

(2,ck2)

(3,ck2)

(3,ck1)

(4,ck1)

(5,ck1)

(1,ck2)

(2,ck2)

(3,ck2)

(8,ck1)

(4,ck2)

(10,ck1)

(9,ck1)

(6,ck2) (5,ck2)

(7,ck2)

(6,ck2)

(9,ck2)

(8,ck2)

JCM

Run-time Application Management

Const

Gantt chart

- Applications starting/stopping over time

- 6 procs, slow (ck1) and fast clocks (ck2)

- Minimize energy under time constraints

Join Const Min

“enforce the

same clock”

adaptation the futurescenariosemb. mm throughput storage

30 Pareto Algebra

Goal: Compositional computation of trade-offs

adaptation the futurescenariosemb. mm throughput storage

31 Pareto Algebra

• The elements: sets of configurations

• (Relevant) operators:

• Minimization

Gives the Pareto points (optimal trade-offs)

• Product

Cartesian product of configurations

e.g. application and platform configurations

• Constraint

Selects solutions according to constraints

e.g. all application configurations with some minimal quality

• AbstractionDiscards information about solutions

e.g. bandwidth usage in bandwidth × energy × quality configurations

• Derived metricDerive a new metric from other metrics

e.g. total power from power of components

energy

time

better

better

prerequisite

monotonicity

adaptation the futurescenariosemb. mm throughput storage

32 Pareto-Algebraic Characterization

… of run-time adaptation

… select and configure

adapt

product - derive | constrain | abstract - minimize

com

pon

ent

trade-o

ff s

paces

system trade-off space(at the time of a change)

adaptation the futurescenariosemb. mm throughput storage

Page 9: Reliable Embedded Multimedia Systems?tbasten/presentations/wisc20090921.pdf · P40 145 P50 190 P60 235 P70 265 P80 310 P99 390 RC I 350 P0 0 P30, P 40, P 50 250 P60 300 P70, P 80,

33 Pareto-Algebraic Characterization

… of run-time adaptation

… select and configure

product - derive | constrain | abstract - minimize

com

pon

ents

X

derive | constrain | abstract min

adaptation the futurescenariosemb. mm throughput storage

34 Complexity

n components with p Pareto points each

adapt

take a product and on-the-fly

derive | constrain | abstract - minimize

O(p2n)

com

pon

ent

trade-o

ff s

paces

system trade-off space(at the time of a change)

adaptation the futurescenariosemb. mm throughput storage

35

X

Complexity

n components with p Pareto points each

take a product and on-the-fly

derive | constrain | abstract - minimize

O(p2n)

com

pon

ents

constrainderive-abstract

min

adaptation the futurescenariosemb. mm throughput storage

36 Complexity Control

keep n and p small

motivation Pareto algebra adaptation wsncmp

• Compositional reasoning, among others

• Approximate Pareto sets

conclusions

y =

x =

y/x = c2

y/x = c4

Page 10: Reliable Embedded Multimedia Systems?tbasten/presentations/wisc20090921.pdf · P40 145 P50 190 P60 235 P70 265 P80 310 P99 390 RC I 350 P0 0 P30, P 40, P 50 250 P60 300 P70, P 80,

37 Multi-dimensional Multiple-choice Knapsack Problem

MMKP

- One optimization objective (value)

- Multiple resource dimensions with capacity constraints

- Multiple independent applications

- Multiple independent configurations per application

Pick one configuration per application optimizing value within constraints

NP hard

adaptation the futurescenariosemb. mm throughput storage

38 A Parameterized Compositional Heuristic

• Project all resource dimensions into one dimension

• Approximate Pareto set in 2-dimensional space

product-constrain-derive-abstract-minimize

product

constrain

derive abstract

minimize

Compute product while on-the-fly

• Applying resource constraints

• Computing values, projecting all resource dimensions into one(taking simply the sum)

• Minimizing the configuration set in 2-dimensional space

adaptation the futurescenariosemb. mm throughput storage

39 A Parameterized Compositional Heuristic

• Project all resource dimensions into one dimension

• Approximate Pareto set in 2-dimensional space

y =

x =

y/x = c2

y/x = c4

product-constrain-derive-abstract-minimize

adaptation the futurescenariosemb. mm throughput storage

40 A Parameterized Compositional Heuristic

• Project all resource dimensions into one dimension

• Approximate Pareto set in 2-dimensional space

product-constrain-derive-abstract-minimize

parameter p: maintain (at most) p Pareto points

Allows to budget analysis time

analysis time ≤ c ▪ n ▪ p2

(n number of applications; c platform constant)

adaptation the futurescenariosemb. mm throughput storage

Page 11: Reliable Embedded Multimedia Systems?tbasten/presentations/wisc20090921.pdf · P40 145 P50 190 P60 235 P70 265 P80 310 P99 390 RC I 350 P0 0 P30, P 40, P 50 250 P60 300 P70, P 80,

41

50

60

70

80

90

100

I1 I2 I3 I4 I5 I6

T=1ms t=5ms T=20ms t=100ms

Controlling Time Budgets

budgets

time

values

• MMKP benchmark: www.laria.u-picardie.fr/hifi/OR-Benchmark/MMKP

• Always within budget

• Good values

• Similar results to the IMEC, SOC 2006,

non-compositional state-of-the-art heuristic

par p

adaptation the futurescenariosemb. mm throughput storage

SimIt-ARM simulator

206 Mhz StrongARM

42

0

1

0 5 10 15 20

IMEC Pareto

time(ms)

768 768

1576

1576 2353

2353

3224

3705

3224

3900

0

0.5

1

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

time(ms)

Compositionality

• Applications start at various times

• 5 at a time (a), 1 at a time (b)

values

values always at least as good

as IMEC heuristic

what about applications stopping ?

# of active applications

# of active applicationsadaptation the futurescenariosemb. mm throughput storage

43 Conclusions Run-time Adaptation

• Run-time adaptation:

product - derive | constrain | abstract - minimize

… select and configure

• Parameterized compositional method

• Allows to trade off quality with analysis time

• Allows to bound analysis time

• Open question: components leaving the composition ?

• Feasible for resource-constrained embedded systems

• Wide variety of applications

adaptation the futurescenariosemb. mm throughput storage

44 Overview

• Embedded Multi-media

• Analysis of Synchronous Dataflow Graphs

• Throughput Analysis

• Throughput-Storage Trade-off Analysis

• Scenario-aware Analysis

• Run-time Adaptation

• Looking Forward

Page 12: Reliable Embedded Multimedia Systems?tbasten/presentations/wisc20090921.pdf · P40 145 P50 190 P60 235 P70 265 P80 310 P99 390 RC I 350 P0 0 P30, P 40, P 50 250 P60 300 P70, P 80,

45 Dataflow MoCs Expressiveness Hierarchy

RPN: Reactive Process Networks

KPN: Kahn Process Networks

DF: DataFlow

DDF: Dynamic DF

BDF: Boolean DF

CSDF: Cyclo-Static DF

SDF: Synchronous DF

HSDF: Homogeneous SDF

• BDF and larger: Turing complete

• Better notions of expressiveness

needed

• Unification needed

adaptation the futurescenariosemb. mm throughput storage

46 Challenges

• Suitable notions of expressivity

• Expressivity while maintaining analyzability and synthesizability

• Abstraction without loosing accuracy

• Composability and compositionality

• MoCs for non-functional aspects

• Unification of MoCs

• Multi-objective trade-off analysis

• Parametric analysis

• Model-driven design and synthesis flows

• Model-driven run-time systems

adaptation the futurescenariosemb. mm throughput storage

47 Thank you !

More info: www.es.ele.tue.nl/~tbasten/SDF3 toolkit: www.es.ele.tue.nl/sdf3/Pareto calculator: www.es.ele.tue.nl/pareto/

Questions ?

‘An understanding of the natural world and what's in it

is a source of not only a great curiosity but great fulfillment.’

David Attenborough