the seec computational model henry hoffmann, anant agarwal pemws-2 april 6, 2011

41
The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

Post on 19-Dec-2015

217 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

The SEEC Computational Model

Henry Hoffmann, Anant Agarwal

PEMWS-2April 6, 2011

Page 2: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

In the beginning…*

2*The beginning, in this case, refers to the beginning of my career (1999)

Performance

Application programmers had one goal:

Page 3: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

But Modern Systems Have Increased the Burden on Application Programmers

3

Performance

Pow

er

Quality

Resiliency

Many additional constraints

Even worse, constraints can change dynamicallyE.g. power cap, workload fluctuation, core failure

beat/s

Lo Hi

Power

Page 4: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

Most Execution Models Designed for Performance

5

Coherent Shared Memory Message Passing

Communication

Concurrency

Coordination

Multi-threaded Multi-process

Through Memory Through Network

Locks Messages

Control Procedural Procedural

Global Shared Cache

LocalCache

RF

data

store

LocalCache

RF

data

load

Core 0 Core 1

data

LocalMemory

RF

data

store

LocalMemory

RFNetworkInterface

data

load

Core 0 Core 1

NetworkInterface

Network

Procedural control insufficient to meet the needs of modern systems

Page 5: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

6

Angstrom Replaces Procedural Control with Self-Aware Control

Procedural Control Self-Aware Control

DecideDecide

ActAct

• Run in open loop

• Assumptions made at design time• Based on guesses about future

DecideDecide ActAct

ObserveObserve

• Run in closed loop

• Understand user goals • Monitor the environment

• Application optimized for system• No flexibility to adapt to changes

• System optimizes for application• Flexibly adapt behavior

The self-aware model allows the system

to solve constrained optimization problems dynamically

Page 6: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

7

Outline

• Introduction/Motivating Example

• The SEEC Model and Implementation

• Experimental Validation

• Conclusions

7

Page 7: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

8

Angstrom’s SElf-awarE Computing (SEEC) Model

• Goal: Reduce programmer burden by continuously optimizing online

• Key Features: 1. Applications explicitly state goals and progress2. System software and hardware state available actions3. The SEEC runtime system dynamically selects actions to maintain goals

Application Developer

Systems Developer

Observe Act

Decide

API SPI

SEECRuntime

A unified decision engine adapts applications, system software, and hardware

Page 8: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

9

Example Self-Aware System Built from SEEC

Video Encoder

Actuators

Goals:30 beat/s, Minimize Power

20 b/s30 b/s33 b/s

Algorithm Cores

Frequency Bandwidth

Control and Learning System

_r

He

art

Ra

te

Time

pure delayslow convergenceoscillating

SEECController

Application-

Desired Heart Rate

_r

Observed Heart Rate

r(k)

Error

e(k)

Speedup

s(k)

SEECController

Application-

Desired Heart Rate

_r

Observed Heart Rate

r(k)

Error

e(k)

Speedup

s(k)

1

2

3

4

5

6

7

8

9

0 5 10 15 20 25 30

Action

Sp

ee

du

p

Initial Model

Observe

Decide Act

Page 9: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

10

Roles in the SEEC Model

Application Developer

Systems Developer

SEECRuntime System

Express application goals and progress(e.g. frames/ second)

Read goals and performance

Determine how to adapt (e.g. How much to speed up the application)

Provide a set of actions and a callback function(e.g. allocation of cores to process)

Initiate actions based on results of decision phase

ObserveObserve

DecideDecide

ActAct

Page 10: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

11

Registering Application Goals

• Performance– Goals: target heart rate and/or latency between tagged heartbeats– Progress: issue heartbeats at important intervals

• Quality– Goals: distortion (distance from application defined nominal value)– Progress: distortion over last heartbeat

• Power– Goals: target heart rate / Watt and/or target energy between tagged heartbeats– Progress: Power/energy over last heartbeat interval

ObserveObserve

Application

Lo Hi

Power

SEECDecision Engine

Performance

Power

Quality

Research to date focuses on meeting performance while minimizing power/maximizing quality

Page 11: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

Actuators

12

Registering System Actions

Each action has the following attributes:• Estimated Speedup

– Predicted benefit of taking an action

• Cost– Predicted downside of taking an action (increased power, lowered quality)

• Callback– A function that takes an id an implements the associated action

ActAct

SEECDecision Engine

Estimated Speedup

Cost

Callback

Algorithm Cores

Frequency Bandwidth

Page 12: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

SEECDecision Engine

13

Picking Actions to Meet Goals

Decisions are made to select actions given observations:• Read application goals and heartbeats• Determine speedup with control system• Translate speedup into one or more actions

DecideDecide

Application System Services

Lo Hi

Power

The control system provides predictable and analyzable behavior

Controller(Decide)

Actuator(Act)

Application(Observe)-

PerformanceGoal

CurrentPerformance

Page 13: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

14

Optimizing Resource Allocation with SEEC

• SEEC can observe, decide and act

• How does this enable optimal resource allocation?

• Let’s implement the video encoder example from the introduction

Page 14: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

15

Performance/Watt Adaptation in Video Encoding

Performance goal

0

10

20

30

40

50

60

70

80

50 150 250 350 450

Time (Heartbeat)

Per

form

ance

(F

ram

e/s)

130

140

150

160

170

180

0 2 4 6 8 10 12 14 16

Time (s)

Pow

er (

W)

Performance Power

Page 15: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

16

System Models in SEEC

16

1

2

3

4

5

6

7

8

9

0 5 10 15 20 25 30

Action

Sp

ee

du

pInitial Model

SEEC’s control system takes actions based on models (of speedup and cost per action) associated with actions

What if the models are inaccurate?

Page 16: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

SEEC Decision Engine

17

Updating Models in SEEC

• After every action, SEEC updates application and system models

• Kalman filters used to estimate true state of application and system

– We are exploring a number of different alternatives here

DecideDecide

SEEC combines predictability of control systems with adaptability of learning systems

Controller(Decide)

Actuator(Act)

Application(Observe)-

PerformanceGoal

ApplicationModel

(Decide)

SystemModel

(Decide)

CurrentPerformance

Adaptation Level 2

Adaptation Level 1

Adaptation Level 0

System Services

Application

Lo Hi

Power

Page 17: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

18

SEEC Online Learning of Speedup Model for Application with Local Minima

1

2

3

4

5

6

7

8

9

0 5 10 15 20 25 30

Action

Sp

eed

up

Initial Model Actual Speedups Learned Speedups

Page 18: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

SEEC Decision Engine

19

Handling Multiple Applications

19

DecideDecide

• Control actions computed separately for each application

• For finite resources, several alternatives:

• Priorities (for real-time, admission control)

• Weighting/Centroid method (for overall system throughput)

Controller(Decide)

Actuator(Act)

Application(Observe)-

PerformanceGoal

ApplicationModel

(Decide)

SystemModel

(Decide)

CurrentPerformance

Adaptation Level 2

Adaptation Level 1

Adaptation Level 0

Application

Lo Hi

Power

Application

Lo Hi

Power

Application

Lo Hi

Power

Application

Lo Hi

Power

System Services

Page 19: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

20

Outline

• Introduction/Motivating Example

• The SEEC Model and Implementation

• Experimental Validation

• Conclusions

20

Page 20: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

21

Systems Built with SEEC

21

System Actions Tradeoff BenchmarksDynamic Loop Perforation

Skip some loop iterations Performance vs. Quality

7/13 PARSECs

Dynamic Knobs Make static parameters dynamic

Performance vs. Quality

bodytrack, swaptions, x264, SWISH++

Core Scheduler Assign N cores to application

Compute vs. Power 11/13 PARSECs

Clock Scaler Change processor speed Compute vs. Power 11/13 PARSECs

Bandwidth Allocator Assign memory controllers to application

Memory vs. Power STREAM (doesn’t make a difference for PARSEC)

Power Manager Combination of the three above

Performance vs. Power

PARSEC, STREAM, simple test apps (mergesort, binary search)

Learned Models Power Manager with speedup and cost learned online

Performance vs. Power

PARSECs

Multi-App Control Power Manager with multiple applications

Performance vs. Power and Quality for multiple applications

Combinations of PARSECs

Page 21: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

22

Systems Built with SEEC

System Actions Tradeoff BenchmarksDynamic Loop Perforation

Skip some loop iterations Performance vs. Quality

7/13 PARSECs

Dynamic Knobs Make static parameters dynamic

Performance vs. Quality

bodytrack, swaptions, x264, SWISH++

Core Scheduler Assign N cores to application

Compute vs. Power 11/13 PARSECs

Clock Scaler Change processor speed Compute vs. Power 11/13 PARSECs

Bandwidth Allocator Assign memory controllers to application

Memory vs. Power STREAM (doesn’t make a difference for PARSEC)

Power Manager Combination of the three above

Performance vs. Power

PARSEC, STREAM, extra test apps (mergesort, binary search)

Learned Models Power Manager with speedup and cost learned online

Performance vs. Power

PARSECs

Multi-App Control Power Manager with multiple applications

Performance vs. Power and Quality for multiple applications

Combinations of PARSECs

22

Page 22: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

23

Dynamic Knobs: Creating Adaptive Applications

23

Application Goals

System Actions

Experiment

Turn static command line parameters into dynamic structure

Detail in Hoffmann et al. “Dynamic Knobs for Power Aware Computing” ASPLOS 2011

Maintain performance and minimize quality loss

Adjust memory locations to change application settings

Benchmarks: bodytrack, swaptions, SWISH++, x264

Maintain performance when clock speed changes

Page 23: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

24

0

0.2

0.4

0.6

0.8

1

1.2

1.4

0 50 100 150 200 250

Time

No

rmal

ized

Per

form

ance

Baseline Power Cap Dynamic Knobs

ResultsEnabling Dynamic Applications

bodytrack

Clock drops 2.4-1.6GHz

w/o SEEC perf. drops

w/ SEEC perf. recovers

Clock rises 1.6-2.4 GHz

SEEC returns quality to baseline

Page 24: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

25

swaptionsSWISH++ x264

Dynamic knobs automatically enable dynamic response for a range of applications using a single mechanism

Maintains performance despite noise

Perfect behavior Maintains baseline performance

0

0.2

0.4

0.6

0.8

1

1.2

1.4

0 200 400 600 800 1000

Time

No

rma

lize

d P

erf

orm

an

ce

Baseline Power Cap Dynamic Knobs

0

0.2

0.4

0.6

0.8

1

1.2

1.4

0 100 200 300 400 500

Time

No

rma

lize

d P

erf

orm

anc

e

Baseline Power Cap Dynamic Knobs

0

0.2

0.4

0.6

0.8

1

1.2

1.4

0 100 200 300 400 500 600

Time

No

rmal

ized

Per

form

ance

Baseline Power Cap Dynamic Knobs

ResultsEnabling Dynamic Applications

Page 25: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

26

Optimizing Performance per Watt for Video Encoding

26

Application Goals

System Actions

Experiment

Adapt system behavior to needs of individual inputs

Maintain 30 frame/s while minimizing power

Change cores, clock speed, and memory bandwidth

Benchmark: x264 w/ 16 different 1080p inputs

Compare performance/Watt w/ SEEC to best static allocation of resources for each input

Page 26: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

Average Performance per Watt for Video Encode

27

aver

age

0

0.2

0.4

0.6

0.8

1

1.2static worst static oracle AL 0 AL 1 AL 2

Input

No

rmal

ized

Per

form

ance

/Wat

t

Page 27: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

Performance per Watt for All Inputs

28

0

0.2

0.4

0.6

0.8

1

1.2

Nor

mal

ized

Per

form

ance

/Watt

Input

static worst static oracle AL 0 AL 1 AL 2

Page 28: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

29

Learning Models Online

29

Application Goals

System Actions

Experiment

Adapt system behavior when initial models are wrong

Four targets: Min, Median, Max, Max+

Minimize power consumption for each target

Change cores, clock speed, and mem. bandwidth

Initial models are incredibly optimistic(Assume linear speedup with any resource increase)

Benchmark: STREAM

Converge to within 5% of target performance, measure power and convergence time

Page 29: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

Complex Response to Resources

30

1 2 3 4 5 6 7 80.8

1

1.2

1.4

1.6

1.8

2

2.2

2.4

2.62.4 GHz, 2 MC 2.4 GHz, 1 MC 1.6 Ghz, 2 MC 1.6 GHz, 1 MC

Cores

Spee

dup

Adding Memory Bandwidth Can Slow

Down

Adding Cores Can Slow the App Down

Page 30: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

Power Savings

31

Min Median Max Max+0

50

100

150

200

250Oracle AL 0 AL 1 AL 2

Performance Target

To

tal

Sys

tem

Po

wer

(W

atts

)

AL 2 can provide 30 Watt power savings over AL1

Page 31: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

Convergence Time

32

Min Median Max Max+0

10

20

30

40

50

60Oracle AL 0 AL 1 AL 2

Performance Target

Co

nve

rgen

ce T

ime

(qu

anta

)

Page 32: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

33

Managing Application and System Resources Concurrently

33

Application Goals

System Actions

Experiment

Manage multiple applications when clock frequency changes

bodytrack: maintain performance, minimize power

x264: maintain performance, minimize quality loss

Change core allocation to both applications

Change x264’s algorithms

Maintain performance of both applications when clock frequency changes

Page 33: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

34

ResultsSEEC Management of Multiple Applications

bodytrack x264

34

0

0.5

1

1.5

2

2.5

40 90 140 190 240Time (Heartbeat)

Nor

mal

ized

Per

form

ance

0

1

2

3

4

5

6

7

Cor

es

bodytrack w/ adaptation

bodytrack

bodytrack cores

0

0.5

1

1.5

2

2.5

40 90 140 190 240Time (Heartbeat)

Nor

mal

ized

Per

form

ance

0

1

2

3

4

5

6

7

Cor

es

x264 w/ adaptationx264x264 cores

Clock drops 2.4-1.6GHz w/o SEEC app

misses goals

SEEC allocates cores to bodytrack

w/o SEEC app exceeds goals

SEEC removes cores from x264

SEEC adjusts algorithm to meet goals

Page 34: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

Summary of Case Studies

Experiment Demonstrated Benefit of SEEC

Dynamic Knobs Maintains application performance in the face of loss of compute resources

Performance/Watt Out-performs oracle for static allocation of resources by adapting to fluctuations in input data

Performance/Watt with learning

Learns models online starting from initial model which is ~10x off true values

Multi-App control Maintains performance of multiple apps by managing algorithm and system resources to adapt to loss of compute resources

35

Page 35: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

36

Outline

• Introduction/Motivating Example

• The SEEC Framework

• Experimental Validation

• Conclusions

36

Page 36: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

Angstrom’s Hardware Support for Self-Awareness

• “If you want to change something, measure it.”

• We need sensors to measure everything we want to manage:– Performance (there is a rich body of work in this area)– Power – Quality– Resiliency– …

37

Page 37: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

Need Performance-level Support for Power Analysis

1. gprof -> pprof?

38

index % time self children called name Power?? Energy??? <spontaneous>----------------------------------------------- 0.00 0.05 1/1 main [2] [1] 100.0 0.00 0.05 1 report [3] 0.00 0.03 8/8 timelocal [6] 0.00 0.01 1/1 print [9] 0.00 0.01 9/9 fgets [12] 0.00 0.00 12/34 strncmp <cycle 1> [40] 0.00 0.00 8/8 lookup [20] 0.00 0.00 1/1 fopen [21] 0.00 0.00 8/8 chewtime [24] 0.00 0.00 8/16 skipspace [44] ----------------------------------------------- [2] 59.8 0.01 0.02 8+472 <cycle 2 as a whole> [4] 0.01 0.02 244+260 offtime <cycle 2> [7] 0.00 0.00 236+1 tzset <cycle 2> [26] -----------------------------------------------

Page 38: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

Angstrom’s Hardware Support for SEEC

Low-power core for SEEC runtime system

39

Page 39: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

40

Angstrom’s System Software and Hardware Support for SEEC

Actuators

Algorithm Cores

Frequency Memory Bandwidth

Cache

Registers

Network Bandwidth

ALU Width

Enable tradeoffs wherever possible

Support fine-grain allocation of resources

TLB Size

Page 40: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

41

Conclusions

• SEEC is designed to help ease programmer burden– Solves resource allocation problems– Adapts to fluctuations in environment

• SEEC has three distinguishing features– Incorporates goals and feedback directly from the application– Allows independent specification of adaptation– Uses an adaptive second order control system to manage adaptation

• Demonstrated the benefits of SEEC in several experiments– SEEC can optimize performance per Watt for video encoding– SEEC can adapt algorithms and resource allocation to meet goals in the

face of power caps or other changes in environment

• SEEC navigates tradeoff spaces– So build more tradeoffs into apps, system software, system hardware, etc.

41

Page 41: The SEEC Computational Model Henry Hoffmann, Anant Agarwal PEMWS-2 April 6, 2011

42

SEEC References

• Application Heartbeats Interface:– Hoffmann, Eastep, Santambrogio, Miller, Agarwal. Application Heartbeats: A Generic

Interface for Specifying Program Performance and Goals in Autonomous Computing Environments. ICAC 2010

• Controlling Heartbeat-enabled Systems:– Maggio, Hoffmann, Santambrogio, Agarwal, Leva. Controlling software applications within

the Heartbeats framework. CDC 2010– Maggio, Hoffmann, Agarwal, Leva. Control theoretic allocation of CPU resources. FeBID

2011

• Creating Adaptive Applications:– Hoffmann, Misailovic, Sidiroglou, Agarwal, Rinard. Using Code Perforation to Improve

Performance, Reduce Energy Consumption, and Respond to Failures. MIT-CSAIL-TR-2209-042. 2009

– Hoffmann, Sidiroglou, Carbin, Misailovic, Agarwal, Rinard. Dynamic Knobs for Power-Aware Computing. ASPLOS 2011.

• The SEEC Framework:– Hoffmann, Maggio, Santambrogio, Leva, Agarwal. SEEC: A Framework for Self-aware

Management of Multicore Resources. MIT-CSAIL-TR-2011-016 2011.– Hoffmann, Maggio, Santambrogio, Leva, Agarwal. SEEC: A Framework for Self-aware

Computing. MIT-CSAIL-TR-2010-049 2010.

42