design space exploration and application autotuning for runtime...

72
Design Space Exploration and Application Autotuning for Runtime Adaptivity in Multicore Architectures Cristina Silvano Politecnico di Milano [email protected]

Upload: others

Post on 11-Sep-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Design Space Exploration and Application Autotuning for Runtime Adaptivity in Multicore Architectures

Cristina SilvanoPolitecnico di [email protected]

Page 2: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

OutlineResearch challenges in multicore architecturesAutomatic design space explorationApplication autotuning for runtime adaptivitySummary

2

Page 3: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

RESEARCH CHALLENGES

Page 4: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Energy efficiency underlies all markets  Energy efficiency is of paramount importance for all 

application markets (automotive, consumer, mobile, healthcare and beyond) and target systems spanning from sensors, cyber‐physical systems, embedded systems up to servers and HPC systems.

Page 5: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Squeezing of computing cores

201122 nm 

200932 nm

200745 nm

200565 nm1.4 mm2

Source:ARM9 STMicroelectronics

201314 nm

5

Page 6: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

201122 nm 

200932 nm

200745 nm

200565 nm1.4 mm2

Source:ARM9 STmicroelectronics

201314 nm

… entering the multi/many‐core era

5

Page 7: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

A wide range of architecture parameters must be tuned to find the best system configuration.

Design space of the target architecture A should consider all possible configurations of each parameters pi :

A = Sp1 x Sp2 x …. x Spn

Large design space: 218 (262 144) design points

The design space in the age of multi/many‐cores

Parameter Min Max

#Processors 2 16

#Threads 1 4

L1 I$ size 2K 16K

L1 D$ size 2K 16K

L1 I$ associativity 1w 8w

L1 D$ associativity 1w 8w

L2 $ size 32K 256K

L2 $ associativity 1w 8w

6

Page 8: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Full search simulation time

FULL SEARCHCan become quickly unfeasible ~10 minutes

per simulation*

262 144 design pointsx

8 data sets=

2 097 152 simulations

10simulation time on 4 parallel cores

* Using a cycle-accurate simulator

Years

8

7

Page 9: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Research Challenges1) What are the best tradeoffs to design many‐cores in terms of 

multiple objectives such as energy and performance?Need for automatic Design Space Exploration of many‐core architectures

8

Page 10: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

What are the barriers of further scaling?

Transistor density increases ~2x every 2 years

Frequency wall

Power wall

Utilisation wall

… the end of the Dennard scaling… entering the dark silicon era

9

Page 11: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Shift from the power wall……to the programmability wall

STMicrolectronics P2012

The Problem of Computation Offloading on Heterogeneous Parallel Platforms

10

Page 12: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Mobile Application Workloads

* Source: Flurry Analytics

11

Page 13: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Mobile Application Workloads Short bursts of high

intensity computationand power consumption

Long periods of sustained high intensity

Long-use low-intensity

Page 14: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Runtime Resource Management

Amit Kumar Singh, Muhammad Shafique, Akash Kumar, and Jörg Henkel. Mapping on multi/many‐core systems: Survey of current and emerging trends. In Proceedings of the 50th Annual Design Automation Conference (DAC). 2013. 

RTRM

App1 App2 App3

13

Allocation

Mapping

44 44 66

Target Heterogeneous Platform

Page 15: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Research Challenges1) What are the best trade‐offs to design many‐cores in terms of 

multiple objectives such as energy and performance?Need for automatic Design Space Exploration of many‐core architectures

2) Are the system resources used efficiently?Need for runtime resource management

14

Page 16: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Tunable Applications

One or more application parameters, code transformations and code variants (applicationknobs) can be tuned at runtime  Adaptivity to adjust the application behavior to the changing operating conditions, usage contexts and resource availability

One or more application parameters, code transformations and code variants (applicationknobs) can be tuned at runtime  Adaptivity to adjust the application behavior to the changing operating conditions, usage contexts and resource availability

Application Knobs

15

Page 17: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Tunable Applications

One or more application parameters, code transformations and code variants (applicationknobs) can be tuned at runtime Adaptivity to adjust the application behavior to the changing operating conditions, usage contexts and resource availability

Approximate computing: output just needs to be “good enough” trading off accuracy and throughput

One or more application parameters, code transformations and code variants (applicationknobs) can be tuned at runtime Adaptivity to adjust the application behavior to the changing operating conditions, usage contexts and resource availability

Approximate computing: output just needs to be “good enough” trading off accuracy and throughput

Application Knobs

15

Page 18: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Research Challenges1) What are the best trade‐offs to design many‐cores in terms of 

multiple objectives such as energy and performance?Need for automatic Design Space Exploration of many‐core architectures

2) Are we using the system resources efficiently?Need for runtime resource management

3) How can we tune applications at runtime by using application knobs?Need for application autotuning

16

Page 19: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

AUTOMATIC DESIGN SPACEEXPLORATION

Page 20: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Objective function: To minimize both energy  (x) and execution time  (x) of the target application on system configurations x:

where  is the design space. The solution is a set of tradeoff configurations  p

known as Pareto set

The multi‐objective optimisation problem

(x)

(x)

18

Page 21: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Multi‐objective exploration: Pareto Points

1.5

2

2.5

3

3.5

4

4.5

100 200 300 400 500 600

dominated design point

Pareto point

Multi‐Objective Exploration: best designs are not unique.Pareto points provide tradeoffs with respect to the multiple objectives

Throughput

Error

23

19

Page 22: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

The concepts behind the automatic DSE

Input Variables:architecture 

parameters that define the 

design space

The black box generates the output values accordingly to the inputs. 

Output  Variables define the objective space

The black box can be:

1) A simulator that models the system behavior and generates output values 

2) A set of solvers that modelsthe system behavior andestimates output values

(x)

(x)

Page 23: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

MULTICUBE Explorer is an open source Multi‐Objective Design Space Exploration (DSE) framework to tune multi‐core architectures

Efficiency: MULTICUBE Explorer is a design automation and acceleration tool to minimize the numbers of simulations to be executed during the DSE process

Flexibility: MULTICUBE Explorer is a design optimization and exploration tool where you can easily plug‐in your own simulator,  optimization algorithms and machine learning techniques.

MULTICUBE Explorer is not “yet another simulator”  (there are already many simulators….)

MULTICUBE Explorer: What is it?

wwww.multicube.eu

21

Page 24: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Multicube Explorer: Overview

Architecturesand

Applications

Architecturesand

Applications

Architecturesand

Applications

22

Page 25: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

• Efficiency of automatic DSE process can be improved by:

1. Minimizing the numbers of simulations to be executed by using exploration heuristics such as state‐of‐art evolutionary algorithms

2. Speeding up simulations

3. Simulating at higher abstraction levels 

4. Defining an analytical response model of the system behavior based on a subset of simulations to predict the unknown system response

Automatic Design Space Exploration: Efficiency

23

Page 26: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

1. Design of Experiments (DoEs): To identify the experimentation plan: how to select the design points in the design space to be simulated

2. Optimisation Algorithms:Meta‐heuristics methods inspired by analogies with physics, or with biology to solve multi‐objective optimization problems: simulated annealing, genetic algorithms, evolutionary strategies, etc.

3. Response Surface Modeling (RSM): To use the set of simulated points to obtain an analytical model of the system behavior: linear regression, spline interpolation, artificial neural network, etc.

MULTICUBE Explorer

24

Page 27: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

“ReSPIR: A Response Surface‐Based Pareto Iterative Refinement for Application‐Specific Design Space Exploration”, Palermo, G.; Silvano, C.; Zaccaria, V., IEEE Transactions on Computer‐Aided Design of Integrated Circuits and Systems, Volume 28, Issue 12, Dec. 2009 Page(s): 1816 – 1829

How to combine DoEs and Response Surface Models? ReSPIR: RSM‐Support Iterative Pareto Refinement

Page 28: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Why automatic design space exploration?

Faster exploration time

Better quality of results

Benefits

26

Page 29: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Comparison Results: ReSPIR vs MOSA & NSGA‐II

0.0

2.0

4.0

6.0

8.0

10.0

12.0

14.0

16.0

18.0

20.0

1.0 2.0 3.0

MOSA

NSGA-II

ReSPIR

Accuracy[% ADRS]

Design Space Analyzed [%]

27

MIPS‐based shared memory multi‐core  Design space composed of 217 design points (131 072) Accuracy in terms of Average Distance from Reference Set: 

the lower the better approximated Pareto set

Page 30: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

The Multi‐View Case Study: The human eye stereo matching

2 eyes → third dimension

28

Design‐time optimization of OpenCL applications

Page 31: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Quality of Results: Pixel Disparity Error

Left camera Right camera

Reference disparity map

1

2

3

Disparity Error

Disparity Error

Application Knobs

29

stereo-matchingstereo-matching

Page 32: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Design Space Exploration (DSE)

design time

30

Page 33: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Parametric implementation:

Customization of application‐specific and platform parameters

Full search can become quickly unfeasible:

Multi‐dimensional large design space

Slow simulation platforms

31

E. Paone, G. Palermo, V. Zaccaria, C. Silvano, D. Melpignano, G. Haugou and T. Lepley, "An Exploration Methodology for a Customizable OpenCL Stereo-Matching Application Targeted to an Industrial Multi-Cluster Architecture", in Proc. CODES+ISSS 2012.

E. Paone, F. Robino, G. Palermo, V. Zaccaria, I. Sander and C. Silvano, "Customization of OpenCL Applications for Efficient Task Mapping under Heterogeneous Platform Constraints", In Proceedings of DATE 2015.

Design‐time optimization of OpenCL applications

Page 34: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Parametric implementation:

Customization of application‐specific and platform parameters

Full search can become quickly unfeasible:

Multi‐dimensional large design space

Slow simulation platforms

design space

Design‐time optimization of OpenCL applications

Page 35: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Parametric implementation:

Customization of application‐specific and platform parameters

Full search can become quickly unfeasible:

Multi‐dimensional large design space

Slow simulation platforms

OP1OP2

OP3

OP4

OP5

OP6

OP7

design space

Design‐time optimization of OpenCL applications

Page 36: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

APPLICATION AUTOTUNING

Page 37: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Autonomous Video‐surveillance System

Video Frame Rate Video Resolution

33

Application knobs can be used at runtime to trade‐off the throughput and the quality of results

Page 38: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Multiple Cameras

Video Frame Rate

Video Resolution

Autonomous Video-surveillance

System

Multiple instances of the same application

34

Page 39: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

ARGO Application Autotuning Key idea is that most of the applications are dynamically 

configurable in terms of a set of tunable parameters, code transformations and code variants (application knobs) to trade‐off accuracy and latency

Error

Latency

35

Page 40: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

design-time

run-time

36

Error

Latency

Page 41: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

ARGO Application Autotuning ARGO is a light‐weight application autotuning framework for 

multi/many‐core platforms in an adaptive multi‐application environment. 

Combination of design‐time and run‐time techniques to create an effective way of “self‐aware” computing with limited runtime overhead.Orthogonality between application autotuning and runtime management of system resources 

E. Paone, D. Gadioli, G. Palermo, V. Zaccaria, C. Silvano. “Evaluating Orthogonality between Application Auto-Tuning and Run-Time Resource Management for Adaptive OpenCL Applications”, Proc. of IEEE ASAP 2014.

D. Gadioli, G. Palermo, C. Silvano, “Application Autotuning to Support Runtime Adaptivity in Multicore Architectures”, in Proc. IEEE SAMOS 2015.

37

Page 42: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Design‐time explorationDesign‐time exploration phase based on code profiling to define the effects of application knobs and performance metrics (OPs) used to build a prediction model of the application behavior

38

Working Modesrepresent max

resource allocationassociated to OPs

Operating Pointscombine application

knobs and performance metrics

Powerconsumption

Performance

Accuracy

Page 43: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Monitor

Execu

te

PlanAnalyze

IFTTT

Monitoring some metrics

Monitoring some metrics

Modeling application

knobs to metrics

Modeling application

knobs to metricsConfiguration

selectionConfiguration

selection

Tuning application

knobs

Tuning application

knobs

MAPE feedback loop• Application autotuning enables self‐optimization capabilities

based on Monitor‐Analyze‐Plan‐Execute (MAPE) feedback loop*

Host CPU

Heterogeneous Many-Core Platform

* J.O. Kephart, D.M. Chess, “The vision of autonomic computing,” Computer, 2003

39

Page 44: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

ARGO Application Autotuning FrameworkPower consumption Performance Accuracy

Application

Goals+/-

Goals+/-

RankRank

OP List

Monitors

Autotuning Framework

40

Page 45: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

ARGO Application Autotuning FrameworkPower consumption Performance Accuracy

Application

Goals+/-

Goals+/-

RankRank

OP List

Monitors

Autotuning Framework

MAPE feedback loop

40

Page 46: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

ARGO Application Autotuning FrameworkPower consumption Performance Accuracy

Application

Goals+/-

Goals+/-

RankRank

OP List

Monitors

Autotuning Framework

1) Monitor

40

Page 47: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

ARGO Application Autotuning FrameworkPower consumption Performance Accuracy

Application

Goals+/-

Goals+/-

RankRank

OP List

Monitors

Autotuning Framework

2) Analyze

40

Page 48: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

ARGO Application Autotuning FrameworkPower consumption Performance Accuracy

Application

Goals+/-

Goals+/-

RankRank

OP List

Monitors

Autotuning Framework3) Plan

40

Page 49: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

ARGO  Application Autotuning FrameworkPower consumption Performance Accuracy

Application

Goals+/-

Goals+/-

RankRank

OP List

Monitors

Autotuning Framework

40

4) Execute

Page 50: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Separation of ConcernsPower consumption Performance Accuracy

Application

Goals+/-

Goals+/-

RankRank

OP List

Monitors

Autotuning FrameworkC++ source codes

41

Page 51: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Separation of ConcernsPower consumption Performance Accuracy

Application

Goals+/-

Goals+/-

RankRank

OP List

Monitors

Autotuning FrameworkKnowledge XML file

41

C++ source codes

Page 52: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Separation of ConcernsPower consumption Performance Accuracy

Application

Goals+/-

Goals+/-

RankRank

OP List

Monitors

Autotuning Framework

Adaption XML file

41

App source codesKnowledge XML fileKnowledge XML file

C++ source codes

Page 53: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Separation of ConcernsPower consumption Performance Accuracy

Application

Goals+/-

Goals+/-

RankRank

OP List

Monitors

Autotuning Framework

Adaption XML file

41

Adaptive ApplicationKnowledge XML fileKnowledge XML file

Page 54: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Target HW Platform

Orthogonality Concept: App Autotuning & RTRM

Platform OS

Req

uests

Res

ourc

es

42

Page 55: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Target HW Platform

Orthogonality Concept: App Autotuning & RTRM

Platform OS

Resource Availability

42

Req

uests

Res

ourc

es

Page 56: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Run-Time Resource Manager

Orthogonality Concept: App Autotuning & RTRM

Platform OS

Target HW Platform

42

Resource Availability

Page 57: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Experimental SetupTarget Platform  Intel Xeon QuadCore CPU E5‐1607 @ 3GHz & 8GB RAM Linux 3.5 & OpenCL 1.2 runtime provided by Intel SDK 2013

Dynamic Workload DefinitionSingle application reacting to external changesMultiple instances of the same application 

Metrics of interest• Throughput: Number of frames per second [fps]• Quality: Normalized disparity error w.r.t reference• Resources: Percentage of CPU used by the application

43

Page 58: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Application Autotuning Tradeoffs

44

[ASAP2014]

Page 59: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Application Autotuning: Dynamic Adaptation (1)

•First phase:The application isprocessing the images by respecting the requirements

45

[SAMOS2015]

Page 60: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Application Autotuning: Dynamic Adaptation (1)

•First phase:The application isprocessing the images by respecting the requirements.

•Second phase:There is a high priority task in the system requiring CPU resources

45

[SAMOS2015]

Page 61: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Application Autotuning: Dynamic Adaptation (1)

•First phase:The application isprocessing the images by respecting the requirements.

•Second phase:There is a high priority task in the system requiring CPU resources

•Third phase:The task has finished releasing CPU resources

45

[SAMOS2015]

Page 62: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Application Autotuning: Dynamic Adaptation (2)

•First phase:The application isprocessing the images by respecting the requirements.

46

[SAMOS2015]

Page 63: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Application Autotuning: Dynamic Adaptation (2)

•Second phase: Threat detected

46

[SAMOS2015]

Page 64: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Application Autotuning: Dynamic Adaptation (2)

•Third phase:The application isprocessing the images by respecting the requirements.

46

[SAMOS2015]

Page 65: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

PLAIN LINUX Multiple ApplicationDynamic Workload 

[ASAP2014]

Page 66: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

ARGO+LINUXPLAIN LINUX

35us overheadwith 100 OPs

[ASAP2014]

Page 67: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

ARGO+LINUX

35us overheadwith 100 OPs

ARGO+RTRM

1ms overheadwith 100 OPs

[ASAP2014]

Page 68: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

SummaryMULTICUBE Design Space Exploration framework and ARGO

autotuning framework to support design‐time multi‐objective optimization and runtime adaptivity in manycore architectures 

Most recent journal papers: TCAD‐2015 (2), TECS‐2013, PARCO‐2013, TECS‐2012 (2), TCAD‐2012, TCAD‐2009.

Most recent conference papers: SAMOS‐2015, DATE‐2015, SAMOS‐2014, ASAP‐2014, ARC‐2014,  DATE‐2014, ASP‐DAC‐2014 (2), FPL‐2013, DATE‐2013 (3), CODES‐ISSS‐2012, ReCoSoC‐2012, SASP‐2011, DAC‐2010, DATE‐2010, SASP‐2009.

EU projects: 

48

www.2parma.euwww.multicube.eu

Page 69: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Next challenge: The Green500 scenario

To reach the DARPA’s target of 20MW of Exascale supercomputers projected to 2020, current supercomputers must achieve an energy efficiency “quantum leap”, pushing towards a goal of 50 GFlops/W. 

Heterogeneous systems currently dominate the top of the Green500 list and this dominance is expected to be a trend for the next coming years to reach the target of 20MW Exascale supercomputers. 

To fulfil this target, energy‐efficient heterogeneous supercomputers need to be coupled with a radically new software stack capable of exploiting the benefits offered by heterogeneity to meet the scalability and energy efficiency required by the Exascale era. 

www.antarex-project.eu/

The main goal of the ANTAREX project is to provide a breakthrough approach to express by a Domain Specific Language 

the application self‐adaptivity and to runtime manage and autotune applications for green and heterogeneous High 

Performance Computing (HPC) systems up to the Exascale level. 

49

Page 70: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

Research Collaborations

Page 71: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015

People

Page 72: Design Space Exploration and Application Autotuning for Runtime …specs/projects/antarex/downloads/... · 2019. 12. 30. · Multicore Architectures”, in Proc. IEEE SAMOS 2015