
Page 1: Towards an Exascale SDK

LAWRENCE BERKELEY NATIONAL LABORATORY

F U T U R E T E C H N O L O G I E S G R O U P

Towards an Exascale SDK

Costin Iancu Lawrence Berkeley National Laboratory

1

Page 2: Towards an Exascale SDK

Berkeley Lab Initiative in Extreme Data Science

Extreme Data Science Facility (XDSF)

[Diagram: data-producing sources - MS-DESI, ALS, LHC, JGI, APS, LCLS, and other data-producing sources - feeding the XDSF]

XDSF: Extreme Data Science Facility concept

Page 3: Towards an Exascale SDK

Data services

Storage & analytics systems

Network services

XDSF: Will bring scientists together with data researchers and software engineers

Page 4: Towards an Exascale SDK


Building the Exascale Ecosystem

•  Multidisciplinary science at Exascale requires novel functionality in the software development environment
•  Large system scale (DEGAS): a programming environment that adapts to system variability in performance and availability
•  Composed execution models (Corvette): automated techniques to reason about program behavior (correctness, performance, precision)
•  "Data" pipelines (Hobbes): new system-level support for application composition (resources, data)

4

Page 5: Towards an Exascale SDK

DEGAS: Dynamic Exascale Global Address Space

Katherine Yelick, LBNL (PI)
Vivek Sarkar & John Mellor-Crummey, Rice
James Demmel, Krste Asanović & Armando Fox, UC Berkeley
Mattan Erez, UT Austin
Dan Quinlan, LLNL
Surendra Byna, Paul Hargrove, Steven Hofmeyr, Costin Iancu, Khaled Ibrahim, Leonid Oliker, Eric Roman, John Shalf, David Skinner, Erich Strohmaier, Samuel Williams, Yili Zheng, LBNL

Page 6: Towards an Exascale SDK

Mission Statement: To ensure the broad success of Exascale systems through a unified programming model that is productive, scalable, portable, and interoperable, and meets the unique Exascale demands of energy efficiency and resilience

DEGAS Mission

6

[Diagram: DEGAS stack - Hierarchical Programming Models; Communication-Avoiding Libraries and Compilers; Adaptive Interoperable Runtimes; Lightweight One-Sided Communication]

DEGAS Overview

Page 7: Towards an Exascale SDK

•  Scalability:
   –  Billion-way concurrency; performance through hierarchical locality control
•  Programmability:
   –  Convenient programming through a global address space and high-level abstractions and libraries
•  Performance Portability:
   –  Ensure applications can be moved across diverse machines with domain-specific optimizations
•  Resilience:
   –  Integrated support for capturing state and recovering from faults
•  Energy Efficiency:
   –  Avoid communication, which will dominate energy costs, and adapt to performance heterogeneity due to system-level energy management
•  Interoperability:
   –  Encourage use of languages and features through incremental adoption

DEGAS Proposal: Goals and Objectives

7 DEGAS Overview

Page 8: Towards an Exascale SDK

DEGAS: Dynamic Exascale Global Address Space

[Diagram: DEGAS stack - Hierarchical Programming Models; Communication-Avoiding Libraries and Compilers; Adaptive Interoperable Runtimes; Lightweight One-Sided Communication - highlighting the programming-models layer]

UPC, Co-Array Fortran (CAF), Habanero-C, and libraries!

DEGAS Overview 8

Page 9: Towards an Exascale SDK

What we love about UPC
•  Convenience
   –  Build large shared structures (PGAS)
   –  Read and write data "anywhere, anytime" (global, asynchronous, and one-sided); would like more than reads/writes
•  Locality and scalability (shared with MPI)
   –  Explicit control over data layout
•  That it's a language rather than a library
   –  Syntactic elegance: *p = … vs shmem_put(p, …)
   –  Optimizations from compilers
      •  Communication, pointers, etc.
   –  Correctness from compilers
      •  Race and deadlock analysis, …
      •  More in Titanium, less in UPC

[Diagram: the global address space across threads p0 … pn, each with private and global pointers (l:, g:) and per-thread variables x, y]

Page 10: Towards an Exascale SDK

Single Program Multiple Data (SPMD) is too restrictive

Hierarchical PGAS (HPGAS): hierarchical memory & control

•  Option 1: Dynamic parallelism creation
   –  Recursively divide until… you run out of work (or hardware)
•  Option 2: Hierarchical SPMD with "mix-ins"
   –  Hardware threads can be grouped into units hierarchically
   –  Add dynamic parallelism with voluntary tasking on a group
   –  Add data parallelism with collectives on a group

Two approaches: collecting vs spreading threads

[Diagram: eight hardware threads 0-7 grouped hierarchically into teams]

Beyond SPMD:
•  Hierarchical locality model for network and node hierarchies
•  Hierarchical control model for applications (e.g., multiphysics)

DEGAS Overview 10

Page 11: Towards an Exascale SDK

UPC++ Programming System in DEGAS
•  Problem
   –  Need H-PGAS support for C++ applications
   –  C++ compilers are very complex
•  Solution: "compiler-free" UPC++
   –  A template-library approach reduces development and maintenance costs by leveraging C++ standards and compilers
   –  Uses an SPMD + async execution model
   –  Combines popular features from existing PGAS languages: async from Phalanx/X10, multidimensional domains and arrays from Titanium/Chapel
   –  Interoperates with MPI, OpenMP, CUDA, etc.
•  Recent progress
   –  Design of H-PGAS in the context of UPC++
   –  UPC++ prototype development
   –  IPDPS14 paper with results on Cray XC30 and IBM BG/Q
•  Impact
   –  Provided programming productivity similar to UPC and Titanium for C/C++ apps
   –  Demonstrated competitive performance

[ZKDSY] "UPC++: A PGAS Extension for C++," IPDPS 2014.
[KY] "Hierarchical Computation in the SPMD Programming Model," LCPC 2013.
UPC++ software: https://bitbucket.org/upcxx/upcxx/

[Diagram: UPC++ software stack - C/C++ apps and UPC apps over the UPC++ template header files, UPC++ runtime, UPC compiler and runtime, and the GASNet communication library on top of network drivers and OS libraries]

[Diagram: event-driven task DAG - tasks t1-t6 ordered by events e1-e3]

[Chart: LULESH performance (FOM z/s) on Cray XC30, MPI vs. UPC++, 64 to 32,768 cores]

[Chart: stencil performance (GFLOPS) on Cray XC30, Titanium vs. UPC++, 24 to 6,144 cores]

Asynchronous event-driven task execution

One-sided communication provides better performance at scale

Productivity without loss of performance

Page 12: Towards an Exascale SDK

•  We can make the global address space more powerful:
   –  Remote read and write
   –  Remote atomic invocation
   –  Active messages (small functions that execute at high priority)
   –  Remote function invocation
   –  Remote invocation with multiple dependences (DAG)
   –  Run anywhere in region (e.g., on-node task queue)
   –  Run anywhere (global task queue)
•  Retain the SPMD model for locality: 1 main thread per core
•  Key questions:
   –  How quickly do things run vs. the runtime aggregating communication
   –  Resource management: avoid timing-dependent deadlock

Extending Remote Access

12 DEGAS Overview

Page 13: Towards an Exascale SDK

Initial (highly subjective) analysis of base languages

DEGAS Overview 13

                         C++                          Python      Scala
Base performance         Good                         Poor        Medium (likely to improve)
Multicore performance    Good                         Very poor   OK
Cross-language support   Good                         Good        Poor (expensive)
Existing libraries       Good                         Good        Poor
Extensibility            Medium (cannot overload .)   Good        Medium
Popularity / ecosystem   Medium                       Good        Poor
Language safety          Poor                         Poor        Good

Current focus of DEGAS for DOE · DEGAS proposed (cut)

Page 14: Towards an Exascale SDK

DEGAS: Dynamic Exascale Global Address Space

[Diagram: DEGAS stack - Hierarchical Programming Models; Communication-Avoiding Libraries and Compilers; Adaptive Interoperable Runtimes; Lightweight One-Sided Communication - highlighting the compilers layer]

Communication-avoiding algorithms generalized to compilers, and communication optimizations in PGAS

DEGAS Overview 14

Page 15: Towards an Exascale SDK

Generalizing  Communica.on  Lower  Bounds  and  Op.mal  Algorithms  

•  For serial matmul, we know #words_moved = Ω(n^3 / M^(1/2)), attained by tile sizes M^(1/2) × M^(1/2)
   –  Where do all the ½'s come from?
•  Thm (Christ, Demmel, Knight, Scanlon, Yelick): for any program that "smells like" nested loops, accessing arrays with subscripts that are linear functions of the loop indices, #words_moved = Ω(#iterations / M^e), for some e we can determine
•  Thm (C/D/K/S/Y): under some assumptions, we can determine the optimal tile sizes
•  Long-term goal: all compilers should generate communication-optimal code from nested loops

DEGAS Overview 15

Page 16: Towards an Exascale SDK

Communica2on  Avoidance  in  DEGAS  

•  Problem
   –  Communication dominates time and energy
   –  This will be worse in the Exascale era
•  Solution
   –  Optimize latency by overlapping with computation and other communication
   –  Use faster one-sided communication
   –  Use new communication-avoiding algorithms (provably optimal communication)
   –  Automatic compiler optimizations
•  Impact
   –  A dense linear algebra study shows 2X speedups from both overlap and avoidance
   –  New "HBL" theory generalizes optimality to arbitrary loops with array expressions
   –  First step in automating communication-optimal compiler transformations

[Chart: energy per operation (picojoules), now vs. 2018 - communication costs dominate]

New communication-optimal "1.5D" N-body algorithm: replicate and reduce

[Chart: speedup of the new 1.5D algorithm over the old - 1.7x to 3.7x on 6K-32K cores of BG/P and XE6]

[GGSZTY] "Communication Avoiding and Overlapping for Numerical Linear Algebra," SC12.
[DGKSY] "A Communication-Optimal N-Body Algorithm for Direct Interactions," IPDPS 2013.
[CDKSY] "Communication Lower Bounds and Optimal Algorithms for Programs That Reference Arrays — Part 1," UCB TR 2013.

Page 17: Towards an Exascale SDK

DEGAS: Dynamic Exascale Global Address Space

[Diagram: DEGAS stack - Hierarchical Programming Models; Communication-Avoiding Compilers; Adaptive Interoperable Runtimes; Lightweight One-Sided Communication - highlighting the runtimes layer]

LITHE, JUGGLE: adaptive and efficient runtimes

DEGAS Overview 17

Page 18: Towards an Exascale SDK

Management of critical resources is increasingly important:
•  Memory and network bandwidth limited by cost and energy
•  Capacity limited at many levels: network buffers at interfaces and internal network congestion are real and growing problems

DEGAS leverages THOR runtime work and UPC library work

Having more than 4 submitting processes can negatively impact performance by up to 4x

DEGAS Overview 18

Page 19: Towards an Exascale SDK

DEGAS: Dynamic Exascale Global Address Space

[Diagram: DEGAS stack - Hierarchical Programming Models; Communication-Avoiding Compilers; Adaptive Interoperable Runtimes; Lightweight One-Sided Communication - highlighting the communication layer]

Next-Generation GASNet

DEGAS Overview 19

Page 20: Towards an Exascale SDK

GASNet-EX plans:
•  Congestion management: for one-sided communication with ARTS
•  Hierarchical: communication management for H-PGAS
•  Resilience: globally consistent states and fine-grained fault recovery
•  Progress: new models for scalability and interoperability

Leverage GASNet (redesigned):
•  Major changes for on-chip interconnects
•  Each network has unique opportunities
•  Interface under design: "Speak now or…."
   –  https://sites.google.com/a/lbl.gov/gasnet-ex-collaboration/

DEGAS: Lightweight Communication (GASNet-EX)

20 DEGAS Overview

[Diagram: GASNet portability - clients (Coarray Fortran 2.0, Phalanx, GCC UPC, Berkeley UPC, Chapel, Titanium, Cray UPC/CAF for Seastar) over GASNet conduits for IBM, Cray, InfiniBand, shared memory, GPU, Ethernet, and other networks, on AMD, Intel, SUN, SGI, and other platforms]

Page 21: Towards an Exascale SDK

DEGAS: Dynamic Exascale Global Address Space

[Diagram: DEGAS stack - Hierarchical Programming Models; Communication-Avoiding Compilers; Adaptive Interoperable Runtimes; Lightweight One-Sided Communication]

Containment Domains and BLCR (Berkeley Lab Checkpoint Restart)

DEGAS Overview 21

Page 22: Towards an Exascale SDK

Resilience Approaches

•  Containment Domains (CDs) for trees
   –  Flexible resilience techniques (mechanism, not policy)
   –  Each CD provides its own recovery mechanism
   –  Analytical model: 90%+ efficiency at 2 EF vs. 0% for conventional checkpointing
•  Berkeley Lab Checkpoint Restart
   –  BLCR is a system-level checkpoint/restart
      •  Job state written to filesystem or memory; works on most HPC apps
   –  Checkpoint/restart can be used for roll-back recovery
      •  A coarse-grained approach to resilience
      •  BLCR also enables job migration among compute nodes
   –  Requires support from the MPI implementation
   –  Impact: part of standard Linux releases

[Diagram: root CD with child CDs]
•  Preserve data on domain start
•  Compute (domain body)
•  Detect faults before commit
•  Recover from detected errors

22 DEGAS Overview

Page 23: Towards an Exascale SDK

DEGAS Software Stack

[Diagram of the DEGAS software stack, top to bottom:
•  Proxy applications, numerical libraries
•  Programming systems: ROSE, Python, Berkeley UPC, PyGAS, Habanero-UPC, H-CAF, SEJITS
•  Resilience support - Containment Domains + BLCR
•  Energy / performance feedback - IPM, Roofline; dynamic control system
•  ARTS - Adaptive Run-Time System; Lithe - resource management (hardware threads, task dispatch)
•  GASNet-EX communication
•  General-purpose cores, accelerator cores, network interface & I/O]

Unfunded activity (legend marker in the original figure)

DEGAS Overview 23

Page 24: Towards an Exascale SDK

PGAS Applications (Ongoing)

•  Meraculous (Evangelos Georganas)
•  Convergent Matrix (Scott French): https://github.com/swfrench/convergent-matrix
•  Communication-Avoiding Matrix (Penporn Koanantakool)
•  Embree graphics (Michael Driscoll): https://github.com/mbdriscoll/embree/tree/upcxx
•  NPB CG, MG and FT (Amir Kamil)
•  Fan-both Sparse Cholesky (Mathias Jacquelin)
•  mini-GMG (Hongzhang Shan)
•  MILC (Hongzhang Shan)
•  Global Arrays and NWChem (Eric Hoffman, Hongzhang Shan)

Planned:
•  Stochastic Gradient

24 DEGAS Overview

Page 25: Towards an Exascale SDK

Correctness, Verification and Testing of Exascale Applications (Corvette)

Koushik Sen (PI) James Demmel

UC Berkeley

Costin Iancu Lawrence Berkeley National Laboratory

25

Page 26: Towards an Exascale SDK

Goals

Develop correctness tools for different programming models: PGAS, MPI, dynamic parallelism.

I. Testing and Verification
•  Identify sources of non-determinism in executions
•  Data races, atomicity violations, non-reproducible floating-point results
•  Explore state-of-the-art techniques that use dynamic analysis
•  Develop precise and scalable tools: < 2X overhead

II. Debugging
•  Use a minimal amount of concurrency to reproduce a bug
•  Support two-level debugging of high-level abstractions
•  Detect causes of floating-point anomalies and determine the minimum precision needed to fix them

26

Page 27: Towards an Exascale SDK

EECS Electrical Engineering and Computer Sciences, BERKELEY PAR LAB
P A R A L L E L C O M P U T I N G L A B O R A T O R Y

PRECIMONIOUS: Tuning Assistant for Floating-Point Precision

Cindy Rubio-González, Cuong Nguyen, Hong Diep Nguyen, James Demmel, William Kahan, Koushik Sen
University of California, Berkeley

David H. Bailey, Costin Iancu
Lawrence Berkeley National Laboratory

David Hough
Oracle

SC'13

Page 28: Towards an Exascale SDK

Motivation

•  Floating-point arithmetic is used in a wide variety of domains
•  Reasoning about these programs is difficult
   –  Large variety of numerical problems
   –  Most programmers are not experts in floating point or numerical analysis
•  Common practice
   –  Use the highest available floating-point precision
   –  Disadvantages: more expensive in terms of running time, storage, and energy consumption
•  Goal: develop an automated technique to assist in tuning floating-point precision

28 SC'13

Page 29: Towards an Exascale SDK

Example (D.H. Bailey)

•  Consider the problem of finding the arc length of the function
      g(x) = x + sum over 0 < k <= 5 of 2^(-k) sin(2^k x)
•  Summing sqrt(h^2 + (g(x_{k+1}) - g(x_k))^2) over the n subintervals of (0, π), with h = π/n and x_k = kh

29 SC'13

Precision                                    Slowdown   Result
double-double                                20X        5.795776322412856  ✔
double                                       1X         5.795776322413031  ✖
double, summation variable in double-double  < 2X       5.795776322412856  ✔

Page 30: Towards an Exascale SDK

Example (D.H. Bailey)

Original Program:

long double fun(long double x) {
    int k, n = 5;
    long double t1 = x;
    long double d1 = 1.0L;
    for (k = 1; k <= n; k++) {
        ...
    }
    return t1;
}

int main() {
    int i, n = 1000000;
    long double h, t1, t2, dppi;
    long double s1;
    ...
    for (i = 1; i <= n; i++) {
        t2 = fun(i * h);
        s1 = s1 + sqrt(h*h + (t2 - t1)*(t2 - t1));
        t1 = t2;
    }
    // final answer stored in variable s1
    return 0;
}

Tuned Program:

double fun(double x) {
    int k, n = 5;
    double t1 = x;
    float d1 = 1.0f;
    for (k = 1; k <= n; k++) {
        ...
    }
    return t1;
}

int main() {
    int i, n = 1000000;
    double h, t1, t2, dppi;
    long double s1;
    ...
    for (i = 1; i <= n; i++) {
        t2 = fun(i * h);
        s1 = s1 + sqrtf(h*h + (t2 - t1)*(t2 - t1));
        t1 = t2;
    }
    // final answer stored in variable s1
    return 0;
}

30 SC'13

Page 31: Towards an Exascale SDK

Challenges for Precision Tuning

•  Searching efficiently over variable types and function implementations (automated)
   –  Naïve approach → exponential time
      •  19,683 configurations for the arc length program (3^9)
      •  11 hours 5 minutes
   –  Global minimum vs. a local minimum
•  Evaluating type configurations (automated)
   –  Less precision does not always result in a performance improvement
   –  Run time, memory usage, energy consumption, etc.
•  Determining accuracy constraints (specified by the user)
   –  How accurate must the final result be?
   –  What error threshold to use?

31 SC'13

Page 32: Towards an Exascale SDK

[Chart: "Speedup for Various Error Thresholds" - speedup (%) for error thresholds 10^-4, 10^-6, 10^-8, 10^-10; up to 41%]

32 SC'13

Page 33: Towards an Exascale SDK

Hobbes: OS and Runtime Support for Application Composition

Ron Brightwell (PI), Sandia National Laboratories
….
Costin Iancu (Programming Models), LBNL

33

Page 34: Towards an Exascale SDK

Composition

•  Internode composition
   –  Minimally intrusive
      •  Need to connect outputs to inputs
   –  Potential for inefficient resource usage
•  Intranode composition
   –  Separate enclaves
      •  Minimally intrusive
      •  Connections could be made using shared memory
   –  Unified runtime
•  Integration
   –  I/O operation invokes visualization
   –  Minimal overhead
•  Can we use a single description to support all of these implementations?
   –  Mapping logical structures onto physical resources through virtualization

[Diagram: composition alternatives - App and Viz together on one node; App and Viz on separate nodes; App and Viz integrated on a node]

Page 35: Towards an Exascale SDK

Hobbes Node Architecture: Independent Operating and Runtime Systems

[Diagram: two VMs, each running a node OS (NOS), runtime (RT), and application, over a VM management module and HAL (hardware virtualization), with on-node management, a global OS (GOS), enclave OSes (EOS 1, EOS 2), and a Global Information Bus spanning kernel and user levels]

•  Basic mechanisms needed to virtualize hardware resources like address spaces - AKA Kitten
•  Additional mechanisms needed to manage multiple VMs; these run in kernel mode to take advantage of VM support in modern processors - AKA Palacios
•  Policies to manage the VMs on a single node - AKA PCT
•  VMs can share resources via time sharing or space sharing; this is managed by the GOS

Page 36: Towards an Exascale SDK

OS Support for Adaptive Runtimes

•  Yardstick for success: application performance and ease of development
   –  Performance metrics: time, energy
   –  Software engineering metrics: "composability", "tunability"
•  Interact with co-design centers for experiments
•  Identify separation of concerns between adaptive runtimes and OS support
   –  Distinguish between mechanisms and policies
   –  Resource management: cores, memory, network, energy
   –  Enable user-level implementations and policies
   –  Identify protection and isolation requirements
•  MANTRA: check first if it can be done at the user level
•  Approach:
   –  Top-down: examine runtime APIs, determine lacking OS support
   –  Bottom-up: propose novel OS APIs, examine runtime implementations

36

Page 37: Towards an Exascale SDK


THANK YOU!

37