
ECP Applications Summary

Jan 2017

2Exascale Computing Project

High Order Multi-physics for Stockpile Stewardship
PI: Rob Rieben (LLNL)

Software Technologies Cited
• C++, Fortran 2003, Lua
• MPI, RAJA, OpenMP 4.x, CUDA
• Conduit, LLNL CS toolkit (SiDRe, Quest, CHAI, SPIO, others…)
• MFEM, Hypre, MAGMA (UTK)
• HDF5, SCR, zfp, ADIOS
• UQ-Pipeline, VisIt, P-Mesh

Exascale Challenge Problem

• Multi-physics simulations of High Energy-Density Physics (HEDP) and focused experiments driven by high-explosive, magnetic or laser based energy sources

• Use of high-order numerical methods speculated to better utilize exascale architectural features (higher flop to memory access ratios)

• Support for multiple diverse algorithms for each major physics package (e.g. built-in cross-check or “second vote” capability for validation)

• Improve end-user productivity for overall concept-to-solution workflow, including improved setup and meshing, support for UQ ensembles, in-situ vis and post-processing, and optimized workflow in the LC Advanced Technology Systems (exascale platform) environment

• Stockpile stewardship

Development Plan
Y1: ASC L2 language: Demonstrate at least one modular hydrodynamics capability using the CS toolkit. Description: Integrate the CS Toolkit in-memory data repository (SiDRe) into the code and demonstrate the benefits of centralized data mgmt across multiple physics packages by enabling access to generalized Toolkit services such as parallel I/O, visualization, runtime interrogation and steering, and computational geometry capabilities.
Y2: ASC L2 language: Demonstration of coupled multi-physics using the CS toolkit linking capability. Description: Demonstrate multi-physics coupling via a mesh linking library that directly interacts with the mesh-aware data description in SiDRe.
Y3: ASC L2 language: See Appendix of ASC Implementation Plan.
Y4: ASC L1 milestone demonstrating a problem of programmatic relevance on Sierra and Trinity.

Risks and Challenges
• Radiation transport on high-order meshes is a research area
• High-order coupling of multi-physics is a research area
• Immaturity of vendor-supported programming models and compiler technology can slow progress
• Arbitrary order selection (e.g. 2nd order vs 8th order, …) at run-time can be expensive – important tradeoffs in flexibility for the user vs. compile-time optimizations and higher performance (see the sketch after this list)
• Complete separation of core CS components into a reusable toolkit is a new model for LLNL ASC code development
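To make the flexibility-versus-performance tradeoff concrete, here is a minimal, self-contained C++ sketch built around a toy inner-product kernel; the function names and the order-2/order-8 dispatch are illustrative assumptions, not the project's actual code. With the order fixed at compile time the loop bound is a constant the compiler can unroll and vectorize; a run-time order keeps user flexibility but hides that information; a switch-based dispatch to a few instantiated orders is a common middle ground.

```cpp
#include <cstdio>
#include <vector>

// Order fixed at compile time: the inner-loop trip count is a constant,
// so the compiler can fully unroll and vectorize it.
template <int Order>
double dot_fixed(const double* basis, const double* dofs) {
  double sum = 0.0;
  for (int i = 0; i <= Order; ++i) sum += basis[i] * dofs[i];
  return sum;
}

// Order chosen at run time: maximally flexible for the user, but the
// compiler sees an unknown trip count and optimizes less aggressively.
double dot_runtime(int order, const double* basis, const double* dofs) {
  double sum = 0.0;
  for (int i = 0; i <= order; ++i) sum += basis[i] * dofs[i];
  return sum;
}

// Common compromise: dispatch once from the run-time order to a small set of
// compile-time instantiations, trading binary size for inner-loop speed.
double dot_dispatch(int order, const double* basis, const double* dofs) {
  switch (order) {
    case 2: return dot_fixed<2>(basis, dofs);
    case 8: return dot_fixed<8>(basis, dofs);
    default: return dot_runtime(order, basis, dofs);
  }
}

int main() {
  std::vector<double> basis(9, 1.0), dofs(9, 2.0);
  std::printf("order 2: %g, order 8: %g\n",
              dot_dispatch(2, basis.data(), dofs.data()),
              dot_dispatch(8, basis.data(), dofs.data()));
  return 0;
}
```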

LLNL-ABS-698614

3Exascale Computing Project

LANL ASC Advanced Technology Development and Mitigation: Next-Generation Code project (NGC)
PI: Aimee Hungerford, David Daniel (LANL), LA-UR-16-25966

Applications
• NGC and ASC IC modernization
Software Technologies Cited
• Legion, MPI
• Kokkos, Thrust, CUDA, OpenMP
• C++17, LLVM/Clang, Python, Lua
• Trilinos, HYPRE
• ParaView, VTK-m, HDF5, Portage, FleCSI, Ingen

Exascale Challenge Problem

• Multi-physics simulations of systems using advanced material modeling for extreme conditions supporting experimental programs at MaRIE

• Multi-physics simulations of high-energy density physics (HEDP) in support of inertial confinement fusion (ICF) experimental programs at NIF

• Routine 3D simulation capabilities to address a variety of new mission spaces of interest to the NNSA complex

• Develop abstraction layer (FleCSI) to separate physics method expression from underlying data and execution model implementations

• Demonstrate the use of advanced programming systems such as Legion for scalable parallel multi-physics, multi-scale code development

Development Plan
Y1: Release version 1.0 of a production toolkit for multi-physics application development on advanced architectures. Numerical physics packages that operate atop this foundational toolkit will be employed in a low-energy density multi-physics demonstration problem.
Y2: Toolkit release version 2.0. Demonstration of a high-energy density multi-physics problem.
Y3: Toolkit release version 3.0. Workflow integration in preparation for the Y4 goal.
Y4: ASC L1 milestone demonstrating a problem of programmatic relevance on ASC Advanced Technology Systems.

Risks and Challenges
• Immaturity of advanced programming systems such as Legion
• Performance impact of the FleCSI abstraction layer may be too great in the context of dynamic multi-physics problems
• Serial nature of the existing operator split may impact scalability at exascale and beyond
• Integration of advanced material models in modern unstructured hydrodynamics codes is a research topic
• Scalable storage in support of routine 3D simulations of sufficient resolution is unproven

4Exascale Computing Project

Next Generation Electromagnetics Simulation of Hostile Environment
PI: Matt Bettencourt (SNL)

Applications
• EMPRESS
• Drekar
Software Technologies Cited
• C++
• MPI, Kokkos (OpenMP, CUDA), DARMA (Charm++)
• DataWarehouse, Qthreads, Node-level resource manager
• Trilinos (Solvers, Tpetra, Sacado, Stokhos, Panzer, Tempus, KokkosKernels)
• Percept, IOSS, Exodus, CGNS, NetCDF, PnetCDF, HDF5
• In-situ visualization (VTK-m, Catalyst)

Exascale Challenge Problem

• Self consistent simulation from a hostile builder device, radiation transport, plasma generation and propagation to NW system circuits, cables and components with uncertainties

• Develop coupled Source Region ElectroMagnetic Pulse (SREMP) to System Generated ElectroMagnetic Pulse (SGEMP) simulation. Physical spatial domain on the order of kilometers down to system geometry down to millimeters

• Efficient radiation transport and air chemistry through Direct Simulation Monte Carlo (DSMC) in rarified domains and condensed time history in thick regions

• Hybrid meshing (unstructured/regular mesh) for geometric fidelity near geometry and performance in the bulk domain for particle/radiation transport

• Single integrated code base with efficient execution on diverse modern hardware

Development Plan
Y1: Complete development on fluid representation of plasma models with simple sources verified – SREMP problem at low altitudes
Y2: Simple radiation transport and PIC coupled to EM/ES fields
Y3: PIC code verified for simple problems – SGEMP problem at high altitude
Y4: Initial coupled PIC/Fluid approach for plasma simulation – Kinetic SREMP problem at middle to low altitudes

Risks and Challenges
• Embedded uncertainty propagation through stochastic methods such as DSMC: research is in its infancy
• Scalable solvers for high thread concurrency are not available in solver tools, and it is not clear such tools exist in the literature
• Particle-fluid exchange of moment densities (mass, momentum and energy)
• Coupling of uncertainties between fluid and particle based codes
• Load-balancing computing between particle and field codes, AMT technologies

5Exascale Computing Project

Software Technology Requirements
For Sandia ECP/ATDM ElectroMagnetic Plasma Application (EMPRESS)

• Programming Models and Runtimes
– C++/C++17 (1), Python (1), C (1)
– MPI (1), OpenMP (1), CUDA (1), DARMA (1)
– PGAS (3), Kokkos (1)
– Boost (1)

• Tools
– CMake (1), Git (1), GitLab (1), PAPI (2), DDT (3), VTune (2), Jenkins (1), CTest (1), CDash (1), LLVM/Clang (2)

• Mathematical Libraries, Scientific Libraries, Frameworks
– Trilinos (1), AgileComponents (1), BLAS/LAPACK (1), cuBLAS (1), DARMA (2), Drekar (1), Sacado/Stokhos (1), Dakota (1)

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

6Exascale Computing Project

Software Technology Requirements
For Sandia ECP/ATDM ElectroMagnetic Plasma Application (EMPRESS)

• Data Management and Workflows
– MPI-IO (1), HDF5 (1), NetCDF (1), Exodus (1), ADIOS (3), STK (1), CGNS (2), DataWarehouse (1)

• Data Analytics and Visualization
– VisIt (1), ParaView (1), Catalyst (2), VTK (3), FFTW (3)

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

7Exascale Computing Project

Virtual Flight Testing for Hypersonic Re-Entry Vehicles
PI: Micah Howard (SNL)

Applications
• SPARC (continuum compressible Navier-Stokes, hypersonic gas dynamics)
• SPARTA (direct simulation Monte Carlo, rarefied gas dynamics)
• Sierra (Aria – thermal response, Salinas – structural dynamics)
Software Technologies Cited
• C++
• MPI, Kokkos (OpenMP, CUDA), DARMA (Charm++)
• DataWarehouse, Qthreads, Node-level resource manager
• Trilinos (Belos, MueLu, Tpetra, Sacado, Stokhos, KokkosKernels)
• Percept, IOSS, Exodus, CGNS, NetCDF, PnetCDF, HDF5
• In-situ visualization (VTK-m, Catalyst)

Exascale Challenge Problem

• Virtual flight test simulations of re-entry vehicles from bus separation (exo-atmospheric) to target for normal and hostile environments

• DSMC-based simulation of exo-atmospheric flight regime with hand-off to continuum Navier-Stokes at appropriate altitude

• Time-accurate wall-modeled LES of high Reynolds number (100k-10M) hypersonic gas dynamics

• Fully-coupled simulation of re-entry vehicle ablator/thermal (shape change, ablation products blowing) and structural dynamic (random vibration) response

• DNS and DSMC enhanced reacting gas models and turbulence models via a-priori and on-the-fly model parameter calculations

• Embedded sensitivity analysis, uncertainty quantification and optimization

Development Plan
FY17: Demonstrate extreme-scale mesh generation and refinement; continue UQ development efforts; begin research activity on scalable solvers; develop low-dissipation schemes for unsteady turbulent gas dynamics; document KNL performance on ATS-1 (Trinity)
FY18: Focus on DARMA task-parallelism implementation; continue physics model development and implementation for hypersonic turbulent flows
FY19: Focus on multi-physics coupling and simulation development, including workflows; continue DARMA development activities; document GPU performance on ATS-2 (Sierra)
FY20: Full-physics (SPARTA-SPARC coupling, unsteady hypersonic turbulent flows, ablator & structural response coupling) demonstrations with UQ and optimization; document performance

Risks and Challenges
• Scalable solvers for hypersonic gas dynamics (multigrid methods for hyperbolic problems)
• Extreme-scale mesh generation and refinement
• Accurate LES models for hypersonic gas dynamics
• Developing appropriate hypersonic boundary layer/ablator surface interaction models
• Effective task-parallelism and load balancing of heterogeneous physical model workloads

8Exascale Computing Project

Software Technology Requirements
For Sandia ECP/ATDM Hypersonic Reentry Application (SPARC)

• Programming Models and Runtimes
– C++/C++17 (1), Python (2)
– MPI (1), Kokkos (1) [including OpenMP, CUDA], DARMA (1) [including Charm++, HPX, Legion/Regent]
– Boost (2)
– UPC/UPC++ (3), PGAS (3)

• Tools
– CMake (1), Git (1), GitLab (2), Gerrit (2), DDT (1), TotalView (1), Jenkins (1), CDash (1)
– VTune (1), TAU (2), OpenSpeedShop (2), PAPI (2)
– LLVM/Clang (2)

• Mathematical Libraries, Scientific Libraries, Frameworks
– Trilinos (1), AgileComponents (1), BLAS/PBLAS (1), LAPACK/ScaLAPACK (1), Metis/ParMetis (1), SuperLU (1), MueLu (1), Dakota (1)
– Chombo (3)

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

9Exascale Computing Project

Software Technology Requirements
For Sandia ECP/ATDM Hypersonic Reentry Application (SPARC)

• Data Management and Workflows
– MPI-IO (1), HDF (1), NetCDF (1)
– GridPro (meshing) (1), Pointwise (meshing) (1)
– Sierra (1), STK (1), DTK (2), CGNS (1), DataWarehouse (1)

• Data Analytics and Visualization
– ParaView (1), EnSight (1)
– Slycat (2)

• System Software

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

10Exascale Computing Project

Enabling GAMESS for Exascale Computing in Chemistry & Materials
Heterogeneous Catalysis on Mesoporous Silica Nanoparticles (MSN)

PI: Mark Gordon (Ames)

• MSN: highly effective and selective heterogeneous catalysts for a wide variety of important reactions

• MSN selectivity is provided by “gatekeeper” groups (red arrows) that allow only desired reactants A to enter the pore, keeping undesirable species B from entering the pore

• Presence of solvent adds complexity: Accurate electronic structure calculations are needed to deduce the reaction mechanisms, and to design even more effective catalysts

• Narrow pores (3-5 nm) create a diffusion problem that can prevent product molecules from exiting the pore, hence the reaction dynamics must be studied on a sufficiently realistic cross section of the pore

• Adequate representation of the MSN pore requires ~10-100K atoms with a reasonable basis set; reliably modeling an entire system involves >1M basis functions (see the rough estimate after this list)

• Understanding the reaction mechanism and dynamics of the system(s) is beyond the scope of current hardware and software – requiring capable exascale
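As a rough consistency check on the atom and basis-function counts above (the per-atom figure below is an illustrative assumption, not a number from the project): a polarized double-zeta quality basis contributes very roughly 10-20 functions per atom on average for a silica/organic/hydrogen mix, so

```latex
% Back-of-the-envelope estimate, assuming ~10-20 basis functions per atom on average:
\[
  N_{\mathrm{basis}} \;\approx\; N_{\mathrm{atoms}} \times \bar{n}_{\mathrm{bf/atom}}
  \;\approx\; 10^{5} \times (10\text{--}20) \;\approx\; (1\text{--}2)\times 10^{6},
\]
```

which is consistent with the >1M basis functions quoted for the upper end of the 10-100K atom range.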

11Exascale Computing Project

Enabling GAMESS for Exascale Computing in Chemistry & Materials
PI: Mark Gordon (Ames)

Applications
• GAMESS (General Atomic and Molecular Electronic Structure System), QMCPACK, NWChem, PSI4
Software Technologies Cited
• Fortran, C++, Python, MPI, OpenMP, OpenACC, CUDA
• Swift, DisPy, Luigi, BLAS
• MacMolPlt
• Gerrit, Git, Doxygen
• ASPEN, Oxbow

Exascale Challenge Problem

• Design new materials for heterogeneous catalysis in ground & excited electronic states

• Employ ab initio electronic structure methods to analyze the reaction mechanisms and selectivity of mesoporous silica nanoparticles (MSN) with >10K atom simulations

• Impact development of green energy sources due to MSN catalysts’ ability to provide specific conversion of cellulosic based chemicals into fuels or other industrially important products

• Reduce time & expense of supporting experimental efforts; supports R&D in photochemistry, photobiology, ion solvation

• Develop strategies to reduce power consumption and common driver for multiple program interoperability

Development Plan
• Initiate GAMESS code analysis
• Initiate GAMESS refactoring by enabling OpenMP for FMO/EFMO/EFP methods
• Release a new version of GAMESS
• Release a new version of libcchem with RI-MP2 energies and gradients
• Complete and assess an initial threaded GAMESS RI-MP2 energy + gradient code, conduct benchmarks
• Initiate the development of a GAMESS-QMCPACK interface in collaboration with the QMCPACK group

Risks and Challenges
• Meeting power capping targets due to size of the computations
• GAMESS-QMCPACK interoperability
• Optimizing use of on- and off-node hierarchical memory
• Interfacing QMC kernels to libcchem (C++ library for electronic structure codes)
• Hardware architecture uncertainties
• Refactoring of a very large and mature code base

12Exascale Computing Project

Software Technology Requirements
Enabling GAMESS for Exascale Computing in Chemistry & Materials

• Programming Models and Runtimes
1. Fortran, C++/C++17, Python, C, JavaScript, MPI, OpenMP, OpenACC, CUDA, OpenCL, GDDI, PaRSEC, Boost, DASK-Parallel, pybind11, OpenMP 4.x
2. TiledArrays, UPC/UPC++, Co-Array Fortran, Julia
3. Argobots, HPX, Kokkos, RAJA, Thrust, OpenSHMEM, SYCL, TASCEL

• Tools
1. HPCToolkit, PAPI, Oxbow, ASPEN, CMake, Git, TAU, GitLab, Docker, Gerrit, GitHub, Travis CI, HDF5, PSiNSTracer, EventTracer, PEBIL, VecMeter
2. Jira, Cython, PerfPal
3. LLVM/Clang, ROSE, Jira, Caliper, CDash, Flux, Shifter, ESGF, EPAX

• Mathematical Libraries, Scientific Libraries, Frameworks
1. BLAS/PBLAS, PETSc, LAPACK/ScaLAPACK, libint, libcchem, MKL, FFTW, MOAB, NumPy, SciPy, nose, MAGMA
2. DPLASMA, SymPy
3. BoxLib, HYPRE, Chombo, SAMRAI, Metis/ParMetis, SuperLU, Repast HPC, APOSMM, HPGMG, Dakota

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

13Exascale Computing Project

Software Technology Requirements
Enabling GAMESS for Exascale Computing in Chemistry & Materials

• Data Management and Workflows
1. Python-pandas, HDF, Swift, DisPy, Luigi
2. BlitzDB, Drake
3. Airflow

• Data Analytics and Visualization
1. MacMolPlt, Matplotlib
2. Seaborn, Mayavi, h5py, PyTables, Statsmodels
3. Pygal, Chaco

• System Software

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

14Exascale Computing Project

Software Technology Plans
Enabling GAMESS for Exascale Computing in Chemistry & Materials

• Data Management and Workflows
– Common Driver for Quantum Chemistry: capable of performing computations using multiple quantum chemistry codes (with QCDB a common shared component), possibly using RabbitMQ as an MPI wrapper

• Data Analytics and Visualization
– MacMolPlt / Matplotlib: used to view molecular systems (~20K atoms); need to tailor to differentiate those molecular fragments computed with different techniques and to scale up the capability
– Quantum Chemistry Database (QCDB): common framework for managing quantum chemical data, e.g., automatic generation of data tables, summaries of error statistics, etc., that are easily transferrable to other quantum chemistry applications
– In-situ data analysis (including filtering and reduction) and visualization (to various points in the visualization pipeline) can help reduce data movement and storage. Workflow analysis tools such as Panorama can be used to optimize data movement

15Exascale Computing Project

NWChemEx: Tackling Chemical, Materials and Biomolecular Challenges in the Exascale Era

• Deliver molecular and materials modeling capabilities for development of new feedstocks for production of biomass on marginal lands, and new catalysts for the conversion of these feedstocks to usable biofuels and other products

• NWChemEx will be a technology step change relative to NWChem, a powerful molecular modeling application (PNNL)

• Deliver a redesigned NWChem to enhance its scalability, performance, extensibility, and portability and enable NWChemEx to take full advantage of capable exascale

• A modular, library-oriented framework with new algorithms to reduce complexity, enhance scalability, and exploit new computer science approaches to reduce memory requirements and separate software implementation from hardware details

• Implement leading-edge developments in computational chemistry, computer science, and applied mathematics

PI: Thom Dunning (PNNL)

16Exascale Computing Project

NWChemEx: Tackling Chemical, Materials and Biomolecular Challenges in the Exascale Era
PI: Thom Dunning (PNNL)

Applications
• NWChemEx (evolved from redesigned NWChem)
Software Technologies Cited
• Fortran, C, C++
• Global Arrays, TiledArrays, PaRSEC, TASCEL
• VisIt, Swift
• TAO, Libint
• Git, SVN, Jira, Travis CI
• Co-Design: CODAR, CE-PSI, ExaGraph

Exascale Challenge Problem

• Aid and accelerate advanced biofuel development by exploring new feedstocks for efficient production of biomass for fuels and new catalysts for efficient conversion of biomass-derived intermediates into biofuels and bioproducts

• Molecular understanding of how proton transfer controls protein-assisted transport of ions across biomass cellular membranes, often seen as a stress response in biomass, would lead to more stress-resistant crops through genetic modifications

• Molecular-level prediction of the chemical processes driving the specific, selective, low-temperature catalytic conversion (e.g., Zeolites, such as H-ZSM-5) of biomass-derived alcohols into fuels and chemicals in constrained environments

Development Plan
Y1: Framework with tensor DSL, RTS, APIs, execution state tracking; operator-level NK-based CCSD with flexible data distributions and symmetry/sparsity exploitation
Y2: Automated computation of CC energies and 1-/2-body CCSD density matrices; HF and DFT computation of >1K atom systems via multithreading
Y3: Couple embedding with HF and DFT for multilevel memory hierarchies; QMD using HF and DFT for 10K atoms; scalable R12/F12 for 500 atoms with CCSD energies and gradients using task-based scheduling
Y4: Optimized data distribution and multithreaded implementations for the most time-intensive routines in HF, DFT, and CC

Risks and Challenges
• Unknown performance of parallel tools
• Insufficient performance, scalability, or capacity of local memory will require algorithmic reformulation
• Unavailable tools for hierarchical memory, I/O, and resource management at exascale
• Unknown exascale architectures
• Unknown types of correlation effects for systems with large numbers of electrons
• Framework cannot support effective development

17Exascale Computing Project

Software Technology Requirements
NWChemEx

• Programming Models and Runtimes
1. Fortran, Python, C++, Global Arrays, MPI, OpenMP, CUDA
2. Intel TBB, OpenCL, PaRSEC, TASCEL
3. OpenCR

• Tools
1. CHiLL, ADIC/Sacado/OpenAD, PAPI, HPCToolkit
2. LLVM, TAU, Travis, GitHub
3. -

• Mathematical Libraries, Scientific Libraries, Frameworks
1. BLAS + (Sca)LAPACK (single-threaded, multi-threaded and massively parallel), Elemental, FFTW, MADNESS, libint
2. -
3. -

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

18Exascale Computing Project

Software Technology Requirements
NWChemEx

• Data Management and Workflows
1. MPI-IO, HDF, Swift
2. ADIOS
3. -

• Data Analytics and Visualization
1. VisIt
2. -
3. -

• System Software
1. Standard Linux software development environment, Eclipse
2. -
3. -

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

19Exascale Computing Project

Multiscale Coupled Urban Systems
Charlie Catlett (ANL), Mary Ann Piette (LBNL), Tianzhen Hong (LBNL), Budhendra Bhaduri (ORNL), Thom Dunning (PNNL), John Fey (PNNL), Nancy Carlisle (NREL), Daniel Macumber (NREL), Ralph Muehleisen (ANL)…

• Impacts of greenhouse gases (GHG) on local climate
• Resulting impacts on city function
• Incorporation of renewables into city energy portfolio
• Resilience of physical infrastructure
• Economic protection, resilience, and enhancement
• …

[Slide diagram: coupled sectors (Urban Atmosphere, Buildings, Environment & Infrastructure Real-Time State, Population Dynamics, Social-Economic & Activities, Transportation) exchanging quantities such as response times, trip plans and service demands, vehicle emissions and heat, weather, building emissions and heat, wind, pressure, heat, moisture, vehicle mix and driving habits, building demand, and building mix and pricing; fed by municipal data sources, sensor networks, and census, social, and mobility sources spanning sensitive, sensed, and open data.]

20Exascale Computing Project

Multiscale Coupled Urban Systems
PI: Charles Catlett (ANL)

Applications
• MetroSEED framework to integrate sector-specific models
• chiSIM, RegCM4, WRF, Nek5000, EnergyPlus, CityBES, URBANOpt, TUMS, Polaris, P-MEDM
Software Technologies Cited
• Fortran, C, C++, Ruby, Python, JavaScript, C#, R, Swift/T
• MPI, OpenMP, OpenACC
• MOAB, PETSc, Boost, CityGML, SIGMA, OpenStudio, LandScan USA, Repast HPC, OpenStreetMap, TRANSIMS
• CESIUM

Exascale Challenge Problem

• Urbanization is increasing demand for energy, water, transportation, healthcare, infrastructure, physical & cyber security and resilience, food, education—and deepening the interdependencies between these systems. New technologies, knowledge, tools needed to retrofit / improve urban districts, with capability to model and quantify interactions between urban systems

• Integrate modules for urban atmosphere and infrastructure heat exchange and air flow; building energy demand at district or city-scale, generation, and use; urban dynamics & activity based decision, behavioral, and socioeconomic models; population mobility and transportation; energy systems; water resources

• Chicago metro area as testbed for coupling agent-based social/economics model with transportation, regional climate, CFD microclimate, energy of (up to 800K) buildings

Development Plan
Y1: Nek5000 verification; EnergyPlus, TUMS, Polaris, chiSIM, P-MEDM on HPC; EnergyPlus 500-building model and simulation; city sub-region meshes; model integration and data exchange architecture (CityGML-based)
Y2: Nek5000 + EnergyPlus coupling; CityGML + EnergyPlus 10K-building model and simulation; P-MEDM + TUMS + chiSIM coupling; WRF RCP4.5 high-resolution simulations; demo P-MEDM + TUMS coupling
Y3: Integration with transportation / ABM; 50-building simulation with coupled atmosphere + EnergyPlus model; CityGML + EnergyPlus 100K-building model and simulation; coupling of chiSIM with TUMS + P-MEDM
Y4: Demo coupled system on all platforms; CityGML + EnergyPlus models for 800K buildings; fully coupled simulations of 500 buildings on Aurora; 800K buildings with downscaled atmospheric input; tune performance of transportation & coupled 800K buildings on Aurora

Risks and Challenges
• HPC resources may not be fully available
• Some city data may not be available
• Synergistic software and co-design projects may not be supported by ECP
• Difficulties in coupling urban atmosphere and buildings models at individual building time steps
• Scalability of building energy code (EnergyPlus)
• Scalability of agent-based models
• Limited stakeholder testing and assessment

21Exascale Computing Project

Software Technology Requirements
Multiscale Coupled Urban Systems

• Definitely plan to use (Rank 1)
– Programming Models and Runtimes: C++, C, C#, R, JavaScript, Python, Fortran, MPI, OpenMP, OpenACC, OpenStudio, Repast HPC, TUMS, POLARIS, TRANSIMS, Nek5000, WRF, EnergyPlus
– Tools: CMake, Git, TAU, GitLab, Jira, Valgrind, HPCToolkit
– Mathematical Libraries, Scientific Libraries, Frameworks: PETSc, MOAB, CouPE, Boost, LAPACK, BLAS and other standard libraries for scientific computing
– Data Management and Workflows: Swift, Swift/T, MPI-IO, HDF5, CityBES, CityGML, SensorML, RabbitMQ, URBANOpt, LandScan, CESIUM
– Data Analytics and Visualization: Matplotlib, Graphviz, ParaView, WebGL, VisIt
– System Software: Pmem, libnuma, memkind

• Will explore as an option (Rank 2)
– Programming Models and Runtimes: Globus Online, Scala, CUDA, PyCUDA
– Tools: Origami, OpenTuner, Orio, PAPI
– Mathematical Libraries, Scientific Libraries, Frameworks: CNTK, scikit-learn, Pylearn2, PETSc, BLAS, cuBLAS, cuSPARSE, H2O, Neon
– Data Management and Workflows: Jupyter
– Data Analytics and Visualization: Cesium, WorldWind

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

22Exascale Computing Project

Computing the Sky at Extreme Scales
Elucidating cosmological structure formation by uncovering how smooth and featureless initial conditions evolve under gravity in an expanding universe to eventually form a complex cosmic web

• Modern cosmological observations have led to a remarkably successful model for the dynamics of the Universe; its three key ingredients (dark energy, dark matter, and inflation) are signposts to further breakthroughs, as all reach beyond the known boundaries of the particle physics Standard Model

• A new generation of sky surveys will provide key insights and new measurements such as of neutrino masses

• New discoveries - e.g., primordial gravitational waves and modifications of general relativity - are eagerly awaited

• Capable exascale simulations of cosmic structure formation are essential to shed light on some of the deepest puzzles in all of physical science, through a comprehensive program to develop and apply a new extreme-scale cosmology simulation framework for verification of gravitational evolution, gas dynamics, and subgrid models at very high dynamic range

PI: Salman Habib (ANL)

23Exascale Computing Project

Computing the Sky at Extreme Scales
PI: Salman Habib (ANL)

Applications
• HACC, Nyx
Software Technologies Cited
• UPC++, C++17
• MPI, OpenMP, OpenCL, CUDA
• BoxLib, High-Performance Geometric Multigrid (HPGMG), FFTW, PDACS
• Thrust

Exascale Challenge Problem

• To evolve and meld aspects of the capabilities of Lagrangian particle-based techniques (gravity + gas) with Eulerian adaptive mesh resolution (AMR) methods to achieve a unified cosmological simulation approach at the exascale

• determine the dark energy equation of state
• search for deviations from general relativity
• determine the neutrino mass sum (to less than 0.1 eV) from galaxy clustering measurements
• characterize the properties of dark matter
• test the theory of inflation

Development Plan
Y1: First major HACC hydro simulation on Theta on the full machine; first HACC tests on an IBM Power8/NVIDIA Pascal 36-node early-access system; release HACC & Nyx
Y2: Access to 25% of the Summit system as part of the CAAR project with HACC, scaling runs and optimization; Nyx scale-up test on Cori/Theta: clusters of galaxies (deep AMR); Summit CAAR project simulations: HACC hydrodynamic simulations on the full machine; release HACC & Nyx
Y3: HACC and Nyx scaling runs; scaling of CosmoTools to full scale on Aurora; meeting FOMs; scaling of CosmoTools to full scale on Summit; release HACC & Nyx
Y4: Final major HACC and Nyx code releases

Risks and Challenges
• Accuracy of subgrid modeling
• Filesystem stability and availability, and fast access to storage for post-processing
• Loss of personnel: the team is relatively small; need to avoid single points of failure
• Resilience: machine MTBF

24Exascale Computing Project

Software Technology Requirements
Cosmology

Definitely plan to use (Rank 1)
• Programming Models and Runtimes: C++/C++17, C, Fortran, GASNet, Python, MPI, OpenMP, CUDA, Thrust, UPC++
• Tools: Make, SVN, Git, GitLab, Valgrind, LLVM/Clang, VTune, SKOPE
• Mathematical Libraries, Scientific Libraries, Frameworks: FFTW, MKL, HPGMG, VODE, BoxLib, TensorFlow
• Data Management and Workflows: MPI-IO, HDF5, Jupyter, Globus, Smaash, Docker, Shifter, PDACS/Galaxy, PnetCDF
• Data Analytics and Visualization: ParaView, VTK-m, vl3, CosmoTools, Gimlet, Reeber

Will explore as an option (Rank 2)
• Programming Models and Runtimes: OpenACC, OpenCL, Julia, Chapel, Kokkos, RAJA
• Tools: FTI, SKOPE, SCR, CMake, Vampir
• Data Management and Workflows: Swift, Decaf
• Data Analytics and Visualization: BayesDB, yt, R

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

25Exascale Computing Project

Data Analytics and Visualization Plans
Computing the Sky at Extreme Scales (cosmology)

• Use of NVRAM will help in implementing in situ analysis

• Dynamically partition MPI processes into MPI groups that simultaneously perform different tasks, the analysis groups synchronizing with the main "simulation" group when required (see the sketch after this list)

• 3 levels of data hierarchy
– Level 1: raw/compressed simulation data
– Level 2: analyzed/reduced simulation data
– Level 3: further reduced to a database or catalog level

• Data reduction and in situ analysis act on level 1/2 data sets to produce level 2/3 data
– Level 2 data is further analyzed in situ or offline; level 3 is primarily for offline analyses with databases
– In the future, offline analysis of level 1 data will be severely disfavored, and a new offline/in situ boundary will enter at level 2

• Tools and technologies being used and/or explored
– ParaView, VTK-m, vl3, CosmoTools, Gimlet, Reeber, BayesDB, yt, R
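A minimal sketch of that partitioning pattern using plain MPI_Comm_split; the 1-in-4 analysis-rank ratio and the send/receive pairing are illustrative assumptions, not the actual HACC/Nyx implementation.

```cpp
#include <mpi.h>
#include <cstdio>
#include <vector>

// Hypothetical partitioning sketch: most ranks run the simulation, a few run
// in situ analysis, and the two groups synchronize through the world
// communicator when a snapshot of reduced data is ready.
int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int world_rank, world_size;
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);

  // Example policy (assumption): 1 analysis rank for every 4 ranks.
  const int is_analysis = (world_rank % 4 == 0) ? 1 : 0;

  // Split the world into a "simulation" communicator and an "analysis" communicator.
  MPI_Comm task_comm;
  MPI_Comm_split(MPI_COMM_WORLD, is_analysis, world_rank, &task_comm);
  int group_size;
  MPI_Comm_size(task_comm, &group_size);

  std::vector<double> reduced(8, world_rank);  // stand-in for level-2 reduced data

  if (!is_analysis) {
    // ... advance the simulation; when a snapshot is ready, ship reduced data
    // to the partner analysis rank (here: the nearest lower multiple of 4).
    int partner = (world_rank / 4) * 4;
    MPI_Send(reduced.data(), static_cast<int>(reduced.size()), MPI_DOUBLE,
             partner, 0, MPI_COMM_WORLD);
  } else {
    // Analysis ranks gather reduced data from their simulation partners.
    for (int src = world_rank + 1; src < world_size && src < world_rank + 4; ++src) {
      MPI_Recv(reduced.data(), static_cast<int>(reduced.size()), MPI_DOUBLE,
               src, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      // ... run in situ analysis on the received buffer here.
    }
  }

  std::printf("rank %d joined the %s group (%d ranks)\n", world_rank,
              is_analysis ? "analysis" : "simulation", group_size);
  MPI_Comm_free(&task_comm);
  MPI_Finalize();
  return 0;
}
```

In practice the buffers would carry level 2 data products, and the analysis ranks would write level 3 catalogs or feed further in situ tools.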

26Exascale Computing Project

Exascale Deep Learning Enabled Precision Medicine for Cancer
CANDLE accelerates solutions toward three top cancer challenges

• Focus on building a scalable deep neural network code called the CANcer Distributed Learning Environment (CANDLE)

• CANDLE addresses three top challenges of the National Cancer Institute:

1. Understanding the molecular basis of key protein interactions
2. Developing predictive models for drug response, and automating the analysis
3. Extraction of information from millions of cancer patient records to determine optimal cancer treatment strategies

PI: Rick Stevens (ANL)

27Exascale Computing Project

Exascale Deep Learning Enabled Precision Medicine for Cancer
PI: Rick Stevens (ANL)

Applications
• Deep learning platforms as candidates for the foundation for CANDLE:
• Theano, Torch/fbcunn, TensorFlow, DSSTNE, Neon, Caffe, Poseidon, LBANN, CNTK
Software Technologies Cited
• cuDNN, BLAS and DAAL, OpenMP, OpenACC, CUDA, MPI, PGAS

Exascale Challenge Problem

• Three cancer problem drivers: RAS pathway problem, drug response problem, treatment strategy problem

• Focus is on the machine learning aspect of the three problems
• Building a single scalable deep neural network code we call CANDLE: CANcer Distributed Learning Environment
• Common threads:
– cancer types at all three scales: molecular, cellular and population
– significant data management and data analysis problems
– need to integrate simulation, data analysis and machine learning

Development Plan
Y1: Release CANDLE 1.0, scalable to at least 1,000 nodes, 10 billion weights, 10 million neurons
Y2: Release CANDLE 2.0, scalable to at least 5,000 nodes, 30 billion weights, 30 million neurons
Y3: Release CANDLE 3.0, scalable to at least 10,000 nodes, 100 billion weights, 100 million neurons
Y4: Release CANDLE 4.0, scalable to at least 50,000 nodes, 300 billion weights, 300 million neurons

Risks and Challenges
• CANDLE scalability may require more RAM than available on nodes
• Distributed implementation of CANDLE may not achieve high utilization of the high-performance interconnections between nodes
• None of the existing deep neural network code bases will have all the features required for CANDLE, particularly the need to interface to scalable simulations
• NCI-provided data insufficient for proper training / testing of deep learning networks

28Exascale Computing Project

Software Technology Requirements
Precision Medicine for Oncology

• Definitely plan to use (Rank 1)
– Programming Models and Runtimes: C++, C, Python, Scala, Fortran, MPI, OpenMP, Spark, OpenACC, CUDA, PGAS, Globus Online, Boost, OpenSHMEM, Lua
– Tools: CMake, Git, TAU, GitLab, Jira, Valgrind, PAPI, LLVM/Clang, HPCToolkit, Jenkins
– Mathematical Libraries, Scientific Libraries, Frameworks: cuDNN, ESSL, MKL, DAAL, TensorFlow, Caffe, Torch, Theano, Mocha, LAPACK, BLAS and other standard libraries for scientific computing
– Data Management and Workflows: Swift, MPI-IO, HDF5, Jupyter, Digits, DataSpaces, Bellerophon environment analysis for materials (BEAM@ORNL)
– Data Analytics and Visualization: deep visualization toolbox for deep learning, Grafana, Matplotlib, Graphviz, ParaView
– System Software: Pmem, libnuma, memkind

• Will explore as an option (Rank 2)
– Programming Models and Runtimes: Java, Thrust, Minerva, Latte
– Tools: Origami, OpenTuner, Orio
– Mathematical Libraries, Scientific Libraries, Frameworks: CNTK, scikit-learn, Pylearn2, PETSc, BLAS, cuBLAS, cuSPARSE, H2O, Neon
– Data Management and Workflows: Mesos, Heron, Beam, Zeppelin, Docker
– Data Analytics and Visualization: R, EDEN, Origami Graph Visualization on Everest

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

29Exascale Computing Project

Software Technology Requirements
Precision Medicine for Oncology

• Might be useful but no concrete plans (Rank 3)
– Programming Models and Runtimes: Julia, EventWave for event-based computing/simulations
– Tools: GraphLab, TAO
– Mathematical Libraries, Scientific Libraries, Frameworks: GraphLab, ADIC, Keras, TensorFlow Serving
– Data Analytics and Visualization: ActiveFlash

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

30Exascale Computing Project

Exascale Lattice Gauge Theory Opportunities and Requirements for Nuclear and High Energy Physics
Lattice quantum chromodynamics (QCD) calculations are the scientific instrument to connect observed properties of hadrons to the fundamental laws of quarks and gluons, and are critically important to particle and nuclear physics experiments in the decade ahead

• Lattice QCD has made formidable progress in formulating the properties of hadrons (particles containing quarks), but experimental particle and nuclear physics programs require lattice calculations orders of magnitude more demanding still

• Searching for the tiny effects of yet-to-be-discovered physics beyond the standard model, particle physics must have simulations accurate to ~0.10%, an order of magnitude more precise than typically realized today

• To accurately compute properties and interactions of hadrons and light nuclei, nuclear physics needs lattice calculations on much larger volumes to investigate multi-hadron states in a reliable controlled way

• Exascale lattice gauge theory will make breakthrough advances possible in particle and nuclear physics

PI: Paul Mackenzie (FNAL)

31Exascale Computing Project

Exascale Lattice Gauge Theory Opportunities and Requirements for Nuclear and High Energy Physics
PI: Paul Mackenzie (FNAL)

Applications
• MILC, Columbia Physics System, Chroma, QDP++ (all built upon the USQCD software infrastructure)
Software Technologies Cited
• MPI, OpenMP, CUDA, Kokkos, OpenACC, C++17, Thrust
• SYCL, QUDA, QPhiX, LAPACK, ARPACK

Exascale Challenge Problem

• Develop a software infrastructure that exploits recent compiler advances and improved language support to enable the creation of portable, high-performance QCD code with a shorter software tool-chain

• Focus on two nuclear/HEP applications:
• Compute from first principles the properties and interactions of nucleons and light nuclei with physical quark masses, and achieve the multi-physics goal of incorporating both QCD and electromagnetism
• Search for beyond-the-standard-model physics by increasing the precision of calculations of the properties of quark-antiquark and three-quark states

Development Plan
Y1: Develop adaptive multigrid for domain wall and staggered fermions
Y2: Release new versions of old apps augmented with new implementations of algorithms; initial release of workflow framework
Y3: Release of data-parallel API with GPU support; release benchmark suite for non-volatile memory storage; scalable MG-based deflation methods with variance reduction applicable to other HPC domains
Y4: Validated and documented high-performance code implementing the 1-2 most successful algorithms for reducing critical slowing down

Risks and Challenges
• Critical slowing down in gauge evolution
• Correlation functions for large nuclei
• Sub-optimal solver performance
• Multi-level time integration for correlation functions
• Performance in data-parallel GPU offload
• Extending the multigrid solver base
• Architectural pathfinding (appropriate parallelization approach)

32Exascale Computing Project

Molecular Dynamics at the Exascale: Spanning the Accuracy, Length and Time Scales for Critical Problems in Materials Science (EXAALT)
Combining time-acceleration techniques, spatial decomposition strategies, and high accuracy quantum mechanical and empirical potentials

• Tackle materials challenges for energy, especially fission and fusion, by allowing the scientist to target, at the atomistic level, the desired region in accuracy, length, and time space

• Shown here is a simulation aimed at understanding tungsten as a fusion first-wall material, where plasma-implanted helium leads to He bubbles that grow and burst at the surface, ultimately leading to surface "fuzz" by a mechanism not yet understood

• At slower, more realistic growth rates (100 He/µsec), the bubble shows a different behavior, with less surface damage, than the fast-grown bubble simulated with direct molecular dynamics (MD)

• Atomistic simulation allows for complete microscopic understanding of the mechanisms underlying the behavior

• At the slower growth rate, crowdion interstitials emitted from the bubble have time to diffuse over the surface of the bubble, so that they are more likely to release from the surface-facing side of the bubble, giving surface-directed growth.
[Figure: slowly-growing He bubble in W at bursting]

PI: Arthur Voter (LANL)

33Exascale Computing Project

Molecular Dynamics at the Exascale: Spanning the Accuracy, Length and Time Scales for Critical Problems in Materials Science
PI: Arthur Voter (LANL)

Applications
• LAMMPS, LATTE, AMDF
Software Technologies Cited
• MPI, OpenMP, CUDA
• Kokkos
• VTK, ParaView
• Legion

Exascale Challenge Problem

• Success of MD at risk by inability to reach necessary length & time scales while maintaining accuracy; simple scale-up of current practices only allows larger systems; doesn’t improve current timescales (nsec) & accuracy (empirical potentials)

• Predictive microstructure evolution requires access to msec timescales and accurately describing complex defects with explicit consideration of the electronic structure, which cannot be done using conventional empirical potentials

• Develop novel MD methodologies, driven by two challenges: (1) extending the burnup of nuclear fuel in fission reactors (dynamics of defects & fission gas clusters in UO2) and (2) developing plasma facing components (tungsten first wall) to resist the harsh conditions of fusion reactors

• Bring three state of the art MD codes into a unified tool to leverage exascale platforms across dimensions of accuracy, length, and time

Development Plan
Y1: Code integration demonstration on homogeneous nodes (problems 1, 2a); EXAALT package release
Y2: Science-at-scale demonstration on homogeneous nodes (Trinity, Mira) for problems 1 and 2a,b at a target simulation rate of 2 µsec/day; EXAALT package release
Y3: Science-at-scale demonstration on heterogeneous nodes (e.g., Summit) for problems 1 and 2a,b at a target simulation rate of 10 µsec/day; EXAALT package release
Y4: Science-at-scale demonstration (e.g., Aurora, Summit) for problems 1 and 2a,b at a target simulation rate of 50 µsec/day; final EXAALT package release

Risks and Challenges
• Lowest levels of DFTB theory might prove insufficient for actinide-bearing materials. Might require higher order DFTB expansion.

• Performance of spatially parallel replica-based AMD methods (SLParRep) might be affected by certain kinds of very low barrier events.

• SNAP potential descriptors may not represent the potential energy surface of target multispecies materials with sufficient accuracy. Would have to augment the descriptor set.

34Exascale Computing Project

Software Technology Requirements
Molecular Dynamics at the Exascale: Spanning the Accuracy, Length and Time Scales for Critical Problems in Materials Science

• Programming Models and Runtimes
1. MPI, Kokkos
2. Legion and various other task-management runtimes, OpenMP, CUDA, various fault-tolerant communication libraries (e.g., 0MQ)
3. HPX, Charm++

• Tools
1. Git/GitLab

• Mathematical Libraries, Scientific Libraries, Frameworks
1. Dakota, FFTW

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

35Exascale Computing Project

Software Technology Requirements
Molecular Dynamics at the Exascale: Spanning the Accuracy, Length and Time Scales for Critical Problems in Materials Science

• Data Management and Workflows
1. We currently use bdb on node, but we plan to assess various other embedded and distributed databases (such as MDHIM)
3. Swift, DHARMA

• Data Analytics and Visualization
1. VTK, ParaView

• System Software

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

36Exascale Computing Project

Exascale Models of Stellar Explosions: Quintessential Multi-Physics Simulation
Creation of the Heavy Elements in the Supernova Explosion of a Massive Star

• White dwarf: small dense star formed when a low-mass star has exhausted its central nuclear fuel and lost its outer layers as a planetary nebula; will eventually happen to the Sun

• Stellar collision: the coming together of two stars that merge into one larger unit through the force of gravity; recently has been observed

• Series of stellar collisions in a dense cluster over time can lead to an intermediate-mass black hole via "runaway stellar collisions”

• Also a source of characteristic “inspiral” gravitational waves!
[Figure: 3D Castro simulation of merging white dwarfs]

• Hot material roils around a newly-born neutron star at the center of a core-collapse supernova

• Shown is the matter entropy from a 3D Chimera simulation of the first half second of the explosion

• Driven by intense heating from neutrinos coming from the cooling neutron star at the center, the shock wave will be driven outward, eventually ripping the star apart and flinging the elements that make up our solar system and ourselves into interstellar space

PI: Daniel Kasen (LBNL)

37Exascale Computing Project

Exascale Models of Stellar Explosions: Quintessential Multi-Physics Simulation
PI: Daniel Kasen (LBNL)

Applications
• CLASH framework: components from FLASH, CASTRO, Chimera, Sedona
Software Technologies Cited
• C, C++, Fortran, UPC++, Python
• MPI, OpenMP, GASNet
• Charm++, Perilla
• AMR (BoxLib), MC
• VisIt, yt, Bellerophon

Exascale Challenge Problem

• Stellar explosion simulations to explain the origin of the elements (especially those heavier than iron), a longstanding problem in physics

• Define the conditions for astrophysical nucleosynthesis that motivate, guide, and exploit experimental nuclear data (near drip lines) from the Facility for Rare Isotope Beams (FRIB)

• Address the physics of matter under extreme conditions, including neutrino and gravitational wave emission and the behavior of dense nuclear matter.

• Fully self-consistent calculations will model all proposed explosive nucleosynthesis sites (core-collapse supernovae, neutron star mergers, accreting black holes) and related stellar eruptions: novae, X-ray bursts and thermonuclear supernovae

Development Plan
Y1: Establish / publish API, components, and composability for high-level framework design
Y2: Core-collapse supernova simulation with two-moment transport; CLASH proto-code release with API examples
Y3: Release of CLASH 1.0; X-ray burst simulation with large network
Y4: Release of entire CLASH ecosystem; binary neutron star simulation: neutrino radiation transport calculated using a moment expansion, with a closure relation calibrated by IMC Boltzmann transport; general relativistic hydro with the metric computed assuming conformal flatness; detailed neutrino/matter coupling (non-isoenergetic scattering) and neutrino velocity-dependent and relativistic effects; nuclear reaction networks including of order 1000 isotopes

Risks and Challenges
• Immaturity of programming models will inhibit progress
• Unavailability of key personnel
• Performance predictions for one or more code modules prove too ambitious
• Modules produced will be incompatible between major code lines

38Exascale Computing Project

Software Technology Requirements
Provided by Exascale Models of Stellar Explosions: Quintessential Multi-Physics Simulation

• Programming Models and Runtimes
1. Fortran, C++, C, MPI, OpenMP, OpenACC, CUDA
2. UPC++, GASNet, Charm++, Co-Array Fortran
3. -

• Tools
1. Git, PAPI
2. -
3. TAU

• Mathematical Libraries, Scientific Libraries, Frameworks
1. BoxLib, Hypre, BLAS, LAPACK
2. MAGMA
3. -

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

39Exascale Computing Project

Software Technology Requirements
Provided by Exascale Models of Stellar Explosions: Quintessential Multi-Physics Simulation

• Data Management and Workflows
1. HDF, Bellerophon, MPI-IO
2. -
3. ADIOS

• Data Analytics and Visualization
1. VisIt, yt
2. -
3. -

• System Software
1. -
2. -
3. -

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

40Exascale Computing Project

High Performance, Multidisciplinary Simulations for Regional Scale Earthquake Hazard and Risk Assessments

• Ability to accurately simulate the complex processes associated with major earthquakes will become a reality with capable exascale

• Simulations offer a transformational approach to earthquake hazard and risk assessments

• Dramatically increase our understanding of earthquake processes

• Provide improved estimates of the ground motions that can be expected in future earthquakes

• Time snapshots (map view looking at the surface of the earth) of a simulation of a rupturing earthquake fault and propagating seismic waves

PI: David McCallen (LBNL)

41Exascale Computing Project

High Performance, Multidisciplinary Simulations for Regional Scale Earthquake Hazard and Risk Assessments
PI: David McCallen (LBNL)

Applications
• SW4, ESSI
Software Technologies Cited
• OpenMP, OpenACC, CUDA, MPI
• Fortran (inner loop of SW4)
• HDF5 (coupling SW4 to ESSI)
• VisIt for visualization

Exascale Challenge Problem

• Build upon and advance simulation and data exploitation capabilities to transform computational earthquake hazard and risk assessments to a frequency range relevant to estimating the risk to key engineered systems (e.g. up to 10Hz).

• Require simulations of unprecedented size and fidelity utilizing measured ground motion data to constrain and construct geologic models that can support high frequency simulations. Highly leverages investment of a commercial partner (Pacific Gas and Electric) in obtaining unprecedented, dense ground motion data at regional scale from a SmartMeter system

• Provide the first strong coupling and linkage between HPC simulations of earthquake hazards (ground motions) and risk (structural system demands) – a true end-to-end simulation of complex, coupled phenomenon

Development Plan
Y1: Define core computational kernels of SW4: OpenMP within MPI partitions for many-core machines, CUDA/OpenACC or OpenMP for GPU/CPU machines (see the sketch after this plan); checkpointing in SW4; distributed processing of the rupture model
Y2: Load balance within MPI partitions; optimize the 2D solver at mesh refinement interfaces; distribute processing for adjoint calculations
Y3: Asynchronous and overlapping MPI communication; mesh refinement within the curvilinear mesh; multi-scale material model for FWI; construct local model from FWI and 5 Hz synthetic data
Y4: Combine local and regional SFBA models; construct local model from FWI and 10 Hz synthetic data
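For readers unfamiliar with the "OpenMP within MPI partitions" pattern referenced in Y1, a generic sketch follows; the toy 1-D stencil, problem size, and variable names are placeholder assumptions, not SW4 code. Each MPI rank owns a slab of the domain and threads its local loop with OpenMP.

```cpp
#include <mpi.h>
#include <omp.h>
#include <cstdio>
#include <vector>

// Toy 1-D update standing in for a wave-propagation kernel: each MPI rank
// owns a contiguous slab of the grid; OpenMP threads the loop over that slab.
int main(int argc, char** argv) {
  int provided;
  MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

  int rank, nranks;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &nranks);

  const int n_local = 1 << 20;  // grid points per rank (illustrative size)
  std::vector<double> u(n_local, 1.0), unew(n_local, 0.0);

  // Threaded local update (interior points only; halo exchange between
  // neighboring ranks is omitted here for brevity).
  #pragma omp parallel for
  for (int i = 1; i < n_local - 1; ++i)
    unew[i] = 0.5 * u[i] + 0.25 * (u[i - 1] + u[i + 1]);

  // One global reduction to mimic an energy/convergence check across ranks.
  double local_sum = 0.0, global_sum = 0.0;
  #pragma omp parallel for reduction(+ : local_sum)
  for (int i = 0; i < n_local; ++i) local_sum += unew[i];
  MPI_Allreduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

  if (rank == 0)
    std::printf("%d ranks x %d threads, sum = %g\n",
                nranks, omp_get_max_threads(), global_sum);

  MPI_Finalize();
  return 0;
}
```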

Risks and Challenges
• Transitioning software to emerging architectures and optimizing performance for both forward and inverse calculations
• Utilization of data to refine geologic models at the regional scales – determine just how far data can push simulation realism

• Developing an inversion simulation capability to extend the frequency of reliability/realism of ground motion simulations to constrain currently ill-defined geologic structure at fine scale

• The realized effectiveness of inversion algorithms at unprecedented scale

42Exascale Computing Project

Software Technology Requirements
Provided by High Performance, Multidisciplinary Simulations for Regional Scale Earthquake Hazard and Risk Assessments

• Programming Models and Runtimes
1. Fortran 2003, C++, C, MPI
2. OpenMP, CUDA
3. OpenACC

• Tools
1. Git, CMake, LLVM/Clang, TotalView, Valgrind
2. ROSE, NVVP (NVIDIA Visual Profiler)
3. HPCToolkit

• Mathematical Libraries, Scientific Libraries, Frameworks
1. BLAS/PBLAS, LAPACK/ScaLAPACK, Proj4 (Cartographic Projections Library)
2. PETSc, BoxLib, Chombo
3. Hypre

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

43Exascale Computing Project

Software Technology Requirements
Provided by High Performance, Multidisciplinary Simulations for Regional Scale Earthquake Hazard and Risk Assessments

• Data Management and Workflows
1. HDF5
2. ASDF (Adaptable Seismic Data Format), Python
3. -

• Data Analytics and Visualization
1. VisIt, ObsPy (Python framework for processing seismological data)
2. GMT (Generic Mapping Tools)
3. -

• System Software
1. Linux/TOSS, Lustre, GPFS
2. -
3. -

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

44Exascale Computing Project

An Exascale Subsurface Simulator of Coupled Flow, Transport, Reactions and Mechanics
PI: Carl Steefel (LBNL)

Applications
• Chombo-Crunch, GEOS
Software Technologies Cited
• C++, Fortran, LLVM/Clang
• MPI, OpenMP, CUDA
• RAJA, CHAI
• Chombo AMR, PETSc
• ADIOS, HDF5, Silo, ASCTK
• VisIt

Exascale Challenge Problem

• Safe and efficient use of the subsurface for geologic CO2 sequestration, petroleum extraction, geothermal energy and nuclear waste isolation

• Predict reservoir-scale behavior as affected by the long-term integrity of the hundreds of thousands of deep wells that penetrate the subsurface for resource utilization

• Resolve pore-scale (0.1-10 µm) physical and geochemical heterogeneities in wellbores and fractures to predict evolution of these features when subjected to geomechanical and geochemical stressors

• Integrate multi-scale (µm to km), multi-physics in a reservoir simulator: non-isothermal multiphase fluid flow and reactive transport, chemical and mechanical effects on formation properties, induced seismicity and reservoir performance

• Century-long simulation of a field of wellbores and their interaction in the reservoir

Development Plan
Y1: Evolve GEOS and Chombo-Crunch; coupling framework v1.0; large-scale (100 m) mechanics test (GEOS); fine-scale (1 cm) reactive transport test (Chombo-Crunch)
Y2: GEOS + Chombo-Crunch coupling for single phase; coupling framework with physics; multiphase flow for Darcy & pore scale; GEOS large strain deformation conveyed to Chombo-Crunch surfaces; Chombo-Crunch precipitation/dissolution conveyed to GEOS surfaces
Y3: Full demo of fracture asperity evolution-coupled flow, chemistry, and mechanics
Y4: Full demo of km-scale wellbore problem with reactive flow and geomechanical deformation, from the pore scale, to resolve the geomechanical and geochemical modifications to the thin interface between cement and subsurface materials in the wellbore and to asperities in fractures and fracture networks

Risks and Challenges
• Porting to exascale results in suboptimal usage across platforms
• No file abstraction API that can meet coupling requirements
• Batch scripting interface incapable of expressing simulation workflow semantics
• Scalable AMG solver in PETSc
• Physics coupling stability issues
• Fully overlapping coupling approach proves inefficient

45Exascale Computing Project

Software Technology Requirements
Provided by An Exascale Subsurface Simulator of Coupled Flow, Transport, Reactions and Mechanics

• Programming Models and Runtimes
1. Fortran, C++/C++11, MPI, OpenMP
2. UPC++, PGAS, TiledArrays
3. OpenSHMEM

• Tools
1. HPCToolkit, PAPI, ROSE, Subversion, parallel debugger
2. -
3. -

• Mathematical Libraries, Scientific Libraries, Frameworks
1. Chombo, PETSc, FFTW, BLAS, LAPACK
2. HPGMG, SLEPc
3. -

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

46Exascale Computing Project

Software Technology Requirements
Provided by An Exascale Subsurface Simulator of Coupled Flow, Transport, Reactions and Mechanics

• Data Management and Workflows
1. HDF5, BurstBuffer, HPSS
2. -
3. -

• Data Analytics and Visualization
1. VisIt, ChomboVis
2. VTK-m
3. -

• System Software
1. DataWarp
2. -
3. -

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

47Exascale Computing Project

Exascale Modeling of Advanced Particle Accelerators
Toward compact and affordable particle accelerators. A laser beam or a charged particle beam propagating through ionized gas displaces electrons, creating a wakefield that supports electric fields orders of magnitude larger than with usual methods and accelerating a charged particle beam to high energy over a very short distance.

• Particle accelerators: a vital part of DOE infrastructure for discovery science and university- and private-sector applications - broad range of benefits to industry, security, energy, the environment, and medicine

• Improved accelerator designs are needed to drive down size and cost; plasma-based particle accelerators stand apart in their potential for these improvements

• Translating this promising technology into a mainstream scientific tool depends critically on exascale-class high-fidelity modeling of the complex processes that develop over a wide range of space and time scales

• Exascale-enabled acceleration design will realize the goal of compact and affordable high-energy physics colliders, with many spinoff plasma accelerator applications likely

PI: Jean-Luc Vay (LBNL)

48Exascale Computing Project

Applications
• Warp, PICSAR
Software Technologies Cited
• Fortran, C, C++, Python
• MPI, OpenMP, GASNet, UPC++
• BoxLib
• HDF5, VisIt, ParaView, yt

Exascale Modeling of Advanced Particle Accelerators

Exascale Challenge Problem Applications & S/W Technologies

Development PlanRisks and Challenges

PI: Jean-Luc Vay (LBNL)

• Design affordable compact 1 TeV electron-positron collider based on plasma acceleration for high-energy physics.

• Develop ultra-compact plasma accelerators with transformative implications in discovery science, medicine, industry and security.

• Enable virtual start-to-end optimization of the design and virtual prototyping of every component before they are built, leading to huge savings in design and construction.

• Develop powerful new accelerator modeling tool (WarpX) designed to run efficiently at scale on exascale supercomputers.

Y1: Modeling of a single plasma-based accelerator stage with WarpX on a single grid; verification against previous results.
Y2: Modeling of a single plasma-based accelerator stage with WarpX with static mesh refinement, on a uniform plasma case.
Y3: Optimized FDTD and spectral PIC on 5 to 10 million cores, with near-linear weak scaling with AMR, on a uniform plasma case (a toy FDTD field update is sketched below).
Y4: Convergence study in 3D of ten consecutive multi-GeV stages in the linear and bubble regimes. Release of software to the community.
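
As a point of reference for the FDTD solver mentioned in Y3, a minimal 1D Yee-style vacuum field update looks like the following. This is a textbook toy in normalized units, not WarpX: it omits particles, the plasma response, mesh refinement, the spectral option, and any boundary treatment.

```python
import numpy as np

# Minimal 1D Yee/FDTD field update in vacuum (normalized units, c = 1).
nx, nsteps = 400, 300
dz = 1.0
dt = 0.9 * dz              # satisfies the 1D CFL condition dt <= dz/c
Ex = np.zeros(nx)          # E at integer grid points
Hy = np.zeros(nx)          # H staggered at half-integer points

for n in range(nsteps):
    Ex[50] += np.exp(-((n - 40) / 12.0) ** 2)     # soft Gaussian source
    Hy[:-1] -= dt / dz * (Ex[1:] - Ex[:-1])       # Faraday's law
    Ex[1:]  -= dt / dz * (Hy[1:] - Hy[:-1])       # Ampere's law
```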

• Dynamic load balancing
• Parallel I/O, data analysis & visualization
• Scaling electrostatic solver
• Scaling high-order electromagnetic solver
• Spurious reflections or charges owing to AMR
• Lost signal with AMR
• Numerical Cherenkov instability
• Low-temperature plasmas/beams

49Exascale Computing Project

Software Technology Requirements
Provided by Exascale Modeling of Advanced Particle Accelerators

• Programming Models and Runtimes
1. Fortran, C++, Python, C, MPI, OpenMP
2. UPC/UPC++, OpenACC, CUDA, GASNet
3. HPX, Global Arrays, TiledArrays, Co-Array Fortran

• Tools
1. LLVM/Clang, CMake, git
2. TAU, HPCToolkit

• Mathematical Libraries, Scientific Libraries, Frameworks
1. BoxLib, FFTW, NumPy
2. N/A
3. BLAS/PBLAS, LAPACK/ScaLAPACK, P3DFFT

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

50Exascale Computing Project

Software Technology Requirements
Provided by Exascale Modeling of Advanced Particle Accelerators

• Data Management and Workflows
1. MPI-IO, HDF, h5py
2. ADIOS

• Data Analytics and Visualization
1. VisIt, Jupyter notebook
2. N/A
3. VTK, ParaView

• System Software
1. N/A
2. N/A
3. N/A

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

51Exascale Computing Project

Exascale Solutions for Microbiome Analysis
Microbiomes are integral to the environment, agriculture, health, and biomanufacturing. Analyzing the DNA of these microbial communities is a computationally demanding bioinformatic task, requiring exascale computing and advanced algorithms.

• Microorganisms are central players in climate change, environmental remediation, food production, human health

• Occur naturally as “microbiomes” - communities of thousands of microbial species of varying abundance and diversity, each contributing to the function of the whole

• <1% of millions of worldwide microbe species have been isolated and cultivated in the lab, and a small fraction have been sequenced

• Collections of microbial data are growing exponentially, representing untapped info useful for environmental remediation and the manufacture of novel chemicals and medicines.

• “Metagenomics” — the application of high-throughput genome sequencing technologies to DNA extracted from microbiomes — is a powerful method for studying microbiomes

• The first step, assembly, has high computational complexity, like putting together thousands of puzzles from a jumble of their pieces (a toy k-mer assembly sketch follows this list)

• After assembly, further data analysis must find families of genes that work together and compare them across metagenomes

• ExaBiome application is developing exascale algorithms and software to address these challenges
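
As a toy illustration of the assembly step referred to above, the sketch below builds a k-mer (de Bruijn-style) graph from overlapping reads and greedily walks it. The reads, k value, and helper function are invented for illustration; real metagenome assemblers such as HipMer additionally handle sequencing errors, repeats, uneven coverage, and distributed memory.

```python
from collections import defaultdict

def toy_assemble(reads, k=5):
    """Build a (k-1)-mer overlap graph from the reads and greedily walk it."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])        # edge: k-mer prefix -> suffix

    suffixes = {s for outs in graph.values() for s in outs}
    # start from a node with no incoming edge (fall back to an arbitrary node)
    start = next((n for n in graph if n not in suffixes), next(iter(graph)))

    contig, node = start, start
    while graph[node]:                               # extend while an outgoing edge exists
        node = graph[node].pop()
        contig += node[-1]
    return contig

reads = ["ATGGCGTG", "GCGTGCAA", "GTGCAATT"]
print(toy_assemble(reads, k=5))                      # reconstructs ATGGCGTGCAATT
```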

PI: Katherine Yelick (LBNL)

52Exascale Computing Project

Applications
• HipMer, GOTTCHA, Mash
Software Technologies Cited
• UPC, UPC++, MPI, GASNet

Exascale Solutions for Microbiome Analysis

Exascale Challenge Problem Applications & S/W Technologies

Development PlanRisks and Challenges

PI: Katherine Yelick (LBNL)

• Enable biological discoveries and bioengineering solutions through high quality assembly and comparative analysis of 1 million metagenomes on an early exascale system.

• Support biomanufacturing of materials for energy, environment, and health applications (e.g., antibiotics) through identification of genes and gene clusters in microbial communities

• Provide scalable tools for three core computational problems in metagenomics: (i) metagenome assembly, (ii) protein clustering and (iii) signature-based approaches to enable scalable and efficient comparative metagenome analysis:

Y1: Release HipMer for metagenomes on short-read data; demonstrate HipMer at scale on Cori, optimized for Xeon Phi
Y2: Release Mash-based pipeline for whole-metagenome classification; release GOTTCHA/Mash-based visualization toolkit
Y3: Release HipMer for long-read metagenomes; release HipMer, HipMCL, and GOTTCHA for metagenome assembly and annotation on long/short reads for APEX/CORAL platforms
Y4: Complete assembly and analysis of data in SRA and IMG

• Exascale systems are not balanced for this data-intensive workload, limiting scalability or requiring new algorithmic approaches

• No GASNet, UPC, or UPC++ implementation exists on the exascale systems, requiring a different implementation strategy

• GOTTCHA algorithm cannot accurately distinguish metagenomes, requiring a new analysis approach

53Exascale Computing Project

Software Technology Requirements
Provided by Exascale Solutions for Microbiome Analysis

• Programming Models and Runtimes
1. UPC/UPC++, PGAS, GASNetEX
2. Thrust
3. MPI

• Tools
1. Git, CMake
2. GDB, Valgrind

• Mathematical Libraries, Scientific Libraries, Frameworks
1. Smith-Waterman

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

54Exascale Computing Project

Software Technology Requirements
Provided by Exascale Solutions for Microbiome Analysis

• Data Management and Workflows
1. IMG/KBase, SRA, Globus
2. MPI-IO

• Data Analytics and Visualization
1. CombBLAS
2. Elviz, GAGE, MetaQuast

• System Software

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

55Exascale Computing Project

Full-loop, Lab-Scale Gasifier Simulation with MFIX-DEM
The MFIX-Exa project will track billions of reacting particles in a full-loop reactor, making it feasible to simulate a pilot-scale chemical looping reactor with a time-to-solution small enough to enable simulation-based reactor design and optimization.

• Simulation contains over 1 million reacting particles coupled to a gas phase through interphase momentum, energy, and mass transfer (a toy momentum-coupling sketch follows the captions below)
• Particles composed of carbon, volatiles, moisture, and ash
• Gas composed of O2, CO, CO2, CH4, H2, H2O, N2
• Animation: (left) solid particles colored by temperature; (right) volume rendering of CO mass fraction. Grey surfaces indicate regions in the domain where the water-gas shift reaction is strong.

• Graphs. (Left) Gas species composition and (Right) temperature at the outlet.
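
A minimal sketch of the gas-particle momentum coupling mentioned in the first bullet, assuming a simple Stokes drag law, a prescribed gas velocity profile, and made-up material parameters. It is one-way and explicit, unlike the full MFIX-Exa drag models and two-way coupling.

```python
import numpy as np

# Toy one-way gas -> particle momentum coupling with Stokes drag (illustrative
# parameters; not the MFIX-Exa drag laws or two-way coupling).
rho_p, dp, mu_g = 2500.0, 200e-6, 1.8e-5        # particle density, diameter; gas viscosity
m_p = rho_p * np.pi * dp**3 / 6.0               # particle mass [kg]
tau_p = rho_p * dp**2 / (18.0 * mu_g)           # Stokes response time [s]

rng = np.random.default_rng(0)
x = rng.random(1000) * 0.1                      # particle positions along a 0.1 m column
v = np.zeros(1000)                              # particle velocities
gas_velocity = lambda z: 2.0 * np.sin(np.pi * z / 0.1)   # prescribed gas profile [m/s]

dt, g = 1.0e-4, -9.81
for step in range(1000):
    drag_accel = (gas_velocity(x) - v) / tau_p  # Stokes drag acceleration
    v += dt * (drag_accel + g)
    x += dt * v
```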

PI: Madhava Syamlal (NETL)

56Exascale Computing Project

Applications
• MFIX (MFIX-TFM, MFIX-PIC, MFIX-DEM)
Software Technologies Cited
• C++, FORTRAN, MPI
• BoxLib, AMG solvers (e.g., Hypre, PETSc)
• HDF5
• MPI, OpenMP, GASNet
• UPC++

Performance Prediction of Multiphase Energy Conversion Devices with Discrete Element, PIC, and Two-Fluid Models (MFIX-Exa)

Exascale Challenge Problem Applications & S/W Technologies

Development PlanRisks and Challenges

PI: Madhava Syamlal (NETL)

• Curbing man-made CO2 emissions at fossil-fuel power plants relies on carbon capture and storage (CCS). Understanding how to scale laboratory designs of multiphase reactors to industrial size is required to drive large-scale commercial deployment of CCS, and a high-fidelity modeling capability is critically needed to meet DOE's CCS goals because build-and-test is prohibitive in both cost and time

• Deliver computational fluid dynamics-discrete element modeling (CFD-DEM) capability for lab-scale reactors, demonstrating it by simulating NETL’s 50 kW chemical looping reactor (CLR), including all relevant individual chemical and physical phenomena present in the reactor

• Decadal problem is CFD-DEM capability for small pilot-scale (0.5-5 MWe) reactors, demonstrating it by simulating NETL-CLR scaled up to 1MWe with sufficient fidelity and time-to-solution to impact design decisions.

Y1: Restructure DEM hydro models and particle data into BoxLib; replace BiCGStab with the BoxLib GMG solver; improve single-level algorithm performance; optimize particle data layout; incorporate HDF5 I/O
Y2: Migrate scalar and thermodynamic models into BoxLib; incorporate coarse cut-cell support; implement non-subcycling multilevel time-stepping; improve multilevel particle-particle and particle-mesh performance; enable existing analytics for in-situ use
Y3: Migrate species transport into BoxLib; adapt non-subcycling multilevel algorithm load balance to species transport and chemical reactions; port expensive kernels to GPU (OpenMP); extend analytics to use BoxLib sidecars
Y4: Optimize load balance for the subcycling multilevel algorithm with full physics; deliver spectral deferred corrections (SDC); optimize on-node performance; optimize workflow to minimize total runtime

• Unexpected effects from changes in temporal and spatial resolution within multilevel algorithms with subcycling in time

• Prototype node hardware not available for performance evaluation of intra-node programming model

• Available external AMG solvers fail to scale
• Loss of personnel
• Supplied Cartesian cut-cell geometry files from MFIX-GUI fail to work correctly with BoxLib integration

57Exascale Computing Project

Software Technology Requirements
Multiphase (NETL)

• Programming Models and Runtimes
1. Fortran, C++/C++17, C, MPI, OpenMP, Python
2. OpenACC, CUDA, OpenCL, UPC/UPC++

• Tools
1. Git, GitLab, TAU, Jenkins (testing), DDT (debugger)
2. HPCToolkit, PAPI

• Mathematical Libraries, Scientific Libraries, Frameworks
1. BoxLib, Hypre, PETSc
2. Trilinos

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

58Exascale Computing Project

Software Technology Requirements
Multiphase (NETL)

• Data Management and Workflows
1. HDF

• Data Analytics and Visualization
1. VisIt, ParaView

• System Software

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

59Exascale Computing Project

Exascale Predictive Wind Plant Flow Physics Modeling
Understanding Complex Flow Physics of Whole Wind Plants

• Must advance fundamental understanding of flow physics governing whole wind plant performance: wake formation, complex terrain impacts, turbine-turbine interaction effects

• Greater use of U.S. wind resources for electric power generation (~30% of total) will have profound societal and economic impact: strengthening energy security and reducing greenhouse-gas emissions

• Wide-scale deployment of wind energy on the grid without subsidies is hampered by significant plant-level energy losses by turbine-turbine interactions in complex terrains

• Current methods for modeling wind plant performance are not reliable design tools due to insufficient model fidelity and inadequate treatment of key phenomena

• Exascale-enabled predictive simulations of wind plants composed of O(100) multi-MW wind turbines sited within a 10 km x 10 km area with complex terrains will provide validated "ground truth" foundation for new turbine design models, wind plant siting, operational controls and reliably integrating wind energy into the grid

PI: Steve Hammond (NREL)

60Exascale Computing Project

Applications
• Nalu, FAST
Software Technologies Cited
• C++, MPI, OpenMP (via Kokkos), CUDA (via Kokkos)
• Trilinos (Tpetra), MueLu, Sierra Toolkit (STK), Kokkos
• Spack, Docker
• DHARMA (Distributed asynchronous Adaptive Resilient Management of Applications)

Exascale Predictive Wind Plant Flow Physics Modeling

Exascale Challenge Problem Applications & S/W Technologies

Development PlanRisks and Challenges

PI: Steve Hammond (NREL)

• A key challenge to wide-scale deployment of wind energy, without subsidy, in the utility grid is predicting and minimizing plant-level energy losses. Current methods lack model fidelity and inadequately treat key phenomena.

• Deliver predictive simulation of a wind plant composed of O(100) multi-MW wind turbines sited within a 10 km x 10 km area with complex terrain (O(10^11) grid points; a rough resolution estimate follows this list).

• Predictive physics-based high-fidelity models validated with target experiments, provide fundamental understanding of wind plant flow physics, and will drive blade, turbine, and wind plant design innovation.

• This work will play a vital role in addressing urgent national need to dramatically increase the percentage of electricity produced from wind power, without subsidy.
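
For orientation, the quoted grid-point count is consistent with a back-of-the-envelope estimate; the roughly 1 m grid spacing and roughly 1 km vertical extent used below are my assumptions, not figures stated by the project:

\[
N \approx \frac{10^{4}\,\mathrm{m} \times 10^{4}\,\mathrm{m} \times 10^{3}\,\mathrm{m}}{(1\,\mathrm{m})^{3}} = 10^{11}\ \text{grid points.}
\]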

Y1: Baseline run for canonical ABL simulation with MPI; single-blade-resolved simulation in non-rotating turbulent flow; incorporate Kokkos and demonstrate a faster ABL run; demonstrate single-blade-resolved simulation with rotating blades
Y2: Baseline single-blade-resolved capability (SBR) run; demonstrate mixed-order run with overset or sliding-mesh algorithm; demonstrate a faster SBR run; demonstrate single-turbine blade-resolved simulation
Y3: Demonstrate simulation of several turbines operating in flat terrain
Y4: Demonstrate simulation of O(10) turbines operating in complex terrain

• Transition to next-generation platforms
• Robustness of high-order schemes on turbulence model equations
• Sliding-mesh algorithm scalability
• Kokkos support

61Exascale Computing Project

Coupled Monte Carlo Neutronics and Fluid Flow Simulation of Small Modular Reactors

PI: Thomas Evans (ORNL)

• DOE has motivated and supported (just in this decade) the creation and enhancement of a new suite of high resolution physics applications for nuclear reactor analysis

• Petascale-mature applications include new Monte Carlo (MC) neutronics and computational fluid dynamics (CFD) capabilities suitable for much-improved analysis of light water reactors (LWR) on the grid today

• Petascale has enabled pin-resolved reactor physics solutions for reactor startup conditions

• Capable exascale: needed to model operational behavior of LWRs at hot full power with full-core multiphase CFD and fuel depletion (over the complete operational reactor lifetime)

• Capable exascale: allow coupling of high-fidelity MC neutronics + CFD into an integrated toolkit for also modeling the operational behavior of Small Modular Reactors (<300 MWe)

• The ultimate exascale challenge problem for nuclear reactors: generate experimental-quality simulations of steady-state and transient reactor behavior

62Exascale Computing Project

Applications
• Nek5000, SHIFT, OpenMC
Software Technologies Cited
• MPI, OpenMP, Kokkos, Trilinos, PETSc, CUDA, OpenACC, DTK

Coupled Monte Carlo Neutronics and Fluid Flow Simulation of Small Modular Reactors

Exascale Challenge Problem Applications & S/W Technologies

Development PlanRisks and Challenges

PI: Thomas Evans (ORNL)

• Unprecedentedly detailed simulations of coupled coolant flow and neutron transport in Small Modular Reactor (SMR) cores to streamline design, licensing, and optimal operation.

• Coupled Monte Carlo (MC)/CFD simulations of the steady state operation of a 3D full small modular reactor core

• Coupled MC/CFD simulations of the full core in steady-state, low-flow critical heat flux (CHF) conditions

• Transient coupled MC neutronics/CFD simulations of the low-flow natural circulation startup of a small modular reactor (a toy MC transport kernel is sketched below)
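
For intuition about the Monte Carlo transport kernel at the heart of these simulations, here is a deliberately simple 1D slab random-walk sketch with made-up cross sections. It is not Shift or OpenMC; real codes track 3D geometry, energy dependence, tallies, and variance reduction.

```python
import random

# Toy 1D slab Monte Carlo transport: exponential free paths, then absorb or
# scatter with fixed probabilities (made-up constants; not Shift/OpenMC).
def transmitted_fraction(n_particles=100_000, slab_cm=10.0,
                         sigma_t=0.5, p_absorb=0.3, seed=1):
    random.seed(seed)
    transmitted = 0
    for _ in range(n_particles):
        x, direction = 0.0, 1.0                            # born at the left face, moving right
        while True:
            x += direction * random.expovariate(sigma_t)   # sample a free path
            if x >= slab_cm:
                transmitted += 1                           # escaped through the right face
                break
            if x < 0.0:
                break                                      # leaked back out the left face
            if random.random() < p_absorb:
                break                                      # absorbed
            direction = random.choice((-1.0, 1.0))         # crude isotropic scatter in 1D
    return transmitted / n_particles

print(transmitted_fraction())
```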

Y1: Three-dimensional full-core neutronics and single-assembly CFD, including quantitative profiling and performance analysis to establish a baseline for on-node concurrency
Y2: Three-dimensional, fully coupled neutronics and CFD, including quantitative performance analysis
Y3: Improvements in full-core neutronics and CFD performance on Phi and GPU architectures
Y4: Demonstration of full-core, fully coupled neutronics and CFD with quantified machine utilization on Aurora and Summit

• Intra-node efficiency of MC random walks is insufficient to meet challenge problem requirements

• Transient acceleration techniques like ensemble averaging prove impractical for CFD at full scale

• Advanced Doppler broadening models fail to provide the desired accuracy, memory requirements, or completeness

• Algebraic multi-grid factorization too expensive for the largest cases under consideration or novel AMG techniques prove ineffective

63Exascale Computing Project

Software Technology Requirements
Nuclear Reactors

• Programming Models and Runtimes
1. C++/C++17, C, Fortran, MPI, OpenMP, Thrust, CUDA, Python
2. Kokkos, OpenACC, NVL-C
3. Raja, Legion/Regent, HPX

• Tools
1. LLVM/Clang, PAPI, CMake, git, CDash, GitLab, Oxbow
2. Docker, Aspen
3. TAU

• Mathematical Libraries, Scientific Libraries, Frameworks
1. BLAS/PBLAS, Trilinos, LAPACK
2. Metis/ParMETIS, SuperLU, PETSc
3. Hypre

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

64Exascale Computing Project

Software Technology Requirements
Nuclear Reactors

• Data Management and Workflows
1. MPI-IO, HDF, Silo, DTK
2. ADIOS

• Data Analytics and Visualization
1. VisIt
2. ParaView

• System Software

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

65Exascale Computing Project

Applications
• QMCPACK
Software Technologies Cited
• MPI, CUDA, OpenMP, OpenACC, RAJA, Kokkos, C++17, runtimes, BLAS, LAPACK, sparse linear algebra
Motifs
• Particles, dense and sparse linear algebra, Monte Carlo

QMCPACK: A Framework for Predictive and Systematically Improvable Quantum-Mechanics Based Simulations of Materials

Exascale Challenge Problem Applications & S/W Technologies

Development PlanRisks and Challenges

PI: Paul Kent (ORNL)

• Develop a performance-portable, exascale version of the open-source petascale code QMCPACK that can provide highly accurate calculations of the properties of complex materials.

• Find, predict and control materials and properties at the quantum level with an unprecedented and systematically improvable accuracy.

• The ten-year challenge problem is to simulate transition metal oxide systems of approximately 1000 atoms to 10 meV statistical accuracy, such as complex oxide heterostructures that host novel quantum phases, using the full concurrency of exascale systems (a note on the required Monte Carlo sample count is sketched below).
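
A brief sketch of what a 10 meV statistical target implies: because the Monte Carlo error bar shrinks as 1/sqrt(N), the sample count grows quadratically as the target tightens. The per-sample spread below is an invented illustration, not a QMCPACK measurement, and real QMC runs also need blocking analysis to handle correlated samples.

```python
import numpy as np

# Illustration of 1/sqrt(N) Monte Carlo error scaling with synthetic samples.
rng = np.random.default_rng(0)
sigma = 2.0                 # assumed per-sample spread in eV (illustrative only)
target = 0.010              # 10 meV target statistical error bar

n_needed = int(np.ceil((sigma / target) ** 2))        # N such that sigma/sqrt(N) <= target
samples = rng.normal(loc=-15.0, scale=sigma, size=n_needed)
error_bar = samples.std(ddof=1) / np.sqrt(samples.size)
print(f"N = {n_needed}, estimated error bar = {error_bar * 1e3:.1f} meV")
```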

Y1: Development of miniapps to assess programming models and to expose and exploit additional concurrency. Full-scale calculations on current-generation supercomputers for code development and scaling tests.
Y2: Testing of programming models and scaling techniques. Focus on on-node concurrency and portable performance on next-generation nodes. Selection of preferred programming model.
Y3: Implementation of the newer programming model to expose greater concurrency. Multiple nodes on next-generation supercomputers for code development and a demo science application.
Y4: Full-scale calculations and their refinement on next-generation supercomputers for code development and a demonstration science application.

• Portable performance between multiple-architectures not achievable within project time frame.

• 10-year scientific challenge problems are no longer relevant or interesting due to scientific advances.

• Significant changes in computational architectures that require new solutions or problem decompositions.

• Application not running at capability scale with good figure of merit.

66Exascale Computing Project

Software Technology Requirements
QMCPACK

• Programming Models and Runtimes
1. C++, MPI, OpenMP (existing), CUDA (existing)
2. Kokkos, RAJA, C++17, OpenMP 4.x+, OpenACC, any sufficiently capable supported runtimes (distributed capability not essential); intend to identify a preferred solution and promote it to #1
3. DSL (IRP used for the associated quantum chemistry application), code generators & autotuners

• Tools
1. git (version control and git-based development workflow tools), CMake/CTest/CDash
2. LLVM; performance analysis and prediction tools, particularly for the memory hierarchy; static analysis (correctness, metrics/compliance)

• Mathematical Libraries, Scientific Libraries, Frameworks
1. BLAS, FFTW, Boost, MKL (sparse BLAS, existing implementation), Python, NumPy
2. Any portable/supported sparse BLAS (including distributed implementations), runtime data compression (e.g., ZFP)

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

67Exascale Computing Project

Software Technology Requirements
QMCPACK

• Data Management and Workflows
1. NEXUS (existing lightweight Python workflow tool included with QMCPACK), HDF5, XML
2. Supported workflow & data management tools with a fair balance of simplicity/complexity, capability, and portability

• Data Analytics and Visualization
1. Matplotlib, VisIt, VESTA, h5py

• System Software
1. Support for fault tolerance/resilience (if provided at the system software level, and in the remainder of the software stack). Because our application performs Monte Carlo and does limited I/O, the most common faults should be easily handled.

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

68Exascale Computing Project

Transforming Additive Manufacturing through Exascale Simulation (ExaAM)

• The Exascale AM (ExaAM) project is building a new multi-physics modeling and simulation toolkit for Additive Manufacturing (AM) to provide an up-front assessment of the manufacturability and performance of additively manufactured parts

• An Integrated Platform for AM Simulation (IPAMS) will be experimentally validated, enabled by in-memory coupling between continuum and mesoscale models to quantify microstructure development and evolution during the AM process

• Microscopic structure determines local material properties such as residual stress and leads to part distortion and failure
• A validated AM simulator enables determination of optimal process parameters for desired material properties, ultimately leading to reduced-order models that can be used for real-time in situ process optimization
• Coupled to a modern design optimization tool, IPAMS will enable the routine use of AM to build novel and qualifiable parts

PI: John Turner (ORNL)

69Exascale Computing Project

Applications & S/W Technologies
Applications
• IPAMS: ALE3D, Truchas, Diablo, AMPE, MEUMAPPS, Tusas
Software Technologies Cited
• C++, Fortran
• MPI, OpenMP, OpenACC, CUDA
• Kokkos, Raja, Charm++
• Hypre, Trilinos, P3DFFT, SAMRAI, Sundials, Boost
• DTK, netCDF, HDF5, ADIOS, Metis, Silo
• GitHub, GitLab, CMake, CDash, Jira, Eclipse ICE

Transforming Additive Manufacturing through Exascale Simulation (ExaAM)

PI: John Turner (ORNL), co-PI: Jim Belak (LLNL)

Exascale Challenge Problem
• Develop, deliver, and deploy the Integrated Platform for Additive Manufacturing Simulation (IPAMS), which tightly couples high-fidelity sub-grid simulations within a continuum process simulation to determine microstructure, properties, and hence performance at each time step using local conditions (a schematic coupling loop is sketched below)

• Dramatically accelerate the widespread adoption of additive manufacturing (AM) by enabling fabrication of qualifiable metal parts with minimal trial-and-error iteration and realization of location-specific properties

• Large suite of physics models: melt pool-scale process modeling, part-scale process modeling (microstructure, residual stress & properties), microstructure modeling, material property modeling, and performance modeling

• Simulate AM of a part where mass reduction results in significant energy savings in the application and structural loading requires a graded structure
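
A schematic of the coupling pattern described above: each continuum step updates a thermal field, then a per-cell sub-grid microstructure model advances using its local conditions. The models below are invented one-liners for illustration, not the IPAMS, ALE3D/Truchas, or AMPE/MEUMAPPS implementations.

```python
import numpy as np

# Toy continuum <-> sub-grid coupling driver (hypothetical stand-in models).
nx, dt, T_melt = 50, 0.01, 1700.0
x = np.linspace(0.0, 1.0, nx)
T = np.full(nx, 1900.0)                       # temperature along a scan track (K)
solid_frac = np.zeros(nx)                     # local solid fraction from the sub-grid model

def continuum_step(T, dt):
    cooling_rate = 200.0 + 300.0 * x          # faster cooling far from the heat source
    return T - dt * cooling_rate

def microstructure_step(T, solid_frac, dt):
    undercooling = np.maximum(T_melt - T, 0.0)
    return np.clip(solid_frac + dt * 0.01 * undercooling, 0.0, 1.0)

for step in range(200):
    T = continuum_step(T, dt)                               # continuum process model
    solid_frac = microstructure_step(T, solid_frac, dt)     # sub-grid model sees local T
```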

Development Plan
Y1: Simulate microstructure/residual stress in a macro-scale solid part against experimental results; IPAMS v1.0 release; initial IPAMS code coupling; unsupported bridge and supported comb-type samples (Demo 1)
Y2: Initial coupled demo simulation with file-based communication; in-memory IPAMS code coupling; IPAMS v2.0 release; individual struts within a multiple-unit-cell lattice (Demo 2)
Y3: Common workflow system for problem setup and analysis; IPAMS v3.0 release; full-scale mechanical test sample (Demo 3)
Y4: Fully integrated, in-memory process simulation capability available; IPAMS v4.0 release; full-scale conformal lattice with local microstructure (Demo 4)

Risks and Challenges
• New characterization techniques may be required for validation
• Inability to efficiently spawn subgrid simulations
• Microstructure model results do not match experiment
• Linear solver Hypre is unable to efficiently take advantage of hybrid architectures (multiple IPAMS components depend on Hypre)
• Incompatibilities or inefficiencies in integrating components using disparate approaches (e.g., Raja and Kokkos)
• Uncertainty in input parameters for both micro- and macro-scale simulations

70Exascale Computing Project

Physical Models and Code(s)
• Physical Models: fluid flow, heat transfer, phase change (melting/solidification and solid-solid), nucleation, microstructure formation and evolution, residual stress
• Codes:
  • Continuum: ALE3D, Diablo, Truchas
  • Mesoscale: AMPE, MEUMAPPS, Tusas
• Motifs: Sparse Linear Algebra, Dense Linear Algebra, Spectral Methods, Unstructured Grids, Dynamic Programming, Particles

Transforming Additive Manufacturing through Exascale Simulation (ExaAM)

PI: John Turner (ORNL), co-PI: Jim Belak (LLNL)

Application Domain
• Application Area: Dramatically accelerate the widespread adoption of additive manufacturing (AM) by enabling fabrication of qualifiable metal parts with minimal trial-and-error iteration and realization of location-specific properties
• Coupling of high-fidelity sub-grid simulations within a continuum process simulation to determine microstructure and properties at each time step using local conditions
• Challenge Problem: Simulate AM of a part where mass reduction results in significant energy savings in the application and structural loading requires a functionally graded structure

Partnerships
• Co-Design Centers: CEED (discretization), CoPA (particles), CODAR (data), AMReX (AMR)
• Software Technology Centers: ALExa (DTK), PEEKS (Trilinos), ATDM (Hypre), Kokkos, Exascale MPI, OMPI-X, ForTrilinos, Sparse Solvers, xSDK4ECP, SUNDIALS, PETSc/TAO, ADIOS, potentially others
• Application Projects: ATDM projects (primarily LANL and LLNL), Exascope

First Year Development Plans
• Demonstrate simulation of unsupported bridge and supported comb-type samples
• Release initial set of proxy apps
• Integrated Platform for Additive Manufacturing Simulation (IPAMS) v1.0 release with initial code coupling

71Exascale Computing Project

Models and Code(s)
• Physical Models: fluid flow, heat transfer, phase change (melting/solidification and solid-solid), nucleation, microstructure formation and evolution, residual stress
• Codes:
  • Continuum: ALE3D, Diablo, Truchas
  • Mesoscale: AMPE, MEUMAPPS, Tusas
• Motifs: Sparse Linear Algebra, Dense Linear Algebra, Spectral Methods, Unstructured Grids, Dynamic Programming, Particles

Transforming Additive Manufacturing through Exascale Simulation (ExaAM)

PI: John Turner (ORNL), co-PI: Jim Belak (LLNL)

Goals and Approach
• Application Area: Dramatically accelerate the widespread adoption of additive manufacturing (AM) by enabling fabrication of qualifiable metal parts with minimal trial-and-error iteration and realization of location-specific properties
• Coupling of high-fidelity sub-grid simulations within a continuum process simulation to determine microstructure and properties at each time step using local conditions
• Challenge Problem: Simulate AM of a part where mass reduction results in significant energy savings in the application and structural loading requires a functionally graded structure

Software and Numerical Library Dependencies
• C++, Fortran
• MPI, OpenMP, OpenACC, CUDA
• Kokkos, Raja, Charm++
• Hypre, Trilinos, P3DFFT, SAMRAI, Sundials, Boost
• DTK, netCDF, HDF5, ADIOS, Metis, Silo
• GitHub, GitLab, CMake, CDash, Jira, Eclipse ICE

Critical Needs Currently Outside the Scope of ExaAM
• Modeling of powder properties and spreading
• Shape and topology optimization
• Post-build processing, e.g., hot isostatic pressing (HIP)
• Data analytics and machine learning of process/build data
• Reduced-order models


73Exascale Computing Project

Software Technology Requirements
Advanced Manufacturing

• Programming Models and Runtimes
1. Fortran, C++/C++17, Python, MPI, OpenMP, OpenACC, CUDA, Kokkos, Raja, Boost
2. Legion/Regent, Charm++
3. Other asynchronous, task-parallel programming/execution models and runtime systems

• Tools
1. git, CMake, CDash, GitLab
2. Docker, Jira, Travis, PAPI, Oxbow

• Mathematical Libraries, Scientific Libraries, Frameworks
1. BLAS/PBLAS, Trilinos, PETSc, LAPACK/ScaLAPACK, Hypre, DTK, Chaco, ParMetis, WSMP (direct solver from IBM)
2. HPGMG, MueLu (part of Trilinos, replacement for ML), MAGMA, Dakota, SuperLU, AMP

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

74Exascale Computing Project

Software Technology Requirements
Advanced Manufacturing

• Data Management and Workflows
1. HDF, netCDF, Exodus
2. ADIOS

• Data Analytics and Visualization
1. VisIt, ParaView, VTK

• System Software

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

75Exascale Computing Project

Optimizing Stochastic Grid Dynamics at Exascale
Intermittent renewable sources, electric vehicles, and smart loads will vastly change the behavior of the electric power grid, imposing new stochastics and dynamics that the grid was not designed for and cannot easily accommodate.

• Optimizing such a stochastic and dynamic grid with sufficient reliability and efficiency is a monumental challenge

• Not solving this problem appropriately or accurately could result in significantly higher energy costs, decreased reliability (including more blackouts), or both

• Power grid data are clearly showing the trend towards dynamics that cannot be ignored and would invalidate the quasi-steady-state assumption used today for both emergency and normal operation

• The increased uncertainty and dynamics severely strain the analytical workflow that is currently used to obtain the cheapest energy mix at a given level of reliability

• The current practice is to keep the uncertainty, dynamics and optimization analysis separate, and then to make up for the error by allowing for larger operating margins

• The cost of these margins is estimated by various sources to be $5-15B per year for the entire United States

• The ECP grid dynamics application can deliver the best achievable bounds on these errors, potentially resulting in billions of dollars a year in savings.

PI: Zhenyu (Henry) Huang (PNNL)

76Exascale Computing Project

Applications
• GridPACK, PIPS
Software Technologies Cited
• MPI, OpenMP, MA57, LAPACK, MAGMA, PETSc, Global Arrays, ParMETIS, Chaco, BLAS, GridOPTICS software system (GOSS), Julia, Elemental, PARDISO

• Co-Design: ExaGraph

Optimizing Stochastic Grid Dynamics at Exascale

Exascale Challenge Problem Applications & S/W Technologies

Development PlanRisks and Challenges

PI: Zhenyu (Henry) Huang (PNNL)

• Exascale optimization problem in electric system expansion planning with stochastic and transient constraints. Create exascale software tools that achieve optimal bounds on errors, potentially saving billions of dollars annually by producing the best planning decisions and enabling large-scale renewable energy (especially wind and solar) without compromising reliability under economic constraints (a generic two-stage formulation is sketched below this list).

• Planning encompasses the problem of determining the structure, positioning, and timing of billions in assets for any market operator in such a way that electricity costs and reliability satisfy given performance requirements.

• Major step forward from today’s practice that only considers steady-state constraints in a deterministic context and inconsistently samples time intervals for reliability metric estimation.
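
For context, planning under uncertainty of this kind is often written as a two-stage stochastic program; the generic form below is a hedged sketch, not the specific PIPS or SCOPF formulation used by the project. First-stage decisions x (investment, commitment) are fixed before uncertainty is revealed, while scenario-dependent recourse decisions y_s (dispatch) must satisfy network, security, and discretized transient constraints:

\[
\min_{x,\,y_1,\dots,y_S}\; c^{\top}x + \sum_{s=1}^{S}\pi_s\, q_s^{\top}y_s
\quad\text{subject to}\quad
A x \le b,\qquad
T_s x + W_s y_s \le h_s,\quad s=1,\dots,S,
\]

where \(\pi_s\) is the probability of scenario \(s\).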

Y1: PIPS for SCOPF with nonlinear constraints (NL) achieves 10K-way parallelism; transient simulation/PETSc scales to 100-way parallelism.
Y2: SCOPF/NL/PIPS with weather uncertainty (U) scales to 10K-100K-way parallelism; parallel adjoint computations of transients scale to 100K-way parallelism.
Y3: SCOPF/PIPS/NL/U with transient constraints scales to 100K-1M-way parallelism.
Y4: 20-year production cost (planning) model with integer variables, uncertainty, and transient constraints scales to 1M-10M-way parallelism.

• Availability of target computers for software development
• Alignment of the planned software attributes with target computers
• Availability of large-scale test datasets for the target domain problem
• Achieving target computational performance via integration of optimization and dynamics, spanning a time horizon of sub-seconds to decades
• Achieving target end-to-end performance considering data ingestion on an exascale platform using a data-driven approach

77Exascale Computing Project

High-Fidelity Whole Device Modeling of Magnetically Confined Fusion Plasmas

• Progress must be made in simulating the multiple spatiotemporal scale processes in magnetic plasma confinement for fusion scientists to understand physics, predict performance of future fusion reactors such as ITER, and accelerate development of commercial fusion reactors

• Plasma confinement in a magnetic fusion reactor is governed through self-organized, multiscale interaction between plasma turbulence and instabilities, evolution of macroscopic quantities, kinetic dynamics of macroscopic plasma particles, atomic physics interaction with neutral particles generated from wall-interaction, plasma heating sources and sinks

• Capable exascale is required to perform high-fidelity whole-device simulations of plasma confinement: two advanced, scalable fusion gyrokinetic codes are being coupled self-consistently to build the basis for a whole device model of a tokamak fusion reactor: the grid-based GENE code for the core region and the particle-in-cell XGC1 code for the edge region

PI: Amitava Bhattacharjee (PPPL)

78Exascale Computing Project

Applications
• GENE, XGC
Software Technologies Cited
• MPI, OpenMP, OpenACC, CUDA-Fortran
• Adios, Python
• PETSc, BLAS/LAPACK, FFTW, SuperLU, SLEPc, Trilinos-Fortran
• VisIt, PAPI, Tau

High-Fidelity Whole Device Modeling of Magnetically Confined Fusion Plasmas

Exascale Challenge Problem Applications & S/W Technologies

Development PlanRisks and Challenges

PI: Amitava Bhattacharjee (PPPL)

• Develop high-fidelity Whole Device Model (WDM) of magnetically confined fusion plasmas to understand and predict the performance of ITER and future next-step facilities, validated on present tokamak (and stellarator) experiments

• Couple existing, well-established extreme-scale gyrokinetic codes: the GENE continuum code for the core plasma and the XGC particle-in-cell (PIC) code for the edge plasma, into which a few other important (scale-separable) physics modules will be integrated at a later time to complete the whole-device capability (a schematic core-edge exchange loop is sketched below)
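
A cartoon of the core-edge exchange on an overlapping radial region: each "code" relaxes a 1D profile on its own subdomain and the two trade boundary values until the composite profile stops changing. This is a hypothetical illustration of the coupling pattern, not the actual GENE-XGC coupling scheme.

```python
import numpy as np

# Toy overlapping-domain coupling: "core" and "edge" each relax a 1D profile
# and exchange boundary values in the overlap (not the GENE/XGC scheme).
r = np.linspace(0.0, 1.0, 101)
T = 1.0 - 0.8 * r                                  # initial profile (arbitrary units)
core_idx = np.where(r <= 0.6)[0]                   # core subdomain
edge_idx = np.where(r >= 0.5)[0]                   # edge subdomain (overlaps the core)

def relax(T, idx, inner_bc, outer_bc, n_sweeps=100):
    """Jacobi-relax the profile on one subdomain with pinned end values."""
    T = T.copy()
    T[idx[0]], T[idx[-1]] = inner_bc, outer_bc
    for _ in range(n_sweeps):
        T[idx[1:-1]] = 0.5 * (T[idx[1:-1] - 1] + T[idx[1:-1] + 1])
    return T

for it in range(50):                               # outer coupling iterations
    T_old = T.copy()
    # core solve: pinned on axis, outer boundary taken from the edge solution
    T = relax(T, core_idx, inner_bc=1.0, outer_bc=T[core_idx[-1]])
    # edge solve: inner boundary taken from the core solution, pinned at the wall
    T = relax(T, edge_idx, inner_bc=T[edge_idx[0]], outer_bc=0.05)
    if np.max(np.abs(T - T_old)) < 1e-9:
        break
```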

Y1: Demonstrate initial implicit coupling capability between the core (GENE) and edge (XGC) codes for ITG turbulence physics
Y2: Demonstrate telescoping of the gyrokinetic turbulent transport using a multiscale time-integration framework on leadership-class computers
Y3: Demonstrate and assess the experimental (transport) time-scale telescoping of whole-device gyrokinetic physics
Y4: Complete the phase I integration framework and demonstrate the capability of the WDM for multiscale gyrokinetic physics in realistic present-day tokamaks at full scale on Summit, Aurora, and Cori

• Efficient scaling of GENE and XGC to exascale
• Telescoping to experimental time scales
• Load balancing among different kernels in coupled codes
• FLOPS-intensive (communication-avoiding) algorithms
• Numerical approaches to coupling
• Adaptive algorithms

Turbulence fills the whole plasma volume, controlling tokamak plasma confinement.

79Exascale Computing Project

Software Technology Requirements
Provided by Fusion Whole Device Modeling

• Programming Models and Runtimes
1. Fortran, Python, C, MPI, OpenMP, OpenACC, CUDA-Fortran
2. Co-Array Fortran, PGAS

• Tools
1. Allinea DDT, PAPI, Globus Online, git, TAU
3. GitLab

• Mathematical Libraries, Scientific Libraries, Frameworks
1. PETSc, SCOREC, LAPACK/ScaLAPACK, Hypre, IBM ESSL, Intel MKL, CUBLAS, CUFFT, SLEPc, BLAS/PBLAS, FFTW, Dakota
2. Trilinos (with Fortran interface), SuperLU, Sundials
3. DPLASMA, MAGMA, FMM

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

80Exascale Computing Project

Software Technology Requirements
Provided by Fusion Whole Device Modeling

• Data Management and Workflows
1. Adios
2. ZFP, SZ
3. SCR

• Data Analytics and Visualization
1. VisIt, VTK
2. ParaView

• System Software

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

81Exascale Computing Project

Data Analytics at the Exascale for Free Electron Lasers
The Linac Coherent Light Source (LCLS) is revealing biological structures in unprecedented atomic detail, helping to model proteins that play a key role in many biological functions. The results could help in designing new life-saving drugs.

• Biological function is profoundly influenced by dynamic changes in protein conformations and interactions with molecules – processes that span a broad range of timescales

• Biological dynamics are central to enzyme function, cell membrane proteins and the macromolecular machines responsible for transcription, translation and splicing

• Modern X-ray crystallography has transformed the field of structural biology by routinely resolving macromolecules at the atomic scale

• LCLS has demonstrated the ability to resolve structures of macromolecules previously inaccessible - using the new approaches of serial nanocrystallography and diffract-before-destroy with high-peak-power X-ray pulses

• Higher repetition rates of LCLS-II can enable major advances by revealing biological function with its unique capability to follow dynamics of macromolecules and interacting complexes in real time and in native environments

• Advanced solution scattering and coherent imaging techniques can characterize sub-nanometer scale conformational dynamics of heterogeneous ensembles of macromolecules – both spontaneous fluctuations of isolated complexes, and conformational changes that may be initiated by the presence of specific molecules, environmental changes, or by other stimuli

PI: Amedeo Perazzo (SLAC)

82Exascale Computing Project

Applications
• Psana Framework, cctbx, lunus, M-TIP, IOTA
Software Technologies Cited
• Tasking runtime (Legion)
• C++, Python
• MPI, OpenMP, CUDA
• FFT, BLAS/LAPACK
• HDF5, Shifter, XTC

Data Analytics at the Exascale for Free Electron Lasers

Exascale Challenge Problem Applications & S/W Technologies

Development PlanRisks and Challenges

PI: Amedeo Perazzo (SLAC)

• LCLS detector data rates up 10^3-fold by 2025; XFEL data analysis times down from weeks to minutes, with real-time interpretation of molecular structure revealed by X-ray diffraction; LCLS-II beam repetition rate goes from 120 Hz to 1 MHz by 2020.

• The LCLS X-ray beam, at atomic-scale wavelengths and 10^9 times brighter than other sources, probes complex, ultra-small structures with ultrafast pulses to freeze atomic motions

• Science drivers to orchestrate compute, network, and storage: Serial Femtosecond Crystallography (SFX) and Single Particle Imaging (SPI)

• SFX: study of biological macromolecules (e.g., protein structure/dynamics) and crystalline nanomaterials; need rapid image-analysis feedback on diffraction data to make experimental decisions (a toy hit-finding step is sketched below this list)

• SPI: discern 3D molecular structure of individual nano particles & molecules; rapid diffraction pattern tuning of sample concentrations needed for sufficient single particle hit rate, adequate data collection
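
To make the rapid image-analysis feedback concrete, here is a deliberately naive hit-finding step over synthetic detector frames: count pixels far above the frame's own noise level and keep frames with enough candidate Bragg peaks. The threshold rule and data are invented; the real pipelines use the psana/cctbx peak finders.

```python
import numpy as np

# Toy "hit finding" over synthetic detector frames (not psana/cctbx).
rng = np.random.default_rng(7)

def is_hit(frame, n_sigma=6.0, min_peaks=15):
    background = np.median(frame)
    noise = frame.std()
    peaks = np.count_nonzero(frame > background + n_sigma * noise)
    return peaks >= min_peaks

hits = 0
for shot in range(1000):                           # pretend event loop over detector frames
    frame = rng.poisson(lam=5.0, size=(128, 128)).astype(float)
    if shot % 10 == 0:                             # every 10th frame gets fake Bragg spots
        ys, xs = rng.integers(0, 128, 30), rng.integers(0, 128, 30)
        frame[ys, xs] += 200.0
    hits += is_hit(frame)
print(f"hit fraction: {hits / 1000:.2%}")
```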

Y1: Release cctbx, psana; benchmark exascale-aware M-TIP routines; prototype psana tasking; image kernels ported to C++ parallel STL; deploy live streaming of HDF5 files from FFB to Cori; SFX experiment on Cori PII
Y2: Release cctbx, psana, M-TIP; port psana-MPI and tasking to Summit; key image kernels ported to an image DSL; decision for psana-MPI vs. psana-task; SFX experiment using IOTA on Cori PII and SPI experiment on Cori PII
Y3: Release cctbx, psana, M-TIP; optimized M-TIP under streaming; cctbx and M-TIP integrated into psana with 50% scaling on Cori PII and Sierra; SFX experiment using ray tracing
Y4: End-to-end cctbx on NERSC-9/ESnet6; optimized scheduler for live streaming jobs on Cori; SFX and SPI experimental demos for LCLS users visualizing structures in < 10 min

• Schedulability of NERSC & LCLS resources
• LCLS-II data rate > ESnet data rate
• HPC execution overhead
• Scalability of file format(s)
• Keeping up with network infrastructure upgrades
• Experiment calendar uncertainty
• New model for HPC utilization: bursty, short workloads imply lower machine utilization unless resilient jobs can be preempted
• Maturity of tasking runtime/image analysis kernels

83Exascale Computing Project

Software Technology Requirements
Data Analytics for Free Electron Lasers

• Programming Models and Runtimes
1. Python, C++, Legion, MPI, OpenMP
2. Argobots, Qthreads
3. Halide

• Tools
1. CMake, git, GitHub, Zenhub, LLVM, Travis

• Mathematical Libraries, Scientific Libraries, Frameworks
1. scikit-beam, BLAS/LAPACK, FFTW, GSL

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

84Exascale Computing Project

Software Technology Requirements
Data Analytics for Free Electron Lasers

• Data Management and Workflows
1. HDF5, psana
2. iRODS, GASNet, Mercury, bbcp, Globus Online, GridFTP, XRootD, Zettar

• Data Analytics and Visualization
1. cctbx, M-TIP

• System Software
1. Shifter

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

85Exascale Computing Project

Transforming Combustion Science and Technology with Exascale Simulations

• Direct Numerical Simulation (DNS) of a turbulent lifted jet flame stabilized on pre-ignition species reveals how low-temperature reactions help stabilize the flame against the disruptive effects of high velocity turbulence

• Understand the role of multi-stage ignition in high-pressure diesel jet flames
• High-fidelity, geometrically faithful simulation of the relevant in-cylinder processes in a low-temperature reactivity controlled compression ignition (RCCI) internal combustion engine that is more thermodynamically favorable than existing engines, with potential for groundbreaking efficiencies while limiting pollutant formation

• Develop DNS and hybrid DNS-LES adaptive mesh refinement solvers for transforming combustion science and technology through capable exascale

• Prediction of the relevant processes - turbulence, mixing, spray vaporization, ignition and flame propagation, and soot/radiation -- in an RCCI internal combustion engine will become feasible

PI: Jacqueline Chen (SNL)

Lifted jet flame showing the separation of low-temperature reactions (yellow to red) and the high-temperature flame (blue to green)

86Exascale Computing Project

Applications
• S3D, LMC, Pele, PeleC, PeleLM
Software Technologies Cited
• C++, UPC++
• MPI, OpenMP, GASNet, OpenShmem
• Kokkos, Legion, Perilla
• BoxLib AMR, Chombo AMR, PETSc, Hypre
• HDF5
• IRIS

Transforming Combustion Science and Technology with Exascale Simulations

Exascale Challenge Problem Applications & S/W Technologies

Development PlanRisks and Challenges

PI: Jacqueline Chen (SNL)

• First-principles (DNS) and near-first principles (DNS/LES hybrids) AMR-based technologies to advance understanding of fundamental turbulence-chemistry interactions in device relevant conditions

• High-fidelity, geometrically faithful simulation of the relevant in-cylinder processes in a low-temperature reactivity controlled compression ignition (RCCI) internal combustion engine that is more thermodynamically favorable than existing engines, with potential for groundbreaking efficiencies while limiting pollutant formation

• Also demonstrate technology with hybrid DNS/LES simulation of a sector from a gas turbine for power generation burning hydrogen enriched natural gas.

• High-fidelity models will account for turbulence, mixing, spray vaporization, low-temperature ignition, flame propagation, soot/radiation, non-ideal fluids

Y1: Initial release of Pele; report on baseline KNL performance; simulation of a turbulent premixed flame with non-ideal fluid behavior
Y2: Verification of Lagrangian spray implementation in the compressible solver; annual end-of-year Pele performance benchmarks
Y3: Benchmarking of linear solvers on Summit; simulation of a turbulent flame with complex geometry; annual end-of-year Pele performance benchmarks
Y4: Low-Mach simulation of a gas turbine sector with real geometry

• Embedded boundary (EB) multigrid performance
• Effective AMR for challenge problem demos (a gradient-based tagging sketch follows below)
• Accurate DNS/LES coupling
• Maturity of Eulerian spray models
• Effective EB load-balance strategies
• Performance portability of chemistry DSL
• Performant task-based particle implementation that is interoperable with BoxLib
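
As a minimal picture of the AMR tagging these solvers rely on, the sketch below flags cells with steep temperature gradients (a smeared "flame front") for refinement. The profile and threshold are invented for illustration and this is not the Pele/BoxLib tagging logic.

```python
import numpy as np

# Toy gradient-based refinement flagging for a 1D "flame front" profile.
nx = 256
x = np.linspace(0.0, 1.0, nx)
temperature = 300.0 + 1500.0 / (1.0 + np.exp(-(x - 0.6) / 0.01))   # smeared front

grad = np.abs(np.gradient(temperature, x))
flag = grad > 0.2 * grad.max()            # tag cells with steep gradients for refinement
print(f"{flag.sum()} of {nx} cells tagged for refinement near x = "
      f"[{x[flag].min():.3f}, {x[flag].max():.3f}]")
```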

87Exascale Computing Project

Software Technology Requirements
Combustion (SNL)

• Programming Models and Runtimes
Fortran (1), C++/C++17 (1), MPI (1), OpenMP (1), CUDA (1), TiledArrays (1), OpenCL (2), Legion/Regent (1), GASNet (1), TiDA (1), OpenACC (2), PGAS (1), Kokkos (2), UPC/UPC++ (1), Perilla (2), OpenShmem (2), Boost (3), Thrust (3)

• Tools
CMake (1), Git (1), GitLab (1), PAPI (2), DDT (1), VTune (2), Jenkins (1)

• Mathematical Libraries, Scientific Libraries, Frameworks
BoxLib (1), HPGMG (1), FFTW, Sundials (1), TiDA (1), Hypre (2), PETSc (1), BLAS/PBLAS (3)

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

88Exascale Computing Project

Software Technology Requirements
Combustion (SNL)

• Data Management and Workflows
MPI-IO (1), HDF5 (1), ADIOS (3)

• Data Analytics and Visualization
VisIt (1), ParaView (1), VTK (3), FFTW (3), DHARMA (3)

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

89Exascale Computing Project

Cloud-Resolving Climate Modeling of the Earth's Water Cycle

• Cloud-resolving Earth system model with throughput necessary for multi-decade, coupled high resolution climate simulations

• Target substantial reduction of major systematic errors in precipitation with realistic / explicit convective storm treatment

• Improve ability to assess regional impacts of climate change on the water cycle that directly affect multiple sectors of the U.S. and global economies (agriculture & energy production)

• Implement advanced algorithms supporting a super-parameterization cloud-resolving model to advance climate simulation and prediction

• Design the super-parameterization approach to make full use of GPU-accelerated systems, using performance-portable approaches, to ready the model for capable exascale

PI: Mark Taylor (SNL)

Hurricane simulated by the ACME model at the high resolution necessary to simulate extreme events such as tropical cyclones

90Exascale Computing Project

Exascale challenge problem
• Earth system model (ESM) with throughput needed for multi-decadal coupled high-resolution (~1 km) climate simulations, reducing major systematic errors in precipitation models via explicit treatment of convective storms

• Improve regional impact assessments of climate change on the water cycle, e.g., influencing agriculture/energy production

• Integrate cloud-resolving, GPU-enabled convective parameterization into the ACME ESM using the Multiscale Modeling Framework (MMF); refactor key ACME model components for GPU systems (a schematic MMF loop is sketched below)

• ACME ESM goal: fully weather-resolving atmosphere with cloud-resolving superparameterization, eddy-resolving ocean/ice components, and throughput (5 SYPD) enabling 10–100 member ensembles of 100-year simulations
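
A schematic of the MMF/superparameterization loop referenced above: every global column carries a small embedded "cloud-resolving" model that is sub-cycled over the global time step with the column's large-scale forcing, and its mean tendency is fed back. All models and numbers below are invented placeholders, not ACME-MMF or SAM.

```python
import numpy as np

# Toy multiscale-modeling-framework (MMF) loop with schematic stand-ins.
n_columns, n_crm_cells = 32, 16
rng = np.random.default_rng(3)
T_col = 280.0 + 10.0 * rng.random(n_columns)          # column-mean temperatures (K)
crm = np.repeat(T_col[:, None], n_crm_cells, axis=1)  # embedded CRM state per column
dt_global, dt_crm = 1800.0, 60.0                      # global and embedded time steps (s)

def crm_substep(state, forcing, dt):
    # mix the embedded cells and apply the large-scale forcing
    mixed = 0.5 * (state + state.mean(axis=1, keepdims=True))
    return mixed + dt * forcing[:, None]

for step in range(48):                                # one simulated day
    forcing = 1.0e-5 * (285.0 - T_col)                # relax columns toward 285 K
    for _ in range(int(dt_global / dt_crm)):          # sub-cycle the embedded models
        crm = crm_substep(crm, forcing, dt_crm)
    tendency = (crm.mean(axis=1) - T_col) / dt_global # CRM-implied heating rate
    T_col += dt_global * tendency                     # feed back to the global columns
```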

PI: Mark Taylor (SNL)

Cloud-Resolving Climate Modeling of Earth’s Water Cycle
Summary example of an application project development plan

Development Plan

• Y1: Demonstrate ACME-MMF model for the Atmospheric Model Intercomparison Project configuration; complete a 5-year ACME-MMF simulation with active atmosphere and land components at low resolution and ACME atmosphere diagnostics/metrics

• Y2: Demonstrate ACME-MMF model with active atmosphere, land, ocean and ice; complete 40 year simulation with ACME coupled group water cycle diagnostics/metrics

• Y3: Document GPU speedup in performance-critical components: Atmosphere, Ocean and Ice; compare SYPD with and without using the GPU

• Y4: ACME-MMF configuration integrated into the ACME model; document the highest resolution able to deliver 5 SYPD; complete a 3-member ensemble of 40-year simulations with all active components (atmosphere, ocean, land, ice) with ACME coupled group diagnostics/metrics

Risks and challenges
• Insufficient LCF allocations
• Obtaining necessary GPU throughput on the cloud-resolving model
• Cloud-resolving convective parameterization via the multiscale modeling framework does not provide expected improvements in water cycle simulation quality
• Global atmospheric model cannot obtain necessary throughput
• MPAS ocean/ice components not amenable to GPU acceleration

Applications
• ACME Earth system model: ACME-Atmosphere
• MPAS (Model Prediction Across Scales)-Ocean (ocean)
• MPAS-Seaice (sea ice); MPAS-Landice (land ice)
• SAM (System for Atmospheric Modeling)

Software technologies
• Fortran, C++, MPI, OpenMP, OpenACC
• Kokkos, Legion
• PIO, Trilinos, PETSc
• ESGF, Globus Online, AKUNA framework

91Exascale Computing Project

Software Technology Requirements
Climate (ACME)

• Programming Models and Runtimes
1. Fortran, C++/C++17, Python, C, MPI, OpenMP, OpenACC, Globus Online
2. Kokkos, Legion/Regent
3. Argobots, HPX, PGAS, UPC/UPC++

• Tools
1. LLVM/Clang, JIRA, CMake, git, ESGF
2. TAU, GitLab
3. PAPI, ROSE, HPCToolkit

• Mathematical Libraries, Scientific Libraries, Frameworks
1. Metis
2. MOAB, Trilinos, PETSc
3. Dakota, Sundials, Chaco

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans

92Exascale Computing Project

Software Technology Requirements
Climate (ACME)

• Data Management and Workflows
1. MPI-IO, HDF, PIO
2. Akuna
3. ADIOS

• Data Analytics and Visualization
1. VTK, ParaView, netCDF

• System Software

Requirements Ranking
1. Definitely plan to use
2. Will explore as an option
3. Might be useful but no concrete plans