
Office of Science, Office of Biological and Environmental Research
May 15, 2017

BASC: Next Generation Computing: Needs and Opportunities for Weather, Climate, and Atmospheric Sciences

Department of Energy, Office of Science
Office of Biological and Environmental Research
Gary Geernaert, Division Director, Climate and Environmental Sciences
Dorothy Koch, Program Manager, Earth System Modeling


Department of Energy, Office of Science
Steve Binkley, Deputy Director

Program offices:
• Advanced Scientific Computing Research (ASCR): Barb Helland, Associate Director
• Basic Energy Sciences
• Biological and Environmental Research (BER): Sharlene Weatherwax, Associate Director
  - Biological Systems Science
  - Climate and Environmental Sciences: Gary Geernaert, Director
• Fusion Energy Sciences
• High Energy Physics
• Nuclear Physics

Current and planned HPC-intensive activities related to DOE climate and atmospheric sciences

• LASSO: Leveraging the Southern Great Plains ARM site with large-eddy simulation
• ACME: High-resolution (25 km) coupled Earth system modeling targeting the DOE Leadership Computing Facilities
• An end-to-end approach to exascale systems, from libraries to algorithms to applications to hardware; includes next-generation ACME computing
• ASCR (Computing)-BER (Climate) partnership program: science and model development projects
• IDEAS: Interoperable Design of Extreme-scale Applications Software


Accelerated Climate Model for Energy

ACME is a modeling project launched by DOE in July 2014 to develop a branch of CESM to:
❖ Advance a set of science questions that demand major computational power and advanced software: "water cycle", "biogeochemistry", and "cryosphere-ocean"
❖ Provide high-resolution coupled climate simulations (15-25 km), with regionally refined grids below 10 km
❖ Focus on a near-term time horizon: 1970-2050
❖ Design codes to effectively utilize next and successive generations of DOE Leadership Class computers, both hybrid and multi-core, through exascale

V1 (version 1) is currently in production simulation (to be released in late 2017). V2-V3 are under development (release in 2020-2023).

ACME will be an open-source Earth system model ready to run on DOE computers, including DOE's NERSC.

Examples of current (v1) computational performance (a throughput sketch follows this list):
- Edison: 12 SYPD (100 km coupled)
- Cori KNL: 6 SYPD (100 km coupled), 3 SYPD (25 km atmosphere-only)
- Titan: 1.4 SYPD (25 km)
- Mira: 0.33 SYPD (25 km)
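To make these throughput numbers concrete, here is a minimal sketch (Python, illustrative only) that converts a quoted simulated-years-per-day (SYPD) rate into the wall-clock days of dedicated running needed for a target simulation length. It assumes the job runs continuously at the quoted rate, with no queue waits, restarts, or I/O stalls; the machine labels and rates are copied from the list above.

```python
# Illustrative only: wall-clock time implied by a quoted SYPD throughput,
# assuming uninterrupted running at that rate (real allocations and queue
# policies rarely allow this).

sypd = {  # simulated years per wall-clock day, from the list above
    "Edison (100 km coupled)": 12.0,
    "Cori KNL (100 km coupled)": 6.0,
    "Cori KNL (25 km atmosphere-only)": 3.0,
    "Titan (25 km)": 1.4,
    "Mira (25 km)": 0.33,
}

target_years = 200  # e.g. a 1970-2050-style scenario run, repeated or extended

for machine, rate in sypd.items():
    days = target_years / rate
    print(f"{machine}: {days:.0f} wall-clock days for {target_years} simulated years")
```

At 1.4 SYPD, for example, 200 simulated years already implies roughly 140 days of continuous running, which is consistent with the "reality check" later in the talk that a calendar year of production yields fewer than 200 coupled years at 25 km.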


Future directions (5-10 years)

Earth system modeling (ACME): Non-hydrostatic atmosphere (down to 100 m with regional refinement); eddy-resolving ocean (down to 100 m for coastal modeling and inundation); fully integrated dynamic land ice; above- and below-ground hydrology and BGC; dynamic vegetation; sub-grid orography

Model analysis: Calibration, testing, and analysis on very large ensembles; better model-observation integration methods; embedded UQ and diagnostics

"Integrated" modeling: An interoperable framework that includes ESM, IAM, and IAV for various sectors (including energy), at the appropriate scales and configurations to solve particular problems

Subsurface modeling: Watershed and genome-enabled BGC models; community infrastructure for interoperable hydrology and BGC

LES modeling: Simulation that integrates ARM data, for model parameterization development and testing of remote retrievals


DOE's Office of Science Computing Facilities and Programs

ASCR computing facilities:
• OLCF (hybrid, CPU-GPU)
  - Titan (27 PF Cray XK7 hybrid)
  - Summit (200 PF)
• ALCF (many-core)
  - Mira (10 PF IBM Blue Gene/Q)
  - Aurora (2018; 180 PF Knights Hill Xeon Phi)
• NERSC (many-core)
  - Edison (2.6 PF Cray XC30, Intel Xeon)
  - Cori (31 PF Cray XC40, Intel Xeon Phi KNL)

Programs for computer allocation awards:
• INCITE: Open science competition; projects must use the machine effectively; ALCF and OLCF (5.8 billion hours to 55 projects)
• ALCC: DOE-Science-relevant awards, on all three systems (3 billion processor hours to 49 projects)
• ERCAP: DOE-SC Office program, at NERSC (640 awards)

Climate applications share and compete for these resources with the DOE and outside communities. Smaller dedicated resources are sometimes purchased for quick-turnaround simulations.

Detailed specs for current and next systems

Applications like ACME are actively exploring heterogeneous computing in both its many-core (Cori/Theta) and GPU-accelerator (Titan) forms. The DOE strategy is to target a wide cross-section of grand-challenge-scale computing applications in exascale design.

Exascale: The ASCR facilities are undergoing an extensive procurement process. As part of this, ASCR has convened workshops and solicited input/reports from each of the offices within the Office of Science:
• Advanced Scientific Computing Research
• Basic Energy Sciences
• Biological and Environmental Research
• Fusion Energy Sciences
• High Energy Physics
• Nuclear Physics

The BER exascale requirements workshop was held in March 2016; the report is 366 pp: http://exascaleage.org/ber/

The 10-year milestone for DOE climate modeling is linked to the design and deployment of a DOE exascale computing facility in the mid-2020s.


Computational challenges for DOE climate modeling in the next decade

• DOE climate modeling is committed to using DOE machines, which are challenging because of low-power, low-bandwidth trends in hardware.
• An added challenge is to use both many-core and hybrid architectures effectively, while also preparing for an as-yet-unknown exascale platform.
• DOE projects have the significant advantage of co-location with the facilities and with other HPC-intensive research projects within the DOE Laboratory system. However, the projects do not have significant dedicated resources.
• Being on the "bleeding edge" of computing is painful: the systems evolve quickly, and significant resources go toward ongoing updating of the codes.
• The ACME project is committed to simulating the coupled climate system, so the performance of each component (ocean, atmosphere, and coupling) must be considered.
• In addition to "capacity" computing, DOE also needs a good mid-size "capability" resource that is efficient at data processing for model analysis.


Future "reality check" and strategies

Reality:
• Although ACME has substantial allocations on DOE machines, it can perform fewer than 200 years of coupled simulation at 25 km resolution in one calendar year.
• At exascale, assuming the ability to simulate on a next-generation low-power machine, one could perform an ensemble at 25 km, or further increase resolution to 10 km and perform a 200-year simulation (a back-of-the-envelope version of this trade-off is sketched below).
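The ensemble-versus-resolution trade-off can be illustrated with rough scaling arithmetic. A common rule of thumb, assumed here rather than measured from ACME, is that cost grows roughly with the cube of the refinement factor: four times more grid columns plus a proportionally smaller time step. The sketch below applies that cubic scaling to the 25 km to 10 km step.

```python
# Back-of-the-envelope cost scaling with horizontal resolution.
# ASSUMPTION: cost ~ (dx_old / dx_new)**3  (two horizontal dimensions plus a
# proportionally smaller time step); actual model scaling will differ.

def relative_cost(dx_old_km: float, dx_new_km: float, exponent: float = 3.0) -> float:
    """Cost of a run at dx_new relative to the same run at dx_old."""
    return (dx_old_km / dx_new_km) ** exponent

factor = relative_cost(25.0, 10.0)
print(f"25 km -> 10 km costs ~{factor:.0f}x as much per simulated year")
# ~16x: the budget for one 200-year run at 10 km could instead fund an
# ensemble of roughly 15 comparable runs at 25 km, which is the trade-off
# described in the bullet above.
```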

Strategies: DOE has interests in new strategies for science and computation:
• Algorithms with a high flop-to-memory ratio
• Algorithms with large sub-grid work suited to GPUs, such as MMF (multiscale modeling framework)
• Methods to obtain statistics from simulations (beyond brute-force ensemble methods)
• Use of in-situ diagnostics to reduce I/O (see the sketch after this list)
• New/better methods to initialize the coupled system, to avoid long spin-up
• More extensive and invasive parallelization of the code, e.g. task-based
• For portability: memory/pattern "abstractions" of the code, use of portable libraries, and programming models
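As one concrete illustration of the in-situ idea referenced above, the sketch below accumulates running statistics (mean and variance via Welford's online algorithm) as fields are produced, so only a small summary ever needs to be written instead of every time step. This is a generic Python/NumPy illustration, not ACME code; the field, its shape, and the file name are hypothetical.

```python
# Generic in-situ diagnostics sketch (not ACME code): accumulate running
# mean/variance of a 2-D field with Welford's online algorithm so that only
# a summary is written, instead of one file per time step.
import numpy as np

class RunningStats:
    def __init__(self, shape):
        self.n = 0
        self.mean = np.zeros(shape)
        self.m2 = np.zeros(shape)   # sum of squared deviations from the mean

    def update(self, field):
        self.n += 1
        delta = field - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (field - self.mean)

    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else np.zeros_like(self.m2)

# Hypothetical usage: synthetic "surface temperature" fields stand in for model output.
stats = RunningStats((180, 360))
for step in range(1000):
    field = 288.0 + np.random.randn(180, 360)   # placeholder for one model time step
    stats.update(field)                          # in-situ update: no per-step file written

np.savez("t_surface_summary.npz", mean=stats.mean, var=stats.variance())
```

Writing one summary file instead of a thousand per-step files is the kind of I/O reduction the bullet has in mind; production in-situ frameworks compute richer diagnostics (spectra, extremes, regional means) in the same pattern.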

AXICCS workshop, September 2016; the report is 228 pp: https://science.energy.gov/ber/community-resources/

Office of Science, Office of Biological and Environmental Research

Thank you!

BER: https://science.energy.gov/ber
ASCR: https://science.energy.gov/ascr
ASCR facilities: https://science.energy.gov/user-facilities/
ACME: https://climatemodeling.science.energy.gov/projects/accelerated-climate-modeling-energy