The Second Workshop on Coupling Technologies for Earth System Models, 20–22 February 2013, NCAR, Boulder, Colorado
DESIGN AND PERFORMANCE OF THE MPAS-A NON-HYDROSTATIC ATMOSPHERE MODEL
Michael Duda1 and Doug Jacobsen2
1 National Center for Atmospheric Research*, NESL; 2 Los Alamos National Laboratory, COSIM
*NCAR is funded by the National Science Foundation
WHAT IS THE MODEL FOR PREDICTION ACROSS SCALES?
A collaboration between NCAR (MMM) and LANL (COSIM) to develop models for climate, regional climate, and NWP applications:
• MPAS-Atmosphere (NCAR)
• MPAS-Ocean (LANL)
• MPAS-Ice (LANL)
• MPAS framework, infrastructure* (NCAR, LANL)
These models use a centroidal Voronoi tessellation (CVT) with C-grid staggering for their horizontal discretization.

*The MPAS infrastructure handles general (conformal?) unstructured horizontal grids!
from Ringler et al. (2008)
Prognostic velocities are the velocities normal to cell faces ("edges"), carried at the point where the edge intersects the arc joining the cell centers on either side.
MPAS SOFTWARE ARCHITECTURE
1. Driver layers – The high-level DRIVER calls init, run, finalize methods implemented by the core-independent SUBDRIVER; DRIVER can be replaced by a coupler, and SUBDRIVER can include import and export methods for model state
2. MPAS core – The CORE contains science code that performs the computational work (pre-, post-processing, model time integration, etc.) of MPAS; each core’s implementation lives in a separate sub-directory and is selected at compile-time
3. Infrastructure – The infrastructure provides the data types used by the core and the rest of the model, along with communication, I/O, and generic computational operations on CVT meshes such as interpolation
Arrows indicate interaction between components of the MPAS architecture
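The driver/subdriver split above can be sketched in miniature; the class and method names below are hypothetical stand-ins (MPAS itself is Fortran), chosen only to show why a coupler can replace the top-level driver without touching core code:

```python
# Sketch of the DRIVER / SUBDRIVER split (hypothetical names, not MPAS source):
# the top-level driver knows only the init/run/finalize interface, so a
# coupler can stand in for it without changing the core-independent layer.

class Subdriver:
    """Core-independent layer; a real subdriver would also expose
    import/export methods for model state."""
    def init(self):
        self.log = ["init"]
    def run(self):
        self.log.append("run")
    def finalize(self):
        self.log.append("finalize")

def driver(model):
    # A coupler could replace this loop, calling the same three methods.
    model.init()
    model.run()
    model.finalize()

m = Subdriver()
driver(m)
print(m.log)  # ['init', 'run', 'finalize']
```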
PARALLEL DECOMPOSITION
Graph partitioning
The dual mesh of a Voronoi tessellation is a Delaunay triangulation – essentially the connectivity graph of the cells
Parallel decomposition of an MPAS mesh then becomes a graph partitioning problem: equally distribute nodes among partitions (give each process equal work) while minimizing the edge cut (minimizing parallel communication)
We use the Metis package for graph partitioning
• Currently done as a pre-processing step, but could be done "on-line"

Metis also handles weighted graph partitioning
• Given a priori estimates of the computational cost of each grid cell, we can better balance the load among processes
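The edge-cut objective above can be made concrete with a toy example; the connectivity graph and partitions below are hypothetical, not an MPAS mesh:

```python
# Toy illustration of the graph-partitioning objective (not MPAS code):
# cells are graph nodes, shared cell faces are edges, and a partition's
# parallel communication cost is proportional to its edge cut.

# Hypothetical connectivity graph for 6 cells (each edge = a shared face).
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 4), (3, 5), (4, 5)]

def edge_cut(partition):
    """Number of edges whose endpoints lie in different partitions."""
    return sum(1 for a, b in edges if partition[a] != partition[b])

# Two balanced 2-way partitions of the same graph:
interleaved = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}  # cells scattered
contiguous  = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}  # compact blocks

print(edge_cut(interleaved))  # 4 edges cut -> more communication
print(edge_cut(contiguous))   # 2 edges cut -> less communication
```

Both partitions give each process three cells (equal work); a partitioner like Metis searches for the assignment with the smaller cut.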
PARALLEL DECOMPOSITION (2)
Given an assignment of cells to a process, any number of layers of halo (ghost) cells may be added.

[Figure: a block of cells owned by a process; the block plus one layer of halo/ghost cells; the block plus two layers of halo/ghost cells]
Cells are stored in a 1d array (2d with vertical dimension, etc.), with halo cells at the end of the array; the order of real cells may be updated to provide better cache re-use
With a complete list of cells stored in a block, adjacent edge and vertex locations can be found; we apply a simple rule to determine ownership of edges and vertices adjacent to real cells in different blocks
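The layer-by-layer halo construction described above can be sketched with a toy adjacency list (a hypothetical 1-d strip of cells, not an MPAS CVT mesh): each new layer is simply the set of neighbors of the current block that are not yet in the block.

```python
# Sketch of building halo (ghost) layers from cell adjacency.
# Toy mesh: a 1-d strip of 8 cells, where cell i touches cells i-1 and i+1.
adjacency = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}

def add_halo_layers(owned, num_layers):
    """Return (sorted owned cells, list of halo layers, one list per layer)."""
    block = set(owned)
    layers = []
    for _ in range(num_layers):
        layer = {nbr for c in block for nbr in adjacency[c]} - block
        layers.append(sorted(layer))
        block |= layer  # the next layer is built around block + halo so far
    return sorted(owned), layers

owned, halos = add_halo_layers({3, 4}, num_layers=2)
print(owned)  # [3, 4]
print(halos)  # [[2, 5], [1, 6]]
```

In MPAS the same idea applies on the Voronoi mesh, with the halo cells appended at the end of the block's 1-d cell array.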
MODEL INFRASTRUCTURE
I/O: Provides parallel I/O through an API that uses infrastructure DDTs
• High-level interface for creating "streams" (groups of fields that are read/written at the same time from/to a file)
• Underlying I/O functionality is provided by CESM's PIO

PARALLELISM: Implements operations on field types needed for parallelism
• E.g., add halo cells to a block, halo cell update, all-to-all
• Callable from either serial or parallel code (no-op for serial code)
• For multiple blocks per process, differences between shared-memory and MPI are hidden

OPERATORS: Provides implementations of general operations on CVT meshes
• Horizontal interpolation via RBFs, 1-d spline interpolation
• Ongoing work to add a generic advection operator for C-grid staggered CVTs
THE MPAS REGISTRY
The need to support different cores in the MPAS framework suggests that the developer of a core would need to write "copy-and-paste" code to handle aspects of each field such as:
• Field definition
• Allocation/deallocation of field structures
• Halo setup
• I/O calls
THE MPAS REGISTRY
MPAS has employed a computer-aided software engineering tool (the "Registry") to isolate the developer of a core from the details of the data structures used inside the framework
• An idea borrowed from the WRF model (Michalakes (2004))
• The Registry is a "data dictionary": each field has an entry providing metadata and other attributes (type, dims, I/O streams, etc.)
• Each MPAS core is paired with its own Registry file
• At compile time, a small C program is first compiled; the program runs, parses the Registry file, and generates Fortran code
– Among other things, it creates code to allocate, read, and write fields

For the dynamics-only non-hydrostatic atmosphere model, the Registry generates ~23,200 lines of code, versus 5,100 hand-written lines for the dynamics and 23,500 hand-written lines for the infrastructure
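The data-dictionary-to-code idea can be sketched in miniature. The entry format and generated Fortran below are hypothetical stand-ins (the real Registry is parsed by a small C program with its own syntax); the point is only that one entry per field replaces repetitive hand-written code:

```python
# Miniature "Registry"-style code generator (hypothetical entry format;
# the real MPAS Registry is a C program that emits Fortran from its own
# registry-file syntax).

registry = [
    {"name": "u",     "type": "real", "dims": ["nVertLevels", "nCells"]},
    {"name": "theta", "type": "real", "dims": ["nVertLevels", "nCells"]},
]

def generate_allocations(entries):
    """Emit a Fortran-style allocate statement for each registered field."""
    lines = []
    for e in entries:
        dims = ", ".join(e["dims"])
        lines.append(f"allocate(state % {e['name']}({dims}))")
    return "\n".join(lines)

print(generate_allocations(registry))
# allocate(state % u(nVertLevels, nCells))
# allocate(state % theta(nVertLevels, nCells))
```

The same entries could drive generation of deallocation, halo-setup, and I/O code, which is how a few hundred Registry lines expand into tens of thousands of generated Fortran lines.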
ROLE OF THE REGISTRY IN COUPLING
The Registry can generate more than just Fortran code – anything we'd like it to generate based on Registry entries, in fact!
– Information for metadata-driven couplers
– Documentation (similar in idea to doxygen for generating source-code documentation)

The syntax of the MPAS Registry files is easily changed or updated
– Could be extended to permit additional attributes and metadata
– We're considering a new format for the Registry files to accommodate richer metadata
```
field {
    name          : u
    dimensions    : nVertLevels, nCells
    units         : "m s-1"
    description   : "normal velocity on cell faces"
    coupled-write : true
    coupled-read  : needed
}
```
MPAS-A SCALABILITY
| MPI tasks | Cells per task | Speedup | Efficiency |
|----------:|---------------:|--------:|-----------:|
|        16 |          10240 |    1.00 |    100.00% |
|        32 |           5120 |    1.97 |     98.40% |
|        64 |           2560 |    3.90 |     97.49% |
|       128 |           1280 |    7.67 |     95.88% |
|       256 |            640 |   14.65 |     91.57% |
|       512 |            320 |   27.56 |     86.12% |
|      1024 |            160 |   48.49 |     75.77% |
|      2048 |             80 |   85.21 |     66.57% |
|      4096 |             40 |  151.43 |     59.15% |

[Figure: MPAS-A scaling – 60-km mesh, yellowstone and bluefire]
The full MPAS-A solver (physics+dynamics, no I/O) achieves >75% efficiency down to about 160 owned cells per MPI task
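The efficiency column follows directly from the speedup and the 16-task baseline; a quick check reproduces the table's values to within rounding of the reported speedups:

```python
# Parallel efficiency relative to the 16-task baseline:
#   efficiency = speedup / (tasks / 16)
# Rows taken from the scaling table above (tasks, reported speedup).

rows = [(16, 1.00), (32, 1.97), (256, 14.65), (1024, 48.49), (4096, 151.43)]

for tasks, speedup in rows:
    efficiency = 100.0 * speedup / (tasks / 16)
    print(f"{tasks:5d} tasks: {efficiency:6.2f}%")
```

Small discrepancies (e.g. 98.50% computed vs. 98.40% reported at 32 tasks) come from the speedups being rounded to two decimals in the table.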
MPAS-A SCALABILITY
Halo communication (“comm”) accounts for a rapidly growing share of the total solver time
• Physics are currently all column-independent and scale almost perfectly
• Redundant computation in the halos limits the scalability of the dynamics
The lower bound for the number of ghost cells in two halo layers, Ng, is

Ng = 4√(π·No) + 4π

where No is the number of owned cells:
• No = 80 → Ng = 76
• No = 40 → Ng = 57
MPAS-A (60-km mesh, yellowstone) timing breakdown
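The slide's ghost-cell numbers can be reproduced from a perimeter argument: a compact (roughly disk-shaped, which is my assumption here) block of No unit cells has radius r = √(No/π), and two halo rings cover the annulus from r to r + 2, about 4π(r + 1) cells:

```python
import math

# Estimate of the two-layer halo size for a disk-shaped block of No cells
# (disk shape is an assumption; it gives the minimal halo for a given No).

def ghost_cells(owned):
    r = math.sqrt(owned / math.pi)      # block radius in cell widths
    return round(4 * math.pi * (r + 1))  # area of the two-ring annulus

print(ghost_cells(80))  # 76
print(ghost_cells(40))  # 57
```

Because the halo grows like √No while the owned work grows like No, the halo fraction (76/80 ≈ 95% here) explodes at small block sizes, which is exactly the strong-scaling limit seen in the timing breakdown.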
STRATEGIES FOR MINIMIZING COMMUNICATION COSTS
• Aggregate halo exchanges for fields with the same stencil
– Not currently implemented
– In MPAS-A, there are limited places where we exchange multiple halos at the same time; restructuring the solver code might help

• Use one MPI task per shared-memory node, and assign that task as many blocks as there are cores on the node
– Supported already in the MPAS infrastructure
– Initial testing underway in MPAS-O and the MPAS shallow-water model; block loops parallelized with OpenMP

• Overlap computation and communication by splitting halo exchanges into a begin phase and an end phase with non-blocking communication
– Prototype code has been written to do this; looks promising
– Restructuring the MPAS-A solver might improve opportunities to take advantage of this option
– At odds with aggregated halo exchanges?
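The first strategy, aggregating halo exchanges, amounts to packing several fields' halo data into one buffer so a single message replaces one message per field; the field layout below is a toy stand-in, not the MPAS implementation:

```python
# Toy sketch of aggregated halo exchange: pack the halo regions of several
# fields sharing a stencil into one buffer, so one message replaces one
# message per field (fewer messages -> less latency cost).

# Hypothetical fields on a block of 6 cells; per the MPAS storage convention,
# the last 2 entries of each array are the halo cells.
fields = {
    "u":     [1.0, 2.0, 3.0, 4.0, 9.0, 9.0],
    "theta": [300.0, 301.0, 302.0, 303.0, 0.0, 0.0],
}
HALO = slice(4, 6)  # halo cells sit at the end of each array

def pack(field_dict):
    """Concatenate every field's halo values into one send buffer."""
    return [v for name in field_dict for v in field_dict[name][HALO]]

def unpack(field_dict, buffer):
    """Scatter a received buffer back into each field's halo region."""
    for i, name in enumerate(field_dict):
        field_dict[name][HALO] = buffer[2 * i : 2 * i + 2]

buf = pack(fields)                       # one buffer instead of two messages
unpack(fields, [5.0, 6.0, 304.0, 305.0])  # pretend this arrived from a neighbor
print(fields["u"][HALO])      # [5.0, 6.0]
print(fields["theta"][HALO])  # [304.0, 305.0]
```

The tension noted on the slide is visible even here: once halos are packed into one buffer, the exchange cannot complete field-by-field, which is what the begin/end overlapping strategy would want.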
SUMMARY
• MPAS is a family of Earth-system component models sharing a common software framework
– The infrastructure should be general enough for most horizontally unstructured (conformal?) grids
– Extensive use of derived types enables simple interfaces to infrastructure functionality
– Besides PIO, we've chosen to implement functionality "from scratch"

• The Registry mechanism in MPAS could be further leveraged for metadata maintenance and for coupling purposes

• We're just beginning to experiment with new approaches to minimizing communication costs in the MPAS-A solver
– Any improvements to the infrastructure can be leveraged by all MPAS cores