Weather prediction and climate modelling at Exascale: Introducing the Gung-Ho project

R. Ford, M.J. Glover, D. Ham, C.M. Maynard, S. Pickles, G. Riley and N. Wood

© Crown copyright Met Office


Page 2: … and the weather for the conference is

Page 3: The primitive equations

• Rather complicated

• Equations of motion for density, humidity, pressure, temperature and wind, mass conservation and thermodynamics

• Partial Differential Equations – no general solution

• Approximate, discrete, numerical methods


When a problem in pure or in applied mathematics is "solved" by numerical computation, errors, that is, deviations of the numerical "solution" obtained from the true, rigorous one, are unavoidable. Such a "solution" is therefore meaningless, unless there is an estimate of the total error in the above sense.

J. von Neumann and H.H. Goldstine, Bull. Amer. Math. Soc. 53 (1947) 1021-99

Page 4: Physics Parameterisations

[Figure: physics parameterisations. Labels: Vegetation Model, Short-wave radiation, Long-wave radiation, Clouds, Convection, Precipitation, Surface Processes]

Page 5: Challenge: Demands on computer power

[Figure: Computing Resources shared between Resolution, Complexity, and Duration and/or Ensemble size; the figure also carries the label 1/120]

Page 6: Parallel Programming just got harder!

June 21, 2012

Moore’s Law: More not faster

Some cores are more equal than others. NUMA

AMD Interlagos

Heterogeneous Architectures: Accelerators

NVidia Fermi

Data parallel: cores MPI task

Scale to 2^30 heterogeneous cores?

Main memory is receding from view

Page 7: The Unified Model - software

Obs, Var, UM (+), IO server, ensembles and verification – more than 2 million lines of code

+ Coupled models excluding ocean and sea ice

UM used for both NWP and Climate Models

Now ~ 25 years old

Fortran90 (some F77 features remain)

Parallelism expressed via MPI

Some lower-level OpenMP (retro-fit)

IO server

MPI tasks dedicated to IO

Dramatic improvement in IO performance.


Page 8: Problems with a long-lat grid

At 25km resolution, grid spacing near poles = 75m

At 10km reduces to 12m!
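The drop from 75 m to 12 m follows from the geometry of the grid. The estimate below is a sketch under stated assumptions: a uniform latitude-longitude grid with spacing Δx at the equator, Earth radius a ≈ 6371 km, and a first row of points about one grid interval from the pole. The symbols a, θ (colatitude), Δλ and Δφ are introduced here for illustration and do not appear on the slide.

```latex
\[
\Delta x_{\mathrm{EW}}(\theta) = a \sin\theta \,\Delta\lambda ,
\qquad
\Delta\lambda = \Delta\phi = \frac{\Delta x}{a}
\]
\[
\theta \sim \Delta\phi
\;\Rightarrow\;
\Delta x_{\mathrm{pole}} \sim a\,\Delta\phi\,\Delta\lambda = \frac{(\Delta x)^2}{a}
\]
\[
\frac{(25\ \mathrm{km})^2}{6371\ \mathrm{km}} \approx 98\ \mathrm{m},
\qquad
\frac{(10\ \mathrm{km})^2}{6371\ \mathrm{km}} \approx 16\ \mathrm{m},
\qquad
\left(\frac{10}{25}\right)^2 \times 75\ \mathrm{m} = 12\ \mathrm{m}
\]
```

The estimate has the same order of magnitude as the quoted 75 m and 12 m; the exact value depends on where the first row of points sits. The quadratic scaling reproduces the quoted ratio: refining the grid by a factor of 2.5 shrinks the polar spacing by a factor of 6.25.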


3rd Gen dynamical core (ENDGame) improved scaling

Weak CFL ∆t↓ as ∆x↓ (implicit scheme)

Data parallel in 2-D
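The link between grid spacing and time step can be made concrete with the Courant number. This is a sketch that assumes an explicit scheme limited by a signal speed c. The value c ≈ 340 m/s (roughly the speed of sound) is an illustrative assumption, not a figure from the slides.

```latex
\[
C = \frac{c\,\Delta t}{\Delta x} \lesssim 1
\quad\Rightarrow\quad
\Delta t \lesssim \frac{\Delta x}{c}
\]
\[
\Delta x = 75\ \mathrm{m}:\ \Delta t \lesssim \frac{75}{340}\ \mathrm{s} \approx 0.22\ \mathrm{s},
\qquad
\Delta x = 12\ \mathrm{m}:\ \Delta t \lesssim \frac{12}{340}\ \mathrm{s} \approx 0.035\ \mathrm{s}
\]
```

An implicit scheme relaxes this limit, which appears to be the sense of "weak CFL" on the slide.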


Page 9: GungHo! - Working Together Harmoniously

Globally Uniform Next Generation Highly Optimized

Page 10: 5 Year Project

“To research, design and develop a new dynamical core suitable for operational, global and regional, weather and climate simulation on massively parallel computers of the size envisaged over the coming 20 years.”

To address (inter alia):

What should replace the lat-lon grid?

How to transport material on that grid?

Is an implicit time scheme viable/desirable on such computers?

Split into two phases:

2 years “research”

3 years “development”

Bath, Exeter, Imperial, Leeds, Manchester, Reading – NERC

STFC Daresbury and Met Office


Page 11: Choice of mesh

New dynamical core
Scalable to a very large number of elements
Choice of elements and mesh not fixed
Support for irregular elements in the horizontal

Structured mesh
Neighbours known by construction - stencil
Direct memory access

Unstructured mesh
Neighbours unknown
Look up table
Indirect memory access

Derivative operators

Page 12: Consequences for memory access

a(i) = c*b(nb_list(i))

do k = 1, nlevel
  a(k,i) = c*b(k,nb_list(i))
end do

Indirect memory access destroys data locality: poor cache utilisation, poor performance

Mesh is likely to be structured in the vertical: a horizontally unstructured, columnar mesh with the vertical index (k) innermost (contiguous in memory)
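The columnar idea can be sketched as a complete routine. This is a minimal illustration, not project code: the module name `indirect_mod`, the routine name `scale_neighbour` and the argument `ncol` are hypothetical, while `a`, `b`, `c`, `nb_list` and `nlevel` follow the slide's notation.

```fortran
! Sketch only: indirect_mod, scale_neighbour and ncol are hypothetical
! names; a, b, c, nb_list and nlevel follow the slide.
module indirect_mod
  implicit none
contains
  ! a(k,i) = c * b(k, nb_list(i)) for every column i of the mesh.
  subroutine scale_neighbour(nlevel, ncol, c, nb_list, b, a)
    integer, intent(in)  :: nlevel, ncol
    real,    intent(in)  :: c
    integer, intent(in)  :: nb_list(ncol)   ! look-up table: neighbour of column i
    real,    intent(in)  :: b(nlevel, ncol)
    real,    intent(out) :: a(nlevel, ncol)
    integer :: i, k, nb
    do i = 1, ncol
      nb = nb_list(i)             ! one indirect look-up per column
      do k = 1, nlevel
        a(k, i) = c * b(k, nb)    ! then nlevel contiguous (stride-1) accesses
      end do
    end do
  end subroutine scale_neighbour
end module indirect_mod
```

The cost of each look-up in `nb_list` is spread over `nlevel` contiguous accesses, which is the motivation for keeping the vertical index innermost.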

Page 13: Cache versus oversubscribed concurrency

Conventional CPU: cache-based memory model
Will node-level cache coherency continue?

GPU: thread teams (warps), fast switching
Naively, each thread owns an individual element
Vector memory access (coalesced): horizontal index (i) contiguous in memory

Page 14: ILP - vectorisation

Vectorisation is not limited to GPU-type machines
SIMD units on CPUs: SSE holds 2 64-bit words, AVX 4, SIMD on Intel MIC 8

Complex issue: Pickles and Porter 2012, NEMO (ocean code)
Compared two data layouts for 3D arrays and found that different operations favour different orderings
Possible to vectorise some operations in either layout

Vector friendly -- layer contiguous

Cache friendly -- column contiguous
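The two layouts can be compared side by side on the same operation. This is a sketch, and the module and routine names are hypothetical. It relies only on Fortran's column-major storage, in which the first array index is the contiguous one.

```fortran
! Sketch only: layout_mod and both routine names are hypothetical.
module layout_mod
  implicit none
contains
  ! Column contiguous ("cache friendly"): vertical index k comes first,
  ! so each column occupies consecutive memory locations.
  subroutine scale_column_contiguous(nlevel, ncol, c, b, a)
    integer, intent(in)  :: nlevel, ncol
    real,    intent(in)  :: c
    real,    intent(in)  :: b(nlevel, ncol)
    real,    intent(out) :: a(nlevel, ncol)
    integer :: i, k
    do i = 1, ncol
      do k = 1, nlevel        ! stride-1 inner loop over levels
        a(k, i) = c * b(k, i)
      end do
    end do
  end subroutine scale_column_contiguous

  ! Layer contiguous ("vector friendly"): horizontal index i comes first,
  ! so the inner loop runs over many columns of one layer.
  subroutine scale_layer_contiguous(nlevel, ncol, c, b, a)
    integer, intent(in)  :: nlevel, ncol
    real,    intent(in)  :: c
    real,    intent(in)  :: b(ncol, nlevel)
    real,    intent(out) :: a(ncol, nlevel)
    integer :: i, k
    do k = 1, nlevel
      do i = 1, ncol          ! stride-1 inner loop over columns
        a(i, k) = c * b(i, k)
      end do
    end do
  end subroutine scale_layer_contiguous
end module layout_mod
```

Both routines compute the same values. Only the storage order and the loop nesting differ, so the inner loop length is `nlevel` in one case and `ncol` in the other.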

Page 15: Overview

[Diagram, flattened here into a list]

Model input data
Grid data

Infrastructure (e.g. mctutils)
  halo_exchange()
  put(), get() – in-place coupling

(MPI) Program: parallelism management (MPI, threads)
  Read_(partitioned)_grid()
  call infrastructure_init()
    - Data and comms init
    - Including halo exchange init
    - and coupling exchange init
  call model_init()
    - e.g. Allocation
    - non in-place coupling
  model_timestep_control: call model_run()
  call model_finalise()

Model Science Code
  init()
  run()
  finalise()

Model data set-up
  e.g. Field descriptions
  Coupling requirements (‘tag’-based access)

There will be several models and programs
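The call sequence in the diagram can be sketched as a small driver. The routine names `infrastructure_init`, `model_init`, `model_run` and `model_finalise` come from the slide. Everything else is an assumption for illustration: the module name, `run_program`, the toy state and the placeholder bodies. The sketch uses no MPI and performs no real halo exchange or coupling.

```fortran
! Sketch only: routine names follow the slide; toy_model_mod, run_program
! and all bodies are placeholders (no MPI, halo exchange or coupling).
module toy_model_mod
  implicit none
  private
  public :: infrastructure_init, model_init, model_run, model_finalise
  public :: run_program
  real, allocatable :: field(:)         ! model state, allocated in model_init
  logical :: infra_ready = .false.
contains
  subroutine infrastructure_init()
    ! A real version would set up data decomposition, halo exchange
    ! and coupling exchange here.
    infra_ready = .true.
  end subroutine infrastructure_init

  subroutine model_init(npoints)
    integer, intent(in) :: npoints
    if (.not. infra_ready) error stop "call infrastructure_init first"
    allocate(field(npoints))
    field = 0.0
  end subroutine model_init

  subroutine model_run()
    field = field + 1.0                 ! stand-in for one step of science code
  end subroutine model_run

  subroutine model_finalise(checksum)
    real, intent(out) :: checksum
    checksum = sum(field)
    deallocate(field)
  end subroutine model_finalise

  ! Driver: the order of calls shown in the diagram.
  subroutine run_program(npoints, nsteps, checksum)
    integer, intent(in)  :: npoints, nsteps
    real,    intent(out) :: checksum
    integer :: step
    call infrastructure_init()
    call model_init(npoints)
    do step = 1, nsteps                 ! model_timestep_control
      call model_run()
    end do
    call model_finalise(checksum)
  end subroutine run_program
end module toy_model_mod
```

With 4 points and 3 steps, `run_program` returns a checksum of 12.0, since each point is incremented once per step.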

Page 16: Science Model

Computational Science (CS) workpackage - proposal
Met Office Software development project
Separation of concerns

Computational science performance code

Scientific and CS, performance code

Fortran 2003 + MPI + directives (OpenMP)
Do not exclude PGAS models (CAF) of single-sided comms

Page 17: Kernel API

Algorithm layer calls kernels
Parallel or serial: implemented in PSy layer
PSy layer calls compute for generic kernels -- defined interface

Hand code versus auto-generated

Misses opportunity for data re-use between kernels
Special kernels, e.g. Helmholtz, consist of smaller kernels which share halo exchange
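The layering can be sketched with three small modules. This is an illustration under assumptions, not the project's API: every module and routine name is hypothetical, and the PSy layer is written by hand and runs serially.

```fortran
! Sketch only: all names are hypothetical, not the project's actual API.
module kernel_mod             ! kernel: science on one column, no parallelism
  implicit none
contains
  subroutine axpy_kernel(nlevel, alpha, x, y)
    integer, intent(in)    :: nlevel
    real,    intent(in)    :: alpha, x(nlevel)
    real,    intent(inout) :: y(nlevel)
    y = y + alpha * x
  end subroutine axpy_kernel
end module kernel_mod

module psy_mod                ! PSy layer: owns the loop over columns
  use kernel_mod, only: axpy_kernel
  implicit none
contains
  subroutine invoke_axpy(nlevel, ncol, alpha, x, y)
    integer, intent(in)    :: nlevel, ncol
    real,    intent(in)    :: alpha, x(nlevel, ncol)
    real,    intent(inout) :: y(nlevel, ncol)
    integer :: i
    ! Serial here. A parallel version could add OpenMP or a halo exchange
    ! at this level without changing the kernel or the algorithm layer.
    do i = 1, ncol
      call axpy_kernel(nlevel, alpha, x(:, i), y(:, i))
    end do
  end subroutine invoke_axpy
end module psy_mod

module algorithm_mod          ! algorithm layer: whole-field operations only
  use psy_mod, only: invoke_axpy
  implicit none
contains
  subroutine forward_step(nlevel, ncol, dt, tendency, state)
    integer, intent(in)    :: nlevel, ncol
    real,    intent(in)    :: dt, tendency(nlevel, ncol)
    real,    intent(inout) :: state(nlevel, ncol)
    call invoke_axpy(nlevel, ncol, dt, tendency, state)
  end subroutine forward_step
end module algorithm_mod
```

Because each invoke handles one kernel, two consecutive kernels cannot share a loop or a halo exchange, which is the missed data re-use noted on the slide.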

Page 18: Infrastructure Model

Define infrastructure API to be used by models
Implementation neutral
Use infrastructure software models
Hide these implementations behind API
e.g. ESMF for halo exchange, MCT-OASIS for coupling to Ocean model (NEMO)
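An implementation-neutral API of this kind could be expressed with a Fortran 2003 abstract type. The sketch below is an assumption for illustration: `halo_exchanger_t` and `serial_periodic_t` are hypothetical names, and the stand-in implementation is a single-process periodic wrap-around that makes no ESMF or MPI calls.

```fortran
! Sketch only: halo_exchanger_t and serial_periodic_t are hypothetical.
! A real implementation might wrap ESMF or MPI behind the same interface.
module halo_api_mod
  implicit none
  private
  public :: halo_exchanger_t, serial_periodic_t

  ! The API that model code sees.
  type, abstract :: halo_exchanger_t
  contains
    procedure(exchange_iface), deferred :: exchange
  end type halo_exchanger_t

  abstract interface
    subroutine exchange_iface(self, field)
      import :: halo_exchanger_t
      class(halo_exchanger_t), intent(in) :: self
      real, intent(inout) :: field(0:)    ! one halo cell at each end
    end subroutine exchange_iface
  end interface

  ! Stand-in implementation: periodic wrap-around on a single process.
  type, extends(halo_exchanger_t) :: serial_periodic_t
  contains
    procedure :: exchange => serial_exchange
  end type serial_periodic_t
contains
  subroutine serial_exchange(self, field)
    class(serial_periodic_t), intent(in) :: self
    real, intent(inout) :: field(0:)
    integer :: n
    n = ubound(field, 1) - 1              ! owned cells are field(1:n)
    field(0)     = field(n)
    field(n + 1) = field(1)
  end subroutine serial_exchange
end module halo_api_mod
```

Model code would hold a `class(halo_exchanger_t)` variable and call `exchange` on it, so swapping the implementation needs no change on the model side.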

Page 19: Data Model

• Model has local data view + halos
• Data belongs to objects – fields
• Data objects contain
  • function space information – DoF of field
  • topological entity
• Algorithm layer cannot access raw DoF arrays
• Enables Mesh/topological entity/function space to be changed without large code changes
• Unpacked as arrays before passing to kernel (variable or fixed data size for kernel?)
• State object contains internal GH data
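A field object with hidden degree-of-freedom (DoF) data could look like the following Fortran 2003 sketch. All names (`field_t`, its components and its procedures) are hypothetical, and the function-space information is reduced to a DoF count per cell for brevity.

```fortran
! Sketch only: field_t, its components and procedures are hypothetical.
module field_mod
  implicit none
  private
  public :: field_t

  type :: field_t
    private
    integer :: ndof_per_cell = 0       ! function-space information
    integer :: ncell = 0               ! local cells, including halos
    real, allocatable :: dofs(:, :)    ! raw DoF array, hidden from callers
  contains
    procedure :: init
    procedure :: ndof
    procedure :: get_dofs
    procedure :: set_dofs
  end type field_t
contains
  subroutine init(self, ndof_per_cell, ncell)
    class(field_t), intent(inout) :: self
    integer, intent(in) :: ndof_per_cell, ncell
    self%ndof_per_cell = ndof_per_cell
    self%ncell = ncell
    if (allocated(self%dofs)) deallocate(self%dofs)
    allocate(self%dofs(ndof_per_cell, ncell))
    self%dofs = 0.0
  end subroutine init

  pure integer function ndof(self)
    class(field_t), intent(in) :: self
    ndof = self%ndof_per_cell * self%ncell
  end function ndof

  ! Unpack to a plain array, e.g. just before a kernel call.
  subroutine get_dofs(self, raw)
    class(field_t), intent(in) :: self
    real, allocatable, intent(out) :: raw(:, :)
    if (.not. allocated(self%dofs)) error stop "field not initialised"
    allocate(raw, source=self%dofs)
  end subroutine get_dofs

  ! Pack results back; the shape must match the stored array.
  subroutine set_dofs(self, raw)
    class(field_t), intent(inout) :: self
    real, intent(in) :: raw(:, :)
    if (.not. allocated(self%dofs)) error stop "field not initialised"
    if (any(shape(raw) /= shape(self%dofs))) error stop "shape mismatch"
    self%dofs = raw
  end subroutine set_dofs
end module field_mod
```

Code outside the module can reach the data only through `get_dofs` and `set_dofs`, so the storage layout can change without edits to the algorithm layer.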

Page 20: Summary

NWP and Climate models are complex problems
Key scientific driver for Exascale systems
Gung-Ho: complete redesign for UK Met Office
  mathematical formulation
  algorithm
  numerical analysis
  software

Personal view: What about the hardware?
Is there scope for co-design (wider project?)
Software and hardware working together harmoniously