philipp offenhauser¨...a dynamic load-balancing strategy for large scale cfd-applications philipp...

27
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenh ¨ auser 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Upload: others

Post on 02-Jun-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

A dynamic load-balancing strategy for large scale CFD-applications

Philipp Offenhauser

10.10.2017

1/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 2: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Outline

Motivation and Background

Load-Balancing strategy

Results

Further Challenges

Conclusion

2/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 3: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Motivation - Top500 from 1993 to 2017

1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 2016100

101

102

103

104

105

106

107

108

109

Year

Perform

ance

[GFlop/s]

Rmax Rank 1Rmax Rank 10

100

101

102

103

104

105

106

107

108

109

#Cores

#Cores Rank 1#Cores Rank 10

(Source: top500.org)

3/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 4: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Motivation

I HPC systems generate their power byfacilitating hundreds of thousands ofcores

I Massive parallel algorithms areimplemented

I Computational effort must bedistributed evenly across all cores

I Initial domain decompositiontechniques are well know

I Well balanced applications for scientificuse-cases are developed

1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 2016100

101

102

103

104

105

106

107

108

109

Year

Perform

ance

[GFlop/s]

Rmax Rank 1Rmax Rank 10

100

101

102

103

104

105

106

107

108

109

#Cores

#Cores Rank 1#Cores Rank 10

(Source: top500.org)

4/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 5: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Definitions

Element-weight:Specific computational effort of an element.

Load of a MPI-domain:Sum of the element-weights of a MPI-domain.

Load-imbalance:

Load-imbalanceL =maximum loadaverage load

Load-imbalanceT =maximum computation timeaverage computation time

5/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 6: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Definitions

Element-weight:Specific computational effort of an element.

Load of a MPI-domain:Sum of the element-weights of a MPI-domain.

Load-imbalance:

Load-imbalanceL =maximum loadaverage load

Load-imbalanceT =maximum computation timeaverage computation time

5/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 7: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Definitions

Element-weight:Specific computational effort of an element.

Load of a MPI-domain:Sum of the element-weights of a MPI-domain.

Load-imbalance:

Load-imbalanceL =maximum loadaverage load

Load-imbalanceT =maximum computation timeaverage computation time

5/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 8: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Example 1 - Finite-Volume-Code FS3D

References:Institute of Aerospace Thermodynamics - University of Stuttgarthttp://www.uni-stuttgart.de/itlr/forschung/tropfen/fs3d/index.php?lang=en&lang=en:w

6/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 9: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Examples for load-imbalance - Volume of Fluid Method

1 1 1 0,7 0

1 1 1 0,3 0

1 1 0,6 0 0

0,7 0,3 0 0 0

0 0 0 0 0

x

y

Source: ITLR

Treatment of multiple phases

f (x , t) =

0 liquid(0; 1) interface1 solid

I Reconstruction of the interfaceI Additional computational effort for the

interface elements

7/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 10: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Examples for load-imbalance - Volume of Fluid Method

1 1 1 0,7 0

1 1 1 0,3 0

1 1 0,6 0 0

0,7 0,3 0 0 0

0 0 0 0 0

x

y

Source: ITLR

Treatment of multiple phases

f (x , t) =

0 liquid(0; 1) interface1 solid

I Reconstruction of the interfaceI Additional computational effort for the

interface elements

7/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 11: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Example 2 - Discontinuous Galerkin Spectral ElementMethod (DG SEM) based code FLEXI

References:Institute of Aerodynamics and Gas Dynamics - University of StuttgartNumerical research group https://nrg.iag.uni-stuttgart.de/

FLEXI on github: https://github.com/flexi-framework/flexi

8/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 12: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

FLEXI - a high-order numerical framework

I Hight-order numerical approachI Massively parallel CFD-code

designed for HPC-systemsI DG SEM enables efficient

communication-pattern(communication hiding)

I FLEXI tested on differentHPC-systems: Hornet, HazelHen,JUQueen, Marconi

I FLEXI is used for very largeacademic use-cases

9/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 13: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Example - gas injection

I Numerical approach wasextended to simulate complexfluid-flow

I The FV-Sub-Cell-Method isused for shock capturing

I Code is used for industrial usecases

I Depending on the localsolution theFV-Sub-Cell-Method is turnedon/off

I FV-Sub-Cell-Method addslocal computational effort

10/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 14: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Local computational effort

T = 0,00 ms T = 0,25 ms T = 0,50 ms

11/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 15: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Motivation for dynamic load-balancing

0.95

1.00

1.05

1.10

1.15

1.20

1.25

1.30

10 20 30 40 50 60 70 80 90

load

/av

erag

elo

ad

core idLoad distribution of a test-case on 96 cores.

I Small number of cores are over-loadedI Huge number of cores are under-loadedI Performance of the application suffers from the overload on a small number of cores

12/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 16: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

A dynamic load-balancing strategy

Step 1: Calculate a new optimized loaddistribution

Step 2: Distribute the load betweenMPI-processes

Challenges:I Global optimization problemI Simple and fast calculation of the

optimized load distributionI Keep the communication overhead

small

Challenges:I Communication-structure originates

during run-timeI Communication-structure changes

during run-timeI Keep the communication overhead

small

13/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 17: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

A dynamic load-balancing strategy

Step 1: Calculate a new optimized loaddistribution

Step 2: Distribute the load betweenMPI-processes

Challenges:I Global optimization problemI Simple and fast calculation of the

optimized load distributionI Keep the communication overhead

small

Challenges:I Communication-structure originates

during run-timeI Communication-structure changes

during run-timeI Keep the communication overhead

small

13/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 18: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

A dynamic load-balancing strategy

Step 1: Calculate a new optimized loaddistribution

Step 2: Distribute the load betweenMPI-processes

Challenges:I Global optimization problemI Simple and fast calculation of the

optimized load distributionI Keep the communication overhead

small

Challenges:I Communication-structure originates

during run-timeI Communication-structure changes

during run-timeI Keep the communication overhead

small

13/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 19: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

A dynamic load-balancing strategy - approach

Core0

7

0

1 2

3 4 5

67

8 9

101112

1314

15

16 17

1819

20

21 22

23 24

25 26

27

2829

30 31 32 33

3435

36

37 38

39 40

41 42

43

4445

46 47

48

4950

515253

54 55

5657

58 59 60

61 62

63

Element-weight

w1

w2

Core0

7

0

1 2

3 4 5

67

8 9

101112

1314

15

16 17

1819

20

21 22

23 24

25 26

27

2829

30 31 32 33

3435

36

37 38

39 40

41 42

43

4445

46 47

48

4950

515253

54 55

5657

58 59 60

61 62

63

I Using a space-filling-curve for the domain-decompositionI Mapping the 3D-problem to a 1D-problemI Minimal communication (only one time)I Simple communication pattern

14/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 20: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

1. Calculate a new optimized load distribution

An optimization algorithm based on the element specific computational effort is used for theload-distribution.

∆cPnavg = min(|∆c

Pn−1avg + ∆Pn

1 |, |∆cPn−1avg + ∆Pn

2 |)

0.95

1.00

1.05

1.10

1.15

1.20

1.25

1.30

10 20 30 40 50 60 70 80 90

load

/av

erag

elo

ad

core id

Load distribution without load-balancing

0.95

1.00

1.05

1.10

1.15

1.20

1.25

1.30

10 20 30 40 50 60 70 80 90

load

/av

erag

elo

ad

core id

Load distribution with load-balancing

Load distribution of a test-case on 96 cores.

15/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 21: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Step 2: Distributing the load between MPI-processes

I Using again the 1D-structure of the space-filling-curveI Intra-node-communication via MPI-Shared-Memory-WindowI Shared-Memory used for communication and to store the results during the

load-balancingI MPI-communication only between nodes

16/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 22: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Results - Element distribution

T0 = 0,00ms T1 = 0,25ms T2 = 0,50ms

I Element distribution depends on the computational effortI Redistrubution of the elements every 1,000 timestepsI 10% reduction of the wall-time

17/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 23: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Dynamic load distribution

18/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 24: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Further Challenges - detection of load-imbalance at run-time

State of the art:I Different powerful performance measuring tools are availableI All tools need an instrumentation and produce overheadI Results of the measurement only available after simulationI No feedback loop to the application

Challenges:I Measurement has to be integrated in the applicationI Feedback loop to the applicationI Measurement of the load-imbalance with a minimal overhead

19/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 25: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Further Challenges - detection of load-imbalance at run-time

State of the art:I Different powerful performance measuring tools are availableI All tools need an instrumentation and produce overheadI Results of the measurement only available after simulationI No feedback loop to the application

Challenges:I Measurement has to be integrated in the applicationI Feedback loop to the applicationI Measurement of the load-imbalance with a minimal overhead

19/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 26: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Conclusion

I HPC systems generate their power by facilitating hundreds of thousands of coresI Massively parallel CFD-codes are implementedI Initial domain decomposition techniques are well knowI Well balanced CFD-applications

I Simulation of complex fluid flowI varying load distribution during run-timeI occurrence of the load distribution hard to predict spatially and chronologicallyI simulation specific load-imbalance occurs

I For efficient application execution dynamic load-balance strategies are indispensableI Efficient redistribution of the computational loadI Detection of load-imbalance with a very low overhead

20/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::

Page 27: Philipp Offenhauser¨...A dynamic load-balancing strategy for large scale CFD-applications Philipp Offenhauser¨ 10.10.2017 1/20 :: A dynamic load-balancing strategy for large scale

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

Thank you for your attention!

20/20 :: A dynamic load-balancing strategy for large scale CFD-applications :: 10.10.2017 ::