A new 3D parallel SPH scheme for free surface flows



Computers & Fluids 38 (2009) 1203–1217

Contents lists available at ScienceDirect

Computers & Fluids

journal homepage: www.elsevier.com/locate/compfluid


Angela Ferrari, Michael Dumbser *, Eleuterio F. Toro, Aronne Armanini
Department of Civil and Environmental Engineering, University of Trento, Via Mesiano 77, I-38050 Trento (TN), Italy

Article info

Article history:
Received 30 July 2008
Received in revised form 27 October 2008
Accepted 19 November 2008
Available online 3 December 2008

0045-7930/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.
doi:10.1016/j.compfluid.2008.11.012

* Corresponding author.

Abstract

We propose a new robust and accurate SPH scheme that correctly tracks complex three-dimensional non-hydrostatic free surface flows and, even more importantly, computes an accurate pressure field with few oscillations. It uses the explicit third order TVD Runge–Kutta scheme in time, following Shu and Osher [Shu C-W, Osher S. Efficient implementation of essentially non-oscillatory shock-capturing schemes. J Comput Phys 1988;89:439–71], together with the new key idea of introducing a monotone upwind flux for the density equation, thus removing any artificial viscosity term. For the discretization of the velocity equation, the non-diffusive central flux has been used. A new flexible approach to impose the boundary conditions at solid walls is also proposed. It can handle any moving rigid body with arbitrarily irregular geometry. It neither produces oscillations in the fluid pressure in proximity of the interfaces, nor has a restrictive impact on the stability condition of the explicit time stepping method, unlike the repellent boundary forces of Monaghan [Monaghan JJ. Simulating free surface flows with SPH. J Comput Phys 1994;110:399–406]. To assess the accuracy of the new SPH scheme, a 3D mesh-convergence study is performed for the strongly deforming free surface in a 3D dam-break and impact-wave test problem, providing very good results.

Moreover, the parallelization of the new 3D SPH scheme has been carried out using the message passing interface (MPI) standard, together with a dynamic load balancing strategy to improve the computational efficiency of the scheme. Thus, simulations involving millions of particles can be run on modern massively parallel supercomputers with very good performance, as confirmed by a speed-up analysis. The 3D applications consist of environmental flow problems, such as dam-break flows and impact flows against a wall. The numerical solutions obtained with our new 3D SPH code have been compared either with experimental results or with other numerical reference solutions, obtaining in all cases a very satisfactory agreement.

© 2008 Elsevier Ltd. All rights reserved.

1. Introduction

1.1. Governing equations

The smooth particle hydrodynamics (SPH) method is a meshless scheme, initially developed by Lucy [20] and Gingold and Monaghan [13] for astrophysical applications and subsequently extended to large strain solid mechanics [17,5,15,18] as well as free surface flow problems [22,23].

In the SPH method, the continuum is discretized by a finite set of discrete values defined at interpolation points. Following the Lagrangian approach, each point moves with the material velocity of the continuum and carries physical properties, such as density, pressure and velocity. In this paper, we focus on the Navier–Stokes equations to simulate the three-dimensional free surface non-shallow water flow of a slightly compressible fluid,


$$\frac{d\rho}{dt} = -\rho\,\vec{\nabla}\cdot\vec{v}, \tag{1}$$

$$\frac{d\vec{v}}{dt} = \frac{1}{\rho}\,\vec{\nabla}\cdot\sigma + \vec{S}, \tag{2}$$

where (1) and (2) are the mass and momentum balance in non-conservative form. Moreover, the position of each infinitesimal fluid element is governed by

$$\frac{d\vec{x}}{dt} = \vec{v}. \tag{3}$$

The term $\frac{d}{dt}$ denotes the total derivative that follows the motion of the fluid. The variables are the density $\rho$, the velocity $\vec{v}$, and the position $\vec{x}$. The vector $\vec{S}$ represents the body forces per unit mass and typically consists of the gravity acceleration $\vec{g}$. The term $\sigma$ denotes the total stress tensor, defined as

$$\sigma = -pI + \tau, \tag{4}$$

where $p$ is the pressure, $I$ is the unit tensor and $\tau$ is the viscous stress tensor. For an ideal fluid the stress is independent of the deformation and (4) reduces to $\sigma = -pI$. To model the flow of weakly compressible Newtonian fluids, instead, the following linear stress–strain rate relationship has to be introduced in (4):

$$\sigma = \left(-p + \mu'\,\vec{\nabla}\cdot\vec{v}\right) I + 2\mu D, \qquad \mu' = -\frac{2}{3}\mu, \tag{5}$$

where the coefficient $\mu$ is the dynamic viscosity of the fluid and $D$ is the rate-of-strain tensor, $D = \frac{1}{2}\left[\nabla\vec{v} + (\nabla\vec{v})^T\right]$. To close the system of equations, we need an equation of state (EOS). Following the classical SPH approach, the most frequently used EOS is one derived from a relation proposed by Batchelor [3] for water. It is the so-called Tait equation, which computes a relative pressure field assuming that the atmospheric reference pressure is equal to zero:

$$p = k_0\left[\left(\frac{\rho}{\rho_0}\right)^{\gamma} - 1\right]. \tag{6}$$

Here, $\rho_0$ is the reference fluid density at atmospheric pressure, and the exponent is typically taken as $\gamma = 7$. Eq. (6) approximates the liquid as a weakly compressible fluid. The coefficient $k_0$ determines the speed of sound and is chosen so that the density fluctuations become negligible. Numerical experiments confirm that the density variations are consistent with the incompressible fluid assumption for Mach numbers $M \le 10^{-1}$ [23].
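As an illustration of how Eq. (6) ties the coefficient $k_0$ to the speed of sound, the following minimal Python sketch evaluates the Tait pressure and the sound speed $c = \sqrt{dp/d\rho}$. The reference density of 1000 kg/m^3, the target sound speed of 100 m/s and all function names are assumptions made for this example only; they are not values or routines taken from the paper.

```python
import math

GAMMA = 7.0     # exponent in Eq. (6), value quoted in the text
RHO0 = 1000.0   # reference density in kg/m^3 (assumed value for water)


def stiffness_from_sound_speed(c0, rho0=RHO0, gamma=GAMMA):
    """Return k0 such that the sound speed at rho = rho0 equals c0."""
    return rho0 * c0 ** 2 / gamma


def tait_pressure(rho, k0, rho0=RHO0, gamma=GAMMA):
    """Relative pressure from Eq. (6): p = k0 * ((rho / rho0)**gamma - 1)."""
    return k0 * ((rho / rho0) ** gamma - 1.0)


def sound_speed(rho, k0, rho0=RHO0, gamma=GAMMA):
    """Sound speed c = sqrt(dp/drho) obtained by differentiating Eq. (6)."""
    return math.sqrt(gamma * k0 / rho0 * (rho / rho0) ** (gamma - 1.0))


# Hypothetical choice: c0 is ten times an expected flow speed of 10 m/s,
# which corresponds to a Mach number of 0.1.
k0 = stiffness_from_sound_speed(c0=100.0)
```

With this choice the pressure vanishes at the reference density, and a density increase of 1% already produces a pressure of roughly 7% of $k_0$, which shows how stiff the equation of state is.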

1.2. The SPH formulation of Gingold and Monaghan

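The oldest and most common formulation to discretize an ideal fluid was proposed by Gingold and Monaghan in [13]:

$$\frac{d\rho_i}{dt} = -\sum_{j=1}^{N} m_j\left(\vec{v}_j - \vec{v}_i\right)\cdot\vec{\nabla}_i W_{ij}, \tag{7}$$

$$\frac{d\vec{v}_i}{dt} = -\sum_{j=1}^{N} m_j\left(\frac{p_i}{\rho_i^2} + \frac{p_j}{\rho_j^2} + \Pi_{ij}\right)\vec{\nabla}_i W_{ij} + \vec{S}_i, \tag{8}$$

$$\frac{d\vec{x}_i}{dt} = \vec{v}_i, \tag{9}$$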

where the subscript $i$ denotes the $i$th particle and $\vec{\nabla}_i W_{ij}$ is the gradient of the interpolating kernel function centered at $\vec{x}_i$ with respect to the location $\vec{x}_j$. In all our applications we use the cubic B-spline, defined as follows:

$$W_{ij} = \frac{c}{h_{ij}^{\nu}}\begin{cases} \frac{2}{3} - q_{ij}^2 + \frac{1}{2}q_{ij}^3, & \text{if } 0 \le q_{ij} < 1,\\ \frac{1}{6}\left(2 - q_{ij}\right)^3, & \text{if } 1 \le q_{ij} \le 2,\\ 0, & \text{if } q_{ij} > 2, \end{cases} \tag{10}$$

where $q_{ij}$ is the relative distance defined for each pair of points $i$ and $j$ as $q_{ij} = |\vec{x}_j - \vec{x}_i| / h_{ij}$, and $c$ is a normalization constant chosen so that the integral of the kernel satisfies $\int W_{ij}\,dV = 1$. The so-called smoothing length $h_{ij}$ is locally variable and is averaged between the interacting particles $i$ and $j$:

$$h_{ij} = \frac{1}{2}\left(h_i + h_j\right) \quad\text{with}\quad h_i = \sqrt[\nu]{\frac{m_i}{\rho_i}}. \tag{11}$$

Here, $\nu$ denotes the number of space dimensions, i.e. $\nu = 3$ for our applications in 3D.
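The kernel (10) and the smoothing length (11) can be sketched in a few lines of Python. The normalization constants used below (1 in 1D, 15/(7π) in 2D, 3/(2π) in 3D) are the standard values for this form of the cubic spline; they are not quoted in the text and are therefore an assumption of this sketch, as are the function names. The last function checks numerically that the 3D kernel integrates to one.

```python
import math


def cubic_spline(r, h, dim=3):
    """Cubic B-spline kernel of Eq. (10) for distance r and smoothing length h."""
    # Standard normalization constants for this form of the spline (assumed).
    norm = {1: 1.0, 2: 15.0 / (7.0 * math.pi), 3: 3.0 / (2.0 * math.pi)}[dim]
    q = r / h
    if q < 1.0:
        w = 2.0 / 3.0 - q * q + 0.5 * q ** 3
    elif q <= 2.0:
        w = (2.0 - q) ** 3 / 6.0
    else:
        w = 0.0
    return norm * w / h ** dim


def smoothing_length(mass, rho, dim=3):
    """h_i from Eq. (11): the dim-th root of the particle volume m_i / rho_i."""
    return (mass / rho) ** (1.0 / dim)


def kernel_integral_3d(h, n=2000):
    """Midpoint-rule integral of the 3D kernel over its support of radius 2h."""
    dr = 2.0 * h / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * dr
        total += cubic_spline(r, h, 3) * 4.0 * math.pi * r * r * dr
    return total
```

The compact support of radius $2h$ is what limits the particle sums to a small set of neighbors.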

An artificial viscosity term $\Pi_{ij}$ is added to the evolution equation of the velocity (8). It produces a repulsive force between the particles when two points approach each other:

$$\Pi_{ij} = \begin{cases} \dfrac{\mu_{ij}\left(\beta\mu_{ij} - \alpha\bar{c}_{ij}\right)}{\bar{\rho}_{ij}} & \text{if } \vec{v}_{ij}\cdot\vec{n}_{ij} < 0,\\ 0 & \text{if } \vec{v}_{ij}\cdot\vec{n}_{ij} \ge 0, \end{cases} \tag{12}$$

where $\alpha$ and $\beta$ are two parameters, $\bar{\rho}_{ij}$ and $\bar{c}_{ij}$ are the density and celerity of the fluid, both averaged between the locations $\vec{x}_i$ and $\vec{x}_j$, $\vec{n}_{ij} = \frac{\vec{x}_j - \vec{x}_i}{|\vec{x}_j - \vec{x}_i|}$ is the unit vector pointing from the $i$th to the $j$th point and, similarly, $\vec{v}_{ij} = \vec{v}_j - \vec{v}_i$ is the velocity difference. The term $\mu_{ij}$ is defined as

$$\mu_{ij} = h_{ij}\,\frac{\vec{v}_{ij}\cdot\vec{n}_{ij}}{|\vec{x}_j - \vec{x}_i|}. \tag{13}$$

Unfortunately, expression (12) contains two parameters, $\alpha$ and $\beta$, which require a careful numerical calibration and whose values usually depend on the test case.
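For a single particle pair, Eqs. (12) and (13) translate into the short Python sketch below. The function name and argument list are hypothetical, and the default values of `alpha` and `beta` are merely placeholders that a user would have to calibrate, which is exactly the drawback noted above.

```python
import numpy as np


def artificial_viscosity(x_i, x_j, v_i, v_j, h_ij, c_bar, rho_bar,
                         alpha=0.01, beta=0.0):
    """Pi_ij of Eq. (12); it is nonzero only when particles i and j approach."""
    dx = np.asarray(x_j, float) - np.asarray(x_i, float)
    dist = np.linalg.norm(dx)
    n_ij = dx / dist                       # unit vector from i to j
    v_ij = np.asarray(v_j, float) - np.asarray(v_i, float)
    approach = float(np.dot(v_ij, n_ij))   # negative when the pair approaches
    if approach >= 0.0:
        return 0.0
    mu_ij = h_ij * approach / dist         # Eq. (13)
    return mu_ij * (beta * mu_ij - alpha * c_bar) / rho_bar
```

For an approaching pair $\mu_{ij}$ is negative, so with $\beta = 0$ the term $\Pi_{ij}$ is positive and acts like an additional pressure between the two particles.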

Since the SPH approach reduces the original continuous partial differential equations (PDEs) to sets of ordinary differential equations (ODEs), any stable time stepping algorithm for ODEs can be used [25]. Different time discretizations have been applied to the SPH method based on [13]. Oger et al. [29] propose to apply an explicit third order Runge–Kutta ODE integrator. A comparative study with Runge–Kutta methods of different orders demonstrated the advantages of methods of order greater than two: the increase in CPU time due to the larger number of right-hand-side evaluations is more than balanced by the larger maximum admissible time step of the explicit scheme, resulting in a net decrease of total CPU time [29]. A fourth order Runge–Kutta scheme together with a periodic re-initialization of the density field based on a moving-least-squares (MLS) interpolation [4] has been implemented by Colagrossi and Landrini [6]. It improves the stability properties of the scheme in comparison with the modified Euler and leap-frog time integration [6], but it is quite expensive from the computational point of view.

In one space dimension and setting $\Pi_{ij} = 0$, the SPH approach of Gingold and Monaghan can be interpreted as an explicit central finite difference (FD) scheme. Applying the scheme to the linear scalar advection equation, one can prove that the method is not monotone, and a von Neumann stability analysis shows that the method is unconditionally unstable with explicit Euler time stepping [11]. A very detailed von Neumann analysis for the original SPH method (7)–(9) applied to the full Euler equations has been carried out by Balsara [2]. Note that the linear instability is different from the tensile instability reported in [32,24], which is another numerical problem inherent in the original SPH approach. Finally, we also underline that the SPH scheme of Gingold and Monaghan is not even zeroth order consistent, which means that the method is not able to reproduce a constant exactly, which usually is a fundamental property of any finite difference, finite volume or finite element scheme.
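The instability statement can be illustrated on the simplest uniform-grid analogue. The short derivation below is a sketch for the central difference scheme with explicit Euler time stepping applied to $u_t + a u_x = 0$ on a grid of spacing $\Delta x$. The symbols $C$, $\theta$ and $G$ are introduced here for illustration only; the SPH-specific analysis is the one given in [11].

```latex
u_j^{n+1} = u_j^n - \frac{C}{2}\left(u_{j+1}^n - u_{j-1}^n\right),
\qquad C = \frac{a\,\Delta t}{\Delta x}.

% Insert the Fourier mode u_j^n = G^n e^{\mathrm{i} j \theta}:
G = 1 - \mathrm{i}\,C\sin\theta,
\qquad |G|^2 = 1 + C^2\sin^2\theta \;\ge\; 1 .
```

Since $|G| > 1$ for every mode with $\sin\theta \neq 0$ and any $\Delta t > 0$, no time step restriction can stabilize this combination of central differences and explicit Euler time stepping.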

1.3. The SPH formulation of Ben Moussa and Vila

An alternative and more general formulation of the SPH scheme derives from the work of Ben Moussa and Vila [33,28,27], who use Riemann solvers to evaluate a numerical flux between each pair of interacting particles $i$ and $j$. The semi-discrete form of this alternative SPH approach is written in conservation form for an ideal fluid as follows:

$$\frac{d\left(\omega\vec{Q}\right)_i}{dt} = -\sum_{j=1}^{N}\omega_i\omega_j\,2G_{ij}\cdot\vec{\nabla}_i W_{ij} + \omega_i\rho_i\vec{S}_i, \tag{14}$$

$$\frac{d\omega_i}{dt} = -\sum_{j=1}^{N}\omega_i\omega_j\left(\vec{v}_j - \vec{v}_i\right)\cdot\vec{\nabla}_i W_{ij}, \tag{15}$$

$$\frac{d\vec{x}_i}{dt} = \vec{v}_i, \tag{16}$$

where the vector $\vec{Q} = (\rho, \rho v_x, \rho v_y, \rho v_z)$ and $G_{ij}$ denote the vector of conserved variables and the numerical flux tensor, respectively.


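In (14) and (15) the weights $\omega$ represent the volume of the particles and evolve in time, taking into account deformations due to the velocity field. The numerical flux $G_{ij}$ in (14) depends on the choice of the Riemann solver. In general, a monotone upwind flux is used. Using the Godunov flux [14] based on the exact Riemann solver, the numerical flux tensor $G_{ij}$ is evaluated as follows, with $\vec{v}_{ij} = \frac{1}{2}\left(\vec{v}_j + \vec{v}_i\right)$ here denoting the mean velocity of the pair:

$$G_{ij} = F_{ij}\left(\vec{Q}_E\right) - \vec{Q}_E\otimes\vec{v}_{ij}, \qquad F_{ij}\left(\vec{Q}\right) = \begin{pmatrix}\rho\,\vec{v}^T\\ pI + \rho\,\vec{v}\otimes\vec{v}\end{pmatrix}. \tag{17}$$

The term $\vec{Q}_E$ denotes the exact solution of the associated classical one-dimensional Riemann problem between each pair of particles $i$ and $j$, computed along the direction $\vec{n}_{ij}$ connecting $\vec{x}_i$ with $\vec{x}_j$, where $\xi = \vec{x}\cdot\vec{n}_{ij}$:

$$\begin{cases}\dfrac{\partial}{\partial t}\vec{Q} + \dfrac{\partial}{\partial\xi}\left[\left(F\left(\vec{Q}\right) - \vec{Q}\otimes\vec{v}\right)\cdot\vec{n}_{ij}\right] = 0,\\[2ex] \vec{Q}(\xi, 0) = \begin{cases}\vec{Q}_i & \text{if } \xi \le 0,\\ \vec{Q}_j & \text{if } \xi > 0.\end{cases}\end{cases} \tag{18}$$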

Using, for example, the simpler Rusanov flux, one obtains

$$G_{ij} = \frac{1}{2}\left(H\left(\vec{Q}_j\right) + H\left(\vec{Q}_i\right)\right) - \frac{c_{ij}}{2}\left(\vec{Q}_j - \vec{Q}_i\right)\otimes\vec{n}_{ij}, \tag{19}$$

where the term $c_{ij}$ corresponds to the maximal celerity between the two interacting particles $i$ and $j$, defined as

$$c_{ij} = \max(c_i, c_j), \qquad c_i = \sqrt{\gamma\,\frac{k_0}{\rho_0}\left(\frac{\rho_i}{\rho_0}\right)^{\gamma - 1}}, \tag{20}$$

and the flux $H$ is the flux tensor of the Euler equations in a reference frame moving with the fluid velocity $\vec{v}$:

$$H\left(\vec{Q}\right) = F\left(\vec{Q}\right) - \vec{Q}\otimes\vec{v} = \begin{pmatrix}0\\ pI\end{pmatrix}. \tag{21}$$

In this approach the artificial viscosity term $\Pi_{ij}$ in the momentum conservation law (8) is replaced by an intrinsic numerical viscosity, automatically contained in the numerical flux $G_{ij}$. This means that this SPH scheme requires no calibration of parameters depending on the test case. Very recently, for the case of scalar nonlinear hyperbolic conservation laws in bounded domains, it has been rigorously proven by Ben Moussa [27] that using monotone fluxes $G_{ij}$ (so-called e-fluxes) the SPH scheme (14)–(16) is L1 stable under a CFL condition and converges to the unique entropy solution of the conservation law. This is a very strong theoretical result. A linear von Neumann stability analysis for the SPH method of Ben Moussa and Vila applied to the linear scalar advection equation can be found in [11].
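To make the structure of the Rusanov flux (19) concrete, the following Python sketch assembles it as a 4 x 3 array for one particle pair in 3D. It reads the product of the jump $\vec{Q}_j - \vec{Q}_i$ with $\vec{n}_{ij}$ as a tensor (outer) product, which is an interpretation of the notation rather than something stated explicitly in the text, and the function names are hypothetical.

```python
import numpy as np


def moving_frame_flux(p):
    """H(Q) of Eq. (21): a zero mass-flux row above the pressure tensor p*I."""
    H = np.zeros((4, 3))
    H[1:, :] = p * np.eye(3)
    return H


def rusanov_flux(Q_i, Q_j, p_i, p_j, c_i, c_j, n_ij):
    """Numerical flux G_ij of Eq. (19) between particles i and j."""
    c_ij = max(c_i, c_j)                               # Eq. (20)
    jump = np.asarray(Q_j, float) - np.asarray(Q_i, float)
    central = 0.5 * (moving_frame_flux(p_j) + moving_frame_flux(p_i))
    # Dissipative part: proportional to the jump of the conserved variables.
    dissipation = 0.5 * c_ij * np.outer(jump, np.asarray(n_ij, float))
    return central - dissipation
```

For two identical states the dissipative part vanishes and the flux reduces to the pure pressure flux $H$; any jump in the conserved variables is penalized in proportion to the local celerity.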

1.4. Elliptical drop test case

Both approaches have been applied to a simple test problem in order to verify and compare their accuracy. This two-dimensional test case, originally proposed by Monaghan in [23], consists of the evolution of a circular drop with an initial velocity field that is a linear function of $x$ and $y$,

$$\vec{v}|_{t=0} = (-100x, +100y)^T. \tag{22}$$

Initially, the particles are arranged in the form of a circular drop of radius R = 1 m, which evolves in time into a narrow ellipse with semi-axes $a$ and $b$ [23]. Assuming a perfectly incompressible fluid, one obtains the condition that the product $a \cdot b$ must remain constant. No gravity force is involved and no interaction with solid boundaries takes place. The number of particles is 1858 and the pressure is computed using a coefficient $k_0$ in the Tait equation that gives a speed of sound of 1400 m/s [23]. The parameters of the artificial viscosity term in (12) have been taken as $\alpha = 0.01$ and $\beta = 0$, according to [23].

In Fig. 1 the particle positions and the pressure field are plotted. The solutions of both SPH formulations are compared with the analytic one given in [23]. The approach of Monaghan is very accurate in computing the particle positions, but the numerical pressure profile is highly oscillatory and thus unacceptable. We suspect that the cause of these unphysical spurious pressure fluctuations lies in the non-monotonicity of the original SPH method, which produces a highly oscillatory density field, although these high frequency density fluctuations are less than 1%, as reported in [23]. These unphysical high frequency density fluctuations are then further amplified in the pressure field by the stiff equation of state. Of course, one would expect some variations in density due to the weak compressibility effects that have been included in the SPH simulations; however, one would expect a rather monotone density field and, as a consequence, also a monotone pressure field. Instead, the pressure field obtained with the Gingold and Monaghan formulation contains a lot of unphysical high frequency noise, which is a well-known phenomenon in the SPH literature [6].

In contrast, the Ben Moussa and Vila approach using the monotone Rusanov flux produces a monotone pressure field. Unfortunately, it is too diffusive to compute the particle positions correctly and hence it cannot be used to simulate violent free surface flows, which are the focus of our attention. Even the Godunov flux, based on the exact Riemann solver, does not remedy the excess of numerical viscosity inherent in the SPH formulation of Ben Moussa and Vila.

In conclusion, neither of the two above-mentioned SPH formulations is able to produce accurate solutions simultaneously for the pressure field and for the free surface, so the development of an alternative approach seems to be necessary.

1.5. Outline of the paper

Therefore, a new robust and accurate formulation of the SPH scheme is proposed in this paper (Section 2). A new method of implementing the solid wall boundary conditions is presented in Section 3. It produces accurate pressure fields with few oscillations in proximity of fluid-solid interfaces and uses a new flexible approach that is able to treat any complex geometry. Section 4 deals with the parallelization of the scheme using the message passing interface (MPI) paradigm as well as the implementation of a dynamic load-balancing strategy based on the METIS library [16]. The performance of the parallel scheme is demonstrated via a speed-up analysis. In Section 5, the new SPH scheme is applied to many three-dimensional test cases that are of relevance for environmental flow problems, such as dam break and impact problems. In particular, in Section 5.1 the results of a 3D mesh-convergence study are presented for a 3D dam-break wave-impact problem. Finally, in Section 6 the new SPH scheme is discussed with comments and remarks.

2. A new SPH formulation

Fig. 1. The Gingold and Monaghan SPH version with explicit third order Runge–Kutta time stepping (left) and the Vila SPH version (right) applied to the elliptical drop test case. The particle positions (top; axes x and y) and the pressure profiles (bottom; p versus y) are shown. Numerical results (symbols) and analytic (line) solutions are compared at time t = 0.0076 s.

The new SPH approach proposed in this paper uses an explicit third order Runge–Kutta scheme in time to improve the linear stability of the method, following a similar time evolution procedure of Oger et al. [29]. Since the high order TVD Runge–Kutta time discretization introduced by Shu and Osher [31] has been extremely successful in solving non-linear hyperbolic conservation laws with total variation diminishing (TVD) finite difference and finite volume schemes, the third order TVD Runge–Kutta scheme [31] has been implemented. However, the explicit high order time integration is not enough to stabilize the numerical pressure field. Similar considerations about the stability properties are reported in [29]. A further stabilization is therefore absolutely necessary.
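For reference, one step of the third order TVD Runge–Kutta scheme can be written as a convex combination of three explicit Euler stages. The Python sketch below uses the standard three-stage form of this scheme for a generic ODE system; the function name, the list-based state and the `rhs(u, t)` callback are illustrative assumptions and do not reproduce the authors' implementation.

```python
def tvd_rk3_step(u, t, dt, rhs):
    """One step of the third order TVD Runge-Kutta scheme.

    u is a list of floats and rhs(u, t) returns du/dt as a list of equal length.
    """
    def combine(a, x, b, y):
        # Component-wise linear combination a*x + b*y.
        return [a * xi + b * yi for xi, yi in zip(x, y)]

    k = rhs(u, t)
    u1 = combine(1.0, u, dt, k)                           # first Euler stage
    k = rhs(u1, t + dt)
    u2 = combine(0.75, u, 0.25, combine(1.0, u1, dt, k))  # second stage
    k = rhs(u2, t + 0.5 * dt)
    return combine(1.0 / 3.0, u, 2.0 / 3.0, combine(1.0, u2, dt, k))


# Usage on the scalar test equation du/dt = -u with u(0) = 1.
u_new = tvd_rk3_step([1.0], 0.0, 0.1, lambda u, t: [-x for x in u])
```

In an SPH code the state `u` would collect the densities, velocities and positions of all particles, and `rhs` would evaluate the particle sums of the chosen formulation.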

The new key idea consists of introducing a monotone upwind flux, following directly the Ben Moussa and Vila approach [33], but only for the density equation. For the velocity equation, the pure non-diffusive central flux without any artificial viscosity has been applied. In this way, the flux is split into two parts. One is associated with the advection of the flow and is evaluated using an upwind flux. The other part of the flux is due to the propagation of acoustic pressure waves and is discretized with a central flux, since there is no preferred direction for the propagation of pressure waves in subsonic flows. This approach recalls the Advection Upstream Splitting Method (AUSM), developed by Liou and Steffen [19] in the finite volume framework.

We underline that our new approach needs no artificial viscosity term, which would require the careful calibration of two parameters depending on the test case; only the intrinsic numerical viscosity of the monotone upwind flux in the density equation is used to stabilize the pressure field.

The new approach can be applied to both SPH formulations shown in [13,33]. The original SPH scheme [13] for ideal fluid flow is rewritten as follows:

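$$\frac{d\rho_i}{dt} = -\sum_{j=1}^{N} m_j\left(\left(\vec{v}_j - \vec{v}_i\right)\cdot\vec{\nabla}_i W_{ij} - \vec{n}_{ij}\cdot\vec{\nabla}_i W_{ij}\,\frac{c_{ij}}{\rho_j}\left(\rho_j - \rho_i\right)\right), \tag{23}$$

$$\frac{d\vec{v}_i}{dt} = -\sum_{j=1}^{N} m_j\left(\frac{p_i}{\rho_i^2} + \frac{p_j}{\rho_j^2}\right)\vec{\nabla}_i W_{ij} + \vec{S}_i, \tag{24}$$

$$\frac{d\vec{x}_i}{dt} = \vec{v}_i, \tag{25}$$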

where the term $c_{ij}$ denotes again the maximal celerity between the two interacting particles $i$ and $j$, as in (20).
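The density equation (23) can be sketched by direct summation over all particles, as in the Python example below. It rests on several assumptions that are not taken from the paper: a single constant smoothing length `h` instead of the pairwise average, the 3D cubic spline normalization 3/(2π), the reading of $\vec{\nabla}_i W_{ij}$ as the gradient of $W(|\vec{x}_i - \vec{x}_j|)$ with respect to $\vec{x}_i$, and a brute-force loop in place of a neighbor list. All names are hypothetical.

```python
import numpy as np


def grad_kernel(x_i, x_j, h):
    """Gradient with respect to x_i of the 3D cubic B-spline W(|x_i - x_j|, h)."""
    d = x_i - x_j
    r = np.linalg.norm(d)
    q = r / h
    if r == 0.0 or q > 2.0:
        return np.zeros(3)
    if q < 1.0:
        dw = -2.0 * q + 1.5 * q * q      # derivative of 2/3 - q^2 + q^3/2
    else:
        dw = -0.5 * (2.0 - q) ** 2       # derivative of (2 - q)^3 / 6
    return 3.0 / (2.0 * np.pi) * dw / h ** 4 * d / r


def density_rate(i, x, v, rho, m, c, h):
    """Right-hand side of Eq. (23) for particle i by direct summation."""
    rate = 0.0
    for j in range(len(x)):
        if j == i:
            continue
        gw = grad_kernel(x[i], x[j], h)
        n_ij = (x[j] - x[i]) / np.linalg.norm(x[j] - x[i])
        c_ij = max(c[i], c[j])
        advective = np.dot(v[j] - v[i], gw)                  # central part
        upwind = np.dot(n_ij, gw) * c_ij / rho[j] * (rho[j] - rho[i])
        rate -= m[j] * (advective - upwind)
    return rate
```

With equal velocities, a neighbor of higher density raises the density of particle `i` and a neighbor of lower density reduces it, so the additional term smooths density jumps, which is the stabilizing effect described above.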

In the Vila formulation [33], the numerical flux is simply replaced by

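$$G_{ij} = \begin{pmatrix} -\frac{1}{2}c_{ij}\left(\rho_j - \rho_i\right)\\ \frac{1}{2}\left(p_i + p_j\right) I\end{pmatrix}\cdot\vec{n}_{ij}. \tag{26}$$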

We remark that introducing the Rusanov flux in the density equation corresponds to the introduction of an explicit penalty term for density jumps. This penalization of density fluctuations is consistent with the physics of nearly incompressible flow, where the density variations should tend to zero anyway for $M \to 0$. When applying the new SPH approach, both formulations become very similar. In the following, we refer to the new SPH approach defined by (23)–(25), because it is computationally faster, since no evolution equation for the weight $\omega$ must be solved.

In a second step, the rheology of a Newtonian fluid (5) has been implemented. It requires integral approximations of the second derivative. The dissipative terms have been discretized according to the approach proposed by Español and Revenga [10] and are added to the momentum conservation law (24) as follows:

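$$\frac{d\vec{v}_i}{dt} = -\sum_{j=1}^{N}\vec{F}^I_{ij} - \sum_{j=1}^{N}\vec{F}^V_{ij} + \vec{S}_i, \tag{27}$$

where the vectors $\vec{F}^I_{ij}$ and $\vec{F}^V_{ij}$ denote the inviscid and the viscous components of the flux,

$$\vec{F}^I_{ij} = m_j\left(\frac{p_i}{\rho_i^2} + \frac{p_j}{\rho_j^2}\right)\vec{\nabla}_i W_{ij}, \tag{28}$$

$$\vec{F}^V_{ij} = \left(\frac{7}{3}\frac{\mu}{\rho_i}\,m_j\,\frac{\vec{v}_j - \vec{v}_i}{\rho_j} + \frac{5}{3}\frac{\mu}{\rho_i}\,m_j\,\frac{\vec{n}_{ij}}{\rho_j}\,\vec{n}_{ij}\cdot\left(\vec{v}_j - \vec{v}_i\right)\right) H_{ij}, \tag{29}$$

$$H_{ij} = -\frac{\vec{n}_{ij}}{|\vec{x}_j - \vec{x}_i|}\cdot\vec{\nabla}_i W_{ij}. \tag{30}$$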


In order to demonstrate the accuracy of the new SPH formulation, we apply the new method again to the two-dimensional inviscid elliptical drop test case introduced in Section 1. In Fig. 2 the evolution of the initially circular drop is plotted. The numerical results of the new SPH scheme (symbols) are compared with the analytic solution (solid line) [23]. Substantial improvements in accuracy can be noted. Not only does the profile of the particle positions follow the analytical solution very well, similar to [23], but the density fluctuations are also reduced to less than 0.1% (one order of magnitude less than in the original approach of Monaghan). This considerably improves how well the method satisfies the incompressibility condition of the flow and, in particular, leads to a much less oscillatory pressure field.

3. Boundary conditions

The simulation of fluid flow typically involves fixed or moving boundaries that represent physical interfaces. A major advantage of the SPH approach over classical finite volume or finite difference methods consists in the automatic treatment of the free surface boundary condition, which is intrinsically satisfied. This follows from the equation of state (6), which evaluates the relative pressure field so that the atmospheric pressure is zero, in combination with the kernel interpolation, which does not produce any contributions to the flux from parts of the computational domain without interpolation points.

Unlike the free surface, the solid boundaries require special care and the no-penetration condition must be imposed. In the literature, two different numerical techniques have been proposed. In the first approach [23], the boundaries are replaced by interacting particles which exert a repellent force on fluid particles. In general, a Lennard-Jones-type force acting between fluid and boundary particles is applied [23]. Alternatively, the solid walls can be represented by ghost particles with density, pressure and velocity deduced from those of the physical particles adjacent to the solid boundary by mirroring the fluid particles on a plane that is tangent to the boundary surface [6]. Geometrically, this consists in mirroring the fluid particles via plane symmetry.

In general, the boundary forces are more flexible and can handle any moving rigid body with arbitrarily irregular geometry. However, the ghost particle approach provides a smoother behavior of the particles in proximity of the boundary [6], in particular when the pressure field along a solid boundary is of interest. The repellent forces on the boundaries produce oscillations in the fluid pressure; this follows directly from the form of the force, which depends only on the distance between the fluid and solid particles. Small displacements of a fluid particle in proximity of the boundary correspond to a large variation of the intensity of the repellent force, which produces pressure oscillations in the fluid.

Fig. 2. The new SPH version applied to the elliptical drop test case. The particle positions (on the left; axes x and y) and the pressure profiles (on the right; p versus y). Numerical results (symbols) and analytic (line) solutions are compared at time t = 0.0076 s.

The new idea for the implementation of the boundary conditions is to combine the approaches presented in [23,6], replacing the solid interfaces with boundary particles. Each of them interacts with all adjacent fluid particles by setting, for each fluid particle, a fictitious fluid point via local point-symmetry. The boundary interactions are then added in summation form to the momentum conservation law (24) as follows:

$$\frac{d\vec{v}_i}{dt} = -\sum_{j=1}^{N}\vec{F}^I_{ij} - \sum_{j=1}^{N}\vec{F}^V_{ij} - \sum_{k=1}^{N_B}\vec{F}^I_{ik} - \sum_{k=1}^{N_B}\vec{F}^V_{ik} + \vec{S}_i, \tag{31}$$

where the integer $N_B$ is the total number of boundary particles and the fictitious fluid point $k$ is associated with the $k$th boundary particle at the interface and has the following properties:

$$\vec{x}_k = 2\vec{x}_B - \vec{x}_i, \quad \vec{v}_k = 2\vec{v}_B - \vec{v}_i, \quad \rho_k = \rho_i, \quad m_k = m_i, \tag{32}$$

where~xB and ~vB are the position and the velocity of the particle lo-cated at the interface, respectively. The fluxes ~FI

ik and ~FVik are evalu-

ated as usual using Eqs. (28) and (29).We repeat that from a geometrical viewpoint this means that

we use boundary particles, whose flux contribution is calculatedvia virtual (non-existing) ghost particles k using local point-symme-try rather than plane-symmetry [6]. The numerical experimentspresented in the following sections show that the new boundaryconditions preserve well hydrostatic pressure distributions with-out introducing spurious pressure oscillations at the wall. More-over, the implementation can treat any complex geometry. It isto remark that the time step is not limited by the new boundaryconditions, unlike the repellent forces proposed by Monaghan[23], which may lead to stiff ODE systems if the fluid particlesare very close to the boundary particles. Since for SPH schemestypically explicit time stepping methods are used, these stiff ODEsystems introduced by large boundary forces can significantlydeteriorate the computational efficiency of the whole scheme. An-other advantage of our new approach for the boundary conditionsis that no parameters must be tuned, unlike in the boundary forceapproach [23], where four new adjustable parameters have beenintroduced. The various techniques of implementing the boundaryconditions are illustrated in Fig. 3, where the current particle i un-der consideration is highlighted in grey and the dashed circle indi-cates its compact support. The solid wall boundary is representedby a thick black line. In the boundary force approach (Fig. 3, left),particle i is subject to a repellent force of all boundary particles (so-lid black circles) in its kernel support, independent of its currentvelocity, but depending only on its position. In the ghost particleapproach (Fig. 3, middle), all particles first have to be mirrored atthe solid wall and then the usual kernel summation is applied.The interaction of the wall with particle i therefore depends onthe position and velocity of the latter as well as on the positions

[Figure residue: plot with axes y and p. Surviving caption fragment: "... (on the left) and the pressure profiles (on the right). Numerical results (symbols) and ..."]


Fig. 3. Different ways of implementing the boundary conditions. Boundary forces (left), ghost particles (middle), and new approach (right).

Fig. 4. Speed-up graph, comparing the measured (solid line) and the ideal (dashed line) wall-clock time from 2 to 128 CPUs. [Axes: number of CPUs (horizontal) and wall-clock time in s (vertical), both in logarithmic scale.]

Table 1. Parallel efficiency of the code for an increasing number of CPUs. The values are relative to the ideal case, whose efficiency equals 1.

Number of CPUs    4      8      16     32     64     128
Efficiency        0.96   0.92   0.78   0.68   0.57   0.46

1208 A. Ferrari et al. / Computers & Fluids 38 (2009) 1203–1217

and velocities of all neighbors. In the approach presented in this paper (Fig. 3, right), we use boundary particles as in [23]; however, instead of using a Lennard-Jones-type force, we compute a flux between particle i and each of the boundary particles via the standard kernel summation by locally mirroring particle i at each boundary particle via point symmetry (see Eq. (32)). The contribution of the wall therefore depends on the position and velocity of particle i, but not on those of the neighbors.
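The geometric difference between the two mirroring strategies can be illustrated with a minimal Python sketch. It only shows how a ghost position could be obtained; the function names are hypothetical, the flat wall with a known unit normal is an assumption made for the plane-symmetry case, and the flux evaluation of Eqs. (28), (29) and (32) is not reproduced here.

```python
def mirror_at_point(x_i, x_b):
    """Point symmetry: reflect the position x_i through the boundary particle x_b."""
    return tuple(2.0 * b - a for a, b in zip(x_i, x_b))


def mirror_at_plane(x_i, x_wall, normal):
    """Plane symmetry: reflect x_i at a flat wall through x_wall with unit normal."""
    dist = sum((a - w) * n for a, w, n in zip(x_i, x_wall, normal))
    return tuple(a - 2.0 * dist * n for a, n in zip(x_i, normal))


# Illustrative values only: a fluid particle above a wall lying in the plane y = 0.
x_i = (0.5, 0.25, 0.0)   # fluid particle i
x_b = (1.0, 0.0, 0.0)    # one boundary particle on the wall
ghost_point = mirror_at_point(x_i, x_b)                    # (1.5, -0.25, 0.0)
ghost_plane = mirror_at_plane(x_i, x_b, (0.0, 1.0, 0.0))   # (0.5, -0.25, 0.0)
```

In this sketch the point-symmetric ghost needs only the position of the boundary particle and no wall normal, which is in line with the statement that the approach can treat complex geometries.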

4. Improvements in computational performance

4.1. Neighbor search

Following a Lagrangian approach, in the SPH method the interpolation points move with the fluid velocity $\vec{v}$. This implies that the SPH kernel estimation of any quantity at any point in space requires as a first step the search of the neighbor particles j that enter the compact support of particle i. However, the neighbors j that contribute to particle i are not known a priori. The advantage of grid based methods, such as finite difference or finite volume schemes, where the interpolation points are the vertices of the mesh, is precisely that the neighbor search is trivial, since the connectivity is always known via the mesh. On the other hand, mesh based methods can hardly simulate phenomena with large deformations because of problems of mesh distortion.

In the SPH method, the search of the neighbor particles is based on the mutual distance of the interpolation points and is very expensive from a computational point of view. From a profile analysis of our code we found that initially most of the computational time was used to seek the neighbors.

Typically the neighbor search is achieved by using a fictitious Cartesian grid, which is fixed in time during the entire simulation, and which consists of macro-cells that contain aggregations of fluid and solid particles, as proposed by Monaghan and Lattanzio [26]. In the SPH literature, the macro-cells are also referred to as book-keeping cells [23]. For the calculation of the kernel interpolation procedure, only particles in adjacent book-keeping cells can contribute to each other's density and velocity field evolution.

As proposed by Monaghan and Lattanzio, we use book-keeping cells of size $2h_M$, where $h_M$ is the maximum value of the locally variable smoothing length. Furthermore, in addition to the particle position $\vec{x}$ we store the coordinates of each particle also in integer format, where the integer coordinates $\vec{x}^{Int}_i$ are computed from the position vector $\vec{x}_i$ as

$$\vec{x}^{Int}_i = \mathrm{floor}\left( \frac{N_I}{2h_M}\, \vec{x}_i \right). \qquad (33)$$

Here, $N_I$ denotes the number of sub-divisions of one macro-cell used to produce the integer coordinates. We use $N_I = 1000$. The use of integer coordinates reduces the time for data access within the neighbor search procedure. For each particle i we first identify the index of its macro-cell, denoted by $\vec{I}$ in the following. For the neighbor search, we then loop over a cube of 27 macro-cells $\vec{J}$ with $|\vec{J} - \vec{I}| \le 1$. To exclude from the search the particles contained in the macro-cells at opposite corners and edges of the cube, a first check is based on the barycentres of the book-keeping cells,

$$\left| \vec{x}^{B}_J - \vec{x}_i \right| \le 3h_M, \qquad (34)$$

where $\vec{x}^{B}_J$ denotes the barycentre coordinates of macro-cell $\vec{J}$. Only when (34) is satisfied is a valid neighboring macro-cell identified. Subsequently, the search focuses on each point j located in the macro-cell $\vec{J}$ using a criterion based on the space coordinates in integer format $\vec{x}^{Int}_j$:

$$\left| \vec{x}^{Int}_j - \vec{x}^{Int}_i \right| \le N_I. \qquad (35)$$

Only the points which satisfy (35) are used in the kernel estimation to evaluate the spatial derivatives in the equations of fluid dynamics (23) and (24).
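The search procedure can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the FORTRAN-95 implementation: a dictionary replaces the linked lists, the norm in Eq. (35) is taken to be Euclidean, the 27-cell cube is taken to mean that every index component differs by at most one, and the barycentre pre-check (34) is omitted for brevity. All function names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product

N_I = 1000  # sub-divisions per macro-cell (value quoted in the text)


def integer_coords(x, h_max, n_i=N_I):
    """Eq. (33): component-wise floor(N_I / (2 h_M) * x)."""
    scale = n_i / (2.0 * h_max)
    return tuple(math.floor(scale * c) for c in x)


def cell_index(x_int, n_i=N_I):
    """Macro-cell index of a particle; each cell spans N_I integer units."""
    return tuple(c // n_i for c in x_int)


def build_cells(int_pos, n_i=N_I):
    """Group particle indices by macro-cell (a dict stands in for linked lists)."""
    cells = defaultdict(list)
    for idx, x_int in enumerate(int_pos):
        cells[cell_index(x_int, n_i)].append(idx)
    return cells


def neighbors(i, int_pos, cells, n_i=N_I):
    """Indices j != i in the 27 surrounding macro-cells that satisfy Eq. (35)."""
    x_i = int_pos[i]
    cell_i = cell_index(x_i, n_i)
    found = []
    for offset in product((-1, 0, 1), repeat=3):
        cell_j = tuple(a + b for a, b in zip(cell_i, offset))
        for j in cells.get(cell_j, ()):
            if j == i:
                continue
            dist2 = sum((a - b) ** 2 for a, b in zip(int_pos[j], x_i))
            if dist2 <= n_i * n_i:  # squared form of |x_j^Int - x_i^Int| <= N_I
                found.append(j)
    return sorted(found)
```

Since one macro-cell corresponds to $N_I$ integer units, the cell index follows from an integer division of the stored coordinates, and the distance test uses integer arithmetic only.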

Our code is written in FORTRAN-95 and the data are organized in a flexible way in linked lists using pointers, so that particles can be deleted or added during the simulation, following the motion of the fluid. Storage problems caused by fixed-size arrays, as typically used in FORTRAN-77 codes, are thereby also eliminated.


Fig. 5. Time evolution of the 3D dam break problem. The numerical solution as computed by our new SPH scheme with 2,000,000 particles on 256 CPUs is shown at times t = 0.6 s, 1.2 s, 1.5 s and 2 s.

Fig. 6. The free surface flow impacts against the vertical rigid wall. Side-view in the x–z-plane at times t = 0.6 s, 1.2 s, 1.5 s and 2 s.


This flexible data structure will be of fundamental importance for the MPI parallelization together with the dynamic load-balancing approach, both described in detail in the following section.

4.2. MPI parallelization with dynamic load-balancing

The MPI system is a widely used paradigm on modern massively parallel distributed-memory supercomputers. It is based on the concept that any MPI process running on its own CPU communicates via messages with other MPI processes by calling standard MPI communication subroutines. In our code, the MPI implementation uses non-blocking communication, hence decoupling each send message from the corresponding receive by the neighbor processes. The subdivision of the workload among the CPUs is based on a spatial decomposition. In fact, within the SPH approach, the numerical domain is already organized in macro-cells (see Section 4.1). Using this spatial decomposition, a non-overlapping subset of the macro-cells is assigned to each CPU.

Fig. 7. Time evolution of the water front before the flow impact against the wall, where xfront denotes the front position.

In order to obtain the best performance and efficiency of the code and of the MPI communication, each CPU should have approximately the same number of points to compute, using the least possible number of MPI communications. Unfortunately, the Lagrangian approach of the SPH scheme introduces more complexity in the MPI implementation. Even an optimal distribution of the macro-cells to the various CPUs at the initial time does not guarantee a good efficiency during the entire simulation, because the positions of the points can change considerably in time. Each particle can move from one cell to another and may subsequently have to change the processor on which its data are stored. In other words, a static load-balancing approach cannot assure the best reachable efficiency of the code. For this reason, the implementation of a dynamic rearrangement of the workload among the CPUs is absolutely necessary.
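The effect of particle motion on a fixed assignment of macro-cells can be made concrete with a small Python sketch. The cell indices, particle counts and function names below are purely hypothetical and only illustrate how a static decomposition that is balanced at the initial time may become unbalanced later.

```python
def rank_loads(particles_per_cell, cell_owner, n_ranks):
    """Sum the particles of all macro-cells owned by each MPI rank."""
    loads = [0] * n_ranks
    for cell, count in particles_per_cell.items():
        loads[cell_owner[cell]] += count
    return loads


def imbalance(loads):
    """Ratio of the largest rank load to the mean load (1.0 is perfectly balanced)."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean


# Hypothetical 1D row of four macro-cells split statically between two ranks.
cell_owner = {0: 0, 1: 1, 2: 1, 3: 1}
initial = {0: 500, 1: 500, 2: 0, 3: 0}     # all fluid still in the reservoir
later = {0: 200, 1: 300, 2: 300, 3: 200}   # fluid has spread along the channel

print(imbalance(rank_loads(initial, cell_owner, 2)))  # 1.0
print(imbalance(rank_loads(later, cell_owner, 2)))    # 1.6
```

In this made-up example the slowest rank carries 60% more particles than the average once the fluid has moved, although the initial partition was perfectly balanced.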

The dynamic load-balancing has been carried out using the METIS library, a very powerful free software package for partitioning meshes and weighted graphs, proposed by Karypis and Kumar [16]. The computational domain is partitioned so that each CPU always has almost the same number of fluid particles, while at the same time the MPI communication among the CPUs is minimized. The graph that is to be partitioned by METIS is the Cartesian grid of the macro-cells. The METIS package also accepts different weights for each macro-cell as input data to optimize the procedure with respect to a spatially non-uniform distribution of workload. This is definitely the case for the SPH scheme, where each book-keeping cell may contain a different number of particles, thus leading to a different computational effort. The weights are evaluated by multiplying the number of fluid particles per book-keeping cell by the sum of all particles (fluid and solid) in the neighbor macro-cells. Every $N_M$ time-steps, METIS is called and the macro-cells of the whole 3D computational domain are re-distributed among the CPUs to guarantee an optimal load-balancing for the new spatial configuration of the particles. We typically use $N_M = 300$.
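The weight described above can be sketched as follows. This is a minimal illustration under one explicit assumption: the "neighbor macro-cells" are taken to be the full cube of 27 cells including the cell itself, which the text does not state. The function name and the dictionaries are hypothetical, and the call to METIS itself is not shown.

```python
from itertools import product


def macro_cell_weights(fluid_count, total_count):
    """Weight per macro-cell: fluid particles in the cell times all particles
    (fluid and solid) in the surrounding cube of 27 cells (assumed to include
    the cell itself)."""
    weights = {}
    for cell, n_fluid in fluid_count.items():
        n_neighbors = 0
        for offset in product((-1, 0, 1), repeat=3):
            other = tuple(a + b for a, b in zip(cell, offset))
            n_neighbors += total_count.get(other, 0)
        weights[cell] = n_fluid * n_neighbors
    return weights


# Hypothetical counts: two fluid cells, one wall cell with solid particles only,
# and one far-away cell that does not interact with the others.
fluid = {(0, 0, 0): 2, (1, 0, 0): 3}
total = {(0, 0, 0): 2, (1, 0, 0): 3, (0, -1, 0): 4, (3, 0, 0): 5}
weights = macro_cell_weights(fluid, total)  # {(0, 0, 0): 18, (1, 0, 0): 27}
```

The product form reflects that the cost of a cell scales with the number of particle pairs that have to be examined, not with the number of particles alone.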

This dynamic load-balancing considerably increases the efficiency of the code. The improvements in the performance depend on the numerical test case, on the geometrical complexity of the computational domain and on the number of points. In particular, the efficiency of the MPI communication improves as the number of particles increases.

A performance benchmark problem has also been computed. It consists of a dam break test case in 3D (see Section 5.1) with 500,000 particles. In Fig. 4 the speed-up is plotted. It represents the efficiency using an increasing number of CPUs (2, 4, 8, 16, 32, 64, and 128). The wall-clock time corresponds to the time required by the CPUs to complete the same computational job. In the graph in logarithmic scale, the measured computational efficiency of the code (solid line) is compared with the optimal theoretical performance (supposing 100% MPI efficiency), represented by the dashed line. Similarly, Table 1 lists the efficiency of the code.
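The relation between wall-clock time, efficiency and speed-up can be written down in a short Python sketch. It assumes that the efficiency is measured relative to the smallest run with 2 CPUs, which the text suggests but does not state explicitly; the wall-clock values in the usage line are invented, while the efficiencies are those of Table 1.

```python
def parallel_efficiency(t_ref, p_ref, t_p, p):
    """Efficiency of a run on p CPUs relative to a reference run on p_ref CPUs."""
    return (t_ref * p_ref) / (t_p * p)


def speedup(efficiency, p, p_ref=2):
    """Speed-up over the reference run implied by a given efficiency."""
    return efficiency * p / p_ref


# Invented timings: halving the time when doubling the CPUs is the ideal case.
ideal = parallel_efficiency(t_ref=100.0, p_ref=2, t_p=50.0, p=4)  # 1.0

# Efficiencies from Table 1 and the speed-up they would imply over 2 CPUs.
table_1 = {4: 0.96, 8: 0.92, 16: 0.78, 32: 0.68, 64: 0.57, 128: 0.46}
implied = {p: speedup(e, p) for p, e in table_1.items()}
```

Under this assumption, an efficiency of 0.46 on 128 CPUs would correspond to a speed-up of about 29 over the 2-CPU run, compared with an ideal value of 64.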

The performance results shown in Table 1 and in Fig. 4 are satisfactory, especially considering the fact that the benchmark problem contains large deformations of the computational domain with violent free surface motion and that the scheme is a meshless Lagrangian method with high order Runge–Kutta time integration, which requires quite a lot of MPI communication compared to explicit one-step discontinuous Galerkin or finite volume schemes (see [8,9] for MPI performance data).

5. Applications to environmental problems

In this section, the numerical results of some 3D applications are presented. They consist of environmental problems for free surface flows, such as dam-break and impact flows against a wall. The numerical solutions obtained with our new 3D SPH code have been compared with either experimental results or with other numerical reference solutions, obtaining in all cases a very satisfactory agreement. Moreover, to assess the accuracy of the new SPH scheme, a three-dimensional mesh-convergence study has been performed for the strongly deforming free surface.

In all test cases, the new SPH scheme with the monotone upwind flux in the density equation and the new boundary conditions (see Section 2) have been applied, using the MPI parallelization with dynamic load-balancing as described in Section 4.2. All test cases shown in this paper are inviscid in order to verify that the numerical scheme alone is stable enough without having the additional stabilizing effect of physical viscosity.

5.1. Dam break problem and mesh-convergence study

This test case was originally proposed by Colagrossi and Landrini in two space dimensions [6]. In this paper we extend it to a three-dimensional dam break problem with subsequent impact flow against a rigid vertical wall. The initial conditions are plotted in Fig. 5, where the initial reservoir height is H = 0.6 m, and the length and height of the channel are d = 5.366H and D = 3.0H, according to [6]. For our 3D version, we set the channel width to W = H = 0.6 m. The numerical solution has been computed with the new SPH formulation for an ideal fluid (μ = 0) with four different levels of refinement: 250,000, 500,000, 1,000,000 and 2,000,000 fluid particles. Fig. 6 shows our computational results using 2,000,000 points at various times (t = 0.6 s, 1.2 s, 1.5 s and 2 s) using 256 CPUs. The


Fig. 8. Total height h of water at x1 = 2.228 m (on the left) and x = 2.725 m (on the right). Experiment results are from [34]. [Axes: $h_1/H$ and $h_2/H$ (vertical) versus $t\,(g/H)^{1/2}$ (horizontal); legend: experiments, new SPH, classical SPH [Colagrossi & Landrini, 2003], Fluent.]

Fig. 9. Pressure evolution on the wall evaluated by the classical SPH approach (on the left) and by our new SPH scheme (on the right). Experiment data are from [34]. [Axes: $P/(\rho_0 g H)$ (vertical) versus $t\,(g/H)^{1/2}$ (horizontal); legend entries: experimental data; new SPH at z = 0.08 m, 0.10 m and 0.12 m; two-phase SPH and free-surface SPH [Colagrossi & Landrini, JCP 2003].]


evolution of the pressure field for the dam break flow is plotted through the colors (footnote 1). As expected, our new SPH scheme does not produce any spurious oscillations in the pressure field. In particular, using the new boundary conditions, the evolution of the pressure in proximity of the solid interfaces is very smooth, without discontinuities or improper fluctuations. We remark that no artificial viscosity term is necessary to stabilize the new SPH scheme and no density re-initialization is required to improve the solution. For this test case, laboratory experiments have been carried out by Zhou et al. [34]. In the experiments, the water was initially contained within the solid boundary of a water flume and a piece of wax paper, clamped between two metallic frames. The intense current produced by a short circuit was used to melt the wax and quickly release the paper diaphragm, leaving the water free to flow along a practically unlimited dry deck [6].

Footnote 1: For interpretation of the references to color in Fig. 6, the reader is referred to the web version of this paper.

The numerical velocity of the free surface flow before the impact (at time t = 0.6 s in Fig. 6) is compared with the experimental data and the classical Ritter wave front celerity [30]. In the particular case of a frictionless dam break in a long channel, the shallow water equations (SWEs) can be solved analytically and the celerity of the dam break wave front is $U = 2\sqrt{gH}$, where H is the initial reservoir height. In this application, the dam break wave front celerity is U = 4.85 m/s.

Good results are apparent in Fig. 7. The evolution of the front shows an initial acceleration up to the point where an almost constant longitudinal velocity is reached. Our SPH approach agrees asymptotically very well with the Ritter solution. We emphasize that the Ritter solution is not applicable to the initial instants of the phenomenon, because there the assumptions of the shallow water equations are not satisfied. The experimental data show a slower water front velocity. This may be due to the friction on the bottom and on the vertical walls, which is neglected in the simulation. On a longer time scale, the wall roughness present in the experiments can produce increasing differences in the propagation velocity between simulations and experiments.
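The quoted Ritter celerity can be checked with a few lines of Python. The gravitational acceleration g = 9.81 m/s² is an assumption, since its value is not restated in this section, and the function name is hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def ritter_front_celerity(h0, g=G):
    """Front celerity U = 2 * sqrt(g * H) of a frictionless dam break on a dry bed."""
    return 2.0 * math.sqrt(g * h0)


U = ritter_front_celerity(0.6)  # initial reservoir height H = 0.6 m
print(round(U, 2))              # 4.85
```

With H = 0.6 m this reproduces the value U = 4.85 m/s given in the text.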

In Fig. 8 the numerical free surface elevation has been compared with experimental records at gauges located at x1 = 2.228 m and x = 2.725 m for the depth levels h1 and h2, respectively. In addition to the results computed by the new SPH approach, the plot also shows other numerical computations. They consist of the classical SPH scheme with a periodic density re-initialization, proposed by Colagrossi and Landrini [6], and the


Fig. 10. Mesh convergence study at output times t = 1.2 s (left), 1.5 s (middle) and 2 s (right) using 250,000, 500,000, 1,000,000 and 2,000,000 particles (from top to bottom). The last row contains the free surface profile of all computations.

Fig. 11. Setup of the physical model (on the left) and a 3D view of the numerical solution at time t = 8.5 s (on the right).


Navier–Stokes solver FLUENT, which uses an Eulerian finite volume (FV) method along with a volume of fluid (VOF) scheme for phase interface capturing, applied in [1]. In the experiments, standard capacitive wave gauges have been used which are sensitive to the wet part of the wire [6]. Experimental and numerical results closely agree until $t\sqrt{g/H} = 6.5$ for both gauges h1 and h2 during the transition from dry to wet bed conditions. Beyond this point, the wave after the impact traveling opposite to the main flow breaks. In the new SPH scheme, the numerical water depth refers to the computed free surface and follows the experimental measurements well. The differences detected between numerical and experimental data are most likely due to details of the initial conditions in the experiments and roughness effects on the bottom, as confirmed in [6].
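For orientation, the dimensionless time at which the agreement ends can be converted to physical time. Taking H = 0.6 m from Section 5.1 and assuming g = 9.81 m/s² (a value not restated in this section), one obtains

$$t = 6.5\,\sqrt{\frac{H}{g}} = 6.5\,\sqrt{\frac{0.6\ \mathrm{m}}{9.81\ \mathrm{m/s^2}}} \approx 6.5 \times 0.2473\ \mathrm{s} \approx 1.61\ \mathrm{s},$$

which lies between the output times t = 1.5 s and t = 2 s.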


Fig. 12. Comparison between the numerical solution of the new SPH scheme (symbols) and the reference solution (red line) computed by a 1D finite volume scheme of third order in space and in time [7], at times t = 2 s, 3 s, 5 s and 8.5 s. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this paper.)

Fig. 13. Plane view of the physical model with the positions of the gauging points in the reservoir and in the channel. [Axes: x in m (horizontal, 0 to 10) and y in m (vertical, 0 to 4); labelled gauges: G1, G2, G4, G5, G6, G7.]

Table 2. Coordinates of the gauging points.

Gauge    G1     G2     G4     G5     G6     G7
x (m)    1.59   2.74   5.74   6.74   6.65   6.56
y (m)    0.66   0.69   0.69   0.72   0.80   0.89


Experimental data for the pressure on the vertical wall during the impact are also available [34]. A circular gauge of 9 cm (≈0.15H) diameter has been used, located on the vertical wall with its center 0.267H above the deck [6]. The experimental sensor cannot determine the pressure evolution exactly at a point on the wall; instead, the impact pressure measurement is averaged over time and location. In the numerical solution, in contrast, the pressure can be calculated directly at any point of the wall. In impact problems, since the spatial and temporal pressure gradients are high [1], different approaches to averaging the pressure provide significantly different results. Therefore, Fig. 9 (on the right) shows the numerical pressure on the wall evaluated at different heights z by our new SPH scheme; it is estimated in post-processing by averaging the value of the pressure over the width of the wall, for each height. The impact of the water front against the vertical structure produces an increase in the pressure field, and a second pressure peak is induced by the backward breaking of the wave. The numerical results obtained by our SPH method together with the new boundary conditions (see Sections 2 and 3) considerably improve the accuracy and monotonicity of the pressure field on the wall in comparison with the classical SPH approach. The new scheme slightly underestimates the value of the measured pressure, but no oscillations on the rigid interfaces are apparent and the peak magnitudes are largely similar. The best agreement between the experimental measurements and the numerical data is obtained at z = 0.08 m.

Finally, a mesh-convergence study is presented for this test case, to underline the reliability and accuracy of the new SPH scheme. The comparison of the numerical solutions has been carried out at three different times (t = 1.2 s, 1.5 s and 2 s) applying an increasing refinement of the fluid volume. We ran 250,000, 500,000, 1,000,000 and 2,000,000 particles on 32, 64, 128 and 256 CPUs, respectively. In Fig. 10, the flow impact and the wave breaking have been plotted for the four successively refined computations. The free surface profiles have been tracked, neglecting the drops and focusing on the study of the fluid understood as a continuum.

The comparison at the chosen reference times produces very good results. It demonstrates that our new SPH approach converges even when the flow exhibits extremely large deformation. From the mesh-convergence study we conclude that for this particular test case the discretization of the fluid with 500,000 particles is enough to obtain a sufficient level of accuracy.
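To relate the particle numbers to a spatial resolution, one may assume that the fixed fluid volume V is filled with N particles on a roughly uniform initial lattice (an assumption, since the initial particle arrangement is not described in this section). The mean particle spacing then scales as

$$\Delta x \propto \left(\frac{V}{N}\right)^{1/3}, \qquad \frac{\Delta x(2N)}{\Delta x(N)} = 2^{-1/3} \approx 0.79, \qquad \frac{\Delta x(2{,}000{,}000)}{\Delta x(250{,}000)} = 8^{-1/3} = \frac{1}{2},$$

so each doubling of the particle number refines the spacing by about 21%, and the finest computation has half the particle spacing of the coarsest one.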

An in-depth analysis of the drops, their formation and evolution would require the implementation of the surface tension and its interaction with the physical viscosity of the fluid. Moreover, it would be necessary to take into account the turbulence in the flow after the impact and the breaking of the wave, which governs the


Fig. 14. Numerical results of the 3D computation at time t = 1 s, 4 s, 7 s and 10 s.

Fig. 15. Plane view of the oblique fronts on the free surface at time t = 4 s in the channel upstream the bend.

Footnote 2: For interpretation of the references to color in Fig. 12, the reader is referred to the web version of this paper.


formation of the drops and the extremely high deformations of the fluid. This would also require the implementation of a sub-grid scale turbulence model, which is not the aim of our present applications.

5.2. 3D dam break over a triangular obstacle

We now propose the three-dimensional extension of an originally one-dimensional test case for the shallow water equations with variable bottom topography, see [21] for details of the setup. It consists of a dam break problem which includes the advance over a triangular obstacle, as shown in Fig. 11 (on the left). The reservoir is connected to a rectangular channel with dry bed and the length of the entire model is 22.5 m. The gate is located at x = 15.5 m. A triangular obstacle (6 m long, 0.4 m high) is situated 13 m downstream of the dam on the bed of the channel [21]. The slopes of the obstacle are symmetric and represent fixed rigid boundaries. At the initial time the water depth in the reservoir is equal to 0.75 m. The channel width is equal to W = 1 m.

The computation has been carried out using 1,300,000 fluid particles on 128 CPUs. In Fig. 11, a 3D view of the result at time t = 8.5 s has been plotted. In Fig. 12 the numerical solution obtained with our new SPH method has been compared at different times (t = 2 s, 3 s, 5 s and 8.5 s) with a reference solution (red line, see footnote 2). The reference solution has been computed using a one-dimensional WENO finite volume scheme of third order of accuracy in space and in time, applied to the one-dimensional SWE, following the implementation proposed by Dumbser et al. in [7]. The reference solution has been obtained using 500 elements in one space dimension. The numerical results show a very good level of agreement. The wave front has almost the same celerity at all three instants. The free surface profile within the rarefaction wave produced by the new SPH scheme, especially at times t = 5 s and t = 8.5 s, follows the reference solution very well. Moreover, we underline that the numerical solutions refer to two completely different systems of equations, namely the fully three-dimensional compressible Euler equations, using the Tait equation of state (6), for the 3D SPH approach and the simple depth-averaged one-dimensional shallow water model solved with the FV scheme. The good agreement confirms not only the very good accuracy of our new 3D SPH scheme, but also the consistency of the Tait equation of state to simulate a weakly compressible fluid.


The pressure field represented by the colors (see footnote 2) in Fig. 12 is perfectly smooth. Before the impact against the triangular obstacle, the pressure keeps a hydrostatic distribution. As expected, the impact flow produces an increase in the pressure on the backward slope of the obstacle with a super-elevation in the free surface, as shown in Fig. 12.

5.3. Dam break in a channel with a 45° bend

This test case has been proposed by the European Concerted Action on DAm break Modelling (CADAM). The physical model combines a reservoir connected to a rectangular channel with a 45° bend, whose geometry is depicted in Fig. 13. The aim of the application is to reproduce the evolution of a dam break wave that impacts against a 45° bend. The initial water level in the upstream reservoir is 0.25 m. The bed of the channel is initially dry and the downstream boundary condition is an open boundary [12]. The walls of the channel consist of glass and the bed consists of steel. Experimental results have been published in [12]. Several gauges are located along the channel (see Fig. 13 and Table 2) to measure the time evolution of the water depth during the whole experiment.

Fig. 16. Evolution of the free surface at gauges G1, G2, G4, G5, G6 and G7. Comparison against experimental data and a SWE simulation using an unstructured finite volume scheme [12]. [Six panels, one per gauge; axes: time in s (horizontal, 0 to 20) and water depth in m (vertical); legend: experimental data, finite volume method [Soares Frazao et al., 1999], new SPH scheme.]


The numerical results in Fig. 14 have been computed on 128 CPUs by our new 3D SPH scheme using 1,300,000 particles to discretize the fluid. The colors represent the numerical pressure field. At time t = 1 s a rarefaction wave propagates backwards into the reservoir, while the front advances on the dry bed of the channel and is about to impact against the bend. After 4 s the impact flow is fully developed. The motion produces a super-elevation on the external wall of the bend due to the centrifugal acceleration. Oblique fronts on the free surface are apparent in the channel upstream and downstream of the bend (see Fig. 15). During the evolution of the dam break wave, the free surface profile in the bend changes significantly. After 10 s (see Fig. 14) the super-elevation on the external wall changes into a bore, which travels upstream, back toward the reservoir.

Finally, the SPH results have been compared with the experimental data. The numerical solutions computed by the new SPH scheme (solid line) have been compared with the results of the FV method (dotted line) reported in [12] and with the measured water depth (symbols) during 20 s of simulation at gauges G1, G2, G4, G5, G6 and G7. The data are plotted in Fig. 16. Gauge G1 is located in the reservoir and the water level evolution is correct for both numerical schemes. This means that both numerical schemes compute the right discharge entering the channel. Gauges G2 and G4 are located in the channel, one close to the gate (G2) and the other before the bend (G4). The comparison at gauge G2 shows significant differences between the numerical models. Using the FV scheme with the shallow water equations, the bore velocity is highly dependent upon the friction coefficient, as shown by a sensitivity analysis in [12]. With a higher bottom friction coefficient, the water front travels more slowly, and even small errors in the velocity of the front result in large differences in the time needed for the bore to travel from the bend to the reservoir, as explained in detail in [12]. In fact, the SWE model assumes a constant front velocity and neglects the vertical components of the flow, which are very important at the beginning of the dam break. This produces a time lag between the experiments and the numerical results obtained with the SWE model, which corresponds to the time required by the water to develop an essentially one-dimensional flow, as assumed by the shallow water equations. In contrast, the 3D SPH approach is able to simulate the whole phenomenon right from the first instants, when the motion is actually three-dimensional. Friction has been neglected in the SPH computation, considering that the channel consists of glass and steel, which produce a very low friction coefficient that does not strongly influence the water flow. At G2 the fully three-dimensional SPH computations agree much better with the experimental results than the 2D SWE results reported in [12], where a much higher water level was predicted. At gauge G4, located in the channel upstream of and close to the bend, the results of the two numerical schemes are more similar. Neither numerical method reproduces the oscillations present in the experimental data; both predict a rather monotone evolution of the water depth. The gauges located in the bend are G5 (outer gauge), G6 (centerline gauge) and G7 (inner gauge), placed along a cross section. As expected, the water depth inside the bend shows a free surface inclination in the curve due to the centrifugal acceleration, with a super-elevation at G5 and a bore at G7. After the initial impact of the flow against the wall of the bend at G5, the time evolution of the water height leads to an attenuation of the level difference along the cross section. The comparison at G5, G6 and G7 between the numerical computations and the experiments is shown in Fig. 16. A satisfactory agreement with the experiments is apparent for both the 3D SPH method and the 2D SWE FV simulation.

6. Summary and conclusions

In this paper, we have presented a new SPH formulation for the simulation of three-dimensional non-hydrostatic free surface flows. The approach is able to simulate the free surface profile accurately with little numerical diffusion, and at the same time it produces an accurate pressure field with few oscillations. For this purpose, the monotone Rusanov flux has been introduced in the density equation of the SPH formulation, whereas the pressure terms are discretized using a centered formulation without any additional numerical viscosity. For a stable time-integration of the method, we resort to the successful third order TVD Runge–Kutta scheme, as used in the method-of-lines approach by Shu and Osher [31].

The scheme has been implemented in parallel using the MPI approach together with a dynamic load balancing strategy, based on the METIS software package. This makes the method suitable to run on modern massively parallel distributed memory supercomputers. The efficiency of the parallel implementation has been verified in a speed-up test. Subsequently, the accuracy of the approach has been successfully validated in a thorough mesh-refinement study, based on a three-dimensional dam-break impact-flow test case with violent free surface motion.

Finally, we have shown some classical dam-break test cases that are commonly used as benchmark problems for the shallow water equations. In all these cases, the agreement of the fully three-dimensional SPH computations with the experimental data and with the one- or two-dimensional shallow water solutions was very satisfactory.

Future applications of our method will concern realistic geophysical flows such as mud- and debris-flows.

Acknowledgments

The first author has been supported for the present work by the HPC Europe programme, carried out at the HLRS center for high-performance supercomputing in Stuttgart, Germany. The second author acknowledges support by the Leibniz Rechenzentrum (LRZ), München, Germany.

References

[1] Abdolmaleki K, Thiagarajan KP, Morris-Thomas MT. Simulation of the dambreak problem and impact flows using a Navier-Stokes solver. In: 15th Australasian fluid mechanics conference; 2004. p. 1–4.

[2] Balsara DS. Von Neumann stability analysis of smoothed particle hydrodynamics – suggestions for optimal algorithms. J Comput Phys 1995;121:357–72.

[3] Batchelor GK. An introduction to fluid mechanics. Cambridge: Cambridge University Press; 1974.

[4] Belytschko T, Krongauz Y, Dolbow J, Gerlach C. On the completeness of meshfree particle methods. Int J Numer Methods Eng 1998;43:785–819.

[5] Benz W, Asphaug E. Simulations of brittle solids using smooth particle hydrodynamics. Comput Phys Commun 1995;87:253–65.

[6] Colagrossi A, Landrini M. Numerical simulation of interfacial flows by smoothed particle hydrodynamics. J Comput Phys 2003;191:448–75.

[7] Dumbser M, Enaux C, Toro EF. Finite volume schemes of very high order of accuracy for stiff hyperbolic balance laws. J Comput Phys 2008;227:3971–4001.

[8] Dumbser M, Käser M. Arbitrary high order finite volume schemes for seismic wave propagation on unstructured meshes in 2D and 3D. Geophys J Int 2007;171:665–94.

[9] Dumbser M, Käser M, Titarev VA, Toro EF. Quadrature-free non-oscillatory finite volume schemes on unstructured meshes for nonlinear hyperbolic systems. J Comput Phys 2007;226:204–43.

[10] Español P, Revenga M. Smoothed dissipative particle dynamics. Phys Rev E 2003;67:026705.

[11] Ferrari A, Dumbser M, Toro EF, Armanini A. A new stable version of the SPH method in Lagrangian coordinates. Commun Comput Phys 2008;4:378–404.

[12] Soares Frazão S, Sillen X, Zech Y. Dam-break flow through sharp bends – physical model and 2D Boltzmann model validation. In: Proceedings of the CADAM meeting, Wallingford, United Kingdom, 2 and 3 March 1998, Commission Européenne, Bruxelles; 1999. p. 151–69.

[13] Gingold RA, Monaghan JJ. Smooth particle hydrodynamics: theory and application to non-spherical stars. Month Notices Roy Astron Soc 1977;181:375–89.

[14] Godunov SK. Finite difference methods for the computation of discontinuous solutions of the equations of fluid dynamics. Sbornik – Math USSR 1959;47:271–306.

[15] Johnson GR, Beissel SR. Normalized smoothing functions for SPH impact computations. Int J Numer Methods Eng 1998;35:2725–41.

[16] Karypis G, Kumar V. Multilevel k-way partitioning scheme for irregular graphs. J Parallel Distribut Comput 1998:96–129.

[17] Libersky LD, Petschek AG. Smooth particle hydrodynamics with strength of materials. In: Trease HE, Fritts MJ, Crowley WP, editors. Advances in the free-Lagrange method. New York: Springer; 1991. p. 248–57.

[18] Libersky LD, Petschek AG, Carney TC, Hipp JR, Allahadi FA. High strain Lagrangian hydrodynamics. J Comput Phys 1993;109:67–75.

[19] Liou M-S, Steffen C. A new flux splitting scheme. J Comput Phys 1993;107:23–39.

[20] Lucy LB. A numerical approach to the testing of the fission hypothesis. Astron J 1977;82:1013–24.

[21] Mohammadian A, Le Roux DY. Simulation of shallow flows over variable topographies using unstructured grids. Int J Numer Methods Fluids 2006;52:473–98.

[22] Monaghan JJ. Particle methods for hydrodynamics. Comput Phys Rep 1985;3:71–124.

[23] Monaghan JJ. Simulating free surface flows with SPH. J Comput Phys 1994;110:399–406.

[24] Monaghan JJ. SPH without a tensile instability. J Comput Phys 2000;159:290–311.

[25] Monaghan JJ. Smoothed particle hydrodynamics. Rep Progr Phys 2005;68:1703–59.

[26] Monaghan JJ, Lattanzio JC. A refined particle method for astrophysical problems. Astron Astrophys 1985;149:135–43.

[27] Ben Moussa B. On the convergence of SPH method for scalar conservation laws with boundary conditions. Method Appl Anal 2006;13:29–62.

[28] Ben Moussa B, Lanson N, Vila JP. Convergence of meshless methods for conservation laws: applications to Euler equations. Int Ser Numer Math 1999;129:31–40.

[29] Oger G, Doring M, Alessandrini B, Ferrant P. Two-dimensional SPH simulations of wedge water entries. J Comput Phys 2006;213:803–22.

[30] Ritter A. Die Fortpflanzung der Wasserwellen. VDI Zeitschrift 1892;36:947–54.

[31] Shu C-W, Osher S. Efficient implementation of essentially non-oscillatory shock-capturing schemes. J Comput Phys 1988;89:439–71.

[32] Swegle JW, Attaway SW, Heinstein MW, Mello FJ, Hicks DL. An analysis of smoothed particle hydrodynamics, technical report SAND93-2513, Sandia; 1994.

[33] Vila JP. On particle weighted methods and smooth particle hydrodynamics. Math Model Methods Appl Sci 1999;9:161–209.

[34] Zhou ZQ, Kat JOD, Buchner B. A nonlinear 3D approach to simulate green water dynamics on deck. In: Seventh international conference on numerical ship hydrodynamics; 1999. p. 1–4.