HPC @ T2i What an integration partner can do for you: an example with EPFL-LMH Lionel Clavien & Siamak Alimirzazadeh / 2017-04-12



Agenda

Groupe T2i

• About

• Skills & Services

EPFL-LMH’s GPU-SPHEROS

• Numerical Method

• Algorithms

• Results

Quick Facts

Software Editor & IT Services Provider

• > 30 years on the market

• ~ 200 Employees

• > 250’000 Users

Markets & Territories

• Public, Real Estate, HR, Financial Services, Academia, Retail, Insurance, Logistics, …

• From SMEs to international accounts

• Switzerland, France & Canada

Solutions

• EDM, CMS, Workflows, HR, IT Modernization, …

Infrastructure & Cloud Expertise

• Reselling, Hosting, Managed Services, SaaS, Services VAR

Skills & Interests… in the HPC World

Strong relationships with vendors & deep technical knowledge of products

• IBM & Lenovo Business Partner

• NVIDIA Preferred Partner

• OpenPOWER Foundation Member

HPC reach

• Academic / Research

• Commercial

Heterogeneous Computing

• Acceleration through GPGPUs and FPGAs (*CAPI)

• High-speed / High-throughput Data Acquisition

Cognitive Computing

• AI / Machine Learning / Deep Learning

• IBM Watson

• Predictive analysis

Provided Services – Pre-sales

Exploration

• Presentations

• Workshops

• Brainstorming sessions

Testing

• Own test machines

• Access to vendors’ labs

Design

• Networking

• Storage

• Rack layout

Pricing

• Optimizations

• Support from vendors

Provided Services – Implementation

Monitoring

• Infrastructure: Nagios, Icinga, …

• Performance: Ganglia, …

Cluster and storage configuration

• Schedulers, including parameterization: Platform LSF, SGE, …

• Deep expertise with IBM Spectrum Scale (aka GPFS)

Software deployment

• Operating system and base toolchain, libraries

• Through xCAT or similar tools

Physical installation

• Expertise with HPC solutions cabling


The OpenPOWER Foundation

Chip / SOC

Boards / Systems

I/O / Storage / Acceleration

System / Integration

Software

Implementation / HPC / Research

OPF: an example (Minsky)

NVIDIA: GPU Accelerator

Ubuntu by Canonical: Launch OS, supporting NVLink and Page Migration Engine

Wistron: Platform co-design

Mellanox: InfiniBand / Ethernet connectivity in and out of server

HGST: NVMe adapters

Broadcom: PCIe adapters

QLogic: Fibre Channel adapters

Samsung: SSDs

Hynix, Samsung, Micron: Memory

IBM: CPU


LMH Laboratory for Hydraulic Machines

Introduction

Free surface and large boundary deformation problems

Finite Volume Particle Method (FVPM)

SPHEROS: an in-house MPI-based parallel solver

GPU-SPHEROS: a GPU-accelerated version of SPHEROS

Speedups

Simulation of Pelton turbines using particle-based methods

Free surface and splashing (water jet)

Large deformation of boundaries (rotating bucket and free surface)

Finite Volume Particle Method

Christian Vessaz, EPFL PhD thesis n° 6470 (2015)

Finite Volume Particle Method (FVPM): Governing equations

Conservative and consistent

Robust in handling free surface and moving boundaries

High computational cost

Mass and momentum conservation:

$$\frac{d\rho}{dt} = -\rho\,\nabla\cdot\mathbf{C}$$

$$\rho\,\frac{d\mathbf{C}}{dt} = \nabla\cdot(\mathbf{s} - p\,\mathbf{I}) + \rho\,\mathbf{g}$$

The conservation law can be written as:

$$\frac{\partial\mathbf{U}}{\partial t} + \nabla\cdot\mathbf{F}(\mathbf{U}) = \begin{pmatrix} 0 \\ \rho\,\mathbf{g} \end{pmatrix}$$

where:

$$\mathbf{U} = \begin{pmatrix} \rho \\ \rho\,\mathbf{C} \end{pmatrix} \quad\text{and}\quad \mathbf{F} = \begin{pmatrix} \rho\,\mathbf{C} \\ \rho\,\mathbf{C}\otimes\mathbf{C} - \mathbf{s} + p\,\mathbf{I} \end{pmatrix}$$
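As an illustration of the conservative form, the following minimal sketch assembles the state vector U and the flux F for a single fluid state. The function name `conserved_and_flux` and the sample values are assumptions made for this example; they are not taken from SPHEROS.

```python
def conserved_and_flux(rho, C, s, p):
    """Return U = (rho, rho*C) and F = (rho*C, rho*C(x)C - s + p*I).

    rho: density, C: velocity vector, s: deviatoric stress tensor (nested lists),
    p: pressure. All names are illustrative, not part of SPHEROS.
    """
    dim = len(C)
    U = (rho, [rho * c for c in C])
    mass_flux = [rho * c for c in C]
    momentum_flux = [
        [rho * C[a] * C[b] - s[a][b] + (p if a == b else 0.0) for b in range(dim)]
        for a in range(dim)
    ]
    return U, (mass_flux, momentum_flux)


# Sample state (assumed values): water-like density, flow along x, no deviatoric stress.
zero_stress = [[0.0] * 3 for _ in range(3)]
U, (mass_flux, momentum_flux) = conserved_and_flux(1000.0, [2.0, 0.0, 0.0], zero_stress, 100.0)
```

With zero deviatoric stress, the momentum flux reduces to the convective term plus the pressure on the diagonal, which is a quick sanity check on the sign convention of F.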

SPHEROS

SPHEROS is an in-house parallel FVPM solver (using MPI)

Able to simulate interaction between fluid, solid and silt

Mainly developed for free surface and erosion modeling in hydraulic turbines

[1] École Polytechnique Fédérale de Lausanne (EPFL), thesis n° 6470 (2015)

[2] Sebastian Leguizamon, PhD candidate at EPFL-LMH since 2015

GPU-SPHEROS overall algorithm

for each time step t
    for each particle i
        find the neighbor particles j
    end for
    for each particle i
        for each neighbor j
            compute the interaction vectors
        end for
    end for
    for each particle i
        for each neighbor j
            if i is silt
                compute contact forces $f^c_{ij}$ from Hertz theory for spherical particles and hydrodynamic force $f^h_{ij}$:
                $f_i = \sum_{j \in \text{silt, solid}} f^c_{ij} + \sum_{j \in \text{fluid}} f^h_{ij}$
            else
                compute momentum flux from pressure P and deviatoric stress G, where G is the Newtonian viscous stress in the fluid and the hypo-elastic stress in the solid:
                $f_i = \sum_j \left[ (\rho\mathbf{C}\dot{\mathbf{x}} - \rho\mathbf{C}\mathbf{C})_{ij} - \mathbf{P}_{ij} + \mathbf{G}_{ij} \right] \cdot \boldsymbol{\Delta}_{ij} - p_b\,\mathbf{B}_i$
                compute mass flux including the smoothing mass term:
                $m_i = \sum_j \left[ (\rho\dot{\mathbf{x}} - \rho\mathbf{C})_{ij} + \mathbf{G}_{ij} \right] \cdot \boldsymbol{\Delta}_{ij}$
                compute volume flux:
                $\dot{V}_i = \sum_j \dot{\mathbf{x}}_{ij} \cdot \boldsymbol{\Delta}_{ij} + \dot{\mathbf{x}}_i \cdot \mathbf{B}_i$
            end if
        end for
    end for
    for each particle i (using a 2nd-order Runge-Kutta predictor-corrector scheme)
        update volume, mass and momentum
        update density and compute pressure from equation of state
        compute velocity correction and update particle velocity
        update particle position
    end for
    $t \leftarrow t + \Delta t$
end for
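The silt branch of the algorithm relies on Hertz theory for spherical particles. The sketch below evaluates the classical Hertz normal force between two elastic spheres, F = (4/3) E* sqrt(R*) δ^(3/2). The slides do not say which terms (damping, tangential forces) the solver includes, so this is only the textbook elastic normal force; the function name, argument names, and sample material values are assumptions.

```python
import math


def hertz_normal_force(r1, r2, dist, e1, e2, nu1, nu2):
    """Elastic Hertz normal force between two spheres (textbook form, illustrative).

    r1, r2: radii; dist: distance between centres;
    e1, e2: Young's moduli; nu1, nu2: Poisson ratios.
    """
    overlap = r1 + r2 - dist              # indentation depth delta
    if overlap <= 0.0:
        return 0.0                        # spheres are not in contact
    e_eff = 1.0 / ((1.0 - nu1 ** 2) / e1 + (1.0 - nu2 ** 2) / e2)  # effective modulus E*
    r_eff = 1.0 / (1.0 / r1 + 1.0 / r2)                            # effective radius R*
    return 4.0 / 3.0 * e_eff * math.sqrt(r_eff) * overlap ** 1.5


# Assumed sample values: two 1 mm quartz-like grains with a small overlap.
force_small = hertz_normal_force(1e-3, 1e-3, 1.9e-3, 70e9, 70e9, 0.17, 0.17)
force_large = hertz_normal_force(1e-3, 1e-3, 1.8e-3, 70e9, 70e9, 0.17, 0.17)
no_contact = hertz_normal_force(1e-3, 1e-3, 2.5e-3, 70e9, 70e9, 0.17, 0.17)
```

Because the force grows with the overlap to the power 3/2, doubling the overlap multiplies the force by about 2.83, which makes the contact stiff for small time steps.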

GPU-SPHEROS

Octree-based neighbor search (27.5%)

Computing interaction vectors (67.5%)

Computing forces and fluxes + time integration (5%)
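The time-integration phase uses a 2nd-order Runge-Kutta predictor-corrector scheme. The slides do not say which variant, so the sketch below assumes Heun's method, one common scheme of that kind, applied to a scalar test equation. The names `heun_step`, `integrate`, and `decay` are hypothetical.

```python
import math


def heun_step(f, t, y, dt):
    """One Heun step (assumed variant of a 2nd-order predictor-corrector)."""
    k1 = f(t, y)                      # slope at the start of the step
    y_pred = y + dt * k1              # predictor: explicit Euler estimate
    k2 = f(t + dt, y_pred)            # slope at the predicted end point
    return y + 0.5 * dt * (k1 + k2)   # corrector: average of both slopes


def integrate(f, y0, t_end, n_steps):
    """Advance y' = f(t, y) from t = 0 to t_end in n_steps equal steps."""
    dt = t_end / n_steps
    t, y = 0.0, y0
    for _ in range(n_steps):
        y = heun_step(f, t, y, dt)
        t += dt
    return y


def decay(t, y):
    """Test problem y' = -y with exact solution exp(-t)."""
    return -y


err_coarse = abs(integrate(decay, 1.0, 1.0, 100) - math.exp(-1.0))
err_fine = abs(integrate(decay, 1.0, 1.0, 200) - math.exp(-1.0))
```

Halving the step size reduces the error by roughly a factor of four, which is the signature of a second-order scheme. In a particle solver the scalar `y` would be replaced by the per-particle volume, mass, and momentum.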


Octree-based fixed-radius neighbor search based on space-filling curves (SFC)

The fixed-radius neighbor search for the particles is based on a space-filling curve (Morton curve)

The particles are sorted with a parallel radix sort (Thrust parallel algorithms) so that memory accesses are coalesced
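The idea of ordering particles along a Morton curve can be sketched as follows: each particle is assigned to a grid cell, the cell indices are bit-interleaved into a single key, and the particles are sorted by that key. Python's built-in `sorted` stands in here for the GPU radix sort. The cell size, the 10 bits per axis, the restriction to non-negative coordinates, and all function names are assumptions for this example.

```python
def morton3d(ix, iy, iz, bits=10):
    """Interleave the lowest `bits` bits of three cell indices into one Morton key."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code


def sort_by_morton(positions, cell_size, bits=10):
    """Return the particle order along the Morton curve and the sorted keys.

    Assumes non-negative coordinates; a real code would shift by the domain origin.
    """
    keys = [
        morton3d(*(int(c // cell_size) for c in pos), bits=bits)
        for pos in positions
    ]
    order = sorted(range(len(positions)), key=keys.__getitem__)  # stand-in for radix sort
    return order, [keys[i] for i in order]


# Four sample particles (assumed values); the 2nd and 4th share a cell.
positions = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.6, 0.1, 0.1), (0.2, 0.0, 0.3)]
order, sorted_keys = sort_by_morton(positions, cell_size=0.5)
```

After sorting, particles in the same or nearby cells sit next to each other in memory, which is what allows neighboring GPU threads to read neighboring addresses.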

[1] E. Jahanbakhsh, PhD thesis, EPFL, n° 6284 (2014)

[2] E. Jahanbakhsh et al., Exact finite volume particle method with spherical-support kernels, Computer Methods in Applied Mechanics and Engineering (ISSN: 0045-7825), vol. 317, pp. 102-127, Elsevier, 2017

Computing forces and fluxes (Re = 50k)


WLS (compute-bound) and VIG (memory-bound)

Intel Xeon E5649 vs. NVIDIA Tesla K40

[Figure: global speedup [-] for ~132k particles. Dates on the horizontal axis: Jan. 2016, Oct. 2016, Feb. 2017, Mar. 2017. Speedup labels: 1.13x, 1.32x, 2.5x, 7.7x. Annotations: optimization of interaction vectors; optimization of neighbor search. Pie charts (octree neighbor search / interaction vectors / forces and fluxes): 27.5% / 67.5% / 5% and 13% / 85% / 2%. Stage labels: interaction vectors based on spherical-support kernel; computing forces and fluxes; octree-based neighbor search.]

Reference: 2 x Intel® Xeon® CPU E5-2660 v2

Accelerated: Tesla P100 with NVLink + 2x POWER8 (10 cores, 2.86 GHz)


Speedup

Summary

Finite Volume Particle Method (FVPM) is very compute-intensive

GPU-SPHEROS: a GPU-accelerated version of SPHEROS

GPU-SPHEROS is faster than 8 nodes with 2 Intel® Xeon® E5-2660 v2 CPUs

2 Tesla P100 GPUs with NVLink between GPUs and CPU (T2i)

A further large speedup is expected once the computation of the interaction vectors is optimized further

Realistic simulations

A single-jet simulation with three buckets takes more than 8 hours on one node


Computing forces and fluxes

Lid-driven cavity benchmark


Computing forces and fluxes

Validation for lid-driven cavity benchmark (Re = 400)


Kernel Optimization

Example: Optimization procedure of a volume integral gradient kernel

$$\nabla p_i = \frac{1}{V_i} \sum_j \frac{p_i + p_j}{2}\,\boldsymbol{\Delta}_{ij}$$

| Optimization level | Technique | Time [ms] | Improvement [-] |
|---|---|---|---|
| 0 | Atomic operations / Thrust sequential reduction | ≈ 100 | > 4x slower |
| 1 | for loop | 23.69 | 1x |
| 2 | Unrolling loops | 5.27 | 4.49x |
| 3 | Using Structure of Arrays | 3.23 | 7.33x |
| 4 | __restrict__ pointers | 2.89 | 8.19x |
| 5 | __launch_bounds__ | 2.51 | 9.43x |
| 6 | Optimized number of threads per block | 2.36 | 10.03x |

Memory-bound kernels

Compute-bound kernels
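The volume integral gradient above can be written as one loop per particle over its neighbors. The sketch below is a plain CPU reference version, not the GPU kernel from the slides. It assumes a compressed neighbor list (`nbr_start`, `nbr_index`) and stores the interaction vectors as separate component arrays (`dx`, `dy`, `dz`), echoing the Structure of Arrays layout in the table. All names and sample values are hypothetical.

```python
def pressure_gradient(p, volume, nbr_start, nbr_index, dx, dy, dz):
    """Evaluate grad p_i = (1/V_i) * sum_j (p_i + p_j)/2 * Delta_ij for every particle.

    Neighbors of particle i are nbr_index[nbr_start[i]:nbr_start[i + 1]];
    dx, dy, dz hold the components of Delta_ij for each neighbor entry.
    """
    n = len(p)
    gx, gy, gz = [0.0] * n, [0.0] * n, [0.0] * n
    for i in range(n):
        sx = sy = sz = 0.0                     # local accumulators, no atomics needed
        for k in range(nbr_start[i], nbr_start[i + 1]):
            w = 0.5 * (p[i] + p[nbr_index[k]])  # mean pressure of the pair
            sx += w * dx[k]
            sy += w * dy[k]
            sz += w * dz[k]
        gx[i], gy[i], gz[i] = sx / volume[i], sy / volume[i], sz / volume[i]
    return gx, gy, gz


# Two particles that are each other's only neighbor (assumed sample values).
gx, gy, gz = pressure_gradient(
    p=[1.0, 3.0],
    volume=[2.0, 4.0],
    nbr_start=[0, 1, 2],
    nbr_index=[1, 0],
    dx=[1.0, -1.0],
    dy=[0.0, 0.0],
    dz=[0.5, -0.5],
)
```

Each particle only writes its own result, so the loop over `i` maps naturally to one GPU thread per particle, and keeping each component in its own array lets consecutive threads read consecutive addresses.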
