High-Performance Scientific Computing

INFO 0939, Fall 2017

http://www.montefiore.ulg.ac.be/~geuzaine/INFO0939/

Based in part on material from Victor Eijkhout’s SSC 335/394 course at the Texas Advanced Computing Center


Introduction

Mathematics & Science

• In science, we use mathematics to understand physical systems.

• Different fields of science explore different ‘domains’ of the universe, and have their own sets of equations, encapsulated in theories.

• Determining the theories and governing equations requires observation or experimentation, and testing hypotheses.

[Courtesy of San Diego Supercomputer Center]

Scientific Computing

• Why should we care about scientific computing?
  – Computational research has emerged to complement experimental methods in basic research, design, optimization, and discovery in all facets of engineering and science
  – In certain cases, computational simulations are the only possible approach to analyze a problem:
    • Experiments may be cost prohibitive (e.g. flight testing 1,000 fuselage/wing-body configurations for a modern fighter aircraft)
    • Experiments may be impossible (e.g. interaction effects between the International Space Station and Shuttle during docking)
  – Simulation capabilities rely heavily on the underlying compute power (e.g. amount of memory, total compute processors, and processor performance)
    • Fostered the introduction and development of supercomputers starting in the 1960s
    • Large-scale compute power is tracked around the world via the Top500 List (more on that later)

Scientific Computing: a definition

• “The efficient computation of constructive methods in applied mathematics”
  – Applied math: getting results out of application areas
  – Numerical analysis: results need to be correctly and efficiently computable
  – Computing: the algorithms need to be implemented on modern hardware
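As a small illustration of why "correctly computable" is a real constraint, here is a sketch in Python (the language and the example values are our choice, not taken from the slides). Adding 0.1 ten times with plain floating-point additions does not give exactly 1.0, whereas the standard library's `math.fsum`, which tracks the rounding error of the partial sums, does.

```python
import math

values = [0.1] * 10

# Plain left-to-right accumulation: every addition rounds to the nearest double.
naive = 0.0
for v in values:
    naive += v

# math.fsum keeps track of the lost low-order bits and rounds only once at the end.
accurate = math.fsum(values)

print(naive == 1.0)     # False
print(accurate == 1.0)  # True
```

Both computations cost a handful of operations; the difference lies only in the algorithm, which is the kind of trade-off numerical analysis studies.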

Examples of Scientific Computing (it really is everywhere)

• Automotive
• Heating, ventilation, and air conditioning: streamlines for workstation ventilation
• Electrical Engineering
• Aerospace: F18 store separation
• Weather Forecasting
• Biomedical: temperature and natural convection currents in the eye following laser heating

COMPUMAG–SYDNEY 2011, STATIC FIELDS AND QUASI-STATIC FIELDS (III), PA10.2, CMP–305, p. 2

Fig. 1. Mesh used for the numerical experiments [20]: a) full head (300,000 nodes, 27 tissues), b) grey matter, c) white matter.

... matter $\sigma_G(\omega)$ are modeled within a probabilistic framework, as functions of the random variable $\omega$. Therefore $j_{Avg}$, $j_{max}$, $j_{99\text{-perc}}$, $e_{Avg}$, $e_{max}$ and $e_{99\text{-perc}}$ are random as well. In particular, by using the maximum entropy principle [22] we model (arbitrarily) $\sigma_G(\omega)$ and $\sigma_W(\omega)$ as independent random variables, uniformly distributed:

$$\sigma_G(\omega) \sim U([0.0753 ; 0.5155]) \ \text{(S/m)} \qquad (1)$$

$$\sigma_W(\omega) \sim U([0.0533 ; 0.3020]) \ \text{(S/m)} \qquad (2)$$

C. The non intrusive approach

As the conductivities of the brain and the cerebellum are two independent random variables of finite variance, we can expand them as a truncated series of order $p_{in}$ in the bi-dimensional Hermite polynomials of a random Gaussian vector $\xi(\omega) = (\xi_1(\omega), \xi_2(\omega))$, known as Hermite chaos polynomials [18]:

$$\sigma_G(\omega) \approx \sum_{i=0}^{P_{in}} \sigma_{G_i} \, \psi_i(\xi(\omega)) \qquad (3)$$

$$\sigma_W(\omega) \approx \sum_{i=0}^{P_{in}} \sigma_{W_i} \, \psi_i(\xi(\omega)) \qquad (4)$$

where $\sigma_{G_i}$ and $\sigma_{W_i}$ are scalar values that depend on the probabilistic law of the conductivities, $P_{in} = C^{p_{in}}_{2+p_{in}}$ is the number of bi-dimensional polynomials of order less than $p_{in}$, and $\psi_i$ is the $i$th bi-dimensional Hermite polynomial. To solve the stochastic problem, we use an approach based on a polynomial chaos decomposition of both the conductivity and the induced fields [18]. We assume the conductivities to be of finite variance, with no assumption on the shape of the probabilistic distribution.

The values of the induced fields (the average current density in the brain $j_{Avg}(\omega) = j_{Avg}(\xi(\omega))$) are computed by the finite element method from any couple of values $(\sigma_G(\xi(\omega)), \sigma_W(\xi(\omega)))$. The average density belongs to a space that can be spanned by the polynomials $\psi(\xi(\omega))$ and thus written as a truncated series to an order $p_{out}$:

$$j_{Avg}(\omega) = \sum_{m=0}^{P_{out}} j^{Avg}_m \, \psi_m(\xi(\omega)). \qquad (5)$$

To compute the value of the unknown real coefficients $j^{Avg}_m$, we use the orthogonality properties of the Hermite polynomials:

$$j^{Avg}_m = \frac{E[\, j_{Avg}(\omega) \, \psi_m(\xi(\omega)) \,]}{E[\, \psi_m(\xi(\omega))^2 \,]}, \qquad (6)$$

where $E[\cdot]$ is the mathematical expectation. The denominator can be computed analytically. The integral in the numerator is computed by means of a Hermite Gauss integration scheme with $d$ integration points [18]:

$$E[\, j_{Avg}(\omega) \, \psi_m(\xi(\omega)) \,] \approx \sum_{i=1}^{d} \dots \sum_{j=1}^{d} w_{i,j} \, j_{Avg}((t_1, t_2)_{i,j}) \, \psi_m((t_1, t_2)_{i,j}), \qquad (7)$$

with $(t_1, t_2)_{i,j}$ the $i,j$-th Gauss point and $w_{i,j}$ the associated weight in the bi-dimensional Cartesian rule. The deterministic problem must thus be computed $d^2$ times, with the conductivity evaluated through (3) and $(\xi_1(\omega), \xi_2(\omega)) = (t_1, t_2)_{i,j}$, $i, j = 1, \dots, d$.

III. RESULTS AND DISCUSSION

The non intrusive method is governed by three parameters: $p_{in}$, $p_{out}$ and $d$; $p_{in}$ is linked to the precision on the ...

Fields induced in human body close to power lines
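The projection step described in the excerpt (coefficients obtained as a ratio of expectations, with the numerator evaluated by a tensor Gauss-Hermite rule) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: `toy_model` is a hypothetical stand-in for one deterministic finite-element solve, the rule uses d = 3 points, and the weights are normalised for a standard normal variable (probabilists' convention), for which E[ψ²] of a product He_a(ξ1)·He_b(ξ2) equals a!·b!.

```python
import math

# 3-point Gauss-Hermite rule for a standard normal variable.
# With these normalised weights the rule is exact for polynomials up to degree 5.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def hermite(n, x):
    """Probabilists' Hermite polynomial He_n(x), via He_{k+1} = x He_k - k He_{k-1}."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h


def pce_coefficients(model, order):
    """Project model(t1, t2) on 2-D Hermite polynomials of total degree <= order."""
    d = len(NODES)
    # One deterministic solve per Gauss point: d * d model evaluations in total.
    samples = {(i, j): model(NODES[i], NODES[j]) for i in range(d) for j in range(d)}
    coeffs = {}
    for a in range(order + 1):
        for b in range(order + 1 - a):
            numerator = sum(
                WEIGHTS[i] * WEIGHTS[j] * value * hermite(a, NODES[i]) * hermite(b, NODES[j])
                for (i, j), value in samples.items()
            )
            # Denominator E[psi^2] is known analytically for Hermite polynomials.
            coeffs[(a, b)] = numerator / (math.factorial(a) * math.factorial(b))
    return coeffs


def toy_model(t1, t2):
    # Hypothetical stand-in for a finite-element solve at one Gauss point.
    return 1.0 + 2.0 * t1 + 3.0 * t1 * t2


coeffs = pce_coefficients(toy_model, order=2)
for degrees, value in sorted(coeffs.items()):
    print(degrees, round(value, 12))
```

Because the toy model is itself a low-degree polynomial, the quadrature is exact here and the coefficients of the constant, ξ1 and ξ1·ξ2 terms come back as 1, 2 and 3. The model is evaluated only d² = 9 times, regardless of how many coefficients are requested.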

New Kinds of Computations

The Top500 List

• http://www.top500.org
• Owner-submitted benchmark performance since 1993
  – based on a dense linear system solve
  – http://www.netlib.org/benchmark/hpl/
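To make "a dense linear system solve" concrete, the following Python sketch times a small Gaussian elimination with partial pivoting and converts the run time into a floating-point rate. It is a toy, not the HPL code: the matrix size, the random test matrix, and the helper names are our assumptions, and the operation count 2/3·n³ + 2·n² is the one conventionally credited to a LINPACK-style solve.

```python
import random
import time


def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (modifies a and b)."""
    n = len(b)
    for k in range(n):
        # Partial pivoting: move the largest remaining entry of column k to the diagonal.
        p = max(range(k, n), key=lambda r: abs(a[r][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        row_k = a[k]
        for i in range(k + 1, n):
            row_i = a[i]
            factor = row_i[k] / row_k[k]
            for j in range(k, n):
                row_i[j] -= factor * row_k[j]
            b[i] -= factor * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def linpack_flops(n):
    """Operation count conventionally credited to an n x n dense solve."""
    return 2.0 * n**3 / 3.0 + 2.0 * n**2


n = 120                      # tiny compared with real benchmark runs
rng = random.Random(0)
a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
b = [sum(row) for row in a]  # right-hand side chosen so the solution is all ones

start = time.perf_counter()
x = solve_dense([row[:] for row in a], b[:])
elapsed = time.perf_counter() - start

max_error = max(abs(xi - 1.0) for xi in x)
rate = linpack_flops(n) / elapsed / 1e6
print(f"n={n}  time={elapsed:.4f} s  rate={rate:.1f} MFLOP/s  max error={max_error:.2e}")
```

Interpreted Python reaches only a tiny fraction of what the hardware can deliver, which is one reason the benchmark is run with optimized compiled libraries.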

Top500 by Overall Architecture

Top500 by Microprocessor

Top500 by Installation Type

Top500 by Operating System

HPC and CECI

• CECI is the “Consortium des Equipements de Calcul Intensif” in Wallonia/Brussels
  – http://www.ceci-hpc.be
  – Once you create an account you can use all the CECI clusters
  – Funded by FNRS
  – UCL, ULB, FUNDP, UMons, ULg
  – Single login for all clusters (more on that later)

• Machines in CECI grid:
  – NIC4 @ ULg
  – HMEM & Lemaitre2 @ UCL
  – Vega @ ULB
  – Hercules @ FUNDP
  – Dragon1 @ UMons

– 120 compute nodes with two 8-core Intel E5-2650 processors at 2.0 GHz and 64 GB of RAM (4 GB/core)
– QDR Infiniband network
– 144 TB FHGFS parallel filesystem
– The cluster is especially designed for massively parallel jobs (MPI, several dozens of cores) with many communications and/or a lot of parallel disk I/O, 2 days max.
– SSH to nic4.segi.ulg.ac.be with the appropriate login and id_rsa.ceci file (more on that later).
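The node figures above translate directly into aggregate numbers. The short Python calculation below derives the total core count, the total memory, and a theoretical peak rate. The value of 8 double-precision floating-point operations per cycle per core is our assumption for this processor generation (AVX), not a figure from the slides, so the peak should be read as an estimate.

```python
# Figures from the cluster description
nodes = 120
sockets_per_node = 2
cores_per_socket = 8
clock_hz = 2.0e9
ram_per_node_gb = 64

# Assumption (not from the slides): 8 double-precision flops per cycle per core.
flops_per_cycle = 8

cores_per_node = sockets_per_node * cores_per_socket
total_cores = nodes * cores_per_node
total_ram_gb = nodes * ram_per_node_gb
ram_per_core_gb = ram_per_node_gb / cores_per_node
peak_tflops = total_cores * clock_hz * flops_per_cycle / 1e12

print(f"cores: {total_cores}")
print(f"RAM: {total_ram_gb} GB total, {ram_per_core_gb:.0f} GB per core")
print(f"theoretical peak: {peak_tflops:.2f} TFLOP/s")
```

The 4 GB per core obtained here matches the figure quoted in the description; sustained performance on real applications is normally well below the theoretical peak.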

NIC4 System Summary

External Power and Cooling

Class Goals/Topics

• Remember that definition: “The efficient computation of constructive methods in applied mathematics”
  – Numerical analysis/algorithms, (parallel) computation, and how to combine them
• Theory topics: architecture, numerical analysis, implementing the one on the other
• Practical skills: the tools of scientific computing

Class Goals/Topics

• UNIX Exposure
  – shells / command line
  – environment
  – compilers
  – libraries
• Ideally: good practices for scientific software engineering
  – version control
  – build systems
  – data storage
  – debugging skills

Class Setup

• Theory classes on Tuesday @ 2pm
• 3 or 4 projects, combining theory and programming

Computer Accounts

• CECI clusters
  – You should create an account now, and run your codes over there
  – Must access page from the ULg subnet: https://login.ceci-hpc.be
• All required info is available on http://www.ceci-hpc.be
• Don’t forget to read the FAQ!
• Jobs run in a managed environment
  – log in to the login node
  – submit jobs to the scheduler
  – wait
  – collect results
• Production runs on the login node are forbidden
  – avoid resource intensive tasks
  – exceptions include compilers, “standard” UNIX commands (ls, mkdir, etc.)
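The "submit jobs to the scheduler" step usually means handing a small batch script to the scheduler instead of running the program directly. The slides do not name the scheduler; the Python sketch below assumes Slurm and writes such a script using standard `#SBATCH` directives. The function name, job name, resource values, and the `./heat2d` executable are hypothetical placeholders.

```python
def make_job_script(job_name, ntasks, hours, mem_per_cpu_mb, command):
    """Return the text of a Slurm batch script (assumes a Slurm-managed cluster)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --ntasks={ntasks}",              # number of MPI processes
        f"#SBATCH --time={hours:02d}:00:00",       # wall-clock limit, hh:mm:ss
        f"#SBATCH --mem-per-cpu={mem_per_cpu_mb}", # memory per core, in MB
        f"#SBATCH --output={job_name}-%j.out",     # %j expands to the job id
        "",
        f"srun {command}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example: a 32-process run limited to 2 hours.
script = make_job_script("heat2d", ntasks=32, hours=2, mem_per_cpu_mb=2048, command="./heat2d")
print(script)
```

On a Slurm system the resulting file would be submitted from the login node with `sbatch`, and the output file would be collected once the job has finished; the heavy computation itself never runs on the login node.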

Remote Login on CECI Clusters

• Only SSH access is allowed
  – Windows users: get a client
    • PuTTY (http://www.chiark.greenend.org.uk/~sgtatham/putty/)
    • or use Cygwin (http://www.cygwin.com/) and follow the UNIX instructions
• All required info is available on
  – http://www.ceci-hpc.be
• Don’t forget to read the FAQ:
  – https://login.ceci-hpc.be/static/faq.html
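For UNIX-like systems, the login boils down to one `ssh` invocation that points at the CECI private key. The Python sketch below only assembles and prints that command line. The login `jdoe` and the key location `~/.ssh/id_rsa.ceci` are assumptions made for illustration; the host name is the NIC4 address given earlier, and `-i` is the standard OpenSSH option for selecting an identity file.

```python
import os
import shlex


def ceci_ssh_command(login, host="nic4.segi.ulg.ac.be", key_file="~/.ssh/id_rsa.ceci"):
    """Build the argument list for an SSH login using the CECI private key."""
    # key_file location is an assumption; adapt it to where the key was saved.
    return ["ssh", "-i", os.path.expanduser(key_file), f"{login}@{host}"]


# "jdoe" is a hypothetical login used only for this example.
cmd = ceci_ssh_command("jdoe")
print(shlex.join(cmd))
```

The printed line can be pasted into a terminal; the official CECI documentation remains the reference for the exact procedure.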
