scientific domain-informed machine learning

Post on 31-Jan-2022

2 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

www.anl.gov

Scientific Domain-Informed Machine Learning

ALCF Simulation, Data, and Learning Workshop October 3, 2018

Prasanna Balaprakash Computer Scientist

Mathematics and Computer Science Division and Leadership Computing Facility Argonne National Laboratory

Joint work with P. Malakar, V. Vishwanath, K. Kumaran, V. Morozov, J. Wang, R. Kotamarthi

DOEscien*ficdataformachinelearning

Regulardomain

Spa*al Temporal

Irregulardomain

Tabular Graphs Manifolds Pointclouds

2

Challengesforirregulardomains

3

MachineLearning

Irregulardomains

Domainknowledge

Automa*on

Scalability

Casestudies•  Scien*ficapplica*onperformancemodeling•  Surrogatemodelinginweathersimula*on

4

Predic*vemodelsinHPCapplica*ons

•  Performance(run*me)predic*ons*llchallenging•  ML-basedperformancemodelingtobridgethegap•  Insightsonimportantknobsthatimpactsperformance•  Helpprunelargesearchspacesinperformancetuning

[H.Hoffmann,WorldChangingIdeas,SA2009] [S.Williamsetal.,ACM2009]

5

Biasvariancetradeoff

6

•  Nofreelunch:nosinglemethodwillworkwellonalldataset•  Allsupervisedlearningalgorithmsseektoreducebiasandvarianceina

differentway

Fortmann-Roe,2012

Deeplearningsummerschoollecture,CIFAR,2016

Supervisedlearningmethods

Classicalmethods

Regulariza*on

Instance-based

Recursivepar**oning

Kernel-based

Bagging

Boos*ng

7

Deepneuralnetworks

x0

x1

x2

xn

h10

h11

h12

h13

h14

h15

h16

h17

h18

h20

h21

h22

h23

h24

h25

O0

8

Applica*onsandpla`orms

•  Miniapps(#noofdatapoints):•  miniMD(<2K);O(1024)nodes•  miniAMR(<1K);O(4096)nodes•  miniFE(6Kto15K);O(8192)nodes•  LAMMPS(<1K);O(1024)nodes

9

Impactoffeatureengineering

•  NoFeatureEngineering(No-FE)•  applica*oninputparameters

•  FeatureEngineering(FE)•  applica*oninputparameters•  ra*ooftheapplica*onproblemsizeandthe

numberofnumberofprocesses•  inverseofthenumberofprocesses•  binarylogarithmofnumberofprocesses

10X20:80crossvalida*on

10

Featureengineeringhasasignificantimpactontheaccuracy

Impactofhardwarepla`orms

Algorithmiccomplexityhasmoreimpactthanhardwarepla`orms

11

Impactoftrainingdatasizeonaccuracy

Nonlinearmethodsleveragelargetrainingdatasize 12

Transferlearning

Pla`orm1 Pla`orm2

Freezeweights

Retrainweights

13

Transferlearning

Transferlearningsignificantlyimprovespredic*onaccuracy 14

Extrapola*onfromsmalltolargeproblemsizes

Incorpora*ngdomainknowledgehelpsinexplora*on 15

Casestudies•  Scien*ficapplica*onperformancemodeling•  Surrogatemodelinginweathersimula*on

16

Planetaryboundarylayer•  Planetaryboundarylayer(PBL)

–  lowestpartoftheatmosphere–  directlyinfluencedbyitscontactwitha

planetarysurface–  respondstochangesinsurfaceradia*ve

forcing–  flowvelocity,temperature,moisture,

etc.,displayrapidfluctua*ons–  computa*onallyexpensiveinweather

researchandforecas*ngmodel

17

hhps://en.wikipedia.org/wiki/Planetary_boundary_layer

Planetaryboundarylayer

Input:Surfaceproper*es,fluxes,andgroundtemperature

Output:profilesofwind,temperature,moisture

18

Surrogateneuralnetwork

Input:Surfaceproper*es,fluxes,andgroundtemperature

Output:profilesofwind,temperature,moisture 19

Predic*onresults(level1:closetosurface) (level16:~2km)

20

Predic*onresults

21

meridionalwind(south-northwind) ver*calvelocity(upanddownmo*ons)

Predic*onresults

22

meridionalwind(south-northwind) ver*calvelocity(upanddownmo*ons)

Challengesforirregulardomains

23

MachineLearning

Irregulardomains

Domainknowledge

Automa*on

Scalability

Acknowledgements•  U.S.DOE

– ANLLDRD– ALCF– ASCR,EarlyCareerResearchProgram

24

top related