scientific domain-informed machine learning
Post on 31-Jan-2022
2 Views
Preview:
TRANSCRIPT
www.anl.gov
Scientific Domain-Informed Machine Learning
ALCF Simulation, Data, and Learning Workshop October 3, 2018
Prasanna Balaprakash Computer Scientist
Mathematics and Computer Science Division and Leadership Computing Facility Argonne National Laboratory
Joint work with P. Malakar, V. Vishwanath, K. Kumaran, V. Morozov, J. Wang, R. Kotamarthi
DOEscien*ficdataformachinelearning
Regulardomain
Spa*al Temporal
Irregulardomain
Tabular Graphs Manifolds Pointclouds
2
Challengesforirregulardomains
3
MachineLearning
Irregulardomains
Domainknowledge
Automa*on
Scalability
Casestudies• Scien*ficapplica*onperformancemodeling• Surrogatemodelinginweathersimula*on
4
Predic*vemodelsinHPCapplica*ons
• Performance(run*me)predic*ons*llchallenging• ML-basedperformancemodelingtobridgethegap• Insightsonimportantknobsthatimpactsperformance• Helpprunelargesearchspacesinperformancetuning
[H.Hoffmann,WorldChangingIdeas,SA2009] [S.Williamsetal.,ACM2009]
5
Biasvariancetradeoff
6
• Nofreelunch:nosinglemethodwillworkwellonalldataset• Allsupervisedlearningalgorithmsseektoreducebiasandvarianceina
differentway
Fortmann-Roe,2012
Deeplearningsummerschoollecture,CIFAR,2016
Supervisedlearningmethods
Classicalmethods
Regulariza*on
Instance-based
Recursivepar**oning
Kernel-based
Bagging
Boos*ng
7
Deepneuralnetworks
x0
x1
x2
xn
h10
h11
h12
h13
h14
h15
h16
h17
h18
h20
h21
h22
h23
h24
h25
O0
8
Applica*onsandpla`orms
• Miniapps(#noofdatapoints):• miniMD(<2K);O(1024)nodes• miniAMR(<1K);O(4096)nodes• miniFE(6Kto15K);O(8192)nodes• LAMMPS(<1K);O(1024)nodes
9
Impactoffeatureengineering
• NoFeatureEngineering(No-FE)• applica*oninputparameters
• FeatureEngineering(FE)• applica*oninputparameters• ra*ooftheapplica*onproblemsizeandthe
numberofnumberofprocesses• inverseofthenumberofprocesses• binarylogarithmofnumberofprocesses
10X20:80crossvalida*on
10
Featureengineeringhasasignificantimpactontheaccuracy
Impactofhardwarepla`orms
Algorithmiccomplexityhasmoreimpactthanhardwarepla`orms
11
Impactoftrainingdatasizeonaccuracy
Nonlinearmethodsleveragelargetrainingdatasize 12
Transferlearning
Pla`orm1 Pla`orm2
Freezeweights
Retrainweights
13
Transferlearning
Transferlearningsignificantlyimprovespredic*onaccuracy 14
Extrapola*onfromsmalltolargeproblemsizes
Incorpora*ngdomainknowledgehelpsinexplora*on 15
Casestudies• Scien*ficapplica*onperformancemodeling• Surrogatemodelinginweathersimula*on
16
Planetaryboundarylayer• Planetaryboundarylayer(PBL)
– lowestpartoftheatmosphere– directlyinfluencedbyitscontactwitha
planetarysurface– respondstochangesinsurfaceradia*ve
forcing– flowvelocity,temperature,moisture,
etc.,displayrapidfluctua*ons– computa*onallyexpensiveinweather
researchandforecas*ngmodel
17
hhps://en.wikipedia.org/wiki/Planetary_boundary_layer
Planetaryboundarylayer
Input:Surfaceproper*es,fluxes,andgroundtemperature
Output:profilesofwind,temperature,moisture
18
Surrogateneuralnetwork
Input:Surfaceproper*es,fluxes,andgroundtemperature
Output:profilesofwind,temperature,moisture 19
Predic*onresults(level1:closetosurface) (level16:~2km)
20
Predic*onresults
21
meridionalwind(south-northwind) ver*calvelocity(upanddownmo*ons)
Predic*onresults
22
meridionalwind(south-northwind) ver*calvelocity(upanddownmo*ons)
Challengesforirregulardomains
23
MachineLearning
Irregulardomains
Domainknowledge
Automa*on
Scalability
Acknowledgements• U.S.DOE
– ANLLDRD– ALCF– ASCR,EarlyCareerResearchProgram
24
top related