



Motivation

Background - Deep Learning via Recurrent Neural Networks

Predicting the Temporal Evolution of Tokamak Plasma Profiles with Deep Learning
Jalal Butt¹ and Egemen Kolemen²,³

References and Acknowledgements

Future Work

Preliminary Plasma Profile Predictions via RNN

Deep Learning with Neural Networks

¹Central Connecticut State University, ²Princeton University, ³Princeton Plasma Physics Laboratory

Model in early stages (lots still to do):
• Larger training shot list (model currently trains on ~600 shots – lots more data available)
• More 0D data (power and torque from each beam – potentially valuable influence on local spatial scales)
• Comparing profile evolution predictions for data-driven and physics-based models
• CNN-RNN hybrid - learn more abstract spatial patterns
• Parameterize input data – reduce computational expense
• Kinetically-constrained plasma profiles
• Train model on plasma transport codes (e.g. TRANSP)

Implementing RNN in NERSC Supercomputers

• Awarded NERSC exploratory resources to upscale project

• Implementing RNN model in Cori and Edison supercomputers
• Necessary: increase accuracy → increase computational power

• Inspired by biological neural networks
• Interdisciplinary applications (including fusion, particularly disruption predictions)
• Supervised learning problem (target profiles known)

• Why?: The rudimentary NN previously developed doesn't explicitly use previous time-steps and predictions as model inputs (it doesn't have "memory") – but plasma evolution has spatial + temporal dependence

• Model goal: predict physically significant spatio-temporal changes in profiles with knowledge of the previous plasma state (via profiles) and current 0D inputs

• Examples from preliminary results show the model predicting the onset of ion profile "hollowing"

• Fusion goal: make nuclear fusion a commercially feasible energy source

• Major challenge: plasma stability
• If we can predict plasma stability → manipulate plasma to avoid instabilities/dangerous modes

• Tokamak plasma profiles are inputs to plasma stability codes
• Can AI help us predict plasma [profile] evolution?
• Plasma evolution knowledge → predict potential instability onsets

• 2.) A mean-absolute difference loss [error] function is defined for loss minimization:
• 3.) Learning process: minimize loss(ŷᵢ) by updating the weights and biases (θ) using the SGD algorithm
• 4.) Minimization is optimized by the Adam optimizer

• [1] Long Short-Term Memory, Hochreiter et al., Neural Computation 1997
• [2] Adam: A Method for Stochastic Optimization, Kingma et al., ICLR 2015
• [3] A Brief Introduction to Machine Learning for Engineers, Simeone, KCL

• This work is supported by US DOE grants DE-FC02-04ER54698, DE-AC02-09CH11466, and DE-SC0015878.

• NERSC, a U.S. DOE Office of Science User Facility operated under Contract No. DE-AC02-05CH11231.

• Princeton Plasma Physics Laboratory and DOE for funding the Summer Undergraduate Laboratory Internship (SULI).

• Patrick Vail, Florian Laggner, and Yichen Fu for their insight into fusion plasmas.

$$\theta_i = \theta_{i-1} + \gamma_i \sum_{n \in \mathrm{batch}} \left( \nabla_\theta f_n(\theta) \,\big|_{\theta = \theta^{(i-1)}} \right)$$
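A minimal pure-Python sketch of one such update for a single scalar parameter. The per-sample objective, the targets, and the decaying schedule γᵢ below are all made-up illustrations; the step is written θ ← θ − γᵢ·(summed gradient), the descent direction for minimization when γᵢ > 0.

```python
# One SGD step: update theta with the gradient summed over a mini-batch.
# Hypothetical per-sample objective: f_n(theta) = 0.5 * (theta - y_n)**2,
# whose gradient with respect to theta is simply (theta - y_n).

def sgd_step(theta, batch_targets, gamma):
    """theta <- theta - gamma * (sum of per-sample gradients over the batch)."""
    grad_sum = sum(theta - y_n for y_n in batch_targets)
    return theta - gamma * grad_sum

theta = 0.0
for i in range(100):
    gamma_i = 0.1 / (1 + i)          # a simple decaying learning-rate schedule
    theta = sgd_step(theta, [1.0, 2.0, 3.0], gamma_i)
# theta moves toward 2.0, the minimizer of the summed objective
```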

• Each "neuron" is its own neural network – the node learns a non-linear combination of values to determine the gates' operation for the ML task

• 1.a.) Weights (W) and biases (b) for the model's raw neuron activations (Aⁱ⁺¹) are defined as:

Rudimentary neural network

Relevant information
• Recurrent Neural Network Model:
• LSTM units: 200; LSTM cells [layers]: 4
• Limited by computational power

• 588 [DIII-D] shots
• 50-ms time-slices
• Built using Google's TensorFlow API

Loss: measure of "error"

RNN unrolled in time - how the RNN uses previous time-slices for prediction
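The unrolling idea can be sketched with a toy scalar recurrence: the hidden state produced at one time-slice is fed back in at the next, so earlier inputs influence later predictions. The weights and inputs below are illustrative, not the poster's trained parameters.

```python
# Minimal recurrent update: the hidden state h carries information from
# previous time-slices forward into the current prediction.

def rnn_step(h_prev, x_t, w_h=0.5, w_x=1.0):
    """One unrolled time-step: the new state mixes old state and new input."""
    return max(0.0, w_h * h_prev + w_x * x_t)   # ReLU activation, as in the poster

h = 0.0
for x_t in [1.0, 0.0, 0.0]:   # an input arrives only at the first slice...
    h = rnn_step(h, x_t)
# ...yet h is still nonzero afterwards: the state "remembers" earlier slices
```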

$$f_t = \mathrm{ReLU}(W_f \cdot [C_{t-1}, h_{t-1}, x_t] + b_f) \;\rightarrow\; \text{Forget gate}$$
$$i_t = \mathrm{ReLU}(W_i \cdot [C_{t-1}, h_{t-1}, x_t] + b_i) \;\rightarrow\; \text{Input gate}$$
$$o_t = \mathrm{ReLU}(W_o \cdot [C_t, h_{t-1}, x_t] + b_o) \;\rightarrow\; \text{Output gate}$$
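Each gate applies the same pattern to the concatenated state vector. A sketch with scalar weights (standard LSTM formulations gate with a sigmoid; this follows the ReLU form written above); the weights, biases, and state values are illustrative, not trained parameters.

```python
# One LSTM gate as written above: ReLU(W . [C, h, x] + b).

def relu(z):
    return max(0.0, z)

def gate(W, state, b):
    """Apply one gate: ReLU of (dot product of W with the state vector) plus bias."""
    return relu(sum(w * v for w, v in zip(W, state)) + b)

C_prev, h_prev, x_t = 0.2, 0.1, 1.0
f_t = gate([0.1, 0.2, 0.3], [C_prev, h_prev, x_t], b=0.05)  # forget gate
i_t = gate([0.3, 0.1, 0.2], [C_prev, h_prev, x_t], b=0.00)  # input gate
# the output gate follows the same pattern but uses the updated cell state C_t
```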

• 1.b.) Raw neuron activations are (rectified-)linearly activated:

• Each node (LSTM unit) is described by:

Model Inputs:
• Previous 150 ms of nₑ, nᵢ, Tₑ, Tᵢ, and rotation profiles (zipfit - OMFITmdsplus)

• 0D time-traces (previous and concurrent)
• NBI power/torque, gas A/B flowrates, ECH power

Model Outputs:
• Future (unseen) 150 ms of plasma profiles
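With 50-ms time-slices (per the model description), a 150-ms history is three slices. A minimal sketch, assuming a single profile channel and fabricated values, of how input and target windows could be cut from one shot's time series:

```python
# Sliding windows over a shot's time-slices: 3 slices (150 ms at 50 ms/slice)
# of history as input, the next 3 slices as the prediction target.
# 'series' stands in for one profile channel; real inputs are multi-channel.

def make_windows(series, n_in=3, n_out=3):
    pairs = []
    for start in range(len(series) - n_in - n_out + 1):
        x = series[start : start + n_in]                  # previous 150 ms
        y = series[start + n_in : start + n_in + n_out]   # future 150 ms
        pairs.append((x, y))
    return pairs

series = [0, 1, 2, 3, 4, 5, 6]    # placeholder time-slice values
pairs = make_windows(series)
```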

$$\begin{bmatrix} A^{i+1}_1 \\ A^{i+1}_2 \\ \vdots \\ A^{i+1}_n \end{bmatrix} = \begin{bmatrix} W^1_1 & W^1_2 & \cdots & W^1_{n_p} \\ W^2_1 & & & \vdots \\ \vdots & & \ddots & \\ W^n_1 & \cdots & & W^n_{n_p} \end{bmatrix} \cdot \begin{bmatrix} A^i_1 \\ A^i_2 \\ \vdots \\ A^i_{n_p} \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_{n_p} \end{bmatrix}$$

Tensors are extended by one rank for a multi-layer NN

yᵢ is the sum of the target values vector and ŷᵢ is the sum of the predicted values vector

$$\mathrm{ReLU}(A^{i+1}_j) = \max(0, A^{i+1}_j) = x^{+}$$
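Steps 1.a and 1.b can be sketched together: the matrix-vector product plus bias gives the raw activations, and ReLU then rectifies them element-wise. The W, b, and A values below are made-up toys, not model parameters.

```python
# 1.a) raw activations: W . A + b  (matrix-vector product plus bias)
# 1.b) activation: ReLU(a) = max(0, a), applied element-wise

def layer_forward(W, A, b):
    raw = [sum(w * a for w, a in zip(row, A)) + b_j
           for row, b_j in zip(W, b)]
    return [max(0.0, r) for r in raw]        # element-wise ReLU

W = [[1.0, -2.0],
     [0.5,  0.5]]
A = [1.0, 1.0]
b = [0.0, 0.25]
A_next = layer_forward(W, A, b)   # first raw value is negative, so it clamps to 0
```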

[Figure legend: predicted vs. target profiles, with their differences]

Another example of ion-hollowing onset prediction


$$\mathrm{loss}(y_i) = \frac{1}{\mathrm{batchsize}} \sum_{1}^{\mathrm{batchsize}} \left| \, y_i - \hat{y}_i \, \right|$$
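A direct pure-Python transcription of this mean-absolute loss, evaluated on a toy batch of fabricated target/prediction values:

```python
# Mean-absolute loss: average of |y_i - y_hat_i| over the batch.

def mae_loss(targets, predictions):
    batchsize = len(targets)
    return sum(abs(y - y_hat) for y, y_hat in zip(targets, predictions)) / batchsize

loss = mae_loss([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])   # toy batch of size 3
```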

γᵢ: learning-rate schedule; θᵢ: updated weights and biases

ŷᵢ: parameterized by W and b