
Data-driven modeling of strongly nonlinear chaotic systems with non-Gaussian statistics

Hassan Arbabi and Themistoklis Sapsis ∗

August 20, 2019

Abstract

Strongly nonlinear systems, which commonly arise in turbulent flows and climate dynamics, are characterized by persistent and intermittent energy transfer between various spatial and temporal scales. These systems are difficult to model and analyze due to either high dimensionality or uncertainty, and there has been much recent interest in obtaining reduced models, for example in the form of stochastic closures, that can replicate their non-Gaussian statistics in many dimensions. On the other hand, data-driven methods, powered by machine learning and operator-theoretic concepts, have shown great utility in modeling nonlinear dynamical systems with various degrees of complexity. Here we propose a data-driven framework to model stationary chaotic dynamical systems through nonlinear transformations and a set of decoupled stochastic differential equations (SDEs). Specifically, we first use optimal transport to find a transformation from the distribution of time-series data to a multiplicative reference probability measure such as the standard normal distribution. Then we find the set of decoupled SDEs that admit the reference measure as the invariant measure and also closely match the spectrum of the transformed data. As such, this framework represents the chaotic time series as the evolution of a stochastic system observed through the lens of a nonlinear map. We demonstrate the application of this framework to the Lorenz-96 system, a 10-dimensional model of high-Reynolds cavity flow, and reanalysis climate data. These examples show that SDE models generated by this framework can reproduce the non-Gaussian statistics of systems with moderate dimensions (e.g. 10 and more), and approximate super-Gaussian tails that are not readily computable from the training data. Some advantageous features of our framework are the convexity of the main optimization problem, flexibility in the choice of reference measure and SDE model, and interpretability in terms of the interaction between variables and the underlying dynamical process.

Keywords: measure-preserving chaos, Koopman continuous spectrum, extreme events, spectral proper orthogonal decomposition, mixed spectra

∗The authors are with the Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. Correspondence: {arbabi, sapsis}@mit.edu


1 Data-driven analysis and prediction of dynamical systems

Scientists and engineers are facing increasingly difficult challenges such as predicting and controlling the earth's climate, understanding and emulating the human brain, and designing and maintaining social networks. All these challenges require modeling and analysis of systems that have a large number of degrees of freedom, possibly combined with a considerable amount of uncertainty in parameters. In doing so, the classical model-based state-space approach to dynamical systems analysis and control falls short due to the computational size of these problems and the lack of an accurate model. As a result, recent decades have seen great development in the area of model reduction and data-driven modeling, with the focus of making these problems more amenable to computation and analysis.

One of the main challenges in modeling complex systems is centered around strongly nonlinear systems, i.e. chaotic systems with persistent and intermittent energy transfer between multiple scales. Climate dynamics and engineering turbulence give rise to many such problems and have been a major target of model reduction techniques. In particular, using stochastic reduced models has become increasingly popular for these problems because it allows representing turbulent fluctuations, which arise from nonlinear dynamics in many degrees of freedom, in the form of simple systems with random forcing. Stochastic models of climate have been proposed to estimate climate predictability and response to external output [26, 36], explain the observed energy spectrum and transport rates in the atmosphere [16, 21], facilitate data assimilation [25] and predict weather patterns [9]. Similar models for analysis of turbulent flows have been suggested in [13, 63, 70, 80]. The formal analysis in [43, 45, 46] has established the validity of stochastic closure models under certain assumptions while emphasizing the need to go beyond the commonly used linear models with additive noise. All the above efforts have shown the great utility and promise of stochastic modeling, but they rely heavily on expert intuition and the underlying model of the system (e.g. linearized Navier-Stokes equations), and therefore their application remains limited to well-understood systems and well-chosen variables.

Data-driven methods offer a shortcut for analysis and control of high-dimensional systems by allowing us to discover and exploit low-dimensional structures that may be masked by the governing equations, or to find alternative models with more computational tractability. A key enabler of data-driven analysis for dynamical systems in recent years has been the operator-theoretic formalism. The Perron-Frobenius operator, which propagates probabilities in the state space, enables a Markov-type approximation of dynamical systems [14, 35]. Another framework is centered around the Koopman operator [31, 48], which describes the evolution of observables, for computation of geometric objects and linear predictors from trajectory data. These approaches are accompanied by an arsenal of numerical algorithms that extract dynamical information from large data sets produced by simulations and experiments [54, 59, 64, 66, 78]. Due to their interpretability and connection with the classical theory of dynamical systems, operator-theoretic methods have become popular in many fields ranging from fluid dynamics [67] and power networks [61] to biological pattern extraction [8] and visual object recognition [20]. A central feature of these frameworks is to use data to find the canonical coordinates which decouple the system dynamics and allow an independent and linear representation. This feature is also emphasized in manifold learning methods for data mining [10, 18, 53, 68]. However, much of the current operator-theoretic framework relies on the discrete spectrum expansion, and the modeling framework for stationary chaotic systems, where the associated operators possess continuous spectrum, remains to be developed.

Statistical and machine learning methods offer another large and open-ended avenue for data-driven modeling. Variations of neural networks, including auto-encoders [40], long short-term memory networks [75] and reservoir computers [57], have been proposed for various tasks like discovering representations of complex systems, closure modeling and short-time prediction of chaotic systems. A big theme of many efforts in this area is to explicitly encode available physical and mathematical information in the structure of learning, to reduce the amount of training data and adhere to known constraints [62, 74, 76]. Furthermore, the approximation of operator-theoretic objects from data is naturally connected to the regression problem, and there is a growing body of work that uses machine learning techniques for approximation of the Koopman or Perron-Frobenius operator [28, 38, 40, 55, 71, 79, 81]. All the above studies show the promise of machine learning in modeling dynamical systems, but the methods in this category often come with important caveats such as lack of interpretability and non-robust optimization processes.

In this work, we present a data-driven framework for stochastic modeling of strongly nonlinear systems. Starting from time-series measurements of observables on a dynamical system, this framework yields a set of randomly forced linear oscillators, combined with a nonlinear observation map, that produce the same statistics as the given time-series data. The core idea is to use optimal transport of probabilities [56, 73] to map the data distribution to a multiplicative reference measure and then model each dimension separately using a simple SDE that mimics the spectrum of the time series. In contrast to the aforementioned works on stochastic modeling, our framework enables a stronger data-driven approach: we start with a general space of stochastic models, i.e., linear stochastic oscillators pulled back under polynomial maps, and then select the model that produces the closest statistics and dynamics to the observed data. This framework also offers an advantage over previous works on data-driven modeling of strongly nonlinear systems [34, 44] by treating nonlinearities as a feature of the observation map and not of the underlying vector field. As a result, we can use dynamic models with guaranteed stable invariant measures, as opposed to the quadratic models in [34], and avoid using latent variables, as opposed to [44]. Moreover, isolating the nonlinearity in the observation map makes it possible to formulate a major part of the challenge as an optimal transport problem, which can be attacked in large dimensions due to recent computational progress in this area.

We use our framework to compute stochastic models for the Lorenz-96 system, high-Reynolds flow in a 2D cavity, and 6-hourly time series of velocity and temperature reanalysis data from the earth's atmosphere. Application of our framework to the above examples generates models that closely match the non-Gaussian features of the data, such as skewness and heavy tails, in large dimensions from relatively little data. In the case of the cavity flow, the SDE models not only reproduce the statistics of the modal coordinates which are used to train the model, but they also lead to an accurate approximation of high-order statistics for pointwise flow measurements. The transport-based framework also allows modeling systems that possess heavy tails in their invariant measures. By utilizing this feature in the case of the climate data, we are able to extrapolate the heavy tails of chaotic time series and characterize the probability of extreme events from short-time observations. Compared to state-of-the-art modeling via neural networks, our framework offers convexity of the main optimization problem, and interpretability in terms of the dependence between variables and the underlying dynamical process.

2 Stochastic modeling via measure transport and spectral matching

Consider a dynamical system given as

$$\dot{x} = f(x), \qquad y = g(x), \tag{1}$$

with the state variable x in the state space S and the output observable y in $\mathbb{R}^N$. Assume that the trajectories in a neighborhood of S converge to an attractor Ω which supports an invariant and ergodic probability measure denoted by µ. We assume that instead of x, we can only measure y, which is a random variable with distribution ν = g#µ, that is, the pushforward of µ under the observation map g. Moreover, the time series of y are realizations of the stationary stochastic process $\{U^t g\}_{t\in\mathbb{R}}$, where $U^t$ is the Koopman operator associated with the system in (1). Given this time series, our goal is to construct a generative model in the form of SDEs that produces the same joint statistics as this process. We are typically interested in the setup where N is smaller than the dimension of the attractor (e.g. measurements from a few sensors in a turbulent flow) and therefore the process may not be Markovian. Yet our objective is to use a Markovian stochastic model to obtain a coarse-grained and statistically faithful model, rather than constructing a deterministic model which would necessarily be nonlinear and contain many latent variables.

The first step in our framework is to find the mapping $T: \mathbb{R}^N \to \mathbb{R}^N$ that takes the random variable y ∼ ν to another random variable, denoted as q, which has a standard normal distribution, that is,

$$q = T(y) \sim \pi, \tag{2}$$

where π is the standard normal distribution. In other words, we seek the change of variables that makes the attractor look like a normalized Gaussian distribution in each direction. We use the theory of optimal transport to find this change of variables. This theory studies the choice of maps which minimize some notion of cost for carrying one probability measure to another [60, 73]. The existence of such a map generally depends on the regularity of the measure ν. We assume that ν is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^N$, which guarantees the existence of the transport map [73]. For numerical approximation of this map, we use the computational approach developed in [56] (also see [19, 47]). This framework leverages the structure of a specific class of optimal transport maps, known as the Knothe–Rosenblatt rearrangement, which leads to advantageous computational properties. The map is assumed to be a lower triangular, differentiable and orientation-preserving function given as

$$T(y_1, y_2, \ldots, y_N) = \begin{bmatrix} T_1(y_1) \\ T_2(y_1, y_2) \\ \vdots \\ T_N(y_1, y_2, \ldots, y_N) \end{bmatrix}, \tag{3}$$

where each component is approximated with a multivariate polynomial of a prescribed degree. The coefficients of the polynomials in T are determined by maximum likelihood estimation: let $\hat{T}$ be a candidate for T, and consider the pullback of π under this map,

$$\hat{\nu}(y) = \pi(\hat{T}(y))\,\bigl|\det \nabla \hat{T}(y)\bigr|. \tag{4}$$

Then the map T is found by minimizing the Kullback–Leibler divergence between this pullback and the data distribution ν, i.e.,

$$T = \arg\min_{\hat{T}} \, \mathbb{E}_\nu\!\left[\log\frac{\nu(y)}{\hat{\nu}(y)}\right] = \arg\min_{\hat{T}} \, \mathbb{E}_\nu\!\left[-\log \hat{\nu}(y)\right], \tag{5}$$

where we dropped ν(y) in the second equality since it is independent of the optimizer $\hat{T}$. The expectation with respect to ν is approximated using the time average of data due to ergodicity. The key feature of this computational setup is that it leads to a convex optimization problem which can be solved for each $T_i$ separately [56], thereby allowing efficient computation for large dimensions and long time series.
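To make the optimization in (5) concrete, here is a minimal one-dimensional sketch: a monotone cubic map is fitted by minimizing the empirical negative log-likelihood of the pullback density, with the −log T′(y) term acting as a barrier that keeps the map orientation-preserving at the data points. The gamma-distributed "data", the cubic parametrization and the Nelder-Mead optimizer are illustrative stand-ins for the separable convex formulation of [56].

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import skew

rng = np.random.default_rng(0)
y = rng.gamma(4.0, 1.0, 20000)           # skewed "data" distribution nu (illustrative)

def nll(c, y):
    # candidate map T(y) = a + b y + g y^2 + d y^3 and its derivative
    a, b, g, d = c
    T = a + b * y + g * y**2 + d * y**3
    Tp = b + 2 * g * y + 3 * d * y**2
    penalty = 1e6 * np.mean(Tp <= 0)     # keep the map orientation-preserving
    Tp = np.maximum(Tp, 1e-12)
    # -log of the pullback density pi(T(y)) |T'(y)|, constants dropped, cf. eq. (5)
    return np.mean(0.5 * T**2 - np.log(Tp)) + penalty

# start from the standardizing linear map
c0 = np.array([-y.mean() / y.std(), 1.0 / y.std(), 0.0, 0.0])
res = minimize(nll, c0, args=(y,), method="Nelder-Mead",
               options={"maxiter": 5000, "xatol": 1e-10, "fatol": 1e-12})
a, b, g, d = res.x
q = a + b * y + g * y**2 + d * y**3      # transported samples, approximately N(0, 1)
```

After the fit, the transported samples q should be much closer to standard normal than the raw data; in particular the skewness shrinks because the even-degree term of the map absorbs the asymmetry.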

After we have found T, we fit a system of stochastic oscillators to the time series of the random variable q = T(y). Since we chose π to be a multiplicative measure, we can do this fitting for each component of q independently. We consider the system of forced linear oscillators,

$$\ddot{q}_j + \beta_j \dot{q}_j + k_j q_j = \sqrt{2 D_j}\, \dot{W}_j, \qquad \beta_j, k_j, D_j > 0, \quad j = 1, \ldots, N, \tag{6}$$

where the $W_j$'s denote standard and mutually independent Brownian motions. Each oscillator admits a stationary density given by

$$\rho_j(q, \dot{q}) = c_j \exp\!\left[-\frac{\beta_j}{D_j}\left(\frac{\dot{q}^2}{2} + k_j \frac{q^2}{2}\right)\right], \qquad j = 1, \ldots, N, \tag{7}$$

with $c_j$ being the normalization constant. By setting $D_j = k_j \beta_j$, we make sure that the displacement of each oscillator has a standard normal distribution. Next we optimize the free parameters of each system to match the spectrum of q. To be more precise, let $S_j$ denote the power spectral density (PSD) of the stochastic process $U^t q_j$. On the other hand, the PSD of the response $q_j$ in (6) is given as

$$\hat{S}_j(\omega) = \frac{2 D_j}{\bigl|k_j + i \beta_j \omega - \omega^2\bigr|^2}, \tag{8}$$


where ω denotes the frequency [69]. Then we find the pair $(k_j, \beta_j)$ that minimizes the spectral difference

$$\Delta = \int_0^{\omega_s/2} \bigl|S_j(\omega) - \hat{S}_j(\omega)\bigr|\, d\omega, \tag{9}$$

with $\omega_s$ being the sampling frequency. We approximate $S_j(\omega)$ from the time series of $q_j = T_j(y)$ using the Welch method [77], and solve this 2D optimization problem using particle swarm optimization. Note that the spectrum of $y_j$ will be affected by the spectra of $q_1, \ldots, q_j$ due to the coupling of variables in $T^{-1}$; however, we have found that minimizing the error of the q's spectra independently is an efficient way to minimize the error in the spectrum of y (see fig. 10 for a 10-dimensional example). When the time series are not long enough for a robust approximation of the PSD, we recommend matching the autocorrelation function up to some finite time lag τ. As discussed in [33], the autocorrelation sequence corresponds to the moments of the spectral measure, and matching those moments in the limit τ → ∞ leads to matching of the PSD. At the end, the data-driven stochastic model for the system in (1) consists of the stochastic oscillators in (6) with the observation map $T^{-1}$.
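The spectral-matching step can be sketched as follows: simulate a reference oscillator of the form (6), estimate its PSD with the Welch method, and recover (k, β) by minimizing the integrated spectral difference (9) with D = kβ fixed. A plain grid search stands in here for the particle swarm optimization of the paper, and all numerical values are illustrative.

```python
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(1)

# ground-truth oscillator (illustrative): q'' + b q' + k q = sqrt(2D) dW, D = k b
k_true, b_true = 25.0, 2.0
D_true = k_true * b_true
dt, n_steps, fs = 0.02, 200_000, 50.0      # 4000 s of data sampled at 50 Hz

q, v = 0.0, 0.0
traj = np.empty(n_steps)
noise = np.sqrt(2 * D_true * dt) * rng.standard_normal(n_steps)
for i in range(n_steps):                   # Euler-Maruyama integration
    q += v * dt
    v += (-b_true * v - k_true * q) * dt + noise[i]
    traj[i] = q

f, S_data = welch(traj, fs=fs, nperseg=4096)   # one-sided PSD, per Hz

def model_psd(f, k, b):
    # eq. (8) with D = k b, folded to a one-sided per-Hz density: P(f) = 2 S(2 pi f)
    w = 2 * np.pi * f
    return 4 * (k * b) / np.abs(k + 1j * b * w - w**2) ** 2

df = f[1] - f[0]
# grid search over (k, beta) minimizing the discretized version of eq. (9)
delta, k_hat, b_hat = min(
    (np.sum(np.abs(S_data - model_psd(f, k, b))) * df, k, b)
    for k in np.linspace(5, 50, 91) for b in np.linspace(0.5, 5, 91))
```

Because D = kβ, the simulated displacement is marginally standard normal, and the fitted pair should land close to the true (k, β).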

Our framework for modeling, which is summarized in fig. 1, can be interpreted in various ways: from a machine learning viewpoint, we start with a hypothesis space consisting of stochastic linear oscillators observed under the inverse of triangular monotonic polynomials, and then we find the optimal model using maximum likelihood estimation. From the systems and signals perspective, our approach resembles classical Wiener system identification, which approximates nonlinear systems as a linear system with a nonlinear observation map [6, 65]. In the context of measure-preserving systems with discrete spectrum (i.e. flows on tori), one can discover the independent components of motion (i.e. phases along each direction of the tori, or equivalently principal Koopman eigenfunctions [50]) and model the evolution of each component as a linear system. Our framework can be considered as an extension of this strategy to measure-preserving chaotic systems. Finally, this method provides a backward technique for fitting solutions of Fokker-Planck equations to data: direct construction of an SDE which gives rise to the data distribution requires solving the Fokker-Planck equation in large dimensions, which is often difficult or impossible, but in this framework the data is mapped to a distribution which is already a solution to a well-known set of models.

The key feature of our framework is the use of measure transport to ensure accurate reproduction of the observable statistics. We note that one could use a more sophisticated class of reference measures and stochastic models (e.g. a bimodal distribution and an oscillator with a double-well potential) or a more inclusive space of observation maps, based on the complexity of the dynamics and the availability of data. We find that using polynomial maps with total degree 2 or 3, combined with linear stochastic oscillators, leads to an accurate and computationally efficient pipeline for modeling complex time-series data from fluid flows and climate.


Figure 1: Framework for stochastic modeling. A change of variables is found that transports the data distribution to the standard normal distribution, and then a stochastic oscillator is built by minimizing the difference between the PSD of the transported time series and that of the SDE model (the shaded area).

3 Results

3.1 Lorenz 96 system

The Lorenz 96 system is a toy model of atmospheric turbulence which is commonly used in the assessment of modeling, closure and data assimilation schemes [42]. This model describes a set of interacting states on a ring which evolve according to

$$\dot{x}_k = x_{k-1}(x_{k+1} - x_{k-2}) - x_k + F, \qquad k = 1, 2, \ldots, K, \tag{10}$$
$$x_0 = x_K, \quad x_{K+1} = x_1, \quad x_{-1} = x_{K-1}.$$

For the standard parameter values of K = 40 and F = 8, trajectories converge to a chaotic attractor. We assume that we can only observe the first component of the state while the trajectory is evolving on the attractor, i.e.,

$$y = g(x) = x_1. \tag{11}$$
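A minimal integrator for (10)-(11), using a standard fourth-order Runge-Kutta scheme with the cyclic indexing written via np.roll, can generate such a training time series; the time step, transient length and initial condition below are illustrative choices, not the ones used in the paper.

```python
import numpy as np

def l96_rhs(x, F=8.0):
    # dx_k/dt = x_{k-1} (x_{k+1} - x_{k-2}) - x_k + F, cyclic in k (eq. 10)
    return np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2)) - x + F

def rk4_step(x, dt, F=8.0):
    k1 = l96_rhs(x, F)
    k2 = l96_rhs(x + 0.5 * dt * k1, F)
    k3 = l96_rhs(x + 0.5 * dt * k2, F)
    k4 = l96_rhs(x + dt * k3, F)
    return x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

rng = np.random.default_rng(0)
K, dt = 40, 0.01
x = 8.0 + 0.01 * rng.standard_normal(K)   # small perturbation of the fixed point x = F
for _ in range(5000):                      # discard the transient (50 time units)
    x = rk4_step(x, dt)

y = []                                     # record y = x_1 every 0.1 time units
for _ in range(10000):                     # 1000 time units, as in the text
    for _ in range(10):
        x = rk4_step(x, dt)
    y.append(x[0])
y = np.asarray(y)
```

The recorded observable stays bounded on the attractor, with the familiar positive climatological mean of the F = 8 regime.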

We record a time series of y with length 1000 and sampling interval of 0.1. After applying our SDE modeling framework to this time series, we find the stochastic model to be

$$\ddot{q} + 4.73\,\dot{q} + 26.26\,q = 15.76\,\dot{W}, \tag{12}$$

where the state q is related to the observable y through the map¹

$$q = T(y) \approx 0.001\,y^3 - 0.010\,y^2 + 0.279\,y - 0.570. \tag{13}$$

¹In the actual computation, the polynomial is not explicitly realized as in (13). See [56] for details on the parametrization of the transport map.


The probability density function (PDF) and PSD of the observable $y = T^{-1}(q)$ are shown in fig. 2. Due to the inclusion of nonlinear terms in the transport, this model is capable of capturing the skewness in the PDF of the observable y. This is a crucial feature for models of turbulent systems, since skewness gives rise to the nonlinear cascade of energy, and (linear) models with Gaussian distribution are incapable of describing this type of energy transfer between the system components.
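As a consistency check on the identified model, note that 15.76 = √(2D) with D ≈ 4.73 × 26.26, i.e., D = kβ as prescribed after eq. (7), so the stationary displacement of (12) should be standard normal. A short Euler-Maruyama simulation (an illustrative discretization, not part of the original pipeline) confirms this:

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(2)
beta, k = 4.73, 26.26
sigma = 15.76                    # = sqrt(2 D) with D = k * beta, up to rounding
dt, n_steps = 0.002, 500_000     # 1000 time units

q, v = 0.0, 0.0
qs = np.empty(n_steps)
w = sigma * np.sqrt(dt) * rng.standard_normal(n_steps)
for i in range(n_steps):         # Euler-Maruyama integration of eq. (12)
    q += v * dt
    v += (-beta * v - k * q) * dt + w[i]
    qs[i] = q
qs = qs[n_steps // 10:]          # discard the transient
```

The simulated q has zero mean, unit variance and no skewness; the skewness of y in fig. 2 enters only through the nonlinear map $T^{-1}$.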

Figure 2: Data-driven modeling of the Lorenz 96 system. Top: true observation of a state observable of the chaotic Lorenz 96 system; middle: approximation via our stochastic modeling framework; bottom: random phase model approximation. The SDE model captures the skewness in the PDF and closely matches the spectrum. The quasi-periodic phase model, by design, matches the spectrum of the data but systematically misses the skewness in the PDF. The PSDs are computed after removing the mean.

We compare our model to a data-driven approximation of the Koopman operator for the Lorenz system. The Koopman operator is usually approximated via the Extended Dynamic Mode Decomposition (EDMD) algorithm [30, 32, 78]; however, the choice of observable dictionary in EDMD, which plays a critical role in the approximation, is not generally well resolved. The most popular choice is to include delay-embeddings of the measurements [2], but it is known that for measure-preserving chaotic systems the finite-dimensional approximations possess only eigenvalues with negative real part, which leads to trivial statistics [33]. Here, we use the phase model approximation of the observable evolution in the form of

$$y(t) = \sum_{j=1}^{n} \alpha_{j,y}\, e^{i(\omega_j t + \zeta_j)}, \tag{14}$$


where the $\omega_j$'s are randomly chosen frequencies from the support of the Koopman spectral measure for y, the $\alpha_j$'s are determined from spectral projections of y, and the $\zeta_j$'s are initial phases randomly drawn from the uniform distribution on [0, 2π) (see section 5.2 for more detail). It is shown in [33] that, as n → ∞, such an approximation converges to the true evolution of y over finite time intervals. A realization of this quasi-periodic phase model for the Lorenz state with n = 200 is shown in fig. 2. This model, by construction, accurately reproduces the spectrum of the data but fails to capture the skewness in the PDF. In fact, the phase model, which is commonly used for ocean wave modeling, is shown to converge to Gaussian statistics as n → ∞ [29]. This example highlights the fact that data-driven approximations of the Koopman operator in the form of finite-dimensional linear systems cannot reproduce the complicated statistics observed in measure-preserving chaotic systems.
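A real-valued sketch of the random phase model (14) illustrates this limitation: superposing n = 200 cosines with random frequencies, amplitudes and i.i.d. phases yields a signal whose one-point statistics are nearly Gaussian (vanishing skewness and excess kurtosis), consistent with [29], which is why the phase model cannot reproduce the skewed Lorenz PDF. The frequency band and amplitude distribution below are illustrative.

```python
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(3)
n = 200
omega = rng.uniform(0.5, 5.0, n)          # frequencies from some spectral support
alpha = rng.uniform(0.5, 1.5, n)          # spectral amplitudes
zeta = rng.uniform(0.0, 2 * np.pi, n)     # i.i.d. initial phases

t = np.arange(0.0, 2000.0, 0.05)
# y(t) = sum_j alpha_j cos(omega_j t + zeta_j), a real-valued version of eq. (14)
y = (alpha[:, None] * np.cos(omega[:, None] * t[None, :] + zeta[:, None])).sum(axis=0)
y = (y - y.mean()) / y.std()              # normalize for comparison with N(0, 1)
```

Despite the deterministic quasi-periodic dynamics, the signal's skewness and excess kurtosis are both close to zero, i.e. Gaussian-like.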


3.2 Chaotic cavity flow and Spectral Proper Orthogonal Decomposition (SPOD)

Fluid flows at large length and velocity scales are canonical examples of high-dimensional chaotic behavior. We discuss the application of our framework to such flows using a numerical model of the chaotic lid-driven cavity flow. This flow consists of an incompressible fluid in a 2D square domain, D = [−1, 1]², with solid walls, where the steady sliding of the top wall, given by

$$u(y = 1) = (1 - x^2)^2, \qquad -1 \le x \le 1, \tag{15}$$

induces a circulatory fluid motion inside the cavity. The Reynolds number of this flow is defined as Re = 2/κ, where κ is the kinematic viscosity of the fluid in the numerical simulation. The cavity flow at Re = 30,000 converges to a measure-preserving chaotic attractor exhibiting a purely continuous Koopman spectrum [3]. Here we model the evolution of the post-transient cavity flow in two steps: first, we use the modes obtained by Spectral Proper Orthogonal Decomposition (SPOD) [39, 72] as a spatial basis for description of the flow evolution. We justify the use of SPOD modes through their connection with the spectral expansion of the Koopman operator with continuous spectrum. In the second step, we use the framework based on measure transport and spectral matching to find the stochastic model that captures the evolution of the flow in the SPOD coordinates.

Consider a non-homogeneous stationary turbulent flow. The velocity field at each point is a random variable defined on the underlying measure-preserving attractor, but due to non-homogeneity these random variables have different statistics and spectra. However, since all these variables arise from the same underlying attractor, we can define characteristic spatial fields that connect the spectra of the various variables through the notion of the Koopman spectral measure. Let $u_x$ and $u_{x'}$ denote the velocity field at locations x and x′ in the flow domain. The spectral expansion of the Koopman operator for these two random variables is

$$\langle u(x),\, U^\tau u(x') \rangle_\mu = \int_0^{2\pi} e^{i\omega\tau}\, \rho_{x,x'}(\omega)\, d\omega, \tag{16}$$

where $\rho_{x,x'}$ is the Koopman cross-spectral density of $u_x$ and $u_{x'}$, and ⟨·,·⟩_µ is the inner product with respect to the invariant measure on the attractor [48]. The SPOD modes of the flow at frequency ω, denoted by ψ(x, ω), are defined as solutions of the eigenvalue problem²

$$\int_\Omega \rho_{x,x'}(\omega)\, \psi(x', \omega)\, dx' = \lambda\, \psi(x, \omega). \tag{17}$$

As such, SPOD modes are intrinsic dynamical properties that do not depend on the choice of flow realization, as opposed to dynamic modes or Fourier modes, and therefore provide a robust choice of basis for description of the flow evolution. For more details on the Koopman spectral density see section 5.1.
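The construction in (16)-(17) can be sketched on synthetic data: the cross-spectral density matrix is estimated pairwise with the Welch-type estimator scipy.signal.csd, and the SPOD modes at a given frequency are its eigenvectors. The four-channel signal, driven by a single AR(1) latent process through a fixed spatial pattern, is a hypothetical stand-in for the cavity-flow velocity data.

```python
import numpy as np
from scipy.signal import csd

rng = np.random.default_rng(4)
fs, n = 20.0, 16384
# broadband latent process: an AR(1) sequence driving all channels
z = np.empty(n)
z[0] = 0.0
for i in range(1, n):
    z[i] = 0.9 * z[i - 1] + rng.standard_normal()
pattern = np.array([1.0, 2.0, 3.0, 4.0])
pattern /= np.linalg.norm(pattern)               # planted "spatial mode"
u = pattern[:, None] * z[None, :] + 0.1 * rng.standard_normal((4, n))

# cross-spectral density matrix, the empirical analogue of rho_{x,x'}(omega)
nseg = 256
f, _ = csd(u[0], u[0], fs=fs, nperseg=nseg)
S = np.empty((len(f), 4, 4), dtype=complex)
for i in range(4):
    for j in range(4):
        _, S[:, i, j] = csd(u[i], u[j], fs=fs, nperseg=nseg)

# SPOD modes at one frequency = eigenvectors of the CSD matrix, cf. eq. (17)
idx = 5                                          # a low-frequency bin (illustrative)
lam, psi = np.linalg.eigh(S[idx])
leading = psi[:, -1]                             # eigh sorts eigenvalues ascending
```

The estimated CSD matrix is Hermitian positive semidefinite by construction, and at high signal-to-noise ratio the leading eigenvector recovers the planted spatial pattern.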

We compute the SPOD modes of the cavity flow using the algorithm in [72]. We define the SPOD coordinates to be the projection of the flow onto the 10 most energetic SPOD modes at different frequencies, which contain ∼50% of the turbulent kinetic energy, i.e.,

$$y_j(t) = \langle u(x, t),\, \psi_j \rangle_D, \qquad j = 1, 2, \ldots, 10, \tag{18}$$

where ⟨·,·⟩_D denotes the spatial inner product over the flow domain.

As the training data for our SDE model, we use a single time series of SPOD coordinates with length 2500 seconds and sampling rate 10 Hz. We use a polynomial map of (total) degree 2 to compute the transport between the distribution of SPOD coordinates and the standard normal distribution, and identify the corresponding SDE. After generating a 10000-second-long trajectory of the SDE, we compute the statistics of that trajectory, observed under the inverse of the polynomial map. Figure 3(a,b) shows the excellent agreement between the true marginals and the ones obtained from the SDE model. Next we use the SPOD data generated by the model to construct a new flow trajectory and compare the statistics of pointwise velocity measurements with the original data projected onto the 10-dimensional subspace of SPOD modes. The results, shown in fig. 3(c), confirm an accurate recovery of pointwise statistics, indicating the high skill of our model in capturing the 10-dimensional joint PDF of the flow modal data. Moreover, the match between the log PDFs shows the utility of polynomial transport for data-driven extrapolation of PDF tails, which we will explore in the next example. See section 5.3 for more details on the modeling of the cavity flow.

²In contrast to [72] we do not include the time-dependent oscillation in the definition of SPOD modes, since for chaotic systems there are no observables with single-frequency oscillations.


Figure 3: Modeling of chaotic cavity flow at Re = 30,000. a) Single and pairwise marginal PDFs of 4 (out of 10) SPOD coordinates from data; b) the same marginals produced by the SDE model. The quantile axes limits are (−0.026, 0.026). c) Locations of sensors for velocity measurements and a vorticity snapshot of the cavity flow (left), and the log PDFs of u- and v-velocity generated by the SDE model and by the 10-dimensional representation of the flow in SPOD coordinates (labeled as truth). The insets show the PDFs on a linear ordinate scale. The match between the pointwise statistics indicates the skill of the SDE model in capturing the 10-dimensional joint PDF of the SPOD coordinates.


3.3 Reanalysis climate data and tail extrapolation

Statistical characterization of dynamic extreme events, i.e. probabilistically rare events that arise from a combination of dynamics and randomness, is an important and challenging task. The importance stems from the significant and disruptive effects of extreme events such as earthquakes, rogue waves, and extreme weather patterns. Yet finding probabilities of extreme events, which reside at the tails of the PDF, is challenging because it requires a very large number of samples. The theoretical setup for the study of such events is extreme value theory (e.g. [11]), which has recently been extended to chaotic dynamical systems with some success [12, 22], but remains limited to systems with specific mixing properties. In recent years, a number of methods have emerged that employ techniques such as model reduction or machine learning for data-efficient quantification of extreme events [51, 52].

Our modeling framework can be used for the characterization of extreme events using relatively little data. The key idea is that we can model a large space of probability measures with heavy tails by using polynomial transport maps to a reference measure like the standard normal distribution3. By discovering such a transport map, we identify our data distribution with a pullback of the Gaussian probability under that map, and the resulting distribution provides an approximation of the tails of the real data distribution. Although an analytic description of the tails of these distributions is not available when the transport map is a high-degree multivariate polynomial map, we can easily generate a large sample from the reference measure and pass it through the inverse of the transport map to construct the tail approximation. In the following we consider the application of this method to climate data with super-Gaussian tails.
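A one-dimensional sketch of this tail-extrapolation idea, with a Laplace sample standing in for the heavy-tailed climate data; the cubic map `S` fitted below is an illustrative stand-in for the inverse of the triangular polynomial transport, not the paper's actual solver:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# A small training sample from a heavy-tailed (super-Gaussian) distribution.
train = rng.laplace(size=2_000)

# Fit a low-degree polynomial inverse map S with S(z) ~ data for z ~ N(0, 1):
# regress the sorted data (empirical quantiles) on Gaussian quantiles.
z = norm.ppf((np.arange(len(train)) + 0.5) / len(train))
coef = np.polyfit(z, np.sort(train), deg=3)

# Tail extrapolation: push a much larger Gaussian sample through S.  The
# surrogate reaches far beyond the range of the 2000 training points, giving
# a smooth, model-based estimate of the tail probabilities.
surrogate = np.polyval(coef, rng.standard_normal(2_000_000))
print(train.max(), surrogate.max())
```

The positive cubic coefficient is what produces the heavier-than-Gaussian tails of the pullback distribution in this 1-D toy setting.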

Our data is based on a 6-hourly reanalysis of velocity and temperature in the earth's atmosphere recorded at the 100-mbar iso-pressure surface from 1981 to 2017 [5]. These global fields are then expanded in a spherical wavelet basis [37]. Figure 4 shows the time series of the expansion coefficients for a level-1 wavelet envelope centered on top of the North Pole. These time series exhibit a combination of periodic and chaotic oscillations. Therefore, we model them as observations of a stationary system with mixed spectra, i.e., a system that possesses both discrete and continuous Koopman spectra (see e.g. [7, 48]). In the first step, we extract the periodic component of the data (i.e. the Koopman modes) by removing the Fourier modes of the time series that correspond to more than 1% of the fluctuation energy. We apply our stochastic modeling framework to the chaotic remainders with heavy tails.
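The periodic/chaotic separation described above can be sketched as a simple Fourier-energy threshold. The toy signal below (annual and daily harmonics plus red noise) and all parameter values are illustrative, not the actual reanalysis processing:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8_192
t = np.arange(n) * 0.25          # 6-hourly sampling, measured in days

# Toy series: annual and daily harmonics (discrete spectrum) plus a red-noise
# "chaotic" remainder (continuous spectrum).
periodic_true = 3.0 * np.sin(2 * np.pi * t / 365.0) + 1.5 * np.sin(2 * np.pi * t)
noise = np.zeros(n)
for k in range(1, n):
    noise[k] = 0.95 * noise[k - 1] + 0.3 * rng.standard_normal()
x = periodic_true + noise

# Flag the Fourier modes holding more than 1% of the fluctuation energy as the
# periodic (Koopman-mode) component and keep the rest as the chaotic part.
xf = np.fft.rfft(x - x.mean())
energy = np.abs(xf) ** 2
mask = energy > 0.01 * energy.sum()

periodic = np.fft.irfft(np.where(mask, xf, 0.0), n=n)
chaotic = np.fft.irfft(np.where(mask, 0.0, xf), n=n)
```

By linearity of the FFT, the two components sum exactly back to the mean-removed signal; only a handful of energetic bins are assigned to the periodic part.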

We construct a sequence of stochastic models for each of the chaotic time series obtained from measurements at the North Pole. In each model, we add an extra random variable from the velocity and temperature measurements at other locations to capture the statistical dependence between those variables and the target variable (i.e. the chaotic part of temperature or u-velocity). These covariates are chosen from the set of all wavelet coefficients (> 8000 variables) in order of decreasing absolute linear correlation with the target variables. The geographic positions of the covariates are shown in fig. 5(a). Importantly, we exploit the triangular structure of the transport in (3) and place the target variable at the bottom of each map, so that its mapping to the reference measure is informed by all the covariates. For training the models we use only the time series from 1981 and 1982 and polynomials of total

3Indeed, the representation of a distribution by transport to a reference measure is the idea behind the development of the computational framework in [56].



Figure 4: Velocity and temperature at the North Pole: The time series of the wavelet coefficients of the u-velocity and temperature fields at the North Pole (left). We extract the periodic component, consisting of Fourier modes with at least 1% of the signal variance, to isolate the chaotic part (middle), which shows super-Gaussian tails (right).

degree 2.

After discovering each model, we generate a 74-year-long trajectory of the SDE model and pull it back to the space of observations. The PDFs of the observed map provide a long-time extrapolation of the PDFs of the training data. As shown in fig. 5, those PDFs are in good agreement with the original reanalysis data of 37 years. In particular, when covariates are added to the model, the approximation of tails that are well represented in the training data (e.g. the left tails in the top and bottom rows) becomes consistently better.



Figure 5: Extrapolation of tails for climate data. a) Location of covariates used to build models for the target random variables, i.e., u-velocity and temperature at the North Pole, b) approximation of the PDF tails by generating a surrogate trajectory of the SDE model and pulling it back under the transport map. n is the number of covariates used to learn the SDE model. The training data is the time series in 1981-1982 and the truth data is the time series in 1981-2017. The shaded envelopes show the 95% (pointwise) confidence intervals of the PDF estimation for the training and truth data.


4 Discussion

We presented a data-driven framework for modeling chaotic dynamical systems in the form of decoupled stochastic oscillators with nonlinear observation maps. The key feature of our framework is the use of measure transport to model the non-Gaussian invariant measures that arise from the statistics of complex chaotic systems. Application to a high-Reynolds chaotic flow showed that models generated by our framework can accurately reproduce the 10-dimensional joint pdfs of SPOD modal coordinates computed from data and hence recover the pointwise statistics in the flow. We also showed the promise of our framework for data-driven characterization of extreme events through an example of observational climate data.

In the context of operator-theoretic approximation, our framework is closest in spirit to the approach in [14, 15], where the Perron-Frobenius operator of the underlying dynamical system is approximated using a Markov chain for transitions between computational cells in the state space (also see [24]). Although that approach reproduces both the statistics and the dynamical behavior, it is not computationally tractable for moderate- and high-dimensional systems, and for systems like complex flows one needs to use extreme coarse-graining of the dynamics, e.g. by clustering data into discrete representative states [27, 59]. Another data-driven approach for generative modeling is the Generative Adversarial Network (GAN) [23]. GANs have produced exemplary results in emulating high-dimensional and rich data distributions; however, like other deep networks, they offer little to no interpretability and lack an inherent dynamical character. Training GANs often requires non-trivial techniques, and interestingly one of the effective improvements on GAN training, the Wasserstein GAN [4], is based on ideas from measure transport. The framework presented in this paper provides a balanced trade-off for data-driven modeling of dynamical systems with complicated statistical behavior. Through the use of single-layer triangular polynomial maps, it offers much interpretability in assessing the role of different variables in the observed dynamics, and due to the convexity and separability of the underlying optimization problem it is easily extendable to tens of dimensions.

One promising direction for extension of our framework is to include physics-based constraints in the computation of the transport maps and the underlying SDE models. As shown in previous works, incorporating physical information about the target system into the structure of learning can substantially increase the data-efficiency of modeling. In the context of statistical modeling for turbulent flows and climate, statistical laws and tail slopes predicted from theory provide suitable constraints. Our framework also has great potential for data-driven modeling in further applications such as nonlinear optics, chemical reaction networks, molecular dynamics, and biological systems.


5 More details on data, computation and theory

The data for the Lorenz-96 system is generated by direct integration of eq. (11) using a 4th-order Runge-Kutta scheme with time step ∆t = 0.01. The velocity field for the lid-driven cavity flow is computed by direct solution of the Navier-Stokes equations in NEK5000 [58]. The cavity domain is divided into 25 elements in each direction, with polynomials of degree 7, and the stationary solution is recorded after 3500 transient simulation time units. The processed reanalysis climate data is provided by AIR Worldwide. The PDFs reported for the Lorenz-96 system, the cavity flow, and the climate data (truth and model) are computed using a 100-bin histogram and then smoothed by convolving with a Gaussian kernel with a standard deviation of 2 bin widths. The PDFs for the climate training data are computed using 50 bins. The confidence intervals for the climate PDFs are computed using binomial statistics and the adjusted Wald formula in [1].
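A minimal sketch of this PDF post-processing, with synthetic data standing in for the model output; the kernel construction and the adjusted Wald (Agresti-Coull) interval follow the description above and [1]:

```python
import numpy as np

rng = np.random.default_rng(3)
data = rng.standard_normal(100_000)

# 100-bin histogram normalized to a density, as described above.
counts, edges = np.histogram(data, bins=100, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
width = edges[1] - edges[0]

# Smooth by convolving with a Gaussian kernel whose standard deviation is
# 2 bin widths; the kernel is built directly in bin units (+/- 4 sigma support).
k = np.arange(-8, 9)
kernel = np.exp(-0.5 * (k / 2.0) ** 2)
kernel /= kernel.sum()
pdf = np.convolve(counts, kernel, mode="same")

# Adjusted Wald (Agresti-Coull) 95% interval for each bin's probability [1].
n_samples = len(data)
successes = counts * width * n_samples          # estimated counts per bin
p_adj = (successes + 2.0) / (n_samples + 4.0)
half_width = 1.96 * np.sqrt(p_adj * (1.0 - p_adj) / (n_samples + 4.0))
```
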

A Python implementation of our framework, with the time-series data for generating the figures in this paper, can be found at https://github.com/arbabiha/StochasticModelingwData.

5.1 Cross-spectral density of Koopman operator and Spectral Proper Orthogonal Decomposition (SPOD)

In this section, we recall the Koopman spectral expansion for chaotic systems and define the cross-spectral density of the Koopman operator, which is used in the definition of SPOD in section 3.2. Consider a deterministic dynamical system with an attractor Ω which supports a physical measure µ, with µ(Ω) = 1. Let f, g ∈ H := L^2(Ω, µ) be observables of this system, with the usual inner product

\langle f, g \rangle_\mu = \int_\Omega f g^* \, d\mu. \qquad (19)

Now recall the Koopman operator U^t, defined as U^t f = f ∘ F^t, where F^t is the reversible flow of the dynamical system. U^t is a unitary operator on H, that is, (U^t)^* = U^{-t}. This implies that the spectrum of the Koopman operator lies on the imaginary axis in the complex plane, and the spectral expansion of the Koopman operator [48, 49] is given as

U^t f = \sum_{k=0}^{\infty} v_k \phi_k e^{i\omega_k t} + \int_0^{2\pi} e^{i\omega t} \, dE_\omega f, \qquad (20)

where the countable sum is the Koopman mode decomposition of f associated with the quasi-periodic part of the evolution, iω_k is a Koopman eigenvalue (i.e. an element of the discrete spectrum) associated with the eigenfunction φ_k, and E is the spectral measure of the Koopman operator associated with the continuous part of the spectrum. To be more precise, E is a measure on [0, 2π) that takes values in the space of projections on H. That is, for every measurable set B ⊂ [0, 2π), E_B is a projection operator, and E_B f is the projection of f onto the eigen-subspace associated with the part of the spectrum residing in B. We are interested in the continuous part of the Koopman spectrum, which corresponds to chaotic behavior, and we therefore assume that there are no quasi-periodic parts, including the part with zero frequency (i.e. the mean of the observable), present in the evolution.


The operator-valued measure E is difficult to characterize using data. Instead, we can define a positive real-valued measure associated only with f, defined as follows [41],

\mu_f(B) = \langle E(B) f, f \rangle_\mu. \qquad (21)

Then, we can rewrite the spectral expansion of f as

\langle U^t f, f \rangle_\mu = \int_0^{2\pi} e^{i\omega t} \langle dE_\omega f, f \rangle_\mu = \int_0^{2\pi} e^{i\omega t} \, d\mu_f(\omega), \qquad (22)

where the second equality uses the measure µ_f defined in (21); when µ_f is absolutely continuous, dµ_f(ω) = ρ_f(ω) dω, with ρ_f the Koopman spectral density of the observable f. Similarly, we can define a measure for the spectral correlation of two distinct observables. That is,

\langle E(B) f, g \rangle_\mu = \langle f, E(B) g \rangle_\mu = \mu_{f,g}(B) \qquad (23)

defines a finite complex-valued measure µ_{f,g} on [0, 2π). Under the assumption of absolute continuity for this measure, we can write

\mu_{f,g}(B) = \int_B \rho_{f,g}(\omega) \, d\omega, \qquad (24)

with ρ_{f,g} being the Koopman cross-spectral density of the observables f and g. Consequently, we can write the following expansion for the dynamic evolution of the two observables

\langle f, U^\tau g \rangle_\mu = \int_0^{2\pi} e^{i\omega\tau} \rho_{f,g}(\omega) \, d\omega. \qquad (25)

In view of the duality between measure-preserving deterministic systems and stationary stochastic processes [17], the expansion in (25) is the same as the spectral expansion for stochastic processes used in the definition of SPOD [39, 72].
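In practice, densities like ρ_f and ρ_{f,g} are estimated from finite time series, typically with Welch's averaged-periodogram method [77]. A hedged sketch with synthetic signals (the filter, lag, and all parameter values below are illustrative, not the cavity-flow processing):

```python
import numpy as np
from scipy.signal import csd, welch

rng = np.random.default_rng(4)
fs, n = 10.0, 200_000

# Two synthetic observables of one underlying process: g is a delayed, noisy
# copy of f, so their cross-spectral density is large at low frequencies.
drive = rng.standard_normal(n)
f_obs = np.convolve(drive, np.ones(50) / 50.0, mode="same")
g_obs = np.roll(f_obs, 5) + 0.1 * rng.standard_normal(n)

# Welch estimate of the (auto-)spectral density and the cross-spectral density.
freq, S_ff = welch(f_obs, fs=fs, nperseg=4096)
_, S_fg = csd(f_obs, g_obs, fs=fs, nperseg=4096)
```

The one-sided density estimate integrates (approximately) to the signal variance, which provides a quick sanity check on the scaling.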

5.2 Construction of random phase model for chaotic systems

Let g : Ω → R be an observable on the measure-preserving dynamical system. Given the spectral density of g, Algorithm 1 generates a random phase model for the evolution of g in time. The main idea is to approximate the spectral measure of g using a sum of delta functions, that is, modeling the evolution as a rotation on tori. Using ergodicity, we can use a single realization to compute the PDF reported in section 3.1.

To see the connection with the approximation of the Koopman operator, note that the work in [33] has shown that approximations such as

U_m^1 g := \sum_{j=1}^{m} e^{2\pi i \omega_j} E(B_j) g, \qquad (26)

where the B_j's are a partition of [0, 2π) and ω_j ∈ B_j, converge to the true evolution U^1 g in L^2(Ω, µ) in the limit of an infinitely refined partition. This approximation is in the function space, and the choice of the B_j's and ω_j's is not unique. Algorithm 1 generates a realization of this approximation, with B_j = [e_{j-1}, e_j), that is spectrally consistent, i.e.,

|a_j|^2 = \int_{B_j} \rho_g(\omega) \, d\omega. \qquad (27)


Algorithm 1 Construction of random phase model from spectral density

Initialization: spectral density ρ(ω), number of intervals on the frequency domain m
1: Draw m random values of cell edges, e_j, from the uniform distribution on [0, 2π).
2: Set e_0 = 0 and e_{m+1} = 2π.
3: Sort the random cell edges to form the sequence e_j, j = 0, 1, ..., m+1, with e_{j+1} > e_j.
4: for j = 1, ..., m do
5:    Let ω_j = (e_j + e_{j-1})/2.
6:    Let ∆ω_j = e_j − e_{j-1}.
7:    Let

a_j = \sqrt{\rho(\omega_j) \, \Delta\omega_j} \approx \left( \int_{e_{j-1}}^{e_j} \rho(\omega) \, d\omega \right)^{1/2}. \qquad (28)

8: Draw ζ_j randomly and independently for each j from the uniform distribution on [0, 2π) and let

g_t = \sum_{j=1}^{m} a_j e^{i(\omega_j t + \zeta_j)}. \qquad (29)
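Algorithm 1 can be implemented in a few lines of vectorized code. This sketch uses all m + 1 cells produced by the padded random edges (a minor variation on the loop over j = 1, ..., m above) and tests it on a flat density normalized to unit variance; the function name and test case are illustrative:

```python
import numpy as np

def random_phase_model(rho, m, t, rng=None):
    """Generate one realization of the random phase model (Algorithm 1).

    rho : callable, spectral density on [0, 2*pi)
    m   : number of random frequency cell edges
    t   : array of times at which to evaluate the signal
    """
    rng = np.random.default_rng(rng)
    # Steps 1-3: random cell edges on [0, 2*pi), sorted, padded with 0 and 2*pi.
    e = np.sort(rng.uniform(0.0, 2 * np.pi, size=m))
    e = np.concatenate([[0.0], e, [2 * np.pi]])
    # Steps 5-6: cell midpoints and widths.
    w = 0.5 * (e[1:] + e[:-1])
    dw = np.diff(e)
    # Step 7: spectrally consistent amplitudes, |a_j|^2 ~ integral of rho over the cell.
    a = np.sqrt(rho(w) * dw)
    # Step 8: independent uniform phases; superpose rotations on the torus.
    zeta = rng.uniform(0.0, 2 * np.pi, size=len(w))
    return np.exp(1j * (np.outer(t, w) + zeta)) @ a

# Example: flat density (white spectrum) normalized so the variance is 1.
t = np.linspace(0.0, 100.0, 2_001)
g = random_phase_model(lambda w: np.ones_like(w) / (2 * np.pi), m=500, t=t)
```

Since Σ_j |a_j|² ≈ ∫ρ dω, the time-averaged power of the realization matches the integral of the prescribed density.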


5.3 Supplemental information and figures for cavity flow

The data for the cavity flow consists of a single trajectory with the length of 12000 seconds. We have removed the mean flow from the data and then applied the algorithm in [72] to the first 2000 seconds to compute the SPOD modes of the flow shown in fig. 6. We then projected the next 10000 seconds onto the 10 most energetic modes, using eq. (18), to obtain the SPOD coordinates. We have used the first 2500 seconds of the SPOD coordinate time series, with a sampling rate of 10 Hz, for training the SDE model. The marginal distributions of flow data (fig. 7) are computed using the whole trajectory. An SDE trajectory of the same length is used to compute the marginal distributions in fig. 3 and fig. 8. To compute the pointwise statistics, we reconstruct the flow field via

u(x) = \sum_{j=1}^{N} \sum_{k=1}^{N} H_{jk} y_k \psi_j(x) \qquad (30)

where the matrix H is the inverse of the Gramian matrix G defined as G_{jk} = \langle \psi_j, \psi_k \rangle_D.4

The results in this paper are compiled using the real part of the SPOD modes for simplicity in presentation.

4The matrix H appears to compensate for the non-orthogonality of the SPOD modes.
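Equation (30) amounts to two matrix-vector products. A small self-contained check with random stand-in modes (the arrays and sizes below are hypothetical, not the actual SPOD modes) verifies that H = G^{-1} makes the reconstruction consistent with the projection:

```python
import numpy as np

rng = np.random.default_rng(5)
npts, N = 400, 10

# Stand-in for N non-orthogonal modes sampled at npts grid points (columns of
# Psi) and one vector of modal coordinates y; the constant shift makes the
# columns correlated, i.e. non-orthogonal.
Psi = rng.standard_normal((npts, N)) + 0.1
y = rng.standard_normal(N)

# Gramian G_{jk} = <psi_j, psi_k>_D (discrete inner product) and H = G^{-1};
# H compensates for the non-orthogonality of the modes, as in eq. (30).
G = Psi.T @ Psi / npts
H = np.linalg.inv(G)

# u(x) = sum_{j,k} H_{jk} y_k psi_j(x)  =  Psi @ (H @ y)
u = Psi @ (H @ y)

# Consistency: projecting u back onto the modes recovers y exactly, because
# <u, psi_l>_D = (G H y)_l = y_l.
y_back = Psi.T @ u / npts
```
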



Figure 6: Spectral proper orthogonal decomposition (SPOD) for cavity flow at Re = 30000. a) Distribution of the kinetic energy within the SPOD modes vs frequency; for each frequency we have 155 spatially orthogonal modes. b) (Real part of) vorticity for the 10 most energetic SPOD modes, corresponding to the 10 highest circles in (a). The red color marks clockwise rotation and the blue counter-clockwise.


Figure 7: Marginal distributions of SPOD coordinates in cavity flow from the flow simulation. The quantile axes limits are (−0.026, 0.026).


Figure 8: Marginal distributions of SPOD coordinates in cavity flow from the SDE model. The quantile axes limits are (−0.026, 0.026). Compare to the truth in fig. 7.



Figure 9: Time evolution of SPOD coordinates for cavity flow, with samples from the truth data (numerical simulation of the flow, shown on the left) and a realization from the SDE model (right).



Figure 10: Power spectral density (PSD) of the SPOD coordinates y_j from the flow simulation and the SDE model (left), and of the transported variables q_j = T_j(y) (right).


Acknowledgements

We are grateful to Dr. Boyko Dodov and AIR Worldwide for providing the reanalysis climate data, as well as Prof. Christian Lessig who performed the projection of the data onto a spherical wavelet basis. We also thank Profs. Igor Mezic and Yannis Kevrekidis for instructive discussions and for pointing out related references. H.A. is grateful to Antoine Blanchard for notes on the usage of NEK5000. This research was supported in part by the ARO-MURI grant W911NF-17-1-0306. T.S. acknowledges support from MIT Sea Grant through the Doherty Associate Professorship and AIR Worldwide.

Data and source code

A Python implementation of our framework, with the time-series data for generating the figures in this paper, is available at https://github.com/arbabiha/StochasticModelingwData.


References

[1] A. Agresti and B. A. Coull, Approximate is better than exact for interval estimation of binomial proportions, The American Statistician, 52 (1998), pp. 119–126.

[2] H. Arbabi and I. Mezic, Ergodic theory, dynamic mode decomposition, and computation of spectral properties of the Koopman operator, SIAM Journal on Applied Dynamical Systems, 16 (2017), pp. 2096–2126.

[3] H. Arbabi and I. Mezic, Study of dynamics in post-transient flows using Koopman mode decomposition, Phys. Rev. Fluids, 2 (2017), p. 124402.

[4] M. Arjovsky, S. Chintala, and L. Bottou, Wasserstein GAN, arXiv preprint arXiv:1701.07875, (2017).

[5] P. Berrisford, D. Dee, P. Poli, R. Brugge, M. Fielding, M. Fuentes, P. Kallberg, S. Kobayashi, S. Uppala, and A. Simmons, The ERA-Interim archive version 2.0, ECMWF Report, (2011).

[6] S. Boyd and L. Chua, Fading memory and the problem of approximating nonlinear operators with Volterra series, IEEE Transactions on Circuits and Systems, 32 (1985), pp. 1150–1161.

[7] H. Broer and F. Takens, Mixed spectra and rotational symmetry, Archive for Rational Mechanics and Analysis, 124 (1993), pp. 13–42.

[8] B. W. Brunton, L. A. Johnson, J. G. Ojemann, and J. N. Kutz, Extracting spatial-temporal coherent patterns in large-scale neural recordings using dynamic mode decomposition, Journal of Neuroscience Methods, 258 (2016), pp. 1–15.

[9] N. Chen, A. J. Majda, and D. Giannakis, Predicting the cloud patterns of the Madden-Julian oscillation through a low-order nonlinear stochastic model, Geophysical Research Letters, 41 (2014), pp. 5612–5619.

[10] R. R. Coifman and S. Lafon, Diffusion maps, Applied and Computational Harmonic Analysis, 21 (2006), pp. 5–30.

[11] S. Coles, J. Bawa, L. Trenner, and P. Dorazio, An introduction to statistical modeling of extreme values, vol. 208, Springer, 2001.

[12] P. Collet, Statistics of closest return for some non-uniformly hyperbolic systems, Ergodic Theory and Dynamical Systems, 21 (2001), pp. 401–420.

[13] A. de la Torre and J. Burguete, Slow dynamics in a turbulent von Kármán swirling flow, Physical Review Letters, 99 (2007), p. 054101.

[14] M. Dellnitz and O. Junge, On the approximation of complicated dynamical behavior, SIAM Journal on Numerical Analysis, 36 (1999), pp. 491–515.


[15] ——, Set oriented numerical methods for dynamical systems, Handbook of Dynamical Systems, 2 (2002), p. 221.

[16] T. DelSole, Stochastic models of quasigeostrophic turbulence, Surveys in Geophysics, 25 (2004), pp. 107–149.

[17] J. L. Doob, Stochastic processes, vol. 7, Wiley New York, 1953.

[18] C. J. Dsilva, R. Talmon, N. Rabin, R. R. Coifman, and I. G. Kevrekidis, Nonlinear intrinsic variables and state reconstruction in multiscale simulations, The Journal of Chemical Physics, 139 (2013), p. 11B608 1.

[19] T. A. El Moselhy and Y. M. Marzouk, Bayesian inference with optimal maps, Journal of Computational Physics, 231 (2012), pp. 7815–7850.

[20] N. B. Erichson, S. L. Brunton, and J. N. Kutz, Compressed dynamic mode decomposition for background modeling, Journal of Real-Time Image Processing, (2016), pp. 1–14.

[21] B. F. Farrell and P. J. Ioannou, A theory for the statistical equilibrium energy spectrum and heat flux produced by transient baroclinic waves, Journal of the Atmospheric Sciences, 51 (1994), pp. 2685–2698.

[22] A. C. M. Freitas and J. M. Freitas, On the link between dependence and independence in extreme value theory for dynamical systems, Statistics & Probability Letters, 78 (2008), pp. 1088–1093.

[23] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, Generative adversarial nets, in Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.

[24] N. Govindarajan, R. Mohr, S. Chandrasekaran, and I. Mezic, On the approximation of Koopman spectra for measure preserving transformations, arXiv preprint arXiv:1803.03920, (2018).

[25] J. Harlim and A. Majda, Filtering nonlinear dynamical systems with linear stochastic models, Nonlinearity, 21 (2008), p. 1281.

[26] K. Hasselmann, Stochastic climate models part I. Theory, Tellus, 28 (1976), pp. 473–485.

[27] E. Kaiser, B. R. Noack, L. Cordier, A. Spohn, M. Segond, M. Abel, G. Daviller, J. Osth, S. Krajnovic, and R. K. Niven, Cluster-based reduced-order modelling of a mixing layer, Journal of Fluid Mechanics, 754 (2014), pp. 365–414.

[28] I. Kevrekidis, C. W. Rowley, and M. O. Williams, A kernel-based method for data-driven Koopman spectral analysis, Journal of Computational Dynamics, 2 (2015), pp. 247–265.


[29] B. Kinsman, Wind waves: their generation and propagation on the ocean surface, Courier Corporation, 1984.

[30] S. Klus, P. Koltai, and C. Schutte, On the numerical approximation of the Perron-Frobenius and Koopman operator, arXiv preprint arXiv:1512.05997, (2015).

[31] B. O. Koopman, Hamiltonian systems and transformation in Hilbert space, Proceedings of the National Academy of Sciences, 17 (1931), pp. 315–318.

[32] M. Korda and I. Mezic, On convergence of extended dynamic mode decomposition to the Koopman operator, Journal of Nonlinear Science, (2017), pp. 1–24.

[33] M. Korda, M. Putinar, and I. Mezic, Data-driven spectral analysis of the Koopman operator, Applied and Computational Harmonic Analysis, (2018).

[34] S. Kravtsov, D. Kondrashov, and M. Ghil, Multilevel regression modeling of nonlinear processes: Derivation and applications to climatic variability, Journal of Climate, 18 (2005), pp. 4404–4424.

[35] A. Lasota and M. C. Mackey, Chaos, Fractals and Noise, Springer-Verlag, New York, 1994.

[36] C. Leith, Climate response and fluctuation dissipation, Journal of the Atmospheric Sciences, 32 (1975), pp. 2022–2026.

[37] C. Lessig, Divergence free polar wavelets for the analysis and representation of fluid flows, Journal of Mathematical Fluid Mechanics, 21 (2019), p. 18.

[38] Q. Li, F. Dietrich, E. M. Bollt, and I. G. Kevrekidis, Extended dynamic mode decomposition with dictionary learning: A data-driven adaptive spectral decomposition of the Koopman operator, Chaos: An Interdisciplinary Journal of Nonlinear Science, 27 (2017), p. 103111.

[39] J. L. Lumley, Stochastic tools in turbulence, Academic Press, 1970.

[40] B. Lusch, J. N. Kutz, and S. L. Brunton, Deep learning for universal linear embeddings of nonlinear dynamics, Nature Communications, 9 (2018), p. 4950.

[41] B. MacCluer, Elementary functional analysis, vol. 253, Springer Science & Business Media, 2008.

[42] A. Majda, R. V. Abramov, and M. J. Grote, Information theory and stochastics for multiscale nonlinear systems, vol. 25, American Mathematical Soc., 2005.

[43] A. Majda, I. Timofeyev, and E. Vanden-Eijnden, Stochastic models for selected slow variables in large deterministic systems, Nonlinearity, 19 (2006), p. 769.

[44] A. J. Majda and J. Harlim, Physics constrained nonlinear regression models for time series, Nonlinearity, 26 (2012), p. 201.


[45] A. J. Majda, I. Timofeyev, and E. V. Eijnden, Models for stochastic climate prediction, Proceedings of the National Academy of Sciences, 96 (1999), pp. 14687–14691.

[46] A. J. Majda, I. Timofeyev, and E. Vanden Eijnden, A mathematical framework for stochastic climate models, Communications on Pure and Applied Mathematics: A Journal Issued by the Courant Institute of Mathematical Sciences, 54 (2001), pp. 891–974.

[47] Y. Marzouk, T. Moselhy, M. Parno, and A. Spantini, Sampling via measure transport: An introduction, Handbook of Uncertainty Quantification, (2016), pp. 1–41.

[48] I. Mezic, Spectral properties of dynamical systems, model reduction and decompositions, Nonlinear Dynamics, 41 (2005), pp. 309–325.

[49] I. Mezic, Analysis of fluid flows via spectral properties of the Koopman operator, Annual Review of Fluid Mechanics, 45 (2013), pp. 357–378.

[50] ——, Koopman operator spectrum and data analysis, arXiv preprint arXiv:1702.07597, (2017).

[51] M. A. Mohamad, W. Cousins, and T. P. Sapsis, A probabilistic decomposition-synthesis method for the quantification of rare events due to internal instabilities, Journal of Computational Physics, 322 (2016), pp. 288–308.

[52] M. A. Mohamad and T. P. Sapsis, Sequential sampling strategy for extreme event statistics in nonlinear dynamical systems, Proceedings of the National Academy of Sciences, 115 (2018), pp. 11138–11143.

[53] B. Nadler, S. Lafon, R. R. Coifman, and I. G. Kevrekidis, Diffusion maps, spectral clustering and reaction coordinates of dynamical systems, Applied and Computational Harmonic Analysis, 21 (2006), pp. 113–127.

[54] F. Noe and F. Nuske, A variational approach to modeling slow processes in stochastic dynamical systems, Multiscale Modeling & Simulation, 11 (2013), pp. 635–655.

[55] S. E. Otto and C. W. Rowley, Linearly recurrent autoencoder networks for learning dynamics, SIAM Journal on Applied Dynamical Systems, 18 (2019), pp. 558–593.

[56] M. D. Parno and Y. M. Marzouk, Transport map accelerated Markov chain Monte Carlo, SIAM/ASA Journal on Uncertainty Quantification, 6 (2018), pp. 645–682.

[57] J. Pathak, B. Hunt, M. Girvan, Z. Lu, and E. Ott, Model-free prediction of large spatiotemporally chaotic systems from data: A reservoir computing approach, Physical Review Letters, 120 (2018), p. 024102.

[58] P. F. Fischer, J. W. Lottes, and S. G. Kerkemeier, nek5000 Web page, 2008. http://nek5000.mcs.anl.gov.


[59] G. Perez-Hernandez, F. Paul, T. Giorgino, G. De Fabritiis, and F. Noe,Identification of slow molecular order parameters for markov model construction, TheJournal of chemical physics, 139 (2013), p. 07B604 1.

[60] G. Peyre, M. Cuturi, et al., Computational optimal transport, Foundations andTrends in Machine Learning, 11 (2019), pp. 355–607.

[61] F. Raak, Y. Susuki, and T. Hikihara, Data-driven partitioning of power networksvia koopman mode analysis, IEEE Transactions on Power Systems, 31 (2015), pp. 2799–2808.

[62] M. Raissi, P. Perdikaris, and G. E. Karniadakis, Numerical gaussian processesfor time-dependent and nonlinear partial differential equations, SIAM Journal on Sci-entific Computing, 40 (2018), pp. A172–A198.

[63] G. Rigas, A. Morgans, R. Brackston, and J. Morrison, Diffusive dynamicsand stochastic models of turbulent axisymmetric wakes, Journal of Fluid Mechanics, 778(2015).

[64] C. Rowley, I. Mezic, S. Bagheri, P. Schlatter, and D. Henningson, Spectralanalysis of nonlinear flows, Journal of Fluid Mechanics, 641 (2009), pp. 115–127.

[65] W. J. Rugh, Nonlinear system theory, Johns Hopkins University Press Baltimore,1981.

[66] P. J. Schmid, Dynamic mode decomposition of numerical and experimental data, Jour-nal of Fluid Mechanics, 656 (2010), pp. 5–28.

[67] P. J. Schmid, L. Li, M. Juniper, and O. Pust, Applications of the dynamic mode decomposition, Theoretical and Computational Fluid Dynamics, 25 (2011), pp. 249–259.

[68] A. Singer and R. R. Coifman, Non-linear independent component analysis with diffusion maps, Applied and Computational Harmonic Analysis, 25 (2008), pp. 226–239.

[69] K. Sobczyk, Stochastic differential equations: with applications to physics and engineering, vol. 40, Springer Science & Business Media, 2013.

[70] K. R. Sreenivasan, A. Bershadskii, and J. Niemela, Mean wind and its reversal in thermal convection, Physical Review E, 65 (2002), p. 056306.

[71] N. Takeishi, Y. Kawahara, and T. Yairi, Learning Koopman invariant subspaces for dynamic mode decomposition, in Advances in Neural Information Processing Systems, 2017, pp. 1130–1140.

[72] A. Towne, O. T. Schmidt, and T. Colonius, Spectral proper orthogonal decomposition and its relationship to dynamic mode decomposition and resolvent analysis, Journal of Fluid Mechanics, 847 (2018), pp. 821–867.


[73] C. Villani, Optimal transport: old and new, vol. 338, Springer Science & Business Media, 2008.

[74] Z. Y. Wan and T. P. Sapsis, Machine learning the kinematics of spherical particles in fluid flows, Journal of Fluid Mechanics, 857 (2018).

[75] Z. Y. Wan, P. Vlachas, P. Koumoutsakos, and T. Sapsis, Data-assisted reduced-order modeling of extreme events in complex dynamical systems, PLoS ONE, 13 (2018), p. e0197704.

[76] J.-X. Wang, J.-L. Wu, and H. Xiao, Physics-informed machine learning approach for reconstructing Reynolds stress modeling discrepancies based on DNS data, Physical Review Fluids, 2 (2017), p. 034603.

[77] P. Welch, The use of fast Fourier transform for the estimation of power spectra: a method based on time averaging over short, modified periodograms, IEEE Transactions on Audio and Electroacoustics, 15 (1967), pp. 70–73.

[78] M. O. Williams, I. G. Kevrekidis, and C. W. Rowley, A data-driven approximation of the Koopman operator: Extending dynamic mode decomposition, Journal of Nonlinear Science, 25 (2015), pp. 1307–1346.

[79] E. Yeung, S. Kundu, and N. Hodas, Learning deep neural network representations for Koopman operators of nonlinear dynamical systems, arXiv preprint arXiv:1708.06850, (2017).

[80] A. Zare, M. R. Jovanovic, and T. T. Georgiou, Colour of turbulence, Journal of Fluid Mechanics, 812 (2017), pp. 636–680.

[81] Z. Zhao and D. Giannakis, Analog forecasting with dynamics-adapted kernels, Nonlinearity, 29 (2016), p. 2888.
