
STOCHASTIC PROCESSES AND APPLICATIONS

G.A. Pavliotis
Department of Mathematics
Imperial College London

November 11, 2015

Contents

1 Stochastic Processes
  1.1 Definition of a Stochastic Process
  1.2 Stationary Processes
  1.3 Brownian Motion
  1.4 Examples of Stochastic Processes
  1.5 The Karhunen-Loève Expansion
  1.6 Discussion and Bibliography
  1.7 Exercises

2 Diffusion Processes
  2.1 Examples of Markov Processes
  2.2 The Chapman-Kolmogorov Equation
  2.3 The Generator of a Markov Process
  2.4 Ergodic Markov Processes
  2.5 The Kolmogorov Equations
  2.6 Discussion and Bibliography
  2.7 Exercises

3 Introduction to SDEs
  3.1 Introduction
  3.2 The Itô and Stratonovich Stochastic Integrals
  3.3 Solutions of SDEs
  3.4 Itô's Formula
  3.5 Examples of SDEs
  3.6 Lamperti's Transformation and Girsanov's Theorem
  3.7 Linear Stochastic Differential Equations
  3.8 Discussion and Bibliography
  3.9 Exercises

4 The Fokker-Planck Equation
  4.1 Basic Properties of the Fokker-Planck Equation
  4.2 Examples of the Fokker-Planck Equation
  4.3 Diffusion Processes in One Dimension
  4.4 The Ornstein-Uhlenbeck Process
  4.5 The Smoluchowski Equation
  4.6 Reversible Diffusions
  4.7 Eigenfunction Expansions
  4.8 Markov Chain Monte Carlo
  4.9 Reduction to a Schrödinger Operator
  4.10 Discussion and Bibliography
  4.11 Exercises

Appendix A Frequently Used Notation

Appendix B Elements of Probability Theory
  B.1 Basic Definitions from Probability Theory
  B.2 Random Variables
  B.3 Conditional Expectation
  B.4 The Characteristic Function
  B.5 Gaussian Random Variables
    B.5.1 Gaussian Measures in Hilbert Spaces
  B.6 Types of Convergence and Limit Theorems
  B.7 Discussion and Bibliography

Index

Bibliography

Chapter 1

Introduction to Stochastic Processes

In this chapter we present some basic results from the theory of stochastic processes and investigate the properties of some of the standard continuous-time stochastic processes. In Section 1.1 we give the definition of a stochastic process. In Section 1.2 we present some properties of stationary stochastic processes. In Section 1.3 we introduce Brownian motion and study some of its properties. Various examples of stochastic processes in continuous time are presented in Section 1.4. The Karhunen-Loève expansion, one of the most useful tools for representing stochastic processes and random fields, is presented in Section 1.5. Further discussion and bibliographical comments are presented in Section 1.6. Section 1.7 contains exercises.

1.1 Definition of a Stochastic Process

Stochastic processes describe dynamical systems whose time-evolution is of probabilistic nature. The precise definition is given below.¹

Definition 1.1 (stochastic process). Let $T$ be an ordered set, $(\Omega, \mathcal{F}, \mathbb{P})$ a probability space and $(E, \mathcal{G})$ a measurable space. A stochastic process is a collection of random variables $X = \{X_t;\ t \in T\}$ where, for each fixed $t \in T$, $X_t$ is a random variable from $(\Omega, \mathcal{F}, \mathbb{P})$ to $(E, \mathcal{G})$. $\Omega$ is known as the sample space, whereas $E$ is the state space of the stochastic process $X_t$.

The set $T$ can be either discrete, for example the set of positive integers $\mathbb{Z}^+$, or continuous, $T = \mathbb{R}^+$. The state space $E$ will usually be $\mathbb{R}^d$ equipped with the $\sigma$-algebra of Borel sets.

A stochastic process $X$ may be viewed as a function of both $t \in T$ and $\omega \in \Omega$. We will sometimes write $X(t)$, $X(t, \omega)$ or $X_t(\omega)$ instead of $X_t$. For a fixed sample point $\omega \in \Omega$, the function $X_t(\omega) : T \mapsto E$ is called a (realization, trajectory) of the process $X$.

Definition 1.2 (finite dimensional distributions). The finite dimensional distributions (fdd) of a stochastic process are the distributions of the $E^k$-valued random variables $(X(t_1), X(t_2), \ldots, X(t_k))$ for arbitrary positive integer $k$ and arbitrary times $t_i \in T$, $i \in \{1, \ldots, k\}$:

\[
F(x) = \mathbb{P}(X(t_i) \leq x_i,\ i = 1, \ldots, k),
\]

with $x = (x_1, \ldots, x_k)$.

¹ The notation and basic definitions from probability theory that we will use can be found in Appendix B.

From experiments or numerical simulations we can only obtain information about the finite dimensional distributions of a process. A natural question arises: are the finite dimensional distributions of a stochastic process sufficient to determine a stochastic process uniquely? This is true for processes with continuous paths², which is the class of stochastic processes that we will study in these notes.

Definition 1.3. We say that two processes $X_t$ and $Y_t$ are equivalent if they have the same finite dimensional distributions.

² In fact, all we need is for the stochastic process to be separable. See the discussion in Section 1.6.

Gaussian stochastic processes

A very important class of continuous-time processes is that of Gaussian processes, which arise in many applications.

Definition 1.4. A one dimensional continuous-time Gaussian process is a stochastic process for which $E = \mathbb{R}$ and all the finite dimensional distributions are Gaussian, i.e. every finite dimensional vector $(X_{t_1}, X_{t_2}, \ldots, X_{t_k})$ is a $\mathcal{N}(\mu_k, K_k)$ random variable for some vector $\mu_k$ and a symmetric nonnegative definite matrix $K_k$, for all $k = 1, 2, \ldots$ and for all $t_1, t_2, \ldots, t_k$.

From the above definition we conclude that the finite dimensional distributions of a Gaussian continuous-time stochastic process are Gaussian with probability distribution function

\[
\gamma_{\mu_k, K_k}(x) = (2\pi)^{-k/2} (\det K_k)^{-1/2} \exp\left[ -\tfrac{1}{2} \langle K_k^{-1}(x - \mu_k),\, x - \mu_k \rangle \right],
\]

where $x = (x_1, x_2, \ldots, x_k)$.

It is straightforward to extend the above definition to arbitrary dimensions. A Gaussian process $x(t)$ is characterized by its mean

\[
m(t) := \mathbb{E}\, x(t)
\]

and the covariance (or autocorrelation) matrix

\[
C(t, s) = \mathbb{E}\Big( \big(x(t) - m(t)\big) \otimes \big(x(s) - m(s)\big) \Big).
\]

Thus, the first two moments of a Gaussian process are sufficient for a complete characterization of the process.

It is not difficult to simulate Gaussian stochastic processes on a computer. Given a random number generator that generates $\mathcal{N}(0,1)$ (pseudo)random numbers, we can sample from a Gaussian stochastic process by calculating the square root of the covariance. A simple algorithm for constructing a skeleton of a continuous-time Gaussian process is the following:

• Fix $\Delta t$ and define $t_j = (j-1)\Delta t$, $j = 1, \ldots, N$.

• Set $X_j := X(t_j)$ and define the Gaussian random vector $X^N = \{X^N_j\}_{j=1}^{N}$. Then $X^N \sim \mathcal{N}(\mu^N, \Gamma^N)$ with $\mu^N = (\mu(t_1), \ldots, \mu(t_N))$ and $\Gamma^N_{ij} = C(t_i, t_j)$.

• Then $X^N = \mu^N + \Lambda\, \mathcal{N}(0, I)$ with $\Gamma^N = \Lambda \Lambda^T$.

We can calculate the square root of the covariance matrix $C$ either using the Cholesky factorization, via the spectral decomposition of $C$, or by using the singular value decomposition (SVD).
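As an illustration of this algorithm, here is a minimal Python/NumPy sketch (not part of the original notes; the helper name, grid parameters and jitter term are illustrative choices): it builds the covariance matrix on a uniform grid, takes its Cholesky factor, and maps iid $\mathcal{N}(0,1)$ samples to sample paths. The covariance used in the example call, $C(t,s) = \min(t,s)$, is the Brownian motion covariance from the examples below.

```python
import numpy as np

def sample_gaussian_process(cov, N=500, dt=1.0/500, n_paths=5, seed=0):
    """Sample paths of a mean-zero Gaussian process with covariance function cov(t, s)."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, N + 1) * dt                              # grid t_j = j * dt
    Gamma = cov(t[:, None], t[None, :])                       # Gamma_ij = C(t_i, t_j)
    Lambda = np.linalg.cholesky(Gamma + 1e-12 * np.eye(N))    # square root via Cholesky (jitter for stability)
    xi = rng.standard_normal((N, n_paths))                    # iid N(0, 1) samples
    return t, Lambda @ xi                                     # each column is one sample path

# Brownian motion covariance C(t, s) = min(t, s)
t, X = sample_gaussian_process(lambda t, s: np.minimum(t, s))
```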

Examples of Gaussian stochastic processes

• Random Fourier series: let $\xi_j, \zeta_j \sim \mathcal{N}(0, 1)$, $j = 1, \ldots, N$, and define

\[
X(t) = \sum_{j=1}^{N} \big( \xi_j \cos(2\pi j t) + \zeta_j \sin(2\pi j t) \big).
\]

• Brownian motion is a Gaussian process with $m(t) = 0$, $C(t, s) = \min(t, s)$.

• Brownian bridge is a Gaussian process with $m(t) = 0$, $C(t, s) = \min(t, s) - ts$.

• The Ornstein-Uhlenbeck process is a Gaussian process with $m(t) = 0$, $C(t, s) = \lambda e^{-\alpha|t - s|}$ with $\alpha, \lambda > 0$.

1.2 Stationary Processes

For many stochastic processes that appear in applications, the statistics remain invariant under time translations. Such stochastic processes are called stationary. It is possible to develop quite a general theory for stochastic processes that enjoy this symmetry property. It is useful to distinguish between stochastic processes for which all finite dimensional distributions are translation invariant (strictly stationary processes) and processes for which this translation invariance holds only for the first two moments (weakly stationary processes).

Strictly Stationary Processes

Definition 1.5. A stochastic process is called (strictly) stationary if all finite dimensional distributions are invariant under time translation: for any integer $k$ and times $t_i \in T$, the distribution of $(X(t_1), X(t_2), \ldots, X(t_k))$ is equal to that of $(X(s+t_1), X(s+t_2), \ldots, X(s+t_k))$ for any $s$ such that $s + t_i \in T$ for all $i \in \{1, \ldots, k\}$. In other words,

\[
\mathbb{P}(X_{t_1+s} \in A_1, X_{t_2+s} \in A_2, \ldots, X_{t_k+s} \in A_k) = \mathbb{P}(X_{t_1} \in A_1, X_{t_2} \in A_2, \ldots, X_{t_k} \in A_k), \quad \forall s \in T.
\]

Example 1.6. Let $Y_0, Y_1, \ldots$ be a sequence of independent, identically distributed random variables and consider the stochastic process $X_n = Y_n$. Then $X_n$ is a strictly stationary process (see Exercise 1). Assume furthermore that $\mathbb{E} Y_0 = \mu < +\infty$. Then, by the strong law of large numbers, Equation (B.26), we have that

\[
\frac{1}{N} \sum_{j=0}^{N-1} X_j = \frac{1}{N} \sum_{j=0}^{N-1} Y_j \to \mathbb{E} Y_0 = \mu,
\]

almost surely. In fact, the Birkhoff ergodic theorem states that, for any function $f$ such that $\mathbb{E} f(Y_0) < +\infty$, we have that

\[
\lim_{N \to +\infty} \frac{1}{N} \sum_{j=0}^{N-1} f(X_j) = \mathbb{E} f(Y_0), \qquad (1.1)
\]

almost surely. The sequence of iid random variables is an example of an ergodic strictly stationary process.

We will say that a stationary stochastic process that satisfies (1.1) is ergodic. For such processes we can calculate expectation values of observables, $\mathbb{E} f(X_t)$, using a single sample path, provided that it is long enough ($N \gg 1$).
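A quick numerical illustration of (1.1), not part of the original text: for an iid sequence the running time average of $f(X_j)$ settles down to $\mathbb{E} f(Y_0)$. The choice of an exponential distribution and of $f(x) = x^2$ below is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000
Y = rng.exponential(scale=2.0, size=N)       # iid sequence with EY_0 = 2
f = lambda x: x**2                           # observable; E f(Y_0) = 2 * scale^2 = 8 here

running_avg = np.cumsum(f(Y)) / np.arange(1, N + 1)
print(running_avg[[99, 9_999, 99_999]])      # approaches E f(Y_0) = 8 as N grows
```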

Example 1.7. Let $Z$ be a random variable and define the stochastic process $X_n = Z$, $n = 0, 1, 2, \ldots$. Then $X_n$ is a strictly stationary process (see Exercise 2). We can calculate the long time average of this stochastic process:

\[
\frac{1}{N} \sum_{j=0}^{N-1} X_j = \frac{1}{N} \sum_{j=0}^{N-1} Z = Z,
\]

which is independent of $N$ and does not converge to the mean of the stochastic process $\mathbb{E} X_n = \mathbb{E} Z$ (assuming that it is finite), or to any other deterministic number. This is an example of a non-ergodic process.

Second Order Stationary Processes

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. Let $X_t$, $t \in T$ (with $T = \mathbb{R}$ or $\mathbb{Z}$), be a real-valued random process on this probability space with finite second moment, $\mathbb{E}|X_t|^2 < +\infty$ (i.e. $X_t \in L^2(\Omega, \mathbb{P})$ for all $t \in T$). Assume that it is strictly stationary. Then

\[
\mathbb{E}(X_{t+s}) = \mathbb{E} X_t, \quad s \in T, \qquad (1.2)
\]

from which we conclude that $\mathbb{E} X_t$ is constant, and

\[
\mathbb{E}\big( (X_{t_1+s} - \mu)(X_{t_2+s} - \mu) \big) = \mathbb{E}\big( (X_{t_1} - \mu)(X_{t_2} - \mu) \big), \quad s \in T, \qquad (1.3)
\]

which implies that the covariance function depends on the difference between the two times $t$ and $s$:

\[
C(t, s) = C(t - s).
\]

This motivates the following definition.

Definition 1.8. A stochastic process $X_t \in L^2$ is called second-order stationary, wide-sense stationary or weakly stationary if the first moment $\mathbb{E} X_t$ is a constant and the covariance function $\mathbb{E}(X_t - \mu)(X_s - \mu)$ depends only on the difference $t - s$:

\[
\mathbb{E} X_t = \mu, \qquad \mathbb{E}\big( (X_t - \mu)(X_s - \mu) \big) = C(t - s).
\]

The constant $\mu$ is the expectation of the process $X_t$. Without loss of generality, we can set $\mu = 0$, since if $\mathbb{E} X_t = \mu$ then the process $Y_t = X_t - \mu$ is mean zero. A mean zero process is called a centered process. The function $C(t)$ is the covariance (sometimes also called autocovariance) or the autocorrelation function of $X_t$. Notice that $C(t) = \mathbb{E}(X_t X_0)$, whereas $C(0) = \mathbb{E} X_t^2$, which is finite by assumption. Since we have assumed that $X_t$ is a real-valued process, we have that $C(t) = C(-t)$, $t \in \mathbb{R}$.

Let now $X_t$ be a strictly stationary stochastic process with finite second moment. The definition of strict stationarity implies that $\mathbb{E} X_t = \mu$, a constant, and $\mathbb{E}\big( (X_t - \mu)(X_s - \mu) \big) = C(t - s)$. Hence, a strictly stationary process with finite second moment is also stationary in the wide sense. The converse is not true, in general. It is true, however, for Gaussian processes: since the first two moments of a Gaussian process are sufficient for a complete characterization of the process, a Gaussian stochastic process is strictly stationary if and only if it is weakly stationary.

Example 1.9. Let $Y_0, Y_1, \ldots$ be a sequence of independent, identically distributed random variables and consider the stochastic process $X_n = Y_n$. From Example 1.6 we know that this is a strictly stationary process, irrespective of whether $Y_0$ is such that $\mathbb{E} Y_0^2 < +\infty$. Assume now that $\mathbb{E} Y_0 = 0$ and $\mathbb{E} Y_0^2 = \sigma^2 < +\infty$. Then $X_n$ is a second order stationary process with mean zero and correlation function $R(k) = \sigma^2 \delta_{k0}$. Notice that in this case there is no correlation between the values of the stochastic process at different times $n$ and $k$.

Example 1.10. Let $Z$ be a single random variable and consider the stochastic process $X_n = Z$, $n = 0, 1, 2, \ldots$. From Example 1.7 we know that this is a strictly stationary process, irrespective of whether $\mathbb{E}|Z|^2 < +\infty$ or not. Assume now that $\mathbb{E} Z = 0$, $\mathbb{E} Z^2 = \sigma^2$. Then $X_n$ becomes a second order stationary process with $R(k) = \sigma^2$. Notice that in this case the values of our stochastic process at different times are strongly correlated.

We will see later in this chapter that for second order stationary processes, ergodicity is related to fast decay of correlations. In the first of the examples above, there is no correlation between the values of the stochastic process at different times and the stochastic process is ergodic. On the contrary, in the second example there is very strong correlation between the values of the stochastic process at different times and this process is not ergodic.

Continuity properties of the covariance function are equivalent to continuity properties of the paths of $X_t$ in the $L^2$ sense, i.e.

\[
\lim_{h \to 0} \mathbb{E}|X_{t+h} - X_t|^2 = 0.
\]

Lemma 1.11. Assume that the covariance function $C(t)$ of a second order stationary process is continuous at $t = 0$. Then it is continuous for all $t \in \mathbb{R}$. Furthermore, the continuity of $C(t)$ is equivalent to the continuity of the process $X_t$ in the $L^2$-sense.

Proof. Fix $t \in \mathbb{R}$ and (without loss of generality) set $\mathbb{E} X_t = 0$. We calculate:

\[
|C(t+h) - C(t)|^2 = |\mathbb{E}(X_{t+h} X_0) - \mathbb{E}(X_t X_0)|^2 = \big| \mathbb{E}\big( (X_{t+h} - X_t) X_0 \big) \big|^2
\leq \mathbb{E}(X_0)^2\, \mathbb{E}(X_{t+h} - X_t)^2
= C(0) \big( \mathbb{E} X_{t+h}^2 + \mathbb{E} X_t^2 - 2 \mathbb{E}(X_t X_{t+h}) \big)
= 2 C(0) \big( C(0) - C(h) \big) \to 0,
\]

as $h \to 0$, where the inequality is Cauchy-Schwarz. Thus, continuity of $C(\cdot)$ at $0$ implies continuity for all $t$.

Assume now that $C(t)$ is continuous. From the above calculation we have

\[
\mathbb{E}|X_{t+h} - X_t|^2 = 2 \big( C(0) - C(h) \big), \qquad (1.4)
\]

which converges to $0$ as $h \to 0$. Conversely, assume that $X_t$ is $L^2$-continuous. Then, from the above equation we get $\lim_{h \to 0} C(h) = C(0)$.

Notice that from (1.4) we immediately conclude that $C(0) \geq C(h)$, $h \in \mathbb{R}$.

The Fourier transform of the covariance function of a second order stationary process always exists. This enables us to study second order stationary processes using tools from Fourier analysis. To make the link between second order stationary processes and Fourier analysis we will use Bochner's theorem, which applies to all nonnegative definite functions.

Definition 1.12. A function $f(x) : \mathbb{R} \mapsto \mathbb{R}$ is called nonnegative definite if

\[
\sum_{i,j=1}^{n} f(t_i - t_j) c_i \bar{c}_j \geq 0 \qquad (1.5)
\]

for all $n \in \mathbb{N}$, $t_1, \ldots, t_n \in \mathbb{R}$, $c_1, \ldots, c_n \in \mathbb{C}$.

Lemma 1.13. The covariance function of a second order stationary process is a nonnegative definite function.

Proof. We will use the notation $X_t^c := \sum_{i=1}^{n} X_{t_i} c_i$. We have

\[
\sum_{i,j=1}^{n} C(t_i - t_j) c_i \bar{c}_j = \sum_{i,j=1}^{n} \mathbb{E}\big( X_{t_i} X_{t_j} \big) c_i \bar{c}_j
= \mathbb{E}\left( \sum_{i=1}^{n} X_{t_i} c_i \sum_{j=1}^{n} X_{t_j} \bar{c}_j \right)
= \mathbb{E}\big( X_t^c \bar{X}_t^c \big) = \mathbb{E}|X_t^c|^2 \geq 0.
\]

Theorem 1.14 (Bochner). Let $C(t)$ be a continuous positive definite function. Then there exists a unique nonnegative measure $\rho$ on $\mathbb{R}$ such that $\rho(\mathbb{R}) = C(0)$ and

\[
C(t) = \int_{\mathbb{R}} e^{i\omega t} \, \rho(d\omega) \quad \forall t \in \mathbb{R}. \qquad (1.6)
\]

Let $X_t$ be a second order stationary process with autocorrelation function $C(t)$ whose Fourier transform is the measure $\rho(d\omega)$. The measure $\rho(d\omega)$ is called the spectral measure of the process $X_t$. In the following we will assume that the spectral measure is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}$ with density $S(\omega)$, i.e. $\rho(d\omega) = S(\omega)\, d\omega$. The Fourier transform $S(\omega)$ of the covariance function is called the spectral density of the process:

\[
S(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-it\omega} C(t) \, dt. \qquad (1.7)
\]

From (1.6) it follows that the autocorrelation function of a mean zero, second order stationary process is given by the inverse Fourier transform of the spectral density:

\[
C(t) = \int_{-\infty}^{\infty} e^{it\omega} S(\omega) \, d\omega. \qquad (1.8)
\]

The autocorrelation function of a second order stationary process enables us to associate a timescale to $X_t$, the correlation time $\tau_{\mathrm{cor}}$:

\[
\tau_{\mathrm{cor}} = \frac{1}{C(0)} \int_{0}^{\infty} C(\tau) \, d\tau = \frac{1}{\mathbb{E}(X_0^2)} \int_{0}^{\infty} \mathbb{E}(X_\tau X_0) \, d\tau.
\]

The slower the decay of the correlation function, the larger the correlation time is. Notice that when the correlations do not decay sufficiently fast, so that $C(t)$ is not integrable, the correlation time is infinite.

Example 1.15. Consider a mean zero, second order stationary process with correlation function

\[
C(t) = C(0) e^{-\alpha|t|}, \qquad (1.9)
\]

where $\alpha > 0$. We will write $C(0) = \frac{D}{\alpha}$ where $D > 0$. The spectral density of this process is

\[
S(\omega) = \frac{1}{2\pi} \frac{D}{\alpha} \int_{-\infty}^{+\infty} e^{-i\omega t} e^{-\alpha|t|} \, dt
= \frac{1}{2\pi} \frac{D}{\alpha} \left( \int_{-\infty}^{0} e^{-i\omega t} e^{\alpha t} \, dt + \int_{0}^{+\infty} e^{-i\omega t} e^{-\alpha t} \, dt \right)
= \frac{1}{2\pi} \frac{D}{\alpha} \left( \frac{1}{-i\omega + \alpha} + \frac{1}{i\omega + \alpha} \right)
= \frac{D}{\pi} \frac{1}{\omega^2 + \alpha^2}.
\]

This function is called the Cauchy or Lorentz distribution. The correlation time is (we have that $C(0) = D/\alpha$)

\[
\tau_{\mathrm{cor}} = \int_{0}^{\infty} e^{-\alpha t} \, dt = \alpha^{-1}.
\]
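A quick numerical sanity check of this calculation, not part of the original notes: evaluate (1.7) by quadrature for $C(t) = (D/\alpha) e^{-\alpha|t|}$ and compare with the Lorentzian $D/(\pi(\omega^2 + \alpha^2))$. The parameter values below are arbitrary.

```python
import numpy as np

D, alpha = 2.0, 1.5
t = np.linspace(-60, 60, 200_001)              # truncated integration range; e^{-alpha|t|} decays fast
C = (D / alpha) * np.exp(-alpha * np.abs(t))

for omega in [0.0, 0.5, 2.0]:
    S_num = np.trapz(np.exp(-1j * omega * t) * C, t).real / (2 * np.pi)   # formula (1.7)
    S_exact = D / (np.pi * (omega**2 + alpha**2))                         # Lorentzian from Example 1.15
    print(omega, S_num, S_exact)
```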

A real-valued Gaussian stationary process defined on $\mathbb{R}$ with correlation function given by (1.9) is called the stationary Ornstein-Uhlenbeck process. We will study this stochastic process in detail in later chapters. The Ornstein-Uhlenbeck process $Y_t$ can be used as a model for the velocity of a Brownian particle. It is of interest to calculate the statistics of the position of the Brownian particle, i.e. of the integral (we assume that the Brownian particle starts at $0$)

\[
Z_t = \int_{0}^{t} Y_s \, ds. \qquad (1.10)
\]

The particle position $Z_t$ is a mean zero Gaussian process. Set $\alpha = D = 1$. The covariance function of $Z_t$ is

\[
\mathbb{E}(Z_t Z_s) = 2\min(t, s) + e^{-\min(t,s)} + e^{-\max(t,s)} - e^{-|t-s|} - 1. \qquad (1.11)
\]

Ergodic properties of second-order stationary processes

Second order stationary processes have nice ergodic properties, provided that the correlation between values of the process at different times decays sufficiently fast. In this case, it is possible to show that we can calculate expectations by calculating time averages. An example of such a result is the following.

Proposition 1.16. Let $\{X_t\}_{t \geq 0}$ be a second order stationary process on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with mean $\mu$ and covariance $C(t)$, and assume that $C(t) \in L^1(0, +\infty)$. Then

\[
\lim_{T \to +\infty} \mathbb{E} \left| \frac{1}{T} \int_{0}^{T} X_s \, ds - \mu \right|^2 = 0. \qquad (1.12)
\]

For the proof of this result we will first need the following result, which is a property of symmetric functions.

Lemma 1.17. Let $R(t)$ be an integrable symmetric function. Then

\[
\int_{0}^{T} \int_{0}^{T} R(t - s) \, dt\, ds = 2 \int_{0}^{T} (T - s) R(s) \, ds. \qquad (1.13)
\]

Proof. We make the change of variables $u = t - s$, $v = t + s$. The domain of integration in the $t, s$ variables is $[0, T] \times [0, T]$. In the $u, v$ variables it becomes $[-T, T] \times [|u|, 2T - |u|]$. The Jacobian of the transformation is

\[
J = \frac{\partial(t, s)}{\partial(u, v)} = \frac{1}{2}.
\]

The integral becomes

\[
\int_{0}^{T} \int_{0}^{T} R(t - s) \, dt\, ds = \int_{-T}^{T} \int_{|u|}^{2T - |u|} R(u) J \, dv\, du
= \int_{-T}^{T} (T - |u|) R(u) \, du
= 2 \int_{0}^{T} (T - u) R(u) \, du,
\]

where the symmetry of the function $R(u)$ was used in the last step.

Proof of Proposition 1.16. We use Lemma 1.17 to calculate:

\[
\mathbb{E} \left| \frac{1}{T} \int_{0}^{T} X_s \, ds - \mu \right|^2
= \frac{1}{T^2} \mathbb{E} \left| \int_{0}^{T} (X_s - \mu) \, ds \right|^2
= \frac{1}{T^2} \mathbb{E} \int_{0}^{T} \int_{0}^{T} (X_t - \mu)(X_s - \mu) \, dt\, ds
= \frac{1}{T^2} \int_{0}^{T} \int_{0}^{T} C(t - s) \, dt\, ds
= \frac{2}{T^2} \int_{0}^{T} (T - u) C(u) \, du
\leq \frac{2}{T} \int_{0}^{+\infty} \left| \left( 1 - \frac{u}{T} \right) C(u) \right| du
\leq \frac{2}{T} \int_{0}^{+\infty} |C(u)| \, du \to 0,
\]

using the dominated convergence theorem and the assumption $C(\cdot) \in L^1(0, +\infty)$.

Assume that $\mu = 0$ and define

\[
D = \int_{0}^{+\infty} C(t) \, dt, \qquad (1.14)
\]

which, from our assumption on $C(t)$, is a finite quantity.³ The above calculation suggests that, for $t \gg 1$, we have that

\[
\mathbb{E} \left( \int_{0}^{t} X(s) \, ds \right)^2 \approx 2 D t.
\]

This implies that, at sufficiently long times, the mean square displacement of the integral of the ergodic second order stationary process $X_t$ scales linearly in time, with proportionality coefficient $2D$. Let now $X_t$ be the velocity of a (Brownian) particle. The particle position $Z_t$ is given by (1.10). From our calculation above we conclude that

\[
\mathbb{E} Z_t^2 \approx 2 D t,
\]

where

\[
D = \int_{0}^{\infty} C(t) \, dt = \int_{0}^{\infty} \mathbb{E}(X_t X_0) \, dt \qquad (1.15)
\]

is the diffusion coefficient. Thus, one expects that at sufficiently long times and under appropriate assumptions on the correlation function, the time integral of a stationary process will approximate a Brownian motion with diffusion coefficient $D$. The diffusion coefficient is an example of a transport coefficient and (1.15) is an example of the Green-Kubo formula: a transport coefficient can be calculated in terms of the time integral of an appropriate autocorrelation function. In the case of the diffusion coefficient we need to calculate the integral of the velocity autocorrelation function. We will explore this topic in more detail in Chapter ??.

³ Notice however that we do not know whether it is nonzero. This requires a separate argument.

Example 1.18. Consider the stochastic process with an exponential correlation function from Example 1.15, and assume that this stochastic process describes the velocity of a Brownian particle. Since $C(t) \in L^1(0, +\infty)$, Proposition 1.16 applies. Furthermore, the diffusion coefficient of the Brownian particle is given by

\[
\int_{0}^{+\infty} C(t) \, dt = C(0)\, \tau_{\mathrm{cor}} = \frac{D}{\alpha^2}.
\]
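A hedged numerical illustration of the Green-Kubo prediction (not from the original notes): a stationary Gaussian process with correlation $C(t) = (D/\alpha) e^{-\alpha|t|}$ can be generated with an exact one-step (AR(1)-type) update, its time integral accumulated, and the mean square displacement compared with $2 (D/\alpha^2) t$ at large $t$. The parameters and discretization below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
D, alpha = 1.0, 2.0
dt, n_steps, n_paths = 0.01, 20_000, 2_000
sigma2 = D / alpha                       # stationary variance C(0)
rho = np.exp(-alpha * dt)                # exact one-step correlation e^{-alpha dt}

X = rng.normal(0.0, np.sqrt(sigma2), n_paths)     # start in the stationary distribution
Z = np.zeros(n_paths)
for _ in range(n_steps):
    Z += X * dt                                               # Z_t = \int_0^t X_s ds (left-point rule)
    X = rho * X + np.sqrt(sigma2 * (1 - rho**2)) * rng.normal(size=n_paths)

T = n_steps * dt
print(np.mean(Z**2), 2 * (D / alpha**2) * T)      # empirical MSD vs Green-Kubo prediction 2(D/alpha^2)T
```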

Remark 1.19. Let $X_t$ be a strictly stationary process and let $f$ be such that $\mathbb{E}(f(X_0))^2 < +\infty$. A calculation similar to the one in the proof of Proposition 1.16 enables us to conclude that

\[
\lim_{T \to +\infty} \frac{1}{T} \int_{0}^{T} f(X_s) \, ds = \mathbb{E} f(X_0), \qquad (1.16)
\]

in $L^2(\Omega)$. In this case the autocorrelation function of $X_t$ is replaced by

\[
C_f(t) = \mathbb{E}\Big[ \big( f(X_t) - \mathbb{E} f(X_0) \big) \big( f(X_0) - \mathbb{E} f(X_0) \big) \Big].
\]

Setting $\bar{f} = f - \mathbb{E}_\pi f$, we have

\[
\lim_{T \to +\infty} T \, \mathrm{Var}_\pi \left( \frac{1}{T} \int_{0}^{T} \bar{f}(X_t) \, dt \right)
= 2 \int_{0}^{+\infty} \mathbb{E}_\pi \big( \bar{f}(X_t) \bar{f}(X_0) \big) \, dt. \qquad (1.17)
\]

We calculate

\[
\mathrm{Var}_\pi \left( \frac{1}{T} \int_{0}^{T} \bar{f}(X_t) \, dt \right)
= \mathbb{E}_\pi \left( \frac{1}{T} \int_{0}^{T} \bar{f}(X_t) \, dt \right)^2
= \frac{1}{T^2} \int_{0}^{T} \int_{0}^{T} \mathbb{E}_\pi \big( \bar{f}(X_t) \bar{f}(X_s) \big) \, dt\, ds
=: \frac{1}{T^2} \int_{0}^{T} \int_{0}^{T} R_{\bar{f}}(t, s) \, dt\, ds
= \frac{2}{T^2} \int_{0}^{T} (T - s) R_{\bar{f}}(s) \, ds
= \frac{2}{T} \int_{0}^{T} \left( 1 - \frac{s}{T} \right) \mathbb{E}_\pi \big( \bar{f}(X_s) \bar{f}(X_0) \big) \, ds,
\]

from which (1.17) follows.

1.3 Brownian Motion

The most important continuous-time stochastic process is Brownian motion. Brownian motion is a process with almost surely continuous paths and independent Gaussian increments. A process $X_t$ has independent increments if for every sequence $t_0 < t_1 < \ldots < t_n$ the random variables

\[
X_{t_1} - X_{t_0}, \ X_{t_2} - X_{t_1}, \ \ldots, \ X_{t_n} - X_{t_{n-1}}
\]

are independent. If, furthermore, for any $t_1, t_2, s \in T$ and Borel set $B \subset \mathbb{R}$,

\[
\mathbb{P}(X_{t_2+s} - X_{t_1+s} \in B) = \mathbb{P}(X_{t_2} - X_{t_1} \in B),
\]

then the process $X_t$ has stationary independent increments.

Definition 1.20. A one dimensional standard Brownian motion $W(t) : \mathbb{R}^+ \to \mathbb{R}$ is a real valued stochastic process with a.s. continuous paths such that $W(0) = 0$, it has independent increments, and for every $t > s \geq 0$ the increment $W(t) - W(s)$ has a Gaussian distribution with mean $0$ and variance $t - s$, i.e. the density of the random variable $W(t) - W(s)$ is

\[
g(x; t, s) = \big( 2\pi (t - s) \big)^{-\frac{1}{2}} \exp\left( - \frac{x^2}{2(t - s)} \right). \qquad (1.18)
\]

A standard $d$-dimensional Brownian motion $W(t) : \mathbb{R}^+ \to \mathbb{R}^d$ is a vector of $d$ independent one-dimensional Brownian motions:

\[
W(t) = (W_1(t), \ldots, W_d(t)),
\]

  • 0 0.2 0.4 0.6 0.8 1−1.5

    −1

    −0.5

    0

    0.5

    1

    1.5

    2

    t

    U(t)

    mean of 1000 paths5 individual paths

    Figure 1.1: Brownian sample paths

where $W_i(t)$, $i = 1, \ldots, d$, are independent one-dimensional Brownian motions. The density of the Gaussian random vector $W(t) - W(s)$ is thus

\[
g(x; t, s) = \big( 2\pi (t - s) \big)^{-d/2} \exp\left( - \frac{\|x\|^2}{2(t - s)} \right).
\]

Brownian motion is also referred to as the Wiener process. In Figure 1.1 we plot a few sample paths of Brownian motion.

[Figure 1.1: Brownian sample paths — the mean of 1000 paths and 5 individual paths of $W(t)$ on $[0, 1]$.]
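A minimal sketch (not part of the original notes) of how such a figure can be produced: Brownian paths on $[0, 1]$ are built from independent Gaussian increments, and the empirical mean over many paths is close to zero. The grid sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n_steps, n_paths, dt = 1_000, 1_000, 1.0 / 1_000

# Brownian paths from independent N(0, dt) increments, with W(0) = 0
dW = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)

t = np.linspace(0.0, 1.0, n_steps + 1)
mean_path = W.mean(axis=0)           # close to 0 for all t
print(mean_path[-1], W[:5, -1])      # empirical mean at t = 1 and five individual endpoints
```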

As we have already mentioned, Brownian motion has almost surely continuous paths. More precisely, it has a continuous modification: consider two stochastic processes $X_t$ and $Y_t$, $t \in T$, that are defined on the same probability space $(\Omega, \mathcal{F}, \mathbb{P})$. The process $Y_t$ is said to be a modification of $X_t$ if $\mathbb{P}(X_t = Y_t) = 1$ for all $t \in T$. The fact that there is a continuous modification of Brownian motion follows from the following result, which is due to Kolmogorov.

Theorem 1.21 (Kolmogorov). Let $X_t$, $t \in [0, \infty)$, be a stochastic process on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Suppose that there are positive constants $\alpha$ and $\beta$, and for each $T \geq 0$ there is a constant $C(T)$ such that

\[
\mathbb{E}|X_t - X_s|^\alpha \leq C(T) |t - s|^{1+\beta}, \qquad 0 \leq s, t \leq T. \qquad (1.19)
\]

Then there exists a continuous modification $Y_t$ of the process $X_t$.

We can check that (1.19) holds for Brownian motion with $\alpha = 4$ and $\beta = 1$ using (1.18). It is possible to prove rigorously the existence of the Wiener process (Brownian motion):

Theorem 1.22 (Wiener). There exists an almost surely continuous process $W_t$ with independent increments such that $W_0 = 0$ and, for each $t \geq 0$, the random variable $W_t$ is $\mathcal{N}(0, t)$. Furthermore, $W_t$ is almost surely locally Hölder continuous with exponent $\alpha$ for any $\alpha \in (0, \frac{1}{2})$.

  • 0 5 10 15 20 25 30 35 40 45 50

    −6

    −4

    −2

    0

    2

    4

    6

    8

    50−step random walk

    0 100 200 300 400 500 600 700 800 900 1000

    −50

    −40

    −30

    −20

    −10

    0

    10

    20

    1000−step random walk

    a. n = 50 b. n = 1000

    Figure 1.2: Sample paths of the random walk of lengthn = 50 andn = 1000.

Notice that Brownian paths are not differentiable.

We can construct Brownian motion through the limit of an appropriately rescaled random walk: let $X_1, X_2, \ldots$ be iid random variables on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with mean $0$ and variance $1$. Define the discrete time stochastic process $S_n$ with $S_0 = 0$, $S_n = \sum_{j=1}^{n} X_j$, $n \geq 1$. Define now a continuous time stochastic process with continuous paths as the linearly interpolated, appropriately rescaled random walk:

\[
W_t^n = \frac{1}{\sqrt{n}} S_{[nt]} + (nt - [nt]) \frac{1}{\sqrt{n}} X_{[nt]+1},
\]

where $[\cdot]$ denotes the integer part of a number. Then $W_t^n$ converges weakly, as $n \to +\infty$, to a one dimensional standard Brownian motion. See Figure 1.2.

[Figure 1.2: Sample paths of the random walk of length n = 50 (a) and n = 1000 (b).]
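A short sketch of this construction (not from the original notes; the ±1 steps and grid are illustrative choices): the rescaled, linearly interpolated random walk $W^n_t$ evaluated on a time grid.

```python
import numpy as np

def rescaled_random_walk(n, t_grid, rng):
    """Linearly interpolated rescaled random walk W^n_t on the given time grid."""
    X = rng.choice([-1.0, 1.0], size=n + 1)           # iid steps with mean 0, variance 1
    S = np.concatenate([[0.0], np.cumsum(X)])         # S_0 = 0, S_k = X_1 + ... + X_k
    k = np.floor(n * t_grid).astype(int)              # [nt]
    return (S[k] + (n * t_grid - k) * X[k]) / np.sqrt(n)

rng = np.random.default_rng(4)
t = np.linspace(0.0, 1.0, 501)
W50, W1000 = rescaled_random_walk(50, t, rng), rescaled_random_walk(1000, t, rng)
print(W1000[-1])      # for large n, W^n_1 is approximately N(0, 1)
```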

An alternative definition of the one dimensional standard Brownian motion is that of a Gaussian stochastic process on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with continuous paths for almost all $\omega \in \Omega$, and finite dimensional distributions with zero mean and covariance $\mathbb{E}(W_{t_i} W_{t_j}) = \min(t_i, t_j)$. One can then show that Definition 1.20 follows from the above definition.

For the $d$-dimensional Brownian motion we have (see (B.7) and (B.8))

\[
\mathbb{E} W(t) = 0 \quad \forall t \geq 0
\]

and

\[
\mathbb{E}\Big( \big( W(t) - W(s) \big) \otimes \big( W(t) - W(s) \big) \Big) = (t - s) I, \qquad (1.20)
\]

where $I$ denotes the identity matrix. Moreover,

\[
\mathbb{E}\big( W(t) \otimes W(s) \big) = \min(t, s)\, I. \qquad (1.21)
\]

Although Brownian motion has stationary increments, it is not a stationary process itself.

The probability density of the one dimensional Brownian motion is

\[
g(x, t) = \frac{1}{\sqrt{2\pi t}} e^{-x^2/2t}.
\]

We can easily calculate all moments:

\[
\mathbb{E}(W(t)^n) = \frac{1}{\sqrt{2\pi t}} \int_{-\infty}^{+\infty} x^n e^{-x^2/2t} \, dx
= \begin{cases} 1 \cdot 3 \cdots (n-1)\, t^{n/2}, & n \text{ even}, \\ 0, & n \text{ odd}. \end{cases}
\]

In particular, the mean square displacement of Brownian motion grows linearly in time.

Brownian motion is invariant under various transformations in time.

Proposition 1.23. Let $W_t$ denote a standard Brownian motion in $\mathbb{R}$. Then $W_t$ has the following properties:

i. (Rescaling). For each $c > 0$ define $X_t = \frac{1}{\sqrt{c}} W(ct)$. Then $(X_t, t \geq 0) = (W_t, t \geq 0)$ in law.

ii. (Shifting). For each $c > 0$, $W_{c+t} - W_c$, $t \geq 0$, is a Brownian motion which is independent of $W_u$, $u \in [0, c]$.

iii. (Time reversal). Define $X_t = W_{1-t} - W_1$, $t \in [0, 1]$. Then $(X_t, t \in [0, 1]) = (W_t, t \in [0, 1])$ in law.

iv. (Inversion). Let $X_t$, $t \geq 0$, be defined by $X_0 = 0$, $X_t = t W(1/t)$. Then $(X_t, t \geq 0) = (W_t, t \geq 0)$ in law.

The equivalence in the above result holds in law and not in a pathwise sense. The proof of this proposition is left as an exercise.

We can also add a drift and change the diffusion coefficient of the Brownian motion: we will define a Brownian motion with drift $\mu$ and variance $\sigma^2$ as the process

\[
X_t = \mu t + \sigma W_t.
\]

The mean and variance of $X_t$ are

\[
\mathbb{E} X_t = \mu t, \qquad \mathbb{E}(X_t - \mathbb{E} X_t)^2 = \sigma^2 t.
\]

Notice that $X_t$ satisfies the equation

\[
dX_t = \mu \, dt + \sigma \, dW_t.
\]

This is an example of a stochastic differential equation. We will study stochastic differential equations in Chapters 3 and ??.

1.4 Examples of Stochastic Processes

We now present a few examples of stochastic processes that appear frequently in applications.

The Ornstein-Uhlenbeck process

The stationary Ornstein-Uhlenbeck process that was introduced earlier in this chapter can be defined through the Brownian motion via a time change.

Lemma 1.24. Let $W(t)$ be a standard Brownian motion and consider the process

\[
V(t) = e^{-t} W(e^{2t}).
\]

Then $V(t)$ is a Gaussian stationary process with mean $0$ and correlation function

\[
R(t) = e^{-|t|}. \qquad (1.22)
\]

For the proof of this result we first need to show that time-changed Gaussian processes are also Gaussian.

Lemma 1.25. Let $X(t)$ be a Gaussian stochastic process and let $Y(t) = X(f(t))$ where $f(t)$ is a strictly increasing function. Then $Y(t)$ is also a Gaussian process.

Proof. We need to show that, for all positive integers $N$ and all sequences of times $\{t_1, t_2, \ldots, t_N\}$, the random vector

\[
\{Y(t_1), Y(t_2), \ldots, Y(t_N)\} \qquad (1.23)
\]

is a multivariate Gaussian random variable. Since $f(t)$ is strictly increasing, it is invertible and hence there exist $s_i$, $i = 1, \ldots, N$, such that $s_i = f^{-1}(t_i)$. Thus, the random vector (1.23) can be rewritten as

\[
\{X(s_1), X(s_2), \ldots, X(s_N)\},
\]

which is Gaussian for all $N$ and all choices of times $s_1, s_2, \ldots, s_N$. Hence $Y(t)$ is also Gaussian.

Proof of Lemma 1.24. The fact that $V(t)$ is a mean zero process follows immediately from the fact that $W(t)$ is mean zero. To show that the correlation function of $V(t)$ is given by (1.22), we calculate

\[
\mathbb{E}(V(t) V(s)) = e^{-t-s} \mathbb{E}\big( W(e^{2t}) W(e^{2s}) \big) = e^{-t-s} \min(e^{2t}, e^{2s}) = e^{-|t-s|}.
\]

The Gaussianity of the process $V(t)$ follows from Lemma 1.25 (notice that the transformation that gives $V(t)$ in terms of $W(t)$ is invertible and we can write $W(s) = s^{1/2} V(\tfrac{1}{2} \ln s)$).

Brownian Bridge

We can modify Brownian motion so that the resulting process is fixed at both ends. Let $W(t)$ be a standard one dimensional Brownian motion. We define the Brownian bridge (from $0$ to $0$) to be the process

\[
B_t = W_t - t W_1, \quad t \in [0, 1]. \qquad (1.24)
\]

Notice that $B_0 = B_1 = 0$. Equivalently, we can define the Brownian bridge to be the continuous Gaussian process $\{B_t : 0 \leq t \leq 1\}$ such that

\[
\mathbb{E} B_t = 0, \qquad \mathbb{E}(B_t B_s) = \min(s, t) - st, \quad s, t \in [0, 1]. \qquad (1.25)
\]

  • 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1−2

    −1.5

    −1

    −0.5

    0

    0.5

    1

    1.5

    t

    Bt

    Figure 1.3: Sample paths and first (blue curve) and second (black curve) moment of the Brownian bridge.

Another, equivalent definition of the Brownian bridge is through an appropriate time change of the Brownian motion:

\[
B_t = (1 - t)\, W\!\left( \frac{t}{1 - t} \right), \quad t \in [0, 1). \qquad (1.26)
\]

Conversely, we can write the Brownian motion as a time change of the Brownian bridge:

\[
W_t = (t + 1)\, B\!\left( \frac{t}{1 + t} \right), \quad t \geq 0.
\]

We can use the algorithm for simulating Gaussian processes to generate paths of the Brownian bridge process and to calculate moments. In Figure 1.3 we plot a few sample paths and the first and second moments of the Brownian bridge.

[Figure 1.3: Sample paths of the Brownian bridge together with its first (blue curve) and second (black curve) moments.]
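As a sketch (not in the original notes), paths of the Brownian bridge can also be obtained directly from Brownian paths through (1.24), and the empirical moments compared with (1.25). The grid and sample sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)
n_steps, n_paths = 500, 2_000
t = np.linspace(0.0, 1.0, n_steps + 1)

dW = np.sqrt(1.0 / n_steps) * rng.standard_normal((n_paths, n_steps))
W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)
B = W - t * W[:, [-1]]                       # B_t = W_t - t W_1, so B_0 = B_1 = 0

print(B[:, 0].max(), B[:, -1].max())         # both exactly 0
print(np.mean(B[:, n_steps // 2] ** 2))      # approx min(1/2, 1/2) - 1/4 = 0.25, from (1.25)
```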

Fractional Brownian Motion

Fractional Brownian motion is a one-parameter family of Gaussian processes whose increments are correlated.

Definition 1.26. A (normalized) fractional Brownian motion $W_t^H$, $t \geq 0$, with Hurst parameter $H \in (0, 1)$ is a centered Gaussian process with continuous sample paths whose covariance is given by

\[
\mathbb{E}(W_t^H W_s^H) = \frac{1}{2} \left( s^{2H} + t^{2H} - |t - s|^{2H} \right). \qquad (1.27)
\]

The Hurst exponent controls the correlations between the increments of fractional Brownian motion as well as the regularity of the paths: they become smoother as $H$ increases.

Some of the basic properties of fractional Brownian motion are summarized in the following proposition.

Proposition 1.27. Fractional Brownian motion has the following properties.

  • 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1−2

    −1.5

    −1

    −0.5

    0

    0.5

    1

    1.5

    2

    t

    WtH

    0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1−1

    −0.5

    0

    0.5

    1

    1.5

    2

    t

    WtH

    a. H = 0.3 b. H = 0.8

    Figure 1.4: Sample paths of fractional Brownian motion for Hurst exponentH = 0.3 andH = 0.8 and first(blue curve) and second (black curve) moment.

i. When $H = \frac{1}{2}$, $W_t^{1/2}$ becomes the standard Brownian motion.

ii. $W_0^H = 0$, $\mathbb{E} W_t^H = 0$, $\mathbb{E}(W_t^H)^2 = |t|^{2H}$, $t \geq 0$.

iii. It has stationary increments and $\mathbb{E}(W_t^H - W_s^H)^2 = |t - s|^{2H}$.

iv. It has the following self-similarity property:

\[
(W_{\alpha t}^H, t \geq 0) = (\alpha^H W_t^H, t \geq 0), \quad \alpha > 0, \qquad (1.28)
\]

where the equivalence is in law.

The proof of these properties is left as an exercise. In Figure 1.4 we present sample paths and the first two moments of fractional Brownian motion for $H = 0.3$ and $H = 0.8$. As expected, for larger values of the Hurst exponent the sample paths are more regular.

[Figure 1.4: Sample paths of fractional Brownian motion for Hurst exponents H = 0.3 (a) and H = 0.8 (b), together with the first (blue curve) and second (black curve) moments.]
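A minimal sketch (not part of the original notes; function names and grid are illustrative): since fractional Brownian motion is Gaussian with covariance (1.27), the Cholesky-based sampler described earlier in this chapter applies directly.

```python
import numpy as np

def fbm_covariance(t, s, H):
    """Fractional Brownian motion covariance, formula (1.27)."""
    return 0.5 * (np.abs(s)**(2 * H) + np.abs(t)**(2 * H) - np.abs(t - s)**(2 * H))

def sample_fbm(H, N=500, n_paths=3, seed=7):
    rng = np.random.default_rng(seed)
    t = np.linspace(1.0 / N, 1.0, N)          # grid excludes t = 0 (where W^H_0 = 0), keeping Gamma nonsingular
    Gamma = fbm_covariance(t[:, None], t[None, :], H)
    L = np.linalg.cholesky(Gamma + 1e-12 * np.eye(N))
    return t, L @ rng.standard_normal((N, n_paths))

t, paths_rough = sample_fbm(H=0.3)     # rough paths
t, paths_smooth = sample_fbm(H=0.8)    # smoother paths
```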

1.5 The Karhunen-Loève Expansion

Let $f \in L^2(D)$ where $D$ is a subset of $\mathbb{R}^d$ and let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal basis in $L^2(D)$. Then, it is well known that $f$ can be written as a series expansion:

\[
f = \sum_{n=1}^{\infty} f_n e_n,
\]

where

\[
f_n = \int_{D} f(x) e_n(x) \, dx.
\]

The convergence is in $L^2(D)$:

\[
\lim_{N \to \infty} \left\| f(x) - \sum_{n=1}^{N} f_n e_n(x) \right\|_{L^2(D)} = 0.
\]

It turns out that we can obtain a similar expansion for an $L^2$ mean zero process which is continuous in the $L^2$ sense:

\[
\mathbb{E} X_t^2 < +\infty, \qquad \mathbb{E} X_t = 0, \qquad \lim_{h \to 0} \mathbb{E}|X_{t+h} - X_t|^2 = 0. \qquad (1.29)
\]

For simplicity we will take $T = [0, 1]$. Let $R(t, s) = \mathbb{E}(X_t X_s)$ be the autocorrelation function. Notice that from (1.29) it follows that $R(t, s)$ is continuous in both $t$ and $s$; see Exercise 20.

Let us assume an expansion of the form

\[
X_t(\omega) = \sum_{n=1}^{\infty} \xi_n(\omega) e_n(t), \quad t \in [0, 1], \qquad (1.30)
\]

where $\{e_n\}_{n=1}^{\infty}$ is an orthonormal basis in $L^2(0, 1)$. The random variables $\xi_n$ are calculated as

\[
\int_{0}^{1} X_t e_k(t) \, dt = \int_{0}^{1} \sum_{n=1}^{\infty} \xi_n e_n(t) e_k(t) \, dt = \sum_{n=1}^{\infty} \xi_n \delta_{nk} = \xi_k,
\]

where we assumed that we can interchange the summation and integration. We will assume that these random variables are orthogonal:

\[
\mathbb{E}(\xi_n \xi_m) = \lambda_n \delta_{nm},
\]

where $\{\lambda_n\}_{n=1}^{\infty}$ are positive numbers that will be determined later.

Assuming that an expansion of the form (1.30) exists, we can calculate

\[
R(t, s) = \mathbb{E}(X_t X_s) = \mathbb{E}\left( \sum_{k=1}^{\infty} \sum_{\ell=1}^{\infty} \xi_k e_k(t)\, \xi_\ell e_\ell(s) \right)
= \sum_{k=1}^{\infty} \sum_{\ell=1}^{\infty} \mathbb{E}(\xi_k \xi_\ell)\, e_k(t) e_\ell(s)
= \sum_{k=1}^{\infty} \lambda_k e_k(t) e_k(s).
\]

Consequently, in order for the expansion (1.30) to be valid we need

\[
R(t, s) = \sum_{k=1}^{\infty} \lambda_k e_k(t) e_k(s). \qquad (1.31)
\]

From equation (1.31) it follows that

\[
\int_{0}^{1} R(t, s) e_n(s) \, ds = \int_{0}^{1} \sum_{k=1}^{\infty} \lambda_k e_k(t) e_k(s) e_n(s) \, ds
= \sum_{k=1}^{\infty} \lambda_k e_k(t) \int_{0}^{1} e_k(s) e_n(s) \, ds
= \sum_{k=1}^{\infty} \lambda_k e_k(t) \delta_{kn}
= \lambda_n e_n(t).
\]

Consequently, in order for the expansion (1.30) to be valid, $\{\lambda_n, e_n(t)\}_{n=1}^{\infty}$ have to be the eigenvalues and eigenfunctions of the integral operator whose kernel is the correlation function of $X_t$:

\[
\int_{0}^{1} R(t, s) e_n(s) \, ds = \lambda_n e_n(t). \qquad (1.32)
\]

To prove the expansion (1.30) we need to study the eigenvalue problem for the integral operator

\[
\mathcal{R} f := \int_{0}^{1} R(t, s) f(s) \, ds. \qquad (1.33)
\]

We consider it as an operator from $L^2[0, 1]$ to $L^2[0, 1]$. We can show that this operator is selfadjoint and nonnegative in $L^2(0, 1)$:

\[
\langle \mathcal{R} f, h \rangle = \langle f, \mathcal{R} h \rangle \quad \text{and} \quad \langle \mathcal{R} f, f \rangle \geq 0 \quad \forall f, h \in L^2(0, 1),
\]

where $\langle \cdot, \cdot \rangle$ denotes the $L^2(0, 1)$-inner product. It follows that all its eigenvalues are real and nonnegative. Furthermore, it is a compact operator (if $\{\phi_n\}_{n=1}^{\infty}$ is a bounded sequence in $L^2(0, 1)$, then $\{\mathcal{R}\phi_n\}_{n=1}^{\infty}$ has a convergent subsequence). The spectral theorem for compact, selfadjoint operators can be used to deduce that $\mathcal{R}$ has a countable sequence of eigenvalues tending to $0$. Furthermore, for every $f \in L^2(0, 1)$ we can write

\[
f = f_0 + \sum_{n=1}^{\infty} f_n e_n(t),
\]

where $\mathcal{R} f_0 = 0$ and $\{e_n(t)\}$ are the eigenfunctions of the operator $\mathcal{R}$ corresponding to nonzero eigenvalues and where the convergence is in $L^2$. Finally, Mercer's Theorem states that for $R(t, s)$ continuous on $[0, 1] \times [0, 1]$, the expansion (1.31) is valid, where the series converges absolutely and uniformly.

Now we are ready to prove (1.30).

Theorem 1.28 (Karhunen-Loève). Let $\{X_t, t \in [0, 1]\}$ be an $L^2$ process with zero mean and continuous correlation function $R(t, s)$. Let $\{\lambda_n, e_n(t)\}_{n=1}^{\infty}$ be the eigenvalues and eigenfunctions of the operator $\mathcal{R}$ defined in (1.33). Then

\[
X_t = \sum_{n=1}^{\infty} \xi_n e_n(t), \quad t \in [0, 1], \qquad (1.34)
\]

where

\[
\xi_n = \int_{0}^{1} X_t e_n(t) \, dt, \qquad \mathbb{E}\xi_n = 0, \qquad \mathbb{E}(\xi_n \xi_m) = \lambda_n \delta_{nm}. \qquad (1.35)
\]

The series converges in $L^2$ to $X(t)$, uniformly in $t$.

Proof. The fact that $\mathbb{E}\xi_n = 0$ follows from the fact that $X_t$ is mean zero. The orthogonality of the random variables $\{\xi_n\}_{n=1}^{\infty}$ follows from the orthogonality of the eigenfunctions of $\mathcal{R}$:

\[
\mathbb{E}(\xi_n \xi_m) = \mathbb{E} \int_{0}^{1} \int_{0}^{1} X_t X_s e_n(t) e_m(s) \, dt\, ds
= \int_{0}^{1} \int_{0}^{1} R(t, s) e_n(t) e_m(s) \, ds\, dt
= \lambda_n \int_{0}^{1} e_n(s) e_m(s) \, ds = \lambda_n \delta_{nm}.
\]

Consider now the partial sum $S_N = \sum_{n=1}^{N} \xi_n e_n(t)$. We have

\[
\mathbb{E}|X_t - S_N|^2 = \mathbb{E} X_t^2 + \mathbb{E} S_N^2 - 2 \mathbb{E}(X_t S_N)
= R(t, t) + \mathbb{E} \sum_{k, \ell = 1}^{N} \xi_k \xi_\ell e_k(t) e_\ell(t) - 2 \mathbb{E}\left( X_t \sum_{n=1}^{N} \xi_n e_n(t) \right)
= R(t, t) + \sum_{k=1}^{N} \lambda_k |e_k(t)|^2 - 2 \mathbb{E} \sum_{k=1}^{N} \int_{0}^{1} X_t X_s e_k(s) e_k(t) \, ds
= R(t, t) - \sum_{k=1}^{N} \lambda_k |e_k(t)|^2 \to 0,
\]

by Mercer's theorem.

The Karhunen-Loève expansion is straightforward to apply to Gaussian stochastic processes. Let $X_t$ be a Gaussian second order process with continuous covariance $R(t, s)$. Then the random variables $\{\xi_k\}_{k=1}^{\infty}$ are Gaussian, since they are defined through the time integral of a Gaussian process. Furthermore, since they are Gaussian and orthogonal, they are also independent. Hence, for Gaussian processes the Karhunen-Loève expansion becomes

\[
X_t = \sum_{k=1}^{+\infty} \sqrt{\lambda_k}\, \xi_k e_k(t), \qquad (1.36)
\]

where $\{\xi_k\}_{k=1}^{\infty}$ are independent $\mathcal{N}(0, 1)$ random variables.

Example 1.29 (The Karhunen-Loève expansion for Brownian motion). The correlation function of Brownian motion is $R(t, s) = \min(t, s)$. The eigenvalue problem $\mathcal{R}\psi_n = \lambda_n \psi_n$ becomes

\[
\int_{0}^{1} \min(t, s) \psi_n(s) \, ds = \lambda_n \psi_n(t).
\]

Let us assume that $\lambda_n > 0$ (we can check that $0$ is not an eigenvalue). Upon setting $t = 0$ we obtain $\psi_n(0) = 0$. The eigenvalue problem can be rewritten in the form

\[
\int_{0}^{t} s \psi_n(s) \, ds + t \int_{t}^{1} \psi_n(s) \, ds = \lambda_n \psi_n(t).
\]

We differentiate this equation once:

\[
\int_{t}^{1} \psi_n(s) \, ds = \lambda_n \psi_n'(t).
\]

We set $t = 1$ in this equation to obtain the second boundary condition $\psi_n'(1) = 0$. A second differentiation yields

\[
-\psi_n(t) = \lambda_n \psi_n''(t),
\]

where primes denote differentiation with respect to $t$. Thus, in order to calculate the eigenvalues and eigenfunctions of the integral operator whose kernel is the covariance function of Brownian motion, we need to solve the Sturm-Liouville problem

\[
-\psi_n(t) = \lambda_n \psi_n''(t), \qquad \psi_n(0) = \psi_n'(1) = 0.
\]

The eigenvalues and (normalized) eigenfunctions are

\[
\psi_n(t) = \sqrt{2} \sin\left( \frac{1}{2} (2n - 1) \pi t \right), \qquad \lambda_n = \left( \frac{2}{(2n - 1)\pi} \right)^2.
\]

Thus, the Karhunen-Loève expansion of Brownian motion on $[0, 1]$ is

\[
W_t = \sqrt{2} \sum_{n=1}^{\infty} \xi_n \frac{2}{(2n - 1)\pi} \sin\left( \frac{1}{2} (2n - 1) \pi t \right). \qquad (1.37)
\]
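A short sketch (not part of the original notes; truncation level and grid are arbitrary) of generating Brownian paths from the truncated expansion (1.37): with a few hundred modes the sample variance at each time is close to $t$.

```python
import numpy as np

rng = np.random.default_rng(8)
n_modes, n_paths = 400, 2_000
t = np.linspace(0.0, 1.0, 501)

n = np.arange(1, n_modes + 1)
coeff = 2.0 / ((2 * n - 1) * np.pi)                                              # sqrt(lambda_n)
phi = np.sqrt(2.0) * np.sin(0.5 * (2 * n[:, None] - 1) * np.pi * t[None, :])     # eigenfunctions psi_n(t)

xi = rng.standard_normal((n_paths, n_modes))        # independent N(0, 1) coefficients
W = (xi * coeff) @ phi                              # truncated expansion (1.37)

print(np.var(W[:, -1]), np.var(W[:, 250]))          # approx Var(W_1) = 1 and Var(W_{0.5}) = 0.5
```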

1.6 Discussion and Bibliography

The material presented in this chapter is very standard and can be found in any textbook on stochastic processes. Consult, for example, [48, 47, 49, 32]. The proof of Bochner's theorem 1.14 can be found in [50], where additional material on stationary processes can also be found. See also [48].

The Ornstein-Uhlenbeck process was introduced by Ornstein and Uhlenbeck in 1930 as a model for the velocity of a Brownian particle [101]. An early reference on the derivation of formulas of the form (1.15) is [99].

Gaussian processes are studied in [1]. Simulation algorithms for Gaussian processes are presented in [6]. Fractional Brownian motion was introduced in [64].

The spectral theorem for compact, selfadjoint operators that we used in the proof of the Karhunen-Loève expansion can be found in [84]. The Karhunen-Loève expansion can be used to generate random fields, i.e. a collection of random variables that are parameterized by a spatial (rather than temporal) parameter $x$; see [29]. The Karhunen-Loève expansion is also useful in the development of numerical algorithms for partial differential equations with random coefficients; see [92].

We can use the Karhunen-Loève expansion in order to study the $L^2$-regularity of stochastic processes. First, let $\mathcal{R}$ be a compact, symmetric positive definite operator on $L^2(0, 1)$ with eigenvalues and normalized eigenfunctions $\{\lambda_k, e_k(x)\}_{k=1}^{+\infty}$ and consider a function $f \in L^2(0, 1)$ with $\int_0^1 f(s) \, ds = 0$. We can define the one parameter family of Hilbert spaces $H^\alpha$ through the norm

\[
\|f\|_\alpha^2 = \|\mathcal{R}^{-\alpha} f\|_{L^2}^2 = \sum_{k} |f_k|^2 \lambda_k^{-\alpha}.
\]

The inner product can be obtained through polarization. This norm enables us to measure the regularity of the function $f(t)$.⁴ Let $X_t$ be a mean zero second order (i.e. with finite second moment) process with continuous autocorrelation function. Define the space $\mathcal{H}^\alpha := L^2((\Omega, P), H^\alpha(0, 1))$ with (semi)norm

\[
\|X_t\|_{\mathcal{H}^\alpha}^2 = \mathbb{E}\|X_t\|_{H^\alpha}^2 = \sum_{k} \lambda_k^{1 - \alpha}. \qquad (1.38)
\]

Notice that the regularity of the stochastic process $X_t$ depends on the decay of the eigenvalues of the integral operator $\mathcal{R}\cdot := \int_0^1 R(t, s)\, \cdot \; ds$.

As an example, consider the $L^2$-regularity of Brownian motion. From Example 1.29 we know that $\lambda_k \sim k^{-2}$. Consequently, from (1.38) we get that, in order for $W_t$ to be an element of the space $\mathcal{H}^\alpha$, we need that

\[
\sum_{k} k^{-2(1 - \alpha)} < +\infty,
\]

from which we obtain that $\alpha < 1/2$. This is consistent with the Hölder continuity of Brownian motion from Theorem 1.22.⁵

⁴ Think of $\mathcal{R}$ as being the inverse of the Laplacian with periodic boundary conditions. In this case $H^\alpha$ coincides with the standard fractional Sobolev space.

⁵ Notice, however, that Wiener's theorem refers to a.s. Hölder continuity, whereas the calculation presented in this section is about $L^2$-continuity.

1.7 Exercises

1. Let $Y_0, Y_1, \ldots$ be a sequence of independent, identically distributed random variables and consider the stochastic process $X_n = Y_n$.

(a) Show that $X_n$ is a strictly stationary process.

(b) Assume that $\mathbb{E} Y_0 = \mu < +\infty$ and $\mathbb{E} Y_0^2 = \sigma^2 < +\infty$. Show that

\[
\lim_{N \to +\infty} \mathbb{E} \left| \frac{1}{N} \sum_{j=0}^{N-1} X_j - \mu \right| = 0.
\]

(c) Let $f$ be such that $\mathbb{E} f^2(Y_0) < +\infty$. Show that

\[
\lim_{N \to +\infty} \mathbb{E} \left| \frac{1}{N} \sum_{j=0}^{N-1} f(X_j) - \mathbb{E} f(Y_0) \right| = 0.
\]

2. Let $Z$ be a random variable and define the stochastic process $X_n = Z$, $n = 0, 1, 2, \ldots$. Show that $X_n$ is a strictly stationary process.

3. Let $A_0, A_1, \ldots, A_m$ and $B_0, B_1, \ldots, B_m$ be uncorrelated random variables with mean zero and variances $\mathbb{E} A_i^2 = \sigma_i^2$, $\mathbb{E} B_i^2 = \sigma_i^2$, $i = 1, \ldots, m$. Let $\omega_0, \omega_1, \ldots, \omega_m \in [0, \pi]$ be distinct frequencies and define, for $n = 0, \pm 1, \pm 2, \ldots$, the stochastic process

\[
X_n = \sum_{k=0}^{m} \big( A_k \cos(n \omega_k) + B_k \sin(n \omega_k) \big).
\]

Calculate the mean and the covariance of $X_n$. Show that it is a weakly stationary process.

4. Let $\{\xi_n : n = 0, \pm 1, \pm 2, \ldots\}$ be uncorrelated random variables with $\mathbb{E}\xi_n = \mu$, $\mathbb{E}(\xi_n - \mu)^2 = \sigma^2$, $n = 0, \pm 1, \pm 2, \ldots$. Let $a_1, a_2, \ldots$ be arbitrary real numbers and consider the stochastic process

\[
X_n = a_1 \xi_n + a_2 \xi_{n-1} + \ldots + a_m \xi_{n-m+1}.
\]

(a) Calculate the mean, variance and the covariance function of $X_n$. Show that it is a weakly stationary process.

(b) Set $a_k = 1/\sqrt{m}$ for $k = 1, \ldots, m$. Calculate the covariance function and study the cases $m = 1$ and $m \to +\infty$.

5. Let $W(t)$ be a standard one dimensional Brownian motion. Calculate the following expectations.

(a) $\mathbb{E} e^{iW(t)}$.

(b) $\mathbb{E} e^{i(W(t) + W(s))}$, $t, s \in (0, +\infty)$.

(c) $\mathbb{E}\big( \sum_{i=1}^{n} c_i W(t_i) \big)^2$, where $c_i \in \mathbb{R}$, $i = 1, \ldots, n$, and $t_i \in (0, +\infty)$, $i = 1, \ldots, n$.

(d) $\mathbb{E} e^{i \sum_{i=1}^{n} c_i W(t_i)}$, where $c_i \in \mathbb{R}$, $i = 1, \ldots, n$, and $t_i \in (0, +\infty)$, $i = 1, \ldots, n$.

6. Let $W_t$ be a standard one dimensional Brownian motion and define

\[
B_t = W_t - t W_1, \quad t \in [0, 1].
\]

(a) Show that $B_t$ is a Gaussian process with

\[
\mathbb{E} B_t = 0, \qquad \mathbb{E}(B_t B_s) = \min(t, s) - ts.
\]

(b) Show that, for $t \in [0, 1)$, an equivalent definition of $B_t$ is through the formula

\[
B_t = (1 - t)\, W\!\left( \frac{t}{1 - t} \right).
\]

(c) Calculate the distribution function of $B_t$.

7. Let $X_t$ be a mean-zero second order stationary process with autocorrelation function

\[
R(t) = \sum_{j=1}^{N} \lambda_j^2 \alpha_j e^{-\alpha_j |t|},
\]

where $\{\alpha_j, \lambda_j\}_{j=1}^{N}$ are positive real numbers.

(a) Calculate the spectral density and the correlation time of this process.

(b) Show that the assumptions of Proposition 1.16 are satisfied and use the argument presented in Section 1.2 (i.e. the Green-Kubo formula) to calculate the diffusion coefficient of the process $Z_t = \int_0^t X_s \, ds$.

(c) Under what assumptions on the coefficients $\{\alpha_j, \lambda_j\}_{j=1}^{N}$ can you study the above questions in the limit $N \to +\infty$?

8. Show that the position of a Brownian particle whose velocity is described by the stationary Ornstein-Uhlenbeck process, Equation (1.10), is a mean zero Gaussian stochastic process and calculate the covariance function.

9. Let $a_1, \ldots, a_n$ and $s_1, \ldots, s_n$ be positive real numbers. Calculate the mean and variance of the random variable

\[
X = \sum_{i=1}^{n} a_i W(s_i).
\]

10. Let $W(t)$ be the standard one-dimensional Brownian motion and let $\sigma, s_1, s_2 > 0$. Calculate

(a) $\mathbb{E} e^{\sigma W(t)}$.

(b) $\mathbb{E}\big( \sin(\sigma W(s_1)) \sin(\sigma W(s_2)) \big)$.

11. Let $W_t$ be a one dimensional Brownian motion, let $\mu, \sigma > 0$ and define

\[
S_t = e^{t\mu + \sigma W_t}.
\]

(a) Calculate the mean and the variance of $S_t$.

(b) Calculate the probability density function of $S_t$.

12. Prove Proposition 1.23.

13. Use Lemma 1.24 to calculate the distribution function of the stationary Ornstein-Uhlenbeck process.

14. Calculate the mean and the correlation function of the integral of a standard Brownian motion

\[
Y_t = \int_{0}^{t} W_s \, ds.
\]

15. Show that the process

\[
Y_t = \int_{t}^{t+1} (W_s - W_t) \, ds, \quad t \in \mathbb{R},
\]

is second order stationary.

16. Let $V_t = e^{-t} W(e^{2t})$ be the stationary Ornstein-Uhlenbeck process. Give the definition and study the main properties of the Ornstein-Uhlenbeck bridge.

17. The autocorrelation function of the velocity $Y(t)$ of a Brownian particle moving in a harmonic potential $V(x) = \frac{1}{2}\omega_0^2 x^2$ is

\[
R(t) = e^{-\gamma|t|} \left( \cos(\delta|t|) - \frac{1}{\delta} \sin(\delta|t|) \right),
\]

where $\gamma$ is the friction coefficient and $\delta = \sqrt{\omega_0^2 - \gamma^2}$.

(a) Calculate the spectral density of $Y(t)$.

(b) Calculate the mean square displacement $\mathbb{E}(X(t))^2$ of the position of the Brownian particle $X(t) = \int_0^t Y(s) \, ds$. Study the limit $t \to +\infty$.

18. Show the scaling property (1.28) of fractional Brownian motion.

19. The Poisson process with intensity $\lambda$, denoted by $N(t)$, is an integer-valued, continuous time stochastic process with independent increments satisfying

\[
\mathbb{P}\big[ (N(t) - N(s)) = k \big] = \frac{e^{-\lambda(t-s)} \big( \lambda(t - s) \big)^k}{k!}, \quad t > s \geq 0, \ k \in \mathbb{N}.
\]

Use Theorem 1.21 to show that there does not exist a continuous modification of this process.

20. Show that the correlation function of a process $X_t$ satisfying (1.29) is continuous in both $t$ and $s$.

21. Let $X_t$ be a stochastic process satisfying (1.29) and $R(t, s)$ its correlation function. Show that the integral operator $\mathcal{R} : L^2[0, 1] \mapsto L^2[0, 1]$ defined in (1.33),

\[
\mathcal{R} f := \int_{0}^{1} R(t, s) f(s) \, ds,
\]

is selfadjoint and nonnegative. Show that all of its eigenvalues are real and nonnegative. Show that eigenfunctions corresponding to different eigenvalues are orthogonal.

22. Let $H$ be a Hilbert space. An operator $\mathcal{R} : H \to H$ is said to be Hilbert-Schmidt if there exists a complete orthonormal sequence $\{\phi_n\}_{n=1}^{\infty}$ in $H$ such that

\[
\sum_{n=1}^{\infty} \|\mathcal{R} \phi_n\|^2 < +\infty.
\]

24. Calculate the Karhunen-Loève expansion for a second order stochastic process with correlation function $R(t, s) = ts$.

25. Calculate the Karhunen-Loève expansion of the Brownian bridge on $[0, 1]$.

26. Let $X_t$, $t \in [0, T]$, be a second order process with continuous covariance and Karhunen-Loève expansion

\[
X_t = \sum_{k=1}^{\infty} \xi_k e_k(t).
\]

Define the process

\[
Y(t) = f(t) X_{\tau(t)}, \quad t \in [0, S],
\]

where $f(t)$ is a continuous function and $\tau(t)$ a continuous, nondecreasing function with $\tau(0) = 0$, $\tau(S) = T$. Find the Karhunen-Loève expansion of $Y(t)$, in an appropriate weighted $L^2$ space, in terms of the KL expansion of $X_t$. Use this in order to calculate the KL expansion of the Ornstein-Uhlenbeck process.

27. Calculate the Karhunen-Loève expansion of a centered Gaussian stochastic process with covariance function $R(s, t) = \cos(2\pi(t - s))$.

28. Use the Karhunen-Loève expansion to generate paths of

(a) the Brownian motion on $[0, 1]$;

(b) the Brownian bridge on $[0, 1]$;

(c) the Ornstein-Uhlenbeck process on $[0, 1]$.

Study computationally the convergence of the Karhunen-Loève expansion for these processes. How many terms do you need to keep in the expansion in order to calculate accurate statistics of these processes? How does the computational cost compare with that of the standard algorithm for simulating Gaussian stochastic processes?

29. (See [29].) Consider the Gaussian random field $X(x)$ in $\mathbb{R}$ with covariance function

\[
\gamma(x, y) = e^{-a|x - y|}, \qquad (1.39)
\]

where $a > 0$.

(a) Simulate this field: generate samples and calculate the first four moments.

(b) Consider $X(x)$ for $x \in [-L, L]$. Calculate analytically the eigenvalues and eigenfunctions of the integral operator $\mathcal{K}$ with kernel $\gamma(x, y)$,

\[
\mathcal{K} f(x) = \int_{-L}^{L} \gamma(x, y) f(y) \, dy.
\]

Use this in order to obtain the Karhunen-Loève expansion for $X$. Plot the first five eigenfunctions when $a = 1$, $L = 0.5$. Investigate (either analytically or by means of numerical experiments) the accuracy of the KL expansion as a function of the number of modes kept.

(c) Develop a numerical method for calculating the first few eigenvalues and eigenfunctions of $\mathcal{K}$ with $a = 1$, $L = 0.5$. Use the numerically calculated eigenvalues and eigenfunctions to simulate $X(x)$ using the KL expansion. Compare with the analytical results and comment on the accuracy of the calculation of the eigenvalues and eigenfunctions and on the computational cost.

Chapter 2

Diffusion Processes

In this chapter we study some of the basic properties of Markov stochastic processes and, in particular, of diffusion processes. In Section 2.1 we present various examples of Markov processes, in discrete and continuous time. In Section 2.2 we give the precise definition of a Markov process and derive the Chapman-Kolmogorov equation, the fundamental equation in the theory of Markov processes. In Section 2.3 we introduce the concept of the generator of a Markov process. In Section 2.4 we study ergodic Markov processes. In Section 2.5 we introduce diffusion processes and we derive the forward and backward Kolmogorov equations. Discussion and bibliographical remarks are presented in Section 2.6 and exercises can be found in Section 2.7.

2.1 Examples of Markov processes

Roughly speaking, a Markov process is a stochastic process that retains no memory of where it has been in the past: only the current state of a Markov process can influence where it will go next. A bit more precisely: a Markov process is a stochastic process for which, given the present, the past and future are statistically independent.

Perhaps the simplest example of a Markov process is that of a random walk in one dimension. Let $\xi_i$, $i = 1, \ldots$, be independent, identically distributed mean zero and variance $1$ random variables. The one dimensional random walk is defined as

\[
X_N = \sum_{n=1}^{N} \xi_n, \qquad X_0 = 0.
\]

Let $i_1, i_2, \ldots$ be a sequence of integers. Then, for all integers $n$ and $m$ we have that¹

\[
\mathbb{P}(X_{n+m} = i_{n+m} \,|\, X_1 = i_1, \ldots, X_n = i_n) = \mathbb{P}(X_{n+m} = i_{n+m} \,|\, X_n = i_n). \qquad (2.1)
\]

In words, the probability that the random walk will be at $i_{n+m}$ at time $n + m$ depends only on its current value (at time $n$) and not on how it got there.

¹ In fact, it is sufficient to take $m = 1$ in (2.1). See Exercise 1.

  • The random walk is an example of adiscrete time Markov chain: We will say that a stochastic process{Sn;n ∈ N} with state spaceS = Z is a discrete time Markov chain provided that the Markov property (2.1)is satisfied.

Consider now a continuous-time stochastic process X_t with state space S = Z and denote by {X_s, s ≤ t} the collection of values of the stochastic process up to time t. We will say that X_t is a Markov process provided that

P(X_{t+h} = i_{t+h} | \{X_s,\, s ≤ t\}) = P(X_{t+h} = i_{t+h} | X_t), (2.2)

for all h ≥ 0. A continuous-time, discrete state space Markov process is called a continuous-time Markov chain. A standard example of a continuous-time Markov chain is the Poisson process of rate λ with

P(N_{t+h} = j | N_t = i) =
\begin{cases}
0 & \text{if } j < i, \\
\dfrac{e^{-\lambda h}(\lambda h)^{j-i}}{(j-i)!} & \text{if } j \geq i.
\end{cases} (2.3)

Similarly, we can define a continuous-time Markov process with state space R, as a stochastic process whose future depends on its present state and not on how it got there:

P(X_{t+h} ∈ Γ | \{X_s,\, s ≤ t\}) = P(X_{t+h} ∈ Γ | X_t) (2.4)

for all Borel sets Γ. In this book we will consider continuous-time Markov processes for which a conditional probability density exists:

P(X_{t+h} ∈ Γ | X_t = x) = \int_{\Gamma} p(y, t+h | x, t)\, dy. (2.5)

Example 2.1. The Brownian motion is a Markov process with conditional probability density given by the following formula

P(W_{t+h} ∈ Γ | W_t = x) = \int_{\Gamma} \frac{1}{\sqrt{2\pi h}} \exp\left(-\frac{|x - y|^2}{2h}\right) dy. (2.6)

The Markov property of Brownian motion follows from the fact that it has independent increments.

Example 2.2. The stationary Ornstein-Uhlenbeck process V_t = e^{-t} W(e^{2t}) is a Markov process with conditional probability density

p(y, t|x, s) = \frac{1}{\sqrt{2\pi(1 - e^{-2(t-s)})}} \exp\left(-\frac{|y - x e^{-(t-s)}|^2}{2(1 - e^{-2(t-s)})}\right). (2.7)

To prove (2.7) we use the formula for the distribution function of the Brownian motion to calculate, for t > s,

P(V_t ≤ y | V_s = x) = P(e^{-t} W(e^{2t}) ≤ y | e^{-s} W(e^{2s}) = x)
= P(W(e^{2t}) ≤ e^{t} y | W(e^{2s}) = e^{s} x)
= \int_{-\infty}^{e^{t} y} \frac{1}{\sqrt{2\pi(e^{2t} - e^{2s})}}\, e^{-\frac{|z - x e^{s}|^2}{2(e^{2t} - e^{2s})}}\, dz
= \int_{-\infty}^{y} \frac{1}{\sqrt{2\pi e^{2t}(1 - e^{-2(t-s)})}}\, e^{-\frac{|\rho e^{t} - x e^{s}|^2}{2 e^{2t}(1 - e^{-2(t-s)})}}\, e^{t}\, d\rho
= \int_{-\infty}^{y} \frac{1}{\sqrt{2\pi(1 - e^{-2(t-s)})}}\, e^{-\frac{|\rho - x e^{-(t-s)}|^2}{2(1 - e^{-2(t-s)})}}\, d\rho,

where in the third equality we made the change of variables z = e^{t}\rho.


Consequently, the transition probability density for the OU process is given by the formula

p(y, t|x, s) = \frac{\partial}{\partial y} P(V_t ≤ y | V_s = x)
= \frac{1}{\sqrt{2\pi(1 - e^{-2(t-s)})}} \exp\left(-\frac{|y - x e^{-(t-s)}|^2}{2(1 - e^{-2(t-s)})}\right).
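The representation V_t = e^{-t} W(e^{2t}) also gives a direct way to check this formula numerically. The sketch below (with illustrative values of s, t and x) samples V_t given V_s = x by adding an independent Gaussian increment to W(e^{2s}) = x e^{s}, and compares the empirical mean and variance with x e^{-(t-s)} and 1 - e^{-2(t-s)}.

    import numpy as np

    # Sample V_t given V_s = x via V_t = e^{-t} W(e^{2t}), where
    # W(e^{2t}) = W(e^{2s}) + increment and the increment ~ N(0, e^{2t} - e^{2s}).
    rng = np.random.default_rng(1)
    s, t, x, n = 0.0, 0.5, 1.0, 100_000          # illustrative values
    incr = rng.normal(0.0, np.sqrt(np.exp(2 * t) - np.exp(2 * s)), size=n)
    V_t = np.exp(-t) * (x * np.exp(s) + incr)

    # Compare with the mean and variance read off from the transition density (2.7).
    print(V_t.mean(), x * np.exp(-(t - s)))       # empirical vs. x e^{-(t-s)}
    print(V_t.var(),  1 - np.exp(-2 * (t - s)))   # empirical vs. 1 - e^{-2(t-s)}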

The Markov property enables us to obtain an evolution equation for the transition probability for a discrete-time or continuous-time Markov chain

P(X_{n+1} = i_{n+1} | X_n = i_n), \qquad P(X_{t+h} = i_{t+h} | X_t = i_t), (2.8)

or for the transition probability density defined in (2.5). This equation is the Chapman-Kolmogorov equation. Using this equation we can study the evolution of a Markov process.

We will be mostly concerned with time-homogeneous Markov processes, i.e. processes for which the conditional probabilities are invariant under time shifts. For time-homogeneous discrete-time Markov chains we have

P(X_{n+1} = j | X_n = i) = P(X_1 = j | X_0 = i) =: p_{ij}.

We will refer to the matrix P = \{p_{ij}\} as the transition matrix. The transition matrix is a stochastic matrix, i.e. it has nonnegative entries and \sum_j p_{ij} = 1. Similarly, we can define the n-step transition matrix P_n = \{p_{ij}(n)\} as p_{ij}(n) = P(X_{m+n} = j | X_m = i).

    We can study the evolution of a Markov chain through the Chapman-Kolmogorov equation:

p_{ij}(m+n) = \sum_{k} p_{ik}(m)\, p_{kj}(n). (2.9)

Indeed, let \mu^{(n)}_i := P(X_n = i). The (possibly infinite dimensional) vector \mu^{(n)} determines the state of the Markov chain at time n. From the Chapman-Kolmogorov equation we can obtain a formula for the evolution of the vector \mu^{(n)}:

\mu^{(n)} = \mu^{(0)} P^n, (2.10)

where P^n denotes the nth power of the matrix P. Hence, in order to calculate the state of the Markov chain at time n, all we need is the initial distribution \mu^{(0)} and the transition matrix P. Componentwise, the above equation can be written as

\mu^{(n)}_j = \sum_{i} \mu^{(0)}_i\, p_{ij}(n).
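A small concrete illustration, using a hypothetical two-state transition matrix P: the sketch checks the Chapman-Kolmogorov identity P^{m+n} = P^m P^n and evolves the law according to (2.10).

    import numpy as np

    # A hypothetical two-state transition matrix (rows sum to one).
    P = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
    mu0 = np.array([1.0, 0.0])               # initial distribution concentrated on state 0

    # Chapman-Kolmogorov: p_ij(m + n) = sum_k p_ik(m) p_kj(n), i.e. P^(m+n) = P^m P^n.
    m, n = 3, 4
    print(np.allclose(np.linalg.matrix_power(P, m + n),
                      np.linalg.matrix_power(P, m) @ np.linalg.matrix_power(P, n)))

    # Evolution of the law: mu^(n) = mu^(0) P^n (row vector times matrix power).
    print(mu0 @ np.linalg.matrix_power(P, 10))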

    Consider now a continuous-time Markov chain with transition probability

p_{ij}(s, t) = P(X_t = j | X_s = i), \qquad s ≤ t.

    If the chain is homogeneous, then

p_{ij}(s, t) = p_{ij}(0, t - s) \quad \text{for all } i, j, s, t.


In particular,

p_{ij}(t) = P(X_t = j | X_0 = i).

    The Chapman-Kolmogorov equation for a continuous-time Markov chain is

\frac{dp_{ij}}{dt} = \sum_{k} p_{ik}(t)\, g_{kj}, (2.11)

where the matrix G, called the generator of the Markov chain, is defined as

G = \lim_{h \to 0} \frac{1}{h}(P_h - I),

with P_t denoting the matrix \{p_{ij}(t)\}. Equation (2.11) can also be written in matrix form:

\frac{dP_t}{dt} = P_t G.

Let now \mu^i_t = P(X_t = i). The vector \mu_t is the distribution of the Markov chain at time t. We can study its evolution using the equation

\mu_t = \mu_0 P_t.

Thus, as in the case of discrete-time Markov chains, the evolution of a continuous-time Markov chain is completely determined by the initial distribution and the transition matrix.
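A corresponding sketch in continuous time, for a hypothetical two-state generator G (nonnegative off-diagonal entries, rows summing to zero) and assuming SciPy is available: the transition matrix is computed as the matrix exponential P_t = e^{tG} and the law evolves as µ_t = µ_0 P_t.

    import numpy as np
    from scipy.linalg import expm

    # Hypothetical generator of a two-state continuous-time chain; rows sum to zero.
    G = np.array([[-1.0,  1.0],
                  [ 2.0, -2.0]])
    mu0 = np.array([1.0, 0.0])

    t = 0.7
    Pt = expm(t * G)            # transition matrix P_t = e^{tG}
    print(mu0 @ Pt)             # law of the chain at time t
    print(Pt.sum(axis=1))       # each row of P_t sums to one (stochastic matrix)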

Consider now the case of a continuous-time Markov process with continuous state space and continuous paths. As we have seen in Example 2.1, the Brownian motion is such a process. The conditional probability density of the Brownian motion (2.6) is the fundamental solution (Green's function) of the diffusion equation:

\frac{\partial p}{\partial t} = \frac{1}{2} \frac{\partial^2 p}{\partial y^2}, \qquad \lim_{t \to s} p(y, t|x, s) = \delta(y - x). (2.12)

    Similarly, the conditional distribution of the Ornstein-Uhlenbeck process satisfies the initial value problem

\frac{\partial p}{\partial t} = \frac{\partial (y p)}{\partial y} + \frac{1}{2} \frac{\partial^2 p}{\partial y^2}, \qquad \lim_{t \to s} p(y, t|x, s) = \delta(y - x). (2.13)

The Brownian motion and the OU process are examples of a diffusion process: a continuous-time Markov process with continuous paths. A precise definition will be given in Section 2.5, where we will also derive evolution equations for the conditional probability density p(y, t|x, s) of an arbitrary diffusion process, the forward Kolmogorov (Fokker-Planck) (2.54) and backward Kolmogorov (2.47) equations.

    2.2 Markov Processes and the Chapman-Kolmogorov equation

In Section 2.1 we gave the definition of a Markov process whose time is either discrete or continuous, and whose state space is countable. We also gave several examples of Markov chains as well as of processes whose state space is the real line. In this section we give the precise definition of a Markov process with t ∈ R+ and with state space R^d. We also introduce the Chapman-Kolmogorov equation.


In order to give the definition of a Markov process we need to use the conditional expectation of the stochastic process conditioned on all past values. We can encode all past information about a stochastic process into an appropriate collection of σ-algebras. Let (Ω, F, µ) denote a probability space and consider a stochastic process X = X_t(ω) with t ∈ R+ and state space (R^d, B), where B denotes the Borel σ-algebra. We define the σ-algebra generated by {X_t, t ∈ R+}, denoted by σ(X_t, t ∈ R+), to be the smallest σ-algebra such that the family of mappings {X_t, t ∈ R+} is a stochastic process with sample space (Ω, σ(X_t, t ∈ R+)) and state space (R^d, B).² In other words, the σ-algebra generated by X_t is the smallest σ-algebra such that X_t is a measurable function (random variable) with respect to it.

We define now a filtration on (Ω, F) to be a nondecreasing family {F_t, t ∈ R+} of sub-σ-algebras of F:

F_s ⊆ F_t ⊆ F \quad \text{for } s ≤ t.

We set F_∞ = σ(∪_{t∈T} F_t). The filtration generated by our stochastic process X_t is:

\mathcal{F}^X_t := σ(X_s;\, s ≤ t). (2.14)

    Now we are ready to give the definition of a Markov process.

Definition 2.3. Let X_t be a stochastic process defined on a probability space (Ω, F, µ) with values in R^d and let F^X_t be the filtration generated by {X_t; t ∈ R+}. Then {X_t; t ∈ R+} is a Markov process provided that

P(X_t ∈ Γ | F^X_s) = P(X_t ∈ Γ | X_s) (2.15)

for all t, s ∈ T with t ≥ s, and Γ ∈ B(R^d).

We remark that the filtration F^X_t is generated by events of the form {ω | X_{t_1} ∈ Γ_1, X_{t_2} ∈ Γ_2, . . . , X_{t_n} ∈ Γ_n} with 0 ≤ t_1 < t_2 < · · · < t_n ≤ t and Γ_i ∈ B(R^d). The definition of a Markov process is thus equivalent to the hierarchy of equations

P(X_t ∈ Γ | X_{t_1}, X_{t_2}, . . . , X_{t_n}) = P(X_t ∈ Γ | X_{t_n}) \quad a.s.

for n ≥ 1 and 0 ≤ t_1 < t_2 < · · · < t_n ≤ t with Γ ∈ B(R^d).

We also remark that it is sometimes possible to describe a non-Markovian process X_t in terms of a Markovian process Y_t in a higher dimensional state space. The additional variables that we introduce account for the memory in X_t. This is possible when the non-Markovian process has finite memory that can be represented by a finite number of additional degrees of freedom. We will use this approach in Chapter ?? when we derive stochastic differential equations from deterministic dynamical systems with random initial conditions, see Definition ??.

As an example, consider a Brownian particle whose velocity is described by the stationary Ornstein-Uhlenbeck process Y_t = e^{-t} W(e^{2t}), see (1.10) and (1.11). The particle position is given by the integral of the Ornstein-Uhlenbeck process

X_t = X_0 + \int_0^t Y_s\, ds.

²In later chapters we will also consider Markov processes whose state space is a subset of R^d, for example the unit torus.


The particle position depends on the past of the Ornstein-Uhlenbeck process and, consequently, is not a Markov process. However, the joint position-velocity process {X_t, Y_t} is. Its transition probability density p(x, y, t|x_0, y_0) satisfies the forward Kolmogorov equation

\frac{\partial p}{\partial t} = -y \frac{\partial p}{\partial x} + \frac{\partial}{\partial y}(y p) + \frac{1}{2} \frac{\partial^2 p}{\partial y^2}.
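A short simulation sketch of this construction (the grid size and final time are illustrative): the velocity is generated through the representation Y_t = e^{-t} W(e^{2t}) and the position is obtained by numerical quadrature of the velocity.

    import numpy as np

    # Simulate the velocity Y_t = e^{-t} W(e^{2t}) on a grid and integrate it to get the position X_t.
    rng = np.random.default_rng(2)
    T, n = 5.0, 5000                               # illustrative final time and grid size
    t = np.linspace(0.0, T, n)
    tau = np.exp(2 * t)                            # times at which the Brownian motion is evaluated
    dW = rng.normal(0.0, np.sqrt(np.diff(tau)))    # independent Brownian increments
    W = np.concatenate(([0.0], np.cumsum(dW))) + rng.normal()   # W(tau_0) = W(1) ~ N(0, 1)
    Y = np.exp(-t) * W                             # stationary Ornstein-Uhlenbeck velocity
    X = np.concatenate(([0.0], np.cumsum(0.5 * (Y[1:] + Y[:-1]) * np.diff(t))))  # trapezoidal rule, X_0 = 0

    print(X[-1], Y[-1])                            # position and velocity at time T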

    The Chapman-Kolmogorov Equation

With every continuous-time Markov process X_t³ defined in a probability space (Ω, F, P) and state space (R^d, B) we can associate the transition function

P(Γ, t | X_s, s) := P[X_t ∈ Γ | F^X_s],

for all t, s ∈ R+ with t ≥ s and all Γ ∈ B(R^d). It is a function of 4 arguments: the initial time s and position X_s, the final time t and the set Γ. The transition function P(Γ, t|x, s) is, for fixed t, x, s, a probability measure on R^d with P(R^d, t|x, s) = 1; it is B(R^d)-measurable in x, for fixed t, s, Γ, and satisfies the Chapman-Kolmogorov equation

P(Γ, t|x, s) = \int_{\mathbb{R}^d} P(Γ, t|y, u)\, P(dy, u|x, s), (2.16)

for all x ∈ R^d, Γ ∈ B(R^d) and s, u, t ∈ R+ with s ≤ u ≤ t. Assume that X_s = x. Since P[X_t ∈ Γ | F^X_s] = P[X_t ∈ Γ | X_s], we can write P(Γ, t|x, s) = P[X_t ∈ Γ | X_s = x].

The derivation of the Chapman-Kolmogorov equation is based on the Markovian assumption and on properties of conditional probability. We can formally derive the Chapman-Kolmogorov equation as follows. We use the Markov property, together with Equations (B.9) and (B.10) from Appendix B and the fact that s < u ⇒ F^X_s ⊂ F^X_u, to calculate:

P(Γ, t|x, s) := P(X_t ∈ Γ | X_s = x) = P(X_t ∈ Γ | F^X_s)
= E(I_Γ(X_t) | F^X_s) = E(E(I_Γ(X_t) | F^X_s) | F^X_u)
= E(E(I_Γ(X_t) | F^X_u) | F^X_s) = E(P(X_t ∈ Γ | X_u) | F^X_s)
= E(P(X_t ∈ Γ | X_u = y) | X_s = x)
= \int_{\mathbb{R}^d} P(Γ, t | X_u = y)\, P(dy, u | X_s = x)
=: \int_{\mathbb{R}^d} P(Γ, t | y, u)\, P(dy, u | x, s),

where I_Γ(·) denotes the indicator function of the set Γ. In words, the Chapman-Kolmogorov equation tells us that for a Markov process the transition from x at time s to the set Γ at time t can be done in two steps: first the system moves from x to y at some intermediate time u. Then it moves from y to Γ at time t. In

³We always take t ∈ R+.


order to calculate the probability for the transition from x at time s to Γ at time t we need to sum (integrate) the transitions from all possible intermediate states y.

The transition function and the initial distribution of X_t are sufficient to uniquely determine a Markov process. In fact, a process X_t is a Markov process with respect to its filtration F^X_t defined in (2.14) with transition function P(t, ·|s, ·) and initial distribution ν (X_0 ∼ ν) if and only if for all 0 = t_0 < t_1 < · · · < t_n and bounded measurable functions f_j, j = 0, . . . , n, we have

\mathbb{E}^{\nu} \prod_{j=0}^{n} f_j(X_{t_j}) = \int_{\mathbb{R}^d} f_0(x_0)\, \nu(dx_0) \prod_{j=1}^{n} \int_{\mathbb{R}^d} f_j(x_j)\, P(dx_j, t_j | x_{j-1}, t_{j-1}), (2.17)

where we have used the notation E^ν to emphasize the dependence of the expectation on the initial distribution ν. The proof that a Markov process with transition function P satisfies (2.17) follows from the Chapman-Kolmogorov equation (2.16) and an induction argument. In other words, the finite dimensional distributions of X_t are uniquely determined by the initial distribution and the transition function:

P(X_0 ∈ dx_0, X_{t_1} ∈ dx_1, . . . , X_{t_n} ∈ dx_n) = \nu(dx_0) \prod_{j=1}^{n} P(dx_j, t_j | x_{j-1}, t_{j-1}). (2.18)

In this book we will consider Markov processes for which the transition function has a density with respect to the Lebesgue measure:

P(Γ, t|x, s) = \int_{\Gamma} p(y, t|x, s)\, dy.

We will refer to p(y, t|x, s) as the transition probability density. It is a function of four arguments, the initial position and time x, s and the final position and time y, t. For t = s we have P(Γ, s|x, s) = I_Γ(x). The Chapman-Kolmogorov equation becomes

\int_{\Gamma} p(y, t|x, s)\, dy = \int_{\mathbb{R}^d} \int_{\Gamma} p(y, t|z, u)\, p(z, u|x, s)\, dz\, dy,

and, since Γ ∈ B(R^d) is arbitrary, we obtain the Chapman-Kolmogorov equation for the transition probability density:

p(y, t|x, s) = \int_{\mathbb{R}^d} p(y, t|z, u)\, p(z, u|x, s)\, dz. (2.19)

When the transition probability density exists, and assuming that the initial distribution ν has a density ρ, we can write P(X_0 ∈ dx_0, X_{t_1} ∈ dx_1, . . . , X_{t_n} ∈ dx_n) = p(x_0, t_0, . . . , x_n, t_n) \prod_{j=0}^{n} dx_j and we have

p(x_0, t_0, . . . , x_n, t_n) = \rho(x_0) \prod_{j=1}^{n} p(x_j, t_j | x_{j-1}, t_{j-1}). (2.20)

The above formulas simplify when the (random) law of evolution of the Markov process X_t does not change in time. In this case the conditional probability in (2.15) depends on the initial and final time t and s only through their difference: we will say that a Markov process is time-homogeneous if the transition function P(·, t|·, s) depends only on the difference between the initial and final time t − s:

P(Γ, t|x, s) = P(Γ, t−s|x, 0) =: P(t−s, x, Γ),


for all Γ ∈ B(R^d) and x ∈ R^d. For time-homogeneous Markov processes we can fix the initial time, s = 0. The Chapman-Kolmogorov equation for a time-homogeneous Markov process becomes

P(t+s, x, Γ) = \int_{\mathbb{R}^d} P(s, x, dz)\, P(t, z, Γ). (2.21)

    Furthermore, formulas (2.17) and (2.18) become

\mathbb{E}^{\nu} \prod_{j=0}^{n} f_j(X_{t_j}) = \int_{\mathbb{R}^d} f_0(y_0)\, \nu(dy_0) \prod_{j=1}^{n} \int_{\mathbb{R}^d} f_j(y_j)\, P(t_j - t_{j-1}, y_{j-1}, dy_j), (2.22)

and

P(X_0 ∈ dx_0, X_{t_1} ∈ dx_1, . . . , X_{t_n} ∈ dx_n) = \nu(dx_0) \prod_{j=1}^{n} P(t_j - t_{j-1}, x_{j-1}, dx_j), (2.23)

respectively. Given the initial distribution ν and the transition function P(t, x, Γ) of a Markov process X_t, we can calculate the probability of finding X_t in a set Γ at time t:

P(X_t ∈ Γ) = \int_{\mathbb{R}^d} P(t, x, Γ)\, \nu(dx).

Furthermore, for an observable f we can calculate the expectation using the formula

\mathbb{E}^{\nu} f(X_t) = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} f(x)\, P(t, x_0, dx)\, \nu(dx_0). (2.24)

The Poisson process, defined in (2.3), is a homogeneous Markov process. Another example of a time-homogeneous Markov process is Brownian motion. The transition function is the Gaussian

P(t, x, dy) = \gamma_{t,x}(y)\, dy, \qquad \gamma_{t,x}(y) = \frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{|x - y|^2}{2t}\right). (2.25)

Let now X_t be a time-homogeneous Markov process and assume that the transition probability density exists, P(t, x, Γ) = \int_{\Gamma} p(t, x, y)\, dy. The Chapman-Kolmogorov equation for p(t, x, y) reads

p(t+s, x, y) = \int_{\mathbb{R}^d} p(s, x, z)\, p(t, z, y)\, dz. (2.26)
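For the Brownian transition density (2.25) this identity can be verified numerically; the sketch below uses a simple Riemann-sum quadrature on a truncated range of z, with illustrative values of s, t, x and y.

    import numpy as np

    def p(t, x, y):
        """Brownian transition density (heat kernel)."""
        return np.exp(-(x - y) ** 2 / (2 * t)) / np.sqrt(2 * np.pi * t)

    s, t, x, y = 0.3, 0.7, 0.0, 1.0                  # illustrative values
    z = np.linspace(-15.0, 15.0, 20001)              # truncated integration range
    dz = z[1] - z[0]
    lhs = p(t + s, x, y)
    rhs = np.sum(p(s, x, z) * p(t, z, y)) * dz       # int p(s, x, z) p(t, z, y) dz
    print(lhs, rhs)                                  # the two values should agree closely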

2.3 The Generator of a Markov Process

Let X_t denote a time-homogeneous Markov process. The Chapman-Kolmogorov equation (2.21) suggests that a time-homogeneous Markov process can be described through a semigroup of operators, i.e. a one-parameter family of linear operators with the properties

P_0 = I, \qquad P_{t+s} = P_t \circ P_s \quad \text{for all } t, s ≥ 0.


Indeed, let P(t, ·, ·) be the transition function of a homogeneous Markov process and let f ∈ C_b(R^d), the space of continuous bounded functions on R^d, and define the operator

(P_t f)(x) := \mathbb{E}(f(X_t) | X_0 = x) = \int_{\mathbb{R}^d} f(y)\, P(t, x, dy). (2.27)

    This is a linear operator with

(P_0 f)(x) = \mathbb{E}(f(X_0) | X_0 = x) = f(x),

which means that P_0 = I. Furthermore:

(P_{t+s} f)(x) = \int_{\mathbb{R}^d} f(y)\, P(t+s, x, dy) = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} f(y)\, P(s, z, dy)\, P(t, x, dz)
= \int_{\mathbb{R}^d} \left( \int_{\mathbb{R}^d} f(y)\, P(s, z, dy) \right) P(t, x, dz) = \int_{\mathbb{R}^d} (P_s f)(z)\, P(t, x, dz)
= (P_t \circ P_s f)(x).

Consequently:

P_{t+s} = P_t \circ P_s.

The semigroup P_t defined in (2.27) is an example of a Markov semigroup. We can study properties of a time-homogeneous Markov process X_t by studying properties of the Markov semigroup P_t.

Let now X_t be a Markov process in R^d and let P_t denote the corresponding semigroup defined in (2.27). We consider this semigroup acting on continuous bounded functions and assume that P_t f is also a C_b(R^d) function. We define by D(L) the set of all f ∈ C_b(R^d) such that the strong limit

Lf := \lim_{t \to 0} \frac{P_t f - f}{t} (2.28)

exists. The operator L : D(L) → C_b(R^d) is called the (infinitesimal) generator of the operator semigroup P_t. We will also refer to L as the generator of the Markov process X_t.

The semigroup property and the definition of the generator of a Markov semigroup (2.28) imply that, formally, we can write

P_t = e^{tL}.

Consider the function u(x, t) := (P_t f)(x) = \mathbb{E}(f(X_t) | X_0 = x). We calculate its time derivative:

\frac{\partial u}{\partial t} = \frac{d}{dt}(P_t f) = \frac{d}{dt}\left(e^{tL} f\right) = L\left(e^{tL} f\right) = L P_t f = Lu.

Furthermore, u(x, 0) = P_0 f(x) = f(x). Consequently, u(x, t) satisfies the initial value problem

\frac{\partial u}{\partial t} = Lu, (2.29a)
u(x, 0) = f(x). (2.29b)


Equation (2.29) is the backward Kolmogorov equation. It governs the evolution of the expectation value of an observable f ∈ C_b(R^d). At this level this is formal, since we do not yet have a formula for the generator L of the Markov semigroup. When the Markov process is the solution of a stochastic differential equation, the generator is a second order elliptic differential operator and the backward Kolmogorov equation becomes an initial value problem for a parabolic PDE. See Section 2.5 and Chapter 4.

As an example consider the Brownian motion in one dimension. The transition function is given by (2.25), the fundamental solution of the heat equation in one dimension. The corresponding Markov semigroup is the heat semigroup P_t = \exp\left(\frac{t}{2} \frac{d^2}{dx^2}\right). The generator of the one dimensional Brownian motion is the one dimensional Laplacian \frac{1}{2} \frac{d^2}{dx^2}. The backward Kolmogorov equation is the heat equation

\frac{\partial u}{\partial t} = \frac{1}{2} \frac{\partial^2 u}{\partial x^2}.
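The heat-semigroup representation can be checked numerically. In the sketch below (the observable f = sin and the step sizes are illustrative choices), (P_t f)(x) = E f(x + sqrt(t) Z) with Z ~ N(0, 1) is approximated by Gauss-Hermite quadrature, and finite differences are used to compare du/dt with (1/2) d²u/dx².

    import numpy as np

    # Approximate (P_t f)(x) = E[f(x + sqrt(t) Z)], Z ~ N(0,1), by Gauss-Hermite quadrature.
    nodes, weights = np.polynomial.hermite.hermgauss(60)

    def heat_semigroup(f, x, t):
        return np.sum(weights * f(x + np.sqrt(2 * t) * nodes)) / np.sqrt(np.pi)

    f = np.sin                                      # illustrative observable
    x, t, h = 0.3, 1.0, 1e-2
    u_t  = (heat_semigroup(f, x, t + h) - heat_semigroup(f, x, t - h)) / (2 * h)
    u_xx = (heat_semigroup(f, x + h, t) - 2 * heat_semigroup(f, x, t)
            + heat_semigroup(f, x - h, t)) / h ** 2
    print(u_t, 0.5 * u_xx)   # finite-difference check of du/dt = (1/2) d^2u/dx^2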

    The Adjoint Semigroup

The semigroup P_t acts on bounded continuous functions. We can also define the adjoint semigroup P^*_t, which acts on probability measures:

P^*_t \mu(\Gamma) = \int_{\mathbb{R}^d} \mathbb{P}(X_t ∈ \Gamma | X_0 = x)\, d\mu(x) = \int_{\mathbb{R}^d} P(t, x, \Gamma)\, d\mu(x).

The image of a probability measure µ under P^*_t is again a probability measure. The operators P_t and P^*_t are (formally) adjoint in the L^2-sense:

\int_{\mathbb{R}} P_t f(x)\, d\mu(x) = \int_{\mathbb{R}} f(x)\, d(P^*_t \mu)(x). (2.30)

We can formally write

P^*_t = e^{t L^*}, (2.31)

where L^* is the L^2-adjoint of the generator of the process:

\int Lf\, h\, dx = \int f\, L^* h\, dx.

Let X_t be a Markov process with generator L with X_0 ∼ µ, and let P^*_t denote the adjoint semigroup defined in (2.31). We define

\mu_t := P^*_t \mu. (2.32)

This is the law of the Markov process. An argument similar to the one used in the derivation of the backward Kolmogorov equation (2.29) enables us to obtain an equation for the evolution of µ_t:

\frac{\partial \mu_t}{\partial t} = L^* \mu_t, \qquad \mu_0 = \mu.

Assuming that both the initial distribution µ and the law of the process µ_t have a density with respect to Lebesgue measure, ρ_0(·) and ρ(t, ·), respectively, this equation becomes

\frac{\partial \rho}{\partial t} = L^* \rho, \qquad \rho(y, 0) = \rho_0(y). (2.33)


This is the forward Kolmogorov equation. When the initial conditions are deterministic, X_0 = x, the initial condition becomes ρ_0 = δ(x − y). As with the backward Kolmogorov equation (2.29), this equation is still formal, since we do not yet have a formula for the adjoint L^* of the generator of the Markov process X_t. In Section 2.5 we will derive the forward and backward Kolmogorov equations and a formula for the generator L for diffusion processes.

    2.4 Ergodic Markov processes

In Sect. 1.2, we studied stationary stochastic processes, and we showed that such processes satisfy a form of the law of large numbers, Theorem 1.16. In this section we introduce a class of Markov processes for which the phase-space average with respect to an appropriate probability measure, the invariant measure, equals the long-time average. Such Markov processes are called ergodic. Ergodic Markov processes are characterized by the fact that an invariant measure, see Eq. (2.36) below, exists and is unique.

For the precise definition of an ergodic Markov process we require that all shift invariant sets, i.e. all sets of the form (X_{t_1} ∈ Γ_1, X_{t_2} ∈ Γ_2, . . . , X_{t_k} ∈ Γ_k) that are invariant under time shifts, are trivial, i.e. they have probability 0 or 1. For our purposes it is more convenient to describe ergodic Markov processes in terms of the properties of their generators and of the corresponding Markov semigroup.

We will consider a Markov process X_t in R^d with generator L and Markov semigroup P_t. We will say that X_t is ergodic provided that 0 is a simple eigenvalue of L or, equivalently, provided that the equation

    Lg = 0 (2.34)

has only constant solutions. Consequently, we can study the ergodic properties of a Markov process X_t by studying the null space of its generator. From (2.34), and using the definition of the generator of a Markov process (2.28), we deduce that a Markov process is ergodic if the equation

P_t g = g, (2.35)

has only constant solutions for all t ≥ 0. Using the adjoint semigroup, we can define an invariant measure as a probability measure that is invariant under the time evolution of X_t, i.e., a fixed point of the semigroup P^*_t:

P^*_t \mu = \mu. (2.36)

This equation is the L^2-adjoint of the equation P_t g = g in (2.35). If there is a unique probability measure satisfying (2.36), then the Markov process is ergodic (with respect to the measure µ). Using this, we can obtain an equation for the invariant measure in terms of the adjoint L^* of the generator, which is the generator of the semigroup P^*_t. Assume, for simplicity, that the measure µ has a density ρ with respect to Lebesgue measure. We subtract µ from both sides of (2.36), divide by t, and pass to the limit as t → 0 to obtain

    L∗ρ = 0. (2.37)

When X_t is a diffusion process, this equation is the stationary Fokker–Planck equation. Equation (2.37), which is the adjoint of (2.34), can be used to calculate the invariant distribution ρ, i.e., the density of the invariant measure µ.


The invariant measure (distribution) governs the long-time dynamics of the Markov process. In particular, when X_0 ∼ µ_0 initially, we have that

\lim_{t \to +\infty} P^*_t \mu_0 = \mu. (2.38)

Furthermore, the long-time average of an observable f converges to the equilibrium expectation with respect to the invariant measure

\lim_{T \to +\infty} \frac{1}{T} \int_0^T f(X_s)\, ds = \int f(x)\, \mu(dx).

This is the definition of an ergodic process that is quite often used in physics: the long-time average equals the phase-space average.

If X_0 is distributed according to µ, then so is X_t for all t ≥ 0. The resulting stochastic process, with X_0 distributed in this way, is stationary; see Sect. 1.2.

Example 2.4. Brownian motion in R^d is not an ergodic Markov process. On the other hand, if we consider it in a bounded domain with appropriate boundary conditions, then it becomes an ergodic process. Consider a one-dimensional Brownian motion on [0, 1], with periodic boundary conditions. The generator of this Markov process L is the differential operator L = \frac{1}{2} \frac{d^2}{dx^2}, equipped with periodic boundary conditions on [0, 1]. This operator is self-adjoint. The null spaces of both L and L^* comprise constant functions on [0, 1]. Both the backward Kolmogorov and the Fokker–Planck equation reduce to the heat equation

\frac{\partial \rho}{\partial t} = \frac{1}{2} \frac{\partial^2 \rho}{\partial x^2} (2.39)

with periodic boundary conditions in [0, 1]. We can solve the heat equation (2.39) using Fourier analysis to deduce that the solution converges to a constant at an exponential rate.
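A brief sketch of the Fourier argument (assuming an initial probability density ρ_0 on [0, 1] with Fourier coefficients \hat{\rho}_k(0)): writing

\rho(x, t) = \sum_{k \in \mathbb{Z}} \hat{\rho}_k(0)\, e^{-2\pi^2 k^2 t}\, e^{2\pi i k x}, \qquad \hat{\rho}_k(0) = \int_0^1 \rho_0(x)\, e^{-2\pi i k x}\, dx,

every mode with k ≠ 0 decays at rate 2π²k², so ρ(x, t) converges exponentially fast to the constant \hat{\rho}_0(0) = \int_0^1 \rho_0(x)\, dx = 1 as t → +∞.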

    Example 2.5. The one-dimensional Ornstein–Uhlenbeck process is a Markov process with generator

L = -\alpha x \frac{d}{dx} + D \frac{d^2}{dx^2}.

The null space of L comprises constants in x. Hence, it is an ergodic Markov process. In order to calculate the invariant measure, we need to solve the stationary Fokker–Planck equation:

L^* \rho = 0, \qquad \rho ≥ 0, \qquad \int \rho(x)\, dx = 1. (2.40)

We calculate the L^2-adjoint of L. Assuming that f, h decay sufficiently fast at infinity, we have

\int_{\mathbb{R}} (Lf)\, h\, dx = \int_{\mathbb{R}} \left[ \left(-\alpha x \frac{df}{dx}\right) h + \left(D \frac{d^2 f}{dx^2}\right) h \right] dx
= \int_{\mathbb{R}} \left[ f\, \partial_x(\alpha x h) + f\, (D\, \partial_x^2 h) \right] dx =: \int_{\mathbb{R}} f\, L^* h\, dx,

where

L^* h := \frac{d}{dx}(\alpha x h) + D \frac{d^2 h}{dx^2}.


Figure 2.1: Sample paths of the Ornstein–Uhlenbeck process.

We can calculate the invariant distribution by solving Eq. (2.40). The invariant measure of this process is the Gaussian measure

\mu(dx) = \sqrt{\frac{\alpha}{2\pi D}} \exp\left(-\frac{\alpha}{2D} x^2\right) dx.

If the initial condition of the Ornstein–Uhlenbeck process is distributed according to the invariant measure, then the Ornstein–Uhlenbeck process is a stationary Gaussian process. Let X_t denote the one-dimensional Ornstein–Uhlenbeck process with X_0 ∼ N(0, D/α). Then X_t is a mean-zero Gaussian second-order stationary process on [0, ∞) with correlation function

R(t) = \frac{D}{\alpha}\, e^{-\alpha |t|}

and spectral density

f(x) = \frac{D}{\pi} \frac{1}{x^2 + \alpha^2}.

The Ornstein–Uhlenbeck process is the only real-valued mean-zero Gaussian second-order stationary Markov process with continuous paths defined on R. This is the content of Doob's theorem. See Exercise 6. A few paths of the stationary Ornstein–Uhlenbeck process are presented in Fig. 2.1.
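A numerical illustration of the ergodic averaging for this example (a sketch with illustrative parameter values, using the exact Gaussian one-step transition of the process): the time average of f(x) = x² along a long trajectory is compared with its expectation under the invariant measure, which equals D/α.

    import numpy as np

    # Exact one-step transition of the OU process with generator -alpha x d/dx + D d^2/dx^2:
    # X_{t+dt} | X_t = x  ~  N(x e^{-alpha dt}, (D/alpha)(1 - e^{-2 alpha dt})).
    rng = np.random.default_rng(3)
    alpha, D = 2.0, 0.5                      # illustrative parameters
    dt, n_steps = 0.01, 200_000
    decay = np.exp(-alpha * dt)
    std = np.sqrt(D / alpha * (1.0 - decay ** 2))

    X = np.empty(n_steps)
    X[0] = 0.0
    for i in range(1, n_steps):
        X[i] = decay * X[i - 1] + std * rng.normal()

    # Long-time average of f(x) = x^2 versus the invariant-measure expectation D/alpha.
    print(np.mean(X ** 2), D / alpha)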

2.5 Diffusion processes and the forward and backward Kolmogorov equations

A Markov process consists of three parts: a drift, a random part and a jump process. A diffusion process is a Markov process that has continuous sample paths (trajectories). Thus, it is a Markov process with no jumps. A diffusion process can be defined by specifying its first two moments, together with the requirement that there are no jumps. We start with the definition of a diffusion process in one dimension.


Definition 2.6. A Markov process X_t in R with transition function P(Γ, t|x, s) is called a diffusion process if the following conditions are satisfied.

i. (Continuity). For every x and every ε > 0,

\int_{|x-y| > \varepsilon} P(dy, t|x, s) = o(t-s) (2.41)

uniformly over s < t.

ii. (Definition of drift coefficient). There exists a function b(x, s) such that for every x and every ε > 0,

\int_{|y-x| \leq \varepsilon} (y - x)\, P(dy, t|x, s) = b(x, s)(t-s) + o(t-s), (2.42)

uniformly over s < t.

iii. (Definition of diffusion coefficient). There exists a function Σ(x, s) such that for every x and every ε > 0,

\int_{|y-x| \leq \varepsilon} (y - x)^2\, P(dy, t|x, s) = \Sigma(x, s)(t-s) + o(t-s), (2.43)

uniformly over s < t.

In Definition 2.6 we truncated the domain of integration since we do not know whether the first and second moments of X_t are finite. If we assume that there exists a δ > 0 such that

\lim_{t \to s} \frac{1}{t-s} \int_{\mathbb{R}^d} |y - x|^{2+\delta}\, P(dy, t|x, s) = 0, (2.44)

then we can extend the integration over the whole R and use expectations in the definition of the drift and the diffusion coefficient. Indeed, let k = 0, 1, 2 and notice that

\int_{|y-x| > \varepsilon} |y - x|^k\, P(dy, t|x, s) = \int_{|y-x| > \varepsilon} |y - x|^{2+\delta}\, |y - x|^{k-(2+\delta)}\, P(dy, t|x, s)
\leq \frac{1}{\varepsilon^{2+\delta-k}} \int_{|y-x| > \varepsilon} |y - x|^{2+\delta}\, P(dy, t|x, s)
\leq \frac{1}{\varepsilon^{2+\delta-k}} \int_{\mathbb{R}^d} |y - x|^{2+\delta}\, P(dy, t|x, s).

    Using this estimate together with (2.44) we conclude that:

\lim_{t \to s} \frac{1}{t-s} \int_{|y-x| > \varepsilon} |y - x|^k\, P(dy, t|x, s) = 0, \qquad k = 0, 1, 2.

This implies that Assumption (2.44) is sufficient for the sample paths to be continuous (k = 0) and for the replacement of the truncated integrals in (2.42) and (2.43) by integrals over R (k = 1 and k = 2, respectively).

Assuming that the first two moments exist, we can write the formulas for the drift and diffusion coefficients in the following form:

    limt→s

    E

    (X