c8.1 stochastic di erential equations

C8.1 Stochastic Differential Equations

Zhongmin QianMathematical Institute, University of Oxford

November 30, 2021

Contents

1 Introduction 1

2 Martingales, Brownian motion and Ito calculus 52.1 Some notions about stochastic processes . . . . . . . . . . . . . . . . . . . 52.2 Martingales in continuous time . . . . . . . . . . . . . . . . . . . . . . . . 62.3 Brownian motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102.4 Ito’s calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.4.1 Quadratic processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 152.4.2 Ito’s integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172.4.3 Ito’s formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

3 Selected applications of Ito’s formula 233.1 Levy’s characterization of Brownian motion . . . . . . . . . . . . . . . . . 233.2 Time-changes of Brownian motion . . . . . . . . . . . . . . . . . . . . . . . 243.3 Burkholder-Davis-Gundy inequality . . . . . . . . . . . . . . . . . . . . . . 253.4 Stochastic exponentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303.5 Exponential inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333.6 Girsanov’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343.7 The martingale representation theorem . . . . . . . . . . . . . . . . . . . . 36

4 Stochastic differential equations 414.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414.2 Several examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

4.2.1 Linear-Gaussian diffusions . . . . . . . . . . . . . . . . . . . . . . . 434.2.2 Geometric Brownian motion . . . . . . . . . . . . . . . . . . . . . . 444.2.3 Cameron-Martin’s formula . . . . . . . . . . . . . . . . . . . . . . . 46

4.3 Existence and uniqueness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474.3.1 Strong solutions: existence and uniqueness . . . . . . . . . . . . . . 474.3.2 Continuity in initial conditions . . . . . . . . . . . . . . . . . . . . 52

4.4 Martingale problem and weak solutions . . . . . . . . . . . . . . . . . . . . 53

5 Local times 595.1 Tanaka’s formula and local time . . . . . . . . . . . . . . . . . . . . . . . . 595.2 The Skorohod equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 755.3 Local time for Brownian motion . . . . . . . . . . . . . . . . . . . . . . . . 77

iii

iv CONTENTS

6 Appendix: Convex functions 816.1 Monotonic functions, Lebesgue-Stieltjes measures . . . . . . . . . . . . . . 816.2 Right continuous inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . 846.3 Convex functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

Chapter 1

Introduction

Stochastic differential equations become increasingly important not only in applied areassuch as quantitative finance1, mathematical physics2, turbulence3, and etc., but also inmany research areas of pure mathematics like non-linear partial differential equations4,harmonic analysis5, ergodic theory6, differential geometry7 and etc. There are great namesin modern science history which are associated with the development of the subject. Letus make a short list which is far from complete of course.

1) The major character in stochastic analysis is, without doubt, Brownian motion.The physical Brownian motion, which is the name given to the chaotic movements of smallpollen grains of clarkia pulchella suspended in water, was observed and reported first to theRoyal Society by the Scottish botanist Robert Brown in 1828 (he lived from 21 December1773 to 10 June 1858).

2) Loius Bachelier, around 1900, took a far time-ahead step using Brownian motionto model stock market in his Ph. D thesis (submitted to the faculty of Ecole NormaleSuperieure Paris). His thesis was published: L. Bachelier: Theorie de la speculation, inAnn. Sci. Ecole Norm. Sup. 17, 21 - 86 (1900). L. Bachelier used the fundamentalsolution of the heat equation (i.e. the heat kernel, Green’s function or called the sourcefunction) to describe the transition laws of Brownian motion. In a delighted book byMark Davis and Alison Etheridge: Louis Bachelier’s Theory of Speculation – the origins

1Steven E. Shreve: Stochastic Calculus for Finance II – Continuous-Time Models, Springer 2004. AlsoRobert C. Merton: Continuous-Time Finance, Blackwell 1990.

2Barry Simon: Functional Integration and Quantum Physics, Academic Press 1979.3Stephen B. Pope: Turbulent Flows, Cambridge University Press 2000.4Daniel W. Stroock and S.R.S. Varadhan: Multidimensional Diffusion Processes, Springer 1979 and

1997 (reprinted), and also Daniel W. Stroock: Partial Differential Equations for Probabilists, CambridgeUniversity Press 2008.

5Ross G. Pinsky: Positive Harmonic Functions and Diffusion. Cambridge University Press 1995. For adiscrete case refer to Tullio Ceccherini-Silberstein, Fabio Scarabotti and Filippo Tolli: Harmonic Analysison Finite Groups. Cambridge University Press 2008.

6H. P. Mckean, Jr.: Stochastic Integrals, Academic Press 1969. This is the little book beautifullywritten which educated a generation of scholars in stochastic analysis. Also his recent very nice exercisebook Henry McKean: Probability – The Classical Limit Theorems. Cambridge University Press 2014.

7Jean-Michel Bismut and Gilles Lebeau: The Hypoelliptic Laplacian and Ray-Singer Metrics, Prince-ton University Press 2008, and of course Nobuyuki Ikeda and Shinzo Watanabe: Stochastic DifferentialEquations and Diffusion Processes, North-Holland 1981.

1

2 CHAPTER 1. INTRODUCTION

of Modern Finance, Princeton University Press (2006), one may find a good account aboutthis in English with detailed explanations, also the Xrox copy of the original print, and aForward penned by the first Nobel economics laureate Paul A. Samuelson (May 15, 1915to December 13, 2009).

3) Albert Einstein (14 March 1879-18 April 1955), well known for his special andgeneral theories of relativity, in 1905, the year he created his fundamental paper on specialrelativity, published his study of Brownian motion [A. Einstein: On the movement ofsmall particles suspended in a stationary liquid demanded by the molecular-kinetic theoryof heat, Annalen der Physik 17 (1905), 549-560. English translation of the paper maybe found in Einstein’s Miraculous Year – five papers that changed the face of physics,Princeton University Press (1998)]. Einstein’s Brownian motion paper and related papersare collected in a small book titled Investigations on the theory of the Brownian movement,edited by R. Furth and translated by A. D. Cowper, Dover Publications, INC. (1956). Inthe same year A. Einstein published 5 fundamental papers. One is about Special Relativity[A. Einstein: Zur Elektrodynamik bewegter Korper, Annalen der Physik 17 (1905), 891 –921. English translation including his fundamental paper on General Theory in 1916 andother key papers on gravitation may be found in The Principe of Relativity – A collectionof original papers on the special and general theory of relativity by A. Einstein, H. A.Lorentz, H. Weyl and H. Minkowski (with notes by A. Sommerfeld), Dover Publications,INC (1952)]. The another work published in the same year by A. Einstein is on the law ofthe photoelectric effect, a fundamental quantum hypothesis [A. Einstein: On a heuristicpoint of view concerning the production and transformation of light, Annalen der Physik 17(1905), 132 - 148], which is the work A. Einstein was awarded the Nobel Prize in Physicsin 1921.

4) Norbert Wiener (November 26, 1894 - March 18, 1964) in 1923 [N. Wiener: Differ-ential space, J. Math. Phys. 2 (1923), 131 - 174] constructed the distribution of Brownianmotion as a probability measure over the space of continuous paths, which is the archetypalexample of a measure over an infinite dimensional space in the Lebesgue sense. This workmarked the start of the research area of stochastic analysis.

5) Paul Levy (1886 - 1971) was probably the longest ever lived mathematician whoworked by him own and discovered rich and interesting unusual properties of the mathe-matical model of Brownian motion proposed by L. Bachilier and A. Einstein. His resultswere published in a book: Processus stochastiques et mouvement Brownien, Gauthier-Villars et Fils (1948). He studied a class of stochastic processes (he called them additiveprocesses, his monograph Theorie de l’addition des variables aleatories was published in1937) which have important applications, now named after him called Levy’s processes.

6) Kiyoshi Ito (September 7, 1915 - 10 November 2008) was mainly credited as thefounder of the theory of stochastic differential equations. His foundation papers includingthe paper about Ito’s lemma or Ito’s formula which was published around 1942, and histheory of SDE was, with much delay due to the second war, published in 1951 [K. Ito:On stochastic differential equations, Mem. Amer. Math. Soc. 4 (1951)]. The reader maygather more or less a full picture about Ito’s calculus in Nobuyuki Ikeda and Shinzo Watan-abe: Stochastic Differential Equations and Diffusion Processes, North-Holland/Kodansha(1981). Ito’s calculus was also explained in a master piece Stochastic Integrals by H. P.McKean published in 1969 (Academic Press, recently reissued by AMS). K. Ito and H.

3

P. McKean also wrote a comprehensive monograph ”Diffusion processes and their samplepaths” (Springer-Verlag, 1965) which gives a path-wise construction of a class of diffusionmodels determined by second-order elliptic differential operators on intervals of the realline. This book has been the aspiring source for mathematicians since then.

7) The general theory of stochastic processes – the stochastic calculus for semi-martingaleswhich generalizes Ito’s theory to a large class of stochastic processes, was established mainlyby the French School under the leadership of Paul-Andre Meyer (21 August 1934 - 30 Jan-uary 2003). Meyer and his co-authors created monumental volumes on semi-martingales,represented with the 5 volumes written by him and his co-authors [C. Dellacherie, P.A.Meyer: Probabilites et potentiel. Hermann, Paris, vol. I (1975), vol. II (Chapitres 5 a 8: Theorie des martingales, 1980), vol III (Chapitres 9 a 11, Theorie discrete du potentiel,1983), vol. IV (Theorie du potentiel associee a une resolvante, Theorie des processus deMarkov, 1987), vol. V (with one more co-author B. Maisonneuve : Processus de Markov(fin) : Complements de calcul stochastique, 1992)].

8) Paul Malliavin (September 10, 1925 - June 3, 2010) developed a calculus of variationsfor Wiener functionals. His book Stochastic Analysis (Springer, 1997) soon becomes aclassic. He even wrote a book with Anton Thalmaiera on the applications of his calculusto quantitative finance titled Stochastic calculus of variations in mathematical finance(Springer-Verlag, Berlin, 2006). The latter book should be useful for those interested inthe applications of viscosity solutions to PDEs too.

9) There are many excellent books which appeared during 1970’s to 1990’s, such as(1) D. W. Stroock and S. R. S Varadhan: Multidimensional diffusion processes (Springer-Verlag 1979), (2) L. C. G. Rogers and D. Williams: Diffusions, Markov processes, andmartingales Vol. 2: Ito Calculus (Wiley, New York, 1987) (reissued by Cambridge Uni-versity Press), (3) D. Revuz and M. Yor: Continuous martingales and Brownian motion(Springer-Verlag, Berlin 1991), (4) I. Karatzas and S. E. Shreve: Brownian motion andstochastic calculus (Springer 1998). This list is still a small sample of books useful forlearning stochastic analysis.

10) On the applied fronts, besides those traditional applications in pure mathematicsand in engineering, Ito’s calculus has been found to be the perfect tool for quantitativefinance. In a paper published in 1971 by Fischer Sheffey Black (January 11, 1938 - August30, 1995, an economist) and Myron Scholes (born July 1, 1941, and economist who wasawarded the Nobel Memorial Prize in Economic Sciences in 1997) titled The Pricing of op-tions and corporate liabilities (J. Polit. Economy) in which they derived the famous PDEnow named after them called the Black-Scholes equation, which allow them to obtain anexplicit formula for pricing the European options. Another economist Robert C. Merton(born July 31, 1944, awarded Nobel Prize in Economic in 1997) made the pioneering con-tributions to continuous-time finance by using Brownian motion in the ”Theory of rationaloption pricing”. Bell Journal of Economics and Management Science (The RAND Cor-poration) 4 (1),141-183 (1973). Here, after about 70 years of Bachilier’s work, we witnessthe rebirth of the subject with modern and matured tools towards the understanding offinancial markets. The reader can learn a lot from the collection of Robert C. Merton:Continuous-time Finance (Blackwell, Cambridge MA & Oxford UK, 1990).

Finally let me point out that a major problem in science, which is largely still open, isto describe and to construct laws (equivalently distributions) of random fields which are

4 CHAPTER 1. INTRODUCTION

probability measures on certain infinite dimensional spaces of mappings between spaces,such as spaces of continuous mapping from T to M , where T ⊂ Rd a subset and M ⊆ Rn

(a sub-manifold of Rn). In modern science, fields such as gauge fields in quantum field the-ories, velocity fields in turbulence, are fundamental objects, and fields appearing in manyapplications (such as turbulence, quantum field theory, statistical mechanics, condensedmatter physics and so on) tend to be random. A theory in science should acquires at leasttwo functions. First it should provide a good description of observed phenomena in termsof certain language, and second it should have the power of making certain predictions.Being a successful scientific theory, it is essential to build good mathematical models andtheories of random fields. In this course we develop a general theory for constructing aspecial class of random fields, called stochastic processes.

Chapter 2

Martingales, Brownian motion andIto calculus

In this chapter we recall the fundamental concepts and results about martingales (incontinuous-time), Brownian motion, and the elements about stochastic integrals (Ito’sintegrals).

2.1 Some notions about stochastic processes

Stochastic processes are mathematical models used to describe random phenomena evolv-ing in time. Let T denote the range of time-parameter t, which is [0,∞) in these lectures,though other choices are also allowed. T is thus an ordered set endowed with the usualtopology.

Definition 2.1.1 A stochastic process is a parameterized family X = (Xt)t∈T of randomvariables on a probability space (Ω,F ,P) taking values in a topological space S (called thestate space). Unless otherwise specified, in this course, S = R, or Rd.

A stochastic process X = (Xt)t∈T may be considered as a function from T × Ω→ Rd,which is the reason why a stochastic process is also called a random function.

For each ω ∈ Ω, the mapping t → Xt(ω) from T to S is called a sample path (or atrajectory, or a sample function). Naturally, a stochastic process X = (Xt)t∈T is continuous(resp. right-continuous, right-continuous with left-limits) if sample paths t → Xt(ω) arecontinuous (resp. right-continuous, right-continuous with left-limits) for almost all ω ∈ Ω.

If X = (Xt)t≥0 is a stochastic process with values in Rd, and 0 ≤ t1 < t2 < · · · < tn,the joint distribution of random variables (Xt1 , · · · , Xtn) given by

µt1,t2,··· ,tn(dx1, · · · , dxn) = P (Xt1 ∈ dx1, · · · , Xtn ∈ dxn) ,

a probability measure on Rd × · · · × Rd, is called a finite-dimensional distribution of X =(Xt)t≥0. If d = 1, the measure µt1,t2,··· ,tn may be determined from its distribution function

Ft1,t2,··· ,tn(x1, · · · , xn) = P (Xt1 ≤ x1, · · · , Xtn ≤ xn) .

5

6 CHAPTER 2. MARTINGALES, BROWNIAN MOTION AND ITO CALCULUS

In order to overcome some technical difficulties arising from measurability, in particularin dealing with stochastic processes in continuous-time, a common condition, which is goodenough to include a large class of interesting stochastic processes, is that almost all samplepaths X = (Xt)t≥0 are right-continuous.

Exercise 2.1.2 Let (Xt)t≥0 be a stochastic process in Rd on (Ω,F ,P), and B ⊂ Rd beBorel measurable. If F ⊆ [0,∞) is finite or countable, then ω : Xt(ω) ∈ B for anyt ∈ Fand supt∈F |Xt| are measurable.

The main task of stochastic analysis is to study the probability (or distribution) prop-erties of random functions determined by their families of finite-dimensional distributions.

Definition 2.1.3 Two stochastic processes X = (Xt)t≥0 and Y = (Yt)t≥0 are equivalent(distinguishable) if P (Xt = Yt) = 1 for every t ≥ 0. In this case, (Yt)t≥0 is called a versionof (Xt)t≥0.

By definition, the family of finite-dimensional distributions of a stochastic process X =(Xt)t≥0 is unique up to equivalence of processes.

On the other hand, in many practical situations, we are given a collection of compatiblefinite dimensional distributions D = µt1,··· ,tn , for t1 < · · · < tn, tj ∈ T , we would liketo construct a stochastic model (Xt)t∈T on some probability space (Ω,F ,P) so that thefamily of finite dimensional distributions determined by (Xt)t∈T coincides with the familyD of distributions. In this case, (Xt)t∈T is called a realization of D.

2.2 Martingales in continuous time

The definition of martingales (super- and sub-martingales) and Doob’s fundamental in-equalities in discrete-time may be extended to martingales in continuous time, after nec-essary and obvious modifications. The only new result in the theory of martingales incontinuous time is a regularity result about their sample paths.

Let (Ω,F ,Ft,P) be a filtered probability space. A (Ft)-adapted (real valued) pro-cess (Xt)t≥0 is called a martingale (resp. super-martingale; resp. sub-martingale), ifE (Xt|Fs) = Xs (resp. E (Xt|Fs) ≤ Xs; resp. E (Xt|Fs) ≥ Xs) almost surely for anyt ≥ s ≥ 0. Similarly, the concept of stopping times can be stated in this setting as well,namely, a function T : Ω → [0,∞] is an (Ft)-stopping time if for every t ≥ 0, the eventT ≤ t belongs to Ft. A new kind of stopping times called predictable times which hasno interest in discrete-time case will play a role if the underlying stochastic processes havejumps. A stopping time T : Ω → [0,∞] is predictable if there is an increasing sequence(Tn) of (Ft)-stopping times such that for each n, Tn < T and limn→∞ Tn = T .

If T is a stopping time, then

FT = A ∈ F : for any t ≥ 0, [T ≤ t] ∩ A ∈ Ft

is the σ-algebra representing the information available up to the random time T , and

FT− = A ∈ F : for any t ≥ 0, [T < t] ∩ A ∈ Ft

2.2. MARTINGALES IN CONTINUOUS TIME 7

represents information known strictly before time T .Let Ft+ = ∩s>tFs for every t ≥ 0, and define Ft− = σFs : s < t for t > 0. Then

(Ft+) is again a filtration and Ft+ ⊇ Ft. If T : Ω→ [0,∞] is an (Ft+)-stopping time then

FT+ = A ∈ G : for any t ≥ 0, [T ≤ t] ∩ A ∈ Ft+

is a σ-algebra.A filtration (Gt) is right-continuous if Gt+ = Gt for each t ≥ 0. Clearly (Ft+) is right-

continuous.

Theorem 2.2.1 If (Xt)t≥0 is a martingale (resp. super-martingale, resp. sub-martingale)on (Ω,F ,Ft,P) with right-continuous sample paths almost surely, then (Xt)t≥0 is a mar-tingale (resp. super-martingale, resp. sub-martingale) on (Ω,F ,Ft+,P).

One would ask when a martingale (super-martingale) has right-continuous sample pathsalmost surely? The question can be answered via Doob’s convergence theorem for super-martingales.

Let X = (Xt)t≥0 be a real valued stochastic process, and let a < b. If

F = 0 ≤ t1 < t2 < · · · < tN

is a finite subset of [0,∞), then U ba(X,F ) denotes the number of up-crossings by Xt1 , · · · , XtN,

and if D ⊂ [0,∞), then U ba(X,D) denotes the superemum of U b

a(X,F ) over F being finitesubset of D. Obviously D → U b

a(X,D) is increasing with respect to the inclusion ⊂. Inparticular, if X = (Xt)t≥0 is (Ft)-adapted and if D is a countable subset of [0,∞) thenfor every t ≥ 0, U b

a(X,D ∩ [0, t]) is measurable with respect to Ft. We may apply Doob’sup-crossing number inequality to (Xt)t∈F where F is a finite subset, and therefore establishthe following

Theorem 2.2.2 (Doob’s up-crossing number inequality). If X = (Xt)t≥0 is a super-martingale, then

E[U ba(X,D)

]≤ 1

b− aE (Xt − a)−

for any a < b, t > 0 and any countable subset D of [0, t], where x− = (−x) ∨ 0.

Proof. Let t > 0 be any but fixed. List all elements in D as t1, t2, · · · , and for eachn = 1, 2, · · · , Fn = t1, · · · , tn, t. Then clearly Fn ↑ D ∩ [0, t], and therefore

U ba(X,Fn) ↑ U b

a(X,D ∪ t).

Each U ba(X,Fn) is Ft-measurable, and therefore U b

a(X,D ∪ t) is Ft-measurable too. Bythe same reasoning (dropping t in the definition Fn) U b

a(X,D) is Ft-measurable. By MCT

E[U ba(X,D)

]≤ U b

a(X,D ∪ t) = limn→∞

U ba(X,Fn).

While by Doob’s up-crossing lemma for S-martingales in discrete-time

U ba(X,Fn) ≤ 1

b− aE (Xt − a)−

and the claim follows.It follows thus the following version of the super-martingale convergence theorem.


Corollary 2.2.3 Let X = (Xt)t≥0 be a super-martingale, and let D be a countable densesubset of [0,∞). Then for almost all ω ∈ Ω, the right limit of (Xt)t≥0 along the countabledense set D, lims∈D,s>t,s↓tXs exist at all t ≥ 0, and the left limit along D, lims∈D,s<t,s↑tXs

exist at t > 0.

Here is the proof [also the proof of 1) and 2) in Follmer’s lemma below] which is quitetypical of this kind of arguments using discrete results. For simplicity, let Dt = D ∩ [0, 2t]for t > 0. Then, by using the definition of limits, both limits

lims∈D,s>t,s↓t

Xs and lims∈D,s<t,s↓t

Xs

exist on U ba(X,Dt) <∞ : a < b, a, b ∈ Q

=

⋂a,b∈Q,a<b

U ba(X,Dt) <∞

which is F2t-measurable (in particular it is measurable). By Doob’s up-crossing numberlemma

E[U ba(X,D)

]≤ 1

b− aE (Xt − a)− <∞

so thatU ba(X,Dt) =∞

has probability zero, and therefore

Nt ≡⋃

a,b∈Q,a<b

U ba(X,Dt) =∞

has probability zero for every t. Now observe that Nt is actually increasing in t, so thatN = ∪t>0Nt = limk→∞Nk is measurable, and has probability zero. That is

⋃t≥0Nt has

probability zero and thereforeU ba(X,Dt) <∞ : for all a < b, a, b ∈ Q and t > 0

= Ω \N

has probability one. By construction of N , both limits

lims∈D,s>t,s↓t

Xs and lims∈D,s<t,s↓t

Xs

exist on Ω \N for all t > 0.The following is the main regularity result, which is called Follmer’s lemma.

Theorem 2.2.4 Let (Xt)t≥0 be a super-martingale (resp. martingale) on (Ω,F ,Ft,P),and D ⊆ [0,∞) be countable. Let N be constructed in the previous proof. Then

1. For every ω ∈ Ω \ N , Zt(ω) = lims∈D,s>t,s↓tXs(ω) exists, and Zt is Ft+-measurablefor all t ≥ 0.

2. For every ω ∈ Ω \ N and for all t > 0 the left limit Zt−(ω) = lims<t,s↑ Zs(ω), and(Zt)t≥0 is a (Ft+)-adapted process with right-continuous sample paths and left limits.

3. For any t ≥ 0, E (Zt|Ft) ≤ Xt (resp. E (Zt|Ft) = Xt) almost surely.

4. (Zt)t≥0 is a super-martingale (resp. martingale) on (Ω,F ,Ft+,P).

2.2. MARTINGALES IN CONTINUOUS TIME 9

In general however (Zt)t≥0 constructed as above may not be not a version of (Xt)t≥0, andtherefore they may have different distributions. The following is the most useful regularityresult about martingales.

Corollary 2.2.5 Under the same assumptions and notations as in Theorem 2.2.4. As-sume that (Ft)t≥0 is right-continuous. Then (Zt)t≥0 is a version of (Xt)t≥0, that is, foreach t ≥ 0, Zt = Xt almost surely, if and only if t→ EXt is right-continuous.

Corollary 2.2.6 Under the same assumptions and notations as in Theorem 2.2.4. If(Ft)t≥0 is right continuous, and if (Xt)t≥0 is a martingale on (Ω,F ,Ft,P), then the process(Zt)t≥0 defined in 2.2.4 is a version of (Xt)t≥0.

This is because for a martingale (Xt)t≥0, t→ EXt = EX0 is a constant.From now we will work on a filtered probability space (Ω,F ,Ft,P) which satisfies the

following the usual conditions.

1. (Ω,F ,P) is a complete probability space.

2. The filtration (Ft)t≥0 is right-continuous: Ft = Ft+ ≡ ∩s>tFs for every t ≥ 0.

3. Each Ft contains all null sets in F .

Remark 2.2.7 If X = (Xt)t≥0 is a right-continuous stochastic process on a completeprobability space (Ω,F ,P), then its natural filtration (Ft)t≥0 satisfies the usual conditions.

Theorem 2.2.8 If X = (Xt)t≥0 is a right-continuous stochastic process adapted to (Ft)t≥0

(recall that our filtration (Ft)t≥0 satisfies the usual conditions), and if T : Ω→ [0,∞] is astopping time, then the random variable XT1T<∞ is measurable with respect to σ-algebraFT , where

XT1T<∞(ω) = XT (ω)(ω)1ω:T (ω)<∞(ω)

=

XT (ω)(ω) ; if T (ω) <∞ ,

0; if T (ω) =∞ .

The following theorem provides us with a class of interesting stopping times.

Theorem 2.2.9 Let X = (Xt)t≥0 be an Rd-valued, adapted stochastic process that is right-continuous and has left-limits. Then T = inf t ≥ t0 : Xt ∈ D is a stopping time, whereD ⊂ Rd is Borel measurable and t0 ≥ 0, with the convention that inf ∅ = ∞. T is calledthe hitting time of D by the process X.

Example 2.2.10 If X = (Xt)t≥0 is an adapted, continuous process on (Ω,F ,Ft,P) andif D ∈ Rd is a bounded closed subset of Rd, then T = inft ≥ 0 : Xt ∈ D is a stoppingtime. If X0 ∈ Dc, then XT1T<∞ ∈ ∂D. In particular, if d = 1 and b is a real number,then Tb = inft ≥ 0 : Xt = b is a stopping time. supt∈[0,N ] Xt is a random variable (whereN > 0 is any number),

supt∈[0,N ]

Xt < b

= Tb > N


and supt∈[0,N ]

Xt ≥ b

= Tb ≤ N .

The concept of stopping times provides us with a means of “localizing” quantities.Suppose (Xt)t≥0 is a stochastic process, and T is a stopping time, then XT = (Xt∧T )t≥0 isa stochastic process stopped at (random) time T , where

Xt∧T (ω) =

Xt(ω) if t ≤ T (ω) ;XT (ω)(ω) if t ≥ T (ω) .

Another interesting stopped process at random time T associated with X is the processX1[0,T ] which is by definition(

X1[0,T ]

)t(ω) = Xt1t≤T(ω)

=

Xt(ω) if t ≤ T (ω) ;

0 if t > T (ω) .

It is obvious that XTt = Xt1t≤T+XT1t>T. If (Xt)t≥0 is adapted to the filtration (Ft)t≥0

, so are the process (Xt∧T )t≥0 stopped at stopping time T and Xt1t≤T.

Definition 2.2.11 An adapted stochastic process X = (Xt)t≥0 is called a local martingaleif there is an increasing family Tn of finite stopping times such that Tn ↑ ∞ as n→∞and that (Xt∧Tn)t≥0 is a martingale for each n.

Similarly, we may define local super- or sub-martingales etc.

2.3 Brownian motion

Let us begin with the definition of Brownian motion as a continuous stochastic process.A stochastic process B = (Bt)t≥0 on a probability space (Ω,F ,P) with values in Rd is

called a Brownian motion (BM) in Rd, if

1. (Bt)t≥0 possesses independent increments: for any 0 ≤ t0 < t1 < · · · < tn randomvariables

Bt0 , Bt1 −Bt0 , · · · , Btn −Btn−1

are independent,

2. for any t > s ≥ 0, random variable Bt − Bs has a normal distribution N(0, t − s),that is, Bt −Bs has pdf (probability density function)

p(t− s, x) =1

(2π(t− s))d/2e−

|x|22(t−s) ; x ∈ Rd .

In other wordsP (Bt −Bs ∈ dx) = p(t− s, x)dx ,

2.3. BROWNIAN MOTION 11

3. almost all sample paths of (Bt)t≥0 are continuous.

If, in addition, PB0 = x = 1 where x ∈ Rd, then we say (Bt)t≥0 is a Brownian motionstarting at x. PB0 = 0 = 1 where 0 is the origin of Rd, then we say (Bt)t≥0 is a standardBrownian motion.

We will see that the condition 3) is not nontrivial, which ensure that many interestingfunctionals of Brownian motion, such as the running maximum B?

t = sups≤tBs, are indeedrandom variables. Let p(t, x, y) = p(t, x− y), and define

Ptf(x) =

∫Rdf(y)p(t, x, y)dy ∀f ∈ Cb(Rd) .

for every t > 0. Since

p(t+ s, x, y) =

∫Rdp(t, x, z)p(s, z, y)dz

therefore (Pt)t≥0 is a semigroup on Cb(Rd). (Pt)t≥0 is called the heat semigroup in Rd: iff ∈ C2

b (Rd), then u(t, x) = (Ptf)(x) solves the heat equation(1

2∆ +

∂

∂t

)u(t, x) = 0 ; u(0, ·) = f ,

where ∆ =∑

i∂2

∂x2iis the Laplace operator.

The connection between Brownian motion and the Laplace operator ∆ (hence theharmonic analysis) is demonstrated through the following identity:

E (f(Bt + x)) = (Ptf) (x) =1

(2πt)d/2

∫Rdf(y)e−

|y−x|22t dy

where Bt is a standard Brownian motion.

Example 2.3.1 If B = (Bt)t≥0 is a BM in R, then

E|Bt −Bs|p = cp|t− s|p/2 for all s, t ≥ 0 (2.1)

for p ≥ 0, where cp is a constant depending only on p. Indeed

E|Bt −Bs|p =1√

2π|t− s|

∫R|x|p exp

(− |x|2

2|t− s|

)dx .

Making change of variable

x√|t− s|

= y ; dx =√|t− s|dy

we thus have

E|Bt −Bs|p =(√|t− s|)p√

2π

∫R|x|p exp

(−|x|

2

2

)dx

= cp|t− s|p/2


where

cp =1√2π

∫Rd|x|p exp

(−|x|

2

2

)dx .

(2.1) remains true for BM in Rd with a constant cp depending on p and d.

Remark 2.3.2 If d = 1, then Bt − Bs ∼ N(0, t − s). It is an easy exercise to show thatfor every n ∈ Z+

E(Bt −Bs)2n =

(2n)!

2nn!|t− s|n .

Let B = (Bt)t≥0 be a standard BM in R. Then B is a centered Gaussian process withco-variance function C(s, t) = s ∧ t. Indeed, any finite-dimensional distribution of B isGaussian [Exercise], so that B is a centered Gaussian process, and its co-variance function(if s < t)

E(BtBs) = E((Bt −Bs)Bs +B2s )

= E((Bt −Bs)Bs) + EB2s

= E(Bt −Bs)EBs + EB2s

= s .

Theorem 2.3.3 (N. Wiener) There is a standard Brownian motion in Rd.

Let B = (Bt)t≥0 be a standard BM in Rd.(1) BM B = (Bt)t≥0 is stationary, in the sense that for any fixed time S, Bt = Bt+S−BS

is again a standard Brownian motion. This statement is true indeed for any finite stoppingtime S.

(2) Scaling invariance, self-similarity. For any real number λ 6= 0, Mt ≡ λBt/λ2 is astandard BM in Rd.

(3) Isotropic property. If U is an d × d orthonormal matrix, then UB = (UBt)t≥0 isa standard BM in Rd. That is, BM is invariant under the action of orthogonal group ofRd. This says a little bit more than the usual definition of isotropic property in that theinvariance is valid for all U ∈ O(d), not only for its sub-group SO(d).

(4) Let B = (Bt)t≥0 be a standard BM in R, and define M0 = 0 and Mt = tB1/t fort > 0, is a standard BM in R.

Brownian motion as a Markov process. Recall that

p(t, x) =1

(2πt)d/2e−|x|22t

in Rd, and (Pt)t≥0 the heat semigroup Ptf(x) =∫Rd f(y)p(t, x− y)dy for every t > 0.

Let (F0t )t≥0 denote the filtration generated by a standard Brownian motion (Bt)t≥0,

and F0∞ = ∪t≥0F0

t .(5) For any t > s ≥ 0, the increment Bt −Bs is independent of F0

s .(6) If t > s, then the joint distribution of Bs and Bt is given by

P (Bs ∈ dx, Bt ∈ dy) = p(s, x)p(t− s, y − x)dxdy .

2.3. BROWNIAN MOTION 13

Indeed, since Bs and Bt−Bs are independent, so that (Bs, Bt−Bs) has a pdf p(s, x1)p(t−s, x2), thus, for any bounded Borel measurable function f

Ef(Bs, Bt) = Ef(Bs, Bt −Bs +Bs)

=

∫∫f(x1, x2 + x1)p(s, x1)p(t− s, x2)dx1dx2 .

Making change of variables x1 = x and x2 +x1 = y in the last double integral, the inducedJacobi is 1 so that dx1dx2 =dxdy (as measures), and therefore

Ef(Bs, Bt) =

∫∫f(x, y)p(s, x)p(t− s, y − x)dxdy

which implies that the pdf of (Bs, Bt) is p(s, x)p(t− s, y − x).(7) Let t > s, and f a bounded Borel measurable function. Then

E(f(Bt)|F0

s

)= Pt−sf(Bs) a.s. (2.2)

In particular E (f(Bt)|F0s ) = E (f(Bt)|Bs) (which is called the simple Markov property).

(8) For any 0 < t1 < t2 < · · · < tn, the (Rn×d-valued) random variable (Bt1 , · · · , Btn)possesses the following probability density function

p(t1, x1)p(t2 − t1, x2 − x1) · · · p(tn − tn−1, xn − xn−1).

That is, the joint distribution of (Bt1 , · · · , Btn) is given by

P Bt1 ∈ dx1, · · · , Btn ∈ dxn= p(t1, x1)p(t2 − t1, x2 − x1) · · · p(tn − tn−1, xn − xn−1)dx1 · · · dxn . (2.3)

(9) Let Bt = (B1t , · · · , Bd

t ) be a d-dimensional standard Brownian motion. Then foreach j, Bj

t is a standard BM in R, and (Bjt )t≥0 (j = 1, · · · , d) are mutually independent.

Therefore a d-dimensional BM is d independent copies of BM in R.(10) Brownian motion starts afresh at a stopping time, and the Markov property for

Brownian motion remains valid at a stopping time. Therefore Brownian motion possessesthe strong Markov property, a very important property which had been used by Paul Levyin the form of the reflection principle, long before the concept of strong Markov propertyhad been properly defined. We will exhibit this principle by computing the distributionof the running maximum of a Brownian motion. Let B = (Bt)t≥0 be a standard onedimensional Brownian motion on (Ω,F ,Ft,P) in R. Let b > 0, b > a and Tb = inft >0 : Bt = b. Then Tb is a stopping time, and the Brownian motion starts afresh at Tb as aBrownian motion (starting at b) after hitting the level b, and therefore

P

(sups∈[0,t]

Bs ≥ b, Bt ≤ a

)= P

(sups∈[0,t]

Bs ≥ b, Bt ≥ 2b− a

)= P (Bt ≥ 2b− a)

where the first equality follows from the “fact” that the Brownian motion starting at Tb(in position b): BTb = b, runs afresh like a Brownian motion starting at b, so that it


moves with equal probability about the line y = b. The second equality follows from2b− a = b+ (b− a) > b.

The above equation may be written as

P (Tb ≤ t, Bt ≤ a) = P (Tb ≤ t, Bt ≥ 2b− a)

= P (Bt ≥ 2b− a) ,

which can be justified by the Strong Markov Property of Brownian motion, a topic thatwill not pursue here. Therefore

P

(sups∈[0,t]

Bs ≥ b, Bt ≤ a

)=

1√2πt

∫ +∞

2b−ae−

x2

2t dx ,

which gives us the joint distribution of a Brownian motion and its maximum at a fixedtime t. By differentiating in a and in b to obtain the pdf of the joint distribution of randomvariables (Mt = sups∈[0,t] Bs, Bt):

P (Mt ∈ db, Bt ∈ da) =2(2b− a)√

2πt3exp

−(2b− a)2

2t

dadb

over the region (b, a) : a ≤ b, b ≥ 0 in R2.Brownian motion as a martingale.Let B = (Bi

t)t≥0 (i = 1, · · · , d) be a standard BM in Rd, with its natural filtration(F0

t )t≥0.(11) Each Bt is p-th integrable for any p > 0, and for t > s

E(|Bt −Bs|p) = cp,d|t− s|p/2 . (2.4)

(Bt)t≥0 is a continuous, square-integrable martingale, and for each pair i, j, Mt =BitB

jt − δijt is a continuous martingale.

(12) Let B = (Bt)t≥0 be a continuous stochastic process in R such that B0 = 0. Then(Bt)t≥0 is a standard BM in R, if and only if for any ξ ∈ R and t > s

E

exp (i〈ξ, Bt −Bs〉) |F0s

= exp

(−(t− s)|ξ|2

2

). (2.5)

(13) Let (Bt) be a standard BM in R. If ξ ∈ R, then Mt ≡ exp(i〈ξ, Bt〉+ |ξ|2

2t)

is a

martingale.(14) Let B = (Bt)t≥0 be a standard BM in R. Then Mt ≡ B2

t − t are martingales andlimm(D)→0

∑l |Btl−Btl−1

|2 = t in L2(Ω,P) for any t, where D runs over all finite partitionsof interval [0, t], and m(D) = maxl |tl − tl−1|. Therefore limm(D)→0

∑l |Btl − Btl−1

|2 = t inprobability.

(15) Let (Bt)t≥0 be a standard BM in R. Then for any t > 0 we have

2n∑j=1

∣∣∣B j2nt −B j−1

2nt

∣∣∣2 → t a.s. (2.6)

2.4. ITO’S CALCULUS 15

as n→∞.It can be shown (not easy) that supD

∑l |Btl−Btl−1

|p <∞ almost surely for any p > 2,where sup is taken over all finite partitions of [0, 1], and supD

∑l |Btl−Btl−1

|2 =∞ almostsurely. That is to say, Brownian motion has finite p-variation for any p > 2. Indeed almostall Brownian motion sample paths are α-Holder continuous for any α < 1/2 but not forα = 1/2. It follows that almost all Brownian motion paths are nowhere differentiable. Wewill not go into a deep study about the sample paths of BM, which are not needed in orderto develop Ito’s calculus for Brownian motion.

Definition 2.3.4 Let p > 0 be a constant. A path f(t) in Rd [a function on [0, T ] valuedin Rd] is said to have finite p-variation on [0, T ], if

supD

∑l

|f(ti)− f(ti−1)|p <∞

where D runs over all finite partitions of [0, T ]. f(t) in Rd has finite (total) variation if ithas finite 1-variation.

A function with finite variation must be a difference of two increasing functions. Itparticular, it has at most countably many discontinuous points.

A stochastic process V = (Vt)t≥0 is called a variational process, if for almost all ω ∈ Ω,the sample path t → Vt(ω) possesses finite variation on any finite interval. A Brownianmotion is not a variational process.

2.4 Ito’s calculus

In this section we recall the basic definitions and results about Ito’s integrals with respectto continuous martingales.

2.4.1 Quadratic processes

Let M = (Mt)t≥0 be a continuous, square integrable martingale. Then

〈M〉t = limm(D)→0

∑l

∣∣Mtl −Mtl−1

∣∣2exists both in probability and in the L2-norm for every t ≥ 0, where the limit takes overall finite partitions D of the interval [0, t]. 〈M〉 is called the (quadratic) variational processof (Mt)t≥0, or simply the bracket process of (Mt)t≥0. The quadratic variational processt→ 〈M〉t is an adapted, continuous, increasing stochastic process [and therefore has finitevariation] with initial zero. The following theorem demonstrates the importance of 〈M〉t.

Theorem 2.4.1 (The quadratic variational process) Let M = (Mt)t≥0 be a contin-uous, square integrable martingale. Then 〈M〉t is the unique continuous, adapted andincreasing process with initial zero, such that M2

t − 〈M〉t is a martingale.


The theorem is a special case of the Doob-Meyer decomposition for sub-martingales:any sub-martingale can be decomposed into a sum of a martingale and a predictable,increasing process with initial value zero. The decomposition was conjectured by L. Doob,and proved by P. A. Meyer in the 60’s, which opened the new era of stochastic calculus.

Theorem 2.4.2 Let (Mt)t≥0 and (Nt)t≥0 be two continuous, square integrable martingales,and

〈M,N〉t =1

4(〈M +N〉t − 〈M −N〉t)

called the bracket process of M and N . Then 〈M,N〉t is the unique adapted, continuous,variational process with initial zero, such that MtNt − 〈M,N〉t is a martingale. Moreover

limm(D)→0

n∑l=1

(Mtl −Mtl−1)(Ntl −Ntl−1

) = 〈M,N〉t , in prob. (2.7)

where D = 0 = t0 < · · · < tn = t and m(D) = maxl(tl − tl−1).

Let T > 0 be a fixed but arbitrary number, andM20 be the vector space of all continu-

ous, square-integrable martingales up to time T on a probability space (Ω,F , Ft,P) withinitial value zero, equipped with the distance

d(M,N) =√E|MT −NT |2 for M,N ∈M2

0 .

By definition, a sequence of square-integrable martingales (M(k)t)t≥0 (k = 1, · · · ,) con-verges to M in M2

0, if and only if M(k)T → MT in L2(Ω,F ,P) as k →∞. The followingmaximal inequality, which is the “martingale version” of the Markov inequality, allows usto show that (M2

0, d) indeed is complete.

Theorem 2.4.3 (Kolmogorov’s inequality) Let M ∈M20. Then for any λ > 0

P(

sup0≤t≤T

|Mt| ≥ λ

)≤ 1

λ2E(M2

T

).

Theorem 2.4.4 (M20, d) is a complete metric space.

Proof. Let M(k) ∈M20 ( k = 1, 2, · · · ) be a Cauchy sequence in M2

0. Then

E|M(k)T −M(l)T |2 → 0 , as k, l→∞ .

According to Kolmogorov’s inequality

P(

sup0≤t≤T

|M(k)t −M(l)t| ≥ λ

)≤ 1

λ2E|M(k)T −M(l)T |2 ,

so that, M(k) uniformly converges to a limit M on [0, T ] in probability. Therefore thereexists a stochastic process M ≡ (Mt) such that sup0≤t≤T |M(k)t −Mt| → 0 in probability.Obviously (Mt)t≥0 is a continuous and square -integrable martingale (up to time T ) as theuniform limit of a sequence of continuous martingales.


2.4.2 Ito’s integrals

Let us now recall the definition of Ito’s integral∫ t

0FsdMs which is again a (local) martin-

gale. This local martingale is also denoted by F ·M , where F is called an integrand.The definition is divided into three steps.Step 1. Assume that M is a continuous square integrable martingale, i.e. M is a

continuous martingale on (Ω,F ,Ft,P) such that E(M2t ) <∞ for every t > 0.

If F = (Ft) is a simple stochastic process if

Ft = ξ010(t) +∞∑i=1

ξi1(ti,ti+1](t)

for some partition 0 ≤ t1 < t2 < · · · , where ti →∞, F is adapted so that ξi ∈ Fti , and ξiis bounded for i = 1, 2, · · · . For such F the Ito’s integral

(F ·M)t =

∫ t

0

FsdMs =∞∑i=1

ξi(Mti+1∧t −Mti∧t

)for every t ≥ 0. By definition

(F ·M)t =

∫ t

0

FsdMs = limm(D)→0

∑i

Fsi(Msi+1

−Msi

)for every t > 0, where the limit takes over all finite partitions D : 0 = s0 < s1 < . . . <sk = t.

F ·M is a continuous square integrable martingale and

〈F ·M〉t =

∫ t

0

|Fs|2d 〈M〉s (2.8)

which yields the Ito’s isometry

E [〈F ·M〉t] = E[∫ t

0

|Fs|2d 〈M〉s]

(2.9)

for every t ≥ 0, which allows us to generalize the definition to more general integrands.An adapted process F = (Ft) is said to be in L2(M) if there is a sequence of simple

adapted processes F (n) (for n = 1, 2, . . .)

E(∫ T

0

|F (n)t − Ft|2d 〈M〉t)→ 0

as n→∞ for every T > 0. The Ito’s isometry implies that

(F ·M)t = limn→∞

(Fn ·M)t

in M20. F · M is also continuous square martingale and (2.8) holds. Moreover for any

F ∈ L2(M) then

(F ·M)t = limm(D)→0

∑l

Ftl−1

(Mtl −Mtl−1

)in probability


where the limit takes over all finite partitions of [0, t].By the use of the polarization identity, if M , N ∈ M2

0 and F ∈ L2(M), G ∈ L2(N),then

〈F.M,G.N〉t =

∫ t

0

FsGsd 〈M,N〉s

and F.(G.M) = (FG).M , as far as these stochastic integrals make sense. That is,∫ t

0

Fsd

(∫ s

0

GudMu

)s

=

∫ t

0

FsGsdMs .

Step 2. The Ito integration may be extended to local martingales. Let me brieflydescribe the idea. Suppose M = (Mt)t≥0 is a continuous, local martingale with initialzero, then we may choose a sequence (Tn) of stopping times such that Tn ↑ ∞ a.s. and foreach n, MTn = (Mt∧Tn)t≥0 is a continuous, square integrable martingale with initial zero.In this case we may define 〈M〉t = 〈MTn〉t for t ≤ Tn, which is an adapted, continuous,increasing process with initial zero such that M2

t − 〈M〉t is a local martingale.Let F = (Ft)t≥0 be a left-continuous, adapted process such that for each T > 0∫ T

0

F 2s d〈M〉s <∞ a.s. (2.10)

and define

Sn = inf

t ≥ 0 :

∫ t

0

F 2s d〈M〉s ≥ n

∧ n

which is a sequence of stopping times. Condition (2.10) ensures that Sn ↑ ∞. Let Tn =

Tn ∧ Sn. Then Tn ↑ ∞ almost surely, and for each n, M Tn ∈ M20. Let F (n)t = Ft1t≤Tn.

Then ∫ ∞0

F (n)2sd〈M〉s =

∫ Tn

0

F 2s d〈M〉s ≤ n

so that F (n) ∈ L2(M Tn). We may define

(F.M)t =

∫ t

0

F (n)sd(M Tn

)s

if t ≤ Tn ↑ ∞

for n = 1, 2, 3, · · · , called the Ito integral of F with respect to local martingale M . It canbe shown that F.M does not depend on the choice of stopping times Tn. By definition,both F.M and

(F.M)2t −

∫ t

0

F 2s d〈M〉s

are continuous, local martingales with initial zero.Step 3. Finally let us extend the theory of stochastic integrals to the most useful class

of (continuous) semi-martingales. An adapted, continuous stochastic process X = (Xt)t≥0

is a semi-martingale if X possesses a decomposition Xt = Mt + Vt, where (Mt)t≥0 is acontinuous local martingale, and (Vt)t≥0 is stochastic processes with finite variation on anyfinite interval.


If f(t) is a function on [0, T ] having finite variation:

supD

∑l

|f(tl)− f(tl−1)| < +∞

where D runs over all finite partitions of [0, t] (for any fixed t), then∫ t

0g(s)df(s) is under-

stood as the Lebesgue-Stieltjes integral. If in addition s→ f(s) is continuous, then∫ t

0

g(s)df(s) = limm(D)→0

∑l

g(tl−1)(f(tl)− f(tl−1)) .

Therefore, if V = (Vt)t≥0 is a continuous stochastic process with finite variation, then∫ t0FsdVs is a stochastic process defined path-wisely as the Lebesgue-Stieltjes integral∫ t

0

FsdVs(ω) ≡∫ t

0

Fs(ω)dVs(ω)

= limm(D)→0

∑l

Ftl−1(ω)(Vtl(ω)− Vtl−1

(ω)) .

The definition of stochastic integrals may be extended to any continuous semi-martingalein an obvious way, namely ∫ t

0

FsdXs =

∫ t

0

FsdMs +

∫ t

0

FsdVs

where, the first term on the right-hand side is the Ito’s integral with respect to localmartingale M defined in probability sense, which is again a local martingale, the secondterm is the usual Lebesgue-Stieltjes integral which is defined path-wisely. Moreover∫ t

0

FsdXs = limm(D)→0

∑l

Ftl−1

(Xtl −Xtl−1

)in probab.

2.4.3 Ito’s formula

Ito’s formula was established by K. Ito in 1944. Since Ito’s stated it as a lemma in hisseminal paper [ ], Ito’s formula is also refereed in literature as Ito’s Lemma. Ito’s Lemmais indeed the Fundamental Theorem in stochastic calculus.

We have used in many occasions the following elementary formula

X2tj−X2

tj−1=(Xtj −Xtj−1

)2+ 2Xtj−1

(Xtj −Xtj−1

).

If in addition (Xt)t≥0 is a continuous square integrable martingale, then, by adding up theabove identity over j = 1, · · · , n, where 0 = t0 < t1 < · · · < tn = t is an arbitrary finitepartition, one obtains

X2t −X2

0 = 2n∑j=1

Xtj−1

(Xtj −Xtj−1

)+

n∑j=1

(Xtj −Xtj−1

)2.


Letting m(D)→ 0, we obtain

X2t −X2

0 = 2

∫ t

0

XsdXs + 〈X〉t .

which is the Ito formula for the martingale (Xt)t≥0 applying to f(x) = x2. By usingpolarization and localization, we establish the following integration by parts formula.

Lemma 2.4.5 If X, Y are two continuous semi-martingales: X = M+A and Y = N+B,where M and N are two continuous local martingales, A and B are two adapted variationalprocess, then

XtYt −X0Y0 =

∫ t

0

YsdXs +

∫ t

0

XsdYs + 〈M,N〉t .

Corollary 2.4.6 (Integration by parts) Let X = M +A and Y = N +B be a continuoussemi-martingale: M and N are continuous local martingales, and A,B are continuous,adapted processes with finite variations. Then

XtYt −X0Y0 =

∫ t

0

XsdYs +

∫ t

0

YsdXs + 〈M,N〉t .

The following is the fundamental theorem in stochastic calculus.

Theorem 2.4.7 (Ito’s formula) Let X = (X1t , · · · , Xd

t ) be a continuous semi-martingalein Rd with decomposition X i

t = M it + Ait: M

1t , · · · ,Md

t are continuous local martingales,and A1

t , · · · , Adt are continuous, locally integrable, adapted processes with finite variations.Let f ∈ C2(Rd,R). Then

f(Xt)− f(X0) =d∑i=1

∫ t

0

∂f

∂xi(Xs)dX i

s

+1

2

d∑i,j=1

∫ t

0

∂2f

∂xi∂xj(Xs)d〈M i,M j〉s . (2.11)

The first term on the right-hand side of (2.11) can be decomposed into

d∑j=1

∫ t

0

∂f

∂xi(Xs)dM

js +

d∑j=1

∫ t

0

∂f

∂xi(Xs)dA

js

so that f(Xt)− f(X0) is again a semi-martingale with its martingale part given by

M ft =

d∑j=1

∫ t

0

∂f

∂xi(Xs)dM

js .

It follows that

〈M f ,M g〉t =

∫ t

0

d∑i,j=1

∂f

∂xi(Xs)

∂g

∂xj(Xs)d〈M i,M j〉s .


If B = (B1t , · · · , Bd

t )t≥0 is Brownian motion in Rd, then, for f ∈ C2(Rd,R)

f(Bt)− f(B0) =

∫ t

0

∇f(Bs).dBs +

∫ t

0

1

2∆f(Bs)ds .

Let

M[f ]t = f(Bt)− f(B0)−

∫ t

0

1

2∆f(Bs)ds .

Then M [f ] is a local martingale and

〈M [f ],M [g]〉t =

∫ t

0

〈∇f,∇g〉(Bs)ds .

Chapter 3

Selected applications of Ito’s formula

We present several applications of Ito’s lemma.

3.1 Levy’s characterization of Brownian motion

Our first application is Levy’s martingale characterization of Brownian motion. Let (Ω,F ,Ft,P)be a filtered probability space satisfying the usual condition.

Theorem 3.1.1 Let Mt = (M1t , · · · ,Md

t ) be an adapted, continuous stochastic process on(Ω,F ,Ft,P) taking values in Rd with initial zero. Then (Mt)t≥0 is a Brownian motion ifand only if

1. Each M it is a continuous square-integrable martingale.

2. M itM

jt − δijt is a martingale, that is, 〈M i,M j〉t = δijt for every pair (i, j).

Proof. We only need to prove the sufficient part. Recall that, under the assumption,(Mt)t≥0 is a Brownian motion if and only if

E[e√−1〈ξ,Mt−Ms〉

∣∣∣Fs] = e−|ξ|22

(t−s) (3.1)

for any t > s and ξ = (ξi) ∈ Rd. We thus consider the adapted process

Zt = exp

(√−1

d∑i=1

ξiMit +|ξ|2

2t

)

and we show that it is a martingale. To this end, we apply Ito’s formula to f(x) = ex (inthis case f ′ = f ′′ = f) and semi-martingale

Xt =√−1

d∑i=1

ξiMit +|ξ|2

2t ,

23

24 CHAPTER 3. SELECTED APPLICATIONS OF ITO’S FORMULA

and obtain

Zt = Z0 +

∫ t

0

Zsd

(√−1

d∑i=1

ξiMis +|ξ|2

2s

)

+1

2

∫ t

0

Zsd〈√−1

d∑i=1

ξiMi〉s

= 1 +√−1

d∑i=1

ξi

∫ t

0

ZsdMis +|ξ|2

2

∫ t

0

Zsds

−1

2

∫ t

0

d∑i,j=1

ξiξjZsd〈M i,M j〉s

= 1 +√−1

d∑i=1

ξi

∫ t

0

ZsdMis

the last equality follows from

1

2

∫ t

0

d∑i,j=1

ξiξjZsd〈M i,M j〉s =1

2|ξ|2

∫ t

0

Zsds .

due to the assumption that 〈M i,M j〉s = δijs. Since |Zs| = e|ξ|2s/2, so that for any T > 0

E∫ T

0

|Zs|2ds =

∫ T

0

e|ξ|2sds < +∞

and therefore (Zt) ∈ L2(M i) for i = 1, · · · , d as 〈M i〉t = t. It follows that∫ t

0ZsdM

is ∈Mc

2.That is, Zs is a continuous, square-integrable martingale with initial value 1. (3.1) followsfrom the martingale property

Eei〈ξ,Mt〉+ |ξ|

2

2t

∣∣∣∣Fs = ei〈ξ,Ms〉+ |ξ|2

2s

for t > s.

3.2 Time-changes of Brownian motion

A continuous local martingale is actually a time change of Brownian motion.

Theorem 3.2.1 (Dambis, Dubins, Schwarz) Let M = (Mt)t≥0 be a continuous, localmartingale on (Ω,F ,Ft,P) with initial value zero satisfying 〈M〉∞ =∞, and let

Tt = inf s : 〈M〉s > t .

Then Tt is a stopping time for each t ≥ 0, Bt = MTt is an (FTt)-Brownian motion, andMt = B〈M〉t.

3.3. BURKHOLDER-DAVIS-GUNDY INEQUALITY 25

Proof. The family T = (Tt)t≥0 is called a time-change, each Tt is a stopping time,and obviously t → Tt is increasing. The function t → Tt is right-continuous, and is theright-continuous inverse of t → 〈M〉t, so that 〈M〉Tt = t for all t ≥ 0. Each Tt is finiteP-a.e. because 〈M〉∞ =∞ a.e. By continuity of 〈M〉t we have

〈M〉Tt = t

for every t ≥ 0. Applying Doob’s optional sampling theorem for the square integrablemartingale (Ms∧Tt)s≥0 (which is a square integrable martingale) and stopping times Tt ≥ Ts(t ≥ s), we obtain that

E (MTt|FTs) = MTs

i.e. Bt is a (FTt)-martingale. By the same argument but to the martingale (M2s∧Tt −

〈M〉s∧Tt)s≥0 we have

E(M2

Tt − 〈M〉Tt |FTs)

= M2Ts − 〈M〉Ts .

Hence (B2t − t) is an (FTt)-martingale. We may verify that t→ Bt is continuous (which is

actually not trivial, see the comment below). Therefore B = (Bt)t≥0 is an (FTt) Brownianmotion according to Levy’s characterization of Brownian motion

Remark 3.2.2 The continuity of t → Bt can be argued as the following (outline). Sincet → Tt is right continuous, so B is righy continuous at least. Note that t → 〈M〉t iscontinuous and increasing, and

〈M〉t = limm(D)→0

∑i

(Mti −Mti−1

)2

where the limit runs over any finite partitions of [0, t]. Let us assume that the previousequality holds for every ω ∈ Ω. Let ω ∈ Ω be any but fixed. Now observe that if t isan increasing point of 〈M〉, that is, there is a ε > 0 such that s → 〈M〉s (ω) is strictlyincreasing on (t− ε, t + ε), then by inverse function theorem, s→ Ts(ω) is the inverse of〈M〉 and therefore is continuous at t. Hence Bs = MTs is continuous at t. On the otherhand, if s1 < s2 such that 〈M〉s1 = 〈M〉s2 at ω, then Ms = Ms1 (at ω) for all s ∈ [s1, s2],and therefore Bs is continuous on [s1, s2]. Therefore B is continuous.

Exercise 3.2.3 Let f be a continuous increasing function on [0,∞) with f(0) = 0 andf(∞) =∞. Define

f−1(x) = infy : f(y) > x.

Prove that f(f−1(x)) = x for all x ≥ 0.

3.3 Burkholder-Davis-Gundy inequality

Recall Doob’s Lp-inequality: if X is a martingale or a non-negative sub-martingale, andX?t = sups≤t |Xs|, then for every p > 1 and T > 0 we have

‖X?T‖p ≤ q ‖XT‖p


where 1p

+ 1q

= 1 (so that q = pp−1

). In particular if X is a martingale and p = 2 then

‖XT‖22 = E 〈X〉T . Hence

E 〈X〉T ≤ E|X?T |2 ≤ 4E 〈X〉T .

In this section we prove the B-D-G inequality by using Ito’s formula.

Theorem 3.3.1 Let X be a continuous local martingale with X0 = 0, and X?t = sups≤t |Xs|.

Let T > 0. Then1) If p > 1

(4p)−pE〈X〉pT ≤ E|X?T |2p ≤

(2ep2

)p E〈X〉pT . (3.2)

2) If 0 < p < 1

p2pE〈X〉pT ≤ E|X?T |2p ≤

(16

p

)pE〈X〉pT . (3.3)

3) In the case p = 1E〈X〉T ≤ E|X?

T |2 ≤ 4E〈X〉T . (3.4)

Proof. The conclusion 3) follows from Kolmogorov’s inequality, so we prove the casethat p 6= 1. By stopping time techniques, we may assume that X is a bounded continuousmartingale. First consider the case that p > 1. In this case, according to Doob’s inequality

E|X?T |2p ≤

(2p

2p− 1

)2p

E|XT |2p.

On the other hand, applying Ito’s formula to f(x) = (x2 + ε)p, where ε > 0 is an arbitraryconstant, we have

f ′(x) = 2px(x2 + ε)p−1,

f ′′(x) = 2p((2p− 1)x2 + ε

)(x2 + ε)p−2

and

(|XT |2 + ε)p = εp +

∫ T

0

f ′(Xt)dXt

+p

∫ T

0

((2p− 1)X2

t + ε)

(X2t + ε)p−2d〈X〉t.

Integrating both sides then letting ε ↓ 0 one obtains

E|XT |2p = p(2p− 1)E∫ T

0

X2(p−1)t d〈X〉t

and therefore

E|XT |2p ≤ p(2p− 1)E(|X?

T |2(p−1)〈X〉T)

≤ p(2p− 1)(E|X?

T |2p)1− 1

p (E〈X〉pT )1/p .


Combining with the Doob’s inequality we obtain

E|X?T |2p ≤

(1 +

1

2p− 1

)2p−1

2p2(E|X?

T |2p)1− 1

p (E〈X〉pT )1/p

≤ 2ep2(E|X?

T |2p)1− 1

p (E〈X〉pT )1/p

so that

E|X?T |2p ≤

(2ep2

)p E〈X〉pT .

To prove the other direction of the inequality, we begin with

E〈X〉pT = pE∫ T

0

〈X〉p−1t d〈X〉t

= pE(∫ T

0

〈X〉p−12

t dXt

)2

,

and by integration by parts gives∫ T

0

〈X〉p−12

t dXt = XT 〈X〉p−12

T −∫ T

0

Xtd〈X〉p−12

t

≤ 2X?T 〈X〉

p−12

T

we thus obtain

E〈X〉pT ≤ 4pE[(X?

T )2 〈X〉p−1T

]≤ 4p

(E (X?

T )2p) 1p (E〈X〉pT )1− 1

p

which implies that

E (X?T )2p ≥ (4p)−pE〈X〉pT

and therefore (3.2).

Next we assume that 0 < p < 1. Let Mt =∫ t

0〈X〉

p−12

s dXs, which is a continuous localmartingale with bracket process

〈M〉t =

∫ t

0

〈X〉p−1s d〈X〉s =

1

p〈X〉pt .

In terms of M we may recover X by formula

Xt =

∫ t

0

〈X〉−p−12

s dMs =

∫ t

0

〈X〉1−p2

s dMs

= Mt〈X〉1−p2

t −∫ t

0

Msd〈X〉1−p2

s

≤ 2M∗t 〈X〉

1−p2

t


where the third equality follows from the integration by parts, so that

X∗T ≤ 2M∗T 〈X〉

1−p2

T

and therefore

E|X∗T |2p ≤ 22pE[(M∗

T )2p 〈X〉p(1−p)T

]≤ 22p

(E (M∗

T )2)p (E〈X〉pT )1−p

where the second inequality follows from the Holder inequality as 0 < p < 1. Since

E|M∗T |2 ≤ 4E|MT |2 = 4E〈M〉T

=4

pE〈X〉pT

thus we have

E|X∗T |2p ≤ 4p(

4

pE〈X〉pT

)p(E〈X〉pT )1−p

=

(16

p

)pE〈X〉pT .

To prove the other direction, we consider the continuous martingale

Zt =

∫ t

0

1

(ε+X∗s )1−pdXs

where ε > 0 is a constant, whose bracket process

〈Z〉t =

∫ t

0

1

(ε+X∗s )2(1−p)d〈X〉s.

Since p < 1 so that

〈Z〉t ≥1

(ε+X∗t )2(1−p) 〈X〉t.

On the other hand, by integration by parts we have

Zt =Xt

(ε+X∗t )1−p −∫ t

0

Xsd(ε+X∗s )p−1

=Xt

(ε+X∗t )1−p + (1− p)∫ t

0

Xs

(ε+X∗s )2−pdX∗s


and therefore

|Zt| ≤X∗t

(ε+X∗t )1−p + (1− p)∫ t

0

X∗s(ε+X∗s )2−pdX

∗s

=X∗t

(ε+X∗t )1−p + (1− p)∫ t

0

d(X∗s + ε)

(ε+X∗s )1−p

−(1− p)ε∫ t

0

d(X∗s + ε)

(ε+X∗s )2−p

=X∗t

(ε+X∗t )1−p +1− pp

((ε+X∗t )p − εp)

−ε(

1

(ε+X∗t )1−p − εp−1

)≤ 1

p(ε+X∗t )p − 2ε(ε+X∗t )p−1 +

(2− 1

p

)εp

≤ 1

p(ε+X∗t )p + εp

for every ε > 0. Hence

E〈X〉T

(ε+X∗T )2(1−p) ≤ E〈Z〉T

= EZ2T

≤ E[

1

p(ε+X∗T )p + εp

]2

.

On the other hand

E〈X〉pT = E[(

〈X〉T(ε+X∗T )2(1−p)

)p ((ε+X∗T )2p

)1−p]

=

[E(

〈X〉T(ε+X∗T )2(1−p)

)]p E(ε+X∗T )2p

1−p

together with the previous inequality we obtain

E〈X〉pT ≤

E[

1

p(ε+X∗T )p + εp

]2p

E(ε+X∗T )2p1−p

for every ε > 0. Letting ε ↓ one obtains

E〈X〉pT ≤1

p2pE|X∗T |2p.


3.4 Stochastic exponentials

In this section we consider a simple stochastic differential equation

dZt = ZtdXt , Z0 = 1 (3.5)

where Xt = Mt + At is a continuous semi-martingale. The solution of (3.5) is calledthe stochastic exponential of X. The equation (3.5) should be understood as an integralequation

Zt = 1 +

∫ t

0

ZsdXs (3.6)

where the integral is taken as Ito’s integral. To find the solution to (3.6) we may tryZt = exp(Xt+Vt), where (Vt)t≥0 to be determined as a “correction” term (which has finitevariation) due to the Ito’s integration. Applying Ito’s formula we obtain

Zt = 1 +

∫ t

0

Zsd(Xs + Vs) +1

2

∫ t

0

Zsd〈M〉s

and therefore, in order to match the equation (3.6) we must choose Vt = −12〈M〉t.

Lemma 3.4.1 Let Xt = Mt + At (where M is a continuous local martingale, A is anadapted continuous process with finite total variation) with X0 = 0. Then

E(X)t = exp

(Xt −

1

2〈M〉t

)is the solution to (3.6).

E(X) is called the stochastic exponential of X = (Xt)t≥0.

Proposition 3.4.2 Let (Mt)t≥0 be a continuous local martingale with M0 = 0. Then thestochastic exponential E(M) is a continuous, non-negative local martingale.

Remark 3.4.3 According to definition of Ito’s integrals, if T > 0 such that

E∫ T

0

e2Mt−〈M〉td〈M〉t < +∞ (3.7)

then the stochastic exponential

E(M)t = exp

(Mt −

1

2〈M〉t

)is a non-negative, continuous martingale. For n = 1, 2, · · · , define

Tn = inf t ≥ 0 : Mt > n or 〈M〉t > n .

Then every Tn is a stopping time and Tn ↑ ∞. Moreover

E∫ T

0

e2Mt∧Tn−〈M〉t∧Tnd〈M〉t ≤ e2nn <∞

so that E(M)t∧Tn is a martingale, and therefore E(M) is a non-negative local martingale.

3.4. STOCHASTIC EXPONENTIALS 31

The remarkable fact is that, although E(M) may fail to be a martingale, but it isnevertheless a super-martingale.

Lemma 3.4.4 Let X = (Xt)t≥0 be a non-negative, continuous local martingale. ThenX = (Xt)t≥0 is a super-martingale: E(Xt|Fs) ≤ Xs for any t < s. In particular t → EXt

is decreasing, and therefore EXt ≤ EX0 for any t > 0.

Proof. Recall Fatou’s lemma: if fn is a sequence of non-negative, integrable func-tions on a probability space (Ω,F ,P), such that limn→∞E (fn) < ∞, then limn→∞fn isintegrable and

E (limn→∞fn|G) ≤ limn→∞E (fn|G)

for any sub σ-algebra G (see page 88, D. Williams: Probability with Martingales).By definition, there is a sequence of finite stopping times Tn ↑ +∞ P -a.e. such that

XTn = (Xt∧Tn)t≥0 is a martingale for each n. Hence

E (Xt∧Tn|Fs) = Xs∧Tn , ∀t ≥ s, n = 1, 2, · · · .

In particular E (Xt∧Tn) = EX0, so that, by Fatou’s lemma, Xt = limn→∞Xt∧Tn is inte-grable. Applying Fatou’s lemma to Xt∧Tn and G = Fs for t > s we have

E(Xt|Fs) = E[

limn→∞

Xt∧Tn|Fs]≤ limn→∞E(Xt∧Tn|Fs)

= limn→∞Xs∧Tn = Xs

According to definition, X = (Xt)t≥0 is a super-martingale.

Corollary 3.4.5 Let M = (Mt)t≥0 be a continuous, local martingale with M0 = 0. ThenE(M) is a super-martingale. In particular,

E exp

(Mt −

1

2〈M〉t

)≤ 1 for all t ≥ 0 .

Clearly, a continuous super-martingale X = (Xt)t≥0 is a martingale if and only if itsexpectation t→ E(Xt) is constant. Therefore

Corollary 3.4.6 Let M = (Mt)t≥0 be a continuous, local martingale with M0 = 0. ThenE(M) is a martingale up to time T , if and only if

E exp

(MT −

1

2〈M〉T

)= 1 . (3.8)

Stochastic exponentials of local martingales play an important role in probability trans-formations. It is vital in many applications to know whether the stochastic exponentialof a given martingale M = (Mt)t≥0 is indeed a martingale. A simple sufficient conditionto ensure (3.8) is the so-called Novikov’s condition stated in Theorem 3.4.7 below (A.A. Novikov: On moment inequalities and identities for stochastic integrals, Proc. secondJapan-USSR Symp. Prob. Theor., Lecture Notes in Math., 330, 333-339, Springer-Verlag,Berlin 1973).


Theorem 3.4.7 (A. A. Novikov 1973) Let M = (Mt)t≥0 be a continuous local martingalewith M0 = 0. If

E exp

(1

2〈M〉T

)<∞ , (3.9)

then E(M) is a martingale up to time T .

To prove this, we need the following fact about uniform integrability.

Lemma 3.4.8 Suppose A be a family of integrable random variables on (Ω,F ,P) andsuppose there is an integrable random variable η such that E [1D|ξ|] ≤ E [1D|η|] for anyD ∈ F and ξ ∈ A. Then A is uniformly integrable.

Proof. of Theorem 3.4.7. (The following proof is due to J. A. Yan: Criteres d’integrabiliteuniforme des martingales exponentielles, Acta. Math. Sinica 23, 311-318 (1980).) Theidea is the following, first show that, under the Novikov condition (3.9), for any 0 < α < 1

E(αM)t ≡ exp

(αMt −

1

2α2 〈M〉t

)is a uniformly integrable martingale up to time T . By Corollary (3.4.5), E (E(αM)t) ≤ 1for every α. While

E(αM)t ≡ exp

α

(Mt −

1

2〈M〉t

)− 1

2α (α− 1) 〈M〉t

= (E(M)t)

α exp

1

2α (1− α) 〈M〉t

,

so that for every t ≤ T and for every A ∈ FT

E (1AE(αM)t) = E

1A (E(M)t)α exp

[1

2α (1− α) 〈M〉t

]. (3.10)

If α ∈ (0, 1), by using Holder’s inequality with p = 1α> 1 and q = 1

1−α in (3.10) one obtains

E 1AE(αM)t = E

(E(M)t)α exp

[1

2α (1− α) 〈M〉t

]≤ E (E(M)t)α

E[1A exp

(1

2α 〈M〉t

)]1−α

≤ E (E(M)t)αE[1A exp

(1

2α 〈M〉t

)]1−α

≤E[1A exp

(1

2α 〈M〉T

)]1−α

≤ E

1A exp

(1

2〈M〉T

). (3.11)

3.5. EXPONENTIAL INEQUALITY 33

Therefore by Lemma (3.4.8) we deduce that (E(αM)t)t≤T is, up to time T , uniformlyintegrable local martingale, and therefore E(αM) must be a martingale on [0, T ]. Inparticular

E (E(αM)T ) = E (E(αM)0) = 1, ∀α ∈ (0, 1).

Set A = Ω in (3.11), the first inequality of (3.11) becomes

1 = E (E(αM)t)

≤ (E (E(M)t))α

E(

exp

(1

2〈M〉T

))1−α

for every α ∈ (0, 1). Letting α ↑ 1 we thus obtain that E (E(M)t) ≥ 1 for every t ≤ T , sothat E (E(M)t) = 1 for any t ≤ T . Hence E(M)t is a martingale up to T .

Consider a standard Brownian motionB = (Bt), and F = (Ft)t≥0 ∈ L2. If E exp[

12

∫ T0F 2t dt]<

∞ then

Xt = exp

∫ t

0

FsdBs −1

2

∫ t

0

F 2s ds

(3.12)

is a positive martingale on [0, T ]. For example, for any bounded process F = (Ft)t≥0 ∈ L2:|Ft(ω)| ≤ C (for all t ≤ T and ω ∈ Ω), where C is a constant, then

E

exp

(1

2

∫ T

0

F 2t dt

)≤ exp

(1

2C2T

)<∞

so that, in this case, X = (Xt) defined by (3.12) is a martingale up to time T .Novikov’s condition is very nice, it is however not easy to verify in many interesting

cases. For example, consider the stochastic exponential of the martingale∫ t

0BsdBs, the

Novikov condition requires to estimate the integral E

exp[

12

∫ T0B2t dt]

, which is already

not an easy task.

3.5 Exponential inequality

We are going to present three significant applications of stochastic exponentials: a sharpimprovement of Doob’s maximal inequality for martingales, Girsanov’s theorem, and themartingale representation theorem (in the next section). Additional applications will bediscussed in the next chapter.

Recall that, according to Doob’s maximal inequality, if (Xt)t≥0 is a continuous super-martingale on [0, T ], then for any λ > 0

P

supt∈[0,T ]

|Xt| ≥ λ

≤ 1

λ

(E(X0) + 2E(X−T )

)where x− = −x if x < 0 and = 0 if x ≥ 0. In particular, if (Xt)t≥0 is a non-negative,continuous super-martingale on [0, T ], then

P

supt∈[0,T ]

Xt ≥ λ

≤ 1

λE(X0) . (3.13)

This inequality has a significant improvement stated as follows.


Theorem 3.5.1 Let M = (Mt)t≥0 be a continuous square integrable martingale with M0 =0. Suppose there is a (deterministic) continuous, increasing function a = a(t) such thata(0) = 0, 〈M〉t ≤ a(t) for all t ∈ [0, T ]. Then

P

supt∈[0,T ]

Mt ≥ λa(T )

≤ e−

λ2

2a(T ) . (3.14)

Proof. For every α > 0 and t ≤ T

αMt −α2

2〈M〉t ≥ αMt −

α2

2〈M〉T

≥ αMt −α2

2a(T )

so that

E(αM)t ≥ eαMt−α2

2a(T ) for α > 0 .

Hence, by applying Doob’s maximal inequality to the non-negative super-martingale E(αM)we obtain

P

supt∈[0,T ]

Mt ≥ λa(T )

≤ P

supt∈[0,T ]

E(αM)t ≥ eαλa(T )−α2

2a(T )

≤ e−αλa(T )+α2

2a(T )E E(αM)0

= e−αλa(T )+α2

2a(T )

for any α > 0. The exponential inequality follows by setting α = λ.In particular, by applying the exponential inequality to a standard Brownian motion

B = (Bt)t≥0,

P

supt∈[0,T ]

Bt ≥ λT

≤ e−

λ2

2T . (3.15)

3.6 Girsanov’s theorem

This is an important tool in stochastic analysis. To state it let us introduce several notionsand notations.

Let (Ω,F ,P) be a probability space. If G ⊂ F is a sub σ-algebra, and ξ is an non-negative, integrable and G-measurable with E [ξ] = 1, then define a probability measure Qon (Ω,F) by

Q(A) = E [ξ1A] =

∫A

ξdP

for every A. Of course Q restricted on (Ω,G) is also a probability measure. In this casewe use the notation that

dQdP

∣∣∣∣G

= ξ

3.6. GIRSANOV’S THEOREM 35

to represent the Radon-Nikodym derivative of Q with respect to P as measures on (Ω,G).Let us work with a filtered probability space (Ω,F ,Ft,P). Let T > 0, and Q be a

probability measure on (Ω,FT ) such that

dQdP

∣∣∣∣FT

= ξ

for some non-negative random variable ξ ∈ L1(Ω,FT ,P). By definition, for any boundedFT -measurable random variable X∫

Ω

X(ω)Q(dω) =

∫Ω

X(ω)ξ(ω)P(dω)

or simply written as EQ(X) = EP(ξX). If, however, X is Ft-measurable, t ≤ T , then

EQ(X) = EP(EP (ξX|Ft)) = EP(EP (ξ|Ft)X) .

That is, for every t ≤ TdQdP

∣∣∣∣Ft

= EP (ξ|Ft)

which is a non-negative martingale up to T under the probability P.Conversely, if T > 0 and Z = (Zt)t≥0 is a continuous, positive martingale up to time

T , with Z0 = 1, on a filtered probability space (Ω,F ,Ft,P). We define a measure Q on(Ω,FT ) by

Q(A) = P (ZTA) if A ∈ FT . (3.16)

That is, dQdP

∣∣FT

= ZT . Q is a probability measure on (Ω,FT ) as E (ZT ) = 1. Since (Zt)t≤T

is a martingale up to time T , so that dQdP

∣∣Ft

= Zt for all t ≤ T .

If (Zt)t≥0 is a positive martingale with Z0 = 1, then there is a probability measure Qon (Ω,F∞), where F∞ ≡ σFt : t ≥ 0, such that dQ

dP

∣∣Ft

= Zt for all t ≥ 0.We are now in a position to prove Girsanov’s theorem.

Theorem 3.6.1 (Girsanov’s theorem) Let (Mt)t≥0 be a continuous local martingale on(Ω,F ,Ft,P) up to time T . Then

Xt = Mt −∫ t

0

1

Zsd 〈M,Z〉s

is a continuous local martingale on (Ω,F ,Ft,Q) up to time T .

Proof. Using localization technique, we may assume that M,Z, 1/Z are all bounded.In this case M,Z are bounded martingales. We want to prove that X is a martingaleunder the probability Q:

EQ Xt|Fs = Xs for all s < t ≤ T ,

that is,EQ 1A (Xt −Xs) = 0 for all s < t ≤ T , A ∈ Fs .


By definitionEQ 1A (Xt −Xs) = EP (ZtXt − ZsXs)1A

thus we only need to show that (ZtXt) is a martingale up to time T under probabilitymeasure P. By use of integration by parts, we have

ZtXt = Z0X0 +

∫ t

0

ZsdXs +

∫ t

0

XsdZs + 〈Z,X〉t

= Z0X0 +

∫ t

0

Zs

(dMs −

1

Zsd 〈M,Z〉s

)+

∫ t

0

XsdZs + 〈Z,X〉t

= Z0X0 +

∫ t

0

ZsdMs +

∫ t

0

XsdZs

which is a local martingale.Since Zt > 0 is a positive martingale up to time T , we may apply the Ito formula to

logZt, to obtain

logZt − logZ0 =

∫ t

0

1

ZsdZs −

∫ t

0

1

Z2s

d〈Z〉s ,

that is, Zt = E(N)t with Nt =∫ t

01Zs

dZs is a continuous local martingale. Hence Zt = E(N)tsolves the Ito integral equation

Zt = 1 +

∫ t

0

ZsdNs ,

and therefore

〈M,Z〉t = 〈∫ t

0

dMs,

∫ t

0

ZsdNs〉 =

∫ t

0

Zsd〈N,M〉s .

It follows thus that ∫ t

0

1

Zsd 〈M,Z〉s = 〈N,M〉t .

Corollary 3.6.2 Let Nt be a continuous local martingale on (Ω,F ,Ft,P), N0 = 0, suchthat its stochastic exponential E(N)t is a continuous martingale up to T . Define a proba-bility measure Q on the measurable space (Ω,FT ) by

dQdP

∣∣∣∣Ft

= E(N)t for all t ≤ T .

If M = (Mt)t≥0 is a continuous local martingale under the probability P , then Xt =Mt−〈N,M〉t is a continuous, local martingale under Q up to time T . (You should carefullydefine the concept of a local martingale up to time T ).

3.7 The martingale representation theorem

The martingale representation theorem is a deep result about Brownian motion. There isa natural version for multi-dimensional Brownian motion, for simplicity of notations, wehowever concentrate on one-dimensional Brownian motion.

3.7. THE MARTINGALE REPRESENTATION THEOREM 37

Let B = (Bt)t≥0 be a standard Brownian motion in R on a complete probabilityspace (Ω,F ,P), and (F0

t )t≥0 (together with F0∞ = ∪F0

t ) be the filtration generated bythe Brownian motion (Bt)t≥0. Let Ft be the completion of F0

t , and F∞ = σ (∪Ft). As amatter of fact, (Ft)t≥0 is continuous.

Theorem 3.7.1 Let M = (Mt)t≥0 be a square-integrable martingale on (Ω,F ,Ft,P).Then there is a stochastic process F = (Ft)t≥0 in L2, such that

Mt = EM0 +

∫ t

0

FsdBs a.s.

for any t ≥ 0. In particular, any martingale with respect to the Brownian filtration (Ft)t≥0

has a continuous version.

The proof of this theorem relies on the following several lemmas. There are severalfacts from analysis we are going to use, which we state them here.

1. For p ∈ [1,∞], Lp(Rd, µ) denotes the Lp-space on the measure space (Rd,B(Rd), µ)where µ is a measure which is absolutely continuous with respect to the Lebesgue mea-sure. Then C∞0 (Rd) is dense in Lp(Rd, µ), where C∞0 (Rd) denotes the space of all smoothfunctions on Rd with compact supports.

2. If φ ∈ C∞0 (Rd) then its Fourier transform φ is defined to be

φ(x) =1

(2π)n/2

∫Rnφ(x)e−i〈z,x〉dx

which is smooth too (while its support may be not compact) and f is recovered by theinverse Fourier transform:

φ(x) =1

(2π)n/2

∫Rnφ(z)ei〈z,x〉dz.

3. This is a theorem from Functional Analysis. Suppose H is a Hilbert space (i.e. acomplete metric space whose metric is induced by the inner product 〈·, ·〉). Let X ⊂ H bea subset of H and spanX be the linear subspace generated by X, that is the linear spanof X, i.e. the linear space of all finite linear combinations of elements in X. Then spanXis dense in H if and only if the only element h ∈ H satisfying the condition 〈x, h〉 = 0 forall x ∈ X is the zero element.

Let T > 0 be any fixed time.

Lemma 3.7.2 The following collection of random variables on (Ω,FT ,P)φ(Bt1 , · · · , Btk) : ∀k ∈ Z+, tj ∈ [0, T ] and φ ∈ C∞0 (Rk)

is dense in L2(Ω,FT ,P).

Proof. If X ∈ L2(Ω,FT ,P), then, by definition, there is an F0T -measurable function

which equals X almost surely. Therefore, without losing generality, we may assume thatX ∈ L2(Ω,F0

T ,P). According to definition, F0T = σBt : t ≤ T. Let D = Q∩ [0, T ] the set


of all rational numbers in the interval [0, T ]. Since D is dense in [0, T ], so that F0T = σBt :

t ∈ D. Moreover D countable, so that we may write D = t1, · · · , tn, · · · . Let Dn =t1, · · · , tn for each n, and Gn = σ Bt1 , · · · , Btn. Then Gn is increasing, and Gn ↑ F0

T .Let Xn = E (X|Gn). Then (Xn)n≥1 is square integrable martingale, thus, according tothe martingale convergence theorem Xn → X almost surely. Moreover Xn → X in L2.While, for each n, Xn is measurable with respect to Gn, so that Xn = fn(Bt1 , · · · , Btn)for some Borel measurable function fn : Rn → R. Since Xn ∈ L2, so that fn ∈ L2(Rn, µ)where µ is a Gaussian measure such that EX2

n =∫Rn fn(x)2µ(dx). Since C∞0 (Rn) is dense

in L2(Rn, µ), for each n, there is a sequence φnk in C∞0 (Rn) such that φnk → fn inL2(Rn, µ). It follows that φnn(Bt1 , · · · , Btn)→ X in L2.

If I ⊂ R is an interval, then we use L2(I) to denote the Hilbert space of all functionsh on I which are square integrable.

Lemma 3.7.3 Let T > 0. For any h ∈ L2([0, T ]), we associate with an exponentialmartingale up to time T :

M(h)t = exp

∫ t

0

h(s)dBs −1

2

∫ t

0

h(s)2ds

; t ∈ [0, T ]. (3.17)

Then L =spanM(h)T : h ∈ L2([0, T ]) is dense in L2(Ω,FT ,P).

Proof. For any 0 = t0 < t1 < · · · < tn = T and ci ∈ R, consider a step functionh(t) = ci for t ∈ (ti, ti+1]. Then

M(h)T = exp

∑i

ci(Bti+1−Bti)−

1

2

∑i

c2i (ti+1 − ti)

.

Suppose that H ∈ L2 such that∫

ΩHΦdP = 0 for any Φ ∈ L, then∫

Ω

H exp

∑i

ci(Bti+1−Bti)−

1

2

∑i

c2i (ti+1 − ti)

dP = 0 .

The deterministic, positive term e−12

∑i c

2i (ti+1−ti) can be removed from the integrand, and

it follows therefore that ∫Ω

H exp

∑i

ci(Bti+1−Bti)

dP = 0 .

Since ci are arbitrary numbers, hence∫Ω

H exp

∑i

ciBti

dP = 0

for any ci and ti ∈ [0, T ]. Since the left-hand is analytic in ci, so that the equality remainstrue for any complex numbers ci. If φ ∈ C∞0 (Rn), then

φ(x) =1

(2π)n/2

∫Rnφ(z)ei〈z,x〉dz

3.7. THE MARTINGALE REPRESENTATION THEOREM 39

where

φ(z) =1

(2π)n/2

∫Rnφ(x)e−i〈z,x〉dx

is the Fourier transform of φ. Hence∫Ω

Hφ(Bt1 , · · · , Btn)dP =1

(2π)n/2

∫Ω

H

∫Rnφ(z) exp

(i∑j

zjBtj

)dzdP

=1

(2π)n/2

∫Rn

φ(z)

∫Ω

H exp

(i∑i

ziBti

)dP

dz

= 0 .

Therefore, for any φ ∈ C∞0 (Rn),∫Ω

Hφ(Bt1 , · · · , Btn)dP = 0 . (3.18)

By Lemma 3.7.2, the collection of all functions like φ(Bt1 , · · · , Btn) is dense in L2(Ω,FT ,P),so that ∫

Ω

HGdP = 0 for any G ∈ L2(Ω,FT ,P) .

In particular,∫

ΩH2dP = 0 so that H = 0.

Theorem 3.7.4 (Ito’s representation theorem) Let ξ ∈ L2(Ω,FT ,P). Then there is aF = (Ft)t≥0 ∈ L2, such that

ξ = Eξ +

∫ T

0

FtdBt .

Proof. By Lemma 3.7.3 we only need to show this lemma for ξ = X(h)T (whereh ∈ L2([0, T ])) defined by (3.17). While, X(h)t is an exponential martingale so that itmust satisfy the following integral equation

X(h)T = 1 +

∫ T

0

X(h)td

(∫ t

0

h(s)dBs

)= E(X(h)T ) +

∫ T

0

X(h)th(t)dBt .

Therefore Ft = X(h)th(t) will do.

Chapter 4

Stochastic differential equations

The main goal of the chapter is to establish the basic existence and uniqueness theoremfor a class of stochastic differential equations which are important in applications.

4.1 Introduction

Stochastic differential equations (SDE) are ordinary differential equations perturbed bynoises. We will consider a simple class of noises modeled by Brownian motion. Thus weconsider the following type of equation

dXjt =

n∑i=1

f ji (t,Xt)dBit + f j0 (t,Xt)dt , j = 1, · · · , N (4.1)

where Bt = (B1t , · · · , Bn

t )t≥0 is a standard Brownian motion in Rn on a filtered probabilityspace (Ω,F ,Ft,P), and

f ji : [0,∞)× RN → RN

are Borel measurable functions. Of course, differential equation (4.1) should be interpretedas an integral equation in terms of Ito’s integration. More precisely, an adapted, continuous,RN -valued stochastic process Xt ≡ (X1

t , · · · , XNt ) is a solution of (4.1), if

Xjt = Xj

0 +n∑k=1

∫ t

0

f jk(s,Xs)dBks +

∫ t

0

f j0 (s,Xs)ds (4.2)

for j = 1, · · · , N . Since we are concerned only with the distribution determined by thesolution (Xt)t≥0 of (4.1), we therefore expect that any solution of SDE (4.1) should have thesame distribution for any Brownian motion B = (Bt)t≥0. It thus leads to different conceptsof solutions and uniqueness: strong solutions and weak solutions, path-wise uniqueness anduniqueness in law.

Definition 4.1.1 1) An adapted, continuous, RN -valued stochastic process X = (Xt)t≥0

on (Ω,F ,Ft,P) is a (weak) solution of (4.1), if there is a Brownian motion W = (Wt)t≥0

in Rn, adapted to the filtration (Ft), such that

Xjt −X

j0 =

n∑l=1

∫ t

0

f jl (s,Xs)dWls +

∫ t

0

f j0 (s,Xs)ds, j = 1, · · · , N .

41

42 CHAPTER 4. STOCHASTIC DIFFERENTIAL EQUATIONS

In this case we also call the pair (X,W ) a (weak) solution of (4.1).2) Given a standard Brownian motion B = (Bt)t≥0 in Rn on (Ω,F ,P) with its natural

filtration (Ft)t≥0, an adapted, continuous stochastic process X = (Xt)t≥0 on (Ω,F ,Ft,P)is a strong solution of (4.1), if

Xjt −X

j0 =

n∑i=1

∫ t

0

f ji (s,Xs)dBis +

∫ t

0

f j0 (s,Xs)ds .

We also have different concepts of uniqueness.

Definition 4.1.2 Consider SDE (4.1).

1. We say that the path-wise uniqueness holds for (4.1), if whenever (X,B) and

(X, B) are two solutions defined on the same filtered space and the same Brownian

motion B, and X0 = X0, then X = X.

2. It is said that uniqueness in law holds for (4.1), if (X,B) and (X, B) are two

solutions (with possibly different Brownian motions B and B, even can be on different

probability spaces), and X0 and X0 possess the same distribution, then X and X havesame distribution.

Theorem 4.1.3 (Yamada-Watanabe) Path-wise uniqueness implies uniqueness in law.

The following is a simple example of SDE for which has no strong solution, but possessesweak solutions and uniqueness in law holds.

Example 4.1.4 (H. Tanaka) Consider 1-dimensional stochastic differential equation:

Xt =

∫ t

0

sgn(Xs)dBs , 0 ≤ t <∞

where sgn(x) = 1 if x ≥ 0, and equals −1 for negative value of x.

1. Uniqueness in law holds, since X is a standard Brownian motion (Levy’s Theorem).

2. There is a weak solution. Let Wt be a one-dimensional Brownian motion, and Bt =∫ t0sgn(Ws)dWs. Then B is a one-dimensional Brownian motion, and

Wt =

∫ t

0

sgn(Ws)dBs ,

so that (W,B) is a solution.

3. Path-wise uniqueness does not hold.

4. There is no any strong solution.

4.2. SEVERAL EXAMPLES 43

4.2 Several examples

4.2.1 Linear-Gaussian diffusions

Linear stochastic differential equations can be solved explicitly. Consider

dXjt =

n∑i=1

σji dBit +

N∑k=1

βjkXkt dt (4.3)

(j = 1, · · · , N), where B is a Brownian motion in Rn, σ = (σji ) a constant N × n matrix,and β = (βjk) a constant N ×N matrix. (4.3) may be written as

dXt = σdBt + βXtdt .

Let

eβt =∞∑k=0

tk

k!βk

be the exponential of the square matrix β. Using Ito’s formula, we have

e−βtXt −X0 =

∫ t

0

e−βsdXs −∫ t

0

e−βsβXsds

=

∫ t

0

e−βs(dXs − βXsds)

=

∫ t

0

e−βsσdBs

so that

Xt = eβtX0 +

∫ t

0

eβ(t−s)σdBs .

In particular, if X0 = x, then Xt has a normal distribution with mean eβtx. For example,if n = N = 1, then

Xt ∼ N(eβtx,σ2

2β

(e2βt − 1

)) .

It can be shown that (Xt) is a diffusion process, and its distribution can be described byits transition probability Pt(x, dz). According to definition

(Ptf)(x) ≡∫RNf(z)Pt(x, dz) = E (f(Xt)|X0 = x) ,

thus

(Ptf)(x) = E (f(Xt)|X0 = x)

=

∫Rf(z)

1√2π σ

2

2(e2βt − 1)

exp

(− |z − e

βtx|2σ2

2(e2βt − 1)

)dz

=

∫Rf(eβtx+

√σ2

2(e2βt − 1)z)

1√2π

exp

(−|z|

2

2

)dz

= Ef(eβtx+

√σ2

2(e2βt − 1)ξ)


where ξ has the standard normal distribution N(0, 1). From the second line of the aboveformula, and compare to the definition of Pt(x, dz), we may conclude that Pt(x, dz) =p(t, x, z)dz with

p(t, x, z) =1√

2π σ2

2(e2βt − 1)

exp

(− |z − e

βtx|2σ2

2(e2βt − 1)

).

p(t, x, z) is called the transition density of the diffusion process (Xt)t≥0. We thus, againfollowing from the above formula, have a probability representation

(Ptf)(x) = Ef(eβtx+

√σ2

2(e2βt − 1)ξ)

which is quite useful in some computations.

Remark 4.2.1 It is easy to see from the above representation that

d

dx(Ptf) = eβtPt

(d

dxf

).

The distribution of (Xt) is determined by the transition density p(t, x, z). Indeed, forany 0 < t1 < · · · < tk, the joint distribution of (Xt1 , · · · , Xtk) is Gaussian, and indeed itspdf is

p(t1, x, z1)p(t2 − t1, z1, z2) · · · p(tk − tk−1, zk−1, zk) .

If B = (B1t , · · · , Bn

t )t≥0 is a Brownian motion in Rn, then the solution Xt of the SDE:

dXt = dBt − (AXt) dt

is called the Ornstein-Uhlenbeck process, where A ≥ 0 is a d × d matrix called the driftmatrix. Hence we have

Xt = e−AtX0 +

∫ t

0

e−(t−s)AdBs .

Exercise 4.2.2 If X0 = x ∈ Rn, compute Ef(Xt), where Xt is the Ornstein-Uhlenbeckprocess with drift matrix A.

4.2.2 Geometric Brownian motion

The Black-Scholes model is the stochastic differential equation

dSt = St (µdt+ σdBt) (4.4)

whose solution to (4.4) is the stochastic exponential of∫ t

0

µds+

∫ t

0

σdBs .

4.2. SEVERAL EXAMPLES 45

Hence

St = S0 exp

(∫ t

0

σdBs +

∫ t

0

(µ− 1

2σ2

)ds

).

In the case σ and µ are constants, then

St = S0 exp

(σBt +

(µ− 1

2σ2

)t

)which is called the geometric Brownian motion. If S0 = x > 0, then St remains positive,and

logSt = log x+ σBt +

(µ− 1

2σ2

)t

has a normal distribution with mean log x +(µ− 1

2σ2)t and variance σ2. Again, as a

solution to the stochastic differential equation (4.4), (St)t≥0 is a diffusion process, its dis-tribution is determined by its transition function Pt(x, dz) (unfortunately we have to usethe same notations as in the last sub-section), and according to definition∫

Rf(z)Pt(x, dz) = E (f(St)|X0 = x)

= E(f(xeσBt+(µ− 1

2σ2)t)

)=

∫Rf(xeσz+(µ− 1

2σ2)t)

1√2πt

e−z2

2πtdz

=

∫ ∞0

f(y)1√2πt

1

σye−

12πt(

1σ

log yx−(µσ−

12σ))

2

dy

where we assume that σ > 0 and have made the change of variable

xeσz+(µ− 12σ2)t = y .

As usual, we define (Ptf)(x) =∫R f(z)Pt(x, dz). By the third line of the previous formula

(Ptf)(x) =

∫Rf(xeσz+(µ− 1

2σ2)t)

1√2πt

e−z2

2πtdz

=

∫Rf(xeσ

√ty+(µ− 1

2σ2)t)

1√2πe−

y2

2π dy

= E(f(xeσ

√tξ+(µ− 1

2σ2)t)

)[we have made a change variable z into

√ty], where ξ ∼ N(0, 1). Compare with the

definition of Pt(x, dy) we have

Pt(x, dy) =1√2πt

1

σye−

12πt(

1σ

log yx−(µσ−

12σ))

2

dy on (0,∞)

That is, (St) has the transition density

p(t, x, y) =1√2πt

1

σye−

12πt(

1σ

log yx−(µσ−

12σ))

2

on (0,∞) .


and, therefore, for geometric Brownian motion

(Ptf)(x) =

∫ ∞0

1√2πt

1

σye−

12πt(

1σ

log yx−(µσ−

12σ))

2

f(y)dy

for any x > 0.

4.2.3 Cameron-Martin’s formula

Consider a simple stochastic differential equation

dXt = dBt + b(t,Xt)dt (4.5)

where b(t, x) is a bounded, Borel measurable function on [0,∞)× R. We may solve (4.5)by means of change of probabilities.

Let (Wt)t≥0 be a standard Brownian motion on (Ω,F ,Ft,P), and define probabilitymeasure Q on (Ω,F∞) by

dQdP

∣∣∣∣Ft

= E(N)t for all t ≥ 0

where Nt =∫ t

0b(s,Ws)dWs is a martingale (under the probability P), with 〈N〉t =∫ t

0b(s,Ws)

2ds, which is bounded on any finite interval. Hence

E(N)t = exp

(∫ t

0

b(s,Ws)dWs −1

2

∫ t

0

b(s,Ws)2ds

)is a martingale. According to Girsanov’s theorem

Bt ≡ Wt −W0 − 〈W,N〉t

is a martingale under the new probability Q, and 〈B〉t = 〈W 〉t = t. By Levy’s martingalecharacterization of Brownian motion, (Bt)t≥0 is a Brownian motion. Moreover

〈W,N〉t = 〈∫ t

0

dWs,

∫ t

0

b(s,Ws)dWs〉

=

∫ t

0

b(s,Ws)ds

and therefore

Wt −W0 −∫ t

0

b(s,Ws)ds = Bt

is a standard Brownian motion on (Ω,F ,Q). Thus

Wt = W0 +Bt +

∫ t

0

b(s,Ws)ds (4.6)

so that (Wt)t≥0 on (Ω,F∞,Q) is a solution of (4.5). The solution we have just constructedis a weak solution of SDE (4.5).

4.3. EXISTENCE AND UNIQUENESS 47

Theorem 4.2.3 (Cameron-Martin’s formula) Let b(t, x) = (b1(t, x), · · · , bn(t, x)) be bounded,Borel measurable functions on [0,∞)×Rn. Let Wt = (W 1, · · · ,W n

t ) be a standard Brown-ian motion on a filtered probability space (Ω,F ,Ft,P), and let F∞ = σFt, t ≥ 0. Defineprobability measure Q on (Ω,F∞) by

dQdP

∣∣∣∣Ft

= e∑nk=1

∫ t0 b

k(s,Ws)dBks− 12

∑nk=1

∫ t0 |bk(s,Ws)|2ds for t ≥ 0 .

Then (Wt)t≥0 under the probability measure Q is a solution to

dXjt = dBj

t + bj(t,Xt)dt (4.7)

for some Brownian motion (B1t , · · · , Bn

t )t≥0 under probability Q.

On the other hand, if (Xt) is a solution of SDE (4.7) on some probability space(Ω,F ,Ft,P) and define P

dPdP

∣∣∣∣∣Ft

= exp

−

n∑k=1

∫ t

0

bk(s,Xs)dBks −

1

2

n∑k=1

∫ t

0

∣∣bk(s,Xs)∣∣2 ds

for t ≥ 0

we may show that (Xt)t≥0 under probability measure P is a Brownian motion. Thereforesolutions to SDE (4.7) is unique in law: all solutions have the same distribution.

4.3 Existence and uniqueness

In this section we present a fundamental result about the existence and uniqueness ofstrong solutions.

4.3.1 Strong solutions: existence and uniqueness

By definition, any strong solution is a weak solution. We next prove a basic existence anduniqueness theorem for a stochastic differential equation under a global Lipschitz condition.Our proof will rely on two inequalities: The Gronwall inequality and Doob’s Lp-inequality.

Lemma 4.3.1 (The Gronwall inequality) If a non-negative function g satisfies the integralequation

g(t) ≤ h(t) + α

∫ t

0

g(s)ds , 0 ≤ t ≤ T

where α is a constant and h : [0, T ]→ R is an integrable function, then

g(t) ≤ h(t) + α

∫ t

0

eα(t−s)h(s)ds , 0 ≤ t ≤ T .

Proof. Let F (t) =∫ t

0g(s)ds. Then F (0) = 0 and

F ′(t) ≤ h(t) + αF (t)


so that (e−αtF (t)

)′ ≤ e−αth(t) .

Integrating the differential inequality we obtain∫ t

0

(e−αsF (s)

)′ds ≤

∫ t

0

e−αsh(s)ds

and therefore

F (t) ≤∫ t

0

eα(t−s)h(s)ds

which yields Gronwall’s inequality.Consider the following stochastic differential equation

dXjt =

n∑l=1

f jl (t,Xt)dBlt + f j0 (t,Xt)dt ; j = 1, · · · , N (4.8)

where f jk(t, x) are Borel measurable functions on R+ × RN , which are bounded on anycompact subset in RN . We are going to show the existence and uniqueness by Picard’siteration. The main ingredient in the proof is a special case of Doob’s Lp- inequality: if(Mt)t≥0 is a square-integrable, continuous martingale with M0 = 0, then for any t > 0

E

sups≤t|Ms|2

≤ 4 sup

s≤tE(|Ms|2

)= 4E〈M〉t . (4.9)

Lemma 4.3.2 Let (Bt)t≥0 be a standard BM in R on (Ω,Ft,F ,P), and (Zt)t≥0 and (Zt)t≥0

be two continuous, adapted processes. Let f(t, x) be a Lipschitz function

|f(t, x)− f(t, y)| ≤ C|x− y| ; ∀t ≥ 0, x, y ∈ R

for some constant C.

1. Let

Mt =

∫ t

0

f(s, Zs)dBs −∫ t

0

f(s, Zs)dBs ∀t ≥ 0 .

Then

E sups≤t|Ms|2 ≤ 4C2

∫ t

0

E∣∣∣Zs − Zs∣∣∣2 ds

for all t ≥ 0.

2. If

Nt =

∫ t

0

f(s, Zs)ds−∫ t

0

f(s, Zs)ds ∀t ≥ 0

then

E sups≤t|Ns|2 ≤ C2t

∫ t

0

E∣∣∣Zs − Zs∣∣∣2 ds ∀t ≥ 0 .


Proof. To prove the first statement, we notice that

sups≤t|Ms|2 = sup

s≤t

∣∣∣∣∫ s

0

(f(u, Zu)− f(u, Zu)

)dBu

∣∣∣∣2so that, by Doob’s L2-inequality

E sups≤t|Ms|2 = E sup

s≤t

∣∣∣∣∫ s

0


)dBu

∣∣∣∣2≤ 4E

∣∣∣∣∫ t

0

(f(s, Zs)− f(s, Zs)

)dBs

∣∣∣∣2= 4E

∫ t

0

∣∣∣f(s, Zs)− f(s, Zs)∣∣∣2 ds

≤ 4C2

∫ t

0

E∣∣∣Zs − Zs∣∣∣2 ds .

Next we prove the second claim. Indeed

sups≤t|Ns|2 = sup

s≤t

∣∣∣∣∫ s

0


)du

∣∣∣∣2≤

(∫ t

0

∣∣∣f(s, Zs)− f(s, Zs)∣∣∣ ds)2

≤ t

∫ t

0

∣∣∣f(s, Zs)− f(s, Zs)∣∣∣2 ds

≤ C2t

∫ t

0

∣∣∣Zs − Zs∣∣∣2 ds

where the second inequality follows from the Schwartz inequality.

Theorem 4.3.3 Consider SDE (4.8). Suppose that f ji satisfy the Lipschitz condition:

|f(t, x)− f(t, y)| ≤ C|x− y| (4.10)

and the linear-growth condition that

|f(t, x)| ≤ C(1 + |x|) (4.11)

for t ∈ R+ and x, y ∈ RN . Then for any η ∈ L2(Ω,F0,P) and a standard Brownian motionBt = (Bi

t) in Rn, there is a unique strong solution (Xt) of (4.8) with X0 = η.

Proof. Proof of the existence of strong solutions. The unique strong solution can beconstructed by Picard’s iteration. As the first step in the iteration, X(0) = η, and defineX(n) inductively by the following equation

X(n+1)t = η +

∫ t

0

f(s,X(n)s ) · dBs +

∫ t

0

f0(s,X(n)s )ds (4.12)


for n = 0, 1, 2, · · · . For simplifying our notations we introduce the following notations:

D(n)t =

∣∣∣X(n)t −X

(n−1)t

∣∣∣and

ρ(n)(t) = E sups≤t

∣∣X(n)s −X(n−1)

s

∣∣2for n = 1, 2, · · · . Since f and f0 are globally Lipschitz, so that

D(n+1)t ≤

∣∣∣∣∫ t

0

(f(s,X(n)

s )− f(s,X(n−1)s )

)· dBs

∣∣∣∣+

∣∣∣∣∫ t

0

(f0(s,X(n)

s )− f0(s,X(n−1)s )

)ds

∣∣∣∣≤∣∣∣∣∫ t

0

(f(s,X(n)

s )− f(s,X(n−1)s )

)· dBs

∣∣∣∣+ C

∣∣∣∣∫ t

0

D(n)s ds

∣∣∣∣ .It follows that (

D(n+1)t

)2

≤ 2 sups≤t

∣∣∣∣∫ s

0

(f(r,X(n)

r )− f(r,X(n−1)r )

)· dBr

∣∣∣∣2+ 2C

∣∣∣∣∫ t

0

D(n)s ds

∣∣∣∣2 .After taking the expectations both sides of the previous inequality to obtain

ρ(n+1)(s) ≤ 2E

[sups≤t

∣∣∣∣∫ s

0

(f(r,X(n)

r )− f(r,X(n−1)r )

)· dBr

∣∣∣∣2]

+ 2CE∣∣∣∣∫ t

0

D(n)s ds

∣∣∣∣2 . (4.13)

The first expectation can be estimated by Doob’s inequality to the martingale

Mt =

∫ t

0

(f(r,X(n)

r )− f(r,X(n−1)r )

)· dBr

where

〈M〉t =

∫ t

0

∣∣f(r,X(n)r )− f(r,X(n−1)

r )∣∣2 dr ≤ C

∫ t

0

(D(n)r

)2dr

here the inequality follows from ((4.10)), so that

E sups≤t

∣∣∣∣∫ s

0

(f(r,X(n)

r )− f(r,X(n−1)r )

)· dBr

∣∣∣∣ ≤ 4C

∫ t

0

ρ(n)(s)ds. (4.14)

While the second expectation on the right-hand side of (4.13) can be handled as thefollowing. By Cauchy-Schwartz inequality

E∣∣∣∣∫ t

0

D(n)s ds

∣∣∣∣2 ≤ t

∫ t

0

E∣∣D(n)

s

∣∣2 ds ≤ t

∫ t

0

ρ(n)(s)ds. (4.15)


Inserting (4.14) and (4.15) into (4.13) we thus obtain the following key inequality

ρ(n+1)(t) ≤ (8C + 2Ct)

∫ t

0

ρ(n)(s)ds (4.16)

for all t ≥ 0 and n = 0, 1, 2, · · · .For any but fixed T > 0 we have

ρ(n+1)(t) ≤ (8C + 2CT )

∫ t

0

ρ(n)(s)ds

and

ρ(n)(s) ≤ (8C + 2CT )

∫ s

0

ρ(n−1)(r)dr,

hence

ρ(n+1)(t) ≤ (8C + 2CT )2

∫ t

0

(∫ s

0

ρ(n−1)(r)dr

)ds.

By repeating the same procedure we obtain that

ρ(n+1)(t) ≤ (8C + 2CT )n∫ t

0

∫ t1

0

· · ·∫ tn−1

0

ρ(1)(tn−1)dtn−1dtn−2 · · · dt1. (4.17)

While

ρ(1)(t) = E sups≤t

∣∣∣∣∫ s

0

f(r, η)dBr +

∫ s

0

f0(r, η)dr

∣∣∣∣2≤ 2E sup

s≤t

∣∣∣∣∫ s

0

f(r, η)dBs

∣∣∣∣2 + 2E sups≤t

∣∣∣∣∫ s

0

f0(r, η)ds

∣∣∣∣2≤ 2E sup

s≤T

∣∣∣∣∫ s

0

f(r, η)Br

∣∣∣∣2 + 2E sups≤t

∣∣∣∣∫ s

0

f0(r, η)ds

∣∣∣∣2≡ C2(T ),

thus by inserting this estimate to (4.17) we conclude that

ρ(n+1)(t) ≤ C2(T ) (8C + 2CT )n∫ t

0

∫ t1

0

· · ·∫ tn−1

0

dtn−1dtn−2 · · · dt1

= C2(T )(8C + 2CT )n T n

n!

for every n = 0, 1, 2, · · · and for all t ≤ T . Therefore

E supt≤T

∣∣∣X(n+1)t −X(n)

t

∣∣∣2 ≤ C2(T )(8C + 2CT )n T n

n!(4.18)

and therefore, by using Markov inequality,

P

supt≤T

∣∣∣X(n+1)t −X(n)

t

∣∣∣2 > 1

2n

≤ C2(T )

(8C + 2CT )n (2T )n

n!


for every n = 0, 1, 2, · · · . Since

∑n

C2(T )(8C + 2CT )n (2T )n

n!<∞

so that, according to Borel-Cantelli’s lemma, with probability one, X(n) converges uni-formly to a limit X on any [0, T ]. It is easy to verify X is a strong solution.

Proof of uniqueness.

Let X and X be two solutions with same Brownian motion B. Then

Xt = η +

∫ t

0

f(s,Xs) · dBs +

∫ t

0

f0(s,Xs)ds

and

Xt = η +

∫ t

0

f(s, Xs) · dBs +

∫ t

0

f0(s, Xs)ds

Then, as in the proof of the existence,

E|Xt − Xt|2 ≤ C(T )

∫ t

0

E|Xt − Xt|2ds

for all t ≤ T , where C(T ) is a constant. The Gronwall inequality implies thus thatE (|Yt − Zt|2) = 0.

Remark 4.3.4 The iteration X(n) constructed in the proof of Theorem 4.3.3 is a functionof the Brownian motion B, and X

(n)t only depends on η and Bs, 0 ≤ s ≤ t.

4.3.2 Continuity in initial conditions

Theorem 4.3.5 Under the same assumptions as in Theorem 4.3.3. Given a BM B =(Bt)t≥0 in Rn on (Ω,F ,Ft,P), let (Xx(t))t≥0 be the unique strong solution of (4.8). Thenx→ Xx is uniformly continuous almost surely on any finite interval [0, T ]:

limδ↓0

sup|x−y|<δ

E

sup0≤t≤T

|Xx(t)−Xy(t)|2

= 0 . (4.19)

Proof. Let us only consider 1-dimensional case. Thus

Xx(t) = x+

∫ t

0

f1(s,Xx(s))dBs +

∫ t

0

f0(s,Xx(s))ds

and

Xy(t) = y +

∫ t

0

f1(s,Xy(s))dBs +

∫ t

0

f0(s,Xy(s))ds .

4.4. MARTINGALE PROBLEM AND WEAK SOLUTIONS 53

Therefore, by Doob’s maximal inequality,

E

sup0≤t≤T

|Xx(t)−Xy(t)|2≤ 3|x− y|2

+3E

sup

0≤t≤T

∣∣∣∣∫ t

0

(f1(s,Xx(s))− f1(s,Xy(s)))dBs

∣∣∣∣2

+3E

sup

0≤t≤T

∣∣∣∣∫ t

0

(f0(s,Xx(s))− f0(s,Xy(s)))ds

∣∣∣∣2

≤ 3|x− y|2 + 12E

∣∣∣∣∫ T

0

(f1(s,Xx(s))− f1(s,Xy(s))) dBs

∣∣∣∣2

+3TE∫ T

0

|f0(Xx(s))− f0(Xy(s))|2 ds

≤ 3|x− y|2 + 12E

∫ T

0

|f1(s,Xx(s))− f1(s,Xy(s))|2 ds

+3TC2E

∫ T

0

|Xx(s)−Xy(s)|2 ds

≤ 3|x− y|2 + 3C2(4 + T )

∫ T

0

E(|Xx(t)−Xy(t)|2

)dt.

Setting

∆(t) = E

sup0≤s≤t

|Xx(s)−Xy(s)|2

,

then we have

∆(T ) ≤ 3|x− y|2 + 3C2(4 + T )

∫ T

0

∆(t)dt

and therefore by Gronwall’s inequality

∆(T ) ≤ 6|x− y|2 exp(12C2 + 3TC2)

which yields (4.19).

4.4 Martingale problem and weak solutions

In the lecture (lecture 13) I did the computation for one dimensional case, but the com-putations I did in the lecture do not depend on the dimension, so I supply the details fora general case here, though some technical conditions still can be weaken but this can bedone better case by case.

Consider the following SDE

dX it =

d∑k=1

σik(Xt)dBkt + bi(Xt)dt (4.20)


where i = 1, · · · , d. We assume that all coefficients σij and bi are Borel measurable andare bounded (here the boundedness is brought in for simplicity, which can be replaced byother conditions). Let X = (Xt)≥0 be a solution on a filtered probability space (Ω,F ,Ft,P)and B is a d-dimensional Brownian motion. That is, X satisfies the following stochasticintegral equation

X it = X i

0 +

∫ t

0

d∑k=1

σik(Xs)dBks +

∫ t

0

bi(Xt)ds

for t ≥ 0 and i = 1, · · · , d. Then, by Ito’s formula, for every f ∈ C2(Rd),

f(Xt)− f(X0) =d∑i=1

∫ t

0

∂f

∂xi(Xs)dX

is +

1

2

d∑i,j=1

∫ t

0

∂2f

∂xi∂xj(Xs)d〈X i, Xj〉s.

Since

〈X i, Xj〉t =

∫ t

0

d∑k,l=1

σik(Xs)σjl (Xs)d

⟨Bk, Bl

⟩s

=

∫ t

0

d∑k=1

σik(Xs)σjk(Xs)ds.

Let aij(x) =∑d

k=1 σik(x)σjk(x). Then a(x) = (aij(x)), as a square d×d-matrix, is symmetric

and non-negative (here non-negative means all its eigenvalues are non-negative). Then theprevious equality shows that

〈X i, Xj〉t =

∫ t

0

aij(Xs)ds

for all t ≥ 0. Substituting this relation into the previous equation for f(Xt) we obtain

f(Xt)− f(X0) =d∑i=1

∫ t

0

∂f

∂xi(Xs)dX

is +

1

2

d∑i,j=1

∫ t

0

aij(Xs)∂2f

∂xi∂xj(Xs)ds.

Now using the fact that X is a solution to our SDE we thus deduce that

f(Xt)− f(X0) =d∑i=1

∫ t

0

∂f

∂xi(Xs)

(d∑

k=1

σik(Xs)dBks + bi(Xs)ds

)

+1

2

d∑i,j=1

∫ t

0

aij(Xs)∂2f

∂xi∂xj(Xs)ds

=

∫ t

0

d∑k,i=1

σik(Xs)∂f

∂xi(Xs)dB

ks +

∫ t

0

d∑i=1

bi(Xs)∂f

∂xi(Xs)ds

+1

2

d∑i,j=1

∫ t

0

aij(Xs)∂2f

∂xi∂xj(Xs)ds.


Let us introduce the following linear operator

L =1

2

d∑i,j=1

aij∂2

∂xi∂xj+

d∑i=1

bi∂

∂xi(4.21)

which is an elliptic differential operator of second-order. The linear operator L operatingon a C2 function f gives a new function Lf :

Lf(x) =1

2

d∑i,j=1

aij(x)∂2f(x)

∂xi∂xj+

d∑i=1

bi(x)∂f(x)

∂xi

for every x ∈ Rd. By our assumption Lf is Borel measurable and locally bounded, i.e. Lfis bounded on any compact subset of Rd. Under these notations, the previous equation forf(Xt) can be rearranged as the following

f(Xt)− f(X0)−∫ t

0

Lf(Xs)ds =

∫ t

0

d∑k,i=1

σik(Xs)∂f

∂xi(Xs)dB

ks

which is a continuous local martingale. Now the crucial observation, made first by Stroockand Varadhan, is that the left-hand side is independent of an underlying Brownian motionB. Since its importance, we introduce the following notation:

M[f ]t = f(Xt)− f(X0)−

∫ t

0

Lf(Xs)ds (4.22)

for t ≥ 0, and f ∈ C2(Rd).As a by-product of the previous computation, we also have

M[f ]t =

∫ t

0

d∑k,i=1

σik(Xs)∂f

∂xi(Xs)dB

ks

and therefore

〈M [f ],M [g]〉t =

∫ t

0

d∑k,i=1

σik(Xs)∂f

∂xi(Xs)

d∑l,j=1

σjl (Xs)∂f

∂xj(Xs)d

⟨Bk, Bl

⟩s

=

∫ t

0

d∑j,i=1

d∑l,j=1

σjk(Xs)σik(Xs)

∂f

∂xi(Xs)

∂f

∂xj(Xs)ds

=

∫ t

0

d∑j,i=1

aij(Xs)∂f

∂xi(Xs)

∂f

∂xj(Xs)ds

Lemma 4.4.1 Suppose σij and bi are bounded and Borel measurable, aij and L are definedby (4.21). If (Xt)t≥0 is a (weak) solution to SDE (4.20) on (Ω,F ,Ft,P), then for anyf ∈ C2(R)

M[f ]t = f(Xt)− f(X0)−

∫ t

0

(Lf)(Xs)ds


is a continuous local martingale on (Ω,F ,Ft,P). Moreover

〈M [f ],M [g]〉t =

∫ t

0

d∑j,i=1

aij(Xs)∂f

∂xi(Xs)

∂f

∂xj(Xs)ds.

The fact the statement thatM [f ] is a local martingale does not depend on the underlyingBrownian motion allows us formulate the concept of weak solutions in terms of martingaleproblem.

Definition 4.4.2 Let a(x) = (aij(x))i,j≤d be symmetric square d× d matrix valued, Borelmeasurable and bounded function on Rd and bi(x) are bounded Borel measurable. Let Lbe defined by (4.21) which is a linear operator operating on C2 functions. Let (Xt)t≥0

be a continuous stochastic process on a filtered space (Ω,F ,Ft,P). Then we say that(Xt)t≥0 together with the probability P is a solution to the L-martingale problem, if forevery f ∈ C2(R)

M[f ]t = f(Xt)− f(X0)−

∫ t

0

Lf(Xs)ds

is a local martingale under the probability P.

Therefore a solution (Xt)t≥0 of SDE (4.20) on (Ω,F ,P) is a solution to L-martingaleproblem. Conversely, we have the following theorem.

Theorem 4.4.3 Let aij, bi and L be given in Definition 4.4.2. Suppose there is (σij(x))which is a symmetric matrix-valued, Borel measurable and

λ−1 ≤d∑i,j

ξiξjσij(x) ≤ λ

for some constant λ > 0, for all x ∈ Rd, (this is equivalent to say all eigenvalues of thesquare matrix

(σij(x)

)are between λ−1 and λ for all x), such that aij =

∑dk=1 σ

ikσ

jk for

i, j = 1, · · · , d. If (Xt)t≥0 on (Ω,F ,Ft,P) is a continuous process solving the L-martingaleproblem: for every f ∈ C2(Rd)

M[f ]t = f(Xt)− f(X0)−

∫ t

0

Lf(Xs)ds

is a continuous local martingale on (Ω,F ,Ft,P), then (Xt)t≥0 on (Ω,F ,Ft,P) is a weaksolution to SDE

dX it =

d∑k=1

σik(Xt)dBkt + bi(Xt)dt (4.23)

for i = 1, · · · , d. Moreover

〈M [f ],M [g]〉t =

∫ t

0

L(fg)− f (Lg)− g (Lf) (Xs)ds .

for every f, g ∈ C2(Rd). Of course

L(fg)− f (Lg)− g (Lf) =d∑

j,i=1

aij∂f

∂xi∂g

∂xj.


Proof. The key idea is to construct a Brownian motion B such that

X it = X i

0 +

∫ t

0

d∑k=1

σik(Xs)dBks +

∫ t

0

bi(Xt)ds.

If we apply the local martingale property to functions f i(x) = xi (the i-th coordinate ofx), according to the definition of L-martingale problem,

M it = f i(Xt)− f i(X0)−

∫ t

0

Lf i(Xs)ds

= X it −X i

0 −∫ t

0

bi(Xs)ds

where M i = M [f i] for simplicity are continuous martingale. Therefore the candidate forBrownian motion is defined by

Bit =

d∑k=1

∫ t

0

(σ−1(Xs)

)ikdMk

s

which are continuous martingales with initial zero. Here σ−1 denotes the inverse matrix ofσ with its entries denoted by (σ−1)

ij. We need to show that B = (B1, · · · , Bd) is a standard

Brownian motion in Rd. We will apply Levy’s characterization for Brownian motion tothis end. Therefore we need to compute

⟨Bi, Bj

⟩t

=

∫ t

0

d∑k,l=1

(σ−1(Xs)

)ik

(σ−1(Xs)

)jld⟨Mk,M l

⟩s.

Let us compute 〈M [f ],M [g]〉t. By the polarization identity, we only need to compute〈M [f ]〉t. By assumption

M[f2]t = f 2(Xt)− f 2(X0)−

∫ t

0

Lf 2(Xs)ds

is a local martingale. On the other hand, by integration by part

f 2(Xt)− f 2(X0) = 2

∫ t

0

f(Xs)df(Xs) +⟨M [f ]

⟩t

= 2

∫ t

0

f(Xs)[dM [f ]

s + Lf(Xs)ds]

+⟨M [f ]

⟩t

= 2

∫ t

0

f(Xs)dM[f ]s + 2

∫ t

0

f(Xs)Lf(Xs)ds+⟨M [f ]

⟩t.

Substitute it into the previous equality we obtain that

M[f2]t = 2

∫ t

0

f(Xs)dM[f ]s +

⟨M [f ]

⟩t−∫ t

0

(Lf 2 − 2fLf

)(Xs)ds.


Therefore we must have ⟨M [f ]

⟩t−∫ t

0

(Lf 2 − 2fLf

)(Xs)ds = 0

that is ⟨M [f ]

⟩t

=

∫ t

0

(Lf 2 − 2fLf

)(Xs)ds.

Therefore ⟨M [f ],M [g]

⟩t

=

∫ t

0

(L(fg)− fLg − gLf)(Xs)ds.

It is an easy exercise to verify that

L(fg)− fLg − gLf =d∑

j,i=1

aij∂f

∂xi∂g

∂xj.

In particular, for coordinate functions fk(x) = xk and f l(x) = xl we have⟨Mk,M l

⟩t

=

∫ t

0

akl(Xs)ds =

∫ t

0

d∑m=1

σkm(Xs)σlm(Xs)ds.

Thanks to this formula we can compute⟨Bi, Bj

⟩t

=

∫ t

0

d∑k,l=1

(σ−1(Xs)

)ik

(σ−1(Xs)

)jld⟨Mk,M l

⟩s

=

∫ t

0

d∑m,k,l=1

(σ−1(Xs)

)ik

(σ−1(Xs)

)jlσkm(Xs)σ

lm(Xs)ds

=

∫ t

0

d∑m,l=1

δim(σ−1(Xs)

)jlσlm(Xs)ds

=

∫ t

0

d∑m=1

δimδjmds = δijt.

According to Levy’s theorem, B is a standard Brownian motion. By definition∫ t

0

d∑k=1

σik(Xs)dBks =

∫ t

0

d∑k,l=1

σik(Xs)(σ−1(Xs)

)kldM l

s

=

∫ t

0

d∑l=1

δildMls = M i

t

= X it −X i

0 −∫ t

0

bi(Xs)ds.

Rearrange this equation to deduce that

X it = X i

0 +

∫ t

0

d∑k=1

σik(Xs)dBks +

∫ t

0

bi(Xs)ds

for i = 1, · · · , d. That is to say that (X,B) is a weak solution to (4.23).

Chapter 5

Local times

Ito’s formula provides us a powerful semimartingale decomposition for f(Xt) − f(X0),where f is a C2-function and X is a semimartingale. For example, if X = (Xt) is acontinuous semimartingale, then

f(Xt)− f(X0) =

∫ t

0

f ′(Xs)dXs +1

2

∫ t

0

f ′′(Xs)d〈X〉s.

It has been recognized that such decomposition plays a central role in stochastic analysis,it is thus natural to search for a decomposition for functions f with less regularity.

The concept of ”local times” of Brownian motion was introduced by Levy in his study offine properties of Brownian sample paths, which was further developed into an importanttool in constructing diffusion processes on the real line in the classics Ito-McKean “Diffusionprocesses and their sample paths”. In 1958, Tanaka established a decomposition for |Bt−a|where B is a Brownian motion. One of the important contributions made by the Frenchschool is that if f is convex and X is a continuous semimartingale, then f(X) is again asemimartingale, its decomposition sometimes is called the generalized Ito’s formula.

5.1 Tanaka’s formula and local time

H. Tanaka proved that |Xt − a| is a semimartingale if X is a Brownian motion, estab-lished its semimartingale decomposition, and identified its variation process as the localtime introduced by L. Levy. These results have been extended to a general continuoussemimartingales by M. Yor and etc.

To study these results, we need a few facts about convex functions which are discussedwith details in the appendix.

Let X = (Xt)t≥0 be a continuous semimartingale and a ∈ R. Consider the convexfunction f(x) = |x− a| on R, which is smooth except at a. Its left derivative f ′−(x) = −1for x ≤ a and f ′−(x) = 1 for x > a. Choose a function α ∈ C∞0 (R) with a compact supportin (0, T ) (for some T > 0), such that

∫R α(s)ds = 1. For each ε > 0, define fε = f ∗ αε

59

60 CHAPTER 5. LOCAL TIMES

where αε(z) = 1εα(xε

), so that

fε(x) =

∫ ∞−∞

f(x− z)αε(z)dz

=

∫ ∞−∞

f(z)αε(x− z)dz.

fε is smooth for every ε > 0 and

dk

dxkfε(x) =

∫ ∞−∞

f(z)dk

dxkαε(x− z)dz

for k = 0, 1, 2, · · · . Since f is continuous, so that fε → f uniformly on any compact subset,and

d

dxfε(x) =

∫ ∞−∞

f(z)d

dxαε(x− z)dz

= −∫ ∞−∞

f(z)d

dzαε(x− z)dz

= − limδ→0

1

δ

∫ ∞−∞

f(z) (αε(x− z − δ)− αε(x− z)) dz

= − limδ→0

∫ εT

0

f(x− z − δ)− f(x− z)

δαε(z)dz.

If x− a ≤ 0, then for z ∈ (0,∞) and δ > 0

f(x− z − δ)− f(x− z)

δ= 1

so that

d

dxfε(x) = − lim

δ→0+

∫ ∞0

f(x− z − δ)− f(x− z)

δαε(z)dz

= −∫ ∞

0

αε(z)dz = −1.

While for x− a > 0, and for every ε > 0 such that εT < x−a2

, then for |δ| < 0 we have forany z ∈ (0, εT )

f(x− z − δ)− f(x− z)

δ=x− z − δ − (x− z)

δ= −1

we therefore have

d

dxfε(x) = − lim

δ→0

∫ ∞0

f(x− z − δ)− f(x− z)

δαε(z)dz

= − limδ→0

∫ εT

0

(−1)αε(z)dz = −1.

5.1. TANAKA’S FORMULA AND LOCAL TIME 61

Hence f ′ε(x) → f ′−(x) at every point x ∈ R including x = a. Applying Ito’s formula to fεfor each ε > 0 to obtain

fε(Xt) = fε(X0) +

∫ t

0

f ′ε(Xs)dXs +1

2

∫ t

0

f ′′ε (Xs)d〈X〉s.

Since fε(Xt)→ f(Xt), fε(X0)→ f(X0) and∫ t

0

f ′ε(Xs)dXs →∫ t

0

f ′−(Xs)dXs

as ε ↓ 0, therefore 12

∫ t0f ′′ε (Xs)d〈X〉s has a limit as ε ↓ 0, which is denoted by 2Lat .

Tanaka’s formula. Suppose X is a continuous semimartingale and a ∈ R. Then thereis an adapted, continuous increasing process (Lat )t≥0 such that

|Xt − a| = |X0 − a|+∫ t

0

sgn(Xs − a)dXs + 2Lat . (5.1)

wheresgn = 1(0,∞) − 1(−∞,0]

is the sign function (which is left-hand continuous). This formula indeed can be consideredas the definition of the local time Lat of X at a.

The important thing here is that, t → Lat is continuous, increasing, starting at 0, andadapted to the filtration generated by (Xt)t≥0. Since

Lat =1

2|Xt − a| −

1

2|X0 − a| −

1

2

∫ t

0

sgn(Xs − a)dXs

so that Lat is jointly measurable in (t, a) with respect to the Borel σ-algebra B([0,∞)×R).Next we establish the Ito’s formula for a general convex function f .Suppose f : R → R is a convex function, then f must be continuous on R, its right-

derivative

f ′+(a) = limh→0+

f(a+ h)− f(a)

h

and its left derivative

f ′−(a) = limh→0−

f(a+ h)− f(a)

h

exist and are increasing on R. f ′+ is right-continuous and f ′− is left-continuous. More-over f ′+ is the right-continuous modification of f ′−, and similarly f ′− is the left-continuousmodification of f ′+, in the sense that

f ′+(a) = limε→0+

f ′−(a+ ε)

andf ′−(a) = lim

ε→0+f ′+(a− ε)

for every a ∈ R.


The Lebesgue-Stieltjies measure associated with the right derivative f ′+ is denoted byµf ′+ which is the unique measure on (R,B(R)) such that

µf ′+((a, b]) = f ′+(b)− f ′+(a)

for any a ≤ b. As a consequence we also have

µf ′+([a, b)) = f ′−(b)− f ′−(a)

for any a ≤ b. On can verify that if in addition f ∈ C2(R), then µf ′+(dx) = f ′′(x)dxwhich is absolutely continuous with respect to the Lebesgue measure. [Be careful, if thesecond derivative of f exists but not continuous, then µf ′+(dx) is in general different fromthe measure f ′′(x)dx !]

To establish the Ito’s formula for f(Xt) (where f is a convex function), let us computethe following integral

J(x) =

∫(−n,n)

|x− z|µf ′+(dz)

where x ∈ (−n, n), where n > 0 is a positive number. Firstly we notice that

J =

∫(−n,x)

|x− z|µf ′+(dz) +

∫(x,n)


=

∫(−n,x)

(x− z)µf ′+(dz)−∫

(x,n)

(x− z)µf ′+(dz)

≡ J1 − J2.

Integrating by parts we obtain that

J1 =

∫(−n,x)


= −(x+ n)f ′+(−n) +

∫(−n,x)

f ′+(z)dz

= −(x+ n)f ′+(−n) + f(x)− f(−n)

and similarly

J2 =

∫(x,n)


= (x− n)f ′−(n) + f(n)− f(x)

so that

J = −(x+ n)f ′+(−n) + f(x)− f(−n)− (x− n)f ′−(n)− f(n) + f(x)

= 2f(x)−(f ′+(−n) + f ′−(n)

)x− nf ′+(−n) + nf ′−(n)− f(−n)− f(n)

which yields that

f(x) = αnx+ βn +1

2

∫(−n,n)



for any x ∈ (−n, n), where

αn =f ′+(−n) + f ′−(n)

2and

βn =1

2

(f(−n) + f(n) + nf ′+(−n)− nf ′−(n)

).

By continuity in x, the formula holds for x = ±n as well. Now we have established allnecessary facts about a convex function required to derive the Ito’s formula.

Theorem 5.1.1 Let X = (Xt)t≥0 be a continuous semimartingale and f : R→ R a convexfunction. Then

f(Xt) = f(X0) +

∫ t

0

f ′−(Xs)dXs +

∫RLatµf ′+(da) (5.2)

where µf ′+ is the Lebesgue-Stieltjies measure associated with the right derivative f ′+. The

same formula is equally valid if f =∑n

j=1 cjfj where fj are convex functions, cj areconstants, and µf ′+(da) =

∑nj=1 cjµf ′j+(da) which is a signed measure.

If f ∈ C2(R) then ∫Rf ′′(a)Lat da =

1

2

∫ t

0

f ′′(Xs)d〈X〉s (5.3)

for every t ≥ 0.

Proof. For any n > 0 we have

f(x) = αnx+ βn +1

2

∫(−n,n)


for every x ∈ [−n, n], where αn and βn are given as above.Let Tn = inft ≥ 0 : |Xt| ≥ n. Then Tn : n ≥ 1 is an increasing sequence of

stopping times and Tn ↑ ∞. Applying Tanaka’s formula to Xnt = Xt∧Tn one obtains

|Xnt − a| = |Xn

0 − a|+∫ t

0

sgn(Xns − a)dXn

s + 2Ln,at

for each n, and as n → ∞, Ln,at = Lat∧Tn ↑ Lat . According to the previous representation

for f one obtains

f(Xnt ) = αnX

nt + βn +

1

2

∫(−n,n)

|Xnt − z|µf ′+(dz)

= αnXnt + βn +

1

2

∫(−n,n)

|Xn0 − z|µf ′+(dz)

+

∫(−n,n)

∫ t

0

sgn(Xns − z)dXn

s µf ′+(dz) +

∫(−n,n)

Ln,zt µf ′+(dz)

= αnXnt + βn +

1

2

∫(−n,n)

|Xn0 − z|µf ′+(dz)

+

∫ t

0

∫(−n,n)

sgn(Xns − z)µf ′+(dz)dXn

s +

∫(−n,n)

Ln,zt µf ′+(dz).


According to definition of sgn and µf ′+ we have∫(−n,n)

sgn(Xns − z)µf ′+(dz) =

∫(−n,Xn

s )

µf ′+(dz)−∫

[Xns ,n)

µf ′+(dz)

= f ′−(Xns )− f ′+(−n)−

(f ′−(n)− f ′−(Xn

s ))

= 2f ′−(Xns )−

(f ′+(−n) + f ′−(n)

)so after rearranging the terms to obtain

f ′−(Xns ) =

1

2

∫(−n,n)

sgn(Xns − z)µf ′+(dz) + αn.

By the representation

f(Xn0 ) = αnX

n0 + βn +

1

2

∫(−n,n)

|Xn0 − z|µf ′+(dz).

Putting these relations into the equation for f(Xnt ) we thus have

f(Xnt ) = f(Xn

0 ) + αn (Xnt −Xn

0 ) +

∫ t

0

[f ′−(Xn

s )− αn]dXn

s

+

∫(−n,n)

Ln,zt µf ′+(dz)

= f(Xn0 ) +

∫ t

0

f ′−(Xns )dXn

s +

∫(−n,n)

Ln,zt µf ′+(dz).

Letting n→∞ we obtain the formula.For each a, (Lat )t≥0 is called the local time (process) of X at site a ∈ R. According to

definition

Lat =1

2(|Xt − a| − |X0 − a|)−

1

2

∫ t

0

sgn(Xs − a)dXs (5.4)

for t ≥ 0. Our next task is to study the regularity of (t, a)→ Lat as a random field.Let us begin with the following

Lemma 5.1.2 Let b ∈ R, and g(x) = 12(x− b)2 if x ≤ b and g(x) = 0 if x > b.

1) Then g′(x) = x− b for x ≤ b, g′(x) = 0 for x > b, and

µg′(dx) = 1(−∞,b](x)dx.

2) If X is a continuous semimartingale, then

g(Xt)− g(X0) =

∫ t

0

g′(Xs)dXs +1

2

∫ t

0

1(−∞,b](Xs)d〈X〉s.

Proof. Item 1) follows an easy computation, and the fact that g′ is continuous. Toshow item 2), we choose a function α ∈ C∞0 (R) with compact support in (0,∞) such that∫R α(x)dx = 1 and for ε > 0 set αε(x) = ε−1α (ε−1x) and gε = αε ∗ g. Then, since g′ is


continuous, both gε, g′ε converge to g, g′ uniformly on any compact set, and g′′ε converges

to 1(−∞,b] point-wise. Therefore, by passing to the limit as ε ↓ 0 in the Ito’s formula

gε(Xt)− gε(M0) =

∫ t

0

g′ε(Ms)dMs +1

2

∫ t

0

g′′ε (Ms)d〈M〉s

we obtain item 2).

Lemma 5.1.3 Let X = X0 + M + A be the Doob-Meyer decomposition of a continuoussemimartingale, and Sn = inf t ≥ 0 : |Xt| ≥ n. Then, for any p ≥ 1 there is a constantC depending only on p such that

E∣∣∣∣ sup0≤s≤t

∫ s∧Sn

0

1(a,b](Xr)dMr

∣∣∣∣2p≤ C

(L+ n)p + E

(∫ t

0

|dAs|)p

+ E〈M〉p/2t

|b− a|p (5.5)

for any n ≥ 0, t ≥ 0 and any a ≤ b such that |a|, |b| ≤ L.

Proof. Without losing a generality, we may assume that Sn = ∞. That is, |Xt| ≤ nfor all t. Let Nt =

∫ t0

1(a,b](Xr)dMr for t ≥ 0. According to Burkholder’s inequality

E sup0≤s≤t

∣∣∣∣∫ s

0

1(a,b](Xr)dMr

∣∣∣∣2p ≤ CE∣∣∣∣∫ t

0

1(a,b](Xr)d〈M〉r∣∣∣∣p

where C, and through the proof, denotes a constant depending only on p, which can bedifferent from place to place. Let gc(x) = 1

2(x− c)2 for x ≤ c and gc(x) = 0 for x > c. By

Lemma 5.1.2 ∫ t

0

1(a,b](Xs)d〈M〉s

= 2 (gb(Xt)− ga(Xt)− gb(X0) + ga(X0))

−2

∫ t

0

(g′b − g′a) (Xs)dMs − 2

∫ t

0

(g′b − g′a) (Xs)dAs.

It is easy to see that|gb(x)− ga(x)| ≤ (L+ n)|b− a|

and|g′b(x)− g′a(x)| ≤ |b− a|

so that ∣∣∣∣∫ t

0

Xa<Xs≤b(Xs)d〈M〉s∣∣∣∣

≤[4(L+ n) + 2

∫ t

0

|dAs|]|b− a|

+2

∣∣∣∣∫ t

0

(g′b − g′a) (Xs)dMs

∣∣∣∣ .


Therefore

E∣∣∣∣∫ t

0

Xa<Xs≤b(Xs)d〈M〉s∣∣∣∣p

≤ 2p−1E[4(L+ n) + 2

∫ t

0

|dAs|]p|b− a|p

+2pE∣∣∣∣∫ t

0


∣∣∣∣p .

For the last term on the right hand side, we may use Burkholder’s inequality again, toobtain

E∣∣∣∣∫ t

0


∣∣∣∣p ≤ CE∣∣∣∣∫ t

0

(g′b − g′a)2

(Xs)d〈M〉s∣∣∣∣p/2

≤ CE〈M〉p/2t |b− a|p

which completes the proof.

Theorem 5.1.4 Let Xt = X0 + Mt + At be a continuous semimartingale, where M is alocal continuous martingale, A is a continuous process with finite variation. Let

Lat =1

2(|Xt − a| − |X0 − a|)−

1

2

∫ t

0

sgn(Xs − a)dXs

be the local time of X at a.1) There is a set N ∈ F with probability zero, such that

(t, a)→ Lat (ω)

is jointly continuous in t ∈ R+ and right-continuous in a ∈ R for any ω ∈ Ω \N .2) For each ω ∈ Ω \N , La−t (ω) exists and

Lat (ω)− La−t (ω) =

∫ t

0

1Xs(ω)=adAs(ω)

and (t, a)→ Lat (ω) is jointly continuous almost surely if and only if∫ t

01Xs=adAs = 0.

Proof. For each n define

Tn = inft ≥ 0 : |Mt| ≥ n, 〈M〉t ≥ n or V (A)t ≥ n

where V (A)t is the total variation of A on [0, t]. Then Tn : n ≥ 1 is an increasingsequence of stopping times, such that Tn ↑ ∞. Let Xn

t = Xt∧Tn , and similar notationsapply to other processes. Then

Lat∧Tn =1

2(|Xn

t − a| − |Xn0 − a|)−

1

2

∫ t

0

sgn(Xns − a)dXn

s .


Hence we first consider a semimartingale such that all processes M , 〈M〉 and V (A) arebounded. For this case, t → 〈M〉t + V (A)t + t is continuous and strictly increasing, its(right-continuous) inverse

τt = infs : 〈M〉s + V (A)s + s > t.

Then each τt is a stopping, t→ τt is continuous and τt ↑ ∞. By definition

〈M〉τt + V (A)τt + τt = t, τ〈M〉t+V (A)τt+t= t ∀t ≥ 0. (5.6)

In particular〈M〉τt − 〈M〉τs + V (A)τt − V (A)τs + (τt − τs) = t− s

for t ≥ s ≥ 0, which yields that

〈M〉τt − 〈M〉τs ≤ t− s ∀t ≥ s ≥ 0 (5.7)

andV (A)τt − V (A)τs ≤ t− s ∀t ≥ s ≥ 0. (5.8)

According to (5.4) one has

Laτt =1

2(|Xτt − a| − |X0 − a|)−

1

2

∫ τt

0

sgn(Xs − a)dXs

=1

2(|Zt − a| − |Z0 − a|)−

1

2

∫ t

0

sgn(Zs − a)dZs.

where Zt = Xτt . Let Zt = Z0 + Mt + At.Then Mt = Mτt and At = Aτt .Since t→ τt is continuous and strictly increasing, so that

Lat = Laττ−1t

whereτ−1t = 〈M〉t + V (A)t + t

is continuous, so that we only need to prove the conclusions in the theorem for Laτt .Therefore we may further assume that processes M , 〈M〉 and V (A) are bounded by

some constant C0, and〈M〉t − 〈M〉s ≤ t− s ∀t ≥ s ≥ 0,

andV (A)t − V (A)s ≤ t− s ∀t ≥ s ≥ 0.

Let N(a)t = 12

∫ t0sgn(Xs − a)dMs for simplicity. Then

Lat =1

2(|Xt − a| − |X0 − a|)−N(a)t

−1

2

∫ t

0

sgn(Xs − a)dAs.


We next prove that

(t, a)→ Jat ≡1

2(|Xt − a| − |X0 − a|)−N(a)t (5.9)

is jointly continuous. More precisely, there is a set N ⊂ Ω with probability zero, such that(t, a)→ Jat (ω) is continuous on R+ × R for any Ω \ N .

Clearly we only need to show (t, a) → N(a)t is continuous except a probability zeroset. Let a < b ∈ R, 0 ≤ s ≤ t. Then

N(a)t −N(b)s =1

2

∫ t

0

sgn(Xr − a)dMr −1

2

∫ s

0

sgn(Xr − b)dMr

=1

2

∫ t

s

sgn(Xr − a)dMr +

∫ s

0

1(a,b](Xr)dMr.

Therefore for any p ≥ 1 we have

E |N(a)t −N(b)s|2p ≤ 22(p−1)E∣∣∣∣∫ s

0

1(a,b](Xr)dMr

∣∣∣∣2p+22(p−1)E

∣∣∣∣∫ t

s

sgn(Xr − a)dMr

∣∣∣∣2p .

The first integral on the right-hand side can be dominated via (5.5), so let us control thesecond expectation. Indeed, by using the Burkholder’s inequality

E∣∣∣∣∫ t

s

sgn(Xr − a)dMr

∣∣∣∣2p ≤ CE∣∣∣∣∫ t

s

d〈M〉r∣∣∣∣p

= CE(〈M〉t − 〈M〉s)p.

Therefore

E |N(a)t −N(b)s|2p

≤ CE(〈M〉t − 〈M〉s)p

+C

(L+ C0)p + E

(∫ t

s

|dAs|)p

+ E〈M〉p/2t−s

|b− a|p

for any a, b such that |a|, |b| ≤ L, any t ≥ 0. Since, under our reduction, it holds that∫ t

0

|dAs| ≤ t and 〈M〉t − 〈M〉s ≤ t− s

so that there is a constant C depending only on p, L and C0

E |N(a)t −N(b)s|2p ≤ C[tp2 (b− a)p + (t− s)p

](5.10)

for any s, t ≥ 0, a, b ∈ R such that |a|, |b| ≤ L. It follows that, according to the Kolmogorov-Centsov theorem, (t, a)→ N(a)t is αp-Holder continuous on any compact subset of [0,∞)×R, for any p > 2, where αp = p−2

2p.


Finally we treat the Riemann-Stieltjies integral

Iat ≡∫ t

0

sgn(Xs − a)dAs.

It is clear that

Iat − Ibs =

∫ t

0

(sgn(X − a)− sgn(X − b)) dA

+

∫ t

s

sgn(X − b)dA

so that, since a→sgn(Xs − a) is right-continuous,

lims→tb↓a

(Iat − Ibs

)= 0

and

lims→tb↑a

(Iat − Ibs

)= −2

∫ t

0

XXs−a=0dAs.

Therefore (t, a) → Lat is jointly right continuous in a and continuous in t. Moreover it isjointly continuous if and only if

∫ t0XXs=adAs = 0. In particular

Lat − La−t =

∫ t

0

XXs=adAs

which completes the proof.From now on, for a continuous semimartingale X = (Xt)t≥0, we always use a version

of its local time Lat such that (t, a) → Lat is jointly continuous in t, right continuous in aand having left hand limits La−t , without further qualification.

By checking the preceding proof, we find that it is possible to improve on the regularityfor the local time of Brownian motion.

Theorem 5.1.5 Let Lat : t ≥ 0, a ∈ R be the local time of one-dimensional standardBrownian motion (Wt)t≥0.

1) For any L, T > 0 and p > 1 there is a constant C depending only on L, T, p suchthat

E|Lat − Lbs|2p ≤ C (|b− a|p + |t− s|p) (5.11)

for (t, a), (s, b) ∈ [0, T ]× [−L,L].2) If α < 1

2, (t, a)→ Lat is jointly α-Holder continuous.

Proof. From the proof of the previous theorem, for Brownian motion (Wt)t≥0, 〈W 〉t =t.

Lat =1

2(|Wt − a| − |a|)−N(a)t

where

N(a)t =1

2

∫ t

0

sgn(Ws − a)dWs


so that

N(b)t −N(a)t =1

2

∫ t

0

Xa<Ws≤b(Ws)dWs.

Let gc(x) = 12(x− c)2 if x ≤ c and gc(x) = 0 if x > c. By Lemma 5.1.2 one has∫ t

0

Xa<Ws≤b(Ws)ds

= (Wt − b)21Wt≤b − (Wt − a)21Wt≤a

−2

∫ t

0

(g′b − g′a) (Ws)dWs

= (Wt − b)21a<Wt≤b − 2 (b− a)Wt1Wt≤a

+(b2 − a2

)1Wt≤a − 2

∫ t

0

(g′b − g′a) (Ws)dWs.

While for any p > 1 one has the following elementary estimates

E|Wt − b|2p1a<Wt≤b ≤ |a− b|2p

and

E|Wt|2p1Wt≤a =

∫ a

−∞|z|2p 1√

2πte−

z2

2t dz

≤ tp∫ ∞−∞|z|2p 1√

2πe−

z2

2 dz

= Cptp .

so that

E∣∣∣∣∫ t

0

Xa<Ws≤b(Ws)ds

∣∣∣∣p≤ C [(a− b)p + (b+ a)p + tp] (b− a)p

+CE∣∣∣∣∫ t

0

(g′b − g′a) (Ws)dWs

∣∣∣∣p≤ C [(a− b)p + (b+ a)p + tp] (b− a)p

+CE∣∣∣∣∫ t

0

(b− a)2 ds

∣∣∣∣p/2where we have used the fact that |g′b(x)− g′a(x)| ≤ |b− a| for any x. Therefore

E∣∣∣∣∫ t

0

Xa<Ws≤b(Ws)ds

∣∣∣∣p ≤ C|b− a|p

where

C = Cp[(a− b)p + (b+ a)p + tp + tp/2

]


where Cp depends only on p. It follows from B-D-G inequality that

E| sups≤t

(N(b)s −N(a)s) |2p ≤ CE∣∣∣∣∫ t

0

Xa<Ws≤b(Ws)ds

∣∣∣∣p≤ C|b− a|p

where C as above. Therefore, for any L, T > 0 and p > 1, there is a constant C = C(L, T, p)such that

E|N(a)t −N(b)s|2p ≤ C (|b− a|p + |t− s|p)

for all (s, b), (t, a) ∈ [0, T ]× [−L,L]. Now, by the Tanaka formula

Lat − Lbs =1

2(|Wt − a| − |Ws − b|)

−1

2(|a| − |b|)− (N(a)t −N(b)s)

so that

|Lat − Lbs| ≤1

2|Wt −Ws|+ |b− a|+ |N(a)t −N(b)s| ,

which, together with the estimate for Nt, yields the desired result.

Remark 5.1.6 The following formula (from the preceding proof) may be useful. For t, s ≥0 and a > b then

Lat − Lbs =1

2(|Wt − a| − |Ws − b|) +

1

2(|b| − |a|)

+ (b− a)

(b+ a

2−Wt

)1Wt≤a

+1

2(Wt − b)21a<Wt≤b − (b− a)

∫ t

0

1Ws≤adWs

−∫ t

0

(Ws − a)1a<Wt≤bdWs −∫ t

s

sgn(Wr − b)dWr.

Let us prove important properties about the local time Lat of a continuous semimartin-gale Xt = X0 +Mt + At.

Proposition 5.1.7 The following Tanaka’s formulas hold

(Xt − a)+ = (X0 − a)+ +

∫ t

0

1(a,∞)(Xs)dXs + Lat (5.12)

and

(Xt − a)− = (X0 − a)− −∫ t

0

1(−∞,a](Xs)dXs + Lat . (5.13)

Proof. Apply (5.2) to f(x) = (x−a)+ so that f ′−(x) = 1(a,∞)(x) and µf ′+(dx) = δa(dx),to obtain (5.12). Similarly, applying (5.2) to f(x) = (x − a)−: f ′−(x) = −1(−∞,a] andµf ′+(dx) = δa(dx) so that (5.13) follows.


Theorem 5.1.8 The local time t → Lat increasing only on s : Xs = a. More precisely∫ t0|Xs − a|dLas = 0 for all t > 0.

Proof. According to Ito’s formula

|Xt − a|2 = |X0 − a|2 + 2

∫ t

0

(Xs − a)dXs + 〈M〉t.

On the other hand, according to Tanaka’s formula, |Xt−a| is a continuous semimartingalewith decomposition

|Xt − a| = |X0 − a|+∫ t

0

sgn(Xs − a)dXs + 2Lat .

Thus

〈|X − a|〉t =

∫ t

0

(sgn(Xs − a))2 d〈X〉s

=

∫ t

0

d〈X〉s = 〈M〉t

together with the integration by parts

|Xt − a|2 = |X0 − a|2 + 2

∫ t

0

|Xs − a|d|Xs − a|

+〈|X − a|〉t

= |X0 − a|2 + 2

∫ t

0

(Xs − a) dXs + 〈M〉t

+4

∫ t

0

|Xs − a|dLas .

Therefore we must have ∫ t

0

|Xs − a|dLas = 0.

Theorem 5.1.9 ( Occupation time formula) Let ϕ be a Borel measurable function on R,bounded or non-negative.

1) If X is a continuous semi-martingale, then∫ t

0

ϕ(Xs)d〈X〉s = 2

∫Rϕ(a)Lat da (5.14)

for all t ≥ 0.2) If X is a continuous local martingale, or if X = M + A (where M is a continuous

local martingale and A is adapted with finite variations) such that∫ t

01Xs=adAs = 0 for

all t > 0, then

limε↓0

1

4ε

∫ t

0

1(a−ε,a+ε)(Xs)d〈X〉s = Lat


for t ≥ 0.3) If B is a Brownian motion and Lat (for t ≥ 0) is the local time process of B at a,

then

limε↓0

1

4ε

∫ t

0

1(a−ε,a+ε)(Xs)dt = Lat

for all a and t ≥ 0.

Proof. Suppose ϕ = f ′′ for some f ∈ C2(R), then by (5.3), (5.14) holds. In particular,(5.14) is valid for any continuous function ϕ, so it is true for any bounded or non-negativeϕ.

Proposition 5.1.10 Suppose f : R → R is continuous, and there are finite many pointsa1 < · · · < an such that f ∈ C2(ai, ai+1) (i = 0, · · · , n, with a0 = −∞, an+1 = ∞), f hasleft and right derivatives at ai. Then

f(Xt) = f(X0) +

∫ t

0

f ′−(Xs)dXs +1

2

∫ t

0

f ′′(Xs)d〈X〉s

+n∑i=1

(f ′(ai+)− f ′(ai−))Lait . (5.15)

Proof. We can apply Theorem 5.1.1 to f . In this case f ′′ exists except at ai andtherefore

µf ′+(dx) = f ′′(x)dx+n∑i=1

(f ′+(ai)− f ′−(ai)

)δai(dx).

Hence

f(Xt) = f(X0) +

∫ t

0

f ′−(Xs)dXs +

∫Rf ′′(a)Lat da

+n∑i=1

(f ′+(ai)− f ′−(ai)

)Lait .

and the formula follows from∫Rf ′′(a)Lat da =

1

2

∫ t

0

f ′′(Xs)d〈X〉s.

As an application we prove the comparison theorem (due to Yamada and etc.)

Theorem 5.1.11 Let σ, b1 and b2 are real valued, Lipschitz continuous functions on R.Let X1 and X2 be the strong solutions to the following SDEs respectively:

X it = X i

0 +

∫ t

0

σ(X is)dBs +

∫ t

0

bi(X is)ds

for i = 1, 2, where B is a Brownian motion on (Ω,F ,P). Suppose(1) X1

0 ≥ X20 almost surely, and

(2) b1(x) ≥ b2(x) for all x ∈ R.Then X1

t ≥ X2t for all t almost surely.


Proof. Under the assumption there are unique strong solutions X1 and X2 for a givenBrownian motion B. Apply Tanaka ’s formula to the difference Yt = X2

t −X1t we have

(X2t −X1

t )+ =

∫ t

0

1(0,∞)(X2s −X1

s )d(X2s −X1

s

)+ L0

t

as (X20 −X1

0 )+ = 0, where Lat is the local time of X2 −X1 at a, so Lat is increasing withLa0 = 0. Taking expectations both side to obtain that

E(X2t −X1

t )+ = E(Zt) + E(L0t )

for all t ≥ 0, where

Zt =

∫ t

0

1(0,∞)(X2s −X1

s )d(X2s −X1

s

)

for simplicity. We first show that indeed L0t = 0. In fact by the occupation formula we

have ∫ t

0

1

Ys1(0,∞)(Ys)d 〈Y 〉s = 2

∫ ∞0

1

aLat da.

Since

∫ t

0

1

Ys1(0,∞)(Ys)d 〈Y 〉s =

∫ t

0

1

Ys

(σ(X2

s )− σ(X1s ))2

1(0,∞)(Ys)ds

≤ C

∫ t

0

Ys1(0,∞)(Ys)ds <∞

where C is the Lipschits constant of σ, so that

∫ ∞0

1

aLat da <∞.

Since a → Lat is right continuous, so that L0t = 0 for all t ≥ 0 almost surely. Therefore

E(X2t −X1

t )+ = E(Zt).

5.2. THE SKOROHOD EQUATION 75

By definition

Zt =

∫ t

0

1(0,∞)(X2s −X1

s )(σ(X2

s )− σ(X1s ))dBs

+

∫ t

0

1(0,∞)(X2s −X1

s )(b2(X2

s )− b1(X1s ))ds

≤∫ t

0

1(0,∞)(X2s −X1

s )(σ(X2

s )− σ(X1s ))dBs

+

∫ t

0

1(0,∞)(X2s −X1

s )(b1(X2

s )− b1(X1s ))ds

≤∫ t

0

1(0,∞)(X2s −X1

s )(σ(X2

s )− σ(X1s ))dBs

+ C

∫ t

0

1(0,∞)(X2s −X1

s )|X2s −X1

s |ds

≤∫ t

0

1(0,∞)(X2s −X1

s )(σ(X2

s )− σ(X1s ))dBs

+ C

∫ t

0

(X2s −X1

s )+ds

where C is the Lipschitz constant, which yields that

E(X2t −X1

t )+ = E(Zt) ≤ C

∫ t

0

E(X2s −X1

s )+ds

for all t ≥ 0. Therefore, by the Gronwall inequality, we may conclude that

E(X2t −X1

t )+ = 0

so that X2t ≤ X1

t for all t almost surely.

5.2 The Skorohod equation

It is possible to represent the local time Lat of a continuous semimartingaleX as a functionalof X. Skorohod has established an explicit formula for Lat in terms of the running maximumof −X, and his formula is completely deterministic without any stochastic content. It is atheorem which should be an exercise in Prelim Analysis!

Theorem 5.2.1 (Skorohod’s equation) Suppose y ∈ C[0,∞) is a continuous path in Rwith initial value y(0) ≥ 0. Let

k(t) = max

0, max

0≤s≤t(−y(s))

for t ≥ 0. (5.16)

Then k is the unique continuous and increasing function on [0,∞) with initial zero, suchthat

x(t) = y(t) + k(t) ≥ 0 for t ≥ 0

and t→ k(t) increases only on t : x(t) = 0, that is,∫∞

01x(t) 6=0dk(t) = 0.


Proof. Let us follow the proof in Karatzas-Shreve, page 210. Suppose y ∈ C[0,∞) isa continuous path with y(0) ≥ 0. Define a path k by (5.16). Since y is a continuous pathso is k, t→ k(t) is increasing by definition. Since −y(0) ≤ 0, so that k(0) = 0. Moreover

x(t) = y(t) + k(t) = max

y(t), max

0≤s≤t(y(t)− y(s))

≥ 0.

We next show that k increases only on I = t : x(t) = 0. To this end we only need toshow that k does not increase on Iε = t ≥ 0 : x(t) > ε for every ε > 0. Since x iscontinuous, Iε is open, so that Iε is countable union of disjoint open intervals (ai, bi) wherebi > ai (where i ∈ Λ an index set which may be empty). For every s ∈ [ai, bi]

−y(s) = k(s)− x(s) ≤ k(bi)− ε

so that by definition

k(bi) = max

0, max

0≤s≤bi(−y(s))

= max

k(ai), max

ai<s≤bi(−y(s))

≤ max k(ai), k(bi)− ε .

Since k is increasing, k(ai) ≤ k(bi), so we must have

k(ai) = k(bi)

for all i. Therefore k is constant on Iε for every ε > 0, hence k must be flat on R \ I =t ≥ 0 : x(t) > 0.

Suppose there are two continuous increasing functions k1 and k2 with initial value 0 att = 0, such that

xi(t) = y(t) + ki(t) ≥ 0 for all t ≥ 0

and ki increases only on t ≥ 0 : xi(t) = 0, where i = 1, 2.By definition xi(0) = y(0). Suppose there is b > 0 such that x1(b) > x2(b). Let

a = sup t ≤ b : x1(t) = x2(t) .

Then 0 ≤ a < b by the continuity of xi. Hence x1(s) > x2(s) ≥ 0 by definition of a, for alls ∈ (a, b]. Since k1 increases only on t ≥ 0 : x1(t) = 0, so that

k1(a) = k1(b)

and therefore

0 < x1(b)− x2(b) = k1(b)− k2(b)

= k1(a)− k2(b) ≤ k1(a)− k2(a)

= x1(a)− x2(a) = 0

5.3. LOCAL TIME FOR BROWNIAN MOTION 77

which is a contradiction. Hence x1(t) ≤ x2(t) for all t ≥ 0. By symmetry we also havex1(t) ≥ x2(t) for all t ≥ 0. The uniqueness now follows immediately.

If X is a continuous semimartingale, then according to Tanaka’s formula, for everya ∈ R

|Xt − a| = |X0 − a|+∫ t

0

sgn(Xs − a)dXs + 2Lat

where t → Lat is continuous, initial zero, and increasing only on t : Xt = a, whichis equivalent to that Lat increases only on t ≥ 0 : |Xt − a| = 0, hence, according toSkorohod’s equation, we have the following corollary.

Corollary 5.2.2 If X is a continuous semimartingale, a ∈ R. Then its local time at a isgiven by

Lat =1

2max

0, max

0≤s≤t

[−∫ s

0

sgn(Xr − a)dXr

]− |X0 − a|

. (5.17)

We end this section to propose the following question which is mentally challenging.Skorohod’s equation indeed gives rise to a mapping which sends a continuous path y(t) inR with initial y(0) ≥ 0 (with running time t ∈ [0,∞)) to a continuous and increasing pathk(t) with k(0) = 0 so that the path x(t) = y(t) + k(t) is a continuous path in R+, and kincrease only on t ≥ 0 : x(t) = 0. Are there similar mappings for continuous paths inRd? There are of course trivial mappings which essentially one dimensional, are there truemulti-dimensional versions of Skorohod’s equation?

5.3 Local time for Brownian motion

According to Wiener, for every z ∈ R, there is a unique probability measure Pz on theBorel σ-algebra over Ω = C[0,∞) the path space of all continuous paths in R, such thatthe coordinate process xt : t ≥ 0 is Brownian motion started at z: Pzx0 = z = 1. Thetheory established in the previous section may be applied to one-dimensional Brownianmotion. Therefore there is a random field Lat : t ≥ 0, a ∈ R jointly Holder continuous in(t, a), such that

|xt − a| = |z − a|+∫ t

0

sgn(xs − a)dxs + 2Lat Pz-a.s. (5.18)

for any z.

According to Skorohod’s equation

2Lat = max

max0≤s≤t

βs − |z − a|, 0

Pz-a.s. (5.19)

where

βt = −∫ t

0

sgn(xs − a)dxs (5.20)


is Brownian motion started at 0 under Pz, and in terms of βt : t ≥ 0 defined by (5.20),we may rewrite Tanaka’s formula

|xt − a| = max

max0≤s≤t

βs, |z − a|− βt Pz-a.s. (5.21)

Therefore

Theorem 5.3.1 (P. Levy, 1948) Let z, a ∈ R. Let W = (Wt)t≥0 be a Brownian motionon (Ω,F ,P) started at z ∈ R, and (Lat )t≥0 be the local time of W at a ∈ R. Then(2Lat , |Wt − a|)t≥0 on (Ω,F ,P) has the same distribution as that of(

(max0≤s≤t

βs − |z − a|)+, |z − a| ∨ max0≤s≤t

βs − βt)t≥0

(5.22)

where βt : t ≥ 0 is Brownian motion starting at 0.

The previous conclusion can be stated in terms of a Brownian motion βt : t ≥ 0 isBrownian motion starting at 0 in R.

Theorem 5.3.2 (P. Levy) Let βt : t ≥ 0 be a Brownian motion starting at 0 in R. Forany a ∈ R, and Lat be the local time of β at a, that is

2Lat = |βt − a| − |β0 − a| −∫ t

0

sgn (βs − a) dβs.

Then (2Lat , |βt − a|)t≥0 and((max0≤s≤t

βs − |a|)+, |a| ∨ max0≤s≤t

βs − βt)t≥0

(5.23)

have the same distribution.

In particular, if βt : t ≥ 0 is Brownian motion starting at 0, and Lt = L0t is the local

time of βt : t ≥ 0 at 0, then the pair of processes (2Lt, |βt|) : t ≥ 0 has the samedistribution as that of (

max0≤s≤t

βs, max0≤s≤t

βs − βt)t≥0

which was discovered by P. Levy in 1948.Next we would like to look at the distribution of the local time Lat .The central problem is that whether one is able to describe the distribution of the

random field (Lat , βt) : t ≥ 0, a ∈ R?Firstly, for fixed t > 0 and z ∈ R, the joint distribution (Lzt , |βt − z|) can be determined

according to Levy’s theorem. Indeed, by the reflection principle,

Pβt ∈ da, max

0≤s≤tβs ∈ db

=

2(2b− a)√2πt3

e−(2b−a)2

2t (5.24)

on b ≥ 0, a ≤ b, from which we can work out the distribution of Lzt for fixed t > 0 andz ∈ R.

5.3. LOCAL TIME FOR BROWNIAN MOTION 79

Corollary 5.3.3 Same assumptions as in Theorem 5.3.2. Let t > 0 and z ∈ R.

1) The law of (L0t , |βt|) has a PDF given by

p(x, y) =4(2x+ y)√

2πt3e−

(2x+y)2

2t ; x ≥ 0, y ≥ 0. (5.25)

2) If z 6= 0, (Lzt , |βt − z|) has a distribution in R2 with a PDF given by

P [Lzt ∈ dx, |βt − z| ∈ dy]

=1√2πt

(e−

(y−|z|)22t − e−

(y+|z|)22t

)1y≥0dyδ0(dx)

+4√2πt3

(2x+ y + |z|)e−(2x+y+|z|)2

2t 1x≥0,y≥0dydx. (5.26)

Proof. Let βt : t ≥ 0 is Brownian motion starting at 0 in R on a probabilityspace (Ω,F ,P). Let βMt = max0≤s≤t βs is the running maximum of the Brownian motion.Then, according to Levy’s theorem, (2Lzt , |βt − z|)t≥0 has the same distribution as that of((βMt − |z|)+, |z| ∨ βMt − βt

)t≥0

. In particular, for a fixed t > 0

Ef(L0t , |βt|) = Ef(

1

2βMt , β

Mt − βt),

together with (5.24), we have

Ef(L0t , |βt|) =

2√2πt3

∫∫b≥0,a≤b

f(1

2b, b− a)(2b− a)e−

(2b−a)22t dadb.

Making change of variables: x = 12b and y = b− a we obtain

Ef(L0t , |βt|) =

4√2πt3

∫∫x≥0,y≥0

f(x, y)(2x+ y)e−(2x+y)2

2t dxdy.

In the case that |z| > 0, one has

Ef(Lzt , |βt − z|)

= Ef(1

2(βMt − |z|)+, |z| ∨ βMt − βt)

=2√2πt3

∫∫b≥0,a≤b

f(1

2(b− |z|)+, |z| ∨ b− a)(2b− a)e−

(2b−a)22t dadb

=2√2πt3

∫∫|z|≥b≥0,a≤b

f(0, |z| − a)(2b− a)e−(2b−a)2

2t dadb

+2√2πt3

∫∫b≥|z|,a≤b

f(1

2(b− |z|), b− a)(2b− a)e−

(2b−a)22t dadb,


where the first integral ∫∫|z|≥b≥0,a≤b

f(0, |z| − a)(2b− a)e−(2b−a)2

2t dadb

=

∫ 0

−∞

∫ |z|0

f(0, |z| − a)(2b− a)e−(2b−a)2

2t dbda

+

∫ |z|0

∫ |z|a

f(0, |z| − a)(2b− a)e−(2b−a)2

2t dbda

=t

2

∫ |z|−∞

f(0, |z| − a)

(e−

a2

2t − e−(2|z|−a)2

2t

)da

=t

2

∫ ∞0

f(0, y)

(e−

(|z|−y)22t − e−

(|z|+y)22t

)dy,

and in the second integral we make change of variables: x = 12(b − |z|) and y = b − a, so

that ∫∫b≥|z|,a≤b

f(1

2(b− |z|), b− a)(2b− a)e−

(2b−a)22t dadb

= 2

∫∫x≥0,y≥0

f(x, y)(2x+ y + |z|)e−(2x+y+|z|)2

2t dxdy

Putting together we have

Ef(Lzt , |βt − z|)

=1√2πt

∫ ∞0

f(0, y)

(e−

(y−|z|)22t − e−

(y+|z|)22t

)dy

+4√2πt3

∫∫x≥0,y≥0

f(x, y)(2x+ y + |z|)e−(2x+y+|z|)2

2t dxdy.

This completes the proof.

Chapter 6

Appendix: Convex functions

In this chapter we collect some results about monotonic functions and convex functions.

6.1 Monotonic functions, Lebesgue-Stieltjes measures

A few elementary properties about about monotonic functions may be found in W. Rudin’sPrinciples, page 95. If g : (a, b) → R is an increasing (or called non-decreasing) function,then the left and right limits g(t−) and g(t+) of g at any t ∈ (a, b) exist. In fact, g(t−) =lims↑t g(s) = sups<t g(s), and similarly g(t+) = lims↓t g(s) = infs>t g(s). Clearly g(t−) ≤g(t+). g is continuous at t ∈ (a, b) if and only if g(t+) = g(t−). The difference g(t+) −g(t−) is called the jump of g at t ∈ (a, b). Define g−(t) = g(t−) and g+(t) = g(t+). Theng− and g+ are increasing on (a, b) as well, and g− ≤ g ≤ g+. Moreover g− = g = g+ exceptat most countable many points in (a, b), thus g− = g = g+ almost everywhere on (a, b) withrespect to the Lebesgue measure. g− is left-continuous and g+ is right-continuous on (a, b).We call that g− is the left-continuous modification of g, and that g+ is the right-continuousmodification of g.

Let g be a right-continuous increasing function on (a, b) with values in R. Then g(a) =g(a+) = limt↓a g(t) and g(b) = g(b−) = limt↑b g(t) exist, g(a) ∈ [−∞,∞) and g(b) ∈(−∞,∞]. The following convention is used: −(−∞) = ∞, −∞ < t < ∞ for every t ∈ Rand t +∞ = ∞. Therefore, g is naturally extended to an increasing function on [a, b]taking values in [−∞,∞], and g is finite on (a, b).

The Lebesgue-Stieltjes measure µg ( also denoted by g(dt) ) on (a, b) may be constructedas the following. First of all, define the length of (s, t] ⊆ (a, b) by µg((s, t]) = g(t)− g(s).Then we construct the outer measure µ∗g by

µ∗g(A) = inf

∞∑i=1

µg((si, ti]) : ∪∞i=1(si, ti] ⊇ A

where the inf runs over all possible countable coverings of A via intervals of form (s, t] ⊂ A.µ∗g is an outer measure on (a, b):

1) µ∗g(A) ≥ 0 for every A ⊆ (a, b),

2) µ∗g(A) ≤ µ∗g(B) if A ⊂ B ⊂ (a, b), and

81

82 CHAPTER 6. APPENDIX: CONVEX FUNCTIONS

3) µ∗g is countably sub-additive, that is,

µ∗g (∪∞i=1Ai) ≤∞∑i=1

µ∗g(Ai) for any Ai ⊂ (a, b).

The Caratheodory extension theorem allows us to construct the Lebesgue-Stieltjes measureµg out off the outer measure µ∗g. Namely, we say a subset A ⊂ (a, b) is µg-measurable ifthe Caratheodory condition holds:

µ∗g(F ) = µ∗g(F ∩ A) + µ∗g(F ∩ Ac)

for every subset F ⊂ (a, b). The totality of all µg-measurable subsets of (a, b) is denotedby Mg. Caratheodory’s extension theorem says that

1) Mg is a σ-algebra on (a, b). Any subset in Mg is called µg-measurable, or simply ameasurable set if g is specified.

2) The outer measure µ∗g restricted onMg is a measure, the restriction of µ∗g is denotedby µg, dg or by g(dt) if no confusion may arise.

3) ((a, b),Mg, µg) is complete, that is, every µ∗g-null set belongs to Mg.4) Every Borel subset of (a, b) is µg-measurable, that is, B(a, b) ⊂Mg, and finally5) If (s, t] ⊂ (a, b), then µ∗g((s, t]) = g(t)− g(s).In particular, if a < t < b, then t = [t, t] is measurable and

µg(t) = limn→∞

µg

((t− 1

n, t]

)= g(t)− lim

n→∞g

(t− 1

n

)= g(t)− g(t−)

which is the jump of g at t as g is right continuous. It follows that, if [s, t] ⊂ (a, b), then

µg([s, t]) = µg(s) + µg((s, t]) = g(s)− g(s−) + g(t)− g(s) = g(t)− g(s−),

and similarlyµg([s, t)) = g(t−)− g(s−).

If g is increasing function, but not necessary right-continuous on (a, b), then we defineµg to be the Lebesgue-Stieltjes measure generated by its right-continuous modification g+.That is, µg = µg+ . Thus

µg((s, t]) = g(t+)− g(s+) ∀(s, t] ⊂ (a, b) (6.1)

andµg(t) = g(t+)− g(t−) (6.2)

is the jump of g at t.

Exercise 6.1.1 Let g : (a, b) → R be increasing and right-continuous. For ε > 0 set

gε(t) = g(t+ε)−g(t)ε

if t, t + ε ∈ (a, b) otherwise gε(t) = 0. Though gε may not be increas-ing, but it is non-negative Borel measurable function, thus we can construct the measureµε(dt) = gε(t)dt on (a, b), then ∫

(a,b)

ϕdµε →∫

(a,b)

ϕdµg

as ε ↓ 0 for any ϕ ∈ C∞0 (a, b).

6.1. MONOTONIC FUNCTIONS, LEBESGUE-STIELTJES MEASURES 83

One of the famous theorems of Lebesgue is that, if g is increasing on (a, b), then g hasfinite derivative almost everywhere on (a, b), denoted by g′ (see for example Theorem 17.12,of E. Hewitt and K. Stromberg, page 264, or F. Riesz and B. Sz-Nagy: Lecons d’AnalyseFonctionnelle, page 5). The measure m(dt) = g′(t)dt in general does not coincide withthe Lebesgue-Stieltjes measure µg. This is obvious in the case that g is discontinuous atsome point t ∈ (a, b) as then µg(t) = g(t+) − g(t−) 6= 0 and thus µg is not absolutelycontinuous with respect to the Lebesgue measure.

Next we discuss several versions of Lebesgue-Stieltjes measures, which are most usefulforms in the study of stochastic processes.

If g is defined on [a, b) or [a, b], then the natural Lebesgue-Stieltjes measure associatedg should be

m(dt) = (g(a+)− g(a))δa + 1(a,b)µg

on [a, b), and

m(dt) = (g(a+)− g(a))δa + 1(a,b)µg + (g(b)− g(b−))δb

for [a, b]. The drawback for this convention lies in that the restriction of µg on [a1, b1] ⊂(a, b) may not coincide with the measure m on [a1, b1]. There is no better way to formulatea unified way to define Lebesgue-Stieltjes measures on an interval including at least endpoint. The best we can do is to specify the measure at the end point(s) in each concretecase.

If g : [0, b)→ [0,∞) is increasing. Then g is increasing on (0, b), it defines the Lebesgue-Stieltjes measure µg on (0, b). We define a measure, denoted by mg on [0, b) by

mg(A) = g(0+)δ0(A ∩ 0) + µg(A ∩ (0, b)),

which is a measure on the σ-algebra of all subset A of [0, b) such that A ∩ (0, b) ∈ Mg.This σ-algebra is still denoted by Mg, if no confusion may arise. Thus, by definition∫

[0,b)

f(t)mg(dt) = f(0)g(0+) +

∫(0,b)

f(t)g(dt).

In literature, mg is also denoted by g(dt) (as a measure on [0, b)), and the equality abovebecomes ∫

[0,b)

f(t)g(dt) = f(0)g(0+) +

∫(0,b)

f(t)g(dt).

Unfortunately, both integrals∫

[0,b)and

∫(0,b)

could be written as∫ b

0, thus confusion may

be produced. Therefore, we will, as long as is there is a risk for confusion, write Lebesgueintegrals as

∫E

, rather than using lower and upper limits. However, if g(0+) = 0, org(0) = 0 and g is right continuous, then the contribution of mg from 0 vanishes, in this

case both integrals∫

[0,b)fdg and

∫(0,b)

fdg coincide, and∫ b

0f(t)g(dt) can be used without

ambiguous.


6.2 Right continuous inverse

Let g : [0, b) → [0,∞) is right continuous and increasing. If b < ∞ then we set g(t) =lims↑b g(s) = sup g (which may be ∞) for t ≥ b, thus g is extended to a right continuousincreasing function on [0,∞) taking values in [0,∞]. The right-continuous inverse of g,denoted by g−1 is defined by

g−1(t) = infs ≥ 0 : g(s) > t for t ≥ 0

then g−1 : [0,∞) → [0,∞] is right-continuous and increasing. Moreover, g−1(t) < ∞ ifand only if t < lims↑b g(s). In particular, if lims↑b g(s) = ∞ then g−1 is finite. For t > 0the left limit

g−1(t) = lims<t,s↑t

g−1(s) = infs ≥ 0 : g(s) ≥ t

= sups ≥ 0 : g(s) < t

and set g−1(0) = 0.

Lemma 6.2.1 g and g−1 as above. Then1) For any t ≥ 0, g (g−1(t)) ≤ g (g−1(t)) ≤ t, and2) For t < lims↑b g(s), g(g−1(t)) ≥ g(g−1(t)) ≥ t.3) If g is a continuous then g(g−1(t)) = g(g−1(t)) = t for any t < lims↑b g(s).4) g is the right-continuous inverse of g−1 so that

g(t) = infs ≥ 0 : g−1(s) > t for t ≥ 0

andg(t) = sups ≥ 0 : g−1(s) ≤ t for t ≥ 0.

These facts can be easily derived from definitions.

6.3 Convex functions

References:1. G. H. Hardy, J. E. Littlewood and G. Polya’s classic “Inequalities’ (which has been

published by Cambridge University Press is and has been in print since 1934).2. L. Hormander: “Notions of Convexity”, published by Birkhauser in 1994. Birkhauser

has recently re-issued Hormander’s book in paperback and listed it as one of Birkhauser’sclassics.

We however follow the conventional definition of convex functions which is slightlydifferent from but equivalent to that of Hormander’s one.

Definition 6.3.1 Let f : (a, b) → R (where a or/and b may be infinity). f is called aconvex function on (a, b) if

f(λ1x1 + λ2x2) ≤ λ1f(x1) + λ2f(x2) (6.3)

for any x1, x2 ∈ (a, b) and λj ≥ 0 such that λ1 + λ2 = 1.

6.3. CONVEX FUNCTIONS 85

Remark 6.3.2 1) Some authors allow convex functions taking values ∞, but, for ourpropose, we only concern with real valued functions.

2) Some authors also define convex functions on closed (or half closed) intervals,whichdo bring some difference. For example, the continuity at the end points will be not guar-ranted.

The following proposition summarize some elementary properties of convex functions.

Proposition 6.3.3 Let f : (a, b)→ R be convex on (a, b).

1) For any x1, x2 ∈ (a, b), and any c ∈ R

supx∈[x1,x2]

(f(x)− cx) = max f(x1)− cx1, f(x2)− cx2 . (6.4)

2) For any xj ∈ (a, b) and λj ≥ 0 such that∑

j λj = 1, then

f(∑j

λjxj) ≤∑j

λjf(xj) . (6.5)

3) It holds thatf(x)− f(x1)

x− x1

≤ f(x2)− f(x)

x2 − x(6.6)

for any x1 < x < x2 such that xj ∈ (a, b).

4) If x ∈ (a, b), then h→ f(x+h)−f(x)h

for any h 6= 0 such that x+h ∈ (a, b) is increasing.

5) f is (locally) Lipschitz continuous. That is for any closed interval [x1, x2] ⊂ (a, b)there is a constant C depending only on f(x1) and f(x2) such that

|f(x)− f(y)| ≤ C|x− y| ∀x, y ∈ [x1, x2]. (6.7)

The first item appears as the definition property in Hormander’s book, 2) is the gen-eralization of our definition for convexity, and is a special case of Jensen’s inequality (seebelow). 3) is the reformulation of (6.3). Item 4) follows 3) and is the most useful form tous. 5) comes a little bit over awarded.

Exercise 6.3.4 If f : (a, b) → R is convex, then both limits limx↓a f(x) and limx↑b f(x)exist [as reals or ∞ !]

Exercise 6.3.5 f : (a, b)→ R is convex, if and only if∣∣∣∣∣∣1 1 1x1 x2 x3

f(x1) f(x2) f(x3)

∣∣∣∣∣∣ ≥ 0

for any xi ∈ (a, b) such that x1 ≤ x2 ≤ x3.


Exercise 6.3.6 If f : (a, b)→ R is convex, and t ∈ (a, b), then

f(s) ≥ β(s− t) + f(t)

for all s ∈ (a, b), and for any β such that

sups<t

f(t)− f(s)

t− s≤ β ≤ inf

s>t

f(s)− f(t)

s− t.

That is, the graph of f is above the line defined by the equation s = β(s − t) + f(t) (andthus we call it a tangent line of f at t).

Proposition 6.3.7 Let f : (a, b)→ R be convex. Then1) At any t ∈ (a, b), the left derivative f ′−(t) and right derivative f ′+(t) exist. Moreover

f ′−(t) = sups<t

f(t)− f(s)

t− sand f ′+(t) = inf

s>t

f(s)− f(t)

s− t.

2) If s, t ∈ (a, b) and s < t then

f ′−(s) ≤ f ′+(s) ≤ f(t)− f(s)

t− s≤ f ′−(t) ≤ f ′+(t). (6.8)

In particular, both f ′− and f ′+ are increasing on (a, b).3) f ′− is left-continuous and f ′+ is right-continuous on (a, b), respectively. Moreover for

any t ∈ (a, b)

f ′−(t) = f ′−(t−) = limε↓0

f ′+(t− ε) (6.9)

and

f ′+(t) = f ′+(t+) = limε↓0

f ′−(t+ ε). (6.10)

4) Let [s, t] ⊂ (a, b). Then∣∣∣∣f(y)− f(x)

y − x

∣∣∣∣ ≤ max|f ′+(s)|, |f ′−(t)|

for any x, y ∈ (s, t) and x 6= y.5) Let t ∈ (a, b). Then f is differentiable at t if and only if f ′− is right continuous

at t, i.e. f ′−(t) = limε↓0 f′−(t + ε), which is also equivalent to that f ′+ is left continuous

at t, i.e. f ′+(t) = limε↓0 f′+(xt − ε). That is, f ′+ (resp. f ′−) is the right-continuous (resp.

left-continuous) modification of f ′− (resp. f ′+). In particular, f ′+ and f ′− generate the sameLebesgue-Stieltjes measure on Borel sets of (a, b).

6) f is differentiable on (a, b) except at most countable many points.

All these conclusions follow easily from the item 4 in Proposition 6.3.3.Next we turn to integral representations of convex functions. We need the following

integration by parts.


Lemma 6.3.8 Let g : (a, b) → R be increasing, and [x1, x2] ⊂ (a, b) be a bounded, closedinterval. Suppose ϕ ∈ C1[x1, x2]. Then∫

(x1,x2)

ϕ(x)µg(dx) = ϕ(x2)g(x2−)− ϕ(x1)g(x1+)

−∫ x2

x1

g(x)ϕ′(x)dx.

Proof. Under the conditions,∫

(x1,x2)ϕ(x)µg(dx) exists as Riemann-Stieltjes’ integral,

and the integration by parts formula is well known, see Theorem 7.6, in T. M. Apostol:Mathematical Analysis, page 144.

If I = [x1, x2] is a compact interval, then define the Green function

GI(x, y) =

(y−x2)(x−x1)

x2−x1 if y ≥ x,(x−x2)(y−x1)

x2−x1 if y ≤ x.

Note that G ≤ 0, symmetric and continuous on I × I.

Theorem 6.3.9 Let f : (a, b)→ R be convex, and I = [x1, x2] ⊂ (a, b) be a bounded closedinterval. Then

f(x) =x− x1

x2 − x1

f(x2)− x− x2

x2 − x1

f(x1)

+

∫(x1,x2)

GI(x, y)µf ′+(dy) (6.11)

for x ∈ [x1, x2].

Proof. Let us consider the case x ∈ (x1, x2). Let us compute the integral

J(x) =

∫(x1,x2)

GI(x, y)µf ′+(dy)

=x− x2

x2 − x1

∫(x1,x)

(y − x1)µf ′+(dy)

+x− x1

x2 − x1

∫(x,x2)

(y − x2)µf ′+(dy)

+(x− x2)(x− x1)

x2 − x1

(f ′+(x)− f ′−(x)

).

The two integrals are Riemann-Stieltjes integrals which can be worked out by means ofintegration by parts∫

(x1,x)

(y − x1)µf ′+(dy) = (x− x1)f ′−(x)−∫

(x1,x)

f ′+(y)dy

= (x− x1)f ′−(x)− f(x) + f(x1)


and ∫(x,x2)

(y − x2)µf ′+(dy) = −(x− x2)f ′+(x)− f(x2) + f(x).

Therefore

J(x) = f(x) +x− x2

x2 − x1

f(x1)− x− x1

x2 − x1

f(x2)

which proves the representation.The following representation is most useful in our study of local times.

Theorem 6.3.10 Let f : (a, b)→ R be convex, and [x1, x2] ⊂ (a, b) be any bounded, closedinterval. Then

f(x) = αx+ β +1

2

∫(x1,x2)

|x− y|µf ′+(dy)

for any x ∈ [x1, x2], where

α =f ′+(x1) + f ′−(x2)

2and

β =1

2

(f(x1) + f(x2)− x1f

′+(x1)− x2f

′−(x2)

).

Proof. Let us consider the following integral

J(x) =

∫(x1,x2)

|x− y|µf ′+(dy)

where [x1, x2] ⊂ (a, b) and x ∈ [x1, x2]. Let us consider the case that x ∈ (x1, x2). Usingintegration by parts we have

J(x) =

∫(x1,x)

(x− y)µf ′+(dy) +

∫(x,x2)

(y − x)µf ′+(dy)

= (x− y)f ′+(y)∣∣(x1,x)

+ (y − x)f ′+(y)∣∣(x,x2)

+

∫(x1,x)

f ′+(y)dy −∫

(x,x2)

f ′+(y)dy

= −(x− x1)f ′+(x1) + (x2 − x)f ′−(x2)

+f(x)− f(x1)− f(x2) + f(x2)

which yields the conclusion.Similarly we have

Theorem 6.3.11 Let f : (a, b)→ R be convex, and [x1, x2] ⊂ (a, b) be any bounded, closedinterval. Let h(x) = 1 for x ≥ 0 and h(x) = −1 for x < 0. Then

f ′+(x) =1

2

∫(x1,x2)

h(x− y)µf ′+(dy) +f ′+(x1) + f ′−(x2)

2

for x ∈ [x1, x2], and, if h(x) = 1 for x > 0 and h(x) = −1 for x ≤ 0 then

f ′−(x) =1

2

∫(x1,x2)

h(x− y)µf ′+(dy) +f ′+(x1) + f ′−(x2)

2

for x ∈ [x1, x2].


Proof. Again, compute the integral

J(x) =

∫(x1,x2)

h(x− y)µf ′+(dy)

for x ∈ (x1, x2). Since h(0) = 1 so that∫x

h(x− y)µf ′+(dy) = f ′+(x)− f ′−(x)

and integrating by parts one has∫(x1,x)

h(x− y)µf ′+(dy) = f ′−(x)− f ′+(x1)

and ∫(x,x2)

h(x− y)µf ′+(dy) = −f ′−(x2) + f ′+(x)

so that2f ′+(x) = J(x) + f ′+(x1) + f ′−(x2).

Our last topic is about approximations to convex functions by C2-functions. Recall thestandard approach. Suppose α is a C∞-function with a compact support in R, such that∫R α(x)dx = 1, and for each ε > 0 we consider αε(x) = 1

εα (x/ε). If u is a function on R

which is locally integrable, then we set

uε(x) =

∫Ru(x− y)αε(y)dy

=

∫Ru(y)αε(x− y)dy.

The right-hand side is the convolution of u and αε, denoted by u ∗ αε. For each ε > 0, uεis a smooth function, and if u is continuous then uε → u uniformly on any compact set. Ifu ∈ Lp(R) where 1 ≤ p <∞, then uε → u in Lp(R) (the conclusion is not true for p =∞).

Suppose f : R → R is convex, then f must be continuous, its first derivative indistribution sense is f ′+ (and f ′−), and its second derivative in distribution sense is theLebesgue-Stieltjes measure µf ′+ generated by the increasing function f ′+. Therefore

f ′ε(x) =

∫Rf(y)

d

dxαε(x− y)dy

= −∫Rf(y)

d

dyαε(x− y)dy (6.12)

=

∫Rf ′−(y)αε(x− y)dy (6.13)

the last equation follows from the fact that f ′− is the distribution derivative of f . Inparticular f ′ε = f ′− ∗ αε for each ε > 0.


Similarly

f ′′ε (x) =

∫Rf(y)α′′ε(x− y)dy

=

∫Rαε(x− y)µf ′+(dy) (6.14)

which implies in particular that each fε is convex.

Lemma 6.3.12 Suppose α ∈ C∞0 (R) has a compact support in (0,∞) such that∫R α(x)ds =

1.1) If fε = f ∗ αε then

f ′ε(x)→ f ′−(x) ∀x ∈ R.

2) If fε = f ∗ αε where α(x) = α(−x) (so its support is in (−∞, 0)), then

f ′ε(x)→ f ′+(x) ∀x ∈ R.

Proof. Suppose α has a support in (0, R) where R > 0 is a number. Then

f ′ε(x)− f ′−(x) =

∫R

(f ′−(y)− f ′−(x)

)αε(x− y)dy

=

∫R

(f ′−(x− y)− f ′−(x)

)αε(y)dy

=

∫ R

0

(f ′−(x− y)− f ′−(x)

) 1

εα(y/ε)dy

=

∫ R

0

(f ′−(x− εz)− f ′−(x)

)α(z)dz.

Since f ′− is left continuous, so that f ′−(x − εz) → f ′−(x) uniformly in z ∈ (0, R) as ε ↓ 0.Therefore f ′ε(x)− f ′−(x)→ 0.

Clearly we can not expect that f ′′ε converges to the derivative of f ′+, but the family ofmeasures f ′′ε (x)dx does converges (in distribution sense).

Lemma 6.3.13 Let α ∈ C∞0 (R) has a compact support such that∫R α(x)ds = 1, and

fε = f ∗ αε. Then ∫Rϕ(x)f ′′ε (x)dx→

∫Rϕ(x)µf ′+(dx)

as ε ↓ 0 for any ϕ ∈ C∞0 (R).

Proof. By definition∫Rϕ(x)f ′′ε (x)dx =

∫R

(ϕ(x)

∫Rαε(x− y)µf ′+(dy)

)dx

=

∫R

(∫Rϕ(x)αε(x− y)dx

)µf ′+(dy)

=

∫Rϕ ∗ αε(y)µf ′+(dy)


where α(x) = α(−x). Since ϕ has a compact support, so is ϕ∗ αε and therefore ϕ∗ αε → ϕuniformly. It follows that ∫

Rϕ(x)f ′′ε (x)dx→

∫Rϕ(x)µf ′+(dx).

c8.1 stochastic di erential equations

Documents