TRANSCRIPT
Stochastic Processes in Neuroscience
Part I
Wilhelm Stannat
Institut für Mathematik, Technische Universität Berlin
June 16, 2016
Lecture held in the summer term 2016.
Contents
Chapter 1. Introduction
 1.1. Mathematical models for single ion channels
 1.2. Mathematical models for the membrane potential
 1.3. Biological neural networks
Chapter 2. Markov Chain Models of Ion Channels
 2.1. Time-continuous Markov Chains
 2.2. The Martingale structure of Markov chains
 2.3. Diffusion approximation of Markov chains
 2.4. Long-time behavior of Markov chains
Chapter 3. Models for synaptic input
Chapter 4. Stochastic Integrate-and-Fire models
 4.1. The distribution of T
Appendix A. Martingales
 A.1. Maximal inequality
 A.2. Stopping times and optional sampling
Appendix B. Brownian motion and stochastic integration
 B.1. Construction of BM
 B.2. Elementary properties of BM
 B.3. Path properties of BM
 B.4. The Ito-Integral
 B.5. The Ito-formula
Appendix C. Stochastic Differential Equations
 C.1. Explicit solutions
 C.2. Strong solutions
 C.3. Numerical approximation
Bibliography
CHAPTER 1
Introduction
These lecture notes provide a streamlined introduction to the modeling and mathematical analysis of neural activity in living organisms on three different scales:
(a) individual ion channels (microscopic)
(b) single neurons (mesoscopic)
(c) populations of neurons (macroscopic)
Neural activity is intrinsically noisy and the specification of single neurons exhibits a large variability, so that various types of stochasticity have to be incorporated into the models. In the following chapters we will introduce the basic principles for stochastic models, together with the mathematical theory to analyze them, that are used in today's computational neuroscience.
Before starting, let us first provide a rough overview of the relevant models on all three different levels we will discuss in the subsequent chapters. The nervous system consists of electrically excitable cells, called neurons, that process and transmit information. The typical structure of a neuron is sketched in the following figure:
[Source: Wikipedia: Neuropharmacology]
The single nerve cell receives its input from neighboring cells, or sensory input, via its dendrites. This input is integrated at the nucleus of the cell. Once a certain threshold is reached, one can observe a temporal spike in the membrane potential, i.e. a sharp rise in the membrane potential followed by a sharp decrease. This spike travels down the axon and ends in the axon terminals, where it may be passed over to other neurons or muscles. These spikes are called action potentials and they were the first activity in the nervous system that could be measured by physiologists. Let us denote the membrane potential by v. Of course, v depends on time and the location where v is measured. If we reduce the neuron to a single point, i.e. a point neuron, this observable is reduced to a real-valued variable v(t) as a function of time. Spatially extended models are much more complex and will be discussed in later chapters.
In the point neuron model, the membrane potential v is driven by three types of electrical currents
(1.1)    C dv/dt = −F + Isyn + Iext
where
(i) F denotes the sum of currents that result from ions flowing into or out of the cell through ion channels in the membrane, also called the membrane current,
(ii) Isyn denotes the synaptic currents entering the cell,
(iii) Iext denotes externally injected currents (e.g. exterior signals).
Whereas Isyn and Iext can be seen as exterior controls of the membrane potential, the current F is responsible for the intrinsic regulation of the membrane potential, in particular for the generation and regulation of action potentials. There is an extensive literature on the modeling of the membrane currents, and we will shortly illustrate the underlying principles in the following. Being a sum of membrane currents, F can be represented as
F = ∑i gi (v − vi)
where the sum is over the different types of ion channels, vi is the corresponding reversal potential and gi the conductance. gi essentially depends on the concentration of open ion channels of the respective type, which itself is coupled back to the membrane potential, and it is the dynamics of the opening and closing of the ion channels that generates the action potentials.
Single ion channel currents were first measured by Neher and Sakmann, who invented the patch clamp technique around the year 1976 and later received the Nobel Prize in Physiology or Medicine in the year 1991 "for their discoveries concerning the function of single ion channels in cells" ([Nob91]). These measurements showed that the dynamics of single ion channels is intrinsically random and therefore cannot be described adequately with the help of a differential equation. The following picture illustrates this apparent stochastic behaviour of a single sodium channel in the giant axon of the squid (which has also been considered by Hodgkin-Huxley, see below):
(see [VB91]).
Peaks pointing downwards can be associated with times where the sodium channel opens, positive sodium ions (Na+) flow into the cell and raise the membrane potential. The figure shows that the response of the given ion channel varies from trial to trial and that only the probability of the ion channel being in the up-state (open) or down-state (closed) can be compared w.r.t. different applied currents. The randomness in the response of single ion channels is called channel noise and it is one of the dominant sources of variability in the membrane potential.
1.1. Mathematical models for single ion channels
It is widely accepted in computational neuroscience today that an adequate modeling of the statistics of single ion channels can be achieved with the help of (time-continuous) Markov chains on a finite number of states (between open and closed) and that the switching rates between these states, the transition rates, are voltage dependent.
[State diagram: closed state C and open state O, with opening rate α(v) and closing rate β(v).]
If X(t) denotes the state of the ion channel at time t, the probability p(t) = P(X(t) = O) of the Markov chain to be found in the open state is then given as the solution of the ordinary differential equation
(1.2)    dp/dt = α(v)(1 − p) − β(v) p .
It is important to notice at this point that (1.2) is only a statistical description of the Markov chain and not the description of a given realization. It can be seen as an approximation of the proportion of ion channels being in the open state, if there were virtually infinitely many independent ion channels X1(t), X2(t), . . . operating simultaneously. Indeed, given N (independent) ion channels, the proportion pN(t) := (1/N) ∑_{i=1}^N 1O(Xi(t)) of ion channels found in the open state converges almost surely as N → ∞, due to the strong law of large numbers, towards the theoretical probability p(t) = E(1O(Xi(t))), i.e.,
(1.3)    lim_{N→∞} pN(t) = lim_{N→∞} (1/N) ∑_{i=1}^N 1O(Xi(t)) = E(1O(Xi(t))) = p(t) .
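The law of large numbers in (1.3) can be checked empirically. The following sketch (in Python; the notes themselves use Octave/Matlab for illustrations) simulates N independent two-state channels with illustrative constant rates α = 1 and β = 2, all starting in the closed state, and compares the empirical open fraction at time t with the solution of (1.2):

```python
import numpy as np

# Empirical check of (1.3): N independent two-state channels, each
# simulated exactly via its exponential holding times. The rates
# alpha = 1, beta = 2 are illustrative choices, not from the notes.
rng = np.random.default_rng(0)
alpha, beta, t_end, N = 1.0, 2.0, 1.0, 20000

def state_at(t):
    state, tau = 0, 0.0            # 0 = closed, 1 = open
    while True:
        rate = alpha if state == 0 else beta
        tau += rng.exponential(1.0 / rate)
        if tau > t:
            return state
        state = 1 - state

p_emp = sum(state_at(t_end) for _ in range(N)) / N
# solution of dp/dt = alpha (1 - p) - beta p with p(0) = 0
p_exact = alpha / (alpha + beta) * (1.0 - np.exp(-(alpha + beta) * t_end))
```

With N = 20000 channels the empirical fraction typically agrees with p(t) to within about one percent.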
An important question in computational neuroscience is the numerically efficient approximation of pN(t) and other statistics of a large, but finite, number of ion channels. One of the most important methods is the diffusion approximation, that
approximates pN with the help of some stochastic differential equation
(1.4)    dpN = (α(v)(1 − pN) − β(v)pN) dt + (1/√N) √(α(v)(1 − pN) + β(v)pN) dB
where B denotes a 1-dimensional Brownian motion. We will provide the framework for the rigorous derivation of (1.4) in the next chapter.
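A minimal sketch of how (1.4) can be simulated with the Euler-Maruyama scheme, at a frozen membrane potential (constant illustrative rates α = 1, β = 2) and with the pragmatic choice of clipping each step to [0, 1], since the Euler scheme itself may leave the unit interval:

```python
import numpy as np

# Euler-Maruyama for (1.4) with frozen rates. Rates, channel number N
# and the clipping to [0, 1] are illustrative modeling choices.
rng = np.random.default_rng(1)
alpha, beta, N = 1.0, 2.0, 100
dt, T, M = 0.001, 1.0, 2000        # M independent paths

p = np.zeros(M)                    # all channels closed at t = 0
for _ in range(int(T / dt)):
    drift = alpha * (1.0 - p) - beta * p
    diff = np.sqrt(np.clip(alpha * (1.0 - p) + beta * p, 0.0, None) / N)
    p = p + drift * dt + diff * np.sqrt(dt) * rng.normal(size=M)
    p = np.clip(p, 0.0, 1.0)       # keep the proportion in [0, 1]

p_mean = p.mean()
# deterministic limit: solution of (1.2) with p(0) = 0
p_det = alpha / (alpha + beta) * (1.0 - np.exp(-(alpha + beta) * T))
```

The sample mean over the paths stays close to the deterministic solution of (1.2), while the fluctuations around it are of order 1/√N.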
1.2. Mathematical models for the membrane potential
1.2.1. Conductance-based neuronal models. Coupling the basic equation (1.1) for the membrane potential to ion channel dynamics of the type (1.2) leads to the class of conductance-based neural models. The first and most prominent example of this class of models is the Hodgkin-Huxley model, which was introduced in 1952 to describe the membrane potential in the squid giant axon. In this model there are three different types of currents through the membrane:
• IK - potassium current (activating)
• INa - sodium current (activating and inactivating)
• IL - leak current
The coupled system of four differential equations is then given as
(1.5)    C dv/dt = gK n^4 (vK − v) + gNa m^3 h (vNa − v) + gL (vL − v) + Iext
         dn/dt = αn(v)(1 − n) − βn(v) n
         dm/dt = αm(v)(1 − m) − βm(v) m
         dh/dt = αh(v)(1 − h) − βh(v) h
with n, m and h denoting the concentration of open ion channels of the respective type. The constants gK, gNa and gL denote the maximal values of membrane conductances for potassium, sodium and leakage ions, vK, vNa and vL the corresponding reversal potentials. Finally, the transition rates are given as
αn(v) = (10 − v) / (100 (e^{(10−v)/10} − 1)) ,    βn(v) = (1/8) e^{−v/80}
αm(v) = (25 − v) / (10 (e^{(25−v)/10} − 1)) ,    βm(v) = 4 e^{−v/18}
αh(v) = (7/100) e^{−v/20} ,    βh(v) = 1 / (e^{(30−v)/10} + 1)
Parameters taken from [HA10].
Remark 1.1. (1) The components of (1.5) are of the type

(1.6)    dv/dt = g (vE − v)

with explicit solution

v(t) = e^{−gt} v(0) + (1 − e^{−gt}) vE

and corresponding long-time behavior v(t) → vE with exponential rate g. For this reason vE is sometimes also called the equilibrium potential.
(2) The equations for m, n and h are of the type

(1.7)    dn/dt = α(1 − n) − βn

with explicit solution

n(t) = e^{−(α+β)t} n(0) + (1 − e^{−(α+β)t}) α/(α + β) .

In particular, the solution stays inside the unit interval [0, 1] if the initial condition is contained in [0, 1].
(3) The system of coupled differential equations exhibits a bifurcation w.r.t. the exterior input current Iext. Depending on its size, one can observe a single spike, a finite number of spikes, or even periodic spiking. More precisely, for the above parameter set:
- minimal current required for at least one spike: Iext = 2.5
- threshold value for periodic spiking: Iext = 6.25
- if Iext > 154 the amplitude of the spikes decreases rapidly.
Illustration with Octave/Matlab:
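The original Octave/Matlab illustration is not reproduced in this transcript. The following Python sketch integrates (1.5) with the forward Euler method; the conductances and reversal potentials are not listed numerically in the notes, so the classical values for this rate parameterization (resting potential shifted to v = 0) are assumed here:

```python
import numpy as np

# Forward Euler for the Hodgkin-Huxley system (1.5). The numerical
# values of C, gK, gNa, gL, vK, vNa, vL are ASSUMED classical values
# for the shifted parameterization, not taken from the notes.
C, gK, gNa, gL = 1.0, 36.0, 120.0, 0.3
vK, vNa, vL = -12.0, 115.0, 10.6
Iext = 10.0                        # above the periodic-spiking threshold 6.25

def a_n(v):
    x = 10.0 - v
    return 0.1 if abs(x) < 1e-9 else 0.01 * x / np.expm1(x / 10.0)

def b_n(v): return 0.125 * np.exp(-v / 80.0)

def a_m(v):
    x = 25.0 - v
    return 1.0 if abs(x) < 1e-9 else 0.1 * x / np.expm1(x / 10.0)

def b_m(v): return 4.0 * np.exp(-v / 18.0)
def a_h(v): return 0.07 * np.exp(-v / 20.0)
def b_h(v): return 1.0 / (np.exp((30.0 - v) / 10.0) + 1.0)

dt, T = 0.01, 50.0                 # time in ms
v = 0.0
# start the gating variables at their resting equilibria a/(a+b)
n = a_n(v) / (a_n(v) + b_n(v))
m = a_m(v) / (a_m(v) + b_m(v))
h = a_h(v) / (a_h(v) + b_h(v))
vs = []
for _ in range(int(T / dt)):
    dv = (gK * n**4 * (vK - v) + gNa * m**3 * h * (vNa - v)
          + gL * (vL - v) + Iext) / C
    n += dt * (a_n(v) * (1 - n) - b_n(v) * n)
    m += dt * (a_m(v) * (1 - m) - b_m(v) * m)
    h += dt * (a_h(v) * (1 - h) - b_h(v) * h)
    v += dt * dv
    vs.append(v)
vs = np.array(vs)
# count upward crossings of 20 mV as spikes
n_spikes = int(np.sum((vs[1:] >= 20.0) & (vs[:-1] < 20.0)))
```

With Iext = 10, above the periodic-spiking threshold, the trace shows large-amplitude repetitive action potentials.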
Plotting v together with all concentrations shows that v and n are pretty well synchronized. For a better understanding of the dynamical properties of the system, it is therefore possible to reduce the number of variables by lumping together v and n, and also the sodium inactivation h and 1 − m, into one variable each. The resulting system is two-dimensional and called the FitzHugh-Nagumo system. We will study this two-dimensional system more closely in the subsequent chapters. The FitzHugh-Nagumo system is a mathematical idealization and its variables no longer are physiological quantities. On the other hand, the bifurcation of the four-dimensional Hodgkin-Huxley system can be illustrated and further understood in the simplified FitzHugh-Nagumo system by graphical methods.
Finally, let us also compare the typical phase-plot of the Hodgkin-Huxley system with some real neural data:
(from [H07]). The typical shape of the action potential is very well modeled in the Hodgkin-Huxley system, in contrast to the fluctuations that are due to the fluctuations in the ion channel concentrations. The question now is whether the ion channel fluctuations should be incorporated into the model or whether the deterministic Hodgkin-Huxley system already is sufficiently good. It turns out that the fluctuations have an impact on the action potential that has to be taken into account for a more appropriate statistical analysis of the membrane potential in real neural systems. There are a couple of important effects of these fluctuations on the membrane potential, among them:
- spontaneous spiking,
- time jitter, which means fluctuations in the velocity of the action potential,
- splitting up and annihilation of action potentials,
- propagation failure.
A numerical study of these effects has been carried out in [AAF07] and in [SS16] for the spatially extended analogue. Its implications on the minimal axon diameter required for faithful signal transmission have been investigated in [AAF05].
1.2.2. Integrate-and-fire models. Integrate-and-fire (IF) models describe the membrane potential only with the help of a one-dimensional dynamical system. Since periodic dynamical systems cannot be fully described with a first order ordinary differential equation, one has to incorporate a discontinuous reset mechanism as follows:
C dV/dt = I    if V ≤ Vth (Vth - threshold value)

and then reset V to a lower value V → Vreset. The interpretation is as follows: the membrane potential integrates up the input currents I up to a certain threshold value. Once it has reached this threshold, which can be thought of as a saturation value of the membrane potential, it starts an action potential, that is, it fires. Having fired, the membrane potential is reset to its resting value and integrates up the input again.
In most cases, the leak current through the membrane is denoted explicitly, and this leads to the so-called leaky IF model:
C dV/dt = −V/R + I    if V ≤ Vth .
The advantage of this model is its reduced complexity; its disadvantage of course is that it neglects all ion channels, and therefore its simulation results can only be interpreted statistically for the membrane potential.
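A sketch of the leaky IF model with reset, using illustrative parameters (not from the notes). For constant input with RI > Vth the interspike interval follows in closed form from the explicit solution of the linear equation, which gives a simple check of the simulation:

```python
import math

# Leaky integrate-and-fire with reset: C dV/dt = -V/R + I while
# V <= Vth, then V -> Vreset. All parameter values are illustrative.
C, R, I = 1.0, 10.0, 0.15
Vth, Vreset = 1.0, 0.0

dt, T = 0.001, 100.0
v, t = 0.0, 0.0
spike_times = []
while t < T:
    v += dt * (-v / R + I) / C
    t += dt
    if v >= Vth:                   # threshold reached: fire and reset
        spike_times.append(t)
        v = Vreset

# closed-form interspike interval for constant input with R*I > Vth
isi_theory = R * C * math.log(R * I / (R * I - Vth))
isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
mean_isi = sum(isis) / len(isis)
```

The simulated interspike interval agrees with R C log(RI/(RI − Vth)) up to the discretization error of the Euler step.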
1.3. Biological neural networks
The modeling of the activity of neural circuits, even in whole brain areas, needs to take into account the modeling of communication between neurons. Neurons communicate via their synapses, basically exchanging certain neurotransmitters. Given a single neuron, neurons that send input to the given one are therefore called presynaptic and neurons to which the given neuron sends its output are called postsynaptic. The precise underlying physiological process of exchange of the signal is complex (and also subject to noise) and it is different for every type of neuron. It is not our aim to lay out these details here; we only mention the fundamental distinction between chemical and electrical synapses.
The simpler type is the electrical synapse, where the membrane potentials of the presynaptic and the postsynaptic neuron communicate directly and roughly linearly:
C dvpost/dt = sum of currents + g(vpre − vpost) ,    g = coupling strength .
In the more complex case of chemical synapses the effect on the membrane potential of the postsynaptic neuron may be a complex nonlinear function
C dvpost/dt = sum of currents + g(vpre)(vE − vpost)

where vE denotes a certain equilibrium potential and g is a general function depending on the membrane potential of the presynaptic neuron.
One of the major open questions of neural systems, and of systems biology in general, is to establish a theory for the collective behavior of neural networks in terms of their local specifications, that is, the specification of the single neurons and their connections. Clearly, this would require some global rules, similar to the case of kinetic gas theory, where the global statistical behavior of a gas can be deduced from its local interactions using simple thermodynamical rules. The difficulty in biological systems in general, and in biological neural networks in particular, is to determine simple but nevertheless relevant global rules that are responsible for the rich observed phenomenology of these complex, highly nonlinear systems.
Let us give some interesting and important examples of collaborative behavior of neural systems in a very simple case of linearly coupled two-dimensional FitzHugh-Nagumo systems. We are given an N × N-grid
[Figure: an N × N grid with a node u(i, j) at each grid point.]
and on each grid point (i, j) the following two-dimensional FitzHugh-Nagumo system, linearly coupled to neighboring neurons:
(1.8)    dvij/dt = vij(1 − vij)(vij − a) − wij + (1/(2h))(vi+1,j − 2vij + vi−1,j) + (1/(2h))(vi,j+1 − 2vij + vi,j−1)
         dwij/dt = b(vij − a + wij)
Here, a ∈ (0, 1), b ∈ R and h ∼ 1/N. It turns out that for certain parameters and certain initial conditions the system exhibits remarkable collective behavior.
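A sketch of the coupled lattice (1.8) in Python. The boundary conditions, the parameter values and the initial excitation below are illustrative choices not specified in the notes (periodic boundaries via np.roll; b is taken negative so that the recovery variable w relaxes):

```python
import numpy as np

# Forward Euler for the coupled FitzHugh-Nagumo lattice (1.8).
# Parameters a, b, the horizon T and the initial bump are illustrative.
N = 16
a, b = 0.3, -0.05
h = 1.0 / N                        # h ~ 1/N as in the text
dt, T = 0.01, 5.0

V = np.zeros((N, N))
V[N // 2 - 2:N // 2 + 2, N // 2 - 2:N // 2 + 2] = 1.0   # localized excitation
W = np.zeros((N, N))

def lap(U):
    # nearest-neighbour coupling (U_{i+1,j} - 2 U_{ij} + U_{i-1,j}) plus
    # the analogous term in j, with periodic boundary conditions
    return (np.roll(U, 1, 0) + np.roll(U, -1, 0) - 2 * U
            + np.roll(U, 1, 1) + np.roll(U, -1, 1) - 2 * U)

for _ in range(int(T / dt)):
    dV = V * (1 - V) * (V - a) - W + lap(V) / (2 * h)
    dW = b * (V - a + W)
    V = V + dt * dV
    W = W + dt * dW
```

Plotting V over time (e.g. with matplotlib's imshow) shows how the localized excitation spreads over the grid through the coupling term.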
CHAPTER 2
Markov Chain Models of Ion Channels
As outlined in the Introduction, the activity of ion channels is intrinsically noisy. A widely accepted modeling approach in Computational Neuroscience is to use finite-state Markov chains in continuous time for the approximation of the statistics of the up- and down-states of single ion channels and subsequently also of networks of coupled ion channels.
2.1. Time-continuous Markov Chains
There are many textbooks containing an introduction to the theory of time-continuous Markov chains. A classical reference is [Nor97]. In the following we will introduce the basic elements of the theory that are needed for the modeling of ion channels and for the concepts of their diffusion approximations.
Let us denote with X(t), t ≥ 0, the state as a function of time and assume that it is given as a time-continuous Markov chain. That is, we require that the state space S is a discrete space, i.e., a countable set, and that X = (X(t))t≥0 is a family of random variables on an underlying probability space such that
• the trajectories t ↦ X(t)(ω) are piecewise constant and right-continuous,
• for 0 ≤ t1 ≤ t2 ≤ · · · ≤ tn ≤ tn+1 and i1, . . . , in+1 ∈ S the Markov property

P(X(tn+1) = in+1 | X(tn) = in, . . . , X(t1) = i1) = P(X(tn+1) = in+1 | X(tn) = in)

holds.
The Markov chain is called time-homogeneous if

P(X(t + s) = j | X(s) = i)

is independent of s, hence only depending on the length t of the time interval [s, s + t]. In the first section, we will consider time-homogeneous Markov chains only.
The distribution µ = P ∘ X(0)^{−1} of the initial state is called the initial distribution of X and
(2.1)    pij(t) := P(X(t) = j | X(0) = i) ,    i, j ∈ S, t ≥ 0 ,

are called the transition probabilities.
Lemma 2.1. For 0 = t0 ≤ t1 ≤ t2 ≤ · · · ≤ tn and i0, . . . , in ∈ S,

P(X(tn) = in, X(tn−1) = in−1, . . . , X(t1) = i1, X(t0) = i0) = µi0 pi0i1(t1 − t0) pi1i2(t2 − t1) · · · pin−1in(tn − tn−1) .
The above formula implies in particular that the joint law of (X(t0), . . . , X(tn)) is completely determined by the initial distribution and the transition probabilities of the Markov chain.
Proof. We will use induction on n. For n = 1 clearly

P(X(t1) = i1, X(t0) = i0) = P(X(t1) = i1 | X(t0) = i0) · P(X(t0) = i0) = pi0i1(t1 − t0) · µi0 .

Now suppose that the statement is proven for n. It then follows that

P(X(tn+1) = in+1, X(tn) = in, . . . , X(t0) = i0)
= P(X(tn+1) = in+1 | X(tn) = in, . . . , X(t0) = i0) · P(X(tn) = in, . . . , X(t0) = i0) .

By the Markov property the first factor equals P(X(tn+1) = in+1 | X(tn) = in) = pinin+1(tn+1 − tn), and by the induction hypothesis the second factor equals µi0 pi0i1(t1 − t0) pi1i2(t2 − t1) · · · pin−1in(tn − tn−1). Hence

P(X(tn+1) = in+1, . . . , X(t0) = i0) = µi0 pi0i1(t1 − t0) pi1i2(t2 − t1) · · · pinin+1(tn+1 − tn). □
In the following we denote by P(t) = (pij(t)) the matrix of transition probabilities. P(t) is a stochastic matrix, in fact a right-continuous semigroup of stochastic matrices in the sense of the following lemma.
Lemma 2.2. P(t), t ≥ 0, is a semigroup of matrices, i.e., P(t + s) = P(t)P(s), s, t ≥ 0, and right-continuous, i.e., lim_{s↓t} pij(s) = pij(t) for all i, j ∈ S.
Proof.

pij(t + s) = P(X(t + s) = j | X(0) = i)
= ∑_{k∈S} P(X(t + s) = j, X(t) = k | X(0) = i)
= ∑_{k∈S} P(X(t + s) = j | X(t) = k, X(0) = i) · P(X(t) = k | X(0) = i)
= ∑_{k∈S} pkj(s) pik(t) = (P(t)P(s))ij ,

where we used the Markov property and (2.1) in the last step. For the proof of the right-continuity note that the right-continuity of the sample paths t ↦ X(t)(ω) of the Markov chain implies 1{X(s)=j} → 1{X(t)=j} pointwise for s ↓ t, hence

pij(s) = P(X(s) = j | X(0) = i) = E(1{X(s)=j} | X(0) = i) → E(1{X(t)=j} | X(0) = i) = pij(t)

by Lebesgue's dominated convergence. □
It can be shown that pij is even differentiable and that there exists a matrix Q = (qij)_{i,j∈S} such that

(2.2)    (d/dt) P(t) = QP(t) .
Equation (2.2) is called Kolmogorov's backward equation; the matrix Q is called the generator of (X(t))t≥0 (see [Nor97]). It turns out that Q being the differential of a semigroup of stochastic matrices imposes some further properties on Q, which under an additional assumption (finite state space) are in fact also sufficient to generate such a semigroup.
Lemma 2.3. Q is a Q-matrix, i.e., it has the following two properties:

(a) qij ≥ 0 if i ≠ j
(b) ∑_{j∈S} qij = 0, in particular qii = −∑_{j∈S, j≠i} qij ≤ 0.

Conversely, if supi |qii| < ∞ for a given Q-matrix Q, its matrix exponential P(t) = e^{tQ}, t ≥ 0, defines a semigroup of stochastic matrices.
Proof. Since

qij = (d/dt) pij(t)|_{t=0} = lim_{t↓0} (1/t)(pij(t) − pij(0)) = lim_{t↓0} (1/t)(pij(t) − δij)

and since pij(t) ∈ [0, 1], it follows that qij ≥ 0 if i ≠ j (and qii ≤ 0). P(t)1 = 1 implies that

Q1 = (d/dt) P(t)1|_{t=0} = 0 , hence ∑j qij = 0 .

Conversely, suppose first that |S| < ∞ (which certainly implies supi |qii| < ∞): then

e^{tQ} = ∑_{k=0}^∞ (t^k/k!) Q^k

is well-defined, and Q1 = 0 implies that Q^k 1 = Q · · · Q1 = 0 (k ≥ 1) and therefore

e^{tQ}1 = ∑_{k=0}^∞ (t^k/k!) Q^k 1 = Q^0 1 = 1 .

In conclusion P(t)1 = 1. To see that pij(t) ≥ 0 note that for i ≠ j

P(t) = I + tQ + O(t^2) ,

hence pij(t) ≥ 0 for small t. We can find some δ > 0 such that pij(t) ≥ 0 for t ∈ [0, δ) and thus for any t ≥ 0

pij(t) = (P(t/n) · · · P(t/n))ij ≥ 0 ,

choosing n so large that t/n ≤ δ. □
2.1.1. General terminology for right-continuous stochastic processes. Given a stochastic process (X(t))t≥0 on a discrete set S with right-continuous trajectories t ↦ X(t)(ω) we can define the jump times J0, J1, . . . by

J0 := 0 ,    Jn+1 := inf{t ≥ Jn : X(t) ≠ X(Jn)} ,    n = 0, 1, 2, . . . ,
with the convention inf ∅ = +∞, and the holding times

Tn := Jn − Jn−1 if Jn−1 < ∞ , and Tn := ∞ otherwise.

Note that right-continuity implies Tn > 0 for all n. If Jn+1 = +∞ for some n, we define X(∞) := X(Jn). The discrete-time process (Yn)n≥0 given by

Yn := X(Jn) , n = 0, 1, 2, . . .

is called the jump process (or jump chain in the case where (X(t)) is Markovian).
[Figure: a piecewise constant, right-continuous sample path t ↦ X(t)(ω) with jump times J0(ω), J1(ω), J2(ω), J3(ω).]
2.1.2. Poisson process. The most important example of a Markov chain on a discrete state space is the Poisson process. A right-continuous stochastic process (X(t))t≥0 with values in {0, 1, 2, . . .} is called a Poisson process of rate λ, λ > 0, if its holding times T1, T2, . . . are independent exponential random variables of parameter λ and its jump chain is given by Yn = n. We will show below that (X(t))t≥0 is Markovian and its generator is
Q =
( −λ   λ
        −λ   λ
              ⋱   ⋱ )

i.e. qii = −λ, qi,i+1 = λ and qij = 0 otherwise.
The strong law of large numbers implies

Jn = ∑_{k=1}^n (Jk − Jk−1) = ∑_{k=1}^n Tk → +∞    P-a.s.
so that a Poisson process does not explode: it jumps only finitely often in any bounded time interval.
Theorem 2.4 (Markov property). Let X(t), t ≥ 0, be a Poisson process of rate λ. Then, for any s ≥ 0, (X(t + s) − X(s))t≥0 is again a Poisson process of rate λ, independent of {X(r) : r ≤ s}.
Proof. Let X̃(t) := X(t + s) − X(s), t ≥ 0. Then

{X(s) = i} = {Ji ≤ s < Ji+1} = {Ji ≤ s} ∩ {Ti+1 > s − Ji}

and on this event the holding times of X̃ are given by

T̃1 = Ti+1 − (s − Ji) , T̃2 = Ti+2 , T̃3 = Ti+3 , . . .

Since T1, T2, . . . are independent Exp(λ)-distributed, T̃2, T̃3, . . . are independent Exp(λ)-distributed too, and T̃1 = Ti+1 − (s − Ji) is Exp(λ)-distributed and independent of T1, . . . , Ti due to the memoryless property of the exponential distribution. Indeed, let
fλ(x) := λ e^{−λx} if x ≥ 0 , and fλ(x) := 0 otherwise,
be the density of the exponential distribution. Then

P(T̃1 ≥ t, T1 ≥ t1, . . . , Ti ≥ ti | X(s) = i) · P(X(s) = i)
= P(Ti+1 > t + s − (T1 + · · · + Ti), Ti ≥ ti, . . . , T1 ≥ t1, X(s) = i)
= ∫_{t1}^∞ fλ(s1) ∫_{t2}^∞ fλ(s2) · · · ∫_{ti}^∞ fλ(si) ∫_{t+s−(s1+···+si)}^∞ λ e^{−λsi+1} 1{s1+···+si ≤ s} dsi+1 dsi · · · ds1
= e^{−λt} ∫_{t1}^∞ fλ(s1) · · · ∫_{ti}^∞ fλ(si) e^{−λ(s−(s1+···+si))} 1{s1+···+si ≤ s} dsi · · · ds1
= e^{−λt} P(Ti+1 > s − (T1 + · · · + Ti), T1 ≥ t1, . . . , Ti ≥ ti, T1 + · · · + Ti ≤ s)
= e^{−λt} P(T1 ≥ t1, . . . , Ti ≥ ti | X(s) = i) · P(X(s) = i) ,

where the innermost integral equals e^{−λ(t+s−(s1+···+si))} and e^{−λ(s−(s1+···+si))} = P(Ti+1 > s − (s1 + · · · + si)). □
How does Theorem 2.4 imply the Markov property? Well, for given 0 = t0 ≤ . . . ≤ tn+1 and i0, . . . , in+1, we can conclude from the theorem that

P(X(tn+1) = in+1 | X(tn) = in, . . . , X(t0) = i0)
= P(X(tn+1) − X(tn) = in+1 − in | X(tn) = in, . . . , X(t0) = i0)
= P(X(tn+1 − tn) = in+1 − in)    (independence; X(· + tn) − X(tn) is again Poisson with rate λ)
= P(X(tn+1) = in+1 | X(tn) = in) .
Proposition 2.5 (Distribution of X(t)).

(i) P(X(t + s) − X(s) = k) = e^{−λt} (λt)^k/k! , k = 0, 1, 2, . . .
(ii) For t0 ≤ t1 ≤ . . . the increments X(ti+1) − X(ti) are independent Poisson random variables of parameter λ(ti+1 − ti).
Proof. (i) W.l.o.g. s = 0 (Theorem 2.4!). Then

P(X(t) − X(0) = k) = P(T1 + · · · + Tk ≤ t, T1 + · · · + Tk+1 > t) .

Since T1 + · · · + Tk ∼ Γ(k, λ) and Tk+1 is independent of T1 + · · · + Tk, this equals

(λ^{k+1}/(k − 1)!) ∫_0^∞ ∫_0^∞ 1{u ≤ t} u^{k−1} e^{−λu} e^{−λv} 1{u + v > t} du dv
= (λ^k/(k − 1)!) ∫_0^t u^{k−1} du · e^{−λt} = e^{−λt} (λt)^k/k! ,

using ∫_0^t u^{k−1} du = t^k/k.
(ii) By induction on n:

P(X(tn+1) − X(tn) = kn+1, . . . , X(t1) − X(t0) = k1)
= P(X(tn+1) − X(tn) = kn+1 | X(tn) − X(tn−1) = kn, . . . , X(t1) − X(t0) = k1) · P(X(tn) − X(tn−1) = kn, . . . , X(t1) − X(t0) = k1)
= P(X(tn+1) − X(tn) = kn+1) · P(X(tn) − X(tn−1) = kn, . . . , X(t1) − X(t0) = k1)    (Theorem 2.4)
= Poiss(λ(tn+1 − tn))(kn+1) · . . . · Poiss(λ(t1 − t0))(k1)    (induction hypothesis). □
Once we have identified the distribution of X(t), we can now calculate the entries

qij = (d/dt) pij(t)|_{t=0} = (d/dt) Poiss(λt)(j − i)|_{t=0}

of the generator matrix Q of (X(t)):

qij = 0 if j < i,
qii = (d/dt) e^{−λt}|_{t=0} = −λ,
qi,i+1 = (d/dt) (e^{−λt} λt)|_{t=0} = λ,
qi,i+k = (d/dt) (e^{−λt} (λt)^k/k!)|_{t=0} = 0 if k ≥ 2.
2.1.3. Construction/Simulation of Poisson processes. Consider a probability space with independent Exp(λ)-random variables T1, T2, . . . and a r.v. Y0 ∼ µ for a given starting distribution µ on {0, 1, 2, . . .}, independent of T1, T2, . . . Let J0 = 0 and Jk := T1 + · · · + Tk. Then

X(t) := Y0 + ∑_{k=1}^∞ 1{Jk ≤ t}

is a Poisson process of rate λ (with initial distribution µ).
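The construction above translates directly into code: cumulative sums of independent Exp(λ) variables give the jump times, and X(t) counts how many of them lie below t. The rate and time horizon below are illustrative; the sample mean and variance of X(t) can then be compared with the Poisson(λt) law of Proposition 2.5:

```python
import numpy as np

# Construction of the Poisson process from its holding times
# (Y_0 = 0 here). Rate and horizon are illustrative choices.
rng = np.random.default_rng(3)
lam, t_end, M = 2.0, 5.0, 20000    # M independent realizations

# 60 holding times per path is far more than the ~10 jumps
# expected by t = 5, so truncating the sum is harmless here
T = rng.exponential(1.0 / lam, size=(M, 60))
J = np.cumsum(T, axis=1)           # jump times J_1, J_2, ...
X_t = (J <= t_end).sum(axis=1)     # X(t_end) for each realization

mean_X = X_t.mean()
var_X = X_t.var()
# Proposition 2.5: X(t) ~ Poisson(lam * t), so mean = var = 10 here
```

Both the sample mean and the sample variance come out close to λt = 10, as the Poisson law predicts.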
2.1.4. Birth processes. Instead of homogeneous rates, we can also consider state-dependent rates αn. A right-continuous stochastic process X(t) is called a
birth process if the holding times are independent Exp(αn)-random variables and the jump chain is Yn = Y0 + n. The generator is given by

Q =
( −α0   α0
         −α1   α1
                ⋱   ⋱ ) .

2.1.5. Birth and Death processes. The holding times are independent exponential r.v. ∼ Exp(αn + βn), and the jump chain (Yn) is a Markov chain with transition matrix
Πij = αi/(αi + βi) if j = i + 1 ("jump up"),
Πij = βi/(αi + βi) if j = i − 1 ("jump down"),
Πij = 0 otherwise,

for i, j ∈ Z.
Therefore the generator is given as

Qij = αi if j = i + 1 ,    Qij = βi if j = i − 1 ,    Qii = −(αi + βi) ,    Qij = 0 otherwise.
2.1.6. General structure of continuous-time Markov chains. For simplicity we state the following theorem for finite state spaces only. For general countable state spaces, see [Nor97], Section 2.8.
Theorem 2.6. Let X(t), t ≥ 0, be a right-continuous stochastic process on a finite set S. Let Q be a Q-matrix on S with associated jump matrix

πij = −qij/qii if i ≠ j and qii ≠ 0 ,    πij = 0 if i ≠ j and qii = 0 ,
πii = 0 if qii ≠ 0 ,    πii = 1 if qii = 0 .
Then the following three conditions are equivalent:
(a) Conditioned on Y0 = i, the jump chain (Yn)n≥0 of (X(t))t≥0 is a time-discrete Markov chain with transition matrix Π = (πij)i,j∈S and, conditioned on Y0, Y1, . . . , Yn−1, the holding times T1, . . . , Tn are independent ∼ Exp(q(Y0)), . . . , Exp(q(Yn−1)), where q(i) = −qii.
(b) For all t, h ≥ 0, conditioned on X(t) = i, X(t + h) is independent of {X(s) : s ≤ t} and

P(X(t + h) = j | X(t) = i) = δij + qij h + o(h).
(c) For all t0 ≤ t1 ≤ · · · ≤ tn+1, i0, i1, . . . , in+1 ∈ S
P (X(tn+1) = in+1 | X(tn) = in, . . . , X(t0) = i0) = pinin+1(tn+1 − tn),
where P (t) = (pij(t)) solves
P ′(t) = QP (t), P (0) = I.
A proof can be found in [Nor97], Section 2.8.
2.1.7. Construction/Simulation of time-continuous Markov chains. Construct a jump chain Y0, Y1, . . . with initial distribution µ for Y0 and transition matrix Π. Let T1, T2, . . . be independent Exp(1)-r.v., independent of Y0, Y1, . . . Then define the jump times

Jk = T1/q(Y0) + · · · + Tk/q(Yk−1) .

Then X(t) = Yk for Jk ≤ t < Jk+1 gives the desired Markov chain.
The above construction provides the following algorithm for the numerical simulation: given state X(t0) = i, draw the holding time T according to the Exp(q(i))-distribution and then choose the next state j ≠ i with probability πij. This kind of event-based algorithm is called Gillespie's algorithm and should be the preferred method to simulate Markov chains.
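A sketch of Gillespie's algorithm for a general finite-state chain specified by its Q-matrix, applied to the two-state channel from the Introduction with illustrative rates α = 1, β = 2. As a sanity check, the long-run fraction of time spent in the open state is compared with the stationary value α/(α + β):

```python
import numpy as np

# Gillespie's algorithm: draw an Exp(q(i)) holding time in the current
# state, then jump according to the jump matrix pi. Rates illustrative.
rng = np.random.default_rng(4)
alpha, beta = 1.0, 2.0
Q = np.array([[-alpha, alpha],
              [beta, -beta]])

q = -np.diag(Q)                    # holding rates q(i) = -q_ii
Pi = Q / q[:, None]
np.fill_diagonal(Pi, 0.0)          # jump matrix pi_ij = -q_ij / q_ii

T_end, t, state = 5000.0, 0.0, 0
occupation = np.zeros(2)           # time spent in each state
while t < T_end:
    dwell = rng.exponential(1.0 / q[state])
    dwell = min(dwell, T_end - t)  # truncate the last holding period
    occupation[state] += dwell
    t += dwell
    state = rng.choice(2, p=Pi[state])

frac_open = occupation[1] / T_end  # long-run fraction of time open
```

Over a long horizon the occupation fraction of the open state approaches α/(α + β) = 1/3, in line with the stationary distribution discussed later in the chapter.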
2.2. The Martingale structure of Markov chains
Continuous-time martingales are generalizations of stochastic processes with independent increments, like e.g. the Poisson process. More precisely, let Ft, t ≥ 0, be a filtration, i.e. an increasing family of sub-σ-algebras on the underlying probability space. Ft is interpreted as the information available at time t.
A stochastic process M = (M(t))t≥0 is called a martingale w.r.t. this filtration if it is adapted, i.e. M(t) is Ft-measurable, integrable w.r.t. the underlying probability measure, and

(2.3)    M(t) = E(M(t + s) | Ft) for s, t ≥ 0 .
The basic theory of martingales is outlined in Appendix A and is very useful for the analysis of general stochastic processes. To make use of this theory in the context of continuous-time Markov chains, one first has to identify a (large) class of martingales for a given Markov chain.
To this end fix a continuous-time Markov chain X = (X(t))t≥0 and denote by

Ft := σ(X(s) | s ≤ t)

the filtration generated by X.
Theorem 2.7. Let f : S → R be any bounded function. Then

(2.4)    f(X(t)) = f(X(0)) + Mf(t) + ∫_0^t Qf(X(s)) ds ,    t ≥ 0 ,

where

Mf(t) := f(X(t)) − f(X(0)) − ∫_0^t Qf(X(s)) ds ,    t ≥ 0 ,

is a right-continuous martingale w.r.t. (Ft)t≥0 with

E(Mf(t)^2) = E(∫_0^t (Q(f^2) − 2fQf)(X(s)) ds) = ∫_0^t E(∑_{j∈S} qX(s)j (f(X(s)) − f(j))^2) ds .
The decomposition (2.4) is called the semimartingale decomposition of the process f(X(t)), since it decomposes it into a martingale and a process of bounded variation, ∫_0^t Qf(X(s)) ds. Before we give the rather short and simple proof, let us first state a useful corollary.
Corollary 2.8. Suppose that P(X(0) = i0) = 1 for some initial state i0 ∈ S. Recall that P(t), t ≥ 0, denotes the transition semigroup associated with X. Then

E(Mf(t)^2) = ∫_0^t ∑_{i,j∈S} pi0i(s) qij (f(i) − f(j))^2 ds .
Proof. Let us first verify the martingale property of Mf(t). There is no loss of generality in assuming that P(X(0) = i0) = 1 for some initial state i0 ∈ S. The Markov property then implies that for any bounded function g : S → R

(2.5)    E(g(X(t + s)) | Fs) = P(t)g(X(s))

in the sense that the right-hand side

P(t)g(X(s)) = ∑_{j∈S} pX(s)j(t) g(j)

is a version of the conditional expectation E(g(X(t + s)) | Fs). Indeed, the Markov property implies that E(g(X(t + s)) | Fs) = E(g(X(t + s)) | X(s)) and

E(g(X(t + s)) | X(s) = i) = ∑_{j∈S} E(g(X(t + s)) 1{X(t+s)=j} | X(s) = i) = ∑_{j∈S} g(j) P(X(t + s) = j | X(s) = i) = ∑_{j∈S} g(j) pij(t) = P(t)g(i) .
Using (d/dt)P(t) = QP(t) = P(t)Q, the fundamental theorem of calculus implies that

P(t)g(i) − g(i) = ∫_0^t QP(s)g(i) ds = ∫_0^t P(s)Qg(i) ds = ∫_0^t E(Qg(X(s)) | X(0) = i) ds = E(∫_0^t Qg(X(s)) ds | X(0) = i) .
From this identity it then follows that

(2.6)    E(f(X(t + s)) − f(X(s)) | Fs) = P(t)f(X(s)) − f(X(s)) = E(∫_s^{t+s} Qf(X(r)) dr | X(s))

which implies the martingale property:

E(f(X(t + s)) − ∫_0^{t+s} Qf(X(r)) dr | Fs)
= f(X(s)) + E(∫_s^{t+s} Qf(X(r)) dr | X(s)) − E(∫_0^{t+s} Qf(X(r)) dr | Fs)
= f(X(s)) − ∫_0^s Qf(X(r)) dr .
To derive the representation of the L^2-norm we conclude similarly that

E(Mf(t)^2) = E((f(X(t)) − f(X(0)))^2 − 2 (f(X(t)) − f(X(0))) ∫_0^t Qf(X(s)) ds + (∫_0^t Qf(X(s)) ds)^2)
= E((f(X(t)) − f(X(0)))^2 + 2f(X(0)) ∫_0^t Qf(X(s)) ds − 2 ∫_0^t f(X(s)) Qf(X(s)) ds)
= E(∫_0^t (Q(f^2) − 2fQf)(X(s)) ds)

(for the last step apply (2.4) to f^2 and use E(f(X(0)) Mf(t)) = 0), using

E((∫_0^t Qf(X(s)) ds)^2) = 2 ∫_0^t ∫_s^t E(P(u − s)Qf(X(s)) · Qf(X(s))) du ds
= 2 ∫_0^t E((P(t − s)f(X(s)) − f(X(s))) Qf(X(s))) ds
= 2 E(f(X(t)) ∫_0^t Qf(X(s)) ds − ∫_0^t f(X(s)) Qf(X(s)) ds) .
This proves the first equality. For the proof of the second equality note that for any state i ∈ S

Q(f^2)(i) − 2f(i)Qf(i) = ∑_{j∈S} qij (f^2(j) − 2f(i)f(j)) = ∑_{j∈S} qij (f^2(j) − 2f(i)f(j) + f^2(i)) = ∑_{j∈S} qij (f(i) − f(j))^2 ,

where we used ∑_{j∈S} qij = 0 in the second step. □
The use of the martingale structure will become apparent in the followingsection.
2.3. Diffusion approximation of Markov chains
Given a large number of ion channels regulating the membrane potential, a detailed simulation of all the individual dynamics becomes increasingly complex and time-consuming, and the statistics of single observables, e.g. the fraction of open channels, become difficult to extract. It is therefore desirable to find methods for reducing the full Markov chain to lower-dimensional stochastic processes. One important method to achieve this goal is the diffusion approximation, which aims to approximate the distribution of a single observable, or finitely many of them, by a one-dimensional or finite-dimensional diffusion process.
To illustrate the method, let us start with a motivating example: the approximation of a large number of independent two-state Markov chains with transition diagram
\[0\ \underset{\beta}{\overset{\alpha}{\rightleftarrows}}\ 1\]
for fixed strictly positive rates $\alpha$ and $\beta$. Here, "0" stands for the closed state and "1" for the open state of the respective ion channel. Let $X_1(t),\dots,X_N(t)$ be independent Markov chains of the above type, so that
\[S_N(t):=X_1(t)+\dots+X_N(t)\]
is the number of open channels at time $t$. We would like to derive an approximation of its distribution. In principle, this could be done explicitly (see Subsection 2.3.3 below). Instead, we will introduce a more conceptual approach that can be generalized to other Markov chain models.
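A single two-state channel of this type can be simulated exactly by alternating exponential holding times: in state 0 wait an $\mathrm{Exp}(\alpha)$ time before opening, in state 1 wait an $\mathrm{Exp}(\beta)$ time before closing. The following Python sketch (the rates and the time horizon are illustrative choices, not taken from the text) samples one trajectory and estimates the long-run fraction of time spent open, which should approach $\alpha/(\alpha+\beta)$:

```python
import random

def simulate_channel(alpha, beta, t_end, state=0, rng=None):
    """Exact simulation of one two-state channel on [0, t_end].

    Returns the list of (jump_time, new_state) pairs and the total
    time spent in the open state 1.
    """
    rng = rng or random.Random(0)
    t, open_time, jumps = 0.0, 0.0, []
    while True:
        rate = alpha if state == 0 else beta   # rate of leaving the current state
        dt = rng.expovariate(rate)             # Exp(rate) holding time
        if t + dt >= t_end:
            if state == 1:
                open_time += t_end - t         # channel stays open until t_end
            return jumps, open_time
        if state == 1:
            open_time += dt
        t += dt
        state = 1 - state                      # flip closed <-> open
        jumps.append((t, state))

jumps, open_time = simulate_channel(alpha=2.0, beta=1.0, t_end=2000.0)
# the long-run open fraction should be close to alpha/(alpha+beta) = 2/3
print(round(open_time / 2000.0, 2))
```

The same alternating-exponential mechanism underlies the jump-chain/holding-time description used throughout this chapter.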
To this end let us first compute the generator matrix of the full Markov chain $X=(X_1,\dots,X_N)$. Its state space is $\{0,1\}^N$, and a state $i=(i_1,\dots,i_N)$ of the chain is an $N$-tuple of $0$-$1$ entries, with $i_k=1$ if channel $k$ is open and $i_k=0$ if it is closed. $X$ can only jump from a state $i$ to a state $j$ if $i$ and $j$ differ in exactly one position $k$, which means that ion channel $k$ changes its state. Therefore the only non-zero off-diagonal entries of the generator matrix $Q=(q_{ij})$ are given as
\[q_{ij}=\begin{cases}\alpha&\text{if }j-i=e_k\\ \beta&\text{if }i-j=e_k\,.\end{cases}\]
Here, $e_k$ denotes the unit vector $e_k(l)=\delta_{kl}$ in $\mathbb{R}^N$ pointing in the direction of the $k$-th coordinate.
It follows for the diagonal entries that
\[q_{ii}=-\sum_{j\neq i}q_{ij}=-\sum_{k=1}^N\big(i_k\beta+(1-i_k)\alpha\big)=-\big(\alpha N+(\beta-\alpha)s_N(i)\big)\,,\]
where $s_N(i):=\sum_k i_k$ is simply the number of nonzero entries, i.e. of open channels, in the state $i$. The number $S_N(t)$, as a functional of $X$, can thus be written as $S_N(t)=s_N(X(t))$.
We can now compute
\[(2.7)\qquad Qs_N(i)=\sum_j q_{ij}\,s_N(j)=\sum_j q_{ij}\,\big(s_N(j)-s_N(i)\big)=\sum_{k:i_k=0}\alpha-\sum_{k:i_k=1}\beta=\alpha N-(\alpha+\beta)s_N(i)\,.\]
Due to the general martingale structure, we therefore obtain that
\[M_N(t):=S_N(t)-S_N(0)-\int_0^t\alpha N-(\alpha+\beta)S_N(s)\,ds\,,\quad t\ge0\,,\]
is a martingale (with respect to the natural filtration generated by the underlying Markov chain $X$). This implies in particular that
\[m_N(t):=E(S_N(t))=E(M_N(t))+E(S_N(0))+\int_0^t\alpha N-(\alpha+\beta)E(S_N(s))\,ds=m_N(0)+\int_0^t\alpha N-(\alpha+\beta)m_N(s)\,ds\,.\]
In particular, $m_N$ is differentiable in $t$, and so is
\[p_N(t):=E\Big(\frac1N S_N(t)\Big)=\frac1N m_N(t)\,,\quad t\ge0\,,\]
with derivative
\[(2.8)\qquad\frac{dp_N}{dt}(t)=\alpha-(\alpha+\beta)p_N(t)\]
(compare with (1.2)). One remark is in order here: since $S_N$ is a linear functional of the Markov chain, equation (2.8) exactly coincides with (1.2); this need not be the case for nonlinear functionals.
$p_N(t)$ only describes the actual fraction of open ion channels $S_N(t)/N$ in the mean. However, we even have the following law of large numbers:
2.3.1. Law of Large Numbers.

Theorem 2.9. Suppose that the initial condition $S_N(0)$ of open channels is such that
\[\lim_{N\to\infty}\frac{S_N(0)}{N}=p_0\quad\text{in }L^2(P)\,.\]
Then
\[\lim_{N\to\infty}\frac{S_N(t)}{N}=p(t)\quad\text{in }L^2(P)\,,\]
where $p(t)$ is the solution of the ordinary differential equation
\[\dot p(t)=-(\alpha+\beta)p(t)+\alpha\,,\qquad p(0)=p_0\,.\]
Proof. (2.7) implies that
\[(2.9)\qquad\frac{S_N}{N}(t)-p(t)=\underbrace{\frac{S_N}{N}(0)-p(0)}_{=:I_1(t)}+M_N(t)+\underbrace{\int_0^t-(\alpha+\beta)\Big(\frac{S_N(s)}{N}-p(s)\Big)\,ds}_{=:I_2(t)}\,,\]
where
\[M_N(t)=\frac{S_N(t)-S_N(0)}{N}-\frac1N\int_0^t\alpha N-(\alpha+\beta)S_N(s)\,ds\]
is a martingale with
\[E\big(M_N(t)^2\big)=\int_0^t E\Big(\sum_{j\in S}q_{X(s)\,j}\Big(\frac{s_N(X(s))}{N}-\frac{s_N(j)}{N}\Big)^2\Big)\,ds\le t\,\frac{\alpha+\beta}{N}\]
using Theorem 2.7 and
\[\sum_{j\in S}q_{X(t)\,j}\Big(\frac{s_N(X(t))}{N}-\frac{s_N(j)}{N}\Big)^2=\frac1{N^2}\sum_{k:X_k(t)=0}\alpha+\frac1{N^2}\sum_{k:X_k(t)=1}\beta\le\frac{\alpha+\beta}{N}\,.\]
2.3. DIFFUSION APPROXIMATION OF MARKOV CHAINS 21
We can therefore estimate
\begin{align*}
E\Big(\Big(\frac{S_N(t)}{N}-p(t)\Big)^2\Big)^{\frac12}&=E\big((I_1(t)+M_N(t)+I_2(t))^2\big)^{\frac12}\\
&\le E\big(I_1(t)^2\big)^{\frac12}+E\big(M_N(t)^2\big)^{\frac12}+E\big(I_2(t)^2\big)^{\frac12}\\
&\le\Big(E\big(I_1(t)^2\big)^{\frac12}+\sqrt{t\,\frac{\alpha+\beta}{N}}\Big)+(\alpha+\beta)\int_0^t E\Big(\Big(\frac{S_N(s)}{N}-p(s)\Big)^2\Big)^{\frac12}ds\,.
\end{align*}
Now Gronwall's Lemma (see below) implies that
\[E\Big(\Big(\frac{S_N(t)}{N}-p(t)\Big)^2\Big)^{\frac12}\le\Big(E\big(I_1(t)^2\big)^{\frac12}+\sqrt{t\,\frac{\alpha+\beta}{N}}\Big)e^{(\alpha+\beta)t}\,.\]
The assumption on the initial condition finally yields
\[\lim_{N\to\infty}E\Big(\Big(\frac{S_N(t)}{N}-p(t)\Big)^2\Big)=0\,.\qquad\square\]
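The law of large numbers is easy to observe numerically. Since $S_N$ is itself a birth-death chain with upward rate $\alpha(N-S_N)$ and downward rate $\beta S_N$, a single Gillespie loop suffices; the rates, the time, and $N$ in the following Python sketch are illustrative choices:

```python
import math
import random

def simulate_sn(N, alpha, beta, t_end, s0=0, seed=1):
    """Exact (Gillespie) simulation of the open-channel count S_N up to t_end."""
    rng = random.Random(seed)
    t, s = 0.0, s0
    while True:
        up, down = alpha * (N - s), beta * s   # birth and death rates
        total = up + down
        t += rng.expovariate(total)            # waiting time to the next jump
        if t >= t_end:
            return s                           # state just before t_end
        s += 1 if rng.random() < up / total else -1

alpha, beta, t_end, N = 1.0, 2.0, 3.0, 10_000
p_inf = alpha / (alpha + beta)
# solution of dp/dt = alpha - (alpha+beta) p with p(0) = 0:
p_t = p_inf * (1.0 - math.exp(-(alpha + beta) * t_end))
empirical = simulate_sn(N, alpha, beta, t_end) / N
print(abs(empirical - p_t))   # fluctuations are O(1/sqrt(N)), cf. the CLT in 2.3.2
```

For $N=10{,}000$ the deviation from the ODE solution is of order $10^{-2}$ or smaller, consistent with the $1/\sqrt N$ fluctuations quantified in the next subsection.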
Lemma 2.10 (Gronwall's inequality). Let $\alpha,\beta,g:[0,T]\to\mathbb{R}$, $\alpha,\beta$ integrable, $\beta\ge0$, $g$ continuous, and
\[(2.10)\qquad g(t)\le\alpha(t)+\int_0^t\beta(s)g(s)\,ds\quad\forall t\in[0,T]\,.\]
Then
\[(2.11)\qquad g(t)\le\alpha(t)+\int_0^t\alpha(s)\beta(s)\,e^{\int_s^t\beta(r)\,dr}\,ds\quad\forall t\in[0,T]\,.\]
In particular:
• if $\beta(s)\equiv\beta$, then $g(t)\le\alpha(t)+\beta\int_0^t\alpha(s)\,e^{\beta(t-s)}\,ds$;
• if $\alpha(s)\equiv\alpha$ and $\beta(s)\equiv\beta$, then $g(t)\le\alpha\,e^{\beta t}$.
Proof. Define
\[H(t):=\exp\Big(-\int_0^t\beta(r)\,dr\Big)\int_0^t\beta(s)g(s)\,ds\,.\]
Then
\begin{align*}
H'(t)&=-\beta(t)H(t)+\beta(t)g(t)\exp\Big(-\int_0^t\beta(r)\,dr\Big)\\
&=\beta(t)\exp\Big(-\int_0^t\beta(r)\,dr\Big)\underbrace{\Big(g(t)-\int_0^t\beta(s)g(s)\,ds\Big)}_{\le\alpha(t)}\le\alpha(t)\beta(t)\exp\Big(-\int_0^t\beta(r)\,dr\Big)\,,
\end{align*}
hence
\[H(t)=\int_0^t H'(r)\,dr\le\int_0^t\alpha(s)\beta(s)\exp\Big(-\int_0^s\beta(r)\,dr\Big)\,ds\]
and therefore
\[\int_0^t\beta(s)g(s)\,ds\le\int_0^t\alpha(s)\beta(s)\exp\Big(\int_s^t\beta(r)\,dr\Big)\,ds\,.\]
Inserting the last inequality into (2.10) yields the inequality (2.11). $\square$
Remark 2.11. It is worth noticing that up to this point we have not really used the fact that $M_N$ is a martingale, but only the estimate on its $L^2$-norm, which simply follows from the Markovian structure. We will, however, need the martingale structure in the central limit theorem, which will give us the fluctuations in the above convergence.
2.3.2. The Central Limit Theorem. For the second order correction we need to rescale the martingales by the factor $\sqrt N$ to keep the variance constant. Hence, from now on let us consider
\[(2.12)\qquad M_N(t):=\frac1{\sqrt N}\Big(S_N(t)-S_N(0)-\int_0^t QS_N(s)\,ds\Big)\,.\]
We will use the following generalization of the central limit theorem to martingales, adapted from [EK84]:

Theorem 2.12. For $n=1,2,\dots$, let $(\mathcal{F}^n_t)_{t\ge0}$ be a filtration and $(M_n(t))_{t\ge0}$ be an $(\mathcal{F}^n_t)_{t\ge0}$-martingale with right-continuous sample paths, having left limits at $t>0$ and starting at $0$, i.e. $M_n(0)=0$, such that
\[\lim_{n\to\infty}E\Big(\sup_{0\le s\le t}|M_n(s)-M_n(s-)|\Big)=0\,.\]
Assume that there exist nonnegative, nondecreasing, $(\mathcal{F}^n_t)_{t\ge0}$-adapted processes $(A_n(t))_{t\ge0}$ such that
\[M_n^2(t)-A_n(t)\,,\quad t\ge0\,,\]
is an $(\mathcal{F}^n_t)_{t\ge0}$-martingale and that
\[\lim_{n\to\infty}A_n(t)=\int_0^t\sigma^2(s)\,ds\quad\text{in probability}\]
for some deterministic function $\sigma:[0,\infty)\to\mathbb{R}$. Then
\[\lim_{n\to\infty}M_n(t)=\int_0^t\sigma(s)\,dW(s)\,,\quad t\ge0\,,\]
weakly on the Skorohod space $D[0,\infty)$. Here, $(W(t))_{t\ge0}$ is a 1-dimensional Brownian motion.
The Skorohod space
\[D([0,\infty)):=\{\omega:[0,\infty)\to\mathbb{R}\mid\omega\text{ right-continuous, having left limits for }t>0\}\]
is the natural state space for time-continuous Markov chains. It can be endowed with the following metric:
\[d(\omega,\tilde\omega):=\inf_{\lambda\in\Lambda}\Big(\|\lambda\|+\sup_{t\ge0}e^{-t}\,|\omega(t)-\tilde\omega(\lambda(t))|\Big)\,,\]
where
\[\Lambda:=\{\lambda:[0,\infty)\to[0,\infty)\mid\lambda(0)=0,\ \lambda\text{ increasing}\}\,,\qquad\|\lambda\|:=\sup_{s,t\ge0,\,s\neq t}\Big|\log\frac{\lambda(t)-\lambda(s)}{t-s}\Big|+\sup_{t\ge0}|\lambda(t)-t|\,.\]
With respect to this metric, the space is a complete separable metric space with the step functions densely contained. More details on this space can be found in [EK84], Chapter 3.
To apply the theorem in the following to the stochastic process
\[S^*_N(t):=\sqrt N\,\Big(\frac{S_N(t)}{N}-p(t)\Big)\,,\quad t\ge0\,,\]
we first need to find the semimartingale decomposition of $M_N^2(t)$, $t\ge0$, where $(M_N(t))_{t\ge0}$ is given in (2.12). Using the martingale property we have that
\begin{align*}
E\big(M_N(t)^2\mid\mathcal{F}_s\big)&=E\big((M_N(t)-M_N(s))^2\mid\mathcal{F}_s\big)+M_N(s)^2\\
&=\frac1N\,E\Big(\Big(S_N(t)-S_N(s)-\int_s^t QS_N(r)\,dr\Big)^2\,\Big|\,\mathcal{F}_s\Big)+M_N(s)^2\,.
\end{align*}
Theorem 2.7 now implies that
\begin{align*}
&E\Big(\Big(S_N(t)-S_N(s)-\int_s^t QS_N(r)\,dr\Big)^2\,\Big|\,\mathcal{F}_s\Big)\\
&\quad=E\Big(\int_s^t\sum_{j\in S}q_{S_N(r)\,j}\,(S_N(r)-j)^2\,dr\,\Big|\,\mathcal{F}_s\Big)\\
&\quad=E\Big(\int_0^t\sum_{j\in S}q_{S_N(r)\,j}\,(S_N(r)-j)^2\,dr-\int_0^s\sum_{j\in S}q_{S_N(r)\,j}\,(S_N(r)-j)^2\,dr\,\Big|\,\mathcal{F}_s\Big)
\end{align*}
using the Markov property, so that
\[M_N(t)^2-\frac1N\int_0^t\sum_{j\in S}q_{S_N(r)\,j}\,(S_N(r)-j)^2\,dr\]
is a martingale.
is a martingale.We will simplify the exposition a little bit and assume that the initial condition
p0 of the limiting ordinary differential equation is given as the equilibrium pointp0 = α
α+β , so that p(t) ≡ p0 = αα+β for all t ≥ 0. Then
S∗N (t) =√N
(SN (t)
N− α
α+ β
)
and
\[\frac1N\int_0^t\sum_{j\in S}q_{S_N(r)\,j}\,(S_N(r)-j)^2\,dr=\int_0^t\sum_{k:X_k(r)=0}\frac{\alpha}{N}+\sum_{k:X_k(r)=1}\frac{\beta}{N}\,dr=\int_0^t\alpha\Big(1-\frac{S_N(r)}{N}\Big)+\beta\,\frac{S_N(r)}{N}\,dr\ \longrightarrow\ 2t\,\frac{\alpha\beta}{\alpha+\beta}\]
in $L^2(P)$, hence in particular in probability, under the assumptions of the law of large numbers, Theorem 2.9.
Since also
\[|M_N(t)-M_N(t-)|\le\frac1{\sqrt N}\,,\]
the central limit theorem for martingales, Theorem 2.12, can be applied and we obtain that
\[\lim_{N\to\infty}M_N(t)=\sqrt{\frac{2\alpha\beta}{\alpha+\beta}}\,W(t)\]
weakly on the Skorohod space $D[0,\infty)$. In the martingale decomposition
\[S^*_N(t)=S^*_N(0)+M_N(t)+\int_0^t QS^*_N(s)\,ds=S^*_N(0)+M_N(t)-(\alpha+\beta)\int_0^t S^*_N(s)\,ds\]
it therefore remains to prove the weak convergence of $S^*_N(t)$, $t\ge0$, at least along some subsequence.
To this end we will use the following tightness criterion for stochastic processeson the Skorohod space adapted from [EK84], combining Theorem 8.6 and Theorem8.8 of Chapter 3:
Theorem 2.13. Let $X_n(t)$ be a sequence of stochastic processes on $\mathbb{R}^d$ having right-continuous sample paths with left limits for $t>0$. Assume the following conditions hold:
(a) $\exists\,\gamma_0>0:\ \sup_{n\ge1}\sup_{t\le T}E\big(\|X_n(t)\|^{\gamma_0}\big)<\infty$;
(b) $\exists\,C,\ \exists\,\gamma_1>0,\ \gamma_2>1$ such that
\[\sup_{n\ge1}\sup_{t\le T}E\big(\|X_n(t+2h)-X_n(t+h)\|^{\gamma_1}\,\|X_n(t+h)-X_n(t)\|^{\gamma_1}\big)\le Ch^{\gamma_2}\,.\]
Then the family of distributions $P\circ X_n^{-1}$, $n\ge1$, is tight on the Skorohod space $D([0,\infty))$.
Condition (a) follows from the martingale decomposition, since
\begin{align*}
E\big(S^*_N(t)^2\big)^{\frac12}&\le E\big(S^*_N(0)^2\big)^{\frac12}+E\big(M_N(t)^2\big)^{\frac12}+(\alpha+\beta)\int_0^t E\big(S^*_N(s)^2\big)^{\frac12}\,ds\\
&\le E\big(S^*_N(0)^2\big)^{\frac12}+\sqrt{t(\alpha+\beta)}+(\alpha+\beta)\int_0^t E\big(S^*_N(s)^2\big)^{\frac12}\,ds
\end{align*}
and therefore
\[E\big(S^*_N(t)^2\big)^{\frac12}\le\Big(E\big(S^*_N(0)^2\big)^{\frac12}+\sqrt{t(\alpha+\beta)}\Big)e^{(\alpha+\beta)t}\]
using Gronwall's Lemma.
Condition (b) in the above theorem can be verified for continuous-time Markov chains with the help of the following

Proposition 2.14. Let $X(t)$ be a continuous-time Markov chain with state space $S\subset\mathbb{R}^d$ and generator $Q$. Suppose that
(a) $\sup_{t\ge0}\|X(t)-X(t-)\|\le K<\infty$ (bounded jumps);
(b) $q_\infty:=\sup_{i\in S}|q_{ii}|<\infty$.
Then
(i) $E\big(\|X(t+h)-X(t)\|\mid X(t)\big)\le q_\infty Kh\,e^{q_\infty h}$ for $t,h\ge0$;
(ii) $E\big(\|X(t+2h)-X(t+h)\|\cdot\|X(t+h)-X(t)\|\big)\le q_\infty^2K^2h^2\,e^{2q_\infty h}$ for $t,h\ge0$.
Proof. (i) Suppose that $X(t)=i_0$. The Markov property implies that, conditioned on $X(t)=i_0$, $X(t+h)$, $h\ge0$, is again a Markov chain with generator $Q$. Denote by $(Y_n)_{n\ge0}$ the associated jump chain (Theorem 2.6). Then, conditioned on $Y_0,\dots,Y_{n-1}$, the holding times $T_1,\dots,T_n$ are independent $\mathrm{Exp}(q(Y_i))$-distributed, $i=0,\dots,n-1$. Therefore, with $J_n=T_1+\dots+T_n$ denoting the $n$-th jump time and $f_q$ the density of the $\mathrm{Exp}(q)$-distribution,
\begin{align*}
P(J_n\le h\mid Y_0,\dots,Y_{n-1})&=\int_0^hf_{q(Y_0)}(t_1)\int_0^hf_{q(Y_1)}(t_2)\dots\int_0^hf_{q(Y_{n-1})}(t_n)\,1_{t_1+t_2+\dots+t_n\le h}\,dt_1\dots dt_n\\
&\le q_\infty^n\int_0^h\int_0^h\dots\int_0^h1_{t_1+t_2+\dots+t_n\le h}\,dt_1\dots dt_n\le\frac{q_\infty^nh^n}{n!}\,.
\end{align*}
Therefore
\[E\big(\|X(t+h)-X(t)\|\mid X(t)=i_0\big)\le\sum_{n=1}^\infty nK\,P(J_n\le h<J_{n+1})\le\sum_{n=1}^\infty nK\,\frac{q_\infty^nh^n}{n!}=q_\infty hK\,e^{q_\infty h}\,.\]
Summing up over all possible states $X(t)=i_0$ we arrive at the first assertion.
(ii) For the proof of the second assertion observe that
\begin{align*}
E\big(\|X(t+2h)-X(t+h)\|\,\|X(t+h)-X(t)\|\big)&=E\big(E(\|X(t+2h)-X(t+h)\|\mid X(t+h))\,\|X(t+h)-X(t)\|\big)\\
&\le q_\infty Kh\,e^{q_\infty h}\,E\big(\|X(t+h)-X(t)\|\big)\le q_\infty^2K^2h^2\,e^{2q_\infty h}\,.\qquad\square
\end{align*}
The last proposition, applied to the Markov chain $S^*_N(t)$, now yields that
\[E\big(|S^*_N(t+2h)-S^*_N(t+h)|\cdot|S^*_N(t+h)-S^*_N(t)|\big)\le\frac{(\alpha+\beta)^2}{N}\,h^2\,e^{2(\alpha+\beta)\sqrt N\,h}\,,\]
so that condition (b) of Theorem 2.13 is satisfied with $\gamma_1=1$ and $\gamma_2=2$. We have thus proven:
Theorem 2.15. Let $x\in\mathbb{R}$ and let
\[x^{(N)}_{k(N)}=\frac{k(N)-\frac{\alpha}{\alpha+\beta}N}{\sqrt N}\]
be a sequence of standardized initial conditions converging to $x$. Then
\[\lim_{N\to\infty}S^*_N(t)=U(t)\]
weakly on the Skorohod space $D([0,\infty))$. Here,
\[U(t)=x-(\alpha+\beta)\int_0^tU(s)\,ds+\sqrt{\frac{2\alpha\beta}{\alpha+\beta}}\,W(t)\,,\]
where $W(t)$ is a 1-dimensional Brownian motion.
In particular,
\[\lim_{N\to\infty}E\big(f(S^*_N(t))\mid S^*_N(0)=x^{(N)}_{k(N)}\big)=E\big(f(U(t))\big)\quad\text{for all }f\in C_b(\mathbb{R})\,,\]
but also
\[\lim_{N\to\infty}E\big(F(S^*_N(\cdot))\mid S^*_N(0)=x^{(N)}_{k(N)}\big)=E\big(F(U(\cdot))\big)\]
for all $F:D([0,\infty))\to\mathbb{R}$ bounded and continuous w.r.t. the Skorohod metric.

The process $U$ constructed above is a diffusion approximation for the number of open channels.
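The fluctuation result can be probed numerically. Starting the $N$ channels i.i.d. from the stationary distribution, $S^*_N(t)$ should be approximately centered Gaussian with the stationary variance of $U$, namely $\alpha\beta/(\alpha+\beta)^2$. The following Python sketch (all parameter values are illustrative) estimates this variance from repeated Gillespie runs of the birth-death chain $S_N$:

```python
import random
import statistics

def gillespie_sn(N, alpha, beta, t_end, s0, rng):
    """Number of open channels S_N(t_end) for the birth-death chain."""
    t, s = 0.0, s0
    while True:
        up, down = alpha * (N - s), beta * s
        t += rng.expovariate(up + down)
        if t >= t_end:
            return s
        s += 1 if rng.random() < up / (up + down) else -1

alpha = beta = 1.0
N, t_end, runs = 500, 1.0, 2000
p_inf = alpha / (alpha + beta)
rng = random.Random(7)
samples = []
for _ in range(runs):
    s0 = sum(rng.random() < p_inf for _ in range(N))   # stationary start
    s_t = gillespie_sn(N, alpha, beta, t_end, s0, rng)
    samples.append((s_t / N - p_inf) * N ** 0.5)       # S*_N(t)
target_var = alpha * beta / (alpha + beta) ** 2        # stationary variance of U
print(round(statistics.pvariance(samples), 2))
```

For $\alpha=\beta=1$ the target variance is $1/4$, and the empirical variance of $S^*_N(t)$ matches it up to Monte Carlo error, as predicted by Theorem 2.15.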
The voltage activity of the neuron solves the differential equation
\[c_m\frac{dV}{dt}+G_mV=I+(V_E-V)g(t)\]
where
- $c_m$ = membrane capacitance
- $G_m$ = membrane conductance
- $I$ = (exterior) current source
- $V_E$ = equilibrium potential
- $g(t)$ = process modelling fluctuations in the opening and closing of ion channels.

In the following let $I=0$, $\gamma=\frac{G_m}{c_m}$, $V_e=\frac{V_E}{c_m}$; then
\[\frac{dV}{dt}+\gamma V=V_e\,g(t)\,,\qquad V(0)=v_0\,,\]
with the explicit solution
\[V(t)=v_0\,e^{-\gamma t}+V_e\int_0^te^{-\gamma(t-s)}g(s)\,ds\,.\]
If we now represent $g$ via $U$, we obtain the stochastic process
\[V(t)=v_0\,e^{-\gamma t}+V_e\int_0^te^{-\gamma(t-s)}U(s)\,ds\,,\]
or the respective system of stochastic differential equations
\begin{align*}
dV(t)&=\big(V_e\,U(t)-\gamma V(t)\big)\,dt\\
dU(t)&=-(\alpha+\beta)U(t)\,dt+\sqrt{\frac{2\alpha\beta}{\alpha+\beta}}\,dW(t)\,.
\end{align*}
2.3.3. Convergence of finite-dimensional distributions. As an alternative to the martingale central limit theorem one can also explicitly compute the finite-dimensional distributions of $S_N$ and then apply the multivariate central limit theorem. To this end note that the transition semigroup of the two-state Markov chain can be computed as
\[P(t)=\frac1{\alpha+\beta}\begin{pmatrix}\beta+\alpha e^{-t(\alpha+\beta)}&\alpha-\alpha e^{-t(\alpha+\beta)}\\[2pt]\beta-\beta e^{-t(\alpha+\beta)}&\alpha+\beta e^{-t(\alpha+\beta)}\end{pmatrix}.\]
This implies in particular
\begin{align*}
E(X(t)\mid X(0)=0)&=\frac{\alpha}{\alpha+\beta}-\frac{\alpha}{\alpha+\beta}\,e^{-t(\alpha+\beta)}\,,\\
E(X(t)\mid X(0)=1)&=\frac{\alpha}{\alpha+\beta}+\frac{\beta}{\alpha+\beta}\,e^{-t(\alpha+\beta)}
\end{align*}
and
\begin{align*}
\mathrm{Var}(X(t)\mid X(0)=0)&=\frac1{(\alpha+\beta)^2}\big(\alpha-\alpha e^{-t(\alpha+\beta)}\big)\big(\beta+\alpha e^{-t(\alpha+\beta)}\big)\,,\\
\mathrm{Var}(X(t)\mid X(0)=1)&=\frac1{(\alpha+\beta)^2}\big(\alpha+\beta e^{-t(\alpha+\beta)}\big)\big(\beta-\beta e^{-t(\alpha+\beta)}\big)\,.
\end{align*}
N independent open channels. If we now consider $N$ independent two-state Markov chains $X_1(t),\dots,X_N(t)$ with identical transition rates $\alpha$ and $\beta$, the sum $S_N(t)=X_1(t)+\dots+X_N(t)$ is again Markovian with state space $\{0,\dots,N\}$ and transition matrix
\[P_{ij}(t)=P(S_N(t)=j\mid S_N(0)=i)=\sum_{k=0}^N\underbrace{\binom{i}{j-k}}_{\substack{\text{possibilities for}\\\text{closing channels}}}\underbrace{\binom{N-i}{k}}_{\substack{\text{possibilities for}\\\text{opening channels}}}p_{10}(t)^{i-(j-k)}\,p_{11}(t)^{j-k}\,p_{00}(t)^{N-(i+k)}\,p_{01}(t)^k\,.\]
Indeed, $k$ denotes the number of changes from closed to open (at most $j$), hence $j-k$ open channels "stay" open; the transitions of a single ion channel happen with the probabilities $p_{ij}(t)$ given above.
Limiting behavior N →∞
To apply the central limit theorem, we will need the first and second moments of $S_N(t)$:
\begin{align*}
E(S_N(t)\mid S_N(0)=i)&=E(S_N(t)\mid X_1(0)=\dots=X_i(0)=1,\ X_{i+1}(0)=\dots=X_N(0)=0)\\
&=\sum_{k=1}^NE(X_k(t)\mid\text{--''--})=i\,p_{11}(t)+(N-i)\,p_{01}(t)\\
&=\frac{i\big(\alpha+e^{-(\alpha+\beta)t}\beta\big)+(N-i)\big(\alpha-e^{-(\alpha+\beta)t}\alpha\big)}{\alpha+\beta}\\
&=N\frac{\alpha}{\alpha+\beta}\big(1-e^{-(\alpha+\beta)t}\big)+i\,e^{-(\alpha+\beta)t}\,,
\end{align*}
\begin{align*}
\mathrm{Var}(S_N(t)\mid S_N(0)=i)&=\mathrm{Var}(S_N(t)\mid X_1(0)=\dots=X_i(0)=1,\ X_{i+1}(0)=\dots=X_N(0)=0)\\
&=\sum_{k=1}^N\mathrm{Var}(X_k(t)\mid\text{--''--})=i\big(p_{11}(t)-p_{11}(t)^2\big)+(N-i)\big(p_{01}(t)-p_{01}(t)^2\big)\\
&=i\,\frac{\alpha+e^{-(\alpha+\beta)t}\beta}{\alpha+\beta}\cdot\frac{\beta-e^{-(\alpha+\beta)t}\beta}{\alpha+\beta}+(N-i)\,\frac{\alpha-e^{-(\alpha+\beta)t}\alpha}{\alpha+\beta}\cdot\frac{\beta+e^{-(\alpha+\beta)t}\alpha}{\alpha+\beta}\\
&=N\frac{\alpha\beta}{(\alpha+\beta)^2}+\frac{i(\beta^2-\alpha\beta)+(N-i)(\alpha^2-\alpha\beta)}{(\alpha+\beta)^2}\,e^{-(\alpha+\beta)t}-\frac{i\beta^2+(N-i)\alpha^2}{(\alpha+\beta)^2}\,e^{-2(\alpha+\beta)t}\,.
\end{align*}
Standardization. As in the case of the classical central limit theorem we have to standardize $S_N$. Therefore, let
\[S^*_N(t):=\frac{S_N(t)-\frac{\alpha}{\alpha+\beta}N}{\sqrt N}\]
and note that this is again a Markov chain with state space
\[I_N:=\Big\{x^{(N)}_k:=\frac{k-\frac{\alpha}{\alpha+\beta}N}{\sqrt N}\ \Big|\ k=0,\dots,N\Big\}\,.\]
For given $x^{(N)}_k\in I_N$, we now consider the distribution
\[P\big(S^*_N(t)=x^{(N)}_j\mid S^*_N(0)=x^{(N)}_k\big)\,,\quad j=0,\dots,N\,,\]
as a probability measure
\[p^{(N)}_t\big(x^{(N)}_k,\cdot\,\big)\]
on the whole real line $\mathbb{R}$.
The central limit theorem now implies that for a sequence of initial conditions $\big(x^{(N)}_{k(N)}\big)$ with $x^{(N)}_{k(N)}\to x\in\mathbb{R}$,
\[\lim_{N\to\infty}p^{(N)}_t\big(x^{(N)}_{k(N)},\cdot\,\big)=\mathcal{N}\Big(e^{-t(\alpha+\beta)}x\,,\ \frac{\alpha\beta}{(\alpha+\beta)^2}\big(1-e^{-2t(\alpha+\beta)}\big)\Big)\quad\text{weakly}\,.\]
Indeed: first note that
\begin{align*}
S^*_N(t)&=\frac1{\sqrt N}\sum_{k=1}^N\Big(X_k(t)-\frac{\alpha}{\alpha+\beta}\Big)\\
&=\frac1{\sqrt N}\sum_{k=1}^{k(N)}\Big(X_k(t)-\Big(\frac{\alpha}{\alpha+\beta}+\frac{\beta}{\alpha+\beta}e^{-t(\alpha+\beta)}\Big)\Big)\\
&\quad+\frac1{\sqrt N}\sum_{k=k(N)+1}^N\Big(X_k(t)-\Big(\frac{\alpha}{\alpha+\beta}-\frac{\alpha}{\alpha+\beta}e^{-t(\alpha+\beta)}\Big)\Big)\\
&\quad+\frac1{\sqrt N}\Big(k(N)\frac{\beta}{\alpha+\beta}e^{-t(\alpha+\beta)}-(N-k(N))\frac{\alpha}{\alpha+\beta}e^{-t(\alpha+\beta)}\Big)\\
&=I+II+III\,,\text{ say}\,.
\end{align*}
Now
\[x^{(N)}_{k(N)}=\frac{k(N)-\frac{\alpha}{\alpha+\beta}N}{\sqrt N}\longrightarrow x\quad\text{implies}\quad k(N)\sim\sqrt N\,x+\frac{\alpha}{\alpha+\beta}N\,,\]
and therefore
\begin{align*}
I+II\ &\xrightarrow{\,w\,}\ \mathcal{N}\Big(0\,,\ \frac{\alpha}{(\alpha+\beta)^3}\big(\alpha+\beta e^{-t(\alpha+\beta)}\big)\big(\beta-\beta e^{-t(\alpha+\beta)}\big)+\frac{\beta}{(\alpha+\beta)^3}\big(\alpha-\alpha e^{-t(\alpha+\beta)}\big)\big(\beta+\alpha e^{-t(\alpha+\beta)}\big)\Big)\\
&=\mathcal{N}\Big(0\,,\ \frac{\alpha^2\beta+\beta^2\alpha-(\beta^2\alpha+\alpha^2\beta)\,e^{-2t(\alpha+\beta)}}{(\alpha+\beta)^3}\Big)=\mathcal{N}\Big(0\,,\ \frac{\alpha\beta}{(\alpha+\beta)^2}\big(1-e^{-2t(\alpha+\beta)}\big)\Big)\,,\\
III&\sim\frac1{\sqrt N}\Big(\Big(\sqrt N\,x+\frac{\alpha}{\alpha+\beta}N\Big)\frac{\beta}{\alpha+\beta}e^{-t(\alpha+\beta)}-\Big(-\sqrt N\,x+\frac{\beta}{\alpha+\beta}N\Big)\frac{\alpha}{\alpha+\beta}e^{-t(\alpha+\beta)}\Big)=e^{-t(\alpha+\beta)}x\,.
\end{align*}
It turns out that
\[p_t(x,\cdot):=\mathcal{N}\Big(e^{-t(\alpha+\beta)}x\,,\ \frac{\alpha\beta}{(\alpha+\beta)^2}\big(1-e^{-2t(\alpha+\beta)}\big)\Big)\,,\quad t\ge0\,,\ x\in\mathbb{R}\,,\]
defines a semigroup of transition probabilities on $\mathbb{R}$. The associated Markov process $U(t)$, $t\ge0$, is given as the solution of the stochastic differential equation
\[(2.13)\qquad dU(t)=-(\alpha+\beta)U(t)\,dt+\sqrt{\frac{2\alpha\beta}{\alpha+\beta}}\,dW(t)\,,\]
where $W(t)$, $t\ge0$, is a 1-dimensional Brownian motion (see Appendix B). With similar computations we can also prove the weak convergence of the finite-dimensional distributions of $S^*_N$ towards the finite-dimensional distributions of $U$. Together with the tightness of $(S^*_N(t))$ we therefore arrive at the same conclusion as in Theorem 2.15.
2.4. Long-time behavior of Markov chains
Recall: given a stochastic matrix $P=(p_{ij})_{i,j\in S}$, a probability measure $(\mu_i)_{i\in S}$ is invariant for $P$ if
\[\mu P=\mu\,,\quad\text{i.e.}\quad\forall i\in S:\ (\mu P)_i=\sum_{j\in S}\mu_jp_{ji}=\mu_i\,,\]
equivalently,
\[P(X_1=i)=\sum_{j\in S}\underbrace{P(X_1=i\mid X_0=j)}_{=p_{ji}}\underbrace{P(X_0=j)}_{=\mu_j}=P(X_0=i)\,,\]
where $(X_n)_{n\ge0}$ denotes a Markov chain with transition probabilities $P$ and initial distribution $\mu$. Iterating yields $P(X_n=i)=\dots=P(X_0=i)$, i.e., the distribution of $X_n$ is invariant in time.
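Numerically, an invariant distribution can often be found by simply iterating $\mu\mapsto\mu P$; the convergence theorem below guarantees that for an irreducible aperiodic chain the iterates converge to $\mu$ from any starting distribution. A minimal Python sketch for a two-state chain (the entries of $P$ are illustrative):

```python
def step(mu, P):
    """One step of the map mu -> mu P for a row-stochastic matrix P."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]

a, b = 0.3, 0.1
P = [[1 - a, a], [b, 1 - b]]
mu = [1.0, 0.0]                  # start deterministically in state 0
for _ in range(200):
    mu = step(mu, P)
exact = [b / (a + b), a / (a + b)]   # solves mu P = mu for this 2x2 chain
print([round(x, 4) for x in mu])     # → [0.25, 0.75]
```

The second eigenvalue of $P$ here is $1-a-b=0.6$, so the iteration converges geometrically and $200$ steps are far more than enough.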
Theorem 2.16. Part 1 (Convergence to invariant distributions). Let $P$ be
• irreducible, i.e., $\forall i,j\ \exists n_0\ge1$ such that $p^{n_0}_{ij}>0$,
• aperiodic, i.e., $\forall i\ \exists n_0\ge1$ such that $p^n_{ii}>0$ for all $n\ge n_0$.
Suppose that $P$ has an invariant distribution $\mu$. Then
\[\lim_{n\to\infty}P(X_n=j\mid X_0=i)=\mu_j\quad\text{for all }i,j\in S\,,\quad\text{equivalently}\quad\lim_{n\to\infty}p^n_{ij}=\mu_j\,.\]
Part 2 (Existence of invariant measures). Let $P$ be
• irreducible,
• positive recurrent, i.e.,
\[\forall i:\ E(T_i\mid X_0=i)<\infty\,,\]
where $T_i=\min\{n\ge1:X_n=i\}$ is the first return time to $i$.
Then $P$ has an invariant distribution $\mu$.

Proof. (see: Norris, Markov chains)
We now come back to the case of time-continuous Markov chains. Let $P(t)$, $t\ge0$, be a right-continuous semigroup of stochastic matrices with generator $Q$ and jump matrix $\Pi$. A measure $\mu$ is called (infinitesimally) invariant for $P(t)$ if
\[\mu Q=0\,.\]

Lemma 2.17. The following are equivalent:
(i) $\mu$ is (infinitesimally) invariant;
(ii) $\tilde\mu\Pi=\tilde\mu$, where $\tilde\mu_i=\mu_iq_i$, $q_i=-q_{ii}=\sum_{j\neq i}q_{ij}$.

Proof. This follows from $q_i(\pi_{ij}-\delta_{ij})=q_{ij}$ and thus
\[\big(\tilde\mu(\Pi-I)\big)_j=\sum_{i\in S}\tilde\mu_i(\pi_{ij}-\delta_{ij})=\sum_{i\in S}\mu_iq_{ij}=(\mu Q)_j\,.\qquad\square\]
We can now state the exact analogue of the previous theorem:

Theorem 2.18. Assume that $\sup_i|q_{ii}|<\infty$.
Part 1 (Convergence to equilibrium). Let $P(t)$, $t\ge0$, be
• irreducible, i.e., $\forall i,j\ \exists t_0>0$ such that $p_{ij}(t_0)>0$.
Suppose that $P(t)$, $t\ge0$, has an invariant distribution $\mu$; then
\[\lim_{t\to\infty}p_{ij}(t)=\mu_j\quad\forall i,j\in S\,.\]
Part 2 (Existence of invariant measures). Let $P(t)$, $t\ge0$, be
• irreducible,
• positive recurrent, i.e.,
\[\forall i:\ q_i=0\ \text{ or }\ E(T_i\mid X(0)=i)<\infty\,,\]
where $T_i=\inf\{t\ge J_1:X(t)=i\}$ is the first return time to $i$.
Then $Q$ (resp. $P(t)$, $t\ge0$) has an invariant distribution $\mu$.
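As an illustration of the condition $\mu Q=0$, one can check numerically that the $\mathrm{Binomial}\big(N,\frac{\alpha}{\alpha+\beta}\big)$ distribution is infinitesimally invariant for the channel-count chain $S_N$ from Section 2.3 (a minimal sketch; this particular invariant distribution is not stated in the text, but it follows from detailed balance for the birth-death rates, and $N$, $\alpha$, $\beta$ below are illustrative):

```python
import math

def binom_pmf(n, p, k):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

alpha, beta, N = 1.0, 2.0, 8
p = alpha / (alpha + beta)
mu = [binom_pmf(N, p, k) for k in range(N + 1)]

def muQ(k):
    """k-th entry of mu Q for the birth-death generator of S_N:
    up-rate alpha*(N-s) from s, down-rate beta*s from s."""
    total = 0.0
    if k > 0:
        total += mu[k - 1] * alpha * (N - (k - 1))   # inflow from k-1
    if k < N:
        total += mu[k + 1] * beta * (k + 1)          # inflow from k+1
    total -= mu[k] * (alpha * (N - k) + beta * k)    # outflow from k
    return total

print(max(abs(muQ(k)) for k in range(N + 1)) < 1e-12)  # → True
```

The check succeeds because the Binomial weights satisfy detailed balance, $\mu_k\,\alpha(N-k)=\mu_{k+1}\,\beta(k+1)$, which is stronger than $\mu Q=0$.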
CHAPTER 3
Models for synaptic input
Neurons pass on electrical signals arriving at the axon terminals via synapses to the dendrites of other neurons. There are mainly two different mechanisms by which this is achieved: electrical and chemical coupling.

In the case of an electrical coupling, the axon terminal of the presynaptic neuron is linked via special types of ion channels, the gap junctions, with the dendrites of the postsynaptic neuron, by which they have an immediate impact on the membrane potential of the postsynaptic neuron.

In the case of a chemical coupling there is no direct connection between the pre- and the postsynaptic neuron, but rather a small synaptic cleft that is crossed by neurotransmitters emitted by the presynaptic neuron and received by the postsynaptic neuron.

In contrast to the electrical coupling, where the signal in the postsynaptic neuron is always smaller than or equal to the signal in the presynaptic neuron, signals can also be amplified in the case of chemical couplings. The detailed mathematical modeling of chemical synapses is therefore in general more involved.

A simple theoretical model for the synaptic input that has been largely influential was provided by R. Stein [Ste65]. Because of its discontinuous, point-event character, synaptic input is modeled with the help of point processes. Nevertheless, in the presence of large homogeneous input, a diffusion approximation can again become appropriate.
In this chapter we will introduce Stein’s model for synaptic input and its dif-fusion approximation under appropriate assumptions provided in [LL87].
Stein's model for synaptic input. Let $V$ as usual denote the membrane potential. Then the time evolution of $V$ is given in Stein's model as
\[(3.1)\qquad dV(t)=-\frac1\tau V(t)\,dt+a_+\,dN_+(t)-a_-\,dN_-(t)\]
where
- $a_\pm$ denote the amplitudes of excitatory/inhibitory currents,
- $N_\pm$ are independent Poisson processes with rates $\lambda_\pm$.

Equation (3.1) has to be understood in integral form, i.e.,
\begin{align*}
V(t)&=V(0)-\frac1\tau\int_0^tV(s)\,ds+a_+\int_0^tdN_+(s)-a_-\int_0^tdN_-(s)\\
&=V(0)-\frac1\tau\int_0^tV(s)\,ds+a_+\big(N_+(t)-N_+(0)\big)-a_-\big(N_-(t)-N_-(0)\big)\,.
\end{align*}
Mathematically, equation (3.1) is an ordinary differential equation driven by two (independent) Poisson processes. We could consider the weighted sum $a_+N_+(t)-a_-N_-(t)$ of the two Poisson processes as a birth-death process on the discrete set $\{a_+n_1-a_-n_2\mid n_i\in\mathbb{N}_0\}$ with generator matrix entries $q_{i,i+a_+}=\lambda_+$ and $q_{i,i-a_-}=\lambda_-$.

The trajectories of the process have the following structure: between the jumping times $J_n$ and $J_{n+1}$ of the two Poisson processes,
\[V_t=e^{-\frac{t-J_n}{\tau}}V_{J_n}\,,\quad J_n\le t<J_{n+1}\,.\]
At $t=J_{n+1}$ the solution either jumps up to the value $V_{J_{n+1}-}+a_+$ or down to the value $V_{J_{n+1}-}-a_-$, due to an excitatory resp. inhibitory input of magnitude $a_+$ resp. $a_-$.
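This piecewise-deterministic structure gives an exact simulation recipe: wait an $\mathrm{Exp}(\lambda_++\lambda_-)$ time, decay the potential over that interval, then jump up with probability $\lambda_+/(\lambda_++\lambda_-)$ and down otherwise. A Python sketch with illustrative parameter values:

```python
import math
import random

def stein_path(tau, a_plus, a_minus, lam_plus, lam_minus, t_end,
               v0=0.0, seed=5):
    """Exact simulation of Stein's model: exponential decay with time
    constant tau between jumps, jump +a_plus at rate lam_plus and
    -a_minus at rate lam_minus. Returns the list of (time, value) pairs."""
    rng = random.Random(seed)
    t, v, path = 0.0, v0, [(0.0, v0)]
    total = lam_plus + lam_minus
    while True:
        dt = rng.expovariate(total)            # time to the next synaptic event
        if t + dt >= t_end:
            path.append((t_end, v * math.exp(-(t_end - t) / tau)))
            return path
        t += dt
        v *= math.exp(-dt / tau)               # passive decay since last event
        v += a_plus if rng.random() < lam_plus / total else -a_minus
        path.append((t, v))

path = stein_path(tau=1.0, a_plus=0.1, a_minus=0.1,
                  lam_plus=50.0, lam_minus=40.0, t_end=10.0)
```

With $\mu=\lambda_+a_+-\lambda_-a_-=1$ the mean potential settles near $\tau\mu=1$, which matches the drift of the diffusion approximation derived next.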
The following theorem shows that for a large amount of synaptic input we have a canonical diffusion approximation:

Theorem 3.1. Let $\lambda^{(n)}_\pm$, $a^{(n)}_\pm$, $n=1,2,\dots$, be such that
(a) $\lambda^{(n)}_\pm\to+\infty$,
(b) $\mu^{(n)}:=\lambda^{(n)}_+a^{(n)}_+-\lambda^{(n)}_-a^{(n)}_-\to\mu$,
(c) $\sigma^{(n),2}:=\lambda^{(n)}_+\big(a^{(n)}_+\big)^2+\lambda^{(n)}_-\big(a^{(n)}_-\big)^2\to\sigma^2$.
Then
\[(3.2)\qquad\lim_{n\to\infty}E\big(f(V^{(n)}(t))\mid V^{(n)}(0)=x\big)=E\big(f(V(t))\mid V(0)=x\big)\]
where
\[dV^{(n)}(t)=-\frac1\tau V^{(n)}(t)\,dt+a^{(n)}_+\,dN^{(n)}_+(t)-a^{(n)}_-\,dN^{(n)}_-(t)\]
and $V(t)$, $t\ge0$, is given as the solution of the stochastic differential equation
\[dV(t)=\Big(\mu-\frac1\tau V(t)\Big)\,dt+\sigma\,dW(t)\,.\]
In fact, similarly to Theorem 2.15, we also have that
\[\lim_{n\to\infty}E\big(F(V^{(n)}(\cdot))\mid V^{(n)}(0)=x\big)=E\big(F(V(\cdot))\mid V(0)=x\big)\]
for all $F:D([0,\infty))\to\mathbb{R}$ bounded and continuous w.r.t. the Skorohod topology.
We need the following lemma.

Lemma 3.2. The finite-dimensional distributions of
\[Z^{(n)}(t):=a^{(n)}_+N^{(n)}_+(t)-a^{(n)}_-N^{(n)}_-(t)\]
converge weakly to those of $\sigma W(t)+\mu t$.
Proof. Let $t_k=\frac kn t$ be given. Then
\[Z^{(n)}(t)=\sum_{k=0}^{n-1}\big(Z^{(n)}(t_{k+1})-Z^{(n)}(t_k)\big)=a^{(n)}_+\sum_{k=0}^{n-1}\underbrace{\big(N^{(n)}_+(t_{k+1})-N^{(n)}_+(t_k)\big)}_{\text{independent, Poiss}\left(\lambda^{(n)}_+\frac tn\right)}-a^{(n)}_-\sum_{k=0}^{n-1}\underbrace{\big(N^{(n)}_-(t_{k+1})-N^{(n)}_-(t_k)\big)}_{\text{independent, Poiss}\left(\lambda^{(n)}_-\frac tn\right)}\ \xrightarrow{\,w\,}\ \sigma W(t)+\mu t\sim\mathcal{N}(\mu t,\sigma^2t)\]
by the central limit theorem. Similarly,
\[Z^{(n)}(t)-Z^{(n)}(s)\ \xrightarrow{\,w\,}\ \sigma\big(W(t)-W(s)\big)+\mu(t-s)\,,\]
and by independence of the increments of $Z^{(n)}$ we can also deduce the convergence of finitely many increments to the increments of a Brownian motion with drift $\mu$. $\square$
Proof (of Theorem 3.1). We first show that $\lim_{n\to\infty}Z^{(n)}=\sigma W+\mu\,\mathrm{id}$ weakly on $D([0,\infty))$, i.e.,
\[\lim_{n\to\infty}E\big(F(Z^{(n)})\big)=E\big(F(\sigma W+\mu\,\mathrm{id})\big)\]
for any $F:D([0,\infty))\to\mathbb{R}$ bounded and continuous. To this end it suffices to show that the sequence $P\circ(Z^{(n)})^{-1}$, $n\ge1$, is tight on the Skorohod space $D([0,\infty))$. We will apply Theorem 2.13 and have to show that
(a) $\exists\,\gamma_0>0:\ \sup_{n\ge1}\sup_{t\le T}E\big(|Z^{(n)}(t)|^{\gamma_0}\big)<\infty$;
(b) $\exists\,C,\ \exists\,\gamma_1>0,\ \gamma_2>1$ such that
\[\sup_{n\ge1}\sup_{t\le T}E\big(|Z^{(n)}(t+2h)-Z^{(n)}(t+h)|^{\gamma_1}\,|Z^{(n)}(t+h)-Z^{(n)}(t)|^{\gamma_1}\big)\le Ch^{\gamma_2}\,.\]
For the proof of (a) note that
\begin{align*}
E\big((Z^{(n)}(t))^2\big)&=E\big((a^{(n)}_+N^{(n)}_+(t)-a^{(n)}_-N^{(n)}_-(t))^2\big)\\
&=\big(a^{(n)}_+\big)^2E\big((N^{(n)}_+(t))^2\big)-2a^{(n)}_+a^{(n)}_-E\big(N^{(n)}_+(t)N^{(n)}_-(t)\big)+\big(a^{(n)}_-\big)^2E\big((N^{(n)}_-(t))^2\big)\\
&=\big(a^{(n)}_+\big)^2\big(\lambda^{(n)}_+t+(\lambda^{(n)}_+)^2t^2\big)-2a^{(n)}_+a^{(n)}_-\lambda^{(n)}_+\lambda^{(n)}_-t^2+\big(a^{(n)}_-\big)^2\big(\lambda^{(n)}_-t+(\lambda^{(n)}_-)^2t^2\big)\\
&=\big(\sigma^{(n)}\big)^2t+\big(\mu^{(n)}\big)^2t^2\ \longrightarrow\ \sigma^2t+\mu^2t^2
\end{align*}
as $n\to\infty$, so that condition (a) is satisfied with $\gamma_0=2$.

We will next show that (b) is satisfied with $\gamma_1=\gamma_2=2$. To this end note that by independence of the increments of $Z^{(n)}$
\begin{align*}
E\big(|Z^{(n)}(t+2h)-Z^{(n)}(t+h)|^2\,|Z^{(n)}(t+h)-Z^{(n)}(t)|^2\big)&=E\big(|Z^{(n)}(t+2h)-Z^{(n)}(t+h)|^2\big)\,E\big(|Z^{(n)}(t+h)-Z^{(n)}(t)|^2\big)\\
&=\Big(\big(\sigma^{(n)}\big)^2h+\big(\mu^{(n)}\big)^2h^2\Big)^2\le2\big(\sigma^{(n)}\big)^4h^2+2\big(\mu^{(n)}\big)^4h^4\,,
\end{align*}
which implies the assertion. Now
\[(3.3)\qquad V^{(n)}(t)=x-\frac1\tau\int_0^tV^{(n)}(s)\,ds+Z^{(n)}(t)\,,\quad t\ge0\,,\]
has the alternative representation
\[V^{(n)}(t)=e^{-\frac t\tau}x+Z^{(n)}(t)-\frac1\tau\int_0^te^{-\frac{t-s}{\tau}}Z^{(n)}(s)\,ds\,,\quad t\ge0\,.\]
Indeed, note that (3.3) implies for $T>0$
\begin{align*}
\int_0^Te^{\frac t\tau}V^{(n)}(t)\,dt&=\int_0^Te^{\frac t\tau}x\,dt-\frac1\tau\int_0^Te^{\frac t\tau}\int_0^tV^{(n)}(s)\,ds\,dt+\int_0^Te^{\frac t\tau}Z^{(n)}(t)\,dt\\
&=\tau\big(e^{\frac T\tau}-1\big)x-\int_0^T\big(e^{\frac T\tau}-e^{\frac s\tau}\big)V^{(n)}(s)\,ds+\int_0^Te^{\frac t\tau}Z^{(n)}(t)\,dt\\
&=e^{\frac T\tau}\Big(\tau x-\int_0^TV^{(n)}(s)\,ds\Big)-\tau x+\int_0^Te^{\frac t\tau}V^{(n)}(t)\,dt+\int_0^Te^{\frac t\tau}Z^{(n)}(t)\,dt\,,
\end{align*}
hence subtracting $\int_0^Te^{\frac t\tau}V^{(n)}(t)\,dt$ on both sides yields
\[0=e^{\frac T\tau}\Big(\tau x-\int_0^TV^{(n)}(s)\,ds\Big)-\tau x+\int_0^Te^{\frac t\tau}Z^{(n)}(t)\,dt=e^{\frac T\tau}\big(\tau V^{(n)}(T)-\tau Z^{(n)}(T)\big)-\tau x+\int_0^Te^{\frac t\tau}Z^{(n)}(t)\,dt\]
(inserting (3.3) again), or equivalently,
\[V^{(n)}(T)=e^{-\frac T\tau}x+Z^{(n)}(T)-\frac1\tau\int_0^Te^{-\frac{T-t}{\tau}}Z^{(n)}(t)\,dt\,.\]
Hence $V^{(n)}(t)=\Phi(Z^{(n)})(t)$, where
\[\Phi:D([0,\infty))\to D([0,\infty))\]
is the mapping
\[\Phi(\omega)(t):=e^{-\frac t\tau}x+\omega(t)-\frac1\tau\int_0^te^{-\frac{t-s}{\tau}}\omega(s)\,ds\,.\]
It can be shown that $\Phi$ is continuous w.r.t. the Skorohod metric. Indeed, it is known that $d(\omega_n,\omega)\to0$ if and only if there exist $\lambda_n\in\Lambda$ such that $\lambda_n\to\mathrm{id}$ uniformly and $\omega_n\circ\lambda_n\to\omega$ locally uniformly. But then
\begin{align*}
\Phi(\omega_n)\circ\lambda_n(t)-\Phi(\omega)(t)&=\big(e^{-\frac{\lambda_n(t)}{\tau}}-e^{-\frac t\tau}\big)x+\big(\omega_n\circ\lambda_n(t)-\omega(t)\big)\\
&\quad-\frac1\tau\int_0^{\lambda_n(t)}e^{-\frac{\lambda_n(t)-s}{\tau}}\omega_n(s)\,ds+\frac1\tau\int_0^te^{-\frac{t-s}{\tau}}\omega(s)\,ds\\
&=\big(e^{-\frac{\lambda_n(t)}{\tau}}-e^{-\frac t\tau}\big)x+\big(\omega_n\circ\lambda_n(t)-\omega(t)\big)\\
&\quad-\frac1\tau\int_0^te^{-\frac{\lambda_n(t)-\lambda_n(s)}{\tau}}\omega_n(\lambda_n(s))\,\dot\lambda_n(s)\,ds+\frac1\tau\int_0^te^{-\frac{t-s}{\tau}}\omega(s)\,ds\\
&\longrightarrow0\quad\text{locally uniformly as well}\,.
\end{align*}
Hence, $V^{(n)}=\Phi(Z^{(n)})\to\Phi(\sigma W+\mu\,\mathrm{id})=V$ weakly in $D([0,\infty))$, which implies the assertion. $\square$
3.0.1. Convergence of spike times. Using the simple integrate-and-fire model, a spike of the neuron is defined as the (first) event that the membrane potential $V$ crosses a certain threshold value $V_{\mathrm{spike}}$. The spike time $T$ is therefore given as
\[T:=\inf\{t>0:V(t)>V_{\mathrm{spike}}\}\,.\]
Mathematically, $T$ is a stopping time, i.e., for all $t$ the event $\{T\le t\}$ that a spike occurred up to time $t$ is measurable w.r.t. the $\sigma$-algebra $\mathcal{F}_t:=\sigma(V(s):s\le t)$ generated by the membrane potential $V$ up to time $t$. The distribution of $T$ contains a lot of information about the neuron; however, it is difficult to compute directly for the Poisson input, but may be simpler to compute for the diffusion approximation provided by Theorem 3.1. However, the following example shows that the first passage time is not a continuous functional on $D([0,\infty))$, so that Theorem 3.1 does not yet guarantee the convergence of the distribution of the spike times of $V^{(n)}$ to the distribution of the spike times of their diffusion approximation.
Counterexample. Let
\[\omega_n(t)=\begin{cases}\big(1+\frac1n\big)\sin(t)&t\in[0,\pi]\\ t-\pi&t\ge\pi\end{cases}\]
(a sine bump of height $1+\frac1n$ on $[0,\pi]$, followed by a linearly increasing ramp). Let $V_{\mathrm{spike}}=1$; then $T_1(\omega_n)\le\frac\pi2$ for all $n$, and
\[\omega_n(t)\to\omega(t)=\begin{cases}\sin t&t\in[0,\pi]\\ t-\pi&t\ge\pi\end{cases}\]
uniformly, hence in particular
\[d(\omega_n,\omega)\le\sup_{t\ge0}e^{-t}\,|\omega_n(t)-\omega(t)|\to0\]
(choosing $\lambda=\mathrm{id}$, for which $\|\lambda\|=0$), but $T_1(\omega)=\pi+1$.

To ensure the convergence of the spike times in distribution, we therefore treat this problem separately in the following theorem:
Theorem 3.3. Let $m\in\mathbb{R}$ and
\[T_m(\omega)=\inf\{t\ge0:\omega(t)>m\}\]
for $\omega\in D([0,\infty))$, with the convention $\inf\emptyset=+\infty$. Then $T_m(V^{(n)})\xrightarrow{\,d\,}T_m(V)$.
Proof. We know that a sequence of random variables $X_n$ converges in distribution to $X$ if and only if for the cumulative distribution functions $F_{X_n}$, $F_X$
\[\lim_{n\to\infty}F_{X_n}(x)=F_X(x)\]
for all points $x$ of continuity of $F_X$. For all $m'\in\mathbb{R}$ we have that
\[P\Big(\sup_{t\in[0,T]}V(t)<m'\Big)\le\liminf_{n\to\infty}P\Big(\sup_{t\in[0,T]}V^{(n)}(t)<m'\Big)\]
and
\[\limsup_{n\to\infty}P\Big(\sup_{t\in[0,T]}V^{(n)}(t)\le m'\Big)\le P\Big(\sup_{t\in[0,T]}V(t)\le m'\Big)\,,\]
since $\{\sup_{t\in[0,T]}\omega(t)\le m'\}\subseteq D([0,\infty))$ is closed and $\{\sup_{t\in[0,T]}\omega(t)<m'\}\subseteq D([0,\infty))$ is open.
Note that $V(t)$, $t\ge0$, is in fact a continuous process, so that for $m'\downarrow m$
\[\{T_{m'}(V)\le T\}\uparrow\{T_m(V)\le T\}\,.\]
Indeed, $\{T_{m'}(V)\le T\}$ is monotone increasing for $m'$ decreasing, and conversely
\[\{T_m(V)\le T\}=\{V(t)>m\text{ for some }t\le T\}\subseteq\bigcup_{m'>m}\{V(t)>m'\text{ for some }t\le T\}=\bigcup_{m'>m}\{T_{m'}(V)\le T\}\,.\]
Lebesgue's theorem implies that
\[P(T_m(V)\le T)=\lim_{m'\downarrow m}P(T_{m'}(V)\le T)\,.\]
Note that
\[\{T_m(V)\le T\}=\Big\{\sup_{t\in[0,T]}V(t)>m\Big\}\]
implies for $m'>m''>m$
\begin{align*}
P(T_{m'}(V)\le T)&=1-P\Big(\sup_{t\in[0,T]}V(t)\le m'\Big)\\
&\le1-\limsup_{n\to\infty}P\Big(\sup_{t\in[0,T]}V^{(n)}(t)\le m'\Big)\\
&\le1-\liminf_{n\to\infty}P\Big(\sup_{t\in[0,T]}V^{(n)}(t)<m''\Big)\\
&\le1-P\Big(\sup_{t\in[0,T]}V(t)<m''\Big)\\
&=P\Big(\sup_{t\in[0,T]}V(t)\ge m''\Big)\le P\Big(\sup_{t\in[0,T]}V(t)>m\Big)=P(T_m(V)\le T)\,.
\end{align*}
Taking the limit $m'\downarrow m$ implies that we have equality everywhere, so that in particular
\[\lim_{n\to\infty}P\Big(\sup_{t\in[0,T]}V^{(n)}(t)>m\Big)=P\Big(\sup_{t\in[0,T]}V(t)>m\Big)\,,\]
in other words,
\[\lim_{n\to\infty}P\big(T_m(V^{(n)})\le T\big)=P(T_m(V)\le T)\,.\qquad\square\]
CHAPTER 4
Stochastic Integrate-and-Fire models
In this chapter we will introduce and analyse the stochastic integrate-and-fire (IF) model as the simplest statistical model for the membrane potential, which is the basic observable of neural activity. For its neural background let us first recall the basic dynamical features of the membrane potential:
- the synaptic input changes the membrane potential along the dendrites
- this change in the membrane potential is passed through the dendrites to the cell body
- the membrane potential at the cell body integrates the synaptic input over time and, provided the input is big enough, i.e. crosses a certain threshold value, can produce a sharp rise in the membrane potential followed by a sharp decrease and a refractory period in which the membrane potential slowly returns to its original resting value
- the sharp rise followed by the sharp decrease is called a spike or action potential and is actively transmitted through the axon to other neurons
The IF model captures this basic mechanism by setting up the following differential equation for the membrane potential:
\[(4.1)\qquad C\frac{dV}{dt}=-\frac VR+I\]
together with a reset rule that consists in resetting the membrane potential $V$ to a lower value $V_r$ once it reaches a certain value $V_{th}$. This mechanism induces a discontinuity in the process that causes many difficulties in its subsequent mathematical analysis.

Main additional feature of the leaky IF model: firing can only be reached for a large enough input current $I$, because integration of (4.1) yields
\[V_t=e^{-t/CR}\,V_r+\big(1-e^{-t/CR}\big)IR\,,\]
which crosses the level $V_{th}$ for some $t$ only if $IR>V_{th}$.
Stochasticity

We have already identified the two main sources of fluctuations in the membrane potential:
- the random closing and opening of regulating ion channels
- uncorrelated input from presynaptic neurons.
The simplest effective statistical modeling of fluctuations in the membrane potential is obtained by simply adding Brownian motion $(W_t)_{t\ge0}$ as an exterior forcing term acting on the membrane potential, which yields the following stochastic differential equation (SDE):
\[(4.2)\qquad dV_t=\Big(\frac IC-\frac{V_t}{CR}\Big)\,dt+\sigma\,dW_t\,,\qquad V_0=V_r\,.\]
Note that (4.2) is a linear SDE and its unique strong solution can be represented as
\[V_t=e^{-\frac t{CR}}\,V_r+\Big(1-e^{-\frac t{CR}}\Big)IR+\int_0^te^{-\frac{t-s}{CR}}\,\sigma\,dW_s\]
(see Appendix C).
A spike occurs once the process $V_t$ hits the threshold $V_{th}$, i.e., a spike occurs at the first passage time
\[T:=\inf\{t>0:V_t>V_{th}\}\,.\]
$T$ is also called the firing time. The quantity of interest is the interspike interval (ISI) statistics, i.e., the distribution of $T$; $E(T)$ is called the mean firing time.
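The ISI statistics are easy to sample numerically: integrate (4.2) with the Euler-Maruyama scheme and apply the reset rule whenever the threshold is crossed. The following Python sketch (all parameter values are illustrative choices) collects a sample of interspike intervals:

```python
import math
import random

def lif_isi(I, C, R, sigma, v_r, v_th, n_spikes, dt=1e-3, seed=11):
    """Euler-Maruyama simulation of the stochastic leaky IF model (4.2)
    with reset; returns the first n_spikes interspike intervals."""
    rng = random.Random(seed)
    v, t_last, t, isis = v_r, 0.0, 0.0, []
    sdt = math.sqrt(dt)
    while len(isis) < n_spikes:
        v += (I / C - v / (C * R)) * dt + sigma * sdt * rng.gauss(0.0, 1.0)
        t += dt
        if v > v_th:                 # spike: record the ISI and reset
            isis.append(t - t_last)
            t_last, v = t, v_r
    return isis

# suprathreshold regime: IR = 2 > v_th = 1, so firing is guaranteed
isis = lif_isi(I=2.0, C=1.0, R=1.0, sigma=0.5, v_r=0.0, v_th=1.0,
               n_spikes=200)
mean_isi = sum(isis) / len(isis)     # Monte Carlo estimate of E(T)
```

In the noiseless limit the chosen parameters give a deterministic firing time $CR\,\log\frac{IR-V_r}{IR-V_{th}}=\log 2\approx0.69$; the simulated mean ISI scatters around a value of this order.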
4.1. The distribution of T
Consider the stochastic differential equation
\[(4.3)\qquad dV_t=f(V_t)\,dt+\sigma(V_t)\,dW_t\,,\qquad V_0=V_r\,.\]
We assume that $f$ and $\sigma$ are Lipschitz continuous and that $\sigma(V)>0$ for all $V$. In particular, the above SDE has a unique strong solution for any initial condition. How can we compute the distribution of the first passage time $T$?
4.1.1. General concepts. Let us begin with some general remarks. Consider the stochastic differential equation (4.3) under the general assumption on the coefficients that for all $x\in\mathbb{R}$ there exists a unique strong solution $X_t(x)$, $t\ge0$, with initial condition $X_0(x)=x$. It turns out that for $\sigma\neq0$ the distribution of $X_t(x)$ has a density $p_t(x,y)$ for $t>0$, so that
\[E\big(g(X_t(x))\big)=\int p_t(x,y)\,g(y)\,dy\,.\]
Under additional assumptions on the coefficients $f$ and $\sigma$, $p_t(x,y)$ satisfies for all $x\in\mathbb{R}$ the forward Kolmogorov equation
\[(4.4)\qquad\partial_tp_t(x,y)=L^*_y\,p_t(x,y)\,,\quad t>0\,,\ y\in\mathbb{R}\,,\]
and for all $y\in\mathbb{R}$ the backward Kolmogorov equation
\[(4.5)\qquad\partial_tp_t(x,y)=L_x\,p_t(x,y)\,,\quad t>0\,,\ x\in\mathbb{R}\,.\]
Here,
\[L_xg(x)=\frac12\sigma^2(x)g_{xx}(x)+f(x)g_x(x)\]
is the generator of the stochastic differential equation (4.3), and
\[L^*_yg(y)=\frac12(\sigma^2g)_{yy}(y)-(fg)_y(y)\]
its formal adjoint (w.r.t. the Lebesgue measure).
The family of densities $p_t(x,y)$, $t > 0$, forms a semigroup w.r.t. convolution, i.e.,
\[ (4.6)\qquad \int p_s(x,y)\, p_t(y,z)\, dy = p_{s+t}(x,z) \qquad \forall\, x, z , \]
which is equivalent to the Markov property of the solution of (4.3).

We are interested in the first passage time
\[ \tau^{x_0}_b = \inf\{ t \ge 0 \mid X_t(x_0) > b \} \]
of the solution through the level b. (Of course $\tau^{x_0}_b \equiv 0$ if $b \le x_0$.) Let $G^{x_0}_b(t) = P(\tau^{x_0}_b \le t)$ denote its distribution function and $g^{x_0}_b(t)$ its density (if it exists). Due to the (strong) Markov property the first passage time density satisfies the following Volterra integral equation of the first kind:
\[ (4.7)\qquad p_t(x_0,y) = \int_0^t g^{x_0}_b(s)\, p_{t-s}(b,y)\, ds \qquad \forall\, y \ge b > x_0 . \]
The interpretation of this equation is as follows: given $y \ge b > x_0$, the solution $X_s(x_0)$ must have crossed the level b at least once if $X_t(x_0) = y$. If we condition on the first time $\tau^{x_0}_b$ this happens, the process starts afresh at the level b at that time and runs until it reaches its terminal value y at time t. The probability density for this is $p_{t-\tau^{x_0}_b}(b,y)$.
Example 4.1. (Explicit solutions)

(a) Brownian motion: $p_t(x,y) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{(x-y)^2}{2t}}$. In this case, we can simply integrate (4.7) over $y \ge b$ to obtain
\[ \int_b^\infty p_t(x_0,y)\, dy = \int_0^t g^{x_0}_b(s) \underbrace{\int_b^\infty p_{t-s}(b,y)\, dy}_{=\frac12}\, ds = \frac12 \int_0^t g^{x_0}_b(s)\, ds = \frac12\, G^{x_0}_b(t) , \]
which implies that
\[ G^{x_0}_b(t) = 2 \int_b^\infty p_t(x_0,y)\, dy = 2\, P(W_t > b - x_0) \quad\text{and}\quad g^{x_0}_b(t) = \frac{b-x_0}{\sqrt{2\pi t^3}}\, e^{-\frac{(b-x_0)^2}{2t}} . \]
(b) Brownian motion with constant drift I: $p_t(x,y) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{(x+It-y)^2}{2t}}$. In this case, we choose $y = b$, so that (4.7) reduces to
\[ p_t(x_0,b) = \int_0^t g^{x_0}_b(s)\, \frac{1}{\sqrt{2\pi (t-s)}}\, e^{-\frac{I^2}{2}(t-s)}\, ds , \]
and multiplying both sides with $e^{\frac{I^2 t}{2}}$, we obtain
\[ e^{\frac{I^2 t}{2}}\, p_t(x_0,b) = \int_0^t \frac{e^{\frac{I^2 s}{2}}\, g^{x_0}_b(s)}{\sqrt{2\pi (t-s)}}\, ds . \]
The right hand side is the Abel transform of $e^{\frac{I^2 s}{2}} g^{x_0}_b(s)$, which can be inverted with explicit inverse
\[ e^{\frac{I^2 t}{2}}\, g^{x_0}_b(t) = \sqrt{\frac{2}{\pi}}\, \frac{d}{dt} \int_0^t \frac{1}{\sqrt{t-s}}\, e^{\frac{I^2 s}{2}}\, p_s(x_0,b)\, ds . \]
Unfortunately, equation (4.7) determines the first passage time density only implicitly, and the literature covers a whole range of ideas on how to proceed with this fundamental equation in order to come up with reasonable approximations for $g^{x_0}_b$.
4.1.2. The mean firing rate E(T). The following theorem gives an explicit formula for the mean firing time in terms of the coefficients of the driving sde.

Theorem 4.2. Suppose that $P(T = \infty) = 0$. Then the mean firing time E(T) is given as
\[ (4.8)\qquad E(T) = \int_{-\infty}^{V_{th}} \int_{\max(x,V_r)}^{V_{th}} \exp\Bigl( -2 \int_{V_r}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr)\, dy\; \frac{2}{\sigma^2(x)} \exp\Bigl( 2 \int_{V_r}^{x} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr)\, dx . \]
Example 4.3. (i) Constant drift: $f(V) = I$, $\sigma^2(s) \equiv \sigma^2$. Then
\[ E(T) = \frac{2}{\sigma^2} \int_{-\infty}^{V_{th}} \int_{x \vee V_r}^{V_{th}} \exp\Bigl( -\frac{2I}{\sigma^2}(y - V_r) \Bigr) dy\; \exp\Bigl( \frac{2I}{\sigma^2}(x - V_r) \Bigr) dx \]
\[ = \frac{1}{I} \int_{-\infty}^{V_r} \Bigl[ 1 - \exp\Bigl( -\frac{2I}{\sigma^2}(V_{th} - V_r) \Bigr) \Bigr] \exp\Bigl( \frac{2I}{\sigma^2}(x - V_r) \Bigr) dx \]
\[ \quad + \frac{1}{I} \int_{V_r}^{V_{th}} \Bigl[ \exp\Bigl( -\frac{2I}{\sigma^2}(x - V_r) \Bigr) - \exp\Bigl( -\frac{2I}{\sigma^2}(V_{th} - V_r) \Bigr) \Bigr] \exp\Bigl( \frac{2I}{\sigma^2}(x - V_r) \Bigr) dx \]
\[ = \frac{\sigma^2}{2I^2} - \frac{1}{I} \underbrace{\int_{-\infty}^{V_{th}} \exp\Bigl( \frac{2I}{\sigma^2}(x - V_{th}) \Bigr) dx}_{=\sigma^2/2I} + \frac{V_{th} - V_r}{I} , \]
so that the first two terms cancel. Thus $E(T) = \frac{V_{th} - V_r}{I}$, independent of $\sigma^2$!
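The $\sigma$-independence of the mean firing time in the constant-drift case is easy to check by simulation. The following Monte Carlo sketch (our own illustration, not from the notes; parameter values are arbitrary) uses the Euler scheme of Section 4.1.5 below:

```python
import math
import random

def mean_firing_time(I, sigma, v_r=0.0, v_th=1.0, dt=1e-3, n_runs=1000, seed=0):
    """Monte Carlo estimate of E(T) for the perfect integrate-and-fire model
    dV = I dt + sigma dW, V_0 = v_r, with firing threshold v_th."""
    rng = random.Random(seed)
    sq = math.sqrt(dt)
    total = 0.0
    for _ in range(n_runs):
        v, t = v_r, 0.0
        while v <= v_th:                      # run until the first passage V > v_th
            v += I * dt + sigma * sq * rng.gauss(0.0, 1.0)
            t += dt
        total += t
    return total / n_runs

if __name__ == "__main__":
    # E(T) = (v_th - v_r)/I = 0.5, regardless of sigma
    for sigma in (0.2, 0.5, 1.0):
        print(sigma, mean_firing_time(I=2.0, sigma=sigma))
```

Up to discretization bias of order $\sqrt{dt}$ and Monte Carlo noise, all three estimates agree with $(V_{th}-V_r)/I$.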
(ii) Leaky IF-model: $f(V) = I - \theta V$, $\sigma^2(s) \equiv \sigma^2$. Then
\[ E(T) = \frac{2}{\sigma^2} \int_{-\infty}^{V_{th}} \int_{x \vee V_r}^{V_{th}} \exp\Bigl( -\frac{2I}{\sigma^2}(y - V_r) + \frac{\theta}{\sigma^2}(y - V_r)^2 \Bigr) dy\; \exp\Bigl( \frac{2I}{\sigma^2}(x - V_r) - \frac{\theta}{\sigma^2}(x - V_r)^2 \Bigr) dx , \]
which is not easy to evaluate! For $\theta = 1$ one obtains
\[ E(T) = \sqrt{\pi} \int_{(V_r - I)/\sigma}^{(V_{th} - I)/\sigma} e^{x^2} \bigl( 1 + \operatorname{erf}(x) \bigr)\, dx \]
with
\[ \operatorname{erf}(x) = \frac{1}{\sqrt{\pi}} \int_{-x}^{+x} e^{-s^2}\, ds . \]
The explicit formula for $\theta = 1$ has been obtained in the paper [FB02].
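The $\theta = 1$ formula can be evaluated numerically with the standard error function. The sketch below is our own illustration (function name and parameter values are ours, not from the notes); it uses `math.erf` and the trapezoidal rule:

```python
import math

def leaky_if_mean_T(I, sigma, v_r, v_th, n=4000):
    """Evaluate sqrt(pi) * int_{(v_r-I)/sigma}^{(v_th-I)/sigma} e^{x^2} (1 + erf(x)) dx,
    the mean firing time of dV = (I - V) dt + sigma dW (theta = 1), by the
    trapezoidal rule with n subintervals."""
    a = (v_r - I) / sigma
    b = (v_th - I) / sigma
    h = (b - a) / n

    def g(x):
        return math.exp(x * x) * (1.0 + math.erf(x))

    s = 0.5 * (g(a) + g(b)) + sum(g(a + k * h) for k in range(1, n))
    return math.sqrt(math.pi) * h * s

if __name__ == "__main__":
    # e.g. I = 1.5, sigma = 0.5, v_r = 0, v_th = 1 (arbitrary suprathreshold example)
    print(leaky_if_mean_T(1.5, 0.5, 0.0, 1.0))
```

For suprathreshold input ($I > V_{th}$) the integrand stays bounded; for strongly subthreshold input the factor $e^{x^2}$ grows quickly and a plain quadrature becomes inaccurate.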
4.1.3. Proof of Theorem 4.2. The following subsection is devoted to the derivation of formula (4.8). Along the way we will also introduce general concepts in the stochastic analysis of Ito processes, relating expectations to differential equations.

We will need the following additional notation: for $a < V_r$ and $b = V_{th}$ let
\[ T_{a,b} := \inf\{ t > 0 : V_t \notin [a,b] \} \]
be the first exit time of the solution V from [a, b].
Proposition 4.4. Let $h \in C^2([a,b])$ be a solution of
\[ (4.9)\qquad \frac{\sigma^2(x)}{2}\, h''(x) + f(x)\, h'(x) = 0, \qquad x \in [a,b]. \]
Then
\[ P\bigl( V_{T_{a,b}} = b \bigr) = \frac{h(V_0) - h(a)}{h(b) - h(a)} . \]

Proof. Ito's formula applied to $h(V_t)$, $t \le T_{a,b}$, implies
\[ dh(V_t) = h'(V_t)\, dV_t + \tfrac12 h''(V_t)\, d\langle V \rangle_t = h'(V_t)\bigl( f(V_t)\, dt + \sigma(V_t)\, dW_t \bigr) + \tfrac12 h''(V_t)\, \sigma^2(V_t)\, dt \]
\[ = h'(V_t)\sigma(V_t)\, dW_t + \underbrace{\Bigl( \frac{\sigma^2(V_t)}{2}\, h''(V_t) + f(V_t)\, h'(V_t) \Bigr)}_{=0 \text{ for } t \le T_{a,b}} dt = h'(V_t)\sigma(V_t)\, dW_t , \]
hence $h(V_t) = h(V_0) + \int_0^t h'(V_s)\sigma(V_s)\, dW_s$ for $t \le T_{a,b}$, or equivalently,
\[ h\bigl( V_{t \wedge T_{a,b}} \bigr) = h(V_0) + \underbrace{\int_0^{t \wedge T_{a,b}} h'(V_s)\sigma(V_s)\, dW_s}_{=: M_{t \wedge T_{a,b}}} . \]
The stochastic integral $M_{t \wedge T_{a,b}}$ is integrable and has mean zero, $E(M_{t \wedge T_{a,b}}) = 0$, by the optional sampling theorem A.8, hence
\[ E\bigl( h(V_{T_{a,b}}) \bigr) = \lim_{t \to \infty} E\bigl( h(V_{t \wedge T_{a,b}}) \bigr) = h(V_0) . \]
Now $h(V_{T_{a,b}})$ only takes the two values $h(a)$ and $h(b)$, so that
\[ E\bigl( h(V_{T_{a,b}}) \bigr) = h(a)\, \underbrace{P(V_{T_{a,b}} = a)}_{=1 - P(V_{T_{a,b}} = b)} + h(b)\, P(V_{T_{a,b}} = b) \]
and therefore
\[ P\bigl( V_{T_{a,b}} = b \bigr) = \frac{h(V_0) - h(a)}{h(b) - h(a)} . \qquad\square \]
An explicit solution h of (4.9) is given by
\[ h(x) = \int_{v_0}^{x} \exp\Bigl( -2 \int_{v_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy , \]
since $g = h'$ satisfies the linear differential equation $g'(x) = -\frac{2 f(x)}{\sigma^2(x)}\, g(x)$, therefore $g(x) = c \exp\bigl( -2 \int_{v_0}^{x} \frac{f(s)}{\sigma^2(s)}\, ds \bigr)$ for some constant c. Applying Proposition 4.4 yields the explicit formula
\[ P\bigl( V_{T_{a,b}} = b \bigr) = \frac{\displaystyle\int_{a}^{v_0} \exp\Bigl( -2 \int_{v_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy}{\displaystyle\int_{a}^{b} \exp\Bigl( -2 \int_{v_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy} . \]
Example 4.5. (i) Brownian motion with variance $\sigma^2$: $f = 0$, $\sigma^2 > 0$ constant. Then $h(x) = x - v_0$ is linear and independent of $\sigma^2$, and
\[ P\bigl( V_{T_{a,b}} = b \bigr) = \frac{v_0 - a}{b - a} \qquad (\to 1 \text{ as } a \to -\infty) . \]

(ii) Brownian motion with constant drift $f(V) \equiv I \ne 0$. Then
\[ h(x) = \int_{v_0}^{x} \exp\Bigl( -\frac{2I}{\sigma^2}(y - v_0) \Bigr) dy = \frac{\sigma^2}{2I} \Bigl( 1 - e^{-\frac{2I}{\sigma^2}(x - v_0)} \Bigr) , \]
which implies
\[ P\bigl( V_{T_{a,b}} = b \bigr) = \frac{1 - e^{-\frac{2I}{\sigma^2}(a - v_0)}}{e^{-\frac{2I}{\sigma^2}(b - v_0)} - e^{-\frac{2I}{\sigma^2}(a - v_0)}} = \frac{e^{-\frac{2I}{\sigma^2} v_0} - e^{-\frac{2I}{\sigma^2} a}}{e^{-\frac{2I}{\sigma^2} b} - e^{-\frac{2I}{\sigma^2} a}} \xrightarrow[a \to -\infty]{} \begin{cases} 1 & \text{if } I > 0, \\ e^{\frac{2I}{\sigma^2}(b - v_0)} & \text{if } I < 0 . \end{cases} \]

(iii) Brownian motion with affine linear drift $f(V) = I - \theta V$, $\theta \ne 0$. Then
\[ h(x) = \int_{v_0}^{x} \exp\Bigl( \frac{\theta}{\sigma^2}(y - v_0)^2 - \frac{2I}{\sigma^2}(y - v_0) \Bigr) dy , \]
which implies
\[ P\bigl( V_{T_{a,b}} = b \bigr) = \frac{\displaystyle\int_{a}^{v_0} \exp\Bigl( \frac{\theta}{\sigma^2}(y - v_0)^2 - \frac{2I}{\sigma^2}(y - v_0) \Bigr) dy}{\displaystyle\int_{a}^{b} \exp\Bigl( \frac{\theta}{\sigma^2}(y - v_0)^2 - \frac{2I}{\sigma^2}(y - v_0) \Bigr) dy} \xrightarrow[a \to -\infty]{} \begin{cases} 1 & \text{if } \theta > 0, \\[1ex] \dfrac{\int_{-\infty}^{v_0} \exp\bigl( \frac{\theta}{\sigma^2}(y - v_0)^2 - \frac{2I}{\sigma^2}(y - v_0) \bigr) dy}{\int_{-\infty}^{b} \exp\bigl( \frac{\theta}{\sigma^2}(y - v_0)^2 - \frac{2I}{\sigma^2}(y - v_0) \bigr) dy} & \text{if } \theta < 0 . \end{cases} \]
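The closed form in Example 4.5 (ii) is easy to cross-check by simulation. The following sketch is our own illustration (function names and the chosen parameters are ours); it compares the exit-probability formula with an Euler-scheme estimate:

```python
import math
import random

def exit_prob_formula(I, sigma, a, b, v0):
    """Closed form P(V_{T_{a,b}} = b) for dV = I dt + sigma dW (Example 4.5 (ii))."""
    c = 2.0 * I / sigma ** 2
    return (1.0 - math.exp(-c * (a - v0))) / (
        math.exp(-c * (b - v0)) - math.exp(-c * (a - v0)))

def exit_prob_mc(I, sigma, a, b, v0, n_runs=2000, dt=2e-3, seed=0):
    """Euler-scheme estimate of the probability of leaving [a, b] through b."""
    rng = random.Random(seed)
    sq = math.sqrt(dt)
    hits = 0
    for _ in range(n_runs):
        v = v0
        while a <= v <= b:
            v += I * dt + sigma * sq * rng.gauss(0.0, 1.0)
        hits += v > b          # exited upward
    return hits / n_runs

if __name__ == "__main__":
    print(exit_prob_formula(1.0, 1.0, -1.0, 1.0, 0.0),
          exit_prob_mc(1.0, 1.0, -1.0, 1.0, 0.0))
```

With $I = \sigma = 1$, $[a,b] = [-1,1]$, $v_0 = 0$ both numbers lie near $(1-e^{-2})/(1-e^{-4}) \approx 0.88$, up to the usual overshoot bias of the discrete scheme.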
The last proposition provides us with the exit distribution of $V_t$. To compute the mean exit time $E(T_{a,b})$ we will need the following:

Proposition 4.6. Let $u \in C^2([a,b])$ be a solution of
\[ (4.10)\qquad \frac{\sigma^2(x)}{2}\, u''(x) + f(x)\, u'(x) = -1, \qquad x \in [a,b] . \]
Then
\[ E(T_{a,b}) = u(V_0) - u(a) - P\bigl( V_{T_{a,b}} = b \bigr)\bigl( u(b) - u(a) \bigr) = -u(a)\, P(V_{T_{a,b}} = a) - u(b)\, P(V_{T_{a,b}} = b) \quad \text{if } u(V_0) = 0 . \]

Proof. Similarly to the proof of Proposition 4.4, Ito's formula implies that
\[ du(V_t) = u'(V_t)\sigma(V_t)\, dW_t + \underbrace{\Bigl( \frac{\sigma^2(V_t)}{2}\, u''(V_t) + f(V_t)\, u'(V_t) \Bigr)}_{=-1 \text{ for } t \le T_{a,b}} dt = u'(V_t)\sigma(V_t)\, dW_t - dt , \]
hence
\[ u\bigl( V_{t \wedge T_{a,b}} \bigr) = u(V_0) + \int_0^{t \wedge T_{a,b}} u'(V_s)\sigma(V_s)\, dW_s - t \wedge T_{a,b} , \]
therefore
\[ E(T_{a,b} \wedge t) = u(V_0) - E\bigl( u(V_{t \wedge T_{a,b}}) \bigr) , \]
which implies in the limit $t \uparrow \infty$
\[ E(T_{a,b}) = u(V_0) - E\bigl( u(V_{T_{a,b}}) \bigr) = u(V_0) - u(a)\, P\bigl( V_{T_{a,b}} = a \bigr) - u(b)\, P\bigl( V_{T_{a,b}} = b \bigr) . \qquad\square \]
An explicit solution u of (4.10) with $u(a) = u(b) = 0$ is given by
\[ u(x) = -\int_a^x \bigl( h(x) - h(y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy + \frac{h(x) - h(a)}{h(b) - h(a)} \int_a^b \bigl( h(b) - h(y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy , \]
therefore
\[ E(T_{a,b}) = u(V_0) = -\int_a^{V_0} \bigl( h(V_0) - h(y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy \]
\[ \qquad + \frac{h(V_0) - h(a)}{h(b) - h(a)} \int_a^b \bigl( h(b) - h(y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy . \]
Suppose now that
\[ \lim_{a \to -\infty} \frac{h(x) - h(a)}{h(b) - h(a)} = 1 , \]
which is the case if and only if $P(T < \infty) = 1$. Then
\[ E(T) = \lim_{a \to -\infty} E(T_{a,b}) = -\int_{-\infty}^{V_0} \bigl( h(V_0) - h(y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy + \int_{-\infty}^{b} \bigl( h(b) - h(y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy \]
\[ = \int_{-\infty}^{V_0} \bigl( h(b) - h(V_0) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy + \int_{V_0}^{b} \bigl( h(b) - h(y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy \]
\[ = \int_{-\infty}^{b} \bigl( h(b) - h(V_0 \vee y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy . \]
Finally, using $h(b) - h(V_0 \vee y) = \int_{V_0 \vee y}^{b} \exp\bigl( -2 \int_{V_0}^{s} \frac{f(t)}{\sigma^2(t)}\, dt \bigr) ds$ and $b = V_{th}$, we get the formula (4.8):
\[ E(T) = \int_{-\infty}^{V_{th}} \int_{V_0 \vee y}^{V_{th}} \exp\Bigl( -2 \int_{V_0}^{s} \frac{f(t)}{\sigma^2(t)}\, dt \Bigr) ds\; \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy , \]
and Theorem 4.2 is proven.
4.1.4. The distribution of T. The Laplace transform of a probability measure provides a method to compute the distribution of T in particular cases. The method is again based on the optional sampling theorem.

(i) Consider as an example the case
\[ dV_t = \sigma\, dW_t, \qquad \sigma \text{ constant.} \]

Proposition 4.7. Let $\lambda > 0$. Then
\[ E\bigl( e^{-\lambda T} \bigr) = e^{\frac{\sqrt{2\lambda}}{\sigma}(V_r - V_{th})} . \]
In particular, the distribution of T is
\[ P(T \in dt) = \underbrace{\frac{V_{th} - V_r}{\sigma \sqrt{2\pi t^3}} \exp\Bigl( -\frac{(V_{th} - V_r)^2}{2\sigma^2 t} \Bigr)}_{=: f(t)} dt, \qquad t > 0 . \]
Proof. Consider the process
\[ M_t := \exp\Bigl( -\lambda t + \frac{\sqrt{2\lambda}}{\sigma}\, V_t \Bigr) ; \]
then
\[ dM_t = \Bigl( -\lambda M_t + \frac12 \Bigl( \frac{\sqrt{2\lambda}}{\sigma} \Bigr)^2 \sigma^2 M_t \Bigr) dt + \sqrt{2\lambda}\, M_t\, dW_t = \sqrt{2\lambda}\, M_t\, dW_t . \]
It follows that $(M_t)$ is a martingale, thus $E(M_{T \wedge t}) = E(M_0)$ for all t, which implies that
\[ E\Bigl( \exp\Bigl( -\lambda (t \wedge T) + \frac{\sqrt{2\lambda}}{\sigma}\, V_{t \wedge T} \Bigr) \Bigr) = \exp\Bigl( \frac{\sqrt{2\lambda}}{\sigma}\, V_r \Bigr) . \]
Taking the limit $t \to \infty$ and using $P(T < \infty) = 1$, we obtain that
\[ E\Bigl( \exp\Bigl( -\lambda T + \frac{\sqrt{2\lambda}}{\sigma} \underbrace{V_T}_{=V_{th}} \Bigr) \Bigr) = \exp\Bigl( \frac{\sqrt{2\lambda}}{\sigma}\, V_r \Bigr) . \]
To verify the density of the distribution of T, it suffices to show that
\[ \int_0^\infty e^{-\lambda t} f(t)\, dt = e^{\frac{\sqrt{2\lambda}}{\sigma}(V_r - V_{th})}, \qquad \lambda > 0 . \]
To this end note that for any $\alpha, \beta > 0$
\[ (4.11)\qquad \int_0^\infty \frac{1}{\sqrt{t^3}}\, e^{-\alpha^2 t - \frac{\beta^2}{t}}\, dt = \frac{\sqrt{\pi}\, e^{-2\alpha\beta}}{\beta} . \]
This identity will be proven below.
Consequently,
\[ \int_0^\infty e^{-\lambda t} f(t)\, dt = \frac{V_{th} - V_r}{\sigma \sqrt{2\pi}} \int_0^\infty \frac{1}{\sqrt{t^3}}\, e^{-\lambda t - \frac{(V_{th} - V_r)^2}{2\sigma^2 t}}\, dt = \frac{V_{th} - V_r}{\sigma \sqrt{2\pi}} \cdot \frac{\sqrt{\pi}\, e^{-2\sqrt{\lambda}\, \frac{V_{th} - V_r}{\sqrt{2}\,\sigma}}}{\frac{V_{th} - V_r}{\sqrt{2}\,\sigma}} = e^{-\frac{\sqrt{2\lambda}}{\sigma}(V_{th} - V_r)} . \qquad\square \]
The distribution of T is called the Levy distribution.
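The Laplace transform identity just proved can be checked by direct numerical integration of the Levy density. The sketch below is our own illustration (function names are ours; the tail is cut at a finite `t_max`, which is harmless because of the factor $e^{-\lambda t}$):

```python
import math

def levy_density(t, dv, sigma):
    """Density f(t) of the first passage time of sigma*W through dv = Vth - Vr."""
    return dv / (sigma * math.sqrt(2.0 * math.pi * t ** 3)) \
        * math.exp(-dv ** 2 / (2.0 * sigma ** 2 * t))

def laplace_transform(lam, dv, sigma, t_max=50.0, n=50000):
    """Riemann-sum approximation of int_0^infty e^{-lam*t} f(t) dt (tail cut at t_max)."""
    h = t_max / n
    return h * sum(math.exp(-lam * k * h) * levy_density(k * h, dv, sigma)
                   for k in range(1, n + 1))

if __name__ == "__main__":
    # should match the closed form exp(-sqrt(2*lam)*dv/sigma)
    print(laplace_transform(1.0, 1.0, 1.0), math.exp(-math.sqrt(2.0)))
```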
Remark (Proof of (4.11)). Indeed, the Cauchy-Schlomilch transformation states that for any measurable nonnegative function f
\[ \int_0^\infty f\Bigl( \Bigl( \alpha t - \frac{\beta}{t} \Bigr)^2 \Bigr) dt = \frac{1}{\alpha} \int_0^\infty f(y^2)\, dy . \]
From this one can deduce the desired identity in two steps.

Step 1:
\[ \int_0^\infty \frac{1}{\sqrt{t}}\, e^{-\alpha^2 t - \frac{\beta^2}{t}}\, dt = \frac{\sqrt{\pi}}{\alpha}\, e^{-2\alpha\beta} . \]
Indeed,
\[ \int_0^\infty \frac{1}{\sqrt{t}}\, e^{-\alpha^2 t - \frac{\beta^2}{t}}\, dt = e^{-2\alpha\beta} \int_0^\infty \frac{1}{\sqrt{t}}\, e^{-\bigl( \alpha\sqrt{t} - \frac{\beta}{\sqrt{t}} \bigr)^2}\, dt = 2\, e^{-2\alpha\beta} \int_0^\infty e^{-\bigl( \alpha x - \frac{\beta}{x} \bigr)^2}\, dx = \frac{\sqrt{\pi}}{\alpha}\, e^{-2\alpha\beta} , \]
using the substitution $x = \sqrt{t}$, the Cauchy-Schlomilch transformation and $\int_0^\infty e^{-y^2}\, dy = \frac{\sqrt{\pi}}{2}$.

Step 2: Let $G(\beta) := \int_0^\infty \frac{1}{\sqrt{t}}\, e^{-\alpha^2 t - \frac{\beta^2}{t}}\, dt$, so that
\[ G'(\beta) = -2\beta \int_0^\infty \frac{1}{\sqrt{t^3}}\, e^{-\alpha^2 t - \frac{\beta^2}{t}}\, dt . \]
On the other hand, Step 1 implies that
\[ G'(\beta) = \frac{d}{d\beta}\, \frac{\sqrt{\pi}}{\alpha}\, e^{-2\alpha\beta} = -2\sqrt{\pi}\, e^{-2\alpha\beta} . \]
We finally arrive at the desired identity
\[ \int_0^\infty \frac{1}{\sqrt{t^3}}\, e^{-\alpha^2 t - \frac{\beta^2}{t}}\, dt = \frac{\sqrt{\pi}}{\beta}\, e^{-2\alpha\beta} . \]
(ii) In the next example we consider the case
\[ dV_t = I\, dt + \sigma\, dW_t, \qquad \sigma \ne 0,\ I > 0 \text{ constants.} \]

Proposition 4.8. Let $\lambda > 0$. Then
\[ E\bigl( e^{-\lambda T} \bigr) = e^{\frac{(V_{th} - V_r)\, I}{\sigma^2} \bigl[ 1 - \sqrt{1 + 2\lambda \frac{\sigma^2}{I^2}} \bigr]} = e^{\frac{V_{th} - V_r}{\sigma} \bigl[ \frac{I}{\sigma} - \sqrt{\frac{I^2}{\sigma^2} + 2\lambda} \bigr]} . \]
Proof. Similarly to the previous example, consider the process
\[ M_t = \exp\Bigl( -\frac{\alpha^2 t}{2} + \frac{\alpha}{\sigma}\, V_t \Bigr) . \]
Then Ito's formula implies that
\[ dM_t = \Bigl( -\frac{\alpha^2}{2} M_t + \frac12 \Bigl( \frac{\alpha}{\sigma} \Bigr)^2 \sigma^2 M_t \Bigr) dt + \frac{\alpha}{\sigma}\, M_t\, dV_t = \frac{\alpha}{\sigma}\, I\, M_t\, dt + \alpha\, M_t\, dW_t . \]
It follows that
\[ e^{-\frac{\alpha}{\sigma} I t}\, M_t, \qquad t \ge 0, \]
is a (local) martingale, and thus by the optional sampling theorem
\[ E\Bigl( e^{-\bigl( \frac{\alpha^2}{2} + \frac{\alpha}{\sigma} I \bigr)(t \wedge T) + \frac{\alpha}{\sigma} V_{t \wedge T}} \Bigr) = e^{\frac{\alpha}{\sigma} V_r} . \]
Taking the limit $t \to \infty$ and using $P(T < \infty) = 1$ (since $I > 0$) implies
\[ E\Bigl( e^{-\bigl( \frac{\alpha^2}{2} + \frac{\alpha}{\sigma} I \bigr) T} \Bigr) = e^{\frac{\alpha}{\sigma}(V_r - V_{th})} . \]
If we now choose $\alpha$ such that $\frac{\alpha^2}{2} + \frac{\alpha}{\sigma} I = \lambda$, i.e.,
\[ \alpha_{1/2} = -\frac{I}{\sigma} \pm \sqrt{ \Bigl( \frac{I}{\sigma} \Bigr)^2 + 2\lambda } , \]
and observe that we have to take the positive root, this implies
\[ E\bigl( e^{-\lambda T} \bigr) = e^{\frac{V_{th} - V_r}{\sigma} \bigl[ \frac{I}{\sigma} - \sqrt{ \bigl( \frac{I}{\sigma} \bigr)^2 + 2\lambda } \bigr]} . \qquad\square \]
The distribution of T is an inverse Gaussian distribution with parameters $\bigl( \frac{\Delta V}{I}, \bigl( \frac{\Delta V}{\sigma} \bigr)^2 \bigr)$, where $\Delta V = V_{th} - V_r$, i.e., a probability distribution with density
\[ f(t) = \frac{\Delta V}{\sigma}\, \frac{1}{\sqrt{2\pi t^3}}\, \exp\Bigl( -\frac{I^2}{\sigma^2}\, \frac{\bigl( t - \frac{\Delta V}{I} \bigr)^2}{2t} \Bigr) . \]
Indeed, note that
\[ \int_0^\infty e^{-\lambda t} f(t)\, dt = \frac{\Delta V}{\sigma}\, \frac{1}{\sqrt{2\pi}}\, e^{\frac{I \Delta V}{\sigma^2}} \int_0^\infty \frac{1}{\sqrt{t^3}}\, e^{-\alpha^2 t - \frac{\beta^2}{t}}\, dt = \frac{\Delta V}{\sigma}\, \frac{1}{\sqrt{2\pi}}\, \frac{\sqrt{2}\,\sigma}{\Delta V}\, \sqrt{\pi}\, e^{-2\sqrt{\lambda + \frac{I^2}{2\sigma^2}}\, \frac{\Delta V}{\sqrt{2}\,\sigma} + \frac{I \Delta V}{\sigma^2}} = e^{\frac{\Delta V}{\sigma} \bigl[ \frac{I}{\sigma} - \sqrt{\frac{I^2}{\sigma^2} + 2\lambda} \bigr]} . \]
To obtain the second equality we used (4.11) with $\alpha^2 = \lambda + \frac{I^2}{2\sigma^2}$ and $\beta = \frac{\Delta V}{\sqrt{2}\,\sigma}$.
Remark 4.9. Note that
\[ \lim_{I \to 0} e^{\frac{\Delta V}{\sigma} \bigl[ \frac{I}{\sigma} - \sqrt{\frac{I^2}{\sigma^2} + 2\lambda} \bigr]} = e^{-\frac{\sqrt{2\lambda}}{\sigma} \Delta V} \]
coincides with the Laplace transform of T in the previous example.
(iii) The leaky integrate-and-fire model. The Laplace transform of the first passage time of the leaky integrate-and-fire model
\[ dV_t = (I - \theta V_t)\, dt + \sigma\, dW_t \]
is more involved and no longer has a closed form solution in terms of elementary functions. It can be represented in terms of certain series expansions that we are not going to state here; instead we refer to the excellent survey paper [APP05].

A rather useful alternative representation of the first passage time distribution can be obtained in the particular case $I = 0$: let $X_t = V_t^2$; then Ito's formula implies that
\[ dX_t = 2 V_t\, dV_t + \sigma^2\, dt = \bigl( \sigma^2 - 2\theta V_t^2 \bigr) dt + 2\sigma V_t\, dW_t = \bigl( \sigma^2 - 2\theta X_t \bigr) dt + 2\sigma \sqrt{X_t}\, \underbrace{\operatorname{sgn}(V_t)\, dW_t}_{= d\widetilde{W}_t} . \]
Now observe that
\[ \widetilde{W}_t = \int_0^t \operatorname{sgn}(V_s)\, dW_s, \qquad t \ge 0, \]
is a continuous martingale with quadratic variation $\langle \widetilde{W} \rangle_t = t$. By Levy's characterisation of Brownian motion (see [Kle06], Theorem 25.28) it follows that $(\widetilde{W}_t)$ is a Brownian motion, hence $(X_t)$ is a weak solution of the SDE
\[ dX_t = \bigl( \sigma^2 - 2\theta X_t \bigr) dt + 2\sigma \sqrt{X_t}\, d\widetilde{W}_t . \]
Using this change of variables, the following representation of the first passage time distribution can be obtained:
Proposition 4.10. The density $f_T$ of T for the leaky integrate-and-fire model
\[ dV_t = -\theta V_t\, dt + dW_t, \qquad V_0 = V_r , \]
has the representation
\[ f_T(t) = e^{-\theta (V_{th}^2 - V_r^2 - t)/2} \cdot f^0_T(t) \cdot E\Bigl( \exp\Bigl( -\frac{\theta^2}{2} \int_0^t (r_s - V_r)^2\, ds \Bigr) \Bigr) . \]
Here,
- $f^0_T$ denotes the density of the firing time in the case $\theta = 0$, i.e.,
\[ f^0_T(t) = \frac{V_{th} - V_r}{\sqrt{2\pi t^3}} \exp\Bigl( -\frac{(V_{th} - V_r)^2}{2t} \Bigr) \]
(see Proposition 4.7),
- $(r_s)_{s \ge 0}$ is the 3-dimensional Bessel bridge from $V_r$ to $V_{th}$ in time t, i.e., the solution of the SDE
\[ dr_s = \Bigl( \frac{V_{th} - r_s}{t - s} + \frac{1}{r_s} \Bigr) ds + dW_s . \]
4.1.5. Numerical approximation of T. Using numerical approximation of the driving stochastic differential equation (see Section C.3 in Appendix C) we can approximate the distribution of T as follows.

Choose N (= number of runs, resp. samples). For $i = 1, \ldots, N$ construct the Euler approximation
\[ V^i_{t_{k+1}} = V^i_{t_k} + f\bigl( V^i_{t_k} \bigr)\, h + \sigma\bigl( V^i_{t_k} \bigr)\bigl( W^i_{t_{k+1}} - W^i_{t_k} \bigr), \qquad k = 0, 1, 2, \ldots, \]
until $V^i_{t_{k+1}} > V_{th}$, and set $T^i := t_{k+1}$.

Then $T^i$, $i = 1, \ldots, N$, can be considered independent approximate samples of T, hence the empirical distribution
\[ \mu^{(N)} := \frac{1}{N} \sum_{i=1}^{N} \delta_{T^i} \]
should converge weakly to the distribution of T as $N \uparrow \infty$. In fact, for any $f \in B_b(\mathbb{R}_+)$ we have
\[ f(T^1), f(T^2), \ldots \ \text{iid} \sim \mu, \qquad \mu = \text{distribution of } T , \]
hence by the strong law of large numbers (SLLN)
\[ \theta_N(f) := \int f\, d\mu^{(N)} = \frac{1}{N} \sum_{i=1}^{N} f(T^i) \xrightarrow{P\text{-a.s.}} E\bigl( f(T) \bigr) = \int f\, d\mu =: \theta(f) , \]
with standard deviation
\[ \theta_N(f) - \theta(f) \sim \sqrt{ \frac{\operatorname{Var}(f(T))}{N} } =: \frac{\sigma}{\sqrt{N}} , \]
since by the central limit theorem (CLT)
\[ \lim_{N \to \infty} P\Bigl( |\theta_N(f) - \theta(f)| \le \frac{c\,\sigma}{\sqrt{N}} \Bigr) = \frac{1}{\sqrt{2\pi}} \int_{-c}^{+c} e^{-\frac{x^2}{2}}\, dx . \]

Numerical illustrations. For comparison with the closed form representations of $f_T$ consider first the classical examples:

(i) $dV_t = \sigma\, dW_t$; in this case
\[ f_T(t)\, dt = P(T \in dt) = \frac{\Delta V}{\sigma}\, \frac{1}{\sqrt{2\pi t^3}} \exp\Bigl( -\frac{\Delta V^2}{2\sigma^2 t} \Bigr) dt, \qquad \Delta V = V_{th} - V_r \]
(Levy distribution).
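The sampling scheme described above is straightforward to implement. The following sketch is our own illustration of it (function names and parameters are ours; since the Levy case has infinite mean, runs are capped at a finite `t_max` and censored runs are recorded as infinite):

```python
import math
import random

def sample_passage_times(n_runs, f, sigma, v_r, v_th, dt=1e-3, t_max=20.0, seed=0):
    """Euler scheme of Section 4.1.5: simulate dV = f(V) dt + sigma(V) dW from v_r
    until the first passage V > v_th; runs not fired by t_max are recorded as inf."""
    rng = random.Random(seed)
    sq = math.sqrt(dt)
    samples = []
    for _ in range(n_runs):
        v, t = v_r, 0.0
        while v <= v_th and t < t_max:
            v += f(v) * dt + sigma(v) * sq * rng.gauss(0.0, 1.0)
            t += dt
        samples.append(t if v > v_th else math.inf)
    return samples

def empirical_cdf(samples, t):
    """mu^(N)((0, t]) -- the empirical distribution function at t."""
    return sum(1 for s in samples if s <= t) / len(samples)

if __name__ == "__main__":
    # case (i): dV = sigma dW, sigma = 1, Delta V = 1;
    # closed form: P(T <= t) = erfc(Delta V / (sigma * sqrt(2 t)))
    ts = sample_passage_times(800, lambda v: 0.0, lambda v: 1.0, 0.0, 1.0,
                              dt=2e-3, t_max=5.0)
    print(empirical_cdf(ts, 1.0), math.erfc(1.0 / math.sqrt(2.0)))
```

Discrete-time monitoring systematically misses crossings inside a time step, so the empirical CDF sits slightly below the closed form for finite dt.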
(ii) $dV_t = I\, dt + \sigma\, dW_t$; in this case
\[ f_T(t)\, dt = P(T \in dt) = \frac{\Delta V}{\sigma}\, \frac{1}{\sqrt{2\pi t^3}} \exp\Bigl( -\frac{(It - \Delta V)^2}{2\sigma^2 t} \Bigr) dt . \]
(iii) $dV_t = (I - \theta V_t)\, dt + \sigma\, dW_t$; in this case only a characterisation of $f_T$ as the solution of
\[ f_T(t) = -2\Phi(t) + 2 \int_0^t f_T(s)\, \Psi(t, s)\, ds \]
is known, where $\Phi$ and $\Psi$ are as in Theorem (??).
APPENDIX A

Martingales

The theory of martingales has been extremely useful and successful in the analysis of stochastic processes; it can be seen as a generalization of the theory of sums of independent random variables. We will summarize the parts of the theory that are used in the main text. For a more systematic treatment one can consult any textbook on stochastic analysis; above all we recommend the monograph [SV79].

Throughout the whole appendix, let $(\Omega, \mathcal{F}, P)$ be a fixed probability space, $I \subseteq \mathbb{R}_+$ any index set, e.g. $\mathbb{N}_0$, $[0,T]$ or $\mathbb{R}_+$ itself, and $(\mathcal{F}_t)_{t \in I}$ a filtration on $(\Omega, \mathcal{F})$, i.e., a family of sub-$\sigma$-algebras of $\mathcal{F}$ satisfying $\mathcal{F}_s \subseteq \mathcal{F}_t$ for $s \le t$, $s, t \in I$. In the context of stochastic processes, $\mathcal{F}_t$ is interpreted as the information that is available at time t.
Definition A.1. A family of random variables $(X_t)_{t \in I}$ that is P-integrable and $(\mathcal{F}_t)$-adapted, i.e. $X_t$ is $\mathcal{F}_t$-measurable for all t, is called
(i) a martingale, if $X_s = E(X_t | \mathcal{F}_s)$ for all $s \le t$;
(ii) a submartingale, if $X_s \le E(X_t | \mathcal{F}_s)$ for all $s \le t$ ("on average increasing");
(iii) a supermartingale, if $X_s \ge E(X_t | \mathcal{F}_s)$ for all $s \le t$ ("on average decreasing").
Example A.2. The most important examples of martingales are given as follows:

(i) Successive predictions of integrable random variables. Let $X \in L^1(P)$; then $X_t = E(X | \mathcal{F}_t)$, $t \in I$, is an $(\mathcal{F}_t)$-martingale, because for $s \le t$ the tower property of conditional expectations implies that
\[ E(X_t | \mathcal{F}_s) = E\bigl( E(X | \mathcal{F}_t) \,\big|\, \mathcal{F}_s \bigr) = E(X | \mathcal{F}_s) = X_s . \]

(ii) Centered sums of independent random variables. Let $Y_n \in L^1(P)$, $n \ge 1$, be independent random variables and let $\mathcal{F}_n = \sigma(Y_1, \ldots, Y_n)$ be the $\sigma$-algebra generated by the time-discrete process $Y_1, Y_2, Y_3, \ldots$. Then
\[ X_n := \sum_{k=1}^{n} \bigl( Y_k - E(Y_k) \bigr), \qquad n \ge 0, \]
is an $(\mathcal{F}_n)$-martingale, because
\[ E(X_{n+1} | \mathcal{F}_n) = \underbrace{E\bigl( Y_{n+1} - E(Y_{n+1}) \,\big|\, \mathcal{F}_n \bigr)}_{= E(Y_{n+1} - E(Y_{n+1})) = 0} + \underbrace{E(X_n | \mathcal{F}_n)}_{= X_n} = X_n . \]
Here we used that $E(Y_{n+1} | \mathcal{F}_n) = E(Y_{n+1})$, due to the independence of $Y_{n+1}$ from $\mathcal{F}_n$.

(iii) Martingale transform with previsible processes. Let $(\mathcal{F}_n)_{n \in \mathbb{N}_0}$ be a filtration, $(X_n)_{n \in \mathbb{N}_0}$ a martingale and $(V_n)_{n \in \mathbb{N}}$ previsible, i.e., $V_n$ is $\mathcal{F}_{n-1}$-measurable for all n. Then
\[ (V \cdot X)_n := X_0 + \sum_{k=1}^{n} V_k (X_k - X_{k-1}), \qquad n \in \mathbb{N}_0, \]
is again an $(\mathcal{F}_n)$-martingale, provided $V_k (X_k - X_{k-1})$ is P-integrable for all k, since
\[ E\bigl( (V \cdot X)_{n+1} \,\big|\, \mathcal{F}_n \bigr) = E\bigl( (V \cdot X)_n \,\big|\, \mathcal{F}_n \bigr) + E\bigl( V_{n+1} (X_{n+1} - X_n) \,\big|\, \mathcal{F}_n \bigr) = (V \cdot X)_n + V_{n+1}\, E(X_{n+1} - X_n | \mathcal{F}_n) = (V \cdot X)_n . \]
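The content of (iii) is the classical "you cannot beat a fair game with a previsible betting strategy": whatever $V$ we choose based on the past, $E((V \cdot X)_n) = E(X_0)$. A small simulation sketch of ours (the particular betting rule is an arbitrary previsible example):

```python
import random

def transform_endpoint(n_steps, rng):
    """One path of (V.X)_n for X a simple symmetric random walk (X_0 = 0) and the
    previsible bet V_k = 1 if X_{k-1} >= 0 else 2 (a function of the past only)."""
    x, vx = 0, 0.0
    for _ in range(n_steps):
        v = 1.0 if x >= 0 else 2.0            # previsible: depends on X_{k-1}
        step = 1 if rng.random() < 0.5 else -1
        vx += v * step                        # V_k * (X_k - X_{k-1})
        x += step
    return vx

def mean_endpoint(n_runs=10000, n_steps=50, seed=0):
    rng = random.Random(seed)
    return sum(transform_endpoint(n_steps, rng) for _ in range(n_runs)) / n_runs

if __name__ == "__main__":
    print(mean_endpoint())   # stays near E[(V.X)_0] = 0
```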
(iv) Martingales of Markov chains: see Chapter 2, Section 2.2.

(v) Martingales of Brownian motion. Let $(X_t)_{t \ge 0}$ be a Brownian motion defined on $(\Omega, \mathcal{F}, P)$. Let $\mathcal{F}^0_t := \sigma(X_s : s \le t)$ be the filtration generated by $(X_t)$ and
\[ \mathcal{F}_t := \bigcap_{s > t} \mathcal{F}^0_s \]
be the slightly larger right continuous filtration generated by $(\mathcal{F}^0_t)_{t \ge 0}$. Then we have the following proposition:

Proposition A.3. The following processes are martingales w.r.t. $(\mathcal{F}_t)_{t \ge 0}$:
(i) $(X_t)_{t \ge 0}$;
(ii) $(X_t^2 - t)_{t \ge 0}$;
(iii) $\bigl( \exp\bigl( \alpha X_t - \tfrac12 \alpha^2 t \bigr) \bigr)_{t \ge 0}$ for all $\alpha \in \mathbb{R}$.
The proof requires the following lemma.

Lemma A.4. Let $t \ge 0$, $h > 0$. Then $X_{t+h} - X_t$ is independent of $\mathcal{F}_t$.

Proof. By definition of Brownian motion, $X_{t+h} - X_t$ is independent of $\sigma(X_{t_1}, X_{t_2} - X_{t_1}, \ldots, X_{t_n} - X_{t_{n-1}}) = \sigma(X_{t_1}, X_{t_2}, \ldots, X_{t_n})$ for all $0 \le t_1 < t_2 < \cdots < t_n \le t$, which implies that $X_{t+h} - X_t$ is independent of $\mathcal{F}^0_t$. It follows in particular that $X_{t+h} - X_{t + \frac{1}{n}}$ is independent of $\mathcal{F}^0_{t + \frac{1}{n}} \supseteq \mathcal{F}_t$ for all $n \ge 1$. If $f \in C_b(\mathbb{R})$ (= all continuous and bounded functions on $\mathbb{R}$) and $\varphi$ is $\mathcal{F}_t$-measurable and bounded, this implies that
\[ E\bigl( f(X_{t+h} - X_t)\, \varphi \bigr) = \lim_{n \to \infty} E\bigl( f(X_{t+h} - X_{t + \frac{1}{n}})\, \varphi \bigr) = \lim_{n \to \infty} E\bigl( f(X_{t+h} - X_{t + \frac{1}{n}}) \bigr)\, E(\varphi) = E\bigl( f(X_{t+h} - X_t) \bigr)\, E(\varphi) , \]
and hence the assertion. $\square$
Proof (of Proposition A.3). (i) Similarly to Example A.2 (ii), we have
\[ E(X_t | \mathcal{F}_s) = E\bigl( X_s + (X_t - X_s) \,\big|\, \mathcal{F}_s \bigr) = \underbrace{E(X_s | \mathcal{F}_s)}_{= X_s} + \underbrace{E(X_t - X_s | \mathcal{F}_s)}_{= E(X_t - X_s) = 0} = X_s . \]
(ii)
\[ E(X_t^2 - t \,|\, \mathcal{F}_s) = E(X_t^2 - X_s^2 \,|\, \mathcal{F}_s) + X_s^2 - t = E\bigl( (X_t - X_s)^2 + 2 (X_t - X_s) X_s \,\big|\, \mathcal{F}_s \bigr) + X_s^2 - t \]
\[ = \underbrace{E\bigl( (X_t - X_s)^2 \bigr)}_{= t - s} + 2 X_s \underbrace{E(X_t - X_s | \mathcal{F}_s)}_{= E(X_t - X_s) = 0} + X_s^2 - t = X_s^2 - s . \]
(iii) Let $G^\alpha_t := \exp\bigl( \alpha X_t - \tfrac12 \alpha^2 t \bigr)$. Then
\[ E(G^\alpha_t | \mathcal{F}_s) = E\Bigl( \exp\bigl( \alpha (X_t - X_s) - \tfrac12 \alpha^2 (t - s) \bigr) \,\Big|\, \mathcal{F}_s \Bigr) \cdot G^\alpha_s = \underbrace{E\Bigl( \exp\bigl( \alpha \underbrace{(X_t - X_s)}_{\sim \mathcal{N}(0,\, t-s)} \bigr) \Bigr)}_{= \exp( \frac12 \alpha^2 (t-s) )}\, \exp\bigl( -\tfrac12 \alpha^2 (t-s) \bigr)\, G^\alpha_s = G^\alpha_s . \qquad\square \]
A.1. Maximal inequality

The martingale property of a stochastic process $(X_t)_{t \ge 0}$ means that $X_s$ is the best prediction of the process at all later times, since
\[ X_s = E(X_t | \mathcal{F}_s), \qquad s \le t . \]
One important statement exploiting this fact is Doob's maximal inequality:

Theorem A.5. Let $(X_t)_{t \ge 0}$ be a (right-)continuous martingale and let
\[ X^*_t := \sup_{0 \le s \le t} |X_s|, \qquad t \ge 0 . \]
Then
(i)
\[ P(X^*_t \ge R) \le \frac{1}{R}\, E\bigl( |X_t| ;\ X^*_t \ge R \bigr) \le \frac{1}{R}\, E(|X_t|) \qquad \forall\, R > 0 . \]
In particular, $\{X_s : s \in [0,t]\}$ is uniformly integrable for all $t > 0$.
(ii) If $X_t \in L^p(P)$ for some $p > 1$, then
\[ E\bigl( (X^*_t)^p \bigr)^{\frac{1}{p}} \le \frac{p}{p-1}\, E\bigl( |X_t|^p \bigr)^{\frac{1}{p}} . \]
Proof. (i) Fix $n \ge 1$. Then
\[ P\Bigl( \max_{0 \le k \le n} |X_{\frac{kt}{n}}| \ge R \Bigr) = \sum_{l=0}^{n} P\Bigl( |X_{\frac{lt}{n}}| \ge R ;\ \max_{0 \le k < l} |X_{\frac{kt}{n}}| < R \Bigr) \]
\[ \le \frac{1}{R} \sum_{l=0}^{n} E\Bigl( |X_{\frac{lt}{n}}| ;\ |X_{\frac{lt}{n}}| \ge R,\ \max_{0 \le k < l} |X_{\frac{kt}{n}}| < R \Bigr) \qquad \text{(Markov inequality)} \]
\[ \le \frac{1}{R} \sum_{l=0}^{n} E\Bigl( E\bigl( |X_t| \,\big|\, \mathcal{F}_{\frac{lt}{n}} \bigr) ;\ \underbrace{|X_{\frac{lt}{n}}| \ge R,\ \max_{0 \le k < l} |X_{\frac{kt}{n}}| < R}_{\in \mathcal{F}_{\frac{lt}{n}}} \Bigr) \qquad (|X_t| \text{ is a submartingale}) \]
\[ = \frac{1}{R} \sum_{l=0}^{n} E\Bigl( |X_t| ;\ |X_{\frac{lt}{n}}| \ge R,\ \max_{0 \le k < l} |X_{\frac{kt}{n}}| < R \Bigr) \qquad \text{(tower property)} \]
\[ = \frac{1}{R}\, E\Bigl( |X_t| ;\ \max_{0 \le k \le n} |X_{\frac{kt}{n}}| \ge R \Bigr) \le \frac{1}{R}\, E\bigl( |X_t| ;\ X^*_t \ge R \bigr) . \]
The right-continuity of $(X_t)$ now implies that $\max_{0 \le k \le n} |X_{\frac{kt}{n}}| \uparrow X^*_t$ as $n \to \infty$, hence for all $\varepsilon > 0$:
\[ P(X^*_t \ge R) \le \lim_{n \to \infty} P\Bigl( \max_{0 \le k \le n} |X_{\frac{kt}{n}}| \ge R - \varepsilon \Bigr) \le \frac{1}{R - \varepsilon}\, E\bigl( |X_t| ;\ X^*_t \ge R - \varepsilon \bigr) , \]
which implies (i) by taking the limit $\varepsilon \downarrow 0$.
(ii)
\[ E\bigl( (X^*_t)^p \bigr) = E\Bigl( p \int_0^{X^*_t} u^{p-1}\, du \Bigr) = E\Bigl( p \int_0^\infty 1_{\{u \le X^*_t\}}\, u^{p-1}\, du \Bigr) \]
\[ \overset{\text{Fubini}}{=} p \int_0^\infty \underbrace{E\bigl( 1_{\{u \le X^*_t\}} \bigr)}_{\le \frac{1}{u} E(|X_t| ;\, X^*_t \ge u) \text{ by (i)}} u^{p-1}\, du \le p \int_0^\infty E\bigl( |X_t| ;\ X^*_t \ge u \bigr)\, u^{p-2}\, du \]
\[ \overset{\text{Fubini}}{=} p\, E\Bigl( \underbrace{\int_0^{X^*_t} u^{p-2}\, du}_{= \frac{1}{p-1} (X^*_t)^{p-1}} \cdot |X_t| \Bigr) \le \frac{p}{p-1}\, E\bigl( |X_t|^p \bigr)^{\frac{1}{p}}\, E\bigl( (X^*_t)^p \bigr)^{\frac{p-1}{p}} . \]
In the last step we applied Holder's inequality with $q = \frac{p}{p-1}$. $\square$
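For $p = 2$ the inequality reads $E[(X^*_t)^2] \le 4\, E[X_t^2]$, which is easy to observe numerically. The sketch below is our own illustration, using a discretized Brownian path as the martingale:

```python
import math
import random

def doob_l2_check(n_paths=3000, n_steps=150, seed=0):
    """Estimate E[(X*_1)^2] and the Doob bound (p/(p-1))^2 E[X_1^2] = 4 E[X_1^2]
    for X a Brownian motion sampled on a grid with n_steps points."""
    rng = random.Random(seed)
    sq = math.sqrt(1.0 / n_steps)
    lhs = rhs = 0.0
    for _ in range(n_paths):
        x, running_max = 0.0, 0.0
        for _ in range(n_steps):
            x += sq * rng.gauss(0.0, 1.0)
            running_max = max(running_max, abs(x))   # X*_t = sup_{s<=t} |X_s|
        lhs += running_max ** 2
        rhs += x ** 2
    return lhs / n_paths, 4.0 * rhs / n_paths

if __name__ == "__main__":
    print(doob_l2_check())   # the first value lies below the second (Doob bound)
```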
A.2. Stopping times and optional sampling

Definition A.6. A mapping $T : \Omega \to I$ with
\[ \{ T \le t \} \in \mathcal{F}_t \qquad \forall\, t \in I \]
is called an $(\mathcal{F}_t)$-stopping time.

Example A.7. (i) For discrete time: first hitting times. Let $(X_n)_{n \in \mathbb{N}_0}$ be $\mathbb{R}^d$-valued and $(\mathcal{F}_n)$-adapted, $A \in \mathcal{B}(\mathbb{R}^d)$, and
\[ T_A(\omega) := \inf\{ n \ge 0 : X_n(\omega) \in A \}, \qquad \omega \in \Omega, \]
the first hitting time of A (with $\inf \emptyset = +\infty$). Then $T_A$ is an $(\mathcal{F}_n)$-stopping time, since
\[ \{ T_A \le m \} = \bigcup_{n=0}^{m} \{ X_n \in A \} . \]
(ii) In continuous time one has to assume further regularity on $(X_t)$ resp. A: let $(X_t)$ be a continuous $(\mathcal{F}_t)$-adapted process and $A \subseteq \mathbb{R}^d$ open; then
\[ \{ T_A \le t \} = \bigcup_{s \in [0,t] \cap \mathbb{Q}} \{ X_s \in A \} . \]
A particular case is the first passage time
\[ T_a := \inf\{ t \ge 0 : X_t > a \}, \]
i.e. $A = (a, \infty)$ in the previous example.

The most important statement in connection with stopping times is the following:

Theorem A.8 (Optional Sampling Theorem). Let $(X_t)_{t \ge 0}$ be a right-continuous $(\mathcal{F}_t)$-martingale, and S, T bounded $(\mathcal{F}_t)$-stopping times with $S \le T$. Then
\[ E(X_T | \mathcal{F}_S) = X_S . \]
In particular, for any $(\mathcal{F}_t)$-stopping time T:
• $(X_{T \wedge t})$ is an $(\mathcal{F}_{T \wedge t})$-martingale;
• $E(X_{T \wedge t}) = E(X_0)$ is constant w.r.t. time t.

Here we used the notation
\[ \mathcal{F}_T = \{ A \subseteq \Omega : A \cap \{ T \le t \} \in \mathcal{F}_t \ \forall\, t \in I \} , \]
denoting the $\sigma$-algebra at the stopping time T.
Exercise: Show that FT is indeed a σ-algebra.
Remark A.9. (i) If S, T are stopping times with $S \le T$, then $\mathcal{F}_S \subseteq \mathcal{F}_T$, because $A \in \mathcal{F}_S$ implies
\[ A \cap \{ T \le t \} = \bigl( \underbrace{A \cap \{ S \le t \}}_{\in \mathcal{F}_t} \bigr) \cap \{ T \le t \} \in \mathcal{F}_t , \]
using $\{ T \le t \} = \{ S \le t \} \cap \{ T \le t \}$. In particular, $(\mathcal{F}_{T \wedge t})_{t \in I}$ is again a filtration.
(ii) Let $I = \mathbb{N}_0$, $(X_n)$ be $(\mathcal{F}_n)$-adapted and T a stopping time. Then
\[ X_T(\omega) := X_{T(\omega)}(\omega) \quad \text{is } \mathcal{F}_T\text{-measurable,} \]
because
\[ \{ X_T \in A \} \cap \{ T \le m \} = \bigcup_{n=0}^{m} \underbrace{\{ X_n \in A,\ T = n \}}_{\in \mathcal{F}_n \subseteq \mathcal{F}_m} \in \mathcal{F}_m . \]
Theorem A.10 (Optional sampling theorem in discrete time). Let $I = \mathbb{N}_0$, $(X_n)$ be an $(\mathcal{F}_n)$-martingale, and T, S bounded $(\mathcal{F}_n)$-stopping times with $S \le T$. Then
\[ E(X_T | \mathcal{F}_S) = X_S . \]
In particular, for any $(\mathcal{F}_n)$-stopping time T:
• $(X_{T \wedge n})$ is an $(\mathcal{F}_{T \wedge n})$-martingale;
• $E(X_{T \wedge n}) = E(X_0)$ is constant w.r.t. (discrete) time n.
Lemma A.11. Let $(X_n)$, T, S be as in Theorem A.10. Then
\[ E(X_T) = E(X_S) . \]

Proof. Let $S \le T \le K$. Then
\[ X_T = X_S + \sum_{k=S+1}^{T} (X_k - X_{k-1}) = X_S + \sum_{k=1}^{K} \underbrace{1_{\{S < k \le T\}}}_{\in \mathcal{F}_{k-1}} (X_k - X_{k-1}) , \]
since $\{ S < k \le T \} = \{ S \le k-1 \} \cap \{ T > k-1 \}$. Thus
\[ E(X_T) = E(X_S) + \sum_{k=1}^{K} \underbrace{E\bigl( 1_{\{S < k \le T\}} (X_k - X_{k-1}) \bigr)}_{= E\bigl( 1_{\{S < k \le T\}}\, \underbrace{E(X_k - X_{k-1} \,|\, \mathcal{F}_{k-1})}_{=0} \bigr)} = E(X_S) . \qquad\square \]
Proof of Theorem A.10. Again, let $T \le K$. For $B \in \mathcal{F}_S$ let
\[ S_B := S\, 1_B + K\, 1_{B^c}, \qquad T_B := T\, 1_B + K\, 1_{B^c} . \]
Then $S_B, T_B$ are $(\mathcal{F}_n)$-stopping times, because
\[ \{ S_B \le n \} = \bigl( \underbrace{\{ S \le n \} \cap B}_{\in \mathcal{F}_n} \bigr) \cup \bigl( \underbrace{\{ K \le n \} \cap B^c}_{= \emptyset \text{ if } K > n, \ = B^c \in \mathcal{F}_n \text{ if } K \le n} \bigr) \]
and
\[ \{ T_B \le n \} = \bigl( \underbrace{\{ T \le n \} \cap B}_{\in \mathcal{F}_n} \bigr) \cup \bigl( \{ K \le n \} \cap B^c \bigr) . \]
The previous lemma implies
\[ E(X_S 1_B) + E(X_K 1_{B^c}) = E(X_{S_B}) = E(X_{T_B}) = E(X_T 1_B) + E(X_K 1_{B^c}) . \]
Therefore $E(X_S 1_B) = E(X_T 1_B)$. Since $B \in \mathcal{F}_S$ was arbitrary and $X_S$ is $\mathcal{F}_S$-measurable, we obtain $X_S = E(X_T | \mathcal{F}_S)$. $\square$
The proof of the optional sampling theorem in continuous time requires a suitable approximation of a bounded stopping time $T : \Omega \to [0, K]$ by finitely valued stopping times
\[ T_n(\omega) = \sum_{k=1}^{K \cdot 2^n} \frac{k}{2^n}\, 1_{[\frac{k-1}{2^n}, \frac{k}{2^n})}\bigl( T(\omega) \bigr) . \]
Clearly, $T_n(\omega) \downarrow T(\omega)$ for all $\omega \in \Omega$. For any (right-)continuous $(\mathcal{F}_t)$-adapted process $(X_t)$,
\[ X_T(\omega) := X_{T(\omega)}(\omega) \quad \text{is } \mathcal{F}_T\text{-measurable.} \]
Indeed,
\[ X(t, \omega) = \lim_{n \to \infty} \underbrace{\sum_{k=1}^{\infty} X_{\frac{k-1}{2^n}}(\omega)\, 1_{[\frac{k-1}{2^n}, \frac{k}{2^n})}(t)}_{=: X^{(n)}(t, \omega), \ (\mathcal{F}_t)\text{-adapted}} . \]
Clearly,
\[ X^{(n)}_T\, 1_{\{T \le t\}}(\omega) = \sum_{k=1}^{\infty} X_{\frac{k-1}{2^n} \wedge t}(\omega)\, \underbrace{1_{[\frac{k-1}{2^n} \wedge t,\, \frac{k}{2^n} \wedge t)}\bigl( T(\omega) \bigr)}_{\in \mathcal{F}_{\frac{k}{2^n} \wedge t} \subseteq \mathcal{F}_t} \quad \forall\, t , \]
so that $X^{(n)}_T$ is $\mathcal{F}_T$-measurable, and thus $X_T = \lim_{n \to \infty} X^{(n)}_T$ is $\mathcal{F}_T$-measurable too.
Proof of Theorem A.8. Let $T \le K \in \mathbb{N}$. Let $\mathcal{G}$ be the set of all $(\mathcal{F}_t)$-stopping times S with $S \le K$ and
\[ X_S = E(X_K | \mathcal{F}_S) . \]
By Theorem A.10, $S \in \mathcal{G}$ for all finitely valued S. In addition,
\[ \{ X_S : S \in \mathcal{G} \} \quad\text{is uniformly integrable,} \]
since
\[ E\bigl( |X_S| ;\ \underbrace{|X_S| \ge R}_{\in \mathcal{F}_S} \bigr) = E\bigl( |E(X_K | \mathcal{F}_S)| ;\ |X_S| \ge R \bigr) \le E\bigl( |X_K| ;\ |X_S| \ge R \bigr) \le E\bigl( |X_K| ;\ X^*_K \ge R \bigr) \xrightarrow[R \uparrow \infty]{} 0 \]
uniformly in $S \in \mathcal{G}$, using Doob's maximal inequality. For general S, T let
\[ S_n(\omega) = \sum_{k=1}^{K \cdot 2^n} \frac{k}{2^n}\, 1_{(\frac{k-1}{2^n}, \frac{k}{2^n}]}\bigl( S(\omega) \bigr) \downarrow S(\omega), \qquad T_n(\omega) = \sum_{k=1}^{K \cdot 2^n} \frac{k}{2^n}\, 1_{(\frac{k-1}{2^n}, \frac{k}{2^n}]}\bigl( T(\omega) \bigr) \downarrow T(\omega) . \]
Then $\lim_{n \to \infty} X_{S_n} = X_S$, $\lim_{n \to \infty} X_{T_n} = X_T$, $S_n \le T_n \le K$, and both limits also hold in $L^1(P)$ because of uniform integrability; thus
\[ X_S = E(X_S | \mathcal{F}_S) = \lim_{n \to \infty} E(X_{S_n} | \mathcal{F}_S) = \lim_{n \to \infty} E\bigl( E(X_{T_n} | \mathcal{F}_{S_n}) \,\big|\, \mathcal{F}_S \bigr) = \lim_{n \to \infty} E(X_{T_n} | \mathcal{F}_S) = E(X_T | \mathcal{F}_S) , \]
using $\mathcal{F}_S \subseteq \mathcal{F}_{S_n}$. $\square$
Remark A.12. The conclusion of Theorem A.10 (resp. Theorem A.8) does not hold in general for unbounded T. Example: let $S_n = X_1 + \cdots + X_n$ be the symmetric random walk, $P(X_k = \pm 1) = \frac12$, $(X_k)$ iid, and
\[ T := \min\{ n \ge 1 : S_n = +1 \} < \infty \quad P\text{-a.s.} \]
Then $E(S_T) = 1 \ne E(S_0) = 0$.
Corollary A.13. Let $(X_t)_{t \ge 0}$ be a continuous $(\mathcal{F}_t)$-martingale and T an $(\mathcal{F}_t)$-stopping time such that $(X_{T \wedge k})_{k \ge 1}$ is uniformly integrable (e.g. bounded in k). Then
\[ E(X_T) = E(X_0) . \]

Proof. $T \wedge k \uparrow T$, hence $X_{T \wedge k} \to X_T$ (P-a.s.) and thus
\[ E(X_T) = \lim_{k \to \infty} E(X_{T \wedge k}) = E(X_0) . \qquad\square \]
APPENDIX B

Brownian motion and stochastic integration

Brownian motion is certainly the most important stochastic process in continuous time, used to describe diffusion processes. It is named after the Scottish botanist Robert Brown, who first described the irregular motion of pollen grains suspended in liquid, which was later explained by Albert Einstein in his paper "Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen" (1905) by random collisions with molecules of the liquid. This process also appeared a little earlier in the thesis entitled "Théorie de la spéculation" by Louis Bachelier in the year 1900, in a mathematical finance context. Norbert Wiener then provided the first rigorous mathematical construction of the process in the year 1923.

Definition B.1. A Brownian motion (BM) (with starting point 0) is an $\mathbb{R}$-valued stochastic process $(W_t)_{t \ge 0}$ on an underlying probability space $(\Omega, \mathcal{F}, P)$ having the following properties:
(a) $W_0 = 0$ P-a.s.
(b) For $0 \le t_0 < \cdots < t_{n+1}$ the increments
\[ W_{t_{i+1}} - W_{t_i} \qquad (i = 0, 1, \ldots, n) \]
are independent and $\mathcal{N}(0, t_{i+1} - t_i)$-distributed.

The BM $(W_t)_{t \ge 0}$ is called continuous if the trajectory (or path) $t \mapsto W_t(\omega)$ is continuous for all $\omega \in \Omega$.
Brownian motion can be seen as the continuum limit description of fluctuations in sums of independent identically distributed (iid) random variables. Indeed, suppose that $X_1, X_2, \ldots$ are iid with mean zero and finite variance $\sigma^2 > 0$. Then
\[ M_n(t) := \frac{1}{\sqrt{n}} \sum_{i=1}^{\lfloor nt \rfloor} X_i, \qquad t \ge 0, \]
defines a right continuous stochastic process. The classical central limit theorem now implies that $M_n(t)$ converges in distribution to the normal distribution $\mathcal{N}(0, t\sigma^2)$ for all $t \ge 0$. But even more: for any finite $0 \le t_0 < t_1 < \ldots < t_n$ the increments
\[ M_n(t_{k+1}) - M_n(t_k) = \frac{1}{\sqrt{n}} \sum_{i=\lfloor n t_k \rfloor + 1}^{\lfloor n t_{k+1} \rfloor} X_i, \qquad k = 0, \ldots, n-1, \]
are independent partial sums, and each of these partial sums converges in distribution towards the normal distribution $\mathcal{N}(0, (t_{k+1} - t_k)\sigma^2)$.

So quite naturally, the finite dimensional distributions of Brownian motion arise as the finite dimensional distributions of a rescaled limit of the sums $M_n$ of iid random variables. To further obtain the convergence of $M_n$ as processes towards Brownian motion, observe that $M_n(t)$ is a martingale w.r.t. the filtration $\mathcal{F}^n_t := \sigma(X_k : k \le \lfloor nt \rfloor)$ generated by $X_1, X_2, \ldots$. It therefore suffices to verify the conditions of the martingale central limit theorem. To this end we simplify our assumptions and assume that the random variables $X_k$ are bounded by K. Then
\[ \sup_{0 \le s \le t} |M_n(s) - M_n(s-)| \le \frac{K}{\sqrt{n}} \to 0 \qquad (n \to \infty) . \]
To identify the limiting behavior of the variance process note that $M_n^2(t)$ is a submartingale with variance process $\frac{1}{n} \sum_{k=1}^{\lfloor nt \rfloor} E(X_k^2)$, i.e.
\[ M_n^2(t) - \frac{1}{n} \sum_{k=1}^{\lfloor nt \rfloor} E(X_k^2) = M_n^2(t) - \frac{\sigma^2}{n} \lfloor tn \rfloor, \qquad t \ge 0, \]
is again a martingale, since
\[ E\bigl( M_n^2(t+s) \,\big|\, \mathcal{F}^n_s \bigr) = E\bigl( M_n^2(t+s) - M_n^2(s) \,\big|\, \mathcal{F}^n_s \bigr) + M_n^2(s) = E\bigl( (M_n(t+s) - M_n(s))^2 \,\big|\, \mathcal{F}^n_s \bigr) + M_n^2(s) \]
\[ = \frac{1}{n}\, E\Bigl( \sum_{k,l=\lfloor sn \rfloor + 1}^{\lfloor (t+s)n \rfloor} X_k X_l \,\Big|\, \mathcal{F}^n_s \Bigr) + M_n^2(s) = \frac{1}{n}\, E\Bigl( \sum_{k,l=\lfloor sn \rfloor + 1}^{\lfloor (t+s)n \rfloor} X_k X_l \Bigr) + M_n^2(s) \]
\[ = \frac{1}{n}\, E\Bigl( \sum_{k=\lfloor sn \rfloor + 1}^{\lfloor (t+s)n \rfloor} X_k^2 \Bigr) + M_n^2(s) = \frac{\sigma^2}{n} \bigl( \lfloor (t+s)n \rfloor - \lfloor sn \rfloor \bigr) + M_n^2(s) , \]
and therefore
\[ E\Bigl( M_n^2(t+s) - \frac{\sigma^2}{n} \lfloor (t+s)n \rfloor \,\Big|\, \mathcal{F}^n_s \Bigr) = M_n^2(s) - \frac{\sigma^2}{n} \lfloor sn \rfloor . \]
Since $\lim_{n \to \infty} \frac{\sigma^2}{n} \lfloor tn \rfloor = \sigma^2 t$, this assumption of the martingale central limit theorem is also satisfied, so that $M_n$ indeed converges weakly on the Skorohod space towards a Brownian motion W.

The partial sums $X_1 + \ldots + X_n$, $n = 1, 2, \ldots$, are also called a (discrete time) random walk, and the above construction implies that Brownian motion can be approximated with the help of suitably rescaled random walks. Since Brownian motion is the universal limit, independent of the particular distribution of the increments, the above convergence is also called the invariance principle of Brownian motion. The distribution $P \circ W^{-1}$ of a continuous Brownian motion W is called the Wiener measure.
B.1. Construction of BM

We have already seen the construction of Brownian motion as a limit of rescaled random walks. An alternative construction is provided by the so-called Wiener-Levy construction, which describes BM as a random superposition of deterministic paths as follows. Let
• $(Y_n)$ be independent, $\mathcal{N}(0,1)$-distributed,
• $(e_n)$ be an orthonormal basis of $L^2([0,T])$, e.g.,
\[ e_0(t) = \tfrac{1}{\sqrt{T}}, \qquad e_{2k-1}(t) = \sqrt{\tfrac{2}{T}}\, \sin\bigl( \tfrac{2\pi}{T}\, k t \bigr), \qquad e_{2k}(t) = \sqrt{\tfrac{2}{T}}\, \cos\bigl( \tfrac{2\pi}{T}\, k t \bigr), \qquad k \ge 1 . \]
Then
\[ W_t(\omega) := \sum_{n=0}^{\infty} Y_n(\omega) \int_0^t e_n(s)\, ds \]
is a (continuous) BM, i.e. in our particular example
\[ W_t(\omega) = \frac{t}{\sqrt{T}}\, Y_0(\omega) + \sum_{n=1}^{\infty} \Bigl( Y_{2n-1}(\omega)\, \sqrt{\tfrac{2}{T}} \int_0^t \sin\bigl( \tfrac{2\pi}{T}\, n s \bigr) ds + Y_{2n}(\omega)\, \sqrt{\tfrac{2}{T}} \int_0^t \cos\bigl( \tfrac{2\pi}{T}\, n s \bigr) ds \Bigr) . \]
B.2. Elementary properties of BM

Proposition B.2 (symmetries and scaling properties). Let $(W_t)_{t \ge 0}$ be a continuous BM. Then the following stochastic processes are continuous BM too:
(i) $\widetilde{W}_t := -W_t$, $t \ge 0$ (symmetry);
(ii) $\widetilde{W}_t := c\, W_{t/c^2}$, $t \ge 0$, for any $c \in \mathbb{R} \setminus \{0\}$ (scaling invariance).

Proof. (i) Obvious. (ii) Continuity is obvious. Next observe that for $0 = t_0 < t_1 < \cdots < t_{n+1}$ the increments
\[ c \bigl( W_{t_{i+1}/c^2} - W_{t_i/c^2} \bigr), \qquad i = 0, \ldots, n, \]
are clearly independent and $\mathcal{N}(0, t_{i+1} - t_i)$-distributed. $\square$

Proposition B.3 (mean, covariance). Let $(W_t)_{t \ge 0}$ be a BM. Then
(i) $m(t) := E(W_t) = 0$;
(ii) $C(s,t) := \operatorname{Cov}(W_s, W_t) = s \wedge t := \min\{s, t\}$.

Proof. (i) is obvious. For the proof of (ii) note that for $s \le t$, by independence of the increments,
\[ \operatorname{Cov}(W_s, W_t) = E(W_s W_t) = E\bigl( W_s (W_t - W_s) \bigr) + E\bigl( W_s^2 \bigr) = E(W_s)\, E(W_t - W_s) + s = s = s \wedge t . \qquad\square \]
B.3. Path properties of BM

Proposition B.4 (Strong Law of Large Numbers, SLLN). Let $(W_t)$ be a continuous BM. Then
\[ \lim_{t \to \infty} \frac{W_t}{t} = 0 \qquad P\text{-a.s.} \]
In particular, the growth of a typical Brownian path is sublinear.

Proof. First note that $(|W_t|)_{t \ge 0}$ is a submartingale. Fix $\varepsilon > 0$ arbitrary. Then
\[ P\Bigl( \sup_{t \in [2^n, 2^{n+1}]} \frac{|W_t|}{t} \ge \varepsilon \Bigr) \le P\Bigl( \sup_{t \in [2^n, 2^{n+1}]} |W_t| \ge \varepsilon\, 2^n \Bigr) \le \frac{1}{\varepsilon\, 2^n}\, E\bigl( |W_{2^{n+1}}| \bigr) \le \frac{\sqrt{2}}{\varepsilon}\, 2^{-\frac{n}{2}} \]
by the maximal inequality, using $E(|W_{2^{n+1}}|) \le E(W_{2^{n+1}}^2)^{\frac12} = 2^{\frac{n+1}{2}}$. Hence
\[ \sum_{n=0}^{\infty} P\Bigl( \sup_{t \in [2^n, 2^{n+1}]} \frac{|W_t|}{t} \ge \varepsilon \Bigr) \le \sum_{n=0}^{\infty} \frac{\sqrt{2}}{\varepsilon}\, 2^{-n/2} < \infty , \]
therefore $\limsup_{t \to \infty} \frac{|W_t|}{t} \le \varepsilon$ P-a.s. by the Borel-Cantelli lemma. Since $\varepsilon > 0$ was arbitrary, the assertion follows. $\square$
The quadratic variation of Brownian paths. Recall that for a function $f : [0,\infty) \to \mathbb{R}$ its total variation on $[0,t]$ (if it exists) is defined as
\[ \operatorname{Var}_{[0,t]} f := \lim_{n \to \infty} \sum_{t_i \in \tau_n,\, t_i \le t} |f_{t_{i+1}} - f_{t_i}| . \]
Here $\tau_n := \{t_0, t_1, \ldots\}$ denotes a partition of $[0,\infty)$, i.e. $0 = t_0 < t_1 < \ldots$, with $\tau_n \subseteq \tau_{n+1}$ for all n and mesh size
\[ |\tau_n| := \max_{t_i \in \tau_n} |t_{i+1} - t_i| \to 0 . \]
Instead of summing up the absolute values of the increments of f we can also sum up the squares of the increments:
\[ \langle f \rangle_t := \lim_{n \to \infty} \sum_{t_i \in \tau_n,\, t_i \le t} \bigl( f(t_{i+1}) - f(t_i) \bigr)^2 , \]
which is called the quadratic variation of f on $[0,t]$ along $(\tau_n)$. It turns out that the latter notion is the crucial one for Brownian motion, according to the following theorem:

Theorem B.5 (Levy). Let $(W_t)_{t \ge 0}$ be a continuous BM. Then
\[ S^n_t := \sum_{t_i \in \tau_n,\, t_i \le t} \bigl( W_{t_{i+1}} - W_{t_i} \bigr)^2 \xrightarrow[n \to \infty]{} t \quad \text{in } L^2(P) \text{ and } P\text{-a.s.} \]
In particular: $\langle W \rangle_t = t$ P-a.s.
Proof (of the $L^2$-convergence). Recall that the increments $W_{t_{i+1}} - W_{t_i}$, $t_i \in \tau_n$, are independent and $\mathcal{N}(0, t_{i+1} - t_i)$-distributed, so that
\[ E(S^n_t) = \sum_{t_i \in \tau_n,\, t_i \le t} E\bigl( (W_{t_{i+1}} - W_{t_i})^2 \bigr) = \sum_{t_i \in \tau_n,\, t_i \le t} (t_{i+1} - t_i) \longrightarrow t . \]
In addition,
\[ \operatorname{Var}(S^n_t) = \sum_{t_i \in \tau_n,\, t_i \le t} \operatorname{Var}\bigl( (W_{t_{i+1}} - W_{t_i})^2 \bigr) = \sum_{t_i \in \tau_n,\, t_i \le t} \Bigl( E\bigl( (W_{t_{i+1}} - W_{t_i})^4 \bigr) - \underbrace{E\bigl( (W_{t_{i+1}} - W_{t_i})^2 \bigr)^2}_{= (t_{i+1} - t_i)^2} \Bigr) \]
\[ = \sum_{t_i \in \tau_n,\, t_i \le t} \bigl( 3 (t_{i+1} - t_i)^2 - (t_{i+1} - t_i)^2 \bigr) = 2 \sum_{t_i \in \tau_n,\, t_i \le t} (t_{i+1} - t_i)^2 \le 2 |\tau_n| \sum_{t_i \in \tau_n,\, t_i \le t} (t_{i+1} - t_i) \to 0 . \]
Thus $S^n_t - E(S^n_t) \to 0$ in $L^2(P)$, i.e., $S^n_t \to t$ in $L^2(P)$. The a.s.-convergence is obtained with a suitable martingale convergence result. $\square$
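Levy's theorem is easy to observe numerically: summing squared Brownian increments over a partition of mesh $t/n$ returns values close to t. A minimal sketch of ours:

```python
import math
import random

def quadratic_variation(n, t=1.0, seed=0):
    """S^n_t = sum of squared BM increments over a partition of [0, t] with mesh t/n.
    Each increment is N(0, t/n), sampled directly."""
    rng = random.Random(seed)
    sq = math.sqrt(t / n)
    return sum((sq * rng.gauss(0.0, 1.0)) ** 2 for _ in range(n))

if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, quadratic_variation(n))   # approaches t = 1 as the mesh refines
```

The variance of $S^n_t$ is $2t^2/n$, matching the bound in the proof above, so the fluctuations around t shrink like $n^{-1/2}$.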
The last theorem implies in particular that the typical path of a BM is of unbounded variation on every interval (in fact, Brownian paths are nowhere differentiable). But since its quadratic variation exists, it is possible to extend the classical differential calculus to Brownian paths, which leads to the so-called Ito calculus (see below).
B.4. The Ito-Integral
We want to consider stochastic differential equations of the following type:
$$\dot X(t) = B(X(t)) + C(X(t))\, \dot W_t, \qquad X(0) = \xi, \tag{B.1}$$
where $(W_t)$ is a Brownian motion. Writing (B.1) in integral form we obtain the integral equation
$$X(t) = \xi + \int_0^t B(X(s))\, ds + \int_0^t C(X(s))\, \dot W_s\, ds, \qquad t\ge0,$$
and substituting $\dot W_s\, ds = \frac{dW_s}{ds}\, ds = dW_s$ formally, we can write this in the form
$$X(t) = \xi + \int_0^t B(X(s))\, ds + \int_0^t C(X(s))\, dW_s.$$
In this equation, the third term on the right-hand side is a stochastic integral. We will sketch the construction of $\int_0^t \Phi_s\, dW_s$ for a reasonable class of stochastic integrands $(\Phi_s)_{s\ge0}$ in the following. To this end let $(\Omega,\mathcal F, P)$ be a probability space and $(W_t)$ a Brownian motion with associated right-continuous filtration $(\mathcal F_t)_{t\ge0}$.
Step 1: Integration of elementary processes

Let $\mathcal E$ be the set of all elementary processes $\Phi$ of the type
$$\Phi_t(\omega) := \sum_{i=0}^n \Phi_{t_i}(\omega)\, 1_{(t_i, t_{i+1}]}(t) \qquad (0\le t_0 < t_1 < \dots < t_n),$$
where $\Phi_{t_i}$ is $\mathcal F_{t_i}$-measurable. For $\Phi\in\mathcal E$ we define the stochastic integral as
$$\Big(\int_0^t \Phi_s\, dW_s\Big)(\omega) := \sum_{i:\, t_i<t} \Phi_{t_i}(\omega)\big(W_{t_{i+1}\wedge t}(\omega) - W_{t_i}(\omega)\big).$$

Lemma B.6. (i) $\int_0^\cdot \Phi_s\, dW_s \in \mathcal M^2_c$, where
$$\mathcal M^2_c := \{\text{all continuous martingales, bounded in } L^2(P)\}$$
$$= \Big\{M = (M_t)_{t\ge0} : M \text{ martingale},\ t\mapsto M_t(\omega)\ P\text{-a.s. continuous},\ \|M\|^2 := \sup_{t\ge0} E(M_t^2) < \infty\Big\}.$$
(ii) (Wiener-Ito-isometry)
$$E\Big(\Big(\int_0^t \Phi_s\, dW_s\Big)^2\Big) = E\Big(\int_0^t \Phi_s^2\, ds\Big).$$
Proof. (i) Adaptedness and continuity are clear, and so is square-integrability. For the martingale property, let $t' < t$ and $t'_i < t' \le t'_{i+1}$, $t_i < t \le t_{i+1}$:
$$E\Big(\int_0^t \Phi_s\, dW_s\,\Big|\,\mathcal F_{t'}\Big) = E\Big(\int_0^{t'} \Phi_s\, dW_s\,\Big|\,\mathcal F_{t'}\Big) + E\Big(\int_{t'}^{t} \Phi_s\, dW_s\,\Big|\,\mathcal F_{t'}\Big)$$
$$= \int_0^{t'} \Phi_s\, dW_s + E\Big(\Phi_{t'_i}\big(W_{t'_{i+1}\wedge t} - W_{t'_{i+1}\wedge t'}\big) + \sum_{i:\, t'_{i+1}\le t_i<t} \Phi_{t_i}\big(W_{t_{i+1}\wedge t} - W_{t_i}\big)\,\Big|\,\mathcal F_{t'}\Big)$$
$$= \int_0^{t'} \Phi_s\, dW_s + \Phi_{t'_i}\,\underbrace{E\big(W_{t'_{i+1}\wedge t} - W_{t'_{i+1}\wedge t'}\,\big|\,\mathcal F_{t'}\big)}_{=0} + \sum_{i:\, t'_{i+1}\le t_i<t} E\Big(\Phi_{t_i}\,\underbrace{E\big(W_{t_{i+1}\wedge t} - W_{t_i}\,\big|\,\mathcal F_{t_i}\big)}_{=0}\,\Big|\,\mathcal F_{t'}\Big)$$
$$= \int_0^{t'} \Phi_s\, dW_s.$$
(ii) For the proof of the Wiener-Ito-isometry, note that for $t_i < t_j < t$:
$$E\big(\Phi_{t_i}(W_{t_{i+1}\wedge t} - W_{t_i})\,\Phi_{t_j}(W_{t_{j+1}\wedge t} - W_{t_j})\big) = E\Big(\Phi_{t_i}(W_{t_{i+1}\wedge t} - W_{t_i})\,\Phi_{t_j}\,\underbrace{E\big(W_{t_{j+1}\wedge t} - W_{t_j}\,\big|\,\mathcal F_{t_j}\big)}_{=0}\Big) = 0,$$
and
$$E\big(\big(\Phi_{t_i}(W_{t_{i+1}\wedge t} - W_{t_i})\big)^2\big) = E\Big(\Phi_{t_i}^2\,\underbrace{E\big((W_{t_{i+1}\wedge t} - W_{t_i})^2\,\big|\,\mathcal F_{t_i}\big)}_{=t_{i+1}\wedge t - t_i}\Big) = E\big(\Phi_{t_i}^2 (t_{i+1}\wedge t - t_i)\big).$$
Consequently,
$$E\Big(\Big(\int_0^t \Phi_s\, dW_s\Big)^2\Big) = E\Big(\Big(\sum_{i:\, t_i<t} \Phi_{t_i}\big(W_{t_{i+1}\wedge t} - W_{t_i}\big)\Big)^2\Big) = \sum_{i:\, t_i<t}\ \sum_{j:\, t_j<t} E\big(\Phi_{t_i}(W_{t_{i+1}\wedge t} - W_{t_i})\,\Phi_{t_j}(W_{t_{j+1}\wedge t} - W_{t_j})\big)$$
$$= \sum_{i:\, t_i<t} E\big(\Phi_{t_i}^2 (t_{i+1}\wedge t - t_i)\big) = E\Big(\int_0^t \Phi_s^2\, ds\Big).$$
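The Wiener-Ito-isometry can be illustrated by Monte Carlo for a concrete elementary integrand, here $\Phi_s = W_{t_i}$ on $(t_i, t_{i+1}]$ (a choice made up for this illustration, and adapted since $W_{t_i}$ is $\mathcal F_{t_i}$-measurable): both sides of the isometry then approximate $\int_0^1 s\, ds = \tfrac12$.

```python
import math
import random

random.seed(1)

def elementary_integral(n=50, t=1.0):
    """One sample of I = sum_i Phi_{t_i}(W_{t_{i+1}} - W_{t_i}) together with
    int_0^t Phi_s^2 ds, for the elementary integrand Phi_s = W_{t_i} on (t_i, t_{i+1}]."""
    dt = t / n
    w, integral, phi_sq = 0.0, 0.0, 0.0
    for _ in range(n):
        dw = random.gauss(0.0, math.sqrt(dt))
        integral += w * dw        # Phi_{t_i} = W_{t_i}, fixed before the increment
        phi_sq += w * w * dt      # Riemann sum for int_0^t Phi_s^2 ds
        w += dw
    return integral, phi_sq

paths = 20_000
lhs = rhs = 0.0
for _ in range(paths):
    i, p = elementary_integral()
    lhs += i * i
    rhs += p
lhs, rhs = lhs / paths, rhs / paths
print(round(lhs, 3), round(rhs, 3))   # both sides approximate int_0^1 s ds = 1/2
```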
Step 2: Extension of the set of admissible integrands to $\bar{\mathcal E}$

The lemma implies in particular the following isometry
$$\Big\|\int_0^\cdot \Phi_s\, dW_s\Big\|^2 = \sup_{t\ge0} E\Big(\Big(\int_0^t \Phi_s\, dW_s\Big)^2\Big) = E\Big(\int_0^\infty \Phi_s^2\, ds\Big)$$
between the spaces
$$\mathcal E \subseteq L^2\big(\bar\Omega, \bar{\mathcal F}, P\otimes dt\big) \to \mathcal M^2_c,$$
where $\bar\Omega = \Omega\times[0,\infty)$ and $\bar{\mathcal F} = \mathcal F\otimes\mathcal B([0,\infty))$. We can therefore extend the stochastic integral to the space
$$\bar{\mathcal E} := \text{closure of } \mathcal E \text{ in } L^2(P\otimes dt),$$
i.e., for $\Phi\in\bar{\mathcal E}$ we define
$$\int_0^t \Phi_s\, dW_s := \lim_{n\to\infty} \int_0^t \Phi^{(n)}_s\, dW_s \quad (\text{in } L^2(P)),$$
where $(\Phi^{(n)})\subseteq\mathcal E$ is an arbitrary sequence of elementary processes converging to $\Phi$ in $L^2(P\otimes dt)$. It can be shown that the limit $\int_0^\cdot \Phi_s\, dW_s$ is again a continuous square-integrable martingale.

How large is the class of admissible integrands obtained this way? To answer this, one has to characterize the set $\bar{\mathcal E}$. It can be shown that
$$\bar{\mathcal E} \supseteq L^2\big(\bar\Omega, \mathcal P, P\otimes dt\big) \quad\text{and}\quad \bar{\mathcal E} = L^2\big(\bar\Omega, \bar{\mathcal P}, P\otimes dt\big),$$
where
$$\mathcal P = \sigma\big(A_s\times]s,t] : A_s\in\mathcal F_s,\ 0\le s\le t\big) \qquad \text{("predictable $\sigma$-algebra")}$$
$$= \sigma\big(H = (H_t)_{t\ge0} : H \text{ left-continuous, } (\mathcal F_t)\text{-adapted}\big) = \sigma\big(H = (H_t)_{t\ge0} : H \text{ continuous, } (\mathcal F_t)\text{-adapted}\big),$$
and $\bar{\mathcal P}$ denotes the completion of $\mathcal P$ w.r.t. the measure $P\otimes dt$. Hence any predictable process $\Phi = (\Phi_t)_{t\ge0}$ with
$$E\Big(\int_0^\infty \Phi_t^2\, dt\Big) < \infty$$
can be integrated against BM, and the stochastic integral $\int_0^t \Phi_s\, dW_s$, $t\ge0$, satisfies:

(i) $\int_0^\cdot \Phi_s\, dW_s \in \mathcal M^2_c$
(ii) $E\Big(\Big(\int_0^t \Phi_s\, dW_s\Big)^2\Big) = E\Big(\int_0^t \Phi_s^2\, ds\Big)$ (Wiener-Ito-isometry)

(Proof: see e.g. Øksendal, Stochastic Differential Equations, Chapter 3.)
Step 3: Localization

The definition of $\int_0^t \Phi_s\, dW_s$ can be extended to the space of integrands
$$\mathcal N = \Big\{\Phi : \bar\Omega\to\mathbb R \text{ predictable with } P\Big(\int_0^t \Phi_s^2\, ds < \infty\ \forall t\ge0\Big) = 1\Big\}$$
using the stopping times
$$T_n := \inf\Big\{t\ge0 : \int_0^t \Phi_s^2\, ds \ge n\Big\} \nearrow +\infty \quad P\text{-a.s.}$$
Indeed, the process $\Phi^{(n)}_s := \Phi_s\, 1_{\{s\le T_n\}}$ is predictable and square-integrable with
$$E\Big(\int_0^\infty \big(\Phi^{(n)}_s\big)^2\, ds\Big) = E\Big(\int_0^{T_n} \Phi_s^2\, ds\Big) \le n,$$
hence $\int_0^t \Phi^{(n)}_s\, dW_s$ is well-defined and consistent in the sense that
$$\int_0^t \Phi^{(n)}_s\, dW_s = \int_0^t \Phi^{(n+1)}_s\, dW_s \quad \text{on } \{t\le T_n\}.$$
We can therefore write
$$\int_0^t \Phi^{(n)}_s\, dW_s =: \int_0^{t\wedge T_n} \Phi_s\, dW_s,$$
and $\int_0^t \Phi_s\, dW_s$ satisfies:

(i) $\int_0^{t\wedge T_n} \Phi_s\, dW_s \in \mathcal M^2_c$
(ii) $E\Big(\Big(\int_0^{t\wedge T_n} \Phi_s\, dW_s\Big)^2\Big) = E\Big(\int_0^{t\wedge T_n} \Phi_s^2\, ds\Big)$
B.5. The Ito-formula
Recall the classical chain rule for differentiable functions $f, X_\cdot$:
$$df(X_t) = f'(X_t)\,\underbrace{\dot X_t\, dt}_{=dX_t} = f'(X_t)\, dX_t,$$
which may be written in integral form as
$$f(X_t) - f(X_0) = \int_0^t f'(X_s)\, dX_s = \lim_{n\to\infty} \sum_{t_i\in\tau_n,\, t_i\le t} f'(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big). \tag{B.2}$$
We want to deduce a similar formula for $X_\cdot$ given by a Brownian path. The main difficulty consists in the fact that if $X_\cdot$ is a Brownian path, it is of unbounded variation, so the Riemann sum in (B.2) need not converge pointwise (only in $L^2(P)$). However, if $f\in C^2(\mathbb R)$, we can use, instead of the linear approximation
$$f(X_{t_{i+1}}) - f(X_{t_i}) \approx f'(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big),$$
the quadratic approximation
$$f(X_{t_{i+1}}) - f(X_{t_i}) \approx f'(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big) + \tfrac12 f''(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big)^2.$$
More precisely, let $\tau_n$ be a sequence of partitions of $[0,t]$; then
$$f(X_t) - f(X_0) = \sum_{t_i\in\tau_n} f(X_{t_{i+1}}) - f(X_{t_i})$$
$$= \sum_{t_i\in\tau_n} f'(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big) + \tfrac12 f''(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big)^2 + \underbrace{\tfrac12\sum_{t_i\in\tau_n}\big(f''(\theta^{(n)}_{t_i}) - f''(X_{t_i})\big)\big(X_{t_{i+1}} - X_{t_i}\big)^2}_{=:R_n},$$
where $\theta^{(n)}_{t_i}\in[X_{t_i}, X_{t_{i+1}}]$ and
$$|R_n| \le \tfrac12\,\underbrace{\sup_{|r-s|\le|\tau_n|}\big|f''(X_r) - f''(X_s)\big|}_{\to 0 \text{ as } |\tau_n|\to0}\ \cdot\ \underbrace{\sum_{t_i\in\tau_n}\big(X_{t_{i+1}} - X_{t_i}\big)^2}_{\to\langle X\rangle_t = t}.$$
The first factor converges to 0 since $s\mapsto f''(X_s)$ is uniformly continuous on $[0,t]$. Since
$$\lim_{n\to\infty} \tfrac12\sum_{t_i\in\tau_n} f''(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big)^2 = \tfrac12\int_0^t f''(X_s)\, ds,$$
also the limit
$$\lim_{n\to\infty} \sum_{t_i\in\tau_n} f'(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big) = \int_0^t f'(X_s)\, dX_s$$
exists for any Brownian path, and we obtain Ito's formula:
$$f(X_t) = f(X_0) + \int_0^t f'(X_s)\, dX_s + \tfrac12\int_0^t f''(X_s)\, ds. \tag{B.3}$$
If in addition $f$ depends on time, $f\in C^{1,2}([0,\infty)\times\mathbb R)$, then a similar Taylor approximation,
$$f(t_{i+1}, X_{t_{i+1}}) - f(t_i, X_{t_i}) = f(t_{i+1}, X_{t_{i+1}}) - f(t_i, X_{t_{i+1}}) + f(t_i, X_{t_{i+1}}) - f(t_i, X_{t_i})$$
$$= \partial_t f(t_i, X_{t_{i+1}})(t_{i+1} - t_i) + \partial_x f(t_i, X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big) + \tfrac12\partial_{xx} f(t_i, X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big)^2 + R_n,$$
yields in the limit the time-dependent Ito-formula:
$$f(t, X_t) = f(0, X_0) + \int_0^t \partial_s f(s, X_s)\, ds + \int_0^t \partial_x f(s, X_s)\, dX_s + \tfrac12\int_0^t \partial_{xx} f(s, X_s)\, ds. \tag{B.4}$$
Example B.7. Let $(X_t)$ be a BM with $X_0 = 0$.
(i)
$$X_t^2 = \underbrace{X_0^2}_{=0} + 2\int_0^t X_s\, dX_s + \tfrac12\int_0^t 2\, ds = 2\int_0^t X_s\, dX_s + t;$$
in particular, $X_t^2 - t = 2\int_0^t X_s\, dX_s$ is a martingale!
(ii)
$$\exp\Big(\lambda X_t - \frac{\lambda^2}{2}t\Big) = \underbrace{\exp\Big(\lambda X_0 - \frac{\lambda^2}{2}\cdot 0\Big)}_{=1} + \Big(-\frac{\lambda^2}{2}\Big)\int_0^t \exp\Big(\lambda X_s - \frac{\lambda^2}{2}s\Big)\, ds$$
$$+ \lambda\int_0^t \exp\Big(\lambda X_s - \frac{\lambda^2}{2}s\Big)\, dX_s + \frac12\lambda^2\int_0^t \exp\Big(\lambda X_s - \frac{\lambda^2}{2}s\Big)\, ds$$
$$= 1 + \lambda\int_0^t \exp\Big(\lambda X_s - \frac{\lambda^2}{2}s\Big)\, dX_s,$$
which is a martingale too.
(iii)
$$X_t^m = \underbrace{X_0^m}_{=0} + m\int_0^t X_s^{m-1}\, dX_s + \frac{m(m-1)}{2}\int_0^t X_s^{m-2}\, ds, \qquad m\ge2.$$
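The martingale identities in (i) and (ii) can be sanity-checked by Monte Carlo, assuming (as in the examples) that $X$ is a standard BM started at $0$: at $t = 1$ the empirical means of $X_t^2 - t$ and $\exp(\lambda X_t - \tfrac{\lambda^2}{2}t)$ should sit near the starting values $0$ and $1$. The value of $\lambda$ and the sample size are arbitrary.

```python
import math
import random

random.seed(2)

lam, t, n = 0.5, 1.0, 200_000
sq_sum, exp_sum = 0.0, 0.0
for _ in range(n):
    w = random.gauss(0.0, math.sqrt(t))              # X_t ~ N(0, t), X_0 = 0
    sq_sum += w * w - t                              # sample of X_t^2 - t
    exp_sum += math.exp(lam * w - 0.5 * lam * lam * t)
print(round(sq_sum / n, 3), round(exp_sum / n, 3))   # near 0 and near 1
```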
In the next step, we generalize the Ito formula to the class of Ito-processes:
Let (Wt) be a Brownian motion, (Ft) the associated right-continuous filtration.
Definition B.8. A one-dimensional Ito process is a stochastic process $(Y_t)$ of the form
$$Y_t = Y_0 + \int_0^t u_s\, ds + \int_0^t v_s\, dW_s, \qquad t\ge0$$
(or in infinitesimal form: $dY_t = u_t\, dt + v_t\, dW_t$), where
• $u_s, v_s$ are predictable processes
• $P\Big(\int_0^t |u_s|\, ds < \infty,\ \int_0^t v_s^2\, ds < \infty\ \forall t\ge0\Big) = 1$
Quadratic variation of Ito-processes

Similar to BM, Ito processes have continuous quadratic variation. Given two Ito processes $Y^1$ and $Y^2$, their quadratic variation and covariation are defined by
$$\langle Y^j\rangle_t := \lim_{|\tau_n|\to0} \sum_{t_i\in\tau_n} \big(Y^j_{t_{i+1}} - Y^j_{t_i}\big)^2, \qquad j = 1,2,$$
$$\langle Y^1, Y^2\rangle_t := \lim_{|\tau_n|\to0} \sum_{t_i\in\tau_n} \big(Y^1_{t_{i+1}} - Y^1_{t_i}\big)\big(Y^2_{t_{i+1}} - Y^2_{t_i}\big).$$
Lemma B.9. (i) (Cauchy-Schwarz inequality) $|\langle Y^1, Y^2\rangle_t| \le \langle Y^1\rangle_t^{1/2}\,\langle Y^2\rangle_t^{1/2}$
(ii) $\langle Y^1, Y^2\rangle_t = \frac12\big(\langle Y^1 + Y^2\rangle_t - \langle Y^1\rangle_t - \langle Y^2\rangle_t\big)$.
(iii) Let $(A_t)$ be continuous and of bounded variation. Then $\langle Y^1 + A\rangle_t = \langle Y^1\rangle_t$.

Proof (of (iii)). We know that $\langle A\rangle_t = 0$. Hence
$$\langle Y^1 + A\rangle_t = \langle Y^1 + A, Y^1 + A\rangle_t = \langle Y^1\rangle_t + 2\underbrace{\langle Y^1, A\rangle_t}_{|\cdot|\le\langle Y^1\rangle_t^{1/2}\langle A\rangle_t^{1/2} = 0} + \underbrace{\langle A\rangle_t}_{=0} = \langle Y^1\rangle_t.$$
Proposition B.10. Let $dY_t = v_t\, dW_t$ be a stochastic integral. Then
$$\langle Y\rangle_t = \int_0^t v_s^2\, ds.$$
Proof. First suppose that $v_t\equiv v$, so that $Y_t = v\cdot W_t$. Then
$$\langle Y\rangle_t = \lim_{|\tau_n|\to0} \sum_{t_i\in\tau_n}\big(vW_{t_{i+1}} - vW_{t_i}\big)^2 = v^2 t = \int_0^t v_s^2\, ds.$$
Next assume that
$$v_t = \sum_{j=1}^m h_{t_j}\, 1_{]t_j, t_{j+1}]}, \qquad t_{m+1} = t.$$
Since $v_t$ is constant on $]t_j, t_{j+1}]$,
$$\langle Y\rangle_{t_{j+1}} - \langle Y\rangle_{t_j} = h_{t_j}^2 (t_{j+1} - t_j) = \int_{t_j}^{t_{j+1}} v_s^2\, ds,$$
hence $\langle Y\rangle_t = \int_0^t v_s^2\, ds$ in this case. The general case is obtained by approximation of $v$ with elementary processes $v^n_s$ satisfying
$$P\Big(\int_0^t (v^n_s - v_s)^2\, ds \to 0,\ n\to\infty\Big) = 1.$$
Corollary B.11. Let
$$dY_t = \underbrace{u_t\, dt}_{=:dA_t} + v_t\, dW_t.$$
Then
$$\langle Y\rangle_t = \int_0^t v_s^2\, ds.$$
Proposition B.12. Let
$$dY^1 = u^1\, dt + v^1\, dW, \qquad dY^2 = u^2\, dt + v^2\, dW$$
be two Ito processes. Then
$$\langle Y^1, Y^2\rangle_t = \int_0^t v^1_s v^2_s\, ds.$$
Proof.
$$\langle Y^1, Y^2\rangle_t = \frac12\big(\langle Y^1 + Y^2\rangle_t - \langle Y^1\rangle_t - \langle Y^2\rangle_t\big) = \frac12\Big(\int_0^t (v^1_s + v^2_s)^2\, ds - \int_0^t (v^1_s)^2\, ds - \int_0^t (v^2_s)^2\, ds\Big) = \int_0^t v^1_s v^2_s\, ds.$$
Theorem B.13 (1-dim. Ito-formula). Let $(Y_t)$ be an Ito process of the type
$$dY_t = u_t\, dt + v_t\, dW_t.$$
Let $f\in C^{1,2}([0,\infty)\times\mathbb R)$; then $(f(t, Y_t))$ is an Ito process too, with representation
$$df(t, Y_t) = \partial_t f(t, Y_t)\, dt + \partial_x f(t, Y_t)\, dY_t + \tfrac12\partial_{xx} f(t, Y_t)\, d\langle Y\rangle_t$$
$$= \partial_t f(t, Y_t)\, dt + \partial_x f(t, Y_t)\, u_t\, dt + \partial_x f(t, Y_t)\, v_t\, dW_t + \tfrac12\partial_{xx} f(t, Y_t)\, v_t^2\, dt$$
$$= \Big(\partial_t f(t, Y_t) + \partial_x f(t, Y_t)\, u_t + \tfrac12\partial_{xx} f(t, Y_t)\, v_t^2\Big)\, dt + \partial_x f(t, Y_t)\, v_t\, dW_t,$$
using $dY_t = u_t\, dt + v_t\, dW_t$ and $d\langle Y\rangle_t = v_t^2\, dt$.
The proof requires certain preliminaries:

Lemma B.14 (two simple stochastic differentials).
(i) $d(W_t^2) = 2W_t\, dW_t + dt$
(ii) $d(tW_t) = W_t\, dt + t\, dW_t$
Theorem B.15 (Ito's product rule). Suppose that
$$dY_1 = u_1\, dt + v_1\, dW, \qquad dY_2 = u_2\, dt + v_2\, dW,$$
with $P\big(\int_0^t u_i^2 + v_i^2\, ds < \infty\ \forall t\ge0\big) = 1$, $i = 1,2$. Then:
$$d(Y_1 Y_2) = Y_1\, dY_2 + Y_2\, dY_1 + d\langle Y_1, Y_2\rangle = Y_1\, dY_2 + Y_2\, dY_1 + v_1 v_2\, dt. \tag{B.5}$$
Proof. First assume that
$$Y_1(0) = Y_2(0) = 0, \qquad u_i(t)\equiv u_i, \quad v_i(t)\equiv v_i,$$
so that $Y_i(t) = u_i t + v_i W_t$. Then
$$\int_0^t Y_2\, dY_1 + \int_0^t Y_1\, dY_2 + \int_0^t v_1 v_2\, ds = \int_0^t Y_2 u_1 + Y_1 u_2\, ds + \int_0^t Y_2 v_1 + Y_1 v_2\, dW_s + \int_0^t v_1 v_2\, ds$$
$$= \int_0^t u_2 u_1 s + v_2 u_1 W_s + u_1 u_2 s + v_1 u_2 W_s\, ds + \int_0^t u_2 v_1 s + v_2 v_1 W_s + u_1 v_2 s + v_1 v_2 W_s\, dW_s + \int_0^t v_1 v_2\, ds$$
$$= u_1 u_2 t^2 + (u_1 v_2 + u_2 v_1)\Big[\int_0^t W_s\, ds + \int_0^t s\, dW_s\Big] + 2 v_1 v_2\underbrace{\int_0^t W_s\, dW_s}_{=\frac12(W_t^2 - t)} + v_1 v_2 t.$$
According to the last lemma this can be simplified to
$$= u_1 u_2 t^2 + (u_1 v_2 + u_2 v_1)\, t W_t + v_1 v_2 W_t^2 = Y_1(t)\, Y_2(t).$$
Next assume that $u_i, v_i$ are elementary processes,
$$u_i = \sum_{j=1}^n g^i_{t_j}\, 1_{]t_j, t_{j+1}]}, \qquad v_i = \sum_{j=1}^n h^i_{t_j}\, 1_{]t_j, t_{j+1}]}, \qquad t_{n+1} = t;$$
then exactly the same identity can be obtained on $]t_j, t_{j+1}]$:
$$\int_{t_j}^{t_{j+1}} Y_2\, dY_1 + \int_{t_j}^{t_{j+1}} Y_1\, dY_2 + \int_{t_j}^{t_{j+1}} h^1_{t_j} h^2_{t_j}\, ds = Y_1(t_{j+1}) Y_2(t_{j+1}) - Y_1(t_j) Y_2(t_j),$$
and summing up w.r.t. $j = 1,\dots,n$ gives the desired identity (B.5).

Finally consider $u^n_i, v^n_i$, elementary processes converging to $u_i, v_i$ in the sense that, for $i = 1,2$,
$$P\Big(\int_0^t (u^n_i(s) - u_i(s))^2\, ds \to 0,\ n\to\infty\Big) = 1, \qquad P\Big(\int_0^t (v^n_i(s) - v_i(s))^2\, ds \to 0,\ n\to\infty\Big) = 1.$$
Let
$$Y^n_i(t) = \int_0^t u^n_i(s)\, ds + \int_0^t v^n_i(s)\, dW_s, \qquad i = 1,2.$$
Then, as $n\to\infty$,
$$(Y^n_1 Y^n_2)(t) - (Y^n_1 Y^n_2)(0) = \int_0^t Y^n_1\, dY^n_2 + \int_0^t Y^n_2\, dY^n_1 + \int_0^t v^n_1 v^n_2\, ds,$$
where the left-hand side converges to $(Y_1 Y_2)(t) - (Y_1 Y_2)(0)$ and the right-hand side to
$$\int_0^t Y_1\, dY_2 + \int_0^t Y_2\, dY_1 + \int_0^t v_1 v_2\, ds.$$
APPENDIX C
Stochastic Differential Equations
Consider the ordinary differential equation
$$\frac{dN_t}{dt} = a N_t, \qquad N_0 = n_0. \tag{C.1}$$
(C.1) is linear and its unique solution is given by
$$N_t = e^{at}\cdot n_0.$$
(C.1) is a classical model describing growth: $N_t$ denotes the population size and $a$ the growth rate.

Suppose now that $a$ is only known partially, because it is subject to unknown, possibly random, forces. Then a natural Ansatz for this unknown, random growth rate is
$$a = r + \sigma\frac{dW_t}{dt},$$
where $W$ is a continuous BM, hence
$$\frac{dN_t}{dt} = \Big(r + \sigma\frac{dW_t}{dt}\Big)\cdot N_t,$$
or
$$dN_t = r N_t\, dt + \sigma N_t\, dW_t. \tag{C.2}$$
(C.2) is called a stochastic differential equation (sde). The explicit solution with initial condition $N_0$ is given by
$$N_t = \exp\Big(\big(r - \tfrac12\sigma^2\big)t + \sigma W_t\Big)\cdot N_0.$$
Indeed, Ito's formula applied to the function $f(t, w) = \exp\big(\sigma w + (r - \tfrac12\sigma^2)t\big)$ yields
$$dN_t = \sigma N_t\, dW_t + \Big(\tfrac12\sigma^2 + \big(r - \tfrac12\sigma^2\big)\Big) N_t\, dt = r N_t\, dt + \sigma N_t\, dW_t.$$
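As a quick sanity check of the explicit solution, one can sample $W_t$ directly and verify $E(N_t) = e^{rt} N_0$, since $E(e^{\sigma W_t}) = e^{\sigma^2 t/2}$. The parameter values below are illustrative choices, not taken from the text.

```python
import math
import random

random.seed(4)

r, sigma, n0, t = 0.05, 0.3, 1.0, 1.0   # illustrative parameters
samples = 100_000
total = 0.0
for _ in range(samples):
    w = random.gauss(0.0, math.sqrt(t))   # W_t ~ N(0, t)
    total += n0 * math.exp((r - 0.5 * sigma * sigma) * t + sigma * w)
mean = total / samples
print(round(mean, 3), round(n0 * math.exp(r * t), 3))  # E(N_t) vs e^{rt} N_0
```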
Further examples of SDEs often used in computational neuroscience are:

(a)
$$dV_t = (I - V_t)\, dt + \sigma\, dW_t,$$
modelling the membrane potential of a neuron subject to noise, e.g. channel noise and/or noise in the synaptic input.

(b) Stochastic FitzHugh-Nagumo systems
$$dV_t = \big(V_t(1 - V_t)(V_t - a) - W_t\big)\, dt + \sigma_V\, dB^V_t$$
$$dW_t = b\big(V_t - (a + W_t)\big)\, dt + \sigma_W\, dB^W_t,$$
where $(B^V_t), (B^W_t)$ are possibly correlated BM. The above system of SDEs is no longer linear and the drift term of the voltage variable is no longer Lipschitz; still it is possible to solve it uniquely for arbitrary, possibly also random, initial conditions.
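A minimal simulation sketch of the stochastic FitzHugh-Nagumo system, discretized with the Euler scheme described in Section C.3 and driven by independent Brownian motions. All parameter values ($a$, $b$, $\sigma_V$, $\sigma_W$, step size, horizon) are assumptions for illustration, not values from the text; the cubic drift keeps the simulated voltage bounded.

```python
import math
import random

random.seed(3)

# Illustrative parameter choices (not from the text).
a, b = 0.1, 0.01
sigma_v, sigma_w = 0.2, 0.05
h, steps = 0.01, 5_000          # Euler step size and horizon t = 50

v, w = 0.0, 0.0
path_bounded = True
for _ in range(steps):
    dBv = random.gauss(0.0, math.sqrt(h))   # increment of B^V
    dBw = random.gauss(0.0, math.sqrt(h))   # increment of B^W (independent here)
    v, w = (v + (v * (1.0 - v) * (v - a) - w) * h + sigma_v * dBv,
            w + b * (v - (a + w)) * h + sigma_w * dBw)
    path_bounded = path_bounded and abs(v) < 10.0 and abs(w) < 10.0
print(path_bounded, round(v, 3), round(w, 3))
```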
C.1. Explicit solutions
C.1.1. Linear SDE.
$$dX_t = \big(A(t) X_t + B(t)\big)\, dt + C(t)\, dW_t, \qquad X_0 = \xi_0, \tag{C.3}$$
where
- $(W_t)$ is a $d$-dimensional BM
- $A(t), C(t)\in\mathbb R^{d\times d}$, $B(t)\in\mathbb R^d$ are measurable and locally bounded
- $E(\|\xi_0\|^2) < \infty$, $\xi_0$ independent of $(W_t)$
We will see below that (C.3) has a unique solution.

Special case ($d = 1$): the Ornstein-Uhlenbeck SDE, modelling BM with friction:
$$dX_t = -b X_t\, dt + \sigma\, dW_t, \qquad X_0 = \xi_0. \tag{C.4}$$
To determine an explicit solution for this equation, recall the variation of constants formula for linear ODEs, which allows to represent the solution of
$$\dot X_t = -b X_t + \sigma\dot W_t, \qquad X_0 = \xi_0,$$
as
$$X_t = e^{-bt}\xi_0 + \sigma\int_0^t e^{-b(t-s)}\,\dot W_s\, ds.$$
Writing $\dot W_s\, ds = dW_s$, we then obtain
$$X_t = e^{-bt}\xi_0 + \sigma\underbrace{\int_0^t e^{-b(t-s)}\, dW_s}_{\text{stoch. integral}}$$
as the solution of (C.4). Note that
$$\sigma\int_0^t e^{-b(t-s)}\, dW_s \sim \mathcal N\Big(0,\ \sigma^2\int_0^t e^{-2b(t-s)}\, ds\Big),$$
so that
$$X_t \sim \mathcal N\Big(e^{-bt}\xi_0,\ \sigma^2\int_0^t e^{-2b(t-s)}\, ds\Big).$$
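The Gaussian transition law makes the OU process one of the few SDEs that can be sampled without discretization error. The following stdlib-only sketch (all parameter values are illustrative) iterates the exact one-step transition and compares the empirical mean and variance of $X_t$ with the formulas above.

```python
import math
import random

random.seed(5)

b, sigma, x0, t, n_steps = 1.0, 0.5, 2.0, 3.0, 50
h = t / n_steps
decay = math.exp(-b * h)                                        # e^{-bh}
noise_sd = sigma * math.sqrt((1.0 - decay * decay) / (2.0 * b)) # exact step st.dev.

def ou_endpoint():
    """Sample X_t using the exact Gaussian transition of (C.4) on each step."""
    x = x0
    for _ in range(n_steps):
        x = decay * x + noise_sd * random.gauss(0.0, 1.0)
    return x

paths = 20_000
xs = [ou_endpoint() for _ in range(paths)]
m = sum(xs) / paths
v = sum((x - m) ** 2 for x in xs) / paths
# Theory: X_t ~ N(e^{-bt} x0, sigma^2 (1 - e^{-2bt}) / (2b))
print(round(m, 3), round(v, 3))
```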
The general case (C.3). Let $\Phi_t$ be the matrix solution of
$$\dot\Phi_t = A(t)\Phi_t \qquad (\text{e.g. } A(t)\equiv A:\ \Phi_t = e^{tA}); \tag{C.5}$$
then
$$X_t = \Phi_t\xi_0 + \int_0^t \Phi_t\Phi_s^{-1} B(s)\, ds + \int_0^t \Phi_t\Phi_s^{-1} C(s)\, dW_s,$$
and again
$$\int_0^t \Phi_t\Phi_s^{-1} C(s)\, dW_s \sim \mathcal N\Big(0,\ \int_0^t \Phi_t\Phi_s^{-1} C(s) C(s)^T \big(\Phi_s^{-1}\big)^T \Phi_t^T\, ds\Big),$$
e.g. for $A(t)\equiv A$:
$$X_t = e^{tA}\xi_0 + \int_0^t e^{(t-s)A} B(s)\, ds + \int_0^t e^{(t-s)A} C(s)\, dW_s.$$
C.1.2. Solving SDE using change of variables.
$$dX_t = b(X_t)\, dt + \sigma(X_t)\, dW_t, \qquad X_0 = x\in\mathbb R. \tag{C.6}$$
We try to solve (C.6) in terms of $X_t = u(Y_t)$ for suitable $u$ and $Y_t$ solving
$$dY_t = f(Y_t)\, dt + dW_t, \qquad Y_0 = y\in\mathbb R, \tag{C.7}$$
where $f$ will be chosen later. Ito's formula gives:
$$du(Y_t) = u'(Y_t)\, dY_t + \tfrac12 u''(Y_t)\, dt = \Big[u'(Y_t) f(Y_t) + \tfrac12 u''(Y_t)\Big]\, dt + u'(Y_t)\, dW_t. \tag{C.8}$$
Hence, if
$$u'(y) = \sigma(u(y)), \qquad u'(y) f(y) + \tfrac12 u''(y) = b(u(y)), \tag{C.9}$$
(C.8) reduces to
$$du(Y_t) = b(u(Y_t))\, dt + \sigma(u(Y_t))\, dW_t,$$
so that $X_t = u(Y_t)$ solves (C.6).

The solution of (C.9) can be obtained by first solving the ODE
$$u'(z) = \sigma(u(z)), \qquad u(y) = x,$$
and setting
$$f(z) = \frac{1}{\sigma(u(z))}\Big[b(u(z)) - \tfrac12 u''(z)\Big].$$
Illustration: Consider again the SDE
$$dX_t = r X_t\, dt + \sigma X_t\, dW_t, \qquad X_0 = x;$$
then $u'(z) = \sigma(u(z)) = \sigma u(z)$, $u(0) = x$, leads to the solution $u(z) = x e^{\sigma z}$ and therefore
$$f(z) = \frac{1}{\sigma x e^{\sigma z}}\Big[r x e^{\sigma z} - \frac{\sigma^2}{2} x e^{\sigma z}\Big] = \frac1\sigma\Big(r - \frac{\sigma^2}{2}\Big).$$
The SDE
$$dY_t = \frac1\sigma\Big(r - \frac{\sigma^2}{2}\Big)\, dt + dW_t, \qquad Y_0 = 0,$$
has the solution
$$Y_t = \frac1\sigma\Big(r - \frac{\sigma^2}{2}\Big)t + W_t,$$
and thus
$$X_t = u(Y_t) = x\exp\Big(\Big(r - \frac{\sigma^2}{2}\Big)t + \sigma W_t\Big).$$

C.2. Strong solutions
Input:
• $b_i, \sigma_{ij} : I\times\mathbb R^d\to\mathbb R$ ($I = [0,T]$ or $\mathbb R_+$, $1\le i\le d$, $1\le j\le r$) Borel measurable
• $(W_t)$ an $r$-dimensional continuous BM on a probability space $(\Omega,\mathcal F, P)$
• $\xi$ an $\mathbb R^d$-valued r.v. on $(\Omega,\mathcal F, P)$, independent of $(W_t)$ (initial condition)

We are looking for a strong solution $(X_t)$ of the SDE
$$dX_t = b(t, X_t)\, dt + \sigma(t, X_t)\, dW_t, \qquad X_0 = \xi, \tag{C.10}$$
or, componentwise,
$$dX^i_t = b_i(t, X_t)\, dt + \sum_{j=1}^r \sigma_{ij}(t, X_t)\, dW^j_t, \qquad X^i_0 = \xi^i, \quad 1\le i\le d.$$
Definition C.1. A strong solution of the SDE (C.10) is a stochastic process $(X_t)_{t\in I}$ with continuous sample paths satisfying:
(i) $X_t$ is adapted, i.e. $\mathcal F_t$-measurable, where $\mathcal F_t = \sigma(\xi, W_s, s\in[0,t])\vee\mathcal N$ and $\mathcal N$ denotes all $P$-null sets in $\sigma(\xi, W_s, s\in I)$
(ii) $X_0 = \xi$ $P$-a.s.
(iii) $\int_0^t \big(|b_i(s, X_s)| + \sigma_{ij}^2(s, X_s)\big)\, ds < \infty$ $P$-a.s. $\forall t\in I$, $\forall i, j$
(iv) $X_t = X_0 + \int_0^t b(s, X_s)\, ds + \int_0^t \sigma(s, X_s)\, dW_s$ $P$-a.s. $\forall t\in I$.

Here
$$b(t, x) = \begin{pmatrix} b_1(t, x)\\ \vdots\\ b_d(t, x)\end{pmatrix}$$
is called the drift and
$$\sigma(t, x) = \begin{pmatrix} \sigma_{11}(t, x) & \cdots & \sigma_{1r}(t, x)\\ \vdots & & \vdots\\ \sigma_{d1}(t, x) & \cdots & \sigma_{dr}(t, x)\end{pmatrix}$$
the dispersion coefficient of the SDE.
Definition C.2 (Uniqueness of strong solutions). Strong uniqueness holds for the SDE (C.10) if, for a given BM $(W_t)$ and initial condition $\xi$ independent of $(W_t)$ on a probability space $(\Omega,\mathcal F, P)$, two strong solutions $X, \tilde X$ are indistinguishable, i.e.
$$P\big(X_t = \tilde X_t\ \forall t\in I\big) = 1.$$
Theorem C.3. Suppose that $I = [0,T]$ and that the following holds:

Assumption (A):
(a) $b_i, \sigma_{ij}$ are continuous
(b) For all $R > 0$ there exists a constant $L_R$ such that
$$2\langle b(t,x) - b(t,y), x - y\rangle + \|\sigma(t,x) - \sigma(t,y)\|^2 \le L_R\|x - y\|^2 \qquad \forall\,\|x\|, \|y\|\le R,\ t\in[0,T].$$
Then there exists at most one strong solution of (C.10).

Assumption (A) is in particular satisfied if $b_i, \sigma_{ij}$ are locally Lipschitz w.r.t. $x$, i.e., for all $R > 0$ there exists a constant $L_R$ with
$$\|b(t,x) - b(t,y)\| + \|\sigma(t,x) - \sigma(t,y)\| \le L_R\|x - y\| \qquad \forall x, y\in\mathbb R^d \text{ with } \|x\|, \|y\|\le R.$$
Part (b) of assumption (A) reduces, in the particular case $\sigma_{ij}\equiv0$, to the following condition:
$$2\langle b(t,x) - b(t,y), x - y\rangle \le L_R\|x - y\|^2 \qquad \forall\,\|x\|, \|y\|\le R,\ t\in[0,T]. \tag{C.12}$$
(C.12) is called the (local) monotonicity or (local) one-sided Lipschitz condition and is usually satisfied by the stochastic differential equations describing neural activity.
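The cubic FitzHugh-Nagumo voltage drift from the examples above is a typical case: it is not globally Lipschitz, yet a crude numerical scan suggests that the one-sided bound (C.12) holds with a modest constant on a bounded ball. The value $a = 0.1$, the radius $R$, and the sample count are illustrative assumptions.

```python
import random

random.seed(8)

def drift(v):
    """Cubic FitzHugh-Nagumo voltage drift, with a = 0.1 chosen for illustration."""
    a = 0.1
    return v * (1.0 - v) * (v - a)

# Numerically scan the local one-sided Lipschitz condition (C.12) on [-R, R]:
# 2 (b(x) - b(y))(x - y) <= L_R |x - y|^2 for some finite constant L_R.
R = 5.0
worst = 0.0
for _ in range(100_000):
    x, y = random.uniform(-R, R), random.uniform(-R, R)
    if x != y:
        worst = max(worst, 2.0 * (drift(x) - drift(y)) * (x - y) / (x - y) ** 2)
print(round(worst, 2))  # stays bounded, although the cubic is not globally Lipschitz
```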
Theorem C.4. Let $b_i, \sigma_{ij}$ satisfy assumption (A) and the following linear growth condition:
(B) There exists a constant $K$ such that
$$2\langle b(t,x), x\rangle + \|\sigma(t,x)\|^2 \le K\big(1 + \|x\|^2\big) \qquad \forall x\in\mathbb R^d,\ t\in I.$$
Let $(W_t)$ be an $r$-dimensional BM on $(\Omega,\mathcal F, P)$ and $\xi$ an initial condition independent of $(W_t)$. Then there exists a strong solution $(X_t)$ of the SDE (C.10). Moreover, if the initial condition is square integrable, $E(\|\xi\|^2) < \infty$, then for all $T\ge0$ there exists $C_T$ with
$$\sup_{t\le T} E\big(\|X_t\|^2\big) \le C_T\big(1 + E(\|\xi\|^2)\big).$$
Assumptions (A) and (B) are both satisfied if the coefficients are globally Lipschitz continuous w.r.t. $x$ with Lipschitz constant $L$ independent of $t$,
$$\|b(t,x) - b(t,y)\| + \|\sigma(t,x) - \sigma(t,y)\| \le L\|x - y\| \qquad \forall x, y\in\mathbb R^d,\ \forall t\in I,$$
and if $b_i, \sigma_{ij}$ are at most of linear growth, i.e.
$$\|b(t,x)\|^2 + \|\sigma(t,x)\|^2 \le K(1 + \|x\|^2) \qquad \forall x\in\mathbb R^d,\ \forall t\in I.$$
C.3. Numerical approximation
The simplest numerical approximation of the SDE (C.10) is provided by the Euler scheme, defined as follows: choose a time step $h > 0$ and set $t_k = k\cdot h$, $k = 0, 1, 2, \dots$,
$$X^h_{t_0} = \xi,$$
$$X^h_{t_{k+1}} = X^h_{t_k} + b\big(t_k, X^h_{t_k}\big)\cdot h + \sigma\big(t_k, X^h_{t_k}\big)\cdot\big(W_{t_{k+1}} - W_{t_k}\big), \qquad k = 0, 1, 2, \dots$$
Note: the increments $W_{t_{k+1}} - W_{t_k}$ are samples of independent $r$-dimensional normal random variables with mean $0$ and covariance matrix $h\,\mathrm{Id}_r$.
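The recursion above translates directly into code. The following generic scalar sketch (stdlib only) restates the scheme; the OU example at the bottom, with its parameter values, is an illustrative choice.

```python
import math
import random

def euler_maruyama(b, sigma, xi, h, n_steps, rng=random):
    """Euler scheme for dX = b(t, X) dt + sigma(t, X) dW (scalar case)."""
    x, t = xi, 0.0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(h))   # W_{t_{k+1}} - W_{t_k} ~ N(0, h)
        x = x + b(t, x) * h + sigma(t, x) * dw
        t += h
        path.append(x)
    return path

# Illustrative use: the OU equation dX = -X dt + 0.5 dW, X_0 = 1, as in (C.4).
random.seed(6)
path = euler_maruyama(lambda t, x: -x, lambda t, x: 0.5, 1.0, 0.01, 1_000)
print(len(path), round(path[-1], 3))
```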
Strong error

The primary tool to measure the approximation error is the pathwise discrepancy between the exact solution $(X_t)$ of (C.10) and the Euler approximation $(X^h_{t_k})$:
$$e^{\mathrm{strong}}_{h,t} := \sup_{k:\, t_k\le t} E\big(|X^h_{t_k} - X_{t_k}|\big).$$
Theorem C.5. Let $b, \sigma$ be Lipschitz and $q\ge1$. Then:
(i) There exists $c_t$ such that
$$E\Big(\sup_{k:\, t_k\le t} |X^h_{t_k} - X_{t_k}|^{2q}\Big) \le c_t\cdot h^q;$$
in particular, if $h = \frac tN$, hence $t_N = t$, then
$$E\big(|X^h_{t_k} - X_{t_k}|^{2q}\big) \le c_t\Big(\frac tN\Big)^q.$$
(ii) For any $\alpha < \frac12$:
$$\lim_{h\to0} \frac{1}{h^\alpha}\sup_{k:\, t_k\le t} |X^h_{t_k} - X_{t_k}| = 0 \quad \text{a.s.}$$
(iii) If in addition $b, \sigma\in C^4_b$ (i.e. four times continuously differentiable with all derivatives up to fourth order bounded) and if $u\in C^4_p$ (i.e. four times continuously differentiable with all derivatives up to fourth order polynomially bounded), then
$$\big|E\big(u(X^h_{t_k})\big) - E\big(u(X_{t_k})\big)\big| \le c_t\cdot h.$$
The proof of the theorem can be found in the monograph [PK92] by Kloeden and Platen, where also much more refined numerical schemes are analyzed. A gentle introduction to the numerical approximation of stochastic differential equations driven by Wiener noise can also be found in the SIAM Review article by Higham [Hig01].
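The $h^{1/2}$ strong rate from Theorem C.5(ii) can be observed empirically for the geometric BM (C.2), whose exact solution is known and can be driven by the same Brownian increments as the scheme. Parameter values, grid sizes, and the Monte Carlo sample size are illustrative assumptions.

```python
import math
import random

random.seed(7)

r, sigma, x0, t = 0.05, 0.5, 1.0, 1.0   # illustrative GBM parameters

def strong_error(n_grid, paths=2_000):
    """Monte-Carlo estimate of E|X^h_t - X_t| at the final time, driving the
    Euler scheme and the exact solution with the same Brownian increments."""
    h = t / n_grid
    err = 0.0
    for _ in range(paths):
        x, w = x0, 0.0
        for _ in range(n_grid):
            dw = random.gauss(0.0, math.sqrt(h))
            x = x + r * x * h + sigma * x * dw   # Euler step
            w += dw                              # accumulate W_t
        exact = x0 * math.exp((r - 0.5 * sigma * sigma) * t + sigma * w)
        err += abs(x - exact)
    return err / paths

e_coarse, e_fine = strong_error(16), strong_error(256)
order = math.log(e_coarse / e_fine) / math.log(256 / 16)
print(round(e_coarse, 4), round(e_fine, 4), round(order, 2))  # order close to 1/2
```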
Bibliography

[AAF05] A. A. Faisal and S. B. Laughlin, Ion-channel noise places limits on the miniaturization of the brain's wiring, Current Biology (2005), no. 15, 1143–1149.
[AAF07] ——, Stochastic simulations on the reliability of action potential propagation in thin axons, PLoS Comput Biol. (2007), no. 3:e79.
[APP05] L. Alili, P. Patie, and J. L. Pedersen, Representations of the first hitting time density of an Ornstein-Uhlenbeck process, Stochastic Models (2005), no. 21, 967–980.
[EK84] S. N. Ethier and T. G. Kurtz, Markov processes – characterization and convergence, John Wiley & Sons, New York, 1984.
[FB02] N. Fourcaud and N. Brunel, Dynamics of the firing probability of noisy integrate-and-fire neurons, Neural Comput. (2002), no. 14, 2057–2110.
[H07] R. Höpfner, On a set of data for the membrane potential in a neuron, Math. Biosciences (2007), no. 207, 275–301.
[HA10] H. Alzubaidi, H. Gilsing, and T. Shardlow, Numerical simulations of SDEs and SPDEs from neural systems using SDELab, Stochastic Methods in Neuroscience (C. Laing and G. J. Lord, eds.), Oxford University Press, Oxford, 2010.
[Hig01] D. J. Higham, An algorithmic introduction to numerical simulation of stochastic differential equations, SIAM Review 43 (2001), no. 3, 525–546.
[Kle06] A. Klenke, Probability theory, Springer-Verlag, Berlin, 2006.
[LL87] P. Lansky and V. Lanska, Diffusion approximation of the neuronal model with synaptic reversal potentials, Biol. Cybern. (1987), no. 56, 19–26.
[Nob91] The Nobel Prize in Physiology or Medicine 1991, Nobelprize.org, 1991.
[Nor97] J. R. Norris, Markov chains, Cambridge University Press, Cambridge, 1997.
[PK92] P. Kloeden and E. Platen, Numerical solution of stochastic differential equations, Springer-Verlag, Berlin, 1992.
[SS16] M. Sauer and W. Stannat, Reliability of signal transmission in stochastic nerve axon equations, Journal of Computational Neuroscience (2016), no. 40, 103–111.
[Ste65] R. B. Stein, A theoretical analysis of neuronal variability, Biophys J. (1965), no. 5, 173–194.
[SV79] D. W. Stroock and S. R. S. Varadhan, Multidimensional diffusion processes, Grundlehren der Mathematischen Wissenschaften, vol. 233, Springer-Verlag, Berlin, 1979.
[VB91] C. A. Vandenberg and F. Bezanilla, A sodium channel gating model based on single channel, macroscopic ionic, and gating currents in the squid giant axon, Biophys J. (1991), no. 60, 1511–1533, doi:10.1016/S0006-3495(91)82186-5.