TRANSCRIPT
Stochastic Processes in Neuroscience
Part I
Wilhelm Stannat
Institut für Mathematik, Technische Universität Berlin
June 16, 2016
Lecture held in the summer term 2016.
Contents
Chapter 1. Introduction
 1.1. Mathematical models for single ion channels
 1.2. Mathematical models for the membrane potential
 1.3. Biological neural networks
Chapter 2. Markov Chain Models of Ion Channels
 2.1. Time-continuous Markov Chains
 2.2. The Martingale structure of Markov chains
 2.3. Diffusion approximation of Markov chains
 2.4. Long-time behavior of Markov chains
Chapter 3. Models for synaptic input
Chapter 4. Stochastic Integrate-and-Fire models
 4.1. The distribution of T
Appendix A. Martingales
 A.1. Maximal inequality
 A.2. Stopping times and optional sampling
Appendix B. Brownian motion and stochastic integration
 B.1. Construction of BM
 B.2. Elementary properties of BM
 B.3. Path properties of BM
 B.4. The Ito-Integral
 B.5. The Ito-formula
Appendix C. Stochastic Differential Equations
 C.1. Explicit solutions
 C.2. Strong solutions
 C.3. Numerical approximation
Bibliography
CHAPTER 1
Introduction
These lecture notes provide a streamlined introduction to the modeling and mathematical analysis of neural activity in living organisms on three different scales:
(a) individual ion channels (microscopic)
(b) single neurons (mesoscopic)
(c) populations of neurons (macroscopic)
Neural activity is intrinsically noisy and the specification of single neurons exhibits a large variability, so that various types of stochasticity have to be incorporated into the models. In the following chapters we will introduce the basic principles for stochastic models, together with the mathematical theory to analyze them, that are used in today's computational neuroscience.
Before starting, let us first provide a rough overview of the relevant models on all three different levels we will discuss in the subsequent chapters. The nervous system consists of electrically excitable cells, called neurons, that process and transmit information. The typical structure of a neuron is sketched in the following figure:
[Source: Wikipedia: Neuropharmacology]
The single nerve cell receives its input from neighboring cells, or sensory input, via its dendrites. This input is integrated at the nucleus of the cell. Once a certain threshold is reached, one can observe a temporal spike in the membrane potential, i.e. a sharp rise in the membrane potential followed by a sharp decrease. This spike travels down the axon and ends in the axon terminals, where it may be passed over to other neurons or muscles. These spikes are called action potentials and they were the first activity in the nervous system that could be measured by physiologists. Let us denote the membrane potential by v. Of course, v depends on time and the location where v is measured. If we reduce the neuron to a single point, i.e. a point neuron, this observable is reduced to a real-valued variable v(t) as a function of time. Spatially extended models are much more complex and will be discussed in later chapters.
In the point neuron model, the membrane potential v is driven by three types of electrical currents
(1.1)    C dv/dt = −F + Isyn + Iext
where
(i) F denotes the sum of currents that result from ions flowing into or out of the cell through ion channels in the membrane, also called the membrane current,
(ii) Isyn denotes the synaptic currents entering the cell,
(iii) Iext denotes externally injected currents (e.g. exterior signals).
Whereas Isyn and Iext can be seen as exterior controls of the membrane potential, the current F is responsible for the intrinsic regulation of the membrane potential, in particular for the generation and regulation of action potentials. There is an extensive literature on the modeling of the membrane currents, and we will shortly illustrate the underlying principles in the following. Being a sum of membrane currents, F can be represented as
F = ∑i gi (v − vi)
where the sum is over the different types of ion channels, vi is the corresponding reversal potential and gi the conductance. gi essentially depends on the concentration of open ion channels of the respective type, which itself is coupled back to the membrane potential, and it is the dynamics of the opening and closing of the ion channels that generates the action potentials.
Single ion channel currents were first measured by Neher and Sakmann, who invented the patch clamp technique around the year 1976 and later received the Nobel Prize in Physiology or Medicine in the year 1991 "for their discoveries concerning the function of single ion channels in cells" ([Nob91]). These measurements showed that the dynamics of single ion channels is intrinsically random and therefore cannot be described adequately with the help of a differential equation. The following picture illustrates this apparent stochastic behaviour of a single sodium channel in the giant axon of the squid (which has also been considered by Hodgkin-Huxley, see below):
(see [VB91]).
Peaks pointing downwards can be associated with times where the sodium channel opens, positive sodium ions (Na+) flow into the cell and raise the membrane potential. The figure shows that the response of the given ion channel varies from trial to trial and that only the probability of the ion channel being in the up-state (open) or down-state (closed) can be compared w.r.t. different applied currents. The randomness in the response of single ion channels is called channel noise and it is one of the dominant sources of variability in the membrane potential.
1.1. Mathematical models for single ion channels
It is widely accepted in computational neuroscience today that an adequate modeling of the statistics of single ion channels can be achieved with the help of (time-continuous) Markov chains on a finite number of states (between open and closed) and that the switching rates between these states, the transition rates, are voltage dependent.
[State diagram: closed state C and open state O, with opening rate α(v) and closing rate β(v).]
If X(t) denotes the state of the ion channel at time t, the probability p(t) = P(X(t) = O) of the Markov chain to be found in the open state is then given as the solution of the ordinary differential equation
(1.2)    dp/dt = α(v)(1 − p) − β(v) p .
It is important to notice at this point that (1.2) is only a statistical description of the Markov chain and not the description of a given realization. It can be seen as an approximation of the proportion of ion channels being in the open state, if there were virtually infinitely many independent ion channels X1(t), X2(t), . . . operating simultaneously. Indeed, given N (independent) ion channels, the proportion pN(t) := (1/N) ∑_{i=1}^N 1O(Xi(t)) of ion channels found in the open state converges almost surely as N → ∞, due to the strong law of large numbers, towards the theoretical probability p(t) = E(1O(Xi(t))), i.e.,
(1.3)    lim_{N→∞} pN(t) = lim_{N→∞} (1/N) ∑_{i=1}^N 1O(Xi(t)) = E(1O(Xi(t))) = p(t) .
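The law of large numbers in (1.3) can be checked empirically. The following sketch (in Python; the notes themselves use Octave/Matlab for illustrations) simulates N independent two-state channels with illustrative constant rates α = 1 and β = 2, all starting in the closed state, and compares the empirical open fraction at time t with the solution of (1.2):

```python
import numpy as np

# Empirical check of (1.3): N independent two-state channels, each
# simulated exactly via its exponential holding times. The rates
# alpha = 1, beta = 2 are illustrative choices, not from the notes.
rng = np.random.default_rng(0)
alpha, beta, t_end, N = 1.0, 2.0, 1.0, 20000

def state_at(t):
    state, tau = 0, 0.0            # 0 = closed, 1 = open
    while True:
        rate = alpha if state == 0 else beta
        tau += rng.exponential(1.0 / rate)
        if tau > t:
            return state
        state = 1 - state

p_emp = sum(state_at(t_end) for _ in range(N)) / N
# solution of dp/dt = alpha (1 - p) - beta p with p(0) = 0
p_exact = alpha / (alpha + beta) * (1.0 - np.exp(-(alpha + beta) * t_end))
```

With N = 20000 channels the empirical fraction typically agrees with p(t) to within about one percent.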
An important question in computational neuroscience is the numerically efficient approximation of pN(t) and other statistics of a large, but finite, number of ion channels. One of the most important methods is the diffusion approximation, that
approximates pN with the help of some stochastic differential equation
(1.4)    dpN = (α(v)(1 − pN) − β(v)pN) dt + (1/√N) √(α(v)(1 − pN) + β(v)pN) dB
where B denotes a 1-dimensional Brownian motion. We will provide the framework for the rigorous derivation of (1.4) in the next chapter.
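A minimal sketch of how (1.4) can be simulated with the Euler-Maruyama scheme, at a frozen membrane potential (constant illustrative rates α = 1, β = 2) and with the pragmatic choice of clipping each step to [0, 1], since the Euler scheme itself may leave the unit interval:

```python
import numpy as np

# Euler-Maruyama for (1.4) with frozen rates. Rates, channel number N
# and the clipping to [0, 1] are illustrative modeling choices.
rng = np.random.default_rng(1)
alpha, beta, N = 1.0, 2.0, 100
dt, T, M = 0.001, 1.0, 2000        # M independent paths

p = np.zeros(M)                    # all channels closed at t = 0
for _ in range(int(T / dt)):
    drift = alpha * (1.0 - p) - beta * p
    diff = np.sqrt(np.clip(alpha * (1.0 - p) + beta * p, 0.0, None) / N)
    p = p + drift * dt + diff * np.sqrt(dt) * rng.normal(size=M)
    p = np.clip(p, 0.0, 1.0)       # keep the proportion in [0, 1]

p_mean = p.mean()
# deterministic limit: solution of (1.2) with p(0) = 0
p_det = alpha / (alpha + beta) * (1.0 - np.exp(-(alpha + beta) * T))
```

The sample mean over the paths stays close to the deterministic solution of (1.2), while the fluctuations around it are of order 1/√N.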
1.2. Mathematical models for the membrane potential
1.2.1. Conductance-based neuronal models. Coupling the basic equation (1.1) for the membrane potential to ion channel dynamics of the type (1.2) leads to the class of conductance-based neural models. The first and most prominent example of this class of models is the Hodgkin-Huxley model, which was introduced in 1952 to describe the membrane potential in the squid giant axon. In this model there are three different types of currents through the membrane:
• IK - potassium current (activating)
• INa - sodium current (activating and inactivating)
• IL - leak current
The coupled system of four differential equations is then given as
(1.5)    C dv/dt = gK n^4 (vK − v) + gNa m^3 h (vNa − v) + gL (vL − v) + Iext
         dn/dt = αn(v)(1 − n) − βn(v) n
         dm/dt = αm(v)(1 − m) − βm(v) m
         dh/dt = αh(v)(1 − h) − βh(v) h
with n, m and h denoting the concentration of open ion channels of the respective type. The constants gK, gNa and gL denote the maximal values of membrane conductances for potassium, sodium and leakage ions, vK, vNa and vL the corresponding reversal potentials. Finally, the transition rates are given as
αn(v) = (10 − v) / (100 (e^{(10−v)/10} − 1)) ,    βn(v) = (1/8) e^{−v/80}
αm(v) = (25 − v) / (10 (e^{(25−v)/10} − 1)) ,    βm(v) = 4 e^{−v/18}
αh(v) = (7/100) e^{−v/20} ,    βh(v) = 1 / (e^{(30−v)/10} + 1)
Parameters taken from [HA10].
Remark 1.1. (1) The components of (1.5) are of the type

(1.6)    dv/dt = g (vE − v)

with explicit solution

v(t) = e^{−gt} v(0) + (1 − e^{−gt}) vE

and corresponding long-time behavior v(t) → vE with exponential rate g. For this reason vE is sometimes also called the equilibrium potential.
(2) The equations for m, n and h are of the type

(1.7)    dn/dt = α(1 − n) − βn

with explicit solution

n(t) = e^{−(α+β)t} n(0) + (1 − e^{−(α+β)t}) α/(α + β) .

In particular, the solution stays inside the unit interval [0, 1] if the initial condition is contained in [0, 1].
(3) The system of coupled differential equations exhibits a bifurcation w.r.t. the exterior input current Iext. Depending on its size, one can observe a single spike, a finite number of spikes, or even periodic spiking. More precisely, for the above parameter set:
- minimal current required for at least one spike: Iext = 2.5
- threshold value for periodic spiking: Iext = 6.25
- if Iext > 154 the amplitude of the spikes decreases rapidly.
Illustration with Octave/Matlab:
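The original Octave/Matlab illustration is not reproduced in this transcript. The following Python sketch integrates (1.5) with the forward Euler method; the conductances and reversal potentials are not listed numerically in the notes, so the classical values for this rate parameterization (resting potential shifted to v = 0) are assumed here:

```python
import numpy as np

# Forward Euler for the Hodgkin-Huxley system (1.5). The numerical
# values of C, gK, gNa, gL, vK, vNa, vL are ASSUMED classical values
# for the shifted parameterization, not taken from the notes.
C, gK, gNa, gL = 1.0, 36.0, 120.0, 0.3
vK, vNa, vL = -12.0, 115.0, 10.6
Iext = 10.0                        # above the periodic-spiking threshold 6.25

def a_n(v):
    x = 10.0 - v
    return 0.1 if abs(x) < 1e-9 else 0.01 * x / np.expm1(x / 10.0)

def b_n(v): return 0.125 * np.exp(-v / 80.0)

def a_m(v):
    x = 25.0 - v
    return 1.0 if abs(x) < 1e-9 else 0.1 * x / np.expm1(x / 10.0)

def b_m(v): return 4.0 * np.exp(-v / 18.0)
def a_h(v): return 0.07 * np.exp(-v / 20.0)
def b_h(v): return 1.0 / (np.exp((30.0 - v) / 10.0) + 1.0)

dt, T = 0.01, 50.0                 # time in ms
v = 0.0
# start the gating variables at their resting equilibria a/(a+b)
n = a_n(v) / (a_n(v) + b_n(v))
m = a_m(v) / (a_m(v) + b_m(v))
h = a_h(v) / (a_h(v) + b_h(v))
vs = []
for _ in range(int(T / dt)):
    dv = (gK * n**4 * (vK - v) + gNa * m**3 * h * (vNa - v)
          + gL * (vL - v) + Iext) / C
    n += dt * (a_n(v) * (1 - n) - b_n(v) * n)
    m += dt * (a_m(v) * (1 - m) - b_m(v) * m)
    h += dt * (a_h(v) * (1 - h) - b_h(v) * h)
    v += dt * dv
    vs.append(v)
vs = np.array(vs)
# count upward crossings of 20 mV as spikes
n_spikes = int(np.sum((vs[1:] >= 20.0) & (vs[:-1] < 20.0)))
```

With Iext = 10, above the periodic-spiking threshold, the trace shows large-amplitude repetitive action potentials.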
Plotting v together with all concentrations shows that v and n are pretty well synchronized. For a better understanding of the dynamical properties of the system, it is therefore possible to reduce the number of variables by lumping together v and n, and also the sodium inactivation h and 1 − m, into one variable each. The resulting system is two-dimensional and called the FitzHugh-Nagumo system. We will study this two-dimensional system more closely in the subsequent chapters. The FitzHugh-Nagumo system is a mathematical idealization and its variables no longer are physiological quantities. On the other hand, the bifurcation of the four-dimensional Hodgkin-Huxley system can be illustrated and further understood in the simplified FitzHugh-Nagumo system by graphical methods.
Finally, let us also compare the typical phase-plot of the Hodgkin-Huxley system with some real neural data:
(from [H07]). The typical shape of the action potential is very well modeled in the Hodgkin-Huxley system, in contrast to the fluctuations that are due to the fluctuations in the ion channel concentrations. The question now is whether the ion channel fluctuations should be incorporated into the model or whether the deterministic Hodgkin-Huxley system already is sufficiently good. It turns out that the fluctuations have an impact on the action potential that has to be taken into account for a more appropriate statistical analysis of the membrane potential in real neural systems. There are a couple of important effects of these fluctuations on the membrane potential, among them:
- spontaneous spiking,
- time jitter, which means fluctuations in the velocity of the action potential,
- splitting up and annihilation of action potentials,
- propagation failure.
A numerical study of these effects has been carried out in [AAF07] and in [SS16] for the spatially extended analogue. Its implications on the minimal axon diameter required for faithful signal transmission have been investigated in [AAF05].
1.2.2. Integrate-and-fire models. Integrate-and-fire (IF) models describe the membrane potential only with the help of a one-dimensional dynamical system. Since periodic dynamical systems cannot be fully described with a first order ordinary differential equation, one has to incorporate a discontinuous reset mechanism as follows:
C dV/dt = I    if V ≤ Vth (Vth - threshold value)

and then reset V to a lower value V → Vreset. The interpretation is as follows: the membrane potential integrates up the input currents I up to a certain threshold value. Once it has reached this threshold, which can be thought of as a saturation value of the membrane potential, it starts an action potential, that is, it fires. Having fired, the membrane potential is reset to its resting value and integrates up the input again.
In most cases, the leak current through the membrane is denoted explicitly, and this leads to the so-called leaky IF model:
C dV/dt = −V/R + I    if V ≤ Vth .
The advantage of this model is its reduced complexity; its disadvantage of course is that it neglects all ion channels, and therefore its simulation results can only be interpreted statistically for the membrane potential.
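A sketch of the leaky IF model with reset, using illustrative parameters (not from the notes). For constant input with RI > Vth the interspike interval follows in closed form from the explicit solution of the linear equation, which gives a simple check of the simulation:

```python
import math

# Leaky integrate-and-fire with reset: C dV/dt = -V/R + I while
# V <= Vth, then V -> Vreset. All parameter values are illustrative.
C, R, I = 1.0, 10.0, 0.15
Vth, Vreset = 1.0, 0.0

dt, T = 0.001, 100.0
v, t = 0.0, 0.0
spike_times = []
while t < T:
    v += dt * (-v / R + I) / C
    t += dt
    if v >= Vth:                   # threshold reached: fire and reset
        spike_times.append(t)
        v = Vreset

# closed-form interspike interval for constant input with R*I > Vth
isi_theory = R * C * math.log(R * I / (R * I - Vth))
isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
mean_isi = sum(isis) / len(isis)
```

The simulated interspike interval agrees with R C log(RI/(RI − Vth)) up to the discretization error of the Euler step.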
1.3. Biological neural networks
The modeling of the activity of neural circuits, even in whole brain areas, needs to take into account the modeling of communication between neurons. Neurons communicate via their synapses, basically exchanging certain neurotransmitters. Given a single neuron, neurons that send input to the given one are therefore called presynaptic and neurons to which the given neuron sends its output are called postsynaptic. The precise underlying physiological process of exchange of the signal is complex (and also subject to noise) and it is different for every type of neuron. It is not our aim to lay out these details here; we only mention the fundamental distinction between chemical and electrical synapses.
The simpler type is the electrical synapse, where the membrane potentials of the presynaptic and the postsynaptic neuron communicate directly and roughly linearly:
C dvpost/dt = sum of currents + g(vpre − vpost) ,    g = coupling strength .
In the more complex case of chemical synapses the effect on the membrane potential of the postsynaptic neuron may be a complex nonlinear function
C dvpost/dt = sum of currents + g(vpre)(vE − vpost)

where vE denotes a certain equilibrium potential and g is a general function depending on the membrane potential of the presynaptic neuron.
One of the major open questions of neural systems, and of systems biology in general, is to establish a theory for the collective behavior of neural networks in terms of their local specifications, that is, the specification of the single neurons and their connections. Clearly, this would require some global rules, similar to the case of kinetic gas theory, where the global statistical behavior of a gas can be deduced from its local interactions using simple thermodynamical rules. The difficulty in biological systems in general, and in biological neural networks in particular, is to determine simple but nevertheless relevant global rules that are responsible for the rich observed phenomenology of these complex, highly nonlinear systems.
Let us give some interesting and important examples of collaborative behavior of neural systems in a very simple case of linearly coupled two-dimensional FitzHugh-Nagumo systems. We are given an N × N-grid
[Figure: an N × N grid with a node u(i, j) at each grid point.]
and on each grid point (i, j) the following two-dimensional FitzHugh-Nagumo system, linearly coupled to neighboring neurons:
(1.8)    dvij/dt = vij(1 − vij)(vij − a) − wij + (1/(2h))(vi+1,j − 2vij + vi−1,j) + (1/(2h))(vi,j+1 − 2vij + vi,j−1)
         dwij/dt = b(vij − a + wij)
Here, a ∈ (0, 1), b ∈ R and h ∼ 1/N. It turns out that for certain parameters and certain initial conditions the system exhibits remarkable collective behavior.
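A sketch of the coupled lattice (1.8) in Python. The boundary conditions, the parameter values and the initial excitation below are illustrative choices not specified in the notes (periodic boundaries via np.roll; b is taken negative so that the recovery variable w relaxes):

```python
import numpy as np

# Forward Euler for the coupled FitzHugh-Nagumo lattice (1.8).
# Parameters a, b, the horizon T and the initial bump are illustrative.
N = 16
a, b = 0.3, -0.05
h = 1.0 / N                        # h ~ 1/N as in the text
dt, T = 0.01, 5.0

V = np.zeros((N, N))
V[N // 2 - 2:N // 2 + 2, N // 2 - 2:N // 2 + 2] = 1.0   # localized excitation
W = np.zeros((N, N))

def lap(U):
    # nearest-neighbour coupling (U_{i+1,j} - 2 U_{ij} + U_{i-1,j}) plus
    # the analogous term in j, with periodic boundary conditions
    return (np.roll(U, 1, 0) + np.roll(U, -1, 0) - 2 * U
            + np.roll(U, 1, 1) + np.roll(U, -1, 1) - 2 * U)

for _ in range(int(T / dt)):
    dV = V * (1 - V) * (V - a) - W + lap(V) / (2 * h)
    dW = b * (V - a + W)
    V = V + dt * dV
    W = W + dt * dW
```

Plotting V over time (e.g. with matplotlib's imshow) shows how the localized excitation spreads over the grid through the coupling term.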
CHAPTER 2
Markov Chain Models of Ion Channels
As outlined in the Introduction, the activity of ion channels is intrinsically noisy. A widely accepted modeling approach in Computational Neuroscience is to use finite-state Markov chains in continuous time for the approximation of the statistics of the up- and down-states of single ion channels and subsequently also of networks of coupled ion channels.
2.1. Time-continuous Markov Chains
There are many textbooks containing an introduction to the theory of time-continuous Markov chains. A classical reference is [Nor97]. In the following we will introduce the basic elements of the theory that are needed for the modeling of ion channels and for the concepts of their diffusion approximations.
Let us denote with X(t), t ≥ 0, the state as a function of time and assume that it is given as a time-continuous Markov chain. That is, we require that the state space S is a discrete space, i.e., a countable set, and that X = (X(t))t≥0 is a family of random variables on an underlying probability space such that
• the trajectories t ↦ X(t)(ω) are piecewise constant and right-continuous,
• for 0 ≤ t1 ≤ t2 ≤ · · · ≤ tn ≤ tn+1 and i1, . . . , in+1 ∈ S the Markov property

P(X(tn+1) = in+1 | X(tn) = in, . . . , X(t1) = i1) = P(X(tn+1) = in+1 | X(tn) = in)

holds.
The Markov chain is called time-homogeneous if

P(X(t + s) = j | X(s) = i)

is independent of s, hence only depending on the length t of the time interval [s, s + t]. In the first section, we will consider time-homogeneous Markov chains only.
The distribution µ = P ∘ X(0)^{−1} of the initial state is called the initial distribution of X and
(2.1)    pij(t) := P(X(t) = j | X(0) = i) ,    i, j ∈ S, t ≥ 0 ,

are called the transition probabilities.
Lemma 2.1. For 0 = t0 ≤ t1 ≤ t2 ≤ · · · ≤ tn and i0, . . . , in ∈ S,

P(X(tn) = in, X(tn−1) = in−1, . . . , X(t1) = i1, X(t0) = i0) = µi0 pi0i1(t1 − t0) pi1i2(t2 − t1) · · · pin−1in(tn − tn−1) .
The above formula implies in particular that the joint law of (X(t0), . . . , X(tn)) is completely determined by the initial distribution and the transition probabilities of the Markov chain.
Proof. We will use induction on n. For n = 1 clearly

P(X(t1) = i1, X(t0) = i0) = P(X(t1) = i1 | X(t0) = i0) · P(X(t0) = i0) = pi0i1(t1 − t0) · µi0 .

Now suppose that the statement is proven for n. It then follows that

P(X(tn+1) = in+1, X(tn) = in, . . . , X(t0) = i0)
= P(X(tn+1) = in+1 | X(tn) = in, . . . , X(t0) = i0) · P(X(tn) = in, . . . , X(t0) = i0) .

By the Markov property the first factor equals P(X(tn+1) = in+1 | X(tn) = in) = pinin+1(tn+1 − tn), and by the induction hypothesis the second factor equals µi0 pi0i1(t1 − t0) pi1i2(t2 − t1) · · · pin−1in(tn − tn−1). Hence

P(X(tn+1) = in+1, . . . , X(t0) = i0) = µi0 pi0i1(t1 − t0) pi1i2(t2 − t1) · · · pinin+1(tn+1 − tn). □
In the following we denote by P(t) = (pij(t)) the matrix of transition probabilities. P(t) is a stochastic matrix, in fact a right-continuous semigroup of stochastic matrices in the sense of the following lemma.
Lemma 2.2. P(t), t ≥ 0, is a semigroup of matrices, i.e., P(t + s) = P(t)P(s), s, t ≥ 0, and right-continuous, i.e., lim_{s↓t} pij(s) = pij(t) for all i, j ∈ S.
Proof.

pij(t + s) = P(X(t + s) = j | X(0) = i)
= ∑_{k∈S} P(X(t + s) = j, X(t) = k | X(0) = i)
= ∑_{k∈S} P(X(t + s) = j | X(t) = k, X(0) = i) · P(X(t) = k | X(0) = i)
= ∑_{k∈S} pkj(s) pik(t) = (P(t)P(s))ij ,

where we used the Markov property and (2.1) in the last step. For the proof of the right-continuity note that the right-continuity of the sample paths t ↦ X(t)(ω) of the Markov chain implies 1{X(s)=j} → 1{X(t)=j} pointwise for s ↓ t, hence

pij(s) = P(X(s) = j | X(0) = i) = E(1{X(s)=j} | X(0) = i) → E(1{X(t)=j} | X(0) = i) = pij(t)

by Lebesgue's dominated convergence. □
It can be shown that pij is even differentiable and that there exists a matrix Q = (qij)_{i,j∈S} such that

(2.2)    (d/dt) P(t) = QP(t) .
Equation (2.2) is called Kolmogorov's backward equation; the matrix Q is called the generator of (X(t))t≥0 (see [Nor97]). It turns out that Q being the differential of a semigroup of stochastic matrices imposes some further properties on Q, which under an additional assumption (finite state space) are in fact also sufficient to generate such a semigroup.
Lemma 2.3. Q is a Q-matrix, i.e., it has the following two properties:

(a) qij ≥ 0 if i ≠ j
(b) ∑_{j∈S} qij = 0, in particular qii = −∑_{j∈S, j≠i} qij ≤ 0.

Conversely, if supi |qii| < ∞ for a given Q-matrix Q, its matrix exponential P(t) = e^{tQ}, t ≥ 0, defines a semigroup of stochastic matrices.
Proof. Since

qij = (d/dt) pij(t)|_{t=0} = lim_{t↓0} (1/t)(pij(t) − pij(0)) = lim_{t↓0} (1/t)(pij(t) − δij)

and since pij(t) ∈ [0, 1], it follows that qij ≥ 0 if i ≠ j (and qii ≤ 0). P(t)1 = 1 implies that

Q1 = (d/dt) P(t)1|_{t=0} = 0 , hence ∑j qij = 0 .

Conversely, suppose first that |S| < ∞ (which certainly implies supi |qii| < ∞): then

e^{tQ} = ∑_{k=0}^∞ (t^k/k!) Q^k

is well-defined, and Q1 = 0 implies that Q^k 1 = Q · · · Q1 = 0 (k ≥ 1) and therefore

e^{tQ}1 = ∑_{k=0}^∞ (t^k/k!) Q^k 1 = Q^0 1 = 1 .

In conclusion P(t)1 = 1. To see that pij(t) ≥ 0 note that for i ≠ j

P(t) = I + tQ + O(t^2) ,

hence pij(t) ≥ 0 for small t. We can find some δ > 0 such that pij(t) ≥ 0 for t ∈ [0, δ) and thus for any t ≥ 0

pij(t) = (P(t/n) · · · P(t/n))ij ≥ 0 ,

choosing n so large that t/n ≤ δ. □
2.1.1. General terminology for right-continuous stochastic processes. Given a stochastic process (X(t))t≥0 on a discrete set S with right-continuous trajectories t ↦ X(t)(ω) we can define the jump times J0, J1, . . . by

J0 := 0 ,    Jn+1 := inf{t ≥ Jn : X(t) ≠ X(Jn)} ,    n = 0, 1, 2, . . . ,
with the convention inf ∅ = +∞, and the holding times

Tn := Jn − Jn−1 if Jn−1 < ∞ , and Tn := ∞ otherwise.

Note that right-continuity implies Tn > 0 for all n. If Jn+1 = +∞ for some n, we define X(∞) := X(Jn). The discrete-time process (Yn)n≥0 given by

Yn := X(Jn) , n = 0, 1, 2, . . .

is called the jump process (or jump chain in the case where (X(t)) is Markovian).
[Figure: a piecewise constant, right-continuous sample path t ↦ X(t)(ω) with jump times J0(ω), J1(ω), J2(ω), J3(ω).]
2.1.2. Poisson process. The most important example of a Markov chain on a discrete state space is the Poisson process. A right-continuous stochastic process (X(t))t≥0 with values in {0, 1, 2, . . .} is called a Poisson process of rate λ, λ > 0, if its holding times T1, T2, . . . are independent exponential random variables of parameter λ and its jump chain is given by Yn = n. We will show below that (X(t))t≥0 is Markovian and its generator is
Q =
( −λ   λ
        −λ   λ
              ⋱   ⋱ )

i.e. qii = −λ, qi,i+1 = λ and qij = 0 otherwise.
The strong law of large numbers implies

Jn = ∑_{k=1}^n (Jk − Jk−1) = ∑_{k=1}^n Tk → +∞    P-a.s.
so that a Poisson process does not explode: it jumps only finitely often in any bounded time interval.
Theorem 2.4 (Markov property). Let X(t), t ≥ 0, be a Poisson process of rate λ. Then, for any s ≥ 0, (X(t + s) − X(s))t≥0 is again a Poisson process of rate λ, independent of {X(r) : r ≤ s}.
Proof. Let X̃(t) := X(t + s) − X(s), t ≥ 0. Then

{X(s) = i} = {Ji ≤ s < Ji+1} = {Ji ≤ s} ∩ {Ti+1 > s − Ji}

and on this event the holding times of X̃ are given by

T̃1 = Ti+1 − (s − Ji) , T̃2 = Ti+2 , T̃3 = Ti+3 , . . .

Since T1, T2, . . . are independent Exp(λ)-distributed, T̃2, T̃3, . . . are independent Exp(λ)-distributed too, and T̃1 = Ti+1 − (s − Ji) is Exp(λ)-distributed and independent of T1, . . . , Ti due to the memoryless property of the exponential distribution. Indeed, let
fλ(x) := λ e^{−λx} if x ≥ 0 , and fλ(x) := 0 otherwise,
be the density of the exponential distribution. Then

P(T̃1 ≥ t, T1 ≥ t1, . . . , Ti ≥ ti | X(s) = i) · P(X(s) = i)
= P(Ti+1 > t + s − (T1 + · · · + Ti), Ti ≥ ti, . . . , T1 ≥ t1, X(s) = i)
= ∫_{t1}^∞ fλ(s1) ∫_{t2}^∞ fλ(s2) · · · ∫_{ti}^∞ fλ(si) ∫_{t+s−(s1+···+si)}^∞ λ e^{−λsi+1} 1{s1+···+si ≤ s} dsi+1 dsi · · · ds1
= e^{−λt} ∫_{t1}^∞ fλ(s1) · · · ∫_{ti}^∞ fλ(si) e^{−λ(s−(s1+···+si))} 1{s1+···+si ≤ s} dsi · · · ds1
= e^{−λt} P(Ti+1 > s − (T1 + · · · + Ti), T1 ≥ t1, . . . , Ti ≥ ti, T1 + · · · + Ti ≤ s)
= e^{−λt} P(T1 ≥ t1, . . . , Ti ≥ ti | X(s) = i) · P(X(s) = i) ,

where the innermost integral equals e^{−λ(t+s−(s1+···+si))} and e^{−λ(s−(s1+···+si))} = P(Ti+1 > s − (s1 + · · · + si)). □
How does Theorem 2.4 imply the Markov property? Well, for given 0 = t0 ≤ . . . ≤ tn+1 and i0, . . . , in+1, we can conclude from the theorem that

P(X(tn+1) = in+1 | X(tn) = in, . . . , X(t0) = i0)
= P(X(tn+1) − X(tn) = in+1 − in | X(tn) = in, . . . , X(t0) = i0)
= P(X(tn+1 − tn) = in+1 − in)    (independence; X(· + tn) − X(tn) is again Poisson with rate λ)
= P(X(tn+1) = in+1 | X(tn) = in) .
Proposition 2.5 (Distribution of X(t)).

(i) P(X(t + s) − X(s) = k) = e^{−λt} (λt)^k/k! , k = 0, 1, 2, . . .
(ii) For t0 ≤ t1 ≤ . . . the increments X(ti+1) − X(ti) are independent Poisson random variables of parameter λ(ti+1 − ti).
Proof. (i) W.l.o.g. s = 0 (Theorem 2.4!). Then

P(X(t) − X(0) = k) = P(T1 + · · · + Tk ≤ t, T1 + · · · + Tk+1 > t) .

Since T1 + · · · + Tk ∼ Γ(k, λ) and Tk+1 is independent of T1 + · · · + Tk, this equals

(λ^{k+1}/(k − 1)!) ∫_0^∞ ∫_0^∞ 1{u ≤ t} u^{k−1} e^{−λu} e^{−λv} 1{u + v > t} du dv
= (λ^k/(k − 1)!) ∫_0^t u^{k−1} du · e^{−λt} = e^{−λt} (λt)^k/k! ,

using ∫_0^t u^{k−1} du = t^k/k.
(ii) By induction on n:

P(X(tn+1) − X(tn) = kn+1, . . . , X(t1) − X(t0) = k1)
= P(X(tn+1) − X(tn) = kn+1 | X(tn) − X(tn−1) = kn, . . . , X(t1) − X(t0) = k1) · P(X(tn) − X(tn−1) = kn, . . . , X(t1) − X(t0) = k1)
= P(X(tn+1) − X(tn) = kn+1) · P(X(tn) − X(tn−1) = kn, . . . , X(t1) − X(t0) = k1)    (Theorem 2.4)
= Poiss(λ(tn+1 − tn))(kn+1) · . . . · Poiss(λ(t1 − t0))(k1)    (induction hypothesis). □
Once we have identified the distribution of X(t), we can now calculate the entries

qij = (d/dt) pij(t)|_{t=0} = (d/dt) Poiss(λt)(j − i)|_{t=0}

of the generator matrix Q of (X(t)):

qij = 0 if j < i,
qii = (d/dt) e^{−λt}|_{t=0} = −λ,
qi,i+1 = (d/dt) (e^{−λt} λt)|_{t=0} = λ,
qi,i+k = (d/dt) (e^{−λt} (λt)^k/k!)|_{t=0} = 0 if k ≥ 2.
2.1.3. Construction/Simulation of Poisson processes. Consider a probability space with independent Exp(λ)-random variables T1, T2, . . . and a r.v. Y0 ∼ µ for a given starting distribution µ on {0, 1, 2, . . .}, independent of T1, T2, . . . Let J0 = 0 and Jk := T1 + · · · + Tk. Then

X(t) := Y0 + ∑_{k=1}^∞ 1{Jk ≤ t}

is a Poisson process of rate λ (with initial distribution µ).
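The construction above translates directly into code: cumulative sums of independent Exp(λ) variables give the jump times, and X(t) counts how many of them lie below t. The rate and time horizon below are illustrative; the sample mean and variance of X(t) can then be compared with the Poisson(λt) law of Proposition 2.5:

```python
import numpy as np

# Construction of the Poisson process from its holding times
# (Y_0 = 0 here). Rate and horizon are illustrative choices.
rng = np.random.default_rng(3)
lam, t_end, M = 2.0, 5.0, 20000    # M independent realizations

# 60 holding times per path is far more than the ~10 jumps
# expected by t = 5, so truncating the sum is harmless here
T = rng.exponential(1.0 / lam, size=(M, 60))
J = np.cumsum(T, axis=1)           # jump times J_1, J_2, ...
X_t = (J <= t_end).sum(axis=1)     # X(t_end) for each realization

mean_X = X_t.mean()
var_X = X_t.var()
# Proposition 2.5: X(t) ~ Poisson(lam * t), so mean = var = 10 here
```

Both the sample mean and the sample variance come out close to λt = 10, as the Poisson law predicts.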
2.1.4. Birth processes. Instead of homogeneous rates, we can also consider state-dependent rates αn. A right-continuous stochastic process X(t) is called a
birth process if the holding times are independent Exp(αn)-random variables and the jump chain is Yn = Y0 + n. The generator is given by

Q =
( −α0   α0
         −α1   α1
                ⋱   ⋱ ) .

2.1.5. Birth and Death processes. The holding times are independent exponential r.v. ∼ Exp(αn + βn), and the jump chain (Yn) is a Markov chain with transition matrix
Πij = αi/(αi + βi) if j = i + 1 ("jump up"),
Πij = βi/(αi + βi) if j = i − 1 ("jump down"),
Πij = 0 otherwise,

for i, j ∈ Z.
Therefore the generator is given as

Qij = αi if j = i + 1 ,    Qij = βi if j = i − 1 ,    Qii = −(αi + βi) ,    Qij = 0 otherwise.
2.1.6. General structure of continuous-time Markov chains. For simplicity we state the following theorem for finite state spaces only. For general countable state spaces, see [Nor97], Section 2.8.
Theorem 2.6. Let X(t), t ≥ 0, be a right-continuous stochastic process on a finite set S. Let Q be a Q-matrix on S with associated jump matrix

πij = −qij/qii if i ≠ j and qii ≠ 0 ,    πij = 0 if i ≠ j and qii = 0 ,
πii = 0 if qii ≠ 0 ,    πii = 1 if qii = 0 .
Then the following three conditions are equivalent:
(a) Conditioned on Y0 = i, the jump chain (Yn)n≥0 of (X(t))t≥0 is a time-discrete Markov chain with transition matrix Π = (πij)i,j∈S and, conditioned on Y0, Y1, . . . , Yn−1, the holding times T1, . . . , Tn are independent ∼ Exp(q(Y0)), . . . , Exp(q(Yn−1)), where q(i) = −qii.
(b) For all t, h ≥ 0, conditioned on X(t) = i, X(t + h) is independent of {X(s) : s ≤ t} and

P(X(t + h) = j | X(t) = i) = δij + qij h + o(h).
(c) For all t0 ≤ t1 ≤ · · · ≤ tn+1, i0, i1, . . . , in+1 ∈ S
P (X(tn+1) = in+1 | X(tn) = in, . . . , X(t0) = i0) = pinin+1(tn+1 − tn),
where P (t) = (pij(t)) solves
P ′(t) = QP (t), P (0) = I.
A proof can be found in [Nor97], Section 2.8.
2.1.7. Construction/Simulation of time-continuous Markov chains. Construct a jump chain Y0, Y1, . . . with initial distribution µ for Y0 and transition matrix Π. Let T1, T2, . . . be independent Exp(1)-r.v., independent of Y0, Y1, . . . Then define the jump times

Jk = T1/q(Y0) + · · · + Tk/q(Yk−1) .

Then X(t) = Yk for Jk ≤ t < Jk+1 gives the desired Markov chain.
The above construction provides the following algorithm for the numerical simulation: given state X(t0) = i, draw the holding time T according to the Exp(q(i))-distribution and then choose the next state j ≠ i with probability πij. This kind of event-based algorithm is called Gillespie's algorithm and should be the preferred method to simulate Markov chains.
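A sketch of Gillespie's algorithm for a general finite-state chain specified by its Q-matrix, applied to the two-state channel from the Introduction with illustrative rates α = 1, β = 2. As a sanity check, the long-run fraction of time spent in the open state is compared with the stationary value α/(α + β):

```python
import numpy as np

# Gillespie's algorithm: draw an Exp(q(i)) holding time in the current
# state, then jump according to the jump matrix pi. Rates illustrative.
rng = np.random.default_rng(4)
alpha, beta = 1.0, 2.0
Q = np.array([[-alpha, alpha],
              [beta, -beta]])

q = -np.diag(Q)                    # holding rates q(i) = -q_ii
Pi = Q / q[:, None]
np.fill_diagonal(Pi, 0.0)          # jump matrix pi_ij = -q_ij / q_ii

T_end, t, state = 5000.0, 0.0, 0
occupation = np.zeros(2)           # time spent in each state
while t < T_end:
    dwell = rng.exponential(1.0 / q[state])
    dwell = min(dwell, T_end - t)  # truncate the last holding period
    occupation[state] += dwell
    t += dwell
    state = rng.choice(2, p=Pi[state])

frac_open = occupation[1] / T_end  # long-run fraction of time open
```

Over a long horizon the occupation fraction of the open state approaches α/(α + β) = 1/3, in line with the stationary distribution discussed later in the chapter.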
2.2. The Martingale structure of Markov chains
Continuous-time martingales are generalizations of stochastic processes with independent increments, like e.g. the Poisson process. More precisely, let Ft, t ≥ 0, be a filtration, i.e. an increasing family of sub-σ-algebras on the underlying probability space. Ft is interpreted as the information available at time t.
A stochastic process M = (M(t))t≥0 is called a martingale w.r.t. this filtration if it is adapted, i.e. M(t) is Ft-measurable, integrable w.r.t. the underlying probability measure, and

(2.3)    M(t) = E(M(t + s) | Ft) for s, t ≥ 0 .
The basic theory of martingales is outlined in Appendix A and is very useful for the analysis of general stochastic processes. To make use of this theory in the context of continuous-time Markov chains, one first has to identify a (large) class of martingales for a given Markov chain.
To this end fix a continuous-time Markov chain X = (X(t))t≥0 and denote by

Ft := σ(X(s) | s ≤ t)

the filtration generated by X.
Theorem 2.7. Let f : S → R be any bounded function. Then

(2.4)    f(X(t)) = f(X(0)) + Mf(t) + ∫_0^t Qf(X(s)) ds ,    t ≥ 0 ,

where

Mf(t) := f(X(t)) − f(X(0)) − ∫_0^t Qf(X(s)) ds ,    t ≥ 0 ,

is a right-continuous martingale w.r.t. (Ft)t≥0 with

E(Mf(t)^2) = E(∫_0^t (Q(f^2) − 2fQf)(X(s)) ds) = ∫_0^t E(∑_{j∈S} qX(s)j (f(X(s)) − f(j))^2) ds .
The decomposition (2.4) is called the semimartingale decomposition of the process f(X(t)), since it decomposes it into a martingale and a process of bounded variation, ∫_0^t Qf(X(s)) ds. Before we give the rather short and simple proof, let us first state a useful corollary.
Corollary 2.8. Suppose that P(X(0) = i0) = 1 for some initial state i0 ∈ S. Recall that P(t), t ≥ 0, denotes the transition semigroup associated with X. Then

E(Mf(t)^2) = ∫_0^t ∑_{i,j∈S} pi0i(s) qij (f(i) − f(j))^2 ds .
Proof. Let us first verify the martingale property of Mf(t). There is no loss of generality in assuming that P(X(0) = i0) = 1 for some initial state i0 ∈ S. The Markov property then implies that for any bounded function g : S → R

(2.5)    E(g(X(t + s)) | Fs) = P(t)g(X(s))

in the sense that the right-hand side

P(t)g(X(s)) = ∑_{j∈S} pX(s)j(t) g(j)

is a version of the conditional expectation E(g(X(t + s)) | Fs). Indeed, the Markov property implies that E(g(X(t + s)) | Fs) = E(g(X(t + s)) | X(s)) and

E(g(X(t + s)) | X(s) = i) = ∑_{j∈S} E(g(X(t + s)) 1{X(t+s)=j} | X(s) = i) = ∑_{j∈S} g(j) P(X(t + s) = j | X(s) = i) = ∑_{j∈S} g(j) pij(t) = P(t)g(i) .
Using (d/dt)P(t) = QP(t) = P(t)Q, the fundamental theorem of calculus implies that

P(t)g(i) − g(i) = ∫_0^t QP(s)g(i) ds = ∫_0^t P(s)Qg(i) ds = ∫_0^t E(Qg(X(s)) | X(0) = i) ds = E(∫_0^t Qg(X(s)) ds | X(0) = i) .
From this identity it then follows that

(2.6)    E(f(X(t + s)) − f(X(s)) | Fs) = P(t)f(X(s)) − f(X(s)) = E(∫_s^{t+s} Qf(X(r)) dr | X(s))

which implies the martingale property:

E(f(X(t + s)) − ∫_0^{t+s} Qf(X(r)) dr | Fs)
= f(X(s)) + E(∫_s^{t+s} Qf(X(r)) dr | X(s)) − E(∫_0^{t+s} Qf(X(r)) dr | Fs)
= f(X(s)) − ∫_0^s Qf(X(r)) dr .
To derive the representation of the L^2-norm we conclude similarly that

E(Mf(t)^2) = E((f(X(t)) − f(X(0)))^2 − 2 (f(X(t)) − f(X(0))) ∫_0^t Qf(X(s)) ds + (∫_0^t Qf(X(s)) ds)^2)
= E((f(X(t)) − f(X(0)))^2 + 2f(X(0)) ∫_0^t Qf(X(s)) ds − 2 ∫_0^t f(X(s)) Qf(X(s)) ds)
= E(∫_0^t (Q(f^2) − 2fQf)(X(s)) ds)

(for the last step apply (2.4) to f^2 and use E(f(X(0)) Mf(t)) = 0), using

E((∫_0^t Qf(X(s)) ds)^2) = 2 ∫_0^t ∫_s^t E(P(u − s)Qf(X(s)) · Qf(X(s))) du ds
= 2 ∫_0^t E((P(t − s)f(X(s)) − f(X(s))) Qf(X(s))) ds
= 2 E(f(X(t)) ∫_0^t Qf(X(s)) ds − ∫_0^t f(X(s)) Qf(X(s)) ds) .
This proves the first equality. For the proof of the second equality note that for any state i ∈ S

Q(f^2)(i) − 2f(i)Qf(i) = ∑_{j∈S} qij (f^2(j) − 2f(i)f(j)) = ∑_{j∈S} qij (f^2(j) − 2f(i)f(j) + f^2(i)) = ∑_{j∈S} qij (f(i) − f(j))^2 ,

where we used ∑_{j∈S} qij = 0 in the second step. □
The use of the martingale structure will become apparent in the followingsection.
2.3. Diffusion approximation of Markov chains
Given a large number of ion channels regulating the membrane potential, a detailed simulation of all the individual dynamics becomes increasingly complex and time-consuming, and the statistics of single observables, e.g. the fraction of open channels, become difficult to extract. It is therefore desirable to find methods for reducing the full Markov chain to lower-dimensional stochastic processes. One important method to achieve this goal is the diffusion approximation, which aims to approximate the distribution of a single observable, or finitely many of them, by a one-dimensional or finite-dimensional diffusion process.
To illustrate the method, let us start with a motivating example: the approximation of a large number of independent two-state Markov chains with transition diagram
\[0\ \underset{\beta}{\overset{\alpha}{\rightleftarrows}}\ 1\]
for fixed strictly positive rates $\alpha$ and $\beta$. Here, "0" stands for the closed state and "1" for the open state of the respective ion channel. Let $X_1(t),\dots,X_N(t)$ be independent Markov chains of the above type, so that
\[S_N(t):=X_1(t)+\dots+X_N(t)\]
is the number of open channels at time $t$. We would like to derive an approximation of its distribution. In principle, this could be done explicitly (see Subsection 2.3.3 below). Instead, we will introduce a more conceptual approach that can be generalized to other Markov chain models.
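A single two-state channel of this type can be simulated exactly by alternating exponential holding times: in state 0 wait an $\mathrm{Exp}(\alpha)$ time before opening, in state 1 wait an $\mathrm{Exp}(\beta)$ time before closing. The following Python sketch (the rates and the time horizon are illustrative choices, not taken from the text) samples one trajectory and estimates the long-run fraction of time spent open, which should approach $\alpha/(\alpha+\beta)$:

```python
import random

def simulate_channel(alpha, beta, t_end, state=0, rng=None):
    """Exact simulation of one two-state channel on [0, t_end].

    Returns the list of (jump_time, new_state) pairs and the total
    time spent in the open state 1.
    """
    rng = rng or random.Random(0)
    t, open_time, jumps = 0.0, 0.0, []
    while True:
        rate = alpha if state == 0 else beta   # rate of leaving the current state
        dt = rng.expovariate(rate)             # Exp(rate) holding time
        if t + dt >= t_end:
            if state == 1:
                open_time += t_end - t         # channel stays open until t_end
            return jumps, open_time
        if state == 1:
            open_time += dt
        t += dt
        state = 1 - state                      # flip closed <-> open
        jumps.append((t, state))

jumps, open_time = simulate_channel(alpha=2.0, beta=1.0, t_end=2000.0)
# the long-run open fraction should be close to alpha/(alpha+beta) = 2/3
print(round(open_time / 2000.0, 2))
```

The same alternating-exponential mechanism underlies the jump-chain/holding-time description used throughout this chapter.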
To this end let us first compute the generator matrix of the full Markov chain $X=(X_1,\dots,X_N)$. Its state space is $\{0,1\}^N$, and a state $i=(i_1,\dots,i_N)$ of the chain is an $N$-tuple of $0$-$1$ entries, with $i_k=1$ if channel $k$ is open and $i_k=0$ if it is closed. $X$ can only jump from a state $i$ to a state $j$ if $i$ and $j$ differ in exactly one position $k$, which means that ion channel $k$ changes its state. Therefore the only non-zero off-diagonal entries of the generator matrix $Q=(q_{ij})$ are given as
\[q_{ij}=\begin{cases}\alpha&\text{if }j-i=e_k\\ \beta&\text{if }i-j=e_k\,.\end{cases}\]
Here, $e_k$ denotes the unit vector $e_k(l)=\delta_{kl}$ in $\mathbb{R}^N$ pointing in the direction of the $k$-th coordinate.
It follows for the diagonal entries that
\[q_{ii}=-\sum_{j\neq i}q_{ij}=-\sum_{k=1}^N\big(i_k\beta+(1-i_k)\alpha\big)=-\big(\alpha N+(\beta-\alpha)s_N(i)\big)\,,\]
where $s_N(i):=\sum_k i_k$ is simply the number of nonzero entries, i.e. of open channels, in the state $i$. The number $S_N(t)$, as a functional of $X$, can thus be written as $S_N(t)=s_N(X(t))$.
We can now compute
\[(2.7)\qquad Qs_N(i)=\sum_j q_{ij}\,s_N(j)=\sum_j q_{ij}\,\big(s_N(j)-s_N(i)\big)=\sum_{k:i_k=0}\alpha-\sum_{k:i_k=1}\beta=\alpha N-(\alpha+\beta)s_N(i)\,.\]
Due to the general martingale structure, we therefore obtain that
\[M_N(t):=S_N(t)-S_N(0)-\int_0^t\alpha N-(\alpha+\beta)S_N(s)\,ds\,,\quad t\ge0\,,\]
is a martingale (with respect to the natural filtration generated by the underlying Markov chain $X$). This implies in particular that
\[m_N(t):=E(S_N(t))=E(M_N(t))+E(S_N(0))+\int_0^t\alpha N-(\alpha+\beta)E(S_N(s))\,ds=m_N(0)+\int_0^t\alpha N-(\alpha+\beta)m_N(s)\,ds\,.\]
In particular, $m_N$ is differentiable in $t$, and so is
\[p_N(t):=E\Big(\frac1N S_N(t)\Big)=\frac1N m_N(t)\,,\quad t\ge0\,,\]
with derivative
\[(2.8)\qquad\frac{dp_N}{dt}(t)=\alpha-(\alpha+\beta)p_N(t)\]
(compare with (1.2)). One remark is in order here: since $S_N$ is a linear functional of the Markov chain, equation (2.8) exactly coincides with (1.2); this need not be the case for nonlinear functionals.
$p_N(t)$ only describes the actual fraction of open ion channels $S_N(t)/N$ in the mean. However, we even have the following law of large numbers:
2.3.1. Law of Large Numbers.

Theorem 2.9. Suppose that the initial condition $S_N(0)$ of open channels is such that
\[\lim_{N\to\infty}\frac{S_N(0)}{N}=p_0\quad\text{in }L^2(P)\,.\]
Then
\[\lim_{N\to\infty}\frac{S_N(t)}{N}=p(t)\quad\text{in }L^2(P)\,,\]
where $p(t)$ is the solution of the ordinary differential equation
\[\dot p(t)=-(\alpha+\beta)p(t)+\alpha\,,\qquad p(0)=p_0\,.\]
Proof. (2.7) implies that
\[(2.9)\qquad\frac{S_N}{N}(t)-p(t)=\underbrace{\frac{S_N}{N}(0)-p(0)}_{=:I_1(t)}+M_N(t)+\underbrace{\int_0^t-(\alpha+\beta)\Big(\frac{S_N(s)}{N}-p(s)\Big)\,ds}_{=:I_2(t)}\,,\]
where
\[M_N(t)=\frac{S_N(t)-S_N(0)}{N}-\frac1N\int_0^t\alpha N-(\alpha+\beta)S_N(s)\,ds\]
is a martingale with
\[E\big(M_N(t)^2\big)=\int_0^t E\Big(\sum_{j\in S}q_{X(s)\,j}\Big(\frac{s_N(X(s))}{N}-\frac{s_N(j)}{N}\Big)^2\Big)\,ds\le t\,\frac{\alpha+\beta}{N}\]
using Theorem 2.7 and
\[\sum_{j\in S}q_{X(t)\,j}\Big(\frac{s_N(X(t))}{N}-\frac{s_N(j)}{N}\Big)^2=\frac1{N^2}\sum_{k:X_k(t)=0}\alpha+\frac1{N^2}\sum_{k:X_k(t)=1}\beta\le\frac{\alpha+\beta}{N}\,.\]
2.3. DIFFUSION APPROXIMATION OF MARKOV CHAINS 21
We can therefore estimate
\begin{align*}
E\Big(\Big(\frac{S_N(t)}{N}-p(t)\Big)^2\Big)^{\frac12}&=E\big((I_1(t)+M_N(t)+I_2(t))^2\big)^{\frac12}\\
&\le E\big(I_1(t)^2\big)^{\frac12}+E\big(M_N(t)^2\big)^{\frac12}+E\big(I_2(t)^2\big)^{\frac12}\\
&\le\Big(E\big(I_1(t)^2\big)^{\frac12}+\sqrt{t\,\frac{\alpha+\beta}{N}}\Big)+(\alpha+\beta)\int_0^t E\Big(\Big(\frac{S_N(s)}{N}-p(s)\Big)^2\Big)^{\frac12}ds\,.
\end{align*}
Now Gronwall's Lemma (see below) implies that
\[E\Big(\Big(\frac{S_N(t)}{N}-p(t)\Big)^2\Big)^{\frac12}\le\Big(E\big(I_1(t)^2\big)^{\frac12}+\sqrt{t\,\frac{\alpha+\beta}{N}}\Big)e^{(\alpha+\beta)t}\,.\]
The assumption on the initial condition finally yields
\[\lim_{N\to\infty}E\Big(\Big(\frac{S_N(t)}{N}-p(t)\Big)^2\Big)=0\,.\qquad\square\]
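The law of large numbers is easy to observe numerically. Since $S_N$ is itself a birth-death chain with upward rate $\alpha(N-S_N)$ and downward rate $\beta S_N$, a single Gillespie loop suffices; the rates, the time, and $N$ in the following Python sketch are illustrative choices:

```python
import math
import random

def simulate_sn(N, alpha, beta, t_end, s0=0, seed=1):
    """Exact (Gillespie) simulation of the open-channel count S_N up to t_end."""
    rng = random.Random(seed)
    t, s = 0.0, s0
    while True:
        up, down = alpha * (N - s), beta * s   # birth and death rates
        total = up + down
        t += rng.expovariate(total)            # waiting time to the next jump
        if t >= t_end:
            return s                           # state just before t_end
        s += 1 if rng.random() < up / total else -1

alpha, beta, t_end, N = 1.0, 2.0, 3.0, 10_000
p_inf = alpha / (alpha + beta)
# solution of dp/dt = alpha - (alpha+beta) p with p(0) = 0:
p_t = p_inf * (1.0 - math.exp(-(alpha + beta) * t_end))
empirical = simulate_sn(N, alpha, beta, t_end) / N
print(abs(empirical - p_t))   # fluctuations are O(1/sqrt(N)), cf. the CLT in 2.3.2
```

For $N=10{,}000$ the deviation from the ODE solution is of order $10^{-2}$ or smaller, consistent with the $1/\sqrt N$ fluctuations quantified in the next subsection.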
Lemma 2.10 (Gronwall's inequality). Let $\alpha,\beta,g:[0,T]\to\mathbb{R}$, $\alpha,\beta$ integrable, $\beta\ge0$, $g$ continuous, and
\[(2.10)\qquad g(t)\le\alpha(t)+\int_0^t\beta(s)g(s)\,ds\quad\forall t\in[0,T]\,.\]
Then
\[(2.11)\qquad g(t)\le\alpha(t)+\int_0^t\alpha(s)\beta(s)\,e^{\int_s^t\beta(r)\,dr}\,ds\quad\forall t\in[0,T]\,.\]
In particular:
• if $\beta(s)\equiv\beta$, then $g(t)\le\alpha(t)+\beta\int_0^t\alpha(s)\,e^{\beta(t-s)}\,ds$;
• if $\alpha(s)\equiv\alpha$ and $\beta(s)\equiv\beta$, then $g(t)\le\alpha\,e^{\beta t}$.
Proof. Define
\[H(t):=\exp\Big(-\int_0^t\beta(r)\,dr\Big)\int_0^t\beta(s)g(s)\,ds\,.\]
Then
\begin{align*}
H'(t)&=-\beta(t)H(t)+\beta(t)g(t)\exp\Big(-\int_0^t\beta(r)\,dr\Big)\\
&=\beta(t)\exp\Big(-\int_0^t\beta(r)\,dr\Big)\underbrace{\Big(g(t)-\int_0^t\beta(s)g(s)\,ds\Big)}_{\le\alpha(t)}\le\alpha(t)\beta(t)\exp\Big(-\int_0^t\beta(r)\,dr\Big)\,,
\end{align*}
hence
\[H(t)=\int_0^t H'(r)\,dr\le\int_0^t\alpha(s)\beta(s)\exp\Big(-\int_0^s\beta(r)\,dr\Big)\,ds\]
and therefore
\[\int_0^t\beta(s)g(s)\,ds\le\int_0^t\alpha(s)\beta(s)\exp\Big(\int_s^t\beta(r)\,dr\Big)\,ds\,.\]
Inserting the last inequality into (2.10) yields the inequality (2.11). $\square$
Remark 2.11. It is worth noticing that up to this point we have not really used the fact that $M_N$ is a martingale, but only the estimate on its $L^2$-norm, which simply follows from the Markovian structure. We will, however, need the martingale structure in the central limit theorem, which will give us the fluctuations in the above convergence.
2.3.2. The Central Limit Theorem. For the second order correction we need to rescale the martingales by the factor $\sqrt N$ to keep the variance constant. Hence, from now on let us consider
\[(2.12)\qquad M_N(t):=\frac1{\sqrt N}\Big(S_N(t)-S_N(0)-\int_0^t QS_N(s)\,ds\Big)\,.\]
We will use the following generalization of the central limit theorem to martingales, adapted from [EK84]:

Theorem 2.12. For $n=1,2,\dots$, let $(\mathcal{F}^n_t)_{t\ge0}$ be a filtration and $(M_n(t))_{t\ge0}$ be an $(\mathcal{F}^n_t)_{t\ge0}$-martingale with right-continuous sample paths, having left limits at $t>0$ and starting at $0$, i.e. $M_n(0)=0$, such that
\[\lim_{n\to\infty}E\Big(\sup_{0\le s\le t}|M_n(s)-M_n(s-)|\Big)=0\,.\]
Assume that there exist nonnegative, nondecreasing, $(\mathcal{F}^n_t)_{t\ge0}$-adapted processes $(A_n(t))_{t\ge0}$ such that
\[M_n^2(t)-A_n(t)\,,\quad t\ge0\,,\]
is an $(\mathcal{F}^n_t)_{t\ge0}$-martingale and that
\[\lim_{n\to\infty}A_n(t)=\int_0^t\sigma^2(s)\,ds\quad\text{in probability}\]
for some deterministic function $\sigma:[0,\infty)\to\mathbb{R}$. Then
\[\lim_{n\to\infty}M_n(t)=\int_0^t\sigma(s)\,dW(s)\,,\quad t\ge0\,,\]
weakly on the Skorohod space $D[0,\infty)$. Here, $(W(t))_{t\ge0}$ is a 1-dimensional Brownian motion.
The Skorohod space
\[D([0,\infty)):=\{\omega:[0,\infty)\to\mathbb{R}\mid\omega\text{ right-continuous, having left limits for }t>0\}\]
is the natural state space for time-continuous Markov chains. It can be endowed with the following metric:
\[d(\omega,\tilde\omega):=\inf_{\lambda\in\Lambda}\Big(\|\lambda\|+\sup_{t\ge0}e^{-t}\,|\omega(t)-\tilde\omega(\lambda(t))|\Big)\,,\]
where
\[\Lambda:=\{\lambda:[0,\infty)\to[0,\infty)\mid\lambda(0)=0,\ \lambda\text{ increasing}\}\,,\qquad\|\lambda\|:=\sup_{s,t\ge0,\,s\neq t}\Big|\log\frac{\lambda(t)-\lambda(s)}{t-s}\Big|+\sup_{t\ge0}|\lambda(t)-t|\,.\]
With respect to this metric, the space is a complete separable metric space with the step functions densely contained. More details on this space can be found in [EK84], Chapter 3.
To apply the theorem in the following to the stochastic process
\[S^*_N(t):=\sqrt N\,\Big(\frac{S_N(t)}{N}-p(t)\Big)\,,\quad t\ge0\,,\]
we first need to find the semimartingale decomposition of $M_N^2(t)$, $t\ge0$, where $(M_N(t))_{t\ge0}$ is given in (2.12). Using the martingale property we have that
\begin{align*}
E\big(M_N(t)^2\mid\mathcal{F}_s\big)&=E\big((M_N(t)-M_N(s))^2\mid\mathcal{F}_s\big)+M_N(s)^2\\
&=\frac1N\,E\Big(\Big(S_N(t)-S_N(s)-\int_s^t QS_N(r)\,dr\Big)^2\,\Big|\,\mathcal{F}_s\Big)+M_N(s)^2\,.
\end{align*}
Theorem 2.7 now implies that
\begin{align*}
&E\Big(\Big(S_N(t)-S_N(s)-\int_s^t QS_N(r)\,dr\Big)^2\,\Big|\,\mathcal{F}_s\Big)\\
&\quad=E\Big(\int_s^t\sum_{j\in S}q_{S_N(r)\,j}\,(S_N(r)-j)^2\,dr\,\Big|\,\mathcal{F}_s\Big)\\
&\quad=E\Big(\int_0^t\sum_{j\in S}q_{S_N(r)\,j}\,(S_N(r)-j)^2\,dr-\int_0^s\sum_{j\in S}q_{S_N(r)\,j}\,(S_N(r)-j)^2\,dr\,\Big|\,\mathcal{F}_s\Big)
\end{align*}
using the Markov property, so that
\[M_N(t)^2-\frac1N\int_0^t\sum_{j\in S}q_{S_N(r)\,j}\,(S_N(r)-j)^2\,dr\]
is a martingale.
is a martingale.We will simplify the exposition a little bit and assume that the initial condition
p0 of the limiting ordinary differential equation is given as the equilibrium pointp0 = α
α+β , so that p(t) ≡ p0 = αα+β for all t ≥ 0. Then
S∗N (t) =√N
(SN (t)
N− α
α+ β
)
and
\[\frac1N\int_0^t\sum_{j\in S}q_{S_N(r)\,j}\,(S_N(r)-j)^2\,dr=\int_0^t\sum_{k:X_k(r)=0}\frac{\alpha}{N}+\sum_{k:X_k(r)=1}\frac{\beta}{N}\,dr=\int_0^t\alpha\Big(1-\frac{S_N(r)}{N}\Big)+\beta\,\frac{S_N(r)}{N}\,dr\ \longrightarrow\ 2t\,\frac{\alpha\beta}{\alpha+\beta}\]
in $L^2(P)$, hence in particular in probability, under the assumptions of the law of large numbers, Theorem 2.9.
Since also
\[|M_N(t)-M_N(t-)|\le\frac1{\sqrt N}\,,\]
the central limit theorem for martingales, Theorem 2.12, can be applied and we obtain that
\[\lim_{N\to\infty}M_N(t)=\sqrt{\frac{2\alpha\beta}{\alpha+\beta}}\,W(t)\]
weakly on the Skorohod space $D[0,\infty)$. In the martingale decomposition
\[S^*_N(t)=S^*_N(0)+M_N(t)+\int_0^t QS^*_N(s)\,ds=S^*_N(0)+M_N(t)-(\alpha+\beta)\int_0^t S^*_N(s)\,ds\]
it therefore remains to prove the weak convergence of $S^*_N(t)$, $t\ge0$, at least along some subsequence.
To this end we will use the following tightness criterion for stochastic processeson the Skorohod space adapted from [EK84], combining Theorem 8.6 and Theorem8.8 of Chapter 3:
Theorem 2.13. Let $X_n(t)$ be a sequence of stochastic processes on $\mathbb{R}^d$ having right-continuous sample paths with left limits for $t>0$. Assume the following conditions hold:
(a) $\exists\,\gamma_0>0:\ \sup_{n\ge1}\sup_{t\le T}E\big(\|X_n(t)\|^{\gamma_0}\big)<\infty$;
(b) $\exists\,C,\ \exists\,\gamma_1>0,\ \gamma_2>1$ such that
\[\sup_{n\ge1}\sup_{t\le T}E\big(\|X_n(t+2h)-X_n(t+h)\|^{\gamma_1}\,\|X_n(t+h)-X_n(t)\|^{\gamma_1}\big)\le Ch^{\gamma_2}\,.\]
Then the family of distributions $P\circ X_n^{-1}$, $n\ge1$, is tight on the Skorohod space $D([0,\infty))$.
Condition (a) follows from the martingale decomposition, since
\begin{align*}
E\big(S^*_N(t)^2\big)^{\frac12}&\le E\big(S^*_N(0)^2\big)^{\frac12}+E\big(M_N(t)^2\big)^{\frac12}+(\alpha+\beta)\int_0^t E\big(S^*_N(s)^2\big)^{\frac12}\,ds\\
&\le E\big(S^*_N(0)^2\big)^{\frac12}+\sqrt{t(\alpha+\beta)}+(\alpha+\beta)\int_0^t E\big(S^*_N(s)^2\big)^{\frac12}\,ds
\end{align*}
and therefore
\[E\big(S^*_N(t)^2\big)^{\frac12}\le\Big(E\big(S^*_N(0)^2\big)^{\frac12}+\sqrt{t(\alpha+\beta)}\Big)e^{(\alpha+\beta)t}\]
using Gronwall's Lemma.
Condition (b) in the above theorem can be verified for continuous-time Markov chains with the help of the following

Proposition 2.14. Let $X(t)$ be a continuous-time Markov chain with state space $S\subset\mathbb{R}^d$ and generator $Q$. Suppose that
(a) $\sup_{t\ge0}\|X(t)-X(t-)\|\le K<\infty$ (bounded jumps);
(b) $q_\infty:=\sup_{i\in S}|q_{ii}|<\infty$.
Then
(i) $E\big(\|X(t+h)-X(t)\|\mid X(t)\big)\le q_\infty Kh\,e^{q_\infty h}$ for $t,h\ge0$;
(ii) $E\big(\|X(t+2h)-X(t+h)\|\cdot\|X(t+h)-X(t)\|\big)\le q_\infty^2K^2h^2\,e^{2q_\infty h}$ for $t,h\ge0$.
Proof. (i) Suppose that $X(t)=i_0$. The Markov property implies that, conditioned on $X(t)=i_0$, $X(t+h)$, $h\ge0$, is again a Markov chain with generator $Q$. Denote by $(Y_n)_{n\ge0}$ the associated jump chain (Theorem 2.6). Then, conditioned on $Y_0,\dots,Y_{n-1}$, the holding times $T_1,\dots,T_n$ are independent $\mathrm{Exp}(q(Y_i))$-distributed, $i=0,\dots,n-1$. Therefore, with $J_n=T_1+\dots+T_n$ denoting the $n$-th jump time and $f_q$ the density of the $\mathrm{Exp}(q)$-distribution,
\begin{align*}
P(J_n\le h\mid Y_0,\dots,Y_{n-1})&=\int_0^hf_{q(Y_0)}(t_1)\int_0^hf_{q(Y_1)}(t_2)\dots\int_0^hf_{q(Y_{n-1})}(t_n)\,1_{t_1+t_2+\dots+t_n\le h}\,dt_1\dots dt_n\\
&\le q_\infty^n\int_0^h\int_0^h\dots\int_0^h1_{t_1+t_2+\dots+t_n\le h}\,dt_1\dots dt_n\le\frac{q_\infty^nh^n}{n!}\,.
\end{align*}
Therefore
\[E\big(\|X(t+h)-X(t)\|\mid X(t)=i_0\big)\le\sum_{n=1}^\infty nK\,P(J_n\le h<J_{n+1})\le\sum_{n=1}^\infty nK\,\frac{q_\infty^nh^n}{n!}=q_\infty hK\,e^{q_\infty h}\,.\]
Summing up over all possible states $X(t)=i_0$ we arrive at the first assertion.
(ii) For the proof of the second assertion observe that
\begin{align*}
E\big(\|X(t+2h)-X(t+h)\|\,\|X(t+h)-X(t)\|\big)&=E\big(E(\|X(t+2h)-X(t+h)\|\mid X(t+h))\,\|X(t+h)-X(t)\|\big)\\
&\le q_\infty Kh\,e^{q_\infty h}\,E\big(\|X(t+h)-X(t)\|\big)\le q_\infty^2K^2h^2\,e^{2q_\infty h}\,.\qquad\square
\end{align*}
The last proposition, applied to the Markov chain $S^*_N(t)$, now yields that
\[E\big(|S^*_N(t+2h)-S^*_N(t+h)|\cdot|S^*_N(t+h)-S^*_N(t)|\big)\le\frac{(\alpha+\beta)^2}{N}\,h^2\,e^{2(\alpha+\beta)\sqrt N\,h}\,,\]
so that condition (b) of Theorem 2.13 is satisfied with $\gamma_1=1$ and $\gamma_2=2$. We have thus proven:
Theorem 2.15. Let $x\in\mathbb{R}$ and let
\[x^{(N)}_{k(N)}=\frac{k(N)-\frac{\alpha}{\alpha+\beta}N}{\sqrt N}\]
be a sequence of standardized initial conditions converging to $x$. Then
\[\lim_{N\to\infty}S^*_N(t)=U(t)\]
weakly on the Skorohod space $D([0,\infty))$. Here,
\[U(t)=x-(\alpha+\beta)\int_0^tU(s)\,ds+\sqrt{\frac{2\alpha\beta}{\alpha+\beta}}\,W(t)\,,\]
where $W(t)$ is a 1-dimensional Brownian motion.
In particular,
\[\lim_{N\to\infty}E\big(f(S^*_N(t))\mid S^*_N(0)=x^{(N)}_{k(N)}\big)=E\big(f(U(t))\big)\quad\text{for all }f\in C_b(\mathbb{R})\,,\]
but also
\[\lim_{N\to\infty}E\big(F(S^*_N(\cdot))\mid S^*_N(0)=x^{(N)}_{k(N)}\big)=E\big(F(U(\cdot))\big)\]
for all $F:D([0,\infty))\to\mathbb{R}$ bounded and continuous w.r.t. the Skorohod metric.

The process $U$ constructed above is a diffusion approximation for the number of open channels.
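The fluctuation result can be probed numerically. Starting the $N$ channels i.i.d. from the stationary distribution, $S^*_N(t)$ should be approximately centered Gaussian with the stationary variance of $U$, namely $\alpha\beta/(\alpha+\beta)^2$. The following Python sketch (all parameter values are illustrative) estimates this variance from repeated Gillespie runs of the birth-death chain $S_N$:

```python
import random
import statistics

def gillespie_sn(N, alpha, beta, t_end, s0, rng):
    """Number of open channels S_N(t_end) for the birth-death chain."""
    t, s = 0.0, s0
    while True:
        up, down = alpha * (N - s), beta * s
        t += rng.expovariate(up + down)
        if t >= t_end:
            return s
        s += 1 if rng.random() < up / (up + down) else -1

alpha = beta = 1.0
N, t_end, runs = 500, 1.0, 2000
p_inf = alpha / (alpha + beta)
rng = random.Random(7)
samples = []
for _ in range(runs):
    s0 = sum(rng.random() < p_inf for _ in range(N))   # stationary start
    s_t = gillespie_sn(N, alpha, beta, t_end, s0, rng)
    samples.append((s_t / N - p_inf) * N ** 0.5)       # S*_N(t)
target_var = alpha * beta / (alpha + beta) ** 2        # stationary variance of U
print(round(statistics.pvariance(samples), 2))
```

For $\alpha=\beta=1$ the target variance is $1/4$, and the empirical variance of $S^*_N(t)$ matches it up to Monte Carlo error, as predicted by Theorem 2.15.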
The voltage activity of the neuron solves the differential equation
\[c_m\frac{dV}{dt}+G_mV=I+(V_E-V)g(t)\]
where
- $c_m$ = membrane capacitance
- $G_m$ = membrane conductance
- $I$ = (exterior) current source
- $V_E$ = equilibrium potential
- $g(t)$ = process modelling fluctuations in the opening and closing of ion channels.

In the following let $I=0$, $\gamma=\frac{G_m}{c_m}$, $V_e=\frac{V_E}{c_m}$; then
\[\frac{dV}{dt}+\gamma V=V_e\,g(t)\,,\qquad V(0)=v_0\,,\]
with the explicit solution
\[V(t)=v_0\,e^{-\gamma t}+V_e\int_0^te^{-\gamma(t-s)}g(s)\,ds\,.\]
If we now represent $g$ via $U$, we obtain the stochastic process
\[V(t)=v_0\,e^{-\gamma t}+V_e\int_0^te^{-\gamma(t-s)}U(s)\,ds\,,\]
or the respective system of stochastic differential equations
\begin{align*}
dV(t)&=\big(V_e\,U(t)-\gamma V(t)\big)\,dt\\
dU(t)&=-(\alpha+\beta)U(t)\,dt+\sqrt{\frac{2\alpha\beta}{\alpha+\beta}}\,dW(t)\,.
\end{align*}
2.3.3. Convergence of finite-dimensional distributions. As an alternative to the martingale central limit theorem one can also explicitly compute the finite-dimensional distributions of $S_N$ and then apply the multivariate central limit theorem. To this end note that the transition semigroup of the two-state Markov chain can be computed as
\[P(t)=\frac1{\alpha+\beta}\begin{pmatrix}\beta+\alpha e^{-t(\alpha+\beta)}&\alpha-\alpha e^{-t(\alpha+\beta)}\\[2pt]\beta-\beta e^{-t(\alpha+\beta)}&\alpha+\beta e^{-t(\alpha+\beta)}\end{pmatrix}.\]
This implies in particular
\begin{align*}
E(X(t)\mid X(0)=0)&=\frac{\alpha}{\alpha+\beta}-\frac{\alpha}{\alpha+\beta}\,e^{-t(\alpha+\beta)}\,,\\
E(X(t)\mid X(0)=1)&=\frac{\alpha}{\alpha+\beta}+\frac{\beta}{\alpha+\beta}\,e^{-t(\alpha+\beta)}
\end{align*}
and
\begin{align*}
\mathrm{Var}(X(t)\mid X(0)=0)&=\frac1{(\alpha+\beta)^2}\big(\alpha-\alpha e^{-t(\alpha+\beta)}\big)\big(\beta+\alpha e^{-t(\alpha+\beta)}\big)\,,\\
\mathrm{Var}(X(t)\mid X(0)=1)&=\frac1{(\alpha+\beta)^2}\big(\alpha+\beta e^{-t(\alpha+\beta)}\big)\big(\beta-\beta e^{-t(\alpha+\beta)}\big)\,.
\end{align*}
N independent open channels. If we now consider $N$ independent two-state Markov chains $X_1(t),\dots,X_N(t)$ with identical transition rates $\alpha$ and $\beta$, the sum $S_N(t)=X_1(t)+\dots+X_N(t)$ is again Markovian with state space $\{0,\dots,N\}$ and transition matrix
\[P_{ij}(t)=P(S_N(t)=j\mid S_N(0)=i)=\sum_{k=0}^N\underbrace{\binom{i}{j-k}}_{\substack{\text{possibilities for}\\\text{closing channels}}}\underbrace{\binom{N-i}{k}}_{\substack{\text{possibilities for}\\\text{opening channels}}}p_{10}(t)^{i-(j-k)}\,p_{11}(t)^{j-k}\,p_{00}(t)^{N-(i+k)}\,p_{01}(t)^k\,.\]
Indeed, $k$ denotes the number of changes from closed to open (at most $j$), hence $j-k$ open channels "stay" open; the transitions of a single ion channel happen with the probabilities $p_{ij}(t)$ given above.
Limiting behavior N →∞
To apply the central limit theorem, we will need the first and second moments of $S_N(t)$:
\begin{align*}
E(S_N(t)\mid S_N(0)=i)&=E(S_N(t)\mid X_1(0)=\dots=X_i(0)=1,\ X_{i+1}(0)=\dots=X_N(0)=0)\\
&=\sum_{k=1}^NE(X_k(t)\mid\text{--''--})=i\,p_{11}(t)+(N-i)\,p_{01}(t)\\
&=\frac{i\big(\alpha+e^{-(\alpha+\beta)t}\beta\big)+(N-i)\big(\alpha-e^{-(\alpha+\beta)t}\alpha\big)}{\alpha+\beta}\\
&=N\frac{\alpha}{\alpha+\beta}\big(1-e^{-(\alpha+\beta)t}\big)+i\,e^{-(\alpha+\beta)t}\,,
\end{align*}
\begin{align*}
\mathrm{Var}(S_N(t)\mid S_N(0)=i)&=\mathrm{Var}(S_N(t)\mid X_1(0)=\dots=X_i(0)=1,\ X_{i+1}(0)=\dots=X_N(0)=0)\\
&=\sum_{k=1}^N\mathrm{Var}(X_k(t)\mid\text{--''--})=i\big(p_{11}(t)-p_{11}(t)^2\big)+(N-i)\big(p_{01}(t)-p_{01}(t)^2\big)\\
&=i\,\frac{\alpha+e^{-(\alpha+\beta)t}\beta}{\alpha+\beta}\cdot\frac{\beta-e^{-(\alpha+\beta)t}\beta}{\alpha+\beta}+(N-i)\,\frac{\alpha-e^{-(\alpha+\beta)t}\alpha}{\alpha+\beta}\cdot\frac{\beta+e^{-(\alpha+\beta)t}\alpha}{\alpha+\beta}\\
&=N\frac{\alpha\beta}{(\alpha+\beta)^2}+\frac{i(\beta^2-\alpha\beta)+(N-i)(\alpha^2-\alpha\beta)}{(\alpha+\beta)^2}\,e^{-(\alpha+\beta)t}-\frac{i\beta^2+(N-i)\alpha^2}{(\alpha+\beta)^2}\,e^{-2(\alpha+\beta)t}\,.
\end{align*}
Standardization. As in the case of the classical central limit theorem we have to standardize $S_N$. Therefore, let
\[S^*_N(t):=\frac{S_N(t)-\frac{\alpha}{\alpha+\beta}N}{\sqrt N}\]
and note that this is again a Markov chain with state space
\[I_N:=\Big\{x^{(N)}_k:=\frac{k-\frac{\alpha}{\alpha+\beta}N}{\sqrt N}\ \Big|\ k=0,\dots,N\Big\}\,.\]
For given $x^{(N)}_k\in I_N$, we now consider the distribution
\[P\big(S^*_N(t)=x^{(N)}_j\mid S^*_N(0)=x^{(N)}_k\big)\,,\quad j=0,\dots,N\,,\]
as a probability measure
\[p^{(N)}_t\big(x^{(N)}_k,\cdot\,\big)\]
on the whole real line $\mathbb{R}$.
The central limit theorem now implies that for a sequence of initial conditions $\big(x^{(N)}_{k(N)}\big)$ with $x^{(N)}_{k(N)}\to x\in\mathbb{R}$,
\[\lim_{N\to\infty}p^{(N)}_t\big(x^{(N)}_{k(N)},\cdot\,\big)=\mathcal{N}\Big(e^{-t(\alpha+\beta)}x\,,\ \frac{\alpha\beta}{(\alpha+\beta)^2}\big(1-e^{-2t(\alpha+\beta)}\big)\Big)\quad\text{weakly}\,.\]
Indeed: first note that
\begin{align*}
S^*_N(t)&=\frac1{\sqrt N}\sum_{k=1}^N\Big(X_k(t)-\frac{\alpha}{\alpha+\beta}\Big)\\
&=\frac1{\sqrt N}\sum_{k=1}^{k(N)}\Big(X_k(t)-\Big(\frac{\alpha}{\alpha+\beta}+\frac{\beta}{\alpha+\beta}e^{-t(\alpha+\beta)}\Big)\Big)\\
&\quad+\frac1{\sqrt N}\sum_{k=k(N)+1}^N\Big(X_k(t)-\Big(\frac{\alpha}{\alpha+\beta}-\frac{\alpha}{\alpha+\beta}e^{-t(\alpha+\beta)}\Big)\Big)\\
&\quad+\frac1{\sqrt N}\Big(k(N)\frac{\beta}{\alpha+\beta}e^{-t(\alpha+\beta)}-(N-k(N))\frac{\alpha}{\alpha+\beta}e^{-t(\alpha+\beta)}\Big)\\
&=I+II+III\,,\text{ say}\,.
\end{align*}
Now
\[x^{(N)}_{k(N)}=\frac{k(N)-\frac{\alpha}{\alpha+\beta}N}{\sqrt N}\longrightarrow x\quad\text{implies}\quad k(N)\sim\sqrt N\,x+\frac{\alpha}{\alpha+\beta}N\,,\]
and therefore
\begin{align*}
I+II\ &\xrightarrow{\,w\,}\ \mathcal{N}\Big(0\,,\ \frac{\alpha}{(\alpha+\beta)^3}\big(\alpha+\beta e^{-t(\alpha+\beta)}\big)\big(\beta-\beta e^{-t(\alpha+\beta)}\big)+\frac{\beta}{(\alpha+\beta)^3}\big(\alpha-\alpha e^{-t(\alpha+\beta)}\big)\big(\beta+\alpha e^{-t(\alpha+\beta)}\big)\Big)\\
&=\mathcal{N}\Big(0\,,\ \frac{\alpha^2\beta+\beta^2\alpha-(\beta^2\alpha+\alpha^2\beta)\,e^{-2t(\alpha+\beta)}}{(\alpha+\beta)^3}\Big)=\mathcal{N}\Big(0\,,\ \frac{\alpha\beta}{(\alpha+\beta)^2}\big(1-e^{-2t(\alpha+\beta)}\big)\Big)\,,\\
III&\sim\frac1{\sqrt N}\Big(\Big(\sqrt N\,x+\frac{\alpha}{\alpha+\beta}N\Big)\frac{\beta}{\alpha+\beta}e^{-t(\alpha+\beta)}-\Big(-\sqrt N\,x+\frac{\beta}{\alpha+\beta}N\Big)\frac{\alpha}{\alpha+\beta}e^{-t(\alpha+\beta)}\Big)=e^{-t(\alpha+\beta)}x\,.
\end{align*}
It turns out that
\[p_t(x,\cdot):=\mathcal{N}\Big(e^{-t(\alpha+\beta)}x\,,\ \frac{\alpha\beta}{(\alpha+\beta)^2}\big(1-e^{-2t(\alpha+\beta)}\big)\Big)\,,\quad t\ge0\,,\ x\in\mathbb{R}\,,\]
defines a semigroup of transition probabilities on $\mathbb{R}$. The associated Markov process $U(t)$, $t\ge0$, is given as the solution of the stochastic differential equation
\[(2.13)\qquad dU(t)=-(\alpha+\beta)U(t)\,dt+\sqrt{\frac{2\alpha\beta}{\alpha+\beta}}\,dW(t)\,,\]
where $W(t)$, $t\ge0$, is a 1-dimensional Brownian motion (see Appendix B). With similar computations we can also prove the weak convergence of the finite-dimensional distributions of $S^*_N$ towards the finite-dimensional distributions of $U$. Together with the tightness of $(S^*_N(t))$ we therefore arrive at the same conclusion as in Theorem 2.15.
2.4. Long-time behavior of Markov chains
Recall: given a stochastic matrix $P=(p_{ij})_{i,j\in S}$, a probability measure $(\mu_i)_{i\in S}$ is invariant for $P$ if
\[\mu P=\mu\,,\quad\text{i.e.}\quad\forall i\in S:\ (\mu P)_i=\sum_{j\in S}\mu_jp_{ji}=\mu_i\,,\]
equivalently,
\[P(X_1=i)=\sum_{j\in S}\underbrace{P(X_1=i\mid X_0=j)}_{=p_{ji}}\underbrace{P(X_0=j)}_{=\mu_j}=P(X_0=i)\,,\]
where $(X_n)_{n\ge0}$ denotes a Markov chain with transition probabilities $P$ and initial distribution $\mu$. Iterating yields $P(X_n=i)=\dots=P(X_0=i)$, i.e., the distribution of $X_n$ is invariant in time.
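Numerically, an invariant distribution can often be found by simply iterating $\mu\mapsto\mu P$; the convergence theorem below guarantees that for an irreducible aperiodic chain the iterates converge to $\mu$ from any starting distribution. A minimal Python sketch for a two-state chain (the entries of $P$ are illustrative):

```python
def step(mu, P):
    """One step of the map mu -> mu P for a row-stochastic matrix P."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]

a, b = 0.3, 0.1
P = [[1 - a, a], [b, 1 - b]]
mu = [1.0, 0.0]                  # start deterministically in state 0
for _ in range(200):
    mu = step(mu, P)
exact = [b / (a + b), a / (a + b)]   # solves mu P = mu for this 2x2 chain
print([round(x, 4) for x in mu])     # → [0.25, 0.75]
```

The second eigenvalue of $P$ here is $1-a-b=0.6$, so the iteration converges geometrically and $200$ steps are far more than enough.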
Theorem 2.16. Part 1 (Convergence to invariant distributions). Let $P$ be
• irreducible, i.e., $\forall i,j\ \exists n_0\ge1$ such that $p^{n_0}_{ij}>0$,
• aperiodic, i.e., $\forall i\ \exists n_0\ge1$ such that $p^n_{ii}>0$ for all $n\ge n_0$.
Suppose that $P$ has an invariant distribution $\mu$. Then
\[\lim_{n\to\infty}P(X_n=j\mid X_0=i)=\mu_j\quad\text{for all }i,j\in S\,,\quad\text{equivalently}\quad\lim_{n\to\infty}p^n_{ij}=\mu_j\,.\]
Part 2 (Existence of invariant measures). Let $P$ be
• irreducible,
• positive recurrent, i.e.,
\[\forall i:\ E(T_i\mid X_0=i)<\infty\,,\]
where $T_i=\min\{n\ge1:X_n=i\}$ is the first return time to $i$.
Then $P$ has an invariant distribution $\mu$.

Proof. (see: Norris, Markov chains)
We now come back to the case of time-continuous Markov chains. Let $P(t)$, $t\ge0$, be a right-continuous semigroup of stochastic matrices with generator $Q$ and jump matrix $\Pi$. A measure $\mu$ is called (infinitesimally) invariant for $P(t)$ if
\[\mu Q=0\,.\]

Lemma 2.17. The following are equivalent:
(i) $\mu$ is (infinitesimally) invariant;
(ii) $\tilde\mu\Pi=\tilde\mu$, where $\tilde\mu_i=\mu_iq_i$, $q_i=-q_{ii}=\sum_{j\neq i}q_{ij}$.

Proof. This follows from $q_i(\pi_{ij}-\delta_{ij})=q_{ij}$ and thus
\[\big(\tilde\mu(\Pi-I)\big)_j=\sum_{i\in S}\tilde\mu_i(\pi_{ij}-\delta_{ij})=\sum_{i\in S}\mu_iq_{ij}=(\mu Q)_j\,.\qquad\square\]
We can now state the exact analogue of the previous theorem:

Theorem 2.18. Assume that $\sup_i|q_{ii}|<\infty$.
Part 1 (Convergence to equilibrium). Let $P(t)$, $t\ge0$, be
• irreducible, i.e., $\forall i,j\ \exists t_0>0$ such that $p_{ij}(t_0)>0$.
Suppose that $P(t)$, $t\ge0$, has an invariant distribution $\mu$; then
\[\lim_{t\to\infty}p_{ij}(t)=\mu_j\quad\forall i,j\in S\,.\]
Part 2 (Existence of invariant measures). Let $P(t)$, $t\ge0$, be
• irreducible,
• positive recurrent, i.e.,
\[\forall i:\ q_i=0\ \text{ or }\ E(T_i\mid X(0)=i)<\infty\,,\]
where $T_i=\inf\{t\ge J_1:X(t)=i\}$ is the first return time to $i$.
Then $Q$ (resp. $P(t)$, $t\ge0$) has an invariant distribution $\mu$.
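As an illustration of the condition $\mu Q=0$, one can check numerically that the $\mathrm{Binomial}\big(N,\frac{\alpha}{\alpha+\beta}\big)$ distribution is infinitesimally invariant for the channel-count chain $S_N$ from Section 2.3 (a minimal sketch; this particular invariant distribution is not stated in the text, but it follows from detailed balance for the birth-death rates, and $N$, $\alpha$, $\beta$ below are illustrative):

```python
import math

def binom_pmf(n, p, k):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

alpha, beta, N = 1.0, 2.0, 8
p = alpha / (alpha + beta)
mu = [binom_pmf(N, p, k) for k in range(N + 1)]

def muQ(k):
    """k-th entry of mu Q for the birth-death generator of S_N:
    up-rate alpha*(N-s) from s, down-rate beta*s from s."""
    total = 0.0
    if k > 0:
        total += mu[k - 1] * alpha * (N - (k - 1))   # inflow from k-1
    if k < N:
        total += mu[k + 1] * beta * (k + 1)          # inflow from k+1
    total -= mu[k] * (alpha * (N - k) + beta * k)    # outflow from k
    return total

print(max(abs(muQ(k)) for k in range(N + 1)) < 1e-12)  # → True
```

The check succeeds because the Binomial weights satisfy detailed balance, $\mu_k\,\alpha(N-k)=\mu_{k+1}\,\beta(k+1)$, which is stronger than $\mu Q=0$.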
CHAPTER 3
Models for synaptic input
Neurons pass on electrical signals arriving at the axon terminals via synapses to the dendrites of other neurons. There are mainly two different mechanisms by which this is achieved: electrical and chemical coupling.

In the case of an electrical coupling, the axon terminal of the presynaptic neuron is linked via special types of ion channels, the gap junctions, with the dendrites of the postsynaptic neuron, by which they have an immediate impact on the membrane potential of the postsynaptic neuron.

In the case of a chemical coupling there is no direct connection between the pre- and the postsynaptic neuron, but rather a small synaptic cleft that is crossed by neurotransmitters emitted by the presynaptic neuron and received by the postsynaptic neuron.

In contrast to the electrical coupling, where the signal in the postsynaptic neuron is always smaller than or equal to the signal in the presynaptic neuron, signals can also be amplified in the case of chemical couplings. The detailed mathematical modeling of chemical synapses is therefore in general more involved.

A simple theoretical model for the synaptic input that has been largely influential was provided by R. Stein [Ste65]. Because of its discontinuous, point-event character, synaptic input is modeled with the help of point processes. Nevertheless, in the presence of large homogeneous input, a diffusion approximation can again become appropriate.
In this chapter we will introduce Stein’s model for synaptic input and its dif-fusion approximation under appropriate assumptions provided in [LL87].
Stein's model for synaptic input. Let $V$ as usual denote the membrane potential. Then the time evolution of $V$ is given in Stein's model as
\[(3.1)\qquad dV(t)=-\frac1\tau V(t)\,dt+a_+\,dN_+(t)-a_-\,dN_-(t)\]
where
- $a_\pm$ denote the amplitudes of excitatory/inhibitory currents,
- $N_\pm$ are independent Poisson processes with rates $\lambda_\pm$.

Equation (3.1) has to be understood in integral form, i.e.,
\begin{align*}
V(t)&=V(0)-\frac1\tau\int_0^tV(s)\,ds+a_+\int_0^tdN_+(s)-a_-\int_0^tdN_-(s)\\
&=V(0)-\frac1\tau\int_0^tV(s)\,ds+a_+\big(N_+(t)-N_+(0)\big)-a_-\big(N_-(t)-N_-(0)\big)\,.
\end{align*}
Mathematically, equation (3.1) is an ordinary differential equation driven by two (independent) Poisson processes. We could consider the weighted sum $a_+N_+(t)-a_-N_-(t)$ of the two Poisson processes as a birth-death process on the discrete set $\{a_+n_1-a_-n_2\mid n_i\in\mathbb{N}_0\}$ with generator matrix entries $q_{i,i+a_+}=\lambda_+$ and $q_{i,i-a_-}=\lambda_-$.

The trajectories of the process have the following structure: between the jumping times $J_n$ and $J_{n+1}$ of the two Poisson processes,
\[V_t=e^{-\frac{t-J_n}{\tau}}V_{J_n}\,,\quad J_n\le t<J_{n+1}\,.\]
At $t=J_{n+1}$ the solution either jumps up to the value $V_{J_{n+1}-}+a_+$ or down to the value $V_{J_{n+1}-}-a_-$, due to an excitatory resp. inhibitory input of magnitude $a_+$ resp. $a_-$.
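This piecewise-deterministic structure gives an exact simulation recipe: wait an $\mathrm{Exp}(\lambda_++\lambda_-)$ time, decay the potential over that interval, then jump up with probability $\lambda_+/(\lambda_++\lambda_-)$ and down otherwise. A Python sketch with illustrative parameter values:

```python
import math
import random

def stein_path(tau, a_plus, a_minus, lam_plus, lam_minus, t_end,
               v0=0.0, seed=5):
    """Exact simulation of Stein's model: exponential decay with time
    constant tau between jumps, jump +a_plus at rate lam_plus and
    -a_minus at rate lam_minus. Returns the list of (time, value) pairs."""
    rng = random.Random(seed)
    t, v, path = 0.0, v0, [(0.0, v0)]
    total = lam_plus + lam_minus
    while True:
        dt = rng.expovariate(total)            # time to the next synaptic event
        if t + dt >= t_end:
            path.append((t_end, v * math.exp(-(t_end - t) / tau)))
            return path
        t += dt
        v *= math.exp(-dt / tau)               # passive decay since last event
        v += a_plus if rng.random() < lam_plus / total else -a_minus
        path.append((t, v))

path = stein_path(tau=1.0, a_plus=0.1, a_minus=0.1,
                  lam_plus=50.0, lam_minus=40.0, t_end=10.0)
```

With $\mu=\lambda_+a_+-\lambda_-a_-=1$ the mean potential settles near $\tau\mu=1$, which matches the drift of the diffusion approximation derived next.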
The following theorem shows that for a large amount of synaptic input we have a canonical diffusion approximation:

Theorem 3.1. Let $\lambda^{(n)}_\pm$, $a^{(n)}_\pm$, $n=1,2,\dots$, be such that
(a) $\lambda^{(n)}_\pm\to+\infty$,
(b) $\mu^{(n)}:=\lambda^{(n)}_+a^{(n)}_+-\lambda^{(n)}_-a^{(n)}_-\to\mu$,
(c) $\sigma^{(n),2}:=\lambda^{(n)}_+\big(a^{(n)}_+\big)^2+\lambda^{(n)}_-\big(a^{(n)}_-\big)^2\to\sigma^2$.
Then
\[(3.2)\qquad\lim_{n\to\infty}E\big(f(V^{(n)}(t))\mid V^{(n)}(0)=x\big)=E\big(f(V(t))\mid V(0)=x\big)\]
where
\[dV^{(n)}(t)=-\frac1\tau V^{(n)}(t)\,dt+a^{(n)}_+\,dN^{(n)}_+(t)-a^{(n)}_-\,dN^{(n)}_-(t)\]
and $V(t)$, $t\ge0$, is given as the solution of the stochastic differential equation
\[dV(t)=\Big(\mu-\frac1\tau V(t)\Big)\,dt+\sigma\,dW(t)\,.\]
In fact, similarly to Theorem 2.15, we also have that
\[\lim_{n\to\infty}E\big(F(V^{(n)}(\cdot))\mid V^{(n)}(0)=x\big)=E\big(F(V(\cdot))\mid V(0)=x\big)\]
for all $F:D([0,\infty))\to\mathbb{R}$ bounded and continuous w.r.t. the Skorohod topology.
We need the following lemma.

Lemma 3.2. The finite-dimensional distributions of
\[Z^{(n)}(t):=a^{(n)}_+N^{(n)}_+(t)-a^{(n)}_-N^{(n)}_-(t)\]
converge weakly to those of $\sigma W(t)+\mu t$.
Proof. Let $t_k=\frac kn t$ be given. Then
\[Z^{(n)}(t)=\sum_{k=0}^{n-1}\big(Z^{(n)}(t_{k+1})-Z^{(n)}(t_k)\big)=a^{(n)}_+\sum_{k=0}^{n-1}\underbrace{\big(N^{(n)}_+(t_{k+1})-N^{(n)}_+(t_k)\big)}_{\text{independent, Poiss}\left(\lambda^{(n)}_+\frac tn\right)}-a^{(n)}_-\sum_{k=0}^{n-1}\underbrace{\big(N^{(n)}_-(t_{k+1})-N^{(n)}_-(t_k)\big)}_{\text{independent, Poiss}\left(\lambda^{(n)}_-\frac tn\right)}\ \xrightarrow{\,w\,}\ \sigma W(t)+\mu t\sim\mathcal{N}(\mu t,\sigma^2t)\]
by the central limit theorem. Similarly,
\[Z^{(n)}(t)-Z^{(n)}(s)\ \xrightarrow{\,w\,}\ \sigma\big(W(t)-W(s)\big)+\mu(t-s)\,,\]
and by independence of the increments of $Z^{(n)}$ we can also deduce the convergence of finitely many increments to the increments of a Brownian motion with drift $\mu$. $\square$
Proof (of Theorem 3.1). We first show that $\lim_{n\to\infty}Z^{(n)}=\sigma W+\mu\,\mathrm{id}$ weakly on $D([0,\infty))$, i.e.,
\[\lim_{n\to\infty}E\big(F(Z^{(n)})\big)=E\big(F(\sigma W+\mu\,\mathrm{id})\big)\]
for any $F:D([0,\infty))\to\mathbb{R}$ bounded and continuous. To this end it suffices to show that the sequence $P\circ(Z^{(n)})^{-1}$, $n\ge1$, is tight on the Skorohod space $D([0,\infty))$. We will apply Theorem 2.13 and have to show that
(a) $\exists\,\gamma_0>0:\ \sup_{n\ge1}\sup_{t\le T}E\big(|Z^{(n)}(t)|^{\gamma_0}\big)<\infty$;
(b) $\exists\,C,\ \exists\,\gamma_1>0,\ \gamma_2>1$ such that
\[\sup_{n\ge1}\sup_{t\le T}E\big(|Z^{(n)}(t+2h)-Z^{(n)}(t+h)|^{\gamma_1}\,|Z^{(n)}(t+h)-Z^{(n)}(t)|^{\gamma_1}\big)\le Ch^{\gamma_2}\,.\]
For the proof of (a) note that
\begin{align*}
E\big((Z^{(n)}(t))^2\big)&=E\big((a^{(n)}_+N^{(n)}_+(t)-a^{(n)}_-N^{(n)}_-(t))^2\big)\\
&=\big(a^{(n)}_+\big)^2E\big((N^{(n)}_+(t))^2\big)-2a^{(n)}_+a^{(n)}_-E\big(N^{(n)}_+(t)N^{(n)}_-(t)\big)+\big(a^{(n)}_-\big)^2E\big((N^{(n)}_-(t))^2\big)\\
&=\big(a^{(n)}_+\big)^2\big(\lambda^{(n)}_+t+(\lambda^{(n)}_+)^2t^2\big)-2a^{(n)}_+a^{(n)}_-\lambda^{(n)}_+\lambda^{(n)}_-t^2+\big(a^{(n)}_-\big)^2\big(\lambda^{(n)}_-t+(\lambda^{(n)}_-)^2t^2\big)\\
&=\big(\sigma^{(n)}\big)^2t+\big(\mu^{(n)}\big)^2t^2\ \longrightarrow\ \sigma^2t+\mu^2t^2
\end{align*}
as $n\to\infty$, so that condition (a) is satisfied with $\gamma_0=2$.

We will next show that (b) is satisfied with $\gamma_1=\gamma_2=2$. To this end note that by independence of the increments of $Z^{(n)}$
\begin{align*}
E\big(|Z^{(n)}(t+2h)-Z^{(n)}(t+h)|^2\,|Z^{(n)}(t+h)-Z^{(n)}(t)|^2\big)&=E\big(|Z^{(n)}(t+2h)-Z^{(n)}(t+h)|^2\big)\,E\big(|Z^{(n)}(t+h)-Z^{(n)}(t)|^2\big)\\
&=\Big(\big(\sigma^{(n)}\big)^2h+\big(\mu^{(n)}\big)^2h^2\Big)^2\le2\big(\sigma^{(n)}\big)^4h^2+2\big(\mu^{(n)}\big)^4h^4\,,
\end{align*}
which implies the assertion. Now
\[(3.3)\qquad V^{(n)}(t)=x-\frac1\tau\int_0^tV^{(n)}(s)\,ds+Z^{(n)}(t)\,,\quad t\ge0\,,\]
has the alternative representation
\[V^{(n)}(t)=e^{-\frac t\tau}x+Z^{(n)}(t)-\frac1\tau\int_0^te^{-\frac{t-s}{\tau}}Z^{(n)}(s)\,ds\,,\quad t\ge0\,.\]
Indeed, note that (3.3) implies for $T>0$
\begin{align*}
\int_0^Te^{\frac t\tau}V^{(n)}(t)\,dt&=\int_0^Te^{\frac t\tau}x\,dt-\frac1\tau\int_0^Te^{\frac t\tau}\int_0^tV^{(n)}(s)\,ds\,dt+\int_0^Te^{\frac t\tau}Z^{(n)}(t)\,dt\\
&=\tau\big(e^{\frac T\tau}-1\big)x-\int_0^T\big(e^{\frac T\tau}-e^{\frac s\tau}\big)V^{(n)}(s)\,ds+\int_0^Te^{\frac t\tau}Z^{(n)}(t)\,dt\\
&=e^{\frac T\tau}\Big(\tau x-\int_0^TV^{(n)}(s)\,ds\Big)-\tau x+\int_0^Te^{\frac t\tau}V^{(n)}(t)\,dt+\int_0^Te^{\frac t\tau}Z^{(n)}(t)\,dt\,,
\end{align*}
hence subtracting $\int_0^Te^{\frac t\tau}V^{(n)}(t)\,dt$ on both sides yields
\[0=e^{\frac T\tau}\Big(\tau x-\int_0^TV^{(n)}(s)\,ds\Big)-\tau x+\int_0^Te^{\frac t\tau}Z^{(n)}(t)\,dt=e^{\frac T\tau}\big(\tau V^{(n)}(T)-\tau Z^{(n)}(T)\big)-\tau x+\int_0^Te^{\frac t\tau}Z^{(n)}(t)\,dt\]
(inserting (3.3) again), or equivalently,
\[V^{(n)}(T)=e^{-\frac T\tau}x+Z^{(n)}(T)-\frac1\tau\int_0^Te^{-\frac{T-t}{\tau}}Z^{(n)}(t)\,dt\,.\]
Hence $V^{(n)}(t)=\Phi(Z^{(n)})(t)$, where
\[\Phi:D([0,\infty))\to D([0,\infty))\]
is the mapping
\[\Phi(\omega)(t):=e^{-\frac t\tau}x+\omega(t)-\frac1\tau\int_0^te^{-\frac{t-s}{\tau}}\omega(s)\,ds\,.\]
It can be shown that $\Phi$ is continuous w.r.t. the Skorohod metric. Indeed, it is known that $d(\omega_n,\omega)\to0$ if and only if there exist $\lambda_n\in\Lambda$ such that $\lambda_n\to\mathrm{id}$ uniformly and $\omega_n\circ\lambda_n\to\omega$ locally uniformly. But then
\begin{align*}
\Phi(\omega_n)\circ\lambda_n(t)-\Phi(\omega)(t)&=\big(e^{-\frac{\lambda_n(t)}{\tau}}-e^{-\frac t\tau}\big)x+\big(\omega_n\circ\lambda_n(t)-\omega(t)\big)\\
&\quad-\frac1\tau\int_0^{\lambda_n(t)}e^{-\frac{\lambda_n(t)-s}{\tau}}\omega_n(s)\,ds+\frac1\tau\int_0^te^{-\frac{t-s}{\tau}}\omega(s)\,ds\\
&=\big(e^{-\frac{\lambda_n(t)}{\tau}}-e^{-\frac t\tau}\big)x+\big(\omega_n\circ\lambda_n(t)-\omega(t)\big)\\
&\quad-\frac1\tau\int_0^te^{-\frac{\lambda_n(t)-\lambda_n(s)}{\tau}}\omega_n(\lambda_n(s))\,\dot\lambda_n(s)\,ds+\frac1\tau\int_0^te^{-\frac{t-s}{\tau}}\omega(s)\,ds\\
&\longrightarrow0\quad\text{locally uniformly as well}\,.
\end{align*}
Hence, $V^{(n)}=\Phi(Z^{(n)})\to\Phi(\sigma W+\mu\,\mathrm{id})=V$ weakly in $D([0,\infty))$, which implies the assertion. $\square$
3.0.1. Convergence of spike times. Using the simple integrate-and-fire model, a spike of the neuron is defined as the (first) event that the membrane potential $V$ crosses a certain threshold value $V_{\mathrm{spike}}$. The spike time $T$ is therefore given as
\[T:=\inf\{t>0:V(t)>V_{\mathrm{spike}}\}\,.\]
Mathematically, $T$ is a stopping time, i.e., for all $t$ the event $\{T\le t\}$ that a spike occurred up to time $t$ is measurable w.r.t. the $\sigma$-algebra $\mathcal{F}_t:=\sigma(V(s):s\le t)$ generated by the membrane potential $V$ up to time $t$. The distribution of $T$ contains a lot of information about the neuron; however, it is difficult to compute directly for the Poisson input, but may be simpler to compute for the diffusion approximation provided by Theorem 3.1. However, the following example shows that the first passage time is not a continuous functional on $D([0,\infty))$, so that Theorem 3.1 does not yet guarantee the convergence of the distribution of the spike times of $V^{(n)}$ to the distribution of the spike times of their diffusion approximation.
Counterexample. Let
\[\omega_n(t)=\begin{cases}\big(1+\frac1n\big)\sin(t)&t\in[0,\pi]\\ t-\pi&t\ge\pi\end{cases}\]
(a sine bump of height $1+\frac1n$ on $[0,\pi]$, followed by a linearly increasing ramp). Let $V_{\mathrm{spike}}=1$; then $T_1(\omega_n)\le\frac\pi2$ for all $n$, and
\[\omega_n(t)\to\omega(t)=\begin{cases}\sin t&t\in[0,\pi]\\ t-\pi&t\ge\pi\end{cases}\]
uniformly, hence in particular
\[d(\omega_n,\omega)\le\sup_{t\ge0}e^{-t}\,|\omega_n(t)-\omega(t)|\to0\]
(choosing $\lambda=\mathrm{id}$, for which $\|\lambda\|=0$), but $T_1(\omega)=\pi+1$.

To ensure the convergence of the spike times in distribution, we therefore treat this problem separately in the following theorem:
Theorem 3.3. Let $m\in\mathbb{R}$ and
\[T_m(\omega)=\inf\{t\ge0:\omega(t)>m\}\]
for $\omega\in D([0,\infty))$, with the convention $\inf\emptyset=+\infty$. Then $T_m(V^{(n)})\xrightarrow{\,d\,}T_m(V)$.
Proof. We know that a sequence of random variables $X_n$ converges in distribution to $X$ if and only if for the cumulative distribution functions $F_{X_n}$, $F_X$
\[\lim_{n\to\infty}F_{X_n}(x)=F_X(x)\]
for all points $x$ of continuity of $F_X$. For all $m'\in\mathbb{R}$ we have that
\[P\Big(\sup_{t\in[0,T]}V(t)<m'\Big)\le\liminf_{n\to\infty}P\Big(\sup_{t\in[0,T]}V^{(n)}(t)<m'\Big)\]
and
\[\limsup_{n\to\infty}P\Big(\sup_{t\in[0,T]}V^{(n)}(t)\le m'\Big)\le P\Big(\sup_{t\in[0,T]}V(t)\le m'\Big)\,,\]
since $\{\sup_{t\in[0,T]}\omega(t)\le m'\}\subseteq D([0,\infty))$ is closed and $\{\sup_{t\in[0,T]}\omega(t)<m'\}\subseteq D([0,\infty))$ is open.
Note that $V(t)$, $t\ge0$, is in fact a continuous process, so that for $m'\downarrow m$
\[\{T_{m'}(V)\le T\}\uparrow\{T_m(V)\le T\}\,.\]
Indeed, $\{T_{m'}(V)\le T\}$ is monotone increasing for $m'$ decreasing, and conversely
\[\{T_m(V)\le T\}=\{V(t)>m\text{ for some }t\le T\}\subseteq\bigcup_{m'>m}\{V(t)>m'\text{ for some }t\le T\}=\bigcup_{m'>m}\{T_{m'}(V)\le T\}\,.\]
Lebesgue's theorem implies that
\[P(T_m(V)\le T)=\lim_{m'\downarrow m}P(T_{m'}(V)\le T)\,.\]
Note that
\[\{T_m(V)\le T\}=\Big\{\sup_{t\in[0,T]}V(t)>m\Big\}\]
implies for $m'>m''>m$
\begin{align*}
P(T_{m'}(V)\le T)&=1-P\Big(\sup_{t\in[0,T]}V(t)\le m'\Big)\\
&\le1-\limsup_{n\to\infty}P\Big(\sup_{t\in[0,T]}V^{(n)}(t)\le m'\Big)\\
&\le1-\liminf_{n\to\infty}P\Big(\sup_{t\in[0,T]}V^{(n)}(t)<m''\Big)\\
&\le1-P\Big(\sup_{t\in[0,T]}V(t)<m''\Big)\\
&=P\Big(\sup_{t\in[0,T]}V(t)\ge m''\Big)\le P\Big(\sup_{t\in[0,T]}V(t)>m\Big)=P(T_m(V)\le T)\,.
\end{align*}
Taking the limit $m'\downarrow m$ implies that we have equality everywhere, so that in particular
\[\lim_{n\to\infty}P\Big(\sup_{t\in[0,T]}V^{(n)}(t)>m\Big)=P\Big(\sup_{t\in[0,T]}V(t)>m\Big)\,,\]
in other words,
\[\lim_{n\to\infty}P\big(T_m(V^{(n)})\le T\big)=P(T_m(V)\le T)\,.\qquad\square\]
CHAPTER 4
Stochastic Integrate-and-Fire models
In this chapter we will introduce and analyse the stochastic integrate-and-fire (IF) model as the simplest statistical model for the membrane potential, which is the basic observable of neural activity. For its neural background let us first recall the basic dynamical features of the membrane potential:
- the synaptic input changes the membrane potential along the dendrites
- this change in the membrane potential is passed through the dendrites to the cell body
- the membrane potential at the cell body integrates the synaptic input over time and, provided the input is big enough, i.e. crosses a certain threshold value, can produce a sharp rise in the membrane potential followed by a sharp decrease and a refractory period in which the membrane potential slowly returns to its original resting value
- the sharp rise followed by the sharp decrease is called a spike or action potential and is actively transmitted through the axon to other neurons
The IF model captures this basic mechanism by setting up the following differential equation for the membrane potential:
\[(4.1)\qquad C\frac{dV}{dt}=-\frac VR+I\]
together with a reset rule that consists in resetting the membrane potential $V$ to a lower value $V_r$ once it reaches a certain value $V_{th}$. This mechanism induces a discontinuity in the process that causes many difficulties in its subsequent mathematical analysis.

Main additional feature of the leaky IF model: firing can only be reached for a large enough input current $I$, because integration of (4.1) yields
\[V_t=e^{-t/CR}\,V_r+\big(1-e^{-t/CR}\big)IR\,,\]
which crosses the level $V_{th}$ for some $t$ only if $IR>V_{th}$.
Stochasticity

We have already identified the two main sources of fluctuations in the membrane potential:
- the random closing and opening of regulating ion channels
- uncorrelated input from presynaptic neurons.
The simplest effective statistical modeling of fluctuations in the membrane potential is obtained by simply adding Brownian motion $(W_t)_{t\ge0}$ as an exterior forcing term acting on the membrane potential, which yields the following stochastic differential equation (SDE):
\[(4.2)\qquad dV_t=\Big(\frac IC-\frac{V_t}{CR}\Big)\,dt+\sigma\,dW_t\,,\qquad V_0=V_r\,.\]
Note that (4.2) is a linear SDE and its unique strong solution can be represented as
\[V_t=e^{-\frac t{CR}}\,V_r+\Big(1-e^{-\frac t{CR}}\Big)IR+\int_0^te^{-\frac{t-s}{CR}}\,\sigma\,dW_s\]
(see Appendix C).
A spike occurs once the process $V_t$ hits the threshold $V_{th}$, i.e., a spike occurs at the first passage time
\[T:=\inf\{t>0:V_t>V_{th}\}\,.\]
$T$ is also called the firing time. The quantity of interest is the interspike interval (ISI) statistics, i.e., the distribution of $T$; $E(T)$ is called the mean firing time.
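The ISI statistics are easy to sample numerically: integrate (4.2) with the Euler-Maruyama scheme and apply the reset rule whenever the threshold is crossed. The following Python sketch (all parameter values are illustrative choices) collects a sample of interspike intervals:

```python
import math
import random

def lif_isi(I, C, R, sigma, v_r, v_th, n_spikes, dt=1e-3, seed=11):
    """Euler-Maruyama simulation of the stochastic leaky IF model (4.2)
    with reset; returns the first n_spikes interspike intervals."""
    rng = random.Random(seed)
    v, t_last, t, isis = v_r, 0.0, 0.0, []
    sdt = math.sqrt(dt)
    while len(isis) < n_spikes:
        v += (I / C - v / (C * R)) * dt + sigma * sdt * rng.gauss(0.0, 1.0)
        t += dt
        if v > v_th:                 # spike: record the ISI and reset
            isis.append(t - t_last)
            t_last, v = t, v_r
    return isis

# suprathreshold regime: IR = 2 > v_th = 1, so firing is guaranteed
isis = lif_isi(I=2.0, C=1.0, R=1.0, sigma=0.5, v_r=0.0, v_th=1.0,
               n_spikes=200)
mean_isi = sum(isis) / len(isis)     # Monte Carlo estimate of E(T)
```

In the noiseless limit the chosen parameters give a deterministic firing time $CR\,\log\frac{IR-V_r}{IR-V_{th}}=\log 2\approx0.69$; the simulated mean ISI scatters around a value of this order.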
4.1. The distribution of T
Consider the stochastic differential equation
\[(4.3)\qquad dV_t=f(V_t)\,dt+\sigma(V_t)\,dW_t\,,\qquad V_0=V_r\,.\]
We assume that $f$ and $\sigma$ are Lipschitz continuous and that $\sigma(V)>0$ for all $V$. In particular, the above SDE has a unique strong solution for any initial condition. How can we compute the distribution of the first passage time $T$?
4.1.1. General concepts. Let us begin with some general remarks. Consider the stochastic differential equation (4.3) under the general assumption on the coefficients that for all $x\in\mathbb{R}$ there exists a unique strong solution $X_t(x)$, $t\ge0$, with initial condition $X_0(x)=x$. It turns out that for $\sigma\neq0$ the distribution of $X_t(x)$ has a density $p_t(x,y)$ for $t>0$, so that
\[E\big(g(X_t(x))\big)=\int p_t(x,y)\,g(y)\,dy\,.\]
Under additional assumptions on the coefficients $f$ and $\sigma$, $p_t(x,y)$ satisfies for all $x\in\mathbb{R}$ the forward Kolmogorov equation
\[(4.4)\qquad\partial_tp_t(x,y)=L^*_y\,p_t(x,y)\,,\quad t>0\,,\ y\in\mathbb{R}\,,\]
and for all $y\in\mathbb{R}$ the backward Kolmogorov equation
\[(4.5)\qquad\partial_tp_t(x,y)=L_x\,p_t(x,y)\,,\quad t>0\,,\ x\in\mathbb{R}\,.\]
Here,
\[L_xg(x)=\frac12\sigma^2(x)g_{xx}(x)+f(x)g_x(x)\]
is the generator of the stochastic differential equation (4.3), and
\[L^*_yg(y)=\frac12(\sigma^2g)_{yy}(y)-(fg)_y(y)\]
its formal adjoint (w.r.t. the Lebesgue measure).
The family of densities $p_t(x,y)$, $t > 0$, forms a semigroup w.r.t. convolution, i.e.,
\[ (4.6)\qquad \int p_s(x,y)\, p_t(y,z)\, dy = p_{s+t}(x,z) \qquad \forall\, x, z , \]
which is equivalent to the Markov property of the solution of (4.3).

We are interested in the first passage time
\[ \tau^{x_0}_b = \inf\{ t \ge 0 \mid X_t(x_0) > b \} \]
of the solution through the level b. (Of course $\tau^{x_0}_b \equiv 0$ if $b \le x_0$.) Let $G^{x_0}_b(t) = P(\tau^{x_0}_b \le t)$ denote its distribution function and $g^{x_0}_b(t)$ its density (if it exists). Due to the (strong) Markov property the first passage time density satisfies the following Volterra integral equation of the first kind:
\[ (4.7)\qquad p_t(x_0,y) = \int_0^t g^{x_0}_b(s)\, p_{t-s}(b,y)\, ds \qquad \forall\, y \ge b > x_0 . \]
The interpretation of this equation is as follows: given $y \ge b > x_0$, the solution $X_s(x_0)$ must have crossed the level b at least once if $X_t(x_0) = y$. If we condition on the first time $\tau^{x_0}_b$ this happens, the process starts afresh at the level b at that time and runs until it reaches its terminal value y at time t. The probability density for this is $p_{t-\tau^{x_0}_b}(b,y)$.
Example 4.1. (Explicit solutions)

(a) Brownian motion: $p_t(x,y) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{(x-y)^2}{2t}}$. In this case, we can simply integrate (4.7) over $y \ge b$ to obtain
\[ \int_b^\infty p_t(x_0,y)\, dy = \int_0^t g^{x_0}_b(s) \underbrace{\int_b^\infty p_{t-s}(b,y)\, dy}_{=\frac12}\, ds = \frac12 \int_0^t g^{x_0}_b(s)\, ds = \frac12\, G^{x_0}_b(t) , \]
which implies that
\[ G^{x_0}_b(t) = 2 \int_b^\infty p_t(x_0,y)\, dy = 2\, P(W_t > b - x_0) \quad\text{and}\quad g^{x_0}_b(t) = \frac{b-x_0}{\sqrt{2\pi t^3}}\, e^{-\frac{(b-x_0)^2}{2t}} . \]
(b) Brownian motion with constant drift I: $p_t(x,y) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{(x+It-y)^2}{2t}}$. In this case, we choose $y = b$, so that (4.7) reduces to
\[ p_t(x_0,b) = \int_0^t g^{x_0}_b(s)\, \frac{1}{\sqrt{2\pi (t-s)}}\, e^{-\frac{I^2}{2}(t-s)}\, ds , \]
and multiplying both sides with $e^{\frac{I^2 t}{2}}$, we obtain
\[ e^{\frac{I^2 t}{2}}\, p_t(x_0,b) = \int_0^t \frac{e^{\frac{I^2 s}{2}}\, g^{x_0}_b(s)}{\sqrt{2\pi (t-s)}}\, ds . \]
The right hand side is the Abel transform of $e^{\frac{I^2 s}{2}} g^{x_0}_b(s)$, which can be inverted with explicit inverse
\[ e^{\frac{I^2 t}{2}}\, g^{x_0}_b(t) = \sqrt{\frac{2}{\pi}}\, \frac{d}{dt} \int_0^t \frac{1}{\sqrt{t-s}}\, e^{\frac{I^2 s}{2}}\, p_s(x_0,b)\, ds . \]
Unfortunately, equation (4.7) determines the first passage time density only implicitly, and the literature covers a whole range of ideas on how to proceed with this fundamental equation in order to come up with reasonable approximations for $g^{x_0}_b$.
4.1.2. The mean firing rate E(T). The following theorem gives an explicit formula for the mean firing time in terms of the coefficients of the driving sde.

Theorem 4.2. Suppose that $P(T = \infty) = 0$. Then the mean firing time E(T) is given as
\[ (4.8)\qquad E(T) = \int_{-\infty}^{V_{th}} \int_{\max(x,V_r)}^{V_{th}} \exp\Bigl( -2 \int_{V_r}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr)\, dy\; \frac{2}{\sigma^2(x)} \exp\Bigl( 2 \int_{V_r}^{x} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr)\, dx . \]
Example 4.3. (i) Constant drift: $f(V) = I$, $\sigma^2(s) \equiv \sigma^2$. Then
\[ E(T) = \frac{2}{\sigma^2} \int_{-\infty}^{V_{th}} \int_{x \vee V_r}^{V_{th}} \exp\Bigl( -\frac{2I}{\sigma^2}(y - V_r) \Bigr) dy\; \exp\Bigl( \frac{2I}{\sigma^2}(x - V_r) \Bigr) dx \]
\[ = \frac{1}{I} \int_{-\infty}^{V_r} \Bigl[ 1 - \exp\Bigl( -\frac{2I}{\sigma^2}(V_{th} - V_r) \Bigr) \Bigr] \exp\Bigl( \frac{2I}{\sigma^2}(x - V_r) \Bigr) dx \]
\[ \quad + \frac{1}{I} \int_{V_r}^{V_{th}} \Bigl[ \exp\Bigl( -\frac{2I}{\sigma^2}(x - V_r) \Bigr) - \exp\Bigl( -\frac{2I}{\sigma^2}(V_{th} - V_r) \Bigr) \Bigr] \exp\Bigl( \frac{2I}{\sigma^2}(x - V_r) \Bigr) dx \]
\[ = \frac{\sigma^2}{2I^2} - \frac{1}{I} \underbrace{\int_{-\infty}^{V_{th}} \exp\Bigl( \frac{2I}{\sigma^2}(x - V_{th}) \Bigr) dx}_{=\sigma^2/2I} + \frac{V_{th} - V_r}{I} , \]
so that the first two terms cancel. Thus $E(T) = \frac{V_{th} - V_r}{I}$, independent of $\sigma^2$!
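The $\sigma$-independence of the mean firing time in the constant-drift case is easy to check by simulation. The following Monte Carlo sketch (our own illustration, not from the notes; parameter values are arbitrary) uses the Euler scheme of Section 4.1.5 below:

```python
import math
import random

def mean_firing_time(I, sigma, v_r=0.0, v_th=1.0, dt=1e-3, n_runs=1000, seed=0):
    """Monte Carlo estimate of E(T) for the perfect integrate-and-fire model
    dV = I dt + sigma dW, V_0 = v_r, with firing threshold v_th."""
    rng = random.Random(seed)
    sq = math.sqrt(dt)
    total = 0.0
    for _ in range(n_runs):
        v, t = v_r, 0.0
        while v <= v_th:                      # run until the first passage V > v_th
            v += I * dt + sigma * sq * rng.gauss(0.0, 1.0)
            t += dt
        total += t
    return total / n_runs

if __name__ == "__main__":
    # E(T) = (v_th - v_r)/I = 0.5, regardless of sigma
    for sigma in (0.2, 0.5, 1.0):
        print(sigma, mean_firing_time(I=2.0, sigma=sigma))
```

Up to discretization bias of order $\sqrt{dt}$ and Monte Carlo noise, all three estimates agree with $(V_{th}-V_r)/I$.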
(ii) Leaky IF-model: $f(V) = I - \theta V$, $\sigma^2(s) \equiv \sigma^2$. Then
\[ E(T) = \frac{2}{\sigma^2} \int_{-\infty}^{V_{th}} \int_{x \vee V_r}^{V_{th}} \exp\Bigl( -\frac{2I}{\sigma^2}(y - V_r) + \frac{\theta}{\sigma^2}(y - V_r)^2 \Bigr) dy\; \exp\Bigl( \frac{2I}{\sigma^2}(x - V_r) - \frac{\theta}{\sigma^2}(x - V_r)^2 \Bigr) dx , \]
which is not easy to evaluate! For $\theta = 1$ one obtains
\[ E(T) = \sqrt{\pi} \int_{(V_r - I)/\sigma}^{(V_{th} - I)/\sigma} e^{x^2} \bigl( 1 + \operatorname{erf}(x) \bigr)\, dx \]
with
\[ \operatorname{erf}(x) = \frac{1}{\sqrt{\pi}} \int_{-x}^{+x} e^{-s^2}\, ds . \]
The explicit formula for $\theta = 1$ has been obtained in the paper [FB02].
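The $\theta = 1$ formula can be evaluated numerically with the standard error function. The sketch below is our own illustration (function name and parameter values are ours, not from the notes); it uses `math.erf` and the trapezoidal rule:

```python
import math

def leaky_if_mean_T(I, sigma, v_r, v_th, n=4000):
    """Evaluate sqrt(pi) * int_{(v_r-I)/sigma}^{(v_th-I)/sigma} e^{x^2} (1 + erf(x)) dx,
    the mean firing time of dV = (I - V) dt + sigma dW (theta = 1), by the
    trapezoidal rule with n subintervals."""
    a = (v_r - I) / sigma
    b = (v_th - I) / sigma
    h = (b - a) / n

    def g(x):
        return math.exp(x * x) * (1.0 + math.erf(x))

    s = 0.5 * (g(a) + g(b)) + sum(g(a + k * h) for k in range(1, n))
    return math.sqrt(math.pi) * h * s

if __name__ == "__main__":
    # e.g. I = 1.5, sigma = 0.5, v_r = 0, v_th = 1 (arbitrary suprathreshold example)
    print(leaky_if_mean_T(1.5, 0.5, 0.0, 1.0))
```

For suprathreshold input ($I > V_{th}$) the integrand stays bounded; for strongly subthreshold input the factor $e^{x^2}$ grows quickly and a plain quadrature becomes inaccurate.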
4.1.3. Proof of Theorem 4.2. The following subsection is devoted to the derivation of formula (4.8). Along the way we will also introduce general concepts in the stochastic analysis of Ito processes, relating expectations to differential equations.

We will need the following additional notation: for $a < V_r$ and $b = V_{th}$ let
\[ T_{a,b} := \inf\{ t > 0 : V_t \notin [a,b] \} \]
be the first exit time of the solution V from [a, b].
Proposition 4.4. Let $h \in C^2([a,b])$ be a solution of
\[ (4.9)\qquad \frac{\sigma^2(x)}{2}\, h''(x) + f(x)\, h'(x) = 0, \qquad x \in [a,b]. \]
Then
\[ P\bigl( V_{T_{a,b}} = b \bigr) = \frac{h(V_0) - h(a)}{h(b) - h(a)} . \]

Proof. Ito's formula applied to $h(V_t)$, $t \le T_{a,b}$, implies
\[ dh(V_t) = h'(V_t)\, dV_t + \tfrac12 h''(V_t)\, d\langle V \rangle_t = h'(V_t)\bigl( f(V_t)\, dt + \sigma(V_t)\, dW_t \bigr) + \tfrac12 h''(V_t)\, \sigma^2(V_t)\, dt \]
\[ = h'(V_t)\sigma(V_t)\, dW_t + \underbrace{\Bigl( \frac{\sigma^2(V_t)}{2}\, h''(V_t) + f(V_t)\, h'(V_t) \Bigr)}_{=0 \text{ for } t \le T_{a,b}} dt = h'(V_t)\sigma(V_t)\, dW_t , \]
hence $h(V_t) = h(V_0) + \int_0^t h'(V_s)\sigma(V_s)\, dW_s$ for $t \le T_{a,b}$, or equivalently,
\[ h\bigl( V_{t \wedge T_{a,b}} \bigr) = h(V_0) + \underbrace{\int_0^{t \wedge T_{a,b}} h'(V_s)\sigma(V_s)\, dW_s}_{=: M_{t \wedge T_{a,b}}} . \]
The stochastic integral $M_{t \wedge T_{a,b}}$ is integrable and has mean zero, $E(M_{t \wedge T_{a,b}}) = 0$, by the optional sampling theorem A.8, hence
\[ E\bigl( h(V_{T_{a,b}}) \bigr) = \lim_{t \to \infty} E\bigl( h(V_{t \wedge T_{a,b}}) \bigr) = h(V_0) . \]
Now $h(V_{T_{a,b}})$ only takes the two values $h(a)$ and $h(b)$, so that
\[ E\bigl( h(V_{T_{a,b}}) \bigr) = h(a)\, \underbrace{P(V_{T_{a,b}} = a)}_{=1 - P(V_{T_{a,b}} = b)} + h(b)\, P(V_{T_{a,b}} = b) \]
and therefore
\[ P\bigl( V_{T_{a,b}} = b \bigr) = \frac{h(V_0) - h(a)}{h(b) - h(a)} . \qquad\square \]
An explicit solution h of (4.9) is given by
\[ h(x) = \int_{v_0}^{x} \exp\Bigl( -2 \int_{v_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy , \]
since $g = h'$ satisfies the linear differential equation $g'(x) = -\frac{2 f(x)}{\sigma^2(x)}\, g(x)$, therefore $g(x) = c \exp\bigl( -2 \int_{v_0}^{x} \frac{f(s)}{\sigma^2(s)}\, ds \bigr)$ for some constant c. Applying Proposition 4.4 yields the explicit formula
\[ P\bigl( V_{T_{a,b}} = b \bigr) = \frac{\displaystyle\int_{a}^{v_0} \exp\Bigl( -2 \int_{v_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy}{\displaystyle\int_{a}^{b} \exp\Bigl( -2 \int_{v_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy} . \]
Example 4.5. (i) Brownian motion with variance $\sigma^2$: $f = 0$, $\sigma^2 > 0$ constant. Then $h(x) = x - v_0$ is linear and independent of $\sigma^2$, and
\[ P\bigl( V_{T_{a,b}} = b \bigr) = \frac{v_0 - a}{b - a} \qquad (\to 1 \text{ as } a \to -\infty) . \]

(ii) Brownian motion with constant drift $f(V) \equiv I \ne 0$. Then
\[ h(x) = \int_{v_0}^{x} \exp\Bigl( -\frac{2I}{\sigma^2}(y - v_0) \Bigr) dy = \frac{\sigma^2}{2I} \Bigl( 1 - e^{-\frac{2I}{\sigma^2}(x - v_0)} \Bigr) , \]
which implies
\[ P\bigl( V_{T_{a,b}} = b \bigr) = \frac{1 - e^{-\frac{2I}{\sigma^2}(a - v_0)}}{e^{-\frac{2I}{\sigma^2}(b - v_0)} - e^{-\frac{2I}{\sigma^2}(a - v_0)}} = \frac{e^{-\frac{2I}{\sigma^2} v_0} - e^{-\frac{2I}{\sigma^2} a}}{e^{-\frac{2I}{\sigma^2} b} - e^{-\frac{2I}{\sigma^2} a}} \xrightarrow[a \to -\infty]{} \begin{cases} 1 & \text{if } I > 0, \\ e^{\frac{2I}{\sigma^2}(b - v_0)} & \text{if } I < 0 . \end{cases} \]

(iii) Brownian motion with affine linear drift $f(V) = I - \theta V$, $\theta \ne 0$. Then
\[ h(x) = \int_{v_0}^{x} \exp\Bigl( \frac{\theta}{\sigma^2}(y - v_0)^2 - \frac{2I}{\sigma^2}(y - v_0) \Bigr) dy , \]
which implies
\[ P\bigl( V_{T_{a,b}} = b \bigr) = \frac{\displaystyle\int_{a}^{v_0} \exp\Bigl( \frac{\theta}{\sigma^2}(y - v_0)^2 - \frac{2I}{\sigma^2}(y - v_0) \Bigr) dy}{\displaystyle\int_{a}^{b} \exp\Bigl( \frac{\theta}{\sigma^2}(y - v_0)^2 - \frac{2I}{\sigma^2}(y - v_0) \Bigr) dy} \xrightarrow[a \to -\infty]{} \begin{cases} 1 & \text{if } \theta > 0, \\[1ex] \dfrac{\int_{-\infty}^{v_0} \exp\bigl( \frac{\theta}{\sigma^2}(y - v_0)^2 - \frac{2I}{\sigma^2}(y - v_0) \bigr) dy}{\int_{-\infty}^{b} \exp\bigl( \frac{\theta}{\sigma^2}(y - v_0)^2 - \frac{2I}{\sigma^2}(y - v_0) \bigr) dy} & \text{if } \theta < 0 . \end{cases} \]
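The closed form in Example 4.5 (ii) is easy to cross-check by simulation. The following sketch is our own illustration (function names and the chosen parameters are ours); it compares the exit-probability formula with an Euler-scheme estimate:

```python
import math
import random

def exit_prob_formula(I, sigma, a, b, v0):
    """Closed form P(V_{T_{a,b}} = b) for dV = I dt + sigma dW (Example 4.5 (ii))."""
    c = 2.0 * I / sigma ** 2
    return (1.0 - math.exp(-c * (a - v0))) / (
        math.exp(-c * (b - v0)) - math.exp(-c * (a - v0)))

def exit_prob_mc(I, sigma, a, b, v0, n_runs=2000, dt=2e-3, seed=0):
    """Euler-scheme estimate of the probability of leaving [a, b] through b."""
    rng = random.Random(seed)
    sq = math.sqrt(dt)
    hits = 0
    for _ in range(n_runs):
        v = v0
        while a <= v <= b:
            v += I * dt + sigma * sq * rng.gauss(0.0, 1.0)
        hits += v > b          # exited upward
    return hits / n_runs

if __name__ == "__main__":
    print(exit_prob_formula(1.0, 1.0, -1.0, 1.0, 0.0),
          exit_prob_mc(1.0, 1.0, -1.0, 1.0, 0.0))
```

With $I = \sigma = 1$, $[a,b] = [-1,1]$, $v_0 = 0$ both numbers lie near $(1-e^{-2})/(1-e^{-4}) \approx 0.88$, up to the usual overshoot bias of the discrete scheme.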
The last proposition provides us with the exit distribution of $V_t$. To compute the mean exit time $E(T_{a,b})$ we will need the following:

Proposition 4.6. Let $u \in C^2([a,b])$ be a solution of
\[ (4.10)\qquad \frac{\sigma^2(x)}{2}\, u''(x) + f(x)\, u'(x) = -1, \qquad x \in [a,b] . \]
Then
\[ E(T_{a,b}) = u(V_0) - u(a) - P\bigl( V_{T_{a,b}} = b \bigr)\bigl( u(b) - u(a) \bigr) = -u(a)\, P(V_{T_{a,b}} = a) - u(b)\, P(V_{T_{a,b}} = b) \quad \text{if } u(V_0) = 0 . \]

Proof. Similarly to the proof of Proposition 4.4, Ito's formula implies that
\[ du(V_t) = u'(V_t)\sigma(V_t)\, dW_t + \underbrace{\Bigl( \frac{\sigma^2(V_t)}{2}\, u''(V_t) + f(V_t)\, u'(V_t) \Bigr)}_{=-1 \text{ for } t \le T_{a,b}} dt = u'(V_t)\sigma(V_t)\, dW_t - dt , \]
hence
\[ u\bigl( V_{t \wedge T_{a,b}} \bigr) = u(V_0) + \int_0^{t \wedge T_{a,b}} u'(V_s)\sigma(V_s)\, dW_s - t \wedge T_{a,b} , \]
therefore
\[ E(T_{a,b} \wedge t) = u(V_0) - E\bigl( u(V_{t \wedge T_{a,b}}) \bigr) , \]
which implies in the limit $t \uparrow \infty$
\[ E(T_{a,b}) = u(V_0) - E\bigl( u(V_{T_{a,b}}) \bigr) = u(V_0) - u(a)\, P\bigl( V_{T_{a,b}} = a \bigr) - u(b)\, P\bigl( V_{T_{a,b}} = b \bigr) . \qquad\square \]
An explicit solution u of (4.10) with $u(a) = u(b) = 0$ is given by
\[ u(x) = -\int_a^x \bigl( h(x) - h(y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy + \frac{h(x) - h(a)}{h(b) - h(a)} \int_a^b \bigl( h(b) - h(y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy , \]
therefore
\[ E(T_{a,b}) = u(V_0) = -\int_a^{V_0} \bigl( h(V_0) - h(y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy \]
\[ \qquad + \frac{h(V_0) - h(a)}{h(b) - h(a)} \int_a^b \bigl( h(b) - h(y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy . \]
Suppose now that
\[ \lim_{a \to -\infty} \frac{h(x) - h(a)}{h(b) - h(a)} = 1 , \]
which is the case if and only if $P(T < \infty) = 1$. Then
\[ E(T) = \lim_{a \to -\infty} E(T_{a,b}) = -\int_{-\infty}^{V_0} \bigl( h(V_0) - h(y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy + \int_{-\infty}^{b} \bigl( h(b) - h(y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy \]
\[ = \int_{-\infty}^{V_0} \bigl( h(b) - h(V_0) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy + \int_{V_0}^{b} \bigl( h(b) - h(y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy \]
\[ = \int_{-\infty}^{b} \bigl( h(b) - h(V_0 \vee y) \bigr)\, \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy . \]
Finally, using $h(b) - h(V_0 \vee y) = \int_{V_0 \vee y}^{b} \exp\bigl( -2 \int_{V_0}^{s} \frac{f(t)}{\sigma^2(t)}\, dt \bigr) ds$ and $b = V_{th}$, we get the formula (4.8):
\[ E(T) = \int_{-\infty}^{V_{th}} \int_{V_0 \vee y}^{V_{th}} \exp\Bigl( -2 \int_{V_0}^{s} \frac{f(t)}{\sigma^2(t)}\, dt \Bigr) ds\; \frac{2}{\sigma^2(y)} \exp\Bigl( 2 \int_{V_0}^{y} \frac{f(s)}{\sigma^2(s)}\, ds \Bigr) dy , \]
and Theorem 4.2 is proven.
4.1.4. The distribution of T. The Laplace transform of a probability measure provides a method to compute the distribution of T in particular cases. The method is again based on the optional sampling theorem.

(i) Consider as an example the case
\[ dV_t = \sigma\, dW_t, \qquad \sigma \text{ constant.} \]

Proposition 4.7. Let $\lambda > 0$. Then
\[ E\bigl( e^{-\lambda T} \bigr) = e^{\frac{\sqrt{2\lambda}}{\sigma}(V_r - V_{th})} . \]
In particular, the distribution of T is
\[ P(T \in dt) = \underbrace{\frac{V_{th} - V_r}{\sigma \sqrt{2\pi t^3}} \exp\Bigl( -\frac{(V_{th} - V_r)^2}{2\sigma^2 t} \Bigr)}_{=: f(t)} dt, \qquad t > 0 . \]
Proof. Consider the process
\[ M_t := \exp\Bigl( -\lambda t + \frac{\sqrt{2\lambda}}{\sigma}\, V_t \Bigr) ; \]
then
\[ dM_t = \Bigl( -\lambda M_t + \frac12 \Bigl( \frac{\sqrt{2\lambda}}{\sigma} \Bigr)^2 \sigma^2 M_t \Bigr) dt + \sqrt{2\lambda}\, M_t\, dW_t = \sqrt{2\lambda}\, M_t\, dW_t . \]
It follows that $(M_t)$ is a martingale, thus $E(M_{T \wedge t}) = E(M_0)$ for all t, which implies that
\[ E\Bigl( \exp\Bigl( -\lambda (t \wedge T) + \frac{\sqrt{2\lambda}}{\sigma}\, V_{t \wedge T} \Bigr) \Bigr) = \exp\Bigl( \frac{\sqrt{2\lambda}}{\sigma}\, V_r \Bigr) . \]
Taking the limit $t \to \infty$ and using $P(T < \infty) = 1$, we obtain that
\[ E\Bigl( \exp\Bigl( -\lambda T + \frac{\sqrt{2\lambda}}{\sigma} \underbrace{V_T}_{=V_{th}} \Bigr) \Bigr) = \exp\Bigl( \frac{\sqrt{2\lambda}}{\sigma}\, V_r \Bigr) . \]
To verify the density of the distribution of T, it suffices to show that
\[ \int_0^\infty e^{-\lambda t} f(t)\, dt = e^{\frac{\sqrt{2\lambda}}{\sigma}(V_r - V_{th})}, \qquad \lambda > 0 . \]
To this end note that for any $\alpha, \beta > 0$
\[ (4.11)\qquad \int_0^\infty \frac{1}{\sqrt{t^3}}\, e^{-\alpha^2 t - \frac{\beta^2}{t}}\, dt = \frac{\sqrt{\pi}\, e^{-2\alpha\beta}}{\beta} . \]
This identity will be proven below.
Consequently,
\[ \int_0^\infty e^{-\lambda t} f(t)\, dt = \frac{V_{th} - V_r}{\sigma \sqrt{2\pi}} \int_0^\infty \frac{1}{\sqrt{t^3}}\, e^{-\lambda t - \frac{(V_{th} - V_r)^2}{2\sigma^2 t}}\, dt = \frac{V_{th} - V_r}{\sigma \sqrt{2\pi}} \cdot \frac{\sqrt{\pi}\, e^{-2\sqrt{\lambda}\, \frac{V_{th} - V_r}{\sqrt{2}\,\sigma}}}{\frac{V_{th} - V_r}{\sqrt{2}\,\sigma}} = e^{-\frac{\sqrt{2\lambda}}{\sigma}(V_{th} - V_r)} . \qquad\square \]
The distribution of T is called the Levy distribution.
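The Laplace transform identity just proved can be checked by direct numerical integration of the Levy density. The sketch below is our own illustration (function names are ours; the tail is cut at a finite `t_max`, which is harmless because of the factor $e^{-\lambda t}$):

```python
import math

def levy_density(t, dv, sigma):
    """Density f(t) of the first passage time of sigma*W through dv = Vth - Vr."""
    return dv / (sigma * math.sqrt(2.0 * math.pi * t ** 3)) \
        * math.exp(-dv ** 2 / (2.0 * sigma ** 2 * t))

def laplace_transform(lam, dv, sigma, t_max=50.0, n=50000):
    """Riemann-sum approximation of int_0^infty e^{-lam*t} f(t) dt (tail cut at t_max)."""
    h = t_max / n
    return h * sum(math.exp(-lam * k * h) * levy_density(k * h, dv, sigma)
                   for k in range(1, n + 1))

if __name__ == "__main__":
    # should match the closed form exp(-sqrt(2*lam)*dv/sigma)
    print(laplace_transform(1.0, 1.0, 1.0), math.exp(-math.sqrt(2.0)))
```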
Remark (Proof of (4.11)). Indeed, the Cauchy-Schlomilch transformation states that for any measurable nonnegative function f
\[ \int_0^\infty f\Bigl( \Bigl( \alpha t - \frac{\beta}{t} \Bigr)^2 \Bigr) dt = \frac{1}{\alpha} \int_0^\infty f(y^2)\, dy . \]
From this one can deduce the desired identity in two steps.

Step 1:
\[ \int_0^\infty \frac{1}{\sqrt{t}}\, e^{-\alpha^2 t - \frac{\beta^2}{t}}\, dt = \frac{\sqrt{\pi}}{\alpha}\, e^{-2\alpha\beta} . \]
Indeed,
\[ \int_0^\infty \frac{1}{\sqrt{t}}\, e^{-\alpha^2 t - \frac{\beta^2}{t}}\, dt = e^{-2\alpha\beta} \int_0^\infty \frac{1}{\sqrt{t}}\, e^{-\bigl( \alpha\sqrt{t} - \frac{\beta}{\sqrt{t}} \bigr)^2}\, dt = 2\, e^{-2\alpha\beta} \int_0^\infty e^{-\bigl( \alpha x - \frac{\beta}{x} \bigr)^2}\, dx = \frac{\sqrt{\pi}}{\alpha}\, e^{-2\alpha\beta} , \]
using the substitution $x = \sqrt{t}$, the Cauchy-Schlomilch transformation and $\int_0^\infty e^{-y^2}\, dy = \frac{\sqrt{\pi}}{2}$.

Step 2: Let $G(\beta) := \int_0^\infty \frac{1}{\sqrt{t}}\, e^{-\alpha^2 t - \frac{\beta^2}{t}}\, dt$, so that
\[ G'(\beta) = -2\beta \int_0^\infty \frac{1}{\sqrt{t^3}}\, e^{-\alpha^2 t - \frac{\beta^2}{t}}\, dt . \]
On the other hand, Step 1 implies that
\[ G'(\beta) = \frac{d}{d\beta}\, \frac{\sqrt{\pi}}{\alpha}\, e^{-2\alpha\beta} = -2\sqrt{\pi}\, e^{-2\alpha\beta} . \]
We finally arrive at the desired identity
\[ \int_0^\infty \frac{1}{\sqrt{t^3}}\, e^{-\alpha^2 t - \frac{\beta^2}{t}}\, dt = \frac{\sqrt{\pi}}{\beta}\, e^{-2\alpha\beta} . \]
(ii) In the next example we consider the case
\[ dV_t = I\, dt + \sigma\, dW_t, \qquad \sigma \ne 0,\ I > 0 \text{ constants.} \]

Proposition 4.8. Let $\lambda > 0$. Then
\[ E\bigl( e^{-\lambda T} \bigr) = e^{\frac{(V_{th} - V_r)\, I}{\sigma^2} \bigl[ 1 - \sqrt{1 + 2\lambda \frac{\sigma^2}{I^2}} \bigr]} = e^{\frac{V_{th} - V_r}{\sigma} \bigl[ \frac{I}{\sigma} - \sqrt{\frac{I^2}{\sigma^2} + 2\lambda} \bigr]} . \]
Proof. Similarly to the previous example, consider the process
\[ M_t = \exp\Bigl( -\frac{\alpha^2 t}{2} + \frac{\alpha}{\sigma}\, V_t \Bigr) . \]
Then Ito's formula implies that
\[ dM_t = \Bigl( -\frac{\alpha^2}{2} M_t + \frac12 \Bigl( \frac{\alpha}{\sigma} \Bigr)^2 \sigma^2 M_t \Bigr) dt + \frac{\alpha}{\sigma}\, M_t\, dV_t = \frac{\alpha}{\sigma}\, I\, M_t\, dt + \alpha\, M_t\, dW_t . \]
It follows that
\[ e^{-\frac{\alpha}{\sigma} I t}\, M_t, \qquad t \ge 0, \]
is a (local) martingale, and thus by the optional sampling theorem
\[ E\Bigl( e^{-\bigl( \frac{\alpha^2}{2} + \frac{\alpha}{\sigma} I \bigr)(t \wedge T) + \frac{\alpha}{\sigma} V_{t \wedge T}} \Bigr) = e^{\frac{\alpha}{\sigma} V_r} . \]
Taking the limit $t \to \infty$ and using $P(T < \infty) = 1$ (since $I > 0$) implies
\[ E\Bigl( e^{-\bigl( \frac{\alpha^2}{2} + \frac{\alpha}{\sigma} I \bigr) T} \Bigr) = e^{\frac{\alpha}{\sigma}(V_r - V_{th})} . \]
If we now choose $\alpha$ such that $\frac{\alpha^2}{2} + \frac{\alpha}{\sigma} I = \lambda$, i.e.,
\[ \alpha_{1/2} = -\frac{I}{\sigma} \pm \sqrt{ \Bigl( \frac{I}{\sigma} \Bigr)^2 + 2\lambda } , \]
and observe that we have to take the positive root, this implies
\[ E\bigl( e^{-\lambda T} \bigr) = e^{\frac{V_{th} - V_r}{\sigma} \bigl[ \frac{I}{\sigma} - \sqrt{ \bigl( \frac{I}{\sigma} \bigr)^2 + 2\lambda } \bigr]} . \qquad\square \]
The distribution of T is an inverse Gaussian distribution with parameters $\bigl( \frac{\Delta V}{I}, \bigl( \frac{\Delta V}{\sigma} \bigr)^2 \bigr)$, where $\Delta V = V_{th} - V_r$, i.e., a probability distribution with density
\[ f(t) = \frac{\Delta V}{\sigma}\, \frac{1}{\sqrt{2\pi t^3}}\, \exp\Bigl( -\frac{I^2}{\sigma^2}\, \frac{\bigl( t - \frac{\Delta V}{I} \bigr)^2}{2t} \Bigr) . \]
Indeed, note that
\[ \int_0^\infty e^{-\lambda t} f(t)\, dt = \frac{\Delta V}{\sigma}\, \frac{1}{\sqrt{2\pi}}\, e^{\frac{I \Delta V}{\sigma^2}} \int_0^\infty \frac{1}{\sqrt{t^3}}\, e^{-\alpha^2 t - \frac{\beta^2}{t}}\, dt = \frac{\Delta V}{\sigma}\, \frac{1}{\sqrt{2\pi}}\, \frac{\sqrt{2}\,\sigma}{\Delta V}\, \sqrt{\pi}\, e^{-2\sqrt{\lambda + \frac{I^2}{2\sigma^2}}\, \frac{\Delta V}{\sqrt{2}\,\sigma} + \frac{I \Delta V}{\sigma^2}} = e^{\frac{\Delta V}{\sigma} \bigl[ \frac{I}{\sigma} - \sqrt{\frac{I^2}{\sigma^2} + 2\lambda} \bigr]} . \]
To obtain the second equality we used (4.11) with $\alpha^2 = \lambda + \frac{I^2}{2\sigma^2}$ and $\beta = \frac{\Delta V}{\sqrt{2}\,\sigma}$.
Remark 4.9. Note that
\[ \lim_{I \to 0} e^{\frac{\Delta V}{\sigma} \bigl[ \frac{I}{\sigma} - \sqrt{\frac{I^2}{\sigma^2} + 2\lambda} \bigr]} = e^{-\frac{\sqrt{2\lambda}}{\sigma} \Delta V} \]
coincides with the Laplace transform of T in the previous example.
(iii) The leaky integrate-and-fire model. The Laplace transform of the first passage time of the leaky integrate-and-fire model
\[ dV_t = (I - \theta V_t)\, dt + \sigma\, dW_t \]
is more involved and no longer has a closed form solution in terms of elementary functions. It can be represented in terms of certain series expansions that we are not going to state here; instead we refer to the excellent survey paper [APP05].

A rather useful alternative representation of the first passage time distribution can be obtained in the particular case $I = 0$: let $X_t = V_t^2$; then Ito's formula implies that
\[ dX_t = 2 V_t\, dV_t + \sigma^2\, dt = \bigl( \sigma^2 - 2\theta V_t^2 \bigr) dt + 2\sigma V_t\, dW_t = \bigl( \sigma^2 - 2\theta X_t \bigr) dt + 2\sigma \sqrt{X_t}\, \underbrace{\operatorname{sgn}(V_t)\, dW_t}_{= d\widetilde{W}_t} . \]
Now observe that
\[ \widetilde{W}_t = \int_0^t \operatorname{sgn}(V_s)\, dW_s, \qquad t \ge 0, \]
is a continuous martingale with quadratic variation $\langle \widetilde{W} \rangle_t = t$. By Levy's characterisation of Brownian motion (see [Kle06], Theorem 25.28) it follows that $(\widetilde{W}_t)$ is a Brownian motion, hence $(X_t)$ is a weak solution of the SDE
\[ dX_t = \bigl( \sigma^2 - 2\theta X_t \bigr) dt + 2\sigma \sqrt{X_t}\, d\widetilde{W}_t . \]
Using this change of variables, the following representation of the first passage time distribution can be obtained:
Proposition 4.10. The density $f_T$ of T for the leaky integrate-and-fire model
\[ dV_t = -\theta V_t\, dt + dW_t, \qquad V_0 = V_r , \]
has the representation
\[ f_T(t) = e^{-\theta (V_{th}^2 - V_r^2 - t)/2} \cdot f^0_T(t) \cdot E\Bigl( \exp\Bigl( -\frac{\theta^2}{2} \int_0^t (r_s - V_r)^2\, ds \Bigr) \Bigr) . \]
Here,
- $f^0_T$ denotes the density of the firing time in the case $\theta = 0$, i.e.,
\[ f^0_T(t) = \frac{V_{th} - V_r}{\sqrt{2\pi t^3}} \exp\Bigl( -\frac{(V_{th} - V_r)^2}{2t} \Bigr) \]
(see Proposition 4.7),
- $(r_s)_{s \ge 0}$ is the 3-dimensional Bessel bridge from $V_r$ to $V_{th}$ in time t, i.e., the solution of the SDE
\[ dr_s = \Bigl( \frac{V_{th} - r_s}{t - s} + \frac{1}{r_s} \Bigr) ds + dW_s . \]
4.1.5. Numerical approximation of T. Using numerical approximation of the driving stochastic differential equation (see Section C.3 in Appendix C) we can approximate the distribution of T as follows.

Choose N (= number of runs, resp. samples). For $i = 1, \ldots, N$ construct the Euler approximation
\[ V^i_{t_{k+1}} = V^i_{t_k} + f\bigl( V^i_{t_k} \bigr)\, h + \sigma\bigl( V^i_{t_k} \bigr)\bigl( W^i_{t_{k+1}} - W^i_{t_k} \bigr), \qquad k = 0, 1, 2, \ldots, \]
until $V^i_{t_{k+1}} > V_{th}$, and set $T^i := t_{k+1}$.

Then $T^i$, $i = 1, \ldots, N$, can be considered independent approximate samples of T, hence the empirical distribution
\[ \mu^{(N)} := \frac{1}{N} \sum_{i=1}^{N} \delta_{T^i} \]
should converge weakly to the distribution of T as $N \uparrow \infty$. In fact, for any $f \in B_b(\mathbb{R}_+)$ we have
\[ f(T^1), f(T^2), \ldots \ \text{iid} \sim \mu, \qquad \mu = \text{distribution of } T , \]
hence by the strong law of large numbers (SLLN)
\[ \theta_N(f) := \int f\, d\mu^{(N)} = \frac{1}{N} \sum_{i=1}^{N} f(T^i) \xrightarrow{P\text{-a.s.}} E\bigl( f(T) \bigr) = \int f\, d\mu =: \theta(f) , \]
with standard deviation
\[ \theta_N(f) - \theta(f) \sim \sqrt{ \frac{\operatorname{Var}(f(T))}{N} } =: \frac{\sigma}{\sqrt{N}} , \]
since by the central limit theorem (CLT)
\[ \lim_{N \to \infty} P\Bigl( |\theta_N(f) - \theta(f)| \le \frac{c\,\sigma}{\sqrt{N}} \Bigr) = \frac{1}{\sqrt{2\pi}} \int_{-c}^{+c} e^{-\frac{x^2}{2}}\, dx . \]

Numerical illustrations. For comparison with the closed form representations of $f_T$ consider first the classical examples:

(i) $dV_t = \sigma\, dW_t$; in this case
\[ f_T(t)\, dt = P(T \in dt) = \frac{\Delta V}{\sigma}\, \frac{1}{\sqrt{2\pi t^3}} \exp\Bigl( -\frac{\Delta V^2}{2\sigma^2 t} \Bigr) dt, \qquad \Delta V = V_{th} - V_r \]
(Levy distribution).
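The sampling scheme described above is straightforward to implement. The following sketch is our own illustration of it (function names and parameters are ours; since the Levy case has infinite mean, runs are capped at a finite `t_max` and censored runs are recorded as infinite):

```python
import math
import random

def sample_passage_times(n_runs, f, sigma, v_r, v_th, dt=1e-3, t_max=20.0, seed=0):
    """Euler scheme of Section 4.1.5: simulate dV = f(V) dt + sigma(V) dW from v_r
    until the first passage V > v_th; runs not fired by t_max are recorded as inf."""
    rng = random.Random(seed)
    sq = math.sqrt(dt)
    samples = []
    for _ in range(n_runs):
        v, t = v_r, 0.0
        while v <= v_th and t < t_max:
            v += f(v) * dt + sigma(v) * sq * rng.gauss(0.0, 1.0)
            t += dt
        samples.append(t if v > v_th else math.inf)
    return samples

def empirical_cdf(samples, t):
    """mu^(N)((0, t]) -- the empirical distribution function at t."""
    return sum(1 for s in samples if s <= t) / len(samples)

if __name__ == "__main__":
    # case (i): dV = sigma dW, sigma = 1, Delta V = 1;
    # closed form: P(T <= t) = erfc(Delta V / (sigma * sqrt(2 t)))
    ts = sample_passage_times(800, lambda v: 0.0, lambda v: 1.0, 0.0, 1.0,
                              dt=2e-3, t_max=5.0)
    print(empirical_cdf(ts, 1.0), math.erfc(1.0 / math.sqrt(2.0)))
```

Discrete-time monitoring systematically misses crossings inside a time step, so the empirical CDF sits slightly below the closed form for finite dt.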
(ii) $dV_t = I\, dt + \sigma\, dW_t$; in this case
\[ f_T(t)\, dt = P(T \in dt) = \frac{\Delta V}{\sigma}\, \frac{1}{\sqrt{2\pi t^3}} \exp\Bigl( -\frac{(It - \Delta V)^2}{2\sigma^2 t} \Bigr) dt . \]
(iii) $dV_t = (I - \theta V_t)\, dt + \sigma\, dW_t$; in this case only a characterisation of $f_T$ as the solution of
\[ f_T(t) = -2\Phi(t) + 2 \int_0^t f_T(s)\, \Psi(t, s)\, ds \]
is known, where $\Phi$ and $\Psi$ are as in Theorem (??).
APPENDIX A

Martingales

The theory of martingales has been extremely useful and successful in the analysis of stochastic processes; it can be seen as a generalization of the theory of sums of independent random variables. We will summarize the parts of the theory that are used in the main text. For a more systematic treatment one can consult any textbook on stochastic analysis; above all we recommend the monograph [SV79].

Throughout the whole appendix, let $(\Omega, \mathcal{F}, P)$ be a fixed probability space, $I \subseteq \mathbb{R}_+$ any index set, e.g. $\mathbb{N}_0$, $[0,T]$ or $\mathbb{R}_+$ itself, and $(\mathcal{F}_t)_{t \in I}$ a filtration on $(\Omega, \mathcal{F})$, i.e., a family of sub-$\sigma$-algebras of $\mathcal{F}$ satisfying $\mathcal{F}_s \subseteq \mathcal{F}_t$ for $s \le t$, $s, t \in I$. In the context of stochastic processes, $\mathcal{F}_t$ is interpreted as the information that is available at time t.
Definition A.1. A family of random variables $(X_t)_{t \in I}$ that is P-integrable and $(\mathcal{F}_t)$-adapted, i.e. $X_t$ is $\mathcal{F}_t$-measurable for all t, is called
(i) a martingale, if $X_s = E(X_t | \mathcal{F}_s)$ for all $s \le t$;
(ii) a submartingale, if $X_s \le E(X_t | \mathcal{F}_s)$ for all $s \le t$ ("on average increasing");
(iii) a supermartingale, if $X_s \ge E(X_t | \mathcal{F}_s)$ for all $s \le t$ ("on average decreasing").
Example A.2. The most important examples of martingales are given as follows:

(i) Successive predictions of integrable random variables. Let $X \in L^1(P)$; then $X_t = E(X | \mathcal{F}_t)$, $t \in I$, is an $(\mathcal{F}_t)$-martingale, because for $s \le t$ the tower property of conditional expectations implies that
\[ E(X_t | \mathcal{F}_s) = E\bigl( E(X | \mathcal{F}_t) \,\big|\, \mathcal{F}_s \bigr) = E(X | \mathcal{F}_s) = X_s . \]

(ii) Centered sums of independent random variables. Let $Y_n \in L^1(P)$, $n \ge 1$, be independent random variables and let $\mathcal{F}_n = \sigma(Y_1, \ldots, Y_n)$ be the $\sigma$-algebra generated by the time-discrete process $Y_1, Y_2, Y_3, \ldots$. Then
\[ X_n := \sum_{k=1}^{n} \bigl( Y_k - E(Y_k) \bigr), \qquad n \ge 0, \]
is an $(\mathcal{F}_n)$-martingale, because
\[ E(X_{n+1} | \mathcal{F}_n) = \underbrace{E\bigl( Y_{n+1} - E(Y_{n+1}) \,\big|\, \mathcal{F}_n \bigr)}_{= E(Y_{n+1} - E(Y_{n+1})) = 0} + \underbrace{E(X_n | \mathcal{F}_n)}_{= X_n} = X_n . \]
Here we used that $E(Y_{n+1} | \mathcal{F}_n) = E(Y_{n+1})$, due to the independence of $Y_{n+1}$ from $\mathcal{F}_n$.

(iii) Martingale transform with previsible processes. Let $(\mathcal{F}_n)_{n \in \mathbb{N}_0}$ be a filtration, $(X_n)_{n \in \mathbb{N}_0}$ a martingale and $(V_n)_{n \in \mathbb{N}}$ previsible, i.e., $V_n$ is $\mathcal{F}_{n-1}$-measurable for all n. Then
\[ (V \cdot X)_n := X_0 + \sum_{k=1}^{n} V_k (X_k - X_{k-1}), \qquad n \in \mathbb{N}_0, \]
is again an $(\mathcal{F}_n)$-martingale, provided $V_k (X_k - X_{k-1})$ is P-integrable for all k, since
\[ E\bigl( (V \cdot X)_{n+1} \,\big|\, \mathcal{F}_n \bigr) = E\bigl( (V \cdot X)_n \,\big|\, \mathcal{F}_n \bigr) + E\bigl( V_{n+1} (X_{n+1} - X_n) \,\big|\, \mathcal{F}_n \bigr) = (V \cdot X)_n + V_{n+1}\, E(X_{n+1} - X_n | \mathcal{F}_n) = (V \cdot X)_n . \]
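The content of (iii) is the classical "you cannot beat a fair game with a previsible betting strategy": whatever $V$ we choose based on the past, $E((V \cdot X)_n) = E(X_0)$. A small simulation sketch of ours (the particular betting rule is an arbitrary previsible example):

```python
import random

def transform_endpoint(n_steps, rng):
    """One path of (V.X)_n for X a simple symmetric random walk (X_0 = 0) and the
    previsible bet V_k = 1 if X_{k-1} >= 0 else 2 (a function of the past only)."""
    x, vx = 0, 0.0
    for _ in range(n_steps):
        v = 1.0 if x >= 0 else 2.0            # previsible: depends on X_{k-1}
        step = 1 if rng.random() < 0.5 else -1
        vx += v * step                        # V_k * (X_k - X_{k-1})
        x += step
    return vx

def mean_endpoint(n_runs=10000, n_steps=50, seed=0):
    rng = random.Random(seed)
    return sum(transform_endpoint(n_steps, rng) for _ in range(n_runs)) / n_runs

if __name__ == "__main__":
    print(mean_endpoint())   # stays near E[(V.X)_0] = 0
```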
(iv) Martingales of Markov chains: see Chapter 2, Section 2.2.

(v) Martingales of Brownian motion. Let $(X_t)_{t \ge 0}$ be a Brownian motion defined on $(\Omega, \mathcal{F}, P)$. Let $\mathcal{F}^0_t := \sigma(X_s : s \le t)$ be the filtration generated by $(X_t)$ and
\[ \mathcal{F}_t := \bigcap_{s > t} \mathcal{F}^0_s \]
be the slightly larger right continuous filtration generated by $(\mathcal{F}^0_t)_{t \ge 0}$. Then we have the following proposition:

Proposition A.3. The following processes are martingales w.r.t. $(\mathcal{F}_t)_{t \ge 0}$:
(i) $(X_t)_{t \ge 0}$;
(ii) $(X_t^2 - t)_{t \ge 0}$;
(iii) $\bigl( \exp\bigl( \alpha X_t - \tfrac12 \alpha^2 t \bigr) \bigr)_{t \ge 0}$ for all $\alpha \in \mathbb{R}$.
The proof requires the following lemma.

Lemma A.4. Let $t \ge 0$, $h > 0$. Then $X_{t+h} - X_t$ is independent of $\mathcal{F}_t$.

Proof. By definition of Brownian motion, $X_{t+h} - X_t$ is independent of $\sigma(X_{t_1}, X_{t_2} - X_{t_1}, \ldots, X_{t_n} - X_{t_{n-1}}) = \sigma(X_{t_1}, X_{t_2}, \ldots, X_{t_n})$ for all $0 \le t_1 < t_2 < \cdots < t_n \le t$, which implies that $X_{t+h} - X_t$ is independent of $\mathcal{F}^0_t$. It follows in particular that $X_{t+h} - X_{t + \frac{1}{n}}$ is independent of $\mathcal{F}^0_{t + \frac{1}{n}} \supseteq \mathcal{F}_t$ for all $n \ge 1$. If $f \in C_b(\mathbb{R})$ (= all continuous and bounded functions on $\mathbb{R}$) and $\varphi$ is $\mathcal{F}_t$-measurable and bounded, this implies that
\[ E\bigl( f(X_{t+h} - X_t)\, \varphi \bigr) = \lim_{n \to \infty} E\bigl( f(X_{t+h} - X_{t + \frac{1}{n}})\, \varphi \bigr) = \lim_{n \to \infty} E\bigl( f(X_{t+h} - X_{t + \frac{1}{n}}) \bigr)\, E(\varphi) = E\bigl( f(X_{t+h} - X_t) \bigr)\, E(\varphi) , \]
and hence the assertion. $\square$
Proof (of Proposition A.3). (i) Similarly to Example A.2 (ii), we have
\[ E(X_t | \mathcal{F}_s) = E\bigl( X_s + (X_t - X_s) \,\big|\, \mathcal{F}_s \bigr) = \underbrace{E(X_s | \mathcal{F}_s)}_{= X_s} + \underbrace{E(X_t - X_s | \mathcal{F}_s)}_{= E(X_t - X_s) = 0} = X_s . \]
(ii)
\[ E(X_t^2 - t \,|\, \mathcal{F}_s) = E(X_t^2 - X_s^2 \,|\, \mathcal{F}_s) + X_s^2 - t = E\bigl( (X_t - X_s)^2 + 2 (X_t - X_s) X_s \,\big|\, \mathcal{F}_s \bigr) + X_s^2 - t \]
\[ = \underbrace{E\bigl( (X_t - X_s)^2 \bigr)}_{= t - s} + 2 X_s \underbrace{E(X_t - X_s | \mathcal{F}_s)}_{= E(X_t - X_s) = 0} + X_s^2 - t = X_s^2 - s . \]
(iii) Let $G^\alpha_t := \exp\bigl( \alpha X_t - \tfrac12 \alpha^2 t \bigr)$. Then
\[ E(G^\alpha_t | \mathcal{F}_s) = E\Bigl( \exp\bigl( \alpha (X_t - X_s) - \tfrac12 \alpha^2 (t - s) \bigr) \,\Big|\, \mathcal{F}_s \Bigr) \cdot G^\alpha_s = \underbrace{E\Bigl( \exp\bigl( \alpha \underbrace{(X_t - X_s)}_{\sim \mathcal{N}(0,\, t-s)} \bigr) \Bigr)}_{= \exp( \frac12 \alpha^2 (t-s) )}\, \exp\bigl( -\tfrac12 \alpha^2 (t-s) \bigr)\, G^\alpha_s = G^\alpha_s . \qquad\square \]
A.1. Maximal inequality

The martingale property of a stochastic process $(X_t)_{t \ge 0}$ means that $X_s$ is the best prediction of the process at all later times, since
\[ X_s = E(X_t | \mathcal{F}_s), \qquad s \le t . \]
One important statement exploiting this fact is Doob's maximal inequality:

Theorem A.5. Let $(X_t)_{t \ge 0}$ be a (right-)continuous martingale and let
\[ X^*_t := \sup_{0 \le s \le t} |X_s|, \qquad t \ge 0 . \]
Then
(i)
\[ P(X^*_t \ge R) \le \frac{1}{R}\, E\bigl( |X_t| ;\ X^*_t \ge R \bigr) \le \frac{1}{R}\, E(|X_t|) \qquad \forall\, R > 0 . \]
In particular, $\{X_s : s \in [0,t]\}$ is uniformly integrable for all $t > 0$.
(ii) If $X_t \in L^p(P)$ for some $p > 1$, then
\[ E\bigl( (X^*_t)^p \bigr)^{\frac{1}{p}} \le \frac{p}{p-1}\, E\bigl( |X_t|^p \bigr)^{\frac{1}{p}} . \]
Proof. (i) Fix $n \ge 1$. Then
\[ P\Bigl( \max_{0 \le k \le n} |X_{\frac{kt}{n}}| \ge R \Bigr) = \sum_{l=0}^{n} P\Bigl( |X_{\frac{lt}{n}}| \ge R ;\ \max_{0 \le k < l} |X_{\frac{kt}{n}}| < R \Bigr) \]
\[ \le \frac{1}{R} \sum_{l=0}^{n} E\Bigl( |X_{\frac{lt}{n}}| ;\ |X_{\frac{lt}{n}}| \ge R,\ \max_{0 \le k < l} |X_{\frac{kt}{n}}| < R \Bigr) \qquad \text{(Markov inequality)} \]
\[ \le \frac{1}{R} \sum_{l=0}^{n} E\Bigl( E\bigl( |X_t| \,\big|\, \mathcal{F}_{\frac{lt}{n}} \bigr) ;\ \underbrace{|X_{\frac{lt}{n}}| \ge R,\ \max_{0 \le k < l} |X_{\frac{kt}{n}}| < R}_{\in \mathcal{F}_{\frac{lt}{n}}} \Bigr) \qquad (|X_t| \text{ is a submartingale}) \]
\[ = \frac{1}{R} \sum_{l=0}^{n} E\Bigl( |X_t| ;\ |X_{\frac{lt}{n}}| \ge R,\ \max_{0 \le k < l} |X_{\frac{kt}{n}}| < R \Bigr) \qquad \text{(tower property)} \]
\[ = \frac{1}{R}\, E\Bigl( |X_t| ;\ \max_{0 \le k \le n} |X_{\frac{kt}{n}}| \ge R \Bigr) \le \frac{1}{R}\, E\bigl( |X_t| ;\ X^*_t \ge R \bigr) . \]
The right-continuity of $(X_t)$ now implies that $\max_{0 \le k \le n} |X_{\frac{kt}{n}}| \uparrow X^*_t$ as $n \to \infty$, hence for all $\varepsilon > 0$:
\[ P(X^*_t \ge R) \le \lim_{n \to \infty} P\Bigl( \max_{0 \le k \le n} |X_{\frac{kt}{n}}| \ge R - \varepsilon \Bigr) \le \frac{1}{R - \varepsilon}\, E\bigl( |X_t| ;\ X^*_t \ge R - \varepsilon \bigr) , \]
which implies (i) by taking the limit $\varepsilon \downarrow 0$.
(ii)
\[ E\bigl( (X^*_t)^p \bigr) = E\Bigl( p \int_0^{X^*_t} u^{p-1}\, du \Bigr) = E\Bigl( p \int_0^\infty 1_{\{u \le X^*_t\}}\, u^{p-1}\, du \Bigr) \]
\[ \overset{\text{Fubini}}{=} p \int_0^\infty \underbrace{E\bigl( 1_{\{u \le X^*_t\}} \bigr)}_{\le \frac{1}{u} E(|X_t| ;\, X^*_t \ge u) \text{ by (i)}} u^{p-1}\, du \le p \int_0^\infty E\bigl( |X_t| ;\ X^*_t \ge u \bigr)\, u^{p-2}\, du \]
\[ \overset{\text{Fubini}}{=} p\, E\Bigl( \underbrace{\int_0^{X^*_t} u^{p-2}\, du}_{= \frac{1}{p-1} (X^*_t)^{p-1}} \cdot |X_t| \Bigr) \le \frac{p}{p-1}\, E\bigl( |X_t|^p \bigr)^{\frac{1}{p}}\, E\bigl( (X^*_t)^p \bigr)^{\frac{p-1}{p}} . \]
In the last step we applied Holder's inequality with $q = \frac{p}{p-1}$. $\square$
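For $p = 2$ the inequality reads $E[(X^*_t)^2] \le 4\, E[X_t^2]$, which is easy to observe numerically. The sketch below is our own illustration, using a discretized Brownian path as the martingale:

```python
import math
import random

def doob_l2_check(n_paths=3000, n_steps=150, seed=0):
    """Estimate E[(X*_1)^2] and the Doob bound (p/(p-1))^2 E[X_1^2] = 4 E[X_1^2]
    for X a Brownian motion sampled on a grid with n_steps points."""
    rng = random.Random(seed)
    sq = math.sqrt(1.0 / n_steps)
    lhs = rhs = 0.0
    for _ in range(n_paths):
        x, running_max = 0.0, 0.0
        for _ in range(n_steps):
            x += sq * rng.gauss(0.0, 1.0)
            running_max = max(running_max, abs(x))   # X*_t = sup_{s<=t} |X_s|
        lhs += running_max ** 2
        rhs += x ** 2
    return lhs / n_paths, 4.0 * rhs / n_paths

if __name__ == "__main__":
    print(doob_l2_check())   # the first value lies below the second (Doob bound)
```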
A.2. Stopping times and optional sampling

Definition A.6. A mapping $T : \Omega \to I$ with
\[ \{ T \le t \} \in \mathcal{F}_t \qquad \forall\, t \in I \]
is called an $(\mathcal{F}_t)$-stopping time.

Example A.7. (i) For discrete time: first hitting times. Let $(X_n)_{n \in \mathbb{N}_0}$ be $\mathbb{R}^d$-valued and $(\mathcal{F}_n)$-adapted, $A \in \mathcal{B}(\mathbb{R}^d)$, and
\[ T_A(\omega) := \inf\{ n \ge 0 : X_n(\omega) \in A \}, \qquad \omega \in \Omega, \]
the first hitting time of A (with $\inf \emptyset = +\infty$). Then $T_A$ is an $(\mathcal{F}_n)$-stopping time, since
\[ \{ T_A \le m \} = \bigcup_{n=0}^{m} \{ X_n \in A \} . \]
(ii) In continuous time one has to assume further regularity on $(X_t)$ resp. A: let $(X_t)$ be a continuous $(\mathcal{F}_t)$-adapted process and $A \subseteq \mathbb{R}^d$ open; then
\[ \{ T_A \le t \} = \bigcup_{s \in [0,t] \cap \mathbb{Q}} \{ X_s \in A \} . \]
A particular case is the first passage time
\[ T_a := \inf\{ t \ge 0 : X_t > a \}, \]
i.e. $A = (a, \infty)$ in the previous example.

The most important statement in connection with stopping times is the following:

Theorem A.8 (Optional Sampling Theorem). Let $(X_t)_{t \ge 0}$ be a right-continuous $(\mathcal{F}_t)$-martingale, and S, T bounded $(\mathcal{F}_t)$-stopping times with $S \le T$. Then
\[ E(X_T | \mathcal{F}_S) = X_S . \]
In particular, for any $(\mathcal{F}_t)$-stopping time T:
• $(X_{T \wedge t})$ is an $(\mathcal{F}_{T \wedge t})$-martingale;
• $E(X_{T \wedge t}) = E(X_0)$ is constant w.r.t. time t.

Here we used the notation
\[ \mathcal{F}_T = \{ A \subseteq \Omega : A \cap \{ T \le t \} \in \mathcal{F}_t \ \forall\, t \in I \} , \]
denoting the $\sigma$-algebra at the stopping time T.
Exercise: Show that FT is indeed a σ-algebra.
Remark A.9. (i) If S, T are stopping times with $S \le T$, then $\mathcal{F}_S \subseteq \mathcal{F}_T$, because $A \in \mathcal{F}_S$ implies
\[ A \cap \{ T \le t \} = \bigl( \underbrace{A \cap \{ S \le t \}}_{\in \mathcal{F}_t} \bigr) \cap \{ T \le t \} \in \mathcal{F}_t , \]
using $\{ T \le t \} = \{ S \le t \} \cap \{ T \le t \}$. In particular, $(\mathcal{F}_{T \wedge t})_{t \in I}$ is again a filtration.
(ii) Let $I = \mathbb{N}_0$, $(X_n)$ be $(\mathcal{F}_n)$-adapted and T a stopping time. Then
\[ X_T(\omega) := X_{T(\omega)}(\omega) \quad \text{is } \mathcal{F}_T\text{-measurable,} \]
because
\[ \{ X_T \in A \} \cap \{ T \le m \} = \bigcup_{n=0}^{m} \underbrace{\{ X_n \in A,\ T = n \}}_{\in \mathcal{F}_n \subseteq \mathcal{F}_m} \in \mathcal{F}_m . \]
Theorem A.10 (Optional sampling theorem in discrete time). Let $I = \mathbb{N}_0$, $(X_n)$ be an $(\mathcal{F}_n)$-martingale, and T, S bounded $(\mathcal{F}_n)$-stopping times with $S \le T$. Then
\[ E(X_T | \mathcal{F}_S) = X_S . \]
In particular, for any $(\mathcal{F}_n)$-stopping time T:
• $(X_{T \wedge n})$ is an $(\mathcal{F}_{T \wedge n})$-martingale;
• $E(X_{T \wedge n}) = E(X_0)$ is constant w.r.t. (discrete) time n.
Lemma A.11. Let $(X_n)$, T, S be as in Theorem A.10. Then
\[ E(X_T) = E(X_S) . \]

Proof. Let $S \le T \le K$. Then
\[ X_T = X_S + \sum_{k=S+1}^{T} (X_k - X_{k-1}) = X_S + \sum_{k=1}^{K} \underbrace{1_{\{S < k \le T\}}}_{\in \mathcal{F}_{k-1}} (X_k - X_{k-1}) , \]
since $\{ S < k \le T \} = \{ S \le k-1 \} \cap \{ T > k-1 \}$. Thus
\[ E(X_T) = E(X_S) + \sum_{k=1}^{K} \underbrace{E\bigl( 1_{\{S < k \le T\}} (X_k - X_{k-1}) \bigr)}_{= E\bigl( 1_{\{S < k \le T\}}\, \underbrace{E(X_k - X_{k-1} \,|\, \mathcal{F}_{k-1})}_{=0} \bigr)} = E(X_S) . \qquad\square \]
Proof of Theorem A.10. Again, let $T \le K$. For $B \in \mathcal{F}_S$ let
\[ S_B := S\, 1_B + K\, 1_{B^c}, \qquad T_B := T\, 1_B + K\, 1_{B^c} . \]
Then $S_B, T_B$ are $(\mathcal{F}_n)$-stopping times, because
\[ \{ S_B \le n \} = \bigl( \underbrace{\{ S \le n \} \cap B}_{\in \mathcal{F}_n} \bigr) \cup \bigl( \underbrace{\{ K \le n \} \cap B^c}_{= \emptyset \text{ if } K > n, \ = B^c \in \mathcal{F}_n \text{ if } K \le n} \bigr) \]
and
\[ \{ T_B \le n \} = \bigl( \underbrace{\{ T \le n \} \cap B}_{\in \mathcal{F}_n} \bigr) \cup \bigl( \{ K \le n \} \cap B^c \bigr) . \]
The previous lemma implies
\[ E(X_S 1_B) + E(X_K 1_{B^c}) = E(X_{S_B}) = E(X_{T_B}) = E(X_T 1_B) + E(X_K 1_{B^c}) . \]
Therefore $E(X_S 1_B) = E(X_T 1_B)$. Since $B \in \mathcal{F}_S$ was arbitrary and $X_S$ is $\mathcal{F}_S$-measurable, we obtain $X_S = E(X_T | \mathcal{F}_S)$. $\square$
The proof of the optional sampling theorem in continuous time requires a suitable approximation of a bounded stopping time $T : \Omega \to [0, K]$ by finitely valued stopping times
\[ T_n(\omega) = \sum_{k=1}^{K \cdot 2^n} \frac{k}{2^n}\, 1_{[\frac{k-1}{2^n}, \frac{k}{2^n})}\bigl( T(\omega) \bigr) . \]
Clearly, $T_n(\omega) \downarrow T(\omega)$ for all $\omega \in \Omega$. For any (right-)continuous $(\mathcal{F}_t)$-adapted process $(X_t)$,
\[ X_T(\omega) := X_{T(\omega)}(\omega) \quad \text{is } \mathcal{F}_T\text{-measurable.} \]
Indeed,
\[ X(t, \omega) = \lim_{n \to \infty} \underbrace{\sum_{k=1}^{\infty} X_{\frac{k-1}{2^n}}(\omega)\, 1_{[\frac{k-1}{2^n}, \frac{k}{2^n})}(t)}_{=: X^{(n)}(t, \omega), \ (\mathcal{F}_t)\text{-adapted}} . \]
Clearly,
\[ X^{(n)}_T\, 1_{\{T \le t\}}(\omega) = \sum_{k=1}^{\infty} X_{\frac{k-1}{2^n} \wedge t}(\omega)\, \underbrace{1_{[\frac{k-1}{2^n} \wedge t,\, \frac{k}{2^n} \wedge t)}\bigl( T(\omega) \bigr)}_{\in \mathcal{F}_{\frac{k}{2^n} \wedge t} \subseteq \mathcal{F}_t} \quad \forall\, t , \]
so that $X^{(n)}_T$ is $\mathcal{F}_T$-measurable, and thus $X_T = \lim_{n \to \infty} X^{(n)}_T$ is $\mathcal{F}_T$-measurable too.
Proof of Theorem A.8. Let $T \le K \in \mathbb{N}$. Let $\mathcal{G}$ be the set of all $(\mathcal{F}_t)$-stopping times S with $S \le K$ and
\[ X_S = E(X_K | \mathcal{F}_S) . \]
By Theorem A.10, $S \in \mathcal{G}$ for all finitely valued S. In addition,
\[ \{ X_S : S \in \mathcal{G} \} \quad\text{is uniformly integrable,} \]
since
\[ E\bigl( |X_S| ;\ \underbrace{|X_S| \ge R}_{\in \mathcal{F}_S} \bigr) = E\bigl( |E(X_K | \mathcal{F}_S)| ;\ |X_S| \ge R \bigr) \le E\bigl( |X_K| ;\ |X_S| \ge R \bigr) \le E\bigl( |X_K| ;\ X^*_K \ge R \bigr) \xrightarrow[R \uparrow \infty]{} 0 \]
uniformly in $S \in \mathcal{G}$, using Doob's maximal inequality. For general S, T let
\[ S_n(\omega) = \sum_{k=1}^{K \cdot 2^n} \frac{k}{2^n}\, 1_{(\frac{k-1}{2^n}, \frac{k}{2^n}]}\bigl( S(\omega) \bigr) \downarrow S(\omega), \qquad T_n(\omega) = \sum_{k=1}^{K \cdot 2^n} \frac{k}{2^n}\, 1_{(\frac{k-1}{2^n}, \frac{k}{2^n}]}\bigl( T(\omega) \bigr) \downarrow T(\omega) . \]
Then $\lim_{n \to \infty} X_{S_n} = X_S$, $\lim_{n \to \infty} X_{T_n} = X_T$, $S_n \le T_n \le K$, and both limits also hold in $L^1(P)$ because of uniform integrability; thus
\[ X_S = E(X_S | \mathcal{F}_S) = \lim_{n \to \infty} E(X_{S_n} | \mathcal{F}_S) = \lim_{n \to \infty} E\bigl( E(X_{T_n} | \mathcal{F}_{S_n}) \,\big|\, \mathcal{F}_S \bigr) = \lim_{n \to \infty} E(X_{T_n} | \mathcal{F}_S) = E(X_T | \mathcal{F}_S) , \]
using $\mathcal{F}_S \subseteq \mathcal{F}_{S_n}$. $\square$
Remark A.12. The conclusion of Theorem A.10 (resp. Theorem A.8) does not hold in general for unbounded T. Example: let $S_n = X_1 + \cdots + X_n$ be the symmetric random walk, $P(X_k = \pm 1) = \frac12$, $(X_k)$ iid, and
\[ T := \min\{ n \ge 1 : S_n = +1 \} < \infty \quad P\text{-a.s.} \]
Then $E(S_T) = 1 \ne E(S_0) = 0$.
Corollary A.13. Let $(X_t)_{t \ge 0}$ be a continuous $(\mathcal{F}_t)$-martingale and T an $(\mathcal{F}_t)$-stopping time such that $(X_{T \wedge k})_{k \ge 1}$ is uniformly integrable (e.g. bounded in k). Then
\[ E(X_T) = E(X_0) . \]

Proof. $T \wedge k \uparrow T$, hence $X_{T \wedge k} \to X_T$ (P-a.s.) and thus
\[ E(X_T) = \lim_{k \to \infty} E(X_{T \wedge k}) = E(X_0) . \qquad\square \]
APPENDIX B

Brownian motion and stochastic integration

Brownian motion is certainly the most important stochastic process in continuous time, used to describe diffusion processes. It is named after the Scottish botanist Robert Brown, who first described the irregular motion of pollen grains suspended in liquid, which was later explained by Albert Einstein in his paper "Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen" (1905) by random collisions with molecules of the liquid. This process also appeared a little earlier in the thesis entitled "Théorie de la spéculation" by Louis Bachelier in the year 1900, in a mathematical finance context. Norbert Wiener then provided the first rigorous mathematical construction of the process in the year 1923.

Definition B.1. A Brownian motion (BM) (with starting point 0) is an $\mathbb{R}$-valued stochastic process $(W_t)_{t \ge 0}$ on an underlying probability space $(\Omega, \mathcal{F}, P)$ having the following properties:
(a) $W_0 = 0$ P-a.s.
(b) For $0 \le t_0 < \cdots < t_{n+1}$ the increments
\[ W_{t_{i+1}} - W_{t_i} \qquad (i = 0, 1, \ldots, n) \]
are independent and $\mathcal{N}(0, t_{i+1} - t_i)$-distributed.

The BM $(W_t)_{t \ge 0}$ is called continuous if the trajectory (or path) $t \mapsto W_t(\omega)$ is continuous for all $\omega \in \Omega$.
Brownian motion can be seen as the continuum limit description of fluctuations in sums of independent identically distributed (iid) random variables. Indeed, suppose that $X_1, X_2, \ldots$ are iid with mean zero and finite variance $\sigma^2 > 0$. Then
\[ M_n(t) := \frac{1}{\sqrt{n}} \sum_{i=1}^{\lfloor nt \rfloor} X_i, \qquad t \ge 0, \]
defines a right continuous stochastic process. The classical central limit theorem now implies that $M_n(t)$ converges in distribution to the normal distribution $\mathcal{N}(0, t\sigma^2)$ for all $t \ge 0$. But even more: for any finite $0 \le t_0 < t_1 < \ldots < t_n$ the increments
\[ M_n(t_{k+1}) - M_n(t_k) = \frac{1}{\sqrt{n}} \sum_{i=\lfloor n t_k \rfloor + 1}^{\lfloor n t_{k+1} \rfloor} X_i, \qquad k = 0, \ldots, n-1, \]
are independent partial sums, and each of these partial sums converges in distribution towards the normal distribution $\mathcal{N}(0, (t_{k+1} - t_k)\sigma^2)$.

So quite naturally, the finite dimensional distributions of Brownian motion arise as the finite dimensional distributions of a rescaled limit of the sums $M_n$ of iid random variables. To further obtain the convergence of $M_n$ as processes towards Brownian motion, observe that $M_n(t)$ is a martingale w.r.t. the filtration $\mathcal{F}^n_t := \sigma(X_k : k \le \lfloor nt \rfloor)$ generated by $X_1, X_2, \ldots$. It therefore suffices to verify the conditions of the martingale central limit theorem. To this end we simplify our assumptions and assume that the random variables $X_k$ are bounded by K. Then
\[ \sup_{0 \le s \le t} |M_n(s) - M_n(s-)| \le \frac{K}{\sqrt{n}} \to 0 \qquad (n \to \infty) . \]
To identify the limiting behavior of the variance process note that $M_n^2(t)$ is a submartingale with variance process $\frac{1}{n} \sum_{k=1}^{\lfloor nt \rfloor} E(X_k^2)$, i.e.
\[ M_n^2(t) - \frac{1}{n} \sum_{k=1}^{\lfloor nt \rfloor} E(X_k^2) = M_n^2(t) - \frac{\sigma^2}{n} \lfloor tn \rfloor, \qquad t \ge 0, \]
is again a martingale, since
\[ E\bigl( M_n^2(t+s) \,\big|\, \mathcal{F}^n_s \bigr) = E\bigl( M_n^2(t+s) - M_n^2(s) \,\big|\, \mathcal{F}^n_s \bigr) + M_n^2(s) = E\bigl( (M_n(t+s) - M_n(s))^2 \,\big|\, \mathcal{F}^n_s \bigr) + M_n^2(s) \]
\[ = \frac{1}{n}\, E\Bigl( \sum_{k,l=\lfloor sn \rfloor + 1}^{\lfloor (t+s)n \rfloor} X_k X_l \,\Big|\, \mathcal{F}^n_s \Bigr) + M_n^2(s) = \frac{1}{n}\, E\Bigl( \sum_{k,l=\lfloor sn \rfloor + 1}^{\lfloor (t+s)n \rfloor} X_k X_l \Bigr) + M_n^2(s) \]
\[ = \frac{1}{n}\, E\Bigl( \sum_{k=\lfloor sn \rfloor + 1}^{\lfloor (t+s)n \rfloor} X_k^2 \Bigr) + M_n^2(s) = \frac{\sigma^2}{n} \bigl( \lfloor (t+s)n \rfloor - \lfloor sn \rfloor \bigr) + M_n^2(s) , \]
and therefore
\[ E\Bigl( M_n^2(t+s) - \frac{\sigma^2}{n} \lfloor (t+s)n \rfloor \,\Big|\, \mathcal{F}^n_s \Bigr) = M_n^2(s) - \frac{\sigma^2}{n} \lfloor sn \rfloor . \]
Since $\lim_{n \to \infty} \frac{\sigma^2}{n} \lfloor tn \rfloor = \sigma^2 t$, this assumption of the martingale central limit theorem is also satisfied, so that $M_n$ indeed converges weakly on the Skorohod space towards a Brownian motion W.

The partial sums $X_1 + \ldots + X_n$, $n = 1, 2, \ldots$, are also called a (discrete time) random walk, and the above construction implies that Brownian motion can be approximated with the help of suitably rescaled random walks. Since Brownian motion is the universal limit, independent of the particular distribution of the increments, the above convergence is also called the invariance principle of Brownian motion. The distribution $P \circ W^{-1}$ of a continuous Brownian motion W is called the Wiener measure.
B.1. Construction of BM

We have already seen the construction of Brownian motion as a limit of rescaled random walks. An alternative construction is provided by the so-called Wiener-Levy construction, which describes BM as a random superposition of deterministic paths as follows. Let
• $(Y_n)$ be independent, $\mathcal{N}(0,1)$-distributed,
• $(e_n)$ be an orthonormal basis of $L^2([0,T])$, e.g.,
\[ e_0(t) = \tfrac{1}{\sqrt{T}}, \qquad e_{2k-1}(t) = \sqrt{\tfrac{2}{T}}\, \sin\bigl( \tfrac{2\pi}{T}\, k t \bigr), \qquad e_{2k}(t) = \sqrt{\tfrac{2}{T}}\, \cos\bigl( \tfrac{2\pi}{T}\, k t \bigr), \qquad k \ge 1 . \]
Then
\[ W_t(\omega) := \sum_{n=0}^{\infty} Y_n(\omega) \int_0^t e_n(s)\, ds \]
is a (continuous) BM, i.e. in our particular example
\[ W_t(\omega) = \frac{t}{\sqrt{T}}\, Y_0(\omega) + \sum_{n=1}^{\infty} \Bigl( Y_{2n-1}(\omega)\, \sqrt{\tfrac{2}{T}} \int_0^t \sin\bigl( \tfrac{2\pi}{T}\, n s \bigr) ds + Y_{2n}(\omega)\, \sqrt{\tfrac{2}{T}} \int_0^t \cos\bigl( \tfrac{2\pi}{T}\, n s \bigr) ds \Bigr) . \]
B.2. Elementary properties of BM

Proposition B.2 (symmetries and scaling properties). Let $(W_t)_{t \ge 0}$ be a continuous BM. Then the following stochastic processes are continuous BM too:
(i) $\widetilde{W}_t := -W_t$, $t \ge 0$ (symmetry);
(ii) $\widetilde{W}_t := c\, W_{t/c^2}$, $t \ge 0$, for any $c \in \mathbb{R} \setminus \{0\}$ (scaling invariance).

Proof. (i) Obvious. (ii) Continuity is obvious. Next observe that for $0 = t_0 < t_1 < \cdots < t_{n+1}$ the increments
\[ c \bigl( W_{t_{i+1}/c^2} - W_{t_i/c^2} \bigr), \qquad i = 0, \ldots, n, \]
are clearly independent and $\mathcal{N}(0, t_{i+1} - t_i)$-distributed. $\square$

Proposition B.3 (mean, covariance). Let $(W_t)_{t \ge 0}$ be a BM. Then
(i) $m(t) := E(W_t) = 0$;
(ii) $C(s,t) := \operatorname{Cov}(W_s, W_t) = s \wedge t := \min\{s, t\}$.

Proof. (i) is obvious. For the proof of (ii) note that for $s \le t$, by independence of the increments,
\[ \operatorname{Cov}(W_s, W_t) = E(W_s W_t) = E\bigl( W_s (W_t - W_s) \bigr) + E\bigl( W_s^2 \bigr) = E(W_s)\, E(W_t - W_s) + s = s = s \wedge t . \qquad\square \]
B.3. Path properties of BM

Proposition B.4 (Strong Law of Large Numbers, SLLN). Let $(W_t)$ be a continuous BM. Then
\[ \lim_{t \to \infty} \frac{W_t}{t} = 0 \qquad P\text{-a.s.} \]
In particular, the growth of a typical Brownian path is sublinear.

Proof. First note that $(|W_t|)_{t \ge 0}$ is a submartingale. Fix $\varepsilon > 0$ arbitrary. Then
\[ P\Bigl( \sup_{t \in [2^n, 2^{n+1}]} \frac{|W_t|}{t} \ge \varepsilon \Bigr) \le P\Bigl( \sup_{t \in [2^n, 2^{n+1}]} |W_t| \ge \varepsilon\, 2^n \Bigr) \le \frac{1}{\varepsilon\, 2^n}\, E\bigl( |W_{2^{n+1}}| \bigr) \le \frac{\sqrt{2}}{\varepsilon}\, 2^{-\frac{n}{2}} \]
by the maximal inequality, using $E(|W_{2^{n+1}}|) \le E(W_{2^{n+1}}^2)^{\frac12} = 2^{\frac{n+1}{2}}$. Hence
\[ \sum_{n=0}^{\infty} P\Bigl( \sup_{t \in [2^n, 2^{n+1}]} \frac{|W_t|}{t} \ge \varepsilon \Bigr) \le \sum_{n=0}^{\infty} \frac{\sqrt{2}}{\varepsilon}\, 2^{-n/2} < \infty , \]
therefore $\limsup_{t \to \infty} \frac{|W_t|}{t} \le \varepsilon$ P-a.s. by the Borel-Cantelli lemma. Since $\varepsilon > 0$ was arbitrary, the assertion follows. $\square$
The quadratic variation of Brownian paths. Recall that for a function $f : [0,\infty) \to \mathbb{R}$ its total variation on $[0,t]$ (if it exists) is defined as
\[ \operatorname{Var}_{[0,t]} f := \lim_{n \to \infty} \sum_{t_i \in \tau_n,\, t_i \le t} |f_{t_{i+1}} - f_{t_i}| . \]
Here $\tau_n := \{t_0, t_1, \ldots\}$ denotes a partition of $[0,\infty)$, i.e. $0 = t_0 < t_1 < \ldots$, with $\tau_n \subseteq \tau_{n+1}$ for all n and mesh size
\[ |\tau_n| := \max_{t_i \in \tau_n} |t_{i+1} - t_i| \to 0 . \]
Instead of summing up the absolute values of the increments of f we can also sum up the squares of the increments:
\[ \langle f \rangle_t := \lim_{n \to \infty} \sum_{t_i \in \tau_n,\, t_i \le t} \bigl( f(t_{i+1}) - f(t_i) \bigr)^2 , \]
which is called the quadratic variation of f on $[0,t]$ along $(\tau_n)$. It turns out that the latter notion is the crucial one for Brownian motion, according to the following theorem:

Theorem B.5 (Levy). Let $(W_t)_{t \ge 0}$ be a continuous BM. Then
\[ S^n_t := \sum_{t_i \in \tau_n,\, t_i \le t} \bigl( W_{t_{i+1}} - W_{t_i} \bigr)^2 \xrightarrow[n \to \infty]{} t \quad \text{in } L^2(P) \text{ and } P\text{-a.s.} \]
In particular: $\langle W \rangle_t = t$ P-a.s.
Proof (of the $L^2$-convergence). Recall that the increments $W_{t_{i+1}} - W_{t_i}$, $t_i \in \tau_n$, are independent and $\mathcal{N}(0, t_{i+1} - t_i)$-distributed, so that
\[ E(S^n_t) = \sum_{t_i \in \tau_n,\, t_i \le t} E\bigl( (W_{t_{i+1}} - W_{t_i})^2 \bigr) = \sum_{t_i \in \tau_n,\, t_i \le t} (t_{i+1} - t_i) \longrightarrow t . \]
In addition,
\[ \operatorname{Var}(S^n_t) = \sum_{t_i \in \tau_n,\, t_i \le t} \operatorname{Var}\bigl( (W_{t_{i+1}} - W_{t_i})^2 \bigr) = \sum_{t_i \in \tau_n,\, t_i \le t} \Bigl( E\bigl( (W_{t_{i+1}} - W_{t_i})^4 \bigr) - \underbrace{E\bigl( (W_{t_{i+1}} - W_{t_i})^2 \bigr)^2}_{= (t_{i+1} - t_i)^2} \Bigr) \]
\[ = \sum_{t_i \in \tau_n,\, t_i \le t} \bigl( 3 (t_{i+1} - t_i)^2 - (t_{i+1} - t_i)^2 \bigr) = 2 \sum_{t_i \in \tau_n,\, t_i \le t} (t_{i+1} - t_i)^2 \le 2 |\tau_n| \sum_{t_i \in \tau_n,\, t_i \le t} (t_{i+1} - t_i) \to 0 . \]
Thus $S^n_t - E(S^n_t) \to 0$ in $L^2(P)$, i.e., $S^n_t \to t$ in $L^2(P)$. The a.s.-convergence is obtained with a suitable martingale convergence result. $\square$
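Levy's theorem is easy to observe numerically: summing squared Brownian increments over a partition of mesh $t/n$ returns values close to t. A minimal sketch of ours:

```python
import math
import random

def quadratic_variation(n, t=1.0, seed=0):
    """S^n_t = sum of squared BM increments over a partition of [0, t] with mesh t/n.
    Each increment is N(0, t/n), sampled directly."""
    rng = random.Random(seed)
    sq = math.sqrt(t / n)
    return sum((sq * rng.gauss(0.0, 1.0)) ** 2 for _ in range(n))

if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, quadratic_variation(n))   # approaches t = 1 as the mesh refines
```

The variance of $S^n_t$ is $2t^2/n$, matching the bound in the proof above, so the fluctuations around t shrink like $n^{-1/2}$.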
The last theorem implies in particular that the typical path of a BM is of unbounded variation on every interval (in fact, Brownian paths are nowhere differentiable). But since its quadratic variation exists, it is possible to extend the classical differential calculus to Brownian paths, which leads to the so-called Ito calculus (see below).
B.4. The Ito-Integral
We want to consider stochastic differential equations of the following type:
$$\dot X(t) = B(X(t)) + C(X(t))\, \dot W_t, \qquad X(0) = \xi, \tag{B.1}$$
where $(W_t)$ is a Brownian motion. Writing (B.1) in integral form we obtain the integral equation
$$X(t) = \xi + \int_0^t B(X(s))\, ds + \int_0^t C(X(s))\, \dot W_s\, ds, \qquad t\ge0,$$
and substituting $\dot W_s\, ds = \frac{dW_s}{ds}\, ds = dW_s$ formally, we can write this in the form
$$X(t) = \xi + \int_0^t B(X(s))\, ds + \int_0^t C(X(s))\, dW_s.$$
In this equation, the third term on the right-hand side is a stochastic integral. We will sketch the construction of $\int_0^t \Phi_s\, dW_s$ for a reasonable class of stochastic integrands $(\Phi_s)_{s\ge0}$ in the following. To this end let $(\Omega,\mathcal F, P)$ be a probability space and $(W_t)$ a Brownian motion with associated right-continuous filtration $(\mathcal F_t)_{t\ge0}$.
Step 1: Integration of elementary processes

Let $\mathcal E$ be the set of all elementary processes $\Phi$ of the type
$$\Phi_t(\omega) := \sum_{i=0}^n \Phi_{t_i}(\omega)\, 1_{(t_i, t_{i+1}]}(t) \qquad (0\le t_0 < t_1 < \dots < t_n),$$
where $\Phi_{t_i}$ is $\mathcal F_{t_i}$-measurable. For $\Phi\in\mathcal E$ we define the stochastic integral as
$$\Big(\int_0^t \Phi_s\, dW_s\Big)(\omega) := \sum_{i:\, t_i<t} \Phi_{t_i}(\omega)\big(W_{t_{i+1}\wedge t}(\omega) - W_{t_i}(\omega)\big).$$

Lemma B.6. (i) $\int_0^\cdot \Phi_s\, dW_s \in \mathcal M^2_c$, where
$$\mathcal M^2_c := \{\text{all continuous martingales, bounded in } L^2(P)\}$$
$$= \Big\{M = (M_t)_{t\ge0} : M \text{ martingale},\ t\mapsto M_t(\omega)\ P\text{-a.s. continuous},\ \|M\|^2 := \sup_{t\ge0} E(M_t^2) < \infty\Big\}.$$
(ii) (Wiener-Ito-isometry)
$$E\Big(\Big(\int_0^t \Phi_s\, dW_s\Big)^2\Big) = E\Big(\int_0^t \Phi_s^2\, ds\Big).$$
Proof. (i) Adaptedness and continuity are clear, and so is square-integrability. For the martingale property, let $t' < t$ and $t'_i < t' \le t'_{i+1}$, $t_i < t \le t_{i+1}$:
$$E\Big(\int_0^t \Phi_s\, dW_s\,\Big|\,\mathcal F_{t'}\Big) = E\Big(\int_0^{t'} \Phi_s\, dW_s\,\Big|\,\mathcal F_{t'}\Big) + E\Big(\int_{t'}^{t} \Phi_s\, dW_s\,\Big|\,\mathcal F_{t'}\Big)$$
$$= \int_0^{t'} \Phi_s\, dW_s + E\Big(\Phi_{t'_i}\big(W_{t'_{i+1}\wedge t} - W_{t'_{i+1}\wedge t'}\big) + \sum_{i:\, t'_{i+1}\le t_i<t} \Phi_{t_i}\big(W_{t_{i+1}\wedge t} - W_{t_i}\big)\,\Big|\,\mathcal F_{t'}\Big)$$
$$= \int_0^{t'} \Phi_s\, dW_s + \Phi_{t'_i}\,\underbrace{E\big(W_{t'_{i+1}\wedge t} - W_{t'_{i+1}\wedge t'}\,\big|\,\mathcal F_{t'}\big)}_{=0} + \sum_{i:\, t'_{i+1}\le t_i<t} E\Big(\Phi_{t_i}\,\underbrace{E\big(W_{t_{i+1}\wedge t} - W_{t_i}\,\big|\,\mathcal F_{t_i}\big)}_{=0}\,\Big|\,\mathcal F_{t'}\Big)$$
$$= \int_0^{t'} \Phi_s\, dW_s.$$
(ii) For the proof of the Wiener-Ito-isometry, note that for $t_i < t_j < t$:
$$E\big(\Phi_{t_i}(W_{t_{i+1}\wedge t} - W_{t_i})\,\Phi_{t_j}(W_{t_{j+1}\wedge t} - W_{t_j})\big) = E\Big(\Phi_{t_i}(W_{t_{i+1}\wedge t} - W_{t_i})\,\Phi_{t_j}\,\underbrace{E\big(W_{t_{j+1}\wedge t} - W_{t_j}\,\big|\,\mathcal F_{t_j}\big)}_{=0}\Big) = 0,$$
and
$$E\big(\big(\Phi_{t_i}(W_{t_{i+1}\wedge t} - W_{t_i})\big)^2\big) = E\Big(\Phi_{t_i}^2\,\underbrace{E\big((W_{t_{i+1}\wedge t} - W_{t_i})^2\,\big|\,\mathcal F_{t_i}\big)}_{=t_{i+1}\wedge t - t_i}\Big) = E\big(\Phi_{t_i}^2 (t_{i+1}\wedge t - t_i)\big).$$
Consequently,
$$E\Big(\Big(\int_0^t \Phi_s\, dW_s\Big)^2\Big) = E\Big(\Big(\sum_{i:\, t_i<t} \Phi_{t_i}\big(W_{t_{i+1}\wedge t} - W_{t_i}\big)\Big)^2\Big) = \sum_{i:\, t_i<t}\ \sum_{j:\, t_j<t} E\big(\Phi_{t_i}(W_{t_{i+1}\wedge t} - W_{t_i})\,\Phi_{t_j}(W_{t_{j+1}\wedge t} - W_{t_j})\big)$$
$$= \sum_{i:\, t_i<t} E\big(\Phi_{t_i}^2 (t_{i+1}\wedge t - t_i)\big) = E\Big(\int_0^t \Phi_s^2\, ds\Big).$$
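The Wiener-Ito-isometry can be illustrated by Monte Carlo for a concrete elementary integrand, here $\Phi_s = W_{t_i}$ on $(t_i, t_{i+1}]$ (a choice made up for this illustration, and adapted since $W_{t_i}$ is $\mathcal F_{t_i}$-measurable): both sides of the isometry then approximate $\int_0^1 s\, ds = \tfrac12$.

```python
import math
import random

random.seed(1)

def elementary_integral(n=50, t=1.0):
    """One sample of I = sum_i Phi_{t_i}(W_{t_{i+1}} - W_{t_i}) together with
    int_0^t Phi_s^2 ds, for the elementary integrand Phi_s = W_{t_i} on (t_i, t_{i+1}]."""
    dt = t / n
    w, integral, phi_sq = 0.0, 0.0, 0.0
    for _ in range(n):
        dw = random.gauss(0.0, math.sqrt(dt))
        integral += w * dw        # Phi_{t_i} = W_{t_i}, fixed before the increment
        phi_sq += w * w * dt      # Riemann sum for int_0^t Phi_s^2 ds
        w += dw
    return integral, phi_sq

paths = 20_000
lhs = rhs = 0.0
for _ in range(paths):
    i, p = elementary_integral()
    lhs += i * i
    rhs += p
lhs, rhs = lhs / paths, rhs / paths
print(round(lhs, 3), round(rhs, 3))   # both sides approximate int_0^1 s ds = 1/2
```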
Step 2: Extension of the set of admissible integrands to $\bar{\mathcal E}$

The lemma implies in particular the following isometry
$$\Big\|\int_0^\cdot \Phi_s\, dW_s\Big\|^2 = \sup_{t\ge0} E\Big(\Big(\int_0^t \Phi_s\, dW_s\Big)^2\Big) = E\Big(\int_0^\infty \Phi_s^2\, ds\Big)$$
between the spaces
$$\mathcal E \subseteq L^2\big(\bar\Omega, \bar{\mathcal F}, P\otimes dt\big) \to \mathcal M^2_c,$$
where $\bar\Omega = \Omega\times[0,\infty)$ and $\bar{\mathcal F} = \mathcal F\otimes\mathcal B([0,\infty))$. We can therefore extend the stochastic integral to the space
$$\bar{\mathcal E} := \text{closure of } \mathcal E \text{ in } L^2(P\otimes dt),$$
i.e., for $\Phi\in\bar{\mathcal E}$ we define
$$\int_0^t \Phi_s\, dW_s := \lim_{n\to\infty} \int_0^t \Phi^{(n)}_s\, dW_s \quad (\text{in } L^2(P)),$$
where $(\Phi^{(n)})\subseteq\mathcal E$ is an arbitrary sequence of elementary processes converging to $\Phi$ in $L^2(P\otimes dt)$. It can be shown that the limit $\int_0^\cdot \Phi_s\, dW_s$ is again a continuous square-integrable martingale.

How large is the class of admissible integrands obtained this way? To answer this, one has to characterize the set $\bar{\mathcal E}$. It can be shown that
$$\bar{\mathcal E} \supseteq L^2\big(\bar\Omega, \mathcal P, P\otimes dt\big) \quad\text{and}\quad \bar{\mathcal E} = L^2\big(\bar\Omega, \bar{\mathcal P}, P\otimes dt\big),$$
where
$$\mathcal P = \sigma\big(A_s\times]s,t] : A_s\in\mathcal F_s,\ 0\le s\le t\big) \qquad \text{("predictable $\sigma$-algebra")}$$
$$= \sigma\big(H = (H_t)_{t\ge0} : H \text{ left-continuous, } (\mathcal F_t)\text{-adapted}\big) = \sigma\big(H = (H_t)_{t\ge0} : H \text{ continuous, } (\mathcal F_t)\text{-adapted}\big),$$
and $\bar{\mathcal P}$ denotes the completion of $\mathcal P$ w.r.t. the measure $P\otimes dt$. Hence any predictable process $\Phi = (\Phi_t)_{t\ge0}$ with
$$E\Big(\int_0^\infty \Phi_t^2\, dt\Big) < \infty$$
can be integrated against BM, and the stochastic integral $\int_0^t \Phi_s\, dW_s$, $t\ge0$, satisfies:

(i) $\int_0^\cdot \Phi_s\, dW_s \in \mathcal M^2_c$
(ii) $E\Big(\Big(\int_0^t \Phi_s\, dW_s\Big)^2\Big) = E\Big(\int_0^t \Phi_s^2\, ds\Big)$ (Wiener-Ito-isometry)

(Proof: see e.g. Øksendal, Stochastic Differential Equations, Chapter 3.)
Step 3: Localization

The definition of $\int_0^t \Phi_s\, dW_s$ can be extended to the space of integrands
$$\mathcal N = \Big\{\Phi : \bar\Omega\to\mathbb R \text{ predictable with } P\Big(\int_0^t \Phi_s^2\, ds < \infty\ \forall t\ge0\Big) = 1\Big\}$$
using the stopping times
$$T_n := \inf\Big\{t\ge0 : \int_0^t \Phi_s^2\, ds \ge n\Big\} \nearrow +\infty \quad P\text{-a.s.}$$
Indeed, the process $\Phi^{(n)}_s := \Phi_s\, 1_{\{s\le T_n\}}$ is predictable and square-integrable with
$$E\Big(\int_0^\infty \big(\Phi^{(n)}_s\big)^2\, ds\Big) = E\Big(\int_0^{T_n} \Phi_s^2\, ds\Big) \le n,$$
hence $\int_0^t \Phi^{(n)}_s\, dW_s$ is well-defined and consistent in the sense that
$$\int_0^t \Phi^{(n)}_s\, dW_s = \int_0^t \Phi^{(n+1)}_s\, dW_s \quad \text{on } \{t\le T_n\}.$$
We can therefore write
$$\int_0^t \Phi^{(n)}_s\, dW_s =: \int_0^{t\wedge T_n} \Phi_s\, dW_s,$$
and $\int_0^t \Phi_s\, dW_s$ satisfies:

(i) $\int_0^{t\wedge T_n} \Phi_s\, dW_s \in \mathcal M^2_c$
(ii) $E\Big(\Big(\int_0^{t\wedge T_n} \Phi_s\, dW_s\Big)^2\Big) = E\Big(\int_0^{t\wedge T_n} \Phi_s^2\, ds\Big)$
B.5. The Ito-formula
Recall the classical chain rule for differentiable functions $f, X_\cdot$:
$$df(X_t) = f'(X_t)\,\underbrace{\dot X_t\, dt}_{=dX_t} = f'(X_t)\, dX_t,$$
which may be written in integral form as
$$f(X_t) - f(X_0) = \int_0^t f'(X_s)\, dX_s = \lim_{n\to\infty} \sum_{t_i\in\tau_n,\, t_i\le t} f'(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big). \tag{B.2}$$
We want to deduce a similar formula for $X_\cdot$ given by a Brownian path. The main difficulty consists in the fact that if $X_\cdot$ is a Brownian path, it is of unbounded variation, so the Riemann sum in (B.2) need not converge pointwise (only in $L^2(P)$). However, if $f\in C^2(\mathbb R)$, we can use, instead of the linear approximation
$$f(X_{t_{i+1}}) - f(X_{t_i}) \approx f'(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big),$$
the quadratic approximation
$$f(X_{t_{i+1}}) - f(X_{t_i}) \approx f'(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big) + \tfrac12 f''(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big)^2.$$
More precisely, let $\tau_n$ be a sequence of partitions of $[0,t]$; then
$$f(X_t) - f(X_0) = \sum_{t_i\in\tau_n} f(X_{t_{i+1}}) - f(X_{t_i})$$
$$= \sum_{t_i\in\tau_n} f'(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big) + \tfrac12 f''(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big)^2 + \underbrace{\tfrac12\sum_{t_i\in\tau_n}\big(f''(\theta^{(n)}_{t_i}) - f''(X_{t_i})\big)\big(X_{t_{i+1}} - X_{t_i}\big)^2}_{=:R_n},$$
where $\theta^{(n)}_{t_i}\in[X_{t_i}, X_{t_{i+1}}]$ and
$$|R_n| \le \tfrac12\,\underbrace{\sup_{|r-s|\le|\tau_n|}\big|f''(X_r) - f''(X_s)\big|}_{\to 0 \text{ as } |\tau_n|\to0}\ \cdot\ \underbrace{\sum_{t_i\in\tau_n}\big(X_{t_{i+1}} - X_{t_i}\big)^2}_{\to\langle X\rangle_t = t}.$$
The first factor converges to 0 since $s\mapsto f''(X_s)$ is uniformly continuous on $[0,t]$. Since
$$\lim_{n\to\infty} \tfrac12\sum_{t_i\in\tau_n} f''(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big)^2 = \tfrac12\int_0^t f''(X_s)\, ds,$$
also the limit
$$\lim_{n\to\infty} \sum_{t_i\in\tau_n} f'(X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big) = \int_0^t f'(X_s)\, dX_s$$
exists for any Brownian path, and we obtain Ito's formula:
$$f(X_t) = f(X_0) + \int_0^t f'(X_s)\, dX_s + \tfrac12\int_0^t f''(X_s)\, ds. \tag{B.3}$$
If in addition $f$ depends on time, $f\in C^{1,2}([0,\infty)\times\mathbb R)$, then a similar Taylor approximation,
$$f(t_{i+1}, X_{t_{i+1}}) - f(t_i, X_{t_i}) = f(t_{i+1}, X_{t_{i+1}}) - f(t_i, X_{t_{i+1}}) + f(t_i, X_{t_{i+1}}) - f(t_i, X_{t_i})$$
$$= \partial_t f(t_i, X_{t_{i+1}})(t_{i+1} - t_i) + \partial_x f(t_i, X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big) + \tfrac12\partial_{xx} f(t_i, X_{t_i})\big(X_{t_{i+1}} - X_{t_i}\big)^2 + R_n,$$
yields in the limit the time-dependent Ito-formula:
$$f(t, X_t) = f(0, X_0) + \int_0^t \partial_s f(s, X_s)\, ds + \int_0^t \partial_x f(s, X_s)\, dX_s + \tfrac12\int_0^t \partial_{xx} f(s, X_s)\, ds. \tag{B.4}$$
Example B.7. Let $(X_t)$ be a BM with $X_0 = 0$.
(i)
$$X_t^2 = \underbrace{X_0^2}_{=0} + 2\int_0^t X_s\, dX_s + \tfrac12\int_0^t 2\, ds = 2\int_0^t X_s\, dX_s + t;$$
in particular, $X_t^2 - t = 2\int_0^t X_s\, dX_s$ is a martingale!
(ii)
$$\exp\Big(\lambda X_t - \frac{\lambda^2}{2}t\Big) = \underbrace{\exp\Big(\lambda X_0 - \frac{\lambda^2}{2}\cdot 0\Big)}_{=1} + \Big(-\frac{\lambda^2}{2}\Big)\int_0^t \exp\Big(\lambda X_s - \frac{\lambda^2}{2}s\Big)\, ds$$
$$+ \lambda\int_0^t \exp\Big(\lambda X_s - \frac{\lambda^2}{2}s\Big)\, dX_s + \frac12\lambda^2\int_0^t \exp\Big(\lambda X_s - \frac{\lambda^2}{2}s\Big)\, ds$$
$$= 1 + \lambda\int_0^t \exp\Big(\lambda X_s - \frac{\lambda^2}{2}s\Big)\, dX_s,$$
which is a martingale too.
(iii)
$$X_t^m = \underbrace{X_0^m}_{=0} + m\int_0^t X_s^{m-1}\, dX_s + \frac{m(m-1)}{2}\int_0^t X_s^{m-2}\, ds, \qquad m\ge2.$$
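The martingale identities in (i) and (ii) can be sanity-checked by Monte Carlo, assuming (as in the examples) that $X$ is a standard BM started at $0$: at $t = 1$ the empirical means of $X_t^2 - t$ and $\exp(\lambda X_t - \tfrac{\lambda^2}{2}t)$ should sit near the starting values $0$ and $1$. The value of $\lambda$ and the sample size are arbitrary.

```python
import math
import random

random.seed(2)

lam, t, n = 0.5, 1.0, 200_000
sq_sum, exp_sum = 0.0, 0.0
for _ in range(n):
    w = random.gauss(0.0, math.sqrt(t))              # X_t ~ N(0, t), X_0 = 0
    sq_sum += w * w - t                              # sample of X_t^2 - t
    exp_sum += math.exp(lam * w - 0.5 * lam * lam * t)
print(round(sq_sum / n, 3), round(exp_sum / n, 3))   # near 0 and near 1
```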
In the next step, we generalize the Ito formula to the class of Ito-processes:
Let (Wt) be a Brownian motion, (Ft) the associated right-continuous filtration.
Definition B.8. A one-dimensional Ito process is a stochastic process $(Y_t)$ of the form
$$Y_t = Y_0 + \int_0^t u_s\, ds + \int_0^t v_s\, dW_s, \qquad t\ge0$$
(or in infinitesimal form: $dY_t = u_t\, dt + v_t\, dW_t$), where
• $u_s, v_s$ are predictable processes
• $P\Big(\int_0^t |u_s|\, ds < \infty,\ \int_0^t v_s^2\, ds < \infty\ \forall t\ge0\Big) = 1$
Quadratic variation of Ito-processes

Similar to BM, Ito processes have continuous quadratic variation. Given two Ito processes $Y^1$ and $Y^2$, their quadratic variation and covariation are defined by
$$\langle Y^j\rangle_t := \lim_{|\tau_n|\to0} \sum_{t_i\in\tau_n} \big(Y^j_{t_{i+1}} - Y^j_{t_i}\big)^2, \qquad j = 1,2,$$
$$\langle Y^1, Y^2\rangle_t := \lim_{|\tau_n|\to0} \sum_{t_i\in\tau_n} \big(Y^1_{t_{i+1}} - Y^1_{t_i}\big)\big(Y^2_{t_{i+1}} - Y^2_{t_i}\big).$$
Lemma B.9. (i) (Cauchy-Schwarz inequality) $|\langle Y^1, Y^2\rangle_t| \le \langle Y^1\rangle_t^{1/2}\,\langle Y^2\rangle_t^{1/2}$
(ii) $\langle Y^1, Y^2\rangle_t = \frac12\big(\langle Y^1 + Y^2\rangle_t - \langle Y^1\rangle_t - \langle Y^2\rangle_t\big)$.
(iii) Let $(A_t)$ be continuous and of bounded variation. Then $\langle Y^1 + A\rangle_t = \langle Y^1\rangle_t$.

Proof (of (iii)). We know that $\langle A\rangle_t = 0$. Hence
$$\langle Y^1 + A\rangle_t = \langle Y^1 + A, Y^1 + A\rangle_t = \langle Y^1\rangle_t + 2\underbrace{\langle Y^1, A\rangle_t}_{|\cdot|\le\langle Y^1\rangle_t^{1/2}\langle A\rangle_t^{1/2} = 0} + \underbrace{\langle A\rangle_t}_{=0} = \langle Y^1\rangle_t.$$
Proposition B.10. Let $dY_t = v_t\, dW_t$ be a stochastic integral. Then
$$\langle Y\rangle_t = \int_0^t v_s^2\, ds.$$
Proof. First suppose that $v_t\equiv v$, so that $Y_t = v\cdot W_t$. Then
$$\langle Y\rangle_t = \lim_{|\tau_n|\to0} \sum_{t_i\in\tau_n}\big(vW_{t_{i+1}} - vW_{t_i}\big)^2 = v^2 t = \int_0^t v_s^2\, ds.$$
Next assume that
$$v_t = \sum_{j=1}^m h_{t_j}\, 1_{]t_j, t_{j+1}]}, \qquad t_{m+1} = t.$$
Since $v_t$ is constant on $]t_j, t_{j+1}]$,
$$\langle Y\rangle_{t_{j+1}} - \langle Y\rangle_{t_j} = h_{t_j}^2 (t_{j+1} - t_j) = \int_{t_j}^{t_{j+1}} v_s^2\, ds,$$
hence $\langle Y\rangle_t = \int_0^t v_s^2\, ds$ in this case. The general case is obtained by approximation of $v$ with elementary processes $v^n_s$ satisfying
$$P\Big(\int_0^t (v^n_s - v_s)^2\, ds \to 0,\ n\to\infty\Big) = 1.$$
Corollary B.11. Let
$$dY_t = \underbrace{u_t\, dt}_{=:dA_t} + v_t\, dW_t.$$
Then
$$\langle Y\rangle_t = \int_0^t v_s^2\, ds.$$
Proposition B.12. Let
$$dY^1 = u^1\, dt + v^1\, dW, \qquad dY^2 = u^2\, dt + v^2\, dW$$
be two Ito processes. Then
$$\langle Y^1, Y^2\rangle_t = \int_0^t v^1_s v^2_s\, ds.$$
Proof.
$$\langle Y^1, Y^2\rangle_t = \frac12\big(\langle Y^1 + Y^2\rangle_t - \langle Y^1\rangle_t - \langle Y^2\rangle_t\big) = \frac12\Big(\int_0^t (v^1_s + v^2_s)^2\, ds - \int_0^t (v^1_s)^2\, ds - \int_0^t (v^2_s)^2\, ds\Big) = \int_0^t v^1_s v^2_s\, ds.$$
Theorem B.13 (1-dim. Ito-formula). Let $(Y_t)$ be an Ito process of the type
$$dY_t = u_t\, dt + v_t\, dW_t.$$
Let $f\in C^{1,2}([0,\infty)\times\mathbb R)$; then $(f(t, Y_t))$ is an Ito process too, with representation
$$df(t, Y_t) = \partial_t f(t, Y_t)\, dt + \partial_x f(t, Y_t)\, dY_t + \tfrac12\partial_{xx} f(t, Y_t)\, d\langle Y\rangle_t$$
$$= \partial_t f(t, Y_t)\, dt + \partial_x f(t, Y_t)\, u_t\, dt + \partial_x f(t, Y_t)\, v_t\, dW_t + \tfrac12\partial_{xx} f(t, Y_t)\, v_t^2\, dt$$
$$= \Big(\partial_t f(t, Y_t) + \partial_x f(t, Y_t)\, u_t + \tfrac12\partial_{xx} f(t, Y_t)\, v_t^2\Big)\, dt + \partial_x f(t, Y_t)\, v_t\, dW_t,$$
using $dY_t = u_t\, dt + v_t\, dW_t$ and $d\langle Y\rangle_t = v_t^2\, dt$.
The proof requires certain preliminaries:

Lemma B.14 (two simple stochastic differentials).
(i) $d(W_t^2) = 2W_t\, dW_t + dt$
(ii) $d(tW_t) = W_t\, dt + t\, dW_t$
Theorem B.15 (Ito's product rule). Suppose that
$$dY_1 = u_1\, dt + v_1\, dW, \qquad dY_2 = u_2\, dt + v_2\, dW,$$
with $P\big(\int_0^t u_i^2 + v_i^2\, ds < \infty\ \forall t\ge0\big) = 1$, $i = 1,2$. Then:
$$d(Y_1 Y_2) = Y_1\, dY_2 + Y_2\, dY_1 + d\langle Y_1, Y_2\rangle = Y_1\, dY_2 + Y_2\, dY_1 + v_1 v_2\, dt. \tag{B.5}$$
Proof. First assume that
$$Y_1(0) = Y_2(0) = 0, \qquad u_i(t)\equiv u_i, \quad v_i(t)\equiv v_i,$$
so that $Y_i(t) = u_i t + v_i W_t$. Then
$$\int_0^t Y_2\, dY_1 + \int_0^t Y_1\, dY_2 + \int_0^t v_1 v_2\, ds = \int_0^t Y_2 u_1 + Y_1 u_2\, ds + \int_0^t Y_2 v_1 + Y_1 v_2\, dW_s + \int_0^t v_1 v_2\, ds$$
$$= \int_0^t u_2 u_1 s + v_2 u_1 W_s + u_1 u_2 s + v_1 u_2 W_s\, ds + \int_0^t u_2 v_1 s + v_2 v_1 W_s + u_1 v_2 s + v_1 v_2 W_s\, dW_s + \int_0^t v_1 v_2\, ds$$
$$= u_1 u_2 t^2 + (u_1 v_2 + u_2 v_1)\Big[\int_0^t W_s\, ds + \int_0^t s\, dW_s\Big] + 2 v_1 v_2\underbrace{\int_0^t W_s\, dW_s}_{=\frac12(W_t^2 - t)} + v_1 v_2 t.$$
According to the last lemma this can be simplified to
$$= u_1 u_2 t^2 + (u_1 v_2 + u_2 v_1)\, t W_t + v_1 v_2 W_t^2 = Y_1(t)\, Y_2(t).$$
Next assume that $u_i, v_i$ are elementary processes,
$$u_i = \sum_{j=1}^n g^i_{t_j}\, 1_{]t_j, t_{j+1}]}, \qquad v_i = \sum_{j=1}^n h^i_{t_j}\, 1_{]t_j, t_{j+1}]}, \qquad t_{n+1} = t;$$
then exactly the same identity can be obtained on $]t_j, t_{j+1}]$:
$$\int_{t_j}^{t_{j+1}} Y_2\, dY_1 + \int_{t_j}^{t_{j+1}} Y_1\, dY_2 + \int_{t_j}^{t_{j+1}} h^1_{t_j} h^2_{t_j}\, ds = Y_1(t_{j+1}) Y_2(t_{j+1}) - Y_1(t_j) Y_2(t_j),$$
and summing up w.r.t. $j = 1,\dots,n$ gives the desired identity (B.5).

Finally consider $u^n_i, v^n_i$, elementary processes converging to $u_i, v_i$ in the sense that, for $i = 1,2$,
$$P\Big(\int_0^t (u^n_i(s) - u_i(s))^2\, ds \to 0,\ n\to\infty\Big) = 1, \qquad P\Big(\int_0^t (v^n_i(s) - v_i(s))^2\, ds \to 0,\ n\to\infty\Big) = 1.$$
Let
$$Y^n_i(t) = \int_0^t u^n_i(s)\, ds + \int_0^t v^n_i(s)\, dW_s, \qquad i = 1,2.$$
Then, as $n\to\infty$,
$$(Y^n_1 Y^n_2)(t) - (Y^n_1 Y^n_2)(0) = \int_0^t Y^n_1\, dY^n_2 + \int_0^t Y^n_2\, dY^n_1 + \int_0^t v^n_1 v^n_2\, ds,$$
where the left-hand side converges to $(Y_1 Y_2)(t) - (Y_1 Y_2)(0)$ and the right-hand side to
$$\int_0^t Y_1\, dY_2 + \int_0^t Y_2\, dY_1 + \int_0^t v_1 v_2\, ds.$$
APPENDIX C
Stochastic Differential Equations
Consider the ordinary differential equation
$$\frac{dN_t}{dt} = a N_t, \qquad N_0 = n_0. \tag{C.1}$$
(C.1) is linear and its unique solution is given by
$$N_t = e^{at}\cdot n_0.$$
(C.1) is a classical model describing growth: $N_t$ denotes the population size and $a$ the growth rate.

Suppose now that $a$ is only known partially, because it is subject to unknown, possibly random, forces. Then a natural Ansatz for this unknown, random growth rate is
$$a = r + \sigma\frac{dW_t}{dt},$$
where $W$ is a continuous BM, hence
$$\frac{dN_t}{dt} = \Big(r + \sigma\frac{dW_t}{dt}\Big)\cdot N_t,$$
or
$$dN_t = r N_t\, dt + \sigma N_t\, dW_t. \tag{C.2}$$
(C.2) is called a stochastic differential equation (sde). The explicit solution with initial condition $N_0$ is given by
$$N_t = \exp\Big(\big(r - \tfrac12\sigma^2\big)t + \sigma W_t\Big)\cdot N_0.$$
Indeed, Ito's formula applied to the function $f(t, w) = \exp\big(\sigma w + (r - \tfrac12\sigma^2)t\big)$ yields
$$dN_t = \sigma N_t\, dW_t + \Big(\tfrac12\sigma^2 + \big(r - \tfrac12\sigma^2\big)\Big) N_t\, dt = r N_t\, dt + \sigma N_t\, dW_t.$$
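As a quick sanity check of the explicit solution, one can sample $W_t$ directly and verify $E(N_t) = e^{rt} N_0$, since $E(e^{\sigma W_t}) = e^{\sigma^2 t/2}$. The parameter values below are illustrative choices, not taken from the text.

```python
import math
import random

random.seed(4)

r, sigma, n0, t = 0.05, 0.3, 1.0, 1.0   # illustrative parameters
samples = 100_000
total = 0.0
for _ in range(samples):
    w = random.gauss(0.0, math.sqrt(t))   # W_t ~ N(0, t)
    total += n0 * math.exp((r - 0.5 * sigma * sigma) * t + sigma * w)
mean = total / samples
print(round(mean, 3), round(n0 * math.exp(r * t), 3))  # E(N_t) vs e^{rt} N_0
```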
Further examples of SDEs often used in computational neuroscience are:

(a)
$$dV_t = (I - V_t)\, dt + \sigma\, dW_t,$$
modelling the membrane potential of a neuron subject to noise, e.g. channel noise and/or noise in the synaptic input.

(b) Stochastic FitzHugh-Nagumo systems
$$dV_t = \big(V_t(1 - V_t)(V_t - a) - W_t\big)\, dt + \sigma_V\, dB^V_t$$
$$dW_t = b\big(V_t - (a + W_t)\big)\, dt + \sigma_W\, dB^W_t,$$
where $(B^V_t), (B^W_t)$ are possibly correlated BM. The above system of SDEs is no longer linear and the drift term of the voltage variable is no longer Lipschitz; still it is possible to solve it uniquely for arbitrary, possibly also random, initial conditions.
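A minimal simulation sketch of the stochastic FitzHugh-Nagumo system, discretized with the Euler scheme described in Section C.3 and driven by independent Brownian motions. All parameter values ($a$, $b$, $\sigma_V$, $\sigma_W$, step size, horizon) are assumptions for illustration, not values from the text; the cubic drift keeps the simulated voltage bounded.

```python
import math
import random

random.seed(3)

# Illustrative parameter choices (not from the text).
a, b = 0.1, 0.01
sigma_v, sigma_w = 0.2, 0.05
h, steps = 0.01, 5_000          # Euler step size and horizon t = 50

v, w = 0.0, 0.0
path_bounded = True
for _ in range(steps):
    dBv = random.gauss(0.0, math.sqrt(h))   # increment of B^V
    dBw = random.gauss(0.0, math.sqrt(h))   # increment of B^W (independent here)
    v, w = (v + (v * (1.0 - v) * (v - a) - w) * h + sigma_v * dBv,
            w + b * (v - (a + w)) * h + sigma_w * dBw)
    path_bounded = path_bounded and abs(v) < 10.0 and abs(w) < 10.0
print(path_bounded, round(v, 3), round(w, 3))
```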
C.1. Explicit solutions
C.1.1. Linear SDE.
$$dX_t = \big(A(t) X_t + B(t)\big)\, dt + C(t)\, dW_t, \qquad X_0 = \xi_0, \tag{C.3}$$
where
- $(W_t)$ is a $d$-dimensional BM
- $A(t), C(t)\in\mathbb R^{d\times d}$, $B(t)\in\mathbb R^d$ are measurable and locally bounded
- $E(\|\xi_0\|^2) < \infty$, $\xi_0$ independent of $(W_t)$
We will see below that (C.3) has a unique solution.

Special case ($d = 1$): the Ornstein-Uhlenbeck SDE, modelling BM with friction:
$$dX_t = -b X_t\, dt + \sigma\, dW_t, \qquad X_0 = \xi_0. \tag{C.4}$$
To determine an explicit solution for this equation, recall the variation of constants formula for linear ODEs, which allows to represent the solution of
$$\dot X_t = -b X_t + \sigma\dot W_t, \qquad X_0 = \xi_0,$$
as
$$X_t = e^{-bt}\xi_0 + \sigma\int_0^t e^{-b(t-s)}\,\dot W_s\, ds.$$
Writing $\dot W_s\, ds = dW_s$, we then obtain
$$X_t = e^{-bt}\xi_0 + \sigma\underbrace{\int_0^t e^{-b(t-s)}\, dW_s}_{\text{stoch. integral}}$$
as the solution of (C.4). Note that
$$\sigma\int_0^t e^{-b(t-s)}\, dW_s \sim \mathcal N\Big(0,\ \sigma^2\int_0^t e^{-2b(t-s)}\, ds\Big),$$
so that
$$X_t \sim \mathcal N\Big(e^{-bt}\xi_0,\ \sigma^2\int_0^t e^{-2b(t-s)}\, ds\Big).$$
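The Gaussian transition law makes the OU process one of the few SDEs that can be sampled without discretization error. The following stdlib-only sketch (all parameter values are illustrative) iterates the exact one-step transition and compares the empirical mean and variance of $X_t$ with the formulas above.

```python
import math
import random

random.seed(5)

b, sigma, x0, t, n_steps = 1.0, 0.5, 2.0, 3.0, 50
h = t / n_steps
decay = math.exp(-b * h)                                        # e^{-bh}
noise_sd = sigma * math.sqrt((1.0 - decay * decay) / (2.0 * b)) # exact step st.dev.

def ou_endpoint():
    """Sample X_t using the exact Gaussian transition of (C.4) on each step."""
    x = x0
    for _ in range(n_steps):
        x = decay * x + noise_sd * random.gauss(0.0, 1.0)
    return x

paths = 20_000
xs = [ou_endpoint() for _ in range(paths)]
m = sum(xs) / paths
v = sum((x - m) ** 2 for x in xs) / paths
# Theory: X_t ~ N(e^{-bt} x0, sigma^2 (1 - e^{-2bt}) / (2b))
print(round(m, 3), round(v, 3))
```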
The general case (C.3). Let $\Phi_t$ be the matrix solution of
$$\dot\Phi_t = A(t)\Phi_t \qquad (\text{e.g. } A(t)\equiv A:\ \Phi_t = e^{tA}); \tag{C.5}$$
then
$$X_t = \Phi_t\xi_0 + \int_0^t \Phi_t\Phi_s^{-1} B(s)\, ds + \int_0^t \Phi_t\Phi_s^{-1} C(s)\, dW_s,$$
and again
$$\int_0^t \Phi_t\Phi_s^{-1} C(s)\, dW_s \sim \mathcal N\Big(0,\ \int_0^t \Phi_t\Phi_s^{-1} C(s) C(s)^T \big(\Phi_s^{-1}\big)^T \Phi_t^T\, ds\Big),$$
e.g. for $A(t)\equiv A$:
$$X_t = e^{tA}\xi_0 + \int_0^t e^{(t-s)A} B(s)\, ds + \int_0^t e^{(t-s)A} C(s)\, dW_s.$$
C.1.2. Solving SDE using change of variables.
$$dX_t = b(X_t)\, dt + \sigma(X_t)\, dW_t, \qquad X_0 = x\in\mathbb R. \tag{C.6}$$
We try to solve (C.6) in terms of $X_t = u(Y_t)$ for suitable $u$ and $Y_t$ solving
$$dY_t = f(Y_t)\, dt + dW_t, \qquad Y_0 = y\in\mathbb R, \tag{C.7}$$
where $f$ will be chosen later. Ito's formula gives:
$$du(Y_t) = u'(Y_t)\, dY_t + \tfrac12 u''(Y_t)\, dt = \Big[u'(Y_t) f(Y_t) + \tfrac12 u''(Y_t)\Big]\, dt + u'(Y_t)\, dW_t. \tag{C.8}$$
Hence, if
$$u'(y) = \sigma(u(y)), \qquad u'(y) f(y) + \tfrac12 u''(y) = b(u(y)), \tag{C.9}$$
(C.8) reduces to
$$du(Y_t) = b(u(Y_t))\, dt + \sigma(u(Y_t))\, dW_t,$$
so that $X_t = u(Y_t)$ solves (C.6).

The solution of (C.9) can be obtained by first solving the ODE
$$u'(z) = \sigma(u(z)), \qquad u(y) = x,$$
and setting
$$f(z) = \frac{1}{\sigma(u(z))}\Big[b(u(z)) - \tfrac12 u''(z)\Big].$$
Illustration: Consider again the SDE
$$dX_t = r X_t\, dt + \sigma X_t\, dW_t, \qquad X_0 = x;$$
then $u'(z) = \sigma(u(z)) = \sigma u(z)$, $u(0) = x$, leads to the solution $u(z) = x e^{\sigma z}$ and therefore
$$f(z) = \frac{1}{\sigma x e^{\sigma z}}\Big[r x e^{\sigma z} - \frac{\sigma^2}{2} x e^{\sigma z}\Big] = \frac1\sigma\Big(r - \frac{\sigma^2}{2}\Big).$$
The SDE
$$dY_t = \frac1\sigma\Big(r - \frac{\sigma^2}{2}\Big)\, dt + dW_t, \qquad Y_0 = 0,$$
has the solution
$$Y_t = \frac1\sigma\Big(r - \frac{\sigma^2}{2}\Big)t + W_t,$$
and thus
$$X_t = u(Y_t) = x\exp\Big(\Big(r - \frac{\sigma^2}{2}\Big)t + \sigma W_t\Big).$$

C.2. Strong solutions
Input:
• $b_i, \sigma_{ij} : I\times\mathbb R^d\to\mathbb R$ ($I = [0,T]$ or $\mathbb R_+$, $1\le i\le d$, $1\le j\le r$) Borel measurable
• $(W_t)$ an $r$-dimensional continuous BM on a probability space $(\Omega,\mathcal F, P)$
• $\xi$ an $\mathbb R^d$-valued r.v. on $(\Omega,\mathcal F, P)$, independent of $(W_t)$ (initial condition)

We are looking for a strong solution $(X_t)$ of the SDE
$$dX_t = b(t, X_t)\, dt + \sigma(t, X_t)\, dW_t, \qquad X_0 = \xi, \tag{C.10}$$
or, componentwise,
$$dX^i_t = b_i(t, X_t)\, dt + \sum_{j=1}^r \sigma_{ij}(t, X_t)\, dW^j_t, \qquad X^i_0 = \xi^i, \quad 1\le i\le d.$$
Definition C.1. A strong solution of the SDE (C.10) is a stochastic process $(X_t)_{t\in I}$ with continuous sample paths satisfying:
(i) $X_t$ is adapted, i.e. $\mathcal F_t$-measurable, where $\mathcal F_t = \sigma(\xi, W_s, s\in[0,t])\vee\mathcal N$ and $\mathcal N$ denotes all $P$-null sets in $\sigma(\xi, W_s, s\in I)$
(ii) $X_0 = \xi$ $P$-a.s.
(iii) $\int_0^t \big(|b_i(s, X_s)| + \sigma_{ij}^2(s, X_s)\big)\, ds < \infty$ $P$-a.s. $\forall t\in I$, $\forall i, j$
(iv) $X_t = X_0 + \int_0^t b(s, X_s)\, ds + \int_0^t \sigma(s, X_s)\, dW_s$ $P$-a.s. $\forall t\in I$.

Here
$$b(t, x) = \begin{pmatrix} b_1(t, x)\\ \vdots\\ b_d(t, x)\end{pmatrix}$$
is called the drift and
$$\sigma(t, x) = \begin{pmatrix} \sigma_{11}(t, x) & \cdots & \sigma_{1r}(t, x)\\ \vdots & & \vdots\\ \sigma_{d1}(t, x) & \cdots & \sigma_{dr}(t, x)\end{pmatrix}$$
the dispersion coefficient of the SDE.
Definition C.2 (Uniqueness of strong solutions). Strong uniqueness holds for the SDE (C.10) if, for a given BM $(W_t)$ and initial condition $\xi$ independent of $(W_t)$ on a probability space $(\Omega,\mathcal F, P)$, two strong solutions $X, \tilde X$ are indistinguishable, i.e.
$$P\big(X_t = \tilde X_t\ \forall t\in I\big) = 1.$$
Theorem C.3. Suppose that $I = [0,T]$ and that the following holds:

Assumption (A):
(a) $b_i, \sigma_{ij}$ are continuous
(b) For all $R > 0$ there exists a constant $L_R$ such that
$$2\langle b(t,x) - b(t,y), x - y\rangle + \|\sigma(t,x) - \sigma(t,y)\|^2 \le L_R\|x - y\|^2 \qquad \forall\,\|x\|, \|y\|\le R,\ t\in[0,T].$$
Then there exists at most one strong solution of (C.10).

Assumption (A) is in particular satisfied if $b_i, \sigma_{ij}$ are locally Lipschitz w.r.t. $x$, i.e., for all $R > 0$ there exists a constant $L_R$ with
$$\|b(t,x) - b(t,y)\| + \|\sigma(t,x) - \sigma(t,y)\| \le L_R\|x - y\| \qquad \forall x, y\in\mathbb R^d \text{ with } \|x\|, \|y\|\le R.$$
Part (b) of assumption (A) reduces, in the particular case $\sigma_{ij}\equiv0$, to the following condition:
$$2\langle b(t,x) - b(t,y), x - y\rangle \le L_R\|x - y\|^2 \qquad \forall\,\|x\|, \|y\|\le R,\ t\in[0,T]. \tag{C.12}$$
(C.12) is called the (local) monotonicity or (local) one-sided Lipschitz condition and is usually satisfied by the stochastic differential equations describing neural activity.
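The cubic FitzHugh-Nagumo voltage drift from the examples above is a typical case: it is not globally Lipschitz, yet a crude numerical scan suggests that the one-sided bound (C.12) holds with a modest constant on a bounded ball. The value $a = 0.1$, the radius $R$, and the sample count are illustrative assumptions.

```python
import random

random.seed(8)

def drift(v):
    """Cubic FitzHugh-Nagumo voltage drift, with a = 0.1 chosen for illustration."""
    a = 0.1
    return v * (1.0 - v) * (v - a)

# Numerically scan the local one-sided Lipschitz condition (C.12) on [-R, R]:
# 2 (b(x) - b(y))(x - y) <= L_R |x - y|^2 for some finite constant L_R.
R = 5.0
worst = 0.0
for _ in range(100_000):
    x, y = random.uniform(-R, R), random.uniform(-R, R)
    if x != y:
        worst = max(worst, 2.0 * (drift(x) - drift(y)) * (x - y) / (x - y) ** 2)
print(round(worst, 2))  # stays bounded, although the cubic is not globally Lipschitz
```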
Theorem C.4. Let $b_i, \sigma_{ij}$ satisfy assumption (A) and the following linear growth condition:
(B) There exists a constant $K$ such that
$$2\langle b(t,x), x\rangle + \|\sigma(t,x)\|^2 \le K\big(1 + \|x\|^2\big) \qquad \forall x\in\mathbb R^d,\ t\in I.$$
Let $(W_t)$ be an $r$-dimensional BM on $(\Omega,\mathcal F, P)$ and $\xi$ an initial condition independent of $(W_t)$. Then there exists a strong solution $(X_t)$ of the SDE (C.10). Moreover, if the initial condition is square integrable, $E(\|\xi\|^2) < \infty$, then for all $T\ge0$ there exists $C_T$ with
$$\sup_{t\le T} E\big(\|X_t\|^2\big) \le C_T\big(1 + E(\|\xi\|^2)\big).$$
Assumptions (A) and (B) are both satisfied if the coefficients are globally Lipschitz continuous w.r.t. $x$ with Lipschitz constant $L$ independent of $t$,
$$\|b(t,x) - b(t,y)\| + \|\sigma(t,x) - \sigma(t,y)\| \le L\|x - y\| \qquad \forall x, y\in\mathbb R^d,\ \forall t\in I,$$
and if $b_i, \sigma_{ij}$ are at most of linear growth, i.e.
$$\|b(t,x)\|^2 + \|\sigma(t,x)\|^2 \le K(1 + \|x\|^2) \qquad \forall x\in\mathbb R^d,\ \forall t\in I.$$
C.3. Numerical approximation
The simplest numerical approximation of the SDE (C.10) is provided by the Euler scheme, defined as follows: choose a time step $h > 0$ and set $t_k = k\cdot h$, $k = 0, 1, 2, \dots$,
$$X^h_{t_0} = \xi,$$
$$X^h_{t_{k+1}} = X^h_{t_k} + b\big(t_k, X^h_{t_k}\big)\cdot h + \sigma\big(t_k, X^h_{t_k}\big)\cdot\big(W_{t_{k+1}} - W_{t_k}\big), \qquad k = 0, 1, 2, \dots$$
Note: the increments $W_{t_{k+1}} - W_{t_k}$ are samples of independent $r$-dimensional normal random variables with mean $0$ and covariance matrix $h\,\mathrm{Id}_r$.
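The recursion above translates directly into code. The following generic scalar sketch (stdlib only) restates the scheme; the OU example at the bottom, with its parameter values, is an illustrative choice.

```python
import math
import random

def euler_maruyama(b, sigma, xi, h, n_steps, rng=random):
    """Euler scheme for dX = b(t, X) dt + sigma(t, X) dW (scalar case)."""
    x, t = xi, 0.0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(h))   # W_{t_{k+1}} - W_{t_k} ~ N(0, h)
        x = x + b(t, x) * h + sigma(t, x) * dw
        t += h
        path.append(x)
    return path

# Illustrative use: the OU equation dX = -X dt + 0.5 dW, X_0 = 1, as in (C.4).
random.seed(6)
path = euler_maruyama(lambda t, x: -x, lambda t, x: 0.5, 1.0, 0.01, 1_000)
print(len(path), round(path[-1], 3))
```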
Strong error

The primary tool to measure the approximation error is the pathwise discrepancy between the exact solution $(X_t)$ of (C.10) and the Euler approximation $(X^h_{t_k})$:
$$e^{\mathrm{strong}}_{h,t} := \sup_{k:\, t_k\le t} E\big(|X^h_{t_k} - X_{t_k}|\big).$$
Theorem C.5. Let $b, \sigma$ be Lipschitz and $q\ge1$. Then:
(i) There exists $c_t$ such that
$$E\Big(\sup_{k:\, t_k\le t} |X^h_{t_k} - X_{t_k}|^{2q}\Big) \le c_t\cdot h^q;$$
in particular, if $h = \frac tN$, hence $t_N = t$, then
$$E\big(|X^h_{t_k} - X_{t_k}|^{2q}\big) \le c_t\Big(\frac tN\Big)^q.$$
(ii) For any $\alpha < \frac12$:
$$\lim_{h\to0} \frac{1}{h^\alpha}\sup_{k:\, t_k\le t} |X^h_{t_k} - X_{t_k}| = 0 \quad \text{a.s.}$$
(iii) If in addition $b, \sigma\in C^4_b$ (i.e. four times continuously differentiable with all derivatives up to fourth order bounded) and if $u\in C^4_p$ (i.e. four times continuously differentiable with all derivatives up to fourth order polynomially bounded), then
$$\big|E\big(u(X^h_{t_k})\big) - E\big(u(X_{t_k})\big)\big| \le c_t\cdot h.$$
The proof of the theorem can be found in the monograph [PK92] by Kloeden and Platen, where also much more refined numerical schemes are analyzed. A gentle introduction to the numerical approximation of stochastic differential equations driven by Wiener noise can also be found in the SIAM Review article by Higham [Hig01].
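The $h^{1/2}$ strong rate from Theorem C.5(ii) can be observed empirically for the geometric BM (C.2), whose exact solution is known and can be driven by the same Brownian increments as the scheme. Parameter values, grid sizes, and the Monte Carlo sample size are illustrative assumptions.

```python
import math
import random

random.seed(7)

r, sigma, x0, t = 0.05, 0.5, 1.0, 1.0   # illustrative GBM parameters

def strong_error(n_grid, paths=2_000):
    """Monte-Carlo estimate of E|X^h_t - X_t| at the final time, driving the
    Euler scheme and the exact solution with the same Brownian increments."""
    h = t / n_grid
    err = 0.0
    for _ in range(paths):
        x, w = x0, 0.0
        for _ in range(n_grid):
            dw = random.gauss(0.0, math.sqrt(h))
            x = x + r * x * h + sigma * x * dw   # Euler step
            w += dw                              # accumulate W_t
        exact = x0 * math.exp((r - 0.5 * sigma * sigma) * t + sigma * w)
        err += abs(x - exact)
    return err / paths

e_coarse, e_fine = strong_error(16), strong_error(256)
order = math.log(e_coarse / e_fine) / math.log(256 / 16)
print(round(e_coarse, 4), round(e_fine, 4), round(order, 2))  # order close to 1/2
```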
Bibliography

[AAF05] A. A. Faisal and S. B. Laughlin, Ion-channel noise places limits on the miniaturization of the brain's wiring, Current Biology (2005), no. 15, 1143–1149.
[AAF07] ——, Stochastic simulations on the reliability of action potential propagation in thin axons, PLoS Comput Biol. (2007), no. 3:e79.
[APP05] L. Alili, P. Patie, and J. L. Pedersen, Representations of the first hitting time density of an Ornstein-Uhlenbeck process, Stochastic Models (2005), no. 21, 967–980.
[EK84] S. N. Ethier and T. G. Kurtz, Markov processes – characterization and convergence, John Wiley & Sons, New York, 1984.
[FB02] N. Fourcaud and N. Brunel, Dynamics of the firing probability of noisy integrate-and-fire neurons, Neural Comput. (2002), no. 14, 2057–2110.
[H07] R. Höpfner, On a set of data for the membrane potential in a neuron, Math. Biosciences (2007), no. 207, 275–301.
[HA10] H. Alzubaidi, H. Gilsing, and T. Shardlow, Numerical simulations of SDEs and SPDEs from neural systems using SDELab, Stochastic Methods in Neuroscience (C. Laing and G. J. Lord, eds.), Oxford University Press, Oxford, 2010.
[Hig01] D. J. Higham, An algorithmic introduction to numerical simulation of stochastic differential equations, SIAM Review 43 (2001), no. 3, 525–546.
[Kle06] A. Klenke, Probability theory, Springer-Verlag, Berlin, 2006.
[LL87] P. Lansky and V. Lanska, Diffusion approximation of the neuronal model with synaptic reversal potentials, Biol. Cybern. (1987), no. 56, 19–26.
[Nob91] The Nobel Prize in Physiology or Medicine 1991, Nobelprize.org, 1991.
[Nor97] J. R. Norris, Markov chains, Cambridge University Press, Cambridge, 1997.
[PK92] P. Kloeden and E. Platen, Numerical solution of stochastic differential equations, Springer-Verlag, Berlin, 1992.
[SS16] M. Sauer and W. Stannat, Reliability of signal transmission in stochastic nerve axon equations, Journal of Computational Neuroscience (2016), no. 40, 103–111.
[Ste65] R. B. Stein, A theoretical analysis of neuronal variability, Biophys J. (1965), no. 5, 173–194.
[SV79] D. W. Stroock and S. R. S. Varadhan, Multidimensional diffusion processes, Grundlehren der Mathematischen Wissenschaften, vol. 233, Springer-Verlag, Berlin, 1979.
[VB91] C. A. Vandenberg and F. Bezanilla, A sodium channel gating model based on single channel, macroscopic ionic, and gating currents in the squid giant axon, Biophys J. (1991), no. 60, 1511–1533, doi:10.1016/S0006-3495(91)82186-5.