Lecture 1
TRANSCRIPT
Statistical Inference for Ergodic Diffusion Processes
Yu.A. Kutoyants
Laboratoire de Statistique & Processus,
Université du Maine,
72085 Le Mans Cedex 9,
FRANCE
Johannes Gutenberg University, Mainz, June-July 2008
Classical Statistics
We observe n independent copies X^n = (X_1, . . . , X_n) of the same r.v. with the density function f(x).
Parameter Estimation
We suppose that f(x) = f(ϑ, x), ϑ ∈ Θ = (α, β). Then the likelihood function is

L(ϑ, X^n) = ∏_{j=1}^n f(ϑ, X_j)

and the estimators (the MLE ϑ̂_n and the BE ϑ̃_n) are defined by the equations

L(ϑ̂_n, X^n) = sup_{ϑ∈Θ} L(ϑ, X^n),    ϑ̃_n = ∫_α^β ϑ p(ϑ | X^n) dϑ.
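As a quick numerical illustration (my own sketch, not part of the lecture; the exponential density f(ϑ, x) = ϑ e^{−ϑx} and the uniform prior on Θ are assumed choices), both estimators can be computed on a grid over Θ:

```python
# Sketch: MLE and Bayes estimator (posterior mean, uniform prior) for an
# assumed exponential family f(theta, x) = theta * exp(-theta * x) on Theta.
import numpy as np

rng = np.random.default_rng(0)
theta_true = 2.0
alpha, beta = 0.1, 10.0
n = 2000
X = rng.exponential(scale=1.0 / theta_true, size=n)

# MLE: maximize the log-likelihood over a grid (closed form is 1/mean(X)).
grid = np.linspace(alpha, beta, 20001)
loglik = n * np.log(grid) - grid * X.sum()      # log prod_j f(theta, X_j)
theta_mle = grid[np.argmax(loglik)]

# BE: posterior mean, p(theta | X^n) proportional to the likelihood on Theta.
w = np.exp(loglik - loglik.max())               # unnormalized posterior
theta_be = np.sum(grid * w) / np.sum(w)

print(theta_mle, theta_be)                      # both close to theta_true = 2
```

Both estimators approach the true ϑ as n grows, in line with the asymptotics below.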
Under regularity conditions (smooth f(ϑ, x)) these estimators are consistent, asymptotically normal,

√n (ϑ̂_n − ϑ) ⟹ N(0, I(ϑ)^{-1}),    √n (ϑ̃_n − ϑ) ⟹ N(0, I(ϑ)^{-1}),

and asymptotically efficient. Here

I(ϑ) = ∫ ḟ(ϑ, x)² / f(ϑ, x) µ(dx)

is the Fisher information (ḟ is the derivative of f w.r.t. ϑ).
In the case of non-smooth (w.r.t. ϑ) f(ϑ, x) these estimators have different rates. Say, if f(ϑ, x) = g(x − ϑ), where g(x) has a jump at some point x*, then

n (ϑ̂_n − ϑ) ⟹ û,    n (ϑ̃_n − ϑ) ⟹ ũ.

If the singularity is of cusp type, say, g(x) = |x − x*|^κ, then the rate depends on κ ∈ (0, 1/2):

n^γ (ϑ̂_n − ϑ) ⟹ û_γ,    n^γ (ϑ̃_n − ϑ) ⟹ ũ_γ.
Nonparametric Estimation
If f(x) is an unknown function, then we can consider the problems of distribution function F(x) and density f(x) estimation. In the first case the empirical distribution function

F̂_n(x) = (1/n) ∑_{j=1}^n 1I{X_j < x}

is consistent, √n-asymptotically normal and asymptotically efficient. In the second case the kernel-type estimators

f̂_n(x) = (1/(n h_n)) ∑_{j=1}^n K((X_j − x) / h_n)

have good properties (consistent, n^{k/(2k+1)}-asymptotically normal and asymptotically efficient in some sense).
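A minimal sketch of both nonparametric estimators (my illustration; the N(0,1) sample and the Gaussian kernel K are assumed choices):

```python
# Empirical distribution function and Gaussian-kernel density estimator
# for an i.i.d. N(0,1) sample (assumed example).
import numpy as np

rng = np.random.default_rng(1)
n = 5000
X = rng.standard_normal(n)

def F_hat(x):
    """Empirical d.f.: F_n(x) = (1/n) sum_j 1{X_j < x}."""
    return np.mean(X < x)

def f_hat(x, h):
    """Kernel estimator f_n(x) = (1/(n h)) sum_j K((X_j - x)/h), Gaussian K."""
    u = (X - x) / h
    return np.mean(np.exp(-0.5 * u * u) / np.sqrt(2 * np.pi)) / h

h_n = n ** (-1 / 5)                 # bandwidth of the rate-optimal order for k = 2
print(F_hat(0.0), f_hat(0.0, h_n))  # approx 0.5 and approx 1/sqrt(2*pi) = 0.399
```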
It is possible to construct a similar theory for the following continuous-time models of observations X^T = {X_t, 0 ≤ t ≤ T}.

• Gaussian processes

X_t = S(ϑ, t) + N(t),    0 ≤ t ≤ T,

where S(ϑ, t) is a signal and N(t) is Gaussian noise.

• Diffusion-type processes

dX_t = S_t(ϑ, X) dt + ε σ_t(X) dW_t,    0 ≤ t ≤ T,

with small noise (ε → 0), and ergodic diffusion processes

dX_t = S(ϑ, X_t) dt + σ(X_t) dW_t,    0 ≤ t ≤ T,

with asymptotics T → ∞.

• Point processes (mainly inhomogeneous Poisson processes of intensity S(ϑ, t)).
Model
Diffusion process
dX_t = S(X_t) dt + σ(X_t) dW_t,    X_0,  t ≥ 0,

with S(·) and σ(·) such that the process is ergodic with invariant density

f(x) = G(S)^{-1} σ(x)^{-2} exp{ 2 ∫_0^x S(y)/σ(y)² dy },

where G(S) is the normalizing constant. S(·) is unknown and σ(·) > 0 is known to the observer.

We consider two types of problems: parametric and nonparametric estimation from the observations {X_t, 0 ≤ t ≤ T} as T → ∞.
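The invariant-density formula can be checked by simulation. A sketch (my example, not from the lecture; the Ornstein-Uhlenbeck case S(x) = −x, σ(x) = 1 and the Euler-Maruyama scheme are assumptions):

```python
# Simulate dX = -X dt + dW and compare the time spent in (-0.5, 0.5) with the
# probability under the invariant density f(x) ~ exp(2 int_0^x (-y) dy) = exp(-x^2).
import numpy as np

rng = np.random.default_rng(2)
dt, T = 1e-3, 200.0
nsteps = int(T / dt)
X = np.empty(nsteps + 1)
X[0] = 0.0
dW = rng.standard_normal(nsteps) * np.sqrt(dt)
for i in range(nsteps):
    X[i + 1] = X[i] - X[i] * dt + dW[i]       # Euler step, S(x) = -x, sigma = 1

xs = np.linspace(-3.0, 3.0, 601)
dx = xs[1] - xs[0]
unnorm = np.exp(-xs ** 2)                      # here f is the N(0, 1/2) density
f = unnorm / (unnorm.sum() * dx)               # normalize: G(S)^{-1} numerically

mask = np.abs(xs) < 0.5
frac = np.mean(np.abs(X) < 0.5)                # occupation fraction of the path
prob = np.sum(f[mask]) * dx                    # invariant probability of (-0.5, 0.5)
print(frac, prob)                              # close for large T
```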
Correspondence
In classical statistics the properties of the estimators depend directly on the regularity of the density function f(ϑ, x). In the ergodic diffusion model the properties of the estimators depend on the regularity of the trend coefficient S(ϑ, x). Say, the Fisher informations in these two cases are

I(ϑ) = ∫ ( ḟ(ϑ, x) / f(ϑ, x) )² f(ϑ, x) dx,    I(ϑ) = ∫ ( Ṡ(ϑ, x) / σ(x) )² f(ϑ, x) dx.

In nonparametric estimation the problem of density estimation for the i.i.d. model is quite close to the problem of trend coefficient estimation. Note that the problem of distribution function estimation (i.i.d. case) is similar to the problem of invariant density estimation for the ergodic diffusion model.
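For illustration (my own example, not from the lecture), take the Ornstein-Uhlenbeck model dX_t = −ϑX_t dt + dW_t: then Ṡ(ϑ, x) = −x, σ = 1, the invariant law is N(0, 1/(2ϑ)), and the diffusion Fisher information reduces to I(ϑ) = Eξ² = 1/(2ϑ), which a numerical integration confirms:

```python
# Numerical check: I(theta) = int (dS/dtheta / sigma)^2 f(x) dx = 1/(2 theta)
# for the OU model with dS/dtheta = -x, sigma = 1, f = N(0, 1/(2 theta)).
import numpy as np

theta = 2.0
xs = np.linspace(-5.0, 5.0, 200001)
dx = xs[1] - xs[0]
var = 1.0 / (2.0 * theta)
f = np.exp(-xs ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)  # invariant density
I_num = np.sum((-xs) ** 2 * f) * dx                          # int x^2 f(x) dx
print(I_num, 1 / (2 * theta))                                # both 0.25
```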
Stochastic Integral
We are given a probability space (Ω, F, P), and {F_t, 0 ≤ t ≤ T} is an increasing family of σ-algebras (a filtration), i.e., for any 0 ≤ s < t ≤ T the inclusions F_s ⊂ F_t ⊂ F hold.

Let M_T be the class of progressively measurable random functions h(·) such that

P( ∫_0^T h(t, ω)² dt < ∞ ) = 1.

We say that h(·) ∈ M²_T if h(·) ∈ M_T and

E ∫_0^T h(t, ω)² dt < ∞.

A standard Wiener process is a continuous (with probability 1) Gaussian process with independent increments and with the following first two moments: E W_t = 0, E W_t W_s = t ∧ s.
The stochastic Ito integral

I_T(h) = ∫_0^T h(t, ω) dW_t

is defined for the functions h(·) ∈ M_T.

• If h(·) ∈ M²_T, then

E I_T(h) = 0,    E( I_T(h) | F_t ) = I_t(h).

• For any two functions h(·), g(·) ∈ M²_T,

E I_T(h) I_T(g) = E ∫_0^T h(t, ω) g(t, ω) dt.

In particular,

E I_T(h)² = E ∫_0^T h(t, ω)² dt.
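The first two properties can be checked by Monte Carlo. A sketch (my illustration, with the assumed choice h(t, ω) = W_t, for which E I_T(h)² = E ∫_0^T W_t² dt = T²/2):

```python
# Monte Carlo check of E I_T(h) = 0 and the Ito isometry for h(t, w) = W_t.
import numpy as np

rng = np.random.default_rng(3)
T, nsteps, npaths = 1.0, 500, 5000
dt = T / nsteps
dW = rng.standard_normal((npaths, nsteps)) * np.sqrt(dt)
W = np.cumsum(dW, axis=1) - dW       # Wiener process at the left endpoints
I = np.sum(W * dW, axis=1)           # Ito sums: h evaluated at left endpoints
print(I.mean(), (I ** 2).mean())     # approx 0 and T^2/2 = 0.5
```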
• If h(·) ∈ M_T, then for any δ > 0 and γ > 0

P( sup_{0≤t≤T} | ∫_0^t h(s, ω) dW_s | > δ ) ≤ γ/δ² + P( ∫_0^T h(t, ω)² dt > γ ).

• Let h(·) ∈ M²_T and for some m ≥ 1

E ∫_0^T |h(t, ω)|^{2m} dt < ∞.

Then

E |I_T(h)|^{2m} ≤ [m(2m − 1)]^m T^{m−1} E ∫_0^T |h(t, ω)|^{2m} dt.
Let g(t, ω) be F_t-adapted for almost all t ∈ [0, T],

P( ∫_0^T |g(t, ω)| dt < ∞ ) = 1,

and h(·) ∈ M_T. Then the stochastic process

X_t = X_0 + ∫_0^t g(s, ω) ds + ∫_0^t h(s, ω) dW_s,    0 ≤ t ≤ T,

is called an Ito process. Here X_0 is an F_0-measurable random variable. In shortened form it is usually written as

dX_t = g(t, ω) dt + h(t, ω) dW_t,    X_0,  0 ≤ t ≤ T.
The class of Ito processes is closed with respect to smooth transformations in the following sense. Let {X_t, 0 ≤ t ≤ T} be an Ito process with stochastic differential as above and let G(x, t) be a differentiable function with the following continuous derivatives: G′_t(x, t), G′_x(x, t), G′′_xx(x, t) (with obvious notation). Then the stochastic process Y_t = G(X_t, t), 0 ≤ t ≤ T, is an Ito process too, with the stochastic differential

dY_t = [ G′_t(X_t, t) + G′_x(X_t, t) g(t, ω) + (1/2) G′′_xx(X_t, t) h(t, ω)² ] dt + G′_x(X_t, t) h(t, ω) dW_t,    Y_0 = G(X_0, 0),  0 ≤ t ≤ T.

This equality is called the Ito formula, and it can be written as

dY_t = [ G′_t(X_t, t) + (1/2) G′′_xx(X_t, t) h(t, ω)² ] dt + G′_x(X_t, t) dX_t,

with the same initial value.
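A pathwise check of the Ito formula (my sketch; the choice G(x, t) = x², X_t = W_t, i.e. g = 0, h = 1, is an assumption), for which dY_t = dt + 2 W_t dW_t, i.e. W_T² = T + 2 ∫_0^T W_t dW_t:

```python
# Verify W_T^2 = T + 2 int_0^T W_t dW_t on one discretized Brownian path.
import numpy as np

rng = np.random.default_rng(4)
T, nsteps = 1.0, 200000
dt = T / nsteps
dW = rng.standard_normal(nsteps) * np.sqrt(dt)
W = np.concatenate(([0.0], np.cumsum(dW)))
ito_sum = np.sum(W[:-1] * dW)          # int_0^T W_t dW_t via left-point sums
lhs, rhs = W[-1] ** 2, T + 2 * ito_sum
print(lhs, rhs)                        # equal up to discretization error
```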
Let h(·) ∈ M_T and suppose that for some H > 0, with probability 1,

∫_0^T h(t, ω)² dt ≥ H.

Then the stopping time

τ_H = inf{ t : ∫_0^t h(s, ω)² ds ≥ H }

is well defined and

L{ I_{τ_H}(h) } = N(0, H),

i.e., I_{τ_H}(h) is a Gaussian random variable with mean zero and variance H.
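A Monte Carlo sketch of this fact (my illustration; h(t, ω) = W_t is an assumed choice, for which ∫_0^∞ W_t² dt = ∞ a.s., so τ_H is well defined):

```python
# Stop each path when int_0^t W_s^2 ds reaches H; the stopped Ito integral
# int_0^{tau_H} W_s dW_s should then be approximately N(0, H).
import numpy as np

rng = np.random.default_rng(8)
H, dt, npaths = 0.5, 1e-3, 3000
W = np.zeros(npaths)
quad = np.zeros(npaths)                # running int_0^t h^2 ds per path
I = np.zeros(npaths)                   # running Ito integral per path
active = np.ones(npaths, dtype=bool)
while active.any():
    dW = rng.standard_normal(npaths) * np.sqrt(dt)
    I[active] += W[active] * dW[active]
    quad[active] += W[active] ** 2 * dt
    W[active] += dW[active]
    active &= quad < H                 # freeze paths that reached H
print(I.mean(), I.var())               # approx 0 and H = 0.5
```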
Homogeneous diffusion processes
We are given two functions S(x), σ(x) and the stochastic differential equation

dX_t = S(X_t) dt + σ(X_t) dW_t,    X_0.

GL. (Globally Lipschitz condition) There exists a constant L such that

|S(x) − S(y)| + |σ(x) − σ(y)| ≤ L |x − y|

for all x, y ∈ IR.

Note that by this condition the functions S(·) and σ(·) satisfy the linear growth condition

|S(x)| + |σ(x)| ≤ |S(0)| + |σ(0)| + L |x| ≤ L (1 + |x|)

too.
Theorem 1. Let the condition GL be fulfilled and P(|X_0| < ∞) = 1. Then this equation has a unique (strong) solution {X_t, 0 ≤ t ≤ T}, continuous with probability 1. If moreover E X_0^{2m} < ∞, then

E X_t^{2m} ≤ (1 + E X_0^{2m}) e^{c_m t} − 1,

where c_m is some positive constant.
Local time
Let us consider a homogeneous diffusion process
dX_t = S(X_t) dt + σ(X_t) dW_t,    X_0,  0 ≤ t ≤ T.

The local time of this diffusion process, denoted by Λ_T(x), is defined as the following limit (with probability 1):

Λ_T(x) = lim_{ε↓0} meas{ t : |X_t − x| ≤ ε, 0 ≤ t ≤ T } / (4ε),    T ≥ 0,  x ∈ IR,

and by the Tanaka-Meyer formula it admits the representation

|X_T − x| = |X_0 − x| + ∫_0^T sgn(X_t − x) dX_t + 2 Λ_T(x).
Let h(·) be a measurable function. Then with probability 1

∫_0^T h(X_t) σ(X_t)² dt = 2 ∫_{−∞}^{∞} h(x) Λ_T(x) dx.

We will use this equality in a different form. Let us denote

f°_T(x) = 2 Λ_T(x) / (T σ(x)²)

and remember that the function σ(x)² is supposed to be positive. The statistic f°_T(x) we call the local time estimator of the invariant density. Then

(1/T) ∫_0^T h(X_t) dt = ∫_{−∞}^{∞} h(x) f°_T(x) dx → ∫_{−∞}^{∞} h(x) f(x) dx.
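The local time estimator can be computed from a discretized path via the Tanaka-Meyer representation. A sketch (my example, not from the lecture; the OU model with S(x) = −x, σ = 1 and the Euler scheme are assumptions):

```python
# Estimate the invariant density at x = 0 via the local time, computing
# Lambda_T(x) from the discretized Tanaka-Meyer formula.
import numpy as np

rng = np.random.default_rng(5)
dt, T = 1e-3, 500.0
n = int(T / dt)
dW = rng.standard_normal(n) * np.sqrt(dt)
X = np.empty(n + 1)
X[0] = 0.0
for i in range(n):                      # Euler scheme for dX = -X dt + dW
    X[i + 1] = X[i] - X[i] * dt + dW[i]

x = 0.0
dX = np.diff(X)
# Lambda_T(x) = ( |X_T - x| - |X_0 - x| - int sgn(X_t - x) dX_t ) / 2
Lam = 0.5 * (abs(X[-1] - x) - abs(X[0] - x) - np.sum(np.sign(X[:-1] - x) * dX))
f_lt = 2 * Lam / T                      # f_T(x) = 2 Lambda_T(x) / (T sigma(x)^2)
f_true = np.sqrt(1 / np.pi) * np.exp(-x ** 2)   # invariant N(0, 1/2) density at x
print(f_lt, f_true)                     # close for large T
```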
Likelihood ratio
Let us consider two stochastic differential equations

dX_t = S_1(X_t) dt + σ(X_t) dW_t,    X_0^{(1)},  0 ≤ t ≤ T,
dX_t = S_2(X_t) dt + σ(X_t) dW_t,    X_0^{(2)},  0 ≤ t ≤ T,

and denote by P_1^{(T)} and P_2^{(T)} the probability measures induced in (C_T, B_T) by the solutions of these equations respectively. The likelihood ratio is

dP_2^{(T)}/dP_1^{(T)} (X^T) = ( f_2(X_0) / f_1(X_0) ) exp{ ∫_0^T ( S_2(X_t) − S_1(X_t) ) / σ(X_t)² dX_t − (1/2) ∫_0^T ( S_2(X_t)² − S_1(X_t)² ) / σ(X_t)² dt }.
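Applied to a parametric family S(ϑ, x) = −ϑx with σ = 1 (my example, not from the lecture), the exponent above gives the log-likelihood ℓ(ϑ) = −ϑ ∫ X_t dX_t − (ϑ²/2) ∫ X_t² dt (dropping the initial-value term), maximized by ϑ̂_T = −∫_0^T X_t dX_t / ∫_0^T X_t² dt. A sketch:

```python
# Continuous-time MLE for the OU drift, computed on a discretized path.
import numpy as np

rng = np.random.default_rng(6)
theta, dt, T = 1.0, 1e-3, 500.0
n = int(T / dt)
X = np.empty(n + 1)
X[0] = 0.0
dW = rng.standard_normal(n) * np.sqrt(dt)
for i in range(n):                      # Euler scheme for dX = -theta X dt + dW
    X[i + 1] = X[i] - theta * X[i] * dt + dW[i]

dX = np.diff(X)
theta_hat = -np.sum(X[:-1] * dX) / np.sum(X[:-1] ** 2 * dt)
print(theta_hat)                        # close to theta = 1, error O(1/sqrt(T))
```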
Limit theorems
We are given a homogeneous diffusion process

dX_t = S(X_t) dt + σ(X_t) dW_t,    X_0 = x_0,  t ≥ 0.

The statistical inference for such models is essentially based on two limit theorems: the law of large numbers (LLN) for ordinary integrals and the central limit theorem (CLT) for stochastic and ordinary integrals. Note that the CLT for the ordinary integral is a consequence of the CLT for stochastic integrals.

Law of Large Numbers

Let τ_a = inf{ t ≥ 0 : X_t = a } and τ_ab = inf{ t ≥ τ_a : X_t = b }. We say that the stochastic process X = {X_t, t ≥ 0} is recurrent if P(τ_ab < ∞) = 1 for all a, b ∈ IR. The recurrent process X is called positive recurrent if E τ_ab < ∞ for all a, b ∈ IR and is called null recurrent if E τ_ab = ∞ for all a, b ∈ IR.
Theorem 2. The process X is recurrent if and only if

V(x) = ∫_0^x exp{ −2 ∫_0^y S(u)/σ(u)² du } dy → ±∞  as  x → ±∞.

The recurrent process X is positive if and only if

G = ∫_{−∞}^{∞} σ(y)^{-2} exp{ 2 ∫_0^y S(z)/σ(z)² dz } dy < ∞.

The process X is null recurrent if it is recurrent and

G = ∫_{−∞}^{∞} σ(y)^{-2} exp{ 2 ∫_0^y S(z)/σ(z)² dz } dy = ∞.
Examples.
dX_t = −ϑ_1 (X_t − ϑ_2) dt + σ dW_t,

dX_t = − ϑ_1 X_t³ / (1 + ϑ_2 X_t²) dt + σ dW_t,

dX_t = −ϑ_1 X_t [1 + γ sin(ϑ_2 X_t)] dt + σ dW_t,

dX_t = [ −ϑ_1 X_t³ + ϑ_2 X_t ] dt + √(1 + X_t²) dW_t,

dX_t = − sgn(X_t − ϑ) dt + dW_t,    0 ≤ t ≤ T,

dX_t = −X_t ( a + b χ{ϑ < X_t < c + ϑ} ) dt + dW_t,

dX_t = −X_t ( a + b χ{ϑ_1 < X_t < ϑ_2} ) dt + dW_t.
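For the first example with ϑ_1 = 1, ϑ_2 = 0, σ = 1 (assumed values, my illustration), the conditions of Theorem 2 can be checked numerically: V(x) = ∫_0^x e^{y²} dy → ±∞ while G = ∫ e^{−y²} dy = √π < ∞, so the process is positive recurrent:

```python
# Check the recurrence conditions of Theorem 2 for S(x) = -x, sigma = 1.
import numpy as np

xs = np.linspace(-10.0, 10.0, 400001)
dx = xs[1] - xs[0]
# exp{-2 int_0^y S(u) du} = exp(y^2):  V(x) diverges as x -> +inf
V_plus = np.sum(np.exp(xs[xs >= 0.0] ** 2) * dx)
# exp{2 int_0^y S(z) dz} = exp(-y^2):  G is finite
G = np.sum(np.exp(-xs ** 2) * dx)
print(G)                         # approx sqrt(pi) = 1.7725 -> positive recurrent
```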
For the positive recurrent diffusion process we have the Law of Large Numbers: for any function h(·) such that E |h(ξ)| < ∞ we have (with probability 1)

(1/T) ∫_0^T h(X_t) dt → E h(ξ) = ∫ h(x) f(x) dx.

The random variable ξ has the “invariant” density f(x).
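A sketch of the LLN (my example: the OU case S(x) = −x, σ = 1, invariant law N(0, 1/2), and h(x) = x² are all assumed choices):

```python
# Time average of h(X_t) = X_t^2 along one long path vs. E h(xi) = 1/2.
import numpy as np

rng = np.random.default_rng(9)
dt, T = 1e-3, 500.0
n = int(T / dt)
dW = rng.standard_normal(n) * np.sqrt(dt)
X = np.empty(n + 1)
X[0] = 0.0
for i in range(n):                      # Euler scheme for dX = -X dt + dW
    X[i + 1] = X[i] - X[i] * dt + dW[i]

time_avg = np.mean(X[:-1] ** 2)         # (1/T) int_0^T X_t^2 dt
print(time_avg)                         # approx E xi^2 = 1/2 for xi ~ N(0, 1/2)
```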
Central Limit Theorem
Let h(·, ω) ∈ M_T. Then the stochastic integral

∫_0^T h(t, ω) dW_t

is well defined and we have the following

Theorem 3 (Central Limit Theorem). Suppose that there exists a (nonrandom) function φ_T and a positive constant ϱ such that

P-lim_{T→∞} φ_T² ∫_0^T h(t, ω)² dt = ϱ² < ∞.

Then

L{ φ_T ∫_0^T h(t, ω) dW_t } ⟹ N(0, ϱ²).
Theorem 4 (CLT for ordinary integrals). Let h(·) be a measurable function such that E |h(ξ)| < ∞ and E h(ξ) = 0. If

δ² = 4 E( (1 / (σ(ξ) f(ξ))) ∫_{−∞}^{ξ} h(v) f(v) dv )² < ∞,

then

L{ (1/√T) ∫_0^T h(X_t) dt } ⟹ N(0, δ²).
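A sketch checking Theorem 4 (my example; the OU case S(x) = −x, σ = 1 and h(x) = x are assumed). Here f is the N(0, 1/2) density and ∫_{−∞}^x v f(v) dv = −(1/2) f(x), so δ² = 4 E(1/2)² = 1:

```python
# Monte Carlo: T^{-1/2} int_0^T X_t dt over many OU paths should be approx N(0, 1).
import numpy as np

rng = np.random.default_rng(7)
dt, T, npaths = 0.01, 100.0, 2000
n = int(T / dt)
X = np.zeros(npaths)
integral = np.zeros(npaths)
for i in range(n):
    integral += X * dt                  # left-point Riemann sum of int X_t dt
    X += -X * dt + rng.standard_normal(npaths) * np.sqrt(dt)
vals = integral / np.sqrt(T)
print(vals.mean(), vals.var())          # approx 0 and delta^2 = 1
```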