Appendix B1
Measure and Measurability
σ-algebra
Let Ω denote a set of elements of interest, which is referred to
as a space, or sample space in the statistical context.
A set Ω is said to be countable if its elements can be listed as a
sequence: Ω = {ω_1, ω_2, . . .}. Otherwise Ω is uncountable.
In particular, any finite set is countable. Any interval [a, b]
with a < b is uncountable.
A space Ω is said to be discrete if it is countable, or continuous
if it is an interval or a product of intervals, such as:
Ω = R = (−∞, ∞), Ω = [0, ∞), Ω = [0, 1],
Ω = R^n = {(x_1, . . . , x_n) : x_1, . . . , x_n ∈ R},
Ω = [0, ∞)^2 = {(x, y) : x, y ≥ 0}, or
Ω = [0, ∞) × [0, 1] = {(x, y) : x ≥ 0, 0 ≤ y ≤ 1}, etc.
A collection F of subsets of a space Ω is called a σ-algebra or
σ-field if it satisfies the following axioms:
(i) The empty set ∅ ∈ F;
(ii) If E ∈ F, then its complement E^c ∈ F;
(iii) If E_i ∈ F, i = 1, 2, . . ., then ∪_{i=1}^∞ E_i ∈ F.
-
The three axioms above further imply:
(iv) The state space Ω ∈ F;
(v) If E ∈ F and F ∈ F, then E ∪ F ∈ F, E ∩ F ∈ F, and
E \ F = E ∩ F^c ∈ F;
(vi) If E_i ∈ F, i = 1, 2, . . ., then ∩_{i=1}^∞ E_i ∈ F.
In summary, a σ-algebra is nonempty and closed under finite or
countable operations of unions, intersections and complements.
In other words, it is a self-contained collection of subsets.
Given a collection G of subsets of Ω, the smallest σ-algebra F
such that G ⊆ F is called the σ-algebra generated by G, and
denoted by F = σ(G).
The smallest σ-algebra is F = σ({∅}) = {∅, Ω}. The largest
σ-algebra is the collection of all subsets of Ω, denoted by 2^Ω.
Given E ⊆ Ω (E ≠ ∅, Ω), the smallest σ-algebra that includes E
as a member is F = σ({E}) = {∅, E, E^c, Ω}.
Obviously, if G_1 ⊆ G_2, then σ(G_1) ⊆ σ(G_2).
If Ω is a discrete space, then every subset E of Ω can be expressed
as a countable union of single-element sets:
E = ∪_{ω∈E} {ω}
As a result, if a σ-algebra F on a discrete space Ω includes all
single-point sets {ω} for ω ∈ Ω, then F = 2^Ω.
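On a small finite space the identity σ({E}) = {∅, E, E^c, Ω} can be checked by brute force, closing a family of subsets under complements and unions. The following Python sketch is illustrative only; the space Ω = {1, 2, 3, 4} and the generator E = {1, 2} are arbitrary choices, not taken from the text.

```python
from itertools import combinations

def generate_sigma_algebra(omega, generators):
    """Close a collection of subsets of a finite space omega under
    complements and (finite) unions, starting from the generators.
    On a finite space this yields the generated sigma-algebra."""
    family = {frozenset(), frozenset(omega)} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # close under complement
        for a in current:
            c = frozenset(omega) - a
            if c not in family:
                family.add(c)
                changed = True
        # close under pairwise union (sufficient on a finite space)
        for a, b in combinations(current, 2):
            u = a | b
            if u not in family:
                family.add(u)
                changed = True
    return family

omega = {1, 2, 3, 4}
E = {1, 2}
F = generate_sigma_algebra(omega, [E])
# sigma({E}) = {emptyset, E, E^c, Omega}, matching the formula above
print(sorted(sorted(s) for s in F))
```

Running the same closure with all singletons as generators recovers the full power set 2^Ω, in line with the discrete-space remark above.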
-
Measurable sets
Given a space Ω and a σ-algebra F on Ω, a subset E of Ω is said
to be measurable with respect to F if and only if E ∈ F.
By the definition of a σ-algebra, any countable union, intersection
and complement of measurable sets is also measurable.
Borel field and sets
If Ω = R and G = {[a, b] : a, b ∈ R}, then B = σ(G) is called
the Borel algebra or Borel field on R. Equivalently, the intervals
[a, b] can be replaced by (a, b), (a, b], [a, b), (−∞, b], etc.
Similarly we can define the Borel field on [0, ∞) or any other
interval, as well as on R^n by the products of intervals:
B = σ({∏_{i=1}^n [a_i, b_i] : a_i, b_i ∈ R, i = 1, . . . , n})
Any A ∈ B is called a Borel set, or said to be Borel measurable.
In particular, any single-point set {a} = [a, a] is a Borel set, and
consequently, every countable subset of R is a Borel set as well.
The Borel field B on an interval I must include all intervals
and countable unions, intersections and complements of intervals
contained in I. It is quite large and includes all subsets we need
to define probability, but it does not include every subset of an
interval; in other words, B ≠ 2^I. There exist many non-Borel
sets, but it is not easy to give an explicit example.
-
Measure
Given a space Ω and a σ-algebra F on Ω, a set function m(·)
defined on F is said to be a measure if it satisfies the following
two axioms:
(i) m(E) ≥ 0 for any E ∈ F;
(ii) If E_1, E_2, . . . ∈ F are disjoint or mutually exclusive in the
sense that E_i ∩ E_j = ∅ for i ≠ j, then
m(∪_{i=1}^∞ E_i) = ∑_{i=1}^∞ m(E_i)
A measure defined on the Borel field is called a Borel measure.
Among the most useful Borel measures is the Lebesgue measure,
which assigns measure
m([a, b]) = m((a, b)) = m((a, b]) = m([a, b)) = b − a
to any interval in R. In particular, for any a ∈ R,
m({a}) = m([a, a]) = a − a = 0
Consequently, every countable subset of R has zero measure.
The Lebesgue measure is defined on a σ-algebra F larger than
the Borel field B. There exist, however, subsets of an interval I
to which even the Lebesgue measure cannot be assigned. That
explains why 2^I is too large to be a useful σ-algebra.
For the purpose of probability, the Borel field together with the
Lebesgue measure is sufficient.
-
Measurable functions
A real-valued function g(x) defined on R is said to be measurable
if g^{-1}(A) = {x : g(x) ∈ A} is a Borel set for every Borel set A.
In particular, an indicator function g(x) = I{x ∈ B} of any B ∈ B
is measurable since g^{-1}(A) = ∅, B, B^c or R for any A ∈ B.
A real-valued function g(x) defined on R is said to be Riemann
integrable if the integral ∫_R g(x)dx is well-defined in the usual
sense of calculus. This integral is referred to as the Riemann
integral.
A property is said to hold almost everywhere if it holds on a
Borel set B such that m(B^c) = 0 under the Lebesgue measure.
A function is Riemann integrable if and only if it is continuous
almost everywhere.
An almost everywhere continuous function g(x) is measurable.
Consequently, all analytic, continuous, piecewise continuous, and
Riemann integrable functions are measurable.
In fact, all functions of practical interest are measurable.
A simple example of a measurable function that is not Riemann
integrable is the indicator of the set Q of rational numbers:
I_Q(x) = I{x ∈ Q} = 1 if x is a rational number, and 0 if x is
an irrational number.
Since Q is countable, it is a Borel set and so I_Q(x) is measurable.
It is, however, nowhere continuous, hence not Riemann integrable.
-
Lebesgue integral
The Lebesgue integral is defined for measurable functions.
If g(x) is a Riemann integrable function, its Lebesgue integral
coincides with its Riemann integral. Hence we will use the same
notation for Lebesgue and Riemann integrals.
If g(x) = α_1 I_{A_1}(x) + · · · + α_k I_{A_k}(x) is a linear
combination of indicators of Borel sets, its Lebesgue integral is
defined by
∫_R g(x)dx = ∑_{i=1}^k α_i m(A_i)
where m(·) is the Lebesgue measure. In particular, if Q is the
set of rational numbers, then ∫_R I_Q(x)dx = m(Q) = 0.
For a measurable function g(x) ≥ 0, there exists a sequence of
linear combinations {g_n(x)} of indicators of Borel sets such that
g_n(x) ↑ g(x). The Lebesgue integral of g(x) is then defined by
∫_R g(x)dx = lim_{n→∞} ∫_R g_n(x)dx
A measurable function g(x) is said to be Lebesgue integrable if
∫_R |g(x)|dx = ∫_R g^+(x)dx + ∫_R g^−(x)dx < ∞
where g^+(x) = max(g(x), 0) and g^−(x) = max(−g(x), 0).
-
Probability space and random variable
For a continuous state space S, by a measurable set A ⊆ S we
mean A ∈ B; that is, A is Borel measurable.
Let Ω be a sample space (of all possible outcomes) and F be a
σ-algebra of subsets of Ω. Each E ∈ F is called an event.
A probability (measure) is a measure Pr(·) defined on F such that
Pr(Ω) = 1. The probability Pr(E) is defined only for E ∈ F.
The triplet (Ω, F, Pr(·)) is called a probability space.
A real-valued function X = X(ω) of ω ∈ Ω is said to be a random
variable if and only if {X ∈ A} ∈ F for every Borel set A.
If X is a random variable and g(x) is a measurable function,
then g(X) is also a random variable.
If X is a continuous random variable, then its state space S is
an interval and Pr(X ∈ A) is defined only for Borel sets A ⊆ S.
Given a cdf F(x), we can define Pr(X ≤ x) = F(x) and extend
it to Pr(X ∈ A) for any Borel set A via the axioms or properties
of the probability, such as
Pr(X ∈ (a, b]) = Pr(a < X ≤ b) = Pr(X ≤ b) − Pr(X ≤ a)
Pr(X ∈ (−∞, b)) = Pr(X < b) = lim_{n→∞} Pr(X ≤ b − n^{-1})
and so on, as the Borel field B is generated by {(−∞, x] : x ∈ R}.
Such extensions may not be possible if A ∉ B.
This explains why a cdf can determine a probability distribution.
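The extension rules above can be checked numerically for a concrete cdf. The sketch below uses the standard exponential cdf F(x) = 1 − e^{−x}, an illustrative choice not taken from the text:

```python
import math

def F(x):
    """cdf of the standard exponential distribution: Pr(X <= x)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

# Pr(X in (a, b]) = Pr(a < X <= b) = F(b) - F(a)
a, b = 1.0, 2.0
p_interval = F(b) - F(a)
print(p_interval)            # e^{-1} - e^{-2}, about 0.2325

# Pr(X < b) = lim_n Pr(X <= b - 1/n), approximated with a small gap;
# for a continuous cdf this coincides with F(b)
p_open = F(b - 1e-9)
print(abs(p_open - F(b)))    # near 0: F is continuous at b
```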
-
Appendix B2
Stationary Distribution
Existence
Let {X_n : n = 0, 1, . . .} be a Markov chain with a finite state
space S = {1, 2, . . . , N} and transition matrix P = (p_ij)_{N×N}.
Since p_i1 + · · · + p_iN = 1, i = 1, . . . , N, the matrix
I − P = [ 1 − p_11    −p_12   . . .    −p_1N  ]
        [  −p_21    1 − p_22  . . .    −p_2N  ]
        [   . . .                             ]
        [  −p_N1     −p_N2    . . .  1 − p_NN ]
has a zero sum of elements in each row.
Thus the N columns of I − P are linearly dependent, implying
Rank(I − P) < N. Consequently, the equation π(I − P) = 0, or
π = πP, has at least one solution π ≠ 0.
For convenience, we write π > 0 for a vector π = (π_1, . . . , π_N)
(either a row or a column) if π_j ≥ 0 for all j = 1, . . . , N and
π_j > 0 for some j. We also write π < 0 if −π > 0.
We now prove the existence of a row vector π > 0 such that
π = πP by induction on the number N of states.
Start with N = 1. In this case, we must have P = 1, hence
π = 1 obviously satisfies π = πP.
-
For N > 1, assume there exists such a π > 0 for any chain with
k states, 1 ≤ k < N. We then need to prove the case with N states.
Let π ≠ 0 be a solution to π = πP. If neither π > 0 nor π < 0,
we can write π = [π_1 π_2] and
P = [ P_11  P_12 ]
    [ P_21  P_22 ]      (B2.1)
where π_1 is a vector of k negative elements (1 ≤ k < N), π_2 > 0
has N − k elements, and P_11 is a k × k matrix.
If P_12 = 0, then P_11 is a k × k transition matrix (with the
elements of each row adding to 1). Hence by the induction
assumption, there exists π′ > 0 such that π′ = π′P_11.
Consequently,
[π′ 0]P = [π′ 0] [ P_11   0   ] = [π′P_11  0] = [π′ 0]
                 [ P_21  P_22 ]
Thus π = [π′ 0] > 0 satisfies π = πP.
If P_12 ≠ 0, then p_ij > 0 for some i ≤ k and j > k, so that
p_i1 + · · · + p_ik < p_i1 + · · · + p_ik + p_ij ≤ 1.
Let 1_k = [1 1 · · · 1]^T denote the k × 1 vector with all elements
equal to 1. Then
(I − P_11)1_k = [ 1 − p_11 − · · · − p_1k ]
                [         . . .          ]  > 0
                [ 1 − p_k1 − · · · − p_kk ]
Thus π_1(I − P_11)1_k < 0 since all elements of π_1 are negative.
-
On the other hand, by (B2.1),
π = πP ⟹ π_1 = π_1 P_11 + π_2 P_21 ⟹ π_1(I − P_11) = π_2 P_21
This together with π_2 > 0 leads to a contradiction:
0 > π_1(I − P_11)1_k = π_2 P_21 1_k ≥ 0
It follows that when P_12 ≠ 0, either π > 0 or π < 0. Thus we
can take either π (if π > 0) or −π (if −π > 0) to satisfy π = πP.
We have shown the existence of π > 0 such that π = πP with
N states. By the principle of mathematical induction, this holds
for all N = 1, 2, . . ..
Let π* = (π*_1, . . . , π*_N) > 0 be a row vector such that π* = π*P.
Then π*_1 + · · · + π*_N > 0.
Take π = [π_1 · · · π_N] with
π_j = π*_j / (π*_1 + · · · + π*_N), j = 1, 2, . . . , N.
Then π > 0, π_1 + · · · + π_N = 1 and
π(I − P) = π*(I − P) / (π*_1 + · · · + π*_N) = 0 ⟹ π = πP
Thus π is a stationary distribution of {X_n}.
This shows that a Markov chain with a finite state space must
have at least one stationary distribution.
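For an irreducible aperiodic chain, the stationary distribution guaranteed above can be found numerically by iterating π(n+1) = π(n)P. A pure-Python sketch follows; the 3-state matrix is an illustrative choice, not from the text:

```python
def row_times_matrix(pi, P):
    """Compute the row-vector product pi * P for a square matrix P."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

# An illustrative irreducible, aperiodic transition matrix
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]

pi = [1.0, 0.0, 0.0]          # any initial distribution works
for _ in range(200):          # power iteration: pi(n) = pi(0) P^n
    pi = row_times_matrix(pi, P)

print(pi)  # approaches the stationary distribution (0.25, 0.5, 0.25)
residual = max(abs(pi[j] - row_times_matrix(pi, P)[j]) for j in range(3))
print(residual)  # near 0, confirming pi = pi P
```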
-
Uniqueness
Suppose there exist two stationary distributions π ≠ π′. Then
π* = π − π′ ≠ 0 and
π*P = (π − π′)P = πP − π′P = π − π′ = π*
Thus π* is a non-zero solution to the equation π* = π*P.
Since π and π′ each have elements summing to 1, the elements
of π* sum to 0. Hence neither π* > 0 nor π* < 0.
Then a partition of P as in (B2.1) must have P_12 = 0 (otherwise
the argument above would force π* > 0 or π* < 0), leading to
P = [ P_11   0   ]
    [ P_21  P_22 ]
where P_11 is a k × k matrix with 1 ≤ k < N.
It follows that the transition matrix at time n has the form
P(n) = P^n = [ P_11^n      0     ]
             [ P_21^(n)  P_22^n  ]      (B2.2)
where P_21^(n) and P_22^n are two matrices of orders (N − k) × k
and (N − k) × (N − k) respectively.
From (B2.2) we see that p_1N(n) = 0 for all n. This shows that
the Markov chain with transition matrix P is reducible.
Therefore, two distinct stationary distributions are possible only
for a reducible Markov chain.
Consequently, an irreducible Markov chain with finitely many
states must have a unique stationary distribution.
-
Appendix B3
Convergence of Markov Chain
For any time-homogeneous Markov chain with finite N states
and transition matrices P = (p_ij)_{N×N} and P^n = (p_ij(n))_{N×N},
it has been theoretically proved that lim_{n→∞} p_ij(n) exists (i.e.,
p_ij(n) converges) for every aperiodic state j. Thus if P is
aperiodic (every state of P is aperiodic), then P^n converges.
If P is irreducible, then P^∞ = lim_{n→∞} P^n exists if and only if
P is aperiodic. This is because each row of P^∞ is the unique
stationary distribution π = (π_1, . . . , π_N) of P with π_j > 0 for
at least one state j. Therefore,
lim_{n→∞} P^n = P^∞ ⟹ lim_{n→∞} p_jj(n) = π_j > 0 ⟹ p_jj(n) > 0
for all sufficiently large n. This means that state j is aperiodic
(a periodic state j must have p_jj(n) = 0 for infinitely many n),
and so P is aperiodic since it is irreducible.
If all states of P are periodic, then P^n must diverge. To see this,
let P_11 be an irreducible block of P that is a transition matrix
itself. Then P_11^n diverges as P_11 is periodic, hence P^n diverges.
However, individual p_ij(n) may converge for some (i, j) even if
every state of P is periodic. A trivial example is a reducible P,
which has some states i, j such that p_ij(n) = 0 for all n.
If P has both periodic and aperiodic states (such a P must be
reducible), then P^n may or may not converge.
-
A simple example with divergent P^n is
P = [ 0 1 0 ]
    [ 1 0 0 ]   ⟹   P^n = I_3 for even n, and P^n = P for odd n
    [ 0 0 1 ]
Here states 1 and 2 are periodic with d = 2, whereas state 3 is
aperiodic. As P^n oscillates between I_3 and P, it diverges.
The next example shows that P^n may converge:
P = [  0   0.5  0.5 ]
    [ 0.5   0   0.5 ]
    [  0    0    1  ]
⟹ P^n = [ 0.5^n    0     1 − 0.5^n ]
         [   0    0.5^n   1 − 0.5^n ]
         [   0      0         1     ]
for even n, and
P^n = [   0    0.5^n   1 − 0.5^n ]
      [ 0.5^n    0     1 − 0.5^n ]
      [   0      0         1     ]
for odd n.
Thus states 1 and 2 are periodic (d = 2) and state 3 is aperiodic.
P^n obviously converges in this case:
P^∞ = lim_{n→∞} P^n = [ 0 0 1 ]
                      [ 0 0 1 ]
                      [ 0 0 1 ]
where each row of P^∞ is the unique stationary distribution of P.
This example also shows that lim_{n→∞} p_ij(n) = 0 can exist for
periodic states j = 1, 2 beyond the trivial case p_ij(n) = 0 for
all n.
Even if P^n diverges, π(n) = π(0)P^n may converge for some π(0)
that satisfies certain conditions. This is demonstrated in the
following example.
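The closed form of P^n in the second example can be verified by direct matrix multiplication, a quick numerical sketch in Python:

```python
def matmul(A, B):
    """Multiply two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0]]

Pn = P
for _ in range(9):            # compute P^10 (an even power)
    Pn = matmul(Pn, P)

# Stated formula for even n: p11(n) = p22(n) = 0.5^n,
# p12(n) = p21(n) = 0, p13(n) = p23(n) = 1 - 0.5^n
expected = [[0.5**10, 0.0, 1 - 0.5**10],
            [0.0, 0.5**10, 1 - 0.5**10],
            [0.0, 0.0, 1.0]]
err = max(abs(Pn[i][j] - expected[i][j])
          for i in range(3) for j in range(3))
print(err)   # near 0: the formula holds, and P^n -> rows (0, 0, 1)
```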
-
Consider a transition matrix of the form
P = [ A  B ]         with D = [ 0 1 ]         (N > 2)
    [ 0  D ]_{N×N}            [ 1 0 ]_{2×2}
Assume that A^n → 0 as n → ∞. Then (I − A^2)^{-1} exists.
Note that
P^2 = [ A^2  AB + BD ]
      [  0     I_2   ]
has aperiodic states N − 1 and N. Hence
lim_{m→∞} P^{2m} = lim_{m→∞} (P^2)^m = [ 0   Q  ]  for some Q  (B3.1)
                                       [ 0  I_2 ]
(as p_ij(2m) converges for aperiodic states j = N − 1 and N).
Since P^{2m+2} has the same limit as P^{2m}, (B3.1) implies
[ 0   Q  ] = P^2 [ 0   Q  ] = [ A^2  AB + BD ] [ 0   Q  ]
[ 0  I_2 ]       [ 0  I_2 ]   [  0     I_2   ] [ 0  I_2 ]
⟹ Q = A^2 Q + AB + BD ⟹ (I − A^2)Q = AB + BD
⟹ Q = (I − A^2)^{-1}(AB + BD)      (B3.2)
It also follows from (B3.1) that
lim_{m→∞} P^{2m+1} = [ 0   Q  ] P = [ 0  QD ]      (B3.3)
                     [ 0  I_2 ]     [ 0  D  ]
(B3.1) and (B3.3) show that P^n diverges as D ≠ I_2.
Let π_1 be a 1 × (N − 2) vector of nonnegative elements with sum
no more than 0.5, and π_2 = [0.5 0.5] − π_1 Q. Then
π_1 Q + π_2 = [0.5 0.5] = [0.5 0.5]D = (π_1 Q + π_2)D      (B3.4)
-
It follows from (B3.1), (B3.3) and (B3.4) that
lim_{m→∞} [π_1 π_2]P^{2m} = [0  π_1 Q + π_2] = [0  π_1 QD + π_2 D]
= lim_{m→∞} [π_1 π_2]P^{2m+1}      (B3.5)
where 0 is a 1 × (N − 2) vector of zeros. This shows that the
limit lim_{n→∞} [π_1 π_2]P^n exists.
Since each row of the matrix Q must have sum equal to 1, it
is not difficult to see that π(0) = [π_1 π_2] provides an initial
distribution such that
lim_{n→∞} π(n) = lim_{n→∞} π(0)P^n = lim_{n→∞} [π_1 π_2]P^n = [0 0.5 0.5]
Therefore, if π_1 and π_2 have nonnegative elements and satisfy
the condition π_1 Q + π_2 = [0.5 0.5], then π(n) converges with
π(0) = [π_1 π_2], although P^n diverges.
The domain for π(0) = [π_1 π_2] to meet the above conditions
has dimension N − 2, which is one less than the dimension
N − 1 for π(0) without such conditions.
The above example can also show that lim_{n→∞} p_ij(n) > 0 is
possible for a periodic state j. To see this, take N = 3, A = 0.8
and B = [0.1 0.1]. Then by (B3.2),
Q = (1 − 0.8^2)^{-1}(0.8 [0.1 0.1] + [0.1 0.1]D)
  = (1/0.36)[0.18 0.18] = [0.5 0.5] = QD
Hence by (B3.1) and (B3.3), lim_{n→∞} p_1j(n) = 0.5 > 0 for
periodic states j = 2, 3.
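For N = 3 the claim lim p_1j(n) = 0.5 for j = 2, 3 can be confirmed by raising P to a large power and by recomputing Q from (B3.2); a small numerical sketch:

```python
def matmul(A, B):
    """Multiply two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# N = 3, A = 0.8, B = [0.1 0.1], D swaps states 2 and 3
P = [[0.8, 0.1, 0.1],
     [0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]

Pn = P
for _ in range(59):           # P^60, an even power
    Pn = matmul(Pn, P)
Pn1 = matmul(Pn, P)           # P^61, an odd power

print(Pn[0])    # row 1 of P^60: approximately [0, 0.5, 0.5]
print(Pn1[0])   # row 1 of P^61: also approximately [0, 0.5, 0.5]

# Scalar form of (B3.2): Q = (1 - A^2)^{-1}(A*0.1 + 0.1) per column
q = (0.8 * 0.1 + 0.1) / (1 - 0.8**2)
print(q)        # 0.5, matching Q = [0.5 0.5]
```

Rows 2 and 3 of P^n keep oscillating (the D block), so P^n itself still diverges, exactly as the example states.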
-
Appendix B4
Properties of Poisson Processes
Partition of a Poisson process
Let N(t) be a Poisson process with rate λ, which counts the
number of events occurring by time t.
Each event is classified into one of k types, independently of
N(t), with Pr(Type j) = p_j, j = 1, . . . , k, where p_1 + · · · + p_k = 1.
Let N_j(t) be the number of type-j events occurring by time t,
j = 1, . . . , k. Conditional on N(t) = n, N_1(t), . . . , N_k(t) have a
multinomial distribution:
Pr(N_j(t) = n_j, j = 1, . . . , k | N(t) = n)
= n!/(n_1! · · · n_k!) p_1^{n_1} · · · p_k^{n_k}
where n_1, . . . , n_k satisfy n_1 + · · · + n_k = n.
Consequently,
Pr(N_1(t) = n_1, . . . , N_k(t) = n_k)
= Pr(N_j(t) = n_j, j = 1, . . . , k | N(t) = n) Pr(N(t) = n)
= n!/(n_1! · · · n_k!) p_1^{n_1} · · · p_k^{n_k} e^{−λt} (λt)^n / n!
= (λt)^{n_1+···+n_k}/(n_1! · · · n_k!) p_1^{n_1} · · · p_k^{n_k} e^{−λ(p_1+···+p_k)t}
= [(λtp_1)^{n_1}/n_1!] e^{−λp_1 t} · · · [(λtp_k)^{n_k}/n_k!] e^{−λp_k t}
Thus N_1(t), N_2(t), . . . , N_k(t) are independent Poisson processes
with rates λp_1, . . . , λp_k respectively.
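The factorization above is an algebraic identity, so it can be spot-checked numerically for one set of values. The choices λt = 4, p = (0.3, 0.7), n_1 = 2, n_2 = 3 below are arbitrary and purely illustrative:

```python
import math

def poisson_pmf(n, mean):
    """Pr(N = n) for N ~ Poisson(mean)."""
    return math.exp(-mean) * mean**n / math.factorial(n)

lam_t = 4.0                  # lambda * t
p1, p2 = 0.3, 0.7            # classification probabilities (k = 2)
n1, n2 = 2, 3
n = n1 + n2

# Left side: multinomial split given N(t) = n, times Pr(N(t) = n)
multinomial = math.factorial(n) / (math.factorial(n1) * math.factorial(n2))
lhs = multinomial * p1**n1 * p2**n2 * poisson_pmf(n, lam_t)

# Right side: product of independent Poisson pmfs with means lam*t*p_j
rhs = poisson_pmf(n1, lam_t * p1) * poisson_pmf(n2, lam_t * p2)

print(lhs, rhs)   # the two sides agree
```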
-
Transform of a multivariate density
Before deriving the joint distribution of arrival times, we first
review the transform of a multivariate density function.
Let f_X(x_1, . . . , x_n) denote the joint density function of random
variables X_1, . . . , X_n.
Assume that a one-to-one transform between (X_1, . . . , X_n) and
(Y_1, . . . , Y_n) is given by X_i = g_i(Y_1, . . . , Y_n), i = 1, . . . , n.
Then by multivariate calculus, the joint density of Y_1, . . . , Y_n
is given by
f_Y(y_1, . . . , y_n) = f_X(x_1, . . . , x_n)|J|      (B4.1)
where x_i = g_i(y_1, . . . , y_n), i = 1, . . . , n, and
J = det(∂x_i/∂y_j)_{n×n} = | ∂x_1/∂y_1  ∂x_1/∂y_2  · · ·  ∂x_1/∂y_n |
                           | ∂x_2/∂y_1  ∂x_2/∂y_2  · · ·  ∂x_2/∂y_n |
                           |                · · ·                   |
                           | ∂x_n/∂y_1  ∂x_n/∂y_2  · · ·  ∂x_n/∂y_n |
which is called the Jacobian of the transform from (x_1, . . . , x_n)
to (y_1, . . . , y_n).
If the transform is linear: X = AY, where X = (X_1, . . . , X_n)^T,
Y = (Y_1, . . . , Y_n)^T, and A = (a_ij)_{n×n} is a constant matrix,
then
∂x_i/∂y_j = ∂/∂y_j ∑_{k=1}^n a_ik y_k = a_ij, i, j = 1, . . . , n.
Hence J = |A| (the determinant of A).
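As a small sketch of J = |A| for a linear transform, the determinant of a lower-triangular matrix of ones (the kind of cumulative-sum transform used for arrival times) is 1, the product of its diagonal:

```python
def det(A):
    """Determinant by Laplace expansion along the first row
    (adequate for the small matrices used here)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

# n x n lower-triangular matrix of ones: maps (t1, ..., tn) to the
# partial sums (t1, t1 + t2, ..., t1 + ... + tn)
n = 4
A = [[1.0 if j <= i else 0.0 for j in range(n)] for i in range(n)]
print(det(A))   # 1.0, so the transform has Jacobian J = |A| = 1
```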
-
Arrival times
Given a Poisson process N(t) with rate λ to count the number
of events, the arrival time of the kth event is given by
A_k = T_1 + T_2 + · · · + T_k, k = 1, 2, . . .
where T_1, T_2, . . . are i.i.d. exponentially distributed with rate λ.
Let (a_1, . . . , a_n) and (t_1, . . . , t_n) denote the values of the
random vectors (A_1, . . . , A_n) and (T_1, . . . , T_n) respectively. Then
[ a_1 ]   [ 1 0 · · · 0 ] [ t_1 ]
[ a_2 ] = [ 1 1 · · · 0 ] [ t_2 ]      (B4.2)
[  .  ]   [    · · ·    ] [  .  ]
[ a_n ]   [ 1 1 · · · 1 ] [ t_n ]
subject to the restrictions a_1 < · · · < a_n and t_1 > 0, . . . , t_n > 0.
Let f_A and f_T denote the joint densities of (A_1, . . . , A_n) and
(T_1, . . . , T_n) respectively.
Since the matrix in (B4.2) has determinant equal to 1, we have
J = 1 for the transform in (B4.2). Hence by (B4.1),
f_A(a_1, . . . , a_n | N(t) = n) = f_T(t_1, . . . , t_n | N(t) = n)
= Pr(N(t) = n | t_1, . . . , t_n) f_T(t_1, . . . , t_n) / Pr(N(t) = n)
= Pr(N(t) − N(a_n) = 0) λe^{−λt_1} · · · λe^{−λt_n} / Pr(N(t) = n)
= e^{−λ(t−a_n)} λ^n e^{−λ(t_1+···+t_n)} / (e^{−λt}(λt)^n/n!)
= n!/t^n      (B4.3)
if 0 ≤ a_1 < · · · < a_n ≤ t; and 0 otherwise.
-
Let X_1, . . . , X_n be i.i.d. random variables with a common
density f_X(x), and X_(1) < · · · < X_(n) their order statistics.
The joint density of X_1, . . . , X_n at (x_1, . . . , x_n) is given by
f(x_1, . . . , x_n) = f_X(x_1) · · · f_X(x_n) = f_X(x_(1)) · · · f_X(x_(n)).
Given x_(1) < x_(2) < · · · < x_(n), there are n! unordered n-tuples
(x_1, . . . , x_n) whose ordered values are equal to x_(1), . . . , x_(n),
each with density f_X(x_(1)) · · · f_X(x_(n)).
Therefore, the density of X_(1), . . . , X_(n) is given by
f_(n)(x_(1), . . . , x_(n)) = n! ∏_{i=1}^n f_X(x_(i))      (B4.4)
For example, when n = 3 and (x_(1), x_(2), x_(3)) = (1, 2, 3),
f_(3)(1, 2, 3) = f(1, 2, 3) + f(1, 3, 2) + f(2, 1, 3)
              + f(2, 3, 1) + f(3, 1, 2) + f(3, 2, 1)
              = 3! f_X(1) f_X(2) f_X(3)
In particular, if f_X(x) = t^{-1} I{0 ≤ x ≤ t} is uniform over
[0, t], then
f_(n)(x_(1), . . . , x_(n)) = n!(1/t)^n = n!/t^n      (B4.5)
if 0 ≤ x_(1) < · · · < x_(n) ≤ t; and 0 otherwise.
Comparing (B4.5) with (B4.3), we see that the conditional joint
distribution of the arrival times A_1 < · · · < A_n given N(t) = n
is the same as that of the order statistics of n independent
uniform random variables over the interval [0, t].
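This equivalence can be illustrated by a seeded Monte Carlo check: conditional on N(t) = n, the mean of the kth arrival time should match t·k/(n+1), the mean of the kth order statistic of n uniforms on [0, t]. The parameters λ = 1, t = 5, n = 3 below are illustrative choices:

```python
import random

random.seed(0)
lam, t, n_target = 1.0, 5.0, 3
first_arrivals = []

for _ in range(40000):
    # build arrival times from i.i.d. Exp(lam) inter-arrival times
    arrivals, a = [], 0.0
    while True:
        a += random.expovariate(lam)
        if a > t:
            break
        arrivals.append(a)
    if len(arrivals) == n_target:     # condition on N(t) = 3
        first_arrivals.append(arrivals[0])

mean_a1 = sum(first_arrivals) / len(first_arrivals)
# the 1st of 3 uniform order statistics on [0, t] has mean t*1/(n+1)
print(mean_a1)   # close to 5 * 1/4 = 1.25
```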
-
The time-inhomogeneous case
If the Poisson process N(t) is time-inhomogeneous with intensity
function λ(t) and cumulative intensity Λ(t) = ∫_0^t λ(s)ds, then
by the independent increments,
Pr(T_k > t | T_1 = t_1, . . . , T_{k−1} = t_{k−1})
= Pr(N(a_{k−1} + t) − N(a_{k−1}) = 0) = e^{−[Λ(t+a_{k−1})−Λ(a_{k−1})]}
where a_0 = t_0 = 0 and a_k = t_1 + · · · + t_k, k = 1, 2, . . ..
Thus the density of T_k given {T_1 = t_1, . . . , T_{k−1} = t_{k−1}} is
f_k(t | t_1, . . . , t_{k−1}) = λ(t + a_{k−1}) e^{−[Λ(t+a_{k−1})−Λ(a_{k−1})]}
and the joint density of T_1, . . . , T_n is given by
f(t_1, . . . , t_n) = ∏_{k=1}^n f_k(t_k | t_1, . . . , t_{k−1})
= ∏_{k=1}^n λ(a_k) e^{−[Λ(a_k)−Λ(a_{k−1})]} = λ(a_1) · · · λ(a_n) e^{−Λ(a_n)}
Then, similarly to (B4.3), for 0 ≤ a_1 < · · · < a_n ≤ t,
f_A(a_1, . . . , a_n | N(t) = n)
= e^{−[Λ(t)−Λ(a_n)]} λ(a_1) · · · λ(a_n) e^{−Λ(a_n)} / (e^{−Λ(t)}[Λ(t)]^n/n!)
= n! λ(a_1) · · · λ(a_n) / [Λ(t)]^n = n! ∏_{j=1}^n [λ(a_j)/Λ(t)]      (B4.6)
Comparing (B4.6) with (B4.4), we see that given N(t) = n,
the arrival times A_1 < · · · < A_n are distributed as the order
statistics of i.i.d. X_1, . . . , X_n with a common density
f_X(x) = λ(x)/Λ(t), 0 ≤ x ≤ t.
-
Define Poisson process by inter-arrival times
Let {N_t, t ≥ 0} be a counting process and T_1, T_2, . . . the
inter-arrival times of N_t. If T_1, T_2, . . . is a sequence of i.i.d.
exponential random variables with a common rate λ, we can show
that N_t is a Poisson process with rate λ in the following steps.
First, N_t can be defined by T_1, T_2, . . . as follows:
N_t = max{k ≥ 0 : A_k ≤ t},      (B4.6)
where A_0 = 0 and A_k = T_1 + · · · + T_k is the arrival time of the
kth event.
Then (B4.6) implies
{N_t ≤ n} = {A_{n+1} > t}, {N_t = n} = {A_n ≤ t < A_{n+1}}      (B4.7)
Next, it is not difficult to show that N_t is Poisson distributed
with mean λt (a tutorial exercise).
Then we show that N_t has the Markov property. Let n_t ≥ 0 be
integer-valued and nondecreasing in t ≥ 0. Then T_{n_s+1}, . . . , T_{n_t}
are independent of A_{n_u} = T_1 + · · · + T_{n_u} for any 0 ≤ u ≤ s < t
and of T_{n_u+1} if n_u < n_s. These together with (B4.7) imply
Pr(N_t < n_t | N_u = n_u, u ≤ s, A_{n_s} = a_s)
= Pr(A_{n_t} > t | A_{n_u} ≤ u < A_{n_u} + T_{n_u+1}, u ≤ s, A_{n_s} = a_s)
= Pr(a_s + T_{n_s+1} + · · · + T_{n_t} > t | a_s + T_{n_s+1} > s, A_{n_s} = a_s)
= Pr(N_t < n_t | N_s = n_s, A_{n_s} = a_s)      (B4.8)
for any 0 ≤ s < t and 0 ≤ a_s ≤ s.
-
Note that the value a_s of A_{n_s} must satisfy 0 ≤ a_s ≤ s since
n_s is the number of arrivals no later than time s by (B4.6) and
(B4.7).
It follows from (B4.8) that
Pr(N_t < n_t | N_u = n_u, u ≤ s) = Pr(N_t < n_t | N_s = n_s)
for any 0 ≤ s < t. This shows the Markov property of N_t.
It remains to show that N_{t+u} − N_t is Poisson distributed with
mean λu and is independent of N_t for any t, u > 0. By (B4.7),
Pr(N_{t+u} − N_t ≤ n | N_t = m) = Pr(N_{t+u} ≤ m + n | N_t = m)
= Pr(A_{m+n+1} > t + u | A_m ≤ t < A_{m+1})
Hence for any 0 ≤ a_m ≤ t, by the memoryless property of the
exponential distribution and the independence between A_m and T_{m+1},
Pr(N_{t+u} − N_t ≤ 0 | N_t = m, A_m = a_m)
= Pr(N_{t+u} ≤ m | N_t = m, A_m = a_m)
= Pr(A_{m+1} > t + u | A_m = a_m ≤ t < A_{m+1})
= Pr(a_m + T_{m+1} > t + u | a_m + T_{m+1} > t)
= Pr(T_{m+1} > t + u − a_m | T_{m+1} > t − a_m)
= Pr(T_{m+1} > u) = e^{−λu}      (B4.9)
As e^{−λu} does not depend on m and a_m, (B4.9) shows that
Pr(N_{t+u} − N_t ≤ 0 | N_t = m) = Pr(N_{t+u} − N_t ≤ 0)
= e^{−λu} = Pr(N_u = 0) = Pr(N_u ≤ 0)      (B4.10)
-
For n ≥ 1, let X = T_{m+2} + · · · + T_{m+n+1} ~ Gamma(n, λ) and
define the event E = E(x) = {A_m = a_m ≤ t, X = x}. Then an
argument similar to that of (B4.9) leads to
Pr(N_{t+u} − N_t ≤ n | N_t = m, E) = Pr(N_{t+u} ≤ m + n | N_t = m, E)
= Pr(a_m + T_{m+1} + x > t + u | a_m + T_{m+1} > t, E)
= Pr(T_{m+1} > u − x) = e^{−λ(u−x)} I{x ≤ u} + I{x > u}      (B4.11)
Multiplying (B4.11) by the density f_X(x) of X ~ Gamma(n, λ) and
integrating over x ≥ 0, we get
Pr(N_{t+u} − N_t ≤ n | N_t = m) = ∫_0^∞ Pr(T_{m+1} > u − x) f_X(x) dx
= ∫_0^u e^{−λ(u−x)} λ^n x^{n−1} e^{−λx}/(n−1)! dx
  + ∫_u^∞ λ^n x^{n−1} e^{−λx}/(n−1)! dx
= e^{−λu} λ^n ∫_0^u x^{n−1}/(n−1)! dx + Pr(A_n > u)
= e^{−λu} (λu)^n/n! + Pr(N_u ≤ n − 1) = Pr(N_u ≤ n)      (B4.12)
It follows from (B4.10) and (B4.12) that N_{t+u} − N_t has a
Poisson distribution with mean λu and is independent of N_t. Thus
N_t is a Poisson process with rate λ.
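The construction can also be checked by simulation: building N_t from i.i.d. Exp(λ) inter-arrival times via (B4.6) should yield a Poisson(λt) count. A seeded sketch with illustrative parameters λ = 2, t = 3:

```python
import math
import random

random.seed(1)
lam, t = 2.0, 3.0

def count_Nt(lam, t):
    """N_t = max{k >= 0 : A_k <= t}, with A_k sums of Exp(lam) gaps."""
    k, a = 0, 0.0
    while True:
        a += random.expovariate(lam)
        if a > t:
            return k
        k += 1

samples = [count_Nt(lam, t) for _ in range(50000)]
mean = sum(samples) / len(samples)
p0 = samples.count(0) / len(samples)

print(mean)                    # close to lam * t = 6
print(p0, math.exp(-lam * t))  # Pr(N_t = 0) close to e^{-6}
```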
Remarks
(i) (B4.6) and (B4.7) summarise the relationship between a counting
process and its arrival and inter-arrival times.
(ii) A process N_t defined by (B4.6) is a Markov process provided
that T_1, T_2, . . . are independent (they need not be i.i.d. or
exponential).