statistics 416, applied geostatistics
TRANSCRIPT
Statistics 416, Applied Geostatistics.
Outline for the day:1. Syllabus, etc.2. Examples of spatial-temporal point processes. 3. Characterizations of point processes. 4. Integration. 5. Exercises. 6. Simplicity and orderliness. 7. Conditional intensity. 8. Poisson process.
Read reinhart18.pdf . Ignore CCLE website. Use http://www.stat.ucla.edu/~frederic/416/S21 . Zoom password. Grading? Breaks? Please do not drop without talking to me first!
1. Syllabus.
2. Examples of spatial-temporal point processes.
A point process is a random collection of points falling in some metric space. For a spatial-temporal point process, the metric space is a portion of space-time, S = Rd x R. Often d = 2, but sometimes 3.
Examples include incidence of disease, sightings or births of a species, occurrences of fires, earthquakes, lightning strikes, tsunamis, or volcanic eruptions.
Microearthquake origin times and epicenters in Parkfield, CA, recorded by the US High-Resolution Seismographic Station Network.
Wildfire centroids between 1876 and 1996 in Los Angeles County, CA, recorded by the Los Angeles County Department of Public Works
For a marked point process, each point has some mark or random variable associated with it.
Locations, times and magnitudes of moderate-sized (M ≥ 3.5) earthquakes in Bear Valley, CA, between 1970 and 2000.
3. Characterizations of point processes.
Point processes have been characterized various ways.
Daley and Vere-Jones (2003) detail the history originating from life tables, dating back to the 1700s.
3a. STOCHASTIC PROCESS. Point processes on the line were originally characterized as examples of stochastic processes on the line, that are: * Non-decreasing, * Z+ valued. So it is tempting to define a point process as any non-decreasing, Z+ valued stochastic process. On the line, this definition works just fine.
0 1 2 3 4 5 6 7 --------.------.-.------------.------.--------.--.-----------
3a continued.
Non-decreasing, Z+ valued stochastic process.
This has problems extending to spatial-temporal processes.
Let N(x,y) = the number of points (x1,y1) with x1 ≤ x and y1≤y.
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
0
1
2
3
3a continued.
It is very unclear what points this nondecreasing Z+ valued process corresponds to though.
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
0
1
3b. A LIST OF POINTS. For any finite collection of points, one could represent it simply as a finite list of points, N = {x1, x2, ...}. J. Møller and R. Waagepetersen (2004), Statistical Inference and Simulation for Spatial Point Processes, CRC, Boca Raton uses this formulation. If S = Rd x R is the domain where the points occur, then N is a vector in Ø U S U S2 U S3 U ....This works just fine for finite point processes, and point processes with a countable number of points. But you can imagine wanting to consider the point process with points everywhere, or points at all irrational times, or some other uncountable collection of points. You could restrict your definition to include only point processes with points on a set of measure zero, but then what about if N has points at all times in the Cantor set?
3c. RANDOM MEASURE.
A Z+ valued random measure is now the preferred mathematical definition, as it includes a wide range of processes on the line and extends readily to space-time.
The measure N(A) represents the number of points falling in the region A of space-time. For this set A, for example, the value of N(A) is 3. Attention is sometimes restricted to processes with only a finite number of points in any compact subset of S.
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
A
3c, continued.
In the 1950s-1970s much attention was paid to the idea that there are non-measurable sets. N(A) is technically only defined for measurable A. A collection of points at all times in the Vitali set, for instance, if not considered a valid point process.
With the random measure formulation, the domain S can also be some abstract space. Brillinger (1997, 2000) analyzed point processes on the surface of a sphere or on an ellipse, for instance.
Brillinger, D.R. (1997). A particle migrating randomly on a sphere, Journal of Theoretical Probability 10, 429-433.
Brillinger, D.R. (2000). Some examples of random process environmental data analysis, in Handbook of Statistics, Vol. 18, C.R. Rao & P.K. Sen, eds, NorthHolland, Amsterdam, pp. 33–56.
4. Integration.
Note that with the measure theory formulation, which we will use throughout, N(B) = ∫B dN is the number of points in B.
What is ∫t ∫x ∫y f(t,x,y) dN?
It is simply ∑i f(ti,xi,yi) .
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
t
f(t)
Exercises.
1. Suppose the spatial-temporal point process N has points at
time 1.2, x=2, y=3.
time 2.4, x=3, y=0.5.
time 8.7, x=2, y=1.
Let B = [0,10] x [1.5, 2.5] x [0,5].
What is ∫B dN?
Exercises.
1. Suppose the spatial-temporal point process N has points at
time 1.2, x=2, y=3.
time 2.4, x=3, y=0.5.
time 8.7, x=2, y=1.
Let B = [0,10] (time) x [0,5] x [0,5].
What is ∫B dN?
2.
Exercises.
1. Suppose the spatial-temporal point process N has points at
time 1.2, x=2, y=3.
time 2.4, x=3, y=0.5.
time 8.7, x=2, y=1.
Let B = [0,10] (time) x [0,5] x [0,5].
What is ∫B (t+x2y) dN?
Exercises.
1. Suppose the spatial-temporal point process N has points at
time 1.2, x=2, y=3.
time 2.4, x=3, y=0.5.
time 8.7, x=2, y=1.
Let B = [0,10] (time) x [0,5] x [0,5].
What is ∫B (t+x2y) dN?
13.2 + 6.9 + 12.7 = 32.8.
6. Simple and orderly point processes.
Last time we noted that the accepted definition of a point process is a Z+ valued random measure.
We will sometimes use the notation ti to represent point i = (ti,xi,yi).
A point process is stationary (or homogeneous) if, for any k and any collection of measurable sets, B1, B2, ..., Bk, the joint distribution of the collection {N(B1+D), N(B2+D), ..., N(Bk+D)} doesn't depend on D.
A point process is called simple if all the points are distinct, i.e.
P(there are indices i and j where ti = tj) = 0.
A point process is orderly if for any time t and any spatial interval B,
lim Dt -> 0 P(N([t,t+Dt) x B) > 1) / (Dt |B|) = 0.
If N is simple and stationary, then it is orderly (Daley and Vere-Jones, 2003, 3.3.V and 3.3.VI).
Simple and orderly point processes, continued.
The times {t1,t2, ..., tn} are sometimes said to form the ground process.
N has simple ground process if all the times are distinct, with prob. 1.
Examples of a non-simple process, N with non-simple ground process, and a nonorderly process with points at {(1/i, U(0,1), i=1,2,...}.
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
t
lat
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
t
lat
0.0 0.2 0.4 0.6 0.8 1.0
0.00.2
0.40.6
0.81.0
t
lat
7. Conditional intensity, l.
Fix any space-time location (t,x,y). l(t,x,y) is the limiting expected rate of accumulation of points around (t,x,y), given all points prior to t.
l(t,x,y) = lim Dt, d -> 0 E( N([t,t+Dt) x B(x,y,d)) | Ht ) / (Dt πd2), where
B(x,y,d) is a circle of radius d around (x,y), and Ht is the history of the process up to but not including time t.
If N is orderly, then
l(t,x,y) = lim Dt, d -> 0 P(N([t,t+Dt) x B(x,y,d)) > 0 | Ht ) / (Dt πd2).
Note that l is random. It can depend on what points have occurred previously, and these might be different with every realization.
Fix some spatial interval, B.
N(t, B) – ∫0t ∫∫ B l(t,x,y) dxdy dt is a martingale.
Conditional intensity, l, continued.
We will use l(t,x,y) = lim Dt, d -> 0 E( N([t,t+Dt) x B(x,y,d)) | Ht ) / (Dt πd2).
l is predictable.
Predictable is just a slight generalization of left-continuous.
If for any (t,x,y), the limit doesn't exist or is ∞, we say l doesn't exist.
l uniquely determines the distributions of any simple point process.
Conditional intensity, continued.
l(t,x,y) = lim Dt, d -> 0 E( N([t,t+Dt) x B(x,y,d)) | Ht ) / (Dt πd2).
l uniquely determines the distributions of any simple point process.
Daley and Vere-Jones (2003), prop. 7.2.IV.
This is an amazing fact. For many stochastic processes, you need to know the conditional mean and variance and maybe more info in order to specify the process. For Gaussian processes, you need to specify the mean, variance, and covariance of the process. For simple point processes, all you need is the conditional mean, l.
In modeling a point process, we typically assume it is simple and then just write down a model for l.
8. Poisson processes.
The most basic models for point processes are the Poisson processes.
If N is a simple point process with conditional intensity l, where l does not depend on what points have occurred previously, then N is a Poisson process.
The name comes from the fact that for such a process, for any set B, N(B) has a Poisson distribution.
P(N(B) = k) = e-A Ak / k! , for k = 0, 1, 2, ..., where A = ∫B l(t,x,y) dtdxdy, and with the convention 0! = 1.The mean of N(B) is A and the varianceis also A.
Poisson processes, continued.
A Poisson process is a simple point process with conditional intensity l, where l does not depend on what points have occurred previously.
Note that a Poisson process does not have
to be stationary. A simple point process
with l(t,x,y) = 10.5 + 2t + 4x – exp(-y)
is a Poisson process.
If l is constant for all t,x,y, and N is
simple, then N is a stationary Poisson
process, and is sometimes called
completely random.
Poisson processes, continued.
On the left is a stat. Poisson process with l(t,x) = 2.5 on [0,1] x [0,10],
and on the right is a Poisson process with l(t,x) = 1.5 + 10t + 2x.
The key thing about Poisson processes is their complete independence.
For a Poisson process N, N(B1) and N(B2) are independent for any disjoint sets B1 and B2.
0.0 0.2 0.4 0.6 0.8
24
68
10
t
x
0.0 0.2 0.4 0.6 0.8 1.0
02
46
810
t
x
More (tricky) exercises.
1. Suppose N is generated as follows. For each integer i = 1,2,..., N has a point at (i, i, i) with probability 1/i, independently of the other points, and N has no other points.
What is l(2,2,2)?
a) 2. b) 1. c) ½. d) does not exist.
0 20 40 60 80 100
020
4060
80100
t
lat
Exercises.
2. Suppose N is generated as follows. For each integer i = 1,2,..., N has a point at (i, i, i) with probability 1/i, independently of the other points, and N has no other points.
N is
a) simple but not orderly. b) orderly but not simple.
c) simple and orderly. d) neither simple nor orderly.
0 20 40 60 80 100
020
4060
80100
t
lat
Exercises.
3. Suppose N is a Poisson process with l(t,x,y) = 1.5 + 10t + 2x
on B = [0,1] x [0,10] x [0,1].
What is EN(B)?
0.0 0.2 0.4 0.6 0.8 1.0
02
46
810
t
x
Exercises.
3. Suppose N is a Poisson process with l(t,x,y) = 1.5 + 10t + 2x
on B = [0,1] x [0,10] x [0,1].
What is EN(B)?
∫B l(t,x,y) dt dx dy = 1.5(10) + 10(10)(12)/2 + 2(1)(102)/2 = 15+50+100=165.
0.0 0.2 0.4 0.6 0.8 1.0
02
46
810
t
x
Code.
## nonsimple point process n = 20x = runif(n)y = runif(n)plot(x,y,xlab="t",ylab="lat",pch=2)points(x[20],y[20],pch=3)
## nonsimple ground process plot(x,y,xlab="t",ylab="lat",pch=2)points(x[20],y[20]+.05,pch=3)
## nonorderly processplot(c(0,1),c(0,1),type="n",xlab="t",ylab="lat")n = 100for(i in 1:n) points(1/i,runif(1),pch=3,cex=.5)
Code.
## points at (i,i) with prob. 1/i. plot(c(0,100),c(0,100),type="n",xlab="t",ylab="lat")for(i in 1:100) if(runif(1) < 1/i) points(i,i,pch=3)
## stationary Poisson process with intensity 2.5 on B=[0,1]x[0,10].n = rpois(1,2.5*1*10)t = runif(n)x = runif(n)*10plot(t,x,pch=3)
Code. ## nonstationary Poisson process with intensity 1.5+10t+2x on B.n = rpois(1,15+50+100)n1 = 0t = c()x = c()while(n1<n){t2 = runif(1) ## candidate pointx2 = runif(1)*10if(runif(1) < (1.5+10*t2+2*x2)/(1.5+10+20)){ ## keep itt = c(t,t2)x = c(x,x2)n1 = n1 + 1cat(n1," ")
}}plot(t,x,pch=3)