Methods for point patterns

Upload: priscilla-skinner

Post on 18-Jan-2016

Methods consider first-order effects (e.g., changes in mean values [intensity] over space) or second-order effects (e.g., spatial interaction).

Edge effects can affect the results. Therefore either use a guard region (buffer around edge) or a toroidal shift technique.

Exploring point patterns

Quadrat analysis: either completely cover the study area (e.g., lay a grid over the study area) or randomly scatter quadrats (e.g., in a field study).

Problems: size of quadrat, loss of relative locations within quadrat, orientation (variance of results under rotation)

Exploring point patterns

Kernel estimation: obtain a smooth estimate of the probability density (aka a smoothed histogram).

Issues: choice of ‘kernel’, bandwidth (fixed distance or adaptive [fixed # of points])

Exploring point patterns

Kernel estimation weights points that are further away less than those that are close. Point A will count less than Point B.

[Figure: kernel centred on the estimation location; Point A lies farther from the centre than Point B]

Tau (τ) is the bandwidth and determines the resolution.

Edge corrections should be used if points are near the edge.
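The distance weighting can be sketched in a few lines. This is a minimal illustration only: the quartic kernel is one common choice and is an assumption here (the slides do not name a kernel), the function name and sample coordinates are hypothetical, and no edge correction is applied.

```python
import math

def quartic_kernel_intensity(s, events, tau):
    """Estimate intensity at location s from events within bandwidth tau.

    Assumes a quartic kernel; no edge correction is applied.
    """
    total = 0.0
    for x, y in events:
        d2 = (x - s[0]) ** 2 + (y - s[1]) ** 2
        if d2 <= tau ** 2:
            # Weight falls smoothly from 3 / (pi tau^2) at d = 0 to 0 at d = tau.
            total += 3.0 / (math.pi * tau ** 2) * (1.0 - d2 / tau ** 2) ** 2
    return total

# Hypothetical events: two close together, one far away.
events = [(0.0, 0.0), (0.5, 0.0), (3.0, 3.0)]
near = quartic_kernel_intensity((0.0, 0.0), events, tau=1.0)
far = quartic_kernel_intensity((3.0, 0.0), events, tau=1.0)
print(near, far)
```

A larger tau smooths over more events and lowers the resolution; a smaller tau gives a spikier surface.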

Exploring point patterns

Nearest neighbour distances: exploring second-order properties. Two basic types: the nearest neighbour event-event distance, G(w), or the nearest neighbour point-event distance, F(x) (where the point is a randomly selected point in the study area).

Exploring point patterns

NN results are typically plotted as cumulative curves over distance classes (the # of NN distances less than/equal to the distance class, divided by the number of events [or random points sampled]).

If the G(w) curve rises rapidly at the beginning, this suggests clustering of events; if it rises late, this suggests regularity.

Exploring point patterns

You could also plot G(w) against F(x). If there is no interaction the plot should be roughly a straight line. If the events are clustered, then the G(w) values should be higher than the F(x) values.
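A rough sketch of the two empirical curves, evaluated at a single distance. The function names, the sample pattern and the unit-square study area are assumptions made for illustration, and no edge correction is applied.

```python
import math
import random

def nearest_distance(p, targets):
    """Distance from p to the closest location in targets."""
    return min(math.dist(p, q) for q in targets)

def g_hat(events, w):
    """Share of events whose nearest other event lies within w."""
    nn = [nearest_distance(e, events[:i] + events[i + 1:])
          for i, e in enumerate(events)]
    return sum(d <= w for d in nn) / len(nn)

def f_hat(events, sample_points, x):
    """Share of sample points whose nearest event lies within x."""
    nn = [nearest_distance(p, events) for p in sample_points]
    return sum(d <= x for d in nn) / len(nn)

# Hypothetical clustered pattern in the unit square (two tight groups).
events = [(0.10, 0.10), (0.12, 0.10), (0.10, 0.12), (0.90, 0.90), (0.92, 0.90)]
rng = random.Random(0)
sample_points = [(rng.random(), rng.random()) for _ in range(200)]

g_val = g_hat(events, 0.05)
f_val = f_hat(events, sample_points, 0.05)
print(g_val, f_val)
```

For this clustered pattern G exceeds F at the short distance tested, which is the behaviour described above.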

Exploring point patterns

K functions: basically, the expected number of events within distance h of an arbitrary event. Consider drawing buffers around each event (distance 1, 2, 3, …), counting the [cumulative] # of events within each buffer, and taking the average for each distance class (scaled by the area / n²). Could plot K against h.

Exploring point patterns

Typically K functions are not plotted as such; they are normally transformed into L plots. L values are K values ‘normalized’ by the number of points expected per distance class under a homogeneous point pattern distribution with no spatial dependence.

Exploring point patterns

L(h) plots: peaks indicate clustering and troughs regularity, as a function of distance (h).

K functions are the preferred means to examine point patterns as they consider all scales (not just the nearest neighbours), and they have a theoretical basis.
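A naive sketch of the K and L calculations, ignoring edge effects. The scaling by area / n² follows the description above; the transformation L(h) = sqrt(K(h) / π) − h is one common form and is an assumption here, since the slides do not give the formula. Function names and sample patterns are hypothetical.

```python
import math

def k_hat(events, h, area):
    """Naive K estimate: ordered pairs within h, scaled by area / n^2."""
    n = len(events)
    pairs = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(events[i], events[j]) <= h:
                pairs += 1
    return area * pairs / (n * n)

def l_hat(events, h, area):
    """Assumed L transform: zero expected under CSR, where K(h) = pi h^2."""
    return math.sqrt(k_hat(events, h, area) / math.pi) - h

# Hypothetical clustered pattern: two tight groups in the unit square.
clustered = [(0.10, 0.10), (0.12, 0.10), (0.10, 0.12), (0.90, 0.90), (0.92, 0.90)]
# Hypothetical regular pattern: a 5 x 5 grid with spacing 0.2.
regular = [(0.1 + 0.2 * i, 0.1 + 0.2 * j) for i in range(5) for j in range(5)]

k_clustered = k_hat(clustered, 0.05, 1.0)
l_clustered = l_hat(clustered, 0.05, 1.0)
l_regular = l_hat(regular, 0.1, 1.0)
print(k_clustered, l_clustered, l_regular)
```

With this form of L, the clustered pattern gives a positive value (a peak) and the grid a negative value (a trough).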

Modelling spatial point patterns

The standard model for complete spatial randomness (CSR) is the homogeneous Poisson process. Poisson distributions have mean = variance, which enables us to test some of the exploratory point pattern statistics just introduced. This is the most commonly assumed process.

Modelling spatial point patterns

Alternatives to CSR include:

Heterogeneous Poisson process (the intensity of the process varies across space [e.g., a trend surface])

Cox process (the intensity of the process randomly varies across space)

Poisson cluster process (a two-level Poisson process: parent points are identified (CSR), and around each parent point a random number of offspring are placed; only the offspring remain).
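The Poisson cluster process can be simulated roughly as follows. This is a sketch under stated assumptions: offspring are scattered uniformly in a disc around each parent (the slides do not say how offspring are placed), the parameter values are arbitrary, and all names are hypothetical. Offspring that fall outside the unit square are kept.

```python
import math
import random

def poisson_draw(lam, rng):
    """Draw a Poisson count using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def poisson_cluster(parent_mean, offspring_mean, radius, rng):
    """Return (parents, offspring) for a cluster process on the unit square."""
    # Level 1: parents follow CSR (Poisson count, uniform locations).
    n_parents = poisson_draw(parent_mean, rng)
    parents = [(rng.random(), rng.random()) for _ in range(n_parents)]
    offspring = []
    for px, py in parents:
        # Level 2: a random number of offspring around each parent.
        for _ in range(poisson_draw(offspring_mean, rng)):
            r = radius * math.sqrt(rng.random())  # uniform within the disc
            theta = 2.0 * math.pi * rng.random()
            offspring.append((px + r * math.cos(theta), py + r * math.sin(theta)))
    # In the process itself only the offspring remain; parents are returned
    # here purely so the result can be inspected.
    return parents, offspring

rng = random.Random(1)
parents, offspring = poisson_cluster(5.0, 8.0, 0.05, rng)
print(len(parents), len(offspring))
```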

Modelling spatial point patterns

Simple inhibition processes: a hard core process in which a CSR process is thinned by removing all pairs of points less than distance d apart.

Markov point processes: similar to simple inhibition processes (above), but allowing for the possibility that point pairs could be found at distances less than d apart.
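The hard core thinning step can be sketched as below, assuming a unit-square study area and a fixed number of initial CSR events (both assumptions; the function name is hypothetical). Every event that has a neighbour closer than d is dropped, so both members of a close pair are removed.

```python
import math
import random

def hard_core_thin(events, d):
    """Keep only events with no other event closer than d."""
    kept = []
    for i, e in enumerate(events):
        if all(math.dist(e, q) >= d for j, q in enumerate(events) if j != i):
            kept.append(e)
    return kept

rng = random.Random(2)
csr = [(rng.random(), rng.random()) for _ in range(200)]
thinned = hard_core_thin(csr, 0.05)
print(len(csr), len(thinned))
```

After thinning, no two remaining events are closer than d, which produces a regular-looking pattern.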

Modelling spatial point patterns

Quadrat analyses: compute a chi-square value by multiplying the variance of the quadrat counts by (n – 1) and dividing by the mean, where n is the number of quadrats. Significantly large values = clustering, small values = regularity.

An index of cluster size (ICS) = (variance / mean) – 1. If CSR holds, E(ICS) = 0. If ICS > 0 then clustering is implied, if ICS < 0 regularity is implied.
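Both quadrat statistics can be computed in a few lines. The sketch assumes a square grid of quadrats laid over a unit-square study area; the function names and the two sample patterns are hypothetical.

```python
def quadrat_counts(events, k):
    """Count events in a k x k grid of quadrats over the unit square."""
    counts = [0] * (k * k)
    for x, y in events:
        col = min(int(x * k), k - 1)  # clamp events lying exactly on the edge
        row = min(int(y * k), k - 1)
        counts[row * k + col] += 1
    return counts

def dispersion_stats(counts):
    """Return (chi-square value, index of cluster size) for quadrat counts."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    chi_sq = (n - 1) * var / mean
    ics = var / mean - 1.0
    return chi_sq, ics

# Hypothetical patterns: all events in one quadrat vs. evenly spread.
clustered = [(0.1, 0.1)] * 8
regular = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)] * 2

clustered_stats = dispersion_stats(quadrat_counts(clustered, 2))
regular_stats = dispersion_stats(quadrat_counts(regular, 2))
print(clustered_stats)  # (24.0, 7.0)
print(regular_stats)    # (0.0, -1.0)
```

The chi-square value would be compared against a chi-square distribution with n – 1 degrees of freedom.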

Modelling spatial point patterns

Nearest neighbour tests: based on CSR we can derive a theoretical distribution of NN distances. Edge effects are problematic when working with theoretical distributions, so normally a computationally intensive (Monte Carlo) approach is taken.

Modelling spatial point patterns

The method: simulate m realizations of n events under CSR and determine the NN distance distribution for each. The mean values are determined for each distance class, as well as the min and max values observed within each class. The simulated values (mean, min, max) are plotted against the observed values.

Modelling spatial point patterns

If CSR holds, the plot should be roughly linear at a 45° angle. If clustering is present the plot will lie above the line; if regularity is present the plot will lie below the line (assuming the simulated values are plotted on the x-axis).
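The Monte Carlo procedure might look like the sketch below. The unit-square study area, the choice of m = 99 simulations, the distance classes and the observed pattern are all assumptions for illustration; function names are hypothetical and the plotting step is left out.

```python
import math
import random

def nn_distances(events):
    """Nearest neighbour distance for every event."""
    return [min(math.dist(e, q) for j, q in enumerate(events) if j != i)
            for i, e in enumerate(events)]

def g_curve(events, distances):
    """Cumulative share of NN distances within each distance class."""
    nn = nn_distances(events)
    return [sum(d <= w for d in nn) / len(nn) for w in distances]

def csr_envelope(n, distances, m, rng):
    """Simulate m CSR patterns of n events; return mean, min, max curves."""
    sims = []
    for _ in range(m):
        pattern = [(rng.random(), rng.random()) for _ in range(n)]
        sims.append(g_curve(pattern, distances))
    columns = list(zip(*sims))  # one column per distance class
    mean = [sum(c) / m for c in columns]
    low = [min(c) for c in columns]
    high = [max(c) for c in columns]
    return mean, low, high

# Hypothetical observed pattern: two tight strings of 10 events each.
observed = ([(0.2 + 0.005 * i, 0.2) for i in range(10)]
            + [(0.8, 0.8 + 0.005 * i) for i in range(10)])
distances = [0.01 * k for k in range(1, 11)]

obs = g_curve(observed, distances)
mean, low, high = csr_envelope(len(observed), distances, 99, random.Random(3))
print(obs[0], high[0])
```

Plotting obs against mean (with low and high as the envelope) gives the diagnostic described above; here the observed curve sits above the envelope at short distances, as expected for a clustered pattern.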

Modelling spatial point patterns

K functions: similar to the NN method. Produce m simulations and plot the observed L values and the min/max envelopes against the distance (h).

Comparing multiple types of events

Say we have two types of events (contaminated / clean wells, cases of larynx and lung cancers, crimes committed by blacks / whites) and want to examine whether one type of event is related to the other: does one ‘explain’ the other (dependence)?

Comparing multiple types of events

Simplest approach: quadrat analyses. Record whether each type of event is absent or present in each quadrat and produce a 2 × 2 contingency table (Chi-squared statistic).

                     Type 1
                     Absent    Present
Type 2    Absent
          Present

Comparing multiple types of events

NN distances: if the two types are independent, then the distance class values for the event_i – event_j NN distances and for the random point – event_j NN distances should be equal.

Comparing multiple types of events

Multivariate (cross) K functions are the preferred means to explore the relation among multivariate point patterns.

The K function is: E(#(type j events ≤ h from an arbitrary type i event)), again normalized by: Area / (n_i n_j)

Interestingly, the expected value of K is not affected if the i or j events are clustered, random or regular when considered separately.

Comparing multiple types of events

As with K functions, L plots are produced. This time, we plot L_ii(h), L_jj(h) and L_ij(h) simultaneously, which reveals whether individually i or j depart from CSR, as well as whether i and j appear to be attracted (positive peaks) or repulsed (negative troughs in the plot).

Comparing multiple types of events

To determine the significance of the i-j L plot, the normal approach is to randomly shift one set of events (a toroidal shift) and determine the min/max observed values; if the observed values lie outside of the simulation envelope then the peaks / troughs are assumed to be significant.
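A toroidal shift treats the study area as if it wrapped around at the edges, so a shifted set keeps its internal pattern while its relation to the other set is randomized. The sketch below assumes a rectangular study area and works with the cross K value at a single distance rather than the full L plot; all names and the sample data are hypothetical, and no edge correction is applied.

```python
import math
import random

def toroidal_shift(events, dx, dy, width=1.0, height=1.0):
    """Shift all events by (dx, dy), wrapping around the rectangle edges."""
    return [((x + dx) % width, (y + dy) % height) for x, y in events]

def cross_k(events_i, events_j, h, area):
    """Naive cross K: i-j pairs within h, scaled by area / (n_i n_j)."""
    pairs = sum(1 for a in events_i for b in events_j if math.dist(a, b) <= h)
    return area * pairs / (len(events_i) * len(events_j))

def shift_envelope(events_i, events_j, h, m, rng, area=1.0):
    """Min and max cross K over m random toroidal shifts of the j events."""
    values = []
    for _ in range(m):
        shifted = toroidal_shift(events_j, rng.random(), rng.random())
        values.append(cross_k(events_i, shifted, h, area))
    return min(values), max(values)

# Hypothetical data: each type j event sits right next to a type i event.
rng = random.Random(4)
events_i = [(0.1 + 0.8 * rng.random(), 0.1 + 0.8 * rng.random()) for _ in range(15)]
events_j = [(x + 0.01, y + 0.01) for x, y in events_i]

observed = cross_k(events_i, events_j, 0.03, 1.0)
low, high = shift_envelope(events_i, events_j, 0.03, 19, rng)
print(observed, low, high)
```

An observed value outside the (low, high) range would be read as a significant peak or trough at that distance.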

Comparing multiple types of events

What if you would like to correct for spatial variation in the population at risk? That is, CSR is not a viable option, since there is an expectation that there is some natural spatial variation in the intensity of the process (e.g., population density – disease incidence, crime patterns).

Comparing multiple types of events

The simplest approach is to use kernel smoothing, dividing the surface of the variable of interest (say cancer victims, crime incidents) by a surface of the population at risk (say total population, housing density).

Comparing multiple types of events

A more sophisticated approach is to use a case / control approach. Use the observed events from a second spatial process that is representative of the variation in the population at risk (the control process).

Comparing multiple types of events

K functions are used (similar to the cross K functions described above). In order to determine the significance of the observed pattern, we randomly label the combined events into cases and controls. The values of (K_cases – K_controls) are plotted against h, along with the min/max of the randomly labeled simulation results.
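The random labelling test can be sketched as below for a single distance h. The naive K estimate (no edge correction), the unit-square study area, m = 99 relabellings and the sample data are assumptions; all names are hypothetical.

```python
import math
import random

def k_hat(events, h, area):
    """Naive K estimate: ordered pairs within h, scaled by area / n^2."""
    n = len(events)
    pairs = sum(1 for i in range(n) for j in range(n)
                if i != j and math.dist(events[i], events[j]) <= h)
    return area * pairs / (n * n)

def k_difference(cases, controls, h, area=1.0):
    """K_cases minus K_controls at distance h."""
    return k_hat(cases, h, area) - k_hat(controls, h, area)

def random_label_envelope(cases, controls, h, m, rng, area=1.0):
    """Min and max of the K difference over m random relabellings."""
    combined = cases + controls
    n_cases = len(cases)
    diffs = []
    for _ in range(m):
        shuffled = combined[:]
        rng.shuffle(shuffled)  # randomly reassign the case / control labels
        diffs.append(k_difference(shuffled[:n_cases], shuffled[n_cases:], h, area))
    return min(diffs), max(diffs)

# Hypothetical data: tightly clustered cases, evenly spread controls.
cases = [(0.5 + 0.002 * i, 0.5) for i in range(10)]
controls = [(0.1 + 0.2 * i, 0.125 + 0.25 * j) for i in range(5) for j in range(4)]

observed = k_difference(cases, controls, 0.05)
low, high = random_label_envelope(cases, controls, 0.05, 99, random.Random(5))
print(observed, low, high)
```

Repeating this over a range of h values gives the curve and envelope described above. Here the observed difference lies above the envelope, which suggests the cases cluster beyond what the controls explain.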

Methods for point patterns

I have just touched on a few of the methods that can be used to examine point patterns.

For linear and areal data there is also a similarly wide variety of methods, and once you get into GLMs you will encounter many sophisticated solutions.