
Post on 11-Jan-2016


Interpolation for Fishery Habitat

Overview

Part A. Background on Interpolation Techniques

Part B. The Geostatistical Process
– Explore the data
– Fit a model
– Perform diagnostics
– Compare the models

Geostatistical Analyst of ArcGIS

• For advanced surface modeling

• Extension of ArcGIS

• Tools for creating a statistically valid surface

Loading the Geostatistical Extension

[Screenshot: steps 1 to 3 for loading the extension]

Recommended texts

Further reading

• Armstrong, M. 1998. Basic Linear Geostatistics. Springer, Berlin.

• Chiles, J. and Delfiner, P. 1999. Geostatistics: Modeling Spatial Uncertainty. John Wiley and Sons, New York.

• Cressie, N. 1988. Spatial prediction and ordinary kriging. Mathematical Geology 20:405-421. (Erratum, Mathematical Geology 21:493-494.)

• Cressie, N. 1990. The origins of kriging. Mathematical Geology 22:239-252.

• Isaaks, E.H. and Srivastava, R.M. 1989. An Introduction to Applied Geostatistics. Oxford University Press, New York.

• Johnston, K., Ver Hoef, J.M., Krivoruchko, K., and Lucas, N. 2001. Using ArcGIS Geostatistical Analyst. Environmental Systems Research Institute, Redlands, CA.

• Shaw, G. and Wheeler, D. 1994. Statistical Techniques in Geographical Analysis. David Fulton Publishers, London.

Interpolating a surface

• Generate the most accurate surface

• Sample point data as input

• Characterize the error and variability of the predicted surface

Interpolation techniques

• Deterministic
– Use mathematical functions for interpolation
– IDW, global and local polynomial, radial basis

• Geostatistical
– Relies on both statistical and mathematical methods
– Can be used to assess the uncertainty of the predictions

NOTE: Both rely on the similarity of nearby points to create the surface

Deterministic techniques

• Inverse distance weighted

• Global polynomial

• Local polynomial

• Radial basis functions

Inverse distance weighted

• Reasonably accurate if the points are evenly distributed and the surface characteristics do not change across the landscape

• Values of closer points are weighted more heavily than those further away
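As a rough illustration of the weighting idea above, here is a minimal sketch of an IDW prediction at a single location. The function name `idw_predict`, the power of 2, and the sample coordinates and values are all assumptions made for the example; they are not taken from the slides or from the Geostatistical Analyst.

```python
import math

def idw_predict(points, values, target, power=2.0):
    """Inverse distance weighted prediction at one target location."""
    num = 0.0
    den = 0.0
    for (x, y), v in zip(points, values):
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return v              # target coincides with a sample point
        w = 1.0 / d ** power      # closer points get larger weights
        num += w * v
        den += w
    return num / den

# Hypothetical sample points (x, y) and measured values.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
measured = [10.0, 20.0, 30.0, 40.0]

center = idw_predict(samples, measured, (0.5, 0.5))      # equal distances to all four
near_first = idw_predict(samples, measured, (0.1, 0.1))  # dominated by the first sample
```

At the center of the square every sample is equally far away, so the prediction is the plain average; moving toward one sample pulls the prediction toward that sample's value.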

IDW (in the Geostatistical Analyst)

IDW (in the Spatial Analyst)

Global polynomial

• Identify and model local structures and surface trends

• Fit a plane between the sample points

– Plane = first order
– One bend = second order
– Two bends = third order
– Etc.

GP
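The idea of fitting one polynomial surface through all the sample points can be sketched with ordinary least squares. This is an illustrative sketch only: the helper names (`solve`, `terms`, `fit_global_polynomial`, `predict`) and the data are hypothetical, and the code assumes the points are spread well enough that the normal equations are not singular.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def terms(x, y, order):
    """Polynomial terms x^i * y^j with i + j <= order (order 1 is a plane)."""
    return [x ** i * y ** j
            for i in range(order + 1)
            for j in range(order + 1 - i)]

def fit_global_polynomial(points, values, order):
    """Least-squares coefficients of one polynomial surface over all points."""
    rows = [terms(x, y, order) for x, y in points]
    k = len(rows[0])
    # Normal equations (A^T A) c = A^T z.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atz = [sum(r[i] * z for r, z in zip(rows, values)) for i in range(k)]
    return solve(ata, atz)

def predict(coeffs, x, y, order):
    return sum(c * t for c, t in zip(coeffs, terms(x, y, order)))

# Hypothetical data lying exactly on the plane z = 1 + 2x + 3y.
pts = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
vals = [1 + 2 * x + 3 * y for x, y in pts]
plane = fit_global_polynomial(pts, vals, order=1)
z_hat = predict(plane, 3.0, 3.0, order=1)
```

Raising `order` to 2 or 3 adds the squared and cubic terms that allow the one or two bends described on the slide, at the cost of needing more sample points.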

Local polynomial

• Fitting many smaller overlapping planes

LP

Radial basis

• Captures global trends and picks up local variation (bending and stretching of surface to match all the measured values)

RB

Geostatistical methods

• Based on statistical methods, not just mathematical ones

• Include spatial autocorrelation

• Provide a measure of certainty or accuracy
– Kriging
– Cokriging

Principles of Geostatistical Methods

• Unlike the deterministic methods, geostatistics assumes that all values are a result of a random process with dependence

• What does this mean?

Example

• Flip three coins and record whether each is H or T

• The fourth coin will not be flipped; it will be laid down based on what the 2nd and 3rd are

• Rule to lay the 4th:
– If the 2nd and 3rd are both tails, the 4th is the opposite of the 1st; if not, the 4th is the same as the 1st
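The coin rule can be written out directly, which makes the phrase "random process with dependence" concrete: three coins are random, and the fourth is fully determined by them. The function names and the fixed seed below are assumptions chosen for the illustration.

```python
import random

def lay_fourth(first, second, third):
    """Rule from the example: if the 2nd and 3rd are tails, the 4th is the
    opposite of the 1st; otherwise the 4th is the same as the 1st."""
    if second == "T" and third == "T":
        return "H" if first == "T" else "T"
    return first

def flip_four(rng):
    """Flip three coins at random, then lay the fourth according to the rule."""
    coins = [rng.choice("HT") for _ in range(3)]
    coins.append(lay_fourth(*coins))
    return coins

rng = random.Random(0)  # fixed seed, an arbitrary choice for repeatability
sequence = flip_four(rng)
```

Anyone who knows the rule can predict the fourth coin exactly from the first three; someone who does not would have to infer the rule from many observed sequences.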

How does this relate to predicting locations in an interpolation?

• In the coin example, the dependence rules were given

• In reality, the dependence rules are not known

• In geostats, there are two key tasks
– To uncover the dependence rules
– To make predictions

KEY: the predictions come from knowing the dependency rules!

Principles of Geostatistical Methods

• Besides random process with dependence…

• Stationarity
– Mean stationarity: the mean is constant between samples and is independent of location
– Second-order stationarity for covariance: the covariance is the same between any two points that are the same distance and direction apart, no matter which two points you choose
– Intrinsic stationarity for semivariograms: the variance of the difference is the same between any two points that are the same distance and direction apart, no matter which two points you choose
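The three stationarity assumptions are often written compactly as follows. The notation is an assumption introduced here, not something the slides define: Z(s) is the value at location s, h is a separation vector (distance and direction), μ is the constant mean, C is the covariance function, and γ is the semivariogram.

```latex
\begin{align}
  E[Z(s)] &= \mu
    && \text{(mean stationarity)} \\
  \operatorname{Cov}\bigl(Z(s),\, Z(s+h)\bigr) &= C(h)
    && \text{(second-order stationarity)} \\
  \operatorname{Var}\bigl(Z(s) - Z(s+h)\bigr) &= 2\gamma(h)
    && \text{(intrinsic stationarity)}
\end{align}
```

In each case the right-hand side depends only on h, not on the location s, which is what "no matter which two points you choose" expresses.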

Kriging

• In geostats, there are two key tasks
– To uncover the dependence rules: semivariogram and covariance functions
– To make predictions: interpolate areas

Kriging

• Similar to IDW (weights surrounding values to derive a prediction)

• Different in that it incorporates the spatial arrangement among the measured points (must calculate spatial autocorrelation)

Kriging

Cokriging

• Uses information on several variable types

• Requires much more estimation (autocorrelation for each variable and cross-correlations)

Kriging process

1. Calculate the empirical semivariogram

2. Fit a model

3. Make a prediction

Empirical semivariogram

• Tests for spatial autocorrelation (things closer together are more alike)

• Also called spatial modeling, structural analysis, or variography

• Pairs of points that are low on both the x and y axes have more autocorrelation

[Figure: empirical semivariogram; x axis: increasing distance; y axis: increasing dissimilarity]

Fit a Model

• Defining a line (by weighted least squares) that provides the best fit through the points in the empirical semivariogram cloud

• The line is considered a model that quantifies the spatial autocorrelation in the data
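One way to picture the weighted least-squares fit is the sketch below, which fits a spherical semivariogram model to binned semivariogram values. Everything specific here is an assumption for illustration: the choice of the spherical model, the names `spherical` and `fit_spherical`, the grid of candidate ranges, and the use of supplied weights (for example, the number of pairs in each bin). It is not the fitting routine used by the Geostatistical Analyst, and it does not constrain the nugget or sill to be non-negative.

```python
def spherical(h, nugget, partial_sill, range_):
    """Spherical semivariogram model (one common choice of model)."""
    if h >= range_:
        return nugget + partial_sill
    r = h / range_
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

def fit_spherical(lags, gammas, weights, candidate_ranges):
    """Weighted least squares. For a fixed range the model is linear in the
    nugget and partial sill, so each candidate range needs only a 2x2 solve."""
    best = None
    for range_ in candidate_ranges:
        s = [spherical(h, 0.0, 1.0, range_) for h in lags]  # unit-sill shape
        sw = sum(weights)
        sws = sum(w * si for w, si in zip(weights, s))
        swss = sum(w * si * si for w, si in zip(weights, s))
        swg = sum(w * g for w, g in zip(weights, gammas))
        swsg = sum(w * si * g for w, si, g in zip(weights, s, gammas))
        det = sw * swss - sws * sws
        if abs(det) < 1e-12:
            continue  # shape is constant over these lags; cannot separate terms
        nugget = (swg * swss - sws * swsg) / det
        partial_sill = (sw * swsg - sws * swg) / det
        sse = sum(w * (g - nugget - partial_sill * si) ** 2
                  for w, si, g in zip(weights, s, gammas))
        if best is None or sse < best[0]:
            best = (sse, nugget, partial_sill, range_)
    if best is None:
        raise ValueError("no candidate range produced a usable fit")
    return best[1:]

# Hypothetical binned semivariogram generated from a known model.
lags = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
gammas = [spherical(h, 1.0, 4.0, 5.0) for h in lags]
weights = [1.0] * len(lags)
nugget, partial_sill, range_ = fit_spherical(lags, gammas, weights,
                                             [3.0, 4.0, 5.0, 6.0, 7.0])
```

The fitted curve rises with distance and levels off at the range, which is the shape the later slides describe for the yellow model line.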

Make a prediction

• From the kriging weights for the measured values, you can calculate a prediction for the location with the unknown value.
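A minimal sketch of how kriging weights lead to a prediction is shown below for ordinary kriging. The exponential semivariogram and its parameters (no nugget, partial sill 1, range 3), the helper names, and the sample data are all assumptions for the example; in practice the model would come from the fitting step.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def gamma(h, nugget=0.0, partial_sill=1.0, range_=3.0):
    """Hypothetical exponential semivariogram model."""
    if h == 0.0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-3.0 * h / range_))

def ordinary_kriging(points, values, target, model=gamma):
    """Return (prediction, kriging variance, weights) at the target location."""
    n = len(points)
    # Semivariances between samples, bordered by the unbiasedness constraint.
    a = [[model(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [model(math.dist(p, target)) for p in points] + [1.0]
    sol = solve(a, b)
    weights, lagrange = sol[:n], sol[n]
    prediction = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + lagrange
    return prediction, variance, weights

# Hypothetical sample points and measured values.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
measured = [10.0, 20.0, 30.0, 40.0]
pred, var, w = ordinary_kriging(samples, measured, (0.5, 0.5))
pred0, var0, w0 = ordinary_kriging(samples, measured, (0.0, 0.0))
```

The weights sum to one, and the same solve also yields a kriging variance, which is the source of the prediction standard error discussed later.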

Why explore your data?

• To make better decisions when creating a surface

• To gain a better understanding of the data

• Look for obvious errors in the input sample that may drastically affect the output prediction surface

• Examine how the data is distributed

• Look for global trends

Summarizing the Geostatistical Analyst data exploration tools

• Tools to examine the distribution of your data

• Identify trends in the data if any

• Understand the spatial autocorrelation and directional influences

Examining the distribution of data

Tools available in ArcGIS 9 Geostatistical Analyst:

• Histogram
– Look for a normal distribution

• Normal QQPlot
– To compare the data with a normal distribution

• Trend analysis
– To find trends

• Semivariogram/covariance cloud
– To identify spatial autocorrelation

Histogram tool

Make sure the layer and attribute are set

NOTE: if the mean and median are approximately the same value, you have reason to believe your data is normally distributed

Interpolation gives the best results when the data is normally distributed

If the data is skewed (lopsided), you may choose to transform it to make it normal
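The mean-versus-median check and the effect of a transformation can be sketched in a few lines. The sample values, the function name `describe`, and the choice of a log transform are assumptions for illustration; the slides do not say which transformation to use.

```python
import math
import statistics

def describe(values):
    """Mean, median, and skewness (third standardized moment) of a sample."""
    mean = statistics.fmean(values)
    median = statistics.median(values)
    sd = statistics.pstdev(values)
    skew = sum(((v - mean) / sd) ** 3 for v in values) / len(values)
    return {"mean": mean, "median": median, "skewness": skew}

# Hypothetical right-skewed sample: a few large values pull the mean upward.
raw = [1.0, 1.2, 1.5, 2.0, 2.2, 3.0, 4.5, 9.0, 20.0]
before = describe(raw)

# A log transform (valid only for positive values) compresses the right tail.
after = describe([math.log(v) for v in raw])
```

In this made-up sample the mean sits well above the median before the transform, and the skewness is noticeably smaller afterwards.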

Histogram tool

• Important features in the histogram
– Central value, spread, and symmetry

Data is unimodal (one hump) and fairly symmetric, close to a normal distribution

Right tail shows a small number of high ozone values

Normal QQPlot

• Used to compare your distribution to a standard normal distribution

• The closer your data is to the line, the more normally distributed it is

Normal QQPlot

The quantiles from two distributions are plotted against each other; for two identical distributions, the QQPlot will be a straight line

This plot is very close to normal but departs at the selected features
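The points of a normal QQ plot can be computed as in the sketch below. The plotting position (i − 0.5)/n is one common convention and is an assumption here, as are the function name and the sample values; the Geostatistical Analyst may use a different convention.

```python
from statistics import NormalDist, fmean, pstdev

def normal_qq_points(values):
    """Pairs of (standard normal quantile, standardized data quantile)."""
    data = sorted(values)
    n = len(data)
    mean, sd = fmean(data), pstdev(data)
    std_normal = NormalDist()
    points = []
    for i, v in enumerate(data, start=1):
        p = (i - 0.5) / n  # plotting position, one common convention
        points.append((std_normal.inv_cdf(p), (v - mean) / sd))
    return points

# Hypothetical, roughly symmetric sample.
sample = [2.0, 4.0, 4.0, 5.0, 5.0, 5.0, 6.0, 6.0, 8.0]
qq = normal_qq_points(sample)
```

If the data were exactly normal, the pairs would fall along the line y = x; departures in the tails show up as points bending away from that line.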

Identifying global trends

• Enables you to identify the presence/absence of trends in the input dataset

Make sure to set the layer and attribute

Finding trends

• Each “stick” represents the location and height of a data point

• East/West and North/South planes

• Trends are analyzed in these directions

• A best fit line (polynomial) is drawn through the projected points, which models trends in the specific directions

• A flat line indicates no trend

[Figure labels: E to W axis; N to S axis]

Interpretation of the trends

• Values of ozone increase in the east to west direction

• A weaker trend exists in the north to south direction

• “Ozone is low at the coast, higher inland, then tapers off in the mountains”

Definition of semivariogram

• A function that relates dissimilarity of data points to the distance that separates them.

• Its graphical representation can be used to provide a picture of the spatial correlation of data points with their neighbors

Semivariogram/covariance cloud

• Examines the spatial autocorrelation between measured points

• Each red dot is a pair of observations

• X measures distance between the points and Y is the difference squared between the values
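The cloud described above can be built by looping over every pair of observations, and averaging the cloud within distance bins gives an empirical semivariogram. This sketch uses the conventional semivariance, one half of the squared difference; that factor, the function names, and the sample data are assumptions for the example.

```python
import math
from itertools import combinations

def semivariogram_cloud(points, values):
    """One (distance, semivariance) entry per pair of observations."""
    cloud = []
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        semivariance = 0.5 * (values[i] - values[j]) ** 2
        cloud.append((d, semivariance))
    return cloud

def empirical_semivariogram(cloud, lag_size, n_lags):
    """Average the cloud within distance bins of width lag_size.
    Returns (bin center, mean semivariance, number of pairs) per non-empty bin."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for d, semivariance in cloud:
        k = int(d // lag_size)
        if k < n_lags:
            sums[k] += semivariance
            counts[k] += 1
    return [((k + 0.5) * lag_size, sums[k] / counts[k], counts[k])
            for k in range(n_lags) if counts[k] > 0]

# Hypothetical transect where the value simply equals the x coordinate.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
vals = [0.0, 1.0, 2.0, 3.0]
cloud = semivariogram_cloud(pts, vals)
binned = empirical_semivariogram(cloud, lag_size=1.0, n_lags=4)
```

A smaller `lag_size` gives more bins with fewer pairs each, which is the trade-off behind adjusting the lag size when modeling the semivariogram.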

Semivariogram/covariance cloud interpretation

• Points low on both axes represent pairs with higher autocorrelation (low distance between points = they are more alike)

• To test areas (near each other but different), select sectors in the graph

The points are primarily in LA

Semivariogram/covariance for directional influences

• Choose the show search direction checkbox after selecting the points in the graph that are the most different

Semivariogram/covariance cloud interpretation

• This tells us that many of the paired points in the Los Angeles area are much higher than other paired points

• The high values of ozone in the LA area create high semivariance values with locations near and far away

• We can conclude that there is spatial autocorrelation in the data

Summary of what we found exploring data

• Ozone data is close to a normal distribution, unimodal and fairly symmetric around the mean

• The normal QQPlot reaffirmed that the data is normally distributed, so no transformation is necessary

• Trend analysis showed a trend in the southeast to northwest direction where a second order polynomial would best be used

• The semivariogram indicated there is spatial autocorrelation

Side note

• If we were doing cokriging (integrating other variables) we could use the multiple attribute tools

Fit a model

• Now we will use ordinary kriging to incorporate the trend and create a better model to make predictions

• The exploratory phase told us there was a trend; we can remove it with a second-order polynomial to create a more accurate surface

Steps to fit a model

• Choose input data and method

Steps to fit a model

• On the geostatistical method selection dialog, select the Order of Trend Removal and click Second
– A second-order polynomial will be fitted because an upside-down U-shaped curve was detected in the SW to NE direction

Steps to fit a model

• NOTE: trends should only be removed if there is justification for doing so
– Our justification: ozone builds up between the mountains and the coast. Prevailing winds contribute to low values in the mountains and at the coast, and the high concentration of people in this area also contributes to more ozone. The NW to SE trend varies less, reflecting the higher population in the south (LA) versus the lower population to the north.

[Figure labels: NW to SE global trend; SW to NE global trend]

Steps to fit a model

• Semivariogram/Covariance modeling
– The goal is now to find the best fit for a model (yellow line)

Try reducing the lag size to zoom in and model the details of the local variation between neighboring sample points

Notes

• After reducing the lag size, the fitted semivariogram (yellow line) rises sharply and then levels off; the flat area indicates little autocorrelation beyond this point

Notes

• By removing the trend, the semivariogram will model the spatial autocorrelation among data points without having to consider the trend in the data

• The semivariogram model starts low at small distances (things close are more similar) and increases as distances increase (things further away are more dissimilar)

Directional semivariograms

• A directional influence will affect the model fit

• Directional influences are called anisotropy

– Caused by wind, runoff, geological structure, etc

• Can be accounted for in the Geostatistical analyst

• Use the Search Direction Tool
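The effect of a search direction can be imitated by keeping only the pairs of points whose connecting line falls within an angular tolerance of a chosen direction. This is a sketch of the idea only: the function name, the angle convention (degrees counterclockwise from east), the tolerance, and the data are assumptions, and the tool's own search geometry may differ.

```python
import math
from itertools import combinations

def directional_pairs(points, values, direction_deg, tolerance_deg):
    """Semivariogram cloud restricted to pairs aligned with a search direction.
    Angles are measured counterclockwise from the +x (east) axis. A pair's
    direction is axial, so 10 and 190 degrees are treated as the same."""
    selected = []
    for i, j in combinations(range(len(points)), 2):
        dx = points[j][0] - points[i][0]
        dy = points[j][1] - points[i][1]
        angle = math.degrees(math.atan2(dy, dx)) % 180.0
        diff = abs(angle - (direction_deg % 180.0))
        diff = min(diff, 180.0 - diff)
        if diff <= tolerance_deg:
            semivariance = 0.5 * (values[i] - values[j]) ** 2
            selected.append((math.hypot(dx, dy), semivariance))
    return selected

# Hypothetical 3 x 3 grid whose values change only in the east-west direction.
pts = [(float(x), float(y)) for x in range(3) for y in range(3)]
vals = [x for x, _ in pts]

east_west = directional_pairs(pts, vals, direction_deg=0.0, tolerance_deg=10.0)
north_south = directional_pairs(pts, vals, direction_deg=90.0, tolerance_deg=10.0)
```

In this made-up grid the north-south pairs show no dissimilarity at any distance, while the east-west pairs do, which is the kind of contrast anisotropy refers to.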

Steps for directional semivariograms

• Check Show Search Direction (only the points in that direction are displayed)

• Click and hold the cursor on the center line in the search direction, then move the direction of the search tool
– As you change the direction of the search, note how the semivariogram changes

[Screenshot label: click the center and move directions]

Accounting for the directional influences

• To actually account for the directional influences, you must calculate the anisotropic semivariogram or covariance model

• Check Anisotropy

A blue ellipse appears that indicates the range of the semivariogram in different directions; in this case the major axis is in the NNW-SSE direction

Incorporating the anisotropy

• Type the parameters as shown for the search direction to make the directional pointer correspond with the minor axis of the anisotropic ellipse, then click Next

Result to this point

• Now we have a fitted model that describes the spatial autocorrelation, taking into account de-trending and directional influences in the data

• This information, together with the measured values at locations around the prediction location, is used to make the predictions

Searching neighborhood

• It is common practice to limit the data used for the prediction

• Most times, the defaults are fine in this step

[Screenshot labels: locations used and their weights; search neighborhood; crosshairs define the prediction location; number of points used to predict a value at an unmeasured location; minimum number of points to use; the shape type can be set too]
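Limiting the data used for a prediction can be sketched as picking the nearest samples inside a circular neighborhood. The function name, the circular shape, the default counts, and the fallback to the nearest points when too few lie inside the circle are all assumptions for the example, not the documented behavior of the dialog.

```python
import math

def search_neighborhood(points, target, radius, max_neighbors=5, min_neighbors=2):
    """Indices of the sample points used to predict at the target location."""
    by_distance = sorted(range(len(points)),
                         key=lambda i: math.dist(points[i], target))
    inside = [i for i in by_distance if math.dist(points[i], target) <= radius]
    chosen = inside[:max_neighbors]
    # Assumed fallback: if too few points fall inside, use the nearest ones.
    if len(chosen) < min_neighbors:
        chosen = by_distance[:min_neighbors]
    return chosen

# Hypothetical sample locations.
pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
wide = search_neighborhood(pts, (0.0, 0.0), radius=3.0, max_neighbors=2)
narrow = search_neighborhood(pts, (0.0, 0.0), radius=0.5, min_neighbors=2)
```

The selected indices would then be passed to the interpolator in place of the full data set.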

Cross validation

• Gives an idea of “how well” the model predicts the unknown values

• It works by omitting a point, predicting its value using the rest of the data, and then comparing the measured and predicted values

• This lets you know whether the results are reasonable enough to make a surface

[Screenshot label: summary stats]

Cross validation

• The purpose is to help you make an informed decision about which model provides the most accurate predictions

• For an accurate model:
– The mean error should be close to 0
– The root mean square error and average standard error should be as small as possible
– The root mean square standardized error should be close to 1
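The leave-one-out procedure and the summary statistics listed above can be sketched as follows. The predictor used here is a deliberately simple placeholder (the sample mean with the sample standard deviation as its standard error), not kriging; the function names and data are assumptions, and the average standard error is computed as the root of the mean squared standard error, which is one common definition.

```python
import math
import statistics

def cross_validate(points, values, predictor):
    """Leave-one-out cross validation.
    predictor(points, values, target) must return (prediction, standard_error)."""
    errors, std_errors = [], []
    for i in range(len(points)):
        rest_pts = points[:i] + points[i + 1:]
        rest_vals = values[:i] + values[i + 1:]
        pred, se = predictor(rest_pts, rest_vals, points[i])
        errors.append(pred - values[i])
        std_errors.append(se)
    n = len(errors)
    return {
        "mean_error": sum(errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "average_standard_error": math.sqrt(sum(s * s for s in std_errors) / n),
        "rms_standardized": math.sqrt(
            sum((e / s) ** 2 for e, s in zip(errors, std_errors)) / n),
    }

def mean_predictor(points, values, target):
    """Placeholder predictor (not kriging): sample mean and sample std dev."""
    return statistics.fmean(values), statistics.stdev(values)

# Hypothetical data; the placeholder predictor ignores the locations.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
stats = cross_validate(pts, vals, mean_predictor)
```

Running the same function with two different predictors gives two sets of statistics that can be compared against the criteria in the list above.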

Cross validation tabs to see other graphs

QQPlot

• Some values fall slightly above the line and some below, but most are very close to the dashed line, indicating that the prediction errors are close to being normally distributed

• You can save the results for further analysis if you want

Last step, output summary, click OK to create map

Add to display and name “trend removed”

Create a prediction standard error surface

• Can be used to examine the quality of the predictions

• Quantifies the uncertainty for each location in the surface you created

Right click on layer

Note how areas further from the points had more error

Compare the models

• If more than one surface is produced, the results can be compared and a decision made as to which provides the better predictions of unknown values

• Cross validation is used to compare different modeled surfaces

Steps to compare the models

• We will compare the trend removed model to the default layer

• Right-click the trend removed layer and select Compare

Comparing the models

• The trend removed model is better because:
– The root mean square prediction error is smaller
– The root mean square standardized prediction error is closer to one
– The mean prediction error is also closer to zero

Mapping Tip: Extending the interpolation

• By default, the interpolation only extends to the extent of the points

Extent of input points

Extending the interpolation

• Right click the interpolated grid surface to set the extent to..

Display interpolated surface for study area polygon

• Set the data frame extent to the polygon study area layer

[Screenshot: steps 1 to 5]

Online stat papers

Applications

Lake Erie bathymetric data

Cheat Lake bathymetry data

Questions / Comments?
