Shauna Ewald
Lab 5 Interpolation
GIIS 4850
Oct 12, 2017
Part 1: Data Exploration
Create new geodatabase in ArcCatalog: Lab5Results
In ArcMap (File > Map Document Properties), direct new location for output
Add Data: coloradocounties.shp
Add Data: jan52017.csv (Display XY data in WGS 1984 GCS)
Export Data into Lab5Results as WeatherStations
Geostatistical Analyst > Histogram, attribute: snow, method for handling coincidental sample points: Use Mean
1. Click on the lowest snow total bar to highlight those points. Which area of Colorado had the lowest snow total? Which region has the highest snow total? (.5 points)
Lowest snow total: US1Cort002 station in Routt County
Highest snow total: Increasing the bar parameter of the histogram to 100 still gives 141 stations in the highest frequency group, spread statewide. The only noticeable absences are in the northwest, north central, and central areas of the state.
Part 2: Surface Interpolation
Try interpolating wind speed, elevation, max temperature, and snow totals using IDW
Geostatistical Analyst > Geostatistical Wizard
2. Create three IDW prediction maps - one each for AWND (average wind speed), SNOW, and TMAX (max temperature). Add these maps to your Word document. (1.5 points)
Part 3: Evaluating Interpolation Methods
Geostatistical Analyst > Geostatistical Wizard
Based upon the weather station points, interpolate TAVG (average temperature) by using the four interpolation techniques of IDW, global/local polynomial, RBF and Kriging. Be sure to always use TAVG as your Data Field. When prompted for handling of two points at the same location always select “Use Mean”. Experiment with the interpolation parameters, such as neighborhood size, shape, power, local/global ratio, etc.
TAVG (‘string’) TAVG_1 (Double)
3. Fill out the blank table in the extra word document. Record the mean error and RMSE for each function. Be sure to record the Optimized Power for IDW as well.
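As a rough illustration of the two statistics recorded in the tables, the sketch below computes mean error and RMSE from paired measured and predicted values. The function name, the sign convention (predicted minus measured), and the sample temperatures are assumptions for illustration only; the Geostatistical Wizard reports these figures from its own cross-validation.

```python
import math

def cross_validation_stats(measured, predicted):
    """Return (mean_error, rmse) for paired measured and predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("need two non-empty lists of equal length")
    # Assumed error convention: predicted minus measured.
    errors = [p - m for m, p in zip(measured, predicted)]
    mean_error = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mean_error, rmse

# Made-up temperatures, not values from jan52017.csv.
measured = [20.0, 25.0, 30.0, 15.0]
predicted = [22.0, 24.0, 27.0, 19.0]
mean_error, rmse = cross_validation_stats(measured, predicted)
```

Mean error shows bias (positive and negative errors cancel), while RMSE penalizes large misses, which is why a model can have a mean error near zero and still a high RMSE.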
IDW (Inverse Distance Weighting)
Neighborhood Shape   Max. Number of Neighbors (k)   Power               Mean Error   RMSE
One Sector           16                             1                   0.190        4.088
One Sector           16                             2                   0.196        4.039
One Sector           16                             Optimized (1.678)   0.204        4.029
One Sector           12                             1                   0.152        4.131
One Sector           12                             2                   0.162        4.077
One Sector           12                             Optimized (1.690)   0.170        4.068
Four Sectors         4                              1                   0.0482       4.017
Four Sectors         4                              2                   0.112        4.013
Four Sectors         4                              Optimized (1.678)   0.106        3.986
Polynomial
Method   Exploratory Trend Surface Analysis Value   Order of Polynomial   Mean Error   RMSE
Global   n/a                                        2                     -0.017       4.446
Global   n/a                                        3                     -0.062       4.552
Local    40                                         2                     0.395        4.126
Local    40                                         3                     -0.141       3.826
Local    70                                         2                     1.290        15.951
Local    70                                         3                     88.973       1386.977
RBF (Radial Basis Functions)
Neighborhood Shape          Max. Number of Neighbors (k)   Mean Error   RMSE
One Sector                  16                             0.109        3.746
One Sector                  24                             0.111        3.758
Four Sectors (Cardinal)     4                              0.077        3.743
Four Sectors (Cardinal)     6                              0.083        3.752
Four Sectors (45° Offset)   4                              0.021        3.712
Four Sectors (45° Offset)   6                              0.056        3.721
Ordinary Kriging
Neighborhood Shape        Max. Number of Neighbors (k)   Model Type    Anisotropy   Mean Error   RMSE
One Sector                16                             Spherical     False        -0.145       4.354
One Sector                16                             Exponential   True         0.175        4.349
Four Sectors (Cardinal)   4                              Spherical     False        0.059        3.770
Four Sectors (Cardinal)   4                              Exponential   True         0.059        3.840
4. Produce a map with the best RMSE for each interpolation method. There should be four maps – one for IDW, one for Polynomial, one for RBF, and one for Ordinary Kriging. Include basic map elements (title, legend, scale, attribution including your name). (2.0 points)
I’ve decided to leave the extent as the default rather than show the interpolation for the entire state, so the maps show where TAVG data wasn’t gathered. If I were to show interpolation for the entire state, I would change the Extent to “the rectangular extent of weather stations.”
Part 4. Report and Discussion.
5. Explain the relationship of the following interpolation parameters with RMSE: (1 point)
Power (p) in IDW (eg, does changing the power variable in the IDW raise or lower the RMSE? What does the Optimized Power function do?)
Desktop.arcgis.com defines Parameter Optimization as the process in each model that minimizes the mean square error. Each model is focused on its most important parameter. “The power value of IDW is the only parameter for this interpolation model used in the optimization.” Power dictates the rate at which the weights decrease as the distance from observation points increases. “As p increases, the weights for distant points decrease rapidly. If the p value is very high, only the immediate surrounding points will influence the prediction.”
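Looking at the IDW maps, the maps of higher power look smoother than those of lower power. In my results, RMSE dropped slightly when the power went from 1 to 2, and the optimized power (about 1.68 to 1.69) gave the lowest RMSE for every neighborhood setting.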
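To make the effect of the power value concrete, here is a minimal sketch of how inverse-distance weights could be computed for three neighbors at made-up distances. The function name and the distances are hypothetical, and this is the textbook weighting formula rather than ArcGIS's own code.

```python
def idw_weights(distances, power):
    """Normalized inverse-distance weights; all distances must be > 0."""
    raw = [1.0 / (d ** power) for d in distances]
    total = sum(raw)
    return [w / total for w in raw]

# Three hypothetical neighbors at increasing distance from the prediction point.
distances = [1.0, 2.0, 4.0]
w1 = idw_weights(distances, 1)  # gentle falloff: distant points still count
w2 = idw_weights(distances, 2)  # steeper falloff
w8 = idw_weights(distances, 8)  # nearest point dominates almost entirely
```

As the power rises, the share of weight held by the nearest neighbor grows, which matches the quoted statement that a very high p leaves only the immediate surrounding points with any influence.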
Number of neighbors (k) (Do more neighbors mean a lower or higher RMSE? Why?)
Desktop.arcgis.com: Because things that are close to one another are more alike than those that are farther away, as the locations get farther away, the measured values will have little relationship to the value of the prediction location. To speed calculations, you can exclude the more distant points that will have little influence on the prediction. As a result, it is common practice to limit the number of measured values by specifying a search neighborhood.
I experimented with the number of neighbors in every model except polynomial. Maps with more neighbors look smoother but lack precision.
Neighborhood shape and orientation (Does the shape and orientation of the neighborhood affect RMSE? Why?)
Desktop.arcgis.com: The shape of the neighborhood restricts how far and where to look for the measured values to be used in the prediction … The shape of the neighborhood is influenced by the input data and the surface you are trying to create. If there are no directional influences in your data, you'll want to consider points equally in all directions. To do so, you will define the search neighborhood as a circle. However, if there is a directional influence in your data, such as a prevailing wind, you may want to adjust for it by changing the shape of the search neighborhood to an ellipse with the major axis parallel with the wind … Once a neighborhood shape has been specified, you can restrict which data locations within the shape should be used. You can define the maximum and minimum number of locations to use, and you can divide the neighborhood into sectors. If you divide the neighborhood into sectors, the maximum and minimum constraints will be applied to each sector.
The RBF interpolation took shape into consideration: the “Four Sectors (45° Offset)” parameter produced the map with the lowest RMSE. It is also possible to define the Sector Type in kriging, as well as a direction.
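The sector idea can be sketched as follows: neighbors are grouped by their angle from the prediction location, and only the nearest few in each sector are kept. This is an illustration under stated assumptions, not ArcGIS's implementation. The function name, its parameters, and the convention that sectors start at `offset_deg` measured counterclockwise from east are all hypothetical.

```python
import math
from collections import defaultdict

def select_by_sector(target, points, max_per_sector, n_sectors=4, offset_deg=0.0):
    """Keep the nearest `max_per_sector` points in each angular sector."""
    width = 360.0 / n_sectors
    sectors = defaultdict(list)
    for x, y in points:
        dx, dy = x - target[0], y - target[1]
        angle = math.degrees(math.atan2(dy, dx))  # counterclockwise from east
        # Rotate by the offset, then bin; the final modulo guards rounding at 360.
        index = int(((angle - offset_deg) % 360.0) // width) % n_sectors
        sectors[index].append((math.hypot(dx, dy), (x, y)))
    chosen = []
    for index in sorted(sectors):
        nearest = sorted(sectors[index])[:max_per_sector]
        chosen.extend(point for _, point in nearest)
    return chosen

# Two made-up stations, both in the same quadrant of an unrotated search.
stations = [(1.0, 0.1), (0.1, 2.0)]
cardinal = select_by_sector((0.0, 0.0), stations, 1, offset_deg=0.0)
rotated = select_by_sector((0.0, 0.0), stations, 1, offset_deg=45.0)
```

With these two points, the unrotated search keeps only the nearer station because both fall in one sector, while the 45° rotation places them in different sectors and keeps both. This is one way the sector orientation can change which measured values feed a prediction, and therefore the RMSE.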
6. How do the interpolated temperature surfaces differ among the models? (HINT: examine map edges, the plains and extreme areas like mountains.) Does the temperature change as you would expect? (.5 point)
Every model but polynomial follows the same general pattern by showing colder temperature dipping into the state from the north with the warmest temperatures in the southwestern region of the state. The polynomial model shows warmer temperatures in the northeastern area as well, but that wouldn’t follow the pattern of a cold front moving into the state, so that model would be dismissed.
The other three models also show that cold temperatures dip along the Front Range and follow that particular elevation boundary. Kriging, however, also shows two areas of high temperature, unlike the other two. IDW does show a pocket of higher temperature but not an entire area. Kriging might be showing the San Luis Valley in this second area, but I’m not sure.
7. Pick a temperature surface that you think best represents the data and explain why. Think about the nature of temperature on a day in January (your data). The explanation of your choice is more important than the choice of model itself.
Both IDW and RBF are exact interpolation models; that is, the surface must pass through each measured sample value. But IDW will never predict values above the maximum measured value or below the minimum measured value.
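Both properties of IDW follow from the prediction being a weighted average with positive weights. A minimal sketch, assuming the textbook IDW formula and three made-up stations (the function name and values are hypothetical):

```python
def idw_predict(samples, target, power=2.0):
    """samples: list of ((x, y), value). Returns the IDW estimate at target."""
    num = 0.0
    den = 0.0
    for (x, y), value in samples:
        d2 = (x - target[0]) ** 2 + (y - target[1]) ** 2
        if d2 == 0.0:
            return value  # exact interpolator: reproduce the measured value
        w = 1.0 / d2 ** (power / 2.0)  # same as 1 / distance**power
        num += w * value
        den += w
    return num / den

# Made-up stations with temperatures 10, 30, and 20.
samples = [((0.0, 0.0), 10.0), ((10.0, 0.0), 30.0), ((0.0, 10.0), 20.0)]
at_station = idw_predict(samples, (0.0, 0.0))     # lands on a station
far_away = idw_predict(samples, (100.0, 100.0))   # well outside the stations
```

At a station the estimate equals the measured value, and anywhere else it stays between the lowest and highest measurements, so IDW cannot extrapolate a colder or warmer extreme than the data contain.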
Desktop.arcgis.com recommends RBF when you have a large number of data points with gently varying surfaces. The technique is inappropriate when large changes in the surface values occur within short distances and/or you suspect the sample data is prone to error or uncertainty. The dataset for Weather Stations isn’t large, but I do believe the surface varies gently.
Expanding the extent for both IDW and RBF provides a comparison for an area of unknown observation. The RBF may have a lower RMSE, but the IDW interpolation gave a better, more natural pattern of temperature distribution, with warmer temperatures expanded across the whole southern part of the state instead of a cold spike in the southeast corner.
IDW is more closely related to distance. Temperature does follow a pattern related to distance, in that colder temperatures are found near other cold temperatures.
Do the geostatistics used in Kriging provide insight into the change in temperature?
Desktop.arcgis.com: The deterministic methods include IDW (inverse distance weighting), Natural Neighbor, Trend, and Spline.
The deterministic interpolation methods assign values to locations based on the surrounding measured values and on specified mathematical formulas that determine the smoothness of the resulting surface.
Kriging is a geostatistical method of interpolation.
The geostatistical methods are based on statistical models that include autocorrelation (the statistical relationship among the measured points). Because of this, geostatistical techniques not only have the capability of producing a prediction surface but also provide some measure of the certainty or accuracy of the predictions.
Kriging is divided into two distinct tasks: quantifying the spatial structure of the data and producing a prediction. Quantifying the structure, known as variography, is where you fit a spatial-dependence model to your data. To make a prediction for an unknown value for a specific location, kriging will use the fitted model from variography, the spatial data configuration, and the values of the measured sample points around the prediction location.
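The spatial-dependence model fitted during variography is a semivariogram function of separation distance. The sketch below shows common textbook forms of the two model types used in this lab, spherical and exponential. The parameter names (`nugget`, `partial_sill`, `major_range`) and all numbers are assumptions, and the factor of 3 in the exponential model is one common "practical range" convention rather than something confirmed by the lab materials.

```python
import math

def spherical(h, nugget, partial_sill, major_range):
    """Spherical semivariogram: rises, then levels off exactly at the range."""
    if h <= 0.0:
        return 0.0
    if h >= major_range:
        return nugget + partial_sill
    r = h / major_range
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

def exponential(h, nugget, partial_sill, major_range):
    """Exponential semivariogram: approaches the sill asymptotically."""
    if h <= 0.0:
        return 0.0
    # Assumed convention: about 95% of the partial sill is reached at the range.
    return nugget + partial_sill * (1.0 - math.exp(-3.0 * h / major_range))

# Hypothetical fitted parameters; distances in arbitrary map units.
sph_half = spherical(50.0, 0.0, 10.0, 100.0)
sph_full = spherical(100.0, 0.0, 10.0, 100.0)
exp_full = exponential(100.0, 0.0, 10.0, 100.0)
```

Low semivariance at short distances means nearby stations have similar temperatures. The distance at which the curve flattens suggests how far that similarity extends, which is the kind of insight the deterministic methods do not report.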
8. Answer the following questions: (.5 points)
a. How long did the lab assignment take you to finish? 3-4 hours. It might have taken less time, but I was messing with all the different parameters on models and seeing how to export as a raster.
b. Were there any errors in the assignment that made it difficult for you to finish? I didn’t see any.
c. Where did you get help if you needed it? Desktop.arcgis