Modeling Spatial Relationships with ArcGIS
…chenshi-portfolio.weebly.com/uploads/4/7/4/0/47409953/gis_and_spatial.pdf

Created By: Chen Shi

Purpose of the Project

The purpose of the project is to use three Modeling Spatial Relationships tools (the Exploratory Regression tool, the Ordinary Least Squares tool, and the Geographically Weighted Regression tool) to predict recreation spending per household in one particular category and to identify areas where predicted values are higher than actual ones.

Table 1: Data dictionary containing marked target variable and final predictors.

Using Exploratory Regression Tool

Table 2: Comparing Models from Exploratory Regression Tool with best model marked

Table 3: Summary of Variable Significance

Table 3 Comments: Predictors from the selected model are HealthCare, Child in Family, money spent on art per household, and money spent on communication per household. HealthCare and Child in Family can be considered strong predictors, and their relationships are very stable (most of them are primarily positive). Art and Communication, on the other hand, both have positive (% significance) and negative values.

Table 4: Summary of Variable Multicollinearity

Table 4 Comments: Based on the Summary of Multicollinearity table, none of the predictors listed on the left has a multicollinearity issue.

Table 5: Summary of Residual Normality

Table 5 Comments: This table shows the top three models with the highest p-values for the Jarque-Bera test for normality of residuals. All three values are higher than 0.1, so normality is not rejected and all of these models can be treated as having normally distributed residuals. The higher the p-value, the better a given model is.
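The Jarque-Bera check described in the Table 5 comments can be sketched outside ArcGIS. This is a minimal illustration using only the Python standard library; the function name `jarque_bera` and the sample values are hypothetical, and the p-value uses the usual large-sample chi-squared approximation with 2 degrees of freedom, which may differ slightly from what the ArcGIS tool reports.

```python
import math

def jarque_bera(residuals):
    """Return (JB statistic, approximate p-value) for a list of residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments of the residuals (population form, divided by n).
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # Survival function of a chi-squared variable with 2 degrees of freedom.
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Hypothetical, strongly skewed residuals (not taken from the project data).
skewed = [0.0] * 90 + [10.0] * 10
jb_stat, jb_p = jarque_bera(skewed)
print(f"JB = {jb_stat:.3f}, p = {jb_p:.6f}")
print("normality not rejected" if jb_p > 0.1 else "normality rejected")
```

A p-value above the 0.1 cut-off used in the comments means the test gives no evidence against normally distributed residuals.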

Table 2 Comments: The most important criterion is model fit, followed by collinearity, model simplicity, and normality of residuals. The selected model, with four predictors, is close to the best fit. The selected model 4 also has the second-best JB value, and it is simpler than models 5, 6, 7, and 8, which have a similar fit. Hence, we can conclude that model 4 is the best model among the 8.

Using Ordinary Least Squares Tool

Table 6: Model Variable

Comments: The t-statistic gives a clue about the relative importance of each variable in the model: the stronger the predictor, the larger its absolute t value; the weaker the predictor, the smaller its absolute t value. HealthCare has the largest absolute value, so it is the most useful predictor. Art is the second most useful predictor. Child in Family has the lowest value, which makes it the least useful predictor for this model.
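Ranking predictors by absolute t value can be illustrated with a small ordinary least squares fit. This is a minimal sketch, assuming NumPy is available and using synthetic data, since the Table 6 values are not reproduced in this text. The function name `ols_t_statistics` is hypothetical, and it computes classical (non-robust) t-statistics only.

```python
import numpy as np

def ols_t_statistics(X, y):
    """Fit y = b0 + X @ b by least squares; return (coefficients, t-statistics)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])    # prepend the intercept column
    xtx_inv = np.linalg.inv(design.T @ design)
    beta = xtx_inv @ design.T @ y
    residuals = y - design @ beta
    dof = n - design.shape[1]                    # observations minus parameters
    sigma2 = residuals @ residuals / dof         # residual variance estimate
    std_err = np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta, beta / std_err

# Synthetic example: x1 has a strong effect, x2 a weak one (assumed values).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
y_demo = 1.0 + 5.0 * X_demo[:, 0] + 0.1 * X_demo[:, 1] + rng.normal(size=200)
coefs, t_stats = ols_t_statistics(X_demo, y_demo)

# Rank predictors (skipping the intercept) by absolute t value, largest first.
names = ["x1", "x2"]
ranking = sorted(zip(names, t_stats[1:]), key=lambda item: -abs(item[1]))
for name, t in ranking:
    print(f"{name}: t = {t:.2f}")
```

The predictor at the top of the ranking plays the role HealthCare plays in the comments above.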

Table 7: OLS Diagnostic

Table 8: Scatterplots from PDF created with OLS tool

Using Geographically Weighted Regression

Table 7 Comments: Both the Joint F-statistic and the Joint Wald statistic are measures of overall model statistical significance. The Joint F-statistic is trustworthy only when the Koenker (BP) statistic is not statistically significant. However, BP is 34.663, which is statistically significant, so the Joint Wald statistic should be used instead. Moreover, an asterisk next to a probability indicates that the model is significant at the 5% level, which is desired.
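The Koenker (BP) value of 34.663 can be converted to a p-value by comparing it with a chi-squared distribution. This is a minimal standard-library sketch. It assumes 4 degrees of freedom, one per predictor in the selected model; the degrees of freedom actually reported in Table 7 are not shown in this text. The helper name `chi2_sf_even_df` is hypothetical and only handles even degrees of freedom, for which a closed form exists.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # Sum of (x/2)^i / i! for i = 0 .. df/2 - 1.
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

koenker_bp = 34.663      # value quoted in the Table 7 comments
assumed_df = 4           # assumption: one degree of freedom per predictor
p_value = chi2_sf_even_df(koenker_bp, assumed_df)
print(f"p = {p_value:.2e}")
print("significant at 5%" if p_value < 0.05 else "not significant at 5%")
```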


Table 10 Comments: The z-score is 8.889947; therefore, there is less than a 1% likelihood that this clustered pattern could be the result of random chance. The model's under- and over-predictions are clustered spatially.

Table 9: Histogram of Standardized Residual

Table 10: Spatial Autocorrelation of residuals

Regression equation with variable full names

The regression equation: Recreation = -369.473 + (2291.99 * CHILDFAM) + (0.902 * COMMUNICATION) + (0.764 * HEALTHCARE) + (9.085 * ART)
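The regression equation can be applied directly to one polygon's attribute values, and the result compared with the actual value to flag over-prediction, which is the stated goal of the project. This is a minimal sketch: the function names and the input values are hypothetical, and only the coefficients come from the equation above.

```python
def predict_recreation(childfam, communication, healthcare, art):
    """Predicted recreation spending per household from the OLS equation."""
    return (-369.473
            + 2291.99 * childfam
            + 0.902 * communication
            + 0.764 * healthcare
            + 9.085 * art)

def is_over_predicted(actual, predicted):
    """True when the model predicts more spending than was actually observed."""
    return predicted > actual

# Hypothetical attribute values for a single polygon (not from the data set).
predicted = predict_recreation(childfam=0.5, communication=1000.0,
                               healthcare=2000.0, art=100.0)
actual = 3900.0  # hypothetical observed value
print(f"predicted = {predicted:.3f}, over-predicted = {is_over_predicted(actual, predicted)}")
```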

Table 12: Six GW scenarios with optimal scenario marked

Fig. 1: Bar chart with area in km² for each category

Table 13: Frequency Table & Cross Tab Tool Results

Table 11 Comments: The z-score is -0.57; therefore, the pattern does not appear to be significantly different from random. Model under- and over-predictions are distributed randomly.
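The two z-scores quoted in the Table 10 and Table 11 comments (8.889947 and -0.57) can be turned into two-sided p-values under a standard normal distribution, which is how the "less than 1% likelihood" and "not significantly different from random" readings arise. This is a minimal standard-library sketch; the function name `two_sided_p` is hypothetical.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# z-scores quoted in the Table 10 and Table 11 comments.
for label, z in [("Table 10", 8.889947), ("Table 11", -0.57)]:
    p = two_sided_p(z)
    verdict = "clustered (p < 0.01)" if p < 0.01 else "not significantly different from random"
    print(f"{label}: z = {z}, p = {p:.3g}, {verdict}")
```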

Expenditures on Recreation
Target Variable: Recreation
Legend classes: 1774.670 - 2747.910; 2747.911 - 3333.490; 3333.491 - 4030.380; 4030.381 - 4906.390; 4906.391 - 6545.600
Scale bar: 0, 30, 60, 90 Miles

Using OLS Tools for Predicting Recreation Spending
Target Variable: Recreation (OLS)
Legend classes: Medium high potential; Medium low potential; Moderate potential; Very high potential; Very low potential


Residuals: Standardized Values
Legend (StdResid): Medium high potential; Medium low potential; Medium potential; Very high potential; Very low potential

How well does the model locally fit?
Legend (LocalR2): 0.231028 - 0.575324; 0.575325 - 0.767869; 0.767870 - 0.851444; 0.851445 - 0.914738; 0.914739 - 0.980726

Family With Child Rate: Slope Coefficient
Legend (Importance of ChildFam): -5135.899 - -2576.502; -2576.501 - 187.375; 187.376 - 2089.219; 2089.220 - 4425.692; 4425.693 - 8156.884

Collinearity: Reclassified Values
Legend (Condition Index): Potential Collinearity Problem; Serious Collinearity Problem

Table 11: Spatial Autocorrelation graph

Map (a), Map (b), Map (c), Map (d), Map (e), Map (f)

Study Area Description: The study area of this project was chosen to include the southwest part of Nova Scotia, 371 polygons in total.

Map (c): Local R2 values below 0.5 represent poor models, and values above 0.5 are acceptable. GWR predicts well in most areas. The south and west parts of the study area show low R2 values, which means the selected model is less representative there.

Map (e): The majority of the area has serious or potential collinearity problems, which shows that results are unreliable and unstable.

Map (f): ChildFam is the most useful predictor in the model. For every unit change in ChildFam, the expenditure on recreation per household changes accordingly, with the largest changes in the south and north parts and the least change in the west part of the study area.
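The 0.5 cut-off stated in the Map (c) comment can be expressed as a simple classification of local R2 values. This is a minimal sketch: the function name `classify_local_r2` is hypothetical, the sample values are the lower bounds of the LocalR2 legend classes, and treating a value of exactly 0.5 as acceptable is an assumption, since the comment does not cover that case.

```python
def classify_local_r2(r2, threshold=0.5):
    """Label a local R2 value using the 0.5 cut-off from the Map (c) comment."""
    # Assumption: a value exactly at the threshold counts as acceptable.
    return "acceptable" if r2 >= threshold else "poor"

# Lower bounds of the five LocalR2 legend classes.
legend_lower_bounds = [0.231028, 0.575325, 0.767870, 0.851445, 0.914739]
for value in legend_lower_bounds:
    print(f"{value:.6f}: {classify_local_r2(value)}")
```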

References: 2013 Standard Data Set.pdf
Projection: WGS1984
This product is intended for student training only.
