Introduction to Spatial Econometrics using R

Tse-Chuan Yang, Ph.D.
The Geographic Information Analysis Core
Population Research Institute, Social Science Research Institute
Pennsylvania State University
March 2013



Overview

What are spatial data and analysis?

Why is a spatial perspective important?

Exploratory spatial analysis

Explanatory spatial analysis

Demonstration using R

Conclusions and caveats

Goals

To realize why spatial analysis is needed when using ecological data

To understand the fundamentals of spatial econometrics modeling

To facilitate the use of R in your future work

Why Does Space Matter?

Arguably, everything on earth can be spatially referenced, and an individual's daily life is shaped by spatial factors.

The dynamics between individuals and their environment (space) are drawing increasing attention in the social sciences.

Demography is inherently a spatial social science (Voss, 2007).

Social data are special because of dependence across space.

Types of Spatial Data

Shapefiles: point, line, polygon

What is Spatial Analysis?

Spatial analysis can be generally divided into (Weeks, 2004):
  The analysis that puts people into place
  The analysis that concerns the associations among observations
  Hierarchical modeling approach (hierarchical data structure)
  Spatial econometrics approach (flat data structure)

Why is a Spatial Perspective Important?

Spatial homogeneity (dependence) and heterogeneity may bias the estimates in the traditional analysis approach (Voss et al., 2006).

Using a spatial perspective enhances the understanding of how neighbors matter.

A spatial perspective better reflects the real world as people are not confined by administrative boundaries.

How Do We Analyze Spatial Data?

Exploratory spatial data analysis (ESDA):
  Visualizing key variables
  Testing spatial dependence to gain statistical evidence
  Identifying spatial clustering patterns

Explanatory spatial data analysis (spatial econometrics approach):
  Spatial lag model (endogenous interaction relationships)
  Spatial error model (correlated relationships)
  Generalized spatial model (considering both spatial lag and spatial error)

ESDA: Visualization

Visualization is a fundamental aspect of ESDA and provides a basic understanding of the data.

ESDA: Testing Spatial Dependence

The goal is to find statistical evidence for visual inspections:
  Global measures (across the entire research region): Moran's I; Geary's C; Getis-Ord G statistic
  Local measures (specific to each observation): Local Indicator of Spatial Association (LISA)
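As a rough illustration of the global and local measures, the sketch below computes Moran's I and its local (LISA) counterpart by hand in base R. The 5 x 5 grid, the toy variable `x`, and the helper `moran_i` are assumptions made for this example and are not part of the original demonstration; in practice a package routine such as `moran.test` in spdep would typically be used.

```r
# Global and local Moran's I computed by hand in base R.
# Toy data (assumed): a 5 x 5 grid whose values rise smoothly from one corner,
# so neighboring cells carry similar values.
nr <- 5; nc <- 5; n <- nr * nc
cell_row <- rep(1:nr, each = nc)
cell_col <- rep(1:nc, times = nr)
x <- cell_row + cell_col

# Rook contiguity: two cells are neighbors when they share an edge,
# i.e. they are exactly one horizontal or vertical step apart.
steps <- abs(outer(cell_row, cell_row, "-")) + abs(outer(cell_col, cell_col, "-"))
W <- (steps == 1) * 1
W <- W / rowSums(W)            # row-standardize: each row sums to 1

# Global Moran's I: (n / S0) * (z' W z) / (z' z), with z the centered variable
# and S0 the sum of all weights.
moran_i <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * as.numeric(t(z) %*% W %*% z) / sum(z^2)
}
I_global <- moran_i(x, W)

# Local Moran's I for each cell. With row-standardized weights,
# the local values average to the global statistic.
z <- x - mean(x)
I_local <- as.numeric(z * (W %*% z)) / (sum(z^2) / n)

# Permutation test: shuffle the values across cells to build a
# reference distribution under no spatial dependence.
set.seed(1)
I_perm <- replicate(999, moran_i(sample(x), W))
p_value <- (sum(I_perm >= I_global) + 1) / (999 + 1)
```

A positive `I_global` with a small `p_value` suggests that similar values cluster in space; the entries of `I_local` indicate which cells contribute most to that pattern.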

Spatial Econometrics

Spatial dependence and heterogeneity often, if not always, violate the statistical assumptions used in the traditional analysis approach (LeSage and Pace, 2009):
  Independence
  Constancy

Spatial econometrics is arguably the most common approach to handling spatial dependence and, to some extent, spatial heterogeneity.

Spatial Structure

The spatial weight matrix is specified a priori:
  Spatial contiguity approach (polygon)
  Distance-based approach (point)
  K-nearest neighbor approach (point)

There is no agreement on which approach is the most appropriate; the choice is made, often arbitrarily, by the researcher (Leenders, 2002; Beck et al., 2006).
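The two point-based approaches can be sketched in base R as follows. The coordinates are randomly generated stand-ins (for example, for county centroids), and the cutoff distance and the value of `k` are arbitrary assumptions chosen only for illustration.

```r
# Hypothetical point coordinates; these are simulated, not real data.
set.seed(42)
coords <- cbind(x = runif(20), y = runif(20))
D <- as.matrix(dist(coords))          # pairwise Euclidean distances

# Distance-based weights: points within a cutoff distance are neighbors.
cutoff <- 0.3                         # assumed threshold
W_dist <- (D > 0 & D <= cutoff) * 1

# K-nearest-neighbor weights: each point's k closest points are its neighbors.
k <- 4                                # assumed number of neighbors
W_knn <- matrix(0, nrow(D), ncol(D))
for (i in seq_len(nrow(D))) {
  nearest <- order(D[i, ])[2:(k + 1)] # position 1 is the point itself
  W_knn[i, nearest] <- 1
}
```

The two schemes behave differently: distance-based weights are symmetric but can leave isolated points with no neighbors at all, whereas k-nearest-neighbor weights give every point exactly `k` neighbors but are generally not symmetric.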

Spatial Weight Matrix (Contiguity)

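In the diagrams below, i is the focal unit, j marks its first-order neighbors, k marks its second-order neighbors, and a dot marks a unit that is not a neighbor.

Rook's spatial weight matrix:

. j .
j i j
. j .

Queen's spatial weight matrix:

j j j
j i j
j j j

Second-order neighbors (Rook's case):

. . k . .
. k j k .
k j i j k
. k j k .
. . k . .

The spatial weight matrix can be quite messy in practice.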
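The contiguity patterns on this slide can be reproduced on a regular grid with a few lines of base R. This is only a sketch: the 5 x 5 grid and the object names are assumptions for illustration, and real polygon data would normally be handled by a package function such as `poly2nb` in spdep.

```r
# Contiguity weights on a 5 x 5 grid of cells, numbered row by row.
nr <- 5; nc <- 5
cell_row <- rep(1:nr, each = nc)
cell_col <- rep(1:nc, times = nr)
dr <- abs(outer(cell_row, cell_row, "-"))   # row distance between cells
dc <- abs(outer(cell_col, cell_col, "-"))   # column distance between cells

W_rook  <- (dr + dc == 1) * 1               # neighbors share an edge
W_queen <- (pmax(dr, dc) == 1) * 1          # neighbors share an edge or a corner

# Second-order rook neighbors: cells reachable in two rook steps,
# excluding first-order neighbors and the cell itself.
W_rook2 <- (W_rook %*% W_rook > 0) * 1
W_rook2[W_rook == 1] <- 0
diag(W_rook2) <- 0

# Row-standardized version, the form usually passed to spatial models.
W_rook_std <- W_rook / rowSums(W_rook)

center <- 13                                # the middle cell of the grid
neighbor_counts <- c(rook   = sum(W_rook[center, ]),
                     queen  = sum(W_queen[center, ]),
                     rook2  = sum(W_rook2[center, ]))
```

For the middle cell, `neighbor_counts` gives 4 rook neighbors, 8 queen neighbors, and 8 second-order rook neighbors, which matches the number of j and k cells in the three diagrams.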

Spatial Regression Models

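Spatial lag model (how the dependent variable is related across spatial units):

$$M = \rho W M + X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I)$$

Spatial error model (the impact of unknown factors in the spatial structure):

$$M = X\beta + u, \qquad u = \lambda W u + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I)$$

Generalized spatial model (a mix of both lag and error):

$$M = \rho W_1 M + X\beta + u, \qquad u = \lambda W_2 u + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I)$$

Here $M$ is the dependent variable, $X$ the matrix of independent variables, $W$ (or $W_1$, $W_2$) the spatial weight matrix, $\rho$ the spatial lag parameter, and $\lambda$ the spatial error parameter.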
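To make the spatial lag model concrete, the sketch below simulates data from M = ρWM + Xβ + ε on a grid and recovers ρ by maximizing the concentrated log-likelihood in base R. The grid, the true parameter values, and all object names are assumptions for illustration only; this is not the demonstration's code, and in practice a routine such as `lagsarlm` (spatialreg, formerly in spdep) would normally be used.

```r
# Simulate a spatial lag model on a 15 x 15 grid and estimate it by ML.
set.seed(123)
nr <- 15; nc <- 15; n <- nr * nc
cell_row <- rep(1:nr, each = nc)
cell_col <- rep(1:nc, times = nr)
steps <- abs(outer(cell_row, cell_row, "-")) + abs(outer(cell_col, cell_col, "-"))
W <- (steps == 1) * 1
W <- W / rowSums(W)                       # row-standardized rook weights

rho_true  <- 0.6                          # assumed spatial lag parameter
beta_true <- c(1, 2)                      # assumed intercept and slope
X   <- cbind(1, rnorm(n))
eps <- rnorm(n)

# Reduced form: M = (I - rho W)^{-1} (X beta + eps)
M <- solve(diag(n) - rho_true * W, X %*% beta_true + eps)

# OLS that ignores the spatial lag, kept for comparison.
b_ols <- solve(crossprod(X), crossprod(X, M))

# Concentrated log-likelihood in rho (constants dropped):
# log|I - rho W| - (n / 2) * log(sigma2_hat(rho))
conc_loglik <- function(rho) {
  AM <- M - rho * (W %*% M)               # spatially filtered outcome
  b  <- solve(crossprod(X), crossprod(X, AM))
  e  <- AM - X %*% b
  log_det <- as.numeric(determinant(diag(n) - rho * W, logarithm = TRUE)$modulus)
  log_det - (n / 2) * log(sum(e^2) / n)
}

fit      <- optimize(conc_loglik, interval = c(-0.99, 0.99), maximum = TRUE)
rho_hat  <- fit$maximum
beta_hat <- solve(crossprod(X), crossprod(X, M - rho_hat * (W %*% M)))
```

Comparing `b_ols` with `beta_hat` shows how leaving out the spatial lag distorts the coefficient estimates when ρ is not zero. The spatial error model can be handled along similar lines by filtering both M and X with (I - λW).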

Demonstration

Using R to Analyze Mortality Data
  County-level mortality data (1998-2002)
  Independent variables drawn from the 2000 Census

Tasks:
  Load necessary R packages
  Read the shapefile containing data
  Visualize the dependent variable and save it as a figure
  Generate spatial weight matrix using the shapefile
  Test spatial dependence (both global and local)
  Examine if a spatial perspective is better
  Implement spatial econometrics models
  Conduct model comparisons
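The task list might translate into R roughly as follows. This is a sketch under loud assumptions, not the demonstration's actual script: the county mortality shapefile is not available here, so the Columbus, Ohio example shapefile shipped with the spData package stands in for it, with `CRIME` as the dependent variable and `INC` and `HOVAL` as independent variables. It also assumes the sf, spdep, spatialreg, and spData packages are installed; the model-fitting functions now live in spatialreg, although in 2013 they were part of spdep.

```r
library(sf)          # reading shapefiles
library(spdep)       # neighbors, weights, Moran's I
library(spatialreg)  # spatial regression models

# Read the shapefile (assumption: spData's Columbus example, not mortality data).
shp <- system.file("shapes/columbus.shp", package = "spData")
columbus <- st_read(shp, quiet = TRUE)

# Visualize the dependent variable and save the map as a figure.
png("crime_map.png")
plot(columbus["CRIME"])
dev.off()

# Generate a queen-contiguity, row-standardized spatial weight matrix.
nb <- poly2nb(columbus, queen = TRUE)
lw <- nb2listw(nb, style = "W")

# Test spatial dependence, globally and locally.
global_test <- moran.test(columbus$CRIME, lw)
lisa <- localmoran(columbus$CRIME, lw)

# Examine whether a spatial perspective is needed: OLS residual Moran test.
f   <- CRIME ~ INC + HOVAL
ols <- lm(f, data = columbus)
resid_test <- lm.morantest(ols, lw)

# Implement the spatial econometrics models.
lag_fit <- lagsarlm(f, data = columbus, listw = lw)    # spatial lag
err_fit <- errorsarlm(f, data = columbus, listw = lw)  # spatial error
sac_fit <- sacsarlm(f, data = columbus, listw = lw)    # lag and error combined

# Conduct model comparisons (lower AIC indicates a better trade-off).
model_aic <- AIC(ols, lag_fit, err_fit, sac_fit)
```

With real data, the file path, variable names, and formula would be replaced by those of the mortality shapefile, and the choice of queen contiguity is only one of the weighting options.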

Caveats

Modifiable areal unit problem (Openshaw and Taylor, 1979)

The choice of spatial weight matrix

The link between spatial modeling and social theories

A lot more!

Conclusions

Spatial modeling should become the “conventional analysis approach” when dealing with ecological data.

Spatial econometrics has paid relatively little attention to generalized linear modeling (non-continuous outcomes).

Spatial econometrics largely deals with cross-sectional data, though the methodological framework for spatial panel data is available.

R is good at statistical analysis, but for visualization, dedicated GIS programs may be better.