Spatial Data Management and Analysis with R

Jeff Hollister

Presented at: USIALE 2013, Austin TX, Student Workshop on Data Management

April 16, 2013

From: Matloff, N. 2012. The Art of R Programming




Is spatial data unique?

• Sometimes yes

– Additional information

• vector

• raster

• graph

• projections

– Additional formats

• shapefiles

• file geodatabase

• TIFF

• KML

• ASCII grid

• etc.

– Additional analysis

• point-pattern

• auto-correlation

• interpolation


Is spatial data unique?

• Sometimes no

– Tabular data

• csv

• spreadsheet

• tables in a relational database

– All data are spatial

• collected somewhere

• Best practices

– Agnostic to whether data are spatial

– Spatial data just have more to keep track of


Common tools

• Esri

• Open source

– GRASS

– QGIS

• Interactive (i.e., point and click)

– Easy to lose track

– Show of hands

• 100% of analysis?

• 50% of analysis?

• None?

• Automate and standardize

– Show of hands

• 100% of analysis?

• 50% of analysis?

• None?


Scientific Workflows

• Definition

– A precise description of scientific procedure, often conceptualized as a series of data ingestion, transformation, and analytical steps (verbatim from Carly’s primer on data management)

– One addition: repeatability

• Specific systems

– Kepler

– Taverna

– Not going to talk more about these

kepler-project.org

www.taverna.org.uk/


Scripting as scientific workflow

• A general implementation:

– Scripting

– Captures steps

– Flexible

– Full description of process (i.e., metadata)

– How to manage data

• You decide (i.e., best practices)

• Script enforces those decisions

www.python.org

cran.us.r-project.org/
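The idea that a script captures the steps and enforces your data-management decisions can be sketched in a few lines of base R. This is only an illustration, not the workshop's script: the file names, column names, values, and the unit conversion are all invented, and the input file is written to a temporary folder so the example is self-contained.

```r
# Hypothetical workflow script: each step is written down, so it can be rerun.
# All file names, column names, and values are invented for illustration.

# 1. Ingest: write a toy CSV to a temporary folder, then read it back
work_dir <- tempdir()
in_file  <- file.path(work_dir, "sites.csv")
writeLines(c("site,lon,lat,temp_f",
             "A,-97.74,30.27,86",
             "B,-97.70,30.30,90",
             "C,-97.80,30.25,"),      # site C has no temperature recorded
           in_file)
sites <- read.csv(in_file, stringsAsFactors = FALSE)

# 2. Transform: convert units and drop incomplete records
#    (the "best practice" decision lives here, in code)
sites$temp_c <- (sites$temp_f - 32) * 5 / 9
clean <- sites[!is.na(sites$temp_c), ]

# 3. Analyze: a simple summary statistic
mean_temp_c <- mean(clean$temp_c)

# 4. Output: save the derived table next to the input
out_file <- file.path(work_dir, "sites_clean.csv")
write.csv(clean, out_file, row.names = FALSE)
```

Because the decision to drop incomplete records is written in step 2, anyone rerunning the script applies it the same way, which is the sense in which the script doubles as metadata.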


Soapbox

• EVERYONE SHOULD KNOW HOW TO WRITE CODE!!!

– In other words, the computer exists to be bent to our will, not the other way around

– How?

• formal courses at your university

• distance learning

– Books

• Practical Computing for Biologists

– MOOCs

• www.coursera.org/course/compdata

• www.coursera.org/course/interactivepython

– Websites

• code.org

• www.programmingforbiologists.org


Back on topic: R and spatial data

• Using R scripts and functions

– control your workflow

• read data

• analyze data

• output (figures, derived spatial data)

– Closely aligned to “Reproducible Research”

• Sweave, knitr ...

Example file: PellysStats.Rnw


Back on topic: R and spatial data

• R as GIS

– Why use R for this?

• Huge user base

• Multi-platform

• Data ready for analysis

• Flexible and Extensible

• De facto??

– R functions ≈ 31,375

– SAS functions = 647

– 2012: “R added more functions/procs than SAS Institute has written in its entire history!” (From: http://www.r-bloggers.com/rs-2012-growth-in-capability-exceeds-sas-all-time-total/)

• FREE!!!!!!

http://r4stats.com/articles/popularity/


How do you use R as a GIS?

• Subject of the hands-on

• Sneak Preview

– Base spatial packages (sp, rgdal, rgeos, raster)

– Others (igraph, maptools, SDMTools, landsat)

– Links to other GISs (spgrass6, RPyGeo)

– In the hands-on we will write a script that does the following:

• Read in raster and vector data

• Simple plots

• Do some analysis

• Create output

• Talk about general programming concepts of variables, flow control, file I/O and touch on functions
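The outline above (read raster and vector data, plot, analyze, write output) can be sketched without any add-on packages. This is a package-free stand-in, not the hands-on script itself: the toy ASCII grid, the point table, and the helper `cell_value()` are all hypothetical, and a real session would more likely use the spatial packages listed above.

```r
# Toy raster: a 3 x 2 ASCII grid written to a temporary file (invented values)
grid_file <- tempfile(fileext = ".asc")
writeLines(c("ncols 3", "nrows 2", "xllcorner 0", "yllcorner 0",
             "cellsize 10", "NODATA_value -9999",
             "1 2 3",
             "4 5 -9999"), grid_file)

# File I/O: read the six header lines, then the cell values
hdr_lines <- strsplit(readLines(grid_file, n = 6), "\\s+")
hdr <- setNames(as.numeric(sapply(hdr_lines, `[`, 2)),
                tolower(sapply(hdr_lines, `[`, 1)))
vals <- scan(grid_file, skip = 6, quiet = TRUE)
vals[vals == hdr[["nodata_value"]]] <- NA
r <- matrix(vals, nrow = hdr[["nrows"]], ncol = hdr[["ncols"]], byrow = TRUE)

# Toy vector data: three points in the same coordinate system
pts <- data.frame(id = c("p1", "p2", "p3"),
                  x = c(5, 25, 25), y = c(15, 15, 5))

# Function: grid value under one point (matrix row 1 is the northern edge)
cell_value <- function(x, y, grid, hdr) {
  col <- floor((x - hdr[["xllcorner"]]) / hdr[["cellsize"]]) + 1
  row <- hdr[["nrows"]] - floor((y - hdr[["yllcorner"]]) / hdr[["cellsize"]])
  grid[row, col]
}

# Flow control: loop over the points and extract a value for each
pts$value <- NA_real_
for (i in seq_len(nrow(pts))) {
  pts$value[i] <- cell_value(pts$x[i], pts$y[i], r, hdr)
}

# Simple plot, sent to a PDF file: grid as an image with points on top
xs <- hdr[["xllcorner"]] + hdr[["cellsize"]] * (seq_len(hdr[["ncols"]]) - 0.5)
ys <- hdr[["yllcorner"]] + hdr[["cellsize"]] * (seq_len(hdr[["nrows"]]) - 0.5)
plot_file <- file.path(tempdir(), "grid_points.pdf")
pdf(plot_file)
image(xs, ys, t(r[nrow(r):1, , drop = FALSE]), xlab = "x", ylab = "y")
points(pts$x, pts$y, pch = 19)
dev.off()

# Output: the points with their extracted values
out_file <- file.path(tempdir(), "points_with_values.csv")
write.csv(pts, out_file, row.names = FALSE)
```

The point that falls on the NODATA cell comes back as `NA`, which is the kind of detail a script makes visible and repeatable.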


What kind of GIS can R do?

• Pretty much anything

– Buffers

– Landscape Metrics

– Distance stuff

– Geostatistics

– Point pattern analysis

– If it doesn’t exist, you could write it yourself!
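As a small illustration of writing it yourself, here is a sketch of a nearest-neighbour distance function in base R. The function name `nearest_neighbor_dist()` and the coordinates are hypothetical, and the sketch assumes planar (projected) coordinates so that Euclidean distance makes sense.

```r
# Hypothetical helper: distance from each point to its nearest neighbour.
# Assumes projected coordinates, so straight-line distance is appropriate.
nearest_neighbor_dist <- function(x, y) {
  d <- as.matrix(dist(cbind(x, y)))  # full pairwise distance matrix
  diag(d) <- Inf                     # ignore each point's distance to itself
  apply(d, 1, min)                   # smallest remaining distance per point
}

# Invented example coordinates
x <- c(0, 3, 10)
y <- c(0, 4, 4)
nn <- nearest_neighbor_dist(x, y)
print(nn)  # 5 5 7
```

Building the full distance matrix is fine for small point sets; for large ones a dedicated package would likely be a better choice.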


Find out more information

• Google (How I figure out most things)

– StackOverflow is always good

• R-sig-geo (less on ecology more on spatial stuff)

• R-sig-ecology (less on spatial stuff more on ecological applications)

• Applied Spatial Data Analysis with R (Bivand et al.)

• CRAN spatial task view: http://cran.r-project.org/view=Spatial

• http://help.nceas.ucsb.edu/r:spatial