Spatial Data Management and Analysis with R
Jeff Hollister
Presented at: USIALE 2013, Austin TX, Student Workshop on Data Management
April 16, 2013
From: Matloff, N. 2012. The Art of R Programming
Is spatial data unique?
• Sometimes yes
– Additional information
• vector • raster • graph • projections
– Additional formats
• shapefiles • file geodatabase • tiff • kml • ascii grid • etc.
– Additional analysis
• point-pattern • auto-correlation • interpolation
newsletter.flatworldknowledge.com www.extension.org www.eisbox.net
Is spatial data unique?
• Sometimes no
– Tabular data
• csv
• spreadsheet
• tables in a relational database
– All data are spatial
• collected somewhere
• Best practices
– agnostic (the same practices apply whether or not the data are spatial)
– just have more to keep track of
Common tools
• Esri
• Open source
– GRASS
– QGIS
• Interactive (i.e., point and click)
– Easy to lose track
– Show of hands
• 100% of analysis?
• 50% of analysis?
• None?
• Automate and standardize
– Show of hands
• 100% of analysis?
• 50% of analysis?
• None?
Scientific Workflows
• Definition
– a precise description of scientific procedure, often conceptualized as a series of data ingestion, transformation, and analytical steps (verbatim from Carly’s primer on data management)
– One addition: repeatability
• Specific systems
– Kepler
– Taverna
– Not going to talk more about these
kepler-project.org
www.taverna.org.uk/
Scripting as scientific workflow
• A general implementation:
– Scripting
– Captures steps
– Flexible
– Full description of process (i.e., metadata)
– How to manage data
• You decide (i.e., best practices)
• Script enforces those decisions
www.python.org
cran.us.r-project.org/
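To make "script enforces those decisions" concrete, here is a minimal sketch in base R. The site table, the two decisions, and the helper names check_coords and write_step are all hypothetical, made up for illustration; they are not from the workshop materials.

```r
# Hypothetical site table; in practice this would be read from a file.
sites <- data.frame(
  site_id = c("A1", "A2", "A3"),
  lon = c(-97.74, -97.80, -197.00),  # last longitude is deliberately invalid
  lat = c(30.27, 30.30, 30.25)
)

# Decision 1 (assumed): coordinates must be valid decimal degrees.
check_coords <- function(df) {
  ok <- df$lon >= -180 & df$lon <= 180 & df$lat >= -90 & df$lat <= 90
  df[ok, , drop = FALSE]
}

# Decision 2 (assumed): every step writes a CSV named after the step,
# all in one output directory.
write_step <- function(df, step, out_dir) {
  path <- file.path(out_dir, paste0(step, ".csv"))
  write.csv(df, path, row.names = FALSE)
  path
}

out_dir <- tempdir()
clean <- check_coords(sites)
clean_path <- write_step(clean, "01_clean_sites", out_dir)
```

Because the rules live in functions, every run applies them the same way, and the script itself documents what was done to the data.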
Soapbox
• EVERYONE SHOULD KNOW HOW TO WRITE CODE!!!
– In other words, the computer exists to be bent to our will, not the other way around
– How?
• formal courses at your university
• distance learning
– Books
• Practical Computing for Biologists
– MOOCs
• www.coursera.org/course/compdata
• www.coursera.org/course/interactivepython
– Websites
• code.org
• www.programmingforbiologists.org
Back on topic: R and spatial data
• Using R scripts and functions
– control your workflow
• read data
• analyze data
• output (figures, derived spatial data)
– Closely aligned to “Reproducible Research”
• Sweave, knitr ...
Example file: PellysStats.Rnw
Back on topic: R and spatial data
• R as GIS
– Why use R for this?
• Huge user base
• Multi-platform
• Data ready for analysis
• Flexible and Extensible
• De facto??
– R functions ≈ 31,375
– SAS functions = 647
– 2012: “R added more functions/procs than SAS Institute has written in its entire history!” (From: http://www.r-bloggers.com/rs-2012-growth-in-capability-exceeds-sas-all-time-total/)
• FREE!!!!!!
http://r4stats.com/articles/popularity/
How do you use R as a GIS?
• Subject of the hands-on
• Sneak Preview
– Base spatial packages (sp, rgdal, rgeos, raster)
– Others (igraph, maptools, SDMTools, landsat)
– Links to other GISs (spgrass6, RPyGeo)
– In the hands-on we will write a script that does the following:
• Read in raster and vector data
• Simple plots
• Do some analysis
• Create output
• Talk about general programming concepts of variables, flow control, file I/O and touch on functions
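The programming concepts listed above (variables, flow control, file I/O, functions) can be sketched with base R alone. This is not the hands-on script. It stands in for the raster package by parsing a tiny ESRI ASCII grid by hand, and it assumes the common six-line header. The grid values and the helper name read_ascii_grid are hypothetical.

```r
# File I/O: write a tiny ESRI ASCII grid so the example is self-contained.
grid_file <- tempfile(fileext = ".asc")
writeLines(c(
  "ncols 3",
  "nrows 2",
  "xllcorner 0",
  "yllcorner 0",
  "cellsize 10",
  "NODATA_value -9999",
  "1 2 3",
  "4 -9999 6"
), grid_file)

# Function: read the six header lines, then the cell values as a matrix.
read_ascii_grid <- function(path) {
  header <- read.table(path, nrows = 6, col.names = c("key", "value"),
                       stringsAsFactors = FALSE)
  meta <- setNames(header$value, tolower(header$key))
  values <- scan(path, skip = 6, quiet = TRUE)
  values[values == meta[["nodata_value"]]] <- NA  # flag missing cells
  matrix(values, nrow = meta[["nrows"]], ncol = meta[["ncols"]], byrow = TRUE)
}

# Variable: the grid now lives in memory as an ordinary matrix.
grid <- read_ascii_grid(grid_file)

# Flow control: loop over rows and average the valid cells in each.
row_means <- numeric(nrow(grid))
for (i in seq_len(nrow(grid))) {
  row_means[i] <- mean(grid[i, ], na.rm = TRUE)
}

# Output: save the derived summary as a table.
out_file <- tempfile(fileext = ".csv")
write.csv(data.frame(row = seq_len(nrow(grid)), mean = row_means),
          out_file, row.names = FALSE)
```

In real work the spatial packages handle file formats, projections, and plotting; the point here is only the shape of a script: read, analyze, write.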
What kind of GIS can R do?
• Pretty much anything
– Buffers
– Landscape Metrics
– Distance stuff
– Geostatistics
– Point pattern
– If it doesn’t exist, you could write it yourself!
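As a small illustration of "write it yourself", here is a sketch of one piece of "distance stuff": great-circle distance between longitude/latitude points using the haversine formula. It assumes a spherical Earth with a mean radius of 6371 km, and the function name haversine_km is hypothetical. In practice an existing spatial package would normally be the better choice.

```r
# Great-circle distance in km between points given in decimal degrees.
# Assumes a sphere; radius_km defaults to an approximate mean Earth radius.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin guards against rounding pushing sqrt(a) slightly above 1
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Equator to the pole along one meridian is a quarter of the circumference.
quarter <- haversine_km(0, 0, 0, 90)
```

The function is vectorized, so passing vectors of coordinates returns one distance per pair of points.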
Find out more information
• Google (How I figure out most things)
– StackOverflow is always good
• R-sig-geo (less on ecology, more on spatial stuff)
• R-sig-ecology (less on spatial stuff, more on ecological applications)
• Applied Spatial Data Analysis with R (Bivand et al.)
• CRAN spatial task view: http://cran.r-project.org/view=Spatial
• http://help.nceas.ucsb.edu/r:spatial