Expanding Open Data Horizons with R and RStudio
ottawa.theodi.org
Data Process
● Define problem or question
● Get the data
● Clean the data
● Explore the data
● Analyze the data
● Communicate results
Case Studies
● Seoul Subway Data
● 2015 Canadian Federal Election
● Weather Impact on Ottawa Cycling
● Ottawa 311 Data
● Ottawa Crime Statistics
Case study code on GitHub: https://github.com/robscottd/OpenDataInAction
Get the Data
The “Good” of getting open data:
Centralized government repositories
Multiple standard formats
The “Bad” of getting open data:
Data is too “clean”, value scrubbed out
Rate-limited/complex APIs
Get the Data Using R
R basics: read.table, read.csv, read.csv2
Packages: readr, rvest, RSelenium, readxl, rjson
Using APIs: twitteR, httr, jsonlite
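A minimal sketch of the base R and JSON routes listed above. The inline CSV and JSON text, the column names, and the values are made up for illustration; they stand in for a downloaded file or an API response and are not taken from the case studies.

```r
# Inline CSV text stands in for a downloaded open-data file (made-up values).
csv_text <- "station,date,riders
A,2016-01-01,120
B,2016-01-01,95
A,2016-01-02,130"

# Base R: read.csv() accepts a file path, a URL, or (as here) inline text.
riders <- read.csv(text = csv_text, stringsAsFactors = FALSE)
riders$date <- as.Date(riders$date)
str(riders)

# API responses usually arrive as JSON; jsonlite converts them to data frames.
# With a live API, httr::GET() would fetch the response and
# httr::content(resp, as = "text") would supply the JSON text.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  json_text <- '[{"ward": 1, "requests": 42}, {"ward": 2, "requests": 17}]'
  requests <- jsonlite::fromJSON(json_text)
  print(requests)
}
```

The jsonlite step is guarded so the sketch still runs where that package is not installed.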
Clean the Data
Tidy the data! Follow Hadley Wickham’s method
Address extreme outliers
Explore outliers graphically (Shiny Gadgets example)
Address missing values
Imputation through MICE, missForest, Hmisc
Or remove the incomplete observations
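The cleaning steps above could look roughly like this in base R. The rider counts are made up, and the median fill is only a simple stand-in for the model-based imputation that mice or missForest provide; it is not their method.

```r
# Made-up daily rider counts in "wide" form: one column per station.
wide <- data.frame(
  day       = 1:4,
  station_a = c(120, 135, NA, 128),
  station_b = c(900, 131, NA, 126)
)

# Tidy form: one row per observation (day, station, riders).
# tidyr::pivot_longer() is the usual tool; base R is used here
# to avoid extra dependencies.
counts <- data.frame(
  day     = rep(wide$day, times = 2),
  station = rep(c("station_a", "station_b"), each = nrow(wide)),
  riders  = c(wide$station_a, wide$station_b)
)

# Flag extreme outliers with the 1.5 * IQR rule that boxplots use.
out_vals <- boxplot.stats(counts$riders)$out
counts$is_outlier <- counts$riders %in% out_vals

# Look at the outliers graphically before deciding what to do with them.
boxplot(riders ~ station, data = counts, main = "Riders by station")

# Option 1: remove the incomplete observations.
complete_only <- counts[complete.cases(counts), ]

# Option 2: fill the gaps with the median of the non-outlier values
# (assumption: a deliberately simple substitute for mice / missForest).
imputed <- counts
fill_value <- median(counts$riders[!counts$is_outlier], na.rm = TRUE)
imputed$riders[is.na(imputed$riders)] <- fill_value
```

Whether to drop or impute depends on how much data is missing and why; the sketch only shows the mechanics of each choice.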
Explore the Data: But why not just dive in?
No assumptions, “listen” to the data
Understand data properties
Find patterns in the data
Discover analysis strategies
Begin the visual narrative
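One way to "listen" to a dataset before analyzing it is a quick structural look. The sketch below uses R's built-in airquality data as a placeholder, since it is not one of the case-study datasets.

```r
# Structure: column names, types, and the first few values.
str(airquality)

# Per-column ranges, quartiles, and NA counts.
summary(airquality)

# Missing values per column show where cleaning is still needed.
na_counts <- colSums(is.na(airquality))
print(na_counts)

# Pairwise correlations on complete rows hint at patterns worth plotting.
print(round(cor(airquality, use = "complete.obs"), 2))
```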
Explore the Data
Graph it
Map it (leaflet)
Network graph it (networkD3, igraph)
Heat map it (d3heatmap, heatmaply)
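A rough sketch of graphing, mapping, and heat-mapping, using R's built-in quakes dataset as a placeholder for an open dataset with coordinates. Base graphics are used for the plots and the heat map instead of d3heatmap or heatmaply, and the leaflet step runs only if that package is installed.

```r
# Graph it: distribution of one variable, then a relationship between two.
hist(quakes$mag, main = "Earthquake magnitudes", xlab = "Magnitude")
plot(quakes$depth, quakes$mag, xlab = "Depth (km)", ylab = "Magnitude")

# Map it: an interactive map of the first 50 events.
if (requireNamespace("leaflet", quietly = TRUE)) {
  m <- leaflet::leaflet(data = quakes[1:50, ])
  m <- leaflet::addTiles(m)
  m <- leaflet::addCircleMarkers(m, lng = ~long, lat = ~lat, radius = ~mag)
  # Print `m` in RStudio to display the map in the Viewer pane.
}

# Heat map it: correlations between all numeric columns.
cors <- cor(quakes)
heatmap(cors, symm = TRUE, main = "Correlation heat map")
```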
Communicate Results
Complete the narrative - what is the story?
Design with audience in mind
Share your process
Publish for easy access and feedback
If possible, provide a link to the data
Data Does Not Lie
It just might not tell you what you want to hear
● Survey design is biased
● Sensors are not identically calibrated
● Merged datasets are not temporally aligned
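The temporal-alignment problem can be checked before any analysis. The sketch below merges two made-up daily tables (the column names and values are hypothetical, loosely modeled on a cycling and weather comparison) and lists the dates that appear in only one of them.

```r
# Made-up daily cycling counts and weather readings with offset date ranges.
cycling <- data.frame(
  date   = as.Date(c("2016-05-01", "2016-05-02", "2016-05-03")),
  riders = c(410, 385, 520)
)
weather <- data.frame(
  date    = as.Date(c("2016-05-02", "2016-05-03", "2016-05-04")),
  rain_mm = c(12.5, 0, 3.1)
)

# Compare the date ranges first; a mismatch is an early warning.
print(range(cycling$date))
print(range(weather$date))

# A full outer join keeps every date, so gaps show up as NA
# instead of being dropped silently.
merged <- merge(cycling, weather, by = "date", all = TRUE)

# Rows with an NA are dates present in only one of the two sources.
unaligned <- merged[!complete.cases(merged), ]
print(unaligned)
```

The default inner join, `merge(cycling, weather, by = "date")`, would discard the unmatched dates without any warning, which is how misaligned data slips into an analysis unnoticed.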
Sample of Ottawa Open Data Groups
● Data for Good Ottawa
● Open Data Ottawa
● Datafest Ottawa
Canadian Open Data Consultations
● Federal Government
● Provincial Government
● Municipal Government
Rob Davidson