DataCamp Analyzing US Census Data in R Simple feature geometry and tidycensus ANALYZING US CENSUS DATA IN R Kyle Walker Instructor

Post on 11-Jan-2020

Simple feature geometry and tidycensus

ANALYZING US CENSUS DATA IN R

Kyle Walker, Instructor

The geometry = TRUE argument

geometry = TRUE is available for the following geographies:

"state"

"county"

"tract"

"block group"

"block"

"zcta" (also "zip code tabulation area")

Getting simple feature geometry

library(tidycensus)
library(tidyverse)
library(sf)

cook_value <- get_acs(geography = "tract", state = "IL",
                      county = "Cook",
                      variables = "B25077_001",
                      geometry = TRUE)

Simple feature geometry

head(cook_value, 3)

Simple feature collection with 3 features and 5 fields
geometry type:  MULTIPOLYGON
dimension:      XY
bbox:           xmin: -87.68465 ymin: 42.01232 xmax: -87.66434 ymax: 42.02297
epsg (SRID):    4269
proj4string:    +proj=longlat +datum=NAD83 +no_defs
        GEOID                                       NAME   variable
1 17031010100    Census Tract 101, Cook County, Illinois B25077_001
2 17031010201 Census Tract 102.01, Cook County, Illinois B25077_001
3 17031010202 Census Tract 102.02, Cook County, Illinois B25077_001
  estimate   moe                       geometry
1   230700 62332 MULTIPOLYGON (((-87.6772 42...
2   151100 19099 MULTIPOLYGON (((-87.68465 4...
3   133300 84063 MULTIPOLYGON (((-87.67685 4...

Plotting tidycensus geometry

plot(cook_value["estimate"])

Joining tigris and tidycensus data

library(tigris)

idaho_income <- get_acs(geography = "school district (unified)",
                        variables = "B19013_001",
                        state = "ID")

idaho_school <- school_districts(state = "ID",
                                 type = "unified",
                                 class = "sf")

id_school_joined <- left_join(idaho_school,
                              idaho_income,
                              by = "GEOID")
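A left join keeps every row of the geometry table, so any district without a matching GEOID in the ACS table ends up with a missing estimate. The sketch below illustrates one way to check for that, using made-up stand-in tables (toy_shapes, toy_income, and their GEOID values are hypothetical, not real Idaho data).

```r
library(dplyr)

# Hypothetical stand-ins for the tigris geometries and the ACS table;
# the GEOID values and estimates are invented for illustration only.
toy_shapes <- tibble(GEOID = c("1600001", "1600002", "1600003"))
toy_income <- tibble(GEOID = c("1600001", "1600002"),
                     estimate = c(48000, 52000))

# Same join pattern as on the slide: keep all rows of the left table.
toy_joined <- left_join(toy_shapes, toy_income, by = "GEOID")

# Left-hand rows with no match in the right-hand table.
unmatched <- anti_join(toy_shapes, toy_income, by = "GEOID")

print(toy_joined)
print(unmatched)
```

If unmatched has any rows, those features would be drawn with a missing value on the map.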

Joining tigris and tidycensus data

plot(id_school_joined["estimate"])

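Shifting Alaska and Hawaii geometry

# Note: recent tidycensus releases deprecate shift_geo in favor of
# tigris::shift_geometry(); the call below is as shown in the course.
state_value <- get_acs(geography = "state",
                       variables = "B25077_001",
                       survey = "acs1",
                       geometry = TRUE,
                       shift_geo = TRUE)

plot(state_value["estimate"])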

Let's practice!

ANALYZING US CENSUS DATA IN R

Mapping demographic data with ggplot2

ANALYZING US CENSUS DATA IN R

Kyle Walker, Instructor

A basic choropleth map with geom_sf()

library(ggplot2)

ggplot(cook_value, aes(fill = estimate)) +
  geom_sf()

Modifying map colors

ggplot(cook_value, aes(fill = estimate, color = estimate)) +
  geom_sf() +
  scale_fill_viridis_c() +
  scale_color_viridis_c()

Customizing the map output

ggplot(cook_value, aes(fill = estimate, color = estimate)) +
  geom_sf() +
  scale_fill_viridis_c(labels = scales::dollar) +
  # guide = "none" hides the duplicate color legend
  # (guide = FALSE is deprecated in current ggplot2)
  scale_color_viridis_c(guide = "none") +
  theme_minimal() +
  coord_sf(crs = 26916, datum = NA) +
  labs(title = "Median home value by Census tract",
       subtitle = "Cook County, Illinois",
       caption = "Data source: 2012-2016 ACS.\nData acquired with the R tidycensus package.",
       fill = "ACS estimate")

Let's practice!

ANALYZING US CENSUS DATA IN R

Advanced demographic mapping

ANALYZING US CENSUS DATA IN R

Kyle Walker, Instructor

Visual variables in cartography

Visual variables by Axis Maps

Source: Axis Maps

Graduated symbol maps

library(sf)

centers <- st_centroid(state_value)

ggplot() +
  geom_sf(data = state_value, fill = "white") +
  geom_sf(data = centers, aes(size = estimate), shape = 21,
          fill = "lightblue", alpha = 0.7, show.legend = "point") +
  scale_size_continuous(range = c(1, 20))

Faceted maps with ggplot2

ggplot(dc_race, aes(fill = percent, color = percent)) +
  geom_sf() +
  coord_sf(datum = NA) +
  facet_wrap(~variable)

Interactive visualization

Web-based graphics allow interaction with data

Options in R:

leaflet

plotly

htmlwidgets

Interactive maps with mapview

library(mapview)

mapview(cook_value, zcol = "estimate", legend = TRUE)

Let's practice!

ANALYZING US CENSUS DATA IN R

Cartographic workflows with tigris and tidycensus

ANALYZING US CENSUS DATA IN R

Kyle Walker, Instructor

Generating random dots with sf

Key function for random point generation in sf: st_sample()

dc_dots <- map(c("White", "Black", "Hispanic", "Asian"), function(group) {
  dc_race %>%
    filter(variable == group) %>%
    st_sample(., size = .$value / 100) %>%
    st_sf() %>%
    mutate(group = group)
}) %>%
  reduce(rbind)
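As a rough, self-contained illustration of how st_sample() places dots, the sketch below samples points inside two made-up square polygons, with one dot per 100 units of a value column. The square() helper, the toy_tracts object, and its counts are hypothetical stand-ins for the Census tracts in dc_race.

```r
library(sf)

set.seed(42)  # random sampling, so fix the seed for repeatability

# Hypothetical helper: a unit square whose lower-left corner is (x0, 0).
square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1),
                        c(x0, 1), c(x0, 0))))
}

# Two toy "tracts" with invented population counts.
toy_tracts <- st_sf(
  value = c(500, 1500),
  geometry = st_sfc(square(0), square(2))
)

# A vector of sizes requests a separate sample for each polygon,
# here roughly one dot per 100 people.
dots <- st_sample(toy_tracts, size = toy_tracts$value / 100)

# Wrap the sampled points in an sf data frame and label them.
toy_dots <- st_sf(geometry = dots)
toy_dots$group <- "toy"

print(nrow(toy_dots))
```

The second square should receive about three times as many dots as the first, which is what makes dot density readable as population density.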

Considerations for random dot generation

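For faster plotting:

# Combine each group's dots into a single MULTIPOINT feature
dc_dots <- dc_dots %>%
  group_by(group) %>%
  summarize()

For more accurate visualizations:

# Randomly reorder the rows so one group's dots are not always drawn last
dc_dots_shuffle <- sample_frac(dc_dots, size = 1)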
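To make the two considerations concrete, here is a small sketch on invented point data (the pts object and its coordinates are hypothetical). It assumes the sf methods for dplyr verbs are available, as in the course code.

```r
library(sf)
library(dplyr)

set.seed(1)

# Six made-up points in two groups.
pts <- st_as_sf(
  data.frame(
    id = 1:6,
    group = rep(c("A", "B"), each = 3),
    x = c(0, 1, 2, 0, 1, 2),
    y = c(0, 0, 0, 1, 1, 1)
  ),
  coords = c("x", "y")
)

# Faster plotting: one row per group, with the points unioned together.
dots_by_group <- pts %>%
  group_by(group) %>%
  summarize()

# More accurate display: same rows, random order.
dots_shuffled <- sample_frac(pts, size = 1)

print(dots_by_group)
print(dots_shuffled$id)
```

Grouping reduces the number of features to draw, while shuffling keeps any one group from systematically covering the others where dots overlap.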

Basic dot-density mapping with sf

plot(dc_dots_shuffle, key.pos = 1)

Ancillary data with tigris

options(tigris_class = "sf")

dc_roads <- roads("DC", "District of Columbia") %>%
  filter(RTTYP %in% c("I", "S", "U"))

dc_water <- area_water("DC", "District of Columbia")

dc_boundary <- counties("DC", cb = TRUE)

Ancillary data with tigris

plot(dc_water$geometry, col = "lightblue")

Dot-density mapping with ggplot2

ggplot() +
  geom_sf(data = dc_boundary, color = NA, fill = "white") +
  geom_sf(data = dc_dots, aes(color = group, fill = group), size = 0.1) +
  geom_sf(data = dc_water, color = "lightblue", fill = "lightblue") +
  geom_sf(data = dc_roads, color = "grey") +
  coord_sf(crs = 26918, datum = NA) +
  # guide = "none" hides the duplicate color legend
  # (guide = FALSE is deprecated in current ggplot2)
  scale_color_brewer(palette = "Set1", guide = "none") +
  scale_fill_brewer(palette = "Set1") +
  labs(title = "The racial geography of Washington, DC",
       subtitle = "2010 decennial U.S. Census",
       fill = "",
       caption = "1 dot = approximately 100 people.\nData acquired with the R tidycensus and tigris packages.")

Considerations for dot-density mapping

Be mindful of ways dot-density maps can be misinterpreted

Choose qualitative colors wisely

Take care when selecting ancillary layers

Let's practice!

ANALYZING US CENSUS DATA IN R

Other resources for demographic data in R

ANALYZING US CENSUS DATA IN R

Kyle Walker, Instructor

Other R packages to know about

censusapi

ipumsr

cancensus

Other DataCamp courses

Working with Data in the Tidyverse

Data Visualization with ggplot2

Interactive Maps with Leaflet in R

Thank you!

ANALYZING US CENSUS DATA IN R