SIPI data days 2019 · N. Thompson, PhD · 2019/07/12
Welcome to data days
N. Thompson, PhD
SIPI data days 2019
My story: I’m Nicole...
❏ Cuban-American
❏ Grew up all over USA
❏ Loves:
  ❏ fantasy, sci-fi
  ❏ human language and culture
  ❏ physics and chemistry
❏ Wanted to tie it all together
After 3 degrees and lots of exploring...
I’m now a behavioral ecologist
My job is to research animal behavior and physiology.
❏ Analyze data almost every day.
❏ Greatest tool: A computer programming language called “R”.
My goals for you
❏ Apply the principles of tidy data and data visualization
❏ Use curiosity and creativity to generate and answer research questions on socially relevant topics
After our 2 day-long sessions you will be able to
❏ Manipulate & explore a data set using R programming language
❏ Visualize patterns in data with R
… and have fun!
What is R? A language to talk to your computer
Diagram courtesy of Garrett Grolemund
Who is R for? Everyone.
What is data science?
Wickham & Grolemund, r4ds
Exploratory data analysis is one important part
Wickham & Grolemund, r4ds
Your capstone team projects
❏ Choose data sets
❏ Become familiar with them and form research questions
❏ Use functions in R to answer questions
  ❏ Data transformations and summaries (package dplyr)
  ❏ Data visualizations (package ggplot2)
❏ Present questions and findings to class in 15 min
Date of 15 minute group presentations is TBD.
Our schedule
Day 1: 7/12/19
Literacy: Choose and describe a data set, create research questions
Transformations: Exploring data sets with R by subsets, transformations, and summaries
Day 2: 7/19/19
Graphics best practices: Evaluate and interpret visualizations
Visualizations: Exploring data sets with R by graphical plotting
*Exploring = answering questions*
Let’s meet R in RStudio
Troubleshooting
Run “?function_name” - for help
GOOGLE “R error name/function name/task”
Ask a friend.
Ask me!
Know you can do it.
Introduction to Tidy Data
N. Thompson, PhD
SIPI data days 2019
Importance of data literacy
https://en.wikipedia.org/wiki/Data
https://www.digitaltveurope.com/2019/05/31/data-to-drive-40-of-tv-ad-spend-by-2020/
https://stackoverflow.com/questions/40182253/complex-headers-in-angular2-data-table
What is tidy data?
❏ Data are in a table
❏ Each variable gets a column
❏ Each observation gets a row
❏ Each cell is a single value
❏ Each type of observation gets its own table
Fig 12.1, Wickham & Grolemund “R for Data Science”
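To make the checklist concrete, here is a minimal sketch in R. The table contents and the names `tidy` and `untidy` are made up for illustration and do not come from any workshop data set. The first table follows the rules above; the second breaks the "each cell is a single value" rule.

```r
# Hypothetical data: one row per person, one column per variable.
tidy <- data.frame(
  person    = c("A", "B", "C"),
  age_years = c(25, 31, 47),
  height_cm = c(160, 172, 168),
  stringsAsFactors = FALSE
)

# The same information stored untidily: two values packed into each cell.
untidy <- data.frame(
  person     = c("A", "B", "C"),
  age_height = c("25/160", "31/172", "47/168"),
  stringsAsFactors = FALSE
)

str(tidy)    # age and height are separate numeric columns
str(untidy)  # age and height are stuck together as text
```

In the tidy version, a call such as `mean(tidy$age_years)` works directly; in the untidy version the text would have to be split apart first.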
Tidy data sets have data dictionaries
Data dictionary: a description of each variable in a data set, including its data type and units.
Soon, you will write your own data dictionaries in teams.
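A data dictionary can itself be stored as a small table. The sketch below is one possible layout, not a required format; the variable names, units, and descriptions are hypothetical examples.

```r
# One row per variable in the data set being described.
# All entries here are hypothetical.
dictionary <- data.frame(
  variable    = c("age_years", "height_cm", "smoker"),
  data_type   = c("continuous", "continuous", "logical"),
  units       = c("years", "centimeters", NA),  # NA: no units for TRUE/FALSE
  description = c("Age at time of survey",
                  "Standing height",
                  "Currently smokes (TRUE/FALSE)"),
  stringsAsFactors = FALSE
)

print(dictionary)
```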
Example tidy data set: Diabetes risk factors in Pima women from AZ
From: https://www.kaggle.com/uciml/pima-indians-diabetes-database
Data dictionary: define the variables
Diabetes risk factors in Pima women from AZ
[Figure: data table with columns labeled by data type: continuous, categorical, logical]
Is it tidy?
Diabetes risk factors in Pima women from AZ
❏ Data are in a table
❏ Each variable gets a column
❏ Each observation gets a row - a woman >21 yrs old
❏ Each cell is a single value
❏ Each type of observation gets its own table - diagnosis and measurements per woman
Your turn…
❏ Break into teams of 3 - lead detective, scribe, & reporter
❏ Choose data sets - view in RStudio
❏ Learning goals for 1st group activity:
  ❏ create a data dictionary for chosen data set
  ❏ formulate research questions and diagnose limitations of data set
Team roles
Lead detective: drives the team toward its goal, takes charge of plans of action, watches the clock.
Scribe: writes down the team’s initial answers on worksheets and writes initial code.
Reporter: communicates the team’s findings, process, and questions to the class as a whole.
Project data sets:
1. Cancer rates by US state in 2017
2. Human trafficking in the USA in 2016 (some untidiness!)
3. Crime rates in major metropolitan areas
4. Gun crime in the USA 2012-2014
5. Diabetes risk factors among Pima women in AZ
Exploratory Data Analysis (EDA) in R
N. Thompson, PhD
SIPI data days 2019
Moving on from tidy data… time to start exploring
Wickham & Grolemund, r4ds
Key functions you will learn (see handouts)
dplyr functions:
%>%
select()
filter()
mutate()
summarise()
group_by()
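As a rough sketch of how these dplyr functions chain together with `%>%`, the example below uses a small made-up data frame called `patients`. The column names and values are hypothetical and are not taken from the project data sets.

```r
library(dplyr)

# Hypothetical data: six people in two groups.
patients <- data.frame(
  id      = 1:6,
  group   = c("a", "a", "a", "b", "b", "b"),
  glucose = c(90, 110, 130, 100, 120, 140),
  age     = c(25, 40, 60, 30, 50, 70),
  stringsAsFactors = FALSE
)

result <- patients %>%
  select(group, glucose, age) %>%           # keep only these columns
  filter(age >= 30) %>%                     # keep rows where age is at least 30
  mutate(glucose_high = glucose > 115) %>%  # add a new TRUE/FALSE column
  group_by(group) %>%                       # treat each group separately
  summarise(mean_glucose = mean(glucose),   # one summary row per group
            n_high = sum(glucose_high))     # count the TRUE values

print(result)
```

Read the pipe `%>%` as "and then": take `patients`, and then select columns, and then filter rows, and so on.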
Base R arithmetic & notation:
<- “assignment”
==, != “equal to”, “not equal to”
>, <, >=, <= inequalities
&, | “and” (intersection), “or” (union)
str(), View(), c()
mean(), sd(), sum()
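A short sketch of this base R notation, using a made-up vector called `ages` (the name and numbers are only for illustration). `View()` is left out because it opens an interactive viewer in RStudio.

```r
ages <- c(22, 35, 41, 58)   # c() combines values; <- assigns them to a name

ages == 35                  # is each value equal to 35?
ages != 35                  # is each value not equal to 35?
ages >= 40                  # is each value at least 40?

both   <- ages > 30 & ages < 50   # TRUE only where both conditions hold
either <- ages < 30 | ages > 50   # TRUE where at least one condition holds

str(ages)                   # compact description of the object
mean(ages)                  # average
sd(ages)                    # standard deviation
sum(ages)                   # total
```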
Key functions you will learn cont’d (see handouts)
Functions for data types:
class()
is.na()
as.numeric() - continuous
as.character() - categorical
as.factor() - categorical
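The data type functions can be tried on a small made-up vector. In this sketch, `raw` is a hypothetical column where numbers were read in as text and one entry is not a number at all.

```r
raw <- c("12", "7", "unknown", "30")   # numbers stored as text

class(raw)                             # what type is it right now?

# Convert to numbers; text that is not a number becomes NA.
# suppressWarnings() hides R's warning about that conversion.
values <- suppressWarnings(as.numeric(raw))
is.na(values)                          # which entries are missing?

as.character(values)                   # back to text

groups <- as.factor(c("low", "high", "low"))  # categorical variable
class(groups)
levels(groups)                         # the distinct categories
```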
Base subsetting:
Data[a,b] - a index = rows, b index = columns
Data$name - select a column
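Here is a minimal sketch of both subsetting styles. The data frame `Data` and its columns `name` and `score` are hypothetical, chosen to match the notation above.

```r
Data <- data.frame(
  name  = c("x", "y", "z"),
  score = c(10, 20, 30),
  stringsAsFactors = FALSE
)

Data[2, ]                  # row 2, all columns
Data[, 2]                  # all rows, column 2
Data[2, 2]                 # the single cell in row 2, column 2
Data$name                  # the column called "name"
Data[Data$score > 15, ]    # rows where score is greater than 15
```

Leaving the row or column index empty means "all of them".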
Learning to code...
1. Observe live coding
2. Copy sections of live code
3. Fill in blanks and perform exercises solo
4. Share progress with teammates
To our consoles!
Benefits of tidy data
❏ Consistent and predictable structure
❏ Prevents errors in your own analyses
❏ Increases clarity for others to follow your analyses