Analysing Social Science Data Using R
Class 6: Frequency tables, matrices, arrays, and functions

Michał Bojanowski
www.bojanorama.pl/r4sns2013:start
ICM, University of Warsaw

March 25, 2013
GSSR, Warsaw

Outline

1 Frequency tables

2 Chi-square tests

3 Matrices and arrays

4 Creating functions

5 Functions for tables

Creating frequency tables

Frequency tables are created with the function table.

table(pgss$kobieta)                        # univariate tables
with(pgss, table(kobieta, g5b))            # bivariate tables
with(pgss, table(kobieta, g5b, pgssyear))  # multidimensional tables
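The pgss data are not included here, so the following is only a minimal sketch on a made-up data frame. The data frame d and all its values are assumptions for illustration; only the variable names mirror the calls above.

```r
# Hypothetical stand-in for the pgss data; all values are made up
d <- data.frame(
  kobieta  = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE),
  g5b      = c(1, 2, 2, 3, 1, 2),
  pgssyear = c(1999, 1999, 2008, 2008, 2008, 1999)
)

t1 <- table(d$kobieta)                        # one dimension
t2 <- with(d, table(kobieta, g5b))            # two dimensions
t3 <- with(d, table(kobieta, g5b, pgssyear))  # three dimensions

dim(t1)  # 2
dim(t2)  # 2 3
dim(t3)  # 2 3 2
```

Each variable passed to table adds one dimension to the result.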

Tables with many dimensions

with(pgss, table(kobieta, g5b, pgssyear)) # multidimensional tables
## , , pgssyear = 1999
##
##        g5b
## kobieta   -9   -2    1    2    3    4    5    6    7
##   FALSE    0  984    0    0    0    0    0    0    0
##   TRUE     0 1298    0    0    0    0    0    0    0
##
## , , pgssyear = 2008
##
##        g5b
## kobieta   -9   -2    1    2    3    4    5    6    7
##   FALSE    0    0   28  109   84   52  166  161   23
##   TRUE     1    0   13  117  118   57  179  152   33

Additional arguments in table

exclude  vector of levels to exclude from the table. Excludes NAs by default.

useNA  whether to show rows/columns corresponding to NA; one of "no" (default), "ifany", or "always".
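A small sketch of how these two arguments change the result. The vector x is made up for illustration.

```r
x <- c("a", "b", NA, "a", NA)   # made-up vector with two missing values

t_default <- table(x)                              # NAs dropped: a = 2, b = 1
t_ifany   <- table(x, useNA = "ifany")             # adds an <NA> cell with count 2
t_always  <- table(c("a", "b"), useNA = "always")  # <NA> cell kept even with count 0
t_excl    <- table(x, exclude = c("a", NA))        # only level "b" remains

t_ifany
t_always
```

With "ifany" the NA cell appears only when missing values are present, whereas "always" shows it regardless.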

Chi-square test

With $\chi^2$ tests we test the hypothesis that the variables comprising the table are stochastically independent.

Variables are stochastically independent if they are useless in predicting one another. For example, in this table of row proportions the distribution over the columns is the same in both rows:

      1    2    3   Total
a    .2   .5   .3       1
b    .2   .5   .3       1

1 Given a table of observed frequencies $m$, we can calculate what the table would look like if the variables were stochastically independent (say $\hat{m}$).

2 Calculate $\chi^2 = \sum_{i,j} \frac{(m_{ij} - \hat{m}_{ij})^2}{\hat{m}_{ij}}$, summing over all cells of the table.

3 Compare to the theoretical distribution to get a p-value.
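The three steps can be sketched by hand in R. The counts are the 2008 responses 1 to 7 from the multidimensional table shown earlier. The object names m, m_hat, chi2, df and p are chosen here for illustration, and the expected frequencies use the usual (row total × column total) / grand total rule.

```r
# Observed frequencies m: kobieta (rows) by g5b (columns), year 2008
m <- matrix(c(28, 109,  84, 52, 166, 161, 23,
              13, 117, 118, 57, 179, 152, 33),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("FALSE", "TRUE"), as.character(1:7)))

# Step 1: expected frequencies under independence
m_hat <- outer(rowSums(m), colSums(m)) / sum(m)

# Step 2: chi-square statistic, summed over all cells
chi2 <- sum((m - m_hat)^2 / m_hat)

# Step 3: p-value from the chi-square distribution
df <- (nrow(m) - 1) * (ncol(m) - 1)
p  <- pchisq(chi2, df = df, lower.tail = FALSE)

round(c(chi2 = chi2, df = df, p = p), 4)
```

The degrees of freedom are (rows - 1) × (columns - 1), here 6.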

Performing Chi-square tests

Use the chisq.test function with the table as an argument:

g5b <- replace(pgss08$g5b, which(pgss08$g5b==-9), NA)
(tab <- table(pgss08$kobieta, g5b))
##        g5b
##           1   2   3   4   5   6   7
##   FALSE  28 109  84  52 166 161  23
##   TRUE   13 117 118  57 179 152  33
chisq.test(tab)
##
##  Pearson's Chi-squared test
##
## data:  tab
## X-squared = 12.64, df = 6, p-value = 0.0492
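The same workflow can be sketched on made-up data, which also shows how to pull individual components out of the test result. The vectors below are hypothetical, and treating -9 as a missing-data code is an assumption taken from the recoding above.

```r
# Made-up responses; -9 is assumed to be a missing-data code
g5b     <- c(1, 2, -9, 3, 2, 1, 3, -9, 2, 1)
kobieta <- c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, FALSE)

g5b_clean <- replace(g5b, which(g5b == -9), NA)  # -9 becomes NA
tab <- table(kobieta, g5b_clean)                 # NAs are dropped by default

# With counts this small chisq.test warns that the approximation may be poor
res <- suppressWarnings(chisq.test(tab))

res$statistic  # the X-squared value
res$parameter  # degrees of freedom
res$p.value    # p-value
res$expected   # expected frequencies under independence
```

The object returned by chisq.test is a list, so the printed summary is only one view of it.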

Matrices and arrays

New types of objects:

matrices  rectangular (two-dimensional) objects containing elements of the same type (numeric/character/logical).

arrays  "multidimensional matrices".

Tables created with table are special types of matrices/arrays.

Creating from scratch: matrices

Matrices and arrays can be created from scratch with matrix and array:

matrix(1:4, nrow=2, ncol=2)
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
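A short sketch of two further arguments of matrix, byrow and dimnames. The object names and labels are made up for illustration.

```r
m_col <- matrix(1:6, nrow = 2)                # filled column by column (default)
m_row <- matrix(1:6, nrow = 2, byrow = TRUE)  # filled row by row

# Row and column names can be supplied at creation time
m_named <- matrix(1:4, nrow = 2,
                  dimnames = list(c("r1", "r2"), c("c1", "c2")))

dim(m_col)   # 2 3
m_col[1, ]   # 1 3 5
m_row[1, ]   # 1 2 3
m_named
```

If only nrow is given, the number of columns is worked out from the length of the data.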

Creating from scratch: arrays

array(1:8, dim=c(2,2,2))
## , , 1
##
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
##
## , , 2
##
##      [,1] [,2]
## [1,]    5    7
## [2,]    6    8
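As a sketch, the same array can be given names for each dimension, which makes the layers easier to address. The labels r1, c1, k1 and so on are made up for illustration.

```r
a <- array(1:8, dim = c(2, 2, 2),
           dimnames = list(row   = c("r1", "r2"),
                           col   = c("c1", "c2"),
                           layer = c("k1", "k2")))

dim(a)               # 2 2 2
a["r2", "c1", "k2"]  # 6: one subscript per dimension
a[ , , "k1"]         # the first layer, a 2 x 2 matrix
```

The third subscript selects the layer, in the same way that the first two select rows and columns.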

Row and column names: rownames, colnames

tab
##        g5b
##           1   2   3   4   5   6   7
##   FALSE  28 109  84  52 166 161  23
##   TRUE   13 117 118  57 179 152  33
rownames(tab)
## [1] "FALSE" "TRUE"
colnames(tab)
## [1] "1" "2" "3" "4" "5" "6" "7"
rownames(tab) <- c("mężczyzna", "kobieta")  # Polish for "man", "woman"
tab
##            g5b
##               1   2   3   4   5   6   7
##   mężczyzna  28 109  84  52 166 161  23
##   kobieta    13 117 118  57 179 152  33
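Both sets of names can also be read or replaced at once with dimnames. A minimal sketch on a cut-down matrix holding the first two columns of the table; the new labels are made up for illustration.

```r
# First two columns of the table above, as a plain matrix
tab2 <- matrix(c(28, 13, 109, 117), nrow = 2)
rownames(tab2) <- c("FALSE", "TRUE")
colnames(tab2) <- c("1", "2")

dimnames(tab2)   # a list holding the row names and the column names

# Replace both at once; the list names label the dimensions themselves
dimnames(tab2) <- list(kobieta = c("no", "yes"), g5b = c("one", "two"))
names(dimnames(tab2))  # "kobieta" "g5b"
tab2
```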

Indexing matrices and arrays

Using [ ], like vectors, but with more subscripts:

tab
##            g5b
##               1   2   3   4   5   6   7
##   mężczyzna  28 109  84  52 166 161  23
##   kobieta    13 117 118  57 179 152  33
tab[1,1] # single element
## [1] 28
tab[1, ] # first row
##   1   2   3   4   5   6   7
##  28 109  84  52 166 161  23
tab[ ,2] # second column
## mężczyzna   kobieta
##       109       117
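Subscripts do not have to be positions. The sketch below uses names, a logical condition, and drop = FALSE on a cut-down matrix. The English row labels "man" and "woman" are stand-ins chosen here.

```r
# First three columns of the table, with English row labels as stand-ins
tab3 <- matrix(c(28, 109,  84,
                 13, 117, 118),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("man", "woman"), c("1", "2", "3")))

tab3["woman", "2"]         # 117, selected by name
tab3[ , c("1", "3")]       # two columns selected by name
tab3[tab3[ , "1"] > 20, ]  # logical subscript: rows whose column "1" exceeds 20

# A single row normally collapses to a vector; drop = FALSE keeps a 1 x 3 matrix
tab3[1, , drop = FALSE]
```

Keeping the matrix shape with drop = FALSE matters when later code expects two dimensions.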

Negative subscripts

Negative subscripts can be used to drop the associated elements/rows/columns.

tab[ -1 , ]
##   1   2   3   4   5   6   7
##  13 117 118  57 179 152  33
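The same idea works for vectors and for columns. A brief sketch on made-up objects:

```r
x <- c(10, 20, 30, 40)
x[-2]         # 10 30 40: everything except the second element
x[-c(1, 4)]   # 20 30: several elements dropped at once

m <- matrix(1:6, nrow = 2)
m[ , -1]      # drops the first column, leaving a 2 x 2 matrix
```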

Creating your own functions

functionName <- function(x, y){
  # what do we do with 'x' and 'y'
}

For example, a function computing the mean of x:

mymean <- function(x){
  sum(x) / length(x)
}
mymean( c(1,2,3) )
## [1] 2
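Arguments can have default values. The sketch below extends mymean with a hypothetical na.rm argument; the name mymean2 and this way of handling missing values are choices made here, not part of the class material.

```r
mymean2 <- function(x, na.rm = FALSE) {
  # Optionally drop missing values before averaging
  if (na.rm) x <- x[!is.na(x)]
  sum(x) / length(x)
}

mymean2(c(1, 2, 3))                 # 2
mymean2(c(1, NA, 3))                # NA, because the sum involves an NA
mymean2(c(1, NA, 3), na.rm = TRUE)  # 2
```

The value of the last evaluated expression is what the function returns, so no explicit return call is needed.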

Functions useful for tables

cbind, rbind  create matrices by "gluing" vectors or matrices column-wise (cbind) or row-wise (rbind)

rowSums, colSums, rowMeans, colMeans  row and column sums and means

prop.table  tables of proportions
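A short sketch of these functions working together. The two vectors hold the first three columns of the table used in this class, and the names men and women are chosen here for illustration.

```r
men   <- c(28, 109, 84)
women <- c(13, 117, 118)

tab4 <- rbind(men, women)  # 2 x 3: each vector becomes a row
cbind(men, women)          # 3 x 2: each vector becomes a column

rowSums(tab4)  # men 221, women 248
colSums(tab4)  # 41 226 202

prop.table(tab4)             # proportions of the grand total
row_p <- prop.table(tab4, 1) # row proportions: each row sums to 1
col_p <- prop.table(tab4, 2) # column proportions: each column sums to 1
```

The second argument of prop.table picks the margin, 1 for rows and 2 for columns.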

Function apply

Apply any function by row/column/layer of a matrix or array:

tab
##            g5b
##               1   2   3   4   5   6   7
##   mężczyzna  28 109  84  52 166 161  23
##   kobieta    13 117 118  57 179 152  33
apply(tab, 2, sum)
##   1   2   3   4   5   6   7
##  41 226 202 109 345 313  56
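The second argument of apply names the dimension, or dimensions, to keep. A sketch with other margins and a three-dimensional array; the objects are made up for illustration.

```r
tab5 <- rbind(men = c(28, 109, 84), women = c(13, 117, 118))

apply(tab5, 1, sum)  # by row: men 221, women 248
apply(tab5, 2, max)  # by column: 28 117 118

a <- array(1:8, dim = c(2, 2, 2))
apply(a, 3, sum)        # by layer: 10 26
apply(a, c(1, 2), sum)  # keep rows and columns, sum across layers
```

In the last call each cell of the 2 x 2 result is the sum of that cell over both layers.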

Using apply

Calculate percentages of responses 1-3, 4, 5-7:

fun <- function(r){
  v <- c( agree=sum(r[1:3]), dontknow=r[4], disagree=sum(r[5:7]) )
  v / sum(v) * 100
}
apply(tab, 1, fun)
##
##              mężczyzna kobieta
##   agree         35.474   37.07
##   dontknow.4     8.347    8.52
##   disagree      56.180   54.41
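The row label "dontknow.4" appears because r[4] keeps its own name "4", which c() then joins to "dontknow". One possible variant, sketched here, uses r[[4]] to drop that name. The function name fun2 and the English row labels are choices made for this sketch.

```r
tab6 <- matrix(c(28, 109,  84, 52, 166, 161, 23,
                 13, 117, 118, 57, 179, 152, 33),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("man", "woman"), as.character(1:7)))

fun2 <- function(r) {
  # r[[4]] extracts the value without its name, so the label stays "dontknow"
  v <- c(agree = sum(r[1:3]), dontknow = r[[4]], disagree = sum(r[5:7]))
  v / sum(v) * 100
}

pct <- apply(tab6, 1, fun2)
round(pct, 1)     # one column per row of tab6
round(t(pct), 1)  # transposed, so that groups are rows again
colSums(pct)      # each column of percentages sums to 100
```

Note that apply returns the result of each row as a column, which is why the transposed version may be easier to compare with the original table.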
