Analysing Social Science Data Using R
Class 6: Frequency tables, matrices, arrays, and functions
Michał Bojanowski
www.bojanorama.pl/r4sns2013:start
ICM, University of Warsaw
March 25, 2013
GSSR, Warsaw
Outline
1 Frequency tables
2 Chi-square tests
3 Matrices and arrays
4 Creating functions
5 Functions for tables
Michał Bojanowski, ICM, University of Warsaw. Analysing Social Science Data Using R: Frequency tables, matrices, arrays, and functions
Creating frequency tables
Frequency tables are created with the function table.
table(pgss$kobieta) # univariate table
with(pgss, table(kobieta, g5b)) # bivariate table
with(pgss, table(kobieta, g5b, pgssyear)) # multidimensional table
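The pgss data set is not part of this transcript, so the following is only a minimal sketch of the same three calls on a small made-up data frame. The name d and its columns female, answer, and year are hypothetical stand-ins for pgss, kobieta, g5b, and pgssyear.

```r
# Hypothetical toy data; the column names are illustrative, not taken from the PGSS file
d <- data.frame(
  female = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE),
  answer = c(1, 2, 2, 1, 3, 2),
  year   = c(1999, 1999, 2008, 2008, 2008, 1999)
)

t1 <- table(d$female)                        # one variable: a 1-dimensional table
t2 <- with(d, table(female, answer))         # two variables: rows by columns
t3 <- with(d, table(female, answer, year))   # three variables: one layer per year

print(t1)
print(t2)
dim(t3)   # number of categories along each dimension
```

Each additional variable passed to table adds one dimension to the result, which is why the three-variable table prints as a separate two-way table for each year.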
Tables with many dimensions
with(pgss, table(kobieta, g5b, pgssyear)) # multidimensional tables
## , , pgssyear = 1999
##
##        g5b
## kobieta   -9   -2    1    2    3    4    5    6    7
##   FALSE    0  984    0    0    0    0    0    0    0
##   TRUE     0 1298    0    0    0    0    0    0    0
##
## , , pgssyear = 2008
##
##        g5b
## kobieta   -9   -2    1    2    3    4    5    6    7
##   FALSE    0    0   28  109   84   52  166  161   23
##   TRUE     1    0   13  117  118   57  179  152   33
Additional arguments in table
exclude: vector of levels to exclude from the table. Excludes NAs by default.
useNA: whether to show rows/columns corresponding to NA, one of "no" (default), "ifany", or "always".
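A short sketch of how these two arguments change the result, using a made-up vector x rather than the survey data:

```r
x <- c("a", "b", NA, "a", NA)   # made-up vector with two missing values

t_default <- table(x)                               # NA is left out
t_ifany   <- table(x, useNA = "ifany")              # NA gets a cell because it occurs
t_always  <- table(c("a", "b"), useNA = "always")   # NA cell is shown even with count 0
t_excl    <- table(x, exclude = c("a", NA))         # drop level "a" as well as NA

print(t_default)
print(t_ifany)
print(t_always)
print(t_excl)
```

Showing the NA cell is a cheap way to notice how many cases silently drop out of a table.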
Chi-square test
With $\chi^2$ tests we test the hypothesis that the variables comprising the table are stochastically independent.
Variables are stochastically independent if they are useless in predicting one another. In the table of row proportions below, for example, knowing the row (a or b) says nothing about the column:
        1    2    3   Total
  a    .2   .5   .3     1
  b    .2   .5   .3     1
1 Given a table of observed frequencies $m$, we can calculate what the table would look like if the variables were stochastically independent (say $\hat{m}$).
2 Calculate $\chi^2 = \sum_{i,j} \frac{(m_{ij} - \hat{m}_{ij})^2}{\hat{m}_{ij}}$, summing over all cells of the table.
3 Compare it to the theoretical distribution to get a p-value.
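The three steps can be sketched by hand in R. The counts are copied from the g5b by kobieta table for 2008 that appears elsewhere in these slides. The expected counts use the standard formula for independence (row total times column total divided by the grand total), which the slide does not spell out, and the names m, m_hat, chi2, and p_value are chosen here for illustration.

```r
# Observed counts m: rows are kobieta FALSE/TRUE, columns are g5b answers 1 to 7
m <- rbind(c(28, 109,  84, 52, 166, 161, 23),
           c(13, 117, 118, 57, 179, 152, 33))

# Step 1: expected counts under independence
m_hat <- outer(rowSums(m), colSums(m)) / sum(m)

# Step 2: the chi-square statistic, summed over all cells
chi2 <- sum((m - m_hat)^2 / m_hat)

# Step 3: upper-tail probability from the chi-square distribution
df <- (nrow(m) - 1) * (ncol(m) - 1)
p_value <- pchisq(chi2, df = df, lower.tail = FALSE)

round(c(chi2 = chi2, df = df, p = p_value), 4)
```

The degrees of freedom are (rows - 1) times (columns - 1), here 1 times 6.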
Performing Chi-square tests
Use the chisq.test function with a table as its argument:
g5b <- replace(pgss08$g5b, which(pgss08$g5b==-9), NA)
(tab <- table(pgss08$kobieta, g5b))
##        g5b
##           1   2   3   4   5   6   7
##   FALSE  28 109  84  52 166 161  23
##   TRUE   13 117 118  57 179 152  33
chisq.test(tab)
##
##  Pearson's Chi-squared test
##
## data:  tab
## X-squared = 12.64, df = 6, p-value = 0.0492
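The object returned by chisq.test can also be stored and inspected piece by piece. Since pgss08 is not available here, this sketch rebuilds the table from the counts printed above; the name res is arbitrary.

```r
# Rebuild the printed table from its counts (pgss08 itself is not available here)
tab <- as.table(rbind(c(28, 109,  84, 52, 166, 161, 23),
                      c(13, 117, 118, 57, 179, 152, 33)))
dimnames(tab) <- list(kobieta = c("FALSE", "TRUE"), g5b = as.character(1:7))

res <- chisq.test(tab)

res$statistic            # the X-squared value
res$parameter            # degrees of freedom
res$p.value              # p-value
round(res$expected, 1)   # counts expected under independence
round(res$residuals, 2)  # Pearson residuals: (observed - expected) / sqrt(expected)
```

Large residuals (in absolute value) point to the cells that contribute most to the statistic.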
Matrices and arrays
New types of objects:
matrices: rectangular (two-dimensional) objects containing elements of the same type (numeric/character/logical).
arrays: "multidimensional matrices".
Tables created with table are special types of matrices/arrays.
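That tables behave like matrices and arrays can be checked directly. A small sketch with made-up vectors sex and answer:

```r
sex    <- c("f", "m", "f", "f", "m")   # made-up data
answer <- c(1, 2, 2, 1, 1)

tab1 <- table(sex)           # one-dimensional table
tab2 <- table(sex, answer)   # two-dimensional table

class(tab2)       # "table"
is.matrix(tab2)   # TRUE: it has exactly two dimensions
dim(tab2)         # 2 2
is.array(tab1)    # TRUE: a one-dimensional array
is.matrix(tab1)   # FALSE: only one dimension
```

Because of this, functions that work on matrices and arrays generally work on tables too.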
Creating from scratch: matrices
Matrices and arrays can be created from scratch with matrix and array:
matrix(1:4, nrow=2, ncol=2)
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
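By default matrix fills the cells column by column. The sketch below contrasts that with the byrow and dimnames arguments of matrix, which the slide does not mention; the object names are arbitrary.

```r
m_col <- matrix(1:6, nrow = 2)                 # filled column by column (default)
m_row <- matrix(1:6, nrow = 2, byrow = TRUE)   # filled row by row

# Row and column names can be supplied at creation time
m_named <- matrix(1:4, nrow = 2,
                  dimnames = list(c("a", "b"), c("x", "y")))

print(m_col)
print(m_row)
print(m_named)
```

Here m_col has 1, 3, 5 in its first row, while m_row has 1, 2, 3.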
Creating from scratch: arrays
array(1:8, dim=c(2,2,2))
## , , 1
##
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
##
## , , 2
##
##      [,1] [,2]
## [1,]    5    7
## [2,]    6    8
Row and column names: rownames, colnames
tab
##        g5b
##           1   2   3   4   5   6   7
##   FALSE  28 109  84  52 166 161  23
##   TRUE   13 117 118  57 179 152  33
rownames(tab)
## [1] "FALSE" "TRUE"
colnames(tab)
## [1] "1" "2" "3" "4" "5" "6" "7"
rownames(tab) <- c("mężczyzna", "kobieta") # Polish for "man", "woman"
tab
##            g5b
##               1   2   3   4   5   6   7
##   mężczyzna  28 109  84  52 166 161  23
##   kobieta    13 117 118  57 179 152  33
Indexing matrices and arrays
Using [ ], like vectors, but with more subscripts:
tab
##            g5b
##               1   2   3   4   5   6   7
##   mężczyzna  28 109  84  52 166 161  23
##   kobieta    13 117 118  57 179 152  33
tab[1,1] # single element
## [1] 28
tab[1, ] # first row
##   1   2   3   4   5   6   7
##  28 109  84  52 166 161  23
tab[ ,2] # second column
## mężczyzna   kobieta
##       109       117
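The same idea extends to arrays, which take one subscript per dimension. The sketch below uses the array from the earlier slide and also shows the drop argument of [ ], which is an addition not covered in the slides.

```r
arr <- array(1:8, dim = c(2, 2, 2))

arr[1, 2, 2]   # single element: row 1, column 2, layer 2
arr[ , , 2]    # the whole second layer, a 2 x 2 matrix
arr[1, , ]     # row 1 taken from every layer

m <- matrix(1:4, nrow = 2)
row_vec <- m[1, ]                 # dimensions are dropped: a plain vector
row_mat <- m[1, , drop = FALSE]   # drop = FALSE keeps a 1 x 2 matrix
```

Keeping the dimensions with drop = FALSE matters when later code expects a matrix rather than a vector.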
Negative subscripts
Negative subscripts can be used to drop the associated elements/rows/columns.
tab[ -1 , ]
##   1   2   3   4   5   6   7
##  13 117 118  57 179 152  33
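Negative subscripts work the same way on columns and on plain vectors, and several can be given at once. A sketch on a matrix rebuilt from the counts above, with English row labels used as stand-ins for the Polish ones:

```r
tab <- rbind(man   = c(28, 109,  84, 52, 166, 161, 23),
             woman = c(13, 117, 118, 57, 179, 152, 33))
colnames(tab) <- as.character(1:7)

no_mid    <- tab[ , -4]       # drop the fourth column
ends_only <- tab[ , -(2:6)]   # drop columns 2 to 6, keeping the first and last

v <- c(10, 20, 30)
v[-1]          # everything except the first element
v[-c(1, 3)]    # everything except the first and third elements
```

Note the parentheses in -(2:6): without them, -2:6 would mean the sequence from -2 to 6.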
Creating own functions
functionName <- function(x, y){
  # what do we do with 'x' and 'y'
}
For example, a function computing the mean of x:
mymean <- function(x){
  sum(x) / length(x)
}
mymean( c(1,2,3) )
## [1] 2
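Arguments can be given default values, which callers may then omit. The following variant of mymean is a hypothetical extension, not part of the slides; the name mymean2 and its na.rm argument are chosen here to mirror the convention used by R's own mean.

```r
# Hypothetical extension of mymean with an optional argument
mymean2 <- function(x, na.rm = FALSE) {
  if (na.rm) {
    x <- x[!is.na(x)]   # keep only the non-missing values
  }
  sum(x) / length(x)    # the value of the last expression is returned
}

mymean2(c(1, 2, 3))                  # default: na.rm = FALSE
mymean2(c(1, NA, 3))                 # NA, because the sum involves a missing value
mymean2(c(1, NA, 3), na.rm = TRUE)   # missing value removed first
```

A function returns the value of its last evaluated expression, so no explicit return() is needed here.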
Functions useful for tables
cbind, rbind: create matrices by "gluing" vectors or matrices column-wise (cbind) or row-wise (rbind)
rowSums, colSums, rowMeans, colMeans: row and column sums and means
prop.table: tables of proportions
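A brief sketch of these helpers on a matrix rebuilt from the counts in the slides. The English row labels and the object names are illustrative choices.

```r
tab <- rbind(man   = c(28, 109,  84, 52, 166, 161, 23),
             woman = c(13, 117, 118, 57, 179, 152, 33))
colnames(tab) <- as.character(1:7)

rowSums(tab)   # number of respondents in each row
colSums(tab)   # number of respondents in each answer category

row_props <- prop.table(tab, margin = 1)   # proportions within rows
col_props <- prop.table(tab, margin = 2)   # proportions within columns

# Glue a column of row totals on the right, then a row of column totals below
with_total <- cbind(tab, Total = rowSums(tab))
with_both  <- rbind(with_total, Total = colSums(with_total))
print(with_both)
```

Without the margin argument, prop.table divides every cell by the grand total instead.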
Function apply
Apply any function by row/column/layer of a matrix or array:
tab
##            g5b
##               1   2   3   4   5   6   7
##   mężczyzna  28 109  84  52 166 161  23
##   kobieta    13 117 118  57 179 152  33
apply(tab, 2, sum)
##   1   2   3   4   5   6   7
##  41 226 202 109 345 313  56
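The second argument of apply selects the dimension to keep: 1 for rows, 2 for columns, 3 for layers. A sketch covering all three, with a matrix rebuilt from the counts above and the small array from the earlier slide:

```r
tab <- rbind(man   = c(28, 109,  84, 52, 166, 161, 23),
             woman = c(13, 117, 118, 57, 179, 152, 33))
colnames(tab) <- as.character(1:7)

row_totals <- apply(tab, 1, sum)   # margin 1: one result per row
col_max    <- apply(tab, 2, max)   # margin 2: one result per column

arr <- array(1:8, dim = c(2, 2, 2))
layer_sums <- apply(arr, 3, sum)         # margin 3: one result per layer
cell_sums  <- apply(arr, c(1, 2), sum)   # keep rows and columns, sum over layers
```

For plain sums and means, rowSums and its relatives give the same result as apply and are usually faster.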
Using apply
Calculate percentages of responses 1-3, 4, 5-7:
fun <- function(r){
  # r[[4]] drops the element's own name, so the label stays "dontknow" rather than "dontknow.4"
  v <- c( agree=sum(r[1:3]), dontknow=r[[4]], disagree=sum(r[5:7]) )
  v / sum(v) * 100
}
apply(tab, 1, fun)
##
##            mężczyzna kobieta
##   agree       35.474   37.07
##   dontknow     8.347    8.52
##   disagree    56.180   54.41
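When the applied function returns a vector, apply stores each result as a column, so the rows of the input end up as columns of the output. A sketch of turning the result back around with t and rounding it, using a matrix rebuilt from the counts above (English row labels are stand-ins):

```r
tab <- rbind(man   = c(28, 109,  84, 52, 166, 161, 23),
             woman = c(13, 117, 118, 57, 179, 152, 33))
colnames(tab) <- as.character(1:7)

# Collapse the seven answers into three groups and convert to percentages
fun <- function(r) {
  v <- c(agree = sum(r[1:3]), dontknow = r[[4]], disagree = sum(r[5:7]))
  v / sum(v) * 100
}

pct      <- apply(tab, 1, fun)   # 3 x 2: one column per row of tab
pct_rows <- round(t(pct), 1)     # 2 x 3: back to one row per row of tab
print(pct_rows)
```

Each row of pct_rows sums to 100, up to rounding.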