Analyzing Reading time data
LabSyntax, 03/01/06
T. Florian Jaeger
Self-paced RT studies
A measure of processing complexity
Say we have a hypothesis that some supposedly ungrammatical wh-orders are actually just hard to process (cf. superiority violations).
As part of this hypothesis, we predict that the accessibility of the wh-fillers and the accessibility of the interveners result in more processing at the integration site (the verb).
Copyright
The following slides refer to a data set (downloadable along with these slides) that has been collected by the WH-Research Group, Linguistics Department, Stanford University.
Please do not use, cite, or distribute any results based on that dataset (data-accessibility.rtm) without our permission. Email [email protected] or [email protected] for more details.
Follow along
You can follow along with this tutorial in R by downloading the dataset and the R script from: http://www.stanford.edu/~tiflo/?teaching/LabSyntax2006/materials/
The .cnd file is used to extract the results from Linger and to define the regions of interest.
The .rtm file contains all the reading time data (including the practice items; see the Lingeralyzer documentation).
The .r file contains the R script used here. I haven’t documented things carefully, but with some R experience you should be able to figure things out.
Input file
# prin2n3 1 BARE_BARE
Mary wondered what who read but later the teacher told her.
?Did Mary want to know what was painted? N
# prin2n3 1 BARE_WHICH
Mary wondered what which student read but later the teacher told her.
?Did Mary want to know what was painted? N
# prin2n3 1 WHICH_BARE
Mary wondered which book who read but later the teacher told her.
?Did Mary want to know what was painted? N
# prin2n3 1 WHICH_WHICH
Mary wondered which book which student read but later the teacher told her.
?Did Mary want to know what was painted? N
Stimulus identifier: # experimentID itemID conditionID
Stimulus (regions separated by “|”; default: word-by-word)
Content question and answer (Y/N)
Extracting results (.cnd)
set COND_NAME "prin2n3 BARE_BARE"
set ANOVA_FACTORS "WH1 WH2"
set REGIONS {1:1-2 2:3-4 3:5-8 4:9-99}
addCondition
set COND_NAME "prin2n3 BARE_WHICH"
set ANOVA_FACTORS "WH1 WH2"
set REGIONS {1:1-2 2:3-5 3:6-9 4:10-99}
addCondition
set COND_NAME "prin2n3 WHICH_BARE"
set ANOVA_FACTORS "WH1 WH2"
set REGIONS {1:1-2 2:3-5 3:6-9 4:10-99}
addCondition
…
1 2 3 4 5 6 7 8 9 10 11 12
Mary wondered what which student read but later the teacher told her.
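To make the region definitions concrete, here is a minimal R sketch that maps word positions to regions for the BARE_WHICH sentence. The helper assign_regions is hypothetical (it is not part of Linger or Lingeralyzer); it only mimics the region:first-last notation used in the REGIONS setting.

```r
# Hypothetical helper: map word positions to regions given a spec such as
# "1:1-2 2:3-5 3:6-9 4:10-99" (region:first-last, as in the .cnd file).
assign_regions <- function(spec, n_words) {
  parts <- strsplit(strsplit(spec, " +")[[1]], "[:-]")
  region <- rep(NA_integer_, n_words)
  idx <- seq_len(n_words)
  for (p in parts) {
    bounds <- as.integer(p)  # c(region, first word, last word)
    region[idx >= bounds[2] & idx <= bounds[3]] <- bounds[1]
  }
  region
}

words <- strsplit(
  "Mary wondered what which student read but later the teacher told her.",
  " ")[[1]]
regions <- assign_regions("1:1-2 2:3-5 3:6-9 4:10-99", length(words))
data.frame(position = seq_along(words), word = words, region = regions)
```

With this specification the verb "read" is the first word of region 3.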
Output file
Import into R
data <- read.table("C:/Documents and Settings/tiflo/Desktop/CLASS/RT-example/data-accessibility.rtm")
colnames(data) <- c("expt","extraction","attachment","item","subj","order","position","word","region","rt","rtz","resrt","resrtz","qa")
Let’s do some data exploration and cleaning
Testing the assumptions of ANOVA
Homogeneity of variances: The variances of all conditions (and the variance of the error) are assumed to be identical. Violations of this assumption are tolerable as long as the variances are correlated (cf. Howell, 1995:340-1).
Normality: The dependent variable is assumed to be normally distributed within each condition. ANOVA is relatively robust against violations of normality.
Independence of observations: This assumption forces us to include subjects and items as factors (hence repeated measures ANOVA or mixed effect models).
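The first two assumptions can be inspected directly in R. The sketch below uses simulated data as a stand-in for the real data set (the data frame sim and its values are assumptions made for illustration) together with bartlett.test and shapiro.test from the stats package. Note that Bartlett's test is itself sensitive to non-normality, so plots such as qqnorm are a useful complement.

```r
set.seed(1)
# Simulated stand-in for the real data: 4 conditions, 100 log RTs each
sim <- data.frame(
  cond  = factor(rep(c("BARE_BARE", "BARE_WHICH", "WHICH_BARE", "WHICH_WHICH"),
                     each = 100)),
  logRT = rnorm(400, mean = 5.8, sd = 0.3)
)

# Homogeneity of variances: compare the per-condition variances
tapply(sim$logRT, sim$cond, var)
hov <- bartlett.test(logRT ~ cond, data = sim)

# Normality within each condition: Shapiro-Wilk test per condition
norm.p <- tapply(sim$logRT, sim$cond, function(x) shapiro.test(x)$p.value)

hov$p.value
norm.p
```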
Normality
Outlier exclusion
Transformations
Reading times should be log-transformed
(this also works for magnitude estimation judgments)
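A small simulation illustrates why the log transform helps. The raw reading times below are simulated from a log-normal distribution (an assumption made for illustration, not the real data), and skew is a hypothetical helper that computes a simple moment-based skewness.

```r
set.seed(2)
# Simulated raw RTs in ms (log-normal, i.e. right-skewed)
rt <- rlnorm(1000, meanlog = 5.8, sdlog = 0.4)
logRT <- log(rt)

# Hypothetical helper: simple moment-based skewness
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3

skew(rt)     # clearly positive: long right tail
skew(logRT)  # close to 0: roughly symmetric

# Visual check of the same point
# hist(rt); hist(logRT)
```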
Normality check
Within each condition, the dependent variable (logRT) is approximately normally distributed
Independence
A simple regression
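# Keep only the verb region (region 3) of the experiment of interest
data.verb <- subset(data.oe.clean, region== "3" & expt == "prin2n3")
# Store the model as lm.verb so that the object does not mask the function lm()
lm.verb <- lm(logRT ~ filler*intervener, data= data.verb)
summary(lm.verb)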
Output:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)       5.75258    0.01590 361.910  < 2e-16 ***
fillerBARE        0.06698    0.02248   2.980  0.00291 **
intervenerBARE    0.05232    0.02244   2.331  0.01980 *
filBARE:intBARE  -0.03980    0.03176  -1.253  0.21019
---
Multiple R-Squared: 0.004493, Adjusted R-squared: 0.003564
NB: Coefficients are given for logRT
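Because the coefficients are on the log scale, they can be back-transformed with exp() to get a feel for their size in milliseconds. The sketch below copies two estimates from the output; it assumes R's default treatment coding, under which the intercept is the predicted logRT of the reference condition (presumably WHICH_WHICH, since both listed contrasts are for BARE). Note that exp() of a mean of logs is a geometric mean, not an arithmetic one.

```r
# Coefficients copied from the regression output (log ms scale)
b0       <- 5.75258   # intercept: reference condition
b.filler <- 0.06698   # fillerBARE

exp(b0)                        # predicted RT of the reference condition in ms
exp(b.filler)                  # multiplicative effect of a bare filler
exp(b0 + b.filler) - exp(b0)   # the same effect as a difference in ms
```

The intercept corresponds to roughly 315 ms, and a bare filler makes reading times at the verb about 7% longer.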
Overview
Clusters in your data
The assumption of independence is violated if clusters in your data are correlated:
Several trials by the same subject
Several trials of the same item
Do subjects really differ?
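One way to see the problem is to fit a model that ignores subjects and then check whether its residuals still cluster by subject. The sketch below uses simulated data (not the real data set), in which each subject is given an individual baseline speed; all names and values are assumptions made for illustration.

```r
set.seed(3)
n.subj <- 20; n.trials <- 16
# Simulated data: each subject has an individual baseline speed
subj.base <- rnorm(n.subj, mean = 5.8, sd = 0.25)
sim <- data.frame(
  subj   = factor(rep(seq_len(n.subj), each = n.trials)),
  filler = factor(rep(c("WHICH", "BARE"), times = n.subj * n.trials / 2))
)
sim$logRT <- subj.base[as.integer(sim$subj)] +
  0.07 * (sim$filler == "BARE") +
  rnorm(nrow(sim), sd = 0.2)

# Fit a model that ignores subjects, then test whether its residuals
# still differ systematically between subjects
m <- lm(logRT ~ filler, data = sim)
res.by.subj <- anova(lm(resid(m) ~ sim$subj))
res.by.subj
```

If the residuals differ reliably between subjects, trials from the same subject are not independent of each other.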
Some example subjects
lms1 <- lm(logRT ~ filler*intervener, data= data.verb, subset= subj== "1")
lms2 <- lm(logRT ~ filler*intervener, data= data.verb, subset= subj== "2")
lms3 <- lm(logRT ~ filler*intervener, data= data.verb, subset= subj== "3")
coefficients(lms1)
coefficients(lms2)
coefficients(lms3)
Three random subjects
> coefficients(lms1)
(Intercept)  fillerBARE     intBARE  fillerBARE:intBARE
 5.69316570  0.26279856  0.09832092         -0.12438628
> coefficients(lms2)
(Intercept)  fillerBARE     intBARE  fillerBARE:intBARE
 5.76799982 -0.07026181 -0.03666451          0.21004255
> coefficients(lms3)
(Intercept)  fillerBARE     intBARE  fillerBARE:intBARE
 6.23218256  0.15147899  0.01664294         -0.03748124
Plotting data for all subjects (from Fox, 2002)
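library(lattice)  # provides trellis.device, xyplot and the panel functions
trellis.device(color=FALSE)
# One panel per subject: logRT at the verb by filler type
xyplot(logRT ~ filler | subj,
  data=data.verb, main="Verb logRTs",
  ylim=c(5,7),
  panel=function(x, y){
    panel.xyplot(x, y)
    # panel.loess(x, y, span=1)
    panel.lmline(x, y, lty=2)  # dashed per-subject regression line
  }
)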
A more convenient way
lmList (in package lme4)
lmList(formula = logRT ~ filler * intervener | subj, data = data.verb)
Coefficients:
  (Intercept)   fillerBARE      intBARE  fillerBARE:intBARE
1    5.693166  0.262798559  0.098320919        -0.124386276
2    5.768000 -0.070261811 -0.036664515         0.210042552
3    6.232183  0.151478990  0.016642943        -0.037481241
4    5.835349  0.178951680  0.080414896        -0.318668839
5    5.717801 -0.006879702  0.035657065         0.264302142
6    5.569169  0.017304250  0.192764048        -0.048537682
7    5.299747  0.054687350  0.021888357        -0.184764732
8    5.667252 -0.013897366 -0.031297567         0.106345849
9 …
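If lme4 is not installed, a comparable table of per-subject coefficients can be built with base R only. The following is a sketch on simulated data (not the real data set; the data frame sim and its values are assumptions made for illustration), fitting the same model formula once per subject with split and lapply.

```r
set.seed(4)
# Simulated data: 5 subjects, 2 x 2 design, 10 trials per cell
sim <- expand.grid(trial = 1:10,
                   filler = c("WHICH", "BARE"),
                   intervener = c("WHICH", "BARE"),
                   subj = 1:5)
sim$logRT <- 5.8 + 0.1 * sim$subj + rnorm(nrow(sim), sd = 0.2)

# Fit the same model separately for each subject and stack the coefficients
fits  <- lapply(split(sim, sim$subj),
                function(d) lm(logRT ~ filler * intervener, data = d))
coefs <- t(sapply(fits, coef))
round(coefs, 3)
```

Each row of coefs holds the intercept and effects for one subject, in the same layout as the lmList output.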
Conclusion
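Subjects clearly differ, both in their overall reading times (intercepts) and in the size and direction of the effects. That’s why we do repeated measures or mixed effect analyses: they capture the differences between subjects as well as what all trials by the same participant have in common.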
Repeated Measures ANOVA in R
# By-subject (F1) and by-item (F2) condition means of logRT
data.verb.F1 <- aggregate(logRT ~ subj + filler + intervener, data= data.verb, FUN= mean)
data.verb.F2 <- aggregate(logRT ~ item + filler + intervener, data= data.verb, FUN= mean)
# The grouping variables must be factors for the Error() strata
data.verb.F1$subj <- factor(data.verb.F1$subj)
data.verb.F2$item <- factor(data.verb.F2$item)
F1 <- aov(logRT ~ filler*intervener + Error(subj/(filler*intervener)), data= data.verb.F1)
F2 <- aov(logRT ~ filler*intervener + Error(item/(filler*intervener)), data= data.verb.F2)
summary(F1)
summary(F2)
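The by-subject analysis can be tried out without the real data. The sketch below simulates trial-level data (the data frame sim, the effect size, and the noise levels are all assumptions made for illustration), averages over trials with the formula interface of aggregate, and runs the same kind of aov call with an Error() term.

```r
set.seed(5)
# Simulated trial-level data: 12 subjects, 2 x 2 design, 6 trials per cell
sim <- expand.grid(trial = 1:6,
                   filler = c("WHICH", "BARE"),
                   intervener = c("WHICH", "BARE"),
                   subj = factor(1:12))
subj.base <- rnorm(12, sd = 0.25)
sim$logRT <- 5.8 + subj.base[as.integer(sim$subj)] +
  0.07 * (sim$filler == "BARE") + rnorm(nrow(sim), sd = 0.2)

# Average over trials: one mean per subject and condition
sim.F1 <- aggregate(logRT ~ subj + filler + intervener, data = sim, FUN = mean)

# By-subject repeated measures ANOVA
F1 <- aov(logRT ~ filler * intervener + Error(subj / (filler * intervener)),
          data = sim.F1)
summary(F1)
```

The by-item (F2) analysis follows the same pattern with item in place of subj.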