Part IV The General Linear Model
Multiple Explanatory Variables
Chapter 13.4 Fixed * Random Effects
Randomized block
Statistical control
The effect of one variable (random: subject in the last lecture) is removed to arrive at a better test for the variable of interest (fixed: drug in the last lecture)
It is used when manipulative control is not possible
Manipulative control in the field:
• Not possible at large scales
• Expensive
• Can generate artifacts
A study with well-designed statistical control can be more informative
Randomized block design
"Block what you can, randomize what you cannot."
• Blocking to remove effects of the most important nuisance variables
• Randomization to reduce contaminating effects of remaining nuisance variables
• Within each block, elements must be homogeneous with respect to the response variable
• Heterogeneity among blocks
• Order blocks perpendicular to the gradient of the extraneous variable
• Treatments are randomized within each block
Randomized block design
Example from Neter & Wasserman (1974) Applied Linear Statistical Models
• Response: sales volume
• Explanatory: level of newspaper advertising
• Size of city is correlated with the response
– Block according to population size
– Perpendicular to gradient = from large to small
– Randomize level of advertising within each block
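The "randomize within each block" step can be sketched in R. This is only an illustration: the block names, the three advertising levels, and the seed are hypothetical and do not come from Neter & Wasserman.

```r
set.seed(1)  # arbitrary seed, only so the layout is reproducible

treatments <- c("low", "medium", "high")   # hypothetical advertising levels
blocks     <- c("large", "mid", "small")   # hypothetical city-size blocks

# Shuffle the treatments independently inside every block,
# so each block receives every treatment exactly once.
layout <- do.call(rbind, lapply(blocks, function(b) {
  data.frame(block = b,
             city = seq_along(treatments),
             treatment = sample(treatments))
}))
print(layout)
```

Each block is shuffled on its own, so the order of treatments differs among blocks while the design stays balanced.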
Randomized block design
Some blocking criteria:
• Characteristics associated with the unit
– For persons: gender, age, etc.
– For geographic areas: population size, average income, etc.
• Characteristics associated with the experimental setting
– Observer, time of processing, machine, tank, batch, measuring instrument, etc.
• Data from Sokal & Rohlf 1995. Dry weights of 3 genotypes (wild, heterozygote mutant, homozygote mutant) of flour beetle Tribolium in 4 experiments
• Does weight vary among genotypes?
GLM | Randomized block design
Blocks Gtype Wt
1 1 0.958
2 1 0.971
3 1 0.927
4 1 0.971
1 2 0.986
2 2 1.051
3 2 0.891
4 2 1.010
1 3 0.925
2 3 0.952
3 3 0.829
4 3 0.955
1. Construct Model
Response variable: M = beetle mass
Explanatory variables:
1. Genotype (G): fixed effect
2. Experiment (B): random effect
Experiments were laborious and carried out several months apart
Each experiment is a block
Verbal: Does weight vary among genotypes, after controlling for differences among experiments?
Graphical: (figure not reproduced in this transcript)
Formal: M = β0 + βG·G + βB·B + ε
Can we have an interaction term?
Same situation as last lecture. With one observation per genotype-by-block cell, an interaction term G×B would use (3 − 1)(4 − 1) = 6 df, leaving dfres = 11 − 2 − 3 − 6 = 0. MSres = SSres / dfres would then be undefined, so the interaction term is left out of the model.
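The degrees-of-freedom argument can be checked directly. The sketch below assumes the data are stored in a data frame `trib` with factor columns `G` and `B` and response `M`. These names match the `lm()` call used in the slides, but how the original data frame was built is not shown there.

```r
# Tribolium data, entered in the same order as the slide table
trib <- data.frame(
  B = factor(rep(1:4, times = 3)),   # experiment (block)
  G = factor(rep(1:3, each = 4)),    # genotype
  M = c(0.958, 0.971, 0.927, 0.971,
        0.986, 1.051, 0.891, 1.010,
        0.925, 0.952, 0.829, 0.955)  # dry weight
)

lm_add <- lm(M ~ G + B, data = trib)  # additive model
lm_int <- lm(M ~ G * B, data = trib)  # adds the G:B interaction

df.residual(lm_add)  # residual df of the additive model
df.residual(lm_int)  # residual df once the interaction is included
```

With the interaction included the model has as many parameters as observations, so no degrees of freedom remain for an error term.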
2. Execute analysis
lm1 <- lm(M~G+B, data=trib)
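For the call above to fit a two-factor model, `G` and `B` must be factors. If they are left as the numeric codes 1 to 4, `lm()` fits regression slopes instead. A minimal runnable sketch, assuming the data frame is built by hand from the table (the construction of `trib` is not shown in the slides):

```r
trib <- data.frame(
  B = factor(rep(1:4, times = 3)),   # experiment (block) as a factor
  G = factor(rep(1:3, each = 4)),    # genotype as a factor
  M = c(0.958, 0.971, 0.927, 0.971,
        0.986, 1.051, 0.891, 1.010,
        0.925, 0.952, 0.829, 0.955)
)

lm1 <- lm(M ~ G + B, data = trib)

# With R's default treatment contrasts the intercept is the fitted value
# for genotype 1 in block 1, not the grand mean.
coef(lm1)
df.residual(lm1)
```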
2. Execute analysis
β0 = grand mean, βG = genotype effect, βB = block effect, Fits = β0 + βB + βG, Res = Wt − Fits

Blocks  Gtype  Wt     β0      βG        βB      Fits    Res
1       1      0.958  0.9522   0.00458   0.004  0.9609  -0.0029
2       1      0.971  0.9522   0.00458   0.039  0.9959  -0.0249
3       1      0.927  0.9522   0.00458  -0.070  0.8869   0.0401
4       1      0.971  0.9522   0.00458   0.027  0.9833  -0.0123
1       2      0.986  0.9522   0.03233   0.004  0.9887  -0.0027
2       2      1.051  0.9522   0.03233   0.039  1.0237   0.0273
3       2      0.891  0.9522   0.03233  -0.070  0.9147  -0.0237
4       2      1.010  0.9522   0.03233   0.027  1.0110  -0.0010
1       3      0.925  0.9522  -0.03692   0.004  0.9194   0.0056
2       3      0.952  0.9522  -0.03692   0.039  0.9544  -0.0024
3       3      0.829  0.9522  -0.03692  -0.070  0.8454  -0.0164
4       3      0.955  0.9522  -0.03692   0.027  0.9418   0.0133
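The columns of this table can be reproduced by hand. The sketch below assumes the same `trib` data frame as in the `lm()` call (the column names `M`, `G`, `B` follow the slides; the construction itself is an assumption). Because the design is balanced, each effect is a group mean minus the grand mean.

```r
trib <- data.frame(
  B = factor(rep(1:4, times = 3)),
  G = factor(rep(1:3, each = 4)),
  M = c(0.958, 0.971, 0.927, 0.971,
        0.986, 1.051, 0.891, 1.010,
        0.925, 0.952, 0.829, 0.955)
)

b0   <- mean(trib$M)               # grand mean (β0)
bG   <- ave(trib$M, trib$G) - b0   # genotype mean minus grand mean (βG)
bB   <- ave(trib$M, trib$B) - b0   # block mean minus grand mean (βB)
fits <- b0 + bG + bB               # fitted values
res  <- trib$M - fits              # residuals

round(cbind(trib, b0, bG, bB, fits, res), 4)
sum(res^2)                         # residual sum of squares
```

The hand-computed fits agree with `fitted(lm(M ~ G + B, data = trib))`. Only the parameterization differs, since `lm()` reports treatment contrasts rather than deviations from the grand mean.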
3. Evaluate model
a. Straight line
□ Straight line model ok? NA
b. Need to revise model?
□ Errors homogeneous?
c. Assumptions for computing p-values
□ Errors normal?
□ Errors independent?
4. State the population and whether the sample is representative.
• Genotype: fixed effects. We will infer only to those genotypes
• Experiment (i.e. time and other conditions): random effects
• Population: all possible measurements that could have been made on Tribolium, given the mode of collection
5. Decide on mode of inference. Is hypothesis testing appropriate?
6. State HA / Ho pair, test statistic, distribution, tolerance for Type I error.
Interaction Term:
Removed by experimental design (genotypes weighed in random order)
Block Term (experiment):
We are not interested in this effect.
Only included in model to remove the variance from the error term
6. State HA / Ho pair, test statistic, distribution, tolerance for Type I error.
Genotype Term:
HA: E(MI), E(MII), E(MIII) are not all equal; HA: Var(βG) > 0
H0: E(MI) = E(MII) = E(MIII); H0: Var(βG) = 0
Test statistic: F-ratio (MS Genotype / MS Res)
Distribution of test statistic: F distribution (df = 2, 6)
Tolerance for Type I error: α = 0.05
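As a small illustration of what the tolerance for Type I error implies, the critical F value can be computed from the degrees of freedom. The sketch assumes α = 0.05 (the value used in the decision step) and the additive model with 3 genotypes and 4 blocks.

```r
alpha  <- 0.05
df_g   <- 3 - 1                # genotypes minus 1
df_res <- (3 - 1) * (4 - 1)    # residual df in the additive block design

# H0 is rejected when the observed F-ratio exceeds this value
f_crit <- qf(1 - alpha, df_g, df_res)
f_crit
```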
7. ANOVA n = 12 (total df = n − 1 = 11)
Source    df  SS      MS        F     p
Blocks     3  0.0213
Genotype   2  0.0097  0.004858  6.97  0.027
Res        6  0.0042  0.000697
Total     11  0.0353

What would it look like had we not controlled for experiment?
Source    df  SS      MS        F     p
Genotype   2  0.0097  0.004858  1.71  0.235
Res        9  0.0255  0.002841
Total     11  0.0353
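Both ANOVA tables can be reproduced with `anova()`. The sketch assumes the `trib` data frame used in the `lm()` call, rebuilt here by hand. The name `lm2` for the model without blocks follows the later `effect()` call, but its definition is not shown in the slides.

```r
trib <- data.frame(
  B = factor(rep(1:4, times = 3)),
  G = factor(rep(1:3, each = 4)),
  M = c(0.958, 0.971, 0.927, 0.971,
        0.986, 1.051, 0.891, 1.010,
        0.925, 0.952, 0.829, 0.955)
)

lm1 <- lm(M ~ G + B, data = trib)  # controlled for experiment
lm2 <- lm(M ~ G, data = trib)      # experiment ignored (assumed definition)

a1 <- anova(lm1)
a2 <- anova(lm2)
print(a1)
print(a2)

# The genotype SS is the same in both tables. Only the error term changes:
# the block SS moves into the residual when B is dropped.
c(blocked = a1["G", "F value"], unblocked = a2["G", "F value"])
```

Blocking costs 3 residual degrees of freedom but shrinks the residual mean square about fourfold, which is why the genotype F-ratio rises from 1.71 to 6.97.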
STATISTICAL CONTROL
8. Decide whether to recompute p-value
Residuals:
• homogeneous
• not independent
• deviated slightly from normality
n = 12
p = 0.027 needs to change about 2-fold to change the decision
Randomization (100 000 runs): pran = 0.0186
Change in p: 0.027 / 0.0186 = 1.45
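One way to run such a randomization test is sketched below. The slides do not say how the data were shuffled, so permuting the weights within each block is an assumption, as are the seed and the reduced number of runs (2 000 instead of 100 000, to keep the example fast). The resulting p-value therefore need not equal 0.0186.

```r
trib <- data.frame(
  B = factor(rep(1:4, times = 3)),
  G = factor(rep(1:3, each = 4)),
  M = c(0.958, 0.971, 0.927, 0.971,
        0.986, 1.051, 0.891, 1.010,
        0.925, 0.952, 0.829, 0.955)
)

# F-ratio for genotype in the additive block model
f_stat <- function(d) anova(lm(M ~ G + B, data = d))["G", "F value"]
f_obs  <- f_stat(trib)

set.seed(42)    # arbitrary seed for reproducibility
n_runs <- 2000  # assumption: far fewer runs than the 100 000 in the slides

f_perm <- replicate(n_runs, {
  d <- trib
  d$M <- ave(trib$M, trib$B, FUN = sample)  # shuffle weights within each block
  f_stat(d)
})

# Proportion of shuffled data sets with an F-ratio at least as large as observed
p_ran <- mean(f_perm >= f_obs)
p_ran
```

Shuffling within blocks keeps the block structure intact, so only the assignment of weights to genotypes is randomized.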
9. Declare decision about terms
Only the fixed term was tested: p = 0.0186 < α = 0.05
Reject H0. There is significant variation in mean dry weight among genotypes
10. Report and interpret parameters of biological interest
Means per genotype with 95% CI, not controlled for among-experiment variation
library(effects)
effect("G", lm2, se=TRUE, confidence.level=.95)
[Figure: mean dry weight with 95% CI for Genotype I (heterozygote mutant), Genotype II (wild), Genotype III (homozygote mutant)]
10. Report and interpret parameters of biological interest
Means per genotype with 95% CI, controlled for among-experiment variation
effect("G", lm1, se=TRUE, confidence.level=.95)
[Figure: mean dry weight with 95% CI for Genotype I (heterozygote mutant), Genotype II (wild), Genotype III (homozygote mutant)]
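The effect of blocking on the confidence intervals can also be seen without the effects package. The base-R sketch below relies on the design being balanced, so each genotype mean has standard error sqrt(MSres / 4). The helper `ci_by_genotype()` is hypothetical, and `lm2` is assumed to be the model without the block term. For this balanced additive model the intervals should closely match the `effect()` output, but that has not been verified here.

```r
trib <- data.frame(
  B = factor(rep(1:4, times = 3)),
  G = factor(rep(1:3, each = 4)),
  M = c(0.958, 0.971, 0.927, 0.971,
        0.986, 1.051, 0.891, 1.010,
        0.925, 0.952, 0.829, 0.955)
)
lm1 <- lm(M ~ G + B, data = trib)  # controlled for experiment
lm2 <- lm(M ~ G, data = trib)      # not controlled (assumed definition)

# Hypothetical helper: genotype means with a CI based on the model's residual MS
ci_by_genotype <- function(model, data, level = 0.95) {
  means  <- tapply(data$M, data$G, mean)
  n_per  <- tapply(data$M, data$G, length)
  ms_res <- sum(residuals(model)^2) / df.residual(model)
  half   <- qt(1 - (1 - level) / 2, df.residual(model)) * sqrt(ms_res / n_per)
  data.frame(genotype = names(means),
             mean  = as.numeric(means),
             lower = as.numeric(means - half),
             upper = as.numeric(means + half))
}

ci_unblocked <- ci_by_genotype(lm2, trib)
ci_blocked   <- ci_by_genotype(lm1, trib)
print(ci_unblocked)
print(ci_blocked)
```

The genotype means are identical in both tables. Only the interval width changes, because the blocked model uses the smaller residual mean square.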