Bayesian ANOVA
Or how to learn what you know all over again but different
• History of ANOVA
• The Math of ANOVA
• Bayes Theorem
• Anatomy of Bayesian ANOVA
• Compare and Contrast!
• Rumble in the Jungle: Advantages of Bayes
• Real World 13: Genotype and Frequency Dependence in an invasive grass.
Ronald Fisher, 1956
John Bennet Lawes: founder of Rothamsted Experimental Station, 1843
Harvesting of Broadbalk field, the source of the data for Fisher’s 1921 paper on variation in crop yields.
Excerpt from Studies in Crop Variation: An examination of the yield of dressed grain from Broadbalk, Journal of Agricultural Science 11: 107-135, 1921
Cover page from his 1925 book formalizing ANOVA methods
Table from chapter 8 of Statistical Methods for Research Workers, on the analysis of randomized block designs.
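$$SS_{\text{total}} = SS_{\text{among groups}} + SS_{\text{within groups}}$$

$$\sum_{i=1}^{a}\sum_{j=1}^{n}\left(Y_{ij}-\bar{Y}\right)^{2} = \sum_{i=1}^{a}\sum_{j=1}^{n}\left(\bar{Y}_{i}-\bar{Y}\right)^{2} + \sum_{i=1}^{a}\sum_{j=1}^{n}\left(Y_{ij}-\bar{Y}_{i}\right)^{2}$$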
Adapted from Gotelli and Ellison 2004
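The sum-of-squares partition can be checked numerically. The following is a minimal sketch in Python, assuming a small made-up balanced data set (a = 3 groups, n = 4 observations each). The data values and the function name `sums_of_squares` are illustrative assumptions, not part of the talk.

```python
from statistics import mean

def sums_of_squares(groups):
    """Return (SS_total, SS_among, SS_within) for a list of groups."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Total: every observation against the grand mean
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)
    # Among groups: each group mean against the grand mean, once per observation
    ss_among = sum(len(g) * (m - grand_mean) ** 2
                   for g, m in zip(groups, group_means))
    # Within groups: every observation against its own group mean
    ss_within = sum((y - m) ** 2
                    for g, m in zip(groups, group_means) for y in g)
    return ss_total, ss_among, ss_within

# Hypothetical data: 3 groups of 4 observations
groups = [[4.1, 3.8, 4.4, 4.0],
          [5.2, 5.0, 4.7, 5.5],
          [3.1, 3.6, 3.3, 2.9]]
ss_total, ss_among, ss_within = sums_of_squares(groups)
print(ss_total, ss_among + ss_within)  # the two numbers should agree
```

Up to floating-point error, the total sum of squares equals the among-group plus within-group sums of squares.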
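Source | d.f. | Sum of squares | Mean square | F-ratio | p-value
Among groups | $a-1$ | $\sum_{i=1}^{a}\sum_{j=1}^{n}(\bar{Y}_{i}-\bar{Y})^{2}$ | $SS_{\text{among groups}}/(a-1)$ | $MS_{\text{among groups}}/MS_{\text{within groups}}$ | Determined from F-distribution with $(a-1)$, $a(n-1)$ d.f.
Within groups | $a(n-1)$ | $\sum_{i=1}^{a}\sum_{j=1}^{n}(Y_{ij}-\bar{Y}_{i})^{2}$ | $SS_{\text{within groups}}/(a(n-1))$ | |
Total | $an-1$ | $\sum_{i=1}^{a}\sum_{j=1}^{n}(Y_{ij}-\bar{Y})^{2}$ | $SS_{\text{total}}/(an-1)$ | |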
Adapted from Gotelli and Ellison 2004
Our statistical model
$$y_{ij} = \mu + \beta_{1i} + \epsilon_{ij}$$
Rev. Thomas Bayes 1702-1761
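$$p(\theta \mid y) = \frac{p(\theta, y)}{p(y)} = \frac{p(\theta)\,p(y \mid \theta)}{p(y)}$$

$$p(\theta \mid y) \propto p(\theta)\,p(y \mid \theta)$$

Here $p(\theta)$ is the prior and $p(y \mid \theta)$ is the likelihood.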
Adapted from Clark 2007
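To make "posterior is proportional to prior times likelihood" concrete, here is a minimal grid-approximation sketch in Python. It assumes a hypothetical binomial data set (7 events in 10 trials) and a flat prior on the event probability; neither comes from the talk, and `grid_posterior` is an illustrative name.

```python
from math import comb

def grid_posterior(successes, trials, grid_size=101):
    """Approximate p(theta | y) on an evenly spaced grid over [0, 1]."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0 for _ in thetas]  # flat prior (an assumption for this sketch)
    likelihood = [comb(trials, successes)
                  * t ** successes * (1 - t) ** (trials - successes)
                  for t in thetas]
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalized)  # plays the role of p(y) on the grid
    posterior = [u / evidence for u in unnormalized]
    return thetas, posterior

# Hypothetical data: 7 events in 10 trials
thetas, posterior = grid_posterior(successes=7, trials=10)
best = thetas[posterior.index(max(posterior))]
print(best)  # grid value with the highest posterior mass
```

Dividing by the sum over the grid stands in for the denominator p(y), which is why the proportional form is usually all that is needed in practice.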
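[Figure: three model structures for observations y1, y2, y3, ..., y10. Common Risk: all observations share one parameter. Independent Risk: each observation has its own parameter. Hierarchical: the observation-level parameters are themselves drawn from a common distribution.]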
Adapted from Clark 2007
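$$y_{ij} = \mu + \beta_{1i} + \epsilon_{ij}, \qquad \epsilon_{ij} \sim N(0, \sigma^{2})$$

or

$$y_{ij} \sim N(\mu + \beta_{1i}, \sigma_{y}^{2})$$

$$\beta_{1i} \sim N(0, \sigma_{\beta}^{2})$$

Finite-population standard deviation for batch m of effects:

$$s_{m} = \sqrt{\frac{1}{J_{m}-1}\sum_{j=1}^{J_{m}}\left(\beta_{j}^{(m)} - \bar{\beta}^{(m)}\right)^{2}}$$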
From Qian and Shen 2007
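The finite-population standard deviation is just the sample standard deviation of the effects in one batch. A minimal sketch in Python, assuming four hypothetical treatment effects (the values and the function name `finite_population_sd` are illustrative, not from the talk):

```python
from math import sqrt
from statistics import mean, stdev

def finite_population_sd(effects):
    """s_m = sqrt( sum_j (beta_j - mean(beta))^2 / (J_m - 1) )."""
    j_m = len(effects)
    center = mean(effects)
    return sqrt(sum((b - center) ** 2 for b in effects) / (j_m - 1))

# Hypothetical effects for one batch (e.g. one posterior draw of four treatment effects)
effects = [-0.30, 0.10, 0.25, -0.05]
s_m = finite_population_sd(effects)
print(round(s_m, 3))
```

In a Bayesian fit this quantity would typically be evaluated for every posterior draw of the effects, which yields a posterior distribution, and hence an uncertainty interval, for s_m itself.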
Source | d.f. | SS | MS | F-ratio | p-value
Treatment | 3 | 3.10 | 1.03 | 6.73 | 0.0068
Location | 3 | 1.01 | 0.34 | 2.19 | 0.101
Treatment*Location | 9 | 1.24 | 0.14 | 0.88 | 0.5543
Residuals | 49 | 7.52 | 0.16 | |
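As a worked check of how the columns relate, the mean squares and F-ratios can be recomputed from the reported sums of squares and degrees of freedom (MS = SS / d.f., F = MS / MS of the residuals). This is a minimal sketch in Python; the helper name `anova_row` is an illustrative assumption. Because the reported sums of squares are rounded, recomputed values may differ from the table by a few hundredths.

```python
def anova_row(ss, df, ms_resid):
    """Return (mean square, F-ratio) for one source of variation."""
    ms = ss / df
    return ms, ms / ms_resid

# Values taken from the table: (sum of squares, degrees of freedom)
ss_resid, df_resid = 7.52, 49
ms_resid = ss_resid / df_resid
rows = {
    "Treatment": (3.10, 3),
    "Location": (1.01, 3),
    "Treatment*Location": (1.24, 9),
}
results = {name: anova_row(ss, df, ms_resid) for name, (ss, df) in rows.items()}
for name, (ms, f_ratio) in results.items():
    print(f"{name}: MS = {ms:.2f}, F = {f_ratio:.2f}")
```

The p-value column would then come from the F-distribution with the row's d.f. and the residual d.f., which needs a statistics library and is left out of this sketch.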
Lines represent 95% credible intervals for the Bayesian estimates and confidence intervals for the frequentist estimates.
Comparison | Control v. Foam | Control v. Haliclona | Control v. Tedania | Foam v. Haliclona | Foam v. Tedania
Orthogonal contrasts p-value | 0.0397 | 0.002 | 0.0015 | 0.258 | 0.0521
Tukey’s HSD p-value | 0.16 | 0.01 | 0.00001 | 0.66 | 0.21
Bonferroni adjusted pairwise t-test p-value | 0.238 | 0.012 | 0.0009 | 1.00 | 0.313
Bayesian credible interval around the difference between 2 means | (-0.68, 0.03) | (-0.84, -0.12) | (-0.91, -0.18) | (-0.51, 0.21) | (-0.58, 0.14)
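A credible interval for the difference between two means can be read directly off posterior draws. The sketch below, in Python, assumes hypothetical posterior samples for two group means, simulated here from normal distributions purely for illustration. A real analysis would use the MCMC output instead, and `credible_interval` is an illustrative name.

```python
import random

def credible_interval(draws, level=0.95):
    """Equal-tailed interval from a list of posterior draws (simple percentile rule)."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1 - level) / 2
    lower = ordered[int(tail * (n - 1))]
    upper = ordered[int(round((1 - tail) * (n - 1)))]
    return lower, upper

random.seed(1)
# Hypothetical posterior draws for two group means (stand-ins for MCMC output)
mean_a = [random.gauss(1.0, 0.1) for _ in range(4000)]
mean_b = [random.gauss(1.5, 0.1) for _ in range(4000)]
difference = [a - b for a, b in zip(mean_a, mean_b)]

lower, upper = credible_interval(difference)
print(f"95% credible interval for the difference: ({lower:.2f}, {upper:.2f})")
print("excludes zero:", not (lower <= 0 <= upper))
```

Read this way, the intervals in the table that exclude zero, such as (-0.84, -0.12) for Control v. Haliclona, point to a credible difference between the two means, while those spanning zero do not.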
• Avoids the muddled idea of fixed vs. random effects, treating all effects as random.
• Provides estimates of effects as well as variance components with corresponding uncertainty.
• Allows more flexibility in model construction (e.g. GLMs instead of just normal models).
• Issues such as normality, unbalanced designs, or missing values are easily handled in this framework.
• You just don’t believe in p-values (uninformative, etc.; see Anderson et al. 2000).
What’s up now Fisher,
Neyman-Pearson null hypothesis testing!?
Source | d.f. | SS | MS | F-ratio | p-value
Plot | 2 | 209 | 154 | 8.9 | 0.0002
Genotype | 6 | 63 | 10 | 0.6 | 0.72
Plot*Genotype | 12 | 227 | 19 | 1.1 | 0.36
Year | 1 | 113 | 113 | 6.5 | 0.012
Residuals | 106 | 1790 | 17 | |
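model {
  # Likelihood: each observation is normal around its fitted mean
  for (i in 1:n) {
    y[i] ~ dnorm(y.mu[i], tau.y)
    y.mu[i] <- mu + delta[plottype[i]] + gamma[studyyear[i]] + nu[gens[i]] + interact[plottype[i], gens[i]]
  }

  # Grand mean and residual standard deviation (dnorm takes a precision, tau = 1/sigma^2)
  mu ~ dnorm(0, .0001)
  tau.y <- pow(sigma.y, -2)
  sigma.y ~ dunif(0, 100)
  mu.adj <- mu + mean(delta[]) + mean(gamma[]) + mean(nu[]) + mean(interact[,])

  # compute finite population standard deviation
  for (i in 1:n) {
    e.y[i] <- y[i] - y.mu[i]
  }
  s.y <- sd(e.y[])

  xi.d ~ dnorm(0, tau.d.xi)
  tau.d.xi <- pow(prior.scale.d, -2)

  # Plot-type effects and plot-by-genotype interactions
  for (k in 1:n.plottype) {
    delta[k] ~ dnorm(mu.d, tau.delta)
    d.adj[k] <- delta[k] - mean(delta[])
    for (z in 1:n.gens) {
      interact[k, z] ~ dnorm(mu.inter, tau.inter)
    }
  }
  # The slide ends here: priors for gamma, nu, mu.d, tau.delta, mu.inter and tau.inter are not shown.
}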
Nick Gotelli
Robin Collins