Post on 20-Jan-2016
Experimental design and statistical analyses of data
Lesson 1:
General linear models and design of experiments
Examples of General Linear Models (GLM)
Simple linear regression:
$y = \beta_0 + \beta_1 x + \varepsilon$
Ex: Depth at which a white disc is no longer visible in a lake
y = depth at disappearance
x = nitrogen concentration of water
[Figure: depth (m) against N/volume water]
The residual ε expresses the deviation between the model and the actual observation
β0 = intercept
β1 = slope
y = dependent variable
x = independent variable
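A minimal sketch of how the intercept and slope could be estimated by least squares. The data values and the function name `fit_simple_regression` are hypothetical, invented only for illustration; they do not come from the lecture.

```python
# Hypothetical data, invented for illustration only (not from the lecture).
nitrogen = [1.0, 2.0, 3.0, 4.0, 5.0]   # x: N per volume of water
depth = [9.0, 8.1, 6.9, 6.2, 4.8]      # y: depth at disappearance (m)

def fit_simple_regression(x, y):
    """Return least-squares estimates (b0, b1) for y = b0 + b1*x + e."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    return b0, b1

b0, b1 = fit_simple_regression(nitrogen, depth)
# Residuals: deviation between each observation and the fitted line.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(nitrogen, depth)]
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

With an intercept in the model, the least-squares residuals sum to zero (up to rounding), which is a quick sanity check on the fit.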
Polynomial regression:
Ex: y = depth at disappearance
x = nitrogen concentration of water
[Figure: depth (m) against N/volume water]
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$
Multiple regression:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$
Ex: y = depth at disappearance
x1 = Concentration of N
x2 = Concentration of P
[Figure: depth against concentration of N and concentration of P (two panels)]
Analysis of variance (ANOVA):
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
Ex: y = depth at disappearance
x1 = Blue disc
x2 = Green disc
White disc: x1 = 0; x2 = 0
Blue disc: x1 = 1; x2 = 0
Green disc: x1 = 0; x2 = 1
[Figure: depth against disc color (White, Blue, Green)]
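The 0/1 coding of disc color can be sketched as follows. The depth values and the helper names `dummy_code` and `mean` are hypothetical and serve only to illustrate how the coefficients relate to the group means when white is the reference level.

```python
# Hypothetical depths (m) per disc color, invented for illustration.
depths = {
    "white": [8.0, 9.0, 10.0],
    "blue": [5.0, 6.0, 7.0],
    "green": [3.0, 4.0, 5.0],
}

def dummy_code(color):
    """Return (x1, x2): x1 = 1 for blue, x2 = 1 for green, both 0 for white."""
    return (int(color == "blue"), int(color == "green"))

def mean(values):
    return sum(values) / len(values)

# With 0/1 coding and white as the reference level, least squares gives
# b0 = mean(white), b1 = mean(blue) - mean(white), b2 = mean(green) - mean(white).
b0 = mean(depths["white"])
b1 = mean(depths["blue"]) - b0
b2 = mean(depths["green"]) - b0

for color in depths:
    x1, x2 = dummy_code(color)
    print(color, (x1, x2), "fitted depth:", b0 + b1 * x1 + b2 * x2)
```

The fitted value for each color equals that color's mean, so the ANOVA is a linear model with indicator variables in place of a continuous x.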
Analysis of covariance (ANCOVA):
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_1 x_3 + \beta_5 x_2 x_3 + \varepsilon$
Ex: y = depth at disappearance
x1 = Blue disc
x2 = Green disc
x3 = Concentration of N
[Figure: depth against concentration of N]
Nested analysis of variance:
$y = \mu + \alpha_i + \beta_{(i)j} + \varepsilon$
Ex: y = depth at disappearance
μ = overall mean
αi = effect of the ith lake
β(i)j = effect of the jth measurement in the ith lake
What is not a general linear model?
y = β0(1+β1x)
y = β0+cos(β1+β2x)
Other topics covered by this course:
• Multivariate analysis of variance (MANOVA)
• Repeated measurements
• Logistic regression
Experimental designs
Examples
Randomised design
• Effects of p treatments (e.g. drugs) are compared
• Total number of experimental units (persons) is n
• Treatment i is administered to ni units
• Allocation of treatments among units is random
Example of randomized design
• 4 drugs (called A, B, C, and D) are tested (i.e. p = 4)
• 12 persons are available (i.e. n = 12)
• Each treatment is given to 3 persons (i.e. ni = 3 for i = 1, 2, ..., p), so the design is balanced
• Persons are allocated randomly among treatments
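A random, balanced allocation like the one described above could be generated as in the sketch below. The function name `randomize`, its parameters and the seed are assumptions made for illustration, not part of the lecture.

```python
import random

def randomize(units, treatments, per_treatment, seed=None):
    """Randomly allocate units to treatments, per_treatment units each."""
    if len(units) != len(treatments) * per_treatment:
        raise ValueError("number of units does not match a balanced design")
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    allocation = {}
    # Cut the shuffled list into consecutive groups, one per treatment.
    for k, treatment in enumerate(treatments):
        start = k * per_treatment
        allocation[treatment] = shuffled[start:start + per_treatment]
    return allocation

persons = list(range(1, 13))                      # n = 12
plan = randomize(persons, ["A", "B", "C", "D"], per_treatment=3, seed=1)
for drug, group in plan.items():
    print(drug, sorted(group))
```

Fixing the seed makes the allocation reproducible; leaving it as `None` gives a fresh random allocation on each run.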
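Drugs      A      B      C      D      Total
           y1A    y1B    y1C    y1D
           y2A    y2B    y2C    y2D
           y3A    y3B    y3C    y3D
Mean       ȳA     ȳB     ȳC     ȳD     ȳ

Treatment means:
$\bar{y}_A = \sum_j y_{jA} / n_A$
$\bar{y}_B = \sum_j y_{jB} / n_B$
$\bar{y}_C = \sum_j y_{jC} / n_C$
$\bar{y}_D = \sum_j y_{jD} / n_D$
Overall mean:
$\bar{y} = \sum y_{ij} / n$

Note! Different persons receive the different drugs.

Predicted values:
$\hat{y}_A = \bar{y}_A$
$\hat{y}_B = \bar{y}_B$
$\hat{y}_C = \bar{y}_C$
$\hat{y}_D = \bar{y}_D$

Parameter estimates:
$\bar{y}_A = \hat{\beta}_0$
$\bar{y}_B = \hat{\beta}_0 + \hat{\beta}_1 \Rightarrow \hat{\beta}_1 = \bar{y}_B - \bar{y}_A$
$\bar{y}_C = \hat{\beta}_0 + \hat{\beta}_2 \Rightarrow \hat{\beta}_2 = \bar{y}_C - \bar{y}_A$
$\bar{y}_D = \hat{\beta}_0 + \hat{\beta}_3 \Rightarrow \hat{\beta}_3 = \bar{y}_D - \bar{y}_A$

Drug   Coding    Expected value
A                β0
B      x1 = 1    β0 + β1
C      x2 = 1    β0 + β2
D      x3 = 1    β0 + β3

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$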
Source                      Degrees of freedom
Estimate of β0              1
Treatments (β1, β2, β3)     p - 1 = 3
Residuals                   n - p = 8
Total                       n = 12
Randomized block design
• Every treatment is applied to each experimental unit (block)
• Within each block, treatments are allocated at random
Blocks (b = 3) as columns, treatments (p = 4) in random order within each block:

Block 1   Block 2   Block 3
B         C         B
A         B         D
D         A         A
C         D         C
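                     Treatments
Persons    A      B      C      D      Average
1          y1A    y1B    y1C    y1D    ȳ1
2          y2A    y2B    y2C    y2D    ȳ2
3          y3A    y3B    y3C    y3D    ȳ3
Average    ȳA     ȳB     ȳC     ȳD     ȳ

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \varepsilon$
β1, β2: blocks (b - 1)
β3, β4, β5: treatments (p - 1)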
Randomized block design

Source                  Degrees of freedom
Estimate of β0          1
Blocks (persons)        b - 1 = 2
Treatments (drugs)      p - 1 = 3
Residuals               n - [(b-1) + (p-1) + 1] = 6
Total                   n = 12
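Double block design (Latin square)

                 Person
Sequence    1    2    3    4
1           B    D    A    C
2           A    C    D    B
3           C    A    B    D
4           D    B    C    A

Rows (a = 4)
Columns (b = 4)

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7 + \beta_8 x_8 + \beta_9 x_9 + \varepsilon$
β1, β2, β3: sequence (a - 1)
β4, β5, β6: persons (b - 1)
β7, β8, β9: drugs (p - 1)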
Latin-square design

Source                  Degrees of freedom
Estimate of β0          1
Rows (sequences)        a - 1 = 3
Blocks (persons)        b - 1 = 3
Treatments (drugs)      p - 1 = 3
Residuals               n - [3(p-1) + 1] = 6
Total                   n = p^2 = 16
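One simple way to produce a randomized Latin square is to start from a cyclic square and shuffle its rows and columns, as sketched below. This is only an illustrative construction (it does not sample uniformly from all possible Latin squares), and the function names are hypothetical.

```python
import random

def random_latin_square(treatments, seed=None):
    """Cyclic Latin square with rows and columns shuffled at random."""
    rng = random.Random(seed)
    p = len(treatments)
    rows = list(range(p))
    cols = list(range(p))
    rng.shuffle(rows)
    rng.shuffle(cols)
    # Cell (r, c) of the cyclic square holds treatment number (r + c) mod p.
    return [[treatments[(r + c) % p] for c in cols] for r in rows]

def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    symbols = set(square[0])
    if len(square) != len(symbols):
        return False
    rows_ok = all(len(row) == len(symbols) and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok

square = random_latin_square(["A", "B", "C", "D"], seed=1)
for sequence, row in enumerate(square, start=1):
    print(sequence, " ".join(row))

# The square shown in the lecture (sequences as rows, persons as columns).
lecture_square = [list("BDAC"), list("ACDB"), list("CABD"), list("DBCA")]
print(is_latin_square(lecture_square))
```

Each drug then appears exactly once for every person and once in every sequence position, which is what allows both sources of variation to be separated from the drug effect.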
Factorial designs
• Are used when the combined effects of two or more factors are investigated concurrently.
• As an example, assume that factor A is a drug and factor B is the way the drug is administered
• Factor A occurs in three different levels (called drug A1, A2 and A3)
• Factor B occurs in four different levels (called B1, B2, B3 and B4)
                    Factor B
Factor A    B1     B2     B3     B4     Average
A1          y11    y12    y13    y14    ȳ1.
A2          y21    y22    y23    y24    ȳ2.
A3          y31    y32    y33    y34    ȳ3.
Average     ȳ.1    ȳ.2    ȳ.3    ȳ.4    ȳ

$y_{ij} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \varepsilon$
β1, β2: effect of A
β3, β4, β5: effect of B
No interaction between A and B
Factorial experiment with no interaction
• Survival time at 15 °C and 50% RH: 17 days
• Survival time at 25 °C and 50% RH: 8 days
• Survival time at 15 °C and 80% RH: 19 days
• What is the expected survival time at 25 °C and 80% RH?
• An increase in temperature from 15 °C to 25 °C at 50% RH decreases survival time by 9 days
• An increase in RH from 50% to 80% at 15 °C increases survival time by 2 days
• An increase in temperature from 15 °C to 25 °C combined with an increase in RH from 50% to 80% is therefore expected to change survival time by -9 + 2 = -7 days, giving an expected survival time of 17 - 7 = 10 days
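The arithmetic of the additive (no interaction) prediction can be written out as a short sketch. The three survival times are the ones given in the bullets; the variable names are only illustrative.

```python
# Survival times (days) for the three observed combinations,
# keyed by (temperature in degrees C, relative humidity in %).
survival = {(15, 50): 17, (25, 50): 8, (15, 80): 19}

# Main effects measured from the reference combination (15 C, 50% RH).
effect_temperature = survival[(25, 50)] - survival[(15, 50)]   # 8 - 17
effect_humidity = survival[(15, 80)] - survival[(15, 50)]      # 19 - 17

# Assumption of the model: no interaction, so the two effects simply add.
expected = survival[(15, 50)] + effect_temperature + effect_humidity
print(expected)   # 10
```

If the observed survival time at 25 °C and 80% RH differed clearly from this value, that would point to an interaction between temperature and humidity.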
[Figure: survival time (days) against temperature (°C) at 50 % RH and 80 % RH, built up over several slides]
$y_{ij} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
Factorial experiment with interaction
[Figure: survival time (days) against temperature (°C)]
$y_{ij} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$
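Factorial designs with interaction between A and B (factor A at three levels, factor B at four levels):
$y_{ij} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_1 x_3 + \beta_7 x_1 x_4 + \beta_8 x_1 x_5 + \beta_9 x_2 x_3 + \beta_{10} x_2 x_4 + \beta_{11} x_2 x_5 + \varepsilon$
β1, β2: effect of A
β3, β4, β5: effect of B
β6, ..., β11: interactions between A and B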
Two-way factorial design with interaction, but without replication

Source                          Degrees of freedom
Estimate of β0                  1
Factor A (drug)                 a - 1 = 2
Factor B (administration)       b - 1 = 3
Interactions between A and B    (a-1)(b-1) = 6
Residuals                       n - ab = 0
Total                           n = ab = 12
Two-way factorial design without replication

Source                          Degrees of freedom
Estimate of β0                  1
Factor A (drug)                 a - 1 = 2
Factor B (administration)       b - 1 = 3
Residuals                       n - a - b + 1 = 6
Total                           n = ab = 12

Without replication it is necessary to assume no interaction between factors!
Two-way factorial design with replications

Source                          Degrees of freedom
Estimate of β0                  1
Factor A (drug)                 a - 1
Factor B (administration)       b - 1
Interactions between A and B    (a-1)(b-1)
Residuals                       ab(r-1)
Total                           n = rab
Two-way factorial design with interaction (r = 2)

Source                          Degrees of freedom
Estimate of β0                  1
Factor A (drug)                 a - 1 = 2
Factor B (administration)       b - 1 = 3
Interactions between A and B    (a-1)(b-1) = 6
Residuals                       ab(r-1) = 12
Total                           n = rab = 24
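The degrees-of-freedom bookkeeping for a two-way factorial design can be checked with a small sketch. The function name `two_way_df` and the dictionary keys are hypothetical; the formulas are the ones in the tables.

```python
def two_way_df(a, b, r):
    """Degrees of freedom for a two-way factorial design.

    a, b: number of levels of factors A and B; r: replicates per combination.
    """
    n = r * a * b
    df = {
        "intercept": 1,
        "A": a - 1,
        "B": b - 1,
        "AxB": (a - 1) * (b - 1),
        "residual": a * b * (r - 1),
    }
    # The degrees of freedom must add up to the number of observations.
    assert sum(df.values()) == n
    return df

print(two_way_df(3, 4, 2))   # drug at 3 levels, administration at 4, r = 2
print(two_way_df(3, 4, 1))   # no replication: residual degrees of freedom = 0
```

With r = 1 the residual line is zero, which is why the interaction has to be dropped from the model when there is no replication.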
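Three-way factorial design
[Figure: three-way layout with axes Factor A, Factor B and Factor C]

10 main effects:
$y_{ijk} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7 + \beta_8 x_8 + \beta_9 x_9 + \beta_{10} x_{10}$
β1, β2: Factor A
β3, ..., β7: Factor B
β8, β9, β10: Factor C

31 two-way interactions:
$+ \beta_{11} x_1 x_3 + \beta_{12} x_1 x_4 + \beta_{13} x_1 x_5 + \beta_{14} x_1 x_6 + \beta_{15} x_1 x_7 + \beta_{16} x_1 x_8 + \beta_{17} x_1 x_9 + \beta_{18} x_1 x_{10} + \beta_{19} x_2 x_3 + \beta_{20} x_2 x_4 + \dots + \beta_{41} x_7 x_{10}$

30 three-way interactions:
$+ \beta_{42} x_1 x_3 x_8 + \beta_{43} x_1 x_3 x_9 + \beta_{44} x_1 x_3 x_{10} + \beta_{45} x_1 x_4 x_8 + \dots + \beta_{71} x_2 x_7 x_{10} + \varepsilon$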
Three-way factorial design

Source                             Degrees of freedom
Estimate of β0                     1
Factor A                           a - 1 = 2
Factor B                           b - 1 = 5
Factor C                           c - 1 = 3
Interactions between A and B       (a-1)(b-1) = 10
Interactions between A and C       (a-1)(c-1) = 6
Interactions between B and C       (b-1)(c-1) = 15
Interactions between A, B and C    (a-1)(b-1)(c-1) = 30
Residuals                          abc(r-1) = 0
Total                              n = rabc = 72
Why should more than two levels of a factor be used in a factorial design?
Two levels of a factor
[Figure: survival time (days) against temperature (°C)]

Three levels, qualitative factor (Low, Medium, High)
[Figure: survival time (days) against temperature (°C)]
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$

Three levels, quantitative factor
[Figure: survival time (days) against temperature (°C)]
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$
Why should not many levels of each factor be used in a factorial design?
Because each additional level of a factor multiplies the number of treatment combinations, and hence the number of experimental units to be used
For example, a five-factor experiment with four levels per factor yields 4^5 = 1024 different combinations
If not all combinations are applied in an experiment, the design is partially factorial
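The growth in the number of treatment combinations can be made concrete by enumerating them. The sketch below labels the levels of each factor 1 to 4; the labels and variable names are arbitrary choices for illustration.

```python
from itertools import product

n_factors = 5
levels = range(1, 5)   # four levels per factor, labeled 1..4

# Every combination of one level from each of the five factors.
combinations = list(product(levels, repeat=n_factors))

print(len(combinations))   # 1024, i.e. 4**5
print(combinations[0])     # (1, 1, 1, 1, 1)
```

A full factorial design needs at least one experimental unit per tuple in `combinations`, and more if the design is replicated.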