experimental designs and hypothesis testing
EXPERIMENTAL DESIGNS
AND HYPOTHESIS TESTING
PMA 4570/6228
Lab 5
July 13, 2017
Objectives
Design an experimental layout
Simple calculation and analysis of data
Interpret statistical results
Experimental Designs
Completely Randomized Design (CRD)
Randomized Block Design (RBD): Complete and Incomplete
Latin Square
Factorial
Split plot
Randomization to assign subjects to
treatment groups
Helps to prevent bias amongst treatment groups
Helps to balance the treatment groups with respect
to secondary (unknown) factors
Enables us to attribute any response differences to
the treatment rather than the secondary (unknown)
factors
Completely Randomized Design (CRD)
only 1 primary factor under consideration in
the experiment
test subjects are assigned to treatment levels of
the primary factor at random
assumes secondary factors will not produce a
systematic difference in response
E.g. plant height, petri dish location
CRD (Completely Randomized Design)
Diet A Diet D Diet B Diet D
Diet B Diet B Diet A Diet C
Diet D Diet A Diet C Diet B
Diet A Diet C Diet D Diet C
Example:
Four artificial diets are going to be compared for egg
production of H. axyridis (Coleoptera: Coccinellidae). The
females are randomly selected from the same colony and
randomly assigned to a diet treatment.
Treatment: Diet (4 levels: A, B, C, D)
Response: Egg production
Replicates: 4
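A minimal sketch of how a CRD layout like the one above could be generated. The function name `crd_layout` and the seed are illustrative assumptions, not part of the lab handout; the idea is simply to list every experimental unit and shuffle the whole list with no blocking.

```python
import random
from collections import Counter

def crd_layout(treatments, replicates, seed=None):
    """Return a randomized list with one treatment per experimental unit."""
    rng = random.Random(seed)  # seed is only here to make the sketch repeatable
    # One entry per experimental unit: each treatment repeated `replicates` times.
    units = [t for t in treatments for _ in range(replicates)]
    rng.shuffle(units)  # complete randomization across all units
    return units

diets = ["Diet A", "Diet B", "Diet C", "Diet D"]
layout = crd_layout(diets, replicates=4, seed=1)

# Print as a 4 x 4 grid, similar to the slide layout.
for i in range(0, len(layout), 4):
    print("  ".join(layout[i:i + 4]))

counts = Counter(layout)  # each diet should appear exactly 4 times
```

Every diet still appears four times; only the positions are random.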
Randomized Complete Block Design (RCBD)
Block 1:
A B
C D
Block 2:
B A
D C
Block 3:
A B
D C
Block 4:
D C
B A
acknowledges the potential effect of “secondary” factors, unlike the CRD
contains blocks that represent the secondary factor, e.g. shelves in your incubator, trays of plants, slope, soil type
treatments within each block are exposed to conditions that are as similar as possible
number of blocks represents number of replicates
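As a rough illustration of the difference from a CRD, the randomization in an RCBD is done separately inside each block, so every block holds each treatment once. The helper name `rcbd_layout` and the seed are assumptions made for this sketch.

```python
import random

def rcbd_layout(treatments, n_blocks, seed=None):
    """Return {block name: randomized order of all treatments}."""
    rng = random.Random(seed)
    layout = {}
    for b in range(1, n_blocks + 1):
        order = list(treatments)
        rng.shuffle(order)  # independent randomization within each block
        layout[f"Block {b}"] = order
    return layout

rcbd = rcbd_layout(["A", "B", "C", "D"], n_blocks=4, seed=2)
for block, order in rcbd.items():
    print(block, " ".join(order))
```

Because each block is one complete replicate, 4 blocks give 4 replicates per treatment.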
Balanced Incomplete Block (BIB) Design
Block can only contain a subset of the treatments
Not all treatments are present in every block
BUT each possible pair of treatments has to occur together within a block the same number of times to allow for comparison
Not a common design
Block 1: A C E D
Block 2: B E D C
Block 3: E A B D
Block 4: A B C E
Block 5: B C D A
Treatments – 5
Replicates – 4 (each treatment appears in 4 blocks)
Comparison pairs – 3 (each pair of treatments occurs together in 3 blocks)
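The balance property of the BIB layout above can be checked by counting. This sketch takes the five blocks exactly as listed on the slide and tallies how often each treatment and each pair of treatments occurs; the variable names are illustrative.

```python
from collections import Counter
from itertools import combinations

# The five blocks from the slide, in the order listed.
blocks = [
    ["A", "C", "E", "D"],
    ["B", "E", "D", "C"],
    ["E", "A", "B", "D"],
    ["A", "B", "C", "E"],
    ["B", "C", "D", "A"],
]

# How many blocks contain each treatment (the number of replicates).
replicates = Counter(t for block in blocks for t in block)

# How many blocks contain each unordered pair of treatments.
pair_counts = Counter(
    tuple(sorted(pair)) for block in blocks for pair in combinations(block, 2)
)

print(dict(replicates))
print(dict(pair_counts))
```

A design is balanced when every value in `pair_counts` is the same, which is what allows all pairwise comparisons to be made with equal precision.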
Factorial Design
combines effects of 2 or more factors to understand how
they affect a biological system individually and interactively
2-way factorial or 3-way factorial (more can get very
complicated!)
Treatment = combination of factors used in the experiment
Can be used in a CRD or RCBD
Factorial Design
Example:
Factor 1= Diets A, B, C
Factor 2 = Water in diet (1=10 ml, 2=15 ml)
Diet  H2O  TRT
A     1    A1
A     2    A2
B     1    B1
B     2    B2
C     1    C1
C     2    C2
Rep 1: A1 C2 C1 B2 B1 A2
Rep 2: C1 B2 B1 A2 A1 C2
Rep 3: A2 B1 A1 B2 C2 C1
This is an RCBD experiment
Replicates = blocks
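A short sketch of how the factorial treatment list could be built and then randomized within each rep (block). `itertools.product` crosses the two factors; the function name `randomize_reps` and the seed are assumptions for illustration.

```python
import random
from itertools import product

diets = ["A", "B", "C"]
water_levels = ["1", "2"]  # 1 = 10 ml, 2 = 15 ml

# Every diet crossed with every water level: 3 x 2 = 6 treatments.
treatments = [d + w for d, w in product(diets, water_levels)]

def randomize_reps(treatments, n_reps, seed=None):
    """Return one independently randomized treatment order per rep (block)."""
    rng = random.Random(seed)
    return [rng.sample(treatments, k=len(treatments)) for _ in range(n_reps)]

reps = randomize_reps(treatments, n_reps=3, seed=3)
for i, rep in enumerate(reps, start=1):
    print(f"Rep {i}:", " ".join(rep))
```

Adding a third factor only means passing one more list to `product`, but the number of treatments multiplies quickly.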
Latin Square
Each treatment appears exactly once in each row and once in
each column
Also known as row-column design
Could be used in a greenhouse or a field experiment
col 1 col 2 col 3 col 4
row 1 A B C D
row 2 B C D A
row 3 C D A B
row 4 D A B C
Example:
The left light is out in the incubator.
The shelf level (row) AND the
location left to right on each shelf
(col) are sources of variation.
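The 4 x 4 square above follows a cyclic pattern (each row is the previous row shifted by one). The sketch below rebuilds it, checks the row and column property, and shuffles rows and columns, which is one simple way to randomize a Latin square. All function names are assumptions for this example.

```python
import random

def cyclic_latin_square(treatments):
    """Build a Latin square by shifting the treatment list one step per row."""
    n = len(treatments)
    return [[treatments[(r + c) % n] for c in range(n)] for r in range(n)]

def is_latin_square(square):
    """True if every treatment occurs exactly once in each row and column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return len(symbols) == n and rows_ok and cols_ok

def randomize_latin_square(square, seed=None):
    """Shuffle whole rows and whole columns; the Latin property is preserved."""
    rng = random.Random(seed)
    rows = rng.sample(square, k=len(square))
    col_order = rng.sample(range(len(square)), k=len(square))
    return [[row[c] for c in col_order] for row in rows]

square = cyclic_latin_square(["A", "B", "C", "D"])
shuffled = randomize_latin_square(square, seed=4)
for row in shuffled:
    print(" ".join(row))
```

In the incubator example, rows would be shelf levels and columns the left-to-right positions on each shelf.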
Split Plot
At least two factors have to be present
Whole plot (WP) – assigned to factor 1, generally based
on logistics of experiment (see example)
Subplots (SP) – factor 2, randomly assigned to each WP
Can be a CRD or RCBD
Also evaluates how factors affect your subjects
individually and interactively
Split Plot
Example:
Factor 1 (WP) = temperature – 10, 20°C
Factor 2 (SP) = diet – A, B, C, D
20°C (whole plot)
Block 1: C B D A
Block 2: D A B C
Block 3: B A D C
Block 4: D C B A
10°C (whole plot)
Block 1: C B D A
Block 2: D A B C
Block 3: B A D C
Block 4: D C B A
SP are organized as an RCBD within each WP
4 replicates = 4 blocks in each WP
**Temp is the WP factor because each incubator can be set to only one temperature
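A hedged sketch of the split-plot randomization described above: the whole-plot factor (temperature) is fixed per incubator, and the subplot factor (diet) is randomized separately within every block of every whole plot. The function name, the plain-text labels "20C" and "10C", and the seed are assumptions for illustration.

```python
import random

def split_plot_layout(whole_plots, subplot_treatments, n_blocks, seed=None):
    """Return {whole plot: list of blocks, each a random order of subplots}."""
    rng = random.Random(seed)
    layout = {}
    for wp in whole_plots:
        # Diets are re-randomized in every block of every whole plot.
        layout[wp] = [
            rng.sample(subplot_treatments, k=len(subplot_treatments))
            for _ in range(n_blocks)
        ]
    return layout

split_plot = split_plot_layout(["20C", "10C"], ["A", "B", "C", "D"], n_blocks=4, seed=5)
for wp, wp_blocks in split_plot.items():
    print(wp)
    for i, order in enumerate(wp_blocks, start=1):
        print(f"  Block {i}:", " ".join(order))
```

Only the diets are randomized here; the temperature assigned to each incubator would be decided separately.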
Hypothesis Testing
Purpose of an experiment: test a question/hypothesis
about the effectiveness of a new product/technique
Statistical analysis allows us to determine the
probability (P) of obtaining results like those observed
in a given sample if the null hypothesis is true
Null hypothesis (H0) – no difference
E.g. There are no differences in artificial diets for H. axyridis.
Alternative hypothesis (Ha) – there are differences
E.g. At least one of the artificial diets for H. axyridis is different.
(Flint and Gouveia 2001)
Testing hypotheses and levels of significance
p-value: probability of seeing variation among means at
least as large as that observed by chance alone, if H0 is true
P > 0.05: not significant, therefore do not reject H0
P ≤ 0.05: significant, therefore reject H0
Small p-value indicates strong evidence against H0,
therefore reject H0
E.g. P < 0.0001: highly significant, therefore reject H0
Large p-value indicates weak evidence against H0, therefore “fail to reject” H0
E.g. P = 0.861: not significant, therefore do not reject H0
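The decision rule above can be written as a small function. This is only a sketch: the name `decide` is made up, and the 0.05 cutoff is treated as inclusive (P ≤ 0.05), which is an assumption matching the cutoff used later for ANOVA.

```python
def decide(p_value, alpha=0.05):
    """Apply the significance rule: reject H0 when p_value <= alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must be between 0 and 1")
    if p_value <= alpha:
        return "significant: reject H0"
    return "not significant: do not reject H0"

print(decide(0.00005))  # a very small p-value
print(decide(0.861))    # a large p-value
```

Note the wording of the second outcome: a large p-value does not prove H0, it only fails to provide evidence against it.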
Testing Common Hypotheses
Comparing 2 treatment means (t-test)
H0: The two treatment means are equal
H1: The two treatment means differ
Comparing 3 or more treatment means (ANOVA)
H0: All of the treatment means are equal
H1: At least one treatment mean differs
T-test and ANOVA give you a p-value
A significant p-value (P ≤ 0.05) indicates treatment differences
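To show where the ANOVA result comes from, here is a minimal hand calculation of the one-way ANOVA F statistic using only the standard library. The numbers are made up for illustration and are not data from this lab. Turning F into a p-value needs the F distribution, which statistical software supplies (for example, SciPy's `scipy.stats.f_oneway` returns both the statistic and the p-value).

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of treatment groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the treatment means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own treatment mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical responses for three treatments, three replicates each.
example_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(example_groups)
print(f_stat, df_b, df_w)
```

A large F means the treatment means differ much more than the replicates within a treatment do, which is what leads to a small p-value.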
Means Separation Tests
[Figure: bar chart of the average response (axis 0 to 7) for treatments a, b, c, d]
Tukey’s test and LSD (Least Significant Difference)
are common
Only perform if ANOVA is significant (P ≤ 0.05)
Results look like this:
Treatment a  5.2  A
Treatment d  4.8  A
Treatment c  4.1  AB
Treatment b  3.0  B
Treatments sharing a letter are not significantly different
Simple Linear Regression
Explores and models the relationship between two
variables (x and y)
X = independent or predictor variable
Y = response or dependent variable
Changes in X are assumed to cause changes in Y
Example: yield loss (y) and pest numbers (x)
Simple Linear Regression
Regression analysis measures the correlation (r) between X and Y
• Correlation coefficient: r
• Measures strength of the linear relationship between x and y
• Ranges between -1 and 1
• r ≥ 0.9: highly positively correlated
• r ≤ –0.9: highly negatively correlated
• Coefficient of determination: r²
• It is the square of the correlation coefficient
• Ranges between 0 and 1
• Proportion of total variation in y attributable to variation in x
• Values of r² > 0.65 indicate a strong relationship between the factors (statistical significance also depends on sample size)
r gives you the direction of the association
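The quantities above can be computed directly from the sums of squares. The sketch below fits a least-squares line and returns the slope, intercept, r and r²; the function name `linear_fit` and the example numbers are assumptions, chosen so that one data set rises perfectly and the other falls perfectly.

```python
from math import sqrt
from statistics import mean

def linear_fit(x, y):
    """Return (slope, intercept, r, r_squared) for a least-squares line."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)  # the sign of r matches the sign of the slope
    return slope, intercept, r, r ** 2

# Illustrative values only: y rises with x, then y falls with x.
x = [0, 5, 10, 15, 20]
rising = linear_fit(x, [0, 5, 10, 15, 20])
falling = linear_fit(x, [20, 15, 10, 5, 0])
print(rising)
print(falling)
```

Both examples have r² = 1, but r is positive in one and negative in the other, which is why r² alone cannot tell you the direction of the association.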
Simple Linear Regression
[Figure: Yield (kg) vs. # of TSSM, with fitted line y = -8.9016x + 1011.2, R² = 0.8957]
[Figure: Yield (kg) vs. # of Neoseiulus californicus, with fitted line y = x, R² = 1]
Homework
Experimental Design and Hypothesis testing
handout
Worth 15 pts
DUE: Tues. July 18 before midnight by email