Biol 500: basic statistics
Goals:
1) understand basics of experimental design - controls - replication
2) understand how to report quantitative data
3) be able to interpret the reporting of statistical results in a journal article
Replication: allows you to determine if the difference between treatments or groups of samples is greater than the variation within a treatment or group
Is there a difference in how effective the 3 drugs are in curing headaches?
[Figure: three example graphs, labelled "?", "no" and "yes"]
Generally, overlapping error bars indicate no significant difference between the mean values that are being graphed
Bars don’t overlap = probably different
Controls: From these data, could you tell if the least effective drug has any effect at all?
Key to interpreting your results: Include a control that is the same in all respects except the one variable you will manipulate experimentally
Controls: Procedural controls allow you to diagnose problems in your experiment, samples or technique
When we amplify DNA from unknown samples by PCR, we include a positive control (a DNA sample that always works) and a negative control (all the PCR reagents, but no DNA)
This allows us to interpret the results of our gels, and to troubleshoot any problems
Do squirrels bury acorns?
My experiment: I remove all the squirrels from 3 clumps of trees in one park, but leave the squirrels in 3 ‘control’ tree clumps in another park, on the other side of town
[Figure: Park A, Park B]
Pseudoreplication
In this example the unit of replication is the park, not the clump of trees – I have no actual replication
Any difference I measure could be due to differences between the two parks, and not due to my squirrel-removal treatment
Avoiding pseudoreplication
Correct design would be to have squirrel-removal and control areas in each of several replicate parks
This lets you assess differences between treatment and control areas, while simultaneously measuring variation among parks
[Figure: Park A, Park B, Park C]
n = 10
Did these two classes do differently on my 418 midterm?
n = 20
n = 44: mean = 133.9 ± 29.7 SD, range: 59 - 183
n = 44: mean = 126.3 ± 38.8 SD, range: 42 - 188
The statistical approach is to ask if the means of these two populations are significantly different
the standard deviation (SD) is what you should report if you are actually interested in the variation – i.e., for purposes of deciding where to draw the line between grades
n = 44: mean = 133.9 ± 29.7 SD or ± 4.3 SE, range: 59 - 183
n = 44: mean = 126.3 ± 38.8 SD or ± 5.8 SE, range: 42 - 188
the standard error (SE, or SEM) is SD / √n, where n is the sample size
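As a minimal sketch of the SD and SE calculation described above, using only the Python standard library. The function name `sd_and_se` and the scores are hypothetical, made up for illustration; they are not the class data.

```python
import math
import statistics


def sd_and_se(values):
    """Return (sample SD, standard error of the mean) for a list of numbers."""
    sd = statistics.stdev(values)       # sample SD (n - 1 in the denominator)
    se = sd / math.sqrt(len(values))    # SE = SD / sqrt(n)
    return sd, se


# Hypothetical scores, for illustration only (not the class data).
scores = [120, 135, 150, 110, 145, 160, 125, 130, 140]
sd, se = sd_and_se(scores)
print(f"mean = {statistics.mean(scores):.1f} ± {sd:.1f} SD or ± {se:.1f} SE (n = {len(scores)})")
```

Because SE divides SD by √n, the SE shrinks as the sample gets larger even when the spread of the data stays the same.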
the standard error is what you report when you want to compare the means of different treatments or samples
mean = 133.9 ± 29.7 SD
mean = 126.3 ± 38.8 SD
unpaired two-tailed t-test: df = 86, t = 1.20, P = 0.23
a t-test compares 2 populations by calculating a test statistic called t and determining the probability (P, or p) of getting a value of t at least that large, with that sample size, by chance alone
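A rough sketch of how the t statistic for an unpaired test can be computed by hand. It assumes the classic pooled-variance (equal-variance) form of the test; the function name `unpaired_t` is hypothetical. Turning t into a P-value needs a t table or statistics software, so that step is left out.

```python
import math
import statistics


def unpaired_t(group_a, group_b):
    """Return (t, df) for an unpaired t-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = (n_a + n_b) - 2  # (total sample size) - (number of groups)
    # Pooled variance: average of the two sample variances, weighted by df.
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se_diff
    return t, df


# Hypothetical data, for illustration only.
t, df = unpaired_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(f"df = {df}, t = {t:.2f}")
```

The sign of t only reflects which group was listed first; a two-tailed test looks at its absolute size.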
a paired test would compare, for example, each student's % score on the midterm versus the final; most tests are unpaired
one-tailed if you have some reason to think, in advance, that the 2009 scores will only be higher (or lower) than 2007
- cuts your P-value in half, but you need a reason to do this!
the power of your test depends on your degrees of freedom (df), which is (total sample size) – (number of groups)
- in this case: (44 + 44 students) – (2 groups) = 88 – 2 = 86
P values below 0.05 are accepted as significant, meaning there is less than a 5% chance of getting a test statistic this large if the groups are not really any different
3 or more samples can be compared using a one-way Analysis of Variance, or ANOVA
instead of calculating a t statistic, ANOVA calculates an F-ratio, which compares variation within groups (error bars) to the differences in mean values among groups
an F-ratio has 2 degrees of freedom:
- 1st = (# of groups) – 1
- 2nd = (total sample size) – (# of groups)
F(2,129) = 7.12, P < 0.001 (n = 44 in each of the 3 groups)
- the two df are normally subscripted under the F ratio; they are shown here in parentheses
- this is the overall P for the 3-way comparison of means
If your overall P-value is significant, you can then do a post-hoc (“after the fact”) test to work out which specific means are different from each other
Bonferroni - less conservative than Scheffe; more likely to flag a difference that isn’t real
Scheffe - very conservative; if it sees a difference, it is very likely real
Dunnett - compares each mean to a control; most powerful for that kind of comparison
[Figure: the 3 groups (n = 44 each), overall F(2,129) = 7.12, P < 0.001; pairwise comparisons marked on the graph: Scheffe P = 0.002, Scheffe P = 0.050, P = 0.474]
2-way ANOVA tests for interactions among 2 or more factors
[Figure: bar graph, y-axis 0 to 100; bars: control, aspirin only, tylenol only, aspirin + tylenol]
factors: aspirin (yes/no), tylenol (yes/no)
when the response to two treatments combined is not what you would expect from adding their individual effects, this is an interaction
interactions are usually the most biologically interesting result!
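A small sketch of what "not what you would expect from adding their individual effects" means in numbers. The function name `interaction_effect` and the mean responses are hypothetical; a real 2-way ANOVA would also test whether the interaction is significant, which this does not do.

```python
def interaction_effect(control, a_only, b_only, both):
    """Observed combined response minus the response expected if effects simply add."""
    expected_if_additive = control + (a_only - control) + (b_only - control)
    return both - expected_if_additive


# Hypothetical mean responses, for illustration only.
# 0 means purely additive; positive means the combination does more than expected.
print(interaction_effect(control=10, a_only=40, b_only=35, both=65))
print(interaction_effect(control=10, a_only=40, b_only=35, both=90))
```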
NOT appropriate to do a 1-way ANOVA on these data, because that requires that each treatment be independent of the other treatments
- since 2 treatments involve aspirin, they are not independent
- also, you miss the interaction, which is the important result
Correlation analysis is appropriate when you think 2 variables are related, but not in a cause-and-effect way
- arm length and leg length are related, but longer arms do not cause you to have longer legs; both are due to your height
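As an illustration of a correlation analysis, here is a minimal sketch of the Pearson correlation coefficient r, computed from its textbook definition. The function name `pearson_r` and the arm and leg lengths are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical arm and leg lengths (cm), for illustration only.
arm = [60, 64, 66, 70, 73]
leg = [80, 85, 86, 92, 95]
print(f"r = {pearson_r(arm, leg):.3f}")
```

Swapping the two lists gives the same r, which fits the idea that neither variable is treated as the cause of the other.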
Regression analysis is when you believe a change in one predictor variable (what you manipulate) causes a change in the response variable (the thing you measure)
- adding more water makes plants grow taller
Output of a regression analysis includes:
1) ANOVA table
- tells you if your model explains a significant amount of the variation in the response
2) equation of the best-fit line
- summarizes the relationship between predictor and response
3) table testing the effect of each predictor
- in multiple regression, you can test many possible predictors that might matter, and see which significantly affect the response variable
4) r²
- r² is the % of variation in the response that is explained by the predictor
More scatter = lower r²
You can have a low r², but still have a significant slope
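A minimal sketch of two of the outputs listed above, the best-fit line and r², for a simple regression with one predictor. It uses ordinary least squares from the textbook formulas; the function name `simple_regression` and the data are hypothetical, and the significance tests (ANOVA table, per-predictor table) are not reproduced here.

```python
def simple_regression(xs, ys):
    """Return (slope, intercept, r_squared) for the least-squares best-fit line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # r-squared: share of the total variation in y that the line accounts for.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - ss_resid / ss_total
    return slope, intercept, r_squared


# Hypothetical data: water added (ml) versus plant height (cm), for illustration only.
water = [10, 20, 30, 40, 50]
height = [12, 15, 21, 22, 30]
slope, intercept, r2 = simple_regression(water, height)
print(f"height = {intercept:.2f} + {slope:.3f} * water, r² = {r2:.2f}")
```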
ANOVA and regression are both types of linear models, which test the same basic equation:
response variable = model + error
- response variable: the thing you measure
- model: predictors, and coefficients that tell you how they affect the response
- error: variance in the response that is not explained by the model
this is what a simple linear regression model looks like
Does predictor X affect response?
test is to set the coefficient = 0, which drops the predictor out of the model, and see if the reduced model (now just the intercept plus the residual error term) is really any worse
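In standard textbook notation, the comparison described above can be written as a full model versus a reduced model. The symbols are assumptions, not taken from the slides: y is the response, x the predictor, β0 the intercept, β1 the coefficient being tested, and ε the residual error.

```latex
\underbrace{y_i = \beta_0 + \beta_1 x_i + \varepsilon_i}_{\text{full model}}
\qquad \text{versus} \qquad
\underbrace{y_i = \beta_0 + \varepsilon_i}_{\text{reduced model, } \beta_1 = 0}
```

If the full model leaves much less unexplained variation than the reduced one, the predictor is judged to affect the response.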
All of the tests we have discussed are parametric tests
- they use the numerical values of your actual data
- however, they also have built-in assumptions that your data, and the residual errors, fit a normal distribution (bell curve)
Parametric versus non-parametric tests
[Figure: histogram of raw data; x-axis 75 to 300, y-axis count 0 to 16]
If your data do not fit a normal distribution, you can transform the raw numbers to make them more “normal” – put the data through a mathematical function
[Figure: two histograms of % scores; Column 1, x-axis 0.3 to 1, y-axis count 0 to 16; Column 2, x-axis 0.5 to 1.5, y-axis count 0 to 20]
arcsine(square-root(%)) is a standard transformation for %’s which stop at 100%, and are often not normally distributed
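A minimal sketch of the arcsine square-root transformation mentioned above. It assumes the scores are given as percentages from 0 to 100 and returns the result in radians; the function name `arcsine_sqrt` is hypothetical.

```python
import math


def arcsine_sqrt(percent):
    """Arcsine-square-root transform of a percentage (0 to 100); result in radians."""
    proportion = percent / 100      # convert % to a proportion between 0 and 1
    return math.asin(math.sqrt(proportion))


# Hypothetical % scores, for illustration only.
for score in [30, 50, 75, 90, 100]:
    print(score, round(arcsine_sqrt(score), 3))
```

The transform stretches out values near 100%, so scores bunched up against the ceiling get spread over a wider range.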
Alternatively, there are non-parametric versions of most common statistical tests that use ranked values instead of the raw data
- are typically more conservative: if they see a difference, it is real
- make no assumptions about the shape of the distribution
raw    ranked (high to low)
3      5
2      6
6      3
4      4
9      2
1      7
12     1
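A short sketch of how raw values are turned into ranks, using the same seven numbers as the table above. The function name `rank_high_to_low` is hypothetical, and the sketch assumes there are no tied values (statistical software usually gives ties an averaged rank, which this does not do).

```python
def rank_high_to_low(values):
    """Rank values so that the largest gets rank 1 (assumes no ties)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


raw = [3, 2, 6, 4, 9, 1, 12]
print(rank_high_to_low(raw))  # [5, 6, 3, 4, 2, 7, 1]
```

Note that the outlying value 12 just becomes rank 1; how far it sits from the other values no longer matters, which is why ranked tests do not depend on the shape of the distribution.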