Biol 500: basic statistics

Goals:

1) understand basics of experimental design - controls - replication

2) understand how to report quantitative data

3) be able to interpret the reporting of statistical results in a journal article

Replication: allows you to determine if the difference between treatments or groups of samples is greater than the variation within a treatment or group

Is there a difference in how effective the 3 drugs are in curing headaches?

[Figure: three example graphs, labeled "?", "no", and "yes"]

Generally, overlapping error bars indicate no significant difference between the mean values that are being graphed

Bars don’t overlap = probably different

Controls: From these data, could you tell if the least effective drug has any effect at all?

Key to interpreting your results: Include a control that is the same in all respects except the one variable you will manipulate experimentally

Controls: Procedural controls allow you to diagnose problems in your experiment, samples or technique

When we amplify DNA from unknown samples by PCR, we include a positive control (a DNA sample that always works) and a negative control (all the PCR reagents, but no DNA)

This allows us to interpret the results of our gels, and to troubleshoot any problems

Do squirrels bury acorns?

My experiment: I remove all the squirrels from 3 clumps of trees in one park, but leave the squirrels in 3 ‘control’ tree clumps in another park, on the other side of town

[Figure: Park A and Park B]

Pseudoreplication

In this example the unit of replication is the park, not the clump of trees – I have no actual replication

Any difference I measure could be due to differences between the two parks, and not due to my squirrel-removal treatment

Avoiding pseudoreplication

Correct design would be to have squirrel-removal and control areas in each of several replicate parks

This lets you assess differences between treatment and control areas, while simultaneously measuring variation among parks

[Figure: Park A, Park B, Park C]

Did these two classes do differently on my 418 midterm?

[Figure: score distributions, labeled n = 10, n = 20, and n = 44]

mean = 133.9 ± 29.7 SD (n = 44), range: 59 - 183

mean = 126.3 ± 38.8 SD (n = 44), range: 42 - 188

The statistical approach is to ask if the means of these two populations are significantly different

the standard deviation (SD) is what you should report if you are actually interested in the variation, i.e., for purposes of deciding where to draw the line between grades

mean = 133.9 ± 29.7 SD or ± 4.3 SE (n = 44), range: 59 - 183

mean = 126.3 ± 38.8 SD or ± 5.8 SE (n = 44), range: 42 - 188

the standard error (SE, or SEM) is SE = SD / √n, where n is the sample size

the standard error is what you report when you want to compare the means of different treatments or samples
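As a minimal sketch of the SD and SE calculations, the following Python uses only the standard library. The scores are made-up values for illustration, not the class data, and the function name is an assumption.

```python
import math
import statistics

def sd_and_se(values):
    """Return (sample SD, standard error of the mean) for a list of numbers."""
    sd = statistics.stdev(values)       # sample SD (divides by n - 1)
    se = sd / math.sqrt(len(values))    # SE = SD / sqrt(n)
    return sd, se

# Hypothetical scores, for illustration only
scores = [120, 135, 150, 110, 145, 128, 160, 132]
sd, se = sd_and_se(scores)
print(f"mean = {statistics.mean(scores):.1f} ± {sd:.1f} SD or ± {se:.1f} SE (n = {len(scores)})")
```

The SE shrinks as the sample size grows, while the SD does not, which is why the two are reported for different purposes.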

unpaired two-tailed t-test: df = 86, t = 1.20, P = 0.23

a t-test compares 2 populations by calculating a test statistic called t and determining the probability (P, or p) of getting that value of t, with that sample size, by chance alone


a paired test would compare, for example, each student's % score on the midterm versus the final; most tests are unpaired
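To illustrate the difference between the two kinds of test, here is a rough sketch that calculates t both ways from the same numbers. The scores are hypothetical, the function names are assumptions, and the unpaired version assumes the classic pooled-variance (Student's) form of the test.

```python
import math
import statistics

def unpaired_t(a, b):
    """Student's t for two independent samples (pooled variance); returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff, df

def paired_t(before, after):
    """Paired t: is the mean within-subject difference different from 0? Returns (t, df)."""
    diffs = [y - x for x, y in zip(before, after)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1

# Hypothetical % scores for the same 6 students
midterm = [62, 70, 55, 81, 74, 68]
final = [66, 75, 57, 85, 80, 71]

t_unpaired, df_unpaired = unpaired_t(final, midterm)
t_paired, df_paired = paired_t(midterm, final)
print(f"unpaired: t = {t_unpaired:.2f}, df = {df_unpaired}")
print(f"paired:   t = {t_paired:.2f}, df = {df_paired}")
```

With these made-up numbers the paired t is much larger, because pairing removes the student-to-student variation and looks only at each student's change.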


one-tailed if you have some reason to think, in advance, that the 2009 scores will only be higher (or lower) than 2007

- cuts your P-value in half, but you need a reason to do this!


the power of your test depends on your degrees of freedom (df), which is (total sample size) – (number of groups)

- in this case: (44 + 44 students) – (2 groups) = 88 – 2 = 86
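The reported P can be checked from t and df alone. The sketch below, which is only one way to do it, integrates the t distribution numerically with Simpson's rule so that it needs nothing beyond the standard library; the function names and the step count are assumptions.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log(1 + x * x / df))

def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|), using Simpson's rule on the density from 0 to |t| (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3           # probability of falling between 0 and |t|
    return 1 - 2 * area

df = (44 + 44) - 2                 # total sample size minus number of groups
p_two = two_tailed_p(1.20, df)     # t = 1.20 and df = 86, as reported above
p_one = p_two / 2                  # one-tailed, only if the direction was predicted in advance
print(f"df = {df}, two-tailed P = {p_two:.2f}, one-tailed P = {p_one:.2f}")
```

In practice a statistics package does this step for you; the point here is only that P follows directly from t and the degrees of freedom.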


P values below 0.05 are accepted as significant, meaning there is less than a 5% chance of getting a test statistic this large if the groups are not really any different


3 or more samples can be compared using a one-way Analysis of Variance, or ANOVA

instead of calculating a t statistic, ANOVA calculates an F-ratio, which compares variation within groups (error bars) to the differences in mean values among groups

ANOVA has 2 degrees of freedom: 1st = (# of groups) – 1; 2nd = (total sample size) – (# of groups)
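As a sketch of how the F-ratio and its two degrees of freedom come out of the data, the following computes a one-way ANOVA by hand. The three small groups are hypothetical and the function name is an assumption.

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    df_between = len(groups) - 1                  # (# of groups) - 1
    df_within = len(all_values) - len(groups)     # (total sample size) - (# of groups)
    # differences in mean values among groups
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # variation within groups
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical data: 3 groups of 4 measurements each
groups = [[4, 5, 6, 5], [7, 8, 9, 8], [5, 6, 7, 6]]
f_ratio, df1, df2 = one_way_anova(groups)
print(f"F({df1},{df2}) = {f_ratio:.2f}")

# For 3 groups of 44: df = (3 - 1, 3 * 44 - 3) = (2, 129)
```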

F2,129 = 7.12, P < 0.001 (n = 44 in each of the 3 groups)

- the degrees of freedom (2 and 129) are subscripted under the F-ratio

- P is the overall P for the 3-way comparison of means

If your overall P-value is significant, you can then do a post-hoc (“after the fact”) test to work out which specific means are different from each other

Bonferroni - less conservative than Scheffe when only a few comparisons are made; more likely to report a difference that isn’t real

Scheffe - very conservative; if it sees a difference, the difference is very likely real

Dunnett - compares each mean to a control; most powerful for that comparison
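Of these, the Bonferroni adjustment is the simplest to show: each pairwise P-value is multiplied by the number of comparisons made. The sketch below uses hypothetical pairwise P-values and group labels; Scheffe and Dunnett need a statistics package and are not shown.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each P by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical unadjusted P-values for the 3 pairwise comparisons among groups A, B, C
raw_p = {"A vs B": 0.004, "A vs C": 0.030, "B vs C": 0.400}
adjusted = dict(zip(raw_p, bonferroni(list(raw_p.values()))))
significant = [pair for pair, p in adjusted.items() if p < 0.05]
print(adjusted)
print("significant after adjustment:", significant)
```

Here the A vs C comparison would have passed the 0.05 cutoff on its own, but it does not survive the adjustment for making 3 comparisons.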

[Figure: the 3 group means (n = 44 each), with F2,129 = 7.12, P < 0.001, and the pairwise Scheffe results: P = 0.002, P = 0.050, P = 0.474]

2-way ANOVA tests for interactions among 2 or more factors

[Figure: bar chart, y-axis 0 to 100, for 4 treatments: control, aspirin only, tylenol only, aspirin + tylenol]

factors: aspirin (yes/no) and tylenol (yes/no)



when the response to two treatments combined is not what you would expect from adding their individual effects, this is an interaction

interactions are usually the most biologically interesting result!
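A quick way to see an interaction is to compare the combined response with the sum of the individual effects. The sketch below does this for the four treatment means; the numbers are hypothetical and the function name is an assumption. It gives only the size of the departure from additivity, and the 2-way ANOVA is what tests whether that departure is significant.

```python
def interaction_effect(control, a_only, b_only, both):
    """Observed combined response minus what adding the two individual effects predicts."""
    effect_a = a_only - control
    effect_b = b_only - control
    expected_both = control + effect_a + effect_b
    return both - expected_both

# Hypothetical mean % of headaches cured in each treatment
means = {"control": 10, "aspirin only": 30, "tylenol only": 40, "aspirin + tylenol": 90}
extra = interaction_effect(means["control"], means["aspirin only"],
                           means["tylenol only"], means["aspirin + tylenol"])
print(f"departure from additive expectation: {extra} percentage points")
```

A result of 0 would mean the two drugs simply add; a positive or negative value means they work better or worse together than expected.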



NOT appropriate to do a 1-way ANOVA on these data, because that requires that each treatment be independent of the other treatments

- since 2 treatments involve aspirin, they are not independent

- also, you miss the interaction, which is the important result

Correlation analysis is appropriate when you think 2 variables are related, but not in a cause-and-effect way

- arm length and leg length are related, but longer arms do not cause you to have longer legs; both are due to your height
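As an illustration, the correlation coefficient (Pearson's r) can be computed directly from paired measurements. The arm and leg lengths below are hypothetical, and the function name is an assumption.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical measurements (cm) from 5 people
arm = [60, 64, 66, 70, 75]
leg = [80, 86, 85, 92, 98]
r = pearson_r(arm, leg)
print(f"r = {r:.2f}")
```

Swapping the two variables gives the same r, which fits the idea that neither one is treated as the cause of the other.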

Regression analysis is appropriate when you believe a change in one predictor variable (what you manipulate) causes a change in the response variable (the thing you measure)

- adding more water makes plants grow taller

Output of a regression analysis includes:

1) ANOVA table

tells you if your model explains a significant amount of the variation in the response

2) equation of the best-fit line

summarizes the relationship between predictor and response

3) table testing the effect of each predictor

in multiple regression, you can test many possible predictors that might matter, and see which significantly affect the response variable

4) r²

r² is the % of variation in the response that is due to a change in the predictor

More scatter = lower r²

You can have a low r², but still have a significant slope
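The best-fit line and r² can be computed by hand for a single predictor. The following is a minimal least-squares sketch, assuming a straight-line model with an intercept; the water and plant-height values are hypothetical and the function name is an assumption.

```python
def linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_total = sum((b - my) ** 2 for b in y)      # all variation in the response
    ss_resid = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    r2 = 1 - ss_resid / ss_total                  # fraction explained by the line
    return slope, intercept, r2

# Hypothetical data: water added per day vs. plant height (cm)
water = [1, 2, 3, 4, 5]
height = [2.1, 3.9, 6.2, 7.8, 10.0]
slope, intercept, r2 = linear_regression(water, height)
print(f"height = {intercept:.2f} + {slope:.2f} * water, r2 = {r2:.3f}")
```

Adding scatter to the height values would increase the residual sum of squares and lower r², even if the slope stayed the same.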

ANOVA and regression are both types of linear models, which test the same basic equation:

response variable = model + error

- response variable: the thing you measure

- model: predictors, and coefficients that tell you how they affect the response

- error: variance in the response that is not explained by the model

this is what a simple linear regression model looks like

Does predictor X affect response?

test is to set the coefficient = 0, which drops out the predictor, and see if the model (now just the residual error term) is really any worse
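This comparison of a model with and without the predictor can be sketched as an F-ratio. The version below assumes a single predictor and a model that keeps an intercept when the coefficient is set to 0, which may differ from the form shown on the slide; the data and function name are hypothetical.

```python
def drop_predictor_f(x, y):
    """F-ratio comparing y = a + b*x with the reduced model y = a (coefficient b set to 0).

    Returns (F, df_numerator, df_denominator).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    ss_reduced = sum((b - my) ** 2 for b in y)    # error left when the predictor is dropped
    ss_full = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    df_full = n - 2                               # n minus the 2 fitted coefficients
    f_ratio = ((ss_reduced - ss_full) / 1) / (ss_full / df_full)
    return f_ratio, 1, df_full

# Hypothetical data: water added per day vs. plant height (cm)
water = [1, 2, 3, 4, 5]
height = [2.1, 3.9, 6.2, 7.8, 10.0]
f_ratio, df1, df2 = drop_predictor_f(water, height)
print(f"F({df1},{df2}) = {f_ratio:.1f}")
```

A large F means the model gets much worse when the predictor is dropped, so the predictor affects the response.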

All of the tests we have discussed are parametric tests

- they use the numerical values of your actual data

- however, they also have built-in assumptions that your data, and the residual errors, fit a normal distribution (bell curve)

Parametric versus non-parametric tests

[Figure: histogram of Column 1; x-axis: 75 to 300, y-axis: count (0 to 16)]

If your data do not fit a normal distribution, you can transform the raw numbers to make them more “normal” – put the data through a mathematical function

[Figure: histograms of % scores: Column 1 (x-axis: 0.3 to 1) and Column 2 (x-axis: 0.5 to 1.5); y-axis: count]

arcsine(square-root(%)) is a standard transformation for percentages, which cannot go above 100% and are often not normally distributed
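A small sketch of this transformation, assuming the scores are given as percentages from 0 to 100 and are converted to proportions first; the function name and example scores are hypothetical.

```python
import math

def arcsine_sqrt(percent):
    """Arcsine square-root transform of a percentage (0 to 100); result is in radians."""
    proportion = percent / 100
    if not 0 <= proportion <= 1:
        raise ValueError("percent must be between 0 and 100")
    return math.asin(math.sqrt(proportion))

# Hypothetical % scores
scores = [30, 50, 75, 90, 100]
transformed = [arcsine_sqrt(s) for s in scores]
print([round(t, 2) for t in transformed])
```

The transformed values run from 0 (for 0%) to about 1.57 (for 100%), and the statistical test is then run on the transformed numbers instead of the raw percentages.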

Alternatively, there are non-parametric versions of most common statistical tests that use ranked values instead of the raw data

- are typically more conservative: if they see a difference, it is real

- make no assumptions about the shape of the distribution

raw    ranked (high to low)
3      5
2      6
6      3
4      4
9      2
1      7
12     1
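The ranking step itself is simple. This sketch ranks the raw values from high to low, so the largest value gets rank 1; the function name is an assumption, and it assumes there are no tied values.

```python
def rank_high_to_low(values):
    """Rank values so that the largest gets rank 1 (assumes no ties)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

raw = [3, 2, 6, 4, 9, 1, 12]
ranks = rank_high_to_low(raw)
print(ranks)   # [5, 6, 3, 4, 2, 7, 1]
```

Notice that the value 12 gets the same rank whether it is 12 or 1200; this is why rank-based tests do not depend on the shape of the distribution. With tied values, rank-based tests normally give each tied value the average of the ranks involved, which this sketch does not do.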
