statistical power
DESCRIPTION
Statistical Power. H o : Treatments A and B the same H A : Treatments A and B different. Critical value at alpha=0.05. Points on this side, only 5% chance from distribution A. Frequency. A. Area = 5%. A could be control treatment B could be manipulated treatment. - PowerPoint PPT PresentationTRANSCRIPT
Statistical Power
Ho : Treatments A and B the same
HA: Treatments A and B different
Points on this side, only 5% chance from distribution A.
Area = 5%
Critical value at alpha=0.05F
req
uen
cy
A
A could be control treatmentB could be manipulated treatment
AB
If null hypothesis true, A and B are identical
Probability that any value of B will be not significantly different from A = 95%
Probability that any value of B is significantly different than A = 5%
AB
If null hypothesis true, A and B are identical
Probability that any value of B will be not significantly different from A = 95%
Probability that any value of B is significantly different than A = 5% = likelihood of type 1 error
Decide NOT significantly different (do not reject Ho)
Decide significantly different (reject Ho)
Ho true (same) Type 1 error
Ho false (different)
Type 2 error
What you say:R
eali
ty
AB
If null hypothesis false, two distributions are different
Probability that any value of B will be not significantly different from A = beta = likelihood of type 2 error
Probability that any value of B is significantly different than A = 1- beta = power
AB
Effect size
Effect size = difference in meansSD
AB
1. Power increases as effect size increases
Beta = likelihood of type 2 error
Power
Effect size
AB
2. Power increases as alpha increases
Beta = likelihood of type 2 error
Power
AB
3. Power increases as sample size increases
Low n
AB
3. Power increases as sample size increases
High n
Power
Effect size Alpha
Sample size
Types of power analysis:
A priori:
Useful for setting up a large experiment with some pilot data
Posteriori:
Useful for deciding how powerful your conclusion is (definitely? Or possibly). In manuscript writing, peer reviews, etc.
Example : Fox hunting in the UK(posteriori)
• Hunt banned (one year only) in 2001 because of foot-and-mouth disease.
• Can examine whether the fox population increased in areas where it used to be hunted (in this year).
• Baker et al. found no effect (p=0.474, alpha=0.05, n=157), but Aebischer et al. raised questions about power.
Baker et al. 2002. Nature 419: 34Aebischer et al. 2003. Nature 423: 400
157 plots where the fox population monitored. Alpha = 0.05
Effect size if hunting affected fox populations: 13%
157 plots where the fox population monitored. Alpha = 0.05
Effect size if hunting affected fox populations: 13%
Power = 0.95 !
Class exercise:
Means and SD of parasite load (p>0.05):
Daphnia magna 5.9 ± 2 (n = 3)
Daphnia pulex 4.9 ± 2 (n = 3)
(1) Did the researcher have “enough” power (>0.80)?
(2) Suggest a better sample size.
(3) Why is n=3 rarely adequate as a sample size?
How many samples?
PCBs in salmon from Burrard inlet and Alaska
In an initial survey (3 individuals each), we find the following information (mean, standard deviation)
Burrard – 120.5 ± 75.9 ppbAlaska – 75.2 ± 71.9 ppb
The two error bars overlap, but that’s still a big difference and we only took 3 samples
The difference could be “hidden” the sizes of the errors
This would be reduced by increased samples, but how many should we take?
How many samples?
Our difference between (q) is ~40, therefore if our confidence limits (SE) were <20ppb, we should have adifference between populations,
Burrard – 120.5 ± 75.9 ppbAlaska – 75.2 ± 71.9 ppb
How many samples do we therefore need??
€
q = 2ts
n
€
40 = 2∗1.9675.9
n
€
q = 2ts
n
€
40 = 2∗1.9675.9
n
Re-arrange the equation…
€
n =2*1.96* 75.9
40
⎛
⎝ ⎜
⎞
⎠ ⎟2
= 55.4
So we should take 56 samples to be reasonably sure of a significant difference
Don’t get silly..