Type I and II errors Ana Jerončić

Post on 30-Dec-2015

Page 1: Type  I  and  II  errors

Type I and II errors

Ana Jerončić

Page 2: Type  I  and  II  errors

What is a p value?

P value is short for probability value.

P = 0.07 = 7%: there is a 7% probability that we will encounter such or more extreme differences by chance alone (that is, if no real effect exists).

OR: if no real effect exists and we repeat the experiment 100 times, such a difference (or a more extreme one) would be expected in about 7 experiments.

Page 3: Type  I  and  II  errors

What is a p value?

P value is short for probability value.

P = 0.99 = 99%: there is a 99% probability that we will encounter such or even more extreme differences by chance alone (that is, if no real effect exists).

OR: if no real effect exists and we repeat the experiment 100 times, such a difference (or a more extreme one) would be expected in about 99 experiments.
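The "repeat the experiment many times" reading of a p value can be sketched as a small simulation. This is only an illustration, not part of the original slides: the function name, the normally distributed data, the group size, and the observed difference of 0.5 are all assumptions chosen for the example.

```python
import random
import statistics


def simulated_p_value(observed_diff, n_per_group, sd=1.0,
                      n_experiments=10_000, seed=0):
    """Estimate a two-sided p value by repeating a no-effect experiment.

    Hypothetical helper: both groups are drawn from the same normal
    distribution, so any difference between their means is due to chance.
    """
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_experiments):
        group_a = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        diff = statistics.fmean(group_a) - statistics.fmean(group_b)
        # Count experiments whose difference is at least as extreme
        # (in either direction) as the one actually observed.
        if abs(diff) >= abs(observed_diff):
            extreme += 1
    return extreme / n_experiments


if __name__ == "__main__":
    p = simulated_p_value(observed_diff=0.5, n_per_group=20)
    print(f"simulated p = {p:.3f}")
```

The returned fraction is the share of no-effect experiments that show a difference at least as large as the observed one, which is the quantity the slides describe in words.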

Page 4: Type  I  and  II  errors

What is a significance level α?

Page 5: Type  I  and  II  errors

Interpretation of P-value (α = 0.05)

P < 0.05 (5%)

Significant difference between the treatments. Null hypothesis is rejected, alternative is accepted.

P ≥ 0.05

No significant difference between the treatments (the observed difference may have happened by chance). Null hypothesis is not rejected.

Page 6: Type  I  and  II  errors

What is a significance level α?

The threshold of the P-value that determines when to reject a null hypothesis.

It refers to the chance that you are willing to take of being wrong, i.e., of concluding that there is a substantial difference when there is none.

Page 7: Type  I  and  II  errors

What is a significance level α?

The most common significance level: α = 0.05 = 5%

We accept a 5% risk of declaring a difference when there is none.

Page 8: Type  I  and  II  errors

α = 0.05

Out of 40 decisions, we could expect that 2 are wrong (0.05 × 40 = 2), if the null hypothesis is true in every case.
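The arithmetic behind "2 out of 40" can be checked with a short sketch. It assumes that the null hypothesis is true for every decision and that p values are then uniformly distributed between 0 and 1 (which holds for continuous test statistics); the function names are hypothetical.

```python
import random


def expected_type_i_errors(alpha, n_decisions):
    """Expected number of false rejections when every null is true."""
    return alpha * n_decisions


def simulate_type_i_errors(alpha, n_decisions, seed=0):
    """Count false rejections in a simulated series of true-null tests.

    Assumption: under a true null hypothesis the p value is uniform on
    [0, 1], so each test rejects with probability alpha.
    """
    rng = random.Random(seed)
    return sum(1 for _ in range(n_decisions) if rng.random() < alpha)


if __name__ == "__main__":
    print(expected_type_i_errors(0.05, 40))
    print(simulate_type_i_errors(0.05, 40))
```

Any single run of 40 decisions may produce more or fewer than 2 errors; the value 2 is the long-run average.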

Page 9: Type  I  and  II  errors

What is α (Type I error)?

α is also called Type I error: the probability of erroneously rejecting the null hypothesis.

Consequence of a type I error: putting a useless medicine onto the market!

Page 10: Type  I  and  II  errors

Watch out for…

p

Page 11: Type  I  and  II  errors

The sample size calculation was based on the primary outcome, BMI or BMI z-score, which was assumed to have a SD of 1.5, or 1.0 respectively. To have 80% power to detect a difference in mean BMI of 0.38, or mean BMI z-score of 0.25 units between the groups at age 2 at the two sided 5% significance level, we needed a sample size of 252 per group

Example from the literature: “Effectiveness of a home-based early intervention on children’s BMI at age two years: randomised controlled trial.” BMJ 2012;344:e3732
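The figures quoted in this example can be roughly reproduced with the usual normal-approximation formula for comparing two means with equal group sizes. This is a sketch under that assumption; the quoted passage does not say which method the trial authors actually used, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means.

    Normal approximation: n = 2 * ((z_(1-alpha/2) + z_(power)) * sd / delta)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    return math.ceil(n)  # round up to whole participants


if __name__ == "__main__":
    # BMI z-score: SD 1.0, difference 0.25
    print(sample_size_per_group(delta=0.25, sd=1.0))  # 252
    # BMI: SD 1.5, difference 0.38
    print(sample_size_per_group(delta=0.38, sd=1.5))  # 245
```

Under this approximation the BMI z-score assumptions give 252 per group, matching the quoted sample size, while the BMI assumptions give a slightly smaller number.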

Page 13: Type  I  and  II  errors

…. The higher-degree RR was deemed significantly better if the P-value for the higher-degree model was 0.01.

…..

Example from the literature: Quantitative Trait Locus Analysis of Longitudinal Quantitative Trait Data in Complex Pedigrees. Macgregor, S, Knott, S et al. Genetics 171, 1365-1376, 2005

Page 15: Type  I  and  II  errors

Hippocampal gray matter volume change was assessed statistically using a two-tailed t contrast with a significance level set to 0.05 (corrected for multiple comparisons within the ROI). Uncorrected exploratory full-brain statistics were also performed with two-tailed t contrasts at a significance level set to 0.001.

Example: The Brain-Derived Neurotrophic Factor val66met Polymorphism and Variation in Human Cortical Morphology. Lukas Pezawas, Beth A. Verchinski, et al.

Page 17: Type  I  and  II  errors

What is β (Type II error)?

The probability of erroneously failing to reject the null hypothesis.

The most common β = 0.2

Consequence of a type II error: keeping a good medicine away from patients!

Page 18: Type  I  and  II  errors

What is Power?

Power quantifies the ability of the study to find true differences.

Power = 1 − β = P(accept H1 given H1 is true): the probability of correctly identifying H1 (correctly identifying a better medicine).

If β = 0.2, power = 0.8 = 80%

Page 19: Type  I  and  II  errors

Example

Studies of drug X have shown that its use induces very serious side effects. Therefore drug X was withdrawn from the market.

A new alternative, drug Y, was examined, and a reduction in harmful effects compared to drug X was observed.

What significance level will you use to evaluate the significance of the reduction in harmful effects of drug Y, compared to drug X?

Page 22: Type  I  and  II  errors

Example

The effect of alcohol on drivers’ reaction time was investigated in a simple random sample. Reaction times observed before and after alcohol intake showed an increase in average reaction time after alcohol intake.

What significance level will you use to evaluate the significance of the increase in reaction time?

Page 25: Type  I  and  II  errors

The choice of α and β depends on:

1. the medical and practical consequences of the two kinds of errors

2. the desired impact of the results

Page 26: Type  I  and  II  errors

The choice of α and β

α < β (the most common approach: α = 0.05 and β = 0.2), i.e., if the control treatment is already widely used and is known to be reasonably safe and effective, whereas the test treatment is new, costly, or produces serious side effects.

α > β, i.e., if there is no established control treatment and the test treatment is relatively inexpensive, easy to apply, and not known to have any serious side effects.

Page 27: Type  I  and  II  errors

The choice of α and β

Choices other than α = 0.05 and β = 0.2:

α = 0.10 and β = 0.2 for preliminary trials that are likely to be replicated.

α = 0.01 and β = 0.05 for trials that are unlikely to be replicated.

Page 29: Type  I  and  II  errors

Power calculation

Page 30: Type  I  and  II  errors

What is the power of the study?

Power quantifies the ability of the study to find true differences.

Power = 1 − β = P(accept H1 given H1 is true): the probability of correctly identifying H1 (correctly identifying a better medicine).

If β = 0.2, power = 0.8 = 80%

Page 31: Type  I  and  II  errors

What is delta (δ)?

δ is the minimum difference between groups that is judged to be clinically important:

1. Minimal effect which has clinical relevance in the management of patients

or

2. The anticipated effect of the new treatment

Page 32: Type  I  and  II  errors

Power Calculation (assuming we compare two medicines)

Power depends on 4 elements:

The real difference between the two medicines, δ: big δ → big power

The variation among individuals, σ: small σ → big power

The sample size, n: large n → big power

Type I error, α: large α → big power
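How the four elements drive power can be sketched with a normal approximation for a two-sided comparison of two group means. This is an illustrative assumption rather than the method behind the slides; the function and parameter names (difference, standard deviation, sample size per group, type I error) are hypothetical.

```python
from statistics import NormalDist


def power_two_groups(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between two group means.
    se = sd * (2 / n_per_group) ** 0.5
    shift = abs(delta) / se
    # Probability that the test statistic falls in either rejection region.
    return z.cdf(shift - z_alpha) + z.cdf(-shift - z_alpha)


if __name__ == "__main__":
    base = power_two_groups(delta=0.25, sd=1.0, n_per_group=100)
    print(f"baseline power: {base:.2f}")
    print(f"bigger difference: {power_two_groups(0.50, 1.0, 100):.2f}")
    print(f"smaller variation: {power_two_groups(0.25, 0.5, 100):.2f}")
    print(f"larger sample: {power_two_groups(0.25, 1.0, 400):.2f}")
    print(f"larger alpha: {power_two_groups(0.25, 1.0, 100, alpha=0.10):.2f}")
```

Each of the four changes raises the computed power relative to the baseline, in line with the directions listed on the slide.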

Page 33: Type  I  and  II  errors

Sample size

Page 34: Type  I  and  II  errors

Sample size and α, β, and δ

α ↓ → N ↑

The power 1 − β ↑ → N ↑

The δ ↓ → N ↑
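One common way to make these relationships concrete is the normal-approximation formula for the number of participants per group when two means are compared with equal group sizes. It is offered here as a standard textbook approximation, not as the formula used in the slides; σ denotes the standard deviation of the outcome and z_q the q-quantile of the standard normal distribution.

```latex
N_{\text{per group}} \;=\; \frac{2\,\bigl(z_{1-\alpha/2} + z_{1-\beta}\bigr)^{2}\,\sigma^{2}}{\delta^{2}}
```

A smaller α or a smaller β (higher power) increases the z terms in the numerator, and a smaller δ shrinks the denominator, so each of these changes increases N.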

Page 35: Type  I  and  II  errors

Sample Size

“How large a sample do I need?”
- Very commonly asked
- Important question
- Answer not so simple

Statistical power calculations
- Use statistical software or graphical method
- Depends on data type

Page 36: Type  I  and  II  errors

Braga L, Byrne R, Lorenzo A et al. Methodological quality assessment of RCTs in hypospadias literature. 23rd Annual ESPU Congress - Zurich, Switzerland - 2012

Analyses showed that publication after 2006 (p<0.01), RCT sample size >50 (p=0.03), significance level α=0.01 (p<0.01) and blinding of outcome assessor (p<0.01) were significantly associated with better quality of RCTs.

Interpret the results

Hypospadias is a birth defect of the urethra in males

Page 37: Type  I  and  II  errors

Weir R. Randomised controlled trial to meta-analysis ratio: a reply from a group producing systematic reviews. 2007. The New Zealand Medical Journal 120, 1-3

Antman et al showed that recommendations for routine use of thrombolytic therapy first appeared in 1987, 14 years after a statistically significant reduction in mortality was apparent on a subsequent cumulative meta-analysis of all relevant RCTs.

At the first time a significant reduction in mortality was apparent in the cumulative meta-analysis of IV streptokinase therapy (1973, p=0.01), 2432 patients had been randomised in eight small trials. The results of a further 25 studies (34,542 additional patients) published before routine recommendation of thrombolytic therapy, reduced the significance level to p=0.001 in 1979 and p=0.0001 in 1986.

Interpret the results

Page 38: Type  I  and  II  errors

Based on the results presented in the abstract, write the conclusion section.

Page 39: Type  I  and  II  errors

CONCLUSION: Overall advice to use steam inhalation, or ibuprofen rather than paracetamol, does not help control symptoms in patients with acute respiratory tract infections and must be balanced against the possible progression of symptoms during the next month for a minority of patients. Advice to use ibuprofen might help short term control of symptoms in those with chest infections and in children.

Little P, Moore M, et al. Ibuprofen, paracetamol, and steam for patients with respiratory tract infections in primary care: pragmatic randomised factorial trial. BMJ 2013 Oct 25;347:f6041

Page 40: Type  I  and  II  errors

CONCLUSION: Our findings suggest the presence of heterogeneity in the associations between individual fruit consumption and risk of type 2 diabetes. Greater consumption of specific whole fruits, particularly blueberries, grapes, and apples, is significantly associated with a lower risk of type 2 diabetes, whereas greater consumption of fruit juice is associated with a higher risk.

Muraki I, Imamura F, et al. Fruit consumption and risk of type 2 diabetes: results from three prospective longitudinal cohort studies. BMJ. 2013 Aug 28;347:f5001

Page 41: Type  I  and  II  errors

Conclusions Although limited in quantity, existing randomised trial evidence on exercise interventions suggests that exercise and many drug interventions are often potentially similar in terms of their mortality benefits in the secondary prevention of coronary heart disease, rehabilitation after stroke, treatment of heart failure, and prevention of diabetes.

Huseyin Naci, John P A Ioannidis et al. Comparative effectiveness of exercise and drug interventions on mortality outcomes: metaepidemiological study. BMJ 2013; 347

Page 42: Type  I  and  II  errors

Sanjay Basu et al. Palm oil taxes and cardiovascular disease mortality in India: economic-epidemiologic model, BMJ. 2013 Oct 22;347;

Conclusions Curtailing palm oil intake through taxation may modestly reduce hyperlipidemia and cardiovascular mortality, but with potential distributional consequences differentially benefiting male and urban populations, as well as affecting food security.