p value, power, type 1 and 2 errors

P value, Power & Type I & II errors

Dr. S. A. Rizwan, M.D.

Public Health Specialist

SBCM, Joint Program – Riyadh

Ministry of Health, Kingdom of Saudi Arabia

Learning objectives

Demystifying statistics! – Lecture 5, SBCM, Joint Program – Riyadh

• Define p value

• Describe the meaning and limitations of p value

• Define power of a test and its meaning

• Describe type 1 and type 2 errors in hypothesis testing and how they affect the interpretation of results

• Understand how consideration of p value, type 1 and 2 errors relate to sample size calculation

Section 1: P value

P value


• Defined as the probability, assuming the null hypothesis is true, of obtaining a result equal to or more extreme than what was actually observed

• First introduced by Karl Pearson in his Pearson's chi-squared test

• It can also be seen in relation to the probability of making a Type I error

P value


The vertical coordinate is the probability density of each outcome, computed under the null hypothesis. The p-value is the area under the curve past the observed data point.
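As a minimal sketch of "area under the curve past the observed data point", the snippet below assumes the test statistic is a z score whose null distribution is standard normal. That distribution is an assumption made for illustration (the figure does not name one), and the function name `p_value_from_z` is hypothetical.

```python
from statistics import NormalDist


def p_value_from_z(z, two_sided=True):
    """Area under the standard normal null curve at or beyond the observed z."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if two_sided:
        # Count both tails: results at least this extreme in either direction.
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    # One-sided (upper tail): results at least this large.
    return 1.0 - std_normal.cdf(z)


print(round(p_value_from_z(1.96), 4))                   # two-sided, about 0.05
print(round(p_value_from_z(1.96, two_sided=False), 4))  # upper tail, about 0.025
```

The same observed value gives half the p value under a one-sided test, which is why the choice of one or two tails has to be made before looking at the data.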


P value – choice of cutoff value


• Arbitrary cut-off of 0.05 (5% chance of a false positive conclusion)

• If p < 0.05: statistically significant; reject H0, accept H1

• If p > 0.05: not statistically significant; fail to reject H0 (this does not prove that H0 is true)

• When testing potentially harmful interventions, the α value is set below 0.05

• Depends upon the research question!

P value – degrees of magnitude


• Very small (<0.001): the results are said to be highly significant

• Near 0.05: it is said to be borderline significant

• Near 1.0: result does not matter!

P value – how to calculate it?


• Depending upon the statistic we are interested in, predetermined p values and their critical values are displayed in statistical tables

• So each type of distribution has its own table

• It is also possible to calculate exact p values with computers instead of using such tables
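To illustrate the last point, here is a small sketch that reproduces the critical values a printed z table would list, using Python's standard library. It assumes a standard normal distribution only; other distributions (t, chi-squared, F) would need their own quantile functions. The helper name `critical_z` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def critical_z(alpha, two_sided=True):
    """Cutoff on the z scale that leaves a total tail area of alpha."""
    tail_area = alpha / 2.0 if two_sided else alpha
    return std_normal.inv_cdf(1.0 - tail_area)


# Rebuild a few rows of a two-sided z table.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}  critical z = {critical_z(alpha):.3f}")
```

A table forces a comparison against a handful of preset cutoffs, whereas software can report the exact tail area for the observed statistic.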


P value – interpretation


• If the results are statistically significant, decide whether the observed differences are clinically important

• If not significant, see if the sample size was adequate enough not to have missed a clinically important difference

• Power of the study tells us the strength with which we can conclude that there is no difference between the two groups

P value – interpretation


• Statistical significance does not necessarily mean real significance

• If sample size is large, even small differences can have a low p-value

• Lack of significance doesn't necessarily mean the null hypothesis is true

• If sample size is small, there could be a real difference, but we are not able to detect it

• If you perform a large number of tests in a study, about 1 in 20 will be significant at the 0.05 level merely by chance
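Two of the points above can be checked with simple arithmetic. The sketch below assumes independent tests in which every null hypothesis is true (for the multiple-testing point) and a z test with known σ (for the sample-size point). The difference of 1 unit, σ = 40, and the two sample sizes are hypothetical values chosen for illustration, as are the function names.

```python
from statistics import NormalDist


def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of one or more significant results when every null is true.

    Assumes the tests are independent of each other.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


def two_sided_p(difference, sigma, n):
    """Two-sided z-test p value for an observed mean difference."""
    z = abs(difference) * n ** 0.5 / sigma
    return 2.0 * (1.0 - NormalDist().cdf(z))


# 20 independent tests at alpha = 0.05: roughly a 64% chance of a false alarm.
print(round(prob_at_least_one_false_positive(20), 4))

# The same small difference (1 unit, sigma = 40) at two sample sizes.
print(round(two_sided_p(difference=1.0, sigma=40.0, n=100), 4))    # far from significant
print(round(two_sided_p(difference=1.0, sigma=40.0, n=10000), 4))  # below 0.05
```

The observed difference is identical in the last two lines; only the sample size changes, which is why a low p value on its own says nothing about clinical importance.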


Section 2: Type 1 and 2 errors

What are these errors?

• These are errors that arise when performing hypothesis testing and decision making

• Type 1 error (false positive conclusion)
    • Stating a difference when there is no difference; its probability is alpha (α)
    • Related to p value, how?
    • Set at 1/20 or 0.05 or 5%
    • For a two-sided test, the probability is distributed at the tails of the normal curve, i.e., 0.025 on either tail

• Type 2 error (false negative conclusion)
    • Stating no difference when there is a difference; its probability is beta (β)
    • More likely when sample size is too small
    • Conventional values are 0.1 or 0.2
    • Related to power, how?


What are these errors?

Researcher concludes               Reality: No effect           Reality: Effect exists
Fail to reject null; no effect     CORRECT FAILURE TO REJECT    TYPE 2 ERROR (β)
Reject null; effect exists         TYPE 1 ERROR (α)             CORRECT REJECT (1-β)
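The four outcomes above can be counted in a small simulation. This is a sketch under stated assumptions, not part of the lecture: it draws normally distributed samples, applies a two-sided one-sample z test with known σ, and borrows the numbers from the deck's power example (null mean 170, alternative mean 190, σ = 40, n = 16). The function name, trial count, and seed are hypothetical.

```python
import random
from statistics import NormalDist


def rejection_rate(true_mean, null_mean=170.0, sigma=40.0, n=16,
                   alpha=0.05, trials=4000, seed=0):
    """Fraction of simulated studies that reject the null hypothesis."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    standard_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        z = (sample_mean - null_mean) / standard_error
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


# Reality: no effect. Every rejection is a type 1 error.
type1_rate = rejection_rate(true_mean=170.0)

# Reality: effect exists. Every failure to reject is a type 2 error.
type2_rate = 1.0 - rejection_rate(true_mean=190.0)

print(f"simulated type 1 error rate: {type1_rate:.3f}")
print(f"simulated type 2 error rate: {type2_rate:.3f}")
```

With these settings the first rate should land close to α = 0.05, and the second close to one minus the power worked out later in the deck; the exact figures vary with the seed and the number of trials.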

• Advanced learning: Do you know there are type 3 and 4 errors also?


Example 1

Example 2

Example 3

Section 3: Power of the study


Power of the study


• The ability to detect a statistically significant association

• It can also be seen as the probability of not missing an effect, due to sampling error, when there really is an effect

• It is also the probability of avoiding a type 2 error, i.e., 1 – beta

• A prospective power analysis is used before collecting data, to consider design sensitivity

• A retrospective power analysis is used in order to know whether the studies you are interpreting were well enough designed

Factors affecting power


• All else being equal:

1. As sample sizes increase, power increases

2. As population variances decrease, power increases

3. As the difference increases, power increases

4. Statistical power is greater for one-tailed tests

5. The greater the probability of making a Type I error, the greater the power
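Each of the five factors can be checked numerically. The sketch below assumes a one-sample z test with known σ and uses the same approximation as the deck's worked example (only the tail in the direction of the true difference is counted). The baseline values (μ0 = 170, μa = 190, σ = 40, n = 16, α = 0.05) come from that example; the alternative settings in each scenario and the name `z_test_power` are hypothetical.

```python
from statistics import NormalDist


def z_test_power(mu0, mua, sigma, n, alpha=0.05, two_sided=True):
    """Approximate power of a one-sample z test (dominant tail only)."""
    tail_area = alpha / 2.0 if two_sided else alpha
    z_crit = NormalDist().inv_cdf(1.0 - tail_area)
    shift = abs(mu0 - mua) * n ** 0.5 / sigma  # true difference in standard errors
    return NormalDist().cdf(shift - z_crit)


baseline = z_test_power(170, 190, 40, 16)

# Change one factor at a time, keeping everything else at the baseline.
scenarios = {
    "1. larger sample (n = 42)": z_test_power(170, 190, 40, 42),
    "2. smaller spread (sigma = 20)": z_test_power(170, 190, 20, 16),
    "3. larger difference (mu_a = 200)": z_test_power(170, 200, 40, 16),
    "4. one-tailed test": z_test_power(170, 190, 40, 16, two_sided=False),
    "5. larger alpha (0.10)": z_test_power(170, 190, 40, 16, alpha=0.10),
}

for label, power in scenarios.items():
    print(f"{label}: power {power:.3f} (baseline {baseline:.3f})")
```

Every scenario reports a power above the baseline, matching the direction stated in each item of the list.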


Calculating Power: Example


• A study of n = 16 retains the null H: μ = 170 at α = 0.05 (two-sided); σ is 40. What was the power of the test's conditions to identify a population mean of 190?

\[
1 - \beta
= \Phi\left(-z_{1-\alpha/2} + \frac{|\mu_0 - \mu_a|\sqrt{n}}{\sigma}\right)
= \Phi\left(-1.96 + \frac{|170 - 190|\sqrt{16}}{40}\right)
= \Phi(0.04)
= 0.5160
\]

Calculating Power: Example


• Top curve assumes null H is true

• Bottom curve assumes alternative H is true

• α is set to 0.05 (two-sided)

• We will reject null when a sample mean exceeds 189.6 (right tail, top curve)

• The probability of getting a value greater than 189.6 on the bottom curve is 0.5160, corresponding to the power of the test
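The two-curve reasoning above can be retraced step by step. This sketch first finds the rejection cutoffs on the null curve, then measures how much of the alternative curve lies beyond them. The input values are those of the example; the variable names are hypothetical, and the lower-tail calculation is an addition that the slide leaves out because its contribution is negligible here.

```python
from statistics import NormalDist

mu0, mua, sigma, n, alpha = 170.0, 190.0, 40.0, 16, 0.05

standard_error = sigma / n ** 0.5                  # 40 / 4 = 10
z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # about 1.96

# Top curve (null true): sample means beyond these cutoffs lead to rejection.
upper_cutoff = mu0 + z_crit * standard_error       # about 189.6
lower_cutoff = mu0 - z_crit * standard_error       # about 150.4

# Bottom curve (alternative true): sampling distribution centred on 190.
alternative = NormalDist(mu=mua, sigma=standard_error)
upper_tail = 1.0 - alternative.cdf(upper_cutoff)   # the power quoted on the slide
lower_tail = alternative.cdf(lower_cutoff)         # rejection in the wrong direction

print(f"reject when the sample mean exceeds {upper_cutoff:.1f}")
print(f"power from the right tail: {upper_tail:.4f}")
print(f"extra area from the left tail: {lower_tail:.6f}")
```

Because the left-tail area is tiny, the single-tail figure of about 0.516 is effectively the full two-sided power in this example.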


Power vs. confidence intervals


• Once we have constructed a confidence interval, power calculations yield no additional insights

• It is pointless to perform power calculations for hypotheses outside of the confidence interval

• Confidence intervals better inform readers about the possibility of an inadequate sample size than do post hoc power calculations

How do the errors relate to sample size?


• Sample size for one-sample z test:
    • 1 – β ≡ desired power
    • α ≡ desired significance level (two-sided)
    • σ ≡ population standard deviation
    • Δ = μ0 – μa ≡ the difference worth detecting

\[
n = \frac{\sigma^{2}\left(z_{1-\beta} + z_{1-\alpha/2}\right)^{2}}{\Delta^{2}}
\]
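The sample size formula is the power formula from the earlier worked example solved for n, which is how α, β, and n are tied together. The steps below are a sketch of that rearrangement; they use the same single-tail approximation as the power example, and they apply the inverse of Φ to both sides in the second line.

```latex
\begin{align*}
1 - \beta &= \Phi\!\left(-z_{1-\alpha/2} + \frac{|\Delta|\sqrt{n}}{\sigma}\right) \\
z_{1-\beta} &= -z_{1-\alpha/2} + \frac{|\Delta|\sqrt{n}}{\sigma} \\
\sqrt{n} &= \frac{\sigma\left(z_{1-\beta} + z_{1-\alpha/2}\right)}{|\Delta|} \\
n &= \frac{\sigma^{2}\left(z_{1-\beta} + z_{1-\alpha/2}\right)^{2}}{\Delta^{2}}
\end{align*}
```

Asking for a smaller α or a smaller β makes the corresponding z value larger, so the required n grows; a smaller Δ or a larger σ has the same effect.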



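• How large a sample is needed for a one-sample z test with 90% power and α = 0.05 (two-tailed) when σ = 40? Let H0: μ = 170 and Ha: μ = 190 (thus, Δ = μ0 − μa = 170 – 190 = −20)

\[
n = \frac{\sigma^{2}\left(z_{1-\beta} + z_{1-\alpha/2}\right)^{2}}{\Delta^{2}}
= \frac{40^{2}\,(1.28 + 1.96)^{2}}{(-20)^{2}}
= 41.99
\]

• Rounding up, the sample size should be 42 to ensure adequate power.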
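The calculation can be reproduced in a few lines. The sketch below evaluates the formula twice: once with the two-decimal table values used in the example (1.28 and 1.96), and once with unrounded normal quantiles from Python's standard library. The function and variable names are hypothetical.

```python
import math
from statistics import NormalDist


def z_test_sample_size(sigma, delta, z_beta, z_alpha):
    """n = sigma^2 * (z_beta + z_alpha)^2 / delta^2 for a one-sample z test."""
    return sigma ** 2 * (z_beta + z_alpha) ** 2 / delta ** 2


sigma, delta = 40.0, 170.0 - 190.0  # delta = mu0 - mua = -20

# Table values rounded to two decimals, as in the worked example.
n_table = z_test_sample_size(sigma, delta, z_beta=1.28, z_alpha=1.96)

# Unrounded quantiles for 90% power and two-sided alpha = 0.05.
std_normal = NormalDist()
n_exact = z_test_sample_size(
    sigma, delta,
    z_beta=std_normal.inv_cdf(0.90),
    z_alpha=std_normal.inv_cdf(1.0 - 0.05 / 2.0),
)

print(f"table z values: n = {n_table:.2f}, round up to {math.ceil(n_table)}")
print(f"exact z values: n = {n_exact:.2f}, round up to {math.ceil(n_exact)}")
```

With the table values this gives 41.99, matching the example. With unrounded quantiles the result sits just above 42, so strictly rounding up would give 43; the two answers differ only because of rounding in the z values, and the power at n = 42 is a hair under 90%.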



[Figure panels: N = 16 and N = 42]

Take home messages


• P value, type 1 and 2 errors, alpha, beta, power, critical value, hypothesis testing, and sample size are all related to each other
