Stat 13, Intro. to Statistical Methods for the Life and Health Sciences.
1. Common problems with regression. a. Inferring causation. b. Extrapolation. c. Curvature.
2. Testing significance of correlation or slope.
No class Thu Nov 24, Thanksgiving. Read ch9. Hw4 is 10.1.8, 10.3.14, 10.3.21, 10.4.11 and is due Tue Nov 29.
The final, Fri Dec 9, 8am-11, right here, will be on ch1-10. Bring a PENCIL and CALCULATOR and any books or notes you want. No computers. http://www.stat.ucla.edu/~frederic/13/F16.
Common problems with regression.
• a. Correlation is not causation. Especially with observational data.
Common problems with regression.
Holmes and Willett (2004) reviewed all prospective studies on fat consumption and breast cancer with at least 200 cases of breast cancer. "Not one study reported a significant positive association with total fat intake.... Overall, no association was observed between intake of total, saturated, monounsaturated, or polyunsaturated fat and risk for breast cancer."
They also state "The dietary fat hypothesis is largely based on the observation that national per capita fat consumption is highly correlated with breast cancer mortality rates. However, per capita fat consumption is highly correlated with economic development. Also, low parity and late age at first birth, greater body fat, and lower levels of physical activity are more prevalent in Western countries, and would be expected to confound the association with dietary fat."
Common problems with regression.
• b. Extrapolation.
Common problems with regression.
• b. Extrapolation.
• Often researchers extrapolate from high doses to low.
Common problems with regression.
• b. Extrapolation.
The relationship can be nonlinear though. Researchers also often extrapolate from animals to humans.
Zaichkina et al. (2004) on hamsters.
Common problems with regression.
• c. Curvature. The best fitting line might fit poorly. Port et al. (2005).
Common problems with regression.
• c. Curvature. The best fitting line might fit poorly. Wong et al. (2011).
How well does the line fit?
• $r^2$ is a measure of fit. It indicates the amount of scatter around the best fitting line.
• Residual plots can indicate curvature, outliers, or heteroskedasticity.
• $\sqrt{1 - r^2}\, s_y$ is useful as a measure of how far off predictions would have been on average.
Note that regression residuals have mean zero, whether the line fits well or poorly.
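The two facts on this slide can be checked numerically. The sketch below uses made-up data (not from the lecture) and hypothetical helper names. It assumes the SD of y is computed with divisor n, in which case $\sqrt{1 - r^2}\, s_y$ matches the root-mean-square residual exactly; with divisor n − 1 the match is only approximate.

```python
import math

def fit_line(x, y):
    """Least-squares intercept a and slope b for y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def correlation(x, y):
    """Sample correlation coefficient r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data (an assumption, not the lecture's data).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 4.8, 6.5, 6.9, 8.4, 8.8]

a, b = fit_line(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
n = len(y)

# Residuals from a least-squares line average to zero.
mean_residual = sum(residuals) / n

# Typical prediction error: sqrt(1 - r^2) * s_y, with s_y using divisor n.
r = correlation(x, y)
mean_y = sum(y) / n
s_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / n)
typical_error = math.sqrt(1 - r ** 2) * s_y

# Root-mean-square size of the residuals, for comparison.
rms_residual = math.sqrt(sum(e ** 2 for e in residuals) / n)

print(mean_residual, typical_error, rms_residual)
```

The mean residual is zero up to rounding error no matter how curved the data are, which is why a residual plot, not the residual mean, is what reveals a poor fit.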
Common problems with regression.
• d. Statistical significance. Could the observed correlation just be due to chance alone?
Inference for the Regression Slope: Theory-Based Approach
Section 10.5
Do students who spend more time in non-academic activities tend to have lower GPAs?
Example 10.4
Do students who spend more time in non-academic activities tend to have lower GPAs?
• The subjects were 34 undergraduate students from the University of Minnesota.
• They were asked questions about how much time they spent in activities like work, watching TV, exercising, non-academic computer use, etc. as well as what their current GPA was.
• We are going to test to see if there is a negative association between the number of hours per week spent on nonacademic activities and GPA.
Hypotheses
• Null Hypothesis: There is no association between the number of hours students spend on nonacademic activities and student GPA in the population.
• Alternative Hypothesis: There is a negative association between the number of hours students spend on nonacademic activities and student GPA in the population.
Descriptive Statistics
• $\widehat{\text{GPA}} = 3.60 - 0.0059\,(\text{nonacademic hours})$.
• What do the slope and y-intercept mean?
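One way to read the fitted equation is to plug in values. In the sketch below, the function name `predicted_gpa` is hypothetical and the hour values 0, 10, and 11 are arbitrary choices for illustration; only the coefficients 3.60 and −0.0059 come from the slide.

```python
def predicted_gpa(hours, intercept=3.60, slope=-0.0059):
    """Predicted GPA from the fitted line on the slide."""
    return intercept + slope * hours

# y-intercept: the predicted GPA for a student with 0 nonacademic hours per week.
at_zero = predicted_gpa(0)

# Slope: the change in predicted GPA for each additional nonacademic hour per week.
change_per_hour = predicted_gpa(11) - predicted_gpa(10)

print(at_zero, change_per_hour)
```

So the intercept is the predicted GPA at zero nonacademic hours, and the slope says each additional hour per week is associated with a predicted GPA that is lower by 0.0059. This describes an association in observational data, not a causal effect.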
Shuffle to Develop Null Distribution
• We are going to shuffle just as we did with correlation to develop a null distribution.
• The only difference is that we will be calculating the slope each time and using that as our statistic.
• A test of association based on slope is equivalent to a test of association based on a correlation coefficient.
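The shuffling procedure can be sketched as follows. The data here are simulated stand-ins, not the Minnesota survey; the names `hours`, `gpa`, and `slope_and_r`, the seed, and the 2000 shuffles are all assumptions made for illustration. Because shuffling leaves the SDs of x and y unchanged, the slope is a fixed positive multiple of r in every shuffle, so both statistics give the same p-value.

```python
import random

def slope_and_r(x, y):
    """Return the least-squares slope and the correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

rng = random.Random(0)

# Simulated stand-in data for 34 students (NOT the real survey data).
hours = [rng.uniform(20, 100) for _ in range(34)]
gpa = [3.6 - 0.006 * h + rng.gauss(0, 0.4) for h in hours]

obs_slope, obs_r = slope_and_r(hours, gpa)

n_shuffles = 2000
count_slope = 0
count_r = 0
shuffled = gpa[:]
for _ in range(n_shuffles):
    # Shuffling the responses breaks any association with hours (the null).
    rng.shuffle(shuffled)
    s, r = slope_and_r(hours, shuffled)
    # One-sided test: count shuffles at least as negative as what we observed.
    count_slope += s <= obs_slope
    count_r += r <= obs_r

p_slope = count_slope / n_shuffles
p_r = count_r / n_shuffles
print(p_slope, p_r)
```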
Beta vs Rho
• Testing the slope of the regression line is equivalent to testing the correlation (same p-value, but obviously different confidence intervals since the statistics are different).
• Hence these hypotheses are equivalent.
• Ho: β = 0, Ha: β < 0 (Slope)
• Ho: ρ = 0, Ha: ρ < 0 (Correlation)
• Sample slope (b), Population (β: beta)
• Sample correlation (r), Population (ρ: rho)
• When we do the theory-based test, we will be using the t-statistic, which can be calculated from either the slope or correlation.
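A quick numerical check that the t-statistic is the same from either route. The data are made up, and the two formulas used are the standard textbook ones rather than anything quoted from these slides: t = b / SE(b) with SE(b) = sqrt((SSE / (n − 2)) / Sxx), and t = r sqrt(n − 2) / sqrt(1 − r²).

```python
import math

# Made-up illustrative data (an assumption, not from the lecture).
x = [2, 4, 5, 7, 8, 10, 11, 13]
y = [9.1, 8.0, 8.4, 6.9, 7.2, 5.8, 6.1, 4.9]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
syy = sum((yi - my) ** 2 for yi in y)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

b = sxy / sxx                      # sample slope
a = my - b * mx                    # sample intercept
r = sxy / math.sqrt(sxx * syy)     # sample correlation

# Route 1: slope divided by its standard error.
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
se_b = math.sqrt(sse / (n - 2) / sxx)
t_from_slope = b / se_b

# Route 2: the correlation-based formula.
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(t_from_slope, t_from_r)
```

The two values agree up to rounding error, which is why the slope test and the correlation test give the same p-value.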
Introduction
• Our null distributions are again bell-shaped and centered at 0 (for either correlation or slope as our statistic).
The book on p549 finds a p-value of 3.3% by simulation.
Validity Conditions
• Under certain conditions, theory-based inference for correlation or slope of the regression line uses t-distributions.
• We could use simulations or the theory-based methods for the slope of the regression line.
• We would get the same p-value if we used correlation as our statistic.
Predicting Heart Rate from Body Temperature
Example 10.5A
Heart Rate and Body Temp
• Earlier we looked at the relationship between heart rate and body temperature with 130 healthy adults.
• Predicted Heart Rate = −166.3 + 2.44 Temp
• r = 0.257
Heart Rate and Body Temp
• We tested to see if we had convincing evidence that there is a positive association between heart rate and body temperature in the population using a simulation-based approach. (We will make it 2-sided this time.)
• Null Hypothesis: There is no association between heart rate and body temperature in the population. β = 0
• Alternative Hypothesis: There is an association between heart rate and body temperature in the population. β ≠ 0
Heart Rate and Body Temp
We get a very small p-value (0.0036). Anything as extreme as our observed slope of 2.44 happening by chance is very rare.
Heart Rate and Body Temp
• We can also approximate a 95% confidence interval: observed statistic ± 2 SD of statistic. 2.44 ± 2(0.842) = 0.76 to 4.12.
• What does this mean? We’re 95% confident that, in the population of healthy adults, each 1° increase in body temp is associated with an increase in heart rate of between 0.76 and 4.12 beats per minute.
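The 2SD interval is simple arithmetic, sketched here with the slide's numbers. The function name `two_sd_interval` is hypothetical.

```python
def two_sd_interval(statistic, sd_of_statistic):
    """Approximate 95% interval: statistic plus or minus 2 SDs of the statistic."""
    margin = 2 * sd_of_statistic
    return statistic - margin, statistic + margin

# Observed slope 2.44 and SD of the shuffled slopes 0.842, both from the slide.
low, high = two_sd_interval(2.44, 0.842)
print(round(low, 2), round(high, 2))
```

Since the whole interval lies above 0, it agrees with the small two-sided p-value: a slope of 0 is not a plausible value.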
Heart Rate and Body Temp
• The theory-based approach should work well since the distribution has a nice bell shape.
• Also check the scatterplot.
Heart Rate and Body Temp
• We will use the t-statistic to get our theory-based p-value.
• We will find a theory-based confidence interval for the slope.
• On p554, the book notes the formula
• $t = \dfrac{r}{\sqrt{(1 - r^2)/(n - 2)}}$.
• Here the t statistic is 2.97.
• The p-value is 0.36%. So the correlation is statistically significantly greater than zero.
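As a rough check on these numbers, the sketch below computes t from r and n with the standard correlation-based formula, and a two-sided p-value from the t-distribution with n − 2 = 128 degrees of freedom. It uses only the standard library and integrates the t density numerically with Simpson's rule, which is an implementation choice made here, not the book's method. All function names are hypothetical. Plugging in the rounded r = 0.257 gives a t slightly above the reported 2.97; this sketch assumes that gap comes from rounding in the reported r.

```python
import math

def t_from_r(r, n):
    """t-statistic from a sample correlation r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the area between -|t| and |t|.

    The area from 0 to |t| is found with Simpson's rule (steps must be even).
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 1 - 2 * area_zero_to_t

n = 130
t_check = t_from_r(0.257, n)          # from the rounded r on the slides
p_reported = two_sided_p(2.97, n - 2)  # from the t reported on the slide
print(round(t_check, 2), round(p_reported, 4))
```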
Smoking and Drinking
Example 10.5B
Validity Conditions
Remember our validity conditions for theory-based inference for slope of the regression equation.
1. The scatterplot should follow a linear trend.
2. There should be approximately the same number of points above and below the regression line (symmetry).
3. The variability of vertical slices of the points should be similar. This is called homoskedasticity.
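Conditions 2 and 3 can be given a rough numerical check, sketched below on made-up data. This is an informal aid, not a procedure from the book: the helper names are hypothetical, splitting the points into a lower-x half and an upper-x half is just one crude way to form "vertical slices", and no cutoff for the spread ratio is implied. Condition 1 (linear trend) is still best judged from the scatterplot or a residual plot.

```python
import math

def fit_line(x, y):
    """Least-squares intercept a and slope b for y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def sd(values):
    """Standard deviation with divisor n."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def residual_checks(x, y):
    """Return (points above line, points below line, spread ratio)."""
    a, b = fit_line(x, y)
    # Sort by x so the residuals can be split into a left and a right slice.
    pairs = sorted((xi, yi - (a + b * xi)) for xi, yi in zip(x, y))
    resid = [e for _, e in pairs]
    above = sum(e > 0 for e in resid)
    below = sum(e < 0 for e in resid)
    half = len(resid) // 2
    # Ratio near 1 suggests similar vertical spread in both slices.
    spread_ratio = sd(resid[:half]) / sd(resid[half:])
    return above, below, spread_ratio

# Made-up data: a line plus noise that alternates between +1 and -1.
x = list(range(20))
y = [2 + 0.5 * i + (1 if i % 2 == 0 else -1) for i in x]

above, below, spread_ratio = residual_checks(x, y)
print(above, below, spread_ratio)
```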
Validity Conditions
• Let’s look at some scatterplots that do not meet the requirements.
Smoking and Drinking
The relationship between number of drinks and cigarettes per week for a random sample of students at Hope College.
The dot at (0,0) represents 524 students.
Are the conditions met? Hard to say. The book says no.
Smoking and Drinking
• When the conditions are not met, applying simulation-based inference is preferable to theory-based t-tests and CIs.
Validity Conditions
• What do you do when validity conditions aren’t met for theory-based inference?
• Use the simulation-based approach.
• Another strategy is to “transform” the data on a different scale so conditions are met.
• The logarithmic scale is common.
• One can also fit a different curve, not necessarily a line.
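To illustrate the logarithmic transformation, the sketch below uses made-up data that grow exponentially, an assumption chosen so that the effect is easy to see. On the original scale the points curve away from any line; after taking logs of y the relationship is exactly linear, so r on the log scale is essentially 1.

```python
import math

def correlation(x, y):
    """Sample correlation coefficient r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

# Made-up exponential growth: y = 3 * exp(0.4 * x), with no noise.
x = list(range(1, 11))
y = [3 * math.exp(0.4 * xi) for xi in x]

r_raw = correlation(x, y)                         # curved on the raw scale
r_log = correlation(x, [math.log(v) for v in y])  # linear on the log scale

print(r_raw, r_log)
```

With real data the improvement is rarely this clean, so it is still worth rechecking the scatterplot and residuals on the transformed scale.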