Biostatistics in Practice Session 5: Associations and Confounding

Youngju Pak, Ph.D., Biostatistician
http://research.LABioMed.org/Biostat

Post on 14-Feb-2016


TRANSCRIPT

STA 7020 Statistical Methods in the Health Sciences (section 3)

Biostatistics in Practice
Session 5: Associations and confounding
Youngju Pak, Ph.D., Biostatistician

http://research.LABioMed.org/Biostat

Biostatistics in Practice: Session 5 (11/5/2013)

Revisiting the Food Additives Study

From Table 3: unadjusted and adjusted results.
What does "adjusted" mean? How is it done?

Goal One of Session 5

Earlier: Compare means for a single measure among groups. Use t-test, ANOVA.
Session 5: Relate two or more measures. Use correlation or regression.
[Figure: scatter plot of Y versus X. Source: Qu et al. (2005), JCEM 90:1563-1569.]

Goal Two of Session 5
Try to isolate the effects of different characteristics on an outcome.
Previous slide: Gender, BMI, GH Peak

Correlation
Standard English word "correlate":
  a: to establish a mutual or reciprocal relation between
  b: to show correlation or a causal relationship between
In statistics, it has a more precise meaning.

Oct 25, 2012, Biostatistics in Practice: Session 5

Correlation in Statistics
Correlation: measure of the strength of LINEAR association

Positive correlation: two variables move in the same direction. As one variable increases, the other also tends to increase LINEARLY, or vice versa.
Example: Weight vs. Height
Negative correlation: two variables move in opposite directions. As one variable increases, the other tends to decrease LINEARLY, or vice versa (inverse relationship).
Example: Physical activity level vs. abdominal height (visceral fat)

Pearson r correlation coefficient
r can be any value from -1 to +1.
r = -1 indicates a perfect negative LINEAR relationship between the two variables.
r = 1 indicates a perfect positive LINEAR relationship between the two variables.
r = 0 indicates that there is no LINEAR relationship between the two variables.

Scatter Plot: r = 1.0

Scatter Plot: r = -1.0

Scatter Plot: r = 0

Anemic women: Anemia.sav, n = 20 (rows shown on the slide)

Hb (g/dl)   PCV (%)
11.1        35
10.7        45
12.4        47
13.1        31
10.5        30
 9.6        25
12.5        33
13.5        35

r expresses how well the data fit a straight line. Here, Pearson's r = 0.673.
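To make the coefficient concrete, here is a minimal Python sketch that computes Pearson's r from the deviations of each variable about its mean. It uses only the eight Hb/PCV pairs listed above, not the full n = 20 file, so it should not be expected to reproduce the reported r = 0.673. The function name `pearson_r` and the variable names are illustrative assumptions, not part of the original slides.

```python
import math

# Eight (Hb, PCV) pairs listed on the slide. The full Anemia.sav file has
# n = 20 rows, so this subset is used for illustration only.
hb = [11.1, 10.7, 12.4, 13.1, 10.5, 9.6, 12.5, 13.5]
pcv = [35, 45, 47, 31, 30, 25, 33, 35]


def pearson_r(x, y):
    """Pearson correlation computed from sums of deviations about the means."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    # Sum of cross-products of deviations (numerator of r).
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    # Sums of squared deviations (under the square root in the denominator).
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(hb, pcv)
print(f"Pearson r on the eight listed pairs: {r:.3f}")
```

Because r depends on which observations are included, a subset of the data can give a value quite different from the full-sample result.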

PCV (hematocrit): packed cell volume = the percentage of blood volume made up of red blood cells, usually about 45% for men and 40% for women.
Hemoglobin carries oxygen from the lungs to the rest of the body.

Correlations in real data

Logic for Value of Correlation

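Pearson's r is computed from the deviations of X and Y from their means:

\[ r = \frac{\sum (X - X_{\text{mean}})(Y - Y_{\text{mean}})}{\sqrt{\sum (X - X_{\text{mean}})^2 \, \sum (Y - Y_{\text{mean}})^2}} \]

[Figure: scatter plot with quadrants marked +, +, -, -.]
Statistical software gives r.

Correlation Depends on Ranges of X & Y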

[Figure: graphs A and B.]
Graph B contains only the graph A points in the ellipse.
Correlation is reduced in graph B.
Thus: correlations for the same quantities X and Y may be quite different in different study populations.

Simple Linear Regression (SLR)
X and Y now assume unique roles:
  Y is an outcome, response, output, dependent variable.
  X is an input, predictor, explanatory, independent variable.
Regression analysis is used to:
  Measure more than X-Y association, as with correlation.
  Fit a straight line through the scatter plot, for:
    Prediction of Ymean from X.
    Estimation of the change in Ymean for a unit change in X = rate of change of Ymean per unit change in X (slope = regression coefficient, which measures the effect of X on Y).

SLR Example
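As a small companion to the fitted-line idea, the following Python sketch fits a least-squares line and reports the slope, intercept, standard error of the slope, and the ratio |slope/SE(slope)|. It reuses the eight Hb/PCV pairs from the anemia slide; treating Hb as X and PCV as Y is an assumption made only for illustration, and the function name `fit_line` is hypothetical. The standard error uses the usual textbook formula, sqrt((SSE/(n-2)) / Sxx).

```python
import math

# Eight (Hb, PCV) pairs from the anemia slide. Using Hb as X and PCV as Y
# is an assumption made only for this illustration.
x = [11.1, 10.7, 12.4, 13.1, 10.5, 9.6, 12.5, 13.5]
y = [35, 45, 47, 31, 30, 25, 33, 35]


def fit_line(x, y):
    """Least-squares line: returns (slope, intercept, SE of slope)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # change in Ymean per unit of X
    intercept = y_mean - slope * x_mean    # line passes through the means
    # Residuals e_i; the fitted line minimizes the sum of e_i squared.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in residuals)
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return slope, intercept, se_slope


slope, intercept, se_slope = fit_line(x, y)
t_c = abs(slope / se_slope)
print(f"Ymean = {intercept:.2f} + {slope:.2f} X")
print(f"SE(slope) = {se_slope:.3f}, |slope/SE(slope)| = {t_c:.3f}")
```

With so few points the estimates are unstable; the sketch is meant to show the arithmetic, not to stand in for the analysis on the slides.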

[Figure: fitted regression line with residuals e_i; the line minimizes Σ e_i^2. Bands show the range for individuals and the range for the mean.]
Statistical software gives all this info.

Hypothesis testing for the true slope = 0
H0: true slope = 0 vs. Ha: true slope ≠ 0, with the rule:
Claim association (slope ≠ 0) if t_c = |slope/SE(slope)| > t ≈ 2.
There is a 5% chance of claiming an X-Y association that really does not exist.
Note similarity to t-test for means: t_c = |mean/SE(mean)|.
Formula for SE(slope) is in statistics books.

Example Software Output
The regression equation is: Ymean = 81.6 + 2.16 X

Predictor   Coeff   StdErr   T      P
Constant    81.64   11.47    7.12