08 relationships among variables


Upload: anand-satsangi

Post on 07-Jul-2015


Page 1: 08   relationships among variables

Relationships Among Variables

Correlation and Regression

KNES 510 Research Methods in Kinesiology

Page 2: 08   relationships among variables

Correlation

Correlation is “a statistical technique used to determine the relationship between two or more variables”

We use two different techniques to determine score relationships:

1. graphing technique

2. mathematical technique called correlation

Page 3: 08   relationships among variables

Graphs of the Relationship Between Variables

Page 4: 08   relationships among variables

Types of Relationships

The scattergram can indicate a positive relationship, a negative relationship, or a zero relationship

What are the characteristics of positive, negative, and zero relationships?

Page 5: 08   relationships among variables

Mathematical Technique: The Correlation Coefficient (r)

The correlation coefficient, r,* represents the relationship between the z-scores of the subjects on two different variables (usually designated X and Y)

This can be stated mathematically as the mean of the z-score products for all subjects

*A more complete name for this statistic is Pearson’s product-moment correlation coefficient

Page 6: 08   relationships among variables

Formula for the Correlation Coefficient

The correlation coefficient can be calculated as follows:

$$ r = \frac{\sum (Z_X Z_Y)}{N} $$
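The formula can be checked by hand on a small data set. The following is a minimal sketch, not taken from the slides: the five pairs of scores are hypothetical, and the z-scores are computed with the population standard deviation (dividing by N), which is what the division by N in the formula assumes.

```python
import math

def z_scores(values):
    """Convert raw scores to z-scores using the population SD (divide by N)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def pearson_r(x, y):
    """Pearson's r computed as the mean of the paired z-score products."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

# Hypothetical scores for five subjects (illustration only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 3))  # 0.775
```

If the z-scores were computed with the sample standard deviation (dividing by N - 1) instead, the sum of products would need to be divided by N - 1 to give the same r.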

Page 7: 08   relationships among variables

The values of the coefficient will always range from +1.00 to -1.00

A correlation coefficient near 0.00 indicates no linear relationship

Page 8: 08   relationships among variables

SPSS Bivariate Correlation Output

Correlations

                             X        Y
X    Pearson Correlation     1        .947
     Sig. (2-tailed)                  .053
     N                       4        4
Y    Pearson Correlation     .947     1
     Sig. (2-tailed)         .053
     N                       4        4

Page 9: 08   relationships among variables

Interpreting the Correlation Coefficient

Because the relationship between two sets of data is seldom perfect, the majority of correlation coefficients are fractions (0.92, -0.80, and the like)

When interpreting correlation coefficients it is sometimes difficult to determine what is high, low, and average

Page 10: 08   relationships among variables

The Correlation Coefficient and Cause-and-Effect

There is a high correlation between a person's shoe size and their math skills in grades K through 6

Is this an example of cause-and-effect?

Can we predict math skill based on shoe size in grade K through 6 students?

Page 11: 08   relationships among variables

Coefficient of Determination

The coefficient of determination is the amount of variability in one measure that is explained by the other measure

The coefficient of determination is the square of the correlation coefficient (r²).

For example, if the correlation coefficient between two variables is r = 0.90, the coefficient of determination is (0.90)² = 0.81
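Squaring r shows how quickly the explained variability falls off as the correlation weakens. The sketch below is illustrative only: the function name is hypothetical, 0.90 is the slide's example, and 0.686 and 0.168 are the correlations reported in the SPSS output later in these slides.

```python
def coefficient_of_determination(r):
    """Proportion of variability in one measure explained by the other (r squared)."""
    return r ** 2

for r in (0.90, 0.686, 0.168):
    r_squared = coefficient_of_determination(r)
    print(f"r = {r:.3f}   r^2 = {r_squared:.3f}   ({r_squared:.1%} explained)")
```

A correlation of 0.90 explains 81% of the variability, while a correlation of 0.168 explains less than 3%.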

Page 12: 08   relationships among variables

Regression

When two variables are related (correlated), it is possible to predict a person’s score on one variable (Y) by knowing their score on the second variable (X)

Page 13: 08   relationships among variables


Page 14: 08   relationships among variables

This scatterplot illustrates that there is a strong, positive relationship between fat-free body mass and daily energy expenditure

Correlations

                                                     Fat-Free     Energy
                                                     Mass (kg)    Expenditure (kcal)
Fat-Free Mass (kg)           Pearson Correlation     1            .981**
                             Sig. (2-tailed)                      .000
                             N                       7            7
Energy Expenditure (kcal)    Pearson Correlation     .981**       1
                             Sig. (2-tailed)         .000
                             N                       7            7

**. Correlation is significant at the 0.01 level (2-tailed).

Page 15: 08   relationships among variables

Regression Line (Line of Best Fit)

The regression line is a line that best describes the trend in the data

This line is as close as possible to the data points

The equation for this line is:

Y' = bX + C (b = slope, C = Y intercept)

Page 16: 08   relationships among variables

Fitting a Regression Line

Page 17: 08   relationships among variables

Simple Prediction

Tests have been developed to predict VO2 max from the time it takes a person to run 1.5 miles

A person's VO2 max can thus be predicted from their 1.5 mile run time because a prediction or regression equation has been developed

Page 18: 08   relationships among variables

The simple linear prediction or regression equation takes the following form:

Y' = a + bX

Y' = predicted value

a = intercept of the regression line (Y intercept)

b = slope of the regression line (change in Y with each change in X)

X = score on the predictor variable
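One common way to obtain a and b is the least-squares criterion, which picks the line that minimizes the squared vertical distances to the data points. The slides do not show the computation, so the following is a sketch under that assumption; the data and the function name `fit_line` are hypothetical.

```python
def fit_line(x, y):
    """Return (a, b) for Y' = a + bX using the least-squares criterion."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares for X, in deviation form
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    b = sp / ss_x              # slope: change in Y' per one-unit change in X
    a = mean_y - b * mean_x    # intercept: the line passes through (mean_x, mean_y)
    return a, b

# Hypothetical scores for five subjects (illustration only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
predicted = a + b * 6   # predicted Y for a new subject with X = 6
print(round(a, 2), round(b, 2), round(predicted, 2))  # 2.2 0.6 5.8
```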

Page 19: 08   relationships among variables

Determining Error in Prediction

Unless two variables are perfectly related (-1.00 or +1.00) there will always be error associated with a prediction equation

We find the standard deviation of this error, the standard error of prediction ($s_{y \cdot x}$), using the following formula:

$$ s_{y \cdot x} = s_y \sqrt{1 - r^2} $$
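The formula implies that prediction error shrinks as the correlation strengthens and disappears only when the correlation is perfect. A short sketch, assuming a hypothetical standard deviation of 10 for the Y scores:

```python
import math

def standard_error_of_prediction(s_y, r):
    """s_y.x = s_y * sqrt(1 - r^2), where s_y is the SD of Y and r is the correlation."""
    return s_y * math.sqrt(1 - r ** 2)

s_y = 10.0  # hypothetical standard deviation of the Y scores
for r in (0.0, 0.5, 0.9, 1.0):
    see = standard_error_of_prediction(s_y, r)
    print(f"r = {r:.1f}   standard error of prediction = {see:.3f}")
```

With r = 0 the error equals the standard deviation of Y itself (knowing X does not help), and with r = 1.0 it is zero.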

Page 20: 08   relationships among variables

Prediction and Residuals

Page 21: 08   relationships among variables

A predicted score (Y') ± $s_{y \cdot x}$ yields a range of scores within which a person’s true score on the predicted variable is likely to lie

If the standard error of prediction may be interpreted as the standard deviation of residuals, what are the odds that a person’s true score lies between Y' ± $s_{y \cdot x}$?

Page 22: 08   relationships among variables

The standard error of prediction for percent body fat estimated using the skinfold method is ±3.5%

If a person has their percent body fat estimated at 12%, between what two values does their true body fat lie (95% probability)?
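One way to work this out is to treat the residuals as normally distributed, so that about 95% of true scores fall within 1.96 standard errors of the predicted score (often rounded to 2). The normality assumption and the function name below are not from the slides; this is a sketch only.

```python
def prediction_range(predicted, see, z=1.96):
    """Return (low, high) for predicted +/- z standard errors of prediction."""
    margin = z * see
    return predicted - margin, predicted + margin

# Values from the slide: estimate of 12% body fat, standard error of 3.5%
low, high = prediction_range(12.0, 3.5)
print(f"{low:.2f}% to {high:.2f}%")  # 5.14% to 18.86%
```

Calling `prediction_range(12.0, 3.5, z=1.0)` gives the narrower range of 8.5% to 15.5%, which under the same assumption covers about 68% of cases.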

Page 23: 08   relationships among variables

Which of the following will more precisely predict job performance?

A: r = 0.168 B: r = 0.686

Page 24: 08   relationships among variables

Sample SPSS Output

Here is the SPSS output for regressing Work Simulation Job Performance (Dependent Variable) against Supervisor Ratings (Independent Variable)

Coefficients(a)

                             Unstandardized           Standardized
                             Coefficients             Coefficients
Model                        B          Std. Error    Beta            t          Sig.
1    (Constant)              -1.156     .675                          -1.712     .089
     Supervisor Ratings      .033       .016          .168            2.053      .042

a. Dependent Variable: Work Simulation Job Performance

Page 25: 08   relationships among variables

This information can be used to create a prediction (regression) equation for predicting work performance of future applicants from supervisor ratings

Y' = -1.156 + 0.033X

Page 26: 08   relationships among variables

Work Simulation Job Performance may also be predicted from Arm Strength

Here is the SPSS output:

Coefficients(a)

                             Unstandardized           Standardized
                             Coefficients             Coefficients
Model                        B          Std. Error    Beta            t          Sig.
1    (Constant)              -4.095     .392                          -10.454    .000
     Arm Strength (lbs)      .055       .005          .686            11.353     .000

a. Dependent Variable: Work Simulation Job Performance

Page 27: 08   relationships among variables

This information can be used to create a prediction (regression) equation for predicting work performance of future applicants from arm strength

Y' = -4.095 + 0.055X

Page 28: 08   relationships among variables

We now have two regression equations for predicting Work Simulation Job Performance

Which is the better equation for accurate prediction?

To determine this, we must examine the standard error of prediction for each equation

Page 29: 08   relationships among variables

Standard error of prediction using Supervisor Ratings:

Model Summary

Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .168(a)    .028        .022                 1.66078

a. Predictors: (Constant), Supervisor Ratings

Standard error of prediction using Arm Strength:

Model Summary

Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .686(a)    .471        .467                 1.22582

a. Predictors: (Constant), Arm Strength (lbs)

Which is the better equation?
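The comparison can be sketched in a few lines. The coefficients, correlations, and standard errors below are copied from the SPSS output in these slides; the dictionary layout, the `predict` helper, and the arm strength score of 100 lbs are hypothetical and used only for illustration.

```python
# Values copied from the SPSS Coefficients and Model Summary tables
models = {
    "Supervisor Ratings": {"a": -1.156, "b": 0.033, "r": 0.168, "see": 1.66078},
    "Arm Strength (lbs)": {"a": -4.095, "b": 0.055, "r": 0.686, "see": 1.22582},
}

def predict(name, x):
    """Apply Y' = a + bX for the named predictor."""
    m = models[name]
    return m["a"] + m["b"] * x

# The equation with the smaller standard error of prediction is the more precise one
best = min(models, key=lambda name: models[name]["see"])
print(best)

# Hypothetical applicant with an arm strength score of 100 lbs
print(round(predict("Arm Strength (lbs)", 100), 3))
```

The predictor with the higher correlation also has the smaller standard error, which is consistent with the formula for the standard error of prediction.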

Page 30: 08   relationships among variables

Multiple Prediction

A prediction formula using a single measure X is usually not very accurate for predicting a person's score on measure Y

Multiple correlation-regression techniques allow us to predict score Y using several X scores

Page 31: 08   relationships among variables

The general form of a two predictor multiple regression equation is:

$Y' = a + b_1 X_1 + b_2 X_2$
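For two predictors, the least-squares coefficients can be computed directly from sums of squares and cross-products. The sketch below assumes the least-squares criterion, which the slides do not state, and uses hypothetical data constructed so that Y = 1 + 2·X1 + 3·X2 exactly, which makes the result easy to verify.

```python
def fit_two_predictors(x1, x2, y):
    """Return (a, b1, b2) for Y' = a + b1*X1 + b2*X2 using least squares."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    # Sums of squares and cross-products in deviation form
    s11 = sum(u * u for u in d1)
    s22 = sum(u * u for u in d2)
    s12 = sum(u * v for u, v in zip(d1, d2))
    s1y = sum(u * v for u, v in zip(d1, dy))
    s2y = sum(u * v for u, v in zip(d2, dy))
    det = s11 * s22 - s12 ** 2   # zero when the two predictors are perfectly correlated
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = my - b1 * m1 - b2 * m2
    return a, b1, b2

# Hypothetical, noise-free data generated from Y = 1 + 2*X1 + 3*X2
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [9, 8, 19, 18, 29]

a, b1, b2 = fit_two_predictors(x1, x2, y)
print(round(a, 6), round(b1, 6), round(b2, 6))
```

Real data would include error, so the fitted coefficients would only approximate any underlying relationship; statistical packages such as SPSS handle this computation for any number of predictors.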

Page 32: 08   relationships among variables

An example of multiple correlation-regression is the prediction of percent body fat from multiple skinfold measurements

DB (g/cc) = 1.0994921 - 0.0009929 (Σ3SKF) + 0.0000023 (Σ3SKF)² - 0.0001392 (age)
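The equation has three predictor terms: the sum of three skinfolds, its square, and age. A minimal sketch of applying it follows. The function name is hypothetical, and the units (skinfolds in millimetres, age in years) and the example subject are assumptions not stated on the slide.

```python
def body_density(sum3_skinfolds, age):
    """Body density (g/cc) from the sum of three skinfolds and age, per the slide's equation.

    Assumed units: skinfolds in mm, age in years.
    """
    s = sum3_skinfolds
    return 1.0994921 - 0.0009929 * s + 0.0000023 * s ** 2 - 0.0001392 * age

# Hypothetical subject: skinfold sum of 60 mm, 25 years old
db = body_density(60, 25)
print(round(db, 4))  # 1.0447
```

Converting body density to percent body fat requires a further equation that is not given on this slide.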

Page 33: 08   relationships among variables

Next Class

Chapters 9 & 11

Mock Proposals in class!