Statistics and Research Methods
DESCRIPTION
Statistics and Research Methods. Mathematics for HMI, Session 2. Correlation: association between scores on two variables, e.g., age and coordination skills in children, or price and quality. Scatter diagram.
TRANSCRIPT
![Page 1: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/1.jpg)
Statistics and Research methods
Mathematics for HMI, Session 2
![Page 2: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/2.jpg)
Correlation
Association between scores on two variables
– e.g., age and coordination skills in children, price and quality
![Page 3: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/3.jpg)
Scatter Diagram
A Scatter Diagram (or scatterplot) is a visual display of the relationship between two variables
Example: A company is interested in whether there is a relationship between the number of employees supervised by a manager and the amount of stress reported by that manager
![Page 4: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/4.jpg)
[Scatter diagram: Stress and Employees Supervised. Horizontal axis: # of Employees Supervised (0 to 12). Vertical axis: Stress Level (0 to 10).]
![Page 5: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/5.jpg)
Cause and Effect
An important type of relationship between two variables: cause and effect
Independent variable = cause
Dependent variable = effect
![Page 6: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/6.jpg)
Correlation and Causality
Three possible directions of causality:
1. X → Y (X causes Y)
2. X ← Y (Y causes X)
3. Z → X and Z → Y (a third variable Z causes both X and Y)
![Page 7: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/7.jpg)
Correlation and Causality
In situations where variables cannot be manipulated experimentally, it is difficult to know whether one is actually causing the other
Example in newspaper: “drinking coffee causes cancer”
– However, a third variable may cause both high coffee consumption and cancer
– Such third variables are called ‘confounds’
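The confound scenario can be illustrated with a small simulation. This is a minimal sketch with made-up data, not something taken from the slides: a hypothetical third variable `z` drives both `x` and `y`, which have no direct effect on each other. The variable names, sample size, and noise levels are all assumptions chosen for illustration.

```python
import random
from statistics import fmean, pstdev


def correlation(xs, ys):
    """Pearson r computed as the average cross-product of Z scores."""
    m_x, sd_x = fmean(xs), pstdev(xs)
    m_y, sd_y = fmean(ys), pstdev(ys)
    cross_products = [((x - m_x) / sd_x) * ((y - m_y) / sd_y)
                      for x, y in zip(xs, ys)]
    return fmean(cross_products)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000                # assumed sample size

# Hypothetical confound Z influences both X and Y; X never influences Y.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = correlation(x, y)

# Subtracting the confound's contribution leaves only independent noise.
x_without_z = [xi - zi for xi, zi in zip(x, z)]
y_without_z = [yi - zi for yi, zi in zip(y, z)]
r_after_removing_z = correlation(x_without_z, y_without_z)

print(f"r(X, Y) = {r_xy:.2f}")
print(f"r(X, Y) with Z removed = {r_after_removing_z:.2f}")
```

Under these assumptions X and Y should correlate at roughly .5 even though neither causes the other, and the correlation should drop to near zero once the contribution of Z is removed.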
![Page 8: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/8.jpg)
However, we can still try to predict one variable on the basis of a second variable, even if the causal relationship has not been determined
The variable we predict from is the predictor variable; the variable we predict is the criterion variable
![Page 9: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/9.jpg)
Scatter Diagrams
The independent (or predictor) variable goes on the horizontal (x) axis; the dependent (or criterion) variable on the vertical (y) axis.
![Page 10: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/10.jpg)
[Scatter diagram: Hours of Overtime Worked and Spouse’s Marital Satisfaction. Horizontal axis: Hours of Overtime (0 to 25). Vertical axis: Marital Satisfaction (0 to 10).]
![Page 11: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/11.jpg)
Patterns of Correlation
Linear correlation
Curvilinear correlation
No correlation
Positive correlation
Negative correlation
![Page 12: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/12.jpg)
![Page 13: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/13.jpg)
Degree of Linear Correlation: The Correlation Coefficient
Compute the correlation using Z scores
Cross-product of Z scores
– Multiply a person's Z score on one variable by their Z score on the other variable
Correlation coefficient
– Average of the cross-products of Z scores
![Page 14: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/14.jpg)
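Degree of Linear Correlation: The Correlation Coefficient
Formula for the correlation coefficient (the average of the cross-products of Z scores over N pairs of scores):
$r = \frac{\sum Z_X Z_Y}{N}$
Positive perfect correlation: r = +1
No correlation: r = 0
Negative perfect correlation: r = –1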
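As a worked illustration of the "average of the cross-products of Z scores" definition, here is a minimal Python sketch. The five pairs of scores are illustrative values for the manager example (employees supervised, reported stress) and are an assumption, not data taken from the slides. The sketch uses the population standard deviation (dividing by N), which is what makes the average cross-product equal r.

```python
from statistics import fmean, pstdev


def z_scores(scores):
    """Convert raw scores to Z scores using the population SD (divide by N)."""
    mean = fmean(scores)
    sd = pstdev(scores)
    return [(score - mean) / sd for score in scores]


def correlation(xs, ys):
    """Correlation coefficient: the average cross-product of Z scores."""
    cross_products = [zx * zy for zx, zy in zip(z_scores(xs), z_scores(ys))]
    return fmean(cross_products)


# Illustrative (assumed) data: employees supervised and stress level.
employees = [6, 8, 3, 10, 8]
stress = [7, 8, 1, 8, 6]

r = correlation(employees, stress)
print(f"r = {r:.3f}")

# Sanity checks on the two extremes: a variable correlates +1 with itself
# and -1 with its own negation.
print(correlation(employees, employees))
print(correlation(employees, [-x for x in employees]))
```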
![Page 15: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/15.jpg)
Correlation and Causality
Correlational research design
– Correlation as a statistical procedure
– Correlation as a kind of research design
![Page 16: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/16.jpg)
Issues in Interpreting the Correlation Coefficient
Statistical significance, e.g. p < .05
Proportionate reduction in error = proportion of variance accounted for
– $r^2$
– Used to compare correlations
![Page 17: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/17.jpg)
Issues in Interpreting the Correlation Coefficient (continued)
Restriction in range
Unreliability of measurement
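Both issues tend to make an observed correlation smaller than the underlying one. The simulation below is a rough sketch under assumed settings (simulated normal scores, an arbitrary cutoff of `x > 1` for the restricted sample, and added random measurement error of an arbitrary size); none of these values come from the slides.

```python
import random
from statistics import fmean, pstdev


def correlation(xs, ys):
    """Pearson r computed as the average cross-product of Z scores."""
    m_x, sd_x = fmean(xs), pstdev(xs)
    m_y, sd_y = fmean(ys), pstdev(ys)
    return fmean([((x - m_x) / sd_x) * ((y - m_y) / sd_y)
                  for x, y in zip(xs, ys)])


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 5000                # assumed sample size

# Simulated scores with a fairly strong linear relation between X and Y.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.7 * xi + rng.gauss(0, 0.7) for xi in x]
r_full = correlation(x, y)

# Restriction in range: keep only cases with high X (assumed cutoff of 1).
restricted = [(xi, yi) for xi, yi in zip(x, y) if xi > 1]
r_restricted = correlation([p[0] for p in restricted],
                           [p[1] for p in restricted])

# Unreliability of measurement: X is observed with added random error.
x_observed = [xi + rng.gauss(0, 1) for xi in x]
r_unreliable = correlation(x_observed, y)

print(f"full range, reliable X:   r = {r_full:.2f}")
print(f"restricted range (X > 1): r = {r_restricted:.2f}")
print(f"unreliable X:             r = {r_unreliable:.2f}")
```

In this setup both the restricted-range and the unreliable-measure correlations should come out noticeably lower than the full-sample correlation.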
![Page 18: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/18.jpg)
Correlation in Research Articles
Scatter diagrams occasionally shown
Correlation matrix
![Page 19: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/19.jpg)
Regression
Making predictions
– Does knowing a person’s score on one variable allow us to say what their score on a second variable is likely to be?
The method we use to make predictions is called regression
When scores on one variable are used to predict scores on another variable, it is called bivariate regression (two variables)
When scores on two or more variables are used to predict scores on another variable, it is called multiple regression
![Page 20: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/20.jpg)
Naming (two variables)

| | Variable Predicted From | Variable Predicted To |
| --- | --- | --- |
| Name | Independent Variable | Dependent Variable |
| Alternative Name | Predictor Variable | Criterion Variable |
| Symbol | X | Y |
| Example | Number of hours slept night before | Happy mood that day |
![Page 21: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/21.jpg)
• These two variables correlate positively
• People who drink a lot of coffee tend to be happy, and people who do not drink much coffee tend to be unhappy
• Preview: The line is called a regression line, and represents the estimated linear relationship between the two variables. Notice that the slope of the line is positive in this example.
![Page 22: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/22.jpg)
The Regression Line
Relation between predictor variable and predicted values of the criterion variable
Formula: $Y = a + (b)(X)$
Slope of the regression line
– Equals b, the raw-score regression coefficient
Intercept of the regression line
– Equals a, the regression constant
Method of least squares to derive a and b
![Page 23: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/23.jpg)
Method of least squares
a and b derived by:
– least squares method (drawing)
– line through $M_X$ and $M_Y$
![Page 24: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/24.jpg)
The Regression Line
$Y = a + (b)(X)$
where $b = (\beta)(SD_Y / SD_X) = (r)(SD_Y / SD_X)$ and $a = M_Y - (b)(M_X)$
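The two formulas for b and a can be applied directly to a small data set. The following is a minimal sketch, assuming the same illustrative manager data as before (the numbers are not taken from the slides) and population standard deviations (dividing by N); the function name `regression_line` is introduced here for illustration only.

```python
from statistics import fmean, pstdev


def regression_line(xs, ys):
    """Return (a, b) for Y = a + (b)(X), using b = r * SD_Y / SD_X."""
    m_x, m_y = fmean(xs), fmean(ys)
    sd_x, sd_y = pstdev(xs), pstdev(ys)
    # r as the average cross-product of Z scores
    r = fmean([((x - m_x) / sd_x) * ((y - m_y) / sd_y)
               for x, y in zip(xs, ys)])
    b = r * sd_y / sd_x  # slope: raw-score regression coefficient
    a = m_y - b * m_x    # intercept: regression constant
    return a, b


# Illustrative (assumed) data: employees supervised and stress level.
employees = [6, 8, 3, 10, 8]
stress = [7, 8, 1, 8, 6]

a, b = regression_line(employees, stress)
print(f"a = {a:.3f}, b = {b:.3f}")

# The line passes through the point (M_X, M_Y):
print(a + b * fmean(employees), fmean(stress))
```

Because a is defined as $M_Y - (b)(M_X)$, the fitted line always passes through the point of the two means, which the last line checks.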
![Page 25: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/25.jpg)
Bivariate Raw Score Prediction
Direct raw-score prediction model
– Predicted raw score (on criterion variable) = regression constant plus the result of multiplying a raw-score regression coefficient by the raw score on the predictor variable
– Formula: $\hat{Y} = a + (b)(X)$
– The “hat” over Y means “predicted”
![Page 26: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/26.jpg)
Bivariate prediction with Z scores
Given the Z score for X, what is the Z score for Y?
We use the prediction model:
$\hat{Z}_Y = (\beta)(Z_X)$
where $\beta$ (beta) is the “standardized regression coefficient”
It’s also called “beta weight”, because it tells us how much “weight” to give to $Z_X$ when making a prediction for $Z_Y$.
The “hat” over $Z_Y$ means “predicted”.
![Page 27: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/27.jpg)
What is $\beta$?
It turns out that the best value to use for $\beta$ in the prediction model is r, the (Pearson) correlation coefficient
Thus, the bivariate regression model is $\hat{Z}_Y = (r)(Z_X)$
When r = 1, $\hat{Z}_Y = Z_X$; when r = -1, $\hat{Z}_Y = -Z_X$
When r = 0: no relation; $\hat{Z}_Y = 0$
“best guess” for Y is the mean score
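A short sketch of prediction with Z scores: convert X to a Z score, multiply by r, and convert the predicted Z score back to a raw score on Y. The data are the same illustrative (assumed) manager scores used for demonstration, and the function name `predict_via_z` is hypothetical.

```python
from statistics import fmean, pstdev


def predict_via_z(x, xs, ys):
    """Predict a raw Y score for raw score x using the Z-score model."""
    m_x, sd_x = fmean(xs), pstdev(xs)
    m_y, sd_y = fmean(ys), pstdev(ys)
    r = fmean([((xi - m_x) / sd_x) * ((yi - m_y) / sd_y)
               for xi, yi in zip(xs, ys)])
    z_x = (x - m_x) / sd_x        # Z score on the predictor
    z_y_hat = r * z_x             # predicted Z score on the criterion
    return m_y + z_y_hat * sd_y   # back to a raw score on Y


# Illustrative (assumed) data: employees supervised and stress level.
employees = [6, 8, 3, 10, 8]
stress = [7, 8, 1, 8, 6]

# A manager at the mean of X (Z_X = 0) is predicted to be at the mean of Y.
print(predict_via_z(fmean(employees), employees, stress))

# Prediction for a manager supervising 10 employees.
print(predict_via_z(10, employees, stress))
```

The result agrees with the raw-score formula, since $b = (r)(SD_Y / SD_X)$ and $a = M_Y - (b)(M_X)$ are just the Z-score model rewritten in raw-score units.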
![Page 28: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/28.jpg)
Proportionate Reduction in Error
We want a measure of how accurately our regression model (the raw-score prediction formula) predicts the data
We can compare the error we make when predicting with our regression model, $SS_{Error}$, to the error we would make if we did not have the model, $SS_{Total}$
![Page 29: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/29.jpg)
Proportionate Reduction in Error
Error
– Actual score minus the predicted score: $Error = Y - \hat{Y}$, so $Error^2 = (Y - \hat{Y})^2$
$SS_{Error}$ = sum of squared errors using the prediction model = $\sum (Y - \hat{Y})^2$
$SS_{Total}$ = sum of squared errors when predicting from the mean = $\sum (Y - M_Y)^2$
![Page 30: Statistics and Research methods](https://reader034.vdocument.in/reader034/viewer/2022042519/5681633f550346895dd3cfd6/html5/thumbnails/30.jpg)
Error and Proportionate Reduction in Error
Formula for proportionate reduction in error:
$\text{Proportionate reduction in error} = \frac{SS_{Total} - SS_{Error}}{SS_{Total}}$
Proportionate reduction in error = $r^2$
Proportion of variance accounted for
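The claim that the proportionate reduction in error equals $r^2$ can be checked numerically. The sketch below again uses the illustrative (assumed) manager data, fits the regression line with $b = (r)(SD_Y / SD_X)$ and $a = M_Y - (b)(M_X)$, and compares the two quantities.

```python
from statistics import fmean, pstdev

# Illustrative (assumed) data: employees supervised and stress level.
employees = [6, 8, 3, 10, 8]
stress = [7, 8, 1, 8, 6]

m_x, m_y = fmean(employees), fmean(stress)
sd_x, sd_y = pstdev(employees), pstdev(stress)
r = fmean([((x - m_x) / sd_x) * ((y - m_y) / sd_y)
           for x, y in zip(employees, stress)])

# Regression line from the correlation and the two standard deviations.
b = r * sd_y / sd_x
a = m_y - b * m_x
predicted = [a + b * x for x in employees]

# Squared error with the model versus squared error predicting from the mean.
ss_error = sum((y - y_hat) ** 2 for y, y_hat in zip(stress, predicted))
ss_total = sum((y - m_y) ** 2 for y in stress)

reduction = (ss_total - ss_error) / ss_total
print(f"SS_Error = {ss_error:.3f}, SS_Total = {ss_total:.3f}")
print(f"proportionate reduction in error = {reduction:.4f}")
print(f"r squared = {r ** 2:.4f}")
```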