Chapter 7: Scatterplots, Association, and Correlation
TRANSCRIPT
![Page 1: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/1.jpg)
Chapter 7
Scatterplots, Association, and Correlation
![Page 2: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/2.jpg)
Examining Relationships
Relationship between two variables. Examples:
• Height and Weight
• Alcohol and Body Temperature
• SAT Verbal Score and SAT Math Score
• High School GPA and College GPA
![Page 3: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/3.jpg)
Two Types of Variables
Response Variable (Dependent): measures an outcome of the study.
Explanatory Variable (Independent): used to explain the response variable.
Example: Alcohol and Body Temp
• Explanatory Variable: Alcohol
• Response Variable: Body Temperature
![Page 4: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/4.jpg)
Two Types of Variables
Calling a variable explanatory does not mean that it causes the response variable. It only helps explain the response.
Sometimes there are no true response or explanatory variables. Examples:
• Height and Weight
• SAT Verbal and SAT Math Scores
![Page 5: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/5.jpg)
Graphing Two Variables
Plot of explanatory variable vs. response variable:
• The explanatory variable goes on the horizontal axis (x).
• The response variable goes on the vertical axis (y).
• If there are no true response and explanatory variables, you can plot the variables on either axis.
This plot is called a scatterplot. It can only be used if the explanatory and response variables are both quantitative.
![Page 6: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/6.jpg)
Scatterplots
Scatterplots show patterns, trends, and relationships.
When interpreting a scatterplot (i.e., describing the relationship between two variables), always look at the following:
Overall Pattern
• Form
• Direction
• Strength
Deviations from the Pattern
• Outliers
![Page 7: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/7.jpg)
Interpreting Scatterplots
Form: Is the plot linear or is it curved?
Strength: Does the plot follow the form very closely, or is there a lot of scatter (variation)?
![Page 8: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/8.jpg)
Interpreting Scatterplots
Direction: Is the plot increasing or is it decreasing?
Positively Associated
• Above (below) average values of one variable tend to be associated with above (below) average values of the other variable.
Negatively Associated
• Above (below) average values of one variable tend to be associated with below (above) average values of the other variable.
![Page 9: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/9.jpg)
Example – Scatterplot
The following data come from a survey conducted in the U.S. and in 10 countries of Western Europe to determine the percentage of teenagers who had used marijuana and other drugs.
![Page 10: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/10.jpg)
Example – Scatterplot
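Percent who have used:

| Country | Marijuana | Other Drugs |
|---|---|---|
| Czech Republic | 22 | 4 |
| Denmark | 17 | 3 |
| England | 40 | 21 |
| Finland | 5 | 1 |
| Ireland | 37 | 16 |
| Italy | 19 | 8 |
| Northern Ireland | 23 | 14 |
| Norway | 6 | 3 |
| Portugal | 7 | 3 |
| Scotland | 53 | 31 |
| United States | 34 | 24 |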
![Page 11: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/11.jpg)
Example – Scatterplot
[Figure: scatterplot titled "Percent who have used Marijuana vs Other Drugs". Horizontal axis: 0 to 60; vertical axis: 0 to 35.]
![Page 12: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/12.jpg)
Example – Scatterplot
The variables are interchangeable in this example. Here, Percent of Marijuana is being used as the explanatory variable, so it is on the x-axis.
Percent of Other Drugs is being used as the response, so it is on the y-axis.
![Page 13: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/13.jpg)
Example - Scatterplot
• The form is linear.
• The strength is fairly strong.
• The direction is positive, since larger values on the x-axis tend to go with larger values on the y-axis.
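One rough way to check the direction against the definition of positive association is to count how many countries sit on the same side of both averages. The sketch below uses the survey percentages listed alphabetically by country; the helper name `same_side_count` is made up for illustration and is not part of the original slides.

```python
from statistics import mean

# Percent who have used, listed alphabetically by country
# (Czech Republic first, United States last).
marijuana = [22, 17, 40, 5, 37, 19, 23, 6, 7, 53, 34]
other_drugs = [4, 3, 21, 1, 16, 8, 14, 3, 3, 31, 24]


def same_side_count(xs, ys):
    """Count points that are above both means or below both means."""
    x_bar, y_bar = mean(xs), mean(ys)
    # A positive product of deviations means both values are on the same side.
    return sum(1 for x, y in zip(xs, ys) if (x - x_bar) * (y - y_bar) > 0)


count = same_side_count(marijuana, other_drugs)
print(f"{count} of {len(marijuana)} countries are on the same side of both means")
```

Most points landing on the same side of both means is what a positive direction looks like numerically. This count says nothing about form or strength, which still have to be judged from the plot.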
![Page 14: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/14.jpg)
Example - Scatterplot
Negative association: outside temperature and amount of natural gas used.
[Figure: scatterplot of Gas (vertical axis, 0 to 10) against Temp (horizontal axis, -5.0 to 15.0).]
![Page 15: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/15.jpg)
Correlation
The strength of the linear relationship between two quantitative variables can be described numerically
This numerical method is called correlation
Correlation is denoted by r
![Page 16: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/16.jpg)
Correlation
A way to measure the strength of the linear relationship between two quantitative variables.
$$r = \frac{1}{n-1} \sum \frac{(x - \bar{x})(y - \bar{y})}{s_x s_y}$$
![Page 17: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/17.jpg)
Correlation
Steps to calculate correlation:
• Calculate the mean of x and y.
• Calculate the standard deviation for x and y.
• Calculate $\sum (x - \bar{x})(y - \bar{y})$.
• Plug all numbers into the formula.
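The steps above can be sketched as a small Python function. This is only an illustration: the function name `correlation` and the sample data (`hours`, `score`) are invented here, and the standard deviations use the n - 1 divisor to match the formula for r.

```python
from math import sqrt


def correlation(xs, ys):
    """Compute r by following the listed steps."""
    n = len(xs)
    # Step 1: means of x and y.
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Step 2: sample standard deviations (divide by n - 1).
    s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # Step 3: sum of the products of the deviations.
    sum_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Step 4: plug everything into the formula.
    return sum_products / ((n - 1) * s_x * s_y)


# Made-up data, for illustration only.
hours = [1, 2, 3, 4, 5]
score = [52, 60, 61, 70, 75]
r = correlation(hours, score)
print(round(r, 3))
```

Keeping the four steps as separate lines makes it easy to compare each intermediate value with a hand calculation.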
![Page 18: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/18.jpg)
Correlation
[Figure: scatterplot titled "Femur vs. Humerus". Horizontal axis: Femur (0 to 80); vertical axis: Humerus (0 to 100).]
![Page 19: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/19.jpg)
Calculating r.
Femur (x): 38 56 59 63 74
Humerus (y): 41 63 70 72 84
Set up a table with columns for $x$, $y$, $x - \bar{x}$, $y - \bar{y}$, $(x - \bar{x})^2$, $(y - \bar{y})^2$, and $(x - \bar{x})(y - \bar{y})$.
![Page 20: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/20.jpg)
Calculating r.
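| $x$ | $y$ | $x - \bar{x}$ | $y - \bar{y}$ | $(x - \bar{x})^2$ | $(y - \bar{y})^2$ | $(x - \bar{x})(y - \bar{y})$ |
|---|---|---|---|---|---|---|
| 38 | 41 | -20 | -25 | 400 | 625 | 500 |
| 56 | 63 | -2 | -3 | 4 | 9 | 6 |
| 59 | 70 | 1 | 4 | 1 | 16 | 4 |
| 63 | 72 | 5 | 6 | 25 | 36 | 30 |
| 74 | 84 | 16 | 18 | 256 | 324 | 288 |
| 290 | 330 | 0 | 0 | 686 | 1010 | 828 |

The last row holds the column totals.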
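The table can be reproduced with a few lines of Python, which is a convenient way to check the column totals. This is a sketch using the femur and humerus values from the example; the variable names are chosen here for readability.

```python
femur = [38, 56, 59, 63, 74]    # x
humerus = [41, 63, 70, 72, 84]  # y

x_bar = sum(femur) / len(femur)
y_bar = sum(humerus) / len(humerus)

# One row per observation:
# x, y, x - x_bar, y - y_bar, (x - x_bar)^2, (y - y_bar)^2, product of deviations
rows = []
for x, y in zip(femur, humerus):
    dx, dy = x - x_bar, y - y_bar
    rows.append((x, y, dx, dy, dx ** 2, dy ** 2, dx * dy))

# Column totals, in the same order as the columns.
totals = [sum(column) for column in zip(*rows)]

for row in rows + [tuple(totals)]:
    print("  ".join(f"{value:7.0f}" for value in row))
```

The two deviation columns always sum to zero, which is a quick sanity check before moving on to the squared columns.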
![Page 21: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/21.jpg)
Calculating r
Recall: $\bar{y} = \frac{\sum y}{n}$
So,
$$\bar{x} = \frac{290}{5} = 58 \qquad \bar{y} = \frac{330}{5} = 66$$
![Page 22: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/22.jpg)
Calculating r
Recall: $s_y = \sqrt{\frac{\sum (y - \bar{y})^2}{n - 1}}$
So,
$$s_x = \sqrt{\frac{686}{4}} = 13.1 \qquad s_y = \sqrt{\frac{1010}{4}} = 15.9$$
![Page 23: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/23.jpg)
Calculating r.
Put everything into the formula:
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n-1)\, s_x s_y} = \frac{828}{(5-1)(13.1)(15.9)} = 0.994$$
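The value 0.994 uses standard deviations rounded to one decimal place (13.1 and 15.9). The sketch below repeats the calculation both ways to show how much that rounding matters; the variable names are chosen here for illustration.

```python
from math import sqrt

n = 5
sum_sq_x = 686      # sum of (x - x_bar)^2
sum_sq_y = 1010     # sum of (y - y_bar)^2
sum_products = 828  # sum of (x - x_bar)(y - y_bar)

# Using the standard deviations rounded to one decimal place.
r_rounded = sum_products / ((n - 1) * 13.1 * 15.9)

# Carrying full precision through the standard deviations.
s_x = sqrt(sum_sq_x / (n - 1))
s_y = sqrt(sum_sq_y / (n - 1))
r_full = sum_products / ((n - 1) * s_x * s_y)

print(f"rounded inputs: {r_rounded:.4f}")
print(f"full precision: {r_full:.4f}")
```

The two results differ only in the third decimal place, so the conclusion (a very strong positive linear relationship) is the same either way.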
![Page 24: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/24.jpg)
Properties of r
r has no units (i.e., just a number)
Measures the strength of a LINEAR association between two quantitative variables.
• If the data have a curvilinear relationship, the correlation may not be strong even if the data follow the curve very closely.
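A small made-up example shows how extreme this can be: points that lie exactly on the curve y = x² over a symmetric range of x give a correlation of zero. The `correlation` helper below is a sketch written for this illustration; it divides by the square root of the product of the two sums of squares, which is algebraically the same as dividing by (n - 1) times s_x times s_y.

```python
from math import sqrt


def correlation(xs, ys):
    """Correlation r, written in terms of sums of squared deviations."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up data: every point lies exactly on the curve y = x^2.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curve = correlation(xs, ys)
print(r_curve)
```

The relationship here is perfectly predictable, yet r does not detect it because the pattern is not linear. This is why the scatterplot should be checked before r is interpreted.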
![Page 25: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/25.jpg)
Properties of r
r always ranges in value from –1 to 1.
• r = 1 indicates that the points lie exactly on a straight increasing line.
• r = -1 indicates that the points lie exactly on a straight decreasing line.
• r = 0 indicates no LINEAR relationship.
• As r moves away from 0, the linear relationship between the variables is stronger.
![Page 26: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/26.jpg)
Properties of r
• Changing the scale (units) of x or y will not change the value of r.
• r is not resistant to outliers.
• Strong correlation ≠ causation: a strong linear relationship between two variables is NOT proof of a causal relationship!
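The first two properties can be checked numerically with the femur and humerus data. In the sketch below, the scale factor of 10 and the extra point (70, 30) are assumptions invented for illustration and are not part of the original data. The `correlation` helper is likewise just a compact way to compute r.

```python
from math import sqrt


def correlation(xs, ys):
    """Correlation r, written in terms of sums of squared deviations."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


femur = [38, 56, 59, 63, 74]
humerus = [41, 63, 70, 72, 84]
r_original = correlation(femur, humerus)

# Change of scale: multiply every x by the same positive constant
# (the factor 10 is arbitrary, e.g. a change of units).
r_rescaled = correlation([10 * x for x in femur], humerus)

# Hypothetical outlier, not part of the original data.
r_with_outlier = correlation(femur + [70], humerus + [30])

print(f"original:     {r_original:.3f}")
print(f"rescaled x:   {r_rescaled:.3f}")
print(f"with outlier: {r_with_outlier:.3f}")
```

Rescaling leaves r unchanged, while a single unusual point pulls it down sharply. Note that multiplying by a negative constant would flip the sign of r, so "changing the scale" here means an ordinary change of units.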
![Page 27: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/27.jpg)
Reading JMP Output
The following is some output from JMP where I considered Blood Alcohol Content and Number of Beers. The explanatory variable is the number of beers. Blood alcohol content is the response variable.
![Page 28: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/28.jpg)
Reading JMP Output
[Figure: JMP scatterplot titled "Bivariate Fit of BAC By Beers". Horizontal axis: Beers (0 to 10); vertical axis: BAC (0 to 0.2).]
![Page 29: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/29.jpg)
Reading JMP Output
| Summary of Fit | |
|---|---|
| RSquare | 0.803536 |
| RSquare Adj | 0.788424 |
| Root Mean Square Error | 0.02092 |
| Mean of Response | 0.076 |
| Observations (or Sum Wgts) | 15 |
![Page 30: Chapter 7 Scatterplots, Association, and Correlation](https://reader036.vdocument.in/reader036/viewer/2022081506/56649f3f5503460f94c5ff41/html5/thumbnails/30.jpg)
Reading JMP Output
RSquare = $r^2$
This means $r = \sqrt{\text{RSquare}} = \sqrt{0.803536} = 0.896$. I know this is positive because the scatterplot has a positive direction.
The Mean of the Response is the mean of the y’s, or $\bar{y}$.
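Recovering r from RSquare takes a square root plus a sign read off the scatterplot. A minimal sketch, where the helper `r_from_r_square` and its `direction` argument are hypothetical names introduced here (they are not JMP features):

```python
from math import sqrt

r_square = 0.803536  # RSquare from the Summary of Fit output


def r_from_r_square(r_square, direction):
    """Recover r from RSquare; direction is +1 (increasing) or -1 (decreasing)."""
    return direction * sqrt(r_square)


# The BAC vs. Beers scatterplot has a positive direction, so the sign is +1.
r = r_from_r_square(r_square, direction=+1)
print(round(r, 3))
```

The square root alone cannot tell a positive association from a negative one, so the sign always has to come from the plot.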