scatterplotsskwon.org/scatterplot.pdfa strong association between two variables is not enough to...
TRANSCRIPT
![Page 1: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/1.jpg)
Scatterplots
![Page 2: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/2.jpg)
Example: A study is done to see how the number of beers that a student drinks predicts his/her blood
alcohol content (BAC). Results of 16 students:
How many variables do we have?
![Page 3: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/3.jpg)
Variables and Individuals
• Any characteristics are called variables.
Beers, Blood Alcohol Content. There are two variables.
• Individuals are the objects described by a set of data.
16 individuals
![Page 4: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/4.jpg)
Example: A study is done to see how the number of beers that a student drinks predicts his/her blood
alcohol content (BAC). Results of 16 students:
Are two variables related to each other?
![Page 5: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/5.jpg)
Example: A study is done to see how the number of beers that a student drinks predicts his/her blood
alcohol content (BAC). Results of 16 students:
Are two variables related to each other?
Means
Does the number of consumption of beers
Increase, decrease or doesn’t affect the blood alcohol content?
![Page 6: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/6.jpg)
Example: A study is done to see how the number of beers that a student drinks predicts his/her blood
alcohol content (BAC). Results of 16 students:
Are two variables related to each other?
Yes, although it is not exactly related to each other in a predictable manner, they are related to each other.
Roughly,
Increase of beers
Increase of BAC
![Page 7: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/7.jpg)
Scatterplot, Explanatory Variable, Response Variable
A scatterplot( graph of dots ) is a graph that shows the relationship between two numerical variables, measured on the individuals. Number of dots=
number of individuals.
![Page 8: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/8.jpg)
Scatterplot, Explanatory Variable, Response Variable
• Explanatory variable (x-axis) explains, or causes, the change in another variable.
• Response variable (y-axis) measures the outcome, or response to the change
![Page 9: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/9.jpg)
Learning Objective
Relate the scatterplot graph with a straight line so that we can predict the results which don’t appear as dots in the scatterplot graph.
The straight line will be called a regression line.
![Page 10: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/10.jpg)
About straight line
𝑦 = 𝑎𝑥 + 𝑏
![Page 11: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/11.jpg)
Straight Line 𝒚 = 𝒂𝒙 + 𝒃
𝑎(the number in front of 𝑥) or 𝑏: coefficient
𝑥 ( Unknown number ): variable
𝑏 ( the number which is not associated with 𝑥 ): 𝒚-intercept
![Page 12: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/12.jpg)
Example(Numerical Approach)
(Positive slope)
1. Let 𝑦 = 2𝑥 + 1 and 𝑥 = 0. What is the corresponding 𝑦 − value?
2. Let 𝑦 = 2𝑥 + 1 and 𝑥 = 2. What is the corresponding 𝑦 − value?
3. Plot the graph of the equation.
(Negative slope)
1. Let 𝑦 = −3𝑥 + 5 and 𝑥 = 1. What is the corresponding 𝑦 −value?
2. Let 𝑦 = −3𝑥 + 5 and 𝑥 = 3. What is the corresponding 𝑦 −value?
3. Plot the graph of the equation.
![Page 13: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/13.jpg)
Slope of a Straight Line 𝑦 = 𝒂𝑥 + 𝑏
Positive slope 𝑎 > 0
𝑎 is a positive number.
Rise to the right
Negative slope 𝑎 < 0
𝑎 is a negative number.
Falls to the right
![Page 14: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/14.jpg)
𝒚 −intercept 𝑦 = 𝑎𝑥 + 𝑏
𝑦 = 3𝑥 + 2 𝑦 = −2𝑥 + 1
![Page 15: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/15.jpg)
Going back to Scatterplot…
![Page 16: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/16.jpg)
Terminology (Positive Association≈ positive slope)
Suppose that the given scatterplot looks roughly like a straight line.
Two variables are positively associated if an increase of explanatory variable tends to accompany an increase in the response variable.
![Page 17: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/17.jpg)
Terminology (Negative Association≈ Negative slope )
Suppose that the given scatterplot looks roughly like a straight line.
Two variables are negatively associated if an increase of explanatory variable tends to accompany a decrease in the response variable.
![Page 18: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/18.jpg)
Positive Association Negative Association
![Page 19: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/19.jpg)
Regression Line to predict the value of 𝑦 for a given value of 𝑥
A regression line is a straight line that describes how a response variable 𝑦 changes as an explanatory variable 𝑥 changes.
![Page 20: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/20.jpg)
Regression Line to predict the value of 𝑦 for a given value of 𝑥
A regression line can be written as follows:
𝑥: explanatory variable
𝑦 : response variable
y mx b
![Page 21: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/21.jpg)
Correlation r
Measures the direction and strength of the straight-line relationship between two numerical variables.
![Page 22: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/22.jpg)
Correlation r A correlation r is always a number between −1 and 1.
![Page 23: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/23.jpg)
Correlation r
It has the same sign as the slope of a regression line.
r > 0 for positive association (increase in one variable causes an increase in the other).
![Page 24: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/24.jpg)
Correlation r
It has the same sign as the slope of a regression line.
r < 0 for negative association (increase in one variable causes a decrease in the other)
![Page 25: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/25.jpg)
Correlation r
Perfect correlation r = 1 or r = −1 occurs only when all points lie exactly on a straight line.
![Page 26: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/26.jpg)
Correlation r
Correlation r = 0 indicates no straight-line relationship.
![Page 27: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/27.jpg)
The correlation moves away from 1 or −1 (toward zero)
as the straight-line relationship gets weaker.
![Page 28: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/28.jpg)
Least-Squares Regression Line
A line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible.
![Page 29: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/29.jpg)
Example of Least Squares Regression Line
https://www.youtube.com/watch?v=xojW6OEDfC4&feature=related
Least squares regression line
(0-4:56)
![Page 30: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/30.jpg)
Equation of the Least-Squares Regression Line
(Skip at the lecture)
From the data for an explanatory variable x and a response variable y for n individuals, we have calculated the means , , and standard deviations sx , sy , as well as their correlation r.
The least-squares regression line is the line:
Predicted
With slope …
And y-intercept …
y mx b
y
x
sm r
s
b y mx
![Page 31: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/31.jpg)
A Few Cautions When Using Correlation and Regression
Both the correlation r and least-squares regression line can be strongly influenced by a few outlying points.
Always make a scatterplot before doing
any calculations.
![Page 32: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/32.jpg)
Above: Regression line before we remove the outlier on the right top.
Below: Regression line after we remove the outlier which was on the right top.
![Page 33: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/33.jpg)
A Few Cautions When Using Correlation and Regression
Often the relationship between two variables is strongly influenced by other variables.
Before conclusions are drawn based on
correlation and regression, other possible
effects of other variables should be
considered.
![Page 34: Scatterplotsskwon.org/ScatterPlot.pdfA strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does](https://reader033.vdocument.in/reader033/viewer/2022042019/5e76ab3657d80f053b6bd7b5/html5/thumbnails/34.jpg)
A Few Cautions When Using Correlation and Regression
A strong association between two variables is not enough to draw conclusions about cause and effect.
Sometimes an observed association really does reflect cause and effect (such as drinking beer causes increased BAC).
Sometimes a strong association is explained by other variables that influence both x and y.
Remember, association does not imply causation.