Scatterplot and trendline

Upload: robert-carter

Post on 03-Jan-2016

Page 1: Scatterplot and trendline. Scatterplot Scatterplot explores the relationship between two quantitative variables. Example:

Scatterplot and trendline

Scatterplot

A scatterplot explores the relationship between two quantitative variables.

Example:

What can we tell from a scatterplot?

Direction of relationship (positive, negative, no correlation)

Strength of relationship (rough guide: strong when |r| is about 0.8 or higher, weak when |r| is about 0.5 or lower)

Form of relationship (linear, quadratic, cubic, etc)

Some examples i

r = 0.5

Weak

Points are scattered around

Positive (upward trend)

Hard to tell the form. Roughly linear?

Some examples ii

r = 0.8

Strong

Points are compact

Positive

Clear linear pattern

Some examples iii

r = 0.2

Very weak, almost no pattern

Points all over the plot

Very hard to tell whether it is positive or negative

Some examples iii (continued)

r = 0

No pattern

Points fall everywhere in the plot

Cannot tell whether there is an upward or downward trend

Some examples iv

r = -0.8

Strong relationship

Negative relationship (downward trend)

Linear pattern

Some examples v

r = -0.2

Not very different from plot iii

What is r?

r is called the correlation coefficient.

There are many different ways of calculating r.

The one that we use most frequently is called the Pearson product-moment correlation coefficient (or simply the Pearson correlation coefficient).

How to calculate r?

Formula to be introduced later.

Other facts about r

Ranges from –1 to +1

Sign shows the direction of the correlation

Absolute value shows the strength of the correlation

*** Only measures linear correlation

Example

y = x^2

r is almost 0 (r = -0.016)

*** But there is clearly a quadratic relationship between x and y!
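The point of this example can be checked with a short sketch. The data here are hypothetical (x values symmetric around 0, with y exactly equal to x squared), not the data behind the slide's r = -0.016, and the helper name pearson_r is only illustrative. It uses the standard Pearson formula, which the slides introduce later.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y depends perfectly on x, but not linearly.
xs = list(range(-10, 11))
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.3f}")  # essentially zero despite the exact quadratic link
```

Because r only measures linear association, a plot of the data is still needed to spot a curved pattern like this one.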

How to use correlation

Make predictions

Given a value of x and the correlation between x and y, we can predict the value of y.

This is an example of model fitting in statistics.

Another classification of variables

In terms of the role of the variables in the model, they are put into two classes:

Independent, explanatory, predictor, x-value

Dependent, response, y-value

What a statistical model does

Gives us a measure of the relationship between two (or more) variables.

Gives us a measure of how well the model performs, since we always have many model choices.

Enables us to make predictions using the relationship identified in the model.

Graphical Illustration of the model

Trendline

r = 0.8

Positive

Strong

Linear

Regression

Regression is one way of fitting a statistical model.

For the above data, we have Y = b0 + b1x + error

b0 is called the intercept

b1 is called the regression coefficient (slope)

Error is a “must have” part in any statistical model

Numeric Example

Data

X: 10 15 20 25 30 35 40 45 50
Y: 41 41 42 38 53 56 59 59 71

r = 0.9194795
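As a check on the value above, here is a minimal sketch that computes the Pearson correlation for this data by hand. The slides defer the formula, so the standard textbook form (covariation of X and Y divided by the square root of the product of their individual variations) is assumed here, and the variable names are illustrative.

```python
import math

# Data from the numeric example.
xs = [10, 15, 20, 25, 30, 35, 40, 45, 50]
ys = [41, 41, 42, 38, 53, 56, 59, 59, 71]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sums of cross-products and squared deviations from the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

r = sxy / math.sqrt(sxx * syy)
print(f"r = {r:.7f}")
```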

Results of a regression i

Intercept = 28.5111

Slope = 0.7533

The line in the middle is called the trendline or regression line.

The vertical distance between an individual point and the line is called the “residual”.

Results of a regression ii

X:         10     15     20     25     30     35     40     45     50
Y:         41     41     42     38     53     56     59     59     71
Y.hat:  36.04  39.81  43.58  47.34  51.11  54.88  58.64  62.41  66.18
Resid:   4.96   1.19  -1.58  -9.34   1.89   1.12   0.36  -3.41   4.82

Y.hat is the predicted value of Y given X and the regression model we got.

Residual = Y - Y.hat, and that is the error in our model.
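The Y.hat and residual rows can be reproduced with a short sketch. It plugs in the rounded intercept (28.5111) and slope (0.7533) reported on the previous slide; the variable names are illustrative.

```python
# Data and rounded coefficients as reported in the slides.
xs = [10, 15, 20, 25, 30, 35, 40, 45, 50]
ys = [41, 41, 42, 38, 53, 56, 59, 59, 71]
intercept, slope = 28.5111, 0.7533

# Predicted value for each X, then residual = observed Y minus predicted Y.
y_hat = [intercept + slope * x for x in xs]
residuals = [y - yh for y, yh in zip(ys, y_hat)]

print("  X   Y   Y.hat   Resid")
for x, y, yh, res in zip(xs, ys, y_hat, residuals):
    print(f"{x:>3} {y:>3} {yh:7.2f} {res:7.2f}")
```

A positive residual means the point lies above the trendline, a negative one means it lies below.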

How do we get the regression model?

We find the intercept and slope that satisfy the following conditions:

The sum of all residuals should be 0

The sum of the squared residuals is minimized
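The two conditions above can be sketched in code. The slides do not show how the minimization is carried out, so this assumes the standard closed-form least-squares solution for one predictor: the slope is the sum of cross-products divided by the sum of squared X deviations, and the intercept makes the line pass through the point of means.

```python
# Data from the numeric example.
xs = [10, 15, 20, 25, 30, 35, 40, 45, 50]
ys = [41, 41, 42, 38, 53, 56, 59, 59, 71]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares estimates for a single predictor.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
sse = sum(e * e for e in residuals)  # the quantity being minimized

print(f"intercept = {intercept:.4f}, slope = {slope:.4f}")
print(f"sum of residuals = {sum(residuals):.6f}")
print(f"sum of squared residuals = {sse:.4f}")
```

For this data the estimates agree with the intercept and slope reported in the slides, and the residuals sum to zero up to floating-point rounding.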

How to measure how good this model is?

One measure is called r-square.

For this model, r^2 = 0.8454425.

It means that, of all the variation observed in the variable Y, about 84.5% is explained by the predictor X. The rest is the error.
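The "share of variation explained" reading can be made concrete with a small sketch. It assumes the usual definition, r-square = 1 - SSE/SST, where SST is the total variation of Y around its mean and SSE is the variation left over in the residuals. The names sse and sst are illustrative, not from the slides.

```python
# Data from the numeric example.
xs = [10, 15, 20, 25, 30, 35, 40, 45, 50]
ys = [41, 41, 42, 38, 53, 56, 59, 59, 71]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares fit with one predictor.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# Total variation in Y, and the part the line leaves unexplained.
sst = sum((y - mean_y) ** 2 for y in ys)
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

r_squared = 1 - sse / sst
print(f"r-square = {r_squared:.7f}")
print(f"explained: {100 * r_squared:.1f}%, error: {100 * sse / sst:.1f}%")
```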

How is r-square related to our measure of correlation?

Hint, it is called… r-squared

Yes, it is the squared value of the correlation between X and Y.

0.9194795^2=0.8454425

Some things to know

This relationship only holds for regression with one predictor.

The trendline or the regression model only works for X values within the range of our data, or not too far from it.

In this case, our X values range from 10 to 50. So we can predict Y using X=26 but not X=126.

Correlation does not imply causality. Example: Children’s shoe size vs reading ability
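The range caveat above (X = 26 is fine, X = 126 is not) can be sketched as a guarded prediction function. The intercept, slope and X range come from the slides. The function name predict and the margin parameter, a stand-in for "not too far from" the range, are hypothetical.

```python
# Coefficients and observed X range taken from the slides.
INTERCEPT = 28.5111
SLOPE = 0.7533
X_MIN, X_MAX = 10, 50

def predict(x, margin=0.0):
    """Return the predicted Y, refusing X values outside the data range.

    `margin` is a hypothetical allowance for values slightly beyond the range.
    """
    if not (X_MIN - margin <= x <= X_MAX + margin):
        raise ValueError(
            f"x={x} is outside the observed range {X_MIN} to {X_MAX}"
        )
    return INTERCEPT + SLOPE * x

print(f"prediction at X=26: {predict(26):.2f}")

try:
    predict(126)
except ValueError as err:
    print("refused:", err)
```

Raising an error is only one design choice. Returning the prediction together with a warning would serve the same purpose.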