OLS Regression


Page 1: OLS Regression

OLS Regression

• What is it?

• Closely allied with correlation: both are concerned with the strength of the linear relationship between two variables

• One variable is specified as the dependent variable

• The other variable is the independent (or explanatory) variable

Page 2: OLS Regression

• Regression Model

• Y = a + bx + e

• What is Y?

• What is a?

• What is b?

• What is x?

• What is e?

• What is Y-hat?

Page 3: OLS Regression

Elements of the Regression Line

• a = Y intercept (what Y is predicted to equal when X = 0)

• b = Slope (indicates the change in Y associated with a unit increase in X)

• e = error (the difference between the observed Y and the predicted Y, Y-hat)
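As a concrete illustration of these elements, here is a minimal sketch; the numbers are made up for the example, not taken from the slides:

```python
# Hypothetical values, just to make a, b, Y-hat, and e concrete.
a = 2.0      # Y-intercept: predicted Y when X = 0
b = 0.5      # slope: change in Y for a one-unit increase in X
x = 4.0      # observed X for one case
y = 5.0      # observed Y for the same case

y_hat = a + b * x   # predicted Y (Y-hat) = 2.0 + 0.5 * 4 = 4.0
e = y - y_hat       # error: observed Y minus predicted Y = 1.0
print(y_hat, e)
```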

Page 4: OLS Regression

Regression

• Has the ability to quantify precisely the relative importance of a variable

• Has the ability to quantify how much variance is explained by a variable(s)

• Used more often than any other statistical technique

Page 5: OLS Regression

The Regression Line

• Y = a + bx + e

• Y = sentence length

• X = prior convictions

• Each point represents the number of priors (X) and sentence length (Y) of a particular defendant

• The regression line is the best fit line through the overall scatter of points
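For example, such a best-fit line can be computed with numpy; this is a minimal sketch in which the priors/sentence numbers are invented for illustration, since the slides do not give the actual data:

```python
import numpy as np

# Made-up data: X = number of prior convictions, Y = sentence length (months).
priors   = np.array([0, 1, 1, 2, 3, 4, 5, 6], dtype=float)
sentence = np.array([6, 9, 8, 12, 14, 18, 21, 24], dtype=float)

# Degree-1 polyfit returns the least-squares slope (b) and intercept (a).
b, a = np.polyfit(priors, sentence, 1)
print(f"best-fit line: Y-hat = {a:.2f} + {b:.2f} * X")
```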

Page 6: OLS Regression

e_i = Y_i − Ŷ_i

Ŷ_i = a + bX_i

e_i = Y_i − (a + bX_i)

X and Y are observed. We need to estimate a and b.

Page 7: OLS Regression

Calculus 101: Least Squares Method and differential calculus

Differentiation is a very powerful tool that is used extensively in model estimation. Practical examples of differentiation are usually in the form of minimization/optimization problems or rate of change problems.

Page 8: OLS Regression

Calculus 101: Calculating the rate of change or slope of a line

For a straight line it is relatively simple to calculate the slope

slope = Δy/Δx = (y_1 − y_0) / (x_1 − x_0)

Page 9: OLS Regression

Calculating the rate of change, or slope, of a curve is a bit harder

Differential Calculus: We have a curve describing the variable Y as some function of the variable X: y = x²

Page 10: OLS Regression

It is possible to find a general expression involving the function f(x) that describes the slopes of the approximating sequence of secant lines

lim (h → 0) [f(x + h) − f(x)] / h

h = x_1 − x_0 (represents a small difference from a point of interest)

Page 11: OLS Regression

Let's take a cost curve example:

C(x) = x²

What is the derivative at x = 3?

= [f(3 + h) − f(3)] / h

= [(3 + h)² − 3²] / h

= [(9 + 6h + h²) − 9] / h

= [6h + h²] / h

= 6 + h = 6 (as h approaches 0)

∆y/∆x = 6, so the slope of C(x) = x² at x = 3 is 6.
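A quick numerical check of this limit; a minimal sketch in which the step sizes are just illustrative choices:

```python
# Numerically verify that the difference quotient for C(x) = x**2 at x = 3
# approaches 6 as h shrinks toward 0.
def C(x):
    return x ** 2

x0 = 3.0
for h in (1.0, 0.1, 0.01, 0.001):
    slope = (C(x0 + h) - C(x0)) / h
    print(f"h = {h:<6} difference quotient = {slope:.4f}")
# The printed slopes (7.0, 6.1, 6.01, 6.001) approach the limit 6.
```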


Page 12: OLS Regression

How does this relate to our regression model, which is a straight line?

Page 13: OLS Regression
Page 14: OLS Regression

How do you draw a line when the line can be drawn in almost any direction?

The Method of Least Squares: drawing the line that minimizes the sum of the squared distances from the line (Σe²)

This is a minimization problem and therefore we can use differential calculus to estimate this line.
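To illustrate the minimization idea, here is a minimal sketch that minimizes Σe² numerically; the data are the small x/y example used in the worked calculation on the following slides, and scipy.optimize is an assumption of the sketch, since the slides do the calculus by hand:

```python
import numpy as np
from scipy.optimize import minimize

# The small data set used in the worked example on the following slides.
x = np.array([0, 1, 2, 3, 4], dtype=float)
y = np.array([1, 3, 2, 4, 5], dtype=float)

# Sum of squared distances (errors) from the line Y = a + b*X.
def sse(params):
    a, b = params
    e = y - (a + b * x)
    return np.sum(e ** 2)

# Numerical minimization of the same quantity the calculus minimizes analytically.
result = minimize(sse, x0=[0.0, 0.0])
print(result.x)   # approximately [1.2, 0.9] -> a = 1.2, b = 0.9
```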

Page 15: OLS Regression

e_i = Y_i − Ŷ_i

Ŷ_i = a + bX_i

e_i = Y_i − (a + bX_i)

X and Y are observed. We need to estimate a and b.

Page 16: OLS Regression

Least Squares Method

x   y   Deviation = y − (a + bx)   Deviation²        Deviation², expanded
0   1   1 − a                      (1 − a)²          1 − 2a + a²
1   3   3 − a − b                  (3 − a − b)²      9 − 6a + a² − 6b + 2ab + b²
2   2   2 − a − 2b                 (2 − a − 2b)²     4 − 4a + a² − 8b + 4ab + 4b²
3   4   4 − a − 3b                 (4 − a − 3b)²     16 − 8a + a² − 24b + 6ab + 9b²
4   5   5 − a − 4b                 (5 − a − 4b)²     25 − 10a + a² − 40b + 8ab + 16b²

Page 17: OLS Regression

• Summing the squares of the deviations yields:

• f(a, b) = 55 − 30a + 5a² − 78b + 20ab + 30b²

• Calculate the first order partial derivatives of f(a,b)

• fb = -78 + 20a + 60b and fa = -30 + 10a + 20b

Page 18: OLS Regression

Set each partial derivative to zero:

Manipulate fa:

• 0 = -30 + 10a + 20b

• 10a = 30 - 20b

• a = 3 − 2b

Page 19: OLS Regression

Substitute (3-2b) into fb:

• 0 = −78 + 20a + 60b = −78 + 20(3 − 2b) + 60b

• = −78 + 60 − 40b + 60b

• = −18 + 20b

• 20b = 18

• b = 0.9

• Slope = 0.9

Page 20: OLS Regression

Substituting this value of b back into fa to obtain a:

• 10a = 30 − 20(0.9)

• 10a = 30 - 18

• 10a = 12

• a = 1.2

• Y-intercept = 1.2
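The same answer can be checked symbolically; here is a minimal sketch using sympy (an assumption of the sketch, since the slides do the algebra by hand):

```python
import sympy as sp

a, b = sp.symbols('a b')

# Sum of squared deviations from the table above.
f = 55 - 30*a + 5*a**2 - 78*b + 20*a*b + 30*b**2

# Set both first-order partial derivatives to zero and solve the pair of equations.
solution = sp.solve([sp.diff(f, a), sp.diff(f, b)], [a, b])
print(solution)   # {a: 6/5, b: 9/10}  ->  Y-intercept = 1.2, slope = 0.9
```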

Page 21: OLS Regression

Estimating the model (the easy way)

Calculating the slope (b)

b = SP / SS_x

Page 22: OLS Regression

• Sum of Squares for X

• Sum of Squares for Y

• Sum of Products

SS_x = ΣX² − N·X̄²

SS_y = ΣY² − N·Ȳ²

SP = ΣXY − N·X̄·Ȳ

(X̄ and Ȳ are the means of X and Y.)
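A minimal numpy sketch of these shortcut formulas, applied to the same x/y data as the worked least-squares example:

```python
import numpy as np

# Same data as the worked least-squares example.
X = np.array([0, 1, 2, 3, 4], dtype=float)
Y = np.array([1, 3, 2, 4, 5], dtype=float)
N = len(X)

SS_x = np.sum(X**2) - N * X.mean()**2            # sum of squares for X
SS_y = np.sum(Y**2) - N * Y.mean()**2            # sum of squares for Y
SP   = np.sum(X * Y) - N * X.mean() * Y.mean()   # sum of products

b = SP / SS_x
print(SS_x, SS_y, SP, b)   # 10.0 10.0 9.0 0.9 -- the same slope as the calculus gave
```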

Page 23: OLS Regression

Calculating the Y-intercept (a)

Calculating the error term (e)

Ŷ (Y hat) = the predicted value of Y

e will be different for every observation. It is a measure of how much we are off in our prediction.

a = Ȳ − bX̄

e_i = Y_i − Ŷ_i

Ŷ_i = a + bX_i
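Continuing the same illustrative data, a minimal sketch of the intercept and the per-observation errors:

```python
import numpy as np

X = np.array([0, 1, 2, 3, 4], dtype=float)
Y = np.array([1, 3, 2, 4, 5], dtype=float)

b = 0.9                       # slope from the previous slide
a = Y.mean() - b * X.mean()   # Y-intercept: a = Y-bar - b * X-bar

Y_hat = a + b * X             # predicted value for every observation
e = Y - Y_hat                 # one residual (error) per observation
print(a)                      # 1.2
print(e)                      # approximately [-0.2, 0.9, -1.0, 0.1, 0.2]
```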

Page 24: OLS Regression

• Regression is strongly related to Correlation

r = SP / √(SS_x · SS_y)
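A minimal numeric check of this link, again with the same illustrative data; np.corrcoef is used only for comparison:

```python
import numpy as np

X = np.array([0, 1, 2, 3, 4], dtype=float)
Y = np.array([1, 3, 2, 4, 5], dtype=float)
N = len(X)

SS_x = np.sum(X**2) - N * X.mean()**2
SS_y = np.sum(Y**2) - N * Y.mean()**2
SP   = np.sum(X * Y) - N * X.mean() * Y.mean()

r = SP / np.sqrt(SS_x * SS_y)
print(r)                          # 0.9
print(np.corrcoef(X, Y)[0, 1])    # the same correlation computed by numpy
```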