
Source: econweb.umd.edu/~hong/files/Lecture3.pdf

Lecture 3: Simple Linear Regression

Xing Hong

Department of EconomicsUniversity of Maryland, College Park

Spring 2016


Overview

1. Introduction to the simple linear regression model
   - Interpretation
   - Causal relationship
2. Deriving the OLS Estimators
3. Goodness-of-Fit
4. Extensions on Simple Linear Regression
   - Unit of Measurement
   - Functional Forms: Nonlinear Relationships
5. Statistical Properties of the OLS estimators
   - Gauss-Markov Assumptions
   - Unbiasedness
   - Efficiency


Definition of the Simple Linear Regression

“Explains variable y in terms of variable x”


Interpretation of the Simple Linear Regression

“Studies how y varies with changes in x.”

The simple linear regression model is rarely applicable in practice, but its discussion is useful for pedagogical reasons.


Ceteris Paribus: everything else held constant

Definition of the CAUSAL effect of x on y: how does variable y change if variable x is changed but all other relevant factors are held constant?

- Most economic questions are ceteris paribus questions.
- It is important to define which causal effect one is interested in.
- It is useful to describe how an experiment would have to be designed to infer the causal effect in question.
- However, it is usually not possible to literally hold everything else constant. The key question in most empirical studies is:

Have enough other factors been held constant to make a case for causality?


Causality vs. Correlation

Correlation: x moves with y

Causality: x moves y

E.g., an observational relationship: college graduates earn about 50% more than those without a college degree.

Interpretations:
- Causal: college is productive.
- Non-causal: high-ability people go to college, and high-ability people earn more.

It is important to find out whether the relationship is causal, because if it is not, then higher investment in education (e.g., government subsidies) will not yield higher productivity or income.


An Example: Returns to Schooling

Suppose we build a simple regression model:

Wage = β0 + β1Edu + u

- Wage: annual wage, in $
- Edu: years of formal schooling
- u: all other unobserved factors that affect wage, including age, experience, gender, race, measurement error, etc.


An Example: Returns to Schooling

Wage = β0 + β1Edu + u

- Assume that this simple model is a reasonably good description of the actual wage determination process (*).
- Estimate β0 and β1 using data; obtain OLS estimates β̂0 = 21000 and β̂1 = 9000.
- Interpretation: if years of schooling increase by 1, then annual wage, on average, increases by $9000, keeping everything else constant (ceteris paribus).
- Issue: the validity/accuracy of the estimates, in particular 9000, relies on assumption (*), which is unlikely to be true in this specific problem. (More on this later.)


When is there a causal effect?

Key Assumption for Causality: the Zero Conditional Mean Assumption

E (u|x) = E (u) = 0


Key Assumption for Causality: E (u|x) = E (u) = 0

1. E(u|x) = E(u): the average value of u does not depend on the value of x, i.e., the unobserved factors that affect y do not change when x changes.

- In this example, wage = β0 + β1·Edu + u, and u contains the innate ability of a person, among other things.
- To impose the Zero Conditional Mean assumption E(ability|Edu) = E(ability) is to assume that the average level of ability is the same for people with different levels of education.
- For example, it implies E(ability|Edu = 6) = E(ability|Edu = 18): the average ability of the group of all people with six years of education is the same as the average ability of the group of all people with eighteen years of education. (Realistic?)
- A very strong assumption!


Key Assumption for Causality: E (u|x) = E (u) = 0

2. In addition, we assume E(u) = 0, a harmless assumption as long as β0 is present in y = β0 + β1x + u.

- If E(u) = ε ≠ 0, we can define a new error term u′ = u − ε. Then the original equation can be rewritten as y = (β0 + ε) + β1x + u′, with E(u′) = 0. (Exercise: check this.)
- That is, the constant term β0 will “absorb” any non-zero mean of the error term.


Understanding the Model

Once we have estimated the parameters, there are several ways to look at the linear regression model:

1. With the population parameters:

   yᵢ = E(yᵢ|x) + uᵢ = β0 + β1xᵢ + uᵢ

2. With the estimated parameters:

   yᵢ = ŷᵢ + ûᵢ = β̂0 + β̂1xᵢ + ûᵢ

where:
- E(yᵢ|x) = β0 + β1xᵢ: “systematic part of y”
- ŷᵢ = β̂0 + β̂1xᵢ: predicted/fitted value of yᵢ
- ûᵢ = yᵢ − ŷᵢ: residual for observation i


Fitted values and Residuals


Interpretation of the Estimated β1

∆ŷ = ŷ₂ − ŷ₁
   = (β̂0 + β̂1x₂) − (β̂0 + β̂1x₁)
   = β̂1(x₂ − x₁)
   = β̂1∆x

- When the zero conditional mean assumption holds, the estimated slope gives the predicted impact of a unit change in x on y.
- Interpretation: if x increases by one unit, we predict that, on average, y increases or decreases (depending on the sign of β̂1) by β̂1 units.
- When β̂1 = 0, we predict that there is no linear relationship between y and x.


Errors vs. Residuals

Errors uᵢ:
- all other factors that affect y
- never observed
- the assumptions of the model are built around u

Residuals ûᵢ:
- computed from data: ûᵢ = yᵢ − ŷᵢ
- depend on how we estimate the parameters and obtain ŷᵢ
- will have several important algebraic properties




Ordinary Least Squares (OLS) Estimators

Recall that an estimator is a general rule/approach to select estimates:
- Method of Moments estimators
- Maximum Likelihood estimators
- Least Squares estimators

The Ordinary Least Squares (OLS) estimators are obtained by minimizing the sum of squared residuals (SSR):

min SSR = Σᵢ₌₁ⁿ ûᵢ² = Σᵢ₌₁ⁿ (yᵢ − β̂0 − β̂1xᵢ)²


Deriving OLS Estimates: algebra

Minimizing the sum of squared residuals:

min SSR = Σᵢ₌₁ⁿ ûᵢ² = Σᵢ₌₁ⁿ (yᵢ − β̂0 − β̂1xᵢ)²

First Order Conditions:

∂(SSR)/∂β̂0 = −2 Σᵢ₌₁ⁿ (yᵢ − β̂0 − β̂1xᵢ) = 0  ⇒  Σᵢ₌₁ⁿ ûᵢ = 0    (1)

∂(SSR)/∂β̂1 = −2 Σᵢ₌₁ⁿ (yᵢ − β̂0 − β̂1xᵢ)xᵢ = 0  ⇒  Σᵢ₌₁ⁿ ûᵢxᵢ = 0    (2)


Re-arrange Eq. (1):

Σᵢ₌₁ⁿ yᵢ − nβ̂0 − β̂1 Σᵢ₌₁ⁿ xᵢ = 0

(1/n) Σᵢ₌₁ⁿ yᵢ − β̂0 − β̂1 (1/n) Σᵢ₌₁ⁿ xᵢ = 0

Write

β̂0 = ȳ − β̂1x̄    (3)

where ȳ = (1/n) Σᵢ₌₁ⁿ yᵢ and x̄ = (1/n) Σᵢ₌₁ⁿ xᵢ.


Plug Eq. (3) into Eq. (2):

Σᵢ₌₁ⁿ xᵢ[yᵢ − (ȳ − β̂1x̄) − β̂1xᵢ] = 0

Σᵢ₌₁ⁿ (xᵢ(yᵢ − ȳ) − β̂1xᵢ(xᵢ − x̄)) = 0

Σᵢ₌₁ⁿ xᵢ(yᵢ − ȳ) − β̂1 Σᵢ₌₁ⁿ xᵢ(xᵢ − x̄) = 0

β̂1 = Σᵢ₌₁ⁿ xᵢ(yᵢ − ȳ) / Σᵢ₌₁ⁿ xᵢ(xᵢ − x̄)


Recall the properties of summation:

Σᵢ₌₁ⁿ xᵢ(yᵢ − ȳ) = Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ)

Σᵢ₌₁ⁿ xᵢ(xᵢ − x̄) = Σᵢ₌₁ⁿ (xᵢ − x̄)²

Now we have obtained the OLS estimators:

β̂1 = Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ₌₁ⁿ (xᵢ − x̄)²

β̂0 = ȳ − β̂1x̄


An Example

Suppose we have data on tire pressure and MPG:

ID             1     2     3     4     5
TirePres (xᵢ)  20    25    30    35    40
MPG (yᵢ)       21.1  23.3  24.2  25.4  30.0

We want to estimate the population regression model

MPGᵢ = β0 + β1·TirePresᵢ + uᵢ

or

yᵢ = β0 + β1xᵢ + uᵢ


The OLS estimate of β1 is

β̂1 = Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ₌₁ⁿ (xᵢ − x̄)²

Average tire pressure, x̄, is 30 and average MPG, ȳ, is 24.8. Applying the OLS formula gives

β̂1 = 99.5 / 250 = 0.398
β̂0 = 24.8 − 0.398 × 30 = 12.86

In practice, this is done by econometrics packages, e.g. Stata or SAS.
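The hand computation above can be reproduced with a short script (a sketch assuming numpy is available; the data are the five observations from the slide):

```python
# OLS estimates for the tire-pressure example, using the closed-form formulas
import numpy as np

x = np.array([20.0, 25.0, 30.0, 35.0, 40.0])   # TirePres
y = np.array([21.1, 23.3, 24.2, 25.4, 30.0])   # MPG

xbar, ybar = x.mean(), y.mean()                # 30.0 and 24.8
b1 = ((x - xbar) * (y - ybar)).sum() / ((x - xbar) ** 2).sum()   # 99.5 / 250
b0 = ybar - b1 * xbar

print(b1, b0)                                  # approximately 0.398 and 12.86
```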


Algebraic Properties of OLS Statistics

Some properties follow immediately from the algebra. In other words, β̂0 and β̂1 are chosen such that:

1. The sum (and the sample average) of the OLS residuals is zero:

   Σᵢ₌₁ⁿ ûᵢ = 0

2. The sample covariance between the regressor and the OLS residuals is zero (the residuals have mean zero by property 1):

   (1/(n − 1)) Σᵢ₌₁ⁿ (xᵢ − x̄)ûᵢ = 0

3. The point (x̄, ȳ) is always on the OLS regression line.
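These three properties can be verified numerically (a sketch assuming numpy; the tire-pressure data from the earlier example are reused):

```python
# Checking the algebraic properties of OLS residuals on a small data set
import numpy as np

x = np.array([20.0, 25.0, 30.0, 35.0, 40.0])
y = np.array([21.1, 23.3, 24.2, 25.4, 30.0])

b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b0 = y.mean() - b1 * x.mean()
u_hat = y - (b0 + b1 * x)                  # OLS residuals

print(u_hat.sum())                         # property 1: zero (up to rounding)
print(((x - x.mean()) * u_hat).sum())      # property 2: zero sample covariance
print(b0 + b1 * x.mean() - y.mean())       # property 3: (xbar, ybar) is on the line
```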




Goodness-of-Fit: Some Definitions

Total sum of squares (SST): measures the total sample variation in the yᵢ, or how spread out the yᵢ are in the sample:

SST = Σᵢ₌₁ⁿ (yᵢ − ȳ)²

Recall: the sample variance of y is S²ᵧ = (1/(n − 1)) Σᵢ₌₁ⁿ (yᵢ − ȳ)² = SST/(n − 1).

Explained sum of squares (SSE): measures the sample variation in the ŷᵢ:

SSE = Σᵢ₌₁ⁿ (ŷᵢ − ȳ)²

Residual sum of squares (SSR): measures the sample variation in the ûᵢ:

SSR = Σᵢ₌₁ⁿ ûᵢ² = Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²


SST = SSE + SSR

The total variation in y can be expressed as the sum of the explained variation and the unexplained variation: SST = SSE + SSR.

SST = Σᵢ₌₁ⁿ (yᵢ − ȳ)²
    = Σᵢ₌₁ⁿ [(yᵢ − ŷᵢ) + (ŷᵢ − ȳ)]²
    = Σᵢ₌₁ⁿ [ûᵢ + (ŷᵢ − ȳ)]²
    = Σᵢ₌₁ⁿ ûᵢ² + 2 Σᵢ₌₁ⁿ ûᵢ(ŷᵢ − ȳ) + Σᵢ₌₁ⁿ (ŷᵢ − ȳ)²
    = SSR + 2 Σᵢ₌₁ⁿ ûᵢ(ŷᵢ − ȳ) + SSE


But Σᵢ₌₁ⁿ ûᵢ(ŷᵢ − ȳ) = 0:

Σᵢ₌₁ⁿ ûᵢ(ŷᵢ − ȳ) = Σᵢ₌₁ⁿ ûᵢŷᵢ − ȳ Σᵢ₌₁ⁿ ûᵢ
                = Σᵢ₌₁ⁿ ûᵢ(β̂0 + β̂1xᵢ) + 0
                = β̂0 Σᵢ₌₁ⁿ ûᵢ + β̂1 Σᵢ₌₁ⁿ ûᵢxᵢ
                = 0

So: SST = SSE + SSR


Goodness-of-Fit: R-squared

R-squared measures how well the OLS regression line fits the data:

R² = Σᵢ₌₁ⁿ (ŷᵢ − ȳ)² / Σᵢ₌₁ⁿ (yᵢ − ȳ)² = SSE/SST = 1 − SSR/SST

Some comments:
- R² measures the proportion of the variation in y explained by the variation in x.
- R² is always between 0 and 1.
- A higher R² means that a higher proportion of the variation in yᵢ is explained by the variation in xᵢ.
- Low R² values are not uncommon, especially for cross-sectional data.
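The decomposition and both R² formulas can be checked on the tire-pressure example (a sketch assuming numpy):

```python
# SST = SSE + SSR and R-squared for the tire-pressure data
import numpy as np

x = np.array([20.0, 25.0, 30.0, 35.0, 40.0])
y = np.array([21.1, 23.3, 24.2, 25.4, 30.0])

b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x                          # fitted values

SST = ((y - y.mean()) ** 2).sum()            # total variation
SSE = ((y_hat - y.mean()) ** 2).sum()        # explained variation
SSR = ((y - y_hat) ** 2).sum()               # unexplained variation
R2 = SSE / SST

print(R2, 1.0 - SSR / SST)                   # both around 0.906
```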




Extensions

Two important issues in applied economics are:
1. How does changing the units of measurement of the dependent or independent variables affect OLS estimates?
2. How do we incorporate non-linear functional forms used in economics into regression analysis?


Issue 1: Unit of Measurement

Consider the following estimated simple linear regression model:

Salary = 963,191 + 18,501·ROE

where
- Salary is the predicted CEO salary, measured in dollars;
- ROE is the firm's return-on-equity ratio (annual earnings divided by value of equity).

- The estimated equation says that if ROE increases by 0.01, CEO salary increases by $185.01 on average.
- But what if we measure Salary in thousands of dollars?
- What if we change the unit of measurement for ROE from proportions to percentages?


Issue 1: Unit of Measurement

Changing the unit of measurement is equivalent to multiplying the variable by a constant c.

If we change the unit of y:

c·y = (cβ̂0) + (cβ̂1)x = β̂0′ + β̂1′x

If we change the unit of x:

y = β̂0 + (β̂1/c)(cx) = β̂0 + β̂1′(cx)

Changing the unit of measurement does not change the interpretation of the regression results!
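A quick numerical check of the two rescaling rules (a sketch assuming numpy; the data are synthetic, with arbitrary illustrative parameters):

```python
# Rescaling y multiplies both estimates by c; rescaling x divides the slope by c
import numpy as np

def ols(x, y):
    """OLS slope and intercept from the closed-form formulas."""
    b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
    return y.mean() - b1 * x.mean(), b1

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, 100)                   # synthetic regressor
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, 100)     # synthetic outcome

b0, b1 = ols(x, y)
c = 1000.0                                        # a change of units, e.g. to smaller units
b0_cy, b1_cy = ols(x, c * y)   # rescale y: both estimates are multiplied by c
b0_cx, b1_cx = ols(c * x, y)   # rescale x: slope divided by c, intercept unchanged
```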


Issue 2: Incorporating Nonlinear Relationships

The meaning of “linear” regression: linear in the parameters.
- Linear regression: y = β0 + β1x₁ + β2x₂² + β3·log(x₃) + u
- Non-linear regression: y = 1/√(β0 + β1x) + u

This insight allows us to use functional forms to incorporate nonlinear relationships between y and x in the simple linear regression model.


The limitation of a linear relationship

Previously, we have only used level-level models. For example,

Wageᵢ = β0 + β1·Eduᵢ + uᵢ

says that one additional year spent in school increases wage by β1 dollars on average.
- The relationship between income and years of schooling is linear.
- Caveat: the model is only suitable for a constant return to education; β1 dollars is the increase for either the first year of education or the twentieth year of education.
- What if there is an increasing return?


Example: A constant return to education


Example: An increasing return to education


A log-level model

How do we capture an increasing return to education? Use percentage changes.
- Suppose we observe that one more year spent in school increases wage by 10% on average.
- In that case, the relationship can be represented by the log-level model:

log(Wageᵢ) = β0 + β1·Eduᵢ + uᵢ

- We can generate a new variable, log(Wageᵢ), and then use it as the dependent variable in the linear regression.
- Question: why can a change in log value be interpreted as a percentage change?


Interpretation of changes in log value

A key property implied by the Taylor series:

f(x + ∆x) ≈ f(x) + f′(x)∆x

where f(·) is a differentiable function and ∆x is small. Hence when f(·) = log(·),

log(x + ∆x) − log(x) ≈ ∆x/x

or

∆log(x) ≈ ∆x/x

where (∆x/x)·100% is the percentage change in x. Therefore, we can interpret changes in log values as percentage changes. This is an approximation.
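The quality of the approximation can be seen numerically; it is tight for small proportional changes and degrades for larger ones (stdlib-only sketch):

```python
# Comparing the change in log with the percentage-change approximation dx/x
import math

x = 100.0
for dx in (1.0, 5.0, 20.0):                   # 1%, 5%, 20% increases
    exact = math.log(x + dx) - math.log(x)    # the change in log
    approx = dx / x                           # the approximation
    print(f"{dx / x:.0%}: delta-log = {exact:.4f}, dx/x = {approx:.4f}")
```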


Interpretation of a log-level model

For a level-level model,

y = β0 + β1x + u

the interpretation is: ∆y = β1∆x.

For a log-level model,

log(y) = β0 + β1x + u

the interpretation is: ∆log(y) = β1∆x. Applying the approximation of changes in log, we interpret:

∆y/y ≈ β1∆x


Log-level model

For a log-level model,

log(Wageᵢ) = β0 + β1·Eduᵢ + uᵢ

a one-year increase in years of schooling increases wage by (100·β1)% on average:

∆Wage/Wage ≈ β1·∆Edu

If we estimate β1 to be 0.09, then one more year of schooling (∆Edu = 1) increases wage by about 9% on average:

∆Wage/Wage ≈ 0.09 × 1 = 9%
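Since log(Wage) rises by exactly β1 when Edu rises by one year, the exact implied percentage change is 100·(exp(β1) − 1); the 9% figure is the log approximation of it (stdlib-only sketch):

```python
# Approximate vs. exact percentage change implied by a log-level coefficient
import math

b1 = 0.09                                   # estimated coefficient from the slide
approx_pct = 100.0 * b1                     # log approximation: about 9%
exact_pct = 100.0 * (math.exp(b1) - 1.0)    # exact implied change: about 9.42%
print(approx_pct, exact_pct)
```

The gap widens as the coefficient grows, which is why the approximation is usually reserved for small coefficients.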


Functional Forms: continued

If we believe that the relationship between yᵢ and xᵢ is such that when xᵢ increases by 1 percent, all else equal, yᵢ increases by β1 percent, then we should use the log-log model:

log(yᵢ) = β0 + β1·log(xᵢ) + uᵢ

If we believe that when xᵢ increases by 1 percent, all else equal, yᵢ increases by some absolute amount (β1/100 units), then we should use a level-log model:

yᵢ = β0 + β1·log(xᵢ) + uᵢ


Four types of models

- level-level: y = β0 + β1x + u
- log-level: log(y) = β0 + β1x + u
- level-log: y = β0 + β1·log(x) + u
- log-log: log(y) = β0 + β1·log(x) + u. In this case, β1 is the elasticity.


Summary of interpretations

Model        Dependent var.  Independent var.  Interpretation of β1
level-level  y               x                 ∆y = β1·∆x
log-level    log(y)          x                 %∆y ≈ (100·β1)·∆x
level-log    y               log(x)            ∆y ≈ (β1/100)·%∆x
log-log      log(y)          log(x)            %∆y ≈ β1·%∆x




Statistical Properties of OLS

Question: are the OLS estimators, β̂0 and β̂1, good estimators of the population parameters, β0 and β1?

Recall some good properties of estimators:
- Unbiasedness: E(β̂0) = β0 and E(β̂1) = β1
- Efficiency: Var(β̂0) and Var(β̂1) should be relatively small

- The statistical properties concern the distributions of β̂0 and β̂1 over different random samples from the population, which depend not only on how the estimators were constructed, but also on the nature of the population/data.
- In other words, we need to make assumptions on the population to establish the statistical properties of the OLS estimators.
- In linear regression models, the set of assumptions we will make are called the Gauss-Markov Assumptions.


Gauss-Markov Assumptions for Simple Regression: SLR.1-3

Assumption SLR.1 (Linear in Parameters): in the population model, the dependent variable y is related to the independent variable x and the error (or disturbance) u as

y = β0 + β1x + u

where β0 and β1 are the population intercept and slope parameters, respectively.

Assumption SLR.2 (Random Sampling): we have a random sample of size n, {(xᵢ, yᵢ): i = 1, 2, …, n}, following the population model in Assumption SLR.1.

Assumption SLR.3 (Sample Variation in the Explanatory Variable): the sample outcomes on x, namely {xᵢ, i = 1, 2, …, n}, are not all the same value.


Gauss-Markov Assumptions for Simple Regression: SLR.4-5

Assumption SLR.4 (Zero Conditional Mean): the error u has an expected value of zero given any value of the explanatory variable:

E(u|x) = E(u) = 0

We have mentioned before that this is the key assumption to ensure that the estimated β̂1 measures the ceteris paribus effect of x on y. In this section, we will show algebraically that this is the key assumption for establishing the unbiasedness of β̂1 and β̂0.

Assumption SLR.5 (Homoskedasticity): the error u has the same variance given any value of the explanatory variable:

Var(u|x) = E(u²|x) = σ²

In this section, we use this assumption to obtain the usual OLS variance formula. We will postpone the discussion of efficiency until later.


Heteroskedasticity vs. Homoskedasticity


Unbiasedness of β̂1: E(β̂1) = β1

Under Assumptions SLR.1-SLR.5, we want to show that the OLS estimator β̂1 is unbiased, i.e., that E(β̂1) = β1. First, we can simplify β̂1 before taking expected values:

β̂1 = Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ₌₁ⁿ (xᵢ − x̄)²

   = Σᵢ₌₁ⁿ (xᵢ − x̄)yᵢ / Σᵢ₌₁ⁿ (xᵢ − x̄)²

   = Σᵢ₌₁ⁿ (xᵢ − x̄)(β0 + β1xᵢ + uᵢ) / Σᵢ₌₁ⁿ (xᵢ − x̄)²

   = β0 · [Σᵢ₌₁ⁿ (xᵢ − x̄) / Σᵢ₌₁ⁿ (xᵢ − x̄)²]    (this ratio is 0)
     + β1 · [Σᵢ₌₁ⁿ (xᵢ − x̄)xᵢ / Σᵢ₌₁ⁿ (xᵢ − x̄)²]    (this ratio is 1, since Σᵢ₌₁ⁿ (xᵢ − x̄)xᵢ = Σᵢ₌₁ⁿ (xᵢ − x̄)²)
     + Σᵢ₌₁ⁿ (xᵢ − x̄)uᵢ / Σᵢ₌₁ⁿ (xᵢ − x̄)²


Unbiasedness of β̂1

β̂1 can be rewritten as:

β̂1 = β1 + Σᵢ₌₁ⁿ (xᵢ − x̄)uᵢ / Σᵢ₌₁ⁿ (xᵢ − x̄)²    (using SLR.1-3)

Second, we take the expectation of β̂1. In particular, we take the conditional expectation:

E(β̂1|x) = β1 + E[ Σᵢ₌₁ⁿ (xᵢ − x̄)uᵢ / Σᵢ₌₁ⁿ (xᵢ − x̄)² | x ]

        = β1 + Σᵢ₌₁ⁿ (xᵢ − x̄)E(uᵢ|x) / Σᵢ₌₁ⁿ (xᵢ − x̄)²


Unbiasedness of β̂1

Then we have:

E(β̂1|x) = β1    if SLR.4 holds: E(u|x) = 0

Finally, apply the law of iterated expectations:

E(β̂1) = E(E(β̂1|x)) = E(β1) = β1

⇒ The OLS estimator β̂1 is an unbiased estimator of β1.

Questions:
- Where did we use Assumptions 1-5?
- What will happen if each assumption fails?
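Unbiasedness can be illustrated with a small Monte Carlo experiment (a sketch assuming numpy; the true parameter values are arbitrary): holding x fixed and redrawing u many times, the average of β̂1 across samples is close to the true β1.

```python
# Monte Carlo check of E(beta1-hat) = beta1 under SLR.1-5
import numpy as np

rng = np.random.default_rng(42)
beta0, beta1, sigma = 1.0, 2.0, 1.0          # true population parameters (illustrative)
x = np.linspace(0.0, 10.0, 50)               # fixed x values with sample variation (SLR.3)

draws = []
for _ in range(5000):
    u = rng.normal(0.0, sigma, x.size)       # E(u|x) = 0, so SLR.4 holds by construction
    y = beta0 + beta1 * x + u                # SLR.1: linear in parameters
    b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
    draws.append(b1)

print(np.mean(draws))                        # close to the true beta1 = 2.0
```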


Unbiasedness of β̂0: E(β̂0) = β0

Recall that the OLS estimator β̂0 is given by β̂0 = ȳ − β̂1x̄:

E(β̂0|x) = E(ȳ|x) − E(β̂1x̄|x) = E(ȳ|x) − x̄·E(β̂1|x)

         = E(ȳ|x) − β1x̄

         = E[(1/n) Σᵢ₌₁ⁿ yᵢ | x] − β1x̄

         = E[(1/n) Σᵢ₌₁ⁿ (β0 + β1xᵢ + uᵢ) | x] − β1x̄

         = β0 + β1(1/n) Σᵢ₌₁ⁿ xᵢ + (1/n)·E(Σᵢ₌₁ⁿ uᵢ | x) − β1x̄

         = β0 + β1x̄ − β1x̄    if SLR.4 holds: E(u|x) = 0

         = β0

Again, apply the law of iterated expectations:

E(β̂0) = E(E(β̂0|x)) = E(β0) = β0

⇒ The OLS estimator β̂0 is an unbiased estimator of β0.


Variance of β̂1

We have shown that

β̂1 = β1 + Σᵢ₌₁ⁿ (xᵢ − x̄)uᵢ / Σᵢ₌₁ⁿ (xᵢ − x̄)²

Recall: Var(aX + b) = a²·Var(X) for any constants a, b.

Var(β̂1|x) = Var[ Σᵢ₌₁ⁿ (xᵢ − x̄)uᵢ / Σᵢ₌₁ⁿ (xᵢ − x̄)² | x ]

          = 1/[Σᵢ₌₁ⁿ (xᵢ − x̄)²]² · Var[ Σᵢ₌₁ⁿ (xᵢ − x̄)uᵢ | x ]


Variance of β̂1

Var[ Σᵢ₌₁ⁿ (xᵢ − x̄)uᵢ | x ] = Σᵢ₌₁ⁿ Var[(xᵢ − x̄)uᵢ | x]    (by random sampling, the uᵢ are independent)

                           = Σᵢ₌₁ⁿ (xᵢ − x̄)²·Var(uᵢ|x)

                           = Σᵢ₌₁ⁿ (xᵢ − x̄)²·σ²    if SLR.5 holds: Var(u|x) = σ²

                           = σ² Σᵢ₌₁ⁿ (xᵢ − x̄)²


Variance of β̂1

Var(β̂1|x) = Var[ Σᵢ₌₁ⁿ (xᵢ − x̄)uᵢ | x ] / [Σᵢ₌₁ⁿ (xᵢ − x̄)²]²

          = σ² Σᵢ₌₁ⁿ (xᵢ − x̄)² / [Σᵢ₌₁ⁿ (xᵢ − x̄)²]²

          = σ² / Σᵢ₌₁ⁿ (xᵢ − x̄)²

Questions:
- When is Var(β̂1|x) small?
- Why do we need SLR.3, “x₁, x₂, …, xₙ are not all the same value”?
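The formula can be illustrated with a Monte Carlo experiment (a sketch assuming numpy; parameters are arbitrary): the sampling variance of β̂1 across repeated samples matches σ²/Σᵢ₌₁ⁿ(xᵢ − x̄)². Note the formula also answers the SLR.3 question: if all xᵢ were equal, the denominator would be zero and β̂1 would be undefined.

```python
# Monte Carlo check of Var(beta1-hat | x) = sigma^2 / sum((x_i - xbar)^2)
import numpy as np

rng = np.random.default_rng(7)
beta0, beta1, sigma = 1.0, 2.0, 1.0          # illustrative true parameters
x = np.linspace(0.0, 10.0, 30)               # fixed regressor values

def simulate_var(x, reps=10000):
    """Sampling variance of beta1-hat across repeated samples with x held fixed."""
    b1s = np.empty(reps)
    for r in range(reps):
        y = beta0 + beta1 * x + rng.normal(0.0, sigma, x.size)
        b1s[r] = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
    return b1s.var()

theory = sigma ** 2 / ((x - x.mean()) ** 2).sum()
mc_var = simulate_var(x)
print(mc_var, theory)                        # the two agree closely
```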