OLS Assumptions

8/8/2019 OLS Assumptions
1/41
The Simple Regression Model
y = β0 + β1x + u

Some Terminology
In the simple linear regression model, y = β0 + β1x + u, we typically refer to y as the
Dependent Variable, or
Left-Hand Side Variable, or
Explained Variable, or
Regressand

Some Terminology, cont.
In the simple linear regression of y on x, we typically refer to x as the
Independent Variable, or
Right-Hand Side Variable, or
Explanatory Variable, or
Regressor, or
Covariate, or
Control Variable

A Simple Assumption
The average value of u, the error term, in the population is 0. That is,
E(u) = 0
This is not a restrictive assumption, since we can always use β0 to normalize E(u) to 0

Zero Conditional Mean
We need to make a crucial assumption about how u and x are related
We want it to be the case that knowing something about x does not give us any information about u, so that they are completely unrelated. That is, that
E(u|x) = E(u) = 0, which implies
E(y|x) = β0 + β1x

[Figure: E(y|x) as a linear function of x, where for any x (e.g. x1, x2) the conditional distribution f(y) is centered about E(y|x) = β0 + β1x]

Ordinary Least Squares
Basic idea of regression is to estimate the population parameters from a sample
Let {(xi, yi): i = 1, …, n} denote a random sample of size n from the population
For each observation in this sample, it will be the case that
yi = β0 + β1xi + ui

[Figure: population regression line E(y|x) = β0 + β1x, sample data points (x1, y1), …, (x4, y4), and the associated error terms u1, …, u4]

Deriving OLS Estimates
To derive the OLS estimates we need to realize that our main assumption of E(u|x) = E(u) = 0 also implies that
Cov(x, u) = E(xu) = 0
Why? Remember from basic probability that Cov(X, Y) = E(XY) − E(X)E(Y)

Deriving OLS continued
We can write our 2 restrictions just in terms of x, y, β0 and β1, since u = y − β0 − β1x
E(y − β0 − β1x) = 0
E[x(y − β0 − β1x)] = 0
These are called moment restrictions

Deriving OLS using M.O.M.
The method of moments approach to estimation implies imposing the population moment restrictions on the sample moments
What does this mean? Recall that for E(X), the mean of a population distribution, a sample estimator of E(X) is simply the arithmetic mean of the sample

More Derivation ofOLS
We want to choose values of the parameters that will ensure that the sample versions of our moment restrictions are true
The sample versions are as follows:
(1/n) Σi (yi − β̂0 − β̂1xi) = 0
(1/n) Σi xi(yi − β̂0 − β̂1xi) = 0

More Derivation ofOLS
Given the definition of a sample mean, and properties of summation, we can rewrite the first condition as follows
ȳ = β̂0 + β̂1x̄,
or
β̂0 = ȳ − β̂1x̄

More Derivation ofOLS
Σi xi(yi − (ȳ − β̂1x̄) − β̂1xi) = 0
Σi xi(yi − ȳ) = β̂1 Σi xi(xi − x̄)
Σi (xi − x̄)(yi − ȳ) = β̂1 Σi (xi − x̄)²

So the OLS estimated slope is
β̂1 = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)²
provided that Σi (xi − x̄)² > 0
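The slope and intercept formulas above can be computed directly. A minimal sketch in Python, using a small made-up dataset (the numbers are illustrative, not from the slides):

```python
# Hypothetical sample data chosen for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Slope: sample covariance of x and y over sample variance of x
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
     / sum((xi - x_bar) ** 2 for xi in x)

# Intercept from the first moment condition: b0 = y_bar - b1 * x_bar
b0 = y_bar - b1 * x_bar

print(b1, b0)  # b1 ≈ 0.6, b0 ≈ 2.2
```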

Summary ofOLS slope estimate
The slope estimate is the sample covariance between x and y divided by the sample variance of x
If x and y are positively correlated, the slope will be positive
If x and y are negatively correlated, the slope will be negative
Only need x to vary in our sample

More OLS
Intuitively, OLS is fitting a line through the sample points such that the sum of squared residuals is as small as possible, hence the term least squares
The residual, û, is an estimate of the error term, u, and is the difference between the fitted line (sample regression function) and the sample point

[Figure: sample regression line ŷ = β̂0 + β̂1x, sample data points (x1, y1), …, (x4, y4), and the associated estimated error terms (residuals) û1, …, û4]

Alternate approach to derivation
Given the intuitive idea of fitting a line, we can set up a formal minimization problem
That is, we want to choose our parameters such that we minimize the following:
Σi ûi² = Σi (yi − β̂0 − β̂1xi)²
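As a quick numerical sanity check (again with a small made-up dataset, not from the slides), the closed-form OLS solution should attain the minimum of this sum-of-squared-residuals criterion: random perturbations of (β̂0, β̂1) never lower it.

```python
import random

random.seed(2)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
     / sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar

def ssr(a0, a1):
    """Sum of squared residuals for candidate intercept a0 and slope a1."""
    return sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))

# Randomly perturbing the OLS solution should never reduce the criterion
ols_ssr = ssr(b0, b1)
never_better = all(
    ssr(b0 + random.uniform(-1, 1), b1 + random.uniform(-1, 1)) >= ols_ssr - 1e-12
    for _ in range(1000)
)
```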

Alternate approach, continued
If one uses calculus to solve the minimization problem for the two parameters you obtain the following first order conditions, which are the same as we obtained before, multiplied by n
Σi (yi − β̂0 − β̂1xi) = 0
Σi xi(yi − β̂0 − β̂1xi) = 0

Algebraic Properties ofOLS
The sum of the OLS residuals is zero
Thus, the sample average of the OLS residuals is zero as well
The sample covariance between the regressors and the OLS residuals is zero
The OLS regression line always goes through the mean of the sample

Algebraic Properties (precise)
Σi ûi = 0, and thus (1/n) Σi ûi = 0
Σi xi ûi = 0
ȳ = β̂0 + β̂1x̄
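These three algebraic properties can be verified numerically. A sketch using the same hypothetical dataset as before (the numbers are illustrative only):

```python
# Hypothetical sample and OLS fit
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
     / sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar

# Residuals from the fitted line
u_hat = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

sum_u = sum(u_hat)                                 # residuals sum to zero
sum_xu = sum(xi * ui for xi, ui in zip(x, u_hat))  # uncorrelated with regressor
fit_at_mean = b0 + b1 * x_bar                      # line passes through (x̄, ȳ)
```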

More terminology
We can think of each observation as being made up of an explained part, and an unexplained part, yi = ŷi + ûi
We then define the following:
Σi (yi − ȳ)² is the total sum of squares (SST)
Σi (ŷi − ȳ)² is the explained sum of squares (SSE)
Σi ûi² is the residual sum of squares (SSR)
Then SST = SSE + SSR

Proof that SST = SSE + SSR
SST = Σi (yi − ȳ)² = Σi [(yi − ŷi) + (ŷi − ȳ)]²
= Σi [ûi + (ŷi − ȳ)]²
= Σi ûi² + 2 Σi ûi(ŷi − ȳ) + Σi (ŷi − ȳ)²
= SSR + 2 Σi ûi(ŷi − ȳ) + SSE
and we know that Σi ûi(ŷi − ȳ) = 0

Goodness-of-Fit
How do we think about how well our sample regression line fits our sample data?
Can compute the fraction of the total sum of squares (SST) that is explained by the model; call this the R-squared of the regression
R² = SSE/SST = 1 − SSR/SST
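The decomposition SST = SSE + SSR, and the two equivalent ways of computing R², can be checked on the same hypothetical dataset used earlier:

```python
# Hypothetical sample and OLS fit
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
     / sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]
u_hat = [yi - fi for yi, fi in zip(y, fitted)]

sst = sum((yi - y_bar) ** 2 for yi in y)      # total sum of squares
sse = sum((fi - y_bar) ** 2 for fi in fitted) # explained sum of squares
ssr = sum(ui ** 2 for ui in u_hat)            # residual sum of squares

r2_a = sse / sst        # R² as explained fraction
r2_b = 1 - ssr / sst    # R² via the residual fraction; the two agree
```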

Using Stata forOLS regressions
Now that we've derived the formula for calculating the OLS estimates of our parameters, you'll be happy to know you don't have to compute them by hand
Regressions in Stata are very simple; to run the regression of y on x, just type: reg y x

Unbiasedness ofOLS
Assume the population model is linear in parameters as y = β0 + β1x + u
Assume we can use a random sample of size n, {(xi, yi): i = 1, 2, …, n}, from the population model. Thus we can write the sample model yi = β0 + β1xi + ui
Assume E(u|x) = 0 and thus E(ui|xi) = 0
Assume there is variation in the xi

Unbiasedness ofOLS (cont)
In order to think about unbiasedness, we need to rewrite our estimator in terms of the population parameter
Start with a simple rewrite of the formula as
β̂1 = Σi (xi − x̄)yi / s_x², where
s_x² ≡ Σi (xi − x̄)²

Unbiasedness ofOLS (cont)
Σi (xi − x̄)yi = Σi (xi − x̄)(β0 + β1xi + ui)
= Σi [(xi − x̄)β0 + (xi − x̄)β1xi + (xi − x̄)ui]
= β0 Σi (xi − x̄) + β1 Σi (xi − x̄)xi + Σi (xi − x̄)ui

Unbiasedness ofOLS (cont)
Σi (xi − x̄) = 0,
Σi (xi − x̄)xi = Σi (xi − x̄)²
so, the numerator can be rewritten as β1 s_x² + Σi (xi − x̄)ui, and thus
β̂1 = β1 + Σi (xi − x̄)ui / s_x²

Unbiasedness ofOLS (cont)
let di = (xi − x̄), so that
β̂1 = β1 + (1/s_x²) Σi di ui, then
E(β̂1) = β1 + (1/s_x²) Σi di E(ui) = β1

Unbiasedness Summary
The OLS estimates of β1 and β0 are unbiased
Proof of unbiasedness depends on our 4 assumptions; if any assumption fails, then OLS is not necessarily unbiased
Remember, unbiasedness is a description of the estimator; in a given sample we may be near or far from the true parameter
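The unbiasedness result can be illustrated with a small Monte Carlo simulation. The true parameters, x values, and error distribution below are hypothetical choices for the simulation, not from the slides; averaging β̂1 across many random samples should land close to the true β1.

```python
import random

random.seed(0)
beta0, beta1 = 2.0, 0.5          # hypothetical true population parameters
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
x_bar = sum(x) / len(x)
sx2 = sum((xi - x_bar) ** 2 for xi in x)

estimates = []
for _ in range(5000):
    # Draw errors with E(u) = 0, independent of x, as the assumptions require
    y = [beta0 + beta1 * xi + random.gauss(0, 1) for xi in x]
    y_bar = sum(y) / len(y)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sx2
    estimates.append(b1)

# Any single b1 may be near or far from beta1, but the average is close to it
avg_b1 = sum(estimates) / len(estimates)
```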

Variance of the OLS Estimators
Now we know that the sampling distribution of our estimate is centered around the true parameter
Want to think about how spread out this distribution is
Much easier to think about this variance under an additional assumption, so
Assume Var(u|x) = σ² (Homoskedasticity)

[Figure: Homoskedastic case: the conditional distribution f(y|x) has the same spread at each x (e.g. x1, x2), centered about E(y|x) = β0 + β1x]

[Figure: Heteroskedastic case: the spread of f(y|x) increases with x (e.g. x1, x2, x3), while y remains centered about E(y|x) = β0 + β1x]

Variance ofOLS (cont)
Var(β̂1) = Var(β1 + (1/s_x²) Σi di ui)
= (1/s_x²)² Var(Σi di ui)
= (1/s_x²)² Σi di² Var(ui)
= (1/s_x²)² Σi di² σ²
= σ² (1/s_x²)² Σi di²
= σ² (1/s_x²)² s_x²
= σ²/s_x² = Var(β̂1)
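The formula Var(β̂1) = σ²/s_x² can be checked against a Monte Carlo estimate of the sampling variance. As before, the parameters, x values, and error distribution are hypothetical choices for the simulation:

```python
import random
import statistics

random.seed(1)
sigma = 1.0                      # hypothetical error standard deviation
x = list(range(1, 11))
x_bar = sum(x) / len(x)
sx2 = sum((xi - x_bar) ** 2 for xi in x)

b1s = []
for _ in range(5000):
    # Homoskedastic errors: same variance sigma**2 at every x
    y = [1.0 + 0.5 * xi + random.gauss(0, sigma) for xi in x]
    y_bar = sum(y) / len(y)
    b1s.append(sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sx2)

mc_var = statistics.pvariance(b1s)   # simulated sampling variance of b1
theory_var = sigma ** 2 / sx2        # the slide's formula sigma^2 / s_x^2
```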

Variance ofOLS Summary
The larger the error variance, σ², the larger the variance of the slope estimate
The larger the variability in the xi, the smaller the variance of the slope estimate
As a result, a larger sample size should decrease the variance of the slope estimate
Problem: the error variance is unknown

Estimating the Error Variance
We don't know what the error variance, σ², is, because we don't observe the errors, ui
What we observe are the residuals, ûi
We can use the residuals to form an estimate of the error variance

Error Variance Estimate (cont)
ûi = yi − β̂0 − β̂1xi
= (β0 + β1xi + ui) − β̂0 − β̂1xi
= ui − (β̂0 − β0) − (β̂1 − β1)xi
Then, an unbiased estimator of σ² is
σ̂² = (1/(n − 2)) Σi ûi² = SSR/(n − 2)
FF

Error Variance Estimate (cont)
σ̂ = √σ̂² = standard error of the regression
recall that sd(β̂1) = σ/s_x
if we substitute σ̂ for σ then we have the standard error of β̂1,
se(β̂1) = σ̂ / [Σi (xi − x̄)²]^(1/2)
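Putting the last two slides together, σ̂² = SSR/(n − 2) and se(β̂1) follow directly from the residuals. A sketch on the same hypothetical dataset used throughout:

```python
import math

# Hypothetical sample and OLS fit
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sx2 = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sx2
b0 = y_bar - b1 * x_bar

# Residual sum of squares, with n - 2 degrees of freedom used up
ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
sigma2_hat = ssr / (n - 2)           # unbiased estimator of sigma^2
sigma_hat = math.sqrt(sigma2_hat)    # standard error of the regression
se_b1 = sigma_hat / math.sqrt(sx2)   # standard error of the slope estimate
```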