Violations of OLS
Post on 10-Feb-2018
-
7/22/2019 Violations of OLS
1/64
Violations of Assumptions
of OLS
-
Readings: Asteriou,
pp. 93-196 (2nd edition) or
pp. 83-179 (1st edition),
covers all this material.
-
The Linear Regression Model
The Assumptions
1. Linearity: the dependent variable is a linear function of the independent variable(s).
2. Xt has some variation, i.e. Var(X) is not 0.
3. Xt is non-stochastic and fixed in repeated samples.
4. E(ut) = 0.
5. Homoskedasticity: Var(ut) = σ² = constant for all t.
6. Cov(ut, us) = 0: serial independence.
7. ut ~ N(0, σ²).
8. No multicollinearity (i.e. no explanatory variable can be a linear combination of others in the model).
-
Violation of the Assumptions of the Classical Linear
Regression Model (CLRM) (i.e. OLS)
We will focus on the assumptions regarding the
disturbance terms:
1. E(ut) = 0
2. Var(ut) = σ² (homoskedasticity)
3. Cov(ui, uj) = 0 (absence of autocorrelation)
4. ut ~ N(0, σ²) (normally distributed errors)
-
Statistical Distributions for Diagnostic Tests
Often, an F- and a χ²-version of the test are available.
The F-test version involves estimating a restricted and an unrestricted version of a test regression and comparing the RSS (i.e. Wald tests).
The χ²-version is sometimes called an LM test, and has only one degrees-of-freedom parameter: the number of restrictions being tested, m.
For small samples, the F-version is preferable.
-
Assumption 1:
E(ut) = 0
-
Assumption 1: E(ut) = 0
Assumption that the mean of the disturbances is zero.
For all diagnostic tests, we cannot observe the disturbances and so
perform the tests on the residuals instead.
Recall:
Disturbance -> difference between observation and true line
Residual -> difference between observation and estimated line
The mean of the residuals will always be zero provided that there is a constant term in the regression, so in most cases we don't need to worry about this!!
-
Assumption 2:
Var(ut) = σ²
True => homoskedasticity
False => heteroskedasticity
-
Detection of Heteroscedasticity
Graphical methods
Formal tests: one of the best is White's general test for heteroskedasticity.
The test is carried out as follows:
1. Assume that the regression we carried out is as follows:
   yt = β1 + β2 x2t + β3 x3t + ut
   and we want to test Var(ut) = σ².
2. We estimate the model, obtaining the residuals, ût, then run the auxiliary regression:
   ût² = α1 + α2 x2t + α3 x3t + α4 x2t² + α5 x3t² + α6 x2t x3t + vt
-
Performing White's Test for Heteroscedasticity
3. Obtain R² from the auxiliary regression and multiply it by the number of observations, T. It can be shown that
   T·R² ~ χ²(m)
   where m is the number of regressors in the auxiliary regression, excluding the constant term. The idea is that if R² in the auxiliary regression is high, then the equation has explanatory power.
4. If the χ² test statistic from step 3 is greater than the corresponding value from the statistical table, then reject the null hypothesis that the disturbances are homoscedastic.
A test for heteroscedasticity is given with OLS regression output.
I.e. the basic idea is to see whether the x's are correlated with the size of the errors (they shouldn't be if the errors are homoskedastic).
-
Consequences of Using OLS in the Presence of Heteroscedasticity
OLS estimation still gives unbiased coefficient estimates (in linear models), but they are no longer BLUE [not efficient, because heteroskedasticity increases the variance of the distribution of the β̂s].
This implies that if we still use OLS in the presence of heteroscedasticity, our standard errors could be inappropriate and hence any inferences we make could be misleading.
If the standard error is too high: our t-statistic will be too low and we will fail to reject hypotheses we should reject.
If the standard error is too low: our t-statistic will be too high and we will reject hypotheses we should fail to reject.
Whether the standard errors calculated using the usual formulae are too big or too small will depend upon the form of the heteroscedasticity.
-
How Do we Deal with Heteroscedasticity?
If the form (i.e. the cause) of the heteroskedasticity is known, then we can use an estimation method which takes this into account (called generalised least squares, GLS).
Suppose the original regression is: yt = β1 + β2 x2t + β3 x3t + ut
A simple illustration of GLS is as follows: suppose that the error variance is related to another variable zt by
   Var(ut) = σ² zt²
To remove the heteroscedasticity, divide the regression equation by zt:
   yt/zt = β1 (1/zt) + β2 (x2t/zt) + β3 (x3t/zt) + vt
where vt = ut/zt is just the rescaled error term.
Now, for known zt,
   Var(vt) = Var(ut/zt) = Var(ut)/zt² = σ² zt²/zt² = σ²
So the disturbances from the new regression equation will be homoscedastic.
-
Other Approaches to Dealing with Heteroscedasticity
Other solutions include:
1. Transforming the variables into logs, or deflating by some other measure of size.
2. Using White's heteroscedasticity-consistent standard error estimates.
The effect of using White's correction is that, in general, the standard errors for the slope coefficients are increased relative to the usual OLS standard errors.
This makes us more conservative in hypothesis testing, so that we would need more evidence against the null hypothesis before we would reject it.
-
Assumption 3:
Cov (ui,uj) = 0
(Absence of autocorrelation)
-
Aside: Covariance
What is covariance?
The covariance between two real-valued random variables X and Y, with expected values E(X) = μX and E(Y) = μY, is defined as
   Cov(X, Y) = E[(X − μX)(Y − μY)]
If two variables tend to vary together (that is, when one of them is above its expected value, the other variable tends to be above its expected value too), then the covariance between the two variables will be positive. On the other hand, if one of them tends to be above its expected value when the other variable is below its expected value, then the covariance between the two variables will be negative.
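The sign logic can be seen numerically; a small sketch with simulated data (numpy assumed, names illustrative):

```python
import numpy as np

rng = np.random.default_rng(10)
x = rng.normal(size=1000)

y_pos = x + 0.5 * rng.normal(size=1000)    # tends to move with x
y_neg = -x + 0.5 * rng.normal(size=1000)   # tends to move against x

# np.cov returns the 2x2 sample covariance matrix; [0, 1] is Cov(x, y)
cov_pos = np.cov(x, y_pos)[0, 1]
cov_neg = np.cov(x, y_neg)[0, 1]
print(cov_pos, cov_neg)
```
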
-
Autocorrelation
Autocorrelation is when the error terms of different observations are not independent of each other, i.e. they are correlated with each other.
Consider the time series regression
   Yt = β1 X1t + β2 X2t + β3 X3t + … + βk Xkt + ut
but in this case there is some relation between the error terms across observations. [Imagine a shock like the current economic collapse: it has effects in more than one period!]
   E(ut) = 0, Var(ut) = σ²
But: Cov(ut, us) ≠ 0
Thus the error covariances are not zero.
This means that one of the assumptions that makes OLS BLUE does not hold.
Autocorrelation is most likely to occur in a time series framework.
In cross-section data there may be spatial correlation, which is a form of autocorrelation too!
-
Likely causes of Autocorrelation
1. Omitting a variable that ought to be included.
2. Misspecification of the functional form. This is most obvious where a straight line is fitted where a curve should have been used. This would clearly show up in plots of residuals.
3. Errors of measurement in the dependent variable. If the errors are not random, then the error term will pick up any systematic mistakes.
-
The Problem with Autocorrelation
OLS estimators will be inefficient, so no longer BLUE, but they are still unbiased, so we get the right answer for the betas on average!!
"Inefficient" here basically means the estimate is more variable than it needs to be => we are less confident in our estimate than we could be.
The estimated variances of the regression coefficients will be biased and inconsistent, therefore hypothesis testing will no longer be valid. t-statistics will tend to be higher.
R² will tend to be overestimated [i.e. we think we explain more than we actually do!].
-
Detecting Autocorrelation
By observation:
Plot the residuals against time: if the error is random, there should be no pattern!
Plot ût against ût-1: if there is no correlation, there should be no relationship!
The Durbin-Watson test:
Only good for first-order serial correlation.
It can give inconclusive results.
It is not applicable when a lagged dependent variable is included in the regression.
Breusch and Godfrey (1978) developed an alternative test.
[Figure: scatter plot of RESID against PREVRESID, both axes running from -30 to 30.]
-
Breusch-Godfrey Test
This is an example of an LM (Lagrange Multiplier) type test, where only the restricted form of the model is estimated. We then test for a relaxation of these restrictions by applying a formula.
Consider the model
   Yt = β1 X1t + β2 X2t + β3 X3t + … + βk Xkt + ut
where ut = ρ1 ut-1 + ρ2 ut-2 + ρ3 ut-3 + … + ρp ut-p + εt
Combining these two equations gives
   Yt = β1 X1t + β2 X2t + β3 X3t + … + βk Xkt + ρ1 ut-1 + ρ2 ut-2 + ρ3 ut-3 + … + ρp ut-p + εt
Test the following H0 and Ha:
   H0: ρ1 = ρ2 = ρ3 = … = ρp = 0
   Ha: at least one of the ρ's is not zero, thus there is serial correlation.
-
Breusch-Godfrey Test
This two-stage test begins by considering the model
   Yt = β1 X1t + β2 X2t + β3 X3t + … + βk Xkt + ut (1)
1) Estimate model (1) and save the residuals, ût.
2) Run the following auxiliary regression, with the number of lags determined by the order of serial correlation you are willing to test:
   ût = α0 + α1 X2t + α2 X3t + … + αR XRt + αR+1 ût-1 + αR+2 ût-2 + … + αR+p ût-p + vt
-
Breusch-Godfrey Test
The test statistic may be written as an
   LM statistic = (n − p)·R²,
where R² relates to this auxiliary regression and p is the number of residual lags.
The statistic is distributed asymptotically as chi-square (χ²) with p degrees of freedom.
If the LM statistic is bigger than the χ² critical value, then we reject the null hypothesis of no serial correlation and conclude that serial correlation exists.
-
H0: no autocorrelation. Here we have autocorrelation (the Prob. F value in the test output is below the significance level, so we reject H0). [EViews test output omitted from transcript.]
-
Solutions to Autocorrelation
1. Find the cause.
2. Increase the number of observations.
3. Re-specify the model correctly.
4. EViews provides a number of procedures, e.g. Cochrane-Orcutt (a last resort).
5. Most important: it is easy to confuse misspecified dynamics with serial correlation in the errors. In fact, it is best to always start from a general dynamic model and test the restrictions before applying the tests for serial correlation.
[i.e. If this period's Y depends on last period's Y, we should include a lagged term for Y in our model, rather than thinking it's a problem with the errors of the original model!]
-
Assumption 4:
ut ~ N(0, σ²)
Normally distributed error term
-
Testing the Normality Assumption
Recall that we needed to assume normally distributed disturbances for many hypothesis tests to be valid.
Testing for departures from normality: the Bera-Jarque normality test.
Skewness and kurtosis are the (standardised) third and fourth moments of a distribution. The first and second are the mean and variance respectively.
A normal distribution is not skewed and is defined to have a coefficient of kurtosis of 3. Since the kurtosis of the normal distribution is 3, its excess kurtosis (b2 − 3) is zero.
-
Normal versus Skewed Distributions
[Figure: two density sketches, f(x) against x; left panel a symmetric normal distribution, right panel a skewed distribution.]
-
Leptokurtic versus Normal Distribution
[Figure: a leptokurtic density plotted against a normal density; the leptokurtic distribution is more peaked at the mean and has fatter tails.]
-
Testing for Normality
Bera and Jarque formalise this by testing the residuals for normality: testing whether the coefficient of skewness and the coefficient of excess kurtosis are jointly zero.
It can be proved that the coefficients of skewness and kurtosis can be expressed respectively as:
   b1 = E[u³] / (σ²)^(3/2)   and   b2 = E[u⁴] / (σ²)²
The Bera-Jarque test statistic is given by
   W = T [ b1²/6 + (b2 − 3)²/24 ] ~ χ²(2)
We estimate b1 and b2 using the residuals from the OLS regression, û.
This test of normality can be easily performed using EViews.
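The statistic can be computed directly from the formulas above and checked against statsmodels' `jarque_bera`, which uses the same demeaned, biased moment estimators. A sketch on simulated residuals (names illustrative):

```python
import numpy as np
from statsmodels.stats.stattools import jarque_bera

rng = np.random.default_rng(5)
u = rng.normal(size=1000)
u = u - u.mean()   # OLS residuals have mean zero when a constant is included

# Coefficients of skewness and kurtosis, as on the slide
sigma2 = np.mean(u**2)
b1 = np.mean(u**3) / sigma2**1.5
b2 = np.mean(u**4) / sigma2**2

T = len(u)
W = T * (b1**2 / 6 + (b2 - 3)**2 / 24)   # ~ chi2(2) under normality

jb_stat, jb_pvalue, skew, kurt = jarque_bera(u)
print(W, jb_stat)   # the two statistics agree
```
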
-
What do we do if we find evidence of Non-Normality?
It is not obvious what we should do!
We could use a method which does not assume normality, but this is difficult, and what are its properties?
It is often the case that one or two very extreme residuals cause us to reject the normality assumption.
A solution is to use dummy variables.
e.g. say we estimate a monthly model of asset returns from 1980-1990, plot the residuals, and find a particularly large outlier for October 1987:
-
What do we do if we find evidence of Non-Normality? (cont'd)
Create a new variable:
   D87M10t = 1 during October 1987, and zero otherwise.
This effectively knocks out that observation. But we need a theoretical reason for adding dummy variables.
[Figure: plot of the residuals ût against time, showing a large outlier in October 1987.]
-
A test for functional form
-
Adopting the Wrong Functional Form
We have previously assumed that the appropriate functional form is linear.
This may not always be true.
We can formally test this using Ramsey's RESET test, which is a general test for mis-specification of functional form.
1. Essentially the method works by adding higher-order terms of the fitted values (e.g. ŷt², ŷt³, etc.) into an auxiliary regression.
2. Regress ût on powers of the fitted values:
   ût = α0 + α1 ŷt² + α2 ŷt³ + … + αp-1 ŷt^p + vt
3. Obtain R² from this regression. The test statistic is given by T·R² and is distributed as a χ²(p − 1).
4. So if the value of the test statistic is greater than a χ²(p − 1) critical value, then reject the null hypothesis that the functional form was correct.
-
But what do we do if this is the case?
The RESET test gives us no guide as to what a better specification might be.
One possible cause of rejection of the test is if the true model is
   yt = β1 + β2 x2t + β3 x2t² + β4 x2t³ + ut
In this case the remedy is obvious.
Another possibility is to transform the data into logarithms. This will linearise many previously multiplicative models into additive ones:
   yt = A xt^β e^(ut)  <=>  ln yt = α + β ln xt + ut
-
Up to here is covered in the
first 9 chapters of Asteriou
and Hall
-
Intro to Endogeneity
[We won't have time to cover this
in class; it is not on the exam.]
-
Learning outcomes
1. To understand what endogeneity is & why it
is a problem.
2. To have an intuitive understanding of how
we might fix it.
But: we won't cover this in any detail, despite the fact that it is an extremely important topic. Unfortunately we have limited time!
-
Sources of Endogeneity
3 causes of endogeneity:
1. Omitted Variables
2. Measurement Error
3. Simultaneity
I will only focus on endogeneity caused by
omitted variables.
-
The assumption:
When we estimate a model such as:
   Y = α + β1 X1 + β2 X2 + ε
using OLS, we assume that cov(Xi, ε) = 0,
i.e. our included variables (Xi) are not related to any
unobserved variables (ε) that influence the dependent
variable (Y).
The example we will work with today:
   Earnings = α + β1 YrsEd + β2 (Other variables) + ε
-
Assuming cov(X, ε) = 0: We can visualize the case where the assumption holds as follows:
   [Diagram: X → Y and ε → Y, with no link between X and ε.]
Notice that there is no link between X and ε.
If X affects Y, when X changes so should Y.
The effect of X on Y is:
   β̂ = Cov(X, Y) / Var(X)
-
Example:
Comparing the earnings of people with different education levels will give us an estimate of the effect of education on earnings (controlling for gender etc.):
   [Diagram: Education → Earnings, with Experience, Ethnicity, Region etc. also affecting Earnings.]
-
Example 15.4, Wooldridge's book
An extra year of education increases wages by roughly 7.4%.
(Coefficient × 100 is the % effect because Y is logged.)
-
What if our assumption is
incorrect, i.e. cov(X, ε) ≠ 0?
=> Endogeneity
-
Endogeneity
In our example we looked at the effect of education on earnings.
However, many unobserved variables affect both income and education, e.g. the individual's ability:
High-ability people may be more likely to go to college.
High-ability people generally earn more.
Since we do not observe ability, we will think all of the higher earnings are due to education, i.e. we estimate the effect, β, incorrectly!
-
Endogeneity:
Now we have a problem: some variables in ε are related to X as well as to Y => Cov(X, ε) ≠ 0.
A change in ε will change X and Y, but we do not observe ε, so we think that X is causing the change in Y directly!!!
Thus we get the wrong estimate for β:
   plim β̂ = β + corr(X, ε) · (σε / σX)
   [Diagram: ε → Y, ε → X, and X → Y.]
-
Endogeneity:
The same problem in our example: ability (in ε) is related to education (X) as well as to earnings (Y), so Cov(X, ε) ≠ 0 and
   plim β̂ = β + corr(X, ε) · (σε / σX)
   [Diagram: ε → Earnings, ε → Education, and Education → Earnings.]
-
Solutions to endogeneity:
Instrumental Variables (IV)
-
Solution 1:
If we could include the omitted variable (or a proxy) in our regression, this would solve the problem, since we would be controlling for ability.
E.g. we might try to measure ability with an aptitude test and include that in our model.
Problem: often good proxies are not available.
-
Solution 2: Instrumental variable
Suppose we can find a variable Z that is related to X but not to ε (or Y).
Since Z is not related to ε, if we focus on changes in X that are due to Z, we overcome the problem (since these changes in X are not related to ε).
   [Diagram: Z → X → Y; ε → Y, with no link between Z and ε.]
-
Solution 2: Instrumental variable
In our example, we need a variable related to years of education but not related to ability.
Proximity to college may affect the likelihood of going to college, but not your ability!
   [Diagram: Proximity to college → Education → Earnings.]
-
Two stage least squares
First stage:
Regress X on Z (and any other observed variables):
   X = α* + β*1 Z + β*2 (Other variables) + ν
Now predict X => X̂
Key point: since Z is not related to ε, the predicted X̂ can't be related to ε, so it is exogenous.
-
Problem solved:
Second stage: regress Y on X̂. This IV estimate, β̂IV, will be unbiased.
Note, however, that the standard error needs to be corrected, since we reduced the variability of X in stage 1.
-
First stage: regress Education on proximity
to college
-
Second stage: use predicted education in the original regression instead of education.
-
Original OLS (biased)
-
Comparison of Results:
Coefficients:
OLS: 0.074009 => 7.4% increase in wages per year
IV: 0.1322888 => 13.2% increase in wages per year
This difference may be important to an individual deciding whether to go to college!
Note: standard errors are larger in IV than OLS because we lose some explanatory power when we exclude part of X (the part not correlated with Z).
-
Some issues with IV (we won't have time to cover these today!)
-
Weak Instruments
Suppose Z is not strongly correlated with X, e.g. my education level may be related to my cousin's, but probably not very strongly!
In the first-stage regression we will not get a very good prediction for X => so our second-stage estimate won't be good.
   [Diagram: only a weak link from Z to X; X → Y.]
-
Invalid Instruments
Here the instrument is not exogenous, since it is also related to ε.
=> We would still get a biased estimate (in fact it could be even worse than the original estimate).
   [Diagram: Z → X → Y, but ε is linked to Z as well as to Y.]
-
Two Stage Least Squares estimator (2SLS):
Recall the standard OLS estimator is:
   β̂OLS = Cov(X, Y) / Var(X)
The 2SLS estimator is:
   β̂2SLS = Cov(Z, Y) / Cov(Z, X)
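Both estimators can be computed directly from sample covariances; with simulated endogenous data the IV formula recovers the true coefficient while the OLS formula does not (all names and numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 10000
ability = rng.normal(size=n)   # unobserved, ends up in the error term
z = rng.normal(size=n)         # instrument
x = z + ability + rng.normal(size=n)
y = 2.0 * x + 2.0 * ability + rng.normal(size=n)   # true beta = 2

c_xy = np.cov(x, y)   # 2x2 sample covariance matrix
c_zy = np.cov(z, y)
c_zx = np.cov(z, x)

beta_ols = c_xy[0, 1] / c_xy[0, 0]   # Cov(X,Y) / Var(X): biased here
beta_iv = c_zy[0, 1] / c_zx[0, 1]    # Cov(Z,Y) / Cov(Z,X): consistent

print(beta_ols, beta_iv)
```
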
-
Exercise:
Suppose that I am interested in the effect of hours of study (X) on exam results (Y).
1. Are there any unobserved variables that might influence both X and Y?
2. Can you think of a proxy variable for them?
3. Suppose we have data on the following variables; would any of these be valid instruments for hours of study? Why, or why not?
-
Sample questions on Blackboard: try to do the ones with the *.