Regression Analysis 07
![Page 1: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/1.jpg)
1
Conceptualizing Heteroskedasticity & Autocorrelation
Quantitative Methods II
Lecture 18
![Page 2: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/2.jpg)
2
OLS Assumptions about Error Variance and Covariance
Just finished our discussion of Omitted Variable Bias
Violates the assumption E(u)=0
This was only one of the assumptions we made about errors to show that OLS is BLUE
Also assumed cov(u)=E(uu’)=σ²In
That is, we assumed u ~ (0, σ²In)
Remember, the formula for covariance:
cov(A,B) = E[(A − μA)(B − μB)]
![Page 3: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/3.jpg)
3
What Should uu’ Look Like?
Note uu’ is an n×n matrix
Different from u’u – a scalar sum of squared errors
Variances of u1…un on the diagonal
Covariances of u1u2, u1u3… are off the diagonal
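The uu’ versus u’u distinction is easy to check numerically. A minimal numpy sketch (illustration only, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
u = rng.normal(0, 2.0, size=(n, 1))  # error vector with sigma = 2

outer = u @ u.T           # uu': an n x n matrix
inner = (u.T @ u).item()  # u'u: a scalar, the sum of squared errors

# Averaged over many draws, uu' recovers E(uu') = sigma^2 * I_n:
draws = rng.normal(0, 2.0, size=(100_000, n))
E_uuT = draws.T @ draws / len(draws)
# diagonal entries land near sigma^2 = 4, off-diagonal entries near 0
```

The averaged matrix is the "well behaved" case the next slide pictures: constant variances on the diagonal, zeros off it.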
![Page 4: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/4.jpg)
4
A Well Behaved uu’ Matrix
![Page 5: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/5.jpg)
5
Violations of E(uu’)=σ²In
Two basic reasons that E(uu’) may not be equal to σ²In
Diagonal elements of uu’ may not be constant
Off-diagonal elements of uu’ may not be zero
![Page 6: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/6.jpg)
6
Problematic Population Error Variances and Covariances
Problem of non-constant error variances is known as HETEROSKEDASTICITY
Problem of non-zero error covariances is known as AUTOCORRELATION
These are different problems and generally occur with different types of data.
Nevertheless, the implications for OLS are the same.
![Page 7: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/7.jpg)
7
The Causes of Heteroskedasticity
Often a problem in cross-sectional data – especially aggregate data
Accuracy of measures may differ across units
Data availability or number of observations within aggregate observations
If error is proportional to decision unit, then variance related to unit size (example: GDP)
![Page 8: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/8.jpg)
8
[Figure: Demonstration of the Homoskedasticity Assumption. Predicted line drawn under homoskedasticity, with conditional densities f(y|x) shown at x1, x2, x3, x4. Variance across values of x is constant.]
![Page 9: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/9.jpg)
9
[Figure: Demonstration of the Homoskedasticity Assumption. Predicted line drawn under heteroskedasticity, with conditional densities f(y|x) shown at x1, x2, x3, x4. Variance differs across values of x.]
![Page 10: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/10.jpg)
10
[Figure: Scatterplot of Progress on Difficult Economic Reforms (20–80) against Distance from Nearest European Union Capital (0–3000 km), with fitted values and 95% CI; points labeled by country code (ALB, ARM, AZE, BLR, BUL, CRO, CZR, EST, FYRMAC, GRG, HUN, KYR, KZK, LAT, LIT, MLD, POL, RUM, RUS, SLO, SLV, TAJ, TKM, UKR, UZB).]
![Page 11: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/11.jpg)
11
Looking for Heteroskedasticity
In a classic case, a plot of residuals against the dependent variable or another variable will often produce a fan shape
[Figure: Example scatterplot showing the fan shape; vertical axis runs 0–180, horizontal axis 0–150.]
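The fan shape is easy to reproduce with simulated data whose error spread grows with x (a hypothetical illustration, not the data behind the plot):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = np.sort(rng.uniform(1, 10, n))
u = rng.normal(0, 1, n) * x          # error sd proportional to x: heteroskedastic
y = 2 + 3 * x + u

# OLS fit and residuals
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Plotted against x (or the fitted values), resid fans out: the spread in
# the high-x half of the sample is larger than in the low-x half.
spread_low, spread_high = resid[: n // 2].std(), resid[n // 2 :].std()
```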
![Page 12: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/12.jpg)
12
Sometimes the variance is different across different levels of the dependent variable.
[Figure: Residuals (−20 to 20) plotted against fitted values of Progress (30–70); points labeled by country code.]
![Page 13: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/13.jpg)
13
Causes of Autocorrelation
Often a problem in time-series data
Spatial autocorrelation is possible and is more difficult to address
May be a result of measurement errors correlated over time
Any excluded x’s that cause y but are uncorrelated with our x’s and are correlated over time
Wrong functional form
![Page 14: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/14.jpg)
14
Looking for Autocorrelation
Plotting the residuals over time will often show an oscillating pattern
Correlation of ut & ut−1 = .85
[Figure: Residuals plotted against time (1–148), ranging from −33.83 to 26.04.]
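A residual series like this can be generated from an AR(1) error process (a hypothetical simulation; rho = .85 matches the correlation quoted on the slide):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 148
rho = 0.85
e = rng.normal(0, 1, T)

u = np.zeros(T)
for t in range(1, T):
    u[t] = rho * u[t - 1] + e[t]   # each error carries over 85% of the last one

# sample correlation of u_t with u_{t-1}
r = np.corrcoef(u[1:], u[:-1])[0, 1]
```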
![Page 15: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/15.jpg)
15
Looking for Autocorrelation
As compared to a non-autocorrelated model
[Figure: Residuals plotted against observation number (1–1150), ranging from −2.68 to 3.31, with no visible pattern.]
![Page 16: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/16.jpg)
16
How does it impact our results?
Does not cause bias or inconsistency in OLS estimators (βhat).
R-squared also unaffected.
The variance of βhat is biased without the homoskedasticity assumption.
T-statistics become invalid and the problem is not resolved by larger sample sizes.
Similarly, F-tests are invalid.
Moreover, if Var(u|X) is not constant, OLS is no longer BLUE. It is neither best nor efficient.
What can we do??
![Page 17: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/17.jpg)
17
OLS if E(uu’) is not σ²In
If errors are heteroskedastic or autocorrelated, then our OLS model is
Y=Xβ+u
E(u)=0
Cov(u)=E(uu’)=W
Where W is an unknown n×n matrix
u ~ (0,W)
![Page 18: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/18.jpg)
18
OLS is Still Unbiased if E(uu’) is not σ²In
We don’t need uu’ for unbiasedness
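The slide's algebra appeared as an image; the standard derivation behind the claim is:

```latex
\hat{\beta} = (X'X)^{-1}X'y
            = (X'X)^{-1}X'(X\beta + u)
            = \beta + (X'X)^{-1}X'u
```

Taking expectations, E(βhat) = β + (X'X)⁻¹X'E(u) = β. Only E(u)=0 is used; E(uu') never enters, so OLS stays unbiased under heteroskedasticity or autocorrelation.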
![Page 19: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/19.jpg)
19
But OLS is not Best if E(uu’) is not σ²In
Remember from our derivation of the variance of the βhats
Now, we square the distances to get the variance of the βhats around the true βs
![Page 20: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/20.jpg)
20
Comparing the Variance of βhat
Thus if E(uu’) is not σ²In then:
Recall CLM assumed E(uu’) = σ²In and thus estimated cov(βhat) as:
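The two covariance expressions being compared, reconstructed in standard notation (the slide's formulas were images):

```latex
% CLM case, E(uu') = \sigma^2 I_n:
\operatorname{cov}(\hat{\beta}) = (X'X)^{-1}X'\,\sigma^2 I_n\,X(X'X)^{-1}
                                = \sigma^2 (X'X)^{-1}
% General case, E(uu') = W:
\operatorname{cov}(\hat{\beta}) = (X'X)^{-1}X'WX(X'X)^{-1}
```

The conventional estimator uses only the first expression, so when E(uu') = W it reports the wrong variances.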
![Page 21: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/21.jpg)
21
Results of Heteroskedasticity and Autocorrelation
Thus if we unwittingly use OLS when we have heteroskedastic or autocorrelated errors, our estimates will have the wrong error variances
Thus our t-tests will also be wrong
Direction of bias depends on nature of the covariances and changing variances
![Page 22: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/22.jpg)
22
What is Generalized Least Squares (GLS)?
One solution to both heteroskedasticity and autocorrelation is GLS
GLS is like OLS, but we provide the estimator with information about the variance and covariance of the errors
In practice the nature of this information will differ – specific applications of GLS will differ for heteroskedasticity and autocorrelation
![Page 23: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/23.jpg)
23
From OLS to GLS
We began with the problem that E(uu’)=W instead of E(uu’)=σ²In
Where W is an unknown matrix
Thus we need to define a matrix of information Ω
Such that E(uu’)=W=Ωσ²In
The Ω matrix summarizes the pattern of variances and covariances among the errors
![Page 24: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/24.jpg)
24
From OLS to GLS
In the case of heteroskedasticity, we give information in Ω about the variance of the errors
In the case of autocorrelation, we give information in Ω about the covariance of the errors
To counterbalance the impact of the variances and covariances in Ω, we multiply our OLS estimator by Ω⁻¹
![Page 25: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/25.jpg)
25
From OLS to GLS
We do this because:
If E(uu’)=W=Ωσ²In
then WΩ⁻¹ = Ωσ²InΩ⁻¹ = σ²In
Thus our new GLS estimator is:
This estimator is unbiased and has a variance:
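In standard notation (reconstructed, since the slide's formulas were images), the GLS estimator and its variance are:

```latex
\hat{\beta}_{GLS} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y,
\qquad
\operatorname{cov}(\hat{\beta}_{GLS}) = \sigma^2 (X'\Omega^{-1}X)^{-1}
```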
![Page 26: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/26.jpg)
26
What IS GLS?
Conceptually what GLS is doing is weighting the data
Notice we are multiplying X and y by the inverse of the error covariance Ω
We weight the data to counterbalance the variance and covariance of the errors
![Page 27: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/27.jpg)
27
GLS, Heteroskedasticity and Autocorrelation
For heteroskedasticity, we weight by the inverse of the variable associated with the variance of the errors
For autocorrelation, we weight by the inverse of the covariance among errors
This is also referred to as “weighted regression”
![Page 28: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/28.jpg)
28
The Problem of Heteroskedasticity
Heteroskedasticity is one of two possible violations of our assumption E(uu’)=σ²In
Specifically, it is a violation of the assumption of constant error variance
If errors are heteroskedastic, then coefficients are unbiased, but standard errors and t-tests are wrong.
![Page 29: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/29.jpg)
29
How Do We Diagnose Heteroskedasticity?
There are numerous possible tests for heteroskedasticity
We have used two: the White test (whitetst) and hettest.
All of them consist of taking residuals from our equation and looking for patterns in variances.
Thus no single test is definitive, since we can’t look everywhere.
As you have noticed, sometimes hettest and whitetst conflict.
![Page 30: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/30.jpg)
30
Heteroskedasticity TestsHeteroskedasticity Tests
Informal MethodsInformal Methods Graph the data and look for patterns!Graph the data and look for patterns! The Residual versus Fitted plot is an The Residual versus Fitted plot is an
excellent one.excellent one. Look for differences in variance across Look for differences in variance across
the fitted values, as we did above.the fitted values, as we did above.
![Page 31: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/31.jpg)
31
Heteroskedasticity: TestsHeteroskedasticity: Tests
Goldfeld-Quandt testGoldfeld-Quandt test Sort the Sort the nn cases by the x that you think cases by the x that you think
is correlated with uis correlated with uii22..
Drop a section of Drop a section of cc cases out of the cases out of the middlemiddle(one-fifth is a reasonable number).(one-fifth is a reasonable number).
Run separate regressions on both upper Run separate regressions on both upper and lower samples.and lower samples.
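These steps can be sketched in Python (an illustrative implementation under simulated data, not the slides' code; the `goldfeld_quandt` helper and its default drop fraction are my own):

```python
import numpy as np

def goldfeld_quandt(x, y, drop_frac=0.2):
    """F-ratio of error variances from regressions on the low-x and high-x samples."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    n = len(x)
    c = int(n * drop_frac)                           # cases dropped from the middle
    lo, hi = slice(0, (n - c) // 2), slice(n - (n - c) // 2, n)

    def ssr_and_df(xs, ys):
        X = np.column_stack([np.ones(len(xs)), xs])
        beta, *_ = np.linalg.lstsq(X, ys, rcond=None)
        resid = ys - X @ beta
        return resid @ resid, len(xs) - X.shape[1]

    ssr1, df1 = ssr_and_df(x[lo], y[lo])
    ssr2, df2 = ssr_and_df(x[hi], y[hi])
    return (ssr2 / df2) / (ssr1 / df1)               # compare to F(df2, df1)

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(1, 10, 100))
y = 1 + 2 * x + rng.normal(0, 1, 100) * x            # error sd grows with x
F = goldfeld_quandt(x, y)                            # large F suggests heteroskedasticity
```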
![Page 32: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/32.jpg)
32
Heteroskedasticity TestsHeteroskedasticity Tests
Goldfeld-Quandt test (cont.)Goldfeld-Quandt test (cont.) Difference in variance of the errors in Difference in variance of the errors in
the two regressions has an F the two regressions has an F distributiondistribution
nn11-n-n11 is the degrees of freedom for the is the degrees of freedom for the first regression and nfirst regression and n22-k-k22 is the degrees is the degrees of freedom for the secondof freedom for the second
![Page 33: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/33.jpg)
33
Heteroskedasticity TestsHeteroskedasticity Tests
Breusch-Pagan Test (Wooldridge, Breusch-Pagan Test (Wooldridge, 281).281).
Useful if Heteroskedasticity depends Useful if Heteroskedasticity depends on more than one variableon more than one variable Estimate model with OLSEstimate model with OLS Obtain the squared residualsObtain the squared residuals Estimate the equation:Estimate the equation:
![Page 34: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/34.jpg)
34
Heteroskedasticity: Tests
Where z1–zk are the variables that are possible sources of heteroskedasticity.
The ratio of the explained sum of squares to the variance of the residuals tells us if this model is getting any purchase on the size of the errors
It turns out that:
Where k = the number of z variables
![Page 35: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/35.jpg)
35
White Test (WHITETST)
Estimate the model using OLS. Obtain the OLS residuals and the predicted values. Compute the squared residuals and squared predicted values.
Run the equation:
Keep the R² from this regression.
Form the F-statistic and compute the p-value.
Stata uses the χ² distribution which resembles the F distribution.
Look for a significant p-value.
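A sketch of the special-case version of this test (regressing û² on ŷ and ŷ²); this is an illustrative Python version on simulated data, not the Stata whitetst command, which may compute a different variant:

```python
import numpy as np

def white_test(X, y):
    """Special-case White test: regress u^2 on yhat and yhat^2; LM = n * R^2."""
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    yhat = X @ beta
    u2 = (y - yhat) ** 2
    Z = np.column_stack([np.ones(n), yhat, yhat ** 2])
    g, *_ = np.linalg.lstsq(Z, u2, rcond=None)
    r2 = 1 - ((u2 - Z @ g) ** 2).sum() / ((u2 - u2.mean()) ** 2).sum()
    return n * r2   # compare to chi2 with 2 df (5% critical value about 5.99)

rng = np.random.default_rng(4)
n = 200
x = rng.uniform(1, 10, n)
X = np.column_stack([np.ones(n), x])
y = 1 + 2 * x + rng.normal(0, 1, n) * x   # heteroskedastic errors
lm = white_test(X, y)
```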
![Page 36: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/36.jpg)
36
Problems with tests of Problems with tests of HeteroskedasticityHeteroskedasticity
Tests rely on the first four assumptions of Tests rely on the first four assumptions of the classical linear model being true!the classical linear model being true!
If assumption 4 is violated. That is, the If assumption 4 is violated. That is, the zero conditional mean assumption, then a zero conditional mean assumption, then a test for heteroskedasticity may reject the test for heteroskedasticity may reject the null hypothesis even if Var(y|X) is null hypothesis even if Var(y|X) is constant.constant.
This is true if our functional form is This is true if our functional form is specified incorrectly (omitting a quadratic specified incorrectly (omitting a quadratic term or specifying a log instead of a level).term or specifying a log instead of a level).
![Page 37: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/37.jpg)
37
If Heteroskedasticity is discovered…
The solution we have learned thus far, and the easiest solution overall, is to use the heteroskedasticity-robust standard error.
In Stata, this is the robust option added after the regression command.
![Page 38: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/38.jpg)
38
Remedying Heteroskedasticity: Robust Standard Errors
By hand, we use the formula
The square root of this formula is the heteroskedasticity-robust standard error.
t-statistics are calculated using the new standard error.
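The slide's formula was an image; in matrix form the robust (White/HC0) covariance is (X'X)⁻¹ X'diag(û²)X (X'X)⁻¹, and a Python sketch (on hypothetical simulated data) is:

```python
import numpy as np

def hc0_standard_errors(X, y):
    """OLS coefficients plus White/HC0 heteroskedasticity-robust standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ beta                       # OLS residuals
    XtX_inv = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * (u ** 2)[:, None])   # X' diag(u^2) X
    cov = XtX_inv @ meat @ XtX_inv         # the "sandwich"
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(6)
n = 200
x = rng.uniform(1, 10, n)
X = np.column_stack([np.ones(n), x])
y = 1 + 2 * x + rng.normal(0, 1, n) * x   # heteroskedastic errors
beta, se = hc0_standard_errors(X, y)
```

Robust t-statistics are then just beta / se, as the slide says.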
![Page 39: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/39.jpg)
39
Remedying Heteroskedasticity: GLS, WLS, FGLS
Generalized Least Squares
Adds the Ω⁻¹ matrix to our OLS estimator to eliminate the pattern of error variances and covariances
A.K.A. Weighted Least Squares
An estimator used to adjust for a known form of heteroskedasticity where each squared residual is weighted by the inverse of the estimated variance of the error.
Rather than explicitly creating Ω⁻¹ we can weight the data and perform OLS on the transformed variables.
Feasible Generalized Least Squares
A type of WLS where the variance or correlation parameters are unknown and therefore must first be estimated.
![Page 40: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/40.jpg)
40
Before robust, statisticians used Generalized or Weighted Least Squares
Recall our GLS estimator:
We can estimate this equation by weighting our independent and dependent variables and then doing OLS
But what is the correct weight?
![Page 41: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/41.jpg)
41
GLS, WLS and Heteroskedasticity
Note that we have X’X and X’y in this equation
Thus to get the appropriate weight for the X’s and y’s we need to define a new matrix F
Such that F’F is an n×n matrix where: F’F = Ω⁻¹
![Page 42: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/42.jpg)
42
GLS, WLS and Heteroskedasticity
Then we can weight the x’s and y by F such that: X* = FX and y* = Fy
Now we can see that:
Thus performing OLS on the transformed data IS the WLS or FGLS estimator
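The equivalence is easy to verify numerically. A minimal sketch, assuming a known diagonal Ω = diag(h) (hypothetical simulated data):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
h = rng.uniform(1, 5, n)                 # known variance factors: var(u_i) = sigma^2 * h_i
x = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x])
y = 1 + 2 * x + rng.normal(0, 1, n) * np.sqrt(h)

# GLS directly: beta = (X' Om^-1 X)^-1 X' Om^-1 y, with Om = diag(h)
Om_inv = np.diag(1 / h)
beta_gls = np.linalg.solve(X.T @ Om_inv @ X, X.T @ Om_inv @ y)

# Same estimator by weighting: F = diag(1/sqrt(h)), so F'F = Om^-1
F = np.diag(1 / np.sqrt(h))
Xs, ys = F @ X, F @ y                    # X* = FX, y* = Fy
beta_wls, *_ = np.linalg.lstsq(Xs, ys, rcond=None)

assert np.allclose(beta_gls, beta_wls)   # OLS on the transformed data IS GLS
```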
![Page 43: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/43.jpg)
43
How Do We Choose the Weight?
Now our only remaining job is to figure out what F should be
Recall if there is a heteroskedasticity problem, then:
![Page 44: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/44.jpg)
44
Determining F
Thus:
![Page 45: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/45.jpg)
45
Determining F
And since F’F = Ω⁻¹
![Page 46: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/46.jpg)
46
Identifying our Weights
That is, if we believe that the variance of the errors depends on some variable h…
…then we create our estimator by weighting our x and y variables by the square root of the inverse of that variable (WLS)
If the error is unknown, estimate by regressing the squared residuals on the independent variables and use the square root of the inverse of the predicted value (h-hat) as the weight.
Then we perform OLS on the equation:
![Page 47: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/47.jpg)
47
FGLS: An Example
I created a dataset where:
Y = 1 + 2x1 − 3x2 + u
Where u = h_hat*u
And u ~ N(0,25)
x1 & x2 are uniform and uncorrelated
h_hat is uniform and uncorrelated with y or the x’s
Thus, I will need to re-weight by h_hat
![Page 48: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/48.jpg)
48
FGLS Properties
FGLS is no longer unbiased, but it is consistent and asymptotically efficient.
![Page 49: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/49.jpg)
49
FGLS: An ExampleFGLS: An Example
reg y x1 x2
Source | SS df MS Number of obs = 100
---------+------------------------------ F( 2, 97) = 16.31
Model | 29489.1875 2 14744.5937 Prob > F = 0.0000
Residual | 87702.0026 97 904.144357 R-squared = 0.2516
---------+------------------------------ Adj R-squared = 0.2362
Total | 117191.19 99 1183.74939 Root MSE = 30.069
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
x1 | 3.406085 1.045157 3.259 0.002 1.331737 5.480433
x2 | -2.209726 .5262174 -4.199 0.000 -3.254122 -1.16533
_cons | -18.47556 8.604419 -2.147 0.034 -35.55295 -1.398172
------------------------------------------------------------------------------
![Page 50: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/50.jpg)
50
Tests are Significant
. whitetst
White's general test statistic : 1.180962 Chi-sq( 2) P-value = .005
. bpagan x1 x2
Breusch-Pagan LM statistic: 5.175019 Chi-sq( 1) P-value = .0229
![Page 51: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/51.jpg)
51
FGLS in STATA: Giving it the Weight
reg y x1 x2 [aweight=1/h_hat]
(sum of wgt is 4.9247e+001)
Source | SS df MS Number of obs = 100
---------+------------------------------ F( 2, 97) = 44.53
Model | 26364.7129 2 13182.3564 Prob > F = 0.0000
Residual | 28716.157 97 296.042856 R-squared = 0.4787
---------+------------------------------ Adj R-squared = 0.4679
Total | 55080.8698 99 556.372423 Root MSE = 17.206
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
x1 | 2.35464 .7014901 3.357 0.001 .9623766 3.746904
x2 | -2.707453 .3307317 -8.186 0.000 -3.363863 -2.051042
_cons | -4.079022 5.515378 -0.740 0.461 -15.02552 6.867476
------------------------------------------------------------------------------
![Page 52: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/52.jpg)
52
FGLS “By Hand”
reg yhhat x1hhat x2hhat weight, noc
Source | SS df MS Number of obs = 100
---------+------------------------------ F( 3, 97) = 75.54
Model | 33037.8848 3 11012.6283 Prob > F = 0.0000
Residual | 14141.7508 97 145.791245 R-squared = 0.7003
---------+------------------------------ Adj R-squared = 0.6910
Total | 47179.6355 100 471.796355 Root MSE = 12.074
------------------------------------------------------------------------------
yhhat | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
x1hhat | 2.35464 .7014901 3.357 0.001 .9623766 3.746904
x2hhat | -2.707453 .3307317 -8.186 0.000 -3.363863 -2.051042
weight | -4.079023 5.515378 -0.740 0.461 -15.02552 6.867476
------------------------------------------------------------------------------
![Page 53: Regression Analysis 07](https://reader033.vdocument.in/reader033/viewer/2022051513/54649be7af795969338b4a1f/html5/thumbnails/53.jpg)
53
Tests Now Not Significant
. whitetst
White's general test statistic : 1.180962 Chi-sq( 2) P-value = .589
. bpagan x1 x2
Breusch-Pagan LM statistic: 5.175019 Chi-sq( 1) P-value = .229