Leave-out Estimation of Variance Components
Patrick Kline¹, Raffaele Saggio², Mikkel Sølvsten³
¹Department of Economics, University of California, Berkeley
²Department of Economics, University of British Columbia
³Department of Economics, University of Wisconsin-Madison
NBER Labor Studies, July 2018
Mo’ data, mo’ problems
Overview
- As our data grow so does the complexity of our models
- Classic tool: ANOVA (Fisher, 1925) provides a low-dimensional summary of heavily parameterized models in terms of “variance components”
- Along with a framework for testing large numbers of linear restrictions (F-test)
- Extensions: Hierarchical Linear Models (HLM), Multi-way Fixed Effect Models
Partying like it’s 1929..
Recent applications of two-way FE (AKM) models to wage data: Card, Heining, Kline (2013); Song, Price, Guvenen, Bloom, von Wachter (2015); Card, Cardoso, Kline (2016); Macis and Schivardi (2016); Lavetti and Schmutte (2016); Sorkin (2018); Lachowska, Mas, Woodbury (2018).

Related applications involving ANOVA, HLM, and/or Multi-way FE: Graham (2008); Chetty, Friedman, Hilger, Saez, Schanzenbach, Yagan (2011); Arcidiacono, Foster, Goodpaster, Kinsler (2012); Chetty, Friedman, Rockoff (2014); Finkelstein, Gentzkow, Williams (2016); Silver (2016); Angrist, Hull, Pathak, Walters (2017); Best, Hjort, Szakonyi (2017); Chetty and Hendren (2018); Altonji and Mansfield (2018).
Today
- Extend the classic toolkit to develop a new estimator of quadratic forms that accommodates heteroscedasticity
- Develop a feasible inference procedure that adapts to different data designs (including cases where variance components are weakly identified)
- Application: Two-way fixed effects on weakly connected network of firms
Framework
Consider a linear model
$$y_i = x_i'\beta + \varepsilon_i \qquad (i = 1, \dots, n),$$
with the following features:
- Many non-random regressors ($\dim(x_i) = k \propto n$)
- Potentially heteroscedastic mean-zero error terms ($\mathbb{E}[\varepsilon_i^2] = \sigma_i^2$)

Object of interest is $\theta = \beta' A \beta$ where $A$ is known and has rank $r$.
Motivating Example: AKM
Example I (Two-way fixed effects, AKM)
Our leading application is
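$$\text{log-wage}_{gt} = \alpha_g + \psi_{j(g,t)} + x_{gt}'\delta + \varepsilon_{gt} \qquad (g = 1, \dots, N,\ t = 1, \dots, T_g),$$
where $j(\cdot,\cdot)$ assigns each employee to one of $J + 1$ employers in each period.

Objects of interest are $\sigma_\alpha^2$, $\sigma_\psi^2$, and $\sigma_{\alpha,\psi}$ where, e.g.,
$$\sigma_\psi^2 = \frac{1}{n}\sum_{g=1}^{N}\sum_{t=1}^{T_g}\left(\psi_{j(g,t)} - \bar\psi\right)^2, \qquad \bar\psi = \frac{1}{n}\sum_{g=1}^{N}\sum_{t=1}^{T_g}\psi_{j(g,t)}.$$
- $\sigma_\psi^2 = \beta' A \beta$ where the rank of $A$ is $J$ (often on the order of 1M!).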
- Dimensionality presents substantial obstacles to estimation and inference
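
A toy illustration of how the firm-effect variance can be written as a quadratic form may help here. The sketch below is only an assumption-laden example: the sizes `N`, `T`, `J`, the random firm assignment, the normalization of the reference firm's effect to zero, and the omission of the covariates are all hypothetical choices made for illustration, not the design used in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, J = 6, 3, 3                      # toy sizes: 6 workers, 3 periods, J + 1 = 4 firms
n = N * T                              # number of person-period observations

# Hypothetical assignment j(g, t) of each observation to a firm in {0, ..., J}.
firm = rng.integers(0, J + 1, size=n)
firm[: J + 1] = np.arange(J + 1)       # make sure every firm appears at least once

# Firm dummies, with firm 0 as the omitted reference (its effect is normalized to 0).
F = (firm[:, None] == np.arange(1, J + 1)[None, :]).astype(float)

alpha = rng.normal(size=N)             # worker effects (not part of sigma_psi^2)
psi = rng.normal(size=J)               # firm effects of firms 1, ..., J
beta = np.concatenate([alpha, psi])    # stacked coefficient vector

# A picks out the firm block and applies the centering matrix I - 11'/n.
center = np.eye(n) - np.ones((n, n)) / n
A = np.zeros((N + J, N + J))
A[N:, N:] = F.T @ center @ F / n

theta = beta @ A @ beta                # quadratic form beta' A beta
direct = np.var(F @ psi)               # observation-weighted variance of psi_{j(g,t)}
rank_A = np.linalg.matrix_rank(A)
print(theta, direct, rank_A)
```

In this toy design the quadratic form reproduces the observation-weighted variance of the firm effects, and `A` has rank `J` because one firm serves as the reference category.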
Literature Model and Estimator Consistency Distribution Theory Application
Outline
Literature
Model and Estimator
Consistency
Distribution Theory
Application
Related methods / theoretical results
Variance Components ($R^2$, ANOVA, HLM, Two-way FEs): Wright (1921); Fisher
(1925); Theil (1961); Akritas and Papadatos (2004); Akritas and Wang (2011); Dicker
(2014); Andrews, Gill, Schank, Upward (2008); Verdier (2016); Jochmans and Weidner
(2016); Bonhomme, Lamadon, Manresa (2017); Borovičková and Shimer (2017).
Leave-out or cross-fitting: Hahn and Newey (2004); Dhaene and Jochmans (2015);
Phillips and Hale (1977); Powell, Stock, Stoker (1989); Angrist, Imbens, Krueger (1999);
Hausman et al. (2012); Kolesár (2013); Newey and Robins (2018).
Inference with heteroskedasticity and/or many regressors: Anatolyev (2012);
Karoui and Purdom (2016); Lei, Bickel, Karoui (2016); Cattaneo, Jansson, Newey (2017).
Inference in non-standard problems: Staiger and Stock (1997); Andrews and Cheng
(2012); Elliott, Müller, Watson (2015); Andrews and Mikusheva (2016).
Model
Linear regression
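$$y_i = x_i'\beta + \varepsilon_i \qquad (i = 1, \dots, n),$$
with
- $x_i \in \mathbb{R}^k$ non-random and $S_{xx} = \sum_{i=1}^{n} x_i x_i'$ of full rank ($k \le n$),
- $\{\varepsilon_i\}_{i=1}^{n}$ mutually independent, $\mathbb{E}[\varepsilon_i] = 0$ and $\mathbb{E}[\varepsilon_i^2] = \sigma_i^2$,
- $\max_i P_{ii} < 1$ where $P_{ii} = x_i' S_{xx}^{-1} x_i$ is the $i$'th leverage.

Object of interest: $\theta = \beta' A \beta$ where $A$ is known, non-random, and symmetric with rank $r$.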
Limits are taken as n→∞
Linear regression
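$$y_{i,n} = x_{i,n}'\beta_n + \varepsilon_{i,n} \qquad (i = 1, \dots, n),$$
with
- $x_{i,n} \in \mathbb{R}^{k_n}$ non-random and $S_{xx,n} = \sum_{i=1}^{n} x_{i,n} x_{i,n}'$ of full rank ($k_n \le n$),
- $\{\varepsilon_{i,n}\}_{i=1}^{n}$ mutually independent, $\mathbb{E}[\varepsilon_{i,n}] = 0$ and $\mathbb{E}[\varepsilon_{i,n}^2] = \sigma_{i,n}^2$,
- $\max_i P_{ii,n} < 1$ where $P_{ii,n} = x_{i,n}' S_{xx,n}^{-1} x_{i,n}$ is the $i$'th leverage.

Object of interest: $\theta_n = \beta_n' A_n \beta_n$ where $A_n$ is known, non-random, and symmetric with rank $r_n$.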
The problem w/ plugging in..
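Sampling variability in $\hat\beta$ generates bias in the plug-in estimator $\hat\theta_{PI} = \hat\beta' A \hat\beta$:
$$\mathbb{E}[\hat\theta_{PI} - \theta] = \operatorname{trace}\left(A\,\mathbb{V}[\hat\beta]\right) = \sum_{i=1}^{n} B_{ii}\sigma_i^2 \qquad \text{for } B_{ii} = x_i' S_{xx}^{-1} A S_{xx}^{-1} x_i.$$
- $B_{ii}$ closely related to leverage $P_{ii}$
- Special case (ESS): $A = S_{xx} \Rightarrow B_{ii} = P_{ii}$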
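
The bias formula can be checked numerically without any simulation, since both sides are deterministic functions of the design and the error variances. The following is a minimal sketch under assumed inputs: the design `X`, the variances `sigma2`, and the matrix `A` are randomly generated placeholders, not data from the talk.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 30, 4
X = rng.normal(size=(n, k))                 # hypothetical non-random design
sigma2 = rng.uniform(0.5, 2.0, size=n)      # hypothetical heteroscedastic variances
G = rng.normal(size=(k, k))
A = G @ G.T                                 # an arbitrary symmetric A

Sxx = X.T @ X
Sinv = np.linalg.inv(Sxx)

# Exact variance of the OLS estimator under independent heteroscedastic errors.
V_beta = Sinv @ (X.T * sigma2) @ X @ Sinv
bias_trace = np.trace(A @ V_beta)

# B_ii = x_i' Sxx^{-1} A Sxx^{-1} x_i for every observation i.
B_diag = np.einsum("ij,jk,ik->i", X, Sinv @ A @ Sinv, X)
bias_sum = B_diag @ sigma2

# Special case A = Sxx: B_ii collapses to the leverage P_ii.
P_diag = np.einsum("ij,jk,ik->i", X, Sinv, X)
B_diag_ess = np.einsum("ij,jk,ik->i", X, Sinv @ Sxx @ Sinv, X)
print(bias_trace, bias_sum)
```

Because the leverages sum to `k`, the special case also shows why the bias grows with the number of regressors.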
Estimating the bias
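The plug-in estimator $\hat\theta_{PI} = \hat\beta' A \hat\beta$ has a bias of
$$\operatorname{trace}\left(A\,\mathbb{V}[\hat\beta]\right) = \sum_{i=1}^{n} B_{ii}\sigma_i^2 \qquad \text{where } B_{ii} = x_i' S_{xx}^{-1} A S_{xx}^{-1} x_i.$$
Basic insight: an unbiased “cross-fit” estimator of $\sigma_i^2$ is
$$\hat\sigma_i^2 = y_i\left(y_i - x_i'\hat\beta_{-i}\right) = \left(\varepsilon_i + x_i'\beta\right)\left(\varepsilon_i + x_i'(\beta - \hat\beta_{-i})\right),$$
where $\hat\beta_{-i} = \left(\sum_{\ell \neq i} x_\ell x_\ell'\right)^{-1}\sum_{\ell \neq i} x_\ell y_\ell$.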
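
A small sketch of the cross-fit variance estimator, computed two ways, is given here as an illustration. The brute-force version refits OLS without observation `i`; the shortcut uses the standard leave-one-out identity $y_i - x_i'\hat\beta_{-i} = (y_i - x_i'\hat\beta)/(1 - P_{ii})$. The function names and the simulated data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 25, 3
X = rng.normal(size=(n, k))
beta = np.array([1.0, -0.5, 0.25])                        # hypothetical coefficients
y = X @ beta + rng.normal(size=n) * (0.5 + np.abs(X[:, 0]))  # heteroscedastic errors


def sigma2_brute(X, y):
    """Cross-fit estimates y_i (y_i - x_i' beta_{-i}), refitting OLS n times."""
    n = len(y)
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        b_minus_i = np.linalg.solve(X[keep].T @ X[keep], X[keep].T @ y[keep])
        out[i] = y[i] * (y[i] - X[i] @ b_minus_i)
    return out


def sigma2_shortcut(X, y):
    """Same estimates from a single OLS fit and the leverages P_ii."""
    Sinv = np.linalg.inv(X.T @ X)
    b = Sinv @ X.T @ y
    P_diag = np.einsum("ij,jk,ik->i", X, Sinv, X)
    return y * (y - X @ b) / (1.0 - P_diag)


s2_brute = sigma2_brute(X, y)
s2_short = sigma2_shortcut(X, y)
print(np.max(np.abs(s2_brute - s2_short)))
```

Individual estimates can be negative; unbiasedness holds observation by observation, not as a sign restriction.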
Leave-out Estimator
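Thus, we propose the bias corrected estimator of $\theta$:
$$\hat\theta = \hat\beta' A \hat\beta - \sum_{i=1}^{n} B_{ii}\hat\sigma_i^2.$$
A “leave-out” representation:
$$\hat\theta = \sum_{i=1}^{n} y_i \tilde x_i'\hat\beta_{-i} \qquad \text{where } \tilde x_i = A S_{xx}^{-1} x_i \in \mathbb{R}^k,$$
$$\phantom{\hat\theta} = \sum_{i=1}^{n}\sum_{\ell \neq i} C_{i\ell}\, y_i y_\ell \qquad \text{for } C_{i\ell} = B_{i\ell} - 2^{-1} M_{i\ell}\left(M_{ii}^{-1}B_{ii} + M_{\ell\ell}^{-1}B_{\ell\ell}\right).$$
Highlights the connection with existing leave-one-out ideas in parametric and non-parametric models, e.g., JIVE and weighted average derivatives.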
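
The three expressions for the estimator can be compared numerically. The sketch below assumes that $M = I - P$ is the residual-maker matrix and that $B_{i\ell} = x_i' S_{xx}^{-1} A S_{xx}^{-1} x_\ell$; the slide does not spell these out, so treat both as assumptions, along with the simulated data and the arbitrary symmetric `A`.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 30, 4
X = rng.normal(size=(n, k))
y = X @ rng.normal(size=k) + rng.normal(size=n) * rng.uniform(0.5, 1.5, size=n)
G = rng.normal(size=(k, k))
A = (G + G.T) / 2                      # arbitrary symmetric A (hypothetical)

Sinv = np.linalg.inv(X.T @ X)
b = Sinv @ X.T @ y
P = X @ Sinv @ X.T                     # projection matrix, P_ii are leverages
M = np.eye(n) - P                      # assumed definition of M
B = X @ Sinv @ A @ Sinv @ X.T          # assumed definition of B_il

# (1) Bias-corrected form: plug-in minus sum_i B_ii * sigma_i^2-hat.
sigma2_hat = y * (M @ y) / np.diag(M)
theta_bc = b @ A @ b - np.diag(B) @ sigma2_hat

# (2) Leave-out form: sum_i y_i * x_tilde_i' beta_{-i}, refitting without i.
X_tilde = X @ Sinv @ A                 # row i is x_tilde_i' (A and Sxx are symmetric)
theta_lo = 0.0
for i in range(n):
    keep = np.arange(n) != i
    b_minus_i = np.linalg.solve(X[keep].T @ X[keep], X[keep].T @ y[keep])
    theta_lo += y[i] * (X_tilde[i] @ b_minus_i)

# (3) Quadratic form in y with weights C_il; the diagonal of C is zero.
w = np.diag(B) / np.diag(M)
C = B - 0.5 * M * (w[:, None] + w[None, :])
theta_quad = y @ C @ y
print(theta_bc, theta_lo, theta_quad)
```

Since the diagonal of `C` vanishes, the full quadratic form `y @ C @ y` coincides with the double sum over `i` and `l != i`.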
“Fixing” HC2 in high dimensions..
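Recall the HC2 variance estimator of MacKinnon and White (1985):
$$\hat V_{HC2} = S_{xx}^{-1}\left(\sum_{i=1}^{n} x_i x_i' \frac{(y_i - x_i'\hat\beta)^2}{1 - P_{ii}}\right) S_{xx}^{-1}$$
- HC2 inconsistent when $k \propto n$ (Cattaneo, Jansson, Newey, 2017)

A cross-fit replacement:
$$\hat V_{KSS} = S_{xx}^{-1}\left(\sum_{i=1}^{n} x_i x_i' \hat\sigma_i^2\right) S_{xx}^{-1} = S_{xx}^{-1}\left(\sum_{i=1}^{n} x_i x_i' \frac{y_i (y_i - x_i'\hat\beta)}{1 - P_{ii}}\right) S_{xx}^{-1}$$
- Will show that this enables testing “a few” linear restrictions..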
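
The two sandwich formulas differ only in the per-observation weight in the middle term, which the following sketch makes explicit. The data, the helper `sandwich`, and the contrast vector `v` are hypothetical illustrations; this is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 40, 5
X = rng.normal(size=(n, k))
y = X @ rng.normal(size=k) + rng.normal(size=n) * (0.5 + np.abs(X[:, 0]))

Sinv = np.linalg.inv(X.T @ X)
b = Sinv @ X.T @ y
resid = y - X @ b
P_diag = np.einsum("ij,jk,ik->i", X, Sinv, X)   # leverages P_ii


def sandwich(weights):
    """Sxx^{-1} (sum_i x_i x_i' w_i) Sxx^{-1} for per-observation weights w_i."""
    meat = (X.T * weights) @ X
    return Sinv @ meat @ Sinv


V_hc2 = sandwich(resid**2 / (1.0 - P_diag))      # squared residual, rescaled
V_kss = sandwich(y * resid / (1.0 - P_diag))     # cross-fit variance estimate

v = np.zeros(k)
v[0] = 1.0                                       # variance of the first coefficient
var_hc2 = v @ V_hc2 @ v
var_kss = v @ V_kss @ v
print(var_hc2, var_kss)
```

Unlike HC2, the cross-fit weights can be negative for some observations, so `V_kss` is not guaranteed to be positive semi-definite in a finite sample; the sketch therefore reports variances instead of taking square roots.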
The “Homoscedastic-only” correction
A commonly applied estimator based on homoscedasticity is (adjusted-$R^2$, bias-corrected 2SLS, ANOVA, …)
$$\hat\theta_{HO} = \hat\theta_{PI} - \sum_{i=1}^{n} B_{ii}\hat\sigma^2 \qquad \text{where } \hat\sigma^2 = \frac{1}{n-k}\sum_{i=1}^{n}(y_i - x_i'\hat\beta)^2.$$
- In general, biased when $P_{ii}$ or $B_{ii}$ correlate with $\sigma_i^2$.
- Special case (balanced design): $(B_{ii}, P_{ii})$ do not vary w/ $i$.
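
A short worked derivation may make the two bullets concrete. It uses only the plug-in bias formula from the earlier slide plus the standard fact that the OLS residual vector equals $M\varepsilon$ with $M = I - P$, so that $\mathbb{E}\left[\sum_i (y_i - x_i'\hat\beta)^2\right] = \sum_i (1 - P_{ii})\sigma_i^2$. The symbol $\bar B$ is introduced here for illustration only.

$$
\begin{aligned}
\mathbb{E}[\hat\sigma^2] &= \frac{1}{n-k}\sum_{i=1}^{n}(1 - P_{ii})\,\sigma_i^2, \\
\mathbb{E}[\hat\theta_{HO}] - \theta &= \sum_{i=1}^{n} B_{ii}\sigma_i^2 - \Big(\sum_{i=1}^{n} B_{ii}\Big)\mathbb{E}[\hat\sigma^2]
= \sum_{i=1}^{n}\sigma_i^2\left[B_{ii} - \bar B\,\frac{1 - P_{ii}}{1 - k/n}\right], \qquad \bar B = \frac{1}{n}\sum_{i=1}^{n} B_{ii}.
\end{aligned}
$$

The bracketed weights sum to zero over $i$, so the bias vanishes when $\sigma_i^2$ is constant. Each bracket is zero individually when $B_{ii}$ and $P_{ii}$ do not vary with $i$ (then $P_{ii} = k/n$), which is the balanced-design case. Otherwise the bias is a covariance-type term between $\sigma_i^2$ and the bracketed weights.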
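Example II (Uncentered $R^2$)
$$R^2 = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i'\beta)^2}{\frac{1}{n}\sum_{i=1}^{n}\mathbb{E}[y_i^2]}$$
- Numerator targeted by choosing $A = S_{xx}/n$
- Plug-in estimator $\hat R^2$ (Wright, 1921) uses $\frac{1}{n}\sum_{i=1}^{n}(x_i'\hat\beta)^2$
- Homoscedasticity corrected estimator is $\hat R^2_{adj}$ (Theil, 1961): $\frac{1}{n}\sum_{i=1}^{n}(x_i'\hat\beta)^2 - \frac{k}{n-k}\,\frac{1}{n}\sum_{i=1}^{n}(y_i - x_i'\hat\beta)^2$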
- HO adjustment relies on degrees of freedom correction: $(1 - \hat R^2_{adj})/(1 - \hat R^2) = n/(n-k)$
- Contrast w/ leave-out estimator $\hat\theta$, which can be written: $\frac{1}{n}\sum_{i=1}^{n} y_i\, x_i'\hat\beta_{-i}$
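
The three numerators for the uncentered $R^2$ can be computed side by side. This is a hedged sketch on simulated data: the sample mean of $y_i^2$ stands in for the population denominator (an assumption made only so the ratios can be evaluated), and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 50, 6
X = rng.normal(size=(n, k))
y = X @ rng.normal(size=k) + rng.normal(size=n) * (0.5 + np.abs(X[:, 0]))

Sinv = np.linalg.inv(X.T @ X)
b = Sinv @ X.T @ y
fitted = X @ b
resid = y - fitted
P_diag = np.einsum("ij,jk,ik->i", X, Sinv, X)    # leverages P_ii

den = np.mean(y**2)                              # sample stand-in for (1/n) sum E[y_i^2]

num_pi = np.mean(fitted**2)                      # plug-in numerator
num_adj = num_pi - k / (n - k) * np.mean(resid**2)   # homoscedastic-only correction

loo_pred = y - resid / (1.0 - P_diag)            # x_i' beta_{-i} via the leverage shortcut
num_lo = np.mean(y * loo_pred)                   # leave-out numerator

r2, r2_adj, r2_lo = num_pi / den, num_adj / den, num_lo / den
print(r2, r2_adj, r2_lo)
```

With $A = S_{xx}/n$ the weights are $B_{ii} = P_{ii}/n$, so the leave-out numerator equals the plug-in numerator minus the average of $P_{ii}\hat\sigma_i^2$, whereas the adjusted version subtracts a single pooled variance scaled by $k/(n-k)$.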
![Page 43: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/43.jpg)
Outline
Literature
Model and Estimator
Consistency
Distribution Theory
Application
![Page 44: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/44.jpg)
Assumption 1
(a) $\max_i \mathbb{E}[\varepsilon_i^4] + \sigma_i^{-2} = O(1)$,
(b) $\max_i P_{ii} \le c < 1$,
(c) $\max_i (x_i'\beta)^2 = O(1)$.
(a) ensures thin tails of $\varepsilon_i$.
(b) + (c) implies that $\hat\sigma_i^2$ has bounded variance.
(c) can be relaxed (technical condition).
![Page 45: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/45.jpg)
An important matrix
Eigenvalues $(\lambda_1, \dots, \lambda_r)$ of the following matrix govern properties of $\hat\theta$:
$$\tilde A = S_{xx}^{-1/2} A S_{xx}^{-1/2}$$
- $A$ defines target parameter
- $S_{xx}^{-1}$ summarizes regressor design / difficulty of estimating each coefficient
- Special case (orthogonal regressors): $S_{xx} = I \Rightarrow \tilde A = A$
![Page 47: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/47.jpg)
Lemma 1 (Consistency)
Let $\tilde A = S_{xx}^{-1/2} A S_{xx}^{-1/2}$.
1. If $A$ is positive semi-definite, (i) $\theta = O(1)$, and (ii) $\mathrm{trace}(\tilde A^2) = o(1)$, then $\hat\theta - \theta \xrightarrow{p} 0$.
2. If $A$ is non-definite then write $A = A_1'A_2$ for some $A_1, A_2$. If $\theta_k = \beta' A_k' A_k \beta$ satisfies (i) and (ii) for $k = 1, 2$, then $\hat\theta - \theta \xrightarrow{p} 0$.
- For “leave out $R^2$” we have $\mathrm{trace}(\tilde A^2) = k/n^2 \to 0$ $\Rightarrow$ $\hat\theta$ is consistent
- Next: Verify (ii) analytically in some stylized examples (ANOVA and HLM).
- Can assess (ii) empirically in cases where analytically intractable.
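Condition (ii) can be checked numerically for a given design. The following is a minimal sketch, assuming a dense design matrix small enough to decompose directly; the helper name `a_tilde` and the simulated regressors are hypothetical. It reproduces the $k/n^2$ value quoted for the leave-out $R^2$ case.

```python
import numpy as np

def a_tilde(A, Sxx):
    """Return Sxx^{-1/2} A Sxx^{-1/2} for a symmetric positive definite Sxx."""
    evals, evecs = np.linalg.eigh(Sxx)
    Sxx_inv_half = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return Sxx_inv_half @ A @ Sxx_inv_half

rng = np.random.default_rng(1)
n, k = 200, 8
X = rng.normal(size=(n, k))      # hypothetical design
Sxx = X.T @ X
A = Sxx / n                      # targets the numerator of the uncentered R^2
At = a_tilde(A, Sxx)
trace_sq = np.trace(At @ At)     # the quantity appearing in condition (ii)
```

For large sparse designs (for example two-way fixed effects) a direct decomposition like this is usually impractical, so this should be read as a small-scale check only.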
![Page 52: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/52.jpg)
Example III (ANOVA)
Consider
$$y_{gt} = \alpha_g + \varepsilon_{gt} \quad (g = 1, \dots, N,\ t = 1, \dots, T_g),$$
where the object of interest is
$$\sigma_\alpha^2 = \frac{1}{n}\sum_{g=1}^N T_g \alpha_g^2.$$
- Chetty et al. (2011): $\sigma_\alpha^2$ = variance of “classroom effects” in STAR
- $\max_i P_{ii} < 1$ is equivalent to $\min_g T_g \ge 2$.
- Here, $P_{ii} = nB_{ii} = \frac{1}{T_{g(i)}}$ $\Rightarrow$ $\hat\theta_{HO}$ biased when $\sigma_i^2$ vary w/ group size
![Page 56: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/56.jpg)
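Example III (ANOVA)
Consider
$$y_{gt} = \alpha_g + \varepsilon_{gt} \quad (g = 1, \dots, N,\ t = 1, \dots, T_g),$$
where the object of interest is
$$\sigma_\alpha^2 = \frac{1}{n}\sum_{g=1}^N T_g \alpha_g^2.$$
Leave-out estimator can be written:
$$\hat\sigma_\alpha^2 = \frac{1}{n}\sum_{g=1}^N \left(T_g \hat\alpha_g^2 - \hat\sigma_g^2\right)$$
where $\hat\alpha_g = \frac{1}{T_g}\sum_{t=1}^{T_g} y_{gt}$ and $\hat\sigma_g^2 = \frac{1}{T_g - 1}\sum_{t=1}^{T_g} (y_{gt} - \hat\alpha_g)^2$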
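As a sanity check on this closed form, the sketch below compares it with the generic bias-corrected quadratic form $\hat\beta'A\hat\beta - \sum_i B_{ii}\hat\sigma_i^2$ using group dummies as regressors. Two things are assumptions here rather than statements from this slide: the per-observation variance estimate $\hat\sigma_i^2 = y_i(y_i - x_i'\hat\beta_{-i})$, and the function names and simulated data, which are purely illustrative.

```python
import numpy as np

def anova_leave_out(y, g):
    """Closed form: (1/n) * sum_g (T_g * mean_g^2 - s_g^2), with s_g^2 the sample variance."""
    n = y.size
    total = 0.0
    for grp in np.unique(g):
        y_g = y[g == grp]
        T_g = y_g.size                   # needs T_g >= 2
        total += T_g * y_g.mean() ** 2 - y_g.var(ddof=1)
    return total / n

def general_leave_out(y, g):
    """beta_hat' A beta_hat - sum_i B_ii * sigma2_hat_i with group dummies as regressors."""
    n = y.size
    _, idx, counts = np.unique(g, return_inverse=True, return_counts=True)
    alpha_hat = np.bincount(idx, weights=y) / counts
    plug_in = np.sum(counts * alpha_hat ** 2) / n
    P_ii = 1.0 / counts[idx]             # leverage of observation i is 1 / T_g(i)
    B_ii = P_ii / n
    # Assumed leave-out variance estimate: y_i * (y_i - x_i' beta_{-i})
    sigma2_hat = y * (y - alpha_hat[idx]) / (1.0 - P_ii)
    return plug_in - np.sum(B_ii * sigma2_hat)

# Hypothetical unbalanced groups with group-specific error scale
rng = np.random.default_rng(2)
g = np.repeat(np.arange(6), [2, 3, 4, 2, 5, 3])
y = rng.normal(size=g.size) * (1 + g) + g
closed_form = anova_leave_out(y, g)
generic = general_leave_out(y, g)
```

Under these assumptions the two numbers agree up to floating-point error, because the within-group residuals sum to zero.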
![Page 57: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/57.jpg)
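Example III (ANOVA)
Consider
$$y_{gt} = \alpha_g + \varepsilon_{gt} \quad (g = 1, \dots, N,\ t = 1, \dots, T_g),$$
where the object of interest is
$$\sigma_\alpha^2 = \frac{1}{n}\sum_{g=1}^N T_g \alpha_g^2.$$
$\tilde A$ is diagonal with $N$ non-zero entries of $\frac{1}{n}$ so
$$\mathrm{trace}(\tilde A^2) = \frac{N}{n^2} \le \frac{1}{n} = o(1).$$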
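The trace bound can be traced back to the design in a few lines. The derivation below is a sketch that assumes the regressors are the $N$ group dummies (written $d_{g}$, a notation introduced here only for illustration) and uses $n = \sum_g T_g$ with $\min_g T_g \ge 2$.

```latex
\begin{aligned}
S_{xx} &= \sum_{g=1}^{N}\sum_{t=1}^{T_g} d_g d_g' = \mathrm{diag}(T_1,\dots,T_N),
\qquad A = \tfrac{1}{n}\,\mathrm{diag}(T_1,\dots,T_N), \\
\tilde A &= S_{xx}^{-1/2} A\, S_{xx}^{-1/2} = \tfrac{1}{n}\, I_N,
\qquad \mathrm{trace}(\tilde A^2) = \frac{N}{n^2}, \\
n &= \sum_{g=1}^{N} T_g \ge 2N
\;\Longrightarrow\; \frac{N}{n^2} \le \frac{1}{2n} \le \frac{1}{n}.
\end{aligned}
```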
![Page 58: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/58.jpg)
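Example IV (Hierarchical Linear Model (HLM))
Consider
$$y_{gt} = \alpha_g + x_{gt}\delta_g + \varepsilon_{gt} \quad (g = 1, \dots, N,\ t = 1, \dots, T_g),$$
where $\sum_{t=1}^{T_g} x_{gt} = 0$ and the object of interest is
$$\sigma_\delta^2 = \frac{1}{n}\sum_{g=1}^N T_g \delta_g^2.$$
- Raudenbush and Bryk (1986): $\sigma_\delta^2$ = student-wgt’d var of slopes wrt SES
- $\max_i P_{ii} < 1$ is implied by $\min_g T_g \ge 3$ and $x_{gt_1} \ne x_{gt_2} \ne x_{gt_3} \ne x_{gt_1}$.
- $\tilde A$ is diagonal with $N$ non-zero entries of $\frac{1}{n}\frac{T_g}{\sum_{t=1}^{T_g} x_{gt}^2}$, so
$$\mathrm{trace}(\tilde A^2) = o(1) \quad \text{if} \quad \min_g \frac{n}{T_g}\sum_{t=1}^{T_g} x_{gt}^2 \to \infty.$$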
![Page 62: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/62.jpg)
Example I (Two-way fixed effects, AKM)
Consider ($T_g = 2$ and no $X_{gt}$)
$$y_{gt} = \alpha_g + \psi_{j(g,t)} + \varepsilon_{gt} \quad (g = 1, \dots, N,\ t = 1, 2),$$
and $\sigma_\psi^2 = \frac{1}{n}\sum_{g=1}^N \sum_{t=1}^2 (\psi_{j(g,t)} - \bar\psi)^2$.
$\tilde A$ is not diagonal, but its $\ell$'th largest eigenvalue is given by:
$$\lambda_\ell = \frac{1}{n}\,\frac{1}{\lambda_{J+1-\ell}(E^{1/2} L E^{1/2})}$$
where $E$ is a diagonal matrix of employer-specific “churn rates”, $L$ is the normalized Laplacian for the worker-firm mobility network, and $\lambda_\ell(\cdot)$ gives the $\ell$'th largest eigenvalue of its argument.
![Page 64: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/64.jpg)
Example I (Two-way fixed effects, AKM)
Consider ($T_g = 2$ and no $X_{gt}$)
$$y_{gt} = \alpha_g + \psi_{j(g,t)} + \varepsilon_{gt} \quad (g = 1, \dots, N,\ t = 1, 2),$$
and $\sigma_\psi^2 = \frac{1}{n}\sum_{g=1}^N \sum_{t=1}^2 (\psi_{j(g,t)} - \bar\psi)^2$.
- Sufficient condition for consistency: strong connectivity, $\sqrt{J}\,C \to \infty$, where $C \in (0, 1]$ is Cheeger’s constant
- Interpretation: no “bottlenecks” in mobility network
![Page 65: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/65.jpg)
[Figure: Rovigo and Belluno, employer mobility network. Legend: firms in Rovigo, within-Rovigo mobility, firms in Belluno, within-Belluno mobility, between-region mobility.]
![Page 66: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/66.jpg)
Outline
Literature
Model and Estimator
Consistency
Distribution Theory
Application
![Page 67: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/67.jpg)
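Notation / Overview
We can represent the plug-in estimator $\hat\theta_{PI}$ as
$$\hat\beta' A \hat\beta = \hat\beta' S_{xx}^{1/2} \tilde A S_{xx}^{1/2} \hat\beta = \hat b' D \hat b = \sum_{\ell=1}^r \lambda_\ell \hat b_\ell^2$$
where we write
- $\tilde A = S_{xx}^{-1/2} A S_{xx}^{-1/2}$,
- $\tilde A = QDQ'$ for $D = \mathrm{diag}(\lambda_1, \dots, \lambda_r)$, $\lambda_1^2 \ge \dots \ge \lambda_r^2 > 0$, and $Q'Q = I_r$,
- $\hat b = Q' S_{xx}^{1/2} \hat\beta$
“Warmup” result: Distribution of infeasible estimator when $\varepsilon_i \sim \mathcal{N}(0, \sigma_i^2)$
$$\theta^* = \hat\beta' A \hat\beta - \sum_{i=1}^n B_{ii}\sigma_i^2$$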
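The spectral representation of the plug-in estimator can be verified on simulated data. This is a sketch under stated assumptions: the helper `spectral_pieces`, the tolerance used to drop zero eigenvalues, and the rank-3 target matrix are all hypothetical choices made for illustration.

```python
import numpy as np

def spectral_pieces(X, A, tol=1e-12):
    """Return (lam, Q, Sxx_half) such that A_tilde = Q diag(lam) Q' over non-zero eigenvalues."""
    Sxx = X.T @ X
    s_vals, s_vecs = np.linalg.eigh(Sxx)
    Sxx_half = s_vecs @ np.diag(np.sqrt(s_vals)) @ s_vecs.T
    Sxx_inv_half = s_vecs @ np.diag(1.0 / np.sqrt(s_vals)) @ s_vecs.T
    A_tilde = Sxx_inv_half @ A @ Sxx_inv_half
    lam, Q = np.linalg.eigh((A_tilde + A_tilde.T) / 2)
    keep = np.abs(lam) > tol * np.abs(lam).max()   # drop numerically zero eigenvalues
    lam, Q = lam[keep], Q[:, keep]
    order = np.argsort(-lam ** 2)                  # lam_1^2 >= ... >= lam_r^2
    return lam[order], Q[:, order], Sxx_half

rng = np.random.default_rng(3)
n, k = 80, 6
X = rng.normal(size=(n, k))
y = X @ rng.normal(size=k) + rng.normal(size=n)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

C = rng.normal(size=(3, k))
A = C.T @ C / n                                    # hypothetical rank-3 target matrix
lam, Q, Sxx_half = spectral_pieces(X, A)
b_hat = Q.T @ Sxx_half @ beta_hat
plug_in_direct = beta_hat @ A @ beta_hat
plug_in_spectral = np.sum(lam * b_hat ** 2)
```

Here $r = 3$ even though $k = 6$, which illustrates that only the non-zero eigenvalues of $\tilde A$ enter the sum.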
![Page 72: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/72.jpg)
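Lemma 1 (Finite Sample)
If $\varepsilon_i \sim \mathcal{N}(0, \sigma_i^2)$, then
$$\theta^* = \sum_{\ell=1}^r \lambda_\ell \left(\hat b_\ell^2 - \mathbb{V}[\hat b_\ell]\right) \quad \text{and} \quad \hat b \sim \mathcal{N}\left(b, \mathbb{V}[\hat b]\right)$$
where $b = Q' S_{xx}^{1/2} \beta$.
- Sums of squares of uncentered normals $\Rightarrow$ non-central $\chi^2$
- Noncentrality governed by $b$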
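The lemma rewrites the bias term $\sum_i B_{ii}\sigma_i^2$ as $\sum_\ell \lambda_\ell \mathbb{V}[\hat b_\ell]$. The sketch below checks that identity numerically. It assumes $B_{ii} = x_i' S_{xx}^{-1} A S_{xx}^{-1} x_i$ (the definition is not restated on this slide) and uses hypothetical variances and a hypothetical full-rank $A$.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 50, 4
X = rng.normal(size=(n, k))
sigma2 = 0.5 + X[:, 0] ** 2              # hypothetical heteroscedastic variances
C = rng.normal(size=(k, k))
A = C.T @ C / n                          # hypothetical full-rank target matrix

Sxx = X.T @ X
Sxx_inv = np.linalg.inv(Sxx)
s_vals, s_vecs = np.linalg.eigh(Sxx)
Sxx_inv_half = s_vecs @ np.diag(s_vals ** -0.5) @ s_vecs.T
lam, Q = np.linalg.eigh(Sxx_inv_half @ A @ Sxx_inv_half)

# Row i of W holds w_i' = x_i' Sxx^{-1/2} Q, so that b_hat = W' y.
W = X @ Sxx_inv_half @ Q
V_b = W.T @ (sigma2[:, None] * W)        # V[b_hat] = sum_i w_i w_i' sigma_i^2

# Assumed definition: B_ii = x_i' Sxx^{-1} A Sxx^{-1} x_i
B_ii = np.einsum("ij,jk,ik->i", X, Sxx_inv @ A @ Sxx_inv, X)

bias_spectral = np.sum(lam * np.diag(V_b))
bias_direct = np.sum(B_ii * sigma2)
```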
![Page 73: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/73.jpg)
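Building intuition..
$$\theta^* = \sum_{\ell=1}^r \lambda_\ell \left(\hat b_\ell^2 - \mathbb{V}[\hat b_\ell]\right) \quad \text{and} \quad \hat b \sim \mathcal{N}\left(b, \mathbb{V}[\hat b]\right)$$
Seek asymptotic approximations that simplify computation and relax assumptions.
Note: can write $\hat b$ as weighted sum $\sum_{i=1}^n w_i y_i$
- Weights are $w_i = Q' S_{xx}^{-1/2} x_i$ and obey $\sum_{i=1}^n w_i w_i' = I_r$.
- $\max_i w_i' w_i$ provides inverse measure of eff sample size
- Plausible that elements of $\hat b$ are approx normal even when $\varepsilon_i$ is not..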
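The weight representation is easy to confirm numerically. The sketch below uses the leave-out $R^2$ target $A = S_{xx}/n$ (so $r = k$) and Student-t errors as a stand-in for non-normal noise; both choices, and all variable names, are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 120, 5
X = rng.normal(size=(n, k))
y = X @ np.arange(1.0, k + 1) + rng.standard_t(df=5, size=n)   # non-normal errors

Sxx = X.T @ X
s_vals, s_vecs = np.linalg.eigh(Sxx)
Sxx_half = s_vecs @ np.diag(s_vals ** 0.5) @ s_vecs.T
Sxx_inv_half = s_vecs @ np.diag(s_vals ** -0.5) @ s_vecs.T

A = Sxx / n                                   # leave-out R^2 target, so r = k
lam, Q = np.linalg.eigh(Sxx_inv_half @ A @ Sxx_inv_half)

W = X @ Sxx_inv_half @ Q                      # row i holds w_i'
b_from_weights = W.T @ y                      # sum_i w_i y_i
beta_hat = np.linalg.solve(Sxx, X.T @ y)
b_from_beta = Q.T @ Sxx_half @ beta_hat       # Q' Sxx^{1/2} beta_hat

max_weight = np.max(np.sum(W ** 2, axis=1))   # max_i w_i' w_i
```

When $r = k$, as here, $w_i'w_i$ equals the leverage $P_{ii}$, so `max_weight` is just the largest leverage in the sample.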
![Page 75: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/75.jpg)
Building intuition..
$$\theta^* = \sum_{\ell=1}^r \lambda_\ell \left(\hat b_\ell^2 - \mathbb{V}[\hat b_\ell]\right) \quad \text{and} \quad \hat b \sim \mathcal{N}\left(b, \mathbb{V}[\hat b]\right)$$
Preview of asymptotic results:
1) When $r$ small (e.g., testing a single linear restriction) and $\hat b$ is approximately normally distributed, we obtain non-central $\chi^2$
2) When $r$ large (e.g., testing LOTS of linear restrictions) and eigenvalues same order of magnitude, can invoke a CLT to get normal approximation
3) When $r$ large and eigenvalues different orders of magnitude (weak-id), get a combination of $\chi^2$ and normal components
![Page 76: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/76.jpg)
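The “low rank” case
Proposition 1 (Low Rank)
If Assumption 1 holds, (i) $\max_i w_i' w_i = o(1)$, and (ii) $r$ is fixed, then
$$\hat\theta = \sum_{\ell=1}^r \lambda_\ell \left(\hat b_\ell^2 - \mathbb{V}[\hat b_\ell]\right) + o_p(\mathbb{V}[\hat\theta]^{1/2}) \quad \text{and} \quad \mathbb{V}[\hat b]^{-1/2}(\hat b - b) \xrightarrow{d} \mathcal{N}(0, I_r).$$
Recall that $\hat b = \sum_{i=1}^n w_i y_i$ where $w_i = Q' S_{xx}^{-1/2} x_i$ and $\sum_{i=1}^n w_i w_i' = I_r$.
The Lindeberg condition (i) ensures that
- no observation is too influential
- sampling error in the bias correction can be ignored.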
![Page 78: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/78.jpg)
Application: testing a linear restriction
Suppose we are interested in testing
$$H_0 : v'\beta = 0 \quad \text{for } v \in \mathbb{R}^{k \times 1}$$
Example 1: testing for regional diffs in firm FEs
Example 2: std err on projection of firm FEs onto firm characteristics
Prop 1 implies that, under $H_0$, choosing $A = vv'$ yields
$$\hat{\mathbb{V}}[v'\hat\beta]^{-1}\hat\theta \xrightarrow{d} \chi^2(1) - 1$$
Eicker-White style variance estimator for inference:
$$\hat{\mathbb{V}}[v'\hat\beta] = v' S_{xx}^{-1}\left(\sum_{i=1}^n x_i x_i' \hat\sigma_i^2\right) S_{xx}^{-1} v$$
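A single-restriction test along these lines might be coded as follows. This is a sketch, not the authors' implementation: it assumes the leave-out variance estimate $\hat\sigma_i^2 = y_i(y_i - x_i'\hat\beta_{-i})$ and $B_{ii} = (x_i'S_{xx}^{-1}v)^2$ for $A = vv'$, and the function name, simulated data and 5% level are illustrative.

```python
import numpy as np

CHI2_1_95 = 3.841458820694124   # 95th percentile of the chi-square(1) distribution

def single_restriction_stat(X, y, v):
    """Leave-out statistic theta_hat / V_hat for H0: v' beta = 0, using A = v v'."""
    Sxx_inv = np.linalg.inv(X.T @ X)
    beta_hat = Sxx_inv @ X.T @ y
    P_ii = np.einsum("ij,jk,ik->i", X, Sxx_inv, X)
    # Assumed leave-out variance estimate: y_i * (y_i - x_i' beta_{-i})
    sigma2_hat = y * (y - X @ beta_hat) / (1.0 - P_ii)
    c = X @ Sxx_inv @ v                       # c_i = x_i' Sxx^{-1} v, so B_ii = c_i^2
    V_hat = np.sum(c ** 2 * sigma2_hat)       # Eicker-White style variance estimate
    theta_hat = (v @ beta_hat) ** 2 - V_hat   # plug-in minus estimated bias
    return theta_hat / V_hat, theta_hat, V_hat, beta_hat

# Hypothetical data in which the tested coefficient is truly zero
rng = np.random.default_rng(6)
n, k = 150, 4
X = rng.normal(size=(n, k))
beta = np.array([1.0, 0.0, -0.5, 2.0])
y = X @ beta + rng.normal(size=n) * (0.5 + np.abs(X[:, 1]))
v = np.array([0.0, 1.0, 0.0, 0.0])
stat, theta_hat, V_hat, beta_hat = single_restriction_stat(X, y, v)
reject = stat > CHI2_1_95 - 1.0               # 5% test against the chi2(1) - 1 limit
```

With $A = vv'$ the bias correction coincides with the variance estimate, so the statistic reduces to a squared robust t-ratio minus one.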
![Page 81: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/81.jpg)
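Proposition 2 (High Rank, Strong Id)
If Assumption 1 holds, (i) $\mathbb{V}[\hat\theta]^{-1}\max_i\left((\tilde x_i'\beta)^2 + (\check x_i'\beta)^2\right) = o(1)$, and (ii) $\frac{\lambda_1^2}{\sum_{\ell=1}^r \lambda_\ell^2} = o(1)$,
then $\mathbb{V}[\hat\theta]^{-1/2}(\hat\theta - \theta) \xrightarrow{d} \mathcal{N}(0, 1)$.
Objects appearing in (i) are:
- $\tilde x_i = A S_{xx}^{-1} x_i$ where $\theta = \sum_{i=1}^n \mathbb{E}[y_i \tilde x_i'\beta]$.
- $\check x_i = \sum_{\ell=1}^n M_{i\ell}\frac{B_{\ell\ell}}{1 - P_{\ell\ell}} x_\ell$ stems from bias correction.
- Intuition: Averaging $r \to \infty$ terms yields normality under (ii), but estimation of the bias can not be ignored ($\check x_i$ is present in $\mathbb{V}[\hat\theta]$).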
![Page 83: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/83.jpg)
Application: testing many linear restrictions
Suppose we are interested in testing
$$H_0 : R\beta = 0 \quad \text{for } R \in \mathbb{R}^{r \times k}$$
- Example: testing block of FEs = 0
- Traditional “F-test” would require homoscedasticity
Prop 2 implies that, under $H_0$, choosing $A = \frac{1}{r}R'(R S_{xx}^{-1} R')^{-1}R$ yields
$$\mathbb{V}[\hat\theta]^{-1/2}\hat\theta \xrightarrow{d} \mathcal{N}(0, 1)$$
Consistent estimator of $\mathbb{V}[\hat\theta]$ provided in paper
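To see why this choice of $A$ fits the strong-identification case, one can inspect the eigenvalues of $\tilde A$. The sketch below builds $A$ for a hypothetical restriction matrix that zeroes out the first $r$ coefficients; the design and dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
n, k, r = 300, 10, 4
X = rng.normal(size=(n, k))
R = np.hstack([np.eye(r), np.zeros((r, k - r))])   # H0: first r coefficients are zero

Sxx = X.T @ X
Sxx_inv = np.linalg.inv(Sxx)
# A = (1/r) R' (R Sxx^{-1} R')^{-1} R
A = R.T @ np.linalg.solve(R @ Sxx_inv @ R.T, R) / r

s_vals, s_vecs = np.linalg.eigh(Sxx)
Sxx_inv_half = s_vecs @ np.diag(s_vals ** -0.5) @ s_vecs.T
A_tilde = Sxx_inv_half @ A @ Sxx_inv_half
lam = np.linalg.eigvalsh((A_tilde + A_tilde.T) / 2)[::-1]   # descending order

top_share = lam[0] ** 2 / np.sum(lam ** 2)         # ratio in condition (ii) of Prop 2
```

With this $A$, $\tilde A$ is $1/r$ times a rank-$r$ projection, so all non-zero eigenvalues equal $1/r$ and the ratio in (ii) is exactly $1/r$, which vanishes as the number of restrictions grows.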
![Page 86: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/86.jpg)
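Assumption 2
Suppose there exists a known and fixed $q \in \{1, \dots, r - 1\}$ such that
$$\frac{\lambda_{q+1}^2}{\sum_{\ell=1}^r \lambda_\ell^2} = o(1) \quad \text{and} \quad \frac{\lambda_q^2}{\sum_{\ell=1}^r \lambda_\ell^2} \ge c \ \ \forall n.$$
Decomposition:
$$\hat b_q = (\hat b_1, \dots, \hat b_q)' = \sum_{i=1}^n w_{iq} y_i, \qquad w_{iq} = (w_{i1}, \dots, w_{iq})',$$
$$\hat\theta_q = \hat\theta - \sum_{\ell=1}^q \lambda_\ell\left(\hat b_\ell^2 - \hat{\mathbb{V}}[\hat b_\ell]\right), \qquad \hat{\mathbb{V}}[\hat b] = \sum_{i=1}^n w_i w_i' \hat\sigma_i^2.$$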
![Page 89: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/89.jpg)
- Result: $q$ non-central $\chi^2$ terms + a normal
- When $q \ll r$: major simplification relative to finite sample dist.
- But still need to deal w/ $q$-dimensional nuisance parameter $E[\hat b_q]$
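A short worked step shows where the non-central $\chi^2$ terms and the nuisance parameter come from. It is a sketch that treats each $\hat b_\ell$ as approximately normal, as the joint limit in the theorem suggests; the symbol $\overset{a}{\sim}$ ("approximately distributed as") is introduced here for illustration.

```latex
% Second-moment identity: the bias-corrected square is centered at the squared mean.
E[\hat b_\ell^2] = V[\hat b_\ell] + \big(E[\hat b_\ell]\big)^2
\quad\Longrightarrow\quad
E\big[\hat b_\ell^2 - V[\hat b_\ell]\big] = \big(E[\hat b_\ell]\big)^2 .

% Under approximate normality of \hat b_\ell, the scaled square is non-central chi-square
% with one degree of freedom and a non-centrality parameter that depends on E[\hat b_\ell].
\frac{\hat b_\ell^2}{V[\hat b_\ell]} \;\overset{a}{\sim}\;
\chi^2_1\!\left( \frac{\big(E[\hat b_\ell]\big)^2}{V[\hat b_\ell]} \right).
```

The non-centrality depends on the unknown mean $E[\hat b_\ell]$, which is why the $q$ means remain as nuisance parameters.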
![Page 90: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/90.jpg)
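Weak-id Robust Confidence Interval
To construct a confidence interval we invert a minimum distance statistic:

$$C^\theta_q = \left[ \min_{(b_1,\dots,b_q,\theta_q)' \in B_q} \sum_{\ell=1}^{q} \lambda_\ell b_\ell^2 + \theta_q, \;\; \max_{(b_1,\dots,b_q,\theta_q)' \in B_q} \sum_{\ell=1}^{q} \lambda_\ell b_\ell^2 + \theta_q \right]$$

where

$$B_q = \left\{ (b_q', \theta_q)' \in \mathbb{R}^{q+1} : \begin{pmatrix} \hat b_q - b_q \\ \hat\theta_q - \theta_q \end{pmatrix}' \Sigma_q^{-1} \begin{pmatrix} \hat b_q - b_q \\ \hat\theta_q - \theta_q \end{pmatrix} \le z_\kappa^2 \right\}$$

- $\Sigma_q = V[(\hat b_q', \hat\theta_q)']$ and $\kappa = \kappa(\Sigma_q)$,
- $z_\kappa$ is the critical value proposed in Andrews and Mikusheva (2016).
- $\kappa$ measures the curvature (non-linearity) of the problem.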
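For q = 1 the set over which the interval is computed is an ellipse in the plane, and the objective $\lambda_1 b_1^2 + \theta_1$ has gradient $(2\lambda_1 b_1, 1)'$, which is never zero, so both extrema lie on the boundary. The sketch below uses that fact with a simple grid over the boundary. It is an illustration only: the function name is hypothetical, the critical value `z` is taken as a given input (the Andrews and Mikusheva curvature-dependent value is not computed here), and the grid search is a swapped-in numerical shortcut rather than the authors' procedure.

```python
import numpy as np

def weak_id_interval_q1(b_hat, theta_q_hat, lam, Sigma, z, n_grid=20000):
    """Min and max of lam * b**2 + theta_q over the ellipse
    {x : (x_hat - x)' Sigma^{-1} (x_hat - x) <= z**2}, where x = (b, theta_q).

    `z` is supplied by the caller; computing it from the curvature is
    outside the scope of this sketch.
    """
    x_hat = np.array([b_hat, theta_q_hat], dtype=float)
    # Sigma = L L', so x_hat + z * L u with |u| = 1 traces the ellipse boundary.
    L = np.linalg.cholesky(np.asarray(Sigma, dtype=float))
    t = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    unit_circle = np.vstack([np.cos(t), np.sin(t)])      # shape (2, n_grid)
    boundary = x_hat[:, None] + z * (L @ unit_circle)    # shape (2, n_grid)
    values = lam * boundary[0] ** 2 + boundary[1]
    return float(values.min()), float(values.max())

Sigma = np.array([[1.0, 0.3], [0.3, 4.0]])

# With lam = 0 the objective is linear, so the interval reduces to
# theta_q_hat +/- z * sqrt(Sigma[1, 1]).
lo_lin, hi_lin = weak_id_interval_q1(0.7, 0.5, 0.0, Sigma, z=1.96)

# With lam > 0 the interval is no longer symmetric around the point estimate.
lo_curv, hi_curv = weak_id_interval_q1(0.7, 0.5, 2.0, Sigma, z=1.96)
```

The linear case is a useful sanity check because the answer is known in closed form; the curved case shows how the quadratic term skews the interval.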
![Page 91: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/91.jpg)
![Page 92: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/92.jpg)
![Page 93: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/93.jpg)
Outline
Literature
Model and Estimator
Consistency
Distribution Theory
Application
![Page 94: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/94.jpg)
An application to Italian data
Wage and employment data on 2 provinces within the Veneto region of Italy.
Years: 1999 and 2001
Number of movers: 3,531 and 6,414.
Number of employers: 1,282 and 1,684
Example I (Two-way fixed effects, AKM)
Model ($T_g = 2$ and no $X_{gt}$):

$$\text{log-wage}_{gt} = \alpha_g + \psi_{j(g,t)} + \varepsilon_{gt} \qquad (g = 1, \dots, N,\; t = 1, 2).$$
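With two periods, differencing the model across time removes the person effect: the wage change of worker g equals the destination firm effect minus the origin firm effect plus the change in the error. The sketch below builds that first-differenced design and recovers firm effects by least squares. Everything in it is illustrative: the function names, the toy mobility pattern, and the normalization of the first firm effect to zero are assumptions, and the noise is set to zero so the recovery is exact.

```python
import numpy as np

def first_difference_design(j1, j2, n_firms):
    """Row g has +1 in column j2[g] and -1 in column j1[g].

    Stayers (j1[g] == j2[g]) get a row of zeros and carry no information
    about firm effects.
    """
    j1 = np.asarray(j1)
    j2 = np.asarray(j2)
    rows = np.arange(len(j1))
    F = np.zeros((len(j1), n_firms))
    F[rows, j2] += 1.0
    F[rows, j1] -= 1.0
    return F

def estimate_firm_effects(y1, y2, j1, j2, n_firms):
    """Least squares on wage changes, normalizing the effect of firm 0 to zero.

    Assumes the firms are connected by worker moves; otherwise the effects
    are not identified relative to a single reference firm.
    """
    F = first_difference_design(j1, j2, n_firms)
    dy = np.asarray(y2, dtype=float) - np.asarray(y1, dtype=float)
    coef, *_ = np.linalg.lstsq(F[:, 1:], dy, rcond=None)
    return np.concatenate([[0.0], coef])

# Toy data: every worker moves to the next firm in a cycle, so the
# mobility network is connected.
n_firms, n_workers = 4, 40
g = np.arange(n_workers)
j1 = g % n_firms
j2 = (g + 1) % n_firms

rng = np.random.default_rng(0)
alpha = rng.normal(size=n_workers)          # person effects (differenced out)
psi = np.array([0.0, 0.2, -0.1, 0.4])       # firm effects, psi[0] = 0
y1 = alpha + psi[j1]                        # noise-free log wages, period 1
y2 = alpha + psi[j2]                        # noise-free log wages, period 2

psi_hat = estimate_firm_effects(y1, y2, j1, j2, n_firms)
```

Once noise is added, the sample variance of the estimated firm effects includes the variance of the estimation error, which is the source of the bias that the leave-out correction targets.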
![Page 95: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/95.jpg)
The Provinces of Veneto
![Page 96: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/96.jpg)
Leave-out sample preserves first two moments
![Page 97: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/97.jpg)
High leverage ⇒ low-dimensional methods inappropriate
![Page 98: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/98.jpg)
HO adjustment under-corrects (Evidence of substantial heteroscedasticity)
![Page 99: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/99.jpg)
HO adjustment under-corrects (Evidence of substantial heteroscedasticity)
![Page 100: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/100.jpg)
Covariance flips sign!
![Page 101: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/101.jpg)
Leave out finds substantial PAM
![Page 102: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/102.jpg)
AKM model exhibits very strong explanatory power (Even after adjustment for “over-fitting”)
![Page 103: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/103.jpg)
Rovigo and Belluno: Employer Mobility Network
[Figure legend: firms in Rovigo; within-Rovigo mobility; firms in Belluno; within-Belluno mobility; between-region mobility]
![Page 104: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/104.jpg)
Firm effects higher in Belluno
![Page 105: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/105.jpg)
But person effects seem lower (Hard to tell b/c of limited mobility!)
![Page 106: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/106.jpg)
Pooling increases the std error!
![Page 107: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/107.jpg)
Consistent estimates
![Page 108: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/108.jpg)
Confidence interval adapts to bottleneck
![Page 109: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/109.jpg)
Strong curvature / big top eig share in pooled sample (But Lindeberg condition is satisfied)
![Page 110: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/110.jpg)
Simulations condition on observed mobility network
![Page 111: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/111.jpg)
Leave-out estimator is unbiased
![Page 112: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/112.jpg)
Plug-in / HO severely biased
![Page 113: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/113.jpg)
Leave out standard error is consistent
![Page 114: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/114.jpg)
Invalid normal approximation
![Page 115: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/115.jpg)
Weak-id interval slightly conservative
![Page 116: Leave-out Estimation of Variance Componentspkline/papers/KSS_slides.pdf · Leave-outEstimationofVarianceComponents PatrickKline1,RaffaeleSaggio2,MikkelSølvsten3 1DepartmentofEconomics,UniversityofCalifornia,Berkeley](https://reader034.vdocument.in/reader034/viewer/2022043016/5f3898457aec725e745897f0/html5/thumbnails/116.jpg)
Summary
We proposed an unbiased and consistent estimator of any variance component in a heteroscedastic linear model w/ many regressors.
Robust inference procedure can be used to
- Test linear restrictions (“het consistent F-test”)
- Build weak-id robust confidence intervals for variance components
- Eigenvalue-based diagnostics for weak identification: in practice, q = 1 appears to provide good coverage even with very weak connectivity
MATLAB code available at: https://github.com/rsaggio87/LeaveOutTwoWay.