ICAR-IFPRI - Basic Statistics and Econometric Approaches, Lecture 3 - Devesh Roy
Upload: international-food-policy-research-institute-south-asia-office
Post on 15-Apr-2017
![Page 1: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/1.jpg)
Basic Statistics and Econometric approaches
Devesh Roy
IFPRI-ICAR training
21st September, 2015
![Page 2: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/2.jpg)
Why study Econometrics?
• Rare in economics (and many other areas without labs!) to have experimental data
• Need to use nonexperimental, or observational, data to make inferences
• Important to be able to apply economic theory to real world data
![Page 3: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/3.jpg)
Elements of econometrics
• An empirical analysis uses data to test a theory or to estimate a relationship
• A formal economic model can be tested
• Theory may be ambiguous as to the effect of some policy change; econometrics can then be used to evaluate the program
![Page 4: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/4.jpg)
Types of Data – Cross Sectional
• Cross-sectional data as a random sample
• Each observation is a new farmer, firm, etc. with information at a point in time
• If the data is not a random sample, we have a sample-selection problem
![Page 5: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/5.jpg)
Types of Data – Panel
• Can pool random cross sections and treat similar to a normal cross section. Will just need to account for time differences.
• Can follow the same randomly sampled individuals over time; this is known as panel data or longitudinal data
![Page 6: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/6.jpg)
Types of Data – Time Series
• Time series data has a separate observation for each time period – e.g. prices
• Since not a random sample, different problems to consider
• Trends and seasonality will be important
![Page 7: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/7.jpg)
The Question of Causality
• Simply establishing a relationship between variables is not sufficient
• In economics, and in most applied work, we strive to show that the effect is causal
• If we truly control for enough other variables, then the estimated ceteris paribus effect can often be considered to be causal
• Can be difficult to establish causality
![Page 8: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/8.jpg)
Example: Returns to Education (Wooldridge textbook example)
• A model of human capital investment implies getting more education should lead to higher earnings
• In the simplest case, this implies an equation like
$$\text{Earnings} = \beta_0 + \beta_1\,\text{education} + u$$
![Page 9: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/9.jpg)
Example: (continued)
• The estimate of $\beta_1$ is the return to education, but can it be considered causal?
• The error term, u, includes other factors affecting earnings, so we want to control for as many of them as possible
• Some things are still unobserved, which can threaten establishing causality
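A small simulation can make the threat from unobserved factors concrete. This is only an illustrative sketch: the variable `ability`, every coefficient value, and the sample size are hypothetical and do not come from the lecture. Earnings depend on education and on an unobserved factor that is also correlated with education, so regressing earnings on education alone overstates the assumed causal return.

```python
import random

def ols_slope(x, y):
    """Slope from the simple regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

rng = random.Random(0)      # fixed seed so the run is reproducible
true_return = 1.0           # hypothetical causal effect of education
n = 5000

# Hypothetical unobserved factor that raises both education and earnings.
ability = [rng.gauss(0, 1) for _ in range(n)]
education = [12 + 2 * a + rng.gauss(0, 1) for a in ability]
earnings = [5 + true_return * e + 3 * a + rng.gauss(0, 1)
            for e, a in zip(education, ability)]

# Regressing earnings on education alone leaves ability in the error term,
# so the error is correlated with education and the slope is biased upward.
naive_slope = ols_slope(education, earnings)
print(f"assumed true return: {true_return}, naive OLS slope: {naive_slope:.2f}")
```

Under these assumed numbers the naive slope converges to Cov(education, earnings) / Var(education) = 11/5 = 2.2 rather than 1.0, because the omitted factor sits in the error term.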
![Page 10: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/10.jpg)
Economics 20 - Prof. Anderson 10
Simple linear regression
• In $y = \beta_0 + \beta_1 x + u$, we typically refer to y as the
• Dependent Variable, or
• Left-Hand Side Variable, or
• Explained Variable, or
• Regressand
![Page 11: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/11.jpg)
Terminology, continued
• In the simple linear regression of y on x, we typically refer to x as the
• Independent Variable, or
• Right-Hand Side Variable, or
• Explanatory Variable, or
• Regressor, or
• Covariate, or
• Control Variable
![Page 12: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/12.jpg)
A Simple Assumption
• The average value of u, the error term, in the population is 0. That is,
• E(u) = 0
• This is not a restrictive assumption, since we can always use the intercept $\beta_0$ to normalize E(u) to 0
![Page 13: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/13.jpg)
Zero Conditional Mean (very important condition)
• We need to make a crucial assumption about how u and x are related
• We want it to be the case that knowing something about x does not give us any information about the average value of u. That is, that
• E(u|x) = E(u) = 0, which implies
• $E(y|x) = \beta_0 + \beta_1 x$
![Page 14: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/14.jpg)
[Figure: E(y|x) as a linear function of x, where for any x the distribution of y is centered about $E(y|x) = \beta_0 + \beta_1 x$. Horizontal axis: x, with points x1 and x2; vertical axis: y; density f(y).]
![Page 15: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/15.jpg)
Ordinary Least Squares
• Basic idea of regression is to estimate the population parameters from a sample
• Let $\{(x_i, y_i): i = 1, \dots, n\}$ denote a random sample of size n from the population
• For each observation in this sample, it will be the case that
• $y_i = \beta_0 + \beta_1 x_i + u_i$
![Page 16: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/16.jpg)
[Figure: Population regression line $E(y|x) = \beta_0 + \beta_1 x$, sample data points $(x_1, y_1), \dots, (x_4, y_4)$, and the associated error terms $u_1, \dots, u_4$. Axes: x and y.]
![Page 17: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/17.jpg)
Gauss Markov assumptions
• Assumption 1 (linearity in parameters): the population equation is linear in the (unknown) parameters
• Assumption 2 (random sampling): the sample is random
• Assumption 3 (sample variation): there is variation in x
• Assumption 4 (zero conditional mean): E(u|x) = 0
![Page 18: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/18.jpg)
Deriving OLS Estimates
• To derive the OLS estimates we need to realize that our main assumption of E(u|x) = E(u) = 0 also implies that
• Cov(x,u) = E(xu) = 0
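• Why? Remember from basic probability that Cov(X,Y) = E(XY) - E(X)E(Y). With E(u) = 0 this gives Cov(x,u) = E(xu), and E(xu) = E[x E(u|x)] = 0 by the law of iterated expectations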
![Page 19: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/19.jpg)
Deriving OLS continued
• We can write our 2 restrictions just in terms of x, y, $\beta_0$ and $\beta_1$, since $u = y - \beta_0 - \beta_1 x$
• $E(y - \beta_0 - \beta_1 x) = 0$
• $E[x(y - \beta_0 - \beta_1 x)] = 0$
• These are called moment restrictions
![Page 20: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/20.jpg)
Deriving OLS using M.O.M.
• The method of moments approach to estimation implies imposing the population moment restrictions on the sample moments
• What does this mean? Recall that for E(X), the mean of a population distribution, a sample estimator of E(X) is simply the arithmetic mean of the sample
![Page 21: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/21.jpg)
More Derivation of OLS
• We want to choose values of the parameters that will ensure that the sample versions of our moment restrictions are true
• The sample versions are as follows:
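$$n^{-1}\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0$$
$$n^{-1}\sum_{i=1}^{n} x_i\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0$$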
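The two sample moment conditions are two linear equations in the two unknown estimates, so they can be solved directly. A minimal sketch follows; the function name `solve_moment_conditions` and the data are hypothetical, chosen only for illustration.

```python
def solve_moment_conditions(x, y):
    """Solve the two sample moment conditions for (b0, b1).

    Multiplying each condition by n and rearranging gives a 2x2 linear system:
        n * b0      + sum(x)   * b1 = sum(y)
        sum(x) * b0 + sum(x^2) * b1 = sum(x*y)
    """
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    det = n * sxx - sx * sx            # zero only if x does not vary
    if det == 0:
        raise ValueError("x must vary in the sample")
    b0 = (sy * sxx - sx * sxy) / det   # Cramer's rule
    b1 = (n * sxy - sx * sy) / det
    return b0, b1

# Hypothetical data, made up for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = solve_moment_conditions(x, y)

# Both sample moments should be numerically zero at the solution.
m1 = sum(yi - b0 - b1 * xi for xi, yi in zip(x, y)) / len(x)
m2 = sum(xi * (yi - b0 - b1 * xi) for xi, yi in zip(x, y)) / len(x)
print(b0, b1, m1, m2)
```

The determinant check mirrors the requirement that x must vary in the sample; without variation the system has no unique solution.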
![Page 22: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/22.jpg)
More Derivation of OLS
• Given the definition of a sample mean, and properties of summation, we can rewrite the first condition as follows
$$\bar{y} = \hat{\beta}_0 + \hat{\beta}_1\bar{x},$$
or
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}$$
![Page 23: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/23.jpg)
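More Derivation of OLS
$$\begin{aligned}
\sum_{i=1}^{n} x_i\left(y_i - (\bar{y} - \hat{\beta}_1\bar{x}) - \hat{\beta}_1 x_i\right) &= 0 \\
\sum_{i=1}^{n} x_i(y_i - \bar{y}) &= \hat{\beta}_1\sum_{i=1}^{n} x_i(x_i - \bar{x}) \\
\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) &= \hat{\beta}_1\sum_{i=1}^{n} (x_i - \bar{x})^2
\end{aligned}$$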
![Page 24: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/24.jpg)
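So the OLS estimated slope is
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \quad \text{provided that } \sum_{i=1}^{n}(x_i - \bar{x})^2 > 0$$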
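The slope formula, together with the intercept formula (sample mean of y minus the slope times the sample mean of x), translates almost line for line into code. This is a minimal sketch; the function name `ols_simple` and the data are hypothetical.

```python
def ols_simple(x, y):
    """Return (intercept, slope) for the simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of cross-deviations. Denominator: sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary in the sample")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data, made up for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0_hat, b1_hat = ols_simple(x, y)
print(f"intercept = {b0_hat:.2f}, slope = {b1_hat:.2f}")
```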
![Page 25: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/25.jpg)
Summary of OLS slope estimate
• The slope estimate is the sample covariance between x and y divided by the sample variance of x
• If x and y are positively correlated, the slope will be positive
• If x and y are negatively correlated, the slope will be negative
• Only need x to vary in our sample
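The covariance-over-variance reading of the slope, and the sign statements, can be checked on two small made-up data sets. The helper names and the numbers below are hypothetical.

```python
def sample_cov(x, y):
    """Sample covariance with the usual n - 1 divisor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

def sample_var(x):
    """Sample variance, i.e. the sample covariance of x with itself."""
    return sample_cov(x, x)

def slope(x, y):
    # The n - 1 divisors cancel, so this equals the usual OLS slope formula.
    return sample_cov(x, y) / sample_var(x)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.1, 3.9, 6.2, 7.8, 10.1]     # rises with x
y_down = [10.0, 8.0, 6.5, 4.0, 2.0]   # falls with x

slope_up = slope(x, y_up)
slope_down = slope(x, y_down)
print(slope_up, slope_down)
```

Because the sample variance of x is always positive when x varies, the sign of the slope is the sign of the sample covariance.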
![Page 26: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/26.jpg)
More OLS
• Intuitively, OLS is fitting a line through the sample points such that the sum of squared residuals is as small as possible, hence the term least squares
• The residual, û, is an estimate of the error term, u, and is the difference between the fitted line (sample regression function) and the sample point
![Page 27: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/27.jpg)
[Figure: Sample regression line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, sample data points $(x_1, y_1), \dots, (x_4, y_4)$, and the associated estimated error terms (residuals) $\hat{u}_1, \dots, \hat{u}_4$. Axes: x and y.]
![Page 28: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/28.jpg)
Minimizing the sum of squared residuals
• Given the intuitive idea of fitting a line, we can set up a formal minimization problem
• That is, we want to choose our parameters such that we minimize the following:
$$\sum_{i=1}^{n}\hat{u}_i^2 = \sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right)^2$$
![Page 29: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/29.jpg)
Alternate approach, continued
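• Using calculus to solve the minimization problem for the two parameters gives the following first order conditions, which are the same as the sample moment conditions obtained before, multiplied by n
$$\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0$$
$$\sum_{i=1}^{n} x_i\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0$$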
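As a quick numerical check that the closed-form estimates really do minimize the sum of squared residuals, one can nudge each estimate slightly and confirm that the fit gets worse. This is a sketch on hypothetical data; the step size of 0.01 is arbitrary.

```python
def ssr(b0, b1, x, y):
    """Sum of squared residuals for the candidate line b0 + b1 * x."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

def ols_simple(x, y):
    """Closed-form OLS intercept and slope."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data, made up for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0_hat, b1_hat = ols_simple(x, y)
ssr_min = ssr(b0_hat, b1_hat, x, y)

# Move each parameter up and down by a small step; every nudged line
# should have a strictly larger sum of squared residuals.
step = 0.01
nudged = [ssr(b0_hat + d0, b1_hat + d1, x, y)
          for d0 in (-step, 0.0, step)
          for d1 in (-step, 0.0, step)
          if (d0, d1) != (0.0, 0.0)]
print(ssr_min, min(nudged))
```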
![Page 30: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/30.jpg)
Algebraic Properties of OLS
• The sum of the OLS residuals is zero
• Thus, the sample average of the OLS residuals is zero as well
• The sample covariance between the regressors and the OLS residuals is zero
• The OLS regression line always goes through the mean of the sample
![Page 31: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/31.jpg)
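Algebraic Properties (precise)
$$\sum_{i=1}^{n}\hat{u}_i = 0 \quad \text{and thus,} \quad \frac{\sum_{i=1}^{n}\hat{u}_i}{n} = 0$$
$$\sum_{i=1}^{n} x_i\hat{u}_i = 0$$
$$\bar{y} = \hat{\beta}_0 + \hat{\beta}_1\bar{x}$$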
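These three algebraic properties hold in any sample, so they are easy to verify numerically. The sketch below uses hypothetical data and compares against zero with a small tolerance because of floating-point rounding.

```python
def ols_simple(x, y):
    """Closed-form OLS intercept and slope."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data, made up for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
b0_hat, b1_hat = ols_simple(x, y)
residuals = [yi - b0_hat - b1_hat * xi for xi, yi in zip(x, y)]

sum_resid = sum(residuals)                                 # property 1: zero
sum_x_resid = sum(xi * ri for xi, ri in zip(x, residuals)) # property 2: zero
x_bar, y_bar = sum(x) / n, sum(y) / n
gap_at_means = y_bar - (b0_hat + b1_hat * x_bar)           # property 3: zero

print(sum_resid, sum_x_resid, gap_at_means)
```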
![Page 32: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/32.jpg)
More terminology
We can think of each observation as being made up of an explained part and an unexplained part, $y_i = \hat{y}_i + \hat{u}_i$. We then define the following:
$$\sum (y_i - \bar{y})^2 \text{ is the total sum of squares (SST)}$$
$$\sum (\hat{y}_i - \bar{y})^2 \text{ is the explained sum of squares (SSE)}$$
$$\sum \hat{u}_i^2 \text{ is the residual sum of squares (SSR)}$$
Then SST = SSE + SSR
![Page 33: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/33.jpg)
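Proof that SST = SSE + SSR
$$\begin{aligned}
\sum (y_i - \bar{y})^2 &= \sum \left[(y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})\right]^2 \\
&= \sum \left[\hat{u}_i + (\hat{y}_i - \bar{y})\right]^2 \\
&= \sum \hat{u}_i^2 + 2\sum \hat{u}_i(\hat{y}_i - \bar{y}) + \sum (\hat{y}_i - \bar{y})^2 \\
&= \text{SSR} + 2\sum \hat{u}_i(\hat{y}_i - \bar{y}) + \text{SSE}
\end{aligned}$$
and we know that $\sum \hat{u}_i(\hat{y}_i - \bar{y}) = 0$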
![Page 34: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/34.jpg)
Goodness-of-Fit
• How do we think about how well our sample regression line fits our sample data?
• Can compute the fraction of the total sum of squares (SST) that is explained by the model, call this the R-squared of regression
• $R^2 = \text{SSE}/\text{SST} = 1 - \text{SSR}/\text{SST}$
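The decomposition of the total sum of squares and the R-squared can be computed in a few lines. This sketch uses hypothetical data; the variable names are chosen for illustration.

```python
def ols_simple(x, y):
    """Closed-form OLS intercept and slope."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data, made up for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0_hat, b1_hat = ols_simple(x, y)

y_bar = sum(y) / len(y)
fitted = [b0_hat + b1_hat * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                 # total
sse = sum((fi - y_bar) ** 2 for fi in fitted)            # explained
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual

r2 = sse / sst
r2_alt = 1 - ssr / sst   # same number, computed the other way
print(sst, sse, ssr, r2)
```

Both expressions for R-squared agree because SST = SSE + SSR holds exactly for OLS with an intercept.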
![Page 35: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/35.jpg)
Unbiasedness of OLS
• Assume the population model is linear in parameters: $y = \beta_0 + \beta_1 x + u$
• Assume we can use a random sample of size n, $\{(x_i, y_i): i = 1, 2, \dots, n\}$, from the population model. Thus we can write the sample model $y_i = \beta_0 + \beta_1 x_i + u_i$
• Assume E(u|x) = 0 and thus $E(u_i|x_i) = 0$
• Many of the problems we will face in econometric estimation arise from violations of this assumption
• Assume there is variation in the $x_i$
![Page 36: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/36.jpg)
Unbiasedness of OLS (cont)
• In order to think about unbiasedness, we need to rewrite our estimator in terms of the population parameter
• Start with a simple rewrite of the formula as
$$\hat{\beta}_1 = \frac{\sum (x_i - \bar{x})\,y_i}{s_x^2}, \quad \text{where } s_x^2 \equiv \sum (x_i - \bar{x})^2$$
![Page 37: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/37.jpg)
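Unbiasedness of OLS (cont)
$$\begin{aligned}
\sum (x_i - \bar{x})\,y_i &= \sum (x_i - \bar{x})(\beta_0 + \beta_1 x_i + u_i) \\
&= \sum (x_i - \bar{x})\beta_0 + \sum (x_i - \bar{x})\beta_1 x_i + \sum (x_i - \bar{x})u_i \\
&= \beta_0\sum (x_i - \bar{x}) + \beta_1\sum (x_i - \bar{x})x_i + \sum (x_i - \bar{x})u_i
\end{aligned}$$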
![Page 38: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/38.jpg)
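Unbiasedness of OLS (cont)
$$\sum (x_i - \bar{x}) = 0, \qquad \sum (x_i - \bar{x})\,x_i = \sum (x_i - \bar{x})^2,$$
so the numerator can be rewritten as
$$\beta_1 s_x^2 + \sum (x_i - \bar{x})\,u_i,$$
and thus
$$\hat{\beta}_1 = \beta_1 + \frac{\sum (x_i - \bar{x})\,u_i}{s_x^2}$$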
![Page 39: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/39.jpg)
Unbiasedness Summary
• The OLS estimates of $\beta_1$ and $\beta_0$ are unbiased
• Proof of unbiasedness depends on assumptions – if any assumption fails, then OLS is not necessarily unbiased
• Remember unbiasedness is a description of the estimator – in a given sample we may be “near” or “far” from the true parameter
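Unbiasedness is a statement about the estimator across repeated samples, which a small Monte Carlo experiment can illustrate. In this sketch the true parameter values, the sample size, the number of replications, and the distributions of x and u are all hypothetical choices; the errors are drawn independently of x, so the zero conditional mean assumption holds by construction.

```python
import random

def ols_slope(x, y):
    """Slope from the simple regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

rng = random.Random(42)      # fixed seed so the run is reproducible
beta0, beta1 = 1.0, 2.0      # hypothetical true parameters
n, reps = 50, 2000

slopes = []
for _ in range(reps):
    x = [rng.uniform(0, 10) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]   # independent of x, mean zero
    y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]
    slopes.append(ols_slope(x, y))

mean_slope = sum(slopes) / reps
print(f"average slope over {reps} samples: {mean_slope:.3f}")
print(f"smallest and largest single-sample slopes: {min(slopes):.3f}, {max(slopes):.3f}")
```

Individual estimates land on both sides of the true slope, while their average sits close to it, which is what "unbiased but possibly near or far in a given sample" means.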
![Page 40: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/40.jpg)
Variance of the OLS Estimators
• Now we know that the sampling distribution of our estimate is centered around the true parameter
• Want to think about how spread out this distribution is
• Much easier to think about this variance under an additional assumption, so
• Assume $\mathrm{Var}(u|x) = \sigma^2$ (homoskedasticity)
![Page 41: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/41.jpg)
Variance of OLS (cont)
• $\mathrm{Var}(u|x) = E(u^2|x) - [E(u|x)]^2$
• E(u|x) = 0, so $\sigma^2 = E(u^2|x) = E(u^2) = \mathrm{Var}(u)$
• Thus $\sigma^2$ is also the unconditional variance, called the error variance
• $\sigma$, the square root of the error variance, is called the standard deviation of the error
• Can say: $E(y|x) = \beta_0 + \beta_1 x$ and $\mathrm{Var}(y|x) = \sigma^2$
![Page 42: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/42.jpg)
[Figure: Homoskedastic case. The conditional density f(y|x) is drawn at x1 and x2 with the same spread, centered on the line $E(y|x) = \beta_0 + \beta_1 x$. Axes: x and y.]
![Page 43: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/43.jpg)
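[Figure: Heteroskedastic case. The conditional density f(y|x) is drawn at x1, x2 and x3 with a spread that changes with x, centered on the line $E(y|x) = \beta_0 + \beta_1 x$. Horizontal axis: x.]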
![Page 44: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/44.jpg)
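Variance of OLS (cont)
Writing $d_i = x_i - \bar{x}$, so that $\hat{\beta}_1 = \beta_1 + \left(1/s_x^2\right)\sum d_i u_i$:
$$\begin{aligned}
\mathrm{Var}(\hat{\beta}_1) &= \mathrm{Var}\left(\beta_1 + \frac{1}{s_x^2}\sum d_i u_i\right) \\
&= \left(\frac{1}{s_x^2}\right)^2 \mathrm{Var}\left(\sum d_i u_i\right) \\
&= \left(\frac{1}{s_x^2}\right)^2 \sum d_i^2\,\mathrm{Var}(u_i) \\
&= \left(\frac{1}{s_x^2}\right)^2 \sum d_i^2\,\sigma^2 \\
&= \sigma^2\left(\frac{1}{s_x^2}\right)^2 \sum d_i^2 \\
&= \sigma^2\left(\frac{1}{s_x^2}\right)^2 s_x^2 = \frac{\sigma^2}{s_x^2}
\end{aligned}$$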
![Page 45: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/45.jpg)
Variance of OLS Summary
• The larger the error variance, $\sigma^2$, the larger the variance of the slope estimate
• The larger the variability in the $x_i$, the smaller the variance of the slope estimate
• As a result, a larger sample size should decrease the variance of the slope estimate, because the total variation in the $x_i$ grows with n
• A practical problem is that the error variance is unknown
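The variance result (error variance divided by the total variation in x) can be compared with a simulation in which the x values are held fixed and only the errors are redrawn. All numbers in this sketch, including the error standard deviation of 2 and the x values 0 to 19, are hypothetical.

```python
import random

rng = random.Random(7)                 # fixed seed so the run is reproducible
beta0, beta1, sigma = 1.0, 2.0, 2.0    # hypothetical true values

x = [float(i) for i in range(20)]      # held fixed across samples
x_bar = sum(x) / len(x)
s_x2 = sum((xi - x_bar) ** 2 for xi in x)   # total variation in x (665 here)
theoretical_var = sigma ** 2 / s_x2

reps = 5000
slopes = []
for _ in range(reps):
    # Homoskedastic errors: the same sigma at every x.
    y = [beta0 + beta1 * xi + rng.gauss(0, sigma) for xi in x]
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slopes.append(sxy / s_x2)

mean_slope = sum(slopes) / reps
empirical_var = sum((b - mean_slope) ** 2 for b in slopes) / (reps - 1)
print(f"theoretical variance: {theoretical_var:.5f}")
print(f"simulated variance:   {empirical_var:.5f}")
```

Doubling `sigma` or shrinking the spread of `x` in this sketch is a quick way to see the first two bullets at work.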
![Page 46: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/46.jpg)
Estimating the Error Variance
• We don’t know what the error variance, $\sigma^2$, is, because we don’t observe the errors, $u_i$
• What we observe are the residuals, $\hat{u}_i$
• We can use the residuals to form an estimate of the error variance
![Page 47: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/47.jpg)
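Error Variance Estimate (cont)
$$\begin{aligned}
\hat{u}_i &= y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \\
&= (\beta_0 + \beta_1 x_i + u_i) - \hat{\beta}_0 - \hat{\beta}_1 x_i \\
&= u_i - (\hat{\beta}_0 - \beta_0) - (\hat{\beta}_1 - \beta_1)x_i
\end{aligned}$$
Then, an unbiased estimator of $\sigma^2$ is
$$\hat{\sigma}^2 = \frac{1}{n-2}\sum \hat{u}_i^2 = \text{SSR}/(n-2)$$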
![Page 48: ICAR- IFPRI - Basic statistics and econometric approaches lecture 3 - Devesh Roy](https://reader031.vdocument.in/reader031/viewer/2022022412/58f1a5471a28ab276c8b4577/html5/thumbnails/48.jpg)
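Error Variance Estimate (cont)
$$\hat{\sigma} = \sqrt{\hat{\sigma}^2} = \text{standard error of the regression}$$
Recall that $\mathrm{sd}(\hat{\beta}_1) = \sigma / s_x$. If we substitute $\hat{\sigma}$ for $\sigma$, then we have the standard error of $\hat{\beta}_1$:
$$\mathrm{se}(\hat{\beta}_1) = \hat{\sigma} \Big/ \left(\sum (x_i - \bar{x})^2\right)^{1/2}$$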
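Putting the last two slides together, the error variance estimate, the standard error of the regression, and the standard error of the slope can be computed from the residuals. This is a minimal sketch on hypothetical data; variable names are chosen for illustration.

```python
import math

def ols_simple(x, y):
    """Closed-form OLS intercept and slope."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data, made up for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
b0_hat, b1_hat = ols_simple(x, y)

residuals = [yi - b0_hat - b1_hat * xi for xi, yi in zip(x, y)]
ssr = sum(r * r for r in residuals)

sigma2_hat = ssr / (n - 2)           # divide by n - 2, not n
sigma_hat = math.sqrt(sigma2_hat)    # standard error of the regression

x_bar = sum(x) / n
s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x))
se_b1 = sigma_hat / s_x              # standard error of the slope estimate

print(f"sigma_hat = {sigma_hat:.4f}, se(b1) = {se_b1:.4f}")
```

The divisor n - 2 reflects the two estimated parameters (intercept and slope) used to form the residuals.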