Introduction to Econometrics
Lecture 1: Causal Inference in Social Science

Zhaopeng Qu

Business School, Nanjing University

Sep. 11th, 2017

Zhaopeng Qu (Nanjing University) Introduction to Econometrics Sep. 11th, 2017 1 / 70


Outline

1 Review of Probability Theory
    Probabilities, the Sample Space and Random Variables
    Expected Values, Mean, and Variance
    Multiple Random Variables
    Properties of Joint Distributions
    Conditional Distributions
    Famous Distributions

2 Causal Inference in Social Science


Review of Probability Theory


A Fundamental Axiom of Econometrics

1 Any economy can be viewed as a stochastic process governed by some probability law.

2 Economic phenomena, as often summarized in the form of data, can be viewed as a realization of this stochastic data-generating process.


Probabilities and the Sample Space

Random Phenomena, Outcomes and Probabilities
The mutually exclusive potential results of a random process are called the outcomes.
The probability of an outcome is the proportion of the time that the outcome occurs in the long run.

The Sample Space and Random Events
The set of all possible outcomes is called the sample space.
An event is a subset of the sample space, that is, a set of one or more outcomes.


Random Variables

Random Variables (R.V.)
A random variable (r.v.) is a function that maps from the sample space of an experiment to the real line, $X : \Omega \to \mathbb{R}$.

A random variable is a numerical summary of a random outcome.
Random variables are numeric representations of uncertain events (thus we can use math!).
Notation: r.v.s are usually denoted by upper-case letters (e.g. X); particular realizations are denoted by the corresponding lower-case letters (e.g. x = 3).

Example
Tossing a coin 5 times:

An outcome such as ω = HTHTT is an element of the sample space, but it is not a random variable because it is not numeric.
X(ω) = number of heads in the five tosses, so X(HTHTT) = 2.


Probability Distributions

Uncertainty over Ω implies uncertainty over the value of X. We'll use probability to formalize this uncertainty.
The probability distribution of a r.v. gives the probability of each of the possible values of the r.v.:

$$P_X(X = x) = P(\{\omega \in \Omega : X(\omega) = x\})$$

Example
Tossing two coins: let X be the number of heads.

ω     P(ω)   X(ω)
TT    1/4    0
HT    1/4    1
TH    1/4    1
HH    1/4    2

x     P(X = x)
0     1/4
1     1/2
2     1/4
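The two-coin table can be reproduced by enumerating the sample space directly. The following is a minimal sketch, assuming two fair coins; the names `sample_space`, `num_heads`, and `pmf` are illustrative and do not come from the lecture.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Sample space for two fair coin tosses: each outcome has probability 1/4.
sample_space = ["".join(w) for w in product("HT", repeat=2)]
prob = {w: Fraction(1, 4) for w in sample_space}

def num_heads(outcome):
    """Random variable X: maps an outcome (e.g. 'HT') to a number."""
    return outcome.count("H")

# P(X = x) = P({w in the sample space : X(w) = x})
pmf = defaultdict(Fraction)
for w, p in prob.items():
    pmf[num_heads(w)] += p

for x in sorted(pmf):
    print(x, pmf[x])
```

Exact fractions are used instead of floats so that the probabilities match the table without rounding.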


Distributional Functions of R.V.

It is cumbersome to derive the probabilities of X each time we need them, so it is helpful to have a function that gives us the probability of values or sets of values of X.

Definition
The cumulative distribution function, or c.d.f., of a r.v. X, denoted $F_X(x)$, is defined by

$$F_X(x) \equiv P_X(X \le x)$$

The c.d.f. tells us the probability of a r.v. being less than or equal to some given value.


Distribution Functions of R.V.

We have two kinds of r.v.s.

Definition
A r.v. X is discrete if its range (the set of values it can take) is finite, $X \in \{x_1, x_2, \dots, x_k\}$, or countably infinite, $X \in \{x_1, x_2, \dots\}$.

e.g. the number of computer crashes before a deadline

Definition
A r.v. X is continuous if it can take any real value in an interval. There are uncountably many possible realizations.

e.g. commuting time from home to school


Probability Distribution of a Discrete R.V.

Probability mass function
The probability mass function (p.m.f.) describes the distribution of a r.v. when it is discrete:

$$f_X(x_k) = P(X = x_k) = p_k, \quad k = 1, 2, \dots, n$$


Probability Distribution of a Discrete R.V.

c.d.f. of a discrete r.v.
The c.d.f. of a discrete r.v. is

$$F_X(x) = P(X \le x) = \sum_{x_k \le x} f_X(x_k)$$
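As a small illustration of this sum, the sketch below builds the c.d.f. from the p.m.f. of the number of heads in two fair coin tosses (the earlier two-coin example). The function name `cdf` is an assumption made for this sketch.

```python
from fractions import Fraction

# p.m.f. of X = number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def cdf(x):
    """F_X(x): sum of f_X(x_k) over all support points x_k <= x."""
    return sum((p for xk, p in pmf.items() if xk <= x), Fraction(0))

# The c.d.f. is a step function: flat between support points, jumping at each one.
print(cdf(-1), cdf(0), cdf(1.5), cdf(2))
```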


Probability Distribution of a Continuous R.V.

Probability density function
The probability density function, or p.d.f., of a continuous random variable X is the function $f_X$ that satisfies, for any interval B,

$$P(X \in B) = \int_B f_X(x)\,dx$$


Probability Distribution of a Continuous R.V.

Specifically, for an interval (a, b) of the real line:

$$P(a < X < b) = \int_a^b f_X(x)\,dx$$

Thus the probability of a region is the area under the p.d.f. over that region.
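The "area under the p.d.f." idea can be checked numerically. The sketch below assumes an exponential density with rate 1, chosen only because its integral has a simple closed form; the density and the helper `prob_between` are illustrative and not part of the lecture.

```python
import math

def pdf(x):
    """Assumed example density: exponential with rate 1."""
    return math.exp(-x) if x >= 0 else 0.0

def prob_between(a, b, n=10_000):
    """Approximate P(a < X < b) as the area under the p.d.f. (midpoint rule)."""
    width = (b - a) / n
    return sum(pdf(a + (i + 0.5) * width) for i in range(n)) * width

approx = prob_between(1.0, 2.0)
exact = math.exp(-1) - math.exp(-2)  # closed form for this particular density
print(approx, exact)
```

The midpoint rule is used only for simplicity; any numerical integration scheme would illustrate the same point.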


Probability Distribution of a Continuous R.V.

Cumulative probability distribution
Defined just as for a discrete random variable, except that the p.d.f. is integrated to obtain the probability that X is at most x:

$$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt$$


Properties of Distributions

Probability distributions describe the uncertainty about r.v.s. The c.d.f./p.m.f./p.d.f. gives us all the information about the distribution of some r.v., but we are quite often interested in some feature of the distribution rather than the entire distribution.

What is the difference between these two density curves? How might we summarize this difference?

There are two simple indicators:
1 Central tendency: where the center of the distribution is.
    Mean/expectation
2 Spread: how spread out the distribution is around the center.
    Variance/standard deviation


The Expected Value of a Random Variable

The expected value of a random variable X, denoted E(X) or $\mu_X$, is the long-run average value of the random variable over many repeated trials or occurrences. It is a natural measure of central tendency.

For a discrete r.v. $X \in \{x_1, x_2, \dots, x_k\}$ with $p_j = P(X = x_j)$,

$$\mu_X = E[X] = \sum_{j=1}^{k} x_j p_j$$

It is computed as a weighted average of the values of the r.v., where the weights are the probabilities of each value occurring.

For a continuous r.v. X, use the integral

$$\mu_X = E[X] = \int_{-\infty}^{+\infty} x f_X(x)\,dx$$


Properties of Expectation

1 Additivity: the expectation of a sum is the sum of the expectations

$$E[X + Y] = E[X] + E[Y]$$

2 Homogeneity: suppose that a and b are constants. Then

$$E[aX + b] = aE[X] + b$$

3 Law of the Unconscious Statistician (LOTUS): if g(X) is a function of a random variable X, then

$$E[g(X)] = \begin{cases} \sum_x g(x) f_X(x) & \text{when } X \text{ is discrete} \\ \int g(x) f_X(x)\,dx & \text{when } X \text{ is continuous} \end{cases}$$
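The discrete case of these properties can be checked with a few lines of code. The sketch below assumes a fair six-sided die as the example distribution; the helper `expect` is a hypothetical name introduced here.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, f_X(x) = 1/6 for x = 1..6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expect(g):
    """E[g(X)] = sum over x of g(x) * f_X(x) (discrete LOTUS)."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expect(lambda x: x)               # E[X]
second_moment = expect(lambda x: x**2)   # E[X^2], without deriving the p.m.f. of X^2
linear = expect(lambda x: 3 * x + 2)     # E[3X + 2]

print(mean, second_moment, linear)
```

Here `linear` equals `3 * mean + 2`, which is the second property with a = 3 and b = 2.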


The Variance of a Random Variable

Besides some sense of where the middle of the distribution is, we also want to know how spread out the distribution is around that middle.

Definition
The variance of a random variable X, denoted Var(X) or $\sigma_X^2$, is

$$\sigma_X^2 = \mathrm{Var}(X) = E[(X - \mu_X)^2]$$

The standard deviation of X, denoted $\sigma_X$, is just the square root of the variance:

$$\sigma_X = \sqrt{\mathrm{Var}(X)}$$


Properties of Variance

If a and b are constants, then we have the following properties:
1 $V(b) = 0$
2 $V(aX + b) = a^2 V(X)$
3 $V(X) = E[X^2] - (E[X])^2$

Example
Bernoulli Distribution:


Why multiple random variables?

We are going to want to know what the relationships are between variables.
“The objective of science is the discovery of the relations” (Lord Kelvin)
In most cases, we want to explore the relationship between two variables in one study.

e.g. mortality and GDP growth


Joint Probability Distribution

Consider two discrete random variables X and Y with a joint probability distribution. The joint probability mass function of (X, Y) describes the probability of any pair of values:

$$f_{X,Y}(x, y) = P(X = x, Y = y) = p_{xy}$$


Marginal Probability Distribution

The marginal distribution: we often need to figure out the distribution of just one of the r.v.s.

$$f_Y(y) = P(Y = y) = \sum_x f_{X,Y}(x, y)$$

Intuition: sum the joint probabilities of Y = y over all possible values of x.
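Summing out one variable is easy to see on a small table. The sketch below assumes a made-up joint p.m.f. on a 2x2 grid; the probabilities and the names `joint`, `marginal_x`, and `marginal_y` are illustrative only.

```python
from collections import defaultdict
from fractions import Fraction

# Assumed joint p.m.f. f_{X,Y}(x, y); keys are (x, y) pairs.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

# f_Y(y) = sum over x of f_{X,Y}(x, y); f_X(x) is obtained symmetrically.
marginal_x = defaultdict(Fraction)
marginal_y = defaultdict(Fraction)
for (x, y), p in joint.items():
    marginal_x[x] += p
    marginal_y[y] += p

print(dict(marginal_x), dict(marginal_y))
```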


Joint Probability Density Function

Consider two continuous random variables X and Y with a joint probability distribution. The joint probability density function of (X, Y) is a function, denoted $f_{X,Y}(x, y)$, such that:

1 $f_{X,Y}(x, y) \ge 0$
2 $\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\,dx\,dy = 1$
3 $P(a < X < b,\ c < Y < d) = \int_c^d \int_a^b f_{X,Y}(x, y)\,dx\,dy$, i.e. the probability that (X, Y) falls in the rectangle bounded by a, b, c and d.


Joint Probability Density Function

The X and Y axes lie on the “floor”; the height is the value of $f_{X,Y}(x, y)$.


Joint Probability Density Function

The probability equals the volume under the joint p.d.f. above a specific region A:

$$P((X, Y) \in A) = \iint_{(x, y) \in A} f_{X,Y}(x, y)\,dx\,dy$$


Continuous Marginal Distribution

The marginal p.d.f. of Y is obtained by integrating over the distribution of X:

$$f_Y(y) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\,dx$$

The marginal p.d.f. of X is obtained by integrating over the distribution of Y:

$$f_X(x) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\,dy$$


Continuous Marginal Distribution

Pile up all of the joint density onto a single dimension.

Joint Cumulative Distribution Function

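The joint cumulative distribution function of (X, Y) is

$$F_{X,Y}(x, y) = P(X \le x, Y \le y) = \int_{-\infty}^{y} \int_{-\infty}^{x} f_{X,Y}(u, v)\,du\,dv$$

The joint p.d.f. is recovered from the joint c.d.f. by differentiating with respect to both arguments:

$$f_{X,Y}(x, y) = \frac{\partial^2 F_{X,Y}(x, y)}{\partial x\,\partial y}$$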


Expectations over multiple r.v.s

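Expectations over multiple r.v.s

$$E[g(X, Y)] = \begin{cases} \sum_x \sum_y g(x, y) f_{X,Y}(x, y) & \text{if } X, Y \text{ are discrete} \\ \int_x \int_y g(x, y) f_{X,Y}(x, y)\,dx\,dy & \text{if } X, Y \text{ are continuous} \end{cases}$$

Marginal expectation

$$E[Y] = \begin{cases} \sum_x \sum_y y\, f_{X,Y}(x, y) & \text{if } X, Y \text{ are discrete} \\ \int_x \int_y y\, f_{X,Y}(x, y)\,dx\,dy & \text{if } X, Y \text{ are continuous} \end{cases}$$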


Independence

Independence

Two r.v.s X and Y are independent, denoted X ⊥ Y, if for all sets A and B

$$P(X \in A, Y \in B) = P(X \in A)\,P(Y \in B)$$

Intuition: knowing the value of X gives us no information about the value of Y.
If X and Y are independent, then
    the joint p.d.f. is the product of the marginal p.d.f.s: $f_{X,Y}(x, y) = f_X(x) f_Y(y)$
    the joint c.d.f. is the product of the marginal c.d.f.s: $F_{X,Y}(x, y) = F_X(x) F_Y(y)$
    functions of independent r.v.s are independent: h(X) ⊥ g(Y) for any functions h(·) and g(·).


Independence

Theorem (Independence)
If X and Y are independent r.v.s, then

$$E[XY] = E[X]\,E[Y]$$

Proof.
Skipped; it is left as an exercise.
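For reference, one way to complete the argument in the discrete case is sketched below. It is not the lecture's own proof; it uses only the formula for the expectation of a function of two r.v.s and the fact that, under independence, the joint p.m.f. factors into the product of the marginals. The continuous case is analogous, with integrals in place of sums.

```latex
\begin{aligned}
E[XY] &= \sum_x \sum_y x\,y\, f_{X,Y}(x, y) \\
      &= \sum_x \sum_y x\,y\, f_X(x)\, f_Y(y)
         && \text{(independence: } f_{X,Y} = f_X f_Y\text{)} \\
      &= \Big(\sum_x x\, f_X(x)\Big) \Big(\sum_y y\, f_Y(y)\Big) \\
      &= E[X]\,E[Y]
\end{aligned}
```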


Covariance

If two variables are not independent, we can still measure the strength of their linear dependence using the covariance.

Covariance
The covariance between X and Y is defined as

$$\mathrm{Cov}[X, Y] = E[(X - E[X])(Y - E[Y])]$$

Properties of covariances:
    Cov[X, Y] = E[XY] − E[X]E[Y]
    If X ⊥ Y, then Cov[X, Y] = 0


Intuition of Covariance



Covariance and Independence

Properties of covariances:
    Cov[X, Y] = E[XY] − E[X]E[Y]
    Cov[aX + b, cY + d] = ac Cov[X, Y]
    Cov[X, X] = Var[X]

Covariance and Independence
    If X ⊥ Y, then Cov[X, Y] = 0; thus independence ⇒ Cov[X, Y] = 0.
    If Cov[X, Y] = 0, then X ⊥ Y? No! Cov[X, Y] = 0 ⇏ independence.
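A standard counterexample makes the last point concrete. The sketch below assumes X is uniform on {-1, 0, 1} and Y = X², an example chosen for this note rather than taken from the lecture: Y is a deterministic function of X, yet the covariance is exactly zero.

```python
from fractions import Fraction

# Assumed example: X is uniform on {-1, 0, 1} and Y = X^2, so Y depends on X.
support = [-1, 0, 1]
p = Fraction(1, 3)

e_x = sum(x * p for x in support)           # E[X]
e_y = sum(x**2 * p for x in support)        # E[Y] = E[X^2]
e_xy = sum(x * x**2 * p for x in support)   # E[XY] = E[X^3]
cov = e_xy - e_x * e_y                      # Cov[X, Y] = E[XY] - E[X]E[Y]

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_joint = p        # Y = 0 exactly when X = 0, so P(X=0, Y=0) = 1/3
p_product = p * p  # P(X=0) * P(Y=0) = 1/3 * 1/3

print(cov, p_joint, p_product)
```

The covariance vanishes because the relationship is symmetric around zero; covariance only picks up linear association.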


Covariance and Correlation

Covariance is not scale-free. Correlation is a special form of covariance obtained by dividing out the scales of the respective variables.

Correlation
The correlation between X and Y is defined as

$$\rho_{XY} = \frac{\mathrm{Cov}[X, Y]}{\sqrt{\mathrm{Var}[X]\,\mathrm{Var}[Y]}}$$

Correlation properties:
    $-1 \le \rho_{XY} \le 1$
    If $|\rho_{XY}| = 1$, then X and Y are perfectly correlated with a linear relationship: Y = a + bX with b ≠ 0.
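Both properties, and the scale-free nature of correlation, can be seen on a tiny example. The sketch below treats each listed (x, y) pair as equally likely; the data values and the helper `correlation` are assumptions made for illustration.

```python
import math

def correlation(xs, ys):
    """Correlation of (X, Y) when each listed pair is equally likely."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mx) ** 2 for x in xs) / n
    var_y = sum((y - my) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)

xs = [1.0, 2.0, 3.0, 4.0]
rho_pos = correlation(xs, [2 + 3 * x for x in xs])  # Y = 2 + 3X
rho_neg = correlation(xs, [2 - 3 * x for x in xs])  # Y = 2 - 3X
# Rescaling X (e.g. changing units) leaves the correlation unchanged.
rho_scaled = correlation([100 * x for x in xs], [2 + 3 * x for x in xs])

print(rho_pos, rho_neg, rho_scaled)
```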


Conditional Probability Function

Conditional probability mass function
The conditional probability mass function (conditional p.m.f.) of Y conditional on X is

$$f_{Y|X}(y \mid x) = P(Y = y \mid X = x) = \frac{P(X = x, Y = y)}{P(X = x)} = \frac{f_{X,Y}(x, y)}{f_X(x)}$$


Conditional Density Function

Conditional probability density function: the p.d.f. of Y conditional on X is

$$f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)}$$

Based on the definition of the conditional p.m.f./p.d.f., we have the following equation:

$$f_{X,Y}(x, y) = f_{Y|X}(y \mid x)\, f_X(x)$$


Conditional Density Function

The conditional p.d.f. is proportional to the joint p.d.f. along x = x0, like a slice of the total volume.


Conditional Independence

Conditional Independence
X and Y are conditionally independent given Z, denoted X ⊥ Y | Z, if

$$f_{X,Y|Z}(x, y \mid z) = f_{X|Z}(x \mid z)\, f_{Y|Z}(y \mid z)$$

X and Y are independent within levels of Z.
Example:
    X = swimming accidents, Y = ice cream sold.
    In general, the two variables are highly correlated.
    Conditional on Z = temperature, they are independent.


Conditional Expectation Function

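Conditional Expectation
The conditional expectation of Y given X = x is

$$E(Y \mid X = x) = \begin{cases} \sum_y y\, f_{Y|X}(y \mid x) & \text{discrete } Y \\ \int y\, f_{Y|X}(y \mid x)\,dy & \text{continuous } Y \end{cases}$$

The Conditional Expectation Function (CEF), E(Y | X = x), is a function of x. Since X is a random variable, the CEF evaluated at X, E(Y | X), is also a random variable.
Intuition: an expectation is an average, and a conditional expectation is a "group-wise average", that is, "the mean under the condition that ...".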
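The reading of a conditional expectation as an average taken within groups can be sketched in a few lines. The toy data below are invented for illustration (x could stand for years of schooling and y for a wage); the sample group means are estimates of E[Y | X = x], not the population CEF itself.

```python
from collections import defaultdict

# Assumed toy data: (x, y) pairs, e.g. x = years of schooling, y = hourly wage.
data = [(12, 10.0), (12, 14.0), (16, 18.0), (16, 22.0), (16, 26.0)]

# Estimate E[Y | X = x] by averaging y within each value of x.
groups = defaultdict(list)
for x, y in data:
    groups[x].append(y)

cef = {x: sum(ys) / len(ys) for x, ys in groups.items()}
print(cef)
```

The result is a mapping from each value of x to a number, which is exactly what "a function of x" means here.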


Properties of Conditional Expectation

1 E[c(X) | X] = c(X) for any function c(X). Thus if we know X, then we also know c(X).
    e.g. $E[(X^2 + 2X^3) \mid X] = X^2 + 2X^3$

2 If X and Y are independent r.v.s, then
    E[Y | X = x] = E[Y]

3 If X and Y are independent conditional on Z, i.e. X ⊥ Y | Z, then
    E[Y | X = x, Z = z] = E[Y | Z = z]


Conditional Variance

Conditional Variance
The conditional variance of Y given X is defined as

$$\mathrm{Var}(Y \mid X) = E[(Y - E[Y \mid X])^2 \mid X]$$

This is the usual variance formula applied to the conditional distribution.

Discrete:

$$V[Y \mid X = x] = \sum_y (y - E[Y \mid X = x])^2 f_{Y|X}(y \mid x)$$

Continuous:

$$V[Y \mid X = x] = \int (y - E[Y \mid X = x])^2 f_{Y|X}(y \mid x)\,dy$$


Families of distributions

There are several important families of distributions:
    The p.m.f./p.d.f. within a family has the same form, with parameters that may vary across the family.
    The parameters determine the shape of the distribution.

Statistical modeling in a nutshell: studying the probability distribution function.

    Assume the data, X1, X2, ..., Xn, are independent draws from a common distribution $f_\theta(x)$ within a family of distributions (normal, Poisson, etc.).
    Use a function of the observed data to estimate the value of θ: $\hat{\theta}(X_1, X_2, \dots, X_n)$


The Bernoulli Distribution

Definition
X has a Bernoulli distribution if it takes binary values, $X \in \{0, 1\}$, and its probability mass function is

$$f_X(x) = P(X = x) = \begin{cases} p & \text{if } x = 1 \\ 1 - p & \text{if } x = 0 \end{cases}$$

Question: What are the expectation and variance of X?

$$E(X) = \sum_{j=1}^{k} x_j p_j = 0 \times (1 - p) + 1 \times p = p$$

$$\mathrm{Var}(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2 = p - p^2 = p(1 - p)$$
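These two results can be checked both exactly and by simulation. The sketch below assumes p = 0.3 and a fixed random seed, both chosen arbitrarily for illustration; the Monte Carlo part only approximates the exact values.

```python
import random

p = 0.3  # assumed success probability, chosen only for illustration

# Exact moments from the p.m.f.: values 0 and 1 with probabilities 1 - p and p.
mean = 0 * (1 - p) + 1 * p
var = (0 - mean) ** 2 * (1 - p) + (1 - mean) ** 2 * p

# Monte Carlo check: sample moments of many draws should be close to the exact ones.
rng = random.Random(0)
draws = [1 if rng.random() < p else 0 for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / len(draws)

print(mean, var, sample_mean, sample_var)
```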


The Normal Distribution

The p.d.f. of a normal random variable X is

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2\sigma^2}(x - \mu)^2\right], \quad -\infty < x < +\infty$$

If X is normally distributed with expected value µ and variance σ², we write $X \sim N(\mu, \sigma^2)$.

If we know these two parameters, we know everything about the distribution.

Examples: human heights, weights, test scores.
If X represents wages, income, consumption, etc., it is often modeled as having a log-normal distribution, that is,

$$\log(X) \sim N(\mu, \sigma^2)$$


The Standard Normal Distribution

A special case of the normal distribution where the mean is zero (µ = 0) and the variance is one (σ² = σ = 1). Its p.d.f. is

$$f_X(x) = \phi(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}, \quad -\infty < x < +\infty$$

If X has a standard normal distribution, we write X ∼ N(0, 1).

The standard normal cumulative distribution function is denoted

$$\Phi(z) = P(Z \le z)$$

where Z is a standardized r.v., that is, $Z = \frac{X - \mu_X}{\sigma_X}$.
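Standardization is what lets a single function Φ serve every normal distribution. The sketch below computes Φ through the error function available in Python's `math` module, using the identity Φ(z) = (1 + erf(z/√2))/2; the helper names and the example N(170, 10²) are assumptions made for illustration.

```python
import math

def std_normal_cdf(z):
    """Phi(z) = P(Z <= z) for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), by standardizing to z = (x - mu) / sigma."""
    return std_normal_cdf((x - mu) / sigma)

# Assumed example: X ~ N(170, 10^2); the numbers are illustrative only.
print(std_normal_cdf(0.0))             # the median of N(0, 1) is 0
print(normal_cdf(180.0, 170.0, 10.0))  # equals Phi(1), since (180 - 170) / 10 = 1
```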


The Chi-Square Distribution

Let $Z_i$ ($i = 1, 2, \dots, m$) be independent random variables, each distributed as standard normal. A new random variable can then be defined as the sum of the squares of the $Z_i$:

$X = \sum_{i=1}^{m} Z_i^2$

Then X has a chi-square distribution with m degrees of freedom.
The form of the distribution varies with the number of degrees of freedom, i.e. the number of standard normal random variables $Z_i$ included in X.
The distribution has a long tail, or is skewed, to the right. As the degrees of freedom m get larger, however, the distribution becomes more symmetric and "bell-shaped."
In fact, as m gets larger, the chi-square distribution, once standardized by its mean m and variance 2m, converges to a normal distribution.
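The construction above can be checked by simulation. The sketch below builds chi-square draws directly from squared standard normals and compares the sample mean and variance with the theoretical values m and 2m; the seed, the choice m = 5, and the helper name `chi_square_draw` are assumptions made for illustration.

```python
import random
import statistics

def chi_square_draw(m: int, rng: random.Random) -> float:
    """One draw of X = Z_1^2 + ... + Z_m^2 with independent Z_i ~ N(0, 1)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(m))

rng = random.Random(0)   # fixed seed so the run is reproducible
m = 5                    # degrees of freedom (illustrative choice)
draws = [chi_square_draw(m, rng) for _ in range(20_000)]

sample_mean = statistics.fmean(draws)     # theory: E[X] = m
sample_var = statistics.variance(draws)   # theory: Var(X) = 2m
print(f"mean = {sample_mean:.2f} (theory {m}), "
      f"variance = {sample_var:.2f} (theory {2 * m})")
```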


[Figure: The Chi-Square Distribution]


The Student t Distribution

The Student t distribution can be obtained from a standard normal and a chi-square random variable.
Let Z have a standard normal distribution, let X have a chi-square distribution with m degrees of freedom, and assume that Z and X are independent. Then the random variable

$T = \frac{Z}{\sqrt{X/m}}$

has a t-distribution with m degrees of freedom, denoted $T \sim t_m$.
The shape of the t-distribution is similar to that of a normal distribution, except that the t-distribution has more probability mass in the tails.
As the degrees of freedom get large, the t-distribution approaches the standard normal distribution.
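To illustrate the heavier tails, the following sketch draws T from its definition and compares the share of draws beyond ±2 for 3 and 50 degrees of freedom with the same share for a standard normal. The seed, the sample size, the cutoff of 2, and the helper names are illustrative assumptions.

```python
import math
import random

def t_draw(m: int, rng: random.Random) -> float:
    """One draw of T = Z / sqrt(X / m), Z ~ N(0, 1) and X ~ chi-square(m), independent."""
    z = rng.gauss(0.0, 1.0)
    x = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(m))
    return z / math.sqrt(x / m)

def tail_share(draws, cutoff: float = 2.0) -> float:
    """Fraction of draws whose absolute value exceeds the cutoff."""
    return sum(abs(d) > cutoff for d in draws) / len(draws)

rng = random.Random(1)   # fixed seed for reproducibility
n_draws = 20_000
tail_t3 = tail_share([t_draw(3, rng) for _ in range(n_draws)])     # few degrees of freedom
tail_t50 = tail_share([t_draw(50, rng) for _ in range(n_draws)])   # many degrees of freedom
tail_normal = tail_share([rng.gauss(0.0, 1.0) for _ in range(n_draws)])

print(f"P(|T| > 2): t_3 = {tail_t3:.3f}, t_50 = {tail_t50:.3f}, N(0,1) = {tail_normal:.3f}")
```

With few degrees of freedom the tail share is visibly larger than under the normal; with 50 degrees of freedom the two are already close.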


[Figure: The Student t Distribution]


The F Distribution

Let $X_1 \sim \chi^2_m$ and $X_2 \sim \chi^2_n$, and assume that $X_1$ and $X_2$ are independent. Then

$Z = \frac{X_1/m}{X_2/n} \sim F_{m,n}$

that is, Z has an F-distribution with (m, n) degrees of freedom.
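As a rough check of this definition, the sketch below simulates the ratio of two scaled chi-square variables and compares the sample mean with the known mean of an F-distribution, n / (n - 2) for n > 2. The degrees of freedom (4, 20), the seed, and the helper names are assumptions chosen for illustration.

```python
import random
import statistics

def chi_square_draw(df: int, rng: random.Random) -> float:
    """Sum of df squared independent standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

def f_draw(m: int, n: int, rng: random.Random) -> float:
    """One draw of (X1 / m) / (X2 / n) with independent chi-square X1, X2."""
    x1 = chi_square_draw(m, rng)
    x2 = chi_square_draw(n, rng)
    return (x1 / m) / (x2 / n)

rng = random.Random(2)   # fixed seed for reproducibility
m, n = 4, 20             # illustrative degrees of freedom
draws = [f_draw(m, n, rng) for _ in range(20_000)]

sample_mean = statistics.fmean(draws)
theory_mean = n / (n - 2)   # mean of F(m, n), valid for n > 2
print(f"sample mean = {sample_mean:.3f}, theoretical mean = {theory_mean:.3f}")
```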


[Figure: The F Distribution]


Causal Inference in Social Science


The Purposes of Empirical Work

“The objective of science is the discovery of the relations” - Lord Kelvin
In most cases, we want to explore the relationship between two variables in a paper,
e.g. education and wage.
Put simply, two variables can be related in two ways:

Correlation vs. Causality


A Classical Example: The Hemline Index

George Taylor, an American economist, coined the phrase in the 1920s. It derives from the idea that skirt hemlines rise or fall with the state of the economy.

Before the 1930s, fashionable women favored mid-length skirts.
In 1929, long skirts became popular, while the Dow Jones Industrial Index (DJII) plunged from about 400 to 200, and to 40 two years later.
In the 1960s, the DJII rushed to 1000. At the same time, short skirts showed up.
In the 1970s, the DJII fell to 590 and women began to wear long skirts again.
In the 1990s, the miniskirt debuted and the DJII rushed to 10000.
In the 2000s, the bikini became a popular choice and the DJII climbed as high as 13000.
So what about now? Are long skirts making a comeback?


[Figure: Hemline Index, 1920s-2010s]


The Core of Empirical Studies: Causality vs. Forecasting

Some Big Data researchers think causality is no longer important in our times.

“Look at correlations. Look at the ‘what’ rather than the ‘why’, because that is often good enough.” - Viktor Mayer-Schonberger (2013)

Most empirical economists think that correlation tells us only a superficial, or even false, relationship, while a causal relationship provides solid evidence for making inferences about the real relationship.

Today, empirical economists care more than ever about the causal relationships they study.
“the most interesting and challenging research in social science is about cause and effect” - Angrist and Lavy (2008)

Even though forecasting need not involve causal relationships, economic theory suggests patterns and relationships that might be useful for forecasting.
Multiple regression analysis allows us to quantify historical relationships suggested by economic theory, to check whether those relationships have been stable over time, to make quantitative forecasts about the future, and to assess the accuracy of those forecasts.


The Central Question of Causality (I)

A simple example: do hospitals make people healthier? (Q: What are the dependent and independent variables?)
A naive solution: compare the health status of those who have been to the hospital with that of those who have not.
Two key questions come from the questionnaire of the National Health Interview Survey (NHIS):

1. “During the past 12 months, was the respondent a patient in a hospital overnight?”

2. “Would you say your health in general is excellent, very good, good, fair, or poor?” Answers are coded from 1 (excellent) to 5 (poor).


The Central Question of Causality (II)
Hospital vs. No Hospital

Group         Sample Size   Mean Health Status   Std. Error
Hospital      7774          2.79                 0.014
No Hospital   90049         2.07                 0.003

The comparison favors the non-hospitalized (a lower score means better health). WHY?
Hospitals not only cure but also hurt people:

1. Hospitals are full of other sick people who might infect us.
2. Hospitals contain dangerous machines and chemicals that might hurt us.

More important: people in worse health tend to visit hospitals.

This simple case shows that it is not easy to answer a causal question, so let us formalize a model to show where the problem is.


The Central Question of Causality (III)

The right way to answer a causal question is to construct a counterfactual world, that is, to ask “what if ..., then ...?”
An example: how large a wage premium do you get from attending college?

For any worker, we want to compare
the wage if he has a college degree
the wage if he did not have a college degree

and then take the difference. This is the right answer to our question.


Difficulty in Identification

The same problem arises for many other treatments:
Military service
Migration
Public policies
Road building
Job training
Party membership
Others

Difficulty: only one state can be observed for each unit.


Formalization: Rubin Causal Model

Treatment: $D_i \in \{0, 1\}$; e.g., going or not going to college

$\text{Potential outcome} = \begin{cases} Y_{1i} & \text{if } D_i = 1 \\ Y_{0i} & \text{if } D_i = 0 \end{cases}$

We want to know the difference between $Y_{1i}$ and $Y_{0i}$, which can be said to be the causal effect of going to college for individual i. (Do you agree with it?)

Definition
Causal inference is the process of estimating a comparison of counterfactuals under different treatment conditions on the same set of units.
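The potential-outcomes notation can be made concrete with a small data structure. In this sketch each person carries both potential wages, but only one of them is observed, selected by the treatment indicator: $Y_i = D_i Y_{1i} + (1 - D_i) Y_{0i}$, which is an equivalent way of writing the piecewise definition. The class name `Person` and the wage numbers are made up for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Person:
    y1: float   # potential wage with college (Y_1i)
    y0: float   # potential wage without college (Y_0i)
    d: int      # treatment actually received (D_i), 0 or 1

    @property
    def effect(self) -> float:
        """Individual causal effect Y_1i - Y_0i (not observable in real data)."""
        return self.y1 - self.y0

    @property
    def observed(self) -> float:
        """Observed outcome Y_i = D_i * Y_1i + (1 - D_i) * Y_0i."""
        return self.d * self.y1 + (1 - self.d) * self.y0

# Hypothetical wages, invented purely for illustration.
people = [Person(y1=60.0, y0=40.0, d=1), Person(y1=35.0, y0=30.0, d=0)]
for p in people:
    print(f"D = {p.d}: observed wage {p.observed}, hidden individual effect {p.effect}")
```

In real data only `d` and `observed` are available; `effect` can be computed here only because the simulation knows both potential outcomes.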


Formalization: Rubin Causal Model

Knowing the individual effect is not our final goal. As social scientists, we would rather know the average effect as a social pattern.
So we focus on the average wage of a group of people.
How can we get the average wage effect of college attendance?
A naive solution: compare the average wage in the labor market of those who went to college with that of those who did not.

College vs. Non-College Wage Differential:

$E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]$

$= \{E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 1]\} + \{E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]\}$
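The decomposition can be verified numerically. The sketch below simulates a population in which an unobserved trait (called `ability` here) raises wages and also makes college more likely, then computes the naive difference in means, the effect on the treated, and the remaining term. The data-generating process, including the constant effect of 10, is entirely an assumption for illustration.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed for reproducibility
people = []
for _ in range(50_000):
    ability = rng.gauss(0.0, 1.0)                        # unobserved trait (assumed)
    y0 = 30.0 + 5.0 * ability + rng.gauss(0.0, 2.0)      # wage without college
    y1 = y0 + 10.0                                       # assumed constant effect of 10
    d = 1 if ability + rng.gauss(0.0, 1.0) > 0 else 0    # higher ability -> more likely to attend
    people.append((y1, y0, d))

mean = statistics.fmean
# E[Y1 | D = 1] - E[Y0 | D = 0]: what a naive comparison of observed wages gives
naive = mean(y1 for y1, _, d in people if d == 1) - mean(y0 for _, y0, d in people if d == 0)
# E[Y1 | D = 1] - E[Y0 | D = 1]: effect on those who attended
att = mean(y1 - y0 for y1, y0, d in people if d == 1)
# E[Y0 | D = 1] - E[Y0 | D = 0]: difference that would exist even without college
selection_bias = (mean(y0 for _, y0, d in people if d == 1)
                  - mean(y0 for _, y0, d in people if d == 0))

print(f"naive = {naive:.2f}, ATT = {att:.2f}, selection bias = {selection_bias:.2f}")
```

The naive difference equals the sum of the other two terms, and it overstates the true effect of 10 because college attendees would have earned more even without college.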


Formalization: Rubin Causal Model

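Observed difference in means (labeled Average Total Treatment Effect, ATE, on these slides; the population ATE is usually defined as $E[Y_{1i} - Y_{0i}]$)

$E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]$

Average Treatment Effect on the Treated (ATT)

$E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 1]$

Selection Bias (SB)

$E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]$

Question 1: Which one defines the causal effect of college attendance?
Question 2: Is the selection bias positive or negative in this case?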


Random Assignment Solves the Selection Problem

Random assignment of treatment $D_i$ eliminates selection bias. It means that the treated group is a random sample from the population.
Because it is a random sample, those included in the sample are the same, on average, as those not included, on any measure.
Mathematically, it makes $D_i$ independent of the potential outcomes, that is,

$D_i \perp (Y_{0i}, Y_{1i})$

So we have
$E[Y_{0i} \mid D_i = 1] = E[Y_{0i} \mid D_i = 0]$

Then the difference in means (the "ATE" of these slides) equals the ATT and, by independence, also $E[Y_{1i} - Y_{0i}]$:
$E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0] = E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 1]$

$= E[Y_{1i} - Y_{0i} \mid D_i = 1]$

No matter what assumptions we make about the distribution of Y, we can always estimate it with the difference in means.
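A simulation sketch of this result: wages depend on an unobserved trait, but treatment is now assigned by a fair coin flip, so it is independent of the potential outcomes. The data-generating process (the trait `ability`, the constant effect of 10, the seed) is an assumption made only for illustration.

```python
import random
import statistics

rng = random.Random(7)   # fixed seed for reproducibility
treated_y1, treated_y0, control_y0 = [], [], []
for _ in range(50_000):
    ability = rng.gauss(0.0, 1.0)                      # unobserved trait (assumed)
    y0 = 30.0 + 5.0 * ability + rng.gauss(0.0, 2.0)    # wage without treatment
    y1 = y0 + 10.0                                     # assumed constant effect of 10
    if rng.random() < 0.5:                             # coin flip, independent of (y0, y1)
        treated_y1.append(y1)
        treated_y0.append(y0)                          # counterfactual, known only in simulation
    else:
        control_y0.append(y0)

# E[Y0 | D = 1] - E[Y0 | D = 0]: should be close to zero under random assignment
selection_bias = statistics.fmean(treated_y0) - statistics.fmean(control_y0)
# E[Y1 | D = 1] - E[Y0 | D = 0]: uses observed outcomes only
diff_in_means = statistics.fmean(treated_y1) - statistics.fmean(control_y0)

print(f"selection bias = {selection_bias:.3f}, difference in means = {diff_in_means:.3f}")
```

The difference in means, which uses only observable outcomes, lands close to the assumed effect of 10 because the selection-bias term is close to zero.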


Random Assignment Solves the Selection Problem

Think of causal effects in terms of comparing counterfactuals or potential outcomes. However, we can never observe both counterfactuals for the same unit: this is the fundamental problem of causal inference.
To construct the counterfactuals, we can use two broad categories of empirical strategies.

Randomized controlled trials/experiments:
They eliminate selection bias, the most important bias that arises in empirical research. If we could observe the counterfactual directly, there would be no evaluation problem; we would simply take the difference.
We can generate the data of interest by running controlled experiments, just as physical scientists or biologists do. But, obviously, we face more difficult and controversial situations than researchers in other sciences.

Approaches using naturally occurring data:
They provide alternative methods of constructing the proper counterfactual.

We should take randomized experimental methods as our benchmark in empirical research, whatever method we apply.
