ECO 420 Advanced Empirical Methods
1
ECO 420 Advanced Empirical Methods
2
Instructor
• Jing Li
• Second year at Miami
• Taught undergraduate and graduate econometrics before
• Married with two kids
3
Books
• Required: Introductory Econometrics: A Modern Approach by Jeffrey M. Wooldridge
• http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=Introductory+Econometrics%2C+a+Modern+Approach+by+Jeffrey+M.+Wooldridge
• Recommended: Mostly Harmless Econometrics: An Empiricist's Companion
4
Webpage
• http://fsb.muohio.edu/lij14/
• Notes, data, and code will be posted
5
Critical Thinking
• Example: someone tries to show that Canadians like girls more than boys
• How? He shows that more baby girls than baby boys were born in 2011. Ok?
• Next, he shows that more girls than boys were adopted in 2011. Ok?
• What do you think?
6
Critical Thinking
• A president of a private high school wants to prove that private school is worth the money.
• How? He shows that more students from his school go to Ivy League universities than from the best public school in town.
• What do you think?
7
Causality
• How to interpret a regression?
• Regression can show association
• Under a stricter assumption (ceteris paribus), regression can prove causality
• Econometrics focuses on causality
8
Two Examples
• Does wearing a safety belt cause fewer deaths?
• Did the Great Recession of 2007-2009 cause fewer marriages?
9
Ceteris Paribus
• It means all other things being equal
• Ideally, causality can be proved if ceteris paribus holds
• The president of the private school is wrong because the family backgrounds of students in private and public schools are not equal, so ceteris paribus fails.
10
Key Issues
• How to design an experiment to ensure ceteris paribus?
• How to find a natural experiment?
• How to deal with non-experimental data?
11
STATA
• Available at FSB computer lab Room 2037
• A do file puts commands together. Click File, then Do, and choose the do file.
• A log file puts results (no graphs) together. The text format is recommended. You can use any text editor to open the file.
12
A Typical Do File
• clear
• capture log close
• log using logfilename.txt, text replace
• insheet using datafilename.csv, clear
• …
• log close
13
Most Important Commands
• insheet: read data into memory
• list: display data on screen
• sort: sort data
• des: describe the variables (names, storage types, labels)
• sum: summarize quantitative variable
• tab: summarize qualitative variable
• reg: run regression
• gen: generate new variable
• egen: generate new variable using extended functions (e.g., group means)
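As a rough illustration, the commands above can be tried on the auto dataset that ships with Stata. The dataset and its variables (make, price, mpg, foreign) are assumptions for this sketch, not course data, and sysuse stands in for insheet because no CSV file is named here.

```stata
* Sketch only: uses Stata's built-in auto dataset, not the course data
sysuse auto, clear
* des: show variable names, storage types and labels
describe
* list: display the first five observations
list make price mpg in 1/5
* sort: order the data by price
sort price
* sum: summarize a quantitative variable
summarize price
* tab: frequency table for a qualitative variable
tabulate foreign
* reg: regress price on mpg
regress price mpg
* gen: create a new variable from an existing one
generate price1000 = price/1000
* egen: extended generate, here the mean price within each value of foreign
egen meanprice = mean(price), by(foreign)
```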
14
Tips for Reading Data
• Double check the following items
• The variable names (letter and number only)
• The missing values (NA, space, a dot)
• Dollar sign, comma, etc
15
Tips for Summarizing Data
• Pay attention to min, max, obs
• Median (the 50th percentile) is more robust to extreme values than mean
• Skewness is zero for symmetric distribution
• Kurtosis is three for normal distribution
• Variance and standard deviation measure the dispersion of the distribution
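These tips can be checked on simulated data. The sketch below is an assumption-laden example (the seed, sample size, and variable names z and w are made up): z is standard normal, so its skewness should be near zero and its kurtosis near three, while w = exp(z) is right-skewed, so its mean should sit above its median.

```stata
* Sketch with simulated data; seed, sample size and names are assumptions
clear
set seed 1
set obs 10000
* z is (approximately) standard normal
gen z = invnormal(uniform())
* w is right-skewed with a few very large values
gen w = exp(z)
* the detail option adds percentiles (50% is the median), skewness and kurtosis
summarize z, detail
summarize w, detail
```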
16
Tips for Plotting
• help twoway
• A scatter plot only shows association, not causality
• Pay attention to scale
• It is not uncommon to plot the log of data
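A minimal plotting sketch, again assuming Stata's built-in auto dataset rather than course data; the variable lprice is a name introduced here for illustration.

```stata
* Sketch only: built-in auto dataset, not course data
sysuse auto, clear
* scatter plot of price against mpg (association only)
twoway scatter price mpg
* plot the log of price to compress the scale of large values
gen lprice = ln(price)
twoway scatter lprice mpg
```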
17
Tips for Running Regression
• In general, regression shows association, not causality
• Pay attention to the following:
• Outliers
• Structural Change
• Omitted Variables
18
Review
• The mean is the best guess in the sense of minimizing the mean squared error
• The conditional mean is a better guess than the unconditional mean
• The conditional mean E(Y|X) is a random variable (a function of X)
• Law of iterated expectation: E(Y) = E[E(Y|X)]
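The first and last points can be written out as a short derivation. This is a sketch that assumes Y has finite variance; the symbols $\mu = E(Y)$ and the constant guess $c$ are notation introduced here.

```latex
% Mean squared error of a constant guess c, with \mu = E(Y)
E\left[(Y - c)^2\right]
  = E\left[(Y - \mu)^2\right] + 2(\mu - c)\,E(Y - \mu) + (\mu - c)^2
  = \mathrm{Var}(Y) + (\mu - c)^2
% The first term does not depend on c, so the error is smallest at c = \mu.

% Law of iterated expectation
E(Y) = E\left[E(Y \mid X)\right]
```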
19
STATA
• Unconditional Mean: sum y
• Conditional Mean: by x: sum y
You need to sort data by x first
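A short sketch of both commands, assuming Stata's built-in auto dataset with price playing the role of y and foreign the role of x (these names are not from the course data).

```stata
* Sketch only: built-in auto dataset, price as y and foreign as x
sysuse auto, clear
* unconditional mean of price
sum price
* conditional mean of price for each value of foreign: sort first, then by
sort foreign
by foreign: sum price
* bysort does the sorting and the by-group command in one step
bysort foreign: sum price
```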
20
Compare Means
• Example: Are the exam scores in section B greater than in section C?
• If there are two groups and x is the group indicator, use
ttest y, by(x)
21
Regression with Dummy Variables
• Alternatively, we can run a regression on dummies to compare means. This approach is more flexible than ttest: it handles more than two groups and allows additional control variables
gen d1 = (x == 1)
gen d2 = (x == 2)
…
reg y d1 d2 …
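The link between the two approaches can be seen in a two-group case. The sketch below assumes Stata's built-in auto dataset, with price as y and foreign as the group indicator; d1 is a dummy created for illustration. With equal variances assumed (the ttest default), the t statistic from ttest and the one on d1 in the regression should agree in absolute value; the sign differs because ttest reports the first group minus the second.

```stata
* Sketch only: built-in auto dataset, price as y and foreign as x
sysuse auto, clear
* two-sample t test comparing mean price across the two groups
ttest price, by(foreign)
* the same comparison by regression on a group dummy
gen d1 = (foreign == 1)
reg price d1
* the constant is the mean of the base group (d1 == 0);
* the coefficient on d1 is the difference in group means
```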
22
Question (Optional, Just for fun)
• Suppose we draw $Z \sim U(0, 1)$. After we know $Z$, we draw $Y \sim U(Z, 1)$. Find $E(Y)$
• Solve this problem using Law of Iterated Expectation.
23
Answer
$E(Y) = E[E(Y \mid Z)] = E\left(\frac{1+Z}{2}\right) = \frac{1 + E(Z)}{2}$
$= \frac{1 + 0.5}{2}$
$= 0.75$
24
Simulation
• help uniform
• set obs 10000
• gen z = uniform()
• gen y = z+(1-z)*uniform()
• sum y
25
Mean and Sample Mean
• We use sample mean (estimator) to estimate the population mean (parameter)
• The sample mean has nice properties of (1) unbiasedness; (2) consistency
26
Unbiasedness
• $E(\bar{y}) = \mu$: on average, the sample mean equals the population mean
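A sketch of why unbiasedness holds, assuming a random sample $y_1, \dots, y_n$ with $E(y_i) = \mu$ for every $i$ (the notation $\bar{y}$ and $\mu$ is introduced here and may differ from the original slide):

```latex
E(\bar{y})
  = E\left(\frac{1}{n}\sum_{i=1}^{n} y_i\right)
  = \frac{1}{n}\sum_{i=1}^{n} E(y_i)
  = \frac{1}{n}\cdot n\mu
  = \mu
```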
27
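Law of Large Numbers (Consistency)
• If $y_1, \dots, y_n$ is a random sample with $E(y_i) = \mu$, then $\bar{y} \to \mu$ in probability as $n \to \infty$
In words, the sample mean gets closer and closer to the true population mean as the size of the random sample grows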
28
Simulation (Show Consistency)
• clear
• set obs 1000
• gen y = 4 + 2*invnormal(uniform())
• sum y in 1/10
• sum y in 1/100
• sum y in 1/1000
29
Discuss
• How to use simulation to show unbiasedness?
30
Answer
• We need to generate many samples.
• Compute sample mean for each sample.
• Unbiasedness means the average of those sample means should be close (or in theory identical) to the population mean
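One possible way to carry out these steps in Stata is the simulate command. The sketch below is only an illustration: the program name onesample, the sample size of 20, the 1000 replications, and the seed are all assumptions, and the population is taken to be normal with mean 4 and standard deviation 2.

```stata
* Sketch: Monte Carlo check of unbiasedness of the sample mean
* Assumed design: 1000 samples of size 20 from a normal with mean 4, sd 2
clear all
set seed 12345

* onesample draws one sample and returns its sample mean in r(ybar)
program define onesample, rclass
    drop _all
    set obs 20
    gen y = 4 + 2*invnormal(uniform())
    quietly summarize y
    return scalar ybar = r(mean)
end

* repeat the program 1000 times, keeping one sample mean per replication
simulate ybar = r(ybar), reps(1000) nodots: onesample

* the average of the 1000 sample means should be close to 4
summarize ybar
```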
31
Causality
• Most often, an economist's goal is to infer that one variable (x) has a causal effect on another variable (y)
• Example 1: x = price, y = quantity demanded
• Example 2: x = wearing safety belt, y = death rate
• Example 3: x = hosting the Olympic Games, y = GDP
32
Ceteris Paribus
• By definition, inferring causality requires ceteris paribus (CP)
• CP means all other factors being equal (fixed, constant)
• Without CP, we are not sure the change in Y is due to the change in X.
• Exercise: what does CP mean when x = price, y = quantity demanded?
33
Real Problem
• Someone thinks that driving in the left (passing) lane causes more accidents
• How to prove?
• One (bad) answer: let’s pick an interstate, say I-275. Consider I-275 between exit 33 and exit 41. Each day between 8 am and 10 am we record the number of accidents that happen in the right lane and the left lane. Then we do the mean-comparison test.
34
Discuss
• Is I275 representative?
• Is traffic between exit 33 and exit 41 representative?
• Does the time (rush hour, night time) matter?
• How to run a regression?
• How about using the percentage (# accident / # traffic) rather than the number of accidents?
35
Fundamental Drawback
• CP fails since the bad answer uses the observed data
• Reckless drivers (W) tend to drive in the left lane (X), and reckless driving causes more accidents (Y).
• The observed association between X and Y tells nothing about causality since W is not held constant (CP fails).
36
Solutions
• (1) use an experiment: randomly assign reckless drivers to left and right lanes. Then compare the means using the experimental data.
• (2) still use observed data, but run a multiple regression which includes as a regressor the number of reckless drivers in the left lane.
• (3) use fancy econometric models such as instrumental variable regression if the number of reckless drivers in the left lane is unobserved.
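The first two solutions can be illustrated with simulated data. Everything in the sketch below is an assumption made for illustration: the variable names, the shares of reckless drivers, and in particular the true lane effect, which is set to zero so that any nonzero coefficient on the lane comes from the omitted variable.

```stata
* Sketch with simulated data; every number below is an assumption
clear
set seed 2024
set obs 5000
* w = 1 for a reckless driver (about 30 percent of drivers)
gen w = (uniform() < 0.3)
* observed lane choice: reckless drivers pick the left lane more often
gen x = (uniform() < 0.2 + 0.6*w)
* accidents depend on recklessness only; the true lane effect is zero
gen y = 1 + 2*w + invnormal(uniform())
* observed data, w omitted: the coefficient on x picks up the effect of w
reg y x
* solution (2): control for w, and the coefficient on x moves toward zero
reg y x w
* solution (1): assign lanes at random, independent of w
gen xr = (uniform() < 0.5)
reg y xr
```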
37
Discrimination Paper
• (Race) Discrimination means that someone is treated unfairly just because of his skin color (even if he has high ability)
• Using observed data cannot ensure ceteris paribus
38
Experimental Data
• Obtained by using fake resumes
• Factors (characteristics) other than names (signal for skin color) are made comparable
• In other words, the name (skin color) is independent of ability.
• E(xu)=0, so the key regressor is exogenous
• Ceteris paribus is ensured by using experimental data
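Why exogeneity matters can be sketched with a simple model. The setup below is an assumption for illustration, not taken from the paper: $y$ is the outcome (for example, a callback), $x$ is a 0/1 indicator for the name on the resume, and $u$ collects all other factors such as ability.

```latex
% Assumed model: y = \beta_0 + \beta_1 x + u, with x \in \{0, 1\}
E(y \mid x = 1) - E(y \mid x = 0)
  = \beta_1 + \left[ E(u \mid x = 1) - E(u \mid x = 0) \right]
% With experimental data, x is assigned independently of u,
% so the bracketed term is zero and the difference in means equals \beta_1.
% With observed data, the bracketed term is generally nonzero.
```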
39
Policy Implications
• The punch line of this research is that job training programs may do little to help African-Americans, because a program may improve their skills but cannot change their skin color.
40
Discuss
• This is a very smart paper, why?
• How about using observed data? Can we draw a conclusion based on the observed salary difference between black and white workers?
• How about market heterogeneity? Can we generalize the finding to other markets such as the market for college faculty? Why?
41
Tax Paper
• Supply-Side Economics says that people work more (and so GDP rises) after a tax cut
• So the theory implies a causal effect of a tax cut on labor supply
42
Discuss
• Consider using the observed data, and run the regression of labor supply (y) on the tax rate (x): $y = \beta_0 + \beta_1 x + u$
What does $u$ represent?
Is E(xu)=0, that is, is $x$ exogenous?
What is the consequence of using observed data?
43
Natural Experiment
• There is a tax reform (natural experiment)
• In 1987-1988, Iceland moved from a system under which taxes were paid on the previous year's income to a pay-as-you-earn system.
• So the tax rate for 1987 income became zero in an exogenous manner (has nothing to do with $u$)
44
Policy Implications
• Figure 1 shows that cutting tax leads to higher employment
• Figure 2 shows a hike in GDP in 1987-1988
• Another paper that uses a natural experiment: Home-Equity Lending and Retail Spending: Evidence from a Natural Experiment in Texas
45
Discuss
• Where can we find a natural experiment? Reform, law change, natural event…
• Q: how to show the causal effect of the number of children on the labor hours of women? How to design a pure experiment? How about a natural experiment?