Post on 19-Dec-2015
1
Econ 240A
Power 17
2
Outline
• Review
• Projects
3
Review: Big Picture 1
• #1 Descriptive Statistics
– Numerical
• central tendency: mean, median, mode
• dispersion: std. dev., IQR, max-min
• skewness
• kurtosis
– Graphical
• Bar plots
• Histograms
• Scatter plots: y vs. x
• Plots of a series against time (traces)
Question: Is (are) the variable(s) normal?
4
Review: Big Picture 2
• #2 Exploratory Data Analysis
– Graphical
• Stem and leaf diagrams
• Box plots
• 3-D plots
5
Review: Big Picture 3
• #3 Inferential statistics
– Random variables
– Probability
– Distributions
• Discrete: Equi-probable (uniform), binomial, Poisson
– Probability density, Cumulative Distribution Function
• Continuous: normal, uniform, exponential
– Density, CDF
• Standardized Normal, z~N(0,1)
– Density and CDF are tabulated
• Bivariate normal
– Joint density, marginal distributions, conditional distributions
– Pearson correlation coefficient, iso-probability contours
– Applications: sample proportions from polls
$\hat{p} = \text{successes}/n = x/n$, where $x \sim B(p, n)$
6
Review: Big Picture 4
• Inferential Statistics, Cont.
– The distribution of the sample mean is different from the distribution of the random variable
• Central limit theorem
– Confidence intervals for the unknown population mean
$z = [\bar{x} - E(\bar{x})]/\sigma_{\bar{x}} = [\bar{x} - \mu]/(\sigma/\sqrt{n})$
$p[\bar{x} - 1.96\,\sigma/\sqrt{n} \le \mu \le \bar{x} + 1.96\,\sigma/\sqrt{n}] = 0.95$
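The 95% interval for the population mean can be sketched in a few lines of Python. This is only an illustration: the sample values and the known sigma of 1.5 are made up, and the function name is hypothetical.

```python
import math

def mean_confidence_interval(sample, sigma, z=1.96):
    """Return (lower, upper) for the population mean when sigma is known.

    z = 1.96 is the standard normal value that leaves 2.5% in each tail.
    """
    n = len(sample)
    xbar = sum(sample) / n
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical data, for illustration only.
sample = [4.0, 6.0, 5.0, 7.0, 3.0, 5.0, 6.0, 4.0, 5.0]
lower, upper = mean_confidence_interval(sample, sigma=1.5)
print(f"95% interval for the mean: [{lower:.2f}, {upper:.2f}]")
```

With nine observations the half-width is 1.96 * 1.5 / 3, so the interval narrows only with the square root of the sample size.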
7
Review: Big Picture 5
• Inferential Statistics
– If population variance is unknown, use sample standard deviation s, and Student’s t-distribution
$p[\bar{x} - t_{0.025}\,s/\sqrt{n} \le \mu \le \bar{x} + t_{0.025}\,s/\sqrt{n}] = 0.95$
– Hypothesis tests
$H_0: \mu = 0,\; H_A: \mu \ne 0,\; t = [\bar{x} - E(\bar{x})]/(s/\sqrt{n})$
– Decision theory: minimize the expected costs of errors
• Type I error, Type II error
– Non-parametric statistics
• techniques of inference if variable is not normally distributed
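A minimal sketch of the t statistic for a test on the mean, using only the Python standard library. The five data points are invented for illustration, and the helper name is hypothetical.

```python
import math
import statistics

def t_statistic(sample, mu0=0.0):
    """t = (xbar - mu0) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)  # divides by n - 1
    return (xbar - mu0) / (s / math.sqrt(n))

# Hypothetical data, for illustration only.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
t_value = t_statistic(sample, mu0=0.0)
print(f"t = {t_value:.4f} with {len(sample) - 1} degrees of freedom")
```

The result is compared with the tabulated critical value of Student's t for n - 1 degrees of freedom (about 2.776 for a two-tailed 5% test with 4 degrees of freedom).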
8
Review: Big Picture 6
• Regression, Bivariate and Multivariate
– Time series
• Linear trend: y(t) = a + b*t + e(t)
• Exponential trend: ln y(t) = a + b*t + e(t)
• Quadratic trend: y(t) = a + b*t + c*t^2 + e(t)
• Elasticity estimation: ln y(t) = a + b*ln x(t) + e(t)
• Returns Generating Process: ri(t) = c + β*rM(t) + e(t)
• Problem: autocorrelation
– Diagnostic: Durbin-Watson statistic
– Diagnostic: inertial pattern in plot (trace) of residual
– Fix-up: Cochrane-Orcutt
– Fix-up: First difference equation
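The Durbin-Watson diagnostic is simple enough to compute by hand. The sketch below uses the standard definition (sum of squared changes in the residuals over the sum of squared residuals); the two residual series are invented to show the extremes.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: near 2 means no first-order autocorrelation,
    near 0 means positive autocorrelation, near 4 means negative."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residual series, for illustration only.
inertial = [1.0, 1.0, 1.0, 1.0]        # residuals that never change sign
alternating = [1.0, -1.0, 1.0, -1.0]   # residuals that flip every period
dw_inertial = durbin_watson(inertial)
dw_alternating = durbin_watson(alternating)
print(dw_inertial, dw_alternating)
```

An inertial residual trace pushes the statistic toward 0, which is the pattern the plot-of-residuals diagnostic looks for.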
9
Review: Big Picture 7
• Regression, Bivariate and Multivariate
– Cross-section
• Linear: y(i) = a + b*x(i) + e(i), i=1,n ; b=dy/dx
• Elasticity or log-log: lny(i) = a + b*lnx(i) + e(i); b=(dy/dx)/(y/x)
• Linear probability model: y=1 for yes, y=0 for no; y = a + b*x + e
• Probit or Logit probability model
• Problem: heteroskedasticity
• Diagnostic: pattern of residual (or residual squared) with y and/or x
• Diagnostic: White heteroskedasticity test
• Fix-up: transform equation, for example, divide by x
– Table of ANOVA
• Source of variation: explained, unexplained, total
• Sum of squares, degrees of freedom, mean square, F test
10
Review: Big Picture 8
• Questions: quantitative dependent, qualitative explanatory variables
– Null: No difference in means between two or more populations (groups), One Factor
• Graph
• Table of ANOVA
• Regression Using Dummies
– Null: No difference in means between two or more populations (groups), Two Factors
• Graph
• Table of ANOVA
• Comparing Regressions Using Dummies
11
Review: Big Picture 9
• Cross-classification: nominal categories, e.g. male or female; ordinal categories, e.g. better or worse; or quantitative intervals, e.g. 13-19, 20-29
– Two Factors mxn; (m-1)x(n-1) degrees of freedom
– Null: independence between factors; expected number in cell (i,j) = p(i)*p(j)*n
– Pearson Chi-square statistic = sum over all i, j of [observed(i, j) – expected(i, j)]^2 / expected(i, j)
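The contingency-table calculation can be sketched as follows. The 2x2 table of counts is hypothetical; the expected count in each cell is p(i)*p(j)*n, which equals row total times column total divided by n.

```python
def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for an m x n table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Under independence: expected = p(i) * p(j) * n
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

# Hypothetical counts, for illustration only.
table = [[30, 20],
         [20, 30]]
chi2, dof = pearson_chi_square(table)
print(f"chi-square = {chi2:.2f}, degrees of freedom = {dof}")
```

Here every expected count is 25, so each cell contributes 25/25 = 1 to the statistic.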
12
Summary
• Is there any relationship between 2 or more variables?
– Quantitative y and x: graphs and regression
– Qualitative binary y and quantitative x: probability model, linear or non-linear
– Quantitative y and qualitative x: graphs and Tables of ANOVA, and regressions with indicator variables
– Qualitative y and x: Contingency Tables
13
Projects
• Learning by doing
• Learning from one another
14
Control of Social Problems
• HIV/AIDS
15
HIV/AIDS: What can we do to prevent it?!
Group 4: Pinar Sahin, Darren Egan, David White, Yuan Yuan, Miguel Delgado Helleseter, David Rhodes
16
[Figure: control curve relating morbidity per capita to abatement expenditure per capita. The Problem is Controllable]
17
Is there a relationship?
[Figure: scatter plot, # of HIV vs CDC expenditure. Vertical axis: number of morbidity; horizontal axis: CDC expenditure ($million). Fitted line: y = -78.53x + 104053, R^2 = 0.610]
Regression of HIV infection on CDC expenditure
Dependent Variable: INFECT
Method: Least Squares
Date: 11/21/03 Time: 17:10
Sample: 1 9
Included observations: 9

Variable    Coefficient   Std. Error   t-Statistic   Prob.
CDCMONEY    -78.5302      23.73235     -3.3089951    0.012959
C           104053.4      17180.64     6.05643533    0.000513

R-squared            0.610016    Mean dependent var      48122.44
Adjusted R-squared   0.554304    S.D. dependent var      13830.59
S.E. of regression   9233.366    Akaike info criterion   21.29216
Sum squared resid    5.97E+08    Schwarz criterion       21.33599
Log likelihood       -93.8147    F-statistic             10.94945
Durbin-Watson stat   0.822936    Prob(F-statistic)       0.012959

• Both the t and F statistics are significant
• R^2 is 0.61, which is decent
Group 4
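The entries in this output are linked by simple arithmetic, which a short check can confirm. The numbers below are copied from the table; the variable names are just labels for this sketch.

```python
# Values copied from the regression output for INFECT on CDCMONEY.
coefficient = -78.5302
std_error = 23.73235
r_squared = 0.610016
n_obs = 9
n_regressors = 1  # one explanatory variable besides the constant

# t statistic is the coefficient divided by its standard error.
t_stat = coefficient / std_error

# In a bivariate regression the F statistic equals the squared slope t statistic.
f_stat = t_stat ** 2

# Adjusted R-squared penalizes for the degrees of freedom used.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)

print(f"t = {t_stat:.4f}, F = {f_stat:.4f}, adjusted R^2 = {adj_r_squared:.6f}")
```

The computed values agree with the reported t-Statistic, F-statistic, and Adjusted R-squared up to rounding of the printed coefficient and standard error, which is also why Prob(F-statistic) equals the slope's Prob. value.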
18
HIV/AIDS cases vs. per capita funding per state
Dependent Variable: INFECT
Method: Least Squares
Date: 11/21/03 Time: 18:07
Sample: 1 50
Included observations: 50

Variable     Coefficient   Std. Error   t-Statistic   Prob.
PERCAPFUND   5.605922      2.044885     2.74143582    0.008568
C            4.614507      2.447325     1.88553113    0.065418

R-squared            0.135376    Mean dependent var      10.518
Adjusted R-squared   0.117363    S.D. dependent var      8.751926
S.E. of regression   8.222325    Akaike info criterion   7.090761
Sum squared resid    3245.118    Schwarz criterion       7.167242
Log likelihood       -175.269    F-statistic             7.51547
Durbin-Watson stat   1.952071    Prob(F-statistic)       0.008568

[Figure: scatter plot, # of cases vs per capita funding per state. Vertical axis: # of cases per 100,000 people; horizontal axis: per capita funding per state]
Group 4
19
Controlling Social Problems
• This same analytical framework works for various social ills
– Morbidity per capita
– Offenses per capita
– Pollution per capita
20
[Figure: control curve against abatement expenditure per capita, with the other axis labeled in turn Morbidity Per Capita, Offenses Per Capita, and Pollution Per Capita. The Problem is Controllable]
21
[Figure. Source: Report to the Nation on Crime and Justice]
22
[Figure, with labels "control" and "Causal factors". Source: Report to the Nation on Crime and Justice]
23
[Figure: control curve relating morbidity per capita to abatement expenditure per capita. The Problem is Not Controllable]
Controllability is an empirical question that we want to answer
24
Optimizing Behavior
• Cost Curve:
– Cost = Damages from Morbidity + Abatement Expenditures
– C = p*M + Exp
25
Cost Curve
[Figure: cost curve C = p*M + Exp, in the plane of morbidity M and abatement Exp. Intercepts: Exp=0, M=C/p; M=0, Exp=C]
26
Family of Cost Curves
[Figure: parallel cost curves in the plane of morbidity M and abatement Exp, labeled Higher Cost and Lower Cost]
27
[Figure: control curve and cost curves, morbidity per capita and abatement expenditure per capita. The Problem is Not Controllable: Don’t Throw Money At It. Labels: Higher Cost, Lowest Cost, Optimum: Zero Abatement]
28
[Figure: control curve and cost curves, morbidity per capita and abatement expenditure per capita. The Problem is Controllable: Optimum Expenditures. Labels: Lowest Attainable Cost, Optimum, Higher Cost Curve]
29
[Figure: the same diagram with an added point labeled "Spend Too Much, But Morbidity Is Low"]
30
[Figure: the same diagram with an added point labeled "Spend Too Little, Morbidity Is Too High"]
31
Economic Paradigm
• Step One: Describe the feasible alternatives
32
[Figure: control curve relating morbidity per capita to abatement expenditure per capita. The Problem is Controllable]
33
Economic Paradigm
• Step One: Describe the feasible alternatives
• Step Two: Value the alternatives
34
Cost Curve
[Figure: cost curve C = p*M + Exp, in the plane of morbidity M and abatement Exp. Intercepts: Exp=0, M=C/p; M=0, Exp=C]
35
Economic Paradigm
• Step One: Describe the feasible alternatives
• Step Two: Value the alternatives
• Step Three: Optimize, pick the lowest cost alternative
36
[Figure: control curve and cost curves, morbidity per capita and abatement expenditure per capita. The Problem is Controllable: Optimum Expenditures. Labels: Lowest Attainable Cost, Optimum, Higher Cost Curve]
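The three steps can be sketched numerically: describe the feasible alternatives with a control curve, value each with C = p*M + Exp, and pick the cheapest. The two control curves below (an exponential one and a flat one), the price p = 2, and the function names are all assumptions made for this illustration, not estimates from the projects.

```python
import math

def total_cost(exp, price, control_curve):
    """Step Two: C = p*M + Exp, with M read off the control curve."""
    return price * control_curve(exp) + exp

def cheapest_expenditure(price, control_curve, max_exp=20.0, step=0.001):
    """Step Three: grid search over expenditures for the lowest total cost."""
    n_steps = int(round(max_exp / step))
    grid = [i * step for i in range(n_steps + 1)]
    return min(grid, key=lambda exp: total_cost(exp, price, control_curve))

# Step One: two hypothetical descriptions of the feasible alternatives.
def controllable(exp):
    return 10.0 * math.exp(-0.5 * exp)  # morbidity falls as spending rises

def not_controllable(exp):
    return 10.0                          # spending has no effect on morbidity

best_controllable = cheapest_expenditure(2.0, controllable)
best_flat = cheapest_expenditure(2.0, not_controllable)
print(f"controllable: spend {best_controllable:.3f}; not controllable: spend {best_flat:.3f}")
```

With the flat curve the search lands on zero abatement, matching the "Don't Throw Money At It" case; with the downward-sloping curve it lands on an interior optimum.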
37
[Figure: two control curves, morbidity per capita and abatement expenditure per capita. The Problem is Controllable: Family of Control Curves. Labels: Control Curve; Control Curve, Another Time Or Another Place]
38
Behind the Control Curve
• Morbidity Generation
– M = f(sex-ed, risky behavior)
– M = f(sex-ed, RB)
• Producing Morbidity Abatement
– Sex-ed = g(labor)
– Sex-ed = g(L)
• Abatement Expenditure
– Exp = wage*labor = w*L
39
Morbidity Generation
[Figure: morbidity M plotted against Sex-ed, M = f(Sex-ed, RB)]
40
Morbidity Generation
[Figure: the same plot with a second curve labeled "Riskier behavior"]
41
Production Function
[Figure: Sex-ed plotted against labor L]
42
Expenditure On Wage Bill (Abatement)
[Figure: Exp plotted against labor L, Exp = w*L]
43
Control Curve
[Figure: diagram with axes Labor L, Exp, Sex-ed, and Morbidity M, linking the expenditure function Exp = w*L, the production function Sex-ed = g(L), and morbidity generation M = f(Sex-ed, RB)]
44
Control Curve
[Figure: the same diagram, with Exp = w*L, Sex-ed = g(L), and M = f(Sex-ed, RB); slides 45 and 46 repeat it]
47
Control Curve
[Figure: the same diagram, with Exp = w*L, Sex-ed = g(L), and M = f(Sex-ed, RB), plus a second morbidity curve labeled "Higher Risky Behavior"]
48
Exercise
• Derive the control curve for the jurisdiction with more risky behavior
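One way to approach the exercise is to compose the three relationships numerically. The sketch below assumes particular functional forms, g(L) = sqrt(L) and f(Sex-ed, RB) = RB / (1 + Sex-ed), purely for illustration; the slides do not specify them, and the function names are hypothetical.

```python
import math

def labor_from_expenditure(exp, wage):
    """Invert Exp = w*L to get the labor a budget buys."""
    return exp / wage

def sex_ed_output(labor):
    """Hypothetical production function g(L) with diminishing returns."""
    return math.sqrt(labor)

def morbidity(sex_ed, risky_behavior):
    """Hypothetical f(Sex-ed, RB): falls with sex-ed, rises with risky behavior."""
    return risky_behavior / (1.0 + sex_ed)

def control_curve(exp, wage, risky_behavior):
    """Compose the three relationships into M = h(Exp, RB)."""
    labor = labor_from_expenditure(exp, wage)
    return morbidity(sex_ed_output(labor), risky_behavior)

expenditures = [0.0, 1.0, 4.0, 9.0, 16.0]
low_rb = [control_curve(e, wage=1.0, risky_behavior=10.0) for e in expenditures]
high_rb = [control_curve(e, wage=1.0, risky_behavior=20.0) for e in expenditures]
for e, m_low, m_high in zip(expenditures, low_rb, high_rb):
    print(f"Exp={e:5.1f}  M(low RB)={m_low:6.2f}  M(high RB)={m_high:6.2f}")
```

Under these assumed forms, the jurisdiction with more risky behavior has a control curve that lies above the other at every level of expenditure.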
49
Expansion Path
• Assume the family of control curves is nested, i.e. the curves have the same slope along any ray from the origin
• Assume all jurisdictions place the same value, p, on morbidity
• Assume all jurisdictions are optimizing
• Then the expansion path is a ray from the origin
50
[Figure: two control curves with the expansion path drawn through them, morbidity per capita and abatement expenditure per capita. The Problem is Controllable: Family of Control Curves. Labels: Control Curve; Control Curve, Another Time Or Another Place; Expansion path]
51
[Figure: the same diagram with the coordinates M and Exp marked]
52
Econometric Issues
• Two Relationships
– Control curve: M = h(exp, RB)
– Expansion path: M/EXP = k
• Variation in risky behavior from one jurisdiction to the next shifts the control curve and traces out (identifies) the expansion path
53
• Unless price, technology, or optimizing behavior changes from jurisdiction to jurisdiction, there will not be enough movement in the expansion path to trace out (identify) the control curve
54
[Figure: family of control curves with the expansion path and the coordinates M and Exp marked, repeated from slide 51]
55
California Expenditure VS. Immigration
By: Daniel Jiang, Keith Cochran, Justin Adams, Hung Lam, Steven Carlson, Gregory Wiefel
Fall 2003
Immigration VS Expenditure
[Figure: scatter plot, Immigration VS Expenditure. Vertical axis: Expenditure; horizontal axis: Immigration. Fitted line: y = 0.2363x + 814.96, R^2 = 0.3733]
57
Simultaneity
[Figure: Expenditure function and Immigration function plotted in the plane of Immigration and CA EXP]
58
Simultaneity Concepts
• Jointly determined: Morbidity and abatement expenditure are jointly determined by the control curve and the cost curve
• Morbidity and abatement expenditure are referred to as endogenous variables
• Risky behavior is an exogenous variable
• For a 2-equation simultaneous system, at least one exogenous variable must be excluded from a behavioral (structural) equation to identify it
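59
Theory
• Minimize Cost, C = p*M + Exp
• Subject to the control curve, M = h(Exp, RB)
• Lagrangian, La = p*M + Exp + λ[M - h(Exp, RB)]
$\partial La/\partial M = p + \lambda = 0$
$\partial La/\partial Exp = 1 - \lambda\,\partial h/\partial Exp = 0$
$\partial h/\partial Exp = 1/\lambda = -1/p$
• Slope of the control curve = slope of cost curve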
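As a worked illustration of the tangency condition, suppose the control curve takes the hypothetical form h(Exp) = M_0 e^{-k Exp}, with risky behavior held fixed. The constants M_0 > 0 and k > 0 are assumptions introduced only for this example.

```latex
\begin{aligned}
h(Exp) &= M_0\, e^{-k\,Exp} \\
\left.\frac{\partial h}{\partial Exp}\right|_{Exp^*} &= -k\,M_0\, e^{-k\,Exp^*} = -\frac{1}{p} \\
Exp^* &= \frac{\ln(p\,k\,M_0)}{k}, \qquad M^* = h(Exp^*) = \frac{1}{p\,k}
\end{aligned}
```

The interior solution requires p k M_0 > 1. Otherwise the first dollar of abatement saves less than a dollar of damages, and the optimum is zero abatement.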
60
Model
• Production Function: Cobb-Douglas
– Sex-ed = a*L^b * e^u, b>0
• Abatement Expenditure
– Exp = w*L
• Morbidity Abatement
– M = d*sex-ed^m * RB^n * e^v, m<0, n>0
61
Model Cont.
• Combine production function, expenditure and morbidity abatement functions to obtain control function
– M = d*[a*L^b * e^u]^m * RB^n * e^v
– M = d*[a*(exp/w)^b * e^u]^m * RB^n * e^v
– M = d * a^m * exp^(b*m) * w^(-b*m) * RB^n * e^(u*m) * e^v
– lnM = ln(d*a^m) + b*m lnexp – b*m lnw + n*lnRB + (u*m + v)
– Or assuming w is constant: y1 = constant1 + b*m y2 + n x + error1, where y1 = lnM, y2 = lnexp, x = lnRB
– We would like to show that b*m is negative, i.e. that morbidity is controllable
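The algebra from the structural functions to the log-linear control function can be checked numerically. In this sketch every parameter value is made up for illustration, and the function names are hypothetical.

```python
import math

def morbidity_structural(expenditure, w, rb, a, b, d, m, n, u=0.0, v=0.0):
    """Chain the three functions: L = exp/w, Sex-ed = a*L^b*e^u, M = d*Sex-ed^m*RB^n*e^v."""
    labor = expenditure / w
    sex_ed = a * labor ** b * math.exp(u)
    return d * sex_ed ** m * rb ** n * math.exp(v)

def log_morbidity_control(expenditure, w, rb, a, b, d, m, n, u=0.0, v=0.0):
    """lnM = ln(d*a^m) + b*m*ln(exp) - b*m*ln(w) + n*ln(RB) + (u*m + v)."""
    return (math.log(d * a ** m)
            + b * m * math.log(expenditure)
            - b * m * math.log(w)
            + n * math.log(rb)
            + (u * m + v))

# Hypothetical parameter values, chosen only so that b > 0, m < 0, n > 0.
params = dict(w=2.0, rb=1.5, a=2.0, b=0.5, d=3.0, m=-0.8, n=1.2, u=0.1, v=-0.2)
ln_m_direct = math.log(morbidity_structural(4.0, **params))
ln_m_formula = log_morbidity_control(4.0, **params)
elasticity = params["b"] * params["m"]  # coefficient on ln(exp)
print(ln_m_direct, ln_m_formula, elasticity)
```

Both routes give the same lnM, and with b > 0 and m < 0 the coefficient b*m on lnexp is negative, which is the sign that signals controllability.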
62
Model Cont.
• Expansion Path
– M/exp = k*e^z
– lnM = lnexp + lnk + z
– Or y1 = constant2 + y2 + error2
63
Reduced Form
• Solve for y1 and y2, the two endogenous variables
• y1 = [constant1 - b*m constant2]/(1-b*m) + n/(1-b*m) x + (error1 - b*m error2)/(1-b*m)
• y2 = [constant1 - constant2]/(1-b*m) + n/(1-b*m) x + (error1 - error2)/(1-b*m)
• There is no way to get from the estimated parameter on x, n/(1 – b*m), to n or b*m, the parameters of interest for the control function
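A small simulation illustrates the identification problem. It assumes the control curve y1 = constant1 + b*m y2 + n x + error1 and takes the expansion path in logs as y1 = constant2 + y2 + error2, which is what M/exp = k*e^z implies. All parameter values, the error distributions, and the function name are hypothetical.

```python
import random

def simulate_reduced_form_slope(bm, n_coef, c1=1.0, c2=0.5, n_obs=5000, seed=0):
    """Generate data from the two structural equations and return the
    least-squares slope from regressing y1 on the exogenous variable x."""
    rng = random.Random(seed)
    xs, y1s = [], []
    for _ in range(n_obs):
        x = rng.uniform(0.0, 2.0)
        e1 = rng.gauss(0.0, 0.1)
        e2 = rng.gauss(0.0, 0.1)
        # Solve jointly:
        #   control curve:  y1 = c1 + bm*y2 + n_coef*x + e1
        #   expansion path: y1 = c2 + y2 + e2
        y2 = (c1 - c2 + n_coef * x + e1 - e2) / (1.0 - bm)
        y1 = c2 + y2 + e2
        xs.append(x)
        y1s.append(y1)
    x_bar = sum(xs) / n_obs
    y_bar = sum(y1s) / n_obs
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, y1s))
    var = sum((x - x_bar) ** 2 for x in xs)
    return cov / var

# Hypothetical structural parameters: b*m = -0.5 and n = 1.2.
slope = simulate_reduced_form_slope(bm=-0.5, n_coef=1.2)
print(f"estimated slope on x: {slope:.3f}; n/(1 - b*m) = {1.2 / 1.5:.3f}")
```

The regression recovers roughly n/(1 - b*m) = 0.8 rather than n = 1.2, and many other pairs (for example n = 0.8 with b*m = 0) would produce the same slope, so the data alone cannot separate n from b*m.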
64