How Do Lawyers Set Fees?
Learning Objectives
1. Model i.e. “Story” or question
2. Multiple regression review
3. Omitted variables (our first failure of GM)
4. Dummy variables
Model
• An example of how we can use the tools we have learned
• Simple analyses that don’t have a complicated structure can often be useful
• Question: Lawyers claim that they set fees to reflect the amount of legal work done
• Our suspicion is that fees are set to reflect the amount of money at stake
– Form of second degree price discrimination
Model
• How to translate a story into econometrics and then test the story?
• Our Idea: Fees are determined by the size of the award rather than the work done
– Percentage fees
– Price discrimination
• Careful to consider alternatives: Insurance
Analysis
• As always summarize and describe the data
• Graph variables of interest (see over)
• Regression to find percentage price rule
[Figure: Solicitors' instruction fee excl. VAT (vertical axis, 0 to 40000) against euros (horizontal axis, 0 to 200000)]
reg ins_allow award
      Source |       SS       df       MS              Number of obs =      91
-------------+------------------------------           F(  1,    89) =  133.05
       Model |  2.7940e+09     1  2.7940e+09           Prob > F      =  0.0000
    Residual |  1.8689e+09    89  20999331.5           R-squared     =  0.5992
-------------+------------------------------           Adj R-squared =  0.5947
       Total |  4.6629e+09    90  51810441.4           Root MSE      =  4582.5
------------------------------------------------------------------------------
   ins_allow |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       award |   .1519855   .0131763    11.53   0.000     .1258046    .1781665
       _cons |   5029.183    827.947     6.07   0.000      3384.07    6674.296
------------------------------------------------------------------------------
Formulate Story as Hypothesis
• Story is that lawyers charge a fee based on award
• So null hypothesis is that the coefficient on award, $\beta$, is zero
• $H_0: \beta = 0$, $H_1: \beta \neq 0$
• Test the hypothesis that award is not statistically significant
– Stata does it automatically
1. $H_0: \beta = 0$, $H_1: \beta \neq 0$
2. Calculate the test statistic assuming that $H_0$ is true: $t = (0.1519855 - 0)/0.0131763 = 11.53$
3. Either find the test statistic on the t distribution and calculate the p-value, Prob(|t| > 11.53) = 0.000, or compare with one of the traditional threshold (“critical”) values with N-k degrees of freedom. 5% significance level: 1.96
4. |t| > all the critical values and Prob(|t| > 11.53) = 0.000
5. So we reject the null hypothesis
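The steps above can be sketched in a few lines. This is an illustration only: the variable names are invented here, the coefficient and standard error are copied from the Stata output, and 1.96 is the large-sample 5% critical value quoted on the slide.

```python
coef = 0.1519855      # estimated coefficient on award (from the Stata output)
std_err = 0.0131763   # its standard error (from the Stata output)
null_value = 0.0      # value of the coefficient under H0
critical = 1.96       # 5% two-sided critical value quoted on the slide

# Step 2: test statistic computed as if H0 were true.
t_stat = (coef - null_value) / std_err

# Steps 3 to 5: compare |t| with the critical value and decide.
reject = abs(t_stat) > critical

print(f"t = {t_stat:.2f}, reject H0: {reject}")
```

The computed statistic should match the t column that Stata prints for award.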
[Figure label: 95% of distribution]
Type 1 error
• Note how we set up the hypothesis test
• Null was that percentage charge was zero
• A type 1 error is rejecting the null when it is true
• The probability of a type 1 error is the significance level
• So there is a 5% chance of saying that lawyers charge a % fee when they do not
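One way to see what the significance level means is to simulate data in which the null really is true and count how often the test rejects anyway. The sketch below is hypothetical throughout: the data are random draws, not the lawyers data, and the helper name `slope_t_stat` is made up for this illustration. It uses 91 observations and the 1.96 threshold to mirror the example.

```python
import random

def slope_t_stat(x, y):
    """t statistic for the slope in a simple OLS regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = (rss / (n - 2) / sxx) ** 0.5
    return slope / std_err

rng = random.Random(0)           # fixed seed so the run is repeatable
n_obs, n_sims, critical = 91, 2000, 1.96

rejections = 0
for _ in range(n_sims):
    x = [rng.gauss(0, 1) for _ in range(n_obs)]
    y = [rng.gauss(0, 1) for _ in range(n_obs)]   # y is unrelated to x, so H0 is true
    if abs(slope_t_stat(x, y)) > critical:
        rejections += 1                           # a type 1 error

rejection_rate = rejections / n_sims
print(f"Share of false rejections: {rejection_rate:.3f}")
```

The share of false rejections should come out close to 5%, which is the sense in which the significance level is the probability of a type 1 error.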
Some Comments
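• You could formulate the test on $\beta$, the coefficient on award, as one sided
• $H_0: \beta \geq 0$, $H_1: \beta < 0$
• $H_0: \beta \leq 0$, $H_1: \beta > 0$
• Exercise to do this and think about which is best
• Could also test a particular value
– $H_0: \beta = 0.2$, $H_1: \beta \neq 0.2$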
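Testing a particular value uses the same statistic with a different null value. The sketch below tries the 0.2 example with the coefficient, standard error and confidence interval from the award regression; the variable names are invented for illustration and 1.96 is again the large-sample 5% critical value.

```python
coef = 0.1519855                        # coefficient on award (Stata output)
std_err = 0.0131763                     # its standard error (Stata output)
ci_low, ci_high = 0.1258046, 0.1781665  # 95% confidence interval (Stata output)
null_value = 0.2                        # hypothesised value under H0
critical = 1.96                         # 5% two-sided critical value

t_stat = (coef - null_value) / std_err
reject = abs(t_stat) > critical

# Equivalent check: a two-sided 5% test rejects exactly when the
# hypothesised value lies outside the 95% confidence interval.
outside_ci = not (ci_low <= null_value <= ci_high)

print(f"t = {t_stat:.2f}, reject H0: {reject}, 0.2 outside CI: {outside_ci}")
```

On these numbers the hypothesis of a 20% fee is rejected at the 5% level, which agrees with 0.2 lying above the upper end of the confidence interval.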
Omitted Variables
• Our first failure of the Gauss-Markov (GM) Theorem
• Key practical issue
– Always some variables missing ($R^2 < 1$)
• When does it matter?
– When they are correlated with the included variables
– OLS becomes inconsistent and biased
• Often a way to undermine econometric results
• Discuss in two ways
– State the issue formally
– Use the lawyers example
Formally
• Suppose we have a model with z omitted
$y_i = \alpha + \beta x_i + \gamma z_i + u_i$ (true model)
$y_i = a + b x_i + u_i$ (estimated)
• Then we will have:
$E(b) \neq \beta$: b is a biased estimator of the effect of x on y
also inconsistent: the bias does not disappear as $N \to \infty$
• The bias will be determined by the formula
$E(b) = \beta + \gamma\delta$
$\beta$ = direct effect of x on y
$\gamma$ = direct effect of z on y
$\delta$ = effect of z on x (from regression of z on x)
In Practice
• OLS erroneously attributes the effect of the missing z to x
– Violates GM assumption that E(u|x)=0
• From the formula, the bias will go away if
– $\gamma = 0$: the variable should be omitted as it doesn’t matter
– $\delta = 0$: the missing variable is unrelated to the included variable(s)
• In any project ask:
– are there missing variables that ought to be included ($\gamma \neq 0$)?
– could they be correlated with any included variables ($\delta \neq 0$)?
– What is the direction of bias?
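The bias formula, written here as E(b) = β + γδ with β the direct effect of x, γ the direct effect of z and δ the slope from regressing z on x, can be checked on simulated data. Everything in this sketch is hypothetical: the parameter values (β = 2, γ = 3, δ = 0.5), the sample size and the helper names are chosen only for illustration.

```python
import random

def mean(values):
    return sum(values) / len(values)

def simple_slope(x, y):
    """OLS slope from regressing y on a constant and x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def two_regressor_slopes(x, z, y):
    """OLS slopes on x and z from regressing y on a constant, x and z."""
    mx, mz, my = mean(x), mean(z), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((c - mz) ** 2 for c in z)
    sxz = sum((a - mx) * (c - mz) for a, c in zip(x, z))
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    szy = sum((c - mz) * (b - my) for c, b in zip(z, y))
    det = sxx * szz - sxz ** 2          # determinant of the 2x2 normal equations
    slope_x = (szz * sxy - sxz * szy) / det
    slope_z = (sxx * szy - sxz * sxy) / det
    return slope_x, slope_z

rng = random.Random(1)
beta, gamma, delta = 2.0, 3.0, 0.5      # assumed "true" values
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
z = [delta * xi + rng.gauss(0, 1) for xi in x]   # z is correlated with x
y = [1.0 + beta * xi + gamma * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

b_short = simple_slope(x, y)                 # regression that omits z
b_x, b_z = two_regressor_slopes(x, z, y)     # regression that includes z
d_hat = simple_slope(x, z)                   # slope from regressing z on x

print(f"short regression slope: {b_short:.3f}")
print(f"long slope + b_z * d:   {b_x + b_z * d_hat:.3f}")
```

In this setup the short regression slope sits near β + γδ = 3.5 rather than near β = 2, and within the sample the decomposition holds exactly: the short slope equals the long slope on x plus the slope on z times the slope from regressing z on x.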
Lawyers Example
• Suppose we had the simple model of lawyers’ fees as before.
• A criticism of this model is that it doesn’t take account of the work done by lawyers
– i.e. measures of quantity and quality of work are omitted variables
– This invalidates the estimate of b
– This is how you could undermine the study
• Is the criticism valid?
– These variables ought to be included as they plausibly affect the fee, i.e. their direct effect is nonzero ($\gamma \neq 0$)
– They could be correlated with the included award variable ($\delta \neq 0$)
• it is plausible that more work may lead to a higher award
• or higher-award cases may require more work
• Turns out not to matter in our case because award and trial are uncorrelated
• Not always the case: use IV (instrumental variables)
Dummy Variables
• Record classifications
– Dichotomous: “yes/no” e.g. gender, trial, etc.
– Ordinal e.g. level of education
• OLS doesn’t treat them differently
• Need to be careful about how coefficients are interpreted
• Illustrate with “trial” in the fee regression
– Trial = 1 iff case went to court
– Trial = 0 iff case settled before court
• Our basic model is
$\text{fee}_i = \beta_1 + \beta_2 \text{award}_i + u_i$
• This can be interpreted as predicting fees based on awards, i.e.
$E[\text{fee}_i] = \beta_1 + \beta_2 E[\text{award}_i]$
• Suspect that fee is systematically different if case goes to trial
$\text{fee}_i = \beta_1 + \beta_2 \text{award}_i + \beta_3 \text{Trial}_i + u_i$
• Now the prediction becomes:
$E[\text{fee}_i] = \beta_1 + \beta_2 E[\text{award}_i] + \beta_3$ if trial
$E[\text{fee}_i] = \beta_1 + \beta_2 E[\text{award}_i]$ if not
• Note that “trial” disappears when it is zero
• This translates into separate intercepts on the graph
• The extra € for bringing a case to trial
• Testing if $\beta_3$ is significant is a test of a significant difference in fees between the two groups
• For price discrimination story: award still significant
[Figure: fee (vertical axis) against award (horizontal axis), with intercepts $\beta_1$ and $\beta_1 + \beta_3$]
regress ins_allow award trial
      Source |       SS       df       MS              Number of obs =      91
-------------+------------------------------           F(  2,    88) =   78.43
       Model |  2.9871e+09     2  1.4936e+09           Prob > F      =  0.0000
    Residual |  1.6758e+09    88  19043267.3           R-squared     =  0.6406
-------------+------------------------------           Adj R-squared =  0.6324
       Total |  4.6629e+09    90  51810441.4           Root MSE      =  4363.9
------------------------------------------------------------------------------
   ins_allow |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       award |   .1489103   .0125847    11.83   0.000     .1239009    .1739197
       trial |   5887.706   1848.795     3.18   0.002     2213.616    9561.797
       _cons |   4798.368   791.7677     6.06   0.000     3224.896     6371.84
------------------------------------------------------------------------------
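To read the dummy coefficient as a shift in the intercept, it can help to plug the estimates into the prediction for one award. In the sketch below the coefficients are copied from the Stata output above, while the function name and the award of 50,000 euros are hypothetical choices made for illustration.

```python
# Coefficients copied from the output of: regress ins_allow award trial
CONS = 4798.368       # intercept
B_AWARD = 0.1489103   # coefficient on award
B_TRIAL = 5887.706    # coefficient on the trial dummy

def predicted_fee(award, trial):
    """Predicted fee; trial is 1 if the case went to court, 0 if it settled."""
    return CONS + B_AWARD * award + B_TRIAL * trial

award = 50_000  # hypothetical award in euros
fee_settled = predicted_fee(award, 0)
fee_trial = predicted_fee(award, 1)

print(f"settled: {fee_settled:.2f}")
print(f"trial:   {fee_trial:.2f}")
print(f"gap:     {fee_trial - fee_settled:.2f}")
```

Whatever award is chosen, the gap between the two predictions equals the trial coefficient, which is what separate intercepts with a common slope means.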
• While the intercept could be different the slope could be also, i.e. the degree of price discrimination could be different between the two groups
• Model this by an “interaction term”
$\text{fee}_i = \beta_1 + \beta_2 \text{award}_i + \beta_3 \text{Trial}_i + \beta_4 \text{award}_i \times \text{Trial}_i + u_i$
Interaction
• Now the prediction becomes:
$E[\text{fee}_i] = \beta_1 + (\beta_2 + \beta_4) E[\text{award}_i] + \beta_3$ if trial
$E[\text{fee}_i] = \beta_1 + \beta_2 E[\text{award}_i]$ if not
• Note that “trial” disappears when it is zero
• This translates into separate intercepts and slopes on the graph
• The extra € for bringing a case to trial and an extra %
• Testing if $\beta_4$ is significant is a test of a significant difference in % fee between the two groups
gen interact=trial*award
regress ins_allow award trial interact
      Source |       SS       df       MS              Number of obs =      91
-------------+------------------------------           F(  3,    87) =   52.34
       Model |  3.0004e+09     3  1.0001e+09           Prob > F      =  0.0000
    Residual |  1.6625e+09    87  19109443.6           R-squared     =  0.6435
-------------+------------------------------           Adj R-squared =  0.6312
       Total |  4.6629e+09    90  51810441.4           Root MSE      =  4371.4
------------------------------------------------------------------------------
   ins_allow |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       award |   .1468693    .012842    11.44   0.000     .1213445    .1723941
       trial |   2444.119   4526.143     0.54   0.591    -6552.081    11440.32
    interact |   .0561776   .0673738     0.83   0.407    -.0777352    .1900904
       _cons |   4901.306   802.6927     6.11   0.000     3305.868    6496.745
------------------------------------------------------------------------------
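With the interaction term, each group gets its own intercept and its own slope. The sketch below just adds up the point estimates from the Stata output above; the variable names are invented for illustration, and the t statistics in the table show that neither the trial nor the interact estimate is individually significant.

```python
# Coefficients copied from the output of: regress ins_allow award trial interact
CONS = 4901.306          # intercept for settled cases
B_AWARD = 0.1468693      # slope on award for settled cases
B_TRIAL = 2444.119       # shift in intercept for trial cases
B_INTERACT = 0.0561776   # shift in slope for trial cases

# Implied line for cases that settled (trial = 0).
intercept_settled, slope_settled = CONS, B_AWARD

# Implied line for cases that went to court (trial = 1).
intercept_trial = CONS + B_TRIAL
slope_trial = B_AWARD + B_INTERACT

print(f"settled: fee = {intercept_settled:.2f} + {slope_settled:.4f} * award")
print(f"trial:   fee = {intercept_trial:.2f} + {slope_trial:.4f} * award")
```

On the point estimates, the percentage fee is about 14.7% for settled cases and about 20.3% for trial cases, but the wide confidence interval on interact means the data do not pin that difference down.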
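[Figure: fee (vertical axis) against award (horizontal axis), two lines with intercepts $\beta_1$ and $\beta_1 + \beta_3$ and slopes $\beta_2$ and $\beta_2 + \beta_4$]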
Multiple Hypotheses
• A little weird that the interact and trial variables are insignificant
• Possible that they are jointly significant
• Formally, with $\beta_3$ and $\beta_4$ the coefficients on trial and interact: $H_0: \beta_4 = 0$ and $\beta_3 = 0$
$H_1: \beta_4 \neq 0$ or $\beta_3 \neq 0$ (at least one is nonzero)
• This is not the same as two t-tests in sequence
• Use F-test of “Linear Restriction”
• Turns out t-test is a special case
Procedure
1. Estimate the model assuming the null is true, i.e. impose the restriction
• Record $R^2$ for the restricted model
• $R_r^2 = 0.5992$
2. Estimate the unrestricted model, i.e. assuming the null is false
• Record the $R^2$ for the unrestricted model
• $R_u^2 = 0.6435$ (higher than the restricted 0.5992)
3. Form the test statistic
$F = \frac{(R_u^2 - R_r^2)/r}{(1 - R_u^2)/(N - K_u)} = \frac{(0.6435 - 0.5992)/2}{(1 - 0.6435)/(91 - 4)} \approx 5.40$
r = number of restrictions (count equals signs)
N = number of observations
$K_u$ = number of variables (and constant) in the unrestricted model
4. Compare with the critical value from F tables: $F(r, N - K_u)$
• If test statistic is greater than critical value: reject $H_0$
• F(2,87) = 3.15 at 5% significance level
• Here 5.40 > 3.15, so we reject $H_0$: trial and interact are jointly significant
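The procedure can be sketched as a small function. This is an illustration rather than the lecture's own code: the function and argument names are invented, the R-squared and residual sums of squares are the rounded figures from the two Stata outputs (award only, and award with trial and interact), and 3.15 is the critical value quoted on the slide.

```python
def f_stat_from_r2(r2_restricted, r2_unrestricted, n_restrictions, n_obs, k_unrestricted):
    """F statistic for linear restrictions, built from the two R-squared values."""
    numerator = (r2_unrestricted - r2_restricted) / n_restrictions
    denominator = (1 - r2_unrestricted) / (n_obs - k_unrestricted)
    return numerator / denominator

def f_stat_from_rss(rss_restricted, rss_unrestricted, n_restrictions, n_obs, k_unrestricted):
    """Same statistic written in terms of residual sums of squares."""
    numerator = (rss_restricted - rss_unrestricted) / n_restrictions
    denominator = rss_unrestricted / (n_obs - k_unrestricted)
    return numerator / denominator

# Restricted model: award only. Unrestricted: award, trial, interact and a constant.
f_r2 = f_stat_from_r2(0.5992, 0.6435, n_restrictions=2, n_obs=91, k_unrestricted=4)
f_rss = f_stat_from_rss(1.8689e9, 1.6625e9, n_restrictions=2, n_obs=91, k_unrestricted=4)

critical = 3.15   # 5% critical value quoted on the slide
reject = f_r2 > critical

print(f"F from R-squared: {f_r2:.3f}")
print(f"F from RSS:       {f_rss:.3f}")
print(f"reject H0: {reject}")
```

The two versions differ slightly only because the printed inputs are rounded; both come out at roughly 5.4, above the critical value, so the joint null is rejected.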
Comments/Intuition
• Imposing a restriction must make the model explain less of the dependent variable
• If it is “a lot” less then we reject the restriction as being unrealistic
• How much is “a lot”?
– Compare the two $R^2$ (not “adjusted $R^2$”)
– Scale the difference
– Compare to a threshold value
• Critical value is a function of 3 parameters: df1, df2, significance level
• Note that the test doesn’t say anything about the component hypotheses
• Could do t-tests this way: Stata does
• Stata automatically does $H_0: \beta_2 = 0, \dots, \beta_k = 0$
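Both of the last two comments can be checked against the Stata outputs. The overall F that Stata prints is the same restriction test with every slope set to zero, so the restricted R-squared is 0; and with a single restriction the F statistic is the square of the t statistic. The sketch below uses invented names and the rounded figures from the three regressions, so agreement is only up to rounding.

```python
def overall_f(r2, n_obs, k):
    """F test that all slopes are zero; k counts the slopes plus the constant."""
    n_restrictions = k - 1
    return (r2 / n_restrictions) / ((1 - r2) / (n_obs - k))

# R-squared and number of coefficients for the three regressions in the slides.
f_award_only = overall_f(0.5992, n_obs=91, k=2)     # Stata reports 133.05
f_with_trial = overall_f(0.6406, n_obs=91, k=3)     # Stata reports 78.43
f_with_interact = overall_f(0.6435, n_obs=91, k=4)  # Stata reports 52.34

# Special case: one restriction, so F equals t squared.
t_award = 0.1519855 / 0.0131763   # t statistic on award in the first regression
t_squared = t_award ** 2

print(f"{f_award_only:.2f} {f_with_trial:.2f} {f_with_interact:.2f}")
print(f"t squared = {t_squared:.2f}")
```

In the award-only regression the squared t statistic and the overall F coincide, because testing that single coefficient is the only restriction there is.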
Conclusions
• We had four learning objectives
1. Model i.e. “Story” or question
2. Multiple regression review
3. Dummy variables
4. Omitted variables (the first failure of GM)
• What’s Next?
– More examples
– More problems for OLS