
How Do Lawyers Set Fees?

Learning Objectives

1. Model i.e. “Story” or question

2. Multiple regression review

3. Omitted variables (our first failure of GM)

4. Dummy variables

Model

• An example of how we can use the tools we have learned

• Simple analyses that don’t have a complicated structure can often be useful

• Question: Lawyers claim that they set fees to reflect the amount of legal work done

• Our suspicion is that fees are set to reflect the amount of money at stake
  – Form of second degree price discrimination

Model

• How to translate a story into econometrics and then test the story?

• Our idea: fees are determined by the size of the award rather than the work done
  – Percentage fees
  – Price discrimination

• Careful to consider alternatives: Insurance

Analysis

• As always summarize and describe the data

• Graph variables of interest (see over)

• Regression to find percentage price rule

[Figure: solicitors' instruction fee excl. VAT (vertical axis, 0 to 40,000) plotted against euros (horizontal axis, 0 to 200,000)]

reg ins_allow award

      Source |       SS       df       MS              Number of obs =      91
-------------+------------------------------           F(  1,    89) =  133.05
       Model |  2.7940e+09     1  2.7940e+09           Prob > F      =  0.0000
    Residual |  1.8689e+09    89  20999331.5           R-squared     =  0.5992
-------------+------------------------------           Adj R-squared =  0.5947
       Total |  4.6629e+09    90  51810441.4           Root MSE      =  4582.5

------------------------------------------------------------------------------
   ins_allow |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       award |   .1519855   .0131763    11.53   0.000     .1258046    .1781665
       _cons |   5029.183    827.947     6.07   0.000      3384.07    6674.296
------------------------------------------------------------------------------

Formulate Story as Hypothesis

• Story is that lawyers charge a fee based on award

• So the null hypothesis is that the coefficient on award, $\beta$, is zero
  – $H_0: \beta = 0$, $H_1: \beta \neq 0$

• Test the hypothesis that award is not statistically significant
  – Stata does it automatically

1. $H_0: \beta = 0$, $H_1: \beta \neq 0$

2. Calculate the test statistic assuming that $H_0$ is true: $t = (0.1519855 - 0)/0.0131763 = 11.53$

3. Either find the test statistic on the t distribution and calculate the p-value, $\Pr(|t| > 11.53) = 0.000$, or compare with one of the traditional threshold (“critical”) values
  – $N - k = 91 - 2 = 89$ degrees of freedom
  – 5% significance level: 1.96 (the large-sample value; about 1.99 with 89 degrees of freedom)

4. $|t|$ > all the critical values and $\Pr(|t| > 11.53) = 0.000$

5. So we reject the null hypothesis
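The five steps above can be re-checked by hand from the Stata output. The following is a minimal Python sketch, not part of the original slides; the variable names are illustrative, and the numbers are copied from the `reg ins_allow award` table.

```python
# Numbers copied from the Stata output for `reg ins_allow award`.
coef_award = 0.1519855   # estimated coefficient on award
se_award = 0.0131763     # its standard error
null_value = 0.0         # value of the coefficient under H0

# t statistic: distance of the estimate from the null, in standard errors.
t_stat = (coef_award - null_value) / se_award

# Large-sample two-sided 5% critical value quoted on the slide.
critical_5pct = 1.96

# Two-sided test: reject H0 when |t| exceeds the critical value.
reject_h0 = abs(t_stat) > critical_5pct

print(f"t = {t_stat:.2f}, reject H0 at 5%: {reject_h0}")
```

The statistic reproduces the `t` column that Stata reports for `award`.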

[Figure: t distribution with the central 95% of the distribution marked]

Type 1 error

• Note how we set up the hypothesis test

• Null was that percentage charge was zero

• A Type 1 error is rejecting the null when it is true

• The probability of a Type 1 error is the significance level

• So there is a 5% chance of saying that lawyers charge a % fee when they do not
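The claim that the significance level is the probability of a Type 1 error can be illustrated by simulation. The sketch below is an assumption-laden illustration, not the lecture's data: it draws hypothetical samples of 91 observations in which y is unrelated to x (so the null is true), runs a simple regression each time, and counts how often the t-test rejects. The critical value 1.987 is the approximate two-sided 5% value for 89 degrees of freedom.

```python
import math
import random


def slope_t_stat(x, y):
    """t statistic for H0: slope = 0 in a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (slope and constant).
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope / std_err


random.seed(0)
n_obs = 91        # same sample size as the lecture's regression
n_reps = 2000     # number of simulated samples (hypothetical choice)
critical = 1.987  # approximate two-sided 5% critical value, 89 df

rejections = 0
for _ in range(n_reps):
    x = [random.gauss(0, 1) for _ in range(n_obs)]
    y = [random.gauss(0, 1) for _ in range(n_obs)]  # unrelated to x: H0 is true
    if abs(slope_t_stat(x, y)) > critical:
        rejections += 1

rejection_rate = rejections / n_reps
print(f"share of samples where a true null is rejected: {rejection_rate:.3f}")
```

The rejection share should sit close to 0.05, which is the 5% chance of wrongly concluding that lawyers charge a percentage fee.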

Some Comments

• You could formulate the test as one-sided
  – $H_0: \beta \geq 0$, $H_1: \beta < 0$
  – $H_0: \beta \leq 0$, $H_1: \beta > 0$

• Exercise: do this and think about which is best

• Could also test a particular value
  – $H_0: \beta = 0.2$, $H_1: \beta \neq 0.2$
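Testing a particular value only changes the number subtracted in the numerator of the t statistic. The sketch below is illustrative rather than taken from the slides; it uses the reported coefficient, standard error and 95% confidence interval for `award`, and the helper name `t_statistic` is hypothetical.

```python
# Numbers copied from the Stata output for `reg ins_allow award`.
coef_award = 0.1519855
se_award = 0.0131763
ci_low, ci_high = 0.1258046, 0.1781665  # reported 95% confidence interval


def t_statistic(estimate, std_err, null_value):
    """t statistic for H0: coefficient = null_value."""
    return (estimate - null_value) / std_err


# H0: coefficient on award = 0.2 against a two-sided alternative.
t_vs_02 = t_statistic(coef_award, se_award, 0.2)

# A two-sided 5% test rejects exactly when the null value lies
# outside the 95% confidence interval.
null_inside_ci = ci_low <= 0.2 <= ci_high
reject_at_5pct = abs(t_vs_02) > 1.96  # large-sample critical value

print(f"t = {t_vs_02:.2f}, 0.2 inside 95% CI: {null_inside_ci}, reject: {reject_at_5pct}")
```

Because 0.2 lies above the upper end of the confidence interval, the data reject a 20% fee rule at the 5% level.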

Omitted Variables

• Our first failure of the GM (Gauss-Markov) theorem

• Key practical issue
  – Always some variables missing ($R^2 < 1$)

• When does it matter?
  – When they are correlated with the included variables
  – OLS becomes biased and inconsistent

• Often a way to undermine econometric results

• Discuss in two ways
  – State the issue formally
  – Use the lawyers example

Formally

• Suppose we have a model with z omitted

$y_i = \alpha + \beta x_i + \gamma z_i + u_i$ (true model)

$y_i = a + b x_i + u_i$ (estimated)

• Then we will have $E(b) \neq \beta$: b is a biased estimator of the effect of x on y. It is also inconsistent: the bias does not disappear as $N \to \infty$

• The bias is determined by the formula $E(b) = \beta + \gamma\delta$
  – $\beta$ = direct effect of x on y
  – $\gamma$ = direct effect of z on y
  – $\delta$ = effect of z on x (from regression of z on x)
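The bias formula can be checked on simulated data. In the sketch below everything is hypothetical: the parameter values, the sample size, and the labels `beta` (direct effect of x on y), `gamma` (direct effect of z on y) and `delta` (slope linking z to x) are assumptions chosen for illustration. The short regression of y on x alone should return roughly `beta + gamma * delta` rather than `beta`.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with a constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


random.seed(1)
n = 20000                             # hypothetical sample size
alpha, beta, gamma = 1.0, 2.0, 3.0    # hypothetical true parameters
delta = 0.5                           # hypothetical link between z and x

x = [random.gauss(0, 1) for _ in range(n)]
z = [delta * xi + random.gauss(0, 1) for xi in x]          # z correlated with x
y = [alpha + beta * xi + gamma * zi + random.gauss(0, 1)   # true model
     for xi, zi in zip(x, z)]

b_short = ols_slope(x, y)      # regression that omits z
delta_hat = ols_slope(x, z)    # slope linking z to x in this sample
predicted = beta + gamma * delta_hat

print(f"short-regression slope: {b_short:.3f}")
print(f"beta + gamma * delta:   {predicted:.3f}")
print(f"true beta:              {beta:.3f}")
```

Setting `delta` or `gamma` to zero in this sketch makes the short-regression slope settle near `beta` again, which matches the two conditions under which the bias disappears.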

In Practice

• OLS erroneously attributes the effect of the missing z to x
  – Violates GM assumption that $E(u|x) = 0$

• From the formula $E(b) = \beta + \gamma\delta$, the bias will go away if
  – $\gamma = 0$: z has no direct effect on y, so the variable should be omitted as it doesn’t matter
  – $\delta = 0$: the missing variable is unrelated to the included variable(s)

• In any project ask:
  – Are there missing variables that ought to be included ($\gamma \neq 0$)?
  – Could they be correlated with any included variables ($\delta \neq 0$)?
  – What is the direction of bias?

Lawyers Example

• Suppose we had the simple model of lawyers’ fees as before.

• A criticism of this model is that it doesn’t take account of the work done by lawyers
  – i.e. measures of quantity and quality of work are omitted variables
  – This invalidates the estimate of b
  – This is how you could undermine the study

• Is the criticism valid?
  – These variables ought to be included as they plausibly affect the fee, i.e. their direct effect is nonzero ($\gamma \neq 0$)
  – They could be correlated with the included award variable ($\delta \neq 0$)
    • it is plausible that more work may lead to a higher award
    • or higher-award cases may require more work

• Turns out not to matter in our case because award and trial are uncorrelated

• Not always the case: use instrumental variables (IV)

Dummy Variables

• Record classifications
  – Dichotomous: “yes/no” e.g. gender, trial, etc.
  – Ordinal e.g. level of education

• OLS doesn’t treat them differently

• Need to be careful about how coefficients are interpreted

• Illustrate with “trial” in the fee regression
  – Trial = 1 iff case went to court
  – Trial = 0 iff case settled before court

• Our basic model is

$\text{fee}_i = \beta_1 + \beta_2 \text{award}_i + u_i$

• This can be interpreted as predicting fees based on awards, i.e.

$E[\text{fee}_i] = \beta_1 + \beta_2 E[\text{award}_i]$

• Suspect that fee is systematically different if case goes to trial

$\text{fee}_i = \beta_1 + \beta_2 \text{award}_i + \beta_3 \text{Trial}_i + u_i$

• Now the prediction becomes:

$E[\text{fee}_i] = \beta_1 + \beta_2 E[\text{award}_i] + \beta_3$ iff trial

$E[\text{fee}_i] = \beta_1 + \beta_2 E[\text{award}_i]$ iff not

• Note that “trial” disappears when it is zero

• This translates into separate intercepts on the graph

• $\beta_3$ is the extra € for bringing a case to trial

• Testing if $\beta_3$ is significant is a test of a significant difference in fees between the two groups

• For price discrimination story: award still significant

[Figure: fee against award; two parallel lines with intercepts $\beta_1$ (no trial) and $\beta_1 + \beta_3$ (trial)]

regress ins_allow award trial

      Source |       SS       df       MS              Number of obs =      91
-------------+------------------------------           F(  2,    88) =   78.43
       Model |  2.9871e+09     2  1.4936e+09           Prob > F      =  0.0000
-------------+------------------------------           Adj R-squared =  0.6324
       Total |  4.6629e+09    90  51810441.4           Root MSE      =  4363.9

------------------------------------------------------------------------------
   ins_allow |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       award |   .1489103   .0125847    11.83   0.000     .1239009    .1739197
       trial |   5887.706   1848.795     3.18   0.002     2213.616    9561.797
       _cons |   4798.368   791.7677     6.06   0.000     3224.896     6371.84
------------------------------------------------------------------------------
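To see what the dummy coefficient means in euros, the fitted equation can be evaluated for a settled case and a tried case with the same award. This is an illustrative Python sketch using the coefficients reported in the table above; the award of 50,000 and the function name `predicted_fee` are hypothetical.

```python
# Coefficients copied from the Stata output for
# `regress ins_allow award trial`.
b_cons = 4798.368
b_award = 0.1489103
b_trial = 5887.706


def predicted_fee(award, trial):
    """Fitted fee for a given award; trial is 1 if the case went to court, else 0."""
    return b_cons + b_award * award + b_trial * trial


award = 50000  # hypothetical award in euros

fee_settled = predicted_fee(award, 0)
fee_trial = predicted_fee(award, 1)

print(f"settled: {fee_settled:.2f}")
print(f"trial:   {fee_trial:.2f}")
print(f"gap:     {fee_trial - fee_settled:.2f}")
```

Whatever award is plugged in, the gap between the two predictions equals the coefficient on `trial`, which is the "separate intercepts" reading of a dummy variable.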

• While the intercept could be different, the slope could be also, i.e. the degree of price discrimination could be different between the two groups

• Model this by an “interaction term”

$\text{fee}_i = \beta_1 + \beta_2 \text{award}_i + \beta_3 \text{Trial}_i + \beta_4 \text{award}_i \times \text{Trial}_i + u_i$

Interaction

• Now the prediction becomes:

$E[\text{fee}_i] = \beta_1 + (\beta_2 + \beta_4) E[\text{award}_i] + \beta_3$ iff trial

$E[\text{fee}_i] = \beta_1 + \beta_2 E[\text{award}_i]$ iff not

• Note that “trial” disappears when it is zero

• This translates into separate intercepts and slopes on the graph

• $\beta_3$ is the extra € for bringing a case to trial and $\beta_4$ is the extra %

• Testing if $\beta_4$ is significant is a test of a significant difference in % fee between the two groups

gen interact=trial*award

regress ins_allow award trial interact

      Source |       SS       df       MS              Number of obs =      91
-------------+------------------------------           F(  3,    87) =   52.34
       Model |  3.0004e+09     3  1.0001e+09           Prob > F      =  0.0000
-------------+------------------------------           Adj R-squared =  0.6312
       Total |  4.6629e+09    90  51810441.4           Root MSE      =  4371.4

------------------------------------------------------------------------------
   ins_allow |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       award |   .1468693    .012842    11.44   0.000     .1213445    .1723941
       trial |   2444.119   4526.143     0.54   0.591    -6552.081    11440.32
    interact |   .0561776   .0673738     0.83   0.407    -.0777352    .1900904
       _cons |   4901.306   802.6927     6.11   0.000     3305.868    6496.745
------------------------------------------------------------------------------

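[Figure: fee against award; two lines with different intercepts ($\beta_1$ and $\beta_1 + \beta_3$) and different slopes ($\beta_2$ and $\beta_2 + \beta_4$)]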
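With the interaction term, each group has its own fitted line. The sketch below is an illustration in Python, not part of the slides; it reads the implied intercept and slope for each group off the coefficients reported for `regress ins_allow award trial interact`.

```python
# Coefficients copied from the Stata output for
# `regress ins_allow award trial interact`.
b_cons = 4901.306
b_award = 0.1468693
b_trial = 2444.119
b_interact = 0.0561776

# Settled cases (trial = 0): the dummy and the interaction both drop out.
intercept_settled = b_cons
slope_settled = b_award

# Tried cases (trial = 1): the dummy shifts the intercept and the
# interaction shifts the slope.
intercept_trial = b_cons + b_trial
slope_trial = b_award + b_interact

print(f"settled: fee = {intercept_settled:.3f} + {slope_settled:.4f} * award")
print(f"trial:   fee = {intercept_trial:.3f} + {slope_trial:.4f} * award")
```

The point estimates suggest a higher fixed charge and a higher percentage for tried cases, but both differences have large standard errors in the table, which motivates looking at them jointly.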

Multiple Hypotheses

• A little weird that the interact and trial variables are individually insignificant

• Possible that they are jointly significant

• Formally: $H_0: \beta_4 = 0$ and $\beta_3 = 0$

$H_1: \beta_4 \neq 0$ or $\beta_3 \neq 0$ (at least one is nonzero)

• This is not the same as two t-tests in sequence

• Use F-test of “Linear Restriction”

• Turns out t-test is a special case

Procedure

1. Estimate the model assuming the null is true, i.e. impose the restriction
  • Record $R^2$ for the restricted model
  • $R^2_r = 0.5992$

2. Estimate the unrestricted model, i.e. assuming the null is false
  • Record the $R^2$ for the unrestricted model
  • $R^2_u = 0.6435$

3. Form the test statistic

$$F = \frac{(R^2_u - R^2_r)/r}{(1 - R^2_u)/(N - K_u)} = \frac{(0.6435 - 0.5992)/2}{(1 - 0.6435)/(91 - 4)} \approx 5.40$$

  • r = number of restrictions (count equals signs)
  • N = number of observations
  • $K_u$ = number of variables (and constant) in the unrestricted model

4. Compare with the critical value from F tables: $F(r, N - K_u)$
  • If the test statistic is greater than the critical value: reject $H_0$
  • $F(2, 87) \approx 3.1$ at the 5% significance level (tables list 3.15 for 60 and 3.07 for 120 denominator degrees of freedom)
  • Here 5.40 exceeds the critical value, so we reject $H_0$: trial and interact are jointly significant
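The F statistic for the joint test can be recomputed from the two reported R-squared values. The following is a minimal Python sketch under the assumption that the rounded R-squared figures are accurate enough; the function name is hypothetical, and 3.15 is the critical value quoted in the lecture rather than an exact F(2, 87) quantile.

```python
def f_stat_from_r2(r2_unrestricted, r2_restricted, n_restrictions, n_obs, k_unrestricted):
    """F statistic for linear restrictions, built from the two R-squared values."""
    numerator = (r2_unrestricted - r2_restricted) / n_restrictions
    denominator = (1 - r2_unrestricted) / (n_obs - k_unrestricted)
    return numerator / denominator


r2_r = 0.5992   # restricted model: fee on award only
r2_u = 0.6435   # unrestricted model: award, trial and interact
r = 2           # restrictions: coefficients on trial and interact both zero
n_obs = 91
k_u = 4         # award, trial, interact and the constant

f_stat = f_stat_from_r2(r2_u, r2_r, r, n_obs, k_u)

critical_5pct = 3.15  # value quoted in the lecture for the 5% level
reject_h0 = f_stat > critical_5pct

print(f"F = {f_stat:.2f}, reject H0 at 5%: {reject_h0}")
```

The result differs from the slide's 5.40 only in the third significant figure because the R-squared inputs are rounded to four decimals; the conclusion is unaffected.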

Comments/Intuition

• Imposing a restriction cannot make the model explain more of the dependent variable

• If it explains “a lot” less then we reject the restriction as being unrealistic

• How much is “a lot”?
  – Compare the two $R^2$ (not “adjusted $R^2$”)
  – Scale the difference
  – Compare to a threshold value

• Critical value is a function of 3 parameters: df1, df2, significance level

• Note the test doesn’t say anything about the component hypotheses

• Could do t-tests this way: Stata does (for a single restriction, $F = t^2$)

• Stata automatically tests $H_0: \beta_2 = 0, \dots, \beta_k = 0$ (the F statistic in the regression header)
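The remark that the t-test is a special case of the F-test can be checked against the first regression, where `award` is the only regressor. There the overall F(1, 89) reported by Stata should equal the square of the t statistic on `award`. The sketch below is illustrative Python using only numbers from that output.

```python
# Numbers copied from the Stata output for `reg ins_allow award`.
coef_award = 0.1519855
se_award = 0.0131763
f_reported = 133.05   # F(1, 89) from the regression header

# t statistic for H0: coefficient on award = 0.
t_stat = coef_award / se_award

# With one restriction, the F statistic is the square of the t statistic.
t_squared = t_stat ** 2

print(f"t^2 = {t_squared:.2f}, reported F = {f_reported:.2f}")
```

The two numbers agree to the precision Stata prints, so a single-coefficient t-test and a one-restriction F-test give the same decision.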

Conclusions

• We had four learning objectives

1. Model i.e. “Story” or question

2. Multiple regression review

3. Dummy variables

4. Omitted variables (the first failure of GM)

• What’s Next?
  – More examples
  – More problems for OLS
