Robust Alternatives to Estimate Benchmark Frontiers

KEI, September 2006, KUL

Léopold Simar, Institut de Statistique, Université Catholique de Louvain, Belgium




Contents

• Dominance and Probabilistic Formulation of a Production Process

– Farrell-Debreu efficiency scores

– Nonparametric estimators

• Robust Versions of Benchmark Frontiers

– Partial-order frontier

– Nonparametric estimators

• Introducing Environmental Factors

– Exploring the influence of factors on the production process

• Some References


Dominance and Probabilistic Formulation -1-

• Reformulates the production process and introduces in a natural way

– The Farrell-Debreu benchmark frontier

– Its nonparametric estimators, FDH and DEA

• Extensions allow for some noise and for robustness to outliers and extreme values

• The formulation makes it easy to introduce environmental factors


Dominance and Probabilistic Formulation -2-

• The production process generates inputs $X$ and outputs $Y$ such that

$$(X, Y) \in \Psi \subset \mathbb{R}^p_+ \times \mathbb{R}^q_+$$

– The attainable set is $\Psi = \{(x, y) \in \mathbb{R}^p_+ \times \mathbb{R}^q_+ \mid x \text{ can produce } y\}$

– The DGP (data generating process) follows a probability model completely characterized by the knowledge of

$$H_{XY}(x, y) = \operatorname{Prob}(X \le x, Y \ge y),$$

the probability that a unit operating at the level $(x, y)$ is dominated.

– The support of $H_{XY}(\cdot, \cdot)$ is $\Psi$.

– Decomposition

$$H_{XY}(x, y) = \operatorname{Prob}(X \le x \mid Y \ge y)\, \operatorname{Prob}(Y \ge y) = F_{X|Y}(x|y)\, S_Y(y)$$

$$\phantom{H_{XY}(x, y)} = \operatorname{Prob}(Y \ge y \mid X \le x)\, \operatorname{Prob}(X \le x) = S_{Y|X}(y|x)\, F_X(x)$$

• All the relevant information is there!


Dominance and Probabilistic Formulation -4-

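Farrell-Debreu Efficiency

• input orientation

$$\theta(x, y) = \inf\{\theta \mid F_{X|Y}(\theta x|y) > 0\} = \inf\{\theta \mid H_{XY}(\theta x, y) > 0\}$$

• output orientation

$$\lambda(x, y) = \sup\{\lambda \mid S_{Y|X}(\lambda y|x) > 0\} = \sup\{\lambda \mid H_{XY}(x, \lambda y) > 0\}$$

If $\Psi$ is free disposal, this is nothing new: the scores coincide with the usual definitions

$$\theta(x, y) = \inf\{\theta \mid (\theta x, y) \in \Psi\}, \qquad \lambda(x, y) = \sup\{\lambda \mid (x, \lambda y) \in \Psi\}$$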


Dominance and Probabilistic Formulation -3-

[Figure: two panels. Left: "Isoquants in input space" (axes: input 1, x1; input 2, x2), showing the isoquants ∂X(y1) and ∂X(y2) with y2 > y1 and the points O, P and Q. Right: "Isoquants in output space" (axes: output 1, y1; output 2, y2), showing the isoquants ∂Y(x1) and ∂Y(x2) and the points P and Q.]

Figure 1: Isoquants and efficiency measures: (left) input orientation, $\theta_P = |OQ|/|OP| \le 1$, and (right) output orientation, $\lambda_P = |OQ|/|OP| \ge 1$.


Dominance and Probabilistic Formulation -5-

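$\Psi$ is unknown: it has to be estimated from $\mathcal{X}_n = \{(x_i, y_i) \mid i = 1, \ldots, n\}$.

• Nonparametric estimators: plug in the empirical version of $H_{XY}(x, y)$,

$$H_{XY,n}(x, y) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(X_i \le x, Y_i \ge y),$$

so that

$$F_{X|Y,n}(x|y) = \frac{H_{XY,n}(x, y)}{H_{XY,n}(\infty, y)} \quad \text{and} \quad S_{Y|X,n}(y|x) = \frac{H_{XY,n}(x, y)}{H_{XY,n}(x, 0)}$$

• Efficiency estimators:

$$\hat\theta(x, y) = \inf\{\theta \mid F_{X|Y,n}(\theta x|y) > 0\} \quad \text{and} \quad \hat\lambda(x, y) = \sup\{\lambda \mid S_{Y|X,n}(\lambda y|x) > 0\}$$

These are the FDH estimators (the DEA estimators are obtained by convexifying the FDH).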
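The plug-in definitions above have a simple computational form: the input score is a min-max of input ratios over the units producing at least y, and the output score is a max-min of output ratios over the units using at most x. The following is a minimal sketch of that computation, not code from the talk; the function name `fdh_scores` and the toy data are hypothetical, and all components of x and y are assumed strictly positive.

```python
import numpy as np

def fdh_scores(x, y, X, Y):
    """Return (theta_hat, lambda_hat), the FDH efficiency scores of (x, y).

    X is an (n, p) array of observed inputs and Y an (n, q) array of
    observed outputs. Hypothetical helper written for illustration.
    """
    X = np.asarray(X, dtype=float).reshape(len(X), -1)
    Y = np.asarray(Y, dtype=float).reshape(len(Y), -1)
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()

    # Input orientation: among units with Y_i >= y, the smallest theta such
    # that some X_i <= theta * x, i.e. min_i max_j X_i^j / x^j.
    produces_at_least_y = np.all(Y >= y, axis=1)
    theta = np.nan
    if produces_at_least_y.any():
        theta = float(np.min(np.max(X[produces_at_least_y] / x, axis=1)))

    # Output orientation: among units with X_i <= x, the largest lambda such
    # that some Y_i >= lambda * y, i.e. max_i min_j Y_i^j / y^j.
    uses_at_most_x = np.all(X <= x, axis=1)
    lam = np.nan
    if uses_at_most_x.any():
        lam = float(np.max(np.min(Y[uses_at_most_x] / y, axis=1)))
    return theta, lam

# Toy data (made up): three units with one input and one output.
X_obs = [[2.0], [4.0], [6.0]]
Y_obs = [[1.0], [3.0], [4.0]]
theta_hat, lambda_hat = fdh_scores([5.0], [2.0], X_obs, Y_obs)
print(theta_hat, lambda_hat)
```

In this toy example the unit (5, 2) could reduce its input to 80% of its current level, or raise its output by 50%, before reaching the FDH frontier.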


Dominance and Probabilistic Formulation -6-

[Figure: "FDH estimator" (axes: input, x; output, y), showing the observations (xi, yi), a point (x0, y0), the free-disposability region of each observation and the resulting set.]

Figure 2: FDH estimator $\hat\Psi_{FDH}$ of the production set $\Psi$: the • are the observations.


Dominance and Probabilistic Formulation -7-

Properties

• FDH estimators are consistent (if Ψ is convex, DEA is also consistent)

• Asymptotic theory is available

• Bootstrap has to be used in practice

• Drawbacks:

– sensitivity to extreme values and outliers

– curse of dimensionality: the rate of convergence is not $n^{-1/2}$ but rather, e.g. for FDH, $n^{-1/(p+q)}$
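To make the curse of dimensionality concrete, here is a small worked comparison of the two rates. The sample size $n = 10^4$ and the dimension $p + q = 4$ are chosen purely for illustration, and the constants in front of the rates are ignored.

$$n^{-1/2} = (10^4)^{-1/2} = 10^{-2}, \qquad n^{-1/(p+q)} = (10^4)^{-1/4} = 10^{-1}.$$

Matching the $10^{-2}$ order with $p + q = 4$ would require $n^{-1/4} = 10^{-2}$, that is, $n = 10^8$ observations.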


Robust Benchmark Frontier -1-

Basics: presentation for the output orientation and one output $y \in \mathbb{R}_+$.

• The production set is

$$\Psi = \{(x, y) \in \mathbb{R}^p_+ \times \mathbb{R}_+ \mid x \text{ can produce } y\}.$$

• The production process is defined by the joint cdf of $(X, Y)$ on $\mathbb{R}^p_+ \times \mathbb{R}_+$:

$$F(x, y) = \operatorname{Prob}(X \le x, Y \le y) = \operatorname{Prob}(Y \le y \mid X \le x)\, \operatorname{Prob}(X \le x) = F_{Y|X}(y|x)\, F_X(x),$$

where here $F_{Y|X}(y|x) = 1 - S_{Y|X}(y|x)$ is a nonstandard conditional cdf (conditioned on $X \le x$).

• If $\Psi$ is free disposal, the Farrell-Debreu benchmark frontier function is

$$\varphi(x) = \{y \mid (x, \lambda y) \notin \Psi,\ \forall \lambda > 1\} \equiv \inf\{y \mid F_{Y|X}(y|x) = 1\}.$$


Robust Benchmark Frontier -2-

Partial order frontiers: Economic interpretation

A new benchmark frontier, less extreme than the “full frontier”.

• Order-m

– a unit (x, y) is benchmarked against the average maximal output reached by m peers randomly drawn from the population of units using no more input than x.

– As m → ∞, the order-m frontier converges to the full frontier.

• Order-α: quantile-type

– a unit (x, y) is benchmarked against the output level exceeded by only 100(1 − α)% of firms in the population of units using no more input than x.

– As α → 1, the order-α frontier converges to the full frontier.


Robust Benchmark Frontier -3-

Partial order frontiers: Mathematical definition

• Instead of looking for the full frontier $\varphi(x) = \inf\{y \mid F_{Y|X}(y|x) = 1\}$, define a less extreme concept:

– Order-m frontier

$$\varphi_m(x) = E\left[\max(Y^1, \ldots, Y^m) \mid X \le x\right] = \int_0^\infty \left(1 - [F_{Y|X}(y|x)]^m\right) dy$$

– Order-α quantile frontier

$$\varphi_\alpha(x) = F^{-1}_{Y|X}(\alpha|x) = \inf\{y \in \mathbb{R}_+ \mid F_{Y|X}(y|x) \ge \alpha\}$$

• Properties

as $m \to \infty$, $\varphi_m(x) \to \varphi(x)$, and as $\alpha \to 1$, $\varphi_\alpha(x) \to \varphi(x)$
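The integral form of the order-m frontier follows from a standard identity for nonnegative random variables. As a sketch, assume (this is not spelled out on the slide) that $Y^1, \ldots, Y^m$ are i.i.d. draws from the conditional distribution $F_{Y|X}(\cdot|x)$, and write $W = \max(Y^1, \ldots, Y^m)$:

$$\operatorname{Prob}(W \le y \mid X \le x) = [F_{Y|X}(y|x)]^m,$$

$$\varphi_m(x) = E[W \mid X \le x] = \int_0^\infty \operatorname{Prob}(W > y \mid X \le x)\, dy = \int_0^\infty \left(1 - [F_{Y|X}(y|x)]^m\right) dy.$$

For $m = 1$ this reduces to $E[Y \mid X \le x]$, the average output of the units using at most $x$.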


Robust Benchmark Frontier -4-

[Figure: the conditional cdf F(y | X ≤ x) plotted against values of y, marking the full frontier φ(x), the order-α frontier φ_0.80(x), the order-m frontier φ_m(x) and the m data points y with X ≤ x.]

Figure 3: Illustration of full and partial frontiers: m = 6 and α = 0.80


Robust Benchmark Frontier -5-

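Nonparametric estimators of partial order frontier

• Plug-in principle

$$\varphi_{m,n}(x) = \int_0^\infty \left(1 - [F_{n,Y|X}(y|x)]^m\right) dy$$

$$\varphi_{\alpha,n}(x) = \inf\{y \in \mathbb{R}_+ \mid F_{n,Y|X}(y|x) \ge \alpha\}$$

• Properties

– $\sqrt{n}$-consistency and asymptotic normality:

$$\sqrt{n}\,(\varphi_{m,n}(x) - \varphi_m(x)) \xrightarrow{\mathcal{L}} N(0, \sigma^2_m(x)) \quad \text{and} \quad \sqrt{n}\,(\varphi_{\alpha,n}(x) - \varphi_\alpha(x)) \xrightarrow{\mathcal{L}} N(0, \sigma^2_\alpha(x))$$

– Convergence to the FDH estimator:

as $m \to \infty$, $\varphi_{m,n}(x) \to \varphi_{FDH,n}(x)$, and as $\alpha \to 1$, $\varphi_{\alpha,n}(x) \to \varphi_{FDH,n}(x)$

• Detection of outliers (Simar, JPA, 2003)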
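Because the empirical conditional cdf is a step function, both plug-in estimators reduce to simple operations on the sorted outputs of the units with X_i ≤ x. The sketch below illustrates this for a single output; the function names and the toy data are hypothetical, and the order-m value is computed as the exact expectation of the maximum of m draws (with replacement) from those outputs, which equals the integral above.

```python
import numpy as np

def outputs_of_peers(x, X, Y):
    """Sorted outputs Y_i of the units with X_i <= x (componentwise)."""
    X = np.asarray(X, dtype=float).reshape(len(X), -1)
    Y = np.asarray(Y, dtype=float).ravel()
    keep = np.all(X <= np.asarray(x, dtype=float).ravel(), axis=1)
    return np.sort(Y[keep])

def order_m_frontier(x, X, Y, m):
    """Plug-in order-m frontier: integral of 1 - F_n(y|x)^m over y >= 0."""
    ys = outputs_of_peers(x, X, Y)
    if len(ys) == 0:
        return float("nan")
    cdf = np.arange(0, len(ys) + 1) / len(ys)   # 0, 1/N, ..., 1
    weights = cdf[1:] ** m - cdf[:-1] ** m      # Prob(max of m draws = ys[k])
    return float(np.sum(weights * ys))

def order_alpha_frontier(x, X, Y, alpha):
    """Plug-in order-alpha frontier: smallest y with F_n(y|x) >= alpha."""
    ys = outputs_of_peers(x, X, Y)
    if len(ys) == 0:
        return float("nan")
    cdf = np.arange(1, len(ys) + 1) / len(ys)   # F_n at each sorted output
    k = np.searchsorted(cdf, alpha, side="left")  # first index with cdf >= alpha
    return float(ys[min(k, len(ys) - 1)])

# Toy data (made up): four units, one input, one output.
X_obs = [[1.0], [2.0], [3.0], [4.0]]
Y_obs = [1.0, 3.0, 2.0, 5.0]
print(order_m_frontier([3.0], X_obs, Y_obs, m=2))
print(order_alpha_frontier([3.0], X_obs, Y_obs, alpha=0.5))
print(order_alpha_frontier([3.0], X_obs, Y_obs, alpha=1.0))  # FDH value at x = 3
```

Setting α = 1, or letting m grow, recovers the largest output among the peers, which is the FDH frontier value at x.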


Robust Benchmark Frontier -6-

Robust estimator of “full frontier” $\varphi(x)$

• When $m \to \infty$ or $\alpha \to 1$, the partial frontiers and their nonparametric estimators converge to the full frontier and to the FDH frontier respectively.

Theorem 1. If $m = m(n)$ is such that $m(n) = O(n \log(n))$ when $n \to \infty$ and if $\alpha = \alpha(n)$ is such that $n^{(p+2)/(p+1)}(1 - \alpha(n)) \to 0$ as $n \to \infty$, then

$$n^{1/(p+1)}\,(\varphi_{m(n),n}(x) - \varphi(x)) \xrightarrow{\mathcal{L}} \text{Weibull}(\cdot)$$

$$n^{1/(p+1)}\,(\varphi_{\alpha(n),n}(x) - \varphi(x)) \xrightarrow{\mathcal{L}} \text{Weibull}(\cdot)$$

(see CFS and ADT for details)

• Same asymptotic properties as the FDH frontier but, for finite $n$, $\varphi_{m(n),n}(x)$ and $\varphi_{\alpha(n),n}(x)$ provide estimators of $\varphi(x)$ that do not envelop all the data points and so are more robust to extreme values and outliers.

• In practice, m and α are chosen as tuning parameters that control the percentage of points left outside the estimated partial frontier.
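One way to see how the tuning parameter controls the share of points left out is to evaluate the estimated order-α frontier at every observed input and count the units lying strictly above it. The sketch below does this for a single output; it is only an illustration, the function names and data are hypothetical, and it is not the selection procedure of the cited papers.

```python
import numpy as np

def order_alpha_frontier(x, X, Y, alpha):
    """Smallest observed y with F_n(y | X <= x) >= alpha (single output)."""
    ys = np.sort(Y[np.all(X <= x, axis=1)])
    cdf = np.arange(1, len(ys) + 1) / len(ys)
    k = np.searchsorted(cdf, alpha, side="left")
    return ys[min(k, len(ys) - 1)]

def share_above_frontier(X, Y, alpha):
    """Fraction of observed units lying strictly above the order-alpha frontier."""
    X = np.asarray(X, dtype=float).reshape(len(X), -1)
    Y = np.asarray(Y, dtype=float).ravel()
    # Each unit is its own peer, so the set {i : X_i <= x} is never empty here.
    frontier = np.array([order_alpha_frontier(xi, X, Y, alpha) for xi in X])
    return float(np.mean(Y > frontier))

# Toy data (made up): four units, one input, one output.
X_obs = [[1.0], [2.0], [3.0], [4.0]]
Y_obs = [1.0, 3.0, 2.0, 5.0]
for alpha in (0.5, 0.75, 1.0):
    print(alpha, share_above_frontier(X_obs, Y_obs, alpha))
```

With α = 1 the frontier is the FDH frontier and no point is left out; lowering α leaves a growing share of units above the estimated frontier.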


[Figure: plot of the frontiers described in the caption; horizontal axis from 0 to 1, vertical axis from 0 to 1.6.]

Figure 4: Example 1. In solid black, the true frontier $y = x^{0.5}$. In solid cyan, the FDH frontier estimate; in dashed blue, the estimated order-m frontier; in dash-dot red, the estimated order-α frontier. In dotted black, the shifted OLS estimate; in dash-dot black, the parametric stochastic fit. Here m = 20 and α = 0.95.


[Figure: plot of the frontiers described in the caption; horizontal axis from 0 to 1, vertical axis from 0 to 1.6.]

Figure 5: Example 2. In solid black, the true frontier $y = x$. In solid cyan, the FDH frontier estimate; in dashed blue, the estimated order-m frontier; in dash-dot red, the estimated order-α frontier. In dotted black, the shifted OLS estimate; in dash-dot black, the parametric stochastic fit. Here m = 20 and α = 0.95.


[Figure: plot of the frontiers described in the caption; horizontal axis from 0 to 1, vertical axis from 0 to 1.6.]

Figure 6: Example 3. In solid black, the true logit frontier. In solid cyan, the FDH frontier estimate; in dashed blue, the estimated order-m frontier; in dash-dot red, the estimated order-α frontier. In dotted black, the shifted OLS estimate; in dash-dot black, the parametric stochastic fit. Here m = 20 and α = 0.95.


[Figure: "Fit of frontiers with ACC data set"; horizontal axis from 0 to 10, vertical axis from 0 to 14. Legend: FDH, order-α, order-m, Shifted-OLS, MLE-Norm-Exp, data points.]

Figure 7: ACC data: Illustration of full and partial frontiers: m = 20 and α = 0.95.


Robust Benchmark Frontier: Multivariate versions -7-

• Farrell-Debreu efficiency (input orientation)

$$\theta(x, y) = \inf\{\theta \mid F_{X|Y}(\theta x|y) > 0\}$$

If $X$ is univariate, $\phi(y) = \theta(x, y)\, x$ is a minimal input frontier (benchmark)

• Order-α quantile frontier: another benchmark

$$\theta_\alpha(x, y) = \inf\{\theta \mid F_{X|Y}(\theta x|y) > 1 - \alpha\}$$

If $X$ is univariate, $\phi_\alpha(y) = \theta_\alpha(x, y)\, x$ is an order-α input frontier (new benchmark)

• Order-m frontier: yet another benchmark

$$\theta_m(x, y) = \int_0^\infty \left(1 - F_{X|Y}(ux \mid y)\right)^m du$$

If $X$ is univariate, $\phi_m(y) = \theta_m(x, y)\, x = E\left[\min(X^1, \ldots, X^m) \mid Y \ge y\right]$ is an order-m input frontier (new benchmark)
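The multivariate input-oriented measures above can be estimated by plugging in the empirical $F_{X|Y,n}$. For each unit with $Y_i \ge y$, define the ratio $t_i = \max_j X_i^j / x^j$; then $F_{X|Y,n}(\theta x|y)$ is the share of those units with $t_i \le \theta$, and both scores become functions of the sorted ratios. The sketch below assumes strictly positive inputs and α in (0, 1]; the function names and data are hypothetical.

```python
import numpy as np

def input_ratios(x, y, X, Y):
    """Sorted ratios t_i = max_j X_i^j / x^j over the units with Y_i >= y."""
    X = np.asarray(X, dtype=float).reshape(len(X), -1)
    Y = np.asarray(Y, dtype=float).reshape(len(Y), -1)
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    keep = np.all(Y >= y, axis=1)
    return np.sort(np.max(X[keep] / x, axis=1))

def theta_alpha(x, y, X, Y, alpha):
    """Smallest theta with F_n(theta * x | y) > 1 - alpha, for alpha in (0, 1]."""
    t = input_ratios(x, y, X, Y)
    if len(t) == 0:
        return float("nan")
    cdf = np.arange(1, len(t) + 1) / len(t)
    k = np.searchsorted(cdf, 1.0 - alpha, side="right")  # first cdf > 1 - alpha
    return float(t[min(k, len(t) - 1)])

def theta_m(x, y, X, Y, m):
    """Integral over u >= 0 of (1 - F_n(u * x | y))^m, a step-function sum."""
    t = input_ratios(x, y, X, Y)
    if len(t) == 0:
        return float("nan")
    survival = 1.0 - np.arange(1, len(t)) / len(t)  # 1 - k/N on [t_(k), t_(k+1))
    return float(t[0] + np.sum(np.diff(t) * survival ** m))

# Toy data (made up): three units with one input and one output.
X_obs = [[2.0], [4.0], [6.0]]
Y_obs = [[1.0], [3.0], [4.0]]
print(theta_alpha([5.0], [2.0], X_obs, Y_obs, alpha=1.0))  # FDH input score
print(theta_m([5.0], [2.0], X_obs, Y_obs, m=2))
```

A score above 1 is possible for the partial measures: it signals a unit that uses less input than its order-α or order-m benchmark.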


Environmental Factors Z

• Very easy and natural

– No separability conditions (Simar and Wilson, 2006 JE: 2-stage story)

– No prior information on the role of Z (favorable or not to the process)

• Replace $H_{XY}(x, y) = \operatorname{Prob}(X \le x, Y \ge y)$ by

$$H_{XY|Z}(x, y \mid Z = z) = \operatorname{Prob}(X \le x, Y \ge y \mid Z = z)$$

• Nonparametric estimator: kernel smoothing on Z

$$H_{XY,n|Z}(x, y \mid Z = z) = \frac{\sum_{i=1}^{n} \mathbf{1}(X_i \le x, Y_i \ge y)\, K((Z_i - z)/h)}{\sum_{i=1}^{n} K((Z_i - z)/h)}$$

• All the relevant information is there and the theory is done!

• The effect of Z (favorable, neutral or detrimental) is assessed by analyzing $\theta(x, y|z)/\theta(x, y)$ as a function of $z$.
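The kernel estimator and the conditional efficiency ratio can be sketched as follows for a univariate Z. The Epanechnikov kernel, the fixed bandwidth h and the function names are assumptions made for this illustration; the slide does not specify a kernel or a bandwidth rule. With a compactly supported kernel, the conditional FDH score only uses the units whose kernel weight is positive.

```python
import numpy as np

def epanechnikov(u):
    """Kernel with compact support [-1, 1] (an assumed choice)."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) < 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def _as_arrays(X, Y, Z):
    X = np.asarray(X, dtype=float).reshape(len(X), -1)
    Y = np.asarray(Y, dtype=float).reshape(len(Y), -1)
    return X, Y, np.asarray(Z, dtype=float).ravel()

def h_cond(x, y, z, X, Y, Z, h):
    """Kernel estimate of Prob(X <= x, Y >= y | Z = z)."""
    X, Y, Z = _as_arrays(X, Y, Z)
    w = epanechnikov((Z - z) / h)
    if w.sum() == 0.0:
        return float("nan")
    dominates = np.all(X <= np.ravel(x), axis=1) & np.all(Y >= np.ravel(y), axis=1)
    return float(np.sum(w * dominates) / np.sum(w))

def theta_cond(x, y, z, X, Y, Z, h):
    """Conditional FDH input score: inf{theta : h_cond(theta * x, y, z) > 0}."""
    X, Y, Z = _as_arrays(X, Y, Z)
    w = epanechnikov((Z - z) / h)
    keep = np.all(Y >= np.ravel(y), axis=1) & (w > 0.0)
    if not keep.any():
        return float("nan")
    return float(np.min(np.max(X[keep] / np.ravel(x), axis=1)))

# Toy data (made up): the low-input unit operates under a different Z.
X_obs, Y_obs, Z_obs = [[2.0], [4.0], [6.0]], [[1.0], [3.0], [4.0]], [1.0, 5.0, 5.5]
conditional = theta_cond([5.0], [0.5], 5.0, X_obs, Y_obs, Z_obs, h=1.0)
# A very large bandwidth weights all units equally: the unconditional score.
unconditional = theta_cond([5.0], [0.5], 5.0, X_obs, Y_obs, Z_obs, h=1e6)
print(conditional / unconditional)
```

Repeating the last computation over a grid of z values gives the ratio curve described in the final bullet above.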


[Figure: two panels, each plotted against values of Z (horizontal axis from 1 to 10, vertical axis from 1 to 3). "Effect of Z on Full frontier" shows eff(x,y|z)/eff(x,y); "Effect of Z on Order-m frontier" shows eff_m(x,y|z)/eff_m(x,y).]

Figure 8: “Unfavorable” effect of Z on production efficiency, only after Z > 5


Conclusions

• Nonparametric frontiers are very flexible

• Statistical inference is available

• Robust versions are very useful and easy to compute

• Environmental factors are easy to introduce


Main References

• Daraio, C. and L. Simar (2006), Advanced Robust and Nonparametric Methods in Efficiency Analysis. Methodology and Applications, forthcoming, Springer, New York, September 2006.

• Cazals, C., J.P. Florens and L. Simar (2002), Nonparametric frontier estimation: a robust approach, Journal of Econometrics, 106, 1–25.

• Daouia, A. and L. Simar (2004), Nonparametric efficiency analysis: a multivariate conditional quantile approach, Discussion paper 0419, Institut de Statistique, UCL, forthcoming in Journal of Econometrics.

• Daraio, C. and L. Simar (2005), Introducing environmental variables in nonparametric frontier models: a probabilistic approach, Journal of Productivity Analysis, vol 24, 1, 93–121.

• Daraio, C. and L. Simar (2006), Conditional nonparametric frontier models for convex and non convex technologies: a unifying approach, Discussion paper 0502, Institut de Statistique, UCL, forthcoming in Journal of Productivity Analysis.