fear 1.12 user’s guide - clemson universitymedia.clemson.edu › ... › software › fear ›...

FEAR 1.12 User’s Guide

Paul W. Wilson

Department of Economics

222 Sirrine Hall

Clemson University

Clemson, South Carolina 29634 USA

[email protected]

24 February 2009

Contents

1 Introduction 1

2 License Issues 2

3 Downloading and Installing R 3

4 Adding FEAR to R 4

4.1 Where to get FEAR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44.2 Installing FEAR into R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

5 Estimation with FEAR 6

5.1 Getting help . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65.2 Example #1: DEA estimates of technical efficiency . . . . . . . . . . . . . . . . . . . . . . . . 95.3 Example #2: Outlier detection for frontier models . . . . . . . . . . . . . . . . . . . . . . . . 115.4 Example #3: Other estimators of technical efficiency: . . . . . . . . . . . . . . . . . . . . . . 115.5 Example #4: Farrell-Debreu efficiencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145.6 Estimating other things . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

1 Introduction

An extensive literature concerning the measurement of efficiency in production has developed since Debreu

(1951) and Farrell (1957) provided basic definitions for technical and allocative efficiency in production. One

large section of this literature focuses on linear-programming based measures of efficiency along the lines of

Charnes et al. (1978) and Fare et al. (1985). In addition, the free disposal hull (FDH) method of Deprins

et al. (1984) is sometimes used; this estimator can be written as a linear program, algthough it is easier to

compute estimates using numerical methods rather than linear programming. Within this literature, the

that rely on convexity assumptions are known as Data Envelopment Analysis (DEA).

DEA estimators have been applied in more than 1,800 articles published in more than 490 refereed journals

(Gattoufi et al., 2004). DEA and similar non-parametric estimators offer numerous advantages, the most

obvious being that one need not specify a (potentially erroneous) functional relationship between production

inputs and outputs. Although much of the nonparametric efficiency literature has ignored statistical issues

such as inference, hypothesis testing, etc., the statistical properties of DEA estimators have recently been

established; see Simar and Wilson (2000a) for a survey of these results, and Kneip et al. (2003) for more

recent results.

Standard software packages (e.g., LIMDEP, STATA, TSP) used by econometricians do not include

procudures for DEA or other nonparametric efficiency estimators. Several specialized, commercial soft-

ware packages reviewed by Hollingsworth (1999) and Barr (2004) are available, and to varying degrees, these

are good at what they were designed to do. Each includes facilities for reading data into the program, in some

cases in a variety of formats, and procedures for estimating models that the authors have programmed into

their software. A common complaint heard among practitioners, however, runs along the lines of “package

X will not let me estimate the model I want!” The existing packages are designed for ease of use (again,

with varying degrees of success), but the cost of this is often inflexibility, limiting the user to procedures

the authors have explicitly made available. Moreover, none of the existing packages include procedures for

statistical inference. Although the asymptotic distribution of DEA estimators is now known (see Kneip et

al., 2003, for details) for the general case with p inputs and q outputs, bootstrap methods remain the only

useful approach for inference. None of the existing packages incorporate the bootstrap methods proposed by

Simar and Wilson (1998, 2000b).

FEAR 1.12 consists of a software library that can be linked to the general-purpose statistical package R.

The routines included in FEAR 1.12 allow the user to compute DEA estimates of technical, allocative, and

overall efficiency while assuming either variable, non-increasing, or constant returns to scale. The routines

are highly flexible, allowing measurement of efficiency of one group of observations relative to a technology

defined by a second, reference group of observations. Consequently, the routines can be used to compute

Malmquist indices, scale efficiency measures, super-efficiency scores along the lines of Andersen and Petersen

(1993), and other measures that might be of interest. Routines are also included to facilitate implementation

1

of the bootstrap methods described by Simar and Wilson (1998, 2000b). These features can be further used

to implement methods of inference for Malmquist indices as in Simar and Wilson (1999), statistical tests for

irrelevant inputs and outputs or aggregation possibilities as described in Simar and Wilson (2001). as well

as statistical tests of constant returns to scale versus non-increasing or varying returns to scale as described

in Simar and Wilson (2002a), A routine for maximum likelihood estimation of a truncated regression model

is included for regressing DEA efficiency estimates on environmental variables as described in Simar and

Wilson (2005). In addition, FEAR 1.12 includes commands that can be used to compute FDH efficiency

estimates along the lines of Deprins et al. (1984), to perform outlier analysis using the methods of Wilson

(1993), and to compute the new, robust, root-n consistent order-m efficiency estimators described by Cazals

et al. (2002). Most of these features are unavailable in existing software packages.

2 License Issues

R is distributed under the terms of the GNU General Public License Version 2, June 1991. This license can

be found on the internet at http://www.gnu.org/copyleft/gpl.html. Further details on licensing of R can be

found by typing license() in the R console window described below.

FEAR 1.12 is provided under the license that appears in the file COPYING included with the software.

This license states the following:

All software, documentation, etc. in this package is Copyright Paul W. Wilson, 2006.

This package may be copied by anyone for academic purposes. Under no circumstances, without

approval from the author, may any of the software contained in this package be used for com-

mercial purposes. This includes any purpose for which money, goods, or anything of value is

received as compensation in the form of direct or indirect payment, salary, wage, consulting fee,

etc., with the sole exception of university or college salaries paid for academic research leading

to publications in academic journals.

Software and documentation in this package may be freely distributed, but not modified. Any

reverse-engineering of compiled code included in this package is expressly prohibited.

Those wishing to use this package for commercial purposes should contact the author for permis-

sion.

This file must be included in any redistribution of this package.

Academic papers reporting results obtained from this software should cite the author. A proper

citation is:

Wilson, Paul W. (2007), FEAR: A package for frontier efficiency analysis with R,

Socio-Economic Planning Sciences, forthcoming.

2

By using the software in this package for any purpose, the user agrees to honor the terms and

conditions set forth herein.

The software in this package has been checked for accuracy. No guarantees, warranties, etc.,

explicit, implicit, or otherwise, are made.

3 Downloading and Installing R

R is a language and environment for statistical computing graphics. It is an implementation of the S

language developed at Bell Laboratories, but unlike the commercial version of S marketed as S-Plus by

Lucent Technoligies, R is freely available under the Free Software Foundation’s GNU General Public License.

According to the R project’s web pages (http://www.r-project.org),

“R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical

tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly

extensible.... One of R’s strengths is the ease with which well-designed publication-quality plots

can be produced, including mathematical symbols and formulae where needed. Great care has

been taken over the defaults for the minor design choices in graphics, but the user retains full

control.... R is an integrated suite of software facilities for data manipulation, calculation and

graphical display. It includes (i) an effective data handling and storage facility; (ii) a suite of

operators for calculations on arrays, in particular matrices; (iii) a large, coherent, integrated

collection of intermediate tools for data analysis; (iv) graphical facilities for data analysis and

display either on-screen or on hardcopy; and (v) a well-developed, simple and effective program-

ming language which includes conditionals, loops, user-defined recursive functions and input and

output facilities.”

R includes an online help facility and extensive documentation in the form of manuals included with the

package. In addition, several books describing uses of R are available (eg., Dalgaard, 2002; Venables and

Ripley, 2002; and Verzani, 2004); see also the recent review by Racine and Hyndman (20002).

The current version of R can be downloaded from http://lib.stat.cmu.edu/R/CRAN. Pre-compiled

binary versions are available for a variety of platforms; at present, however, FEAR 1.12 is being made

available only for the Microsoft Windows (NT, 95 and later) operating systems running on Intel (and clone)

machines.

Most users will want to download the precompiled, binary version of R for Windows operating systems.

After downloading the installer from the R web pages, close all running applications, then left-click on Start

then Run, find the R installer (for R version 2.4.0, this is a file named R-2.4.0-win32.exe), and run the

installer, following instructions as they appear. If all goes well, at the end of the installation process, users

should find the R icon on thier screen.

3

4 Adding FEAR to R

The next step is to make the FEAR package available to R.

4.1 Where to get FEAR

After downloading and installing R, FEAR 1.12 can be downloaded from

http://www.economics.clemson.edu/faculty/wilson/Software/FEAR

The FEAR package is contained in a zip file (FEAR.zip), which should be copied to some location on the

user’s machine. Do not unzip the file before installing into R. A copy of the paper announcing the FEAR

package, Wilson (2006a), this manual, (Wilson, 2006b), and the license file for FEAR 1.12 can be found at

the same site.

4.2 Installing FEAR into R

Before the FEAR package can be used, it must be “installed” into R. This must be done only once, unless

upgrading to a later version of FEAR. After downloading the FEAR package, start R by clicking on its

desktop icon. This will open the R graphical user interface (GUI), shown in Figure 1.

Figure 1: The R Windows GUI.

Along the top of the R GUI, find the word “Package,”, and left-click on this; then left-click on “Install

4

package(s) from local zip files...” in the pop-up menu (shown in Figure 2) that appears. Next, a Windows

file-selection window will appear, as shown in Figure 3. Use the navigation buttons to find the file FEAR.zip,

highlight the file name, then left-click on the “Open” button in the Windows file-selection window.

Figure 2: The R Packages menu.

Figure 3: The R file selection interface.

The following should appear in the R console window within the R GUI:

> utils:::menuInstallLocal()

package ’FEAR’ successfully unpacked and MD5 sums checked

updating HTML package descriptions

>

Next, to make the commands in the FEAR package available for use, click on the R console window to

make it active, and type library(FEAR) or require(FEAR) at the prompt; this should result in the following

displayed in the R console window:

> library(FEAR)

5

FEAR (Frontier Efficiency Analysis with R) 1.0 installed

Copyright Paul W. Wilson 2006

See file COPYING for license and citation information

>

Note that FEAR uses another package, KernSmooth, which is included in current distributions of R. This

package is loaded automatically by FEAR, and is used by some of the commands for bandwidth-selection in

FEAR.

5 Estimation with FEAR

5.1 Getting help

After successfully completing the installation of R and FEAR as described in the previous sections, one can

begin using FEAR. User-accessible commands are listed and described in Wilson (2005b). One can also list

the available commands in FEAR by typing help(package=FEAR at the prompt in the R console window

(remember, to access the FEAR package, one must type library(FEAR) or require(FEAR) first).

Alternatively, R’s HTML help facility can also be used. Left-click on “Help” at the top of the R GUI, and

then on “HTML help” in the pop-up menu that appears (see Figure 4). This will open a browser window and

display links for R manuals, packages, etc. as shown in Figure 5. In this window, click on the “Packages” link

to get a list of available packages, then click on “FEAR” to display a list of commands available in the FEAR

package as shown in Figure 6. Clicking on the command names will result in more detailed explanations.

Figure 4: The R help pop-up menu.

6

Figure 5: The R HTML help facility.

7

Figure 6: The R HTML help facility showing commands in FEAR 1.12.

8

Next, some simple examples using commands from the FEAR package are given. FEAR includes several

datasets that can be used to illustrate the package’s capabilities.

5.2 Example #1: DEA estimates of technical efficiency

Typing data(ccr) loads the data given in Charnes et al. (1981). These data contain observations on p = 5

inputs and q = 3 outputs for n = 70 schools. The following commands create a (p × n) matrix of input

vectors and a (q × n) matrix of outputs from the Charnes et al. data:

data(ccr)

x=matrix(nrow=5,ncol=70)

x[1,]=ccr$x1

x[2,]=ccr$x2

x[3,]=ccr$x3

x[4,]=ccr$x4

x[5,]=ccr$x5

y=matrix(nrow=3,ncol=70)

y[1,]=ccr$y1

y[2,]=ccr$y2

y[3,]=ccr$y3

Alternatively, one might type

data(ccr)

x=t(matrix(c(ccr$x1,ccr$x2,ccr$x3,ccr$x4,ccr$x5),

nrow=70,ncol=5))

y=t(matrix(c(ccr$y1,ccr$y2,ccr$y3),nrow=70,ncol=3))

The second example puts the data into a long vector (using the c command), assigns this to a matrix (R

puts data column-wise into matrices), and then assigns the transpose of this matrix (obtained using R’s t

command) to either x or y.

Next, FEAR’s dea command can be used to estimate technical efficiency for the observations in the

Charnes et al. (1981), relative to the technology estimated from these observations. The command dea

computes estimates of either Shephard (1970) input or ouput distance functions; here, the input-orientation

is used:

dhat=dea(XOBS=x,YOBS=y)

Alternatively, output distance functions could be estimated by adding the argument ORIENTATION=2 to the

dea command above. The default is to allow variable returns to scale in the estimated technology, but con-

stant returns or non-increasing returns to scale are also possible using the RTS argument; see documentation

for the dea command in Wilson (2005b).

The command boot.sw98 can be used to estimate confidence intervals via the homogeneous bootstrap

9

method described by Simar and Wilson (1998, 2000a):

tmp=boot.sw98(XOBS=x,YOBS=y,DHAT=dhat,NREP=2000)

Finally, the results from the dea and boot.sw98 commands can be manipulated to produce LaTeX code

for a table:

n=ncol(x) #number of DMUs

table.in=matrix(nrow=n,ncol=7)

table.in[,1]=c(1:n)

table.in[,2]=dhat

table.in[,3]=dhat-tmp$bias #bias-corrected estimate

table.in[,4]=tmp$bias

table.in[,5]=tmp$var

table.in[,6:7]=tmp$conf.int

table.in[1:9,1]=paste(" ",table.in[1:9,1],sep="")

At this point, table.in is a (70× 7) matrix, with each row corresponding to an observation in the Charnes

et al. (1981) data. The first column contains the observation number (1–70); the second column contains

the DEA estimates of the Shephard input distance function for each observation; columns 3 contains a

bias-corrected estimates of the Shephard input distance function, obtained by subtracting the bootstrap bias

estimate from the original distance function estimates in dhat; columns 4–5 contain the bootstrap bias and

variance estimates, respectively; and columns 6–7 contain estimated upper and lower bounds for 95-percent

confidence intervals obtained by bootstrapping.

Continuing to format the results for use in a LaTeX table,

table.in[,2]=ifelse(nchar(table.in[,2])==1,

paste(table.in[,2],".",sep=""),

table.in[,2])

table.in[,2:7]=paste(table.in[,2:7],"000000",sep="")

table.in[,c(2:3,5:7)]=substr(table.in[,c(2:3,5:7)],1,6)

table.in[,4]=substr(table.in[,4],1,7)table.in=paste(table.in[,1]," & ",

table.in[,2]," & ",

table.in[,3]," & ",

table.in[,4]," & ",

table.in[,5]," & ",

table.in[,6]," & ",

table.in[,7]," \\",sep="")

Note that table.in has been converted to a vector of length 70 by these commands. Typing

table.in[1:5] at the prompt in the R console window will display the first five elements of table.in

as shown here:

10

1 & 1.0393 & 1.0728 & -0.0335 & 0.0006 & 1.0424 & 1.1223 \\

2 & 1.1098 & 1.1391 & -0.0293 & 0.0002 & 1.1139 & 1.1660 \\

3 & 1.0697 & 1.0977 & -0.0280 & 0.0003 & 1.0732 & 1.1320 \\

4 & 1.1091 & 1.1279 & -0.0188 & 9.3540 & 1.1132 & 1.1446 \\

5 & 1.0000 & 1.0454 & -0.0454 & 0.0010 & 1.0033 & 1.1051 \\

The LaTeX code in table.in can be written to a file named table in.tex using the command

write(table.in,file="table in.tex")

Inserting the code into a LaTeX document, one can produce a table of results similar to Table 1.

5.3 Example #2: Outlier detection for frontier models

Wilson (1993) describes an influence-function approach for detecting outliers in the context of frontier models.

This is of vital importance, since conventional nonparametric estimators such as FDH or DEA are very

sensitive to outliers. The Wilson (1993) method is implemented by FEAR’s ap and ap.plot commands.

The Charnes et al. (1981) data can be analyzed for outliers using the following commands:

data(ccr)

x=t(matrix(c(ccr$x1,ccr$x2,ccr$x3,ccr$x4,ccr$x5),

nrow=70,ncol=5))

y=t(matrix(c(ccr$y1,ccr$y2,ccr$y3),nrow=70,ncol=5))

tmp=ap(x,y,NDEL=12)

ap.plot(RATIO=tmp$ratio)

Here, the ap.plot command reproduces the log-ratio plot for the Charnes et al. (1981) data that appears

in Wilson (1993). The plot is drawn on-screen, in the R GUI, as shown in Figure 7. Alternatively, R can write

Postscript code for plots to a file, which can be incorporated later into LaTeX or other document-producing

software. To write the plot to a Postscript file, one would add a postscript command before the ap.plot

command in the example shown above. Type help(postscript) or use R’s HTML help facility for further

information.

5.4 Example #3: Other estimators of technical efficiency:

FEAR 1.12 also includes commands to implement FDH and order-m (Cazals et al., 2002) efficiency esti-

mators. To remain consistent with the dea command, which estimates Shephard (1970) input or output

distance functions, the commands that implement the FDH and order-m estimators are also designed to

estimate the Shephard measures, as opposed to the reciprocal Farrell-Debreu measures.

Continuing with the Charnes et al. data from the previous examples, FDH estimators of input-efficiency

can be computed by

fdh(XOBS=x,YOBS=y)

after putting the data into matrices x and y as before. Alternatively, order-m input-efficiency estimates can

be computed using

11

Units Eff. Scores (VRS) Eff. Bias-Corrected BIAS bσ Lower Bound Upper Bound

1 1.0393 1.0728 -0.0335 0.0006 1.0424 1.12232 1.1098 1.1391 -0.0293 0.0002 1.1139 1.16603 1.0697 1.0977 -0.0280 0.0003 1.0732 1.13204 1.1091 1.1279 -0.0188 9.3540 1.1132 1.14465 1.0000 1.0454 -0.0454 0.0010 1.0033 1.10516 1.0990 1.1282 -0.0292 0.0002 1.1030 1.15667 1.1218 1.1427 -0.0209 0.0001 1.1255 1.15958 1.1049 1.1371 -0.0322 0.0005 1.1090 1.18559 1.1647 1.1889 -0.0242 0.0001 1.1683 1.2101

10 1.0629 1.0952 -0.0323 0.0005 1.0661 1.140211 1.0000 1.0456 -0.0456 0.0010 1.0038 1.107312 1.0000 1.0463 -0.0463 0.0010 1.0041 1.102113 1.1596 1.1782 -0.0186 7.2837 1.1642 1.192814 1.0104 1.0379 -0.0275 0.0002 1.0143 1.069015 1.0000 1.0554 -0.0554 0.0020 1.0034 1.144316 1.0524 1.0807 -0.0283 0.0004 1.0558 1.123417 1.0000 1.0536 -0.0536 0.0020 1.0034 1.144718 1.0000 1.0408 -0.0408 0.0006 1.0034 1.083919 1.0498 1.0764 -0.0266 0.0002 1.0534 1.106820 1.0000 1.0541 -0.0541 0.0018 1.0035 1.141121 1.0000 1.0529 -0.0529 0.0017 1.0037 1.130422 1.0000 1.0311 -0.0311 0.0002 1.0029 1.055423 1.0258 1.0476 -0.0218 0.0001 1.0293 1.073724 1.0000 1.0521 -0.0521 0.0016 1.0029 1.128925 1.0217 1.0381 -0.0164 5.2518 1.0250 1.049926 1.0609 1.0807 -0.0198 9.8236 1.0651 1.097827 1.0000 1.0457 -0.0457 0.0010 1.0034 1.100828 1.0097 1.0339 -0.0242 0.0001 1.0128 1.059229 1.1321 1.1673 -0.0352 0.0004 1.1370 1.208730 1.1193 1.1447 -0.0254 0.0001 1.1242 1.165131 1.1949 1.2185 -0.0236 0.0001 1.1995 1.237632 1.0000 1.0473 -0.0473 0.0011 1.0039 1.107333 1.0503 1.0737 -0.0234 0.0001 1.0543 1.099134 1.1640 1.1866 -0.0226 0.0001 1.1683 1.208235 1.0000 1.0428 -0.0428 0.0008 1.0037 1.097736 1.2611 1.2900 -0.0289 0.0002 1.2657 1.319537 1.1914 1.2243 -0.0329 0.0003 1.1960 1.253838 1.0000 1.0526 -0.0526 0.0019 1.0034 1.143039 1.0621 1.0943 -0.0322 0.0004 1.0661 1.133140 1.0528 1.0778 -0.0250 0.0002 1.0560 1.106841 1.0500 1.0640 -0.0140 4.3406 1.0536 1.074942 1.0491 1.0791 -0.0300 0.0003 1.0530 1.113743 1.1564 1.1871 -0.0307 0.0003 1.1610 1.221344 1.0000 1.0535 -0.0535 0.0020 1.0036 1.147845 1.0000 1.0358 -0.0358 0.0006 1.0037 1.084546 1.0954 1.1159 -0.0205 0.0001 1.0991 1.136947 1.0000 1.0516 -0.0516 0.0015 1.0037 1.124148 1.0000 1.0529 -0.0529 0.0020 1.0038 1.144549 1.0000 1.0458 -0.0458 0.0010 1.0032 1.103250 1.0431 1.0706 -0.0275 0.0004 1.0455 1.111851 1.0871 1.1106 -0.0235 0.0001 1.0908 1.137352 1.0000 1.0537 -0.0537 0.0019 1.0037 1.143053 1.1498 1.1739 -0.0241 0.0001 1.1539 1.194554 1.0000 1.0554 -0.0554 0.0021 1.0043 1.150655 1.0006 1.0313 -0.0307 0.0003 1.0047 1.067756 1.0000 1.0511 -0.0511 0.0015 1.0034 1.125557 1.0788 1.1096 -0.0308 0.0005 1.0819 1.153558 1.0000 1.0547 -0.0547 0.0021 1.0039 1.146159 1.0000 1.0540 -0.0540 0.0020 1.0032 1.147660 1.0199 1.0395 -0.0196 7.9640 1.0242 1.0543

Table 1: Estimates for Charnes et al. (1981) data (2000 bootstrap replications).

12

Units Eff. Scores (VRS) Eff. Bias-Corrected BIAS bσ Lower Bound Upper Bound

61 1.1202 1.1523 -0.0321 0.0005 1.1242 1.198862 1.0000 1.0543 -0.0543 0.0020 1.0035 1.144863 1.0379 1.0613 -0.0234 0.0001 1.0415 1.083464 1.0749 1.0932 -0.0183 8.9526 1.0784 1.109765 1.0252 1.0489 -0.0237 0.0001 1.0283 1.070366 1.0687 1.0833 -0.0146 4.8927 1.0722 1.095367 1.0568 1.0770 -0.0202 0.0001 1.0605 1.094068 1.0000 1.0554 -0.0554 0.0020 1.0038 1.147869 1.0000 1.0545 -0.0545 0.0020 1.0037 1.142570 1.0373 1.0563 -0.0190 8.7786 1.0411 1.0719

Table 1: (continued).

Figure 7: Log-ratio plot for Charnes et al. (1981) data produced by ap.plot command.

13

orderm(XOBS=x,YOBS=y)

with the default value m = 25 for the trimming parameter. The resulting estimates, along with DEA

estimates obtained using FEAR’s dea command, are shown in Table 2. See Wilson (2005b) for details on

the various options that are available with these commands.

The results in Table 2 provide a useful diagnostic check. It is now well-known that DEA and FDH

estimators suffer from the curse of dimensionality; i.e., their convergence rates diminish as the number of

inputs and outputs increases (see Simar and Wilson, 2000 for discussion). In Table 2, all but 5 of the

FDH estimates—almost 93 percent of the sample—are identically equal to 1. This means that most of the

apparent inefficiency implied by the DEA estimates is due solely to the convexity assumption incorporated

by DEA estimators.1 In other words, with 8 dimensions (5 inputs and 3 outputs), 70 observations are likely

far too few to obtain statistically meaningful estimates of technical efficiency. Here, the Charnes et al. (1981)

data are used only for illustrative purposes.

It is also interesting to note that most of the order-m estimates in Table 2 are less than 1. This is to be

expected, since (i) most of the FDH estimates are equal to 1, and (ii) the input-oriented order-m estimates

converge to the corresponding FDH estimates (from below) as m → ∞ for fixed sample size n. See Cazals

et al. (2002) for details.

5.5 Example #4: Farrell-Debreu efficiencies

As noted previously, FEAR’s efficiency estimation commands are designed to estimate the Shephard (1970)

measures of efficiency. Researchers sometimes work in terms of the Farrell (1957) definitions, where efficiency

is measured by the reciprocals of the Shephard measures. This creates no particular difficulties. Assuming

input and output observations have been placed in matrices x and y as in section 5.2, the following commands

will produce estimates of Farrell input efficiencies as well as estimates of 95-percent confidence intervals:

tmp1=dea(XOBS=x,YOBS=y)

tmp2=boot.sw98(XOBS=x,YOBS=y,DHAT=tmp1)

dhat=1/tmp1

n=ncol(x)

ci=matrix(nrow=n,ncol=2)

ci[,1]=1/tmp2$conf.int[,2]

ci[,2]=1/tmp2$conf.int[,1]

Estimates of the Farrell input-efficiencies are contained in dhat, while corresponding estimates of 95-

percent confidence intervals are contained in ci. Note that taking reciprocals of the confidence interval

estimates returned by boot.sw98 requires reversing the order of the bounds; i.e., the reciprocal of the upper

1FDH efficiency estimators estimate efficiency for a given point relative to the free-disposal hull of the sample observations,while DEA efficiency estimators estimate efficiency for a given point relative to the convex hull of the free-disposal hull of thesample observations. Therefore, any differences between FDH and DEA estimators can only be due to the DEA estimators’incorporation of the convexity assumption on the feasible set of inputs and outputs. See Simar and Wilson (2000a).

14

Obs. DEA FDH order-m Obs. DEA FDH order-m

1 1.0393 1.0000 1.0000 36 1.2611 1.0000 0.94382 1.1098 1.0000 0.9512 37 1.1914 1.0000 0.94093 1.0697 1.0000 0.9823 38 1.0000 1.0000 0.63874 1.1091 1.0513 0.9906 39 1.0621 1.0000 0.94375 1.0000 1.0000 0.6757 40 1.0528 1.0000 0.89806 1.0990 1.0000 0.8892 41 1.0500 1.0000 0.95587 1.1218 1.0000 0.9580 42 1.0491 1.0000 0.86028 1.1049 1.0000 0.9619 43 1.1564 1.0000 0.93249 1.1647 1.0000 0.9864 44 1.0000 1.0000 1.0000

10 1.0629 1.0000 0.9971 45 1.0000 1.0000 0.692911 1.0000 1.0000 0.9925 46 1.0954 1.0000 0.996212 1.0000 1.0000 0.9867 47 1.0000 1.0000 0.936013 1.1596 1.0135 1.0065 48 1.0000 1.0000 0.643714 1.0104 1.0000 0.6968 49 1.0000 1.0000 0.661115 1.0000 1.0000 0.6137 50 1.0431 1.0000 0.994416 1.0524 1.0000 0.9926 51 1.0871 1.0000 0.835817 1.0000 1.0000 0.7684 52 1.0000 1.0000 0.998318 1.0000 1.0000 0.9385 53 1.1498 1.0129 0.885619 1.0498 1.0000 0.9970 54 1.0000 1.0000 1.000020 1.0000 1.0000 0.9505 55 1.0006 1.0000 0.842321 1.0000 1.0000 0.9871 56 1.0000 1.0000 0.587322 1.0000 1.0000 0.8533 57 1.0788 1.0000 0.920423 1.0258 1.0000 1.0000 58 1.0000 1.0000 0.730324 1.0000 1.0000 0.8862 59 1.0000 1.0000 1.000025 1.0217 1.0000 0.9463 60 1.0199 1.0000 0.899426 1.0609 1.0000 0.9794 61 1.1202 1.0000 0.811727 1.0000 1.0000 0.9136 62 1.0000 1.0000 0.660128 1.0097 1.0000 0.8403 63 1.0379 1.0000 0.839429 1.1321 1.0000 0.8477 64 1.0749 1.0000 0.964130 1.1193 1.0000 0.8375 65 1.0252 1.0000 0.775531 1.1949 1.0577 1.0027 66 1.0687 1.0245 0.965432 1.0000 1.0000 0.7511 67 1.0568 1.0000 0.950733 1.0503 1.0000 1.0000 68 1.0000 1.0000 0.857234 1.1640 1.0000 0.9822 69 1.0000 1.0000 0.572435 1.0000 1.0000 1.0000 70 1.0373 1.0000 0.8774

Table 2: Various estimators of input efficiency for Charnes et al. (1981) data.

15

bound for the Shephard measure gives the lower bound for the Farrell measure, while the reciprocal of the

lower bound for the Shephard measures gives the upper bound for the Farrell measure.2

5.6 Estimating other things

The commands implemented in FEAR 1.12 are designed to be very flexible. In addition to the dea, fdh, and

orderm commands illustrated in the previous examples, commands are provided for estimating cost efficiency

(cost.min), revenue efficiency (revenue.max), and profit efficiency (profit.max). Malmquist indices and

various decompositions may be estimated using the commands malmquist.components and malmquist; in

addition, these commands can be used to obtain bootstrap estimates of confidence intervals for Malmquist

indices, etc., as described by Simar and Wilson (1999). See Wilson (2005b) for details on these commands

and arguments that may be passed, as well as simple examples for each command.

The commands that estimate various efficiencies are designed to estimate self-efficiency among a set of

observations (e.g., to estimate efficiency for a set of n observatons relative to the technology supported

by those same n observatons), or to estimate efficiency for a set of points relative to a reference set of

points. In the case of the dea command, this allows one to estimate Malmquist indices, where cross-period

efficiencies must be estimated. In addition, by a series of appropriate invocations of the dea command,

one can estimate components of the various decompositons of Malmquist indices that have appeared in the

literature, or that might appear in the future (rather than using the pre-coded facility implemented by the

commands malmquist.components and malmquist mentioned earlier).

This feature distinguishes FEAR from existing software packages, where typically only estimators that

have been explicitly programmed into the package can be computed. FEAR offers sufficient flexibility to

compute a wide variety of estimates, even of quantities that perhaps have not been estimated previously.

The homogeneous bootstrap method introduced by Simar and Wilson (1998, 2000a) is implemented by

the command boot.sw98, but the ability to estimate efficiency among observations in one group relative to

observations in another group makes it easy to program other bootstrap procedures. Using the commands in

FEAR 1.12, it is straightforward to implement the bootstrap for Malmquist indices proposed by Simar and

Wilson (1999), the heterogenous bootstrap proposed by Simar and Wilson (2000b), the bootstrap tests of

hypotheses regarding irrelevant variables or additivity relationships proposed by Simar and Wilson (2001),

as well as the bootstrap tests of hypotheses regarding returns to scale proposed by Simar and Wilson (2002).

Simar and Wilson (2003) discuss a bootstrap method for making inference in two-step procedures, where

one regresses DEA efficiency estimates from a first-stage estimation on observed environmental variables

in a second-stage regression. The trunc.reg command in FEAR 1.12 can be used to estimated truncated

normal regression equations as in Simar and Wilson (2003), and the rnorm.trunc command can be used to

implement either of the bootstrap alogorithms proposed by Simar and Wilson. Again, see Wilson (2005b)

2Note also that one should not merely take reciprocals of the bootstrap bias and variance estimates returned by boot.sw98

to obtain the corresponding Farrell measures.

16

for details on these commands.

17

References

[1] Andersen, P. and N.C. Petersen (1993), A procedure for ranking efficient units in data envelopment

analysis, Management Science 39, 1261–64.

[2] Barr, R.S. (2004), “DEA Software Tools and Technology,” in W.W. Cooper, L.M. Seiford, and J. Zhu

(eds.), Handbook on Data Envelopment Analysis, Boston: Kluwer Academic Publishers, pp. 539–566.

[3] Cazals, C., J.P. Florens, and L. Simar (2002), Nonparametric frontier estimation: A robust approach,

Journal of Econometrics 106, 1–25.

[4] Charnes, A., W.W. Cooper, and E. Rhodes (1978), Measuring the inefficiency of decision making units,

European Journal of Operational Research 2, 429–444.

[5] Charnes, A., W.W. Cooper, and E. Rhodes (1981), Evaluating program and managerial efficiency: an

application of data envelopment analysis to program follow through, Management Science 27, 668–697.

[6] Dalgaard, P. (2002), Introductory Statistics with R, New York: Springer-Verlag, Inc.

[7] Debreu, G. (1951), The coefficient of resource utilization, Econometrica 19(3), 273–292.

[8] Deprins, D., L. Simar, and H. Tulkens (1984), “Measuring Labor Inefficiency in Post Offices, in The

Performance of Public Enterprises: Concepts and measurements, ed. by M. Marchand, P. Pestieau, and

H. Tulkens, Amsterdam: North-Holland, 243–267.

[9] Fare, R., S. Grosskopf, and C.A.K. Lovell (1985), The Measurement of Efficiency of Production, Boston:

Kluwer-Nijhoff Publishing.

[10] Farrell, M.J. (1957), The measurement of productive efficiency, Journal of the Royal Statistical Society,

Series A, 120, 253–281.

[11] Gattoufi, S., M. Oral, and A. Reisman (2004), Data envelopment analysis literature: A bibliography

update (1951–2001), Socio-Economic Planning Sciences, in press.

[12] Kneip, A., L. Simar, and P.W. Wilson (2003), “Asymptotics for DEA Estimators in Nonparametric

Frontier Models,” Discussion paper #0317, Institut de Statistique, Universite Catholique de Louvain,

Louvain-la-Neuve, Belgium.

[13] Hollingsworth, B. (1999), Data envelopment analysis and productivity analysis: A review of the options,

Economic Journal 109, 458–462.

[14] Racine, J. and R. Hyndman (2002), Using R to teach econometrics, Journal of Applied Econometrics

17, 149–174.

18

[15] Shephard, R.W. (1970), Theory of Cost and Production Function, Princeton: Princeton University

Press.

[16] Simar, L. (2003), Detecting outliers in frontiers models: A simple approach, Journal of Productivity

Analysis 20, 391–424.

[17] Simar, L. and P.W. Wilson (1998), Sensitivity analysis of efficiency scores: How to bootstrap in non-

parametric frontier models, Management Science 44, 49–61.

[18] Simar, L. and P.W. Wilson (1999), Estimating and Bootstrapping Malmquist Indices, European Journal

of Operational Research 115, 459–471.

[19] Simar, L. and P.W. Wilson (2000a), Statistical inference in nonparametric frontier models: The state

of the art, Journal of Productivity Analysis 13, 49–78.

[20] Simar, L. and P.W. Wilson (2000b), A general methodology for bootstrapping in nonparametric frontier

models, Journal of Applied Statistics 27, 779–802.

[21] Simar L. and P.W. Wilson (2001), Testing restrictions in nonparametric frontier models, Communica-

tions in Statistics: Simulation and Computation 30, 161–186.

[22] Simar L. and P.W. Wilson (2002a), Nonparametric tests of return to scale, European Journal of Oper-

ational Research 139, 115–132.

[23] Simar, L. and P.W. Wilson (2003), “Estimation and Inference in Two-Stage, Semi-Parametric Models

of Production Processes,” Discussion paper #0307, Institut de Statistique, Universite Catholique de

Louvain, Belgium.

[24] Venables, W.N. and B.D. Ripley (2002), Modern Applied Statistics with S, New York: Springer-Verlag,

Inc.

[25] Verzani, J. (2004), Using R for Introductory Statistics, London: Chapman & Hall.

[26] Wilson, P.W. (1993), Detecting outliers in deterministic nonparametric frontier models with multiple

outputs, Journal of Business and Economic Statistics 11, 319–323.

[27] Wilson, P.W. (2006a), FEAR: A Software Package for Frontier Efficiency Analysis with R, Socio-

Economic Planning Sciences, forthcoming.

[28] Wilson, P.W. (2006b), FEAR 1.12 Command Reference, Department of Economics, University of Texas,

Austin, Texas.

19

fear 1.12 user’s guide - clemson universitymedia.clemson.edu › ... › software › fear ›...

Documents