Relative Target Setting and Cooperation

MARTIN HOLZHACKER

Michigan State University

STEPHAN KRAMER

Erasmus University Rotterdam

MICHAL MATĚJKA*

Arizona State University

NICK HOFFMEISTER

Independent

February 2018

* Corresponding author: PO Box 873606, Tempe, AZ 85287-3606. E-mail: [email protected].

Acknowledgements: We thank Martin Artz (discussant), Alexander Brüggen (discussant), Will Demere, Christoph Feichter, Isabella Grabner, Katlijn Haesebrouck, Joanna Ho, Sylvia Hsingwen Hsu (discussant), Ranjani Krishnan, Peter Kroos, Judith Künneke, Edith Leung, Frank Moers, Mathijs van Peteghem, Stefan Reichelstein, Marcel van Rinsum, Naomi Soderstrom and workshop participants at KU Leuven, Maastricht University, Mannheim University, Queen’s University, University of California, Irvine, the 2017 AAA MAS meeting, the 2017 ACMAR conference, the 2016 EIASM conference on New Directions in Management Accounting Research, the Erasmus University brownbag seminar, and the Michigan State University brownbag seminar for comments and suggestions.

ABSTRACT

A large stream of work on relative performance evaluation highlights the benefits of using

information about peer performance in contracting. In contrast, the potential costs of discouraging

cooperation among peers have received much less attention. The purpose of our study is to examine

how the importance of cooperation affects the use of information about peer performance in target

setting, also known as relative target setting. Specifically, we use data from an industrial services

company where business unit managers need to share specialized equipment and staff with their

peers to manage bottlenecks in their capacity. We construct several empirical proxies for the costs

and benefits of information about peer performance and examine their effects on target setting. We

find robust evidence that the sensitivity of target revisions to past peer performance is higher when

peer group performance has greater capacity to filter out noise but lower when the importance of

cooperation among peers is greater.

1 Introduction

A large stream of literature on relative performance evaluation (RPE) provides evidence that

information about peer performance is used to ex post filter out noise from compensation decisions

(Antle and Smith [1986], Janakiraman, Lambert, and Larcker [1992], Albuquerque [2009]). In

addition, several recent studies (Aranda, Arellano, and Davila [2014], Bol and Lill [2015]) show

that information about peer performance can also be used to ex ante revise performance targets, a

practice referred to as relative target setting (RTS). In contrast, there is relatively little empirical

evidence on the costs of using information about peer performance in compensation contracts even

though the theory predicts dysfunctional incentive effects in settings where managers can

strategically interact with peers and reduce their performance (Lazear [1989], Gibbons and

Murphy [1990]).1

Prior studies typically examine compensation contracts of top executives in large public

companies where the potential for cooperation with peers is limited (e.g., by antitrust law) and the

costs of using information about peer performance in contracting may be negligible. However,

information about peer performance is often used at lower organizational levels for performance

evaluation of managers who voluntarily share knowledge and tangible resources with their peers

(Matsumura and Shin [2006], Casas-Arce and Martinez-Jerez [2009]). In those settings,

facilitating cooperation is equally or more important than filtering out noise from performance

evaluations. In other words, what we know about incentive contracts of top executives may provide

1 There are some experimental studies suggesting that RPE encourages sabotage and collusion (Bandiera, Barankay, and Rasul [2009], Charness and Villeval [2009], Harbring and Irlenbusch [2011]). There is also some evidence that executive compensation is less sensitive to peer performance when firms prefer to soften product market competition (Aggarwal and Samwick [1999], Vrettos [2013]).

only partial insight into the relative costs and benefits of using information about peer

performance to evaluate managers at lower organizational levels.

The theoretical literature suggests that the use of information about peer performance

reflects a trade-off between the benefits of motivating individual effort and the costs of

discouraging cooperation (Milgrom and Roberts [1992]).2 On the one hand, RPE filters out noise

from performance evaluation and thus reduces the costs of incentive provision (Holmström [1979],

Holmström [1982]). Similar noise-filtering benefits also apply to RTS as long as shocks to

performance are persistent and correlated among peers (Casas-Arce et al. [2017]). An additional

benefit of RTS is that it reduces sensitivity of targets to past own performance and consequently

alleviates the ratchet effect (Weitzman [1980], Meyer and Vickers [1997]). On the other hand,

RPE and RTS reduce incentives to cooperate because helping a peer reduces own compensation

(either directly as in the case of RPE contracts or indirectly through increased targets in RTS

contracts). Our empirical analysis is one of the first to examine the trade-off between the noise-

filtering benefits of peer performance information and the costs of discouraging cooperation.

In particular, we collect data on performance evaluation of business unit (BU) managers in

a large global company where the importance of cooperation among BUs varies depending on their

location and specialization. The company provides maintenance and certification services for oil

wells, pipelines, refineries, and other energy infrastructure. The demand for these services

fluctuates widely and BU managers often deal with capacity bottlenecks such as limited

2 Information about peer performance can be incorporated into compensation contracts in two ways (Meyer and Vickers [1997], Casas-Arce, Holzhacker, Mahlendorf, and Matějka [2017]). First, RPE contracts make compensation contingent on own and peer performance observed at the end of a period so that compensation is negatively associated with peer performance. Second, RTS contracts make compensation contingent on performance relative to a beginning-of-period target which is based on past own and peer performance. Targets are positively associated with past peer performance, which also results in a negative association between compensation and peer performance.

availability of certified technicians or highly-specialized equipment (pipeline monitoring robots,

X-ray inspection systems, etc.). The least costly way for BUs to address these bottlenecks is to

share capacity with one another. However, capacity sharing is voluntary and may not always be

feasible.

We use several empirical proxies to capture variation in peer group capacity to filter out

noise and in the importance of cooperation. We expect that peer group capacity to filter out noise

is high when (i) standard deviation in BU own performance (need for noise filtering) is high and

(ii) correlation between own and peer performance (peer group quality) is high. As for our

cooperation proxies, we exploit variation in the extent to which BUs can benefit from personnel

and equipment sharing. We expect that the importance of cooperation is high when

(i) geographical distance to peers is low, (ii) similarity between own and peer services is high, or

(iii) resource availability is high in the sense that peer demand for capacity occurs during months

when a unit has capacity available to share. We extensively validate alternative measures of the

importance of cooperation based on these three proxies. For example, we show that cooperation

among peers is associated with lower excess labor costs, which suggests that the ability to borrow

personnel from peers reduces the need to maintain own slack capacity as a buffer for unexpected

demand.

Our empirical models reflect that BU managers’ compensation in our setting is contingent

on their performance relative to ex ante targets based on past BU and peer performance with little

or no ex post adjustments. Therefore, we follow Aranda et al. [2014] and specify a model of target

revisions as a function of past own and past peer performance. We estimate this RTS model using

106 BU-year observations of actual and targeted earnings. Using this model, we replicate several

findings from prior literature on target setting (Indjejikian, Matějka, Merchant, and Van der Stede

[2014], Bol and Lill [2015]). For example, we show that when top-performing BU managers fail

to meet their target, the next-period target is revised downward. In contrast, failure to meet a target

is associated with only a weak target revision downward for BU managers performing poorly

relative to their peers.

For our main tests, we examine the extent to which the sensitivity of target revisions to past

own and peer performance varies as a function of our proxies for peer group capacity to filter out

noise and the importance of cooperation. We find robust empirical support for our predictions.

First, target revisions are less sensitive to past own performance and/or more sensitive to past peer

performance when peer group capacity to filter out noise is high. Second, we find the opposite

when the importance of cooperation is high—target revisions are more sensitive to past own

performance and/or less sensitive to past peer performance when BUs can more effectively share

resources with their peers due to low geographical distance to peer BUs, high business

similarity, or high resource availability.

These results extend prior literature which almost exclusively focuses on the economic

benefits of information about peer performance and largely ignores its potential costs (Gong, Li,

and Shin [2011], Vrettos [2013]). We provide novel empirical evidence consistent with the theory

that information about peer performance can have adverse incentive effects, especially in settings

where peers need to cooperate. This evidence improves our understanding of performance

evaluation of lower-level managers who are often benchmarked against peers within the same

organization. It also highlights the importance of information about peer performance for ex ante

target revisions rather than for ex post compensation adjustments (Aranda et al. [2014], Bol and

Lill [2015]).

2 Prior Literature and Hypotheses

2.1 PRIOR LITERATURE

Gibbons and Murphy [1990] comprehensively discuss the incentive implications of compensation

contracts contingent on information about peer performance. On the one hand, such contracts

strengthen incentives by insulating managers from common shocks that also affect their peer

performance. On the other hand, the use of information about peer performance gives rise to

dysfunctional incentives if managers can take actions that reduce peer performance. The latter

effect arises because compensation contracts filter out noise by putting a negative weight on

average peer performance when there is positive correlation between shocks to own and peer

performance (Holmström [1979]). Consequently, managers can increase their compensation by

taking actions that lower peer performance (sabotage or collusion) or by failing to take actions that

increase peer performance (refusal to cooperate).
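
The noise-filtering logic can be illustrated with a small numerical sketch. The data and the helper name `filter_weight` below are hypothetical and not taken from any cited study. Own and peer performance share a common shock, so subtracting a scaled peer signal lowers the variance of the performance measure; this is why the weight on peer performance is negative.

```python
from statistics import mean, pvariance

def covariance(xs, ys):
    """Population covariance of two equally long series."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def filter_weight(own, peer):
    """Variance-minimizing b in (own - b * peer): cov(own, peer) / var(peer)."""
    return covariance(own, peer) / pvariance(peer)

# Hypothetical data: a common shock plus idiosyncratic noise for each party.
common = [2.0, -1.0, 0.5, -1.5, 0.0]
own_noise = [0.3, -0.2, 0.1, 0.0, -0.2]
peer_noise = [-0.1, 0.2, 0.0, -0.2, 0.1]
own = [c + n for c, n in zip(common, own_noise)]
peer = [c + n for c, n in zip(common, peer_noise)]

b = filter_weight(own, peer)
# The filtered measure puts weight -b on peer performance.
filtered = [o - b * p for o, p in zip(own, peer)]
```

Because `b` is positive when shocks are positively correlated, higher peer performance lowers the filtered measure, which is the channel through which helping a peer becomes costly.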

A large stream of RPE literature empirically examines the use of information about peer

performance in CEO compensation using various measures of own and industry performance. The

noise-filtering hypothesis implies a negative effect of industry performance on compensation. The

early evidence is somewhat mixed. Some studies find support for the hypothesis (Gibbons and

Murphy [1990], Janakiraman et al. [1992]), other studies provide supportive evidence only in

specific subsamples (Garvey and Milbourn [2003], Rajgopal, Shevlin, and Zamora [2006]), and

yet other studies find no support (Jensen and Murphy [1990]).

A common feature of the early literature is that peer performance is measured as the

average performance of all companies with the same industry classification. Recent studies

develop new ways to identify peers, e.g., by using new proxy statement disclosures about peer

groups, by textual analysis of 10-K filings, or by combining industry classifications and size

information (Albuquerque [2009], Gong et al. [2011], Jayaraman, Milbourn, and Seo [2015]).

These studies generally find that CEO compensation is negatively associated with peer stock

returns, which is consistent with the noise-filtering hypothesis.

Another feature of the early literature is that it only examines RPE or an ex post use of

information about peer performance, which entails that compensation is determined after the end

of a performance period when information about own and peer performance is observed. However,

firms often determine compensation as a function of own performance relative to a target (Murphy

[2001], Matějka and Ray [2017]). This creates demand for RTS which is an ex ante use of

information about past peer performance to adjust beginning-of-period targets (Aranda et al.

[2014]). Casas-Arce et al. [2017] show that RTS strengthens incentives not only by filtering out

noise but also by alleviating the ratchet effect. In particular, if target revisions are relatively more

sensitive to past peer performance and relatively less sensitive to past own performance, managers

are less concerned that greater effort will make future targets more difficult to achieve (Weitzman

[1980], Leone and Rock [2002], Bouwens, Cardinaels, and Zhang [2016]).

An alternative explanation for the mixed evidence on the noise-filtering benefits of RPE in

early work is that most prior studies disregard strategic interaction among product market

competitors. Aggarwal and Samwick [1999] and Joh [1999] find a significantly positive

association between compensation and peer performance and argue that it is consistent with firms’

incentives to soften product market competition. Vrettos [2013] uses a unique single-industry

setting to distinguish between different types of competition and shows that executive

compensation is negatively (positively) associated with peer performance when firms act

aggressively (cooperatively) in the product market.

The importance of considering whether and how peers interact is also borne out in several

experimental studies. Bandiera, Barankay, and Rasul [2005] find evidence that RPE reduces

productivity because coworkers “are able to sustain implicit collusive agreements” and that this

effect is more pronounced in smaller groups. Charness, Masclet, and Villeval [2014] show that

RPE increases willingness to sabotage peer performance merely for status concerns, i.e., even

when favorable performance relative to peers does not increase compensation. More generally, the

issues of sabotage and/or lack of cooperation have been examined in prior literature on

tournaments and public good experiments (Fehr and Gächter [2000], Harbring and Irlenbusch

[2008]). Nevertheless, although the importance of these issues has been established in laboratory

conditions and field experiments involving low-paid workers, there is hardly any evidence on how

they affect the design of managerial compensation.

2.2 HYPOTHESES

It is well-understood that managerial incentives depend on the type of information used to revise

their performance targets (Anderson, Dekker, and Sedatole [2010], Bouwens and Kroos [2011],

Kim and Shin [2016]). However, only a few recent studies examine RTS or the extent to which

target revisions depend on past peer performance (Aranda et al. [2014], Bol and Lill [2015], Casas-

Arce et al. [2017]). The empirical evidence is consistent with the theory that past peer performance

has information content above and beyond past own performance and therefore can be used to

filter out noise from performance evaluations (Holmström [1982]). For example, favorable past

peer performance signals that future own performance is likely to be favorable as well, which

justifies a target revision upward.

The theory also predicts that the relative importance of past own versus past peer

performance in target updating depends on their relative noisiness (Banker and Datar [1989],

Meyer and Vickers [1997], Datar, Kulp, and Lambert [2001]). Consistent with this prediction,

Casas-Arce et al. [2017] show that higher peer group quality is associated with greater weight on

past peer performance and lower weight on past own performance in target updating. Bol and Lill

[2015] show that the weight on past own performance is decreasing in volatility of own

performance. More generally, prior work on the choice of performance measures commonly finds

that greater noisiness of a performance measure reduces its impact on incentive compensation and

increases the impact of other measures (Ittner, Larcker, and Rajan [1997], Core, Guay, and

Verrecchia [2003], Banker, Huang, and Natarajan [2009]). These findings motivate the following

hypotheses.

H1a: The sensitivity of target revisions to past own performance decreases in peer group capacity

to filter out noise.

H1b: The sensitivity of target revisions to past peer performance increases in peer group capacity

to filter out noise.

Thus the benefit of a greater weight on past peer performance is that managers are protected

from common shocks to performance that are beyond their control (Holmström [1982], Antle and

Smith [1986]). However, putting relatively more weight on past peer performance and less weight

on past own performance can also be costly because it reduces incentives to cooperate. In

particular, helping peers improve their performance translates into a higher own target in the future

and consequently into lower expected compensation (Gibbons and Murphy [1990], Milgrom and

Roberts [1992], Chen [2003]). Although it is difficult to make a priori predictions about the relative

magnitude of these costs and benefits, the benefits of using peer performance information likely

exceed the costs when determining compensation of top executives. This is because CEOs and

other top executives cannot easily cooperate (and are sometimes even prohibited by antitrust law

from cooperating) with executives from their peer groups, which often consist of their direct

competitors (Albuquerque [2009], Gong et al. [2011]). In contrast, the trade-off between the costs

and benefits of evaluating managers relative to their peers is important at lower levels of the

organization where cooperation with peers is critical for overall performance.

Firms can manage this inherent trade-off between incentives to increase own performance

(due to better noise filtering) and incentives to cooperate with peers not only by adjusting their

target-setting policies but also by other means such as team-based incentive schemes, transfer

pricing policies, promotion and job rotation practices, etc. (Ickes and Samuelson [1987], Bushman,

Indjejikian, and Smith [1995], Baldenius and Reichelstein [2006], Campbell [2008]). An

advantage of target-setting policies is that they can easily be customized to different organizational

units whereas other policies are often organization-wide and leave less scope for customization

(e.g., transfer pricing). In any case, no single policy or performance evaluation choice can

completely resolve the conflict between individual incentives and incentives to cooperate.

Based on the above arguments, we expect that an increase in the importance of cooperation

among peers leads to a decrease in the weight on past peer performance in target updating because

a lower weight alleviates managers’ concerns about higher future targets as a consequence of helping a peer.

Given that past peer performance and past own performance are substitutes in updating targets to

reflect persistent common shocks to performance (Meyer and Vickers [1997]), we expect that a

decrease in the weight on past peer performance also leads to an increase in the weight on past

own performance in target updating.

H2a: The sensitivity of target revisions to past own performance increases in the importance of

cooperation.

H2b: The sensitivity of target revisions to past peer performance decreases in the importance of

cooperation.
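
One way to read H1 and H2 together is through an interaction specification. The equation below is an illustrative sketch under our own labeling, not necessarily the estimated model: $T_{i,t}$ and $A_{i,t}$ denote targeted and actual earnings of BU $i$ in year $t$, $A_{Peer,t}$ denotes average peer earnings, and $Z_i$ stands for any single proxy (peer group capacity to filter out noise or the importance of cooperation). The coefficient names are assumptions introduced for exposition.

```latex
\begin{equation*}
\begin{aligned}
T_{i,t+1} - T_{i,t} ={}& \beta_0 + \beta_1 (A_{i,t} - T_{i,t}) + \beta_2 (A_{Peer,t} - T_{i,t}) \\
 &+ \beta_3 (A_{i,t} - T_{i,t})\, Z_i + \beta_4 (A_{Peer,t} - T_{i,t})\, Z_i + \beta_5 Z_i + \varepsilon_{i,t+1}
\end{aligned}
\end{equation*}
```

In this notation, H1a and H1b correspond to $\beta_3 < 0$ and $\beta_4 > 0$ when $Z_i$ measures noise-filtering capacity, whereas H2a and H2b correspond to $\beta_3 > 0$ and $\beta_4 < 0$ when $Z_i$ measures the importance of cooperation.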

3 Field Setting

To test our hypotheses we collect data from a global company (further referred to as Gamma)

which provides testing and certification services in the energy sector. Gamma has approximately

5,000 employees worldwide and annual revenues of around $1 billion. It is organized in seven

business regions and 15 business groups each of which consists of multiple business units (BUs)

operating as relatively autonomous profit centers. BUs are typically located close to their clients

and offer a wide range of services depending on market demand.

In particular, there are three primary industry lines of service: upstream, downstream and

power. Upstream includes services to oil and gas companies involved in exploration, extraction,

onshore and offshore well operation as well as transportation and storage. Downstream includes

services to oil and gas processing companies that operate refineries or petrochemical plants as well

as companies involved in further transportation and storage of oil, gas, and other chemicals to

users. Power includes services to companies that operate any type of power plants. For all these

industries, Gamma performs two main types of activities: (i) testing for defects such as cracks or

corrosion and (ii) certifications of safety, reliability, and compliance with regulations.

The primary resources for these testing and certification services are qualified technicians

and specialized equipment. The type of tasks technicians can perform is constrained by their

certification, which comes in three levels: assistant (around 10 percent of the workforce), specialist

(80 percent), and expert (10 percent). Technicians also specialize in the type of equipment they

can operate and the type of inspection they can perform, which makes job experience even more

important than certification. For example, it takes about five years of training and experience for

a technician to be able to operate specialized equipment and competently carry out an inspection

assignment.

Depending on the type of service, technicians use a large variety of equipment ranging

from basic to highly specialized. For example, some relatively simple radiographic devices can be

used for corrosion inspections by all technicians in all three industry lines of service. More

sophisticated radiographic devices (e.g., real-time digital recorders) can only be operated by

technicians with specific training. Ultrasonic equipment requires even greater specialization

because there are up to twenty different types of devices based on different technologies. Finally,

equipment for relatively infrequent tasks such as magnetic or penetrant testing is reserved for a

small group of technicians who have training and experience in this area.

3.1 PERFORMANCE EVALUATION AND TARGET SETTING

BU managers have great autonomy to manage local operations as long as their financial

performance meets expectations. Their compensation consists of a fixed salary and a performance-

contingent bonus. Although performance is measured both in terms of earnings and revenues, the

former is more important because no bonus is paid unless actual earnings meet a performance

target (conversely, when earnings exceed the target, a significant bonus is awarded even if actual

revenues are below target). Thus, the largest part of the bonus is determined based on a comparison

of actual earnings with a target set at the end of the prior year. Ex post discretionary adjustments

to bonuses based on subjective evaluation are possible but insignificant both in magnitude and

occurrence. If used, the subjective criteria for adjustments are BU-specific and may include, for

example, assessments of safety, quality, or leadership.

Target setting within Gamma is a multi-stage process. First, BU managers provide an initial

estimate of next-year earnings and revenues and negotiate targets with business group managers.

During these negotiations business group managers and their staff extensively use a business

intelligence tool that tracks past performance in terms of a detailed break-down of monthly sales,

costs, and earnings for each BU in their business group (also referred to as the peer group). The

resulting targets are therefore based on past BU performance as well as past performance of its

peers. Second, a similar process takes place at the next higher organizational level where business

region directors negotiate targets with business group managers reporting to them. Third, business

regions aggregate all information and propose next-year targets to the board. The board approves

targets as proposed or revises them upward. Fourth, in case of a revision, business region and

group managers decide how to distribute target increases among their BUs. The final targets are

based on negotiations among region, group, and BU managers rather than based on uniform

percentage adjustments affecting all units equally.

3.2 CAPACITY MANAGEMENT

Gamma faces highly volatile demand, which makes capacity management critical for

profitability. BU managers have some flexibility to adjust capacity through overtime,

(re)scheduling of holidays or training sessions, or temporary employment contracts but only if they

know about additional demand in advance (e.g., in case of new maintenance contracts that typically

have a predictable schedule). Nevertheless, surges in demand cannot always be anticipated because

customer installations can get damaged in the normal course of operations and create an urgent

need for repair and retesting. BU managers are very concerned about declining a surprise order

due to capacity constraints because it could mean losing a long-time customer. To reduce this risk,

BUs invest in slack capacity both in terms of labor and equipment even though it involves

significant long-term cost commitments given that newly hired technicians need both training and

experience to become fully productive.

--- Insert Figure 1 ---

Figure 1 illustrates capacity issues for a representative BU in our sample. The first panel

shows that February was the busiest month with actual sales exceeding the average in other months

by 57 percent, whereas January and December were about 26 and 40 percent below the average,

respectively. It also shows that the February peak in demand was not anticipated because actual

sales exceeded target by 69 percent. The second panel describes usage of capacity as measured by

labor hours. Average capacity utilization was about 90 percent implying slack capacity of about

10 percent of total available labor hours. Slack capacity could be as high as 20 percent of labor

hours during its lowest utilization in January and December. However, the surprise demand in

February and June eliminated most of this slack.

Despite its cost, some slack capacity is necessary to manage volatile demand in the absence

of inventory buffers. Slack capacity can also be shared among BUs, which pools their demand,

reduces aggregate volatility, and improves capacity utilization (Jordan and Graves [1995]). In the

absence of sharing, BUs could only rely on their own capacity when accommodating demand

surges and consequently would have to keep even more slack capacity. These benefits of capacity

sharing are widely recognized within Gamma and managers have access to reports tracking their

capacity utilization on a monthly basis. Gamma also has a transfer pricing policy that encourages

capacity sharing. BU transfer prices are based on full costs including an hourly labor rate and travel

costs. Transfers within the same business group are typically significantly less expensive than

transfers across business groups, which involve higher labor rates and long-distance travel costs. In

any case, internal transfer prices are typically much lower than outside market prices of

independent contractors who charge a premium for qualified temporary workers.
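
The pooling benefit of capacity sharing can be made concrete with a minimal sketch. All numbers are hypothetical, and `slack_needed` is an illustrative helper that defines required slack as the gap between peak and average monthly demand; it is not a measure used by Gamma.

```python
def slack_needed(demand):
    """Capacity above average demand required to cover the peak month."""
    return max(demand) - sum(demand) / len(demand)

# Hypothetical monthly labor-hour demand for two BUs whose peaks fall in different months.
bu_a = [80, 150, 100, 90, 100, 80]
bu_b = [100, 90, 95, 140, 85, 90]

# Without sharing, each BU must buffer its own peak.
separate = slack_needed(bu_a) + slack_needed(bu_b)

# With sharing, only the peak of combined demand has to be buffered.
pooled = slack_needed([a + b for a, b in zip(bu_a, bu_b)])
```

Here the two BUs would need 90 hours of slack in total when operating alone but only 40 hours when pooling capacity, because their demand peaks do not coincide.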

Even though the transfer pricing policy creates financial incentives for BUs on both sides

of internal transfers, BUs borrowing capacity typically benefit more than BUs lending it out, which

makes bilateral relations and willingness to cooperate important for capacity sharing. In particular,

BUs transferring their personnel or equipment earn a contribution margin on capacity resources

that would otherwise be idle. However, they also incur opportunity costs in terms of not being able

to accommodate their own unexpected service orders, especially in cases when their equipment

gets damaged and needs to be repaired. It is therefore often the case that BU managers share their

capacity resources because of reciprocity concerns—refusal to share resources with others would

hurt their ability to internally borrow capacity in the future.

In summary, capacity sharing is an essential cost-saving measure that gives BUs greater

flexibility to respond to volatile demand and reduces the need to invest in their own slack capacity.

However, capacity sharing is voluntary and largely based on reciprocity, which makes it important

to create an environment where BU managers are willing to cooperate.

4 Research Design

4.1 DATA

The dataset available for our empirical analysis consists of 127 observations with non-missing

values of earnings targets as well as past own and peer performance in 93 BUs during 2013–2014.

We discard 21 abnormal observations affected by one of the following: (i) a merger with another

BU, (ii) a restructuring into multiple BUs, (iii) other acquisitions or divestitures resulting in

abnormal target revisions or performance relative to target, which we operationalize as 1% of

outlying observations in terms of target revisions or performance relative to target, or (iv) a business

group manager acting as an interim BU manager, in which case the usual target-setting process

does not apply. Our final dataset contains 106 BU-year observations on 74 BUs.

4.2 VARIABLE MEASUREMENT

Our main analysis of target revisions as a function of past own and peer performance uses several

variables as in prior studies. We follow Bouwens and Kroos [2011] to measure target revisions

and past own performance as follows. $T_{i,t+1} - T_{i,t}$ is target revision of BU $i$ at the end of year $t$,

measured as a change in targeted earnings scaled by revenues.3 $A_{i,t} - T_{i,t}$ is own performance of

BU $i$ during year $t$, measured as the difference between actual and targeted earnings scaled by

revenues. $\mathit{ACHIEVE}_{i,t}$ is an indicator variable for exceeding year $t$ earnings target. We follow

Aranda et al. [2014] to measure peer performance $A_{Peer,t} - T_{i,t}$ as the difference between average

earnings of all peers and earnings target of BU $i$ during year $t$ (all scaled by revenues).4
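
The variable definitions above can be sketched in code. This is a minimal illustration with hypothetical numbers; the `BUYear` class, the function name, and the choice of year-t revenues as the scaling base are assumptions, since the text only states that variables are scaled by revenues.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class BUYear:
    actual: float       # actual earnings in year t
    target: float       # earnings target for year t
    next_target: float  # earnings target set for year t+1
    revenue: float      # year-t revenues (assumed scaling base)

def rts_variables(bu, peers):
    """Return target revision, own performance, ACHIEVE, and peer performance."""
    revision = (bu.next_target - bu.target) / bu.revenue
    own_perf = (bu.actual - bu.target) / bu.revenue
    achieve = int(bu.actual > bu.target)
    # Peer benchmark: average scaled earnings of peers minus the BU's own scaled target.
    peer_perf = mean(p.actual / p.revenue for p in peers) - bu.target / bu.revenue
    return revision, own_perf, achieve, peer_perf

# Hypothetical numbers for one BU and two peers.
bu = BUYear(actual=120.0, target=100.0, next_target=110.0, revenue=1000.0)
peers = [BUYear(150.0, 130.0, 140.0, 1000.0), BUYear(90.0, 95.0, 95.0, 1000.0)]
revision, own_perf, achieve, peer_perf = rts_variables(bu, peers)
```

In this example the BU beats its target by 2 percent of revenues, its peers on average also earn 2 percent of revenues more than the BU's target, and the target is revised upward by 1 percent of revenues.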

We use two variables to measure peer group capacity to filter out noise. First, similar to

prior studies on the choice of performance measures (Ittner et al. [1997], Core et al. [2003], Bol

and Lill [2015]), we calculate SDPERF as the standard deviation of own performance for each BU

using monthly data on earnings scaled by revenues (up to 36 BU-month observations). When the

standard deviation of own performance is high, the need to rely on peer performance information

to filter out noise is also high. Second, we follow the approach of Casas-Arce et al. [2017] and

calculate peer group quality, PCORR, as the time-series correlation between actual performance

of a BU (earnings scaled by revenues) and average performance of all its peers (using up to 36

BU-month observations). A higher correlation indicates a greater peer group capacity to filter out

noise.

3 To increase comparability across BUs and alleviate heteroscedasticity issues, we divide all earnings-based variables by revenues similar to the scaling in Bouwens and Kroos [2011] and Aranda et al. [2014]. Leone and Rock [2002] scale earnings by total assets, which are not available in our dataset. We cannot use prior-period earnings for scaling purposes because they can be close to zero or even negative in some cases.

4 Aranda et al. [2014] use a similar RTS variable with the reverse sign (with Ti,t – APeer,t in the numerator) so that high values reflect that Ti,t is relatively difficult to achieve. For exposition of our results, it is easier to code the peer performance variable so that high values reflect high peer performance. As discussed in Section 5.5 on “Additional Evidence”, we find similar results when we measure peer performance as APeer,t – TPeer,t, i.e., as average performance of peers relative to their targets (Casas-Arce et al. [2017]).
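
The two noise-filtering measures can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure actually used: the text does not say whether the sample or population standard deviation is computed (the sketch uses the sample version), and the function names `sdperf` and `pcorr` as well as the monthly series are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def sdperf(monthly_ros):
    """Standard deviation of a BU's monthly earnings scaled by revenues.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    return stdev(monthly_ros)


def pcorr(own_monthly_ros, peer_avg_monthly_ros):
    """Pearson correlation between own and average peer monthly performance."""
    if len(own_monthly_ros) != len(peer_avg_monthly_ros):
        raise ValueError("both series must cover the same months")
    mx, my = mean(own_monthly_ros), mean(peer_avg_monthly_ros)
    cov = sum((x - mx) * (y - my)
              for x, y in zip(own_monthly_ros, peer_avg_monthly_ros))
    var_x = sum((x - mx) ** 2 for x in own_monthly_ros)
    var_y = sum((y - my) ** 2 for y in peer_avg_monthly_ros)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly return on sales for a BU and its peer average.
own = [0.10, 0.12, 0.08, 0.11, 0.09, 0.13]
peers = [0.09, 0.11, 0.08, 0.10, 0.09, 0.11]
volatility = sdperf(own)
peer_quality = pcorr(own, peers)
```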

We use three variables to measure variation in the importance of cooperation across BUs.

First, some BUs are far away from their closest peer whereas others are located in close proximity

to multiple peers, which increases the potential for capacity sharing. To measure geographical

proximity among multiple BUs, we define DISTANT as the distance in kilometers between a BU

and the point of gravity of its peers, i.e., the point minimizing the sum of distances to all peers in

the same business group. An alternative measure based on the average distance between a BU and

all its peers yields qualitatively similar results.
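
The point minimizing the sum of distances to a set of locations is commonly called the geometric median. The text does not state how this point is computed, so the sketch below is only one possible approach: it uses Weiszfeld's iteration and assumes that BU locations are given as planar coordinates in kilometers (real locations would call for great-circle distances). The function names and coordinates are hypothetical.

```python
from math import hypot


def geometric_median(points, tol=1e-9, max_iter=1000):
    """Approximate the point minimizing the sum of distances to `points`.

    Weiszfeld's iteration, started at the centroid. Assumption: planar
    (x, y) coordinates in kilometers.
    """
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(max_iter):
        num_x = num_y = denom = 0.0
        for px, py in points:
            d = hypot(x - px, y - py)
            if d < tol:
                # Simple guard against division by zero when the estimate
                # coincides with a location; not the full modified method.
                continue
            num_x += px / d
            num_y += py / d
            denom += 1.0 / d
        if denom == 0.0:
            break
        new_x, new_y = num_x / denom, num_y / denom
        moved = hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if moved < tol:
            break
    return x, y


def distant(bu_location, peer_locations):
    """Distance in kilometers between a BU and the geometric median of its peers."""
    gx, gy = geometric_median(peer_locations)
    return hypot(bu_location[0] - gx, bu_location[1] - gy)


# Hypothetical layout: four peers on the corners of a 10 km square.
peers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
remote_bu = distant((5.0, 17.0), peers)
```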

Second, we compare BUs to their peers in terms of similarity in services they provide.

Although some capacity resources can be used for all industry lines of service, other resources are

service-specific and can only be shared if two BUs engage in similar activities. For example, X-ray

devices to inspect pipelines in the upstream industry are rarely used in BUs mainly servicing

downstream refineries. To measure similarity of BU services within a peer group, we calculate

SIMILAR as the Hirschman-Herfindahl Index of revenues by service line within a business group.

Specifically, we calculate the sum of squared revenue shares (of total business group revenue)

generated in different lines of service. High values indicate high similarity as reflected in high

concentration of revenues in the same line of service within a business group. Low values indicate

low similarity and low potential for capacity sharing among BUs due to a high diversity of services

provided by BUs within the same business group.
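
The concentration measure described above is the sum of squared revenue shares, which can be sketched as follows. The function name `similar` and the service-line labels and revenue figures are hypothetical.

```python
def similar(revenue_by_service_line):
    """Sum of squared revenue shares across service lines of a business group."""
    total = sum(revenue_by_service_line.values())
    if total <= 0:
        raise ValueError("total business group revenue must be positive")
    return sum((r / total) ** 2 for r in revenue_by_service_line.values())


# Hypothetical business groups (revenues in millions of euros).
concentrated = similar({"upstream": 9.0, "downstream": 1.0})
diversified = similar({"upstream": 2.5, "downstream": 2.5,
                       "power": 2.5, "chemicals": 2.5})
```

A group with all revenue in one service line scores 1, and a group with revenue spread evenly over n lines scores 1/n, so higher values correspond to more similar activities within the group.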

Third, BUs may be in close proximity to their peers and use similar resources but still have

limited potential for capacity sharing if the peak usage occurs during the same months of the year.

Therefore, we also measure MATCH as the proportion of peers that are in need of capacity transfers

(have highest revenues) during the same two months of the year when the focal BU has slack

capacity (its two lowest monthly revenues of the year).5
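
A minimal sketch of this measure follows. The verbal definition leaves some details open, so the sketch makes two assumptions: a peer counts as a match only if all of its peak months coincide with the focal BU's slack months (the hypothetical `min_overlap` parameter relaxes this), and ties in monthly revenue are broken in favor of the earlier month. All names and revenue figures are hypothetical.

```python
def extreme_months(monthly_revenue, k, highest):
    """Indices of the k months with the highest (or lowest) revenue."""
    order = sorted(range(len(monthly_revenue)),
                   key=lambda m: monthly_revenue[m],
                   reverse=highest)
    return set(order[:k])


def match(focal_monthly_revenue, peers_monthly_revenue, k=2, min_overlap=None):
    """Share of peers whose peak months coincide with the focal BU's slack months.

    Assumption: by default all k months must coincide; pass a smaller
    `min_overlap` to count partial overlaps.
    """
    if min_overlap is None:
        min_overlap = k
    slack = extreme_months(focal_monthly_revenue, k, highest=False)
    matched = sum(
        1 for peer in peers_monthly_revenue
        if len(extreme_months(peer, k, highest=True) & slack) >= min_overlap
    )
    return matched / len(peers_monthly_revenue)


# Hypothetical monthly revenues (January first).
focal = [1, 2, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10]   # slack in Jan, Feb
peer_a = [20, 19, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5]          # peaks in Jan, Feb
peer_b = [5, 5, 5, 5, 5, 20, 19, 5, 5, 5, 5, 5]          # peaks in Jun, Jul
peer_c = [5, 19, 5, 5, 5, 5, 5, 20, 5, 5, 5, 5]          # peaks in Feb, Aug
strict = match(focal, [peer_a, peer_b, peer_c])
partial = match(focal, [peer_a, peer_b, peer_c], min_overlap=1)
```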

When testing for the effects of cooperation, we use these three proxies (DISTANT,

SIMILAR, MATCH) individually but we also aggregate them into an overall index for the

importance of cooperation. One way to aggregate is to assume equal weights and construct an

index as the sum of the three proxies standardized to have zero mean and variance of one. Our

validation analysis in Section 5.2 discusses an alternative aggregation method which estimates the

weights using additional information about the outcomes of cooperation. Using the proxies for

cooperation individually or using either of the aggregation methods yields qualitatively similar

results.
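
The equal-weight aggregation can be sketched as follows. Two details are assumptions of the sketch and are not stated in the text: standardization uses the population standard deviation, and DISTANT enters with a reversed sign because it is inversely related to the potential for capacity sharing (the hypothetical flag `reverse_distant` switches this off). The input values are illustrative only.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) variance."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("cannot standardize a constant series")
    return [(v - m) / s for v in values]


def equal_weight_index(distant, similar, match, reverse_distant=True):
    """Sum of the three standardized proxies, one value per BU.

    Assumption: DISTANT is sign-reversed so that higher index values
    indicate greater importance of cooperation.
    """
    sign = -1.0 if reverse_distant else 1.0
    return [
        sign * d + s + m
        for d, s, m in zip(standardize(distant),
                           standardize(similar),
                           standardize(match))
    ]


# Hypothetical proxies for three BUs.
index = equal_weight_index(
    distant=[300.0, 120.0, 40.0],
    similar=[0.30, 0.45, 0.70],
    match=[0.10, 0.25, 0.50],
)
```

In this example the third BU is closest to its peers, most similar to them, and has the highest value of MATCH, so it receives the highest index value.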

Our main analyses use the following control variables. SIZE is the logarithm of BU labor

costs.6 GROWTH is the logarithm of BU growth in revenue. GDP is annual GDP per capita growth

in the business group’s country. PRICE is the hourly charge paid by customers in the country,

which may proxy for (lack of) competitive pressures. As discussed below, we also control for all

time-invariant sources of sample heterogeneity by using a change in targeted earnings as the

dependent variable in our main analysis.

5 Alternative measures defining MATCH in terms of the same one (three) month(s) of the year yield qualitatively similar results in our hypotheses tests. Our main measure yields the strongest results in our validation analysis discussed in the next section. In addition, we find support for the implicit assumption that monthly variation in demand for capacity is at least partly predictable. In particular, there is a highly significant serial correlation (ρ=0.27, p<.01) in ranks of monthly revenue in any given BU-year.

6 Using (the logarithm of) BU revenue or labor cost (unlogged) to control for size yields qualitatively unchanged results. In our validation analysis, where BU labor cost is one of the dependent variables, we use sales to control for size.

4.3 EMPIRICAL MODEL

We model target revisions, Ti,t+1 – Ti,t, as a function of past own performance and past peer

performance. We allow for asymmetric target ratcheting where the effect of past own performance

depends on ACHIEVEi,t (Leone and Rock [2002], Bouwens and Kroos [2011]). Thus, our baseline

model is as follows:

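$$T_{i,t+1} - T_{i,t} = \beta_0 + \beta_1 (A_{i,t} - T_{i,t}) + \beta_2 \mathit{ACHIEVE}_{i,t} + \beta_3 \mathit{ACHIEVE}_{i,t} \times (A_{i,t} - T_{i,t}) + \beta_4 (A_{Peer,t} - T_{i,t}) + \varepsilon. \tag{1}$$

Our hypotheses predict that the effects of past own performance ($\beta_1$) and past peer

performance ($\beta_4$) vary as a function of peer group capacity to filter out noise and the importance

of cooperation. To estimate these moderating effects, we rely on the following models:

$$\begin{aligned}
T_{i,t+1} - T_{i,t} = {} & \beta_0 + \beta_1 (A_{i,t} - T_{i,t}) + \beta_2 \mathit{ACHIEVE}_{i,t} + \beta_3 \mathit{ACHIEVE}_{i,t} \times (A_{i,t} - T_{i,t}) + \beta_4 (A_{Peer,t} - T_{i,t}) \\
& + \beta_5 \mathit{VAR}_i + \beta_6 \mathit{VAR}_i \times (A_{i,t} - T_{i,t}) + \beta_7 \mathit{VAR}_i \times (A_{Peer,t} - T_{i,t}) + \varepsilon,
\end{aligned} \tag{2}$$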

where VAR stands for one of our main explanatory variables (SDPERF, PCORR, DISTANT,

SIMILAR, or MATCH). We also include our control variables as well as year and peer group fixed

effects. In our main tests, we simultaneously estimate the main and moderating effects of both peer

group capacity to filter out noise and the importance of cooperation.
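
To illustrate how the interaction terms in model (2) translate into sensitivities that vary with the moderator, the sketch below builds the regressor vector for one observation and computes the implied sensitivities of target revisions to past own and past peer performance. Coefficients are referred to by their position in model (2) using the hypothetical labels `b0` to `b7`; the coefficient values are invented for illustration and are not estimates from our tables. Control variables and fixed effects are omitted.

```python
def design_row(own, achieve, peer, var):
    """Regressors of model (2) for one BU-year, in the order of the equation.

    own  = A(i,t) - T(i,t), peer = A(Peer,t) - T(i,t),
    achieve = ACHIEVE(i,t) (0 or 1), var = moderator VAR(i).
    """
    return [1.0, own, float(achieve), achieve * own, peer,
            var, var * own, var * peer]


def implied_sensitivities(coefs, var, achieve=0):
    """Partial effects of own and peer performance at a given moderator value."""
    b0, b1, b2, b3, b4, b5, b6, b7 = coefs
    own_sensitivity = b1 + b3 * achieve + b6 * var
    peer_sensitivity = b4 + b7 * var
    return own_sensitivity, peer_sensitivity


# Hypothetical coefficients, not taken from the reported results.
coefs = [0.0, 0.30, 0.0, -0.10, 0.40, 0.0, 0.05, -0.15]
at_mean = implied_sensitivities(coefs, var=0.0)
at_plus_one = implied_sensitivities(coefs, var=1.0)
```

With a standardized moderator, a positive sixth coefficient raises the weight on own performance and a negative seventh coefficient lowers the weight on peer performance as the moderator increases.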

5 Results

5.1 DESCRIPTIVE EVIDENCE

Table 1 presents descriptive statistics for our sample of 106 BU-year observations. The typical BU

had direct labor costs of €1.1 million (see SIZE unlogged) and sales of €2.8 million (untabulated).

Performance exceeded targets in terms of return on sales in forty-five percent of sample

observations. Median own performance (Ai,t – Ti,t) fell short of target by one percent of revenues

and the resulting median target revision (Ti,t+1 – Ti,t) was zero.

--- Insert Table 1 ---

Table 2 presents Pearson correlations among our main variables and yields two insights.

First, target revisions are positively associated with past own performance as well as past peer

performance. The association with past peer performance (0.46) is higher than the association with

past own performance (0.37). This is consistent with the assumption of RTS that information about

peer performance is useful for target-setting purposes. Second, we find that our three proxies for

the importance of cooperation reflect different dimensions of the potential for capacity sharing and

are not highly correlated. As discussed below, we take this into account when constructing an

overall index for the importance of cooperation.


5.2 VALIDATING MEASURES OF IMPORTANCE OF COOPERATION

As described in Section 3, cooperation allows BUs to pool their capacity resources with peers

which should have at least some of the following economic benefits. First, it should reduce the

need to maintain excess labor capacity and consequently also BU labor costs. Second, it should

reduce the usage of labor from outsourcing companies and other third-party providers. Third, it

should increase BU profitability because maintaining excess labor or hiring temporary labor from

third parties to accommodate unexpected demand is expensive.

We expect our measures of the importance of cooperation to be associated with the three

economic outcomes discussed above. We measure excess labor capacity as abnormally high labor

hours (SLACK LABOR), i.e., the residuals from a regression of BU direct labor cost on the average

hourly labor rate, sales, and all other control variables used in the main analysis. By construction,

these residuals cannot be explained by cross-sectional differences in wages, sales, or any other

measurable BU characteristics and should therefore be indicative of excess labor capacity. We

measure the usage of labor from third-party providers (THIRD PARTY) in terms of the labor hours

requested from outside providers as a percentage of the contractually available maximum at the

business group level. Finally, we measure profitability as BU profit margin (PROFITM), i.e.,

earnings scaled by revenues.

--- Insert Tables 3a and 3b ---

Table 3a shows the pairwise correlations among the three proxies for the importance of

cooperation and the three economic outcome variables. Table 3b tests whether some of these

associations are significant in multivariate regressions, which also include average hourly labor

rate, sales, GROWTH, GDP, and PRICE as control variables and year fixed effects. We do not

include peer group fixed effects in this validation analysis because between-group differences in

the importance of cooperation are associated with between-group differences in economic

outcomes. Our results are qualitatively similar but somewhat weaker in magnitude if we include

peer group fixed effects and thus eliminate this between-group variance.

Panel A of Table 3b shows that all three proxies for the importance of cooperation are

significantly associated with excess labor capacity as reflected in abnormally high labor hours. In

particular, the first proxy, DISTANT, is inversely related to the potential for capacity sharing and

we find that it is positively associated with SLACK LABOR (p<.01). In other words, BUs that are

further apart from their peers are more likely to maintain slack capacity in terms of excess

employment. We find significant negative associations between abnormal labor costs and the other

two measures, SIMILAR (p<.05) and MATCH (p<.01), that are increasing in the potential for

capacity sharing.

Panel B of Table 3b provides similar results regarding the usage of labor from third-party

providers. THIRD PARTY is increasing in DISTANT (p<.10) and decreasing in SIMILAR (p<.01)

and MATCH (p<.01). Finally, Panel C shows the implications for overall profitability as reflected

in BU margins. PROFITM is increasing in SIMILAR (p<.01) and the effects of DISTANT and

MATCH have the predicted signs even though the estimates do not quite reach conventional levels of

statistical significance.

We infer from these findings that our proxies for the importance of cooperation capture

economically meaningful differences in production functions. BUs that are close to their peers,

perform similar services, and experience peak demand at different times of the year than their peers

are in the best position to pool capacity resources and reap the benefits of low levels of excess

employment, low reliance on temporary labor from third parties, and to some extent also higher

average margins. It follows that BU cost structure depends not only on individual BU

characteristics but also on proximity and similarity to its peers.

The correlations presented in Table 3a can also be used to aggregate DISTANT, SIMILAR,

and MATCH into an overall index for the importance of cooperation and to test the validity of the

resulting formative construct (CINDEX). In particular, we follow Diamantopoulos and Winklhofer

[2001] and estimate a model of cooperation with three causes (possibly uncorrelated) and three

outcomes. We find evidence of a very good fit in that a χ²-test does not reject validity of the model

(χ²=6.43, p=0.38). Relying on this measurement model, we estimate CINDEX as a linear

combination of DISTANT, SIMILAR, and MATCH with optimal aggregation weights that

maximize the proportion of common (explained) variance behind the three economic outcomes of

cooperation. We use CINDEX as one of the measures of cooperation in our hypotheses tests. Using

an equally-weighted index yields qualitatively similar results.

5.3 BASELINE TARGET REVISION MODELS

Prior literature shows that target revisions depend on past own as well as past peer performance

(Leone and Rock [2002], Aranda et al. [2014]). Table 4 replicates these findings in the full sample

as well as in two separate subsamples of BUs with low/high performance relative to peers

(measured as return on sales below/above the business group median). The full sample results in

column (i) are consistent with prior studies in that there is evidence of both target ratcheting

(p<.01) and RTS (p<.01). In particular, we find an economically significant extent of RTS as

reflected in the estimates suggesting that a 10% increase in average peer earnings is associated

with a 3.6% increase in target earnings on average. We also find that the association between target

revisions and past own performance is largely driven by BUs that fail to meet their earnings

target—failing a target by a wide margin is associated with a significant target revision downward,

whereas exceeding own target by a wide margin does not necessarily lead to a significant target

revision upward.


The subsample analysis in columns (ii) and (iii) of Table 4 sheds more light on these

asymmetric target revisions. Consistent with prior studies (Aranda et al. [2014], Indjejikian et al.

[2014], Bol and Lill [2015]), we find that the sensitivity of target revisions to past own performance

depends on performance relative to peers. When high performers fail to meet their target, the

next-year target is strongly revised downward. However, when high performers exceed (an already

high) target, there is essentially no impact on the next-year target. This type of asymmetric target

revision also affects BUs with low relative performance but is much weaker in magnitude.

5.4 HYPOTHESES TESTS

The main focus of our study is the cross-sectional variation in the extent of target ratcheting and

RTS. Our hypotheses predict that the sensitivity of target revisions to past own performance and

past peer performance depends on peer group capacity to filter out noise and on the importance of

cooperation. In what follows, we first present our target revision models separately for each of our

measures of (i) peer group capacity to filter out noise (SDPERF and PCORR) and (ii) the

importance of cooperation (DISTANT, SIMILAR, MATCH). Subsequently, we test our hypotheses

using CINDEX as our main measure of the importance of cooperation because it has the greatest

power to detect the economic benefits of capacity sharing (see Section 5.2 validating our measures

of the importance of cooperation).

--- Insert Table 5a ---

Columns (i) and (ii) of Table 5a each include one of the measures of peer group capacity

to filter out noise. As expected, we find that they are positively associated with the sensitivity of

target revisions to past peer performance (p < .01 in both cases). We also find that the sensitivity

of target revisions to past own performance is negatively associated with SDPERF (p < .01). These

findings are consistent with the theory that target revisions put relatively more weight on peer

performance information when peer group capacity to filter out noise is high. They also suggest

that volatility of performance (SDPERF) and peer group quality (PCORR) effectively proxy for

the effects of noise filtering.

Column (iii) of Table 5a uses DISTANT as a proxy inversely related to the importance of

cooperation. We find that the sensitivity of target revisions to past peer performance is relatively

low (p < .01) and the sensitivity to past own performance is relatively high (p < .01) when a BU is

located close to its peers (DISTANT is low). This is consistent with the theory that target revisions

are relatively immune to favorable performance of peers in settings where the potential for capacity

sharing and thus the importance of cooperation is high. That said, proximity to peers may also

imply greater peer group capacity to filter out noise (see Table 2), which could attenuate the effects

of DISTANT on the sensitivity of target revisions to past own and past peer performance in

Column (iii).

Column (iv) of Table 5a uses BU similarity as an empirical proxy for the importance of

cooperation. High values of SIMILAR indicate that BUs within a business group perform similar

services and therefore have a greater potential for capacity sharing. We find that SIMILAR is

negatively associated with the sensitivity of target revisions to past peer performance (p < .01) and

positively associated with the sensitivity to past own performance (p < .10). These findings imply

that information about peer performance is incorporated into targets relatively more and

information about own performance relatively less when business groups have a highly diverse set

of activities which renders transfers of services and resources between BUs more difficult.

Conversely, business groups that are homogeneous in terms of services offered and resources used

put less emphasis on past peer performance and more emphasis on past own performance when

revising targets. Again, to the extent that greater similarity among peers makes peer performance

more informative about the future, these findings could be attenuated by noise-filtering motives.

Column (v) of Table 5a exploits variation in the extent to which a BU’s idle capacity

coincides with idle capacity of peers. At one extreme, if all BUs within a business group face peak

demand at the same time of the year, there is no potential for capacity sharing. At the other extreme,

the highest potential for capacity sharing occurs for the highest values of MATCH, i.e., when a BU

has slack capacity exactly when peers can use it. We find that when MATCH is high, the sensitivity

of target revisions to past peer performance is relatively low (p < .01) and the sensitivity to past

own performance is relatively high (p < .05).

--- Insert Table 5b ---

Table 5b presents the main tests of our hypotheses and extends the results in Table 5a in

two ways. First, as discussed in Section 5.2, we use CINDEX as an aggregate measure of the

importance of cooperation which combines all three proxies into an overall index as in

Diamantopoulos and Winklhofer [2001]. Column (i) of Table 5b shows that the effects of CINDEX

on the relative importance of past own and peer performance in target updating are qualitatively

similar to those in Columns (iii)-(v) of Table 5a. Second, we estimate the noise-filtering effects

predicted by H1a,b and the cooperation effects predicted by H2a,b simultaneously. Although

estimating multiple interaction effects in one model reduces the power of our tests, it takes into

account that some of our proxies for cooperation may also be related to noise filtering.

Consistent with H1a,b, Column (ii) shows that volatility of own performance (SDPERF)

reduces the sensitivity of target revisions to past own performance (p < .01) and increases the

sensitivity to past peer performance (p < .01). Column (iii) shows that peer group quality (PCORR)

increases the sensitivity to past peer performance (p < .01). The effect on the sensitivity to past

own performance has the predicted sign but is not statistically significant.

Columns (ii) and (iii) also provide strong support for H2a,b. In particular, controlling for

the noise-filtering effects discussed above, we find that the importance of cooperation (CINDEX)

increases the sensitivity of target revisions to past own performance (p < .01 in (ii) and p < .05 in

(iii)) and reduces the sensitivity to past peer performance (p < .01 in both columns). To assess the

magnitude of the effects, we compare the sensitivity of target revisions to past own and peer

performance at different values of CINDEX for the model reported in column (i) of Table 5b. For

example, the sensitivity estimates are 0.32 for past own performance and 0.43 for past peer

performance for an observation with average value of the index. In contrast, when the importance

of cooperation is high, as reflected by one standard deviation increase in the index, the sensitivity

estimates are 0.38 for past own and 0.28 for past peer performance. Thus, although we make no

predictions about the absolute size of the sensitivity coefficients, we provide evidence that they

depend to a great extent on the importance of cooperation.
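
The arithmetic behind this comparison can be reproduced directly from the four sensitivity estimates reported above. The short sketch below uses only those reported values; the variable names are hypothetical.

```python
# Sensitivity estimates reported in the text for column (i) of Table 5b.
own_at_mean, peer_at_mean = 0.32, 0.43    # average value of CINDEX
own_at_high, peer_at_high = 0.38, 0.28    # CINDEX one standard deviation higher

# Change in each sensitivity per one-standard-deviation increase in CINDEX.
own_shift = round(own_at_high - own_at_mean, 2)     # 0.06
peer_shift = round(peer_at_high - peer_at_mean, 2)  # -0.15

# Weight on peer performance relative to own performance.
ratio_at_mean = round(peer_at_mean / own_at_mean, 2)  # 1.34
ratio_at_high = round(peer_at_high / own_at_high, 2)  # 0.74
```

At the average value of the index, peer performance carries more weight than own performance, whereas one standard deviation above the average the ordering reverses.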

In conclusion, we find evidence consistent with RTS in that target revisions incorporate

information about peer performance. Nevertheless, there is also considerable cross-sectional

variation in the extent of RTS. In particular, we find that target revisions are relatively immune to

favorable performance of peers in settings where the potential for cooperation among BUs is high

and/or the capacity of peer groups to filter out noise is low.

5.5 ADDITIONAL EVIDENCE

We estimate several alternative specifications of our empirical models to assess their robustness

and to provide additional evidence on the determinants of RTS. First, we examine the validity of

the assumption that peer groups are comprised of BUs in the same business group. Alternatively,

peer groups could be comprised of units outside of a business group. We randomly generate 100

alternative peer groups for each BU so that none of its peers comes from the same business group

and reestimate a target updating model similar to the one presented in Column (i) of Table 4. We

find that using peers from the same business group (as in our main analysis) results in a higher R²

than in all of the estimations using randomly generated peer groups.7 Similarly, the coefficient

estimate pertaining to past performance of business group peers is higher than the corresponding

coefficient in any of the estimations involving random peers. Collectively, these results suggest

that our assumption that peer groups are comprised of BUs in the same business group is not overly

restrictive.
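
The random peer group construction can be sketched as follows. The text does not specify how many peers each random group contains, so the sketch assumes that a random group has as many members as the BU's actual peer group. The function name, the seed, and the BU labels are hypothetical.

```python
import random


def random_outside_peers(bu, group_of, n_draws=100, seed=0):
    """Draw alternative peer groups for `bu` from outside its business group.

    `group_of` maps each BU to its business group. Assumption: each random
    peer group has the same size as the BU's actual peer group.
    """
    rng = random.Random(seed)
    own_group = group_of[bu]
    n_peers = sum(1 for g in group_of.values() if g == own_group) - 1
    outsiders = sorted(b for b, g in group_of.items() if g != own_group)
    if n_peers > len(outsiders):
        raise ValueError("not enough BUs outside the business group")
    # Sampling without replacement within each draw.
    return [rng.sample(outsiders, n_peers) for _ in range(n_draws)]


# Hypothetical assignment of seven BUs to three business groups.
group_of = {"A1": "A", "A2": "A", "A3": "A",
            "B1": "B", "B2": "B",
            "C1": "C", "C2": "C"}
draws = random_outside_peers("A1", group_of)
```

Each draw can then be used to recompute average peer performance and reestimate the target updating model, which yields a reference distribution against which the fit obtained with actual business group peers can be compared.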

7 We do not estimate peer group fixed effects in these comparisons because they cannot be meaningfully defined for randomly generated peer groups that are different for each BU.

Second, we consider an alternative way to measure peer performance relative to target. Our

main analysis follows the approach of Aranda et al. [2014] assuming that target difficulty varies

across BUs and can be measured by comparing average peer performance to own targets. As an

alternative, we follow Casas-Arce et al. [2017] and assume that performance targets are fully

adjusted every period so that their difficulty is the same for all BUs. This approach implies that

target revisions only reflect new information about peer performance, which can therefore be

measured as average peer performance relative to peer targets, APeer,t – TPeer,t. We reestimate our

empirical models using this alternative measure of past peer performance and find qualitatively

similar results.

Finally, we examine whether the sensitivity of target revisions to past own performance

and past peer performance depends on peer group size (the number of BUs in a business group).

Although peer group size is not directly related to the importance of cooperation, it may affect

target updating for two reasons: (i) average performance of a larger peer group is likely to be less

volatile and therefore used more in target updating (Banker and Datar [1989]), (ii) the adverse

incentive effects of using information about peer performance are less pronounced in larger groups

(Bandiera et al. [2005]). Therefore, we expect that peer group size increases the relative weight on

past peer performance in target updating and reduces the relative weight on past own performance.

We find support for these predictions in that the interaction between peer group size and past own

performance is significantly negative (p < .01, untabulated) and the interaction between peer group

size and past peer performance is significantly positive (p < .10, untabulated).

6 Discussion and Conclusions

A large stream of prior literature examines the extent to which executive compensation depends

on performance of industry peers. Theory predicts that the use of information about peer

performance in compensation contracts has benefits but also costs in terms of discouraging

cooperation among peers. Numerous studies find support for the benefits yet there is hardly any

evidence pertaining to the costs, in part because performance evaluation and compensation data is

rarely available at lower organizational levels where teamwork and cooperation are particularly

important (Feintzeig [2015]).

Our study fills this void by providing novel evidence consistent with the theory that the

use of peer performance information implies a trade-off between the benefits of filtering out noise

from performance evaluations and the costs of discouraging cooperation. In particular, we collect

field data from a global industrial services company and examine how the importance of

cooperation affects relative target setting. Consistent with prior work, we find that the extent to

which target revisions depend on peer performance increases in peer group capacity to filter out

noise. At the same time, we also find that information about peer performance is used less when it

is important for BU managers to cooperate and voluntarily share capacity resources.

Our findings caution that the incentive implications of incorporating information about

peer performance into compensation contracts are more complex than recognized in most prior

empirical studies. The noise-filtering benefits may dominate the costs in settings where top

executives have limited interaction with their industry peers. However, it would be misleading to

conclude from the evidence on top executive compensation that the use of information about peer

performance strengthens incentives at other levels of the organizational hierarchy. Most peer

performance comparisons take place at lower levels where the assumption of a limited interaction

among peers does not apply. Our study shows that in such settings concerns about incentives to

cooperate may be equally or more important than the noise-filtering benefits of using information

about peer performance.

Our findings also relate to prior literature on managerial incentives in multi-period settings.

Although many analytical studies show that ex ante commitment to long-term contracts

strengthens managerial incentives (Laffont and Tirole [1993], Ederhof, Rajan, and Reichelstein

[2011]), it remains unclear whether such commitment is feasible in practice. Our results are

consistent with the use of long-term contracts and target-setting practices that encourage

cooperation through a credible commitment not to revise targets upward following improvements in

peer performance.

We acknowledge that our findings also have limitations. First, a key theoretical construct

of our study, the importance of cooperation, cannot easily be measured. To address concerns about

measurement error, we use several different measures of the importance of cooperation and show

that our results are robust. Second, our findings need not generalize to other settings. We do not

make any conclusive statements about the importance of cooperation outside the setting under

study or about the relative magnitude of the costs and benefits of relative target setting. Our aim

is to provide evidence that concerns about a lack of cooperation affect target setting and the extent

to which managerial compensation depends on peer performance.

References

AGGARWAL, R. K., AND A. A. SAMWICK. "Executive compensation, strategic competition, and relative performance evaluation: Theory and evidence." Journal of Finance 54 (1999): 1999–2043.

ALBUQUERQUE, A. "Peer firms in relative performance evaluation." Journal of Accounting & Economics 48 (2009): 69–89.

ANDERSON, S. W., H. C. DEKKER, AND K. L. SEDATOLE. "An empirical examination of goals and performance-to-goal following the introduction of an incentive bonus plan with participative goal-setting." Management Science 56 (2010): 90–109.

ANTLE, R., AND A. SMITH. "An empirical investigation of the relative performance evaluation of corporate executives." Journal of Accounting Research 24 (1986): 1–39.

ARANDA, C., J. ARELLANO, AND A. DAVILA. "Ratcheting and the role of relative target setting." The Accounting Review 89 (2014): 1197–1226.

BALDENIUS, T., AND S. REICHELSTEIN. "External and internal pricing in multidivisional firms." Journal of Accounting Research 44 (2006): 1–28.

BANDIERA, O., I. BARANKAY, AND I. RASUL. "Social preferences and the response to incentives: Evidence from personnel data." The Quarterly Journal of Economics 120 (2005): 917–962.

BANDIERA, O., I. BARANKAY, AND I. RASUL. "Social connections and incentives in the workplace: Evidence from personnel data." Econometrica 77 (2009): 1047–1094.

BANKER, R. D., AND S. M. DATAR. "Sensitivity, precision, and linear aggregation of signals for performance evaluation." Journal of Accounting Research 27 (1989): 21–39.

BANKER, R. D., R. HUANG, AND R. NATARAJAN. "Incentive contracting and value relevance of earnings and cash flows." Journal of Accounting Research 47 (2009): 647–678.

BOL, J. C., AND J. LILL. "Performance target revisions in incentive contracts: Do information and trust reduce ratcheting and the ratchet effect?" The Accounting Review 90 (2015): 1755–1778.

BOUWENS, J., E. CARDINAELS, AND J. ZHANG. "Principals and their car dealers: What do targets tell about their relation?" Working paper, University of Amsterdam, 2016.

BOUWENS, J., AND P. KROOS. "Target ratcheting and effort reduction." Journal of Accounting & Economics 51 (2011): 171–185.

BUSHMAN, R. M., R. J. INDJEJIKIAN, AND A. SMITH. "Aggregate performance measures in business unit manager compensation: The role of intrafirm interdependencies." Journal of Accounting Research 33 (1995): 101–128.

CAMPBELL, D. "Nonfinancial performance measures and promotion-based incentives." Journal of Accounting Research 46 (2008): 297–332.

CASAS-ARCE, P., M. HOLZHACKER, M. D. MAHLENDORF, AND M. MATĚJKA. "Relative performance evaluation and the ratchet effect." Contemporary Accounting Research (2017): forthcoming.

CASAS-ARCE, P., AND F. A. MARTINEZ-JEREZ. "Relative performance compensation, contests, and dynamic incentives." Management Science 55 (2009): 1306–1320.

CHARNESS, G., D. MASCLET, AND M. C. VILLEVAL. "The dark side of competition for status." Management Science 60 (2014): 38–55.

CHARNESS, G., AND M.-C. VILLEVAL. "Cooperation and competition in intergenerational experiments in the field and the laboratory." The American Economic Review 99 (2009): 956–978.

CHEN, K.-P. "Sabotage in promotion tournaments." Journal of Law, Economics, & Organization 19 (2003): 119–140.

CORE, J. E., W. R. GUAY, AND R. E. VERRECCHIA. "Price versus non-price performance measures in optimal CEO compensation contracts." The Accounting Review 78 (2003): 957–981.

DATAR, S., S. C. KULP, AND R. A. LAMBERT. "Balancing performance measures." Journal of Accounting Research 39 (2001): 75–92.

DIAMANTOPOULOS, A., AND H. M. WINKLHOFER. "Index construction with formative indicators: An alternative to scale development." Journal of Marketing Research 38 (2001): 269–277.

EDERHOF, M., M. V. RAJAN, AND S. REICHELSTEIN. "Discretion in managerial bonus pools." Foundations and Trends in Accounting 5 (2011): 243–316.

FEHR, E., AND S. GÄCHTER. "Cooperation and punishment in public goods experiments." The American Economic Review 90 (2000): 980–994.

FEINTZEIG, R. "The trouble with grading employees." The Wall Street Journal, http://www.wsj.com/articles/the-trouble-with-grading-employees-1429624897 (2015).

GARVEY, G., AND T. MILBOURN. "Incentive compensation when executives can hedge the market: Evidence of relative performance evaluation in the cross section." Journal of Finance 58 (2003): 1557–1581.

GIBBONS, R., AND K. J. MURPHY. "Relative performance evaluation for chief executive officers." Industrial & Labor Relations Review 43 (1990): 30–51.

GONG, G., L. Y. LI, AND J. Y. SHIN. "Relative performance evaluation and related peer groups in executive compensation contracts." The Accounting Review 86 (2011): 1007–1043.

HARBRING, C., AND B. IRLENBUSCH. "How many winners are good to have?: On tournaments with sabotage." Journal of Economic Behavior & Organization 65 (2008): 682–702.

HARBRING, C., AND B. IRLENBUSCH. "Sabotage in tournaments: Evidence from a laboratory experiment." Management Science 57 (2011): 611–627.

HOLMSTRÖM, B. "Moral hazard and observability." Bell Journal of Economics 10 (1979): 74–91.

HOLMSTRÖM, B. "Moral hazard in teams." Bell Journal of Economics 13 (1982): 324–340.

ICKES, B. W., AND L. SAMUELSON. "Job transfers and incentives in complex organizations: Thwarting the ratchet effect." The Rand Journal of Economics 18 (1987): 275–286.

INDJEJIKIAN, R. J., M. MATĚJKA, K. A. MERCHANT, AND W. A. VAN DER STEDE. "Earnings targets and annual bonus incentives." The Accounting Review 89 (2014): 1227–1258.

ITTNER, C. D., D. F. LARCKER, AND M. V. RAJAN. "The choice of performance measures in annual bonus contracts." The Accounting Review 72 (1997): 231–255.

JANAKIRAMAN, S. N., R. A. LAMBERT, AND D. F. LARCKER. "An empirical investigation of the relative performance evaluation hypothesis." Journal of Accounting Research 30 (1992): 53–69.

JAYARAMAN, S., T. MILBOURN, AND H. SEO. "Product market peers and relative performance evaluation." Working paper, University of Rochester, 2015.

JENSEN, M. C., AND K. J. MURPHY. "Performance pay and top-management incentives." Journal of Political Economy 98 (1990): 225–264.

JOH, S. W. "Strategic managerial incentive compensation in Japan: Relative performance evaluation and product market collusion." The Review of Economics and Statistics 81 (1999): 303–313.

JORDAN, W. C., AND S. C. GRAVES. "Principles on the benefits of manufacturing process flexibility." Management Science 41 (1995): 577–594.

KIM, S., AND J. Y. SHIN. "Executive bonus target ratcheting: Evidence from the new executive compensation disclosure rules." Contemporary Accounting Research 34 (2016): 1843–1879.

LAFFONT, J. J., AND J. TIROLE. A theory of incentives in procurement and regulation. Cambridge, Massachusetts: The MIT Press, 1993.

LAZEAR, E. P. "Pay equality and industrial politics." Journal of Political Economy 97 (1989): 561–580.

LEONE, A. J., AND S. ROCK. "Empirical tests of budget ratcheting and its effect on managers' discretionary accrual choices." Journal of Accounting & Economics 33 (2002): 43–67.

MATĚJKA, M., AND K. RAY. "Balancing difficulty of performance targets: Theory and evidence." Review of Accounting Studies (2017): forthcoming.

MATSUMURA, E. M., AND J. Y. SHIN. "An empirical analysis of an incentive plan with relative performance measures: Evidence from a postal service." The Accounting Review 81 (2006): 533–566.

MEYER, M. A., AND J. VICKERS. "Performance comparisons and dynamic incentives." Journal of Political Economy 105 (1997): 547–581.

MILGROM, P., AND J. ROBERTS. Economics, organization and management. Englewood Cliffs: Prentice Hall, 1992.

MURPHY, K. J. "Performance standards in incentive contracts." Journal of Accounting & Economics 30 (2001): 245–278.

RAJGOPAL, S., T. SHEVLIN, AND V. ZAMORA. "CEOs' outside employment opportunities and the lack of relative performance evaluation in compensation contracts." Journal of Finance 61 (2006): 1813–1844.

VRETTOS, D. "Are relative performance measures in CEO incentive contracts used for risk reduction and/or for strategic interaction?" The Accounting Review 88 (2013): 2179–2212.

WEITZMAN, M. L. "The ratchet principle and performance incentives." Bell Journal of Economics 11 (1980): 302–308.

FIGURE 1
Monthly Revenue and Capacity Utilization – Business Unit Example

[Figure not reproduced. First panel: Revenue [€] by month, Jan to Dec; series: Actual Revenue, Target Revenue. Second panel: Hours by month, Jan to Dec; legend: Actual Billed Available Capacity Unused.]

TABLE 1
Descriptive Statistics

                   Obs.    Mean  St. Dev.  25th Pct.  Median  75th Pct.
Ti,t+1 – Ti,t       106   -0.01      0.05      -0.03    0.00       0.01
Ai,t – Ti,t         106    0.02      0.15      -0.08   -0.01       0.08
ACHIEVEi,t          106    0.45      0.50       0.00    0.00       1.00
APeer,t – Ti,t      106    0.02      0.13      -0.06    0.00       0.07
SDPERF              106    0.11      0.07       0.06    0.09       0.14
PCORR               106    0.14      0.29      -0.06    0.19       0.35
DISTANT             106    0.58      0.61       0.13    0.37       0.90
SIMILAR             106    0.44      0.26       0.24    0.30       0.62
MATCH               106    0.14      0.15       0.04    0.10       0.19
SIZE                106    7.04      1.18       6.42    6.97       7.60
GROWTH              106    0.27      0.91      -0.33    0.17       0.71
GDP                 106    2.58      3.01       2.96    2.97       3.11

Ti,t+1 – Ti,t —annual earnings target revision scaled by revenues of BU i. Ai,t – Ti,t—own performance, i.e., the difference between actual and targeted earnings scaled by revenues. ACHIEVEi,t—indicator variable for performance meeting or exceeding its annual target. APeer,t – Ti,t—peer performance, i.e., the difference between average earnings of BUs in the same business group and own earnings target scaled by revenues. SDPERF—standard deviation in BU actual earnings scaled by revenues (own performance) over 36 months. PCORR—correlation between BU own performance and average performance of its peers over 36 months. DISTANT—geographical distance between a BU and point of gravity of its peers in the same business group in 1,000 kilometers. SIMILAR—business group level Hirschman-Herfindahl Index of revenues in industry-job type clusters. MATCH—proportion of business group peers at capacity during the months BU i has slack capacity. SIZE—logarithm of BU inflation-adjusted annual direct labor costs converted from local currency to € thousands. GROWTH—logarithm of BU growth in inflation-adjusted revenue. GDP—GDP per capita growth for country in which the BU is located.
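The scaled measures defined in the note above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`scaled_gap`, `achieve`, `hhi`) and all input numbers are hypothetical, and it assumes that own earnings, peer earnings, and targets are expressed in the same currency units and that every gap is scaled by the revenue of BU i. The Hirschman-Herfindahl Index is computed in its standard form, as the sum of squared revenue shares.

```python
from statistics import mean


def scaled_gap(a, b, revenue):
    """Return (a - b) / revenue, the scaling used for targets and performance."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (a - b) / revenue


def achieve(actual, target):
    """Indicator equal to 1 if performance meets or exceeds the target, else 0."""
    return 1 if actual >= target else 0


def hhi(cluster_revenues):
    """Hirschman-Herfindahl Index: sum of squared revenue shares across clusters."""
    total = sum(cluster_revenues)
    if total <= 0:
        raise ValueError("total revenue must be positive")
    return sum((r / total) ** 2 for r in cluster_revenues)


# Hypothetical inputs for one business unit (not data from the paper).
revenue = 1000.0
target_now, target_next, actual = 100.0, 90.0, 120.0
peer_actuals = [110.0, 130.0]  # assumed earnings of peers in the same business group

target_revision = scaled_gap(target_next, target_now, revenue)           # Ti,t+1 – Ti,t
own_performance = scaled_gap(actual, target_now, revenue)                # Ai,t – Ti,t
peer_performance = scaled_gap(mean(peer_actuals), target_now, revenue)   # APeer,t – Ti,t

print(target_revision, own_performance, achieve(actual, target_now), peer_performance)
print(hhi([50.0, 50.0]))  # two equally sized clusters
```

Whether peer earnings are scaled before or after averaging is not stated in the note, so the peer-performance line should be read as one possible interpretation.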

TABLE 2
Pearson Correlations

                      (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)       (10)      (11)      (12)
(1) Ti,t+1 – Ti,t
(2) Ai,t – Ti,t       0.37***
(3) ACHIEVEi,t        0.39***   0.79***
(4) APeer,t – Ti,t    0.46***   0.61***   0.42***
(5) SDPERF            0.11     -0.01     -0.03      0.18*
(6) PCORR            -0.21**    0.18*     0.03      0.22**   -0.15
(7) DISTANT           0.09      0.08      0.06      0.16*     0.27***  -0.21**
(8) SIMILAR           0.18*     0.51***   0.28***   0.59***   0.12      0.21**   -0.12
(9) MATCH             0.37***   0.16*     0.12      0.08     -0.01     -0.12     -0.13      0.21**
(10) SIZE            -0.14     -0.11     -0.08     -0.15     -0.54***   0.02     -0.18*     0.01     -0.09
(11) GROWTH          -0.06      0.35***   0.17*     0.33***   0.11      0.18*     0.05      0.22**   -0.10     -0.05
(12) GDP              0.12      0.13      0.18*     0.15     -0.21**   -0.01     -0.08     -0.11      0.04     -0.13     -0.04
(13) PRICE            0.13      0.18*     0.07      0.23**    0.34***  -0.01      0.16*     0.15      0.12      0.01      0.19*    -0.51***

***,**,* indicate significance at the 1%, 5%, and 10% levels, respectively. PRICE—inflation-adjusted hourly charges to customers in the business group’s country converted from local currency to €. See Table 1 for all other variable definitions.

TABLE 3A
Importance of Cooperation: Validation Analysis

                    (1)       (2)       (3)       (4)       (5)
(1) DISTANT
(2) SIMILAR        -0.12
(3) MATCH          -0.13      0.21**
(4) SLACK LABOR     0.22**   -0.22**   -0.26***
(5) THIRD PARTY     0.19**   -0.27***  -0.26***   0.13
(6) PROFITM        -0.09      0.31***   0.18*    -0.31***  -0.27***

***,**,* indicate significance at the 1%, 5%, and 10% levels, respectively. DISTANT—geographical distance between a BU and point of gravity of its peers in the same business group in 1,000 kilometers. SIMILAR—business group level Hirschman-Herfindahl Index of revenues in industry-job type clusters. MATCH—proportion of business group peers at capacity during the months BU i has slack capacity. SLACK LABOR—abnormal labor hours, i.e., the residuals from a regression of BU direct labor cost on the average hourly labor rate, sales, GROWTH, GDP, PRICE, and year fixed effects. THIRD PARTY—labor hours requested from outside providers as a percentage of the contractually available maximum at the business group level. PROFITM—BU profit margin.

TABLE 3B
Importance of Cooperation: Validation Analysis

                                     VAR stands for
                        (i) DISTANT    (ii) SIMILAR   (iii) MATCH

Panel A. Dependent Variable: SLACK LABOR
VAR                      497.46***     -1471.15**     -2445.80***
                        (3.02)         (2.50)         (3.48)
Control Variables       Yes            Yes            Yes
Year Fixed Effects      Yes            Yes            Yes
R2                       0.05           0.06           0.07

Panel B. Dependent Variable: THIRD PARTY
VAR                      0.02*         -0.13***       -0.20***
                        (1.89)         (3.92)         (4.18)
Control Variables       Yes            Yes            Yes
Year Fixed Effects      Yes            Yes            Yes
R2                       0.36           0.43           0.43

Panel C. Dependent Variable: PROFITM
VAR                     -0.03           0.21***        0.13
                        (1.42)         (3.33)         (1.56)
Control Variables       Yes            Yes            Yes
Year Fixed Effects      Yes            Yes            Yes
R2                       0.19           0.29           0.19

***,**,* indicate significance at the 1%, 5%, and 10% levels, respectively, using robust standard errors. Corresponding two-tailed t-values are reported in parentheses. Intercepts are included in estimation but untabulated. The estimations in Panels A and C use the full sample of 106 observations. Six missing values of THIRD PARTY reduce the sample size to 100 in Panel B. See Table 3A for variable definitions.

TABLE 4
Target Revisions: Baseline Models

Dependent Variable: Ti,t+1 – Ti,t

                                                 Performance Relative to Peers
                              (i) Full Sample    (ii) High       (iii) Low
Constant                      -0.05               0.04            0.09
                              (0.81)             (0.78)          (0.63)
Ai,t – Ti,t                    0.29***            0.91***         0.29**
                              (3.98)             (2.94)          (2.66)
ACHIEVEi,t                     0.01               0.00            0.03
                              (1.03)             (0.20)          (1.31)
ACHIEVEi,t · (Ai,t – Ti,t)    -0.32***           -0.94***        -0.47**
                              (2.91)             (2.89)          (2.28)
APeer,t – Ti,t                 0.36***
                              (4.66)
Control Variables             Yes                Yes             Yes
Peer Group Fixed Effects      Yes                Yes             Yes
Year Fixed Effects            Yes                Yes             Yes
R2                             0.66               0.83            0.61
Sample size                    106                50              56

***,**,* indicate significance at the 1%, 5%, and 10% levels, respectively, using robust standard errors. Corresponding two-tailed t-values are reported in parentheses. See Table 1 for variable definitions.
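As a reading aid for the interaction term in Table 4, the short Python sketch below adds the coefficient on own performance to the coefficient on its interaction with ACHIEVE. It assumes the standard interpretation of a linear model with an indicator interaction, so the implied sensitivity of target revisions to own performance is the own-performance coefficient when the target is missed and the sum of the two coefficients when it is met. The function name `own_performance_slope` is hypothetical; the point estimates are copied from the table, and the sketch says nothing about the statistical significance of the sums.

```python
def own_performance_slope(b_own, b_interaction, achieve):
    """Implied slope on own performance for ACHIEVE = 0 (missed) or 1 (met)."""
    return b_own + b_interaction * achieve


# Point estimates from Table 4: (own performance, ACHIEVE x own performance).
table4 = {
    "(i) Full Sample": (0.29, -0.32),
    "(ii) High": (0.91, -0.94),
    "(iii) Low": (0.29, -0.47),
}

for column, (b_own, b_inter) in table4.items():
    missed = round(own_performance_slope(b_own, b_inter, 0), 2)
    met = round(own_performance_slope(b_own, b_inter, 1), 2)
    print(f"{column}: target missed {missed}, target met {met}")
```

Under this reading, the full-sample estimates imply a slope of 0.29 when the target is missed and 0.29 - 0.32 = -0.03 when it is met.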

TABLE 5A
Cross-Sectional Variation in Target Revisions

                                                       VAR stands for
                              (i) SDPERF   (ii) PCORR   (iii) DISTANT  (iv) SIMILAR  (v) MATCH
Constant                      -0.10*       -0.05        -0.07          -0.04          0.00
                              (1.89)       (1.05)       (1.33)         (0.82)        (0.00)
Ai,t – Ti,t                    0.52***      0.28***      0.35***        0.22***       0.28***
                              (5.23)       (4.13)       (5.43)         (2.90)        (3.97)
ACHIEVEi,t                    -0.00         0.00         0.01**         0.02**        0.01
                              (0.02)       (0.47)       (2.01)         (2.22)        (0.67)
ACHIEVEi,t · (Ai,t – Ti,t)    -0.27**      -0.30***     -0.22**        -0.38***      -0.44***
                              (2.51)       (2.99)       (2.36)         (3.92)        (4.29)
APeer,t – Ti,t                 0.17*        0.38***      0.23***        0.73***       0.47***
                              (1.85)       (5.75)       (3.39)         (5.99)        (5.84)
VAR                            0.11        -0.07***     -0.02**         0.11***       0.01
                              (1.56)       (3.52)       (2.44)         (2.77)        (0.41)
VAR · (Ai,t – Ti,t)           -1.19***     -0.04        -0.20***        0.17*         0.40**
                              (2.85)       (0.65)       (5.54)         (1.72)        (2.01)
VAR · (APeer,t – Ti,t)         1.22***      0.21***      0.32***       -0.73***      -0.64***
                              (2.95)       (2.86)       (4.53)         (4.28)        (3.07)
Control Variables             Yes          Yes          Yes            Yes           Yes
Peer Group Fixed Effects      Yes          Yes          Yes            Yes           Yes
Year Fixed Effects            Yes          Yes          Yes            Yes           Yes
R2                             0.72         0.73         0.80           0.76          0.73
Sample size                    106          106          106            106           106

***,**,* indicate significance at the 1%, 5%, and 10% levels, respectively, using robust standard errors. Corresponding two-tailed t-values are reported in parentheses. See Table 1 for variable definitions.

TABLE 5B
Cross-Sectional Variation in Target Revisions

                                 Predicted               VAR stands for
                                 Sign        (i)          (ii) SDPERF   (iii) PCORR
Constant                                     -0.03        -0.08**       -0.04
                                             (0.74)       (2.15)        (1.16)
Ai,t – Ti,t                                   0.25***      0.45***       0.24***
                                             (3.31)       (4.95)        (3.72)
ACHIEVEi,t                                    0.01*        0.01          0.01
                                             (1.94)       (0.78)        (1.29)
ACHIEVEi,t · (Ai,t – Ti,t)                   -0.44***     -0.42***      -0.40***
                                             (4.19)       (4.36)        (3.96)
APeer,t – Ti,t                                0.58***      0.42***       0.58***
                                             (6.17)       (4.29)        (7.16)
VAR                                                        0.11**       -0.03**
                                                          (2.01)        (2.49)
VAR · (Ai,t – Ti,t)              H1a: –                   -1.10***      -0.06
                                                          (3.78)        (1.09)
VAR · (APeer,t – Ti,t)           H1b: +                    0.86***       0.21***
                                                          (2.88)        (2.75)
CINDEX                                        0.16         0.25**        0.15
                                             (1.17)       (2.07)        (1.39)
CINDEX · (Ai,t – Ti,t)           H2a: +       1.13**       1.16***       0.97**
                                             (2.62)       (3.06)        (2.40)
CINDEX · (APeer,t – Ti,t)        H2b: –      -2.63***     -2.18***      -2.38***
                                             (3.82)       (4.18)        (3.68)
Control Variables                            Yes          Yes           Yes
Peer Group Fixed Effects                     Yes          Yes           Yes
Year Fixed Effects                           Yes          Yes           Yes
R2                                            0.79         0.83          0.81
Sample size                                   106          106           106

***,**,* indicate significance at the 1%, 5%, and 10% levels, respectively, using robust standard errors. Corresponding two-tailed t-values are reported in parentheses. CINDEX: an overall index for the importance of cooperation. See Table 1 for all other variable definitions.
