Relative Target Setting and Cooperation
MARTIN HOLZHACKER
Michigan State University
STEPHAN KRAMER
Erasmus University Rotterdam
MICHAL MATĚJKA*
Arizona State University
NICK HOFFMEISTER
Independent
February 2018
* Corresponding author: PO Box 873606, Tempe, AZ 85287-3606. E-mail: [email protected].
Acknowledgements: We thank Martin Artz (discussant), Alexander Brüggen (discussant), Will Demere, Christoph Feichter, Isabella Grabner, Katlijn Haesebrouck, Joanna Ho, Sylvia Hsingwen Hsu (discussant), Ranjani Krishnan, Peter Kroos, Judith Künneke, Edith Leung, Frank Moers, Mathijs van Peteghem, Stefan Reichelstein, Marcel van Rinsum, Naomi Soderstrom and workshop participants at KU Leuven, Maastricht University, Mannheim University, Queen’s University, University of California, Irvine, the 2017 AAA MAS meeting, the 2017 ACMAR conference, the 2016 EIASM conference on New Directions in Management Accounting Research, the Erasmus University brownbag seminar, and the Michigan State University brownbag seminar for comments and suggestions.
ABSTRACT
A large stream of work on relative performance evaluation highlights the benefits of using
information about peer performance in contracting. In contrast, the potential costs of discouraging
cooperation among peers have received much less attention. The purpose of our study is to examine
how the importance of cooperation affects the use of information about peer performance in target
setting, also known as relative target setting. Specifically, we use data from an industrial services
company where business unit managers need to share specialized equipment and staff with their
peers to manage bottlenecks in their capacity. We construct several empirical proxies for the costs
and benefits of information about peer performance and examine their effects on target setting. We
find robust evidence that the sensitivity of target revisions to past peer performance is higher when
peer group performance has greater capacity to filter out noise but lower when the importance of
cooperation among peers is greater.
1 Introduction
A large stream of literature on relative performance evaluation (RPE) provides evidence that
information about peer performance is used to ex post filter out noise from compensation decisions
(Antle and Smith [1986], Janakiraman, Lambert, and Larcker [1992], Albuquerque [2009]). In
addition, several recent studies (Aranda, Arellano, and Davila [2014], Bol and Lill [2015]) show
that information about peer performance can also be used to ex ante revise performance targets, a
practice referred to as relative target setting (RTS). In contrast, there is relatively little empirical
evidence on the costs of using information about peer performance in compensation contracts even
though the theory predicts dysfunctional incentive effects in settings where managers can
strategically interact with peers and reduce their performance (Lazear [1989], Gibbons and
Murphy [1990]).1
Prior studies typically examine compensation contracts of top executives in large public
companies where the potential for cooperation with peers is limited (e.g., by antitrust law) and the
costs of using information about peer performance in contracting may be negligible. However,
information about peer performance is often used at lower organizational levels for performance
evaluation of managers who voluntarily share knowledge and tangible resources with their peers
(Matsumura and Shin [2006], Casas-Arce and Martinez-Jerez [2009]). In those settings,
facilitating cooperation is at least as important as filtering out noise from performance
evaluations. In other words, what we know about incentive contracts of top executives may provide
1 There are some experimental studies suggesting that RPE encourages sabotage and collusion (Bandiera, Barankay, and Rasul [2009], Charness and Villeval [2009], Harbring and Irlenbusch [2011]). There is also some evidence that executive compensation is less sensitive to peer performance when firms prefer to soften product market competition (Aggarwal and Samwick [1999], Vrettos [2013]).
only partial insights about the relative costs and benefits of using information about peer
performance to evaluate managers at lower organizational levels.
The theoretical literature suggests that the use of information about peer performance
reflects a trade-off between the benefits of motivating individual effort and the costs of
discouraging cooperation (Milgrom and Roberts [1992]).2 On the one hand, RPE filters out noise
from performance evaluation and thus reduces the costs of incentive provision (Holmström [1979],
Holmström [1982]). Similar noise-filtering benefits also apply to RTS as long as shocks to
performance are persistent and correlated among peers (Casas-Arce et al. [2017]). An additional
benefit of RTS is that it reduces sensitivity of targets to past own performance and consequently
alleviates the ratchet effect (Weitzman [1980], Meyer and Vickers [1997]). On the other hand,
RPE and RTS reduce incentives to cooperate because helping a peer reduces own compensation
(either directly as in the case of RPE contracts or indirectly through increased targets in RTS
contracts). Our empirical analysis is one of the first to examine the trade-off between the noise-
filtering benefits of peer performance information and the costs of discouraging cooperation.
In particular, we collect data on performance evaluation of business unit (BU) managers in
a large global company where the importance of cooperation among BUs varies depending on their
location and specialization. The company provides maintenance and certification services for oil
wells, pipelines, refineries, and other energy infrastructure. The demand for these services
fluctuates widely and BU managers often deal with capacity bottlenecks such as limited
2 Information about peer performance can be incorporated into compensation contracts in two ways (Meyer and Vickers [1997], Casas-Arce, Holzhacker, Mahlendorf, and Matějka [2017]). First, RPE contracts make compensation contingent on own and peer performance observed at the end of a period so that compensation is negatively associated with peer performance. Second, RTS contracts make compensation contingent on performance relative to a beginning-of-period target which is based on past own and peer performance. Targets are positively associated with past peer performance, which also results in a negative association between compensation and peer performance.
availability of certified technicians or highly-specialized equipment (pipeline monitoring robots,
X-ray inspection systems, etc.). The least costly way for BUs to address these bottlenecks is to
share capacity with one another. However, capacity sharing is voluntary and may not always be
feasible.
We use several empirical proxies to capture variation in peer group capacity to filter out
noise and in the importance of cooperation. We expect that peer group capacity to filter out noise
is high when (i) standard deviation in BU own performance (need for noise filtering) is high and
(ii) correlation between own and peer performance (peer group quality) is high. As for our
cooperation proxies, we exploit variation in the extent to which BUs can benefit from personnel
and equipment sharing. We expect that the importance of cooperation is high when
(i) geographical distance to peers is low, (ii) similarity between own and peer services is high, or
(iii) resource availability is high in the sense that peer demand for capacity occurs during months
when a unit has capacity available to share. We extensively validate alternative measures of the
importance of cooperation based on these three proxies. For example, we show that cooperation
among peers is associated with lower excess labor costs, which suggests that the ability to borrow
personnel from peers reduces the need to maintain own slack capacity as a buffer for unexpected
demand.
Our empirical models reflect that BU managers’ compensation in our setting is contingent
on their performance relative to ex ante targets based on past BU and peer performance with little
or no ex post adjustments. Therefore, we follow Aranda et al. [2014] and specify a model of target
revisions as a function of past own and past peer performance. We estimate this RTS model using
106 BU-year observations of actual and targeted earnings. Using this model, we replicate several
findings from prior literature on target setting (Indjejikian, Matějka, Merchant, and Van der Stede
[2014], Bol and Lill [2015]). For example, we show that when top-performing BU managers fail
to meet their target, the next-period target is revised downward. In contrast, failure to meet a target
is associated with only a weak target revision downward for BU managers performing poorly
relative to their peers.
For our main tests, we examine the extent to which the sensitivity of target revisions to past
own and peer performance varies as a function of our proxies for peer group capacity to filter out
noise and the importance of cooperation. We find robust empirical support for our predictions.
First, target revisions are less sensitive to past own performance and/or more sensitive to past peer
performance when peer group capacity to filter out noise is high. Second, we find the opposite
when the importance of cooperation is high—target revisions are more sensitive to past own
performance and/or less sensitive to past peer performance when BUs can more effectively share
resources with their peers because of low geographical distance to peer BUs, high business
similarity, or high resource availability.
These results extend prior literature which almost exclusively focuses on the economic
benefits of information about peer performance and largely ignores its potential costs (Gong, Li,
and Shin [2011], Vrettos [2013]). We provide novel empirical evidence consistent with the theory
that information about peer performance can have adverse incentive effects, especially in settings
where peers need to cooperate. This evidence improves our understanding of performance
evaluation of lower-level managers who are often benchmarked against peers within the same
organization. It also highlights the importance of information about peer performance for ex ante
target revisions rather than for ex post compensation adjustments (Aranda et al. [2014], Bol and
Lill [2015]).
2 Prior Literature and Hypotheses
2.1 PRIOR LITERATURE
Gibbons and Murphy [1990] comprehensively discuss the incentive implications of compensation
contracts contingent on information about peer performance. On the one hand, such contracts
strengthen incentives by insulating managers from common shocks that also affect their peer
performance. On the other hand, the use of information about peer performance gives rise to
dysfunctional incentives if managers can take actions that reduce peer performance. The latter
effect arises because compensation contracts filter out noise by putting a negative weight on
average peer performance when there is positive correlation between shocks to own and peer
performance (Holmström [1979]). Consequently, managers can increase their compensation by
taking actions that lower peer performance (sabotage or collusion) or by failing to take actions that
increase peer performance (refusal to cooperate).
A large stream of RPE literature empirically examines the use of information about peer
performance in CEO compensation using various measures of own and industry performance. The
noise-filtering hypothesis implies a negative effect of industry performance on compensation. The
early evidence is somewhat mixed. Some studies find support for the hypothesis (Gibbons and
Murphy [1990], Janakiraman et al. [1992]), other studies provide supportive evidence only in
specific subsamples (Garvey and Milbourn [2003], Rajgopal, Shevlin, and Zamora [2006]), and
yet other studies find no support (Jensen and Murphy [1990]).
A common feature of the early literature is that peer performance is measured as the
average performance of all companies with the same industry classification. Recent studies
develop new ways to identify peers, e.g., by using new proxy statement disclosures about peer
groups, by textual analysis of 10-K filings, or by combining industry classifications and size
information (Albuquerque [2009], Gong et al. [2011], Jayaraman, Milbourn, and Seo [2015]).
These studies generally find that CEO compensation is negatively associated with peer stock
returns, which is consistent with the noise-filtering hypothesis.
Another feature of the early literature is that it only examines RPE or an ex post use of
information about peer performance, which entails that compensation is determined after the end
of a performance period when information about own and peer performance is observed. However,
firms often determine compensation as a function of own performance relative to a target (Murphy
[2001], Matějka and Ray [2017]). This creates demand for RTS which is an ex ante use of
information about past peer performance to adjust beginning-of-period targets (Aranda et al.
[2014]). Casas-Arce et al. [2017] show that RTS strengthens incentives not only by filtering out
noise but also by alleviating the ratchet effect. In particular, if target revisions are relatively more
sensitive to past peer performance and relatively less sensitive to past own performance, managers
are less concerned that greater effort will make future targets more difficult to achieve (Weitzman
[1980], Leone and Rock [2002], Bouwens, Cardinaels, and Zhang [2016]).
An alternative explanation for the mixed evidence on the noise-filtering benefits of RPE in
early work is that most prior studies disregard strategic interaction among product market
competitors. Aggarwal and Samwick [1999] and Joh [1999] find a significantly positive
association between compensation and peer performance and argue that it is consistent with firms’
incentives to soften product market competition. Vrettos [2013] uses a unique single-industry
setting to distinguish between different types of competition and shows that executive
compensation is negatively (positively) associated with peer performance when firms act
aggressively (cooperatively) in the product market.
The importance of considering whether and how peers interact is also borne out in several
experimental studies. Bandiera, Barankay, and Rasul [2005] find evidence that RPE reduces
productivity because coworkers “are able to sustain implicit collusive agreements” and that this
effect is more pronounced in smaller groups. Charness, Masclet, and Villeval [2014] show that
RPE increases willingness to sabotage peer performance merely for status concerns, i.e., even
when favorable performance relative to peers does not increase compensation. More generally, the
issues of sabotage and/or lack of cooperation have been examined in prior literature on
tournaments and public good experiments (Fehr and Gächter [2000], Harbring and Irlenbusch
[2008]). Nevertheless, although the importance of these issues has been established in laboratory
conditions and field experiments involving low-paid workers, there is hardly any evidence on how
they affect the design of managerial compensation.
2.2 HYPOTHESES
It is well understood that managerial incentives depend on the type of information used to revise
their performance targets (Anderson, Dekker, and Sedatole [2010], Bouwens and Kroos [2011],
Kim and Shin [2016]). However, only a few recent studies examine RTS or the extent to which
target revisions depend on past peer performance (Aranda et al. [2014], Bol and Lill [2015], Casas-
Arce et al. [2017]). The empirical evidence is consistent with the theory that past peer performance
has information content above and beyond past own performance and therefore can be used to
filter out noise from performance evaluations (Holmström [1982]). For example, favorable past
peer performance signals that future own performance is likely to be favorable as well, which
justifies a target revision upward.
The theory also predicts that the relative importance of past own versus past peer
performance in target updating depends on their relative noisiness (Banker and Datar [1989],
Meyer and Vickers [1997], Datar, Kulp, and Lambert [2001]). Consistent with this prediction,
Casas-Arce et al. [2017] show that higher peer group quality is associated with greater weight on
past peer performance and lower weight on past own performance in target updating. Bol and Lill
[2015] show that the weight on past own performance is decreasing in volatility of own
performance. More generally, prior work on the choice of performance measures commonly finds
that greater noisiness of a performance measure reduces its impact on incentive compensation and
increases the impact of other measures (Ittner, Larcker, and Rajan [1997], Core, Guay, and
Verrecchia [2003], Banker, Huang, and Natarajan [2009]). These findings motivate the following
hypotheses.
H1a: The sensitivity of target revisions to past own performance decreases in peer group capacity
to filter out noise.
H1b: The sensitivity of target revisions to past peer performance increases in peer group capacity
to filter out noise.
Thus the benefit of a greater weight on past peer performance is that managers are protected
from common shocks to performance that are beyond their control (Holmström [1982], Antle and
Smith [1986]). However, putting relatively more weight on past peer performance and less weight
on past own performance can also be costly because it reduces incentives to cooperate. In
particular, helping peers improve their performance translates into a higher own target in the future
and consequently in lower expected compensation (Gibbons and Murphy [1990], Milgrom and
Roberts [1992], Chen [2003]). Although it is difficult to make a priori predictions about the relative
magnitude of these costs and benefits, the benefits of using peer performance information likely
exceed the costs when determining compensation of top executives. This is because CEOs and
other top executives cannot easily cooperate with executives from their peer groups (and are
sometimes even prohibited from doing so by antitrust law), as these peer groups are often composed of their direct
competitors (Albuquerque [2009], Gong et al. [2011]). In contrast, the trade-off between the costs
and benefits of evaluating managers relative to their peers is important at lower levels of the
organization where cooperation with peers is critical for overall performance.
Firms can manage this inherent trade-off between incentives to increase own performance
(due to better noise filtering) and incentives to cooperate with peers not only by adjusting their
target-setting policies but also by other means such as team-based incentive schemes, transfer
pricing policies, promotion and job rotation practices, etc. (Ickes and Samuelson [1987], Bushman,
Indjejikian, and Smith [1995], Baldenius and Reichelstein [2006], Campbell [2008]). An
advantage of target-setting policies is that they can easily be customized to different organizational
units whereas other policies are often organization-wide and leave less scope for customization
(e.g., transfer pricing). In any case, no single policy or performance evaluation choice can
completely resolve the conflict between individual incentives and incentives to cooperate.
Based on the above arguments, we expect that an increase in the importance of cooperation
among peers leads to a decrease in the weight on past peer performance in target updating because
a lower weight alleviates managers’ concerns about higher future targets as a consequence of helping a peer.
Given that past peer performance and past own performance are substitutes in updating targets to
reflect persistent common shocks to performance (Meyer and Vickers [1997]), we expect that a
decrease in the weight on past peer performance also leads to an increase in the weight on past
own performance in target updating.
H2a: The sensitivity of target revisions to past own performance increases in the importance of
cooperation.
H2b: The sensitivity of target revisions to past peer performance decreases in the importance of
cooperation.
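As a compact restatement, the four hypotheses can be written as sign predictions. The notation here is introduced only for illustration and is an assumption, not the paper's own: $s^{own}$ and $s^{peer}$ denote the sensitivity of target revisions to past own and past peer performance, $F$ denotes peer group capacity to filter out noise, and $C$ denotes the importance of cooperation.

```latex
\begin{align*}
\text{H1a: } & \frac{\partial s^{own}}{\partial F} < 0, &
\text{H1b: } & \frac{\partial s^{peer}}{\partial F} > 0, \\
\text{H2a: } & \frac{\partial s^{own}}{\partial C} > 0, &
\text{H2b: } & \frac{\partial s^{peer}}{\partial C} < 0.
\end{align*}
```

Read this way, H1 and H2 pull in opposite directions on both sensitivities, which is the trade-off the hypotheses are meant to capture.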
3 Field Setting
To test our hypotheses we collect data from a global company (further referred to as Gamma)
which provides testing and certification services in the energy sector. Gamma has approximately
5,000 employees worldwide and annual revenues of around $1 billion. It is organized in seven
business regions and 15 business groups each of which consists of multiple business units (BUs)
operating as relatively autonomous profit centers. BUs are typically located close to their clients
and offer a wide range of services depending on market demand.
In particular, there are three primary industry lines of service: upstream, downstream and
power. Upstream includes services to oil and gas companies involved in exploration, extraction,
onshore and offshore well operation as well as transportation and storage. Downstream includes
services to oil and gas processing companies that operate refineries or petrochemical plants as well
as companies involved in further transportation and storage of oil, gas, and other chemicals to
users. Power includes services to companies that operate any type of power plants. For all these
industries, Gamma performs two main types of activities: (i) testing for defects such as cracks or
corrosion and (ii) certifications of safety, reliability, and compliance with regulations.
The primary resources for these testing and certification services are qualified technicians
and specialized equipment. The type of tasks technicians can perform is constrained by their
certification, which comes in three levels: assistant (around 10 percent of the workforce), specialist
(80 percent), and expert (10 percent). Technicians also specialize in the type of equipment they
can operate and the type of inspection they can perform, which makes job experience even more
important than certification. For example, it takes about five years of training and experience for
a technician to be able to operate specialized equipment and competently carry out an inspection
assignment.
Depending on the type of service, technicians use a large variety of equipment ranging
from basic to highly specialized. For example, some relatively simple radiographic devices can be
used for corrosion inspections by all technicians in all three industry lines of service. More
sophisticated radiographic devices (e.g., real-time digital recorders) can only be operated by
technicians with specific training. Ultrasonic equipment requires even greater specialization
because there are up to twenty different types of devices based on different technologies. Finally,
equipment for relatively infrequent tasks such as magnetic or penetrant testing is reserved for a
small group of technicians who have training and experience in this area.
3.1 PERFORMANCE EVALUATION AND TARGET SETTING
BU managers have great autonomy to manage local operations as long as their financial
performance meets expectations. Their compensation consists of a fixed salary and a performance-
contingent bonus. Although performance is measured both in terms of earnings and revenues, the
former is more important because no bonus is paid unless actual earnings meet a performance
target (conversely, when earnings exceed the target, a significant bonus is awarded even if actual
revenues are below target). Thus, the largest part of the bonus is determined based on a comparison
of actual earnings with a target set at the end of the prior year. Ex post discretionary adjustments
to bonuses based on subjective evaluation are possible but insignificant both in magnitude and
occurrence. If used, the subjective criteria for adjustments are BU-specific and may include, for
example, assessments of safety, quality, or leadership.
Target setting within Gamma is a multi-stage process. First, BU managers provide an initial
estimate of next-year earnings and revenues and negotiate targets with business group managers.
During these negotiations business group managers and their staff extensively use a business
intelligence tool that tracks past performance in terms of a detailed break-down of monthly sales,
costs, and earnings for each BU in their business group (also referred to as the peer group). The
resulting targets are therefore based on past BU performance as well as past performance of its
peers. Second, a similar process takes place at the next higher organizational level where business
region directors negotiate targets with business group managers reporting to them. Third, business
regions aggregate all information and propose next-year targets to the board. The board approves
targets as proposed or revises them upward. Fourth, in case of a revision, business region and
group managers decide how to distribute target increases among their BUs. The final targets are
based on negotiations among region, group, and BU managers rather than based on uniform
percentage adjustments affecting all units equally.
3.2 CAPACITY MANAGEMENT
Gamma faces highly volatile demand, which makes capacity management critical for
profitability. BU managers have some flexibility to adjust capacity through overtime,
(re)scheduling of holidays or training sessions, or temporary employment contracts but only if they
know about additional demand in advance (e.g., in case of new maintenance contracts that typically
have a predictable schedule). Nevertheless, surges in demand cannot always be anticipated because
customer installations can get damaged in the normal course of operations and create an urgent
need for repair and retesting. BU managers are very concerned about declining a surprise order
due to capacity constraints because it could mean losing a long-time customer. To reduce this risk,
BUs invest in slack capacity both in terms of labor and equipment even though it involves
significant long-term cost commitments given that newly hired technicians need both training and
experience to become fully productive.
--- Insert Figure 1 ---
Figure 1 illustrates capacity issues for a representative BU in our sample. The first panel
shows that February was the busiest month with actual sales exceeding the average in other months
by 57 percent, whereas January and December were about 26 and 40 percent below the average,
respectively. It also shows that the February peak in demand was not anticipated because actual
sales exceeded target by 69 percent. The second panel describes usage of capacity as measured by
labor hours. Average capacity utilization was about 90 percent implying slack capacity of about
10 percent of total available labor hours. Slack capacity could be as high as 20 percent of labor
hours during its lowest utilization in January and December. However, the surprise demand in
February and June eliminated most of this slack.
Despite its cost, some slack capacity is necessary to manage volatile demand in the absence
of inventory buffers. Slack capacity can also be shared among BUs which bundles their demand,
reduces aggregate volatility, and improves capacity utilization (Jordan and Graves [1995]). In the
absence of sharing, BUs could only rely on their own capacity when accommodating demand
surges and consequently would have to keep even more slack capacity. These benefits of capacity
sharing are widely recognized within Gamma and managers have access to reports tracking their
capacity utilization on a monthly basis. Gamma also has a transfer pricing policy that encourages
capacity sharing. BU transfer prices are based on full costs including an hourly labor rate and travel
costs. Transfers within the same business group are typically significantly less expensive than
transfers across business groups, which involve higher labor rates and long-distance travel costs. In
any case, internal transfer prices are typically much lower than outside market prices of
independent contractors who charge a premium for qualified temporary workers.
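The volatility-reducing effect of capacity sharing described above can be illustrated with a stylized two-BU example. This is a sketch under assumptions that are not in the paper: demands $D_1$ and $D_2$ have standard deviations $\sigma_1$ and $\sigma_2$ and correlation $\rho$.

```latex
\[
\operatorname{sd}(D_1 + D_2)
  = \sqrt{\sigma_1^2 + \sigma_2^2 + 2\rho\,\sigma_1\sigma_2}
  \;\le\; \sigma_1 + \sigma_2 ,
\]
% with equality only if \rho = 1 (or one of the standard deviations is zero).
```

Unless demand surges are perfectly correlated across the two BUs, the volatility of pooled demand is therefore lower than the sum of the individual volatilities, so a shared buffer of a given size covers more of the pooled variation than two separate buffers of the same total size.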
Even though the transfer pricing policy creates financial incentives for BUs on both sides
of internal transfers, BUs borrowing capacity typically benefit more than BUs lending it out, which
makes bilateral relations and willingness to cooperate important for capacity sharing. In particular,
BUs transferring their personnel or equipment earn a contribution margin on capacity resources
that would otherwise be idle. However, they also incur opportunity costs in terms of not being able
to accommodate their own unexpected service orders, especially in cases when their equipment
gets damaged and needs to be repaired. It is therefore often the case that BU managers share their
capacity resources because of reciprocity concerns—refusal to share resources with others would
hurt their ability to internally borrow capacity in the future.
In summary, capacity sharing is an essential cost-saving measure that gives BUs greater
flexibility to respond to volatile demand and reduces the need to invest in their own slack capacity.
However, capacity sharing is voluntary and largely based on reciprocity, which makes it important
to create an environment where BU managers are willing to cooperate.
4 Research Design
4.1 DATA
The dataset available for our empirical analysis consists of 127 observations with non-missing
values of earnings targets as well as past own and peer performance in 93 BUs during 2013–2014.
We discard 21 abnormal observations affected by one of the following: (i) a merger with another
BU, (ii) a restructuring into multiple BUs, (iii) other acquisitions or divestitures resulting in
abnormal target revisions or performance relative to target, which we operationalize as 1% of
outlying observations in terms of target revisions or performance relative to target, or (iv) a business
group manager acting as an interim BU manager, in which case the usual target-setting process
does not apply. Our final dataset contains 106 BU-year observations on 74 BUs.
4.2 VARIABLE MEASUREMENT
Our main analysis of target revisions as a function of past own and peer performance uses several
variables as in prior studies. We follow Bouwens and Kroos [2011] to measure target revisions
and past own performance as follows. $T_{i,t+1} - T_{i,t}$ is the target revision of BU i at the end of year t,
measured as a change in targeted earnings scaled by revenues.3 $A_{i,t} - T_{i,t}$ is own performance of
BU i during year t, measured as the difference between actual and targeted earnings scaled by
revenues. $ACHIEVE_{i,t}$ is an indicator variable for exceeding year t earnings target. We follow
Aranda et al. [2014] to measure peer performance $A_{Peer,t} - T_{i,t}$ as the difference between average
earnings of all peers and earnings target of BU i during year t (all scaled by revenues).4
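To make the roles of these variables concrete, a baseline RTS regression in the spirit of the description above could be sketched as follows. This is an illustrative form only and not necessarily the specification estimated in the paper; the coefficient names $\beta_0$, $\beta_1$, $\beta_2$ and the error term $\varepsilon_{i,t}$ are assumptions, and any interactions with $ACHIEVE_{i,t}$ or control variables are omitted.

```latex
\[
T_{i,t+1} - T_{i,t}
  = \beta_0
  + \beta_1 \,(A_{i,t} - T_{i,t})
  + \beta_2 \,(A_{Peer,t} - T_{i,t})
  + \varepsilon_{i,t}
\]
```

In such a sketch, $\beta_1$ would capture the sensitivity of target revisions to past own performance and $\beta_2$ the sensitivity to past peer performance.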
We use two variables to measure peer group capacity to filter out noise. First, similar to
prior studies on the choice of performance measures (Ittner et al. [1997], Core et al. [2003], Bol
and Lill [2015]), we calculate SDPERF as the standard deviation of own performance for each BU
using monthly data on earnings scaled by revenues (up to 36 BU-month observations). When the
standard deviation of own performance is high, the need to rely on peer performance information
to filter out noise is also high. Second, we follow the approach of Casas-Arce et al. [2017] and
calculate peer group quality, PCORR, as the time-series correlation between actual performance
3 To increase comparability across BUs and alleviate heteroscedasticity issues, we divide all earnings-based variables by revenues similar to the scaling in Bouwens and Kroos [2011] and Aranda et al. [2014]. Leone and Rock [2002] scale earnings by total assets, which are not available in our dataset. We cannot use prior-period earnings for scaling purposes because they can be close to zero or even negative in some cases.
4 Aranda et al. [2014] use a similar RTS variable with the reverse sign (with $T_{i,t} - A_{Peer,t}$ in the numerator) so that high values reflect that $T_{i,t}$ is relatively difficult to achieve. For exposition of our results, it is easier to code the peer performance variable so that high values reflect high peer performance. As discussed in Section 5.5 on “Additional Evidence”, we find similar results when we measure peer performance as $A_{Peer,t} - T_{Peer,t}$, i.e., as average performance of peers relative to their targets (Casas-Arce et al. [2017]).
17
of a BU (earnings scaled by revenues) and average performance of all its peers (using up to 36
BU-month observations). A higher correlation indicates a greater peer group capacity to filter out
noise.
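The two noise-filtering proxies can be sketched as follows. This is a minimal illustration rather than the estimation code used in the study: the function names and the toy monthly series are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption.

```python
import statistics

def sdperf(own_monthly):
    """Standard deviation of a BU's monthly performance (earnings / revenues).
    Assumption: sample standard deviation."""
    return statistics.stdev(own_monthly)

def pearson(x, y):
    """Pearson correlation of two equally long series."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def pcorr(own_monthly, peers_monthly):
    """Correlation between own performance and the month-by-month peer average."""
    peer_avg = [statistics.fmean(month) for month in zip(*peers_monthly)]
    return pearson(own_monthly, peer_avg)

# Hypothetical data: six months of return on sales for one BU and two peers.
own = [0.10, 0.12, 0.08, 0.15, 0.11, 0.09]
peers = [
    [0.09, 0.13, 0.07, 0.14, 0.10, 0.10],
    [0.11, 0.12, 0.09, 0.16, 0.12, 0.08],
]
print(sdperf(own), pcorr(own, peers))
```

A high SDPERF together with a high PCORR would describe a BU whose own performance is noisy but closely tracked by its peers.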
We use three variables to measure variation in the importance of cooperation across BUs.
First, some BUs are far away from their closest peer whereas others are located in close proximity
to multiple peers, which increases the potential for capacity sharing. To measure geographical
proximity among multiple BUs, we define DISTANT as the distance in kilometers between a BU
and the point of gravity of its peers, i.e., the point minimizing the sum of distances to all peers in
the same business group. An alternative measure based on the average distance between a BU and
all its peers yields qualitatively similar results.
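The point of gravity described above is the geometric median of the peer locations. The sketch below approximates it with Weiszfeld's iteration, a standard method chosen here for illustration and not one named in the text. It assumes planar coordinates in kilometers (actual locations would call for a great-circle distance), and the function names are hypothetical.

```python
import math

def geometric_median(points, iters=500, tol=1e-10):
    """Approximate the point minimizing the sum of Euclidean distances to `points`."""
    # Start from the centroid (a convenient, arbitrary starting value).
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(iters):
        num_x = num_y = denom = 0.0
        for px, py in points:
            # Floor the distance so a peer coinciding with the estimate
            # does not cause a division by zero.
            w = 1.0 / max(math.hypot(px - x, py - y), tol)
            num_x += w * px
            num_y += w * py
            denom += w
        new_x, new_y = num_x / denom, num_y / denom
        if math.hypot(new_x - x, new_y - y) < tol:
            return new_x, new_y
        x, y = new_x, new_y
    return x, y

def distant(bu, peers):
    """Distance (km) between a BU and the point of gravity of its peers."""
    gx, gy = geometric_median(peers)
    return math.hypot(bu[0] - gx, bu[1] - gy)

# Hypothetical peer locations on a 10 km square.
peers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
print(distant((8.0, 9.0), peers))
```

The flooring of small distances is a pragmatic guard only; the iteration can stall when the median coincides with a peer location, so a production version would need a more careful treatment of that case.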
Second, we compare BUs to their peers in terms of similarity in services they provide.
Although some capacity resources can be used for all industry lines of service, other resources are
service-specific and can only be shared if two BUs engage in similar activities. For example, X-ray
devices to inspect pipelines in the upstream industry are rarely used in BUs mainly servicing
downstream refineries. To measure similarity of BU services within a peer group, we calculate
SIMILAR as the Hirschman-Herfindahl Index of revenues by service line within a business group.
Specifically, we calculate the sum of squared revenue shares (of total business group revenue)
generated in different lines of service. High values indicate high similarity as reflected in high
concentration of revenues in the same line of service within a business group. Low values indicate
low similarity and low potential for capacity sharing among BUs due to a high diversity of services
provided by BUs within the same business group.
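The concentration measure can be illustrated with a short sketch. The function name, the input layout (one revenue dictionary per BU), and the service-line labels are hypothetical.

```python
def similar(bu_revenues):
    """Hirschman-Herfindahl Index of business group revenue by service line.

    `bu_revenues` is a list with one dict per BU mapping service line -> revenue.
    """
    # Aggregate revenue by service line across all BUs in the business group.
    by_line = {}
    for bu in bu_revenues:
        for line, revenue in bu.items():
            by_line[line] = by_line.get(line, 0.0) + revenue
    total = sum(by_line.values())
    # Sum of squared shares of total business group revenue.
    return sum((revenue / total) ** 2 for revenue in by_line.values())

# Hypothetical business group with two BUs and two service lines.
group = [
    {"upstream": 60.0, "downstream": 40.0},
    {"upstream": 90.0, "downstream": 10.0},
]
print(similar(group))
```

The index equals one when all revenue comes from a single service line and falls toward 1/n when revenue is spread evenly over n lines.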
Third, BUs may be in close proximity to their peers and use similar resources but still have
limited potential for capacity sharing if the peak usage occurs during the same months of the year.
Therefore, we also measure MATCH as the proportion of peers that are in need of capacity transfers
(have highest revenues) during the same two months of the year when the focal BU has slack
capacity (its two lowest monthly revenues of the year).5
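One possible reading of this definition is sketched below. It assumes that a peer counts as a match only when its two highest-revenue months are exactly the focal BU's two lowest-revenue months, and that ties are broken by calendar order; both are assumptions for illustration, as are the function names.

```python
def extreme_months(revenues, k, highest):
    """Indices of the k months with the highest (or lowest) revenue."""
    order = sorted(range(len(revenues)), key=lambda m: revenues[m], reverse=highest)
    return set(order[:k])

def match(focal, peers, k=2):
    """Share of peers whose k peak months coincide with the focal BU's k slack months."""
    slack = extreme_months(focal, k, highest=False)
    hits = sum(1 for peer in peers if extreme_months(peer, k, highest=True) == slack)
    return hits / len(peers)

# Hypothetical monthly revenues (January to December).
focal = [1, 2, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10]   # slack in Jan and Feb
peer_a = [20, 19, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5]          # peak in Jan and Feb
peer_b = [5, 5, 5, 5, 5, 20, 19, 5, 5, 5, 5, 5]          # peak in Jun and Jul
print(match(focal, [peer_a, peer_b]))
```

The parameter k makes it straightforward to switch to the one-month or three-month variants.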
When testing for the effects of cooperation, we use these three proxies (DISTANT,
SIMILAR, MATCH) individually but we also aggregate them into an overall index for the
importance of cooperation. One way to aggregate is to assume equal weights and construct an
index as the sum of the three proxies standardized to have zero mean and variance of one. Our
validation analysis in Section 5.2 discusses an alternative aggregation method which estimates the
weights using additional information about the outcomes of cooperation. Using the proxies for
cooperation individually or using either of the aggregation methods yields qualitatively similar
results.
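The equal-weight aggregation can be sketched as follows. Because DISTANT is inversely related to the potential for capacity sharing, the sketch assumes that it enters the index with a reversed sign so that high values indicate high importance of cooperation; this sign convention, the function names, and the toy values are assumptions for illustration.

```python
import statistics

def standardize(values):
    """Rescale a series to zero mean and unit (sample) variance."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def equal_weight_index(distant, similar, match):
    """Sum of the standardized proxies, one value per BU."""
    z_distant = standardize(distant)
    z_similar = standardize(similar)
    z_match = standardize(match)
    # Assumption: reverse the sign of DISTANT so that all three components
    # increase in the importance of cooperation.
    return [-d + s + m for d, s, m in zip(z_distant, z_similar, z_match)]

# Hypothetical proxies for three BUs.
print(equal_weight_index(
    distant=[10.0, 50.0, 200.0],
    similar=[0.8, 0.5, 0.3],
    match=[0.6, 0.4, 0.0],
))
```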
Our main analyses use the following control variables. SIZE is the logarithm of BU labor
costs.6 GROWTH is the logarithm of BU growth in revenue. GDP is annual GDP per capita growth
in the business group’s country. PRICE is the hourly charge paid by customers in the country,
which may proxy for (lack of) competitive pressures. As discussed below, we also control for all
time-invariant sources of sample heterogeneity by using a change in targeted earnings as the
dependent variable in our main analysis.
5 Alternative measures defining MATCH in terms of the same one (three) month(s) of the year yield qualitatively similar results in our hypotheses tests. Our main measure yields strongest results in our validation analysis discussed in the next section. In addition, we find support for the implicit assumption that monthly variation in demand for capacity is at least partly predictable. In particular, there is a highly significant serial correlation (ρ=0.27, p<.01) in ranks of monthly revenue in any given BU-year.
6 Using (the logarithm of) BU revenue or labor cost (unlogged) to control for size yields qualitatively unchanged results. In our validation analysis, where BU labor cost is one of the dependent variables, we use sales to control for size.
4.3 EMPIRICAL MODEL
We model target revisions, Ti,t+1 – Ti,t, as a function of past own performance and past peer
performance. We allow for asymmetric target ratcheting where the effect of past own performance
depends on ACHIEVEi,t (Leone and Rock [2002], Bouwens and Kroos [2011]). Thus, our baseline
model is as follows:
$$T_{i,t+1} - T_{i,t} = \beta_0 + \beta_1 (A_{i,t} - T_{i,t}) + \beta_2 \mathit{ACHIEVE}_{i,t} + \beta_3 \mathit{ACHIEVE}_{i,t} (A_{i,t} - T_{i,t}) + \beta_4 (A_{Peer,t} - T_{i,t}) + \varepsilon. \tag{1}$$
Our hypotheses predict that the effects of past own performance ($\beta_1$) and past peer
performance ($\beta_4$) vary as a function of peer group capacity to filter out noise and the importance
of cooperation. To estimate these moderating effects, we rely on the following models:
$$\begin{aligned}
T_{i,t+1} - T_{i,t} = {} & \beta_0 + \beta_1 (A_{i,t} - T_{i,t}) + \beta_2 \mathit{ACHIEVE}_{i,t} + \beta_3 \mathit{ACHIEVE}_{i,t} (A_{i,t} - T_{i,t}) + \beta_4 (A_{Peer,t} - T_{i,t}) \\
& + \beta_5 \mathit{VAR}_i + \beta_6 \mathit{VAR}_i (A_{i,t} - T_{i,t}) + \beta_7 \mathit{VAR}_i (A_{Peer,t} - T_{i,t}) + \varepsilon,
\end{aligned} \tag{2}$$
where VAR stands for one of our main explanatory variables (SDPERF, PCORR, DISTANT,
SIMILAR, or MATCH). We also include our control variables as well as year and peer group fixed
effects. In our main tests, we simultaneously estimate the main and moderating effects of both peer
group capacity to filter out noise and the importance of cooperation.
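To make the structure of model (2) concrete, the sketch below builds the eight regressors for one BU-year and derives the implied sensitivities of the target revision to own and peer performance. Control variables and fixed effects are omitted, the function names are hypothetical, and the coefficient values in the usage example are invented for illustration; they are not estimates from the study.

```python
def model2_regressors(a_own, t_own, a_peer, var):
    """Regressors of model (2) in coefficient order 0..7 (no controls, no fixed effects)."""
    own_gap = a_own - t_own        # past own performance relative to own target
    peer_gap = a_peer - t_own      # average peer performance relative to own target
    achieve = 1.0 if a_own > t_own else 0.0   # indicator for exceeding the target
    return [1.0, own_gap, achieve, achieve * own_gap, peer_gap,
            var, var * own_gap, var * peer_gap]

def predicted_revision(coefs, regressors):
    """Fitted target revision: the inner product of coefficients and regressors."""
    return sum(c * x for c, x in zip(coefs, regressors))

def sensitivities(coefs, var, achieve):
    """Marginal effects of own and peer performance at a given value of VAR."""
    own = coefs[1] + coefs[3] * achieve + coefs[6] * var
    peer = coefs[4] + coefs[7] * var
    return own, peer

# Hypothetical coefficients (indices 0..7) and one hypothetical BU-year.
coefs = [0.0, 0.30, 0.0, -0.10, 0.40, 0.0, 0.05, -0.10]
row = model2_regressors(a_own=0.12, t_own=0.10, a_peer=0.15, var=1.0)
print(predicted_revision(coefs, row), sensitivities(coefs, var=1.0, achieve=1.0))
```

The moderating effects of interest are the coefficients with indices 6 and 7, which shift the own and peer sensitivities as VAR changes.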
5 Results
5.1 DESCRIPTIVE EVIDENCE
Table 1 presents descriptive statistics for our sample of 106 BU-year observations. The typical BU
had direct labor costs of €1.1 million (see SIZE unlogged) and sales of €2.8 million (untabulated).
Performance exceeded targets in terms of return on sales in forty-five percent of sample
observations. Median own performance (Ai,t – Ti,t) fell short of target by one percent of revenues
and the resulting median target revision (Ti,t+1 – Ti,t) was zero.
--- Insert Table 1 ---
Table 2 presents Pearson correlations among our main variables and yields two insights.
First, target revisions are positively associated with past own performance as well as past peer
performance. The association with past peer performance (0.46) is higher than the association with
past own performance (0.37). This is consistent with the assumption of RTS that information about
peer performance is useful for target-setting purposes. Second, we find that our three proxies for
the importance of cooperation reflect different dimensions of the potential for capacity sharing and
are not highly correlated. As discussed below, we take this into account when constructing an
overall index for the importance of cooperation.
--- Insert Table 2 ---
5.2 VALIDATING MEASURES OF IMPORTANCE OF COOPERATION
As described in Section 3, cooperation allows BUs to pool their capacity resources with peers,
which should have at least some of the following economic benefits. First, it should reduce the
need to maintain excess labor capacity and consequently also BU labor costs. Second, it should
reduce the usage of labor from outsourcing companies and other third-party providers. Third, it
should increase BU profitability because maintaining excess labor or hiring temporary labor from
third parties to accommodate unexpected demand is expensive.
We expect our measures of the importance of cooperation to be associated with the three
economic outcomes discussed above. We measure excess labor capacity as abnormally high labor
hours (SLACK LABOR), i.e., the residuals from a regression of BU direct labor cost on the average
hourly labor rate, sales, and all other control variables used in the main analysis. By construction,
these residuals cannot be explained by cross-sectional differences in wages, sales, or any other
measurable BU characteristics and should therefore be indicative of excess labor capacity. We
measure the usage of labor from third-party providers (THIRD PARTY) in terms of the labor hours
requested from outside providers as a percentage of the contractually available maximum at the
business group level. Finally, we measure profitability as BU profit margin (PROFITM), i.e.,
earnings scaled by revenues.
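The residual-based measure of excess labor capacity can be illustrated with a deliberately simplified sketch. It regresses labor cost on a single regressor (sales), whereas the validation analysis also includes the hourly labor rate and the other controls; the function name and the toy data are hypothetical.

```python
import statistics

def ols_residuals(y, x):
    """Residuals from a one-regressor OLS of y on x with an intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Hypothetical BU data (in million euros).
sales = [2.0, 2.5, 3.0, 3.5, 4.0]
labor_cost = [0.9, 1.0, 1.4, 1.3, 1.6]
slack_labor = ols_residuals(labor_cost, sales)
print(slack_labor)
```

A positive residual marks a BU whose labor cost is higher than its sales would predict, which is the sense in which the measure proxies for slack.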
--- Insert Tables 3a and 3b ---
Table 3a shows the pairwise correlations among the three proxies for the importance of
cooperation and the three economic outcome variables. Table 3b tests whether some of these
associations are significant in multivariate regressions, which also include average hourly labor
rate, sales, GROWTH, GDP, and PRICE as control variables and year fixed effects. We do not
include peer group fixed effects in this validation analysis because between-group differences in
the importance of cooperation are associated with between-group differences in economic
outcomes. Our results are qualitatively similar but somewhat weaker in magnitude if we include
peer group fixed effects and thus eliminate this between-group variance.
Panel A of Table 3b shows that all three proxies for the importance of cooperation are
significantly associated with excess labor capacity as reflected in abnormally high labor hours. In
particular, the first proxy, DISTANT, is inversely related to the potential for capacity sharing and
we find that it is positively associated with SLACK LABOR (p<.01). In other words, BUs that are
further apart from their peers are more likely to maintain slack capacity in terms of excess
employment. We find significant negative associations between abnormal labor costs and the other
two measures, SIMILAR (p<.05) and MATCH (p<.01), that are increasing in the potential for
capacity sharing.
Panel B of Table 3b provides similar results regarding the usage of labor from third-party
providers. THIRD PARTY is increasing in DISTANT (p<.10) and decreasing in SIMILAR (p<.01)
and MATCH (p<.01). Finally, Panel C shows the implications for overall profitability as reflected
in BU margins. PROFITM is increasing in SIMILAR (p<.01) and the effects of DISTANT and
MATCH have the predicted signs even though the magnitudes do not quite reach the levels of
statistical significance.
We infer from these findings that our proxies for the importance of cooperation capture
economically meaningful differences in production functions. BUs that are close to their peers,
perform similar services, and experience peak demand at different times of the year than their peers
are in the best position to pool capacity resources and reap the benefits of low levels of excess
employment, low reliance on temporary labor from third parties, and to some extent also higher
average margins. It follows that BU cost structure depends not only on individual BU
characteristics but also on proximity and similarity to its peers.
The correlations presented in Table 3a can also be used to aggregate DISTANT, SIMILAR,
and MATCH into an overall index for the importance of cooperation and to test validity of the
resulting formative construct (CINDEX). In particular, we follow Diamantopoulos and Winklhofer
[2001] and estimate a model of cooperation with three causes (possibly uncorrelated) and three
outcomes. We find evidence of a very good fit in that a χ²-test does not reject validity of the model
(χ²=6.43, p=0.38). Relying on this measurement model, we estimate CINDEX as a linear
combination of DISTANT, SIMILAR, and MATCH with optimal aggregation weights that
maximize the proportion of common (explained) variance behind the three economic outcomes of
cooperation. We use CINDEX as one of the measures of cooperation in our hypotheses tests. Using
an equally-weighted index yields qualitatively similar results.
5.3 BASELINE TARGET REVISION MODELS
Prior literature shows that target revisions depend on past own as well as past peer performance
(Leone and Rock [2002], Aranda et al. [2014]). Table 4 replicates these findings in the full sample
as well as in two separate subsamples of BUs with low/high performance relative to peers
(measured as return on sales below/above the business group median). The full sample results in
column (i) are consistent with prior studies in that there is evidence of both target ratcheting
(p<.01) and RTS (p<.01). In particular, we find an economically significant extent of RTS as
reflected in the estimates suggesting that a 10% increase in average peer earnings is associated
with a 3.6% increase in target earnings on average. We also find that the association between target
revisions and past own performance is largely driven by BUs that fail to meet their earnings
target—failing a target by a wide margin is associated with a significant target revision downward,
whereas exceeding own target by a wide margin does not necessarily lead to a significant target
revision upward.
--- Insert Table 4 ---
The subsample analysis in columns (ii) and (iii) of Table 4 sheds more light on these
asymmetric target revisions. Consistent with prior studies (Aranda et al. [2014], Indjejikian et al.
[2014], Bol and Lill [2015]), we find that the sensitivity of target revisions to past own performance
depends on performance relative to peers. When high performers fail to meet their target, the
next-year target is strongly revised downward. However, when high performers exceed (an already
high) target, there is essentially no impact on next-year target. This type of asymmetric target
revision also affects BUs with low relative performance but is much weaker in magnitude.
5.4 HYPOTHESES TESTS
The main focus of our study is the cross-sectional variation in the extent of target ratcheting and
RTS. Our hypotheses predict that the sensitivity of target revisions to past own performance and
past peer performance depends on peer group capacity to filter out noise and on the importance of
cooperation. In what follows, we first present our target revision models separately for each of our
measures of (i) peer group capacity to filter out noise (SDPERF and PCORR) and (ii) the
importance of cooperation (DISTANT, SIMILAR, MATCH). Subsequently, we test our hypotheses
using CINDEX as our main measure of the importance of cooperation because it has the greatest
power to detect the economic benefits of capacity sharing (see Section 5.2 validating our measures
of the importance of cooperation).
--- Insert Table 5a ---
Columns (i) and (ii) of Table 5a each include one of the measures of peer group capacity
to filter out noise. As expected, we find that they are positively associated with the sensitivity of
target revisions to past peer performance (p < .01 in both cases). We also find that the sensitivity
of target revisions to past own performance is negatively associated with SDPERF (p < .01). These
findings are consistent with the theory that target revisions put relatively more weight on peer
performance information when peer group capacity to filter out noise is high. They also suggest
that volatility of performance (SDPERF) and peer group quality (PCORR) effectively proxy for
the effects of noise filtering.
Column (iii) of Table 5a uses DISTANT as a proxy inversely related to the importance of
cooperation. We find that the sensitivity of target revisions to past peer performance is relatively
low (p < .01) and the sensitivity to past own performance is relatively high (p < .01) when a BU is
located close to its peers (DISTANT is low). This is consistent with the theory that target revisions
are relatively immune to favorable performance of peers in settings where the potential for capacity
sharing and thus the importance of cooperation is high. That said, proximity to peers may also
imply greater peer group capacity to filter out noise (see Table 2), which could attenuate the effects
of DISTANT on the sensitivity of target revisions to past own and past peer performance in
Column (iii).
Column (iv) of Table 5a uses BU similarity as an empirical proxy for the importance of
cooperation. High values of SIMILAR indicate that BUs within a business group perform similar
services and therefore have a greater potential for capacity sharing. We find that SIMILAR is
negatively associated with the sensitivity of target revisions to past peer performance (p < .01) and
positively associated with the sensitivity to past own performance (p < .10). These findings imply
that information about peer performance is incorporated into targets relatively more and
information about own performance relatively less when business groups have a highly diverse set
of activities which renders transfers of services and resources between BUs more difficult.
Conversely, business groups that are homogeneous in terms of services offered and resources used
put less emphasis on past peer performance and more emphasis on past own performance when
revising targets. Again, to the extent that greater similarity among peers makes peer performance
more informative about the future, these findings could be attenuated by noise-filtering motives.
Column (v) of Table 5a exploits variation in the extent to which a BU’s idle capacity
coincides with idle capacity of peers. At one extreme, if all BUs within a business group face peak
demand at the same time of the year, there is no potential for capacity sharing. At the other extreme,
the highest potential for capacity sharing occurs for the highest values of MATCH, i.e., when a BU
has slack capacity exactly when peers can use it. We find that when MATCH is high, the sensitivity
of target revisions to past peer performance is relatively low (p < .01) and the sensitivity to past
own performance is relatively high (p < .05).
--- Insert Table 5b ---
Table 5b presents the main tests of our hypotheses and extends the results in Table 5a in
two ways. First, as discussed in Section 5.2, we use CINDEX as an aggregate measure of the
importance of cooperation which combines all three proxies into an overall index as in
Diamantopoulos and Winklhofer [2001]. Column (i) of Table 5b shows that the effects of CINDEX
on the relative importance of past own and peer performance in target updating are qualitatively
similar to those in Columns (iii)-(v) of Table 5a. Second, we estimate the noise-filtering effects
predicted by H1a,b and the cooperation effects predicted by H2a,b simultaneously. Although
estimating multiple interaction effects in one model reduces the power of our tests, it takes into
account that some of our proxies for cooperation may also be related to noise filtering.
Consistent with H1a,b, Column (ii) shows that volatility of own performance (SDPERF)
reduces the sensitivity of target revisions to past own performance (p < .01) and increases the
sensitivity to past peer performance (p < .01). Column (iii) shows that peer group quality (PCORR)
increases the sensitivity to past peer performance (p < .01). The effect on the sensitivity to past
own performance has the predicted sign but is not statistically significant.
Columns (ii) and (iii) also provide strong support for H2a,b. In particular, controlling for
the noise-filtering effects discussed above, we find that the importance of cooperation (CINDEX)
increases the sensitivity of target revisions to past own performance (p < .01 in (ii) and p < .05 in
(iii)) and reduces the sensitivity to past peer performance (p < .01 in both columns). To assess the
magnitude of the effects, we compare the sensitivity of target revisions to past own and peer
performance at different values of CINDEX for the model reported in column (i) of Table 5b. For
example, the sensitivity estimates are 0.32 for past own performance and 0.43 for past peer
performance for an observation with average value of the index. In contrast, when the importance
of cooperation is high, as reflected by a one-standard-deviation increase in the index, the sensitivity
estimates are 0.38 for past own and 0.28 for past peer performance. Thus, although we make no
predictions about the absolute size of the sensitivity coefficients, we provide evidence that they
depend to a great extent on the importance of cooperation.
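As a back-of-the-envelope reading of these estimates, suppose that the sensitivities are linear in the index and that the index is measured in standard-deviation units around its average, denoted $c$ (both are simplifying assumptions, and the symbols $s_{own}$ and $s_{peer}$ are introduced here only for illustration). The reported values then imply

$$s_{own}(c) = 0.32 + (0.38 - 0.32)\,c = 0.32 + 0.06\,c, \qquad s_{peer}(c) = 0.43 + (0.28 - 0.43)\,c = 0.43 - 0.15\,c.$$

At the average ($c = 0$), the weight on past peer performance is about $0.43 / 0.32 \approx 1.34$ times the weight on past own performance; one standard deviation above the average ($c = 1$), the ratio falls to about $0.28 / 0.38 \approx 0.74$.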
In conclusion, we find evidence consistent with RTS in that target revisions incorporate
information about peer performance. Nevertheless, there is also considerable cross-sectional
variation in the extent of RTS. In particular, we find that target revisions are relatively immune to
favorable performance of peers in settings where the potential for cooperation among BUs is high
and/or the capacity of peer groups to filter out noise is low.
5.5 ADDITIONAL EVIDENCE
We estimate several alternative specifications of our empirical models to assess their robustness
and to provide additional evidence on the determinants of RTS. First, we examine the validity of
the assumption that peer groups are comprised of BUs in the same business group. Alternatively,
peer groups could be comprised of units outside of a business group. We randomly generate 100
alternative peer groups for each BU so that none of its peers comes from the same business group
and reestimate a target updating model similar to the one presented in Column (i) of Table 4. We
find that using peers from the same business group (as in our main analysis) results in a higher R²
than in all of the estimations using randomly generated peer groups.7 Similarly, the coefficient
estimate pertaining to past performance of business group peers is higher than the corresponding
coefficient in any of the estimations involving random peers. Collectively, these results suggest
that our assumption that peer groups are comprised of BUs in the same business group is not overly
restrictive.
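The construction of the randomly generated peer groups can be sketched as follows. The number of peers per draw is not stated in the text, so it is left as a parameter; the function names, the fixed seed, and the toy mapping of BUs to business groups are assumptions for illustration.

```python
import random

def random_outside_peers(bu, group_of, n_peers, rng):
    """Draw n_peers BUs at random, none from the focal BU's own business group."""
    candidates = [b for b, g in group_of.items() if g != group_of[bu]]
    return rng.sample(candidates, n_peers)

def placebo_peer_groups(bu, group_of, n_peers, n_draws=100, seed=0):
    """Generate n_draws alternative peer groups for one BU."""
    rng = random.Random(seed)   # fixed seed so the draws are reproducible
    return [random_outside_peers(bu, group_of, n_peers, rng) for _ in range(n_draws)]

# Hypothetical mapping of BUs to business groups.
group_of = {"A1": "A", "A2": "A", "B1": "B", "B2": "B", "C1": "C", "C2": "C"}
draws = placebo_peer_groups("A1", group_of, n_peers=2)
print(draws[0])
```

Each draw would then replace the actual peers when computing average peer performance, and the target revision model would be reestimated once per draw.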
7 We do not estimate peer group fixed effects in these comparisons because they cannot be meaningfully defined for randomly generated peer groups that are different for each BU.
Second, we consider an alternative way to measure peer performance relative to target. Our
main analysis follows the approach of Aranda et al. [2014] assuming that target difficulty varies
across BUs and can be measured by comparing average peer performance to own targets. As an
alternative, we follow Casas-Arce et al. [2017] and assume that performance targets are fully
adjusted every period so that their difficulty is the same for all BUs. This approach implies that
target revisions only reflect new information about peer performance, which can therefore be
measured as average peer performance relative to peer targets, APeer,t – TPeer,t. We reestimate our
empirical models using this alternative measure of past peer performance and find qualitatively
similar results.
Finally, we examine whether the sensitivity of target revisions to past own performance
and past peer performance depends on peer group size (the number of BUs in a business group).
Although peer group size is not directly related to the importance of cooperation, it may affect
target updating for two reasons: (i) average performance of a larger peer group is likely to be less
volatile and therefore used more in target updating (Banker and Datar [1989]), (ii) the adverse
incentive effects of using information about peer performance are less pronounced in larger groups
(Bandiera et al. [2005]). Therefore, we expect that peer group size increases the relative weight on
past peer performance in target updating and reduces the relative weight on past own performance.
We find support for these predictions in that the interaction between peer group size and past own
performance is significantly negative (p < .01, untabulated) and the interaction between peer group
size and past peer performance is significantly positive (p < .10, untabulated).
6 Discussion and Conclusions
A large stream of prior literature examines the extent to which executive compensation depends
on performance of industry peers. Theory predicts that the use of information about peer
performance in compensation contracts has benefits but also costs in terms of discouraging
cooperation among peers. Numerous studies find support for the benefits yet there is hardly any
evidence pertaining to the costs, in part because performance evaluation and compensation data is
rarely available at lower organizational levels where teamwork and cooperation are particularly
important (Feintzeig [2015]).
Our study fills this void by providing novel evidence consistent with the theory that the
use of peer performance information implies a trade-off between the benefits of filtering out noise
from performance evaluations and the costs of discouraging cooperation. In particular, we collect
field data from a global industrial services company and examine how the importance of
cooperation affects relative target setting. Consistent with prior work, we find that the extent to
which target revisions depend on peer performance increases in peer group capacity to filter out
noise. At the same time, we also find that information about peer performance is used less when it
is important for BU managers to cooperate and voluntarily share capacity resources.
Our findings caution that the incentive implications of incorporating information about
peer performance into compensation contracts are more complex than recognized in most prior
empirical studies. The noise-filtering benefits may dominate the costs in settings where top
executives have limited interaction with their industry peers. However, it would be misleading to
conclude from the evidence on top executive compensation that the use of information about peer
performance strengthens incentives at other levels of the organizational hierarchy. Most peer
performance comparisons take place at lower levels where the assumption of a limited interaction
among peers does not apply. Our study shows that in such settings concerns about incentives to
cooperate may be equally or more important than the noise-filtering benefits of using information
about peer performance.
Our findings also relate to prior literature on managerial incentives in multi-period settings.
Although many analytical studies show that ex ante commitment to long-term contracts
strengthens managerial incentives (Laffont and Tirole [1993], Ederhof, Rajan, and Reichelstein
[2011]), it remains unclear whether such commitment is feasible in practice. Our results are
consistent with the use of long-term contracts and target-setting practices that encourage
cooperation through credible commitment not to revise targets upward following improvements in
peer performance.
We acknowledge that our findings also have limitations. First, a key theoretical construct
of our study, the importance of cooperation, cannot easily be measured. To address concerns about
measurement error, we use several different measures of the importance of cooperation and show
that our results are robust. Second, our findings need not generalize to other settings. We do not
make any conclusive statements about the importance of cooperation outside the setting under
study or about the relative magnitude of the costs and benefits of relative target setting. Our aim
is to provide evidence that concerns about a lack of cooperation affect target setting and the extent
to which managerial compensation depends on peer performance.
References
AGGARWAL, R. K., AND A. A. SAMWICK. "Executive compensation, strategic competition, and relative performance evaluation: Theory and evidence." Journal of Finance 54 (1999): 1999–2043.
ALBUQUERQUE, A. "Peer firms in relative performance evaluation." Journal of Accounting & Economics 48 (2009): 69–89.
ANDERSON, S. W., H. C. DEKKER, AND K. L. SEDATOLE. "An empirical examination of goals and performance-to-goal following the introduction of an incentive bonus plan with participative goal-setting." Management Science 56 (2010): 90–109.
ANTLE, R., AND A. SMITH. "An empirical investigation of the relative performance evaluation of corporate executives." Journal of Accounting Research 24 (1986): 1–39.
ARANDA, C., J. ARELLANO, AND A. DAVILA. "Ratcheting and the role of relative target setting." The Accounting Review 89 (2014): 1197–1226.
BALDENIUS, T., AND S. REICHELSTEIN. "External and internal pricing in multidivisional firms." Journal of Accounting Research 44 (2006): 1–28.
BANDIERA, O., I. BARANKAY, AND I. RASUL. "Social preferences and the response to incentives: Evidence from personnel data." The Quarterly Journal of Economics 120 (2005): 917–962.
BANDIERA, O., I. BARANKAY, AND I. RASUL. "Social connections and incentives in the workplace: Evidence from personnel data." Econometrica 77 (2009): 1047–1094.
BANKER, R. D., AND S. M. DATAR. "Sensitivity, precision, and linear aggregation of signals for performance evaluation." Journal of Accounting Research 27 (1989): 21–39.
BANKER, R. D., R. HUANG, AND R. NATARAJAN. "Incentive contracting and value relevance of earnings and cash flows." Journal of Accounting Research 47 (2009): 647–678.
BOL, J. C., AND J. LILL. "Performance target revisions in incentive contracts: Do information and trust reduce ratcheting and the ratchet effect?" The Accounting Review 90 (2015): 1755–1778.
BOUWENS, J., E. CARDINAELS, AND J. ZHANG. "Principals and their car dealers: What do targets tell about their relation?" Working paper, University of Amsterdam, 2016.
BOUWENS, J., AND P. KROOS. "Target ratcheting and effort reduction." Journal of Accounting & Economics 51 (2011): 171–185.
BUSHMAN, R. M., R. J. INDJEJIKIAN, AND A. SMITH. "Aggregate performance measures in business unit manager compensation: The role of intrafirm interdependencies." Journal of Accounting Research 33 (1995): 101–128.
CAMPBELL, D. "Nonfinancial performance measures and promotion-based incentives." Journal of Accounting Research 46 (2008): 297–332.
CASAS-ARCE, P., M. HOLZHACKER, M. D. MAHLENDORF, AND M. MATĚJKA. "Relative performance evaluation and the ratchet effect." Contemporary Accounting Research (2017): forthcoming.
CASAS-ARCE, P., AND F. A. MARTINEZ-JEREZ. "Relative performance compensation, contests, and dynamic incentives." Management Science 55 (2009): 1306–1320.
CHARNESS, G., D. MASCLET, AND M. C. VILLEVAL. "The dark side of competition for status." Management Science 60 (2014): 38–55.
CHARNESS, G., AND M.-C. VILLEVAL. "Cooperation and competition in intergenerational experiments in the field and the laboratory." The American Economic Review 99 (2009): 956–978.
CHEN, K.-P. "Sabotage in promotion tournaments." Journal of Law, Economics, & Organization 19 (2003): 119–140.
CORE, J. E., W. R. GUAY, AND R. E. VERRECCHIA. "Price versus non-price performance measures in optimal CEO compensation contracts." The Accounting Review 78 (2003): 957–981.
DATAR, S., S. C. KULP, AND R. A. LAMBERT. "Balancing performance measures." Journal of Accounting Research 39 (2001): 75–92.
DIAMANTOPOULOS, A., AND H. M. WINKLHOFER. "Index construction with formative indicators: An alternative to scale development." Journal of Marketing Research 38 (2001): 269–277.
EDERHOF, M., M. V. RAJAN, AND S. REICHELSTEIN. "Discretion in managerial bonus pools." Foundations and Trends in Accounting 5 (2011): 243–316.
FEHR, E., AND S. GÄCHTER. "Cooperation and punishment in public goods experiments." The American Economic Review 90 (2000): 980–994.
FEINTZEIG, R. "The trouble with grading employees." The Wall Street Journal, http://www.wsj.com/articles/the-trouble-with-grading-employees-1429624897 (2015).
GARVEY, G., AND T. MILBOURN. "Incentive compensation when executives can hedge the market: Evidence of relative performance evaluation in the cross section." Journal of Finance 58 (2003): 1557–1581.
GIBBONS, R., AND K. J. MURPHY. "Relative performance evaluation for chief executive officers." Industrial & Labor Relations Review 43 (1990): 30–51.
GONG, G., L. Y. LI, AND J. Y. SHIN. "Relative performance evaluation and related peer groups in executive compensation contracts." The Accounting Review 86 (2011): 1007–1043.
HARBRING, C., AND B. IRLENBUSCH. "How many winners are good to have?: On tournaments with sabotage." Journal of Economic Behavior & Organization 65 (2008): 682–702.
HARBRING, C., AND B. IRLENBUSCH. "Sabotage in tournaments: Evidence from a laboratory experiment." Management Science 57 (2011): 611–627.
HOLMSTRÖM, B. "Moral hazard and observability." Bell Journal of Economics 10 (1979): 74–91.
HOLMSTRÖM, B. "Moral hazard in teams." Bell Journal of Economics 13 (1982): 324–340.
ICKES, B. W., AND L. SAMUELSON. "Job transfers and incentives in complex organizations: Thwarting the ratchet effect." The Rand Journal of Economics 18 (1987): 275–286.
INDJEJIKIAN, R. J., M. MATĚJKA, K. A. MERCHANT, AND W. A. VAN DER STEDE. "Earnings targets and annual bonus incentives." The Accounting Review 89 (2014): 1227–1258.
ITTNER, C. D., D. F. LARCKER, AND M. V. RAJAN. "The choice of performance measures in annual bonus contracts." The Accounting Review 72 (1997): 231–255.
JANAKIRAMAN, S. N., R. A. LAMBERT, AND D. F. LARCKER. "An empirical investigation of the relative performance evaluation hypothesis." Journal of Accounting Research 30 (1992): 53–69.
JAYARAMAN, S., T. MILBOURN, AND H. SEO. "Product market peers and relative performance evaluation." Working paper, University of Rochester, 2015.
JENSEN, M. C., AND K. J. MURPHY. "Performance pay and top-management incentives." Journal of Political Economy 98 (1990): 225–264.
JOH, S. W. "Strategic managerial incentive compensation in Japan: Relative performance evaluation and product market collusion." The Review of Economics and Statistics 81 (1999): 303–313.
JORDAN, W. C., AND S. C. GRAVES. "Principles on the benefits of manufacturing process flexibility." Management Science 41 (1995): 577–594.
KIM, S., AND J. Y. SHIN. "Executive bonus target ratcheting: Evidence from the new executive compensation disclosure rules." Contemporary Accounting Research 34 (2016): 1843–1879.
LAFFONT, J. J., AND J. TIROLE. A theory of incentives in procurement and regulation. Cambridge, Massachusetts: The MIT Press, 1993.
LAZEAR, E. P. "Pay equality and industrial politics." Journal of Political Economy 97 (1989): 561–580.
LEONE, A. J., AND S. ROCK. "Empirical tests of budget ratcheting and its effect on managers' discretionary accrual choices." Journal of Accounting & Economics 33 (2002): 43–67.
MATĚJKA, M., AND K. RAY. "Balancing difficulty of performance targets: Theory and evidence." Review of Accounting Studies (2017): forthcoming.
MATSUMURA, E. M., AND J. Y. SHIN. "An empirical analysis of an incentive plan with relative performance measures: Evidence from a postal service." The Accounting Review 81 (2006): 533–566.
MEYER, M. A., AND J. VICKERS. "Performance comparisons and dynamic incentives." Journal of Political Economy 105 (1997): 547–581.
MILGROM, P., AND J. ROBERTS. Economics, organization and management. Englewood Cliffs: Prentice Hall, 1992.
MURPHY, K. J. "Performance standards in incentive contracts." Journal of Accounting & Economics 30 (2001): 245–278.
RAJGOPAL, S., T. SHEVLIN, AND V. ZAMORA. "CEOs' outside employment opportunities and the lack of relative performance evaluation in compensation contracts." Journal of Finance 61 (2006): 1813–1844.
VRETTOS, D. "Are relative performance measures in CEO incentive contracts used for risk reduction and/or for strategic interaction?" The Accounting Review 88 (2013): 2179–2212.
WEITZMAN, M. L. "The ratchet principle and performance incentives." Bell Journal of Economics 11 (1980): 302–308.
FIGURE 1 Monthly Revenue and Capacity Utilization – Business Unit Example
[Figure not reproduced. Upper panel: Revenue [€] by month, Jan to Dec; series: Actual Revenue, Target Revenue. Lower panel: Hours by month, Jan to Dec; legend text as extracted: Actual Billed Available Capacity Unused.]
TABLE 1 Descriptive Statistics
Obs. Mean St. Dev. 25th Pct. Median 75th Pct.
Ti,t+1 – Ti,t 106 -0.01 0.05 -0.03 0.00 0.01
Ai,t – Ti,t 106 0.02 0.15 -0.08 -0.01 0.08
ACHIEVEi,t 106 0.45 0.50 0.00 0.00 1.00
APeer,t – Ti,t 106 0.02 0.13 -0.06 0.00 0.07
SDPERF 106 0.11 0.07 0.06 0.09 0.14
PCORR 106 0.14 0.29 -0.06 0.19 0.35
DISTANT 106 0.58 0.61 0.13 0.37 0.9
SIMILAR 106 0.44 0.26 0.24 0.3 0.62
MATCH 106 0.14 0.15 0.04 0.10 0.19
SIZE 106 7.04 1.18 6.42 6.97 7.60
GROWTH 106 0.27 0.91 -0.33 0.17 0.71
GDP 106 2.58 3.01 2.96 2.97 3.11
Ti,t+1 – Ti,t —annual earnings target revision scaled by revenues of BU i. Ai,t – Ti,t—own performance, i.e., the difference between actual and targeted earnings scaled by revenues. ACHIEVEi,t—indicator variable for performance meeting or exceeding its annual target. APeer,t – Ti,t—peer performance, i.e., the difference between average earnings of BUs in the same business group and own earnings target scaled by revenues. SDPERF—standard deviation in BU actual earnings scaled by revenues (own performance) over 36 months. PCORR—correlation between BU own performance and average performance of its peers over 36 months. DISTANT—geographical distance between a BU and point of gravity of its peers in the same business group in 1,000 kilometers. SIMILAR—business group level Hirschman-Herfindahl Index of revenues in industry-job type clusters. MATCH—proportion of business group peers at capacity during the months BU i has slack capacity. SIZE—logarithm of BU inflation-adjusted annual direct labor costs converted from local currency to € thousands. GROWTH—logarithm of BU growth in inflation-adjusted revenue. GDP—GDP per capita growth for country in which the BU is located.
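As an illustration of how a concentration measure such as SIMILAR could be computed, the following minimal sketch calculates a Hirschman-Herfindahl Index as the sum of squared revenue shares. The function name and the cluster revenue figures are hypothetical; the study's data and its exact construction of SIMILAR are not reproduced here.

```python
def herfindahl_index(revenues):
    """Return the sum of squared revenue shares (ranges from 1/n to 1)."""
    total = sum(revenues)
    if total <= 0:
        raise ValueError("total revenue must be positive")
    return sum((r / total) ** 2 for r in revenues)


# Hypothetical revenues in four industry-job type clusters of one business group.
concentrated = [90.0, 5.0, 3.0, 2.0]
dispersed = [25.0, 25.0, 25.0, 25.0]

print(round(herfindahl_index(concentrated), 4))  # 0.8138
print(round(herfindahl_index(dispersed), 4))     # 0.25
```

A higher value means that revenue is concentrated in fewer clusters; with four equally sized clusters the index reaches its minimum of 0.25.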
TABLE 2 Pearson Correlations
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
(1) Ti,t+1 – Ti,t
(2) Ai,t – Ti,t 0.37***
(3) ACHIEVEi,t 0.39*** 0.79***
(4) APeer,t – Ti,t 0.46*** 0.61*** 0.42***
(5) SDPERF 0.11 -0.01 -0.03 0.18*
(6) PCORR -0.21** 0.18* 0.03 0.22** -0.15
(7) DISTANT 0.09 0.08 0.06 0.16* 0.27*** -0.21**
(8) SIMILAR 0.18* 0.51*** 0.28*** 0.59*** 0.12 0.21** -0.12
(9) MATCH 0.37*** 0.16* 0.12 0.08 -0.01 -0.12 -0.13 0.21**
(10) SIZE -0.14 -0.11 -0.08 -0.15 -0.54*** 0.02 -0.18* 0.01 -0.09
(11) GROWTH -0.06 0.35*** 0.17* 0.33*** 0.11 0.18* 0.05 0.22** -0.10 -0.05
(12) GDP 0.12 0.13 0.18* 0.15 -0.21** -0.01 -0.08 -0.11 0.04 -0.13 -0.04
(13) PRICE 0.13 0.18* 0.07 0.23** 0.34*** -0.01 0.16* 0.15 0.12 0.01 0.19* -0.51***
***,**,* indicate significance at the 1%, 5%, and 10% levels, respectively. PRICE—inflation-adjusted hourly charges to customers in the business group’s country converted from local currency to €. See Table 1 for all other variable definitions.
TABLE 3A Importance of Cooperation: Validation Analysis
(1) (2) (3) (4) (5)
(1) DISTANT
(2) SIMILAR -0.12
(3) MATCH -0.13 0.21**
(4) SLACK LABOR 0.22** -0.22** -0.26***
(5) THIRD PARTY 0.19** -0.27*** -0.26*** 0.13
(6) PROFITM -0.09 0.31*** 0.18* -0.31*** -0.27***
***,**,* indicate significance at the 1%, 5%, and 10% levels, respectively. DISTANT—geographical distance between a BU and point of gravity of its peers in the same business group in 1,000 kilometers. SIMILAR—business group level Hirschman-Herfindahl Index of revenues in industry-job type clusters. MATCH—proportion of business group peers at capacity during the months BU i has slack capacity. SLACK LABOR—abnormal labor hours, i.e., the residuals from a regression of BU direct labor cost on the average hourly labor rate, sales, GROWTH, GDP, PRICE, and year fixed effects. THIRD PARTY—labor hours requested from outside providers as a percentage of the contractually available maximum at the business group level. PROFITM—BU profit margin.
TABLE 3B Importance of Cooperation: Validation Analysis

Panel A Dependent Variable: SLACK LABOR
VAR stands for          (i) DISTANT     (ii) SIMILAR    (iii) MATCH
VAR                     497.46***       -1471.15**      -2445.80***
                        (3.02)          (2.50)          (3.48)
Control Variables       Yes             Yes             Yes
Year Fixed Effects      Yes             Yes             Yes
R2                      0.05            0.06            0.07

Panel B Dependent Variable: THIRD PARTY
VAR stands for          (i) DISTANT     (ii) SIMILAR    (iii) MATCH
VAR                     0.02*           -0.13***        -0.20***
                        (1.89)          (3.92)          (4.18)
Control Variables       Yes             Yes             Yes
Year Fixed Effects      Yes             Yes             Yes
R2                      0.36            0.43            0.43

Panel C Dependent Variable: PROFITM
VAR stands for          (i) DISTANT     (ii) SIMILAR    (iii) MATCH
VAR                     -0.03           0.21***         0.13
                        (1.42)          (3.33)          (1.56)
Control Variables       Yes             Yes             Yes
Year Fixed Effects      Yes             Yes             Yes
R2                      0.19            0.29            0.19

***,**,* indicate significance at the 1%, 5%, and 10% levels, respectively, using robust standard errors. Corresponding two-tailed t-values are reported in parentheses. Intercepts are included in estimation but untabulated. The estimations in Panels A and C use the full sample of 106 observations. Six missing values of THIRD PARTY reduce the sample size to 100 in Panel B. See Table 3A for variable definitions.
TABLE 4 Target Revisions: Baseline Models
Dependent Variable: Ti,t+1 – Ti,t

                                                Performance Relative to Peers
                            (i) Full Sample     (ii) High       (iii) Low
Constant                    -0.05               0.04            0.09
                            (0.81)              (0.78)          (0.63)
Ai,t – Ti,t                 0.29***             0.91***         0.29**
                            (3.98)              (2.94)          (2.66)
ACHIEVEi,t                  0.01                0.00            0.03
                            (1.03)              (0.20)          (1.31)
ACHIEVEi,t · Ai,t – Ti,t    -0.32***            -0.94***        -0.47**
                            (2.91)              (2.89)          (2.28)
APeer,t – Ti,t              0.36***
                            (4.66)
Control Variables           Yes                 Yes             Yes
Peer Group Fixed Effects    Yes                 Yes             Yes
Year Fixed Effects          Yes                 Yes             Yes
R2                          0.66                0.83            0.61
Sample size                 106                 50              56

***,**,* indicate significance at the 1%, 5%, and 10% levels, respectively, using robust standard errors. Corresponding two-tailed t-values are reported in parentheses. See Table 1 for variable definitions.
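The interaction coefficients in Table 4 can be combined with the main effects to read off the implied asymmetry in target revisions. The sketch below assumes the standard reading of an interaction with an indicator variable: the coefficient on Ai,t – Ti,t is the sensitivity when the target was missed (ACHIEVE = 0), and the sum of that coefficient and the interaction coefficient is the sensitivity when the target was met or exceeded (ACHIEVE = 1). The dictionary and function names are hypothetical, and the sums are point estimates only; no statement about their statistical significance is implied.

```python
# Coefficients copied from Table 4: (Ai,t - Ti,t, ACHIEVEi,t x (Ai,t - Ti,t)).
TABLE4 = {
    "full_sample": (0.29, -0.32),
    "high": (0.91, -0.94),
    "low": (0.29, -0.47),
}


def implied_sensitivities(base, interaction):
    """Slope of the target revision on own performance for ACHIEVE = 0 and 1."""
    return base, round(base + interaction, 2)


for column, (base, interaction) in TABLE4.items():
    missed, achieved = implied_sensitivities(base, interaction)
    print(f"{column}: missed={missed:.2f}, achieved={achieved:.2f}")
# full_sample: missed=0.29, achieved=-0.03
# high: missed=0.91, achieved=-0.03
# low: missed=0.29, achieved=-0.18
```

Under this reading, the point estimates imply that targets respond to own performance mainly when the target was missed, and hardly at all in the full sample when it was met or exceeded.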
TABLE 5A Cross-Sectional Variation in Target Revisions
Dependent Variable: Ti,t+1 – Ti,t

VAR stands for              (i) SDPERF      (ii) PCORR      (iii) DISTANT   (iv) SIMILAR    (v) MATCH
Constant                    -0.10*          -0.05           -0.07           -0.04           0.00
                            (1.89)          (1.05)          (1.33)          (0.82)          (0.00)
Ai,t – Ti,t                 0.52***         0.28***         0.35***         0.22***         0.28***
                            (5.23)          (4.13)          (5.43)          (2.90)          (3.97)
ACHIEVEi,t                  -0.00           0.00            0.01**          0.02**          0.01
                            (0.02)          (0.47)          (2.01)          (2.22)          (0.67)
ACHIEVEi,t · Ai,t – Ti,t    -0.27**         -0.30***        -0.22**         -0.38***        -0.44***
                            (2.51)          (2.99)          (2.36)          (3.92)          (4.29)
APeer,t – Ti,t              0.17*           0.38***         0.23***         0.73***         0.47***
                            (1.85)          (5.75)          (3.39)          (5.99)          (5.84)
VAR                         0.11            -0.07***        -0.02**         0.11***         0.01
                            (1.56)          (3.52)          (2.44)          (2.77)          (0.41)
VAR · Ai,t – Ti,t           -1.19***        -0.04           -0.20***        0.17*           0.40**
                            (2.85)          (0.65)          (5.54)          (1.72)          (2.01)
VAR · APeer,t – Ti,t        1.22***         0.21***         0.32***         -0.73***        -0.64***
                            (2.95)          (2.86)          (4.53)          (4.28)          (3.07)
Control Variables           Yes             Yes             Yes             Yes             Yes
Peer Group Fixed Effects    Yes             Yes             Yes             Yes             Yes
Year Fixed Effects          Yes             Yes             Yes             Yes             Yes
R2                          0.72            0.73            0.80            0.76            0.73
Sample size                 106             106             106             106             106

***,**,* indicate significance at the 1%, 5%, and 10% levels, respectively, using robust standard errors. Corresponding two-tailed t-values are reported in parentheses. See Table 1 for variable definitions.
TABLE 5B Cross-Sectional Variation in Target Revisions
Dependent Variable: Ti,t+1 – Ti,t

                            Predicted
VAR stands for              Sign        (i)             (ii) SDPERF     (iii) PCORR
Constant                                -0.03           -0.08**         -0.04
                                        (0.74)          (2.15)          (1.16)
Ai,t – Ti,t                             0.25***         0.45***         0.24***
                                        (3.31)          (4.95)          (3.72)
ACHIEVEi,t                              0.01*           0.01            0.01
                                        (1.94)          (0.78)          (1.29)
ACHIEVEi,t · Ai,t – Ti,t                -0.44***        -0.42***        -0.40***
                                        (4.19)          (4.36)          (3.96)
APeer,t – Ti,t                          0.58***         0.42***         0.58***
                                        (6.17)          (4.29)          (7.16)
VAR                                                     0.11**          -0.03**
                                                        (2.01)          (2.49)
VAR · Ai,t – Ti,t           H1a: –                      -1.10***        -0.06
                                                        (3.78)          (1.09)
VAR · APeer,t – Ti,t        H1b: +                      0.86***         0.21***
                                                        (2.88)          (2.75)
CINDEX                                  0.16            0.25**          0.15
                                        (1.17)          (2.07)          (1.39)
CINDEX · Ai,t – Ti,t        H2a: +      1.13**          1.16***         0.97**
                                        (2.62)          (3.06)          (2.40)
CINDEX · APeer,t – Ti,t     H2b: –      -2.63***        -2.18***        -2.38***
                                        (3.82)          (4.18)          (3.68)
Control Variables                       Yes             Yes             Yes
Peer Group Fixed Effects                Yes             Yes             Yes
Year Fixed Effects                      Yes             Yes             Yes
R2                                      0.79            0.83            0.81
Sample size                             106             106             106

***,**,* indicate significance at the 1%, 5%, and 10% levels, respectively, using robust standard errors. Corresponding two-tailed t-values are reported in parentheses. CINDEX is an overall index for the importance of cooperation. See Table 1 for all other variable definitions.