order stability in supply chains: coordination risk and ...jsterman.scripts.mit.edu/docs/order...

21
Order Stability in Supply Chains: Coordination Risk and the Role of Coordination Stock Rachel Croson School of Economics, University of Texas at Dallas, Dallas, Texas 75080-3021, USA, [email protected] Karen Donohue The Carlson School, University of Minnesota, Minneapolis, Minnesota 55455-9940, USA, [email protected] Elena Katok Jindal School of Management, University of Texas at Dallas, Dallas, Texas 75080-3021, USA, [email protected] John Sterman Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA, [email protected] T he bullwhip effect describes the tendency for the variance of orders in supply chains to increase as one moves upstream from consumer demand. We report on a set of laboratory experiments with a serial supply chain that tests behavioral causes of this phenomenon, in particular the possible influence of coordination risk. Coordination risk exists when individuals’ decisions contribute to a collective outcome and the decision rules followed by each individual are not known with certainty, for example, where managers cannot be sure how their supply chain partners will behave. We con- jecture that the existence of coordination risk may contribute to bullwhip behavior. We test this conjecture by controlling for environmental factors that lead to coordination risk and find these controls lead to a significant reduction in order oscillations and amplification. Next, we investigate a managerial intervention to reduce the bullwhip effect, inspired by our conjecture that coordination risk contributes to bullwhip behavior. Although the intervention, holding additional on-hand inventory, does not change the existence of coordination risk, it reduces order oscillation and amplification by providing a buffer against the endogenous risk of coordination failure. We conclude that the magnitude of the bullwhip can be mitigated, but that its behavioral causes appear robust. Key words: bullwhip effect; behavioral operations; supply chain management; beer distribution game History: Received: April 2011; Accepted: May 2012 by Christoph Loch, after 2 revisions. 1. Introduction Despite the undoubted benefits of the lean manufac- turing and supply chain revolutions of the past dec- ade, supply chain instability continues to plague many businesses (Ellram 2010). The consequences include excessive inventories, periodic stockouts, poor customer service, and unnecessary capital investment. These outcomes have recently been docu- mented in many manufacturing and wholesale sec- tors of the economy (e.g., Aeppel 2010, Dooley et al. 2010). Supply chain instability is often attributed, in part, to the bullwhip effect. The bullwhip effect refers to the tendency for order variability to increase within a supply chain as orders move upstream from customer sales to production. Lee et al. (1997) and Sterman (2000) cite evidence of this phenomenon in a wide range of industries. Cachon et al. (2007) find that the phenomenon is most prevalent in wholesale industries and industries with low seasonality. Chen and Lee (2011) provide a detailed analysis of how the level of time aggregation used in measuring the bullwhip effect can influence whether the effect is detected or not. They also provide insight into what level of aggregation is most appropriate for captur- ing the connection between bullwhip behavior and supply chain cost. Previous research attributes the bullwhip effect to both operational and behavioral causes. Operational causes are structural characteristics that lead rational agents to amplify demand variation. Examples include order batching, gaming due to shortages, price fluctuations caused by promotions, and demand signaling due to forecast uncertainty (Lee et al. 1997). These causes have been documented in practice, and techniques to eliminate them are now an important part of the tool kit for supply chain management (Lee et al. 2004). Behavioral causes, in contrast, emphasize the bounded rationality of 1 Vol. 0, No. 0, xxxx–xxxx 2013, pp. 1–21 DOI 10.1111/j.1937-5956.2012.01422.x ISSN 1059-1478|EISSN 1937-5956|13|00|0001 © 2013 Production and Operations Management Society

Upload: others

Post on 17-Mar-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

Order Stability in Supply Chains: Coordination Riskand the Role of Coordination Stock

Rachel CrosonSchool of Economics, University of Texas at Dallas, Dallas, Texas 75080-3021, USA, [email protected]

Karen DonohueThe Carlson School, University of Minnesota, Minneapolis, Minnesota 55455-9940, USA, [email protected]

Elena KatokJindal School of Management, University of Texas at Dallas, Dallas, Texas 75080-3021, USA, [email protected]

John StermanSloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA, [email protected]

T he bullwhip effect describes the tendency for the variance of orders in supply chains to increase as one movesupstream from consumer demand. We report on a set of laboratory experiments with a serial supply chain that tests

behavioral causes of this phenomenon, in particular the possible influence of coordination risk. Coordination risk existswhen individuals’ decisions contribute to a collective outcome and the decision rules followed by each individual are notknown with certainty, for example, where managers cannot be sure how their supply chain partners will behave. We con-jecture that the existence of coordination risk may contribute to bullwhip behavior. We test this conjecture by controllingfor environmental factors that lead to coordination risk and find these controls lead to a significant reduction in orderoscillations and amplification. Next, we investigate a managerial intervention to reduce the bullwhip effect, inspiredby our conjecture that coordination risk contributes to bullwhip behavior. Although the intervention, holding additionalon-hand inventory, does not change the existence of coordination risk, it reduces order oscillation and amplification byproviding a buffer against the endogenous risk of coordination failure. We conclude that the magnitude of the bullwhipcan be mitigated, but that its behavioral causes appear robust.

Key words: bullwhip effect; behavioral operations; supply chain management; beer distribution gameHistory: Received: April 2011; Accepted: May 2012 by Christoph Loch, after 2 revisions.

1. Introduction

Despite the undoubted benefits of the lean manufac-turing and supply chain revolutions of the past dec-ade, supply chain instability continues to plaguemany businesses (Ellram 2010). The consequencesinclude excessive inventories, periodic stockouts,poor customer service, and unnecessary capitalinvestment. These outcomes have recently been docu-mented in many manufacturing and wholesale sec-tors of the economy (e.g., Aeppel 2010, Dooley et al.2010).Supply chain instability is often attributed, in part,

to the bullwhip effect. The bullwhip effect refers to thetendency for order variability to increase within asupply chain as orders move upstream fromcustomer sales to production. Lee et al. (1997) andSterman (2000) cite evidence of this phenomenon ina wide range of industries. Cachon et al. (2007) findthat the phenomenon is most prevalent in wholesale

industries and industries with low seasonality. Chenand Lee (2011) provide a detailed analysis of howthe level of time aggregation used in measuring thebullwhip effect can influence whether the effect isdetected or not. They also provide insight into whatlevel of aggregation is most appropriate for captur-ing the connection between bullwhip behavior andsupply chain cost.Previous research attributes the bullwhip effect to

both operational and behavioral causes. Operationalcauses are structural characteristics that lead rationalagents to amplify demand variation. Examplesinclude order batching, gaming due to shortages,price fluctuations caused by promotions, anddemand signaling due to forecast uncertainty (Leeet al. 1997). These causes have been documented inpractice, and techniques to eliminate them are nowan important part of the tool kit for supply chainmanagement (Lee et al. 2004). Behavioral causes, incontrast, emphasize the bounded rationality of

1

Vol. 0, No. 0, xxxx–xxxx 2013, pp. 1–21 DOI 10.1111/j.1937-5956.2012.01422.xISSN 1059-1478|EISSN 1937-5956|13|00|0001 © 2013 Production and Operations Management Society

Page 2: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

decision makers, particularly the failure to ade-quately account for feedback effects and time delays.Specifically, people tend to place orders based on thegap between their target inventory level and currenton-hand stock, while giving insufficient weight tothe supply line of unfilled orders (the stock of ordersplaced but not yet received). Prior work demon-strates that the tendency to underweight the supplyline is sufficient to cause the bullwhip effect in bothexperimental and real supply chains (e.g., Sterman1989a, 2000).The goal of this study is to extend our understand-

ing of the behavioral causes of the bullwhip effect byexamining how individuals perform when all opera-tional causes of the bullwhip effect are eliminated. Inthe absence of operational causes, any remaining sup-ply chain instability and bullwhip must arise frombehavioral causes. We test this hypothesis in a base-line experiment, which follows the general protocol ofthe beer distribution game except that demand uncer-tainty is eliminated by using a constant demand ratethat is publicly announced to all participants.1 Theexperimental setup significantly simplifies the deci-sions participants must make compared with realsupply chains as well as prior experimental research,by effectively eliminating all previously cited opera-tional causes of the bullwhip effect. Also, becausedemand is fixed at a known and constant rate, there isno need for safety stock, eliminating possible cogni-tive errors in computing the appropriate safety stocklevel that might occur with a more complex demanddistribution. Additionally, the system is started inequilibrium, at the optimal on-hand inventory level ofzero, so there is no need for participants to calculatethe optimal transient path to reach an equilibriumlevel of inventory.Nevertheless, we find that order oscillation and

amplification persist even in this simplified environ-ment. Estimation of participants’ decision rules showsthat the vast majority underweight the supply line ofunfilled orders, consistent with prior studies. Post-play questionnaires and participant debriefing sug-gest that many players were unable to predict howtheir teammates would behave.The results suggest that the existence of coordination

risk may partly account for the observed behavior.Coordination risk exists when individuals in a groupenvironment make independent decisions that con-tribute to the collective outcome and the decisionrules followed by each individual are not known withcertainty. Such uncertainty arises when decision mak-ers have limited knowledge of, or trust in, their part-ners’ motives or cognitive abilities. This uncertaintymay cause decision makers to deviate from equilib-rium to hedge against the possibility that their part-ners will not behave optimally. In real supply chains,

inventory managers often have limited understand-ing of each other’s decision rules or local incentives.Such uncertainty is particularly likely when the opti-mal decision rule is not commonly known.The next two experiments test the effect of eliminat-

ing environmental factors that may lead to coordina-tion risk by first creating common knowledge aboutthe optimal ordering policy (experiment 2) and thenensuring that all supply chain partners of the focalhuman subject follow the optimal policy by automat-ing their decisions (experiment 3). We find that orderoscillation and amplification are significantly reducedin these experiments compared with the baseline.However, the large majority of participants continueto underweight the supply line.Lastly, in experiment 4, we test a managerial inter-

vention to test our conjecture that coordination riskcontributes to bullwhip behavior. Although the inter-vention, providing participants with additional on-hand inventory, does not eliminate coordination risk,we find that it reduces order oscillation and amplifica-tion by providing a buffer against the endogenousrisk of coordination failure. The results suggest a newpurpose for inventory, which we term coordinationstock. Coordination stock differs from conventionalsafety stock held to buffer against exogenous uncer-tainty in demand or deliveries. In contrast, coordina-tion stock is inventory held to buffer against strategicuncertainty about the actions of other supply chainmembers. We find that a modest level of coordinationstock significantly lowers order variability and costs,suggesting a rationale for what may appear to beexcess inventory in decentralized supply chains.Overall, our results support the role of behavioral

causes in the bullwhip effect. Coordination risk andsupply line underweighting are likely both presentand contribute to deviation from optimal behavior.Directly reducing the strategic uncertainty that con-tributes to coordination risk significantly improvesperformance, although continued supply line under-weighting means that oscillation and amplificationstill exist. Holding coordination stock does not reducecoordination risk directly, but instead alleviates itseffect, and thus also has a positive effect on perfor-mance. That said, supply line underweighting ishighly robust—it does not vary significantly by sup-ply chain role and is only slightly moderated by theexperimental treatments.We continue in the next section with a brief over-

view of prior literature and its connection to our mainhypotheses. Section 3 provides details of the experi-mental protocol, followed in section 4 by a descrip-tion of the data generated by the experiments and theassociated analysis. Section 5 provides a generaldiscussion of the results and possible psychologicalprocesses behind the behavior we observed. Section 6

Croson, Donohue, Katok, and Sterman: Order Stability in Supply Chains2 Production and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society

Page 3: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

concludes with potential managerial implications anddirections for future research.

2. Theory Development andHypotheses

Our research fits within the growing field of behav-ioral operations management. This field aims to iden-tify, analyze, and fill gaps in understanding betweenoperations theory and actual human behavior. Suchwork requires adapting tools and theories from socialscience disciplines, such as experimental economics,psychology, system dynamics, and organizationalbehavior. The behavioral operations managementliterature relating to this study examines inventoryordering decisions in multi-player settings, includingsupply chains. In this stream of research, summarizedin Croson and Donohue (2002), researchers havefocused on estimating individual decision rules forordering and comparing them with optimal rules, forexample, Sterman (1989a,b). Much of this researchhas been conducted in the context of the beer distribu-tion game. The game is well suited to examine behav-ioral issues as it is simple enough for people to learnquickly, while retaining key features of real supplychains.The major difference between our baseline setting

(experiment 1) and prior work using the beer distribu-tion game is the treatment of customer demand. Inprevious studies, customer demand both variedacross time periods (i.e., was either stochastic or non-stationary) and remains unknown to participants. Forexample, Sterman (1989a) and Steckel et al. (2004) useunknown, non-stationary but deterministic demandfunctions (a step function and S-curve, respectively).Croson and Donohue (2003, 2006) use a known, sta-tionary uniform distribution, but agents (other thanthe retailer) do not know the realizations of demand.Due to the order fulfillment delay, agents must fore-cast future demand. It is conceivable that the bull-whip in these settings results, in part, from demanduncertainty and resulting forecasting errors (Chenet al. 2000).We eliminate demand variability and forecasting as

potential causes of the bullwhip by using a constantdemand of four cases per period and publicly inform-ing participants of this fact before the game begins.When demand is constant and publicly known, thereis no need for safety stock. Furthermore, unlike priorstudies, we initialize the system in equilibrium, sothere is no incentive for participants to adjust the cur-rent stock level.By adding these controls, our baseline experiment

is designed to test whether the bullwhip effect stillexists when all operational causes and operationalsources of uncertainty are removed. The only

uncertainty facing participants lies in the decisionprocesses other supply chain team members use tomake their order decisions. It is trivial to show thatcosts are minimized when each decision maker sim-ply passes through the order they receive. Under thispolicy, the system remains in equilibrium, and thereis no order oscillation or amplification (i.e., no bull-whip). Hence, theory suggests:

Hypothesis 1: The bullwhip effect (order oscillationand amplification) will not occur when demand isknown and constant and the system begins in equi-librium.

This (null) hypothesis implicitly assumes that eithercoordination risk does not exist or, if it does exist, itdoes not affect ordering behavior.The importance of coordination in collective deci-

sion settings is well studied in the context of coordi-nation games (e.g., Cachon and Camerer 1996, VanHuyck et al. 1990, 1991, 1993). However, that litera-ture focuses primarily on games with multiple equi-libria, in which the failure to coordinate causesplayers to converge to the risk-dominant instead ofthe payoff-dominant equilibrium. A common goal inthat literature, which we share, is to identify and testmechanisms designed to help players coordinate.Unlike the traditional coordination games, our sup-

ply chain setting does not have multiple equilibria.The equilibrium is unique, but time delays, accumula-tions, and feedback in the supply chain cause theequilibrium to be potentially unstable. Failure to coor-dinate may induce instability, oscillation, and otherdisequilibrium dynamics that complicate or prevent areturn to equilibrium. Thus, our supply chain settinguncovers new aspects of the coordination problem. Inour setting, coordination risk may arises from twofactors: (i) a lack of common knowledge about thedecision rule that keeps the system in equilibrium,and (ii) a lack of trust that the other players willimplement that rule even if it is commonly known.The concept of common knowledge originated in

game theory (Aumann 1976) and is considered a nec-essary component of equilibrium in games. Vander-schraaf and Sillari (2005) provide a comprehensivedescription of the concept and its history in academicresearch. Agents often act on the basis of incompleteinformation; decision makers must always determinewhat they know and what they do not know beforemaking a decision. When the outcome involves morethan one agent, each decision maker must alsoconsider what others may know. Common knowl-edge goes beyond mutual knowledge. Under mutualknowledge, each agent privately knows the sameinformation. In our case, this information wouldbe the optimal ordering policy. Mutual knowledgeis typically implemented in the lab by privately

Croson, Donohue, Katok, and Sterman: Order Stability in Supply ChainsProduction and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society 3

Page 4: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

informing all participants about relevant information.Common knowledge goes a step further by requiringnot only that agents know the information but thatthey also know that all the other players also knowthis information. In addition, players know that otherplayers also know that everyone knows this informa-tion and so on ad infinitum. Common knowledge istypically implemented in the laboratory by publiclyproviding the relevant information to participants.After the public announcement, everyone knows thateveryone else has been informed, etc.Game-theoretic models and experimental studies

show that failures of common knowledge can lead tocoordination failures and inefficient outcomes in avariety of settings (e.g., Geanakoplos 1992, Lei et al.2001, Nagel 1995). We contribute to this literature bytesting the effect of creating common knowledge onreducing bullwhip behavior, which is a natural out-come of reducing coordination risk. If the bullwhip iscaused or triggered by coordination risk, then provid-ing common knowledge may reduce both order vari-ability and amplification (compared with the baselineexperiment). This leads to our first hypothesis con-cerning coordination risk.

Hypothesis 2: The provision of common informa-tion through the announcement of the optimal policywill reduce the bullwhip effect (order oscillation andamplification).

Also, as the description of the optimal policy explic-itly directs participants to consider both on-hand andon-order inventory, theory suggests that decisionmaking at the individual level may improve by lead-ing agents to fully account for the supply line. Hence,

Hypothesis 3: The provision of common informa-tion through the announcement of the optimal policywill reduce supply line underweighting.

The second element of coordination risk is trust thatother agents will actually carry out the optimal order-ing policy. Although others have studied the effect ofa lack of common knowledge on experimental out-comes, they have used settings sufficiently simple sothat trust is typically not an issue (see Ochs 1995 for areview). A large literature in economics and organiza-tional behavior examines the effect of trust in others’benign intentions (see, e.g., Rousseau et al. 1998).Here, however, we focus not on intentions but on trustin the ability of other players to implement optimaldecisions correctly. The closest previous researchinvestigating this type of trust is Schwieren and Sutter(2008), who compare individuals’ trust in others’ goodintentions toward themwith trust in others’ abilities.If our explanation for the cause of order oscilla-

tion and amplification in the baseline is correct,eliminating the possibility of coordination failure, by

requiring one’s supply chain partners to follow theoptimal policy, should eliminate (or at least signifi-cantly reduce) order oscillation and amplification.Overall supply chain performance will improve bydefinition, as the automated agents play optimally.The relevant comparison is thus how well the humanplayers perform here compared with the retailersfrom previous experiments, who face the same incom-ing order stream. With the guarantee that all otherswill play optimally, the performance of the humanplayers should improve significantly, and we shouldsee less order oscillation.

Hypothesis 4: Eliminating coordination risk willdecrease order oscillation for the human players, rela-tive to the retailers in the (i) baseline and (ii) com-mon knowledge experiments.

Finally, as the supply line is now more predictable,it should be easier to track. Also, freed of the need todevote scarce cognitive resources to anticipating howtheir teammates will behave, individuals may bemore likely to use the optimal policy and fullyaccount for the supply line. Hence,

Hypothesis 5: Eliminating coordination risk willreduce supply line underweighting.

Following a common goal of prior coordinationresearch, it is valuable to consider what mechanismscould be used to mitigate the effects of coordinationrisk when the source of the risk itself cannot beremoved. One possible policy intervention is to holdexcess inventory to buffer against the interactioneffects caused by the possible sub-optimal decisionsmade by other supply chain members. Such a stabiliz-ing benefit has been suggested by others (e.g., Cohenand Baganha 1998), but never tested in an experimen-tal setting. Previous research has identified opera-tional factors affecting the optimal level of safetystock, including the length of the replenishment leadtime, level of demand variability, and the relative costof holding inventory vs. carrying a backlog. These arethe common inputs used to manage inventory levelsin most continuous or periodic review inventory sys-tems (Zipkin 2000). Our research contributes to thisliterature by testing a new role for excess inventory—to buffer against strategic uncertainty about otherplayers’ decisions (as opposed to exogenous uncer-tainty about market demand, production lags, orother operational factors).Additional on-hand inventory may improve perfor-

mance for two reasons. First, initializing the systemwith positive on-hand inventory reduces the need forplayers who seek a buffer stock to increase ordersabove equilibrium, reducing the incidence and mag-nitude of deviations that may trigger the instabilitycaused by supply line underweighting. Second, initial

Croson, Donohue, Katok, and Sterman: Order Stability in Supply Chains4 Production and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society

Page 5: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

on-hand inventory reduces the likelihood that the sys-tem will enter the unstable backlog regime, moderat-ing the oscillations generated by players whounderweight the supply line. Hence,

Hypothesis 6: Excess initial inventory will decreasethe bullwhip effect (order oscillation and amplifica-tion).

On the other hand, because excess inventory neces-sarily initializes the system out of inventory equilib-rium, the behavior of the other members of thesupply chain may actually be more difficult to predicta priori. Rational agents would attempt to reduce theirinventory levels, but if they or their supply chain part-ners underweight the supply line, the result could beinstability and oscillations, perhaps even larger thanthose observed in the baseline experiment. Suchbehavior would refute Hypothesis 6.

3. Experimental Design

Our experiments follow the standard protocol for theBeer Distribution Game. The experiments were runusing a computer network with an interface writtenin Visual Basic. Complete instructions are available inan on-line supplement.2 All sessions were conductedat a large public university in the Northeastern UnitedStates in the spring semester of 2003.The Beer Distribution Game consists of four

agents (retailer, wholesaler, distributor, and factory),each making ordering decisions for one link in aserial supply chain with exogenous final customerdemand (Figure 1). The chain operates as a multi-echelon inventory system in discrete time withdemand backlogging, infinite capacity, and bothorder processing and shipping lags (see Chen 1999,Clark and Scarf 1960 for classic models of such sys-tems). Each period (corresponding to 1 week) allplayers experience the following sequence of events:(i) shipments from the upstream decision maker arereceived and placed in inventory, (ii) incomingorders are received from the downstream decisionmaker and either filled (if inventory is available) orplaced in backlog, and (iii) a new order is placed

and passed to the upstream player. Croson andDonohue (2006) and Sterman (1989a) provide moredetails on the dynamics of the game.As in most previous studies, the retailer, whole-

saler, and distributor (R, W, D) face a 2-week lagbetween placing an order and fulfillment by theirimmediate supplier and an additional 2-week trans-portation lag between fulfillment by the supplier anddelivery (a total lag of 4 weeks, assuming the supplierhas inventory in stock). The factory (F) has a 1-weeklag between placing orders (setting the productionschedule) and starting production and a 2-week pro-duction cycle (a total lag of 3 weeks). As in earlierstudies, costs are $0.50/week for each unit held ininventory and $1/week for each unit backlogged.We eliminated demand variability and forecasting

as potential causes of the bullwhip in all four experi-ments by using a constant demand of 4 cases/weekand publicly informing participants of this fact beforethe game began. Further, the realization of demandeach week (four cases) was displayed on everyagent’s screen at all times, confirming visually thatdemand is indeed constant. After the rules wereexplained, but before play began, participants weregiven a quiz to confirm they understood the rules andcalculations, as well as the fact that customer demandwould be constant at 4 cases per week for the entiregame. All participants correctly answered these ques-tions, confirming that they understood the rules andoperation of the game.All experiments began in flow equilibrium with

orders and shipments of four cases at each delay step.The initial end-of-period on-hand stock level (afterdeliveries are received and orders filled) was set tozero for all supply chain levels. No time limit wasimposed for making order decisions during the game.All games were run for 48 weeks, although this infor-mation was not shared with the participants to avoidhorizon effects. After the game, but before receivingpayment, participants were asked to complete an on-line questionnaire inviting them to reflect on theirexperience.The four experiments varied in the level of common

knowledge of the optimal order policy, the automa-

Figure 1 The Beer Distribution Game Portrays a Serial Supply Chain

Croson, Donohue, Katok, and Sterman: Order Stability in Supply ChainsProduction and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society 5

Page 6: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

tion of supply chain partner decisions, and the initiallevel of on-hand inventory. Table 1 summarizes thesedifferences. In experiments 2 and 3, we provide com-mon knowledge about the optimal ordering policy, inaddition to common knowledge of customer demand(which is provided in all four experiments). This isaccomplished by including the following publicexplanation of the optimal policy at the beginning ofthe experiment:

The total team cost can be minimized if all teammembers place orders so as to make the total oftheir on-hand inventory and outstanding ordersequal to a pre-specified target level. This targetlevel is 16 for the retailer, wholesaler anddistributor, and 12 for the factory. This meansthat if the total on-hand inventory and outstand-ing orders is greater than or equal to the targetlevel, the order that minimizes team cost is 0.But if this total inventory is less than the targetlevel, the cost-minimizing order is to order justenough to bring it to the target.

The explanation was presented in written instruc-tions and explained publicly along with the gamerules. The explanation places no restrictions on partic-ipants’ actions, but the fact that the explanation ispublic provides common knowledge: each player isinformed of the cost-minimizing policy and knowsthat every other player is also so informed.In experiment 3, we further control the possible

sources of coordination risk by placing individuals ina supply chain with three automated players, each

programmed to use the optimal policy. We ran foursets of 10 supply chains with each set placing thehuman decision maker in a different role (i.e., 10experiments with a human retailer, 10 with a humanwholesaler, etc.). Participants are publicly told theoptimal, cost minimizing decision rule as in the previ-ous experiment, and also that all other members oftheir supply chain are automated and programmed tofollow that rule. All other conditions are identical tothe previous two experiments.In experiment 4, conditions are identical to the

baseline except that all participants begin with 12units of on-hand inventory rather than the optimallevel of zero. Twelve units is the initial inventorytraditionally used in the beer game, facilitating com-parison to prior work, and likely exceeds the desiredlevel of coordination stock (as suggested by the esti-mates of S’ in Table 2). Our purpose is not to find theoptimal level of coordination stock, but rather to testwhether directionally there is an improvement whenexcess inventory is provided.Table 1 also summarizes participant demograph-

ics for each treatment. A total of 160 participantswere recruited using an on-line recruitment system.All participants were drawn from the same studentpool and randomly assigned to one of the fourtreatments. Participants were paid a $5 show-upfee, plus up to $20 in bonus money depending ontheir team’s relative performance. Team perfor-mance was computed based on total supply chaincost (i.e., the cumulative sum of holding andbackorder costs across all four players) using the

Table 1 Conditions and Demographic Breakdown Across Experiments

Experiment 1(baseline)

Experiment 2(creating common

knowledge)

Experiment 3(eliminating

coordination risk)

Experiment 4(adding

coordination stock)

Publicly known and constant demand Yes Yes Yes YesInitial on-hand inventory set at equilibrium level Yes Yes Yes No (set to 12)Optimal order rule publicly announced No Yes Yes NoOne’s fellow supply chain members guaranteedto follow optimal policy (automated)

No No Yes No

Undergraduate* 90% 94.7% 98.7% 90.7%Gender: female 56.9% 31.6% 47.4% 39.5%Major: business 55.2% 63.2% 57.9% 72.1%Major: science & engineering 15.5% 10.5% 15.8% 9.3%Major: liberal arts, communications, and education 19.0% 21.1% 23.7% 14.0%Major: other† 10.3% 5.2% 2.6% 4.6%N (individuals) 40 40 40 40T (teams) 10 10 40 10N and T (minus outliers) 32, 8 40, 10 35, 35 28, 7

*All others are graduate students.†Other majors include HDFS, Agricultural Science, IST, EMS, Arts and Architecture, and undecided.Although participants were randomly assigned to experiments ex ante, we ran t-tests of proportions to check if there are systematic differences ex postin these demographics by experiment. Of the 36 comparisons run (each experiment vs. each other experiment on each dimension reported above), wefind only one significant difference (between the proportion of women in the baseline and common knowledge experiments). When running multiple tests,it is not uncommon to get false positive results (in fact, one in 20 tests will show a 5% significant difference by chance). In these situations, aBonferroni correction is typically used. Once this correction is applied, this gender difference is no longer statistically significant.

Croson, Donohue, Katok, and Sterman: Order Stability in Supply Chains6 Production and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society

Page 7: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

continuous payment scheme introduced by Crosonand Donohue (2006).We tested the data from each treatment for out-

liers using the Grubbs procedure (Grubbs 1969). Asmall number of outliers were found and deletedfrom the statistical analyses reported in the maintext (the Appendix details the treatment of outliers).Eliminating the outliers reduces the between-condi-tion differences, an a fortiori procedure that providesa tougher test of our hypotheses. For completeness,we also report results using the entire data set (i.e.,including outliers) in footnotes. All our findingshold at similar levels of significance with or withoutoutliers.

4. Experimental Results

4.1. Experiment 1: Baseline with Constant DemandFigure 2 shows orders and on-hand inventory foreach individual in our baseline experiment. The char-acteristic pattern of oscillation and amplificationobserved in prior research persists. None of the teamsremain close to the equilibrium throughput of4 cases/week. Excluding outliers, factory orders aver-age 19; the mean of the peak factory orders is 215 andthe maximum factory order is 500.Figure 3 displays the standard deviations of

orders for each role across the eight teams in ourdata analysis (teams 4 and 9 were identified as out-liers and excluded). Significant order oscillationclearly exists despite constant demand and commonknowledge of that fact. We use a nonparametricsign test to check for order amplification (Seigel1965, p. 68). Order amplification exists when thestandard deviation of the ith stage in the supplychain, ri, exceeds that of its immediate customer,ri�1 (where i ∈ {R, W, D, F}). If there were no bull-whip, we would observe ri > ri�1 at the chance rateof 50%. The data reveal ri > ri�1 for 79% of thecases, rejecting H1 at p = 0.0025.3 The bullwhipeffect persists even when customer demand isconstant and publicly known to all participants.4

4.1.1. Individual Ordering Behavior. To betterunderstand the decision processes of the participants,we specify a decision rule for managing inventoryand estimate its parameters for each participant.Comparing the estimated decision weights to theoptimal weights offers insight into participant deci-sion making.Following Sterman (1989a), we estimate the follow-

ing decision rule for Oi,t orders placed in week t bythe person in role i:

Oi;t ¼ Maxf0;COi;t þ aiðS0i � Si;t � biSLi;tÞ þ ei;tg ð1Þwhere CO is expected Customer Orders (ordersexpected from the participant’s customer next per-iod), S0 is desired inventory, S is actual on-handinventory (net of any backlog), and SL is the supplyline of unfilled orders (on-order inventory). Ordersare modeled as replacement of (expected) incomingorders modified by an adjustment to bring inventoryin line with the target.The parameter a is the fraction of the inventory

shortfall or surplus ordered each week. The parame-ter b is the fraction of the supply line the participantconsiders. The optimal value of b is 1, as participantsshould include on-order as well as on-hand inventorywhen assessing their net inventory position. Givenb = 1, the optimal value of a is also 1: since there areno adjustment costs, participants should order the

Table 2 Estimated Parameters for the Ordering Decision Rule(Equation 1).

h a b S’ R2 RMSE

Experiment 1 (Baseline)Median estimate 0.09 0.47 0.08 4.42 0.54 5.47Median width of 95%confidence interval

0.55 0.30 0.29 12.66

N 24 32 30 30Experiment 2 (Common knowledge)Median estimate 0.08 0.28 0.10 8.26 0.57 2.83Median width of 95%confidence interval

0.31 0.22 0.37 10.90

N 30 40 39 39Experiment 3 (Eliminating coordination risk)Median estimate 0.00 0.24 0.14 3.48 0.29 1.62Median width of 95%confidence interval

0.91 0.30 0.50 7.76

N 9 27 26 26Experiment 4 (Coordination stock)Median estimate 0.08 0.26 0.24 4.97 0.54 2.53Median width of 95%confidence interval

0.40 0.23 0.42 8.01

N 21 28 27 27Median estimates,Sterman (1989a)

0.25 0.28 0.30 15 0.76 2.60

N 44 44 40 40Median estimates,Croson andDonohue (2002)*

N/A 0.22 0.14 N/A 0.71

N 44 44 44

*Croson and Donohue (2002) estimated a slightly different decision rule:Oi,t = Max[0, aiSi,t + biSLi,t + ciIOi,t + di + ei,t] where IO is incomingorders and a, b, c, d are estimated. The inventory and supply line weightsa and b can be expressed in terms of a and b by interpreting cIO as theparticipant’s demand forecast and noting that the inventory correctionterm d + aS + bSL = a[(d/a) – (S + (b/a)SL)], implying a = �a andb = b/a.Note that the number N of estimates can differ for each parameter. N canbe smaller than the number of participants in each condition (40) for fourreasons. First, outliers are excluded. Second, h cannot be identified ifincoming orders are constant, which is always true for retailers and alsotrue for upstream players if the downstream player always orders 4 cases/week (an occurrence that is more common in experiment 3, where three ofthe four players are automated and play optimally at all times). Third, noneof the parameters can be identified when the human plays optimally byordering 4 cases/week at all times (also more prevalent in exp. 3). Finally,when the optimal estimate of a = 0, b and S’ are undefined.

Croson, Donohue, Katok, and Sterman: Order Stability in Supply ChainsProduction and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society 7

Page 8: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

entire inventory shortfall each period. The optimalexpectation for customer orders in our experimentwhere customer demand is stationary (indeed, con-

stant and known to all participants) is the mean ofactual orders placed by the final customer, that is,COi,j = 4 cases/week for all sectors i. In this case, the

Figure 2 Orders (top) and On-Hand Inventory (bottom) for the 10 Teams in Experiment 1 (Baseline). Negative Inventory Values Indicate Backlogs

Note: Vertical scales differ across teams.

Croson, Donohue, Katok, and Sterman: Order Stability in Supply Chains8 Production and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society

Page 9: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

rule reduces to the familiar order-up-to rule. Theparameter S0 represents the sum of the desiredon-hand and desired on-order inventory. Since finalcustomer demand is constant and known, desiredon-hand inventory is zero, and desired on-orderinventory is the inventory level required to ensuredeliveries of 4 cases/week given the order fulfillmentlead time of 4 weeks (3 for the factory), yieldingS′ = 16 units (12 for the factory).Equation (1) can also be interpreted as a behavioral

decision rule based on the anchoring and adjust-ment heuristic (Kahneman et al. 1982, Tversky andKahneman 1974), in which expected customer ordersCOi,j represents the anchor (order what you expectyour customer to order from you), and inventoryimbalances motivate adjustments above or below theforecast. To capture the possibility that participantsdo not use the optimal forecast of incoming orders of4 cases/week, but rather respond to the actual ordersthey receive, we model expected customer orders asformed by exponential smoothing of actual incomingorders, IO, with adjustment parameter hi, as in Ster-man (1989a),

COi;t ¼ hiIOi;t�1 þ ð1� hiÞCOi;t�1: ð2ÞWe use nonlinear least squares to find the maxi-

mum likelihood estimates of the parameters hi, ai, bi,and S0i, subject to the constraints 0 � hi, ai, bi � 1and S0 � 0. We estimated the confidence intervalsaround each estimate with the parametric bootstrapmethod widely used in time series models of thistype (Efron and Tibshirani 1986, Fair 2003, Li andMaddala 1996), assuming an iid Gaussian error ewith variance given by the variance of the observedresiduals for each individual, ri

2. An ensemble of500 bootstrap simulations were computed for eachparticipant, and the 95% confidence intervals for

each of the four estimated parameters computedusing the percentile method. Table 2 summarizesthe results.As expected, the uncertainty bounds around h are

large. For many participants, the variance in incomingorders is small, and sometimes zero, for example,when their customer always orders the optimal quan-tity of 4 cases per week. In such cases, the smoothingtime constant h cannot be identified. More importantare the estimates of a and b. In experiment 1, the med-ian fraction of the inventory shortfall corrected eachperiod is 0.47, and the median fraction of the supplyline participants consider is 0.08, both far below theoptimal values of 1. The estimates of a and b aregenerally tight: 97% of the estimated values of a aresignificantly > 0, implying that most participantsrespond to inventory imbalances, as expected. How-ever, 72% are significantly less than the optimal valueof 1, implying most participants do not use the opti-mal order-up-to rule, but instead order only a fractionof their inventory shortfall each period. For b, 80% ofthe estimates are significantly less than the optimalvalue of 1. Furthermore, 60% are not significantlydifferent from zero, and the best estimate of b is zerofor 30% of participants, indicating that participantstend to underweight or ignore on-order inventory.

4.1.2. Discussion. Contrary to predictions basedon traditional assumptions of rationality and commonknowledge, the bullwhip effect remains even whendemand is constant and known to all participants.Constant demand and common knowledge of it didnot eliminate supply line underweighting. The med-ian estimate of b, the fraction of on-order inventoryaccounted for by the participants, at 0.08, is far belowthe optimal estimate of 1, and estimates of desiredtotal inventory, S0, are far too small to account for thereplenishment lead time. The median estimate of S′ is4.42 cases, while the optimal value is 16 (12 for thefactory). As found in Sterman (1989a) and Croson andDonohue (2006), participants substantially under-weight the supply line and underestimate the replen-ishment cycle time.Similar to prior studies, there is heterogeneity

across participants in the estimated parameters. Forexample, the estimates of a and b span the full rangeof possible values, [0, 1], and the standard deviationof the set of estimated values is large (0.35 and 0.34,respectively). We explore the possible relationshipbetween a and b in more detail in section 5.3. For now,it is interesting to note that prior work (Mosekildeet al. 1991, Sterman 1988, Thomsen et al. 1992) hasmapped the relationships between the parameters ofthe decision rule and supply chain performance.Although the parameter space is highly nonlinear,supply line underweighting (low b) and aggressive

0

10

20

30

40

50

60

70

80

90

100

110

120

1 2 3 5 6 7 8 10

Standard

Deviation

ofOrdersPlaced

TeamNumber

Retailer Wholesaler Distributor Manufacturer

Figure 3 Standard Deviation of Orders in Experiment 1 (Baseline)

Croson, Donohue, Katok, and Sterman: Order Stability in Supply ChainsProduction and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society 9

Page 10: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

ordering (high a) can interact to increase the bull-whip, raise costs, and produce periodic, quasiperi-odic, and chaotic behavior. These studies show that,conditional on supply line underweighting, a lessaggressive response to inventory discrepancies (lowera) increases stability and lowers costs by reducing thegain of the main negative feedback regulating inven-tory. In section 5.3, we examine whether participantsin each of the four experiments placed orders in away that approximates the optimal response to inven-tory shortfalls conditioned on the degree of supplyline underweighting they exhibited.Post-play questionnaire responses suggest that

coordination risk may be contributing to order oscilla-tion and amplification. A number of subjects assertedthat while they realized that ordering 4 units everyperiod would minimize costs, they did not believetheir teammates understood or would follow the opti-mal policy. Some explicitly note that this uncertaintycaused them to seek additional inventory as protec-tion against their teammates’ unpredictable orderingbehavior:

Since I was the retailer it was easy to decidewhat my inventory should be. I tried to keepthe inventory at 4, just in case somethinghappened along the line and I wouldn’t get mysupply. [Retailer]

I wanted to have a little extra inventory thanwhat was demanded. [Wholesaler]

I tried to anticipate incoming orders. [Ordering]10 … would allow an ample buffer in case alarge order came in. [Distributor]

I tried to anticipate what the rest of my team-mates were going to order. This was difficulthowever since my teammates’ orders wouldjump from 6 to 200 and back to 4. I tried to getinventory to help but then they stopped makingorder [sic] and I was screwed. [Factory]

Uncertainty about the behavior of others moti-vated these participants to hold additional inventoryas a buffer against the risk of errors by their team-mates. However, the only way a participant canincrease on-hand inventory from the initial level ofzero is to temporarily increase orders, thus forcingthe supplier into an unanticipated backlog situationand, to the extent upstream players underweight thesupply line, triggering the oscillations we observe.The questionnaire responses and the persistence ofthe bullwhip effect with constant, publicly knowndemand are consistent with the conjecture that coor-dination risk contributes to an increase in orderoscillations and amplification.

4.2. Experiments 2 and 3: Reducing CoordinationRiskFigure 4 reports the results of experiment 2. Com-pared with experiment 1 (Figure 3––which is on alarger scale), publicly providing the optimal policyappears to substantially reduce order variability. Weuse a nonparametric two-tailed Mann–Whitney test(also referred to as a Wilcoxon test; see Seigel 1965,p. 116) to compare the standard deviations of ordersbetween experiments 1 and 2. The median standarddeviation of orders falls significantly from the base-line value of 15.7–5.4 (Mann–Whitney p = 0.012),5

supporting Hypothesis 2. However, the results alsoshow the bullwhip effect is resilient. Orders still oscil-late (although oscillation is much reduced) and thereremains significant amplification in order variabilityup the supply chain (ri > ri −1 for 83% of the cases,p = 0.0001).Interestingly, the tendency to underweight the sup-

ply line is not eliminated despite knowledge of theoptimal policy. The median fraction of the supply linethat participants consider in experiment 2 is 0.10, sub-stantially below the optimal value of 1. Further, 82%of the estimated values of b are significantly less thanone, 51% are not significantly different from zero, andzero is the best estimate of b for 21% of the partici-pants. Contrary to Hypothesis 3, providing partici-pants with the optimal policy does not reduce supplyline underweighting.Figure 5 reports the results of experiment 3. Prop-

erly assessing the impact of automating the orders ofother players (experiment 3) is complicated bydifferences in the environment faced by the humansin experiment 3. Humans playing the upstream posi-tions (W, D, or F) face no variability in incomingorders, since all downstream players are programmedto order the optimal 4 cases/week.6 Thus, human

0

5

10

15

20

25

30

35

40

1 2 3 4 5 6 7 9 10

Standard

Deviation

ofOrdersPlaced

TeamNumber

Retailer Wholesaler Distributor Manufacturer

Figure 4 Order Standard Deviations in Experiment 2 (CreatingCommon Knowledge)

Croson, Donohue, Katok, and Sterman: Order Stability in Supply Chains10 Production and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society

Page 11: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

players in these roles do not experience any unin-tended inventory changes, unlike their counterpartsin the previous two experiments, who typically expe-rience large variation in incoming orders. In experi-ments 1 and 2, retailers are the only ones guaranteedto face constant orders. The impact of automation istherefore best examined by comparing the behaviorof humans in experiment 3 against the behavior ofretailers in experiments 1 and 2. Doing so provides aconservative measure of the impact of the treatment,as we would expect more significant differences inthe behavior of upstream players across treatments.The median standard deviation of the human play-

ers’ orders in experiment 3 is 1.7, compared with 3.9for retailers in the baseline, a significant difference(Mann–Whitney p = 0.012), supporting Hypothesis 4,part (i).7 Although the median standard deviation inorders of 1.7 is lower than the value of 2.5 for retailersin experiment 2 (common knowledge), the differenceis not statistically significant (Mann–Whitney p =0.43).8 Thus, the data do not consistently support part(ii) of Hypothesis 4.Nine of the 40 participants in experiment 3 ordered

the optimal quantity of 4 cases/week at all times. Theparameters of the decision rule cannot be estimatedfor these subjects. For the remaining 31 participants,the median fraction of the supply line participantsconsider is 0.14, significantly less than the optimalvalue of 1. Supply line underweighting remainsrobust among these participants despite knowledgeof the optimal ordering rule and elimination ofcoordination risk: 96% of the estimates of b are signifi-cantly less than the optimal value of 1. Hypothesis 5is, at best, only weakly supported.Overall, results from experiments 2 and 3 are con-

sistent with the conjecture that coordination risk is

one behavioral cause of the bullwhip effect. Whenuncertainty in the ordering behavior of one’s supplychain partners is reduced (through the introduction ofcommon knowledge of the optimal policy; experi-ment 2) or eliminated entirely (through the additionalintroduction of trust by guaranteeing others will usethe optimal policy; experiment 3), order oscillationsare significantly lower. Nevertheless, order oscillationand amplification are not completely eliminated, andsupply line underweighting persists.

4.3. Experiment 4: Adding Coordination StockThe previous experiments reduce the magnitude ofthe bullwhip effect by reducing or eliminating coordi-nation risk. However, in the field, it is often not practi-cal or possible to create common knowledge of theoptimal policy, or to guarantee that one’s supplychain partners will follow a particular ordering rule.In such cases, an intervention is needed to mitigatethe impact of coordination risk, despite not reducingthe risk itself. Figure 6 displays the average standarddeviations across all teams for experiment 4, whereexcess inventory was introduced as a possible inter-vention. Order oscillation clearly remains, althoughthe addition of coordination stock substantially andsignificantly dampens variation compared with thebaseline setting (comparing Figures 3 and 6). The signtest suggests that order amplification still exists, but itis only marginally significant (ri > ri −1 in 62% of thecases, differing from the chance rate of 50% atp = 0.097).9 The median standard deviation of ordersplaced is 3.8 compared with 15.7 in the baseline, a sig-nificant difference (Mann–Whitney p = 0.001). Ourresults thus support Hypothesis 6.Managers and researchers recognize that inventory

may be held for different reasons. Inventory held

0

5

10

15

20

25

30

35

40

Retailer Wholesaler Distributor Manufacturer

Standard

Deviation

ofOrdersPlaced

Positionof Non-AutomatedPlayer

Figure 5 Order Standard Deviations in Experiment 3 (EliminatingCoordination Risk)

Note: Here, participants all make up separate teams (with automated

team members), and so we group the data by role rather than team.

0

5

10

15

20

25

30

35

40

1 2 3 4 5 8 9

Standard

Deviation

ofOrdersPlaced

TeamNumber

Retailer Wholesaler Distributor Manufacturer

Figure 6 Standard Deviation of Orders in Experiment 4 (Adding Coor-dination Stock)

Croson, Donohue, Katok, and Sterman: Order Stability in Supply ChainsProduction and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society 11

Page 12: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

because of the need to batch orders to amortize fixedordering costs is commonly referred to as cycle stock.Inventory used to buffer the effects of process uncer-tainty (e.g., to prevent starvation between work cells)is known as buffer inventory. Safety stock usually refersto inventory used to buffer against the impact ofdemand or supply uncertainty. Although coordina-tion risk is a contributing factor to demand and sup-ply uncertainty, such uncertainty in most inventorymodels is considered to arise from exogenous sources.In contrast, the uncertainty subjects experience heredoes not arise from the external environment—it isself-inflicted. We introduce the term coordination stockto represent a type of safety stock used to bufferagainst strategic uncertainty in orders or delivery dueto coordination risk. Coordination stock buffersagainst decision errors, but does not reduce thoseerrors directly.It is important to note that, all else equal, adding

excess inventory decreases the probability of enteringthe backlog regime where the system is intrinsicallyless stable. So, even if the source of perceived risk issomething other than (or in addition to) coordinationrisk (e.g., trusting the demand information or otheraspects of the experimental setting), then addingexcess inventory can still be beneficial in reducing thebullwhip effect. Also, while excess inventory reducesorder oscillation, one could argue that the improve-ment comes at the expense of increased costs as par-ticipants hold larger inventories. Whether addingcoordination stock is beneficial from a cost perspec-tive depends on the underlying cost parameters.Specifically, one must trade off the cost of holding a

larger buffer inventory (coordination stock) againstthe cost of increased inventory spikes and backlogswithout the buffer. For our experimental setting,where the unit backlog cost is twice the unit holdingcost, the median total supply chain cost with coordi-nation stock is $1890, significantly lower than themedian total cost of $9151 without coordination stock(Mann–Whitney p = 0.0065).10 Here, at least, the costof holding coordination stock is more than offset bythe lower costs resulting from greater stability. Oursetting (where unit backlog cost is at least twice thatof unit holding cost) is representative of industries,where stockouts not only lead to lost sales but alsoerode a firm’s reputation as a reliable supplier, poten-tially leading to loss of market share, lower prices,and other costs. However, if unit holding costs exceedbacklog cost, carrying additional inventory may beprohibitively expensive even if it reduces supplychain instability. Such a situation may exist for perish-able and customized goods, products with highembodied value-added, product variants orderedinfrequently, and other settings where firms tendtoward make-to-order strategies.

To test the impact of excess inventory on the ten-dency to underweight the supply line, we again esti-mate the decision rule in Equations (1) and (2). Themedian fraction of the inventory shortfall correctedeach period is 0.26, and the median fraction of thesupply line participants consider is 0.24, both substan-tially below the optimal values of 1. Furthermore,63% of the estimates of b are significantly less thanthe optimal value of 1, 41% are not significantly differ-ent from 0, and the best estimate is 0 for 19% ofparticipants. Supply line underweighting persists.However, the consequences of underweighting are less-ened considerably by the addition of coordinationstock. One reason is simply that the extra buffermeans the supply chain operates farther from theunstable backlog regime.Comparing the parameters of the decision rule

between experiments 1 and 4 (Table 2) suggestsanother source for the benefit of coordination stock.The estimated values of a are lower in experiment4, where participants begin with 12 units of on-handinventory, than in experiment 1, where they beginwith the optimal level of 0 (a median of 0.26 vs.0.47, respectively), although the difference is onlymarginally statistically significant (p = 0.066). Whymight participants be less aggressive when givencoordination stock? One possibility, of course, isthat, by chance, the participants in experiment 4were calmer by disposition than those in experiment1. Alternatively, the participants may have reactedto the situation. With the standard cost structure weuse, backlogs are more costly than on-hand inven-tory. When initial inventory is zero, even a smallincrease in incoming orders can create a costly back-log, perhaps triggering an aggressive effort to returnto non-negative on-hand inventory. However, withinitial inventory of 12, the same small jump inincoming orders will leave the participant withpositive on-hand inventory, avoiding the costlybacklog regime. Participants may therefore not feelas much urgency to order in the presence of coordi-nation stock.As an analogy, consider driving. You seek to

maintain a certain distance between your car andthe car ahead, analogous to the desired on-handinventory level in the supply chain. Although youknow that the speed limit is (say) 55 miles per hour,and although this is common knowledge, you can-not trust the driver ahead of you to maintain asteady speed of 55. You must often adjust yourspeed to maintain a safe distance between you andthe car ahead, accelerating when the distance grows,and easing off the gas or applying the brake whenthe distance shrinks below the safe level. As insupply chains, the velocity of your car responds toyour control actions with a lag. And, as in the beer

Croson, Donohue, Katok, and Sterman: Order Stability in Supply Chains12 Production and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society

Page 13: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

game, costs are asymmetric: the cost of a larger gapthan you desire between you and the car ahead issmall, while the cost of “backlog” (rear-ending thecar in front of you) is very high. Drivers who tail-gate must brake aggressively when the distance tothe car ahead shrinks (that is, use a large a), whiledrivers with a large buffer between themselves andthe car ahead can simply ease off the gas (use asmaller a). That is, the gain of the negative feedbackthrough which drivers control the distance to thecar ahead is a nonlinear function of the distance. Wehypothesize that, at least with the cost structureused here, participants who find themselves close tothe costly backlog regime, as in experiment 1, arelikely to order more aggressively than those who aregiven an initial buffer, as in experiment 4. The con-trast between the estimated values of a in theseexperiments is consistent with the “tailgating” the-ory, but further experiments would be needed totest for the hypothesized nonlinear dependence of aon net inventory and the cost structure.

5. General Discussion andPsychological Mechanisms

The contributions of this study are both conceptualand empirical. First, we identified coordination riskas a possible new behavioral cause of the bullwhipeffect in supply chains. Second, we presented experi-mental evidence demonstrating that reducing factorscontributing to coordination risk moderates thebullwhip effect. Third, we introduced a successfulmanagerial intervention, showing that carrying coor-dination stock to buffer the system against strategicuncertainty improves performance.However, the more basic question regarding the

psychological processes through which these effectsoperate remains open. Although identifying the psy-chological foundations of the phenomenon was notour objective, we can shed some light on possiblemechanisms using the data we have collected.

5.1. Does the Presence of Coordination RiskCreate Supply Line Underweighting?The first possibility is that reducing coordination riskreduces supply line underweighting. Studies showthat people have great difficulty managing complexdynamic systems, typically failing to account for feed-back processes, underweighting time delays, andmisunderstanding stocks and flows (Booth Sweeneyand Sterman 2000, Sterman 1994). Underweighting thesupply line of unfilled orders in stock managementtasks such as the beer game is particularly common.Sterman (1989a) discusses explanations for supply

line underweighting grounded in the theory of

bounded rationality and behavioral decision making(Kahneman et al. 1982, Simon 1979, Tversky andKahneman 1974). Research shows that people tend toignore or underestimate time delays in dynamicalsystems (Dorner 1996, Sterman 1989a). If people’smental models do not include the delay between plac-ing and receiving orders, they may simply beunaware that they should account for on-order inven-tory in order decisions, even if, as in the presentexperiment, that information is available. If so,providing people with the optimal policy, whichexplicitly instructs them to consider all on-orderinventories, should eliminate, or at least reduce, sup-ply line underweighting. Alternatively, people mayknow that they should take the supply line intoaccount, but find it difficult to do so due to cognitivelimitations such as limited attention, working mem-ory, and mental computation capability (Plous 1993,Simon 1997). One must recognize not only the impor-tance of the supply line but also direct attentionalresources to monitoring it and incorporating it in thedecision heuristic. If people understand the impor-tance of the supply line, but face cognitive limits thatprevent them from accounting for it, then reducingthe cognitive load of the task should free up workingmemory and attentional resources for that purpose,reducing supply line underweighting.The cognitive resources available to attend to the

supply line should be the smallest in experiments 1and 4, where the optimal ordering policy is not com-mon knowledge. Cognitive resources should increasewhen players are given common knowledge of theoptimal policy (experiment 2), and increase furtherwhen players are guaranteed that all others in thesupply chain will in fact play the optimal policy(experiment 3). Consequently, if cognitive limitationscontribute to supply line underweighting, theestimated values of b should be lowest in experiments1 and 4, higher in experiment 2, and highest inexperiment 3.To test these hypotheses, we compare the estimated

values of b across experiments. In experiment 1 themedian value of b is extremely low (0.08), indicatingthat many subjects essentially ignored the supply linealtogether. The median value of b in experiment 2 isnearly identical (0.10), and there is no statisticallysignificant difference compared with experiment 1(Mann–Whitney p = 0.63). Providing people with theoptimal policy does not reduce supply line under-weighting. Adding trust by automating all playersbut one (experiment 3) frees up additional cognitiveresources because players know and need not specu-late on how their partners will behave. However,although the median value of b rises to 0.14, thedifference compared with experiment 1 is not statisti-cally significant (Mann–Whitney p = 0.27), and it

Croson, Donohue, Katok, and Sterman: Order Stability in Supply ChainsProduction and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society 13

Page 14: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

remains far below the optimal value of 1. The hypoth-esis that eliminating coordination risk by requiringothers to use the optimal policy will reduce supplyline underweighting is not supported. The highestestimates of b are found in the coordination stock con-dition (experiment 4), in which participants are notinformed of the optimal policy: the median estimateof b in experiment 4 is 0.24, though these values arealso not significantly different than those in experi-ment 1 (Mann–Whitney p = 0.14). Overall, there is noevidence that providing common knowledge of theoptimal policy or even guaranteeing that others willuse it reduces supply line underweighting.We are left to conclude that (i) the fraction of the

supply line participants take into account remains farbelow the optimal level in all experimental condi-tions, and (ii) the performance improvements inducedby reducing coordination risk are at best only weaklyrelated to the degree of supply line underweighting.These results are consistent with prior work. Forexample, Sterman (1989b) and Diehl and Sterman(1995) show underweighting is robust in one-personstock management tasks where by definition there areno strategic interactions among players and hence noissues of coordination risk. It is possible that the com-plexity of those tasks still overwhelmed people’s cog-nitive capacities, but subsequent experiments with farsimpler systems show similar errors (e.g., BoothSweeney and Sterman 2000, Cronin et al. 2009). Priorresearch suggests that people suffer from deeper,more persistent difficulties in understanding and con-trolling dynamical systems, including difficultiesunderstanding feedback processes, time delays, accu-mulations, and nonlinearities (Sterman 1994, 2011).All of these elements of dynamic complexity are pres-ent in the beer game and play important roles in itsdynamics.

5.2. Does the Presence of Coordination Risk CauseSpontaneous Deviations?The second possibility is that reducing coordinationrisk would reduce the tendency of individuals todeviate from optimal ordering in the first place. Con-ceptually, one can distinguish between (i) the stabilityof the system’s response to perturbations, caused byunderweighting the supply line and (ii) the sources ofperturbations that trigger the response.In previous experimental studies using the beer

game, exogenous perturbations such as unanticipatedchanges in customer demand or random variations inproduction and deliveries served as triggers thatknocked the system out of equilibrium, allowing anybiases or errors in decision making such as supplyline underweighting to be observed. Here, there areno such exogenous perturbations. Instead, agents

who suspect their customers or suppliers will makepoor decisions may endogenously choose to deviatefrom the equilibrium strategy to build a buffer stockagainst the coordination risk of non-optimal behaviorby others, thus causing variability in the orders theyimpose on their suppliers. Note that such behavior isfundamentally different from supply line under-weighting, which causes fluctuations and amplifica-tion in response to perturbations. We define aspontaneous deviation as a player ordering a quantitydifferent from the optimal value of four cases beforeexperiencing any change in incoming orders, deliver-ies, or inventory. If this mechanism drives the results,then removing the sources of coordination risk shoulddampen bullwhip behavior by reducing the likeli-hood and magnitude of spontaneous deviations.In experiment 1, 72% of participants spontaneously

deviated from the equilibrium order of 4, and 74% ofthese did so in the first period. Of those who sponta-neously deviated, 74% ordered more than four cases;on average, those deviating spontaneously orderedan extra 2.6 cases above the equilibrium throughputof 4. However, the likelihood of spontaneous devia-tion in experiment 2 (when coordination risk isreduced) is almost identical, 75%. Like experiment 1,the vast majority of spontaneous deviations occur inthe first period (83%) and 93% of those doing so ordermore than 4 cases. The average excess order, 6.3 cases,is even larger than in experiment 1. The likelihoodand magnitude of spontaneous deviations are not sig-nificantly lower in experiment 2 than in experiment 1.Spontaneous deviations are almost identical in experi-ment 3 (69% of participants spontaneously deviateand the average deviation is 3.9 cases), which is notsignificantly different than the frequency or size ofspontaneous deviations in experiment 1. Thus, thelikelihood and extent of spontaneous deviation can-not explain the improved performance.

5.3. Does Coordination Risk Cause Overreactionto Inventory Shortfalls?Coordination risk might also operate by affecting peo-ple’s responses to inventory shortfalls. If decisionmakers believe that their counterparts will makerational decisions (because those counterparts areeither informed of or constrained to follow the opti-mal ordering rule), then an inventory shortfall is not acause for alarm. Thus, individuals in treatments withreduced coordination risk may be less likely to over-react to inventory shortfalls by placing excess orders.To test this possibility, we compared the estimated

values of a across the experimental treatments. Themedian estimate of a in experiment 1 is 0.47,whereas in experiment 2 (with coordination riskdecreased), the median value is 0.28. Although the

Croson, Donohue, Katok, and Sterman: Order Stability in Supply Chains14 Production and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society

Page 15: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

difference is not statistically significant (Mann–Whit-ney p = 0.12), the magnitude of the drop is large andhas a substantial impact on costs, amplification, andoscillation.To illustrate, simulating the system with the deci-

sion rule in Equations (1)–(2) using the median esti-mated parameters for experiment 1 in all fourpositions yields a large cycle with amplitude growingover the first 48 weeks.11 The first peak in orders foreach link in the chain is 6, 8, 13, and 16 cases for the R,W, D, and F, respectively; total costs are $1650. Reduc-ing a to 0.28, the median value in experiment 2—withall other parameters unchanged—cuts initial peakorders to 5, 7, 9, and 10, respectively. Peak factoryorders drop 38% and total costs fall to $1332, a dropof 19%. Thus, although the differences in the esti-mated values of a between experiments 1 and 2 areonly directional, they are consistent with ourobserved results that common knowledge decreasesorder oscillation and amplification.We find a similar result comparing experiments 1

and 3. The median estimate of a in experiment 3 is0.24; the difference between this and experiment 1 ismarginally significant (Mann–Whitney p = 0.067).Simulating the system with the median parametersfrom experiment 3 for one player and the optimalpolicy for the roles that were automated essentiallyeliminates amplification and oscillation, as observedin the data. In experiment 4, the median estimate of ais 0.26; again, marginally different from experiment 1(Mann–Whitney p = 0.09).Although the evidence is of marginal statistical sig-

nificance, it appears that common knowledge of theoptimal policy, a guarantee that others will use it, orcoordination stock to buffer players against the riskof non-optimal behavior by others reduces howaggressively participants respond to a given inven-tory discrepancy. As players in all conditions signifi-cantly underweight the supply line, the reduction inthe gain of the inventory control feedback causesoscillation and amplification to drop. In terms of thetailgating analogy in section 5.1, a driver who fearsthat the cars ahead may slam on their brakes for noapparent reason must be prepared to react aggres-sively, while drivers who can trust that the cars aheadwill not behave erratically can react less aggressively.The persistence of the bullwhip effect, supply line

underweighting, and spontaneous deviations despitethe experimental treatments is consistent with recentresearch in judgment and decision making (e.g., Gige-renzer and Todd 2000, Kahneman 2002) and neuro-science (e.g., Camerer et al. 2005, McClure et al. 2004)stressing the role of unconscious, intuitive processes.Imaging studies show that decision tasks involvinglong time frames or purely economic considerationstend to activate brain regions associated with deliber-

ation, such as the frontal cortex, whereas tasks involv-ing immediate needs and emotionally laden contextstend to activate the limbic system. These intuitive pro-cesses are often adaptive in settings similar to those inwhich they evolved, but can lead to error and bias innovel contexts. When humans evolved, complex arti-ficial systems like supply chains did not exist.Resource scarcity placed a high premium on immedi-ate, certain rewards compared with delayed,uncertain rewards. As shown in studies of hyperbolicdiscounting (e.g., McClure et al. 2004), peoplestrongly prefer the “bird in the hand” and discountfuture rewards. In the beer game context, such prefer-ences would lead to underweighting of the supplyline of on-order inventory compared with on-handinventory. Similarly, when humans evolved, socialinteractions frequently involved coordination risk:there were no automated agents guaranteed to useoptimal decision rules. The supply chain context maytrigger unconscious fears that others either do notknow or would not use the optimal rule, even whenpeople are informed of it or know that the otheragents are programmed to use it. If so, the result islikely to be spontaneous deviations as players ordermore than the equilibrium throughput in an attemptto build coordination stock. We speculate that therobustness of supply line underweighting and spon-taneous deviations in the experiment may arise inneural structures distinct from those responsible forthe deliberative processes underlying rational choice.Further work investigating the psychological andneural processes at work is needed to test thesehypotheses.

5.4. Does Supply Line Underweighting Affect theAggressiveness of Inventory Corrections?As discussed above, a participant who fully accountsfor the supply line of on-order inventory (b = 1)should order the full discrepancy between desiredand actual total inventory each period (a = 1). Asthere are no adjustment costs in the experiment, set-ting a = 1 when b = 1 corrects any inventory gap asquickly as possible without unintended inventoryovershoot. However, if participants underweight thesupply line (b < 1), then large values of a cause severeinstability, whereas smaller values reduce amplifica-tion, oscillation, and costs (Mosekilde 1996, Thomsenet al. 1992). In control-theoretic terms, a is the gain ofthe negative feedback regulating inventory; the lowerthe value of b, the lower the gain of that feedback (thesmaller the a) must be to prevent costly oscillations.Although the experimental results clearly show par-ticipants did not respond to inventory discrepanciesusing the joint optimal values a, b = 1, the typicalvalue of a in each experiment is much less than1 (Table 2). Are the low values of a a result of

Croson, Donohue, Katok, and Sterman: Order Stability in Supply ChainsProduction and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society 15

Page 16: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

participants approximating the optimal response toinventory shortfalls conditioned on the degree of sup-ply line underweighting they exhibited?To shed light on this question, we computed the

conditionally optimal value of a given the value of b,denoted a*(b), for the setting of experiment 4. Valuesof a*(b) were determined from simulations of the fullsupply chain using the decision rule in Equations (1)and (2), assuming that all ordering rule parametersother than a and b were optimal. Figure 7 shows theresulting values of a*(b). As prior research and thediscussion above suggest, a*(b) falls sharply as bdecreases.12 Figure 7 also plots the estimated valuesof a and b for participants in experiment 4 (excludingoutliers). The estimated values do not appear to con-form to a*(b). The correlation between a and b is small(r = 0.25) and not statistically significant (p = 0.21),including outliers yields essentially no correlation(r = 0.007, p = 0.96). The data do not support thehypothesis that participants’ order decisions in exper-iment 4 were consistent with the conditionally opti-mal responsiveness to inventory shortfalls given thedegree of supply line underweighting they exhibited.The other experimental conditions exhibit the same

pattern. Figure 8 shows the estimated (a, b) values forparticipants in experiments 1, 2, and 3. There is con-siderable scatter in all three experiments, with a span-ning a wide range of values for any given value of b.The correlation between the estimated values of a andb for participants in these experiments is small

(r = 0.16) and not statistically significant (p = 0.12);including outliers yields the same result (r = 0.13,p = 0.16). The evidence does not support the hypothe-sis that participants responded to inventory shortfallsoptimally, conditional on the degree of supply lineunderweighting they exhibited.

5.5. Do Participants Engage in Hoarding Behavior?The decision rule defined by Equations (1) and (2)assumes constant desired levels of on-hand and on-order inventory. However, it is plausible that thesetarget stock levels may vary endogenously with thestate of the system. For example, when suppliers areunable to fill orders, delivery times rise and custom-ers receive less than they desire. Customers in realsupply chains sometimes respond by seeking largersafety stocks (hoarding) and by ordering more fromsuppliers, often through multiple channels (placingphantom orders). Hoarding and phantom orders canbe rational when multiple customers compete for lim-ited supplies, when capacity constraints constraindeliveries, and when there are supply interruptionsor other stochastic shocks (Armony and Plambeck2005, Cachon and Lariviere 1999, Lee et al. 1997). Inour experimental setting, participants experienceendogenous shocks due to the unanticipated orderingpatterns of other supply chain members. Participantsmay be adjusting their target stock level in responseto such shocks, effectively inducing hoarding andphantom ordering behavior. To test for the presenceof hoarding and phantom ordering formally, how-ever, requires relaxing the assumption of constantdesired on-hand and on-order inventory levels in theordering decision rule (Equations (1) and (2)). Forexample, a participant, knowing that all members ofthe supply chain have common knowledge thatdemand is constant, might initially attempt to main-

Figure 7 Estimated a and b, Experiment 4, with Conditionally OptimalValue of a as a Function of b

Note: Conditionally optimal values of a were derived from simulations of

the supply chain model under the conditions of experiment 4, using the

decision rule in Equations (1) and (2) and assuming optimal parameters for

all links in the chain (h = 0, implying that the demand forecast is accurate

at all times at 4 cases/week, and target inventory for each link is set to opti-

mal levels of on-hand and on-order inventory). To reduce the dimensional-

ity of the parameter space, all four simulated participants are assumed to

use the same values of a and b. Estimated values shown exclude outliers.

Figure 8 Estimated Values of a and b in Experiments 1–3 (OutliersExcluded)

Croson, Donohue, Katok, and Sterman: Order Stability in Supply Chains16 Production and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society

Page 17: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

tain the optimal stock of on-hand and on-order inven-tory. However, if that participant received a largeorder or less than she expects to receive from her sup-plier, she may revise her beliefs and increase the levelof on-hand and on-order stocks she seeks, leading toadditional orders. To test this hypothesis, one of theauthors of this study has estimated decision rules inwhich participants endogenously vary the levels ofon-hand and on-order inventory they seek based ontheir beliefs about incoming orders, supplier leadtimes, and other factors indicating scarcity. The fullstudy is available on request. In brief, the revisedordering rule is

Ot ¼ max 0;Det þ aS S�t � St

� �þ aSL SL�t � SLt� �þ et

� �;

ð3Þwhere target on-hand inventory, S*, and target supplyline of on-order inventory, SL*, are functions ofexpected demand, De (defined by exponentialsmoothing of actual incoming orders, as in Equa-tion (2)) and expected supplier lead time, ke (esti-mated from the current supply line and the rate ofdeliveries received from the supplier). For example,participants may set target on-hand inventory to acertain number of weeks coverage, c, of expectedincoming orders De:

S�t ¼ cDet ; ð4Þ

where target inventory coverage, c, may be either con-stant or proportional to the participant’s belief aboutthe supplier lead time. Similarly, to ensure a steadyflow of deliveries from the supplier, participantsmust, following Little’s Law, set the target supply lineof on-order inventory equal to ke weeks worth of their

forecast of incoming orders,

SL�t ¼ ketDet : ð5Þ

Full exposition of the models, results, and sensitiv-ity analysis is beyond the scope of this study. In brief,for approximately 80% of participants, the modifiedmodel does not significantly improve explanatorypower. However, for the remaining 20% there isstrong evidence of hoarding and/or phantom order-ing: the hypothesis that target on-hand and on-orderstock levels are constant is rejected and the enhancedmodel significantly improves R2 and lowers the rootmean square error (RMSE) between the model anddata. The improvement is particularly large for thoseparticipants classified here as outliers.Figure 9 illustrates two examples: the wholesaler

and distributor from the most extreme outlier team inthe data, Team 4 in experiment 1 (shown in Figure 2).Although the retailer’s orders never exceeded36 cases/week, the wholesaler’s orders reached apeak of 30,000 cases in a single week; the distributor’sorders peak at 20,000 cases/week. The original deci-sion rule with constant target stock levels cannotexplain such extreme behavior. However, incorporat-ing hoarding and phantom ordering into the modelsignificantly improves the fit to the data, boosting R2

from less than 0.05 to more than 0.70, and reducingthe RMSE by 46% for the wholesaler and 59% for thedistributor. The estimated parameters suggest theseparticipants reacted to the jump in incoming ordersby increasing target levels for on-hand inventory andby increasing the target supply line of on-order inven-tory as their supplier’s lead time lengthened.Why did a fifth of the participants engage in hoard-

ing and phantom ordering when it is not rational to

Figure 9 Fit to Participant Behavior of Modified Decision Rule Incorporating Hoarding and Phantom Orders

Croson, Donohue, Katok, and Sterman: Order Stability in Supply ChainsProduction and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society 17

Page 18: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

do so in the experiments reported here? Following thediscussion in section 5.3, we conjecture that thesebehaviors are adaptive in situations characterized byscarcity and competition for resources, such as thosein which humans evolved, and may be embedded inneural structures in the limbic system, such as thosethat appear to lead to hyperbolic discounting. Wespeculate that hoarding and phantom ordering maybe triggered by the stress caused by unexpectedlylarge incoming orders or lags in supplier deliverieseven though such behavior is not rational in theexperimental setting. Further research is needed totest these hypotheses.

6. Limitations and Conclusions

Table 3 summarizes the results of our fourexperiments, showing which differences in orderoscillation and other measures of interest are statisti-cally significant.Four main findings emerge from the experiments.

1. The “bullwhip effect” and supply line under-weighting persist even when demand is con-stant and known to all (i.e., when there iscommon knowledge of demand). This con-trolled setting provides stronger evidence, ascompared with previous experimental research,that the bullwhip effect is, in part, a behavioralphenomenon.

2. We identify coordination risk as a possible newsource of uncertainty that may both trigger

and further amplify bullwhip behavior. We testthe impact of two factors that may reducecoordination risk, a lack of common knowledgeof the optimal strategy and a lack of trust thatothers will follow the optimal ordering policy.

3. Performance is improved through the introduc-tion of common knowledge and the addition oftrust, although these factors do not reduce thebehavioral tendency to underweight the supplyline.

4. Performance is similarly improved by addingcoordination stock, which buffers against strate-gic uncertainty.

Building on our notion of coordination risk, Su(2008) provides an analytical proof that such risk canlead to bullwhip behavior when the decision maker isboundedly rationality. Su’s result holds when there isno information lag and orders are filled immediately(i.e., without the potential for error introduced bysupply line underweighting). These results are consis-tent with our experimental findings and together pro-vide strong evidence that coordination risk caninhibit supply chain performance.All research has limitations, and ours is no excep-

tion. These limitations also suggest directions forfuture research. First, we used students as partici-pants, as is common practice in experimental econom-ics. It is possible that experienced supply chainprofessionals would perform better, although there isno evidence of any systematic differences due to thesubject pool. For example, Croson and Donohue

Table 3 Supply Chain Costs and Metrics of Order Amplification

Measures of orders placedExperiment 1: baseline

(retailers only)

Experiment 2: creatingcommon knowledge

(retailers only)

Experiment 3:eliminating

coordination risk

Experiment 4:adding coordination

stock Significance tests

Outlier teams excludedMedian std deviation 15.7

(Retailers = 3.9)5.4(Retailers = 2.5)

1.7 3.8 1 > 2**1 > 3**2 = 31 > 4**

Average std deviation 20.6(Retailers = 4.7)

8.3(Retailers = 5.5)

1.9 4.5

Median total supply chain cost 9151(Retailers = 336)

3396(Retailers = 122)

47 1890 1 > 2*1 > 3**2 > 3*1 > 4**

Average total supply chain cost 10,771(Retailers = 546)

4562(Retailers = 202)

147 2157

Outlier teams includedMedian std deviation 20.9

(Retailers = 3.9)5.4(Retailers = 2.5)

1.9 4.4 1 > 2**1 > 3**2 = 31 > 4**

Average std deviation 455.1(Retailers = 4.5)

8.3(Retailers = 5.5)

2.3 57.8

Median total supply chain cost 13,105(Retailers = 279)

3396(Retailers = 122)

54 2840 1 > 2**1 > 3**2 > 3*1 > 4**

Average total supply chain cost 202,646(Retailers = 463)

4562(Retailers = 202)

159 14,101

The numbers in parenthesis for experiments 1 and 2 are for retailers only and are used to compare with experiment 3. To derive these summary statistics, wefirst calculated each individual’s standard deviation of orders placed. We then calculated the median (average) standard deviation of orders placed separatelyfor each role (R, W, D, F). Finally, we took the average of these medians (averages) to report as the summary statistic in the table and the text of the article.*p < 0.10. **p < 0.05.

Croson, Donohue, Katok, and Sterman: Order Stability in Supply Chains18 Production and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society

Page 19: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

(2006) report beer game results from supply chainmanagers and find similar patterns of bullwhip andunderweighting of the supply line. Croson (2007)provides an overview of research from experimentaleconomics and other fields on the use of student vs.professional subjects. She reports only a very fewsettings where professional participants performsignificantly differently than student participants.Croson and Donohue (2006) and Wu and Katok

(2006) provide some initial insight into the effects oflearning from experience. Taken as a whole, the evi-dence suggests that experience is unlikely to eliminatethe bullwhip effect by itself. Managers who experi-ence instability may conclude that others in their sup-ply chain are unreliable, motivating them to deviatesooner or more from optimal play so as to build upcoordination stock as a buffer. Further research on therole of experience, learning, and decision aids (e.g.,Wu and Katok 2006) is clearly needed.The specific choice of initial conditions and parame-

ters suggests other extensions. Although there is noa priori reason to think so, it is possible that differentcost parameters or initial inventory levels might leadto different outcomes.As described above, the decision rule estimated

here assumes constant levels for target on-hand andon-order inventory. These assumptions can be relaxedto test whether at least some participants endoge-nously increased the levels of coordination stock theysought to accumulate when faced with a surge inincoming orders or a drop in supplier delivery reli-ability. Preliminary work suggests that some partici-pants in the experiment did so, and that a modelincorporating behavioral responses to order volatilityand supply disruptions explains the behavior of manyparticipants classified as outliers. Further explorationof the conditions that trigger such hoarding, andempirical investigation into whether such behavioralresponses to scarcity and volatility exist in real supplychains, offers a promising avenue for furtherresearch.Finally, experiment 3 provides a starting point to

investigate how people interact with artificial agentsprogrammed to follow specific decision rules. Whileour agents were programmed to follow the cost-mini-mizing decision rule, further research could exploreparticipants’ responses to agents programmed to useother rules, including sub-optimal rules or rules esti-mated from the behavior of human players.Our work suggests three managerial insights.

First, we demonstrate that the bullwhip effect is abehavioral phenomenon as well as an operationalone, and therefore methods for reducing supplychain instability should address the behavioral aswell as structural causes of the problem. Decisionmakers have a difficult time controlling systems that

include feedback and delays, so training and deci-sion aids that help them do so may lead to substan-tial improvement.Second, the notion of “optimal” behavior is contin-

gent on people’s assumptions about the thinking andbehavior of the other agents with whom they interact.If a person believes that their counterparts willbehave in an unpredictable and capricious fashion,this may lead to further instability in the supplychain. To the extent that managers can be assured oftheir supply chain partners’ knowledge of the optimaldecision rule and trust them to implement it, perfor-mance can improve further.Our final contribution is the preliminary identifica-

tion of an effective mechanism to moderate the bull-whip effect: coordination stock. The results suggest apreviously unidentified purpose for excess stock.Intuitively, the optimal level of coordination stockdepends on the level of coordination risk, the cost ofholding excess inventory, and the cost of not dampen-ing the bullwhip effect. Additional experimentswould be needed to further define the nature of theserelationships. This represents a fertile area for futureresearch.

Acknowledgments

This research was supported by NSF Grant SES-0214337and a grant from the Center for Supply Chain Research(CSCR), Smeal College of Business, Penn State University.Further financial support for Sterman was received from theProject on Innovation in Markets and Organizations at MIT.We thank Diana Wu for help with the experiments and Gok-han Dogan for help with data analysis. We also benefitedimmensely from the comments of Gerard Cachon, SunilChopra, Charles Fine, Paulo Goncalves, Don Kleinmuntz,Rogelio Oliva, and seminar participants at Cornell, Har-vard, MIT, Stanford, the University of Texas, and Wharton.

Appendix. Outlier Analysis

To test for outliers, we used Grubb’s procedure, alsoknown as the maximum normed residual test (Grubbs1969), as recommended in the Engineering StatisticsHandbook (National Institute of Standards and Tech-nology; www.nist.gov). This procedure involvescalculating the variance of orders placed over theentire game for each individual and comparing thesevariances with those of others in the same role/exper-iment category (e.g., all retailers within experiment 1).The Grubbs procedure then identifies outliers withineach role/experiment category based on simple crite-ria: distance of an observation from the mean in termsof standard deviation. In our case, an individual wasflagged as an outlier if its variance of orders was morethan 2.29 standard deviations beyond the meanvariance (which is consistent with a statistical significance

Croson, Donohue, Katok, and Sterman: Order Stability in Supply ChainsProduction and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society 19

Page 20: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

of p = 0.05). For example, of the 10 retailers in experi-ment 3, one had a variance of orders placed of 37.36.This is 2.39 standard deviations beyond the mean var-iance, and so this individual was identified as an out-lier. For the all human experiments (i.e., experiments1, 2, and 4), a team was eliminated if two or more ofits members did not pass the criterion. For the experi-ments with one human and three automated teammembers (i.e., experiment 3), a team was eliminated ifits single subject did not pass the criterion. Table A1lists the individuals identified as outliers and thenumber of teams eliminated as a result for each exper-iment.

Notes1The standard protocol for the beer game eliminates theneed for order batching because there are no fixed order-ing costs or quantity discounts. Each player orders fromand ships to only one other party, and production capac-ity is infinite, eliminating the incentive for gaming inresponse to shortages. Prices are fixed and customerdemand is exogenous, eliminating the possibility of pro-motions and forward buying. In our design, we also elimi-nate demand signaling, as demand is constant and knownto all participants.2Instructions, the quiz, and other materials can be foundat www.personal.psu.edu/exk106/Appendix_Compendium_Instructions.pdf.3Including the outlier teams (4 and 9), ri > ri�1 for 80% ofthe cases, and H1 is rejected at p = 0.0005.4To test the robustness of our results to the use of thestandard deviations of orders as the metric, we alsoanalyzed the root mean squared (RMS) deviation, d, oforders, O, from the steady-state optimal value of 4,di, j = (1/n)[Σt∈{1, 48}(Oi,j,t − 4)2]1/2 for role i in team j. TheRMS deviation penalizes participants for all deviationsfrom optimal, and not only for variability around the par-ticipant’s mean orders. All statistical results reported here

continue to hold using the RMS deviation from optimalinstead of the standard deviation.5Including outliers also yields significantly higher stan-dard deviation of orders placed in the baseline (median20.9) compared with those placed here (median 5.4),Mann–Whitney p = 0.007.6Although the automated agents are guaranteed to play theoptimal strategy, their orders may differ from 4 cases perweek if the human player does not order optimally. If thehuman deviates from equilibrium (for any reason),upstream agents experience unintended changes in theirinventory, requiring them to alter their own orders to correctthe imbalance (using the optimal order-up-to rule to do so).7Including outliers, median standard deviations by baselineretailers are 3.9, whereas median standard deviations amongall human players here are 1.9, Mann–Whitney p = 0.021.8Including outliers, median standard deviations by retail-ers with common knowledge is 2.5, while median stan-dard deviations here among all human players is 1.9,Mann–Whitney p = 0.27.9Teams 6, 7, and 10 were removed as outliers. Includingthem yields ri > ri − 1 in 63% of the cases, and the p-value falls to 0.051.10With the outlier teams, the median standard deviation oforders in the baseline is 20.9 and here is 4.4, which aresignificantly different (Mann–Whitney p = 0.032). Themedian total supply chain cost with outliers is 13,105 inthe baseline and 2840 here, and this difference is weaklysignificant (Mann–Whitney p = 0.09).11The system is perturbed from equilibrium because theestimated target inventory level, S’, is not the equilibriumvalue. The estimated values of players’ desired inventorylevels induce the spontaneous deviations that knock thesystem out of equilibrium.12The exact values of the function a*(b) plotted in Figure 7depend on the assumptions that the other parameters inthe decision rule (h, S’) are optimal and that initial inven-tory is 12 cases for all participants (as in experiment 4).Other assumptions and initial conditions will cause thevalues of a*(b) to vary, though in all cases the condition-ally optimal value of a decreases with b (Mosekilde 1996,Thomsen et al. 1992).

ReferencesAeppel, T. 2010. ‘Bullwhip’ hits firms as growth snaps back. The

Wall Street Journal, January 27.

Armony, M., E. Plambeck. 2005. The impact of duplicate orderson demand estimation and capacity investment. Manage. Sci.51(10): 1505–1518.

Aumann, R. 1976. Agreeing to disagree. Ann. Stat. 4(6): 1236–1239.

Booth Sweeney, L., J. Sterman. 2000. Bathtub dynamics: Initialresults of a systems thinking inventory. Syst. Dynamics Rev.16(4): 249–294.

Cachon, G., M. Lariviere. 1999. Capacity choice and allocation:Strategic behavior and supply chain performance. Manage.Sci. 45(8): 1091–1108.

Cachon, G., C. Camerer. 1996. Loss-avoidance and forward induc-tion in experimental coordination games. Q. J. Econ. 111(1):165–194.

Cachon, G., T. Randall, G. Schmidt. 2007. In search of the bull-whip effect. Manuf. Serv. Oper. Manage. 9(4): 457–479.

Table A1 Breakdown of Individual Outliers and Eliminated Teams

Experiment1: baseline

Experiment2: creatingcommonknowledge

Experiment3: eliminatingcoordination

risk

Experiment4: adding

coordinationstock

Outliersby role(team no.)

Retailer (5)Wholesaler(4)

Distributor(4)

Factory (4)Distributor(9)

Factory (9)

Retailer (8)Distributor(7)

Retailer (18)Wholesaler(27)

Factory (1)Factory (5)

Distributor (6)Factory (6)Retailer (7)Wholesaler(7)Distributor(7)Factory (7)Retailer (10)Wholesaler(10)

Eliminatedteams

Teams 4, 9 None Individuals1, 5, 18, 27

Teams6, 7, 10

The ID number/team number of each outlier player is listed inparenthesis.

Croson, Donohue, Katok, and Sterman: Order Stability in Supply Chains20 Production and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society

Page 21: Order Stability in Supply Chains: Coordination Risk and ...jsterman.scripts.mit.edu/docs/Order Stability in Supply Chains 9-30.pdf · Order Stability in Supply Chains: Coordination

Camerer, C., G. Loewenstein, D. Prelec. 2005. Neuroeconomics: Howneuroscience can inform economics. J. Econ. Lit. 43(3): 9–64.

Chen, F. 1999. Decentralized supply chains subject to informationdelays. Manage. Sci. 45(8): 1016–1090.

Chen, L., H. Lee. 2011. Bullwhip effect measurement and itsimplications. Oper. Res. 60(4): 771–778.

Chen, F., Z. Drezner, J. Ryan, D. Simchi-Levi. 2000. Quantifyingthe bullwhip effect: The impact of forecasting, lead times, andinformation. Manage. Sci. 46(3): 436–443.

Clark, A., H. Scarf. 1960. Optimal policies for a multi-echeloninventory problem. Manage. Sci. 6(4): 475–490.

Cohen, M., M. Baganha. 1998. The stabilizing effect of inventoryin supply chains. Oper. Res. 46(3 Suppl): 572–583.

Cronin, M., C. Gonzalez, J. Sterman. 2009. Why don’t well-edu-cated adults understand accumulation? A challenge toresearchers, educators, and citizens. Organ. Behav. Hum. Decis.Process. 108(1): 116–130.

Croson, R. 2007. The use of students as participants in experimen-tal research. Working paper, University of Texas at Dallas(prepared for Behavioral Dynamics in Operations Manage-ment Online Discussion Group).

Croson, R., K. Donohue. 2002. Experimental economics and sup-ply-chain management. Interfaces 32(5): 74–82.

Croson, R., K. Donohue. 2003. Impact of POS data sharing onsupply chain management: An experimental study. Prod.Oper. Manag. 12(1): 1–11.

Croson, R., K. Donohue. 2006. Behavioral causes of the bullwhipand the observed value of inventory information. Manage. Sci.52(3): 323–336.

Diehl, E., J. Sterman. 1995. Effects of feedback complexity ondynamic decision making. Organ. Behav. Hum. Decis. Process.62(2): 198–215.

Dorner, D. 1996. The Logic of Failure. Metropolitan Books/HenryHolt, New York.

Efron, B., R. Tibshirani. 1986. Bootstrap methods for standarderrors, confidence intervals, and other measures of statisticalaccuracy. Statist. Sci. 1(1): 54–75.

Ellram, L. 2010. Introduction to the forum on the bullwhip effect inthe current economic climate. J. Supply Chain Manag. 46(1): 3–4.

Fair, R. 2003. Bootstrapping macro econometric models. Stud.Nonlinear Dynamics Econ. 7(4): 1–24. article 1.

Gigerenzer, G., P. Todd. 2000. Simple Heuristics that Make UsSmart. Oxford University Press, New York.

Grubbs, F. 1969. Procedures for detecting outlying observations insamples. Technometrics 11(1): 1–21.

Van Huyck, J., R. Battalio, R. Beil. 1990. Tacit coordination games,strategic uncertainty, and coordination failure. Amer. Econ.Rev. 80(1): 234–248.

Van Huyck, J., R. Battalio, R. Beil. 1991. Strategic uncertainty,equilibrium selection, and coordination failure in averageopinion games. Q. J. Econ. 106(3): 885–911.

Van Huyck, J., R. Battalio, R. Beil. 1993. Asset markets as an equilib-rium selection mechanism: Coordination failure, game form auc-tions, and tacit communication.Games Econ. Behav. 5(3): 485–504.

Kahneman, D. 2002. Maps of bounded rationality: A perspective onintuitive judgment and choice. Nobel Prize Lecture 8: 351–401.

Kahneman, D., P. Slovic, A. Tversky. 1982. Judgment Under Uncer-tainty: Heuristics and Biases. Cambridge University Press,Cambridge UK.

Lee, H., V. Padmanabhan, S. Whang. 1997. Information distortion ina supply chain: The bullwhip effect.Manage. Sci. 43(4): 546–558.

Lee, H., V. Padmanabhan, S. Whang. 2004. Comments on infor-mation distortion in a supply chain: The bullwhip effect.Manage. Sci. 50(12): 1887–1893.

Lei, V., C. Noussair, C. Plott. 2001. Nonspeculative bubbles inexperimental asset markets: Lack of common knowledge ofrationality vs. actual irrationality. Econometrica 69(4): 831–859.

Li, H., G. Maddala. 1996. Bootstrapping time series models. Econ.Rev. 15(2): 115–158.

McClure, S., D. Laibson, G. Loewenstein, J. Cohen. 2004. Separateneural systems value immediate and delayed monetaryrewards. Science 306: 503–507.

Mosekilde, E. 1996.Topics inNonlinearDynamics: Applications to Physics,Biology, and Economic Systems. World Scientific, River Edge, NJ.

Mosekilde, E., E. Larsen, J. Sterman. 1991. Coping with complex-ity: Deterministic chaos in human decision making behavior.J. L. Casti, A Karlqvist, eds. Beyond Belief: Randomness, Predic-tion, and Explanation in Science. CRC Press, Boston, 199–229.

Nagel, R. 1995. Unraveling in guessing games: An experimentalstudy. Amer. Econ. Rev. 85(5): 1313–1326.

Ochs, J. 1995. Games and Economic Behavior. Elsevier, New York.

Plous, S. 1993. The Psychology of Judgment and Decision Making.McGraw-Hill, New York.

Rousseau, D., S. Sitkin, R. Burt, C. Camerer. 1998. Not so differentafter all: A cross-discipline view of trust. Acad. Manag. Rev.23(3): 393–404.

Schwieren, C., M. Sutter. 2008. Trust in cooperation or ability? Anexperimental study on gender differences. Econ. Lett. 99(3):494–497.

Seigel, S. 1965. Nonparametric Statistics for the Behavioral Sciences.Wiley, New York.

Simon, H. 1979. Rational decision making in business organiza-tions. Amer. Econ. Rev. 69(4): 493–513.

Simon, H. 1997. Administrative Behavior, 4th edn. The Free Press,New York.

Steckel, J., S. Gupta, A. Banerji. 2004. Supply chain decision mak-ing: Will shorter cycle times and shared point-of-sale informa-tion necessarily help? Manage. Sci. 50(4): 458–464.

Sterman, J. 1988. Deterministic chaos in models of human behav-ior: Methodological issues and experimental results. Syst.Dynamics Rev. 4(1–2): 148–178.

Sterman, J. 1989a. Modeling managerial behavior: Misperceptionsof feedback in a dynamic decision making experiment.Manage. Sci. 35(3): 321–339.

Sterman, J. 1989b. Misperceptions of feedback in dynamic deci-sion making. Organ. Behav. Hum. Decis. Process. 43(3): 301–335.

Sterman, J. 1994. Learning in and about complex systems. Syst.Dynamics Rev. 10(2–3): 291–330.

Sterman, J. 2000. Business Dynamics: Systems Thinking and Modelingfor a Complex World. McGraw-Hill, New York.

Sterman, J. 2011. Communicating climate change risks in a skepti-cal world. Climatic Change 108: 811–826.

Su, X. 2008. Bounded rationality in newsvendor models. Manuf.Serv. Oper. Manag. 10(4): 566–589.

Thomsen, J., E. Mosekilde, J. Sterman. 1992. Hyperchaotic phe-nomena in dynamic decision making. J. Syst. Anal. Model.Simul. (SAMS) 9: 137–156.

Tversky, A., D. Kahneman. 1974. Judgment under uncertainty:Heuristics and biases. Science 185(4157): 1124–1131.

Vanderschraaf, P., G. Sillari. 2005. Common knowledge. The Stan-ford Encyclopedia of Philosophy. Available at http://plato.stanford.edu/archives/win2005/entries/common-knowledge/(accessed date June 18, 2012).

Wu, D., E. Katok. 2006. Learning, communication, and the bull-whip effect. J. Oper. Manag. 24(6): 839–850.

Zipkin, P. 2000. Foundations of Inventory Management. McGraw-Hill, New York.

Croson, Donohue, Katok, and Sterman: Order Stability in Supply ChainsProduction and Operations Management 0(0), pp. 1–21, © 2013 Production and Operations Management Society 21