

MANAGEMENT SCIENCE, Articles in Advance, pp. 1–17. ISSN 0025-1909, EISSN 1526-5501. DOI 10.1287/mnsc.1110.1382.

© 2011 INFORMS

Demand Forecasting Behavior: System Neglect and Change Detection

Mirko Kremer, Brent Moritz
Smeal College of Business, Pennsylvania State University, University Park, Pennsylvania 16802 {[email protected], [email protected]}

Enno Siemsen
Carlson School of Management, University of Minnesota, Minneapolis, Minnesota 55455, [email protected]

We analyze how individuals make forecasts based on time-series data. Using a controlled laboratory experiment, we find that forecasting behavior systematically deviates from normative predictions: Forecasters overreact to forecast errors in relatively stable environments, but underreact to errors in relatively unstable environments. The performance loss that is due to such systematic judgment biases is larger in stable than in unstable environments.

Key words: forecasting; behavioral operations; system neglect; exponential smoothing
History: Received December 22, 2009; accepted April 16, 2011, by Martin Lariviere, operations management. Published online in Articles in Advance, July 15, 2011.

1. Introduction
Demand forecasting in time-series environments is fundamental to many operational decisions. Poor forecasts can result in inadequate capacity, excess inventory, and inferior customer service. Given the importance of good forecasts to operational success, quantitative methods of time-series forecasting are well known and widely available (cf. Makridakis et al. 1998). Although companies frequently have access to sophisticated quantitative methods embedded in forecasting software, empirical evidence shows that real-world forecasting often relies on human judgment. In a study of 240 U.S. corporations, over 90% of companies reported having access to some forecasting software (Sanders and Manrodt 2003a). However, only 29% of companies primarily used quantitative forecasting methods, 30% mainly used judgmental methods, and the remaining 41% applied both quantitative and judgmental methods (Sanders and Manrodt 2003b). Although quantitative methods may provide the basis for a forecast, it is common practice to modify computer-generated forecasts based on human judgment (Fildes et al. 2009).

Several recent studies have examined decision biases in a variety of operations contexts (cf. Bendoly et al. 2006) and documented behavioral anomalies related to demand forecasting even when forecasting was theoretically irrelevant to the task. Schweitzer and Cachon (2000) investigated newsvendor inventory decision making under stationary and known demand distributions; a key finding of their study is that average order quantities are biased toward mean demand relative to the expected profit-maximizing order quantity. This biased ordering has been attributed to randomness in decision making (Su 2008), as well as to systematic biases like mean-demand anchoring and demand chasing (Schweitzer and Cachon 2000). In a more complex beer game setting, Croson and Donohue (2003) observed the bullwhip effect with participants who faced a known and stationary demand distribution. Croson et al. (2005) observed this effect even with constant and deterministic demand. To mitigate such suboptimal behavior, Schweitzer and Cachon (2000, p. 419) highlighted the importance of separating the forecasting task from the inventory decision task: While the forecasting task typically requires managerial judgment, the task of converting a forecast into an order quantity can be automated. A firm may reduce decision bias by asking managers to generate forecasts that are then automatically converted into order quantities. In other words, inventory decisions can be decomposed by estimating the probability distribution of future demand (a more judgmental forecasting task), selecting a service level, and then using both to determine an order quantity (a more automated task). Therefore, it is important to consider human judgment in forecasting to attenuate errors in higher-order decisions such as purchasing, inventory, and capacity.

There is extensive literature on human judgment in time-series forecasting (Lawrence et al. 2006). Central results include the widespread use of heuristics such as anchoring and adjustment, as well as the influence of feedback and task decomposition on forecasting performance.



However, findings remain inconclusive, in part because forecasting behavior appears to be sensitive to different components of the time series. Furthermore, the judgmental forecasting literature is typically concerned with the detection of patterns in a time series, such as trends or seasonality (Harvey 2007). In contrast, our research focuses on reactions to unpredictable change in the level of a time series: How do individuals create time-series forecasts in unstable environments? We study this question in a laboratory setting where forecasters face time series generated by a random walk with noise (Muth 1960), a demand process that provides an intuitive mapping between a simple normative forecasting benchmark and the structural parameters describing the demand environment. We show that time-series forecasting behavior is described by an error-response model across a wide range of conditions. However, forecasters tend to overreact to forecast errors in more stable environments and underreact to forecast errors in less stable environments. This pattern is consistent with the system-neglect hypothesis (Massey and Wu 2005), which posits that forecasters place too much weight on recent signals relative to the environment that produces these signals. Surprisingly, we find that forecasting performance relative to the normative benchmark is poorer in stable environments compared to less stable environments.

This paper proceeds as follows. The next section outlines the academic literature that relates to our research. Section 3 discusses our theoretical developments, and Section 4 discusses the results of our study. We discuss our results and conclude this paper in Section 5.

2. Related Literature
Existing research on judgmental forecasting provides vast but somewhat inconclusive empirical evidence regarding forecasting performance, cognitive processes, and managerial interventions. Many studies have been devoted to comparing the performance of human forecasts with quantitative forecasting methods, but the empirical evidence is not consistent (Lawrence et al. 1985, Carbone and Gorr 1985, Sanders 1992, Fildes et al. 2009). The literature has investigated a variety of cognitive processes underlying the evolution of judgmental forecasts, such as variations of the anchoring-and-adjustment heuristic (Harvey 2007). Regarding managerial interventions, judgmental forecast accuracy can improve with performance feedback (e.g., Stone and Opel 2000) and task property feedback (e.g., Sanders 1997), but the effectiveness of these levers depends on specific contextual elements of the forecasting task (Lawrence et al. 2006). Existing research on judgmental time-series forecasting predominantly examines pattern detection, that is, how well human subjects can identify trends and seasonality in noisy time series (Andreassen and Kraus 1990, Lawrence and O'Connor 1992, Bolger and Harvey 1993, Lawrence and O'Connor 1995). In contrast, our research focuses on change detection, that is, how subjects separate random noise from persistent change in the level of a time series.

When observing demand variation in a time series, a forecaster needs to identify whether there is a substantive (and persistent) cause for this variation or whether the variation is noise and has no implications for future observations. The ability to distinguish substantive change from random variation has been studied extensively in the literature on regime change detection (Barry and Pitz 1979). A central conclusion from regime change research is that decision makers underreact to change in environments that are unstable and have precise signals and overreact in environments that are stable and have noisy signals (Griffin and Tversky 1992). This seemingly contradictory reaction pattern is reconciled by the system-neglect hypothesis (Massey and Wu 2005), which posits that individuals overweigh signals relative to the underlying system that generates the signals.

A related stream of research in financial economics explains the pattern of short-term underreaction and long-term overreaction to information that is often observed in stock market investment decisions (Poteshman 2001). Some theoretical work has been devoted to explaining this behavioral pattern, linking such behavior to the gambler's fallacy or the hot-hand effect (Barberis et al. 1998, Rabin 2002, Rabin and Vayanos 2010). In an asset pricing context, Brav and Heaton (2002) illustrate how an over- or underreaction pattern arises from biased information processing by investors subject to the representativeness heuristic (Kahneman and Tversky 1972) and conservatism (Edwards 1968), and demonstrate how this pattern can also arise from a fully Bayesian investor who lacks structural knowledge about the possible instability of the time series. Experimental tests of this mixed-reaction pattern include those by Bloomfield and Hales (2002) and Asparouhova et al. (2009).

A central difference between our research and existing research on change detection is the complexity of the judgment environment. In Massey and Wu (2005), participants faced binary signals (red or blue balls) that were generated from one of two regimes (draws from two urns with fixed proportions of red and blue balls in each). Given a sequence of signals, their experimental task was to identify when a regime change (a switch from one urn to the other) had occurred. Furthermore, because subjects had perfect knowledge of the system parameters (the proportion of blue balls in either urn), there was no ambiguity concerning the relevant world.


This environment fits a binary forecasting task in which a well-known phenomenon needs to be detected, such as when a bull market turns into a bear market. Similarly, in Bloomfield and Hales (2002) and Asparouhova et al. (2009), participants faced fairly simple series of signals generated from a symmetric binary random walk. Brav and Heaton (2002) illustrated their theoretical considerations in an environment in which a series of independently and identically distributed assets exhibited a single structural break that shifted the asset distribution only once during the time series. A central question of our research is whether the overreaction/underreaction patterns observed in such fairly simple settings translate to the relatively richer environment of time-series demand forecasting.

3. Theory
In the time-series judgment task we examine, a forecaster needs to decide whether an observed variation in the time-series data provides a reason to modify a previous forecast for the next period. Figure 1 illustrates this judgment task.

If the forecaster interprets the variation as random noise, she can ignore it and uphold her forecast as the long-run average. If she believes that the variation represents a change in the underlying level of the time series, recent demand observations contain more information about the future than past observations do. Therefore, recent observations should receive more weight in her revised forecast. Finally, if she believes that this variation is indicative of a trend, she would project the variation not only to shift the level once, but also to continue doing so in future periods.

In practice, these options are not mutually exclusive. A forecaster may decide that variation is partially due to noise and partially due to a level change. She may also believe that the variation represents both a level change and a trend.

Figure 1 Challenge of Time-Series Analysis. (Sketch: demand plotted over time, divided into past, present, and future relative to the long-run average; a variation observed in the present may indicate noise, a change in level, or a trend.)

The key challenge we examine is whether she can successfully differentiate level changes from noise. Although our empirical analysis controls for the possibility that forecasters include illusionary trends, our simulated demand environment does not contain any underlying trends; a comprehensive discussion of trend detection is beyond the scope of this paper.

3.1. Demand Environment
Forecasts are made after observing demand D_t in period t, with no additional information on future demand beyond that which is contained in the time series 𝒟_t = {D_t, D_{t−1}, D_{t−2}, …}. The demand process follows

D_t = μ_t + ε_t,  (1a)
μ_t = μ_{t−1} + δ_t,  (1b)

where ε_t ∼ N(0, σ_n²) and δ_t ∼ N(0, σ_c²) are independent random variables. The time series thus contains two kinds of random components: temporary shocks (through ε_t) and permanent shocks (through δ_t). The standard deviation σ_c captures the notion of change in the true (but unobserved) level μ_t, i.e., permanent shocks to the time series that persist in subsequent periods. The standard deviation σ_n captures the noise surrounding the level, i.e., temporary shocks to the time series that last only for a single period.

This demand model has several appealing properties. By varying the two parameters (σ_c and σ_n), the model provides a simple way to describe a fairly wide range of different environments, from rather stable to highly unstable processes. In its limits, the demand model produces pure random walks for σ_n = 0, while yielding stationary white noise processes for σ_c = 0. Figure 2 illustrates how the shape of a representative time series depends on the two parameters. The demand model defined in Equations (1a) and (1b) is descriptively accurate for many real-world processes (see Makridakis and Hibon 2000, Gardner 2006, and the references therein), is often used to describe nonstationary demand in supply chain settings (e.g., Graves 1999), and is discussed in practically every operations management textbook (e.g., Nahmias 2008, Chopra and Meindl 2009). Important for our empirical investigation of judgmental forecasts and their performance, this demand model also provides a direct and intuitive mapping between a simple normative forecasting benchmark and the structural parameters describing the demand environment.
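For concreteness, the following Python sketch simulates the process in Equations (1a) and (1b). The function and variable names are illustrative, not part of the original study materials; the initial level of 500, the six (σ_c, σ_n) conditions, and the 80-period horizon (a 30-period history plus 50 forecast periods) follow the description of the experiment in this paper.

```python
import numpy as np

def simulate_demand(sigma_c, sigma_n, T=80, mu0=500.0, seed=None):
    """Simulate Equations (1a)-(1b): a random-walk level observed with noise."""
    rng = np.random.default_rng(seed)
    # (1b): permanent shocks delta_t accumulate in the unobserved level mu_t
    mu = mu0 + np.cumsum(rng.normal(0.0, sigma_c, size=T))
    # (1a): temporary shocks eps_t last a single period
    demand = mu + rng.normal(0.0, sigma_n, size=T)
    return demand, mu

# The six experimental conditions as (sigma_c, sigma_n) pairs
conditions = {1: (0, 10), 2: (0, 40), 3: (10, 10),
              4: (10, 40), 5: (40, 10), 6: (40, 40)}
paths = {k: simulate_demand(c, n, seed=k)[0] for k, (c, n) in conditions.items()}
```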

3.2. Normative Benchmark
From Equation (1a), the best forecast F_{t+1} (made in period t for period t + 1) for our demand environment is the true level μ_t, but the true level is obscured by noise and must be estimated.


Figure 2 Example Demand Time Series for Different σ_c and σ_n. (Six panels of demand over time, each starting from the initial level μ_0 = 500: condition 1 (σ_c = 0, σ_n = 10), condition 2 (σ_c = 0, σ_n = 40), condition 3 (σ_c = 10, σ_n = 10), condition 4 (σ_c = 10, σ_n = 40), condition 5 (σ_c = 40, σ_n = 10), and condition 6 (σ_c = 40, σ_n = 40).)

The challenge is to determine optimally the extent to which a variation in the demand signal D_t is evidence of a permanent change in the level rather than a random, transient shock. When D_t follows the demand process in Equations (1a) and (1b), the optimal forecasting mechanism is the familiar single exponential smoothing method (Muth 1960). A forecast F_{t+1} is a weighted average of the most recent demand observation and the previous forecast:

F_{t+1} = αD_t + (1 − α)F_t = F_t + α(D_t − F_t).  (2)

The latter part of Equation (2) highlights an important observation: A forecast is a function of an observed forecast error (D_t − F_t) and the weight α that is placed on this error. In conceptual terms, the optimal forecast revision (from F_t to F_{t+1}) in light of new evidence D_t is a function of the strength of this evidence relative to its weight (Griffin and Tversky 1992). The strength of the evidence is provided by the forecast error (D_t − F_t) itself: all else equal, a forecast should be revised more strongly the larger the forecast error. The weight of the evidence depends on the parameters of the system that generates the demand signals, i.e., the change (σ_c) and noise (σ_n) parameters governing the time series. In fact, the weight of evidence in our context can be precisely captured by the change-to-noise ratio W = σ_c²/σ_n², which itself is directly related to the optimal smoothing parameter α*. All else equal, a forecast should be revised according to the change-to-noise ratio: With low values of W (variations in demand are mostly noise), forecast errors should be mostly discarded and should not have much influence on the new forecast, whereas with high values of W (variations in demand mostly represent level changes), the forecast error should have greater influence on a forecast. This intuition can be formalized as follows (Harrison 1967, McNamara and Houston 1987):1

α*(W) = 2 / (1 + √(1 + 4/W)).  (3)

Thus, the optimal smoothing constant α*(W) depends only on the change-to-noise ratio W, whereas the demand time series is driven by the absolute levels of σ_c and σ_n. For example, condition 3 (σ_c = 10 and σ_n = 10) and condition 6 (σ_c = 40 and σ_n = 40) have the same W, implying the same α*(W). Combining the optimal structure of the forecasting mechanism in Equation (2) with its optimal parameter in Equation (3) yields the optimal forecasts for our demand environment:

F_{t+1} = F_t + α*(W)(D_t − F_t).  (4)

3.3. System Neglect in Forecasting Behavior
Besides being the normative forecasting benchmark for the demand process described in Equations (1a) and (1b), single exponential smoothing can be viewed as a plausible model for describing human forecasting behavior. Practically, exponential smoothing corresponds to the mental process of error detection and subsequent adaptation,2 i.e., trial-and-error learning, in which a forecaster observes an error and then adjusts her next forecast based on that error. Furthermore, exponential smoothing has two important characteristics as a boundedly rational decision heuristic: First, it does not require much memory, because the most recent forecast and demand contain all the information necessary to make the next forecast. Second, it is a robust heuristic in many different environments beyond the particular one used in our study (e.g., Gardner 1985, 2006). These are compelling behavioral reasons to assume that forecasters follow the error-response logic of exponential smoothing. The crucial question is how their behavioral error-response parameter α(W) compares to the optimal α*(W) from Equation (3).

1 McNamara and Houston (1987) derived Equation (3) using Bayesian principles. Harrison (1967) derived the same expression (although articulated differently) as the argument that minimizes the variance of forecast errors.
2 Error detection and adaptation are also fundamental principles of cybernetics (Wiener 1948) and the foundation for closed-loop theories of learning (Adams 1968). There is neurological evidence that the human brain supports such a process (Gehring et al. 1993).

We assume that decision makers update their forecasts in the light of new evidence and that they incorporate both the strength of this evidence, i.e., the magnitude of the observed variation in the demand signal, and its statistical weight (W = σ_c²/σ_n²). However, we hypothesize that they suboptimally incorporate signal strength and weight. Specifically, we suggest that decision makers attribute too much weight to forecast errors at the expense of the system parameters (σ_c, σ_n) that produce these errors. The primary reason for this pattern is that demand signals and the associated forecast errors are highly salient, whereas the system parameters are not. In most instances, the system parameters σ_c and σ_n are unknown or even unknowable. Even if a decision maker knows the exact system parameters, those parameters are likely to remain latent in the background relative to the signals they produce. Massey and Wu (2005) integrated these ideas into their system-neglect hypothesis. Because the weight W is less of a determinant of actual behavior than Equation (3) implies for the optimal benchmark, we expect the behavioral α(W) to be less responsive to W than α*(W):

dα(W)/dW < dα*(W)/dW.  (5)

… 300. In a few cases, typographical errors were obvious, and the forecasts were corrected. If the intended forecast could not be determined but the response appeared to be a typographical error (e.g., one forecast of 20 within a long series of forecasts between 700 and 900), that forecast was recorded as missing. However, such corrections were rare.


… has correct beliefs about the structure of the data-generating process. Because beliefs are typically unobservable, an additional empirical challenge arises because all incorrect beliefs may be inappropriately attributed to α(W). For example, although the data-generating process from Equation (1) does not incorporate a systematic trend component, the process can produce sequences of demand signals that lend themselves to the perception of trends where there are none (see the sample demand paths in Figure 2). Although an overall assessment of the data should have led to the conclusion not to expect trends (see Appendix B), a nonholistic assessment of the time series may produce the illusion of short-term trends. In such a situation, simply imposing single exponential smoothing as the model of behavior could bias our conclusions. This essentially is Brav and Heaton's (2002) argument that it is difficult to distinguish between models of irrational behavior under structural certainty (individuals use single exponential smoothing but with biased estimates of α) and models of rational behavior under structural uncertainty (lacking knowledge of the true structure of the demand environment, individuals optimally update the parameters of an incorrect forecasting model and only gradually learn its true structure). To address this empirical challenge, our analysis in §4.2.2 will consider more generalized forecasting models capable of capturing behavior that goes beyond single exponential smoothing.3

    4.2. Results

4.2.1. Initial Analyses. Let F̄_t = (1/I) Σ_i F_it denote the average forecast across all I individuals within a given condition. The optimal forecast for period t is given by F*_t = F*(𝒟_{t−1}; α*(W)). Through its dependence on the smoothing constant α*(W) and the demand realizations 𝒟_{t−1}, F*_t is specific to each of the six conditions (which differ by W) as well as to each of the four demand sets within a condition (which differ by the vector of demand realizations).

Table 2 compares the observed mean absolute forecast error, MAE(D_t, F_it) = (1/(SIT)) Σ_{s,i,t} |F_{sit} − D_{st}|, the T-period average across all I subjects in all S demand data sets within a given demand environment, over all conditions. Simple t-tests (p < 0.01) confirm that the observed mean absolute error is significantly larger than the corresponding error measure based on optimal forecasts, MAE(D_t, F*_t).

3 Alternatively, the researcher may try to control experimentally what subjects know, e.g., by revealing the structure and parameters of the data-generating process. Although this approach seems reasonable in the context of simple coin-toss experiments, it is questionable to assume that a decision maker can efficiently exploit information on the demand process described by Equations (1a) and (1b) and its parameters.

Table 2 Observed Forecasting Performance Measured by MAE

            σ_n = 10         σ_n = 40
σ_c = 0     10.15 (7.75)     38.55 (30.74)
σ_c = 10    16.42 (12.86)    47.36 (36.51)
σ_c = 40    38.94 (34.34)    64.03 (53.54)

Notes. Optimal performance is in parentheses. All differences between observed and optimal MAEs are significant (p < 0.01).

Furthermore, a comparison across environments is consistent with our expectation that performance deteriorates as noise (σ_n) and change (σ_c) increase.

Figure 4 illustrates the evolution of demand D_t, the average observed forecast F̄_t, and the normative forecast F*_t for one example data set in each condition (other data sets look similar). We can make a number of observations without formal analysis. The average observed forecasts (gray line) lag the evolution of demand (dots), which is consistent with exponential smoothing. This behavior is suboptimal in the stable demand environments (conditions 1 and 2), where the correct forecasts F*_t (black line) do not react at all to demand signals. Furthermore, although both the observed and the optimal forecasts smooth some of the variability in demand signals, there is more variability in the series of observed forecasts than in the series of optimal forecasts. This is most visible in condition 4.

Figure 4 Example Series of Demand, Average Observed Forecast, and Normative Forecast. (Six panels, one per condition as in Figure 2, plotting demand, the average observed forecast, and the normative forecast over time.)


Figure 5 Illustration of Ranges for the Adjustment Score α_it. (Sketch: relative to the previous forecast F_{i,t−1} and the demand observation D_{t−1}, a score α_it < 0 indicates a negative trend belief or the gambler's fallacy, 0 ≤ α_it ≤ 1 is consistent with exponential smoothing, and α_it > 1 indicates that the subject believes in a trend.)

Next, we analyze observed forecast adjustments. To formalize adjustments as a response to the observed forecast error, we define the adjustment score α_it = (F_it − F_{i,t−1})/(D_{t−1} − F_{i,t−1}), which follows immediately from rearranging the single exponential smoothing formula in Equation (2).4 We use this ratio to categorize behavior, as illustrated conceptually in Figure 5. A score of α_it < 0 indicates that subjects adjusted their forecast in the opposite direction of their forecast error (11% of all observations). Possible explanations for such behavior include that subjects may have incorporated a prior (illusionary) trend, or that they acted in accordance with the gambler's fallacy, i.e., believing that high values of a stable series balance out with low values in small samples. An adjustment score of α_it = 0 (10% of all observations) indicates no reaction. If the adjustment score falls between 0 and 1 (42% of all observations), the forecast is consistent with single exponential smoothing. Finally, an adjustment score of α_it > 1 (37% of all observations) indicates that subjects were extrapolating illusionary trends into the future. This initial analysis highlights that although error-response level adjustment is the most likely response pattern, there is strong evidence that subjects also tended to adjust their forecasts outside the range of possibilities consistent with single exponential smoothing.

To provide a brief aggregate analysis5 of forecast adjustments across conditions, we calculated average adjustment scores ᾱ = (1/(SIT)) Σ_{s,i,t} α_{sit}, as shown in Table 3.6 Although such average scores should be interpreted with caution, we can make several directional observations. First, the average reaction ᾱ increases in σ_c and decreases in σ_n. This observation is directionally in line with the normative predictions from Equation (3). Second, with the exception of condition 5, the average adjustment score differs from the normative reaction and shows evidence of overreaction.

4 By construction, this score is not defined for the first period, nor for periods in which D_{t−1} = F_{i,t−1}. In such cases, adjustment scores were recorded as missing. This ratio has also been used as an adjustment score in newsvendor research (e.g., Schweitzer and Cachon 2000).
5 Because excessively high and low adjustments can have a strong influence on this analysis, we remove all |α_it| ≥ 2 (8% of observations) from the calculations here.

4.2.2. A Generalized Model of Forecasting Behavior. The previous section highlights that actual behavior is not completely captured by single exponential smoothing. Identifying a descriptively accurate behavioral forecasting model is ultimately an empirical question. Rather than imposing single exponential smoothing, we allow the data to select a preferred model of forecasting behavior. We include two generalizations in the empirical specification of behavior: initial anchoring and illusionary trends. Initial anchoring refers to the well-documented tendency of individuals to anchor their decisions on an artificial or self-generated value (Epley and Gilovich 2001). Illusionary trends refers to the idea that individuals are quick to see trends where there are none (DeBondt 1993).

We conceptualize forecasts as containing three structural components: a level estimate L_t, a trend estimate T_t, and "trembling hands" noise ν_t, leading to a generalized structural equation for the forecast F_{t+1}:

F_{t+1} = L_t + T_t + ν_t.  (6)

We include the noise term because human decision making is known to be inherently stochastic (Rustichini 2008). Note that in our experiment we do not observe the components in Equation (6) explicitly, and we therefore replace these components with forecasts and demand observations. We specify the level term in Equation (6) as

L_t = α_L [F_t + α(D_t − F_t)] + (1 − α_L)C.  (7)

The specification in Equation (7) introduces the anchoring parameter α_L and the constant C. Although exponential smoothing suggests that forecasters correctly and exclusively anchor on their previous forecasts, the literature on anchor-and-adjustment heuristics often includes the initial values of a time series as an additional anchor (Chapman and Johnson 2002, Baucells et al. 2011).

6 Note that to compare average adjustment scores between conditions, we did not rely on simple paired t-tests because of the nested nature of our data. Instead, comparisons were made using a nested random-effects model, with observations nested in subjects, subjects nested in data sets, and data sets nested in conditions.


Table 3 Average Forecast Adjustment Scores

             σ_n = 10               σ_n = 40
             ᾱ (s.e.)     α*(W)     ᾱ (s.e.)     α*(W)
σ_c = 0      0.59 (0.03)  0.00      0.56 (0.03)  0.00
σ_c = 10     0.75 (0.03)  0.62      0.69 (0.03)  0.22
σ_c = 40     0.89 (0.03)  0.94      0.79 (0.03)  0.62

Notes. The unit of analysis is the forecast. Standard errors are reported in parentheses; all significance tests were done using Wald tests. Each average adjustment score is significantly different from its normative value at p < 0.01. Wald tests comparing average adjustment scores between the σ_n = 10 and σ_n = 40 conditions yield p = 0.53 (σ_c = 0), p = 0.27 (σ_c = 10), and p = 0.05 (σ_c = 40); comparisons between adjacent σ_c levels are significant at p < 0.01 within both noise levels. See Table C.2 for a breakdown of sample size by condition.

The parameter α_L represents a generalization that allows individuals either to anchor their forecasts only on previous forecasts (α_L = 1), only on the initial and constant value C (α_L = 0), or on some weighted average of these two extremes (0 < α_L < 1).
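As a reading aid, the level component in Equation (7) reduces to a one-line update. The following minimal sketch (the trend component T_t and the noise term of Equation (6) are deliberately omitted) shows how α_L interpolates between pure exponential smoothing and a constant anchor.

```python
def level_component(F_t, D_t, alpha, alpha_L, C):
    """Equation (7): an alpha_L-weighted mix of the error-corrected previous
    forecast and a constant anchor C.

    alpha_L = 1 recovers single exponential smoothing; alpha_L = 0 pins the
    level estimate to the initial anchor C.
    """
    smoothed = F_t + alpha * (D_t - F_t)      # Equation (2) update
    return alpha_L * smoothed + (1.0 - alpha_L) * C
```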


First, consider the estimates of the anchoring parameter α_L. In conditions 1 and 2, an optimal forecaster should not react to demand signals at all (α* = 0) and should instead base forecasts on the long-run average. Observing a tendency for such anchoring in these conditions is therefore not surprising. In conditions 3 and 5 (i.e., the low-noise conditions), there is no evidence of initial anchoring, whereas there is a weak but significant tendency to base forecasts partially on an initial anchor in conditions 4 and 6 (the high-noise conditions). This observation implies that noise in the time series may increase the tendency to incorporate initial anchors.

Second, estimates of the trend-updating parameter are positive and significant in conditions 1, 3, 5, and 6, indicating that participants tend to incorporate illusionary trends into their forecasts. A rational forecaster optimally learning the unknown parameters should not perceive and update trends in any of the demand conditions of our study (see Appendix B). However, although the data-generating process in Equation (1) has no trend component by construction, it can produce consecutive demand signals that may easily be misinterpreted as trends by a human forecaster. Interestingly, the occurrence of trend-like sequences of demand signals, and their perception as real trends, seems to depend on the parameters σ_c and σ_n. For example, the trend-updating parameter is not significant in conditions 2 and 4, suggesting that illusionary trends are less prevalent with increasing noise. To some degree, this is expected: Noise appears as temporary variation in the time series, thereby reducing the false impression of persistent (trend-like) changes in the level, though the temporary nature of noise may become more difficult to detect in conditions of higher change (as in condition 6). Clearly, acting on illusionary trends can have a detrimental impact on important planning decisions in many settings. A thorough investigation of illusionary trends would require a different experimental design, including demand conditions that produce actual trends.

Third, our generalized forecasting model in Equation (10) includes ΔF_t = F_t − F_{t−1} as an independent variable, which captures the effect of previous forecast adjustments on the current forecast. We observe that the effects of ΔF_t are generally negative and significant (Table C.2 in Appendix C). In other words, participants tended to decrease their forecasts if they had recently increased them, and to increase their forecasts if they had recently decreased them. A possible post hoc explanation for these negative effects may be some form of counterpoise adjustment, whereby a recent change in a forecast is attenuated by a subsequent forecast adjustment in the opposite direction.

4.2.4. Performance Implications. Next, we explore how the decision biases uncovered in the previous section affect forecasting performance. Using the estimates from our generalized forecasting model, we can attribute losses in forecasting performance to two classes of (mis)behavior: systematic decision biases, such as a misspecified error response (α ≠ α*), initial anchoring (α_L ≠ 1), or illusionary trends (β > 0), and unsystematic "trembling hands" random errors. To separate these two sources of performance loss, we calculate the forecast performance of three types of forecast evolutions for each demand seed s: the normative, the observed, and the behaviorally predicted forecasts.

The normative forecast for period t of seed s is defined by F*_st = F*(𝒟_{s,t−1}; α*(W)), where α*(W) is common to all demand seeds within a demand environment. The observed forecast of subject i for period t of seed s is denoted by F_sit. The predicted forecasts are defined as F̂_sit = F̂(𝒟_{s,t−1}, F_{si,t−1}; θ̂_si), where θ̂_si are the estimated parameters of our generalized forecasting model, including best linear unbiased predictions of the random effects at the data-set and individual levels (see Bates and Pinheiro 1998). The predicted forecasts F̂_sit are seed- and individual-specific forecasts that, unlike the observed F_sit, were filtered through the structural estimation of our generalized forecasting model. We measure performance for each of our six demand conditions as the mean absolute forecast error, averaged across all I subjects i and all S seeds s within that condition. Formally, for the observed forecasts, we define MAE_O = (1/(SIT)) Σ_{s,i,t} |F_{sit} − D_{st}|, and we make equivalent definitions for the normative (MAE_N) and predicted (MAE_P) forecasts.8 Using these definitions, we describe the relative total performance loss from observed forecasts as (MAE_O − MAE_N)/MAE_N. We capture the loss in forecasting performance that is due to systematic decision biases (Loss 1) as (MAE_P − MAE_N)/MAE_N, and the loss in forecasting performance that is due to unsystematic random noise in decision making (Loss 2) as (MAE_O − MAE_P)/MAE_N. Table 6 provides an overview of the results.
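The decomposition can be checked mechanically; the snippet below (a minimal sketch with an illustrative function name) reproduces, for example, the condition 1 (σ_c = 0, σ_n = 10) entries of Table 6.

```python
def loss_decomposition(mae_n, mae_p, mae_o):
    """Loss 1 (systematic), Loss 2 (random), and total loss, relative to MAE_N."""
    return {"loss1_systematic": (mae_p - mae_n) / mae_n,
            "loss2_random": (mae_o - mae_p) / mae_n,
            "loss_total": (mae_o - mae_n) / mae_n}

# Condition 1 from Table 6: MAE_N = 7.75, MAE_P = 8.86, MAE_O = 10.15
print(loss_decomposition(7.75, 8.86, 10.15))  # -> 14%, 17%, 31%
```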

As expected, all MAEs increase in σ_c and σ_n, but this result must be interpreted with caution because different environments produce different forecast performance owing to inherent differences in randomness and complexity. Forecasting performance relative to optimal performance (Loss (Total)) improves in less stable environments. In general, the relative loss in performance that is due to random decision making (Loss 2) is at least as high as the relative loss of performance that is due to decision biases (Loss 1). Counter to intuition, the relative loss in performance that is due to systematic error (Loss 1) is lower in conditions with high change (σ_c = 40) than in conditions with little (σ_c = 10) or no (σ_c = 0) change. It seems that the decision heuristics individuals use to make forecasts work comparatively better in unstable, changing environments and become more biased in stable environments.

8 Because insufficient forecasting history means we cannot fit our generalized forecasting model for periods 30–32, and because period 79 is the last period that results in an observed error, we use all forecasts made in periods 33–79.


Table 6 Mean Absolute Forecast Errors and Performance Loss

           Normative       Predicted       Observed        Loss 1        Loss 2        Loss (Total)
           (MAE_N)         (MAE_P)         (MAE_O)         (systematic)  (random)
σ_c = 0    7.75 / 30.74    8.86 / 34.88    10.15 / 38.55   14% / 13%     17% / 12%     31% / 25%
σ_c = 10   12.86 / 36.51   14.34 / 42.01   16.42 / 47.36   11% / 15%     16% / 15%     28% / 30%
σ_c = 40   34.34 / 53.54   35.41 / 56.82   38.94 / 64.03   3% / 6%       10% / 13%     13% / 20%

Notes. Each cell reports σ_n = 10 / σ_n = 40. Loss 1 = (MAE_P − MAE_N)/MAE_N; Loss 2 = (MAE_O − MAE_P)/MAE_N; Loss (Total) = (MAE_O − MAE_N)/MAE_N.


5. Conclusion
We investigate judgmental time-series forecasting in environments that can be precisely described by their stability and noise. Behavior is somewhat consistent with the mechanics of single exponential smoothing, the normative benchmark in our context. However, subjects tend to overreact to observed forecast errors in relatively stable time series and to underreact to forecast errors in less stable time series. This pattern is consistent with the system-neglect hypothesis, which posits that individuals place too much emphasis on the signals they receive relative to the system that generates the signals (Massey and Wu 2005). Our research provides empirical support for this hypothesis in a "many small changes" time-series forecasting context, which is notably different from the "few big changes" environments commonly investigated in the regime-change literature.

Our results show that decisions made in a stable environment exhibit stronger systematic decision biases than decisions made in less stable environments. The decline in forecasting performance that is due to randomness in decisions (Loss 2) is at least as strong as the decline that is due to the systematic biases we uncovered (Loss 1). Human judgment appears to be better adapted to detecting change in volatile environments than to exploiting information in stable environments. A tendency to overreact to noise may be the result of a decision heuristic geared toward the detection of and adaptation to change. This finding suggests that managerial judgment in forecasting is better suited to unstable environments than to stable ones, so particular emphasis should be placed on automating decision making in stable environments. Additionally, because randomness in decision making may be mitigated by having multiple individuals independently prepare forecasts and then averaging those forecasts (Larrick and Soll 2006), large benefits may be achieved by simply averaging multiple independent judgments for a forecast.

The system-neglect framework lends itself to the design of managerial interventions. If decision bias in forecasting is due to the salience of forecast errors compared to the latent demand process that generated these errors, forecasting performance should improve if either the salience of forecast errors is reduced or the demand process is reemphasized before making a decision.9 Future research should address these two avenues in more detail.

Our results relate to the growing literature on behavioral operations management. For example, experimental studies of simple newsvendor settings have documented a persistent tendency to chase demand in stationary environments (Schweitzer and Cachon 2000, Bolton and Katok 2008, Kremer et al. 2010). Our study suggests that this tendency may be a forecasting phenomenon and not exclusively related to inventory ordering. While subjects in most newsvendor studies are given full knowledge about the underlying demand-generating system, the system-neglect hypothesis suggests that the signals and feedback they observe will encourage partial neglect of that knowledge. Therefore, decomposing such inventory decisions into their forecasting and ordering components may be a fruitful and important endeavor. Further, newsvendor studies often assume that using a stationary and known demand environment makes the forecasting task simpler, but our results suggest that stable environments lead to more biased decision making. If subjects neglect their knowledge of the system and change forecasts based on signals, performance deteriorates more in stable environments than in unstable demand environments.

9 We conducted an initial experiment based on the idea of highlighting the demand process. Specifically, we had participants generate forecasts sequentially in the four data sets within a condition, i.e., subjects would prepare a forecast in data set 1, then in data set 2, then in data set 3, etc. This design was intended as a simple manipulation that might reemphasize the underlying system. We compared performance to a design in which participants made successive forecasts in one data set before moving on to the next. We found no consistent performance improvement using the alternative design, suggesting that we were unable to reemphasize the system with this manipulation.


Finally, subjects in most newsvendor and beer game studies are confronted with demand stimuli in quick succession, a context that provides particularly salient demand signals. Our study suggests that decision makers may perform better when the relative salience of recent demand signals is mitigated, such as by reemphasizing the environment before making the next decision.

Our study has several limitations. Although our analyses explicitly controlled for initial anchoring and illusionary trends, our study was not designed to analyze these behaviors in detail. Future research should further explore these behavioral phenomena and explicitly capture predictable changes in the level, such as real trends and seasonality. Further, our forecasting context assumes that forecasters have no quantitative forecasting support available other than a graph and a history table. However, in practice, many forecasts are judgmental adjustments based on a quantitative forecasting method (Fildes et al. 2009). Future research could explicitly address the impact that quantitative decision support has on human judgment.

In general, our understanding of human judgment in nonstationary environments is limited, partially because analyzing such contexts is not trivial. Our research suggests a method for formally capturing a persistent judgment bias and its relationship to the parameters describing a nonstationary environment. Our results can provide a solid theoretical and empirical basis for future research on how to design information and incentive systems that are resistant to the kinds of judgment biases we observed. For example, this framework lends itself to the study of real-world forecasting processes in more complex organizational and functional environments, such as the incentive conflicts that frequently arise in sales forecasting processes at the interface between marketing and operations (e.g., Oliva and Watson 2009).

Additionally, the implications of our study may be relevant to many fields beyond operations management. For example, our framework may be useful for the study of overreaction and illusionary trends in stock markets, for examining how medical doctors interpret longitudinal data on their patients, or perhaps as a window for understanding human reactions to climate change. In general, our research points out that when faced with a time series, decision makers discount distant information and place stronger weight on recent information, a strategy consistent with adaptation to changing environments rather than information exploitation in stable environments. We hope that further studies will examine this phenomenon in broader business and societal contexts, and study its implications for performance, welfare, and policy.

Acknowledgments
The authors thankfully acknowledge research support by the Carlson School of Management Dean's Research Grant and by a grant from the Smeal College of Business. They are also thankful for the many helpful comments made by participants at the annual Behavioral Operations conference, as well as seminars at the Carlson School of Management and the Darden School. They are also grateful for the constructive feedback provided by an anonymous associate editor and three anonymous reviewers.

Appendix A. Pretest Information
Prior to the study presented in this paper, we completed a pretest of our experiment. The task, experimental parameters, software, and functionality were very similar to the baseline study reported here, with two exceptions: First, participants in the pretest made decisions for only 40 consecutive periods, whereas the data presented here are based on 50 periods. Second, the students in the pretest were given extra course credit for participating and were entered into a drawing for one cash reward per section. We conducted the same statistical tests on our pretest data and found results that are directionally identical to the ones reported here. The pretest was predominantly used to determine whether subjects should receive a graph of the time series, and whether providing qualitative information on the demand series (a product with "stable" or "unstable" demand) influenced performance. The final design (subjects received a graph but no qualitative information) corresponds to the setting in the pretest in which subjects had the best performance.

Appendix B. One-Step-Ahead Exponential Smoothing
The optimal α is unknown and unknowable for subjects. The best they can do is estimate an optimal α given the data that they have. This fact has implications for forecasting performance that we should consider when creating a normative benchmark. Subjects can only estimate an optimal α at any point in time, given the data they have until then, and use this estimate to predict the future. In this section, we briefly examine how such a one-step-ahead (OSA) procedure differs from the normative benchmark we employ in our study.

A first decision is which optimality criterion to use when estimating the optimal OSA alpha. We use the MAPE as an optimality criterion, in addition to a maximum likelihood approach (also referred to as SES) (Hyndman et al. 2002). We also use a maximum likelihood procedure that allows for simultaneous parameter estimation in the context of double exponential smoothing (DES) (Andrawis and Atiya 2009). Because we have 24 data sets, each with 50 forecasts, using three methods, this resulted in 24 × 50 × 3 = 3,600 optimizations to find all optimal OSA alphas.
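As an illustration of the MAPE variant, the sketch below estimates a one-step-ahead alpha by grid search. The grid resolution and the choice to initialize the forecast at the first observation are assumptions; the SES and DES variants instead use likelihood-based objectives.

```python
import numpy as np

def osa_alpha_mape(history, grid=np.linspace(0.0, 1.0, 101)):
    """One-step-ahead alpha chosen to minimize in-sample MAPE."""
    D = np.asarray(history, dtype=float)

    def mape(alpha):
        F, errors = D[0], []        # initialize on the first observation
        for d in D[1:]:
            errors.append(abs(d - F) / abs(d))
            F += alpha * (d - F)    # Equation (2)
        return float(np.mean(errors))

    return min(grid, key=mape)
```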

All methods produce alphas close to 0 for all data sets in conditions 1 and 2. The OSA benchmark is therefore no different from our normative benchmark in these conditions. Furthermore, the double exponential smoothing method never produces an optimal beta that is far from 0, indicating that our data indeed show little evidence of trends.


Table B.1 OSA Alphas

               Condition 3    Condition 4    Condition 5    Condition 6
Period         30     80      30     80      30     80      30     80
Normative      0.62   0.62    0.22   0.22    0.94   0.94    0.62   0.62
OSA (MAPE)     0.65   0.66    0.21   0.22    0.90   0.85    1.00   0.78
OSA (SES)      0.69   0.67    0.24   0.24    0.90   0.88    0.86   0.73
OSA (DES)      0.65   0.67    NA     NA      0.90   0.87    0.77   0.71

Table B.2 Mean Absolute Errors Using OSA Alphas

Condition    Normative    OSA (MAPE)    OSA (SES)    OSA (DES)
3            13.11        13.01         13.02        13.00
4            36.59        36.64         36.66        NA
5            33.92        33.45         33.58        33.50
6            53.95        54.77         53.36        53.06
Avg.         33.66        33.74         33.32        33.19

For the remaining conditions, alphas are generally close to the normative benchmark and, with the possible exception of condition 6, show little evidence of moving closer to the normative benchmark as more data are revealed throughout the experiment beyond the initial demand history of 30 periods. We summarize the average alphas over all data sets in a condition, early and late in the experiment, in Table B.1.

Furthermore, we test whether the overall MAEs using any of these out-of-sample procedures differ from the MAEs using our normative benchmark. We calculate the absolute error for each observation in each condition using each method and then average these absolute errors over all observations within each condition to obtain an overall MAE for each condition and method. Results from this analysis are summarized in Table B.2.

As can be seen in Table B.2, forecasting performance employing the different OSA methods and using our normative approach is quite similar. This is a result of the OSA estimates being fairly close to the normative α*(W), and of the objective function defined by the absolute forecast errors using single exponential smoothing being relatively smooth around the optimum. Therefore, it matters little whether we use the normative benchmark or any of the out-of-sample methods. Nevertheless, we replicate our performance comparison from Table 6 with the OSA (SES) errors in Table C.1 below.

Table C.1 Performance Comparison Using OSA Procedure

           Normative       Predicted       Observed        Loss 1        Loss 2        Loss (Total)
           (MAE_N)         (MAE_P)         (MAE_O)         (systematic)  (random)
σ_c = 0    7.75 / 30.74    8.86 / 34.88    10.15 / 38.55   14% / 13%     17% / 12%     31% / 25%
σ_c = 10   12.81 / 36.59   14.34 / 42.01   16.42 / 47.36   12% / 15%     16% / 15%     28% / 29%
σ_c = 40   34.00 / 52.67   35.41 / 56.82   38.94 / 64.03   4% / 8%       10% / 14%     15% / 22%

Notes. Each cell reports σ_n = 10 / σ_n = 40. The normative benchmark is based on the OSA (SES) alphas; loss definitions are as in Table 6.

Appendix C. Econometric Specification and Estimation Details
Equation (10) provides the basis for the behavioral model we estimate in our analysis. An empirical problem with Equation (10) is that we do not observe data on T_{t−1}, which could bias the empirical results. To at least partially control for this potential bias, we propose to estimate Equation (10) with the additional independent variables ΔD_{t−1} and ΔF_{t−1}, leading to the following empirical specification (where E_t = D_t − F_t denotes the observed forecast error):

F_{t+1} = a_1 E_t + (a_2 + 1)F_t + a_3 ΔD_t + a_4 ΔD_{t−1} + a_5 ΔF_{t−1} + constant + ν_t.  (C1)

Finally, the following (equivalent) specification of Equation (C1) provides an easier comparison of nested models, so it serves as our primary empirical specification:

F_{t+1} = a_1 E_t + (a_2 + 1)F_t + a_3 ΔD_t + a_4 ΔD_{t−1} + a_5 ΔF_t + a_6 ΔF_{t−1} + constant + ν_t.  (C2)

In general, an observation at time t in the experiment is nested in subject i, who is nested in demand data set s, which is nested in experimental condition (i.e., demand environment) c. Because we estimate our model within each condition, this approach yields a three-level nested structure of error terms, such that we have random intercepts v_s and w_i. Furthermore, we believe that the behavioral parameters of our model vary considerably depending both on the actual data set being observed and on the individual performing the forecast. This expectation implies that a_1 through a_4 should be modeled as random coefficients. However, results from our pretest show that although there was some variance in a_1 and a_3, there was little variance in the other two coefficients. Estimating random-coefficient models in which the coefficients have little variance can lead to nonconvergence and inappropriate standard errors. Therefore, we estimate only a_1 and a_3 as random slopes. This three-level random-effects model effectively controls for the dependence among observations in our data set. In summary, we can write

F^{csi}_{t+1} = a^{si}_1 E_t + (a_2 + 1)F_t + a^{si}_3 ΔD_t + a_4 ΔD_{t−1} + a_5 ΔF_t + a_6 ΔF_{t−1} + constant + v_{s(c)} + w_{i(sc)} + ν_t.  (C3)

All random coefficients are estimated as having a normal distribution. In our results, we use μ and σ to refer to the mean and standard deviation of that distribution. For example, μ_{E_t} refers to the mean of the random slope a_1, whereas σ^i_{E_t} refers to the standard deviation of that slope


at the individual level. The behavioral parameters of Equation (10) can then be calculated as follows: α = β_{E_t}, L = a_2 + 1, δ = β_{D_t}/β_{E_t}, and C = constant/a_2.
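As a worked example of this mapping, here is a minimal sketch with made-up coefficient estimates (δ is used as an assumed label for the ratio β_{D_t}/β_{E_t}):

    # Map fitted coefficients to the behavioral parameters of Equation (10).
    beta_Et = 0.62       # mean of the random slope a_1 on E_t
    beta_Dt = 0.21       # mean of the random slope a_3 on D_t
    a2 = -0.15           # fixed coefficient entering L = a_2 + 1
    constant = 45.0      # fitted intercept

    alpha = beta_Et              # error-reaction parameter
    L = a2 + 1                   # anchor weight on the prior forecast
    delta = beta_Dt / beta_Et    # demand-weight ratio (symbol assumed)
    C = constant / a2            # anchor constant
    print(alpha, L, delta, C)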

References

Adams, J. A. 1968. Response feedback and learning. Psych. Bull. 70(6) 486–504.
Andrawis, R. R., A. F. Atiya. 2009. A new Bayesian formulation for Holt's exponential smoothing. J. Forecasting 28(3) 218–234.
Andreassen, P. B., S. J. Kraus. 1990. Judgmental extrapolation and the salience of change. J. Forecasting 9(4) 347–372.
Asparouhova, E., M. Hertzel, M. Lemmon. 2009. Inference from streaks in random outcomes: Experimental evidence on beliefs in regime shifting and the law of small numbers. Management Sci. 55(11) 1766–1782.
Barberis, N., A. Shleifer, R. Vishny. 1998. A model of investor sentiment. J. Financial Econom. 49(3) 307–343.
Barry, D. M., G. F. Pitz. 1979. Detection of change in nonstationary, random sequences. Organ. Behav. Human Performance 24(1) 111–125.
Bates, D. M., J. C. Pinheiro. 1998. Computational methods for multilevel modeling. Technical Memorandum BL0112140-980226-01TM, Bell Labs, Lucent Technologies, Murray Hill, NJ.
Baucells, M., M. Weber, F. Welfens. 2011. Reference-point formation and updating. Management Sci. 57(3) 506–519.
Bendoly, E., K. Donohue, K. L. Schultz. 2006. Behavior in operations management: Assessing recent findings and revisiting old assumptions. J. Oper. Management 24(6) 737–752.
Bloomfield, R., J. Hales. 2002. Predicting the next step of a random walk: Experimental evidence of regime-shifting beliefs. J. Financial Econom. 65(3) 397–414.
Bolger, F., N. Harvey. 1993. Context-sensitive heuristics in statistical reasoning. Quart. J. Experiment. Psych. 46A(4) 779–811.
Bolton, G., E. Katok. 2008. Learning-by-doing in the newsvendor problem: A laboratory investigation of the role of experience and feedback. Manufacturing Service Oper. Management 10(3) 519–538.
Brav, A., J. B. Heaton. 2002. Competing theories of financial anomalies. Rev. Financial Stud. 15(2) 575–606.
Carbone, R., W. Gorr. 1985. Accuracy of judgmental forecasting of time series. Decision Sci. 16(2) 153–160.
Chapman, G. B., E. J. Johnson. 2002. Incorporating the irrelevant: Anchors in judgments of belief and value. T. Gilovich, D. Griffin, D. Kahneman, eds. Heuristics and Biases. Cambridge University Press, Cambridge, UK, 120–138.
Chopra, S., P. Meindl. 2009. Supply Chain Management. Prentice Hall, Upper Saddle River, NJ.
Croson, R., K. Donohue. 2003. Impact of POS data sharing on supply chain management: An experimental study. Production Oper. Management 12(1) 1–11.
Croson, R., K. Donohue, E. Katok, J. Sterman. 2005. Order stability in supply chains: Coordination risk and the role of coordination stock. Working paper, University of Texas at Dallas, Richardson.
DeBondt, W. F. M. 1993. Betting on trends: Intuitive forecasts of financial risk and return. Internat. J. Forecasting 9(3) 355–371.
Edwards, W. 1968. Conservatism in human information processing. B. Kleinmuntz, ed. Formal Representation of Human Judgment. Wiley, New York, 17–52.
Epley, N., T. Gilovich. 2001. Putting adjustment back in the anchoring and adjustment heuristic. Psych. Sci. 12(5) 391–396.
Fildes, R., P. Goodwin, M. Lawrence, K. Nikolopoulos. 2009. Effective forecasting and judgmental adjustments: An empirical evaluation and strategies for improvement in supply-chain planning. Internat. J. Forecasting 25(1) 3–23.
Fischbacher, U. 2007. z-Tree: Zurich toolbox for ready-made economic experiments. Experiment. Econom. 10(2) 171–178.
Gardner, E. S. 1985. Exponential smoothing: The state of the art. J. Forecasting 4(1) 1–28.
Gardner, E. S. 2006. Exponential smoothing: The state of the art, Part II. Internat. J. Forecasting 22(4) 637–666.
Gehring, W. J., B. Goss, M. G. H. Coles, D. E. Meyer, E. Donchin. 1993. A neural system for error detection and compensation. Psych. Sci. 4(6) 385–390.
Graves, S. C. 1999. A single-item inventory model for a nonstationary demand process. Manufacturing Service Oper. Management 1(1) 50–61.
Griffin, D., A. Tversky. 1992. The weighing of evidence and the determinants of confidence. Cognitive Psych. 24(3) 411–435.
Harrison, P. J. 1967. Exponential smoothing and short-term sales forecasting. Management Sci. 13(11) 821–842.
Harvey, N. 2007. Use of heuristics: Insights from forecasting research. Thinking Reasoning 13(1) 5–24.
Hyndman, R. J., A. B. Koehler, R. D. Snyder, S. Grose. 2002. A state space framework for automatic forecasting using exponential smoothing methods. Internat. J. Forecasting 18(3) 439–454.
Kahneman, D., A. Tversky. 1972. Subjective probability: A judgment of representativeness. Cognitive Psych. 3(3) 430–454.
Kremer, M., S. Minner, L. N. Van Wassenhove. 2010. Do random errors explain newsvendor behavior? Manufacturing Service Oper. Management 12(4) 673–681.
Larrick, R. P., J. B. Soll. 2006. Intuitions about combining opinions: Misappreciation of the averaging principle. Management Sci. 52(1) 111–127.
Lawrence, M., M. O'Connor. 1992. Exploring judgmental forecasting. Internat. J. Forecasting 8(1) 15–26.
Lawrence, M., M. O'Connor. 1995. The anchor and adjustment heuristic in time-series forecasting. J. Forecasting 14(5) 443–451.
Lawrence, M. J., R. H. Edmundson, M. J. O'Connor. 1985. An examination of the accuracy of judgmental extrapolation of time series. Internat. J. Forecasting 1(1) 25–35.
Lawrence, M., P. Goodwin, M. O'Connor, D. Önkal. 2006. Judgmental forecasting: A review of progress over the last 25 years. Internat. J. Forecasting 22(3) 493–518.
Makridakis, S., M. Hibon. 2000. The M3-competition: Results, conclusions and implications. Internat. J. Forecasting 16(4) 451–476.
Makridakis, S., S. Wheelwright, R. Hyndman. 1998. Forecasting: Methods and Applications. Wiley, New York.
Massey, C., G. Wu. 2005. Detecting regime shifts: The causes of under- and overreaction. Management Sci. 51(6) 932–947.
McNamara, J. M., A. I. Houston. 1987. Memory and the efficient use of information. J. Theoretical Biology 125(4) 385–395.
Muth, J. F. 1960. Optimal properties of exponentially weighted forecasts. J. Amer. Statist. Assoc. 55(290) 299–306.
Nahmias, S. 2008. Production and Operations Analysis. Irwin, Chicago.
Oliva, R., N. Watson. 2009. Managing functional biases in organizational forecasts: A case study of consensus forecasting in supply chain planning. Production Oper. Management 18(2) 138–151.
Poteshman, A. M. 2001. Underreaction, overreaction, and increasing misreaction to information in the options market. J. Finance 56(3) 851–876.
Rabin, M. 2002. Inference by believers in the law of small numbers. Quart. J. Econom. 117(3) 775–816.
Rabin, M., D. Vayanos. 2010. The gambler's and hot-hand fallacies: Theory and applications. Rev. Econom. Stud. 77(2) 730–778.
Rustichini, A. 2008. Neuroeconomics: Formal models of decision-making and cognitive neuroscience. P. W. Glimcher, C. Camerer, R. Poldrack, E. Fehr, eds. Neuroeconomics. Elsevier, London, 33–46.

Sanders, N. 1992. Accuracy of judgmental forecasts: A comparison. Omega 20(3) 353–364.
Sanders, N. 1997. The impact of task properties feedback on time series judgmental forecasting tasks. Omega 25(2) 135–144.
Sanders, N., K. B. Manrodt. 2003a. Forecasting software in practice: Use, satisfaction, and performance. Interfaces 33(5) 90–93.
Sanders, N., K. B. Manrodt. 2003b. The efficacy of using judgmental versus quantitative forecasting methods in practice. Omega 31(6) 511–522.
Schweitzer, M. E., G. Cachon. 2000. Decision bias in the newsvendor problem with a known demand distribution: Experimental evidence. Management Sci. 46(3) 404–420.
Stone, E. R., R. B. Opel. 2000. Training to improve calibration and discrimination: The effects of performance and environmental feedback. Organ. Behav. Human Decision Processes 83(2) 282–309.
Su, X. 2008. Bounded rationality in newsvendor models. Manufacturing Service Oper. Management 10(4) 566–589.
Verbeek, M. 2000. A Guide to Modern Econometrics. Wiley, New York.
Wiener, N. 1948. Cybernetics or Control and Communication in the Animal and the Machine. Wiley, New York.
