I would like to thank my adviser, Professor Pedro Dal Bó, for his guidance and support throughout the research process. Additional thanks to Professor Jeremy Kahn for serving as the second reader from the mathematics department, Professors Kenneth Chay and Anna Aizer for their suggestions, and Nicholas Coleman for providing a wealth of valuable advice. Thank you to the Office of the Dean of the College for providing funding for necessary data through a Research at Brown grant. I’d finally like to thank Russell A. Carleton, Matthew Goldman and Justin Rao for their voluntary guidance and insights that significantly contributed to this paper.

Performance Variation among Major League Baseball Closers: Field Evidence of Situational Pressure Effects

Anthony Bakshi Brown University

April 17, 2013

Abstract

Despite immense analytical advances in Major League Baseball (MLB), some traditional methods of player labor allocation continue to be used. Closers, a subset of pitchers, are primarily substituted into particular game states that qualify as save situations (SS), and conventional wisdom suggests that they enjoy motivational benefits that lead to improved performance in these states. This study investigates the persistence of the performance discrepancy after the inclusion of key variables, including controls for pitcher and hitter skill levels, and examines other potential causes. Analysis of a data set of more than 26,000 plate appearances supports a significant and positive effect of the SS state on closer performance. The results further suggest a significant and positive effect of the SS state when combined with high-pressure at-bats that considerably impact the game’s outcome. The study also contributes a “clutchness” ranking system that quantifies the heterogeneous effects of situational pressure on the performance levels of individual closers.


1. Introduction

Major League Baseball (MLB) is a professional sports organization that lies at the heart of

American culture. With origins in the late 19th century, “America’s pastime” has developed into a multi-billion dollar industry1 with a strong players union that protects its members’ guaranteed, multi-year

and often multi-million dollar contracts. As player salaries have risen dramatically since the 1980s, so

have the stakes involved in efficiently allocating wages. With the influx of statistical analysis over the last decade, a movement brought to the public sphere by Michael Lewis’ Moneyball, the sport seems destined to continue evolving toward a data-driven, wholly efficient industry.

Despite the analytical advances, some traditions persist. Closers, usually the highest-skilled relief

pitchers on teams, are deployed in a rigid manner, substituted into games to earn “saves,” an old-fashioned statistic. A save situation (SS) occurs when a pitcher enters with his team leading by one, two or three runs and attempts to convert at least the final three outs of the game.2 A non-save situation (NSS) is any other situation; the closer’s team may be losing, tied or ahead by four runs or more (Figure

1). The importance attributed to this arbitrary set of game situations by teams, players and fans helps

form an opportune area of study. There has been extensive analysis of the pitfalls of closer usage by

prominent “sabermetricians,” those who apply analytical methods to baseball. Bill James, the founder of

the movement and creator of the term, wrote that “using your relief ace to protect a three-run lead is

like a business using its top executive to negotiate fire insurance.” But the number of saves earned by

closers remains an important statistic, reflected in closer salary levels that are the highest among relief

pitchers,3 the group of pitchers substituted into games to “relieve” starting pitchers who pitch the majority of innings.

1 The MLB generated revenues of $7 billion in 2010, according to Reuters.

2 There are two less common SS: when a pitcher enters the game with the potential tying run already on base, at bat, or next to bat, or when he pitches at least three innings to finish the game while his team is ahead by any margin.
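The save-situation definition above, together with the two less common cases from footnote 2, can be sketched as a small classifier. This is only an illustration under stated assumptions: the `Entry` fields and the function name are hypothetical, and the three-inning case is approximated by the number of outs the pitcher would need to record at entry.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Game state when a relief pitcher enters (field names are illustrative)."""
    lead: int               # pitcher's team score minus opponent's score
    outs_to_finish: int     # outs the pitcher must record to end the game
    tying_run_close: bool   # potential tying run on base, at bat, or next to bat


def is_save_situation(entry: Entry) -> bool:
    """Return True if the entry state matches the SS definition in the text."""
    if entry.lead <= 0:
        return False  # tied or trailing is always a non-save situation (NSS)
    # Main case: lead of one to three runs with at least the final three outs to get.
    if entry.lead <= 3 and entry.outs_to_finish >= 3:
        return True
    # Less common case: potential tying run on base, at bat, or next to bat.
    if entry.tying_run_close:
        return True
    # Less common case (assumption): three innings, i.e. nine outs, with any lead.
    return entry.outs_to_finish >= 9
```

Under this sketch, a tie game in the ninth inning is classified as an NSS even though it may matter more to the game's outcome than a three-run lead, which is the arbitrariness the paper exploits.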

The relationship between saves and closer pay may contribute to a historical performance

premium: closers typically pitch better in SS than in NSS. An MLB.com study of closers who recorded 40

or more saves in a season between 2001 and 2010 showed a marked discrepancy in Earned Run Average

(ERA)4; the closers had a 2.28 ERA in SS and a 2.99 ERA in NSS (Singer 2011).5 This performance variation

is commonly attributed to behavioral effects by fans, baseball media members and, most curiously, the

closers themselves. Chris Perez, a closer for the Cleveland Indians who had a 2.75 ERA in SS and a 4.18

ERA in NSS in 2011, said of NSS: "Obviously, there isn't as much intensity. The game isn't on the line and

you don't feel like your back is up against the wall." Matt Capps, then a closer for the Minnesota Twins,

echoed Perez, observing that closers “have to find a way to make it the same. You have to find a way to

get over all of it, because there is a difference. The intensity level, the hitters' focus, there are a lot of

differences.” (Meisel, 2012)

This study investigates the persistence of the performance discrepancy after the inclusion of key variables, including a control for pitcher-batter matchup, and examines other potential causes of the variation. In a sample of more than 26,000 closer-batter interactions (plate appearances) across two regular seasons, the findings support a significant effect of the SS state on at-bat outcome. Closers are 12 percent less likely to allow a batter to reach base in an SS than in an NSS, all else equal. The data also support a significant effect of changes in Leverage Index (LI), a more precise measure of situational

importance that encapsulates the effect of every possible result of an at-bat on the game’s ultimate

outcome (team win or loss). Though closers are about four percent more likely to allow a hit or walk for

every unit increase of LI in an NSS, an identical increase in an SS state significantly improves the likelihood of pitcher success by a magnitude of about one percent.

3 Closers have signed approximately 90 percent (24 of 27) of contracts worth $5.5M or more annually in baseball history (Cot’s Baseball Contracts) and make approximately $2.5M more annually than relievers that are primarily used in the eighth inning (Carleton 2008).

4 Earned Run Average measures the average number of runs allowed by a pitcher every nine innings.

5 This difference matters in the context of game outcome, as teams scored an average of 4.7 runs each per game during the ten seasons studied.

These findings lend credence to the importance of game pressure and heightened intensity in SS

states that has been anecdotally communicated by closers. They also provide support for the oft-beleaguered status quo management of relief pitchers by MLB teams. If closers indeed internalize the

mantra of SS importance, as suggested by the results, and thus reap the benefits of internally driven

increases in motivation, it is sensible to continue using closers in this set of game states.

This study contributes to existing literature on the effects of psychological pressure and

motivation on agents in high-stakes environments. The results provide mixed evidence regarding the

impact of psychological pressure on athletes that has recently also been studied in professional soccer

(Apesteguia and Palacios-Huerta, 2010) and basketball (Goldman and Rao, 2012). The findings do not

support the existence of other behavioral biases among closers, such as the loss averse preferences

found by Pope and Schweitzer (2011) among professional golfers. This paper also aims to provide an

explanatory extension to the main findings by creating a method to rank closers based on their

sensitivities to game state changes. This “clutchness” statistic identifies individual situational pressure

effects on each pitcher in the sample, and the calculated coefficients quantify the effects and allow for

inter-pitcher comparisons. This measure is developed and discussed later in the paper, which proceeds

as follows.

Section 2 details the motivations for this study and provides background on related literature.

Section 3 describes the data analyzed, and Section 4 explains the chosen econometric methods and

tested hypotheses. Section 5 provides detailed explanations of the results. Section 6 explains the player

ranking system, and Section 7 discusses robustness measures, alternative hypotheses, and potential

caveats.


2. Motivation and Background

2.1 Effort and Performance

Recent research has applied psychological theories to test the normative economic convention

of a strictly positive relationship between effort and performance. Ariely et al. (2009) conducted

experiments involving games and found that high incentives can lead to decreases in performance. The

authors’ findings regarding the relationship between moderate incentives and optimal performance echo the Yerkes-Dodson Law, a psychological theory that states that optimal performance requires an

intermediate level of “arousal,” or emotional intensity (Yerkes and Dodson, 1908). Chib et al. (2012) use

fMRI technology in a neuroscience study to observe agents performing motor task experiments and also

find evidence that larger incentives can lead to performance decreases.6 Rauh and Seccia (2006) develop

a theoretical model for performance that incorporates anxiety into a framework composed of individual

skill and effort level.

Goldman and Rao (2012) contribute to this area of literature by distinguishing between the

effects of psychological pressure based on the type of game action in basketball. The authors find that

players do not benefit from playing at home (a “home-court advantage”) in all situations, despite the

increased pressure and presumed beneficial effects on performance that arise from performing in front

of a supportive audience. Butler and Baumeister (1998) previously observed, through experimental

results, that the presence of supportive audiences can lead to declines in performance. Goldman and

Rao found that basketball players benefit from home crowds when grabbing rebounds, a high-effort task

that takes place in a fluid game state. But home players were negatively impacted when shooting free

throws in tense moments, as that task requires concentration that can lend itself to the detrimental self-focus observed in previous experimental work (Lewis and Linder, 1997; Baumeister and Steinhilber, 1984). Cao et al. (2011) also studied free-throw shooting in the NBA and found an average performance deterioration of 5-10 percent in the final moments of close games.

6 The study also relates the observed performance declines to loss-averse preferences, discussed in the next section.

A study of field goal kick conversions in professional football (Clark et al., 2013) found no

significant effect of psychological factors, while Apesteguia and Palacios-Huerta (2010) observed a

negative impact of psychological pressure on penalty kicks in soccer. They observed a higher-than-expected winning percentage among soccer teams shooting first in the penalty kick rounds. Their empirical results matched survey data collected from professional players, who overwhelmingly stated a preference to kick first to place the mental pressure on the other team. Baseball at-bats, the game action studied in this paper, are quite similar to such concentration-heavy tasks, which provides some context for the effects observed in high-pressure game situations in the MLB.

2.2 Loss Aversion

Closer performance also provides a template for testing for evidence of loss aversion in a field

environment. Loss aversion is a tenet of prospect theory, a foundational set of behavioral economic

concepts developed by Kahneman and Tversky (1979). Behavioral economics incorporates psychological

influences and heuristic biases, or rules of thumb, into the rational decision-making process, thereby

diverging from traditional neoclassical theory in which rational agents maximize their individual utilities.

Prospect theory maintains that agents’ choices depend not just on total utility, but also on reference points against which decisions, and the potential subsequent gains or losses, are compared on a relative scale. Departures from reference points are not weighed equally, and this idea that “losses loom larger

than corresponding gains” (Kahneman and Tversky, 1991) is formalized in the prospect theory value

function:

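$$
v(x) = \begin{cases} x^{\beta} & \text{if } x \ge 0 \\ -\lambda\,|x|^{\beta} & \text{if } x < 0 \end{cases} \tag{1}
$$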

The utility of a gain of amount x is less than the disutility of a loss of –x. The discrepancy

depends on lambda, a measure of an individual’s degree of loss aversion found experimentally to be

approximately 2.25 (Kahneman and Tversky, 1992). Figure 2 depicts a typical value function diagram

with the reference point located at the origin. The function is steeper in the loss domain (x<0) than in

the gain domain (x>0), which indicates a higher marginal benefit of effort in the loss domain. The beta

exponent term, found to have a median value of 0.88 in the same set of experiments, gives the value

function its “S-shape.” The resulting convexity of the function in the loss domain and concavity in the

gain domain relate to the concept of diminishing sensitivity, which suggests that the impact of a gain or

a loss lessens as an agent moves further away from the reference point. In such a situation, facing a loss

of $110 instead of $100 is less painful than facing a $20 loss after expecting a $10 loss. The authors also

found experimental evidence of varying risk attitudes in the two domains: agents tend to be risk-seeking

when faced with potential losses and risk-averse when faced with potential gains.
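The two properties described above, loss aversion and diminishing sensitivity, can be checked numerically. The sketch below assumes the common power form of the value function with a single exponent in both domains and uses the parameter values cited in the text (lambda of 2.25, beta of 0.88); the function and variable names are illustrative.

```python
LAMBDA = 2.25  # loss-aversion coefficient cited in the text
BETA = 0.88    # median curvature exponent cited in the text


def value(x: float, lam: float = LAMBDA, beta: float = BETA) -> float:
    """Prospect theory value of an outcome x measured from the reference point."""
    if x >= 0:
        return x ** beta
    return -lam * (abs(x) ** beta)


gain = value(100.0)
loss = value(-100.0)
# Loss aversion: a loss weighs lam times as much as a gain of the same size.
ratio = abs(loss) / gain

# Diminishing sensitivity: the same $10 step matters less far from the reference point.
far_step = value(-100.0) - value(-110.0)   # moving from a $100 loss to a $110 loss
near_step = value(-10.0) - value(-20.0)    # moving from a $10 loss to a $20 loss
```

With these parameters `ratio` equals lambda, and `far_step` is smaller than `near_step`, matching the $110 versus $100 comparison in the paragraph above.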

Extensive contributions to prospect theory literature have been made through both

experiments and fieldwork (Kahneman et al., 1990; List, 2003; List, 2004; Fryer et al., 2012; Levitt et al.,

2012). The studies of Thaler (1999) and Haigh and List (2005) investigate myopic loss aversion, a theory

that incorporates the narrow bracketing of decisions examined in mental accounting literature. The

impact of goals, expectations and endogenous changes on reference points has been studied

theoretically (Köszegi and Rabin, 2006), experimentally (Heath et al., 1999), and in relation to effort

provision in labor markets (Camerer et al., 1997; Farber, 2005; Fehr and Goette, 2007; Abeler et al.,

2011).

The loss aversion framework discussed in Section 4.1 incorporates the ideas of Pope and

Schweitzer (2011), who found a significant difference between accuracy on comparable par and birdie

putts by golfers on the PGA Tour. This difference was attributed to loss-averse preferences, in which

golfers exerted more effort to avoid falling below the reference point of par on individual holes, an


irrational behavioral bias since all strokes are equally valued in the final score. This effect diminished in

later rounds of tournaments, as the par reference point lost salience and the aggregate tournament

score gained importance. Berger and Pope (2011) and Goldman and Rao (2013) both find significant

effects of score margins on basketball teams. The former study found that college and professional

teams losing by small margins at halftime were more likely to ultimately win the game, and the latter

found an improvement in shot-making efficiency among trailing teams.

Several aspects of baseball have also been used to study reference-dependent preferences.

Moskowitz and Wertheim (2011) found evidence of loss aversion within MLB at-bats through the use of

Pitch f/x data, a tool for graphical analysis of pitch-by-pitch data7. The authors claimed that both

pitchers and batters adjusted their risk-seeking behaviors depending on the count of balls and strikes

within individual at-bats. Pope and Simonsohn (2011) found evidence in MLB regular season batting

statistics that supported the use of round numbers as reference points, concluding that significantly

more batters finished regular seasons with averages of .300 than with averages of .299. Pedace and

Smith (2012) observed loss-averse tendencies in baseball managerial decisions, finding that general

managers were more likely to retain poor performers in whom they had originally invested.

2.3 Baseball

A complementary goal of this study is to investigate the effectiveness of common closer usage.

Closers have been used in a confined role, mostly restricted to one-inning outings in save situations, for

about the last 20 years. An incongruous trend has developed, as this increased role specification, which

leads to fewer innings pitched, has been coupled with increasing salary levels. This raises questions of

labor allocation efficiency. Previous research has delved into such issues: Jazayerli (2000) finds that

closers have the second-most potential impact in tie games (an NSS in which they are thus less frequently used), while the protection of a three-run lead in the final inning (an SS) has less impact on the game’s outcome than achieving a scoreless first inning. Wyers (2012) finds that the fixation of managers on preserving closers for SS game states in the final inning often backfires; it yields too few opportunities for closers to pitch, and thus relegates the highly skilled relievers to waste labor on low-leverage situations after stretches of inactivity.

7 See Brooks Baseball (Brooks) for examples of the Pitch f/x tool.

Despite these drawbacks, the current system may generate non-obvious benefits that lead to increased performance, and this is the area to which this study contributes. The player quotes cited in the introduction suggest that closers attach particular importance to save situations; if a closer is not as personally motivated to perform in tied game situations, his incentives are misaligned in a way that

may detract from team success. The potential benefits of the well-defined role of closers have also

been considered in previous literature (Tango et al., 2006; Carleton, 2012a). This paper’s analysis sheds

empirical light on the effects of game states on the group of closers studied.

3. Data

To analyze the effects of game situations on closers, this study examines data at the plate

appearance level. A plate appearance8 is an interaction that involves a pitcher and a batter and occurs at

least six times every inning. Since one of the goals of the paper is to examine the effect of the SS game

state, the pitcher pool in a given season was limited to those who had been closers in a significant

capacity at some point during the season. The sample of pitchers includes only those who earned at

least five saves and finished at least ten games in the season. Games finished simply means the pitcher

was the last to pitch for his team in the given game. The data used to refine the sample were acquired from Thebaseballcube.com, a source of baseball statistics.
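The sample restriction described above amounts to a two-condition filter on pitcher-season totals. The sketch below assumes a simple record layout; the `PitcherSeason` type and the rows are hypothetical and do not reproduce the actual data.

```python
from typing import NamedTuple


class PitcherSeason(NamedTuple):
    name: str
    season: int
    saves: int
    games_finished: int


MIN_SAVES = 5            # threshold stated in the text
MIN_GAMES_FINISHED = 10  # threshold stated in the text


def is_sample_closer(ps: PitcherSeason) -> bool:
    """Keep pitcher-seasons with at least five saves and ten games finished."""
    return ps.saves >= MIN_SAVES and ps.games_finished >= MIN_GAMES_FINISHED


# Made-up rows for illustration only; not real season totals.
rows = [
    PitcherSeason("Pitcher A", 2011, 40, 60),
    PitcherSeason("Pitcher B", 2011, 4, 25),
    PitcherSeason("Pitcher C", 2000, 7, 9),
    PitcherSeason("Pitcher D", 2000, 5, 10),
]
sample = [ps.name for ps in rows if is_sample_closer(ps)]
```

Both thresholds are inclusive, so a pitcher with exactly five saves and ten games finished remains in the sample.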

8 Though the terms at-bats and plate appearances are used interchangeably throughout the text, it should be noted that the

data sample consists of all plate appearances involving the closers. Walks are included in the sample.


Two seasons were selected for inclusion in the study: 2000 and 2011. The latter was chosen

because it was the final completed season when the project was started, and the former was selected as

a counterbalance in terms of league-wide batting success. Offensive output has been trending

downward in the MLB in recent years, potentially because of the end of widespread steroid usage (Stark,

2012). The league-average On-Base Percentage (OBP), a measure of offensive success, was 0.321 in

2011, then the lowest in 23 seasons. The 2000 season, meanwhile, had a league-average OBP of 0.345,

tied for the highest mark in the previous 62 seasons. Though the model specification includes a control

for player skill, which incorporates these league-wide averages within seasons, this extra measure was

taken. The season choices also generated a varied sample of pitchers. The cutoffs specified above

yielded 48 pitchers in each season9. Only two pitchers, Jason Isringhausen and Mariano Rivera, were

featured in both sample seasons. Appendix Table A-1 contains descriptive data regarding all of the

pitchers in the sample.

Details about each plate appearance were acquired through the Play Index tool of Baseball-Reference.com. Specifically, the “Pitching Event Finder” sub-tool was used to isolate all plate

appearances involving each pitcher within the two sample seasons. In total, 26,223 plate appearances

were collected in this manner across the 96 pitcher-seasons. The Play Index output provided almost all

of the necessary information for the specifications described in Section 4, including opposing batter,

score, inning, outcome of the plate appearance, and the Leverage Index (LI).

The LI is a sabermetric statistic that plays a major role in this study’s specifications and analysis.

The statistic, as defined by its creator, Tom Tango (2006), is the “swing in the possible change in win

probability” of an at-bat. The measure depends on the score, the inning, the number of outs, and the

number and position of runners on base10. The LI is found through a summation of the probability of the

occurrence of each at-bat outcome multiplied by the corresponding change in win probability of the

team. LI is a standardized statistic, which provides valuable intuition: a typical at-bat has an LI of one, so it can be said that an at-bat with an LI of two is twice as important to the game outcome as a typical at-bat. The average LI of plate appearances in the ninth inning or later, the game period in which closers are primarily used, is 1.33, approximately 37 percent more important than the average event during the first eight innings (Wyers, 2012). The benefit of this statistic is that it quantifies the importance of game situations, which provides a way of measuring the stakes, and changes in implied pressure, faced by closers in individual plate appearances.

9 A 49th pitcher in the 2000 season, Jose Jimenez of the Colorado Rockies, was omitted because of difficulties obtaining data.

10 For tables that include Leverage Index statistics for every possible game situation, see Tango (2007).
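The construction of the Leverage Index described above can be illustrated with a short calculation. This is a sketch under stated assumptions, not Tango's published procedure: it takes the absolute change in win probability for each outcome, collapses the outcomes to two (out, batter reaches base), and uses hypothetical probabilities and a hypothetical league-average swing for the standardization.

```python
from typing import Sequence, Tuple

Outcome = Tuple[float, float]  # (probability of outcome, change in win probability)


def expected_swing(outcomes: Sequence[Outcome]) -> float:
    """Probability-weighted absolute change in win probability for one at-bat."""
    total_prob = sum(p for p, _ in outcomes)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to one")
    return sum(p * abs(delta_wp) for p, delta_wp in outcomes)


def leverage_index(outcomes: Sequence[Outcome], average_swing: float) -> float:
    """Swing of this situation relative to the swing of a typical situation."""
    return expected_swing(outcomes) / average_swing


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
AVERAGE_SWING = 0.035
one_run_lead_ninth = [(0.68, 0.04), (0.32, -0.16)]
blowout = [(0.68, 0.002), (0.32, -0.005)]

li_close = leverage_index(one_run_lead_ninth, AVERAGE_SWING)
li_blowout = leverage_index(blowout, AVERAGE_SWING)
```

Dividing by the average swing is what makes a typical at-bat score one, so the close-game example scores well above one and the blowout well below it.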

The contract statuses of the closers were one statistic not directly available through the Play

Index. These data points were collected from Cot’s Baseball Contracts, which is found on

baseballprospectus.com.

4. Empirical Methodology and Hypotheses

4.1 Loss Aversion

The uncontrolled statistical evidence and anecdotes about closers’ performance in SS and NSS,

cited in the introduction, do not suggest the presence of loss-averse preferences. While the studies cited

in Section 2.2 involve improvements in performance by players on trailing teams, closers excel in

pressure-packed situations that seem to place them in the gain domain, since their teams are ahead in

SS. But such a straightforward loss aversion model is difficult to apply to closers. Since they are

traditionally deployed to protect leads, it can be argued that SS are in the loss domain; the closer can, at

best, maintain the score margin to not lose ground, and, at worst, lose the lead for the team. In addition,

the gain domain of closers is not a true gain domain, because there is nothing non-preventative that can

be attained in an appearance. This identifies a downside of aggregating closer performance on the

appearance level, as opposed to the at-bat level used for analysis in subsequent sections. While Pope

and Schweitzer narrowly bracketed golfers’ effort selection, thus creating individual utility functions


weighing the marginal cost and benefit of effort on each putt, the aggregate nature of a closer’s

success or failure makes such utility maximization unsuitable for the scenarios studied.

However, the margins of team leads and deficits at which closers enter games do provide a

testable context for loss aversion. All else equal, a loss-averse group of agents would yield fewer runs

when the lead margin is at its narrowest; the benefit of a closer’s effort exertion would be highest when

he is attempting to prevent the tying run from scoring. Extending this idea, the cost of yielding a run

would be significantly smaller if the closer enters in a more relaxed SS state, such as when his team is

ahead by three runs. The benefit of effort exertion would be smaller, initially, and a loss-averse closer

may be expected to yield one or two runs more frequently in this scenario. Pitchers have acknowledged

such a mindset: Jose Valverde, a closer for the Detroit Tigers included in the sample, said, “If I give up a

couple of runs, it doesn’t bother me. I just want to get the save. As long as I get a save so my team wins,

it doesn’t matter” (Meisel, 2012). This statement, however, may also be interpreted as an indication of

rational maximization behavior, in which the only goal of the player is a team victory.

Scatter plots of earned runs allowed per appearance, by run margin at game entry, are shown in Figure 3. The 2,940 game appearances of the 48 closers from the 2011 season compose the sample points. The lowess11 regression curve in Figure 3a does not resemble an S-shaped value function that would arise under the behavioral bias discussed. There is no evident significant increase in runs allowed as the team’s lead increases. Figure 3b restricts the sample to appearances in which at least one run was allowed, which compose only 21 percent of the total sample. The slope of the fitted

curve again does not reveal any particular trends regarding the average number of runs yielded in the

different run margins of SS and NSS states.

Figure 4 plots the same data points, now with adjustments for observation frequency. The size of each point reflects its frequency in the sample. The largest clusters form in the SS region on the horizontal axis, where the run difference is +1 to +3, and at the zero runs allowed mark on the vertical axis. The three most frequent observations in the sample of 2011 closers are margins of one, two and three runs in the pitcher’s favor without any earned runs allowed in the appearance. There are 1,443 of these points combined, composing 49 percent of the total data set. Such a distribution hints at several themes already discussed, such as the tendency to use closers in SS states and the generally high skill level of this group of pitchers.

11 Locally weighted scatterplot smoothing is a form of nonparametric regression.

4.2 Primary Specification

The primary goal in this study was to test whether closers’ improved performance in SS persists

after appropriate controls are included, or whether other situational variables affected performance

more significantly. A logistic regression with a binary dependent variable was applied to the data, similar

to the specifications used in the Clark et al. (2013) study of field goals in football and the baseball

studies of Carleton (2009). Specifically, a binary logit regression was used, which incorporates a

dependent variable that represents a proportion of “successes” between zero and one in the sample

data. The dependent variable takes on the value below, with the variable p representing probability of

success. The logit function takes the log of the odds of the “success” outcome, as evident in (2). This is

the reason that the logit function is also known as “log-odds.”

$$
\operatorname{logit}(p) = \ln\left[\frac{p}{1-p}\right] \tag{2}
$$

A generalized logit regression is presented below, with k independent variables and pi taking on

the value shown in (4). The logit model estimates the probability of the dependent variable taking on

the value of one, or, in other words, of a “success” occurring. The model uses the cumulative standard

logistic distribution, represented by F in (5).

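$$
\operatorname{logit}(p_i) = \ln\left[\frac{p_i}{1-p_i}\right] = \beta_0 + \beta_1 x_{1,i} + \cdots + \beta_k x_{k,i} \tag{3}
$$

$$
p_i = P(Y_i = 1 \mid X_i) = F(\beta_0 + \beta_1 x_{1,i} + \cdots + \beta_k x_{k,i}) \tag{4}
$$

$$
F(z) = \frac{e^{z}}{1 + e^{z}} \tag{5}
$$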

For the main effect independent variables in such a specification12, the coefficients of the logit

model can be intuitively interpreted through exponentiation with base e. This calculation provides odds

ratios that more clearly describe the relationship between the variables of interest and the binary

outcome variable. In this study, the outcome variable is OnBase, which takes on two values at the

conclusion of each at-bat in the sample.

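$$
\text{OnBase} = \begin{cases} 1 & \text{if the batter reaches base} \\ 0 & \text{otherwise} \end{cases} \tag{6}
$$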
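The interpretation of logit coefficients through exponentiation can be made concrete with a short calculation. The sketch below uses hypothetical coefficient values, not estimates from this study, and shows that a one-unit change in a regressor multiplies the odds of OnBase = 1 by e raised to the coefficient.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def logistic_cdf(z: float) -> float:
    """Cumulative standard logistic distribution, the inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))


def odds_ratio(coefficient: float) -> float:
    """Multiplicative change in the odds for a one-unit increase in a regressor."""
    return math.exp(coefficient)


# Hypothetical values for illustration only.
beta_0 = logit(0.32)   # baseline: 32 percent chance that OnBase = 1
beta_1 = -0.13         # coefficient on a binary regressor

p_baseline = logistic_cdf(beta_0)
p_treated = logistic_cdf(beta_0 + beta_1)
odds_baseline = p_baseline / (1.0 - p_baseline)
odds_treated = p_treated / (1.0 - p_treated)
```

The ratio `odds_treated / odds_baseline` equals `odds_ratio(beta_1)` regardless of the baseline, which is why odds ratios are a convenient summary for main effects. The change in probability itself does depend on the baseline.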

This outcome variable was selected to allow for analysis at the at-bat level, which is amenable to

the inclusion of key controls and more precise observation of the situational impact. Note, however,

that the outcome variable in the introductory discussion was aggregate Earned Run Average (ERA). To

assuage potential concerns regarding the switch to at-bats, Table 1 shows the mean OnBase

percentage13 yielded by pitchers in the sample in SS and in NSS. The pattern seen in ERA holds; pitchers

are significantly less likely to allow baserunners in SS than in NSS, t(26221) = 3.88, p<0.001.

12 The intercept and interaction variables cannot be interpreted directly, which is discussed further in Section 5.

13 This percentage should not be confused with the previously cited statistic OBP (On-base percentage).

The primary variable of interest, SaveSituation, is also binary, and it can take on the two potential

values in (7). The naïve regression incorporating the primary variable of interest is expressed in (8),

where F is again the cumulative logistic distribution function.

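$$
\text{SaveSituation} = \begin{cases} 1 & \text{if the plate appearance occurs in an SS} \\ 0 & \text{if the plate appearance occurs in an NSS} \end{cases} \tag{7}
$$

$$
P(\text{OnBase} = 1 \mid \text{SaveSituation}) = F\big(\beta_0 + \beta_1(\text{SaveSituation})\big) \tag{8}
$$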

This naïve specification can be considered a representation of the evidence previously used to

remark on the disparity in closers’ performance, decomposed to an at-bat level. A negative β1 coefficient

yielded by (8) would suggest that the chance of a closer allowing someone to reach base decreases in a

SS, while a positive coefficient would suggest the opposite.

The complete specification is shown in (9). The subscripts i,p,s represent at-bat i involving

pitcher p that occurred in season s, and X is the vector of covariates. The two terms preceding the error

term are fixed effects for pitchers and seasons, respectively. Standard errors are robust and clustered by

individual pitchers.

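\[
\begin{aligned}
P(\text{OnBase}_{i,p,s} = 1 \mid X_{i,p,s}) = F\big(&\beta_0 + \beta_1(\text{SaveSituation}_{i,p,s}) + \beta_2(\text{LeverageIndex}_{i,p,s}) \\
&+ \beta_3(\text{RunDiff}_{i,p,s}) + \beta_4(\text{Home}_{i,p,s}) + \beta_5(\text{Contract}_{i,p,s}) \\
&+ \beta_6[\ln(\text{ExpectedOddsRatio}_{i,p,s})] + \alpha_p + \gamma_s + \varepsilon_{i,p,s}\big)
\end{aligned}
\tag{9}
\]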

LeverageIndex is the plate appearance Leverage Index value, which, as discussed previously,

conveys the implicit pressure in an at-bat through its potential effect on the ultimate game outcome.

RunDiff, or run differential, measures the difference in team scores at each at-bat from the perspective

of the pitcher’s team. The next two terms, Home and Contract, are binary variables taking on the values

shown in (10) and (11). The inclusion of Home was motivated by the findings of Goldman and Rao (2012)

discussed in Section 2.1, as well as the work of Moskowitz and Wertheim (2011), who found an away-

pitcher disadvantage linking increases in situational Leverage Index to decreases in the rate of strike

calls made by umpires during home at-bats. Contract was included to provide insight into the ambiguous

effect of “contract years,” or the final year of a player contract before he reaches free agency, on player

performance.14

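\[
\text{Home}_i =
\begin{cases}
1 & \text{if the pitcher's team is the home team in at-bat } i \\
0 & \text{otherwise}
\end{cases}
\tag{10}
\]

\[
\text{Contract}_i =
\begin{cases}
1 & \text{if the pitcher is in the final year of his contract} \\
0 & \text{otherwise}
\end{cases}
\tag{11}
\]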

The final variable, ExpectedOddsRatio, is critical. It controls for batter and pitcher skill by

generating an expected outcome for each at-bat using the season statistics of the players involved. This

control method is adapted from Carleton (2009). The variable is constructed by creating odds ratios (OR)

for both players using season On-Base Percentages (OBP)15. The statistic used for pitchers is OBP against,

which measures the average success rate of the opponent batters faced. OBP is already a probability

between 0 and 1, so the odds ratio formula is applied directly, as shown in (12).

\[
\text{OR} = \frac{\text{OBP}}{1 - \text{OBP}} \tag{12}
\]

14 Van-Riper (2010) suggests no impact of contract years on batter performance, while Huckabay (2003) found a significant difference in performance among batters but no significant difference in performance among pitchers.

15 The equation for OBP is (hits + walks + hit by pitches) / (at-bats + walks + hit by pitches + sacrifice flies). The statistic is more comprehensive than batting average (BA), which is simply hits/at-bats.

Combining the batter, pitcher, and league average odds ratios16 yields (13), and solving for the

variable of interest yields (14). Note that the natural log of the result is then taken before applying to

the binary logistic model in (9). This variable controls for player ability, which allows for more precise

analysis of the variables of interest. As discussed in Carleton (2009), the coefficient on this control term should

be approximately equal to one.

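\[
\frac{\text{OR}_{\text{expected}}}{\text{OR}_{\text{league}}} = \left(\frac{\text{OR}_{\text{batter}}}{\text{OR}_{\text{league}}}\right)\left(\frac{\text{OR}_{\text{pitcher}}}{\text{OR}_{\text{league}}}\right) \tag{13}
\]

\[
\text{ExpectedOddsRatio} = \text{OR}_{\text{expected}} = \frac{\text{OR}_{\text{batter}} \times \text{OR}_{\text{pitcher}}}{\text{OR}_{\text{league}}} \tag{14}
\]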
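The construction of the matchup control can be sketched in a few lines of Python. The sketch assumes the standard multiplicative odds-ratio form, in which the batter and pitcher odds ratios are each scaled by the league average; the function names and the OBP inputs are hypothetical and serve only to illustrate the calculation.

```python
import math


def odds(p: float) -> float:
    """Convert a probability such as OBP into odds, p / (1 - p)."""
    return p / (1.0 - p)


def expected_odds_ratio(batter_obp: float, pitcher_obp_against: float, league_obp: float) -> float:
    """Expected odds of the batter reaching base in a given matchup.

    Assumes the multiplicative odds-ratio form: batter odds times pitcher
    odds, divided by league-average odds.
    """
    return odds(batter_obp) * odds(pitcher_obp_against) / odds(league_obp)


def log_expected_odds_ratio(batter_obp: float, pitcher_obp_against: float, league_obp: float) -> float:
    """Natural log of the expected odds ratio, the form entered into the logit model."""
    return math.log(expected_odds_ratio(batter_obp, pitcher_obp_against, league_obp))


if __name__ == "__main__":
    # Hypothetical season OBP values, not drawn from the sample.
    print(expected_odds_ratio(0.350, 0.300, 0.330))
    print(log_expected_odds_ratio(0.350, 0.300, 0.330))
```

One property worth noting: when both players are exactly league average, the expected odds collapse to the league-average odds, so the control moves away from that baseline only when at least one of the two players departs from it.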

Pitcher fixed effects are included to account for individual, time-invariant pitcher characteristics

that may otherwise affect the results but get overlooked on the at-bat scale. Season fixed effects are

included to represent league-wide changes in hitting and pitching performance.17

4.3 Interaction Specification

Another purpose of this study is to investigate the effect of other situational variables on the

variation in closers’ performance levels in their two overarching game states, SS and NSS. To study this

topic, the specification in (9) is expanded to include interaction terms, as seen in (15).

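\[
\begin{aligned}
P(\text{OnBase}_{i,p,s} = 1 \mid X_{i,p,s}) = F\big(&\beta_0 + \beta_1(\text{SS}) + \beta_2(\text{LI}) + \beta_3(\text{RunDiff}) + \beta_4(\text{Home}) + \beta_5(\text{Contract}) \\
&+ \beta_6(\text{LI} \times \text{SS}) + \beta_7(\text{RunDiff} \times \text{SS}) + \beta_8(\text{Home} \times \text{SS}) + \beta_9(\text{Contract} \times \text{SS}) \\
&+ \beta_{10}[\ln(\text{ExpectedOddsRatio})] + \alpha_p + \gamma_s + \varepsilon_{i,p,s}\big)
\end{aligned}
\tag{15}
\]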

16 The league average OR uses the average OBP of all hitters in the sample season. The league average OBP against for pitchers is identical.

17 The average number of runs per game in 2011, the final year of the sample, was the lowest since the 1992 season.

Four independent variables are interacted with SS to study the potential effects on performance

of these joint situations. There is one interaction between a continuous and a binary variable (LI*SS),

one interaction between a categorical and a binary variable (RunDiff*SS), and two interactions between

two binary variables (Home*SS, Contract*SS). One noteworthy change is the new restriction on RunDiff

value in the interaction term; by definition, the only potential run differences in an SS are one, two and

three. The main effect variables that correspond to the interacted terms now represent the effects of

the respective variables in the NSS state.

4.4 Hypotheses

The primary coefficient of interest in the main specification is β1, the coefficient of the SS

variable. A straightforward Wald test of significance on this coefficient is the first hypothesis tested, with

a null hypothesis of β1 =0. The uncontrolled data would suggest that β1 < 0, meaning that pitchers allow

fewer baserunners in SS than in NSS. Similar Wald tests for significant differences from zero are

calculated for all of the regressor coefficients in (9). For the interacted specification, the focus is on the

coefficients of the four interacted terms and their main effects companions. The significance levels of

these variables can provide insight regarding their varying effects by game state. Section 5 also contains

discussions of additional specifications that contribute to the analysis, and these incorporate other

significance tests that are discussed when appropriate.

5. Results

5.1 Summary Statistics

Table 2 reports summary statistics for the variables used in both the primary and interaction

specifications. Table 3 provides frequencies and means of key variables in the SS and NSS game states.

The latter table provides a clearer picture of the differences in the two game states; for instance, it

presents accurate mean values for the interacted terms that do not include the zeroes accumulated

through NSS sample at-bats.

The mean Leverage Index cited in Table 3 is of particular interest. The mean LI for closers has

previously been reported to be 1.8 (Cameron, 2010), and this study’s sample mean is quite similar with a

rate of 1.7 for SS and NSS combined. The data suggests that in SS, the average plate appearance is about

138 percent more important to the game’s eventual outcome than an average plate appearance.18

Figure 5 adds further clarity to the increased pressure experienced by closers in the sample, as it

compares the percentage of at-bats closers face in subdivisions of LI to the typical breakdown of at-bat

importance in a full game (Appelman, 2008). About a third of all at-bats in the sample are more than

twice as important to the game outcome as a typical at-bat (LI ≥2). This breakdown suggests that though

closers are not substituted in games based on measures of LI, the SS acts as an imperfect proxy that still

tends to place the best relief pitchers in the most important game situations. Figure 6 depicts kernel

density plots for the LI distribution in SS and NSS states within the sample plate appearances. The

distribution is more positively skewed for NSS, which indicates that closers tend to face higher LI levels

in SS.

Figures 7 and 8 incorporate run margin into the discussion of the sample data. The distribution

of plate appearances by run difference levels shows that the four most common margins in the sample

are tied games and plate appearances in the +1 to +3 range that constitute SS. The vertical axis in Figure 8

represents the plate appearance Leverage Index, which shows the expected decreases in LI as the

magnitude of run margin increases in both the positive and negative directions.

18 Part of this wide discrepancy is endogenous to the definition of Leverage Index, since it depends on the inning in which the at-bat took place, among other factors. This fact is discussed in Section 7.

The pairwise correlation coefficients of potentially related variables are reported in Table 4.

Many of the variables indeed exhibit a significant (p < 0.05) correlation. Among the correlations

between the outcome variable and regressors, both SS and RunDiff have a significant negative

correlation with OnBase, which partly confirms previously discussed hypotheses. In addition, there are

significant positive correlations between LI and SS, RunDiff and SS, and RunDiff and LI. These correlations

raise multicollinearity concerns, though these are largely eased in the robustness discussion of Section 7.

5.2 Primary Specification Results

Table 5 presents the regression results of the naïve specification (8) and of a specification that

only adds the control for pitcher-batter matchup. Though the naïve specification, as expected, suggests

significant omitted variable bias through its significant constant term (p<0.001), Columns (3) and (4)

show a controlled significant effect of SS on plate appearance outcome. The results suggest a significant

negative effect of the SS game state on the probability of a pitcher yielding a baserunner, as suggested

by the results reported in Table 1. The binary logit specification allows for intuitive interpretation of the

effect’s magnitude, both in this simple pair of regressions and in the full specification. The coefficients

are exponentiated with base e (for any coefficient βn, the odds ratio is e^βn), and this transformation allows the

regressor’s effect to be interpreted as a percentage or factor change. The odds ratio in Column (4)

suggests that a pitcher is 8.7 percent (1-0.913) more likely to get a batter out in an SS than in an NSS,

controlling for the skill levels of both players.
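The conversion from a logit coefficient to an odds ratio and a percentage change can be made concrete with a short Python sketch. The odds ratios 0.913 and 0.883 are the values reported in the text; the baseline on-base probability of 0.33 is a hypothetical figure used only to show how a change in odds translates into a change in probability, which is a related but distinct quantity.

```python
import math


def odds_ratio(coef: float) -> float:
    """Exponentiate a logit coefficient with base e to obtain an odds ratio."""
    return math.exp(coef)


def percent_change_in_odds(coef: float) -> float:
    """Percentage change in the odds of the outcome for a one-unit increase."""
    return (odds_ratio(coef) - 1.0) * 100.0


def apply_odds_ratio(baseline_probability: float, ratio: float) -> float:
    """Probability implied by multiplying the baseline odds by an odds ratio."""
    new_odds = baseline_probability / (1.0 - baseline_probability) * ratio
    return new_odds / (1.0 + new_odds)


if __name__ == "__main__":
    for reported_or in (0.913, 0.883):  # odds ratios reported in the text
        coef = math.log(reported_or)    # coefficient implied by the odds ratio
        print(reported_or, round(percent_change_in_odds(coef), 1))
    # Hypothetical baseline on-base probability of 0.33 in an NSS.
    print(apply_odds_ratio(0.33, 0.913))
```

Because the odds ratio acts on odds rather than on probabilities, the implied change in the probability of reaching base depends on the baseline; at a baseline near one third it is smaller in absolute terms than the percentage change in odds.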

The results of the full specification of (9), along with several variations, are presented in Table 6.

The output is organized into four pairs, the left column of each showing the binary logit coefficient

output and the right column showing the odds ratio coefficients. The most striking result of the full

specification is the negative and significant (p<0.01) coefficient on SS. The effect observed in the naïve

results persisted and increased in magnitude after the addition of covariates that may have affected the

observed effect of the SS game state on at-bat outcome. By switching from an NSS state to an SS state, a

pitcher is 11.7 percent (1-0.883) more likely to get the batter out, all else equal.

LI has a positive and significant (at the 10 percent level) coefficient that indicates the effect of

increases in situational game pressure on at-bat outcome. The data suggests that for a unit increase in

Leverage Index, the chance of a pitcher allowing a baserunner increases; specifically, column (2) shows

that the magnitude of this increase is approximately 1.7 percent. Columns (3) and (4) present a

specification that omits the LI variable, as it had a significant and positive pairwise correlation with SS in

Table 4. The SS coefficient remains significant and negative, though the magnitude of the effect

decreases slightly. The Home and Contract coefficients, meanwhile, become significant at the 10 percent

level with opposite signs in this specification. The data suggests that pitching at home results in a 5.3

percent decrease in the chance of yielding a baserunner, while pitchers in their contract years suffer a

2.7 percent increase. In Columns (5) and (6), which report results of a specification that omits the SS

variable, the only significant result19 is a similarly deleterious effect of Contract on at-bat outcome, from

the pitcher’s perspective.

The final pair of columns presents a specification that includes fixed effects for each margin of

run difference. This eliminates all significant effects of the independent variables of interest. The

run difference coefficients (not reported) vary widely in sign, magnitude, and

significance. The individual pitcher fixed effects, which were included in each of the four specifications,

are also not reported. These individual fixed effects yielded coefficients with varying signs, as well as

wide-ranging magnitudes and significances, and the heterogeneity among individual pitchers is explored

further in Section 6. The low pseudo R² values for these binary logit specifications are not a concern;

logistic regression models tend to have low pseudo R² (Lunt, 2012). The robustness discussion in Section

7 reports goodness-of-fit test results that support the use of this logistic regression specification.

19 The coefficient on ln(ExpectedOddsRatio) is also significantly different from zero (p<0.001) in each of the specifications, but it is a control variable. As expected, the coefficient on the control term is approximately equal to one in the logit model.

A potential drawback of logit specifications is the misinterpretation of coefficients caused by

changing derivatives along the logit curve, particularly at its non-linear, extreme values. To test this

possibility and confirm the consistency of the results discussed, marginal effects were measured20 at the

mean values of the regressors, as well as at the means of four different subsections of the data. The

coefficients represent the instantaneous rate of change of the outcome variable, as expressed in (16),

where xk represents independent variable k. The results, shown in Table 7, remain consistent both at the

mean values of all variables and within the limited samples of Columns (2)-(6). This suggests no

significant changes in observed effects caused by the use of a logit specification. Apart from the negative

effect of the SS state on at-bat outcome, it is worth noting the persistent and significant positive

marginal effect of the contract variable.

\[
\frac{\partial P(\text{OnBase} = 1 \mid X)}{\partial x_k} \tag{16}
\]
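For a logit model, the instantaneous rate of change with respect to a regressor x_k has the closed form β_k F(z)[1 − F(z)], where z is the linear index evaluated at the chosen point (for example, at the regressor means). The Python sketch below evaluates this expression and checks it against a numerical derivative. The coefficient and index values are hypothetical, and the sketch is not a reproduction of the Stata routine used in the study.

```python
import math


def logistic_cdf(z: float) -> float:
    """Cumulative standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))


def logit_marginal_effect(beta_k: float, index: float) -> float:
    """Analytic marginal effect of x_k: beta_k * F(z) * (1 - F(z))."""
    p = logistic_cdf(index)
    return beta_k * p * (1.0 - p)


def finite_difference(beta_k: float, index: float, step: float = 1e-6) -> float:
    """Numerical derivative of the predicted probability with respect to x_k.

    Moving x_k by `step` moves the linear index by beta_k * step.
    """
    upper = logistic_cdf(index + beta_k * step / 2.0)
    lower = logistic_cdf(index - beta_k * step / 2.0)
    return (upper - lower) / step


if __name__ == "__main__":
    beta_ss = -0.12       # hypothetical coefficient
    index_at_mean = -0.7  # hypothetical linear index at the regressor means
    print(logit_marginal_effect(beta_ss, index_at_mean))
    print(finite_difference(beta_ss, index_at_mean))
```

The factor F(z)[1 − F(z)] reaches its maximum of 0.25 at z = 0 and shrinks toward the tails, which is why marginal effects are evaluated at several points of the data rather than read directly from the coefficients.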

Overall, the regression results stemming from the primary specification suggest a significant,

beneficial effect of the SS game state on the performance of closers. Closers are significantly more likely,

on average, to get the opposing batter out in matchups in the SS state than in the NSS state.

5.3 Interaction Specification Results

The discussion now turns to potential interaction effects between the covariates and these two

game states. Table 8 shows the regression output of (15) and of several variations of the model.

The results in column (1) suggest that LI and LI*SS are the two variables with coefficients

significantly different from zero. The data suggests that for a one-unit increase in Leverage Index in an

NSS, the chance of a pitcher allowing a batter to reach base increases. Specifically, column (2) shows

that for a one-unit increase in LI in an NSS, the odds of a pitcher allowing a baserunner increase by a

factor of 1.038. In percentage change terms, a batter is 3.8 percent more likely to successfully reach

base with a unit increase of LI in an NSS. As the situational pressure rises in an NSS, the batter is more

likely to succeed.

20 The function dlogit2 was used in Stata to generate the marginal effects (Sribney, 1996).

The LI*SS coefficient is also significant, but it must be interpreted with caution. Interaction

effects in binary logit specifications present a challenge in proper interpretation. The multiplicative

effect of the variable, as termed by Buis (2010), is discussed first, as these can be observed directly from

the column (2) odds ratio results. The effect of a one-unit increase in LI in an SS is 0.94 times the effect

of a one-unit increase in LI in an NSS. For further analysis, however, regarding the interaction variables’

statistical significance and marginal effects, additional calculations must be reported. As discussed in

Norton et al. (2003), the interaction effect of non-linear models can vary in magnitude, significance, and

sign (positive or negative) across observations, so the reported coefficients in Table 8 do not equal the

true marginal effects of the interactions21. Norton et al. (2004) presents a solution for this problem with

a Stata command, inteff, which correctly calculates the mean magnitudes of interaction effects in the

logit model with appropriate signs and significance levels.

Table 9, and the accompanying Figures 9 and 10, provide an accurate portrayal of the marginal

effect of a one-unit increase in LI in an SS state22. The mean marginal effect is -0.013, and the effect is

significantly different from zero (mean z=-8.46). In addition, as seen in Figure 9, no complications arise

from varying signs of the interaction effect across the sample data. The effect of the increase in

situational pressure conveyed through LI in the SS state is negative across all of the observed outcomes.

21 Precisely, the interaction effect in a non-linear model equals the cross partial derivative of the expected value of the dependent variable with respect to the two interacted variables, not the partial derivative of only the interacted term. Refer to Norton et al. (2004) for a complete discussion.

22 The proper mean interaction effects for the insignificant interaction variables are presented in Appendix Table A-2.

The significance and uniformity of the sign allows conclusions to be drawn from the reported magnitude.

The data suggests that the chance of a pitcher allowing a baserunner decreases by a factor of 0.987

(e^-0.013) in an SS relative to an NSS. In percentage change terms, a pitcher is 1.3 percent (1-0.987) less

likely to yield a baserunner in an SS. Therefore, the overall effect of a one-unit increase in LI in an SS is a

2.5 percent increase in probability of the opposing batter reaching base. When compared to the 3.8

percent increase in probability of batter success in an NSS, it is evident that pitchers perform better

under rising situational pressure when it occurs in an SS game state. This result supports the main

effects discussed in Section 5.2.

The variation specifications presented in Table 8 present several noteworthy results. The

exclusion of all terms including LI in Columns (3) and (4) does not lead to significance in the other

interacted variables, while the exclusion of the RunDiff terms in Columns (5) and (6) does not lead to

insignificance in the LI terms.

Figures 10 and 11 graphically represent the disparate effects of LI in the NSS and SS states. The

vertical axes show the average on-base proportion yielded in plate appearances within bounds of

Leverage Index. The mean LI within each range (the ranges were [-0.5,4] in intervals of 0.5 for Figure 10 and

[0,4] in intervals of 1 for Figure 11) was calculated and is used as the precise x-coordinate for each

data point. The first graph of each figure shows a linear regression and the second shows a lowess

curve fit to the data. The figures reinforce the primary finding in this regression specification: they

present an evident positive relationship between increases in LI and on-base proportion in NSS, but a

relative negative effect of increases in LI in the SS game state on the probability of reaching base.
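The binning procedure behind these figures can be sketched as follows. The Python function groups plate appearances into fixed-width LI ranges and returns, for each range, the mean LI (the x-coordinate) and the on-base proportion (the y-coordinate). The function name and the toy observations are hypothetical; the study's own figures were produced from the full sample, separately for SS and NSS.

```python
from collections import defaultdict


def binned_on_base(plate_appearances, width):
    """Summarize plate appearances by Leverage Index range.

    plate_appearances: iterable of (li, on_base) pairs, with on_base in {0, 1}.
    width: width of each LI range (for example 0.5 or 1).
    Returns a list of (mean LI in range, on-base proportion in range),
    ordered from the lowest LI range to the highest.
    """
    bins = defaultdict(list)
    for li, on_base in plate_appearances:
        bins[int(li // width)].append((li, on_base))

    points = []
    for key in sorted(bins):
        rows = bins[key]
        mean_li = sum(li for li, _ in rows) / len(rows)
        proportion = sum(on_base for _, on_base in rows) / len(rows)
        points.append((mean_li, proportion))
    return points


if __name__ == "__main__":
    # Toy observations, invented for illustration only.
    sample = [(0.2, 0), (0.4, 1), (0.7, 1), (1.4, 0)]
    for mean_li, proportion in binned_on_base(sample, 0.5):
        print(mean_li, proportion)
```

Using the within-range mean of LI rather than the range midpoint places each point where the observations actually lie, which matters when plate appearances are unevenly distributed within a range.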

Table 10 presents the marginal effects of the interacted specification at the means and within

the restricted samples also used in Table 7. Again, there are no apparent issues with the logit

specification, as the LI and LI*SS coefficients remain almost identical across the specifications. The

marginal effects of the other variables, which previously had insignificant coefficients, remain

insignificantly different from zero.

5.4 Alternate Specifications

Linear regression specifications were also applied to both the primary and interaction models to

observe if the significant effects persisted. The specifications for both linear models are presented

below.

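\[
\begin{aligned}
\text{OnBase}_{i,p,s} = {}& \beta_0 + \beta_1(\text{SS}) + \beta_2(\text{LI}) + \beta_3(\text{RunDiff}) + \beta_4(\text{Home}) + \beta_5(\text{Contract}) \\
&+ \beta_6[\ln(\text{ExpectedOddsRatio})] + \alpha_p + \gamma_s + \varepsilon_{i,p,s}
\end{aligned}
\tag{17}
\]

\[
\begin{aligned}
\text{OnBase}_{i,p,s} = {}& \beta_0 + \beta_1(\text{SS}) + \beta_2(\text{LI}) + \beta_3(\text{RunDiff}) + \beta_4(\text{Home}) + \beta_5(\text{Contract}) \\
&+ \beta_6(\text{LI} \times \text{SS}) + \beta_7(\text{RunDiff} \times \text{SS}) + \beta_8(\text{Home} \times \text{SS}) + \beta_9(\text{Contract} \times \text{SS}) \\
&+ \beta_{10}[\ln(\text{ExpectedOddsRatio})] + \alpha_p + \gamma_s + \varepsilon_{i,p,s}
\end{aligned}
\tag{18}
\]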

Pitcher and season fixed effects are included in both linear regressions. The regression results

are reported in Tables 11 and 12, respectively. In the primary specification, the SS coefficient maintains

its significance and sign, though the coefficient is smaller in magnitude. A pitcher is about 2.6 percent

more likely to get a batter out in an SS state than in an NSS state, according to the full linear

specification. For the specification variations in Columns (2) and (3) of Table 11, this significance and

magnitude remain similar despite the omission of the LI and RunDiff variables, respectively. As in the

marginal results of the logit specification, the coefficient on the contract variable is positive and

significant.

Table 12 also presents no changes in the significance of coefficients of the variables of interest,

though the magnitude of the coefficients collectively decreased. The interpretation of the interaction

effect between LI and SS is now different. Whereas the overall effect of a one-unit increase in LI in the SS

game state previously favored the batter, it now favors the pitcher. The data suggests that the chance of

a pitcher allowing a baserunner after an LI unit increase increases by 0.8 percent in NSS and decreases

by 1.3 percent in SS. The overall effect of a one-unit increase in LI in an SS state is therefore negative;

the pitcher is 0.5 percent less likely to yield a baserunner in an SS, all else equal. Though the magnitudes

of the effects differ, both the logit and linear interacted specifications suggest an improvement in closer

performance in pressure-packed situations that occur in SS states. Partial regression plots23 of the LI and

LI*SS coefficients, generated from the linear specification, are presented in Figure 12. The divergent

relationships between these two regressors and the outcome variable, as previously discussed, are seen

to persist in the linear specification.

Table 13 presents several adjustments to the included fixed effects in the interaction variable

specification. Column (1) reproduces the full model for reference, which includes pitcher and season

fixed effects. Columns (2) and (3) each use one of the two fixed effects variables from the full model.

Fixed effects for team run difference during each plate appearance are introduced in Column (4), and the

final column presents the logit regression with no fixed effects included. Again, the two significant

variables, LI and LI*SS, remain significant, and no other variables become significantly different from

zero because of these adjustments. Noticeable changes occur among the variable coefficients in the

specification with run difference fixed effects; for instance, the coefficients on both LI and LI*SS increase in

magnitude in their respective directions.

5.5 Further Analysis: Restricted Specifications

The effects of game situation on closer performance can be explored further within the sample.

The strict definition of an SS game state provides an opportunity to extend the study to three distinct

game states, or zones, in which a closer may be deployed. Referring back to Figure 1, the two non-save

situation (NSS) ranges can be differentiated: run differences of four runs or greater (in favor of the

pitcher’s team) will henceforth be called WinNSS, while the range including tied games and all margins

of deficit will be called LoseNSS. The purpose of this refinement is to pose questions regarding the true

drivers of performance variation: does the impact of increases in situational pressure differ across the

two newly defined zones?

23 These plots are also called adjusted variable plots.

The results of two adjusted regressions incorporating the NSS zones are presented in Table 14.

Both a binary logit and a linear specification are included, and the adjusted logit specification is

presented below, where F is the logistic cumulative distribution function.

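[Equation (19): adjusted logit specification with the LI, LI*WinNSS, and LI*LoseNSS terms; the full expression is not recoverable from the source.]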

Column (2) will be the focus of the analysis regarding coefficients on the interacted variables.

The results suggest a curious disparity in the effects of LI increases in the two NSS zones. The LI*WinNSS

variable has a negative and significant coefficient in the reported linear results, while the LI*LoseNSS

coefficient is positive and not significantly different from zero. The LI variable, now restricted to SS game

states, yields a positive and insignificant coefficient. Relative to a one-unit increase of LI in an SS state,

pitchers are significantly less likely to allow hits and walks when the pressure increases in the WinNSS

zone. The overall effect is negative and thus favors the pitchers; adding the coefficients of LI and

LI*WinNSS yields -2.8 percent, which suggests a decrease in the chance of a batter reaching base when a

one-unit increase in LI occurs in the WinNSS zone. This result provides further clarity regarding the

effects of situational pressure and game state on closer performance. The benefits to pitcher

performance as situational pressure rises, as measured through increases in Leverage Index, are not

restricted to SS game states, but also extend into higher lead margins. This finding suggests that closers

tend to respond positively to protecting their team’s lead as pressure increases, even when the

individual reward of earning a save is unattainable.

The data is parsed in a different manner in the specifications shown in Table 15. Instead of zones

of NSS, the regressions are restricted by several margins of run difference. Columns (1), (2), and (3) show

symmetric ranges of run differences that each incorporate both SS and NSS situations. Column (4) then

restricts the sample to all positive run margins, which make up about 60 percent of the data, and

Column (5) includes all of the other potential margins (tied or trailing). The first three columns display

results similar to those reviewed in Section 5.3, including the positive, significant coefficients on LI in

NSS states. Column (4) presents a newly significant interaction variable; the negative coefficient on

RunDiff*SS suggests that relative to run margin increases in NSS, a lead increase of one run in the SS

state positively affects closer performance. It is noteworthy that the LI values, in this situation of

increasing run difference, would on average be decreasing, as depicted in Figure 8, so this effect is

complementary to several themes discussed throughout the analyses.

6. Extension: Pitcher Rankings

6.1 Ranking System

The discussions in this study have thus far focused on the average effects of changes in game

situations observed across the entire sample. It is also valuable to examine the effects of such changes

on individual closers, in order to observe the effects of psychological pressure (and its potential

motivational effects) on performance. This section investigates heterogeneity within the sample to

generate “Clutchness” rankings24 for each closer25.

The first ranking system applies specification (17) to each pitcher’s sample at-bats individually.

Each ranking coefficient incorporates the individual estimates of unit increases in SS and LI, as well as

the unique constant that is, in effect, each pitcher’s fixed effect coefficient. Though the goal here is to

assemble information about a closer’s performance response to the specific situational pressure

increase of LI and the game state effects of SS, the fixed effects must be considered to present an

accurate portrayal of performance. An effective ranking system should incorporate pitchers’ baseline

levels of performance, regardless of the relative changes that high-stakes environments may cause.

24 The sabermetric website FanGraphs publishes a clutchness measure for hitters; their definition, which differs from the discussion regarding pitchers here, is explained by Seidman (2008).

25 Pope and Schweitzer (2011) analyze individual regression coefficients in their examination of heterogeneous loss aversion among golfers in their sample.

The average effects of the SS and LI changes, as reported in Table 11, are -2.6 percent and 0.3

percent, respectively. But the individual regressions yield point estimates that vary considerably by

player. To generate the ranking coefficients, a 25 percent weight is allocated to the SS and LI coefficients

each, and the remaining 50 percent weight is allocated to the individual pitcher fixed effects. Table 16

and Table 17 present the compiled coefficients for closers in the 2000 and 2011 seasons, respectively.

The interpretation of these averaged values is slightly counterintuitive: increasingly negative numbers

correspond to improved performance and higher levels of clutchness. As before, the dependent

variable in the individual regressions equals one if the pitcher yielded a hit or a walk, so negative

numbers represent decreases in the probability of yielding baserunners in save situations and in

situations of increased Leverage Index. The top-ranked pitcher represents the most clutch performer, as

determined by the model; he wields the highest combination of baseline performance and performance

improvement in response to the game scenario changes of interest.
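The weighting scheme described above can be illustrated with a short Python sketch. It combines a pitcher's fixed effect (50 percent weight) with his two situational coefficients (25 percent each) and sorts closers so that the most negative, and therefore most clutch, value appears first. The pitcher labels and coefficient values are invented for illustration and do not correspond to any entry in the ranking tables.

```python
def clutchness(fixed_effect, first_coef, second_coef, weights=(0.50, 0.25, 0.25)):
    """Weighted average of a pitcher's fixed effect and two situational coefficients.

    With the default weights, half of the score comes from the fixed effect and
    a quarter from each situational coefficient (for example SS and LI).
    """
    w_fixed, w_first, w_second = weights
    return w_fixed * fixed_effect + w_first * first_coef + w_second * second_coef


def rank_closers(estimates):
    """Rank closers from most clutch (most negative score) to least clutch.

    estimates: dict mapping a pitcher label to a tuple of
    (fixed effect, first situational coefficient, second situational coefficient).
    """
    scored = [(name, clutchness(*values)) for name, values in estimates.items()]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical individual-regression estimates, not taken from the study.
    estimates = {
        "Pitcher A": (0.30, -0.04, 0.01),
        "Pitcher B": (0.25, -0.10, -0.02),
        "Pitcher C": (0.35, 0.02, 0.03),
    }
    for name, score in rank_closers(estimates):
        print(name, round(score, 4))
```

Sorting in ascending order reflects the interpretation given in the text: lower values correspond to a lower probability of yielding a baserunner.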

The individual regressions generate sample size concerns that may affect the accuracy of the

rankings. For instance, two of the top four clutch pitchers in 2000, and three of the top four in 2011,

converted eight saves or fewer overall in the season, as noted in Tables 16 and 17. Such a small sample

size can create overweighted estimates of the improved performance of these pitchers when they are

deployed in SS. Tables 18 and 19 omit all pitchers who converted eight saves or fewer in the given

seasons to present a condensed Clutchness ranking. These rankings also omit closers with a sample size

of fewer than 100 plate appearances in the season, as well as those who made 30 percent or fewer of

their appearances in SS game states. These two latter cutoffs affected significantly fewer pitchers.26

Overall, these refinements reduced the pool of pitchers to 33 in the 2000 season and 35 in the 2011

season, reducing the total sample size by approximately 29 percent to 68 total pitcher-seasons.
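The three sample-size cutoffs can be expressed as a simple filter. The Python sketch below keeps a pitcher-season only if the closer converted more than eight saves, faced at least 100 plate appearances, and made more than 30 percent of his appearances in SS game states. The record layout and the example pitcher-seasons are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PitcherSeason:
    name: str
    saves: int              # saves converted in the season
    plate_appearances: int  # sample plate appearances in the season
    ss_share: float         # share of appearances made in SS game states


def qualifies(record: PitcherSeason) -> bool:
    """Apply the three cutoffs used for the condensed rankings."""
    return (
        record.saves > 8                     # omit eight saves or fewer
        and record.plate_appearances >= 100  # omit fewer than 100 plate appearances
        and record.ss_share > 0.30           # omit 30 percent or fewer in SS
    )


def condensed_pool(records):
    """Return only the pitcher-seasons that pass every cutoff."""
    return [record for record in records if qualifies(record)]


if __name__ == "__main__":
    # Invented pitcher-seasons for illustration only.
    pool = [
        PitcherSeason("Pitcher A", 30, 250, 0.60),
        PitcherSeason("Pitcher B", 8, 150, 0.50),
        PitcherSeason("Pitcher C", 20, 90, 0.55),
        PitcherSeason("Pitcher D", 15, 120, 0.30),
    ]
    print([record.name for record in condensed_pool(pool)])
```

Each cutoff is written with the boundary stated in the text, so a closer with exactly eight saves or exactly 30 percent of appearances in SS is excluded, while one with exactly 100 plate appearances is retained.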

Rankings are also generated with the linear interacted variable specification presented in (18).

The coefficients of interest are now those on the LI and LI*SS variables, as these were the significant

results and primary topic of discussion in Section 5.3. A similar weighted average is calculated: 50

percent is allocated to the individual pitcher fixed effects, and 25 percent apiece is now allocated to the

LI and LI*SS coefficients. The observed effects in the individual regressions again varied significantly

from the previously reported average effects. Tables 20 and 21 present the clutchness coefficients for

the interacted specification, in which the focus is now on the performance difference between high-

pressure situations in SS and in NSS states. The rankings of the pitchers with sufficient sample sizes, as

determined for the first set of rankings, are reported in the two tables.

6.2 Ranking Applications

The devised ranking system can be used to better understand the effects of shifting game

pressure on pitchers. It would be telling if the rankings possessed predictive power on other key

characteristics of individual closers. Two such performance-based attributes, both related to individual

measures, were compared to the calculated clutchness data points. Figure 15 presents scatter plots of

each pitcher’s save opportunity conversion rate, annual salary, and logarithm of annual salary plotted

against his Clutchness coefficient calculated through the non-interacted specification. The statistics that

generated the data points for each player can be found in Appendix Table A-1. The conversion rate of

save opportunities was included instead of total saves to standardize the outcome across unequal

appearance sample sizes among the closers.

[26] See table notes for details.

The fitted lines in each of the three plots of Figure 13 are positively sloped, which contradicts

the expected relationship in each case. A higher clutchness coefficient indicates worse performance, so

those with lower coefficients would intuitively be more likely to have higher save conversion

percentages and salaries. However, there are no statistically significant relationships between the

outcome variables and the clutchness measurements. T-tests do not reveal significant effects of the

clutchness coefficients on the rate of save opportunities converted (t=1.95, p=0.05), on annual salaries

(t=0.85, p=0.40), or on the logs of annual salaries (t=0.67, p=0.51).

7. Robustness, Competing Hypotheses and Caveats

7.1 Goodness-of-fit

Beyond the alternate specifications examined in Section 5, several additional goodness-of-fit

tests were applied to the data to ensure the appropriateness of a logistic regression model. A

Hosmer-Lemeshow test was conducted; the test divides the data set into groups and then compares the

observed and predicted number of positive outcomes of the dependent variable within each. It tests the

null hypothesis that the difference between the two is zero across the subgroups (H0: Observed

positive outcomes - Predicted positive outcomes = 0). Using the convention of 10 groups (Lunt, 2012),

the null hypothesis fails to be rejected (χ²(8) = 5.9, p = 0.66), which gives no evidence of poor fit in the full specification.
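
As a minimal sketch of the test just described, the function below groups observations by fitted probability and accumulates the Hosmer-Lemeshow statistic. It is an illustration rather than the routine used for the reported result; the function names are hypothetical, and the chi-square tail probability uses a closed form that is valid only for an even number of degrees of freedom (which holds for the 10-group convention, where df = 8).

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df % 2 != 0:
        raise ValueError("closed form requires an even number of degrees of freedom")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


def hosmer_lemeshow(y, p_hat, groups=10):
    """Return (statistic, df, p_value) for 0/1 outcomes y and fitted probabilities p_hat."""
    order = sorted(range(len(y)), key=lambda i: p_hat[i])
    stat = 0.0
    for g in range(groups):
        # Split the sorted observations into groups of (nearly) equal size.
        idx = order[g * len(y) // groups:(g + 1) * len(y) // groups]
        n = len(idx)
        observed = sum(y[i] for i in idx)
        expected = sum(p_hat[i] for i in idx)
        mean_p = expected / n
        stat += (observed - expected) ** 2 / (n * mean_p * (1.0 - mean_p))
    df = groups - 2
    return stat, df, chi2_sf_even_df(stat, df)


# Tail probability implied by the reported statistic of 5.9 on 8 degrees of freedom.
reported_p = chi2_sf_even_df(5.9, 8)
```

Evaluating `reported_p` gives a value that rounds to the p = 0.66 reported above. The sketch does not guard against groups whose mean fitted probability is exactly 0 or 1.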

Tests for potential collinearity, a primary concern considering the associations between the

regressors SS, LI, and RunDiff, also did not yield any concerning results that would require model

adjustments. Variance inflation factors (VIFs) were calculated for both the binary logit and linear models.

VIF values of greater than 10 are cause for concern (Chen et al., 2003), but there were no such values in

the models. In the binary logit specification, the three potentially collinear regressors had a mean VIF of

1.12.[27] The mean VIF of the linear regression model that included season and pitcher fixed effects was

6.68, and a linear model without fixed effects yielded a 4.15 mean VIF.
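
For intuition, the VIFs of the three regressors can be approximated from their pairwise correlations alone, since each VIF is the corresponding diagonal element of the inverse correlation matrix. The sketch below uses the correlations reported in Table 4 and is only an approximation: it ignores the other regressors and fixed effects in the models, and the function name is hypothetical.

```python
def vifs_from_three_correlations(r12, r13, r23):
    """VIFs for three regressors: the diagonal of the inverse 3x3 correlation matrix."""
    det = 1.0 + 2.0 * r12 * r13 * r23 - r12 ** 2 - r13 ** 2 - r23 ** 2
    return [
        (1.0 - r23 ** 2) / det,  # regressor 1
        (1.0 - r13 ** 2) / det,  # regressor 2
        (1.0 - r12 ** 2) / det,  # regressor 3
    ]


# Pairwise correlations from Table 4, ordered as SS, LI, RunDiff:
# corr(SS, LI) = 0.3324, corr(SS, RunDiff) = 0.2322, corr(LI, RunDiff) = 0.0591.
vif_ss, vif_li, vif_rundiff = vifs_from_three_correlations(0.3324, 0.2322, 0.0591)
mean_vif = (vif_ss + vif_li + vif_rundiff) / 3.0
```

Under these assumptions `mean_vif` comes out at roughly 1.12, in line with the mean VIF reported for the binary logit specification and far below the threshold of 10.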

Additional model diagnostics are also presented in Appendix A-3. These include tables

containing various measures of fit for the binary logit and linear models that include the interacted

variables, as well as a classification table that shows the logit model’s rate of accuracy in predicting the

outcome variable.[28]

7.2 Competing Hypotheses

The choice of timeframe for the primary analyses within this study can be debated. While the

sample of individual at-bats provides an excellent avenue for controlling for individual player skill and

observing the effects of slight changes in situational pressure, it is less suitable for understanding larger

trends within a pitcher’s season. The performance changes attributable to in-game situations may

instead be more closely related to large-scale themes.

The Wald-Wolfowitz, or one-sample, runs test is one method of testing for game-level trends in

closer performance. The runs test has been applied to sports in previous research to test the validity of

the “hot hand effect,” a claim of positive dependence between outcomes most notably applied to

basketball shots in Gilovich et al. (1985). A run is defined as a series of consecutive hits or misses —

saves converted and saves blown in this case — with a minimum length of one. For instance, if C is a

converted save and B is a blown save, the streak CCBBBCBC has five runs. The null hypothesis for the

test is H0: Observed # of runs = Expected # of runs.
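
The counting rule and null hypothesis above can be sketched as follows. This is a minimal illustration using the large-sample normal approximation; the exact procedure and any small-sample corrections applied by the statistical software used in the study may differ, and the function name is hypothetical.

```python
import math


def runs_test(sequence):
    """Wald-Wolfowitz runs test with a normal approximation.

    Returns (observed_runs, expected_runs, z, two_sided_p).
    """
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("sequence must contain exactly two distinct outcomes")
    n1 = sequence.count(symbols[0])
    n2 = sequence.count(symbols[1])
    n = n1 + n2
    # A new run starts wherever two neighbouring outcomes differ.
    observed = 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    expected = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (observed - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return observed, expected, z, p


# The example streak from the text: C = converted save, B = blown save.
observed, expected, z, p = runs_test("CCBBBCBC")
```

For the streak CCBBBCBC the function counts five runs, which equals the expected number for four converted and four blown saves, so z is zero. A negative z (fewer runs than expected) points toward streakiness, while a positive z points toward excessive alternation.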

[27] A user-generated command, collin, was used in Stata to calculate the VIF values of the non-linear regression. Additional information can be found in Chen et al. (2003).

[28] The estat class command was used to generate this classification table in Stata. While the default cutoff for the test is 0.5, the threshold was changed to 0.314, the proportion of batters who reached base in the sample. This cutoff better reflects the true proportion of successes (hits, walks) and failures (outs) that the model aimed to predict.

A runs test was administered to each of the 96 pitcher-seasons in the sample, with sequences of

save opportunities ordered chronologically from April to September in each regular season. Of the 96

pitcher-seasons tested, only six (6.25 percent) had seasons in which their observed number of runs of

converted and blown saves significantly differed from the number of expected runs. The test statistics of

these six players are presented in Table 22, and the entire list of results is found in Appendix A-4. The

runs of the overwhelming majority of players did not significantly differ from the expected binomial

distribution, which suggests that the closers are not affected by streakiness that carries over between

game appearances. It is noteworthy that the six players with significant results all had fewer runs of

converted and blown saves than would be expected. This is consistent with a potential “hot hand

effect” in this small subgroup; a significant deviation in the other direction (compiling more

runs than expected) would instead suggest a different type of performance inconsistency.

7.3 Caveats and Research Implications

Simultaneity is the most pressing issue affecting the analyses discussed in this paper. Unlike

some of the reviewed literature that also examined situations in sports (golf putts, free throws), the

interactions studied herein are between two players. It is therefore difficult to make definitive

conclusions regarding psychological effects on one of the competing groups. Though the study strove to

isolate the effects on pitchers — by including and clustering by pitcher fixed effects, for instance — it by

no means accounted for all of the impact the game situation changes may have had on the batters.

Carleton (2012b) remarks on similar concerns regarding a study that also involves closers. The

Apesteguia and Palacios-Huerta (2010) paper on soccer penalty kicks, which also examines a game event

contested by two players, provides a somewhat comparable situation. The authors note that the majority

of penalty kicks are scored, which is analogous to batter-pitcher interactions in that one side usually

prevails: just as most penalty kicks are converted, most batters do not reach base (see Table 3). However, randomness plays a larger role in

the outcome of at-bats, which presents a challenge in the attribution of the majority of the situational

psychological pressure to one side of the interaction. The pitcher ranking system developed in Section 6

works toward ameliorating such issues. The line of reasoning discussed there could be extended with

the further isolation of game state effects on individual pitcher performance, and a more robust sample

pool may lead to more substantive results.

It would also be valuable to expand the sample to include playoff plate appearances. This would

offer a new avenue for analysis, since the playoffs could be expected to increase pressure across all

types of in-game situations. Otten and Barrett (2013) recently conducted an observational study

comparing regular season and playoff performance among baseball players, and applying the

empirical approach described in this paper to playoff data would complement their findings. In a

similar vein, the inclusion of game-level pressure indices, such as the “Season Leverage Index”

developed by Studeman (2008), could provide a more complete picture of the motivations and

pressures experienced by players at different stages of the regular season.

This study broadly addressed the possibility of loss-averse preferences among closers, but such

analysis could be extended with the use of PITCHf/x pitch-by-pitch data. By expanding the study into

intra-plate appearance patterns, the risk preferences of pitchers could be examined. For instance, the

pitch selection, velocity, accuracy, and movement could be compared in SS, NSS, and moments of

heightened LI. Such work would complement the previously mentioned analysis of Moskowitz and

Wertheim (2011) regarding pitch selection that varied after extreme changes in batting count.

Though a powerful statistic, Leverage Index is not a panacea that can be used independently in

player performance analysis. The LI can be thought of as a snapshot measure of game pressure, so it

cannot account for a player’s personal effect on the situational pressure he later experiences. As

discussed in Tango (2006b), a pitcher who performs well can reduce the LI of ensuing at-bats. Variations

of the plate appearance LI (paLI) statistic used in this study, such as average Leverage Index (aLI) that

measures the average pressure across sets of at-bats, may be considered for inclusion in study

extensions to account for such changes.

An ideal sample would provide several improvements to the dataset used in this study. As

evident in Section 6, the generally low number of innings pitched by closers in a given season can be

troublesome when analyzing thin slices of the dataset. There is literature on consistency measures and

stabilization rates of baseball statistics (Carleton, 2012b), and closers’ season statistics rarely meet those

standards. In addition, there is potential for a bandwidth problem regarding the OBP values used to

calculate the odds ratio control variables for player skill. The end-of-season OBP statistics were used for

both pitchers and batters, which does not reflect potential swings in performance throughout the

lengthy regular season. A refinement of the study could include the same control variable calculated

with different ranges of OBP (e.g. the three months around the at-bat’s occurrence) to observe if such a

change impacts the results.

8. Conclusion

This study examined the causes of performance variation among professional baseball closers, a

highly skilled and well-compensated set of agents. A binary logit regression model was developed to

consider the drivers of closer performance. It included controls for individual pitcher and batter skill

levels, which provided a chance to clearly examine the potential psychological effects caused by changes

in situational pressure. The findings support a significant and positive effect of the save situation (SS)

game state on closer performance, which reinforces the conventional wisdom held by players, fans and

the media. After controlling for player matchups and other potential situational influences, the heralded

motivational effects of SS on pitchers persist in the average performance of the nearly 100 closers

included in the study. The findings also support a beneficial effect of the SS state on pitchers within the

more specific context of higher-pressure situations, as measured by the Leverage Index statistic, that

considerably affect the game’s ultimate outcome.

The study also contributes a statistic that isolates the effects of various situational pressures on

individual pitchers. This line of reasoning allows for the study of game pressure impact on an

individualized level, and it presents a set of heterogeneous situational effects that are more precise than

the broad-stroke claims previously applied to closers as an entity. Further refinement of such individual

“clutchness” measures could provide teams with an analytical tool to help them optimally allocate their

rosters based on the situational factors encountered in games.

This paper contributes to the literature on the relationship between incentives, effort level and

performance, as well as to recent work examining the influence of situational pressure on outcomes of

professional sporting events. The study’s findings provide empirical support for the implicit motivational

effects of the SS game state on pitchers. The conclusions suggest that the relationship between

situational factors and closer performance is valid and an area ripe for further investigation.

References

Apesteguia, Jose and Ignacio Palacios-Huerta. 2010. “Psychological Pressure in Competitive Environments: Evidence from a Randomized Natural Experiment,” American Economic Review 100:5, 2548–2564.

Appelman, David. "Get to Know: Leverage Index." FanGraphs (2008), http://www.fangraphs.com/blogs/index.php/get-to-know-leverage-index/.

Ariely, Dan, Uri Gneezy, George Loewenstein, and Nina Mazar. 2009. “Large Stakes and Big Mistakes.” Review of Economic Studies, Vol. 76, No. 2, pp. 451-469.

Baumeister, R.F. and Steinhilber, A. (1984). "Paradoxical effects of supportive audiences on performance under pressure: The home field disadvantage in sports championships." Journal of Personality and Social Psychology, 47(1): 85-93.

Berger, Jonah, and Devin Pope. "Can Losing Lead to Winning?" Management Science 57(5) (2011): 817-827.

Brooks, Dan. "Brooks Baseball." http://www.brooksbaseball.net/.

Buis, M. L. "Predict and Adjust with Logistic Regression." Stata Journal 7 2 (2007): 221-26.

Buis, M. L. "Stata Tip 87: Interpretation of Interactions in Nonlinear Models." Stata Journal 10, no. 2 (2010): 305-08.

Butler, J. L., & Baumeister, R. F. (1998). The trouble with friendly faces: Skilled performance with a supportive audience. Journal of Personality and Social Psychology, 75(5), 1213-1230.

Camerer, Colin, Linda Babcock, George Loewenstein, and Richard Thaler. 1997. “Labor Supply of New York City Cab Drivers: One Day at a Time.” Quarterly Journal of Economics, 112(2): 407–441.

Cameron, Dave. WAR and Relievers. 2010. Available from http://www.fangraphs.com/blogs/index.php/war-and-relievers/.

Carleton, Russell A. "A Modest Proposal for the Use of Closers." (2008), http://statspeakmvn.wordpress.com/2008/02/16/a-modest-proposal-for-the-use-of-closers/.

Carleton, Russell A. “If you’re happy and you know it, get on base.” (2009), http://www.hardballtimes.com/main/blog_article/if-youre-happy-and-you-know-it-get-on-base/.

Carleton, Russell A. "In Praise of the Modern Bullpen." Baseball Prospectus (2012a), http://www.baseballprospectus.com/article.php?articleid=18835.

Carleton, Russell A. “It's a Small Sample Size After All.” (2012b), http://www.baseballprospectus.com/article.php?articleid=17659.

Cao, Zheng, Joseph Price, and Daniel F. Stone. "Performance under Pressure in the NBA." Journal of Sports Economics 12 3 (2011): 231-52.

Chen, X., Ender, P., Mitchell, M. and Wells, C. 2003. Regression with Stata, from http://www.ats.ucla.edu/stat/stata/webbooks/reg/default.htm.

Chib, V. S., B. De Martino, S. Shimojo, and J. P. O'Doherty, 2012, Neural mechanisms underlying paradoxical performance for monetary incentives are driven by loss aversion, Neuron 74, 582-594.

Clark, Torin K., Aaron W. Johnson, and Alexander J. Stimpson. "Going for Three: Predicting the Likelihood of Field Goal Success with Logistic Regression." Sloan Sports Analytics Conference. 2013, Boston. http://www.sloansportsconference.com/wp-content/uploads/2013/

Cot's Baseball Contracts. 2012. http://www.baseballprospectus.com/compensation/cots/.

Farber, Henry S. 2005. “Is Tomorrow Another Day? The Labor Supply of New York City Cab Drivers.” Journal of Political Economy, 113(1): 46–82.

Fehr, Ernst, and Lorenz Goette. 2007. “Do Workers Work More If Wages Are High? Evidence from a Randomized Field Experiment.” American Economic Review, 97(1): 298–317.

Fryer, Roland G., Steven D. Levitt, John List and Sally Sadoff. 2012. “Enhancing the Efficacy of Teacher Incentives through Loss Aversion: A Field Experiment.” NBER Working Paper No. 18237.

Gilovich, Thomas, Robert Vallone, and Amos Tversky. 1985. “The Hot Hand in Basketball: On the Misperception of Random Sequences.” Cognitive Psychology 17: 295-314.

Goldman, Matthew and Justin M. Rao. 2012. “Effort vs. Concentration: The Asymmetric Impact of Pressure on NBA Performance.” Submission to the MIT Sloan Sports Analytics Conference. http://www.sloansportsconference.com/wpcontent/uploads/2012/02/Goldman_Rao_Sloan2012.pdf

Goldman, Matthew and Justin M. Rao. 2013. "Live by the Three, Die by the Three? The Price of Risk in the NBA." Submission to the MIT Sloan Sports Analytics Conference. http://www.sloansportsconference.com/wp-content/uploads/2013/

Gullickson, Aaron. "Logistic Regression." University of Oregon, http://pages.uoregon.edu/aarong/teaching/G4075_Outline/node16.html.

Haigh, M. and List J. 2005. “Do professional traders exhibit myopic loss aversion? An experimental analysis.” Journal of Finance 60(1), 523–534.

Heath, Chip, Richard P. Larrick, and George Wu. 1999. “Goals as Reference Points.” Cognitive Psychology. 38(1): 79–109.

Huckabay, Gary. Hitters love the 'walk year'. 2003. ESPN, http://sports.espn.go.com/mlb/columns/story?id=1608344.

James, Bill. 2003. The New Bill James Historical Baseball Abstract. New York: Simon and Schuster. http://books.google.com/books/about/The_New_Bill_James_Historical_Baseball_A.html?id=3uSbqUm8hSAC.

Jazayerli, Rany. "The Impact of Closers: Moving Away from Save Situation Specialization." Baseball Prospectus (2000), http://www.baseballprospectus.com/article.php?articleid=648.

Kahneman, Daniel, Jack L. Knetsch, and Richard H. Thaler. 1990. "Experimental Tests of the Endowment Effect and the Coase Theorem." Journal of Political Economy 98 (6): 1325-48.

Kahneman, Daniel, and Amos Tversky. 1992. “Advances in Prospect Theory: Cumulative Representation of Uncertainty.” Journal of Risk and Uncertainty 5: 297-323.

Kahneman, Daniel, and Amos Tversky. 1991. “Loss Aversion in Riskless Choice: A Reference-Dependent Model.” The Quarterly Journal of Economics, 106(4): 1039-1061.

Kahneman, Daniel, and Amos Tversky. 1979. “Prospect Theory: An Analysis of Decision under Risk.” Econometrica, 47(2): 263–91.

Klayman, Ben. 2010. Analysis: No perfect game but MLB to post record revenue. Reuters, http://www.reuters.com/article/2010/10/25/us-baseball-economics-idUSTRE69O4GQ20101025.

Köszegi, Botond, and Matthew Rabin. 2006. “A Model of Reference-Dependent Preferences.” Quarterly Journal of Economics, 121(4): 1133–65.

League Year-By-Year Batting--Averages. 2012. [cited 12/16 2012]. Available from http://www.baseball-reference.com/leagues/MLB/bat.shtml.

Levitt, Steven D., John A. List, Susanne Neckermann and Sally Sadoff. 2012. “The Behavioralist Goes to School: Leveraging Behavioral Economics to Improve Educational Performance.” NBER Working Paper, No. 18165.

Lewis, Brian P., and Darwyn E. Linder. "Thinking About Choking? Attentional Processes and Paradoxical Performance." Personality and Social Psychology Bulletin 23 9 (1997): 937-44.

List, John A. 2003. “Does Market Experience Eliminate Market Anomalies?” Quarterly Journal of Economics, 118: 41-71.

List, John A. 2004. “Neoclassical Theory vs. Prospect Theory: Evidence from the Marketplace.” Econometrica 72: 615–25.

Lunt, Mark. "Modelling Binary Outcomes." http://personalpages.manchester.ac.uk/staff/mark.lunt/stats_course.html.

Meisel, Zack. "Non-Save Situations No Easy Task for Closers." (2012), http://mlb.mlb.com/news/article.jsp?ymd=20120612&content_id=33161766&vkey=news_mlb&c_id=mlb.

MLB Closer Report - 2012. 2012. [cited 12/16 2012]. Available from http://espn.go.com/mlb/stats/closers.

Moskowitz, Tobias, and L. Jon Wertheim. Scorecasting: The Hidden Influences Behind How Sports Are Played and Games Are Won. New York: Random House, 2011.

Norton, E. C., H. Wang, and C. Ai. "Computing Interaction Effects and Standard Errors in Logit and Probit Models." Stata Journal 4 2 (2004): 154-67.

Ai, C., and E. C. Norton. "Interaction terms in logit and probit models." Economics Letters 80 1 (2003): 123-129.

Otten, M. P., & Barrett, M. E. (2013). Pitching and clutch hitting in Major League Baseball: What 109 years of statistics reveal. Psychology of Sport and Exercise, 14(4), 531-537.

Pedace, Roberto, and Janet Smith. 2012. “Loss Aversion and Managerial Decisions: Evidence from Major League Baseball.” Economic Inquiry, doi: 10.1111/j.1465-7295.2012.00463.x

Pope, Devin G., and Maurice E. Schweitzer. 2011. "Is Tiger Woods Loss Averse? Persistent Bias in the Face of Experience, Competition, and High Stakes." American Economic Review 101 1: 129-57.

Pope, Devin, and Uri Simonsohn. 2011. "Round Numbers as Goals." Psychological Science 22(1): 71-79.

Rauh, Michael T., and Giulio Seccia. "Anxiety and Performance: An Endogenous Learning-by-Doing Model." International Economic Review 47 2 (2006): 583-609.

Read, Daniel, George Loewenstein, and Matthew Rabin. "Choice Bracketing." Journal of Risk and Uncertainty 19 1-3 (1999): 171-97.

Seidman, Eric. "All About Clutch." FanGraphs (2008), http://www.fangraphs.com/blogs/index.php/all-about-clutch/.

Singer, Tom. "Valverde Vulnerable in Non-Save Situations." (2011), http://detroit.tigers.mlb.com/news/article.jsp?ymd=20111013&content_id=25637296&c_id=det.

Sribney, Bill. Dlogit2: Stata Modules to Compute Marginal Effects for Logit, Probit, and Mlogit. Computer software. Boston College Department of Economics, 1996.

Stark, Jayson. "The Age of the Pitcher." ESPN (2012), http://espn.go.com/mlb/story/_/id/8048897/the-age-pitcher-how-got-here-mlb.

Studeman, Dave. "Season Leverage Index." The Hardball Times (2008), http://www.hardballtimes.com/main/article/season-leverage-index/.

Tango, Tom. “Crucial Situations” (2006a), http://www.hardballtimes.com/main/article/crucial-situations.

Tango, Tom. "Crucial Situations: Part 3." The Hardball Times (2006b), http://www.hardballtimes.com/main/article/crucial-situations-part-three/.

Tango, Tom. "Crucial Situations: Leverage Index (LI)." (2007), http://www.insidethebook.com/li.shtml.

Tango, Tom and Mitchel Lichtman and Andrew Dolphin. "Excerpt: The Book - the Right -- and Wrong -- Time to Use Your Ace Reliever." Sports Illustrated (2006), http://sportsillustrated.cnn.com/2006/baseball/mlb/04/17/thebook.excerpt/index.html.

Thaler, Richard H. 1999. “Mental Accounting Matters.” Journal of Behavioral Decision Making. 12, 183-206.

Thaler, R. H., & Johnson, E. J. 1990. Gambling with the house money and trying to break even: The effects of prior outcomes on risky choices. Management Science, 36: 643-660.

Torres-Reyna, Oscar. "Getting Started in Logit and Ordered Logit Regression." Princeton University, http://dss.princeton.edu/training/Logit.pdf.

Van-Riper, Tom. The Myth Of The Contract Year Slugger. Forbes 2010. Available from http://www.forbes.com/2010/04/13/yankees-phillies-astros-business-sports-bloomberg-baseball.html/.

Wyers, Colin. "Extra Innings Excerpt Are Relievers Being Used Properly?" Baseball Prospectus (2012), http://www.baseballprospectus.com/article.php?articleid=16287.

Yerkes, R. M. and Dodson, J. D. 1908. “The Relationship of Strength of Stimulus to Rapidity of Habit-Formation”, Journal of Comparative Neurology and Psychology, 18 (5), 459–482.

Tables and Figures

Figure 1: Breakdown of save situations (SS) and non-save situations (NSS) depending on the game run difference

Figure 2: An “S-shaped” value function

Figure 3: Scatterplots of earned runs allowed in appearances by closers in the 2011 season against the run difference margin faced at game entry. Lowess curves are fitted to the data points in both figures.

Figure 3a

Figure 3b

Figure 4: Scatterplot of earned runs allowed in appearances by closers in the 2011 season against the run difference margin faced at game entry. The size of each circle is conditional on the frequency of the observation within the sample. The three points most observed were (1,0), (2,0), and (3,0), respectively.

Table 1 Proportion of On-base Occurrences

Game State                  Proportion of batters who reached base (%)
Non-save situation (NSS)    32.2
Save situation (SS)         29.8
T-test                      t=3.88***

Note: This table reports summary statistics detailing the proportion of plate appearances that resulted in a batter reaching base, depending on the game state. The total sample size is 26,223 plate appearances. The t-test tested if the difference in proportions was significantly different from zero: t(26221) = 3.88.

* p < 0.05, ** p < 0.01, *** p < 0.001
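
The test statistic in Table 1 can be approximated from the rounded proportions above and the group sizes reported in Table 3 (17,305 NSS and 8,918 SS plate appearances). The sketch below uses a pooled two-proportion statistic as a stand-in for the t-test; the function name is hypothetical, and because the inputs are rounded to one decimal place the result is close to, but not identical to, the reported 3.88.

```python
import math


def two_proportion_stat(p1, n1, p2, n2):
    """Pooled two-sample statistic for a difference in proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / se


# Rounded on-base proportions from Table 1, sample sizes from Table 3.
t_stat = two_proportion_stat(0.322, 17305, 0.298, 8918)
```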

Table 2 Summary Statistics

Variable                           Observations   Mean      Standard Deviation   Minimum   Maximum
On-base                            26223          0.3135    0.4639               0         1
Save Situation                     26223          0.3401    0.4737               0         1
Leverage Index                     26223          1.6576    1.5649               -0.34     11.04
Run Difference                     26223          0.8351    3.1054               -18       15
Home                               26223          0.5234    0.4995               0         1
Contract                           26223          0.1429    0.3500               0         1
Player Skill Control               26223          -0.8171   0.2751               -3.42     1.80
Leverage Index x Save Situation    26223          0.8102    1.5152               0         11.04
Run Difference x Save Situation    26223          0.6256    0.9866               0         3
Home x Save Situation              26223          0.1579    0.3646               0         1
Contract x Save Situation          26223          0.0574    0.2327               0         1

Note: This table reports summary statistics for the variables used in the primary and interacted variable regression specifications described in the text. Data correspond to the collected sample of 26,223 at-bats involving 96 closers in the 2000 and 2011 Major League Baseball regular seasons. On-base, Save Situation, Home, and Contract are binary variables.

Table 3 Summary Statistics by Game State

Game State                  Number of      Percent of Total   Mean             Mean
                            Observations   Observations       Leverage Index   Run Difference
Non-save situation (NSS)    17,305         65.99              1.28             0.32
Save situation (SS)         8,918          34.01              2.38             1.84
Total                       26,223         100                1.66             0.84

Plate appearance outcome    Number of      Percent of Total
                            Observations   Observations
Out                         18,001         68.65
Reached Base                8,222          31.35
Total                       26,223         100

Note: This table reports summary statistics depending on the game state, specifically non-save situations (NSS) and save situations (SS). The table also presents frequency statistics for the binary dependent variable. The results correspond to the values taken on by the dependent variable, Onbase, in subsequent regressions. “Out” corresponds to a coding of zero and “Reached Base” corresponds to a coding of one.

Figure 5: A comparison of Leverage Index ranges

[Bar chart. Vertical axis: Percent of at-bats (%). Horizontal axis: Leverage Index ranges. Closers in Sample: Low (<1) 43.05, Medium (1-2) 23.68, High (>2) 33.27. All Game Situations: Low (<1) 60, Medium (1-2) 30, High (>2) 10.]

Figure 6: Kernel density plots of Leverage Index distribution by game state

Figure 7: Histogram of run differences faced by closers in the sample

Figure 8: Scatter plot of Leverage Index values against run differences

Table 4 Matrix of Correlation Coefficients

                            On-base     Save situation   Leverage Index   Run Difference
                                        (SS)             (LI)             (RunDiff)
On-base                     1
Save situation (SS)         -0.0240*    1
                            (0.0001)
Leverage Index (LI)         0.0066      0.3324*          1
                            (0.2841)    (0.0000)
Run Difference (RunDiff)    -0.0122*    0.2322*          0.0591*          1
                            (0.0489)    (0.0000)         (0.0000)

Note: This table reports pairwise correlation coefficients for the variables of interest. P-values in parentheses:

* p < 0.05.

Table 5 Naïve Specifications

Dependent variable: On-Base

                        Naive                       Control Added
                        Logit        Odds Ratio     Logit        Odds Ratio
                        (1)          (2)            (3)          (4)
Save Situation (SS)     -0.110***    0.896***       -0.0911**    0.913**
                        (0.0283)     (0.0254)       (0.0285)     (0.0260)
Player Skill Control                                0.962        2.617
                                                    (0.0517)     (0.135)
Constant                -0.747***                   0.0200
                        (0.0163)                    (0.0437)
Observations            26223        26223          26223        26223
Pseudo R2               0.000        0.000          0.012        0.012

Note: This table reports regression results for two binary logit specifications. Column (1) reports the naïve specification results and Column (3) reports results after the inclusion of the player matchup control variable. The effect of the Player Skill Control variable is also significantly different from zero (p<0.001) in each of the specifications, but it is a control variable that, as expected, has a coefficient approximately equal to one in the logit model. The coefficients for the odds ratio columns are equal to base e raised to the logit coefficients. Constant terms are omitted in the OR columns because that relationship does not hold. Standard errors in parentheses:

* p < 0.05,

** p < 0.01,

*** p < 0.001
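
The relationship between the Logit and Odds Ratio columns stated in the note can be checked directly. The sketch below exponentiates the Table 5 logit coefficients and, as an assumption about how the odds ratio standard errors were obtained, applies the usual delta-method approximation (odds ratio multiplied by the coefficient's standard error). The function name is hypothetical.

```python
import math


def odds_ratio(coef, se):
    """Return (odds ratio, delta-method standard error) for a logit coefficient."""
    ratio = math.exp(coef)
    return ratio, ratio * se


# Logit coefficients and standard errors copied from Table 5.
table5 = {
    "SS, naive (1)": (-0.110, 0.0283),
    "SS, control added (3)": (-0.0911, 0.0285),
    "Player Skill Control (3)": (0.962, 0.0517),
}
for label, (coef, se) in table5.items():
    ratio, ratio_se = odds_ratio(coef, se)
    print(f"{label}: OR = {ratio:.3f}, SE = {ratio_se:.4f}")
```

The resulting odds ratios and standard errors agree with columns (2) and (4) to the precision reported in the table.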

Table 6 The Effects of Game Situations and Other Determinants on Probability of Reaching Base

Dependent variable: On-Base

                                 Full Specification      LI Omitted              SS Omitted              RunDiff FEs
                                 Logit       Odds Ratio  Logit       Odds Ratio  Logit       Odds Ratio  Logit       Odds Ratio
                                 (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)
Save Situation (SS)              -0.124***   0.883***    -0.107**    0.898**                             -0.0572     0.944
                                 (0.0349)    (0.0308)    (0.0328)    (0.0295)                            (0.0515)    (0.0486)
Leverage Index (LI)              0.0166      1.017                               0.00708     1.007       0.0127      1.013
                                 (0.00940)   (0.00956)                           (0.00891)   (0.00897)   (0.0130)    (0.0131)
Run Difference (RunDiff)         -0.00212    0.998       -0.00236    0.998       -0.00538    0.995
                                 (0.00443)   (0.00442)   (0.00438)   (0.00437)   (0.00425)   (0.00423)
Home                             -0.0505     0.951       -0.0548     0.947       -0.0440     0.957       -0.0554     0.946
                                 (0.0316)    (0.0300)    (0.0321)    (0.0304)    (0.0312)    (0.0299)    (0.0325)    (0.0307)
Contract                         0.0273      1.028       0.0269      1.027       0.0423*     1.043*      0.0264      1.027
                                 (0.0170)    (0.0174)    (0.0157)    (0.0161)    (0.0207)    (0.0216)    (0.0209)    (0.0215)
Player Skill Control             0.942       2.564       0.943       2.569       0.941       2.562       0.945       2.572
                                 (0.0732)    (0.188)     (0.0731)    (0.188)     (0.0731)    (0.187)     (0.0727)    (0.187)
Season Fixed Effects (2011)      0.0108      1.011       0.0140      1.014       0.0136      1.014       0.0124      1.012
                                 (0.0134)    (0.0135)    (0.0117)    (0.0119)    (0.0183)    (0.0185)    (0.0130)    (0.0132)
Constant                         0.0292                  0.0566                  -0.0305                 -0.682*
                                 (0.0552)                (0.0530)                (0.0531)                (0.322)
Observations                     26223       26223       26223       26223       26223       26223       26223       26221
Pseudo R2                        0.012       0.012       0.012       0.012       0.012       0.012       0.013       0.013

Note: This table reports regression results for binary logit specifications and the odds ratio results of those same specifications. The coefficients for LI in (1) and (2) are significant at the 10% level. The coefficients for the odds ratio columns are equal to base e raised to the logit coefficients. Constant terms are omitted in the OR columns because that relationship does not hold. Fixed effects for individual pitchers were included in each regression (not reported). The specification reported in (7) and (8) includes run difference fixed effects (not reported). The effect of the Player Skill Control variable is also significantly different from zero (p<0.001) in each of the specifications, but it is a control variable that, as expected, has a coefficient approximately equal to one in the logit model. Standard errors are robust and adjusted for clustering at the pitcher level. Robust standard errors in parentheses:

* p < 0.05,

** p < 0.01,

*** p < 0.001.

Table 7 Marginal Effects of Primary Specification Variables

Dependent variable: On-Base

                                            Restricted Samples
                              Means         LI ≥ 2        LI ≤ 2        LI ≥ 1        LI ≤ 1
                              (1)           (2)           (3)           (4)           (5)
Save Situation (SS)           -0.0266***    -0.0266***    -0.0265***    -0.0268***    -0.0265***
                              (0.00746)     (0.00748)     (0.00744)     (0.00754)     (0.00743)
Leverage Index (LI)           0.00354       0.00355       0.00354       0.00357       0.00353
                              (0.00201)     (0.00203)     (0.00199)     (0.00205)     (0.00200)
Run Difference (RunDiff)      -0.000455     -0.000456     -0.000454     -0.000458     -0.000453
                              (0.000948)    (0.000950)    (0.000946)    (0.000955)    (0.000945)
Home                          -0.0108       -0.0108       -0.0108       -0.0109       -0.0108
                              (0.00676)     (0.00677)     (0.00674)     (0.00682)     (0.00673)
Contract                      0.0159***     0.0160***     0.0159***     0.0161***     0.0159***
                              (0.00385)     (0.00384)     (0.00386)     (0.00384)     (0.00385)
Player Skill Control          0.202         0.202         0.201         0.203         0.201
                              (0.0156)      (0.0156)      (0.0156)      (0.0156)      (0.0156)
Season Fixed Effects (2011)   0.00231       0.00232       0.00231       0.00233       0.00231
                              (0.00286)     (0.00287)     (0.00286)     (0.00288)     (0.00285)
Constant                      -0.00383      -0.00384      -0.00382      -0.00386      -0.00382
                              (0.0113)      (0.0113)      (0.0113)      (0.0114)      (0.0112)
Observations                  26223         26223         26223         26223         26223

Note: This table reports the marginal effects of the independent variables. Column (1) reports marginal effects at the mean value of each of the variables, and Columns (2)-(5) report the marginal effects at the means of subsections of the data. The marginal effects were calculated using the dlogit2 command in Stata. Fixed effects for individual pitchers were included in each regression (not reported). Standard errors are robust and adjusted for clustering at the pitcher level. Robust standard errors in parentheses:

* p < 0.05,

** p < 0.01,

*** p < 0.001


On-Base

Full Specification LI Omitted RunDiff Omitted LI, RunDiff Omitted

Logit Odds Ratio

Logit Odds Ratio

Logit Odds Ratio

Logit Odds Ratio

(1) (2) (3) (4) (5) (6) (7) (8)

Save Situation (SS) 0.0899 (0.111)

1.094 (0.121)

-0.0814 (0.0668)

0.922 (0.0616)

-0.0501 (0.0672)

0.951 (0.0640)

-0.121* (0.0492)

0.886* (0.0436)

Leverage Index (LI) 0.0369*** (0.0112)

1.038*** (0.0116)

0.0365** (0.0111)

1.037** (0.0115)

Run Difference (RunDiff) -0.00254 (0.00447)

0.997 (0.00446)

-0.00187 (0.00439)

0.998 (0.00438)

Home -0.0454 (0.0385)

0.956 (0.0367)

-0.0475 (0.0385)

0.954 (0.0367)

-0.0438 (0.0387)

0.957 (0.0371)

-0.0462 (0.0388)

0.955 (0.0370)

Contract -0.0351 (0.0440)

0.966 (0.0425)

-0.0397 (0.0441)

0.961 (0.0424)

-0.0275 (0.0445)

0.973 (0.0433)

-0.0342 (0.0444)

0.966 (0.0429)

Leverage Index x Save Situation

-0.0600** (0.0222)

0.942** (0.0209)

-0.0438* (0.0187)

0.957* (0.0179)

Run Difference x Save Situation

-0.0516 (0.0333)

0.950 (0.0316)

-0.0209 (0.0273)

0.979 (0.0267)

Home x Save Situation

-0.0353 (0.0635)

0.965 (0.0613)

-0.0199 (0.0617)

0.980 (0.0605)

-0.0284 (0.0632)

0.972 (0.0615)

-0.0217 (0.0619)

0.978 (0.0606)

Contract x Save Situation

0.126 (0.0768)

1.135 (0.0871)

0.128 (0.0769)

1.136 (0.0873)

0.127 (0.0767)

1.135 (0.0870)

0.127 (0.0770)

1.136 (0.0875)

Player Skill Control 0.941 (0.0736)

2.563 (0.189)

0.944 (0.0731)

2.570 (0.188)

0.941 (0.0734)

2.562 (0.188)

0.944 (0.0730)

2.570 (0.188)

Season Fixed Effects (2011)

0.00458 (0.0138)

1.005 (0.0138)

0.0120 (0.0144)

1.012 (0.0145)

0.0088 (0.0141)

1.009 (0.0142)

0.0142 (0.0139)

1.014 (0.0141)

Constant 0.0134 (0.0620)

0.0675 (0.0589)

0.0073 (0.0615)

0.0636 (0.0585)

Observations 26223 26223 26223 26223 26223 26223 26223 26223

Pseudo R2 0.013 0.013 0.012 0.012 0.013 0.013 0.012 0.012

Table 8 The Interacted Effects of Game State and Other Determinants on On-Base Probability

Note: This table reports regression results for binary logit specifications and the odds ratio results of those same specifications. The coefficients for the odds ratio columns are base e raised to the logit coefficients. Constant terms are omitted in the odds ratio columns because that relationship does not hold. Fixed effects for individual pitchers were included in each regression (not reported). The effect of the Player Skill Control variable is also significantly different from zero (p<0.001) in each of the specifications, but it is a control variable that, as expected, has a coefficient approximately equal to one in the logit model. Standard errors are robust and adjusted for clustering at the pitcher level. Robust standard errors in parentheses:

* p < 0.05,

** p < 0.01,

*** p < 0.001
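The relationship between the logit and odds ratio columns described in the note can be checked numerically. The following is a minimal Python sketch, not the code used in this study: it exponentiates three logit coefficients copied from Table 8 and, as an assumption about how the odds ratio standard errors were produced, scales each standard error by the odds ratio (the delta method). The function name odds_ratio is illustrative.

```python
import math


def odds_ratio(coef, se):
    """Return (odds ratio, delta-method standard error) for a logit coefficient.

    The delta-method scaling of the standard error is an assumption; the
    source only states that odds ratios equal e raised to the coefficient.
    """
    ratio = math.exp(coef)
    return ratio, ratio * se


# Logit coefficients and standard errors copied from Table 8.
examples = {
    "Save Situation (SS), column (1)": (0.0899, 0.111),
    "Leverage Index (LI), column (1)": (0.0369, 0.0112),
    "Save Situation (SS), column (7)": (-0.121, 0.0492),
}

for label, (coef, se) in examples.items():
    ratio, ratio_se = odds_ratio(coef, se)
    print(f"{label}: OR = {ratio:.3f} (SE = {ratio_se:.4f})")
```

Under these assumptions the computed values agree with the adjacent odds ratio columns to the reported precision, for example 1.094 (0.121) for SS in column (2).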


LI*SS Interaction

Mean Std. Dev. Minimum Maximum Observations

Interaction Effect -0.0126 0.0013 -0.0148 -0.0019 26223

Standard error 0.0018 0.0010 0.0004 0.0115 26223

Z -8.46 2.90 -13.05 -1.03 26223

Table 9 Effect of Leverage Index*Save Situation Interaction Variable

Note: This table reports the results of the inteff Stata command, which correctly calculates the coefficient, sign and significance of interacted variables in non-linear models, such as the binary logit specification (full results reported in Table 8).
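To illustrate why the interaction effect in a logit model differs from the interaction coefficient, the sketch below evaluates the standard cross-difference for one continuous variable (LI) and one dummy (SS) at a single covariate profile. It is a hedged illustration rather than a reimplementation of inteff: the three coefficients are taken from column (1) of Table 8, while OTHER_INDEX, which stands in for the combined contribution of all remaining covariates and fixed effects, is a hypothetical value, as is the chosen LI of 1.

```python
import math


def logistic(u):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-u))


def logistic_slope(u):
    """Derivative of the logistic CDF."""
    p = logistic(u)
    return p * (1.0 - p)


def li_ss_interaction_effect(li, other_index, b_li, b_ss, b_li_ss):
    """Change in dP/dLI when SS switches from 0 to 1 at one covariate profile."""
    index_nss = b_li * li + other_index
    index_ss = (b_li + b_li_ss) * li + b_ss + other_index
    slope_ss = (b_li + b_li_ss) * logistic_slope(index_ss)
    slope_nss = b_li * logistic_slope(index_nss)
    return slope_ss - slope_nss


# Coefficients from Table 8, column (1).
B_LI, B_SS, B_LI_SS = 0.0369, 0.0899, -0.0600
# Hypothetical contribution of every other covariate (assumption, not from the source).
OTHER_INDEX = -0.8

effect = li_ss_interaction_effect(1.0, OTHER_INDEX, B_LI, B_SS, B_LI_SS)
print(f"Interaction effect at this profile: {effect:.4f}")
```

Because the effect depends on the predicted probability at each observation, it varies across the sample, which is why Table 9 reports a mean, standard deviation, minimum, and maximum rather than a single number.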

Figure 9a Figure 9b

Figure 9: Distributions of the LI*SS interaction effect and its significance level plotted against predicted probabilities of the dependent variable


Figure 10a Figure 10b

Figure 11a Figure 11b

Figures 10, 11: Scatter plots of the proportion of on-base outcomes against ranges of LI and LI in save situations (LI*SS). Figures 10a and 11a show linear regressions and Figures 10b and 11b show lowess curves fit to the data.


On-Base

Restricted Samples

Means Save Situations Non-Save Situations LI ≥ 2 LI ≤ 2 LI ≥ 1 LI ≤ 1

(1) (2) (3) (4) (5) (6) (7)

Save Situation (SS)

0.0193 (0.0238)

0.0187 (0.0231)

0.0195 (0.0241)

0.0194 (0.0240)

0.0192 (0.0237)

0.0193 (0.0239)

0.0191 (0.0236)

Leverage Index (LI)

0.00789*** (0.00239)

0.00767*** (0.00232)

0.00800*** (0.00242)

0.00796** (0.00244)

0.00785*** (0.00236)

0.00793** (0.00242)

0.00784*** (0.00235)

Run Difference (RunDiff)

-0.00054 (0.00095)

-0.000529 (0.000930)

-0.000551 (0.000969)

-0.000549 (0.000965)

-0.000541 (0.000952)

-0.000547 (0.000961)

-0.000540 (0.000950)

Home -0.00973 (0.00823)

-0.00945 (0.00799)

-0.00986 (0.00835)

-0.00982 (0.00831)

-0.00968 (0.00819)

-0.00978 (0.00827)

-0.00966 (0.00817)

Contract -0.00463 (0.00948)

-0.00450 (0.00921)

-0.00469 (0.00961)

-0.00467 (0.00957)

-0.00461 (0.00944)

-0.00465 (0.00953)

-0.00460 (0.00942)

Leverage Index x Save Situation

-0.0128** (0.00475)

-0.0125** (0.00461)

-0.0130** (0.00482)

-0.0130** (0.00478)

-0.0128** (0.00473)

-0.0129** (0.00477)

-0.0128** (0.00471)

Run Difference x Save Situation

-0.0111 (0.00713)

-0.0107 (0.00692)

-0.0112 (0.00724)

-0.0112 (0.00720)

-0.0110 (0.00710)

-0.0111 (0.00717)

-0.0110 (0.00708)

Home x Save Situation

-0.00755 (0.0136)

-0.00734 (0.0132)

-0.00765 (0.0138)

-0.00762 (0.0137)

-0.00751 (0.0135)

-0.00759 (0.0137)

-0.00750 (0.0135)

Contract x Save Situation

0.0271 (0.0164)

0.0263 (0.0160)

0.0275 (0.0167)

0.0273 (0.0166)

0.0269 (0.0164)

0.0272 (0.0165)

0.0269 (0.0163)

Player Skill Control

0.201 (0.0156)

0.196 (0.0151)

0.204 (0.0160)

0.203 (0.0158)

0.201 (0.0156)

0.202 (0.0157)

0.200 (0.0156)

Season Fixed Effects (2011)

0.000980 (0.00295)

0.000952 (0.00287)

0.000993 (0.00299)

0.000989 (0.00298)

0.000975 (0.00294)

0.000985 (0.00296)

0.000973 (0.00293)

Constant -0.000018 (0.0176)

-0.000018 (0.0171)

-0.000018 (0.0179)

-0.000018 (0.0178)

-0.000018 (0.0175)

-0.000018 (0.0177)

-0.000018 (0.0175)

Observations 26223 8918 17305 8724 17499 14933 11290

Table 10 Marginal Effects of Interacted Variables

Note: This table reports the marginal effects of the independent variables in the specification including interaction variables. Column (1) reports marginal effects at the mean value of each of the variables, and Columns (2)-(7) report the marginal effects at the means of subsections of the data. The marginal effects were calculated using the dlogit2 command in Stata. Fixed effects for individual pitchers were included in each regression (not reported). Standard errors are robust and adjusted for clustering at the pitcher level. Robust standard errors in parentheses: * p < 0.05, ** p < 0.01, *** p < 0.001
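As a rough check on the link between the logit coefficients of Table 8 and the marginal effects at the means in Column (1), the sketch below applies the textbook formula for a continuous regressor in a logit model, coefficient times p(1 - p). It is only an approximation under stated assumptions: P_AT_MEANS is a hypothetical predicted on-base probability chosen for illustration, not a value reported in this paper, and the sketch makes no claim about the internals of the dlogit2 command.

```python
def marginal_effect_at_mean(coef, p_at_means):
    """Logit marginal effect of a continuous regressor: coef * p * (1 - p)."""
    return coef * p_at_means * (1.0 - p_at_means)


# Hypothetical predicted on-base probability at the covariate means (assumption).
P_AT_MEANS = 0.31

# Logit coefficients from Table 8, column (1).
logit_coefs = {
    "Leverage Index (LI)": 0.0369,
    "Leverage Index x Save Situation": -0.0600,
    "Player Skill Control": 0.941,
}

for name, coef in logit_coefs.items():
    effect = marginal_effect_at_mean(coef, P_AT_MEANS)
    print(f"{name}: {effect:.5f}")
```

With this assumed probability the results land close to the Column (1) entries (0.00789, -0.0128, and 0.201), which suggests the reported effects are consistent with evaluation near a predicted on-base probability of roughly 0.31.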


On-Base

Linear LI Omitted RunDiff Omitted RunDiff FEs

(1) (2) (3) (4)

Save Situation (SS) -0.0262*** (0.00731)

-0.0227** (0.00688)

-0.0269*** (0.00704)

-0.0119 (0.0109)

Leverage Index (LI) 0.00346 (0.00202)

0.00350 (0.00201)

0.00265 (0.00279)

Run Difference (RunDiff)

-0.000501 (0.000956)

-0.000560 (0.000950)

Home -0.0110 (0.00669)

-0.0119 (0.00680)

-0.0107 (0.00672)

-0.0120 (0.00689)

Contract 0.0155*** (0.00402)

0.0177*** (0.00357)

0.0166*** (0.00407)

0.0133* (0.00522)

Player Skill Control 0.189 (0.0134)

0.189 (0.0133)

0.189 (0.0133)

0.189 (0.0132)

Season Fixed Effects (2011)

0.00208 (0.00344)

0.00272 (0.00295)

0.00218 (0.00346)

0.00251 (0.00319)

Constant 0.471*** (0.0105)

0.474*** (0.0102)

0.469*** (0.0102)

0.330*** (0.0553)

Observations 26223 26223 26223 26223

R2 0.015 0.015 0.015 0.016

Table 11 Primary Specification Linear Regression

Note: This table reports regression results for linear specifications including the non-interacted variables of interest. The results of the binary logit specification of the same variables are reported in Table 6. Fixed effects for individual pitchers were included in the regression (not reported). The run difference fixed effects of Column (4) are also not reported. The effect of the Player Skill Control variable is also significantly different from zero (p<0.001) in each of the specifications, but it is a control variable that is expected to differ from zero. Standard errors are robust and adjusted for clustering at the pitcher level. Robust standard errors in parentheses: * p < 0.05, ** p < 0.01, *** p < 0.001


On-Base

Linear

(1)

Save Situation (SS) 0.0182 (0.0230)

Leverage Index (LI) 0.00799** (0.00246)

Run Difference (RunDiff) -0.000592 (0.000957)

Home -0.0101 (0.00829)

Contract -0.00255 (0.00932)

Leverage Index x Save Situation

-0.0128** (0.00463)

Run Difference x Save Situation

-0.0106 (0.00687)

Home x Save Situation -0.00676 (0.0133)

Contract x Save Situation 0.0265 (0.0163)

Player Skill Control 0.188 (0.0134)

Season Fixed Effects (2011) 0.000789 (0.00357)

Constant 0.472*** (0.0161)

Observations 26223

Note: This table reports regression results for a linear specification that includes all of the covariates used in the binary logit specifications of Table 8. Fixed effects for individual pitchers were included in the regression (not reported). The effect of the Player Skill Control variable is also significantly different from zero (p<0.001) in the specification, but it is a control variable that is expected to differ from zero. Standard errors are robust and adjusted for clustering at the pitcher level. Robust standard errors in parentheses:

* p < 0.05,

** p < 0.01,

*** p < 0.001

Table 12 Linear Regression of Interacted Variable Specification


Figure 12a

Figure 12b

Figure 12: Partial regression plots of the LI and LI*SS variables


On-Base

Full Logit Pitcher FE only Year FE only RunDiff FE No FE

(1) (2) (3) (4) (5)

Save Situation (SS) 0.0899 (0.111)

0.0900 (0.111)

0.108 (0.117)

0.0804 (0.154)

0.103 (0.116)

Leverage Index (LI) 0.0369*** (0.0112)

0.0369*** (0.0112)

0.0370** (0.0124)

0.0625*** (0.0157)

0.0368** (0.0124)

Run Difference (RunDiff)

-0.00254 (0.00447)

-0.00254 (0.00447)

-0.00259 (0.00456)

-0.00266 (0.00456)

Home -0.0454 (0.0385)

-0.0454 (0.0385)

-0.0442 (0.0332)

-0.0494 (0.0399)

-0.0447 (0.0332)

Contract -0.0351 (0.0440)

-0.0397 (0.0415)

-0.0386 (0.0494)

-0.0338 (0.0466)

-0.0423 (0.0490)

Leverage Index x Save Situation

-0.0600** (0.0222)

-0.0600** (0.0222)

-0.0598** (0.0218)

-0.0837** (0.0257)

-0.0592** (0.0217)

Run Difference x Save Situation

-0.0516 (0.0333)

-0.0517 (0.0333)

-0.0518 (0.0385)

0.0284 (0.0608)

-0.0510 (0.0385)

Home x Save Situation

-0.0353 (0.0635)

-0.0353 (0.0635)

-0.0338 (0.0584)

-0.0303 (0.0653)

-0.0328 (0.0584)

Contract x Save Situation

0.126 (0.0768)

0.126 (0.0768)

0.100 (0.0786)

0.128 (0.0787)

0.101 (0.0785)

Player Skill Control 0.941 (0.0736)

0.941 (0.0732)

0.955 (0.0547)

0.945 (0.0729)

0.964 (0.0520)

Season Fixed Effects (2011)

0.00458 (0.0138)

-0.0165 (0.0289)

0.0155 (0.0123)

Constant 0.0134 (0.0620)

0.0179 (0.0697)

0.00293 (0.0507)

-0.675* (0.309)

0.00471 (0.0506)

Observations 26223 26223 26223 26221 26223

Pseudo R2 0.013 0.013 0.012 0.014 0.012

Table 13 Variation of Fixed Effects Specifications in Binary Logit Models

Note: This table reports regression results for logit specifications that vary based on the included fixed effects. The results of the various individual fixed effects are not reported. The effect of the Player Skill Control variable is also significantly different from zero (p<0.001) in each of the specifications, but it is a control variable that, as expected, has a coefficient approximately equal to one in the logit model. Standard errors are robust and adjusted for clustering at the pitcher level. Robust standard errors in parentheses:

* p < 0.05,

** p < 0.01,

*** p < 0.001


On-Base

Logit Linear

(1) (2)

Win Non-Save Situation (WinNSS) 0.125* (0.0605)

0.0258* (0.0119)

Lose Non-Save Situation (LoseNSS) 0.0863 (0.0458)

0.0183 (0.00966)

Leverage Index (LI) 0.00947 (0.0131)

0.00200 (0.00254)

Leverage Index * WinNSS -0.149 (0.0770)

-0.0300* (0.0151)

Leverage Index * LoseNSS 0.0281 (0.0190)

0.00592 (0.00430)

Home -0.0598 (0.0324)

-0.0129* (0.00581)

Contract 0.0503** (0.0183)

-0.00172 (0.00822)

Player Skill Control 0.945 (0.0734)

0.197 (0.00998)

Season Fixed Effects (2011) 0.0225 (0.0171)

Constant -0.0880 (0.0582)

0.466*** (0.0119)

Observations 26223 26223

Note: This table reports regression results for binary logit and linear specifications that divide the NSS variable into two subsections. Fixed effects for individual pitchers were included in the logit regression (not reported). The linear regression omitted pitcher and season fixed effects. Standard errors are robust in both specifications and adjusted for clustering at the pitcher level in Column (1). The coefficients for LoseNSS in (1) and (2), and the coefficients for LI*WinNSS and Home in (1), are significant at the 10% level. The effect of the Player Skill Control variable is also significantly different from zero (p<0.001) in each of the specifications, but it is a control variable that, as expected, has a coefficient approximately equal to one in the logit model. Robust standard errors in parentheses: * p < 0.05, ** p < 0.01, *** p < 0.001

Table 14 Regression of On-Base Probability on Ranges of NSS States


On-Base

Run Range: [-1,1]

Run Range: [-2,2]

Run Range: [-3,3]

Leading (RunDiff > 0)

Trailing or Tied (RunDiff ≤ 0)

(1) (2) (3) (4) (5)

Save Situation (SS) 0.163 (0.154)

0.104 (0.156)

0.156 (0.118)

0.357* (0.141)

Leverage Index (LI) 0.0662*** (0.0196)

0.0666*** (0.0159)

0.0707*** (0.0142)

0.0757** (0.0252)

0.0335 (0.0183)

Run Difference (RunDiff)

-0.0546 (0.0412)

-0.0396 (0.0225)

-0.0508** (0.0163)

0.0407* (0.0165)

0.00464 (0.00883)

Home -0.118* (0.0507)

-0.0983* (0.0439)

-0.0820* (0.0414)

0.00767 (0.0566)

-0.0898 (0.0467)

Contract 0.109 (0.0725)

0.0277 (0.109)

-0.000324 (0.0798)

-0.155* (0.0649)

0.343*** (0.0273)

Leverage Index x Save Situation

-0.0968** (0.0337)

-0.0964*** (0.0289)

-0.0949*** (0.0246)

-0.101** (0.0317)

Run Difference x Save Situation

0.0356 (0.0619)

-0.00433 (0.0362)

-0.0978** (0.0360)

Home x Save Situation

-0.0363 (0.0911)

-0.0324 (0.0704)

0.000449 (0.0643)

-0.0924 (0.0752)

Contract x Save Situation

0.0552 (0.115)

0.121 (0.0932)

0.120 (0.0872)

0.0940 (0.0796)

Player Skill Control 1.052 (0.112)

1.069 (0.0906)

1.015 (0.0819)

0.928 (0.0840)

0.995 (0.106)

Season Fixed Effects (2011)

0.0278 (0.0210)

0.0374 (0.0915)

0.0308 (0.0562)

0.0377* (0.0152)

0.0335* (0.0168)

Constant -0.0738 (0.0917)

-0.0631 (0.114)

-0.0745 (0.0883)

-0.236* (0.119)

-0.0508 (0.0831)

Observations 11346 16691 20263 15896 10327

Pseudo R2 0.018 0.017 0.015 0.014 0.016

Table 15 Regressions of On-Base Probability on Run Difference Ranges

Note: This table reports regression results for logit specifications that vary depending on ranges of team run differences at the time of the plate appearance sample points. Fixed effects for individual pitchers were included in each regression (not reported). Standard errors are robust and adjusted for clustering at the pitcher level. The effect of the Player Skill Control variable is also significantly different from zero (p<0.001) in each of the specifications, but it is a control variable that, as expected, has a coefficient approximately equal to one in the logit model. Robust standard errors in parentheses:

* p < 0.05,

** p < 0.01,

*** p < 0.001


Rank Name Team Coefficient

Rank Name Team Coefficient

1 Steve Kline MON 0.1049

25 Trevor Hoffman SDP 0.2462

2 Jose Paniagua# SEA 0.1133

26 Jeff Shaw LAD 0.2510

3 Keith Foulke CHW 0.1185

27 Scott Strickland MON 0.2529

4 Jerry Spradlin*# KCR 0.1382

28 Bob Wells MIN 0.2533

5 Danny Graves CIN 0.1388

29 Bob Wickman CLE 0.2555

6 Robb Nen SFG 0.1426

30 John Rocker ATL 0.2607

7 Mike Williams PIT 0.1502

31 Mariano Rivera NYY 0.2635

8 Mike Fetters# LAD 0.1658

32 Curtis Leskanic* MIL 0.2674

9 Rick Aguilera CHC 0.1724

33 Billy Koch TOR 0.2704

10 Mike Remlinger ATL 0.1766

34 Jeff Brantley PHI 0.2729

11 Gabe White# COL 0.1935

35 Ryan Kohlmeier BAL 0.2740

12 Eddie Guardado*# MIN 0.2013

36 Troy Percival ANA 0.2779

13 Shigetoshi Hasegawa# ANA 0.2030

37 Dave Veres STL 0.2813

14 Steve Karsay CLE 0.2035

38 Kazuhiro Sasaki SEA 0.2867

15 Wayne Gomes*# PHI 0.2092

39 Ricky Bottalico KCR 0.2879

16 Bob Howry# CHW 0.2093

40 Scott Williamson*# CIN 0.2896

17 John Wetteland TEX 0.2111

41 Byung-Hyun Kim ARI 0.2897

18 Jason Isringhausen OAK 0.2143

42 Todd Jones DET 0.3076

19 LaTroy Hawkins* MIN 0.2144

43 Roberto Hernandez TBR 0.3105

20 Ugueth Urbina^# MON 0.2152

44 Kerry Ligtenberg ATL 0.3310

21 Mike Morgan*# ARI 0.2153

45 Billy Wagner# HOU 0.3449

22 Mike Timlin BAL 0.2170

46 Derek Lowe BOS 0.3670

23 Armando Benitez NYM 0.2407

47 Antonio Alfonseca FLA 0.3758

24 Octavio Dotel* HOU 0.2433

48 Matt Mantei ARI 0.3906

Table 16 Clutchness Rankings: 2000

Note: This table reports the “Clutchness” coefficients of the 48 pitchers in the sample who pitched in the 2000 season. The coefficients were calculated by taking the weighted average of the SS coefficient, the LI coefficient, and the pitcher-specific constant term obtained through linear regressions individually run for each pitcher. The pitcher-specific fixed effect receives a 50 percent weight and the other two receive 25 percent each. The symbols following some pitcher names correspond to the following caveats: # if the pitcher earned eight or fewer saves in the season, * if the pitcher made 30 percent or fewer of his appearances in SS game states, ^ if the pitcher participated in fewer than 100 plate appearances in the season.
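The weighting scheme described in the note can be written out as a short Python sketch. This is an illustration under stated assumptions rather than the code used to build the rankings: the pitcher names and regression estimates below are hypothetical placeholders, and the ascending sort reflects the ordering visible in the table, where rank 1 carries the lowest coefficient.

```python
from typing import NamedTuple


class PitcherEstimates(NamedTuple):
    """Per-pitcher linear regression estimates (all values here are hypothetical)."""
    name: str
    ss_coef: float
    li_coef: float
    constant: float


def clutchness(est, constant_weight=0.50, ss_weight=0.25, li_weight=0.25):
    """Weighted average: 50% pitcher-specific constant, 25% SS, 25% LI."""
    return (constant_weight * est.constant
            + ss_weight * est.ss_coef
            + li_weight * est.li_coef)


def rank_pitchers(estimates):
    """Return (rank, name, coefficient) tuples, lowest coefficient first."""
    scored = sorted((clutchness(e), e.name) for e in estimates)
    return [(rank, name, score) for rank, (score, name) in enumerate(scored, start=1)]


# Hypothetical inputs for illustration only.
sample = [
    PitcherEstimates("Pitcher A", ss_coef=-0.04, li_coef=0.01, constant=0.30),
    PitcherEstimates("Pitcher B", ss_coef=0.02, li_coef=0.03, constant=0.45),
    PitcherEstimates("Pitcher C", ss_coef=-0.01, li_coef=0.00, constant=0.38),
]

for rank, name, score in rank_pitchers(sample):
    print(f"{rank} {name} {score:.4f}")
```

The same function would cover the Leverage Index Clutchness variant by passing the LI*SS coefficient in place of the SS coefficient.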


Rank Name Team Coefficient

Rank Name Team Coefficient

1 J.J. Putz ARI 0.1151

25 Brian Fuentes OAK 0.2438

2 Sean Marshall# CHC 0.1335

26 Jordan Walden LAA 0.2440

3 Eduardo Sanchez# STL 0.1425

27 Drew Storen WAS 0.2475

4 Bobby Parnell# NYM 0.1536

28 Antonio Bastardo# PHI 0.2496

5 Francisco Cordero CIN 0.1604

29 Jose Valverde DET 0.2531

6 Joel Hanrahan PIT 0.1613

30 Jim Johnson BAL 0.2556

7 Chris Sale# CHW 0.1616

31 Kyle Farnsworth TBR 0.2557

8 Brandon League SEA 0.1631

32 Neftali Feliz TEX 0.2576

9 Ryan Madson PHI 0.1689

33 Kenley Jansen*# LAD 0.2669

10 Joel Peralta# TBR 0.1748

34 Rafael Betancourt# COL 0.2682

11 Sergio Santos CHW 0.1801

35 Javy Guerra LAD 0.2689

12 Carlos Marmol CHC 0.1810

36 Mariano Rivera NYY 0.2709

13 Joakim Soria KCR 0.1843

37 Jon Rauch TOR 0.2762

14 David Hernandez ARI 0.1965

38 Chris Perez CLE 0.2809

15 Kevin Gregg BAL 0.1998

39 Santiago Casilla*# SFG 0.3019

16 Jonathan Papelbon BOS 0.2017

40 Jonathan Broxton^# LAD 0.3043

17 Jason Isringhausen# NYM 0.2047

41 Brian Wilson SFG 0.3062

18 Francisco Rodriguez NYM 0.2054

42 Jose Contreras^# PHI 0.3066

19 Heath Bell SDP 0.2233

43 Juan Carlos Oviedo FLA 0.3183

20 Matt Capps MIN 0.2325

44 Craig Kimbrel ATL 0.3188

21 Jason Motte# STL 0.2365

45 Mark Melancon HOU 0.3237

22 Fernando Salas STL 0.2391

46 Huston Street COL 0.3259

23 Frank Francisco TOR 0.2391

47 Andrew Bailey OAK 0.3359

24 John Axford MIL 0.2393

48 Joe Nathan MIN 0.3665

Table 17 Clutchness Rankings: 2011


Note: This table reports the “Clutchness” coefficients of the 48 pitchers in the sample who pitched in the 2011 season. The coefficients were calculated by taking the weighted average of the SS coefficient, the LI coefficient, and the pitcher-specific constant term obtained through linear regressions individually run for each pitcher. The pitcher-specific fixed effect receives a 50 percent weight and the other two receive 25 percent each. The symbols following some pitcher names correspond to the following caveats: # if the pitcher earned eight or fewer saves in the season, * if the pitcher made 30 percent or fewer of his appearances in SS game states, ^ if the pitcher participated in fewer than 100 plate appearances in the season.


Rank Name Team Coefficient

Rank Name Team Coefficient

1 Steve Kline MON 0.1049

18 John Rocker ATL 0.2607

2 Keith Foulke CHW 0.1185

19 Mariano Rivera NYY 0.2635

3 Danny Graves CIN 0.1388

20 Billy Koch TOR 0.2704

4 Robb Nen SFG 0.1426

21 Jeff Brantley PHI 0.2729

5 Mike Williams PIT 0.1502

22 Ryan Kohlmeier BAL 0.2740

6 Rick Aguilera CHC 0.1724

23 Troy Percival ANA 0.2779

7 Mike Remlinger ATL 0.1766

24 Dave Veres STL 0.2813

8 Steve Karsay CLE 0.2035

25 Kazuhiro Sasaki SEA 0.2867

9 John Wetteland TEX 0.2111

26 Ricky Bottalico KCR 0.2879

10 Jason Isringhausen OAK 0.2143

27 Byung-Hyun Kim ARI 0.2897

11 Mike Timlin BAL 0.2170

28 Todd Jones DET 0.3076

12 Armando Benitez NYM 0.2407

29 Roberto Hernandez TBR 0.3105

13 Trevor Hoffman SDP 0.2462

30 Kerry Ligtenberg ATL 0.3310

14 Jeff Shaw LAD 0.2510

31 Derek Lowe BOS 0.3670

15 Scott Strickland MON 0.2529

32 Antonio Alfonseca FLA 0.3758

16 Bob Wells MIN 0.2533

33 Matt Mantei ARI 0.3906

17 Bob Wickman CLE 0.2555

Table 18 Clutchness Rankings: 2000 (limited sample)

Note: This table reports the Clutchness rankings of the 33 pitchers in the sample in the 2000 season who did not have sample size-related caveats. The coefficients were calculated by taking the weighted average of the SS coefficient, the LI coefficient, and the pitcher-specific constant term obtained through linear regressions individually run for each pitcher. The pitcher-specific fixed effect receives a 50 percent weight and the other two receive 25 percent each. The following caveats led to the omission of 15 pitchers: if the pitcher earned eight or fewer saves in the season, if the pitcher made 30 percent or fewer of his appearances in SS game states, and/or if the pitcher participated in fewer than 100 plate appearances in the season.


Rank Name Team Coefficient

Rank Name Team Coefficient

1 J.J. Putz ARI 0.1151

19 Jordan Walden LAA 0.2440

2 Francisco Cordero CIN 0.1604

20 Drew Storen WAS 0.2475

3 Joel Hanrahan PIT 0.1613

21 Jose Valverde DET 0.2531

4 Brandon League SEA 0.1631

22 Jim Johnson BAL 0.2556

5 Ryan Madson PHI 0.1689

23 Kyle Farnsworth TBR 0.2557

6 Sergio Santos CHW 0.1801

24 Neftali Feliz TEX 0.2576

7 Carlos Marmol CHC 0.1810

25 Javy Guerra LAD 0.2689

8 Joakim Soria KCR 0.1843

26 Mariano Rivera NYY 0.2709

9 David Hernandez ARI 0.1965

27 Jon Rauch TOR 0.2762

10 Kevin Gregg BAL 0.1998

28 Chris Perez CLE 0.2809

11 Jonathan Papelbon BOS 0.2017

29 Brian Wilson SFG 0.3062

12 Francisco Rodriguez NYM 0.2054

30 Juan Carlos Oviedo FLA 0.3183

13 Heath Bell SDP 0.2233

31 Craig Kimbrel ATL 0.3188

14 Matt Capps MIN 0.2325

32 Mark Melancon HOU 0.3237

15 Fernando Salas STL 0.2391

33 Huston Street COL 0.3259

16 Frank Francisco TOR 0.2391

34 Andrew Bailey OAK 0.3359

17 John Axford MIL 0.2393

35 Joe Nathan MIN 0.3665

18 Brian Fuentes OAK 0.2438

Table 19 Clutchness Rankings: 2011 (limited sample)

Note: This table reports the Clutchness rankings of the 35 pitchers in the sample in the 2011 season who did not have sample size-related caveats. The coefficients were calculated by taking the weighted average of the SS coefficient, the LI coefficient, and the pitcher-specific constant term obtained through linear regressions individually run for each pitcher. The pitcher-specific fixed effect receives a 50 percent weight and the other two receive 25 percent each. The following caveats led to the omission of 13 pitchers: if the pitcher earned eight or fewer saves in the season, if the pitcher made 30 percent or fewer of his appearances in SS game states, and/or if the pitcher participated in fewer than 100 plate appearances in the season.


Rank Name Team Coefficient

Rank Name Team Coefficient

1 Steve Kline MON 0.0580

18 Jeff Shaw LAD 0.2455

2 Robb Nen SFG 0.1045

19 Scott Strickland MON 0.2480

3 Keith Foulke CHW 0.1092

20 Dave Veres STL 0.2549

4 Mike Williams PIT 0.1456

21 Bob Wells MIN 0.2619

5 Danny Graves CIN 0.1463

22 Jason Isringhausen OAK 0.2692

6 Mike Remlinger ATL 0.1720

23 Jeff Brantley PHI 0.2855

7 Steve Karsay CLE 0.1875

24 Billy Koch TOR 0.2856

8 Troy Percival ANA 0.1928

25 Todd Jones DET 0.2914

9 John Wetteland TEX 0.1958

26 Roberto Hernandez TBR 0.2968

10 Rick Aguilera CHC 0.2119

27 Kazuhiro Sasaki SEA 0.3037

11 Ryan Kohlmeier BAL 0.2225

28 Bob Wickman CLE 0.3076

12 Ricky Bottalico KCR 0.2248

29 Byung-Hyun Kim ARI 0.3135

13 Mike Timlin BAL 0.2332

30 Kerry Ligtenberg ATL 0.3238

14 Trevor Hoffman SDP 0.2345

31 Derek Lowe BOS 0.3357

15 Mariano Rivera NYY 0.2377

32 Antonio Alfonseca FLA 0.3812

16 John Rocker ATL 0.2409

33 Matt Mantei ARI 0.4184

17 Armando Benitez NYM 0.2421

Table 20 Leverage Index Clutchness Rankings: 2000 (refined sample)

Note: This table reports the Leverage Index Clutchness rankings of the 33 pitchers in the sample in the 2000 season who did not have sample size-related caveats. The coefficients were calculated by taking the weighted average of the LI*SS coefficient, the LI coefficient, and the pitcher-specific constant term obtained through linear regressions (including the interaction terms) that were individually run for each pitcher. The pitcher-specific fixed effect receives a 50 percent weight and the other two receive 25 percent each. The following caveats led to the omission of 15 pitchers: if the pitcher earned eight or fewer saves in the season, if the pitcher made 30 percent or fewer of his appearances in SS game states, and/or if the pitcher participated in fewer than 100 plate appearances in the season.


Rank Name Team Coefficient

Rank Name Team Coefficient

1 Joel Hanrahan PIT 0.1302

19 Mariano Rivera NYY 0.2418

2 Heath Bell SDP 0.1509

20 Brian Fuentes OAK 0.2477

3 Brandon League SEA 0.1525

21 Andrew Bailey OAK 0.2516

4 Francisco Cordero CIN 0.1744

22 Kyle Farnsworth TBR 0.2549

5 Ryan Madson PHI 0.1756

23 Frank Francisco TOR 0.2566

6 Matt Capps MIN 0.1798

24 Juan Carlos Oviedo FLA 0.2650

7 Jonathan Papelbon BOS 0.1834

25 David Hernandez ARI 0.2659

8 J.J. Putz ARI 0.1846

26 Jim Johnson BAL 0.2701

9 Jordan Walden LAA 0.1853

27 Jon Rauch TOR 0.2710

10 Drew Storen WAS 0.1855

28 Neftali Feliz TEX 0.2884

11 Carlos Marmol CHC 0.1932

29 Javy Guerra LAD 0.3189

12 Joakim Soria KCR 0.2139

30 Mark Melancon HOU 0.3228

13 Francisco Rodriguez NYM 0.2181

31 Chris Perez CLE 0.3270

14 Kevin Gregg BAL 0.2225

32 Huston Street COL 0.3272

15 Sergio Santos CHW 0.2231

33 Joe Nathan MIN 0.3348

16 Fernando Salas STL 0.2249

34 Craig Kimbrel ATL 0.3432

17 Jose Valverde DET 0.2259

35 Brian Wilson SFG 0.3669

18 John Axford MIL 0.2265

Table 21 Leverage Index Clutchness Rankings: 2011 (refined sample)

Note: This table reports the Leverage Index Clutchness rankings of the 35 pitchers in the sample in the 2011 season who did not have sample size-related caveats. The coefficients were calculated by taking the weighted average of the LI*SS coefficient, the LI coefficient, and the pitcher-specific constant term obtained through linear regressions (including the interaction terms) that were individually run for each pitcher. The pitcher-specific fixed effect receives a 50 percent weight and the other two receive 25 percent each. The following caveats led to the omission of 13 pitchers: if the pitcher earned eight or fewer saves in the season, if the pitcher made 30 percent or fewer of his appearances in SS game states, and/or if the pitcher participated in fewer than 100 plate appearances in the season.


Figure 13a

Figure 13b Figure 13c

Figure 13: Scatter plots of save opportunity conversion percentage, annual salary, and log(annual salary) against individual Clutchness rankings


Name Team Year Runs Expected Runs z Save Opportunities

Mike Fetters LAD 2000 2 3.9 -1.97* 7

Eddie Guardado MIN 2000 2 4.3 -2.64** 11

Byung-Hyun Kim ARI 2000 5 9.4 -2.43** 20

Brandon League SEA 2011 7 9.8 -2.17* 42

Joe Nathan MIN 2011 3 5.9 -2.67* 17

Francisco Rodriguez NYM 2011 5 10.5 -3.24** 29

Note: This table reports the six significant results yielded by the Wald-Wolfowitz runs tests. The results of all 96 Wald-Wolfowitz tests can be found in Appendix Table A-5. Significance symbols are as follows:

* p < 0.05,

** p < 0.01,

*** p < 0.001

Table 22 Wald-Wolfowitz Runs Test: Significant Results
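For readers who wish to reproduce the Runs, Expected Runs, and z columns, the sketch below implements the textbook Wald-Wolfowitz runs test with its normal approximation. It is a minimal illustration, not the code used in this study. The example sequence is an assumption: the source reports only that Mike Fetters (LAD, 2000) had 7 save opportunities with 2 runs, and Table A-1 lists 5 saves and 2 blown saves, so the exact ordering shown here is hypothetical, although any ordering with two runs yields the same statistics.

```python
import math


def runs_test(outcomes):
    """Wald-Wolfowitz runs test on a binary sequence.

    Returns (observed runs, expected runs, z) using the normal approximation.
    """
    n1 = sum(1 for o in outcomes if o)   # successes (e.g. converted saves)
    n2 = len(outcomes) - n1              # failures (e.g. blown saves)
    n = n1 + n2
    if n1 == 0 or n2 == 0:
        raise ValueError("The runs test needs both outcome types in the sequence.")
    # A new run starts wherever consecutive outcomes differ.
    runs = 1 + sum(1 for a, b in zip(outcomes, outcomes[1:]) if bool(a) != bool(b))
    expected = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    return runs, expected, z


# Hypothetical ordering: five saves followed by two blown saves (two runs).
fetters_like = [1, 1, 1, 1, 1, 0, 0]
runs, expected, z = runs_test(fetters_like)
print(f"runs = {runs}, expected = {expected:.1f}, z = {z:.2f}")
```

A negative z indicates fewer runs than expected under independence, meaning that converted and blown saves cluster together more than chance alone would suggest.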


Appendix

Name Season Team Saves Blown Saves Conversion Rate (%) Season Salary ($) SS AB NSS AB Total AB

Rick Aguilera 2000 CHC 29 8 78 3,500,000 140 70 209

Antonio Alfonseca 2000 FLA 45 4 92 380,000 217 94 310

John Axford 2011 MIL 46 2 96 442,500 195 110 305

Andrew Bailey 2011 OAK 24 2 92 465,000 108 62 170

Antonio Bastardo 2011 PHI 8 1 89 419,000 84 141 225

Heath Bell 2011 SDP 43 5 90 7,500,000 185 71 256

Armando Benitez 2000 NYM 41 5 89 3,437,500 192 112 304

Rafael Betancourt 2011 COL 8 4 67 3,775,000 123 114 236

Ricky Bottalico 2000 KCR 16 7 70 1,500,000 144 175 319

Jeff Brantley 2000 PHI 23 5 82 500,000 132 124 256

Jonathan Broxton 2011 LAD 7 1 88 7,000,000 34 28 62

Matt Capps 2011 MIN 15 9 63 7,150,000 131 143 274

Santiago Casilla 2011 SFG 6 1 86 1,300,000 51 160 210

Jose Contreras 2011 PHI 5 0 100 2,500,000 34 26 60

Francisco Cordero 2011 CIN 37 6 86 12,125,000 169 105 273

Octavio Dotel 2000 HOU 16 7 70 240,000 111 448 559

Kyle Farnsworth 2011 TBR 25 6 81 2,600,000 117 114 231

Neftali Feliz 2011 TEX 32 6 84 457,160 147 105 252

Mike Fetters 2000 LAD 5 2 71 550,000 72 129 205

Keith Foulke 2000 CHW 34 5 87 375,000 207 143 350

Frank Francisco 2011 TOR 17 4 81 4,000,000 88 130 217

Brian Fuentes 2011 OAK 12 3 80 5,000,000 97 153 250

Wayne Gomes 2000 PHI 7 4 64 925,000 72 252 323

Danny Graves 2000 CIN 30 5 86 2,100,000 186 202 386

Kevin Gregg 2011 BAL 22 7 76 4,200,000 120 155 274

Eddie Guardado 2000 MIN 9 2 82 875,000 66 196 262

Javy Guerra 2011 LAD 21 2 91 488,000 84 111 194

Joel Hanrahan 2011 PIT 40 4 91 1,400,000 177 97 274

Shigetoshi Hasegawa 2000 ANA 9 9 50 900,000 217 98 415

Table A-1 Descriptive Statistics of Sample Pitchers


LaTroy Hawkins 2000 MIN 14 0 100 1,115,000 101 269 370

David Hernandez 2011 ARI 11 3 79 423,500 149 142 290

Roberto Hernandez 2000 TBR 32 8 80 6,000,000 177 138 315

Trevor Hoffman 2000 SDP 43 7 86 6,600,000 196 95 290

Bob Howry 2000 CHW 7 5 58 325,000 121 168 288

Jason Isringhausen 2000 OAK 33 7 83 825,000 182 122 304

Jason Isringhausen 2011 NYM 7 4 64 700,000 122 78 198

Kenley Jansen 2011 LAD 5 1 83 416,000 42 147 218

Jim Johnson 2011 BAL 9 5 64 975,000 166 200 366

Todd Jones 2000 DET 42 4 91 3,650,000 187 84 270

Steve Karsay 2000 CLE 20 9 69 1,200,000 198 131 329

Byung-Hyun Kim 2000 ARI 14 6 70 762,500 128 177 318

Craig Kimbrel 2011 ATL 46 8 85 419,000 210 96 299

Steve Kline 2000 MON 14 4 78 355,000 141 208 347

Billy Koch 2000 TOR 33 5 87 333,333 189 137 325

Ryan Kohlmeier 2000 BAL 13 1 93 200,000 64 56 120

Brandon League 2011 SEA 37 5 88 2,250,000 157 93 250

Curtis Leskanic 2000 MIL 12 1 92 1,450,000 97 236 330

Kerry Ligtenberg 2000 ATL 12 2 86 255,000 90 127 216

Derek Lowe 2000 BOS 42 5 89 625,000 232 147 379

Ryan Madson 2011 PHI 32 2 94 4,833,333 142 104 246

Matt Mantei 2000 ARI 17 3 85 2,831,000 80 120 200

Carlos Marmol 2011 CHC 34 10 77 2,533,333 196 131 327

Sean Marshall 2011 CHC 5 4 56 1,600,000 155 152 306

Mark Melancon 2011 HOU 20 5 80 421,000 121 188 309

Mike Morgan 2000 ARI 5 1 83 800,000 78 276 445

Jason Motte 2011 STL 9 4 69 435,000 103 165 267

Joe Nathan 2011 MIN 14 3 82 11,250,000 97 94 191

Robb Nen 2000 SFG 41 5 89 5,500,000 175 81 256

Juan Carlos Oviedo 2011 FLA 36 6 86 3,650,000 167 101 268

Jose Paniagua 2000 SEA 5 3 63 275,000 110 234 343

Jonathan Papelbon 2011 BOS 31 3 91 12,000,000 136 119 255

Bobby Parnell 2011 NYM 6 6 50 433,500 98 170 267

Joel Peralta 2011 TBR 6 2 75 925,000 105 151 256


Troy Percival 2000 ANA 32 10 76 2,350,000 172 49 221

Chris Perez 2011 CLE 36 4 90 2,225,000 145 103 248

J.J. Putz 2011 ARI 45 4 92 4,000,000 163 47 210

Jon Rauch 2011 TOR 11 5 69 3,500,000 81 144 225

Mike Remlinger 2000 ATL 12 4 75 1,400,000 170 141 310

Mariano Rivera 2000 NYY 36 5 88 7,250,000 201 110 311

Mariano Rivera 2011 NYY 44 5 90 14,911,700 178 55 222

John Rocker 2000 ATL 24 3 89 290,000 147 104 251

Francisco Rodriguez 2011 NYM 23 6 79 12,166,666 184 123 307

Fernando Salas 2011 STL 24 6 80 425,000 145 150 294

Chris Sale 2011 CHW 8 2 80 425,000 122 166 287

Eduardo Sanchez 2011 STL 5 2 71 425,000 67 51 118

Sergio Santos 2011 CHW 30 6 83 435,000 151 109 260

Kazuhiro Sasaki 2000 SEA 37 3 93 4,000,000 166 99 264

Jeff Shaw 2000 LAD 27 7 79 5,383,333 134 115 248

Joakim Soria 2011 KCR 28 7 80 4,000,000 154 102 256

Jerry Spradlin 2000 KCR 7 4 64 962,500 84 287 389

Drew Storen 2011 WSN 43 5 90 418,000 209 94 303

Huston Street 2011 COL 29 4 88 7,300,000 139 100 239

Scott Strickland 2000 MON 9 4 69 202,500 67 133 198

Mike Timlin 2000 BAL 12 6 67 4,250,000 112 183 293

Ugueth Urbina 2000 MON 8 2 80 3,200,000 40 14 54

Jose Valverde 2011 DET 49 0 100 7,000,000 192 109 301

Dave Veres 2000 STL 29 7 81 1,366,667 176 134 310

Billy Wagner 2000 HOU 6 9 40 3,200,000 77 52 129

Jordan Walden 2011 LAA 32 10 76 414,000 183 70 253

Bob Wells 2000 MIN 10 10 50 700,000 158 193 350

John Wetteland 2000 TEX 34 9 79 6,500,000 188 81 269

Gabe White 2000 COL 5 4 56 630,000 125 204 326

Bob Wickman 2000 CLE 30 7 81 2,400,000 155 154 308

Mike Williams 2000 PIT 24 5 83 1,000,000 120 187 307

Scott Williamson 2000 CIN 6 2 75 300,000 78 413 491

Brian Wilson 2011 SFG 36 5 88 6,500,000 168 75 243

Table A-2.1 Effect of Run Difference*Save Situation Interaction Variable

RunDiff*SS Interaction   Observations   Mean         Std. Dev.   Minimum      Maximum
Interaction Effect       26223          -0.0109862   0.0012338   -0.0128623   -0.0010391
Standard error           26223          0.0021855    0.0015608   0.0005329    0.015625
Z                        26223          -6.720539    2.892888    -12.00559    -0.3968737

Table A-2.2 Effect of Home*Save Situation Interaction Variable

Home*SS Interaction      Observations   Mean         Std. Dev.   Minimum      Maximum
Interaction Effect       26223          -0.0073779   0.0006544   -0.0082426   -0.0012449
Standard error           26223          0.0013126    0.0001505   0.0005213    0.0024951
Z                        26223          -5.705046    0.889937    -7.206858    -1.768682

Table A-2.3 Effect of Contract*Save Situation Interaction Variable

Contract*SS Interaction  Observations   Mean         Std. Dev.   Minimum      Maximum
Interaction Effect       26223          0.0215835    0.0022451   0.0036194    0.0251713
Standard error           26223          0.0021625    0.0006044   0.0010952    0.0073849
Z                        26223          10.72835     3.17989     1.75991      22.23528

Note: These tables report the results of the inteff Stata command, which correctly calculates the coefficient, sign and significance of interacted variables in non-linear models, such as the binary logit specification (full results reported in Table 8). The results of the LI*SS interaction are reported in Table 9; the tables above report the remaining three interactions in the related specification.
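To illustrate why an interaction needs special treatment in a logit model, the following minimal sketch computes the interaction effect of two dummy variables as a double difference in predicted probabilities. It is not the inteff implementation, and the coefficient values and the function name `dummy_interaction_effect` are hypothetical, chosen only for illustration.

```python
import math


def logistic(z):
    """Logistic CDF used by the binary logit model."""
    return 1.0 / (1.0 + math.exp(-z))


def dummy_interaction_effect(b1, b2, b12, xb):
    """Interaction effect of two dummies in a logit model.

    b1, b2 : coefficients on the two dummy variables (hypothetical values)
    b12    : coefficient on their product
    xb     : linear index from all other covariates for one observation

    The effect is the change in predicted probability when both dummies
    switch on, net of the two separate changes (a double difference).
    """
    return (
        logistic(xb + b1 + b2 + b12)
        - logistic(xb + b1)
        - logistic(xb + b2)
        + logistic(xb)
    )


# Even with a zero coefficient on the product term, the effect is nonzero,
# and it varies with xb, so it differs from observation to observation.
effect_at_zero = dummy_interaction_effect(b1=1.0, b2=1.0, b12=0.0, xb=0.0)
effect_at_two = dummy_interaction_effect(b1=1.0, b2=1.0, b12=0.0, xb=2.0)
print(effect_at_zero, effect_at_two)
```

Because the effect depends on the other covariates through xb, it is summarized across all 26,223 observations (mean, standard deviation, minimum, maximum) rather than by a single number.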

Table A-3 Goodness-of-fit Test Results

Table A-3.1 Collinearity Test

Collinearity Diagnostics

Variable         VIF    SQRT VIF   Tolerance   R-Squared
Save Situation   1.18   1.09       0.8441      0.1559
Leverage Index   1.12   1.06       0.8891      0.1109
Run Difference   1.06   1.03       0.9457      0.0543
Mean VIF         1.12

     Eigenvalue   Condition Index
1    2.4370       1.0000
2    0.8632       1.6802
3    0.4302       2.3801
4    0.2696       3.0067

Condition Number          3.0067
Det(correlation matrix)   0.8412

Table A-3.2 Classification

Logistic model for ob

             True
Classified   D      ~D      Total
+            4636   8195    12831
-            3586   9806    13392
Total        8222   18001   26223

Classified + if predicted Pr(D) >= .314
True D defined as ob != 0

Sensitivity                     Pr( +| D)   56.39%
Specificity                     Pr( -|~D)   54.47%
Positive predictive value       Pr( D| +)   36.13%
Negative predictive value       Pr(~D| -)   73.22%
False + rate for true ~D        Pr( +|~D)   45.53%
False - rate for true D         Pr( -| D)   43.61%
False + rate for classified +   Pr(~D| +)   63.87%
False - rate for classified -   Pr( D| -)   26.78%
Correctly classified                        55.07%
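The classification rates can be recomputed from the four cell counts of the prediction table. The sketch below does so in Python; the function name `classification_summary` is an illustrative assumption, and only the counts come from the table.

```python
def classification_summary(tp, fp, fn, tn):
    """Rates derived from a 2x2 prediction table.

    tp: classified + and truly D     fp: classified + but truly ~D
    fn: classified - but truly D     tn: classified - and truly ~D
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),      # Pr(+ | D)
        "specificity": tn / (tn + fp),      # Pr(- | ~D)
        "ppv": tp / (tp + fp),              # Pr(D | +)
        "npv": tn / (tn + fn),              # Pr(~D | -)
        "accuracy": (tp + tn) / total,      # correctly classified
        "base_rate": (tp + fn) / total,     # sample share of on-base outcomes
    }


# Cell counts taken from the classification table.
summary = classification_summary(tp=4636, fp=8195, fn=3586, tn=9806)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The base rate, 8222/26223, is approximately 0.314, which is the classification threshold used in place of the default 0.5.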

Table A-3.3 Logit goodness-of-fit tests

Measures of Fit for logit of ob

Log-Lik Intercept Only:    -16308.217    Log-Lik Full Model:            -16102.038
D(26119):                  32204.076     LR(9):                         412.357
                                         Prob > LR:                     0
McFadden's R2:             0.013         McFadden's Adj R2:             0.006
ML (Cox-Snell) R2:         0.016         Cragg-Uhler(Nagelkerke) R2:    0.022
McKelvey & Zavoina's R2:   0.023         Efron's R2:                    0.016
Variance of y*:            3.368         Variance of error:             3.29
Count R2:                  0.687         Adj Count R2:                  0
AIC:                       1.236         AIC*n:                         32412.076
BIC:                       -233540.873   BIC':                          -320.788
BIC used by Stata:         32305.82      AIC used by Stata:             32224.076

Table A-3.4 Linear goodness-of-fit tests

Measures of Fit for regress of ob

Log-Lik Intercept Only:    -17069.129    Log-Lik Full Model:            -16866.944
D(26119):                  33733.887     LR(9):                         404.37
                                         Prob > LR:                     0
R2:                        0.015         Adjusted R2:                   0.011
AIC:                       1.294         AIC*n:                         33941.887
BIC:                       -232011.062   BIC':                          -312.801
BIC used by Stata:         33835.631     AIC used by Stata:             33753.887

Note: Tables A-3.1 through A-3.4 report goodness-of-fit tests for both the binary logit and linear models used throughout the paper. Table A-3.1 presents variance inflation factors (VIFs) to test for collinearity in the logit model, computed with the collin command in Stata. Table A-3.2 reports classification results, generated through the estat command, in the form of a prediction table based on the binary logit specification. As noted in the text, the test threshold was changed from the baseline 0.5 to 0.314 to reflect the sample proportion of positive on-base outcomes in the data. Tables A-3.3 and A-3.4 present a number of goodness-of-fit tests for the logit and linear models, respectively, using the fitstat command in Stata.
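Several of the logit fit measures follow directly from the two reported log-likelihoods and the sample size. The sketch below recomputes them using the standard textbook definitions; it is not the fitstat source code. The parameter count of 10 (nine regressors from LR(9) plus a constant) is an assumption that happens to reproduce the "used by Stata" AIC and BIC figures.

```python
import math


def logit_fit_measures(ll_null, ll_full, n_obs, n_params):
    """Likelihood-based fit measures for a binary logit model."""
    deviance = -2.0 * ll_full
    lr = 2.0 * (ll_full - ll_null)                  # likelihood-ratio statistic
    mcfadden = 1.0 - ll_full / ll_null              # McFadden's pseudo R2
    cox_snell = 1.0 - math.exp(-lr / n_obs)         # ML (Cox-Snell) R2
    # Cragg-Uhler (Nagelkerke) rescales Cox-Snell by its maximum attainable value.
    nagelkerke = cox_snell / (1.0 - math.exp(2.0 * ll_null / n_obs))
    aic = deviance + 2.0 * n_params
    bic = deviance + n_params * math.log(n_obs)
    return {
        "deviance": deviance,
        "lr": lr,
        "mcfadden": mcfadden,
        "cox_snell": cox_snell,
        "nagelkerke": nagelkerke,
        "aic": aic,
        "bic": bic,
    }


# Log-likelihoods and sample size as reported for the logit model;
# n_params=10 is an assumed count (9 regressors + constant).
fit = logit_fit_measures(ll_null=-16308.217, ll_full=-16102.038,
                         n_obs=26223, n_params=10)
for name, value in fit.items():
    print(f"{name}: {value:.3f}")
```

The low pseudo R2 values are recomputed here only as an arithmetic check on the table; their interpretation is discussed in the main text.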

Name Team Runs Expected Runs Z Saves Blown Saves

Shigetoshi Hasegawa ANA 11 10.0 0.49 9 9

Jerry Spradlin KCR 8 6.1 1.32 7 4

Troy Percival ANA 19 16.2 1.20 32 10

Mike Fetters LAD 2 3.9 -1.97* 5 2

Byung-Hyun Kim ARI 5 9.4 -2.43** 14 6

Jeff Shaw LAD 10 12.1 -1.15 27 7

Matt Mantei ARI 6 6.1 -0.10 17 3

Curtis Leskanic MIL 17 20.0 -1.10 35 13

Mike Morgan ARI 2 2.7 -1.41 5 1

Eddie Guardado MIN 2 4.3 -2.64** 9 2

Kerry Ligtenberg ATL 4 4.4 -0.54 12 2

LaTroy Hawkins MIN 1 1.0 - 14 0

Mike Remlinger ATL 6 7.0 -0.71 12 4

Bob Wells MIN 12 11.0 0.46 10 10

John Rocker ATL 6 6.3 -0.35 24 3

Steve Kline MON 7 7.2 -0.16 14 4

Ryan Kohlmeier BAL 3 2.9 0.41 13 1

Scott Strickland MON 6 6.5 -0.37 9 4

Mike Timlin BAL 6 9.0 -1.65 12 6

Ugueth Urbina MON 4 4.2 -0.23 8 2

Derek Lowe BOS 9 9.9 -0.75 42 5

Armando Benitez NYM 11 9.9 0.87 41 5

Rick Aguilera CHC 13 13.5 -0.27 29 8

Mariano Rivera NYY 11 9.8 0.93 36 5

Keith Foulke CHW 10 9.7 0.21 34 5

Jason Isringhausen OAK 11 12.6 -0.88 33 7

Bob Howry CHW 8 6.8 0.73 7 5

Jeff Brantley PHI 8 9.2 -0.82 23 5

Danny Graves CIN 11 9.6 1.03 30 5

Wayne Gomes PHI 7 6.1 0.63 7 4

Scott Williamson CIN 3 4.0 -1.08 6 2

Mike Williams PIT 11 9.3 1.18 24 5

Steve Karsay CLE 11 13.4 -1.07 20 9

Trevor Hoffman SDP 13 13.0 -0.02 43 7

Bob Wickman CLE 15 12.4 1.47 30 7

Robb Nen SFG 7 9.9 -2.33 41 5

Gabe White COL 4 5.4 -1.04 5 4

Jose Paniagua SEA 3 4.8 -1.44 5 3

Todd Jones DET 9 8.3 0.69 42 4

Kazuhiro Sasaki SEA 7 6.6 0.56 37 3

Antonio Alfonseca FLA 9 8.3 0.66 45 4

Dave Veres STL 15 12.3 1.50 29 7

Octavio Dotel HOU 9 10.7 -0.88 16 7

Roberto Hernandez TBR 13 13.8 -0.41 32 8

Billy Wagner HOU 8 8.2 -0.11 6 9

John Wetteland TEX 14 15.2 -0.58 34 9

Ricky Bottalico KCR 11 10.7 0.13 16 7

Billy Koch TOR 10 9.7 0.24 33 5

Note: This table reports the results of the Wald-Wolfowitz runs tests for closers in the sample from the 2000 season. Significance symbols are as follows: * p < 0.05, ** p < 0.01, *** p < 0.001.

Table A-4.1 Wald-Wolfowitz Runs Test: 2000 Results
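The Expected Runs and Z columns appear consistent with the usual normal approximation to the Wald-Wolfowitz runs test applied to each closer's sequence of saves and blown saves. The following is a minimal sketch under that assumption; the function names are illustrative and not taken from the paper.

```python
import math


def count_runs(outcomes):
    """Number of maximal blocks of identical consecutive outcomes."""
    if not outcomes:
        return 0
    return 1 + sum(1 for prev, cur in zip(outcomes, outcomes[1:]) if prev != cur)


def runs_test(runs, n_saves, n_blown):
    """Expected runs and z-score under the normal approximation.

    Returns (expected, z); z is None when the variance is zero,
    for example when a closer has no blown saves.
    """
    n = n_saves + n_blown
    if n < 2:
        return float(n), None
    pairs = 2 * n_saves * n_blown
    expected = pairs / n + 1
    variance = pairs * (pairs - n) / (n ** 2 * (n - 1))
    if variance <= 0:
        return expected, None
    return expected, (runs - expected) / math.sqrt(variance)


# A toy sequence (1 = save, 0 = blown save) has five runs: 11 | 0 | 1 | 00 | 1.
toy_runs = count_runs([1, 1, 0, 1, 0, 0, 1])

# A closer with 9 observed runs, 45 saves and 4 blown saves.
expected, z = runs_test(runs=9, n_saves=45, n_blown=4)
print(toy_runs, round(expected, 1), round(z, 2))
```

A negative Z indicates fewer runs than expected, meaning that saves and blown saves cluster together more than they would under a random ordering; a closer with no blown saves has a single run and no defined Z.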

Name Team Runs Expected Runs Z Saves Blown Saves

J.J. Putz ARI 9 8.3 0.66 45 4

Matt Capps MIN 11 12.3 -0.56 15 9

David Hernandez ARI 4 5.7 -1.48 11 3

Jason Isringhausen NYM 4 6.1 -1.45 7 4

Craig Kimbrel ATL 12 14.6 -1.46 46 8

Bobby Parnell NYM 7 7.0 0.00 6 6

Jim Johnson BAL 6 7.4 -0.87 9 5

Francisco Rodriguez NYM 5 10.5 -3.24** 23 6

Kevin Gregg BAL 15 11.6 1.77 22 7

Mariano Rivera NYY 9 10.0 -0.80 44 5

Jonathan Papelbon BOS 6 6.5 -0.55 31 3

Brian Fuentes OAK 7 5.8 1.05 12 3

Carlos Marmol CHC 18 16.5 0.68 34 10

Andrew Bailey OAK 4 4.7 -1.10 24 2

Sean Marshall CHC 7 5.4 1.12 5 4

Jose Contreras PHI 1 1.0 - 5 0

Sergio Santos CHW 13 11.0 1.25 30 6

Ryan Madson PHI 5 4.8 0.42 32 2

Chris Sale CHW 3 4.2 -1.36 8 2

Antonio Bastardo PHI 2 2.8 -1.87 8 1

Francisco Cordero CIN 9 11.3 -1.54 37 6

Joel Hanrahan PIT 9 8.3 0.71 40 4

Chris Perez CLE 7 8.2 -1.12 36 4

Heath Bell SDP 11 10.0 0.85 43 5

Rafael Betancourt COL 4 6.3 -1.61 8 4

Santiago Casilla SFG 2 2.7 -1.58 6 1

Huston Street COL 8 8.0 -0.03 29 4

Brian Wilson SFG 8 9.8 -1.36 36 5

Jose Valverde DET 1 1.0 - 49 0

Brandon League SEA 7 9.8 -2.17* 37 5

Juan Carlos Oviedo FLA 11 11.3 -0.19 36 6

Jason Motte STL 4 6.5 -1.75 9 4

Mark Melancon HOU 10 9.0 0.65 20 5

Fernando Salas STL 10 10.6 -0.36 24 6

Joakim Soria KCR 11 12.2 -0.65 28 7

Eduardo Sanchez STL 3 3.9 -0.91 5 2

Jordan Walden LAA 14 16.2 -0.97 32 10

Kyle Farnsworth TBR 11 10.7 0.19 25 6

Jonathan Broxton LAD 3 2.8 0.58 7 1

Joel Peralta TBR 4 4.0 0.00 6 2

Javy Guerra LAD 4 4.7 -0.98 21 2

Neftali Feliz TEX 11 11.1 -0.07 32 6

Kenley Jansen LAD 3 2.7 0.71 5 1

Frank Francisco TOR 7 7.5 -0.36 17 4

John Axford MIL 4 4.8 -1.73 46 2

Jon Rauch TOR 7 7.9 -0.53 11 5

Joe Nathan MIN 3 5.9 -2.67** 14 3

Drew Storen WSN 11 10.0 0.85 43 5

Table A-4.2 Wald-Wolfowitz Runs Test: 2011 Results

Note: This table reports the results of the Wald-Wolfowitz runs tests for closers in the sample from the 2011 season. Significance symbols are as follows: * p < 0.05, ** p < 0.01, *** p < 0.001.