an empirical analysis of biases in cigarette...
TRANSCRIPT
An Empirical Analysis of Biases in Cigarette Addiction
Matthew R. Levy∗
This version: June 10, 2009
Abstract
I embed rational choice along with two types of psychological biases into a model describing theconsumption of an addictive good, and empirically investigate the extent to which these biases help explainobserved smoking patterns. Rational agents correctly maximize their lifetime utility, whereas “present-biased” agents have a taste for immediate gratification, and “projection-biased” agents underestimate howmuch their future tastes are affected by current behavior. Consumers may have any degree, including zero,of both biases. I use MSA-level variation in cigarette price shocks over a 26-year span to examine individualsmoking responses to current and predicted future prices, and to price volatility. I find that while bothyoung smokers and mature smokers respond to the current price (with a 1% price increase lowering theprobability of smoking by between 1% and 2%), only mature smokers respond to the future price. Thisyouthful myopia is consistent with either type of biased consumers but not with fully rational ones. Theratio of the future-price response to current-price response among mature response yields an estimate ofthe net discount rate. Decomposing prices into a permanent and a transitory component, I use maturesmokers’ responses to separate out a short-run discount factor β ∈ [0.70, 0.83] and long-run discountfactor δ ≈ 0.9. Because projection-biased agents do not fully appreciate the effect of current consumptionon future behavior, they under-appreciate the “option value” of remaining unaddicted in an uncertainprice environment. With an additional assumption on the utility function, I use the under-response toprice volatility to estimate a degree of projection bias α ∈ [0.41, 0.49], meaning that people underestimatechanges in tastes from smoking behavior by 40%. Overall, I find that a combination of present biasand projection bias explains significant deviations from rational cigarette consumption. Manipulating theprice path to take full advantage of these biases could increase the cost in present value that a consumerpays for a lifetime smoking habit by a factor of 3. I find that relatively large taxes—between $8 and$11—would correct for the overconsumption caused by the two psychological biases.
∗University of California, Berkeley. Email: [email protected]. I thank my advisors Matthew Rabin, StefanoDellaVigna, and Botond Koszegi for research guidance. I am also grateful to Ulrike Malmendier and Shachar Kariv for additionaladvice. I thank participants at the UC Berkeley Psychology and Economics Seminar, UC Berkeley Labor Lunch, UC BerkeleyPsychology and Economics Lunch, and Michigan State University Theory Brownbag Seminar for their helpful comments.
1 Introduction
Many studies have looked at the response of cigarette consumption to changes in prices and taxes.1 In general,
these studies find substantial price and tax elasticities of cigarette consumption across different income levels,
different ages, and different countries. There is also a growing collection of theoretical models describing
how agents treat addictive goods,2 which differ in both their behavioral and their welfare implications.3
However, there is little work attempting to link the two literatures and empirically identify which, if any,
model applies to cigarette addiction.4 This paper derives testable implications from a unified model of
addiction that allows the extent of two biases to freely vary, and then estimates the degree of each bias.
I present a simple model of addiction based on O’Donoghue and Rabin (2000)’s adaptation of Becker
and Murphy (1988), where infinitely-lived agents choose in every period consumption of a numeraire and an
addictive good. Addiction is captured by two key features: a negative effect of current consumption on the
utility of future selves, and a “reinforcement” effect whereby current consumption increases the temptation
to consume in future periods. I then allow for consumers to have varying degrees of two biases, with fully
rational consumers—who maximize their true lifetime utility function—embedded as a special case.5 This
case corresponds to the “rational addiction” described in Becker and Murphy (1988), where addicts correctly
assess their maximization problem and consume only when addiction is utility-maximizing.6
A growing literature—summarized in DellaVigna (2009)—finds that many people display a taste for
immediate gratification. I allow agents to exhibit this “present bias” according to the quasi-hyperbolic
discounting “β−δ” model of Laibson (1997) and O’Donoghue and Rabin (1999a). The degree of present bias
is indexed by the parameter β, which is an additional discount factor applied to any future utility in addition
to the usual exponential discount factor δ. Setting β = 1 therefore reduces to exponential discounting.
Consumers with β < 1 will be more impatient in choices involving both present and future utility—only
future utility receives extra discounting—while not overly impatient in choices involving only future utility
tradeoffs. I focus on “naive” present-biased agents, who are unaware of their self-control problem.7 Their1Becker, Grossman and Murphy (1994), Chaloupka and Warner (2000), Gruber (2001), Gruber and Koszegi (2001),
Chaloupka, Jha, de Beyer and Heller (2005), Tauras, Chaloupka, Farrelly, Giovino, Wakefield, Johnston, O’Malley, Kloskaand Pechacek (2005), Adda and Cornaglia (2006b)
2Becker and Murphy (1988), O’Donoghue and Rabin (1999b), Gruber and Koszegi (2001), Loewenstein, O’Donoghue andRabin (2003), Berheim and Rangel (2004), Gul and Pesendorfer (2007)
3Gruber and Koszegi (2000) discuss optimal taxes for present-biased consumers; Gruber and Mullainathan (2005) find thattaxes can improve happiness among those otherwise likely to have been smokers.
4The exception is Gruber and Koszegi (2001), which proposes a test of present bias but does not implement it. Becker,Grossman and Murphy (1994) test forward-looking behavior against the extreme of pure myopia, but this does not distinguishbetween the three types presented in this paper.
5By embedding the rational case within the model, I am not presupposing the existence of non-rational consumers.6Later rational-addiction models, such as Orphanides and Zervos (1995), allow initial uncertainty to produce both ex post
inefficient addictions and ex post inefficient abstentions. Once addicted, however, steady-state behavior will mimic that inBecker and Murphy (1988).
7In contrast, “sophisticated” present-biased agents are aware of their self-control problem and will seek out commitment
1
taste for immediate gratification causes them to under-weigh the negative effect that consuming the addictive
good will have on their future selves, while their naivete causes them to underestimate the extent to which
their future selves will be tempted to consume the addictive good. These two forces can lead to costly
overconsumption, as well as mistaken beliefs about future consumption paths. In particular, it can lead to
procrastination in which an addict always hits today and always plans to start refraining tomorrow. Being
patient in the long-run, such an addict will want to quit smoking. Her short-run impatience, however, causes
her to prefer quitting tomorrow over quitting today, even in a stationary environment. Because she does not
realize that she will also want to delay when tomorrow arrives, she will procrastinate forever on quitting.
Finally, consumers may under-appreciate the extent to which changes to their “state”—in this context,
addicted or unaddicted—will affect their utility. The “simple projection bias” of Loewenstein, O’Donoghue
and Rabin (2003) allows consumers to maximize a weighted average of their utility conditional on their
true future state, and their utility conditional on their state remaining constant. The extent of consumers’
projection bias is indexed by the parameter α ∈ [0, 1], with α = 0 embedded as the fully rational case.8
Because they do not currently crave the addictive good, unaddicted consumers underestimate how much
current consumption will make their future selves crave it. Currently addicted consumers, conversely, fail to
appreciate the extent to which refraining will decrease their future cravings. This can lead both addicts and
non-addicts to overconsume the addictive good. It can also cause unexpected revisions to the consumption
plan if a person intends to consume only for a short duration but, finding herself unexpectedly addicted and
her marginal utility unexpectedly high, then continues to consume in all periods.
In order to yield testable predictions, I place three restrictions on the instantaneous utility function.
First, it is quasi-linear in consumption of a numeraire and an addictive good. Second, utility from the
addictive good can be further separated into two components: one that depends on the agent’s level of
addiction, and one that depends on the period. Finally, I assume that consumers’ lives can be divided into
their “youth” and their “maturity”. The period-dependent component of utility is constant within each
phase, and normalized to be zero during smokers’ adulthood.9 The non-stationarity introduced by this
last restriction is intended to help the rational model explain actual smoking patterns. Setting the value
during maturity to the youth value reduces to the stationary model often used elsewhere. These assumptions
narrow the range of plans that consumers may form, enabling the estimation of behavioral parameters from
devices to restrain their future selves. Although I include them in the model in Section 2, I later rule them out generically usingthe restriction that sophisticates, like fully rational consumers, must have rational expectations.
8That is, if a consumer’s current state is sc but her actual future state will be given by sf , she believes her future utility willbe U(x) = (1− α)U(x|sf ) + αU(x|sc).
9I assume that teenagers are still in their youth, while smokers over 30 or 35 are in their adulthood. The distributionof maturity dates is investigated in Section 5.2.3. This assumption is a simplified version of the “youthful preferences” inO’Donoghue and Rabin (2000), which allow for a more gradual decay over time in the instantaneous incentive to consume theaddictive good.
2
responses to different types of price shocks.
Section 2 develops the model and tests that distinguish between biased and unbiased agents based on
their consumption responses to anticipated changes in cigarette prices. In doing so, I am assuming that
smokers hold rational expectations of the future cigarette price but do not have perfect foresight: the
realized future price will be higher than an individual’s expectation about as often as it is lower. To avoid
strong assumptions on smokers’ ability to predict prices, Gruber and Koszegi (2000) use high-frequency
smoking data from the National Vitality Statistics Natality Data in order to identify tax changes that
have been legislated but not yet implemented, and still find a future-price effect. In order to use the more
representative data from the NHIS, however, I can only look at tax and price changes at an annual frequency.
I instead use current-period information—taxes and state budget tightness—as instruments for future prices
in order to address consumers’ imperfect foresight and avoid attenuation bias from their noisy predictions.10
While I assume that consumers make use of the available information, this approach does not require the
seeming clairvoyance assumed by some previous research, such as Becker, Grossman and Murphy (1994),
that have included future prices directly into the regressions.11
Because current consumption affects the marginal utility from smoking in the future, consumers who
expect to smoke in the next period will react immediately to an anticipated future price change. Alterna-
tively, if an unaddicted consumer knows a price increase will cause her future self to not smoke, she may
choose not to start smoking so as to avoid withdrawal in the future. Because present-biased agents heavily
discount the costs to future selves, they will under-respond to anticipated future price changes. The ratio of
the response to a change in the expected next-period price to the response to a change in the current price
will be exactly the net discount factor βδ.12 A different function of β and δ can be estimated by comparing
reactions to transitory and permanent price changes. Because the response to a permanent shock is the
discounted response to a series of transitory shocks, the ratio of the responses will yield 1 + βδ/(1 − δ).
These estimates can be combined to separate out each discount factor.
Next, I consider the effect of changes in the variance of the cigarette price distribution on smokers with
different degrees of biases. Rational smokers will react to volatility because remaining unaddicted can be
viewed as a “real option”. While an addict will be stuck hitting for high realized prices, a non-addict can
always start hitting if prices are low. For example, if there is uncertainty about whether a large excise10The use of budget tightness as an instrument is similar in spirit to the enacted-but-not-implemented taxes used by Gruber
and Koszegi (2000), as budget tightness proves to be a predictor of the enactment of new taxes.11In Section 5.3, I explore a departure from rational expectations that, while it can explain some features of the data, cannot
account for the full set of results explained by the model of psychological biases.12This is also a feature explored in Becker, Grossman and Murphy (1994). In that more restrictive model, however, this ratio
yields the single discount factor δ rather than the net discount factor βδ. The authors note that “the estimates are not fullyconsistent with rational addiction, because the point estimates of the discount factor ... are implausibly low.” This paper willalso estimate a low value for this ratio, but is able to interpret this as β < 1 rather than implausible long-run impatience.
3
tax will be applied in the next period, there is value in remaining un-addicted until the uncertainty is
resolved. Because projection-biased agents do not appreciate how much current consumption affects their
future cravings, they will tend to underestimate the option value. As a result, they will under-respond to
changes in volatility relative to a rational or purely present-biased agent.
The data are presented in Section 3. To estimate consumers’ responses, I use repeated cross-sectional
data on self-reported smoking status and number of cigarettes from the National Health Interview Survey
for a large sample over a 26-year span. The individual-level data in the NHIS provides better identification
than the state aggregate sales data used in Becker, Grossman and Murphy (1994), and is more representative
than the National Vital Statistics Natality Data used in Gruber and Koszegi (2001). It also allows me to
estimate responses separately for youth smokers and for mature smokers, as well as control for demographic
characteristics.
To estimate the prices facing smokers, I form metropolitan statistical area (MSA)-level price indexes.
These reflect an average of national, state, and local taxes faced within the MSA. In order to deal with the po-
tential endogeneity of both prices and future taxes with respect to current smoking rates, I use current taxes
and states’ budgetary tightness as an instrument for prices. Because most states have a balanced-budget
requirement, budget shortfalls necessitate revenue increases. “Sin taxes” may be a politically expedient way
to close a budget gap.13 Budget tightness proves to be a predictor of tobacco tax increases, and therefore
of price, in the following year. I also decompose prices into permanent components and transitory shocks
using taxes and agricultural variation in Kentucky burley tobacco—an important constituent of domestic
cigarettes.14 MSA fixed effects and a time trend are included in the decomposition.
I then test the above predictions empirically in Section 4. I find that smokers differ from the rational
model in ways that suggest both present bias and projection bias. Young smokers react to the current price
of cigarettes but have almost no response to future price changes. This contrasts with the finding for mature
smokers, who respond both to current and anticipated future prices. A 1% increase in the current price of
cigarettes lowers the probability of smoking between 1% and 2%, with the effect monotonically decreasing in
age. Examining the responses to permanent versus temporary price changes for mature smokers, I estimate
β = 0.7 and δ = 0.89, suggesting that present bias is a significant factor in smoking behaviors. Finally,
there is strong evidence that price volatility per se increases the likelihood of smoking. A 1% increase in the
standard deviation of price increases the probability of smoking by 0.02 percentage points—approximately
one-tenth the effect of a direct 1% decrease in the current price level. One would expect a negative, or at least13Gul and Pesendorfer (2007), for example, discuss a $0.61 increase in the Connecticut cigarette tax, which was “expected to
be used primarily to reduce the budget deficit.”14Poor growing conditions in a given year will increase the price temporarily in a way that is uncorrelated either with demand
characteristics or with future prices.
4
much smaller, response from agents without projection bias. Further restrictions are needed, however, to
estimate the α this implies. In Section 5, I linearize the addictive component of utility and make additional
distributional assumptions about prices. With these additional assumptions, I estimate a degree of projection
bias α ∈ [0.41, 0.49], meaning that people underestimate the changes in tastes caused by smoking behavior
by approximately 40%.
I conclude in Section 6 by considering the welfare effects of the estimated degrees of bias. Whereas fully
rational agents without credit constraints should care only about the lifetime costs of smoking, I find that
manipulating the price path can “trick” a consumer into accepting an expensive lifelong addiction. Because
of her impatience in early periods, a consumer will be tempted to smoke if prices start low and then increase
sharply later—much in the way Shui and Ausubel (2004) find that consumers are drawn to credit cards with
low “teaser rates”. Moreover, projection bias in later periods causes mature smokers to underappreciate the
possibility of quitting. I estimate that a person with the estimated degrees of bias could be a lifetime smoker
at a cost 2–3 times as great in present value as another price path that she would refuse. I also estimate
that a corrective tax of between $8 and $11 would be needed for mature smokers to consume optimally. This
tax does not include the effects of secondhand smoke or the effect of premature deaths on the government’s
budget constraint, but focuses only on the intervention necessary to correct the over-consumption due to
the psychological biases.
2 Model
2.1 General Setup
Following O’Donoghue and Rabin (2000)’s simplification of Becker and Murphy (1988), consider a setting
with infinitely-lived agents who consume two goods in every period: an addictive good a ∈ {0, 1} and a
numeraire y ∈ <+. Consuming a = 1 is considered “hitting”, while consuming a = 0 is “refraining” from
the addictive good.15 The utility derived from hitting or refraining depends on the accumulated “stock of
addiction” k ∈ <+, with the initial taste for the addictive good drawn from some distribution κ(·).
I let the addictive stock evolve according to:
kt+1 =
γkt + 1, if at = 1
γkt, if at = 0(1)
15Although smokers can indeed choose their nicotine consumption from a continuum of possible levels, formulating the problemas a binary choice captures the intuition that many smokers choose between smoking at a set level (e.g. a pack a day, half a packa day, etc.) and quitting. Gruber and Koszegi (2001) derive a model for present-biased agents that can choose a continuousquantity of the addictive good, and obtain qualitatively similar results.
5
The parameter γ ∈ [0, 1] thus captures how quickly the addictive stock evolves. For example, γ = 0
implies that a consumer is either completely addicted or completely unaddicted as a result of her behavior
in the previous period. That is, consuming just once puts her in a fully addicted state, but she can become
unaddicted again by refraining for just one period. Larger values of γ capture a slower addiction process,
where it takes time both to become addicted and to go through withdrawal.16 The value of γ will influence
the magnitudes of how consumers react any particular shock to the price of the addictive good, but it will
not affect the ratio of how they react to different shocks. It will also determine the maximum steady-state
level of addiction: kmax = 11−γ . For generality, I do not assume a specific value of γ.
The key features of the evolution of the addictive stock are that past consumption smoothly increases
the current addiction level, past abstention decreases it, and that the effect is stronger for more recent past
consumption. Alternative formulations that preserve these features will not substantively alter the model.
Conversely, functional forms like kt = max{aτ |τ < t} that eliminate the possibility of becoming un-addicted
would change many of the predictions of the model. Although one could think of many alternative forms,
a geometric process has been used in the economics literature since Becker and Murphy (1988) and is used
here for simplicity.17
To abstract away from the wealth effects from choosing to consume the addictive good, consumers’
instantaneous utility function is quasilinear in their consumption of the the numeraire yt and the addictive
good at:
ut(at, yt|kt) =
yt + ft(kt), if at = 1
yt + gt(kt) if at = 0(2)
The functions ft(·) and gt(·) are continuously differentiable and, because this is a harmful good, Iassume:18
f ′t(k) < 0 ∀ tg′t(k) < 0 ∀ t (3)
The negative effect of past consumption thus comes indirectly through its effect on the current addictive16Using just one parameter γ to describe the process embeds a degree of symmetry in the processes of becoming addicted and
becoming unaddicted. This assumption is not important for the results in this paper, and can be relaxed by assigning separateparameters to the cases where a=1 and where a=0. The assumption that the addictive stock decays geometrically, however, isstandard in the literature.
17Becker and Murphy (1988) assume the stock of addictive capital, S(t), evolves according to S(t) = c(t) − δS(t) − h[D(t)],where c(t) is consumption and D(t) is expenditures on endogenous depreciation. The model in this paper is identical if h(·) = 0and (1− δ) = γ.
18A negative first derivative stands in contrast to what can be considered positive addictive goods, such as exercise. Suchgoods would be underconsumed by the biased types in this model, rather than over-consumed. The set of consumption plansallowed under positive addictive goods, however, is inconsistent with the patterns observed in the data. More directly, themedical consensus is that the delayed health consequences of smoking are indeed negative (DHHS, 1982).
6
stock. Increased past consumption raises kt, and therefore lowers current utility regardless of whether one
hits or refrains today.
The second effect of past consumption on present utility comes through reinforcement—the more one
has consumed in the past, the more tempting it is to hit today. Let the instantaneous temptation to hit be
given by ht(k) = ft(k)− gt(k). Reinforcement requires that:
h′t(k) = f ′t(k)− g′t(k) > 0 ∀ t (4)
The exact functional form of ft(·) and gt(·) will, along with γ, determine the magnitude of consumer
responses to price shocks. For now, I assume that f(·) and g(·) are each weakly convex.19 In Section 5,
following O’Donoghue and Rabin (2000), I linearize these addiction functions.
Finally, assume that the price of the addictive good in period t is pt and let the consumer’s disposable
income in period t be Yt.20 Instantaneous utility can then be re-written only as a function of the addictive
stock and the decision of whether to hit or refrain:21
ut(at | kt) =
[Yt − pt] + ft(kt), if at = 1
Yt + gt(kt), if at = 0(5)
The model assumes that all agents have perfect information about the health risks of smoking. It is
possible that if there is significant heterogeneity in beliefs about the costs of smoking, differences in beliefs
may lead to sorting: those with relatively positive beliefs about cigarettes choose to smoke and those with
relatively negative beliefs refrain. It is unclear to what extent information drives smoking behaviors within
the United States: while Cutler and Glaeser (2008) attribute almost half the difference in smoking rates
between the US and Europe to differences in beliefs about the health effects of smoking, Khwaja et al.
(2008) find that mature smokers in the US are either correct or too pessimistic about their health risks.
While heterogeneity in beliefs may change the mean smoking rate and the absolute magnitude of price
responses, it will not affect the ratios used for identification in this paper. A time trend is included in all
regressions to control for any changes in information about smoking over time (or other secular changes in
the attractiveness of smoking).
The preferences described above are general preferences over an addictive good, and say nothing about19This will understate the harm from smoking if f(·) and g(·) are concave, and will bias the results towards the rational case.20Assuming that spending on the numeraire is large relative to spending on the addictive good, one may interpret Yt as the
portion of permanent income allocated to period t.21I assume that all agents know the functions ft(·) and gt(·). That is, smokers are aware of the heath and welfare consequences
of smoking. Previous studies suggest that this may be a reasonable approximation for most people (Viscusi 1990, Schoenbaum1997, Khwaja, Silverman, Sloan and Wang 2008).
7
a consumer’s degree of rationality. The psychological biases introduced in the following section do not
change the underlying instantaneous utility function, but rather represent systematic departures in how
consumers maximize this utility. While the list is not exhaustive in the set of explanations it considers, I
focus on projection bias and present bias as important examples. For example, I omit among others the
“cue-triggered decision processes” of Berheim and Rangel (2004) and the preferences over choice sets of Gul
and Pesendorfer (2007).22
2.1.1 Rational Consumers
As Becker and Murphy (1988) show, there is no reason why a fully rational agent could never consume an
addictive good. If the benefits are large enough, or the future costs sufficiently small, it may be utility-
maximizing to hit. In this model, they will tend to do so whenever ht(·) is large or their initial addictive
stock k0 is large. Let the discount rate be given by δ. Formally, the fully rational consumer forms a period-t
consumption plan at = (at, at+1, at+2, ...) in order to:
maxat
Ut(kt) =∞∑τ=t
δτ−tut(aτ |kτ ) (6)
s.t. kτ+1 = γkτ + aτ
Yτ − pτaτ ≥ 0
That is, fully rational consumers form a consumption plan that maximizes their true utility subject to
their budget constraint and the actual addictive process. Because fully rational consumers with perfect
information about the addiction functions only consume the addictive good when the upfront benefits out-
weigh the future costs, they will be responsive to any anticipated future price shocks that affect the cost of
future consumption.23
2.1.2 Projection-Biased Consumers
Projection-biased consumers do not appreciate the degree to which their “state” in the future will differ
from their current “state”. For example, Read and Van Leeuwen (1998) show that people who are currently
hungry act as though their future selves will also be relatively hungry, and people who are currently sated22While present bias and projection bias make firm predictions about how consumers react to different types of price shocks,
other models are more ambiguous. Cue-triggered addiction or utility over choice sets do make predictions about other facets ofbehavior, but they will not be well-identified using the approach of this paper.
23This forward-looking behavior has been used to test the rational model against the alternative model of complete myopia(Becker and Murphy (1988), Becker, Grossman and Murphy (1991)). However, both types of biased consumers presented hereinwill also display forward-looking behavior in many circumstances. The tests in Section 4 are designed to distinguish betweenthese alternative models.
8
behave as though their future selves will also be relatively sated. Similar effects have been shown for the
“endowment effect” (Loewenstein and Adler 1995), sexual arousal (Ariely and Loewenstein 2005), weather
and catalog orders (Conlin, O’Donoghue and Vogelsang 2007), adaptation to changing life circumstances
(Gilbert, Pinel, Wilson, Blumberg and Wheatley 1998, Loewenstein and Frederick 1997), and, indeed, drug
addiction (Giordano et al. 2002).
Projection bias may be very important in the consumption of additive goods if people do not appreciate
the degree to which they will become addicted, or conversely if addicts do not appreciate that they can
become un-addicted. In either case, the effect of projection bias will be to increase consumption relative
to the perfectly rational case. It may also introduce some time-inconsistencies between her plans and her
behavior, if future selves find themselves experiencing an unexpectedly strong addiction.
Following Loewenstein, O’Donoghue and Rabin (2003), a projection-biased agent believes her future
utility will be a weighted sum of her actual future utility and her utility if she remained in her current state.
The weight she places on the incorrect term, α, indexes the degree of the bias.24 She then forms a period-t
consumption plan at in order to maximize her (incorrect) assessment of her lifetime utility:
maxat
Ut(kt) = u(at|kt) +∞∑
τ=t+1
δτ−t[(1− α)u(aτ |kτ ) + αu(aτ |kt)
](7)
s.t. kτ+1 = γkτ + aτ
Yτ − pτ aτ ≥ 0
The projection-biased consumer then consumes the first component of her plan in the current period. In
the next period, however, she re-optimizes according to (7) and may form a very different plan.
This formulation of projection bias is different from other forms of ignorance about the addictive process.
A projection-biased person has the correct model of f(k) and g(k), just as a sated person knows what it
feels like to be hungry or an unaroused person knows what it feels like to be sexually aroused. The mistake
is strictly about the evolution of their state k—they believe it will not differ as much from their current state
as it truly will. One key difference between this model of projection bias and simply overestimating f ′(k),
for example, is that projection bias leads both addicts and non-addicts to over-consume. An addict who hit
due to an early mistake about f ′(k), but who did not suffer projection bias, would not make mistakes in her
consumption going forward.
24Alternatively, one could imagine projection-biased people making a mistake directly about kτ of the form kτ = (1−α)kτ+αkt,and then believing their future utility to be u(aτ |kτ ). If f(·) and g(·) are linear, this approach is identical to the one used inthis paper.
9
2.1.3 Present-Biased Consumers
Consumers with present bias are impatient—they value consumption in the present at a premium to any
delayed consumption. This taste for immediate gratification has been identified in a range of contexts, from
long-term savings behaviors by Angeletos, Laibson, Repetto, Tobacman and Weinberg (2001) to caloric
intake by Shapiro (2005). An overview of the literature is provided in DellaVigna (2009). Following the
formulation of Laibson (1997), the degree of present bias is captured by an extra discount factor β < 1 applied
uniformly to all future periods. As a consequence of this short-run impatience, present bias generates time-
inconsistency. For example, because she is patient about long-run decisions, a consumer in period 1 may
prefer having her period-2 self refrain from smoking. When period 2 arrives, however, she is now trading off
long-run costs against instantaneous benefits and may be tempted to hit. O’Donoghue and Rabin (2000) and
Gruber and Koszegi (2001) show that present bias can lead biased consumers to greatly overconsume the
addictive good, at a substantial welfare loss. Present-biased consumers can be divided into two categories
based on whether they are aware that their future selves will also be present-biased: naifs are unaware of
this fact, while sophisticates are aware of it.
A naive present-biased agent has β < 1, but does not recognize that her future selves will also be
impatient—following O’Donoghue and Rabin (1999a), she believes that β = 1 for all future selves. The
problem for a naive present-biased agent is then to form a consumption plan at in order to maximize her
current self’s lifetime discounted utility, since she does not realize her time-inconsistency. That is, because
she believes her future selves will not feel impatient in the way she does, she believes they will behave in
a manner consistent with the plans she forms today. She therefore performs the simple optimization given
here, and forms the plan at. Her maximization problem is:
maxat
Ut(kt) = ut(at|kt) + β
∞∑τ=t+1
δτ−tu(aτ |kτ ) (8)
s.t. kτ+1 = γkτ + aτ
Yτ − pτ aτ ≥ 0
Having formed at, the naive present-biased consumer does consume the first component, at, in the
current period. In the next period, however, she forms a new plan according to (8) which may differ from
her plan at time t. For example, she may expect to hit today when she is impatient but refrain in all future
periods when she thinks she will be patient. In the next period, however, she finds that β < 1, and hits
again. This naivete about present bias may therefore lead a consumer always to think she is going to quit,
but procrastinate forever in doing so.
10
Sophisticates are aware of their taste for immediate gratification—and that future selves will also have
such a taste—and therefore must play a subgame-perfect Nash equilibrium with their future selves. Their
maximization problem is therefore the same as that of a naive present-biased consumer, but with the
additional constraint that future consumption in each period τ must be optimal from self τ ’s point of view.
The sophisticate forms a plan at = (at, at+1, ...) and chooses at in order to maximize:
maxat
Ut(at|kt) = ut(at|kt) + β∞∑
τ=t+1
δτ−tu(aτ |kτ ) (9)
s.t. kτ+1 = γkτ + aτ
Yτ − pτ aτ ≥ 0
aτ = 1 ⇐⇒ hτ (kτ ) ≥ βδ[Uτ+1(aτ+1|γkτ )− Uτ+1(aτ+1|γkτ+1 + 1)
]∀τ > t
where aτ = (aτ , aτ+1, ...) ∀τ ≥ t.
The solution to this problem is sensitive to the parameters of the model. On the one hand, if sophisticates
realize that their future selves will eventually start consuming regardless of their current behavior (as they
would with any finite horizon), this sense of “inevitability” makes them more likely to begin their habit
today. On the other hand, if they believe that they can overcome their future selves’ impatience by refraining
today, there is a strong “incentive effect” to do so. In general, one cannot state which effect will dominate.
The apparent myopia of young smokers, combined with the requirement that sophisticates have rational
expectations, will ultimately rule out sophisticates except for the knife-edge case where this incentive effect
exactly counterbalances the direct effect of a price increase on average
One can form some comparative statics for sophisticates—for example, O’Donoghue and Rabin (2000)
show that the addition of “youthful preferences” of the sort in Section 2.2 can cause a sophisticate to switch
from always refraining to always hitting only when it would also cause a naif to make the same switch.
Unfortunately, this highlights the fact that sophisticates in this model will not be separately identified
from fully-rational and naive present-biased agents. In some instances (e.g. when they believe they are
consuming only for a short duration), they will behave like fully rational consumers. In others (e.g. when
they believe they will consume always), they will behave like their naive counterparts. Because without
additional parametric assumptions it is impossible to predict when they will behave like each type, I focus
only on rational consumers, or naive consumers experiencing projection- and present-bias.25
25The difficulties in making firm predictions for sophisticated present-biased agents are two-fold. First, there may be multipleequilibria in the infinite-period model described here. In general, the substantive predictions will be similar across the multipleequilibria. For example, in models of procrastination such as those in O’Donoghue and Rabin (1999a), sophisticated agents maydelay doing a task for a brief number of periods but never for very long. Moreover, the equilibria of this game will be genericallyunique if one moves from an infinite horizon to a finite time horizon.
11
2.2 Consumption Plans Under “Youthful Preferences”
Without further structure on ut(at | kt), the model does not yield a closed-form solution for consumers’
plans. Moreover, any observed behavior could be reconciled with any type of consumer.26 In order to derive
testable predictions, then, we make two assumptions:
Assumption 1 (Separability) ∃ x = {x1, x2, ...}, f(k), g(k) s.t.
ft(k) = xt + f(k) ∀tgt(k) = g(k) ∀t
This assumption implies that a consumer’s utility is determined partly by a time-independent component
which depends only on her addictive state, and partly by a purely instantaneous incentive to hit.27 The non-
addictive component is normalized to zero for the case where she refrains. The time-independent component
is meant to largely capture the biological and psychological process of addiction.28 xt captures any external
incentives to hit (e.g. peer pressure) as well as any other changes in the enjoyment of the addictive good
across ages.
This assumption implies that the instantaneous incentive to hit at age t can be re-written as ht(k) =
xt + h(k) or alternatively that ∂2ht∂k∂t = 0. That is, while consumers at different points in the life-cycle may
certainly find hitting more or less tempting, the degree to which this temptation grows for more addicted
consumers does not depend on age.
Assumption 2 (Nondecaying Youthful Preferences) ∃ m, x s.t.
x1 = x2 = ... = xm = x ≥ 0; xm+1 = xm+2 = ... = 0
The second difficulty for sophisticated agents is that their comparative statics are much harder to predict. In general, theywill be “well-behaved” in the manner described later for other types of agents. However, they may sometimes behave incounterintuitive ways. Consider, for example, the following example. Agents live for 3 periods, and may “hit” or “refrain” inperiods 1 and 2, and hits if indifferent. Everyone refrains in period 3. Utility is defined as above, such that ft(k) = xt − ρkand gt(k) = −σk. Let p1 = p2 = 0, x1 = 1.2, x2 = 0, γ = 0, δ = 1, β = 1
2, ρ = 1, σ = 2 and consider the effects of applying
a tax τ = 1 in period 2. A naive consumer will hit in the first period with or without the tax—she does not expect to smokein period 2 in either case. Without the tax, the sophisticate believes she will smoke in period 2 only if she smokes in period 1.She therefore refrains without the tax in order to give her future self an incentive to refrain as well. With the tax, however,she realizes that her period-2 self will refrain no matter what, and therefore chooses to smoke in period 1. Increasing the futureprice can therefore actually raise smoking for sophisticates, if it acts as a control device for their future selves.
26Consider any plan a. It can be rationalized easily by setting ht(·) = +∞ in any period in which at = 1 and ht(·) = −∞ inany period in which the consumer abstains. Less extreme changes in ht(·) could also generate the same pattern, while keepingconsumers’ price sensitivity intact.
27A similar separability assumption is made in Becker and Murphy (1988), O’Donoghue and Rabin (2000), and Gruber andKoszegi (2001).
28Although there is some evidence that the effects of nicotine withdrawal may differ over time, there is no conclusive evidencefor how these changes occur or the magnitudes of the effects. For example, Wilmouth and Spear (1998) find that adolescentrats suffer more from cognitive disruption effects during withdrawal and adult rats suffer more from anxiety-like symptoms. Itis unclear how this could be translated into utility measures.
12
The youthful-preferences assumption is meant to capture the social pressures, physiological differences,
and other reasons why it may be quite tempting for a teen to experiment with a first cigarette, but that an
unaddicted 40-year-old would have no desire to do so.29 This version of the youthful preferences assumption
is the simplest way to capture that the addictive good is initially very tempting due to external influences,
but is not tempting later.30
If one alternatively found that x < 0, one would have a good that is not very tempting initially but
that older consumers enjoy (e.g. scotch). If this were the case, however, one would expect to see very few
consumers starting a consumption habit when young, and then initiating once reaching maturity. Neither an
intuitive assessment of cigarettes nor the empirical hazard rates of initiation (shown in Figure 3) correspond
to this story. Consumers may of course still differ on their values for m and x. Indeed it is this heterogeneity,
along with differences in the initial addictive state k0, that allows the model to account for differences in
starting and quitting behavior across individuals.
The general maximization problem for a consumer with discount rate δ, degree of present bias β, degree
of projection bias α, and period-t addiction level kt is then:
maxat
Ut(kt) = u(at|kt) + β∞∑
τ=t+1
δτ−t[(1− α)u(aτ |kτ ) + αu(aτ |kt)
](10)
s.t. kτ+1 = γkτ + aτ
Yτ − pτ aτ ≥ 0
Lemma 1 establishes that in the current period, all three types of consumers follow a cutoff strategy.
That is, given the price path, they will consume if and only if their current addictive state is above a certain
threshold.
Lemma 1 Given x, m, and p, and let a strategy a(k, t) denote consumption in any period t as a function
of the addiction level k:
1. There is a unique strategy A(k, t) that is optimal from self-t’s perspective
2. The optimal strategy follows a cutoff rule: A(k, t) = 1 if and only if k > kt
3. k1 ≤ k2 ≤ ... ≤ km ≤ km+1 = km+2 = ..., and x > 0⇒ km < km+1.29Alternatively, one could interpret xt as representing “divorce, unemployment, death of a loved one, and other stressful
events” that Becker and Murphy (1988) believe “stimulate the demand for addictive goods”.30The “youthful preferences” discussed in O’Donoghue and Rabin (2000) simply require xt ≥ xt+1 if t < m, and xt = xt+1 if
t ≥ m. Some structure on {xt}∞t=1 is necessary for identification. A more relaxed assumption than Assumption 2 is to restrictthe magnitude of |xt − xt+1|, t ≤ m − 1. Using this in place of nondecaying youthful preferences will not change the overallpattern of predictions, but will expand the number of cases in Lemma 2, as consumers may heterogeneously plan to consumeuntil some intermediate date s ≤ m.
13
Because of the pattern the addiction threshold follows, rational or projection-biased consumer may only
expect to consume never, to consume always, or to consume only until maturity. With the addition of
present-bias, people may form any of these plans or may plan only on yielding to temptation today and
refraining in the future. Consider a consumer after date m, when the problem becomes stationary. Whereas
rational consumers realize that consuming today will increase their temptation to consume tomorrow and
projection-biased consumers believe that tomorrow’s problem will be identical to today’s (and therefore
must believe that if they find it optimal to hit today, they will also find it optimal to hit tomorrow), a naive
present-biased consumer may believe that the slight increase in her temptation to hit will be undone by the
disappearance of β starting tomorrow. She may thus believe that today is the last time she will hit, and
that she will virtuously refrain starting tomorrow.
For all of these plans, a positive shock to the price of the addictive good in a period in which the consumer
expects to hit will lower the utility of consumption, and hence the consumer will only consume at higher
levels of addiction. The magnitude of this change in the cutoff level of addiction depends on the specific
functional form of the utility function. However, the ratios of the responses to price changes at different
times depend only on the discount rate. Lemma 2 establishes the relationships between the reactions of
consumers to different sorts of potential price shocks.
Lemma 2 Suppose pt = p for all t, m > 1, and k ∈ (0, kmax).
1. If a consumer plans to consume today and in all future periods, then:
dk1/dpτdk1/dp1
= βδτ−1 anddk1/dp
dk1/dp1=
1− (1− β)δ1− δ
2. If a consumer plans to consume today and until date m but refrain thereafter, then:
dk1/dpτdk1/dp1
=
0 τ > m
βδτ−1 τ ≤ mand
dk1/dp
dk1/dp1= 1 +
βδ(1− δm−1)1− δ
3. If a consumer plans to consume today and refrain thereafter, then:
dk1/dpτdk1/dp1
= 0 anddk1/dp
dk1/dp1= 1
Consider a marginal smoker with β = 1, so if she believes she will hit in all periods she will discount a
shock to the price in period t+1 relative to a shock in period t by a factor of δ. Because a permanent price
shock is equivalent to a temporary price shock in all periods, her response is just the sum of the responses
14
to a series of instantaneous shocks: 1/(1− δ). If she believes she will only hit until maturity, then she will
respond in exactly the same way to shocks in periods in which she expects to be hitting, and ignore shocks
in periods in which she does not expect to be hitting.
A naive present-biased agent reacts very similarly to the rational agent in most cases, except that her
response to all shocks in the future is discounted by an additional factor of β. Part (3) of Lemma 2 further
allows that she may expect to hit only in the current period. In that case, she ignores all future price shocks
and reacts equally to a temporary and to a permanent shock.
A projection-biased agent forms the same set of plans as the rational agent. The key difference, however,
is that projection-biased people may not follow through on their plans later. That is, a projection-biased
agent may believe she will only smoke until maturity and therefore ignore any price shocks after date m, even
though she may also then wind up continuing to smoke forever. Because a rational agent’s plans coincide
with her actual consumption, she can only ignore those prices which truly will not affect her. Thus two
consumers whose observed consumption history is identical may nonetheless differ in their response to price
shocks.
Moreover, because consumption plans may depend on whether a consumer is mature—that is, whether
t > m—Lemma 2 has different implications for youth smokers and adult smokers. It is therefore possible to
distinguish between fully rational young smokers and biased ones using their future-price responses:
Proposition 1 Let π(p|X) be the probability of being a smoker as a function of the price path and a vector
of individual characteristics.
1. If young smokers are fully rational and expect to smoke in the next period, then ∂π/∂pt+1
∂π/∂pt= δ > 0
2. If young smokers are (naive) present-biased or projection-biased, then ∂π/∂pt+1
∂π/∂pt∈ {0, δ, βδ}
That is, forward-looking behavior is consistent with either rational or biased agents, while myopic be-
havior is only consistent with biased agents.31
Using self reports not only of current smoking status but also of the age of smoking initiation and
cessation, it is possible to estimate hazard rates of smoking. Although this data is not available in all years,
Figure 3 plots the results from the 1987 NHIS. The overall quitting hazard rate is very low—below 2%
through the first 20 years of a consumption spell. In order for young, fully-rational smokers to expect on
average that they will only smoke during the current year, the quitting hazard in the first several years of
a smoking spell would necessarily be much greater. Because rational agents must have correct beliefs on
average, this is taken as evidence that rational teen smokers must expect to smoke in the future as well.31Provided that δ > 0. There is some evidence that teenagers do have a finite discount rate.
15
Older smokers, because they are past date m, are restricted in the consumption plans they can form.
In particular, both fully rational and projection-biased smokers may only believe they will consume forever
or consume never. Present-biased smokers may form either of these plans, or may additionally believe they
will consume only in the current period.32
Lemma 2 established how mature smokers will react to anticipated future price changes relative to current
price changes, and to permanent changes relative to temporary changes. These two predictions allow for β
and δ to be separately identified:
Proposition 2 Let π(p|X) be the probability of being a smoker, and let the price be the sum of a permanent
component and a transitory shock: pt = pPt + pTt . Define d = ∂πt/∂pt+1
∂πt/∂pt.
1. If mature smokers are not present biased, then d = δ and ∂πt/∂pPt∂πt/∂pTt
= 11−d
2. If mature smokers are present biased, then d = βδ, and ∂πt/∂pPt∂πt/∂pTt
= 1−(1−β)δ1−δ > 1
1−d
Because present-biased consumers will somewhat over-react to temporary shocks relative to permanent
ones, but greatly over-react to current shocks relative to delayed ones, Proposition 2 enables one to estimate
the degree of present bias.
2.3 Effects of Price Volatility
The preceding analysis assumes that consumers have a single prediction for the price in the future. When
consumers have a probability distribution over future prices, the main analysis is unchanged. Although
the overall attractiveness of smoking may decrease, the ratios of elasticities calculated do not change. The
following section analyzes how smokers will respond to different levels of volatility.
For the following analysis, consider only a two-period model. This is necessary for tractability, as agents
can form consumption plans that are conditional not only on the current price, but on the entire history of
prices that has occurred at time t. Unlike in the world with single expectations of prices, consumers explicitly
considering uncertainty can rationally form plans other than “consume never” or “consume always” by setting
period-specific conditional cutoffs. Reducing the number of periods to two clearly avoids complicated plans,
and conveys the basic result.
Now let pt = pt+νt, with νt ∼ i.i.d. N(0, s2). That is, the current price in all periods varies independently
about a known mean. The assumption of a normal error term is not important, except inasmuch as it32Hammar and Carlsson (2005) find in a survey of mature smokers that only “5% of the respondents stated that it is very
likely that they will quit smoking.”
16
guarantees a nonzero support at all levels of addiction for the initiation decision.33 The following Lemma
establishes the effect of changing the price volatility s2 on different types of consumers.
Lemma 3 Let pt = pt + νt, with vt ∼ i.i.d. N(0, s2) and t ∈ {1, 2}. Let π(p|X, s2) denote the probability
the consumer hits in period 1. If the period-1 addiction level is fixed at k1, then:
1. ∃ d s.t. if βδ > d and α = 0, dπds2
< 0.
2. If the consumer is fully projection biased (α = 1), dπds2
> 0.
3. If the consumer is partially projection biased (α ∈ (0, 1)), d2πds2dα
> 0.
Lemma 3 states that for fully rational people, increasing the volatility of prices decreases the attrac-
tiveness of consuming the addictive good. They realize that their current choice will influence their future
selves’ incentives to hit or refrain. If tomorrow’s realized price turns out to be exceptionally low, they can hit
whether they are unaddicted or addicted. If the realized price turns out to be very high (and the addiction is
strong enough), the consumer will hit only if they are in an addicted state. Because of this, refraining today
in a volatile price environment has a real option value—the consumer can always choose to become addicted
later on. Because the value of this option is increasing in the volatility of the price, rational consumers will
lower their threshold price for smoking as the price volatility increases.
Present-biased people will also react by lowering their threshold price for consumption today, but to a
lesser degree. In particular, their response will be β times the response of a fully-rational consumer. As long
as β and δ are sufficiently high, they value the option enough that they lower their threshold sufficiently to
overcome the increased probability of extremely low prices being realized.34
Fully projection-biased people do not realize the effect that today’s decision has on tomorrow’s incentives.
As such, they do not adjust the cutoff price for hitting today. Increased price volatility increases the
probability that today’s price realization falls beneath their threshold, and that they choose to hit. In this
sense they are “fooled” into consuming—they believe they are just taking advantage of particularly low
prices and do not realize the effect they are having on their future self. Partial projection bias may look like
either type, with their reaction becoming monotonically more positive as α increases to 1.
33We make the technical assumption that p and s are such that dΦ(x2+h(γk1))
ds2< 0. If this assumption is not met, the
predictions for projection-biased agents are unchanged. The predictions for rational agents, however, become ambiguous.34If instead an agent were extremely impatient and had β = 0, they would not lower their price threshold at all and would
resemble the projection-biased agent.
17
3 Data
3.1 Smoking Outcomes
All outcomes on smoking behavior are drawn from the National Health Interview Survey (NHIS) between
1965 and 1991. The NHIS is a large, national, repeated cross-sectional survey which in 15 years during
this time included questions on smoking behaviors.35 The smoking questions were included either in the
core set of questions or in various supplements. Specific questions include current cigarette consumption
and past consumption, as well as broader questions such as “Have you ever smoked 100 cigarettes over your
lifetime?” Despite the potential for some under-reporting, validation studies have found interview questions
on smoking to be generally reliable (Patrick et al., 1994).
In addition to smoking outcomes, the NHIS provides an array of broad demographic, economic, and
health-related data. Education and other socioeconomic factors are known to correlate with smoking be-
haviors, and are used as controls in all regressions.36 Respondents in the survey are classified into 31 large
metropolitan statistical areas (as defined by the Bureau of the Census) or as other. Only the urban sample
is used.
Table 1 presents summary statistics for the full urban sample, and separately by smoking status. Smoking
status is defined by the current number of cigarettes consumed daily, with any positive number being
considered a smoker. There is no difference in the average age of smokers and nonsmokers, although the
higher standard deviation of nonsmokers’ age reflects some differences in composition of the groups: both
young people and the elderly are more likely to be nonsmokers. Nonsmokers are slightly more likely to be
female and slightly less likely to be black. Differences in education status also reflect the age compositions
of the two groups. Regressions of smoking status on the demographic controls are presented in Appendix
Table A-2, and confirm the pattern seen in the means: smoking correlates positively with age and non-white
race groups, and correlates negatively with age2, female, and high levels of education.
Among smokers in the sample, the average number of cigarettes smoked is 18.79—just under a pack a
day. This figure may not be directly comparable to more recent data, as the mean number of cigarettes per
smoker has fallen considerably in recent years. According to the CDC, the mean daily cigarettes per smoker
was just under 20 prior to 1995, but had fallen to 13.9 by 2006.37
351965-66, 1970, 1974, 1976-80, 1983, 1985, 1987-88, 1990-91.36For example, Townsend, Roderick and Cooper (1994) and Kenkel, Lillard and Mathios (2006) find that smoking behaviors
and cigarette price elasticities vary by age, education, and income groups. Arnold et al. (2001), however, find that even thoughthe perceived health risks of smoking are correlated with reading level among low-income women, actual smoking rates are not.
37(CDC, 2006. http://www.cdc.gov/tobacco/data_statistics/tables/adult/table_4.htm)
18
3.2 Price Measures
Price data on cigarettes are taken from the Tobacco Tax Council’s “The Tax Burden on Tobacco”, which
contains annual state-average cigarette prices and taxes. The prices and taxes are collected in November of
each year, and because the analysis in this paper relies on changes in prices rather than levels, no seasonal
adjustment is performed. Taxes are a statewide average of federal, state, and local excise taxes. Prices and
taxes are then translated into real terms using CPI data from the Bureau of Labor Statistics.
Figures 1 and 2 show real prices and nominal taxes over time, by Metropolitan Statistical Area. Changes
in nominal taxes are lumpy in general and, although there is substantial variation in the timing between
MSAs, tended not to occur during the 1970s. Because taxes are a large component of the retail price, this
creates a U-shaped pattern in the real price. During the 1970s, high inflation and few tax increases actually
caused the real price of cigarettes to fall. The real price during the 1980s and 1990s then rose largely due
to a combination of lower inflation and more tax increases.
Ideally, one would want to know the exact price faced by every individual in the NHIS sample. Even
knowing the average prices and taxes in the exact jurisdiction where an individual lives, however, would not
be sufficient for calculating the effective price they face. Not only may they purchase their cigarettes in a
nearby jurisdiction, the exact location at which they purchase (e.g. convenience store, gas station, etc.) and
the quantities they buy at one time (i.e. by carton or by pack) will generate substantial heterogeneity in
their effective price.38
An individualized price is neither reported in the NHIS nor calculable from any potential data. Instead, I
assign to each individual the average price and taxes for her MSA of residence. These averages are calculated
as follows. The state average price and tax level is first assigned to each county in a MSA. The MSA price
level is then the average (weighted by population) of all prices of its constituent counties. For a MSA located
entirely within one state, this simply returns the state average. For MSAs comprising more than one state,
it will be a mixture of these states. This reflects the fact that a smoker in Washington, DC may be affected
not only by DC prices but by those in Maryland and Virginia. It also makes my price measure more robust
to the short-distance smuggling discussed in Becker, Grossman and Murphy (1994) and Gruber and Koszegi
(2000).
It will be useful to decompose the price level into a permanent component and a transitory shock—that
is, to rewrite pt = pTt + pPt . To do this, I observe that tobacco is an agricultural product and hence is
dependent on the growing conditions of that particular year. Rainfall, the length of the growing season, and38It is also possible that teen smokers rely on friends or parents for their cigarettes, and therefore do not face any price at
all. Cummings, Sciandra, Pechacek, Orlandi and Lynn (1992), however, find that 88% of regular teen smokers usually purchasetheir own cigarettes.
19
other growing conditions affect the quality (as defined by the United States Department of Agriculture) of
the tobacco crop, and are not correlated across years. Moreover, the amount of land eligible for tobacco
farming is tightly regulated under two New Deal-era laws: the Agricultural Adjustment Act of 1933 and
the Agricultural Adjustment Act of 1938.39 As a summary statistic of a year’s growing conditions, I use the
spot market price of Kentucky Burley tobacco, as reported by the United States Department of Agriculture
Consumer and Marketing Service, Tobacco Division (1970–1990). Kentucky Burley tobacco proves to be a
predictor of the current year’s price, but not future prices (see Appendix Table A-3.).
4 Results
4.1 Young Smokers
To test Proposition 1, I estimate the probability of young respondents being a current smoker. In order
to control for potential differences across time and across metropolitan areas, the regressions include year
dummies and MSA-specific time trends.40 A vector of individual controls is also included, comprising age,
age2, race, and education.41 The estimation equation is given by
Pr(smoker = 1) =θ1 ∗ pricejt + θ2 ∗ pricej,(t+1) + θ3 ∗Xit + λj (11)
+ φj ∗ yeart + ηt + εijt
where the left-hand side variable is the probability of being a smoker for an individual i, in metropolitan
area j, in year t. Note that Proposition 1 implies a one-sided test, and does not allow for the future-price
elasticity to exceed the current-price elasticity.
The prices in an MSA, however, may be endogenous. In particular, they may depend on smoking
rates among youths. Although young smokers represent only a fraction of the entire market (and behave
quite differently from the majority of their co-consumers), they represent a very important sub-market from
the perspective of tobacco companies.42 Facing a drop-off of teen smoking, tobacco companies may lower
prices temporarily in order to attract new customers. State legislatures may also raise taxes (an important39Chaloupka and Warner (2000) provide an overview of the U.S. tobacco agriculture regulatory system.40Because prices are measured at the year-by-MSA level, it is not possible to separately include year-by-MSA dummies.41Much of the identification strategy of this paper relies on the fact that age is more than just a control. It is exactly the
separation of smokers into “young” and “mature” categories that allows me to distinguish the various possible biases. However,after including dummies for this purpose, I leave in age as a control for other possible unobservables. Regressions that excludeage and age2 as controls (available on request) yield virtually identical coefficients on regressors of interest.
42Because nicotine is addictive, it is in the interests of tobacco companies to get their customers “hooked” when they areyoung. According to the Department of Health and Human Services, 90% of all adult smokers began smoking when they wereteens. (Department of Health and Human Services 1994)
20
component of the variation in tobacco prices) in response to an increase in teen smoking, so future taxes
may also be endogenous.
To alleviate this problem, I instrument for prices with current taxes and the tightness of state budgets as
measured by the percentage gap between general fund expenditures and general fund revenues. According
to the General accounting office, 48 states have balanced-budget requirements.43 Whereas midyear budget
gaps are primarily closed through spending cuts and other “one-time actions”, gaps prior to the enactment
of a budget are largely closed through revenue increases. A budget shortfall can thus predict increases in
all taxes in the following year. Sin taxes in particular may be a politically expedient way of closing budget
gaps. Although tobacco taxes may be raised for other reasons, this instrument relies only on the variation
generated by budgetary necessity. 44
The results of this estimation are shown in Table 2.45 The main specification is in column (iv). This
specification confines the analysis to smokers under the age of 20. Future price is instrumented using current
taxes and the average percentage budget gap for the states comprising the MSA, with the results of this
first-stage presented in Panel C. The impression given is one of near-complete myopia. The coefficient on
ln(pricet) is highly significant, and highly negative at −1.363, suggesting that smokers in their later teens
are highly price-sensitive. Even more importantly, the coefficient on the next year’s price, ln(pricet+1), is
both small and insignificant. That is to say that although very young smokers react to current price, they
do not react to future prices. This is consistent with either the projection bias or present bias models, but
not with the rational model.46
Columns (iv) and (v) present something of a robustness check on this result by defining the cutoff for
being a “young” smoker at 25 and 30, respectively. Several clear patterns emerge. First, the magnitude of
current-price sensitivity is monotonically decreasing in the cutoff. More importantly, the coefficient on future
price is increasing in magnitude, and strongly significant when both groups of older smokers are included.
This suggests that the seeming myopia of teenage smokers is not permanent, and is consistent with smokers
moving to a regime where they expect to smoke forever as the level of their addiction deepens. The ratio of
the future-price response to the current-price response is therefore also increasing across the specifications.
The null hypothesis that this ratio equals zero is equivalent to saying that teens do not react to future prices.43(General Accounting Office 1993). The exceptions are Vermont and Wyoming.44By instrumenting the future price with current-period observables, I also alleviate the problem caused by the fact that
actual future price is a noisy measurement of expected future price. If expectations are on average correct, and the error in thefirst-stage is independent of the error in expectations and the second-stage error term, then this will solve the errors-in-variablesproblem.
45Standard errors are robust and clustered by MSA. Appendix Table A-4 shows several alternative specifications, which themain results appear robust to.
46While this is prima facie evidence of consumer biases, the range of parameters consistent with this result is not identified fromthis test alone. One therefore cannot yet suggest whether the necessary degree of projection bias or present bias is “plausible”.
21
This ratio is not significantly different from zero for teenage smokers, but is significantly greater than zero
for both older groups. While teen smokers’ myopia could be rational if the future price is fully unknown,
columns (iv) and (v) suggest that older smokers are in fact capable of predicting future prices. Additionally,
Table 4 confirms a non-zero future-price response among mature smokers.
Table 3 reports the results of a slightly different approach. Using the fact that smokers are not making
the simple binary choice of smoking or not smoking, but rather choosing a level of consumption as well,
I sort subjects into five categories: nonsmokers, up to 12 pack/day, up to 1 pack/day, up to 2 packs/day,
and over 2 packs/day. I then perform an ordered probit regression to assess the probability of being in
each category. The key results are the coefficients on ln(pricet+1) and the interaction of ln(pricet+1) with a
dummy for whether the subject is under 20 years old. Across all specifications, the coefficient on the former
is large, and significantly negative. That is, anticipated future price increases will push smokers into lower
categories. However, the interaction term is approximately the same magnitude and positive. That these
two coefficients sum to zero is not rejected at a p-value of 0.8690. In other words, young smokers —unlike
mature smokers—are not using information about future prices in their current consumption decisions.
At this point, there are multiple interpretations of this result. One could possibly conclude that the lack
of a future-price response indicates that cigarettes are not addictive. However, the non-trivial future-price
elasticity for mature smokers belies this interpretation.47 Within the model presented in Section 2, this
result is consistent with either α > 0 or β < 1, for appropriate parameters. That is, this test alone does
not distinguish between the two types of bias. However, it is a strong rejection of the joint hypothesis that
α = 0, β = 1—that is, full rationality—for young smokers.
One could also conclude that consumers in their teen years are mistaken about the process by which
prices evolve (e.g. they believe today’s price will be the price forever), but some time during their 20’s
become able to predict prices. It would not be enough that teens have a noisier estimate of future prices
that is still correct on average. Rather, it must be that even conditioning on the actual future price, teens
believe it will too much resemble the current price. It is important to address this (admittedly ex-post)
hypothesis, because it would have greatly different welfare implications. Unlike the cases of present bias
or projection bias—where the losses come through f(k) and g(k)—the welfare losses for an individual who
makes price mistakes are limited to the present value of their financial error. However, this hypothesis is
not supported by later results. In particular, if price misprediction were driving the results, then one would
expect to see a similar pattern of learning in the results for price volatility. In Section 4.3, no such pattern47Auld and Grootendorst (2004) warn that serial correlations in aggregate consumption data may spuriously return results
in favor of addiction even when there is none, such as for milk demand. The individual responses used above lessen thisconcern. Moreover, the just-identified two-stage least squares specifications presented in Appendix Table A-4, which Auld andGrootendorst suggest “performs well”, are qualitatively similar to the main specifications.
22
is found. Both groups respond to price volatility (although, as will be discussed, not as much as they ought
to), meaning that even youth smokers understand that prices are not immutable. Moreover, the response
is not significantly different for young and mature smokers, indicating that no learning is taking place on
this dimension. The pattern of learning and non-learning required for price misprediction to explain the
results seems implausible, and I am unaware of other research that supports it. Section 5.3 formalizes this
price extrapolation bias, and finds that while it can explain the myopia among teenagers, it is inconsistent
with the rest of the results. Combined with the fact that this reaction to the second moment of the price
distribution does not change as they mature, this story of teenage price misprediction is not consistent with
the overall pattern of results.
4.2 Mature smokers and impatience
To get an estimate of the net discount factor βδ, I estimate current-price and future-price responses on a
sample restricted to respondents aged 30 and above—an identifying assumption is that whenever date m
occurs, it is before the age of 30. Current taxes and state budget gaps are used to instrument for prices in
all specifications, which contain controls for age, age2, sex, race, and education. As before, year dummies
and MSA-specific time trends are also included. The estimation equation is given by:
Pr(smoker = 1)ijt =θ1 ∗ pricejt + θ2 ∗ pricej,(t+1) + θ3 ∗Xit (12)
+ λj + φj ∗ yeart + ηt + εijt
Table 4 presents the results of estimating (12). The dependent variable is an indicator for current smoker
status, and all specifications control for age, age2, race, and education, and include both year dummies and
MSA-specific time trends.48 Where indicated, taxes and budget tightness are used to instrument for prices,
as in Section 4.1.49
Column (iii) is the preferred specification, using participants over age 30 as the sample. Similar results
are obtained using 35 as the cutoff, suggesting that the “maturity” date does not fall within this window.
The coefficient on current price is -0.725, so that a 1% increase in the price decreases the smoking rate
by about 0.7%.50 That these are both smaller in magnitude than the current-price response for teenagers48Appendix Table A-4 shows several alternative specifications, which the main results appear robust to.49As discussed earlier, this instrument will also alleviate the errors-in-variables problem arising from the fact that actual
future price is a noisy measurement of expected future price. The increase in the magnitude of the coefficient on ln(pricet+1) inthe instrumented specifications suggests that measurement error is indeed attenuating the coefficients from OLS.
50The overall elasticity of smoking participation will be a mixture of the elasticity of initiation and the elasticity of cessation.DeCicca, Kenkel and Mathios (2008) use the National Education Longitudinal Survey to estimate a model that accounts forboth explicitly, and find the cessation elasticity to be significantly greater than the initiation elasticity for 26-year-olds. Becausethe NHIS does not have panel data, I cannot distinguish initiation from cessation. However, virtually all of the response for
23
estimated in Table 2 is not surprising. The coefficients on future price are also significantly negative, at
-0.531 and -0.496 for the two age cutoffs. While these estimates are not uniformly greater in magnitude than
those for teens, such a comparison conflates the generally smaller price elasticity of older smokers with their
relatively greater attention to future prices. The relevant comparison is the ratio of future-price response
to current-price response, which ranges from 0.630 to 0.778 for mature smokers. For mature smokers, this
ratio is interpreted as the net discount rate, βδ.51 In all specifications, it is significantly different from zero.
It is also significantly smaller than one in all but one specification.
I then estimate ∂πt/∂pP and ∂πt/∂p
T by regressing consumption on the permanent and transitory
components of price. As before, I include age, age2, sex, race, and education as individual controls and
include MSA dummies. Because annual variation determines pTt , I include either a trend or MSA-specific
trends instead of year dummies and cluster standard errors at the year level. The estimation equation is:
Pr(smoker = 1)ijt =θ1 ∗ pricePjt + θ2 ∗ priceTj,t + θ3 ∗Xit + λj (13)
+ φj ∗ yeart + ηt + εijt
where price is first decomposed into permanent and transitory components by estimating:
pjt =θ4 ∗ taxesjt + θ5 ∗KY Burleyt + θ6 ∗ yeart + λ2j (14)
The ratio of reactions to a permanent and to a transitory price shock is found by estimating (13).
The decomposition of price into pP and pT is shown in Panel B of Table 5. All coefficients are highly
significant, and R2 values of 0.85 suggest that the greater part of price variation is accounted for. Taxes,
MSA fixed effects, and a time trend are used to generate the permanent component of cigarette price, while
the fluctuation in the tobacco price are used to generate the transitory portion. The burley tobacco price
proves not to predict even the next year’s increase in price. Moreover, regressing the estimated (pTt+3 − pTt )
on (pTt+1 − pTt ) yields an insignificant coefficient (0.53 with a standard error of 0.44). The same regression
for the “permanent” price yields a significantly positive coefficient (0.86 with a standard error of 0.19). This
suggests that there is indeed little predictive power in the transitory component, but a lasting effect of the
permanent component.52 I then include these estimates of pP and pT as regressors for an indicator of current
teens must be initiation, while the great majority for mature smokers is cessation. The pattern I find is therefore qualitativelysimilar, and the biases I estimate can help explain the difference in these elasticities.
51A similar ratio is estimated in Becker, Grossman and Murphy (1994)—and is found to be comparably low— but is interpretedas the single discount factor corresponding to δ.
52If the “permanent” shock is not truly permanent or the “temporary” shock not temporary, the estimates for δ will be biaseddownwards, and the estimates for β biased upwards (i.e. too little present bias)
24
cigarette consumption, as in (13).
In general, the effect of the transitory component of price is significant and smallThe effect of a permanent
price shock is much greater—by almost an order of magnitude. Reassuringly, the sum of the coefficients is
comparable to the earlier, non-decomposed price effects. The preferred specification is reported in column
(iv) and includes MSA fixed effects and MSA-specific time trends, and is run on all respondents over 35 years
old. A temporary price increase of 1% lowers the probability of smoking by 0.225%, while a 1% increase
in the permanent price is associated with a 1.41% decrease in the smoking probability. The ratio of these
effects will yield (1− (1− β)δ)/(1− δ), and is estimated at 6.267. That this is statistically different from 1
allows one to immediately reject the hypothesis that β = 1 or δ = 0.
Column (v) presents the results of performing this analysis on teenage rather than mature smokers, and
serves primarily as a robustness check on the model. Given the results in Table 2 and the restriction this
puts on teen smokers’ plans, one would expect teen smokers to react approximately equally to permanent
and temporary price effects. This is precisely the pattern that emerges. With a coefficient of -0.33, teens are
more responsive to temporary price shocks than mature smokers–consistent with their greater current-price
sensitivity in Table 2—while their elasticity with respect to permanent price shocks, -0.822, is considerably
lower than that of mature smokers. Although they do react more to permanent than to temporary shocks,
the ratio of these responses is only 2.47 and is not statistically different from 1. That is, teen smokers
continue to neglect future prices in their current smoking decisions.
Combining the estimates from Table 4 and Table 5, one can estimate both β and δ individually for
mature smokers. These estimates are shown in Table 7 for the instrumental variables specifications in Table
4 and Table 5. Using the preferred specifications from both, one obtains estimates for β of 0.701 and 0.828,
and for δ of 0.899 and 0.885. The estimates for β are significantly different from 1 (although the δ estimates
are less precise and not significantly different from 1).
4.3 Price volatility and initiation
The model presented in Section 2.3 predicts that price volatility has a differential impact on rational and
projection-biased agents. Whereas volatility decreases the attractiveness of consumption to a fully-rational
agent, projection-biased agents may be “fooled” into a lifelong habit when they try to take advantage of
particularly low price realizations.
I use prices in the first half of the sample to estimate price volatility by metropolitan area. I then
estimate the effect of this price volatility on smoking behaviors in the latter half of the sample. To control
for any correlation between σ2pj and pj , I include price directly as a control in all specifications. Demographic
25
controls again include age, age2, sex, race, and education. The estimating equation is a regression of the
form:
Pr(smoker = 1) =F(θ1 ∗ pricejt + θ2 ∗ pricej,(t+1) + θ3 ∗ std pricej + θ4 ∗Xit + φj ∗ yeart + εijt
)(15)
Where the left-hand side variable is one for current smokers and zero for current nonsmokers, prices are
log real prices, and std price is estimated as above. F (x) = Φ(x) for specifications using the probit model
and F (x) = x for those using the linear probability model.53
Table 6 shows the results of running the analysis described by (15). The first column shows the results of
the analysis for all consumers. The sign on Pt is highly significant and negative, as higher prices reduce the
benefits of smoking for any type of consumer. The magnitude suggests that a 1% increase in the price lowers
the probability of smoking rates by 0.36 percentage points, and is relatively stable across all specifications.
More interesting, however, is estimated the effect of price volatility per se on smoking rates. The regressor
in the first row is the natural logarithm of the standard deviation of price (by MSA) between 1965 and 1977.
In all specifications, the estimated coefficient is positive and approximately one-tenth the magnitude of the
direct effect of price changes. For example, column (i) states that increasing the standard deviation of
price realizations in an MSA by 10% increases the the probability of smoking by roughly 0.25 percentage
points. This lends strong support to the conclusion that a mistake in predicting their future utility can lull
consumers into a smoking habit. Columns (ii) and (iii) allow the volatility effect to vary between young and
mature smokers. In neither case is the coefficient significantly different for the different age groups. This
suggests that the effect is present for both mature and young smokers. However, without knowing a mature
smoker’s MSA of residence during her adolescence, it is difficult to separate the enduring effect on initiation
from a current effect on cessation.
One possible concern is whether the estimated coefficient is reasonable given any of the models presented
in Section 2. Is it too high? One possible check is to perform a back-of-the envelope calculation of what
one would expect this coefficient to be. Except for realizations extremely close to the mean, the assumption
that (Pt− P ) ∼ N (0, 0.087) fits the data well.54 A quick estimate of (P ∗− P )—the threshold price at which
a person chooses to smoke—can be formed for teenagers by using the actual smoking rate of approximately
30% during this period.
Under these assumptions, a person who did not alter their price threshold would be 0.06 percentage points
more likely to smoke in an environment where the prices were 1% more volatile. This purely mechanical53An alternative specification (available on request) instruments std price by the population of the MSA. The point estimates
are approximately unchanged, but are imprecisely estimated.54Where Pt is the real price and P is the best predictor using MSA fixed effects and a single time trend.
26
effect is considerably greater than the coefficients of between 0.02 and 0.03 estimated in Table 6. Any
extent to which a consumer lowered their price threshold in more volatile environments would lower this
estimate—meaning that this back-of-the-envelope calculation still leaves some room for consumers to alter
their threshold in response to volatility. On an intuitive level, these estimates suggest that young smokers
are not adjusting their thresholds “enough”. That is, they under-value the real option that derives from
remaining unaddicted. Lemma 3 establishes that if α = 0 and consumers are sufficiently patient, this
elasticity must be negative. Unfortunately, the range of βδ for which this is true depends on the functions
f(·) and g(·). Without further restrictions, one can reject the (somewhat extreme) joint hypothesis that
(α = 0, βδ = 1, h(·) > 0), but cannot confirm if the β found in Section 4.2 can explain this positive elasticity.
Rather than imposing the strength of addiction externally, the following section makes more parametric
assumptions about the model in order to calibrate, among other things, how much projection bias these
estimates imply.
5 Additional Parameter Estimates
The results from Section 4 require very few assumptions about the nature and strength of the addiction
process. While this is sufficient for many ends—for example, estimating β and δ—other results are not
identified. In particular, the degree of projection bias depends critically on how strong the addiction is.
In this section, I linearize f(·) and g(·) and make additional distributional assumptions about “maturity
dates”, price volatility, and initial addiction states k.
5.1 Derivations
Rather than use the flexible addiction functions f(·) and g(·) with only (3) and (4) as restrictions, I follow
O’Donoghue and Rabin (1999b) and assume f(kt) = xt − ρkt and g(kt) = −σkt. That is, smoking in the
current period increases the next period’s incentive to smoke by (σ − ρ).
Proposition 3 Let the initial addictive state be distributed according to κ(·), and the “maturity date” m be
distributed according to µ(·). If teen smokers expect to smoke only until maturity, response in the smoking
rate of cohort τ , πτ,t to an anticipated shock to the price beginning in period t+1 is then:
dπτ,tdpt+1
= −κ′(k; τ, t) ·(
dk
dpt+1
)µ(t− τ)
27
Where:dk
dpTt+1
=βδ[
1 + βδ(
(1− α)γ 1−(δγ)m−1
1−δγ + α1−δm−1
1−δ
)]· (σ − ρ)
· 1{m > t}
Suppose that all teens believe that they will smoke only until maturity and refrain thereafter, and that
they believe maturity occurs in the current period. In this case, dkdpTt
reduces to 1σ−ρ . The change in the
teen smoking rate given a change in the current price is then dπdpTt
= −κ′(k) · 1σ−ρ . Let k ∼ N(0, 1), so that
consuming in the first period raises the addiction level by one standard deviation. Using the actual teen
smoking rate of 30.52%, one can calculate κ′(k).
Proposition 4 Let prices be distributed according to pt ∼ N (p, s2), where p is the known MSA-specific
mean price. For a consumer with discount rate δ, degree of present bias β, and degree of projection bias
α, the change in probability of smoking given a change in the standard deviation of the price distribution is
given by:
dπ
ds=
1√2πs
e−
„p∗2
2s2
«·
(p√
2πs2e−
„p2
2s2
«)· (σ − ρ)βδ(1− α)
− p∗√2πs2
e−
„p∗2
2s2
«
Proposition 4 states that, provided with an estimate of (σ − ρ) and βδ, one can estimate the degree of
projection bias implied by under-reactions to price volatility.
5.2 Parameter Estimates
5.2.1 Addictiveness: σ − ρ
Using Proposition 3 with the additional assumption that all teens expect to smoke only until maturity,
one can re-interpret the estimates from Table 2 in order to recover (σ − ρ). Using the under-20 sample
from columns (i) and (iv) of Table 2, one estimates, respectively, 19.544 and 16.833. Because of the quasi-
linearity of the utility function, these estimates can be interpreted as dollar amounts. That is, going from
an unaddicted to an addicted state increases the willingness to pay by between $16 and $20.
One could repeat the analysis for the under-25 and under-30 samples. Because these groups are on
average less price-sensitive, the estimates of (σ − ρ) will be greater than for the under-20 sample. However,
this procedure requires that the young smokers expect to smoke only in the current period. Because these
older samples include more consumers expecting a lifelong habit, the estimates of σ−ρ will be biased upwards.
For this same reason, the estimates in Table 7 should be considered upper bounds of the addictiveness slope.
28
5.2.2 Projection Bias: α
Given values for β, δ, and (σ − ρ), one can identify α from the response to price volatility. In the data,
(Pt− Pt) ∼ N(0, 0.087) (Kolmogorov-Smirnov p-value of 0.264). This implies that the mechanical effect of a
1% increase in the standard deviation of price is to increase smoking rates by 0.06 percentage points. To the
extent that the estimated responses are smaller than this, consumers are reacting to volatility by changing
their threshold price for smoking. However, the fact that they are not substantially smaller and, moreover,
non-negative suggests that consumers are not reacting as much as they should in the fully-rational case.
Proposition 4 derives the net response to a change in the volatility of the price distribution. The second
term is the purely mechanical response, and will be positive when consumers do not smoke on average.
The first term is the behavioral response, and depends on the value of the real option that is generated by
refraining. Using the estimated values of βδ and (ρ− σ), it is possible to determine how big the behavioral
response ought to be. The degree of under-reaction is attributed to α.
The first column of Table 7 shows values for α, using the estimate of (σ − ρ) = 16.883 and various
estimates of the net discount rate.55 The point estimates vary between 0.410 and 0.494, suggesting that
consumers under-appreciate how their addictive state will change by 40%–50%. In the linear case, this is
interpreted as consumers only expecting their instantaneous incentive to hit to increase by $8 to $10 instead
of the full $16, should they become addicted. Conversely, addicts under-appreciate their ability to become
un-addicted by approximately the same amount. The estimate for projection bias compares favorably with
the range of α ∈ [0.31, 0.50] found by Conlin et al. (2007). This lends support to the notion that projection
bias is a general psychological phenomenon, rather than limited to specific domains such as smoking.
5.2.3 Maturity: µ(·)
Lastly, one can estimate the distribution of “maturity” from the changes in the ratio of future-price elasticity
to current-price elasticity. In the model, maturity is simply defined at the age when the additional instan-
taneous incentive to hit, xt, becomes zero. According to to Proposition 3, one can estimate µ(t1)− µ(t0) by
the difference in dπ/dpt+1
dπ/dptat ages t0 and t1 divided by the net discount rate. Using column (v) of Table 4 to
estimate β · δ, Table 2 implies that between 27% and 39% of people cross this age between 20 and 25, and
another 7% to 27% cross it between 25 and 30. This conclusion matches with the earlier result that very
few people appear to be crossing the same threshold between the ages of 30 and 35.55Because this estimate depends on the net discount rate instead of β and δ individually, the estimates do not vary across the
columns of Table 7. Using higher estimates of σ − ρ increases the values slightly, but never above 0.6.
29
5.3 Price-Extrapolation
I have interpreted the myopia towards future prices exhibited by teenagers as evidence that teens ignore
predictable future price changes because they do not believe they will be smoking in the future. Similarly,
I have interpreted mature smokers’ impatience as evidence of present bias. This requires no differential
ability to predict future prices between young and mature smokers, and is consistent with the fact that teens
are equally responsive to the second moment of the price distribution as mature smokers.56 It is possible,
however, that consumers do not have rational expectations over prices.57 While there are any number of ways
that expectations could differ from rationality, one theory that would seem to be a candidate for explaining
teen myopia would be “price extrapolation”. Under this error, consumers believe too much that future
prices will be similar to current prices, even conditional on all available information. This is a substantial
departure from rational expectations and, while it is a post-hoc hypothesis, it initially seems to provide an
alternative explanation for the results. One simple way to capture this would be to let a period-t consumer’s
expectation of the period-τ price, pt,τ , be given by:
pt,τ = (1− ω)Et [pτ ] + ωpt,τ−1 + ετ for all τ > t (16)
pt = pt
where Et[pτ ] is the correct expectation of the price in period τ conditional on period-t information, pt is the
current price, and εt is a normally-distributed error term. The degree of price-extrapolation error is indexed
by the parameter ω ∈ [0, 1], with ω = 0 reducing to the rational expectations assumed elsewhere in this
paper. Under this form of error, coefficients from regressions of smoking status or cigarette consumption on
current and future prices do not correspond to smokers’ responses to their price expectations. In particular,
the coefficient on current price will load some of the future-price response because it is a component of the
expectation of future price. An instrumental variables regression will solve the measurement error problem
caused by εt+1, but will not correct for the extrapolation error.58
Can such a mistake about prices explain the results found in Section 4 without the presence of present
bias or projection bias? Suppose that all smokers have β = 1, α = 0, that teens and mature smokers have
ω = ωt and ω = ωm, respectively, and that the addiction functions are linear. Then Lemma 3 impliesdπ/dpPtdπ/dpTt
= 1−ωδ1−δ , and dπ/dpt+1
dπ/dpt= (1−ω)δ. One can then use the same regressions as in Section 4 to estimate δ
56It is also consistent with a theory that the error term for teens’ expectations is larger than that for mature smokers.57While consumers certainly do not have perfect foresight of future prices, the approach taken here has been to assume that
they hold expectations that, while noisy, are correct on average. The actual future price is an estimate of consumers’ expectationof the future price measured with error, and can be overcome with an instrumental variables approach.
58That is, if one estimates Equation (11) and β = 1 and α = 0, then θ2θ1
= (1− ω)δ < δ if ω > 0.
30
and ωm. Using the baseline results from Column (iii) of Table 4 and Column (iv) of Table 5, one estimates
δ = 0.90, ωm = 0.435. For youth smokers, using Column (i) of Table 2 and Column (v) of Table 5, one
estimates ωt = 0.96—that is, teens must be almost fully extrapolating the price.
However, the different values of ω required to explain the different future-price responses also make very
different predictions about how consumers will respond to volatility. In particular, values close to 1 imply
a zero response to price volatility that is contradicted by the teen response in Section 4.3. Moreover, this
interpretation of the model also makes a strong prediction on the magnitudes of price responses. If one
believes that the only thing changing over time is ω, then one can ask what the price responses of one group
imply about the ω of the other. Following Lemma 3, the magnitude of the current-price response will depend
on an expression involving the discount rate, parameters of addiction, and degree of price-extrapolation error.
If teens in fact expect to smoke in future periods, then the denominator in this expression will be the same
as for mature smokers. The difference in price responses is attributed to different values of ω. Given the
estimates of ωm and δ outlined above, one can therefore calculate the degree of price extrapolation consistent
with the observed current-price elasticity for teen smokers. Far from the complete price extrapolation their
myopia implies, this method finds that a value of ωt = 0.68 is allowed by the current-price response in
Column (iv) of Table 2, and ωt = 0.63 for the response in Column (i). Intuitively, the change in the way
teens and mature smokers react to current prices is not drastic enough to allow for a drastic change in the
way they extrapolate prices. However, this lower value of ωt implies a future-price response of -0.65, which is
outside the 95% confidence interval of the actual estimate. Price-extrapolation error, while it may contribute
along with the psychological biases estimated in this paper, therefore cannot on its own explain the pattern
of price responses..
6 Conclusion
The overall conclusions to be drawn from this paper are three-fold. First, the patterns of cigarette con-
sumption differ greatly from those predicted by the fully rational model. Second, the model of projection
bias explains several different features of the data. Finally, evidence is found in favor of present bias as an
additional contributing factor. I strongly reject the fully-rational case and, with additional assumptions, I
estimate a degree of present bias β between 0.7 and 0.8, and a degree of projection bias α between 0.4 and
0.5.
The parameter estimates obtained in this paper generally agree with those obtained elsewhere. The
degree of impatience, β ∈ [0.7, 0.8], matches estimates identified in very different contexts. For example,
Laibson, Repetto and Tobacman (2007) estimate an annual β = 0.7, δ = 0.96 in a structural model of
31
consumption and decisions. Using data on unemployment spells and accepted wages, Paserman (2008) finds
some heterogeneity among income groups but still a comparable range of parameter estimates. A short-run
discount factor β ∈ {0.40, 0.48} is estimated for low-wage earners, while high-wage earners have β = 0.89.
Both groups have an annual long-run discount factor δ = 0.99. Using a large-scale experiment to measure
consumer switching in credit cards, Shui and Ausubel (2004) estimate β = 0.8.59
Although there is a growing literature that identifies the presence of projection-bias, there has been less
work attempting to estimate the parameter α. In general, such estimates require structural assumptions on
utility. Conlin, O’Donoghue and Vogelsang (2007) use catalog orders and returns of cold-weather clothing
to test for projection bias with respect to the current weather. If consumers experience projection bias, they
may over-order cold-weather clothing on cold days, because they expect their future selves will be cold as
well. Returns are examined to control for potential confounds, such as attention. After making suitable
structural assumptions, they estimate a degree of projection bias α ∈ [0.31, 0.50]—strikingly similar to the
range this paper estimates for projection bias with respect to cigarette addiction.
One could attempt to quantify the size of the mistake that consumers are making in several ways. For
a rational consumer who is not credit-constrained, it is sufficient to know the present discounted cost of
smoking to know if she will smoke. Biased consumers may be sensitive to the details of timing of prices.
For example, Shui and Ausubel (2004) find that consumers are too sensitive to credit card “teaser rates”
relative to the long-term interest rate—and in ways that end up being quite costly. For the analogue in
cigarette prices, consider an unaddicted teenager with the degrees of bias and preferences estimated in this
paper, who is deciding whether to smoke or not. By setting a very low price initially and raising the price
in her adulthood—essentially a “teaser rate” for cigarettes—one can design a price path that will cause her
to smoke forever at a very high present-value cost.60 Alternatively, by setting a high initial price and then
lowering it considerably in her adulthood, one has a price path that will cause her to never smoke, even
though the present-value cost of a lifelong habit is relatively low. Define the “overpayment ratio” as the
ratio of the most expensive path that causes her to smoke to the least expensive path that causes her not
to smoke. Using the consumer’s own discount rate to calculate present value, this ratio should be exactly
one for rational consumers.61
59If short-run impatience differs between smokers and non-smokers, these estimates of β would not be directly comparable tothe estimates in this paper obtained from smokers. That the results are similar suggests that this bias may be quite general.Indeed, using a specially-designed survey, Khwaja, Silverman and Sloan (2007) conclude that there is “no evidence that short-runand long-run rates of discount differ by smoking status.”
60Although it is unrealistic to charge different age groups different prices for cigarettes, this thought experiment may beapproximated by using brands that appeal differentially to teens and mature smokers.
61The ratio is bounded below by 1 unless the attractiveness of smoking is increasing in the cigarette price. While this wouldnever occur for standard preferences, it can sometimes occur for partially sophisticated present-biased agents for whom higherprices act as a failed commitment device on future selves.
32
Panel A of Figure 4 shows the overpayment ratio as a function of the parameter γ, which controls the
evolution of the addictive stock. For a wide range of values, this teenager can be “tricked” into paying
between 2 and 3.5 times as much for a smoking habit as she would have had to pay under a different
price pattern that she would have turned down.62 Panel B shows the ratio of the most expensive habit a
biased smoker would accept to the most expensive habit a fully rational smoker would accept. Although
the ratio is somewhat lower in Panel B, an agent with the biases estimated in this paper will still pay
between 1.5 and 3 times too much. This suggests that most of the result is coming from people smoking
when they rationally should not—teens impatiently starting, and mature smokers not quitting because they
do not realize that withdrawal will be temporary. In addition to quantifying the consumer mistake, such a
calculation is valuable to tobacco companies, which may design incentives to attract inexperienced smokers
who unwittingly continue on to a lifelong habit.
Identifying present bias and projection bias as driving factors in addiction also suggests directions for
effective policy interventions, including tax policy.63 While there are many additional concerns that will
affect the optimal cigarette tax, one can ask simply what lump-sum redistributed tax would maximize
smokers’ own welfare.64 In the Becker and Murphy (1988) model, such a tax would clearly be zero, as
any cigarette consumption is already chosen optimally.65 When consumers are biased, however, there is
substantial room for welfare-enhancing corrective taxes. Although such taxes will clearly not be Pareto
efficient—some people may be smoking optimally—one can consider what tax would cause a biased consumer
to hit only in circumstances when an unbiased consumer would have hit without such a tax. The derivation
of this tax is presented in the Appendix, and because it depends on the addiction parameter γ, a range of
values is presented in Appendix Table A-1. In general, a tax which corrects for both sources of bias will be
quite large—on the order of $8-$11 per pack in current dollars. Using parameters of β = 0.9 and δ = 1,
Gruber and Koszegi (2000) estimate that for a sophisticated present-biased agent, the optimal tax is between
$1.50 and $2.50 per pack. Plugging in the same value of β into the framework used by this paper yields
a similar value for the optimal tax of between $1.10 and $3.70 per pack. Lowering β rapidly increases the
optimal tax, however. The estimate of β = 0.8, δ = 0.9 yields significantly higher results: between $2.20 and
$5.30 per pack. Including the effect of projection bias raises the corrective tax even further. Adding α = 0.4,62For comparison, the mean smoker in 2006 spent roughly $1080 annually on cigarettes (CDC, http://www.cdc.gov/tobacco/
data_statistics/)63See, for example, Wertenbroch (1998) and O’Donoghue and Rabin (2003).64In addition to the usual Ramsey rule for the overall tax, many studies include a Pigouvian component that reflects the
negative externality of second-hand smoke and the positive externality that smokers’ early deaths impose on government healthcare and retirement programs. The calculations in this paper are separate from these concerns.
65Gul and Pesendorfer (2007) also find an optimal tax of zero in a model where smoking may not be desirable, becausetaxation does not remove smoking from the choice set. Their model does, however, predict that a ban on smoking may bewelfare-enhancing.
33
the tax is between $8.60 and $11.40 per pack. By comparison, the highest current combined state-local tax
on cigarettes is $4.25 in New York City.
Taken together, the results in this paper provide support for both present bias and projection bias as
explanations for many smoking behaviors. When first starting a smoking habit, inexperienced smokers do
not appreciate the degree to which they will become addicted to nicotine. Conversely, experienced smokers
fail to fully appreciate how refraining from smoking will eventually make them un-addicted. Moreover, all
smokers are too willing to sacrifice future well-being for the instantaneous benefits of smoking. Addiction
offers a setting where projection bias and present bias interact to greatly alter consumer behavior. Further
research may address whether this interaction of biases has a large effect in other contexts. Additionally,
although consumer mistakes about prices cannot explain the pattern of results in this paper, it remains an
open question how important such mistakes may be to both behavior and welfare.
References
Adda, Jerome and Francesca Cornaglia, “The Effects of Taxes and Bans on Passive Smoking,” CemmapWorking Papers, October 2006.
and , “Taxes, Cigarette Consumption, and Smoking Intensity,” American Economic Review, September2006, 96 (4), 1013–1028.
Angeletos, George-Marios, David Laibson, Andrea Repetto, Jeremy Tobacman, and StephenWeinberg, “The Hyperbolic Consumption Model: Calibration, Simulation, and Empirical Evaluation,” TheJournal of Economic Perspectives, Summer 2001, 15 (3), 47–68.
Ariely, Dan and George Loewenstein, “The Heat of the Moment: the Effect of Sexual Arousal on SexualDecision Making,” Journal of Behavioral Decision Making, July 2005, 19 (2), 87–98.
Arnold, CL, TC Davis, HJ Berkel, RH Jackson, I Nandy, and S London, “Smoking status, reading level,and knowledge of tobacco effects among low-income pregnant women,” Preventive Medicine, April 2001, 32 (4),313–320.
Auld, M. Christopher and Paul Grootendorst, “An empirical analysis of milk addiction,” Journal of HealthEconomics, November 2004, 23 (6), 1117–1133.
Becker, Gary S. and Kevin M. Murphy, “A Theory of Rational Addiction,” Journal of Political Economy,1988, 96, 675–700., Michael Grossman, and Kevin M. Murphy, “Rational Addiction and the Effect of Price on
Consumption,” The American Economic Review, 1991, 81 (2), 237–241., , and , “An Emprical Analysis of Cigarette Addiction,” The American Economic Review, 1994, 84 (3),
396–418.Berheim, B. Douglas and Antonio Rangel, “Addiction and Cue-Triggered Decision Processes,” The American
Economic Review, December 2004, 94 (5), 1558–1590.Cameron, A. Colin, Jonah B. Gelbach, and Douglas L. Miller, “Robust Inference with Multi-Way
Clustering,” NBER Working Paper T0327, September 2006., , and , “Bootstrap-Based Improvements for Inference with Clustered Errors,” The Review of Economics
and Statistics, August 2008, 90 (3), 414–427.Chaloupka, Frank J and Kenneth Warner, “The Economics of Smoking,” in A.J. Culyer and J.P. Newhouse,
eds., Handbook of Health Economics, Volume I, Elsevier, 2000, pp. 1539–1627., Prabhat Jha, Joy de Beyer, and Peter Heller, “The Economics of Tobacco Control,” Briefing Notes in
Economics, January 2005, (63).Conlin, Michael, Ted O’Donoghue, and Timothy Vogelsang, “Projection Bias in Catalog Orders,” The
American Economic Review, September 2007, 97 (4), 1217–1249.Cummings, K. Michael, Eva Sciandra, Terry F. Pechacek, Mario Orlandi, and William R. Lynn,
34
“Where teenagers get their cigarettes: a survey of the purchasing habits of 13-16 year olds in 12 US communities,”Tobacco Control, December 1992, 1 (4), 264–267.
Cutler, David M. and Edward L. Glaeser, “Why Do Europeans Smoke More Than Americans?,” in D. Wise,ed., Developments in the Economics of Aging, Chicago: University of Chicago Press, 2008.
DeCicca, Philip, Don Kenkel, and Alan Mathios, “Cigarette taxes and the transition from youth to adultsmoking: Smoking initiation, cessation, and participation,” Journal of Health Economics, 2008, 27, 904–917.
DellaVigna, Stefano, “Psychology and Economics: Evidence from the Field,” Journal of Economic Literature,June 2009, 47, 315–372.
Department of Health and Human Services, The Health Consequences of Smoking: A Report of the SurgeonGeneral, Washington, DC: US Government Printing Office, 1982., Preventing Tobacco Use Among Young People: A Report of the Surgeon General, Washington, DC: US
Government Printing Office, 1994.General Accounting Office, “Balanced Budget Requirements,” March 1993.Gilbert, Daniel, Elizabeth Pinel, Timothy Wilson, Stephen Blumberg, and Thalia Wheatley, “Immune
Neglect: A Source of Durability Bias in Affective Forecasting,” Journal of Personality and Social Psychology,September 1998, 75 (3), 617–638.
Giordano, Louis, Warren Bickel, George Loewenstein, Eric Jacobs, Lisa Marsch, and Gary Badger,“Mild Opioid Deprivation Increases the Degree that Opiod-Dependent Outpatients Discount Delayed Heroin andMoney,” Psychopharmacology, September 2002, 163 (2), 174–182.
Gruber, Jonathan, “Youth Smoking in the 1990’s: Why Did it Rise and What Are the Long-Run Implications?,”The American Economic Review, May 2001, 91 (2), 85–90.
and Botond Koszegi, “Is Addiction ‘Rational?’ Theory and Evidence,” NBER Working Paper 7507, 2000.and , “Is Addiction ‘Rational?’ Theory and Evidence,” Quarterly Journal of Economics, 2001, 116 (4),
1261–1303.and Jonathan Zinman, “Youth Smoking in the US: Evidence and Implications,” in Jonathan Gruber, ed.,
Risky Behavior Among Youth: An Economic Analysis, Chicago: University of Chicago Press, 2001, pp. 69–120.and Sendhil Mullainathan, “Do Cigarette Taxes Make Smokers Happier?,” Advances in Economic Analysis
and Policy, 2005, 5 (1).Gul, Faruk and Wolfgang Pesendorfer, “Harmful Addiction,” Review of Economic Studies, 2007, 74 (1),
147–172.Hammar, Henrik and Fredrik Carlsson, “Smokers’ expectations to quit smoking,” Health Economics, 2005, 14
(3), 257–267.Kenkel, Donald, Dean Lillard, and Alan Mathios, “The Roles of High School Completion and GED Receipt in
Smoking and Obesity,” Journal of Labor Economics, 2006, 24 (3), 635–660.Khwaja, Ahmed, Dan Silverman, Frank Sloan, and Yang Wang, “Are Smokers Misinformed? Evidence on
Subjective Beliefs about Mortality and Health,” August 2008. Mimeo., Daniel Silverman, and Frank Sloan, “Time Preference, Time Discounting, and Smoking Decisions,”
Journal of Health Economics, 2007, 26 (5), 927–949.Laibson, David, “Golden Eggs and Hyperbolic Discounting,” The Quarterly Journal of Economics, 1997, 112 (2),
443., Andrea Repetto, and Jeremy Tobacman, “Estimating Discount Functions with Consumption Choices over
the Lifecycle,” August 2007. Mimeo.Loewenstein, George and Daniel Adler, “A Bias in the Prediction of Tastes,” Economic Journal, 1995, 105,
929–937.and Shane Frederick, “Predicting Reactions to Environmental Change,” in M Bazerman, D Messick,
A Tenbrunsel, and K Wade-Benzoni, eds., Psychological Perspectives on the Environment, Russell SageFoundation Press, 1997., Ted O’Donoghue, and Matthew Rabin, “Projection Bias in Predicting Future Utility,” Quarterly Journal
of Economics, November 2003, 53 (2), 151–169.O’Donoghue, Ted and Matthew Rabin, “Doing it Now or Later,” The American Economic Review, March 1999,
89 (1), 103–124.and , “Addiction and Self Control,” in Jon Elster, ed., Addiction: Entries and Exits, Russel Sage
Foundation, 1999.and , “Addiction and Present-Biased Preferences,” October 2000.and , “Studying Optimal Paternalism, Illustrated by a Model of Sin Taxes,” The American Economic
Review, May 2003, 93 (2), 186–191.
35
Oehlert, Gary W., “A Note on the Delta Method,” The American Statistician, February 1992, 46 (1), 27–29.Orphanides, Athanasios and David Zervos, “Rational Addiction with Learning and Regret,” The Journal of
Political Economy, August 1995, 103 (4), 739–758.Paserman, M. Daniele, “Job Search and Hyperbolic Discounting: Structural Estimation and Policy Implications,”
Economic Journal, August 2008, 118 (531), 1418–1452.Patrick, Donald L., Allen Cheadle, Diane C. Thompson, Paula Diehr, Thomas Koepsell, and Susan
Kinne, “The Validity of Self-Reported Smoking: A Review and Meta-Analysis,” American Journal of PublicHealth, July 1994, 84 (7), 1086–1093.
Read, Daniel and Barbara Van Leeuwen, “Predicting Hunger: The Effects of Appetite and Delay on Choice,”Organizational Behavior and Human Decision Processes, November 1998, 76 (2), 189–205.
Schoenbaum, Michael, “Do Smokers Understand the Mortality Effects of Smoking? Evidence from the Health andRetirement Survey,” American Journal of Public Health, May 1997, 87 (5), 755–759.
Shapiro, Jesse, “Is there a daily discount rate? Evidence from the food stamp nutrition cycle,” Journal of PublicEconomics, 303–325 2005, 89 (2), 303.
Shui, Haiyan and Lawrence M. Ausubel, “Time Inconsistency in the Credit Card Market,” May 2004. 14thAnnual Utah Winter Finance Conference.
Tauras, John A and Frank J Chaloupka, “Impact of Tobacco Control Spending and Tobacco Control Policieson Adolescents’ Attitudes and Beliefs about Cigarette Smoking,” Evidence-Based Preventive Medicine, 2004, 1(2), 11–120., , Matthew Farrelly, Gary Giovino, Melanie Wakefield, Lloyd Johnston, Patrick O’Malley,
Deborah Kloska, and Terry Pechacek, “State Tobacco Control Spending and Youth Smoking,” AmericanJournal of Public Health, February 2005, 95 (2), 338–344.
The Tobacco Institute, The Tax Burden on Tobacco: Historical Compilation 1993, Vol. 28, Washington, DC:Tobacco Institute, 1993.
Townsend, J, P Roderick, and J Cooper, “Cigarette smoking by socioeconomic group, sex, and age: effects ofprice, income, and health publicity,” BMJ, October 1994, (309), 923–927.
United States Department of Agriculture Consumer and Marketing Service, Tobacco Division,“Tobacco Market Review,” 1970–1990.
Unites States Department of Commerce, “Statistical Abstract of the United States,” 1970–1990.Viscusi, W. Kip, “Do Smokers Underestimate Risks?,” Journal of Political Economy, 1990, 98 (6), 1253.
and Joni Hersch, “Cigarette Smokers as Job Risk Takers,” The Review of Economics and Statistics, May2001, 83 (2), 269–280.
Wertenbroch, Klaus, “Consumption self-control by rationing purchase quantities of virtue and vice,” MarketingScience, 1998, 17 (4), 317–337.
Wilmouth, Carrie E. and Linda P. Spear, “Withdrawal from chronic nicotine in adolescent and adult rats,”Pharmacology Biochemistry and Behavior, November 1998, 85 (3), 648–657.
36
A Proofs
Proof of Lemma 1: Following O’Donoghue and Rabin (2000), let At ≡ {0, 1}∞, where at ≡ (at, at+1, ...) ∈ At is apath of hitting and refraining starting in period t. Let Vt(kt,at) be the perceived long-run continuation utility fromfollowing at conditional on period-t addiction level kt. Let Kτ (kt,at) = [γτ−tkt +
∑τ−1n=1 γ
τ−n−1ai] be the addictionlevel in period τ that results from following at. Then:
Vt(kt,at) =∞∑τ=t
δτ−t[aτ(Yτ − pτ + (1− α)fτ (Kτ (kt,at)) + αfτ (kt)
)+(1− aτ )
(Yτ + (1− α)gτ (Kτ (kt,at)) + αgτ (kt)
)]Because ft(·) and gt(·) are weakly convex, and Kτ is increasing and linear in kt, it follows that Vt is also weakly
convex in kt. Now let A(k, t) be an optimal strategy from self t’s perspective—that is is solves Ut(k|A(k, t)) =maxa∈At V
t(k,a).To show uniqueness, suppose A′ 6= A is another optimal strategy. But from (10), A(k, t) = 1 ⇐⇒ ht(k) ≥
βδ(1−α)[Ut+1(γk|A)− Ut+1(γk + 1|A)
]and A′(k, t) = 1 ⇐⇒ ht(k) ≥ βδ(1−α)
[Ut+1(γk|A′)− Ut+1(γk + 1|A′)
].
Then Ut(k|A) = Ut(k|A′) ∀ k and t, and so A(k, t) = A′(k, t), so they are the same strategy.Then because Ut(k|A) is the upper envelope of V t, it is also weakly convex in kt. Therefore Ut+1(γk|A)−Ut+1(γk+
1|A) is weakly decreasing in k. But for any type,
A(k, t) = 1 ⇐⇒ ht(k) ≥ βδ(1− α)[Ut+1(γk|A)− Ut+1(γk + 1|A)
]Thus ∃kt s.t. ∀t,A(k, t) = 1 ⇐⇒ k ≥ kt.
If t > m, then ht(k) is independent of t by Assumption 1. Then A(k, t) = 1 ⇐⇒ h(k) ≥ βδ(1−α)[Ut+1(γk|A)− Ut+1(γk + 1|A)
]is also independent of t, so ∃k s.t. ∀t,A(k, t) = 1 ⇐⇒ k ≥ k.
If t < m, then because future selves will be expected to follow a cutoff strategy and xt is weakly decreasing int,[Ut+1(γk|A)− Ut+1(γk + 1|A)
]is weakly increasing in t. Thus for t < m, the cutoff kt is weakly increasing in t.
Finally, because hm(k) ≥ hm+1(k), km < km+1.
Proof of Lemma 2 For any p = (p1, p2, ...), define k∗1(p) to be the period-1 addiction level such that an agent withdiscount rate δ, degree of present-bias β, and degree of projection-bias α is indifferent between hitting always andhitting never. That is,
Y1 − p1 + x1 + f(k∗1(p) + β
∞∑t=2
δt−1
[Yt − pt + xt + (1− α)f
(γt−1kα1 (p) +
t−1∑n=1
γn−1
)+ αf (kα1 (p))
]= Y1 + g(k1) + β
∑∞t=2 δ
t−1[Yt + (1− α)g
(γt−1kα1 (p)
)+ αg (kα1 (p))
]which can be re-written as:
x1 − p1 + β
∞∑t=2
δt−1(−pt) + Φ(k∗1(p)) + β
∞∑t=2
δt−1xt = 0
where Φ(k) = f(k∗1(p))− g(k∗1(p)) + β∑∞t=2 δ
t−1[(1− α)
(f(γt−1k +
∑t−1n=1 γ
n−1)− g
(γt−1k
))+ α (f(k)− g(k))
].
If k∗(p) ∈ min{k∗, k, k}, then kt = k∗1(p) ∀ t. (i.e. behavior is governed by the threshold for hitting always relativeto hitting never, and it is straightforward to show that this is true for sufficiently small changes to p). Substituting k
37
for k∗1 into the above and differentiating with respect to p yields:
−1 +−βδ1− δ
+ Φ′(k∗1(p)) · dk1(p)dp
= 0
dk1(p)dp
=1 + βδ
1−δ
−Φ′(k∗1(p)
and similarlydk1(p)dp1
=1
−Φ′(k∗1(p),
dk(p)dpτ
=βδτ−1
−Φ′(k∗1(p)
Now for any p = (p1, p2, ...), define k1(p) to be the period-1 addiction level such that the is indifferent betweenhitting until maturity and hitting never. That is, k1(p) is given by:
Y1 − p1 + f(k1) + x1 + β
m∑t=2
δt−1
[Yt − pt + xt + (1− α)f
(γt−1k1(p) +
t−1∑n=1
γn−1
)+ αf
(k1(p)
)]+
β
∞∑t=m+1
δt−1
[Yt + (1− α)g
(γt−1k1(p) + γt−m
m∑n=1
γn−1
)+ αg
(k1(p)
)]
= Y1 + g(k1) + β
∞∑t=2
δt−1[Yt + (1− α)g
(γt−1k1(p)
)+ αg
(k1(p)
)]which can be re-written as:
x1 − p1 + β
m∑t=2
δt−1(−pt) + Ψ(k1(p),m) + β
m∑t=2
δt−1xt = 0
where
Ψ(k,m) = f(k1)− g(k1) + β
m∑t=2
δt−1
[(1− α)
(f
(γt−1k +
t−1∑n=1
γm−1
)− g
(γt−1k
))+ α (f(k)− g(k))
]
+β∞∑
t=m+1
δt−1
[(1− α)
(g
(γt−1k + γt−m
m∑n=1
γn−1
)− g
(γt−1k
))]
For sufficiently small changes in p, kαt = k(p) ∀ t . Thus differentiating with respect to p gives:
−1 + βδ1− δm−1
1− δ· (−1) + Ψk(k1,m)
dkjdp
Which can be re-arranged to yield:
dkjdp
=1 + βδ(1− δm−1)/(1− δ)
Ψm(k1(p),m)
And similarly differentiating with respect to p1 and pτ yield:
dkjdp1
=1
Ψm(k1(p),m)and
dkjdpτ
=
{βδτ−1
Ψs(k1(p),s), τ ≤ s
0, else
The case for a present-biased agent planning to hit only in the current period can be solved by setting m = 1.
Proof of Proposition 1 Suppose the initial addictive state k0 is distributed according to some function κ(·). Forfully rational smokers who expect to continue smoking, ktc1 = k∗. Then in period 1, π(p|X) = 1−κ(k∗(p)). By Lemma2, ∂π/∂pt+1
∂π/∂pt= δ.
38
Present-biased and projection-biased consumers also only consume in period 1 when k1 > kn1 and k1 > kj1,respectively. Thus for present-biased consumers, π(p|X) = 1−κ(kn1 (p)) and for projection-biased consumers, π(p|X) =1 − κ(kj1(p)). Lemma 2 then implies that if kn(β,p) = min{k∗n, kn, kn} or kn = min{k∗n, kn, kn} and m = 1, then∂π/∂pt+1∂π/∂pt
= 0. Otherwise if k∗(β,p) = min{k∗n, kn, kn} or kn = min{k∗n, kn, kn} and m > 1, then ∂π/∂pt+1∂π/∂pt
= βδ.
For projection-biased consumers, Lemma 2 implies that if kj = min{k∗j , kj} and m = 1, then ∂π/∂pt+1∂π/∂pt
= 0. If
kj = min{k∗j , kj} and m > 1 or k∗j = min{k∗j , kj}, then ∂π/∂pt+1∂π/∂pt
= δ
Proof of Proposition 2 Let the addictive stock in period t among mature smokers be given by κt(·). WLOG, letpτ = p+ pTτ ∀τ ≥ t. The probability that a consumer of type i smokes is given by π(p|X) = 1−κ(ki1(p), and therefore∂πt/∂pt+1∂πt/∂pt
= −κ′(k)∂k/∂pt+1
−κ′(k)∂k/∂ptIf consumers expect to continue smoking, Lemma 2 implies that ∂kt/∂pt+1
∂kt/∂pt= βδ.
For fully rational smokers, Lemma 2 further implies ∂πt/∂p∂πt/∂pt
= 11−δ . Projection-biased consumers follow identically,
using k∗j (p).For present-biased consumers, either kn(β, p) < k∗n(β) or kn(β, p) > k∗n(β). In the first case, Lemma 2 implies d =
0 and ∂πt/∂pP
∂πt/∂pt= 1 = 1
1−0 . If kn(β, p) > k∗n(β), then δ = βδ and ∂πt/∂p∂πt/∂pt
= 1−(1−β)δ1−δ > 1
1−βδ .
Proof of Lemma 3 (i) Rational and present-biased consumers:In period 2, the consumer will smoke iff
Y2 + x2 − p2 + f(k2) ≥Y2 + g(k2)x2 + f(k2)− g(k2) = x2 + h(k2) ≥p2
Let p∗2(k2) = x2 + h(k2) and let ψ(k2) = Φ(p∗2(k2)). Her ex-ante expected utility in period 2 is:
V2(k2) = Y2 + ψ(k2) · (x2 − p2(k2) + h(k2)) + g(k2)where p2 = E[p|p ≤ p∗2(k2)]
She will choose to smoke in period 1 if:
Y1 + g(k1) + V2(γk1) ≤ Y1 − p1 + x1 + f(k1) + V2(γk1 + 1)p1 ≤ p∗1 = x1 + h(k1) + βδV2(γk1 + 1)− βδV2(γk1)
≈ x1 + h(k1) + βδV ′2(γk1)
Before proceeding, consider her expected price conditional on smoking in period 2, as a function of her period-2addiction level:
p2(k) =1
ψ(k)·∫ x2+h(k)
−∞p · dΦ(p)
p′2(k) = −ψ′(k) · 1ψ(k)2
·∫ x2+h(k)
−∞p · dΦ(p) +
1ψ(k)
· h′(k) · (x2 + h(k)) · φ(x2 + h(k))
= −ψ′(k)ψ(k)
· p(k) +ψ′(k)ψ(k)
· p∗(k)
= [p∗(k)− p(k)] · ψ′(k)ψ(k)
> 0
Because p∗(k) is necessarily greater than p(k), the expression is positive. Intuitively, increasing one’s addictionlevel increases the range of prices at which a purchase will be made, and hence the expected purchase price.
39
Now,
V ′2(k) = ψ′(k) [p∗(k)− p2(k)] + ψ(k) · (−p′(k) + h′(k)) + g′(k)
= ψ′(k) [p∗(k)− p2(k)]− ψ(k) · ψ′(k)ψ(k)
[p∗(k)− p(k)] + h′(k)ψ(k) + g′(k)
= h′(k)ψ(k) + g′(k)
Finally, then:
p∗1(k1) ≈ x1 + h(k1) + βδh′(γk1)ψ(γk1) + βδg′(γk1)dp∗1dσ2≈ dψ(γk1)
dσ2· h′(γk1) · βδ
The second term is, by assumption, positive for all values of k. Thus the sign of effect on the threshold price isdetermined entirely by the first term, dψ(γk1)
dσ2 . The overall effect on the probability of initiation is then:
dP(smoke)dσ2
= Φσ2(p∗1(k1;σ2), p;σ2) + ΦP (p∗1(k1;σ2), p;σ2) · dP1
dσ2
(ii) Projection-Biased Consumers Because the fully projection-biased consumer believes her period-2 addictionlevel will be unchanged, her expected utility as a function of her period-1 consumption is simply:
U(a1 = 0) = Y1 + g(k1) + βδ [Ψ(k1, p) · (Y2 − E[p2|p2 ≤ p∗2(k1)] + x2 + f(k1)+(1−Ψ(k1, p)) · (Y2 + g(k1))]
U(a1 = 1) = Y1 − p1 + x1 + f(k1) +βδ [Ψ(k1, p) · (Y2 − E[p2|p2 ≤ p∗2(k1)] + x2 + f(k1)
+(1−Ψ(k1, p)) · (Y2 + g(k1))]
Because the fully projection-biased consumer believes her current actions don’t influence her future incentives, shewill smoke in period 1 exactly when the price is below the instantaneous temptation to hit:
p1 ≤ x1 + f(k1)− g(k1)
Thus the threshold price of initiating smoking for a fully projection-biased consumer is unaffected by the varianceof the price distribution. Assuming, however, that p > x1 + f(k1)− g(k1), then the probability that p1 is less than thethreshold is increasing in σ2.
(iii) Partial Projection BiasLet p∗(k, kt) be the threshold price at which a partially projection-biased consumer expects to purchase in period
2 if her current addiction level is pt and her actual period-2 addiction level is k. With simple projection bias, it followsthat p∗(k, kt) = x2 + α · (f(kt)− g(kt)) + (1− α) · (f(k)− g(k)).
It then follows that a partially projection-biased agent’s perceived probability of smoking in the next period isΨ(k, p|kt) = Φ (p∗(k, kt)− p).
Her perceived expected utility, then, as a function of her period-1 choice is then:
U(a1 = 0) = Y1 + g(k1) +
+βδ[Ψ(γk1, p|k1) · [(Y2 − E[p2|p2 ≤ p∗2(γk1, k1)) + x2 + αf(k1) + (1− α)f(γk1)]
+(1− Ψ(γk1, p|k1)) · [Y2 + αg(k1) + (1− α)g(γk1)]]
U(a1 = 1) = Y1 − p1 + x1 + f(k1)
+βδ[Ψ(γk1 + 1, p|k1) · [(Y2 − E[p2|p2 ≤ p∗(γk1 + 1, k1)) + x2 + αf(k1) + (1− α)f(γk1)]
+(1− Ψ(γk1 + 1, p|k1)) · [Y2 + αg(k1) + (1− α)g(γk1 + 1)]]
40
Thus a1 = 1 if:
p1 ≤ x1 + f(k1)− g(k1)
+βδ[Ψ(γk1 + 1, p|k1) · [(−E[p2|p2 ≤ p∗(γk1 + 1, k1)) + x2 + αf(k1) + (1− α)f(γk1)]
−Ψ(γk1, p|k1) · [(−E[p2|p2 ≤ p∗2(γk1, k1)) + x2 + αf(k1) + (1− α)f(γk1)]+(1− Ψ(γk1 + 1, p|k1)) · [αg(k1) + (1− α)g(γk1 + 1)]
−(1− Ψ(γk1, p|k1)) · [αg(k1) + (1− α)g(γk1)]]
In the limit as α approaches unity, this expression simplifies to exactly the full projection-bias case. Similarly, itsimplifies to the fully rational case as α approaches zero. Thus the overall effect on the probability of smoking dependson the individual’s degree of projection bias. For large α, the threshold will barely adjust as σ2 increases, so as witha fully projection-biased agent, the probability of initiation increases. For small α, the threshold drops enough that itdominates the increased probability of meeting any given threshold, so the probability of initiation decreases.
Proof of Lemma 3: Let f(kt) = xt − ρkt and g(kt) = −σkt.For any p = (p1, p2, ...), define k1(α, β, δ) to be the period-1 addiction level such that a person with discount rate
δ, degree of present-bias β, and degree of projection-bias α is indifferent between hitting always and hitting never.That is,
Y1 − p1 + x1 − ρk1 + β
∞∑t=2
δt−1
((1− α)(ρ)
(γt−1k1 +
t−1∑n=1
γn−1
)+ α(−ρ)k1 + Yt − pt + xt
)
= Y1 − σk1 + β
∞∑t=2
δt−1((1− α)(−σ)(γt−1)k1 + α(σ)k1 + Yt
)
So that:
(σ − ρ)k1 + βδ
[(1− α)
γ
1− δγ+
α
1− δ
](σ − ρ)k1 + β
∞∑t=2
δt−1x1 − p1 − β∞∑t=2
δt−1pt + θ1 = 0
where θ1 =1− α1− γ
βδ
[γ
1− δγ− 1
1− δ
]ρ
If k1(α, β, δ) < ky1(α, β, δ), then kt = k1(α, β, δ) ∀ t. (i.e. behavior is governed by the threshold for hittingalways relative to hitting never, and it is straightforward to show that this is true for sufficiently small changes to p).Substituting k for k1 into the above and differentiating with respect to p yields:[
1 + βδ
((1− α)
γ
1− δγ+
α
1− δ
)](σ − ρ)dk =
[1 + β
∞∑t=2
δt−1
]dp
dk
dp=
[1 + βδ
1−δ
][1 + βδ
((1− α) γ
1−δγ + α1−δ
)]· (σ − ρ)
41
Similarly :
dk
dpTt=
1[1 + βδ
((1− α) γ
1−δγ + α1−δ
)]· (σ − ρ)
dk
dpTt+1
=βδ[
1 + βδ(
(1− α) γ1−δγ + α
1−δ
)]· (σ − ρ)
Now let ky1(α, β, δ,m) be the period-1 addiction level such that a person with discount rate δ, degree of present-biasβ, degree of projection-bias α, and maturity date m > 1 is indifferent between hitting until maturity (and refrainingthereafter), and hitting never. That is:
Y1 − p1 + x1 − ρk1 + β
m∑t=2
δt−1
[(1− α)(−ρ)
(γt−1k1 +
t−1∑n=1
γn−1
)− αρk1 − pt + xt
]
+β∞∑
t=m+1
δt−1
[(1− α)(−σ)
(γt−1k1 + γt−m ·
m∑n=1
γn−1
)− ασk1
]
= Y1 − σk1 − σ(
(1− α)βδγ1
1− δγ+ αβδ
11− δ
)k1
[1 + βδ
((1− α)γ
1− (δγ)m−1
1− δγ+ α
1− δm−1
1− δ
)](σ − ρ)k1 − p1 − β
m∑t=1
δt−1pt + θ2 = 0
where θ2 = x1 + β∑mt=2 δ
t−1xt − (1− α) βδ1−γ
(1−δm1−δ −
1−(δγ)m
1−δγ
)ρ.
If ky1(α, β, δ,m) < k1(α, β, δ), then kt = ky1(α, β, δ) ∀ t. (i.e. behavior is governed by the threshold for hittingonly until maturity and refraining thereafter over hitting never, and it is straightforward to show that this is true forsufficiently small changes to p). Substituting k for ky1 into the above and differentiating with respect to p yields:
dk
dp=
[1 + βδ(1−δm−1)
1−δ
][1 + βδ
((1− α)γ 1−(δγ)m−1
1−δγ + α 1−δm−1
1−δ
)]· (σ − ρ)
Similarly :
dk
dpTt=
1[1 + βδ
((1− α)γ 1−(δγ)m−1
1−δγ + α 1−δm−1
1−δ
)]· (σ − ρ)
dk
dpTt+1
=βδ[
1 + βδ(
(1− α)γ 1−(δγ)m−1
1−δγ + α 1−δm−1
1−δ
)]· (σ − ρ)
· 1{m > t}
Let the distribution of addictive states at time t for the cohort born in τ be given by Pr(k ≤ x) = κ(x; τ, t). Theprobability of smoking, and the population smoking rate, is then
(1− κ(k; τ, t)
).
Finally, let the “maturity date” m be distributed according to the probability distribution function µ(·). Theresponse in the smoking rate of cohort τ , πτ,t to an anticipated shock to the price beginning in period t+1 is then:
dπτ,tdpt+1
= −κ′(k; τ, t) ·(
dk
dpt+1
)(1− µ(t− τ))
Proof of Lemma 4: A consumer with current addictive state k1, discount rate δ, degree of present-bias β, and degree
42
of projection-bias α believes she will smoke in period 2 if the price and her period-2 addiction level are such that:
Y2 + x2 − p2 + (1− α)f(k2) + αf(k1) ≥ Y2 + (1− α)g(k2) + αg(k1)
p2 ≤ (1− α)h(k2) + αh(k1)
Let p(k2; k1) = x2 + (1− α)h(k2) + αh(k1). If p ∼ Φ(·), then her ex-ante expected utility in period is:
V (k2; k1) = EU(k2; k1) = Y2 + Φ(p) ((1− α)f(k2) + αf(k1)− E[p2|p2 ≤ p]) + (1− Φ(p)) ((1− α)g(k2) + αg(k1))
She will therefore smoke in period 1 if:
p1 ≤ p∗1 = h(k1) + βδV (γk1 + 1; k1)− βδV (γk1; k1)≈ h(k1) + V ′2(γk1; k1) · βδ
Let p(k2; k1) = E [p2|p2 ≤ p(k2; k1)], and note that
p(k2; k1) =1
Φ(p)
∫ (1−α)h(k2)+αh(k1)
−∞p · dΦ(p)
So that
p′(k2; k1) = −φ(p)h′(k2)Φ(p)2
∫ (1−α)h(k2)+αh(k1)
−∞p · dΦ(p) +
1Φ(p)
(1− α)h′(k2)pφ(p)
= −φ(p)h′(k2)Φ(p)
p+(1− α)φ(p)
Φ(p)
=φ(p)Φ(p)
h′(k2) [p− p]
ThusV2 = Y2 + Φ(p)(p− p) + (1− α)g(k2) + αg(k1)
V ′2(k2; k1) = φ(p)h′(k2)(p− p) + Φ(p)((1− α)h′(k2)− p′) + (1− α)g′(k2)
= φ(p)h′(k2)(p− p) + Φ(p)φ(p)Φ(p)
h′(k2)(p− p) + (1− α)h′(k2)Φ(p) + (1− α)g′(k2)
= (1− α) [h′(k2)Φ(p(k2; k1) + g′(k2)]
And therefore
p∗ = h(k1) + βδ(1− α) [Φ(p) · h′(γk1) + g′(γk1)]
Let s be the standard deviation of Φ(p).
dp∗
ds=
dΦ(p)ds
· h′(γk1) · βδ(1− α)
=dΦ(p)ds
· (σ − ρ) · βδ(1− α)
If the agent changes their threshold in this manner, then the overall change in their probability of smoking in anyperiod is:
dπ
ds= φ(p∗) · dΦ(p)
ds· (σ − ρ)βδ(1− α) +
dΦ(p∗)ds
If one imposes normality on the price distribution such that (pt − E [pt]) ∼ N(0, s2), then this can be written in
43
closed form as:
dπ
ds=
1√2πs
e−
“p∗2
2s2
”·(
p√2πs2
e−
“p2
2s2
”)· (σ − ρ)βδ(1− α)
− p∗√2πs2
e−
“p∗2
2s2
”
Derivation of Optimal Tax Consider the current-period addiction for which an agent is indifferent between hittingalways and hitting never, as given in the Proof of Lemma 3:
(σ − ρ)k1 + βδ
[(1− α)
γ
1− δγ+
α
1− δ
]k1 + β
∞∑t=2
δt−1x1 − p1 − β∞∑t=2
δt−1pt + θ1 = 0
where θ1 =1− α1− γ
βδ
[γ
1− δγ− 1
1− δ
]ρ
An agent with discount rate δ, degree of present-bias β, and degree of projection-bias α will therefore hit if:
k1 ≥p1 + β
∑∞t=2 δ
t−1pt − θ1
θ3(σ − ρ)
where θ3 = 1 + βδ
((1− α)
γ
1− δγ+ α
11− δ
)Whereas a fully-rational agent (i.e. α = 0, β = 1) will hit if:
k1 ≥p1 +
∑∞t=2 δ
t−1pt − 11−α
1β θ1
θR3 (σ − ρ)
where θR3 = 1 + δ
(γ
1− δγ
)Consider applying a uniform excise tax τ on consumption in every period. The level of tax that will cause a biased
agent to consume only when the rational agent would have consumed is given by:
p1 + τ + β∑∞t=2 δ
t−1(pt + τ)− θ1
θ3(σ − ρ)=p1 +
∑∞t=2 δ
t−1pt − 11−α
1β θ1
θR3 (σ − ρ)
τ
(1 +
βδ
1− δ
)=[
11− δ
θ3
θR3− (1 + βδ)1− δ
]p−
[1
1− α1β− θ3
θR3
]θ1
where p is the average future price.
Table A-1: Optimal Taxesγ = 0 γ = 0.25 γ = 0.5 γ = 0.75
β = 0.9, δ = 1, α = 0 1.12 1.31 1.84 3.70β = 0.8, δ = 0.9, α = 0 2.21 2.47 3.18 5.28β = 1, δ = 0.9, α = 0.4 11.40 7.46 4.16 2.89β = 0.8, δ = 0.9, α = 0.4 15.16 11.41 8.63 8.80
B Figures and Tables
44
Table 1: Summary Statistics
Full NHISUrban Sample
NHISSmokersa
NHISNonsmokers
1990 Censusb
Cigarettes/day 18.79 –(13.13)
Age 41.89 42.59 41.63 43.05(17.99) (17.10) (18.31)
Female 0.537 0.520 0.543 0.51(.499) (.499) (.498)
Black 0.127 0.129 0.126 0.12(0.333) (0.335) (0.332)
High School or Less 0.436 0.367 0.461 0.626(0.496) (0.482) (0.498)
College or Morec 0.565 0.633 0.539 0.373(0.496) (0.482) (0.498)
Northeast 0.328 0.321 0.330 0.20Central 0.268 0.275 0.265 0.24South 0.165 0.166 0.165 0.34West 0.237 0.238 0.237 0.21
N obs. 125,211 33,637 91,574Source: National Health Interview Survey (1965-1991) and Statistical Abstract of the United States.Notes: Standard deviations are in parentheses. Cigarette consumption and demographic variables are self-reported in the NHIS. Regions correspond to census designations. Twenty cigarettes is approximately a packa day. Census data are for combined urban and rural sample.a Self-designated “Current Smoker”b Estimates for total US population. Education outcomes are for persons 25 years old and over.b Includes “Some College”
45
Figure 1: Real Prices by Metropolitan Area.6
.81
1.2
1.4
.6.8
11
.21
.4.6
.81
1.2
1.4
.6.8
11
.21
.4.6
.81
1.2
1.4
.6.8
11
.21
.4
1960 1970 1980 1990 1960 1970 1980 1990 1960 1970 1980 1990 1960 1970 1980 1990 1960 1970 1980 1990
1960 1970 1980 1990
Anaheim Atlanta Baltimore Boston Buffalo Chicago
Cincinnati Cleveland Dallas Denver Detroit Houston
Indianapolis Kansas City Los Angeles Miami Milwaukee Minneapolis
New Orleans New York Philadelphia Pittsburgh Portland San Bernadino
San Diego San Francisco San Jose Seattle St. Louis Tampa
Washington, D.C.
Real P
rice
Figure 2: Nominal Taxes by Metropolitan Area
02
04
06
00
20
40
60
02
04
06
00
20
40
60
02
04
06
00
20
40
60
1960 1970 1980 1990 1960 1970 1980 1990 1960 1970 1980 1990 1960 1970 1980 1990 1960 1970 1980 1990
1960 1970 1980 1990
Anaheim Atlanta Baltimore Boston Buffalo Chicago
Cincinnati Cleveland Dallas Denver Detroit Houston
Indianapolis Kansas City Los Angeles Miami Milwaukee Minneapolis
New Orleans New York Philadelphia Pittsburgh Portland San Bernadino
San Diego San Francisco San Jose Seattle St. Louis Tampa
Washington, D.C.
Nom
inal Ta
xes
46
Figure 3: Hazard Rates of Cigarette Smoking Initiation and Cessation
0 5 10 15 20 25 30 35 40 45 50 550
0.02
0.04
0.06
0.08
0.1
0.12
0.14
Time (Age or Duraton of Habit)
Ha
za
rd
Initiation
Cessation
Source: 1987 National Health Interview Survey.
Notes: Hazard rates are calculated from self-reported ages when started smoking and when last regularly smoked, and do not
control for observable characteristics. Quitting hazard does not include exits from smoking habits due to death.
47
Table 2: Price Sensitivity Among Young SmokersPanel A – Dependent Variable: 1(Smoker = 1), Linear Probability Model
(i) (ii) (iii) (iv) (v) (vi)Age < 20 Age < 25 Age < 30 Age < 20 Age < 25 Age < 30
ln(price)t -1.853∗∗∗ -1.256∗∗∗ -1.019∗∗∗ -1.363∗∗∗ -0.994∗∗∗ -0.744∗∗
(0.226) (0.135) (0.104) (0.525) (0.334) (0.294)ln(price)t+1 -0.118 -0.426∗∗∗ -0.545∗∗∗ -0.021 -0.389 -0.575∗∗
(0.161) (0.096) (0.075) (0.471) (0.289) (0.251)Ratio ofcoefficientsa 0.064 0.340∗∗ 0.534∗∗∗ 0.016 0.392∗∗ 0.773∗∗∗
(0.093) (0.110) (0.124) (0.339) (0.162) (0.051)
MSA Fixed Effects yes yes yes yes yes yesMSA-SpecificTrends
yes yes yes yes yes yes
N. obs 5,289 11,647 17,975 5,289 11,647 17,975
Panel B – First-stage Estimates of Pricesln(price)t
ln(taxes)t – – – 0.527∗∗∗ 0.516∗∗∗ 0.511∗∗∗
(0.038) (0.036) (0.036)Budget Gapb – – – 0.194∗∗ 0.161∗∗ 0.151∗∗
(0.089) (0.078) (0.074)ln(price)t+1
ln(taxes)t – – – 0.551∗∗∗ 0.558∗∗∗ 0.565∗∗∗
(0.054) (0.051) (0.050)Budget Gapb – – – 0.478∗∗∗ 0.454∗∗∗ 0.428∗∗∗
(0.135) (0.134) (0.131)Notes: Standard errors (in parentheses) are robust and clustered at the MSA level. All specifications control for age, age2, race,
and education and include both MSA dummies and MSA-specific time trends. * denotes significance at the 10% level; ** at the
5% level; and *** at the 1% level. Coefficients in Panel A are average marginal effects from a probit regression.a The ratio of the coefficients on ln(pricet+1) and ln(pricet) will equal the net discount factor βδ if consumers are fully rational,
or may equal 0 if consumers are sufficiently present- or projection-biased. Standard errors are calculated by the delta method.b Average percentage shortfall in state government budgets (general fund) within an MSA.
48
Table 3: Price Sensitivity – Ordered ProbitDependent Variable: Smoking Status
Less than Cutoff Less than or Equal to Cutoff(i) (ii) (iv) (v)
ln(pricet) -1.019∗∗ -0.816∗∗ -1.277∗∗∗ -1.063∗∗∗
(0.414) (0.385) (0.412) (0.382)ln(pricet) * under20 -2.233∗∗ -2.380∗∗
(0.966) (0.948)ln(pricet+1) -1.627∗∗∗ -1.821∗∗∗ -1.575∗∗∗ -1.768∗∗∗
(0.332) (0.305) (0.331) (0.303)ln(pricet+1) * under20 1.946∗∗∗ 1.962∗∗∗
(0.620) (0.542)under20 -0.021 -0.052
(0.423) (0.047)Thresholds:µ1 (nonsmoker) 218.6671 218.6596 212.8748 212.7885µ2 (1/2 pack) 219.155 219.1477 213.5996 213.5135µ3 (1 pack) 219.5799 219.5729 214.529 214.4437µ4 (2 packs) 220.7809 220.7751 215.6642 215.5800
p-value of β3 + β4 = 0 – 0.8690 – 0.7830
N obs. 43,537 43,537 43,537 43,537R2 0.167 0.167 0.173 0.174
Notes: Robust standard errors (in parentheses) and are clustered at the MSA level. All specifications control for age, age2, race,
and education. * denotes significance at the 10% level; ** at the 5% level; and *** at the 1% level. Cutoffs for smoking status
are: 0 = nonsmoker, 1 = 1/2 pack per day, 2 = 1 pack per day, 3 = 2 packs per day, 4 = 2+ packs per day
49
Table 4: Price Sensitivity Among Mature SmokersPanel A – Dependent Variable: 1(Smoker = 1), Linear Probability Model
(i) (ii) (iii) (iv)Age > 30 Age > 35 Age > 30 Age > 35
ln(pricet) -0.824∗∗∗ -0.879∗∗∗ -0.725∗∗∗ -0.787∗∗∗
(0.071) (0.077) (0.215) (0.251)ln(pricet+1) -0.641∗∗∗ -0.619∗∗∗ -0.531∗∗∗ -0.496∗∗
(0.052) (0.057) (0.183) (0.213)Ratio of Coefficientsa 0.778∗∗∗ 0.704∗∗∗ 0.734∗∗∗ 0.630∗∗∗
(0.125) (0.122) (0.044) (0.074)
MSA Fixed Effects Yes Yes Yes YesMSA-Specific Time Trends Yes Yes Yes YesN. obs 40,436 34,775 40,436 34,775Panel B – First-stage Estimates of Pricesln(price)t
ln(taxest) – – 0.494∗∗∗ 0.498∗∗∗
(0.035) (0.035)Budget Gapb – – 0.146∗ 0.132∗
(0.074) (0.072)ln(price)t+1
ln(taxest) – – 0.550∗∗∗ 0.559∗∗∗
(0.048) (0.048)Budget Gapb – – 0.398∗∗∗ 0.384∗∗∗
(0.138) (0.136)Notes: Standard errors (in parentheses) are robust and clustered at the MSA level. All specifications control for age, age2, race,
and education and include both year dummies and MSA-specific time trends. * denotes significance at the 10% level; ** at the
5% level; and *** at the 1% level. Coefficients for in Panel A are average marginal effects from a probit regression.a The ratio of the coefficients on ln(pricet+1) and ln(pricet) will equal the net discount factor βδ. Standard errors are calculated
by the delta method.b Average percentage shortfall in state government budgets (general fund) within an MSA.
50
Table 5: Permanent vs. Temporary Price EffectsPanel A – Dependent Variable: 1(Smoker = 1)
Age > 30 Age > 35 Age < 20(i) (ii) (iii) (iv) (v)
ln(pPt ) -1.111∗∗∗ -1.380∗∗∗ -1.113∗∗∗ -1.410∗∗∗ -0.822∗∗∗
(0.118) (0.181) (0.118) (0.177) (0.166)
ln(pTt ) -0.214∗∗∗ -0.217∗∗∗ -0.220∗∗∗ -0.225∗∗∗ -0.333∗∗∗
(0.044) (0.068) (0.045) (0.068) (0.112)
Ratio of Coefficientsa 5.191∗∗∗ 6.359∗∗ 5.059∗∗∗ 6.267∗∗ 2.467(1.31) (2.07) (1.26) (1.97) (1.11)
MSA Fixed Effects Yes Yes Yes Yes YesTrend Yes No Yes No NoTrends by MSA No Yes No Yes Yes
N obs. 37,651 37,651 32,322 32,322 5,289R2 0.497 0.505 0.495 0.503 0.457
Panel B – Decomposition (Dependent Variable = Real Pricet)Age > 30 Age > 35 Age < 20
Real Taxes 1.164∗∗∗ 1.160∗∗∗ 1.117∗∗∗
(0.172) (0.171) (0.170)Real KY Burley Price 0.109∗∗∗ 0.108∗∗∗ 0.101∗∗
(0.032) (0.032) (0.033)MSA Fixed Effects Yes Yes YesTime Trend Yes Yes YesR2 0.85 0.85 0.85
Notes: Standard errors (in parentheses) in Panel A are approximate bootstrap standard errors to account for the generation of
ln(pPt ) and ln(pTt ), clustered at the MSA level. Standard errors in Panel B are robust and clustered at the year level. Permanent
price is predicted using MSA fixed effects and taxes. Temporary price is predicted using Kentucky burley tobacco auction price.
All specifications include age, age2, sex, race, and education . All columns use a linear probability model. * denotes significance
at the 10% level; ** at the 5% level; and *** at the 1% level.a The ratio of the coefficients on ln(pP ) and ln(pT ) will equal (1− (1−β)δ)/(1− δ). Standard errors are calculated by the delta
method. Significance is calculated as difference from 1.
51
Table 6: Effect of Price Uncertainty on Smoking RatesDependent Variable: 1(Smoker = 1)
Probita Linear Probability Model(i) (ii) (iii) (iv) (v) (vi)
Std(Price) 0.0254∗ 0.0267∗∗ 0.0278∗∗ 0.0196∗ 0.0208∗ 0.0221∗
(0.0135) (0.0133) (0.0142) (0.0113) (0.0108) (0.0118)Std(Price) X Under20 – -0.0052 – – -0.0049 –
(0.0244) (0.0170))Under20 – -0.0930∗∗∗ – – -0.0869∗∗∗ –
(0.0246) (0.0216)Std(Price) X Under25 – – -0.0085 – – -0.0084
(0.0118) (0.0096)Under25 – – -0.317∗∗ – – -0.0319∗∗
(0.0157)) (0.0138)ln(pricet) -0.361∗∗∗ -0.355∗∗∗ -0.360∗∗∗ -0.301∗∗∗ -0.295∗∗∗ -0.301∗∗∗
(0.0573) (0.0568) (0.0543) (0.0487) (0.0482) (0.0487)Demog. Controls Yes Yes Yes Yes Yes YesTime Trend Yes Yes Yes Yes Yes Yes
R2 0.072 0.075 0.073 0.083 0.085 0.083Number of obs. 86,137 86,137 86,137 86,137 86,137 86,137
Notes: Standard errors (in parentheses) are robust and clustered at the MSA level. Std(Price) is calculated using departures
from MSA trend of prices in the first half of the sample, while the second half of the sample is used to estimate the model.
Demographics include age, age, race, and education. * denotes significance at the 1% level, ** denotes significance at the 5%
level, and *** denotes significance at the 1% level.a Coefficients are average marginal effects.
52
Table 7: Parameter Estimates
Table 5: (i) (ii) (iii) (iv)Table 4: α β δ β δ β δ β δ
(iii) 0.494 0.853 0.859 0.828 0.885 0.856 0.856 0.829 0.883(0.080) (0.106) (0.072) (0.078) (0.080) (0.107) (0.072) (0.080)
(iv) 0.410 0.717 0.879 0.700 0.901 0.720 0.875 0.701 0.899(0.138) (0.097) (0.129) (0.071) (0.138) (0.097) (0.129) (0.072)
Notes: Roman numerals refer to specifications in Tables 4 and 5. Standard errors are approximate, as calculatedusing the delta method. Estimates use the preferred estimate of (σ − ρ) = 16.883, and the coefficient on dπ
dsfrom
Column (iv) of Table 6
53
Figure 4: Overpayment Ratio
Panel A: Overpayment Relative to Biased Self
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.91
1.5
2
2.5
3
3.5
4
4.5
5
5.5
6
γ
m = 12
m= 6
Panel B: Overpayment Relative to Unbiased self
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.91
1.5
2
2.5
3
3.5
4
4.5
5
5.5
6
γ
m=6
m = 12
Notes: This figure calculates the ratio between the present values of the most expensive smoking habit that
an unaddicted teenager would accept and the least expensive smoking habit that she would refuse. Panel
A uses a biased agent in the denominator, while Panel B uses an otherwise-identical agent with α = 0,
β = 1. For a perfectly rational consumer (not facing credit constraints), this ratio should equal 1. Prices
are fixed during the consumer’s youth (i.e. for m periods) and then allowed to change during her adulthood.
Estimates use benchmark results of α = 0.410, β = 0.7 δ = 0.9, σ − ρ = 16.88. m refers to the remaining
periods of “youth”. Present values are taken using the estimated long-run discount factor δ.
54
C Additional Tables
Table A-2: Demographic ControlsDependent Variable: 1(Smoker = 1), Linear Probability Model
(i) (ii) (iii) (iv)
Age 0.00908∗∗∗ 0.00908∗∗∗ 0.01374∗∗∗ 0.01376∗∗∗
(0.00318) (0.00318) (0.00212) (0.00212)Age2 -0.00009∗∗∗ -0.00009∗∗∗ -0.00016∗∗∗ -0.00016∗∗∗
(0.00003) (0.00003) (0.00003) (0.00003)Education variables:a
Some College -0.234∗∗∗ -0.238∗∗∗ -0.0268∗∗∗ -0.0261∗∗∗
(0.0089) (0.0089) (0.0042) (0.0042)College -0.0366∗ -0.0362∗ -0.0518∗∗∗ -0.0510∗∗∗
(0.0199) (0.0198) (0.0104) (0.0105)More than College -0.0336 -0.0332 -0.0787∗∗∗ -0.0776∗∗∗
(0.0231) (0.0230) (0.0261) (0.0263)Demographic Variables:Female -0.0186∗ -0.0186∗ -0.0847∗∗∗ -0.0847∗∗∗
(0.0098) (0.0099) (0.0268) (0.0268)Black 0.0081 0.0080 0.0081∗∗ 0.0088∗∗∗
(0.0097) (0.0090) (0.0041) (0.0032)
Include Non-Urban Sampleb No No Yes YesYear Fixed Effects Yes Yes Yes YesMSA Fixed Effects and Trends No Yes No YesN obs. 120,535 120,535 410,729 410,729Notes: Standard errors (in parentheses) are robust and multi-way clustered by both year and MSA followingCameron, Gelbach and Miller (2006). * denotes significance at the 10% level; ** at the 5% level; and *** atthe 1% level.a The omitted category is “High School or Less”b Non-urban are respondents outside the 31 metropolitan statistical areas used in the main regressions. Forstatistical purposes here, they are grouped into a single “other” MSA.
55
Table A-3: Predicting Current and Future PricesDependent Variable: ln(Pricet)
(i) (ii) (iii) (iv)ln(Taxest) 0.3042 0.5006 0.5280 0.4971
(0.0222) (0.0310) (0.0315) (0.0314)[0.0621] [0.0781] [0.0562] [0.0756]
ln(KY Burleyt) – 0.1419 – 0.1402(0.0155) (0.0162)[0.0480] [0.0458]
Budget Gapt – – -0.0246 0.0337(0.0537) (0.0477)[0.1360] [0.1476]
R2 0.776 0.827 0.779 0.827Dependent Variable: ln(Pricet+1)
(i) (ii) (iii) (iv)ln(Taxest) 0.6364 0.6360 0.6080 0.6077
(0.0446) (0.0471) (0.0483) (0.0508)[0.0882] [0.0879] [0.0816] [0.0834]
ln(KY Burleyt) – -0.006 – -0.004(0.0471) (0.0631)[0.1239] [0.1111]
Budget Gapt – – 0.2655 0.2655(0.1091) (0.1095)[0.1786] [0.1778]
R2 0.844 0.853 0.860 0.867Dependent Variable: ln(Pricet+1)-ln(Pricet)
(i) (ii) (iii) (iv)ln(Taxest) 0.1110 0.0991 0.0799 0.0761
(0.0242) (0.0291) (0.0268) (0.0289)[0.0688] [0.0704] [0.0691] [0.0696]
ln(KY Burleyt) – -0.0482 – -0.0165(0.0518) (0.0441)[0.1418] [0.1319]
Budget Gapt – – 0.2902 0.2882(0.0677) (0.0659)[0.1407] [0.1421]
R2 0.102 0.150 0.197 0.225Notes: Standard errors in parentheses are robust and clustered by MSA. Standard errors in square
backets are robust and multi-way clustered by both year and MSA following Cameron et al. (2006).
All specifications include MSA fixed effects.
56
Tab
leA
-4:
Pri
ceSe
nsit
ivit
y–
Alt
erna
tive
Spec
ifica
tion
sP
anel
A:
You
ngSm
oker
s(D
epen
dent
Var
iabl
e:1
(Smoker
=1)
)A
ge<
20A
ge<
25A
ge<
30ln
(pri
cet)
-1.6
40-1
.577
-3.2
06-1
.638
-1.2
71-1
.261
-2.5
17-1
.271
-1.0
70-1
.061
-2.3
29-1
.069
(0.1
91)
(0.1
90)
(0.2
47)
(0.1
91)
(0.1
16)
(0.1
16)
(0.1
49)
(0.1
16)
(0.0
91)
(0.0
92)
(0.1
19)
(0.0
91)
ln(p
ricet+
1)
-0.1
660.
273
-0.6
61-0
.169
-0.4
46-0
.178
-0.9
10-0
.448
-0.5
44-0
.354
-0.9
98-0
.546
(0.1
61)
(0.2
61)
(0.1
67)
(0.1
61)
(0.0
96)
(0.1
56)
(0.1
02)
(0.0
96)
(0.0
75)
(0.1
21)
(0.0
79)
(0.0
75)
ln(t
axest)
––
1.40
2–
––
1.14
4–
––
1.12
3–
(0.1
42)
(0.0
87)
(0.0
68)
Rat
ioof
Pri
ceC
oeffi
cien
ts0.
101
-0.1
730.
206
0.10
30.
351
0.14
10.
362
0.35
20.
509
0.33
40.
429
0.51
0(0
.108
)(0
.158
)(0
.061
)(0
.109
)(0
.104
)(0
.130
)(0
.053
)(0
.105
)(0
.110
)(0
.130
)(0
.047
)(0
.110
)
Fir
stSt
age
(Dep
ende
ntV
aria
ble:
ln(P
ricet+
1))
ln(t
axest)
0.19
5–
0.19
50.
236
0.19
2–
0.19
20.
230
0.19
7–
0.19
70.
235
(0.0
52)
(0.0
52)
(0.0
57)
(0.0
51)
(0.0
51)
(0.0
58)
(0.0
50)
(0.0
51)
(0.0
57)
Bud
get
Gap
a t0.
298
0.35
00.
298
–0.
297
0.34
30.
298
–0.
290
0.33
80.
290
–(0
.080
)(0
.076
)(0
.080
)(0
.078
)(0
.071
)(0
.078
)(0
.076
)(0
.071
)(0
.076
)
N.
obs
5,28
95,
289
5,28
95,
289
11,6
4711
,647
11,6
4711
,647
17,9
7517
,975
17,9
7517
,975
Pan
elB
:M
atur
eSm
oker
s(D
epen
dent
Var
iabl
e:1
(Smoker
=1)
)A
ge>
30A
ge>
35ln
(pri
cet)
-0.8
73-0
.855
-2.1
22-0
.873
-0.9
30-0
.906
-2.1
70-0
.930
(0.0
63)
(0.0
63)
(0.0
82)
(0.0
63)
(0.0
69)
(0.0
69)
(0.0
90)
(0.0
69)
ln(p
ricet+
1)
-0.6
44-0
.431
-1.0
69-0
.645
-0.6
20-0
.369
-1.0
34-0
.621
(0.0
51)
(0.0
80)
(0.0
54)
(0.0
51)
(0.0
56)
(0.0
87)
(0.0
59)
(0.0
56)
ln(t
axest)
––
1.07
1–
––
1.05
9–
(0.0
46)
(0.0
50)
Rat
ioof
Pri
ceC
oeffi
cien
ts0.
737
0.50
50.
504
0.73
90.
666
0.40
70.
476
0.66
8(0
.108
)(0
.116
)(0
.038
)(0
.109
)(0
.106
)(0
.115
)(0
.040
)(0
.107
)
Fir
stSt
age
(Dep
ende
ntV
aria
ble:
ln(P
ricet+
1))
ln(t
axest)
0.17
6–
0.17
60.
211
0.17
0–
0.17
00.
205
(0.0
51)
(0.0
51)
(0.0
58)
(0.0
52)
(0.0
52)
(0.0
58)
Bud
get
Gap
a t0.
283
0.32
40.
283
–0.
285
0.32
50.
285
–(0
.074
)(0
.068
)(0
.074
)(0
.075
)(0
.068
)(0
.075
)
N.
obs
40,4
3640
,436
40,4
3640
,436
34,7
7534
,775
34,7
7534
,775
Note
s:Sta
ndard
erro
rs(i
npare
nth
eses
)are
robust
and
clust
ered
at
the
MSA
level
.A
llsp
ecifi
cati
ons
contr
ol
for
age,
age2
,ra
ce,
and
educa
tion,
and
incl
ude
both
MSA
fixed
effec
tsand
MSA
-sp
ecifi
cti
me
tren
ds.
aA
ver
age
per
centa
ge
short
fall
inst
ate
gov
ernm
ent
budget
sw
ithin
an
MSA
.
57