an empirical analysis of biases in cigarette...

An Empirical Analysis of Biases in Cigarette Addiction

Matthew R. Levy∗

This version: June 10, 2009

Abstract

I embed rational choice along with two types of psychological biases into a model describing theconsumption of an addictive good, and empirically investigate the extent to which these biases help explainobserved smoking patterns. Rational agents correctly maximize their lifetime utility, whereas “present-biased” agents have a taste for immediate gratification, and “projection-biased” agents underestimate howmuch their future tastes are affected by current behavior. Consumers may have any degree, including zero,of both biases. I use MSA-level variation in cigarette price shocks over a 26-year span to examine individualsmoking responses to current and predicted future prices, and to price volatility. I find that while bothyoung smokers and mature smokers respond to the current price (with a 1% price increase lowering theprobability of smoking by between 1% and 2%), only mature smokers respond to the future price. Thisyouthful myopia is consistent with either type of biased consumers but not with fully rational ones. Theratio of the future-price response to current-price response among mature response yields an estimate ofthe net discount rate. Decomposing prices into a permanent and a transitory component, I use maturesmokers’ responses to separate out a short-run discount factor β ∈ [0.70, 0.83] and long-run discountfactor δ ≈ 0.9. Because projection-biased agents do not fully appreciate the effect of current consumptionon future behavior, they under-appreciate the “option value” of remaining unaddicted in an uncertainprice environment. With an additional assumption on the utility function, I use the under-response toprice volatility to estimate a degree of projection bias α ∈ [0.41, 0.49], meaning that people underestimatechanges in tastes from smoking behavior by 40%. Overall, I find that a combination of present biasand projection bias explains significant deviations from rational cigarette consumption. Manipulating theprice path to take full advantage of these biases could increase the cost in present value that a consumerpays for a lifetime smoking habit by a factor of 3. I find that relatively large taxes—between $8 and$11—would correct for the overconsumption caused by the two psychological biases.

∗University of California, Berkeley. Email: [email protected]. I thank my advisors Matthew Rabin, StefanoDellaVigna, and Botond Koszegi for research guidance. I am also grateful to Ulrike Malmendier and Shachar Kariv for additionaladvice. I thank participants at the UC Berkeley Psychology and Economics Seminar, UC Berkeley Labor Lunch, UC BerkeleyPsychology and Economics Lunch, and Michigan State University Theory Brownbag Seminar for their helpful comments.

mailto:[email protected]

1 Introduction

Many studies have looked at the response of cigarette consumption to changes in prices and taxes.1 In general,

these studies find substantial price and tax elasticities of cigarette consumption across different income levels,

different ages, and different countries. There is also a growing collection of theoretical models describing

how agents treat addictive goods,2 which differ in both their behavioral and their welfare implications.3

However, there is little work attempting to link the two literatures and empirically identify which, if any,

model applies to cigarette addiction.4 This paper derives testable implications from a unified model of

addiction that allows the extent of two biases to freely vary, and then estimates the degree of each bias.

I present a simple model of addiction based on O’Donoghue and Rabin (2000)’s adaptation of Becker

and Murphy (1988), where infinitely-lived agents choose in every period consumption of a numeraire and an

addictive good. Addiction is captured by two key features: a negative effect of current consumption on the

utility of future selves, and a “reinforcement” effect whereby current consumption increases the temptation

to consume in future periods. I then allow for consumers to have varying degrees of two biases, with fully

rational consumers—who maximize their true lifetime utility function—embedded as a special case.5 This

case corresponds to the “rational addiction” described in Becker and Murphy (1988), where addicts correctly

assess their maximization problem and consume only when addiction is utility-maximizing.6

A growing literature—summarized in DellaVigna (2009)—finds that many people display a taste for

immediate gratification. I allow agents to exhibit this “present bias” according to the quasi-hyperbolic

discounting “β−δ” model of Laibson (1997) and O’Donoghue and Rabin (1999a). The degree of present bias

is indexed by the parameter β, which is an additional discount factor applied to any future utility in addition

to the usual exponential discount factor δ. Setting β = 1 therefore reduces to exponential discounting.

Consumers with β < 1 will be more impatient in choices involving both present and future utility—only

future utility receives extra discounting—while not overly impatient in choices involving only future utility

tradeoffs. I focus on “naive” present-biased agents, who are unaware of their self-control problem.7 Their1Becker, Grossman and Murphy (1994), Chaloupka and Warner (2000), Gruber (2001), Gruber and Koszegi (2001),

Chaloupka, Jha, de Beyer and Heller (2005), Tauras, Chaloupka, Farrelly, Giovino, Wakefield, Johnston, O’Malley, Kloskaand Pechacek (2005), Adda and Cornaglia (2006b)

2Becker and Murphy (1988), O’Donoghue and Rabin (1999b), Gruber and Koszegi (2001), Loewenstein, O’Donoghue andRabin (2003), Berheim and Rangel (2004), Gul and Pesendorfer (2007)

3Gruber and Koszegi (2000) discuss optimal taxes for present-biased consumers; Gruber and Mullainathan (2005) find thattaxes can improve happiness among those otherwise likely to have been smokers.

4The exception is Gruber and Koszegi (2001), which proposes a test of present bias but does not implement it. Becker,Grossman and Murphy (1994) test forward-looking behavior against the extreme of pure myopia, but this does not distinguishbetween the three types presented in this paper.

5By embedding the rational case within the model, I am not presupposing the existence of non-rational consumers.6Later rational-addiction models, such as Orphanides and Zervos (1995), allow initial uncertainty to produce both ex post

inefficient addictions and ex post inefficient abstentions. Once addicted, however, steady-state behavior will mimic that inBecker and Murphy (1988).

7In contrast, “sophisticated” present-biased agents are aware of their self-control problem and will seek out commitment

1

taste for immediate gratification causes them to under-weigh the negative effect that consuming the addictive

good will have on their future selves, while their naivete causes them to underestimate the extent to which

their future selves will be tempted to consume the addictive good. These two forces can lead to costly

overconsumption, as well as mistaken beliefs about future consumption paths. In particular, it can lead to

procrastination in which an addict always hits today and always plans to start refraining tomorrow. Being

patient in the long-run, such an addict will want to quit smoking. Her short-run impatience, however, causes

her to prefer quitting tomorrow over quitting today, even in a stationary environment. Because she does not

realize that she will also want to delay when tomorrow arrives, she will procrastinate forever on quitting.

Finally, consumers may under-appreciate the extent to which changes to their “state”—in this context,

addicted or unaddicted—will affect their utility. The “simple projection bias” of Loewenstein, O’Donoghue

and Rabin (2003) allows consumers to maximize a weighted average of their utility conditional on their

true future state, and their utility conditional on their state remaining constant. The extent of consumers’

projection bias is indexed by the parameter α ∈ [0, 1], with α = 0 embedded as the fully rational case.8

Because they do not currently crave the addictive good, unaddicted consumers underestimate how much

current consumption will make their future selves crave it. Currently addicted consumers, conversely, fail to

appreciate the extent to which refraining will decrease their future cravings. This can lead both addicts and

non-addicts to overconsume the addictive good. It can also cause unexpected revisions to the consumption

plan if a person intends to consume only for a short duration but, finding herself unexpectedly addicted and

her marginal utility unexpectedly high, then continues to consume in all periods.

In order to yield testable predictions, I place three restrictions on the instantaneous utility function.

First, it is quasi-linear in consumption of a numeraire and an addictive good. Second, utility from the

addictive good can be further separated into two components: one that depends on the agent’s level of

addiction, and one that depends on the period. Finally, I assume that consumers’ lives can be divided into

their “youth” and their “maturity”. The period-dependent component of utility is constant within each

phase, and normalized to be zero during smokers’ adulthood.9 The non-stationarity introduced by this

last restriction is intended to help the rational model explain actual smoking patterns. Setting the value

during maturity to the youth value reduces to the stationary model often used elsewhere. These assumptions

narrow the range of plans that consumers may form, enabling the estimation of behavioral parameters from

devices to restrain their future selves. Although I include them in the model in Section 2, I later rule them out generically usingthe restriction that sophisticates, like fully rational consumers, must have rational expectations.

8That is, if a consumer’s current state is sc but her actual future state will be given by sf , she believes her future utility willbe U(x) = (1− α)U(x|sf ) + αU(x|sc).

9I assume that teenagers are still in their youth, while smokers over 30 or 35 are in their adulthood. The distributionof maturity dates is investigated in Section 5.2.3. This assumption is a simplified version of the “youthful preferences” inO’Donoghue and Rabin (2000), which allow for a more gradual decay over time in the instantaneous incentive to consume theaddictive good.

2

responses to different types of price shocks.

Section 2 develops the model and tests that distinguish between biased and unbiased agents based on

their consumption responses to anticipated changes in cigarette prices. In doing so, I am assuming that

smokers hold rational expectations of the future cigarette price but do not have perfect foresight: the

realized future price will be higher than an individual’s expectation about as often as it is lower. To avoid

strong assumptions on smokers’ ability to predict prices, Gruber and Koszegi (2000) use high-frequency

smoking data from the National Vitality Statistics Natality Data in order to identify tax changes that

have been legislated but not yet implemented, and still find a future-price effect. In order to use the more

representative data from the NHIS, however, I can only look at tax and price changes at an annual frequency.

I instead use current-period information—taxes and state budget tightness—as instruments for future prices

in order to address consumers’ imperfect foresight and avoid attenuation bias from their noisy predictions.10

While I assume that consumers make use of the available information, this approach does not require the

seeming clairvoyance assumed by some previous research, such as Becker, Grossman and Murphy (1994),

that have included future prices directly into the regressions.11

Because current consumption affects the marginal utility from smoking in the future, consumers who

expect to smoke in the next period will react immediately to an anticipated future price change. Alterna-

tively, if an unaddicted consumer knows a price increase will cause her future self to not smoke, she may

choose not to start smoking so as to avoid withdrawal in the future. Because present-biased agents heavily

discount the costs to future selves, they will under-respond to anticipated future price changes. The ratio of

the response to a change in the expected next-period price to the response to a change in the current price

will be exactly the net discount factor βδ.12 A different function of β and δ can be estimated by comparing

reactions to transitory and permanent price changes. Because the response to a permanent shock is the

discounted response to a series of transitory shocks, the ratio of the responses will yield 1 + βδ/(1 − δ).

These estimates can be combined to separate out each discount factor.

Next, I consider the effect of changes in the variance of the cigarette price distribution on smokers with

different degrees of biases. Rational smokers will react to volatility because remaining unaddicted can be

viewed as a “real option”. While an addict will be stuck hitting for high realized prices, a non-addict can

always start hitting if prices are low. For example, if there is uncertainty about whether a large excise10The use of budget tightness as an instrument is similar in spirit to the enacted-but-not-implemented taxes used by Gruber

and Koszegi (2000), as budget tightness proves to be a predictor of the enactment of new taxes.11In Section 5.3, I explore a departure from rational expectations that, while it can explain some features of the data, cannot

account for the full set of results explained by the model of psychological biases.12This is also a feature explored in Becker, Grossman and Murphy (1994). In that more restrictive model, however, this ratio

yields the single discount factor δ rather than the net discount factor βδ. The authors note that “the estimates are not fullyconsistent with rational addiction, because the point estimates of the discount factor ... are implausibly low.” This paper willalso estimate a low value for this ratio, but is able to interpret this as β < 1 rather than implausible long-run impatience.

3

tax will be applied in the next period, there is value in remaining un-addicted until the uncertainty is

resolved. Because projection-biased agents do not appreciate how much current consumption affects their

future cravings, they will tend to underestimate the option value. As a result, they will under-respond to

changes in volatility relative to a rational or purely present-biased agent.

The data are presented in Section 3. To estimate consumers’ responses, I use repeated cross-sectional

data on self-reported smoking status and number of cigarettes from the National Health Interview Survey

for a large sample over a 26-year span. The individual-level data in the NHIS provides better identification

than the state aggregate sales data used in Becker, Grossman and Murphy (1994), and is more representative

than the National Vital Statistics Natality Data used in Gruber and Koszegi (2001). It also allows me to

estimate responses separately for youth smokers and for mature smokers, as well as control for demographic

characteristics.

To estimate the prices facing smokers, I form metropolitan statistical area (MSA)-level price indexes.

These reflect an average of national, state, and local taxes faced within the MSA. In order to deal with the po-

tential endogeneity of both prices and future taxes with respect to current smoking rates, I use current taxes

and states’ budgetary tightness as an instrument for prices. Because most states have a balanced-budget

requirement, budget shortfalls necessitate revenue increases. “Sin taxes” may be a politically expedient way

to close a budget gap.13 Budget tightness proves to be a predictor of tobacco tax increases, and therefore

of price, in the following year. I also decompose prices into permanent components and transitory shocks

using taxes and agricultural variation in Kentucky burley tobacco—an important constituent of domestic

cigarettes.14 MSA fixed effects and a time trend are included in the decomposition.

I then test the above predictions empirically in Section 4. I find that smokers differ from the rational

model in ways that suggest both present bias and projection bias. Young smokers react to the current price

of cigarettes but have almost no response to future price changes. This contrasts with the finding for mature

smokers, who respond both to current and anticipated future prices. A 1% increase in the current price of

cigarettes lowers the probability of smoking between 1% and 2%, with the effect monotonically decreasing in

age. Examining the responses to permanent versus temporary price changes for mature smokers, I estimate

β = 0.7 and δ = 0.89, suggesting that present bias is a significant factor in smoking behaviors. Finally,

there is strong evidence that price volatility per se increases the likelihood of smoking. A 1% increase in the

standard deviation of price increases the probability of smoking by 0.02 percentage points—approximately

one-tenth the effect of a direct 1% decrease in the current price level. One would expect a negative, or at least13Gul and Pesendorfer (2007), for example, discuss a $0.61 increase in the Connecticut cigarette tax, which was “expected to

be used primarily to reduce the budget deficit.”14Poor growing conditions in a given year will increase the price temporarily in a way that is uncorrelated either with demand

characteristics or with future prices.

4

much smaller, response from agents without projection bias. Further restrictions are needed, however, to

estimate the α this implies. In Section 5, I linearize the addictive component of utility and make additional

distributional assumptions about prices. With these additional assumptions, I estimate a degree of projection

bias α ∈ [0.41, 0.49], meaning that people underestimate the changes in tastes caused by smoking behavior

by approximately 40%.

I conclude in Section 6 by considering the welfare effects of the estimated degrees of bias. Whereas fully

rational agents without credit constraints should care only about the lifetime costs of smoking, I find that

manipulating the price path can “trick” a consumer into accepting an expensive lifelong addiction. Because

of her impatience in early periods, a consumer will be tempted to smoke if prices start low and then increase

sharply later—much in the way Shui and Ausubel (2004) find that consumers are drawn to credit cards with

low “teaser rates”. Moreover, projection bias in later periods causes mature smokers to underappreciate the

possibility of quitting. I estimate that a person with the estimated degrees of bias could be a lifetime smoker

at a cost 2–3 times as great in present value as another price path that she would refuse. I also estimate

that a corrective tax of between $8 and $11 would be needed for mature smokers to consume optimally. This

tax does not include the effects of secondhand smoke or the effect of premature deaths on the government’s

budget constraint, but focuses only on the intervention necessary to correct the over-consumption due to

the psychological biases.

2 Model

2.1 General Setup

Following O’Donoghue and Rabin (2000)’s simplification of Becker and Murphy (1988), consider a setting

with infinitely-lived agents who consume two goods in every period: an addictive good a ∈ {0, 1} and a

numeraire y ∈ <+. Consuming a = 1 is considered “hitting”, while consuming a = 0 is “refraining” from

the addictive good.15 The utility derived from hitting or refraining depends on the accumulated “stock of

addiction” k ∈ <+, with the initial taste for the addictive good drawn from some distribution κ(·).

I let the addictive stock evolve according to:

kt+1 =

γkt + 1, if at = 1

γkt, if at = 0(1)

15Although smokers can indeed choose their nicotine consumption from a continuum of possible levels, formulating the problemas a binary choice captures the intuition that many smokers choose between smoking at a set level (e.g. a pack a day, half a packa day, etc.) and quitting. Gruber and Koszegi (2001) derive a model for present-biased agents that can choose a continuousquantity of the addictive good, and obtain qualitatively similar results.

5

The parameter γ ∈ [0, 1] thus captures how quickly the addictive stock evolves. For example, γ = 0

implies that a consumer is either completely addicted or completely unaddicted as a result of her behavior

in the previous period. That is, consuming just once puts her in a fully addicted state, but she can become

unaddicted again by refraining for just one period. Larger values of γ capture a slower addiction process,

where it takes time both to become addicted and to go through withdrawal.16 The value of γ will influence

the magnitudes of how consumers react any particular shock to the price of the addictive good, but it will

not affect the ratio of how they react to different shocks. It will also determine the maximum steady-state

level of addiction: kmax = 11−γ . For generality, I do not assume a specific value of γ.

The key features of the evolution of the addictive stock are that past consumption smoothly increases

the current addiction level, past abstention decreases it, and that the effect is stronger for more recent past

consumption. Alternative formulations that preserve these features will not substantively alter the model.

Conversely, functional forms like kt = max{aτ |τ < t} that eliminate the possibility of becoming un-addicted

would change many of the predictions of the model. Although one could think of many alternative forms,

a geometric process has been used in the economics literature since Becker and Murphy (1988) and is used

here for simplicity.17

To abstract away from the wealth effects from choosing to consume the addictive good, consumers’

instantaneous utility function is quasilinear in their consumption of the the numeraire yt and the addictive

good at:

ut(at, yt|kt) =

yt + ft(kt), if at = 1

yt + gt(kt) if at = 0(2)

The functions ft(·) and gt(·) are continuously differentiable and, because this is a harmful good, Iassume:18

f ′t(k) < 0 ∀ tg′t(k) < 0 ∀ t (3)

The negative effect of past consumption thus comes indirectly through its effect on the current addictive16Using just one parameter γ to describe the process embeds a degree of symmetry in the processes of becoming addicted and

becoming unaddicted. This assumption is not important for the results in this paper, and can be relaxed by assigning separateparameters to the cases where a=1 and where a=0. The assumption that the addictive stock decays geometrically, however, isstandard in the literature.

17Becker and Murphy (1988) assume the stock of addictive capital, S(t), evolves according to S(t) = c(t) − δS(t) − h[D(t)],where c(t) is consumption and D(t) is expenditures on endogenous depreciation. The model in this paper is identical if h(·) = 0and (1− δ) = γ.

18A negative first derivative stands in contrast to what can be considered positive addictive goods, such as exercise. Suchgoods would be underconsumed by the biased types in this model, rather than over-consumed. The set of consumption plansallowed under positive addictive goods, however, is inconsistent with the patterns observed in the data. More directly, themedical consensus is that the delayed health consequences of smoking are indeed negative (DHHS, 1982).

6

stock. Increased past consumption raises kt, and therefore lowers current utility regardless of whether one

hits or refrains today.

The second effect of past consumption on present utility comes through reinforcement—the more one

has consumed in the past, the more tempting it is to hit today. Let the instantaneous temptation to hit be

given by ht(k) = ft(k)− gt(k). Reinforcement requires that:

h′t(k) = f ′t(k)− g′t(k) > 0 ∀ t (4)

The exact functional form of ft(·) and gt(·) will, along with γ, determine the magnitude of consumer

responses to price shocks. For now, I assume that f(·) and g(·) are each weakly convex.19 In Section 5,

following O’Donoghue and Rabin (2000), I linearize these addiction functions.

Finally, assume that the price of the addictive good in period t is pt and let the consumer’s disposable

income in period t be Yt.20 Instantaneous utility can then be re-written only as a function of the addictive

stock and the decision of whether to hit or refrain:21

ut(at | kt) =

[Yt − pt] + ft(kt), if at = 1

Yt + gt(kt), if at = 0(5)

The model assumes that all agents have perfect information about the health risks of smoking. It is

possible that if there is significant heterogeneity in beliefs about the costs of smoking, differences in beliefs

may lead to sorting: those with relatively positive beliefs about cigarettes choose to smoke and those with

relatively negative beliefs refrain. It is unclear to what extent information drives smoking behaviors within

the United States: while Cutler and Glaeser (2008) attribute almost half the difference in smoking rates

between the US and Europe to differences in beliefs about the health effects of smoking, Khwaja et al.

(2008) find that mature smokers in the US are either correct or too pessimistic about their health risks.

While heterogeneity in beliefs may change the mean smoking rate and the absolute magnitude of price

responses, it will not affect the ratios used for identification in this paper. A time trend is included in all

regressions to control for any changes in information about smoking over time (or other secular changes in

the attractiveness of smoking).

The preferences described above are general preferences over an addictive good, and say nothing about19This will understate the harm from smoking if f(·) and g(·) are concave, and will bias the results towards the rational case.20Assuming that spending on the numeraire is large relative to spending on the addictive good, one may interpret Yt as the

portion of permanent income allocated to period t.21I assume that all agents know the functions ft(·) and gt(·). That is, smokers are aware of the heath and welfare consequences

of smoking. Previous studies suggest that this may be a reasonable approximation for most people (Viscusi 1990, Schoenbaum1997, Khwaja, Silverman, Sloan and Wang 2008).

7

a consumer’s degree of rationality. The psychological biases introduced in the following section do not

change the underlying instantaneous utility function, but rather represent systematic departures in how

consumers maximize this utility. While the list is not exhaustive in the set of explanations it considers, I

focus on projection bias and present bias as important examples. For example, I omit among others the

“cue-triggered decision processes” of Berheim and Rangel (2004) and the preferences over choice sets of Gul

and Pesendorfer (2007).22

2.1.1 Rational Consumers

As Becker and Murphy (1988) show, there is no reason why a fully rational agent could never consume an

addictive good. If the benefits are large enough, or the future costs sufficiently small, it may be utility-

maximizing to hit. In this model, they will tend to do so whenever ht(·) is large or their initial addictive

stock k0 is large. Let the discount rate be given by δ. Formally, the fully rational consumer forms a period-t

consumption plan at = (at, at+1, at+2, ...) in order to:

maxat

Ut(kt) =∞∑τ=t

δτ−tut(aτ |kτ ) (6)

s.t. kτ+1 = γkτ + aτ

Yτ − pτaτ ≥ 0

That is, fully rational consumers form a consumption plan that maximizes their true utility subject to

their budget constraint and the actual addictive process. Because fully rational consumers with perfect

information about the addiction functions only consume the addictive good when the upfront benefits out-

weigh the future costs, they will be responsive to any anticipated future price shocks that affect the cost of

future consumption.23

2.1.2 Projection-Biased Consumers

Projection-biased consumers do not appreciate the degree to which their “state” in the future will differ

from their current “state”. For example, Read and Van Leeuwen (1998) show that people who are currently

hungry act as though their future selves will also be relatively hungry, and people who are currently sated22While present bias and projection bias make firm predictions about how consumers react to different types of price shocks,

other models are more ambiguous. Cue-triggered addiction or utility over choice sets do make predictions about other facets ofbehavior, but they will not be well-identified using the approach of this paper.

23This forward-looking behavior has been used to test the rational model against the alternative model of complete myopia(Becker and Murphy (1988), Becker, Grossman and Murphy (1991)). However, both types of biased consumers presented hereinwill also display forward-looking behavior in many circumstances. The tests in Section 4 are designed to distinguish betweenthese alternative models.

8

behave as though their future selves will also be relatively sated. Similar effects have been shown for the

“endowment effect” (Loewenstein and Adler 1995), sexual arousal (Ariely and Loewenstein 2005), weather

and catalog orders (Conlin, O’Donoghue and Vogelsang 2007), adaptation to changing life circumstances

(Gilbert, Pinel, Wilson, Blumberg and Wheatley 1998, Loewenstein and Frederick 1997), and, indeed, drug

addiction (Giordano et al. 2002).

Projection bias may be very important in the consumption of additive goods if people do not appreciate

the degree to which they will become addicted, or conversely if addicts do not appreciate that they can

become un-addicted. In either case, the effect of projection bias will be to increase consumption relative

to the perfectly rational case. It may also introduce some time-inconsistencies between her plans and her

behavior, if future selves find themselves experiencing an unexpectedly strong addiction.

Following Loewenstein, O’Donoghue and Rabin (2003), a projection-biased agent believes her future

utility will be a weighted sum of her actual future utility and her utility if she remained in her current state.

The weight she places on the incorrect term, α, indexes the degree of the bias.24 She then forms a period-t

consumption plan at in order to maximize her (incorrect) assessment of her lifetime utility:

maxat

Ut(kt) = u(at|kt) +∞∑

τ=t+1

δτ−t[(1− α)u(aτ |kτ ) + αu(aτ |kt)

](7)


Yτ − pτ aτ ≥ 0

The projection-biased consumer then consumes the first component of her plan in the current period. In

the next period, however, she re-optimizes according to (7) and may form a very different plan.

This formulation of projection bias is different from other forms of ignorance about the addictive process.

A projection-biased person has the correct model of f(k) and g(k), just as a sated person knows what it

feels like to be hungry or an unaroused person knows what it feels like to be sexually aroused. The mistake

is strictly about the evolution of their state k—they believe it will not differ as much from their current state

as it truly will. One key difference between this model of projection bias and simply overestimating f ′(k),

for example, is that projection bias leads both addicts and non-addicts to over-consume. An addict who hit

due to an early mistake about f ′(k), but who did not suffer projection bias, would not make mistakes in her

consumption going forward.

24Alternatively, one could imagine projection-biased people making a mistake directly about kτ of the form kτ = (1−α)kτ+αkt,and then believing their future utility to be u(aτ |kτ ). If f(·) and g(·) are linear, this approach is identical to the one used inthis paper.

9

2.1.3 Present-Biased Consumers

Consumers with present bias are impatient—they value consumption in the present at a premium to any

delayed consumption. This taste for immediate gratification has been identified in a range of contexts, from

long-term savings behaviors by Angeletos, Laibson, Repetto, Tobacman and Weinberg (2001) to caloric

intake by Shapiro (2005). An overview of the literature is provided in DellaVigna (2009). Following the

formulation of Laibson (1997), the degree of present bias is captured by an extra discount factor β < 1 applied

uniformly to all future periods. As a consequence of this short-run impatience, present bias generates time-

inconsistency. For example, because she is patient about long-run decisions, a consumer in period 1 may

prefer having her period-2 self refrain from smoking. When period 2 arrives, however, she is now trading off

long-run costs against instantaneous benefits and may be tempted to hit. O’Donoghue and Rabin (2000) and

Gruber and Koszegi (2001) show that present bias can lead biased consumers to greatly overconsume the

addictive good, at a substantial welfare loss. Present-biased consumers can be divided into two categories

based on whether they are aware that their future selves will also be present-biased: naifs are unaware of

this fact, while sophisticates are aware of it.

A naive present-biased agent has β < 1, but does not recognize that her future selves will also be

impatient—following O’Donoghue and Rabin (1999a), she believes that β = 1 for all future selves. The

problem for a naive present-biased agent is then to form a consumption plan at in order to maximize her

current self’s lifetime discounted utility, since she does not realize her time-inconsistency. That is, because

she believes her future selves will not feel impatient in the way she does, she believes they will behave in

a manner consistent with the plans she forms today. She therefore performs the simple optimization given

here, and forms the plan at. Her maximization problem is:

maxat

Ut(kt) = ut(at|kt) + β

∞∑τ=t+1

δτ−tu(aτ |kτ ) (8)



Having formed at, the naive present-biased consumer does consume the first component, at, in the

current period. In the next period, however, she forms a new plan according to (8) which may differ from

her plan at time t. For example, she may expect to hit today when she is impatient but refrain in all future

periods when she thinks she will be patient. In the next period, however, she finds that β < 1, and hits

again. This naivete about present bias may therefore lead a consumer always to think she is going to quit,

but procrastinate forever in doing so.

10

Sophisticates are aware of their taste for immediate gratification—and that future selves will also have

such a taste—and therefore must play a subgame-perfect Nash equilibrium with their future selves. Their

maximization problem is therefore the same as that of a naive present-biased consumer, but with the

additional constraint that future consumption in each period τ must be optimal from self τ ’s point of view.

The sophisticate forms a plan at = (at, at+1, ...) and chooses at in order to maximize:

maxat

Ut(at|kt) = ut(at|kt) + β∞∑

τ=t+1

δτ−tu(aτ |kτ ) (9)



aτ = 1 ⇐⇒ hτ (kτ ) ≥ βδ[Uτ+1(aτ+1|γkτ )− Uτ+1(aτ+1|γkτ+1 + 1)

]∀τ > t

where aτ = (aτ , aτ+1, ...) ∀τ ≥ t.

The solution to this problem is sensitive to the parameters of the model. On the one hand, if sophisticates

realize that their future selves will eventually start consuming regardless of their current behavior (as they

would with any finite horizon), this sense of “inevitability” makes them more likely to begin their habit

today. On the other hand, if they believe that they can overcome their future selves’ impatience by refraining

today, there is a strong “incentive effect” to do so. In general, one cannot state which effect will dominate.

The apparent myopia of young smokers, combined with the requirement that sophisticates have rational

expectations, will ultimately rule out sophisticates except for the knife-edge case where this incentive effect

exactly counterbalances the direct effect of a price increase on average

One can form some comparative statics for sophisticates—for example, O’Donoghue and Rabin (2000)

show that the addition of “youthful preferences” of the sort in Section 2.2 can cause a sophisticate to switch

from always refraining to always hitting only when it would also cause a naif to make the same switch.

Unfortunately, this highlights the fact that sophisticates in this model will not be separately identified

from fully-rational and naive present-biased agents. In some instances (e.g. when they believe they are

consuming only for a short duration), they will behave like fully rational consumers. In others (e.g. when

they believe they will consume always), they will behave like their naive counterparts. Because without

additional parametric assumptions it is impossible to predict when they will behave like each type, I focus

only on rational consumers, or naive consumers experiencing projection- and present-bias.25

25The difficulties in making firm predictions for sophisticated present-biased agents are two-fold. First, there may be multipleequilibria in the infinite-period model described here. In general, the substantive predictions will be similar across the multipleequilibria. For example, in models of procrastination such as those in O’Donoghue and Rabin (1999a), sophisticated agents maydelay doing a task for a brief number of periods but never for very long. Moreover, the equilibria of this game will be genericallyunique if one moves from an infinite horizon to a finite time horizon.

11

2.2 Consumption Plans Under “Youthful Preferences”

Without further structure on ut(at | kt), the model does not yield a closed-form solution for consumers’

plans. Moreover, any observed behavior could be reconciled with any type of consumer.26 In order to derive

testable predictions, then, we make two assumptions:

Assumption 1 (Separability) ∃ x = {x1, x2, ...}, f(k), g(k) s.t.

ft(k) = xt + f(k) ∀tgt(k) = g(k) ∀t

This assumption implies that a consumer’s utility is determined partly by a time-independent component

which depends only on her addictive state, and partly by a purely instantaneous incentive to hit.27 The non-

addictive component is normalized to zero for the case where she refrains. The time-independent component

is meant to largely capture the biological and psychological process of addiction.28 xt captures any external

incentives to hit (e.g. peer pressure) as well as any other changes in the enjoyment of the addictive good

across ages.

This assumption implies that the instantaneous incentive to hit at age t can be re-written as ht(k) =

xt + h(k) or alternatively that ∂2ht∂k∂t = 0. That is, while consumers at different points in the life-cycle may

certainly find hitting more or less tempting, the degree to which this temptation grows for more addicted

consumers does not depend on age.

Assumption 2 (Nondecaying Youthful Preferences) ∃ m, x s.t.

x1 = x2 = ... = xm = x ≥ 0; xm+1 = xm+2 = ... = 0

The second difficulty for sophisticated agents is that their comparative statics are much harder to predict. In general, theywill be “well-behaved” in the manner described later for other types of agents. However, they may sometimes behave incounterintuitive ways. Consider, for example, the following example. Agents live for 3 periods, and may “hit” or “refrain” inperiods 1 and 2, and hits if indifferent. Everyone refrains in period 3. Utility is defined as above, such that ft(k) = xt − ρkand gt(k) = −σk. Let p1 = p2 = 0, x1 = 1.2, x2 = 0, γ = 0, δ = 1, β = 1

2, ρ = 1, σ = 2 and consider the effects of applying

a tax τ = 1 in period 2. A naive consumer will hit in the first period with or without the tax—she does not expect to smokein period 2 in either case. Without the tax, the sophisticate believes she will smoke in period 2 only if she smokes in period 1.She therefore refrains without the tax in order to give her future self an incentive to refrain as well. With the tax, however,she realizes that her period-2 self will refrain no matter what, and therefore chooses to smoke in period 1. Increasing the futureprice can therefore actually raise smoking for sophisticates, if it acts as a control device for their future selves.

26Consider any plan a. It can be rationalized easily by setting ht(·) = +∞ in any period in which at = 1 and ht(·) = −∞ inany period in which the consumer abstains. Less extreme changes in ht(·) could also generate the same pattern, while keepingconsumers’ price sensitivity intact.

27A similar separability assumption is made in Becker and Murphy (1988), O’Donoghue and Rabin (2000), and Gruber andKoszegi (2001).

28Although there is some evidence that the effects of nicotine withdrawal may differ over time, there is no conclusive evidencefor how these changes occur or the magnitudes of the effects. For example, Wilmouth and Spear (1998) find that adolescentrats suffer more from cognitive disruption effects during withdrawal and adult rats suffer more from anxiety-like symptoms. Itis unclear how this could be translated into utility measures.

12

The youthful-preferences assumption is meant to capture the social pressures, physiological differences,

and other reasons why it may be quite tempting for a teen to experiment with a first cigarette, but that an

unaddicted 40-year-old would have no desire to do so.29 This version of the youthful preferences assumption

is the simplest way to capture that the addictive good is initially very tempting due to external influences,

but is not tempting later.30

If one alternatively found that x < 0, one would have a good that is not very tempting initially but

that older consumers enjoy (e.g. scotch). If this were the case, however, one would expect to see very few

consumers starting a consumption habit when young, and then initiating once reaching maturity. Neither an

intuitive assessment of cigarettes nor the empirical hazard rates of initiation (shown in Figure 3) correspond

to this story. Consumers may of course still differ on their values for m and x. Indeed it is this heterogeneity,

along with differences in the initial addictive state k0, that allows the model to account for differences in

starting and quitting behavior across individuals.

The general maximization problem for a consumer with discount rate δ, degree of present bias β, degree

of projection bias α, and period-t addiction level kt is then:

maxat

Ut(kt) = u(at|kt) + β∞∑

τ=t+1

δτ−t[(1− α)u(aτ |kτ ) + αu(aτ |kt)

](10)



Lemma 1 establishes that in the current period, all three types of consumers follow a cutoff strategy.

That is, given the price path, they will consume if and only if their current addictive state is above a certain

threshold.

Lemma 1 Given x, m, and p, and let a strategy a(k, t) denote consumption in any period t as a function

of the addiction level k:

1. There is a unique strategy A(k, t) that is optimal from self-t’s perspective

2. The optimal strategy follows a cutoff rule: A(k, t) = 1 if and only if k > kt

3. k1 ≤ k2 ≤ ... ≤ km ≤ km+1 = km+2 = ..., and x > 0⇒ km < km+1.29Alternatively, one could interpret xt as representing “divorce, unemployment, death of a loved one, and other stressful

events” that Becker and Murphy (1988) believe “stimulate the demand for addictive goods”.30The “youthful preferences” discussed in O’Donoghue and Rabin (2000) simply require xt ≥ xt+1 if t < m, and xt = xt+1 if

t ≥ m. Some structure on {xt}∞t=1 is necessary for identification. A more relaxed assumption than Assumption 2 is to restrictthe magnitude of |xt − xt+1|, t ≤ m − 1. Using this in place of nondecaying youthful preferences will not change the overallpattern of predictions, but will expand the number of cases in Lemma 2, as consumers may heterogeneously plan to consumeuntil some intermediate date s ≤ m.

13

Because of the pattern the addiction threshold follows, rational or projection-biased consumer may only

expect to consume never, to consume always, or to consume only until maturity. With the addition of

present-bias, people may form any of these plans or may plan only on yielding to temptation today and

refraining in the future. Consider a consumer after date m, when the problem becomes stationary. Whereas

rational consumers realize that consuming today will increase their temptation to consume tomorrow and

projection-biased consumers believe that tomorrow’s problem will be identical to today’s (and therefore

must believe that if they find it optimal to hit today, they will also find it optimal to hit tomorrow), a naive

present-biased consumer may believe that the slight increase in her temptation to hit will be undone by the

disappearance of β starting tomorrow. She may thus believe that today is the last time she will hit, and

that she will virtuously refrain starting tomorrow.

For all of these plans, a positive shock to the price of the addictive good in a period in which the consumer

expects to hit will lower the utility of consumption, and hence the consumer will only consume at higher

levels of addiction. The magnitude of this change in the cutoff level of addiction depends on the specific

functional form of the utility function. However, the ratios of the responses to price changes at different

times depend only on the discount rate. Lemma 2 establishes the relationships between the reactions of

consumers to different sorts of potential price shocks.

Lemma 2 Suppose pt = p for all t, m > 1, and k ∈ (0, kmax).

1. If a consumer plans to consume today and in all future periods, then:

dk1/dpτdk1/dp1

= βδτ−1 anddk1/dp

dk1/dp1=

1− (1− β)δ1− δ

2. If a consumer plans to consume today and until date m but refrain thereafter, then:

dk1/dpτdk1/dp1

=

0 τ > m

βδτ−1 τ ≤ mand

dk1/dp

dk1/dp1= 1 +

βδ(1− δm−1)1− δ

3. If a consumer plans to consume today and refrain thereafter, then:

dk1/dpτdk1/dp1

= 0 anddk1/dp

dk1/dp1= 1

Consider a marginal smoker with β = 1, so if she believes she will hit in all periods she will discount a

shock to the price in period t+1 relative to a shock in period t by a factor of δ. Because a permanent price

shock is equivalent to a temporary price shock in all periods, her response is just the sum of the responses

14

to a series of instantaneous shocks: 1/(1− δ). If she believes she will only hit until maturity, then she will

respond in exactly the same way to shocks in periods in which she expects to be hitting, and ignore shocks

in periods in which she does not expect to be hitting.

A naive present-biased agent reacts very similarly to the rational agent in most cases, except that her

response to all shocks in the future is discounted by an additional factor of β. Part (3) of Lemma 2 further

allows that she may expect to hit only in the current period. In that case, she ignores all future price shocks

and reacts equally to a temporary and to a permanent shock.

A projection-biased agent forms the same set of plans as the rational agent. The key difference, however,

is that projection-biased people may not follow through on their plans later. That is, a projection-biased

agent may believe she will only smoke until maturity and therefore ignore any price shocks after date m, even

though she may also then wind up continuing to smoke forever. Because a rational agent’s plans coincide

with her actual consumption, she can only ignore those prices which truly will not affect her. Thus two

consumers whose observed consumption history is identical may nonetheless differ in their response to price

shocks.

Moreover, because consumption plans may depend on whether a consumer is mature—that is, whether

t > m—Lemma 2 has different implications for youth smokers and adult smokers. It is therefore possible to

distinguish between fully rational young smokers and biased ones using their future-price responses:

Proposition 1 Let π(p|X) be the probability of being a smoker as a function of the price path and a vector

of individual characteristics.

1. If young smokers are fully rational and expect to smoke in the next period, then ∂π/∂pt+1

∂π/∂pt= δ > 0

2. If young smokers are (naive) present-biased or projection-biased, then ∂π/∂pt+1

∂π/∂pt∈ {0, δ, βδ}

That is, forward-looking behavior is consistent with either rational or biased agents, while myopic be-

havior is only consistent with biased agents.31

Using self reports not only of current smoking status but also of the age of smoking initiation and

cessation, it is possible to estimate hazard rates of smoking. Although this data is not available in all years,

Figure 3 plots the results from the 1987 NHIS. The overall quitting hazard rate is very low—below 2%

through the first 20 years of a consumption spell. In order for young, fully-rational smokers to expect on

average that they will only smoke during the current year, the quitting hazard in the first several years of

a smoking spell would necessarily be much greater. Because rational agents must have correct beliefs on

average, this is taken as evidence that rational teen smokers must expect to smoke in the future as well.31Provided that δ > 0. There is some evidence that teenagers do have a finite discount rate.

15

Older smokers, because they are past date m, are restricted in the consumption plans they can form.

In particular, both fully rational and projection-biased smokers may only believe they will consume forever

or consume never. Present-biased smokers may form either of these plans, or may additionally believe they

will consume only in the current period.32

Lemma 2 established how mature smokers will react to anticipated future price changes relative to current

price changes, and to permanent changes relative to temporary changes. These two predictions allow for β

and δ to be separately identified:

Proposition 2 Let π(p|X) be the probability of being a smoker, and let the price be the sum of a permanent

component and a transitory shock: pt = pPt + pTt . Define d = ∂πt/∂pt+1

∂πt/∂pt.

1. If mature smokers are not present biased, then d = δ and ∂πt/∂pPt∂πt/∂pTt

= 11−d

2. If mature smokers are present biased, then d = βδ, and ∂πt/∂pPt∂πt/∂pTt

= 1−(1−β)δ1−δ > 1

1−d

Because present-biased consumers will somewhat over-react to temporary shocks relative to permanent

ones, but greatly over-react to current shocks relative to delayed ones, Proposition 2 enables one to estimate

the degree of present bias.

2.3 Effects of Price Volatility

The preceding analysis assumes that consumers have a single prediction for the price in the future. When

consumers have a probability distribution over future prices, the main analysis is unchanged. Although

the overall attractiveness of smoking may decrease, the ratios of elasticities calculated do not change. The

following section analyzes how smokers will respond to different levels of volatility.

For the following analysis, consider only a two-period model. This is necessary for tractability, as agents

can form consumption plans that are conditional not only on the current price, but on the entire history of

prices that has occurred at time t. Unlike in the world with single expectations of prices, consumers explicitly

considering uncertainty can rationally form plans other than “consume never” or “consume always” by setting

period-specific conditional cutoffs. Reducing the number of periods to two clearly avoids complicated plans,

and conveys the basic result.

Now let pt = pt+νt, with νt ∼ i.i.d. N(0, s2). That is, the current price in all periods varies independently

about a known mean. The assumption of a normal error term is not important, except inasmuch as it32Hammar and Carlsson (2005) find in a survey of mature smokers that only “5% of the respondents stated that it is very

likely that they will quit smoking.”

16

guarantees a nonzero support at all levels of addiction for the initiation decision.33 The following Lemma

establishes the effect of changing the price volatility s2 on different types of consumers.

Lemma 3 Let pt = pt + νt, with vt ∼ i.i.d. N(0, s2) and t ∈ {1, 2}. Let π(p|X, s2) denote the probability

the consumer hits in period 1. If the period-1 addiction level is fixed at k1, then:

1. ∃ d s.t. if βδ > d and α = 0, dπds2

< 0.

2. If the consumer is fully projection biased (α = 1), dπds2

> 0.

3. If the consumer is partially projection biased (α ∈ (0, 1)), d2πds2dα

> 0.

Lemma 3 states that for fully rational people, increasing the volatility of prices decreases the attrac-

tiveness of consuming the addictive good. They realize that their current choice will influence their future

selves’ incentives to hit or refrain. If tomorrow’s realized price turns out to be exceptionally low, they can hit

whether they are unaddicted or addicted. If the realized price turns out to be very high (and the addiction is

strong enough), the consumer will hit only if they are in an addicted state. Because of this, refraining today

in a volatile price environment has a real option value—the consumer can always choose to become addicted

later on. Because the value of this option is increasing in the volatility of the price, rational consumers will

lower their threshold price for smoking as the price volatility increases.

Present-biased people will also react by lowering their threshold price for consumption today, but to a

lesser degree. In particular, their response will be β times the response of a fully-rational consumer. As long

as β and δ are sufficiently high, they value the option enough that they lower their threshold sufficiently to

overcome the increased probability of extremely low prices being realized.34

Fully projection-biased people do not realize the effect that today’s decision has on tomorrow’s incentives.

As such, they do not adjust the cutoff price for hitting today. Increased price volatility increases the

probability that today’s price realization falls beneath their threshold, and that they choose to hit. In this

sense they are “fooled” into consuming—they believe they are just taking advantage of particularly low

prices and do not realize the effect they are having on their future self. Partial projection bias may look like

either type, with their reaction becoming monotonically more positive as α increases to 1.

33We make the technical assumption that p and s are such that dΦ(x2+h(γk1))

ds2< 0. If this assumption is not met, the

predictions for projection-biased agents are unchanged. The predictions for rational agents, however, become ambiguous.34If instead an agent were extremely impatient and had β = 0, they would not lower their price threshold at all and would

resemble the projection-biased agent.

17

3 Data

3.1 Smoking Outcomes

All outcomes on smoking behavior are drawn from the National Health Interview Survey (NHIS) between

1965 and 1991. The NHIS is a large, national, repeated cross-sectional survey which in 15 years during

this time included questions on smoking behaviors.35 The smoking questions were included either in the

core set of questions or in various supplements. Specific questions include current cigarette consumption

and past consumption, as well as broader questions such as “Have you ever smoked 100 cigarettes over your

lifetime?” Despite the potential for some under-reporting, validation studies have found interview questions

on smoking to be generally reliable (Patrick et al., 1994).

In addition to smoking outcomes, the NHIS provides an array of broad demographic, economic, and

health-related data. Education and other socioeconomic factors are known to correlate with smoking be-

haviors, and are used as controls in all regressions.36 Respondents in the survey are classified into 31 large

metropolitan statistical areas (as defined by the Bureau of the Census) or as other. Only the urban sample

is used.

Table 1 presents summary statistics for the full urban sample, and separately by smoking status. Smoking

status is defined by the current number of cigarettes consumed daily, with any positive number being

considered a smoker. There is no difference in the average age of smokers and nonsmokers, although the

higher standard deviation of nonsmokers’ age reflects some differences in composition of the groups: both

young people and the elderly are more likely to be nonsmokers. Nonsmokers are slightly more likely to be

female and slightly less likely to be black. Differences in education status also reflect the age compositions

of the two groups. Regressions of smoking status on the demographic controls are presented in Appendix

Table A-2, and confirm the pattern seen in the means: smoking correlates positively with age and non-white

race groups, and correlates negatively with age2, female, and high levels of education.

Among smokers in the sample, the average number of cigarettes smoked is 18.79—just under a pack a

day. This figure may not be directly comparable to more recent data, as the mean number of cigarettes per

smoker has fallen considerably in recent years. According to the CDC, the mean daily cigarettes per smoker

was just under 20 prior to 1995, but had fallen to 13.9 by 2006.37

351965-66, 1970, 1974, 1976-80, 1983, 1985, 1987-88, 1990-91.36For example, Townsend, Roderick and Cooper (1994) and Kenkel, Lillard and Mathios (2006) find that smoking behaviors

and cigarette price elasticities vary by age, education, and income groups. Arnold et al. (2001), however, find that even thoughthe perceived health risks of smoking are correlated with reading level among low-income women, actual smoking rates are not.

37(CDC, 2006. http://www.cdc.gov/tobacco/data_statistics/tables/adult/table_4.htm)

18

http://www.cdc.gov/tobacco/data_statistics/tables/adult/table_4.htm

3.2 Price Measures

Price data on cigarettes are taken from the Tobacco Tax Council’s “The Tax Burden on Tobacco”, which

contains annual state-average cigarette prices and taxes. The prices and taxes are collected in November of

each year, and because the analysis in this paper relies on changes in prices rather than levels, no seasonal

adjustment is performed. Taxes are a statewide average of federal, state, and local excise taxes. Prices and

taxes are then translated into real terms using CPI data from the Bureau of Labor Statistics.

Figures 1 and 2 show real prices and nominal taxes over time, by Metropolitan Statistical Area. Changes

in nominal taxes are lumpy in general and, although there is substantial variation in the timing between

MSAs, tended not to occur during the 1970s. Because taxes are a large component of the retail price, this

creates a U-shaped pattern in the real price. During the 1970s, high inflation and few tax increases actually

caused the real price of cigarettes to fall. The real price during the 1980s and 1990s then rose largely due

to a combination of lower inflation and more tax increases.

Ideally, one would want to know the exact price faced by every individual in the NHIS sample. Even

knowing the average prices and taxes in the exact jurisdiction where an individual lives, however, would not

be sufficient for calculating the effective price they face. Not only may they purchase their cigarettes in a

nearby jurisdiction, the exact location at which they purchase (e.g. convenience store, gas station, etc.) and

the quantities they buy at one time (i.e. by carton or by pack) will generate substantial heterogeneity in

their effective price.38

An individualized price is neither reported in the NHIS nor calculable from any potential data. Instead, I

assign to each individual the average price and taxes for her MSA of residence. These averages are calculated

as follows. The state average price and tax level is first assigned to each county in a MSA. The MSA price

level is then the average (weighted by population) of all prices of its constituent counties. For a MSA located

entirely within one state, this simply returns the state average. For MSAs comprising more than one state,

it will be a mixture of these states. This reflects the fact that a smoker in Washington, DC may be affected

not only by DC prices but by those in Maryland and Virginia. It also makes my price measure more robust

to the short-distance smuggling discussed in Becker, Grossman and Murphy (1994) and Gruber and Koszegi

(2000).

It will be useful to decompose the price level into a permanent component and a transitory shock—that

is, to rewrite pt = pTt + pPt . To do this, I observe that tobacco is an agricultural product and hence is

dependent on the growing conditions of that particular year. Rainfall, the length of the growing season, and38It is also possible that teen smokers rely on friends or parents for their cigarettes, and therefore do not face any price at

all. Cummings, Sciandra, Pechacek, Orlandi and Lynn (1992), however, find that 88% of regular teen smokers usually purchasetheir own cigarettes.

19

other growing conditions affect the quality (as defined by the United States Department of Agriculture) of

the tobacco crop, and are not correlated across years. Moreover, the amount of land eligible for tobacco

farming is tightly regulated under two New Deal-era laws: the Agricultural Adjustment Act of 1933 and

the Agricultural Adjustment Act of 1938.39 As a summary statistic of a year’s growing conditions, I use the

spot market price of Kentucky Burley tobacco, as reported by the United States Department of Agriculture

Consumer and Marketing Service, Tobacco Division (1970–1990). Kentucky Burley tobacco proves to be a

predictor of the current year’s price, but not future prices (see Appendix Table A-3.).

4 Results

4.1 Young Smokers

To test Proposition 1, I estimate the probability of young respondents being a current smoker. In order

to control for potential differences across time and across metropolitan areas, the regressions include year

dummies and MSA-specific time trends.40 A vector of individual controls is also included, comprising age,

age2, race, and education.41 The estimation equation is given by

Pr(smoker = 1) =θ1 ∗ pricejt + θ2 ∗ pricej,(t+1) + θ3 ∗Xit + λj (11)

+ φj ∗ yeart + ηt + εijt

where the left-hand side variable is the probability of being a smoker for an individual i, in metropolitan

area j, in year t. Note that Proposition 1 implies a one-sided test, and does not allow for the future-price

elasticity to exceed the current-price elasticity.

The prices in an MSA, however, may be endogenous. In particular, they may depend on smoking

rates among youths. Although young smokers represent only a fraction of the entire market (and behave

quite differently from the majority of their co-consumers), they represent a very important sub-market from

the perspective of tobacco companies.42 Facing a drop-off of teen smoking, tobacco companies may lower

prices temporarily in order to attract new customers. State legislatures may also raise taxes (an important39Chaloupka and Warner (2000) provide an overview of the U.S. tobacco agriculture regulatory system.40Because prices are measured at the year-by-MSA level, it is not possible to separately include year-by-MSA dummies.41Much of the identification strategy of this paper relies on the fact that age is more than just a control. It is exactly the

separation of smokers into “young” and “mature” categories that allows me to distinguish the various possible biases. However,after including dummies for this purpose, I leave in age as a control for other possible unobservables. Regressions that excludeage and age2 as controls (available on request) yield virtually identical coefficients on regressors of interest.

42Because nicotine is addictive, it is in the interests of tobacco companies to get their customers “hooked” when they areyoung. According to the Department of Health and Human Services, 90% of all adult smokers began smoking when they wereteens. (Department of Health and Human Services 1994)

20

component of the variation in tobacco prices) in response to an increase in teen smoking, so future taxes

may also be endogenous.

To alleviate this problem, I instrument for prices with current taxes and the tightness of state budgets as

measured by the percentage gap between general fund expenditures and general fund revenues. According

to the General accounting office, 48 states have balanced-budget requirements.43 Whereas midyear budget

gaps are primarily closed through spending cuts and other “one-time actions”, gaps prior to the enactment

of a budget are largely closed through revenue increases. A budget shortfall can thus predict increases in

all taxes in the following year. Sin taxes in particular may be a politically expedient way of closing budget

gaps. Although tobacco taxes may be raised for other reasons, this instrument relies only on the variation

generated by budgetary necessity. 44

The results of this estimation are shown in Table 2.45 The main specification is in column (iv). This

specification confines the analysis to smokers under the age of 20. Future price is instrumented using current

taxes and the average percentage budget gap for the states comprising the MSA, with the results of this

first-stage presented in Panel C. The impression given is one of near-complete myopia. The coefficient on

ln(pricet) is highly significant, and highly negative at −1.363, suggesting that smokers in their later teens

are highly price-sensitive. Even more importantly, the coefficient on the next year’s price, ln(pricet+1), is

both small and insignificant. That is to say that although very young smokers react to current price, they

do not react to future prices. This is consistent with either the projection bias or present bias models, but

not with the rational model.46

Columns (iv) and (v) present something of a robustness check on this result by defining the cutoff for

being a “young” smoker at 25 and 30, respectively. Several clear patterns emerge. First, the magnitude of

current-price sensitivity is monotonically decreasing in the cutoff. More importantly, the coefficient on future

price is increasing in magnitude, and strongly significant when both groups of older smokers are included.

This suggests that the seeming myopia of teenage smokers is not permanent, and is consistent with smokers

moving to a regime where they expect to smoke forever as the level of their addiction deepens. The ratio of

the future-price response to the current-price response is therefore also increasing across the specifications.

The null hypothesis that this ratio equals zero is equivalent to saying that teens do not react to future prices.43(General Accounting Office 1993). The exceptions are Vermont and Wyoming.44By instrumenting the future price with current-period observables, I also alleviate the problem caused by the fact that

actual future price is a noisy measurement of expected future price. If expectations are on average correct, and the error in thefirst-stage is independent of the error in expectations and the second-stage error term, then this will solve the errors-in-variablesproblem.

45Standard errors are robust and clustered by MSA. Appendix Table A-4 shows several alternative specifications, which themain results appear robust to.

46While this is prima facie evidence of consumer biases, the range of parameters consistent with this result is not identified fromthis test alone. One therefore cannot yet suggest whether the necessary degree of projection bias or present bias is “plausible”.

21

This ratio is not significantly different from zero for teenage smokers, but is significantly greater than zero

for both older groups. While teen smokers’ myopia could be rational if the future price is fully unknown,

columns (iv) and (v) suggest that older smokers are in fact capable of predicting future prices. Additionally,

Table 4 confirms a non-zero future-price response among mature smokers.

Table 3 reports the results of a slightly different approach. Using the fact that smokers are not making

the simple binary choice of smoking or not smoking, but rather choosing a level of consumption as well,

I sort subjects into five categories: nonsmokers, up to 12 pack/day, up to 1 pack/day, up to 2 packs/day,

and over 2 packs/day. I then perform an ordered probit regression to assess the probability of being in

each category. The key results are the coefficients on ln(pricet+1) and the interaction of ln(pricet+1) with a

dummy for whether the subject is under 20 years old. Across all specifications, the coefficient on the former

is large, and significantly negative. That is, anticipated future price increases will push smokers into lower

categories. However, the interaction term is approximately the same magnitude and positive. That these

two coefficients sum to zero is not rejected at a p-value of 0.8690. In other words, young smokers —unlike

mature smokers—are not using information about future prices in their current consumption decisions.

At this point, there are multiple interpretations of this result. One could possibly conclude that the lack

of a future-price response indicates that cigarettes are not addictive. However, the non-trivial future-price

elasticity for mature smokers belies this interpretation.47 Within the model presented in Section 2, this

result is consistent with either α > 0 or β < 1, for appropriate parameters. That is, this test alone does

not distinguish between the two types of bias. However, it is a strong rejection of the joint hypothesis that

α = 0, β = 1—that is, full rationality—for young smokers.

One could also conclude that consumers in their teen years are mistaken about the process by which

prices evolve (e.g. they believe today’s price will be the price forever), but some time during their 20’s

become able to predict prices. It would not be enough that teens have a noisier estimate of future prices

that is still correct on average. Rather, it must be that even conditioning on the actual future price, teens

believe it will too much resemble the current price. It is important to address this (admittedly ex-post)

hypothesis, because it would have greatly different welfare implications. Unlike the cases of present bias

or projection bias—where the losses come through f(k) and g(k)—the welfare losses for an individual who

makes price mistakes are limited to the present value of their financial error. However, this hypothesis is

not supported by later results. In particular, if price misprediction were driving the results, then one would

expect to see a similar pattern of learning in the results for price volatility. In Section 4.3, no such pattern47Auld and Grootendorst (2004) warn that serial correlations in aggregate consumption data may spuriously return results

in favor of addiction even when there is none, such as for milk demand. The individual responses used above lessen thisconcern. Moreover, the just-identified two-stage least squares specifications presented in Appendix Table A-4, which Auld andGrootendorst suggest “performs well”, are qualitatively similar to the main specifications.

22

is found. Both groups respond to price volatility (although, as will be discussed, not as much as they ought

to), meaning that even youth smokers understand that prices are not immutable. Moreover, the response

is not significantly different for young and mature smokers, indicating that no learning is taking place on

this dimension. The pattern of learning and non-learning required for price misprediction to explain the

results seems implausible, and I am unaware of other research that supports it. Section 5.3 formalizes this

price extrapolation bias, and finds that while it can explain the myopia among teenagers, it is inconsistent

with the rest of the results. Combined with the fact that this reaction to the second moment of the price

distribution does not change as they mature, this story of teenage price misprediction is not consistent with

the overall pattern of results.

4.2 Mature smokers and impatience

To get an estimate of the net discount factor βδ, I estimate current-price and future-price responses on a

sample restricted to respondents aged 30 and above—an identifying assumption is that whenever date m

occurs, it is before the age of 30. Current taxes and state budget gaps are used to instrument for prices in

all specifications, which contain controls for age, age2, sex, race, and education. As before, year dummies

and MSA-specific time trends are also included. The estimation equation is given by:

Pr(smoker = 1)ijt =θ1 ∗ pricejt + θ2 ∗ pricej,(t+1) + θ3 ∗Xit (12)

+ λj + φj ∗ yeart + ηt + εijt

Table 4 presents the results of estimating (12). The dependent variable is an indicator for current smoker

status, and all specifications control for age, age2, race, and education, and include both year dummies and

MSA-specific time trends.48 Where indicated, taxes and budget tightness are used to instrument for prices,

as in Section 4.1.49

Column (iii) is the preferred specification, using participants over age 30 as the sample. Similar results

are obtained using 35 as the cutoff, suggesting that the “maturity” date does not fall within this window.

The coefficient on current price is -0.725, so that a 1% increase in the price decreases the smoking rate

by about 0.7%.50 That these are both smaller in magnitude than the current-price response for teenagers48Appendix Table A-4 shows several alternative specifications, which the main results appear robust to.49As discussed earlier, this instrument will also alleviate the errors-in-variables problem arising from the fact that actual

future price is a noisy measurement of expected future price. The increase in the magnitude of the coefficient on ln(pricet+1) inthe instrumented specifications suggests that measurement error is indeed attenuating the coefficients from OLS.

50The overall elasticity of smoking participation will be a mixture of the elasticity of initiation and the elasticity of cessation.DeCicca, Kenkel and Mathios (2008) use the National Education Longitudinal Survey to estimate a model that accounts forboth explicitly, and find the cessation elasticity to be significantly greater than the initiation elasticity for 26-year-olds. Becausethe NHIS does not have panel data, I cannot distinguish initiation from cessation. However, virtually all of the response for

23

estimated in Table 2 is not surprising. The coefficients on future price are also significantly negative, at

-0.531 and -0.496 for the two age cutoffs. While these estimates are not uniformly greater in magnitude than

those for teens, such a comparison conflates the generally smaller price elasticity of older smokers with their

relatively greater attention to future prices. The relevant comparison is the ratio of future-price response

to current-price response, which ranges from 0.630 to 0.778 for mature smokers. For mature smokers, this

ratio is interpreted as the net discount rate, βδ.51 In all specifications, it is significantly different from zero.

It is also significantly smaller than one in all but one specification.

I then estimate ∂πt/∂pP and ∂πt/∂p

T by regressing consumption on the permanent and transitory

components of price. As before, I include age, age2, sex, race, and education as individual controls and

include MSA dummies. Because annual variation determines pTt , I include either a trend or MSA-specific

trends instead of year dummies and cluster standard errors at the year level. The estimation equation is:

Pr(smoker = 1)ijt =θ1 ∗ pricePjt + θ2 ∗ priceTj,t + θ3 ∗Xit + λj (13)

+ φj ∗ yeart + ηt + εijt

where price is first decomposed into permanent and transitory components by estimating:

pjt =θ4 ∗ taxesjt + θ5 ∗KY Burleyt + θ6 ∗ yeart + λ2j (14)

The ratio of reactions to a permanent and to a transitory price shock is found by estimating (13).

The decomposition of price into pP and pT is shown in Panel B of Table 5. All coefficients are highly

significant, and R2 values of 0.85 suggest that the greater part of price variation is accounted for. Taxes,

MSA fixed effects, and a time trend are used to generate the permanent component of cigarette price, while

the fluctuation in the tobacco price are used to generate the transitory portion. The burley tobacco price

proves not to predict even the next year’s increase in price. Moreover, regressing the estimated (pTt+3 − pTt )

on (pTt+1 − pTt ) yields an insignificant coefficient (0.53 with a standard error of 0.44). The same regression

for the “permanent” price yields a significantly positive coefficient (0.86 with a standard error of 0.19). This

suggests that there is indeed little predictive power in the transitory component, but a lasting effect of the

permanent component.52 I then include these estimates of pP and pT as regressors for an indicator of current

teens must be initiation, while the great majority for mature smokers is cessation. The pattern I find is therefore qualitativelysimilar, and the biases I estimate can help explain the difference in these elasticities.

51A similar ratio is estimated in Becker, Grossman and Murphy (1994)—and is found to be comparably low— but is interpretedas the single discount factor corresponding to δ.

52If the “permanent” shock is not truly permanent or the “temporary” shock not temporary, the estimates for δ will be biaseddownwards, and the estimates for β biased upwards (i.e. too little present bias)

24

cigarette consumption, as in (13).

In general, the effect of the transitory component of price is significant and smallThe effect of a permanent

price shock is much greater—by almost an order of magnitude. Reassuringly, the sum of the coefficients is

comparable to the earlier, non-decomposed price effects. The preferred specification is reported in column

(iv) and includes MSA fixed effects and MSA-specific time trends, and is run on all respondents over 35 years

old. A temporary price increase of 1% lowers the probability of smoking by 0.225%, while a 1% increase

in the permanent price is associated with a 1.41% decrease in the smoking probability. The ratio of these

effects will yield (1− (1− β)δ)/(1− δ), and is estimated at 6.267. That this is statistically different from 1

allows one to immediately reject the hypothesis that β = 1 or δ = 0.

Column (v) presents the results of performing this analysis on teenage rather than mature smokers, and

serves primarily as a robustness check on the model. Given the results in Table 2 and the restriction this

puts on teen smokers’ plans, one would expect teen smokers to react approximately equally to permanent

and temporary price effects. This is precisely the pattern that emerges. With a coefficient of -0.33, teens are

more responsive to temporary price shocks than mature smokers–consistent with their greater current-price

sensitivity in Table 2—while their elasticity with respect to permanent price shocks, -0.822, is considerably

lower than that of mature smokers. Although they do react more to permanent than to temporary shocks,

the ratio of these responses is only 2.47 and is not statistically different from 1. That is, teen smokers

continue to neglect future prices in their current smoking decisions.

Combining the estimates from Table 4 and Table 5, one can estimate both β and δ individually for

mature smokers. These estimates are shown in Table 7 for the instrumental variables specifications in Table

4 and Table 5. Using the preferred specifications from both, one obtains estimates for β of 0.701 and 0.828,

and for δ of 0.899 and 0.885. The estimates for β are significantly different from 1 (although the δ estimates

are less precise and not significantly different from 1).

4.3 Price volatility and initiation

The model presented in Section 2.3 predicts that price volatility has a differential impact on rational and

projection-biased agents. Whereas volatility decreases the attractiveness of consumption to a fully-rational

agent, projection-biased agents may be “fooled” into a lifelong habit when they try to take advantage of

particularly low price realizations.

I use prices in the first half of the sample to estimate price volatility by metropolitan area. I then

estimate the effect of this price volatility on smoking behaviors in the latter half of the sample. To control

for any correlation between σ2pj and pj , I include price directly as a control in all specifications. Demographic

25

controls again include age, age2, sex, race, and education. The estimating equation is a regression of the

form:

Pr(smoker = 1) =F(θ1 ∗ pricejt + θ2 ∗ pricej,(t+1) + θ3 ∗ std pricej + θ4 ∗Xit + φj ∗ yeart + εijt

)(15)

Where the left-hand side variable is one for current smokers and zero for current nonsmokers, prices are

log real prices, and std price is estimated as above. F (x) = Φ(x) for specifications using the probit model

and F (x) = x for those using the linear probability model.53

Table 6 shows the results of running the analysis described by (15). The first column shows the results of

the analysis for all consumers. The sign on Pt is highly significant and negative, as higher prices reduce the

benefits of smoking for any type of consumer. The magnitude suggests that a 1% increase in the price lowers

the probability of smoking rates by 0.36 percentage points, and is relatively stable across all specifications.

More interesting, however, is estimated the effect of price volatility per se on smoking rates. The regressor

in the first row is the natural logarithm of the standard deviation of price (by MSA) between 1965 and 1977.

In all specifications, the estimated coefficient is positive and approximately one-tenth the magnitude of the

direct effect of price changes. For example, column (i) states that increasing the standard deviation of

price realizations in an MSA by 10% increases the the probability of smoking by roughly 0.25 percentage

points. This lends strong support to the conclusion that a mistake in predicting their future utility can lull

consumers into a smoking habit. Columns (ii) and (iii) allow the volatility effect to vary between young and

mature smokers. In neither case is the coefficient significantly different for the different age groups. This

suggests that the effect is present for both mature and young smokers. However, without knowing a mature

smoker’s MSA of residence during her adolescence, it is difficult to separate the enduring effect on initiation

from a current effect on cessation.

One possible concern is whether the estimated coefficient is reasonable given any of the models presented

in Section 2. Is it too high? One possible check is to perform a back-of-the envelope calculation of what

one would expect this coefficient to be. Except for realizations extremely close to the mean, the assumption

that (Pt− P ) ∼ N (0, 0.087) fits the data well.54 A quick estimate of (P ∗− P )—the threshold price at which

a person chooses to smoke—can be formed for teenagers by using the actual smoking rate of approximately

30% during this period.

Under these assumptions, a person who did not alter their price threshold would be 0.06 percentage points

more likely to smoke in an environment where the prices were 1% more volatile. This purely mechanical53An alternative specification (available on request) instruments std price by the population of the MSA. The point estimates

are approximately unchanged, but are imprecisely estimated.54Where Pt is the real price and P is the best predictor using MSA fixed effects and a single time trend.

26

effect is considerably greater than the coefficients of between 0.02 and 0.03 estimated in Table 6. Any

extent to which a consumer lowered their price threshold in more volatile environments would lower this

estimate—meaning that this back-of-the-envelope calculation still leaves some room for consumers to alter

their threshold in response to volatility. On an intuitive level, these estimates suggest that young smokers

are not adjusting their thresholds “enough”. That is, they under-value the real option that derives from

remaining unaddicted. Lemma 3 establishes that if α = 0 and consumers are sufficiently patient, this

elasticity must be negative. Unfortunately, the range of βδ for which this is true depends on the functions

f(·) and g(·). Without further restrictions, one can reject the (somewhat extreme) joint hypothesis that

(α = 0, βδ = 1, h(·) > 0), but cannot confirm if the β found in Section 4.2 can explain this positive elasticity.

Rather than imposing the strength of addiction externally, the following section makes more parametric

assumptions about the model in order to calibrate, among other things, how much projection bias these

estimates imply.

5 Additional Parameter Estimates

The results from Section 4 require very few assumptions about the nature and strength of the addiction

process. While this is sufficient for many ends—for example, estimating β and δ—other results are not

identified. In particular, the degree of projection bias depends critically on how strong the addiction is.

In this section, I linearize f(·) and g(·) and make additional distributional assumptions about “maturity

dates”, price volatility, and initial addiction states k.

5.1 Derivations

Rather than use the flexible addiction functions f(·) and g(·) with only (3) and (4) as restrictions, I follow

O’Donoghue and Rabin (1999b) and assume f(kt) = xt − ρkt and g(kt) = −σkt. That is, smoking in the

current period increases the next period’s incentive to smoke by (σ − ρ).

Proposition 3 Let the initial addictive state be distributed according to κ(·), and the “maturity date” m be

distributed according to µ(·). If teen smokers expect to smoke only until maturity, response in the smoking

rate of cohort τ , πτ,t to an anticipated shock to the price beginning in period t+1 is then:

dπτ,tdpt+1

= −κ′(k; τ, t) ·(

dk

dpt+1

)µ(t− τ)

27

Where:dk

dpTt+1

=βδ[

1 + βδ(

(1− α)γ 1−(δγ)m−1

1−δγ + α1−δm−1

1−δ

)]· (σ − ρ)

· 1{m > t}

Suppose that all teens believe that they will smoke only until maturity and refrain thereafter, and that

they believe maturity occurs in the current period. In this case, dkdpTt

reduces to 1σ−ρ . The change in the

teen smoking rate given a change in the current price is then dπdpTt

= −κ′(k) · 1σ−ρ . Let k ∼ N(0, 1), so that

consuming in the first period raises the addiction level by one standard deviation. Using the actual teen

smoking rate of 30.52%, one can calculate κ′(k).

Proposition 4 Let prices be distributed according to pt ∼ N (p, s2), where p is the known MSA-specific

mean price. For a consumer with discount rate δ, degree of present bias β, and degree of projection bias

α, the change in probability of smoking given a change in the standard deviation of the price distribution is

given by:

dπ

ds=

1√2πs

e−

„p∗2

2s2

«·

(p√

2πs2e−

„p2

2s2

«)· (σ − ρ)βδ(1− α)

− p∗√2πs2

e−

„p∗2

2s2

«

Proposition 4 states that, provided with an estimate of (σ − ρ) and βδ, one can estimate the degree of

projection bias implied by under-reactions to price volatility.

5.2 Parameter Estimates

5.2.1 Addictiveness: σ − ρ

Using Proposition 3 with the additional assumption that all teens expect to smoke only until maturity,

one can re-interpret the estimates from Table 2 in order to recover (σ − ρ). Using the under-20 sample

from columns (i) and (iv) of Table 2, one estimates, respectively, 19.544 and 16.833. Because of the quasi-

linearity of the utility function, these estimates can be interpreted as dollar amounts. That is, going from

an unaddicted to an addicted state increases the willingness to pay by between $16 and $20.

One could repeat the analysis for the under-25 and under-30 samples. Because these groups are on

average less price-sensitive, the estimates of (σ − ρ) will be greater than for the under-20 sample. However,

this procedure requires that the young smokers expect to smoke only in the current period. Because these

older samples include more consumers expecting a lifelong habit, the estimates of σ−ρ will be biased upwards.

For this same reason, the estimates in Table 7 should be considered upper bounds of the addictiveness slope.

28

5.2.2 Projection Bias: α

Given values for β, δ, and (σ − ρ), one can identify α from the response to price volatility. In the data,

(Pt− Pt) ∼ N(0, 0.087) (Kolmogorov-Smirnov p-value of 0.264). This implies that the mechanical effect of a

1% increase in the standard deviation of price is to increase smoking rates by 0.06 percentage points. To the

extent that the estimated responses are smaller than this, consumers are reacting to volatility by changing

their threshold price for smoking. However, the fact that they are not substantially smaller and, moreover,

non-negative suggests that consumers are not reacting as much as they should in the fully-rational case.

Proposition 4 derives the net response to a change in the volatility of the price distribution. The second

term is the purely mechanical response, and will be positive when consumers do not smoke on average.

The first term is the behavioral response, and depends on the value of the real option that is generated by

refraining. Using the estimated values of βδ and (ρ− σ), it is possible to determine how big the behavioral

response ought to be. The degree of under-reaction is attributed to α.

The first column of Table 7 shows values for α, using the estimate of (σ − ρ) = 16.883 and various

estimates of the net discount rate.55 The point estimates vary between 0.410 and 0.494, suggesting that

consumers under-appreciate how their addictive state will change by 40%–50%. In the linear case, this is

interpreted as consumers only expecting their instantaneous incentive to hit to increase by $8 to $10 instead

of the full $16, should they become addicted. Conversely, addicts under-appreciate their ability to become

un-addicted by approximately the same amount. The estimate for projection bias compares favorably with

the range of α ∈ [0.31, 0.50] found by Conlin et al. (2007). This lends support to the notion that projection

bias is a general psychological phenomenon, rather than limited to specific domains such as smoking.

5.2.3 Maturity: µ(·)

Lastly, one can estimate the distribution of “maturity” from the changes in the ratio of future-price elasticity

to current-price elasticity. In the model, maturity is simply defined at the age when the additional instan-

taneous incentive to hit, xt, becomes zero. According to to Proposition 3, one can estimate µ(t1)− µ(t0) by

the difference in dπ/dpt+1

dπ/dptat ages t0 and t1 divided by the net discount rate. Using column (v) of Table 4 to

estimate β · δ, Table 2 implies that between 27% and 39% of people cross this age between 20 and 25, and

another 7% to 27% cross it between 25 and 30. This conclusion matches with the earlier result that very

few people appear to be crossing the same threshold between the ages of 30 and 35.55Because this estimate depends on the net discount rate instead of β and δ individually, the estimates do not vary across the

columns of Table 7. Using higher estimates of σ − ρ increases the values slightly, but never above 0.6.

29

5.3 Price-Extrapolation

I have interpreted the myopia towards future prices exhibited by teenagers as evidence that teens ignore

predictable future price changes because they do not believe they will be smoking in the future. Similarly,

I have interpreted mature smokers’ impatience as evidence of present bias. This requires no differential

ability to predict future prices between young and mature smokers, and is consistent with the fact that teens

are equally responsive to the second moment of the price distribution as mature smokers.56 It is possible,

however, that consumers do not have rational expectations over prices.57 While there are any number of ways

that expectations could differ from rationality, one theory that would seem to be a candidate for explaining

teen myopia would be “price extrapolation”. Under this error, consumers believe too much that future

prices will be similar to current prices, even conditional on all available information. This is a substantial

departure from rational expectations and, while it is a post-hoc hypothesis, it initially seems to provide an

alternative explanation for the results. One simple way to capture this would be to let a period-t consumer’s

expectation of the period-τ price, pt,τ , be given by:

pt,τ = (1− ω)Et [pτ ] + ωpt,τ−1 + ετ for all τ > t (16)

pt = pt

where Et[pτ ] is the correct expectation of the price in period τ conditional on period-t information, pt is the

current price, and εt is a normally-distributed error term. The degree of price-extrapolation error is indexed

by the parameter ω ∈ [0, 1], with ω = 0 reducing to the rational expectations assumed elsewhere in this

paper. Under this form of error, coefficients from regressions of smoking status or cigarette consumption on

current and future prices do not correspond to smokers’ responses to their price expectations. In particular,

the coefficient on current price will load some of the future-price response because it is a component of the

expectation of future price. An instrumental variables regression will solve the measurement error problem

caused by εt+1, but will not correct for the extrapolation error.58

Can such a mistake about prices explain the results found in Section 4 without the presence of present

bias or projection bias? Suppose that all smokers have β = 1, α = 0, that teens and mature smokers have

ω = ωt and ω = ωm, respectively, and that the addiction functions are linear. Then Lemma 3 impliesdπ/dpPtdπ/dpTt

= 1−ωδ1−δ , and dπ/dpt+1

dπ/dpt= (1−ω)δ. One can then use the same regressions as in Section 4 to estimate δ

56It is also consistent with a theory that the error term for teens’ expectations is larger than that for mature smokers.57While consumers certainly do not have perfect foresight of future prices, the approach taken here has been to assume that

they hold expectations that, while noisy, are correct on average. The actual future price is an estimate of consumers’ expectationof the future price measured with error, and can be overcome with an instrumental variables approach.

58That is, if one estimates Equation (11) and β = 1 and α = 0, then θ2θ1

= (1− ω)δ < δ if ω > 0.

30

and ωm. Using the baseline results from Column (iii) of Table 4 and Column (iv) of Table 5, one estimates

δ = 0.90, ωm = 0.435. For youth smokers, using Column (i) of Table 2 and Column (v) of Table 5, one

estimates ωt = 0.96—that is, teens must be almost fully extrapolating the price.

However, the different values of ω required to explain the different future-price responses also make very

different predictions about how consumers will respond to volatility. In particular, values close to 1 imply

a zero response to price volatility that is contradicted by the teen response in Section 4.3. Moreover, this

interpretation of the model also makes a strong prediction on the magnitudes of price responses. If one

believes that the only thing changing over time is ω, then one can ask what the price responses of one group

imply about the ω of the other. Following Lemma 3, the magnitude of the current-price response will depend

on an expression involving the discount rate, parameters of addiction, and degree of price-extrapolation error.

If teens in fact expect to smoke in future periods, then the denominator in this expression will be the same

as for mature smokers. The difference in price responses is attributed to different values of ω. Given the

estimates of ωm and δ outlined above, one can therefore calculate the degree of price extrapolation consistent

with the observed current-price elasticity for teen smokers. Far from the complete price extrapolation their

myopia implies, this method finds that a value of ωt = 0.68 is allowed by the current-price response in

Column (iv) of Table 2, and ωt = 0.63 for the response in Column (i). Intuitively, the change in the way

teens and mature smokers react to current prices is not drastic enough to allow for a drastic change in the

way they extrapolate prices. However, this lower value of ωt implies a future-price response of -0.65, which is

outside the 95% confidence interval of the actual estimate. Price-extrapolation error, while it may contribute

along with the psychological biases estimated in this paper, therefore cannot on its own explain the pattern

of price responses..

6 Conclusion

The overall conclusions to be drawn from this paper are three-fold. First, the patterns of cigarette con-

sumption differ greatly from those predicted by the fully rational model. Second, the model of projection

bias explains several different features of the data. Finally, evidence is found in favor of present bias as an

additional contributing factor. I strongly reject the fully-rational case and, with additional assumptions, I

estimate a degree of present bias β between 0.7 and 0.8, and a degree of projection bias α between 0.4 and

0.5.

The parameter estimates obtained in this paper generally agree with those obtained elsewhere. The

degree of impatience, β ∈ [0.7, 0.8], matches estimates identified in very different contexts. For example,

Laibson, Repetto and Tobacman (2007) estimate an annual β = 0.7, δ = 0.96 in a structural model of

31

consumption and decisions. Using data on unemployment spells and accepted wages, Paserman (2008) finds

some heterogeneity among income groups but still a comparable range of parameter estimates. A short-run

discount factor β ∈ {0.40, 0.48} is estimated for low-wage earners, while high-wage earners have β = 0.89.

Both groups have an annual long-run discount factor δ = 0.99. Using a large-scale experiment to measure

consumer switching in credit cards, Shui and Ausubel (2004) estimate β = 0.8.59

Although there is a growing literature that identifies the presence of projection-bias, there has been less

work attempting to estimate the parameter α. In general, such estimates require structural assumptions on

utility. Conlin, O’Donoghue and Vogelsang (2007) use catalog orders and returns of cold-weather clothing

to test for projection bias with respect to the current weather. If consumers experience projection bias, they

may over-order cold-weather clothing on cold days, because they expect their future selves will be cold as

well. Returns are examined to control for potential confounds, such as attention. After making suitable

structural assumptions, they estimate a degree of projection bias α ∈ [0.31, 0.50]—strikingly similar to the

range this paper estimates for projection bias with respect to cigarette addiction.

One could attempt to quantify the size of the mistake that consumers are making in several ways. For

a rational consumer who is not credit-constrained, it is sufficient to know the present discounted cost of

smoking to know if she will smoke. Biased consumers may be sensitive to the details of timing of prices.

For example, Shui and Ausubel (2004) find that consumers are too sensitive to credit card “teaser rates”

relative to the long-term interest rate—and in ways that end up being quite costly. For the analogue in

cigarette prices, consider an unaddicted teenager with the degrees of bias and preferences estimated in this

paper, who is deciding whether to smoke or not. By setting a very low price initially and raising the price

in her adulthood—essentially a “teaser rate” for cigarettes—one can design a price path that will cause her

to smoke forever at a very high present-value cost.60 Alternatively, by setting a high initial price and then

lowering it considerably in her adulthood, one has a price path that will cause her to never smoke, even

though the present-value cost of a lifelong habit is relatively low. Define the “overpayment ratio” as the

ratio of the most expensive path that causes her to smoke to the least expensive path that causes her not

to smoke. Using the consumer’s own discount rate to calculate present value, this ratio should be exactly

one for rational consumers.61

59If short-run impatience differs between smokers and non-smokers, these estimates of β would not be directly comparable tothe estimates in this paper obtained from smokers. That the results are similar suggests that this bias may be quite general.Indeed, using a specially-designed survey, Khwaja, Silverman and Sloan (2007) conclude that there is “no evidence that short-runand long-run rates of discount differ by smoking status.”

60Although it is unrealistic to charge different age groups different prices for cigarettes, this thought experiment may beapproximated by using brands that appeal differentially to teens and mature smokers.

61The ratio is bounded below by 1 unless the attractiveness of smoking is increasing in the cigarette price. While this wouldnever occur for standard preferences, it can sometimes occur for partially sophisticated present-biased agents for whom higherprices act as a failed commitment device on future selves.

32

Panel A of Figure 4 shows the overpayment ratio as a function of the parameter γ, which controls the

evolution of the addictive stock. For a wide range of values, this teenager can be “tricked” into paying

between 2 and 3.5 times as much for a smoking habit as she would have had to pay under a different

price pattern that she would have turned down.62 Panel B shows the ratio of the most expensive habit a

biased smoker would accept to the most expensive habit a fully rational smoker would accept. Although

the ratio is somewhat lower in Panel B, an agent with the biases estimated in this paper will still pay

between 1.5 and 3 times too much. This suggests that most of the result is coming from people smoking

when they rationally should not—teens impatiently starting, and mature smokers not quitting because they

do not realize that withdrawal will be temporary. In addition to quantifying the consumer mistake, such a

calculation is valuable to tobacco companies, which may design incentives to attract inexperienced smokers

who unwittingly continue on to a lifelong habit.

Identifying present bias and projection bias as driving factors in addiction also suggests directions for

effective policy interventions, including tax policy.63 While there are many additional concerns that will

affect the optimal cigarette tax, one can ask simply what lump-sum redistributed tax would maximize

smokers’ own welfare.64 In the Becker and Murphy (1988) model, such a tax would clearly be zero, as

any cigarette consumption is already chosen optimally.65 When consumers are biased, however, there is

substantial room for welfare-enhancing corrective taxes. Although such taxes will clearly not be Pareto

efficient—some people may be smoking optimally—one can consider what tax would cause a biased consumer

to hit only in circumstances when an unbiased consumer would have hit without such a tax. The derivation

of this tax is presented in the Appendix, and because it depends on the addiction parameter γ, a range of

values is presented in Appendix Table A-1. In general, a tax which corrects for both sources of bias will be

quite large—on the order of $8-$11 per pack in current dollars. Using parameters of β = 0.9 and δ = 1,

Gruber and Koszegi (2000) estimate that for a sophisticated present-biased agent, the optimal tax is between

$1.50 and $2.50 per pack. Plugging in the same value of β into the framework used by this paper yields

a similar value for the optimal tax of between $1.10 and $3.70 per pack. Lowering β rapidly increases the

optimal tax, however. The estimate of β = 0.8, δ = 0.9 yields significantly higher results: between $2.20 and

$5.30 per pack. Including the effect of projection bias raises the corrective tax even further. Adding α = 0.4,62For comparison, the mean smoker in 2006 spent roughly $1080 annually on cigarettes (CDC, http://www.cdc.gov/tobacco/

data_statistics/)63See, for example, Wertenbroch (1998) and O’Donoghue and Rabin (2003).64In addition to the usual Ramsey rule for the overall tax, many studies include a Pigouvian component that reflects the

negative externality of second-hand smoke and the positive externality that smokers’ early deaths impose on government healthcare and retirement programs. The calculations in this paper are separate from these concerns.

65Gul and Pesendorfer (2007) also find an optimal tax of zero in a model where smoking may not be desirable, becausetaxation does not remove smoking from the choice set. Their model does, however, predict that a ban on smoking may bewelfare-enhancing.

33

http://www.cdc.gov/tobacco/data_statistics/

http://www.cdc.gov/tobacco/data_statistics/

the tax is between $8.60 and $11.40 per pack. By comparison, the highest current combined state-local tax

on cigarettes is $4.25 in New York City.

Taken together, the results in this paper provide support for both present bias and projection bias as

explanations for many smoking behaviors. When first starting a smoking habit, inexperienced smokers do

not appreciate the degree to which they will become addicted to nicotine. Conversely, experienced smokers

fail to fully appreciate how refraining from smoking will eventually make them un-addicted. Moreover, all

smokers are too willing to sacrifice future well-being for the instantaneous benefits of smoking. Addiction

offers a setting where projection bias and present bias interact to greatly alter consumer behavior. Further

research may address whether this interaction of biases has a large effect in other contexts. Additionally,

although consumer mistakes about prices cannot explain the pattern of results in this paper, it remains an

open question how important such mistakes may be to both behavior and welfare.

References

Adda, Jerome and Francesca Cornaglia, “The Effects of Taxes and Bans on Passive Smoking,” CemmapWorking Papers, October 2006.

and , “Taxes, Cigarette Consumption, and Smoking Intensity,” American Economic Review, September2006, 96 (4), 1013–1028.

Angeletos, George-Marios, David Laibson, Andrea Repetto, Jeremy Tobacman, and StephenWeinberg, “The Hyperbolic Consumption Model: Calibration, Simulation, and Empirical Evaluation,” TheJournal of Economic Perspectives, Summer 2001, 15 (3), 47–68.

Ariely, Dan and George Loewenstein, “The Heat of the Moment: the Effect of Sexual Arousal on SexualDecision Making,” Journal of Behavioral Decision Making, July 2005, 19 (2), 87–98.

Arnold, CL, TC Davis, HJ Berkel, RH Jackson, I Nandy, and S London, “Smoking status, reading level,and knowledge of tobacco effects among low-income pregnant women,” Preventive Medicine, April 2001, 32 (4),313–320.

Auld, M. Christopher and Paul Grootendorst, “An empirical analysis of milk addiction,” Journal of HealthEconomics, November 2004, 23 (6), 1117–1133.

Becker, Gary S. and Kevin M. Murphy, “A Theory of Rational Addiction,” Journal of Political Economy,1988, 96, 675–700., Michael Grossman, and Kevin M. Murphy, “Rational Addiction and the Effect of Price on

Consumption,” The American Economic Review, 1991, 81 (2), 237–241., , and , “An Emprical Analysis of Cigarette Addiction,” The American Economic Review, 1994, 84 (3),

396–418.Berheim, B. Douglas and Antonio Rangel, “Addiction and Cue-Triggered Decision Processes,” The American

Economic Review, December 2004, 94 (5), 1558–1590.Cameron, A. Colin, Jonah B. Gelbach, and Douglas L. Miller, “Robust Inference with Multi-Way

Clustering,” NBER Working Paper T0327, September 2006., , and , “Bootstrap-Based Improvements for Inference with Clustered Errors,” The Review of Economics

and Statistics, August 2008, 90 (3), 414–427.Chaloupka, Frank J and Kenneth Warner, “The Economics of Smoking,” in A.J. Culyer and J.P. Newhouse,

eds., Handbook of Health Economics, Volume I, Elsevier, 2000, pp. 1539–1627., Prabhat Jha, Joy de Beyer, and Peter Heller, “The Economics of Tobacco Control,” Briefing Notes in

Economics, January 2005, (63).Conlin, Michael, Ted O’Donoghue, and Timothy Vogelsang, “Projection Bias in Catalog Orders,” The

American Economic Review, September 2007, 97 (4), 1217–1249.Cummings, K. Michael, Eva Sciandra, Terry F. Pechacek, Mario Orlandi, and William R. Lynn,

34

“Where teenagers get their cigarettes: a survey of the purchasing habits of 13-16 year olds in 12 US communities,”Tobacco Control, December 1992, 1 (4), 264–267.

Cutler, David M. and Edward L. Glaeser, “Why Do Europeans Smoke More Than Americans?,” in D. Wise,ed., Developments in the Economics of Aging, Chicago: University of Chicago Press, 2008.

DeCicca, Philip, Don Kenkel, and Alan Mathios, “Cigarette taxes and the transition from youth to adultsmoking: Smoking initiation, cessation, and participation,” Journal of Health Economics, 2008, 27, 904–917.

DellaVigna, Stefano, “Psychology and Economics: Evidence from the Field,” Journal of Economic Literature,June 2009, 47, 315–372.

Department of Health and Human Services, The Health Consequences of Smoking: A Report of the SurgeonGeneral, Washington, DC: US Government Printing Office, 1982., Preventing Tobacco Use Among Young People: A Report of the Surgeon General, Washington, DC: US

Government Printing Office, 1994.General Accounting Office, “Balanced Budget Requirements,” March 1993.Gilbert, Daniel, Elizabeth Pinel, Timothy Wilson, Stephen Blumberg, and Thalia Wheatley, “Immune

Neglect: A Source of Durability Bias in Affective Forecasting,” Journal of Personality and Social Psychology,September 1998, 75 (3), 617–638.

Giordano, Louis, Warren Bickel, George Loewenstein, Eric Jacobs, Lisa Marsch, and Gary Badger,“Mild Opioid Deprivation Increases the Degree that Opiod-Dependent Outpatients Discount Delayed Heroin andMoney,” Psychopharmacology, September 2002, 163 (2), 174–182.

Gruber, Jonathan, “Youth Smoking in the 1990’s: Why Did it Rise and What Are the Long-Run Implications?,”The American Economic Review, May 2001, 91 (2), 85–90.

and Botond Koszegi, “Is Addiction ‘Rational?’ Theory and Evidence,” NBER Working Paper 7507, 2000.and , “Is Addiction ‘Rational?’ Theory and Evidence,” Quarterly Journal of Economics, 2001, 116 (4),

1261–1303.and Jonathan Zinman, “Youth Smoking in the US: Evidence and Implications,” in Jonathan Gruber, ed.,

Risky Behavior Among Youth: An Economic Analysis, Chicago: University of Chicago Press, 2001, pp. 69–120.and Sendhil Mullainathan, “Do Cigarette Taxes Make Smokers Happier?,” Advances in Economic Analysis

and Policy, 2005, 5 (1).Gul, Faruk and Wolfgang Pesendorfer, “Harmful Addiction,” Review of Economic Studies, 2007, 74 (1),

147–172.Hammar, Henrik and Fredrik Carlsson, “Smokers’ expectations to quit smoking,” Health Economics, 2005, 14

(3), 257–267.Kenkel, Donald, Dean Lillard, and Alan Mathios, “The Roles of High School Completion and GED Receipt in

Smoking and Obesity,” Journal of Labor Economics, 2006, 24 (3), 635–660.Khwaja, Ahmed, Dan Silverman, Frank Sloan, and Yang Wang, “Are Smokers Misinformed? Evidence on

Subjective Beliefs about Mortality and Health,” August 2008. Mimeo., Daniel Silverman, and Frank Sloan, “Time Preference, Time Discounting, and Smoking Decisions,”

Journal of Health Economics, 2007, 26 (5), 927–949.Laibson, David, “Golden Eggs and Hyperbolic Discounting,” The Quarterly Journal of Economics, 1997, 112 (2),

443., Andrea Repetto, and Jeremy Tobacman, “Estimating Discount Functions with Consumption Choices over

the Lifecycle,” August 2007. Mimeo.Loewenstein, George and Daniel Adler, “A Bias in the Prediction of Tastes,” Economic Journal, 1995, 105,

929–937.and Shane Frederick, “Predicting Reactions to Environmental Change,” in M Bazerman, D Messick,

A Tenbrunsel, and K Wade-Benzoni, eds., Psychological Perspectives on the Environment, Russell SageFoundation Press, 1997., Ted O’Donoghue, and Matthew Rabin, “Projection Bias in Predicting Future Utility,” Quarterly Journal

of Economics, November 2003, 53 (2), 151–169.O’Donoghue, Ted and Matthew Rabin, “Doing it Now or Later,” The American Economic Review, March 1999,

89 (1), 103–124.and , “Addiction and Self Control,” in Jon Elster, ed., Addiction: Entries and Exits, Russel Sage

Foundation, 1999.and , “Addiction and Present-Biased Preferences,” October 2000.and , “Studying Optimal Paternalism, Illustrated by a Model of Sin Taxes,” The American Economic

Review, May 2003, 93 (2), 186–191.

35

Oehlert, Gary W., “A Note on the Delta Method,” The American Statistician, February 1992, 46 (1), 27–29.Orphanides, Athanasios and David Zervos, “Rational Addiction with Learning and Regret,” The Journal of

Political Economy, August 1995, 103 (4), 739–758.Paserman, M. Daniele, “Job Search and Hyperbolic Discounting: Structural Estimation and Policy Implications,”

Economic Journal, August 2008, 118 (531), 1418–1452.Patrick, Donald L., Allen Cheadle, Diane C. Thompson, Paula Diehr, Thomas Koepsell, and Susan

Kinne, “The Validity of Self-Reported Smoking: A Review and Meta-Analysis,” American Journal of PublicHealth, July 1994, 84 (7), 1086–1093.

Read, Daniel and Barbara Van Leeuwen, “Predicting Hunger: The Effects of Appetite and Delay on Choice,”Organizational Behavior and Human Decision Processes, November 1998, 76 (2), 189–205.

Schoenbaum, Michael, “Do Smokers Understand the Mortality Effects of Smoking? Evidence from the Health andRetirement Survey,” American Journal of Public Health, May 1997, 87 (5), 755–759.

Shapiro, Jesse, “Is there a daily discount rate? Evidence from the food stamp nutrition cycle,” Journal of PublicEconomics, 303–325 2005, 89 (2), 303.

Shui, Haiyan and Lawrence M. Ausubel, “Time Inconsistency in the Credit Card Market,” May 2004. 14thAnnual Utah Winter Finance Conference.

Tauras, John A and Frank J Chaloupka, “Impact of Tobacco Control Spending and Tobacco Control Policieson Adolescents’ Attitudes and Beliefs about Cigarette Smoking,” Evidence-Based Preventive Medicine, 2004, 1(2), 11–120., , Matthew Farrelly, Gary Giovino, Melanie Wakefield, Lloyd Johnston, Patrick O’Malley,

Deborah Kloska, and Terry Pechacek, “State Tobacco Control Spending and Youth Smoking,” AmericanJournal of Public Health, February 2005, 95 (2), 338–344.

The Tobacco Institute, The Tax Burden on Tobacco: Historical Compilation 1993, Vol. 28, Washington, DC:Tobacco Institute, 1993.

Townsend, J, P Roderick, and J Cooper, “Cigarette smoking by socioeconomic group, sex, and age: effects ofprice, income, and health publicity,” BMJ, October 1994, (309), 923–927.

United States Department of Agriculture Consumer and Marketing Service, Tobacco Division,“Tobacco Market Review,” 1970–1990.

Unites States Department of Commerce, “Statistical Abstract of the United States,” 1970–1990.Viscusi, W. Kip, “Do Smokers Underestimate Risks?,” Journal of Political Economy, 1990, 98 (6), 1253.

and Joni Hersch, “Cigarette Smokers as Job Risk Takers,” The Review of Economics and Statistics, May2001, 83 (2), 269–280.

Wertenbroch, Klaus, “Consumption self-control by rationing purchase quantities of virtue and vice,” MarketingScience, 1998, 17 (4), 317–337.

Wilmouth, Carrie E. and Linda P. Spear, “Withdrawal from chronic nicotine in adolescent and adult rats,”Pharmacology Biochemistry and Behavior, November 1998, 85 (3), 648–657.

36

A Proofs

Proof of Lemma 1: Following O’Donoghue and Rabin (2000), let At ≡ {0, 1}∞, where at ≡ (at, at+1, ...) ∈ At is apath of hitting and refraining starting in period t. Let Vt(kt,at) be the perceived long-run continuation utility fromfollowing at conditional on period-t addiction level kt. Let Kτ (kt,at) = [γτ−tkt +

∑τ−1n=1 γ

τ−n−1ai] be the addictionlevel in period τ that results from following at. Then:

Vt(kt,at) =∞∑τ=t

δτ−t[aτ(Yτ − pτ + (1− α)fτ (Kτ (kt,at)) + αfτ (kt)

)+(1− aτ )

(Yτ + (1− α)gτ (Kτ (kt,at)) + αgτ (kt)

)]Because ft(·) and gt(·) are weakly convex, and Kτ is increasing and linear in kt, it follows that Vt is also weakly

convex in kt. Now let A(k, t) be an optimal strategy from self t’s perspective—that is is solves Ut(k|A(k, t)) =maxa∈At V

t(k,a).To show uniqueness, suppose A′ 6= A is another optimal strategy. But from (10), A(k, t) = 1 ⇐⇒ ht(k) ≥

βδ(1−α)[Ut+1(γk|A)− Ut+1(γk + 1|A)

]and A′(k, t) = 1 ⇐⇒ ht(k) ≥ βδ(1−α)

[Ut+1(γk|A′)− Ut+1(γk + 1|A′)

].

Then Ut(k|A) = Ut(k|A′) ∀ k and t, and so A(k, t) = A′(k, t), so they are the same strategy.Then because Ut(k|A) is the upper envelope of V t, it is also weakly convex in kt. Therefore Ut+1(γk|A)−Ut+1(γk+

1|A) is weakly decreasing in k. But for any type,

A(k, t) = 1 ⇐⇒ ht(k) ≥ βδ(1− α)[Ut+1(γk|A)− Ut+1(γk + 1|A)

]Thus ∃kt s.t. ∀t,A(k, t) = 1 ⇐⇒ k ≥ kt.

If t > m, then ht(k) is independent of t by Assumption 1. Then A(k, t) = 1 ⇐⇒ h(k) ≥ βδ(1−α)[Ut+1(γk|A)− Ut+1(γk + 1|A)

]is also independent of t, so ∃k s.t. ∀t,A(k, t) = 1 ⇐⇒ k ≥ k.

If t < m, then because future selves will be expected to follow a cutoff strategy and xt is weakly decreasing int,[Ut+1(γk|A)− Ut+1(γk + 1|A)

]is weakly increasing in t. Thus for t < m, the cutoff kt is weakly increasing in t.

Finally, because hm(k) ≥ hm+1(k), km < km+1.

Proof of Lemma 2 For any p = (p1, p2, ...), define k∗1(p) to be the period-1 addiction level such that an agent withdiscount rate δ, degree of present-bias β, and degree of projection-bias α is indifferent between hitting always andhitting never. That is,

Y1 − p1 + x1 + f(k∗1(p) + β

∞∑t=2

δt−1

[Yt − pt + xt + (1− α)f

(γt−1kα1 (p) +

t−1∑n=1

γn−1

)+ αf (kα1 (p))

]= Y1 + g(k1) + β

∑∞t=2 δ

t−1[Yt + (1− α)g

(γt−1kα1 (p)

)+ αg (kα1 (p))

]which can be re-written as:

x1 − p1 + β

∞∑t=2

δt−1(−pt) + Φ(k∗1(p)) + β

∞∑t=2

δt−1xt = 0

where Φ(k) = f(k∗1(p))− g(k∗1(p)) + β∑∞t=2 δ

t−1[(1− α)

(f(γt−1k +

∑t−1n=1 γ

n−1)− g

(γt−1k

))+ α (f(k)− g(k))

].

If k∗(p) ∈ min{k∗, k, k}, then kt = k∗1(p) ∀ t. (i.e. behavior is governed by the threshold for hitting always relativeto hitting never, and it is straightforward to show that this is true for sufficiently small changes to p). Substituting k

37

for k∗1 into the above and differentiating with respect to p yields:

−1 +−βδ1− δ

+ Φ′(k∗1(p)) · dk1(p)dp

= 0

dk1(p)dp

=1 + βδ

1−δ

−Φ′(k∗1(p)

and similarlydk1(p)dp1

=1

−Φ′(k∗1(p),

dk(p)dpτ

=βδτ−1

−Φ′(k∗1(p)

Now for any p = (p1, p2, ...), define k1(p) to be the period-1 addiction level such that the is indifferent betweenhitting until maturity and hitting never. That is, k1(p) is given by:

Y1 − p1 + f(k1) + x1 + β

m∑t=2

δt−1

[Yt − pt + xt + (1− α)f

(γt−1k1(p) +

t−1∑n=1

γn−1

)+ αf

(k1(p)

)]+

β

∞∑t=m+1

δt−1

[Yt + (1− α)g

(γt−1k1(p) + γt−m

m∑n=1

γn−1

)+ αg

(k1(p)

)]

= Y1 + g(k1) + β

∞∑t=2

δt−1[Yt + (1− α)g

(γt−1k1(p)

)+ αg

(k1(p)

)]which can be re-written as:

x1 − p1 + β

m∑t=2

δt−1(−pt) + Ψ(k1(p),m) + β

m∑t=2

δt−1xt = 0

where

Ψ(k,m) = f(k1)− g(k1) + β

m∑t=2

δt−1

[(1− α)

(f

(γt−1k +

t−1∑n=1

γm−1

)− g

(γt−1k

))+ α (f(k)− g(k))

]

+β∞∑

t=m+1

δt−1

[(1− α)

(g

(γt−1k + γt−m

m∑n=1

γn−1

)− g

(γt−1k

))]

For sufficiently small changes in p, kαt = k(p) ∀ t . Thus differentiating with respect to p gives:

−1 + βδ1− δm−1

1− δ· (−1) + Ψk(k1,m)

dkjdp

Which can be re-arranged to yield:

dkjdp

=1 + βδ(1− δm−1)/(1− δ)

Ψm(k1(p),m)

And similarly differentiating with respect to p1 and pτ yield:

dkjdp1

=1

Ψm(k1(p),m)and

dkjdpτ

=

{βδτ−1

Ψs(k1(p),s), τ ≤ s

0, else

The case for a present-biased agent planning to hit only in the current period can be solved by setting m = 1.

Proof of Proposition 1 Suppose the initial addictive state k0 is distributed according to some function κ(·). Forfully rational smokers who expect to continue smoking, ktc1 = k∗. Then in period 1, π(p|X) = 1−κ(k∗(p)). By Lemma2, ∂π/∂pt+1

∂π/∂pt= δ.

38

Present-biased and projection-biased consumers also only consume in period 1 when k1 > kn1 and k1 > kj1,respectively. Thus for present-biased consumers, π(p|X) = 1−κ(kn1 (p)) and for projection-biased consumers, π(p|X) =1 − κ(kj1(p)). Lemma 2 then implies that if kn(β,p) = min{k∗n, kn, kn} or kn = min{k∗n, kn, kn} and m = 1, then∂π/∂pt+1∂π/∂pt

= 0. Otherwise if k∗(β,p) = min{k∗n, kn, kn} or kn = min{k∗n, kn, kn} and m > 1, then ∂π/∂pt+1∂π/∂pt

= βδ.

For projection-biased consumers, Lemma 2 implies that if kj = min{k∗j , kj} and m = 1, then ∂π/∂pt+1∂π/∂pt

= 0. If

kj = min{k∗j , kj} and m > 1 or k∗j = min{k∗j , kj}, then ∂π/∂pt+1∂π/∂pt

= δ

Proof of Proposition 2 Let the addictive stock in period t among mature smokers be given by κt(·). WLOG, letpτ = p+ pTτ ∀τ ≥ t. The probability that a consumer of type i smokes is given by π(p|X) = 1−κ(ki1(p), and therefore∂πt/∂pt+1∂πt/∂pt

= −κ′(k)∂k/∂pt+1

−κ′(k)∂k/∂ptIf consumers expect to continue smoking, Lemma 2 implies that ∂kt/∂pt+1

∂kt/∂pt= βδ.

For fully rational smokers, Lemma 2 further implies ∂πt/∂p∂πt/∂pt

= 11−δ . Projection-biased consumers follow identically,

using k∗j (p).For present-biased consumers, either kn(β, p) < k∗n(β) or kn(β, p) > k∗n(β). In the first case, Lemma 2 implies d =

0 and ∂πt/∂pP

∂πt/∂pt= 1 = 1

1−0 . If kn(β, p) > k∗n(β), then δ = βδ and ∂πt/∂p∂πt/∂pt

= 1−(1−β)δ1−δ > 1

1−βδ .

Proof of Lemma 3 (i) Rational and present-biased consumers:In period 2, the consumer will smoke iff

Y2 + x2 − p2 + f(k2) ≥Y2 + g(k2)x2 + f(k2)− g(k2) = x2 + h(k2) ≥p2

Let p∗2(k2) = x2 + h(k2) and let ψ(k2) = Φ(p∗2(k2)). Her ex-ante expected utility in period 2 is:

V2(k2) = Y2 + ψ(k2) · (x2 − p2(k2) + h(k2)) + g(k2)where p2 = E[p|p ≤ p∗2(k2)]

She will choose to smoke in period 1 if:

Y1 + g(k1) + V2(γk1) ≤ Y1 − p1 + x1 + f(k1) + V2(γk1 + 1)p1 ≤ p∗1 = x1 + h(k1) + βδV2(γk1 + 1)− βδV2(γk1)

≈ x1 + h(k1) + βδV ′2(γk1)

Before proceeding, consider her expected price conditional on smoking in period 2, as a function of her period-2addiction level:

p2(k) =1

ψ(k)·∫ x2+h(k)

−∞p · dΦ(p)

p′2(k) = −ψ′(k) · 1ψ(k)2

·∫ x2+h(k)

−∞p · dΦ(p) +

1ψ(k)

· h′(k) · (x2 + h(k)) · φ(x2 + h(k))

= −ψ′(k)ψ(k)

· p(k) +ψ′(k)ψ(k)

· p∗(k)

= [p∗(k)− p(k)] · ψ′(k)ψ(k)

> 0

Because p∗(k) is necessarily greater than p(k), the expression is positive. Intuitively, increasing one’s addictionlevel increases the range of prices at which a purchase will be made, and hence the expected purchase price.

39

Now,

V ′2(k) = ψ′(k) [p∗(k)− p2(k)] + ψ(k) · (−p′(k) + h′(k)) + g′(k)

= ψ′(k) [p∗(k)− p2(k)]− ψ(k) · ψ′(k)ψ(k)

[p∗(k)− p(k)] + h′(k)ψ(k) + g′(k)

= h′(k)ψ(k) + g′(k)

Finally, then:

p∗1(k1) ≈ x1 + h(k1) + βδh′(γk1)ψ(γk1) + βδg′(γk1)dp∗1dσ2≈ dψ(γk1)

dσ2· h′(γk1) · βδ

The second term is, by assumption, positive for all values of k. Thus the sign of effect on the threshold price isdetermined entirely by the first term, dψ(γk1)

dσ2 . The overall effect on the probability of initiation is then:

dP(smoke)dσ2

= Φσ2(p∗1(k1;σ2), p;σ2) + ΦP (p∗1(k1;σ2), p;σ2) · dP1

dσ2

(ii) Projection-Biased Consumers Because the fully projection-biased consumer believes her period-2 addictionlevel will be unchanged, her expected utility as a function of her period-1 consumption is simply:

U(a1 = 0) = Y1 + g(k1) + βδ [Ψ(k1, p) · (Y2 − E[p2|p2 ≤ p∗2(k1)] + x2 + f(k1)+(1−Ψ(k1, p)) · (Y2 + g(k1))]

U(a1 = 1) = Y1 − p1 + x1 + f(k1) +βδ [Ψ(k1, p) · (Y2 − E[p2|p2 ≤ p∗2(k1)] + x2 + f(k1)

+(1−Ψ(k1, p)) · (Y2 + g(k1))]

Because the fully projection-biased consumer believes her current actions don’t influence her future incentives, shewill smoke in period 1 exactly when the price is below the instantaneous temptation to hit:

p1 ≤ x1 + f(k1)− g(k1)

Thus the threshold price of initiating smoking for a fully projection-biased consumer is unaffected by the varianceof the price distribution. Assuming, however, that p > x1 + f(k1)− g(k1), then the probability that p1 is less than thethreshold is increasing in σ2.

(iii) Partial Projection BiasLet p∗(k, kt) be the threshold price at which a partially projection-biased consumer expects to purchase in period

2 if her current addiction level is pt and her actual period-2 addiction level is k. With simple projection bias, it followsthat p∗(k, kt) = x2 + α · (f(kt)− g(kt)) + (1− α) · (f(k)− g(k)).

It then follows that a partially projection-biased agent’s perceived probability of smoking in the next period isΨ(k, p|kt) = Φ (p∗(k, kt)− p).

Her perceived expected utility, then, as a function of her period-1 choice is then:

U(a1 = 0) = Y1 + g(k1) +

+βδ[Ψ(γk1, p|k1) · [(Y2 − E[p2|p2 ≤ p∗2(γk1, k1)) + x2 + αf(k1) + (1− α)f(γk1)]

+(1− Ψ(γk1, p|k1)) · [Y2 + αg(k1) + (1− α)g(γk1)]]

U(a1 = 1) = Y1 − p1 + x1 + f(k1)

+βδ[Ψ(γk1 + 1, p|k1) · [(Y2 − E[p2|p2 ≤ p∗(γk1 + 1, k1)) + x2 + αf(k1) + (1− α)f(γk1)]

+(1− Ψ(γk1 + 1, p|k1)) · [Y2 + αg(k1) + (1− α)g(γk1 + 1)]]

40

Thus a1 = 1 if:

p1 ≤ x1 + f(k1)− g(k1)

+βδ[Ψ(γk1 + 1, p|k1) · [(−E[p2|p2 ≤ p∗(γk1 + 1, k1)) + x2 + αf(k1) + (1− α)f(γk1)]

−Ψ(γk1, p|k1) · [(−E[p2|p2 ≤ p∗2(γk1, k1)) + x2 + αf(k1) + (1− α)f(γk1)]+(1− Ψ(γk1 + 1, p|k1)) · [αg(k1) + (1− α)g(γk1 + 1)]

−(1− Ψ(γk1, p|k1)) · [αg(k1) + (1− α)g(γk1)]]

In the limit as α approaches unity, this expression simplifies to exactly the full projection-bias case. Similarly, itsimplifies to the fully rational case as α approaches zero. Thus the overall effect on the probability of smoking dependson the individual’s degree of projection bias. For large α, the threshold will barely adjust as σ2 increases, so as witha fully projection-biased agent, the probability of initiation increases. For small α, the threshold drops enough that itdominates the increased probability of meeting any given threshold, so the probability of initiation decreases.

Proof of Lemma 3: Let f(kt) = xt − ρkt and g(kt) = −σkt.For any p = (p1, p2, ...), define k1(α, β, δ) to be the period-1 addiction level such that a person with discount rate

δ, degree of present-bias β, and degree of projection-bias α is indifferent between hitting always and hitting never.That is,

Y1 − p1 + x1 − ρk1 + β

∞∑t=2

δt−1

((1− α)(ρ)

(γt−1k1 +

t−1∑n=1

γn−1

)+ α(−ρ)k1 + Yt − pt + xt

)

= Y1 − σk1 + β

∞∑t=2

δt−1((1− α)(−σ)(γt−1)k1 + α(σ)k1 + Yt

)

So that:

(σ − ρ)k1 + βδ

[(1− α)

γ

1− δγ+

α

1− δ

](σ − ρ)k1 + β

∞∑t=2

δt−1x1 − p1 − β∞∑t=2

δt−1pt + θ1 = 0

where θ1 =1− α1− γ

βδ

[γ

1− δγ− 1

1− δ

]ρ

If k1(α, β, δ) < ky1(α, β, δ), then kt = k1(α, β, δ) ∀ t. (i.e. behavior is governed by the threshold for hittingalways relative to hitting never, and it is straightforward to show that this is true for sufficiently small changes to p).Substituting k for k1 into the above and differentiating with respect to p yields:[

1 + βδ

((1− α)

γ

1− δγ+

α

1− δ

)](σ − ρ)dk =

[1 + β

∞∑t=2

δt−1

]dp

dk

dp=

[1 + βδ

1−δ

][1 + βδ

((1− α) γ

1−δγ + α1−δ

)]· (σ − ρ)

41

Similarly :

dk

dpTt=

1[1 + βδ

((1− α) γ

1−δγ + α1−δ

)]· (σ − ρ)

dk

dpTt+1

=βδ[

1 + βδ(

(1− α) γ1−δγ + α

1−δ

)]· (σ − ρ)

Now let ky1(α, β, δ,m) be the period-1 addiction level such that a person with discount rate δ, degree of present-biasβ, degree of projection-bias α, and maturity date m > 1 is indifferent between hitting until maturity (and refrainingthereafter), and hitting never. That is:

Y1 − p1 + x1 − ρk1 + β

m∑t=2

δt−1

[(1− α)(−ρ)

(γt−1k1 +

t−1∑n=1

γn−1

)− αρk1 − pt + xt

]

+β∞∑

t=m+1

δt−1

[(1− α)(−σ)

(γt−1k1 + γt−m ·

m∑n=1

γn−1

)− ασk1

]

= Y1 − σk1 − σ(

(1− α)βδγ1

1− δγ+ αβδ

11− δ

)k1

[1 + βδ

((1− α)γ

1− (δγ)m−1

1− δγ+ α

1− δm−1

1− δ

)](σ − ρ)k1 − p1 − β

m∑t=1

δt−1pt + θ2 = 0

where θ2 = x1 + β∑mt=2 δ

t−1xt − (1− α) βδ1−γ

(1−δm1−δ −

1−(δγ)m

1−δγ

)ρ.

If ky1(α, β, δ,m) < k1(α, β, δ), then kt = ky1(α, β, δ) ∀ t. (i.e. behavior is governed by the threshold for hittingonly until maturity and refraining thereafter over hitting never, and it is straightforward to show that this is true forsufficiently small changes to p). Substituting k for ky1 into the above and differentiating with respect to p yields:

dk

dp=

[1 + βδ(1−δm−1)

1−δ

][1 + βδ

((1− α)γ 1−(δγ)m−1

1−δγ + α 1−δm−1

1−δ

)]· (σ − ρ)

Similarly :

dk

dpTt=

1[1 + βδ

((1− α)γ 1−(δγ)m−1

1−δγ + α 1−δm−1

1−δ

)]· (σ − ρ)

dk

dpTt+1

=βδ[

1 + βδ(

(1− α)γ 1−(δγ)m−1

1−δγ + α 1−δm−1

1−δ

)]· (σ − ρ)

· 1{m > t}

Let the distribution of addictive states at time t for the cohort born in τ be given by Pr(k ≤ x) = κ(x; τ, t). Theprobability of smoking, and the population smoking rate, is then

(1− κ(k; τ, t)

).

Finally, let the “maturity date” m be distributed according to the probability distribution function µ(·). Theresponse in the smoking rate of cohort τ , πτ,t to an anticipated shock to the price beginning in period t+1 is then:

dπτ,tdpt+1

= −κ′(k; τ, t) ·(

dk

dpt+1

)(1− µ(t− τ))

Proof of Lemma 4: A consumer with current addictive state k1, discount rate δ, degree of present-bias β, and degree

42

of projection-bias α believes she will smoke in period 2 if the price and her period-2 addiction level are such that:

Y2 + x2 − p2 + (1− α)f(k2) + αf(k1) ≥ Y2 + (1− α)g(k2) + αg(k1)

p2 ≤ (1− α)h(k2) + αh(k1)

Let p(k2; k1) = x2 + (1− α)h(k2) + αh(k1). If p ∼ Φ(·), then her ex-ante expected utility in period is:

V (k2; k1) = EU(k2; k1) = Y2 + Φ(p) ((1− α)f(k2) + αf(k1)− E[p2|p2 ≤ p]) + (1− Φ(p)) ((1− α)g(k2) + αg(k1))

She will therefore smoke in period 1 if:

p1 ≤ p∗1 = h(k1) + βδV (γk1 + 1; k1)− βδV (γk1; k1)≈ h(k1) + V ′2(γk1; k1) · βδ

Let p(k2; k1) = E [p2|p2 ≤ p(k2; k1)], and note that

p(k2; k1) =1

Φ(p)

∫ (1−α)h(k2)+αh(k1)

−∞p · dΦ(p)

So that

p′(k2; k1) = −φ(p)h′(k2)Φ(p)2

∫ (1−α)h(k2)+αh(k1)

−∞p · dΦ(p) +

1Φ(p)

(1− α)h′(k2)pφ(p)

= −φ(p)h′(k2)Φ(p)

p+(1− α)φ(p)

Φ(p)

=φ(p)Φ(p)

h′(k2) [p− p]

ThusV2 = Y2 + Φ(p)(p− p) + (1− α)g(k2) + αg(k1)

V ′2(k2; k1) = φ(p)h′(k2)(p− p) + Φ(p)((1− α)h′(k2)− p′) + (1− α)g′(k2)

= φ(p)h′(k2)(p− p) + Φ(p)φ(p)Φ(p)

h′(k2)(p− p) + (1− α)h′(k2)Φ(p) + (1− α)g′(k2)

= (1− α) [h′(k2)Φ(p(k2; k1) + g′(k2)]

And therefore

p∗ = h(k1) + βδ(1− α) [Φ(p) · h′(γk1) + g′(γk1)]

Let s be the standard deviation of Φ(p).

dp∗

ds=

dΦ(p)ds

· h′(γk1) · βδ(1− α)

=dΦ(p)ds

· (σ − ρ) · βδ(1− α)

If the agent changes their threshold in this manner, then the overall change in their probability of smoking in anyperiod is:

dπ

ds= φ(p∗) · dΦ(p)

ds· (σ − ρ)βδ(1− α) +

dΦ(p∗)ds

If one imposes normality on the price distribution such that (pt − E [pt]) ∼ N(0, s2), then this can be written in

43

closed form as:

dπ

ds=

1√2πs

e−

“p∗2

2s2

”·(

p√2πs2

e−

“p2

2s2

”)· (σ − ρ)βδ(1− α)

− p∗√2πs2

e−

“p∗2

2s2

”

Derivation of Optimal Tax Consider the current-period addiction for which an agent is indifferent between hittingalways and hitting never, as given in the Proof of Lemma 3:

(σ − ρ)k1 + βδ

[(1− α)

γ

1− δγ+

α

1− δ

]k1 + β

∞∑t=2

δt−1x1 − p1 − β∞∑t=2

δt−1pt + θ1 = 0

where θ1 =1− α1− γ

βδ

[γ

1− δγ− 1

1− δ

]ρ

An agent with discount rate δ, degree of present-bias β, and degree of projection-bias α will therefore hit if:

k1 ≥p1 + β

∑∞t=2 δ

t−1pt − θ1

θ3(σ − ρ)

where θ3 = 1 + βδ

((1− α)

γ

1− δγ+ α

11− δ

)Whereas a fully-rational agent (i.e. α = 0, β = 1) will hit if:

k1 ≥p1 +

∑∞t=2 δ

t−1pt − 11−α

1β θ1

θR3 (σ − ρ)

where θR3 = 1 + δ

(γ

1− δγ

)Consider applying a uniform excise tax τ on consumption in every period. The level of tax that will cause a biased

agent to consume only when the rational agent would have consumed is given by:

p1 + τ + β∑∞t=2 δ

t−1(pt + τ)− θ1

θ3(σ − ρ)=p1 +

∑∞t=2 δ

t−1pt − 11−α

1β θ1

θR3 (σ − ρ)

τ

(1 +

βδ

1− δ

)=[

11− δ

θ3

θR3− (1 + βδ)1− δ

]p−

[1

1− α1β− θ3

θR3

]θ1

where p is the average future price.

Table A-1: Optimal Taxesγ = 0 γ = 0.25 γ = 0.5 γ = 0.75

β = 0.9, δ = 1, α = 0 1.12 1.31 1.84 3.70β = 0.8, δ = 0.9, α = 0 2.21 2.47 3.18 5.28β = 1, δ = 0.9, α = 0.4 11.40 7.46 4.16 2.89β = 0.8, δ = 0.9, α = 0.4 15.16 11.41 8.63 8.80

B Figures and Tables

44

Table 1: Summary Statistics

Full NHISUrban Sample

NHISSmokersa

NHISNonsmokers

1990 Censusb

Cigarettes/day 18.79 –(13.13)

Age 41.89 42.59 41.63 43.05(17.99) (17.10) (18.31)

Female 0.537 0.520 0.543 0.51(.499) (.499) (.498)

Black 0.127 0.129 0.126 0.12(0.333) (0.335) (0.332)

High School or Less 0.436 0.367 0.461 0.626(0.496) (0.482) (0.498)

College or Morec 0.565 0.633 0.539 0.373(0.496) (0.482) (0.498)

Northeast 0.328 0.321 0.330 0.20Central 0.268 0.275 0.265 0.24South 0.165 0.166 0.165 0.34West 0.237 0.238 0.237 0.21

N obs. 125,211 33,637 91,574Source: National Health Interview Survey (1965-1991) and Statistical Abstract of the United States.Notes: Standard deviations are in parentheses. Cigarette consumption and demographic variables are self-reported in the NHIS. Regions correspond to census designations. Twenty cigarettes is approximately a packa day. Census data are for combined urban and rural sample.a Self-designated “Current Smoker”b Estimates for total US population. Education outcomes are for persons 25 years old and over.b Includes “Some College”

45

Figure 1: Real Prices by Metropolitan Area.6

.81

1.2

1.4

.6.8

11

.21

.4.6

.81

1.2

1.4

.6.8

11

.21

.4.6

.81

1.2

1.4

.6.8

11

.21

.4

1960 1970 1980 1990 1960 1970 1980 1990 1960 1970 1980 1990 1960 1970 1980 1990 1960 1970 1980 1990

1960 1970 1980 1990

Anaheim Atlanta Baltimore Boston Buffalo Chicago

Cincinnati Cleveland Dallas Denver Detroit Houston

Indianapolis Kansas City Los Angeles Miami Milwaukee Minneapolis

New Orleans New York Philadelphia Pittsburgh Portland San Bernadino

San Diego San Francisco San Jose Seattle St. Louis Tampa

Washington, D.C.

Real P

rice

Figure 2: Nominal Taxes by Metropolitan Area

02

04

06

00

20

40

60

02

04

06

00

20

40

60

02

04

06

00

20

40

60

1960 1970 1980 1990 1960 1970 1980 1990 1960 1970 1980 1990 1960 1970 1980 1990 1960 1970 1980 1990

1960 1970 1980 1990

Anaheim Atlanta Baltimore Boston Buffalo Chicago

Cincinnati Cleveland Dallas Denver Detroit Houston

Indianapolis Kansas City Los Angeles Miami Milwaukee Minneapolis

New Orleans New York Philadelphia Pittsburgh Portland San Bernadino

San Diego San Francisco San Jose Seattle St. Louis Tampa

Washington, D.C.

Nom

inal Ta

xes

46

Figure 3: Hazard Rates of Cigarette Smoking Initiation and Cessation

0 5 10 15 20 25 30 35 40 45 50 550

0.02

0.04

0.06

0.08

0.1

0.12

0.14

Time (Age or Duraton of Habit)

Ha

za

rd

Initiation

Cessation

Source: 1987 National Health Interview Survey.

Notes: Hazard rates are calculated from self-reported ages when started smoking and when last regularly smoked, and do not

control for observable characteristics. Quitting hazard does not include exits from smoking habits due to death.

47

Table 2: Price Sensitivity Among Young SmokersPanel A – Dependent Variable: 1(Smoker = 1), Linear Probability Model

(i) (ii) (iii) (iv) (v) (vi)Age < 20 Age < 25 Age < 30 Age < 20 Age < 25 Age < 30

ln(price)t -1.853∗∗∗ -1.256∗∗∗ -1.019∗∗∗ -1.363∗∗∗ -0.994∗∗∗ -0.744∗∗

(0.226) (0.135) (0.104) (0.525) (0.334) (0.294)ln(price)t+1 -0.118 -0.426∗∗∗ -0.545∗∗∗ -0.021 -0.389 -0.575∗∗

(0.161) (0.096) (0.075) (0.471) (0.289) (0.251)Ratio ofcoefficientsa 0.064 0.340∗∗ 0.534∗∗∗ 0.016 0.392∗∗ 0.773∗∗∗

(0.093) (0.110) (0.124) (0.339) (0.162) (0.051)

MSA Fixed Effects yes yes yes yes yes yesMSA-SpecificTrends

yes yes yes yes yes yes

N. obs 5,289 11,647 17,975 5,289 11,647 17,975

Panel B – First-stage Estimates of Pricesln(price)t

ln(taxes)t – – – 0.527∗∗∗ 0.516∗∗∗ 0.511∗∗∗

(0.038) (0.036) (0.036)Budget Gapb – – – 0.194∗∗ 0.161∗∗ 0.151∗∗

(0.089) (0.078) (0.074)ln(price)t+1

ln(taxes)t – – – 0.551∗∗∗ 0.558∗∗∗ 0.565∗∗∗

(0.054) (0.051) (0.050)Budget Gapb – – – 0.478∗∗∗ 0.454∗∗∗ 0.428∗∗∗

(0.135) (0.134) (0.131)Notes: Standard errors (in parentheses) are robust and clustered at the MSA level. All specifications control for age, age2, race,

and education and include both MSA dummies and MSA-specific time trends. * denotes significance at the 10% level; ** at the

5% level; and *** at the 1% level. Coefficients in Panel A are average marginal effects from a probit regression.a The ratio of the coefficients on ln(pricet+1) and ln(pricet) will equal the net discount factor βδ if consumers are fully rational,

or may equal 0 if consumers are sufficiently present- or projection-biased. Standard errors are calculated by the delta method.b Average percentage shortfall in state government budgets (general fund) within an MSA.

48

Table 3: Price Sensitivity – Ordered ProbitDependent Variable: Smoking Status

Less than Cutoff Less than or Equal to Cutoff(i) (ii) (iv) (v)

ln(pricet) -1.019∗∗ -0.816∗∗ -1.277∗∗∗ -1.063∗∗∗

(0.414) (0.385) (0.412) (0.382)ln(pricet) * under20 -2.233∗∗ -2.380∗∗

(0.966) (0.948)ln(pricet+1) -1.627∗∗∗ -1.821∗∗∗ -1.575∗∗∗ -1.768∗∗∗

(0.332) (0.305) (0.331) (0.303)ln(pricet+1) * under20 1.946∗∗∗ 1.962∗∗∗

(0.620) (0.542)under20 -0.021 -0.052

(0.423) (0.047)Thresholds:µ1 (nonsmoker) 218.6671 218.6596 212.8748 212.7885µ2 (1/2 pack) 219.155 219.1477 213.5996 213.5135µ3 (1 pack) 219.5799 219.5729 214.529 214.4437µ4 (2 packs) 220.7809 220.7751 215.6642 215.5800

p-value of β3 + β4 = 0 – 0.8690 – 0.7830

N obs. 43,537 43,537 43,537 43,537R2 0.167 0.167 0.173 0.174

Notes: Robust standard errors (in parentheses) and are clustered at the MSA level. All specifications control for age, age2, race,

and education. * denotes significance at the 10% level; ** at the 5% level; and *** at the 1% level. Cutoffs for smoking status

are: 0 = nonsmoker, 1 = 1/2 pack per day, 2 = 1 pack per day, 3 = 2 packs per day, 4 = 2+ packs per day

49

Table 4: Price Sensitivity Among Mature SmokersPanel A – Dependent Variable: 1(Smoker = 1), Linear Probability Model

(i) (ii) (iii) (iv)Age > 30 Age > 35 Age > 30 Age > 35

ln(pricet) -0.824∗∗∗ -0.879∗∗∗ -0.725∗∗∗ -0.787∗∗∗

(0.071) (0.077) (0.215) (0.251)ln(pricet+1) -0.641∗∗∗ -0.619∗∗∗ -0.531∗∗∗ -0.496∗∗

(0.052) (0.057) (0.183) (0.213)Ratio of Coefficientsa 0.778∗∗∗ 0.704∗∗∗ 0.734∗∗∗ 0.630∗∗∗

(0.125) (0.122) (0.044) (0.074)

MSA Fixed Effects Yes Yes Yes YesMSA-Specific Time Trends Yes Yes Yes YesN. obs 40,436 34,775 40,436 34,775Panel B – First-stage Estimates of Pricesln(price)t

ln(taxest) – – 0.494∗∗∗ 0.498∗∗∗

(0.035) (0.035)Budget Gapb – – 0.146∗ 0.132∗

(0.074) (0.072)ln(price)t+1

ln(taxest) – – 0.550∗∗∗ 0.559∗∗∗

(0.048) (0.048)Budget Gapb – – 0.398∗∗∗ 0.384∗∗∗

(0.138) (0.136)Notes: Standard errors (in parentheses) are robust and clustered at the MSA level. All specifications control for age, age2, race,

and education and include both year dummies and MSA-specific time trends. * denotes significance at the 10% level; ** at the

5% level; and *** at the 1% level. Coefficients for in Panel A are average marginal effects from a probit regression.a The ratio of the coefficients on ln(pricet+1) and ln(pricet) will equal the net discount factor βδ. Standard errors are calculated

by the delta method.b Average percentage shortfall in state government budgets (general fund) within an MSA.

50

Table 5: Permanent vs. Temporary Price EffectsPanel A – Dependent Variable: 1(Smoker = 1)

Age > 30 Age > 35 Age < 20(i) (ii) (iii) (iv) (v)

ln(pPt ) -1.111∗∗∗ -1.380∗∗∗ -1.113∗∗∗ -1.410∗∗∗ -0.822∗∗∗

(0.118) (0.181) (0.118) (0.177) (0.166)

ln(pTt ) -0.214∗∗∗ -0.217∗∗∗ -0.220∗∗∗ -0.225∗∗∗ -0.333∗∗∗

(0.044) (0.068) (0.045) (0.068) (0.112)

Ratio of Coefficientsa 5.191∗∗∗ 6.359∗∗ 5.059∗∗∗ 6.267∗∗ 2.467(1.31) (2.07) (1.26) (1.97) (1.11)

MSA Fixed Effects Yes Yes Yes Yes YesTrend Yes No Yes No NoTrends by MSA No Yes No Yes Yes

N obs. 37,651 37,651 32,322 32,322 5,289R2 0.497 0.505 0.495 0.503 0.457

Panel B – Decomposition (Dependent Variable = Real Pricet)Age > 30 Age > 35 Age < 20

Real Taxes 1.164∗∗∗ 1.160∗∗∗ 1.117∗∗∗

(0.172) (0.171) (0.170)Real KY Burley Price 0.109∗∗∗ 0.108∗∗∗ 0.101∗∗

(0.032) (0.032) (0.033)MSA Fixed Effects Yes Yes YesTime Trend Yes Yes YesR2 0.85 0.85 0.85

Notes: Standard errors (in parentheses) in Panel A are approximate bootstrap standard errors to account for the generation of

ln(pPt ) and ln(pTt ), clustered at the MSA level. Standard errors in Panel B are robust and clustered at the year level. Permanent

price is predicted using MSA fixed effects and taxes. Temporary price is predicted using Kentucky burley tobacco auction price.

All specifications include age, age2, sex, race, and education . All columns use a linear probability model. * denotes significance

at the 10% level; ** at the 5% level; and *** at the 1% level.a The ratio of the coefficients on ln(pP ) and ln(pT ) will equal (1− (1−β)δ)/(1− δ). Standard errors are calculated by the delta

method. Significance is calculated as difference from 1.

51

Table 6: Effect of Price Uncertainty on Smoking RatesDependent Variable: 1(Smoker = 1)

Probita Linear Probability Model(i) (ii) (iii) (iv) (v) (vi)

Std(Price) 0.0254∗ 0.0267∗∗ 0.0278∗∗ 0.0196∗ 0.0208∗ 0.0221∗

(0.0135) (0.0133) (0.0142) (0.0113) (0.0108) (0.0118)Std(Price) X Under20 – -0.0052 – – -0.0049 –

(0.0244) (0.0170))Under20 – -0.0930∗∗∗ – – -0.0869∗∗∗ –

(0.0246) (0.0216)Std(Price) X Under25 – – -0.0085 – – -0.0084

(0.0118) (0.0096)Under25 – – -0.317∗∗ – – -0.0319∗∗

(0.0157)) (0.0138)ln(pricet) -0.361∗∗∗ -0.355∗∗∗ -0.360∗∗∗ -0.301∗∗∗ -0.295∗∗∗ -0.301∗∗∗

(0.0573) (0.0568) (0.0543) (0.0487) (0.0482) (0.0487)Demog. Controls Yes Yes Yes Yes Yes YesTime Trend Yes Yes Yes Yes Yes Yes

R2 0.072 0.075 0.073 0.083 0.085 0.083Number of obs. 86,137 86,137 86,137 86,137 86,137 86,137

Notes: Standard errors (in parentheses) are robust and clustered at the MSA level. Std(Price) is calculated using departures

from MSA trend of prices in the first half of the sample, while the second half of the sample is used to estimate the model.

Demographics include age, age, race, and education. * denotes significance at the 1% level, ** denotes significance at the 5%

level, and *** denotes significance at the 1% level.a Coefficients are average marginal effects.

52

Table 7: Parameter Estimates

Table 5: (i) (ii) (iii) (iv)Table 4: α β δ β δ β δ β δ

(iii) 0.494 0.853 0.859 0.828 0.885 0.856 0.856 0.829 0.883(0.080) (0.106) (0.072) (0.078) (0.080) (0.107) (0.072) (0.080)

(iv) 0.410 0.717 0.879 0.700 0.901 0.720 0.875 0.701 0.899(0.138) (0.097) (0.129) (0.071) (0.138) (0.097) (0.129) (0.072)

Notes: Roman numerals refer to specifications in Tables 4 and 5. Standard errors are approximate, as calculatedusing the delta method. Estimates use the preferred estimate of (σ − ρ) = 16.883, and the coefficient on dπ

dsfrom

Column (iv) of Table 6

53

Figure 4: Overpayment Ratio

Panel A: Overpayment Relative to Biased Self

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.91

1.5

2

2.5

3

3.5

4

4.5

5

5.5

6

γ

m = 12

m= 6

Panel B: Overpayment Relative to Unbiased self

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.91

1.5

2

2.5

3

3.5

4

4.5

5

5.5

6

γ

m=6

m = 12

Notes: This figure calculates the ratio between the present values of the most expensive smoking habit that

an unaddicted teenager would accept and the least expensive smoking habit that she would refuse. Panel

A uses a biased agent in the denominator, while Panel B uses an otherwise-identical agent with α = 0,

β = 1. For a perfectly rational consumer (not facing credit constraints), this ratio should equal 1. Prices

are fixed during the consumer’s youth (i.e. for m periods) and then allowed to change during her adulthood.

Estimates use benchmark results of α = 0.410, β = 0.7 δ = 0.9, σ − ρ = 16.88. m refers to the remaining

periods of “youth”. Present values are taken using the estimated long-run discount factor δ.

54

C Additional Tables

Table A-2: Demographic ControlsDependent Variable: 1(Smoker = 1), Linear Probability Model

(i) (ii) (iii) (iv)

Age 0.00908∗∗∗ 0.00908∗∗∗ 0.01374∗∗∗ 0.01376∗∗∗

(0.00318) (0.00318) (0.00212) (0.00212)Age2 -0.00009∗∗∗ -0.00009∗∗∗ -0.00016∗∗∗ -0.00016∗∗∗

(0.00003) (0.00003) (0.00003) (0.00003)Education variables:a

Some College -0.234∗∗∗ -0.238∗∗∗ -0.0268∗∗∗ -0.0261∗∗∗

(0.0089) (0.0089) (0.0042) (0.0042)College -0.0366∗ -0.0362∗ -0.0518∗∗∗ -0.0510∗∗∗

(0.0199) (0.0198) (0.0104) (0.0105)More than College -0.0336 -0.0332 -0.0787∗∗∗ -0.0776∗∗∗

(0.0231) (0.0230) (0.0261) (0.0263)Demographic Variables:Female -0.0186∗ -0.0186∗ -0.0847∗∗∗ -0.0847∗∗∗

(0.0098) (0.0099) (0.0268) (0.0268)Black 0.0081 0.0080 0.0081∗∗ 0.0088∗∗∗

(0.0097) (0.0090) (0.0041) (0.0032)

Include Non-Urban Sampleb No No Yes YesYear Fixed Effects Yes Yes Yes YesMSA Fixed Effects and Trends No Yes No YesN obs. 120,535 120,535 410,729 410,729Notes: Standard errors (in parentheses) are robust and multi-way clustered by both year and MSA followingCameron, Gelbach and Miller (2006). * denotes significance at the 10% level; ** at the 5% level; and *** atthe 1% level.a The omitted category is “High School or Less”b Non-urban are respondents outside the 31 metropolitan statistical areas used in the main regressions. Forstatistical purposes here, they are grouped into a single “other” MSA.

55

Table A-3: Predicting Current and Future PricesDependent Variable: ln(Pricet)

(i) (ii) (iii) (iv)ln(Taxest) 0.3042 0.5006 0.5280 0.4971

(0.0222) (0.0310) (0.0315) (0.0314)[0.0621] [0.0781] [0.0562] [0.0756]

ln(KY Burleyt) – 0.1419 – 0.1402(0.0155) (0.0162)[0.0480] [0.0458]

Budget Gapt – – -0.0246 0.0337(0.0537) (0.0477)[0.1360] [0.1476]

R2 0.776 0.827 0.779 0.827Dependent Variable: ln(Pricet+1)


(0.0446) (0.0471) (0.0483) (0.0508)[0.0882] [0.0879] [0.0816] [0.0834]

ln(KY Burleyt) – -0.006 – -0.004(0.0471) (0.0631)[0.1239] [0.1111]

Budget Gapt – – 0.2655 0.2655(0.1091) (0.1095)[0.1786] [0.1778]

R2 0.844 0.853 0.860 0.867Dependent Variable: ln(Pricet+1)-ln(Pricet)


(0.0242) (0.0291) (0.0268) (0.0289)[0.0688] [0.0704] [0.0691] [0.0696]

ln(KY Burleyt) – -0.0482 – -0.0165(0.0518) (0.0441)[0.1418] [0.1319]

Budget Gapt – – 0.2902 0.2882(0.0677) (0.0659)[0.1407] [0.1421]

R2 0.102 0.150 0.197 0.225Notes: Standard errors in parentheses are robust and clustered by MSA. Standard errors in square

backets are robust and multi-way clustered by both year and MSA following Cameron et al. (2006).

All specifications include MSA fixed effects.

56

Tab

leA

-4:

Pri

ceSe

nsit

ivit

y–

Alt

erna

tive

Spec

ifica

tion

sP

anel

A:

You

ngSm

oker

s(D

epen

dent

Var

iabl

e:1

(Smoker

=1)

)A

ge<

20A

ge<

25A

ge<

30ln

(pri

cet)

-1.6

40-1

.577

-3.2

06-1

.638

-1.2

71-1

.261

-2.5

17-1

.271

-1.0

70-1

.061

-2.3

29-1

.069

(0.1

91)

(0.1

90)

(0.2

47)

(0.1

91)

(0.1

16)

(0.1

16)

(0.1

49)

(0.1

16)

(0.0

91)

(0.0

92)

(0.1

19)

(0.0

91)

ln(p

ricet+

1)

-0.1

660.

273

-0.6

61-0

.169

-0.4

46-0

.178

-0.9

10-0

.448

-0.5

44-0

.354

-0.9

98-0

.546

(0.1

61)

(0.2

61)

(0.1

67)

(0.1

61)

(0.0

96)

(0.1

56)

(0.1

02)

(0.0

96)

(0.0

75)

(0.1

21)

(0.0

79)

(0.0

75)

ln(t

axest)

––

1.40

2–

––

1.14

4–

––

1.12

3–

(0.1

42)

(0.0

87)

(0.0

68)

Rat

ioof

Pri

ceC

oeffi

cien

ts0.

101

-0.1

730.

206

0.10

30.

351

0.14

10.

362

0.35

20.

509

0.33

40.

429

0.51

0(0

.108

)(0

.158

)(0

.061

)(0

.109

)(0

.104

)(0

.130

)(0

.053

)(0

.105

)(0

.110

)(0

.130

)(0

.047

)(0

.110

)

Fir

stSt

age

(Dep

ende

ntV

aria

ble:

ln(P

ricet+

1))

ln(t

axest)

0.19

5–

0.19

50.

236

0.19

2–

0.19

20.

230

0.19

7–

0.19

70.

235

(0.0

52)

(0.0

52)

(0.0

57)

(0.0

51)

(0.0

51)

(0.0

58)

(0.0

50)

(0.0

51)

(0.0

57)

Bud

get

Gap

a t0.

298

0.35

00.

298

–0.

297

0.34

30.

298

–0.

290

0.33

80.

290

–(0

.080

)(0

.076

)(0

.080

)(0

.078

)(0

.071

)(0

.078

)(0

.076

)(0

.071

)(0

.076

)

N.

obs

5,28

95,

289

5,28

95,

289

11,6

4711

,647

11,6

4711

,647

17,9

7517

,975

17,9

7517

,975

Pan

elB

:M

atur

eSm

oker

s(D

epen

dent

Var

iabl

e:1

(Smoker

=1)

)A

ge>

30A

ge>

35ln

(pri

cet)

-0.8

73-0

.855

-2.1

22-0

.873

-0.9

30-0

.906

-2.1

70-0

.930

(0.0

63)

(0.0

63)

(0.0

82)

(0.0

63)

(0.0

69)

(0.0

69)

(0.0

90)

(0.0

69)

ln(p

ricet+

1)

-0.6

44-0

.431

-1.0

69-0

.645

-0.6

20-0

.369

-1.0

34-0

.621

(0.0

51)

(0.0

80)

(0.0

54)

(0.0

51)

(0.0

56)

(0.0

87)

(0.0

59)

(0.0

56)

ln(t

axest)

––

1.07

1–

––

1.05

9–

(0.0

46)

(0.0

50)

Rat

ioof

Pri

ceC

oeffi

cien

ts0.

737

0.50

50.

504

0.73

90.

666

0.40

70.

476

0.66

8(0

.108

)(0

.116

)(0

.038

)(0

.109

)(0

.106

)(0

.115

)(0

.040

)(0

.107

)

Fir

stSt

age

(Dep

ende

ntV

aria

ble:

ln(P

ricet+

1))

ln(t

axest)

0.17

6–

0.17

60.

211

0.17

0–

0.17

00.

205

(0.0

51)

(0.0

51)

(0.0

58)

(0.0

52)

(0.0

52)

(0.0

58)

Bud

get

Gap

a t0.

283

0.32

40.

283

–0.

285

0.32

50.

285

–(0

.074

)(0

.068

)(0

.074

)(0

.075

)(0

.068

)(0

.075

)

N.

obs

40,4

3640

,436

40,4

3640

,436

34,7

7534

,775

34,7

7534

,775

Note

s:Sta

ndard

erro

rs(i

npare

nth

eses

)are

robust

and

clust

ered

at

the

MSA

level

.A

llsp

ecifi

cati

ons

contr

ol

for

age,

age2

,ra

ce,

and

educa

tion,

and

incl

ude

both

MSA

fixed

effec

tsand

MSA

-sp

ecifi

cti

me

tren

ds.

aA

ver

age

per

centa

ge

short

fall

inst

ate

gov

ernm

ent

budget

sw

ithin

an

MSA

.

57

an empirical analysis of biases in cigarette...

Documents