CCP Estimation of Dynamic Discrete Choice
Models with Unobserved Heterogeneity∗
Peter Arcidiacono Robert Miller
Duke University Carnegie Mellon
September 29, 2006
Abstract
Standard methods for solving dynamic discrete choice models involve calculating the value
function either through backwards recursion (finite-time) or through the use of a fixed point
algorithm (infinite-time). Conditional choice probability (CCP) estimators provide a computa-
tionally cheaper alternative but are perceived to be limited both by distributional assumptions
and by being unable to incorporate unobserved heterogeneity via finite mixture distributions.
We extend the classes of CCP estimators that need only a small number of CCP’s for estima-
tion. We also show that not only can finite mixture distributions be used in conjunction with
CCP estimation, but, because of the computational simplicity of the estimator, an individual’s
location in unobserved space can transition over time. Monte Carlo results suggest that the
algorithms developed are computationally cheap with little loss in precision.
Keywords: dynamic discrete choice, unobserved heterogeneity
∗We thank Paul Ellickson and seminar participants at Duke University for valuable comments. Josh Kinsler
provided excellent research assistance.
1 Introduction
Standard methods for solving dynamic discrete choice models involve calculating the value function
either through backwards recursion (finite-time) or through the use of a fixed point algorithm
(infinite-time). Conditional choice probability (CCP) estimators provide an alternative to these
techniques which involves mapping the value functions into the probabilities of making particular
decisions. While CCP estimators are much easier to compute than estimators based on obtaining
the full solution and have experienced a resurgence in the literature on dynamic games,1 there are
at least two reasons why researchers have been reluctant to employ them in practice. First, it is
perceived that the mapping between CCP’s and value functions is simple only in specialized cases.
Second, it is believed that CCP estimators cannot be adapted to handle unobserved heterogeneity.2
This latter criticism is particularly damning as one of the fundamental issues in labor economics,
and indeed one of the main purposes of structural microeconomics, is the explicit modelling of
selection.
We show that, for a wide class of Generalized Extreme Value (GEV) distributions of the error
structure, the value function depends only on the one period ahead CCP's and, in single-agent
problems, often depends upon only the one period ahead CCP's for a single choice. The class
of problems we discuss is quite large and includes dynamic games where one of the decisions is
whether to exit. Further, unobserved heterogeneity via finite mixture distributions is not only
easily incorporated into the algorithm, but the finite mixture distributions can transition over time
as well. Previous work on incorporating unobserved heterogeneity has been restricted to cases of
permanent unobserved heterogeneity in large part because of the computational burdens associated
with allowing for persistence, but not permanence, in the unobserved heterogeneity distribution.
Using insights from the macroeconomics literature on regime switching and the computational
simplicity of CCP estimation, we show that incorporating persistent but time-varying heterogeneity
comes at very little computational cost.
We adapt the EM algorithm, and in particular its application to sequential likelihood developed
1 See Aguirregabiria and Mira (2006), Bajari, Benkard, and Levin (2006), Pakes, Ostrovsky, and Berry (2004), and Pesendorfer and Schmidt-Dengler (2003).
2 A third reason is that to perform policy experiments it is often necessary to solve the full model. While this is true, using CCP estimators would only involve solving the full model once for each policy simulation as opposed to multiple times in a maximization algorithm.
in Arcidiacono and Jones (2003), to two classes of CCP estimators that between them cover a wide
class of dynamic optimization problems and sequential games with incomplete information, based
on representations developed in Hotz et al. (1994) and Altug and Miller (1998). Our techniques
can also be readily applied to models with discrete and continuous choices by exploiting the Euler
equation representation given in Altug and Miller (1998).
The algorithm begins by making a guess as to the CCP’s for each unobserved type, conditional
on the observables. Given this initial guess, we iterate on two steps. First, given the type-specific
CCP’s, maximize the pseudo-likelihood function with the unobserved heterogeneity integrated out.
Second, update the type-specific CCP’s using the parameter estimates. These updates can take
two forms. In stationary models, the updates can come from the likelihoods themselves. In models
with finite time horizons, while the updates can come from the likelihoods themselves, a second
method may be computationally cheaper. Namely, similar to the EM algorithm, we calculate the
probability that an individual is a particular type. These conditional type probabilities are then
used as weights in forming the updated CCP’s from the data.
We illustrate the small sample properties of our estimator using two sets of Monte Carlos
designed to highlight the two methods of updating the type-specific CCP’s. The first is a finite
horizon model of teen drug use and schooling decisions. Students in each period decide whether
to stay in school and, if the choice is to stay, whether to use drugs. Before using drugs, individuals only have a
prior as to how well they will enjoy the experience. However, upon using drugs, students discover
their drug 'type' and this is used in informing their future decisions. Here we illustrate both ways
of updating the CCP's, using either the likelihoods or the conditional probabilities of being a
particular type as weights. Results of the Monte Carlo show that both methods of updating the
CCP's yield estimates similar to those of full information maximum likelihood with little loss in
precision.
The second is a dynamic entry/exit example with unobserved heterogeneity in the demand
levels for particular markets which in turn affects the values of entry and exit. Here the unobserved
heterogeneity is allowed to transition over time and the example explicitly incorporates dynamic
selection. The type-specific CCP’s are updated using the likelihoods evaluated at the current
parameter estimates and the current type-specific CCP’s. The results suggest that incorporating
time-varying unobserved heterogeneity is not only feasible but computationally simple and yields
precise estimates even of the transitions on the unobserved state variables.
Our work is most closely related to the nested pseudo-likelihood estimators developed by Aguirregabiria and Mira (2006), and Buchinsky, Hahn and Hotz (2005). Both papers seek to incorporate
a fixed effect within their CCP estimation framework drawn from a finite mixture. Aguirregabiria
and Mira (2006) show how to incorporate unobserved characteristics of markets in dynamic games,
where the unobserved heterogeneity only affects the utility function itself. In contrast, our analysis
demonstrates how to incorporate unobserved heterogeneity into both the utility functions and the
transition functions, thereby accounting for the role of unobserved heterogeneity in dynamic
selection. Buchinsky, Hahn and Hotz (2005) use the tools of cluster analysis, seeking conditions
on the model structure that allow them to identify the unobserved type of each agent, whereas we
only identify the distribution of unobserved heterogeneity across agents. Thus their approach seems
most applicable in models where there are relatively small numbers of long lived agents which may
or may not be comparable to each other, whereas our approach is applicable to large populations
where the focus is on the unobserved proportions that partition it.
2 The Framework
We consider a dynamic programming problem in which an individual makes a sequence of discrete
choices dt over his lifetime t ∈ {1, . . . , T} for some T ≤ ∞. The choice set has the same cardinality
K at each date t, so we define dt by the multiple indicator function dt = (d1t, . . . , dKt) where
dkt ∈ {0, 1} for each k ∈ {1, . . . , K} and ∑_{k=1}^K dkt = 1.
A vector of characteristics (zt, εt) fully describes the individual at each time t, where zt are a set
of time-varying characteristics, and εt ≡ (ε1t, . . . , εKt) is independently and identically distributed
over time having continuous support with distribution function G (εt). The vector zt evolves as a
Markov process, depending stochastically on the choices of the individual. We model the transition
from zt to zt+1 conditional on the choice k ∈ {1, . . . , K} with the probability distribution function
Fk (zt+1 |zt ). We assume the current utility an individual with characteristics (zt, εt) gets from
choosing alternative k by setting dkt = 1, is additively separable in zt and εt , and can be expressed
as
uk (zt) + εkt
The individual sequentially observes (zt, εt) and maximizes the expected discounted sum of utilities

E{ ∑_{t=1}^T ∑_{k=1}^K β^t dkt [uk(zt) + εkt] }

where β ∈ (0, 1) denotes the fixed geometric discount factor.
Let d_t^o ≡ (d_{1t}^o, . . . , d_{Kt}^o) denote the optimal decision rule for this problem, where d_{kt}^o ≡ d_{kt}(zt, εt) for each k ∈ {1, . . . , K}. Define the conditional valuation functions by

vk(zt) = E{ ∑_{t=1}^T ∑_{k=1}^K β^t d_{kt}^o [uk(zt) + εkt] }

and the conditional choice probabilities as

pk(zt) = ∫ dkt(zt, εt) dG(εt)
The representation theorem of Hotz and Miller (1993) implies there is a mapping from the conditional choice probabilities to the conditional valuation functions, which we now denote as

q[p(zt)] = ( q2[p(zt)], . . . , qK[p(zt)] )′ = ( v2(zt) − v1(zt), . . . , vK(zt) − v1(zt) )′
The expected contribution of the εkt disturbance to current utility, conditional on the state zt, is found by integrating over the region in which the kth action is taken, so appealing to the representation theorem, we may express it as

∫ [dkt(zt, εt) εkt] dG(εt)
= ∫ 1{εkt − εjt ≥ vj(zt) − vk(zt) for all j ∈ {1, . . . , K}} [dkt(zt, εt) εkt] dG(εt)
= ∫ 1{εkt − εjt ≥ qj[p(zt)] − qk[p(zt)] for all j ∈ {1, . . . , K}} [dkt(zt, εt) εkt] dG(εt)
≡ wk[p(zt)]
It now follows that the conditional valuation functions can be expressed as a mapping of the conditional choice probabilities for states that might be visited in the future

vk(zt) = uk(zt) + Et{ ∑_{t′=t+1}^T ∑_{k′=1}^K β^{t′−t} pk′(zt′) [uk′(zt′) + wk′[p(zt′)]] }   (1)
Similarly, we can define the expected value function as:

v(z0) = ∑_{k=1}^K pk(z0) [uk(z0) + wk(z0) + βvk(z0)]   (2)

v then measures the individual's expected utility conditional on optimal behavior once the
ε's are realized.
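For the i.i.d. Type I extreme value case taken up in Section 4, both mappings have well known closed forms: qk[p] = ln pk − ln p1 and wk[p] = γ − ln pk, where γ is Euler's constant. The following sketch (in Python, with arbitrary values; the function name is ours) verifies both forms numerically:

```python
import numpy as np

EULER_GAMMA = 0.5772156649015329

def hotz_miller_logit(p):
    """Map CCPs to value differences q_k = v_k - v_1 and to
    w_k = E[eps_k | choice k optimal], for i.i.d. T1EV errors."""
    p = np.asarray(p, dtype=float)
    q = np.log(p[1:]) - np.log(p[0])   # v_k - v_1 for k = 2, ..., K
    w = EULER_GAMMA - np.log(p)        # w_k[p] = gamma - ln p_k
    return q, w

# CCPs implied by arbitrary conditional valuation functions v_k.
v = np.array([0.0, 0.5, -0.3])
p = np.exp(v) / np.exp(v).sum()
q, w = hotz_miller_logit(p)

# The inversion recovers the value differences ...
assert np.allclose(q, v[1:] - v[0])
# ... and sum_k p_k (v_k + w_k) equals the log-sum surplus.
assert np.isclose((p * (v + w)).sum(), EULER_GAMMA + np.log(np.exp(v).sum()))
```

The second assertion is the expected-maximum identity exploited repeatedly in Section 4.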
3 Finite Dependence
The power of the Hotz and Miller inversion is particularly strong in the finite dependence case.
In order to see this, we first proceed with a simple illustration that is a special case of Altug and
Miller (1998) and then proceed to generalize their result. We begin by rewriting (1) as:

vk(zt) = uk(zt) + βEt{ ∑_{k′=1}^K pk′(zt+1) [vk′(zt+1) + wk′[p(zt+1)]] | zt, dt = k }   (3)
Now suppose there is an action l that can be taken at time t+1 such that vl does not depend upon
what action was taken at time t. That is, there may be state variables affected by the action at t,
but vl does not depend upon these variables. We can rewrite (3) as:

vk(zt) = uk(zt) + βEt{ vl(zt+1) + ∑_{k′=1}^K pk′(zt+1) [vk′(zt+1) − vl(zt+1) + wk′[p(zt+1)]] | zt, dt = k }   (4)
The Hotz and Miller inversion then allows us to write (4) as:

vk(zt) = uk(zt) + βEt{ vl(zt+1) + ∑_{k′=1}^K pk′(zt+1) [qk′l[p(zt+1)] + wk′[p(zt+1)]] | zt, dt = k }   (5)
where qkl [p (zt)] ≡ qk [p (zt)] − ql [p (zt)]. Note that the summation can be viewed as the compen-
sation an individual would need to receive to make him indifferent between committing a priori
to action l or waiting and choosing optimally in the next period. This will be important when we
generalize the finite dependence case.
For estimation purposes, we must difference vk with respect to one of the alternatives, say v1.
Since vl does not depend upon the choice at time t, the vl terms difference out, yielding:

vk(zt) − v1(zt) = uk(zt) − u1(zt)
+ βEt{ ∑_{k′=1}^K pk′(zt+1) [qk′l[p(zt+1)] + wk′[p(zt+1)]] | zt, dt = k }
− βEt{ ∑_{k′=1}^K pk′(zt+1) [qk′l[p(zt+1)] + wk′[p(zt+1)]] | zt, dt = 1 }   (6)

Hence the future value terms only depend upon the one period ahead conditional choice probabilities.
We now generalize the model along two dimensions. First, similar to Altug and Miller (1998), the number of periods that must pass before a choice is available whose conditional value does not depend upon the choice in the current period can now be any finite number. Second, and more importantly, the renewal action may still allow for the carryover of past choices. In this case finite dependence is a bit of a misnomer because only a particular class of variables need have the finite dependence. Formally, the assumption of finite dependence is as follows. Given a value of the initial state, z0 ∈ Z, there exists a finite integer ρ(z0), a value of the state variable zρ ∈ Z, and a (not necessarily unique) sequence of choices over the next ρ periods, denoted by d_t^{(ρ)}(zt) ≡ (d_{1t}^{(ρ)}(zt), . . . , d_{Kt}^{(ρ)}(zt)), that ensures the state will be zρ at date ρ(z0) and that may depend on zt but not on the transitory disturbance εt. This assumption of induced finite dependence implies the expected
valuation function can be expressed as
v(z0) = E_0^{(ρ)}{ ∑_{t=1}^ρ ∑_{k=1}^K β^t d_{kt}^{(ρ)}(zt) [uk(zt) + wk(zt)] }
+ E_0^{(ρ)}{ ∑_{t=1}^ρ ∑_{k=1}^K β^t ( [pk(zt) − d_{kt}^{(ρ)}(zt)] [uk(zt) + wk(zt)] + ∑_{l=1}^K pk(zt) d_{lt}^{(ρ)}(zt) qkl[p(zt)] ) }
+ β^{ρ+1} v(zρ)
where the expectations of the transition probabilities for zt are taken conditional on the d_t^{(ρ)}(zt) choices, denoted by E_0^{(ρ)}[·], and qkl[p(zt)] ≡ qk[p(zt)] − ql[p(zt)]. The first of the three expressions on the right side is the expected utility received from the next period through period ρ(z0) from following the prescribed choices d_t^{(ρ)}(zt); the second is the expected loss in utility incurred in those periods from not choosing optimally; the last is the expected utility from period ρ + 1 onwards when the state is zρ, which by the assumption of finite dependence does not depend on the initial choices.
The set of cases where finite dependence holds is quite large. To see this, note that we can partition the state variables into three sets: a set that evolves deterministically and may or may not depend upon the choices an individual makes, a set that evolves stochastically but is unaffected by the choices, and a set that evolves stochastically and depends upon the choices. Finite dependence puts restrictions only on the third set of variables, where the choice at ρ(z0) breaks any connection between these variables and past choices. For any z0 ∈ Z and l ∈ {1, . . . , K}, the expected valuation function v(z0) can be expressed in terms of the conditional valuation function for choice l, as well as the current utilities obtained from all possible choices and correction terms qkl[p(z0)]
that only depend on the current conditional choice probabilities. Namely,
v(z0) ≡ ∑_{k=1}^K pk(z0) [uk(z0) + wk(z0) + βvk(z0)]
= ul(z0) + wl(z0) + βvl(z0) + ∑_{k=1}^K pk(z0) {uk(z0) − ul(z0) + wk(z0) − wl(z0) + β[vk(z0) − vl(z0)]}
= ul(z0) + wl(z0) + βvl(z0) + ∑_{k=1}^K pk(z0) {uk(z0) − ul(z0) + wk(z0) − wl(z0) + βqkl[p(z0)]}
The expected valuation function is the expected current utility from taking the lth action, ul(z0) + wl(z0), plus its discounted continuation value, βvl(z0), plus a compensation in current utility from not behaving optimally,

∑_{k=1}^K pk(z0) [uk(z0) − ul(z0) + wk(z0) − wl(z0)],

plus a compensation in future utility from not behaving optimally, namely

∑_{k=1}^K β pk(z0) qkl[p(z0)].
In equilibrium the conditional choice probabilities satisfy the identifying restrictions

pk(zt) = ∫ dkt(zt, εt) dG(εt)
= Pr{ k = arg max_{l∈{1,...,K}} [ul(zt) + vl(zt) + εlt] | zt }
= Pr{ εlt − εkt < uk(zt) − ul(zt) + ∫_{z′∈Z} v(z′) {dFk[z′|zt] − dFl[z′|zt]} for all l ∈ {1, . . . , K} }
If the finite dependence assumption is satisfied, then it is straightforward to check that only terms
pertaining to the next ρ (z0) periods enter the integrand.
4 Correlation Structures and Finite Time Dependence
4.1 One Period Time Dependence and GEV Errors
We first focus on a simple two choice model. Given the framework described above, Rust (1987)
shows that if we make the additional assumption that the εt's are i.i.d. extreme value we can write
the v's as:

v0(zt) = u0(zt) + β( ∫ ln( ∑_{k=0}^1 exp(vk(zt+1)) ) f(zt+1|zt, dt = 0) + γ )

v1(zt) = u1(zt) + β( ∫ ln( ∑_{k=0}^1 exp(vk(zt+1)) ) f(zt+1|zt, dt = 1) + γ )

where γ is Euler's constant.
Note that what is inside the log is the denominator for the probability of choosing any of the
alternatives. This will hold true for any representation where the ε’s come from a GEV distribution.
This implies that the vk's can be rewritten as:

v0(zt) = u0(zt) + β( ∫ [v0(zt+1) − ln(p0(zt+1))] f(zt+1|zt, dt = 0) + γ )

v1(zt) = u1(zt) + β( ∫ [v0(zt+1) − ln(p0(zt+1))] f(zt+1|zt, dt = 1) + γ )

There are some cases where the v0's are particularly easy to calculate. For example, if dt = 0 is an absorbing state, v0 is often specified as something that is linear in the parameters. If it is possible to consistently estimate the transitions f(zt+1|zt, dt) and the one period ahead p0's, then obtaining estimates of the u1(zt)'s simplifies to estimating a logit that is linear in the parameters. Note that whether T is finite or infinite has no bearing on the problem.
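As a concrete check, the sketch below solves a hypothetical two-choice stopping problem (exit, dt = 0, is absorbing with its value normalized to zero; the flow utility, discount factor, and deterministic state transition are arbitrary choices of ours, so the integral over f collapses to evaluation at the successor state) by fixed-point iteration, and then confirms that the CCP representation reproduces v1 from the one-period-ahead exit probabilities alone:

```python
import numpy as np

EULER_GAMMA = 0.5772156649015329
beta, Z = 0.9, 5
u1 = 0.2 - 0.1 * np.arange(Z)              # flow utility of continuing
nxt = np.minimum(np.arange(Z) + 1, Z - 1)  # z moves up one step if d_t = 1

# Full solution: with exit absorbing and worth zero, only v1 must be solved.
v1 = np.zeros(Z)
for _ in range(2000):
    v1 = u1 + beta * (EULER_GAMMA + np.log1p(np.exp(v1[nxt])))
p0 = 1.0 / (1.0 + np.exp(v1))              # CCP of exiting at each state

# CCP representation: v1 rebuilt from the one-period-ahead p0 only.
v1_ccp = u1 + beta * (EULER_GAMMA - np.log(p0[nxt]))
assert np.allclose(v1, v1_ccp)
```

Here −ln p0(z′) plays the role of the future value term, so no backwards recursion or fixed-point computation is needed once p0 is in hand.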
The condition that the probability of choosing an alternative depends only on current utility and
on a one period ahead choice probability of a single choice is broader than just the cases satisfying
the terminal state property. To see this, note that the probability of choosing an alternative is
always relative to one of the choices:
v1(zt) − v0(zt) = u1(zt) − u0(zt)
+ β ∫ [ln(p0(zt+1))] [f(zt+1|zt, dt = 0) − f(zt+1|zt, dt = 1)]
+ β ∫ [v0(zt+1)] [f(zt+1|zt, dt = 1) − f(zt+1|zt, dt = 0)]

If v0 does not depend on the lagged choice except through the one period ahead flow utility, then
the future component effectively cancels out, leaving the v(zt)'s once again linear in the utility
parameters. This is a special case of Altug and Miller (1998), which establishes how to write
CCP’s when there is finite time dependence. Note that this does not rule out transitions of the
state variables provided the transitions do not depend upon the choice. Many problems of interest
naturally exhibit this property. Rust’s bus engine problem is one, along with virtually every game
that involves an exit decision.
This renewal property, where the difference in the value functions depends only on the linear
flow utility, the one-period ahead transitions on the state variables, and the one-period ahead choice
probabilities of a single choice, is not restricted solely to logit errors. Consider the case where the errors follow a nested logit specification and the 'outside option' depends upon the current state only through the one period ahead flow utility, that is, it satisfies one period time dependence. The conditional value function is then:

vj(zt) = uj(zt) + βγ + β ∫ ln( ( ∑_{k=1}^K exp( vk(zt+1)/σ ) )^σ + exp(v0(zt+1)) ) f(zt+1|zt, dt = j)

Writing the future component in terms of the probability of choosing the outside good yields:

vj(zt) = uj(zt) + βγ + β ∫ [v0(zt+1) − ln(p0(zt+1))] f(zt+1|zt, dt = j)

Note that this expression is identical to the simple logit example above. Differencing with respect
to v0 then yields the same cancellations as before.
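The rewriting works because, in the nested logit, v0(zt+1) − ln(p0(zt+1)) equals the log of the bracketed term inside the first integral. A minimal numerical check of that identity, with arbitrary values of our own choosing:

```python
import numpy as np

sigma = 0.7
v = np.array([0.2, -0.1, 0.4])  # conditional values of the nested alternatives
v0 = 0.1                        # conditional value of the outside option

# Denominator of every nested logit choice probability.
G = np.exp(v / sigma).sum() ** sigma + np.exp(v0)
p0 = np.exp(v0) / G             # CCP of the outside option

# The inclusive term ln G equals v0 - ln p0.
assert np.isclose(np.log(G), v0 - np.log(p0))
```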
Conditional on one of the options having one period time dependence, the expression above
will be the same for a broad class of correlation structures. In particular, consider a mapping
G : R^{K+1} → R^1 that satisfies the restrictions given in McFadden (1978) such that:

F(ε0, ε1, ..., εK) = exp(−G(e^{ε0}, e^{ε1}, ..., e^{εK}))

is the cdf of a multivariate extreme value distribution. Whenever G(·) can be written as:

G(e^v) = G(e^{v1(zt)}, ..., e^{vK(zt)}) + e^{v0(zt)}

then the difference between the kth conditional value function and v0 can be written as:

vk(zt) − v0(zt) = uk(zt) − u0(zt)
+ β ∫ [ln(p0(zt+1))] [f(zt+1|zt, dt = 0) − f(zt+1|zt, dt = k)]
+ β ∫ [v0(zt+1)] [f(zt+1|zt, dt = k) − f(zt+1|zt, dt = 0)]

Note that this structure can accommodate quite complex error structures. For example, Bresnahan,
Stern, and Trajtenberg (1997) allow errors to be correlated across multiple nests.3 Again, as long
as the outside option satisfies the properties listed above, the difference in the conditional value
functions only depends upon the one-period ahead probabilities of choosing the outside option.
3 Arcidiacono (2005) incorporates the BST framework into a dynamic discrete choice model.
Moreover, by using the techniques developed below to incorporate unobserved heterogeneity,
the GEV assumption is quite weak. In particular, we can approximate correlation among the
ε’s with a finite mixture distribution and keep the additive separability conditional on the draw
from the mixture distribution. Now let the error distribution of the ε’s mix over M extreme value
distributions with different intercepts and φm be the probability of the draw coming from the mth
distribution. The future component of the value of choice j is then:

vj(zt, m) = uj(zt, m) + β( ∑_{m=1}^M φm [ ∫ ln( ∑_{k=0}^1 exp(vk(zt+1, m)) ) f(zt+1|zt, dt = j) + γ ] )

Now, if any of the options exhibits one period time dependence, then once again the future utility
component depends only on the one-period ahead CCP's for one of the options and the transitions
on the state variables. To see this, suppose the kth alternative exhibits one period time dependence. We can rewrite the expression above as:

vj(zt, m) = uj(zt, m) + β( ∑_{m=1}^M φm [ ∫ ( vk(zt+1, m) − ln(pk(zt+1, m)) ) f(zt+1|zt, dt = j) + γ ] )

Differencing with respect to vk then causes the vk(zt+1, m) terms to cancel out. Estimation is then simple
once we have a means of calculating the CCP's for each of the M distributions.
4.2 Application: Female Labor Supply
While one period time dependence makes estimation quite simple, there are cases when it is unlikely
to hold. Here we cover an example that has finite dependence, but not one period time dependence.
In particular, consider the case of female labor supply. This example, of human capital accumulation
through work experience, draws on finite dependence that goes beyond one period. Suppose human
capital zt ≡ (z1t, z2t) has two dimensions: recent working experience z1t and total accumulated skill
z2t. Each period t a woman chooses between three activities, represented by dt = (d1t, d2t, d3t),
where, following our notational convention, djt ∈ {0, 1} for each j ∈ {1, 2, 3} and ∑_{j=1}^3 djt = 1.
Choosing d1t = 1, full time work, increases her skills (and thus future earnings) by one or two units
of human capital, with probabilities λ and 1 − λ respectively. Part time work increases her human
capital by one unit, while staying out of the labor force in period t eliminates recent working
experience. Thus recent work experience follows the law of motion

z1,t+1 = d1t rt + d2t

where rt is independently distributed on {1, 2} with probability λ on 1, while total accumulated
experience is measured by

z2,t+1 = z2t + d1t rt + d2t
In this example we assume the female maximizes her expected lifetime earnings adjusted for the
compensating benefits of not working, and let uj (zt) + εjt denote her period t adjusted income,
from choosing j ∈ {1, 2, 3}. To simplify the notation we analyze stationary policies in an infinite
horizon setting, which implies her expected adjusted lifetime wealth is

∑_{t=1}^∞ ∑_{j=1}^3 β^t pj(zt) [uj(zt) + wj(zt)]

where as before pj(zt) is the conditional choice probability of making choice j ∈ {1, 2, 3} and wj(zt)
is the expected value of the idiosyncratic income conditional on the jth choice being optimal.
To apply the finite dependence assumption in this example, we remark that since the woman
might accumulate up to two units of human capital if she works full time in the current period, we
must prescribe choices for the next three periods to neutralize the effects of different choices in the
current period on human capital. This motivates why we set ρ = 3 with terminal state (0, h + 2).
To achieve this vector of capital, regardless of her choice in period t, the woman should not work
in period t + 3, thus guaranteeing that her recent experience is then zero. Therefore d_{3,t+3}^{(ρ)} = 1.
The prescribed choices in the intervening periods naturally depend on the period t choice. For
example, suppose the woman has total experience of h at period t but no recent experience, that is,
zt = (0, h).
If the woman does not work in the current period, that is, setting d3t = 1, she should work part
time in the following two periods. Thus d_{2,t+1}^{(ρ)}(0, h) = 1 and d_{2,t+2}^{(ρ)}(1, h + 1) = 1. Discounted back to
period t, her expected future adjusted lifetime wealth, conditional on not working in period t, is
v3(0, h) = u3(0, h) + ∑_{j=1}^3 β pj(0, h) [uj(0, h) + wj(0, h) + q2j(0, h)]
+ ∑_{j=1}^3 β^2 pj(1, h + 1) [uj(1, h + 1) + wj(1, h + 1) + q2j(1, h + 1)]
+ ∑_{j=1}^3 β^3 pj(1, h + 2) [uj(1, h + 2) + wj(1, h + 2) + q3j(1, h + 2)]
+ β^4 v(0, h + 2)
If she works part time in period t, setting d2t = 1, then she should work part time in one of the
following two periods but not both. Thus we could set d_{2,t+1}^{(ρ)}(1, h + 1) = 1 and d_{3,t+2}^{(ρ)}(1, h + 2) = 1.
The conditional valuation function for part time work in period t can be expressed as

v2(0, h) = u2(0, h) + ∑_{j=1}^3 β pj(1, h + 1) [uj(1, h + 1) + wj(1, h + 1) + q2j(1, h + 1)]
+ ∑_{j=1}^3 β^2 pj(1, h + 2) [uj(1, h + 2) + wj(1, h + 2) + q3j(1, h + 2)]
+ ∑_{j=1}^3 β^3 pj(0, h + 2) [uj(0, h + 2) + wj(0, h + 2) + q3j(0, h + 2)]
+ β^4 v(0, h + 2)
If the woman works full time this period, then to achieve long term capital of h + 2 three periods
later, her choices depend on whether she accumulates one unit of capital in period t, in which case
she should work part time in period t + 1 or period t + 2, or two units, in which case she should not
work for the next three periods. In terms of the notation we have developed, if z1,t+1 = 1 then
d_{2,t+1}^{(ρ)}(1, h + 1) = 1 and d_{3,t+2}^{(ρ)}(1, h + 2) = 1, but if z1,t+1 = 2 then d_{3,t+1}^{(ρ)}(2, h + 2) = 1 and
d_{3,t+2}^{(ρ)}(0, h + 2) = 1. Using the expression we derived for v2(0, h), and noting that in periods t + 2
and t + 3 her state variables and prescribed actions are the same if she accumulates one unit of
capital in the current period, the expected utility accruing over the next three years, discounted
back to period t, conditional on working full time in the current period can now be expressed as

v1(0, h) = u1(0, h) + λ[ v2(0, h) − u2(0, h) − β^4 v(0, h + 2) ]
+ (1 − λ) ∑_{j=1}^3 β pj(2, h + 2) [uj(2, h + 2) + wj(2, h + 2) + q3j(2, h + 2)]
+ (1 − λ) ∑_{j=1}^3 (β^2 + β^3) pj(0, h + 2) [uj(0, h + 2) + wj(0, h + 2) + q3j(0, h + 2)]
+ β^4 v(0, h + 2)
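The prescribed sequences above can be checked mechanically against the laws of motion. The sketch below (with an arbitrary h of our choosing; the helper prescribed encodes the continuation rule described above, namely work part time until total experience reaches h + 2 and stay out thereafter) confirms that every period-t choice and every realization of rt ends at the state (0, h + 2):

```python
def step(z, choice, r):
    """Law of motion: z = (recent experience z1, total experience z2)."""
    z1, z2 = z
    gain = r if choice == 1 else (1 if choice == 2 else 0)
    return (gain, z2 + gain)

def prescribed(z, h):
    """Continuation rule d^(rho): part time until total experience
    reaches h + 2, then stay out of the labor force."""
    return 2 if z[1] < h + 2 else 3

h = 5                                      # arbitrary initial total experience
terminal = set()
for choice_t in (1, 2, 3):                 # full time, part time, out
    for r in ((1, 2) if choice_t == 1 else (1,)):
        z = step((0, h), choice_t, r)      # state entering period t + 1
        for _ in range(3):                 # prescribed choices at t+1 .. t+3
            z = step(z, prescribed(z, h), 1)
        terminal.add(z)
assert terminal == {(0, h + 2)}            # finite dependence holds
```

This simple rule reproduces exactly the choice sequences listed above for each of the three period-t alternatives.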
4.3 Application: Dynamic Games
To further show the simplicity of the estimator, we describe how to apply the estimator to a
dynamic entry/exit game that will form the basis for one of the Monte Carlos discussed in section
6. Assume that all players in the game are playing strategies consistent with a Markov-Perfect
Equilibrium. Let there be M markets with one potential entrant arriving in each market in each
period. Potential entrants choose whether or not to enter the market. Should the firm choose to
enter, the firm continues to make choices as to whether to stay in the market. Should the firm
leave the market, the firm dies; likewise, a potential entrant that chooses not to enter dies. As we
will see, whether the firm dies or enters the pool of potential entrants to be assigned to a market
is not relevant to the problem for large M. There can be at most 2 firms in a market.
Firms are all identical but subject to profit shocks for staying in or entering a market; these shocks
are private information, are distributed i.i.d. Type I extreme value, and only affect the cost of
staying in the market. Since we are focusing on a simple entry/exit example, we assume that firm
identity is not important except for whether the firm is an incumbent or a potential entrant: only
the number of incumbent and potentially entering firms matters. The firms in the market
simultaneously make decisions regarding entry and exit.
Expected lifetime profits for firm i depend upon:
1. Whether there is another player, Ej ∈ {0, 1}
2. Whether firm i is an incumbent, Ii ∈ {0, 1}
3. If there is another player, whether that player is an incumbent, Ij ∈ {0, 1}
4. A state variable that is observed by the firms but not by the econometrician, s ∈ {0, 1}
The realized flow profits for i, π, then depend upon whether, after the entry/exit decisions are
made, the firm is a monopolist or a duopolist, D ∈ {0, 1}, whether the firm started out as an
incumbent, and the state. We can then express expected lifetime profits from entering as:

vi(Ej, Ii, Ij, s) = ∑_{k=0}^1 pk(1, Ij, Ii, s) ( π(k, Ii, s) + β ∑_{s′} V(1, 1, k, s′) f(s′|s) ) + εi
where p0 and p1 are the probabilities of the other firm staying out of the market and staying in the
market, respectively. With the profits from exiting left as a deterministic scrap value, SV, i.i.d.
Type I extreme value ε's imply that we can write the previous expression as:

vi(Ej, Ii, Ij, s) = ∑_{k=0}^1 pk(1, Ij, Ii, s) ( π(k, Ii, s) + βγ + βSV − β ∑_{s′} ln(p0(1, 1, k, s′)) f(s′|s) ) + εi
5 CCP’s and Unobserved Heterogeneity
In this section we present our algorithm for solving problems with unobserved heterogeneity where
the unobserved heterogeneity is allowed to transition over time. We focus first on a method that
does not use CCP’s and then show how the algorithm easily adapts to using CCP’s.
5.1 Regime-switching and Dynamic Discrete Choice
Consider a panel data set of N individuals, where we observe T choices for each individual n ∈
{1, . . . , N} . Observations are independent across individuals. We partition the zt’s from the base
framework into what is observed and unobserved. With some abuse of notation, zt now refers
to variables that are observed by the econometrician. We assume that the unobserved variables
lead to a finite number of ‘types’ of individuals. An individual’s type may affect both the utility
function and the transition functions on the observed variables and may also transition over time.
Let there be S types, with the initial probability of being assigned to type i given by πi. Types
follow a Markov process with pjk dictating the probability of transitioning from type j to type k.4
Let Lnst be the likelihood of observing the data for individual n at time t conditional on being in
unobserved state s. Integrating out with respect to the unobservable types and their transitions
yields the following log likelihood for the sample:5

∑_{n=1}^N log( ∑_{s(1)=1}^S ∑_{s(2)=1}^S · · · ∑_{s(T)=1}^S π_{s(1)} L_{ns(1)1} ∏_{t=2}^T p_{s(t−1),s(t)} L_{ns(t)t} )   (7)
Neglecting the complications in evaluating the individual likelihoods, the unobserved hetero-
geneity portion is actually simpler than the counterparts used in the macroeconomics literature on
regime-switching. While the log likelihood above would be extremely costly to evaluate, Hamil-
ton (1990) shows that the EM algorithm simplifies the problem substantially. The EM algorithm
4 Note that this nests both the case where unobserved heterogeneity is permanent (the transition matrix would be an S × S identity matrix) and the case where type is completely transitory (the transition matrix would have the same values in row j as in row k for all j and k).
5 Note that when unobserved heterogeneity is permanent the log likelihood for the sample is given by:

∑_{n=1}^N log( ∑_{s=1}^S π_s ∏_{t=1}^T L_{nst} )
15
involves first calculating the conditional probabilities that an individual is each of the S types at
time t given the values for the parameters and the data. Let qnst be the conditional probability
individual n is type s at time t given the parameters and the data. We can calculate qnst through
the repeated application of Bayes’ rule:
$$q_{nst} = \frac{\displaystyle\sum_{s(1)=1}^{S}\cdots\sum_{s(t-1)=1}^{S}\;\sum_{s(t+1)=1}^{S}\cdots\sum_{s(T)=1}^{S} \pi_{s(1)}\, L_{ns(1)1} \prod_{t'=2}^{T} p_{s(t'-1),s(t')}\, L_{ns(t')t'}}{\displaystyle\sum_{s(1)=1}^{S}\cdots\sum_{s(T)=1}^{S} \pi_{s(1)}\, L_{ns(1)1} \prod_{t'=2}^{T} p_{s(t'-1),s(t')}\, L_{ns(t')t'}} \qquad (8)$$

where s(t) is fixed at s in the numerator.
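Although (8) is written as sums over entire type sequences, it can be evaluated in O(TS²) operations per individual with forward-backward recursions of the kind Hamilton's treatment exploits. A minimal sketch (the function name and array layout are ours, not the paper's):

```python
import numpy as np

def smoothed_type_probs(L, pi, P):
    """Forward-backward smoothing for one individual.

    L  : (T, S) array, L[t, s] = likelihood of the period-t data given type s
    pi : (S,) initial type probabilities
    P  : (S, S) type transition matrix, P[j, k] = Pr(type k at t | type j at t-1)
    Returns q: (T, S) array with q[t, s] = Pr(type s at t | all data).
    """
    T, S = L.shape
    # forward pass: alpha[t, s] = joint likelihood of data through t and type s at t
    alpha = np.zeros((T, S))
    alpha[0] = pi * L[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ P) * L[t]
    # backward pass: beta[t, s] = likelihood of data after t given type s at t
    beta = np.ones((T, S))
    for t in range(T - 2, -1, -1):
        beta[t] = P @ (L[t + 1] * beta[t + 1])
    q = alpha * beta
    return q / q.sum(axis=1, keepdims=True)
```

With the transition matrix set to the identity (permanent types) the smoothed probabilities are constant across periods, as the permanent-heterogeneity likelihood in footnote 5 implies.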
Rather than maximizing (7), the EM algorithm instead maximizes the expected log likelihood:

$$\sum_{n=1}^{N}\sum_{s=1}^{S}\sum_{t=1}^{T} q_{nst}\, L_{nst} \qquad (9)$$

taking the qnst's as given and where the L's now refer to the type-specific log likelihoods.
For the standard EM algorithm, this maximization has the same first order conditions as max-
imizing (7). Estimation then proceeds iteratively as follows. Given initial values for the utility
function parameters, the transition parameters for the observable state variables, the initial con-
ditions on the probabilities of being a particular type (the πi’s), and the transition matrix of the
unobservables (the pjk’s), calculate the probability of being each of the types for each of the time
periods. Given these conditional probabilities, maximize the expected log likelihood function given
in (9) with respect to the parameters of the utility function and the parameters governing the tran-
sitions on the unobservables. With these parameter estimates in hand, update 1) the conditional
probabilities of being each of the types for each individual at each time period, 2) the πi’s, and
3) the pjk’s. Iterate until convergence. The one piece of the method left to describe is then how
to update the πi's and the pjk's. These updates follow directly from Hamilton (1990), with the m + 1
estimates of πi and pjk given by:

$$\pi_i^{m+1} = \frac{\sum_{n=1}^{N} q_{ni1}^{m}}{N} \qquad (10)$$

$$p_{jk}^{m+1} = \frac{\sum_{n=1}^{N}\sum_{t=2}^{T} q_{nkt|j}^{m}\, q_{nj(t-1)}^{m}}{\sum_{n=1}^{N}\sum_{t=2}^{T} q_{nj(t-1)}^{m}} \qquad (11)$$
where qnkt|j gives the conditional probability of individual n being type k at time t conditional on
being type j at time t − 1:

$$q_{nkt|j} = \frac{p_{jk}\, L_{nkt} \left(\sum_{s(t+1)=1}^{S}\cdots\sum_{s(T)=1}^{S} \prod_{t'=t+1}^{T} p_{s(t'-1),s(t')}\, L_{ns(t')t'}\right)}{\sum_{k'=1}^{S} p_{jk'}\, L_{nk't} \left(\sum_{s(t+1)=1}^{S}\cdots\sum_{s(T)=1}^{S} \prod_{t'=t+1}^{T} p_{s(t'-1),s(t')}\, L_{ns(t')t'}\right)}$$

where s(t) is set to k in the numerator and to k' in the corresponding term of the denominator.
The updating of the πi's is simply the average of the conditional probabilities of being type i in
the first time period. The updating of the pjk's takes, for each individual in each time period, the
probability of being in state k conditional on the data and on being in state j in the previous
period, and weights it by the probability of being in state j in the previous period.
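In code, the updates in (10) and (11) are simple weighted averages over the conditional type probabilities. A sketch under an assumed array layout (names and shapes are ours):

```python
import numpy as np

def update_pi(q1):
    """Eq. (10): q1 is (N, S) with q1[n, i] = Pr(individual n is type i at t = 1 | data).
    The new pi is the cross-sectional average of these conditional probabilities."""
    return q1.mean(axis=0)

def update_p(q_cond, q_prev):
    """Eq. (11): update of the type transition matrix.

    q_cond : (N, T-1, S, S), q_cond[n, t, j, k] = Pr(type k at t+1 | type j at t, data)
    q_prev : (N, T-1, S),    q_prev[n, t, j]    = Pr(type j at t | data)
    """
    # joint probability of being type j at t and type k at t+1, given the data
    joint = q_cond * q_prev[:, :, :, None]
    # expected transition counts j -> k over expected occupancy of type j
    return joint.sum(axis=(0, 1)) / q_prev.sum(axis=(0, 1))[:, None]
```

Because each (j, ·) slice of q_cond sums to one, every row of the updated transition matrix sums to one as well.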
Before introducing CCP’s into the estimator, there are two other important issues related to the
estimator above. The first concerns initial conditions. It may be the case that the first observation
in the data set is not at time 1 but at some later date, implying that some dynamic selection has
already taken place. In this case, we can allow the πi's to depend upon what is observed in the
first period. For example, in a dynamic entry/exit game the πi's could depend upon the number of
firms in the market. With enough observations and a small number of discrete characteristics, the
updated πi's can then be found by averaging the conditional probabilities from period 1 within each
set of characteristics rather than over the population as a whole. If the number of initial
observed states is too large for this, then a flexible multinomial logit can be used.
The second issue relates to other advantages of using the EM algorithm for solving dynamic
discrete choice models which also hold here in the presence of regime-switching. In the framework we
described in section 2.1 there were two sets of parameters: those that governed the utility function
and those that governed the transitions of the observed state variables. Let Lntc(θ, α, znt, s) be the
likelihood of the nth individual making the actual choice c at time t conditional on the structural
parameters, θ and α, his observed characteristics, znt, and on being in unobserved state s at time t.6
Similarly, define Lntz(α, znt−1, cnt−1, s) as the likelihood of observing state znt at time t conditional
on α, the subset of the parameters that affect the transitions of the state space; on znt−1, the state
at time t − 1; on being in unobserved state s at time t − 1; and on cnt−1, the choice made at t − 1. Substituting
these likelihoods into (9) yields:

$$\sum_{n=1}^{N}\sum_{s=1}^{S}\sum_{t=1}^{T} q_{nst}\left(L_{ntc}(\theta,\alpha,z_{nt},s) + L_{ntz}(\alpha,z_{nt-1},c_{nt-1},s)\right) \qquad (12)$$
Note that the additive separability present in the case without unobserved heterogeneity is rein-
troduced at the maximization step of the EM algorithm. Arcidiacono and Jones (2003) show that
consistent estimates of the α’s can then be obtained from maximizing only the second term in
the expected log likelihood function given in (12). The estimates of the θ’s are obtained from
6Note that the dependence of this likelihood on α comes through the transitions affecting the expected future
utility.
maximizing the first term, taking the estimates of α from maximizing the second term as given.
5.2 Incorporating CCP’s
We now extend the algorithm to the case where CCP’s are used in the value function. This adds
one additional step that updates the CCP’s. There are two ways to update the CCP’s, the first
of which uses the likelihoods themselves. This method of updating is particularly convenient for
stationary problems so for the moment we suppress the time subscripts. Let Lk(θ, P, z, s) denote the
conditional likelihood of observing choice k ∈ {1, . . . , K}. For each (z, s, k), the rule for updating
P is defined as:
$$P_k^{(m+1)}(z, s) = L_k\left(\theta^{(m)}, P^{(m)}, z, s\right) \qquad (13)$$

Denoting the true value of the parameter vector (θ, P) by (θ∗, P∗), the updating rule follows
from the definitions of conditional probability and the conditional likelihood, which directly imply

$$P_k^{*}(z, s) \equiv \Pr\{k \mid z, s\} = L_k(\theta^{*}, P^{*}, z, s)$$
The second way involves using the conditional probabilities of being particular types as weights.
This method is useful when the problem is not stationary, as using the likelihoods would require
solving the full backwards recursion problem, albeit only once at each iteration rather than multiple
times within a maximization routine. For example, consider a finite horizon model where decisions
are made until time T but data are only available until time T − b. Using the likelihoods to update
would require evaluating CCP's far past when the data are available. Using this second method we
only need the T − b CCP's.7 Here, we use the individual probabilities of being particular types as
weights in updating the type-specific CCP's. Namely, the m + 1 estimate of Pk(zt, st) is given by:

$$P_k^{(m+1)}(z_t, s_t) = \frac{\sum_{n=1}^{N} q_{nst}^{(m+1)}\, d_{nkt}\, h(z_t, z_{nt})}{\sum_{n=1}^{N} q_{nst}^{(m+1)}\, h(z_t, z_{nt})} \qquad (14)$$

where h(zt, znt) smoothly approaches the indicator I(zt = znt) as N goes to infinity and dnkt
indicates whether individual n chose option k at time t. Note that N going to infinity does not
lead to consistent estimates of the qnst's. However, it does lead to consistent estimates of the
Pk(zt, st)'s, since in the population we obtain the correct distribution of the type-specific CCP's.
7Note that the two updating methods may be used in conjunction. The likelihood approach could be used for all
periods before T − b with the data approach used only for period T − b.
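For a single type s and option k, the update in (14) is a weighted frequency estimator. A sketch using a Gaussian weight in the role of h (the bandwidth and all names are illustrative; with discrete states an exact indicator can be used instead):

```python
import numpy as np

def update_ccp(q, d, z, z_grid, bandwidth=0.5):
    """Weighted-data CCP update in the spirit of eq. (14), for one type s and option k.

    q      : (N,) conditional probabilities that each individual is type s
    d      : (N,) indicator that individual n chose option k
    z      : (N,) observed state for each individual
    z_grid : (G,) points at which to evaluate the type-specific CCP
    """
    # Gaussian kernel playing the role of h(z, z_n)
    w = np.exp(-0.5 * ((z_grid[:, None] - z[None, :]) / bandwidth) ** 2)
    num = (w * (q * d)[None, :]).sum(axis=1)   # weighted choices of option k
    den = (w * q[None, :]).sum(axis=1)         # weighted mass of type s at z
    return num / den
```

When all type weights equal one and the bandwidth is small relative to the spacing of discrete states, the estimator collapses to the sample frequency of choosing k at each state.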
We now have all the pieces to fully describe the algorithm. The algorithm is initialized with
guesses for the CCP's, P, the structural parameters, θ and α, and the parameters governing the
unobserved states, π and p, and then proceeds through a sequence of iterations, where iteration
m + 1 consists of four steps:

Step 1 For each of the S types, calculate the conditional probability $q_{nst}^{(m+1)}$ that individual n is of
type s at each time period t, using (8).

Step 2 Given the $q_{nst}^{(m+1)}$'s, the $\pi_i^{(m)}$'s, and the $p_{jk}^{(m)}$'s, obtain the $\pi_i^{(m+1)}$'s and the $p_{jk}^{(m+1)}$'s using (10)
and (11).

Step 3 Maximize the expected log likelihood function to obtain estimates of $\theta^{(m+1)}$ and $\alpha^{(m+1)}$,
taking the $q_{nst}^{(m+1)}$'s, the $p_{jk}^{(m+1)}$'s, and the $P^{(m)}$'s as given. Maximization may proceed sequentially
by first obtaining $\alpha^{(m+1)}$ solely from the transitions on the observables and then taking
$\alpha^{(m+1)}$ as given when estimating $\theta^{(m+1)}$.8

Step 4 Update the type-specific CCP's using either (13) or (14) to obtain $P^{(m+1)}$.
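As a self-contained end-to-end illustration of Steps 1 and 2, the toy below fits a two-type model with Markov type transitions in which the period-t likelihood is a type-specific Bernoulli choice probability, so the maximization in Step 3 has a closed form and no CCP update is needed. The data generating process, all names, and all parameter values are ours, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, S = 3000, 5, 2
pi_true = np.array([0.6, 0.4])               # initial type probabilities
p_true = np.array([[0.9, 0.1], [0.2, 0.8]])  # type transition matrix
mu_true = np.array([0.2, 0.7])               # Pr(choice = 1 | type)

# simulate latent types and binary choices
types = np.empty((N, T), dtype=int)
types[:, 0] = rng.choice(S, size=N, p=pi_true)
for t in range(1, T):
    types[:, t] = (rng.random(N) < p_true[types[:, t - 1], 1]).astype(int)
d = (rng.random((N, T)) < mu_true[types]).astype(float)

# EM iterations
pi = np.array([0.5, 0.5])
p = np.array([[0.7, 0.3], [0.3, 0.7]])
mu = np.array([0.3, 0.6])
for _ in range(200):
    L = np.where(d[:, :, None] == 1.0, mu, 1.0 - mu)   # (N, T, S) period likelihoods
    alpha = np.empty((N, T, S))
    beta = np.ones((N, T, S))
    alpha[:, 0] = pi * L[:, 0]
    for t in range(1, T):                               # forward pass
        alpha[:, t] = (alpha[:, t - 1] @ p) * L[:, t]
    for t in range(T - 2, -1, -1):                      # backward pass
        beta[:, t] = (L[:, t + 1] * beta[:, t + 1]) @ p.T
    q = alpha * beta
    q /= q.sum(axis=2, keepdims=True)                   # Step 1: eq. (8)
    xi = alpha[:, :-1, :, None] * p * (L[:, 1:, None, :] * beta[:, 1:, None, :])
    xi /= xi.sum(axis=(2, 3), keepdims=True)            # pairwise type probabilities
    pi = q[:, 0].mean(axis=0)                           # Step 2: eq. (10)
    p = xi.sum(axis=(0, 1)) / q[:, :-1].sum(axis=(0, 1))[:, None]   # Step 2: eq. (11)
    mu = (q * d[:, :, None]).sum(axis=(0, 1)) / q.sum(axis=(0, 1))  # closed-form M-step
```

With 3000 individuals and well-separated choice probabilities, the iteration should recover the initial type probabilities, the transition matrix, and the type-specific choice probabilities up to sampling error.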
6 Small Sample Performance
We perform two Monte Carlo simulations to assess the performance of our estimator for finite
samples. Both simulations are simpler versions of proposed empirical projects discussed below.
The first simulation is of a finite horizon model and shows that both methods of updating the
CCP’s yield parameter estimates similar to full backwards recursion with little loss in efficiency.
The second simulation is an infinite horizon game and shows that CCP estimators can handle
regime switching in dynamic discrete choice models even when few time periods are observed in
the data.
6.1 Finite Horizon Monte Carlo
The first simulation is a model of teenage behavior where in each period t the teen decides among
three alternatives: drop out (d0t = 1), stay in school and do drugs (d1t = 1), or stay in school but
8Note that if the unobserved heterogeneity does not affect the transitions on the observables then these can be
estimated in a first stage outside of the algorithm described above.
do not do drugs (d2t = 1). In this model there are two types of agents, where an individual’s type
affects their utility of using drugs. The probability of being type L is given by π. An individual
learns their type through experimentation: if at any point an individual has tried drugs their
type is immediately revealed to them. Hence, the individual’s information about their type, st,
comes from the set {U,L,H} where U indicates that the individual does not know their type. The
econometrician and the individual then have the same priors as to the individual being a particular
type before the individual has tried drugs. However, once an individual has tried drugs their type
is immediately revealed to them but not to the econometrician.
The individual also faces a withdrawal cost if he uses drugs at time t−1 but does not use at time
t. Hence, the other relevant state variable, zt, is taken from {N, Y}, which indicates whether the
individual used drugs in the previous period. There are then five possible information sets {st, zt}:
{U,N}, {L, N}, {L, Y }, {H, N}, and {H, Y }. With dropping out being a terminal state, the
evolution of the state space conditional on the two stay in school options is given by the transition
matrix in Table 1.
Table 1: Evolution of the State Space†

                   {U,N}  {L,N}  {L,Y}  {H,N}  {H,Y}
{U,N}  d2t = 1       1      0      0      0      0
       d1t = 1       0      0      π      0     1−π
{L,N}  d2t = 1       0      1      0      0      0
       d1t = 1       0      0      1      0      0
{L,Y}  d2t = 1       0      1      0      0      0
       d1t = 1       0      0      1      0      0
{H,N}  d2t = 1       0      0      0      1      0
       d1t = 1       0      0      0      0      1
{H,Y}  d2t = 1       0      0      0      1      0
       d1t = 1       0      0      0      0      1

†Rows give the state at time t, columns the state at time t + 1.
With dropping out being a terminal state, we normalize utility with respect to this option. The
flow utilities net of the ε's when an individual knows their type are given by:
u1(st ∈ {L, H}, zt) = α0 + α1 + α2(st = H)
u2(st ∈ {L, H}, zt) = α0 + α3(zt = Y )
where α0 is the baseline utility of attending school, α1 is the baseline utility of using drugs, α2 is
the additional utility from being type H and using drugs, and α3 is the withdrawal cost. Note that
if the individual chooses to use drugs, no withdrawal cost is paid, meaning that zt is not relevant.
On the other hand, if the individual chooses not to use drugs, the individual's type becomes
irrelevant, as type only affects the utility of using drugs. We can then write down the similar
expressions for those who do not know their type:

u1(U, zt) = α0 + α1 + (1 − π)α2

u2(U, zt) = α0
where now the utility of using drugs is probabilistic and, since the individual has not used drugs
in the past, no withdrawal cost needs to be paid.
With the flow utilities in hand, we now turn our attention to the value functions themselves.
Here it is important to note that choosing to drop out leads to a terminal state. Combining this
with having the ε’s be distributed Type I extreme value means that the future value components
are functions of the transition on the state space and the one period ahead conditional choice
probabilities of dropping out. The expressions for the vk’s when an individual knows their type
then follow:
v1(st ∈ {L, H}, zt) = u1(st ∈ {L, H}, zt)− β ln [p0(st ∈ {L, H}, Y )] + βγ
v2(st ∈ {L, H}, zt) = u2(st ∈ {L, H}, zt)− β ln [p0(st ∈ {L, H}, N)] + βγ
with the corresponding expressions when an individual does not know their type following:
v1(U, zt) = u1(U, zt) + βγ − β (π ln [p0(L, Y)] + (1 − π) ln [p0(H, Y)])

v2(U, zt) = u2(U, zt) − β ln [p0(U, N)] + βγ
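Because dropping out is terminal and the ε's are Type I extreme value, the value functions above are closed-form in the one-period-ahead dropout CCP's, and choice probabilities are multinomial logit with the dropout value normalized to zero. A small numerical sketch (parameter values illustrative, not the Monte Carlo truth):

```python
import numpy as np

beta = 0.9
gamma = 0.5772156649015329   # Euler's constant

def values_known_type(u1, u2, p0_Y, p0_N):
    """Conditional values of the two stay-in-school options when type is known.
    p0_Y and p0_N are next period's dropout CCPs in the 'used drugs' and
    'did not use drugs' states; since dropping out is terminal, no deeper
    recursion is required."""
    v1 = u1 - beta * np.log(p0_Y) + beta * gamma
    v2 = u2 - beta * np.log(p0_N) + beta * gamma
    return v1, v2

def choice_probs(v1, v2):
    """Type I extreme value errors: multinomial logit over
    {drop out, drugs, no drugs}, with the dropout value normalized to 0."""
    ev = np.exp(np.array([0.0, v1, v2]))
    return ev / ev.sum()
```

Equal flow utilities and equal dropout CCPs yield equal conditional values, and equal conditional values yield equal choice probabilities, as expected.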
For each simulation we create 5000 simulated individuals with 5 periods of data. There are fewer
observations for those who drop out, as no further decisions occur once the simulated individual
leaves school. We estimate the model using three different methods of calculating the expected
future utility where all three are equivalent asymptotically. The first calculates the expected future
utility via backwards recursion while the second and third use CCP’s with the CCP’s updated
using the likelihoods or the weighted data respectively. The parameter values were chosen to
approximate the increase in usage as students age in the NLSY97 data, the drop out rate, and the
persistence in drug usage. Each simulation was performed 500 times. Table 2 shows that both
CCP estimators performed nearly as well as the efficient estimator. Updating the CCP's via the
likelihoods yielded smaller standard errors than using the data but the differences were small. Note
that it is particularly surprising how well the CCP estimators performed given that we are only
using data from a discrete outcome. In more standard cases unobserved heterogeneity will affect
both transitions (which may be on continuous variables) and choices. Using the variation from the
transitions is likely to further reduce the differences across the estimators.
Table 2: School and Drug Choice Monte Carlo†

                      School/No Drug  No Withdrawal  Low Drug  High Drug  Discount  Prob. of
                      Intercept       Benefit        Type      Type       Factor    High Type
True Parameters       0.2             0.4            -15       2          0.9       0.7
Efficient Estimates   0.197           0.403          -15.011   2.003      0.900     0.700
Standard Error        0.069           0.078          0.755     0.074      0.017     0.014
CCP Estimates 1‡      0.194           0.397          -14.835   1.99       0.891     0.701
Standard Error        0.085           0.092          0.800     0.087      0.017     0.014
CCP Estimates 2       0.2003          0.3934         -14.833   2.000      0.884     0.7005
Standard Error        0.097           0.105          0.817     0.099      0.026     0.014

†Listed values are the means and standard errors of the parameter estimates over the 500 simulations.
‡CCP Estimates 1 refers to updating the CCP's via the likelihoods while CCP Estimates 2 updates the CCP's
using the data directly.
6.2 Monte Carlo 2: An Infinite Horizon Entry/Exit Game
Our second Monte Carlo examines an entry/exit game along the lines of the game described in
section 4.3. We assume that the econometrician observes prices as well, though these prices do not
affect the firm’s expected profits once we control for the relevant state variables. We specify the
price equation as:
Yjt = α0 + α1(1 Firm in Market j) + α2(2 Firms in Market j) + α3sjt + ζjt (15)
where j indexes the market. Price then depends upon how many firms are in the market; upon an
unobserved state variable, sjt, that transitions over time and takes on one of two values, H or L;
and upon a normally distributed error, ζjt. Firms know the current value of this unobserved state
variable but only have expectations regarding its transitions. The econometrician has the same
information as the firms regarding the probability of transitioning from state to state but does not
know the current value of the state. Profits are assumed to be linear in whether the firm has a
competitor and the state of the market. Each of our Monte Carlos has 3000 markets9 observed for
5 periods each. The rest of the specification of the Monte Carlo follows directly from the entry/exit
game discussed in section 4.3.
Results are presented for 100 simulations in Table 3. The CCP estimator yields estimates
that are quite close to the truth with small standard errors. The noisiest parameters are those
associated with the persistence of the states and with the initial conditions. Of particular interest
are the coefficients on monopoly and duopoly in the price equation. If we ignore unobserved
heterogeneity and estimate the price equation by OLS, the coefficients are biased upward, at −.18 and
−.41 compared to the true values of −.3 and −.7. Controlling for dynamic selection shows a much
stronger effect of adding firms to the market, which is consistent with the actual data generating
process.
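The direction of the OLS bias is easy to reproduce: if high-price (H) markets attract more entrants, markets with more firms are disproportionately H, and OLS omitting the state understates how much competition depresses price. A small static simulation in this spirit (all numbers ours, not the paper's design):

```python
import numpy as np

rng = np.random.default_rng(1)
M = 100_000
s = (rng.random(M) < 0.5).astype(float)   # latent state: H with probability 0.5

# entry responds to the latent state: H markets attract more firms
nfirms = np.where(s == 1.0,
                  rng.choice([1, 2], size=M, p=[0.3, 0.7]),
                  rng.choice([0, 1], size=M, p=[0.5, 0.5]))

a0, a1, a2, a3 = 7.0, -0.3, -0.7, 1.0     # illustrative price-equation parameters
price = (a0 + a1 * (nfirms == 1) + a2 * (nfirms == 2)
         + a3 * s + rng.normal(0.0, 0.1, M))

# OLS of price on firm-count dummies, omitting the latent state s
X = np.column_stack([np.ones(M),
                     (nfirms == 1).astype(float),
                     (nfirms == 2).astype(float)])
b = np.linalg.lstsq(X, price, rcond=None)[0]
# b[1] and b[2] are biased upward (toward zero) relative to a1 and a2
```

Because markets with one or two firms carry a higher average latent state than the zero-firm markets that pin down the intercept, the estimated competition effects are pushed toward zero.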
7 Conclusion
Estimation of dynamic discrete choice models is computationally costly, particularly when controls
for unobserved heterogeneity are implemented. CCP estimation provides a computationally cheap
way of estimating dynamic discrete choice problems. In this paper we have broadened the class
9This is roughly the number of U.S. counties.
Table 3: Dynamic Entry/Exit Monte Carlo†

                               True Values   Estimates   Std. Error
Price          Intercept L        7.000        6.999       0.042
Equation       Intercept H        8.000        8.005       0.066
               1 Firm            -0.300       -0.299       0.035
               2 Firms           -0.700       -0.702       0.053
Profit         Flow Profit L      0.000       -0.002       0.032
Function       Flow Profit H      0.500        0.516       0.103
               Duopoly Cost      -1.000       -1.009       0.043
               Entry Cost        -1.500       -1.495       0.027
Unobserved     pLL‡               0.800        0.799       0.028
Heterogeneity  pHH                0.700        0.702       0.040
               πL††               0.800        0.799       0.031

† 100 simulations of 3000 markets for 5 periods. β set at 0.9.
‡ The probability of a market being in the low state in period t conditional on being in the low state at t − 1.
†† Initial probability of a market being assigned the low state.
of CCP estimators that rely on a small number of CCP’s for estimation and have shown how to
incorporate unobserved heterogeneity that transitions over time. The algorithm itself borrows
from the macroeconomics literature on regime switching, and in particular from the insights gained
from the EM algorithm, to form an estimator that iterates on 1) updating the conditional
probabilities of being a particular type, 2) updating the CCP's, 3) forming the expected future
utility as a function of the CCP's, and 4) maximizing a likelihood function in which the expected
future utility is taken as given. The algorithm was shown to be computationally simple, with
little loss of information relative to full information maximum likelihood.
References

[1] Altug, S. and R.A. Miller (1998): "The Effect of Work Experience on Female Wages and Labour
Supply," Review of Economic Studies, 65, 45-85.

[2] Aguirregabiria, V. and P. Mira (2002): "Swapping the Nested Fixed Point Algorithm: A Class
of Estimators for Discrete Markov Decision Models," Econometrica, 70, 1519-1543.

[3] Aguirregabiria, V. and P. Mira (2006): "Sequential Estimation of Dynamic Discrete Games,"
forthcoming in Econometrica.

[4] Arcidiacono, P. (2005): "Affirmative Action in Higher Education: How do Admission and
Financial Aid Rules Affect Future Earnings?" Econometrica, 73, 1477-1524.

[5] Arcidiacono, P. and J.B. Jones (2003): "Finite Mixture Distributions, Sequential Likelihood,
and the EM Algorithm," Econometrica, 71, 933-946.

[6] Bajari, P., L. Benkard, and J. Levin (2006): "Estimating Dynamic Models of Imperfect Competition,"
forthcoming in Econometrica.

[7] Becker, G. and K.M. Murphy (1988): "A Theory of Rational Addiction," Journal of Political
Economy, 96, 675-700.

[8] Bresnahan, T.F., S. Stern and M. Trajtenberg (1997): "Market Segmentation and the Sources of
Rents from Innovation: Personal Computers in the Late 1980s," RAND Journal of Economics,
28, 17-44.

[9] Eckstein, Z. and K.I. Wolpin (1999): "Why Youths Drop Out of High School: The Impact of
Preferences, Opportunities, and Abilities," Econometrica, 67, 1295-1339.

[10] Ericson, R. and A. Pakes (1995): "Markov-Perfect Industry Dynamics: A Framework for
Empirical Work," Review of Economic Studies, 62, 53-82.

[11] Hamilton, J.D. (1990): "Analysis of Time Series Subject to Changes in Regime," Journal of
Econometrics, 45, 39-70.

[12] Hotz, V.J. and R.A. Miller (1993): "Conditional Choice Probabilities and the Estimation of
Dynamic Models," Review of Economic Studies, 60, 497-529.

[13] Pakes, A., M. Ostrovsky, and S. Berry (2004): "Simple Estimators for the Parameters of Discrete
Dynamic Games (with Entry/Exit Examples)," Working Paper, Harvard University.

[14] Pesendorfer, M. and P. Schmidt-Dengler (2003): "Identification and Estimation of Dynamic
Games," NBER Working Paper #9726.

[15] Rust, J. (1987): "Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold
Zurcher," Econometrica, 55, 999-1033.