machines, buildings, and optimal dynamic...
TRANSCRIPT
Machines, Buildings, and Optimal Dynamic Taxes∗
Ctirad Slavıka and Hakki Yazicib
aGoethe University Frankfurt, Frankfurt, Germany. Email: [email protected]
bSabanci University, Istanbul, Turkey. E-mail: [email protected]
February 22, 2014
Abstract
The effective taxes on capital returns differ depending on capital type in the U.S.
tax code. This paper uncovers a novel reason for the optimality of differential capital
taxation. We set up a model with two types of capital - equipments and structures -
and equipment-skill complementarity. Under a plausible assumption, we show that it
is optimal to tax equipments at a higher rate than structures. In a calibrated model,
the optimal tax differential rises from 27 to 40 percentage points over the transition to
the new steady state. The welfare gains of optimal differential capital taxation can be
as high as 0.4% of lifetime consumption.
JEL classification: E62, H21.
Keywords: Differential capital asset taxation, equipment capital, structure capital,
equipment-skill complementarity.
∗Corresponding author: Ctirad Slavık, Gruneburgplatz 1 (Campus Westend) House of Finance, Room3.49 60323 Frankfurt am Main, Germany. Telephone: +49 (0)69 798 33808.
1 Introduction
In the U.S. corporate tax code, the effective marginal tax rates on returns to capital assets
show a considerable amount of variation depending on the capital type. For instance, ac-
cording to Gravelle (2011), the effective marginal tax rate on the returns to communications
equipment is 19%, whereas it is above 35% for non-residential buildings.1 This feature of the
tax code has been the subject of numerous reform proposals since the 1980s. Recently, Pres-
ident Obama called for a reform to abolish the tax rules that create differential taxation of
capital assets in order to “level the playing field” across companies.2 Many economists have
argued in favor of the proposals to abolish tax differentials following an efficiency argument
first raised by Diamond and Mirrlees (1971): taxing different types of capital at different
rates distorts firms’ production decisions, thereby creating production inefficiencies.
In this paper, we take a step back and reassess whether differential taxation of capital
income is a desirable feature of the tax code. Theoretically, we uncover a novel economic
mechanism that calls for optimality of differential capital asset taxation, but with an impor-
tant caveat. In the current U.S. tax code, the effective tax rate on equipment capital (i.e.,
mostly machines) is on average 5% below the effective tax rate on structure capital (i.e.,
mostly non-residential buildings). In contrast, our theory suggests that capital equipments
should be taxed at a higher rate than capital structures. We conduct a quantitative exercise
to assess the quantitative importance of optimal differential capital taxation. In our baseline
calibration, we find that the tax rate on capital equipments should be at least 27 percentage
points higher than the tax rate on capital structures in the transition and at the steady
state. Furthermore, we find that the welfare gains of optimal differential capital taxation
are as high as 0.4% of lifetime consumption for reasonable parameter values.
We study dynamic optimal taxes in an economy in which people are heterogeneous in
1The capital tax differentials are created through tax depreciation allowances that differ from actualdepreciation rates. We explain this in detail and provide further information on the historical evolution ofcapital tax differentials in the U.S. tax code in Appendix A.
2The 2011 U.S. President’s State of the Union Address. Retrieved from http://www.whitehouse.gov/the-press-office/2011/01/25/remarks-president-state-union-address.
1
terms of their skills, and the government uses capital and labor income taxes to provide redis-
tribution (insurance). In our benchmark model, we consider an environment with permanent
skills. We then generalize our main theoretical results to an environment with stochastic
skills. Our approach to optimal dynamic taxation follows the recent New Dynamic Public
Finance literature in the sense that taxes are allowed to be arbitrary functions of people’s
past and current incomes.
The key feature of our environment is equipment-skill complementarity in the production
technology. Following Gravelle (2011), we group capital assets into two categories: structure
capital and equipment capital. We further assume that there are two types of labor: skilled
and unskilled. Following the empirical evidence for the U.S. economy provided by Krusell,
Ohanian, Rıos-Rull, and Violante (2000), we assume that the degree of complementarity
between equipment capital and skilled labor is higher than the degree of complementarity
between equipment capital and unskilled labor. Structure capital is neutral in terms of its
complementarity with skilled and unskilled labor. More generally, Flug and Hercowitz (2000)
provide evidence for equipment-skill complementarity for a large panel of countries.
Equipment-skill complementarity implies that skilled and unskilled labor are not perfect
substitutes and that the skill premium – defined as the ratio of the skilled wage to the un-
skilled wage – is endogenous. In particular, a decrease in the stock of equipment capital
decreases the skill premium, thereby creating an indirect transfer from the skilled agents
to the unskilled ones. Therefore, depressing the level of equipment capital creates an extra
channel of redistribution and/or insurance. In order to depress equipment capital accumu-
lation, the government taxes returns to equipment capital at a higher rate than it taxes
returns to structure capital. This implies the optimality of differential capital taxation.
We assess the quantitative importance of differential capital taxation using the model
with permanent skills calibrated to the U.S. economy. In our benchmark calibration, the
optimal equipment capital income tax is 27.6 percentage points higher than the tax on
structure capital in the first period. The tax differential rises along the transition path to
2
39.6 percentage points at the steady state.
The skill premium is about 40% in the first period after the optimal tax reform, and rises
over the transition to 48% in the new steady state. Thus, the ‘optimal’ skill premium in any
period is significantly lower than 80%, the empirical estimate for the current U.S. economy.
This suggests that the optimal tax system relies much more on indirect redistribution than
the current U.S. tax system. We also find that the optimal skill premium is rising over
the transition because the economy is growing, and hence, the level of equipment capital
increases. This result is interesting as it suggests that, even if the government cares about
equality, an increasing skill premium is optimal in a growing economy.
Next, we assess the welfare gains of optimal differential capital taxation. We compare
welfare in the optimal tax system with welfare in a tax system, in which the government is
unrestricted in its choice of labor income taxes, but the tax rates on both types of capital
are restricted to be equal to the values in the U.S. tax code. The additional welfare gains of
allowing for differential capital taxation are 0.19% in terms of lifetime consumption in the
benchmark and can be as high as 0.40% within the set of reasonable parameter values.
We focus on the redistribution and insurance provision role of differential capital taxation.
There could be other reasons for differential taxation of capital. For instance, some authors
have argued that investment in equipment capital might create positive externalities. Other
things being equal, positive externalities would be a reason to tax equipment capital at a
lower rate than structure capital. Auerbach, Hassett, and Oliner (1994) point out, however,
that it is hard to support the existence of such positive externalities on empirical grounds.
In this paper, we abstract from all other possible reasons for differential capital taxation in
order to isolate its redistributive and insurance provision role.
Related Literature. Our paper relates to three distinct strands of literature. First, in
their seminal paper Diamond and Mirrlees (1971) show that tax systems should maintain
productive efficiency. In an environment with multiple capital types, this result implies that
all capital should be taxed at the same rate. However, Auerbach (1979) and Feldstein (1990)
3
show that it might be optimal to tax capital differentially if the government is exogenously
restricted to a narrower set of fiscal instruments than in Diamond and Mirrlees (1971). Our
paper is different in the sense that the optimality of differential capital taxation stems from
redistribution and/or insurance motives.
Our paper follows the New Dynamic Public Finance (NDPF) tradition. This literature
studies optimal capital and labor income taxation in dynamic settings in which agents’ la-
bor skills are allowed to change stochastically over time and the optimal tax system can be
arbitrarily nonlinear in the history of capital and labor income.3 No paper in this literature,
however, has studied differential taxation of capital assets prior to the current paper. In
addition, our paper contributes to the NDPF literature by adding to a set of recent papers
that aim to provide practical policy recommendations by quantifying the theoretical impli-
cations of the NDPF literature, see e.g, Fukushima (2010), Huggett and Parra (2010), Farhi
and Werning (2013), and Golosov, Troshkin, and Tsyvinski (2013).
Our paper is also related to a set of theoretical studies on optimal static Mirrleesian
taxation with endogenous wages. Stiglitz (1982) assumes that the labor supplies of agents
with different skills are imperfect substitutes and shows that the agent with the highest
income should be subsidized. Naito (1999) shows that the uniform commodity taxation result
of Atkinson and Stiglitz (1976) and productive efficiency result of Diamond and Mirrlees
(1971) are no longer valid under imperfect labor substitutability. Ales, Kurnaz, and Sleet
(2014) analyze a static optimal tax problem in which agents with different skills are assigned
to tasks (occupations). They calculate optimal taxes for the U.S. economy for the 1970s
and the 2000s and compare them to their empirical counterparts. In addition, they analyze
the impact of technical change on optimal taxes. We differ from this literature by focusing
on a dynamic environment with different types of capital, which we use to analyze optimal
differential taxation of capital assets both theoretically and quantitatively.
The rest of the paper is structured as follows. In Section 2, we lay out the model for the
3For seminal contributions to NDPF, see Golosov, Kocherlakota, and Tsyvinski (2003), Kocherlakota(2005), and Albanesi and Sleet (2006). For an excellent review of this literature, see Kocherlakota (2010).
4
case of permanent skills. In Section 3, we show that differential capital taxation is optimal in
this environment. Section 4 generalizes our main results to an environment with stochastic
skills. Section 5 discusses our quantitative results, and Section 6 concludes. Appendix A
discusses differential taxation of capital assets in the U.S. tax code, Appendix B contains the
proofs, Appendix C provides a formal implementation of the constrained efficient allocation
in an incomplete markets environment, and Appendix D defines alternative social planning
problems that we analyze in Section 5.
2 Model
There is a continuum of measure one of agents who live for infinitely many periods. They
differ in their skill levels: they are born either skilled or unskilled, h ∈ H = {u, s}. A fraction
πh of agents belong to skill group h. In the main body of the paper, we assume that agents’
skills are permanent. Permanent skills is a natural assumption given that in our quantitative
analysis we associate skill levels with educational attainment. In Section 4, we show that
our main theoretical results remain valid for a general stochastic skill process.
Production Technology. An agent of skill level h produces l · zh units of effective h
type labor when he works l units of labor. There are two different occupational sectors in
this economy: a skilled occupation in which only skilled agents are allowed to work and an
unskilled occupation in which only unskilled agents are allowed to work. The first assumption
reflects the fact that unskilled people do not have the skills to work in the skilled occupation.
The second assumption can be rationalized as follows. We assume that agents get the same
disutility from working in the two occupations. Therefore, a skilled agent will choose to work
in the skilled occupation as long as he gets a higher wage in the skilled occupation. This
reasoning holds in the presence of taxes under our assumption that taxes are functions of
income histories only. We discuss the nature of the tax system in more detail below.
Output is produced according to a production function Y = F (Ks, Ke, Ls, Lu), where
5
Ls, Lu, Ks and Ke denote the aggregate amounts of effective skilled labor, effective unskilled
labor, structure capital, and equipment capital. Output can be used for consumption or
can be converted to structure or equipment capital one-for-one. The economy is initially
endowed with K∗s,1 and K∗e,1 units of the capital goods. We define a function F that gives the
total wealth of the economy: F = F + (1− δs)Ks + (1− δe)Ke, where δs and δe denote the
depreciation rates of structure and equipment capital. We define Fi(·) and Fi(·) as partial
derivatives of F and F with respect to the ith argument.
Wages. Agents of type h ∈ H receive wage wh,t in period t for one unit of their labor:
ws,t = F3(Ks,t, Ke,t, Ls,t, Lu,t) · zs, wu,t = F4(Ks,t, Ke,t, Ls,t, Lu,t) · zu. (1)
Equipment-Skill Complementarity. Following Krusell, Ohanian, Rıos-Rull, and Vi-
olante (2000), we assume that the production technology features equipment-skill comple-
mentarity, which means that the degree of complementarity between equipment capital and
skilled labor is higher than that between equipment capital and unskilled labor. This as-
sumption has two important implications that make our model different from the standard
model in the NDPF literature. First, an increase in the stock of equipment capital decreases
the ratio of the marginal product of unskilled labor to the marginal product of skilled labor.
In other words, the ratio of skilled to unskilled wages (skill premium) is endogenous, and this
ratio is increasing in equipment capital. Structure capital, on the other hand, is assumed to
be neutral in terms of its complementarity with skilled and unskilled labor. Second, skilled
and unskilled labor are no longer perfect substitutes which implies that the skill premium is
decreasing in the total amount of skilled labor and increasing in the total amount of unskilled
labor. We formalize these assumptions on technology as follows.
Assumption 1. F3(·)/F4(·) is independent of Ks.
Assumption 2. F3(·)/F4(·) is strictly increasing in Ke.
Assumption 3. F3(·)/F4(·) is strictly decreasing in Ls and strictly increasing in Lu.
6
Assumptions (1) - (3) are maintained throughout the paper without further reference.
Preferences. An agent of type h evaluates a consumption-labor sequence, (ch,t, lh,t)∞t=1,
with a utility function that is time-separable and separable between consumption and labor,
∞∑t=1
βt−1 [u(ch,t)− v(lh,t)] ,
where β ∈ (0, 1) is the discount factor, u, v : R+ → R. We assume that u′,−u′′, v′, v′′ > 0.
Allocation. An allocation is x = ((ch,t, lh,t)h∈H , Ks,t, Ke,t, Ls,t, Lu,t)∞t=1 .
Feasibility. An allocation is feasible if in any period t ≥ 1,
∑h=u,s
πhch,t +Ks,t+1 +Ke,t+1 +Gt ≤ F (Ks,t, Ke,t, Ls,t, Lu,t), (2)
Lh,t = πhlh,tzh, for h ∈ H and Ks,1 ≤ K∗s,1, Ke,1 ≤ K∗s,1. (3)
Here, {Gt}∞t=0 is a sequence of exogenously given wasteful government consumption.
Optimal Tax Problem. As in the U.S. tax code, we assume that taxes only depend on
people’s incomes, and not directly on their skills, occupations, wages, or labor supplies. We
do not model why the government does not use this information in the tax code (there could
be constitutional, administrative or other reasons), but rather focus on the best policy given
the existing fiscal framework. Following Mirrlees (1971) and the recent New Dynamic Public
Finance literature, we impose no further restrictions on the tax code; specifically, taxes can
be arbitrarily nonlinear functions of income histories.
Following Kocherlakota (2010), we make no explicit mention of private information in
motivating why we restrict attention to tax systems that depend only on incomes. However,
the fact that the government can condition taxes only on income implies that the optimal tax
problem is isomorphic to a social planning problem, in which agents are privately informed
about their skills, occupations, wages, and labor supplies. Income and consumption are
public information. In the planning problem, each agent reports his skill type to the planner
7
and receives an allocation as a function of his report.4 The set of allocations available to the
planner is constrained by incentive compatibility constraints, which ensure that agents do
not misreport their types.5
Our strategy is to first characterize the solution to the planning problem and then use
this characterization to back out properties of an optimal tax system. We now formally
define the notion of incentive compatibility.
Incentive Compatibility. With permanent types, people report their type only once
in the first period. Moreover, since we assume that agents cannot switch occupations, an
agent can only mimic the other type’s income level by adjusting his labor hours. As a result,
the planner faces only two incentive constraints.
We say that an allocation is incentive compatible if and only if for all h ∈ H
∞∑t=1
βt−1 [u(ch,t)− v(lh,t)] ≥∞∑t=1
βt−1
[u(cj,t)− v(
lj,twj,twh,t
)
], (4)
where j denotes the complement of h in the set H.
Social Planning Problem. We analyze the problem of a planner who maximizes a
Utilitarian objective with equal weights on all agents. The social planning problem is
maxx
∑h∈H
πh
∞∑t=1
βt−1 [u(ch,t)− v(lh,t)] s.t. (1), (2), (3), and (4).
The allocation that solves the social planning problem is called the constrained efficient
allocation and is denoted with an asterisk throughout the paper.
4Agents only report their skill types, because given that income is observable and skilled (unskilled)agents can only work in the skilled (unskilled) occupation, knowing an agent’s skill type reveals all hisprivate information.
5The restriction to direct truth-telling mechanisms is without loss of generality because of the followingargument. Any market arrangement with taxes is a particular mechanism. By revelation principle, nosuch mechanism can do better than the optimal direct truth-telling mechanism. Conversely, in AppendixC, we prove that there is a tax system that implements the allocation that arises from the optimal directtruth-telling mechanism. Therefore, finding the optimal tax system reduces to finding the optimal directtruth-telling mechanism, which is the problem of a social planner who assigns allocations as functions ofagents’ types subject to incentive compatibility constraints.
8
3 Optimality of Differential Taxation of Capital
In this section, we uncover the economic mechanism that calls for differential capital tax-
ation. We show that, with equipment-skill complementarity, as long as only the incentive
constraint that prevents skilled agents from pretending to be unskilled binds, the optimal
tax on equipment capital is strictly higher than the optimal tax on structure capital. We
begin by formalizing the assumption on the pattern of binding incentive constraints.
Assumption 4. The incentive constraint (4) binds for h = s and is slack for h = u at the
solution to the social planning problem.
In all numerical simulations in Section 5, in which we parameterize the model to match
the U.S. data, the skilled wage is higher than the unskilled wage in every period. However, in
our environment with endogeneous wages, it is not possible to guarantee that skilled wages
are always higher than unskilled wages without making very restrictive assumptions on F .
Without monotonic wages, it is not possible to determine the pattern of binding incentive
constraints. Therefore, we proceed directly with Assumption 4, see Stiglitz (1982) for the
same approach. We verify that Assumption 4 is satisfied in all our quantitative exercises.
3.1 Capital Return Wedge
In the standard growth model with two types of capital, aggregate savings are allocated
between the two types of capital in a way that equates their marginal returns. Proposition
1 below shows that this is not true in the constrained efficient allocation, meaning it is
optimal to create a wedge between the marginal returns to structure and equipment capital.
This result forms the basis for the optimality of differential taxation of capital: to create the
optimal wedge in the market equilibrium, the two types of capital should be taxed differently.
Proposition 1. Suppose Assumption 4 holds. Then, at the constrained efficient allocation,
in any period t ≥ 2, F1(K∗s,t, K∗e,t, L
∗s,t, L
∗u,t) < F2(K∗s,t, K
∗e,t, L
∗s,t, L
∗u,t).
9
Proof. Let λtβt−1 be the multiplier on period t feasibility constraint and µ be the
multiplier on skilled agents’ incentive constraint. The first order optimality conditions with
respect to the two types of capital are:
(Ke,t) : λ∗t−1 = β[λ∗t F2(K∗s,t, K
∗e,t, L
∗s,t, L
∗u,t) +X∗t
],
(Ks,t) : λ∗t−1 = βλ∗t F1(K∗s,t, K∗e,t, L
∗s,t, L
∗u,t), where
X∗t = µ∗v′(l∗u,tw
∗u,t
w∗s,t
)l∗u,t
∂(w∗u,tw∗s,t
)∂K∗e,t
.
By equipment-skill complementarity, we have ∂(w∗u,tw∗s,t
)/∂K∗e,t < 0. Since µ∗ > 0, we have
X∗t < 0. Using X∗t < 0 together with the first-order conditions gives the result. �
Because of equipment-skill complementarity, increasing the level of equipment capital in
period t decreases the wage ratio w∗u,t/w∗s,t. This makes it more profitable for the skilled agents
to pretend to be unskilled and, hence, tightens the skilled incentive constraint. From a plan-
ning perspective, this means that increasing equipment capital has an extra negative return,
X∗t < 0, in addition to the physical return, F ∗2,t, where F ∗i,t denotes Fi(K∗s,t, K
∗e,t, L
∗s,t, L
∗u,t).
Since structure capital is neutral, changing its level does not affect the incentive constraint,
and hence its only return is its physical return, F ∗1,t. In order for the overall return on the
two types of capital to be equal, the physical return on equipment capital must higher than
the physical return on structure capital at the constrained efficient allocation.
This result is intuitive: decreasing the level of equipment capital has an additional
marginal benefit for the planner, because it decreases the skill premium and thus indirectly
redistributes from the skilled to the unskilled. Decreasing the level of equipment capital in-
creases its return above the return on structure capital due to diminishing marginal returns.
This intuition shows that there is an extra reason to depress equipment capital accumulation
relative to structure capital. This implies that equipment capital should be taxed at a higher
rate than structure capital as we show in Section 3.2.
Two features of the model are key for the optimality of the capital return wedge. First,
10
if equipment capital was also neutral in terms of its complementarity with the two types
labor, then, X∗t = 0, and hence, it would be efficient to equate the physical marginal returns
to the two types of capital. Second, if the government could condition taxes on skill types, it
could redistribute via type-specific lump-sum taxes at zero efficiency cost and would not need
the indirect (and distortionary) channel of redistribution, which works through the capital
return wedge. In terms of the planning problem, this would mean that skills were not private
information but publicly known. As a result, there would be no incentive constraints, and
hence, X∗t = 0, and the optimal capital return wedge would again be zero.
3.2 Optimal Differential Capital Taxes
In this section, we provide a link between the optimality of the capital return wedge and the
optimality of differential capital taxation. In Proposition 2, we characterize the properties of
optimal wedges (distortions) that a planner has to create in the intertemporal allocation of
resources in order to implement the constrained efficient allocation in a competitive market
environment, in which people are allowed to save through both types of capital. Formally, the
optimal intertemporal wedge that the planner has to create for an agent of type h for capital
of type i ∈ {s, e} from period t to t+1 is defined as τ ∗i,t+1(h) = 1−u′(c∗h,t)/[βF ∗i,t+1u
′(c∗h,t+1)].
Proposition 2. Suppose Assumption 4 holds. Then,
1. In all periods t ≥ 2, the optimal wedge on equipment capital is strictly positive and
independent of agent type, whereas the optimal wedge on structure capital is zero, i.e.,
for all h ∈ H, τ ∗e,t ≡ τ ∗e,t(h) > τ ∗s,t ≡ τ ∗s,t(h) = 0.
2. If a steady state of the constrained efficient allocation exists, then the optimal wedge
on equipment capital is strictly positive at the steady state.
Proof. Relegated to Appendix B. �
Part 1 of Proposition 2 calls for zero taxation of structure capital and positive taxation
of equipment capital in every period. Recall that by Assumption 1, a change in the level
11
of structure capital does not affect the skill premium. Therefore, there is no indirect redis-
tribution motive to distort structure capital accumulation. In addition, it follows from the
uniform commodity taxation result of Atkinson and Stiglitz (1976) that in the absence of
skill risk, it is optimal not to tax structure capital.6 In contrast, taxing equipment capital
has the extra benefit of decreasing the skill premium, thus providing indirect redistribution.
Therefore, the planner finds it optimal to tax equipment capital.7 Finally, the proposition
also shows that the capital tax rates are type independent.
Part 2 of Proposition 2 says that the optimal wedge on equipment capital is positive
in steady state. This result is interesting because it shows that the indirect redistribution
channel calls for taxing equipment capital not only in the short run but also in the long run.
This result is in contrast with the long run optimality of zero capital taxation in the Ramsey
literature shown by Chamley (1986) and Judd (1985).
3.3 Intratemporal Wedges
Our model has interesting implications for intratemporal wedges (i.e., marginal labor income
taxes) as well. The optimal intratemporal wedge in period t for an agent of skill type h,
defined as τ ∗y,t(h) = 1 − v′(l∗h,t)/[w∗h,tu
′(c∗h,t)], measures the efficient distortion that the
planner needs to create in this agent’s intratemporal allocation of consumption and labor
in period t. The famous no distortion at the top result, proven originally by Sadka (1976)
and Seade (1977), states that in a static Mirrleesian economy, if the distribution of skills
has a finite support, then the consumption-labor decision of the agent with the highest skill
level should not be distorted. Huggett and Parra (2010) proves this result for a dynamic
Mirrleesian economy in which skill types are permanent and a version of our Assumption 4
holds. In Proposition 3, we show that the no distortion at the top result does not hold in the
6The optimality of not taxing structure capital is closely related to Werning (2007), who shows thatwith permanent types zero capital taxation is optimal in a dynamic Mirrleesian model with standard Cobb-Douglas production function.
7If Assumption 4 is not satisfied, it will still be generically optimal to tax the two types of capitaldifferentially, as we show explicitly in a more general environment in Section 4. However, in that case, it isnot possible to determine which capital good will be taxed at a higher rate.
12
presence of equipment-skill complementarity. In particular, we show that the skilled agents’
labor income should be subsidized.
Proposition 3. Suppose Assumption 4 holds. Then, in any period t ≥ 1, the optimal
intratemporal wedge of the skilled agent is negative, i.e., τ ∗y,t(s) < 0.
Proof. Relegated to Appendix A. �
The intuition for this result is as follows. Under the equipment-skill complementarity as-
sumption, skilled and unskilled labor are imperfect substitutes. This implies that increasing
the labor supply of the skilled agents decreases the skill premium which means that increas-
ing skilled labor supply creates indirect redistribution. In order to encourage the supply of
skilled labor, the government finds it optimal to subsidize skilled labor at the margin. This
result is in line with Stiglitz (1982), who shows that when two types of labor are imperfect
substitutes, the more productive agents’ labor supply should be subsidized.
4 Generalization to Stochastic Skills
In the model laid out in Section 2, agents’ skill types are permanent. In this section, we
allow for agents’ skills to evolve stochastically over time. This level of generality is useful
because it allows us to establish that our main theoretical results remain valid if people’s
skills change after they enter the labor market, or if we take a dynastic interpretation of our
model in which skills change from one generation to another. Notice that in this environment
with stochastic skills the government uses taxes to provide insurance in addition to providing
redistribution and financing public spending.
We first show that differential taxation of capital is optimal for any stochastic skill pro-
cess. Moreover, under an assumption regarding the pattern of binding incentive compatibility
conditions, it is optimal to tax equipment capital at a higher rate than structure capital.
The environment is the same as in Section 2 except that people are born identical, but
their skills evolve stochastically over time. A skill realization in period t is denoted by
13
ht ∈ H. A partial skill history in period t is denoted by ht = (h1, h2, . . . , ht) ∈ H t, where H t
denotes the set of all period t histories. Let πt(ht) be the unconditional probability of ht.
Wages. An agent of type h in period t receives a wage wh,t, defined in equation (1),
independent of his skill history before period t. For expositional convenience, in this section,
we use wt(ht) instead of wh,t to denote wages.
Preferences. Preferences are now defined over stochastic processes of consumption and
labor, (ct, lt)∞t=0, where ct, lt : H t → R+, using an ex ante expected discounted utility function,
∞∑t=1
∑ht∈Ht
πt(ht)βt−1
[u(ct(h
t))− v(lt(ht))]. (5)
Allocation. An allocation is x = (ct, lt, Ks,t, Ke,t, Ls,t, Lu,t)∞t=1 .
Feasibility. An allocation is feasible if in any period t ≥ 1,
∑ht∈Ht
πt(ht)ct(h
t) +Ks,t+1 +Ke,t+1 +Gt ≤ F (Ks,t, Ke,t, Ls,t, Lu,t), (6)
Lh,t =∑
{ht∈Ht|ht=h}
πt(ht)lt(h
t)zh for h ∈ H, and Ks,1 ≤ K∗s,1, Ke,1 ≤ K∗s,1. (7)
Incentive Compatibility. Define σt : H t → H. A reporting strategy is σ = (σt)∞t=1.
Let Σ denote the set of all reporting strategies. The truth-telling strategy, which we denote
by σ∗, prescribes reporting the true type at each and every node: for all ht, σ∗t (ht) = ht. Let
σt(ht) = (σ1(h1), ..., σt(ht)) denote the history of reports along history ht. Define
W (σ|x) =∞∑t=1
∑ht∈Ht
πt(ht)βt−1
[u(ct(σ
t(ht)))− v
(lt(σ
t (ht))wt(σt (ht))
wt(ht)
)],
as the expected discounted value of using reporting strategy σ given an allocation x. We say
that an allocation x is incentive compatible if and only if for all σ ∈ Σ, W (σ∗|x) ≥ W (σ|x).
Following Fernandes and Phelan (2000), without loss of generality, we restrict attention
to the set of reporting strategies that has lying only at a single node. This allows us to
replace the incentive compatibility constraints defined above with a sequence of temporary
14
incentive constraints, one for each node. We say that an allocation x is temporary incentive
compatible if and only if, in any period t and at any node ht−1 and for all ht ∈ H,
u(ct(ht−1, ht))− v(lt(h
t−1, ht)) +∞∑
m=t+1
∑hm∈Hm
πm(hm)βm−t [u(cm(hm))− v(lm(hm))] (8)
≥ u(ct(ht−1, hot ))− v
(lt(h
t−1, hot )wt(hot )
wt(ht)
)+
∞∑m=t+1
∑hm∈Hm
πm(hm)βm−t[u(cm(hm))− v(lm(hm))
],
where hot is the complement of ht in the set H, Hm denotes the set of period m histories that
follow from ht, i.e., Hm ≡ {hm ∈ Hm : hm � ht}, and hm = (ht−1, hot , ht+1, ..., hm) is identical
to hm except in period t. From now on, we use (8) to represent incentive compatibility.8
Social Planning Problem. The social planning problem that defines the constrained
efficient allocation is: maxx (5) s.t. (1), (6), (7), and (8).
Optimality of Differential Capital Taxation. Now we prove the optimality of dif-
ferential taxation of capital for the general environment with skill shocks. First, we define
the intertemporal wedge for an agent with skill history ht and for capital of type i ∈ {s, e}
from period t to period t+ 1, as
τ i,t+1(ht) = 1− u′(ct(ht))
βFi,t+1Et {u′(ct+1(ht+1))|ht}. (9)
The first part of Proposition 4 generalizes Proposition 1 by showing that it is optimal to
create a wedge between the marginal returns to structure and equipment capital when skills
evolve stochastically over time. The second part of Proposition 4 shows that the optimal
intertemporal wedges for structure and equipment capital are different. Thus, optimality
of differential taxation of capital does not depend on the permanent skill types assumption
maintained in the main model.
8Temporary incentive constraints were first shown to be necessary and sufficient for incentive compatibilityby Green (1987) for an environment with i.i.d. shocks. Fernandes and Phelan (2000) generalized this resultto environments with persistent shocks. To be precise, we need to use two more assumptions to be able touse temporary incentive constraints. First, we need each skill history to be reached with strictly positiveprobability. Second, we need a transversality condition that is automatically satisfied if we assume thatinstantaneous utility is bounded.
15
Proposition 4. 1. At the constrained efficient allocation, in any period t ≥ 2,
F1(K∗s,t, K∗e,t, L
∗s,t, L
∗u,t) = F2(K∗s,t, K
∗e,t, L
∗s,t, L
∗u,t) +X∗t /λ
∗t , where
X∗t =∑{ht∈Ht}
µ∗t (ht)v′
(l∗t (h
t−1, hot )w∗t (h
ot )
w∗t (ht)
)l∗t (h
t−1, hot )∂w∗t (hot )
w∗t (ht)
∂K∗e,t
and λtβt−1 and µt(h
t) are Lagrange multipliers on period t feasibility constraint and
the incentive constraint at history ht.
2. (a) The optimal wedge on structure capital in any period t ≥ 2 and history ht−1 satisfies
τ ∗s,t(ht−1) ≥ 0. The inequality is strict if and only if there is no h ∈ H such that
πt(ht−1, h|ht−1) = 1.
(b) The optimal wedge on equipment capital in any period t ≥ 2 and history ht−1 is
[1− τ ∗e,t(ht−1)
]=[1− τ ∗s,t(ht−1)
]·[1 +X∗t /
(λ∗t F
∗2,t
)]. (10)
Proof. Relegated to Appendix B. �
The idea behind the first part of Proposition 4 is very similar to the one for Proposition
1: under equipment-skill complementarity, increasing the amount of equipment capital has
an effect on incentives, summarized by the term X∗t . In contrast, changing the amount of
structure capital does not affect incentives. As a result, it is optimal to create a wedge
between the physical returns to the two types of capital. The main distinction from the per-
manent type model is that, in the case with stochastic skills, a change in period t equipment
capital affects all the binding incentive constraints in that period. Thus, X∗t measures the
cumulative effect of a change in equipment capital on all the binding incentive constraints.
Since at this level of generality it is not possible to determine the pattern of binding incentive
constraints, the sign of X∗t is ambiguous.
Part 2(a) of Proposition 4 states that the intertemporal wedge on structure capital is
positive if there is skill risk. Intuitively, if we allow an agent to save at the marginal rate of
16
return to structure capital, he will save more than the efficient level. In the next period, he
will work less than socially optimal if he turns out to be of the skilled type. To prevent this
double deviation, it is optimal to discourage savings. The government achieves that with a
positive wedge on structure capital.9 Naturally, with permanent types there is no skill risk
and, hence, no reason to tax structure capital, as we have already shown in Proposition 2.
Equation (10) in part 2(b) of the proposition is a version of the no-arbitrage condition for
this economy. The equation shows that the intertemporal wedge on equipment capital can be
decomposed into two parts. First, the government wants to discourage savings in equipment
capital for the same reason that it wants to discourage savings in structure capital, which is
captured by the first term on the right-hand side of equation (10). The second term on the
right-hand side of equation (10) is present in order to create the optimal wedge between the
returns to the two types of capital. The presence of this term implies that generically the
optimal wedges on the two types of capital are different in any period and history, which
establishes the optimality of differential taxation of capital.
A Special Case. In Assumption 5 below, we assume that the only incentive constraints
that bind are those that prevent the skilled from pretending to be unskilled. We call these
incentive constraints downward incentive constraints. There is no theoretical result that
establishes the pattern of binding incentive constraints for general skill processes in dynamic
Mirrleesian environments, even when wages are exogeneous.10 Indeed, there are examples
in which some upward incentive constraints bind. In this regard, Assumption 5 is stronger
than Assumption 4, which we use in Section 3.
Assumption 5. In any period t ≥ 1, history ht, only downward incentive constraints bind.
Assumption 5 allows us to show that X∗t > 0 in all periods. It is then possible to sign the
9The positive wedge on structure capital follows from the inverse Euler equation, see equation (16)in Appendix B. This condition was first derived by Rogerson (1985) and then generalized by Golosov,Kocherlakota, and Tsyvinski (2003). The inverse Euler equation does not hold for equipment capital becauseof the effect that equipment capital has on incentives. We derive a modified version of the inverse Eulerequation for equipment capital in Appendix B, see equation (17).
10Downward incentive constraints are the only binding incentive constraints when skills are i.i.d. andwages are exogeneous.
17
capital return wedge, and show that the optimal equipment capital wedge is higher than the
optimal structure capital wedge. These results are summarized by the following proposition.
Proposition 5. Suppose Assumption 5 holds. Then, in any period t ≥ 2 and history ht−1,
F1(K∗s,t, K∗e,t, L
∗s,t, L
∗u,t) < F2(K∗s,t, K
∗e,t, L
∗s,t, L
∗u,t) and τ ∗e,t(h
t−1) > τ ∗s,t(ht−1).
Proof. Relegated to Appendix B. �
Intratemporal Wedges. Under Assumption 5, Proposition 6 generalizes the optimality
of subsidizing skilled labor supply, shown for the permanent type case in Section 3.3, for
the environment in which skills evolve stochastically over time. First, define the optimal
intratemporal wedge at history ht as τ ∗y,t(ht) = 1− v′(l∗t (ht))/(w∗t (ht)u′(c∗t (ht))).
Proposition 6. Suppose Assumption 5 holds. In any period t ≥ 1 and history ht−1,
τ ∗y,t(ht−1, s) < 0.
Proof. Relegated to Appendix B. �
Implementation. In Appendix C, we provide an implementation of the constrained
efficient allocation through a tax system in a competitive market environment in which
agents trade a risk free bond and capital. The implementation result holds for any stochastic
process, including the permanent type model. An interesting feature of this tax system is
that the optimal tax differentials across equipment and structure capital can be implemented
at the firm level, as is the case in the current U.S. tax system. This is possible because,
as the second term on the right-hand side of equation (10) shows, the differential between
optimal intertemporal wedges of structure and equipment capital is history independent in
any period. Another notable feature of our implementation is that, the optimal tax system
mimics the actual U.S. tax code in the sense that capital tax differentials are created through
depreciation allowances that differ from actual economic depreciation. Therefore, creating
the optimal capital tax differentials would not require complicating the U.S. tax code further.
18
5 Quantitative Analysis
The main goal of this section is to analyze the quantitative importance of differential taxation
of capital in a calibrated version of our model. As in the main part of the paper, we
assume that agents’ skill types are permanent. Since there is no labor income risk in this
environment, the only role of taxation is redistribution (along with financing government
consumption). Permanent skills is a natural assumption given that in the data we associate
skill levels with educational attainment. In addition, there is empirical evidence that initial
conditions account for a large part of the cross-sectional variation in lifetime earnings.11
We first calibrate the model parameters to the U.S. economy using a competitive equi-
librium framework with the actual U.S. tax code and government consumption level. We
then solve a social planning problem with endogeneous factor prices in which the planner
“inherits” the initial capital stocks from the steady state of the competitive equilibrium.12
We solve for the whole time series of the constrained efficient allocation, thus taking into
account the transition to a new steady state. Then, we recover the optimal wedges (taxes)
from the constrained efficient allocation. In line with Proposition 2, we find that the optimal
taxes on equipment capital are higher than those on structure capital. Specifically, in our
benchmark calibration, the optimal tax differential increases from 27.6% in the first period
to 39.5% in the steady state. We also assess the welfare consequences of differential capital
taxation. We find that the welfare gains of optimal differential capital taxation can be as
high as 0.4% in terms of lifetime consumption.
11Keane and Wolpin (1997) estimate that initial conditions account for 90% of the cross-sectional variationin life-time earnings. Huggett, Ventura, and Yaron (2011) estimate this number to be over 60%, andStoresletten, Telmer, and Yaron (2004) estimate it to be almost 50%.
12We would not be able to assess the role of differential capital taxation in a partial equilibrium environ-ment, because the skill premium would not be affected by changes in the level of equipment capital. Tothe contrary, most quantitative papers in the NDPF literature consider partial equilibrium environments.As Farhi and Werning (2012) show, considering general equilibrium effects might be important even with astandard production function without complementarities.
19
5.1 Calibration
To calibrate the parameters in the social planning problem, we assume that the steady state
of the competitive equilibrium defined in Appendix C represents the current U.S. economy.
We first fix a number of parameters to values from the data or from the literature. We then
calibrate the remaining parameters so that the SCE matches the U.S. data along selected
dimensions.
One period in our model corresponds to one year. We assume that the period utility
function takes the form u(c)− v(l) = c1−σ/(1− σ)− φl1+γ/(1 + γ). In the benchmark case,
we use σ = 2 and γ = 1. These are within the range of values that have been considered in
the literature. We further assume that the production function takes the same form as in
Krusell, Ohanian, Rıos-Rull, and Violante (2000):
Y = F (Ks, Ke, Ls, Lu) = Kαs
(ν [ωKρ
e + (1− ω)Lρs]ηρ + (1− ν)Lηu
) 1−αη.
Using U.S. data Krusell, Ohanian, Rıos-Rull, and Violante (2000) estimate α, ρ, η, and we
use their estimates, but they do not estimate ω and ρ. We calibrate these parameters to
U.S. data, as we explain in detail below. This production function satisfies Assumptions 1 –
3 if η > ρ, which is what Krusell, Ohanian, Rıos-Rull, and Violante (2000) find.
We assume that the government consumption-to-output ratio equals 16%, which is close
to the average ratio in the United States during the period 1980 – 2012, as reported in the
National Income and Product Accounts (NIPA) data. We follow Heathcote, Storesletten, and
Violante (2010) and assume a flat labor income tax rate of τ y = 27% (for a discussion of the
construction of this number, see Domeij and Heathcote (2004)). Gravelle (2011) documents
that because of differences in tax depreciation rates, the effective tax rates on structure
capital and equipment capital differ at the firm level. Specifically, she estimates the effective
corporate tax rate on structure capital to be 32%, and that on equipment capital to be 26%.
The capital income tax rate at the consumer level is 15% in the U.S. tax code. This implies
20
Table 1: Benchmark Parameters
Parameter Symbol Value Source
Relative risk aversion σ 2Inverse Frisch elasticity γ 1Structure capital depreciation rate δs 0.056 GHKEquipment capital depreciation rate δe 0.124 GHKShare of structure capital in output α 0.117 KORVMeasure of elasticity of substitution betweenequipment capital Ke and unskilled labor Lu η 0.401 KORVMeasure of elasticity of substitution betweenequipment capital Ke and skilled labor Ls ρ -0.495 KORVTax on labor income τy 0.27 HSVOverall tax on structure capital income τ s 0.422 Gravelle (2011)Overall tax on equipment capital income τ e 0.371 Gravelle (2011)Government consumption G/Y 0.16 NIPARelative supply of skilled workers ps/pu 0.778 U.S. CensusShare of skilled workers’ wealth ζ 0.686 U.S. Census
This table reports the benchmark parameters that we take directly from the data or the literature. The acronyms GHK,
KORV, and HSV stand for Greenwood, Hercowitz, and Krusell (1997), Krusell, Ohanian, Rıos-Rull, and Violante (2000), and
Heathcote, Storesletten, and Violante (2010), respectively. NIPA stands for the National Income and Product Accounts.
an overall tax on structure capital τ s = 1− 0.85 · (1− 0.32) = 42.2% and an overall tax on
equipment capital τ e = 1− 0.85 · (1− 0.26) = 37.1%. These numbers are in line with a 40%
tax on aggregate capital that is reported by Domeij and Heathcote (2004). We assume that
unspent government tax revenue is distributed back to the agents in a lump-sum manner,
which implies that in the SCE average taxes are in general not equal to marginal taxes. We
set the ratio of skilled to unskilled agents to be consistent with the 2011 US Census data.
The stationary competitive equilibrium is not unique in our environment with permanent
types. Let us denote the steady-state asset holdings of a skilled agent by as and of an
unskilled agent by au. Given aggregate capital levels Ks, Ke consistent with the SCE, any
asset distribution of the form psas = ζ(Ks+Ke) and puau = (1− ζ)(Ks+Ke) with ζ ∈ (0, 1)
can arise in the SCE. We pick a ζ so that the SCE asset distribution matches the observed
asset distribution between skilled and unskilled agents in the 2010 U.S. Census data. Table 1
summarizes the benchmark parameters that we take directly from the data or the literature.
This leaves us with several parameters to be determined. We cannot identify zu and
21
Table 2: Benchmark Calibration Procedure
Parameter Symbol Value Target Data and SCE Source
Discount factor β 0.985 Capital-to-output ratio 2.9 NIPA and FATDisutility of labor φ 67.8 Labor supply 1/3Production functionparameter ω 0.477 Labor share 2/3 NIPAProduction functionparameter ν 0.657 Skill premium ws
wu1.8 HPV
This table reports our benchmark calibration procedure. The production function parameters ν and ω control the income share
of equipment capital, skilled and unskilled labor in output. The acronym HPV stands for Heathcote, Perri, and Violante (2010).
NIPA stands for the National Income and Product Accounts, and FAT stands for the Fixed Asset Tables.
zs separately from the remaining parameters of the production function, and therefore, set
them to zu = zs = 1. We then calibrate the parameter that controls the income share of
equipment capital ω, the parameter that controls the income share of unskilled labor ν, the
labor disutility parameter φ, and the discount factor β. We calibrate these parameters so
that (i) the labor share equals 2/3 (approximately the average labor share in 1980 – 2010
as reported in the NIPA data), (ii) the capital-to-output ratio equals 2.9 (approximately
the average of 1980 – 2011 as reported in the NIPA and Fixed Asset Tables), (iii) the skill
premium equals 1.8 (as reported by Heathcote, Perri, and Violante (2010) for the 2000s),
and (iv) the aggregate labor supply in steady state equals 1/3 (as is commonly used in the
macro literature). Table 2 summarizes our calibration procedure.
5.2 Quantitative Results
In this section, we analyze the quantitative properties of the optimal tax system. We do
so by solving the social planning problem (SPP) defined in Section 3.2 with parameters
calibrated in Section 5.1 to the U.S. economy.13 We assume that the planner inherits initial
13We solve the SPP assuming that the economy converges to a steady state in 200 periods. We verify thatchanging the number of periods does not affect the results. In other words, the economy gets very close tosteady state long before period 200.
22
Table 3: Steady-State Comparison of Wedges
SCE SPP
Tax (wedge) on equipment capital τ e 37.10% 39.54%Tax (wedge) on structure capital τ s 42.20% 0.00%Tax (wedge) on unskilled labor τ y(u) 27.00% 26.60%Tax (wedge) on skilled labor τ y(s) 27.00% -11.14%
This table compares the tax rates in the stationary competitive equilibrium (column SCE) and wedges at the steady state of
the solution to the social planning problem (column SPP).
capital stocks from the SCE. Following Conesa, Kitao, and Krueger (2009), we assume that
the planner needs to finance the same level of government consumption as in the SCE.
Steady-State Comparison. We first discuss the properties of the optimal tax system
in steady state and compare it to the current U.S. tax system. The first column of Table 3
summarizes the current U.S. tax system. The second column reports its counterpart in the
optimal tax system at the steady state. The first two rows of Table 3 report capital income
taxes net of depreciation.14 We find that the equipment capital tax τ e is substantial at the
steady state of the solution to the SPP. It is 39.54% – that is, 39.54 percentage points higher
than the tax on structure capital τ s, which is zero. This is in contrast with the current
effective tax rates in the United States where structure capital is taxed by 5.1 percentage
points more than equipment capital overall. As for the labor wedges, they are 27% for both
types of labor in the SCE because we approximate the U.S. labor income tax code by a 27%
linear tax. At the steady state of the solution to the SPP, the labor wedge for unskilled labor
τ y(u) is 26.6%, which is almost the same as in the SCE. The skilled labor wedge τ y(s), on
the other hand, is -11.14%. Both higher taxes on equipment capital and marginal subsidies
on skilled labor are in line with our theoretical results from Section 3.
14We report capital income taxes net of depreciation rather than the capital wedges defined in Section3.2 because the former correspond to the taxes used in the U.S. tax code. With a slight abuse of notation,we denote these capital income taxes as we denoted the capital wedges. In the column denoted “SPP,” thecapital income taxes are recovered from the constrained efficient allocation by using the following definition
for each type h ∈ H, capital type i, and period t: τ i,t+1(h) ≡ 1−(
u′(ch,t)βu′(ch,t+1)
− 1)/ (Fi,t+1 − δi). Part 1 of
Proposition 2 implies that these taxes only depend on time and not on agent type; therefore, we only reportone number (time series).
23
Table 4: Steady-State Comparison of Allocations
SCE SPP
Ke/Ks 1.02 0.93Ls/Lu 0.82 1.11ws/wu 1.80 1.47
This table compares allocations in the stationary competitive equilibrium (column SCE) and at the steady state of the solution
to the social planning problem (column SPP). Ke/Ks denotes the equipment-to-structure capital ratio, Ls/Lu denotes the
skilled-to-unskilled labor ratio and ws/wu denotes the skill premium.
The higher taxes on equipment capital relative to structure capital, together with marginal
subsidies on skilled labor, are used to indirectly redistribute from the skilled to the unskilled.
Table 4 shows how the optimal tax system achieves indirect redistribution by comparing the
allocations at the SCE and the SPP. The higher tax on equipment capital discourages the
accumulation of equipment capital relative to structure capital at the SPP in comparison to
the SCE. At the same time, the marginal subsidy on skilled labor income increases the ratio
of skilled to unskilled labor. Both capital and labor taxes decrease the skill premium at the
SPP. This way, the planner provides indirect redistribution from the skilled to the unskilled.
The marginal subsidy on skilled labor income seems to imply that there is direct redis-
tribution from the unskilled to the skilled at the SPP. However, recall that optimal taxes
are nonlinear in labor income. In this case, at a given income level, the average income tax
can be quite different from the marginal income tax.15 As a consequence, a tax system can
be progressive overall even though the marginal taxes are regressive. This is precisely what
happens at the optimal tax system. To assess the overall progressivity of the optimal tax
system, we compute a measure of average labor taxes that an agent has to pay at the steady
state of the SPP. This measure is defined as 1 − ch/(whlh) for agents of type h, following
Farhi and Werning (2013). We find that the optimal average labor taxes are progressive:
6% for the unskilled and 18% for the skilled. Therefore, the optimal labor taxes do provide
15Suppose, e.g., that the tax formula for an agent with income $200,000 is T (y) = $100, 000− 0.1 · y. Thisagent pays $80,000 in taxes, implying an average tax of 40%, even though he gets a marginal subsidy of 10%.
24
direct redistribution from the skilled to the unskilled.16
Transition. This section summarizes the evolution of the optimal taxes (wedges) along
the transition to the new steady state. The left panel of Figure 1 shows that the optimal
structure capital income tax (net of depreciation) is 0 and the equipment capital tax is
positive in all periods. These properties are in line with Proposition 2. The equipment
capital tax is growing over time. To understand this finding, we look at the evolution of the
stocks of the two types of capital, shown in the left panel of Figure 2. It shows that both
capital stocks are growing along the transition path. The overall capital stock is growing in
the constrained efficient allocation because the planner inherits an inefficiently low level of
capital from the SCE, which is due to the inefficiently high overall level of capital taxes at
the SCE. As the quantity of equipment capital grows, so does the skill premium (see Figure
3). The government wants to prevent an unfettered growth of the skill premium because of
its adverse redistributive effects. To keep the growth of the skill premium under control, the
planner finds it optimal to increase the tax on equipment capital.17
Optimal labor wedges are almost constant along the transition, as shown in the right
panel of Figure 1. In fact, Werning (2007) shows that with our utility function labor wedges
are exactly constant over time in a permanent-type model without equipment-skill com-
plementarity. Figure 1 suggests that the extra distortions in labor wedges arising from
equipment-skill complementarity are also approximately constant over time. Since skilled
labor is subsidized, skilled agents work more than unskilled agents in each period, as shown
in the right panel of Figure 2. As the economy grows, both types of agents become richer,
and because of the income effect, they decrease their labor supply even though labor wedges
16The non-linear nature of the optimal labor income tax code also explains how government budget isbalanced under the optimal tax system. Table 3 seems to suggest that - except for a small increase inequipment capital taxes - government revenue from all other sources declines significantly when the economymoves from the current system to the optimal one. However, with a non-linear tax system the total amountof labor income taxes collected can increase even if the marginal taxes decline.
17We check the validity of this intuition by conducting exercises, in which the planner inherits inefficientlyhigh amounts of capital from the SCE. In those cases, as our intuition suggests, the planner decreases bothtypes of capital over the transition to the new steady state, and optimal equipment taxes decline over thetransition period.
25
Figure 1: Optimal Taxes/Wedges at the Constrained Efficient Allocation
0
0.2
0.4
-0.2
0
0 10 20 30 40 50
Structure capital tax
Equipment capital tax
years
0
0.2
0.4
Unskilled labor wedge
Skilled labor wedge
-0.2
0
0 10 20 30 40 50years
This figure shows the paths of optimal taxes (wedges) along the transition to the new steady state at the solution to the social
planning problem.
do not change much.
Figure 3 depicts the evolution of the optimal skill premium over time. First, the optimal
skill premium is much lower in each period than it is in the U.S. data. This result suggests
that the current U.S. tax system does not generate enough indirect redistribution. Second,
the skill premium is increasing over time because the equipment capital level increases. This
result implies that an increasing skill premium is optimal in a growing economy, even if the
government cares about equality.
Welfare Gains of Optimal Differential Taxation of Capital. We assess the im-
portance of optimal differential taxation of capital by answering the following question: how
much of the welfare gains of the full reform (which we call optimal DTC in this section)
do we give up if we restrict the government to use the current capital taxes and allow it
only to choose the labor income tax code optimally? To answer this question, we solve an
additional version of the planning problem. In this problem, the planner is unrestricted in
his choice of labor taxes, but he must use the capital income taxes as in the U.S. tax code.
We call this tax system current differential taxation of capital (current DTC). We state the
planning problem that gives rise to the current DTC in Appendix D. We find that, for the
benchmark parameters, reforming the current tax system to the optimal DTC implies 0.19%
26
Figure 2: Factors of Production at the Constrained Efficient Allocation
0.3
0.35
0.4
0.25
0.3
0 10 20 30 40 50
Structure capital
Equipment capital
years
0.3
0.35
0.4
0.25
0.3
0 10 20 30 40 50
Unskilled labor
Skilled labor
years
This figure shows the paths of factors of production along the transition to the new steady state at the solution to the social
planning problem.
Figure 3: Skill Premium at the Constrained Efficient Allocation
1.45
1.475
1.5
1.4
1.425
0 10 20 30 40 50
Skill premium
years
This figure shows the path of the skill premium ws/wu along the transition to the new steady state at the solution to the social
planning problem.
27
more welfare gains than reforming labor taxes alone (i.e., moving to the current DTC).18 The
additional gains of optimal DTC can be as high as 0.40% for reasonable parameter values,
as we discuss in more detail in the sensitivity analysis below.
In addition, we solve a version of the social planning problem, in which the planner is
unrestricted in his choice of labor taxes, but is not allowed to tax the two types of capital
differentially. We call this tax system the optimal nondifferential taxation of capital (optimal
NDTC). We state the planning problem that gives rise to the optimal NDTC in Appendix
D. We find that the welfare gains of the current DTC falls 0.14% short of the welfare gains
of the optimal NDTC for the benchmark parameters. This difference in welfare gains can be
as high as 0.27% for reasonable parameter values.19
We also assess how people rank the different capital tax reforms. We find that relative
to the current DTC the optimal DTC helps both types. The reason is that the overall level
of capital taxes at the current DTC is inefficiently high. Under the optimal DTC, structure
capital taxes are zero while equipment capital taxes are virtually unchanged. As a result,
there is more capital of both types at the optimal DTC. This increases the productivity of
both types of agents, and they both benefit from this reform. We also find that relative to
the current DTC, the optimal NDTC helps the skilled and hurts the unskilled.
Sensitivity Analysis. In each sensitivity exercise, we change the parameter of interest
and redo the calibration procedure. Table 5 summarizes the sensitivity results. We report
optimal taxes only for the optimal DTC reform. With a higher σ, the curvature of utility
from consumption, the planner wants to provide more redistribution. Therefore, the indirect
redistribution channel becomes more important. Hence, as σ increases, the tax on equipment
capital as well as the marginal subsidy to skilled labor increase. Table 5 also reports the
18We measure the welfare gains of allocation x relative to allocation y as a fraction by which consumptionin allocation y has to be increased in each date and state to make its welfare equal to allocation x welfare.
19These results suggest that setting capital tax rates to a uniform rate, as proposed recently by PresidentObama’s administration, might imply substantial welfare gains. However, our results here are only suggestive,since that proposal only involves reforming capital taxes, but would leave labor taxes intact. We evaluatethe consequences of such a proposal in a world with multiple layers of heterogeneity across agents in Slavikand Yazici (2014).
28
Table 5: Sensitivity Analysis
Benchmark Benchmarkσ 1 2 4 2 2 2γ 1 1 1 0.5 1 2
Optimal taxesτ e 24.39% 39.54% 54.84% 49.23% 39.54% 22.88%τ s 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
τy(u) 18.77% 26.60% 33.75% 26.43% 26.60% 24.90%τy(s) -5.29% -11.14% -21.23% -16.46% -11.14% -5.08%
Difference in welfare gainsOpt. DTC vs. current DTC 0.24% 0.19% 0.21% 0.20% 0.19% 0.22%
Opt. NDTC vs. current DTC 0.23% 0.14% 0.07% 0.10% 0.14% 0.21%
This table reports the results of the sensitivity analysis. In the first column, σ is intertemporal elasticity of substitution, γ is
the inverse Frisch elasticity of labor supply, τe is the equipment capital tax (wedge), τs is the structure capital tax (wedge),
τy(u) is the labor wedge of the unskilled agents, τy(s) is the labor wedge of the skilled agents. For all taxes (wedges) we report
their steady state values. Opt. DTC refers to optimal differential taxation of capital, i.e., a reform that reforms both labor and
capital taxes. Current DTC refers to current differential taxation of capital, i.e., a tax reform that reforms labor taxes, but
leaves capital taxes at their current values. Opt. NDTC refers to optimal nondifferential taxation of capital, i.e., a reform in
which the planner is free to adjust labor taxes, but must set the tax rate on equipment capital equal to the tax rate on structure
capital.
sensitivity of our results to changes in γ, the curvature of disutility from labor. We find that,
as γ decreases, the tax on equipment capital and the skilled labor subsidy increase.
As we report in the penultimate row of Table 5, the welfare gains of the optimal DTC
reform are around 0.20% higher than the gains of the current DTC reform for all values of
σ and γ we consider.20 The welfare gains of optimal NDTC relative to current DTC are
decreasing in σ and increasing in γ, as shown in the last row of Table 5. The reason is that
with a larger σ or lower γ, the optimal capital tax differential is larger, as one can see in the
rows denoted by τ e and τ s in Table 5. Therefore, optimal NDTC, which forces capital taxes
to be uniform, is more restrictive and implies smaller welfare gains for higher σ or lower γ.
We have also solved the model for the other values of (σ, γ) ∈ {1, 2, 4} × {0.5, 1, 2}. We
find that the welfare gains of optimal DTC relative to current DTC are as high as 0.28% for
20We also compute the welfare gains of optimal DTC under alternative social welfare weights. We findthat if the planner cares more about the unskilled, the welfare gains of optimal DTC are larger. This isintuitive: not being able to use one of the channels of indirect redistribution optimally has more severewelfare consequences when society care more about redistribution.
29
σ = 4 and γ = 0.5. We also consider a higher elasticity of substitution between equipment
capital and unskilled labor, namely, η = 0.79. This value has been used in He and Liu
(2008), who use the same production function, and is based on an empirical estimate by
Duffy, Papageorgiou, and Perez-Sebastian (2004). For this value of η and σ = 4 and γ = 0.5,
the welfare gains of optimal DTC relative to current DTC are 0.40%.
6 Conclusion
The effective marginal tax rates on returns to capital assets differ substantially depending
on the capital asset type in the U.S. tax code. In particular, the marginal tax rate on capital
structures is about 5% higher than the marginal tax rate on capital equipments. In this
paper, we assess the optimality of differential capital asset taxation both theoretically and
quantitatively from the perspective of a government whose aim is to provide redistribution
and insurance. Contrary to the actual practice in the U.S. tax code, we show that, under a
plausible assumption, it is optimal to tax equipment capital at a higher rate than structure
capital. Intuitively, in an environment with equipment-skill complementarity, taxing equip-
ment capital and hence depressing its accumulation decreases the skill premium, providing
indirect redistribution from the skilled to the unskilled agents. In a quantitative version of
our model, we find that the tax rate on equipment capital should be at least 27 percentage
points higher than the tax rate on structure capital during transition and at the steady state.
Furthermore, we find that the welfare gains of optimal differential capital taxation can be
as high as 0.4% of lifetime consumption.
30
7 Acknowledgements
We thank the associate editor, Christopher Sleet, an anonymous referee, as well as LaurenceAles, Alex Bick, Felix Bierbrauer, V. V. Chari, Nicola Fuchs-Schundeln, Kenichi Fukushima,Mike Golosov, Christian Hellwig, Larry Jones, Juan Pablo Nicolini, Fabrizio Perri, DominikSachs, Florian Scheuer, Yuichiro Waki, and participants at various conferences, workshopsand seminars for their comments. Yazici greatly appreciates IIES (Stockholm University)for their hospitality during his visit.
References
Albanesi, S., and C. Sleet (2006): “Dynamic Optimal Taxation with Private Informa-tion,” Review of Economic Studies, 73(1), 1–30.
Ales, L., M. Kurnaz, and C. Sleet (2014): “Tasks, Talents, and Taxes,” Workingpaper.
Atkinson, A. B., and J. E. Stiglitz (1976): “The Design of Tax Structure: Directversus Indirect Taxation,” Journal of Public Economics, 6(1–2), 55–75.
Auerbach, A. J. (1979): “The Optimal Taxation of Heterogeneous Capital,” QuarterlyJournal of Economics, 93(4), 589–612.
Auerbach, A. J. (1983): “Corporate Taxation in the United States,” Brookings Papers onEconomic Activity, 14(2), 451–514.
Auerbach, A. J., K. A. Hassett, and S. D. Oliner (1994): “Reassessing the SocialReturns to Equipment Investment,” Quarterly Journal of Economics, 109(3), 789–802.
Chamley, C. (1986): “Optimal Taxation of Capital Income in General Equilibrium withInfinite Lives,” Econometrica, 54(3), 607–622.
Conesa, J. C., S. Kitao, and D. Krueger (2009): “Taxing Capital? Not a Bad Ideaafter All!,” American Economic Review, 99(1), 25–48.
Diamond, P. A., and J. A. Mirrlees (1971): “Optimal Taxation and Public ProductionI: Production Efficiency,” American Economic Review, 61(1), 8–27.
Domeij, D., and J. Heathcote (2004): “On the Distributional Effects of Reducing Cap-ital Taxes,” International Economic Review, 45(2), 523–554.
Duffy, J., C. Papageorgiou, and F. Perez-Sebastian (2004): “Capital-Skill Com-plementarity? Evidence from a Panel of Countries,” Review of Economics and Statistics,86(1), 327–344.
Farhi, E., and I. Werning (2012): “Capital Taxation: Quantitative Explorations of theInverse Euler Equation,” Journal of Political Economy, 120(3), 398–445.
31
Farhi, E., and I. Werning (2013): “Insurance and Taxation over the Life Cycle,” Reviewof Economic Studies, 80(2), 596–635.
Feldstein, M. (1990): “The Second Best Theory of Differential Capital Taxation,” OxfordEconomic Papers, 42(1), 256–267.
Fernandes, A., and C. Phelan (2000): “A Recursive Formulation for Repeated Agencywith History Dependence,” Journal of Economic Theory, 91(2), 223–247.
Flug, K., and Z. Hercowitz (2000): “Equipment Investment and the Relative Demandfor Skilled Labor: International Evidence,” Review of Economic Dynamics, 3, 461–485.
Fukushima, K. (2010): “Quantifying the Welfare Gains From Flexible Dynamic IncomeTax Systems,” Working paper.
Golosov, M., N. Kocherlakota, and A. Tsyvinski (2003): “Optimal Indirect andCapital Taxation,” Review of Economic Studies, 70(3), 569–587.
Golosov, M., M. Troshkin, and A. Tsyvinski (2013): “Redistribution and SocialInsurance,” Working Paper 17642.
Gravelle, J. G. (1994): The Economic Effects of Taxing Capital Income. Cambridge MA:MIT Press.
(2003): “Capital Income Tax Revisions and Effective Tax Rates,” CRS Report forCongress 17642, Congressional Budget Office.
(2011): “Reducing Depreciation Allowances to Finance a Lower Corporate TaxRate,” National Tax Journal, 64(4), 1039–1054.
Green, E. (1987): “Lending and the Smoothing of Uninsurable Income,” in ContractualArrangements for Intertemporal Trade, ed. by E. C. Prescott, and N. Wallace. Minneapolis:University of Minnesota Press.
Greenwood, J., Z. Hercowitz, and P. Krusell (1997): “Long-Run Implications ofInvestment-Specific Technological Change,” American Economic Review, 87(3), 342–362.
He, H., and Z. Liu (2008): “Investment-Specific Technological Change, Skill Accumulation,and Wage Inequality,” Review of Economic Dynamics, 11(2), 314–334.
Heathcote, J., F. Perri, and G. L. Violante (2010): “Unequal We Stand: An Empiri-cal Analysis of Economic Inequality in the United States: 1967–2006,” Review of EconomicDynamics, 13(1), 15–51.
Heathcote, J., K. Storesletten, and G. L. Violante (2010): “The MacroeconomicImplications of Rising Wage Inequality in the United States,” Journal of Political Econ-omy, 118(4), 681–722.
Huggett, M., and J. C. Parra (2010): “How Well Does the U.S. Social Insurance SystemProvide Social Insurance?,” Journal of Political Economy, 118(1), 76–112.
32
Huggett, M., G. Ventura, and A. Yaron (2011): “Sources of Lifetime Inequality,”American Economic Review, 101(7), 2923–2954.
Judd, K. L. (1985): “Redistributive Taxation in a Simple Perfect Foresight Model,” Journalof Public Economics, 28(1), 59–83.
Keane, M. P., and K. I. Wolpin (1997): “The Career Decisions of Young Men,” Journalof Political Economy, 105(3), 473–522.
Kocherlakota, N. R. (2005): “Zero Expected Wealth Taxes: A Mirrlees Approach toDynamic Optimal Taxation,” Econometrica, 73(5), 1587–1621.
(2010): The New Dynamic Public Finance. Princeton University Press.
Krusell, P., L. E. Ohanian, J.-V. Rıos-Rull, and G. L. Violante (2000): “Capital-Skill Complementarity and Inequality: A Macroeconomic Analysis,” Econometrica, 68(5),1029–1054.
Mirrlees, J. A. (1971): “An Exploration in the Theory of Optimum Income Taxation,”Review of Economic Studies, 38(114), 175–208.
Naito, H. (1999): “Re-examination of Uniform Commodity Taxes under a Non-linear In-come Tax System and its Implication for Production Efficiency,” Journal of Public Eco-nomics, 71(2), 165–188.
Rogerson, W. P. (1985): “Repeated Moral Hazard,” Econometrica, 53(1), 69–76.
Sadka, E. (1976): “On Income Distribution, Incentive Effects and Optimal Income Taxa-tion,” Review of Economic Studies, 43(2), 261–267.
Samuelson, P. A. (1964): “Tax Deductibility of Economic Depreciation to Insure InvariantValuations,” Journal of Political Economy, 72(6), 604–606.
Seade, J. K. (1977): “On the Shape of Optimal Tax Schedules,” Journal of Public Eco-nomics, 7(2), 203–235.
Slavik, C., and H. Yazici (2014): “On the Consequences of Eliminating Capital TaxDifferentials,” Working paper.
Stiglitz, J. E. (1982): “Self-Selection and Pareto Efficient Taxation,” Journal of PublicEconomics, 17(2), 213–240.
Storesletten, K., C. I. Telmer, and A. Yaron (2004): “Consumption and RiskSharing over the Life Cycle,” Journal of Monetary Economics, 51(3), 609–633.
Werning, I. (2007): “Optimal Fiscal Policy with Redistribution,” Quarterly Journal ofEconomics, 122(3), 925–967.
33
Appendix
A Differential Taxation of Capital in the U.S. Tax Code
In this section, we first explain how the current U.S. tax system taxes returns to capital
assets differentially. Then we provide a brief summary of the historical evolution of capital
tax differentials.
According to the current U.S. corporate tax code, all capital income net of depreciation
is taxed at the statutory rate of 35%. However, effective marginal taxes on net returns
to capital might differ from the statutory rate if tax depreciation allowances differ from
actual economic depreciation. To see this point, let ρi be the return to capital type i, δi
be its economic depreciation rate, and δi be the depreciation rate allowed by the tax code.
Suppose for simplicity that at the end of a period, the firm liquidates and there is no further
production. Letting τ be the statutory tax rate, the effective tax rate, call it τ i, is given by
τ i =(ρi − δi)τ(ρi − δi)
.
As first pointed out by Samuelson (1964), if δi = δi, then τ i = τ . In words, when tax de-
preciation equals economic depreciation, then the effective tax on the return to capital i
equals the statutory rate. If this is true for all types of capital, there are no tax differentials.
If, instead, for capital of type i the tax depreciation allowance is higher (lower) than the
actual depreciation rate, then its return is effectively taxed at a lower (higher) rate (i.e.,
τ i < (>)τ). As argued by Gravelle (2003), inter alia, such discrepancies between tax depre-
ciation allowances and economic depreciation rates across capital assets are the main cause
of differential capital taxation in the United States.21
21Of course, in reality many firms continue operating for many periods and hence deduct the whole cost ofa capital investment over time via depreciation deductions. The differences in effective tax rates are createdby tax depreciation rules that make firms depreciate their capital assets either faster or slower than theireconomic depreciation over time. As long as the real interest rate is positive, tax depreciation rules thatallow firms to deduct depreciation faster relative to actual economic depreciation decrease the effective tax
34
Gravelle (2003) concisely summarizes the historical evolution of tax differentials.22 She
reports that before the 1986 Tax Reform Act, the statutory corporate income tax rate was
46%. The effective tax rates on returns to most equipment assets were less than 10%, whereas
they were around 35% to 40% for buildings. With the 1986 tax reform, the statutory rate
was reduced to 35%, and the depreciation rules were altered to reduce the tax differentials
between equipment capital and structure capital. These policy changes resulted in effective
equipment capital tax rates that were about 32% and structure capital tax rates that were
still about 35% to 40%. As documented by Gravelle (2011), in the current U.S. corporate
tax code (which went through a minor reform in 1993), equipments are taxed at 26% on
average and structures are taxed at 32%. To sum up, capital equipments have historically
been favored relative to capital structures, but the difference in effective tax rates has been
declining over time.
B Proofs
B.1 Proof of Proposition 2
We call the multipliers on period t feasibility and skilled agents’ incentive constraint to be
λtβt−1 and µ, respectively. The first-order optimality conditions with respect to cs,t and cu,t
are given by, for t ≥ 1,
(cs,t) : (πs + µ∗)u′(c∗s,t) = λ∗tπs (11)
(cu,t) : (πu − µ∗)u′(c∗u,t) = λ∗tπu, (12)
rate on capital. Inflation also affects the real value of depreciation deductions because these deductions arebased on the historical acquisition costs. For a detailed description of how effective tax rates are calculated,see Gravelle (1994).
22For a more detailed description of how tax differentials between capital equipments and structures havechanged between 1950 and 1983, see Auerbach (1983).
35
which imply for all h ∈ H and t ≥ 2,
λ∗t−1
λ∗t=u′(c∗h,t−1)
u′(c∗h,t). (13)
The first order optimality conditions with respect to the two types of capital for t ≥ 2 are:
(Ke,t) : λ∗t−1 = β[λ∗t F2(K∗s,t, K
∗e,t, L
∗s,t, L
∗u,t) +X∗t
],
(Ks,t) : λ∗t−1 = βλ∗t F1(K∗s,t, K∗e,t, L
∗s,t, L
∗u,t), where
X∗t = µ∗v′(l∗u,tw
∗u,t
w∗s,t
)l∗u,t
∂(w∗u,tw∗s,t
)∂K∗e,t
.
Combining the first-order conditions with respect to capital with (13), we get for both h ∈ H
and ∀t ≥ 2,
u′(c∗h,t−1) = βF ∗1,tu′(c∗h,t), (14)
u′(c∗h,t−1) = βF ∗2,tu′(c∗h,t)
(1 +
X∗tλ∗t F
∗2,t
). (15)
Part 1. Equation (14) together with the definition of intertemporal wedges proves that
the structure capital wedge is zero in all periods for all h ∈ H. Since µ∗ > 0, we have
X∗t < 0. Using the definition of the equipment capital wedge and equation (15), we have for
all h ∈ H and t ≥ 2,
τ ∗e,t(h) = − X∗tλ∗t F
∗2,t
> 0.
This finishes the proof of part 1 of Proposition 2.
Part 2. Suppose a steady state of the constrained efficient allocation exists. Letting the
allocation without any time subscripts denote this steady-state allocation, we have
X∗ = µ∗v′(l∗uw
∗u
w∗s)l∗u
∂(w∗uw∗s
)∂K∗e
< 0.
36
Thus, we have
τ ∗e = − X∗
λ∗F ∗2> 0. �
B.2 Proof of Proposition 3
As before, we call the multipliers on period t feasibility and skilled agents’ incentive constraint
to be λtβt−1 and µ, respectively. Consider the first-order optimality conditions with respect
to cs,t and ls,t :
u′(c∗s,t)A∗t = λ∗tπs
v′(l∗s,t)
(A∗t −
B∗tv′(l∗s,t)
)= λ∗tw
∗s,tπs, where
A∗t = (πs + µ∗)
B∗t = µ∗v′(l∗u,tw
∗u,t
w∗s,t
)l∗u,t
∂w∗u,tw∗s,t
∂L∗s,tzsπs.
By the first order conditions above, we have A∗t > 0 and(A∗t −
B∗tv′(l∗s,t)
)> 0. By Assumption
3, we have B∗t > 0. These imply
τ ∗y,t(s) = 1− A∗t(A∗t −
B∗tv′(l∗s,t)
) < 0. �
B.3 Proof of Proposition 4
Part 1. Let λtβt−1 and µt(h
t) be multipliers on period t feasibility constraint and the incentive
constraint at history ht, respectively. Under Assumptions 1 and 2, the result follows from
37
the first-order conditions with respect to the two capital types:
(Ks,t) : −λ∗t−1 + βλ∗t F1(K∗s,t, K∗e,t, L
∗s,t, L
∗u,t) = 0,
(Ke,t) : −λ∗t−1 + βλ∗t F2(K∗s,t, K∗e,t, L
∗s,t, L
∗u,t)
+β∑{ht∈Ht}
µ∗t (ht)v′
(l∗t (h
t−1, hot )w∗t (h
ot )
w∗t (ht)
)l∗t (h
t−1, hot )∂w∗t (hot )
w∗t (ht)
∂K∗e,t= 0.
Part 2. Also, under Assumption 1, it follows directly from Golosov, Kocherlakota, and
Tsyvinski (2003) that in any period t ≥ 1 and following any history ht, the constrained
efficient allocation satisfies the following inverse Euler equation:
1
u′(c∗t (ht))
=1
βF ∗1,t+1
Et
{1
u′(c∗t+1(ht+1))|ht}. (16)
Part 2(a) of the proposition follows from the definition of intertemporal wedges in equation
(9), equation (16) and the conditional version of Jensen’s inequality. Part 2(b) of the propo-
sition follows from Part 1 and equation (9), which defines intertemporal wedges for both
types of capital. �
There is a modified inverse Euler equation for equipment capital. We derive it here for
the sake of completeness. It follows from the first part of Proposition 4 and equation (16):
1
u′(c∗t (ht))
=1
β(F ∗2,t+1 +X∗t+1/λ
∗t+1
)Et{ 1
u′(c∗t+1(ht+1))|ht}. (17)
B.4 Proof of Proposition 5
First, under Assumption 5, for t ≥ 2
X∗t =∑
{ht−1∈Ht−1}
µ∗t (ht−1, s)v′
(l∗t (h
t−1, u)w∗u,tw∗s,t
)l∗t (h
t−1, u)∂w∗u,tw∗s,t
∂K∗e,t.
38
Assumption 2 implies ∂w∗u,tw∗s,t
/∂K∗e,t < 0, implying that, in any period t ≥ 2, X∗t < 0. The first
part of the proposition then follows from the first part of Proposition 4 and X∗t < 0, whereas
the second part follows from the second part of Proposition 4 and X∗t < 0.
B.5 Proof of Proposition 6
For any t, let ht = (ht−1, s) and for m ≤ t, let hm be the predecessor in period m. Consider
the first-order optimality conditions with respect to ct(ht) and lt(h
t) :
u′(c∗t (ht))A∗t (h
t) = λ∗tπt(ht),
v′(l∗t (ht))
(A∗t (h
t)− B∗t πt(ht)
v′(l∗t (ht))
)= λ∗tw
∗t (ht)πt(h
t),
where, letting χ be the indicator function,
A∗t (ht) = πt(h
t) +t∑
m=1
βt−mπt(ht|hm)µ∗m(hm)
[χ{s}(hm)− χ{u}(hm)
]B∗t =
∑{ht∈Ht|ht=s}
µ∗t (ht)v′
(l∗t (h
t−1, u)w∗t (u)
w∗t (s)
)l∗t (h
t−1, u)∂w∗t (u)
w∗t (s)
∂L∗s,tzs.
By the first order conditions above, we have A∗t (ht) > 0,
(A∗t (h
t)− B∗t πt(ht)
v′(l∗t (ht))
)and by As-
sumption 3, B∗t > 0. These imply
τ y,t(ht) = 1− A∗t (h
t)(A∗t (h
t)− B∗t πt(ht)
v′(l∗t (ht))
) < 0. �
C Implementation
In this section, we show how the constrained efficient allocation can be implemented in an
incomplete markets environment with taxes. We say that a tax system implements the
constrained efficient allocation in a market if the constrained efficient allocation arises as an
equilibrium of this market arrangement under the given tax system. The constrained efficient
39
allocation can be implemented in many different ways. We provide an implementation in
which the tax system mimics the actual U.S. tax code in the sense that capital tax differentials
are created at the firm level through depreciation allowances that differ from actual economic
depreciation. Therefore, creating the optimal capital tax differentials would not require
further complications to the U.S. tax code.
We begin by describing the market arrangement. We assume that markets are incomplete
in that agents can only trade non-contingent claims to future consumptions (i.e. they can
save and borrow at a net risk-free rate, rt). We denote an agent’s savings as at. There is
a representative firm that rents capital at a net interest rate rt and labor to produce the
output good.23 The wage rates for skilled and unskilled are ws,t and wu,t and are taken as
given by the firm.
Government and Taxes. There is a government that needs to finance {Gt}∞t=1, an
exogenously given sequence of government consumption. The government taxes consumers’
savings and labor income (in a nonlinear and history-dependent way). The government also
taxes the firms’ capital income net of depreciation. The statutory depreciation allowance
can differ from economic depreciation, as in the U.S. tax code.
Taxes on consumers are specified following Kocherlakota (2005). Let τ y,t : Rt+ → R
denote the labor tax schedule, where τ y,t(yt) is the labor income tax an agent with labor
income history yt = (y1, ..., yt) pays in period t. Labor income in history ht is defined as
yt(ht) = wt(ht) · lt(ht). Labor income yt and labor lt are one to one for a given level of
wages, which means it is sufficient to use either one of them when we define an allocation.
In the rest of this section, we use yt. There are also linear taxes on people’s asset holdings,
which depend on their income history. Letting τa,t : Rt+ → R denote linear tax rates on
asset holdings, an agent with income history yt pays τa,t(yt)(1 + rt)at(h
t−1), where at(ht−1)
denotes the agent’s asset holdings.24
23The assumption that the firm does not accumulate capital is innocuous and is made for convenienceonly. This setup is equivalent to one in which the firm, instead of the consumers, accumulates capital andmakes the capital accumulation decisions.
24One can show that the government can implement the constrained efficient allocation by taxing only the
40
Unlike in Kocherlakota (2005), the government also has access to a sequence of linear
taxes on firm’s capital income, denoted by (τ f,t)∞t=1. The corporate tax code also includes
statutory depreciation allowances, (δs,t, δe,t)∞t=1. The firm is allowed to deduct δi,tKi,t from
its tax base in period t.
Consumer’s Problem. Taking prices (rt, ws,t, wu,t)∞t=1 and taxes (τ y,t, τa,t)
∞t=1 as given,
a consumer solves
maxc,y,a
∞∑t=1
∑ht∈Ht
πt(ht)βt−1
[u(ct(h
t))− v(yt(h
t)
wt(ht)
)]s.t
ct(ht) + at+1(ht) ≤ yt(h
t)− τ y,t(yt(ht)) +[1− τa,t(yt(ht))
](1 + rt)at(h
t−1),
a1 = K∗s,1 +K∗e,1, c, y are nonnegative.
Firm’s Problem. Taking prices (rt, ws,t, wu,t)∞t=1 and taxes (τ f,t, δs,t, δe,t)
∞t=1 as given,
the firm solves
maxKe,t,Ks,t,Ls,t,Lu,t
F (Ks,t, Ke,t, Ls,t, Lu,t)− (1 + rt)(Ks,t +Ke,t)−ws,tzsLs,t −
wu,tzu
Lu,t − τ f,tΠf,t,
where Πf,t is the firm’s capital income net of depreciation allowances in period t given by
Πf,t =
[F (Ks,t, Ke,t, Ls,t, Lu,t)− δs,tKs,t − δe,tKe,t −
ws,tzsLs,t −
wu,tzu
Lu,t
].
Notice that, as in the U.S. corporate tax code, we assume a flat tax on the firm’s capital
income net of depreciation allowances.
Equilibrium. Given a tax system (τ y,t, τa,t, τ f,t, δs,t, δe,t)∞t=1, an equilibrium is an alloca-
tion for consumers, (ct(ht), yt(h
t), at+1(ht))∞t=1, an allocation for the firm, (Ks,t, Ke,t, Ls,t, Lu,t)∞t=1,
and prices (rt, ws,t, wu,t)∞t=1 such that (ct(h
t), yt(ht), at+1(ht))∞t=1 solves the consumer’s prob-
return from capital rtat(ht−1), which would be more in line with what we observe in the U.S. tax code.
41
lem, (Ks,t, Ke,t, Ls,t, Lu,t)∞t=1 solves the firm’s problem, and markets clear:
∑ht∈Ht
πt(ht)ct(h
t) +Ks,t+1 +Ke,t+1 +Gt = F (Ks,t, Ke,t, Ls,t, Lu,t),
Ks,t +Ke,t =∑ht∈Ht
πt(ht)at(h
t−1),
Ls,t =∑
{ht∈Ht|ht=s}
πt(ht)yt(h
t)
ws,tzs, Lu,t =
∑{ht∈Ht|ht=u}
πt(ht)yt(h
t)
wu,tzu.
The government’s period-by-period budget balance is implied by Walras’ law:
∑ht∈Ht
πt(ht)[τa,t(y
t(ht))(1 + rt)at(ht−1) + τ y,t(y
t(ht))]
+ τ f,tΠf,t = Gt.
In what follows, we describe an optimal tax system that implements the constrained effi-
cient allocation in the market setup described above. Before doing so, we provide a formal
definition of our notion of implementation.
Implementation. A tax system (τ y,t, τa,t, τ f,t, δs,t, δe,t)∞t=1 implements the constrained
efficient allocation (c∗t (ht), y∗t (h
t), K∗s,t, K∗e,t, L
∗s,t, L
∗u,t)∞t=1 if an allocation for consumers (c∗t (h
t),
y∗t (ht), at+1(ht))∞t=1 and an allocation for the firm (K∗s,t, K
∗e,t, L
∗s,t, L
∗u,t)∞t=1 jointly with the tax
system and prices (rt, ws,t, wu,t)∞t=1 constitute an equilibrium.
C.1 Optimal Tax System
First, we construct the optimal tax system. Then, we prove that it implements the con-
strained efficient allocation. Finally, we characterize its properties.
Optimal Tax System. We begin by describing optimal savings taxes. Set the taxes on
people’s savings as
τ ∗a,t+1(yt+1) = 1− u′(c∗t (ht))
βu′(c∗t+1(ht+1))F1(K∗s,t, K∗e,t, L
∗s,t, L
∗u,t)
, if yt+1 ∈ Y t+1∗
τ ∗a,t+1(yt+1) = 1, if else,
42
where we define yt(ht) = (ym(hm))tm=1, Yt∗ = {yt : yt = yt∗(ht), ht ∈ H t} .
In words, Y t∗ is the set of labor income histories observed at the constrained efficient
allocation. Set labor income taxes such that if yt ∈ Y t∗, then τ ∗y,t(yt) and a∗t+1(ht) are defined
to satisfy the flow budget constraints every period:
c∗t (ht) + a∗t+1(ht) = y∗t (h
t)− τ ∗y,t(yt) +[1− τ ∗a,t(yt)
]F1(K∗s,t, K
∗e,t, L
∗s,t, L
∗u,t)a
∗t (h
t−1).
If yt /∈ Y t∗, then τ ∗y,t(yt) = 2yt.
Finally, set
τ ∗f,t = 1−F1(K∗s,t, K
∗e,t, L
∗s,t, L
∗u,t)− δs
F2(K∗s,t, K∗e,t, L
∗s,t, L
∗u,t)− δe
, δ∗s,t = r∗t + δs, δ
∗e,t = δe.
Observe that at the corporate level, the optimal tax system works in the same way as the
U.S. tax code. In the U.S. corporate tax code, there is a single statutory corporate tax rate,
and differences between effective tax rates on equipment and structure capital income are
created through differences in tax depreciation rules, as explained in detail in Section A.
Proposition 7. The optimal tax system described above implements the constrained efficient
allocation.
Proof. First, define prices as r∗t = F1(K∗s,t, K∗e,t, L
∗s,t, L
∗u,t) − δs and ∀h ∈ H,w∗h,t =
∂F (K∗s,t,K∗e,t,L
∗s,t,L
∗u,t)
∂Lh,tzh. Second, observe that the only budget feasible income strategy for the
household is the one that corresponds to some agent’s income process at the constrained
efficient allocation (i.e., for any history ht in any period t, yt should be in Y t∗).
Third, we claim that if an agent chooses an income strategy y′, where this means y′t(ht) =
y∗t (ht) for some ht, then facing the prices defined above, the agent also chooses (c′, a′),
meaning that c′t(ht) = c∗t (h
t) and a′t+1(ht) = a∗t+1(ht) for all ht, t. If we can show this claim,
then the result that agent will actually choose the constrained efficient allocation follows,
since the constrained efficient allocation is incentive compatible.
43
To see this claim, take an agent that follows income strategy y′. His problem is
maxc,a
∞∑t=1
∑ht∈Ht
πt(ht)βt−1
[u(ct(h
t))− v
(y∗t (h
t)
w∗t (ht)
)]s.t
ct(ht) + at+1(ht) ≤ y∗t (h
t)− τ ∗y,t(yt∗(ht)) +[1− τ ∗a,t(yt∗(ht))
](1 + r∗t )at(h
t−1),
a1 ≤ K∗s,1 +K∗e,1, c is nonnegative.
The first-order conditions to this problem are the budget constraint with equality and
u′(ct(ht)) = (1 + r∗t+1)β
∑ht+1|ht
πt+1(ht+1|ht)u′(ct+1(ht+1))[1− τa∗t+1(yt+1∗(ht, ht+1)].
Clearly, the agent’s problem is concave, and hence these first-order conditions are necessary
and sufficient for optimality provided that a relevant transversality condition holds.
By construction of the labor tax code and the prices, c∗t (ht) and a∗t+1(ht) satisfy the flow
budget constraints. To see that they also satisfy the Euler equation above, observe that
wealth taxes are constructed such that
1− τ ∗a,t+1(yt+1∗(ht, ht+1)) =u′(c∗t (h
t))
βu′(c∗t+1(ht, ht+1))F ∗1 (Ks,t, Ke,t, Ls,t, Lu,t).
Finally, we also need to make sure that the firm chooses the right allocation. The firm’s opti-
mality conditions for labor are satisfied at the constrained efficient allocation by construction
of wages. The firm’s optimality conditions for capital are
(Ks,t) : F1(Ks,t, Ke,t, Ls,t, Lu,t)− (1 + r∗t )− τ ∗f,t[F1(Ks,t, Ke,t, Ls,t, Lu,t)− δs,t
]= 0
(Ke,t) : F2(Ks,t, Ke,t, Ls,t, Lu,t)− (1 + r∗t )− τ ∗f,t[F2(Ks,t, Ke,t, Ls,t, Lu,t)− δe,t
]= 0.
At the constrained efficient allocation, the first condition above holds by construction of r∗t
and δ∗s,t whereas the second one holds by construction of τ ∗f,t, r
∗t and δ
∗e,t. �
44
Properties of Optimal Capital Taxes. Next we summarize and discuss the properties
of optimal capital taxes.
1. Capital income is taxed twice, once at the consumer level via savings taxes and once
at the corporate level (double taxation of capital).
2. The statutory corporate tax is strictly positive if only downward incentive constraints
bind: ∀t ≥ 2 : τ ∗f,t > 0, where the inequality comes from Proposition 5.
3. There is a single statutory tax rate on corporate income, τ ft , but the effective taxes
on capital income at the firm level differ across different capital assets because of
differences in the statutory depreciation allowances. The firm is allowed to expense all
of its user cost of structure capital, whereas the depreciation allowance on equipment
capital is equal to its economic depreciation. These optimal tax depreciation allowances
imply that the optimal effective corporate tax rate on structure capital is zero and the
optimal effective corporate tax rate on equipment capital is equal to the statutory
corporate tax rate τ ∗f,t.25
4. Properties 2 and 3 imply that the government uses capital income taxes to collect
revenue at the corporate level, as in the U.S. tax code.
5. Expected taxes on consumers’ asset holdings are zero as in Kocherlakota (2005):
Et{
1− τa∗t+1(yt+1)}
= Et
{u′(c∗t (h
t))
βu′(c∗t+1(ht+1))F ∗1,t+1
}= 1.
This property follows from the inverse Euler equation (16).
25Note that there are other implementations in which both capital types can be taxed positively as longas the tax rates create the efficient capital return wedge. In that case, the savings’ tax that the consumersface would have to be adjusted.
45
D Alternative Tax Systems
D.1 Current Differential Taxation of Capital
In the current DTC, the planner must use the capital income taxes as in the U.S. tax code.
This means that we impose a set of constraints for both types of agents in each period (here,
τ s and τ e are fixed over time and taken from the data, namely, τ s = 0.422 and τ e = 0.371):
for all h ∈ H
1− τ s =
u′(ch,t)
βu′(ch,t+1)− 1
F1,t+1 − δsand 1− τ e =
u′(ch,t)
βu′(ch,t+1)− 1
F2,t+1 − δe, (18)
Observe that the restrictions that the current capital taxes impose on the set of allocations
that the planner can choose, (18), is written down taking into account that, in the U.S. tax
code, it is capital income net of depreciation that is being taxed. Also, notice that any three
of these constraints in (18) imply the fourth. Therefore, we write the current DTC planning
problem, ignoring the fourth:
max{(ch,t,lh,t)h∈H ,Ks,t,Ke,t,Lu,t,Ls,t}∞t=0
∑h∈H
πh
∞∑t=1
βt−1 [u(ch,t)− v(lh,t)] s.t.
∀t ≥ 1, Gt +∑h
πh∈Hch,t +Ks,t+1 +Ke,t+1 ≤ F (Ks,t, Ke,t, Ls,t, Lu,t),
∞∑t=1
βt−1 [u(cs,t)− v(ls,t)] ≥∞∑t=1
βt−1
[u(cu,t)− v(
lu,twu,tws,t
)
],
∀t ≥ 1, βu′(cs,t+1) [(1− τ s)(F1,t+1 − δs) + 1] = u′(cs,t),
∀t ≥ 1, βu′(cs,t+1) [(1− τ e)(F2,t+1 − δe) + 1] = u′(cs,t),
∀t ≥ 1, βu′(cu,t+1) [(1− τ s)(F1,t+1 − δs) + 1] = u′(cu,t).
46
D.2 Optimal Nondifferential Taxation of Capital
In optimal NDTC, the planner is not allowed to tax the two types of capital differentially.
This means that for agents of both types h ∈ H and all t ≥ 1, we impose
1− τ s,t+1(h) = 1− τ e,t+1(h).
This restriction on taxes is equivalent to the following restriction on allocations: for all h ∈ H
and t ≥ 1,
1− τ s,t+1(h) =
u′(ch,t)
βu′(ch,t+1)− 1
F1,t+1 − δs= 1− τ e,t+1(h) =
u′(ch,t)
βu′(ch,t+1)− 1
F2,t+1 − δe.
This equation can be simplified to an equality constraint on net returns: F1,t+1 − δs =
F2,t+1 − δe, which we impose for each period. The optimal NDTC planning problem reads:
max{(ch,t,lh,t)h,Ks,t,Ke,t,Lu,t,Ls,t}∞t=0
∑h=u,s
πh
∞∑t=1
βt−1 [u(ch,t)− v(lh,t)] s.t.
∀t ≥ 1, Gt +∑h
πhch,t +Ks,t+1 +Ke,t+1 ≤ F (Ks,t, Ke,t, Ls,t, Lu,t),
∞∑t=1
βt−1 [u(cs,t)− v(ls,t)] ≥∞∑t=1
βt−1
[u(cu,t)− v(
lu,twu,tws,t
)
],
∀t ≥ 1, F1(Ks,t+1, Ke,t+1, Ls,t+1, Lu,t+1)− δs = F2(Ks,t+1, Ke,t+1, Ls,t+1, Lu,t+1)− δe.
47