
Machines, Buildings, and Optimal Dynamic Taxes∗

Ctirad Slavík (a) and Hakki Yazici (b)

(a) Goethe University Frankfurt, Frankfurt, Germany. Email: [email protected]

(b) Sabanci University, Istanbul, Turkey. Email: [email protected]

February 22, 2014

Abstract

The effective taxes on capital returns differ depending on capital type in the U.S.

tax code. This paper uncovers a novel reason for the optimality of differential capital

taxation. We set up a model with two types of capital, equipment and structures,

and equipment-skill complementarity. Under a plausible assumption, we show that it

is optimal to tax equipment at a higher rate than structures. In a calibrated model,

the optimal tax differential rises from 27 to 40 percentage points over the transition to

the new steady state. The welfare gains of optimal differential capital taxation can be

as high as 0.4% of lifetime consumption.

JEL classification: E62, H21.

Keywords: Differential capital asset taxation, equipment capital, structure capital,

equipment-skill complementarity.

∗Corresponding author: Ctirad Slavík, Grüneburgplatz 1 (Campus Westend), House of Finance, Room 3.49, 60323 Frankfurt am Main, Germany. Telephone: +49 (0)69 798 33808.

1 Introduction

In the U.S. corporate tax code, the effective marginal tax rates on returns to capital assets

show a considerable amount of variation depending on the capital type. For instance, ac-

cording to Gravelle (2011), the effective marginal tax rate on the returns to communications

equipment is 19%, whereas it is above 35% for non-residential buildings.1 This feature of the

tax code has been the subject of numerous reform proposals since the 1980s. Recently, Pres-

ident Obama called for a reform to abolish the tax rules that create differential taxation of

capital assets in order to “level the playing field” across companies.2 Many economists have

argued in favor of the proposals to abolish tax differentials following an efficiency argument

first raised by Diamond and Mirrlees (1971): taxing different types of capital at different

rates distorts firms’ production decisions, thereby creating production inefficiencies.

In this paper, we take a step back and reassess whether differential taxation of capital income is a desirable feature of the tax code. Theoretically, we uncover a novel economic mechanism that calls for differential taxation of capital assets, but with an important caveat. In the current U.S. tax code, the effective tax rate on equipment capital (i.e., mostly machines) is on average 5% below the effective tax rate on structure capital (i.e., mostly non-residential buildings). In contrast, our theory suggests that equipment capital should be taxed at a higher rate than structure capital. We then conduct a quantitative exercise to assess the importance of optimal differential capital taxation. In our baseline calibration, we find that the tax rate on equipment capital should be at least 27 percentage points higher than the tax rate on structure capital, both in the transition and at the steady state. Furthermore, we find that the welfare gains of optimal differential capital taxation are as high as 0.4% of lifetime consumption for reasonable parameter values.

We study dynamic optimal taxes in an economy in which people are heterogeneous in

1The capital tax differentials are created through tax depreciation allowances that differ from actual depreciation rates. We explain this in detail and provide further information on the historical evolution of capital tax differentials in the U.S. tax code in Appendix A.

2The 2011 U.S. President’s State of the Union Address. Retrieved from http://www.whitehouse.gov/the-press-office/2011/01/25/remarks-president-state-union-address.


terms of their skills, and the government uses capital and labor income taxes to provide redis-

tribution (insurance). In our benchmark model, we consider an environment with permanent

skills. We then generalize our main theoretical results to an environment with stochastic

skills. Our approach to optimal dynamic taxation follows the recent New Dynamic Public

Finance literature in the sense that taxes are allowed to be arbitrary functions of people’s

past and current incomes.

The key feature of our environment is equipment-skill complementarity in the production

technology. Following Gravelle (2011), we group capital assets into two categories: structure

capital and equipment capital. We further assume that there are two types of labor: skilled

and unskilled. Following the empirical evidence for the U.S. economy provided by Krusell,

Ohanian, Rıos-Rull, and Violante (2000), we assume that the degree of complementarity

between equipment capital and skilled labor is higher than the degree of complementarity

between equipment capital and unskilled labor. Structure capital is neutral in terms of its

complementarity with skilled and unskilled labor. More generally, Flug and Hercowitz (2000)

provide evidence for equipment-skill complementarity for a large panel of countries.

Equipment-skill complementarity implies that skilled and unskilled labor are not perfect

substitutes and that the skill premium – defined as the ratio of the skilled wage to the un-

skilled wage – is endogenous. In particular, a decrease in the stock of equipment capital

decreases the skill premium, thereby creating an indirect transfer from the skilled agents

to the unskilled ones. Therefore, depressing the level of equipment capital creates an extra

channel of redistribution and/or insurance. In order to depress equipment capital accumu-

lation, the government taxes returns to equipment capital at a higher rate than it taxes

returns to structure capital. This implies the optimality of differential capital taxation.

We assess the quantitative importance of differential capital taxation using the model

with permanent skills calibrated to the U.S. economy. In our benchmark calibration, the

optimal equipment capital income tax is 27.6 percentage points higher than the tax on

structure capital in the first period. The tax differential rises along the transition path to


39.6 percentage points at the steady state.

The skill premium is about 40% in the first period after the optimal tax reform, and rises

over the transition to 48% in the new steady state. Thus, the ‘optimal’ skill premium in any

period is significantly lower than 80%, the empirical estimate for the current U.S. economy.

This suggests that the optimal tax system relies much more on indirect redistribution than

the current U.S. tax system. We also find that the optimal skill premium is rising over

the transition because the economy is growing, and hence, the level of equipment capital

increases. This result is interesting as it suggests that, even if the government cares about

equality, an increasing skill premium is optimal in a growing economy.

Next, we assess the welfare gains of optimal differential capital taxation. We compare

welfare in the optimal tax system with welfare in a tax system in which the government is

unrestricted in its choice of labor income taxes, but the tax rates on both types of capital

are restricted to be equal to the values in the U.S. tax code. The additional welfare gains of

allowing for differential capital taxation are 0.19% in terms of lifetime consumption in the

benchmark and can be as high as 0.40% within the set of reasonable parameter values.

We focus on the redistribution and insurance provision role of differential capital taxation.

There could be other reasons for differential taxation of capital. For instance, some authors

have argued that investment in equipment capital might create positive externalities. Other

things being equal, positive externalities would be a reason to tax equipment capital at a

lower rate than structure capital. Auerbach, Hassett, and Oliner (1994) point out, however,

that it is hard to support the existence of such positive externalities on empirical grounds.

In this paper, we abstract from all other possible reasons for differential capital taxation in

order to isolate its redistributive and insurance provision role.

Related Literature. Our paper relates to three distinct strands of literature. First, in

their seminal paper Diamond and Mirrlees (1971) show that tax systems should maintain

productive efficiency. In an environment with multiple capital types, this result implies that

all capital should be taxed at the same rate. However, Auerbach (1979) and Feldstein (1990)


show that it might be optimal to tax capital differentially if the government is exogenously

restricted to a narrower set of fiscal instruments than in Diamond and Mirrlees (1971). Our

paper is different in the sense that the optimality of differential capital taxation stems from

redistribution and/or insurance motives.

Our paper follows the New Dynamic Public Finance (NDPF) tradition. This literature

studies optimal capital and labor income taxation in dynamic settings in which agents’ la-

bor skills are allowed to change stochastically over time and the optimal tax system can be

arbitrarily nonlinear in the history of capital and labor income.3 No paper in this literature,

however, has studied differential taxation of capital assets prior to the current paper. In

addition, our paper contributes to the NDPF literature by adding to a set of recent papers

that aim to provide practical policy recommendations by quantifying the theoretical implications

of the NDPF literature; see, e.g., Fukushima (2010), Huggett and Parra (2010), Farhi

and Werning (2013), and Golosov, Troshkin, and Tsyvinski (2013).

Our paper is also related to a set of theoretical studies on optimal static Mirrleesian

taxation with endogenous wages. Stiglitz (1982) assumes that the labor supplies of agents

with different skills are imperfect substitutes and shows that the agent with the highest

income should be subsidized. Naito (1999) shows that the uniform commodity taxation result

of Atkinson and Stiglitz (1976) and productive efficiency result of Diamond and Mirrlees

(1971) are no longer valid under imperfect labor substitutability. Ales, Kurnaz, and Sleet

(2014) analyze a static optimal tax problem in which agents with different skills are assigned

to tasks (occupations). They calculate optimal taxes for the U.S. economy for the 1970s

and the 2000s and compare them to their empirical counterparts. In addition, they analyze

the impact of technical change on optimal taxes. We differ from this literature by focusing

on a dynamic environment with different types of capital, which we use to analyze optimal

differential taxation of capital assets both theoretically and quantitatively.

The rest of the paper is structured as follows. In Section 2, we lay out the model for the

3For seminal contributions to NDPF, see Golosov, Kocherlakota, and Tsyvinski (2003), Kocherlakota (2005), and Albanesi and Sleet (2006). For an excellent review of this literature, see Kocherlakota (2010).


case of permanent skills. In Section 3, we show that differential capital taxation is optimal in

this environment. Section 4 generalizes our main results to an environment with stochastic

skills. Section 5 discusses our quantitative results, and Section 6 concludes. Appendix A

discusses differential taxation of capital assets in the U.S. tax code, Appendix B contains the

proofs, Appendix C provides a formal implementation of the constrained efficient allocation

in an incomplete markets environment, and Appendix D defines alternative social planning

problems that we analyze in Section 5.

2 Model

There is a continuum of measure one of agents who live for infinitely many periods. They

differ in their skill levels: they are born either skilled or unskilled, h ∈ H = {u, s}. A fraction

πh of agents belong to skill group h. In the main body of the paper, we assume that agents’

skills are permanent. This is a natural assumption given that in our quantitative

analysis we associate skill levels with educational attainment. In Section 4, we show that

our main theoretical results remain valid for a general stochastic skill process.

Production Technology. An agent of skill level h produces l · zh units of effective type-h

labor when he works l units of labor. There are two different occupational sectors in

this economy: a skilled occupation in which only skilled agents are allowed to work and an

unskilled occupation in which only unskilled agents are allowed to work. The first assumption

reflects the fact that unskilled people do not have the skills to work in the skilled occupation.

The second assumption can be rationalized as follows. We assume that agents get the same

disutility from working in the two occupations. Therefore, a skilled agent will choose to work

in the skilled occupation as long as he gets a higher wage in the skilled occupation. This

reasoning holds in the presence of taxes under our assumption that taxes are functions of

income histories only. We discuss the nature of the tax system in more detail below.

Output is produced according to a production function Y = F (Ks, Ke, Ls, Lu), where


Ls, Lu, Ks and Ke denote the aggregate amounts of effective skilled labor, effective unskilled

labor, structure capital, and equipment capital. Output can be used for consumption or

can be converted to structure or equipment capital one-for-one. The economy is initially

endowed with $K^*_{s,1}$ and $K^*_{e,1}$ units of the two capital goods. We define a function $\tilde F$ that gives the total wealth of the economy: $\tilde F = F + (1-\delta_s)K_s + (1-\delta_e)K_e$, where $\delta_s$ and $\delta_e$ denote the depreciation rates of structure and equipment capital. We define $F_i(\cdot)$ and $\tilde F_i(\cdot)$ as the partial derivatives of $F$ and $\tilde F$ with respect to the $i$th argument.

Wages. Agents of type $h \in H$ receive wage $w_{h,t}$ in period $t$ for one unit of their labor:

$$w_{s,t} = F_3(K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t}) \cdot z_s, \qquad w_{u,t} = F_4(K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t}) \cdot z_u. \tag{1}$$

Equipment-Skill Complementarity. Following Krusell, Ohanian, Rıos-Rull, and Vi-

olante (2000), we assume that the production technology features equipment-skill comple-

mentarity, which means that the degree of complementarity between equipment capital and

skilled labor is higher than that between equipment capital and unskilled labor. This as-

sumption has two important implications that make our model different from the standard

model in the NDPF literature. First, an increase in the stock of equipment capital decreases

the ratio of the marginal product of unskilled labor to the marginal product of skilled labor.

In other words, the ratio of skilled to unskilled wages (skill premium) is endogenous, and this

ratio is increasing in equipment capital. Structure capital, on the other hand, is assumed to

be neutral in terms of its complementarity with skilled and unskilled labor. Second, skilled

and unskilled labor are no longer perfect substitutes which implies that the skill premium is

decreasing in the total amount of skilled labor and increasing in the total amount of unskilled

labor. We formalize these assumptions on technology as follows.

Assumption 1. F3(·)/F4(·) is independent of Ks.

Assumption 2. F3(·)/F4(·) is strictly increasing in Ke.

Assumption 3. F3(·)/F4(·) is strictly decreasing in Ls and strictly increasing in Lu.


Assumptions 1 to 3 are maintained throughout the paper without further reference.
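To make the three technology assumptions concrete, the following sketch checks them numerically for a nested CES technology in the spirit of Krusell, Ohanian, Ríos-Rull, and Violante (2000). The functional form, the parameter values, and all function names are illustrative assumptions, not the calibration used in this paper; in this form, equipment-skill complementarity corresponds to `eta > rho`.

```python
def output(Ks, Ke, Ls, Lu, alpha=0.12, nu=0.5, omega=0.5, rho=-0.5, eta=0.4):
    """Nested CES output Y = F(Ks, Ke, Ls, Lu); form and parameters are illustrative."""
    inner = omega * Ke**rho + (1 - omega) * Ls**rho        # equipment / skilled-labor composite
    outer = nu * inner**(eta / rho) + (1 - nu) * Lu**eta   # combined with unskilled labor
    return Ks**alpha * outer**((1 - alpha) / eta)


def marginal_product(arg_index, point, step=1e-6):
    """Central finite-difference approximation of the partial derivative F_i."""
    up, down = list(point), list(point)
    up[arg_index] += step
    down[arg_index] -= step
    return (output(*up) - output(*down)) / (2 * step)


def premium_ratio(Ks, Ke, Ls, Lu):
    """F3/F4: marginal product of skilled relative to unskilled effective labor."""
    point = (Ks, Ke, Ls, Lu)
    return marginal_product(2, point) / marginal_product(3, point)


base = premium_ratio(1.0, 1.0, 0.5, 0.5)
print("more structures:", premium_ratio(2.0, 1.0, 0.5, 0.5) - base)  # approximately zero
print("more equipment: ", premium_ratio(1.0, 2.0, 0.5, 0.5) - base)  # positive
print("more skilled:   ", premium_ratio(1.0, 1.0, 0.6, 0.5) - base)  # negative
print("more unskilled: ", premium_ratio(1.0, 1.0, 0.5, 0.6) - base)  # positive
```

Finite differences are used only to keep the sketch independent of the particular functional form; with a closed-form technology the ratio could of course be differentiated analytically.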

Preferences. An agent of type $h$ evaluates a consumption-labor sequence, $(c_{h,t}, l_{h,t})_{t=1}^{\infty}$, with a utility function that is time-separable and separable between consumption and labor,

$$\sum_{t=1}^{\infty} \beta^{t-1}\left[u(c_{h,t}) - v(l_{h,t})\right],$$

where $\beta \in (0,1)$ is the discount factor and $u, v : \mathbb{R}_+ \to \mathbb{R}$. We assume that $u', -u'', v', v'' > 0$.

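Allocation. An allocation is $x = \left((c_{h,t}, l_{h,t})_{h\in H}, K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t}\right)_{t=1}^{\infty}$.

Feasibility. An allocation is feasible if in any period $t \ge 1$,

$$\sum_{h=u,s} \pi_h c_{h,t} + K_{s,t+1} + K_{e,t+1} + G_t \le \tilde F(K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t}), \tag{2}$$

$$L_{h,t} = \pi_h l_{h,t} z_h \ \text{ for } h \in H, \quad\text{and}\quad K_{s,1} \le K^*_{s,1},\ K_{e,1} \le K^*_{e,1}. \tag{3}$$

Here, $\tilde F$ is the total-wealth function (output plus undepreciated capital) defined above, and $\{G_t\}_{t=1}^{\infty}$ is a sequence of exogenously given wasteful government consumption.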

Optimal Tax Problem. As in the U.S. tax code, we assume that taxes only depend on

people’s incomes, and not directly on their skills, occupations, wages, or labor supplies. We

do not model why the government does not use this information in the tax code (there could

be constitutional, administrative or other reasons), but rather focus on the best policy given

the existing fiscal framework. Following Mirrlees (1971) and the recent New Dynamic Public

Finance literature, we impose no further restrictions on the tax code; specifically, taxes can

be arbitrarily nonlinear functions of income histories.

Following Kocherlakota (2010), we make no explicit mention of private information in

motivating why we restrict attention to tax systems that depend only on incomes. However,

the fact that the government can condition taxes only on income implies that the optimal tax

problem is isomorphic to a social planning problem, in which agents are privately informed

about their skills, occupations, wages, and labor supplies. Income and consumption are

public information. In the planning problem, each agent reports his skill type to the planner


and receives an allocation as a function of his report.4 The set of allocations available to the

planner is constrained by incentive compatibility constraints, which ensure that agents do

not misreport their types.5

Our strategy is to first characterize the solution to the planning problem and then use

this characterization to back out properties of an optimal tax system. We now formally

define the notion of incentive compatibility.

Incentive Compatibility. With permanent types, people report their type only once

in the first period. Moreover, since we assume that agents cannot switch occupations, an

agent can only mimic the other type’s income level by adjusting his labor hours. As a result,

the planner faces only two incentive constraints.

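We say that an allocation is incentive compatible if and only if for all $h \in H$

$$\sum_{t=1}^{\infty} \beta^{t-1}\left[u(c_{h,t}) - v(l_{h,t})\right] \ \ge\ \sum_{t=1}^{\infty} \beta^{t-1}\left[u(c_{j,t}) - v\!\left(\frac{l_{j,t}\, w_{j,t}}{w_{h,t}}\right)\right], \tag{4}$$

where $j$ denotes the complement of $h$ in the set $H$.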
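As an illustration of how constraint (4) can be checked for a candidate allocation, the sketch below evaluates both sides over a truncated horizon. The functional forms for u and v, the horizon `T`, the constant wage paths, and the two candidate allocations are hypothetical choices made only for this example.

```python
import math


def lifetime_utility(cons, hours, beta, u, v):
    """Discounted sum of u(c_t) - v(l_t) over a finite horizon."""
    return sum(beta**t * (u(c) - v(l)) for t, (c, l) in enumerate(zip(cons, hours)))


def incentive_compatible(alloc, wages, beta, u, v):
    """Check constraint (4) for both types; alloc[h] = (cons, hours), wages[h] = wage path."""
    for h, j in (("s", "u"), ("u", "s")):
        truth = lifetime_utility(*alloc[h], beta, u, v)
        cons_j, hours_j = alloc[j]
        # Type h reproduces type j's income l_j * w_j by working l_j * w_j / w_h hours.
        mimic_hours = [l * wj / wh for l, wj, wh in zip(hours_j, wages[j], wages[h])]
        if truth < lifetime_utility(cons_j, mimic_hours, beta, u, v):
            return False
    return True


T, beta = 50, 0.96
u = math.log                   # assumed utility of consumption
v = lambda l: 0.5 * l**2       # assumed disutility of labor
wages = {"s": [1.8] * T, "u": [1.0] * T}

generous = {"s": ([1.4] * T, [0.9] * T), "u": ([0.9] * T, [0.8] * T)}
too_equal = {"s": ([1.2] * T, [0.9] * T), "u": ([0.9] * T, [0.8] * T)}

print(incentive_compatible(generous, wages, beta, u, v))   # True
print(incentive_compatible(too_equal, wages, beta, u, v))  # False
```

In the second allocation the skilled agent's consumption is too close to the unskilled agent's, so the skilled agent prefers to earn the unskilled income with fewer hours, and the skilled incentive constraint fails.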

Social Planning Problem. We analyze the problem of a planner who maximizes a Utilitarian objective with equal weights on all agents. The social planning problem is

$$\max_x \ \sum_{h\in H} \pi_h \sum_{t=1}^{\infty} \beta^{t-1}\left[u(c_{h,t}) - v(l_{h,t})\right] \quad \text{s.t. (1), (2), (3), and (4).}$$

The allocation that solves the social planning problem is called the constrained efficient

allocation and is denoted with an asterisk throughout the paper.

4Agents only report their skill types, because given that income is observable and skilled (unskilled) agents can only work in the skilled (unskilled) occupation, knowing an agent’s skill type reveals all his private information.

5The restriction to direct truth-telling mechanisms is without loss of generality because of the following argument. Any market arrangement with taxes is a particular mechanism. By the revelation principle, no such mechanism can do better than the optimal direct truth-telling mechanism. Conversely, in Appendix C, we prove that there is a tax system that implements the allocation that arises from the optimal direct truth-telling mechanism. Therefore, finding the optimal tax system reduces to finding the optimal direct truth-telling mechanism, which is the problem of a social planner who assigns allocations as functions of agents’ types subject to incentive compatibility constraints.


3 Optimality of Differential Taxation of Capital

In this section, we uncover the economic mechanism that calls for differential capital tax-

ation. We show that, with equipment-skill complementarity, as long as only the incentive

constraint that prevents skilled agents from pretending to be unskilled binds, the optimal

tax on equipment capital is strictly higher than the optimal tax on structure capital. We

begin by formalizing the assumption on the pattern of binding incentive constraints.

Assumption 4. The incentive constraint (4) binds for h = s and is slack for h = u at the

solution to the social planning problem.

In all numerical simulations in Section 5, in which we parameterize the model to match

the U.S. data, the skilled wage is higher than the unskilled wage in every period. However, in

our environment with endogenous wages, it is not possible to guarantee that skilled wages

are always higher than unskilled wages without making very restrictive assumptions on F .

Without monotonic wages, it is not possible to determine the pattern of binding incentive

constraints. Therefore, we proceed directly with Assumption 4; see Stiglitz (1982) for the

same approach. We verify that Assumption 4 is satisfied in all our quantitative exercises.

3.1 Capital Return Wedge

In the standard growth model with two types of capital, aggregate savings are allocated

between the two types of capital in a way that equates their marginal returns. Proposition

1 below shows that this is not true in the constrained efficient allocation, meaning it is

optimal to create a wedge between the marginal returns to structure and equipment capital.

This result forms the basis for the optimality of differential taxation of capital: to create the

optimal wedge in the market equilibrium, the two types of capital should be taxed differently.

Proposition 1. Suppose Assumption 4 holds. Then, at the constrained efficient allocation, in any period $t \ge 2$, $\tilde F_1(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t}) < \tilde F_2(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t})$.


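Proof. Let $\lambda_t \beta^{t-1}$ be the multiplier on the period-$t$ feasibility constraint and $\mu$ be the multiplier on the skilled agents’ incentive constraint. The first-order optimality conditions with respect to the two types of capital are:

$$(K_{e,t}): \quad \lambda^*_{t-1} = \beta\left[\lambda^*_t \tilde F_2(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t}) + X^*_t\right],$$

$$(K_{s,t}): \quad \lambda^*_{t-1} = \beta \lambda^*_t \tilde F_1(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t}),$$

where

$$X^*_t = \mu^* v'\!\left(\frac{l^*_{u,t}\, w^*_{u,t}}{w^*_{s,t}}\right) l^*_{u,t}\, \frac{\partial\left(w^*_{u,t}/w^*_{s,t}\right)}{\partial K^*_{e,t}}.$$

By equipment-skill complementarity, we have $\partial(w^*_{u,t}/w^*_{s,t})/\partial K^*_{e,t} < 0$. Since $\mu^* > 0$, we have $X^*_t < 0$. Using $X^*_t < 0$ together with the first-order conditions gives the result. □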
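The final step of the proof can be made explicit. The sketch below equates the right-hand sides of the two first-order conditions and divides by $\beta\lambda^*_t$. That $\lambda^*_t > 0$ is an assumption added here (it is natural because $u' > 0$ makes resources valuable in every period), and $\tilde F^*_{i,t}$ is used as shorthand for the $i$th partial derivative of the total-wealth function evaluated at the constrained efficient allocation.

```latex
\begin{align*}
\beta \lambda^*_t \tilde F^*_{1,t}
  &= \beta\left[\lambda^*_t \tilde F^*_{2,t} + X^*_t\right] \\
\Longrightarrow\quad
\tilde F^*_{2,t} - \tilde F^*_{1,t}
  &= -\frac{X^*_t}{\lambda^*_t} \;>\; 0 .
\end{align*}
```

The size of the return wedge is therefore the shadow cost of tightening the skilled incentive constraint, measured in units of period-$t$ resources.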

Because of equipment-skill complementarity, increasing the level of equipment capital in period $t$ decreases the wage ratio $w^*_{u,t}/w^*_{s,t}$. This makes it more profitable for the skilled agents to pretend to be unskilled and, hence, tightens the skilled incentive constraint. From a planning perspective, this means that increasing equipment capital has an extra negative return, $X^*_t < 0$, in addition to the physical return, $\tilde F^*_{2,t}$, where $\tilde F^*_{i,t}$ denotes $\tilde F_i(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t})$. Since structure capital is neutral, changing its level does not affect the incentive constraint, and hence its only return is its physical return, $\tilde F^*_{1,t}$. In order for the overall return on the two types of capital to be equal, the physical return on equipment capital must be higher than the physical return on structure capital at the constrained efficient allocation.

This result is intuitive: decreasing the level of equipment capital has an additional

marginal benefit for the planner, because it decreases the skill premium and thus indirectly

redistributes from the skilled to the unskilled. Decreasing the level of equipment capital in-

creases its return above the return on structure capital due to diminishing marginal returns.

This intuition shows that there is an extra reason to depress equipment capital accumulation

relative to structure capital. This implies that equipment capital should be taxed at a higher

rate than structure capital as we show in Section 3.2.

Two features of the model are key for the optimality of the capital return wedge. First,


if equipment capital were also neutral in terms of its complementarity with the two types

of labor, then $X^*_t = 0$, and hence, it would be efficient to equate the physical marginal returns

to the two types of capital. Second, if the government could condition taxes on skill types, it

could redistribute via type-specific lump-sum taxes at zero efficiency cost and would not need

the indirect (and distortionary) channel of redistribution, which works through the capital

return wedge. In terms of the planning problem, this would mean that skills were not private

information but publicly known. As a result, there would be no incentive constraints, and

hence, X∗t = 0, and the optimal capital return wedge would again be zero.

3.2 Optimal Differential Capital Taxes

In this section, we provide a link between the optimality of the capital return wedge and the

optimality of differential capital taxation. In Proposition 2, we characterize the properties of

optimal wedges (distortions) that a planner has to create in the intertemporal allocation of

resources in order to implement the constrained efficient allocation in a competitive market

environment, in which people are allowed to save through both types of capital. Formally, the

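optimal intertemporal wedge that the planner has to create for an agent of type $h$ for capital of type $i \in \{s, e\}$ from period $t$ to $t+1$ is defined as $\tau^*_{i,t+1}(h) = 1 - u'(c^*_{h,t})/[\beta \tilde F^*_{i,t+1} u'(c^*_{h,t+1})]$, where $\tilde F^*_{s,t+1}$ and $\tilde F^*_{e,t+1}$ denote the marginal returns to structure and equipment capital, respectively.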

Proposition 2. Suppose Assumption 4 holds. Then,

1. In all periods t ≥ 2, the optimal wedge on equipment capital is strictly positive and

independent of agent type, whereas the optimal wedge on structure capital is zero, i.e.,

for all $h \in H$, $\tau^*_{e,t} \equiv \tau^*_{e,t}(h) > \tau^*_{s,t} \equiv \tau^*_{s,t}(h) = 0$.

2. If a steady state of the constrained efficient allocation exists, then the optimal wedge

on equipment capital is strictly positive at the steady state.

Proof. Relegated to Appendix B. □
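One way to see how Proposition 1 and part 1 of Proposition 2 fit together is the following short calculation. It is only a sketch, not the proof in Appendix B: it takes the zero wedge on structure capital as given and uses nothing beyond the wedge definition. The shorthand $\tilde F^*_{s,t+1}$ and $\tilde F^*_{e,t+1}$ for the marginal returns to structure and equipment capital (the first and second partial derivatives of the total-wealth function) is an assumption of this sketch.

```latex
\begin{align*}
\tau^*_{s,t+1}(h) = 0
  &\;\Longrightarrow\;
  \frac{u'(c^*_{h,t})}{\beta\, u'(c^*_{h,t+1})} = \tilde F^*_{s,t+1}, \\
\tau^*_{e,t+1}(h)
  &= 1 - \frac{u'(c^*_{h,t})}{\beta\, \tilde F^*_{e,t+1}\, u'(c^*_{h,t+1})}
   = 1 - \frac{\tilde F^*_{s,t+1}}{\tilde F^*_{e,t+1}} \;>\; 0 .
\end{align*}
```

The inequality uses Proposition 1 together with positive marginal returns. The resulting expression does not depend on $h$, which is consistent with the type independence stated in the proposition.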

Part 1 of Proposition 2 calls for zero taxation of structure capital and positive taxation

of equipment capital in every period. Recall that by Assumption 1, a change in the level


of structure capital does not affect the skill premium. Therefore, there is no indirect redis-

tribution motive to distort structure capital accumulation. In addition, it follows from the

uniform commodity taxation result of Atkinson and Stiglitz (1976) that in the absence of

skill risk, it is optimal not to tax structure capital.6 In contrast, taxing equipment capital

has the extra benefit of decreasing the skill premium, thus providing indirect redistribution.

Therefore, the planner finds it optimal to tax equipment capital.7 Finally, the proposition

also shows that the capital tax rates are type independent.

Part 2 of Proposition 2 says that the optimal wedge on equipment capital is positive

in steady state. This result is interesting because it shows that the indirect redistribution

channel calls for taxing equipment capital not only in the short run but also in the long run.

This result is in contrast with the long run optimality of zero capital taxation in the Ramsey

literature shown by Chamley (1986) and Judd (1985).

3.3 Intratemporal Wedges

Our model has interesting implications for intratemporal wedges (i.e., marginal labor income

taxes) as well. The optimal intratemporal wedge in period t for an agent of skill type h,

defined as $\tau^*_{y,t}(h) = 1 - v'(l^*_{h,t})/[w^*_{h,t} u'(c^*_{h,t})]$, measures the efficient distortion that the

planner needs to create in this agent’s intratemporal allocation of consumption and labor

in period t. The famous no distortion at the top result, proven originally by Sadka (1976)

and Seade (1977), states that in a static Mirrleesian economy, if the distribution of skills

has a finite support, then the consumption-labor decision of the agent with the highest skill

level should not be distorted. Huggett and Parra (2010) prove this result for a dynamic

Mirrleesian economy in which skill types are permanent and a version of our Assumption 4

holds. In Proposition 3, we show that the no distortion at the top result does not hold in the

6The optimality of not taxing structure capital is closely related to Werning (2007), who shows that with permanent types zero capital taxation is optimal in a dynamic Mirrleesian model with a standard Cobb-Douglas production function.

7If Assumption 4 is not satisfied, it will still be generically optimal to tax the two types of capital differentially, as we show explicitly in a more general environment in Section 4. However, in that case, it is not possible to determine which capital good will be taxed at a higher rate.


presence of equipment-skill complementarity. In particular, we show that the skilled agents’

labor income should be subsidized.

Proposition 3. Suppose Assumption 4 holds. Then, in any period t ≥ 1, the optimal

intratemporal wedge of the skilled agent is negative, i.e., $\tau^*_{y,t}(s) < 0$.

Proof. Relegated to Appendix B. □

The intuition for this result is as follows. Under the equipment-skill complementarity as-

sumption, skilled and unskilled labor are imperfect substitutes. This implies that increasing

the labor supply of the skilled agents decreases the skill premium, which means that increasing

skilled labor supply creates indirect redistribution. In order to encourage the supply of

skilled labor, the government finds it optimal to subsidize skilled labor at the margin. This

result is in line with Stiglitz (1982), who shows that when two types of labor are imperfect

substitutes, the more productive agents’ labor supply should be subsidized.

4 Generalization to Stochastic Skills

In the model laid out in Section 2, agents’ skill types are permanent. In this section, we

allow for agents’ skills to evolve stochastically over time. This level of generality is useful

because it allows us to establish that our main theoretical results remain valid if people’s

skills change after they enter the labor market, or if we take a dynastic interpretation of our

model in which skills change from one generation to another. Notice that in this environment

with stochastic skills the government uses taxes to provide insurance in addition to providing

redistribution and financing public spending.

We first show that differential taxation of capital is optimal for any stochastic skill pro-

cess. Moreover, under an assumption regarding the pattern of binding incentive compatibility

conditions, it is optimal to tax equipment capital at a higher rate than structure capital.

The environment is the same as in Section 2 except that people are born identical, but

their skills evolve stochastically over time. A skill realization in period t is denoted by


$h_t \in H$. A partial skill history in period $t$ is denoted by $h^t = (h_1, h_2, \ldots, h_t) \in H^t$, where $H^t$ denotes the set of all period-$t$ histories. Let $\pi_t(h^t)$ be the unconditional probability of $h^t$.
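For intuition about the history notation, the sketch below enumerates all period-t histories and their unconditional probabilities. The paper allows a general stochastic skill process; the first-order Markov chain, its transition probabilities, and the function name used here are hypothetical choices made only for illustration.

```python
from itertools import product

SKILLS = ("u", "s")


def history_probabilities(t, initial, transition):
    """Map each history h^t = (h_1, ..., h_t) to its unconditional probability."""
    probs = {}
    for history in product(SKILLS, repeat=t):
        p = initial[history[0]]
        for prev, curr in zip(history, history[1:]):
            p *= transition[prev][curr]
        probs[history] = p
    return probs


# Hypothetical numbers: a persistent two-state Markov chain for skills.
initial = {"u": 0.7, "s": 0.3}
transition = {"u": {"u": 0.9, "s": 0.1}, "s": {"u": 0.2, "s": 0.8}}

pi_3 = history_probabilities(3, initial, transition)
print(len(pi_3))                 # 8 histories in H^3
print(pi_3[("u", "u", "s")])     # 0.7 * 0.9 * 0.1
```

The number of histories doubles every period, which is one reason the analysis works with node-by-node (temporary) incentive constraints rather than with the full set of reporting strategies.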

Wages. An agent of type $h$ in period $t$ receives a wage $w_{h,t}$, defined in equation (1), independent of his skill history before period $t$. For expositional convenience, in this section, we use $w_t(h_t)$ instead of $w_{h,t}$ to denote wages.

Preferences. Preferences are now defined over stochastic processes of consumption and labor, $(c_t, l_t)_{t=1}^{\infty}$, where $c_t, l_t : H^t \to \mathbb{R}_+$, using an ex ante expected discounted utility function,

$$\sum_{t=1}^{\infty} \sum_{h^t \in H^t} \pi_t(h^t)\, \beta^{t-1}\left[u(c_t(h^t)) - v(l_t(h^t))\right]. \tag{5}$$

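Allocation. An allocation is $x = (c_t, l_t, K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t})_{t=1}^{\infty}$.

Feasibility. An allocation is feasible if in any period $t \ge 1$,

$$\sum_{h^t \in H^t} \pi_t(h^t)\, c_t(h^t) + K_{s,t+1} + K_{e,t+1} + G_t \le \tilde F(K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t}), \tag{6}$$

$$L_{h,t} = \sum_{\{h^t \in H^t \mid h_t = h\}} \pi_t(h^t)\, l_t(h^t)\, z_h \ \text{ for } h \in H, \quad\text{and}\quad K_{s,1} \le K^*_{s,1},\ K_{e,1} \le K^*_{e,1}. \tag{7}$$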

Incentive Compatibility. Define $\sigma_t : H^t \to H$. A reporting strategy is $\sigma = (\sigma_t)_{t=1}^{\infty}$. Let $\Sigma$ denote the set of all reporting strategies. The truth-telling strategy, which we denote by $\sigma^*$, prescribes reporting the true type at each and every node: for all $h^t$, $\sigma^*_t(h^t) = h_t$. Let $\sigma^t(h^t) = (\sigma_1(h^1), \ldots, \sigma_t(h^t))$ denote the history of reports along history $h^t$. Define

$$W(\sigma \mid x) = \sum_{t=1}^{\infty} \sum_{h^t \in H^t} \pi_t(h^t)\, \beta^{t-1}\left[u\big(c_t(\sigma^t(h^t))\big) - v\!\left(\frac{l_t(\sigma^t(h^t))\, w_t(\sigma_t(h^t))}{w_t(h_t)}\right)\right]$$

as the expected discounted value of using reporting strategy $\sigma$ given an allocation $x$. We say that an allocation $x$ is incentive compatible if and only if for all $\sigma \in \Sigma$, $W(\sigma^* \mid x) \ge W(\sigma \mid x)$.

Following Fernandes and Phelan (2000), without loss of generality, we restrict attention

to the set of reporting strategies that has lying only at a single node. This allows us to

replace the incentive compatibility constraints defined above with a sequence of temporary

incentive constraints, one for each node. We say that an allocation x is temporary incentive compatible if and only if, in any period t and at any node $h^{t-1}$ and for all $h_t \in H$,

$$\begin{aligned}
& u(c_t(h^{t-1}, h_t)) - v(l_t(h^{t-1}, h_t)) + \sum_{m=t+1}^{\infty} \sum_{h^m \in \hat{H}^m} \pi_m(h^m)\,\beta^{m-t}\left[u(c_m(h^m)) - v(l_m(h^m))\right] \\
& \geq u(c_t(h^{t-1}, h^o_t)) - v\left(\frac{l_t(h^{t-1}, h^o_t)\,w_t(h^o_t)}{w_t(h_t)}\right) + \sum_{m=t+1}^{\infty} \sum_{h^m \in \hat{H}^m} \pi_m(h^m)\,\beta^{m-t}\left[u(c_m(\hat{h}^m)) - v(l_m(\hat{h}^m))\right],
\end{aligned} \tag{8}$$

where $h^o_t$ is the complement of $h_t$ in the set H, $\hat{H}^m$ denotes the set of period m histories that follow from $h^t$, i.e., $\hat{H}^m \equiv \{h^m \in H^m : h^m \succeq h^t\}$, and $\hat{h}^m = (h^{t-1}, h^o_t, h_{t+1}, \ldots, h_m)$ is identical to $h^m$ except in period t. From now on, we use (8) to represent incentive compatibility.8
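The period-t part of constraint (8) can be illustrated with a small numerical check. The sketch below is a one-period, two-type simplification in which the continuation terms on both sides are assumed to be equal and are therefore dropped. The preference parameters are the benchmark values from Section 5.1; the wages and the two candidate allocations are hypothetical.

```python
SIGMA, GAMMA, PHI = 2.0, 1.0, 67.8  # benchmark preference parameters (Section 5.1)


def u(c):
    return c ** (1 - SIGMA) / (1 - SIGMA)


def v(l):
    return PHI * l ** (1 + GAMMA) / (1 + GAMMA)


def truthful_payoff(allocation, skill):
    """Period payoff from reporting the true skill."""
    c, l = allocation[skill]
    return u(c) - v(l)


def misreport_payoff(allocation, wages, skill):
    """Period payoff from reporting the other skill type.

    The agent receives the other type's consumption and must deliver the other
    type's labor income, so effective labor is scaled by the wage ratio.
    """
    other = "u" if skill == "s" else "s"
    c, l = allocation[other]
    return u(c) - v(l * wages[other] / wages[skill])


def temporary_ic_holds(allocation, wages, skill):
    """True if truth-telling is weakly better than misreporting for `skill`."""
    return truthful_payoff(allocation, skill) >= misreport_payoff(allocation, wages, skill)


wages = {"s": 1.8, "u": 1.0}                        # hypothetical wages
separating = {"s": (1.2, 0.30), "u": (0.3, 0.25)}   # (consumption, labor) by type
violating = {"s": (0.5, 0.30), "u": (0.3, 0.25)}    # skilled consumption too low

checks = {
    name: {skill: temporary_ic_holds(alloc, wages, skill) for skill in ("s", "u")}
    for name, alloc in (("separating", separating), ("violating", violating))
}
```

In the second allocation the skilled agent prefers the unskilled bundle, which is the kind of downward deviation that the constraint rules out.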

Social Planning Problem. The social planning problem that defines the constrained

efficient allocation is: $\max_x$ (5) s.t. (1), (6), (7), and (8).

Optimality of Differential Capital Taxation. Now we prove the optimality of dif-

ferential taxation of capital for the general environment with skill shocks. First, we define

the intertemporal wedge for an agent with skill history $h^t$ and for capital of type i ∈ {s, e} from period t to period t + 1, as

$$\tau_{i,t+1}(h^t) = 1 - \frac{u'(c_t(h^t))}{\beta F_{i,t+1}\,E_t\{u'(c_{t+1}(h^{t+1})) \,|\, h^t\}}. \tag{9}$$

The first part of Proposition 4 generalizes Proposition 1 by showing that it is optimal to

create a wedge between the marginal returns to structure and equipment capital when skills

evolve stochastically over time. The second part of Proposition 4 shows that the optimal

intertemporal wedges for structure and equipment capital are different. Thus, optimality

of differential taxation of capital does not depend on the permanent skill types assumption

maintained in the main model.

8 Temporary incentive constraints were first shown to be necessary and sufficient for incentive compatibility by Green (1987) for an environment with i.i.d. shocks. Fernandes and Phelan (2000) generalized this result to environments with persistent shocks. To be precise, we need to use two more assumptions to be able to use temporary incentive constraints. First, we need each skill history to be reached with strictly positive probability. Second, we need a transversality condition that is automatically satisfied if we assume that instantaneous utility is bounded.

Proposition 4. 1. At the constrained efficient allocation, in any period t ≥ 2,

$$F_1(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t}) = F_2(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t}) + X^*_t/\lambda^*_t, \text{ where}$$

$$X^*_t = \sum_{h^t \in H^t} \mu^*_t(h^t)\, v'\!\left(\frac{l^*_t(h^{t-1}, h^o_t)\,w^*_t(h^o_t)}{w^*_t(h_t)}\right) l^*_t(h^{t-1}, h^o_t)\, \frac{\partial \left(w^*_t(h^o_t)/w^*_t(h_t)\right)}{\partial K^*_{e,t}}$$

and $\lambda_t \beta^{t-1}$ and $\mu_t(h^t)$ are the Lagrange multipliers on the period t feasibility constraint and the incentive constraint at history $h^t$.

2. (a) The optimal wedge on structure capital in any period t ≥ 2 and history $h^{t-1}$ satisfies $\tau^*_{s,t}(h^{t-1}) \geq 0$. The inequality is strict if and only if there is no h ∈ H such that $\pi_t(h^{t-1}, h \,|\, h^{t-1}) = 1$.

(b) The optimal wedge on equipment capital in any period t ≥ 2 and history $h^{t-1}$ is

$$\left[1 - \tau^*_{e,t}(h^{t-1})\right] = \left[1 - \tau^*_{s,t}(h^{t-1})\right] \cdot \left[1 + X^*_t/(\lambda^*_t F^*_{2,t})\right]. \tag{10}$$


The idea behind the first part of Proposition 4 is very similar to the one for Proposition

1: under equipment-skill complementarity, increasing the amount of equipment capital has

an effect on incentives, summarized by the term X∗t . In contrast, changing the amount of

structure capital does not affect incentives. As a result, it is optimal to create a wedge

between the physical returns to the two types of capital. The main distinction from the per-

manent type model is that, in the case with stochastic skills, a change in period t equipment

capital affects all the binding incentive constraints in that period. Thus, X∗t measures the

cumulative effect of a change in equipment capital on all the binding incentive constraints.

Since at this level of generality it is not possible to determine the pattern of binding incentive

constraints, the sign of X∗t is ambiguous.

Part 2(a) of Proposition 4 states that the intertemporal wedge on structure capital is

positive if there is skill risk. Intuitively, if we allow an agent to save at the marginal rate of

return to structure capital, he will save more than the efficient level. In the next period, he

will work less than socially optimal if he turns out to be of the skilled type. To prevent this

double deviation, it is optimal to discourage savings. The government achieves that with a

positive wedge on structure capital.9 Naturally, with permanent types there is no skill risk

and, hence, no reason to tax structure capital, as we have already shown in Proposition 2.
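The link between skill risk and a positive structure capital wedge can be checked numerically. The sketch below assumes the standard form of the inverse Euler equation, $1/u'(c_t) = E_t[1/u'(c_{t+1})]/(\beta F_{1,t+1})$, and CRRA utility with σ = 2 as in the benchmark calibration; the consumption values and probabilities are hypothetical. It picks current consumption so that the inverse Euler equation holds and then evaluates the wedge defined in equation (9).

```python
def marginal_utility(c, sigma=2.0):
    """u'(c) for CRRA utility."""
    return c ** (-sigma)


def consumption_from_inverse_euler(c_next, probs, beta, gross_return, sigma=2.0):
    """Current consumption implied by 1/u'(c_t) = E[1/u'(c_{t+1})] / (beta * R)."""
    expected_inverse_mu = sum(p / marginal_utility(c, sigma) for p, c in zip(probs, c_next))
    inverse_mu_today = expected_inverse_mu / (beta * gross_return)
    return inverse_mu_today ** (1 / sigma)  # because 1/u'(c) = c**sigma


def intertemporal_wedge(c_today, c_next, probs, beta, gross_return, sigma=2.0):
    """Wedge from equation (9): 1 - u'(c_t) / (beta * R * E[u'(c_{t+1})])."""
    expected_mu = sum(p * marginal_utility(c, sigma) for p, c in zip(probs, c_next))
    return 1 - marginal_utility(c_today, sigma) / (beta * gross_return * expected_mu)


beta = 0.985
gross_return = 1 / beta          # hypothetical gross return on structure capital
probs = [0.5, 0.5]

risky_next = [0.8, 1.2]          # consumption depends on next period's skill
c_risky = consumption_from_inverse_euler(risky_next, probs, beta, gross_return)
wedge_risky = intertemporal_wedge(c_risky, risky_next, probs, beta, gross_return)

safe_next = [1.0, 1.0]           # no skill risk
c_safe = consumption_from_inverse_euler(safe_next, probs, beta, gross_return)
wedge_safe = intertemporal_wedge(c_safe, safe_next, probs, beta, gross_return)
```

With risky future consumption the wedge is strictly positive by Jensen's inequality, and it vanishes when future consumption is certain, in line with part 2(a).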

Equation (10) in part 2(b) of the proposition is a version of the no-arbitrage condition for

this economy. The equation shows that the intertemporal wedge on equipment capital can be

decomposed into two parts. First, the government wants to discourage savings in equipment

capital for the same reason that it wants to discourage savings in structure capital, which is

captured by the first term on the right-hand side of equation (10). The second term on the

right-hand side of equation (10) is present in order to create the optimal wedge between the

returns to the two types of capital. The presence of this term implies that generically the

optimal wedges on the two types of capital are different in any period and history, which

establishes the optimality of differential taxation of capital.
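A short rearrangement of equation (10) may help make this last step explicit. It is only a restatement in the notation of Proposition 4, not an additional result:

```latex
\begin{align*}
1-\tau^*_{e,t}(h^{t-1})
  &= \bigl[1-\tau^*_{s,t}(h^{t-1})\bigr]
     \Bigl[1+\frac{X^*_t}{\lambda^*_t F^*_{2,t}}\Bigr] \\
\Longrightarrow\quad
\tau^*_{e,t}(h^{t-1})-\tau^*_{s,t}(h^{t-1})
  &= -\bigl[1-\tau^*_{s,t}(h^{t-1})\bigr]\frac{X^*_t}{\lambda^*_t F^*_{2,t}}, \\
\frac{1-\tau^*_{e,t}(h^{t-1})}{1-\tau^*_{s,t}(h^{t-1})}
  &= 1+\frac{X^*_t}{\lambda^*_t F^*_{2,t}}.
\end{align*}
```

The two wedges therefore coincide only if $X^*_t = 0$ (or $\tau^*_{s,t} = 1$), and the ratio of the after-wedge returns depends on the period but not on the history $h^{t-1}$.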

A Special Case. In Assumption 5 below, we assume that the only incentive constraints

that bind are those that prevent the skilled from pretending to be unskilled. We call these

incentive constraints downward incentive constraints. There is no theoretical result that

establishes the pattern of binding incentive constraints for general skill processes in dynamic

Mirrleesian environments, even when wages are exogenous.10 Indeed, there are examples

in which some upward incentive constraints bind. In this regard, Assumption 5 is stronger

than Assumption 4, which we use in Section 3.

Assumption 5. In any period t ≥ 1, history ht, only downward incentive constraints bind.

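Assumption 5 allows us to show that X∗t < 0 in all periods, because under equipment-skill complementarity the wage of the unskilled relative to the skilled falls with equipment capital. It is then possible to sign the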

9 The positive wedge on structure capital follows from the inverse Euler equation; see equation (16) in Appendix B. This condition was first derived by Rogerson (1985) and then generalized by Golosov, Kocherlakota, and Tsyvinski (2003). The inverse Euler equation does not hold for equipment capital because of the effect that equipment capital has on incentives. We derive a modified version of the inverse Euler equation for equipment capital in Appendix B; see equation (17).

10 Downward incentive constraints are the only binding incentive constraints when skills are i.i.d. and wages are exogenous.

capital return wedge, and show that the optimal equipment capital wedge is higher than the

optimal structure capital wedge. These results are summarized by the following proposition.

Proposition 5. Suppose Assumption 5 holds. Then, in any period t ≥ 2 and history $h^{t-1}$,

$$F_1(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t}) < F_2(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t}) \quad \text{and} \quad \tau^*_{e,t}(h^{t-1}) > \tau^*_{s,t}(h^{t-1}).$$


Intratemporal Wedges. Under Assumption 5, Proposition 6 generalizes the optimality

of subsidizing skilled labor supply, shown for the permanent type case in Section 3.3, for

the environment in which skills evolve stochastically over time. First, define the optimal

intratemporal wedge at history $h^t$ as $\tau^*_{y,t}(h^t) = 1 - v'(l^*_t(h^t))/\left(w^*_t(h^t)\,u'(c^*_t(h^t))\right)$.

Proposition 6. Suppose Assumption 5 holds. In any period t ≥ 1 and history $h^{t-1}$,

$\tau^*_{y,t}(h^{t-1}, s) < 0$.
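As an illustration of the intratemporal wedge definition, the following sketch evaluates $1 - v'(l)/(w\,u'(c))$ under the functional forms and benchmark parameters of Section 5.1. The consumption, labor, and wage inputs are hypothetical and chosen only to show the sign convention: a negative wedge means the agent works more than he would if he faced his undistorted wage, which corresponds to a marginal subsidy.

```python
def labor_wedge(c, l, w, sigma=2.0, gamma=1.0, phi=67.8):
    """Intratemporal wedge 1 - v'(l) / (w * u'(c)) with CRRA and power disutility."""
    marginal_disutility = phi * l ** gamma      # v'(l)
    marginal_utility = c ** (-sigma)            # u'(c)
    return 1 - marginal_disutility / (w * marginal_utility)


# Hypothetical inputs with c = 1 and w = 1, so the undistorted labor supply is 1/phi.
undistorted = labor_wedge(c=1.0, l=1 / 67.8, w=1.0)
subsidized = labor_wedge(c=1.0, l=2 / 67.8, w=1.0)    # works more: negative wedge
taxed = labor_wedge(c=1.0, l=0.5 / 67.8, w=1.0)       # works less: positive wedge
```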


Implementation. In Appendix C, we provide an implementation of the constrained

efficient allocation through a tax system in a competitive market environment in which

agents trade a risk free bond and capital. The implementation result holds for any stochastic

process, including the permanent type model. An interesting feature of this tax system is

that the optimal tax differentials across equipment and structure capital can be implemented

at the firm level, as is the case in the current U.S. tax system. This is possible because,

as the second term on the right-hand side of equation (10) shows, the differential between

optimal intertemporal wedges of structure and equipment capital is history independent in

any period. Another notable feature of our implementation is that the optimal tax system

mimics the actual U.S. tax code in the sense that capital tax differentials are created through

depreciation allowances that differ from actual economic depreciation. Therefore, creating

the optimal capital tax differentials would not require complicating the U.S. tax code further.

5 Quantitative Analysis

The main goal of this section is to analyze the quantitative importance of differential taxation

of capital in a calibrated version of our model. As in the main part of the paper, we

assume that agents’ skill types are permanent. Since there is no labor income risk in this

environment, the only role of taxation is redistribution (along with financing government

consumption). Assuming permanent skills is natural given that in the data we associate

skill levels with educational attainment. In addition, there is empirical evidence that initial

conditions account for a large part of the cross-sectional variation in lifetime earnings.11

We first calibrate the model parameters to the U.S. economy using a competitive equi-

librium framework with the actual U.S. tax code and government consumption level. We

then solve a social planning problem with endogenous factor prices in which the planner

“inherits” the initial capital stocks from the steady state of the competitive equilibrium.12

We solve for the whole time series of the constrained efficient allocation, thus taking into

account the transition to a new steady state. Then, we recover the optimal wedges (taxes)

from the constrained efficient allocation. In line with Proposition 2, we find that the optimal

taxes on equipment capital are higher than those on structure capital. Specifically, in our

benchmark calibration, the optimal tax differential increases from 27.6 percentage points in the first period

to 39.5 percentage points in the steady state. We also assess the welfare consequences of differential capital

taxation. We find that the welfare gains of optimal differential capital taxation can be as

high as 0.4% in terms of lifetime consumption.

11 Keane and Wolpin (1997) estimate that initial conditions account for 90% of the cross-sectional variation in life-time earnings. Huggett, Ventura, and Yaron (2011) estimate this number to be over 60%, and Storesletten, Telmer, and Yaron (2004) estimate it to be almost 50%.

12 We would not be able to assess the role of differential capital taxation in a partial equilibrium environment, because the skill premium would not be affected by changes in the level of equipment capital. To the contrary, most quantitative papers in the NDPF literature consider partial equilibrium environments. As Farhi and Werning (2012) show, considering general equilibrium effects might be important even with a standard production function without complementarities.

5.1 Calibration

To calibrate the parameters in the social planning problem, we assume that the steady state

of the competitive equilibrium defined in Appendix C represents the current U.S. economy.

We first fix a number of parameters to values from the data or from the literature. We then

calibrate the remaining parameters so that the stationary competitive equilibrium (SCE) matches the U.S. data along selected

dimensions.

One period in our model corresponds to one year. We assume that the period utility

function takes the form $u(c) - v(l) = c^{1-\sigma}/(1-\sigma) - \phi\, l^{1+\gamma}/(1+\gamma)$. In the benchmark case,

we use σ = 2 and γ = 1. These are within the range of values that have been considered in

the literature. We further assume that the production function takes the same form as in

Krusell, Ohanian, Ríos-Rull, and Violante (2000):

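$$Y = F(K_s, K_e, L_s, L_u) = K_s^{\alpha}\left(\nu\left[\omega K_e^{\rho} + (1-\omega)L_s^{\rho}\right]^{\frac{\eta}{\rho}} + (1-\nu)L_u^{\eta}\right)^{\frac{1-\alpha}{\eta}}.$$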

Using U.S. data, Krusell, Ohanian, Ríos-Rull, and Violante (2000) estimate α, ρ, η, and we use their estimates, but they do not estimate ω and ν. We calibrate these parameters to U.S. data, as we explain in detail below. This production function satisfies Assumptions 1 – 3 if η > ρ, which is what Krusell, Ohanian, Ríos-Rull, and Violante (2000) find.
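The role of the condition η > ρ can be seen in a short numerical sketch of this production function. It uses the benchmark parameter values reported in Tables 1 and 2 and assumes that wages equal marginal products of labor with $z_s = z_u = 1$; the input levels are hypothetical. The analytic skill premium is compared against a finite-difference ratio of marginal products and then evaluated at two levels of equipment capital.

```python
ALPHA, ETA, RHO = 0.117, 0.401, -0.495   # benchmark values (Table 1)
OMEGA, NU = 0.477, 0.657                 # calibrated values (Table 2)


def output(Ks, Ke, Ls, Lu):
    """Production function with equipment-skill complementarity."""
    composite = OMEGA * Ke ** RHO + (1 - OMEGA) * Ls ** RHO
    inner = NU * composite ** (ETA / RHO) + (1 - NU) * Lu ** ETA
    return Ks ** ALPHA * inner ** ((1 - ALPHA) / ETA)


def skill_premium(Ke, Ls, Lu):
    """Analytic ratio of marginal products of skilled and unskilled labor."""
    composite = OMEGA * Ke ** RHO + (1 - OMEGA) * Ls ** RHO
    return (
        NU * (1 - OMEGA) / (1 - NU)
        * composite ** ((ETA - RHO) / RHO)
        * Ls ** (RHO - 1)
        / Lu ** (ETA - 1)
    )


def numerical_skill_premium(Ks, Ke, Ls, Lu, step=1e-6):
    """Same ratio from central finite differences, as a cross-check."""
    d_skilled = (output(Ks, Ke, Ls + step, Lu) - output(Ks, Ke, Ls - step, Lu)) / (2 * step)
    d_unskilled = (output(Ks, Ke, Ls, Lu + step) - output(Ks, Ke, Ls, Lu - step)) / (2 * step)
    return d_skilled / d_unskilled


# Hypothetical input levels, chosen only for illustration.
Ks, Ls, Lu = 1.0, 0.15, 0.18
premium_low_equipment = skill_premium(1.0, Ls, Lu)
premium_high_equipment = skill_premium(1.2, Ls, Lu)
```

Because η > ρ under these parameter values, the skill premium rises with equipment capital and does not depend on structure capital, which is the complementarity the theoretical results rely on.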

We assume that the government consumption-to-output ratio equals 16%, which is close

to the average ratio in the United States during the period 1980 – 2012, as reported in the

National Income and Product Accounts (NIPA) data. We follow Heathcote, Storesletten, and

Violante (2010) and assume a flat labor income tax rate of τ y = 27% (for a discussion of the

construction of this number, see Domeij and Heathcote (2004)). Gravelle (2011) documents

that because of differences in tax depreciation rates, the effective tax rates on structure

capital and equipment capital differ at the firm level. Specifically, she estimates the effective

corporate tax rate on structure capital to be 32%, and that on equipment capital to be 26%.

The capital income tax rate at the consumer level is 15% in the U.S. tax code. This implies

Table 1: Benchmark Parameters

Parameter | Symbol | Value | Source
Relative risk aversion | σ | 2 |
Inverse Frisch elasticity | γ | 1 |
Structure capital depreciation rate | δs | 0.056 | GHK
Equipment capital depreciation rate | δe | 0.124 | GHK
Share of structure capital in output | α | 0.117 | KORV
Measure of elasticity of substitution between equipment capital Ke and unskilled labor Lu | η | 0.401 | KORV
Measure of elasticity of substitution between equipment capital Ke and skilled labor Ls | ρ | -0.495 | KORV
Tax on labor income | τy | 0.27 | HSV
Overall tax on structure capital income | τs | 0.422 | Gravelle (2011)
Overall tax on equipment capital income | τe | 0.371 | Gravelle (2011)
Government consumption | G/Y | 0.16 | NIPA
Relative supply of skilled workers | ps/pu | 0.778 | U.S. Census
Share of skilled workers’ wealth | ζ | 0.686 | U.S. Census

This table reports the benchmark parameters that we take directly from the data or the literature. The acronyms GHK,

KORV, and HSV stand for Greenwood, Hercowitz, and Krusell (1997), Krusell, Ohanian, Ríos-Rull, and Violante (2000), and

Heathcote, Storesletten, and Violante (2010), respectively. NIPA stands for the National Income and Product Accounts.

an overall tax on structure capital τ s = 1− 0.85 · (1− 0.32) = 42.2% and an overall tax on

equipment capital τ e = 1− 0.85 · (1− 0.26) = 37.1%. These numbers are in line with a 40%

tax on aggregate capital that is reported by Domeij and Heathcote (2004). We assume that

unspent government tax revenue is distributed back to the agents in a lump-sum manner,

which implies that in the SCE average taxes are in general not equal to marginal taxes. We

set the ratio of skilled to unskilled agents to be consistent with the 2011 US Census data.
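The overall capital tax rates above combine the firm-level and consumer-level rates multiplicatively. A minimal sketch of that arithmetic follows; the function name is ours and is introduced only for illustration.

```python
def overall_capital_tax(corporate_rate, personal_rate):
    """Overall tax when returns are taxed first at the firm and then at the consumer level."""
    return 1 - (1 - personal_rate) * (1 - corporate_rate)


PERSONAL_RATE = 0.15                                        # consumer-level capital income tax
tax_structures = overall_capital_tax(0.32, PERSONAL_RATE)   # effective corporate rate on structures
tax_equipment = overall_capital_tax(0.26, PERSONAL_RATE)    # effective corporate rate on equipment
differential = tax_structures - tax_equipment               # structures taxed more than equipment
```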

The stationary competitive equilibrium is not unique in our environment with permanent

types. Let us denote the steady-state asset holdings of a skilled agent by as and of an

unskilled agent by au. Given aggregate capital levels Ks, Ke consistent with the SCE, any

asset distribution of the form psas = ζ(Ks+Ke) and puau = (1− ζ)(Ks+Ke) with ζ ∈ (0, 1)

can arise in the SCE. We pick a ζ so that the SCE asset distribution matches the observed

asset distribution between skilled and unskilled agents in the 2010 U.S. Census data. Table 1

summarizes the benchmark parameters that we take directly from the data or the literature.

This leaves us with several parameters to be determined. We cannot identify zu and

Table 2: Benchmark Calibration Procedure

Parameter | Symbol | Value | Target | Data and SCE | Source
Discount factor | β | 0.985 | Capital-to-output ratio | 2.9 | NIPA and FAT
Disutility of labor | φ | 67.8 | Labor supply | 1/3 |
Production function parameter | ω | 0.477 | Labor share | 2/3 | NIPA
Production function parameter | ν | 0.657 | Skill premium ws/wu | 1.8 | HPV

This table reports our benchmark calibration procedure. The production function parameters ν and ω control the income share

of equipment capital, skilled and unskilled labor in output. The acronym HPV stands for Heathcote, Perri, and Violante (2010).

NIPA stands for the National Income and Product Accounts, and FAT stands for the Fixed Asset Tables.

zs separately from the remaining parameters of the production function, and therefore, set

them to zu = zs = 1. We then calibrate the parameter that controls the income share of

equipment capital ω, the parameter that controls the income share of unskilled labor ν, the

labor disutility parameter φ, and the discount factor β. We calibrate these parameters so

that (i) the labor share equals 2/3 (approximately the average labor share in 1980 – 2010

as reported in the NIPA data), (ii) the capital-to-output ratio equals 2.9 (approximately

the average of 1980 – 2011 as reported in the NIPA and Fixed Asset Tables), (iii) the skill

premium equals 1.8 (as reported by Heathcote, Perri, and Violante (2010) for the 2000s),

and (iv) the aggregate labor supply in steady state equals 1/3 (as is commonly used in the

macro literature). Table 2 summarizes our calibration procedure.

5.2 Quantitative Results

In this section, we analyze the quantitative properties of the optimal tax system. We do

so by solving the social planning problem (SPP) defined in Section 3.2 with parameters

calibrated in Section 5.1 to the U.S. economy.13 We assume that the planner inherits initial

13 We solve the SPP assuming that the economy converges to a steady state in 200 periods. We verify that changing the number of periods does not affect the results. In other words, the economy gets very close to steady state long before period 200.

Table 3: Steady-State Comparison of Wedges

Tax (wedge) | SCE | SPP
Tax (wedge) on equipment capital τe | 37.10% | 39.54%
Tax (wedge) on structure capital τs | 42.20% | 0.00%
Tax (wedge) on unskilled labor τy(u) | 27.00% | 26.60%
Tax (wedge) on skilled labor τy(s) | 27.00% | -11.14%

This table compares the tax rates in the stationary competitive equilibrium (column SCE) and wedges at the steady state of

the solution to the social planning problem (column SPP).

capital stocks from the SCE. Following Conesa, Kitao, and Krueger (2009), we assume that

the planner needs to finance the same level of government consumption as in the SCE.

Steady-State Comparison. We first discuss the properties of the optimal tax system

in steady state and compare it to the current U.S. tax system. The first column of Table 3

summarizes the current U.S. tax system. The second column reports its counterpart in the

optimal tax system at the steady state. The first two rows of Table 3 report capital income

taxes net of depreciation.14 We find that the equipment capital tax τ e is substantial at the

steady state of the solution to the SPP. It is 39.54% – that is, 39.54 percentage points higher

than the tax on structure capital τ s, which is zero. This is in contrast with the current

effective tax rates in the United States where structure capital is taxed by 5.1 percentage

points more than equipment capital overall. As for the labor wedges, they are 27% for both

types of labor in the SCE because we approximate the U.S. labor income tax code by a 27%

linear tax. At the steady state of the solution to the SPP, the labor wedge for unskilled labor

τ y(u) is 26.6%, which is almost the same as in the SCE. The skilled labor wedge τ y(s), on

the other hand, is -11.14%. Both higher taxes on equipment capital and marginal subsidies

on skilled labor are in line with our theoretical results from Section 3.

14 We report capital income taxes net of depreciation rather than the capital wedges defined in Section 3.2 because the former correspond to the taxes used in the U.S. tax code. With a slight abuse of notation, we denote these capital income taxes as we denoted the capital wedges. In the column denoted “SPP,” the capital income taxes are recovered from the constrained efficient allocation by using the following definition for each type h ∈ H, capital type i, and period t: $\tau_{i,t+1}(h) \equiv 1 - \left(\frac{u'(c_{h,t})}{\beta u'(c_{h,t+1})} - 1\right) / \left(F_{i,t+1} - \delta_i\right)$. Part 1 of Proposition 2 implies that these taxes only depend on time and not on agent type; therefore, we only report one number (time series).
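The definition in the footnote above can be turned into a small calculation. The sketch below assumes a steady state, where consumption is constant so that $u'(c_{h,t})/(\beta u'(c_{h,t+1})) = 1/\beta$, and uses the benchmark discount factor and the steady-state taxes reported in Table 3. The helper `steady_state_net_return` is ours and simply inverts the definition.

```python
def capital_income_tax(mu_today, mu_next, beta, marginal_product, depreciation):
    """Tax net of depreciation: 1 - (u'(c_t)/(beta u'(c_{t+1})) - 1) / (F_i - delta_i)."""
    required_net_return = mu_today / (beta * mu_next) - 1
    return 1 - required_net_return / (marginal_product - depreciation)


def steady_state_net_return(tax, beta):
    """Net marginal product F_i - delta_i consistent with a given steady-state tax."""
    return (1 / beta - 1) / (1 - tax)


BETA = 0.985
DELTA_S, DELTA_E = 0.056, 0.124   # depreciation rates from Table 1

net_return_structures = steady_state_net_return(0.0, BETA)      # zero tax on structures
net_return_equipment = steady_state_net_return(0.3954, BETA)    # 39.54% tax on equipment

# Round trip: plugging the implied marginal products back in recovers the taxes.
tax_s = capital_income_tax(1.0, 1.0, BETA, net_return_structures + DELTA_S, DELTA_S)
tax_e = capital_income_tax(1.0, 1.0, BETA, net_return_equipment + DELTA_E, DELTA_E)
```

Under these assumptions a positive equipment tax corresponds to a higher pre-tax net return on equipment than on structures, which is the return wedge discussed in the theoretical sections.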

Table 4: Steady-State Comparison of Allocations

Ratio | SCE | SPP
Ke/Ks | 1.02 | 0.93
Ls/Lu | 0.82 | 1.11
ws/wu | 1.80 | 1.47

This table compares allocations in the stationary competitive equilibrium (column SCE) and at the steady state of the solution

to the social planning problem (column SPP). Ke/Ks denotes the equipment-to-structure capital ratio, Ls/Lu denotes the

skilled-to-unskilled labor ratio and ws/wu denotes the skill premium.

The higher taxes on equipment capital relative to structure capital, together with marginal

subsidies on skilled labor, are used to indirectly redistribute from the skilled to the unskilled.

Table 4 shows how the optimal tax system achieves indirect redistribution by comparing the

allocations at the SCE and the SPP. The higher tax on equipment capital discourages the

accumulation of equipment capital relative to structure capital at the SPP in comparison to

the SCE. At the same time, the marginal subsidy on skilled labor income increases the ratio

of skilled to unskilled labor. Both capital and labor taxes decrease the skill premium at the

SPP. This way, the planner provides indirect redistribution from the skilled to the unskilled.

The marginal subsidy on skilled labor income seems to imply that there is direct redis-

tribution from the unskilled to the skilled at the SPP. However, recall that optimal taxes

are nonlinear in labor income. In this case, at a given income level, the average income tax

can be quite different from the marginal income tax.15 As a consequence, a tax system can

be progressive overall even though the marginal taxes are regressive. This is precisely what

happens at the optimal tax system. To assess the overall progressivity of the optimal tax

system, we compute a measure of average labor taxes that an agent has to pay at the steady

state of the SPP. This measure is defined as 1 − ch/(whlh) for agents of type h, following

Farhi and Werning (2013). We find that the optimal average labor taxes are progressive:

6% for the unskilled and 18% for the skilled. Therefore, the optimal labor taxes do provide

15 Suppose, e.g., that the tax formula for an agent with income $200,000 is T(y) = $100,000 − 0.1 · y. This agent pays $80,000 in taxes, implying an average tax of 40%, even though he gets a marginal subsidy of 10%.

direct redistribution from the skilled to the unskilled.16
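The distinction between average and marginal taxes can be reproduced with the tax formula from the footnote example, together with the average labor tax measure $1 - c_h/(w_h l_h)$ used in the text. The sketch below is illustrative only; the consumption and labor income passed to `average_labor_tax` are hypothetical numbers chosen to produce a 6% rate.

```python
def tax_schedule(income):
    """Tax formula from the footnote example: T(y) = 100,000 - 0.1 * y."""
    return 100_000 - 0.1 * income


def average_tax_rate(income):
    return tax_schedule(income) / income


def marginal_tax_rate(income, step=1.0):
    """Slope of the tax schedule, computed by a central difference."""
    return (tax_schedule(income + step) - tax_schedule(income - step)) / (2 * step)


def average_labor_tax(consumption, wage, labor):
    """Average labor tax measure 1 - c / (w * l)."""
    return 1 - consumption / (wage * labor)


income = 200_000
taxes_paid = tax_schedule(income)
average_rate = average_tax_rate(income)
marginal_rate = marginal_tax_rate(income)
example_average_labor_tax = average_labor_tax(consumption=0.94, wage=1.0, labor=1.0)
```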

Transition. This section summarizes the evolution of the optimal taxes (wedges) along

the transition to the new steady state. The left panel of Figure 1 shows that the optimal

structure capital income tax (net of depreciation) is 0 and the equipment capital tax is

positive in all periods. These properties are in line with Proposition 2. The equipment

capital tax is growing over time. To understand this finding, we look at the evolution of the

stocks of the two types of capital, shown in the left panel of Figure 2. It shows that both

capital stocks are growing along the transition path. The overall capital stock is growing in

the constrained efficient allocation because the planner inherits an inefficiently low level of

capital from the SCE, which is due to the inefficiently high overall level of capital taxes at

the SCE. As the quantity of equipment capital grows, so does the skill premium (see Figure

3). The government wants to prevent an unfettered growth of the skill premium because of

its adverse redistributive effects. To keep the growth of the skill premium under control, the

planner finds it optimal to increase the tax on equipment capital.17

Optimal labor wedges are almost constant along the transition, as shown in the right

panel of Figure 1. In fact, Werning (2007) shows that with our utility function labor wedges

are exactly constant over time in a permanent-type model without equipment-skill com-

plementarity. Figure 1 suggests that the extra distortions in labor wedges arising from

equipment-skill complementarity are also approximately constant over time. Since skilled

labor is subsidized, skilled agents work more than unskilled agents in each period, as shown

in the right panel of Figure 2. As the economy grows, both types of agents become richer,

and because of the income effect, they decrease their labor supply even though labor wedges

16 The non-linear nature of the optimal labor income tax code also explains how the government budget is balanced under the optimal tax system. Table 3 seems to suggest that - except for a small increase in equipment capital taxes - government revenue from all other sources declines significantly when the economy moves from the current system to the optimal one. However, with a non-linear tax system the total amount of labor income taxes collected can increase even if the marginal taxes decline.

17 We check the validity of this intuition by conducting exercises in which the planner inherits inefficiently high amounts of capital from the SCE. In those cases, as our intuition suggests, the planner decreases both types of capital over the transition to the new steady state, and optimal equipment taxes decline over the transition period.

Figure 1: Optimal Taxes/Wedges at the Constrained Efficient Allocation

[Figure 1. Left panel: structure capital tax and equipment capital tax. Right panel: unskilled labor wedge and skilled labor wedge. Horizontal axis in both panels: years (0 to 50).]

This figure shows the paths of optimal taxes (wedges) along the transition to the new steady state at the solution to the social

planning problem.

do not change much.

Figure 3 depicts the evolution of the optimal skill premium over time. First, the optimal

skill premium is much lower in each period than it is in the U.S. data. This result suggests

that the current U.S. tax system does not generate enough indirect redistribution. Second,

the skill premium is increasing over time because the equipment capital level increases. This

result implies that an increasing skill premium is optimal in a growing economy, even if the

government cares about equality.

Welfare Gains of Optimal Differential Taxation of Capital. We assess the im-

portance of optimal differential taxation of capital by answering the following question: how

much of the welfare gains of the full reform (which we call optimal DTC in this section)

do we give up if we restrict the government to use the current capital taxes and allow it

only to choose the labor income tax code optimally? To answer this question, we solve an

additional version of the planning problem. In this problem, the planner is unrestricted in

his choice of labor taxes, but he must use the capital income taxes as in the U.S. tax code.

We call this tax system current differential taxation of capital (current DTC). We state the

planning problem that gives rise to the current DTC in Appendix D. We find that, for the

benchmark parameters, reforming the current tax system to the optimal DTC implies 0.19%

Figure 2: Factors of Production at the Constrained Efficient Allocation

[Figure 2. Left panel: structure capital and equipment capital. Right panel: unskilled labor and skilled labor. Horizontal axis in both panels: years (0 to 50).]

This figure shows the paths of factors of production along the transition to the new steady state at the solution to the social

planning problem.

Figure 3: Skill Premium at the Constrained Efficient Allocation

[Figure 3. Single panel: skill premium. Horizontal axis: years (0 to 50).]

This figure shows the path of the skill premium ws/wu along the transition to the new steady state at the solution to the social

planning problem.

more welfare gains than reforming labor taxes alone (i.e., moving to the current DTC).18 The

additional gains of optimal DTC can be as high as 0.40% for reasonable parameter values,

as we discuss in more detail in the sensitivity analysis below.
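The welfare measure used in these comparisons is a consumption equivalent: the fraction by which consumption in one allocation must be scaled up in every period to match the welfare of another. A minimal sketch follows for a single agent with deterministic paths, using the benchmark functional forms and parameters from Section 5.1 with σ ≠ 1. The paths themselves are hypothetical, and the paper's actual calculation aggregates over agent types, which this sketch does not attempt.

```python
def discounted(values, beta):
    """Discounted sum with weight beta^(t-1) on period t = 1, 2, ..."""
    return sum(beta ** t * x for t, x in enumerate(values))


def consumption_utility(c_path, beta, sigma):
    return discounted([c ** (1 - sigma) / (1 - sigma) for c in c_path], beta)


def labor_disutility(l_path, beta, phi, gamma):
    return discounted([phi * l ** (1 + gamma) / (1 + gamma) for l in l_path], beta)


def welfare(c_path, l_path, beta=0.985, sigma=2.0, phi=67.8, gamma=1.0):
    return consumption_utility(c_path, beta, sigma) - labor_disutility(l_path, beta, phi, gamma)


def consumption_equivalent_gain(x, y, beta=0.985, sigma=2.0, phi=67.8, gamma=1.0):
    """Fraction g such that scaling y's consumption by (1 + g) yields x's welfare.

    Uses the closed form available with separable CRRA utility (sigma != 1):
    (1 + g)^(1 - sigma) * U_c(y) - V_l(y) = W(x).
    """
    c_y, l_y = y
    target = welfare(*x, beta, sigma, phi, gamma)
    u_y = consumption_utility(c_y, beta, sigma)
    v_y = labor_disutility(l_y, beta, phi, gamma)
    return ((target + v_y) / u_y) ** (1 / (1 - sigma)) - 1


# Hypothetical paths: x has 1% more consumption than y in every period, same labor.
y = ([1.0] * 50, [0.3] * 50)
x = ([1.01] * 50, [0.3] * 50)
gain = consumption_equivalent_gain(x, y)
```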

In addition, we solve a version of the social planning problem, in which the planner is

unrestricted in his choice of labor taxes, but is not allowed to tax the two types of capital

differentially. We call this tax system the optimal nondifferential taxation of capital (optimal

NDTC). We state the planning problem that gives rise to the optimal NDTC in Appendix

D. We find that the welfare gains of the current DTC fall 0.14% short of the welfare gains

of the optimal NDTC for the benchmark parameters. This difference in welfare gains can be

as high as 0.27% for reasonable parameter values.19

We also assess how people rank the different capital tax reforms. We find that relative

to the current DTC the optimal DTC helps both types. The reason is that the overall level

of capital taxes at the current DTC is inefficiently high. Under the optimal DTC, structure

capital taxes are zero while equipment capital taxes are virtually unchanged. As a result,

there is more capital of both types at the optimal DTC. This increases the productivity of

both types of agents, and they both benefit from this reform. We also find that relative to

the current DTC, the optimal NDTC helps the skilled and hurts the unskilled.

Sensitivity Analysis. In each sensitivity exercise, we change the parameter of interest

and redo the calibration procedure. Table 5 summarizes the sensitivity results. We report

optimal taxes only for the optimal DTC reform. With a higher σ, the curvature of utility

from consumption, the planner wants to provide more redistribution. Therefore, the indirect

redistribution channel becomes more important. Hence, as σ increases, the tax on equipment

capital as well as the marginal subsidy to skilled labor increase. Table 5 also reports the

18 We measure the welfare gains of allocation x relative to allocation y as the fraction by which consumption in allocation y has to be increased in each date and state to make its welfare equal to the welfare of allocation x.

19 These results suggest that setting capital tax rates to a uniform rate, as proposed recently by President Obama’s administration, might imply substantial welfare gains. However, our results here are only suggestive, since that proposal only involves reforming capital taxes, but would leave labor taxes intact. We evaluate the consequences of such a proposal in a world with multiple layers of heterogeneity across agents in Slavik and Yazici (2014).

Table 5: Sensitivity Analysis

 | | Benchmark | | | Benchmark |
σ | 1 | 2 | 4 | 2 | 2 | 2
γ | 1 | 1 | 1 | 0.5 | 1 | 2

Optimal taxes
τe | 24.39% | 39.54% | 54.84% | 49.23% | 39.54% | 22.88%
τs | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00%
τy(u) | 18.77% | 26.60% | 33.75% | 26.43% | 26.60% | 24.90%
τy(s) | -5.29% | -11.14% | -21.23% | -16.46% | -11.14% | -5.08%

Difference in welfare gains
Opt. DTC vs. current DTC | 0.24% | 0.19% | 0.21% | 0.20% | 0.19% | 0.22%
Opt. NDTC vs. current DTC | 0.23% | 0.14% | 0.07% | 0.10% | 0.14% | 0.21%

This table reports the results of the sensitivity analysis. In the first column, σ is the coefficient of relative risk aversion, γ is

the inverse Frisch elasticity of labor supply, τe is the equipment capital tax (wedge), τs is the structure capital tax (wedge),

τy(u) is the labor wedge of the unskilled agents, τy(s) is the labor wedge of the skilled agents. For all taxes (wedges) we report

their steady state values. Opt. DTC refers to optimal differential taxation of capital, i.e., a reform that reforms both labor and

capital taxes. Current DTC refers to current differential taxation of capital, i.e., a tax reform that reforms labor taxes, but

leaves capital taxes at their current values. Opt. NDTC refers to optimal nondifferential taxation of capital, i.e., a reform in

which the planner is free to adjust labor taxes, but must set the tax rate on equipment capital equal to the tax rate on structure

capital.

sensitivity of our results to changes in γ, the curvature of disutility from labor. We find that,

as γ decreases, the tax on equipment capital and the skilled labor subsidy increase.

As we report in the penultimate row of Table 5, the welfare gains of the optimal DTC

reform are around 0.20% higher than the gains of the current DTC reform for all values of

σ and γ we consider.20 The welfare gains of optimal NDTC relative to current DTC are

decreasing in σ and increasing in γ, as shown in the last row of Table 5. The reason is that

with a larger σ or lower γ, the optimal capital tax differential is larger, as one can see in the

rows denoted by τ e and τ s in Table 5. Therefore, optimal NDTC, which forces capital taxes

to be uniform, is more restrictive and implies smaller welfare gains for higher σ or lower γ.

We have also solved the model for the other values of (σ, γ) ∈ {1, 2, 4} × {0.5, 1, 2}. We

find that the welfare gains of optimal DTC relative to current DTC are as high as 0.28% for

20We also compute the welfare gains of optimal DTC under alternative social welfare weights. We find that if the planner cares more about the unskilled, the welfare gains of optimal DTC are larger. This is intuitive: not being able to use one of the channels of indirect redistribution optimally has more severe welfare consequences when society cares more about redistribution.

σ = 4 and γ = 0.5. We also consider a higher elasticity of substitution between equipment

capital and unskilled labor, namely, η = 0.79. This value has been used in He and Liu

(2008), who use the same production function, and is based on an empirical estimate by

Duffy, Papageorgiou, and Perez-Sebastian (2004). For this value of η and σ = 4 and γ = 0.5,

the welfare gains of optimal DTC relative to current DTC are 0.40%.
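The link between the size of the optimal tax differential and the cost of forcing uniform capital taxes can be read directly off Table 5. The short sketch below only re-tabulates the steady-state numbers reported there; the variable and function names are illustrative and not part of the paper. It checks, separately for the σ sweep and the γ sweep, that a larger optimal differential goes together with a smaller welfare gain of optimal NDTC relative to current DTC.

```python
# Steady-state values copied from Table 5 (in percent).
# Each entry: (parameter value, optimal tau_e, gain of Opt. NDTC vs. current DTC).
SIGMA_SWEEP = [(1, 24.39, 0.23), (2, 39.54, 0.14), (4, 54.84, 0.07)]      # gamma = 1
GAMMA_SWEEP = [(0.5, 49.23, 0.10), (1, 39.54, 0.14), (2, 22.88, 0.21)]    # sigma = 2
TAU_S = 0.0  # optimal structure capital tax is zero in every column of Table 5


def differential(entry):
    """Optimal equipment-structure tax differential in percentage points."""
    _, tau_e, _ = entry
    return tau_e - TAU_S


def gains_fall_with_differential(sweep):
    """True if NDTC gains strictly decrease as the optimal differential rises."""
    ordered = sorted(sweep, key=differential)
    gains = [gain for _, _, gain in ordered]
    return all(a > b for a, b in zip(gains, gains[1:]))


if __name__ == "__main__":
    for name, sweep in (("sigma", SIGMA_SWEEP), ("gamma", GAMMA_SWEEP)):
        for value, _, gain in sorted(sweep, key=differential):
            entry = (value, _, gain)
            print(f"{name}={value}: differential={differential(entry):.2f} pp, "
                  f"NDTC gain={gain:.2f}%")
        print(name, "sweep monotone:", gains_fall_with_differential(sweep))
```

The two sweeps are checked separately because the table varies one parameter at a time; pooling all columns would mix changes in σ with changes in γ.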

6 Conclusion

The effective marginal tax rates on returns to capital assets differ substantially depending

on the capital asset type in the U.S. tax code. In particular, the marginal tax rate on capital

structures is about 5 percentage points higher than the marginal tax rate on capital equipments. In this

paper, we assess the optimality of differential capital asset taxation both theoretically and

quantitatively from the perspective of a government whose aim is to provide redistribution

and insurance. Contrary to the actual practice in the U.S. tax code, we show that, under a

plausible assumption, it is optimal to tax equipment capital at a higher rate than structure

capital. Intuitively, in an environment with equipment-skill complementarity, taxing equip-

ment capital and hence depressing its accumulation decreases the skill premium, providing

indirect redistribution from the skilled to the unskilled agents. In a quantitative version of

our model, we find that the tax rate on equipment capital should be at least 27 percentage

points higher than the tax rate on structure capital during transition and at the steady state.

Furthermore, we find that the welfare gains of optimal differential capital taxation can be

as high as 0.4% of lifetime consumption.

7 Acknowledgements

We thank the associate editor, Christopher Sleet, an anonymous referee, as well as Laurence Ales, Alex Bick, Felix Bierbrauer, V. V. Chari, Nicola Fuchs-Schundeln, Kenichi Fukushima, Mike Golosov, Christian Hellwig, Larry Jones, Juan Pablo Nicolini, Fabrizio Perri, Dominik Sachs, Florian Scheuer, Yuichiro Waki, and participants at various conferences, workshops and seminars for their comments. Yazici greatly appreciates IIES (Stockholm University) for their hospitality during his visit.

References

Albanesi, S., and C. Sleet (2006): “Dynamic Optimal Taxation with Private Information,” Review of Economic Studies, 73(1), 1–30.

Ales, L., M. Kurnaz, and C. Sleet (2014): “Tasks, Talents, and Taxes,” Working paper.

Atkinson, A. B., and J. E. Stiglitz (1976): “The Design of Tax Structure: Direct versus Indirect Taxation,” Journal of Public Economics, 6(1–2), 55–75.

Auerbach, A. J. (1979): “The Optimal Taxation of Heterogeneous Capital,” Quarterly Journal of Economics, 93(4), 589–612.

Auerbach, A. J. (1983): “Corporate Taxation in the United States,” Brookings Papers on Economic Activity, 14(2), 451–514.

Auerbach, A. J., K. A. Hassett, and S. D. Oliner (1994): “Reassessing the Social Returns to Equipment Investment,” Quarterly Journal of Economics, 109(3), 789–802.

Chamley, C. (1986): “Optimal Taxation of Capital Income in General Equilibrium with Infinite Lives,” Econometrica, 54(3), 607–622.

Conesa, J. C., S. Kitao, and D. Krueger (2009): “Taxing Capital? Not a Bad Idea after All!,” American Economic Review, 99(1), 25–48.

Diamond, P. A., and J. A. Mirrlees (1971): “Optimal Taxation and Public Production I: Production Efficiency,” American Economic Review, 61(1), 8–27.

Domeij, D., and J. Heathcote (2004): “On the Distributional Effects of Reducing Capital Taxes,” International Economic Review, 45(2), 523–554.

Duffy, J., C. Papageorgiou, and F. Perez-Sebastian (2004): “Capital-Skill Complementarity? Evidence from a Panel of Countries,” Review of Economics and Statistics, 86(1), 327–344.

Farhi, E., and I. Werning (2012): “Capital Taxation: Quantitative Explorations of the Inverse Euler Equation,” Journal of Political Economy, 120(3), 398–445.

Farhi, E., and I. Werning (2013): “Insurance and Taxation over the Life Cycle,” Review of Economic Studies, 80(2), 596–635.

Feldstein, M. (1990): “The Second Best Theory of Differential Capital Taxation,” Oxford Economic Papers, 42(1), 256–267.

Fernandes, A., and C. Phelan (2000): “A Recursive Formulation for Repeated Agency with History Dependence,” Journal of Economic Theory, 91(2), 223–247.

Flug, K., and Z. Hercowitz (2000): “Equipment Investment and the Relative Demand for Skilled Labor: International Evidence,” Review of Economic Dynamics, 3, 461–485.

Fukushima, K. (2010): “Quantifying the Welfare Gains From Flexible Dynamic Income Tax Systems,” Working paper.

Golosov, M., N. Kocherlakota, and A. Tsyvinski (2003): “Optimal Indirect and Capital Taxation,” Review of Economic Studies, 70(3), 569–587.

Golosov, M., M. Troshkin, and A. Tsyvinski (2013): “Redistribution and Social Insurance,” Working Paper 17642.

Gravelle, J. G. (1994): The Economic Effects of Taxing Capital Income. Cambridge MA: MIT Press.

Gravelle, J. G. (2003): “Capital Income Tax Revisions and Effective Tax Rates,” CRS Report for Congress 17642, Congressional Budget Office.

Gravelle, J. G. (2011): “Reducing Depreciation Allowances to Finance a Lower Corporate Tax Rate,” National Tax Journal, 64(4), 1039–1054.

Green, E. (1987): “Lending and the Smoothing of Uninsurable Income,” in Contractual Arrangements for Intertemporal Trade, ed. by E. C. Prescott, and N. Wallace. Minneapolis: University of Minnesota Press.

Greenwood, J., Z. Hercowitz, and P. Krusell (1997): “Long-Run Implications of Investment-Specific Technological Change,” American Economic Review, 87(3), 342–362.

He, H., and Z. Liu (2008): “Investment-Specific Technological Change, Skill Accumulation, and Wage Inequality,” Review of Economic Dynamics, 11(2), 314–334.

Heathcote, J., F. Perri, and G. L. Violante (2010): “Unequal We Stand: An Empirical Analysis of Economic Inequality in the United States: 1967–2006,” Review of Economic Dynamics, 13(1), 15–51.

Heathcote, J., K. Storesletten, and G. L. Violante (2010): “The Macroeconomic Implications of Rising Wage Inequality in the United States,” Journal of Political Economy, 118(4), 681–722.

Huggett, M., and J. C. Parra (2010): “How Well Does the U.S. Social Insurance System Provide Social Insurance?,” Journal of Political Economy, 118(1), 76–112.

Huggett, M., G. Ventura, and A. Yaron (2011): “Sources of Lifetime Inequality,” American Economic Review, 101(7), 2923–2954.

Judd, K. L. (1985): “Redistributive Taxation in a Simple Perfect Foresight Model,” Journal of Public Economics, 28(1), 59–83.

Keane, M. P., and K. I. Wolpin (1997): “The Career Decisions of Young Men,” Journal of Political Economy, 105(3), 473–522.

Kocherlakota, N. R. (2005): “Zero Expected Wealth Taxes: A Mirrlees Approach to Dynamic Optimal Taxation,” Econometrica, 73(5), 1587–1621.

Kocherlakota, N. R. (2010): The New Dynamic Public Finance. Princeton University Press.

Krusell, P., L. E. Ohanian, J.-V. Ríos-Rull, and G. L. Violante (2000): “Capital-Skill Complementarity and Inequality: A Macroeconomic Analysis,” Econometrica, 68(5), 1029–1054.

Mirrlees, J. A. (1971): “An Exploration in the Theory of Optimum Income Taxation,” Review of Economic Studies, 38(114), 175–208.

Naito, H. (1999): “Re-examination of Uniform Commodity Taxes under a Non-linear Income Tax System and its Implication for Production Efficiency,” Journal of Public Economics, 71(2), 165–188.

Rogerson, W. P. (1985): “Repeated Moral Hazard,” Econometrica, 53(1), 69–76.

Sadka, E. (1976): “On Income Distribution, Incentive Effects and Optimal Income Taxation,” Review of Economic Studies, 43(2), 261–267.

Samuelson, P. A. (1964): “Tax Deductibility of Economic Depreciation to Insure Invariant Valuations,” Journal of Political Economy, 72(6), 604–606.

Seade, J. K. (1977): “On the Shape of Optimal Tax Schedules,” Journal of Public Economics, 7(2), 203–235.

Slavik, C., and H. Yazici (2014): “On the Consequences of Eliminating Capital Tax Differentials,” Working paper.

Stiglitz, J. E. (1982): “Self-Selection and Pareto Efficient Taxation,” Journal of Public Economics, 17(2), 213–240.

Storesletten, K., C. I. Telmer, and A. Yaron (2004): “Consumption and Risk Sharing over the Life Cycle,” Journal of Monetary Economics, 51(3), 609–633.

Werning, I. (2007): “Optimal Fiscal Policy with Redistribution,” Quarterly Journal of Economics, 122(3), 925–967.

Appendix

A Differential Taxation of Capital in the U.S. Tax Code

In this section, we first explain how the current U.S. tax system taxes returns to capital

assets differentially. Then we provide a brief summary of the historical evolution of capital

tax differentials.

According to the current U.S. corporate tax code, all capital income net of depreciation

is taxed at the statutory rate of 35%. However, effective marginal taxes on net returns

to capital might differ from the statutory rate if tax depreciation allowances differ from

actual economic depreciation. To see this point, let $\rho_i$ be the return to capital type i, $\delta_i$

be its economic depreciation rate, and $\bar{\delta}_i$ be the depreciation rate allowed by the tax code.

Suppose for simplicity that at the end of a period, the firm liquidates and there is no further

production. Letting τ be the statutory tax rate, the effective tax rate, call it $\tau_i$, is given by

$$\tau_i = \frac{(\rho_i - \bar{\delta}_i)\,\tau}{\rho_i - \delta_i}.$$

As first pointed out by Samuelson (1964), if $\bar{\delta}_i = \delta_i$, then $\tau_i = \tau$. In words, when tax

depreciation equals economic depreciation, then the effective tax on the return to capital i

equals the statutory rate. If this is true for all types of capital, there are no tax differentials.

If, instead, for capital of type i the tax depreciation allowance is higher (lower) than the

actual depreciation rate, then its return is effectively taxed at a lower (higher) rate (i.e.,

τ i < (>)τ). As argued by Gravelle (2003), inter alia, such discrepancies between tax depre-

ciation allowances and economic depreciation rates across capital assets are the main cause

of differential capital taxation in the United States.21
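As a small numerical illustration of the one-period formula above, the sketch below computes the effective tax rate for given returns and depreciation rates. The function name and the example numbers are hypothetical and chosen only to show the three cases (allowance equal to, above, and below economic depreciation); they are not taken from the data.

```python
def effective_tax_rate(rho, delta_econ, delta_tax, tau):
    """Effective tax rate on the net return to a capital asset.

    rho        : gross return to the asset
    delta_econ : economic depreciation rate
    delta_tax  : depreciation rate allowed by the tax code
    tau        : statutory tax rate
    """
    net_return = rho - delta_econ
    if net_return <= 0:
        raise ValueError("net return must be positive for the rate to be defined")
    # The tax base is the return net of the *tax* allowance, while the effective
    # rate is measured against the return net of *economic* depreciation.
    return (rho - delta_tax) * tau / net_return


if __name__ == "__main__":
    tau = 0.35
    print(effective_tax_rate(0.15, 0.05, 0.05, tau))  # allowance = economic depreciation
    print(effective_tax_rate(0.15, 0.05, 0.08, tau))  # accelerated allowance
    print(effective_tax_rate(0.15, 0.05, 0.03, tau))  # allowance below economic depreciation
```

With the allowance equal to economic depreciation the function returns the statutory rate; a more generous allowance lowers the effective rate and a less generous one raises it.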

21Of course, in reality many firms continue operating for many periods and hence deduct the whole cost of a capital investment over time via depreciation deductions. The differences in effective tax rates are created by tax depreciation rules that make firms depreciate their capital assets either faster or slower than their economic depreciation over time. As long as the real interest rate is positive, tax depreciation rules that allow firms to deduct depreciation faster relative to actual economic depreciation decrease the effective tax

Gravelle (2003) concisely summarizes the historical evolution of tax differentials.22 She

reports that before the 1986 Tax Reform Act, the statutory corporate income tax rate was

46%. The effective tax rates on returns to most equipment assets were less than 10%, whereas

they were around 35% to 40% for buildings. With the 1986 tax reform, the statutory rate

was reduced to 35%, and the depreciation rules were altered to reduce the tax differentials

between equipment capital and structure capital. These policy changes resulted in effective

equipment capital tax rates that were about 32% and structure capital tax rates that were

still about 35% to 40%. As documented by Gravelle (2011), in the current U.S. corporate

tax code (which went through a minor reform in 1993), equipments are taxed at 26% on

average and structures are taxed at 32%. To sum up, capital equipments have historically

been favored relative to capital structures, but the difference in effective tax rates has been

declining over time.

B Proofs

B.1 Proof of Proposition 2

We denote the multipliers on the period t feasibility constraint and the skilled agents’ incentive constraint by

$\lambda_t\beta^{t-1}$ and µ, respectively. The first-order optimality conditions with respect to $c_{s,t}$ and $c_{u,t}$

are given by, for t ≥ 1,

$$(c_{s,t}):\quad (\pi_s + \mu^*)\,u'(c^*_{s,t}) = \lambda^*_t \pi_s, \tag{11}$$

$$(c_{u,t}):\quad (\pi_u - \mu^*)\,u'(c^*_{u,t}) = \lambda^*_t \pi_u, \tag{12}$$

rate on capital. Inflation also affects the real value of depreciation deductions because these deductions are based on the historical acquisition costs. For a detailed description of how effective tax rates are calculated, see Gravelle (1994).

22For a more detailed description of how tax differentials between capital equipments and structures have changed between 1950 and 1983, see Auerbach (1983).

which imply for all h ∈ H and t ≥ 2,

$$\frac{\lambda^*_{t-1}}{\lambda^*_t} = \frac{u'(c^*_{h,t-1})}{u'(c^*_{h,t})}. \tag{13}$$

The first-order optimality conditions with respect to the two types of capital for t ≥ 2 are:

$$(K_{e,t}):\quad \lambda^*_{t-1} = \beta\left[\lambda^*_t F_2(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t}) + X^*_t\right],$$

$$(K_{s,t}):\quad \lambda^*_{t-1} = \beta\lambda^*_t F_1(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t}),$$

where

$$X^*_t = \mu^*\, v'\!\left(\frac{l^*_{u,t}\,w^*_{u,t}}{w^*_{s,t}}\right) l^*_{u,t}\,\frac{\partial\,(w^*_{u,t}/w^*_{s,t})}{\partial K^*_{e,t}}.$$

Combining the first-order conditions with respect to capital with (13), we get for both h ∈ H and all t ≥ 2,

$$u'(c^*_{h,t-1}) = \beta F^*_{1,t}\,u'(c^*_{h,t}), \tag{14}$$

$$u'(c^*_{h,t-1}) = \beta F^*_{2,t}\,u'(c^*_{h,t})\left(1 + \frac{X^*_t}{\lambda^*_t F^*_{2,t}}\right). \tag{15}$$

Part 1. Equation (14) together with the definition of intertemporal wedges proves that the structure capital wedge is zero in all periods for all h ∈ H. Since $\mu^* > 0$ and, by Assumption 2, $\partial(w^*_{u,t}/w^*_{s,t})/\partial K^*_{e,t} < 0$, we have $X^*_t < 0$. Using the definition of the equipment capital wedge and equation (15), we have for all h ∈ H and t ≥ 2,

$$\tau^*_{e,t}(h) = -\frac{X^*_t}{\lambda^*_t F^*_{2,t}} > 0.$$

This finishes the proof of part 1 of Proposition 2.

Part 2. Suppose a steady state of the constrained efficient allocation exists. Letting the allocation without any time subscripts denote this steady-state allocation, we have

$$X^* = \mu^*\, v'\!\left(\frac{l^*_u\,w^*_u}{w^*_s}\right) l^*_u\,\frac{\partial\,(w^*_u/w^*_s)}{\partial K^*_e} < 0.$$

Thus, we have

$$\tau^*_e = -\frac{X^*}{\lambda^* F^*_2} > 0. \qquad \square$$


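As before, we denote the multipliers on the period t feasibility constraint and the skilled agents’ incentive constraint by $\lambda_t\beta^{t-1}$ and µ, respectively. Consider the first-order optimality conditions with respect to $c_{s,t}$ and $l_{s,t}$:

$$u'(c^*_{s,t})\,A^*_t = \lambda^*_t \pi_s,$$

$$v'(l^*_{s,t})\left(A^*_t - \frac{B^*_t}{v'(l^*_{s,t})}\right) = \lambda^*_t w^*_{s,t}\pi_s,$$

where

$$A^*_t = (\pi_s + \mu^*),$$

$$B^*_t = \mu^*\, v'\!\left(\frac{l^*_{u,t}\,w^*_{u,t}}{w^*_{s,t}}\right) l^*_{u,t}\,\frac{\partial\,(w^*_{u,t}/w^*_{s,t})}{\partial L^*_{s,t}}\, z_s \pi_s.$$

By the first-order conditions above, we have $A^*_t > 0$ and $\left(A^*_t - \frac{B^*_t}{v'(l^*_{s,t})}\right) > 0$. By Assumption 3, we have $B^*_t > 0$. These imply

$$\tau^*_{y,t}(s) = 1 - \frac{A^*_t}{A^*_t - \frac{B^*_t}{v'(l^*_{s,t})}} < 0. \qquad \square$$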
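The last step of this proof can be spelled out as follows. The sketch assumes the standard definition of the labor wedge, $\tau_{y,t}(s) = 1 - v'(l_{s,t})/(w_{s,t}\,u'(c_{s,t}))$, which is stated earlier in the paper and not repeated in this appendix; everything else uses only the two first-order conditions above.

```latex
\begin{align*}
u'(c^*_{s,t}) &= \frac{\lambda^*_t \pi_s}{A^*_t},
\qquad
v'(l^*_{s,t}) = \frac{\lambda^*_t w^*_{s,t}\pi_s}{A^*_t - B^*_t/v'(l^*_{s,t})}, \\[4pt]
\frac{v'(l^*_{s,t})}{w^*_{s,t}\,u'(c^*_{s,t})}
  &= \frac{A^*_t}{A^*_t - B^*_t/v'(l^*_{s,t})} > 1
  \quad \text{because } B^*_t > 0 \text{ and } v' > 0, \\[4pt]
\tau^*_{y,t}(s) &= 1 - \frac{v'(l^*_{s,t})}{w^*_{s,t}\,u'(c^*_{s,t})} < 0 .
\end{align*}
```

The denominator is smaller than the numerator but still positive, so the ratio exceeds one and the skilled agents face a marginal labor subsidy.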


Part 1. Let $\lambda_t\beta^{t-1}$ and $\mu_t(h^t)$ be multipliers on the period t feasibility constraint and the incentive constraint at history $h^t$, respectively. Under Assumptions 1 and 2, the result follows from the first-order conditions with respect to the two capital types:

$$(K_{s,t}):\quad -\lambda^*_{t-1} + \beta\lambda^*_t F_1(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t}) = 0,$$

$$(K_{e,t}):\quad -\lambda^*_{t-1} + \beta\lambda^*_t F_2(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t}) + \beta\sum_{h^t \in H^t} \mu^*_t(h^t)\, v'\!\left(\frac{l^*_t(h^{t-1}, h^o_t)\,w^*_t(h^o_t)}{w^*_t(h_t)}\right) l^*_t(h^{t-1}, h^o_t)\,\frac{\partial\,[w^*_t(h^o_t)/w^*_t(h_t)]}{\partial K^*_{e,t}} = 0.$$

Part 2. Also, under Assumption 1, it follows directly from Golosov, Kocherlakota, and Tsyvinski (2003) that in any period t ≥ 1 and following any history $h^t$, the constrained efficient allocation satisfies the following inverse Euler equation:

$$\frac{1}{u'(c^*_t(h^t))} = \frac{1}{\beta F^*_{1,t+1}}\, E_t\left\{\frac{1}{u'(c^*_{t+1}(h^{t+1}))}\,\Big|\, h^t\right\}. \tag{16}$$

Part 2(a) of the proposition follows from the definition of intertemporal wedges in equation (9), equation (16) and the conditional version of Jensen’s inequality. Part 2(b) of the proposition follows from Part 1 and equation (9), which defines intertemporal wedges for both types of capital. □
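The Jensen’s inequality step can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses log utility, so that $u'(c) = 1/c$, and it assumes the intertemporal wedge takes the usual form $1 - u'(c_t)/(\beta R\, E_t[u'(c_{t+1})])$ with $R$ standing for $F^*_{1,t+1}$ (equation (9) itself is not reproduced in this appendix). Function and argument names are hypothetical.

```python
def structure_wedge(c_next, probs, beta, gross_return):
    """Intertemporal wedge on structures implied by the inverse Euler equation.

    Assumes log utility and the wedge definition
        1 - u'(c_t) / (beta * R * E[u'(c_{t+1})]).
    Current consumption is pinned down by the inverse Euler equation (16):
        1/u'(c_t) = E[1/u'(c_{t+1})] / (beta * R).
    """
    expected_c_next = sum(p * c for p, c in zip(probs, c_next))
    c_now = expected_c_next / (beta * gross_return)        # (16) with u'(c) = 1/c
    expected_mu_next = sum(p / c for p, c in zip(probs, c_next))
    return 1.0 - (1.0 / c_now) / (beta * gross_return * expected_mu_next)


if __name__ == "__main__":
    beta, gross_return = 0.96, 1.05
    # Risky consumption next period: the wedge is strictly positive.
    print(structure_wedge([0.8, 1.2], [0.5, 0.5], beta, gross_return))
    # No consumption risk: Jensen's inequality holds with equality, wedge is zero.
    print(structure_wedge([1.0, 1.0], [0.5, 0.5], beta, gross_return))
```

Under these assumptions the wedge reduces to $1 - 1/(E[c_{t+1}]\,E[1/c_{t+1}])$, which is positive whenever next-period consumption is uncertain.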

There is a modified inverse Euler equation for equipment capital. We derive it here for the sake of completeness. It follows from the first part of Proposition 4 and equation (16):

$$\frac{1}{u'(c^*_t(h^t))} = \frac{1}{\beta\left(F^*_{2,t+1} + X^*_{t+1}/\lambda^*_{t+1}\right)}\, E_t\left\{\frac{1}{u'(c^*_{t+1}(h^{t+1}))}\,\Big|\, h^t\right\}. \tag{17}$$


First, under Assumption 5, for t ≥ 2,

$$X^*_t = \sum_{h^{t-1} \in H^{t-1}} \mu^*_t(h^{t-1}, s)\, v'\!\left(\frac{l^*_t(h^{t-1}, u)\,w^*_{u,t}}{w^*_{s,t}}\right) l^*_t(h^{t-1}, u)\,\frac{\partial\,(w^*_{u,t}/w^*_{s,t})}{\partial K^*_{e,t}}.$$

Assumption 2 implies $\partial(w^*_{u,t}/w^*_{s,t})/\partial K^*_{e,t} < 0$, implying that, in any period t ≥ 2, $X^*_t < 0$. The first

part of the proposition then follows from the first part of Proposition 4 and $X^*_t < 0$, whereas

the second part follows from the second part of Proposition 4 and $X^*_t < 0$.
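The sign condition $\partial(w_u/w_s)/\partial K_e < 0$ can be checked for a concrete production function. The sketch below is a hedged illustration, not the paper’s calibration: it assumes a nested CES aggregate in the spirit of Krusell et al. (2000), $\nu L_u^{\eta} + (1-\nu)\,[\omega K_e^{\rho} + (1-\omega)L_s^{\rho}]^{\eta/\rho}$, with illustrative parameter values, and it ignores the constant productivity factors $z_s, z_u$ as well as the common factor multiplying both marginal products, since these cancel or do not affect the sign.

```python
def wage_ratio_u_over_s(k_e, l_s, l_u, eta=0.401, rho=-0.495, nu=0.5, omega=0.5):
    """Ratio of the marginal product of unskilled to skilled labor.

    Assumed aggregate (illustrative, not taken from this appendix):
        nu * L_u**eta + (1 - nu) * Q**(eta / rho),
        Q = omega * K_e**rho + (1 - omega) * L_s**rho.
    Factors common to both marginal products are dropped because they cancel.
    """
    q = omega * k_e ** rho + (1.0 - omega) * l_s ** rho
    mp_unskilled = nu * l_u ** (eta - 1.0)
    mp_skilled = ((1.0 - nu) * (1.0 - omega)
                  * q ** ((eta - rho) / rho) * l_s ** (rho - 1.0))
    return mp_unskilled / mp_skilled


if __name__ == "__main__":
    low = wage_ratio_u_over_s(k_e=1.0, l_s=1.0, l_u=1.0)
    high = wage_ratio_u_over_s(k_e=2.0, l_s=1.0, l_u=1.0)
    # With eta > rho (equipment-skill complementarity) more equipment lowers w_u / w_s.
    print(low, high, high < low)
```

In this assumed functional form the ratio falls with equipment capital exactly when η exceeds ρ, which is the equipment-skill complementarity case; reversing the inequality flips the sign.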


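For any t, let $h^t = (h^{t-1}, s)$ and for m ≤ t, let $h^m$ be the predecessor in period m. Consider the first-order optimality conditions with respect to $c_t(h^t)$ and $l_t(h^t)$:

$$u'(c^*_t(h^t))\,A^*_t(h^t) = \lambda^*_t \pi_t(h^t),$$

$$v'(l^*_t(h^t))\left(A^*_t(h^t) - \frac{B^*_t\,\pi_t(h^t)}{v'(l^*_t(h^t))}\right) = \lambda^*_t w^*_t(h_t)\,\pi_t(h^t),$$

where, letting χ be the indicator function,

$$A^*_t(h^t) = \pi_t(h^t) + \sum_{m=1}^{t} \beta^{t-m}\,\pi_t(h^t|h^m)\,\mu^*_m(h^m)\left[\chi_{\{s\}}(h_m) - \chi_{\{u\}}(h_m)\right],$$

$$B^*_t = \sum_{\{h^t \in H^t \,|\, h_t = s\}} \mu^*_t(h^t)\, v'\!\left(\frac{l^*_t(h^{t-1}, u)\,w^*_t(u)}{w^*_t(s)}\right) l^*_t(h^{t-1}, u)\,\frac{\partial\,[w^*_t(u)/w^*_t(s)]}{\partial L^*_{s,t}}\, z_s.$$

By the first-order conditions above, we have $A^*_t(h^t) > 0$ and $\left(A^*_t(h^t) - \frac{B^*_t\,\pi_t(h^t)}{v'(l^*_t(h^t))}\right) > 0$, and by Assumption 3, $B^*_t > 0$. These imply

$$\tau_{y,t}(h^t) = 1 - \frac{A^*_t(h^t)}{A^*_t(h^t) - \frac{B^*_t\,\pi_t(h^t)}{v'(l^*_t(h^t))}} < 0. \qquad \square$$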

C Implementation

In this section, we show how the constrained efficient allocation can be implemented in an

incomplete markets environment with taxes. We say that a tax system implements the

constrained efficient allocation in a market if the constrained efficient allocation arises as an

equilibrium of this market arrangement under the given tax system. The constrained efficient

allocation can be implemented in many different ways. We provide an implementation in

which the tax system mimics the actual U.S. tax code in the sense that capital tax differentials

are created at the firm level through depreciation allowances that differ from actual economic

depreciation. Therefore, creating the optimal capital tax differentials would not require

further complications to the U.S. tax code.

We begin by describing the market arrangement. We assume that markets are incomplete

in that agents can only trade non-contingent claims to future consumptions (i.e. they can

save and borrow at a net risk-free rate, rt). We denote an agent’s savings as at. There is

a representative firm that rents capital at a net interest rate rt and labor to produce the

output good.23 The wage rates for skilled and unskilled are ws,t and wu,t and are taken as

given by the firm.

Government and Taxes. There is a government that needs to finance {Gt}∞t=1, an

exogenously given sequence of government consumption. The government taxes consumers’

savings and labor income (in a nonlinear and history-dependent way). The government also

taxes the firms’ capital income net of depreciation. The statutory depreciation allowance

can differ from economic depreciation, as in the U.S. tax code.

Taxes on consumers are specified following Kocherlakota (2005). Let $\tau_{y,t} : \mathbb{R}^t_+ \to \mathbb{R}$

denote the labor tax schedule, where $\tau_{y,t}(y^t)$ is the labor income tax an agent with labor

income history $y^t = (y_1, ..., y_t)$ pays in period t. Labor income in history $h^t$ is defined as

$y_t(h^t) = w_t(h_t) \cdot l_t(h^t)$. Labor income $y_t$ and labor $l_t$ are one to one for a given level of

wages, which means it is sufficient to use either one of them when we define an allocation.

In the rest of this section, we use $y_t$. There are also linear taxes on people’s asset holdings,

which depend on their income history. Letting $\tau_{a,t} : \mathbb{R}^t_+ \to \mathbb{R}$ denote linear tax rates on

asset holdings, an agent with income history $y^t$ pays $\tau_{a,t}(y^t)(1 + r_t)a_t(h^{t-1})$, where $a_t(h^{t-1})$

denotes the agent’s asset holdings.24

23The assumption that the firm does not accumulate capital is innocuous and is made for convenience only. This setup is equivalent to one in which the firm, instead of the consumers, accumulates capital and makes the capital accumulation decisions.

24One can show that the government can implement the constrained efficient allocation by taxing only the

Unlike in Kocherlakota (2005), the government also has access to a sequence of linear taxes on the firm’s capital income, denoted by $(\tau_{f,t})_{t=1}^{\infty}$. The corporate tax code also includes statutory depreciation allowances, $(\delta_{s,t}, \delta_{e,t})_{t=1}^{\infty}$. The firm is allowed to deduct $\delta_{i,t}K_{i,t}$ from its tax base in period t.

Consumer’s Problem. Taking prices $(r_t, w_{s,t}, w_{u,t})_{t=1}^{\infty}$ and taxes $(\tau_{y,t}, \tau_{a,t})_{t=1}^{\infty}$ as given, a consumer solves

$$\max_{c,y,a}\ \sum_{t=1}^{\infty} \sum_{h^t \in H^t} \pi_t(h^t)\,\beta^{t-1}\left[u(c_t(h^t)) - v\!\left(\frac{y_t(h^t)}{w_t(h_t)}\right)\right] \quad \text{s.t.}$$

$$c_t(h^t) + a_{t+1}(h^t) \le y_t(h^t) - \tau_{y,t}(y^t(h^t)) + \left[1 - \tau_{a,t}(y^t(h^t))\right](1 + r_t)\,a_t(h^{t-1}),$$

$$a_1 = K^*_{s,1} + K^*_{e,1}, \quad c, y \text{ are nonnegative.}$$

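Firm’s Problem. Taking prices $(r_t, w_{s,t}, w_{u,t})_{t=1}^{\infty}$ and taxes $(\tau_{f,t}, \delta_{s,t}, \delta_{e,t})_{t=1}^{\infty}$ as given, the firm solves

$$\max_{K_{e,t}, K_{s,t}, L_{s,t}, L_{u,t}}\ F(K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t}) - (1 + r_t)(K_{s,t} + K_{e,t}) - \frac{w_{s,t}}{z_s}L_{s,t} - \frac{w_{u,t}}{z_u}L_{u,t} - \tau_{f,t}\Pi_{f,t},$$

where $\Pi_{f,t}$ is the firm’s capital income net of depreciation allowances in period t given by

$$\Pi_{f,t} = \left[F(K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t}) - \delta_{s,t}K_{s,t} - \delta_{e,t}K_{e,t} - \frac{w_{s,t}}{z_s}L_{s,t} - \frac{w_{u,t}}{z_u}L_{u,t}\right].$$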

Notice that, as in the U.S. corporate tax code, we assume a flat tax on the firm’s capital

income net of depreciation allowances.

Equilibrium. Given a tax system $(\tau_{y,t}, \tau_{a,t}, \tau_{f,t}, \delta_{s,t}, \delta_{e,t})_{t=1}^{\infty}$, an equilibrium is an allocation for consumers, $(c_t(h^t), y_t(h^t), a_{t+1}(h^t))_{t=1}^{\infty}$, an allocation for the firm, $(K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t})_{t=1}^{\infty}$, and prices $(r_t, w_{s,t}, w_{u,t})_{t=1}^{\infty}$ such that $(c_t(h^t), y_t(h^t), a_{t+1}(h^t))_{t=1}^{\infty}$ solves the consumer’s problem,

return from capital $r_t a_t(h^{t-1})$, which would be more in line with what we observe in the U.S. tax code.

$(K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t})_{t=1}^{\infty}$ solves the firm’s problem, and markets clear:

$$\sum_{h^t \in H^t} \pi_t(h^t)\,c_t(h^t) + K_{s,t+1} + K_{e,t+1} + G_t = F(K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t}),$$

$$K_{s,t} + K_{e,t} = \sum_{h^t \in H^t} \pi_t(h^t)\,a_t(h^{t-1}),$$

$$L_{s,t} = \sum_{\{h^t \in H^t \,|\, h_t = s\}} \pi_t(h^t)\,\frac{y_t(h^t)}{w_{s,t}}\,z_s, \qquad L_{u,t} = \sum_{\{h^t \in H^t \,|\, h_t = u\}} \pi_t(h^t)\,\frac{y_t(h^t)}{w_{u,t}}\,z_u.$$

The government’s period-by-period budget balance is implied by Walras’ law:

$$\sum_{h^t \in H^t} \pi_t(h^t)\left[\tau_{a,t}(y^t(h^t))(1 + r_t)\,a_t(h^{t-1}) + \tau_{y,t}(y^t(h^t))\right] + \tau_{f,t}\Pi_{f,t} = G_t.$$

In what follows, we describe an optimal tax system that implements the constrained effi-

cient allocation in the market setup described above. Before doing so, we provide a formal

definition of our notion of implementation.

Implementation. A tax system $(\tau_{y,t}, \tau_{a,t}, \tau_{f,t}, \delta_{s,t}, \delta_{e,t})_{t=1}^{\infty}$ implements the constrained efficient allocation $(c^*_t(h^t), y^*_t(h^t), K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t})_{t=1}^{\infty}$ if an allocation for consumers $(c^*_t(h^t), y^*_t(h^t), a_{t+1}(h^t))_{t=1}^{\infty}$ and an allocation for the firm $(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t})_{t=1}^{\infty}$ jointly with the tax system and prices $(r_t, w_{s,t}, w_{u,t})_{t=1}^{\infty}$ constitute an equilibrium.

C.1 Optimal Tax System

First, we construct the optimal tax system. Then, we prove that it implements the con-

strained efficient allocation. Finally, we characterize its properties.

Optimal Tax System. We begin by describing optimal savings taxes. Set the taxes on people’s savings as

$$\tau^*_{a,t+1}(y^{t+1}) = 1 - \frac{u'(c^*_t(h^t))}{\beta\,u'(c^*_{t+1}(h^{t+1}))\,F_1(K^*_{s,t+1}, K^*_{e,t+1}, L^*_{s,t+1}, L^*_{u,t+1})}, \quad \text{if } y^{t+1} \in Y^{t+1*},$$

$$\tau^*_{a,t+1}(y^{t+1}) = 1, \quad \text{otherwise},$$

where we define $y^t(h^t) = (y_m(h^m))_{m=1}^{t}$ and $Y^{t*} = \{y^t : y^t = y^{t*}(h^t),\ h^t \in H^t\}$.

In words, $Y^{t*}$ is the set of labor income histories observed at the constrained efficient allocation. Set labor income taxes such that if $y^t \in Y^{t*}$, then $\tau^*_{y,t}(y^t)$ and $a^*_{t+1}(h^t)$ are defined to satisfy the flow budget constraints every period:

$$c^*_t(h^t) + a^*_{t+1}(h^t) = y^*_t(h^t) - \tau^*_{y,t}(y^t) + \left[1 - \tau^*_{a,t}(y^t)\right]F_1(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t})\,a^*_t(h^{t-1}).$$

If $y^t \notin Y^{t*}$, then $\tau^*_{y,t}(y^t) = 2y_t$.

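Finally, set

$$\tau^*_{f,t} = 1 - \frac{F_1(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t}) - \delta_s}{F_2(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t}) - \delta_e}, \qquad \delta^*_{s,t} = r^*_t + \delta_s, \qquad \delta^*_{e,t} = \delta_e.$$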

Observe that at the corporate level, the optimal tax system works in the same way as the

U.S. tax code. In the U.S. corporate tax code, there is a single statutory corporate tax rate,

and differences between effective tax rates on equipment and structure capital income are

created through differences in tax depreciation rules, as explained in detail in Section A.
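A small numerical sketch may help to see how a single statutory rate combined with asset-specific allowances equalizes after-tax net returns. The numbers and function names below are hypothetical; `taxable_share` plays the role of the ratio $(\rho_i - \bar{\delta}_i)/(\rho_i - \delta_i)$ from Section A, so it is 0 when the full user cost is deductible (structures) and 1 when the allowance equals economic depreciation (equipment).

```python
def optimal_statutory_rate(net_return_structures, net_return_equipment):
    """Statutory rate tau_f = 1 - (F1 - delta_s) / (F2 - delta_e)."""
    if net_return_equipment <= 0:
        raise ValueError("net return on equipment must be positive")
    return 1.0 - net_return_structures / net_return_equipment


def after_tax_net_return(net_return, taxable_share, tau_f):
    """Net return after corporate tax when only a share of it enters the tax base."""
    return net_return * (1.0 - tau_f * taxable_share)


if __name__ == "__main__":
    net_s, net_e = 0.04, 0.06          # illustrative net marginal returns
    tau_f = optimal_statutory_rate(net_s, net_e)
    structures = after_tax_net_return(net_s, taxable_share=0.0, tau_f=tau_f)
    equipment = after_tax_net_return(net_e, taxable_share=1.0, tau_f=tau_f)
    print(tau_f, structures, equipment)
```

With these inputs the pre-tax net return on equipment exceeds that on structures, and the statutory rate is exactly the one that brings the after-tax return on equipment down to the untaxed return on structures.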

Proposition 7. The optimal tax system described above implements the constrained efficient

allocation.

Proof. First, define prices as $r^*_t = F_1(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t}) - \delta_s$ and, for all h ∈ H, $w^*_{h,t} = \frac{\partial F(K^*_{s,t}, K^*_{e,t}, L^*_{s,t}, L^*_{u,t})}{\partial L_{h,t}}\, z_h$. Second, observe that the only budget feasible income strategy for the household is the one that corresponds to some agent’s income process at the constrained efficient allocation (i.e., for any history $h^t$ in any period t, $y^t$ should be in $Y^{t*}$).

Third, we claim that if an agent chooses an income strategy y′, where this means $y'_t(h^t) = y^*_t(h^t)$ for some $h^t$, then facing the prices defined above, the agent also chooses (c′, a′), meaning that $c'_t(h^t) = c^*_t(h^t)$ and $a'_{t+1}(h^t) = a^*_{t+1}(h^t)$ for all $h^t$, t. If we can show this claim, then the result that the agent will actually choose the constrained efficient allocation follows, since the constrained efficient allocation is incentive compatible.

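To see this claim, take an agent that follows income strategy y′. His problem is

$$\max_{c,a}\ \sum_{t=1}^{\infty} \sum_{h^t \in H^t} \pi_t(h^t)\,\beta^{t-1}\left[u(c_t(h^t)) - v\!\left(\frac{y^*_t(h^t)}{w^*_t(h_t)}\right)\right] \quad \text{s.t.}$$

$$c_t(h^t) + a_{t+1}(h^t) \le y^*_t(h^t) - \tau^*_{y,t}(y^{t*}(h^t)) + \left[1 - \tau^*_{a,t}(y^{t*}(h^t))\right](1 + r^*_t)\,a_t(h^{t-1}),$$

$$a_1 \le K^*_{s,1} + K^*_{e,1}, \quad c \text{ is nonnegative.}$$

The first-order conditions to this problem are the budget constraint with equality and

$$u'(c_t(h^t)) = (1 + r^*_{t+1})\,\beta \sum_{h_{t+1}|h^t} \pi_{t+1}(h^{t+1}|h^t)\,u'(c_{t+1}(h^{t+1}))\left[1 - \tau^*_{a,t+1}(y^{t+1*}(h^t, h_{t+1}))\right].$$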

Clearly, the agent’s problem is concave, and hence these first-order conditions are necessary

and sufficient for optimality provided that a relevant transversality condition holds.

By construction of the labor tax code and the prices, $c^*_t(h^t)$ and $a^*_{t+1}(h^t)$ satisfy the flow budget constraints. To see that they also satisfy the Euler equation above, observe that wealth taxes are constructed such that

$$1 - \tau^*_{a,t+1}(y^{t+1*}(h^t, h_{t+1})) = \frac{u'(c^*_t(h^t))}{\beta\,u'(c^*_{t+1}(h^t, h_{t+1}))\,F^*_{1,t+1}}.$$

Finally, we also need to make sure that the firm chooses the right allocation. The firm’s optimality conditions for labor are satisfied at the constrained efficient allocation by construction of wages. The firm’s optimality conditions for capital are

$$(K_{s,t}):\quad F_1(K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t}) - (1 + r^*_t) - \tau^*_{f,t}\left[F_1(K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t}) - \delta_{s,t}\right] = 0,$$

$$(K_{e,t}):\quad F_2(K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t}) - (1 + r^*_t) - \tau^*_{f,t}\left[F_2(K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t}) - \delta_{e,t}\right] = 0.$$

At the constrained efficient allocation, the first condition above holds by construction of $r^*_t$ and $\delta^*_{s,t}$, whereas the second one holds by construction of $\tau^*_{f,t}$, $r^*_t$ and $\delta^*_{e,t}$. □

Properties of Optimal Capital Taxes. Next we summarize and discuss the properties

of optimal capital taxes.

1. Capital income is taxed twice, once at the consumer level via savings taxes and once

at the corporate level (double taxation of capital).

2. The statutory corporate tax is strictly positive if only downward incentive constraints

bind: ∀t ≥ 2 : τ ∗f,t > 0, where the inequality comes from Proposition 5.

3. There is a single statutory tax rate on corporate income, τ f,t, but the effective taxes

on capital income at the firm level differ across different capital assets because of

differences in the statutory depreciation allowances. The firm is allowed to expense all

of its user cost of structure capital, whereas the depreciation allowance on equipment

capital is equal to its economic depreciation. These optimal tax depreciation allowances

imply that the optimal effective corporate tax rate on structure capital is zero and the

optimal effective corporate tax rate on equipment capital is equal to the statutory

corporate tax rate τ ∗f,t.25

4. Properties 2 and 3 imply that the government uses capital income taxes to collect

revenue at the corporate level, as in the U.S. tax code.

5. Expected taxes on consumers’ asset holdings are zero as in Kocherlakota (2005):

$$E_t\left\{1 - \tau^*_{a,t+1}(y^{t+1})\right\} = E_t\left\{\frac{u'(c^*_t(h^t))}{\beta\,u'(c^*_{t+1}(h^{t+1}))\,F^*_{1,t+1}}\right\} = 1.$$

This property follows from the inverse Euler equation (16).
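Property 5 can be verified in a two-state example. The sketch below is an illustration under assumptions not made in the text: log utility, two equally likely consumption levels next period, and arbitrary values for β and the gross return (standing in for $F^*_{1,t+1}$). Current consumption is chosen so that the inverse Euler equation (16) holds; all names are hypothetical.

```python
def savings_taxes(c_next, probs, beta, gross_return):
    """State-contingent savings tax rates under log utility.

    tau_a(state) = 1 - u'(c_t) / (beta * u'(c_{t+1}) * R)
                 = 1 - c_{t+1} / (beta * R * c_t)      when u'(c) = 1/c,
    with c_t set by the inverse Euler equation: c_t = E[c_{t+1}] / (beta * R).
    """
    expected_c_next = sum(p * c for p, c in zip(probs, c_next))
    c_now = expected_c_next / (beta * gross_return)
    return [1.0 - c / (beta * gross_return * c_now) for c in c_next]


def expected_tax(taxes, probs):
    """Probability-weighted average of the state-contingent tax rates."""
    return sum(p * t for p, t in zip(probs, taxes))


if __name__ == "__main__":
    probs = [0.5, 0.5]
    taxes = savings_taxes([0.8, 1.2], probs, beta=0.96, gross_return=1.05)
    print(taxes)                       # positive in the low state, negative in the high state
    print(expected_tax(taxes, probs))  # zero up to rounding
```

The tax is positive when next-period consumption turns out low and negative when it turns out high, so it averages to zero while still discouraging saving ex ante.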

25Note that there are other implementations in which both capital types can be taxed positively as long as the tax rates create the efficient capital return wedge. In that case, the savings tax that the consumers face would have to be adjusted.

D Alternative Tax Systems

D.1 Current Differential Taxation of Capital

In the current DTC, the planner must use the capital income taxes as in the U.S. tax code.

This means that we impose a set of constraints for both types of agents in each period (here,

τ s and τ e are fixed over time and taken from the data, namely, τ s = 0.422 and τ e = 0.371):

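for all h ∈ H,

$$1 - \tau_s = \frac{\dfrac{u'(c_{h,t})}{\beta u'(c_{h,t+1})} - 1}{F_{1,t+1} - \delta_s} \quad \text{and} \quad 1 - \tau_e = \frac{\dfrac{u'(c_{h,t})}{\beta u'(c_{h,t+1})} - 1}{F_{2,t+1} - \delta_e}. \tag{18}$$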

Observe that the restrictions that the current capital taxes impose on the set of allocations

that the planner can choose, (18), are written down taking into account that, in the U.S. tax

code, it is capital income net of depreciation that is being taxed. Also, notice that any three

of these constraints in (18) imply the fourth. Therefore, we write the current DTC planning

problem, ignoring the fourth:

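$$\max_{\{(c_{h,t}, l_{h,t})_{h \in H},\, K_{s,t}, K_{e,t}, L_{u,t}, L_{s,t}\}_{t=0}^{\infty}}\ \sum_{h \in H} \pi_h \sum_{t=1}^{\infty} \beta^{t-1}\left[u(c_{h,t}) - v(l_{h,t})\right] \quad \text{s.t.}$$

$$\forall t \ge 1,\quad G_t + \sum_{h \in H} \pi_h c_{h,t} + K_{s,t+1} + K_{e,t+1} \le F(K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t}),$$

$$\sum_{t=1}^{\infty} \beta^{t-1}\left[u(c_{s,t}) - v(l_{s,t})\right] \ge \sum_{t=1}^{\infty} \beta^{t-1}\left[u(c_{u,t}) - v\!\left(\frac{l_{u,t}\,w_{u,t}}{w_{s,t}}\right)\right],$$

$$\forall t \ge 1,\quad \beta u'(c_{s,t+1})\left[(1 - \tau_s)(F_{1,t+1} - \delta_s) + 1\right] = u'(c_{s,t}),$$

$$\forall t \ge 1,\quad \beta u'(c_{s,t+1})\left[(1 - \tau_e)(F_{2,t+1} - \delta_e) + 1\right] = u'(c_{s,t}),$$

$$\forall t \ge 1,\quad \beta u'(c_{u,t+1})\left[(1 - \tau_s)(F_{1,t+1} - \delta_s) + 1\right] = u'(c_{u,t}).$$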
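To see what the constraints in (18) imply, the sketch below backs out the capital tax from an allocation and, conversely, the net return consistent with a given tax when consumption is constant over time. The tax rates 0.422 and 0.371 are the ones quoted above; the discount factor is an illustrative assumption, and the function names are hypothetical.

```python
def implied_capital_tax(mu_now, mu_next, beta, net_marginal_product):
    """Capital tax implied by (18):
    1 - tau = (u'(c_t) / (beta * u'(c_{t+1})) - 1) / (F_i - delta_i)."""
    return 1.0 - (mu_now / (beta * mu_next) - 1.0) / net_marginal_product


def steady_state_net_return(tau, beta):
    """Net return F_i - delta_i satisfying (18) when marginal utility is constant."""
    return (1.0 / beta - 1.0) / (1.0 - tau)


if __name__ == "__main__":
    beta = 0.96                      # assumed value, for illustration only
    tau_s, tau_e = 0.422, 0.371      # current tax rates used in the text
    net_s = steady_state_net_return(tau_s, beta)
    net_e = steady_state_net_return(tau_e, beta)
    print(net_s, net_e)
    # Round trip: the implied tax recovers the rate we started from.
    print(implied_capital_tax(1.0, 1.0, beta, net_s))
```

Because structures carry the higher current tax, the pre-tax net return on structures must exceed that on equipment in a steady state of the current DTC problem.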

D.2 Optimal Nondifferential Taxation of Capital

In optimal NDTC, the planner is not allowed to tax the two types of capital differentially.

This means that for agents of both types h ∈ H and all t ≥ 1, we impose

1− τ s,t+1(h) = 1− τ e,t+1(h).

This restriction on taxes is equivalent to the following restriction on allocations: for all h ∈ H and t ≥ 1,

$$1 - \tau_{s,t+1}(h) = \frac{\dfrac{u'(c_{h,t})}{\beta u'(c_{h,t+1})} - 1}{F_{1,t+1} - \delta_s} = 1 - \tau_{e,t+1}(h) = \frac{\dfrac{u'(c_{h,t})}{\beta u'(c_{h,t+1})} - 1}{F_{2,t+1} - \delta_e}.$$

This equation can be simplified to an equality constraint on net returns: F1,t+1 − δs =

F2,t+1 − δe, which we impose for each period. The optimal NDTC planning problem reads:

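$$\max_{\{(c_{h,t}, l_{h,t})_{h},\, K_{s,t}, K_{e,t}, L_{u,t}, L_{s,t}\}_{t=0}^{\infty}}\ \sum_{h = u,s} \pi_h \sum_{t=1}^{\infty} \beta^{t-1}\left[u(c_{h,t}) - v(l_{h,t})\right] \quad \text{s.t.}$$

$$\forall t \ge 1,\quad G_t + \sum_{h} \pi_h c_{h,t} + K_{s,t+1} + K_{e,t+1} \le F(K_{s,t}, K_{e,t}, L_{s,t}, L_{u,t}),$$

$$\sum_{t=1}^{\infty} \beta^{t-1}\left[u(c_{s,t}) - v(l_{s,t})\right] \ge \sum_{t=1}^{\infty} \beta^{t-1}\left[u(c_{u,t}) - v\!\left(\frac{l_{u,t}\,w_{u,t}}{w_{s,t}}\right)\right],$$

$$\forall t \ge 1,\quad F_1(K_{s,t+1}, K_{e,t+1}, L_{s,t+1}, L_{u,t+1}) - \delta_s = F_2(K_{s,t+1}, K_{e,t+1}, L_{s,t+1}, L_{u,t+1}) - \delta_e.$$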
