portfolio selection in discrete time with transaction costs ......key words: portfolio optimization,...

May 26, 2017 22:34 Applied Mathematical Finance article

Applied Mathematical Finance,Vol. 00, No. 00, 1–32, 00 Month 20xx

Portfolio Selection in Discrete Time with TransactionCosts and Power Utility Function: A PerturbationAnalysis

GARY QUEK∗† & COLIN ATKINSON∗

∗Department of Mathematics, South Kensington Campus, Imperial College London, London SW7 2AZ, UK, †BEITScientific Research Fellow

(Received 00 Month 20xx; final version received 00 Month 20xx)

Abstract In this paper, we study a multi-period portfolio selection model in which a generic class of probabilitydistributions is assumed for the returns of the risky asset. An investor with a power utility function rebalances aportfolio comprising a risk-free and risky asset at the beginning of each time period in order to maximize expected utilityof terminal wealth. Trading the risky asset incurs a cost that is proportional to the value of the transaction. At eachtime period, the optimal investment strategy involves buying or selling the risky asset to reach the boundaries of acertain no-transaction region. In the limit of small transaction costs, dynamic programming and perturbation analysisare applied to obtain explicit approximations to the optimal boundaries and optimal value function of the portfolio ateach stage of a multi-period investment process of any length.

Key Words: Portfolio optimization, discrete time, transaction costs, power utility function, perturbation analysis

1. Introduction

The portfolio selection and consumption problem in a continuous time setting was first studiedby Merton (1969). The investor’s objective was to maximize expected utility from consump-tion, where the price of the risky asset was assumed to be driven by a geometric Brownianmotion in a frictionless market. It was shown that, for the case of the power or logarithmicutility function, the optimal investment strategy involved continuously rebalancing the portfo-lio to maintain a constant proportion of the risky asset. This constant proportion is commonlyreferred to as the Merton proportion.

However, continuous trading in financial markets could be ruinously expensive due to theimpact of transaction costs. Transaction costs incorporated into subsequent research weregenerally modeled as a constant amount for each transaction (constant costs), an amountproportional to the value of the transaction (proportional costs) or a fixed proportion of theentire portfolio value. Magill and Constantinides (1976) were the first to extend Merton’smodel to incorporate trading costs that were proportional to the value of the transaction(proportional transaction costs). Although their argument was heuristic, they provided theinsight that “the investor trades in securities when the variation in the underlying securityprices forces his portfolio proportions outside a certain region about the optimal proportions inthe absence of transactions costs”. Davis and Norman (1990) provided a rigorous formulationand analysis of the portfolio selection and consumption problem with proportional transaction

Correspondence Address: Department of Mathematics, South Kensington Campus, Imperial College London, LondonSW7 2AZ, UK. Email: [email protected]

ISSN: 0020-7721 Print/ISSN 1464-5319 Online/YY/issnum1stp–ppcnt c© 20xx Taylor & FrancisDOI: 10.1080/1350486X.20xx.CATSid


2 Gary Quek and Colin Atkinson

costs by applying the theory of stochastic singular control. Their work was further generalizedby Shreve and Soner (1994) with less restrictive assumptions using the theory of viscositysolutions. Taksar et al. (1988) analyzed the portfolio selection problem with proportionaltransaction costs, which involved applying stochastic singular control to maximize the longrun growth rate of the portfolio value. In the aforementioned models with transaction costs, thetypical optimal strategy was not to transact when the proportion of risky asset drifted withina particular no-transaction region. When the proportion of risky asset exceeded the boundaryof this region, the investor would transact instantaneously to return to the boundary. Akianet al. (1996) extended the Davis and Norman (1990) model to study the case where therewere more than one risky asset. Morton and Pliska (1995) introduced a multi-asset modelwhere the investor paid a transaction cost equal to a fixed proportion of the entire portfoliovalue. The investor’s objective was to maximize the long run growth rate of the portfolio valueand the optimal strategy was shown to be reduced to one that solved a single stopping timeproblem.

In general, a lack of analytical solutions meant that models with transaction costs had to besolved by numerical methods. These were usually computationally intensive, especially in thecase of multiple risky assets. Nonetheless, it was observed that transaction costs were smallin practice relative to the value of the transactions. In the limit of small transaction costs,Atkinson and Wilmott (1995) and Mokkhavesa and Atkinson (2002) applied techniques ofperturbation analysis about the no transaction costs solution to derive approximate solutions.Janecek and Shreve (2004) provided a rigorous derivation of the asymptotic expansions of theoptimal value function and boundaries of the no-transaction region. Their work was recentlyextended by Gerhold et al. (2012), who obtained power series expansions of arbitrary orderfor the optimal value function and boundaries of the no-transaction region by using dualitytheory. However, these perturbation analyses were applied to continuous time models wherethe prices of risky assets were assumed to be geometric Brownian motions.

It should be noted that Atkinson and Al-Ali (1997) solved the one dimensional portfo-lio problem exactly for the first correction, a result which seemed to have been repeatedby Soner and Touzi (2013) recently by a different method. The multi-asset problem withuncorrelated distribution was solved by Atkinson and Mokkhavesa (2004) who deduced theN -dimensional cuboid for the no transaction region and found explicit solutions for all theboundaries (N−1) transaction regions, obtaining the regions in the expansion of the solution.Furthermore, Atkinson and Mokkhavesa (2003) applied these methods to situations includ-ing stochastic volatility and procedures for any utility function. Atkinson and Papakokkinou(2005) also considered portfolios with a VaR or CVaR constraint as well as transaction costs.The problem of determining a speculator’s utility function from his observed transactionsbehaviour was investigated by Atkinson and Mokkhavesa (2001). Most of the above analyseswere in continuous time and with underlying Brownian behaviour of the risky assets togetherwith (in the multi-asset case) uncorrelated behaviour of the risky assets. The influence ofcorrelation was considered by Atkinson and Ingpochai (2006) and Atkinson and Ingpochai(2010) both for portfolio theory and for option pricing together with stochastic volatility. Theoption pricing theory is based on an indifference argument given by Davis and Norman (1990)and considers two portfolios, one with the option and one without, both portfolios using util-ity maximisation with the exponential utility function. For a basket of uncorrelated assets,Atkinson and Alexandropoulos (2006) have solved this problem by a perturbation methodand also shown how the Legendre transform can be used to give a single solution to the notransaction region in these alternate variables. This method is also used by Atkinson andIngpochai (2012) where correlation between assets is taken into account.

In a discrete time framework, Samuelson (1969) studied the portfolio selection and con-sumption problem in a multi-period model using a dynamic programming approach, whichwas analogous to the work by Merton (1969). He analyzed the maximization of expected utility


Portfolio Selection with Transaction Costs and Power Utility Function 3

from consumption, where the return of the risky asset was assumed to follow a general proba-bility distribution. In the case of the power or logarithmic utility function, he showed that theinvestor’s optimal strategy was to maintain a constant proportion of the risky asset at eachtime step. Mossin (1968) analyzed the dynamic portfolio selection problem in discrete timeand considered the objective of maximizing expected utility of terminal wealth. It was shownthat, for the power or logarithmic utility function, the optimal strategy involved making asequence of single-period decisions which disregarded future reinvestment opportunities. Thiswas described as a myopic strategy. However, this was only a special case and it was generallynot optimal to make a decision at each time step without considering the steps ahead.

Bobryk and Stettner (1999) incorporated proportional transaction costs in the maximiza-tion of expected utility from consumption. In the case of the power or logarithmic utilityfunction, the optimal investment strategy was shown to be characterized by a cone shapedno-transaction region. They derived various bounds on the no-transaction region by specifyingupper and lower bounds on the support of the probability measure for the returns of the riskyasset. Sass (2005) embedded the Cox et al. (1979) binomial model with a general transactioncosts structure. He formulated the objective of maximizing expected utility of terminal wealthas a Markov control problem and gave a multi-period existence result based on the solution ofthe dynamic programming equation. Explicit results were provided for the one-period problemin the case of a single risky asset with a binomial price process. Atkinson and Storey (2010)studied the discrete time portfolio selection problem by incorporating proportional transac-tion costs in the Mossin (1968) model. They assumed a general class of underlying probabilitydistributions for the returns of the risky asset and studied the problem of maximizing expectedutility of terminal wealth for the power utility function. Perturbation methods were appliedto obtain approximations of the optimal boundaries of the no-transaction region in the limitsof small and large transaction costs. However, the approximations of the optimal boundarieswere only obtained for two time steps and it was not obvious that their approach would allowone to generalise to an arbitrary number of time steps. Atkinson and Quek (2012) consideredthe case of maximizing expected utility of terminal wealth for the exponential utility function.In the limit of small transaction costs, they applied perturbation analysis to derive approxi-mations of the optimal value function and boundaries of the no-transaction region at all timesteps of the problem. An investor with the exponential utility function is characterized by aconstant level of absolute risk aversion, which resulted in optimal no-transaction boundariesthat were independent of wealth. In practice, one would expect the optimal boundaries tovary with the wealth of the investor. A more realistic description of the investor’s optimalstrategy would be provided by using the power utility function.

In this paper, we assume the more realistic case of the power utility function and carry outa perturbation analysis to an arbitrary number of time steps, in the limit of small transactioncosts. We present a method for explicitly constructing the approximations of both the optimalvalue function and optimal boundaries of the no-transaction region. It is a non-trivial extensionof the perturbation analysis developed in Atkinson and Quek (2012) as the proportion ofrisky asset at each time step depends on variations in both the return of the risky assetand the investor’s wealth. It should be stressed that the method of this paper allows fora wide variety of distributions for the returns of the risky asset. In our simplified examplediscussed in Section 6, we choose a specific distribution which is the same at each timestep, e.g.p(s) = 0.7×δ(s−1.5)+0.3×δ(s−0.5), with δ(.) being the Dirac delta function, but of coursemany other distgributions could be used in our method. Moreover, explicit error bounds aredetermined in the Appendix. Recent work such as that of Bayer and Veliyev (2014) consideronly a Binomial model for the risky asset and a log investor. Similarly, those of Balduzzi andLynch (1999) and Lynch and Balduzzi (2000) who address numerically quite complex modelsagain are somewhat restricted in terms of the underlying model of the risky asset. We thusconclude that our general approach here should have many applications to situations where a



variety of distributions might be expected for the return probabilities of the underlying riskyasset.

The plan of this paper is as follows. In Section 2, we describe the market model and theportfolio selection problem. Section 3 contains a review of the construction of the optimalportfolio via dynamic programming. In Section 4, we state the results for the special casewhen there are no transaction costs. The main part of the paper is found in Section 5, wherewe analyze the case of small transaction costs. We devise a perturbation method, whichis carried out in two stages, to systematically approximate the optimal value function andoptimal boundaries at any time step of the problem. In Section 6, we conclude with a discussionof the main results and a numerical example. The error for the procedure and details of theperturbation analysis are found in the Appendix.

2. Market Model

In this section, we present an overview of the market model that was developed in Atkinsonand Storey (2010). Consider a multi-period portfolio selection model with T periods. Assumea financial market with one risk-free asset (bond) and one risky asset (stock), where the priceof each asset evolves in discrete time. An investor is assumed to have a risk preference of theconstant relative risk aversion class (power utility function). The investor holds a portfoliothat is divided between the risk-free and risky assets. A cost proportional to the value ofthe transaction is incurred each time the investor buys or sells the risky asset. The investor’sobjective is to maximize the expected utility of terminal wealth by rebalancing the portfoliooptimally at each step of the investment process.

At time n, let Wn denote the wealth of the portfolio and let an be the dollar value of therisky asset inherited from the previous time step. Therefore, the corresponding value of therisk-free asset is Wn − an. The investor rebalances the portfolio at time n by buying ln orselling mn dollars of the risky asset. Suppose that sn denotes one plus the random return ofthe risky asset from time n to n+ 1. Thus, the value of the risky asset at time n+ 1 inheritedfrom time n is

an+1 = sn (an + ln −mn) . (1)

Furthermore, let λn and µn be the proportion costs of buying and selling the risky assetrespectively at time n. These costs reduce the wealth invested in the risk-free asset, resultingin a value of Wn − (an + ln −mn)− λnln − µnmn. Suppose that rn denotes one plus the surereturn of the risk-free asset from time n to n+ 1. The investor’s wealth at time n+ 1 is thengiven by

Wn+1 = rnWn + (sn − rn) (an + ln −mn)− rnλnln − rnµnmn. (2)

Assume that simultaneous buying and selling of the risky asset is not allowed, since it willnot be optimal due to the higher costs as compared to only buying or selling the asset. Theinvestor is thus left with three possible choices, which is to buy, to sell or not to transact therisky asset. The investor’s decision at each time step will affect the wealth and risky assetinherited at the next time step.

Suppose that the investor has a risk preference of the power utility type. Let the investor’sutility of wealth be

U(W ) =1

γW γ , (3)

where γ < 1 and γ 6= 0. In this case, it is convenient to parametrize the problem by expressingthe original variables in terms of fractions of wealth. Introduce the variables An = an/Wn

(fraction of wealth held in the risky asset), Ln = ln/Wn (fraction of wealth in buying the risky



asset) and Mn = mn/Wn (fraction of wealth in selling the risky asset). Using these variables,Eqs. (1) and (2) can be rewritten as

An+1 =sn (An + Ln −Mn)

Fn(4)

and

Wn+1 = WnFn (5)

respectively, where

Fn = rn + (sn − rn) (An + Ln −Mn)− rnλnLn − rnµnMn. (6)

Assume that 0 < An + Ln −Mn < 1 and 0 < 1−An − (1 + λn)Ln + (1− µn)Mn < 1, whichensure that the investor’s wealth Wn > 0.

The investor’s objective is to maximize the expected utility of terminal wealth WT given aninitial wealth W0 and initial proportion of risky asset A0, by choosing the optimal strategy ateach step of the investment process. The optimal value function at time n (n = 0, . . . , T −1)isdefined to be

Jn(Wn, An) = maxE [U(WT )] , (7)

where E is the conditional expectation operator taken with respect to the random vari-ables sn, . . . , sT−1 given Wn and An. The maximization is over the sequence of investments(Ln,Mn), . . . , (LT−1,MT−1) in the risky asset. In order to obtain the investor’s optimal valuefunction J0(W0, A0) and corresponding optimal strategy given an initial wealth W0 and ini-tial proportion of risky asset A0, we first simplify the problem by applying the principle ofdynamic programming.

3. Dynamic Programming

The dynamic programming algorithm for the problem, which starts at terminal time T andproceeds recursively backwards in time, is given by

JT (WT , AT ) = U(WT ) (8)

and

JT−k(WT−k, AT−k) = maxET−k [JT−k+1(WT−k+1, AT−k+1)] (9)

for k = 1, . . . , T . At time T − k, ET−k is the conditional expectation operator with respectto the random variable sT−k given WT−k and AT−k, while the maximization is over theinvestment (LT−k,MT−k) in the risky asset. A detailed description of the construction of theoptimal portfolio via dynamic programming can be found in Atkinson and Storey (2010). Weshall only state the results of this construction at time T − 1 and at the general case of timeT − k, which are subsequently required for our perturbation analysis in the limit of smalltransaction costs.

3.1 Time T − 1

This is a special case as it is one step before termination of the investment process, whichmeans that there is no further rebalancing opportunity for the investor. Using Eq. (5), thevalue function at time T − 1 is written as

JT−1(WT−1, AT−1) = maxET−1

[1

γW γT

]= W γ

T−1 maxVT−1(AT−1), (10)



where

VT−1(AT−1) = ET−1

[1

γF γT−1

]. (11)

In Eq. (10), WT−1 is taken out of ET−1, which is the expectation operator conditional onWT−1 and AT−1. In addition, WT−1 does not depend on the investor’s decision at time T − 1.Therefore, the problem is reduced to one of maximizing VT−1(AT−1) with respect to theinvestment strategy (LT−1,MT−1). The investor’s choice to buy (MT−1 = 0), to sell (LT−1 =0) or not to transact (MT−1 = 0 and LT−1 = 0) in the risky asset affects the definition ofFT−1 as given in Eq. (6).

The problem is one of finding the optimal no-transaction region delineated by A−T−1 ≤AT−1 ≤ A+

T−1, where A−T−1 and A+T−1 are the optimal buy and sell boundaries respectively.

The region to the left of A−T−1 is the optimal buy region while the region to the right of A+T−1

is the optimal sell region. A−T−1 and A+T−1 are given by the first order optimality conditions

∂VT−1

∂LT−1= ET−1

[{sT−1 − (1 + λT−1) rT−1}F γ−1

T−1

]= 0 (12)

and

∂VT−1

∂MT−1= ET−1

[{(1− µT−1) rT−1 − sT−1}F γ−1

T−1

]= 0 (13)

respectively. In Eqs. (12) and (13), note that FT−1 = rT−1 + (sT−1 − rT−1)AT−1 because wehave AT−1 = A−T−1, LT−1 = 0 on the buy boundary and AT−1 = A+

T−1,MT−1 = 0 on the sell

boundary. Furthermore, it can be shown that the second order conditions∂2VT−1

∂L2T−1

< 0 and

∂2VT−1

∂M2T−1

< 0 are satisfied, which ensure that these boundaries are optimal. In general, one will

solve for A−T−1 and A+T−1 numerically.

Having determined the optimal boundaries A−T−1 and A+T−1, the investor’s optimal strategy

and value function are as follows:In the buy region AT−1 < A−T−1, the investor’s optimal strategy is to buy LT−1 = A−T−1 −

AT−1 of the risky asset to reach the optimal buy boundary A−T−1. The corresponding optimal

value function VT−1 and its first derivative∂VT−1

∂AT−1are thus given by

V(B)T−1 = ET−1

[1

γF

(B)γT−1

](14)

and

∂V(B)T−1

∂AT−1= ET−1

[rT−1λT−1F

(B)γ−1T−1

], (15)

where

F(B)T−1 = rT−1 + (sT−1 − rT−1)A−T−1 − rT−1λT−1

(A−T−1 −AT−1

). (16)

In the sell region AT−1 > A+T−1, the investor sells MT−1 = AT−1 −A+

T−1 of the risky asset

to reach the optimal sell boundary, so that VT−1 and∂VT−1

∂AT−1are given by

V(S)T−1 = ET−1

[1

γF

(S)γT−1

](17)



and

∂V(S)T−1

∂AT−1= −ET−1

[rT−1µT−1F

(S)γ−1T−1

], (18)

where

F(S)T−1 = rT−1 + (sT−1 − rT−1)A+

T−1 − rT−1µT−1

(AT−1 −A+

T−1

). (19)

In the no-transaction region A−T−1 ≤ AT−1 ≤ A+T−1, where LT−1 = 0 and MT−1 = 0 as the

investor does not trade in the risky asset, VT−1 and∂VT−1

∂AT−1are given by

V(N)T−1 = ET−1

[1

γF

(N)γT−1

](20)

and

∂V(N)T−1

∂AT−1= ET−1

[{sT−1 − rT−1}F (N)γ−1

T−1

], (21)

where

F(N)T−1 = rT−1 + (sT−1 − rT−1)AT−1. (22)

It is noted that VT−1 and∂VT−1

∂AT−1are continuous across the optimal buy and sell boundaries.

At the optimal boundaries AT−1 = A−T−1 and AT−1 = A+T−1, continuity of the former is simply

observed from Eqs. (14), (17) and (20), while the latter is a direct consequence of the firstorder optimality conditions (12) and (13).

3.2 Time T − k

Applying the dynamic programming algorithm recursively backwards in time allows one toconstruct the optimal strategy and value function at time T − k (k = 2, . . . , T ). Using Eq.(5), the optimal value function given by Eq. (9) is expressed as

JT−k(WT−k, AT−k) = maxET−k[W γT−k+1VT−k+1(AT−k+1)

]= W γ

T−k maxVT−k(AT−k), (23)

where

VT−k(AT−k) = ET−k[F γT−kVT−k+1(AT−k+1)

]. (24)

The problem is effectively reduced to one of maximizing VT−k(AT−k) with respect to LT−k andMT−k, assuming that VT−k+1(AT−k+1) is optimal by the principle of dynamic programming.The definitions of FT−k from Eq. (6) and AT−k+1 from Eq. (4) depend on the investor’sdecision to buy (MT−k = 0), to sell (LT−k = 0) or not to transact (LT−k = 0 and MT−k = 0)the risky asset.

The optimal buy boundary AT−k = A−T−k and sell boundary AT−k = A+T−k satisfy the

corresponding first order optimality conditions

∂VT−k∂LT−k

= ET−k[γ {sT−k − (1 + λT−k) rT−k}F γ−1

T−kVT−k+1

+sT−krT−k (1 + λT−kAT−k)Fγ−2T−k

∂VT−k+1

∂AT−k+1

]= 0 (25)



and

∂VT−k∂MT−k

= ET−k[γ {(1− µT−k) rT−k − sT−k}F γ−1

T−kVT−k+1

−sT−krT−k (1− µT−kAT−k)F γ−2T−k

∂VT−k+1

∂AT−k+1

]= 0, (26)

respectively. In Eqs. (25) and (26), note that FT−k = rT−k + (sT−k − rT−k)AT−k. Compared

to the first order conditions at time T − 1, there is an additional∂VT−k+1

∂AT−k+1term due to the

opportunities for the investor to rebalance the portfolio at the time steps ahead. Moreover, theinvestor is not myopic (unlike the case where there are no transaction costs) and will take intoaccount future rebalancing opportunities when he determines his current investment strategy.In general, one will solve for A−T−k and A+

T−k by implementing the dynamic programmingalgorithm numerically, which becomes computationally more intensive as the number of timesteps increases.

Having determined the optimal buy and sell boundaries, the investor’s optimal strategy andvalue function are as follows:

In the buy region AT−k < A−T−k, the investor’s optimal strategy is to buy LT−k = A−T−k −AT−k of the risky asset to reach the optimal buy boundary A−T−k. In this case, the optimal

value function VT−k and its first derivative∂VT−k∂AT−k

are given by

V(B)T−k = ET−k

[F

(B)γT−k VT−k+1

](27)

and

∂V(B)T−k

∂AT−k= ET−k

[γrT−kλT−kF

(B)γ−1T−k VT−k+1

−sT−krT−kλT−kA−T−kF(B)γ−2T−k

∂VT−k+1

∂AT−k+1

], (28)

where

F(B)T−k = rT−k + (sT−k − rT−k)A−T−k − rT−kλT−k

(A−T−k −AT−k

). (29)

Note that the optimal value function at the time step ahead VT−k+1 is a functionof AT−k+1. From Eq. (4), we know that AT−k+1 depends on sT−k, AT−k and A−T−k

via AT−k+1 =sT−kA

−T−k

F(B)T−k

. Define s−T−k and s+T−k as the values of sT−k which corre-

spond to the optimal buy boundary A−T−k+1 and optimal sell boundary A+T−k+1 at

the time step ahead respectively. This implies, after rearranging the expressions, that

they are explicitly given by s−T−k =rT−kA

−T−k+1

{(1−A−T−k

)− λT−k


)}A−T−k

(1−A−T−k+1

) and

s+T−k =

rT−kA+T−k+1

{(1−A−T−k

)− λT−k


)}A−T−k

(1−A+

T−k+1

) . Since the expectation operator

ET−k is taken with respect to the random variable sT−k, Eq. (27) can thus be written in its



integral form, delineated by s−T−k and s+T−k, to give

V(B)T−k =

∫ s−T−k

0F

(B)γT−k V

(B)T−k+1p(sT−k) dsT−k

+

∫ s+T−k

s−T−k

F(B)γT−k V

(N)T−k+1p(sT−k) dsT−k

+

∫ ∞s+T−k

F(B)γT−k V

(S)T−k+1p(sT−k) dsT−k, (30)

where p(sT−k) is the probability density function of the random variable sT−k. We will beusing Eq. (30) in the perturbation analysis that follows subsequently.

In the sell region AT−k > A+T−k, the investor sells MT−k = AT−k −A+

T−k of the risky asset

to reach the optimal sell boundary. Thus, VT−k and∂VT−k∂AT−k

are given by

V(S)T−k = ET−k

[F

(S)γT−k VT−k+1

](31)

and

∂V(S)T−k

∂AT−k= ET−k

[−γrT−kµT−kF

(S)γ−1T−k VT−k+1

+sT−krT−kµT−kA+T−kF

(S)γ−2T−k

∂VT−k+1

∂AT−k+1

], (32)

where

F(S)T−k = rT−k + (sT−k − rT−k)A+

T−k − rT−kµT−k(AT−k −A+

T−k). (33)

Similar to the buy region, AT−k+1 depends on sT−k,

AT−k and A+T−k via AT−k+1 =

sT−kA+T−k

F(S)T−k

, which implies

that s−T−k =rT−kA

−T−k+1

{(1−A+

T−k)− µT−k

(AT−k −A+

T−k)}

A+T−k

(1−A−T−k+1

) and

s+T−k =

rT−kA+T−k+1

{(1−A+

T−k)− µT−k

(AT−k −A+

T−k)}

A+T−k

(1−A+

T−k+1

) . We can similarly express

Eq. (31) in its integral form, delineated by the s−T−k and s+T−k given above.

In the no-transaction region A−T−k ≤ AT−k ≤ A+T−k where LT−k = 0 and MT−k = 0, the

optimal value function VT−k and its derivative∂VT−k∂AT−k

are given by

V(N)T−k = ET−k

[F

(N)γT−k VT−k+1

](34)

and

∂V(N)T−k

∂AT−k= ET−k

[γ (sT−k − rT−k)F

(N)γ−1T−k VT−k+1

+sT−krT−kF(N)γ−2T−k

∂VT−k+1

∂AT−k+1

], (35)

where

F(N)T−k = rT−k + (sT−k − rT−k)AT−k. (36)



Here, AT−k+1 depends on sT−k and AT−k via AT−k+1 =sT−kAT−k

F(N)T−k

, which means that

s−T−k =rT−kA

−T−k+1 (1−AT−k)

AT−k(1−A−T−k+1

) and s+T−k =

rT−kA+T−k+1 (1−AT−k)

AT−k(1−A+

T−k+1

) . Similarly, we can write

Eq. (34) in its integral form delineated by the above values of s−T−k and s+T−k.

It is noted that VT−k and∂VT−k∂AT−k

are continuous across the optimal boundaries. The for-

mer is observable from Eqs. (30), (31) and (34) while the latter is a direct consequence ofthe first order conditions (25) and (26). Generally, one will need to implement the dynamicprogramming algorithm recursively to obtain numerical solutions of the optimal boundariesA−T−k and A+

T−k and the optimal value function VT−k(AT−k). However, this implementationis computationally intensive particularly when the number of time steps is large. Therefore,this has motivated us to investigate an alternative method of approximating the solutions ofA−T−k, A

+T−k and VT−k(AT−k). Furthermore, in the case where there are no transaction costs,

the solution to the portfolio selection problem is easily obtained as the optimal strategy isessentially myopic in nature. Coupled with the knowledge that transaction costs are small inpractice, we therefore carry out a perturbation analysis of the small transaction costs modelabout the no transaction costs case. In addition to deriving more tractable approximations tothe solutions, a perturbation analysis may provide some qualitative insights to the nature ofthe solutions.

4. No Transaction Costs Case

Prior to the perturbation analysis in the limit of small transaction costs, we first consider thespecial case where there are no transaction costs incurred in buying or selling the risky asset.Setting λT−k = 0 = µT−k and repeating the construction of the optimal portfolio as seen inthe previous section, we obtain the following results.

In general (k = 1, . . . , T ), the optimal buy boundary A−T−k and sell boundary A+T−k coincide

to the same point, which is denoted by AT−k and given by the first order condition

ET−k[(sT−k − rT−k) F γ−1

T−k

]= 0, (37)

where

FT−k = rT−k + (sT−k − rT−k) AT−k. (38)

This optimal point is commonly known as the Merton proportion. Generally, one has to solveEq. (37) numerically for AT−k, which can be easily done by using standard root findingtechniques. Moreover, for the case where the risky asset has a binomial price process, one willbe able to obtain an explicit solution for AT−k. The investor’s optimal strategy is thus totransact to the Merton proportion at each time step of the investment process. In addition,the optimal value function VT−k is given by

VT−k =1

γET−1

[F γT−1

]· · ·ET−k+1

[F γT−k+1

]ET−k

[F γT−k

]. (39)

It is observed that the optimal value function at time T −k does not vary with the proportionAT−k of risky asset inherited from the previous time step, since there is no cost incurred inbuying or selling the risky asset to reach the Merton proportion AT−k. The optimal strategyis a myopic one as the investor does not need to consider future rebalancing opportunitiesat the time steps ahead. If one further assumes that sT−k are independent and identicallydistributed random variables and that rT−k is a constant independent of k, then the Merton



proportion AT−k simplifies to a constant independent of k and the optimal value function

becomes VT−k = 1γ

{ET−k

[F γT−k

]}k. The relatively simple solution of the Merton proportion

and the optimal value function motivates one to carry out a perturbation analysis about theno transaction costs solution, in the limit of small transaction costs. Before proceeding further,it is useful to state the following results that

ET−k[F γT−k

]= ET−k

[rT−kF

γ−1T−k

](40)

and

ET−k[rT−k (sT−k − rT−k) F γ−2

T−k

]= −ET−k

[AT−k (sT−k − rT−k)2 F γ−2

T−k

]. (41)

The above results are direct consequences of Eq. (37) and will be used extensively to simplifythe asymptotic approximations of the value functions in the next section.

5. Small Transaction Costs Case

In practice, transaction costs are usually small compared to the value of the transactions. Asobserved in the previous section, the no transaction costs problem admits a relatively simplemyopic solution. Therefore, one is motivated to analyze the small transaction costs solution asa perturbation about the no transaction costs solution. In Atkinson and Storey (2010), theyobtained the leading order approximations to the optimal buy and sell boundaries for two timesteps via the expansion of Eqs. (12), (13), (25) and (26) in the limit of small transaction costs.However, it was not obvious that a direct expansion of the first order conditions will enableone to easily obtain leading order approximations to an arbitrary number of time steps. Inthis section, we present an approach to apply perturbation analysis about the no transactioncosts solution in the limit of small transaction costs. The advantages of this approach overthe direct expansion approach is that it allows one to systematically obtain approximationsof both the optimal value function and optimal boundaries for an arbitrary number of timesteps. A similar approach had been adopted in Atkinson and Quek (2012) for an investorwith the exponential utility function. A feature of the exponential utility function was that itresulted in optimal boundaries that were independent of the investor’s wealth, which is notusually the case in practice. A more realistic description of the investor’s optimal strategy isprovided by using the power utility function. Moreover, it is also more challenging to carryout the perturbation analysis in this context as the proportion of risky asset inherited at eachtime step depends on variations in both the return of the risky asset and the investor’s wealth.

We start with approximating the optimal value function and follow by determining the opti-mal boundaries. In order to approximate the optimal value function for an arbitrary number oftime steps, we adopt an approach that consists of two stages. The first stage involves makingthe assumption that the investor buys or sells to reach the Merton proportion at each timestep when transaction costs are small. This is clearly a suboptimal strategy as the investorhas ignored the presence of the no-transaction region. Consequently, an approximation of thesuboptimal value function is derived at each time step. The second stage assumes that theinvestor behaves optimally by taking into account the no-transaction region. A sequence ofcorrections are then applied to the suboptimal value function to give us the desired approxi-mation to the optimal value function. After approximating the optimal value function at eachtime step, the optimal boundaries are then approximated by imposing the condition that thefirst derivative of the value function is continuous across the boundaries. Finally, the error inthis procedure is calculated and we give an example in the Appendix.

Suppose now that transaction costs are small such that λT−k = ελT−k and µT−k = εµT−k,where ε � 1, λT−k = O(1) and µT−k = O(1), for k = 1, . . . , T . Here, O(.) is the usual



asymptotic order symbol so that λT−k and µT−k are said to be “of the order” 1. Equivalently,λT−k and µT−k are said to be of the order ε. We now apply a perturbation analysis in thefollowing two stages.

5.1 Stage One: Transacting to the Merton Proportion

In the first stage, assume that the investor follows the suboptimal strategy of transacting tothe Merton proportion. This is equivalent to assuming that both the optimal buy and sellboundaries are equal to the Merton proportion, which is suboptimal as we have effectivelyremoved one of the investor’s possible choices of not transacting in the risky asset.

In general, at time T − k for k = 1, . . . , T , the investor is assumed to adopt the suboptimalstrategy of buying LT−k or selling MT−k of the risky asset to reach the Merton proportionAT−k = A−T−k = A+

T−k. Therefore, Eq. (5) becomes

WT−k+1 = WT−kFT−k, (42)

where

FT−k = FT−k − ελT−krT−kLT−k − εµT−krT−kMT−k. (43)

Recall, from the analysis of the no transaction costs solution, that FT−k = rT−k +(sT−k − rT−k) AT−k. The proportion of risky asset inherited in the next time step is nowgiven by

AT−k+1 =sT−kAT−k

FT−k. (44)

In particular, the investor’s suboptimal strategy of transacting to the Merton proportionand corresponding value function in the buy and sell regions (compare with Section 3) are asfollows:

In the buy region AT−k < AT−k, the investor buys LT−k = AT−k −AT−k of the risky assetso that

F(B)T−k = FT−k − ελT−krT−k

(AT−k −AT−k

). (45)

The value function now becomes

V(B)T−1 =

1

γET−1

[F

(B)γT−1

](46)

at time T − 1 and

V(B)T−k = ET−k

[F

(B)γT−k VT−k+1

]=

∫ sT−k

0F

(B)γT−k V


+

∫ ∞sT−k

F(B)γT−k V

(S)T−k+1p(sT−k) dsT−k, (47)

at time T − k (k = 2, . . . , T ), where sT−k = sT−k denotes one plus the risky return thatresults in an inherited proportion of AT−k+1 = AT−k+1 (the Merton proportion). From Eq.



(44), sT−k =rT−kAT−k+1

{(1− AT−k

)− ελT−k

(AT−k −AT−k

)}AT−k

(1− AT−k+1

) . It is convenient to define

sT−k =rT−kAT−k+1

(1− AT−k

)AT−k

(1− AT−k+1

) (48)

and then express

sT−k = sT−k

1−ελT−k

(AT−k −AT−k

)(

1− AT−k)

. (49)

Therefore, note that sT−k is the leading order term of sT−k. It is remarked that, if one assumesin the special case that sT−k are independent and identically distributed random variablesand that rT−k are constant in time, then AT−k are constant in time and sT−k reduces to rT−k.

In the sell region AT−k > AT−k, the investor sells MT−k = AT−k − AT−k of the risky assetso that

F(S)T−k = FT−k − εµT−krT−k

(AT−k − AT−k

). (50)

The value function now becomes

V(S)T−1 =

1

γET−1

[F

(S)γT−1

](51)

at time T − 1 and

V(S)T−k = ET−k

[F

(S)γT−k VT−k+1

]=

∫ sT−k

0F

(S)γT−k V


+

∫ ∞sT−k

F(S)γT−k V

(S)T−k+1p(sT−k) dsT−k (52)

at time T − k (k = 2, . . . , T ), where

sT−k = sT−k

1−εµT−k

(AT−k − AT−k

)(

1− AT−k)

. (53)

It is of interest to note that the value functions in the buy and sell regions differ by onlya change of the transaction cost variables from λT−k to −µT−k. Essentially, we exploit thisobservation to deduce the approximation of the value function in the sell region from that inthe buy region.

5.2 Stage One: Perturbation about the No Transaction Costs Case

Since the parameter ε� 1 in the limit of small transaction costs, we will derive approximationsof the suboptimal value functions as power series in terms of ε, starting from time T − 1 andthen proceeding to the general time T −k case. We achieve this by perturbing the suboptimalvalue function about the no transaction costs solution. Recall that the no transaction costssolution is of a relatively simple form since it is characterized by a myopic investment strategy.In the spirit of dynamic programming, the perturbation will be done recursively, starting fromtime T − 1 and proceeding backwards in time. In general, the perturbation at time T − k



depends on the perturbations from the time steps ahead. The exception is time T − 1, sinceit is one step before termination of the investment process.

5.2.1 Time T − 1

The value function in the buy region, from Eqs. (45) and (46), is given by

V(B)T−1 =

1

γET−1

[{FT−1 − ελT−1rT−1

(AT−1 −AT−1

)}γ]. (54)

Expanding it as a power series in ε and using Eq. (40), we approximate

V(B)T−1 = VT−1

{1− ελT−1γ

(AT−1 −AT−1

)+

1

2ε2λ2

T−1αT−1

(AT−1 −AT−1

)2}

+O(ε3), (55)

where

αT−1 =γ (γ − 1) r2

T−1ET−1

[F γ−2T−1

]ET−1

[F γT−1

] . (56)

Note that the leading order term of the expansion is VT−1 = 1γET−1

[F γT−1

], which we recall is

the no transaction costs solution given by Eq. (39). The interested reader is referred to A foran analysis of the remainder term in the above expansion, which is shown to be bounded. Theremainder terms for subsequent expansions will not be provided but are otherwise similar.

Similarly, the value function in the sell region is given by Eqs. (50) and (51), which differsfrom the value function in the buy region by a change of variable from λT−1 to −µT−1.Therefore, it can be immediately deduced from Eq. (55) that

V(S)T−1 = VT−1

{1− εµT−1γ

(AT−1 − AT−1

)+

1

2ε2µ2

T−1αT−1

(AT−1 − AT−1

)2}

+O(ε3). (57)

5.2.2 Time T − k

Taking one step back to time T − 2, the value function in the buy region is given by Eq.

(47). Our aim is to delineate the integrals of V(B)T−2 by sT−2 rather than sT−2, since sT−2 is

the leading order term of sT−2. Therefore, it is rewritten as

V(B)T−2 =

∫ sT−2

0F

(B)γT−2 V

(B)T−1p(sT−2) dsT−2 +

∫ ∞sT−2

F(B)γT−2 V

(S)T−1p(sT−2) dsT−2

+

∫ sT−2

sT−2

F(B)γT−2

{V

(S)T−1 − V

(B)T−1

}p(sT−2) dsT−2, (58)

where sT−2 = sT−2 +O(ε) from Eq. (49). Recall that V(B)T−1 and V

(S)T−1 are functions of AT−1,

where AT−1 =sT−2AT−2

F(B)T−2

from Eq. (44). Also, when sT−2 = sT−2, we have AT−1 = AT−1 by

definition. We now derive an estimate of the third integral. Applying the Mean Value Theorem



for an integral, ∫ sT−2

sT−2

F(B)γT−2

{V

(S)T−1 − V

(B)T−1

}p(sT−2) dsT−2

= (sT−2 − sT−2) F(B)γT−2

{V

(S)T−1 − V

(B)T−1

}p(sT−2), (59)

which is evaluated at a point sT−2 ∈ (sT−2, sT−2), i.e. sT−2 = sT−2 + O(ε). At this point,AT−1 = AT−1 + O(ε) as a return that is close to sT−2 results in a proportion of risky assetthat is close to AT−1. This implies that

V(S)T−1 − V

(B)T−1 = −ε

(µT−1 + λT−1

)γVT−1

(AT−1 − AT−1

)+O(ε2) (60)

is of O(ε2). Since sT−2 − sT−2 is of O(ε), Eq. (59) is of O(ε3).Therefore, the value function in the buy region is approximated by

V(B)T−2 =

∫ sT−2

0F

(B)γT−2 V

(B)T−1p(sT−2) dsT−2 +

∫ ∞sT−2

F(B)γT−2 V

(S)T−1p(sT−2) dsT−2 +O(ε3). (61)

We now substitute the expressions for F(B)T−2, V

(B)T−1 and V

(S)T−1 from Eqs. (45), (55) and (57)

into Eq. (61). Expanding in powers of ε, simplifying with Eq. (40) and collecting the common

terms together, it can be shown after some algebra that V(B)T−2 is given by Eq. (62) below. In

order to approximate the value function in the sell region, recall that it differs from the valuefunction in the buy region by a change of variable from λT−2 to −µT−2. Therefore, with the

aforementioned change of variable, we can immediately deduce the approximation of V(S)T−2,

which is given below by Eq. (63).Using Eqs. (47) and (52), the above analysis for the time T − 2 case is repeated recursively

at time T − 3, time T − 4 and so on. In general, the suboptimal value function in the buy andsell regions at time T − k (k = 2, . . . , T ) can be inductively shown to be approximated by

V(B)T−k = VT−k

{1− ελT−kγ

(AT−k −AT−k

)+

1

2ε2λ2

T−kαT−k

(AT−k −AT−k

)2

+ε2λT−k

(βT−k − γ

k∑i=3

ζT−i+1

)(AT−k −AT−k

)+ ε

k∑i=2

ζT−i

+ε2

(1

2

k∑i=2

ηT−iαT−i+1 −1

γ

k∑i=3

ζT−iβT−i+1 +

k∑i=4

i∑j=4

ζT−iζT−j+2

)}+O(ε3) (62)

and, with a change of variable from λT−k to −µT−k, by

V(S)T−k = VT−k

{1− εµT−kγ

(AT−k − AT−k

)+

1

2ε2µ2

T−kαT−k

(AT−k − AT−k

)2

+ε2µT−k

(βT−k − γ

k∑i=3

ζT−i+1

)(AT−k − AT−k

)+ ε

k∑i=2

ζT−i

+ε2

(1

2

k∑i=2


γ

k∑i=3

ζT−iβT−i+1 +

k∑i=4

i∑j=4

ζT−iζT−j+2

)}+O(ε3). (63)



Note that in Eqs. (62) and (63), the summation terms are only valid in instances where the

upper limit is at least as large as the lower limit. For example, when k = 2,

k∑i=3

ζT−i+1 is not

valid and assumed to be equal to zero while

k∑i=2

ηT−i = ηT−2. The corresponding definitions

of αT−k, βT−k, ζT−k and ηT−k are as follows:

αT−k =γ(γ − 1)r2

T−kET−k[F γ−2T−k

]ET−k

[F γT−k

] , (64)

βT−k =γrT−k

ET−k[F γT−k

]×{λT−k+1

∫ sT−k

0F γ−2T−k

[sT−kAT−k − γ

(sT−kAT−k − FT−kAT−k+1

)]×p(sT−k) dsT−k

−µT−k+1

∫ ∞sT−k

F γ−2T−k

[sT−kAT−k − γ



}, (65)

ζT−k =γ

ET−k[F γT−k

]×{λT−k+1

∫ sT−k

0F γ−1T−k


)p(sT−k) dsT−k

−µT−k+1

∫ ∞sT−k

F γ−1T−k


)p(sT−k) dsT−k

}(66)

and

ηT−k =1

ET−k[F γT−k

]×{λ2T−k+1

∫ sT−k

0F γ−2T−k


)2p(sT−k) dsT−k

+µ2T−k+1

∫ ∞sT−k

F γ−2T−k


)2p(sT−k) dsT−k

}. (67)

In the first stage of the perturbation analysis, we have obtained approximations for the

suboptimal value functions V(B)T−k and V

(S)T−k in the buy and sell regions by ignoring the no-

transaction region and perturbing about the no transaction costs solution. The main advan-tage is that we have derived the approximations, albeit suboptimal, at any time step of theinvestment process. However, we do not know how the no-transaction region will affect theterms in these approximations and the corrections that may be required. In the second stageof our analysis, we improve on these preliminary approximations by correcting them as weincorporate the no-transaction region.



5.3 Stage Two: Perturbation about the Suboptimal Value Function

In the second stage of the perturbation analysis, we reintroduce the no-transaction regionin our approximation of the optimal value function. So instead of transacting to the Mertonproportion at all times, the investor will buy to reach the optimal buy boundary when he fallsin the buy region. Correspondingly, the investor will sell to reach the optimal sell boundary inthe sell region and will choose not to trade in the no-transaction region. In the limit of smalltransaction costs, one would expect the optimal buy and sell boundaries to be close to theMerton proportion. Assume that the optimal value function in the case of small transactioncosts is a perturbation about the no transaction costs case and assume that

A−T−k = AT−k + εω−T−k (68)

and

A+T−k = AT−k + εω+

T−k, (69)

where ω−T−k and ω+T−k are of O(1) and yet to be determined. Therefore, in the no-transaction

region A−T−k ≤ AT−k ≤ A+T−k, we have

AT−k = AT−k + εωT−k, (70)

where ω−T−k ≤ ωT−k ≤ ω+T−k. We subsequently demonstrate in our perturbation analysis that

these assumptions are indeed self-consistent in this model.At each time step, we first perturb the optimal value function about the suboptimal value

function in the corresponding buy and sell regions, followed by a further perturbation aboutthe no transaction costs solution. The aim is to correct the suboptimal value function for theterms that were left out when we assumed a strategy of transacting to the Merton proportion.

Recall that for the optimal value function, F(B)T−k, F

(S)T−k and F

(N)T−k are given by Eqs. (29),

(33) and (36) respectively. Since we are first perturbing the optimal value function about thesuboptimal value function, using Eqs. (68), (69) and (70), they are rewritten as

F(B)T−k = F

(B)T−k + εω−T−k (sT−k − rT−k)− ε2λT−kω

−T−krT−k, (71)

F(S)T−k = F

(S)T−k + εω+

T−k (sT−k − rT−k) + ε2µT−kω+T−krT−k (72)

and

F(N)T−k = FT−k + εωT−k (sT−k − rT−k) . (73)

Define the correction term in the buy and sell regions by

δ(B)T−k = V

(B)T−k − V

(B)T−k (74)

and

δ(S)T−k = V

(S)T−k − V

(S)T−k (75)

respectively. After correcting for the optimal value function in the buy and sell regions, we

proceed to obtain an approximation of the optimal value function V(N)T−k in the no-transaction

region, which will be a perturbation about the no transaction costs solution. Applying thedynamic programming principle, this sequence of correction and approximation is achievedrecursively backwards in time by using the estimates of the optimal value functions from thetime steps ahead. As before, the analysis starts at time T −1 before proceeding to the generaltime T − k case.



5.3.1 Time T − 1

In the buy region, recall that V(B)T−1 is given by Eq. (46) and that V

(B)T−1 is given by Eqs. (14)

and (71) as

V(B)T−1 =

1

γET−1

[F

(B)γT−1

](76)

and

V(B)T−1 =

1

γET−1

[{F

(B)T−1 + εω−T−1 (sT−1 − rT−1)− ε2λT−1ω

−T−1rT−1

}γ](77)

respectively. First, we perturb about the suboptimal solution as the correction term is given

by δ(B)T−1 = V

(B)T−1 − V

(B)T−1. Expanding Eq. (77) in powers of ε up to O(ε2) and subtracting Eq.

(76), we obtain

δ(B)T−1 =

1

γET−1

[γF

(B)γ−1T−1

{εω−T−1 (sT−1 − rT−1)− ε2λT−1ω

−T−1rT−1

}+

1

2γ (γ − 1) F

(B)γ−2T−1 ε2ω−2

T−1 (sT−1 − rT−1)2

]+O(ε3). (78)

Next, we perturb about the no transaction costs solution. Substituting F(B)T−1 = FT−1 −

ελT−1rT−1

(AT−1 −AT−1

)into Eq. (78), expanding further in powers of ε and simplifying

with Eqs. (37), (40) and (41), the correction term is found to be

δ(B)T−1 = VT−1ε

2

{[λT−1ω

−T−1AT−1

(AT−1 −AT−1

)+

1

2ω−2T−1

]φT−1

−λT−1ω−T−1γ

}+O(ε3), (79)

where

φT−1 =γ (γ − 1)ET−1

[(sT−1 − rT−1)2 F γ−2

T−1

]ET−1

[F γT−1

] . (80)

In the sell region, note that V(S)T−1 is given by Eqs. (17) and (72) as

V(S)T−1 =

1

γET−1

[{F

(S)T−1 + εω+

T−1 (sT−1 − rT−1) + ε2µT−1ω+T−1rT−1

}γ], (81)

which is equivalent to V(B)T−1 with a change of variable from µT−1 to −λT−1 and from ω+

T−1 to

ω−T−1. Therefore, the correction term in the sell region is immediately deduced from Eq. (79)to be

δ(S)T−1 = VT−1ε

2

{[µT−1ω

+T−1AT−1

(AT−1 − AT−1

)+

1

2ω+2T−1

]φT−1

+µT−1ω+T−1γ

}+O(ε3). (82)

In the no-transaction region, V(N)T−1 is given by Eq. (20) and (73) as

V(N)T−1 =

1

γET−1

[{FT−1 + εωT−1 (sT−1 − rT−1)

}γ]. (83)



Expanding in powers of ε up to O(ε2) and simplifying with Eq. (37),

V(N)T−1 = VT−1

{1 +

1

2ε2ω2

T−1φT−1

}+O(ε3). (84)

Note that time T − 1 is a special case as it is one step before termination of the investmentprocess. We now take one step back to time T − 2 and describe the analysis that is required,followed by the result for the general case.

5.3.2 Time T − k

In this section, we describe the key ideas in deriving the correction and approximation ofthe optimal value function at time T − 2. Details of this perturbation analysis can be foundin B. One of the key observations is that the analysis in the sell region differs from that in thebuy region only by a suitable change of variables. This reduces the analysis that is requiredfor the problem, since one can immediately deduce the result for the sell region from the buyregion by a simple change of variables.

Therefore, we focus our attention to the perturbation analysis in the buy region. From Eq.

(30), V(B)T−2 is expressed in its integral form as a sum of integrals delineated by s−T−2 and s+

T−2.

We note that sT−2 is the leading order term of s−T−2 and s+T−2. Thus, our first objective is

to rewrite V(B)T−2 as a sum of integrals delineated by sT−2, which is achieved by applying the

mean value theorem for integrals. This makes it directly comparable with V(B)T−2 from Eq. (61),

which we have also expressed as a sum of integrals delineated by sT−2. In order to estimate

the correction δ(B)T−2 = V

(B)T−2 − V

(B)T−2, we first perturb the optimal value function V

(B)T−2 about

the suboptimal value function V(B)T−2 by taking their difference and using the approximation

of the optimal value function from the time step ahead (i.e. time T − 1 in this case). This isfollowed by a perturbation about the no transaction costs solution. After some long algebra

and simplification, we will be able to derive the approximation of the correction δ(B)T−2 in

the buy region. The corresponding approximation of the correction term in the sell region

δ(S)T−2 = V

(S)T−2 − V

(S)T−2 can then be immediately deduced from the buy region by a change of

variables from λT−2 to −µT−2 and ω−T−2 to ω+T−2.

After deriving the correction in the buy and sell regions, we proceed to estimate the optimal

value function V(N)T−2 for the no-transaction region, which is given by Eq. (34). Once again,

by applying the mean value theorem for integrals, we can rewrite V(N)T−2 as a sum of integrals

delineated by sT−2. Substituting in estimates of the optimal value function from the time stepahead, expanding in powers of ε and simplifying, we will be able to derive the approximationfor the optimal value function in the no-transaction region.

Having approximated the optimal value function in the buy, sell and no-transaction regionsat time T − 2, we then proceed to the next time step. Applying the dynamic programmingprinciple, the above analysis for the time T − 2 case is repeated recursively backwards in timeat T − 3 using the results from T − 2, at T − 4 using the results from T − 3 and so on. Byinduction, we are then able to derive the results for the general time T − k (k = 2, . . . , T )case, which are stated below. The correction in the buy and sell regions are found to be givenby

δ(B)T−k = VT−kε

2

{[λT−kω

−T−kAT−k

(AT−k −AT−k

)+

1

2ω−2T−k

]φT−k

−λT−kω−T−kγ + ω−T−kψT−k +

k∑i=2

θT−i

}+O(ε3) (85)



and

δ(S)T−k = VT−kε

2

{[µT−kω

+T−kAT−k

(AT−k − AT−k

)+

1

2ω+2T−k

]φT−k

+µT−kω+T−kγ + ω+

T−kψT−k +

k∑i=2

θT−i

}+O(ε3). (86)

The optimal value function in the no-transaction region is given by

V(N)T−k = V

(N)T−k + δ

(N)T−k, (87)

where

V(N)T−k = VT−k

{1 + ε2ωT−kψT−k +

1

2ε2ω2

T−kφT−k + ε

k∑i=2

ζT−i

+ε2

(1

2

k∑i=2


γ

k∑i=3

ζT−iβT−i+1 +

k∑i=4

i∑j=4

ζT−iζT−j+2

)}+O(ε3) (88)

and

δ(N)T−k = VT−kε

2k∑i=2

θT−i +O(ε3). (89)

Recall that αT−k, βT−k, ζT−k and ηT−k are previously defined in Eqs. (64) to (67). Thedefinitions of φT−k, ψT−k and θT−k in the expressions above are as follows:

φT−k =γ (γ − 1)ET−k

[(sT−k − rT−k)2 F γ−2

T−k

]ET−k

[F γT−k

] , (90)

ψT−k =γ

ET−k[F γT−k

]×{λT−k+1

∫ sT−k

0F γ−2T−k

[sT−krT−k + γ (sT−k − rT−k)



−µT−k+1

∫ ∞sT−k

F γ−2T−k

[sT−krT−k + γ (sT−k − rT−k)



}(91)



and

θT−k =1

ET−k[F γT−k

]×

{∫ sT−k

0

[−F γ−1

T−kλT−k+1ω−T−k+1AT−k+1


)φT−k+1

+F γT−k

(1

2ω−2T−k+1φT−k+1 − λT−k+1ω

−T−k+1γ + ω−T−k+1ψT−k+11I{k 6=2}


+

∫ ∞sT−k

[F γ−1T−kµT−k+1ω

+T−k+1AT−k+1


)φT−k+1

+F γT−k

(1

2ω+2T−k+1φT−k+1 + µT−k+1ω

+T−k+1γ + ω+

T−k+1ψT−k+11I{k 6=2}

)]

×p(sT−k) dsT−k

}, (92)

where 1I{k 6=2} denotes an indicator function with respect to the index k.In conclusion, we have devised a systematic method to approximate the optimal value

function in the buy, sell and no-transaction regions at any time step. In this method, weinitially assumed that the investor adopted a suboptimal strategy of transacting to the Mertonproportion in the limit of small transaction costs. The second order approximation of thesuboptimal value function in the buy and sell regions was then derived by perturbing aboutthe no transaction costs solution. However, by ignoring the no-transaction region, we havemissed out on some second order terms in our preliminary approximation. In order to correctour initial approximation in the buy and sell regions, we perturbed the optimal value functionabout the suboptimal value function, followed by a perturbation about the no transactioncosts solution. Thereafter, we derived the approximation for the optimal value function in theno-transaction region by perturbing about the no transaction costs solution. This perturbationscheme was achieved by backwards recursion starting from time T−1. Nonetheless, the optimalbuy and sell boundaries are as yet unknown, which we determine in the next section.

5.4 Approximation of the Optimal Buy and Sell Boundaries

In Sections 5.1 and 5.3, we have derived the approximation of the optimal value function inthe buy, sell and no-transaction regions at each time step. In this section, we verify that theoptimal value function is continuous across the buy and sell boundaries. We further recall thatthe first derivative of the optimal value function should also be continuous across the optimalboundaries. An application of this condition allows us to derive estimates for the optimal buyand sell boundaries. In addition, the results that we obtain serve to demonstrate that theperturbation analysis is indeed self-consistent.

We start by considering the case at time T − 1. Collecting the previous results from Eqs.(55), (57), (79), (82) and (84), the optimal value function in the buy, sell and no-transaction

regions at time T − 1 is given by V(B)T−1 = V

(B)T−1 + δ

(B)T−1, V

(S)T−1 = V

(S)T−1 + δ

(S)T−1 and V

(N)T−1

respectively. Recall that AT−1 = AT−1 + εωT−1 in the no-transaction region. At the optimal



buy boundary, AT−1 = A−T−1 and ωT−1 = ω−T−1. Thus, we can verify that

V(B)T−1 = VT−1

{1 +

1

2ε2ω−2

T−1φT−1

}+O(ε3) = V

(N)T−1. (93)

Similarly, at the optimal sell boundary, where AT−1 = A+T−1 and ωT−1 = ω+

T−1, we can also

verify that V(S)T−1 = V

(N)T−1. Therefore, the optimal value function is continuous across the buy

and sell boundaries up to O(ε2).Note that we have obtained the approximation of the optimal value function only up to

O(ε2). This is so that we can match the first derivative of the optimal value function at theoptimal buy and sell boundaries up to O(ε). If we wish to obtain higher order estimates ofthe optimal boundaries, we will need to estimate the optimal value function beyond O(ε2).By matching the first derivative of the optimal value function at the buy and sell boundaries,we are able to derive the first order approximation of these boundaries. Therefore, in order to

determine the optimal boundaries, differentiate V(B)T−1, V

(S)T−1 and V

(N)T−1 with respect to AT−1

to give

∂V(B)T−1

∂AT−1= VT−1

{ελT−1γ − ε2λ2

T−1αT−1

(AT−1 −AT−1

)−ε2λT−1ω

−T−1AT−1φT−1

}+O(ε3), (94)

∂V(S)T−1

∂AT−1= VT−1

{−εµT−1γ + ε2µ2

T−1αT−1

(AT−1 − AT−1

)+ε2µT−1ω

+T−1AT−1φT−1

}+O(ε3) (95)

and

∂V(N)T−1

∂AT−1= VT−1εωT−1φT−1 +O(ε2). (96)

Note that the first derivative of the optimal value function in the no-transaction region isobtained by recalling that AT−1 and ωT−1 are related via AT−1 = AT−1 + εωT−1.

At the buy boundary (i.e. AT−1 = A−T−1 and ωT−1 = ω−T−1), by equating the coefficients of

ε for∂V

(B)T−1

∂AT−1and

∂V(N)T−1

∂AT−1, we obtain

ω−T−1 =λT−1γ

φT−1. (97)

Similarly, at the sell boundary (i.e. AT−1 = A+T−1 and ωT−1 = ω+

T−1), by equating the coeffi-

cients of ε for∂V

(S)T−1

∂AT−1and

∂V(N)T−1

∂AT−1, we obtain

ω+T−1 =

−µT−1γ

φT−1. (98)

Substituting these estimates back into the optimal value function, we have completed thederivation of its approximation up to O(ε2). In addition, the corresponding buy and sellboundaries are estimated by A−T−1 = AT−1 + εω−T−1 and A+

T−1 = AT−1 + εω+T−1 up to O(ε).

For the general case of time T −k (k = 2, . . . , T ), collecting the results from Eqs. (62), (63),

(85),(86),(88) and (89),the optimal value function in the three regions is given by V(B)T−k =



V(B)T−k + δ

(B)T−k, V

(S)T−k = V

(S)T−k + δ

(S)T−k and V

(N)T−k = V

(N)T−k + δ

(N)T−k. At the optimal buy boundary,

AT−k = A−T−k and ωT−k = ω−T−k. Thus, it can be verified that

V(B)T−k = VT−k

{1 + ε

k∑i=2

ζT−i + ε2k∑i=2

θT−i

+ε2

(1

2

k∑i=2


γ

k∑i=3

ζT−iβT−i+1 +

k∑i=4

i∑j=4

ζT−iζT−j+2

)

+1

2ε2ω−2

T−kφT−k + ε2ω−T−kψT−k

}+O(ε3) = V

(N)T−k. (99)

At the optimal sell boundary, AT−k = A+T−k and ωT−k = ω+

T−k. Similarly, it can also be

verified that V(S)T−k = V

(N)T−k up to O(ε2).

In order to estimate the buy and sell boundaries, we differentiate V(B)T−k, V

(S)T−k and V

(N)T−k

with respect to AT−k, which give us

∂V(B)T−k

∂AT−k= VT−k

{ελT−kγ − ε2λ2

T−kαT−k

(AT−k −AT−k

)

−ε2λT−k

(βT−k − γ

k∑i=3

ζT−i+1

)− ε2λT−kω

−T−kAT−kφT−k

}+O(ε3), (100)

∂V(S)T−k

∂AT−k= VT−k

{−εµT−kγ + ε2µ2

T−kαT−k

(AT−k − AT−k

)

+ε2µT−k

(βT−k − γ

k∑i=3

ζT−i+1

)+ ε2µT−kω

+T−kAT−kφT−k

}+O(ε3) (101)

and

∂V(N)T−k

∂AT−k= VT−k {εψT−k + εωT−kφT−k}+O(ε2). (102)

At the buy boundary (i.e. AT−k = A−T−k and ωT−k = ω−T−k), by equating the coefficients of ε

for∂V

(B)T−k

∂AT−kand

∂V(N)T−k

∂AT−k, we obtain

ω−T−k =λT−kγ − ψT−k

φT−k. (103)

Similarly, at the sell boundary (i.e. AT−k = A+T−k and ωT−k = ω+

T−k), by equating the

coefficients of ε for∂V

(S)T−k

∂AT−kand

∂V(N)T−k

∂AT−k, we obtain

ω+T−k =

−µT−kγ − ψT−kφT−k

. (104)

The corresponding optimal buy and sell boundaries are thus given by A−T−k = AT−k+εω−T−kand A+

T−k = AT−k + εω+T−k up to O(ε). Therefore, in the limit of small transaction costs, we

have obtained the first order approximation of the optimal boundaries at any time T − k.



Observe from Eq. (90) that φT−k is determined by the variables rT−k, sT−k and AT−k at timeT−k. The variable rT−k is deterministic while the random variable sT−k is characterized by itsprobability density function p(sT−k). Observe from Eq. (91) that, in addition to the variablesat time T − k, ψT−k also depends on the variables λT−k+1, µT−k+1 and AT−k+1 at the timestep ahead. Moreover, the Merton proportion AT−k at any time T − k is determined by thespecification of rT−k and p(sT−k) via Eq. (37). Therefore, one concludes that the first orderapproximation of the optimal buy and sell boundaries at each time step essentially dependon the transaction costs, returns of the risk-free asset and returns of the risky asset at thecurrent time step and one time step ahead.

In Section 6, we make a few assumptions that allow us to reduce these expressions for theoptimal boundaries to simpler forms.

6. Results

In order to illustrate our main results with a numerical example, we make the followingassumptions:

•Assume that rT−k is constant in time and say that rT−k = r for all k.

•Assume that sT−k are independent and identically distributed to the random variable s forall k.

•Assume that the costs of buying and selling the risky assets are equal and constant in timeand say that λT−k = µT−k = λ for all k.

We focus on the results in the general case of time T − k (k = 2, . . . , T ) as the time T − 1case is trivial. With these assumptions, we deduce from Eq. (37) that the Merton proportionAT−k = A is a constant, where A satisfies the equation

E[(s− r)

{r + (s− r) A

}γ−1]

= 0. (105)

Here, the expectation E is taken with respect to the random variable s. From Eq. (48), we

simplify the term sT−k = rA(

1− A)/

A(

1− A)

= r. Furthermore, from Eq. (91), we sim-

plify the expression sT−kAT−k− FT−kAT−k+1 = sA−{r + (s− r) A

}A = (s− r) A

(1− A

).

We observe that φT−k = φ and ψT−k = ψ are constants, where

φ =

γ (γ − 1)E[(s− r)2

{r + (s− r) A

}γ−2]

E[{r + (s− r) A

}γ] (106)

and

ψ =γλ

E[{r + (s− r) A

}γ]×

{∫ r

0

{r + (s− r) A

}γ−2 {sr + γ (s− r)2 A

(1− A

)}p(s) ds

−∫ ∞r

{r + (s− r) A

}γ−2 {sr + γ (s− r)2 A

(1− A

)}p(s) ds

}. (107)



Consequently, the first order approximations of the optimal boundaries

A−T−k = A+ ελγ − ψφ

(108)

and

A+T−k = A− ελγ + ψ

φ(109)

are also constants over time. This is not a surprising observation as the Merton proportion,which corresponds to the no transaction costs case, is also a constant. Thus, in the limit ofsmall transaction costs, one might expect the optimal buy and sell boundaries (being nearthe Merton point) to be constants as well. However, note that these results only hold underthe given assumptions. One advantage of this portfolio selection model is that the probabilitydistribution for the return of the risky asset at each time step is generic. In general, if thereturns of the risky asset are not assumed to be identically distributed, then one will notexpect the optimal boundaries to be constants over time.

In order to study the behavior of the optimal boundaries numerically, we consider a simpleexample and assume that the parameters in our model take the values T = 6, γ = 0.1, r = 1.05and the random variable s is given by its probability density function p(s) = 0.7×δ(s−1.5)+0.3× δ(s− 0.5), where δ(.) is the Dirac delta function.

In Fig. 1, the transaction costs are allowed to vary from 0 to 0.03. This figure shows therelationship between the optimal boundaries at the initial time and transaction costs. Theexact boundaries are obtained by implementing the dynamic programming algorithm numer-ically while the approximate boundaries are given by Eqs. (108) and (109). When transactioncosts are 0, the boundaries converge to the Merton proportion and the no-transaction regiondisappears. Therefore, in the absence of transaction costs, the investor will always trade inthe risky asset to reach the Merton proportion. When transaction costs increase, the width(difference between the buy and sell boundaries) of the no-transaction region increases andthe investor is less likely to trade in the risky asset. In practice, the level of transaction costsincurred in the financial markets are typically between 0.003 and 0.004. From this example, itcan be observed that the approximate boundaries are good estimates of the exact boundariesin the limiting case of small transaction costs.

In Fig. 2, we assume that the transaction costs are fixed at 0.004 and we vary γ from 0.1 to0.3. The level of risk aversion of the investor, which is measured by (1− γ), therefore variesfrom 0.7 to 0.9. In this figure, we investigate how the investor’s risk aversion affects the Mertonproportion and the optimal boundaries at the initial time as approximated by Eqs. (108) and(109). It is observed that the Merton proportion and the optimal boundaries are inverselyrelated to the level of risk aversion of the investor. As the investor becomes increasingly riskaverse, he would prefer to hold less of the risky asset and more of the risk-free asset, whichexplains the decrease in the Merton proportion. Although the width of the no-transactionregion appears to remain the same, the effects of increasing the investor’s risk aversion areseen by the sell region that is widening and the buy region that is narrowing. This means thatthere is a greater tendency for the investor to sell rather than to buy the risky asset when hislevel of risk aversion is high.

In Fig. 3, we assume that transaction costs are 0.004 and γ = 0.1. This figure shows the linearrelationship between the optimal holdings in the risky asset and the wealth of the investor.Recall the parametrization aT−k = AT−kWT−k, where aT−k is the dollar value invested inthe risky asset, AT−k is the proportion of wealth invested in the risky asset and WT−k is thewealth of the investor. The Merton line a0 = A0W0 corresponds to the optimal value to beinvested in the risky asset when there are no transaction costs. The buy boundary a0 = A−0 W0

and the sell boundary a0 = A+0 W0 delineate the no-transaction region, which in this case is



0 0.005 0.01 0.015 0.02 0.025 0.030.5

0.55

0.6

0.65

0.7

0.75

0.8

0.85

Transaction Costs

Pro

port

ion

of R

isky

Ass

et

Exact Buy BoundaryExact Sell BoundaryApproximate Buy BoundaryApproximate Sell Boundary

No−Transaction Region

Buy Region

Sell Region

Figure 1. Comparison of Exact and Approximate Boundaries

0.7 0.72 0.74 0.76 0.78 0.8 0.82 0.84 0.86 0.880.65

0.7

0.75

0.8

0.85

0.9

Risk Aversion

Prop

ortio

n of

Ris

ky A

sset

Buy BoundarySell BoundaryMerton Proportion

Sell Region

Buy Region

Figure 2. Optimal Boundaries vs Risk Aversion

narrow since the transaction costs are small. As the investor’s wealth increases, observe thatthe optimal holdings in the risky asset (in dollar value terms) also increase as depicted by theincreasing buy and sell boundaries, which is what one would expect in practice. Therefore,this observation represents a more realistic description of an investor’s behavior as comparedto the exponential utility function, where the Merton line, buy and sell boundaries are foundto be independent of the investor’s wealth.

In conclusion, we have devised a perturbation method that allows one to obtain approxi-mations of the optimal value function and optimal boundaries to an arbitrary number of timesteps in the portfolio selection model. Therefore, when transaction costs are small, one doesnot need to compute the optimal value function and optimal boundaries via the dynamic pro-gramming algorithm, which becomes computationally intensive as the number of time steps


REFERENCES 27

0 10 20 30 40 50 60 70 80 90 1000

10

20

30

40

50

60

70

Wealth

Val

ue o

f R

isky

Ass

et

Buy BoundarySell BoundaryMerton Line

Buy Region

Sell Region

Figure 3. Optimal Boundaries vs Wealth

increases. Furthermore, one has the flexibility to specify the probability distribution of thereturns of the risky asset as we have kept it generic in the model.

References

Akian, M., Menaldi, J. L. and Sulem, A. (1996) On an Investment-Consumption Model with Transaction Costs, SIAMJournal on Control and Optimization, 34(1), pp. 329–364.

Atkinson, C. and Al-Ali, B. (1997) On an investment-consumption model with transaction costs: an asymptomaticanalysis., Applied Mathematical Finance, 4(2), p. 109.

Atkinson, C. and Alexandropoulos, C. (2006) Pricing a European basket option in the presence of proportional transactioncosts, Applied Mathematical Finance, 13(3), pp. 191–214.

Atkinson, C. and Ingpochai, P. (2006) The influence of correlation on multi-asset portfolio optimization with transactioncosts, Journal of Computational Finance, 10(2), pp. 53–96.

Atkinson, C. and Ingpochai, P. (2010) Optimization of N-risky asset portfolios with stochastic variance and transactioncosts, Quantitative Finance, 10(5), pp. 503–514.

Atkinson, C. and Ingpochai, P. (2012) The Effect of Correlation and Transaction Costs on the Pricing of Basket Options,Applied Mathematical Finance, 19(2), pp. 131–179.

Atkinson, C. and Mokkhavesa, S. (2001) Towards the determination of utility preference from optimal portfolio selections,Applied Mathematical Finance, 8(1), pp. 1–26.

Atkinson, C. and Mokkhavesa, S. (2003) Intertemporal portfolio optimization with small transaction costs and stochasticvariance, Applied Mathematical Finance, 10(4), pp. 267–302.

Atkinson, C. and Mokkhavesa, S. (2004) Multi-asset portfolio optimization with transaction cost., Applied MathematicalFinance, 11(2), pp. 95–123.

Atkinson, C. and Wilmott, P. (1995) Portfolio Management with Transaction Costs: An Asymptotic Analysis of theMorton and Pliska Model, Mathematical Finance, 5(4), pp. 357–367.

Atkinson, C. and Papakokkinou, M. (2005) Theory of optimal consumption and portfolio selection under a Capital-at-Risk (CaR) and a Value-at-Risk (VaR) constraint, IMA Journal of Management Mathematics, 16(1), pp. 37–70.

Atkinson, C. and Quek, G. (2012) Dynamic Portfolio Optimization in Discrete-Time with Transaction Costs, AppliedMathematical Finance, 19(3), pp. 265–298.

Atkinson, C. and Storey, E. (2010) Building an Optimal Portfolio in Discrete Time in the Presence of Transaction Costs.,Applied Mathematical Finance, 17(4), pp. 323–357.

Balduzzi, P. and Lynch, A. W. (1999) Transaction costs and predictability: Some utility cost calculations, Journal ofFinancial Economics, 52(1), pp. 47–78.

Bayer, C. and Veliyev, B. (2014) Utility maximization in a binomial model with transaction costs: A duality approachbased on the shadow price process, International Journal of Theoretical and Applied Finance, 17(04), p. 1450022.

Bobryk, R. V. and Stettner, L. (1999) Discrete Time Portfolio Selection with Proportional Transaction Costs, Probabilityand Mathematical Statistics, 19(2), pp. 235–248.

Cox, J. C., Ross, S. A. and Rubinstein, M. (1979) Option pricing: A simplified approach, Journal of Financial Economics,7(3), pp. 229–263.

Davis, M. H. A. and Norman, A. R. (1990) Portfolio Selection with Transaction Costs, Mathematics of OperationsResearch, 15(4), pp. 676–713.


28 REFERENCES

Gerhold, S., Muhle-Karbe, J. and Schachermayer, W. (2012) Asymptotics and duality for the Davis and Norman problem,Stochastics An International Journal of Probability and Stochastic Processes, 84(5-6), pp. 625–641.

Janecek, K. and Shreve, S. E. (2004) Asymptotic analysis for optimal investment and consumption with transactioncosts, Finance and Stochastics, 8, pp. 181–206.

Lynch, A. W. and Balduzzi, P. (2000) Predictability and transaction costs: The impact on rebalancing rules and behavior,The Journal of Finance, 55(5), pp. 2285–2309.

Magill, M. J. P. and Constantinides, G. M. (1976) Portfolio selection with transactions costs, Journal of EconomicTheory, 13(2), pp. 245–263.

Merton, R. C. (1969) Lifetime Portfolio Selection under Uncertainty: The Continuous-Time Case, The Review of Eco-nomics and Statistics, 51(3), pp. 247–257.

Mokkhavesa, S. and Atkinson, C. (2002) Perturbation solution of optimal portfolio theory with transaction costs for anyutility function, IMA Journal of Management Mathematics, 13(2), pp. 131–151.

Morton, A. J. and Pliska, S. R. (1995) Optimal Portfolio Management With Fixed Transaction Costs, MathematicalFinance, 5(4), pp. 337–356.

Mossin, J. (1968) Optimal Multiperiod Portfolio Policies, The Journal of Business, 41(2), pp. 215–229.Samuelson, P. A. (1969) Lifetime Portfolio Selection By Dynamic Stochastic Programming, The Review of Economics

and Statistics, 51(3), pp. 239–246.Sass, J. (2005) Portfolio optimization under transaction costs in the CRR model., Mathematical Methods of Operations

Research, 61(2), pp. 239–259.Shreve, S. E. and Soner, H. M. (1994) Optimal Investment and Consumption with Transaction Costs, The Annals of

Applied Probability, 4(3), pp. 609–692.Soner, H. M. and Touzi, N. (2013) Homogenization and asymptotics for small transaction costs, SIAM Journal on

Control and Optimization, 51(4), pp. 2893–2921.Taksar, M., Klass, M. J. and Assaf, D. (1988) A Diffusion Model for Optimal Portfolio Selection in the Presence of

Brokerage Fees, Mathematics of Operations Research, 13(2), pp. 277–294.

Appendix A. Remainder Term of V(B)T−1

In this appendix, we derive a bound on the remainder term of V(B)T−1. Applying Taylor’s The-

orem to Eq. (54),

V(B)T−1 =

1

γET−1

[F γT−1 − γF

γ−1T−1ελT−1rT−1

(AT−1 −AT−1

)+

1

2γ (γ − 1) F γ−2

T−1ε2λ2

T−1r2T−1

(AT−1 −AT−1

)2

−1

6γ (γ − 1) (γ − 2) ε3λ3

T−1r3T−1

(AT−1 −AT−1

)3

×{FT−1 − ξλT−1rT−1

(AT−1 −AT−1

)}γ−3], (A1)

where 0 < ξ < ε. Therefore, the absolute value of the remainder is∣∣∣R(B)T−1

∣∣∣ =1

6(γ − 1) (γ − 2) ε3λ3

T−1r3T−1

(AT−1 −AT−1

)3

×ET−1

[{FT−1 − ξλT−1rT−1

(AT−1 −AT−1

)}γ−3]. (A2)

Now, since FT−1 = rT−1 + (sT−1 − rT−1) AT−1, we have

ET−1

[{FT−1 − ξλT−1rT−1

(AT−1 −AT−1

)}γ−3]

=

∫ ∞0

{rT−1 + (sT−1 − rT−1) AT−1 − ξλT−1rT−1

(AT−1 −AT−1

)}γ−3

×p(sT−1) dsT−1. (A3)


REFERENCES 29

For γ < 1, note that{rT−1 + (sT−1 − rT−1) AT−1 − ξλT−1rT−1

(AT−1 −AT−1

)}γ−3is a de-

creasing function of sT−1 and has a maximum at sT−1 = 0. Hence,

ET−1

[{FT−1 − ξλT−1rT−1

(AT−1 −AT−1

)}γ−3]

<

∫ ∞0

{rT−1

(1− AT−1

)− ξλT−1rT−1

(AT−1 −AT−1

)}γ−3p(sT−1) dsT−1

= rγ−3T−1

{(1− AT−1

)− ξλT−1

(AT−1 −AT−1

)}γ−3. (A4)

Since 0 < ξ < ε, we obtain a bound for∣∣∣R(B)T−1

∣∣∣ < 1

6(γ − 1) (γ − 2) ε3λ3

T−1rγT−1

(AT−1 −AT−1

)3

×{(

1− AT−1

)− ελT−1

(AT−1 −AT−1

)}γ−3. (A5)

Note that even though γ − 3 < −2, this bound is finite since we have assumed that 0 <(1− AT−1

)− ελT−1

(AT−1 −AT−1

)< 1.

Appendix B. Derivation of δ(B)T−2, δ

(S)T−2 and V

(N)T−2

In this appendix, we apply perturbation analysis to derive the correction in the buy and sellregions and the optimal value function in the no-transaction region.

In the buy region at time T − 2, recall that V(B)T−2 is given by Eq. (30). We also make

the observation that s−T−2 =rT−2A

−T−1

{(1−A−T−2

)− λT−2

(A−T−2 −AT−2

)}A−T−2

(1−A−T−1

) = sT−2 +O(ε)

and that s+T−2 =

rT−2A+T−1

{(1−A−T−2

)− λT−2

(A−T−2 −AT−2

)}A−T−2

(1−A+

T−1

) = sT−2 +O(ε). Therefore,

sT−2 is the leading order term of s−T−2 and s+T−2. which motivates us to rewrite V

(B)T−2 in terms

of integrals delineated by sT−2. Thus, we have

V(B)T−2 =

∫ sT−2

0F

(B)γT−2 V

(B)T−1p(sT−2) dsT−2

+

∫ ∞sT−2

F(B)γT−2 V

(S)T−1p(sT−2) dsT−2

+

∫ sT−2

s−T−2

F(B)γT−2

{V

(N)T−1 − V

(B)T−1

}p(sT−2) dsT−2

+

∫ s+T−2

sT−2

F(B)γT−2

{V

(N)T−1 − V

(S)T−1

}p(sT−2) dsT−2. (B1)

Applying the Mean Value Theorem to the third integral,∫ sT−2

s−T−2

F(B)γT−2

{V

(N)T−1 − V

(B)T−1

}p(sT−2) dsT−2

=(sT−2 − s−T−2

)F

(B)γT−2

{V

(N)T−1 − V

(B)T−1

}p(sT−2) (B2)


30 REFERENCES

evaluated at a point sT−2 ∈(s−T−2, sT−2

), i.e. sT−2 = s−T−2 + O(ε). Recall that when

sT−2 = s−T−2, we have AT−1 = A−T−1 by definition. Therefore, when sT−2 = s−T−2 + O(ε),

we have AT−1 = A−T−1 + O(ε) = AT−1 + O(ε). This implies that the term V(N)T−1 − V

(B)T−1 =

ελT−1

(AT−1 −AT−1

)ET−1

[F γT−1

]+O(ε2) = O(ε2). Since sT−2− s−T−2 = O(ε), we can con-

clude that Eq. (B2) is of O(ε3). Similarly, by applying the Mean Value Theorem to the fourthintegral of Eq. (B1), we can also show that it is of O(ε3). In conclusion, Eq. (B1) can besimplified to

V(B)T−2 =

∫ sT−2

0F

(B)γT−2 V


+

∫ ∞sT−2

F(B)γT−2 V

(S)T−1p(sT−2) dsT−2 +O(ε3). (B3)

Expressing V(B)T−2 in this form allows us to perturb it about the suboptimal value function V

(B)T−2,

which is also in terms of integrals delineated by sT−2. Using Eq. (71) and the approximation

of the optimal value function from time T − 1, express V(B)T−2 as

V(B)T−2 =

∫ sT−2

0

{F

(B)T−2 + εω−T−2 (sT−2 − rT−2)− ε2λT−2ω

−T−2rT−2

}γ{V

(B)T−1 + δ

(B)T−1

}p(sT−2) dsT−2

+

∫ ∞sT−2

{F

(B)T−2 + εω−T−2 (sT−2 − rT−2)− ε2λT−2ω

−T−2rT−2

}γ{V

(S)T−1 + δ

(S)T−1

}p(sT−2) dsT−2 +O(ε3). (B4)

Here, we note that V(B)T−1, V

(S)T−1, δ

(B)T−1 and δ

(S)T−1 are functions of AT−1 =

sT−2A−T−2

F(B)T−2

since the

investor buys to reach the optimal buy boundary A−T−2.In addition, recall that the suboptimal value function given by Eq. (61), is

V(B)T−2 =

∫ sT−2

0F

(B)γT−2 V

(B)T−1p(sT−2) dsT−2 +

∫ ∞sT−2

F(B)γT−2 V

(S)T−1p(sT−2) dsT−2 +O(ε3). (B5)

In this case, V(B)T−1 and V


sT−2AT−2

F(B)T−2

since the investor buys to reach

the Merton proportion AT−2. Adopting a similar approach as the derivation of the correction

δ(B)T−1 at time T −1, we first perturb the optimal value function about the suboptimal solution,

followed by a perturbation about the no transaction costs solution. Subtracting Eq. (B5) fromEq. (B4), expanding in powers of ε and simplifying with Eqs. (37), (40) and (41), it can beshown after much algebra that

δ(B)T−2 = VT−2ε

2

{[λT−2ω

−T−2AT−2

(AT−2 −AT−2

)+

1

2ω−2T−2

]φT−2

−λT−2ω−T−2γ + ω−T−2ψT−2 + θT−2

}+O(ε3), (B6)

where φT−2, ψT−2 and θT−2 are defined in Eqs. (90), (91) and (92) respectively.In the sell region, the correction term can be immediately deduced from that in the buy


REFERENCES 31

region by a change of variables from λT−2 to −µT−2 and from ω−T−2 to ω+T−2 to give us

δ(S)T−2 = VT−2ε

2

{[µT−2ω

+T−2AT−2

(AT−2 − AT−2

)+

1

2ω+2T−2

]φT−2

+µT−2ω+T−2γ + ω+

T−2ψT−2 + θT−2

}+O(ε3), (B7)

In the no-transaction region, recall that V(N)T−2 is given by Eq. (34). Following the same line

of argument as in the buy region, we can show that similar to Eq. (B3),

V(N)T−2 =

∫ sT−2

0F

(N)γT−2 V


+

∫ ∞sT−2

F(N)γT−2 V

(S)T−1p(sT−2) dsT−2 +O(ε3). (B8)

Using Eq. (73) and the approximation of the optimal value function from time T − 1,

V(N)T−2 =

∫ sT−2

0

{FT−2 + εωT−2 (sT−2 − rT−2)

}γ{V

(B)T−1 + δ

(B)T−1

}p(sT−2) dsT−2

+

∫ ∞sT−2

{FT−2 + εωT−2 (sT−2 − rT−2)

}γ{V

(S)T−1 + δ

(S)T−1

}p(sT−2) dsT−2 +O(ε3), (B9)

where V(B)T−1, V

(S)T−1, δ

(B)T−1 and δ


sT−2AT−2

F(N)T−2

since the investor does

not transact in this region where A−T−2 ≤ AT−2 ≤ A+T−2. Let us denote

V(N)T−2 = V

(N)T−2 + δ

(N)T−2, (B10)

where

V(N)T−2 =

∫ sT−2

0

{FT−2 + εωT−2 (sT−2 − rT−2)

}γV


+

∫ ∞sT−2

{FT−2 + εωT−2 (sT−2 − rT−2)

}γV

(S)T−1p(sT−2) dsT−2 (B11)

and

δ(N)T−2 =

∫ sT−2

0

{FT−2 + εωT−2 (sT−2 − rT−2)

}γδ


+

∫ ∞sT−2

{FT−2 + εωT−2 (sT−2 − rT−2)

}γδ

(S)T−1p(sT−2) dsT−2. (B12)

Expanding in powers of ε and simplifying with Eq. (37), it can be shown after some algebrathat

V(N)T−2 = VT−2

{1 + ε2ωT−2ψT−2 +

1

2ε2ω2

T−2φT−2

+εζT−2 +1

2ε2ηT−2αT−1

}+O(ε3) (B13)


32 REFERENCES

and

δ(N)T−2 = VT−2ε

2θT−2 +O(ε3). (B14)

This concludes our derivation of δ(B)T−2, δ

(S)T−2 and V

(N)T−2.

portfolio selection in discrete time with transaction costs ......key words: portfolio optimization,...

Documents