mainbokanowski/... · olivier bokanowski†, stefania maroso‡, and hasnaa zidani§ abstract. this...

26
SOME CONVERGENCE RESULTS FOR HOWARD’S ALGORITHM OLIVIER BOKANOWSKI , STEFANIA MAROSO , AND HASNAA ZIDANI § Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution of min a∈A (B a x-c a ) = 0 where B a is a matrix, c a is a vector, and A is a compact set. We show a global superlinear convergence result, under a monotonicity assumption on the matrices B a . Extensions of Howard’s algorithm for a max-min problem of the form max b∈B min a∈A (B a,b x - c a,b ) = 0 are also proposed. In the particular case of an obstacle problem of the form min(Ax - b, x - g) = 0 where A is an N × N matrix satisfying a monotonicity assumption, we show the convergence of Howard’s algorithm in no more than N iterations, instead of the usual 2 N bound. Still in the case of obstacle problem, we establish the equivalence between Howard’s algorithm and a primal-dual active set algorithm (M. Hinterm¨ uller et al., SIAM J. Optim., Vol 13, 2003, pp. 865-888). The algorithms are illustrated on the discretization of nonlinear PDE’s arising in the context of mathematical finance (American option, and Merton’s portfolio problem), of front propagation problems, and for the double-obstacle problem. Key words. Howard’s algorithm (policy iterations), primal-dual active set algorithm, semi- smooth Newton’s method, superlinear convergence, double-obstacle problem, min-max problem. AMS subject classifications. 49M15, 65K10, 90C47, 91B28 1. Introduction. The main goal of this paper is to study new convergence re- sults of Howard’s algorithm for solving the nonlinear problem (for N 1): Find x R N , min α∈A N (B(α)x c(α)) = 0, (1.1) where A is a non empty compact set, and for every α ∈A N , B(α) is a N × N monotone matrix and c(α) R N (for x, y R N the notation min(x, y) denotes the vector of components min(x i ,y i )). Problems such as (1.1) come from the discretization of Hamilton-Jacobi-Bellman equations [11, 12, 24, 28], from optimal control problems, and many other applications. A simple example is the well known obstacle problem: Find x R N , min(Ax b, x g)=0, where the N × N matrix A and the vectors b and g are fixed. In general, two methods are used to solve (1.1). The first one is a fixed point method called ”value iteration algorithm”. Each iteration of this method is cheap. However, the convergence is linear and one needs a large number of iterations to get a reasonable error. The second method, called Howard’s algorithm or policy iteration algorithm, was developed by Bellman [5, 6] and Howard [17] for solving steady infinite- horizon Markovian dynamic programming problems (MDP). For the problem (1.1), policy iteration algorithm computes the solution x via a sequence of trial values {x k } and policies {α k } under an alternating sequence of policy improvement and * PUBLISHED IN SIAM J. NUMER. ANAL. VOLUME 47, ISSUE 4, PP. 3001-3026 (2009). Laboratoire Jacques-Louis Lions, Universit´ e Paris 6, 75252 Paris Cedex 05, France, and UFR de Math´ ematiques, Universit´ e Paris 7, Case 7012, 75251 Paris Cedex 05, France, [email protected] Projet Tosca, INRIA Sophia-Antipolis, 2004 route des lucioles, BP 93, 06902 Sophia Antipolis Cedex, [email protected] § UMA - ENSTA, 32 Bd Victor, 75015 Paris. Also at projet Commands, CMAP - INRIA Futurs, Ecole Polytechnique, 91128 Palaiseau, [email protected] 1

Upload: others

Post on 29-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

SOME CONVERGENCE RESULTS FOR HOWARD’S ALGORITHM∗

OLIVIER BOKANOWSKI† , STEFANIA MAROSO‡ , AND HASNAA ZIDANI§

Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution ofmina∈A(Bax−ca) = 0 where Ba is a matrix, ca is a vector, and A is a compact set. We show a globalsuperlinear convergence result, under a monotonicity assumption on the matrices Ba. Extensions ofHoward’s algorithm for a max-min problem of the form max

b∈Bmina∈A

(Ba,bx−ca,b) = 0 are also proposed.

In the particular case of an obstacle problem of the form min(Ax − b, x − g) = 0 where A is anN ×N matrix satisfying a monotonicity assumption, we show the convergence of Howard’s algorithmin no more than N iterations, instead of the usual 2N bound. Still in the case of obstacle problem,we establish the equivalence between Howard’s algorithm and a primal-dual active set algorithm (M.Hintermuller et al., SIAM J. Optim., Vol 13, 2003, pp. 865-888).

The algorithms are illustrated on the discretization of nonlinear PDE’s arising in the contextof mathematical finance (American option, and Merton’s portfolio problem), of front propagationproblems, and for the double-obstacle problem.

Key words. Howard’s algorithm (policy iterations), primal-dual active set algorithm, semi-smooth Newton’s method, superlinear convergence, double-obstacle problem, min-max problem.

AMS subject classifications. 49M15, 65K10, 90C47, 91B28

1. Introduction. The main goal of this paper is to study new convergence re-sults of Howard’s algorithm for solving the nonlinear problem (for N ≥ 1):

Find x ∈ RN , min

α∈AN(B(α)x − c(α)) = 0, (1.1)

where A is a non empty compact set, and for every α ∈ AN , B(α) is a N × Nmonotone matrix and c(α) ∈ R

N (for x, y ∈ RN the notation min(x, y) denotes the

vector of components min(xi, yi)). Problems such as (1.1) come from the discretizationof Hamilton-Jacobi-Bellman equations [11, 12, 24, 28], from optimal control problems,and many other applications. A simple example is the well known obstacle problem:

Find x ∈ RN , min(Ax − b, x − g) = 0,

where the N × N matrix A and the vectors b and g are fixed.

In general, two methods are used to solve (1.1). The first one is a fixed pointmethod called ”value iteration algorithm”. Each iteration of this method is cheap.However, the convergence is linear and one needs a large number of iterations to geta reasonable error. The second method, called Howard’s algorithm or policy iterationalgorithm, was developed by Bellman [5, 6] and Howard [17] for solving steady infinite-horizon Markovian dynamic programming problems (MDP). For the problem (1.1),policy iteration algorithm computes the solution x∗ via a sequence of trial valuesxk and policies αk under an alternating sequence of policy improvement and

∗PUBLISHED IN SIAM J. NUMER. ANAL. VOLUME 47, ISSUE 4, PP. 3001-3026 (2009).† Laboratoire Jacques-Louis Lions, Universite Paris 6, 75252 Paris Cedex 05, France, and UFR de

Mathematiques, Universite Paris 7, Case 7012, 75251 Paris Cedex 05, France, [email protected]‡Projet Tosca, INRIA Sophia-Antipolis, 2004 route des lucioles, BP 93, 06902 Sophia Antipolis

Cedex, [email protected]§UMA - ENSTA, 32 Bd Victor, 75015 Paris. Also at projet Commands, CMAP - INRIA Futurs,

Ecole Polytechnique, 91128 Palaiseau, [email protected]

1

Page 2: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

2 O. Bokanowski, S. Maroso, and H. Zidani

policy evaluation steps:

αk+1 = arg minα∈AN

[B(α)xk − c(α)

](policy improvement),

xk+1 ∈ RN is a solution of B(αk+1)x − c(αk+1) = 0 (policy evaluation).

The alternating process corresponds roughly to a splitting of the minimum operatorand the linear system.

Puterman and Brumelle [35] were among the first to analyze the convergenceproperties for policy iteration for MDP with continuous state and control spaces. Inthis framework, they showed that the algorithm is equivalent to a Newton’s method,but under very restrictive assumptions which are not easily verifiable. Recently, Rustand Santos [37] obtained local and global superlinear convergence results for Howard’salgorithm applied to a class of stationary infinite-horizon MDP (under additionalregularity assumptions, the authors even obtain up to quadratic convergence; see also[36]).

In the first part of this paper (Sections 2 & 3), we analyse Howard’s algorithmfor a general class of finite dimensional problems in the form of (1.1).

When A is finite (finite actions), Howard’s algorithm converges in at most (Card(A))N

iterations. The idea behind this result is that policy iteration has some rough sim-ilarities to the simplex algorithm of linear programming (LP). Just as the simplexalgorithm generates a sequence of improving trial solutions to the LP problem, policyiteration generates an improving sequence of decision rules to the nonlinear problem(1.1). Analogously to LP, where the number of vertices is finite, the number of possi-ble policies is also finite and is bounded by (Card(A))N . This, together with the factthat the policy iteration method is monotone imply that the algorithm does not cycleand it converges in finite number of steps.

Under monotonicity assumption on the matrices B(α), we prove that, when Ais an infinite compact set, Howard’s algorithm converges superlinearly. This resultis connected to an interpretation of Howard’s algorithm as a semismooth Newton’smethod for solving F (x) = 0, where F (x) = minα∈AN (B(α)x − c(α)).

Concept of semismoothness was introduced by Mifflin [26] for real-valued func-tions defined on finite-dimensional spaces. Qi et al. [33, 34] extended this notion tomappings between finite-dimensional spaces and showed that, although the underlyingmapping is in general nonsmooth, Newton’s method can be generalized to semismoothequations and converges locally with a superlinear rate to a regular solution (see also[22, 23, 10]).

In the second part of the paper (Section 4), we focus the study on the obstacleproblem. More precisely, we are interested in the following problem (for N ≥ 1):

find x ∈ RN , min(Ax − b, x − g) = 0, (1.2)

where A is a monotone1 N × N matrix, b and g are in RN . This problem can be

written in the form of (1.1), with Card(A) = 2. We prove that a specific formulationof Howard’s algorithm for problem (1.2) converges in no more than N iterations,instead of the expected 2N bound. Also, we prove that starting from particularinitial iterate, the number of iterations can be less than N (see Theorem 4.2). Eventhough in numerical tests one can observe that Howard’s algorithm converges in a few

1A matrix A is said monotone if it is invertible and A−1 ≥ 0 (componentwise).

Page 3: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

Convergence results for Howard’s algorithm 3

iterations, the number of iterations needed to get the solution is in general dependentof the dimension N. We refer to [36] for a counter example given by J. N. Tsitsiklis,showing that the number of policy iteration steps cannot be bounded by a constantthat is independent of N .

On the other hand, we prove that in the context of equation (1.2), Howard’salgorithm is equivalent to the primal-dual active set method studied in [16, 7, 18].We also illustrate the theoretical convergence on two numerical applications comingfrom finance (pricing of American options, and Merton’s portfolio problem).

In section 5 we study a generalization of policy iteration algorithm to max-min prob-lems stated as follows (for N ≥ 1):

find x ∈ RN , max

β∈BNmin

a∈AN(B(α, β)x − c(α, β)) = 0 (1.3)

(for assumptions on B and c, see Section 5.1). This equation comes also from thediscretization of some optimal control problems, and specially from the discretizationof Isaac’s equations associated to the control of differential games. In the context of(1.3), a straightforward policy iteration method would be:

(i) (βk+1, αk+1) := arg maxβ∈BN

minα∈AN

(B(α, β)xk − c(α, β)

)

(ii) Compute xk+1 ∈ RN , solution of B(αk+1, βk+1)x − c(αk+1, βk+1) = 0

This algorithm can be showed to be equivalent to a Newton-like method for solving(1.3). However, since the max−min operator is neither convex nor concave, thisalgorithm may not converge (see Remark 5.7).

In this paper, we give other extensions of the policy iteration method whichare superlinearly convergent (see Algorithm (Ho-3) and (Ho-4)), are computationalyfeasible, and which are not Newton-like methods. Numerical tests are performed tosolve a max-min problem coming from front propagation with mean curvature motion[21]. We also give some numerical comments on the double-obstacle problem (of theform of: max(min(Ax − b, x − g), x − h) = 0).

2. Notations and preliminaries. Let N be a fixed integer. We denote by Mthe set of real-valued N × N matrices, and I the set 1, · · · , N. Throughout thepaper, for x ∈ R

N and A ∈ M, we use the following usual norms:

‖x‖ = supi∈I

|xi|, ‖A‖ := supi∈I

j∈I

|Aij |.

For every x, y ∈ RN , the notation y ≥ x means that yi ≥ xi, ∀i ∈ I. We also

denote by min(x, y) (resp. max(x, y)) the vector with components min(xi, yi) (resp.max(xi, yi)). Moreover, if (xa)a∈A is a bounded subset of X , we denote by min

a∈A(xa)

(resp. maxa∈A

(xa)) the vector of components mina∈A

(xai ) (resp. max

a∈A(xa

i )).

Let (A, d) be a nonempty compact metric set. For each α ∈ AN , the notationsc(α) and B(α) will refer respectively to a vector in R

N and a matrix in M associatedto the variable α.

Statement of the problem and motivations. We consider the problem to findx ∈ R

N (for a fixed N ≥ 1), solution of the equation (1.1), i.e.

find x ∈ RN , min

α∈AN

(B(α)x − c(α)

)= 0.

Page 4: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

4 O. Bokanowski, S. Maroso, and H. Zidani

Throughout the paper, it is furthermore assumed thatfor all α = (α1, . . . , αN ) ∈ AN , ∀i, j, Bij(α) depends only on αi.

Let us point out that the min operation in equation (1.1) is understood compo-nentwise. Hence by setting, for every a ∈ A,

αa := (a, · · · , a)T ∈ AN , Ba := B(αa), and ca = c(αa),

we get an equivalent reformulation of (1.1) as:

mina∈A

(Bax − ca

)= 0. (2.1)

(Conversely, if Ba is given, and α = (α1, . . . , αN ), we can recover B(α) by settingB(α)i,j := Bαi

i,j).Throughout all the paper, we use an important ”monotonicity” assumption on

the matrices (recall that a matrix A ∈ M is said monotone if and only if A is invertibleand A−1 ≥ 0 (componentwise)). More precisely, we assume that:

(H1) for every α ∈ AN , the matrix B(α) is monotone.(H2) when A is an infinite compact set, the functions α ∈ AN 7−→ B(α) ∈ M

and α ∈ AN 7−→ c(α) ∈ RN are continuous.

Assumption (H2) is satisfied whenever the functions a ∈ A 7−→ Baij are continuous

for every i, j ∈ I.Example 1. A simple example of problem (1.1) is the well-known obstacle problem:

find x ∈ RN , min(Ax − b, x − g) = 0, (2.2)

where A ∈ RN×N and b, g ∈ R

N are given. Indeed, by setting

A = 0, 1, B0 = A, B1 = IN , c0 = b, c1 = g,

(IN being the N × N identity matrix), it is clear that (2.2) is a particular case of(2.1). Also it can be setting in the form of min

α∈A(B(α)x − c(α)) = 0, with

Bij(α) :=

Aij if αi = 0δij if αi = 1

and ci(α) :=

bi if αi = 0gi if αi = 1

(2.3)

where δij = 1 if i = j, and δij = 0 if i 6= j. In the particular case when A =tridiag(−1, 2,−1), A is monotone and one can show that (H1) is satisfied. Also, ifA is a matrix such that any submatrix of the form (Aij)i,j∈I where I ⊂ 1, . . . , N ismonotone, and with Aij ≤ 0 ∀i 6= j, then assumption (H1) holds.

When A is a strictly dominant M -matrix of RN×N (i.e., Aij ≤ 0 ∀j 6= i, and

∃δ > 0, ∀i, Aii ≥ δ +∑

j 6=i

|Aij |) then (H1) is true (since all B(α) are all strictly

dominant M -matrices, and are thus monotone).Example 2. Let us consider a controlled Markov chain on a discrete state space

G. Let Xq, q ≥ 0 be the states of the Markov chain at time q, with given transitionprobabilities denoted p(x, y|a), where p(x, y|a) is the probability to move from x to yunder the action of a (a ∈ A is the canonical control variable). Let h be the time step,and E

ux be the conditional expectation given that X0 = x and an admissible control u

is used.We consider the optimal control problem (see [24]):

ϑ(xk) = minu=(uq), uq∈A

hEuxk

q≥0

ℓ(Xq, uq)(1 + λh)−q−1

, (2.4)

Page 5: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

Convergence results for Howard’s algorithm 5

where xk = X0 is the first state, and (uq)q≥0 represents the control action for thechain at discrete time q. Then the dynamic programming equation for the controlledchain Xq, q ≥ 0 and the cost (2.4) yields to:

ϑ(xk) = (1 + λh)−1 mina∈A

[hℓ(xk, a) +

xl∈G

p(xk, xl|a)ϑ(xl)

](2.5)

for xk ∈ G. This can be rewritten in the form

(1 + λh)V = mina∈A

[C(a) +

xl∈G

P(a)V·

](2.6)

where V := (ϑ(xk))xk∈G. Similar discrete control problems can also be obtained by dis-cretization of dynamic programming principle for continuous stochastic control prob-lems, see for instance [30, 24, 11, 28, 8].

Howard’s algorithm (Ho-1). For problem (1.1), the algorithm is as follows:

Initialize α0 in AN ,Iterate for k ≥ 0:

(i) find xk ∈ RN solution of B(αk)xk = c(αk).

If k ≥ 1 and xk = xk−1, then stop. Otherwise go to (ii).(ii) αk+1 := argminα∈AN

(B(α)xk − c(α)

).

Set k := k + 1 and go to (i).Under assumption (H1), for every α ∈ AN , the matrix B(α) is monotone, thus thelinear system in iteration (i) of Howard’s algorithm is well defined and has a uniquesolution xk ∈ R

N .In order to compute αk+1 (according to step (ii) of Ho-1) in a fast way, it is useful

to use the equivalence of the two formulations (1.1) and (2.1), and then remark thatαk+1 can also be given by:

αk+1 = argmina∈A(Baxk − ca).

For instance, in the case of obstacle problem (2.2), step (ii) of the algorithm amountstaking

αk+1i = 0 if [Axk − b]i < xk

i − gi, αk+1i = 1 if [Axk − b]i > xk

i − gi,

αk+1i ∈ 0, 1 if [Axk − b]i = xk

i − gi.

Theorem 2.1. Assume that (H1)-(H2) hold. Then there exists a unique x∗ inR

N solution of (1.1). Moreover, the sequence (xk) given by Howard’s algorithm (Ho-1) satisfies(i) xk ≤ xk+1 for all k ≥ 0.(ii) When A is finite, the algorithm converges in at most (Card(A))N iterations (i.e.,xk = x∗ for some k ≤ (Card(A))N ).(iii) If A is an infinite compact set, then xk → x∗ when k tends to +∞.

Actually, we shall prove in the next section that we have the stronger convergenceresult (Theorem 3.4):

limk→∞

‖xk+1 − x∗‖‖xk − x∗‖ = 0, for any x0 ∈ R

N .

Page 6: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

6 O. Bokanowski, S. Maroso, and H. Zidani

The convergence of Howard’s algorithm is well known, see for instance [9]. Thestatement (i) of Theorem 2.1 is proved in the literature under strong assumptions,see for instance [37]. We give here a very simple proof based on the monotonicityassumption (H1).

Proof of Theorem 2.1. We first start by proving the uniqueness of x∗, while the exis-tence of x∗ will be shown separately by proving (ii) and (iii).

Consider x, y ∈ RN two solutions of (2.1), and let α ∈ AN be a minimizer

associated to y:

αy := argminα∈AN (B(α)y − c(α)) .

Then, we have

B(αy)y − c(αy) = minα∈AN

(B(α)y − c(α)) = 0

= minα∈AN

(B(α)x − c(α))

≤ B(αy)x − c(αy).

(2.7)

which gives that B(αy)(y − x) ≤ 0, and since B(αy) is monotone, we get that y ≤ x.In the same way, we have also y ≥ x. Therefore, y and x coincide.

(i) To prove that (xk)k is an increasing sequence, we use:

B(αk+1)xk − c(αk+1) = minα∈AN

(B(α)xk − c(α))

≤ B(αk)xk − c(αk)= 0= B(αk+1)xk+1 − c(αk+1).

Then by the monotonicity assumption (H1) we obtain the result.(ii) Assume that A is finite. Hence there is at most (Card(A)N ) different variables

α ∈ AN . Then there exist two indices k, ℓ such that 0 ≤ k < ℓ ≤ (Card(A)N ) andαk = αℓ (αk and αℓ being respectively the kth and ℓth iterate of the Howard’salgorithm). Hence xℓ = xk, and since xk ≤ xk+1 ≤ · · · ≤ xℓ we obtain also thatxk = xk+1. Therefore, Howard’s algorithm stops at the (k + 1)th iteration. Thisproves that xk+1 = xk is a solution of (1.1), since F (xk) = B(αk+1)xk − c(αk+1) =B(αk+1)xk+1 − c(αk+1) = 0.

(iii) We now consider the case when A is an infinite compact set. By the step (i)of the algorithm we have

‖xk‖ ≤ ‖B−1(αk)c(αk)‖≤ max

α∈AN‖B−1(α)‖ ‖c(α)‖. (2.8)

By assumptions (H1)-(H2), the function B−1(·) is continuous on AN . Moreover, c(·)is continuous, and AN is a compact set. Therefore, from the inequality (2.8), wededuce that (xk)k is bounded in R

N . Hence xk converges towards some x∗ ∈ RN . Let

us show that x∗ is the solution of (1.1).Define the function F for x ∈ X by:

F (x) := minα∈AN

(B(α)x − c(α)). (2.9)

and let Fi(x) be the ith component of F (x), i.e.

Fi(x) = minα∈AN

[B(α)x − c(α)

]i.

Page 7: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

Convergence results for Howard’s algorithm 7

It is obvious that limk→+∞

Fi(xk) = Fi(x

∗). By the compactness of A and using a diag-

onal extraction argument, there exists a subsequence of (αk)k denoted αφk , thatconverges toward some α ∈ AN . Furthermore, with assumption (H2), we have:lim

k→∞(B(αφk )xk)i − (B(α)x∗)i = 0.

Passing to the limit in (B(αφk )xφk − c(αφk))i = 0, we deduce that (B(α)x∗ −c(α))i = 0, for all i ∈ I. On the other hand we have also

Fi(x∗) = lim

k→∞Fi(x

φk−1)

= limk→+∞

(B(αφk)xφk−1 − c(αφk)

)i

= [B(α)x∗ − c(α)]i.

Hence Fi(x∗) = 0, which concludes the proof of (iii) and implies also that x∗ is a

solution of (1.1).

3. Superlinear convergence. In this section, we consider the case when A isan infinite compact set. First let us rewrite Howard’s algorithm for problem (1.1) asa Newton-like method applied to find the zero of the function F : R

N → RN defined

by (2.9). For k ≥ 0, by definitions of αk+1 and xk (the kth iterate of Howard’salgorithm), we have

B(αk+1)xk − c(αk+1) = F (xk), and B(αk+1)xk+1 − c(αk+1) = 0.

Therefore, B(αk+1)(xk − xk+1) = F (xk), and thus

xk+1 = xk − B(αk+1)−1F (xk). (3.1)

Equation (3.1) can be interpreted as an iteration of a semismooth Newton’s method,where B(αk+1) plays the role of a derivative of F at point xk. It is well knownthat Newton’s method exhibits locally a quadratic rate of convergence provided thatthe functional derivative satisfies a certain regular Lipschitz condition which requiresthat the set of minimizers αk is unique. But this uniqueness seems fairly stringentfor B(α). However, we can prove that the function F is differentiable in a generalizedsense, called slant differentiability. This notion was introduced by [33, 34] and allowsto extend the Newton method to a more general class of finite or infinite dimensionalproblems (see also [15, 16]).

Here we first prove the superlinear convergence by using direct arguments. Then,we prove that this result is also connected to the fact that the function F is slantlydifferentiable, and that B(αk+1) is a slant derivative of F at point xk.

Remark 3.1. All the convergence results proved here for finite dimensional prob-lems, can be extended, up to some additional technical assumptions, to the infinitecountable spaces, i.e. to the case when in (1.1) we seek for x ∈ R

N∗

, see [8] for anexample.

For every x ∈ RN , let us denote Ax the set of minimizers associated to F (x), i.e.,

Ax :=

α ∈ AN , B(α)x − c(α) = F (x)

. (3.2a)

For every i ∈ I, we also define the set Ax,i of minimizers associated to the ith com-ponent of F (x), i.e.,

Ax,i :=a ∈ A,

[Bax − ca

]i=[F (x)]i

. (3.2b)

Page 8: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

8 O. Bokanowski, S. Maroso, and H. Zidani

With these notations, it is clear that Ax =∏

i∈I

Ax,i.

Lemma 3.2. Assume that (H1)-(H2) hold. For every x ∈ X and for all i ∈ I,

d(αx+hi , Ax,i) → 0 as h ∈ R

N and ‖h‖ → 0,

with αx+hi ∈ Ax+h,i.

Proof. Let i be in I. Suppose on the contrary that there exists some δ > 0and a subsequence hn ∈ R

N , with ‖hn‖ → 0, such that d(αx+hn

i ,Ax,i) ≥ δ, ∀n ≥ 0.Let Kδ := a ∈ A, d(a,Ax,i) ≥ δ, the function f : A 7−→ R defined by f(a) :=[Bax − ca

]i, and

mδ := infa∈Kδ

f(a).

We note that Kδ is a compact set, hence mδ = f(a) for some a ∈ Kδ. In particulara /∈ Ax,i and thus mδ = f(a) > f(αx

i ). On the other hand, αx+hn

i ∈ Kδ. Thus

f(αx+hn

i ) − f(αxi ) ≥ f(a) − f(αx

i ) > 0. (3.3)

Let C := maxα∈AN ‖B(α)‖. We have that ‖F (y) − F (z)‖ ≤ C‖y − z‖ for everyy, z ∈ R

N . Moreover, (F (x + h) − F (x))i = f(αx+hi ) − f(αx

i ) + (B(αx+h)h)i. Hencef(αx+h

i )− f(αxi ) ≤ 2C‖h‖. Taking h := hn → 0 we obtain a contradiction with (3.3).

Remark 3.3. Of course, one readily sees that αx+hi does not converge necessar-

ily to a minimizer αxi , when h tends to 0. However, by Lemma 3.2, the set-valued

application x 7−→ Ax,i is upper semicontinuous on RN , for every i ∈ I.

The previous Lemma leads to the following convergence result.Theorem 3.4. Assume that (H1)-(H2) are satisfied. Then (1.1) has a unique

solution x∗ ∈ RN , and for any initial iterate α0 ∈ AN , Howard’s algorithm converges

globally ( limk→∞

‖xk − x∗‖ = 0) and superlinearly, i.e.,

‖xk+1 − x∗‖ = o(‖xk − x∗‖

), as k → ∞.

Proof. Existence and uniqueness of the solution, as well as the increasing propertyof the sequence xk have already been proved in Theorem 2.1.

There remains to show the superlinear convergence. We consider hk := xk − x∗

and denote αk+1 := αxk

= αx∗+hk . As in the proof of Theorem 3.8, for all k ≥ 0, wecan find αk,∗ ∈ Ax∗ such that

B(αk+1) − B(αk,∗) → 0 as k → ∞. (3.4)

Using the concavity of the function F and the fact that F (x∗) = 0 we obtain F (xk) ≤B(αk+1)(xk −x∗) (indeed for any α ∈ Ax∗ , i.e. a minimizer associated to x∗, we haveF (x) ≤ F (x∗) + B(α)(x − x∗) = B(α)(x − x∗)). Hence, by monotonicity of B(αk+1),

xk+1 = xk − B(αk+1)−1F (xk)

≥ xk − B(αk+1)−1B(αk,∗)(xk − x∗),

and thus

0 ≥ xk+1 − x∗ ≥ (I − B(αk+1)−1B(αk,∗))(xk − x∗). (3.5)

Page 9: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

Convergence results for Howard’s algorithm 9

By (3.4) and (H1), we obtain I − B(αk+1)−1B(αk,∗)k→+∞−−−−−→ 0 (we also use the fact

that B−1(α) is bounded on A). Then from (3.5), we obtain

0 ≥ xk+1 − x∗ ≥ o(xk − x∗),

and this concludes the proof of superlinear convergence.

Remark 3.5 (quadratic convergence). Note that stronger convergence results canbe obtained under an additional assumption on the dependence of f(x, α) := B(α)x−c(α) with respect to α (as in [37]). For instance assume that A is a compact intervalof R, and that for all 1 ≤ i ≤ N , fi(x, α) = ri(x)α2

i +si(x)αi + ti(x) (i.e. quadratic inαi), where ri > 0, ri(·), si(·) and ti(·) are Lipschitz continuous functions on R

N (seethe example of Section 4.3). In this case, for every x ∈ R

N , a minimizer αx is given

by αxi := argminαi∈Afi(x, α) = PA(− si(x)

2ri) where PA denotes the projection on the

interval A. Hence in the neighborhood of the solution x∗, we obtain that ‖αx−αx∗‖ ≤C‖x− x∗‖ (for some C > 0). This implies that ‖B(αx)−B(αx∗

)‖ ≤ C‖x− x∗‖, andusing (3.5) this leads to a global quadratic convergence result.

Remark 3.6. The assertions of the theorems 2.1 and 3.4 also hold for the problem

find x ∈ X, minα∈

Q

i∈IAi

(B(α)x − c(α)) = 0.

where Ai are non empty compact sets.Although our proof of Theorem 3.4 is based on direct arguments, the superlinear

convergence may also be obtained as a consequence of the slant differentiability of thefunction F and the interpretation of Howard’s algorithm as Newton’s method [16, 34].Let us recall the notion of slant differentiability.

Definition 3.7 (Slant differentiability, [16, Definition 1] ). Let Y and Z be twoBanach spaces. A function F : Y → Z is said slantly differentiable in an open setU ⊂ Y if there exists a family of mappings G : U → L(Y, Z) such that

F(x + h) = F(x) + G(x + h)h + o(h)

as h → 0, ∀x ∈ U . G is called a slanting function for F in U .Theorem 3.8. We assume that (H1)-(H2) hold. The function F defined in (2.9)

is slantly differentiable on RN , with slanting function G(x) = B(αx), for αx ∈ Ax.

Proof. Let x, h be in RN . From the definition of F , for any α ∈ Ax and any

αx+h ∈ Ax+h, we have

F (x) + B(αx+h)h ≤ F (x + h) ≤ F (x) + B(α)h,

and thus

0 ≤ F (x + h) − F (x) − B(αx+h)h ≤ (B(α) − B(αx+h))h≤ ‖B(α) − B(αx+h)‖‖h‖. (3.6)

By Lemma 3.2 there exists αh ∈ Ax such that d(αhi , αx+h

i ) → 0 as ‖h‖ → 0, for allj ∈ I. Thus, by continuity assumption (H2), we obtain

limh→0

‖B(αh) − B(αx+h)‖ = 0, (3.7)

and this concludes the proof.

4. Some applications.

Page 10: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

10 O. Bokanowski, S. Maroso, and H. Zidani

4.1. Obstacle problem, and link with the primal-dual active set strat-egy. We focus now on the obstacle problem introduced in (2.2). We assume that Ais such that assumption (H1) is satisfied.

As said in Example 1, here we have A = 0, 1, and Howard’s algorithm convergesin at most 2N iterations under assumption (H1).

Now we consider the following specific Howard’s algorithm for equation (2.2):

Algorithm (Ho-2)

Initialize α0 in AN := 0, 1N .Iterate for k ≥ 0:

(i) Find xk ∈ RN s.t. B(αk)xk = c(αk).

If k ≥ 1 and xk = xk−1 then stop. Otherwise go to (ii).

(ii) For every i = 1, ..., N , take αk+1i :=

0 if (Axk − b)i ≤ (xk − g)i

1 otherwise.

Set k := k + 1 and go to (i).Let us emphasize on the fact that when (Axk − b)i = (xk − g)i, we make the choiceαk+1

i = 0. Altough this is not necessary to obtain the convergence, in the following weshow that this specific choice leads to a drastic decrease of the number of iterationsneeded to get the convergence of Howard’s algorithm.

Theorem 4.1. We assume that the matrices B(·) defined in (2.3) satisfy (H1).Then Howard’s algorithm (Ho-2) converges in at most N +1 iterations (i.e, xk = xk+1

for some k ≤ N +1). In the case we start with α0i = 1, ∀i = 1, · · · , N , the convergence

holds in at most N iterations (i.e, xk = xk+1 for some k ≤ N).

Note that taking α0i = 1, for every 1 ≤ i ≤ N , implies that x0 = g and hence

the step k = 0 has no cost. The cost of the N iterations really means a cost of Nresolutions of linear systems (from k = 1 to k = N).

Proof of Theorem 4.1. First, remark that x1 ≥ g. Indeed,• if α1

i = 1, then we have (x1 − g)i = 0 (by definition of x1).• if α1

i = 0, then (Ax0 − b)i ≤ (x0 − g)i. Furthermore one of the two terms(Ax0−b)i or (x0−g)i is zero, by definition of x0. Hence (x0−g)i ≥ 0, and (x1−g)i ≥ 0.

We also know, by Theorem 2.1, that xk+1 ≥ xk. This concludes to:

∀k ≥ 1, xk ≥ g. (4.1)

Now if αki = 0 for some k ≥ 1 then (Axk − b)i = 0 and by (4.1) we deduce that

αk+1i = 0. This proves that the sequence (αk)k≥1 is decreasing in AN . Also, it implies

that the set of points Ik := i, (Axk − b)i ≤ (xk − g)i satisfies

Ik ⊂ Ik+1, for k ≥ 0.

Since Card(Ik) ≤ N , there exists a first index k ∈ [0, N ] such that Ik = Ik+1,and we have αk+1 = αk+2. In particular, F (xk+1) = B(αk+2)xk+1 − c(αk+2) =B(αk+1)xk+1 − c(αk+1) = 0, and thus xk+1 is the solution for some k ≤ N . Thismakes at most N + 1 iterations.

In the case α0i = 1, ∀i, we obtain that (αk)k≥0 is decreasing. Hence there exists

a first index k ∈ [0, N ] such that αk = αk+1, and we obtain F (xk) = 0. This is thedesired result.

Next, we consider the algorithm (Ho-2’) which is a variant of Howard’s algorithm(Ho-2) and defined as follows:Algorithm (Ho-2’):

Page 11: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

Convergence results for Howard’s algorithm 11

Start from a given x0. Then compute α1 (by step (ii)) and x1 (bystep (i)), then α2 and x2, and so on, until F (xk) = 0.

For (Ho-2’), we have a specific result, that will be useful when studying theapproximation of American options in Section 4.2.

Theorem 4.2. Assume that the matrices B(·) defined in (2.3) satisfy (H1).(i) Assume that x0 satisfies

∀i ∈ I, x0i > gi ⇒ (Ax0 − b)i ≤ 0, (4.2)

(or, equivalently, min(Ax0 − b, x0 − g) ≤ 0). Then the iterates of (Ho-2’) satisfyxk+1 ≥ xk for all k ≥ 0.(ii) If furthermore x0 ≥ g, and if x∗ denotes the solution of (1.1), then the convergenceis in at most p iterations, with

p := Cardi ∈ I, x∗i > gi − Cardi ∈ I, x0

i > gi.

In other words, (ii) means that ”the number of iterations is bounded by the numberof points that take off the boundary g, between x0 and x∗.”

Proof of Theorem 4.2 (i) The only difficulty is to prove that x1 ≥ x0, otherwise theproof of xk+1 ≥ xk, for k ≥ 1, is the same as in Theorem 4.1. First in the case α1

i = 1,we have x1

i = gi and (Ax0−b)i ≥ (x0−g)i. If (x0−g)i > 0 then (Ax0−b)i > 0, whichcontradicts the assumption on x0. Hence x0

i ≤ gi and thus x0i ≤ x1

i . Otherwise in thecase α1

i = 0, we have (Ax0 − b)i < (x0 − g)i and (Ax1 − b)i = 0. If (Ax0 − b)i > 0then (x0− g)i > 0, which contradicts the assumption on x0. Hence (Ax0 − b)i ≤ 0. Inparticular, (Ax0)i ≤ (Ax1)i. In conclusion, we have the vector inequality B(α1)x0 ≤B(α1)x1. This implies that x0 ≤ x1 using the monotonicity of B(α1).

(ii) Now we turn on the proof of the second assertion. For every k ≥ 0, we set:

Ik := i ∈ I, xki > gi, and mk := Card(Ik),

Since (xk)k is increasing, so is (mk)k. Moreover, (mk) is bounded and thus convergent.Let us assume that mk+1 = mk for some k ≥ 0. In this case, we have Ik+1 = Ik

and for any i ∈ Ik, we have αki = 0 = αk+1

i . On the other hand, for i /∈ Ik, we havexk

i = gi, and xk+1i = gi (here we use the fact that xk ≥ x0 ≥ g). We have also by

the proof of Theorem 4.1 that αki is non-increasing. In particular, if αk

i = 0 thenαk+1

i = 0. Hence the only case when αki 6= αk+1

i is for i /∈ Ik, αki = 1 and αk+1

i = 0.However in this case, we see that (Axk+1 − b)i = 0 = xk+1

i − gi, and thus we have(B(αk)xk+1 − c(αk))i = 0. In particular B(αk)xk+1 − c(αk) = 0, which means thatxk+1 is solution of the same linear system as xk, thus xk+1 = xk, and the algorithmhas converged at iteration k.

Therefore, we have mk+1 ≥ mk +1 until convergence is reached (mk = mk+1 withk = q, for which xq = x). We obtain mq ≥ m0 + q, and this concludes the proof.

Now we turn on showing the relationship with the primal-dual active set algorithmproposed by Hintermuller, Ito and Kunisch [16] (see also [19, 20]). It is well knownthat x is a solution of (2.2) if and only if there exists λ ∈ R

N+ such that

Ax − λ = bC(x, λ) = 0

(4.3)

Page 12: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

12 O. Bokanowski, S. Maroso, and H. Zidani

where

C(x, λ) := min(λ, x − g) = λ − max(0, λ − (x − g))

(note also that λ = P[0,+∞)(λ − (x − g))). The idea developed in [16] is to useC(x, λ) = 0 as a prediction strategy as follows.

Primal-dual active set algorithm

Initialize x0, λ0 ∈ RN .

Iterate for k ≥ 0:(i) Set Ik := i : λk

i ≤ (xk − g)i, ACk := i : λki > (xk − g)i.

If k ≥ 1 and Ik = Ik−1 then stop. Otherwise go to (ii).(ii) Solve

Axk+1 − λk+1 = b,xk+1 = g on ACk, λk+1 = 0 on Ik.

set k = k + 1 and return to (i).The sets Ik and ACk are called respectively the inactive and active sets. Note thatthe algorithm satisfies at each step, λk+1 = Axk+1 − b, and λk+1

i = (Axk+1 − b)i = 0for i ∈ Ik, and (xk+1 − g)i = 0 for i ∈ ACk. This is equivalent to say that

Bxk+1 − c = 0, (4.4)

where

Bi,. :=

Ai,. if (Axk − b)i ≤ (xk − g)i

(IN )i,. otherwise,

ci :=

bi if (Axk − b)i ≤ (xk − g)i

gi otherwise.

and where IN denotes the N × N identity matrix.Let us consider the equivalent formulation min(Ax − b, x − g) = 0, and let B(.)

and c(.) be defined as in (2.3). For k ≥ 0, if we set αk+1i := 0 for i ∈ Ik and αk+1

i := 1otherwise, then we find that αk+1 is defined from xk as in Howard’s algorithm, andwe have B = B(αk+1), c = c(αk+1). Therefore, (4.4) is equivalent to

B(αk+1)xk+1 − c(αk+1) = 0,

and xk+1 is defined from xk as in Howard’s algorithm applied to min(Ax−b, x−g) = 0.(Note also that similar considerations apply if we replace x− g by d(x− g) for a givenconstant d > 0.)

Theorem 4.3. Howard’s algorithm (Ho-2) and primal-dual active set algorithmfor the obstacle problem are equivalent: if we choose λ0 := Ax0 − b initially then thesequences (xk) generated by both algorithms are the same for k ≥ 0.

Remark 4.4. By Theorem 4.1, the primal-dual active set strategy converges inno more that N iterations (taking x0 = g). We refer to [16] for similar observations.

Remark 4.5. In the same way, one can see that the Front-Tracking algorithmproposed in [1, Chapter 6.5.3] for a particular obstacle problem (mainly, the Americanput option) and based on the primal-dual active set method, is equivalent to Howard’salgorithm.

Page 13: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

Convergence results for Howard’s algorithm 13

4.2. An American option. In this section we consider the problem of comput-ing the price of an American put option in mathematical finance. The price u(t, s)for t ∈ [0, T ] and s ∈ [0, Smax] is known to satisfy the following non-linear PDE (see[25] or [31] for existence and uniqueness results in the viscosity framework):

min

(∂tu − 1

2σ2s2∂ssu − rs∂su + ru, u − ϕ(s)

)= 0,

t ∈ [0, T ], s ∈ (0, Smax), (4.5a)

u(t, Smax) = 0, t ∈ [0, T ], (4.5b)

u(0, s) = ϕ(s), s ∈ (0, Smax). (4.5c)

where σ > 0 represents a volatility, r > 0 is the interest rate, ϕ(s) := max(K − s, 0)is the ”Payoff” function (where K > 0, is the ”strike”). The bound Smax should beSmax = ∞, for numerical purpose we consider Smax > 0, large, but finite.

Let sj = jh with h = Smax/Ns and tn = nδt with δt = T/M , where M ≥ 1 andNs ≥ 1 are two integers. Suppose we want to implement the following simple ImplicitEuler (IE) scheme with unknown (Un

j ):

min

(Un+1

j − Unj

δt− 1

2σ2s2

j

(D2Un+1)j

h2− rsj

D+Un+1j

h+ rUn+1

j ;

Un+1j − gj

)= 0, j = 0, . . . , Ns − 1, n = 0, . . . , M − 1,

Un+1Ns

= 0, n = 0, . . . , M − 1,

U0j = gj, j = 0, . . . , Ns − 1

where (D2U)j and (D+U)j are the finite differences defined by

(D2U)j := Uj−1 − 2Uj + Uj+1, (D+U)j := Uj+1 − Uj ,

and with gj := ϕ(sj). Note that for j = 0 the scheme is simply

min

(Un+1

0 − Un0

δt+ rUn+1

0 , Un+10 − g0

)= 0.

(More accurate schemes can be considered, here we focus on the simple (IE) schemefor illustrative purpose mainly.)

It is known that the (IE) scheme converges to the viscosity solution of (4.5) whenδt, h → 0 (one may use for instance the Barles-Souganidis [4] abstract convergence re-sult and monotonicity properties of the scheme). The motivation for using an implicitscheme is for unconditional stability. 2

Now for n ≥ 0, we set b := Un. Thus, the problem to find x = Un+1 ∈ RNs

(i.e, x = (Un+10 , . . . , Un+1

Ns−1)T ) can be written equivalently as min(Ax − b, x− g) = 0,

where A = I + δtQ and Q is the matrix of RNs such that for all j = 0, . . . , Ns − 1:

(QU)j = −1

2σ2s2

j

Uj+1 − 2Uj−1 + Uj+1

h2− rsj

Uj+1 − Uj

h+ rUj ,

2An explicit scheme would need a condition of the form δth2 ≤ const in order to be stable.

Page 14: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

14 O. Bokanowski, S. Maroso, and H. Zidani

assuming UNs= 0. Since A is an M -matrix, assumption (H1) is satisfied and for each

step n of the (IE) scheme, Howard’s algorithm generates a sequence of approximations(xk) of Un+1. We choose to apply the algorithm (Ho-2’) with starting point x0 := Un.Thus, for each n ∈ [0, . . . , M − 1], Howard’s algorithm may need up to Ns iterationsand the total number of Howard’s iterations (for the (IE) scheme) is a priori boundedby M × Ns. Actually, in the present example, if we choose to apply the algorithm(Ho-2’) in each step n with starting point x0 := Un, then the number of Howard’siterations can be improved.

Proposition 4.6. The total number of Howard’s iterations (using algorithm (Ho-2’)) from n = 0 up to n = M − 1, is bounded by Ns. In other words, the resolution of(IE) scheme uses no more than Ns resolutions of linear systems.

Proof. Let us first show that Un+1 ≥ Un by recursion. For n = 0, we haveU1 ≥ g = U0. For n ≥ 1, let us assume that Un ≥ Un−1. We know that x = Un+1 issolution of

min(Ax − Un, x − g) = 0. (4.6)

Therefore min(Ax − Un−1, x − g) ≥ 0. This means that Un+1 is a super-solutionof min(Ax − Un−1, x − g) = 0 (whose solution is Un). By the monotonicity of thematrices B(α), we deduce that Un+1 ≥ Un.

Then, for each n ≥ 0 and with x0 := Un, we have min(Ax0 − Un, x0 − g) ≤min(Ax0 − Un−1, x0 − g) = 0. Hence by Theorem 4.2 and using algorithm (Ho-2’)(initialized with x0 = Un) to solve (4.6), we obtain that Un+1 can be computed inno more than pn iterations, where pn := Cardj, Un+1

j > gj − Cardj, Unj > gj.

Henceforth, the total number of Howard’s iterations to compute U1, . . . , UM isbounded by

n=0,...,M−1

pn = Cardj, UMj > gj − Cardj, U0

j > gj,

which is bounded by Ns.

The result of the above proposition can hold also for monotone implicit scheme formore complex American options (such as American options on two assets, involvinga PDE in two space dimension).

We finally mention that in [20] (see also [1]), the Primal-Dual active set algorithmis used for the approximation of an American option (in a Finite element setting), andconvergence in a finite number of iterations is also observed even though the matricesinvolved are not necessarily monotonous matrices.

4.3. Application to Merton’s problem and with a compact control set.Howard’s algorithm can be useful for the computation of the value functions of optimalcontrol problems. We consider the following example, also known as Merton’s problemin mathematical finance [29] (see for instance [27] or [8] for recent applications ofHoward’s algorithm to solve non-linear equations arising in finance). The problem isto find the optimal value for a dynamic portfolio. At a given time t, the holder mayinvest a part of the portfolio into a risky asset with interest rate µ > 0 and volatilityσ > 0, and the other part into a secure asset with interest rate r > 0. The value

Page 15: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

Convergence results for Howard’s algorithm 15

0 20 40 60 80 100 120 140 160 180 2000

10

20

30

40

50

60

70

80

90

100

PayoffIE scheme: U^n

IE scheme: U^n+1

Figure 4.1. Plot of Unj and Un+1

j (time tn = 0.2) with respect to sj. Parameters: K = 100,σ = 1, r = 0.1, T = 1, Smax = 200, Ns = 50, M = 10.

u = u(t, s), for t ∈ [0, T ] and s ∈ [0, Smax], is a solution of the following PDE:

minα∈A

(∂tu − 1

2σ2α2s2∂ssu − (αµ + (1 − α)r)s∂su

)= 0,

t ∈ [0, T ], s ∈ (0, Smax), (4.7a)

u(0, s) = ϕ(s), s ∈ (0, Smax). (4.7b)

where A := [amin, amax], Smax = +∞, and ϕ is a given Payoff function (see for instanceØksendal [29]). Existence and uniqueness results can be obtained using [32].

In general, the exact solution is not known. For testing purposes, we considerhere the particular case of ϕ(s) = sp for some p ∈ (0, 1). In this case, the analyticsolution (when Smax = +∞) is known to be u(t, s) := eαopt t ϕ(s) where

αopt := maxα∈A

(−1

2p(1 − p)α2 + (αµ + (1 − α)r)p

)

(the constant αopt can be computed explicitly since the functional is quadratic in α).For numerical purpose we consider the problem on a bounded interval [0, Smax] witha finite (large) Smax. In this case we need to add a boundary condition at s = Smax

for completeness of the problem. Since (sp)′

sp = ps, we consider the following mixed

boundary condition:

∂xu(t, Smax) =p

Smaxu(t, Smax), t ∈ [0, T ]. (4.8)

Page 16: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

16 O. Bokanowski, S. Maroso, and H. Zidani

A natural IE scheme for the set of equations (4.7)-(4.8) is:

minα∈A

(Un+1

j − Unj

δt− 1

2σ2s2

jα2Un+1

j−1 − 2Un+1j + Un+1

j+1

h2

−(αµ + (1 − α)r)sj

Un+1j+1 − Un+1

j

h

)= 0,

j = 0, . . . , Ns, n = 0, . . . , M − 1,

Un+1Ns+1 − Un+1

Ns

h=

p

SmaxUn+1

Ns, n = 0, . . . , M − 1,

U0j = ϕ(sj), j = 0, . . . , Ns.

Note that for j = 0 the scheme simply readsUn+1

0 − Un0

δt= 0.

Now for b := Un given (and for a given time iteration n ≥ 0), the computation ofx = Un+1 ∈ R

Ns+1 (i.e, x = (Un+10 , . . . , Un+1

Ns)T ) is equivalent to solve

minα∈A

(Bαx − b) = 0,

where Bα := I + δtQα and Qα is the matrix of R(Ns+1)×(Ns+1) such that, for all

j = 0, . . . , Ns − 1,

(QαU)j =1

2σ2s2

jα2−Uj−1 + 2Uj−1 − Uj+1

h2− (αµ + (1 − α)r)sj

Uj+1 − Uj

h,

and

(QαU)Ns=

1

2σ2S2

max

α2

h2

(−UNs−1 + (1 − hp

Smax)UNs

)− p(αµ + (1 − α)r) UNs

.

We obtain the monotonicity of the matrices Bα under a CFL-like condition on δt, h.3

Remark 4.7. The CFL condition comes from the mixed boundary condition andonly plays a role on the last row of the matrices Bα (for monotonicity of Bα). Notealso that it is of the form δt

h≤ const, which is less restrictive than the CFL condition

we would obtain with an explicit scheme (which is δth2 ≤ const).

In view of the expression of (Bαx)j for a given x, which is quadratic in α, thesecond step of Howard’s algorithm can always be performed analytically (otherwise aminimizing procedure should be considered). This improves considerably the speedfor finding the optimal control α. This step has a negligible cputime with respect tothe first step of Howard’s algorithm where a linear system must be solved.

Remark 4.8. It is clear that αj := argminα∈A(Bαx)j is defined by

αj = PA

((µ − r)d

(1)j

σ2sjd(2)j

),

for 0 < j < Ns, where d(1)j :=

Uj+1−Uj

h, d

(2)j :=

−Uj−1+2Uj−Uj+1

h2 (in the case d(2)j 6= 0),

and where PA is the projection on the interval A.

3Chose h > 0 and δt > 0 such thatδt

hmaxα∈A

1

2σ2α2pSmax + hp(αµ + (1 − α)r)

«

≤ 1.

Page 17: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

Convergence results for Howard’s algorithm 17

In Fig. 4.2, we compare the approximated and exact solutions, and the associatednumerical control obtained with the IE scheme. In this example, the exact optimalcontrol is αopt = 0.625 (constant). Table 4.1 illustrate the quadratic convergencebehavior of the error for solving one time step by Howard’s algorithm. We alsoobserved that, in order to reach convergence, the number of Howard’s iterates keepssmall (about 3), even for large Ns values.

0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.00.0

0.2

0.4

0.6

0.8

1.0

1.2

1.4

1.6

1.8

2.0

ExactEI−HOWARD

0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.00.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

alpha_opt

alpha

Figure 4.2. Plot of (UMj ) (left) and of the discrete optimal control (αj ) at time tM = 1 (right),

with respect to sj. Parameters: Smax = 2, A = [0, 1], p = 12, σ = 0.4, r = 0.1, µ = 0.15, T = 1, and

Ns = 200, M = 20.

Iteration Error ‖xk − xk−1‖k = 1 1.4 10−4

k = 2 3.8 10−9

k = 3 2.0 10−15

Table 4.1Howard’s algorithm for Merton problem: iteration number and error (example at time step

tn = 0.2), for solving one time step evolution of the Implicit Euler scheme. Quadratic convergenceis observed.

Remark 4.9. At each step we have to solve a sparse linear system. Here thesystem is tridiagonal and thus can be solved in O(Ns) elementary operations. Moregenerally, multigrid methods can be considered [2] (see also [3]).

5. Max-min problems. In this section we aim at generalizing Howard’s algo-rithm for a max-min problem:

find x ∈ RN , max

b∈BN

(min

a∈AN

(Ba,bx − ca,b

))= 0. (5.1)

Such a nonlinear equation arizes in game theory and in optimal control problems.Here, we propose computationally feasible algorithms based on policy iterations. Weprove convergence results and analyse a numerical test coming from a two-playersgame problem (see Section 5.2).

Page 18: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

18 O. Bokanowski, S. Maroso, and H. Zidani

Specific convergence properties will be also studied in the context of the doubleobstacle problem.

5.1. A general setting. Let A and B two compact sets. For every (α, β) ∈A×B, Ba,b denotes a N ×N real matrix and ca,b is a vector of R

N . As in Section 2,we introduce the matrices B(α, β) and vectors c(α, β) defined by

Bij(α, β) := Bαi,βi

ij and ci(α, β) := cαi,βi

i ,

for all α = (αi)1≤i≤N ∈ AN , and β = (βi)1≤i≤N ∈ B := BN . We also introduce thefunctions G : R

N → R, and F β : RN → R (for β ∈ BN) defined by:

F β(x) := minα∈AN

(B(α, β)x − c(α, β)) , G(x) := maxβ∈BN

F β(x). (5.2)

The min and max operators are understood componentwise. With these notations,problem (5.1) can be rewritten as: G(x) := maxβ∈BN F β(x) = 0,

A natural generalization of Howard’s algorithm for solving

Find x ∈ RN , max

β∈BNF β(x) = 0, (5.3)

can be stated as follows:

Algorithm (Ho-3). Consider a given β0 ∈ BN . Iterate for k ≥ 0:

(i) Find xk such that F βk

(xk) = 0 (resolution at fixed βk).(ii) Compute βk+1 := argmaxβ∈BN F β(xk).

If G(xk) = F βk+1

(xk) = 0 then stop;Else set k := k + 1 and go to (i).

With the same tools used in the previous sections, we obtain the following Thorem.Theorem 5.1. Let B be a metric compact set. For all 1 ≤ i ≤ N and βi ∈ B, we

consider a function F βi

i : RN → R. For β = (β1, . . . , βN) ∈ BN , we also denote

F β(x) := (F β1

1 (x), . . . , F βN

N (x))T.

We assume that the following assumptions hold(i) for all β ∈ BN , F β is monotone (i.e., ∀x, y ∈ R

N , F β(x) ≤ F β(y) ⇒ x ≤ y)(ii) for all β ∈ BN , there exist x such that F β(x) = 0.(iii) x ∈ R

N | ∃β ∈ BN , F β(x) = 0 is a bounded set.(iv) x → F β(x) is a uniformly continuous function with respect to β ∈ BN .Then, problem (5.3) admits a unique solution x∗. Moreover, the iterates given by(Ho-3) satisfy xk ≥ xk+1, and xk converges to the solution x∗, as k → ∞.

When B is finite, the convergence is obtained in at most Card(B)N iterations (i.e.,xk is a solution for some k ≤ Card(B)N).

Proof. First, we claim that the monotonicity and continuity of the functions F β

imply that G is also monotone. Indeed, suppose that G(x) ≤ G(y), for x, y ∈ RN . Let

βy ∈ BN such that G(y) = F βy(y). Then we have: F βy(x) ≤ G(y) = F βy(y). Usingthe monotonicity of F βy , we obtain x ≤ y. Therefore G is monotone, and problem(5.3) admits at most one solution.

With exactly the same arguments as in the proof of Theorem 2.1(ii), we derive theconvergence of xk to the solution x∗ of (5.3), as k → ∞ (for this, uniform continuityof F β(·), with respect to β ∈ B, is required).

Page 19: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

Convergence results for Howard’s algorithm 19

The convergence result of Theorem 5.1 is stated in a general framework withabstract function F β, not necessarily defined as in (5.2).

In particular, algorithm (Ho-3) can be used to solve efficiently equation (5.1),under monotonicity conditions on matrices B(α, β). In this context, we have also aconvergence result (Super linear convergence does not seems easy in this context, andwill be the subject of further work).

Theorem 5.2. Assume that (α, β) ∈ AN × BN → B(α, β) and (α, β) ∈ AN ×BN → c(α, β) are continuous, and that all matrices B(α, β) are monotone. ThenF β(x) defined as in (5.2) satisfies the assumptions of Theorem 5.1 (and thus we havethe convergence of Howard’s algorithm).

We shall see in the next subsection an example where this convergence result canbe applied.

Proof of Theorem 5.2. Suppose that F β is defined as in (5.2) and that monotonic-ity and continuity assumptions on B(α, β) hold. By Sections 2 & 3, for evey β ∈ B,equation F β(x) = 0 admits a unique solution x satisfying:

‖x‖ ≤ maxα∈AN ,β∈BN

‖B(α, β)−1‖‖c(α, β)‖.

Moreover, straightforward arguments show that, for every β ∈ BN , the functionx → F β(x) is L-Lipschitz continuous with L := maxα∈AN ,β∈BN ‖B(α, β)‖, and F β

is monotone. Then, we apply Theorem 5.1 to conclude that problem (5.3) admits aunique solution and algorithm (Ho-3) converges to this solution.

Remark 5.3. Algorithm (Ho-3) is not a Newton-like method. Indeed, by thedefinitions of βk+1 and xk+1, we have

G(xk) = B(αβk+1

xk , βk+1)xk − c(αβk+1

xk , βk+1),

B(αβk+1

xk+1 , βk+1)xk+1 − c(αβk+1

xk+1 , βk+1) = 0,

The first equation involves a minimizer αβk+1

xk ∈ Aβk+1

xk , while for the second equation

we have a minimizer αβk+1

xk+1 ∈ Aβk+1

xk+1 . Hence the two matrices B(αβk+1

xk , βk+1) and

B(αβk+1

xk+1 , βk+1) in general differ.Notice that direct Newton’s method for the max-min problem leads to

(βk+1, αk+1) := arg maxβ∈BN

minα∈AN

(B(α, β)xk − c(α, β)

)

Compute xk+1 ∈ RN , solution of B(αk+1, βk+1)x − c(αk+1, βk+1) = 0

(equivalently, xk+1 = xk − B(αk+1, βk+1)−1G(xk), and αk+1 = αβk+1

xk ). Since themax-min functional is neither convex nor concave, Newton’s method may not be con-vergent.

Step (i) of algorithm (Ho-3) requires to solve exactly a subproblem F β(x) = 0. Inpractice, this resolution will be done approximately, that is, ‖F βk(xk)‖ smaller thana given threshold. A modified algorithm becomes:

Algorithm (Ho-4) Consider a given β0 ∈ BN and a sequence (ηk)k≥0 ∈ R+. Iterate

for k ≥ 0:(i) Find xk such that ‖F βk

(xk)‖ ≤ ηk (resolution at fixed βk).(ii) Set βk+1 := argmaxβ∈AF β(xk).

If G(xk) = F βk+1

(xk) = 0 then stop;Else set k := k + 1 and go to (i).

Page 20: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

20 O. Bokanowski, S. Maroso, and H. Zidani

Theorem 5.4. Assume that assumptions of Theorem 5.2 hold. Let (ηk)k≥0 bea sequence of R

+, with∑

k≥0 ηk < ∞. Then the sequence of iterates (xk) given byalgorithm (Ho-4) converges to the unique solution x∗ of G(x∗) = 0. Furthermore, wehave the lower bound estimate

xk ≥ x∗ − Cηk, with C := maxα∈AN , β∈BN

‖B(α, β)−1‖. (5.4)

Proof. In the modified algorithm (Ho-4), F βk

(xk) is not necessarily 0 and thesequence xk is no more monotone. For a given x ∈ R

N , let us denote αβx a minimizer

of F β(x) (i.e., F β(x) = B(αβx , β)x− c(αβ

x , β). Let δ ∈ RN , and x, y be in R

N be suchthat F β(x) ≤ F β(y) + δ. We have:

B(αβx , β)x − c(αβ

x , β) = F β(x) ≤ F β(y) + δ ≤ B(αβx , β)y − c(αβ

x , β) + δ.

Taking into account the monotonicity of B(αβx , β), we get x ≤ y + ‖B(αβ

x , β)−1‖‖δ‖.Therefore F β satisfies:

∀x, y ∈ RN , ∀δ ∈ R

N , F β(x) ≤ F β(y) + δ =⇒ x ≤ y + C‖δ‖, (5.5)

with C given in (5.4). Let us denote δk := F βk

(xk). Firstly we have G(xk)−G(x∗) =

G(xk) ≥ F βk

(xk) = δk, hence

xk − x∗ ≥ −C‖δk‖ ≥ −Cηk. (5.6)

On the other hand,

F βk+1

(xk+1) − δk+1 = 0 = F βk

(xk) − δk

≤ F βk+1

(xk) − δk

and we deduce that

xk+1 − xk ≤ C(ηk + ηk+1), (5.7)

and thus for any q ≥ p + 1:

xq − xp ≤q−1∑

j=p

C(ηj + ηj+1) ≤ 2C∑

j≥p

ηj (5.8)

We deduce from (5.6) and (5.8) that (xk) is bounded. Moreover, from (5.8) we getthat lim supxk−lim inf xk ≤ 0, hence xk is convergent to some limit x∗, which satisfiesG(x∗) = 0 (as in the proof Theorem 5.1).

5.2. Application to a two-player game. We consider the following two-playerexit time problem. For a given ε > 0 and a given convex domain Ω of R

2, findu : R

2 → R2 solution of

u(x) = min|a|=1

maxb=±1

(ε + u(x +

√2εab)

)x ∈ Ω, (5.9a)

u(x) = 0 x ∈ ∂Ω (5.9b)

Page 21: mainbokanowski/... · OLIVIER BOKANOWSKI†, STEFANIA MAROSO‡, AND HASNAA ZIDANI§ Abstract. This paper deals with convergence results of Howard’s algorithm for the resolution

Convergence results for Howard’s algorithm 21

This problem models a game involving two players, say Carol and Paul. Initially, Paul is at $x \in \Omega$. His goal is to exit $\Omega$ as soon as possible, while Carol wants to delay his exit as long as possible. At each step ($\varepsilon$ being one time step), Paul chooses a direction $a \in \mathbb{R}^2$ ($\|a\| = 1$), and Carol can either accept or reverse Paul's choice (control $b = \pm 1$). Paul then moves a distance $\sqrt{2\varepsilon}$ in the direction $ba$, i.e., from $x$ to $x + \sqrt{2\varepsilon}\, ba$. It is known that in the limit $\varepsilon \to 0$, the problem is an approximation of the mean curvature problem

$\mathrm{div}\!\left( \dfrac{\nabla u}{|\nabla u|} \right) |\nabla u| + 1 = 0$, $x \in \Omega$, with $u = 0$ on $\partial\Omega$.

We refer to Kohn and Serfaty [21] for details and related problems. Equation (5.9a) can be approximated by using a semi-Lagrangian approach (following the time-dependent approach of [13]).

We consider a two-dimensional grid $\mathcal{G} = (X_\ell)$ of $\Omega$. We denote by $U_\ell$ an approximate value of $u(X_\ell)$, and by $[U](X)$ an interpolated value at the point $X$ of the values $(U_\ell)$ at the grid points, i.e., for $X \in \Omega$,

$[U](X) := \sum_\ell \gamma^X_\ell U_\ell$,

where the weights $(\gamma^X_\ell)$ are such that $\gamma^X_\ell \ge 0$, $\sum_\ell \gamma^X_\ell = 1$ and $\sum_\ell \gamma^X_\ell X_\ell = X$ (a bilinear interpolation satisfying these properties is sketched below).
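On a uniform grid, bilinear interpolation provides such weights. The following Python sketch (the function and variable names are ours, not from the paper) computes $[U](X)$; the weights it returns are nonnegative, sum to one, and reproduce the point $X$ exactly.

```python
import numpy as np

def bilinear_weights(X, x0, y0, h):
    """Bilinear interpolation weights gamma_ell^X on a uniform grid of
    spacing h with origin (x0, y0): nonnegative, summing to 1, and
    reproducing X exactly."""
    i, j = int((X[0] - x0) // h), int((X[1] - y0) // h)    # lower-left cell index
    tx, ty = (X[0] - x0) / h - i, (X[1] - y0) / h - j      # local coordinates in [0, 1]
    # weights on the four corners of the cell containing X
    return {(i, j):         (1 - tx) * (1 - ty),
            (i + 1, j):         tx   * (1 - ty),
            (i, j + 1):     (1 - tx) *     ty,
            (i + 1, j + 1):     tx   *     ty}

def interpolate(U, X, x0, y0, h):
    """[U](X) = sum_ell gamma_ell^X U_ell, with U a dict on grid indices;
    missing (outside) points take the boundary value 0, as in (5.10b)."""
    return sum(w * U.get(idx, 0.0)
               for idx, w in bilinear_weights(X, x0, y0, h).items())
```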

The discretized equation we consider is then

$\max_{|a|=1} \min_{b=\pm 1} \big( U_\ell - [U](X_\ell + \sqrt{2\varepsilon}\, ab) - \varepsilon \big) = 0$, $\forall X_\ell \in \Omega$,   (5.10a)
$U_\ell = 0$, $\forall X_\ell \in \partial\Omega$,   (5.10b)

and this can be written in the abstract form

$\max_{|a|=1} \min_{b=\pm 1} \big( B^{a,b} U - c^{a,b} \big) = 0$,   (5.11)

with unknown vector $U = (U_\ell)_{X_\ell \in \Omega}$, where $B^{a,b}$ is a monotone matrix and $c^{a,b}$ a vector. In order to simplify the maximization operation, instead of (5.11) we consider

$\max_{a \in A} \min_{b=\pm 1} \big( B^{a,b} U - c^{a,b} \big) = 0$,   (5.12)

where $A := \{ a_k = (\cos\theta_k, \sin\theta_k),\ k = 1, \dots, N_a \}$, with $\theta_k = \frac{2\pi k}{N_a}$ and $N_a \ge 1$ a given integer.

In Fig. 5.1, we give the approximate solution of (5.12) obtained by using algorithm (Ho-3). The linear systems are solved by using the sparse linear solver of the UMFPACK library [38] (using also Scilab, http://www.scilab.org, and the SCISPT interface of B. Pincon).

Table 5.1. Number of iterations of Howard's algorithm (Ho-3) to reach convergence, and CPU times (the total number of iterations corresponds to the number of linear systems solved). We used $\varepsilon = 0.01$, $N_a = 40$ discrete controls for the variable $a$, and $\Omega = [-1, 1]^2$.

    N_x^2    h_x      Iterations (Howard)   Iterations (total)   CPU time (s)
    20^2     0.105    11                    69                   2.56
    40^2     0.051    11                    78                   8.01
    80^2     0.025    13                    107                  48.45

Figure 5.1. Contour plot of $U$, with $N_x = N_y = 80$, $\varepsilon = 0.01$.

5.3. The double obstacle problem. We consider the following "double obstacle problem": find $x \in \mathbb{R}^N$ solution of

$\max\big( \min(Ax - b,\ x - g),\ x - h \big) = 0$,   (5.13)




where $A$ is a given matrix of $\mathbb{R}^{N \times N}$, and $b, g, h$ are in $\mathbb{R}^N$. Define the functions $F$ and $G$ on $\mathbb{R}^N$ by:

$F(x) := \min(Ax - b,\ x - g)$, and $G(x) := \max(F(x),\ x - h)$ for $x \in \mathbb{R}^N$.

The problem (5.13) is then equivalent to finding $x \in \mathbb{R}^N$ solution of $G(x) = 0$.

We first rewrite the function $F$ as follows:

$F(x) := \min_{\alpha \in \mathcal{A}^N} (B(\alpha)x - c(\alpha))$,

where $\mathcal{A} = \{0, 1\}$, and $B(\alpha)$ is defined as in (2.3):

$B_{ij}(\alpha) := A_{ij}$ if $\alpha_i = 0$, $B_{ij}(\alpha) := \delta_{ij}$ if $\alpha_i = 1$; and $c_i(\alpha) := b_i$ if $\alpha_i = 0$, $c_i(\alpha) := g_i$ if $\alpha_i = 1$,   (5.14)

with $\delta_{ij} = 1$ if $i = j$, and $\delta_{ij} = 0$ otherwise. In the sequel, we assume that the matrices $B(\alpha)$ satisfy assumption (H1). Also we define, for every $\beta \in \mathcal{A}^N$, $F^\beta : \mathbb{R}^N \to \mathbb{R}^N$ by

$F^\beta(x)_i := F(x)_i$ if $\beta_i = 0$, and $F^\beta(x)_i := (x - h)_i$ if $\beta_i = 1$, for $x \in \mathbb{R}^N$.

We obtain easily that for every $x \in \mathbb{R}^N$, $G(x) = \max_{\beta \in \mathcal{A}^N} F^\beta(x)$, and the problem (5.13) is equivalent to:

find $x \in \mathbb{R}^N$, $\max_{\beta \in \mathcal{A}^N} F^\beta(x) = 0$.   (5.15)


To solve this problem, we can consider the policy iteration algorithm (Ho-3) with $\mathcal{A} = \mathcal{B} := \{0, 1\}$. More precisely, we apply the algorithm with the following choice:

$(\beta_{k+1})_i := 0$ if $F(x_k)_i \ge (x_k - h)_i$, and $(\beta_{k+1})_i := 1$ otherwise.   (5.16)

For every $k \ge 0$, the equation $F^{\beta_k}(x) = 0$ is an obstacle problem of the form (2.2). Hence it can be solved with Howard's algorithm (Ho-2) in at most $N$ resolutions of linear systems. A sketch of this nested procedure is given below.
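Here is a minimal Python/NumPy sketch of the nested procedure (all function names are ours; dense solves stand in for the sparse solvers one would use in practice). The inner routine is (Ho-2) for an obstacle problem $\min(Ax - b,\ x - lo) = 0$; the outer routine is (Ho-3) with the choice (5.16), where each subproblem $F^\beta(x) = 0$ is rewritten as an obstacle problem and passed to the inner routine.

```python
import numpy as np

def howard_obstacle(A, b, lo, tol=1e-12):
    """(Ho-2): policy iteration for min(Ax - b, x - lo) = 0. Row i uses
    A_i (alpha_i = 0) or the identity row (alpha_i = 1), as in (5.14);
    under (H1) it converges in at most N iterations."""
    N = A.shape[0]
    I = np.eye(N)
    x = np.zeros(N)
    for _ in range(N + 1):
        alpha = (A @ x - b) > (x - lo)            # componentwise argmin of F
        B = np.where(alpha[:, None], I, A)        # B(alpha) of (5.14)
        c = np.where(alpha, lo, b)
        x = np.linalg.solve(B, c)                 # policy evaluation
        if np.linalg.norm(np.minimum(A @ x - b, x - lo), np.inf) <= tol:
            break
    return x

def howard_double_obstacle(A, b, g, h, tol=1e-12):
    """(Ho-3) with the choice (5.16) for max(min(Ax - b, x - g), x - h) = 0.
    Each subproblem F^beta(x) = 0 is itself an obstacle problem: rows with
    beta_i = 1 impose x_i = h_i (identity row, obstacle and rhs both h_i)."""
    N = A.shape[0]
    I = np.eye(N)
    beta = np.ones(N, dtype=bool)                 # beta_i = 1 for all i
    for _ in range(N + 1):
        At = np.where(beta[:, None], I, A)
        bt = np.where(beta, h, b)
        gt = np.where(beta, h, g)
        x = howard_obstacle(At, bt, gt, tol)      # inner (Ho-2) solve
        F = np.minimum(A @ x - b, x - g)
        if np.linalg.norm(np.maximum(F, x - h), np.inf) <= tol:
            return x
        beta = F < (x - h)                        # policy improvement (5.16)
    return x
```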

We can establish the following convergence result, exactly as in the proof of Theorem 4.1:

Theorem 5.5. Assume (H1). Then there exists a unique solution of (5.13), and algorithm (Ho-3), together with the choice (5.16), converges in at most $N + 1$ iterations. It converges in at most $N$ iterations if we start with $(\beta_0)_i = 1$ for all $i$.

Remark 5.6. Since each equation $F^{\beta_k}(x) = 0$ can be solved by Howard's algorithm in at most $N$ iterations (i.e., resolutions of linear systems), the global algorithm converges in at most $N^2$ resolutions of linear systems.

5.4. Numerical illustration for the double obstacle problem. Note that the following example is chosen for illustrative purposes rather than to benchmark the performance of the methods.

We consider the following discrete double obstacle problem: find $x^* = (U_i)_{1 \le i \le N}$ in $\mathbb{R}^N$ such that

$\max\left( \min\left( \dfrac{-U_{i-1} + 2U_i - U_{i+1}}{\Delta s^2} + m(s_i),\ U_i - g(s_i) \right),\ U_i - h(s_i) \right) = 0$, $i = 1, \dots, N$,   (5.17)

with $U_0 = u_\ell$, $U_{N+1} = u_r$, where $u_\ell = 1$, $u_r = 0.8$ (left and right border values), $\Delta s = \frac{1}{N+1}$ and $s_i = i \Delta s$, $m(s) \equiv 0$, $g(s) := \max(0,\ 1.2 - ((s - 0.6)/0.1)^2)$ and $h(s) := \min(2,\ 0.3 + ((s - 0.2)/0.1)^2)$. In the numerical examples we choose $N = 99$.

This problem can easily be reformulated as

$G(x) := \max\big( \min(Ax - b,\ x - g),\ x - h \big) = 0$,   (5.18)

where $A$ is a tridiagonal $N \times N$ matrix (the corresponding matrices $B(\alpha)$ defined by (2.3) satisfy assumption (H1)). As in Section 5.3, we can write

$G(x) = \max_{\beta \in \mathcal{B}^N} \min_{\alpha \in \mathcal{A}^N} \big( B(\alpha, \beta)x - c(\alpha, \beta) \big)$.

With Howard's algorithm (Ho-3), we can solve (5.18) in $k = 14$ global iterations, involving a total of 88 resolutions of linear systems. A sketch of the setup is given below.
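As an illustration, the following sketch assembles $A$, $b$, $g$, $h$ for (5.17)/(5.18), with the sign convention that makes $A$ an M-matrix, and calls the howard_double_obstacle routine sketched in Section 5.3 (the routine and variable names are ours, not from the paper).

```python
import numpy as np

# Setup for (5.17)/(5.18); howard_double_obstacle is the sketch from Section 5.3.
N = 99
ds = 1.0 / (N + 1)
s = ds * np.arange(1, N + 1)
ul, ur = 1.0, 0.8                                   # border values U_0, U_{N+1}

# A = (1/ds^2) tridiag(-1, 2, -1): row i gives (-U_{i-1} + 2U_i - U_{i+1}) / ds^2
A = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / ds**2

# b collects m(s_i) = 0 and the boundary contributions of U_0 and U_{N+1}
b = np.zeros(N)
b[0] += ul / ds**2
b[-1] += ur / ds**2

g = np.maximum(0.0, 1.2 - ((s - 0.6) / 0.1)**2)     # lower obstacle
h = np.minimum(2.0, 0.3 + ((s - 0.2) / 0.1)**2)     # upper obstacle

U = howard_double_obstacle(A, b, g, h)              # solution plotted in Fig. 5.3
```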

On the other hand, if we use Newton's method as in Remark 5.3, then depending on the initial point the algorithm may not converge:

Remark 5.7. In the one-dimensional case ($N = 1$), we consider the problem

$\max\big( \min(Ax - b,\ \gamma(x - g)),\ \gamma(x - h) \big) = 0$,

with $A > 0$, and $b, g, h$ given constants with $g < h$. We note that if $x^* = b/A \in (g, h)$ and $\gamma = 1 < A$, and if the starting point $x_0$ satisfies $x_0 \notin (g, h)$, then Newton's method does not converge (see Fig. 5.2).


Figure 5.2. Divergence of Newton's method in the one-dimensional case (lines $y = Ax - b$, $y = \gamma(x - g)$, $y = \gamma(x - h)$; solution $x^* = b/A$; obstacles $g$ and $h$).

Conversely, we observe that when $\gamma > A$, Newton's method always converges (for any starting iterate $x_0 \in \mathbb{R}$). A one-dimensional illustration is sketched below.
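A tiny script makes the remark concrete; the constants below ($A = 4$, $b = 2$, $g = 0$, $h = 1$, so $x^* = b/A = 0.5 \in (g, h)$) are our own illustrative choice, not taken from the paper. With $\gamma = 1 < A$ the iterates cycle between $g$ and $h$; with $\gamma > A$ the method reaches $x^*$.

```python
def newton_1d(A, b, g, h, gamma, x0, max_iter=20):
    """Semismooth Newton for max(min(Ax - b, gamma(x - g)), gamma(x - h)) = 0
    in dimension N = 1: differentiate along the active branch."""
    x = x0
    for k in range(max_iter):
        if A * x - b <= gamma * (x - g):
            val, slope = A * x - b, A             # inner min picks Ax - b
        else:
            val, slope = gamma * (x - g), gamma   # inner min picks gamma(x - g)
        if gamma * (x - h) > val:
            val, slope = gamma * (x - h), gamma   # outer max picks gamma(x - h)
        if abs(val) < 1e-14:
            return x, k                           # converged
        x -= val / slope                          # Newton step on the active branch
    return x, max_iter                            # max_iter reached: no convergence

print(newton_1d(4.0, 2.0, 0.0, 1.0, gamma=1.0, x0=2.0))  # gamma < A: cycles between g and h
print(newton_1d(4.0, 2.0, 0.0, 1.0, gamma=5.0, x0=2.0))  # gamma > A: reaches x* = 0.5
```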

Hence, instead of solving $G(x^*) = 0$, and in view of the previous remark, we consider the equivalent problem of solving $G^\gamma(x^*) = 0$, where

$G^\gamma(x) := \max\big( \min(Ax - b,\ \gamma(x - g)),\ \gamma(x - h) \big)$,

with $\gamma > 0$ a given (large) constant. We obtain the exact solution $x^*$ after 8 iterations using $\gamma = 10^4$, see Table 5.2. Numerically, we have observed convergence of Newton's method as soon as $\gamma > \max_i(A_{ii})$. A sketch of this semismooth Newton iteration follows.
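For completeness, here is a minimal sketch of the semismooth Newton iteration on $G^\gamma$ (our own implementation outline, with a fixed componentwise tie-breaking convention, not the authors' code):

```python
import numpy as np

def newton_G_gamma(A, b, g, h, gamma, tol=1e-12, max_iter=100):
    """Semismooth Newton for G^gamma(x) = max(min(Ax-b, gamma(x-g)), gamma(x-h)) = 0.
    Each step selects the active branch componentwise and solves one linear
    system with the corresponding generalized Jacobian."""
    N = A.shape[0]
    I = np.eye(N)
    x = np.zeros(N)
    for k in range(max_iter):
        r1, r2, r3 = A @ x - b, gamma * (x - g), gamma * (x - h)
        G = np.maximum(np.minimum(r1, r2), r3)
        if np.linalg.norm(G, np.inf) <= tol:
            return x, k
        # Jacobian row i: A_i if the branch Ax - b is active, gamma * e_i otherwise
        use_A = (r1 <= r2) & (np.minimum(r1, r2) >= r3)
        J = np.where(use_A[:, None], A, gamma * I)
        x = x - np.linalg.solve(J, G)
    return x, max_iter
```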

Now, we consider the primal-dual active set (PDAS) algorithm of Ito and Kunisch [19, Section 4] for the double obstacle problem. In our setting, it amounts to applying a Newton's method to find the zero of

$\tilde{G}^\gamma(x) := \max\big( \min(Ax - b,\ Ax - b + \lambda + \gamma(x - g)),\ Ax - b + \lambda + \gamma(x - h) \big)$

for a given constant $\gamma > 0$ and a given $\lambda \in \mathbb{R}^N$. In the numerical tests reported in Table 5.2, we have taken $\lambda \equiv 0$. We refer to [19] for the meaning and the role of $\lambda$ and $\gamma$.

Table 5.2. Primal-dual active set algorithm [19], and Newton's method.

    γ        Primal-Dual: iterations   ||G(x_k)||_∞    Newton: iterations   ||G(x_k)||_∞
    10^2     4                         2.4 · 10^-1     divergent            -
    10^4     7                         8.7 · 10^-3     8                    1.6 · 10^-12
    10^6     9                         1.0 · 10^-4     8                    1.6 · 10^-12
    10^10    9                         1.0 · 10^-8     8                    1.6 · 10^-12

Although the PDAS algorithm also converges in a finite number of iterations, it converges to an approximate solution of $\tilde{G}^\gamma(\cdot) = 0$. For $\gamma$ large enough, we recover a good approximation of the exact solution. Howard's algorithm needs more iterations here (but it is more general, since it applies to general max-min problems such as the one of Section 5.2). Finally, Newton's method may diverge if $\gamma$ is too small (and depending on the starting point), but converges very quickly to the exact solution for $\gamma$ large enough.


Figure 5.3. Solution of the double obstacle problem, $N = 99$; the values $U_j$ are plotted with respect to $s_j$, together with the obstacles $g$ and $h$. The solution is obtained with Howard's algorithm or with Newton's method.


Acknowledgment. The authors deeply thank Professor F. Bonnans for interesting discussions and comments on the manuscript.

REFERENCES

[1] Y. Achdou and O. Pironneau, Computational Methods for Option Pricing, Frontiers in Applied Mathematics, SIAM, 2005.
[2] M. Akian, Méthodes multigrilles en contrôle stochastique, PhD thesis, University Paris 9, 1990. http://www.inria.fr/rrrt/tu-0107.html
[3] M. Akian, J. Menaldi, and A. Sulem, On an investment-consumption model with transaction costs, SIAM J. Control Optim., 34 (1996), pp. 329-364.
[4] G. Barles and P.E. Souganidis, Convergence of approximation schemes for fully nonlinear second order equations, Asymptotic Anal., 4 (1991), pp. 271-283.
[5] R. Bellman, Functional equations in the theory of dynamic programming. V. Positivity and quasi-linearity, Proc. Nat. Acad. Sci. USA, 41 (1955), pp. 743-746.
[6] R. Bellman, Dynamic Programming, Princeton University Press, Princeton, NJ, 1957.
[7] M. Bergounioux, K. Ito, and K. Kunisch, Primal-dual strategy for constrained optimal control problems, SIAM J. Control Optim., 37 (1999), pp. 1176-1194.
[8] O. Bokanowski, B. Bruder, S. Maroso, and H. Zidani, Numerical approximation of a superreplication problem under gamma constraints, to appear in SIAM J. Numer. Anal.
[9] F. Bonnans and P. Rouchon, Commande et optimisation de systèmes dynamiques, Editions de l'Ecole Polytechnique, Palaiseau, 2005.
[10] J. Bolte, A. Daniilidis, and A. Lewis, Tame functions are semismooth, Math. Program., Ser. B, 117 (2009), pp. 5-19.
[11] F. Camilli and M. Falcone, Approximation of control problems involving ordinary and impulsive controls, ESAIM Control Optim. Calc. Var., 4 (1999), pp. 159-176.
[12] M.G. Crandall and P.-L. Lions, Two approximations of solutions of Hamilton-Jacobi equations, Mathematics of Computation, 43 (1984), pp. 1-19.
[13] E. Carlini, M. Falcone, and R. Ferretti, in Numerical Mathematics and Advanced Applications - ENUMATH 2005, Springer, Berlin, 2006, pp. 732-739.
[14] J.-Ph. Chancelier, B. Øksendal, and A. Sulem, Combined stochastic control and optimal stopping, and application to numerical approximation of combined stochastic and impulse control, Tr. Mat. Inst. Steklova, 237 (2002), pp. 149-172.
[15] X. Chen, Z. Nashed, and L. Qi, Smoothing methods and semismooth methods for nondifferentiable operator equations, SIAM J. Numer. Anal., 38 (2000), pp. 1200-1216.
[16] M. Hintermüller, K. Ito, and K. Kunisch, The primal-dual active set strategy as a semismooth Newton method, SIAM J. Optim., 13 (2003), pp. 865-888.
[17] R.A. Howard, Dynamic Programming and Markov Processes, The Technology Press of the MIT and J. Wiley, Cambridge, MA, and New York, 1960.
[18] K. Ito and K. Kunisch, Optimal control of elliptic variational inequalities, Appl. Math. Optim., 41 (2000), pp. 343-364.
[19] K. Ito and K. Kunisch, Semi-smooth Newton methods for variational inequalities of the first kind, M2AN, 37 (2003), pp. 41-62.
[20] K. Ito and K. Kunisch, Parabolic variational inequalities: the Lagrange multiplier approach, J. Math. Pures Appl., 85 (2006), pp. 415-449.
[21] R.V. Kohn and S. Serfaty, A deterministic-control-based approach to motion by curvature, Comm. Pure Appl. Math., 59 (2006), pp. 344-447.
[22] B. Kummer, Generalized Newton and NCP methods: convergence, regularity, actions, Discuss. Math. Differ. Incl. Control Optim., 20 (2000), pp. 209-244.
[23] B. Kummer, Newton's method for nondifferentiable functions, in J. Guddat et al., eds., Mathematical Research, Advances in Optimization, Akademie-Verlag, Berlin, 1988, pp. 114-125.
[24] H.J. Kushner and P. Dupuis, Numerical Methods for Stochastic Control Problems in Continuous Time, Springer-Verlag, New York-Berlin-Heidelberg, 2001.
[25] D. Lamberton and B. Lapeyre, Introduction au calcul stochastique appliqué à la finance, Ellipses, Paris, second edition, 1997.
[26] R. Mifflin, Semismooth and semiconvex functions in constrained optimization, SIAM J. Control Optim., 15 (1977), pp. 956-972.
[27] M. Mnif and A. Sulem, Optimal risk control and dividend policies under excess of loss reinsurance, Stochastics, 77 (2005), pp. 455-476.
[28] R. Munos and H. Zidani, Consistency of a simple multidimensional scheme for HJB equations, C. R. Acad. Sci. Paris, Sér. I, 340 (2005).
[29] B. Øksendal, Stochastic Differential Equations. An Introduction with Applications, Springer-Verlag, Berlin, sixth edition, 2003.
[30] B. Øksendal and A. Sulem, Applied Stochastic Control of Jump Diffusions, Springer-Verlag, Berlin, 2005.
[31] H. Pham, Optimal stopping, free boundary, and American option in a jump-diffusion model, Appl. Math. Optim., 35 (1997), pp. 145-164.
[32] H. Pham, Optimisation et contrôle stochastique appliqués à la finance, to appear in the collection Mathématiques et Applications, Springer-Verlag.
[33] L. Qi, Convergence analysis of some algorithms for solving nonsmooth equations, Math. of Operations Research, 18 (1993), pp. 227-244.
[34] L. Qi and J. Sun, A nonsmooth version of Newton's method, Mathematical Programming, 58 (1993), pp. 353-367.
[35] M.L. Puterman and S.L. Brumelle, On the convergence of policy iteration in stationary dynamic programming, Mathematics of Operations Research, 4 (1979), pp. 60-69.
[36] M.S. Santos, Accuracy of numerical solutions using the equation residuals, Econometrica, 68 (2000), pp. 1377-1402.
[37] M.S. Santos and J. Rust, Convergence properties of policy iteration, SIAM J. Control Optim., 42 (2004), pp. 2094-2115.
[38] UMFPACK, sparse solvers by Timothy A. Davis. See http://www.cise.ufl.edu/research/sparse