Report no. 09/9
Numerical solution of Riemann–Hilbert problems: Painleve II
Sheehan Olver
Abstract  We describe a new, spectrally accurate method for solving matrix-valued Riemann–Hilbert problems numerically. The effectiveness of this approach is demonstrated by computing solutions to the homogeneous Painlevé II equation. This can be used to relate initial conditions with asymptotic behaviour.
Keywords  Riemann–Hilbert problems, spectral methods, collocation methods, Painlevé transcendents.
Oxford University Mathematical Institute
Numerical Analysis Group
24–29 St Giles', Oxford, England OX1 3LB
E-mail: [email protected]
August, 2010
1. Introduction
Riemann–Hilbert problems occupy an important place in applied analysis. They have been used to derive asymptotics for nonlinear differential equations such as the Painlevé transcendents [15], the nonlinear Schrödinger equation and the KdV equation [1], as well as orthogonal polynomials and random matrices [9]. Whereas the originating equations are nonlinear, Riemann–Hilbert problems are linear. A key aspect of their effectiveness is that they can be deformed in the complex plane to turn oscillations into exponential decay, so that asymptotics can be derived. This is known as nonlinear steepest descent [11], as it is much like the classical theory of steepest descent for oscillatory integrals. Thus Riemann–Hilbert formulations can loosely be viewed as a nonlinear counterpart to the integral representations that are known for many important linear differential equations, such as the Airy equation, hypergeometric equations [2], the wave equation and the heat equation.
Integral representations have another important use, in addition to the derivation of asymptotics: numerical computation through quadrature. Indeed, they have been used to great effect for computing Airy functions and Bessel functions [16], which, in a certain sense, are linear analogues of the Painlevé transcendents. The aim of this paper is to demonstrate that Riemann–Hilbert problems share this property with integral representations: they can also be used to compute solutions to the associated equations numerically.
A Riemann–Hilbert problem is the problem of finding a function that is analytic every-
where in the complex plane except along a given curve, on which it has a prescribed jump.
This can be written more precisely as
Problem 1.1 [20]  Given an oriented curve Γ ⊂ ℂ and a jump matrix G : Γ → ℂ^{2×2}, find a bounded function Φ : ℂ\Γ → ℂ^{2×2} which is analytic everywhere except on Γ such that
\[ \Phi^+(z) = \Phi^-(z)\,G(z) \quad\text{for } z \in \Gamma \qquad\text{and}\qquad \Phi(\infty) = I, \]
where Φ⁺ denotes the limit of Φ as z approaches Γ from the left, Φ⁻ denotes the limit of Φ as z approaches Γ from the right, and Φ(∞) = lim_{|z|→∞} Φ(z).
To demonstrate the numerical approach, we will focus on the Painlevé II transcendent, though the techniques developed are generalizable to other Riemann–Hilbert problems. The Painlevé II equation is
\[ u'' = xu + 2u^3 - \alpha, \tag{1.1} \]
where α is a complex parameter. For simplicity, we will take α = 0. From this differential equation, an equivalent Riemann–Hilbert formulation is derived by finding a Lax pair representation, which in turn is uniquely specified by behaviour along Stokes' lines. This can then be rephrased as a Riemann–Hilbert problem [15].
[Figure 1: The curve and jump matrix for the Painlevé II Riemann–Hilbert problem. Each of the six rays is labelled with its jump matrix: upper triangular with (12) entry ±s_κ e^{-\frac{8i}{3}z^3-2ixz} on the even rays, and lower triangular with (21) entry ±s_κ e^{\frac{8i}{3}z^3+2ixz} on the odd rays.]
As explained in [18], the curve Γ in the Riemann–Hilbert problem for the homogeneous Painlevé II equation consists of six rays emanating from the origin (see Figure 1):
\[ \Gamma = \Gamma_1 \cup \cdots \cup \Gamma_6, \qquad \Gamma_\kappa = \left\{ z \in \mathbb{C} : \arg z = \pi\left(\tfrac{1}{6} + \tfrac{\kappa-1}{3}\right) \right\}. \]
The jump function has a different definition depending on which ray Γ_κ the point z lies on:
\[ G(x;z) = \begin{cases} \begin{pmatrix} 1 & s_\kappa e^{-\frac{8i}{3}z^3 - 2ixz} \\ 0 & 1 \end{pmatrix} & \kappa \text{ even and } z \in \Gamma_\kappa, \\[2ex] \begin{pmatrix} 1 & 0 \\ s_\kappa e^{\frac{8i}{3}z^3 + 2ixz} & 1 \end{pmatrix} & \kappa \text{ odd and } z \in \Gamma_\kappa. \end{cases} \]
Instead of imposing initial conditions, we choose Stokes’ constants s1, s2 and s3 which satisfy
the following compatibility condition:
s1 − s2 + s3 + s1s2s3 = 0, (1.2)
from which the remaining Stokes’ constants are defined as
s4 = −s1, s5 = −s2 and s6 = −s3.
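For illustration (this check is not part of the paper's method), the compatibility condition (1.2) is easy to verify numerically; the helper name `compatible` below is ours. The sketch checks the two example choices of Stokes' constants that appear later in the paper:

```python
def compatible(s1, s2, s3, tol=1e-14):
    """Check the Painleve II compatibility condition (1.2):
    s1 - s2 + s3 + s1*s2*s3 = 0 (up to rounding)."""
    return abs(s1 - s2 + s3 + s1 * s2 * s3) < tol

print(compatible(1 + 1j, -2, 1 - 1j))  # example constants used in Section 4 -> True
print(compatible(1, 0, -1))            # example constants used in Figure 3 -> True
```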
Once we have found a function Φ(x; z) that satisfies the Painlevé II Riemann–Hilbert problem, a solution to (1.1) is
\[ u(x) = 2\lim_{z\to\infty} z\,\Phi^{(12)}(x;z). \]
If u does not have a pole at x, then (1.2) is sufficient to ensure a unique solution to the
Riemann–Hilbert problem [15]. As in the integral representation of the Airy function, the
original variable x has been reduced to a parameter.
By moving from the original differential equation to Problem 1.1, we have transformed
a nonlinear problem to a linear problem, ignoring the boundary condition at ∞. We can
rephrase the problem so that it is completely linear: define
\[ U = \Phi - I; \]
thence,
\[ \mathcal{L}U = U^+ - U^- G = G - I \qquad\text{and}\qquad U(\infty) = 0. \tag{1.3} \]
Now the operator L is linear, mapping the space of functions which are analytic off Γ and decay at ∞ to the space of functions defined on Γ.
Consider for the moment the following simple, scalar Riemann–Hilbert problem:

Problem 1.2  Given an oriented curve Γ ⊂ ℂ and a function f : Γ → ℂ which is Hölder-continuous on each smooth segment of Γ, find a function ψ : ℂ\Γ → ℂ which is analytic everywhere except on Γ such that
\[ \psi^+(z) - \psi^-(z) = f(z) \quad\text{for } z \in \Gamma \qquad\text{and}\qquad \psi(\infty) = 0. \]
For analysts, solving this problem is a trivial application of Plemelj's lemma [20], and the solution is the Cauchy transform:
\[ \psi(z) = \mathcal{C}_\Gamma f(z) = \frac{1}{2i\pi} \int_\Gamma \frac{f(t)}{t-z}\, dt \quad\text{for } z \notin \Gamma. \tag{1.4} \]
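The jump relation of Problem 1.2 can be probed numerically for a smooth f by evaluating (1.4) just above and below the contour. The following sketch (the quadrature resolution, offsets and tolerance are our choices, purely for illustration) approximates the jump on Γ = (−1, 1):

```python
import numpy as np

def cauchy_transform(f, z, m=200_000):
    # Midpoint-rule approximation of (1.4) over Gamma = (-1, 1).
    t = -1 + (np.arange(m) + 0.5) * (2 / m)
    return (1 / (2j * np.pi)) * np.sum(f(t) / (t - z)) * (2 / m)

f = np.exp
x, delta = 0.3, 1e-2
jump = cauchy_transform(f, x + 1j * delta) - cauchy_transform(f, x - 1j * delta)
print(abs(jump - f(x)) < 0.05)  # psi^+ - psi^- ~= f as delta -> 0
```

The slow convergence in delta is exactly the difficulty the paper avoids by computing the limits C^± in closed form.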
(When Γ is clear from context, we use the notation C.) However, by rewriting Problem 1.2 as an integral we have, in a certain sense, moved backwards: the functions in Problem 1.2 are bounded (and, in the Painlevé case, analytic), whereas (1.4) has introduced a singularity that must somehow be dealt with. Thus the approach taken in [25] was to apply Plemelj's lemma in reverse; i.e., to compute (1.4) numerically by rewriting it as Problem 1.2. This can be accomplished efficiently using the FFT when Γ is a circle, interval, ray or combination of multiple such curves, as reviewed in Section 3. Moreover, this approach allows us also to compute the left and right limits of the Cauchy transform C^±, which we require.
The importance of the operator C is that it uniquely maps any Hölder-continuous function defined on Γ to an analytic function defined off Γ [20]. This still holds true when it is a matrix-valued function, in a component-wise manner. Moreover, under certain conditions
which are satisfied in our case, this map is one-to-one. In other words, for a function V defined on Γ, if
\[ \mathcal{L}\mathcal{C}V = \mathcal{C}^+V - (\mathcal{C}^-V)\,G = G - I, \tag{1.5} \]
it follows that Φ = CV + I satisfies all the conditions of Problem 1.1. Moreover, LC maps the space of functions which are Hölder-continuous on the segments of Γ to itself.
Our numerical approach is to impose (1.5) at a sequence of points lying on Γ:

Algorithm 1.3
1: Represent V(z) = V_κ(z) for z ∈ Γ_κ, κ = 1, …, 6, where
\[ V_\kappa(z) = \begin{pmatrix} p_\kappa(z)\,v^{(11)}_\kappa & p_\kappa(z)\,v^{(12)}_\kappa \\ p_\kappa(z)\,v^{(21)}_\kappa & p_\kappa(z)\,v^{(22)}_\kappa \end{pmatrix}, \]
where p_κ = (p_{κ1}, …, p_{κn}) denotes a basis for functions defined on Γ_κ and v^{(ij)}_κ = (v^{(ij)}_{κ1}, …, v^{(ij)}_{κn})^⊤ are unknowns in ℂⁿ;
2: For collocation points z_κ = (z_{κ1}, …, z_{κn})^⊤ in Γ_κ, compute the Cauchy transform of the chosen basis at the points z₁, …, z₆;
3: Determine v^{(ij)}_κ by solving the 24n × 24n linear system
\[ \mathcal{L}\mathcal{C}V(z_1) = G(z_1) - I, \quad\ldots,\quad \mathcal{L}\mathcal{C}V(z_6) = G(z_6) - I; \tag{1.6} \]
4: Convert (v^{(12)}_1, …, v^{(12)}_6)^⊤ to u_n(x), which approximates u(x).
The next four sections correspond to the steps of Algorithm 1.3. In Section 2, we use mapped Chebyshev polynomials as our basis and mapped Chebyshev–Lobatto points as our collocation points. In Section 3, we derive an expression for the Cauchy transform of this basis in closed form. However, the junction point zero is included as a collocation point, at which the Cauchy transform generically blows up. Therefore, the system (1.6), as written, cannot be used. However, the six unbounded Cauchy transforms C_{Γ₁}V₁, …, C_{Γ₆}V₆ can sum in such a way that the unbounded terms cancel, leaving only bounded terms. We can determine the remaining bounded terms for each Cauchy transform, which is what we will use in the linear system as its value at zero. In Section 4, we construct the true form of (1.6). Whereas Algorithm 1.3 suggests that we will need to solve a 24n × 24n linear system, we will find that the dimension of the linear system can be reduced to 6(n−1) × 6(n−1) by using a priori information about the solution. In Theorem 4.3 we prove that our choice of value of the Cauchy transforms at the junction point is justified subject to a condition on the Stokes' multipliers s₁, s₂, s₃: it automatically implies that the unbounded terms do indeed cancel and the approximation does solve the conditions of (1.6). Finally, by integrating over Γ we can convert the solution to the Riemann–Hilbert problem to a solution to the homogeneous Painlevé II equation, as explained in Section 5.
There are approaches for the computation of solutions to other Riemann–Hilbert problems. The conjugation method [30] can be used to solve a nonlinear Riemann–Hilbert problem on the unit circle related to conformal mapping. This has been generalized to multiple circles [29, 31], but not to intervals or curves like our Γ. Another approach, applicable to smooth closed curves, is based on solving integral equations [21]. An approach similar to ours was used in [13], where a Riemann–Hilbert problem associated with the sine kernel Fredholm determinant (which has applications in random matrix theory) was solved by reducing the equation to a singular integral equation, precisely as in (1.5). However, in that approach the endpoints and junction points of the jump curve were avoided, and hence exponentially many sample points were required near the endpoints to simulate boundedness of the solution. In our approach, on the other hand, we ensure that the junction points are included in the collocation system; thus we automatically have a bounded solution and this inefficiency is avoided.
Remark : Based on the method presented here, a Mathematica package, RHPackage,
has subsequently been developed for solving general Riemann–Hilbert problems [23]. How
the current approach is adaptable to general Riemann–Hilbert problems is described in [24].
Included in the package is an implementation of the method described in this paper for
computing Painleve II, which is more optimized than the general routine. We use this
implementation in the numerical results below.
2. Choice of basis
Each curve Γ_κ is a ray in the complex plane. An oft-used technique from spectral methods, which we employ, is to represent a function defined on a ray by mapping it to a function defined on the unit interval. Indeed, we can conformally map the unit interval to Γ_κ using the map
\[ H_\kappa(t) = e^{i\pi\left(\frac{1}{6} + \frac{\kappa-1}{3}\right)}\, \frac{1+t}{1-t}. \]
(The map typically used is L H_κ(t) for some constant L [7]. We fix L = 1 for simplicity.)
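The map above can be sketched in a few lines (the function name `H` is our own); the checks confirm that t = −1 is sent to the origin and that Γ₂ is the ray at angle π/2:

```python
import numpy as np

def H(kappa, t):
    # Conformal map from t in [-1, 1) to the ray Gamma_kappa:
    # t = -1 goes to the origin, t -> 1 goes to infinity.
    return np.exp(1j * np.pi * (1 / 6 + (kappa - 1) / 3)) * (1 + t) / (1 - t)

print(abs(H(1, -1.0)) == 0.0)                        # True: origin
print(abs(np.angle(H(2, 0.5)) - np.pi / 2) < 1e-12)  # True: Gamma_2 has arg z = pi/2
```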
On the unit interval, the natural representation for functions is Chebyshev series.
Definition 2.1  For a fixed integer n ≥ 2, define the n Chebyshev–Lobatto points as
\[ \chi = \left( -1,\; \cos\pi\left(1 - \frac{1}{n-1}\right),\; \ldots,\; \cos\frac{\pi}{n-1},\; 1 \right)^\top. \]
We can efficiently represent a function defined on the interval by its values at χ: f = f(χ). By taking an appropriately scaled discrete cosine transform of f, which we denote D, we obtain the Chebyshev polynomial which interpolates f at χ:
\[ e(t)^\top f = (T_0(t), \ldots, T_{n-1}(t))\, Df, \]
where T_k are the Chebyshev polynomials of the first kind [2]. Alternatively, the barycentric formula can be used [5].
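The action of the transform D can be sketched as follows; for clarity we use a dense linear solve in place of the scaled DCT that realizes D in O(n log n) (all names here are ours, for illustration only):

```python
import numpy as np

n = 8
chi = -np.cos(np.pi * np.arange(n) / (n - 1))  # Chebyshev-Lobatto points, -1 to 1

def cheb_coeffs(f_vals):
    # Coefficients c with sum_k c_k T_k interpolating f at chi; this dense
    # solve stands in for the scaled discrete cosine transform D of the text.
    T = np.cos(np.outer(np.arccos(chi), np.arange(n)))  # T[j, k] = T_k(chi[j])
    return np.linalg.solve(T, f_vals)

f = lambda t: 3 * t**3 - t + 0.5
c = cheb_coeffs(f(chi))
interp = lambda t: np.cos(np.arange(n) * np.arccos(t)) @ c
print(abs(interp(0.3) - f(0.3)) < 1e-12)  # True: a cubic is reproduced exactly
```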
In practice, f will vanish at t = +1, as this point will correspond to z = ∞. We want our basis to capture that fact, so that when we map the basis back to the half ray it decays at infinity and the Cauchy transform is well-defined. Therefore, we replace the basis T₀, …, T_{n−1} with T₀ − 1, …, T_{n−1} − 1. Fortunately, since f vanishes at 1, we know that (1, …, 1)Df = f(1) = 0 and therefore
\[ e(t)^\top f = (T_0(t) - 1, \ldots, T_{n-1}(t) - 1)\, Df. \]
Definition 2.2  For a vector x = (x₁, …, x_n)^⊤, the notation x̄ denotes x with its last entry removed: x̄ = (x₁, …, x_{n−1})^⊤.

Using the map H_κ and the fact that every function r we consider vanishes at ∞, we can then represent a function r defined on Γ_κ by its values at the n − 1 mapped Chebyshev–Lobatto points
\[ \bar{r}_\kappa = r(\bar{z}_\kappa) \quad\text{for}\quad \bar{z}_\kappa = H_\kappa(\bar{\chi}). \]
The values at the points z_κ = H_κ(χ) are r_κ = (\bar{r}_\kappa^\top, 0)^\top, and the basis is
\[ T_k^{\Gamma_\kappa}(z) = T_k\!\left(H_\kappa^{-1}(z)\right) - 1. \]
We also define the n × (n−1) transform matrix D̄ for this truncated vector by the formula D̄r̄_κ = Dr_κ. Then the function
\[ e\!\left(H_\kappa^{-1}(z)\right)^\top \bar{r}_\kappa = \left( T_0^{\Gamma_\kappa}(z), \ldots, T_{n-1}^{\Gamma_\kappa}(z) \right) \bar{D}\bar{r}_\kappa \]
interpolates r at the points z̄_κ. This is referred to as a rational Chebyshev interpolant [7].
Thus we approximate a function r which is smooth along each Γ_κ by the piecewise function
\[ r(z) \approx e\!\left(H_\kappa^{-1}(z)\right)^\top \bar{r}_\kappa \quad\text{for } z \in \Gamma_\kappa, \quad \kappa = 1, \ldots, 6. \]
In the notation of Algorithm 1.3, we use the basis whose elements are one at a single point of z̄_κ and zero at every other point:
\[ p_{\kappa j}(z) = e\!\left(H_\kappa^{-1}(z)\right)^\top e_j. \]
Then the coefficients v^{(ij)}_κ correspond to function values at the points z̄_κ.
3. Computing the Cauchy transform over Γ
Our goal in this section is to construct Cauchy matrices C^±, corresponding to the evaluation of the Cauchy transform C^±_{Γ₁} along Γ₁. If r̄₁ = r(z̄₁), consider
\[ \mathcal{C}^\pm \bar{r}_1 \overset{?}{=} \mathcal{C}^\pm_{\Gamma_1} e\!\left(H_1^{-1}(\bar{z}_1)\right)^\top \bar{r}_1 = \left( \mathcal{C}^\pm_{\Gamma_1} T_0^{\Gamma_1}(\bar{z}_1), \ldots, \mathcal{C}^\pm_{\Gamma_1} T_{n-1}^{\Gamma_1}(\bar{z}_1) \right) \bar{D}\bar{r}_1. \]
This definition will suffice for every row of the matrix other than the first, provided we can compute the left and right limits of the Cauchy transform for the chosen basis. On the other hand, since the Cauchy transform is unbounded at zero, this definition fails for the first row, and an alternative definition must be used.
We also want matrices C₂, …, C₆ corresponding to the evaluation of C_{Γ₁} along Γ₂, …, Γ₆:
\[ \mathcal{C}_\kappa \bar{r}_1 \overset{?}{=} \left( \mathcal{C}_{\Gamma_1} T_0^{\Gamma_1}(\bar{z}_\kappa), \ldots, \mathcal{C}_{\Gamma_1} T_{n-1}^{\Gamma_1}(\bar{z}_\kappa) \right) \bar{D}\bar{r}_1. \]
The first rows of these definitions must also be altered. We will see that these matrices are sufficient to represent the Cauchy transform over the other curves Γ₂, …, Γ₆ as well, due to symmetry.
In order to construct these matrices, we must compute the Cauchy transform for our
basis, which can be written in terms of the Cauchy transform of the Chebyshev polynomials
over the unit interval.
Computation of the Cauchy transform over the unit interval
We have an expression for C(−1,1)Tk in closed form. This expression is derived by mapping
the interval to the unit circle, using the Joukowsky map and its inverses:
Definition 3.1  The Joukowsky map
\[ T(z) = \frac{1}{2}\left( z + \frac{1}{z} \right) \]
maps both the interior and the exterior of the unit circle to ℂ\[−1, 1]. Thus it has two inverses defined in ℂ\[−1, 1]:
\[ T_\pm^{-1}(t) = t \mp \sqrt{t-1}\,\sqrt{t+1}. \]
T_+^{-1} and T_−^{-1} map ℂ\[−1, 1] to the interior and exterior of the circle, respectively. Since T_±^{-1} each have a branch cut along [−1, 1], we need two additional inverses:
\[ T_\uparrow^{-1}(t) = t + i\sqrt{1-t}\,\sqrt{1+t} \qquad\text{and}\qquad T_\downarrow^{-1}(t) = t - i\sqrt{1-t}\,\sqrt{1+t}. \]
These map [−1, 1] to the upper and lower halves of the unit circle, respectively, and are analytic along the interval.
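These maps can be checked numerically; the sketch below (function names ours) verifies that both branches invert T and land inside/outside the unit circle as claimed, relying on the principal branch of the complex square root:

```python
import numpy as np

def T(z):  # Joukowsky map
    return (z + 1 / z) / 2

def Tinv_plus(t):   # branch mapping C\[-1,1] into the unit disk
    return t - np.sqrt(t - 1) * np.sqrt(t + 1)

def Tinv_minus(t):  # branch mapping C\[-1,1] outside the unit disk
    return t + np.sqrt(t - 1) * np.sqrt(t + 1)

t = 2.0 + 1.5j
print(abs(T(Tinv_plus(t)) - t) < 1e-12, abs(Tinv_plus(t)) < 1)    # True True
print(abs(T(Tinv_minus(t)) - t) < 1e-12, abs(Tinv_minus(t)) > 1)  # True True
```

Writing the square root as √(t−1)√(t+1), rather than √(t²−1), is what places the branch cut exactly on [−1, 1].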
Using these maps, we obtain the following formulæ:
Theorem 3.2  Define
\[ \mu_m(z) = \sum_{j=1}^{\lfloor (m+1)/2 \rfloor} \frac{z^{2j-1}}{2j-1}, \qquad \psi_0(z) = \frac{2}{i\pi}\,\operatorname{arctanh} z, \]
\[ \psi_m(z) = z^m \left[ \psi_0(z) - \frac{2}{i\pi} \begin{cases} \mu_{-m-1}(z) & m < 0 \\ \mu_m(1/z) & m > 0 \end{cases} \right]. \]
Then
\[ \mathcal{C}_{(-1,1)} T_k(t) = -\frac{1}{2}\left[ \psi_k\!\left(T_+^{-1}(t)\right) + \psi_{-k}\!\left(T_+^{-1}(t)\right) \right], \]
\[ \mathcal{C}_{(-1,1)} T_k(t) \underset{t\to -1}{\sim} -\frac{(-1)^k}{2i\pi} \left[ \log(-t-1) - \log 2 \right] + \frac{(-1)^k}{i\pi} \left[ \mu_{k-1}(-1) + \mu_k(-1) \right], \tag{3.1} \]
\[ \mathcal{C}_{(-1,1)} T_k(t) \underset{t\to 1}{\sim} \frac{1}{2i\pi} \left[ \log(t-1) - \log 2 \right] + \frac{1}{i\pi} \left[ \mu_{k-1}(1) + \mu_k(1) \right], \tag{3.2} \]
and, for t ∈ (−1, 1),
\[ \mathcal{C}^+_{(-1,1)} T_k(t) = -\frac{1}{2}\left[ \psi_k\!\left(T_\downarrow^{-1}(t)\right) + \psi_{-k}\!\left(T_\downarrow^{-1}(t)\right) \right], \qquad \mathcal{C}^-_{(-1,1)} T_k(t) = -\frac{1}{2}\left[ \psi_k\!\left(T_\uparrow^{-1}(t)\right) + \psi_{-k}\!\left(T_\uparrow^{-1}(t)\right) \right]. \]
Proof:
This theorem is a simplification of Theorem 6 in [25]. Take the definition above of ψ_m(z) for |z| < 1, and define for |z| > 1:
\[ \psi_0(z) = \frac{2}{i\pi}\,\operatorname{arctanh}\frac{1}{z}, \qquad \psi_m(z) = z^m \left[ \psi_0(z) - \frac{2}{i\pi} \begin{cases} \mu_{-m-1}(z) & m < 0 \\ \mu_m(1/z) & m > 0 \end{cases} \right]. \]
From [25] we have
\[ \mathcal{C}T_k(t) = -\frac{1}{4}\left[ \psi_k\!\left(T_+^{-1}(t)\right) + \psi_k\!\left(T_-^{-1}(t)\right) + \psi_{-k}\!\left(T_+^{-1}(t)\right) + \psi_{-k}\!\left(T_-^{-1}(t)\right) \right]. \tag{3.3} \]
We now sketch a proof of this result. The function ψ_m is defined so that, for |z| = 1,
\[ \psi_m^+(z) - \psi_m^-(z) = z^m \operatorname{sgn} \arg z \qquad\text{and}\qquad \psi_m(\infty) = 0. \]
For m = 0 this follows from the definition of arctanh. For m > 0, ψ_m is equal to z^m ψ₀(z) with the growth at ∞ subtracted out, determined by the necessary terms of the Taylor series of arctanh, namely μ_m. For m < 0, ψ_m is equal to z^m ψ₀(z) with the pole at zero subtracted out to ensure analyticity at zero, also determined from the Taylor series.
If g(z) = −f(T(z)) sgn arg z and ψ is the Cauchy transform of g over the unit circle, ψ = C_{{z:|z|=1}} g, then
\[ \mathcal{C}_{(-1,1)} f(t) = \frac{\psi\!\left(T_+^{-1}(t)\right) + \psi\!\left(T_-^{-1}(t)\right)}{2}. \]
This follows since
\[ \mathcal{C}^+_{(-1,1)} f(t) - \mathcal{C}^-_{(-1,1)} f(t) = \frac{\psi^+\!\left(T_\downarrow^{-1}(t)\right) + \psi^-\!\left(T_\uparrow^{-1}(t)\right) - \psi^+\!\left(T_\uparrow^{-1}(t)\right) - \psi^-\!\left(T_\downarrow^{-1}(t)\right)}{2} = f(t) \]
and
\[ \psi\!\left(T_+^{-1}(\infty)\right) + \psi\!\left(T_-^{-1}(\infty)\right) = \psi(0) = g_0 = 0, \]
as g is symmetric. This proves (3.3), since T_k(T(z)) = \frac{1}{2}\left[z^k + z^{-k}\right].
We can simplify (3.3). Note that T_-^{-1}(t) = 1/T_+^{-1}(t); therefore ψ₀(T_-^{-1}(t)) = ψ₀(T_+^{-1}(t)). Furthermore, for k > 0,
\[ \psi_k\!\left(T_-^{-1}(t)\right) = T_+^{-1}(t)^{-k}\left[ \psi_0\!\left(T_+^{-1}(t)\right) - \frac{2}{i\pi}\mu_k\!\left(T_+^{-1}(t)\right) \right] = \psi_{-k}\!\left(T_+^{-1}(t)\right) + \begin{cases} \frac{2i}{\pi k} & k \text{ odd} \\ 0 & k \text{ even} \end{cases} \]
and
\[ \psi_{-k}\!\left(T_-^{-1}(t)\right) = \psi_k\!\left(T_+^{-1}(t)\right) - \begin{cases} \frac{2i}{\pi k} & k \text{ odd} \\ 0 & k \text{ even} \end{cases}. \]
It follows that
\[ \psi_k\!\left(T_-^{-1}(t)\right) + \psi_{-k}\!\left(T_-^{-1}(t)\right) = \psi_k\!\left(T_+^{-1}(t)\right) + \psi_{-k}\!\left(T_+^{-1}(t)\right). \]
The asymptotic behaviour at the endpoints was shown in [25]. The expression for C^±T_k follows from the expression for CT_k and the fact that, for t ∈ (−1, 1), lim_{ε→0⁺} T_+^{-1}(t + iε) = T_↓^{-1}(t) and lim_{ε→0⁺} T_+^{-1}(t − iε) = T_↑^{-1}(t).
Q.E.D.
Remark: One approach to computing ψ_m is to use the definition above with high-precision arithmetic (which is necessary, as the definition is chosen precisely to cancel terms in the Taylor series). Another approach [25] is to rewrite the series in terms of the Lerch transcendent function [4], and use the method developed in [3] or the built-in Mathematica routine. A fast, numerically stable and accurate approach is to compute ψ_m by writing it in terms of the hypergeometric function, and then use a stable recurrence relation to subsequently obtain ψ_{m+1}, ψ_{m+2}, … [24].
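As a sanity check on the definitions above (not one of the computational strategies just listed), the jump of ψ₀ across the unit circle can be probed numerically; the near-circle offsets below are our own choice:

```python
import numpy as np

def psi0(z):
    # psi_0 from Theorem 3.2: (2/(i pi)) arctanh(z) inside the unit circle,
    # (2/(i pi)) arctanh(1/z) outside, so that psi0 vanishes at infinity.
    w = z if abs(z) < 1 else 1 / z
    return 2 / (1j * np.pi) * np.arctanh(w)

for theta in (0.7, -0.7):
    z = np.exp(1j * theta)
    jump = psi0(0.999 * z) - psi0(1.001 * z)  # interior minus exterior limit
    print(abs(jump - np.sign(theta)) < 1e-2)  # jump = z^0 sgn(arg z) -> True
```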
Computation of the Cauchy transform over a half ray
Similar to the development in [25], we can write C_{Γκ} in terms of C_{(−1,1)}: if f(t) = r(H_κ(t)) and Φ = C_{(−1,1)} f, then
\[ \mathcal{C}_{\Gamma_\kappa} r(z) = \Phi\!\left(H_\kappa^{-1}(z)\right) - \Phi^+(1). \]
This is easily confirmed by looking at the behaviour as z approaches Γ_κ from the left and right:
\[ \mathcal{C}^+_{\Gamma_\kappa} r(z) - \mathcal{C}^-_{\Gamma_\kappa} r(z) = \Phi^+\!\left(H_\kappa^{-1}(z)\right) - \Phi^-\!\left(H_\kappa^{-1}(z)\right) = f\!\left(H_\kappa^{-1}(z)\right) = r(z) \]
and
\[ \lim_{z\to\infty} \mathcal{C}_{\Gamma_\kappa} r(z) = \lim_{z\to\infty} \Phi\!\left(H_\kappa^{-1}(z)\right) - \Phi^+(1) = \lim_{t\to 1} \Phi(t) - \Phi^+(1) = 0, \]
since
\[ \Phi^+(1) - \Phi^-(1) = r(\infty) = 0. \]
Using this formula, we can derive an expression for C_{Γκ} T_k^{Γκ}:
\[ \mathcal{C}_{\Gamma_\kappa} T_k^{\Gamma_\kappa}(z) = \mathcal{C}_{(-1,1)}[T_k - 1]\!\left(H_\kappa^{-1}(z)\right) - \mathcal{C}^+_{(-1,1)}[T_k - 1](1). \]
The unbounded terms of C_{(−1,1)}[T_k − 1](t) cancel as t → 1, so we are left with
\[ \mathcal{C}^+_{(-1,1)}[T_k - 1](1) = \frac{1}{i\pi}\left[ \mu_{k-1}(1) + \mu_k(1) \right]. \]
We thus obtain
\[ \mathcal{C}_{\Gamma_\kappa} T_k^{\Gamma_\kappa}(z) = -\frac{1}{2}\left[ \psi_k\!\left(T_+^{-1}\!\left(H_\kappa^{-1}(z)\right)\right) + \psi_{-k}\!\left(T_+^{-1}\!\left(H_\kappa^{-1}(z)\right)\right) - 2\psi_0\!\left(T_+^{-1}\!\left(H_\kappa^{-1}(z)\right)\right) \right] - \frac{1}{i\pi}\left[ \mu_{k-1}(1) + \mu_k(1) \right]. \tag{3.4} \]
Behaviour at zero
The expression (3.4) blows up at zero. We describe this behaviour in order to choose a value to assign to the first row of the Cauchy matrices, corresponding to evaluation at zero. From Theorem 3.2 and (3.4) we have
\[ \mathcal{C}_{\Gamma_\kappa} T_k^{\Gamma_\kappa}(z) \underset{z\to 0}{\sim} \frac{1 - (-1)^k}{2i\pi} \left[ \log\!\left(-H_\kappa^{-1}(z) - 1\right) - \log 2 \right] + \frac{(-1)^k}{i\pi}\left[ \mu_{k-1}(-1) + \mu_k(-1) \right] - \frac{1}{i\pi}\left[ \mu_{k-1}(1) + \mu_k(1) \right]. \]
From the expression
\[ H_\kappa^{-1}(z) = \frac{z - e^{i\pi\left(\frac16 + \frac{\kappa-1}{3}\right)}}{z + e^{i\pi\left(\frac16 + \frac{\kappa-1}{3}\right)}} \]
we find that
\[ H_\kappa^{-1}(z) = -1 + 2e^{-i\pi\left(\frac16 + \frac{\kappa-1}{3}\right)} z + O\!\left(z^2\right). \]
Therefore
\[ \mathcal{C}_{\Gamma_\kappa} T_k^{\Gamma_\kappa}(z) \sim \frac{1 - (-1)^k}{2i\pi} \left[ \log\!\left(-2e^{-i\pi\left(\frac16 + \frac{\kappa-1}{3}\right)} z\right) - \log 2 \right] + \frac{(-1)^k}{i\pi}\left[ \mu_{k-1}(-1) + \mu_k(-1) \right] - \frac{1}{i\pi}\left[ \mu_{k-1}(1) + \mu_k(1) \right] \]
\[ = \frac{1 - (-1)^k}{2i\pi} \log|z| + \frac{1 - (-1)^k}{2i\pi}\, i \arg\!\left(-e^{-i\pi\left(\frac16 + \frac{\kappa-1}{3}\right)} z\right) + \frac{(-1)^k}{i\pi}\left[ \mu_{k-1}(-1) + \mu_k(-1) \right] - \frac{1}{i\pi}\left[ \mu_{k-1}(1) + \mu_k(1) \right]. \tag{3.5} \]
Our choice for the value of the Cauchy transform at zero is this expression with the term that grows like log|z| dropped. Note that it depends on the angle at which we approach zero.
Constructing Cauchy matrices
We can now construct the Cauchy matrices. We begin with C⁺. Note in (3.5) that the term dependent on the angle is, for the limit as z approaches Γ₁ from the left,
\[ \lim_{\varepsilon\to 0^+} \arg\!\left( -e^{-i\frac{\pi}{6}} e^{i\left(\frac{\pi}{6}+\varepsilon\right)} \right) = \lim_{\varepsilon\to 0^+} \arg\!\left( e^{i(-\pi+\varepsilon)} \right) = -\pi. \]
Thus we define
\[ \mathcal{C}^+ = \left( \varphi_0^+(\bar{\chi}), \ldots, \varphi_{n-1}^+(\bar{\chi}) \right) \bar{D} \]
for
\[ \varphi_k^+(t) = -\frac{1}{2}\left[ \psi_k\!\left(T_\downarrow^{-1}(t)\right) + \psi_{-k}\!\left(T_\downarrow^{-1}(t)\right) - 2\psi_0\!\left(T_\downarrow^{-1}(t)\right) \right] - \frac{\mu_{k-1}(1) + \mu_k(1)}{i\pi}, \]
\[ \varphi_k^+(-1) = -\frac{1 - (-1)^k}{2} + (-1)^k\,\frac{\mu_{k-1}(-1) + \mu_k(-1)}{i\pi} - \frac{\mu_{k-1}(1) + \mu_k(1)}{i\pi}. \]
Similarly, we can define C⁻ via
\[ \mathcal{C}^- = \left( \varphi_0^-(\bar{\chi}), \ldots, \varphi_{n-1}^-(\bar{\chi}) \right) \bar{D} \]
for
\[ \varphi_k^-(t) = -\frac{1}{2}\left[ \psi_k\!\left(T_\uparrow^{-1}(t)\right) + \psi_{-k}\!\left(T_\uparrow^{-1}(t)\right) - 2\psi_0\!\left(T_\uparrow^{-1}(t)\right) \right] - \frac{\mu_{k-1}(1) + \mu_k(1)}{i\pi}, \]
\[ \varphi_k^-(-1) = \frac{1 - (-1)^k}{2} + (-1)^k\,\frac{\mu_{k-1}(-1) + \mu_k(-1)}{i\pi} - \frac{\mu_{k-1}(1) + \mu_k(1)}{i\pi}. \]
Finally, we define C_γ, corresponding to evaluating C_{Γ₁} along Γ_γ for γ = 2, …, 6. Now z in (3.5) lies in Γ_γ, hence
\[ \arg\!\left( -e^{-i\frac{\pi}{6}} z \right) = \arg\!\left( -e^{-i\frac{\pi}{6}} e^{i\pi\left(\frac16 + \frac{\gamma-1}{3}\right)} \right) = \pi\,\frac{\gamma - 4}{3}. \]
Therefore we define
\[ \mathcal{C}_\gamma = \left( \varphi_{0,\gamma}\!\left(H_1^{-1}(\bar{z}_\gamma)\right), \ldots, \varphi_{n-1,\gamma}\!\left(H_1^{-1}(\bar{z}_\gamma)\right) \right) \bar{D}, \]
where
\[ \varphi_{k,\gamma}(t) = -\frac{1}{2}\left[ \psi_k\!\left(T_+^{-1}(t)\right) + \psi_{-k}\!\left(T_+^{-1}(t)\right) - 2\psi_0\!\left(T_+^{-1}(t)\right) \right] - \frac{\mu_{k-1}(1) + \mu_k(1)}{i\pi}, \]
\[ \varphi_{k,\gamma}(-1) = \left(1 - (-1)^k\right)\left( \frac{\gamma}{6} - \frac{2}{3} \right) + (-1)^k\,\frac{\mu_{k-1}(-1) + \mu_k(-1)}{i\pi} - \frac{\mu_{k-1}(1) + \mu_k(1)}{i\pi}. \]
Ostensibly we would need to redo these calculations for each Γ_κ to compute C_{Γκ}. However, C is invariant under rotation. Therefore (ignoring the unboundedness at zero)
\[ \mathcal{C}^\pm \bar{r}_\kappa = \mathcal{C}^\pm_{\Gamma_\kappa} e\!\left(H_\kappa^{-1}(\bar{z}_\kappa)\right)^\top \bar{r}_\kappa, \qquad \mathcal{C}_{\gamma-\kappa+1 \bmod 6}\, \bar{r}_\kappa = \mathcal{C}_{\Gamma_\kappa} e\!\left(H_\kappa^{-1}(\bar{z}_\gamma)\right)^\top \bar{r}_\kappa. \]
Since the Cauchy transform of a function r(z) defined on Γ is
\[ \mathcal{C}_\Gamma r(z) = \mathcal{C}_{\Gamma_1} r(z) + \cdots + \mathcal{C}_{\Gamma_6} r(z), \]
we obtain
\[ \mathcal{C}^\pm_\Gamma r(\bar{z}_1) \approx \mathcal{C}^\pm_{\Gamma_1} e\!\left(H_1^{-1}(\bar{z}_1)\right)^\top \bar{r}_1 + \mathcal{C}_{\Gamma_2} e\!\left(H_2^{-1}(\bar{z}_1)\right)^\top \bar{r}_2 + \cdots + \mathcal{C}_{\Gamma_6} e\!\left(H_6^{-1}(\bar{z}_1)\right)^\top \bar{r}_6 \]
\[ \approx \mathcal{C}^\pm \bar{r}_1 + \mathcal{C}_6 \bar{r}_2 + \mathcal{C}_5 \bar{r}_3 + \cdots + \mathcal{C}_2 \bar{r}_6, \tag{3.6} \]
\[ \mathcal{C}^\pm_\Gamma r(\bar{z}_2) \approx \mathcal{C}_2 \bar{r}_1 + \mathcal{C}^\pm \bar{r}_2 + \mathcal{C}_6 \bar{r}_3 + \cdots + \mathcal{C}_3 \bar{r}_6, \]
\[ \vdots \]
\[ \mathcal{C}^\pm_\Gamma r(\bar{z}_6) \approx \mathcal{C}_6 \bar{r}_1 + \cdots + \mathcal{C}_2 \bar{r}_5 + \mathcal{C}^\pm \bar{r}_6. \]
4. Constructing the linear system
We now use the matrices C^±, C₂, …, C₆ to construct the linear system (1.6). We represent U via the definition
\[ U^{(ij)} = \mathcal{C}V^{(ij)} = \sum_{\kappa=1}^{6} \mathcal{C}_{\Gamma_\kappa} V^{(ij)}_\kappa \approx \sum_{\kappa=1}^{6} \mathcal{C}_{\Gamma_\kappa} e\!\left(H_\kappa^{-1}(z)\right)^\top v^{(ij)}_\kappa. \]
Given v^{(ij)}_1, …, v^{(ij)}_6, we can determine the approximation of C^±V^{(ij)} at the points z̄₁, …, z̄₆ using (3.6):
\[ c^{\pm,(ij)}_1 = \mathcal{C}^\pm v^{(ij)}_1 + \mathcal{C}_6 v^{(ij)}_2 + \cdots + \mathcal{C}_2 v^{(ij)}_6, \]
\[ \vdots \]
\[ c^{\pm,(ij)}_6 = \mathcal{C}_6 v^{(ij)}_1 + \cdots + \mathcal{C}_2 v^{(ij)}_5 + \mathcal{C}^\pm v^{(ij)}_6. \]
Hence we determine v^{(ij)}_1, …, v^{(ij)}_6 by solving the 24(n−1) × 24(n−1) linear system
\[ c^+_1 - c^-_1 G(\bar{z}_1) = G(\bar{z}_1) - I, \quad \ldots, \quad c^+_6 - c^-_6 G(\bar{z}_6) = G(\bar{z}_6) - I. \tag{4.1} \]
Here we take
\[ G(\bar{z}_k) = \begin{pmatrix} G^{(11)}(\bar{z}_k) & G^{(12)}(\bar{z}_k) \\ G^{(21)}(\bar{z}_k) & G^{(22)}(\bar{z}_k) \end{pmatrix} \]
and define multiplication of two matrices whose entries are vectors as
\[ \begin{pmatrix} c^{(11)} & c^{(12)} \\ c^{(21)} & c^{(22)} \end{pmatrix} \begin{pmatrix} g^{(11)} & g^{(12)} \\ g^{(21)} & g^{(22)} \end{pmatrix} = \begin{pmatrix} \operatorname{diag}(c^{(11)})g^{(11)} + \operatorname{diag}(c^{(12)})g^{(21)} & \operatorname{diag}(c^{(11)})g^{(12)} + \operatorname{diag}(c^{(12)})g^{(22)} \\ \operatorname{diag}(c^{(21)})g^{(11)} + \operatorname{diag}(c^{(22)})g^{(21)} & \operatorname{diag}(c^{(21)})g^{(12)} + \operatorname{diag}(c^{(22)})g^{(22)} \end{pmatrix}. \]
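For vectors, diag(c)g is simply the entrywise product c·g, so this block multiplication is an ordinary 2 × 2 matrix product carried out independently at each collocation point. A minimal sketch (function and variable names ours):

```python
import numpy as np

def block_mul(c, g):
    # 2x2 matrices with length-m vector entries; diag(c)g is the entrywise
    # product, so this is a pointwise 2x2 matrix multiplication.
    return {
        (1, 1): c[1, 1] * g[1, 1] + c[1, 2] * g[2, 1],
        (1, 2): c[1, 1] * g[1, 2] + c[1, 2] * g[2, 2],
        (2, 1): c[2, 1] * g[1, 1] + c[2, 2] * g[2, 1],
        (2, 2): c[2, 1] * g[1, 2] + c[2, 2] * g[2, 2],
    }

m = 3
c = {(i, j): np.arange(m) + i + j for i in (1, 2) for j in (1, 2)}
g = {(i, j): np.ones(m) * (i - j) for i in (1, 2) for j in (1, 2)}
out = block_mul(c, g)
# consistency with an ordinary 2x2 product at each collocation point
for p in range(m):
    C = np.array([[c[1, 1][p], c[1, 2][p]], [c[2, 1][p], c[2, 2][p]]])
    G = np.array([[g[1, 1][p], g[1, 2][p]], [g[2, 1][p], g[2, 2][p]]])
    print(np.allclose(C @ G, [[out[1, 1][p], out[1, 2][p]],
                              [out[2, 1][p], out[2, 2][p]]]))
```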
Reducing the dimension of the system
We can derive properties of the solution which will allow us to decrease the dimension of the linear system. Now consider the case where γ is odd. Since C_{Γκ}V^{(ij)}_κ is analytic off Γ_κ and, by definition, C⁺_{Γγ}r − C⁻_{Γγ}r = r along Γ_γ, we obtain (here the ± refer to the limits at a point in Γ_γ)
\[ U^{(ij)+} - U^{(ij)-} = \sum_{\kappa=1}^{6} \left( \mathcal{C}^+_{\Gamma_\kappa} - \mathcal{C}^-_{\Gamma_\kappa} \right) V^{(ij)}_\kappa = \left( \mathcal{C}^+_{\Gamma_\gamma} - \mathcal{C}^-_{\Gamma_\gamma} \right) V^{(ij)}_\gamma = V^{(ij)}_\gamma \]
along Γ_γ. This implies that
\[ \mathcal{L}\mathcal{C}V = U^+ - U^- G = \begin{pmatrix} U^{(11)+} & U^{(12)+} \\ U^{(21)+} & U^{(22)+} \end{pmatrix} - \begin{pmatrix} U^{(11)-} & U^{(12)-} \\ U^{(21)-} & U^{(22)-} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ s_\gamma e^{\frac{8i}{3}z^3+2ixz} & 1 \end{pmatrix} \]
\[ = \begin{pmatrix} U^{(11)+} - U^{(11)-} - U^{(12)-} s_\gamma e^{\frac{8i}{3}z^3+2ixz} & U^{(12)+} - U^{(12)-} \\ U^{(21)+} - U^{(21)-} - U^{(22)-} s_\gamma e^{\frac{8i}{3}z^3+2ixz} & U^{(22)+} - U^{(22)-} \end{pmatrix} = \begin{pmatrix} V^{(11)}_\gamma - s_\gamma e^{\frac{8i}{3}z^3+2ixz}\,\mathcal{C}^- V^{(12)} & V^{(12)}_\gamma \\ V^{(21)}_\gamma - s_\gamma e^{\frac{8i}{3}z^3+2ixz}\,\mathcal{C}^- V^{(22)} & V^{(22)}_\gamma \end{pmatrix}. \]
On the other hand, the right-hand side is
\[ G - I = \begin{pmatrix} 0 & 0 \\ s_\gamma e^{\frac{8i}{3}z^3+2ixz} & 0 \end{pmatrix}. \]
Therefore we trivially obtain that 0 = V^{(12)}_γ = V^{(22)}_γ; in other words, the contribution to U^{(12)} and U^{(22)} from Γ_γ is zero, and we take 0 = v^{(12)}_γ = v^{(22)}_γ.
Similarly, when γ is even we obtain
\[ \mathcal{L}\mathcal{C}V = \begin{pmatrix} V^{(11)}_\gamma & V^{(12)}_\gamma - s_\gamma e^{-\frac{8i}{3}z^3-2ixz}\,\mathcal{C}^- V^{(11)} \\ V^{(21)}_\gamma & V^{(22)}_\gamma - s_\gamma e^{-\frac{8i}{3}z^3-2ixz}\,\mathcal{C}^- V^{(21)} \end{pmatrix} = \begin{pmatrix} 0 & s_\gamma e^{-\frac{8i}{3}z^3-2ixz} \\ 0 & 0 \end{pmatrix}. \]
Therefore we take 0 = v^{(11)}_γ = v^{(21)}_γ. In other words, V^{(11)}_γ and V^{(21)}_γ are only nonzero for odd γ, and V^{(12)}_γ and V^{(22)}_γ are only nonzero for even γ.
Finally, we note that the top rows of V₁, …, V₆ are completely independent of the bottom rows.
Using these simplifications, we can reduce (4.1) from one 24(n−1) × 24(n−1) system to two 6(n−1) × 6(n−1) systems. The first system is for the unknowns associated with the (11) and (12) entries of V:
\[ v^{(11)}_1 - s_1 \operatorname{diag}\!\left(e^{\frac{8i}{3}\bar{z}_1^3 + 2ix\bar{z}_1}\right)\left[ \mathcal{C}_6 v^{(12)}_2 + \mathcal{C}_4 v^{(12)}_4 + \mathcal{C}_2 v^{(12)}_6 \right] = 0, \]
\[ v^{(12)}_2 - s_2 \operatorname{diag}\!\left(e^{-\frac{8i}{3}\bar{z}_2^3 - 2ix\bar{z}_2}\right)\left[ \mathcal{C}_2 v^{(11)}_1 + \mathcal{C}_6 v^{(11)}_3 + \mathcal{C}_4 v^{(11)}_5 \right] = s_2 e^{-\frac{8i}{3}\bar{z}_2^3 - 2ix\bar{z}_2}, \]
\[ v^{(11)}_3 - s_3 \operatorname{diag}\!\left(e^{\frac{8i}{3}\bar{z}_3^3 + 2ix\bar{z}_3}\right)\left[ \mathcal{C}_2 v^{(12)}_2 + \mathcal{C}_6 v^{(12)}_4 + \mathcal{C}_4 v^{(12)}_6 \right] = 0, \]
\[ v^{(12)}_4 + s_1 \operatorname{diag}\!\left(e^{-\frac{8i}{3}\bar{z}_4^3 - 2ix\bar{z}_4}\right)\left[ \mathcal{C}_4 v^{(11)}_1 + \mathcal{C}_2 v^{(11)}_3 + \mathcal{C}_6 v^{(11)}_5 \right] = -s_1 e^{-\frac{8i}{3}\bar{z}_4^3 - 2ix\bar{z}_4}, \]
\[ v^{(11)}_5 + s_2 \operatorname{diag}\!\left(e^{\frac{8i}{3}\bar{z}_5^3 + 2ix\bar{z}_5}\right)\left[ \mathcal{C}_4 v^{(12)}_2 + \mathcal{C}_2 v^{(12)}_4 + \mathcal{C}_6 v^{(12)}_6 \right] = 0, \]
\[ v^{(12)}_6 + s_3 \operatorname{diag}\!\left(e^{-\frac{8i}{3}\bar{z}_6^3 - 2ix\bar{z}_6}\right)\left[ \mathcal{C}_6 v^{(11)}_1 + \mathcal{C}_4 v^{(11)}_3 + \mathcal{C}_2 v^{(11)}_5 \right] = -s_3 e^{-\frac{8i}{3}\bar{z}_6^3 - 2ix\bar{z}_6}. \]
The second system is for the unknowns associated with the (21) and (22) entries:
\[ v^{(21)}_1 - s_1 \operatorname{diag}\!\left(e^{\frac{8i}{3}\bar{z}_1^3 + 2ix\bar{z}_1}\right)\left[ \mathcal{C}_6 v^{(22)}_2 + \mathcal{C}_4 v^{(22)}_4 + \mathcal{C}_2 v^{(22)}_6 \right] = s_1 e^{\frac{8i}{3}\bar{z}_1^3 + 2ix\bar{z}_1}, \]
\[ v^{(22)}_2 - s_2 \operatorname{diag}\!\left(e^{-\frac{8i}{3}\bar{z}_2^3 - 2ix\bar{z}_2}\right)\left[ \mathcal{C}_2 v^{(21)}_1 + \mathcal{C}_6 v^{(21)}_3 + \mathcal{C}_4 v^{(21)}_5 \right] = 0, \]
\[ v^{(21)}_3 - s_3 \operatorname{diag}\!\left(e^{\frac{8i}{3}\bar{z}_3^3 + 2ix\bar{z}_3}\right)\left[ \mathcal{C}_2 v^{(22)}_2 + \mathcal{C}_6 v^{(22)}_4 + \mathcal{C}_4 v^{(22)}_6 \right] = s_3 e^{\frac{8i}{3}\bar{z}_3^3 + 2ix\bar{z}_3}, \]
\[ v^{(22)}_4 + s_1 \operatorname{diag}\!\left(e^{-\frac{8i}{3}\bar{z}_4^3 - 2ix\bar{z}_4}\right)\left[ \mathcal{C}_4 v^{(21)}_1 + \mathcal{C}_2 v^{(21)}_3 + \mathcal{C}_6 v^{(21)}_5 \right] = 0, \]
\[ v^{(21)}_5 + s_2 \operatorname{diag}\!\left(e^{\frac{8i}{3}\bar{z}_5^3 + 2ix\bar{z}_5}\right)\left[ \mathcal{C}_4 v^{(22)}_2 + \mathcal{C}_2 v^{(22)}_4 + \mathcal{C}_6 v^{(22)}_6 \right] = -s_2 e^{\frac{8i}{3}\bar{z}_5^3 + 2ix\bar{z}_5}, \]
\[ v^{(22)}_6 + s_3 \operatorname{diag}\!\left(e^{-\frac{8i}{3}\bar{z}_6^3 - 2ix\bar{z}_6}\right)\left[ \mathcal{C}_6 v^{(21)}_1 + \mathcal{C}_4 v^{(21)}_3 + \mathcal{C}_2 v^{(21)}_5 \right] = 0. \]
Note that the left-hand sides of both linear systems are the same, so most of the computation can be reused.
Though we have described how to construct C^± and C_γ for odd γ, we only require the computation of C₂, C₄ and C₆. This simplification will not necessarily be possible for other Riemann–Hilbert problems.
For the conversion from the Riemann–Hilbert problem to the value of the solution to Painlevé II at x, we require the (12) entry of Φ, which is also the (12) entry of U. Thus we need only solve the first linear system. Assuming the linear system is nonsingular, we denote the solution vectors for a given n as v^{(ij),n}_κ, unless n is implied by context.
[Figure 2: The convergence of the first entry of the solution vectors (left graph) and the approximation of u_n compared to u_200 (right graph) for x = 0 (plain), 6 (dashed), 8 (dotted) and i (thick).]
The approximations of V and U are
\[ V^n_\kappa(z) = \begin{pmatrix} e\!\left(H_\kappa^{-1}(z)\right)^\top v^{(11),n}_\kappa & e\!\left(H_\kappa^{-1}(z)\right)^\top v^{(12),n}_\kappa \\ e\!\left(H_\kappa^{-1}(z)\right)^\top v^{(21),n}_\kappa & e\!\left(H_\kappa^{-1}(z)\right)^\top v^{(22),n}_\kappa \end{pmatrix} \]
and
\[ U^n(z) = \mathcal{C}_{\Gamma_1} V^n_1(z) + \cdots + \mathcal{C}_{\Gamma_6} V^n_6(z), \]
where the Cauchy transforms can be computed as in Section 3.
As an example, consider the choice of constants (s₁, s₂, s₃) = (1 + i, −2, 1 − i). Since s₁ = s̄₃, we know that the corresponding solution to Painlevé II is real on the real axis [15]. To demonstrate the rate of convergence, in Figure 2 we compare the first entry of the solution vectors (which corresponds to the value at zero along each Γ_κ) for consecutive choices of n:
\[ \left\| \begin{pmatrix} e_1^\top\!\left( v^{(11),n}_1 - v^{(11),n+1}_1 \right) \\ e_1^\top\!\left( v^{(11),n}_3 - v^{(11),n+1}_3 \right) \\ e_1^\top\!\left( v^{(11),n}_5 - v^{(11),n+1}_5 \right) \\ e_1^\top\!\left( v^{(12),n}_2 - v^{(12),n+1}_2 \right) \\ e_1^\top\!\left( v^{(12),n}_4 - v^{(12),n+1}_4 \right) \\ e_1^\top\!\left( v^{(12),n}_6 - v^{(12),n+1}_6 \right) \end{pmatrix} \right\|_\infty. \]
As can be seen, these values converge spectrally fast, including for complex x, though the rate of convergence and the stability degenerate as x becomes large.
Properties of the solution
In the previous example, the linear system was always nonsingular. We cannot expect this to always be the case: if x corresponds to a pole of the solution u(x), then the corresponding Riemann–Hilbert problem itself is not solvable. We do, however, know the following:

Theorem 4.1  The two linear systems are solvable for sufficiently small (s₁, s₂, s₃).

Proof: When (s₁, s₂, s₃) = (0, 0, 0), the matrix associated with each linear system is simply an identity operator. Thus continuity of eigenvalues proves the result. Q.E.D.
In the construction of C_κ, we determined the value at zero by assuming that the solution was bounded. This will be true when the hypotheses of the following lemma are satisfied:

Lemma 4.2  Suppose that the linear system is nonsingular and that the computed solution's values at zero sum to zero:
\[ 0 = e_1^\top\!\left[ v^{(11)}_1 + v^{(11)}_3 + v^{(11)}_5 \right], \qquad 0 = e_1^\top\!\left[ v^{(12)}_2 + v^{(12)}_4 + v^{(12)}_6 \right], \]
\[ 0 = e_1^\top\!\left[ v^{(21)}_1 + v^{(21)}_3 + v^{(21)}_5 \right], \qquad 0 = e_1^\top\!\left[ v^{(22)}_2 + v^{(22)}_4 + v^{(22)}_6 \right]. \]
Then the solution Uⁿ(z) is analytic everywhere off Γ, bounded at zero and satisfies the relationship (1.5) at the points z̄₁, …, z̄₆.
Proof:
The first part of the lemma follows since C_{Γk} is analytic off Γ_k.
For the second part of the lemma, we consider the (11) entry, as the proof for the other entries is the same. The only possible blow-up from the Cauchy transforms is at zero. From (3.5) we find that the behaviour at zero is
\[ U^{(11),n}(z) = \mathcal{C}_{\Gamma_1} e\!\left(H_1^{-1}(z)\right)^\top v^{(11)}_1 + \mathcal{C}_{\Gamma_3} e\!\left(H_3^{-1}(z)\right)^\top v^{(11)}_3 + \mathcal{C}_{\Gamma_5} e\!\left(H_5^{-1}(z)\right)^\top v^{(11)}_5 \]
\[ \sim \frac{1}{2i\pi}\log|z|\;\left(0, 2, 0, \ldots, 1 - (-1)^{n-1}\right) D\left[ v^{(11)}_1 + v^{(11)}_3 + v^{(11)}_5 \right] + D_{\arg z}, \]
where D_{arg z} is a bounded constant depending only on the argument of z. Now we know that
\[ (1, \ldots, 1)\, D v^{(11)}_\kappa = e_n^\top v^{(11)}_\kappa = 0, \]
as we construct the vector so that the value corresponding to infinity is zero. Moreover, (1, …, (−1)^{n−1}) D = e₁^⊤. Therefore we have
\[ U^{(11),n}(z) \sim -\frac{1}{2i\pi}\log|z|\; e_1^\top\!\left[ v^{(11)}_1 + v^{(11)}_3 + v^{(11)}_5 \right] + D_{\arg z}; \]
hence the logarithmic term is cancelled and there is no blow-up.
The final part of the lemma follows since, when the logarithmic terms are cancelled, the value chosen at zero is precisely the value of the Cauchy transform itself.
Q.E.D.
In the following theorem we demonstrate that, subject to a second constraint, the conditions of the previous lemma are satisfied.

Theorem 4.3  Suppose that the linear system is nonsingular and that
\[ s_1 s_3 - s_1 s_2 - s_2 s_3 \neq 9. \]
Then the solution Uⁿ(z) is analytic everywhere off Γ, bounded at zero and satisfies the relationship (1.5) at the points z̄₁, …, z̄₆.

Proof:
We focus on the (11) and (12) entries, as the proof for the other two entries is equivalent. The theorem will follow if we can demonstrate that the first entries sum to zero:
\[ \Sigma = 0 \quad\text{for}\quad \Sigma = \left( e_1^\top\!\left[ v^{(11)}_1 + v^{(11)}_3 + v^{(11)}_5 \right],\;\; e_1^\top\!\left[ v^{(12)}_2 + v^{(12)}_4 + v^{(12)}_6 \right] \right). \]
Define
\[ \Phi^\pm_1 = \left( e_1^\top\!\left( \mathcal{C}^\pm v^{(11)}_1 + \mathcal{C}_5 v^{(11)}_3 + \mathcal{C}_3 v^{(11)}_5 \right) + 1,\;\; e_1^\top\!\left( \mathcal{C}_6 v^{(12)}_2 + \mathcal{C}_4 v^{(12)}_4 + \mathcal{C}_2 v^{(12)}_6 \right) \right), \]
\[ \Phi^\pm_2 = \left( e_1^\top\!\left( \mathcal{C}_2 v^{(11)}_1 + \mathcal{C}_6 v^{(11)}_3 + \mathcal{C}_4 v^{(11)}_5 \right) + 1,\;\; e_1^\top\!\left( \mathcal{C}^\pm v^{(12)}_2 + \mathcal{C}_5 v^{(12)}_4 + \mathcal{C}_3 v^{(12)}_6 \right) \right), \]
\[ \vdots \]
\[ \Phi^\pm_6 = \left( e_1^\top\!\left( \mathcal{C}_6 v^{(11)}_1 + \mathcal{C}_4 v^{(11)}_3 + \mathcal{C}_2 v^{(11)}_5 \right) + 1,\;\; e_1^\top\!\left( \mathcal{C}_5 v^{(12)}_2 + \mathcal{C}_3 v^{(12)}_4 + \mathcal{C}^\pm v^{(12)}_6 \right) \right). \]
We assert that
\[ \Phi^+_\kappa = \Phi^-_{\kappa+1} + \frac{\Sigma}{6}. \]
We first note that, for j = 2, …, n − 1,
\[ e_1^\top \mathcal{C}^+ e_j = e_1^\top \mathcal{C}^- e_j = e_1^\top \mathcal{C}_2 e_j = \cdots = e_1^\top \mathcal{C}_6 e_j. \]
This follows since (1, −1, …, (−1)ⁿ)De_j = (1, 1, …, 1)De_j = 0, and thus the term depending on γ is cancelled. Therefore, for some constant vector d and constants v₁ = e₁^⊤v^{(11)}_1, …, v₆ = e₁^⊤v^{(12)}_6, we can write
\[ \Phi^+_1 = \left( e_1^\top\!\left( v_1\mathcal{C}^+ + v_3\mathcal{C}_5 + v_5\mathcal{C}_3 \right) e_1,\;\; e_1^\top\!\left( v_2\mathcal{C}_6 + v_4\mathcal{C}_4 + v_6\mathcal{C}_2 \right) e_1 \right) + d, \]
\[ \Phi^-_2 = \left( e_1^\top\!\left( v_1\mathcal{C}_2 + v_3\mathcal{C}_6 + v_5\mathcal{C}_4 \right) e_1,\;\; e_1^\top\!\left( v_2\mathcal{C}^- + v_4\mathcal{C}_5 + v_6\mathcal{C}_3 \right) e_1 \right) + d. \]
From the definition of φ and the facts that (1, −1, …, (−1)ⁿ)De₁ = 1 and (1, 1, …, 1)De₁ = 0, we find, for some constant D independent of γ, that
\[ e_1^\top \mathcal{C}^+ e_1 = \left( \varphi_0^+(-1), \ldots, \varphi_{n-1}^+(-1) \right) De_1 = \frac{1}{2} + D, \]
\[ e_1^\top \mathcal{C}^- e_1 = \left( \varphi_0^-(-1), \ldots, \varphi_{n-1}^-(-1) \right) De_1 = -\frac{1}{2} + D, \]
\[ e_1^\top \mathcal{C}_\gamma e_1 = \left( \varphi_{0,\gamma}(-1), \ldots, \varphi_{n-1,\gamma}(-1) \right) De_1 = \frac{2}{3} - \frac{\gamma}{6} + D. \]
Thus we get
\[ \Phi^+_1 - \Phi^-_2 = \left( \frac{v_1}{2} - \frac{v_1}{3} - \frac{v_3}{6} + \frac{v_3}{3} + \frac{v_5}{6},\;\; -\frac{v_2}{3} + \frac{v_2}{2} + \frac{v_4}{6} + \frac{v_6}{3} - \frac{v_6}{6} \right) = \frac{1}{6}\left( v_1 + v_3 + v_5,\; v_2 + v_4 + v_6 \right) = \frac{\Sigma}{6}. \]
Similar manipulations prove the identity along the other contours.
By the design of the linear system, we also know that
\[ \Phi^+_\kappa = \Phi^-_\kappa S_\kappa \]
for
\[ S_1 = \begin{pmatrix} 1 & 0 \\ s_1 & 1 \end{pmatrix}, \quad \cdots, \quad S_6 = \begin{pmatrix} 1 & s_6 \\ 0 & 1 \end{pmatrix}. \]
And, from the analytical development [18], we know that
\[ S_1 \cdots S_6 = I. \]
Therefore we obtain
\[ \Phi^-_1 = \Phi^-_1 S_1 \cdots S_6 = \Phi^+_1 S_2 \cdots S_6 = \frac{\Sigma}{6} S_2 \cdots S_6 + \Phi^-_2 S_2 \cdots S_6 = \frac{\Sigma}{6}\left( S_2 \cdots S_6 + S_3 \cdots S_6 + \cdots + S_6 + I \right) + \Phi^-_1. \]
Thus, unless
\[ S_2 \cdots S_6 + S_3 \cdots S_6 + \cdots + S_6 + I \]
happens to be singular, we know that Σ = 0. The determinant of this matrix is
\[ 36 + 4s_1 s_2 - 4s_1 s_3 + 4s_2 s_3. \]
Q.E.D.
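The identities used in this proof are easy to confirm numerically for a particular compatible choice of Stokes' constants; the following sketch (helper names ours, purely illustrative) checks both S₁⋯S₆ = I and the determinant formula:

```python
import numpy as np

def stokes_matrices(s1, s2, s3):
    # S_kappa is lower triangular for odd kappa, upper triangular for even,
    # with s4, s5, s6 = -s1, -s2, -s3.
    mats = []
    for k, sk in enumerate([s1, s2, s3, -s1, -s2, -s3], start=1):
        S = np.eye(2, dtype=complex)
        if k % 2 == 1:
            S[1, 0] = sk
        else:
            S[0, 1] = sk
        mats.append(S)
    return mats

s1, s2 = 1 + 1j, -2.0
s3 = (s2 - s1) / (1 + s1 * s2)   # enforce s1 - s2 + s3 + s1*s2*s3 = 0
S = stokes_matrices(s1, s2, s3)

prod = np.linalg.multi_dot(S)
print(np.allclose(prod, np.eye(2)))  # S1...S6 = I -> True

# M = S2...S6 + S3...S6 + ... + S6 + I
M = sum(np.linalg.multi_dot(S[k:]) if k < 5 else S[5] for k in range(1, 6)) + np.eye(2)
print(np.isclose(np.linalg.det(M), 36 + 4*s1*s2 - 4*s1*s3 + 4*s2*s3))  # True
```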
The condition that s₁s₃ − s₁s₂ − s₂s₃ ≠ 9 does not appear in the existing literature; the condition that s₁ − s₂ + s₃ + s₁s₂s₃ = 0 is sufficient for there to exist a unique bounded solution for all x not at a pole of u(x). However, the new condition states that, if it is satisfied, then the solution is still unique (and bounded) even if we allow it to be unbounded.
This new condition is also necessary for the system to be nonsingular [24]. For example,
when (s1, s2, s3) = (1,−2 − i, 2 − i) the linear system itself is singular, though it still has a
solution. In other words, the kernel of the associated matrix is nontrivial, and choosing the
wrong element of the kernel can cause the solution to not cancel at zero. In this case, the
problem can be rectified by imposing the additional conditions
$$0 = e_1^\top\bigl[v^{(11)}_1 + v^{(11)}_3 + v^{(11)}_5\bigr] = e_1^\top\bigl[v^{(12)}_2 + v^{(12)}_4 + v^{(12)}_6\bigr],$$
so that the linear system (now rectangular) is of full rank. It might be possible to show that
the system with these additional conditions always has a solution for large enough n when
the Riemann–Hilbert problem itself does. We leave this problem open.
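The effect of appending such conditions can be illustrated on a toy rank-deficient system: an extra condition that the true solution satisfies restores full column rank, after which a least-squares solve picks out the correct element of the solution family. Everything below is a random stand-in, with a single extra condition rather than two:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n))
A[-1] = A[0] + A[1]                  # force a nontrivial kernel (toy stand-in)
x_true = rng.standard_normal(n)
b = A @ x_true                       # consistent right-hand side; solution not unique

c = rng.standard_normal(n)           # stand-in for one appended linear condition
A_rect = np.vstack([A, c])           # rectangular system, now of full column rank
b_rect = np.append(b, c @ x_true)

x, *_ = np.linalg.lstsq(A_rect, b_rect, rcond=None)
assert np.linalg.matrix_rank(A_rect) == n
assert np.allclose(x, x_true)        # the extra row selects the right solution
```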
5. Converting the solution to the Riemann–Hilbert problem to the solution of
Painleve II
We have described a method for computing the solution of the Riemann–Hilbert problem
associated with the homogeneous Painleve II equation. Now we want to compute
$$u(x) = 2\lim_{z\to\infty} z\,\Phi^{(12)}(x;z) = 2\lim_{z\to\infty} z\,U^{(12)}(x;z) = 2\lim_{z\to\infty} z\,\mathcal{C}V^{(12)}(x;z)$$
$$= \frac{1}{i\pi}\int_\Gamma V^{(12)}(x;t)\lim_{z\to\infty}\frac{z}{t-z}\,dt = -\frac{1}{i\pi}\int_\Gamma V^{(12)}(x;t)\,dt$$
$$= -\frac{1}{i\pi}\Bigl[\int_{\Gamma_1} V^{(12)}(x;t)\,dt + \cdots + \int_{\Gamma_6} V^{(12)}(x;t)\,dt\Bigr].$$
We need to compute these integrals.
We can transform each integral to the unit interval and use Clenshaw–Curtis quadrature [8]. Let $w$ be the $n$ Clenshaw–Curtis weights associated with the Chebyshev–Lobatto points $\chi$, so that
$$\int_{-1}^{1} f(t)\,dt \approx w^\top f(\chi).$$
Note that this can be evaluated in $O(n\log n)$ time. We have, for $z$ on $\Gamma_\kappa$, the approximation $V^{(12)}(x;z) \approx e\bigl(H_\kappa^{-1}(z)\bigr)^\top v^{(12)}_\kappa$, therefore:
$$\int_{\Gamma_\kappa} V^{(12)}(x;z)\,dz = \int_{-1}^{1} V^{(12)}\bigl(x;H_\kappa(t)\bigr)H'_\kappa(t)\,dt \approx w^\top \operatorname{diag}\bigl(H'_\kappa(\chi)\bigr)v^{(12)}_\kappa.$$
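The nodes and weights of this rule are straightforward to generate; the following Python sketch uses a direct $O(n^2)$ cosine sum in the style of Trefethen's classic clencurt construction, for clarity, rather than the $O(n\log n)$ FFT evaluation mentioned above:

```python
import numpy as np

def clencurt(n):
    """Chebyshev-Lobatto points chi_j = cos(j*pi/n), j = 0..n, and the
    matching Clenshaw-Curtis weights w on [-1, 1]."""
    theta = np.pi * np.arange(n + 1) / n
    chi = np.cos(theta)
    w = np.zeros(n + 1)
    v = np.ones(n - 1)
    if n % 2 == 0:
        w[0] = w[n] = 1.0 / (n**2 - 1)
        for k in range(1, n // 2):
            v -= 2.0 * np.cos(2 * k * theta[1:n]) / (4 * k**2 - 1)
        v -= np.cos(n * theta[1:n]) / (n**2 - 1)
    else:
        w[0] = w[n] = 1.0 / n**2
        for k in range(1, (n - 1) // 2 + 1):
            v -= 2.0 * np.cos(2 * k * theta[1:n]) / (4 * k**2 - 1)
    w[1:n] = 2.0 * v / n
    return chi, w
```

With `n = 20`, `w @ np.exp(chi)` reproduces $\int_{-1}^1 e^t\,dt = e - e^{-1}$ essentially to machine precision, reflecting the spectral accuracy of the rule for analytic integrands.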
Figure 3: The real (plain) and imaginary (dashed) parts of $u_{120}(x)$ with $(s_1, s_2, s_3) = (1, 0, -1)$ (left) and $(s_1, s_2, s_3) = (1, 2, 1/3)$ (right).
We can now define
$$u_n(x) = -\frac{1}{i\pi}\,w^\top\Bigl[\operatorname{diag}\bigl(H'_1(\chi)\bigr)v^{(12)}_1 + \cdots + \operatorname{diag}\bigl(H'_6(\chi)\bigr)v^{(12)}_6\Bigr].$$
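In code, this assembly is a single weighted sum. The sketch below is generic over the quadrature rule and the contour maps; the names are illustrative, the vectors $v^{(12)}_\kappa$ are treated here as samples on the quadrature nodes, and actually producing them requires the linear solve of Section 4:

```python
import numpy as np

def u_n_from_pieces(chi, w, Hprimes, vs):
    """u_n(x) = -(1/(i pi)) * w^T [ diag(H'_1(chi)) v_1 + ... + diag(H'_6(chi)) v_6 ].

    chi, w  : quadrature nodes and weights on [-1, 1]
    Hprimes : callables H'_kappa, derivatives of the contour maps
    vs      : computed vectors v_kappa^{(12)}, as samples on chi
    """
    total = sum(Hp(chi) * v for Hp, v in zip(Hprimes, vs))
    return -(w @ total) / (1j * np.pi)
```

As a sanity check with a single "contour", $H' \equiv 1$ and $v = 1 - \chi^2$ gives $-\frac{1}{i\pi}\int_{-1}^1 (1-t^2)\,dt = \frac{4i}{3\pi}$; any rule exposing nodes and weights can stand in for $(\chi, w)$.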
The right-hand side of Figure 2 demonstrates the convergence of un. In Figure 3 we
plot solutions for two choices of (s1, s2, s3). See Figure 4 for the solution with (s1, s2, s3) =
(1 + i,−2, 1− i).
Computing the derivative
So far, we have defined a unique solution to the homogeneous Painleve II equation by specifying the constants $(s_1, s_2, s_3)$, in analogy with the analytic development in [18, 15]. This is in contrast to what one would normally consider defining a unique solution to a differential equation: initial conditions, say, at $x = 0$. Given the set $(s_1, s_2, s_3)$, we have already seen how we can use our approach to determine $u(x)$. But we can go one step further and determine $u'(x)$ as well. Note that
$$u'(x) = 2\frac{d}{dx}\lim_{z\to\infty} z\,\Phi^{(12)}(x;z) = 2\lim_{z\to\infty} z\,\Phi^{(12)}_x(x;z) = -\frac{1}{i\pi}\int_\Gamma V^{(12)}_x(x;z)\,dz.$$
Differentiating (1.5) we obtain
$$C^+ V_x - \bigl(C^- V_x\bigr)G = \bigl(I + C^- V\bigr)G_x.$$
Now we already know how to compute $C^-V$, hence the right-hand side is known. Furthermore, the left-hand side of the equation is exactly the left-hand side of (1.5), with $V_x$ in place of $V$. Thus we have the exact same linear systems as before, only with a different right-hand side.

Figure 4: For $(s_1, s_2, s_3) = (1 + i, -2, 1 - i)$, a plot of the real part of $u_n$ (left graph) and its derivative (right graph) for $n = 25$ (plain), 50 (dashed) and 100 (thick).

In the first linear system, the new right-hand side is:
$$2is_1\operatorname{diag}(z_1)\operatorname{diag}\bigl(e^{\frac{8i}{3}z_1^3 + 2ixz_1}\bigr)\bigl[C_6 v^{(12)}_2 + C_4 v^{(12)}_4 + C_2 v^{(12)}_6\bigr],$$
$$-2is_2\operatorname{diag}(z_2)\operatorname{diag}\bigl(e^{-\frac{8i}{3}z_2^3 - 2ixz_2}\bigr)\bigl[C_2 v^{(11)}_1 + C_6 v^{(11)}_3 + C_4 v^{(11)}_5\bigr],$$
$$2is_3\operatorname{diag}(z_3)\operatorname{diag}\bigl(e^{\frac{8i}{3}z_3^3 + 2ixz_3}\bigr)\bigl[C_2 v^{(12)}_2 + C_6 v^{(12)}_4 + C_4 v^{(12)}_6\bigr],$$
$$2is_1\operatorname{diag}(z_4)\operatorname{diag}\bigl(e^{-\frac{8i}{3}z_4^3 - 2ixz_4}\bigr)\bigl[C_4 v^{(11)}_1 + C_2 v^{(11)}_3 + C_6 v^{(11)}_5\bigr],$$
$$-2is_2\operatorname{diag}(z_5)\operatorname{diag}\bigl(e^{\frac{8i}{3}z_5^3 + 2ixz_5}\bigr)\bigl[C_4 v^{(12)}_2 + C_2 v^{(12)}_4 + C_6 v^{(12)}_6\bigr],$$
$$2is_3\operatorname{diag}(z_6)\operatorname{diag}\bigl(e^{-\frac{8i}{3}z_6^3 - 2ixz_6}\bigr)\bigl[C_6 v^{(11)}_1 + C_4 v^{(11)}_3 + C_2 v^{(11)}_5\bigr].$$
In short, it is very inexpensive to compute u′(x) whenever u(x) has already been computed
using the Riemann–Hilbert formulation. This allows us to map (s1, s2, s3) to the equivalent
initial conditions u(x), u′(x).
We can use this approach to compare the approximation derived from the Riemann–Hilbert formulation to a standard ODE solver. We determine that the initial conditions for $(s_1, s_2, s_3) = (1 + i, -2, 1 - i)$ are approximately (to about 10 digits of accuracy)
u(0) ≈ −0.7233727039 and u′(0) ≈ 1.019298669.
Consider Figure 4, where we plot approximate solutions for this choice of constants. Note the presence of multiple poles. Unlike an ODE solver, which cannot possibly integrate past a pole, our numerical Riemann–Hilbert approach is only affected by the pole when trying to evaluate close to the pole itself. For values of $x$ bounded away from the first pole (say, $x < 2$), we can compare $u_n$ to Mathematica's adaptive ODE solver NDSolve using the computed initial conditions and extra precision arithmetic. In particular, $u_{80}$ matches this computed solution to about 10 digits, which is as much as can be expected given the limited accuracy of the initial conditions.

Figure 5: The absolute error in approximating the Hastings–McLeod solution (left) and its derivative (right) for different values of $x$ and $n = 40$ (plain), 80 (dotted), 120 (dashed) and 160 (thick).
Another important example is the Hastings–McLeod solution [17], which has the properties $u(x) \sim \mathrm{Ai}(x)$ as $x \to +\infty$ and $u(x) \sim \sqrt{-x/2}$ as $x \to -\infty$. This solution is used in the definition of the Tracy–Widom distribution [28]. Data values for the solution are available online [26], computed by numerically integrating the Painleve II ODE with very high precision arithmetic, using the asymptotics at $+\infty$ as initial conditions [27]. The Hastings–
McLeod solution is particularly difficult to compute by time-stepping the ODE; though the
solution itself is non-oscillatory and free of poles on the real line, small perturbations of
the initial condition will cause either oscillations or poles to form [6]. Other methods are
also effective, such as using the differential equation to solve a boundary value problem
[12, 14]. The Tracy–Widom distribution itself can be computed by discretizing the Fredholm
determinant [6].
Figure 5 plots the absolute error in approximating the Hastings–McLeod solution and
its derivative using our method. Spectral convergence is evident. However, the problem is
badly conditioned for large $|x|$ and relative accuracy is quickly lost. This is true for positive $x$ as well, since the solution itself decays like $\mathrm{Ai}(x) \sim \frac{1}{2\sqrt{\pi}}\,x^{-1/4}e^{-\frac{2}{3}x^{3/2}}$.
In most physical applications, it is precisely the Stokes’ multipliers or the asymptotic
behaviour which are known. However, one would sometimes want to find the direct transformation [10]: given initial conditions $u(0)$ and $u'(0)$, determine $(s_1, s_2, s_3)$. But having a map from $(s_1, s_2, s_3)$ to the initial conditions $u(0)$ and $u'(0)$ means that the inverse map can likely be found using optimization techniques. Indeed, we benefit from the
fact that much of the work in constructing the linear system in Section 4 can be reused
for different choices of (s1, s2, s3). We, however, leave this step as a future problem. Other
approaches for solving the direct monodromy problem might also prove effective, such as a
Figure 6: For $(s_1, s_2, s_3) = (1, 2, 1/3)$, the condition number of the first linear system, on the left for $x = 0$ (plain), 2.5 (dotted), 5 (dashed) and 7.5 (thick) and on the right for $n = 20$ (plain), 40 (dotted), 60 (dashed) and 80 (thick).
recent method for computing the Painleve I Stokes’ multipliers [19].
Remark: As far as I am aware, this is the only known approach for computing the initial conditions associated with the constants $(s_1, s_2, s_3)$. Since the constants $(s_1, s_2, s_3)$ determine the asymptotics of the solution, this is currently the only reliable way of connecting asymptotics with initial conditions in general.
6. Condition number
As $|x|$ increases, the jump matrix $G$ becomes increasingly oscillatory and/or stiff, hence it is to be expected that the rate of convergence deteriorates. However, we also saw in numerical experiments that the number of digits of accuracy that is actually achievable is significantly reduced. We now explain this behaviour by investigating the growth of the condition
number of the linear system. In the left-hand side of Figure 6, we plot the growth of the
condition number as n increases for several choices of x, for (s1, s2, s3) = (1, 2, 1/3). For each
value of $x$, the condition number appears to grow linearly with $n$, which is quite good given that the approximation converges spectrally. Unfortunately, as seen in the right-hand side of Figure 6, increasing $x$ causes exponential growth in the condition number! Thus the condition number quickly reaches the point where not even a single digit of accuracy can be achieved. The asymptotic formulæ for large $x$ [15] ensure that this inaccuracy is not, in general (excluding special cases such as the Hastings–McLeod solution), due to an inherent instability of the map from $(s_1, s_2, s_3)$ to initial conditions.
At first, this problem seems devastating to the approach: an exponentially increasing
condition number makes the linear system unusable even for modest n, and to resolve the
oscillations in the solution for large x would require large n. However, consider for a moment
the following contour representation for solutions to the Airy equation [18]:
$$\Bigl(s_2\int_{\Gamma_2} + s_4\int_{\Gamma_4} + s_6\int_{\Gamma_6}\Bigr)\, e^{-\frac{8i}{3}z^3 - 2ixz}\,dz \qquad\text{for}\qquad s_2 + s_4 + s_6 = 0.$$
The choice $(s_2, s_4, s_6) = \bigl(0, -\frac{1}{\pi}, \frac{1}{\pi}\bigr)$ is equivalent to the contour integral representation of
Ai(x) [22]. This representation suffers from similar numerical issues as our linear system: $e^{-\frac{8i}{3}z^3 - 2ixz}$ grows exponentially large for fixed $z$ in $\Gamma_4$ or $\Gamma_6$ as $x \to -\infty$ (though eventually the super-exponential decrease of $e^{-\frac{8i}{3}z^3}$ wins out to make the integrals finite). But we know Ai(x) is bounded as $x \to -\infty$, hence the oscillations must cancel. In other words, round-off error makes this integral representation useless. In the case of the Airy equation, we know how to resolve this issue: deform the contour through the saddle points at $\pm\frac{i\sqrt{x}}{2}$ so that the contour avoids areas of the complex plane where the integrand exhibits exponential growth. This can be taken one step further so that the contour runs precisely along the path of steepest descent, thus not only avoiding exponential increase, but also oscillations altogether [16]. This has the added benefit that we do not need to increase the number of quadrature points as $|x| \to \infty$.
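This Airy representation is easy to test numerically. The sketch below evaluates $\frac{1}{\pi}\bigl(\int_{\Gamma_6} - \int_{\Gamma_4}\bigr)e^{-\frac{8i}{3}z^3 - 2ixz}\,dz$ by truncating the rays and applying Gauss–Legendre quadrature; the ray angles $7\pi/6$ and $11\pi/6$ for $\Gamma_4$ and $\Gamma_6$, oriented outwards from the origin, are assumptions about the contour conventions:

```python
import numpy as np
from math import gamma, pi

def airy_via_rays(x, R=4.0, m=400):
    """Evaluate (1/pi) (int over Gamma_6 - int over Gamma_4) of
    exp(-8i/3 z^3 - 2 i x z) dz, each ray truncated at |z| = R."""
    t, w = np.polynomial.legendre.leggauss(m)
    r = 0.5 * R * (t + 1.0)          # nodes mapped from [-1, 1] to [0, R]
    wr = 0.5 * R * w
    total = 0.0 + 0.0j
    for s, phi in [(-1.0 / pi, 7 * pi / 6), (1.0 / pi, 11 * pi / 6)]:
        z = r * np.exp(1j * phi)     # points on the ray arg z = phi
        total += s * np.exp(1j * phi) * np.sum(wr * np.exp(-8j / 3 * z**3 - 2j * x * z))
    return total
```

Under these assumptions the value at $x = 0$ agrees with $\mathrm{Ai}(0) = 3^{-2/3}/\Gamma(2/3)$ to high accuracy, since along both rays the integrand reduces to the rapidly decaying $e^{-\frac{8}{3}r^3}$.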
It is now clear how to resolve the conditioning problems for our linear system: deform the
curve Γ so it avoids the sectors of exponential growth by passing through the saddle points
of $G$. Moreover, in analogy with the integral case, we could even deform the contour along the path of steepest descent, thus avoiding oscillations. This path of steepest descent has already been computed for the derivation of the asymptotics of solutions to the homogeneous Painleve II equation [10]. Now to apply our approach, if the steepest descent path is denoted $\Gamma_{SD}$, then we would need to compute $\mathcal{C}_{\Gamma_{SD}}$. This could possibly be achieved by conformally mapping each of the pieces which make up $\Gamma_{SD}$ to the unit interval. However, we do not necessarily need the exact steepest descent curve, as an approximate path will have only minor oscillations. Thus we could alternatively approximate $\Gamma_{SD}$ by a piecewise-linear interpolant.
As we know how to compute C over line segments in the complex plane, we should be able
to successfully apply our numerical Riemann–Hilbert approach. Again, we leave this as a
future problem.
Without using the path of steepest descent, we can still demonstrate this phenomenon by choosing $x$ and $(s_1, s_2, s_3)$ to avoid the exponential increase. When $x$ is real and positive, the only curves which see exponential increase as $x \to \infty$ are $\Gamma_2$ and $\Gamma_5$. Thus, if $s_2 = 0$,
then the exponential increase is avoided. As Figure 7 demonstrates, the condition number
is now well-behaved for positive x.
We omit a plot for the condition number of the Hastings–McLeod solution, which is
similar to the previous plot for positive x, but much worse for negative x: it grows super-
exponentially, reaching about 1013 at x = −9. This is not surprising: if the Stokes’ constants
satisfy s1s3 = 1, then the contour deformation in nonlinear steepest descent is different from
Figure 7: For $(s_1, s_2, s_3) = (1, 0, -1)$, the condition number of the first linear system, on the left for $x = 0$ (plain), 2.5 (dotted), 5 (dashed) and 7.5 (thick) and on the right for $n = 20$ (plain), 40 (dotted), 60 (dashed) and 80 (thick).
the case where $s_1s_3 \neq 1$, as is the asymptotic behaviour as $x$ approaches infinity [15]. Therefore, in the limit as $x$ becomes large, the behaviour of Painleve II has a jump as the Stokes' constants pass over $s_1s_3 = 1$, and the solution itself is unstable. However, if we treat $s_1s_3 = 1$ as a special case, it may still be possible to overcome this issue. This is not completely unlike the Airy function solution to the Airy equation for large positive $x$, where perturbations of $s_2$ will introduce exponential growth. Assuming $s_2 = 0$, this instability is avoided.
7. Closing remarks
We have demonstrated that a Riemann–Hilbert formulation is not just useful as an ana-
lytical tool, but also as a numerical one, by successfully computing solutions to the homoge-
neous Painleve II equation. This could potentially lay the groundwork for the construction
of a toolbox for computing Painleve equations. Then Painleve transcendents would indeed
be the true analogues of linear special functions such as the Airy function: not only useful
for analytical expressions, but efficient for practical computations as well.
In Figure 8 we depict the curves Γ for the first five Painleve Riemann–Hilbert problems,
including the inhomogeneous Painleve II equation. The important characteristic to note
is that they consist of a union of curves which each can be conformally mapped to the
unit interval using Mobius transformations: rays, arcs and line segments. Thus the general
approach of Algorithm 1.3 can already be implemented for these equations. So far the
Painleve III and Painleve IV Riemann–Hilbert problems have successfully been evaluated
using RHPackage [23, 24].
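Each such piece can be sent to the unit interval by a Möbius transformation; below is a minimal sketch for a ray from the origin (this particular normalization, with $H(-1) = 0$ and $H(t) \to \infty$ as $t \to 1$, is an illustrative choice rather than the one fixed by RHPackage):

```python
import numpy as np

def ray_to_interval(phi):
    """Return (H, Hp, Hinv): a Mobius map H(t) = e^{i phi} (1 + t)/(1 - t)
    taking [-1, 1) onto the ray arg z = phi, its derivative, and its inverse."""
    d = np.exp(1j * phi)
    H = lambda t: d * (1 + t) / (1 - t)
    Hp = lambda t: d * 2.0 / (1 - t) ** 2
    Hinv = lambda z: (z - d) / (z + d)
    return H, Hp, Hinv
```

Arcs and line segments admit analogous Möbius or affine maps, so each piece of the curves in Figure 8 reduces to the unit interval in the same way.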
Because of the numerical problems described in Section 6, this approach is currently
not practical for large x, though it is likely that using the path of steepest descent will
rectify this issue (initial numerical experiments confirm this). However, it is practical for
small x, in particular for computing initial conditions. Thus it can already be used to
Figure 8: A depiction of the curves Γ associated with the Painleve I–V equations.
connect asymptotic formulæ, which are known in terms of the constants $(s_1, s_2, s_3)$, to initial conditions. A numerical implementation for computing solutions to the homogeneous Painleve II equation might appear straightforward: use the proposed approach for small $x$, use asymptotic formulæ for large $|x|$ and use an ODE solver to extend these two regimes to moderate $x$. Several issues make such an approach impractical in general: only a few terms of the asymptotic expansion are known, requiring $x$ to be very large; poles on the real line prevent numerical integration; and special solutions such as Hastings–McLeod are extremely sensitive to errors in initial conditions.
Acknowledgments : I wish to thank Folkmar Bornemann for several interesting discussions,
and for helping me to realize the importance and peculiarity of the Hastings–McLeod solu-
tion, as well as the difficulty in extending the asymptotic solution using ODE solvers. I also
thank Peter Clarkson, Toby Driscoll, Arno Kuijlaars, Thanasis Fokas, Nick Trefethen, Andy
Wathen, Andre Weideman and the anonymous referees for their valuable advice.
References
[1] Ablowitz, M.J. and Segur, H., Solitons and the inverse scattering transform, Society for
Industrial Mathematics, 2006.
[2] Abramowitz, M. and Stegun, I., Handbook of Mathematical Functions, National Bureau
of Standards Appl. Math. Series, #55, U.S. Govt. Printing Office, Washington,
D.C., 1970.
[3] Aksenov, S., Savageau, M.A., Jentschura, U.D., Becher, J., Soff, G. and Mohr, P.J.,
Application of the combined nonlinear-condensation transformation to problems in
statistical analysis and theoretical physics, Comp. Phys. Comm. 150 (2003), 1–20.
[4] Bateman, H., Higher Transcendental Functions, McGraw-Hill, New York, 1953.
[5] Berrut, J.-P. and Trefethen, L.N., Barycentric Lagrange interpolation, SIAM Review 46
(2004), 501–517.
[6] Bornemann, F., On the numerical evaluation of Fredholm determinants, Maths Comp
79 (2010), 871–915.
[7] Boyd, J.P., Chebyshev and Fourier spectral methods, Dover Pubns, 2001.
[8] Clenshaw, C. W. and Curtis, A. R., A method for numerical integration on an
automatic computer, Numer. Math. 2 (1960), 197–205.
[9] Deift, P., Orthogonal polynomials and random matrices: a Riemann-Hilbert approach,
American Mathematical Society, 2000.
[10] Deift, P. and Zhou, X., Asymptotics for the Painleve II equation, Communications on
Pure and Applied Mathematics 48 (1995), 277.
[11] Deift, P. and Zhou, X., A steepest descent method for oscillatory Riemann-Hilbert
problems, Bull. AMS 26 (1992), 119–124.
[12] Dieng, M., Distribution Functions for Edge Eigenvalues in Orthogonal and Symplectic
Ensembles: Painleve Representations, Ph.D. Thesis, University of California, Davis, 2005.
[13] Dienstfrey, A., The Numerical Solution of a Riemann-Hilbert Problem Related to
Random Matrices and the Painleve V ODE, Ph.D. Thesis, Courant Institute of
Mathematical Sciences, 1998.
[14] Driscoll, T. A., Bornemann, F. and Trefethen, L. N., The chebop system for automatic
solution of differential equations, BIT 48 (2008), 701-723.
[15] Fokas, A.S., Its, A.R., Kapaev, A.A. and Novokshenov,V.Y., Painleve transcendents:
the Riemann-Hilbert approach, American Mathematical Society, 2006.
[16] Gil, A., Segura, J. and Temme, N.M., Numerical Methods for Special Functions, SIAM,
2007.
[17] Hastings, SP and McLeod, JB, A boundary value problem associated with the second
Painleve transcendent and the Korteweg-de Vries equation, Archive for Rational
Mechanics and Analysis 73 (1980), 31–51.
[18] Its, A.R., The Riemann-Hilbert problem and integrable systems, Notices AMS 50
(2003), 1389–1400.
[19] Masoero, D., A Simple Algorithm for Computing Stokes Multipliers, arXiv preprint
arXiv:1007.1554, 2010.
[20] Muskhelishvili, N.I., Singular Integral Equations, Groningen: Noordhoff (based on the
second Russian edition published in 1946), 1953.
[21] Nasser, M.M.S., Numerical solution of the Riemann-Hilbert problem, Punjab
University Journal of Mathematics 40 (2008), 9–29.
[22] Olver, F.W.J., Asymptotics and Special Functions, Academic Press, New York, 1974.
[23] Olver, S., RHPackage, http://www.comlab.ox.ac.uk/people/Sheehan.Olver/projects/RHPackage.html
[24] Olver, S., A general framework for solving Riemann–Hilbert problems numerically,
preprint, NA-10/05, Maths Institute, Oxford University.
[25] Olver, S., Computing the Hilbert transform and its inverse, Maths Comp, to appear.
[26] Prahofer, M. and Spohn, H., Exact scaling functions for one-dimensional stationary
KPZ growth, http://www-m5.ma.tum.de/KPZ/
[27] Prahofer, M. and Spohn, H., Exact scaling functions for one-dimensional stationary
KPZ growth, J. Stat. Phys. 115 (2004), 255–279.
[28] Tracy, C.A. and Widom, H., Level-spacing distributions and the Airy kernel, Comm.
Math. Phys. 159 (1994), 151–174.
[29] Wegert, E., An iterative method for solving nonlinear Riemann-Hilbert problems, J.
Comp. Appl. Maths 29 (1990), 327.
[30] Wegmann, R., Discrete Riemann-Hilbert problems, interpolation of simply closed
curves, and numerical conformal mapping, J. Comp. Appl. Maths 23 (1988),
323–352.
[31] Wegmann, R., An iterative method for the conformal mapping of doubly connected
regions, J. Comp. Appl. Maths 14 (1986), 79–98.