ambrosio e gangbo hamilton ian ode


  • 8/2/2019 Ambrosio e Gangbo Hamilton Ian ODE


    Hamiltonian ODEs in the Wasserstein space of probability measures

    L. AMBROSIO
    Scuola Normale Superiore di Pisa

    AND

    W. GANGBO
    Georgia Institute of Technology

    Abstract

    In this paper we consider a Hamiltonian $H$ on $\mathcal{P}_2(\mathbb{R}^{2d})$, the set of probability measures with finite quadratic moments on the phase space $\mathbb{R}^{2d} = \mathbb{R}^d \times \mathbb{R}^d$, which is a metric space when endowed with the Wasserstein distance $W_2$. We study the initial value problem $d\mu_t/dt + \nabla\cdot(J_d v_t \mu_t) = 0$, where $J_d$ is the canonical symplectic matrix, $\mu_0$ is prescribed, $v_t$ is a tangent vector to $\mathcal{P}_2(\mathbb{R}^{2d})$ at $\mu_t$, and belongs to $\partial H(\mu_t)$, the subdifferential of $H$ at $\mu_t$. Two methods for constructing solutions of the evolutive system are provided. The first one concerns only the case where $\mu_0$ is absolutely continuous. It ensures that $\mu_t$ remains absolutely continuous and $v_t = \nabla H(\mu_t)$ is the element of minimal norm in $\partial H(\mu_t)$. The second method handles any initial measure $\mu_0$. If we furthermore assume that $H$ is $\lambda$-convex, proper and lower semicontinuous on $\mathcal{P}_2(\mathbb{R}^{2d})$, we prove that the Hamiltonian is preserved along any solution of our evolutive system: $H(\mu_t) = H(\mu_0)$. © 2000 Wiley Periodicals, Inc.

    1 Introduction

    In the last few years there has been a considerable interest in the theory of gradient flows in the Wasserstein space $\mathcal{P}_2(\mathbb{R}^D)$ of probability measures with finite quadratic moments in $\mathbb{R}^D$, starting from the fundamental papers [35], [43], with several applications ranging from rates of convergence to equilibrium to the proof of functional and geometric inequalities. In particular, in [4] (see also [13]), a systematic theory of these gradient flows is built, providing existence and uniqueness results, contraction estimates and error estimates for the implicit Euler scheme.

    In this paper, motivated by a work in progress by Gangbo & Pacini [31], we propose a rigorous theory concerning evolution problems in $\mathcal{P}_2(\mathbb{R}^D)$ of Hamiltonian type. Here typically $D = 2d$ and the measures we are dealing with are defined in the phase space. As shown in Section 8, our study covers a large class of systems which have recently generated a lot of interest, including the Vlasov–Poisson in one space dimension [9] [47], the Vlasov–Monge–Ampère [12] [18] and the semigeostrophic systems [10] [16] [17] [19] [18] [23] [20] [21] [22] [40].

    Communications on Pure and Applied Mathematics, Vol. 000, 0001–0035 (2000). © 2000 Wiley Periodicals, Inc.


    2 LUIGI AMBROSIO, WILFRID GANGBO

    We note that a general theory of Hamiltonian ODEs for non-smooth Hamiltonian $H$, in particular when $H$ is only convex, seems to be completely understood only in finite-dimensional spaces, and even in these spaces the uniqueness question has been settled only in very recent times, see Remark 6.5. In infinite-dimensional Hilbert spaces very little appears to be known at the level of existence of solutions, and nothing is known at the level of uniqueness.

    Besides its comprehensive character, another nice feature of our theory is its ability to handle singular initial data and singular solutions. This class of solutions is natural, for instance, to include solutions (e.g. those generated by classical non-kinetic solutions) with one or finitely many velocities, see [47] for a first result in this direction. At the same time, there is the possibility to handle discrete and continuous models with the same formalism, and to show stability results (the first one in this direction, for two specific models, is [18]).

    We recall that $\mathcal{P}_2(\mathbb{R}^D)$ is canonically endowed with the Wasserstein distance $W_2$, defined as follows:

    (1.1)  $W_2^2(\mu,\nu) := \min\left\{ \int_{\mathbb{R}^D\times\mathbb{R}^D} |x-y|^2\, d\gamma(x,y) \;:\; \gamma \in \Gamma(\mu,\nu) \right\}.$

    Here $\Gamma(\mu,\nu)$ is the set of Borel probability measures on $\mathbb{R}^D \times \mathbb{R}^D$ which have $\mu$ and $\nu$ as their marginals. The Riemannian structure of $\mathcal{P}_2(\mathbb{R}^D)$, introduced at a formal level in [43] and later fully developed in [4], will be intensively exploited in this work. Notice that, as soon as $\mathcal{P}_2(\mathbb{R}^D)$ is endowed with a differentiable structure, the theory of ODEs in the finite-dimensional space $\mathbb{R}^D$ naturally extends to a theory of ODEs in the infinite-dimensional space $\mathcal{P}_2(\mathbb{R}^D)$: it suffices to consider the isometry $I : z \mapsto \delta_z$, where $\delta_z$ stands for the Dirac mass at $z$.
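As a small illustration of the definition of $W_2$ and of the isometry $z \mapsto \delta_z$, here is a minimal Python sketch. It is not part of the paper: the function names are hypothetical, and the first function covers only the special case of two empirical measures on the real line with the same number of equally weighted atoms, where the sorted (monotone) pairing is known to be an optimal plan for the quadratic cost.

```python
import math

def w2_empirical_1d(xs, ys):
    """W_2 between two empirical measures on R with the same number of
    equally weighted atoms. In one dimension the monotone (sorted)
    pairing is an optimal plan for the quadratic cost."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(xs))

def w2_diracs(z, w):
    """W_2 between the Dirac masses delta_z and delta_w in R^D: the only
    transport plan is delta_(z,w), so the distance is |z - w|."""
    return math.dist(z, w)

# Sorted pairing: (0, 0.5), (1, 2), (3, 5); mean squared gap = 5.25 / 3.
w2_squared = w2_empirical_1d([0.0, 1.0, 3.0], [2.0, 0.5, 5.0]) ** 2
```

The second function is the isometry property in code form: the Wasserstein distance between two Dirac masses is the Euclidean distance between their supports.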

    In particular, we consider the case when $D = 2d$ and we are given a lower semicontinuous Hamiltonian $H : \mathcal{P}_2(\mathbb{R}^{2d}) \to \mathbb{R}$. As we will be mostly considering semiconvex Hamiltonians, in the sense of displacement convexity [38], mimicking some classical concepts of convex analysis we introduce in Definition 3.2 the subdifferential $\partial H(\mu)$ and denote by $\nabla H(\mu)$ its element with minimal $L^2(\mu; \mathbb{R}^{2d})$ norm (well defined whenever $\partial H(\mu) \neq \emptyset$).

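    The problem we study in Section 6 is: given an initial measure $\bar\mu \in \mathcal{P}_2(\mathbb{R}^{2d})$, find a path $t \mapsto \mu_t \in \mathcal{P}_2(\mathbb{R}^{2d})$ such that

    (1.2)  $\dfrac{d}{dt}\mu_t + \nabla\cdot\big(J\nabla H(\mu_t)\,\mu_t\big) = 0, \quad t \in (0,T), \qquad \mu_0 = \bar\mu,$

    and $\|\nabla H(\mu_t)\|_{L^2(\mu_t)} \in L^1(0,T)$. Here, $J$ is a $(2d)\times(2d)$ symplectic matrix.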

    Using a suitable chain rule in the Wasserstein space first introduced in [4], we prove in Theorem 5.2 that $H$ is constant along all solutions $\mu_t$ of (1.2), provided $H$ is $\lambda$-convex (or $\lambda$-concave) for some real number $\lambda$. The proof of this fact requires neither regularity assumptions on the velocity field $J\nabla H(\mu_t)$ nor the absolute continuity of $\mu_t$.


    HAMILTONIAN ODES IN THE SPACE OF PROBABILITY MEASURES 3

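    Existence of solutions can be established in (1.2) if one imposes a growth condition on the gradient, as

    (H1) the existence of constants $C_o \in (0,+\infty)$, $R_o \in (0,+\infty]$ such that for all $\mu \in \mathcal{P}_2^a(\mathbb{R}^{2d})$ with $W_2(\mu,\bar\mu) < R_o$ we have $\mu \in D(H)$, $\partial H(\mu) \neq \emptyset$ and $|\nabla H(\mu)(z)| \le C_o(1+|z|)$ for $\mu$-almost every $z \in \mathbb{R}^{2d}$

    and a continuity property of the gradient as

    (H2) If $\mu = \rho\mathcal{L}^{2d}$, $\mu_n = \rho_n\mathcal{L}^{2d} \in \mathcal{P}_2^a(\mathbb{R}^{2d})$, $\sup_n W_2(\mu_n,\bar\mu) < R_o$ and $\mu_n \to \mu$ narrowly, then there exist a subsequence $n(k)$ and functions $w_k, w : \mathbb{R}^{2d} \to \mathbb{R}^{2d}$ such that $w_k = \nabla H(\mu_{n(k)})$ $\mu_{n(k)}$-a.e., $w = \nabla H(\mu)$ $\mu$-a.e. and $w_k \to w$ $\mathcal{L}^{2d}$-a.e. in $\mathbb{R}^{2d}$ as $k \to +\infty$.

    Here we are denoting by $\mathcal{P}_2^a(\mathbb{R}^{2d})$ the elements of $\mathcal{P}_2(\mathbb{R}^{2d})$ that are absolutely continuous with respect to $\mathcal{L}^{2d}$. The requirements of bounds and continuity on the gradient naturally appear also in the finite dimensional theory, in order to obtain bounds on the discrete solutions of the ODE and to pass to the limit.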

    In Theorem 6.6 we show that a minor variant of the algorithms used in [10], [12], [17] in connection with specific models, establishes existence of a solution $\mu_t$ in (1.2) up to some time $T = T(C_o,R_o)$ ($T = +\infty$ whenever $R_o = +\infty$), when $\mu_0 = \rho_0\mathcal{L}^{2d}$ is absolutely continuous with respect to $\mathcal{L}^{2d}$ and (H1) and (H2) hold. A good feature of this algorithm is that it preserves the absolute continuity condition, so that $\mu_t = \rho_t\mathcal{L}^{2d}$, and provides the "entropy" inequalities

    $\int_{\mathbb{R}^{2d}} S(\rho_t)\,dz \le \int_{\mathbb{R}^{2d}} S(\rho_0)\,dz \quad \forall t \in [0,T]$, with $S$ convex.

    Unlike the theory of gradient flows, where the selection of the gradient among all subdifferentials is ensured on any solution by energy reasons (see [4]), in our case it is not clear why in general this selection should be the natural one, even though it provides the tangency condition and it is more likely to provide bounds, by the minimality of the gradient. Therefore, we consider also a weaker version of (1.2), which works for arbitrary initial measures $\bar\mu$: find a path $t \mapsto \mu_t \in \mathcal{P}_2(\mathbb{R}^{2d})$ and vector fields $v_t \in L^2(\mu_t; \mathbb{R}^{2d})$ such that

    (1.3)  $\dfrac{d}{dt}\mu_t + \nabla\cdot(Jv_t\mu_t) = 0, \quad \mu_0 = \bar\mu, \quad t \in (0,T); \qquad v_t \in T_{\mu_t}\mathcal{P}_2(\mathbb{R}^{2d}) \cap \partial H(\mu_t)$ for a.e. $t$.

    Here $T_\mu\mathcal{P}_2(\mathbb{R}^{2d})$ is the tangent space to $\mathcal{P}_2(\mathbb{R}^{2d})$ at $\mu$, according to Otto's calculus [4], defined as the $L^2(\mu; \mathbb{R}^{2d})$ closure of the gradients of $C_c^\infty(\mathbb{R}^{2d})$ maps. Even in this case we are able to show that $H$ is constant along solutions of (1.3), provided $H$ is $\lambda$-convex (or $\lambda$-concave) for some $\lambda \in \mathbb{R}$.

    For the system in (1.3), we weaken (H1) and (H2) and only assume that

    (H1') the existence of constants $C_o \in [0,+\infty)$, $R_o \in (0,+\infty]$ such that for all $\mu \in \mathcal{P}_2(\mathbb{R}^{2d})$ with $W_2(\mu,\bar\mu)$


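    and

    (H2') If $\sup_n W_2(\mu_n,\bar\mu) < R_o$ and $\mu_n \to \mu$ narrowly, then the limit points of convex combinations of $\{\nabla H(\mu_n)\mu_n\}_{n=1}^\infty$ for the weak$^*$-topology are representable as $w\mu$ for some $w \in \partial H(\mu) \cap T_\mu\mathcal{P}_2(\mathbb{R}^{2d})$.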

    In Section 7 a second algorithm, based on linear interpolation of transport maps, provides existence of solutions to (1.3). We refer to Theorem 7.4 for a complete statement of the results we obtain. In particular, when $\bar\mu = \delta_{(\bar x,\bar v)}$, defining $h$ on $\mathbb{R}^{2d}$ by $h(x,v) = H(\delta_{(x,v)})$, the algorithm used in this section coincides with a natural finite-dimensional algorithm yielding in the limit the volume-preserving flow associated to the ODE (see Remark 6.5 for a more precise discussion):

    (1.4)  $J_d(\dot x(t), \dot v(t)) \in \partial h(x(t),v(t)), \quad t \in (0,T); \qquad (x(0),v(0)) = (\bar x,\bar v).$

    Note that proving existence of (1.3) is harder, compared to proving existence for the simplified system

    (1.5)  $\dfrac{d}{dt}\mu_t + \nabla\cdot(Jv_t\mu_t) = 0, \quad \mu_0 = \bar\mu, \quad t \in (0,T); \qquad v_t \in \partial H(\mu_t)$ for a.e. $t$,

    where we drop the constraint that $v_t \in T_{\mu_t}\mathcal{P}_2(\mathbb{R}^{2d})$, and so $v_t$ may be not tangent to $\mathcal{P}_2(\mathbb{R}^{2d})$. The system in (1.5) does not make geometrical sense, except in special cases such as when $\mu_t$ is concentrated on finitely many points (in this case $L^2(\mu_t; \mathbb{R}^{2d}) = T_{\mu_t}\mathcal{P}_2(\mathbb{R}^{2d})$). On the technical side, the lack of the tangency condition seems to prevent the possibility of proving constancy of the Hamiltonian along solutions of (1.5).

    Finally, we add more motivations for the terminology "Hamiltonian" adopted for the systems (1.2) and (1.3) (particularly when $J$ is the canonical symplectic matrix). A first justification is given in [31], where $J_d\nabla H(\mu)$ is shown to be the "symplectic gradient" induced by a suitable skew-symmetric 2-form (see the more detailed discussion made right after Definition 5.1). Moreover, in the recent work [18] the authors consider Hamiltonians on $\mathbb{R}^{2nd}$ of the form

    $(x_1,v_1;\cdots;x_n,v_n) \mapsto H_n(x_1,v_1;\cdots;x_n,v_n) = -\dfrac{1}{2}\,W_2^2\Big(\dfrac{1}{n}\sum_{i=1}^n \delta_{(x_i,v_i)},\ \dfrac{1}{n}\sum_{i=1}^n \delta_{(a_i^n,b_i^n)}\Big),$

    where $(a_1^n,b_1^n), \cdots, (a_n^n,b_n^n) \in \mathbb{R}^{2d}$ are prescribed. They study the classical finite-dimensional Hamiltonian systems

    (1.6)
    $\dot x_i^n(t) = n\nabla_{v_i}H_n(x_1^n(t),v_1^n(t);\cdots;x_n^n(t),v_n^n(t)), \quad t \in (0,T)$
    $\dot v_i^n(t) = -n\nabla_{x_i}H_n(x_1^n(t),v_1^n(t);\cdots;x_n^n(t),v_n^n(t)), \quad t \in (0,T)$
    $(x_i^n(0),v_i^n(0))$ prescribed, $i = 1,\cdots,n.$


    Defining

    $\mu_t^n = \dfrac{1}{n}\sum_{i=1}^n \delta_{(x_i^n(t),v_i^n(t))},$

    it is readily checked that the paths $t \mapsto \mu_t^n \in \mathcal{P}_2(\mathbb{R}^{2d})$ satisfy (1.3) with $H_n$ in place of $H$. In [18], it is proven that if the initial conditions $(x_i^n(0),v_i^n(0))$ are suitably chosen and $\nu^n = \frac{1}{n}\sum_{i=1}^n \delta_{(a_i^n,b_i^n)}$ tends to $\nu$ as $n$ tends to $+\infty$, then up to a subsequence which is independent of the time variable $t$, the measures $\{\mu_t^n\}_{n=1}^\infty$ narrowly converge as $n \to +\infty$ to measures $\{\mu_t\}_{t\in[0,T]}$ satisfying (1.2) for the Hamiltonian $H(\mu) = -\frac{1}{2}W_2^2(\mu,\nu)$.

    Acknowledgment. It is a pleasure to express our gratitude to Y. Brenier for the many interesting and instructive discussions we had. Criticisms were also provided by T. Nguyen.

    2 Basic notation and terminology

    In this section we fix our basic notation and terminology on measure theory and

    Hamiltonian systems.

    - The effective domain of a function $H : A \to (-\infty,+\infty]$ is the set $D(H)$ of all $a \in A$ such that $H(a) < +\infty$. We say that $H$ is proper if $D(H) \neq \emptyset$.

    - Let $d$, $D$ be integers. We denote by $I_D$ the identity matrix on $\mathbb{R}^D$ and we denote by $J_d$ the symplectic $(2d)\times(2d)$ matrix

    $J_d = \begin{pmatrix} 0 & I_d \\ -I_d & 0 \end{pmatrix}.$

    When $d = 1$, this is the clockwise rotation of angle $\pi/2$. We denote by id the identity map on $\mathbb{R}^D$ or $\mathbb{R}^{2d}$.

    - If $r > 0$ and $z \in \mathbb{R}^D$, $B_r(z)$ denotes the ball in $\mathbb{R}^D$ of center $z$ and radius $r$. If $B \subset \mathbb{R}^D$ we denote by $B^c$ the complement of $B$.

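    - Assume that $\mu$ is a nonnegative Borel measure on a topological space $X$ and that $\nu$ is a nonnegative Borel measure on a topological space $Y$. We say that a Borel map $t : X \to Y$ transports $\mu$ onto $\nu$, and we write $t_\#\mu = \nu$, if $\nu[B] = \mu[t^{-1}(B)]$ for all Borel sets $B \subset Y$. We sometimes say that $t$ pushes $\mu$ forward to $\nu$. We denote by $\mathcal{T}(\mu,\nu)$ the set of all $t$ such that $t_\#\mu = \nu$.

    If $\gamma$ is a nonnegative Borel measure on $X \times Y$ then its projection $\mathrm{proj}_X\gamma$ is a nonnegative Borel measure on $X$ and its projection $\mathrm{proj}_Y\gamma$ is a nonnegative Borel measure on $Y$; they are defined by

    $\mathrm{proj}_X\gamma[A] = \gamma[A \times Y], \qquad \mathrm{proj}_Y\gamma[B] = \gamma[X \times B].$

    A measure $\gamma$ on $X \times Y$ is said to have $\mu$ and $\nu$ as its marginals if $\mu = \mathrm{proj}_X\gamma$ and $\nu = \mathrm{proj}_Y\gamma$. We write that $\gamma \in \Gamma(\mu,\nu)$ and call $\gamma$ a transport plan between $\mu$ and $\nu$.

    - When $X = Y = M$, any minimizer $\gamma_o$ in (1.1) is called an optimal transport plan between $\mu$ and $\nu$. We write $\gamma_o \in \Gamma_o(\mu,\nu)$.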


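    - We denote by $\mathcal{P}(\mathbb{R}^D)$ the set of Borel probability measures on $\mathbb{R}^D$. The $D$-dimensional Lebesgue measure on $\mathbb{R}^D$ is denoted by $\mathcal{L}^D$. The 2-moment of $\mu \in \mathcal{P}(\mathbb{R}^D)$ with respect to the origin is defined by

    $M_2(\mu) = \int_{\mathbb{R}^D} |x|^2\,d\mu(x).$

    Notice that $W_2^2(\mu,\delta_0) = M_2(\mu)$. We will be dealing in particular with

    $\mathcal{P}_2(\mathbb{R}^D) := \{\mu \in \mathcal{P}(\mathbb{R}^D) : M_2(\mu) < +\infty\}$

    and its subspace $\mathcal{P}_2^a(\mathbb{R}^D)$, made of absolutely continuous measures with respect to $\mathcal{L}^D$.

    - If $\mu \in \mathcal{P}_2(\mathbb{R}^D)$ and $v_1,\ldots,v_k \in L^2(\mathbb{R}^D,\mu)$, we write $v = (v_1,\ldots,v_k) \in L^2(\mathbb{R}^D,\mu;\mathbb{R}^k)$ or simply $v \in L^2(\mu;\mathbb{R}^k)$.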

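    - Assume that $\mu$, $\nu$ are Borel probability measures on $M = \mathbb{R}^D$ with $M_2(\mu), M_2(\nu)$

    $0$ $\mathcal{L}^D$-a.e. on $B_r$ for any $r > 0$. If $C > 0$, $v \in T_\mu\mathcal{P}_2(\mathbb{R}^D)$ and

    (6.1)  $|v(z)| \le C(1+|z|)$ for $\mu$-almost every $z \in \mathbb{R}^D$

    then there exists a sequence $\{\varphi_n\}_{n=1}^\infty \subset C_c^\infty(\mathbb{R}^D)$ such that

    $|\nabla\varphi_n(z)| \le C(2+|z|) \quad \forall z \in \mathbb{R}^D$

    and

    $\lim_{n\to+\infty} \|v - \nabla\varphi_n\|_{L^2(\mu;\mathbb{R}^D)} = 0.$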

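    Proof. Let $\{\varphi_n\}_{n=1}^\infty \subset C_c^\infty(\mathbb{R}^D)$ be such that $\|v - \nabla\varphi_n\|_{L^2(\mu)} \to 0$ as $n \to +\infty$. For all $r > 0$ we have

    $\limsup_{n\to+\infty} \|v - \nabla\varphi_n\|^2_{L^2(B_r,\mathcal{L}^D;\mathbb{R}^D)} \le \dfrac{1}{m_r}\limsup_{n\to+\infty} \|v - \nabla\varphi_n\|^2_{L^2(\mu)} = 0.$

    This proves that $v \in L^2_{loc}(\mathbb{R}^{2d},\mathcal{L}^{2d})$ and that $\mathrm{curl}\, v = 0$. Let $l_1 \in C_c^\infty$ be a nonnegative probability density whose support is contained in the unit ball of $\mathbb{R}^{2d}$ and set

    $v_h = l_h * v, \quad$ with $\quad l_h(z) = \dfrac{1}{h^{2d}}\, l_1\Big(\dfrac{z}{h}\Big).$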

    1 Even though the test function $(x,y) \mapsto \langle v_t(x); y\rangle$ is possibly discontinuous and unbounded, one can use the boundedness of 2-moments of h and the fact that their first marginal does not depend on h to pass to the limit, see for instance §5.1.1 in [4].


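    Clearly, $v_h \in C^\infty(\mathbb{R}^{2d},\mathbb{R}^{2d})$ and $\mathrm{curl}\, v_h = 0$. Hence, there exist $A_h \in C^\infty(\mathbb{R}^{2d})$ such that $v_h = \nabla A_h$ and $A_h(0) = 0$. Thanks to Jensen's inequality, (6.1) implies that

    (6.2)
    $|v_h(z)| = \Big|\int_{\mathbb{R}^{2d}} l_h(w)\,v(z-w)\,dw\Big| \le C\int_{\mathbb{R}^{2d}} l_h(w)(1+|z-w|)\,dw$
    $\qquad \le C(1+|z|) + C\int_{\mathbb{R}^{2d}} l_h(w)|w|\,dw$
    $\qquad = C(1+|z|) + hC\int_{\mathbb{R}^{2d}} l_1(w')|w'|\,dw'$
    $\qquad \le C(1+|z|) + hC\int_{B_1(0)} l_1(w')\,dw'$
    $\qquad \le C(2+|z|),$

    for $h \le 1$. Since $\{v_h\}_{h>0}$ converges $\mathcal{L}^{2d}$-almost everywhere to $v$, the uniform bound in (6.2) and the fact that $\mu \in \mathcal{P}_2(\mathbb{R}^D)$ imply, by the dominated convergence theorem,

    (6.3)  $\lim_{h\to 0} \|v - \nabla A_h\|^2_{L^2(\mu;\mathbb{R}^D)} = 0.$

    Define

    (6.4)  $B_h^r(z) = A_h(z)$ for $|z| \le r$, $\qquad B_h^r(z) = 0$ for $|z| \ge 2r$.

    Note that $B_h^r$ is a $C(2+r)$-Lipschitz function and so it admits an extension to $\mathbb{R}^D$, that we still denote by $B_h^r$, which is $C(2+r)$-Lipschitz. We use (6.1), (6.2) and the fact that

    (6.5)  $|\nabla B_h^r(z)| \le C(2+r) \le C(2+|z|)$ on $B_r^c(0)$

    to conclude that for all $h \le 1$

    (6.6)
    $\int_{\mathbb{R}^{2d}} |v - \nabla B_h^r|^2\,d\mu = \int_{B_r(0)} |v - \nabla A_h|^2\,d\mu + \int_{B_r^c(0)} |v - \nabla B_h^r|^2\,d\mu$
    $\qquad \le \int_{\mathbb{R}^{2d}} |v - \nabla A_h|^2\,d\mu + 4C^2\int_{B_r^c(0)} (2+|z|)^2\,d\mu.$

    We combine (6.3) and (6.6) to conclude that

    (6.7)  $\lim_{h,\,1/r \to 0} \|v - \nabla B_h^r\|^2_{L^2(\mu;\mathbb{R}^D)} = 0.$

    This, together with (6.2) and (6.5) yields the lemma. QED.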

    The following lemma provides a discrete solution of the Hamiltonian ODE in

    a small time interval, whose iteration will lead to a discrete solution. To make

    the iteration possible, one has to show that the flow preserves in some sense the

    bounds on the initial datum: this is possible thanks to the fact that the flow is

    incompressible.


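    Lemma 6.2. Let $h > 0$, let $\mu = \rho\mathcal{L}^D \in \mathcal{P}_2^a(\mathbb{R}^D)$ be satisfying

    (6.8)  $\rho \ge m_r > 0$ $\mathcal{L}^D$-a.e. on $B_r$, for any $r > 0$

    and let $v \in T_\mu\mathcal{P}_2(\mathbb{R}^D)$ be satisfying (6.1), with $e^{Ch} \le 2$. Then there exists a family of measures $\mu_t = \rho_t\mathcal{L}^D$, $t \in [0,h]$, satisfying

    (a) $\int_{\mathbb{R}^D} S(\rho_t)\,dz \le \int_{\mathbb{R}^D} S(\rho)\,dz$ for any convex function $S : [0,+\infty) \to [0,+\infty)$;

    (b) $t \mapsto \mu_t \in \mathcal{P}_2(\mathbb{R}^D)$ is absolutely continuous, $\mu_0 = \mu$ and the continuity equation

    (6.9)  $\dfrac{d}{dt}\mu_t + \nabla\cdot(Jv\mu_t) = 0, \quad (t,z) \in (0,h)\times\mathbb{R}^D$

    holds;

    (c) $\rho_t \ge m_{r'}$ $\mathcal{L}^D$-a.e. on $B_r$, with $r' = e^{Ch}r + 2(e^{Ch}-1)$.

    Finally, we have also that $t \mapsto \mu_t$ is Lipschitz continuous, with Lipschitz constant less than $L_o = C\sqrt{24(1+M_2(\mu))}$ and, in particular,

    (6.10)  $W_2(\mu_t,\mu) \le hL_o \quad \forall t \in [0,h].$

    Remark 6.3. Assumption (6.8) is used twice. First, it is used to conclude that since $v$ is defined $\mu$-almost everywhere, then it is defined $\mathcal{L}^D$-almost everywhere, hence $\mu_t$-almost everywhere, if $\mu_t \ll \mathcal{L}^D$. More importantly, it is used to apply Lemma 6.1, to treat $v$ as a gradient and to obtain that $Jv$ is divergence free with respect to $\mathcal{L}^D$. This leads to the conclusion that the flow $\Phi(t,\cdot)$ associated to $Jv$ preserves $\mathcal{L}^D$ for each $t$ fixed.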

    Proof of Lemma 6.2. We assume first that $v = \nabla\varphi \in C_c^\infty(\mathbb{R}^D;\mathbb{R}^D)$ and that the weaker condition $|v(z)| \le C(2+|z|)$ is fulfilled. Under this assumption the autonomous vector field $Jv$ is smooth and divergence-free, so the flow $\Phi : [0,h]\times\mathbb{R}^D \to \mathbb{R}^D$ associated to $Jv$ is smooth and measure-preserving. In this case we simply define $\mu_t = \Phi(t,\cdot)_\#\mu$, so that the continuity equation (6.9) is satisfied. The measure preserving property gives that $\mu_t = \rho_t\mathcal{L}^D$, with

    (6.11)  $\rho_t \circ \Phi(t,\cdot) = \rho.$

    Notice that (a) (with an equality, and even for nonconvex $S$) follows immediately by (6.11), and (c) as well, provided we show that $\Phi(t,\cdot)^{-1}(B_r) \subset B_{r'}$. To show the latter inclusion, notice that $\Psi(t,y) = \Phi(t,\cdot)^{-1}(y)$ is the flow associated to $-Jv$, hence

    $\dfrac{d}{dt}|\Psi(t,y)| \le |Jv|(\Psi(t,y)) \le C(2+|\Psi(t,y)|).$

    By integrating this differential inequality we immediately obtain that

    $2 + |\Psi(t,y)| \le e^{Ct}(2+|y|).$

    Hence, $|y| < r$ implies $|\Psi(t,y)| < r'$ for $t \in [0,h]$. An analogous argument gives $2 + |\Phi(t,z)| \le e^{Ct}(2+|z|)$, hence when $e^{Ch} < 2$ we obtain

    $|\Phi(t,z)| \le 2(|z|+1).$
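The integration of the differential inequality can be spelled out as follows. This is a sketch of the standard Gronwall argument, not taken from the paper: $\Psi$ denotes the inverse flow $\Phi(t,\cdot)^{-1}$, and the auxiliary function $g$ is notation introduced here for illustration. It also shows where the radius $r' = e^{Ch}r + 2(e^{Ch}-1)$ in (c) comes from.

```latex
\begin{aligned}
g(t) &:= 2 + |\Psi(t,y)|, \qquad
g'(t) \le |Jv(\Psi(t,y))| \le C\,g(t) \quad \text{for a.e. } t,\\[2pt]
\frac{d}{dt}\bigl(e^{-Ct} g(t)\bigr) &= e^{-Ct}\bigl(g'(t) - C\,g(t)\bigr) \le 0
\;\Longrightarrow\; g(t) \le e^{Ct} g(0) = e^{Ct}\,(2 + |y|),\\[2pt]
|y| < r,\ t \in [0,h] \;&\Longrightarrow\;
|\Psi(t,y)| < e^{Ch}(2 + r) - 2 = e^{Ch} r + 2\,(e^{Ch} - 1) = r'.
\end{aligned}
```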


    Using this inequality we can estimate

    $\int_{\mathbb{R}^D} |Jv|^2\,d\mu_t \le 2C^2\int_{\mathbb{R}^D} (4+|y|^2)\,d\mu_t = 8C^2 + 2C^2\int_{\mathbb{R}^D} |\Phi(t,z)|^2\,d\mu$
    $\qquad \le 8C^2 + 16C^2\int_{\mathbb{R}^D} (1+|z|^2)\,d\mu = 24C^2 + 16C^2M_2(\mu) \le L_o^2.$

    Using this estimate in conjunction with (3.2) and (6.9) yields that $t \mapsto \mu_t$ is $L_o$-Lipschitz.

    In the general case we consider a sequence $v_n = \nabla\varphi_n$ with all properties stated in Lemma 6.1. As $\rho > 0$ $\mathcal{L}^D$-a.e., we can also assume with no loss of generality that $v_n \to v$ $\mathcal{L}^D$-a.e. in $\mathbb{R}^{2d}$. Let $\mu_t^n$ be the measures built according to the previous construction relative to $v_n$ and notice that $t \mapsto \mu_t^n$ are equi-bounded in $\mathcal{P}_2(\mathbb{R}^D)$, and $L_o$-Lipschitz continuous. Furthermore, $\mu_t^n = \rho_t^n\mathcal{L}^D$ with $\rho_t^n$ locally uniformly bounded from below. Hence, we may assume with no loss of generality that $\mu_t^n \to \mu_t$ narrowly for any $t \in [0,h]$.

    By the lower semicontinuity of moments we get $\mu_t \in \mathcal{P}_2(\mathbb{R}^D)$ for any $t$, and the lower semicontinuity of Wasserstein distance (see for instance Proposition 7.1.3 in [4]) gives that the Lipschitz bound and the distance bound (6.10) are preserved in the limit. Also the inequality $\int S(\rho_t^n)\,dz \le \int S(\rho)\,dz$ with $S$ convex and the local lower bound in (c) are easily seen to be stable under weak convergence, and imply (choosing $S = \bar S$ convex, growing faster than linearly at infinity, such that $\int \bar S(\rho)\,dz < +\infty$) that $\mu_t = \rho_t\mathcal{L}^D \in \mathcal{P}_2^a(\mathbb{R}^D)$ with $\rho_t \ge m_{r'}$ $\mathcal{L}^D$-a.e. on $B_r$ for any $r > 0$.

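    It remains to show the validity of the continuity equation in (b). To this aim, it suffices to show that, for $t$ fixed, $Jv_n\rho_t^n$ converge in the sense of distributions to $Jv\rho_t$. As $\bar S$ grows faster than linearly at infinity, we obtain from the inequality $\int \bar S(\rho_t^n)\,dz \le \int \bar S(\rho)\,dz$ that $\rho_t^n$ is equi-integrable (see for instance Proposition 1.27 of [3]). Hence for any $\varepsilon > 0$ we can find $\delta > 0$ such that

    $\mathcal{L}^D(B) < \delta \;\Longrightarrow\; \int_B \rho_t\,dz + \sup_n \int_B \rho_t^n\,dz < \varepsilon.$

    We fix $r > 0$ and choose as $B \subset B_r$ an open set given by Egorov theorem, so that $v_n \to v$ uniformly on $B_r \setminus B$; let also $\tilde v : \mathbb{R}^{2d} \to \mathbb{R}^{2d}$ be a continuous function coinciding with $v$ on $B_r \setminus B$, with $|\tilde v| \le C(2+r)$. For any $\psi \in C_c^\infty(B_r)$ we have then

    $\int_{\mathbb{R}^D} \psi Jv_n\rho_t^n\,dz = \int_{\mathbb{R}^D} \psi(Jv_n - J\tilde v)\rho_t^n\,dz + \int_{\mathbb{R}^D} \psi Jv\rho_t\,dz + \int_{\mathbb{R}^D} \psi(J\tilde v - Jv)\rho_t\,dz + \int_{\mathbb{R}^D} \psi J\tilde v(\rho_t^n - \rho_t)\,dz,$

    so that

    $\limsup_{n\to+\infty} \Big|\int_{\mathbb{R}^D} \psi Jv_n\rho_t^n\,dz - \int_{\mathbb{R}^D} \psi Jv\rho_t\,dz\Big| \le 2C\varepsilon\,\sup|\psi|\,(2+r).$

    As $\varepsilon$ is arbitrary, this proves the weak convergence. QED.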


    Remark 6.4 (Stability of upper bounds). By the same argument one can show that if $\rho \le M_r$ $\mathcal{L}^D$-a.e. on $B_r$ for any $r > 0$, then $\rho_t \le M_{r'}$ $\mathcal{L}^D$-a.e. on $B_r$ with $r' = e^{Ch}r + 2(e^{Ch}-1)$.

    The main result of this section is concerned with Hamiltonians H satisfying the

    following properties:

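    (H1) There exist constants $C_o \in (0,+\infty)$, $R_o \in (0,+\infty]$ such that for all $\mu \in \mathcal{P}_2^a(\mathbb{R}^D)$ with $W_2(\mu,\bar\mu) < R_o$ we have $\mu \in D(H)$, $\partial H(\mu) \neq \emptyset$ and $w = \nabla H(\mu)$ satisfies $|w(z)| \le C_o(1+|z|)$ for $\mu$-almost every $z \in \mathbb{R}^D$.

    (H2) If $\mu = \rho\mathcal{L}^D$, $\mu_n = \rho_n\mathcal{L}^D \in \mathcal{P}_2^a(\mathbb{R}^D)$, $\sup_n W_2(\mu_n,\bar\mu) < R_o$ and $\mu_n \to \mu$ narrowly, then there exist a subsequence $n(k)$ and functions $w_k, w : \mathbb{R}^D \to \mathbb{R}^D$ such that $w_k = \nabla H(\mu_{n(k)})$ $\mu_{n(k)}$-a.e., $w = \nabla H(\mu)$ $\mu$-a.e. and $w_k \to w$ $\mathcal{L}^D$-a.e. in $\mathbb{R}^D$ as $k \to +\infty$.

    To ensure the constancy of $H$ along the solutions of the Hamiltonian system we consider also:

    (H3) $H : \mathcal{P}_2(\mathbb{R}^D) \to (-\infty,+\infty]$ is proper, lower semicontinuous and $\lambda$-convex for some $\lambda \in \mathbb{R}$.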

    Recalling that $\mathcal{P}_2^a(\mathbb{R}^D)$ is dense in $\mathcal{P}_2(\mathbb{R}^D)$ it would be not difficult to show, by the same argument used at the beginning of the proof of Theorem 5.2, that (H3) and (H1) imply that $H$ is Lipschitz continuous on the ball $\{\mu \in \mathcal{P}_2(\mathbb{R}^D) : W_2(\mu,\bar\mu) \le R_o\}$. Assumption (H2), instead, is a kind of "$C^1$-regularity" assumption on $H$. Thinking of the finite-dimensional theory (for instance of Peano's existence theorems for ODEs with a continuous velocity field) some assumption of this type seems to be necessary in order to get existence. In the following remark we discuss, instead, existence in the "flat" infinite-dimensional case and uniqueness in the finite-dimensional case.

    Remark 6.5. Assume that we are given a convex (or $\lambda$-convex for some $\lambda \in \mathbb{R}$) Lipschitz function $H : \mathbb{R}^{2d} \to \mathbb{R}$. Then, $\partial H(x)$ is not empty for all $x \in \mathbb{R}^{2d}$ and we may define solutions of the Hamiltonian ODE as those absolutely continuous maps $x : [0,+\infty) \to \mathbb{R}^{2d}$ satisfying $J_d\dot x(t) \in \partial H(x(t))$ for a.e. $t \in [0,+\infty)$.

    The same subdifferentiability argument used in the proof of Theorem 5.2 then shows that $t \mapsto H(x(t))$ is constant along Hamiltonian flows. Existence of Hamiltonian flows can be achieved by the following discrete scheme: fix a time parameter $h > 0$ and an initial datum $x_0 \in \mathbb{R}^{2d}$. Then, choose $p_0 \in \partial H(x_0)$ and set $x_h(t) = x_0 + J_dp_0t$ for $t \in [0,h]$, choose $p_1 \in \partial H(x_h(h))$ and set $x_h(t) = x_1 + J_dp_1(t-h)$ for $t \in [h,2h]$ (with $x_1 := x_h(h)$), and so on. In this way $x_h(t)$ solves the "delayed" Hamiltonian equation

    (6.12)  $J_d\dot x_h(t) \in \partial H\big(x_h(h[t/h])\big)$ for a.e. $t \ge 0$.

    Using a compactness and equi-continuity argument we can find a sequence $(h_i) \downarrow 0$ and a Lipschitz map $x : [0,\infty) \to \mathbb{R}^{2d}$ such that $x_{h_i}(t)$ converge to $x(t)$ as $i \to \infty$ for any $t \ge 0$ and $\dot x_{h_i}$ weakly converge in $L^2_{loc}([0,\infty);\mathbb{R}^{2d})$ to $\dot x$.
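The polygonal scheme of this remark can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: it takes $d = 1$, the smooth Hamiltonian $H(x) = |x|^2/2$ (so that the subdifferential is the single gradient), and implements literally the update $x_{k+1} = x_k + hJ_dp_k$ with $p_k \in \partial H(x_k)$ at the nodes $t = kh$. All function names are hypothetical.

```python
def J(p):
    """Canonical symplectic matrix for d = 1 applied to p = (p_x, p_v):
    J = [[0, 1], [-1, 0]], the clockwise rotation by pi/2."""
    return (p[1], -p[0])

def grad_H(x):
    """Gradient of the illustrative Hamiltonian H(x) = |x|^2 / 2."""
    return x

def H(x):
    return 0.5 * (x[0] ** 2 + x[1] ** 2)

def discrete_flow(x0, h, n_steps):
    """Nodes of the polygonal scheme: x_{k+1} = x_k + h * J p_k,
    with p_k in the subdifferential of H at x_k (here p_k = grad H(x_k))."""
    x = x0
    for _ in range(n_steps):
        jp = J(grad_H(x))
        x = (x[0] + h * jp[0], x[1] + h * jp[1])
    return x

def energy_drift(h, T=1.0, x0=(1.0, 0.0)):
    """H(x_h(T)) - H(x_0) for the scheme with step h."""
    n = round(T / h)
    return H(discrete_flow(x0, h, n)) - H(x0)
```

For this quadratic Hamiltonian each step multiplies $|x|^2$ by exactly $1 + h^2$, because $Jx$ is orthogonal to $x$. The energy drift over $[0,T]$ is therefore $\tfrac12\big((1+h^2)^{T/h} - 1\big)|x_0|^2$, which tends to $0$ with $h$, in line with the constancy of $H$ along the limit flow.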


    In order to show that $J_d\dot x \in \partial H(x)$ a.e., we use an integral version of the discrete subdifferential inclusion, namely

    $H(y) \ge \int_0^\infty H\big(x_{h_i}(h_i[t/h_i])\big)\zeta(t)\,dt + \int_0^\infty \big\langle y - x_{h_i}(h_i[t/h_i]),\, J_d\dot x_{h_i}(t)\big\rangle\zeta(t)\,dt,$

    with $\zeta(t)$ nonnegative, with compact support and satisfying $\int\zeta\,dt = 1$, and pass to the limit as $i \to \infty$ to find

    $H(y) \ge \int_0^\infty H(x(t))\zeta(t)\,dt + \int_0^\infty \langle y - x(t),\, J_d\dot x(t)\rangle\zeta(t)\,dt.$

    Choosing properly a family $\zeta_i$ of approximations of $\delta_t$, this yields

    $H(y) \ge H(x(t)) + \langle y - x(t),\, J_d\dot x(t)\rangle$

    at any Lebesgue point $t$ of $\dot x$. This proves existence of Hamiltonian flows. We also refer the reader to a work in progress by Ghoussoub and Moameni [32] on related questions.

    Notice that this scheme doesn't seem to work in the infinite-dimensional case, when $\mathbb{R}^{2d}$ is replaced by an infinite-dimensional phase space $X$, due to the difficulty of handling terms $\int\langle f_h(t), g_h(t)\rangle\,dt$ with $f_h$ weakly converging in $L^2_{loc}([0,+\infty);X)$ and $g_h(t)$ only pointwise weakly converging to $g(t)$. Indeed, we are not aware of any existence result in this direction.

    Coming back to the finite-dimensional case $X = \mathbb{R}^{2d}$, the results in [5] (see also [6] for special classes of Hamiltonians) ensure a kind of "generic" uniqueness property, or uniqueness "in the flow sense", in the same spirit of DiPerna–Lions theory [25] (see §6 of [5] for a precise formulation). In brief, among all families of solutions $x(t,\bar x)$ of the ODE, the condition

    (6.13)  $x(t,\cdot)_\#\mathcal{L}^{2d} \le C\mathcal{L}^{2d}$ with $C$ independent of $t$

    determines $x$ up to $\mathcal{L}^{2d}$-negligible sets (i.e. if $x$ and $\tilde x$ fulfil (6.13), then $x(\cdot,\bar x) = \tilde x(\cdot,\bar x)$ for $\mathcal{L}^{2d}$-a.e. $\bar x$) and the unique $x$ satisfying (6.13) is stable within the class of approximations fulfilling (6.13) (in particular, one finds that $x(t,\cdot)$ is measure-preserving for all $t$). It turns out that the scheme described here produces a discrete flow $x_h(t,\bar x)$ satisfying (6.13) with $C = 1$, and therefore is a good approximation of the unique Hamiltonian flow $x$. See also [45] for discrete schemes (called leap-frog schemes) that really preserve the symplectic forms and therefore the symplectic volume.
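To illustrate what volume preservation means for a leap-frog scheme, here is a minimal Python sketch. It assumes the standard Störmer–Verlet form of leap-frog for a separable Hamiltonian $H(q,p) = p^2/2 + U(q)$ in one degree of freedom; the specific schemes of [45] are not reproduced here, and the function names and the quadratic potential are illustrative choices.

```python
def leapfrog_step(q, p, h, grad_U=lambda q: q):
    """One Stormer-Verlet (leap-frog) step for H(q, p) = p^2/2 + U(q),
    i.e. q' = p, p' = -U'(q). The default U(q) = q^2/2 is illustrative."""
    p_half = p - 0.5 * h * grad_U(q)          # half kick
    q_new = q + h * p_half                    # full drift
    p_new = p_half - 0.5 * h * grad_U(q_new)  # half kick
    return q_new, p_new

def step_jacobian_det(h):
    """For the quadratic potential the step is a linear map, so its
    Jacobian is the matrix whose columns are the images of (1,0), (0,1)."""
    a, c = leapfrog_step(1.0, 0.0, h)  # first column (a, c)
    b, d = leapfrog_step(0.0, 1.0, h)  # second column (b, d)
    return a * d - b * c
```

For this example the determinant equals $1$ for every step size $h$ (up to rounding), so each step preserves area in the $(q,p)$ plane. By contrast, one step $x \mapsto x + hJ\nabla H(x)$ of the explicit scheme of the remark applied to $H(x) = |x|^2/2$ has Jacobian determinant $1 + h^2$.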

    Theorem 6.6. Assume that (H1) and (H2) hold and that $T > 0$ satisfies (6.18). Then there exists a Hamiltonian flow $\mu_t = \rho_t\mathcal{L}^D : [0,T] \to D(H)$ starting from $\bar\mu = \bar\rho\mathcal{L}^D \in \mathcal{P}_2^a(\mathbb{R}^D)$, satisfying (5.1), such that the velocity field $v_t$ coincides with $\nabla H(\mu_t)$ for a.e. $t \in [0,T]$. Furthermore, $t \mapsto \mu_t$ is $L$-Lipschitz, with $L^2 = 2C_o^2(1+M)$ and $M = e^{(25C_o^2+1)T}(1+M_2(\bar\mu))$.

    Finally, there exists a function $l(r)$ depending only on $T$ and $C_o$ such that

    (6.14)  $\bar\rho \ge m_r$ $\mathcal{L}^D$-a.e. on $B_r$ $\forall r > 0 \;\Longrightarrow\; \rho_t \ge m_{l(r)}$ $\mathcal{L}^D$-a.e. on $B_r$ $\forall r > 0$

    and

    (6.15)  $\bar\rho \le M_r$ $\mathcal{L}^D$-a.e. on $B_r$ $\forall r > 0 \;\Longrightarrow\; \rho_t \le M_{l(r)}$ $\mathcal{L}^D$-a.e. on $B_r$ $\forall r > 0$.

    If in addition (H3) holds, then $t \mapsto H(\mu_t)$ is constant.

    Proof. In the first two steps of the proof, we shall assume existence of positive numbers $m_r$ such that the initial datum satisfies $\bar\rho \ge m_r > 0$ $\mathcal{L}^D$-a.e. on $B_r$ for any $r > 0$. That technical assumption will be removed only in the last step of the proof of the theorem.

    Step 1. (a time discrete scheme). Since $\bar\rho$ is integrable, standard arguments give existence of a convex function $S : [0,+\infty) \to [0,+\infty)$, which grows faster than linearly at infinity and such that $\int S(\bar\rho)\,dz$ is finite. We fix an integer $N$ sufficiently large, so that $C_oh < 1/8$ and $1 + C_oh/2 < e^{C_oh} < 1 + 2C_oh$ with $h = T/N$, and we divide $[0,T]$ into $N$ equal intervals of length $h$. We shall next argue how, for any such $N$, Lemma 6.2 gives time discrete solutions $\mu_t^N = \rho_t^N\mathcal{L}^D$ satisfying:

    (a) the Lipschitz constant of $t \mapsto \mu_t^N$ is less than $L$, with $L$ independent of $N$;

    (b) $\sup_{N,t} W_2(\mu_t^N,\bar\mu) < R_o$, $\int S(\rho_t^N)\,dz \le \int S(\bar\rho)\,dz$ and $\rho_t^N \ge m_{l(r)}$ $\mathcal{L}^D$-a.e. on $B_r$ for any $r > 0$;

    (c) the "delayed" Hamiltonian equation

    (6.16)  $\dfrac{d}{dt}\mu_t^N + \nabla\cdot(Jv_t^N\mu_t^N) = 0$

    holds in the sense of distributions in $(0,T)\times\mathbb{R}^D$, with $v_t^N = \nabla H(\mu_{ih}^N)$ for $0 \le i \le N-1$ and $t \in [ih,(i+1)h)$.

    In order to build $\mu_t^N$, we apply Lemma 6.2 $N$ times with $C = C_o$: we start with $\mu = \bar\mu$ and $v = \nabla H(\bar\rho\mathcal{L}^D)$ to obtain a solution $\mu_t^N$ of (6.16) in $[0,h]$. Then, we apply the lemma again with $\mu = \mu_h^N$ and $v = \nabla H(\rho_h^N\mathcal{L}^D)$ to extend it continuously to a solution of (6.16) in $[h,2h]$. In $N$ steps we build the solution in $[0,T]$.
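The structure of this iteration (freeze the velocity field at the start of each interval, then follow its exact flow for time $h$) can be mimicked on a toy particle system. The following Python sketch is only an analogy under loudly stated assumptions: it uses an empirical measure of finitely many particles in $\mathbb{R}^2$ (so $d = 1$), which lies outside the absolutely continuous setting of Lemma 6.2; it takes the illustrative energies $V(z) = W(z) = |z|^2/2$ with no Wasserstein term; and it assumes the gradient formula $\nabla V + (\nabla W) * \mu$ for this smooth case. All names are hypothetical.

```python
import math

def frozen_step(points, h):
    """Advance all particles for time h along the exact flow of the frozen
    field J(grad V + grad W * mu), mu being the empirical measure of
    `points`. With V(z) = W(z) = |z|^2/2 the field is J(2z - m), m the
    mean of mu, whose flow is exp(2hJ) applied about the centre c = m/2."""
    n = len(points)
    cx = sum(x for x, _ in points) / (2 * n)
    cv = sum(v for _, v in points) / (2 * n)
    cos_t, sin_t = math.cos(2 * h), math.sin(2 * h)
    out = []
    for x, v in points:
        dx, dv = x - cx, v - cv
        # exp(theta J) = cos(theta) I + sin(theta) J, J = [[0, 1], [-1, 0]]
        out.append((cx + cos_t * dx + sin_t * dv,
                    cv - sin_t * dx + cos_t * dv))
    return out

def delayed_scheme(points, T, N):
    """N frozen-velocity steps of length h = T/N."""
    h = T / N
    for _ in range(N):
        points = frozen_step(points, h)
    return points
```

In this toy case every step is a rigid rotation, so the number of particles and all pairwise distances are preserved. This is the discrete counterpart of the measure-preserving property of the flow that the proof relies on to propagate the bounds on the density from one step to the next.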

    However, in order to be sure that the lemma can be applied each time, we have to check that the inequality $W_2(\mu_{ih}^N,\bar\mu) < R_o$ is valid for $i = 0,\ldots,N-1$, and this is where the restriction on $T$ comes from: first notice that since

    $W_2(\mu_{(i+1)h}^N,\mu_{ih}^N) \le hC_o\sqrt{24(1+M_2(\mu_{ih}^N))},$

    by the triangle inequality we need only to prove by induction an upper bound of the form

    (6.17)  $M_2(\mu_{ih}^N) \le M,$

    for some $M$ such that $C_oT\sqrt{24(1+M)}$


    $J\nabla H(\mu_t)\mu_t$. Assume by contradiction that this does not happen, i.e. there exist a subsequence $N_i$ and a smooth test function $\varphi$ such that

    (6.19)  $\inf_i \Big|\int_{\mathbb{R}^D} \langle v_t^{N_i};\varphi\rangle\,d\mu_t^{N_i} - \int_{\mathbb{R}^D} \langle v_t;\varphi\rangle\,d\mu_t\Big| > 0.$

    Let us denote by $[\cdot]$ the greatest integer function. Notice that by assumption (H2) and the narrow convergence of $\mu^{N_i}_{[N_it]/N_i}$ to $\mu_t$ we can assume with no loss of generality that

    $v_t^{N_i} = J\nabla H\big(\mu^{N_i}_{[N_it]/N_i}\big) \to J\nabla H(\mu_t)$ $\mathcal{L}^D$-a.e. in $\mathbb{R}^{2d}$ as $i \to +\infty$.

    By the same argument used at the end of the proof of Lemma 6.2, based on Egorov theorem and the equi-integrability of $\rho_t^{N_i}$, we prove that $v_t^{N_i}\rho_t^{N_i}$ converge in the sense of distributions to $J\nabla H(\mu_t)\rho_t$, thus reaching a contradiction with (6.19).

    Therefore, it suffices to pass to the limit as $N \to \infty$ in (6.16) to obtain that $\mu_t$ is a Hamiltonian flow with velocity field $v_t = \nabla H(\mu_t)$.

    Step 3. Now we consider the general case. We strongly approximate $\bar\rho$ in $L^1(\mathbb{R}^D)$ by functions $\bar\rho^k$ such that $\bar\rho^k\mathcal{L}^D \in \mathcal{P}_2(\mathbb{R}^D)$ and, for any $k$, there exist constants $m_r^k > 0$ such that $\bar\rho^k \ge m_r^k$ $\mathcal{L}^D$-a.e. on $B_r$ for any $r > 0$ (for instance, convex combinations of $\bar\rho$ with a Gaussian). We also notice that the equi-integrability of $\{\bar\rho^k\}_{k=1}^\infty$ ensures the existence of a convex function $S$ having a more than linear growth at infinity, and independent of $k$, such that $\int S(\bar\rho^k)\,dz \le 1$ for any $k$.

    The construction performed in Step 1 and Step 2 can then be applied for each $k$, yielding solutions of the Hamiltonian ODE $\mu_t^k = \rho_t^k\mathcal{L}^D$, $t \in [0,T]$, satisfying $\rho_0^k = \bar\rho^k$, $\int S(\rho_t^k)\,dx \le 1$, and

    (6.20)  $\dfrac{d}{dt}\mu_t^k + \nabla\cdot\big(J\nabla H(\mu_t^k)\mu_t^k\big) = 0$ in $(0,T)\times\mathbb{R}^{2d}$.

    As, by construction, $t \mapsto \mu_t^k$ are $L$-Lipschitz, we can also assume, possibly extracting a subsequence, that $\mu_t^k \to \mu_t$ narrowly as $k \to +\infty$ for any $t \in [0,T]$. The upper bound on $\int S(\rho_t^k)\,dx$ then ensures that $\mu_t \in \mathcal{P}_2^a(\mathbb{R}^D)$ for all $t \in [0,T]$.

    The same argument used in Step 2, based on (H2) and the equi-integrability of $\rho_t^k$, shows that for any $t \in [0,T]$, $J\nabla H(\mu_t^k)\rho_t^k$ converges to $J\nabla H(\mu_t)\rho_t$ as $k \to +\infty$ in the sense of distributions. Therefore, passing to the limit as $k \to +\infty$ in (6.20) we obtain that $\mu_t$ is a solution of the Hamiltonian ODE with velocity field $J\nabla H(\mu_t)$.

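    Let us next give a more explicit expression for the Lipschitz constant of $t \mapsto \mu_t$. Recall that by (6.17), we have

    (6.21)  $M_2(\mu_{ih}^N) \le M = e^{PT}(1+M_2(\bar\mu))$

    and $W_2(\mu_\tau,\bar\mu) < R_o$ for $\tau \in [0,T]$. Thus, (6.21) and (H1) imply that

    (6.22)  $\|\nabla H(\mu_\tau)\|^2_{L^2(\mu_\tau;\mathbb{R}^D)} \le C_o^2\int_{\mathbb{R}^D} (1+|z|)^2\,d\mu_\tau(z) \le 2C_o^2(1+M_2(\mu_\tau)) \le 2C_o^2(1+M).$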


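    This, together with (3.2), yields

    (6.23)  $W_2(\mu_s,\mu_t) \le \int_s^t \|\nabla H(\mu_\tau)\|_{L^2(\mu_\tau;\mathbb{R}^D)}\,d\tau \le L(t-s).$

    Finally, the constancy of $t \mapsto H(\mu_t)$ follows by the (essential) boundedness of $\|v_t\|_{L^2(\mu_t;\mathbb{R}^D)}$ and Theorem 5.2. QED.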

    We conclude this section by showing a class of Hamiltonians satisfying the

    assumptions of Theorem 6.6.

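    Lemma 6.7. Let $\nu \in \mathcal{P}_2(\mathbb{R}^D)$ with a bounded support and let $V : \mathbb{R}^D \to \mathbb{R}$ be $\lambda_V$-convex, $W : \mathbb{R}^D \times \mathbb{R}^D \to \mathbb{R}$ convex and even, both differentiable and with at most quadratic growth at infinity. Then, for $a > 0$ the function

    \[ H(\mu) = H_0(\mu) + \mathcal{V}(\mu) + \mathcal{W}(\mu) = -\frac{a}{2} W_2^2(\mu,\nu) + \int_{\mathbb{R}^{2d}} V\, d\mu + \frac{1}{2} \int_{\mathbb{R}^D \times \mathbb{R}^D} W\, d\mu \times \mu \tag{6.24} \]

    is $(\lambda_V - a)$-convex, lower semicontinuous and satisfies (H1) and (H2).

    Proof. Possibly rescaling $V$ and $W$, we shall assume that $a = 1$. It is well known (see for instance [46] or Chapter 10 of [4]) that the potential energy $\mathcal{V}$ is $\lambda_V$-convex and lower semicontinuous, and that the interaction energy $\mathcal{W}$ is convex and lower semicontinuous. As a consequence, $H$ is $(\lambda_V - 1)$-convex and lower semicontinuous.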

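    In order to show (H1) it suffices to notice that both $\nabla V$ and $\nabla W$ have a growth at most linear at infinity, and prove that

    \[ \partial H(\mu) = \partial H_0(\mu) + \nabla V + (\nabla W) * \mu \qquad \forall \mu \in \mathcal{P}_2(\mathbb{R}^D), \tag{6.25} \]

    taking also into account that Proposition 4.3 yields, in the case when $\mu \in \mathcal{P}^a_2(\mathbb{R}^D)$, $\partial H_0(\mu) = \{t^\nu_\mu - \mathrm{id}\}$, and that $t^\nu_\mu \in L^\infty(\mu; \mathbb{R}^D)$ (by the boundedness of the support of $\nu$).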

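    The inclusion $\supset$ in (6.25) is a direct consequence of the characterization (4.4) of the subdifferential and of the inequalities

    \[ \mathcal{V}(\sigma) \ge \mathcal{V}(\mu) + \int_{\mathbb{R}^D} \langle \nabla V, \bar\gamma - \mathrm{id} \rangle\, d\mu + \frac{\lambda_V}{2} W_2^2(\mu,\sigma), \]

    \[ \mathcal{W}(\sigma) \ge \mathcal{W}(\mu) + \int_{\mathbb{R}^D} \langle (\nabla W) * \mu, \bar\gamma - \mathrm{id} \rangle\, d\mu \]

    for $\gamma \in \Gamma_o(\mu,\sigma)$ (see for instance [4]). In order to prove the inclusion $\subset$, we fix a vector $\xi \in \partial H(\mu)$ and define, for $\gamma \in \Gamma_o(\mu,\sigma)$, the measures $\mu_t = ((1-t)\pi_1 + t\pi_2)_\# \gamma$ and $\gamma_t := (\pi_1, (1-t)\pi_1 + t\pi_2)_\# \gamma \in \Gamma_o(\mu,\mu_t)$. As $(\bar\gamma_t - \mathrm{id}) = t(\bar\gamma - \mathrm{id})$, by applying the definition of subdifferential we obtain

    \[ \liminf_{t \downarrow 0} \frac{H(\mu_t) - H(\mu)}{t} \ge \int_{\mathbb{R}^D} \langle \xi, \bar\gamma - \mathrm{id} \rangle\, d\mu. \]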


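    Now, the dominated convergence theorem gives

    \[ \lim_{t \downarrow 0} \frac{\mathcal{V}(\mu_t) - \mathcal{V}(\mu)}{t} = \int_{\mathbb{R}^{2d}} \langle \nabla V, \bar\gamma - \mathrm{id} \rangle\, d\mu, \qquad \lim_{t \downarrow 0} \frac{\mathcal{W}(\mu_t) - \mathcal{W}(\mu)}{t} = \int_{\mathbb{R}^{2d}} \langle (\nabla W) * \mu, \bar\gamma - \mathrm{id} \rangle\, d\mu, \]

    so that

    \[ \liminf_{t \downarrow 0} \frac{H_0(\mu_t) - H_0(\mu)}{t} \ge \int_{\mathbb{R}^D} \langle \xi_0, \bar\gamma - \mathrm{id} \rangle\, d\mu \]

    with $\xi_0 = \xi - \nabla V - (\nabla W) * \mu$. Then, by $(-1)$-convexity of $H_0$ we get

    \[ H_0(\sigma) \ge H_0(\mu) + \int_{\mathbb{R}^D} \langle \xi_0, \bar\gamma - \mathrm{id} \rangle\, d\mu - \frac{1}{2} W_2^2(\mu,\sigma). \]

    The previous inequality, together with Propositions 4.2 and 4.3, gives that $\xi_0 \in \partial H_0(\mu)$.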

    Property (H2) follows directly from the identity

    \[ \partial H(\mu) = \{ (t^\nu_\mu - \mathrm{id}) + \nabla V + (\nabla W) * \mu \} \]

    and from Lemma 3.3. QED.
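To make the three terms of the Hamiltonian in Lemma 6.7 concrete, here is a small sketch that evaluates it on empirical measures. It is an illustration under stated assumptions, not part of the paper: the dimension is taken to be $D = 1$, both $\mu$ and $\nu$ are equal-weight empirical measures with the same number of atoms (so that $W_2^2$ is obtained by matching sorted samples), and the interaction kernel is assumed to act on differences $x - y$, as the convolution term $(\nabla W) * \mu$ in the proof suggests. The function names are hypothetical.

```python
def w2_squared_1d(xs, ys):
    """Squared 2-Wasserstein distance between two equal-weight empirical
    measures on the real line with the same number of atoms (sorted matching)."""
    if len(xs) != len(ys):
        raise ValueError("both measures must have the same number of atoms")
    pairs = zip(sorted(xs), sorted(ys))
    return sum((x - y) ** 2 for x, y in pairs) / len(xs)

def hamiltonian(xs, nu, V, W, a=1.0):
    """H(mu) = -(a/2) W_2^2(mu, nu) + int V dmu + (1/2) int int W(x - y) dmu dmu,
    for the empirical measure mu with atoms xs (assumed 1-D toy setting)."""
    n = len(xs)
    h0 = -0.5 * a * w2_squared_1d(xs, nu)
    potential = sum(V(x) for x in xs) / n
    interaction = 0.5 * sum(W(x - y) for x in xs for y in xs) / (n * n)
    return h0 + potential + interaction

def quadratic(x):
    """Convex, even test function x^2 / 2 used for both V and W below."""
    return 0.5 * x * x
```

For instance, with atoms `[0.0, 2.0]` for $\mu$, `[1.0, 3.0]` for $\nu$ and `quadratic` for both $V$ and $W$, the three terms are $-0.5$, $1.0$ and $0.5$.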

    As shown in [38], another important class of convex functionals in $\mathcal{P}_2(\mathbb{R}^D)$ is provided by the so-called internal energy functional $\mu = \rho \mathcal{L}^D \mapsto \int S(\rho)\, dz$. However, as the subdifferential of this functional is not empty only when $L_S(\rho)$ is a $W^{1,1}$ function (here $L_S(y) = y S'(y) - S(y)$), these functionals fail to satisfy (H1).

    The previous result can be extended to Hamiltonians generated from those of

    Lemma 6.7 through a sup-convolution. For simplicity we consider the case when

    neither potential nor interaction energies are present, but their inclusion does not

    present any substantial difficulty.

    Lemma 6.8. Assume that $\Omega \subset \mathbb{R}^D$ is a bounded open set, and that

    (a) $K \subset \mathcal{P}(\Omega)$ is a convex set, with respect to the standard linear structure of $\mathcal{P}(\Omega)$, closed with respect to the narrow convergence;

    (b) $J : K \to \mathbb{R} \cup \{+\infty\}$ is strictly convex with respect to the standard linear structure of $\mathcal{P}(\Omega)$, bounded from below and lower semicontinuous with respect to the narrow convergence.

    Define the Hamiltonian $H$ on $\mathcal{P}_2(\mathbb{R}^D)$ by

    \[ H(\mu) = -\inf_{\nu \in K} \Bigl\{ \frac{1}{2} W_2^2(\mu,\nu) + J(\nu) \Bigr\}. \tag{6.26} \]

    Then $H$ is $(-1)$-convex and lower semicontinuous, and satisfies (H1) and (H2).

    Proof of Lemma 6.8. Since $-W_2^2(\cdot,\nu) - J(\nu)$ is $(-2)$-convex for each $\nu \in K$, we obtain that $H$ is $(-1)$-convex and so (H3) holds.

    1. Notice first that $W_2^2(\cdot,\cdot)$ is lower semicontinuous with respect to the narrow convergence (see for instance Proposition 7.1.3 of [4]). Since $J$ is bounded from


    below and lower semicontinuous, and since bounded sets in $\mathcal{P}_2(\mathbb{R}^D)$ are sequentially compact with respect to the narrow convergence, we obtain that the infimum in the definition of $H$ is attained. Strict convexity of $J$ and convexity of $W_2^2(\mu,\cdot)$ give uniqueness of the minimizer, which we denote by $\nu(\mu)$. A compactness argument based on the uniqueness of $\nu(\mu)$ then shows that $\mu_n \to \mu$ in $\mathcal{P}_2(\mathbb{R}^D)$ implies $\nu(\mu_n) \to \nu(\mu)$ narrowly in $\mathcal{P}(\Omega)$. As $\Omega$ is bounded, the map $\mu \mapsto \nu(\mu)$ is also continuous between $\mathcal{P}_2(\mathbb{R}^D)$ and $\mathcal{P}_2(\Omega)$.

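    2. Let $\mu_o \in \mathcal{P}^a_2(\mathbb{R}^D)$ and $\mu \in \mathcal{P}_2(\mathbb{R}^D)$. Clearly,

    \[ H(\mu) - H(\mu_o) \ge -\frac{1}{2} \Bigl( W_2^2(\mu,\nu(\mu_o)) - W_2^2(\mu_o,\nu(\mu_o)) \Bigr). \]

    This, together with the fact that the Wasserstein gradient of $-\frac{1}{2} W_2^2(\cdot,\nu(\mu_o))$ at $\mu_o$ is $t^{\nu(\mu_o)}_{\mu_o} - \mathrm{id}$ (see (4.8)), yields that $t^{\nu(\mu_o)}_{\mu_o} - \mathrm{id} \in \partial H(\mu_o)$ and so $\partial H(\mu_o)$ is nonempty.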

    To characterize the elements ofH(o), let Cc (R

    D) and set

    gs = id + s, s = gs #o, s = (s).

    If H(o), the fact that H is (1)convex implies that

    H(s) H(o)

    R2d ; t

    so iddo +

    1

    2W22 (o,s) 0.

    For |s|


    where $\gamma_s$ is the unique optimal plan between $\mu_s$ and $\nu_s$. Recall now that $\mu_s \to \mu_o$ in $\mathcal{P}_2(\mathbb{R}^D)$ and $\nu_s \to \nu_o$ in $\mathcal{P}_2(\Omega)$ as $s \to 0$, hence Lemma 3.3 gives

    (6.29) sRD ;do +s2

    2 RD ||2do sRD id t

    oo

    ;do + o(s).

    We divide both sides of (6.29) first by $s > 0$, then by $s < 0$; letting $|s| \to 0$ we find

    RD ;do =

    RD

    id too ;do.

    This proves that 0 = too id. The minimality of the norm of the gradient then

    gives

    \[ \nabla H(\mu_o) = t^{\nu_o}_{\mu_o} - \mathrm{id}. \tag{6.30} \]

    From this representation of $\nabla H(\mu_o)$ and from (3.13) we obtain both (H1) and (H2). QED.

    7 An alternative algorithm yielding existence of Hamiltonian flows for

    general initial data

    In this section we provide a new discrete scheme providing existence of solu-

    tions to Hamiltonian flows for general initial data, i.e. not necessarily absolutely

    continuous with respect to Lebesgue measure. Being based on a linear interpola-

    tion at the level of transports, when particularized to Dirac masses this algorithm

    coincides with the one considered in Remark 6.5.

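    Lemma 7.1. Let $f : X \to Y$ be a Borel map, $\mu \in \mathcal{P}(X)$, and let $v \in L^2(\mu; \mathbb{R}^D)$. Then, setting $\nu = f_\# \mu$, we have $f_\#(v\mu) = w\nu$ for some $w \in L^2(\nu; \mathbb{R}^D)$ with

    \[ \|w\|_{L^2(\nu;\mathbb{R}^D)} \le \|v\|_{L^2(\mu;\mathbb{R}^D)}. \tag{7.1} \]

    Proof. Let $\eta := f_\#(v\mu)$ and $\varphi \in L^\infty(Y; \mathbb{R}^D)$; denoting by $\eta_i$, $i = 1, \ldots, D$, the components of $\eta$ we have

    \[ \sum_{i=1}^{D} \int_Y \varphi_i\, d\eta_i \le \|\varphi \circ f\|_{L^2(\mu;\mathbb{R}^D)} \|v\|_{L^2(\mu;\mathbb{R}^D)} = \|\varphi\|_{L^2(\nu;\mathbb{R}^D)} \|v\|_{L^2(\mu;\mathbb{R}^D)}. \]

    Since $\varphi$ is arbitrary this proves (7.1). QED.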
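For a finitely supported measure, the content of Lemma 7.1 is elementary: the density $w$ of the pushed-forward vector measure is the mass-weighted average of $v$ over each fiber of $f$, and averaging cannot increase the $L^2$ norm. The sketch below checks this numerically under simplifying assumptions that are not in the paper: $\mu$ is a finite sum of weighted Dirac masses and $v$ is scalar-valued. The function names are hypothetical.

```python
import math
from collections import defaultdict

def push_forward_density(points, masses, velocities, f):
    """For mu = sum m_i delta_{x_i} and scalar v(x_i) = v_i, return (nu, w) where
    nu[y] is the mass of f_# mu at y and w[y] is the density of f_#(v mu) w.r.t. nu."""
    nu = defaultdict(float)
    momentum = defaultdict(float)
    for x, m, v in zip(points, masses, velocities):
        y = f(x)
        nu[y] += m
        momentum[y] += m * v
    # w is the mass-weighted average of v over each fiber of f.
    w = {y: momentum[y] / nu[y] for y in nu if nu[y] > 0.0}
    return dict(nu), w

def l2_norm(values, weights):
    """Weighted L^2 norm: sqrt(sum_i weight_i * value_i^2)."""
    return math.sqrt(sum(wt * val * val for val, wt in zip(values, weights)))

def norms(points, masses, velocities, f):
    """Return (norm of w in L^2(nu), norm of v in L^2(mu))."""
    nu, w = push_forward_density(points, masses, velocities, f)
    ys = list(nu)
    norm_w = l2_norm([w[y] for y in ys], [nu[y] for y in ys])
    norm_v = l2_norm(velocities, masses)
    return norm_w, norm_v
```

With four equal atoms at $\pm 1, \pm 2$, velocities $(1, -1, 2, 0)$ at $(-2, -1, 1, 2)$ and $f = |\cdot|$, opposite atoms are merged and their velocities averaged, so the first returned norm is strictly smaller than the second.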

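    Lemma 7.2. Let $T > 0$, $C \ge 0$, $\mu^n_t : [0,T] \to \mathcal{P}_2(\mathbb{R}^D)$ and $v^n_t \in L^2(\mu^n_t; \mathbb{R}^k)$ be satisfying:

    (a) $\mu^n_t \to \mu_t$ narrowly as $n \to +\infty$, for all $t \in [0,T]$;

    (b) $\|v^n_t\|_{L^2(\mu^n_t;\mathbb{R}^k)} \le C$ for a.e. $t \in [0,T]$;

    (c) the $\mathbb{R}^k$-valued space-time measures $v^n_t \mu^n_t\, dt$ are weakly-$*$ converging in $(0,T) \times \mathbb{R}^D$ to $\sigma$.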


    Then there exist $v_t \in L^2(\mu_t; \mathbb{R}^k)$, with $\|v_t\|_{L^2(\mu_t;\mathbb{R}^k)} \le C$ for a.e. $t$, such that $\sigma = v_t \mu_t\, dt$.

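    Proof. Possibly extracting a subsequence we can also assume that the scalar space-time measures $|v^n_t| \mu^n_t\, dt$ weak-$*$ converge to $\vartheta$, and it is well known (see for instance Proposition 1.62(b) of [3]) that $|\sigma| \le \vartheta$. Since, by Hölder's inequality, the projection of $|v^n_t| \mu^n_t\, dt$ on $[0,T]$ is less than $C\, dt$, the same is true for $\vartheta$. Hence the disintegration theorem (see for instance Theorem 2.28 in [3]) provides us with the representation $\sigma = \sigma_t\, dt$ for suitable $\mathbb{R}^k$-valued measures $\sigma_t$ in $\mathbb{R}^D$ having total variation less than $C$ for a.e. $t$.

    Now, for any $\psi \in C_c^\infty(0,T)$, $\varphi \in C_c^\infty(\mathbb{R}^D; \mathbb{R}^k)$ we have

    \[ \Bigl| \int_0^T \psi(t) \langle \varphi; \sigma_t \rangle\, dt \Bigr| = |\langle \psi\varphi; \sigma \rangle| = \lim_{n \to +\infty} \Bigl| \int_0^T \psi(t) \langle \varphi; v^n_t \mu^n_t \rangle\, dt \Bigr| \le C \int_0^T |\psi|(t) \sqrt{\langle |\varphi|^2; \mu_t \rangle}\, dt. \]

    As $\psi$ is arbitrary, this means that $|\langle \varphi; \sigma_t \rangle| \le C \sqrt{\langle |\varphi|^2; \mu_t \rangle}$ for a.e. $t$. By a density argument we can find a Lebesgue negligible set $N \subset (0,T)$ such that

    \[ |\langle \varphi; \sigma_t \rangle| \le C \sqrt{\langle |\varphi|^2; \mu_t \rangle} \qquad \forall \varphi \in C_c^\infty(\mathbb{R}^D; \mathbb{R}^k),\ \forall t \in (0,T) \setminus N. \]

    Hence, for any $t \in (0,T) \setminus N$ we have $\sigma_t = v_t \mu_t$ for some $v_t \in L^2(\mu_t; \mathbb{R}^k)$ with $L^2(\mu_t; \mathbb{R}^k)$ norm less than $C$. QED.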

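    We consider now two basic assumptions on the Hamiltonian, which are variants of those considered in the previous section.

    (H1') There exist constants $C_o \in [0,+\infty)$, $R_o \in (0,+\infty]$ such that for all $\mu \in \mathcal{P}_2(\mathbb{R}^D)$ with $W_2(\mu,\bar\mu) < R_o$ we have $\mu \in D(H)$, $\partial H(\mu) \neq \emptyset$ and $\|\nabla H(\mu)\|_{L^2(\mu)} \le C_o$.

    (H2') If $\sup_n W_2(\mu_n,\bar\mu) < R_o$ and $\mu_n \to \mu$ narrowly, then

    \[ \bigcap_{m=1}^{\infty} \overline{\mathrm{co}}\bigl( \{ \nabla H(\mu_n)\mu_n : n \ge m \} \bigr) \subset \bigl\{ w\mu : w \in \partial H(\mu) \cap T_\mu \mathcal{P}_2(\mathbb{R}^D) \bigr\}, \tag{7.2} \]

    where $\overline{\mathrm{co}}$ denotes the closed convex hull, with respect to the weak-$*$ topology.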

    Remark 7.3. (a) Assumption (H1') is weaker than (H1), with the replacement of a pointwise bound with an integral one. Also (H2') is essentially weaker than (H2), as it does not impose any strong convergence property on $\nabla H(\mu_n)$; however, this forces us to consider a stability with respect to closed convex hulls.

    (b) A sufficient condition which ensures (H2') is the following:

    (H2'') If $\sup_n W_2(\mu_n,\bar\mu) < R_o$ and $\mu_n \to \mu$ narrowly, then

    \[ \nabla H(\mu_n)\mu_n \to \nabla H(\mu)\mu \]

    in the sense of distributions.

    (c) As in the previous section, the condition (H3) ensures constancy of the

    Hamiltonian along the Hamiltonian flows. We can apply the same argument used


    at the beginning of the proof of Theorem 5.2, to obtain that (H3) and (H1') imply that $H$ is Lipschitz continuous on the ball $\{ \mu \in \mathcal{P}_2(\mathbb{R}^D) : W_2(\mu,\bar\mu) \le R_o \}$.

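    Theorem 7.4. Assume that (H1') and (H2') hold and that $C_o T < R_o$. Then there exists a Hamiltonian flow $\mu_t : [0,T] \to D(H)$ starting from $\bar\mu \in \mathcal{P}_2(\mathbb{R}^D)$, satisfying (5.1), such that $t \mapsto \mu_t$ is $C_o$-Lipschitz. Furthermore, if (H3) holds, then $t \mapsto H(\mu_t)$ is constant.

    In particular, if $\partial H(\mu) \cap T_\mu \mathcal{P}_2(\mathbb{R}^D) = \{\nabla H(\mu)\}$ for all $\mu$ such that $W_2(\mu,\bar\mu)$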