
Tree-valued Markov limit dynamics

Habilitationsschrift

Anita Winter

Mathematisches Institut

Universität Erlangen–Nürnberg

Bismarckstraße 1 1/2

91054 Erlangen, Germany

[email protected]


Contents

Many many thanks ...

Introduction
Part I: Real trees and metric measure spaces
Part II: Examples of prominent real trees and mm-spaces
Part III: Tree-valued Markov dynamics
Notes

Chapter 1. State spaces I: $\mathbb{R}$-trees
1.1. The Gromov-strong topology
1.2. A complete metric: The Gromov-Hausdorff metric
1.3. Gromov-Hausdorff and the Gromov-strong topology coincide
1.4. Compact sets in $X_c$
1.5. 0-hyperbolic spaces and $\mathbb{R}$-trees
1.6. $\mathbb{R}$-trees with 4 leaves
1.7. Length measure
1.8. Rooted $\mathbb{R}$-trees
1.9. Rooted subtrees and trimming
1.10. Compact sets in $T$
1.11. Weighted $\mathbb{R}$-trees
1.12. Distributions of random (weighted) real trees

Chapter 2. State spaces II: The space of metric measure trees
2.1. The Gromov-weak topology
2.2. A complete metric: The Gromov-Prohorov metric
2.3. Distance distribution and modulus of mass distribution
2.4. Compact sets in $M$
2.5. Gromov-Prohorov and Gromov-weak topology coincide
2.6. Ultra-metric measure spaces
2.7. Compact metric measure spaces
2.8. Distributions of random metric measure spaces
2.9. Equivalent metrics

Chapter 3. Examples of limit trees I: Branching trees
3.1. Excursions
3.2. The Brownian continuum random tree
3.3. Aldous’s line-breaking representation of the Brownian CRT


3.4. Campbell measure facts: Functionals of the Brownian CRT
3.5. Existence of the reactant branching trees
3.6. Random evolutions: Proof of Theorem 3.5.3

Chapter 4. Examples of limit trees II: Coalescent trees
4.1. Λ-coalescent measure trees
4.2. Spatially structured Λ-coalescent trees
4.3. Scaling limit of spatial Λ-coalescent trees on $\mathbb{Z}^d$, $d \geq 3$
4.4. Scaling limit of spatial Kingman coalescent trees on $\mathbb{Z}^2$

Chapter 5. Root growth and Regrafting
5.1. A deterministic construction
5.2. Introducing randomness
5.3. Connection to Aldous’s line-breaking construction
5.4. Recurrence, stationarity and ergodicity
5.5. Feller property
5.6. Asymptotics of the Aldous-Broder algorithm
5.7. An application: The Rayleigh process

Chapter 6. Subtree Prune and Regraft
6.1. A symmetric jump measure on $(T^{\mathrm{wt}}, d_{\mathrm{GH}^{\mathrm{wt}}})$
6.2. Dirichlet forms
6.3. An associated Markov process
6.4. The trivial tree is essentially polar

Chapter 7. Tree-valued Fleming-Viot dynamics
7.1. The tree-valued Fleming-Viot martingale problem
7.2. Duality: A unique solution
7.3. Approximating tree-valued Moran dynamics
7.4. Compact containment: Limit dynamics exist
7.5. Limit dynamics are tree-valued Fleming-Viot dynamics
7.6. Limit dynamics yield continuous paths
7.7. Proof of the main results (Theorems 7.1.6 and 7.3.1)
7.8. Long-term behavior
7.9. The measure-valued Fleming-Viot process as a functional
7.10. More general resampling mechanisms and extensions
7.11. Application: Sample tree lengths distributions

Index

Bibliography


Many many thanks ...

It was not so much the confidence to pursue a doctorate that I lacked. That would have been the second step before the first.

Christiane Leidinger¹

Submitting my habilitation thesis is a good occasion to review the long and often difficult way I have covered and, of course, to thank all of the people who gave advice and supported me on the way to the Habilitation degree. Being aware of my socio-cultural and educational background, I go beyond the common focus at the end of a habilitation project on the years after the PhD and want to stress and acknowledge that it took each day of the last 35 years for the present thesis to finally become real.

Thanks and love to my mother Claudia Winter, who has supported my craving for studying from my earliest days, for example, by finding me popular scientific books in mathematics and physics of which she had only guessed how much I appreciated them, and by finding and defending a space of my own in our much too small Prenzlauer Berg apartment. And to my sisters Jeanette and Simone Winter for relieving my feelings of becoming a stranger to my family. I’m also very grateful to my father’s middle school teacher and later friend of the family Brunhilde Lorff, who guided me in feminism more than 30 years ago.

I want to thank my teachers Elke Elsing, Frau Hollander, Frau Matthes and Frau Steinbrecher at the Anne and Anton Saefkow School, who supported me in many respects and who for the first time introduced me to the idea of studying at a university, and even abroad, at the age of only 13. By that time I had just joined a circle in the Mathematische Schülergesellschaft (MSG) Leonhard Euler at the Humboldt University instructed by Ingmar Lehman, who introduced me to the academic way of approaching mathematical problems and who believed in my abilities to an extent that I could not ignore them anymore. Thanks to him and to all the people involved in the organization of the yearly 10-day MSG summer camps, which I happily remember as very challenging. And also to Günter Last, who taught me probability during my last year of high school and who advised my first scientific thesis, written as part of the requirements of my Abitur.

¹ KlasseN Dissertation: Über Arbeitertochter und das Promovieren aus der Bildungsferne, [Lei06]


After I entered university, despite optimal preparation, I lost self-confidence with each day I was sitting in class, and I was too intimidated to raise a single question and in the end often too confused to be able to summarize what I had just learned. The strong feeling of alienation often kept me from studying with my fellow students. It was only after 3 years of chosen isolation that I attended a seminar with Andreas Greven on Interacting Particle Systems which all of a sudden excited and inspired me in many ways. I want to thank Uta Freiberg, who also attended the seminar, for letting me re-enjoy talking about mathematics. My great gratitude goes to Andreas Greven, who later supervised my Diploma and PhD theses and with whom I have continued to work on different projects since. He introduced me to the models arising in population genetics on which I seem to have built a career now, he encouraged and supported me in traveling to international conferences, but in the first place he advised and emotionally supported me over all the years. I also felt emotionally supported by my former colleague and office mate Achim Klenke, whom I would like to thank particularly for sharing personal experiences on many occasions during my Diploma and PhD theses.

Since I started my PhD I have traveled to many places. The incomplete list of people whom I want to thank for invitations to research stays, meetings and seminars includes Siva Athreya, Ellen Baake, Don Dawson, Frank den Hollander, Alison Etheridge, Steve Evans, Nina Gantert, Andreas Greven, Achim Klenke, Vlada Limic, Terry Lyons, Ed Perkins, Theo Sturm, Silke Rolles, Jan Swart, Alain-Sol Sznitman, Anton Wakolbinger, Ruth Williams, ... My first international workshop on Stochastic Partial Differential Equations and the attached summer school at the University of British Columbia in Vancouver in 1997, at which I met for the first time several of the people who later became my co-authors, definitely had a huge impact. This journey, as well as a research stay one and a half years later at the Fields Institute in Toronto, was made possible by a grant of Don Dawson, who, along with Anton Wakolbinger, later refereed my PhD thesis and has since been following my track. I would like to thank both, as well as Steve Evans and Andreas Greven, for writing letters of recommendation, often at very short notice. I am also grateful for the additional financial support from the Edith and Otto Haupt Foundation and the unconventional help of Wolfgang Schmidt, which made the trip to Vancouver possible.

It was from one of my stays at the Fields Institute in Toronto that I brought a poster with the striking message “Gays and lesbians are our teachers, students, parents, doctors, ...” which, once hanging in the Mathematical Institute in Erlangen, created heated disputes. I would like to thank everybody who was supportive during that time, which was difficult for me, in particular Tanja Dierkes, Andreas Greven, Andreas Knauf, Peter Pfaffelhuber and Iljana Zähle, who took a firm stand in the discussion. This is a welcome opportunity to also thank all my gay and lesbian (to (may)be) colleagues at the department for coming out to me, whether or not they would like to see their names written here.

I spent the academic year 2002/03 with a DFG research fellowship at the University of California in Berkeley, where I worked with Steve Evans and Jim Pitman. I would like to thank them and all the people with whom I interacted during that year. I am particularly grateful to Steve, who introduced me to the central theme of this thesis, which is the space of real trees and the Gromov-Hausdorff distance, and from whom I learned a lot about writing (and finishing) a paper. Working and specifically writing with him is a great pleasure to me.

In the summer semester 2004 I did my first academic outing in the humanities by attending a very inspiring reading seminar on “Written Identities” instructed by Doris Feldmann at the English Department at Erlangen University. As a consequence, in February 2005, although still troubled with relating the mathematician that I am with lesbian-feminist research, I found the courage to join the lfq network “Netzwerk lesbisch-feministisch-queerer Forschung” initiated a year before by Christiane Leidinger.

Coming back from Berkeley to Erlangen I also started to take Hebrew lessons, a bold venture in a region that is home to few native speakers. However, there is a class at the Bildungszentrum Nürnberg which had been taught for more than 10 years by Ganja Benari. Although, by the time I joined, the class was far beyond my knowledge of Hebrew, I found in Ganja and in all the women taking part in the course excellent teachers. So many thanks to Batja, Dorothee, Edith, Ganja, Hiltrud, Margot, Renate, Rosemarie, ... The present thesis was written during a current research stay at the Technion in Haifa funded by the Aly Kaufman Foundation. I am very grateful to Leonid Mytnik for inviting and encouraging me to come and to all the local probability people for their great hospitality. Thanks also to Chen Weider, who rented to me his wonderful sea view apartment, which has been my home for the last months and in which I enjoyed writing huge parts of this thesis.

I particularly acknowledge my co-authors whose work with me appears in this thesis: Steven N. Evans, Andreas Greven, Vlada Limic, Peter Pfaffelhuber, Jim Pitman and Lea Popovic, as well as all my other collaborators Siva Athreya, Michael Eckhoff, János Engländer, Leonid Mytnik, Anja Sturm, Rongfeng Sun and Iljana Zähle. I am also thankful to all the anonymous referees for the thorough reading of the papers. The revisions based on their reports often improved the presentation. Further thanks to Michael Eckhoff, Grit Paechnatz, Peter Pfaffelhuber, Ulrike Tisch and Iljana Zähle, who kindly proof-read various parts of the manuscript.

Special thanks go to my house mates and friends Heike Herzog, Grit Paechnatz and Kathrin Schmidt, who have been with me through all the ups and downs in the last years. And to the physiotherapists Perla Ben Simon, Silke Kruse and Dorit Thumer, who professionally worked with me. I also extend thanks for support that runs far beyond the bounds of collegiality to Lisa Beck, Nina Gantert, Vlada Limic, Lea Popovic and Iljana Zähle. And to Peter Pfaffelhuber, from whom I learned how to collaborate with people who may have different scholarly interests from my own.

An incomplete list of other colleagues and friends whom I would like to thank for advice and support includes David Aldous, Nihat Ay, Michaela Baetz, Nadja Bennewitz, Dieter Binz, Juditha Cofman, Claudia Dempel, Gabriele Dennert, Axel Ebinger, Silvia Eichner, Oye Felde, Simone Fischer, Orit Furman, Walter Hofmann, Tobias Jäger, Gerhard Keller, Edith Kellinghusen, Julia Kempe, Manfred Kronz, Christiane Leidinger, Alexander Lepke, Heike Lepke, Anna Levit, Wolfgang Löhr, Oded Regev, Rosi Ringer, Gerhard Scheibel, Frank Schiller, Johanna Schmidt, Sarah Schmiedel, Christoph Schumacher, J(.) Seipel, Thomas Springer, Ljiljana Stamenkovic, Andrea Stroux, Sreekar Vadlamani, Stefanie Weigel, Silvia Wendler, Yael Zbar, Helga Zech, ...

Anita Winter

Haifa, June 2007


Introduction

In the present thesis we study random trees and tree-valued Markov dynamics which arise in the limit of discrete trees and discrete tree-valued Markov chains, respectively, after a suitable rescaling, as the number of vertices tends to ∞.

Random trees appear frequently in the mathematical literature. Prominent examples are random binary search trees as a special case of random recursive trees ([DH05]), ultra-metric structures in spin-glasses (see, for example, [BK06, MPV87]), spanning trees (see, for example, [AS92, KL96, PW98, BGL98]), etc. In branching models trees arise, for example, as the Kallenberg tree and the Yule tree in the (sub-)critical and super-critical Galton-Watson process, respectively, conditioned on “survival” ([Kal77, EO94]).

A huge enterprise in biology is phylogenetic analysis, which reconstructs the “family trees” of a collection of taxa. The mathematical aspects of the subject are surveyed in [SS03]. Due to the enormous diversity of life, phylogenetics often leads to the consideration of very large trees and therefore calls for an investigation of limits of finite trees. For example, by taking continuum (mass) limits we obtain for branching models the Brownian continuum random tree (Brownian CRT) or the Brownian snake ([Ald91b, Ald93, LG99a]). More general branching mechanisms lead to more general genealogies, such as Lévy trees, which are the infinite variance offspring distribution counterpart of the Brownian CRT ([DLG02]), the Poisson snake ([AS02]) or the reactant trees arising in catalytic branching systems ([GPW06b]), to name just a few. In population models with a fixed population size (or total mass), the genealogical trees can be generated by coalescent processes, for example, the Kingman coalescent tree ([Kin82a, Ald93, Eva00a]) or the Λ-coalescent measure trees ([GPW06a]).

Many results on the convergence of finite trees towards a limit tree have been shown by considering the asymptotic behavior of functionals of ensembles of random trees such as their height, total number of vertices, averaged branching degree, etc. (see, for example, [CP00, Win02, PR04, CKMR05]).

In a series of papers [Ald91a, Ald91b, Ald93] (see also [LG99a, Pit02]) Aldous suggested a much stronger notion of convergence of random trees. The main difficulty Aldous had to overcome was to map the sequence of trees into a space of all the “tree-like” objects which may arise in the limit as the number of vertices tends to infinity. First, following a long tradition, he relied on the connection between trees and continuous paths. Encoding trees by continuous functions allows one to think of weak convergence of random trees as weak convergence of continuous functions with respect to the uniform topology on compacta. In particular examples this approach may have at least two drawbacks. Although there is a classical bijection between rooted planar trees and lattice paths (see, for example, [DM00]), there seems to be no obvious way to uniquely associate a continuous path to a limit tree. Moreover, the uniform topology is a rather strong topology.

Secondly, Aldous noticed that a finite leaf-labeled tree with edge lengths is isomorphic to a compact subset of $\ell_1^+$, i.e., the space of non-negative summable sequences. With this encoding, weak convergence of random trees translates into weak convergence of the associated closed subsets of $\ell_1$, where the space of closed subsets of the metric space $\ell_1$ is equipped, as usual, with the Hausdorff topology. Aldous’s approach is extremely powerful. In particular, it allowed him to show that a suitably rescaled family of Galton-Watson trees, conditioned to have total population size $n$, converges as $n \to \infty$ to the Brownian continuum random tree, which can be thought of as the tree inside a standard Brownian excursion. More recently his approach was applied in [HMPW] to identify the self-similar fragmentation tree as the scaling limit of discrete fragmentation trees and, in doing so, to confirm in a strong way that the whole trees grow at the same speed as the mean height of a randomly chosen leaf.

Both approaches, via continuous paths and via compact subsets of $\ell_1$, are designed for leaf-labeled trees. If one wants to rescale unlabeled trees one may be tempted to invent a labeling. Since the choice of a labeling playing the role of “coordinates” is arbitrary, it may not always be convenient to work in this setting. One should rather be consistent and follow an intrinsic, that is, “coordinate free”, path.

Before we motivate this, we note that there is quite a large literature on other approaches to “geometrizing” and “coordinatizing” spaces of trees. The first construction of codes for labeled trees without edge lengths goes back to 1918: Prüfer ([Pru18]) sets up a bijection between labeled trees of size $n$ and the points of $\{1, 2, \dots, n\}^{n-2}$. Phylogenetic trees are identified with points in matching polytopes in [DH98], and [BHV01a] equips the space of finite phylogenetic trees with a fixed number of leaves with a metric that makes it a cell-complex with non-positive curvature.
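As an illustration of the Prüfer coding of labeled trees, the following minimal sketch implements the standard textbook version of the bijection (repeatedly remove the smallest leaf and record its neighbor). The function names `prufer_encode` and `prufer_decode` and the edge-list representation are our own assumptions and are not taken from [Pru18].

```python
import heapq

def prufer_encode(n, edges):
    """Return the Pruefer sequence (length n-2) of a labeled tree on vertices 1..n, n >= 2."""
    neighbors = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    leaves = [v for v in neighbors if len(neighbors[v]) == 1]
    heapq.heapify(leaves)
    code = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)          # smallest remaining leaf
        parent = next(iter(neighbors[leaf]))  # its unique remaining neighbor
        code.append(parent)
        neighbors[parent].discard(leaf)
        if len(neighbors[parent]) == 1:       # parent has become a leaf
            heapq.heappush(leaves, parent)
    return code

def prufer_decode(code):
    """Rebuild the edge list of the labeled tree with the given Pruefer sequence."""
    n = len(code) + 2
    degree = {v: 1 for v in range(1, n + 1)}
    for v in code:
        degree[v] += 1
    leaves = [v for v in degree if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in code:
        leaf = heapq.heappop(leaves)
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # exactly two vertices remain; they form the last edge
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges
```

For instance, the path 1-2-3-4 is coded by the sequence (2, 3), and decoding any of the $n^{n-2}$ sequences and encoding the result again returns the original sequence.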

The thesis is split into three parts.

Part I: Real trees and metric measure spaces

In the first part of the present thesis we systematically develop the topological properties of possible state spaces and characterize the corresponding convergence in distribution.


In Chapter 1 we follow the path of the so-called T-theory (see [Dre84, DMT96, Ter97]) to extend the definition of a tree with edge lengths by allowing behavior such as infinite total edge length and vertices with infinite branching degree. T-theory takes finite trees to be just metric spaces with certain characteristic properties and then defines a more general class of tree-like metric spaces called real trees or $\mathbb{R}$-trees. We note that one of the primary impetuses for the development of T-theory was to provide mathematical tools for concrete problems in the reconstruction of phylogenies. We also note that $\mathbb{R}$-trees have been objects of intensive study in geometric group theory (see, for example, the surveys [Sha87, Mor92, Sha91, Bes02] and the recent book [Chi01]).
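One standard way to make the “characteristic properties” of a tree-like metric concrete is the four-point condition for 0-hyperbolic spaces: for any four points, the two largest of the three pairwise distance sums coincide. The sketch below checks this on a finite metric space. That this is the characterization used in Chapter 1 is our assumption, and the name `is_zero_hyperbolic` and the tolerance parameter are purely illustrative.

```python
from itertools import combinations

def is_zero_hyperbolic(points, dist, tol=1e-12):
    """Check the four-point condition on a finite metric space.

    For every four points x, y, u, v the two largest of the sums
    d(x,y)+d(u,v), d(x,u)+d(y,v), d(x,v)+d(y,u) must be equal.
    """
    for x, y, u, v in combinations(points, 4):
        sums = sorted([dist(x, y) + dist(u, v),
                       dist(x, u) + dist(y, v),
                       dist(x, v) + dist(y, u)])
        if sums[2] - sums[1] > tol:
            return False
    return True

# A star tree with leaf edge lengths 1, 2, 3, 4 is tree-like ...
leaf_length = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}
star = lambda p, q: 0.0 if p == q else leaf_length[p] + leaf_length[q]

# ... whereas the graph metric of a cycle on 4 vertices is not.
cycle = lambda i, j: min((i - j) % 4, (j - i) % 4)
```

Here `is_zero_hyperbolic(list(leaf_length), star)` holds, while the 4-cycle fails because its three distance sums are 2, 2 and 4.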

Once we have an extended notion of trees as just particular abstract metric spaces (or, more correctly, isometry classes of metric spaces), we need a notion of convergence. We will follow one aspect of Aldous’s philosophy, namely to embed a sequence of trees isometrically in one and the same metric space and then to decide convergence based on whether or not the sequence of isometric copies converges. However, rather than using Aldous’s particular embedding into the space of compact subsets of $\ell_1$, we say that the sequence converges if and only if there is any common compact metric space in which the sequence can be embedded isometrically such that the sequence of isometric copies converges. We refer to this topology as the Gromov-strong topology and show that a metric on the space of compact metric spaces generating the Gromov-strong topology is provided by the well-studied Gromov-Hausdorff metric (compare, for example, [BBI01] and references therein). The Gromov-Hausdorff metric originated in geometry as a means of making sense of intuitive notions such as the convergence to Euclidean space of a re-scaled integer lattice as the grid size approaches zero. We remark in passing that the papers [Pau88, Pau89] are an application of the Gromov-Hausdorff metric to the study of $\mathbb{R}$-trees that is quite different from what we present here.
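Once two spaces are embedded in a common metric space, convergence is decided by the Hausdorff distance between the embedded copies. The following sketch computes this distance for finite subsets only; it does not compute the Gromov-Hausdorff distance itself, which would additionally require an infimum over all isometric embeddings. The function name `hausdorff_distance` is our own.

```python
def hausdorff_distance(A, B, dist):
    """Hausdorff distance between finite non-empty subsets A, B of a common metric space."""
    # largest distance from a point of A to its nearest point in B, and vice versa
    a_to_b = max(min(dist(a, b) for b in B) for a in A)
    b_to_a = max(min(dist(a, b) for a in A) for b in B)
    return max(a_to_b, b_to_a)

# Two rescaled grids on the unit interval with mesh 1/4 and 1/8.
coarse = [k / 4 for k in range(5)]
fine = [k / 8 for k in range(9)]
gap = hausdorff_distance(coarse, fine, lambda x, y: abs(x - y))
```

In this toy example `gap` equals 1/8, half the coarser mesh size minus nothing more than the finer grid's midpoint offset, which illustrates how refining grids approach each other in the Hausdorff sense.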

In some applications one is mainly interested in rooted real trees, that is, real trees $(X, r)$ with a distinguished point $\rho \in X$ that we may think of as a common ancestor. We therefore also introduce the space of rooted compact real trees and equip it with the rooted Gromov-Hausdorff metric, sometimes referred to as the pointed Gromov-Hausdorff metric. This distance was introduced in geometry as a means of making sense of the convergence to Euclidean space of a sphere when viewed from a fixed point (for example, the North Pole) as the radius approaches infinity.

An important preliminary step is to show that it is possible to equip the space of pairs of compact real trees and their accompanying weights with the weighted Gromov-Hausdorff metric. We note that a Gromov-Hausdorff-like metric on more general metric spaces equipped with measures was introduced in [Stu06]. The latter metric is based on the Wasserstein-$L^2$ metric between measures, whereas ours is based on the Prohorov metric.


Aldous’s philosophy has a second aspect, which we exploit in Chapter 2. For motivation one needs to take a more detailed look at the proof of his invariance principle for branching trees. In order to define convergence, Aldous not only codes trees as separable and complete metric spaces embedded into $\ell_1^+$, whose metric satisfies special properties characterizing them as trees, but in addition equips them with a probability measure. The idea of convergence in distribution of a “consistent” family of finite random trees then follows Kolmogorov’s theorem, which characterizes convergence of $\mathbb{R}$-indexed stochastic processes with regular paths. That is, a sequence has a unique limit provided a tightness condition holds on path space and the “finite-dimensional distributions” converge. The analogs of finite-dimensional distributions are “subtrees spanned by finitely many randomly chosen leaves”, and Aldous’s notion of convergence has been successful not only in showing convergence of branching trees but also in constructing limit coalescent trees which cannot be represented by continuous functions.

To follow Aldous’s approach without using his particular embedding, we equip the space of separable and complete real trees carrying a probability measure with the topology in which a sequence of trees (equipped with a probability measure) converges to a limit tree (equipped with a probability measure) if and only if all randomly sampled finite subtrees converge to the corresponding limit subtrees. The resulting topology is referred to as the Gromov-weak topology. Since the construction of the topology works not only for tree-like metric spaces, but also for the space (of measure preserving isometry classes) of metric measure spaces, we formulate everything within this framework. We will see that the Gromov-weak topology on the space of metric measure spaces is Polish. In fact, we metrize the space of metric measure spaces equipped with the Gromov-weak topology by the Gromov-Prohorov metric, which combines the two concepts of metrizing the space of metric spaces and the space of probability measures on a given metric space in a straightforward way. Moreover, we present a number of equivalent metrics which might be useful in different contexts.
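The sampling idea behind the Gromov-weak topology can be sketched for a finite metric measure space: draw $n$ points independently according to the probability measure and record their mutual distances. The representation by a list of points, a list of weights and a distance function, as well as the name `sample_distance_matrix`, are illustrative assumptions.

```python
import random

def sample_distance_matrix(points, weights, dist, n, rng=random):
    """Sample n i.i.d. points from a finite metric measure space and
    return the n x n matrix of their mutual distances."""
    sample = rng.choices(points, weights=weights, k=n)
    return [[dist(a, b) for b in sample] for a in sample]

# Toy example: three points on the real line carrying masses 0.2, 0.3 and 0.5.
rng = random.Random(0)
matrix = sample_distance_matrix([0.0, 1.0, 3.0], [0.2, 0.3, 0.5],
                                lambda x, y: abs(x - y), 4, rng)
```

The law of such random distance matrices depends only on the measure preserving isometry class of the space, which is why convergence of these laws for all $n$ is a natural coordinate free notion of convergence.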

This then allows us to discuss convergence of random variables taking values in that space. We characterize compact sets and tightness via quantities which are reasonably easy to compute.

The most important ideas on metric measure spaces are contained in Chapter $3\frac{1}{2}$ of Gromov’s book [Gro99]. Several of the results presented here are stated in [Gro99] in a different set-up. While Gromov focuses on geometric aspects, we provide the tools necessary to do probability theory on the space of metric measure spaces.

Further related topologies on particular subspaces of isometry classes of complete and separable metric spaces have already been considered in [Stu06] and [EW06] (where the weighted Gromov-Hausdorff metric discussed in Chapter 1 was introduced). Convergence in one of these two topologies implies convergence in the Gromov-weak topology but not vice versa.

Part II: Examples of prominent real trees and mm-spaces

In the second part of the thesis we illustrate the theory with examples of limit trees arising in the theory of branching and coalescing.

In Chapter 3 we reconsider branching trees as one source of random tree models. Following the tradition of coding branching trees by excursions, we first recall the connection between trees and continuous paths. We then define Aldous’s Brownian continuum random tree as the tree associated with the path of a Brownian excursion. We will illustrate how this allows one to calculate certain functionals of the Brownian continuum random tree explicitly via Itô’s excursion measure. An explicit representation of the Brownian continuum random tree is given by Aldous’s line-breaking construction ([Ald91a, Ald93]). This technique allows for studying explicit distributions. Since we are going to apply this construction later in Chapter 5, in this chapter we recall Aldous’s results and reformulate them in terms of real trees and metric measure trees to illustrate the theory developed so far. Note that line-breaking constructions for generalizations of the Brownian continuum random tree can be found in [AP06, HMPW].
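The connection between excursions and trees can be sketched in a discrete setting. Assuming the usual convention that two times $s \le t$ of an excursion $e$ are at tree distance $e(s) + e(t) - 2 \min_{u \in [s,t]} e(u)$, the following function evaluates this distance for an excursion sampled on a grid. The list representation and the name `tree_distance` are our own.

```python
def tree_distance(excursion, s, t):
    """Distance in the tree coded by a discretized excursion between the
    points visited at grid indices s and t."""
    i, j = min(s, t), max(s, t)
    # height of the most recent common ancestor is the minimum on [i, j]
    return excursion[i] + excursion[j] - 2 * min(excursion[i:j + 1])

# Contour path of a root with one child that itself has two children.
contour = [0, 1, 2, 1, 2, 1, 0]
```

In this example the two grandchildren (indices 2 and 4) are at distance 2, and indices 1, 3 and 5 are at distance 0 from each other, so they are identified as the same vertex of the tree.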

It is shown in [Ald91a, Ald91b, Ald93] (see also [LG99a, Pit02]) that a suitably rescaled family of Galton-Watson trees, conditioned to have total population size $n$, converges as $n \to \infty$ to the Brownian continuum random tree, which can therefore be considered as the “family tree” of Feller’s branching diffusion.

For branching models with interaction between the branching “individuals”, excursions coding the genealogy have been investigated only recently. Following [GPW06b], in this chapter we describe the genealogy of catalytic branching particle models and their diffusion limits by a consistent family of contour processes.

In Chapter 4 we discuss coalescent processes as another source of random tree models (see the recent review [Ald99] for references). The Kingman coalescent was introduced in [Kin82a] as a model for the genealogies of a neutral population model. Spatially structured Kingman coalescents with locally infinitely many particles were first introduced in [DEF+99, GLW05]. They appear as duals of interacting systems such as the voter model, the stepping stone model and interacting Fleming-Viot processes. The Kingman coalescent is a prominent representative of the class of Λ-coalescents, which were introduced in [Pit99] (see also [Sag99]) and allow for possibly multiple collisions reflecting an infinite variance of the population’s offspring distribution. Such processes have since been the subject of much applied and theoretical work (see, for example, [MS01], [BG05], [BBC+05], [BBS06]). The spatially structured Λ-coalescent was introduced in [LS06].
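For orientation, the Kingman coalescent started from $n$ lineages can be simulated in a few lines, assuming the usual normalization in which each pair of blocks merges at rate 1, so that with $k$ blocks the next merger occurs after an exponential time with rate $k(k-1)/2$. The function name and the output format (a list of merger times with the newly formed blocks) are our own choices.

```python
import random

def kingman_coalescent(n, rng=random):
    """Simulate Kingman's coalescent started from n singleton blocks.

    Returns a list of (time, merged_block) pairs, one per binary merger."""
    blocks = [frozenset([i]) for i in range(n)]
    time = 0.0
    events = []
    while len(blocks) > 1:
        k = len(blocks)
        # each of the k(k-1)/2 pairs merges at rate 1
        time += rng.expovariate(k * (k - 1) / 2)
        i, j = rng.sample(range(k), 2)
        merged = blocks[i] | blocks[j]
        blocks = [b for idx, b in enumerate(blocks) if idx not in (i, j)]
        blocks.append(merged)
        events.append((time, merged))
    return events

events = kingman_coalescent(6, random.Random(1))
```

Defining the distance between two individuals as twice the time at which their blocks merge turns the output into an ultra-metric tree, which is the kind of object treated as a metric (measure) space in the thesis.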


In [Eva00b, DEF+99] it was shown how the Kingman coalescent and a system of coalescing Brownian motions on the circle are naturally associated with a random compact metric space. In this chapter we will use the theory developed in Chapter 2 to decide for which Λ-coalescents the genealogies are described by a metric measure space. We will also show that the suitably rescaled spatially structured coalescent trees on $\mathbb{Z}^d$, $d \geq 2$, converge towards the Kingman measure tree. For that we rely on estimates obtained in [GLW05, LS06, GLW07].

Note that not only do trees get large, but the number of possible phylogenetic trees also grows rapidly with an increasing number of taxa. For example, the number of trees with $n$ labeled leaves is

(0.1) $(2n-3)!! = (2n-3) \cdot (2n-5) \cdots 3 \cdot 1$

(compare, for example, Chapter 3 of [Fel03]). In particular, if $n = 100$ then the number of possible trees is $\sim 3.5 \cdot 10^{184}$.
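The growth of the double factorial in (0.1) is easy to check numerically. The sketch below evaluates $(2n-3)!!$ with exact integer arithmetic; the function name is our own, and the interpretation of the count (rooted binary trees with $n$ labeled leaves) is the standard one and an assumption on our part.

```python
def num_leaf_labeled_trees(n):
    """Return (2n-3)!! = (2n-3)(2n-5)...3*1 for n >= 2."""
    count = 1
    for k in range(3, 2 * n - 2, 2):  # odd factors 3, 5, ..., 2n-3
        count *= k
    return count

digits_for_100_leaves = len(str(num_leaf_labeled_trees(100)))
```

The first values are 1, 3, 15, 105 for $n = 2, 3, 4, 5$, and for $n = 100$ the result has 185 decimal digits, i.e., it is of order $10^{184}$.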

If we try to use statistical methods to find the “best” tree to fit agiven set of data rather than doing the impossible, i.e., performing an ex-hausting search through the enormously huge space of all possible treesone is definitely more successful performing a random search. Markovchains that move through a space of finite trees are an important ingre-dient in Markov chain Monte Carlo algorithms for simulating distributionson spaces of trees in Bayesian tree reconstruction and in simulated anneal-ing algorithms in maximum likelihood and maximum parsimony2 tree re-construction (see, for example, [Fel03] for a comprehensive overview ofthe field). Usually, such chains are based on a set of simple rearrange-ments that transform a tree into a “neighboring” tree. One widely usedset of moves is the nearest neighbor interchanges (NNI) (see, for exam-ple, [Fel03, BRST02, BHV01b, AS01]). Two other standard setsof moves that are implemented in several phylogenetic software packagesbut seem to have received less theoretical attention are the subtree pruneand re-graft (SPR) moves and the tree bisection and re-connection (TBR)moves that were first described in [SO90] and are further discussed in[Fel03, AS01, SS03]. We note that an NNI move is a particular typeof SPR move and that an SPR move is particular type of TBR move, and,moreover, that every TBR operation is either a single SPR move or thecomposition of two such moves (see, for example, Section 2.6 of [SS03]).Chains based on other moves are investigated in [DH02, Ald00, Sch02].

Once more, because of the exponential growth of the state space with an increasing number of vertices, discrete tree-valued Markov chains are, even though easy to construct by standard theory, hard to analyze for their qualitative properties. It therefore seems reasonable to pass to a continuum limit and to construct certain limit dynamics and study them with methods from stochastic analysis.

² Maximum parsimony tree reconstruction is based on finding the phylogenetic tree and inferred ancestral states that minimize the total number of obligatory inferred substitution events on the edges of the tree.

Markov dynamics with values in the space of “infinite” or continuum trees have been constructed only recently. These include excursion path valued Markov processes with continuous sample paths, which can therefore be thought of as tree-valued diffusions, as investigated in [Zam03, Zam02, Zam01], and dynamics working with real trees, for example, the so-called root growth with re-grafting (RGRG) ([EPW06]), the wild chain ([AE99]), the so-called subtree prune and re-graft move (SPR) ([EW06]), the limit random mapping ([EL07]) and the tree-valued Fleming-Viot dynamics. While the RGRG dynamics have a projective property allowing for an explicit construction of the Feller semi-group as the limit semi-group of “finite” tree-valued dynamics arising in an algorithm for constructing uniform spanning trees, the SPR and the limit random mapping were constructed as candidates for the limit of “finite” tree-valued dynamics using Dirichlet forms. Unfortunately, Dirichlet forms are often inadequate for proving convergence theorems as opposed to generator or martingale problem characterizations of Markov processes, for example. The first example of tree-valued Markov dynamics constructed as the solution of a well-posed martingale problem was given with the tree-valued Fleming-Viot dynamics in [GPW07].

Part III: Tree-valued Markov dynamics

In the third part of the present thesis we use different techniques from Markov process theory to construct three of the above tree-valued limit dynamics.

Tree-valued Markov processes also appear in contexts other than phylogenetic analysis. For example, a number of these processes appear in combinatorics associated with spanning trees. One such process is the Root growth with regrafting dynamics, which we construct in Chapter 5 as the limit dynamics of the Aldous-Broder algorithm (see [Bro89, Ald90] and Figure 0.1 for an illustration). The Aldous-Broder algorithm is a Markov chain on the space of rooted combinatorial trees with N vertices that has the uniform tree as its stationary distribution. We construct and study a Markov process on the space of all rooted compact real trees that has the Brownian continuum random tree as its stationary distribution and arises as the scaling limit as N → ∞. The resulting process evolves via alternating deterministic root growth and random jumps due to re-grafting and is an example of a piecewise-deterministic Markov process. A general framework for such processes was introduced in [Dav84] as an abstraction of numerous examples in queueing and control theory, and this line of research was extensively developed in the subsequent monograph [Dav93].

Figure 0.1. Pruning off the subtree S of the tree T and regrafting it at the root ρ (figure courtesy of Steve Evans).

A more general formulation in terms of martingales and additive functionals can be found in [JS96]. Some other appearances of such processes are [EP98, CDP01, DC99, Cai93, Cos90].

The crucial feature of the root growth with re-grafting dynamics is that they have a simple projective structure: If one follows the evolution of the points in a rooted subtree of the initial tree along with that of the points added at later times due to root growth, then these points together form a rooted subtree at each period in time and this subtree evolves autonomously according to the root growth with re-grafting dynamics.

The presence of this projective structure suggests that one can make sense of the notion of running the root growth with re-grafting dynamics starting from an initial “tree” that has exotic behavior such as infinitely many leaves, points with infinite branching, and infinite total edge length, provided that this “tree” can be written as the increasing limit of a sequence of finite trees in some appropriate sense.

One of our main objectives is to give rigorous statements and proofs of these and related facts. Once the extended process has been constructed, we gain a new perspective on objects such as standard Brownian excursion and the associated random triangulation of the circle (see [Ald94a, Ald94b, Ald00]). For example, suppose we follow the height (that is, distance from the root) of some point in the initial tree. It is clear that this height evolves autonomously as a one-dimensional piecewise-deterministic Markov process that:

• increases linearly at unit speed (due to growth at the root),
• makes jumps at rate x when it is in state x (due to cut points falling on the path that connects the root to the point we are following),
• jumps from state x to a point that is uniformly distributed on [0, x] (due to re-grafting at the root).


We call such a process a Rayleigh process because, as we will show in Section 5.7, this process converges to the standard Rayleigh stationary distribution R on R_+ given by

(0.2) R(]x, ∞[) = e^{−x^2/2}, x ≥ 0,

(thus R is also the distribution of the Euclidean length of a two-dimensional standard Gaussian random vector or, up to a scaling constant, the distribution of the distance to the closest point to the origin in a standard planar Poisson process). Now, if B^ex := {B^ex_u ; u ∈ [0, 1]} is standard Brownian excursion and U is an independent uniform random variable on [0, 1], then there is a valid sense in which 2B^ex_U has the law of the height of a randomly sampled leaf of the Brownian CRT, and this accords with the well-known result

(0.3) P{2B^ex_U ∈ dx} = R(dx).
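The three rules for the height process and the stationary law (0.2) can be explored with a short simulation. This is a minimal sketch under stated assumptions, not the construction of Chapter 5: the function name, the seed and the sample size are illustrative. It uses the fact that, starting from height x, the cumulative jump hazard after time t is xt + t^2/2, so the height just before the next jump is sqrt(x^2 + 2E) with E a standard exponential variable.

```python
import math
import random


def simulate_rayleigh_process(x0: float, n_jumps: int, rng: random.Random):
    """Run the height process through n_jumps jumps.

    Returns (time average of the height, height after the last jump).
    """
    x = x0
    total_time = 0.0
    area = 0.0  # integral of the height over time
    for _ in range(n_jumps):
        e = rng.expovariate(1.0)
        # Unit-speed growth; the jump occurs when x*t + t^2/2 reaches e.
        h = math.sqrt(x * x + 2.0 * e)
        total_time += h - x
        area += 0.5 * (h * h - x * x)  # integral of (x + t) dt until the jump
        # Re-grafting at the root: new height is uniform on [0, h].
        x = rng.uniform(0.0, h)
    return area / total_time, x


if __name__ == "__main__":
    average, _ = simulate_rayleigh_process(0.0, 200_000, random.Random(1))
    # The mean of the law R in (0.2) is sqrt(pi / 2).
    print(average, math.sqrt(math.pi / 2.0))
```

If the process indeed converges to R, the long-run time average of the height should be close to the Rayleigh mean sqrt(π/2) ≈ 1.2533; the sketch only checks this one moment, not the full law.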

As we noted earlier, Markov chains that move through a space of finite trees are an important ingredient for several algorithms in phylogenetic analysis. In Chapter 6 we investigate the asymptotics of the Subtree Prune and Regraft (SPR) moves, one of the standard sets of moves that are implemented in several phylogenetic software packages, first described in [SO90] and further discussed in [Fel03, AS01, SS03]. In an SPR move, a binary tree T (that is, a tree in which all non-leaf vertices have degree three) is cut “in the middle of an edge” to give two subtrees, say T′ and T′′. Another edge is chosen in T′, a new vertex is created “in the middle” of that edge, and the cut edge in T′′ is attached to this new vertex. Lastly, the “pendant” cut edge in T′ is removed along with the vertex it was attached to in order to produce a new binary tree that has the same number of vertices as T (see Figure 0.2 for an illustration).
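The SPR move just described can be sketched on a finite binary tree stored as an adjacency dictionary. This is an illustrative sketch only: the representation and the names `component` and `spr_move` are assumptions, and the steps are reordered slightly (the old attachment vertex x is removed first and then reused as the new vertex), which gives the same resulting tree.

```python
def component(adj, start):
    """Return the set of vertices reachable from start."""
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def spr_move(adj, x, s, target):
    """Prune the subtree hanging off x through the edge (x, s) and
    re-graft it into the edge `target` of the remaining tree.

    adj maps each vertex to the set of its neighbours; the input is not modified.
    """
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    if s not in adj[x] or len(adj[x]) != 3:
        raise ValueError("x must be a degree-three vertex adjacent to s")
    a, b = adj[x] - {s}
    # Cut the edge (x, s), remove x, and merge (a, x), (x, b) into (a, b).
    adj[x].clear()
    adj[s].discard(x)
    adj[a].discard(x)
    adj[b].discard(x)
    adj[a].add(b)
    adj[b].add(a)
    u, v = target
    if u not in component(adj, a) or v not in adj[u]:
        raise ValueError("target must be an edge of the remaining tree")
    # Insert x "in the middle" of (u, v) and attach the pruned subtree to it.
    adj[u].discard(v)
    adj[v].discard(u)
    for w in (u, v, s):
        adj[x].add(w)
        adj[w].add(x)
    return adj


if __name__ == "__main__":
    tree = {
        "A": {"p"}, "B": {"p"}, "C": {"q"}, "D": {"r"}, "E": {"r"},
        "p": {"A", "B", "q"}, "q": {"p", "C", "r"}, "r": {"q", "D", "E"},
    }
    print(spr_move(tree, "p", "A", ("r", "D")))
```

In the usage example the leaf A is pruned from its attachment vertex p and re-grafted into the edge (r, D); the number of vertices and the degree sequence are unchanged, as the text requires.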

As remarked in [AS01],

The SPR operation is of particular interest as it can be used to model biological processes such as horizontal gene transfer³ and recombination.

Section 2.7 of [SS03] provides more background on this point as well as a comment on the role of SPR moves in the two phenomena of lineage sorting and gene duplication and loss.

The main emphasis is to construct a candidate for the limit dynamics as the number of vertices tends to infinity. We do not, in fact, prove that the suitably rescaled Markov chain with SPR moves converges to the process we construct. Rather, we use Dirichlet form techniques to establish the existence of a process that has the dynamics one would expect from such a limit. Unfortunately, although Dirichlet form techniques provide powerful tools for constructing and analyzing symmetric Markov processes, they are notoriously inadequate for proving convergence theorems.

³ Horizontal gene transfer is the transfer of genetic material from one species to another. It is a particularly common phenomenon among bacteria.

Figure 0.2. An SPR move. The dashed subtree attached to vertex x in the top tree is re-attached at a new vertex y that is inserted into the edge (b, c) in the bottom tree to make two edges (b, y) and (y, c). The two edges (a, x) and (b, x) in the top tree are merged into a single edge (a, b) in the bottom tree.

In Chapter 7 we construct and study the evolution of the genealogical structure for two related classes of neutral multi-type population models, which are called the Moran model and the Fleming-Viot process. In both models individuals have a genetic type, the population size is constant and the genetic decomposition changes due to random dynamics called resampling.

The Moran model (MM) can be described as follows: consider a population of finitely many individuals which carry genetic types. Each pair resamples at constant rate. Resampling of a pair means that one individual dies and the other one reproduces (see Figure 0.3 for an illustration).
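The resampling mechanism can be sketched as follows. This is a minimal illustration of the embedded jump chain only, assuming that all pairs resample at the same rate so that each event picks a uniformly chosen ordered pair; the waiting times between events are ignored and the function names are hypothetical.

```python
import random


def moran_step(types, rng):
    """One resampling event: individual j dies and is replaced by a copy of i."""
    i, j = rng.sample(range(len(types)), 2)  # uniformly chosen ordered pair
    types[j] = types[i]


def simulate_moran(types, n_events, rng):
    """Apply n_events resampling events to a copy of the type configuration."""
    types = list(types)
    for _ in range(n_events):
        moran_step(types, rng)
    return types


if __name__ == "__main__":
    print(simulate_moran("abcde", 50, random.Random(0)))
```

The population size stays constant and no new types are created; only the genetic composition of the population changes.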

For many purposes sufficient information about the population is contained in the probability measure on type space given by the empirical measure of the current population. The Fleming-Viot process (FV) is the measure-valued diffusion which arises in the limit of large populations (compare, for example, [FV78, FV79, Daw93, Eth01]). It is well-known that there exist moment dualities between the Moran model or the Fleming-Viot process on the one hand and the Kingman coalescent on the other (see, for example, [Kin82a, DGV95, GLW05]). This duality has a strong version showing that the Kingman coalescent describes the genealogy of a sample taken from the infinitely old population. This is helpful, for example, for the analysis of the long-time behavior.

Figure 0.3. The graphical representation of a Moran model of size N = 5. By resampling, the genealogical relationships between individuals change. Arrows between lines indicate resampling events. The individual at the tip dies and the other one reproduces. At any time, genealogical relationships of individuals •, which are currently alive, can be read from this graphical representation.

The main goal is to change the static point of view on fixed time genealogies to dynamically evolving genealogies. That is, we construct the tree-valued Moran dynamics and the tree-valued Fleming-Viot dynamics, which are strong Markov processes modeling the evolution of the genealogies as time varies. We do this by means of well-posed martingale problems.

Evolving genealogies (of skeletons) in exchangeable population models, including the models under consideration, have been described by look-down processes ([DK96, DK98, DK99]). Even though they are not formulated in terms of trees, look-down processes contain all information available in the model, and hence all information about the trees. At the same time they encode a lot of information which for many purposes is not needed. For example, the crucial point in the construction of “look-down” processes is the use of labels as coordinates, while we are interested in developing the stochastic analysis which allows for a coordinate-free description of tree-valued dynamics. A first approach in this direction has been taken in spatial settings via so-called historical processes in the context of branching models ([DP91]) and of interacting Fleming-Viot processes ([GLW05]).

The martingale problem formulation allows us to show that the suitably rescaled tree-valued Moran dynamics converge towards the tree-valued Fleming-Viot dynamics as the population size tends to infinity. Another important consequence of the construction of the tree-valued Moran and Fleming-Viot dynamics as solutions of well-posed martingale problems is that it allows us to characterize the functionals of these processes which are again strong Markov processes, and which can therefore also be characterized as solutions of well-posed martingale problems. As an application we show that the measure-valued Fleming-Viot diffusion is randomly embedded in the tree-valued Fleming-Viot dynamics. In addition we study the dynamics of the averaged total length of a sub-tree spanned by a finite set of “individuals” where the average is taken with respect to sampling the “individuals” according to the probability measure associated with the tree. Another related interesting functional, the diameter of the tree as a metric space representing the time to the most recent common ancestor, has been studied earlier in [PW06] using the look-down construction of [DK98].

Notes

Portions of this thesis have been previously published in joint papers.

In Chapter 1 we collect facts on spaces of rooted real trees and weighted real trees which were presented in [EPW06] and [EW06], respectively. Chapter 2 is essentially [GPW06a] updated by the results stated in Sections 2.6 and 2.7 which appear in [GPW07]. In Chapter 3 we discuss the connection between excursion paths and trees. The path decomposition result of the standard Brownian motion presented in Section 3.4 appeared in [EW06]. Sections 3.5 and 3.6 summarize parts of the results obtained in [GPW06b]. In Chapter 4 we illustrate how coalescent trees can be associated with an ultra-metric measure space. Section 4.1 is taken from [GPW06a], while the construction and convergence of the spatially structured coalescent in Sections 4.2, 4.3 and 4.4 are novel. Chapter 5 is essentially [EPW06]. Chapter 6 follows [EW06]. Chapter 7 is about to be submitted as [GPW07].


CHAPTER 1

State spaces I: R-trees

In this chapter we consider spaces of real trees that are compact metric spaces with “tree-like” properties and give a notion of convergence such that

a sequence of trees converges to a limit tree if and only if the sequence and the limit tree can be embedded isometrically in one and the same compact metric space on which the image of the sequence converges as a sequence of closed subsets of a complete metric space to the image of the corresponding limit.

The resulting topology is referred to as the Gromov-strong topology. The chapter is organized as follows. In Section 1.2 we recall two equivalent definitions of the Gromov-Hausdorff metric as a candidate for a complete metric which generates the Gromov-strong topology. In Section 1.3 we prove that the topology generated by the Gromov-Hausdorff metric coincides with the Gromov-strong topology. As a technical preparation in Section 1.4 we recall a criterion for a set of compact metric spaces to be pre-compact. In Section 1.5 we consider the subspaces of 0-hyperbolic spaces and real trees. In particular, we show that the space (of isometry classes) of compact real trees is separable and complete. In Section 1.6 we discuss the fact that trees can be reconstructed from their leaf to leaf distances. In Section 1.7 we explain how compact real trees are associated with a natural length measure.

In Chapter 5 we construct the Root Growth with Regrafting dynamics via a limiting procedure in which a general rooted compact real tree is approximated “from the inside” by an increasing sequence of finite subtrees. To prepare the construction we collect some results and properties on the set of isometry classes of rooted compact R-trees equipped with the rooted Gromov-Hausdorff distance in Section 1.8. We then establish some facts about rooted subtrees and their trimmings in Section 1.9. This also yields a characterization of the compact sets in the space (of isometry classes) of compact real trees which we state in Section 1.10.

In Chapter 6 we construct, with the Subtree Prune and Regraft process, a candidate for a limit dynamics of a Markov chain in which subtrees are pruned at randomly chosen edges and get regrafted at randomly picked vertices. While choosing an edge corresponds in the limit to choosing a point according to the (normalized) length measure associated with the tree, picking a vertex at random needs more thought since there is no canonical weight associated with a real tree. In Section 1.11 we therefore equip real trees with a probability measure and provide with the weighted Gromov-Hausdorff metric a distance which takes this extra structure into account.

Finally, Section 1.12 discusses convergence in distribution of random (rooted, weighted) real trees and characterizes tightness.

1.1. The Gromov-strong topology

The ordinary Hausdorff metric between two subsets A_1, A_2 of one metric space (X, d) is defined as

(1.1) d_H(A_1, A_2) := inf{ε > 0; A_1 ⊆ U_ε(A_2) and A_2 ⊆ U_ε(A_1)},

where

(1.2) U_ε(A) := {x ∈ X; d(x, A) ≤ ε}.
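For finite subsets the infimum in (1.1) is attained and reduces to a max-min formula. The following is a minimal sketch for finite sets only; the function name and the callable `d` are illustrative assumptions.

```python
def hausdorff_distance(A, B, d):
    """Hausdorff distance between finite non-empty sets A and B under the metric d."""

    def one_sided(P, Q):
        # Smallest eps such that P is contained in the closed eps-neighbourhood of Q.
        return max(min(d(p, q) for q in Q) for p in P)

    return max(one_sided(A, B), one_sided(B, A))


if __name__ == "__main__":
    print(hausdorff_distance([0, 1], [0, 3], lambda x, y: abs(x - y)))
```

In the usage example the point 3 lies at distance 2 from the set {0, 1}, while every point of {0, 1} lies within distance 1 of {0, 3}, so the two one-sided quantities differ and the larger one determines the distance.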

Given a metric space (X, r), denote by P(X, r) the space of closed subsets of X equipped with the Hausdorff metric. Recall from Proposition 7.3.7 in [BBI01] that if (X, r) is complete then P(X, r) is complete and from Proposition 7.3.8 in [BBI01] that if (X, r) is compact then P(X, r) is compact. In the following we refer to the topology generated by the Hausdorff metric as the Hausdorff topology.

Denote by Xc the space of isometry classes of compact metric spaces. If no confusion is possible, we denote by X = (X, r) the isometry class of a metric space.

Remark 1.1.1. Since the space (of isometry classes) of compact metric spaces can, of course, be metrized such that this space becomes compact, we have to be careful to deal with sets in the sense of the Zermelo-Fraenkel axioms. The way out is to define Xc as the space of isometry classes of those compact metric spaces whose elements are not metric spaces themselves.

We are then in a position to define the Gromov-strong topology on Xc.

Definition 1.1.2 (Gromov-strong topology). A sequence (X_n)_{n∈N} is said to converge Gromov-strongly to X in Xc, as n → ∞, if and only if there exist a compact metric space (Z, r_Z) and isometric embeddings φ, φ_1, φ_2, ... of (X, r), (X_1, r_{X_1}), (X_2, r_{X_2}), ..., respectively, into (Z, r_Z) such that (φ_n(X_n))_{n∈N} converges to φ(X) in (Z, r_Z) in the Hausdorff topology, as n → ∞.

1.2. A complete metric: The Gromov-Hausdorff metric

In this section we introduce the Gromov-Hausdorff metric d_GH on Xc and prove that the metric space (Xc, d_GH) is complete. In Section 1.3 we will see that the Gromov-Hausdorff metric generates the Gromov-strong topology.

The following definition can be found, for example, in [Gro99, BH99, BBI01].


Definition 1.2.1 (Gromov-Hausdorff distance). We define the Gromov-Hausdorff distance, d_GH(X_1, X_2), between two metric spaces (X_1, r_{X_1}) and (X_2, r_{X_2}) as the infimum of d_H^{(Z, r_Z)}(X'_1, X'_2) over all metric spaces X'_1 and X'_2 that are isometric to X_1 and X_2, respectively, and that are subspaces of some common metric space (Z, r_Z).

Remark 1.2.2 (Gromov-Hausdorff metric on Xc). The Gromov-Hausdorff distance defines a finite metric on the space of all isometry classes of compact metric spaces (see, for example, Theorem 7.3.30 in [BBI01]).

We point out that a direct application of Definition 1.2.1 requires an optimal embedding into a new metric space (Z, r_Z). While this definition is conceptually appealing, it often turns out not to be so useful for explicit computations in concrete examples. A re-formulation of the Gromov-Hausdorff distance is suggested by the following observation. Suppose that two metric spaces (X_1, r_{X_1}) and (X_2, r_{X_2}) are close in the Gromov-Hausdorff distance as witnessed by isometric embeddings φ_1 and φ_2 into some common space (Z, r_Z). The map that associates each point x_1 ∈ X_1 to a point x_2 ∈ X_2 such that r_Z(φ_1(x_1), φ_2(x_2)) is minimal should then be close to an isometry onto its image, and a similar remark holds with the roles of X_1 and X_2 reversed.

In order to quantify the observation of the previous paragraph, we require some more notation.

Definition 1.2.3 (Distortion). Let (X_1, r_{X_1}) and (X_2, r_{X_2}) be metric spaces. The distortion of a relation R ⊆ X_1 × X_2 is defined by

(1.3) dis(R) := sup{|r_{X_1}(x_1, y_1) − r_{X_2}(x_2, y_2)|; (x_1, x_2), (y_1, y_2) ∈ R}.

Example 1.2.4 (Distortion of a function). Let (X_1, r_{X_1}) and (X_2, r_{X_2}) be metric spaces. The distortion of a function f : X_1 → X_2 is defined as the distortion of the corresponding relation

(1.4) R_f := {(x, f(x)); x ∈ X_1} ⊆ X_1 × X_2.

That is,

(1.5) dis(f) := dis(R_f) = sup{|r_{X_1}(x, y) − r_{X_2}(f(x), f(y))| : x, y ∈ X_1}.

Definition 1.2.5 (Correspondence). A relation R ⊆ X_1 × X_2 is said to be a correspondence between sets X_1 and X_2 if for each x_1 ∈ X_1 there exists at least one x_2 ∈ X_2 such that (x_1, x_2) ∈ R, and for each y_2 ∈ X_2 there exists at least one y_1 ∈ X_1 such that (y_1, y_2) ∈ R.

The following re-formulation of the Gromov-Hausdorff metric is taken from Theorem 7.3.25 in [BBI01].


Proposition 1.2.6. For two elements (X_1, r_{X_1}) and (X_2, r_{X_2}) in Xc,

(1.6) d_GH((X_1, r_{X_1}), (X_2, r_{X_2})) = (1/2) inf_R dis(R),

where the infimum is taken over all correspondences R between X_1 and X_2.
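For very small finite metric spaces, formula (1.6) can be evaluated by brute force. The sketch below is illustrative only: it enumerates all relations, so its cost grows like 2^(|X_1| · |X_2|), and the function names and the representation of a metric as a distance matrix are assumptions.

```python
from itertools import product


def distortion(R, r1, r2):
    """dis(R) as in (1.3), for a relation R given as a list of index pairs."""
    return max(abs(r1[x1][y1] - r2[x2][y2]) for (x1, x2) in R for (y1, y2) in R)


def is_correspondence(R, n1, n2):
    """Check that every point of either space appears in some pair of R."""
    return {x for x, _ in R} == set(range(n1)) and {y for _, y in R} == set(range(n2))


def gromov_hausdorff(r1, r2):
    """Half the minimal distortion over all correspondences, as in (1.6)."""
    n1, n2 = len(r1), len(r2)
    pairs = list(product(range(n1), range(n2)))
    best = float("inf")
    for mask in range(1, 2 ** len(pairs)):
        R = [p for i, p in enumerate(pairs) if mask >> i & 1]
        if is_correspondence(R, n1, n2):
            best = min(best, distortion(R, r1, r2))
    return best / 2


if __name__ == "__main__":
    point = [[0]]
    two_points = [[0, 1], [1, 0]]
    print(gromov_hausdorff(point, two_points))
```

In the usage example the only correspondence between a one-point space and a two-point space of diameter 1 relates the single point to both points, which has distortion 1 and hence gives distance 1/2.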

1.3. Gromov-Hausdorff and the Gromov-strong topology coincide

The definition of the Gromov-Hausdorff metric uses an embedding into a common metric space. Convergence in the Gromov-Hausdorff topology can as well be formulated in terms of an embedding into a common metric space.

Theorem 1.3.1. Let (X, r_X), (X_1, r_{X_1}), (X_2, r_{X_2}), ... be in Xc. Then

d_GH(X_n, X) → 0, as n → ∞,

if and only if the sequence (X_n)_{n∈N} converges Gromov-strongly to X in Xc, as n → ∞.

We prepare the proof of Theorem 1.3.1 with the following lemma.

Lemma 1.3.2 (Extension of metrics via relations). Assume that (X_1, r_{X_1}) and (X_2, r_{X_2}) are metric spaces and R ⊆ X_1 × X_2 is a non-empty relation between X_1 and X_2. Let for x_1 ∈ X_1 and x_2 ∈ X_2,

(1.7) r^R_{X_1⊔X_2}(x_1, x_2) := inf{r_{X_1}(x_1, x'_1) + (1/2) dis(R) + r_{X_2}(x_2, x'_2) : (x'_1, x'_2) ∈ R}.

Then the following hold:

(i) r^R_{X_1⊔X_2} defines a metric on the disjoint union X_1 ⊔ X_2.
(ii) The metric r_{X_i} equals the metric r^R_{X_1⊔X_2} restricted to X_i, for i = 1, 2.
(iii) r^R_{X_1⊔X_2}(x_1, x_2) = (1/2) dis(R), for any pair (x_1, x_2) ∈ R.
(iv) With π_1 and π_2 denoting the projection operators on X_1 and X_2, respectively,

(1.8) d_H^{(X_1⊔X_2, r^R_{X_1⊔X_2})}(π_1 R, π_2 R) = (1/2) dis(R).

Proof. It is not hard to check that r^R_{X_1⊔X_2} defines a metric on X_1 ⊔ X_2 which extends the metrics on X_1 and X_2. In particular, r^R_{X_1⊔X_2}(x_1, x_2) = (1/2) dis(R), for any pair (x_1, x_2) ∈ R, and therefore (1.8) holds.

Proof of Theorem 1.3.1. The “if”-direction is clear. So we come immediately to the “only if” direction. If d_GH(X_n, X) → 0, as n → ∞, then by (1.6) we find correspondences R_n between X_n and X such that dis(R_n) → 0, as n → ∞. Using these, we define recursively metrics r_{Z_n} on Z_n := X ⊔ ⊔_{k=1}^n X_k. First, set Z_1 := X ⊔ X_1 and r_{Z_1} := r^{R_1}_{Z_1} as in (1.7). In the nth step, we are given a metric on Z_n. Consider the canonical isometric embedding φ from X to Z_n and define the relation R̃_n ⊆ Z_n × X_{n+1} by

(1.9) R̃_n := {(z, x) ∈ Z_n × X_{n+1} : (φ^{−1}(z), x) ∈ R_{n+1}},

and set r_{Z_{n+1}} := r^{R̃_n}_{Z_{n+1}}. By this procedure we end up with a metric r_Z on Z := X ⊔ ⊔_{n∈N} X_n and isometric embeddings φ, φ_1, φ_2, ... between X, X_1, X_2, ... and Z, respectively, such that

(1.10) d_H^{(Z, r_Z)}(φ_n(X_n), φ(X)) = (1/2) dis(R_n) → 0, as n → ∞.

W.l.o.g. we can assume that Z is complete. Otherwise we just embed everything into the completion of Z. To verify compactness of (Z, r_Z) it is therefore sufficient to show that Z is totally bounded (see, for example, Theorem 1.6.5 in [BBI01]). For that purpose fix ε > 0, and let n ∈ N. Since X is compact, we can choose a finite ε/2-net S in X. Then for all x ∈ Z with r_Z(x, X) < ε/2 there exists x′ ∈ S such that r_Z(x, x′) < ε. Moreover, d_H(φ_n(X_n), φ(X)) < ε, for all but finitely many n ∈ N. For the remaining φ_n(X_n) choose finite ε-nets and denote their union by S̃. In this way, S ∪ S̃ is a finite set, and {B_ε(s) : s ∈ S ∪ S̃} is a covering of Z.

1.4. Compact sets in Xc

Since the Gromov-Hausdorff topology is a relatively weak one, one may expect that it has relatively many compact sets. The following criterion for a set to be pre-compact in the Gromov-Hausdorff topology is taken from Theorem 7.4.15 in [BBI01].

Proposition 1.4.1 (A criterion for pre-compactness). A set Γ ⊆ Xc is pre-compact if it is uniformly totally bounded, i.e.,

• the set {diam(X) : X ∈ Γ} is bounded, and

• for all ε > 0 there is a number N such that every X ∈ Γ can be covered by at most N balls of radius ε.

1.5. 0-hyperbolic spaces and R-trees

Definition 1.5.1 (0-hyperbolic metric space). A metric space (X, r) is said to be 0-hyperbolic with respect to a point v ∈ X if and only if

(1.11) $(x \cdot y)_v \ge (x \cdot z)_v \wedge (y \cdot z)_v$,

for all x, y, z ∈ X, where for x, y ∈ X,

(1.12) $(x \cdot y)_v := \tfrac{1}{2}\big(r(x, v) + r(y, v) - r(x, y)\big)$.
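As a small illustration of Definition 1.5.1, the following Python sketch evaluates the product (1.12) and tests condition (1.11) by brute force on a finite metric space given as a distance matrix. The function names and the two example matrices are assumptions made for this illustration only; they do not come from the text.

```python
from itertools import product


def gromov_product(r, x, y, v):
    """(x . y)_v = (r(x, v) + r(y, v) - r(x, y)) / 2, as in (1.12)."""
    return 0.5 * (r[x][v] + r[y][v] - r[x][y])


def is_zero_hyperbolic(r, v, tol=1e-12):
    """Check (1.11) for every triple x, y, z of a finite metric space."""
    pts = range(len(r))
    return all(
        gromov_product(r, x, y, v)
        >= min(gromov_product(r, x, z, v), gromov_product(r, y, z, v)) - tol
        for x, y, z in product(pts, repeat=3)
    )


# Hypothetical data: the four leaves of a star tree with edge lengths 1, 2, 3, 4.
star = [[0, 3, 4, 5], [3, 0, 5, 6], [4, 5, 0, 7], [5, 6, 7, 0]]
# Hypothetical data: the corners of a unit square with the taxicab metric.
square = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]

print(is_zero_hyperbolic(star, 0), is_zero_hyperbolic(square, 0))
```

The star metric passes the test for every base point, while the square fails it, which matches the intuition that a cycle is not tree-like.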


Remark 1.5.2. If a metric space (X, r) is 0-hyperbolic with respect to a point v ∈ X, then it is 0-hyperbolic with respect to all points in X.

In the following we will therefore refer to a metric space which is 0-hyperbolic with respect to a point v ∈ X as simply being 0-hyperbolic.

Lemma 1.5.3 (Equivalence to “Four point condition”). A metric space (X, r) is 0-hyperbolic if and only if it satisfies the so-called four point condition, i.e.,

(1.13) $r(x_1, x_2) + r(x_3, x_4) \le \max\{r(x_1, x_3) + r(x_2, x_4),\ r(x_1, x_4) + r(x_2, x_3)\}$

for all $x_1, \ldots, x_4 \in X$.
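The four point condition (1.13) only involves finitely many distances at a time, so on a finite metric space it can be checked exhaustively. The sketch below does this in Python; the function name and the three example matrices are hypothetical and serve only to illustrate the statement.

```python
from itertools import product


def satisfies_four_point(r, tol=1e-12):
    """Check (1.13) for all quadruples of a finite metric space given as a matrix."""
    pts = range(len(r))
    return all(
        r[a][b] + r[c][d] <= max(r[a][c] + r[b][d], r[a][d] + r[b][c]) + tol
        for a, b, c, d in product(pts, repeat=4)
    )


# Hypothetical data: leaves of a star tree with edge lengths 1, 2, 3, 4.
star = [[0, 3, 4, 5], [3, 0, 5, 6], [4, 5, 0, 7], [5, 6, 7, 0]]
# Hypothetical data: a three-point ultra-metric space, compare (1.14).
ultra = [[0, 1, 2], [1, 0, 2], [2, 2, 0]]
# Hypothetical data: the corners of a unit square with the taxicab metric.
square = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]

print(satisfies_four_point(star), satisfies_four_point(ultra), satisfies_four_point(square))
```

For the square, the two diagonals have total length 4 while both other pairings have total length 2, so (1.13) fails.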

Example 1.5.4 (Ultra-metric space). Recall that a metric space (X, r) is said to be ultra-metric if

(1.14) $r(u, w) \le r(u, v) \vee r(v, w)$,

for all u, v, w ∈ X. It is easy to verify that each ultra-metric space (X, r) is 0-hyperbolic.

Definition 1.5.5 (R-tree). A complete 0-hyperbolic metric space (X, r) is said to be an R-tree if it is path-connected.

See Section 1.6 for more elaboration on trees spanned by four vertices.Compare also with Figure 1.1 that shows all possible shapes.

We refer the reader to [Dre84, DT96, DMT96, Ter97] for background on R-trees. In particular, [Chi01] shows that a number of other definitions are equivalent to the one above.

Remark 1.5.6. A particularly useful fact is that a complete metric space (X, r) is an R-tree if it satisfies the following axioms:

Axiom 1 (Unique geodesics) For all x, y ∈ X there exists a unique isometric embedding $\varphi_{x,y} : [0, r(x, y)] \to X$ such that $\varphi_{x,y}(0) = x$ and $\varphi_{x,y}(r(x, y)) = y$.

Axiom 2 (Loop-free) For every injective continuous map $\psi : [0, 1] \to X$ one has $\psi([0, 1]) = \varphi_{\psi(0),\psi(1)}([0, r(\psi(0), \psi(1))])$.

Axiom 1 says simply that there is a unique “unit speed” path between any two points, whereas Axiom 2 implies that the image of any injective path connecting two points coincides with the image of the unique unit speed path, so that it can be re-parameterized to become the unit speed path. Thus, Axiom 1 is satisfied by many other spaces such as $\mathbb{R}^d$ with the usual metric, whereas Axiom 2 expresses the property of “treeness” and is only satisfied by $\mathbb{R}^d$ when d = 1.


The following result states that 0-hyperbolic metric spaces can be isometrically embedded into a real tree (see Theorem 3.38 in [Eva06]).

Proposition 1.5.7 (The real tree spanned by a 0-hyperbolic space). Let (X, r) be a 0-hyperbolic metric space. There exist an R-tree (X′, r′) and an isometry φ : X → X′.

Example 1.5.8 (Ultra-metric spaces can be embedded into real trees). Proposition 1.5.7 says, in particular, that every complete ultra-metric space $(U, r_U)$ can be isometrically embedded into an R-tree $(X, r_X)$.

Let (T, dGH) be the metric space of isometry classes of compact R-trees equipped with dGH. We will be a little loose and sometimes refer to an R-tree as an element of T rather than as a class representative of an element.

The following result says that, at the very least, T equipped with the Gromov-Hausdorff distance is a “reasonable” space on which to do probability theory.

Theorem 1.5.9. The metric space (T, dGH) is complete and separable.

The following result is useful in the proof of Theorem 1.5.9.

Lemma 1.5.10. The set T of compact R-trees is a closed subset of the space of compact metric spaces equipped with the Gromov-Hausdorff distance.

Proof. It suffices to note that the limit of a sequence in T is path-connected (see, for example, Theorem 7.5.1 in [BBI01]) and satisfies the four point condition (1.13). Indeed, as remarked after Proposition 7.4.12 in [BBI01], there is a “meta-theorem” that if a feature of a compact metric space can be formulated as a continuous property of distances among finitely many points, then this feature is preserved under Gromov-Hausdorff limits.

Proof of Theorem 1.5.9. We start by showing separability. Given a compact R-tree, T, and ε > 0, let $S_\varepsilon$ be a finite ε-net in T. For a, b ∈ T, let

(1.15) $[a, b[\; := \varphi_{a,b}([0, r(a, b)[)$ and $]a, b[\; := \varphi_{a,b}(]0, r(a, b)[)$

be the unique half open and open, respectively, arc between them, and write $T_\varepsilon$ for the subtree of T spanned by $S_\varepsilon$, that is,

(1.16) $T_\varepsilon := \bigcup_{x, y \in S_\varepsilon} [x, y]$ and $r_{T_\varepsilon} := r\big|_{T_\varepsilon}$.

Obviously, $T_\varepsilon$ is still an ε-net for T, and hence $d_{\mathrm{GH}}(T_\varepsilon, T) \le d_H(T_\varepsilon, T) \le \varepsilon$.

Now each $T_\varepsilon$ is just a “finite tree with edge-lengths” and can clearly be approximated arbitrarily closely in the dGH-metric by trees with the same tree topology (that is, “shape”) and rational edge-lengths. The set of isometry types of finite trees with rational edge-lengths is countable, and so (T, dGH) is separable.

[Figure 1.1 shows the 4 different shapes (I)-(IV) of a labeled tree with 4 leaves x1, x2, x3, x4. In shape (I) the two branch points are labeled y1,2 and y3,4.]

It remains to establish completeness. It suffices by Lemma 1.5.10 to show that any Cauchy sequence in T converges to some compact metric space, or, equivalently, any Cauchy sequence in T has a subsequence that converges to some metric space.

Let $(T_n)_{n \in \mathbb{N}}$ be a Cauchy sequence in T. By Exercise 7.4.14 and Theorem 7.4.15 in [BBI01], a sufficient condition for this sequence to have a subsequential limit is that for every ε > 0 there exists a positive number N = N(ε) such that every $T_n$ contains an ε-net of cardinality N.

Fix ε > 0 and $n_0 = n_0(\varepsilon)$ such that $d_{\mathrm{GH}}(T_m, T_n) < \varepsilon/2$ for $m, n \ge n_0$. Let $S_{n_0}$ be a finite (ε/2)-net for $T_{n_0}$ of cardinality N. Then by (1.6) for each $n \ge n_0$ there exists a correspondence $R_n$ between $T_{n_0}$ and $T_n$ such that $\mathrm{dis}(R_n) < \varepsilon$. For each $x \in T_{n_0}$, choose $f_n(x) \in T_n$ such that $(x, f_n(x)) \in R_n$. Since for any $y \in T_n$ with $(x, y) \in R_n$, $r_{T_n}(y, f_n(x)) \le \mathrm{dis}(R_n)$, for all $n \ge n_0$, the set $f_n(S_{n_0})$ is an ε-net of cardinality N for $T_n$, $n \ge n_0$.

1.6. R-trees with 4 leaves

For the sake of reference and establishing some notation, we record here some well-known facts about reconstructing trees from a knowledge of the distances between the leaves. We remark that the fact that trees can be reconstructed from their collection of leaf-to-leaf distances (plus also the leaf-to-root distances for rooted trees) is of huge practical importance in so-called distance methods for inferring phylogenetic trees from DNA sequence data, and the added fact that one can build such trees by building subtrees for each collection of four leaves is the starting point for the sub-class of distance methods called quartet methods. We refer the reader to [Fel03, SS03] for an extensive description of these techniques and their underlying theory.

Lemma 1.6.1. The isometry class of an unrooted tree (T, r) with four leaves is uniquely determined by the distances between the leaves of T.

Proof. Let {x1, x2, x3, x4} be the set of leaves of T. The tree T has one of four possible shapes (see Figure 1.1).


Consider case (I), and let $y_{1,2}$ be the uniquely determined branch point on the tree that lies on the arcs $[x_1, x_2]$ and $[x_1, x_3]$, and $y_{3,4}$ be the uniquely determined branch point on the tree that lies on the arcs $[x_3, x_4]$ and $[x_1, x_3]$. Observe that

(1.17)
$r(x_1, y_{1,2}) = (x_2 \cdot x_3)_{x_1} = (x_2 \cdot x_4)_{x_1}$,
$r(x_2, y_{1,2}) = (x_1 \cdot x_3)_{x_2} = (x_1 \cdot x_4)_{x_2}$,
$r(x_3, y_{3,4}) = (x_4 \cdot x_1)_{x_3} = (x_4 \cdot x_2)_{x_3}$,
$r(x_4, y_{3,4}) = (x_3 \cdot x_1)_{x_4} = (x_3 \cdot x_2)_{x_4}$,
$r(y_{1,2}, y_{3,4}) = \tfrac{1}{2}\big(r(x_1, x_4) + r(x_2, x_3) - r(x_1, x_2) - r(x_3, x_4)\big)$.

Similar observations for the other cases show that if we know the shape of the tree, then we can determine its edge-lengths from leaf-to-leaf distances. Note also that

(1.18) $\chi_{(I)}(T) := \tfrac{1}{2}\big(r(x_1, x_3) + r(x_2, x_4) - r(x_1, x_2) - r(x_3, x_4)\big)$ is
> 0 for shape (I),
< 0 for shape (II),
= 0 for shapes (III) and (IV).

This and analogous inequalities for the quantities that reconstruct the length of the “internal” edge in shapes (II) and (III), respectively, show that the shape of the tree can also be reconstructed from leaf-to-leaf distances.
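The reconstruction in (1.17) and (1.18) is easy to try out numerically. The following Python sketch computes the quantity from (1.18) and, assuming the tree has shape (I), recovers the five edge-lengths from the leaf-to-leaf distances. The function names, the index convention (leaf $x_k$ is stored at index k-1) and the example matrix are assumptions of this illustration.

```python
def chi_I(r, x1=0, x2=1, x3=2, x4=3):
    """The quantity from (1.18); its sign distinguishes shape (I) from shape (II)."""
    return 0.5 * (r[x1][x3] + r[x2][x4] - r[x1][x2] - r[x3][x4])


def shape_I_edge_lengths(r, x1=0, x2=1, x3=2, x4=3):
    """Edge-lengths of a shape (I) tree from leaf distances, following (1.17)."""

    def gp(x, y, v):
        # Product (x . y)_v from (1.12).
        return 0.5 * (r[x][v] + r[y][v] - r[x][y])

    return {
        "x1": gp(x2, x3, x1),  # r(x1, y12)
        "x2": gp(x1, x3, x2),  # r(x2, y12)
        "x3": gp(x4, x1, x3),  # r(x3, y34)
        "x4": gp(x3, x1, x4),  # r(x4, y34)
        "internal": 0.5 * (r[x1][x4] + r[x2][x3] - r[x1][x2] - r[x3][x4]),
    }


# Hypothetical shape (I) tree: pendant edges 1, 2, 3, 4 and internal edge 5.
r = [[0, 3, 9, 10], [3, 0, 10, 11], [9, 10, 0, 7], [10, 11, 7, 0]]
print(chi_I(r), shape_I_edge_lengths(r))

# Swapping the labels x2 and x3 pairs x1 with x3, which makes chi_I negative.
p = [0, 2, 1, 3]
r_swapped = [[r[p[i]][p[j]] for j in range(4)] for i in range(4)]
print(chi_I(r_swapped))
```

In the example the recovered pendant lengths are 1, 2, 3, 4 and the internal edge has length 5, which is also the value of the quantity in (1.18).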

1.7. Length measure

Compact R-trees are associated with a natural length measure as follows. Fix (T, d) ∈ T, and, as usual, denote the Borel-σ-algebra on T by B(T). For a, b ∈ T, recall the open arc ]a, b[ from (1.15), and let

(1.19) $T^o := \bigcup_{a, b \in T}\, ]a, b[$

be the skeleton of T. Observe that if T′ ⊂ T is a dense countable set, then (1.19) holds with T replaced by T′. In particular, $T^o \in \mathcal{B}(T)$ and $\mathcal{B}(T)\big|_{T^o} = \sigma(\{]a, b[\,;\ a, b \in T'\})$, where $\mathcal{B}(T)\big|_{T^o} := \{A \cap T^o;\ A \in \mathcal{B}(T)\}$. Hence there exists a unique σ-finite measure $\mu^T$ on T, called length measure, such that $\mu^T(T \setminus T^o) = 0$ and

(1.20) $\mu^T(]a, b[) = d(a, b)$, for all a, b ∈ T.

Such a measure may be constructed as the trace onto $T^o$ of one-dimensional Hausdorff measure on T, and a standard monotone class argument shows that this is the unique measure with property (1.20).


Remark 1.7.1. The terminology skeleton might seem somewhat derisory, since for finite trees the difference between the skeleton and the whole tree is just a finite number of points. However, it is not difficult to produce R-trees for which the difference between the skeleton and the whole tree is a set with Hausdorff dimension greater than one. For example, the Brownian CRT which we discuss in Section 3.2 will almost surely be such a tree. This explains our requirement that $\mu^T$ is carried by the skeleton.

Remark 1.7.2. Elements of T are really equivalence classes of trees rather than trees themselves, so what we are describing here is a way of associating a measure to each element of the equivalence class. However, this procedure respects the equivalence relation in that if T′ and T′′ are two representatives of the same equivalence class and are related by a root-invariant isometry f : T′ → T′′, then the associated length measures µ′ and µ′′ are such that µ′′ is the push-forward of µ′ by f and µ′ is the push-forward of µ′′ by the inverse of f (that is, $\mu''(A'') = \mu'(f^{-1}(A''))$ and $\mu'(A') = \mu''(f(A'))$ for Borel sets A′ and A′′ of T′ and T′′, respectively).

We conclude this section by presenting explicit expressions for the lengths of a sub-tree spanned by finitely many points. Let (X, r) be a 0-hyperbolic metric space. Recall from Proposition 1.5.7 the notion of a sub-tree spanned by a finite subset of X. Let then for n ∈ N and x1, ..., xn ∈ X,

(1.21) $L_n^{(X,r)}(x_1, \ldots, x_n)$ := length of the subtree of (X, r) spanned by x1, ..., xn.

Lemma 1.7.3 (Total length of a sub-tree spanned by a finite subset). For a 0-hyperbolic metric space (X, r) and x1, ..., xn ∈ X,

(1.22) $L_n^{(X,r)}(x_1, \ldots, x_n) = r(x_1, x_2) + \sum_{k=3}^{n} \bigwedge_{1 \le i < j \le k-1} \tfrac{1}{2}\big(r(x_k, x_i) + r(x_k, x_j) - r(x_i, x_j)\big) = \tfrac{1}{2} \inf\Big\{\sum_{i=1}^{n} r(x_i, x_{\sigma(i)});\ \sigma \in \Sigma^1_n\Big\}$,

where $\Sigma^1_n$ denotes the set of permutations of {1, ..., n} with exactly one cycle.
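Both expressions in (1.22) can be evaluated directly on a finite distance matrix, which gives a quick sanity check of the lemma. The Python sketch below is a brute-force illustration under the assumption that the input is a 0-hyperbolic metric on n ≥ 2 points; the function names and the example matrix are hypothetical.

```python
from itertools import permutations


def length_by_insertion(r):
    """First expression in (1.22): attach x_3, ..., x_n one at a time."""
    n = len(r)
    total = r[0][1]
    for k in range(2, n):
        # Distance from x_k to the subtree spanned by the earlier points.
        total += min(
            0.5 * (r[k][i] + r[k][j] - r[i][j])
            for i in range(k)
            for j in range(i + 1, k)
        )
    return total


def length_by_tour(r):
    """Second expression in (1.22): half the shortest closed tour through all points."""
    n = len(r)
    best = min(
        sum(r[a][b] for a, b in zip((0,) + order, order + (0,)))
        for order in permutations(range(1, n))
    )
    return 0.5 * best


# Hypothetical four-leaf tree: pendant edges 1, 2, 3, 4 and an internal edge 5.
r = [[0, 3, 9, 10], [3, 0, 10, 11], [9, 10, 0, 7], [10, 11, 7, 0]]
print(length_by_insertion(r), length_by_tour(r))
```

Fixing the starting point of the tour and permuting the remaining points enumerates every single-cycle permutation (each one twice, once per direction), which is enough for taking the minimum. For the example both functions return the total edge length 1 + 2 + 3 + 4 + 5.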

Proof. To see the first identity in (1.22) we proceed by induction. For n = 2 the assertion is clear. For n ≥ 2, 1 ≤ i < j ≤ k ≤ n, the distance from the point $x_k$ to the arc $[x_i, x_j]$ is

(1.23) $\tfrac{1}{2}\big(r(x_k, x_i) + r(x_k, x_j) - r(x_i, x_j)\big)$.


Thus the distance from $x_k$, 3 ≤ k ≤ n, to the subtree spanned by $x_1, \ldots, x_{k-1}$ is

(1.24) $\bigwedge_{1 \le i < j \le k-1} \tfrac{1}{2}\big(r(x_k, x_i) + r(x_k, x_j) - r(x_i, x_j)\big)$,

and hence the first identity follows.

For the second identity consider the traveling salesperson problem for a salesperson who must visit all leaves of the tree and who starts at one leaf to which she comes back at the end of the trip. In a planar embedding of the tree it is optimal to visit the leaves in a neighboring order. Indeed, following this procedure, through the journey each branch of the tree is visited exactly twice, while for any other tour there are branches which are visited more than twice. Since for this optimal strategy the traveled distance is twice the length of the tree, the second identity follows.

1.8. Rooted R-trees

In situations where we are interested in rooted trees, we extend our definition as follows:

Definition 1.8.1 (Rooted R-trees). A rooted R-tree, (X, r, ρ), is an R-tree (X, r) with a distinguished point ρ ∈ X that we call the root.

Remark 1.8.2. It is helpful to use genealogical terminology and think of ρ as a common ancestor and h(x) := r(ρ, x) as the real-valued generation to which x ∈ X belongs (h(x) is also called the height of x). We define a partial order ≤ on X by declaring (using the notation introduced in (1.15)) that x ≤ y if x ∈ [ρ, y], so that x is an ancestor of y. Each pair x, y ∈ X has a well-defined greatest common lower bound, x ∧ y, in this partial order that we think of as the most recent common ancestor of x and y.

Definition 1.8.3 (Root-invariant isometry). A function ξ : X1 → X2 is called a root-invariant isometry between two rooted R-trees $(X_1, r_{X_1}, \rho_1)$ and $(X_2, r_{X_2}, \rho_2)$ if and only if the function ξ is an isometry from X1 to X2 with ξ(ρ1) = ρ2.

Let Troot denote the collection of all root-invariant isometry classes of rooted compact R-trees. We want to equip Troot with a Gromov-Hausdorff type distance that incorporates the special status of the root.

Definition 1.8.4 (Rooted Gromov-Hausdorff distance). The rooted Gromov-Hausdorff distance, dGHroot((X1, ρ1), (X2, ρ2)), between two rooted R-trees (X1, ρ1) and (X2, ρ2) is defined by

(1.25) $d_{\mathrm{GH}^{\mathrm{root}}}((X_1, \rho_1), (X_2, \rho_2)) := \inf\big\{ d_H^{(Z, r_Z)}(X'_1, X'_2) \vee r_Z(\rho'_1, \rho'_2) \big\}$,

where the infimum is taken over all rooted R-trees $(X'_1, \rho'_1)$ and $(X'_2, \rho'_2)$ that are root-invariant isometric to (X1, ρ1) and (X2, ρ2), respectively, and that are (as unrooted trees) subspaces of a common metric space $(Z, r_Z)$.

Remark 1.8.5 (Rooted Gromov-Hausdorff metric on Troot). It is straightforward to check that the rooted Gromov-Hausdorff distance defines a finite (pseudo-)metric on the space of compact rooted R-trees and induces a finite metric on Troot.

As in (1.6), we can compute $d_{\mathrm{GH}^{\mathrm{root}}}((X_1, r_{X_1}, \rho_1), (X_2, r_{X_2}, \rho_2))$ by comparing distances within X1 to distances within X2, provided that the distinguished status of the root is respected. The following result is the analogue of Proposition 1.2.6.

Proposition 1.8.6. For two rooted trees $(X_1, r_{X_1}, \rho_1)$ and $(X_2, r_{X_2}, \rho_2)$,

(1.26) $d_{\mathrm{GH}^{\mathrm{root}}}((X_1, r_{X_1}, \rho_1), (X_2, r_{X_2}, \rho_2)) = \tfrac{1}{2} \inf_{R^{\mathrm{root}}} \mathrm{dis}(R^{\mathrm{root}})$,

where now the infimum is taken over all correspondences $R^{\mathrm{root}}$ between X1 and X2 with $(\rho_1, \rho_2) \in R^{\mathrm{root}}$.
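To get a feeling for the right-hand side of (1.26), one can evaluate it by brute force on very small finite pointed metric spaces. The Python sketch below does exactly that. It is only an illustration: the proposition concerns rooted R-trees, whereas the code assumes finite spaces given as distance matrices, and the function names and examples are hypothetical. The enumeration is exponential in the number of point pairs, so it is usable only for a handful of points.

```python
from itertools import product


def distortion(R, r1, r2):
    """dis(R): largest |r1(x1, y1) - r2(x2, y2)| over pairs (x1, x2), (y1, y2) in R."""
    return max(abs(r1[x1][y1] - r2[x2][y2]) for (x1, x2) in R for (y1, y2) in R)


def is_correspondence(R, n1, n2):
    """Every point of either space must occur in at least one pair of R."""
    return {x for x, _ in R} == set(range(n1)) and {y for _, y in R} == set(range(n2))


def rooted_gh(r1, root1, r2, root2):
    """Half the minimal distortion over correspondences containing (root1, root2)."""
    n1, n2 = len(r1), len(r2)
    others = [p for p in product(range(n1), range(n2)) if p != (root1, root2)]
    best = float("inf")
    for mask in product([False, True], repeat=len(others)):
        R = [(root1, root2)] + [p for p, keep in zip(others, mask) if keep]
        if is_correspondence(R, n1, n2):
            best = min(best, distortion(R, r1, r2))
    return 0.5 * best


# Hypothetical examples: two-point spaces of diameter 1 and 3, rooted at point 0.
short, long_ = [[0, 1], [1, 0]], [[0, 3], [3, 0]]
# Hypothetical example: three equally spaced points on a line.
path = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]

print(rooted_gh(short, 0, long_, 0))
print(rooted_gh(path, 0, path, 0), rooted_gh(path, 0, path, 1))
```

The last line shows the effect of the root: the same space rooted at an end point and at the middle point is at positive rooted distance from itself, since no isometry of the path moves an end point to the middle.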

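Proof. Indeed, for any root-invariant isometric copies $(X'_1, \rho'_1)$ and $(X'_2, \rho'_2)$ embedded in Z, and $r > d_{\mathrm{GH}^{\mathrm{root}}}((X_1, \rho_1), (X_2, \rho_2))$,

(1.27) $R^{\mathrm{root}} := \{(x_1, x_2);\ x_1 \in X'_1,\ x_2 \in X'_2,\ r_Z(x_1, x_2) < r\}$

gives a correspondence between X1 and X2 containing (ρ1, ρ2) such that $\mathrm{dis}(R^{\mathrm{root}}) < 2r$.

On the other hand, given a correspondence $R^{\mathrm{root}}$ between X1 and X2 containing (ρ1, ρ2), define the metric $r^{R^{\mathrm{root}}}_{X_1 \sqcup X_2}$ on $X_1 \sqcup X_2$ as in (1.7). By Lemma 1.3.2, in particular, $r^{R^{\mathrm{root}}}_{X_1 \sqcup X_2}(\rho_1, \rho_2) = \tfrac{1}{2}\,\mathrm{dis}(R^{\mathrm{root}})$. Then we have

(1.28) $d_H^{(X_1 \sqcup X_2,\, r^{R^{\mathrm{root}}}_{X_1 \sqcup X_2})}(X_1, X_2) \vee r^{R^{\mathrm{root}}}_{X_1 \sqcup X_2}(\rho_1, \rho_2) \le \tfrac{1}{2}\,\mathrm{dis}(R^{\mathrm{root}})$.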

We state an analogue of Theorem 1.5.9 for rooted compact R-trees.

Theorem 1.8.7. The metric space (Troot, dGHroot) is complete and separable.

Before we can prove Theorem 1.8.7 we need two preparatory results. Recall from Example 1.2.4 the notion of the distortion of a function.

Definition 1.8.8 ((Root-invariant) ε-isometry). Let (X1, r1, ρ1) and (X2, r2, ρ2) be two rooted compact R-trees, and take ε > 0.

(i) A function f : X1 → X2 is called an ε-(distorted) isometry if dis(f) < ε and f(X1) is an ε-net for X2.


(ii) An ε-isometry f is called an ε-(distorted) root-invariant isometry from X1 to X2 if f(ρ1) = ρ2.

The first preparatory result is the counterpart of Corollary 7.3.28 in [BBI01] and presents convenient upper and lower estimates for dGHroot that differ by a multiplicative constant.

Lemma 1.8.9. Let (X1, ρ1) and (X2, ρ2) be two rooted compact R-trees, and take ε > 0. Then the following hold.

(i) If dGHroot((X1, ρ1), (X2, ρ2)) < ε, then there exists a root-invariant 2ε-isometry from (X1, ρ1) to (X2, ρ2).

(ii) If there exists a root-invariant ε-isometry from (X1, ρ1) to (X2, ρ2), then

$d_{\mathrm{GH}^{\mathrm{root}}}((X_1, \rho_1), (X_2, \rho_2)) \le \tfrac{3}{2}\varepsilon$.

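Proof. (i) Let dGHroot((X1, ρ1), (X2, ρ2)) < ε. By Proposition 1.8.6 there exists a correspondence $R^{\mathrm{root}}$ between X1 and X2 such that $(\rho_1, \rho_2) \in R^{\mathrm{root}}$ and $\mathrm{dis}(R^{\mathrm{root}}) < 2\varepsilon$. Define f : X1 → X2 by setting f(ρ1) = ρ2, and choosing f(x) such that $(x, f(x)) \in R^{\mathrm{root}}$ for all $x \in X_1 \setminus \{\rho_1\}$. Clearly, $\mathrm{dis}(f) \le \mathrm{dis}(R^{\mathrm{root}}) < 2\varepsilon$. To see that f(X1) is a 2ε-net for X2, let x2 ∈ X2, and choose x1 ∈ X1 such that $(x_1, x_2) \in R^{\mathrm{root}}$. Then $r_{X_2}(f(x_1), x_2) \le r_{X_1}(x_1, x_1) + \mathrm{dis}(R^{\mathrm{root}}) < 2\varepsilon$.

(ii) Let f be a root-invariant ε-isometry from (X1, ρ1) to (X2, ρ2). Define a correspondence $R^{\mathrm{root}}_f \subseteq X_1 \times X_2$ by

(1.29) $R^{\mathrm{root}}_f := \{(x_1, x_2) : r_{X_2}(x_2, f(x_1)) \le \varepsilon\}$.

Then $(\rho_1, \rho_2) \in R^{\mathrm{root}}_f$ and $R^{\mathrm{root}}_f$ is indeed a correspondence since f(X1) is an ε-net for X2. If $(x_1, x_2), (y_1, y_2) \in R^{\mathrm{root}}_f$, then

(1.30) $|r_{X_1}(x_1, y_1) - r_{X_2}(x_2, y_2)| \le |r_{X_2}(f(x_1), f(y_1)) - r_{X_1}(x_1, y_1)| + r_{X_2}(x_2, f(x_1)) + r_{X_2}(f(y_1), y_2) < 3\varepsilon$.

Hence $\mathrm{dis}(R^{\mathrm{root}}_f) < 3\varepsilon$ and, by (1.26), $d_{\mathrm{GH}^{\mathrm{root}}}((X_1, \rho_1), (X_2, \rho_2)) \le \tfrac{3}{2}\varepsilon$.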

The second preparatory result we need is the following compactness criterion, which is the analogue of Theorem 7.4.15 in [BBI01] (note also Exercise 7.4.14 in [BBI01]) and can be proved the same way, using Lemma 1.8.9 in place of Corollary 7.3.28 in [BBI01] and noting that the analogue of Lemma 1.5.10 holds for Troot.

Lemma 1.8.10 (A criterion for pre-compactness in Troot). A subset Γ ⊆ Troot is pre-compact if for every ε > 0 there exists a positive integer N(ε) such that each T ∈ Γ has an ε-net with at most N(ε) points.


Proof of Theorem 1.8.7. The proof follows very much the same lines as that of Theorem 1.5.9. The proof of separability is almost identical. The key step in establishing completeness is again to show that a Cauchy sequence in Troot has a subsequential limit. This can be shown in the same manner as in the proof of Theorem 1.5.9, with an appeal to Lemma 1.8.10 replacing one to Theorem 7.4.15 and Exercise 7.4.14 in [BBI01].

1.9. Rooted subtrees and trimming

There are situations, as for example the Root Growth with Regrafting dynamics which we construct in Chapter 5, which involve a limiting procedure in which a general rooted compact R-tree is approximated “from the inside” by an increasing sequence of finite subtrees. We therefore need to establish some facts about such approximations.

To begin with, we require a notation for one tree being a subtree of another, with both trees sharing the same root. We need to incorporate the fact that we are dealing with equivalence classes of trees rather than trees themselves. A rooted subtree of (T, r, ρ) ∈ Troot is an element (T∗, r∗, ρ∗) ∈ Troot that has a class representative that is a subspace of a class representative of (T, r, ρ), with the two roots coincident. Equivalently, any class representative of (T∗, r∗, ρ∗) can be isometrically embedded into any class representative of (T, r, ρ) via an isometry that maps roots to roots. We write T∗ ≼root T and note that ≼root is a partial order on Troot.

All of the “wildness” in a compact R-tree happens “at the leaves”. For example, if T ∈ Troot has a point x at which infinite branching occurs (so that the removal of x would disconnect T into infinitely many components), then any open neighborhood of x must contain infinitely many leaves, while for each η > 0 there are only finitely many leaves y such that x ∈ [ρ, y] with r(x, y) > η. A natural way in which to produce a finite subtree that approximates a given tree is thus to fix η > 0 and trim off the fringe of the tree by removing those points that are not at least distance η from at least one leaf. Formally, for η > 0 define $R_\eta$ : Troot → Troot to be the map that assigns to (T, ρ) ∈ Troot the rooted subtree $(R_\eta(T, \rho), \rho)$ that consists of ρ and points a ∈ T for which the subtree

(1.31) $S^{T,a} := \{x \in T : a \in [\rho, x[\}$

(that is, the subtree above a) has height greater than or equal to η. Equivalently,

(1.32) $R_\eta(T, \rho) := \{x \in T : \exists\, y \in T,\ x \in [\rho, y],\ r_T(x, y) \ge \eta\} \cup \{\rho\}$.

In particular, if T has height at most η, then $R_\eta(T, \rho)$ is just the trivial tree consisting of the root ρ.
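For a finite tree with edge-lengths the trimming (1.32) can be described edge by edge: a point on the edge below a vertex c, at distance t from c, is kept exactly when the height of the subtree above c plus t is at least η. The Python sketch below uses this to compute the total length of the trimmed tree. The encoding of the tree (a dictionary of children and a dictionary of edge-lengths keyed by the child vertex), the function names and the example are assumptions made for this illustration.

```python
def subtree_heights(children, lengths, root):
    """Height of the subtree above each vertex of a finite rooted tree."""
    heights = {}

    def visit(v):
        # Height above v: longest downward path starting at v.
        heights[v] = max(
            (lengths[c] + visit(c) for c in children.get(v, [])), default=0.0
        )
        return heights[v]

    visit(root)
    return heights


def trimmed_length(children, lengths, root, eta):
    """Total length of the eta-trimming of a finite rooted tree, following (1.32)."""
    heights = subtree_heights(children, lengths, root)
    total = 0.0
    for c, ell in lengths.items():
        # Part of the edge above c's parent that is too close to the fringe.
        cut = max(0.0, eta - heights[c])
        total += max(0.0, ell - cut)
    return total


# Hypothetical tree: root -> a (length 2), a -> b (length 1), a -> c (length 3).
children = {"rho": ["a"], "a": ["b", "c"]}
lengths = {"a": 2.0, "b": 1.0, "c": 3.0}

for eta in (1.0, 1.5, 5.0):
    print(eta, trimmed_length(children, lengths, "rho", eta))
```

In the example the tree has height 5, so trimming with η = 5 leaves only the root and total length 0, while smaller values of η remove the short branch first and then shorten the long one.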

Remark 1.9.1. Notice that the map described in (1.32) maps a metric space into a sub-space. However, since isometric spaces are mapped into isometric sub-spaces, we may think of $R_\eta$ as a map from Troot into Troot.

Lemma 1.9.2 (Properties of the trimming map).

(i) The range of $R_\eta$ consists of finite rooted trees.

(ii) The map $R_\eta$ is continuous.

(iii) The family of maps $(R_\eta)_{\eta > 0}$ is a semigroup, i.e., $R_{\eta'} \circ R_{\eta''} = R_{\eta' + \eta''}$, for all η′, η′′ > 0. In particular, $R_{\eta'}(T, \rho) \preceq^{\mathrm{root}} R_{\eta''}(T, \rho)$, for all η′ ≥ η′′ > 0.

(iv) For any (T, ρ) ∈ Troot and η ≥ 0, $d_{\mathrm{GH}^{\mathrm{root}}}((T, \rho), (R_\eta(T, \rho), \rho)) \le d_H(T, R_\eta(T, \rho)) \le \eta$.

Proof. (i) Fix (T, r, ρ) ∈ Troot. Let $E \subset R_\eta(T, \rho)$ be the set of leaves of $R_\eta(T, \rho)$, that is, the points that have no subtree above them. We have to show that E is finite. However, if a1, a2, ... are infinitely many points in E \ {ρ}, then we can find points b1, b2, ... in T such that $b_i$ is in the subtree above $a_i$ and $r(a_i, b_i) \ge \eta$. It follows that $\inf_{i \ne j} r(b_i, b_j) \ge 2\eta$, which contradicts the compactness of T.

(ii) Suppose that (T′, r′, ρ′) and (T′′, r′′, ρ′′) are two compact trees with

$d_{\mathrm{GH}^{\mathrm{root}}}((T', \rho'), (T'', \rho'')) < \varepsilon$.

By part (i) of Lemma 1.8.9, there exists a root-invariant 2ε-isometry f : T′ → T′′. Recall that this means f(ρ′) = ρ′′, dis(f) < 2ε, and f(T′) is a 2ε-net for T′′.

For $a \in R_\eta(T', \rho')$, let $\bar f(a)$ be the unique point in $R_\eta(T'', \rho'')$ that is closest to f(a). We will show that $\bar f : R_\eta(T', \rho') \to R_\eta(T'', \rho'')$ is a root-invariant 25ε-isometry and hence, by part (ii) of Lemma 1.8.9, $d_{\mathrm{GH}^{\mathrm{root}}}(R_\eta(T', \rho'), R_\eta(T'', \rho'')) \le \tfrac{3}{2} \cdot 25\varepsilon$.

We first show that

(1.33) $\sup\{r''(f(a), \bar f(a)) : a \in R_\eta(T', \rho')\} \le 8\varepsilon$.

Fix $a \in R_\eta(T', \rho')$ and let b ∈ T′ be a point in the subtree above a such that r′(a, b) ≥ η. Denote the most recent common ancestor of f(a) and f(b) on T′′ by f(a) ∧′′ f(b).

Then, since a ∈ [ρ′, b] implies r′(a, b) + r′(ρ′, a) − r′(ρ′, b) = 0,

(1.34)
$r''(f(a) \wedge'' f(b), f(a)) = \tfrac{1}{2}\big(r''(f(a), f(b)) + r''(\rho'', f(a)) - r''(\rho'', f(b))\big)$
$\le \tfrac{1}{2}\big(|r''(f(a), f(b)) - r'(a, b)| + |r''(\rho'', f(a)) - r'(\rho', a)| + |r''(\rho'', f(b)) - r'(\rho', b)|\big)$
$\le 3\varepsilon$.

[Figure 1.2 illustrates the shapes of the trees spanned by ρ′, a, b and by ρ′′, f(a), f(b). The point $\bar f(a)$ lies somewhere on the arc [ρ′′, f(a)].]

If $\bar f(a) \in [f(a) \wedge'' f(b), f(a)]$ then we are immediately done. Otherwise, $\bar f(a) \in [\rho'', f(a)]$ and $\bar f(a)$ is a leaf in $R_\eta(T'', \rho'')$. Hence $f(b) \notin R_\eta(T'', \rho'')$, and therefore

(1.35) $r''(\bar f(a), f(b)) \le \eta$.

Furthermore,

(1.36)
$r''(f(a) \wedge'' f(b), f(b)) = r''(f(a), f(b)) - r''(f(a) \wedge'' f(b), f(a))$
$\ge (r'(a, b) - 2\varepsilon) - 3\varepsilon$
$\ge \eta - 5\varepsilon$.

Combining (1.34), (1.35) and (1.36) finally yields that $r''(\bar f(a), f(a)) \le 8\varepsilon$ and completes the proof of (1.33).

It follows from (1.33) that

(1.37)
$\mathrm{dis}(\bar f) = \sup\{|r'(a, b) - r''(\bar f(a), \bar f(b))| : a, b \in R_\eta(T', \rho')\}$
$\le \sup\{|r'(a, b) - r''(f(a), f(b))| : a, b \in R_\eta(T', \rho')\} + 2 \sup\{r''(f(a), \bar f(a)) : a \in R_\eta(T', \rho')\}$
$< 2\varepsilon + 2 \times 8\varepsilon = 18\varepsilon$.

The proof of (ii) will thus be completed if we can show that $\bar f(R_\eta(T', \rho'))$ is a 25ε-net in $R_\eta(T'', \rho'')$. Consider a point $c \in R_\eta(T'', \rho'')$. We need to show that there is a point $b \in R_\eta(T', \rho')$ such that

(1.38) $r''(\bar f(b), c) < 25\varepsilon$.

If r′′(ρ′′, c) < 7ε, then we are done, because we can take b = ρ′ (recall that f(ρ′) = ρ′′). Assume, therefore, that r′′(ρ′′, c) ≥ 7ε. We can then find points $c_-, c_+ \in T''$ such that $\rho'' \le c_- \le c \le c_+$ with $r''(c_-, c) = 7\varepsilon$ and $r''(c, c_+) \ge \eta$. There are corresponding points $a_-, a, a_+ \in T'$ such that $r''(f(a_-), c_-) < 2\varepsilon$, $r''(f(a), c) < 2\varepsilon$, and $r''(f(a_+), c_+) < 2\varepsilon$. We claim that $b := a_- \wedge' a_+$ (the most recent common ancestor of $a_-$ and $a_+$ in the tree T′) belongs to $R_\eta(T', \rho')$ and satisfies (1.38).


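Note first of all that

$r'(b, a_+) = r'(a_- \wedge' a_+, a_+)$
$= \tfrac{1}{2}\big(r'(a_+, a_-) + r'(\rho', a_+) - r'(\rho', a_-)\big)$
$\ge \tfrac{1}{2}\big(r''(f(a_+), f(a_-)) - 2\varepsilon + r''(f(\rho'), f(a_+)) - 2\varepsilon - r''(f(\rho'), f(a_-)) - 2\varepsilon\big)$
$\ge \tfrac{1}{2}\big(r''(c_+, c_-) - 4\varepsilon + r''(\rho'', c_+) - 2\varepsilon - r''(\rho'', c_-) - 2\varepsilon\big) - 3\varepsilon$
$= r''(c_+, c_-) - 7\varepsilon$
$\ge \eta + 7\varepsilon - 7\varepsilon$
$= \eta$,

and so $b \in R_\eta(T', \rho')$.

Furthermore,

$r''(c, f(b)) \le r''(c, c_-) + r''(c_-, f(a_-)) + r''(f(a_-), f(b))$
$\le 7\varepsilon + 2\varepsilon + r'(a_-, b) + 2\varepsilon$
$= 11\varepsilon + \tfrac{1}{2}\big(r'(a_+, a_-) + r'(\rho', a_-) - r'(\rho', a_+)\big)$
$\le 11\varepsilon + \tfrac{1}{2}\big(r''(f(a_+), f(a_-)) + 2\varepsilon + r''(f(\rho'), f(a_-)) + 2\varepsilon - r''(f(\rho'), f(a_+)) + 2\varepsilon\big)$
$\le 14\varepsilon + \tfrac{1}{2}\big(r''(c_+, c_-) + 2\varepsilon + r''(\rho'', c_-) + 2\varepsilon - r''(\rho'', c_+) + 2\varepsilon\big)$
$= 17\varepsilon$.

Therefore, by (1.33),

$r''(c, \bar f(b)) \le 17\varepsilon + 8\varepsilon = 25\varepsilon$.

This completes the proof of (1.38), and thus the proof of part (ii).

Claims (iii) and (iv) are clear.

Lemma 1.9.3 (Length of trimmed tree is continuous). Let Λ : Troot → R ∪ {∞} be the map that sends a tree to its total length. For η > 0, the map $\Lambda \circ R_\eta$ is continuous.

Proof. For all η > 0 we have by Lemma 1.9.2 that $R_\eta = R_{\eta/2} \circ R_{\eta/2}$, the map $R_\eta$ is continuous, and the range of $R_\eta$ consists of finite trees. It therefore suffices to show for all η > 0 that if (T, d, ρ) is a fixed finite tree and (T′, d′, ρ′) is any other finite tree sufficiently close to T, then $\Lambda \circ R_\eta(T')$ is close to $\Lambda \circ R_\eta(T)$.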


Suppose, therefore, that (T, r, ρ) is a fixed finite tree with leaves x1, ..., xn and that (T′, r′, ρ′) is another finite tree with

(1.39) $d_{\mathrm{GH}^{\mathrm{root}}}((T, r, \rho), (T', r', \rho')) < \delta$,

where δ is small enough that the conclusions of Lemma 5.5.2 hold. Consider a rooted subtree (T′′, r′, ρ′) of (T′, r′, ρ′) and a map f : T → T′′ with the properties guaranteed by Lemma 5.5.2. Set $x'_k = f(x_k)$ for 1 ≤ k ≤ n.

Fix κ > 0. For 1 ≤ k ≤ n, write $\hat x_k \in T$ for the point on the arc $[\rho, x_k]$ that is at distance $\kappa \wedge r(\rho, x_k)$ from $x_k$. Set $x_0 := \hat x_0 := \rho$. Define $\hat x'_0, \ldots, \hat x'_n \in T''$ similarly. Note that $R_\kappa(T)$ is spanned by $\hat x_0, \ldots, \hat x_n$ and $R_\kappa(T'')$ is spanned by $\hat x'_0, \ldots, \hat x'_n$. By Lemma 1.7.3,

(1.40) $\Lambda \circ R_\kappa(T) = r(\hat x_0, \hat x_1) + \sum_{k=2}^{n} \bigwedge_{0 \le i < j \le k-1} \tfrac{1}{2}\big(r(\hat x_k, \hat x_i) + r(\hat x_k, \hat x_j) - r(\hat x_i, \hat x_j)\big)$,

and

(1.41) $\Lambda \circ R_\kappa(T'') = r'(\hat x'_0, \hat x'_1) + \sum_{k=2}^{n} \bigwedge_{0 \le i < j \le k-1} \tfrac{1}{2}\big(r'(\hat x'_k, \hat x'_i) + r'(\hat x'_k, \hat x'_j) - r'(\hat x'_i, \hat x'_j)\big)$.

Also observe that

(1.42) $r(\hat x_i, \hat x_j) = (r(x_0, x_i) - \kappa)_+ + (r(x_0, x_j) - \kappa)_+ - 2\Big[(r(x_0, x_i) - \kappa)_+ \wedge (r(x_0, x_j) - \kappa)_+ \wedge \tfrac{1}{2}\big(r(x_0, x_i) + r(x_0, x_j) - r(x_i, x_j)\big)\Big]$

and

$r'(\hat x'_i, \hat x'_j) = (r'(x'_0, x'_i) - \kappa)_+ + (r'(x'_0, x'_j) - \kappa)_+ - 2\Big[(r'(x'_0, x'_i) - \kappa)_+ \wedge (r'(x'_0, x'_j) - \kappa)_+ \wedge \tfrac{1}{2}\big(r'(x'_0, x'_i) + r'(x'_0, x'_j) - r'(x'_i, x'_j)\big)\Big]$.

Now the function $t \mapsto (t - \kappa)_+$ is Lipschitz with Lipschitz constant 1 for all κ > 0, and it follows that there is a family of Lipschitz functions $F_\kappa$, κ > 0, with Lipschitz constants uniformly bounded by some constant C such that

$\Lambda \circ R_\kappa(T) = F_\kappa\big((r(x_i, x_j))_{0 \le i, j \le n}\big)$

and

$\Lambda \circ R_\kappa(T'') = F_\kappa\big((r'(x'_i, x'_j))_{0 \le i, j \le n}\big)$.

By construction $|r(x_i, x_j) - r'(x'_i, x'_j)| < 8\delta$, and so

$|\Lambda \circ R_\kappa(T) - \Lambda \circ R_\kappa(T'')| \le 8\delta C$

for all κ > 0.

Because $d_H(T', T'') < 3\delta$, we have

$\Lambda \circ R_\eta(T'') \le \Lambda \circ R_\eta(T') \le \Lambda \circ R_{\eta - 3\delta}(T'')$.

Thus

$\Lambda \circ R_\eta(T) - 8\delta C \le \Lambda \circ R_\eta(T') \le \Lambda \circ R_{\eta - 3\delta}(T) + 8\delta C$.

Since $\lim_{\delta \downarrow 0} \Lambda \circ R_{\eta - 3\delta}(T) = \Lambda \circ R_\eta(T)$, this suffices to establish the result.

Finally, we require the following result, which will be the key to showing that the “projective limit” of a consistent family of tree-valued processes can actually be thought of as a tree-valued process in its own right.

Lemma 1.9.4. Consider a sequence $(T_n)_{n \in \mathbb{N}}$ of representatives of isometry classes of rooted compact trees in (Troot, dGHroot) with the following properties.

• Each set $T_n$ is a subset of some common set U.

• Each tree $T_n$ has the same root ρ ∈ U.

• The sequence $(T_n)_{n \in \mathbb{N}}$ is nondecreasing, that is, $T_1 \subseteq T_2 \subseteq \cdots \subseteq U$.

• Writing $r_n$ for the metric on $T_n$, for m < n the restriction of $r_n$ to $T_m$ coincides with $r_m$, so that there is a well-defined metric on $T := \bigcup_{n \in \mathbb{N}} T_n$ given by

(1.43) $r(a, b) = r_n(a, b)$, $a, b \in T_n$.

• The sequence of subsets $(T_n)_{n \in \mathbb{N}}$ is Cauchy in the Hausdorff distance with respect to r.

Then the following hold.

(i) The metric completion $\bar T$ of T is a compact R-tree, and $d_H(T_n, \bar T) \to 0$ as n → ∞, where the Hausdorff distance is computed with respect to the extension of r to $\bar T$. In particular,

(1.44) $\lim_{n \to \infty} d_{\mathrm{GH}^{\mathrm{root}}}((T_n, \rho), (\bar T, \rho)) = 0$.

(ii) The tree $\bar T$ has skeleton $\bar T^o = \bigcup_{n \in \mathbb{N}} T_n^o$.

(iii) The length measure on $\bar T$ is the unique measure concentrated on $\bigcup_{n \in \mathbb{N}} T_n^o$ that restricts to the length measure on $T_n$ for each n ∈ N.

Proof. (i) Because $\bar T$ is a complete metric space, the collection of closed subsets of $\bar T$ equipped with the Hausdorff distance is also complete (see, for example, Proposition 7.3.7 of [BBI01]). Therefore the Cauchy sequence $(T_n)_{n \in \mathbb{N}}$ has a limit that is (see Exercise 7.3.4 of [BBI01]) the closure of $\bigcup_{k \in \mathbb{N}} T_k$, i.e., $\bar T$ itself. It is clear that the complete space $\bar T$ is totally bounded, path-connected, and satisfies the four point condition, and so $\bar T$ is a compact R-tree. Finally,

(1.45) $d_{\mathrm{GH}^{\mathrm{root}}}((T_n, \rho), (\bar T, \rho)) \le d_H(T_n, \bar T) \vee r(\rho, \rho) = d_H(T_n, \bar T) \to 0$,

as n → ∞.

Claims (ii) and (iii) are obvious.


1.10. Compact sets in T

In this section we state a necessary and sufficient condition for a subset of T to be relatively compact.

For ε > 0, T ∈ T, and ρ ∈ T, recall from (1.32) the ε-trimming Rε(T, ρ) relative to the root ρ of the compact R-tree T. Then set

(1.46) $R_\varepsilon(T) := \begin{cases} \bigcap_{\rho\in T} R_\varepsilon(T,\rho), & \operatorname{diam}(T) > \varepsilon, \\ \text{singleton}, & \operatorname{diam}(T) \le \varepsilon, \end{cases}$

where by singleton we mean the trivial R-tree consisting of one point. The tree Rε(T) is called the ε-trimming of the compact R-tree T.

Proposition 1.10.1 (A characterization of pre-compactness in T). A subset Γ of (T, dGH) is relatively compact if and only if for all ε > 0,

(1.47) $\sup\{\mu^T(R_\varepsilon(T)) : T \in \Gamma\} < \infty.$

The proof relies on the following estimate.

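Lemma 1.10.2. Let T ∈ T be such that $\mu^T(T) < \infty$. For each ε > 0 there is an ε-net for T of cardinality at most

$\left[\left(\tfrac{\varepsilon}{2}\right)^{-1}\mu^T(T)\right]\left[\left(\tfrac{\varepsilon}{2}\right)^{-1}\mu^T(T) + 1\right].$

Proof. Note that an $\tfrac{\varepsilon}{2}$-net for $R_{\varepsilon/2}(T)$ will be an ε-net for T. The set $T \setminus R_{\varepsilon/2}(T)$ is a collection of disjoint subtrees, one for each leaf of $R_{\varepsilon/2}(T)$, and each such subtree is of diameter at least $\tfrac{\varepsilon}{2}$. Thus the number of leaves of $R_{\varepsilon/2}(T)$ is at most $(\tfrac{\varepsilon}{2})^{-1}\mu^T(T)$. Enumerate the leaves of $R_{\varepsilon/2}(T)$ as $x_0, x_1, \dots, x_n$. Each arc $[x_0, x_i]$, $1 \le i \le n$, of $R_{\varepsilon/2}(T)$ has an $\tfrac{\varepsilon}{2}$-net of cardinality at most $(\tfrac{\varepsilon}{2})^{-1} r_T(x_0, x_i) + 1 \le (\tfrac{\varepsilon}{2})^{-1}\mu^T(T) + 1$. Therefore, by taking the union of these nets, $R_{\varepsilon/2}(T)$ has an $\tfrac{\varepsilon}{2}$-net of cardinality at most $\left[(\tfrac{\varepsilon}{2})^{-1}\mu^T(T)\right]\left[(\tfrac{\varepsilon}{2})^{-1}\mu^T(T) + 1\right]$.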

Remark 1.10.3. The bound in Lemma 1.10.2 is far from optimal. It can be shown that T has an ε-net with a cardinality that is of order $\mu^T(T)/\varepsilon$. This is clear for finite trees (that is, trees with a finite number of branch points), where we can traverse the tree with a unit speed path and hence think of the tree as an image of the interval $[0, 2\mu^T(T)]$ by a Lipschitz map with Lipschitz constant 1, so that a covering of the interval $[0, 2\mu^T(T)]$ by ε-balls gives a covering of T by ε-balls. This argument can be extended to arbitrary finite length R-trees, but the details are tedious and so we have contented ourselves with the above simpler bound.
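The gap between the two bounds can be made concrete with a small numerical sketch. It is only an illustration under simplifying assumptions: the tree is replaced by the interval [0, 2µ] that parametrizes the unit speed traversal, µ stands for the total length, and the helper names `quadratic_net_bound` and `interval_net` are ours and do not appear in the text.

```python
import math


def quadratic_net_bound(total_length, eps):
    """Bound of Lemma 1.10.2: [(eps/2)^-1 mu] * [(eps/2)^-1 mu + 1]."""
    a = total_length / (eps / 2.0)
    return a * (a + 1.0)


def interval_net(total_length, eps):
    """Centres k * eps, k = 0, ..., floor(2 * mu / eps), inside [0, 2 * mu].

    Every point of the interval lies at distance less than eps from some
    centre, so the open eps-balls around the centres cover [0, 2 * mu].
    Pushing the centres through a 1-Lipschitz surjection onto the tree
    (assumed to exist, as in the remark) would give an eps-net of the tree.
    """
    count = math.floor(2.0 * total_length / eps) + 1
    return [k * eps for k in range(count)]


mu, eps = 1.0, 0.25
print(len(interval_net(mu, eps)), quadratic_net_bound(mu, eps))
```

The first number grows like µ/ε, the second like (µ/ε)², which is the difference the remark points to.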

Proof of Proposition 1.10.1. The “only if” direction follows from the fact that $T \mapsto \mu^T(R_\varepsilon(T))$ is continuous, by Lemma 1.9.2.

Conversely, suppose that (1.47) holds. Given T ∈ Γ, an ε-net for Rε(T) is a 2ε-net for T. By Lemma 1.10.2, Rε(T) has an ε-net of cardinality at most $\left[(\tfrac{\varepsilon}{2})^{-1}\mu^T(R_\varepsilon(T))\right]\left[(\tfrac{\varepsilon}{2})^{-1}\mu^T(R_\varepsilon(T)) + 1\right]$. By assumption, the last quantity is uniformly bounded in T ∈ Γ. Thus Γ is uniformly totally bounded and hence is relatively compact by Theorem 7.4.15 of [BBI01].

1.11. Weighted R-trees

As usual, given a topological space (X, O), we denote by M1(X) the space of all probability measures on X equipped with the Borel-σ-algebra B(X). The push forward of ν under a measurable map φ from X into another metric space (Z, rZ) is the probability measure φ∗ν ∈ M1(Z) defined by

(1.48) $\phi_*\nu(A) := \nu\big(\phi^{-1}(A)\big),$

for all A ∈ B(Z).

In the following we will be interested in compact R-trees (T, r) ∈ T equipped with a probability measure ν on the Borel σ-field B(T). We call such objects weighted compact R-trees.

Definition 1.11.1 (Weight preserving isometry). A function ξ : X1 → X2 is called a weight preserving isometry between two weighted R-trees (X1, r1, ν1) and (X2, r2, ν2) if and only if the function ξ is an isometry from X1 to X2 with ν2 = ξ∗ν1.

It is clear that the property of being weight-preserving isometric is an equivalence relation. Denote by Twt the space of weight-preserving isometry classes of weighted compact R-trees. We want to equip Twt with a Gromov-Hausdorff type of distance which incorporates the weights on the trees. For that purpose we first introduce some notions that will be used in the definition.

Recall from Definition 1.8.8 and Example 1.2.4 the notion of an ε-isometry f between two metric spaces (X, rX) and (Y, rY) and its distortion dis(f), respectively.

It is easy to see that if for two metric spaces (X, rX) and (Y, rY) and ε > 0 we have dGH((X, rX), (Y, rY)) < ε, then there exists a (possibly non-measurable) 2ε-isometry from X to Y (compare Lemma 7.3.28 in [BBI01] and Lemma 1.8.9). The following lemma states that we may choose the distorted isometry between X and Y to be measurable if we allow a slightly bigger distortion.

Lemma 1.11.2. Let (X, rX) and (Y, rY) be two compact real trees such that dGH((X, rX), (Y, rY)) < ε for some ε > 0. Then there exists a measurable 3ε-isometry from X to Y.

Proof. If dGH((X, rX), (Y, rY)) < ε, then there exists a correspondence R between X and Y such that dis(R) < 2ε, by Proposition 1.2.6. Since (X, rX) is compact there exists a finite ε-net in X. We claim that for each such finite ε-net, $S^{X,\varepsilon} = \{x_1, \dots, x_{N^\varepsilon}\} \subseteq X$, any set $S^{Y,\varepsilon} = \{y_1, \dots, y_{N^\varepsilon}\} \subseteq Y$ such that $(x_i, y_i) \in R$ for all $i \in \{1, 2, \dots, N^\varepsilon\}$ is a 3ε-net in Y. To see this, fix y ∈ Y. We have to show the existence of $i \in \{1, 2, \dots, N^\varepsilon\}$ with $r_Y(y_i, y) < 3\varepsilon$. For that choose x ∈ X such that (x, y) ∈ R. Since $S^{X,\varepsilon}$ is an ε-net in X there exists an $i \in \{1, 2, \dots, N^\varepsilon\}$ such that $r_X(x_i, x) < \varepsilon$. Since $(x_i, y_i) \in R$ and $(x, y) \in R$, it follows that $|r_X(x_i, x) - r_Y(y_i, y)| \le \mathrm{dis}(R) < 2\varepsilon$, and hence $r_Y(y_i, y) < 3\varepsilon$.

Furthermore we may decompose X into $N^\varepsilon$ possibly empty measurable disjoint subsets of X by letting $X^{1,\varepsilon} := B(x_1, \varepsilon)$, $X^{2,\varepsilon} := B(x_2, \varepsilon) \setminus X^{1,\varepsilon}$, and so on, where B(x, r) is the open ball $\{x' \in X : r_X(x, x') < r\}$. Then f defined by $f(x) = y_i$ for $x \in X^{i,\varepsilon}$ is obviously a measurable 3ε-isometry from X to Y.
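The decomposition used at the end of this proof can be sketched for finite point sets. The following is a toy illustration and not part of the original argument: X is a finite subset of the line, Y is a rescaled copy of X (so that the correspondence pairing x with 1.05·x has distortion 0.05 < 2ε for ε = 0.1), and the function names are hypothetical.

```python
def first_ball_partition(points, net, dist, eps):
    """Label each point with the smallest i such that dist(net[i], x) < eps.

    This mirrors X_1 = B(x_1, eps), X_2 = B(x_2, eps) minus X_1, and so on.
    """
    labels = []
    for x in points:
        for i, centre in enumerate(net):
            if dist(centre, x) < eps:
                labels.append(i)
                break
        else:
            raise ValueError("the given centres do not form an eps-net")
    return labels


def distortion(xs, images, dist_x, dist_y):
    """sup |dist_y(f(x), f(x')) - dist_x(x, x')| over all pairs of points."""
    return max(
        abs(dist_y(fa, fb) - dist_x(a, b))
        for a, fa in zip(xs, images)
        for b, fb in zip(xs, images)
    )


def line_dist(a, b):
    return abs(a - b)


eps = 0.1
X = [k / 64 for k in range(65)]        # finite stand-in for X = [0, 1]
net_x = [k / 8 for k in range(9)]      # an eps-net of X
net_y = [1.05 * c for c in net_x]      # points y_i with (x_i, y_i) in R
labels = first_ball_partition(X, net_x, line_dist, eps)
f_values = [net_y[i] for i in labels]  # f(x) = y_i for x in X_i
print(distortion(X, f_values, line_dist, line_dist) < 3 * eps)
```

Taking the first ball that contains a point, rather than the nearest centre, is what makes the pieces disjoint and measurable in the general setting.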

We also need to recall the definition of the Prohorov metric between two probability measures ν1 and ν2 on a common metric space (Z, rZ) defined by

(1.49) $d_{\mathrm{Pr}}^{(Z,r_Z)}(\nu_1, \nu_2) := \inf\big\{\varepsilon > 0 : \nu_1(F) \le \nu_2(F^\varepsilon) + \varepsilon,\ \forall F \text{ closed}\big\}$

with

(1.50) $F^\varepsilon := \big\{x \in Z : \inf_{y\in F} r_Z(x, y) < \varepsilon\big\}.$

The Prohorov distance is a metric on the collection of probability measures on Z (see, for example, [EK86]). The following result shows that if we push measures forward with a map having a small distortion, then Prohorov distances can’t increase too much.

Lemma 1.11.3. Suppose that (X, rX) and (Y, rY) are two metric spaces, f : X → Y is a measurable map with dis(f) ≤ ε, and µ and ν are two probability measures on X. Then

(1.51) dPr(f∗µ, f∗ν) ≤ dPr(µ, ν) + ε.

Proof. Suppose that dPr(µ, ν) < δ. By definition, $\mu(F) \le \nu(F^\delta) + \delta$, for all closed sets F ⊆ X. If D is a closed subset of Y, then

(1.52) $f_*\mu(D) = \mu(f^{-1}(D)) \le \mu\big(\overline{f^{-1}(D)}\big) \le \nu\big(\overline{f^{-1}(D)}^{\,\delta}\big) + \delta = \nu\big(f^{-1}(D)^{\delta}\big) + \delta.$

Now $x' \in f^{-1}(D)^\delta$ means there is $x'' \in X$ such that $r_X(x', x'') < \delta$ and $f(x'') \in D$. By the assumption that dis(f) ≤ ε, we have $r_Y(f(x'), f(x'')) < \delta + \varepsilon$, and hence $f(x') \in D^{\delta+\varepsilon}$. Thus

(1.53) $f^{-1}(D)^\delta \subseteq f^{-1}(D^{\delta+\varepsilon})$

and we have

(1.54) $f_*\mu(D) \le \nu\big(f^{-1}(D^{\delta+\varepsilon})\big) + \delta = f_*\nu(D^{\delta+\varepsilon}) + \delta,$

so that dPr(f∗µ, f∗ν) ≤ δ + ε, as required.
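For finitely supported measures on the real line, both the Prohorov distance (1.49) and the inequality (1.51) can be checked numerically. The sketch below is an illustration under our own assumptions: the underlying space X is taken to be the finite set of atoms, the infimum in (1.49) is approximated by bisection, and the names `prohorov`, `push_forward` and `snap` are hypothetical helpers.

```python
from itertools import combinations


def prohorov_condition(mu, nu, eps):
    """Check mu(F) <= nu(F^eps) + eps for every nonempty F inside supp(mu).

    For finitely supported mu this is enough, because replacing a closed
    set F by its intersection with supp(mu) keeps mu(F) and shrinks F^eps.
    """
    atoms = list(mu)
    for size in range(1, len(atoms) + 1):
        for subset in combinations(atoms, size):
            mass = sum(mu[x] for x in subset)
            nbhd = sum(w for y, w in nu.items()
                       if min(abs(y - x) for x in subset) < eps)
            if mass > nbhd + eps + 1e-12:  # small float tolerance
                return False
    return True


def prohorov(mu, nu, iterations=60):
    """Approximate the infimum in (1.49) by bisection on eps in [0, 1]."""
    lo, hi = 0.0, 1.0  # eps = 1 always works for probability measures
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if prohorov_condition(mu, nu, mid):
            hi = mid
        else:
            lo = mid
    return hi


def push_forward(f, mu):
    """Image measure f_* mu of a finitely supported measure."""
    image = {}
    for x, w in mu.items():
        image[f(x)] = image.get(f(x), 0.0) + w
    return image


def snap(x):
    """A map with small distortion: round to the grid of mesh 0.25."""
    return round(x / 0.25) * 0.25


mu = {0.0: 0.5, 1.0: 0.5}
nu = {0.1: 0.5, 1.2: 0.5}
points = list(mu) + list(nu)
dis_f = max(abs(abs(snap(a) - snap(b)) - abs(a - b))
            for a in points for b in points)
lhs = prohorov(push_forward(snap, mu), push_forward(snap, nu))
rhs = prohorov(mu, nu) + dis_f
print(lhs <= rhs)
```

The bisection relies on the fact that the condition in (1.49) is monotone in ε; the enumeration of subsets is exponential in the number of atoms and is only meant for very small examples.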

We are now in a position to define the weighted Gromov-Hausdorff distance between the two compact, weighted R-trees (X, rX, νX) and (Y, rY, νY). For ε > 0, set

(1.55) $F^\varepsilon_{X,Y} := \{\text{measurable } \varepsilon\text{-isometries from } X \text{ to } Y\}.$

Put

(1.56) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(X, Y) := \inf\big\{\varepsilon > 0 : \text{there exist } f \in F^\varepsilon_{X,Y},\ g \in F^\varepsilon_{Y,X} \text{ such that } d_{\mathrm{Pr}}(f_*\nu_X, \nu_Y) \le \varepsilon,\ d_{\mathrm{Pr}}(\nu_X, g_*\nu_Y) \le \varepsilon\big\}.$

Note that the set on the right hand side is non-empty because X and Y are compact, and hence bounded. It will turn out that $\Delta_{\mathrm{GH}^{\mathrm{wt}}}$ satisfies all the properties of a metric except the triangle inequality. To rectify this, let

(1.57) $d_{\mathrm{GH}^{\mathrm{wt}}}(X, Y) := \inf\Big\{\sum_{i=1}^{n-1} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_i, Z_{i+1})^{1/4}\Big\},$

where the infimum is taken over all finite sequences of compact, weighted R-trees Z1, . . . , Zn with Z1 = X and Zn = Y.

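Lemma 1.11.4. The map $d_{\mathrm{GH}^{\mathrm{wt}}} : \mathbb{T}^{\mathrm{wt}} \times \mathbb{T}^{\mathrm{wt}} \to \mathbb{R}_+$ is a metric on $\mathbb{T}^{\mathrm{wt}}$. Moreover,

$\tfrac{1}{2}\,\Delta_{\mathrm{GH}^{\mathrm{wt}}}(X, Y)^{1/4} \le d_{\mathrm{GH}^{\mathrm{wt}}}(X, Y) \le \Delta_{\mathrm{GH}^{\mathrm{wt}}}(X, Y)^{1/4}$

for all $X, Y \in \mathbb{T}^{\mathrm{wt}}$.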

Proof. It is immediate from (1.56) that the map $\Delta_{\mathrm{GH}^{\mathrm{wt}}}$ is symmetric. We next claim that

(1.58) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}\big((X, r_X, \nu_X), (Y, r_Y, \nu_Y)\big) = 0,$

if and only if (X, rX, νX) and (Y, rY, νY) are weight-preserving isometric. The “if” direction is immediate. Note first for the converse that (1.58) implies that for all ε > 0 there exists an ε-isometry from X to Y, and therefore, by Lemma 7.3.28 in [BBI01], $d_{\mathrm{GH}}\big((X, r_X), (Y, r_Y)\big) < 2\varepsilon$. Thus $d_{\mathrm{GH}}\big((X, r_X), (Y, r_Y)\big) = 0$, and it follows from Theorem 7.3.30 of [BBI01] that (X, rX) and (Y, rY) are isometric. Checking the proof of that result, we see that we can construct an isometry f : X → Y by taking any dense countable set S ⊂ X, any sequence of functions (fn) such that fn is an εn-isometry with εn → 0 as n → ∞, and letting f be $\lim_k f_{n_k}$ along any subsequence such that the limit exists for all x ∈ S (such a subsequence exists by the compactness of Y). Therefore, fix some dense subset S ⊂ X and suppose without loss of generality that we have an isometry f : X → Y given by $f(x) = \lim_{n\to\infty} f_n(x)$, x ∈ S, where $f_n \in F^{\varepsilon_n}_{X,Y}$, $d_{\mathrm{Pr}}(f_{n*}\nu_X, \nu_Y) \le \varepsilon_n$, and $\lim_{n\to\infty} \varepsilon_n = 0$. We will be done if we can show that $f_*\nu_X = \nu_Y$. If $\mu_X$ is a discrete measure with atoms belonging to S, then

(1.59) $d_{\mathrm{Pr}}(f_*\nu_X, \nu_Y) \le \limsup_n \big[d_{\mathrm{Pr}}(f_{n*}\nu_X, \nu_Y) + d_{\mathrm{Pr}}(f_{n*}\mu_X, f_{n*}\nu_X) + d_{\mathrm{Pr}}(f_*\mu_X, f_{n*}\mu_X) + d_{\mathrm{Pr}}(f_*\nu_X, f_*\mu_X)\big] \le 2\, d_{\mathrm{Pr}}(\mu_X, \nu_X),$

where we have used Lemma 1.11.3 and the fact that $d_{\mathrm{Pr}}(f_*\mu_X, f_{n*}\mu_X) \to 0$, as n → ∞, because of the pointwise convergence of fn to f on S. Because we can choose $\mu_X$ so that $d_{\mathrm{Pr}}(\mu_X, \nu_X)$ is arbitrarily small, we see that $f_*\nu_X = \nu_Y$, as required.

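Now consider three spaces (X, rX, νX), (Y, rY, νY), and (Z, rZ, νZ) in Twt, and constants ε, δ > 0, such that $\Delta_{\mathrm{GH}^{\mathrm{wt}}}\big((X, r_X, \nu_X), (Y, r_Y, \nu_Y)\big) < \varepsilon$ and $\Delta_{\mathrm{GH}^{\mathrm{wt}}}\big((Y, r_Y, \nu_Y), (Z, r_Z, \nu_Z)\big) < \delta$. Then there exist $f \in F^\varepsilon_{X,Y}$ and $g \in F^\delta_{Y,Z}$ such that $d_{\mathrm{Pr}}(f_*\nu_X, \nu_Y) < \varepsilon$ and $d_{\mathrm{Pr}}(g_*\nu_Y, \nu_Z) < \delta$. Note that $g \circ f \in F^{\varepsilon+\delta}_{X,Z}$. Moreover, by Lemma 1.11.3,

(1.60) $d_{\mathrm{Pr}}\big((g \circ f)_*\nu_X, \nu_Z\big) \le d_{\mathrm{Pr}}\big(g_*\nu_Y, \nu_Z\big) + d_{\mathrm{Pr}}\big(g_*f_*\nu_X, g_*\nu_Y\big) < \delta + \varepsilon + \delta.$

This, and a similar argument with the roles of X and Z interchanged, shows that

(1.61) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(X, Z) \le 2\big[\Delta_{\mathrm{GH}^{\mathrm{wt}}}(X, Y) + \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Y, Z)\big].$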

The second inequality in the statement of the lemma is clear. In order to see the first inequality, it suffices to show that for any Z1, . . . , Zn we have

(1.62) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_1, Z_n)^{1/4} \le 2 \sum_{i=1}^{n-1} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_i, Z_{i+1})^{1/4}.$

We will establish (1.62) by induction. The inequality certainly holds when n = 2. Suppose it holds for 2, . . . , n − 1. Write S for the value of the sum on the right hand side of (1.62). Put

(1.63) $k := \max\Big\{1 \le m \le n-1 : \sum_{i=1}^{m-1} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_i, Z_{i+1})^{1/4} \le S/2\Big\}.$

By the inductive hypothesis and the definition of k,

(1.64) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_1, Z_k)^{1/4} \le 2 \sum_{i=1}^{k-1} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_i, Z_{i+1})^{1/4} \le 2(S/2) = S.$

Of course,

(1.65) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_k, Z_{k+1})^{1/4} \le S.$

By definition of k,

(1.66) $\sum_{i=1}^{k} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_i, Z_{i+1})^{1/4} > S/2,$

so that once more by the inductive hypothesis,

(1.67) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_{k+1}, Z_n)^{1/4} \le 2 \sum_{i=k+1}^{n-1} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_i, Z_{i+1})^{1/4} = 2S - 2 \sum_{i=1}^{k} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_i, Z_{i+1})^{1/4} \le S.$

From (1.64), (1.65), (1.67) and two applications of (1.61) we have

(1.68) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_1, Z_n)^{1/4} \le \Big\{4\big[\Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_1, Z_k) + \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_k, Z_{k+1}) + \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_{k+1}, Z_n)\big]\Big\}^{1/4} \le \big(4 \times 3 \times S^4\big)^{1/4} \le 2S,$

as required.

It is obvious by construction that $d_{\mathrm{GH}^{\mathrm{wt}}}$ satisfies the triangle inequality. The other properties of a metric follow from the corresponding properties we have already established for $\Delta_{\mathrm{GH}^{\mathrm{wt}}}$ and the bounds in the statement of the lemma which we have already established.

The procedure we used to construct the weighted Gromov-Hausdorff metric dGHwt from the semi-metric ∆GHwt was adapted from a proof in [Kel75] of the celebrated result of Alexandroff and Urysohn on the metrizability of uniform spaces. That proof was, in turn, adapted from earlier work of Frink and Bourbaki. The choice of the power 1/4 is not particularly special; any sufficiently small power would have worked.
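The chain construction (1.57) can be tried out on a finite toy example. The sketch below is an assumption-laden illustration, not the construction on Twt itself: the "spaces" are three abstract points, the semi-metric is a hand-picked table satisfying the relaxed triangle inequality (1.61), and chains are restricted to these three points, so the infimum becomes a shortest-path computation.

```python
def chain_metric(delta, power=0.25):
    """Shortest-chain distance for the weights delta[i][j] ** power.

    Floyd-Warshall computes the infimum in (1.57) when the chains are
    restricted to the finitely many points indexed by delta (an assumption
    of this sketch; the text allows arbitrary intermediate trees).
    """
    n = len(delta)
    d = [[delta[i][j] ** power for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d


# Toy semi-metric on three points a, b, c. It satisfies
# delta(x, z) <= 2 * (delta(x, y) + delta(y, z)), but delta ** (1/4)
# violates the triangle inequality along the chain a -> b -> c.
delta = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1e-4],
    [2.0, 1e-4, 0.0],
]
d = chain_metric(delta)
print(d[0][2], delta[0][2] ** 0.25)
```

In this example the chain through b is shorter than the direct weight between a and c, so the resulting distance is strictly smaller than the fourth root of the semi-metric while staying above half of it, in line with the two-sided bound of Lemma 1.11.4.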

Theorem 1.11.7 below says that the metric space (Twt, dGHwt) is complete and separable and hence is a reasonable space on which to do probability theory. In order to prove this result, we need a compactness criterion that will be useful in its own right.

Proposition 1.11.5 (A characterization of pre-compactness in Twt). A subset Γwt of (Twt, dGHwt) is relatively compact if and only if the subset Γ := {(T, r) : (T, r, ν) ∈ Γwt} in (T, dGH) is relatively compact.

Together with Proposition 1.10.1 we immediately obtain the following.

Corollary 1.11.6. A subset Γwt of (Twt, dGHwt) is relatively compact if and only if

(1.69) $\sup_{(T,r,\nu)\in\Gamma^{\mathrm{wt}}} \mu^T\big(R_\varepsilon(T)\big) < \infty,$

for all ε > 0.

Proof of Proposition 1.11.5. The “only if” direction is clear. Assume for the converse that Γ is relatively compact. Suppose that $((T_n, r_{T_n}, \nu_{T_n}))_{n\in\mathbb{N}}$ is a sequence in Γwt. By assumption, $((T_n, r_{T_n}))_{n\in\mathbb{N}}$ has a subsequence converging to some point $(T, r_T)$ of (T, dGH). For ease of notation, we will renumber and also denote this subsequence by $((T_n, r_{T_n}))_{n\in\mathbb{N}}$. For brevity, we will also omit specific mention of the metric on a real tree when it is clear from the context.

By Proposition 7.4.12 in [BBI01], for each ε > 0 there is a finite ε-net $T^\varepsilon$ in T and for each n ∈ N a finite ε-net $T_n^\varepsilon := \{x_n^{\varepsilon,1}, \dots, x_n^{\varepsilon,\#T_n^\varepsilon}\}$ in $T_n$ such that $d_{\mathrm{GH}}(T_n^\varepsilon, T^\varepsilon) \to 0$ as n → ∞. Moreover, we take $\#T_n^\varepsilon = \#T^\varepsilon = N^\varepsilon$, say, for n sufficiently large, and so, by passing to a further subsequence if necessary, we may assume that $\#T_n^\varepsilon = \#T^\varepsilon = N^\varepsilon$ for all n ∈ N. We may then assume that $T_n^\varepsilon$ and $T^\varepsilon$ have been indexed so that $\lim_{n\to\infty} r_{T_n}(x_n^{\varepsilon,i}, x_n^{\varepsilon,j}) = r_T(x^{\varepsilon,i}, x^{\varepsilon,j})$ for $1 \le i, j \le N^\varepsilon$.

We may begin with the balls of radius ε around each point of $T_n^\varepsilon$ and decompose $T_n$ into $N^\varepsilon$ possibly empty, disjoint, measurable sets $\{T_n^{\varepsilon,1}, \dots, T_n^{\varepsilon,N^\varepsilon}\}$ of radius no greater than ε. Define a measurable map $f_n^\varepsilon : T_n \to T_n^\varepsilon$ by $f_n^\varepsilon(x) = x_n^{\varepsilon,i}$ if $x \in T_n^{\varepsilon,i}$ and let $g_n^\varepsilon$ be the inclusion map from $T_n^\varepsilon$ to $T_n$. By construction, $f_n^\varepsilon$ and $g_n^\varepsilon$ are measurable ε-isometries. Moreover, $d_{\mathrm{Pr}}\big((g_n^\varepsilon)_*(f_n^\varepsilon)_*\nu_n, \nu_n\big) < \varepsilon$ and, of course, $d_{\mathrm{Pr}}\big((f_n^\varepsilon)_*\nu_n, (f_n^\varepsilon)_*\nu_n\big) = 0$.

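Thus,

(1.70) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}\big((T_n^\varepsilon, (f_n^\varepsilon)_*\nu_n), (T_n, \nu_n)\big) \le \varepsilon.$

By similar reasoning, if we define $h_n^\varepsilon : T_n^\varepsilon \to T^\varepsilon$ by $x_n^{\varepsilon,i} \mapsto x^{\varepsilon,i}$, then

(1.71) $\lim_{n\to\infty} \Delta_{\mathrm{GH}^{\mathrm{wt}}}\big((T_n^\varepsilon, (f_n^\varepsilon)_*\nu_n), (T^\varepsilon, (h_n^\varepsilon)_*\nu_n)\big) = 0.$

Since $T^\varepsilon$ is finite, by passing to a subsequence (and relabeling as before) we have

(1.72) $\lim_{n\to\infty} d_{\mathrm{Pr}}\big((h_n^\varepsilon)_*\nu_n, \nu^\varepsilon\big) = 0$

for some probability measure $\nu^\varepsilon$ on $T^\varepsilon$, and hence

(1.73) $\lim_{n\to\infty} \Delta_{\mathrm{GH}^{\mathrm{wt}}}\big((T^\varepsilon, (h_n^\varepsilon)_*\nu_n), (T^\varepsilon, \nu^\varepsilon)\big) = 0.$

Therefore, by Lemma 1.11.4,

(1.74) $\limsup_{n\to\infty} d_{\mathrm{GH}^{\mathrm{wt}}}\big((T_n, \nu_n), (T^\varepsilon, (h_n^\varepsilon)_*\nu_n)\big) \le \varepsilon^{1/4}.$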

Now, since $(T, r_T)$ is compact, the family of measures $\{\nu^\varepsilon : \varepsilon > 0\}$ is relatively compact, and so there is a probability measure ν on T such that $\nu^\varepsilon$ converges to ν in the Prohorov distance along a subsequence ε ↓ 0 and hence, by arguments similar to the above, along the same subsequence $\Delta_{\mathrm{GH}^{\mathrm{wt}}}\big((T^\varepsilon, \nu^\varepsilon), (T, \nu)\big)$ converges to 0. Again applying Lemma 1.11.4, we have that $d_{\mathrm{GH}^{\mathrm{wt}}}\big((T^\varepsilon, \nu^\varepsilon), (T, \nu)\big)$ converges to 0 along this subsequence.

Combining the foregoing, we see that by passing to a suitable subsequence and relabeling, $d_{\mathrm{GH}^{\mathrm{wt}}}\big((T_n, \nu_n), (T, \nu)\big)$ converges to 0, as required.

Theorem 1.11.7. The metric space (Twt, dGHwt) is complete and separable.

Proof. Separability follows readily from separability of (T, dGH) (see Theorem 1.5.9), and the separability with respect to the Prohorov distance of the probability measures on a fixed complete, separable metric space (see, for example, [EK86]), and Lemma 1.11.4.

It remains to establish completeness. By a standard argument, it suffices to show that any Cauchy sequence in Twt has a convergent subsequence. Let $(T_n, r_{T_n}, \nu_n)_{n\in\mathbb{N}}$ be a Cauchy sequence in Twt. Then $(T_n, r_{T_n})_{n\in\mathbb{N}}$ is a Cauchy sequence in T by Lemma 1.11.4. By Theorem 1.5.9 there is a T ∈ T such that dGH(Tn, T) → 0, as n → ∞. In particular, the sequence $(T_n, r_{T_n})_{n\in\mathbb{N}}$ is relatively compact in T, and therefore, by Proposition 1.11.5, $(T_n, r_{T_n}, \nu_n)_{n\in\mathbb{N}}$ is relatively compact in Twt. Thus $(T_n, r_{T_n}, \nu_n)_{n\in\mathbb{N}}$ has a convergent subsequence, as required.

1.12. Distributions of random (weighted) real trees

In this section we consider characterizations of tightness which are particularly useful in approximating random trees.

The following simple result is based on the characterization of pre-compactness in (T, dGH) as formulated in Proposition 1.10.1.

Proposition 1.12.1 (A characterization of tightness in M1(T)). Fix an index set I ≠ ∅. A family $A := \{T_\alpha;\ \alpha \in I\}$ of T-valued random variables is relatively compact with respect to the Gromov-strong topology if and only if the family $B_\eta := \{\mu^{T_\alpha}(R_\eta(T_\alpha));\ \alpha \in I\}$ of R+-valued random variables is relatively compact, for all η > 0.

Proof. For the “only if” direction assume that A is tight and fix ε, η > 0. By definition, we find a compact set Γε in (T, dGH) such that $\inf_{\alpha\in I} \mathbf{P}\{T_\alpha \in \Gamma_\varepsilon\} > 1 - \varepsilon$. Since Γε is compact there exists, by Proposition 1.10.1, an N = N(ε, η) > 0 such that $\mu^T(R_\eta(T)) \le N$, for all T ∈ Γε. Hence, for all α ∈ I,

(1.75) $\mathbf{P}\{\mu^{T_\alpha}(R_\eta(T_\alpha)) > N\} = \mathbf{P}\big(\{\mu^{T_\alpha}(R_\eta(T_\alpha)) > N\} \setminus \{T_\alpha \in \Gamma_\varepsilon\}\big) \le \mathbf{P}\{T_\alpha \notin \Gamma_\varepsilon\} \le \varepsilon.$

Therefore, since ε, η > 0 were chosen arbitrarily, Bη is a tight family of R+-valued random variables, for all η > 0.

For the “if” direction assume that Bη is relatively compact and fix ε > 0. Then for all n ∈ N there exists Nn = Nn(ε) such that

(1.76) $\inf_{\alpha\in I} \mathbf{P}\{\mu^{T_\alpha}(R_{2^{-n}}(T_\alpha)) \le N_n\} > 1 - \varepsilon 2^{-n}.$

Let $\Gamma_{\varepsilon,n} := \{T \in \mathbb{T} : \mu^{T}(R_{2^{-n}}(T)) \le N_n\}$. By Lemma 1.9.2(ii) $\Gamma_{\varepsilon,n}$ is closed, for all n ∈ N. Put

(1.77) $\Gamma_\varepsilon := \bigcap_{n\in\mathbb{N}} \Gamma_{\varepsilon,n}.$

Then also Γε is closed and hence, by Proposition 1.10.1, compact. Since ε > 0 was chosen arbitrarily, it follows therefore together with (1.76) that the family A is relatively compact.

From Proposition 1.12.1 together with Proposition 1.11.5 we immediately obtain the following.

Corollary 1.12.2 (A characterization of tightness in M1(Twt)). Fix an index set I ≠ ∅. A family $A^{\mathrm{wt}} := \{(T_\alpha, \nu_\alpha);\ \alpha \in I\}$ of Twt-valued random variables is relatively compact with respect to the weighted Gromov-strong topology if and only if the family $B_\eta := \{\mu^{T_\alpha}(R_\eta(T_\alpha));\ \alpha \in I\}$ of R+-valued random variables is relatively compact, for all η > 0.

CHAPTER 2

State spaces II: The space of metric measure trees

In this chapter we study convergence of metric measure spaces with the main emphasis to exploit the second aspect of Aldous’s philosophy of convergence without using Aldous’s particular embedding. That is, we consider the space of separable and complete real trees which are equipped with a probability measure and give a notion of convergence such that

a sequence of trees (equipped with a probability measure) converges to a limit tree (equipped with a probability measure) if and only if all randomly sampled finite subtrees converge to the corresponding limit subtrees.

Since the construction of the topology works not only for tree-like metric spaces, but also for the space (of measure preserving isometry classes) of metric measure spaces we formulate everything within this framework. The resulting topology is referred to as the Gromov-weak topology.

This chapter is organized as follows. In Section 2.2 we introduce the Gromov-Prohorov metric as a candidate for a complete metric which generates the Gromov-weak topology and show that the generated topology is separable.

As a technical preparation we collect results on the modulus of mass distribution and the distance distribution (see Definition 2.3.1) in Section 2.3. In Section 2.4 we give characterizations of pre-compactness for the topology generated by the Gromov-Prohorov metric. In Section 2.5 we prove that the topology generated by the Gromov-Prohorov metric coincides with the Gromov-weak topology. In Section 2.6 we discuss the closed sub-space of ultra-metric measure spaces whose elements are (often in phylogenetic analysis) referred to as ultra-metric trees. In Section 2.7 we give a pre-compactness criterion for the sub-space of compact metric measure spaces. Section 2.8 discusses convergence in distribution and characterizes tightness. Finally, in Section 2.9 we provide several other metrics that generate the Gromov-weak topology.

Remark 2.0.3 (Gromov’s Chapter 3½). Even though the material presented in this chapter was developed independently of Gromov’s work, the most important ideas formulated in Sections 2.1 through 2.5 are already contained in Chapter 3½ in [Gro99]. A more detailed look shows that the Gromov-Prohorov metric coincides with Gromov’s $\Box_1$-metric, and hence parts of the pre-compactness characterizations given in Section 2.4 are, maybe not in an obvious way, implicitly stated in Proposition 3½.D in [Gro99]. However, the characterization of tightness and the equivalent metrics are novel.

2.1. The Gromov-weak topology

As before, given a topological space (X, O), we denote by M1(X) the space of all probability measures on X equipped with the Borel-σ-algebra B(X). Recall from (1.48) the push forward of a measure under a map and that the support of µ, supp(µ), is the smallest closed set X0 ⊆ X such that µ(X \ X0) = 0.

In the following we focus on complete and separable metric spaces.

Definition 2.1.1 (Metric measure space). A metric measure space is a complete and separable metric space (X, r) which is equipped with a probability measure µ ∈ M1(X). We write M for the space of measure-preserving isometry classes of complete and separable metric measure spaces, where we say that (X, r, µ) and (X′, r′, µ′) are measure-preserving isometric if there exists an isometry φ between the supports of µ on (X, r) and of µ′ on (X′, r′) such that µ′ = φ∗µ. It is clear that the property of being measure-preserving isometric is an equivalence relation.

We abbreviate (X, r, µ) for a whole isometry class of metric spaces whenever no confusion seems to be possible.

Remark 2.1.2. (i) Metric measure spaces, or mm-spaces for short, are discussed in [Gro99] in detail. Therefore they are sometimes also referred to as Gromov metric triples (see, for example, [Ver98]).

(ii) Recall from Remark 1.1.1 that we have to be careful to deal with sets in the sense of the Zermelo-Fraenkel axioms since M will turn out to be Polish by Theorem 2.1.10. Hence if P ∈ M1(M) then the measure preserving isometry class represented by M equipped with P yields an element in M. The way out is once more to define M as the space of measure preserving isometry classes of those metric spaces equipped with a probability measure whose elements are not themselves metric spaces.

We are typically only interested in functions of metric measure trees that do not describe artifacts of the chosen representation, i.e., which are invariant under measure-preserving isometries. These are of a special form which we introduce next.

For a metric space (X, r) we define by

(2.1) $R^{(X,r)} : X^{\mathbb{N}} \to \mathbb{R}_+^{\binom{\mathbb{N}}{2}}, \quad (x_i)_{i\ge 1} \mapsto \big(r(x_i, x_j)\big)_{1\le i<j}$

the map which sends a sequence of points in X to its (infinite) distance matrix, and let for a metric measure space (X, r, µ),

(2.2) $\nu^{(X,r,\mu)} := \big(R^{(X,r)}\big)_*\mu^{\otimes\mathbb{N}} \in M_1\big(\mathbb{R}_+^{\binom{\mathbb{N}}{2}}\big)$

denote the distance matrix distribution of (X, r, µ). Obviously, $\nu^{(X,r,\mu)}$ depends on (X, r, µ) only through its measure-preserving isometry class $\mathcal{X}$. We can therefore define the distance matrix distribution $\nu^{\mathcal{X}}$ of an isometry class $\mathcal{X} \in \mathbb{M}$ through an arbitrary representative.

Definition 2.1.3 (Distance matrix distribution). The distance matrix distribution $\nu^{\mathcal{X}}$ of $\mathcal{X} \in \mathbb{M}$ is defined as the distance matrix distribution $\nu^{(X,r,\mu)}$ of an arbitrary representative (X, r, µ) of $\mathcal{X}$.

By Gromov’s Reconstruction theorem metric measure spaces are uniquely determined by their distance matrix distribution (see Section 3½.5 in [Gro99]).

It is then easy to see that a function F on M is invariant under measure-preserving isometries if and only if there exists a function $\tilde F : M_1\big(\mathbb{R}_+^{\binom{\mathbb{N}}{2}}\big) \to \mathbb{R}$ such that

(2.3) $F(\mathcal{X}) := \tilde F(\nu^{\mathcal{X}}).$

To be in a position to formalize that for a sequence of metric measure spaces all finite subspaces sampled by the measures sitting on the corresponding metric spaces converge we next introduce the algebra of polynomials on M.

For each S ⊆ {1, 2, ...}, denote by

(2.4) $\rho_S : \mathbb{R}_+^{\binom{\mathbb{N}}{2}} \to \mathbb{R}_+^{\binom{S}{2}}, \quad (r_{ij})_{1\le i<j} \mapsto (r_{ij})_{1\le i<j,\ i,j\in S}$

the restriction map which sends an infinite distance matrix to the restriction of the coordinates indexed by S, and abbreviate for each n ∈ N,

(2.5) $\rho_{\le n} := \rho_{\{1,2,\dots,n\}}$, and $\rho_{\ge n} := \rho_{\{n,n+1,\dots\}}$.

Definition 2.1.4 (Polynomials). A function $\Phi = \Phi^{n,\phi} : \mathbb{M} \to \mathbb{R}$ is called a monomial (of degree n with respect to the test function ϕ) on M if and only if there exist n ∈ N and a bounded continuous function $\phi : [0,\infty)^{\binom{n}{2}} \to \mathbb{R}$ such that

(2.6) $\Phi(\mathcal{X}) = \int_{(\mathbb{R}_+)^{\binom{n}{2}}} (\rho_{\le n})_*\nu^{\mathcal{X}}\big(\mathrm{d}((r_{i,j})_{1\le i<j\le n})\big)\, \phi\big((r_{i,j})_{1\le i<j\le n}\big).$

Denote by Π the algebra generated by the monomials, in the following referred to as the set of all polynomials on M.

Remark 2.1.5 (Action of monomials on representatives). Let n ∈ N, $\phi : (\mathbb{R}_+)^{\binom{n}{2}} \to \mathbb{R}$, $\mathcal{X} \in \mathbb{M}$ and (X, r, µ) a metric measure space with isometry class $\mathcal{X}$. Then

(2.7) $\Phi(\mathcal{X}) = \int_{X^n} \mu^{\otimes n}\big(\mathrm{d}(x_1, \dots, x_n)\big)\, \phi\big((r(x_i, x_j))_{1\le i<j\le n}\big),$

where $\mu^{\otimes n}$ is the n-fold product measure of µ.

Example 2.1.6. In future work, we are particularly interested in tree-like metric spaces, i.e., ultra-metric spaces and R-trees. In this setting, functions of the form (2.7) can be, for example, the mean total length or the averaged diameter of the sub-tree spanned by n points sampled independently according to µ from the underlying tree.

The next example illustrates that one can, of course, not separate metric measure spaces by polynomials of degree 2 only.

Example 2.1.7. Consider the following two metric measure spaces.

[Figure: X consists of two points carrying masses 1/2 and 1/2; Y consists of three points carrying masses (2−√3)/6, (2+√3)/6 and 1/3.]

Assume that in both spaces the mutual distances between different points are 1. In both cases, the empirical distribution of the distances between two points equals ½δ0 + ½δ1, and hence all polynomials of degree n = 2 agree. But obviously, X and Y are not measure preserving isometric.
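For finite metric measure spaces the integral in (2.7) is a finite sum, so the claim of Example 2.1.7 can be checked directly. The sketch below is an illustration under our own assumptions: the masses of X and Y are taken as we read them from the figure (1/2, 1/2 and (2−√3)/6, (2+√3)/6, 1/3), and the function names and the particular test function are hypothetical choices.

```python
import itertools
import math


def monomial(weights, dist, n, phi):
    """Evaluate (2.7) exactly for a finite metric measure space.

    weights[i] is the mass of point i, dist(i, j) the distance between
    points i and j, and phi takes the list (r(x_i, x_j)) for i < j.
    """
    total = 0.0
    for sample in itertools.product(range(len(weights)), repeat=n):
        mass = math.prod(weights[i] for i in sample)
        distances = [dist(sample[i], sample[j])
                     for i in range(n) for j in range(i + 1, n)]
        total += mass * phi(distances)
    return total


def discrete(i, j):
    """All mutual distances between different points are 1."""
    return 0.0 if i == j else 1.0


def all_close(distances):
    """Bounded continuous test function; on distances in {0, 1} it equals 1
    exactly when all sampled points coincide."""
    return math.prod(max(0.0, 1.0 - r) for r in distances)


s3 = math.sqrt(3.0)
X = [0.5, 0.5]
Y = [(2 - s3) / 6, (2 + s3) / 6, 1 / 3]
for n in (2, 3):
    print(n, monomial(X, discrete, n, all_close),
          monomial(Y, discrete, n, all_close))
```

With this test function the degree-2 monomial is the sum of squared masses, which is 1/2 for both spaces, while the degree-3 monomial is the sum of cubed masses, 1/4 for X and 5/18 for Y. So a monomial of degree 3 already separates the two spaces.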

The first key observation is that the algebra of polynomials is a rich enough subclass to determine a metric measure space.

Proposition 2.1.8 (Polynomials separate points). The algebra Π of polynomials separates points in M.

Proof. Let $\mathcal{X}_\ell = (X_\ell, r_\ell, \mu_\ell) \in \mathbb{M}$, ℓ = 1, 2, and assume that $\Phi(\mathcal{X}_1) = \Phi(\mathcal{X}_2)$, for all Φ ∈ Π.

Define the space of infinite (pseudo-)distance matrices by

(2.8) $\mathbb{R}^{\mathrm{met}} := \big\{r \in \mathbb{R}_+^{\binom{\mathbb{N}}{2}} : r_{ij} + r_{jk} \ge r_{ik},\ r_{ii} = 0,\ \forall\, 1 \le i < j < k < \infty\big\}.$

Recall from Definition 2.1.3 the distance matrix distribution $\nu^{\mathcal{X}}$ of an element $\mathcal{X}$ in M. As the algebra $\{\phi \in C_b(\mathbb{R}^{\binom{n}{2}});\ n \in \mathbb{N}\}$ is separating in $M_1(\mathbb{R}^{\mathrm{met}})$, we find that

(2.9) $\nu^{\mathcal{X}_1} = \nu^{\mathcal{X}_2}.$

By Gromov’s Reconstruction theorem, mm-spaces are uniquely determined by their distance matrix distribution (see Paragraph 3½.5 in [Gro99]). Hence, (2.9) implies $\mathcal{X}_1 = \mathcal{X}_2$.

We are now in a position to define the Gromov-weak topology.

Definition 2.1.9 (Gromov-weak topology). A sequence $(\mathcal{X}_n)_{n\in\mathbb{N}}$ is said to converge Gromov-weakly to $\mathcal{X}$ in M if and only if $\Phi(\mathcal{X}_n)$ converges to $\Phi(\mathcal{X})$ in R, for all polynomials Φ ∈ Π. We call the corresponding topology $\mathcal{O}_{\mathbb{M}}$ on M the Gromov-weak topology.

The following result ensures that the state space is suitable to do probability theory on it. We will develop its proof throughout this chapter.

Theorem 2.1.10. The space (M,OM) is Polish.

Remark 2.1.11. We remark that topologies on metric measure spaces are considered in detail in Section 3½ of [Gro99]. Several of the results presented in this chapter (in particular, Theorems 2.1.10 and 2.5.1) are already stated in [Gro99] in a different set-up. While Gromov focuses on geometric aspects, we follow here [GPW06a] where the tools necessary to do probability theory on the space of metric measure spaces are provided.

Further related topologies on particular subspaces of isometry classes of complete and separable metric spaces have already been considered in [Stu06] and [EW06]. Convergence in these two topologies implies convergence in the Gromov-weak topology but not vice versa.

2.2. A complete metric: The Gromov-Prohorov metric

In this section we introduce the Gromov-Prohorov metric dGPr on M and prove that the metric space (M, dGPr) is complete and separable. In Section 2.5 we will see that the Gromov-Prohorov metric generates the Gromov-weak topology.

Notice that the first naive approach to metrize the Gromov-weak topology could be to fix a countable dense subset $\{\Phi_n;\ n \in \mathbb{N}\}$ in the algebra of all polynomials, and to put for $\mathcal{X}, \mathcal{Y} \in \mathbb{M}$,

(2.10) $d_{\mathrm{naive}}(\mathcal{X}, \mathcal{Y}) := \sum_{n\in\mathbb{N}} 2^{-n}\, \big|\Phi_n(\mathcal{X}) - \Phi_n(\mathcal{Y})\big|.$

However, such a metric is not complete. Indeed one can check that the sequence $\{\mathcal{X}_n;\ n \in \mathbb{N}\}$ given in Example 2.4.4(ii) is a Cauchy sequence which does not converge.

Recall that metrics on the space of probability measures on a fixed complete and separable metric space are well-studied (see, for example, [Rac91, GS02]). Some of them, like the Prohorov metric and the Wasserstein metric (on compact spaces), generate the weak topology. On the other hand, the space of all (isometry classes of compact) metric spaces, not carrying a measure, is separable and complete once equipped with the Gromov-Hausdorff metric (compare Chapter 1).

Metrics on metric measure spaces should take both aspects into account and compare the spaces and the measures simultaneously. This was, for example, done in [EW06] and [Stu06]. We will follow along similar lines as in [Stu06], but replace the Wasserstein metric with the Prohorov metric.

Recall from (1.49) the Prohorov metric $d^{(Z,r_Z)}_{\mathrm{Pr}}$ between two probability measures $\mu_1$ and $\mu_2$ on a common metric space $(Z, r_Z)$.

Sometimes it is easier to work with the equivalent formulation based on couplings of the measures $\mu_1$ and $\mu_2$, i.e., measures $\mu$ on $Z \times Z$ with $\mu(\cdot \times Z) = \mu_1(\cdot)$ and $\mu(Z \times \cdot) = \mu_2(\cdot)$. Notice that the product measure $\mu_1 \otimes \mu_2$ is a coupling, and so the set of all couplings of two measures is not empty. By Theorem 3.1.2 in [EK86],

$$ d^{(Z,r_Z)}_{\mathrm{Pr}}(\mu_1,\mu_2) = \inf_{\mu}\, \inf\big\{\varepsilon > 0 :\; \mu\{(z,z') \in Z\times Z :\; r_Z(z,z') \ge \varepsilon\} \le \varepsilon\big\}, \tag{2.11} $$

where the infimum is taken over all couplings $\mu$ of $\mu_1$ and $\mu_2$. The metric $d^{(Z,r_Z)}_{\mathrm{Pr}}$ is complete and separable if $(Z, r_Z)$ is complete and separable ([EK86], Theorem 3.1.7).

To define a metric between two metric measure spaces we cannot use the Prohorov metric directly because the two involved measures are not defined on the same metric space. However, we can use an idea due to Gromov and first embed the metric spaces isometrically into a common metric space. We therefore let the distance between two metric measure spaces $\mathcal{X} = (X, r_X, \mu_X)$ and $\mathcal{Y} = (Y, r_Y, \mu_Y)$ be defined by

$$ d_{\mathrm{GPr}}(\mathcal{X},\mathcal{Y}) := \inf_{(\varphi_X,\varphi_Y,Z)} d^{(Z,r_Z)}_{\mathrm{Pr}}\big((\varphi_X)_*\mu_X,\, (\varphi_Y)_*\mu_Y\big), \tag{2.12} $$

where the infimum is taken over all isometric embeddings $\varphi_X$ and $\varphi_Y$ from $\mathrm{supp}(\mu_X)$ and $\mathrm{supp}(\mu_Y)$, respectively, into some common metric space $(Z, r_Z)$. Observe that $d_{\mathrm{GPr}}$ is defined on metric measure spaces rather than on their isometry classes.

In order to define a metric on $\mathbb{M}$, i.e., on isometry classes of metric measure spaces, we will show that for a metric measure space $\mathcal{Y} = (Y, r_Y, \mu_Y)$ and a pair $(\mathcal{X} = (X, r_X, \mu_X),\ \mathcal{X}' = (X', r'_X, \mu'_X))$ of (measure preserving) isometric copies of metric measure spaces,

$$ d_{\mathrm{GPr}}(\mathcal{X},\mathcal{Y}) = d_{\mathrm{GPr}}(\mathcal{X}',\mathcal{Y}). \tag{2.13} $$

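To see this, fix $\varepsilon > 0$ and assume that $d_{\mathrm{GPr}}(\mathcal{X}',\mathcal{Y}) < \varepsilon$. By definition, there are isometric embeddings $\varphi'_X$ and $\varphi_Y$ from $\mathrm{supp}(\mu'_X)$ and $\mathrm{supp}(\mu_Y)$, respectively, into some common metric space $(Z, r_Z)$ such that

$$ d^{(Z,r_Z)}_{\mathrm{Pr}}\big((\varphi'_X)_*\mu'_X,\, (\varphi_Y)_*\mu_Y\big) < \varepsilon. \tag{2.14} $$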


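Moreover, by assumption, there exists an isometry $\varphi$ from $\mathrm{supp}(\mu_X)$ onto $\mathrm{supp}(\mu'_X)$ such that $\mu'_X = \varphi_*\mu_X$. Hence,

$$ d_{\mathrm{GPr}}(\mathcal{X},\mathcal{Y}) \le d^{(Z,r_Z)}_{\mathrm{Pr}}\big((\varphi'_X\circ\varphi)_*\mu_X,\, (\varphi_Y)_*\mu_Y\big) = d^{(Z,r_Z)}_{\mathrm{Pr}}\big((\varphi'_X)_*\mu'_X,\, (\varphi_Y)_*\mu_Y\big) < \varepsilon. \tag{2.15} $$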

Since $\varepsilon$ was chosen arbitrarily, we have proved "≤" in (2.13). Equality then follows by symmetry.

This leads to the well-defined Gromov-Prohorov metric on $\mathbb{M}$, which we introduce next.

Definition 2.2.1 (Gromov-Prohorov metric). The Gromov-Prohorov distance between two metric measure spaces $\mathcal{X} = (X, r_X, \mu_X)$ and $\mathcal{Y} = (Y, r_Y, \mu_Y)$ is defined by

$$ d_{\mathrm{GPr}}(\mathcal{X},\mathcal{Y}) := \inf_{(\varphi_X,\varphi_Y,Z)} d^{(Z,r_Z)}_{\mathrm{Pr}}\big((\varphi_X)_*\mu_X,\, (\varphi_Y)_*\mu_Y\big), \tag{2.16} $$

where the infimum is taken over all isometric embeddings $\varphi_X$ and $\varphi_Y$ from $\mathrm{supp}(\mu_X)$ and $\mathrm{supp}(\mu_Y)$, respectively, into some common metric space $(Z, r_Z)$.

Remark 2.2.2. Notice that w.l.o.g. the common metric space $(Z, r_Z)$ and the isometric embeddings $\varphi_X$ and $\varphi_Y$ from $\mathrm{supp}(\mu_X)$ and $\mathrm{supp}(\mu_Y)$ can be chosen to be $X \sqcup Y$ and the identities, respectively (compare, for example, Remark 3.3(iii) in [Stu06]). We can therefore also write

$$ d_{\mathrm{GPr}}(\mathcal{X},\mathcal{Y}) = \inf_{r_{X,Y}} d^{(X\sqcup Y,\, r_{X,Y})}_{\mathrm{Pr}}\big(\mu_X,\, \mu_Y\big), \tag{2.17} $$

where the infimum is here taken over all complete and separable metrics $r_{X,Y}$ which extend the metrics $r_X$ on $X$ and $r_Y$ on $Y$ to $X \sqcup Y$.

We first show that the Gromov-Prohorov distance is indeed a metric.

Lemma 2.2.3. dGPr defines a metric on M.

In the following we refer to the topology generated by the Gromov-Prohorov metric as the Gromov-Prohorov topology. In Theorem 2.5.1 of Section 2.5 we will prove that the Gromov-Prohorov metric generates the Gromov-weak topology.

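Proof. Symmetry and positive definiteness are obvious. To see the triangle inequality, we follow the line of argument in the proof of Theorem 3.6 in [Stu06]. Let $\varepsilon, \delta > 0$ and $\mathcal{X}_i := (X_i, r_{X_i}, \mu_{X_i}) \in \mathbb{M}$, $i = 1, 2, 3$, be such that $d_{\mathrm{GPr}}(\mathcal{X}_1,\mathcal{X}_2) < \varepsilon$ and $d_{\mathrm{GPr}}(\mathcal{X}_2,\mathcal{X}_3) < \delta$. Then, by definition together with Remark 2.2.2, we can find metrics $r_{1,2}$ and $r_{2,3}$ on $X_1 \sqcup X_2$ and $X_2 \sqcup X_3$, respectively, such that

$$ d^{(X_1\sqcup X_2,\, r_{1,2})}_{\mathrm{Pr}}\big(\mu_{X_1},\, \mu_{X_2}\big) < \varepsilon, \tag{2.18} $$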


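and

$$ d^{(X_2\sqcup X_3,\, r_{2,3})}_{\mathrm{Pr}}\big(\mu_{X_2},\, \mu_{X_3}\big) < \delta. \tag{2.19} $$

Now define a complete and separable metric $r_{1,2,3}$ on $X_1 \sqcup X_2 \sqcup X_3$ extending $r_{X_1}$ on $X_1$, $r_{X_2}$ on $X_2$ and $r_{X_3}$ on $X_3$ by

$$ r_{1,2,3}(x,y) := \begin{cases} r_{1,2}(x,y), & \text{if } x, y \in X_1 \sqcup X_2,\\ r_{2,3}(x,y), & \text{if } x, y \in X_2 \sqcup X_3,\\ \inf\{r_{1,2}(x,z) + r_{2,3}(z,y);\; z \in X_2\}, & \text{if } x \in X_1,\ y \in X_3,\\ \inf\{r_{1,2}(y,z) + r_{2,3}(z,x);\; z \in X_2\}, & \text{if } x \in X_3,\ y \in X_1. \end{cases} \tag{2.20} $$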

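Hence, by the triangle inequality of the Prohorov metric,

$$ d_{\mathrm{GPr}}(\mathcal{X}_1,\mathcal{X}_3) \le d^{(X_1\sqcup X_2\sqcup X_3,\, r_{1,2,3})}_{\mathrm{Pr}}\big(\mu_{X_1}, \mu_{X_3}\big) \le d^{(X_1\sqcup X_2,\, r_{1,2})}_{\mathrm{Pr}}\big(\mu_{X_1}, \mu_{X_2}\big) + d^{(X_2\sqcup X_3,\, r_{2,3})}_{\mathrm{Pr}}\big(\mu_{X_2}, \mu_{X_3}\big) < \varepsilon + \delta. \tag{2.21} $$

Taking the infimum over all extensions $r_{1,2}$ and $r_{2,3}$ on the right hand side of (2.21) gives $d_{\mathrm{GPr}}(\mathcal{X}_1,\mathcal{X}_3) < \varepsilon + \delta$ by Definition 2.2.1 together with Remark 2.2.2. Since $\varepsilon > d_{\mathrm{GPr}}(\mathcal{X}_1,\mathcal{X}_2)$ and $\delta > d_{\mathrm{GPr}}(\mathcal{X}_2,\mathcal{X}_3)$ were arbitrary, the triangle inequality follows.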
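For finite spaces the gluing construction (2.20) can be carried out explicitly. The following Python sketch is only an illustration under stated assumptions: the function name `glue_metrics` and the index layout (points of $X_1$ first, then $X_2$, then $X_3$) are hypothetical choices, and the two input matrices are assumed to be metrics that agree on $X_2$.

```python
def glue_metrics(r12, r23, n1, n2, n3):
    """Glue a metric r12 on X1 ⊔ X2 and a metric r23 on X2 ⊔ X3 as in (2.20).

    Assumed layout (illustrative only): r12 is an (n1+n2) x (n1+n2) matrix with
    the points of X1 first; r23 is an (n2+n3) x (n2+n3) matrix with the points
    of X2 first. Both are assumed to induce the same metric on X2.
    The result is indexed by X1, then X2, then X3.
    """
    n = n1 + n2 + n3
    r = [[0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            both_in_12 = x < n1 + n2 and y < n1 + n2
            both_in_23 = x >= n1 and y >= n1
            if both_in_12:
                r[x][y] = r12[x][y]
            elif both_in_23:
                r[x][y] = r23[x - n1][y - n1]
            else:
                # One point lies in X1 and the other in X3:
                # take the shortest detour through a point z of X2.
                a, c = (x, y) if x < n1 else (y, x)
                r[x][y] = min(
                    r12[a][z] + r23[z - n1][c - n1]
                    for z in range(n1, n1 + n2)
                )
    return r
```

For instance, gluing a one-point space to each side of a two-point space whose points are at distance 1 produces a path metric on four points, and one can check the triangle inequality of the glued matrix entry by entry.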

Proposition 2.2.4. The metric space $(\mathbb{M}, d_{\mathrm{GPr}})$ is complete and separable.

We prepare the proof with a lemma.

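Lemma 2.2.5. Fix $(\varepsilon_n)_{n\in\mathbb{N}}$ in $(0,\infty)$. A sequence $(\mathcal{X}_n := (X_n, r_{X_n}, \mu_{X_n}))_{n\in\mathbb{N}}$ in $\mathbb{M}$ satisfies

$$ d_{\mathrm{GPr}}(\mathcal{X}_n,\mathcal{X}_{n+1}) < \varepsilon_n \tag{2.22} $$

if and only if there exist a complete and separable metric space $(Z, r_Z)$ and isometric embeddings $\varphi_{X_1}, \varphi_{X_2}, \dots$ from $\mathrm{supp}(\mu_{X_1}), \mathrm{supp}(\mu_{X_2}), \dots$, respectively, into $(Z, r_Z)$, such that

$$ d^{(Z,r_Z)}_{\mathrm{Pr}}\big((\varphi_{X_n})_*\mu_{X_n},\, (\varphi_{X_{n+1}})_*\mu_{X_{n+1}}\big) < \varepsilon_n. \tag{2.23} $$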

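Proof. The "if" direction is clear. For the "only if" direction, take sequences $(\mathcal{X}_n := (X_n, r_{X_n}, \mu_{X_n}))_{n\in\mathbb{N}}$ and $(\varepsilon_n)_{n\in\mathbb{N}}$ which satisfy (2.22). By definition, for all $n \in \mathbb{N}$, there are a metric space $(Y_n, r_{Y_n})$ and isometric embeddings $\varphi_{X_n}$ and $\varphi_{X_{n+1}}$ from $\mathrm{supp}(\mu_{X_n})$ and $\mathrm{supp}(\mu_{X_{n+1}})$, respectively, to $(Y_n, r_{Y_n})$ with

$$ d^{(Y_n,r_{Y_n})}_{\mathrm{Pr}}\big((\varphi_{X_n})_*\mu_{X_n},\, (\varphi_{X_{n+1}})_*\mu_{X_{n+1}}\big) < \varepsilon_n. \tag{2.24} $$

Put

$$ R_n := \big\{(x,x') \in X_n \times X_{n+1} :\; r_{Y_n}\big(\varphi_{X_n}(x), \varphi_{X_{n+1}}(x')\big) < \varepsilon_n\big\}. \tag{2.25} $$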


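Recall from (2.11) that (2.24) implies the existence of a coupling $\mu_n$ of $(\varphi_{X_n})_*\mu_{X_n}$ and $(\varphi_{X_{n+1}})_*\mu_{X_{n+1}}$ such that

$$ \mu_n\big\{(y,y') :\; r_{Y_n}(y,y') < \varepsilon_n\big\} > 1 - \varepsilon_n. \tag{2.26} $$

In particular, $R_n$ is non-empty.

Recall from Lemma 1.3.2, for two metric spaces $(X_1, r_{X_1})$ and $(X_2, r_{X_2})$ and a relation $R \in \mathcal{R}_{X_1,X_2}$, the metric $r^{R}_{X_1,X_2}$ on $X_1 \sqcup X_2$ which extends the metrics $r_{X_1}$ and $r_{X_2}$ such that $r^{R}_{X_1,X_2}(x_1,x_2) = \tfrac12 \mathrm{dis}(R)$, for all $(x_1,x_2) \in R$. Then by (2.25) together with (2.26),

$$ d^{(X_n\sqcup X_{n+1},\; r^{R_n}_{X_n\sqcup X_{n+1}})}_{\mathrm{Pr}}\big((\varphi_{X_n})_*\mu_{X_n},\, (\varphi_{X_{n+1}})_*\mu_{X_{n+1}}\big) \le \varepsilon_n, \tag{2.27} $$

where $\varphi_{X_n}$ and $\varphi_{X_{n+1}}$ are the canonical isometric embeddings from $X_n$ and $X_{n+1}$ to $X_n \sqcup X_{n+1}$, respectively.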

Using the metric spaces $(X_n \sqcup X_{n+1}, r^{R_n}_{X_n\sqcup X_{n+1}})$ we define recursively metrics $r_{Z_n}$ on $Z_n := \bigsqcup_{k=1}^{n} X_k$. Starting with $n = 1$, we set $(Z_1, r_{Z_1}) := (X_1, r_{X_1})$. Next, assume we are given a metric $r_{Z_n}$ on $Z_n$. Consider the isometric embeddings $\psi^n_k$ from $X_k$ to $Z_n$, for $k = 1, \dots, n$, which arise from the canonical embedding of $X_k$ in $Z_n$. Define for all $n \in \mathbb{N}$,

$$ \tilde R_n := \big\{(z,x) \in Z_n \times X_{n+1} :\; \big((\psi^n_n)^{-1}(z), x\big) \in R_n\big\}, \tag{2.28} $$

which defines metrics $r^{\tilde R_n}_{Z_{n+1}}$ on $Z_{n+1}$ via (1.7).

By this procedure we obtain in the limit a separable metric space $(Z' := \bigsqcup_{n=1}^{\infty} X_n, r_{Z'})$. Denote its completion by $(Z, r_Z)$, and denote by $\psi_n$, $n \in \mathbb{N}$, the isometric embeddings from $X_n$ to $Z$ which arise from the canonical embedding. Observe that the restriction of $r_Z$ to $X_n \sqcup X_{n+1}$ is isometric to $(X_n \sqcup X_{n+1}, r^{R_n}_{X_n\sqcup X_{n+1}})$ and thus

$$ d^{(Z,r_Z)}_{\mathrm{Pr}}\big((\psi_n)_*\mu_{X_n},\, (\psi_{n+1})_*\mu_{X_{n+1}}\big) \le \varepsilon_n \tag{2.29} $$

by (2.27). So the claim follows.

Proof of Proposition 2.2.4. To get separability, we partly follow the proof of Theorem 3.2.2 in [EK86]. Given $\mathcal{X} := (X, r, \mu) \in \mathbb{M}$ and $\varepsilon > 0$, we can find $\mathcal{X}^{\varepsilon} := (X, r, \mu^{\varepsilon}) \in \mathbb{M}$ such that $\mu^{\varepsilon}$ is a finitely supported atomic measure on $X$ and $d_{\mathrm{Pr}}(\mu^{\varepsilon}, \mu) < \varepsilon$. Now $d_{\mathrm{GPr}}(\mathcal{X}^{\varepsilon},\mathcal{X}) < \varepsilon$, while $\mathcal{X}^{\varepsilon}$ is just a "finite metric space" and can clearly be approximated arbitrarily closely in the Gromov-Prohorov metric by finite metric spaces with rational mutual distances and weights. The set of isometry classes of finite metric spaces with rational mutual distances and weights is countable, and so $(\mathbb{M}, d_{\mathrm{GPr}})$ is separable.

To get completeness, it suffices to show that every Cauchy sequence has a convergent subsequence. Take therefore a Cauchy sequence $(\mathcal{X}_n)_{n\in\mathbb{N}}$ in $(\mathbb{M}, d_{\mathrm{GPr}})$; passing to a subsequence if necessary, we may assume that $d_{\mathrm{GPr}}(\mathcal{X}_n,\mathcal{X}_{n+1}) < 2^{-n}$ for all $n$. By Lemma 2.2.5 we can choose a complete and separable metric space $(Z, r_Z)$ and, for each $n \in \mathbb{N}$, an isometric embedding $\varphi_{X_n}$ from $\mathrm{supp}(\mu_{X_n})$ into $(Z, r_Z)$ such that $((\varphi_{X_n})_*\mu_{X_n})_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathcal{M}_1(Z)$ equipped with the Prohorov metric, which metrizes the weak topology. By the completeness of $\mathcal{M}_1(Z)$, $((\varphi_{X_n})_*\mu_{X_n})_{n\in\mathbb{N}}$ converges to some $\mu \in \mathcal{M}_1(Z)$.

Putting the arguments together yields that with $\mathcal{Z} := (Z, r_Z, \mu)$,

$$ d_{\mathrm{GPr}}(\mathcal{X}_n,\mathcal{Z}) \xrightarrow[n\to\infty]{} 0, \tag{2.30} $$

so that $\mathcal{Z}$ is the desired limit object, which finishes the proof.

We conclude this section by stating that if a sequence of metric measure spaces converges in the Gromov-Prohorov metric to a limit metric measure space, then all metric measure spaces can be embedded isometrically into a common metric space such that the image measures under these isometric embeddings converge weakly to the image of the limit measure. Compare also with Theorem 1.3.1, where the analogous result for Gromov-Hausdorff convergence in the space of compact metric spaces is presented.

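Corollary 2.2.6. Let $(\mathcal{X}_n = (X_n, r_{X_n}, \mu_{X_n}))_{n\in\mathbb{N}}$ be a sequence in $\mathbb{M}$ and $\mathcal{X} = (X, r_X, \mu_X) \in \mathbb{M}$. Then,

$$ d_{\mathrm{GPr}}(\mathcal{X}_n,\mathcal{X}) \xrightarrow[n\to\infty]{} 0 \tag{2.31} $$

if and only if there exist a complete and separable metric space $(Z, r_Z)$ and isometric embeddings $\varphi, \varphi_{X_1}, \varphi_{X_2}, \dots$ from $\mathrm{supp}(\mu_X), \mathrm{supp}(\mu_{X_1}), \mathrm{supp}(\mu_{X_2}), \dots$ into $(Z, r_Z)$, respectively, such that

$$ d^{(Z,r_Z)}_{\mathrm{Pr}}\big((\varphi_{X_n})_*\mu_{X_n},\, \varphi_*\mu_X\big) \xrightarrow[n\to\infty]{} 0. \tag{2.32} $$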

Proof. Once more the "if" direction is clear by definition. For the "only if" direction, assume that (2.31) holds. To conclude (2.32) we follow the same line of argument as in the proof of Lemma 2.2.5, but with a metric $r$ extending the metrics $r_X, r_{X_1}, \dots$ built on correspondences between $X$ and $X_n$ (rather than $X_n$ and $X_{n+1}$).

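Fix $\varepsilon > 0$. Then there exists $N = N_\varepsilon$ such that

$$ d_{\mathrm{GPr}}(\mathcal{X}_n,\mathcal{X}) < \varepsilon, \tag{2.33} $$

for all $n \ge N$. By definition, for all $n \ge N$, there are a metric space $(Y_n, r_{Y_n})$ and isometric embeddings $\varphi_X$ and $\varphi_{X_n}$ from $\mathrm{supp}(\mu_X)$ and $\mathrm{supp}(\mu_{X_n})$, respectively, to $(Y_n, r_{Y_n})$ with

$$ d^{(Y_n,r_{Y_n})}_{\mathrm{Pr}}\big((\varphi_X)_*\mu_X,\, (\varphi_{X_n})_*\mu_{X_n}\big) < \varepsilon. \tag{2.34} $$

Put

$$ R_n := \big\{(x,x') \in X \times X_n :\; r_{Y_n}\big(\varphi_X(x), \varphi_{X_n}(x')\big) < \varepsilon\big\}. \tag{2.35} $$

Once more, since (2.34) implies the existence of a coupling $\mu_n$ of $(\varphi_X)_*\mu_X$ and $(\varphi_{X_n})_*\mu_{X_n}$ such that

$$ \mu_n\big\{(y,y') :\; r_{Y_n}(y,y') < \varepsilon\big\} > 1 - \varepsilon, \tag{2.36} $$

$R_n$ is non-empty.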


Recall from Lemma 1.3.2, for two metric spaces $(X_1, r_{X_1})$ and $(X_2, r_{X_2})$ and a relation $R \in \mathcal{R}_{X_1,X_2}$, the metric $r^{R}_{X_1,X_2}$ on $X_1 \sqcup X_2$ which extends the metrics $r_{X_1}$ and $r_{X_2}$ such that $r^{R}_{X_1,X_2}(x_1,x_2) = \tfrac12 \mathrm{dis}(R)$, for all $(x_1,x_2) \in R$. Then by (2.35) together with (2.36),

$$ d^{(X\sqcup X_n,\; r^{R_n}_{X\sqcup X_n})}_{\mathrm{Pr}}\big((\varphi_X)_*\mu_X,\, (\varphi_{X_n})_*\mu_{X_n}\big) \le \varepsilon, \tag{2.37} $$

where $\varphi_X$ and $\varphi_{X_n}$ are the canonical isometric embeddings from $X$ and $X_n$ to $X \sqcup X_n$, respectively.

Using the metric spaces $(X \sqcup X_n, r^{R_n}_{X\sqcup X_n})$ we define recursively metrics $r_{Z_n}$ on $Z_n := X \sqcup \bigsqcup_{k=1}^{n} X_k$. Starting with $n = 0$, we set $(Z_0, r_{Z_0}) := (X, r_X)$. Next, assume we are given a metric $r_{Z_n}$ on $Z_n$. Consider the isometric embeddings $\psi^n_k$ from $X_k$ to $Z_n$, for $k = 0, \dots, n$ (with $X_0 := X$), which arise from the canonical embedding of $X_k$ in $Z_n$. Define for all $n \in \mathbb{N}$,

$$ \tilde R_n := \big\{(z,x) \in Z_n \times X_{n+1} :\; \big((\psi^n_n)^{-1}(z), x\big) \in R_n\big\}, \tag{2.38} $$

which defines metrics $r^{\tilde R_n}_{Z_{n+1}}$ on $Z_{n+1}$ via (1.7).

By this procedure we obtain in the limit a separable metric space $(Z' := X \sqcup \bigsqcup_{n=1}^{\infty} X_n, r_{Z'})$. Denote its completion by $(Z, r_Z)$, and denote by $\psi_n$, $n \in \mathbb{N}$, the isometric embeddings from $X_n$ to $Z$ which arise from the canonical embedding. Observe that the restriction of $r_Z$ to $X \sqcup X_n$ is isometric to $(X \sqcup X_n, r^{R_n}_{X\sqcup X_n})$ and thus

$$ d^{(Z,r_Z)}_{\mathrm{Pr}}\big((\psi_0)_*\mu_X,\, (\psi_n)_*\mu_{X_n}\big) \le \varepsilon \tag{2.39} $$

by (2.37). So the claim follows.

2.3. Distance distribution and modulus of mass distribution

In order to obtain criteria for tightness of a family of laws of random elements in $\mathbb{M}$ later, we need a characterization of the compact sets of $(\mathbb{M}, \mathcal{O}_{\mathbb{M}})$. Informally, a subset of $\mathbb{M}$ will turn out to be pre-compact iff the corresponding probability measures put most of their mass on subspaces of a uniformly bounded diameter, and if the contribution of points which do not carry much mass in their vicinity is small.

These two criteria lead to the following definitions.

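Definition 2.3.1 (Distance and modulus of mass distribution). Let $\mathcal{X} = (X, r, \mu) \in \mathbb{M}$.

(i) The distance distribution, which is an element in $\mathcal{M}_1([0,\infty))$, is given by $w_{\mathcal{X}} := r_*\mu^{\otimes 2}$, i.e.,

$$ w_{\mathcal{X}}(\cdot) := \mu^{\otimes 2}\big\{(x,x') :\; r(x,x') \in \cdot\,\big\}. \tag{2.40} $$

(ii) For $\delta > 0$, define the modulus of mass distribution as

$$ v_\delta(\mathcal{X}) := \inf\big\{\varepsilon > 0 :\; \mu\{x \in X :\; \mu(B_\varepsilon(x)) \le \delta\} \le \varepsilon\big\}. \tag{2.41} $$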
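For a finite metric measure space the distance distribution (2.40) is a finite sum over pairs of points. The following Python sketch illustrates this; the function name `distance_distribution` and the encoding of the space as a distance matrix plus a list of point masses are illustrative assumptions, not notation from the text.

```python
from collections import defaultdict
from fractions import Fraction


def distance_distribution(dist, mass):
    """Return w_X as a dict {distance: weight} for a finite mm-space.

    dist[i][j] is the distance between points i and j, and mass[i] is the
    mass of point i (assumed to sum to 1). Two independent points are sampled
    according to mass, and the law of their distance is recorded.
    """
    w = defaultdict(Fraction)
    n = len(mass)
    for i in range(n):
        for j in range(n):
            w[dist[i][j]] += mass[i] * mass[j]
    return dict(w)
```

For two points of mass 1/2 each at distance $n$, this gives weight 1/2 at distance 0 and weight 1/2 at distance $n$.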


Remark 2.3.2. Observe that $w_{\mathcal{X}}$ and $v_\delta$ are well-defined because they are constant on isometry classes of a given metric measure space.

In this section we provide results on the distance distribution and on the modulus of mass distribution. These will be heavily used in the following sections, where we present metrics which are equivalent to the Gromov-Prohorov metric and which are very helpful in proving the characterizations of compactness and tightness in the Gromov-Prohorov topology.

We start by introducing the random distance distribution of a given metric measure space.

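Definition 2.3.3 (Random distance distribution). Let $\mathcal{X} = (X, r, \mu) \in \mathbb{M}$. For each $x \in X$, define the map $r_x : X \to [0,\infty)$ by $r_x(x') := r(x,x')$, and put $\mu_x := (r_x)_*\mu \in \mathcal{M}_1([0,\infty))$, i.e., $\mu_x$ defines the distribution of distances to the point $x \in X$. Moreover, define the map $\hat r : X \to \mathcal{M}_1([0,\infty))$ by $\hat r(x) := \mu_x$, and let

$$ \mu_{\mathcal{X}} := \hat r_*\mu \in \mathcal{M}_1\big(\mathcal{M}_1([0,\infty))\big) \tag{2.42} $$

be the random distance distribution of $\mathcal{X}$.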
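On a finite space, the random distance distribution is a finite mixture of finitely supported measures, and mixing over it gives back the distance distribution. The Python sketch below illustrates this under stated assumptions: the function names and the encoding of a measure on $[0,\infty)$ as a sorted tuple of (distance, weight) pairs are hypothetical conventions chosen for the illustration.

```python
from collections import defaultdict
from fractions import Fraction


def random_distance_distribution(dist, mass):
    """Return the law of x -> mu_x as a dict {mu_x: weight}.

    Each mu_x (law of the distance from x to a mass-sampled point) is encoded
    as a sorted tuple of (distance, weight) pairs so it can serve as a key.
    """
    n = len(mass)
    law = defaultdict(Fraction)
    for x in range(n):
        mu_x = defaultdict(Fraction)
        for y in range(n):
            mu_x[dist[x][y]] += mass[y]
        law[tuple(sorted(mu_x.items()))] += mass[x]
    return dict(law)


def first_moment(law):
    """Mix the measures nu according to their weights: integral of nu against the law."""
    w = defaultdict(Fraction)
    for nu, weight in law.items():
        for d, p in nu:
            w[d] += weight * p
    return dict(w)
```

For a path of three points with masses 1/4, 1/2, 1/4, the two end points share the same distribution of distances, so the random distance distribution has only two atoms.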

Notice first that the random distance distribution does not characterize the metric measure space uniquely. We will illustrate this with an example.

Example 2.3.4. Consider the following two metric measure spaces:

[Figure: the two metric measure spaces $\mathcal{X}$ and $\mathcal{Y}$, each a graph on 8 points. The point masses shown are 1/20, 2/20, 3/20, 4/20, 1/20, 2/20, 3/20, 4/20 for $\mathcal{X}$ and 1/20, 1/20, 4/20, 4/20, 2/20, 2/20, 3/20, 3/20 for $\mathcal{Y}$.]

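That is, both spaces consist of 8 points. The distance between two points equals the minimal number of edges one has to cross to come from one point to the other. The measures $\mu_X$ and $\mu_Y$ are given by the numbers in the figure. We find that

$$ \mu_{\mathcal{X}} = \mu_{\mathcal{Y}} = \tfrac{1}{10}\,\delta_{\frac{1}{20}\delta_0 + \frac{9}{20}\delta_1 + \frac12\delta_2} + \tfrac{1}{5}\,\delta_{\frac{1}{10}\delta_0 + \frac{2}{5}\delta_1 + \frac12\delta_2} + \tfrac{3}{10}\,\delta_{\frac{3}{20}\delta_0 + \frac{7}{20}\delta_1 + \frac12\delta_2} + \tfrac{2}{5}\,\delta_{\frac{1}{5}\delta_0 + \frac{3}{10}\delta_1 + \frac12\delta_2}. \tag{2.43} $$

Hence, the random distance distributions agree. But obviously, $\mathcal{X}$ and $\mathcal{Y}$ are not measure preserving isometric.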

Recall the distance distribution $w_\cdot$ and the modulus of mass distribution $v_\delta(\cdot)$ from Definition 2.3.1. Both can be expressed through the random distance distribution $\mu_{(\cdot)}$. These facts follow directly from the definitions, so we omit the proof.

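Lemma 2.3.5 (Reformulation of $w_\cdot$ and $v_\delta(\cdot)$ in terms of $\mu_{(\cdot)}$). Let $\mathcal{X} \in \mathbb{M}$.

(i) The distance distribution $w_{\mathcal{X}}$ satisfies

$$ w_{\mathcal{X}} = \int_{\mathcal{M}_1([0,\infty))} \mu_{\mathcal{X}}(\mathrm{d}\nu)\, \nu. \tag{2.44} $$

(ii) For all $\delta > 0$, the modulus of mass distribution $v_\delta(\mathcal{X})$ satisfies

$$ v_\delta(\mathcal{X}) = \inf\big\{\varepsilon > 0 :\; \mu_{\mathcal{X}}\{\nu \in \mathcal{M}_1([0,\infty)) :\; \nu([0,\varepsilon]) \le \delta\} \le \varepsilon\big\}. \tag{2.45} $$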

The next result will be used frequently.

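Lemma 2.3.6. Let $\mathcal{X} = (X, r, \mu) \in \mathbb{M}$ and $\delta > 0$. If $v_\delta(\mathcal{X}) < \varepsilon$, for some $\varepsilon > 0$, then

$$ \mu\{x \in X :\; \mu(B_\varepsilon(x)) \le \delta\} < \varepsilon. \tag{2.46} $$

Proof. By definition of $v_\delta(\cdot)$, there exists $\varepsilon' < \varepsilon$ for which $\mu\{x \in X : \mu(B_{\varepsilon'}(x)) \le \delta\} \le \varepsilon'$. Consequently, since $\{x : \mu(B_\varepsilon(x)) \le \delta\} \subseteq \{x : \mu(B_{\varepsilon'}(x)) \le \delta\}$,

$$ \mu\{x :\; \mu(B_\varepsilon(x)) \le \delta\} \le \mu\{x :\; \mu(B_{\varepsilon'}(x)) \le \delta\} \le \varepsilon' < \varepsilon, \tag{2.47} $$

and we are done.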

The next result states basic properties of the map δ 7→ vδ.

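Lemma 2.3.7 (Properties of $v_\delta(\cdot)$). Fix $\mathcal{X} \in \mathbb{M}$. The map which sends $\delta \ge 0$ to $v_\delta(\mathcal{X})$ is non-decreasing, right-continuous and bounded by 1. Moreover, $v_\delta(\mathcal{X}) \to 0$ as $\delta \to 0$.

Proof. The first three properties are trivial. For the fourth, fix $\varepsilon > 0$, and let $\mathcal{X} = (X, r, \mu) \in \mathbb{M}$. Since $X$ is complete and separable there exists a compact set $K_\varepsilon \subseteq X$ with $\mu(K_\varepsilon) > 1 - \varepsilon$ (see [EK86], Lemma 3.2.1). In particular, $K_\varepsilon$ can be covered by finitely many balls $A_1, \dots, A_{N_\varepsilon}$ of radius $\varepsilon/2$ and positive $\mu$-mass. Choose $\delta$ such that

$$ 0 < \delta < \min\{\mu(A_i) :\; 1 \le i \le N_\varepsilon\}. \tag{2.48} $$

Then

$$ \mu\{x \in X :\; \mu(B_\varepsilon(x)) > \delta\} \ge \mu\Big(\bigcup_{i=1}^{N_\varepsilon} A_i\Big) \ge \mu(K_\varepsilon) > 1 - \varepsilon. \tag{2.49} $$

Therefore, by definition and Lemma 2.3.6, $v_\delta(\mathcal{X}) \le \varepsilon$, and since $\varepsilon$ was chosen arbitrarily, the assertion follows.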


The following proposition states continuity properties of $\mu_{(\cdot)}$, $w_\cdot$ and $v_\delta(\cdot)$. The reader should keep in mind that we finally prove with Theorem 2.5.1 in Section 2.5 that the Gromov-weak and the Gromov-Prohorov topology are the same.

Proposition 2.3.8 (Continuity properties of µ(·), w· and vδ(·)).

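(i) The map $\mathcal{X} \mapsto \mu_{\mathcal{X}}$ is continuous with respect to the Gromov-weak topology on $\mathbb{M}$ and the weak topology on $\mathcal{M}_1(\mathcal{M}_1([0,\infty)))$.

(ii) The map $\mathcal{X} \mapsto \mu_{\mathcal{X}}$ is continuous with respect to the Gromov-Prohorov topology on $\mathbb{M}$ and the weak topology on $\mathcal{M}_1(\mathcal{M}_1([0,\infty)))$.

(iii) The map $\mathcal{X} \mapsto w_{\mathcal{X}}$ is continuous with respect to both the Gromov-weak and the Gromov-Prohorov topology on $\mathbb{M}$ and the weak topology on $\mathcal{M}_1([0,\infty))$.

(iv) Let $\mathcal{X}, \mathcal{X}_1, \mathcal{X}_2, \dots$ in $\mathbb{M}$ be such that $\mu_{\mathcal{X}_n} \Rightarrow \mu_{\mathcal{X}}$ as $n \to \infty$, and let $\delta > 0$, where $\Rightarrow$ here means weak convergence on $\mathcal{M}_1(\mathcal{M}_1([0,\infty)))$. Then

$$ \limsup_{n\to\infty} v_\delta(\mathcal{X}_n) \le v_\delta(\mathcal{X}). \tag{2.50} $$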

The proofs of Parts (i) and (ii) of Proposition 2.3.8 are based on the notion of moment measures.

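Definition 2.3.9 (Moment measures of $\mu_{\mathcal{X}}$). For $\mathcal{X} = (X, r, \mu) \in \mathbb{M}$ and $k \in \mathbb{N}$, define the $k$th moment measure $\mu^k_{\mathcal{X}} \in \mathcal{M}_1([0,\infty)^k)$ of $\mu_{\mathcal{X}}$ by

$$ \mu^k_{\mathcal{X}}(\mathrm{d}(r_1,\dots,r_k)) := \int \mu_{\mathcal{X}}(\mathrm{d}\nu)\, \nu^{\otimes k}(\mathrm{d}(r_1,\dots,r_k)). \tag{2.51} $$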

Remark 2.3.10 (Moment measures determine $\mu_{\mathcal{X}}$). Observe that for all $k \in \mathbb{N}$,

$$ \mu^k_{\mathcal{X}}(A_1 \times \dots \times A_k) = \mu^{\otimes (k+1)}\big\{(u_0,u_1,\dots,u_k) :\; r(u_0,u_1) \in A_1, \dots, r(u_0,u_k) \in A_k\big\}. \tag{2.52} $$

By Theorem 16.16 of [Kal02], the moment measures $\mu^k_{\mathcal{X}}$, $k = 1, 2, \dots$, determine $\mu_{\mathcal{X}}$ uniquely. Moreover, weak convergence of random measures is equivalent to convergence of all moment measures.

Proof of Proposition 2.3.8. (i) Take $\mathcal{X}, \mathcal{X}_1, \mathcal{X}_2, \dots$ in $\mathbb{M}$ such that

$$ \Phi(\mathcal{X}_n) \xrightarrow[n\to\infty]{} \Phi(\mathcal{X}), \tag{2.53} $$

for all $\Phi \in \Pi$. For $k \in \mathbb{N}$, consider all $\phi \in C_b([0,\infty)^{\binom{k+1}{2}})$ which depend on $(r_{ij})_{0\le i<j\le k}$ only through $(r_{0,1}, \dots, r_{0,k})$, i.e., there exists $\tilde\phi \in C_b([0,\infty)^{k})$ with $\phi\big((r_{ij})_{0\le i<j\le k}\big) = \tilde\phi\big((r_{0,j})_{1\le j\le k}\big)$. Since for any $\mathcal{Y} = (Y, r, \mu) \in \mathbb{M}$,

$$ \int \mu^k_{\mathcal{Y}}(\mathrm{d}(r_1,\dots,r_k))\, \tilde\phi(r_1,\dots,r_k) = \int \mu^{\otimes (k+1)}(\mathrm{d}(u_0,u_1,\dots,u_k))\, \tilde\phi\big(r(u_0,u_1),\dots,r(u_0,u_k)\big) = \int \mu^{\otimes (k+1)}(\mathrm{d}(u_0,u_1,\dots,u_k))\, \phi\big((r(u_i,u_j))_{0\le i<j\le k}\big), \tag{2.54} $$

it follows from (2.53) that $\mu^k_{\mathcal{X}_n} \Rightarrow \mu^k_{\mathcal{X}}$ as $n \to \infty$ in the topology of weak convergence. Since $k$ was arbitrary, the convergence $\mu_{\mathcal{X}_n} \Rightarrow \mu_{\mathcal{X}}$ follows by Remark 2.3.10.

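(ii) Once more it suffices to prove that all moment measures converge. Let $\mathcal{X} = (X, r_X, \mu_X) \in \mathbb{M}$ and $\varepsilon > 0$ be given. Now consider a metric measure space $\mathcal{Y} = (Y, r_Y, \mu_Y) \in \mathbb{M}$ with $d_{\mathrm{GPr}}(\mathcal{X},\mathcal{Y}) < \varepsilon$.

We know that there exist a metric space $(Z, r_Z)$, isometric embeddings $\varphi_X$ and $\varphi_Y$ of $\mathrm{supp}(\mu_X)$ and $\mathrm{supp}(\mu_Y)$ into $Z$, respectively, and a coupling $\mu$ of $(\varphi_X)_*\mu_X$ and $(\varphi_Y)_*\mu_Y$ such that

$$ \mu\big\{(z,z') :\; r_Z(z,z') \ge \varepsilon\big\} \le \varepsilon. \tag{2.55} $$

Given $k \in \mathbb{N}$, define a coupling $\tilde\mu^k$ of $\mu^k_{\mathcal{X}}$ and $\mu^k_{\mathcal{Y}}$ by

$$ \tilde\mu^k\big(A_1 \times \dots \times A_k \times B_1 \times \dots \times B_k\big) := \mu^{\otimes (k+1)}\big\{\big((z_0,z'_0),\dots,(z_k,z'_k)\big) :\; r_Z(z_0,z_i) \in A_i,\ r_Z(z'_0,z'_i) \in B_i,\ i = 1,\dots,k\big\} \tag{2.56} $$

for all $A_1 \times \dots \times A_k \times B_1 \times \dots \times B_k \in \mathcal{B}(\mathbb{R}^{2k}_+)$. Then

$$ \begin{aligned} \tilde\mu^k\big\{(r_1,\dots,r_k,r'_1,\dots,r'_k) :\; |r_i - r'_i| \ge 2\varepsilon \text{ for at least one } i\big\} &\le k \cdot \tilde\mu^1\big\{(r_1,r'_1) :\; |r_1 - r'_1| \ge 2\varepsilon\big\} \\ &= k \cdot \mu^{\otimes 2}\big\{\big((z_1,z'_1),(z_2,z'_2)\big) :\; |r_Z(z_1,z_2) - r_Z(z'_1,z'_2)| \ge 2\varepsilon\big\} \\ &\le k \cdot \mu^{\otimes 2}\big\{\big((z_1,z'_1),(z_2,z'_2)\big) :\; r_Z(z_1,z'_1) \ge \varepsilon \text{ or } r_Z(z_2,z'_2) \ge \varepsilon\big\} \\ &\le 2k\varepsilon, \end{aligned} \tag{2.57} $$

which implies that $d^{\mathbb{R}^k_+}_{\mathrm{Pr}}(\mu^k_{\mathcal{X}}, \mu^k_{\mathcal{Y}}) \le 2k\varepsilon$, and the claim follows.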

(iii) By Part (i) of Lemma 2.3.5, for $\mathcal{X} \in \mathbb{M}$, $w_{\mathcal{X}}$ equals the first moment measure of $\mu_{\mathcal{X}}$. The continuity properties of $\mathcal{X} \mapsto w_{\mathcal{X}}$ are therefore a direct consequence of (i) and (ii).

(iv) Let $\mathcal{X}, \mathcal{X}_1, \mathcal{X}_2, \dots$ in $\mathbb{M}$ be such that $\mu_{\mathcal{X}_n} \Rightarrow \mu_{\mathcal{X}}$ as $n \to \infty$, and let $\delta > 0$. Assume that $\varepsilon > 0$ is such that $\varepsilon > v_\delta(\mathcal{X})$ and

$$ \mu_{\mathcal{X}}\big\{\nu \in \mathcal{M}_1([0,\infty)) :\; \nu([0,\varepsilon]) = \delta\big\} = 0. \tag{2.58} $$

Then by Lemmata 2.3.5(ii) and 2.3.6,

$$ \mu_{\mathcal{X}}\big\{\nu \in \mathcal{M}_1([0,\infty)) :\; \nu([0,\varepsilon]) \le \delta\big\} < \varepsilon, \tag{2.59} $$

and the set $\{\nu \in \mathcal{M}_1([0,\infty)) : \nu([0,\varepsilon]) \le \delta\}$ is a $\mu_{\mathcal{X}}$-continuity set in $\mathcal{M}_1([0,\infty))$. Hence by the Portmanteau Theorem (see, for example, Theorem 3.3.1 in [EK86]),

$$ \lim_{n\to\infty} \mu_{\mathcal{X}_n}\big\{\nu \in \mathcal{M}_1([0,\infty)) :\; \nu([0,\varepsilon]) \le \delta\big\} = \mu_{\mathcal{X}}\big\{\nu \in \mathcal{M}_1([0,\infty)) :\; \nu([0,\varepsilon]) \le \delta\big\} < \varepsilon. \tag{2.60} $$

That is, we have $v_\delta(\mathcal{X}_n) < \varepsilon$, for all but finitely many $n$, by (2.45). Therefore we find that $\limsup_{n\to\infty} v_\delta(\mathcal{X}_n) \le \varepsilon$. This holds for every $\varepsilon > v_\delta(\mathcal{X})$ satisfying (2.58), and hence for a set of $\varepsilon > v_\delta(\mathcal{X})$ for which $v_\delta(\mathcal{X})$ is a limit point, so we are done.

The following estimate will be used in the proofs of the pre-compactness characterization given in Theorem 2.4.1 and of Part (i) of Lemma 2.9.3.

Lemma 2.3.11. Let $\delta > 0$, $\varepsilon \ge 0$, and $\mathcal{X} = (X, r, \mu) \in \mathbb{M}$. If $v_\delta(\mathcal{X}) < \varepsilon$, then there exist $N < \frac{1}{\delta}$ and points $x_1, \dots, x_N \in X$ such that the following hold.

- For $i = 1, \dots, N$, $\mu(B_\varepsilon(x_i)) > \delta$, and $\mu\big(\bigcup_{i=1}^{N} B_{2\varepsilon}(x_i)\big) > 1 - \varepsilon$.
- For all $i, j = 1, \dots, N$ with $i \ne j$, $r(x_i, x_j) > \varepsilon$.

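Proof. Consider the set $D := \{x \in X : \mu(B_\varepsilon(x)) > \delta\}$. Since $v_\delta(\mathcal{X}) < \varepsilon$, Lemma 2.3.6 implies that $\mu(D) > 1 - \varepsilon$. Take a maximal $2\varepsilon$-separated net $\{x_1, \dots, x_N\} \subseteq D$, i.e.,

$$ D \subseteq \bigcup_{i=1}^{N} B_{2\varepsilon}(x_i), \tag{2.61} $$

and for all $i \ne j$,

$$ r(x_i, x_j) > 2\varepsilon, \tag{2.62} $$

while adding a further point of $D$ to the net would destroy (2.62). Such a net exists in every metric space (see, for example, [BBI01], p. 278). Since the balls $B_\varepsilon(x_i)$, $i = 1, \dots, N$, are pairwise disjoint by (2.62),

$$ 1 \ge \mu\Big(\bigcup_{i=1}^{N} B_\varepsilon(x_i)\Big) = \sum_{i=1}^{N} \mu\big(B_\varepsilon(x_i)\big) > N\delta, \tag{2.63} $$

and $N < \frac{1}{\delta}$ follows.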
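On a finite space, a net as in the proof can be found greedily. The Python sketch below is an illustration only: the function names are hypothetical, balls are assumed to be open, and the greedy selection guarantees that every point of $D$ lies within distance at most $2\varepsilon$ of the net (the boundary case of distance exactly $2\varepsilon$ is not excluded).

```python
from fractions import Fraction


def heavy_points(dist, mass, eps, delta):
    """Indices x with mu(B_eps(x)) > delta, for open balls {y : r(x, y) < eps}."""
    n = len(mass)
    return [
        x for x in range(n)
        if sum(mass[y] for y in range(n) if dist[x][y] < eps) > delta
    ]


def greedy_separated_net(dist, candidates, sep):
    """Scan the candidates once, keeping those more than `sep` away from all kept points.

    The result is maximal: every candidate that was skipped lies within
    distance `sep` of some kept point.
    """
    net = []
    for x in candidates:
        if all(dist[x][y] > sep for y in net):
            net.append(x)
    return net
```

Calling `greedy_separated_net(dist, heavy_points(dist, mass, eps, delta), 2 * eps)` returns indices of points playing the role of $x_1, \dots, x_N$; since the open $\varepsilon$-balls around them are disjoint and each carries mass more than $\delta$, their number times $\delta$ stays below 1.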

2.4. Compact sets in M

In this section we characterize the (pre-)compact sets in the Gromov-Prohorov topology.

Recall the distance distribution $w_{\mathcal{X}}$ from (2.40) and the modulus of mass distribution $v_\delta(\mathcal{X})$ from (2.41). Denote by $(\mathbb{X}_c, d_{\mathrm{GH}})$ the space of all isometry classes of compact metric spaces equipped with the Gromov-Hausdorff metric (see Chapter 1 for basic definitions).

The following characterizations, together with Theorem 2.5.1 in Section 2.5, which states the equivalence of the Gromov-Prohorov and the Gromov-weak topology, imply characterizations of pre-compact sets in the Gromov-weak topology.

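Theorem 2.4.1 (Pre-compactness characterization). Let $\Gamma$ be a family in $\mathbb{M}$. The following four conditions are equivalent.

(a) The family $\Gamma$ is pre-compact in the Gromov-Prohorov topology.

(b) The family $\{w_{\mathcal{X}};\ \mathcal{X} \in \Gamma\}$ is tight, and

$$ \sup_{\mathcal{X}\in\Gamma} v_\delta(\mathcal{X}) \xrightarrow[\delta\to 0]{} 0. \tag{2.64} $$

(c) For all $\varepsilon > 0$ there exists $N_\varepsilon \in \mathbb{N}$ such that for all $\mathcal{X} = (X, r, \mu) \in \Gamma$ there is a subset $X_{\varepsilon,\mathcal{X}} \subseteq X$ with
- $\mu(X_{\varepsilon,\mathcal{X}}) \ge 1 - \varepsilon$,
- $X_{\varepsilon,\mathcal{X}}$ can be covered by at most $N_\varepsilon$ balls of radius $\varepsilon$, and
- $X_{\varepsilon,\mathcal{X}}$ has diameter at most $N_\varepsilon$.

(d) For all $\varepsilon > 0$ and $\mathcal{X} := (X, r_X, \mu_X) \in \Gamma$ there exists a compact subset $K_{\varepsilon,\mathcal{X}} \subseteq X$ with
- $\mu_X(K_{\varepsilon,\mathcal{X}}) \ge 1 - \varepsilon$, and
- the family $\mathcal{K}_\varepsilon := \{K_{\varepsilon,\mathcal{X}};\ \mathcal{X} \in \Gamma\}$ is pre-compact in $(\mathbb{X}_c, d_{\mathrm{GH}})$.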

Remark 2.4.2. If $\Gamma = \{\mathcal{X}_1, \mathcal{X}_2, \dots\}$ then we can replace sup by lim sup in (2.64).

Remark 2.4.3.

(i) In the space of compact metric spaces equipped with a probability measure with full support, Proposition 1.11.5 states that Condition (d) is sufficient for pre-compactness.

(ii) Starting with Theorem 2.4.1, (b) characterizes tightness for the stronger topology given in [Stu06] based on certain $L^2$-Wasserstein metrics if one requires in addition uniform integrability of sampled mutual distance.

Similarly, (b) characterizes tightness in the space of measure preserving isometry classes of metric spaces equipped with a finite measure (rather than a probability measure) if one requires in addition tightness of the family of total masses.

Example 2.4.4. In the following we illustrate the two requirements for a family in $\mathbb{M}$ to be pre-compact, which are given in Theorem 2.4.1, by two counter-examples.

(i) Consider the isometry classes of the metric measure spaces $\mathcal{X}_n := (\{1,2\},\ r_n(1,2) = n,\ \mu_n\{1\} = \mu_n\{2\} = \tfrac12)$. A potential limit object would be a metric space with masses $\tfrac12$ within distance infinity. This clearly does not exist.

Indeed, the family $\{w_{\mathcal{X}_n} = \tfrac12\delta_0 + \tfrac12\delta_n;\ n \in \mathbb{N}\}$ is not tight, and hence $\{\mathcal{X}_n;\ n \in \mathbb{N}\}$ is not pre-compact in $\mathbb{M}$ by Condition (b) of Theorem 2.4.1.

(ii) Consider the isometry classes of the metric measure spaces $\mathcal{X}_n = (X_n, r_n, \mu_n)$ given for $n \in \mathbb{N}$ by

$$ X_n := \{1, \dots, 2^n\}, \qquad r_n(x,y) := \mathbf{1}\{x \ne y\}, \qquad \mu_n := 2^{-n} \sum_{i=1}^{2^n} \delta_i, \tag{2.65} $$

i.e., $X_n$ consists of $2^n$ points of mutual distance 1 and is equipped with a uniform measure on all points.

[Figure: the spaces $\mathcal{X}_1$, $\mathcal{X}_2$, $\mathcal{X}_3$, ..., consisting of 2, 4, 8, ... points carrying mass 1/2, 1/4, 1/8, ... each.]

A potential limit object would consist of infinitely many points of mutual distance 1 with a uniform measure. That means we would need the uniform distribution on a non-compact set, which does not exist.

Indeed, notice that for $\delta > 0$,

$$ v_\delta(\mathcal{X}_n) = \begin{cases} 0, & \delta < 2^{-n},\\ 1, & \delta \ge 2^{-n}, \end{cases} \tag{2.66} $$

so $\sup_{n\in\mathbb{N}} v_\delta(\mathcal{X}_n) = 1$, for all $\delta > 0$. Hence $\{\mathcal{X}_n;\ n \in \mathbb{N}\}$ does not fulfil Condition (b) of Theorem 2.4.1, and is therefore not pre-compact.
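The values in (2.66) can be checked numerically. The Python sketch below computes the modulus of mass distribution (2.41) for a finite space; it is an illustration under stated assumptions: the function name is hypothetical, balls are taken to be open, and the infimum is located by scanning the finitely many distance levels between which the mass of a ball does not change.

```python
from fractions import Fraction


def modulus_of_mass_distribution(dist, mass, delta):
    """Compute v_delta for a finite mm-space, with open balls {y : r(x, y) < eps}.

    Let d_0 = 0 < d_1 < ... be the distinct distances. For eps in
    (d_j, d_{j+1}] the open ball B_eps(x) is {y : r(x, y) <= d_j}, so the mass
    of "thin" points (ball mass <= delta) is constant there and non-increasing
    in j. The infimum in (2.41) is found on the first interval that contains
    some eps with thin mass <= eps.
    """
    n = len(mass)
    levels = sorted({dist[i][j] for i in range(n) for j in range(n)})
    for j, d in enumerate(levels):
        thin = sum(
            mass[x] for x in range(n)
            if sum(mass[y] for y in range(n) if dist[x][y] <= d) <= delta
        )
        candidate = max(d, thin)
        is_last = j == len(levels) - 1
        if is_last or candidate <= levels[j + 1]:
            return candidate
```

For the space of $2^n$ points at mutual distance 1 with uniform masses, this returns 0 for every $\delta < 2^{-n}$ and 1 for every $\delta \ge 2^{-n}$.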

Proof of Theorem 2.4.1. As before, we abbreviate $\mathcal{X} = (X, r_X, \mu_X)$. We prove four implications giving the statement.

(a) ⇒ (b). Assume that $\Gamma \subseteq \mathbb{M}$ is pre-compact in the Gromov-Prohorov topology.

To show that $\{w_{\mathcal{X}};\ \mathcal{X} \in \Gamma\}$ is tight, consider a sequence $\mathcal{X}_1, \mathcal{X}_2, \dots$ in $\Gamma$. Since $\Gamma$ is relatively compact by assumption, there is a converging subsequence, i.e., we find $\mathcal{X} \in \mathbb{M}$ such that $d_{\mathrm{GPr}}(\mathcal{X}_{n_k}, \mathcal{X}) \to 0$ as $k \to \infty$ along a suitable subsequence $(n_k)_{k\in\mathbb{N}}$. By Part (iii) of Proposition 2.3.8, $w_{\mathcal{X}_{n_k}} \Rightarrow w_{\mathcal{X}}$ as $k \to \infty$. As the sequence was chosen arbitrarily, it follows that $\{w_{\mathcal{X}};\ \mathcal{X} \in \Gamma\}$ is tight.

The second part of the assertion in (b) is proved by contradiction. Assume that $v_\delta(\mathcal{X})$ does not converge to 0 uniformly in $\mathcal{X} \in \Gamma$, as $\delta \to 0$. Then we find an $\varepsilon > 0$, a sequence $(\delta_n)_{n\in\mathbb{N}}$ converging to 0 and, for all $n \in \mathbb{N}$, some $\mathcal{X}_n \in \Gamma$ with

$$ v_{\delta_n}(\mathcal{X}_n) \ge \varepsilon. \tag{2.67} $$

By assumption, there is a subsequence $\{\mathcal{X}_{n_k};\ k \in \mathbb{N}\}$ and a metric measure space $\mathcal{X} \in \mathbb{M}$ such that $d_{\mathrm{GPr}}(\mathcal{X}_{n_k}, \mathcal{X}) \to 0$ as $k \to \infty$. By Parts (ii) and (iv) of Proposition 2.3.8, we find that $\limsup_{k\to\infty} v_{\delta_{n_k}}(\mathcal{X}_{n_k}) = 0$, which contradicts (2.67).

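(b) ⇒ (c). By assumption, for all $\varepsilon > 0$ there are $C(\varepsilon)$ with

\[
\sup_{\mathcal{X}\in\Gamma} w_{\mathcal{X}}\big([C(\varepsilon),\infty)\big) < \varepsilon, \tag{2.68}
\]

and $\delta(\varepsilon)$ such that

\[
\sup_{\mathcal{X}\in\Gamma} v_{\delta(\varepsilon)}(\mathcal{X}) < \varepsilon. \tag{2.69}
\]

Set

\[
X'_{\varepsilon,\mathcal{X}} := \big\{x \in X :\ \mu_X\big(B_{C(\varepsilon^2/4)}(x)\big) > 1 - \varepsilon/2\big\}. \tag{2.70}
\]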

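We claim that $\mu_X(X'_{\varepsilon,\mathcal{X}}) > 1 - \varepsilon/2$. If this were not the case, there would be $\mathcal{X} \in \Gamma$ with

\[
\begin{aligned}
w_{\mathcal{X}}\big([C(\tfrac14\varepsilon^2);\infty)\big)
&= \mu_X^{\otimes 2}\big\{(x,x') \in X \times X :\ r_X(x,x') \ge C(\tfrac14\varepsilon^2)\big\}\\
&\ge \mu_X^{\otimes 2}\big\{(x,x') :\ x \notin X'_{\varepsilon,\mathcal{X}},\ x' \notin B_{C(\varepsilon^2/4)}(x)\big\}\\
&\ge \tfrac{\varepsilon}{2}\,\mu_X\big(\complement X'_{\varepsilon,\mathcal{X}}\big)\\
&\ge \tfrac{\varepsilon^2}{4},
\end{aligned}
\tag{2.71}
\]

which contradicts (2.68), applied with $\varepsilon^2/4$ in place of $\varepsilon$. Furthermore, the diameter of $X'_{\varepsilon,\mathcal{X}}$ is bounded by $3C(\varepsilon^2/4)$. Indeed, otherwise we would find points $x, x' \in X'_{\varepsilon,\mathcal{X}}$ with $B_{C(\varepsilon^2/4)}(x) \cap B_{C(\varepsilon^2/4)}(x') = \emptyset$, which contradicts that

\[
\mu_X\big(B_{C(\varepsilon^2/4)}(x) \cap B_{C(\varepsilon^2/4)}(x')\big)
\ge 1 - \mu_X\big(\complement B_{C(\varepsilon^2/4)}(x)\big) - \mu_X\big(\complement B_{C(\varepsilon^2/4)}(x')\big)
> 1 - \varepsilon. \tag{2.72}
\]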

By Lemma 2.3.11, for all $\mathcal{X} = (X, r_X, \mu_X) \in \Gamma$, we can choose points $x_1, \ldots, x_{N^{\mathcal{X}}_\varepsilon} \in X$ with $N^{\mathcal{X}}_\varepsilon \le N(\varepsilon) := \lfloor 1/\delta(\varepsilon/2) \rfloor$, $r_X(x_i, x_j) > \varepsilon/2$, $1 \le i < j \le N^{\mathcal{X}}_\varepsilon$, and with $\mu_X\big(\bigcup_{i=1}^{N^{\mathcal{X}}_\varepsilon} B_\varepsilon(x_i)\big) > 1 - \varepsilon/2$.

Set

\[
X_{\varepsilon,\mathcal{X}} := X'_{\varepsilon,\mathcal{X}} \cap \bigcup_{i=1}^{N^{\mathcal{X}}_\varepsilon} B_\varepsilon(x_i). \tag{2.73}
\]

Then $\mu_X(X_{\varepsilon,\mathcal{X}}) > 1 - \varepsilon$. In addition, $X_{\varepsilon,\mathcal{X}}$ can be covered by at most $N(\varepsilon)$ balls of radius $\varepsilon$, and $X'_{\varepsilon,\mathcal{X}}$ has diameter at most $3C(\varepsilon^2/4)$, so the same is true for $X_{\varepsilon,\mathcal{X}}$.

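(c) ⇒ (d). Fix $\varepsilon > 0$, and set $\varepsilon_n := \varepsilon 2^{-(n+1)}$, for all $n \in \mathbb{N}$. By assumption we may choose for each $n \in \mathbb{N}$ an $N_{\varepsilon_n} \in \mathbb{N}$ such that for all $\mathcal{X} \in \Gamma$ there is a subset $X_{\varepsilon_n,\mathcal{X}} \subseteq X$ of diameter at most $N_{\varepsilon_n}$ with $\mu_X(X_{\varepsilon_n,\mathcal{X}}) \ge 1 - \varepsilon_n$, and such that $X_{\varepsilon_n,\mathcal{X}}$ can be covered by at most $N_{\varepsilon_n}$ balls of radius $\varepsilon_n$. Without loss of generality we may assume that all $\{X_{\varepsilon_n,\mathcal{X}};\ n \in \mathbb{N},\ \mathcal{X} \in \Gamma\}$ are closed. Otherwise we just take their closure. For every $\mathcal{X} \in \Gamma$ take compact sets $K_{\varepsilon_n,\mathcal{X}} \subseteq X$ with $\mu_X(K_{\varepsilon_n,\mathcal{X}}) > 1 - \varepsilon_n$. Then the set

\[
K_{\varepsilon,\mathcal{X}} := \bigcap_{n=1}^{\infty} \big(X_{\varepsilon_n,\mathcal{X}} \cap K_{\varepsilon_n,\mathcal{X}}\big) \tag{2.74}
\]

is compact since it is the intersection of a compact set with closed sets, and

\[
\mu_X(K_{\varepsilon,\mathcal{X}}) \ge 1 - \sum_{n=1}^{\infty} \big(\mu_X(\complement X_{\varepsilon_n,\mathcal{X}}) + \mu_X(\complement K_{\varepsilon_n,\mathcal{X}})\big) > 1 - \varepsilon. \tag{2.75}
\]

Consider

\[
\mathcal{K}_\varepsilon := \big\{K_{\varepsilon,\mathcal{X}};\ \mathcal{X} \in \Gamma\big\}. \tag{2.76}
\]

To show that $\mathcal{K}_\varepsilon$ is pre-compact we use the pre-compactness criterion given in Proposition 1.4.1, i.e., we have to show that $\mathcal{K}_\varepsilon$ is uniformly totally bounded. This means that the elements of $\mathcal{K}_\varepsilon$ have bounded diameter and for all $\varepsilon' > 0$ there is a number $N_{\varepsilon'}$ such that all elements of $\mathcal{K}_\varepsilon$ can be covered by $N_{\varepsilon'}$ balls of radius $\varepsilon'$. By definition, $K_{\varepsilon,\mathcal{X}}$ has diameter at most $N_{\varepsilon_1}$. So, take $\varepsilon' < \varepsilon$ and $n$ large enough for $\varepsilon_n < \varepsilon'$. Then $X_{\varepsilon_n,\mathcal{X}}$ as well as $K_{\varepsilon,\mathcal{X}}$ can be covered by $N_{\varepsilon_n}$ balls of radius $\varepsilon'$. So $\mathcal{K}_\varepsilon$ is pre-compact in $(\mathbb{X}_c, d_{\mathrm{GH}})$.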

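(d) ⇒ (a). The proof is in two steps. Assume first that all metric spaces $(X, r_X)$ such that there is $\mu_X \in \mathcal{M}_1(X)$ with $(X, r_X, \mu_X) \in \Gamma$ are compact, and that the family $\{(X, r_X) : (X, r_X, \mu_X) \in \Gamma\}$ is pre-compact in the Gromov-Hausdorff topology.

Under these assumptions we can choose for every sequence in $\Gamma$ a subsequence $(\mathcal{X}_m)_{m\in\mathbb{N}}$, $\mathcal{X}_m = (X_m, r_{X_m}, \mu_{X_m})$, and a metric space $(X, r_X)$, such that

\[
d_{\mathrm{GH}}(X, X_m) \xrightarrow{m\to\infty} 0. \tag{2.77}
\]

By Theorem 1.3.1, there are a compact metric space $(Z, r_Z)$ and isometric embeddings $\varphi_X, \varphi_{X_1}, \varphi_{X_2}, \ldots$ from $\mathrm{supp}(\mu_X), \mathrm{supp}(\mu_{X_1}), \mathrm{supp}(\mu_{X_2}), \ldots$, respectively, to $Z$, such that $d_{\mathrm{H}}\big(\varphi_X(\mathrm{supp}(\mu_X)), \varphi_{X_m}(\mathrm{supp}(\mu_{X_m}))\big) \to 0$, as $m \to \infty$. Hence the set $\{(\varphi_{X_m})_*\mu_{X_m}\}$ is pre-compact in $\mathcal{M}_1(Z)$ equipped with the weak topology. Therefore $\big((\varphi_{X_m})_*\mu_{X_m}\big)_{m\in\mathbb{N}}$ has a converging subsequence, and (a) follows in this case.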


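In the second step we consider the general case. Let $\varepsilon_n := 2^{-n}$, and fix for every $\mathcal{X} \in \Gamma$ and every $n \in \mathbb{N}$ a point $x \in K_{\varepsilon_n,\mathcal{X}}$. Put

\[
\mu_{X,n}(\cdot) := \mu_X(\cdot \cap K_{\varepsilon_n,\mathcal{X}}) + \big(1 - \mu_X(K_{\varepsilon_n,\mathcal{X}})\big)\,\delta_x(\cdot) \tag{2.78}
\]

and let $\mathcal{X}^n := (X, r_X, \mu_{X,n})$. By construction, for all $\mathcal{X} \in \Gamma$,

\[
d_{\mathrm{GPr}}\big(\mathcal{X}^n, \mathcal{X}\big) \le \varepsilon_n, \tag{2.79}
\]

and $\mu_{X,n}$ is supported by $K_{\varepsilon_n,\mathcal{X}}$. Hence by the first step, $\Gamma^n := \{\mathcal{X}^n;\ \mathcal{X} \in \Gamma\}$ is pre-compact in $\mathbb{X}_c$ equipped with the Gromov-Hausdorff topology, for all $n \in \mathbb{N}$. We can therefore find a converging subsequence in $\Gamma^n$, for all $n$.

By a diagonal argument we find a subsequence $(\mathcal{X}_m)_{m\in\mathbb{N}}$ with $\mathcal{X}_m = (X_m, r_{X_m}, \mu_{X_m})$ such that $(\mathcal{X}^n_m)_{m\in\mathbb{N}}$ converges for every $n \in \mathbb{N}$ to some metric measure space $\mathcal{Z}_n$. Pick a subsequence such that for all $n \in \mathbb{N}$ and $m \ge n$,

\[
d_{\mathrm{GPr}}\big(\mathcal{X}^n_m, \mathcal{Z}_n\big) \le \varepsilon_m. \tag{2.80}
\]

Then

\[
d_{\mathrm{GPr}}\big(\mathcal{X}^n_m, \mathcal{X}^n_{m'}\big) \le 2\varepsilon_n, \tag{2.81}
\]

for all $m, m' \ge n$. We conclude that $(\mathcal{X}_n)_{n\in\mathbb{N}}$ is a Cauchy sequence in $(\mathbb{M}, d_{\mathrm{GPr}})$ since $\sum_{n\ge 1} \varepsilon_n < \infty$. Indeed,

\[
d_{\mathrm{GPr}}\big(\mathcal{X}_n, \mathcal{X}_{n+1}\big)
\le d_{\mathrm{GPr}}\big(\mathcal{X}_n, \mathcal{X}^n_n\big) + d_{\mathrm{GPr}}\big(\mathcal{X}^n_n, \mathcal{X}^n_{n+1}\big) + d_{\mathrm{GPr}}\big(\mathcal{X}^n_{n+1}, \mathcal{X}_{n+1}\big)
\le 4\varepsilon_n. \tag{2.82}
\]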

Since (M, dGPr) is complete, this sequence converges and we are done.

2.5. Gromov-Prohorov and Gromov-weak topology coincide

In this section we show that the topologies induced by convergence of polynomials and convergence in the Gromov-Prohorov metric coincide. This implies that the characterizations of compact subsets of $\mathbb{M}$ in the Gromov-weak topology are covered by the corresponding characterizations with respect to the Gromov-Prohorov topology given in Theorem 2.4.1 and Proposition 2.8.5, respectively.

Recall from Definition 2.1.3 the distance matrix distribution νX of X ∈M.

Theorem 2.5.1. Let $\mathcal{X}, \mathcal{X}_1, \mathcal{X}_2, \ldots \in \mathbb{M}$. The following are equivalent:

(a) All polynomials $\Phi \in \Pi$ converge, i.e.,

\[
\Phi(\mathcal{X}_n) \xrightarrow{n\to\infty} \Phi(\mathcal{X}). \tag{2.83}
\]

(b) The Gromov-Prohorov metric converges, i.e.,

\[
d_{\mathrm{GPr}}\big(\mathcal{X}_n, \mathcal{X}\big) \xrightarrow{n\to\infty} 0. \tag{2.84}
\]

(c) The distance matrix distributions converge weakly in $\mathcal{M}_1\big(\mathbb{R}^{\binom{\mathbb{N}}{2}}\big)$, i.e.,

\[
\nu^{\mathcal{X}_n} \underset{n\to\infty}{\Longrightarrow} \nu^{\mathcal{X}}. \tag{2.85}
\]

Proof. (a) ⇒ (b). Assume that for all $\Phi \in \Pi$, $\Phi(\mathcal{X}_n) \to \Phi(\mathcal{X})$, as $n \to \infty$. It is enough to show that the sequence $(\mathcal{X}_n)_{n\in\mathbb{N}}$ is pre-compact with respect to the Gromov-Prohorov topology, since by Proposition 2.1.8, this would imply that all limit points coincide and equal $\mathcal{X}$. We need to check the two conditions guaranteeing pre-compactness given by Part (b) of Theorem 2.4.1.

By Part (iii) of Proposition 2.3.8, the map $\mathcal{X} \mapsto w_{\mathcal{X}}$ is continuous with respect to the Gromov-weak topology. Hence, the family $\{w_{\mathcal{X}_n};\ n \in \mathbb{N}\}$ is tight.

In addition, $\limsup_{n\to\infty} v_\delta(\mathcal{X}_n) \le v_\delta(\mathcal{X}) \to 0$, as $\delta \to 0$, by Parts (i) and (iv) of Proposition 2.3.8. By Remark 2.4.2, the latter implies (2.64), and we are done.

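(b) ⇒ (a). Let $\mathcal{X} = (X, r_X, \mu_X)$, $\mathcal{X}_1 = (X_1, r_1, \mu_1)$, $\mathcal{X}_2 = (X_2, r_2, \mu_2)$, ... be in $\mathbb{M}$. By Corollary 2.2.6 there are a complete and separable metric space $(Z, r_Z)$ and isometric embeddings $\varphi, \varphi_1, \varphi_2, \ldots$ from $(X, r_X)$, $(X_1, r_1)$, $(X_2, r_2)$, ..., respectively, to $(Z, r_Z)$ such that $(\varphi_n)_*\mu_n$ converges weakly to $\varphi_*\mu_X$ on $(Z, r_Z)$. Assume that $\Phi = \Phi^{n,\phi} \in \Pi$ for some $n \in \mathbb{N}$ and $\phi \in C_b\big([0,\infty)^{\binom{n}{2}}\big)$. Define the continuous map $r : Z^n \to [0,\infty)^{\binom{n}{2}}$ by $r(z_1, \ldots, z_n) := (r_Z(z_i, z_j))_{1 \le i < j \le n}$ and consider $\phi \circ r \in C_b(Z^n)$. Writing $m$ for the index of the sequence to avoid a clash with the degree $n$ of the polynomial, the weak convergence of $(\varphi_m)_*\mu_m$ to $\varphi_*\mu_X$ on $Z$ gives

\[
\Phi(\mathcal{X}_m) = \big((\varphi_m)_*\mu_m\big)^{\otimes n}[\phi \circ r] \xrightarrow{m\to\infty} \big(\varphi_*\mu_X\big)^{\otimes n}[\phi \circ r] = \Phi(\mathcal{X}), \tag{2.86}
\]

which finally proves the claim.

The equivalence of (a) and (c) is obvious by the definitions of weak convergence on probability measures and of polynomials (compare with Definition 2.7).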

2.6. Ultra-metric measure spaces

The following sub-spaces of metric measure spaces will be of special interest throughout the paper. A metric measure space $(X, r, \mu)$ is called ultra-metric if and only if the metric space $(X, r)$ is ultra-metric $\mu$-almost surely, i.e., (1.14) holds for $\mu$-almost all $u, v, w \in X$. Define

\[
\mathbb{U} := \big\{\mathcal{U} \in \mathbb{M} :\ \mathcal{U} \text{ is an ultra-metric measure space}\big\}. \tag{2.87}
\]

Remark 2.6.1 (Ultra-metric spaces are trees). Notice that there is a close connection between ultra-metric spaces and trees.

On the one hand, every complete ultra-metric space $(U, r_U)$ spans a path-connected complete metric space $(X, r_X)$ which satisfies the four point condition (1.13) and is such that $(U, r_U)$ is isometric to the set of leaves $X \setminus X^o$. On the other hand, given a rooted R-tree $(X, r_X, \rho)$, the level sets $X^t := \{x \in X :\ r_X(\rho, x) = t\}$, for some $t \ge 0$, form ultra-metric sub-spaces.

Because of this connection between ultra-metric spaces and real trees, ultra-metric spaces are often (especially in phylogenetic analysis) referred to as ultra-metric trees.

The next lemma implies that $\mathbb{U}$, equipped with the Gromov-weak topology, is again Polish.

Lemma 2.6.2. The sub-space U is a closed sub-space of M.

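Proof. Let $(\mathcal{U}_n)_{n\in\mathbb{N}}$ be a sequence in $\mathbb{U}$ and $\mathcal{X} \in \mathbb{M}$ such that $\mathcal{U}_n \to \mathcal{X}$ in the Gromov-weak topology, as $n \to \infty$. Equivalently, by Theorem 2.5.1(c), $\nu^{\mathcal{U}_n} \Rightarrow \nu^{\mathcal{X}}$ in the weak topology on $\mathcal{M}_1\big(\mathbb{R}_+^{\binom{\mathbb{N}}{2}}\big)$, as $n \to \infty$. Consider the open set of distance matrices whose first three entries violate the ultra-metric inequality,

\[
A := \big\{(r_{ij})_{1\le i<j} :\ r_{12} > r_{23} \vee r_{13} \ \text{ or } \ r_{23} > r_{12} \vee r_{13} \ \text{ or } \ r_{13} > r_{12} \vee r_{23}\big\}. \tag{2.88}
\]

By the Portmanteau Theorem (compare, for example, Theorem 3.3.1 in [EK86]), $\nu^{\mathcal{X}}(A) \le \liminf_{n\to\infty} \nu^{\mathcal{U}_n}(A) = 0$. Thus, (1.14) holds for $\mu^{\otimes 3}$-almost all triples $(u, v, w) \in X^3$. In other words, $\mathcal{X}$ is ultra-metric.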
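For a finite metric measure space the quantity $\nu^{\mathcal{X}}(A)$ used in this proof is simply the $\mu^{\otimes 3}$-mass of triples violating the ultra-metric inequality, and it can be computed by enumeration. The Python sketch below is an illustration under that reading; the function name and the two three-point examples are hypothetical and not taken from the text.

```python
from fractions import Fraction
from itertools import product


def ultrametric_violation_mass(dist, mass):
    """mu^{(x)3}-mass of triples (u, v, w) for which one of the three mutual
    distances strictly exceeds the maximum of the other two."""
    n = len(mass)
    total = Fraction(0)
    for u, v, w in product(range(n), repeat=3):
        r12, r13, r23 = dist[u][v], dist[u][w], dist[v][w]
        violated = (
            r12 > max(r23, r13)
            or r23 > max(r12, r13)
            or r13 > max(r12, r23)
        )
        if violated:
            total += mass[u] * mass[v] * mass[w]
    return total


# Hypothetical examples: an ultra-metric three-point space and three
# equally spaced points on a line (which is not ultra-metric).
uniform = [Fraction(1, 3)] * 3
ultra = [[0, 1, 2], [1, 0, 2], [2, 2, 0]]
line = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
```

A space is ultra-metric in the $\mu$-almost sure sense of this section exactly when the returned mass is zero.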

2.7. Compact metric measure spaces

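A metric measure space $(X, r, \mu)$ is called compact if and only if the metric space $(\mathrm{supp}(\mu), r|_{\mathrm{supp}(\mu)})$ is compact. Define

\[
\mathbb{M}_c := \big\{\mathcal{X} \in \mathbb{M} :\ \mathcal{X} \text{ is a compact measure space}\big\}. \tag{2.89}
\]

Remark 2.7.1 ($\mathbb{M}_c$ is not closed).

(i) If $\mathcal{X} = (X, r, \mu)$ is a finite metric measure space, i.e., $\#\mathrm{supp}(\mu) < \infty$, then $\mathcal{X} \in \mathbb{M}_c$.

(ii) Since $\mathbb{M}$ is separable, every element $\mathcal{X} \in \mathbb{M}$ can be approximated by a sequence of finite metric measure spaces. Hence the sub-space $\mathbb{M}_c$ is not closed.

To be in a position to characterize pre-compactness in $\mathbb{M}_c$, recall for $\mathcal{X} = (X, r, \mu) \in \mathbb{M}$ the distance distribution $w_{\mathcal{X}} := r_*\mu^{\otimes 2}$ from Definition 2.3.1(i).

Proposition 2.7.2 (Pre-compactness in $\mathbb{M}_c$). A set $\Gamma \subseteq \mathbb{M}_c$ is pre-compact in the Gromov-weak topology on $\mathbb{M}_c$ if the following two conditions are satisfied.

(i) $\{w_{\mathcal{X}} :\ \mathcal{X} \in \Gamma\}$ is tight in $\mathcal{M}_1(\mathbb{R}_+)$.

(ii) For all $\varepsilon > 0$ there exists an $N_\varepsilon \in \mathbb{N}$ such that for all $(X, r, \mu) \in \Gamma$, the metric space $(\mathrm{supp}(\mu), r)$ can be covered by $N_\varepsilon$ balls of radius $\varepsilon$.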


The proof is based on two lemmata. Recall that for a metric space $(X, r)$ an $\varepsilon$-separated set is a subset $X' \subseteq X$ such that $r(x', y') \ge \varepsilon$, for all $x', y' \in X'$ with $x' \ne y'$. Of course, each non-empty metric space has at least one $\varepsilon$-separated set.

Lemma 2.7.3 (Relation between $\varepsilon$-balls and $\varepsilon$-separated nets). Fix $N \in \mathbb{N}$, a metric space $(X, r)$ with $\#X \ge N + 1$ and $\varepsilon > 0$. The following hold.

(a) If $(X, r)$ can be covered by $N$ balls of radius $\varepsilon$, then $(X, r)$ has no $2\varepsilon$-separated set of cardinality at least $N + 1$.

(b) If $(X, r)$ has no $\varepsilon$-separated set of cardinality at least $N + 1$, then $(X, r)$ can be covered by $N$ balls of radius $\varepsilon'$, for all $\varepsilon' > \varepsilon$.

Proof. (a) Assume that $x_1, \ldots, x_N \in X$ are such that $X = \bigcup_{i=1}^{N} B_\varepsilon(x_i)$. Choose $N + 1$ distinct points $y_1, \ldots, y_{N+1} \in X$. By the pigeonhole principle, two of the points must fall into the same ball $B_\varepsilon(x_i)$, for some $i = 1, \ldots, N$, and are therefore at distance smaller than $2\varepsilon$. Hence $\{y_1, \ldots, y_{N+1}\}$ is not $2\varepsilon$-separated. Since $y_1, \ldots, y_{N+1} \in X$ were chosen arbitrarily, the claim follows.

(b) We proceed by contradiction. Let $K$ be the maximal possible cardinality of an $\varepsilon$-separated set in $(X, r)$. By assumption, $K \le N$. Assume that $S^K_\varepsilon := \{x_1, \ldots, x_K\}$ is an $\varepsilon$-separated set in $(X, r)$. We claim that $X = \bigcup_{i=1}^{K} B_{\varepsilon'}(x_i)$, for all $\varepsilon' > \varepsilon$. Indeed, let $\varepsilon' > \varepsilon$, and assume, to the contrary, that $y \in X$ is such that $r(y, x_i) \ge \varepsilon'$, for all $i = 1, \ldots, K$. Then $S^K_\varepsilon \cup \{y\}$ is an $\varepsilon$-separated set of cardinality $K + 1$, which contradicts the maximality of $K$.
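The argument in part (b) is constructive for finite point sets: a separated set that cannot be enlarged automatically provides the centres of a covering. The following Python sketch illustrates this with a greedy construction in the Euclidean plane. The greedy procedure yields a set that cannot be enlarged, which is weaker than having maximal cardinality but enough for the covering property. The function name and the choice of the Euclidean metric are assumptions made for the example only.

```python
import math


def greedy_separated_set(points, eps):
    """Greedily build an eps-separated subset that cannot be enlarged.

    A point is added whenever it is at distance >= eps from every point
    chosen so far. Each rejected point is therefore at distance < eps from
    some chosen point, so balls of radius eps' > eps around the chosen
    points cover the whole set.
    """
    net = []
    for p in points:
        if all(math.dist(p, q) >= eps for q in net):
            net.append(p)
    return net


# Hypothetical usage: a 5 x 5 grid with mesh 1/2 and eps = 1.
grid = [(0.5 * i, 0.5 * j) for i in range(5) for j in range(5)]
net = greedy_separated_set(grid, 1.0)
```

By part (a) of the lemma, the size of such a net is in turn bounded by the number of balls of radius $\varepsilon/2$ needed to cover the space.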

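Lemma 2.7.4. Fix $\varepsilon > 0$ and $N \in \mathbb{N}$. Let $\mathcal{X} = (X, r, \mu)$, $\mathcal{X}_1 = (X_1, r_1, \mu_1)$, $\mathcal{X}_2 = (X_2, r_2, \mu_2)$, ... be elements of $\mathbb{M}$ such that $\mathcal{X}_n \to \mathcal{X}$ in the Gromov-weak topology, as $n \to \infty$. If $(\mathrm{supp}(\mu_1), r_1)$, $(\mathrm{supp}(\mu_2), r_2)$, ... can be covered by $N$ balls of radius $\varepsilon$, then $(\mathrm{supp}(\mu), r)$ can be covered by $N$ balls of radius $2\varepsilon'$, for all $\varepsilon' > \varepsilon$.

Proof. Fix $\varepsilon > 0$ and $N \in \mathbb{N}$, and let $\mathcal{X} = (X, r, \mu)$, $\mathcal{X}_1 = (X_1, r_1, \mu_1)$, $\mathcal{X}_2 = (X_2, r_2, \mu_2)$, ... be elements of $\mathbb{M}$ such that $\mathcal{X}_n \to \mathcal{X}$ in the Gromov-weak topology, as $n \to \infty$.

Assume that for each $n \in \mathbb{N}$, the metric space $(\mathrm{supp}(\mu_n), r_n)$ can be covered by $N$ balls of radius $\varepsilon$. Then by Lemma 2.7.3(a), there is no $n \in \mathbb{N}$ for which $(\mathrm{supp}(\mu_n), r_n)$ has a $2\varepsilon$-separated set of cardinality $N + 1$. Set $B := (2\varepsilon, \infty)^{\binom{N+1}{2}}$. Notice that for $\mathcal{Y} = (Y, r_Y, \mu_Y) \in \mathbb{M}$, $(\mathrm{supp}(\mu_Y), r_Y)$ has a $2\varepsilon$-separated set of cardinality $N + 1$ if and only if $\Phi^{N+1,\mathbf{1}_B}(\mathcal{Y}) > 0$. Hence, since $B$ is open,

\[
0 \le \Phi^{N+1,\mathbf{1}_B}(\mathcal{X}) \le \liminf_{n\to\infty} \Phi^{N+1,\mathbf{1}_B}(\mathcal{X}_n) = 0, \tag{2.90}
\]

by Theorem 4.1.3(d), i.e., $\Phi^{N+1,\mathbf{1}_B}(\mathcal{X}) = 0$. By Lemma 2.7.3, $(\mathrm{supp}(\mu), r)$ can therefore be covered by $N$ balls of radius $2\varepsilon'$, for all $\varepsilon' > \varepsilon$.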


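Proof of Proposition 2.7.2. Let a set $\Gamma \subseteq \mathbb{M}_c$ be such that the family $\{w_{\mathcal{X}} :\ \mathcal{X} \in \Gamma\}$ is tight in $\mathcal{M}_1(\mathbb{R}_+)$, and for all $\varepsilon > 0$ there is an $N_\varepsilon \in \mathbb{N}$ such that for all $\mathcal{X} = (X, r, \mu) \in \Gamma$, there exists $S^{\mathcal{X}}_\varepsilon := \{x^{\mathcal{X}}_1, \ldots, x^{\mathcal{X}}_{N_\varepsilon}\} \subseteq X$ such that $\mathrm{supp}(\mu) \subseteq \bigcup_{i=1}^{N_\varepsilon} B_\varepsilon(x^{\mathcal{X}}_i)$.

For $\varepsilon > 0$ and $\mathcal{X} = (X, r, \mu) \in \Gamma$, put

\[
X_{\mathcal{X},\varepsilon} := \bigcup_{x \in S^{\mathcal{X}}_\varepsilon;\ \mu(B_\varepsilon(x)) > \varepsilon/N_\varepsilon} B_\varepsilon(x). \tag{2.91}
\]

Then $\mu(X_{\mathcal{X},\varepsilon}) \ge 1 - \varepsilon$, and $X_{\mathcal{X},\varepsilon}$ can also be covered by $N_\varepsilon$ balls of radius $\varepsilon$. We claim that the set $\{\mathrm{diam}(X_{\mathcal{X},\varepsilon});\ \mathcal{X} \in \Gamma\}$ is bounded. To see this, assume to the contrary that we find a sequence $(\mathcal{X}_n)_{n\in\mathbb{N}}$ in $\Gamma$ with $\mathrm{diam}(X_{\mathcal{X}_n,\varepsilon}) \ge n$. Then

\[
\sup_{\mathcal{X}\in\Gamma} w_{\mathcal{X}}\big([n - 2\varepsilon, \infty)\big) \ge \frac{\varepsilon^2}{N_\varepsilon^2} > 0, \tag{2.92}
\]

for all $n \in \mathbb{N}$, which contradicts the tightness assumption on the distance distribution. We can therefore apply Theorem 2.4.1(c) to conclude that $\Gamma$ is pre-compact in the Gromov-weak topology on $\mathbb{M}$.

It is therefore enough to show that all limit points are compact. To see this, take $\mathcal{X} \in \mathbb{M}$ and $\mathcal{X}_1, \mathcal{X}_2, \ldots \in \Gamma$ such that $\mathcal{X}_n \to \mathcal{X}$ in the Gromov-weak topology, as $n \to \infty$, and let $\varepsilon > 0$. By Assumption (ii) together with Lemma 2.7.4, $(\mathrm{supp}(\mu), r)$ can be covered by $N_{\varepsilon/3}$ balls of radius $\varepsilon$. Therefore, $\mathcal{X}$ is totally bounded, which implies $\mathcal{X} \in \mathbb{M}_c$, and we are done.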

2.8. Distributions of random metric measure spaces

From Theorem 2.1.10 and Definition 2.1.9 we immediately conclude the characterization of weak convergence for a sequence of probability measures on $\mathbb{M}$.

Corollary 2.8.1 (Characterization of weak convergence). A sequence $(\mathbf{P}_n)_{n\in\mathbb{N}}$ in $\mathcal{M}_1(\mathbb{M})$ converges weakly if and only if

(i) the family $\{\mathbf{P}_n;\ n \in \mathbb{N}\}$ is relatively compact in $\mathcal{M}_1(\mathbb{M})$, and

(ii) for all polynomials $\Phi \in \Pi$, $(\mathbf{P}_n[\Phi])_{n\in\mathbb{N}}$ converges in $\mathbb{R}$.

Proof. The “only if” direction is clear, as polynomials are bounded and continuous functions by definition. To see the converse, recall from Lemma 3.4.3 in [EK86] that given a relatively compact sequence of probability measures, each separating family of bounded continuous functions is convergence determining.

While Condition (ii) of the characterization of convergence given in Corollary 2.8.1 can be checked in particular examples, we still need a manageable characterization of tightness on $\mathcal{M}_1(\mathbb{M})$, which we can conclude from Theorem 2.4.1 together with Theorem 2.5.1. It will be given in terms of the distance distribution and the modulus of mass distribution.


Theorem 2.8.2 (Characterization of tightness). A set $\mathbf{A} \subseteq \mathcal{M}_1(\mathbb{M})$ is tight if and only if the following holds:

(i) The family $\{\mathbf{P}[w_{\mathcal{X}}] :\ \mathbf{P} \in \mathbf{A}\}$ is tight in $\mathcal{M}_1(\mathbb{R})$.

(ii) For all $\varepsilon > 0$ there exists a $\delta = \delta(\varepsilon) > 0$ such that

\[
\sup_{\mathbf{P}\in\mathbf{A}} \mathbf{P}\big[v_\delta(\mathcal{X})\big] < \varepsilon. \tag{2.93}
\]

Remark 2.8.3. If $\mathbf{A} = \{\mathbf{P}_1, \mathbf{P}_2, \ldots\}$ then we can replace sup by lim sup in (2.93).

We will illustrate Theorem 2.8.2 with the example of Λ-coalescent measure trees in Chapter 4.1, with examples of trees corresponding to spatially structured coalescents in Section 4.2, their scaling limit in Sections 4.3 and 4.4 and of evolving coalescents in Chapter 4.

Remark 2.8.4. Starting with Theorem 2.8.2 one easily characterizes tightness for the stronger topology given in [Stu06] based on certain $L^2$-Wasserstein metrics if one requires in addition to (i) and (ii) uniform integrability of sampled mutual distance.

Similarly, with Theorem 2.8.2 one characterizes tightness in the space of measure preserving isometry classes of metric spaces equipped with a finite measure (rather than a probability measure) if one requires in addition tightness of the family of total masses (compare also with Remark 2.4.3).

In Theorem 2.4.1 we have given a characterization for relative compactness in $\mathbb{M}$ with respect to the Gromov-Prohorov topology. This characterization extends to the following tightness characterization in $\mathcal{M}_1(\mathbb{M})$, which is equivalent to Theorem 2.8.2 by the equivalence of the Gromov-Prohorov and the Gromov-weak topology shown in Theorem 2.5.1 (Section 2.5).

Proposition 2.8.5 (Tightness). A set $\mathbf{A} \subseteq \mathcal{M}_1(\mathbb{M})$ is tight with respect to the Gromov-weak topology on $\mathbb{M}$ if and only if for all $\varepsilon > 0$ there exist $\delta > 0$ and $C > 0$ such that

\[
\sup_{\mathbf{P}\in\mathbf{A}} \mathbf{P}\big[v_\delta(\mathcal{X}) + w_{\mathcal{X}}([C;\infty))\big] < \varepsilon. \tag{2.94}
\]

Proof. For the “only if” direction assume that $\mathbf{A}$ is tight and fix $\varepsilon > 0$. By definition, we find a compact set $\Gamma_\varepsilon$ in $(\mathbb{M}, d_{\mathrm{GPr}})$ such that $\inf_{\mathbf{P}\in\mathbf{A}} \mathbf{P}(\Gamma_\varepsilon) > 1 - \varepsilon/4$. Since $\Gamma_\varepsilon$ is compact there are, by part (b) of Theorem 2.4.1, $\delta = \delta(\varepsilon) > 0$ and $C = C(\varepsilon) > 0$ such that $v_\delta(\mathcal{X}) < \varepsilon/4$ and $w_{\mathcal{X}}([C,\infty)) < \varepsilon/4$, for all $\mathcal{X} \in \Gamma_\varepsilon$. Furthermore both $v_\delta(\cdot)$ and $w_{\cdot}([C,\infty))$ are bounded above by 1. Hence for all $\mathbf{P} \in \mathbf{A}$,

\[
\mathbf{P}\big[v_\delta(\mathcal{X}) + w_{\mathcal{X}}([C,\infty))\big]
= \mathbf{P}\big[v_\delta(\mathcal{X}) + w_{\mathcal{X}}([C,\infty));\ \Gamma_\varepsilon\big] + \mathbf{P}\big[v_\delta(\mathcal{X}) + w_{\mathcal{X}}([C,\infty));\ \complement\Gamma_\varepsilon\big]
< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \tag{2.95}
\]

Therefore (2.94) holds.

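For the “if” direction assume (2.94) is true and fix $\varepsilon > 0$. For all $n \in \mathbb{N}$, there are $\delta_n > 0$ and $C_n > 0$ such that

\[
\sup_{\mathbf{P}\in\mathbf{A}} \mathbf{P}\big[v_{\delta_n}(\mathcal{X}) + w_{\mathcal{X}}([C_n,\infty))\big] < 2^{-2n}\varepsilon^2. \tag{2.96}
\]

By Chebyshev's inequality, we conclude that for all $n \in \mathbb{N}$,

\[
\sup_{\mathbf{P}\in\mathbf{A}} \mathbf{P}\big\{\mathcal{X} :\ v_{\delta_n}(\mathcal{X}) + w_{\mathcal{X}}([C_n,\infty)) > 2^{-n}\varepsilon\big\} < 2^{-n}\varepsilon. \tag{2.97}
\]

By the equivalence of (a) and (b) in Theorem 2.4.1 the closure of

\[
\Gamma_\varepsilon := \bigcap_{n=1}^{\infty} \big\{\mathcal{X} :\ v_{\delta_n}(\mathcal{X}) + w_{\mathcal{X}}([C_n,\infty)) \le 2^{-n}\varepsilon\big\} \tag{2.98}
\]

is compact. We conclude

\[
\mathbf{P}\big(\overline{\Gamma}_\varepsilon\big) \ge \mathbf{P}\big(\Gamma_\varepsilon\big)
\ge 1 - \sum_{n=1}^{\infty} \mathbf{P}\big\{\mathcal{X} :\ v_{\delta_n}(\mathcal{X}) + w_{\mathcal{X}}([C_n,\infty)) > \varepsilon/2^n\big\}
> 1 - \varepsilon. \tag{2.99}
\]

Since $\varepsilon$ was arbitrary, $\mathbf{A}$ is tight.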

2.9. Equivalent metrics

In Section 2.2 we have seen that $\mathbb{M}$ equipped with the Gromov-Prohorov metric is separable and complete. In this section we conclude the paper by presenting further metrics (not necessarily complete) which are all equivalent to the Gromov-Prohorov metric and which may in some situations be easier to work with.

The Eurandom metric.¹ Recall from Definition 2.1.4 the algebra of polynomials, i.e., functions which evaluate distances of finitely many points sampled from a metric measure space. By Proposition 2.1.8, polynomials separate points in $\mathbb{M}$. Consequently, two metric measure spaces are different if and only if the distributions of sampled finite subspaces are different.

¹ When the authors of [GPW06a] first discussed how to metrize the Gromov-weak topology the Eurandom metric came up. Since the discussion took place during a meeting at Eurandom, we decided to name the metric accordingly.

We therefore define

\[
d_{\mathrm{Eur}}\big(\mathcal{X}, \mathcal{Y}\big) := \inf_{\tilde\mu} \inf\Big\{\varepsilon > 0 :\ \tilde\mu^{\otimes 2}\big\{(x, y), (x', y') \in (X \times Y)^2 :\ |r_X(x, x') - r_Y(y, y')| \ge \varepsilon\big\} < \varepsilon\Big\}, \tag{2.100}
\]

where the infimum is over all couplings $\tilde\mu$ of $\mu_X$ and $\mu_Y$. We will refer to $d_{\mathrm{Eur}}$ as the Eurandom metric.

Not only is dEur a metric on M, it also generates the Gromov-Prohorovtopology.

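Proposition 2.9.1 (Equivalent metrics). The distance $d_{\mathrm{Eur}}$ is a metric on $\mathbb{M}$. It is equivalent to $d_{\mathrm{GPr}}$, i.e., the generated topology is the Gromov-weak topology.

Before we prove the proposition we give an example to show that the Eurandom metric is not complete.

Example 2.9.2 (Eurandom metric is not complete). Let for all $n \in \mathbb{N}$, $\mathcal{X}_n := (X_n, r_n, \mu_n)$ be as in Example 2.4.4(ii). For all $n \in \mathbb{N}$,

\[
d_{\mathrm{Eur}}\big(\mathcal{X}_n, \mathcal{X}_{n+1}\big)
\le \inf\Big\{\varepsilon > 0 :\ (\mu_n \otimes \mu_{n+1})^{\otimes 2}\big\{|\mathbf{1}\{x = x'\} - \mathbf{1}\{y = y'\}| \ge \varepsilon\big\} \le \varepsilon\Big\}
\le 2^{-(n-1)}, \tag{2.101}
\]

i.e., $(\mathcal{X}_n)_{n\in\mathbb{N}}$ is a Cauchy sequence for $d_{\mathrm{Eur}}$ which does not converge. Hence $(\mathbb{M}, d_{\mathrm{Eur}})$ is not complete. The Gromov-Prohorov metric was shown to be complete, and hence the above sequence is not Cauchy in this metric. Indeed,

\[
d_{\mathrm{GPr}}\big(\mathcal{X}_n, \mathcal{X}_{n+1}\big) = 2^{-1}, \tag{2.102}
\]

which does not converge to 0, as $n \to \infty$.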
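The bound in (2.101) can be made explicit. Assuming, as in Example 2.4.4(ii), that $\mathcal{X}_n$ consists of $2^n$ points of mutual distance 1 with uniform masses, the distance is $r_n(x, x') = \mathbf{1}\{x \ne x'\}$, and under the product coupling the event in (2.101) is that exactly one of the two sampled pairs coincides. The Python sketch below (function name hypothetical) computes the probability of this event by brute-force enumeration.

```python
from fractions import Fraction


def mismatch_probability(n):
    """Probability, under the product coupling of the uniform measures on
    2**n and 2**(n+1) points, that exactly one of the events {x == x'} and
    {y == y'} occurs for two independently sampled pairs (x, y), (x', y')."""
    a, b = 2 ** n, 2 ** (n + 1)
    count = 0
    for x in range(a):
        for xp in range(a):
            for y in range(b):
                for yp in range(b):
                    if (x == xp) != (y == yp):
                        count += 1
    return Fraction(count, (a * b) ** 2)
```

Counting the two cases separately gives the closed form $3 \cdot 2^{-(n+1)} - 2^{-2n}$, which is summable in $n$ and bounded by $2^{-(n-1)}$, so the sequence is indeed Cauchy for $d_{\mathrm{Eur}}$.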

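To prepare the proof of Proposition 2.9.1, we provide bounds on the introduced “distances”.

Lemma 2.9.3 (Equivalence). Let $\mathcal{X}, \mathcal{Y} \in \mathbb{M}$, and $\delta \in (0, \tfrac12)$.

(i) If $d_{\mathrm{Eur}}(\mathcal{X}, \mathcal{Y}) < \delta^4$ then $d_{\mathrm{GPr}}(\mathcal{X}, \mathcal{Y}) < 12\big(2v_\delta(\mathcal{X}) + \delta\big)$.

(ii)

\[
d_{\mathrm{Eur}}\big(\mathcal{X}, \mathcal{Y}\big) \le 2\,d_{\mathrm{GPr}}\big(\mathcal{X}, \mathcal{Y}\big). \tag{2.103}
\]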

Proof. (i) The Gromov-Prohorov metric relies on the Prohorov metric of embeddings of $\mu_X$ and $\mu_Y$ in $\mathcal{M}_1(Z)$ in a metric space $(Z, r_Z)$. This is in contrast to the Eurandom metric, which is based on an optimal coupling of the two measures $\mu_X$ and $\mu_Y$ without referring to a space of measures over a third metric space. Since we want to bound the Gromov-Prohorov metric in terms of the Eurandom metric, the main goal of the proof is to construct a suitable metric space $(Z, r_Z)$.

The construction proceeds in three steps. We start in Step 1 with finding a suitable $\varepsilon$-net $\{x_1, \ldots, x_N\}$ in $(X, r_X)$, and show that this net has a suitable corresponding net $\{y_1, \ldots, y_N\}$ in $(Y, r_Y)$. In Step 2 we then verify that these nets have the property that $r_X(x_i, x_j) \approx r_Y(y_i, y_j)$ (where the '≈' is made precise below) and that $\delta$-balls around these nets carry almost all $\mu_X$- and $\mu_Y$-mass. Finally, in Step 3 we will use these nets to define a metric space $(Z, r_Z)$ containing both $(X, r_X)$ and $(Y, r_Y)$, and bound the Prohorov metric of the images of $\mu_X$ and $\mu_Y$.

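Step 1 (Construction of suitable $\varepsilon$-nets in $X$ and $Y$). Fix $\delta \in (0, \tfrac12)$. Assume that $\mathcal{X}, \mathcal{Y} \in \mathbb{M}$ are such that $d_{\mathrm{Eur}}(\mathcal{X}, \mathcal{Y}) < \delta^4$. By definition, we find a coupling $\tilde\mu$ of $\mu_X$ and $\mu_Y$ such that

\[
\tilde\mu^{\otimes 2}\big\{(x_1, y_1), (x_2, y_2) :\ |r_X(x_1, x_2) - r_Y(y_1, y_2)| > 2\delta\big\} < \delta^4. \tag{2.104}
\]

Set $\varepsilon := 4v_\delta(\mathcal{X}) \ge 0$. By Lemma 2.3.11, there are $N \le \lfloor 1/\delta \rfloor$ points $x_1, \ldots, x_N \in X$ with pairwise distances at least $\varepsilon$,

\[
\mu_X\big(B_\varepsilon(x_i)\big) > \delta, \tag{2.105}
\]

for all $i = 1, \ldots, N$, and

\[
\mu_X\Big(\bigcup_{i=1}^{N} B_\varepsilon(x_i)\Big) \ge 1 - \varepsilon. \tag{2.106}
\]

Put $D := \bigcup_{i=1}^{N} B_\varepsilon(x_i)$. We claim that for every $i = 1, \ldots, N$ there is $y_i \in Y$ with

\[
\tilde\mu\big(B_\varepsilon(x_i) \times B_{2(\varepsilon+\delta)}(y_i)\big) \ge (1 - \delta^2)\,\mu_X\big(B_\varepsilon(x_i)\big). \tag{2.107}
\]

Indeed, assume the assertion is not true for some $1 \le i \le N$. Then, for all $y \in Y$,

\[
\tilde\mu\big(B_\varepsilon(x_i) \times \complement B_{2(\varepsilon+\delta)}(y)\big) \ge \delta^2\,\mu_X\big(B_\varepsilon(x_i)\big), \tag{2.108}
\]

which implies that

\[
\begin{aligned}
\tilde\mu^{\otimes 2}&\big\{(x', y'), (x'', y'') :\ |r_X(x', x'') - r_Y(y', y'')| > 2\delta\big\}\\
&\ge \tilde\mu^{\otimes 2}\big\{(x', y'), (x'', y'') :\ x', x'' \in B_\varepsilon(x_i),\ y'' \notin B_{2(\varepsilon+\delta)}(y')\big\}\\
&\ge \mu_X\big(B_\varepsilon(x_i)\big)^2\,\delta^2\\
&> \delta^4,
\end{aligned}
\tag{2.109}
\]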

by (2.61) and (2.108), which contradicts (2.104).

Step 2 (Distortion of $\{x_1, \ldots, x_N\}$ and $\{y_1, \ldots, y_N\}$). Assume that $\{x_1, \ldots, x_N\}$ and $\{y_1, \ldots, y_N\}$ are such that (2.105) through (2.107) hold. We claim that then

\[
\big|r_X(x_i, x_j) - r_Y(y_i, y_j)\big| \le 6(\varepsilon + \delta), \tag{2.110}
\]

for all $i, j = 1, \ldots, N$. Assume that (2.110) is not true for some pair $(i, j)$. Then for all $x' \in B_\varepsilon(x_i)$, $x'' \in B_\varepsilon(x_j)$, $y' \in B_{2(\varepsilon+\delta)}(y_i)$, and $y'' \in B_{2(\varepsilon+\delta)}(y_j)$,

\[
\big|r_X(x', x'') - r_Y(y', y'')\big| > 6(\varepsilon + \delta) - 2\varepsilon - 4(\varepsilon + \delta) = 2\delta. \tag{2.111}
\]

Then

\[
\begin{aligned}
\tilde\mu^{\otimes 2}&\big\{(x', y'), (x'', y'') :\ |r_X(x', x'') - r_Y(y', y'')| > 2\delta\big\}\\
&\ge \tilde\mu^{\otimes 2}\big\{(x', y'), (x'', y'') :\ x' \in B_\varepsilon(x_i),\ x'' \in B_\varepsilon(x_j),\ y' \in B_{2(\varepsilon+\delta)}(y_i),\ y'' \in B_{2(\varepsilon+\delta)}(y_j)\big\}\\
&= \tilde\mu\big(B_\varepsilon(x_i) \times B_{2(\varepsilon+\delta)}(y_i)\big)\,\tilde\mu\big(B_\varepsilon(x_j) \times B_{2(\varepsilon+\delta)}(y_j)\big)\\
&> \delta^2 (1 - \delta)^2\\
&> \delta^4,
\end{aligned}
\tag{2.112}
\]

where we used (2.107), (2.61) and $\delta < \tfrac12$. Since (2.112) contradicts (2.104),

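we are done.

Step 3 (Definition of a suitable metric space $(Z, r_Z)$). Define the relation $R := \{(x_i, y_i) :\ i = 1, \ldots, N\}$ between $X$ and $Y$ and consider the metric space $(Z, r_Z)$ defined by $Z := X \sqcup Y$ and $r_Z := r^R_{X \sqcup Y}$, given in Lemma 1.3.2. Choose isometric embeddings $\varphi_X$ and $\varphi_Y$ from $(X, r_X)$ and $(Y, r_Y)$, respectively, into $(Z, r_Z)$. As $\mathrm{dis}(R) \le 6(\varepsilon + \delta)$ (see (1.3) for the definition), by Lemma 1.3.2, $r_Z(\varphi_X(x_i), \varphi_Y(y_i)) \le 3(\varepsilon + \delta)$, for all $i = 1, \ldots, N$.

If $x \in X$ and $y \in Y$ are such that $r_Z(\varphi_X(x), \varphi_Y(y)) \ge 6(\varepsilon + \delta)$ and $r_X(x, x_i) < \varepsilon$, then

\[
\begin{aligned}
r_Y(y, y_i) &\ge r_Z(\varphi_X(x), \varphi_Y(y)) - r_X(x, x_i) - r_Z(\varphi_X(x_i), \varphi_Y(y_i))\\
&\ge 6(\varepsilon + \delta) - \varepsilon - 3(\varepsilon + \delta)\\
&\ge 2(\varepsilon + \delta),
\end{aligned}
\tag{2.113}
\]

and so for all $x \in B_\varepsilon(x_i)$,

\[
\big\{y \in Y :\ r_Z(\varphi_X(x), \varphi_Y(y)) \ge 6(\varepsilon + \delta)\big\} \subseteq \complement B_{2(\varepsilon+\delta)}(y_i). \tag{2.114}
\]

Let $\hat\mu$ be the probability measure on $Z \times Z$ defined by $\hat\mu(A \times B) := \tilde\mu\big(\varphi_X^{-1}(A) \times \varphi_Y^{-1}(B)\big)$, for all $A, B \in \mathcal{B}(Z)$. Therefore, by (2.110), (2.114), (2.107) and as $N \le \lfloor 1/\delta \rfloor$,

\[
\begin{aligned}
\hat\mu\big\{(z, z') :\ r_Z(z, z') \ge 6(\varepsilon + \delta)\big\}
&\le \hat\mu\big(\varphi_X(\complement D) \times \varphi_Y(Y)\big) + \hat\mu\Big(\bigcup_{i=1}^{N} B_\varepsilon(\varphi_X(x_i)) \times \complement B_{2(\varepsilon+\delta)}(\varphi_Y(y_i))\Big)\\
&\le \varepsilon + \sum_{i=1}^{N} \mu_X\big(B_\varepsilon(x_i)\big)\,\delta^2\\
&\le \varepsilon + \delta.
\end{aligned}
\tag{2.115}
\]

Hence, using (2.11) and $\varepsilon = 4v_\delta(\mathcal{X})$,

\[
d^{(Z, r_Z)}_{\mathrm{Pr}}\big((\varphi_X)_*\mu_X, (\varphi_Y)_*\mu_Y\big) \le 6\big(4v_\delta(\mathcal{X}) + 2\delta\big), \tag{2.116}
\]

and so $d_{\mathrm{GPr}}(\mathcal{X}, \mathcal{Y}) \le 12\big(2v_\delta(\mathcal{X}) + \delta\big)$, as claimed.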


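(ii) Assume that $d_{\mathrm{GPr}}(\mathcal{X}, \mathcal{Y}) < \delta$. Then, by definition, there exist a metric space $(Z, r_Z)$, isometric embeddings $\varphi_X$ and $\varphi_Y$ from $\mathrm{supp}(\mu_X)$ and $\mathrm{supp}(\mu_Y)$, respectively, into $Z$, and a coupling $\hat\mu$ of $(\varphi_X)_*\mu_X$ and $(\varphi_Y)_*\mu_Y$ such that

\[
\hat\mu\big\{(z, z') :\ r_Z(z, z') \ge \delta\big\} < \delta. \tag{2.117}
\]

Hence with the special choice of a coupling $\tilde\mu$ of $\mu_X$ and $\mu_Y$ defined by $\tilde\mu(A \times B) = \hat\mu\big(\varphi_X(A) \times \varphi_Y(B)\big)$, for all $A \in \mathcal{B}(X)$ and $B \in \mathcal{B}(Y)$,

\[
\begin{aligned}
\tilde\mu^{\otimes 2}&\big\{(x, y), (x', y') \in X \times Y :\ |r_X(x, x') - r_Y(y, y')| \ge 2\delta\big\}\\
&\le \tilde\mu^{\otimes 2}\big\{(x, y), (x', y') \in X \times Y :\ r_Z(\varphi_X(x), \varphi_Y(y)) \ge \delta \ \text{ or } \ r_Z(\varphi_X(x'), \varphi_Y(y')) \ge \delta\big\}\\
&< 2\delta.
\end{aligned}
\tag{2.118}
\]

This implies that $d_{\mathrm{Eur}}(\mathcal{X}, \mathcal{Y}) < 2\delta$.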

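Proof of Proposition 2.9.1. Observe that by Lemma 2.3.7, $v_\delta(\mathcal{X}) \to 0$, as $\delta \to 0$. So Lemma 2.9.3 implies the equivalence of $d_{\mathrm{GPr}}$ and $d_{\mathrm{Eur}}$ once we have shown that $d_{\mathrm{Eur}}$ is indeed a metric.

The symmetry is clear. If $\mathcal{X}, \mathcal{Y} \in \mathbb{M}$ are such that $d_{\mathrm{Eur}}(\mathcal{X}, \mathcal{Y}) = 0$, then by equivalence, $d_{\mathrm{GPr}}(\mathcal{X}, \mathcal{Y}) = 0$ and hence $\mathcal{X} = \mathcal{Y}$.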

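For the triangle inequality, let $\mathcal{X}_i = (X_i, r_i, \mu_i) \in \mathbb{M}$, $i = 1, 2, 3$, be such that $d_{\mathrm{Eur}}(\mathcal{X}_1, \mathcal{X}_2) < \varepsilon$ and $d_{\mathrm{Eur}}(\mathcal{X}_2, \mathcal{X}_3) < \delta$ for some $\varepsilon, \delta > 0$. Then there exist couplings $\tilde\mu_{1,2}$ of $\mu_1$ and $\mu_2$ and $\tilde\mu_{2,3}$ of $\mu_2$ and $\mu_3$ with

\[
\tilde\mu_{1,2}^{\otimes 2}\big\{(x_1, x_2), (x_1', x_2') :\ |r_1(x_1, x_1') - r_2(x_2, x_2')| \ge \varepsilon\big\} < \varepsilon \tag{2.119}
\]

and

\[
\tilde\mu_{2,3}^{\otimes 2}\big\{(x_2, x_3), (x_2', x_3') :\ |r_2(x_2, x_2') - r_3(x_3, x_3')| \ge \delta\big\} < \delta. \tag{2.120}
\]

Introduce the transition kernel $K_{2,3}$ from $X_2$ to $X_3$ defined by

\[
\tilde\mu_{2,3}(\mathrm{d}(x_2, x_3)) = \mu_2(\mathrm{d}x_2)\,K_{2,3}(x_2, \mathrm{d}x_3), \tag{2.121}
\]

which exists since $X_2$ and $X_3$ are Polish. Using this kernel, define a coupling $\tilde\mu_{1,3}$ of $\mu_1$ and $\mu_3$ by

\[
\tilde\mu_{1,3}(\mathrm{d}(x_1, x_3)) := \int_{X_2} \tilde\mu_{1,2}(\mathrm{d}(x_1, x_2))\,K_{2,3}(x_2, \mathrm{d}x_3). \tag{2.122}
\]

Then

\[
\begin{aligned}
\tilde\mu_{1,3}^{\otimes 2}&\big\{(x_1, x_3), (x_1', x_3') :\ |r_1(x_1, x_1') - r_3(x_3, x_3')| \ge \varepsilon + \delta\big\}\\
&= \int_{X_1^2 \times X_2^2 \times X_3^2} \tilde\mu_{1,2}(\mathrm{d}(x_1, x_2))\,\tilde\mu_{1,2}(\mathrm{d}(x_1', x_2'))\,K_{2,3}(x_2, \mathrm{d}x_3)\,K_{2,3}(x_2', \mathrm{d}x_3')\,
\mathbf{1}\big\{|r_1(x_1, x_1') - r_3(x_3, x_3')| \ge \varepsilon + \delta\big\}\\
&\le \int_{X_1^2 \times X_2^2 \times X_3^2} \tilde\mu_{1,2}(\mathrm{d}(x_1, x_2))\,\tilde\mu_{1,2}(\mathrm{d}(x_1', x_2'))\,K_{2,3}(x_2, \mathrm{d}x_3)\,K_{2,3}(x_2', \mathrm{d}x_3')\\
&\qquad\qquad \Big(\mathbf{1}\big\{|r_1(x_1, x_1') - r_2(x_2, x_2')| \ge \varepsilon\big\} + \mathbf{1}\big\{|r_2(x_2, x_2') - r_3(x_3, x_3')| \ge \delta\big\}\Big)\\
&= \tilde\mu_{1,2}^{\otimes 2}\big\{(x_1, x_2), (x_1', x_2') :\ |r_1(x_1, x_1') - r_2(x_2, x_2')| \ge \varepsilon\big\}\\
&\qquad + \tilde\mu_{2,3}^{\otimes 2}\big\{(x_2, x_3), (x_2', x_3') :\ |r_2(x_2, x_2') - r_3(x_3, x_3')| \ge \delta\big\}\\
&< \varepsilon + \delta,
\end{aligned}
\tag{2.123}
\]

which yields $d_{\mathrm{Eur}}(\mathcal{X}_1, \mathcal{X}_3) < \varepsilon + \delta$.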

The Gromov-Wasserstein and the modified Eurandom metric. The topology of weak convergence for probability measures on a fixed metric space $(Z, r)$ is generated not only by the Prohorov metric, but also by

\[
d^{(Z, r_Z)}_{\mathrm{W}}(\mu_1, \mu_2) := \inf_{\hat\mu} \int_{Z \times Z} \hat\mu(\mathrm{d}(x, x'))\,\big(r(x, x') \wedge 1\big), \tag{2.124}
\]

where the infimum is over all couplings $\hat\mu$ of $\mu_1$ and $\mu_2$. This is a version of the Wasserstein metric (see, for example, [Rac91]). If we rely on the Wasserstein rather than the Prohorov metric, this results in two further metrics: in the Gromov-Wasserstein metric, i.e.,

\[
d_{\mathrm{GW}}(\mathcal{X}, \mathcal{Y}) := \inf_{(\varphi_X, \varphi_Y, Z)} d^{(Z, r_Z)}_{\mathrm{W}}\big((\varphi_X)_*\mu_X, (\varphi_Y)_*\mu_Y\big), \tag{2.125}
\]

where the infimum is over all isometric embeddings from $\mathrm{supp}(\mu_X)$ and $\mathrm{supp}(\mu_Y)$ into a common metric space $Z$, and in the modified Eurandom metric

\[
d'_{\mathrm{Eur}}\big(\mathcal{X}, \mathcal{Y}\big) := \inf_{\tilde\mu} \int \tilde\mu(\mathrm{d}(x, y))\,\tilde\mu(\mathrm{d}(x', y'))\,\big(|r_X(x, x') - r_Y(y, y')| \wedge 1\big), \tag{2.126}
\]

where the infimum is over all couplings $\tilde\mu$ of $\mu_X$ and $\mu_Y$.
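For finite metric measure spaces, the integral in (2.126) is a finite sum once a coupling has been fixed, and every coupling yields an upper bound on $d'_{\mathrm{Eur}}$. The Python sketch below evaluates this sum; the function name and the two-point example are hypothetical, and no attempt is made to optimise over couplings.

```python
from fractions import Fraction


def modified_eurandom_cost(r_x, r_y, coupling):
    """Value of the double integral in (2.126) for one fixed coupling.

    r_x, r_y : distance matrices of the two finite spaces
    coupling : matrix with coupling[x][y] the mass of the pair (x, y);
               its marginals are assumed to be the two given measures
    """
    nx, ny = len(r_x), len(r_y)
    total = Fraction(0)
    for x in range(nx):
        for y in range(ny):
            for xp in range(nx):
                for yp in range(ny):
                    gap = min(abs(r_x[x][xp] - r_y[y][yp]), 1)
                    total += coupling[x][y] * coupling[xp][yp] * gap
    return total


# Hypothetical example: two copies of a two-point space with masses 1/2.
two_points = [[0, 1], [1, 0]]
half, quarter = Fraction(1, 2), Fraction(1, 4)
diagonal_coupling = [[half, 0], [0, half]]
product_coupling = [[quarter, quarter], [quarter, quarter]]
```

In this example the diagonal coupling gives cost 0, in line with $d'_{\mathrm{Eur}}(\mathcal{X}, \mathcal{X}) = 0$, whereas the product coupling only gives the upper bound 1/2.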

Remark 2.9.4. An $L^2$-version of $d_{\mathrm{GW}}$ on the set of compact metric measure spaces is already used in [Stu06]. It turned out that the metric is complete and the generated topology is separable.

Altogether, we might ask if we could achieve similar bounds to those given in Lemma 2.9.3 by exchanging the Gromov-Prohorov with the Gromov-Wasserstein metric and the Eurandom with the modified Eurandom metric.


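Proposition 2.9.5. The distances $d_{\mathrm{GW}}$ and $d'_{\mathrm{Eur}}$ define metrics on $\mathbb{M}$. They both generate the Gromov-Prohorov topology. Bounds that relate these two metrics with $d_{\mathrm{GPr}}$ and $d_{\mathrm{Eur}}$ are, for $\mathcal{X}, \mathcal{Y} \in \mathbb{M}$,

\[
\big(d_{\mathrm{GPr}}(\mathcal{X}, \mathcal{Y})\big)^2 \le d_{\mathrm{GW}}(\mathcal{X}, \mathcal{Y}) \le d_{\mathrm{GPr}}(\mathcal{X}, \mathcal{Y}) \tag{2.127}
\]

and

\[
\big(d_{\mathrm{Eur}}(\mathcal{X}, \mathcal{Y})\big)^2 \le d'_{\mathrm{Eur}}(\mathcal{X}, \mathcal{Y}) \le d_{\mathrm{Eur}}(\mathcal{X}, \mathcal{Y}). \tag{2.128}
\]

Consequently, the Gromov-Wasserstein metric is complete.

Proof. The fact that $d_{\mathrm{GW}}$ and $d'_{\mathrm{Eur}}$ define metrics on $\mathbb{M}$ is proved analogously as for the Gromov-Prohorov and the Eurandom metric. The Prohorov metric and the version of the Wasserstein metric used in (2.125) and (2.126) on fixed metric spaces can be bounded uniformly (see, for example, Theorem 3 in [GS02]). This immediately carries over to the present case.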


CHAPTER 3

Examples of limit trees I: Branching trees

In this chapter we apply the theory developed in Chapters 1 and 2 to illustrate how branching trees can be associated with weighted rooted planar R-trees.

Family trees of branching systems can be most conveniently first embedded in a plane and then described via a continuous excursion obtained from “walking around the tree” and recording the height profile. In Section 3.1 we recall the relevant facts on this connection between family trees and excursions. In Section 3.2 we then introduce Aldous’s Brownian continuum random tree as the tree associated with the path of a Brownian excursion. The Brownian CRT appears for example as the suitably rescaled limit of the finite trees associated with branching particle models as the total population size tends to infinity. We will restate Aldous’s construction in Section 3.3, which leads to an interpretation of the Brownian CRT as the uniform compact real tree. In Section 3.4 we will use a novel path decomposition for the Brownian excursion and derive from that explicit formulae for the Brownian CRT.

The next important step is to give an example of a limit tree which does not have the uniform distribution. In Section 3.5 we investigate the genealogies of a one-sided catalytic branching model. For the rescaled reactant trees we state convergence and characterize the quenched law of the limit tree cut off at the height where the catalyst falls below a certain threshold via a diffusion path. The proof is given in Section 3.6 using techniques from random evolutions.

3.1. Excursions

In some applications one is interested in regarding rooted trees (T, r, ρ) as embedded in the plane. In other words, one has a linear order (sometimes also called total order) on (T, r). There is a long tradition of encoding rooted planar trees as continuous excursion paths, which we will recall in this section (see [LG91, Ald93, LG99b, DLG02] for more on this connection).

Definition 3.1.1 (Rooted planar real trees). A rooted planar real (or R-)tree, $(T, r, \rho, \le^{\mathrm{lin}})$, is a rooted R-tree (T, r, ρ) together with a linear order relation $\le^{\mathrm{lin}}$ which fulfills the following conditions:

(i) $\le^{\mathrm{lin}}$ is compatible with the partial order on (T, r, ρ) (recall from Remark 1.8.2), i.e., for all x, y ∈ T with x ≤ y, we have $x \le^{\mathrm{lin}} y$.


Figure 3.1. Illustration of Condition (ii): a tree with root ρ, points x, x′, y, y′, and the branch point x ∧ y. To allow for defining a continuous excursion “around the tree” we need to require that whenever $x \le^{\mathrm{lin}} y$ and x′ is in the same “subtree above” x ∧ y as x and y′ is in the same “subtree above” x ∧ y as y, then the excursion will visit x′ before y′.

(ii) For all x, y ∈ T with $x \le^{\mathrm{lin}} y$, if x′, y′ ∈ T are such that x ∧ y < x′ ∧ x and x ∧ y ≤ y′ ∧ y, then $x' <^{\mathrm{lin}} y'$.

Definition 3.1.2 (Root and order invariant isometry). A function $\xi : T_1 \to T_2$ is called a root and order invariant isometry between two rooted planar R-trees $(T_1, r_1, \rho_1, \le^{\mathrm{lin}}_1)$ and $(T_2, r_2, \rho_2, \le^{\mathrm{lin}}_2)$ if and only if the function ξ is a root invariant isometry between the rooted real trees $(T_1, r_1, \rho_1)$ and $(T_2, r_2, \rho_2)$ which in addition satisfies $\xi(u) \le^{\mathrm{lin}}_{T_2} \xi(u')$ iff $u \le^{\mathrm{lin}}_{T_1} u'$.

Let $\mathbb{T}^{\mathrm{root,lin}}$ denote the set of all root and order invariant isometry classes of compact rooted planar R-trees.

Any non-negative compactly supported continuous excursion path yields a compact rooted planar R-tree. Write $C_{\mathbb{R}_+}(\mathbb{R}_+)$ for the space of continuous functions from $\mathbb{R}_+$ into $\mathbb{R}_+$. For $e \in C_{\mathbb{R}_+}(\mathbb{R}_+)$, put

(3.1) $\zeta(e) := \inf\{t > 0 : e(t) = 0\}$

with the convention that inf ∅ = ∞. The space of positive excursion paths is given by

(3.2) $U := \big\{ e \in C_{\mathbb{R}_+}(\mathbb{R}_+) : e(0) = 0,\ 0 < \zeta(e) < \infty,\ e(t) > 0 \text{ for } 0 < t < \zeta(e),\ \text{and } e(t) = 0 \text{ for } t \ge \zeta(e) \big\}.$

For ℓ > 0, let $U^{\ell} := \{e \in U : \zeta(e) = \ell\}$.

We associate each $e \in U^1$ with a metric tree as follows. Define an equivalence relation $\sim_e$ on [0, 1] by letting

(3.3) $u_1 \sim_e u_2$ iff $e(u_1) = \inf_{u \in [u_1 \wedge u_2,\, u_1 \vee u_2]} e(u) = e(u_2).$

Consider the pseudo-metric on [0, 1] defined by

(3.4) $r_{T_e}(u_1, u_2) := e(u_1) - 2 \inf_{u \in [u_1 \wedge u_2,\, u_1 \vee u_2]} e(u) + e(u_2),$


which becomes a true metric on the quotient space $T_e := \mathbb{R}_+\big|_{\sim_e} = [0, 1]\big|_{\sim_e}$.

Lemma 3.1.3. For each $e \in U^1$ the metric space $(T_e, r_{T_e})$ is a compact R-tree.

Proof. It is straightforward to check that the quotient map from [0, 1] onto $T_e$ is continuous with respect to $r_{T_e}$. Thus $(T_e, r_{T_e})$ is path-connected and compact as the continuous image of a metric space with these properties. In particular, $(T_e, r_{T_e})$ is complete.

To complete the proof, it therefore suffices to verify the four point condition (1.13). However, for $u_1, u_2, u_3, u_4 \in T_e$ we have

(3.5) $\max\big\{ r_{T_e}(u_1, u_3) + r_{T_e}(u_2, u_4),\ r_{T_e}(u_1, u_4) + r_{T_e}(u_2, u_3) \big\} \ge r_{T_e}(u_1, u_2) + r_{T_e}(u_3, u_4),$

where strict inequality holds if and only if

(3.6) $\min_{i \ne j}\ \inf_{u \in [u_i \wedge u_j,\, u_i \vee u_j]} e(u) \notin \Big\{ \inf_{u \in [u_1 \wedge u_2,\, u_1 \vee u_2]} e(u),\ \inf_{u \in [u_3 \wedge u_4,\, u_3 \vee u_4]} e(u) \Big\}.$
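The pseudo-metric (3.4) and the four point condition can be tried out numerically on a sampled excursion. The following is a minimal sketch, assuming the excursion is given by its values on a finite grid; the function names `tree_distance` and `four_point_holds` are illustrative and do not come from the text. It uses the equivalent form of the four point condition in which, among the three pairwise sums, the two largest agree.

```python
from itertools import combinations


def tree_distance(e, i, j):
    """Pseudo-metric (3.4) between grid indices i and j of the sampled excursion e."""
    lo, hi = min(i, j), max(i, j)
    return e[i] + e[j] - 2.0 * min(e[lo:hi + 1])


def four_point_holds(e, indices, tol=1e-12):
    """Check that for every 4-tuple the two largest of the three pair sums agree."""
    for a, b, c, d in combinations(indices, 4):
        sums = sorted((
            tree_distance(e, a, b) + tree_distance(e, c, d),
            tree_distance(e, a, c) + tree_distance(e, b, d),
            tree_distance(e, a, d) + tree_distance(e, b, c),
        ))
        if sums[2] - sums[1] > tol:  # largest exceeds second largest: condition fails
            return False
    return True


# A small lattice excursion used purely as an illustration.
excursion = [0, 1, 2, 1, 2, 3, 2, 1, 0]
print(tree_distance(excursion, 2, 5))
print(four_point_holds(excursion, range(len(excursion))))
```

Grid indices at distance zero (for example the two visits to height 1 at indices 1 and 7 above) correspond to the same point of the quotient space.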

Conversely, (at least finite) rooted planar trees can be associated with an excursion. Denote by $\mathbb{T}^{\mathrm{root,lin}}_{\mathrm{fin}}$ the subspace in $\mathbb{T}^{\mathrm{root,lin}}$ of finite trees, and fix a speed σ > 0. There is a map

(3.7) $C(\,\cdot\,; \sigma) : \mathbb{T}^{\mathrm{root,lin}}_{\mathrm{fin}} \to C_{\mathbb{R}_+}[0, \infty),$

which associates to $T \in \mathbb{T}^{\mathrm{root,lin}}_{\mathrm{fin}}$ an excursion by “walking around the tree” (respecting the order) at speed σ and recording the height profile. (A formal description of the contour process can be found in Section 6.1 in [Pit02].) Typically, one sets the speed σ = 1. However, as soon as we later want to take limits of rooted planar trees we will need to increase the speed of the traversal in order to obtain a reasonable compactly supported limit contour path.
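For a finite ordered tree with unit edge lengths, the walk around the tree can be sketched in a few lines. This is only an illustration under assumptions not made in the text: the tree is encoded as nested lists of children in their linear order, and the names `contour_heights` and `contour_path` are hypothetical.

```python
def contour_heights(tree, height=0):
    """Heights visited by the depth-first walk around `tree` (nested lists of children)."""
    heights = [height]
    for child in tree:  # children are visited in their linear (left-to-right) order
        heights.extend(contour_heights(child, height + 1))
        heights.append(height)  # step back down to the current vertex
    return heights


def contour_path(tree, sigma=1.0):
    """Breakpoints (time, height) of the piecewise linear contour at speed sigma."""
    return [(k / sigma, h) for k, h in enumerate(contour_heights(tree))]


# The binary shape with leaves (111), (112), (121), (122), (21), (22).
leaf = []
example = [[[leaf, leaf], [leaf, leaf]], [leaf, leaf]]
print(contour_heights(example))
print(contour_path(example, sigma=2.0)[-1])
```

Each edge is traversed once upwards and once downwards, so a tree with n edges yields 2n steps, and doubling the speed halves the time at which the walk returns to the root.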

Figure 3.2. The mapping of a finite linearly ordered tree to an excursion. The tree has root ρ and vertices labeled (1), (2), (11), (12), (21), (22), (111), (112), (121), (122).

Remark 3.1.4. If $(T, r, \rho, \le^{\mathrm{lin}})$ is such that $\mu^T(T) = \infty$ then, of course, we cannot walk “around the tree” in finite time with constant speed. However, any compact R-tree T is still isometric to $T_e$ for some $e \in U^1$. To see this, fix a root ρ ∈ T. Recall $R_\varepsilon(T, \rho)$, the ε-trimming of T with respect to ρ defined in (1.32). Let µ be a probability measure on T that is equivalent to the length measure $\mu^T$. Because $\mu^T$ is σ-finite, such a probability measure always exists, but one can construct µ explicitly as follows: set $H := \max_{u \in T} r(\rho, u)$, and put

(3.8) $\mu := 2^{-1}\, \frac{\mu^T\big(R_{2^{-1}H}(T, \rho) \cap \cdot\,\big)}{\mu^T\big(R_{2^{-1}H}(T, \rho)\big)} + \sum_{i \ge 2} 2^{-i}\, \frac{\mu^T\big(R_{2^{-i}H}(T, \rho) \setminus R_{2^{-i+1}H}(T, \rho) \cap \cdot\,\big)}{\mu^T\big(R_{2^{-i}H}(T, \rho) \setminus R_{2^{-i+1}H}(T, \rho)\big)}.$

For all 0 < ε < H there is a continuous path

$f_\varepsilon : [0, 2\mu^T(R_\varepsilon(T, \rho))] \to R_\varepsilon(T, \rho)$

such that $h_\varepsilon$ defined by $h_\varepsilon(t) := r(\rho, f_\varepsilon(t))$ belongs to $U^{2\mu^T(R_\varepsilon(T,\rho))}$ (in particular, $f_\varepsilon(0) = f_\varepsilon(2\mu^T(R_\varepsilon(T, \rho))) = \rho$), $h_\varepsilon$ is piecewise linear with slopes ±1, and $T_{h_\varepsilon}$ is isometric to $R_\varepsilon(T, \rho)$. Moreover, these paths may be chosen consistently so that if ε′ ≤ ε′′, then

(3.9) $f_{\varepsilon''}(t) = f_{\varepsilon'}\Big( \inf\big\{ s > 0 : \big|\{0 \le r \le s : f_{\varepsilon'}(r) \in R_{\varepsilon''}(T, \rho)\}\big| > t \big\} \Big),$

where | · | denotes Lebesgue measure. Now define $e_\varepsilon \in U^{\mu(R_\varepsilon(T,\rho))}$ to be the absolutely continuous path satisfying

(3.10) $\frac{de_\varepsilon(t)}{dt} = 2\, \frac{d\mu^T}{d\mu}\big(f_\varepsilon(t)\big)\, \frac{dh_\varepsilon(t)}{dt}.$

It can be shown that $e_\varepsilon$ converges uniformly to some $e \in U^1$ as ε ↓ 0 and that $T_e$ is isometric to T.

Each tree coming from a path in $U^1$ has a natural weight on it: for $e \in U^1$, we equip $(T_e, r_{T_e})$ with the weight

(3.11) $\nu_{T_e} := q_* \lambda$

given by the push-forward of Lebesgue measure λ on [0, 1] by the quotient map q which sends an element u ∈ [0, 1] to its equivalence class $[u] \in T_e$.


We finish this section with a remark about the natural length measure on a tree coming from a path. Given $e \in U^1$ and a ≥ 0, let

(3.12) $G_a := \big\{ t \in [0, 1] : e(t) = a \text{ and, for some } \varepsilon > 0,\ e(u) > a \text{ for all } u \in\, ]t, t + \varepsilon[,\ e(t + \varepsilon) = a \big\}$

denote the countable set of starting points of excursions of the function e above the level a. Then $\mu^{T_e}$, the length measure on $T_e$, is given by

(3.13) $\mu^{T_e} = q_* \Big( \int_0^\infty da \sum_{t \in G_a} \delta_t \Big).$

Alternatively, write

(3.14) $\Gamma_e := \big\{ (s, a) : s \in\, ]0, 1[,\ a \in [0, e(s)[ \big\}$

for the region between the time axis and the graph of e, and for $(s, a) \in \Gamma_e$ denote by

(3.15) $\underline{s}(e, s, a) := \sup\{r < s : e(r) = a\}$

and

(3.16) $\bar{s}(e, s, a) := \inf\{t > s : e(t) = a\}$

the start and finish of the excursion of e above level a that straddles time s. Then

(3.17) $\mu^{T_e} = q_* \Big( \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e, s, a) - \underline{s}(e, s, a)}\, \delta_{\underline{s}(e, s, a)} \Big).$

We note that the measure $\mu^{T_e}$ appears in [AS02].

3.2. The Brownian continuum random tree

In this section we will recall the definition of Aldous’s Brownian continuum random tree (Brownian CRT), which can be thought of as a uniformly chosen random weighted compact R-tree.

Consider the Itô excursion measure for excursions of standard Brownian motion away from 0. This σ-finite measure is defined subject to a normalization of Brownian local time at 0, and we take the usual normalization of local times at each level which makes the local time process an occupation density in the spatial variable for each fixed value of the time variable. The excursion measure is the sum of two measures, one which is concentrated on non-negative excursions and one which is concentrated on non-positive excursions. Let $\mathbb{N}$ be the part which is concentrated on non-negative excursions. Thus, in the notation of Section 3.1, $\mathbb{N}$ is a σ-finite measure on U, where we equip U with the σ-field $\mathcal{U}$ generated by the coordinate maps.

Define a map $v : U \to U^1$ by $e \mapsto \frac{e(\zeta(e)\,\cdot)}{\sqrt{\zeta(e)}}$. Then

(3.18) $\mathbb{P}(\Gamma) := \frac{\mathbb{N}\big\{ v^{-1}(\Gamma) \cap \{e \in U : \zeta(e) \ge c\} \big\}}{\mathbb{N}\{e \in U : \zeta(e) \ge c\}}, \quad \Gamma \in \mathcal{U},$

does not depend on c > 0 (see, for example, Exercise 12.2.13.2 in [RY99]). The probability measure $\mathbb{P}$ is called the law of normalized non-negative Brownian excursion. We have

(3.19) $\mathbb{N}\{e \in U : \zeta(e) \in dc\} = \frac{dc}{2\sqrt{2\pi c^3}}$

and, defining $S_c : U^1 \to U^c$ by

(3.20) $S_c e := \sqrt{c}\, e\big(\tfrac{\cdot}{c}\big),$

we have

(3.21) $\int \mathbb{N}(de)\, G(e) = \int_0^\infty \frac{dc}{2\sqrt{2\pi c^3}} \int_{U^1} \mathbb{P}(de)\, G(S_c e)$

for a non-negative measurable function G : U → R.

Recall from Section 3.1 how each $e \in U^1$ is associated with a weighted compact R-tree $(T_e, r_{T_e}, \nu_{T_e})$, and let the equivalence class [0] play the role of the root $\rho_{T_{2e}}$.

Definition 3.2.1 (The Brownian CRT). For $e \in U^1$, denote by $2e \in U^1$ the excursion path t ↦ 2e(t).

(i) The rooted Brownian CRT is the probability measure on the space $(\mathbb{T}^{\mathrm{root}}, d_{\mathrm{GH}^{\mathrm{root}}})$ that is the push-forward of the normalized excursion measure by the map $e \mapsto (T_{2e}, r_{T_{2e}}, \rho_{T_{2e}})$.

(ii) The weighted Brownian CRT is the probability measure on the space $(\mathbb{T}^{\mathrm{wt}}, d_{\mathrm{GH}^{\mathrm{wt}}})$ that is the push-forward of the normalized excursion measure by the map $e \mapsto (T_{2e}, r_{T_{2e}}, \nu_{T_{2e}})$.

In the following let $\mathbf{P}_{\mathrm{CRT}}$ be the weighted Brownian CRT. If no confusion may occur, we refer to it shortly as the Brownian CRT.

Remark 3.2.2. The probability measure $\mathbf{P}$ is the distribution of an object consisting of Aldous’s Brownian CRT along with a natural measure on this tree (see, for example, [Ald91a, Ald93]). Various combinatorial models of random trees correspond to conditioned critical Galton-Watson processes with specific offspring distributions. In particular, the uniform random ordered tree $T_n$ with a fixed number n ∈ N of vertices (other than the root) corresponds to the shifted geometric 1/2 offspring distribution. It has long been known that its contour process can be constructed from simple symmetric random walk on the integers, conditioned on the first return to 0 at time 2n. This construction makes it simple to show that

(3.22) $C\big( \tfrac{1}{\sqrt{n}}\, T_n;\ \tfrac{1}{n} \big) \underset{n \to \infty}{\Longrightarrow} B^{\mathrm{ex}},$

where here ⇒ means weak convergence on C([0, 1]) equipped with the uniform topology.
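The random walk description of the contour can be simulated directly. The sketch below samples a uniformly distributed non-negative lattice path with 2n steps of size ±1 that ends at 0 (a Dyck path), which is the law of a simple symmetric random walk conditioned to stay non-negative and return to 0 at time 2n, and which encodes an ordered tree with n edges through its contour. It relies on the cycle lemma rather than on rejection sampling; the function name `uniform_dyck_path` is hypothetical, and the convention used here (non-negative path rather than a strictly positive excursion) is an assumption that may differ from the text by a shift of one step at each end.

```python
import random


def uniform_dyck_path(n, rng=random):
    """Heights of a uniform non-negative +-1 path with 2n steps ending at 0."""
    steps = [1] * (n + 1) + [-1] * n
    rng.shuffle(steps)
    # Locate the last time the running sum attains its overall minimum.
    total, minimum, argmin = 0, 0, 0
    for k, step in enumerate(steps, start=1):
        total += step
        if total <= minimum:
            minimum, argmin = total, k
    # Cycle lemma: this rotation is the unique one with strictly positive partial sums.
    rotated = steps[argmin:] + steps[:argmin]
    heights = [0]
    for step in rotated[1:]:  # drop the leading up-step to obtain a path >= 0
        heights.append(heights[-1] + step)
    return heights


path = uniform_dyck_path(10, random.Random(2011))
print(path)
```

Rescaling time by 1/(2n) and heights by a multiple of 1/√n turns such a path into a function on [0, 1]; the precise constants depend on the normalization chosen and are not fixed by this sketch.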

The appearance of 2e rather than e in the definition of $\mathbf{P}$ is a consequence of this choice of scaling. The associated probability measure on each realization of the continuum random tree is the measure that arises in this limiting construction by taking the uniform probability measure on realizations of the approximating finite trees. The probability measure $\mathbf{P}$ can therefore be viewed informally as the “uniform distribution” on $(\mathbb{T}^{\mathrm{wt}}, d_{\mathrm{GH}^{\mathrm{wt}}})$.

We want to point out that for general offspring distributions the contour process is not Markovian, and hence it is not profitable to derive (3.22) by proving convergence of this process to the Brownian excursion directly. The motivation for Aldous to develop his abstract notion of convergence of trees presented in [Ald93] was actually to derive the invariance principle for all non-trivial finite variance offspring distributions.

3.3. Aldous’s line-breaking representation of the Brownian CRT

The Brownian CRT can be obtained as an almost sure limit by Aldous’s line-breaking construction. To prepare a result stated in Chapter 5, in this section we recall this representation and reformulate it in terms of metric measure spaces and real trees. We mainly follow Section 4 in [Ald91a] (compare also Section in [Eva06]).

Put $\sigma_0 = \tau_0 := 0$. Let $\tau := (\tau_n;\ n \in \mathbb{N})$ be the successive arrival times of an inhomogeneous Poisson process with arrival rate r(t) = t at time t ≥ 0. Let $\sigma_n := U_n \tau_n$, where $\{U_n;\ n \in \mathbb{N}\}$ is a family of independent identically distributed uniform random variables on [0, 1] independent of τ. In the following we will refer to $\tau_n$ and $\sigma_n$ as the nth cut time and the nth cut point, for each n ∈ N.

Informally, we are growing finite trees (that is, trees with finitely many leaves and finite total branch length) in continuous time as follows (at all times t ≥ 0 the procedure will produce a rooted tree $\mathcal{R}_t$ with total edge length t):

• Start at time 0 with the 1-tree (that is, a line segment with two ends), $\mathcal{R}_0$, of length zero ($\mathcal{R}_0$ is “really” the trivial tree that consists of one point only, but thinking this way will help later in Chapter 5). Identify one end of $\mathcal{R}_0$ as the root.

• Let this line segment grow at unit speed until the first cut time $\tau_1$.

• At time $\tau_1$ pick the first cut point $\sigma_1$ uniformly on the segment that has been grown so far.

• Between time $\tau_1$ and time $\tau_2$, evolve a tree with 3 ends by letting a new branch grow away from the first cut point at unit speed.

• Proceed inductively: Given the n-tree (that is, a tree with n + 1 ends), $\mathcal{R}_{\tau_n-}$, pick the n-th cut point $\sigma_n$ uniformly on $\mathcal{R}_{\tau_n-}$ to give an (n + 1)-tree, $\mathcal{R}_{\tau_n}$, with one edge of length zero, and for $t \in [\tau_n, \tau_{n+1}[$, let $\mathcal{R}_t$ be the tree obtained from $\mathcal{R}_{\tau_n}$ by letting a branch grow away from the nth cut point with unit speed.

Formally, given (τ, σ), let for each t > 0, $\mathcal{R}_t = (R_t, r_t, \mu_t) \in \mathbb{M}_c$ be defined by letting

(3.23) $R_t := [0, t],$

and with $N(t) := \inf\{n \in \mathbb{N} : t \le \tau_n\}$,

(3.24) $\mu_t := \frac{1}{1 + N(t)} \sum_{i=0}^{N(t)} \delta_{\tau_i \wedge t}.$

Figure 3.3. Evolution of the real tree-valued process $(\mathcal{R}_t;\ t \ge 0)$, shown for $\mathcal{R}_0$, $\mathcal{R}_{\tau_1-}$, $\mathcal{R}_{\tau_1}$, $\mathcal{R}_{\tau_2-}$ and $\mathcal{R}_{\tau_2}$. (The bold dots represent an edge of length zero.)

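Furthermore, define a metric $r_t$ inductively as follows: if $s \in [0, \tau_1]$, let for all $x, y \in R_s$,

(3.25) $r_s(x, y) := |x - y|.$

Assume then that for some n ∈ N we have already defined $r_{s'}$ for all $s' \in [0, \tau_n]$. If $s \in\, ]\tau_n, \tau_{n+1}]$, let for all $x, y \in R_s$,

(3.26) $r_s(x, y) := \begin{cases} |x - y|, & \text{if } x, y \in\, ]\tau_n, \tau_{n+1}], \\ r_{\tau_n}(x, y), & \text{if } x, y \in [0, \tau_n], \\ (y - \tau_n) + r_{\tau_n}(x, \sigma_n), & \text{if } x \in [0, \tau_n],\ y \in\, ]\tau_n, \tau_{n+1}], \\ (x - \tau_n) + r_{\tau_n}(y, \sigma_n), & \text{if } y \in [0, \tau_n],\ x \in\, ]\tau_n, \tau_{n+1}]. \end{cases}$

Notice that this inductive definition extends to a metric on $R_\infty := [0, \infty)$. Denote by (R, r) the completion of $(R_\infty, r_\infty)$.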
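The cut times, cut points and the metric (3.26) are straightforward to simulate. The following is a minimal sketch under stated assumptions: the arrival times of the Poisson process with rate r(t) = t are generated by the standard time change $\tau_n = \sqrt{2\Gamma_n}$, where $\Gamma_n$ are the arrival times of a unit-rate Poisson process (because the cumulative rate is t²/2), and the names `sample_cut_data`, `segment` and `line_breaking_distance` are illustrative only.

```python
import bisect
import math
import random


def sample_cut_data(n, rng=random):
    """Return ([tau_0, ..., tau_n], [None, sigma_1, ..., sigma_n])."""
    taus, sigmas, gamma = [0.0], [None], 0.0
    for _ in range(n):
        gamma += rng.expovariate(1.0)      # next arrival of a unit-rate Poisson process
        tau = math.sqrt(2.0 * gamma)       # time change for cumulative rate t^2 / 2
        taus.append(tau)
        sigmas.append(rng.random() * tau)  # sigma_n = U_n * tau_n
    return taus, sigmas


def segment(x, taus):
    """Index n with x in ]tau_n, tau_{n+1}] (and 0 for x in [0, tau_1])."""
    return max(bisect.bisect_left(taus, x) - 1, 0)


def line_breaking_distance(x, y, taus, sigmas):
    """Distance (3.26) between points x, y of [0, tau_n], unrolled iteratively."""
    dist = 0.0
    while True:
        nx, ny = segment(x, taus), segment(y, taus)
        if nx == ny:
            return dist + abs(x - y)
        if nx > ny:                        # make y the point on the younger branch
            x, y, nx, ny = y, x, ny, nx
        dist += y - taus[ny]               # walk down the branch to its attachment
        y = sigmas[ny]                     # continue from the cut point sigma_n


taus, sigmas = sample_cut_data(5, random.Random(3))
print(line_breaking_distance(0.0, taus[-1], taus, sigmas))
```

The last line prints the distance from the root to the tip of the most recently grown branch.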

Theorem 3.3.1 (Line-breaking construction in $\mathbb{X}_c$). The random tree (R, r) is the Brownian CRT.

Recall the space of compact metric measure spaces $\mathbb{M}_c$ from Section 2.7. Notice that by Lemma 1.10.2 the metric measure space $\mathcal{R}_t$ is compact, for each t ≥ 0. To prepare the proof, we will first show that the sequence $(\mathcal{R}_t)_{t \ge 0}$ converges in $\mathbb{M}_c$ and identify its limit as the weighted Brownian CRT.

Proposition 3.3.2 (Line-breaking construction in $\mathbb{M}_c$). The family of random trees $\{\mathcal{R}_t;\ t \ge 0\}$ converges weakly to the weighted Brownian CRT with respect to the Gromov-weak topology on $\mathbb{M}_c$, as t → ∞.

The hardest part is, of course, compactness of the limit. We will rely on the following estimate stated in Proposition 4 in [Ald91a].

Proposition 3.3.3. There exists a K > 0 such that, almost surely, (R, r) can be covered by at most $-K\varepsilon^{-2} \log \varepsilon$ balls of radius ε, for all ε > 0 sufficiently small.


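Moreover, we will rely on consistency (see, for example, the proof of Lemma 21 in [Ald93]).

Proposition 3.3.4 (Consistency property). The family of random metric spaces $\{\mathcal{R}_n;\ n \in \mathbb{N}\}$ is consistent, i.e., the random tree span$(x_1, ..., x_k)$ equals in distribution $(R_{\tau_k}, r_{\tau_k})$, for all k ∈ N and n ≥ k, $\mu_{\tau_n}^{\otimes \downarrow k}$-almost surely, where

(3.27) $\mu^{\otimes \downarrow k} := \mu(dx_1) \otimes \frac{\mu - \frac{1}{n}\delta_{x_1}}{1 - \frac{1}{n}}(dx_2) \otimes \cdots \otimes \frac{\mu - \frac{1}{n}\sum_{i=1}^{k-1} \delta_{x_i}}{1 - \frac{(k-1)}{n}}(dx_k).$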

Proof of Proposition 3.3.2. For existence notice that the family $(R_t, r_t)$ is non-decreasing in t. Hence, for all sufficiently small ε > 0 and t ≥ 0, there exists a finite constant K such that we can cover $(R_t, r_t)$ by at most $-K\varepsilon^{-2} \log \varepsilon$ balls of radius ε, almost surely, by Proposition 3.3.3. In particular, the family of trees $\{\mathcal{R}_t;\ t \ge 0\}$ satisfies the assumptions of Proposition 2.7.2 and is therefore pre-compact in $\mathbb{M}_c$, almost surely.

Uniqueness follows from consistency. In particular, if Φ ∈ Π is a monomial of degree k ≥ 1, then

(3.28) $\mathbf{P}\big[\Phi(\mathcal{R}_{\tau_n})\big] = \mathbf{P}\big[\Phi(\mathcal{R}_{\tau_k})\big],$

for all n ≥ k. Hence we can conclude convergence of $\{\mathcal{R}_t;\ t \ge 0\}$ from Corollary 2.8.1. Moreover, it follows from Corollary 23 in [Ald93] that if $\mathcal{R}$ is the Brownian CRT then

(3.29) $\mathbf{P}\big[\Phi(\mathcal{R})\big] = \mathbf{P}\big[\Phi(\mathcal{R}_{\tau_k})\big],$

for all k ∈ N, which identifies the limit as claimed.

Proof of Theorem 3.3.1. If (X, r, µ) is the weighted Brownian CRT then (R, r) equals in distribution (supp(µ), r) by Theorem 3(iii) in [Ald91a], which together with Proposition 3.3.2 gives the claim.

3.4. Campbell measure facts: Functionals of the Brownian CRT

In this section we will conclude from the connection between the Brownian CRT and the Brownian excursion explicit expressions of the expectations of some functionals with respect to $\mathbf{P}$ (the “uniform distribution” on $(\mathbb{T}^{\mathrm{wt}}, d_{\mathrm{GH}^{\mathrm{wt}}})$ as introduced in the end of Section 3.2). We will establish our calculations from what appears to be a novel path decomposition of the standard Brownian excursion.

For $T \in \mathbb{T}^{\mathrm{wt}}$ and ρ ∈ T, recall $R_c(T, \rho)$ from (1.32), and the length measure $\mu^T$ from (1.20). Given $(T, d) \in \mathbb{T}^{\mathrm{wt}}$ and u, v ∈ T, let

(3.30) $S^{T,u,v} := \big\{ w \in T : u \in\, ]v, w[ \big\}$

denote the subtree of T that differs from its closure by the point u, which can be thought of as its root, and consists of points that are on the “other side” of u from v (recall ]v, w[ is the open arc in T between v and w).

The main result of this section is the following.

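Theorem 3.4.1 (Expected height and weight moments).

(i) For x > 0,

$\mathbf{P}\big[ \mu^T \otimes \nu_T \{ (u, v) \in T \times T : \mathrm{height}(S^{T,u,v}) > x \} \big] = \mathbf{P}\Big[ \int_T \nu_T(dv)\, \mu^T(R_x(T, v)) \Big] = 2 \sum_{n=1}^\infty n x \exp(-n^2 x^2 / 2).$

(ii) For 1 < α < ∞,

$\mathbf{P}\Big[ \int_T \nu_T(dv) \int_T \mu^T(du)\, \big(\mathrm{height}(S^{T,u,v})\big)^\alpha \Big] = 2^{\frac{\alpha+1}{2}}\, \alpha\, \Gamma\big(\tfrac{\alpha+1}{2}\big)\, \zeta(\alpha),$

where, as usual, $\zeta(\alpha) := \sum_{n \ge 1} n^{-\alpha}$.

(iii) For 0 < p ≤ 1,

$\mathbf{P}\big[ \mu^T \otimes \nu_T \{ (u, v) \in T \times T : \nu_T(S^{T,u,v}) > p \} \big] = \sqrt{\frac{2(1 - p)}{\pi p}}.$

(iv) For 1/2 < β < ∞,

$\mathbf{P}\Big[ \int_T \nu_T(dv) \int_T \mu^T(du)\, \big(\nu_T(S^{T,u,v})\big)^\beta \Big] = 2^{-\frac{1}{2}}\, \frac{\Gamma\big(\beta - \tfrac{1}{2}\big)}{\Gamma(\beta)}.$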

Remark 3.4.2. We will apply Theorem 3.4.1 in Section 6 in the proof of Lemma 6.1.2.

The proof of Theorem 3.4.1 will rely on calculations with Itô’s excursion measure. Because of our observation in Section 3.1 that for an excursion e the length measure $\mu^{T_e}$ on the corresponding tree is given by (3.17), we need to understand the decomposition of the excursion e into the excursion above a that straddles s and the “remaining” excursion when e is chosen according to the standard Brownian excursion distribution $\mathbb{P}$ and (s, a) is chosen according to the σ-finite measure $ds \otimes da\, \frac{1}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}$ on $\Gamma_e$ (see Figure 3.4).

Figure 3.4. The decomposition of the excursion e (top picture) into the excursion $\hat{e}^{s,a}$ above level a that straddles time s (bottom left picture) and the “remaining” excursion $\check{e}^{s,a}$ (bottom right picture).

Given an excursion e ∈ U and a level a ≥ 0 write:

• $\zeta(e) := \inf\{t > 0 : e(t) = 0\}$ for the “length” of e,

• $\ell^a_t(e)$ for the local time of e at level a up to time t,

• $e^{\downarrow a}$ for e time-changed by the inverse of $t \mapsto \int_0^t ds\, \mathbf{1}\{e(s) \le a\}$ (that is, $e^{\downarrow a}$ is e with the sub-excursions above level a excised and the gaps closed up),

• $\ell^a_t(e^{\downarrow a})$ for the local time of $e^{\downarrow a}$ at the level a up to time t,

• $U^{\uparrow a}(e)$ for the set of sub-excursion intervals of e above a (that is, an element of $U^{\uparrow a}(e)$ is an interval $I = [g_I, d_I]$ such that $e(g_I) = e(d_I) = a$ and e(t) > a for $g_I < t < d_I$),

• $\mathcal{N}^{\uparrow a}(e)$ for the counting measure that puts a unit mass at each point (s′, e′), where, for some $I \in U^{\uparrow a}(e)$, $s' := \ell^a_{g_I}(e)$ is the amount of local time of e at level a accumulated up to the beginning of the sub-excursion I and e′ ∈ U given by

(3.31) $e'(t) = \begin{cases} e(g_I + t) - a, & 0 \le t \le d_I - g_I, \\ 0, & t > d_I - g_I, \end{cases}$

is the corresponding piece of the path e shifted to become an excursion above the level 0 starting at time 0,

• $\hat{e}^{s,a} \in U$ and $\check{e}^{s,a} \in U$, for the subexcursion “above” $(s, a) \in \Gamma_e$, that is,

(3.32) $\hat{e}^{s,a}(t) := \begin{cases} e(\underline{s}(e, s, a) + t) - a, & 0 \le t \le \bar{s}(e, s, a) - \underline{s}(e, s, a), \\ 0, & t > \bar{s}(e, s, a) - \underline{s}(e, s, a), \end{cases}$

respectively “below” $(s, a) \in \Gamma_e$, that is,

(3.33) $\check{e}^{s,a}(t) := \begin{cases} e(t), & 0 \le t \le \underline{s}(e, s, a), \\ e(t + \bar{s}(e, s, a) - \underline{s}(e, s, a)), & t > \underline{s}(e, s, a), \end{cases}$

• $\sigma^a_s(e) := \inf\{t \ge 0 : \ell^a_t(e) \ge s\}$ and $\tau^a_s(e) := \inf\{t \ge 0 : \ell^a_t(e) > s\}$,

• $\tilde{e}^{s,a} \in U$ for e with the interval $]\sigma^a_s(e), \tau^a_s(e)[$ containing an excursion above level a excised, that is,

(3.34) $\tilde{e}^{s,a}(t) := \begin{cases} e(t), & 0 \le t \le \sigma^a_s(e), \\ e(t + \tau^a_s(e) - \sigma^a_s(e)), & t > \sigma^a_s(e). \end{cases}$

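The following path decomposition result under the σ-finite measure $\mathbb{N}$ is preparatory to a decomposition under the probability measure $\mathbb{P}$, Corollary 3.4.4, that has a simpler intuitive interpretation.

Proposition 3.4.3. For non-negative measurable functions F on $\mathbb{R}_+$ and G, H on U,

$\begin{aligned} &\int \mathbb{N}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e, s, a) - \underline{s}(e, s, a)}\, F(\underline{s}(e, s, a))\, G(\hat{e}^{s,a})\, H(\check{e}^{s,a}) \\ &\quad = \int \mathbb{N}(de) \int_0^\infty da \int \mathcal{N}^{\uparrow a}(e)(d(s', e'))\, F(\sigma^a_{s'}(e))\, G(e')\, H(\tilde{e}^{s',a}) \\ &\quad = \mathbb{N}[G]\ \mathbb{N}\Big[ H \int_0^\zeta ds\, F(s) \Big]. \end{aligned}$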

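Proof. The first equality is just a change in the order of integration and has already been remarked upon in Section 3.1.

Standard excursion theory (see, for example, [RW00, RY99, Ber96]) says that under $\mathbb{N}$, the random measure $e \mapsto \mathcal{N}^{\uparrow a}(e)$ conditional on $e \mapsto e^{\downarrow a}$ is a Poisson random measure with intensity measure $\lambda^{\downarrow a}(e) \otimes \mathbb{N}$, where $\lambda^{\downarrow a}(e)$ is Lebesgue measure restricted to the interval $[0, \ell^a_\infty(e)] = [0, 2\ell^a_\infty(e^{\downarrow a})]$.

Note that $\tilde{e}^{s',a}$ is constructed from $e^{\downarrow a}$ and $\mathcal{N}^{\uparrow a}(e) - \delta_{(s', e')}$ in the same way that e is constructed from $e^{\downarrow a}$ and $\mathcal{N}^{\uparrow a}(e)$. Also, $\sigma^a_{s'}(\tilde{e}^{s',a}) = \sigma^a_{s'}(e)$.

Therefore, by the Campbell-Palm formula for Poisson random measures (see, for example, Section 12.1 of [DVJ88]),

$\begin{aligned} &\int \mathbb{N}(de) \int_0^\infty da \int \mathcal{N}^{\uparrow a}(e)(d(s', e'))\, F(\sigma^a_{s'}(e))\, G(e')\, H(\tilde{e}^{s',a}) \\ &\quad = \int \mathbb{N}(de) \int_0^\infty da\ \mathbb{N}\Big[ \int \mathcal{N}^{\uparrow a}(e)(d(s', e'))\, F(\sigma^a_{s'}(e))\, G(e')\, H(\tilde{e}^{s',a}) \,\Big|\, e^{\downarrow a} \Big] \\ &\quad = \int \mathbb{N}(de) \int_0^\infty da\ \mathbb{N}[G]\ \mathbb{N}\Big[ \Big\{ \int_0^{\ell^a_\infty(e)} ds'\, F(\sigma^a_{s'}(e)) \Big\} H \,\Big|\, e^{\downarrow a} \Big] \\ &\quad = \mathbb{N}[G] \int_0^\infty da \int \mathbb{N}(de) \Big( \Big\{ \int d\ell^a_s(e)\, F(s) \Big\} H(e) \Big) \\ &\quad = \mathbb{N}[G] \int \mathbb{N}(de) \Big( \Big\{ \int_0^\infty da \int d\ell^a_s(e)\, F(s) \Big\} H(e) \Big) \\ &\quad = \mathbb{N}[G]\ \mathbb{N}\Big[ H \int_0^\zeta ds\, F(s) \Big]. \end{aligned}$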


The next result says that if we pick an excursion e according to the standard excursion distribution $\mathbb{P}$ and then pick a point $(s, a) \in \Gamma_e$ according to the σ-finite length measure corresponding to the length measure $\mu^{T_e}$ on the associated tree $T_e$ (see the end of Section 3.1), then the following objects are independent:

(a) the length of the excursion above level a that straddles time s,

(b) the excursion obtained by taking the excursion above level a that straddles time s, turning it (by a shift of axes) into an excursion $\hat{e}^{s,a}$ above level zero starting at time zero, and then Brownian re-scaling $\hat{e}^{s,a}$ to produce an excursion of unit length,

(c) the excursion obtained by taking the excursion $\check{e}^{s,a}$ that comes from excising $\hat{e}^{s,a}$ and closing up the gap, and then Brownian re-scaling $\check{e}^{s,a}$ to produce an excursion of unit length,

(d) the starting time $\underline{s}(e, s, a)$ of the excursion above level a that straddles time s rescaled by the length of $\check{e}^{s,a}$ to give a time in the interval [0, 1].

Moreover, the length in (a) is “distributed” according to the σ-finite measure

(3.35) $\frac{1}{2\sqrt{2\pi}}\, \frac{d\rho}{\sqrt{(1 - \rho)\rho^3}}, \quad 0 \le \rho \le 1,$

the unit length excursions in (b) and (c) are both distributed as standard Brownian excursions (that is, according to $\mathbb{P}$), and the time in (d) is uniformly distributed on the interval [0, 1].

Recall from (3.20) the Brownian re-scaling map $S_c : U^1 \to U^c$.

Corollary 3.4.4. For non-negative measurable functions F on $\mathbb{R}_+$ and K on U × U,

$\begin{aligned} &\int \mathbb{P}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e, s, a) - \underline{s}(e, s, a)}\, F\Big( \frac{\underline{s}(e, s, a)}{\zeta(\check{e}^{s,a})} \Big)\, K(\hat{e}^{s,a}, \check{e}^{s,a}) \\ &\quad = \int_0^1 du\, F(u) \int \mathbb{P}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e, s, a) - \underline{s}(e, s, a)}\, K(\hat{e}^{s,a}, \check{e}^{s,a}) \\ &\quad = \int_0^1 du\, F(u)\ \frac{1}{2\sqrt{2\pi}} \int_0^1 \frac{d\rho}{\sqrt{(1 - \rho)\rho^3}} \int \mathbb{P}(de') \otimes \mathbb{P}(de'')\, K(S_\rho e', S_{1-\rho} e''). \end{aligned}$

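Proof. For a non-negative measurable function L on U × U, it follows straightforwardly from Proposition 3.4.3 that

(3.36) $\int \mathbb{N}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e, s, a) - \underline{s}(e, s, a)}\, F\Big( \frac{\underline{s}(e, s, a)}{\zeta(\check{e}^{s,a})} \Big)\, L(\hat{e}^{s,a}, \check{e}^{s,a}) = \int_0^1 du\, F(u) \int \mathbb{N}(de') \otimes \mathbb{N}(de'')\, L(e', e'')\, \zeta(e'').$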


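The left-hand side of equation (3.36) is, by (3.21),

(3.37) $\int_0^\infty \frac{dc}{2\sqrt{2\pi c^3}} \int \mathbb{P}(de) \int_{\Gamma_{S_c e}} ds \otimes da\ \frac{F\Big( \frac{\underline{s}(S_c e, s, a)}{\zeta(\widecheck{S_c e}^{\,s,a})} \Big)\, L\big( \widehat{S_c e}^{\,s,a}, \widecheck{S_c e}^{\,s,a} \big)}{\bar{s}(S_c e, s, a) - \underline{s}(S_c e, s, a)}.$

If we change variables to t = s/c and $b = a/\sqrt{c}$, then the integral for (s, a) over $\Gamma_{S_c e}$ becomes an integral for (t, b) over $\Gamma_e$. Also,

(3.38) $\underline{s}(S_c e, ct, \sqrt{c}\,b) = \sup\big\{ r < ct : \sqrt{c}\, e\big(\tfrac{r}{c}\big) < \sqrt{c}\, b \big\} = c\, \sup\{ r < t : e(r) < b \} = c\, \underline{s}(e, t, b),$

and, by similar reasoning,

(3.39) $\bar{s}(S_c e, ct, \sqrt{c}\,b) = c\, \bar{s}(e, t, b)$

and

(3.40) $\zeta\big( \widecheck{S_c e}^{\,ct, \sqrt{c} b} \big) = c\, \zeta(\check{e}^{t,b}).$

Thus (3.37) is

(3.41) $\int_0^\infty \frac{dc}{2\sqrt{2\pi c^3}} \int \mathbb{P}(de)\ \sqrt{c} \int_{\Gamma_e} dt \otimes db\ \frac{F\Big( \frac{\underline{s}(e, t, b)}{\zeta(\check{e}^{t,b})} \Big)\, L\big( \widehat{S_c e}^{\,ct, \sqrt{c} b}, \widecheck{S_c e}^{\,ct, \sqrt{c} b} \big)}{\bar{s}(e, t, b) - \underline{s}(e, t, b)}.$

Now suppose that L is of the form

(3.42) $L(e', e'') = K\big( R_{\zeta(e') + \zeta(e'')} e', R_{\zeta(e') + \zeta(e'')} e'' \big)\, \frac{M(\zeta(e') + \zeta(e''))}{\sqrt{\zeta(e') + \zeta(e'')}},$

where, for ease of notation, we put for e ∈ U and c > 0,

(3.43) $R_c e := S_{c^{-1}} e = \frac{1}{\sqrt{c}}\, e(c\, \cdot).$

Then (3.41) becomes

(3.44) $\int_0^\infty \frac{dc}{2\sqrt{2\pi c^3}} \int \mathbb{P}(de) \int_{\Gamma_e} dt \otimes db\ \frac{F\Big( \frac{\underline{s}(e, t, b)}{\zeta(\check{e}^{t,b})} \Big)\, K(\hat{e}^{t,b}, \check{e}^{t,b})\, M(c)}{\bar{s}(e, t, b) - \underline{s}(e, t, b)}.$

Since (3.44) was shown to be equivalent to the left hand side of (3.36), it follows from (3.21) that

(3.45) $\int \mathbb{P}(de) \int_{\Gamma_e} \frac{dt \otimes db}{\bar{s}(e, t, b) - \underline{s}(e, t, b)}\, F\Big( \frac{\underline{s}(e, t, b)}{\zeta(\check{e}^{t,b})} \Big)\, K(\hat{e}^{t,b}, \check{e}^{t,b}) = \frac{\int_0^1 du\, F(u)}{\mathbb{N}[M]} \int \mathbb{N}(de') \otimes \mathbb{N}(de'')\, L(e', e'')\, \zeta(e''),$

and the first equality of the statement follows.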


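We have from the identity (3.45) that, for any C > 0,

$\begin{aligned} &\mathbb{N}\{\zeta(e) > C\} \int \mathbb{P}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e, s, a) - \underline{s}(e, s, a)}\, K(\hat{e}^{s,a}, \check{e}^{s,a}) \\ &\quad = \int \mathbb{N}(de') \otimes \mathbb{N}(de'')\, K\big( R_{\zeta(e') + \zeta(e'')} e', R_{\zeta(e') + \zeta(e'')} e'' \big)\, \frac{\mathbf{1}\{\zeta(e') + \zeta(e'') > C\}}{\sqrt{\zeta(e') + \zeta(e'')}}\, \zeta(e'') \\ &\quad = \int_0^\infty \frac{dc'}{2\sqrt{2\pi c'^3}} \int_0^\infty \frac{dc''}{2\sqrt{2\pi c''}} \int \mathbb{P}(de') \otimes \mathbb{P}(de'')\, K\big( R_{c' + c''} S_{c'} e', R_{c' + c''} S_{c''} e'' \big)\, \frac{\mathbf{1}\{c' + c'' > C\}}{\sqrt{c' + c''}}. \end{aligned}$

Make the change of variables $\rho = \frac{c'}{c' + c''}$ and ξ = c′ + c′′ (with corresponding Jacobian factor ξ) to get

$\begin{aligned} &\int_0^\infty \frac{dc'}{2\sqrt{2\pi c'^3}} \int_0^\infty \frac{dc''}{2\sqrt{2\pi c''}} \int \mathbb{P}(de') \otimes \mathbb{P}(de'')\, K\big( R_{c' + c''} S_{c'} e', R_{c' + c''} S_{c''} e'' \big)\, \frac{\mathbf{1}\{c' + c'' > C\}}{\sqrt{c' + c''}} \\ &\quad = \Big( \frac{1}{2\sqrt{2\pi}} \Big)^2 \int_0^\infty d\xi \int_0^1 \frac{d\rho\ \xi}{\sqrt{\rho^3 (1 - \rho)\, \xi^4}}\, \frac{\mathbf{1}\{\xi > C\}}{\sqrt{\xi}} \int \mathbb{P}(de') \otimes \mathbb{P}(de'')\, K(S_\rho e', S_{1-\rho} e'') \\ &\quad = \Big( \frac{1}{2\sqrt{2\pi}} \Big)^2 \int_C^\infty \frac{d\xi}{\sqrt{\xi^3}} \int_0^1 \frac{d\rho}{\sqrt{\rho^3 (1 - \rho)}} \int \mathbb{P}(de') \otimes \mathbb{P}(de'')\, K(S_\rho e', S_{1-\rho} e''), \end{aligned}$

and the corollary follows upon recalling (3.19).

Corollary 3.4.5. (i) For x > 0,

$\int \mathbb{P}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e, s, a) - \underline{s}(e, s, a)}\, \mathbf{1}\Big\{ \max_{0 \le t \le \zeta(\hat{e}^{s,a})} \hat{e}^{s,a}(t) > x \Big\} = 2 \sum_{n=1}^\infty n x \exp(-2 n^2 x^2).$

(ii) For 0 < p ≤ 1,

$\int \mathbb{P}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e, s, a) - \underline{s}(e, s, a)}\, \mathbf{1}\{\zeta(\hat{e}^{s,a}) > p\} = \sqrt{\frac{1 - p}{2\pi p}}.$

Proof. (i) Recall first of all from Theorem 5.2.10 in [Kni81] that

(3.46) $\mathbb{P}\Big\{ e \in U^1 : \max_{0 \le t \le 1} e(t) > x \Big\} = 2 \sum_{n=1}^\infty (4 n^2 x^2 - 1) \exp(-2 n^2 x^2).$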


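By Corollary 3.4.4 applied to $K(e', e'') := \mathbf{1}\{\max_{t \in [0, \zeta(e')]} e'(t) \ge x\}$ and F ≡ 1,

$\begin{aligned} &\int \mathbb{P}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e, s, a) - \underline{s}(e, s, a)}\, \mathbf{1}\Big\{ \max_{0 \le t \le \zeta(\hat{e}^{s,a})} \hat{e}^{s,a}(t) > x \Big\} \\ &\quad = \frac{1}{2\sqrt{2\pi}} \int_0^1 \frac{d\rho}{\sqrt{\rho^3 (1 - \rho)}}\ \mathbb{P}\Big\{ \max_{t \in [0, \rho]} \sqrt{\rho}\, e(t/\rho) > x \Big\} \\ &\quad = \frac{1}{2\sqrt{2\pi}} \int_0^1 \frac{d\rho}{\sqrt{\rho^3 (1 - \rho)}}\ \mathbb{P}\Big\{ \max_{t \in [0, 1]} e(t) > \frac{x}{\sqrt{\rho}} \Big\} \\ &\quad = \frac{1}{2\sqrt{2\pi}} \int_0^1 \frac{d\rho}{\sqrt{\rho^3 (1 - \rho)}}\ 2 \sum_{n=1}^\infty \Big( 4 n^2 \frac{x^2}{\rho} - 1 \Big) \exp\Big( -2 n^2 \frac{x^2}{\rho} \Big) \\ &\quad = 2 \sum_{n=1}^\infty n x \exp(-2 n^2 x^2), \end{aligned}$

as claimed.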
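The last equality in part (i) can be checked term by term. The following worked computation is a sketch of one way to do so, assuming that the sum over n and the integral over ρ may be interchanged; the abbreviation $a := n^2 x^2$ and the substitution $t = 1/\rho - 1$ are choices made here and are not taken from the text.

```latex
% With t = 1/\rho - 1 one has \rho = 1/(1+t), 1-\rho = t/(1+t), d\rho = -dt/(1+t)^2,
% hence \rho^{-3/2}(1-\rho)^{-1/2}\,d\rho corresponds to t^{-1/2}\,dt on (0,\infty).
\begin{align*}
\frac{1}{2\sqrt{2\pi}} \int_0^1 \frac{d\rho}{\sqrt{\rho^3(1-\rho)}}\;
   2\Big(\frac{4a}{\rho} - 1\Big) e^{-2a/\rho}
 &= \frac{1}{\sqrt{2\pi}} \int_0^\infty \frac{dt}{\sqrt{t}}\,
    \big(4a(1+t) - 1\big)\, e^{-2a(1+t)} \\
 &= \frac{e^{-2a}}{\sqrt{2\pi}}
    \Big[ (4a - 1)\,\frac{\Gamma(1/2)}{(2a)^{1/2}}
          + 4a\,\frac{\Gamma(3/2)}{(2a)^{3/2}} \Big] \\
 &= \frac{e^{-2a}}{\sqrt{2\pi}} \cdot \frac{\sqrt{\pi}}{\sqrt{2a}}
    \big[ (4a - 1) + 1 \big]
  = 2\sqrt{a}\, e^{-2a}.
\end{align*}
% With a = n^2 x^2 this is 2 n x \exp(-2 n^2 x^2), the n-th term of the claimed sum.
```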

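(ii) Corollary 3.4.4 applied to $K(e', e'') := \mathbf{1}\{\zeta(e') \ge p\}$ and F ≡ 1 immediately yields

$\int \mathbb{P}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e, s, a) - \underline{s}(e, s, a)}\, \mathbf{1}\{\zeta(\hat{e}^{s,a}) > p\} = \frac{1}{2\sqrt{2\pi}} \int_p^1 \frac{d\rho}{\sqrt{\rho^3 (1 - \rho)}} = \sqrt{\frac{1 - p}{2\pi p}}.$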
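The closed form in part (ii) can be compared with a direct numerical evaluation of the integral. The sketch below is only a sanity check: it removes the integrable singularity at ρ = 1 through the substitution ρ = 1 − u² (a choice made here, not in the text) and applies the midpoint rule; the function names are illustrative.

```python
import math


def length_tail_numeric(p, steps=20000):
    """Midpoint-rule value of (1 / (2 sqrt(2 pi))) * int_p^1 d rho / sqrt(rho^3 (1 - rho))."""
    upper = math.sqrt(1.0 - p)  # rho = 1 - u^2 maps [p, 1] onto u in [0, sqrt(1 - p)]
    h = upper / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += 2.0 * (1.0 - u * u) ** -1.5  # transformed integrand, smooth on [0, upper]
    return total * h / (2.0 * math.sqrt(2.0 * math.pi))


def length_tail_closed_form(p):
    """Right-hand side sqrt((1 - p) / (2 pi p)) of part (ii)."""
    return math.sqrt((1.0 - p) / (2.0 * math.pi * p))


for p in (0.25, 0.5, 0.9):
    print(p, length_tail_numeric(p), length_tail_closed_form(p))
```

The agreement reflects the antiderivative $-2\sqrt{(1-\rho)/\rho}$ of the integrand $\rho^{-3/2}(1-\rho)^{-1/2}$.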

Proof of Theorem 3.4.1. (i) The first equality is clear from the definition of $R_x(T, v)$ and Fubini’s theorem.

Turning to the equality of the first and last terms, first recall from Definition 3.2.1 that $\mathbf{P}$ is the push-forward on $(\mathbb{T}^{\mathrm{wt}}, d_{\mathrm{GH}^{\mathrm{wt}}})$ of the normalized excursion measure $\mathbb{P}$ by the map $e \mapsto (T_{2e}, d_{T_{2e}}, \nu_{T_{2e}})$, where $2e \in U^1$ is just the excursion path t ↦ 2e(t). In particular, $T_{2e}$ is the quotient of the interval [0, 1] by the equivalence relation defined by 2e. By the invariance of the standard Brownian excursion under random re-rooting (see Section 2.7 of [Ald91b]), the point in $T_{2e}$ that corresponds to the equivalence class of 0 ∈ [0, 1] is distributed according to $\nu_{T_{2e}}$ when e is chosen according to $\mathbb{P}$. Moreover, recall from the end of Section 3.1 that for $e \in U^1$, the length measure $\mu^{T_e}$ is the push-forward of the measure $ds \otimes da\, \frac{1}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}\, \delta_{\underline{s}(e,s,a)}$ on the sub-graph $\Gamma_e$ by the quotient map defined in (3.3).

It follows that if we pick T according to $\mathbf{P}$ and then pick (u, v) ∈ T × T according to $\mu^T \otimes \nu_T$, then the subtree $S^{T,u,v}$ that arises has the same σ-finite law as the tree associated with the excursion $2\hat{e}^{s,a}$ when e is chosen according to $\mathbb{P}$ and (s, a) is chosen according to the measure $ds \otimes da\, \frac{1}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}\, \delta_{\underline{s}(e,s,a)}$ on the sub-graph $\Gamma_e$.


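Therefore, by part (i) of Corollary 3.4.5,

(3.47) $\begin{aligned} \mathbf{P}\Big[ \int_T \nu_T(dv) \int_T \mu^T(du)\, \mathbf{1}\big\{ \mathrm{height}(S^{T,u,v}) > x \big\} \Big] &= 2 \int \mathbb{P}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e, s, a) - \underline{s}(e, s, a)}\, \mathbf{1}\Big\{ \max_{0 \le t \le \zeta(\hat{e}^{s,a})} \hat{e}^{s,a}(t) > \frac{x}{2} \Big\} \\ &= 2 \sum_{n=1}^\infty n x \exp(-n^2 x^2 / 2). \end{aligned}$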

Part (ii) is a consequence of part (i) and some straightforward calculus.
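The calculus behind part (ii) can be spelled out as follows. This is a sketch of one possible route, writing h for height$(S^{T,u,v})$ and using the layer-cake formula together with the tail from part (i); the interchange of sum and integral is justified by non-negativity of all terms.

```latex
\begin{align*}
\mathbf{P}\Big[\int_T \nu_T(dv) \int_T \mu^T(du)\, h^\alpha\Big]
 &= \int_0^\infty \alpha x^{\alpha - 1}\,
    \mathbf{P}\big[\mu^T \otimes \nu_T\{h > x\}\big]\, dx
  = 2\alpha \sum_{n \ge 1} n \int_0^\infty x^{\alpha} e^{-n^2 x^2/2}\, dx \\
 &= 2\alpha \sum_{n \ge 1} n \cdot 2^{\frac{\alpha - 1}{2}}\, n^{-\alpha - 1}\,
    \Gamma\Big(\frac{\alpha + 1}{2}\Big)
    \qquad \text{(substituting } y = n^2 x^2 / 2\text{)} \\
 &= 2^{\frac{\alpha + 1}{2}}\, \alpha\, \Gamma\Big(\frac{\alpha + 1}{2}\Big)
    \sum_{n \ge 1} n^{-\alpha}.
\end{align*}
% The series converges exactly when \alpha > 1, which is the range stated in part (ii).
```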

Part (iii) follows immediately from part (ii) of Corollary 3.4.5.

Part (iv) is a consequence of part (iii) and some more straightforward calculus.
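For part (iv) the corresponding computation is a Beta integral. The sketch below writes w for $\nu_T(S^{T,u,v})$, which takes values in [0, 1], and again uses the layer-cake formula with the tail from part (iii).

```latex
\begin{align*}
\mathbf{P}\Big[\int_T \nu_T(dv) \int_T \mu^T(du)\, w^\beta\Big]
 &= \int_0^1 \beta p^{\beta - 1} \sqrt{\frac{2(1 - p)}{\pi p}}\; dp
  = \beta \sqrt{\frac{2}{\pi}} \int_0^1 p^{\beta - \frac{3}{2}} (1 - p)^{\frac{1}{2}}\, dp \\
 &= \beta \sqrt{\frac{2}{\pi}}\,
    \frac{\Gamma\big(\beta - \tfrac{1}{2}\big)\, \Gamma\big(\tfrac{3}{2}\big)}{\Gamma(\beta + 1)}
  = \beta \sqrt{\frac{2}{\pi}} \cdot \frac{\sqrt{\pi}}{2} \cdot
    \frac{\Gamma\big(\beta - \tfrac{1}{2}\big)}{\beta\, \Gamma(\beta)}
  = 2^{-\frac{1}{2}}\, \frac{\Gamma\big(\beta - \tfrac{1}{2}\big)}{\Gamma(\beta)}.
\end{align*}
% The Beta integral is finite exactly when \beta > 1/2, the range stated in part (iv).
```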

3.5. Existence of the reactant branching trees

In this section we introduce the family forests, together with their contour processes, of a catalytic branching particle model. We then state that suitably rescaled versions, cut off at the random height at which the catalyst mass falls below a given threshold, converge.

The catalytic branching model consists of two different populations of distinct individuals: the catalyst population and the reactant population. The pair of populations $(\eta, \xi) = (\eta_t, \xi_t)_{t \ge 0}$ evolves as a Markov process according to the following rules. The catalyst population η is a classical continuous-time critical branching process: every individual has an exponential lifetime with parameter 1 after which it is replaced by 0 or 2 offspring with equal probability. The reactant population ξ evolves analogously except that the exponential lifetime distribution of each individual is replaced by a lifetime distribution $F(t) = 1 - \exp\big(-\int_s^{s+t} b(u)\,du\big)$, where s is the birth time of this individual and b(u) equals the total number of catalyst individuals at time u.

The total number of individuals in the catalyst-reactant population is a continuous-time Markov process with values in $(m\mathbb{N})^2$ if we associate mass m with an individual. Typically one sets the mass of a single particle m = 1. However, to obtain a reasonable limit of the rescaled catalytic branching model we will need to scale down a single particle's mass contribution.

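Definition 3.5.1 (Total mass processes).
• The catalyst total mass process $\eta^{\mathrm{tot}} = (\eta^{\mathrm{tot}}_t)_{t \ge 0}$ is a critical binary Galton-Watson process with constant branching rate 1,

$$
\eta^{\mathrm{tot}}_t \equiv (\eta^{\mathrm{tot}}_t; m) \mapsto
\begin{cases} \eta^{\mathrm{tot}}_t + m \\ \eta^{\mathrm{tot}}_t - m \end{cases}
\quad \text{each at rate } \tfrac{1}{2}\, \frac{\eta^{\mathrm{tot}}_t}{m},
\tag{3.48}
$$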


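• given $\eta^{\mathrm{tot}}$, the reactant total mass process $\xi^{\mathrm{tot}} = (\xi^{\mathrm{tot}}_t)_{t \ge 0}$ is a critical binary Galton-Watson process with time-inhomogeneous branching rate $\eta^{\mathrm{tot}}_t$,

$$
\xi^{\mathrm{tot}}_t \equiv (\xi^{\mathrm{tot}}_t; m) \mapsto
\begin{cases} \xi^{\mathrm{tot}}_t + m \\ \xi^{\mathrm{tot}}_t - m \end{cases}
\quad \text{each at rate } \tfrac{1}{2}\, \eta^{\mathrm{tot}}_t\, \frac{\xi^{\mathrm{tot}}_t}{m}.
\tag{3.49}
$$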
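The jump rates (3.48) and (3.49) can be explored with a Gillespie-type simulation. The following is only a minimal sketch: the function name `simulate_total_mass` and its parameters are illustrative and do not appear in the text, and individuals are tracked as integer counts so that the masses stay on the lattice $m\mathbb{N}$.

```python
import random

def simulate_total_mass(k_eta0, k_xi0, m=1.0, t_max=10.0, seed=0):
    """Simulate the total mass pair of Definition 3.5.1 (illustrative helper).

    k_eta0, k_xi0 are the initial numbers of catalyst / reactant individuals;
    each individual carries mass m. Returns a list of (time, eta_tot, xi_tot)
    recorded at time 0 and after every jump up to time t_max.
    """
    rng = random.Random(seed)
    t, k_eta, k_xi = 0.0, k_eta0, k_xi0
    path = [(t, m * k_eta, m * k_xi)]
    while True:
        rate_eta = float(k_eta)        # up + down: 2 * (1/2) * eta_tot / m
        rate_xi = (m * k_eta) * k_xi   # up + down: 2 * (1/2) * eta_tot * xi_tot / m
        total = rate_eta + rate_xi
        if total == 0.0:
            break                      # catalyst extinct: nothing jumps any more
        t += rng.expovariate(total)
        if t > t_max:
            break
        if rng.random() * total < rate_eta:
            k_eta += rng.choice((-1, 1))   # catalyst gains or loses one individual
        else:
            k_xi += rng.choice((-1, 1))    # reactant gains or loses one individual
        path.append((t, m * k_eta, m * k_xi))
    return path
```

Once the catalyst count reaches 0 both rates vanish, so the loop stops and the reactant mass stays frozen at its last value.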

Remark 3.5.2. Recall that criticality of the catalyst process implies that $\eta^{\mathrm{tot}}$ will almost surely get absorbed at 0. Let

$$
T^{0,1} := \inf\big\{t \ge 0 : \eta^{\mathrm{tot}}_t = 0\big\}.
\tag{3.50}
$$

Notice that after the catalyst mass process is absorbed at 0, the reactant mass process gets absorbed as well. That is, in the reactant family tree the most recent common ancestor of any two points at height greater than or equal to $T^{0,1}$ has height smaller than $T^{0,1}$, or equivalently, after $T^{0,1}$ the branches simply extend to ∞.

The standard way of representing genealogical relationships between individuals in a branching population is with a family forest. The family forest consists of as many trees as the initial number of individuals. Each individual in a branching process has an edge associated to it whose length is equal to its lifetime. When an individual gives birth the edge branches into two new edges, while its death turns its edge end into a leaf. In particular, time in the branching process corresponds to height in the family forest which is measured in terms of the distance to the roots.

Recall from Section 1.8 the collection $T^{\mathrm{root}}$ of all root invariant isometry classes of real trees. In what follows we will think of a forest as a rooted R-tree whose root will have degree 1 or larger. For all t ≥ 0, let $\partial Q_t$ denote the set of individuals alive at time t in the branching population. Define the genealogical distance metric for (T, d, ρ) by

$$
d\big((t_1, \iota_1), (t_2, \iota_2)\big) := t_1 + t_2 - 2\tau\big((t_1, \iota_1), (t_2, \iota_2)\big),
\tag{3.51}
$$

where $\tau\big((t_1, \iota_1), (t_2, \iota_2)\big)$ denotes the death time of, or the splitting time from, the "most recent common ancestor" of the individuals $\iota_1$ and $\iota_2$ alive at times $t_1$ and $t_2$, respectively. Moreover, for any two (0, ρ), (0, ρ′) ∈ $\partial Q_0$, by our convention we have $d\big((0, \rho), (0, \rho')\big) = 0$.

Let then

$$
\xi^{\mathrm{for}} := \Big(\bigcup_{t \in \mathbb{R}_+,\ \iota \in \partial Q^{\xi}_t} \{(t, \iota)\},\ d^{\xi},\ \rho^{\xi}\Big)
\tag{3.52}
$$

be the rooted R-tree generated by the reactant population.

We now consider a family $(\eta^n, \xi^n)$ of catalytic branching particle models indexed by n, where the parameter n means that the initial number of individuals is increased by a factor n and the branching rate of the catalyst is sped up by a factor n.

With probability of order O(1/n) the height of a Galton-Watson tree is of order O(n), and given that a Galton-Watson process is still alive at a time of order O(n), its population size at that time is of order O(n). Hence, starting initially with n particles and speeding up time by a factor of n yields in the limit a Poisson number of trees, each having total population size of order O(n²). We therefore put

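$$
\eta^{\mathrm{tot},n}_0 = \xi^{\mathrm{tot},n}_0 := 1,
\tag{3.53}
$$

and let

$$
\eta^{\mathrm{tot},n} := \big(\eta^{\mathrm{tot}}_{n\cdot}; \tfrac{1}{n}\big)
\quad \text{and} \quad
\xi^{\mathrm{tot},n} := \big(\xi^{\mathrm{tot}}_{n\cdot}; \tfrac{1}{n}\big).
\tag{3.54}
$$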

With standard techniques one can show that there exists a Markov process (X, Y) with paths in $D_{\mathbb{R}_+ \times \mathbb{R}_+}[0, \infty)$ such that

$$
\big(\eta^{\mathrm{tot},n}, \xi^{\mathrm{tot},n}\big) \underset{n \to \infty}{\Longrightarrow} (X, Y)
\tag{3.55}
$$

(see, for example, [Pen03]). Moreover, (X, Y) is the unique strong solution to the following system of stochastic differential equations

$$
\begin{aligned}
dX_t &= \sqrt{X_t}\, dW^X_t, \\
dY_t &= \sqrt{X_t Y_t}\, dW^Y_t,
\end{aligned}
\tag{3.56}
$$

with initial value $(X_0, Y_0) := (1, 1)$, and where $W^X = (W^X_t)_{t \ge 0}$ and $W^Y = (W^Y_t)_{t \ge 0}$ are two independent standard Brownian motions on the real line. Let then

(3.57) $\xi^{\mathrm{for},n}$ be the family forest of rooted R-trees associated with $\xi^n$.
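For intuition about the limit pair (X, Y) in (3.56), one can discretize the system with an Euler-Maruyama scheme. This is a rough sketch only: the function name, the step size, and in particular the truncation of negative values at 0 are assumptions made for the illustration and are not part of the text.

```python
import math
import random

def euler_maruyama(x0=1.0, y0=1.0, dt=1e-3, n_steps=1000, seed=0):
    """Approximate one path of dX = sqrt(X) dW^X, dY = sqrt(X*Y) dW^Y.

    Negative values produced by the discretization are truncated at 0
    (an assumption of this sketch), so that 0 is absorbing for X and
    Y stops moving once X has reached 0.
    """
    rng = random.Random(seed)
    x, y = x0, y0
    xs, ys = [x], [y]
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        dwx = sqrt_dt * rng.gauss(0.0, 1.0)   # increment of W^X
        dwy = sqrt_dt * rng.gauss(0.0, 1.0)   # independent increment of W^Y
        x_new = max(x + math.sqrt(x) * dwx, 0.0)
        y_new = max(y + math.sqrt(x * y) * dwy, 0.0)
        x, y = x_new, y_new
        xs.append(x)
        ys.append(y)
    return xs, ys
```

The diffusion coefficient of Y is driven by the current value of X, which mirrors the role of the catalyst mass as branching rate of the reactant.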

By (3.55), we can realize the rescaled catalyst total mass processes and the limiting Feller diffusion $X = (X_t)_{t \ge 0}$ on a common probability space such that for all T > 0

$$
\sup_{t \le T} \big|\eta^{\mathrm{tot},n}_t - X_t\big| \underset{n \to \infty}{\longrightarrow} 0.
\tag{3.58}
$$

Hence, branching rates of the reactant processes $\xi^{\mathrm{tot},n}$ will be given by a sequence of functions in Skorohod space which converge uniformly on compacta to a continuous limit function X. Given t > 0 and $(T, \rho) \in T^{\mathrm{root,lin}}_{\mathrm{fin}}$, let

$$
Q_t(T, \rho) := \big\{x \in T : d(x, \rho) \le t\big\}
\tag{3.59}
$$

be the cut operator which takes a forest and cuts off the portion that lies above height t. Put then

$$
\partial Q_t(T, \rho) := \big\{x \in T : d(x, \rho) = t\big\}.
\tag{3.60}
$$

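Let

$$
\xi^{\mathrm{for},n,\delta} := Q_{T^{\delta,n}}\big(\xi^{\mathrm{for},n}\big)
\tag{3.61}
$$

denote the reactant forest cut off at height $T^{\delta,n}$, the first time at which the rescaled catalyst total mass drops below δ, and assume that for all T > 0

$$
\sup_{t \le T} \big|\eta^{\mathrm{tot},n}_t - X_t\big| \underset{n \to \infty}{\longrightarrow} 0.
\tag{3.62}
$$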


In the following we study the behavior of the rescaled reactant forests $\xi^{\mathrm{for},n,\delta}$ under Assumption (3.62).

A lexicographic labeling, also called the Ulam-Harris labeling, gives each individual a label from the set $\mathbb{N}^n$, where n ∈ {1, 2, . . .} is the number of ancestors the individual has had from the start of the process. The initial individuals are given labels between 1 and the initial population size, distributed in a random order to them. Each parent subsequently gives a label to its children that consists of its own label followed by a number between 1 and the total number of its children, distributing these randomly amongst its children. Recall also Figure 3.2.
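The labeling rule just described can be sketched in a few lines. The representation of the genealogy as a list `roots` plus a dictionary `children` is a hypothetical choice for this illustration; labels are stored as tuples of positive integers.

```python
import random

def ulam_harris_labels(roots, children, seed=0):
    """Assign Ulam-Harris labels (tuples of positive integers).

    roots:    list of the initial individuals (any hashable names)
    children: dict mapping an individual to the list of its children
    Both arguments are illustrative data structures, not notation from the text.
    """
    rng = random.Random(seed)
    labels = {}
    first = list(range(1, len(roots) + 1))
    rng.shuffle(first)                      # random order among initial individuals
    stack = []
    for root, k in zip(roots, first):
        labels[root] = (k,)
        stack.append(root)
    while stack:
        parent = stack.pop()
        kids = children.get(parent, [])
        ranks = list(range(1, len(kids) + 1))
        rng.shuffle(ranks)                  # random order among siblings
        for kid, k in zip(kids, ranks):
            labels[kid] = labels[parent] + (k,)   # parent's label followed by a rank
            stack.append(kid)
    return labels
```

Comparing these tuples lexicographically gives the linear order on individuals that the contour construction relies on.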

Recall then from (3.7) the map C(·; σ) which associates a finite ordered tree with the excursion obtained by "walking around the tree" (respecting the order) with speed σ and recording the height profile. Notice that since branching is sped up by a factor of n all edges are of order O(1/n). As a consequence, in order to find a non-trivial limit contour, in the nth approximation step we need to traverse the rescaled forest $\xi^{\mathrm{for},n}$ at speed σ = n. We therefore put

$$
C^{\delta,n} = \big(C^{\delta,n}_u\big)_{u \ge 0} := C\big(\xi^{\mathrm{for},\delta,n}; n\big).
\tag{3.63}
$$
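To make the "walking around the tree" picture concrete, here is a small sketch of a contour map for finite ordered forests. The nested-tuple encoding `(edge_length, [children])` and the function name `contour` are assumptions of this example; the map C(·; σ) of (3.7) itself is defined elsewhere in the text.

```python
def contour(forest, speed=1.0):
    """Breakpoints (u, height) of the piecewise linear contour path.

    forest: list of trees, each tree a pair (edge_length, [subtrees]) with
            subtrees listed in their linear order (illustrative encoding).
    speed:  rate at which height is gained or lost per unit of time u.
    """
    points = [(0.0, 0.0)]

    def walk(tree, base):
        length, kids = tree
        u, _ = points[-1]
        top = base + length
        points.append((u + length / speed, top))    # walk up the edge
        for kid in kids:                            # visit subtrees in order
            walk(kid, top)
        u, _ = points[-1]
        points.append((u + length / speed, base))   # walk back down the edge

    for tree in forest:
        walk(tree, 0.0)
    return points
```

Each edge is traversed twice, so the contour of a forest with total edge length L has duration 2L/σ; this is why edges of order 1/n call for speed σ = n.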

The main result of this section describes the behavior of the limit of the rescaled reactant forests in a catalytic environment that is stopped at $T^{\delta,n}$. We will give the proof via random evolutions in the next subsection.

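Theorem 3.5.3 (Limit of the reactant contour process). Assume (3.62), and fix δ > 0. Consider the operator $(A^\delta, D(A^\delta))$ with

$$
A^\delta f(c) := \Big(\frac{1}{2X_c} f'\Big)'(c)
\tag{3.64}
$$

on $[0, \tau^\delta]$ with reflection on the boundary, that is with domain

$$
D(A^\delta) := \Big\{ f \in C^1_{[0,\tau^\delta]}[0,\infty) : \frac{1}{X_\cdot} f' \in C^1_{[0,\tau^\delta]}[0,\infty),\ f'\big|_{\{0,\tau^\delta\}} \equiv 0 \Big\}.
\tag{3.65}
$$

Then the following holds:

(i) The $(A^\delta, D(A^\delta))$-martingale problem is well-posed.
(ii) If $\zeta^\delta$ is the solution of the $(A^\delta, D(A^\delta))$-martingale problem then

$$
\big(C^{\delta,n}; \eta^{\mathrm{tot},n}\big) \underset{n \to \infty}{\Longrightarrow} \big(\zeta^\delta; X\big),
\tag{3.66}
$$

where ⇒ here means weak convergence on $C_{\mathbb{R}_+}([0,\infty))$ with respect to the uniform topology on compacta.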

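Recall that $T : C^0_{\mathbb{R}_+}[0,\infty) \to T^{\mathrm{root}}$ maps an excursion to a rooted linearly ordered compact R-tree. From Theorem 3.5.3 we can immediately conclude the following.

Corollary 3.5.4. For all δ > 0,

$$
\big(\xi^{\mathrm{for},\delta,n}; \eta^{\mathrm{tot},n}\big) \underset{n \to \infty}{\Longrightarrow} T\big(\zeta^\delta; X\big).
\tag{3.67}
$$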


3.6. Random evolutions: Proof of Theorem 3.5.3

In this section we give the proof of Theorem 3.5.3. We first show that given a realization of the catalyst total mass process the reactant contour process is associated with a Markov process, and we derive its generator. We also give a representation of this Markov process as a random evolution process. By this we mean that it moves at constant velocity for a random time, then changes the sign of its velocity and proceeds at constant velocity for a random time again. We use this representation to prove that the suitably rescaled contour processes of the truncated reactant forest converge towards a limit contour process which is characterized as the solution of a well-posed martingale problem.

Throughout this section a realization $\eta^{\mathrm{tot}} := (\eta^{\mathrm{tot}}_s)_{s \ge 0} \in D_{\mathbb{N}}[0,\infty)$ of the catalyst path is fixed. Recall once more from (3.7) the map C(·; σ) which associates a finite ordered tree with the excursion obtained by "walking around the tree" (respecting the order) with speed σ, put

$$
C = (C_u)_{u \ge 0} := C\big(\xi^{\mathrm{for}}; 1\big),
\tag{3.68}
$$

and let its slope process $V := (V_u)_{u \ge 0}$ be defined by

$$
V_u := \mathrm{slope}(C_u) \in E_{\mathrm{slope}}
$$

with

$$
E_{\mathrm{slope}} := \{-1, +1\}, \qquad E_{\mathrm{cont}} := \big[0, T^{0,1}\big],
$$

where the slope at the root, branch points and leaves is defined in such a way that $(V_u)_{u \ge 0}$ has càdlàg paths.

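The next result states that the functional pairing of the height of the contour with its slope is a Markov process.

Lemma 3.6.1 (Markov property of the contour process). The process $(C, V) := (C_u, V_u)_{u \ge 0}$ on $E_{\mathrm{cont}} \times E_{\mathrm{slope}}$ is a Markov process whose generator is the closure of the operator

$$
A f(c, v) = v\, \frac{\partial}{\partial c} f(c, v) + \eta^{\mathrm{tot}}_c \big[f(c, -v) - f(c, v)\big],
\tag{3.69}
$$

for all f ∈ D(A), where

$$
D(A) = \Big\{ f \in C^{1,0}_{E_{\mathrm{cont}} \times E_{\mathrm{slope}}}[0,\infty) : \frac{\partial f}{\partial c}\Big|_{\partial(E_{\mathrm{cont}}) \times E_{\mathrm{slope}}} \equiv 0 \Big\}.
\tag{3.70}
$$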

Proof. Recall that $C(\cdot, \sigma) : T^{\mathrm{root,lin}}_{\mathrm{fin}} \to C_{\mathbb{R}_+}[0,\infty)$ maps a rooted R-tree to an excursion, as illustrated in Figure 3.2. We first show that the lengths of the line segments of the contour process are independent of each other and then we use this to obtain the Markov property and to identify the generator.


Step 1 (Independence of the lengths of the line segments). Recall from Figure 3.2 the assignment of a contour process to the representative of a Galton-Watson family forest embedded in the plane. Each piece of the contour process with constant slope sign corresponds to a sum of a number of lifetimes, so independence of the different line segments is not obvious. We shall make use of the fact that the reactant process can be represented in two ways without changing the distribution of its total mass process or the genealogical distances between individuals, either as a continuous time binary Galton-Watson process, or as a birth and death process.

• In the continuous time Galton-Watson process the branch points occur at rate $\eta^{\mathrm{tot}}_t$, and the number of offspring at each branch point is 0 or 2 with equal probability.

• In the birth and death process each individual dies at rate $\tfrac{1}{2}\eta^{\mathrm{tot}}_t$, and during its lifetime gives birth to new offspring at rate $\tfrac{1}{2}\eta^{\mathrm{tot}}_t$.

If in the Galton-Watson process at each birth time we choose to identify the life of one of the offspring as a continuation of the life of its parent, we obtain the birth-and-death process.

The family forest of the Galton-Watson process has a canonical planar embedding coming from a linear order on its vertices induced by the linear order of the tree as in Figure 3.5(a), while the family forest of a birth-and-death process with that same linear order on the vertices has two canonical planar embeddings. In Figure 3.5(b) we always choose to identify the continuation of the life of the parent with the life of the offspring of higher linear order, and the branch of the offspring is always drawn to the left of the branch of the parent. In Figure 3.5(c) we always choose to identify the continuation of the life of the parent with the life of the offspring of lower linear order, and the branch of the offspring is always drawn to the right of the branch of the parent.

The key observation now is that since all three planar embeddings respect the same linear order on the vertices they also have the same contour process (compare Figure 3.2).

In the birth and death process the line segments of constant slope correspond to a lifetime of exactly one individual. With the parent identification as in Figure 3.5(b) each line segment of negative slope corresponds to a lifetime of an individual, while with the parent identification as in Figure 3.5(c) each line segment of positive slope corresponds to a lifetime of an individual. Since lifetimes of individuals are independent this implies the independence of all constant slope line segments in the contour process as claimed.

Step 2 (Identification of the generator). Given the branch rates $(\eta^{\mathrm{tot}}_t)_{t \ge 0}$, the law of the length l of a lifetime of an individual is

$$
P\{l > t\} = \exp\Big(-\int_{t^*}^{t^* + t} \eta^{\mathrm{tot}}_s\, ds\Big),
$$

where $t^*$ is the time of birth of that individual. Hence the law of the length of each segment is the law of the first point of a Poisson process with rate $(\eta^{\mathrm{tot}}_t)_{t \ge t^*}$ or $(\eta^{\mathrm{tot}}_t)_{t \le t^*}$ if the slope of the line segment is +1 and −1, respectively. The sign of $V_u$ changes at rate $\eta^{\mathrm{tot}}_{C_u}$, where $C_u$ is the current value of the contour process. In between the jumps of $V_u$, $C_u$ moves with unit speed in the direction determined by $V_u$. Hence, the contour process paired with its slope is Markovian, and furthermore its generator agrees on D(A) with the operator given in (3.69) and (3.70). Since D(A) is dense in $C_{E_{\mathrm{cont}} \times E_{\mathrm{slope}}}[0,\infty)$ the generator is the closure of the operator (A, D(A)).

Figure 3.5. Three different planar embeddings, (a), (b) and (c), of the reactant family forest with the same linear order and a common contour process.

We next notice that (C, V) is a random evolution, i.e., a Markov process moving at constant speed in a direction which changes stochastically. Specifically, for the pair (C, V) the changes of direction are driven by a counting process whose rate is governed by the catalyst mass process $\eta^{\mathrm{tot}}$.

Lemma 3.6.2 (Random evolution representation). Let $N := (N(u))_{u \ge 0}$ be a unit rate Poisson process, and consider the following system

$$
C_u = \int_0^u V_v\, dv, \qquad V_u = (-1)^{N\left(\int_0^u \eta^{\mathrm{tot}}_{C_v}\, dv\right)},
\tag{3.71}
$$

in $\mathrm{int}(E_{\mathrm{cont}}) \times E_{\mathrm{slope}}$ and with reflection on $\partial(E_{\mathrm{cont}}) \times E_{\mathrm{slope}}$. There exists a unique random evolution satisfying this system, and its distribution is the same as the distribution of the contour process and its slope (C, V).

Proof. See Chapter 12 of [EK86] for the definition of a random evolution. Existence and uniqueness of a random evolution satisfying the system (3.71) follow from standard theorems on existence and uniqueness of solutions of systems of stochastic differential equations with continuous local martingales as differentials (see Theorem 3.15 in [Pro77]). Equality in distribution of this random evolution with the contour process and its slope follows by simply comparing the generator of this system to the generator obtained in the previous lemma.
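The dynamics behind (3.71) can be illustrated by a crude time-stepping simulation. This is a sketch under explicit assumptions: the catalyst path is passed in as an arbitrary function `eta` of the height (hypothetical argument), the slope is flipped in each step of length `du` with the probability that a clock of rate `eta(c)` rings, and reflection is approximated by clipping at the boundary. It is not an exact sampler of the system.

```python
import math
import random

def simulate_random_evolution(eta, height_max, u_max, du=1e-3, seed=0):
    """Approximate (C_u, V_u): unit speed motion with slope flips at rate eta(C_u).

    eta:        function c -> catalyst mass at height c (assumed given)
    height_max: upper end of the contour state space [0, height_max]
    Returns a list of (u, C_u, V_u) on the grid u = 0, du, 2*du, ...
    """
    rng = random.Random(seed)
    c, v = 0.0, 1
    path = [(0.0, c, v)]
    n_steps = int(round(u_max / du))
    for k in range(1, n_steps + 1):
        # probability that a clock with rate eta(c) rings within du
        if rng.random() < 1.0 - math.exp(-eta(c) * du):
            v = -v
        c += v * du
        if c <= 0.0:                 # reflection at the root level
            c, v = 0.0, 1
        elif c >= height_max:        # reflection at the top of the state space
            c, v = height_max, -1
        path.append((k * du, c, v))
    return path
```

For instance, `simulate_random_evolution(lambda c: 5.0, 1.0, 2.0, du=0.01)` produces a zigzag path in [0, 1] whose slope flips at constant rate 5.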


We will also need the following easy consequence of the above arguments. Fix δ > 0, and let $T^{\delta,1}$ be the first time that the catalyst process started from 1 individual drops below δ. Also recall that $Q_{T^{\delta,1}}$ from (3.59) is the map that takes a tree and cuts the portion of its branches that lie above height $T^{\delta,1}$.

Corollary 3.6.3 (Truncated process). If the contour $C(\xi^{\mathrm{for}}; 1)$ solves (3.71) then $C(Q_{T^{\delta,1}}(\xi^{\mathrm{for}}); 1)$ solves (3.71) but with the state space $E_{\mathrm{cont}}$ replaced by $E^{\delta}_{\mathrm{cont}} = [0, T^{\delta,1}]$.

We next show that the rescaled random evolutions converge to the solution of the well-posed martingale problem stated in Theorem 3.5.3. We rely on averaging techniques established for random evolutions. Throughout we fix realizations of $\eta^{\mathrm{tot},n}$, n ∈ N, and of X on a single probability space such that (3.62) holds and we choose a truncation parameter δ > 0.

Step 3 (The rescaled random evolution system). Recall the rescaled reactant contour process $C^{\delta,n}$ from (3.63). Let $E^{\delta,n}_{\mathrm{cont}} := [0, T^{\delta,n}]$, and define its rescaled slope process $V^{\delta,n} := \mathrm{sign}\big(\mathrm{slope}(C^{\delta,n}_\cdot)\big)$.

Then Lemma 3.6.1 applied to the rescaled reactant populations implies that $(C^{\delta,n}, V^{\delta,n})$ is a Markov process whose generator is the closure of the operator

$$
A^{\delta,n} f(c, v) = n v\, \frac{\partial}{\partial c} f(c, v) + n^2 \eta^{\mathrm{tot},n}_c \big[f(c, -v) - f(c, v)\big],
\tag{3.72}
$$

acting on all $f \in D(A^{\delta,n})$, where

$$
D(A^{\delta,n}) = \Big\{ f \in C^{1,0}_{E^{\delta,n}_{\mathrm{cont}} \times E_{\mathrm{slope}}}[0,\infty) : \frac{\partial f}{\partial c}\Big|_{\{0, T^{\delta,n}\} \times E_{\mathrm{slope}}} \equiv 0 \Big\}.
$$

Furthermore, the analogous argument to that of Lemma 3.6.2 implies that the pair $(C^{\delta,n}, V^{\delta,n})$ has the same distribution as the random evolution which is the unique solution to the system

$$
C^{\delta,n}_u = n \int_0^u V^{\delta,n}_v\, dv, \qquad
V^{\delta,n}_u = (-1)^{N\left(n^2 \int_0^u \eta^{\mathrm{tot},n}_{C^{\delta,n}_v}\, dv\right)},
\tag{3.73}
$$

where N is a unit rate Poisson process. In other words, the rescaled process $C^{\delta,n}$ evolves deterministically with speed n, and its velocity changes sign according to a counting process whose rate is $n^2$ times the rescaled catalyst mass $\eta^{\mathrm{tot},n}$ at the current height.

Step 4 (The velocity process). The convergence result stated in Theorem 3.5.3 relies on the fact that the velocity component $V^{\delta,n}$ evolves much faster than the contour component $C^{\delta,n}$, as is clear from (3.73). Hence in the limit the velocity process will average out and can be replaced with its stationary measure.


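If $\Gamma_n$ is the occupation time measure of $V^{\delta,n}$ on $E_{\mathrm{slope}}$, i.e., for u ≥ 0 and $v \in E_{\mathrm{slope}}$,

$$
\Gamma_n\big([0, u] \times \{v\}\big) := \int_0^u \mathbf{1}_{\{v\}}\big(V^{\delta,n}_{u'}\big)\, du',
\tag{3.74}
$$

then it is clear from the description (3.73) of $V^{\delta,n}$ that

$$
\Gamma_n \underset{n \to \infty}{\Longrightarrow} \Gamma = \lambda \otimes \pi,
$$

where λ denotes Lebesgue measure on [0, ∞), and $\pi(\{1\}) = \pi(\{-1\}) = \tfrac{1}{2}$. In the limit the contour component will have spent on average half the time increasing and half the time decreasing.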

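Step 5 (Averaging for martingale problems). The proof of Theorem 3.5.3 will rely on the following result taken from Theorem 2.1 in [Kur92] and adapted to our specific situation in which the state spaces are compact.

Proposition 3.6.4 (Stochastic averaging). Suppose there is an operator $A^\delta : D(A^\delta) \subseteq C^0_{[0,\infty)}[0, \tau^\delta] \to C^0_{[0,\infty)}[0, \tau^\delta] \times \{-1, 1\}$ such that

(i) for all $f \in D(A^\delta)$ there is a process $\varepsilon^{\delta,f,n}$ for which

$$
\Big( f\big(C^{\delta,n}_t\big) - \int_0^t \big(A^\delta f\big)\big(C^{\delta,n}_s, V^{\delta,n}_s\big)\, ds + \varepsilon^{\delta,f,n}_t \Big)_{t \ge 0}
\tag{3.75}
$$

is a martingale,
(ii) $D(A^\delta)$ is dense in $C^0_{[0,\infty)}[0, \tau^\delta]$ with respect to the uniform topology, and
(iii) for $f \in D(A^\delta)$ and T > 0 there exists p > 1 such that

$$
P\Big[ \int_0^T \big|A^\delta f\big(C^{\delta,n}_s, V^{\delta,n}_s\big)\big|^p\, ds \Big] < \infty,
$$

and

$$
\lim_{n \to \infty} P\Big[ \sup_{t \le T} \big|\varepsilon^{\delta,f,n}_t\big| \Big] = 0.
\tag{3.76}
$$

Then for $\Gamma_n$ defined as in (3.74), $\{(C^{\delta,n}, \Gamma_n)\}_{n \in \mathbb{N}}$ is relatively compact in $D_{[0,\infty)}[0,\infty) \times M\big([0,\infty) \times \{-1, 1\}\big)$, where $M\big([0,\infty) \times \{-1, 1\}\big)$ is the space of all measures μ on $[0,\infty) \times \{-1, 1\}$ for which $\mu\big([0, u) \times \{-1, 1\}\big) = u$, and for any limit point $(\zeta^\delta, \pi)$ there exists a filtration such that

$$
\Big( f\big(\zeta^\delta_t\big) - \int_0^t \sum_{v \in \{-1, 1\}} A^\delta f\big(\zeta^\delta_s, v\big)\, \pi_v\, ds \Big)_{t \ge 0}
$$

is a martingale with respect to this filtration, for all $f \in D(A^\delta)$.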


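Recall the operator $(A^\delta, D(A^\delta))$ from (3.64) and (3.65), which will serve as the generator for the limit of the rescaled contour processes $C^{\delta,n}$. Our goal is to show that all three assumptions (i)-(iii) of Proposition 3.6.4 above are satisfied.

(i) We first show that we can define small error functions $\{\varepsilon^{\delta,f,n};\ f \in D(A^\delta)\}$ such that

$$
\Big( f\big(C^{\delta,n}_t\big) - \int_0^t A^\delta f\big(C^{\delta,n}_s, V^{\delta,n}_s\big)\, ds + \varepsilon^{\delta,f,n}_t \Big)_{t \ge 0}
$$

is a martingale for all $f \in D(A^\delta)$.

Notice that $f \in D(A^\delta)$ if and only if there exists a function $g \in C^2_{[0,\infty)}[0, \tau^\delta]$ with $g(0) = g(\tau^\delta) = 0$ such that

$$
f(x) = f(0) + \int_0^x X_s\, g(s)\, ds,
\tag{3.77}
$$

for all x ≥ 0. Let $f \in D(A^\delta)$ be of the form (3.77) and let

$$
f^n(c) := f(0) + \int_0^c \eta^{\mathrm{tot},n}_s\, g\Big(\frac{T^{\delta,n}}{\tau^\delta}\, s\Big)\, ds.
$$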

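Apply the operator $(A^{\delta,n}, D(A^{\delta,n}))$ from (3.72) to functions $f^n$ given by

$$
f^n(c, v) := f^n(c) + \frac{v}{2 n\, \eta^{\mathrm{tot},n}_c}\, (f^n)'(c)
$$

to get

$$
\begin{aligned}
A^{\delta,n} f^n(c, v)
&= n v\, (f^n)'(c) - n^2 \eta^{\mathrm{tot},n}_c\, \frac{v}{n\, \eta^{\mathrm{tot},n}_c}\, (f^n)'(c) + v^2 \Big(\frac{1}{2 \eta^{\mathrm{tot},n}_c}\, (f^n)'\Big)'(c) \\
&= \Big(\frac{1}{2 \eta^{\mathrm{tot},n}_c}\, (f^n)'\Big)'(c).
\end{aligned}
$$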
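For readers who want to see where the three terms come from, the computation can be unpacked term by term. The shorthand $\eta := \eta^{\mathrm{tot},n}_c$ and $h := \frac{1}{2\eta}(f^n)'$ is introduced only for this sketch and is not notation from the text; derivatives in c are understood formally.

```latex
% Shorthand for this sketch only: \eta = \eta^{\mathrm{tot},n}_c, \quad h = \tfrac{1}{2\eta}\,(f^n)'.
\begin{align*}
f^n(c,v) &= f^n(c) + \tfrac{v}{n}\, h(c), \\
n v\, \partial_c f^n(c,v) &= n v\,(f^n)'(c) + v^2\, h'(c), \\
n^2 \eta\, \big[f^n(c,-v) - f^n(c,v)\big]
  &= n^2 \eta \cdot \big(-\tfrac{2v}{n}\, h(c)\big) = -\,n v\,(f^n)'(c), \\
A^{\delta,n} f^n(c,v) &= v^2\, h'(c) = \Big(\tfrac{1}{2\eta^{\mathrm{tot},n}_c}\,(f^n)'\Big)'(c),
  \qquad v \in \{-1,+1\}.
\end{align*}
```

The correction term of order 1/n is thus chosen exactly so that the jump part cancels the diverging drift term $n v (f^n)'$, leaving a second-order expression that no longer depends on v.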

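Let then for all t ≥ 0,

$$
\varepsilon^{\delta,f,n}_t := f^n\big(C^{\delta,n}_t\big) - f\big(C^{\delta,n}_t\big)
+ \frac{V^{\delta,n}_t}{2 n\, \eta^{\mathrm{tot},n}_{C^{\delta,n}_t}}\, (f^n)'\big(C^{\delta,n}_t\big)
+ \int_0^t \big(A^\delta f - A^{\delta,n} f^n\big)\big(C^{\delta,n}_s, V^{\delta,n}_s\big)\, ds.
$$

Since $f^n(C^{\delta,n}_t, V^{\delta,n}_t) - \int_0^t A^{\delta,n} f^n(C^{\delta,n}_s, V^{\delta,n}_s)\, ds$ is a martingale for all $f^n \in D(A^{\delta,n})$, it follows that (3.75) holds for all $f \in D(A^\delta)$.

(ii) We next show that the domain $D(A^\delta)$ is dense in the space of continuous functions on $[0, \tau^\delta]$.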

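Lemma 3.6.5 (Dense domain). Fix δ > 0 and $X \in C^0_{[\delta,\infty)}[0, \tau^\delta]$. Then the set of functions F defined by

$$
F := \Big\{ f : f(c) = C + \int_0^c X_{c'}\, g(c')\, dc';\ C \in \mathbb{R},\ g \in C^2_{\mathbb{R}}[0, \tau^\delta],\ g\big|_{\{0, \tau^\delta\}} \equiv 0 \Big\}
$$

is dense in $C^0_{\mathbb{R}}[0, \tau^\delta]$.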


Proof. It is well-known that each continuous function on $[0, \tau^\delta]$ can be approximated by piecewise linear functions. It is therefore enough to show that any piecewise linear function can be approximated by functions in F. This follows by continuity of X and the fact that $X_u \ge \delta$, for all $u \in [0, \tau^\delta]$.

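We finally verify the last point. It is standard to show that the integrability condition in (iii) holds for all $f \in D(A^\delta)$, T > 0 and p > 1. Moreover, since $1/\eta^{\mathrm{tot},n}_{C^{\delta,n}_t}$ is bounded by $\tfrac{1}{\delta}$ for all t ≥ 0, $f^n \to f$, and

$$
\big\|A^{\delta,n} f^n - A^\delta f\big\| \le \big|\tau^\delta - T^{\delta,n}\big|\, \big\|g'\big\| + \Big|1 - \frac{T^{\delta,n}}{\tau^\delta}\Big|\, \big\|g''\big\| \underset{n \to \infty}{\longrightarrow} 0,
\tag{3.78}
$$

(3.76) is satisfied as well.

Altogether we can apply Proposition 3.6.4 to the effect that the family of rescaled contours $\{C^{\delta,n};\ n \in \mathbb{N}\}$ is relatively compact in law and any limit point satisfies the $(A^\delta, D(A^\delta))$-martingale problem.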

Step 6 (Uniqueness of the limit martingale problem). We next show that the $(A^\delta, D(A^\delta))$-martingale problem has a unique solution $\zeta^\delta$, for which we first need the following lemma which characterizes solutions in terms of transforms of a reflecting Brownian motion. Recall that X is a Feller diffusion started at $X_0 = 1$ and $\tau^\delta$ is the first time it falls below δ. Let $s : [0, \tau^\delta] \to \big[0, \int_0^{\tau^\delta} X_u\, du\big]$, defined by

$$
s(x) := \int_0^x X_u\, du,
\tag{3.79}
$$

denote the scale function and denote its inverse by $s^{-1}$.

Consider the operator

$$
\big(B^\delta, D(B^\delta)\big) := \Big( \tfrac{1}{2} X_{s^{-1}(\cdot)}\, h'',\ \big\{ h \in C^2_{\mathbb{R}_+}[0,\infty) : h'\big|_{\{0, s(\tau^\delta)\}} \equiv 0 \big\} \Big).
$$

Lemma 3.6.6 (Relation to Brownian motion). Let δ > 0. If $\zeta^\delta$ solves the $(A^\delta, D(A^\delta))$-martingale problem, then

$$
B_t := s\big(\zeta^\delta_t\big), \qquad t \in [0, \infty),
$$

solves the $(B^\delta, D(B^\delta))$-martingale problem.

Proof. Fix $H \in C^2_{[0, s(\tau^\delta)]}[0,\infty)$ with $H'\big|_{\{0, s(\tau^\delta)\}} \equiv 0$. It is easy to check that then $H \circ s \in D(A^\delta)$. We therefore obtain that for all $H \in C^2_{\mathbb{R}_+}[0,\infty)$ such that $H'\big|_{\{0, s(\tau^\delta)\}} \equiv 0$ the process

$$
\Big( (H \circ s)\big(\zeta^\delta_t\big) - \int_0^t A^\delta (H \circ s)\big(\zeta^\delta_u\big)\, du \Big)_{t \ge 0}
= \Big( H(B_t) - \int_0^t \tfrac{1}{2} X_{s^{-1}(B_u)}\, H''(B_u)\, du \Big)_{t \ge 0}
$$

is a martingale. Here we have used that $A^\delta (H \circ s)(\zeta) = \tfrac{1}{2} \big(H' \circ s\big)'(\zeta) = \tfrac{1}{2} X_\zeta \cdot (H'' \circ s)(\zeta)$.
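The identity used at the end of the proof is a direct chain-rule computation. The following lines spell it out, using only $s'(x) = X_x$ from (3.79) and the boundary condition on H; they also show why $H \circ s$ satisfies the boundary condition in (3.65).

```latex
% Uses s'(x) = X_x from (3.79) and H'(0) = H'(s(\tau^\delta)) = 0.
\begin{align*}
(H \circ s)'(\zeta) &= X_\zeta\, (H' \circ s)(\zeta), \\
\frac{1}{2 X_\zeta}\, (H \circ s)'(\zeta) &= \tfrac{1}{2}\, (H' \circ s)(\zeta), \\
A^\delta (H \circ s)(\zeta) &= \tfrac{1}{2}\, (H' \circ s)'(\zeta)
   = \tfrac{1}{2}\, X_\zeta\, (H'' \circ s)(\zeta), \\
(H \circ s)'(0) &= X_0\, H'(0) = 0, \qquad
(H \circ s)'(\tau^\delta) = X_{\tau^\delta}\, H'\big(s(\tau^\delta)\big) = 0 .
\end{align*}
```

Substituting $\zeta = s^{-1}(b)$ in the third line gives $\tfrac{1}{2} X_{s^{-1}(b)} H''(b)$, which is the operator $B^\delta$ applied to H at the point b.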

Proposition 3.6.7 (Uniqueness). The $(A^\delta, D(A^\delta))$-martingale problem has a unique solution $\zeta^\delta$.

Proof of Proposition 3.6.7. Assume that $\zeta^{\delta,1}$ and $\zeta^{\delta,2}$ are two solutions of the $(A^\delta, D(A^\delta))$-martingale problem. By Theorem 5.6 in [?], since $(X_u)_{u \ge 0}$ is bounded away from zero, the $(B^\delta, D(B^\delta))$-martingale problem has a unique solution. Hence

$$
\int_{[0, \zeta^{\delta,1}]} X_u\, du \overset{d}{=} \int_{[0, \zeta^{\delta,2}]} X_u\, du.
$$

Since the scale function is strictly increasing on $[0, \tau^\delta]$, the one-dimensional distributions of $\zeta^{\delta,1}$ and $\zeta^{\delta,2}$ agree. It follows from Theorem 4.4.2 in [EK86] that therefore $\zeta^{\delta,1} \overset{d}{=} \zeta^{\delta,2}$.

Step 7 (Conclusion). We close the section by giving a proof of Theorem 3.5.3.

Proof of Theorem 3.5.3. We have shown in Step 5 that the sequence $(C^{\delta,n})_{n \in \mathbb{N}}$ is relatively compact, and that any limit point $\zeta^\delta$ of $C^{\delta,n}$ is a solution of the $(A^\delta, D(A^\delta))$-martingale problem. This proves existence of a solution for all δ > 0.

Furthermore, by Proposition 3.6.7, this martingale problem has only one solution. That is, the $(A^\delta, D(A^\delta))$-martingale problem is well-posed and if $\zeta^\delta$ is its unique solution (3.66) holds.

Finally, it follows from Theorem 4.4.2 in [EK86] that $\zeta^\delta$ has the Markov property.


CHAPTER 4

Examples of limit trees II: Coalescent trees

In this chapter we apply the theory developed in Chapters 1 and 2 to illustrate how certain classes of Λ-coalescents can be associated with random metric measure spaces.

The Λ-coalescents were introduced in [Pit99] (see also [Sag99]) and have since been the subject of many papers (see, for example, [MS01, BG05, BBC+05, LS06, BBS06]). They appear as the duals of population models whose evolution is based on resampling. The fact that Λ-coalescents allow for multiple collisions is reflected in a possibly infinite variance of the resampling offspring distribution. Moreover, Λ-coalescents are up to time change dual to the process of relative frequencies of families of a Galton-Watson process with possibly infinite variance offspring distribution (compare [BBC+05]). In Section 4.1 we start with a characterization of the class of Λ-coalescents which can be described by an ultra-metric measure space.

The spatially structured (Kingman) coalescent, or the spatial coalescent for short, on a class of Abelian groups was introduced in [GLW05] and extended to the spatially structured Λ-coalescents in [LS06]. Spatial(ly structured) coalescents are of interest since they appear, of course, as the duals of spatially interacting resampling models where interaction is due to additional migration. In Section 4.2 we construct the spatially structured Λ-coalescent trees spanned by partition elements initially located in a finite box.

In Section 4.3 we then restrict to spatially structured Λ-coalescents on $\mathbb{Z}^d$, where d ≥ 3, which come down from infinity. As shown in [LS06] these coalescents then automatically come down in a uniform way which implies convergence of the linearly rescaled spatially structured Λ-coalescents on large tori towards the (non-spatial) Kingman coalescent tree as the side length of the torus goes to infinity.

In Section 4.4 we focus on $\mathbb{Z}^2$ and further restrict to the spatially structured Kingman coalescent. Since the dimension d = 2 is the critical dimension in the recurrence versus transience dichotomy of the interaction between non-spatial coalescents (which is migration of partition elements), we here encounter the so-called diffusive clustering regime. That is, the partition elements at time t have collected initial partition elements whose initial locations cover areas of side length $t^{\alpha/2}$, for a random α ∈ (0, 1]. We will show that the spatially structured Kingman coalescent trees spanned by partition elements initially in a box of side length $t^{\alpha/2}$, for a fixed α ∈ (0, 1], can be (non-linearly) rescaled to converge to a non-spatial Kingman coalescent tree on a logarithmic scale, as t → ∞.

4.1. Λ-coalescent measure trees

In this section we characterize the class of Λ-coalescents which can be described by a metric measure space.

We start with a quick description of Λ-coalescents. Recall that a partition of a set S is a collection $\{A_\lambda\}$ of pairwise disjoint subsets of S, also called blocks, such that $S = \bigcup_\lambda A_\lambda$. Denote by $S_\infty$ the collection of partitions of N := {1, 2, 3, ...}, and for all n ∈ N, by $S_n$ the collection of partitions of {1, 2, 3, ..., n}. Each partition $P \in S_\infty$ defines an equivalence relation $\sim_P$ by $i \sim_P j$ if and only if there exists a partition element π ∈ P with i, j ∈ π. Write $\rho_n$ for the restriction map from $S_\infty$ to $S_n$. We say that a sequence $(P_k)_{k \in \mathbb{N}}$ converges in $S_\infty$ if for all n ∈ N, the sequence $(\rho_n P_k)_{k \in \mathbb{N}}$ converges in $S_n$ equipped with the discrete topology.

We are looking for a strong Markov process defined as follows.

Definition 4.1.1 (The Λ-coalescent). The Λ-coalescent is a strong Markov process ξ starting in $P_0 \in S_\infty$ such that for all n ∈ N, the restricted process $\xi_n := \rho_n \circ \xi$ is an $S_n$-valued Markov chain which starts in $\rho_n P_0 \in S_n$, and given that $\xi_n(t)$ has b blocks, each k-tuple of its blocks merges to form a single block at rate $\lambda_{b,k}$.

Pitman [Pit99] showed that such a process exists and is unique (in law) if and only if

$$
\lambda_{b,k} := \int_0^1 \Lambda(dx)\, x^{k-2} (1 - x)^{b-k},
\tag{4.1}
$$

for some non-negative and finite measure Λ on the Borel subsets of [0, 1]. Let therefore Λ be a non-negative finite measure on B([0, 1]) and $P \in S_\infty$. We denote by $P^{\Lambda,P}$ the probability distribution governing ξ with ξ(0) = P on the space of càdlàg paths with the Skorohod topology.
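For two standard choices of Λ the integral (4.1) has a closed form, which the following sketch evaluates exactly with rational arithmetic. The function names are illustrative; the formulas assume 2 ≤ k ≤ b. For Λ = δ_0 only the term k = 2 survives, and for Λ(dx) = dx the integral is the Beta integral B(k−1, b−k+1) = (k−2)!(b−k)!/(b−1)!.

```python
from fractions import Fraction
from math import factorial

def rate_kingman(b, k):
    """lambda_{b,k} for Lambda = delta_0: only pairs (k = 2) merge, at rate 1."""
    return Fraction(1 if k == 2 else 0)

def rate_bolthausen_sznitman(b, k):
    """lambda_{b,k} for Lambda(dx) = dx: the Beta integral B(k-1, b-k+1)."""
    return Fraction(factorial(k - 2) * factorial(b - k), factorial(b - 1))

def is_consistent(rate, b_max=8):
    """Check lambda_{b,k} = lambda_{b+1,k} + lambda_{b+1,k+1} for 2 <= k <= b <= b_max."""
    return all(
        rate(b, k) == rate(b + 1, k) + rate(b + 1, k + 1)
        for b in range(2, b_max + 1)
        for k in range(2, b + 1)
    )
```

The recursion checked by `is_consistent` follows from (4.1) by writing 1 = (1 − x) + x under the integral; it expresses that the restrictions $\xi_n$ for different n fit together.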

Example 4.1.2. If we choose

$$
P_0 := \big\{\{1\}, \{2\}, \dots\big\},
\tag{4.2}
$$

and Λ = δ_0 or Λ(dx) = dx, then $P^{\Lambda,P_0}$ is the law of the Kingman and the Bolthausen-Sznitman coalescent, respectively.

For each non-negative and finite measure Λ, all initial partitions $P \in S_\infty$ and $P^{\Lambda,P}$-almost all ξ, there is a (random) metric $r^\xi$ on N defined by

$$
r^\xi(i, j) := \inf\big\{t \ge 0 : i \sim_{\xi(t)} j\big\}.
\tag{4.3}
$$

That is, for a realization ξ of the Λ-coalescent, $r^\xi(i, j)$ is the time i and j need to coalesce. Notice that $r^\xi$ is an ultra-metric on N, almost surely.
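The coalescence-time metric (4.3) is easy to compute for a realization restricted to {1, ..., n}. In the sketch below a realization is encoded, purely for illustration, as a list of merger events `(t, reps)`, where `reps` lists one element from each block taking part in the merger at time t; pairs that never coalesce simply do not appear in the result, and r(i, i) = 0 is left implicit.

```python
from itertools import combinations

def coalescence_metric(n, events):
    """Return {(i, j): r(i, j)} for a coalescent path on {1, ..., n}.

    events: iterable of (t, reps); at time t the blocks containing the
            elements of reps merge into one block (illustrative encoding).
    """
    block = {i: frozenset([i]) for i in range(1, n + 1)}
    r = {}
    for t, reps in sorted(events, key=lambda event: event[0]):
        merging = {block[i] for i in reps}        # distinct blocks in this merger
        merged = frozenset().union(*merging)
        for a, b in combinations(merging, 2):     # pairs meeting for the first time
            for i in a:
                for j in b:
                    r[(i, j)] = r[(j, i)] = t
        for i in merged:
            block[i] = merged
    return r
```

Because blocks only ever merge, every triple satisfies r(i, k) ≤ max(r(i, j), r(j, k)), which is the ultra-metric inequality.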


For all $\xi \in S_\infty$, let $(U^\xi, r^\xi)$ denote the completion of $(\mathbb{N}, r^\xi)$. Clearly, the extension of $r^\xi$ to $U^\xi$ is also an ultra-metric. Recall from Example 1.5.8 that ultra-metric spaces are associated with R-trees.

The main goal of this section is to introduce the Λ-coalescent measure trees as metric spaces $(U^\xi, r^\xi)$ equipped with the "uniform distribution". Notice that since the Kingman coalescent is known to "come down immediately to finitely many partition elements" the corresponding metric space is almost surely compact ([Eva00a]). Even though there is no abstract concept of the "uniform distribution" on compact spaces, the reader may not find it surprising that in particular examples one can easily make sense of this notion by approximation. We will see that for Λ-coalescents, under an additional assumption on Λ, one can extend this strategy to locally compact metric spaces. Within this class falls, for example, the Bolthausen-Sznitman coalescent which is known to have infinitely many partition elements for all times, and whose corresponding metric space is therefore not compact.

Define $H_n$ to be the map which takes a realization of the S∞-valued coalescent and maps it to (an isometry class of) a metric measure space as follows:

(4.4) $H_n : \xi \mapsto \bigl(U^\xi, r^\xi, \mu^\xi_n := \tfrac{1}{n}\sum_{i=1}^{n} \delta_i\bigr)$.

Put then for given $P_0$ ∈ S∞,

(4.5) $Q^{\Lambda,n} := (H_n)_* \mathbb{P}^{\Lambda,P_0}$.

Next we give the characterization of existence and uniqueness of the Λ-coalescent measure tree.

Theorem 4.1.3 (The Λ-coalescent measure tree). The family $\{Q^{\Lambda,n};\ n \in \mathbb{N}\}$ converges in the weak topology with respect to the Gromov-weak topology if and only if

(4.6) $\int_0^1 \Lambda(dx)\, x^{-1} = \infty$.

Remark 4.1.4 (“Dust-free” property). By exchangeability and the de Finetti Theorem, the family $\{f(\pi);\ \pi \in \xi(t)\}$ of frequencies

(4.7) $f(\pi) := \lim_{n\to\infty} \frac{1}{n}\,\#\{j \in \{1, \ldots, n\} : j \in \pi\}$

exists for $\mathbb{P}^{\Lambda,P_0}$-almost all π ∈ ξ(t) and all t > 0. Define f := (f(π); π ∈ ξ(t)) to be the ranked rearrangement of {f(π); π ∈ ξ(t)}, meaning that the entries of the vector f are non-increasing. Let PΛ,P0 denote the probability distribution of f. Call the frequencies f proper if $\sum_{i \ge 1} f(\pi_i) = 1$. By Theorem 8 in [Pit99], the Λ-coalescent has in the limit n → ∞ proper frequencies if and only if Condition (4.6) holds.

According to Kingman’s correspondence (see, for example, Theorem 14in [Pit99]), the distribution PΛ,P0 and PΛ,P0 determine each other uniquely.


For P ∈ S∞ and i ∈ N, let

(4.8) $P^i := \{j \in \mathbb{N} : i \sim_P j\}$

denote the partition element in P which contains i. Then the following are equivalent:

(a) Condition (4.6) holds.

(b) For all t > 0,

(4.9) $\mathbb{P}^{\Lambda,P_0}\{f((\xi(t))^1) = 0\} = 0$.

The latter is often referred to as the “dust-free” property.

(c) The total coalescence rate of a given i ∈ P_0 is infinite (compare with the proof of Lemma 25 in [Pit99]).

Definition 4.1.5 (The Λ-coalescent measure tree). Assume that the dust-free property holds. The Λ-coalescent measure tree $Q^\Lambda$ is the limit of the family $\{Q^{\Lambda,n};\ n \in \mathbb{N}\}$.

Proof of Theorem 4.1.3. For existence we will apply the characterization of tightness as given in Theorem 2.8.2, and verify the two conditions.

(i) By definition, for all n ∈ N, $Q^{\Lambda,n}[w_X]$ is exponentially distributed with parameter $\lambda_{2,2}$. Hence the family $\{Q^{\Lambda,n}[w_X];\ n \in \mathbb{N}\}$ is tight.

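(ii) Assume that Condition (4.6) holds. Fix t ∈ (0, 1). Then for all δ > 0, by the uniform distribution and exchangeability,

(4.10) $Q^{\Lambda,n}\{v_\delta(H_n(\xi)) \ge t\} = \mathbb{P}^{\Lambda,P_0}\Bigl\{\tfrac{1}{n}\sum_{i=1}^{n} \mu^\xi_n\{x \in U^\xi : \mu^\xi_n(B_t(x)) \le \delta \mid x = i\} \ge t\Bigr\} \le \tfrac{1}{t}\,\mathbb{P}^{\Lambda,P_0}\{\mu^\xi_n(B_t(1)) \le \delta\}$.

By the de Finetti theorem, $\mu^\xi_n(B_t(1)) \to f((\xi(t))^1)$ as n → ∞, $\mathbb{P}^{\Lambda,P_0}$-almost surely. Hence, dominated convergence yields

(4.11) $\lim_{\delta\to 0}\lim_{n\to\infty} Q^{\Lambda,n}\{v_\delta(H_n(\xi)) \ge t\} \le \lim_{\delta\to 0} \tfrac{1}{t}\,\mathbb{P}^{\Lambda,P_0}\{f((\xi(t))^1) \le \delta\} = \tfrac{1}{t}\,\mathbb{P}^{\Lambda,P_0}\{f((\xi(t))^1) = 0\}$.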

We have therefore shown that Condition (4.6) implies that a limit of $Q^{\Lambda,n}$ exists.

Assume to the contrary that the dust-free property does not hold. Then for all t > 0, $\lim_{n\to\infty} \mu^\xi(U^\xi) < 1$, by Remark 4.1.4(c).

Uniqueness of the limit points follows from the projective property, i.e., restricting the observation to a tagged subset of initial individuals is the same as starting in this restricted initial state.


We conclude this section by characterizing the measure Λ for which the Λ-coalescent measure tree is compact.

Proposition 4.1.6 (Compact Λ-coalescent measure trees). The Λ-coalescent measure tree is compact if and only if

(4.12) $\sum_{b=2}^{\infty} \Bigl(\sum_{k=2}^{b} (k-1) \binom{b}{k} \lambda_{b,k}\Bigr)^{-1} < \infty$.
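Condition (4.12) can be evaluated exactly in the two cases of Example 4.1.2. The sketch below assumes the standard merger rates $\lambda_{b,k} = \int_0^1 x^{k-2}(1-x)^{b-k}\,\Lambda(dx)$ from [Pit99], which are not restated in this passage; all function names are hypothetical. For Λ = δ_0 the inner sum equals $\binom{b}{2}$, so the partial sums are $2(1 - 1/B)$ and converge; for Λ(dx) = dx each summand $(k-1)\binom{b}{k}\lambda_{b,k}$ simplifies to b/k, and the series behaves like $\sum_b 1/(b \log b)$ and diverges.

```python
from fractions import Fraction
from math import comb, factorial


def rate_kingman(b, k):
    """lambda_{b,k} for Lambda = delta_0: only binary mergers, at rate 1."""
    return Fraction(1 if k == 2 else 0)


def rate_bolthausen_sznitman(b, k):
    """lambda_{b,k} for Lambda(dx) = dx: the Beta integral
    int_0^1 x^(k-2) (1-x)^(b-k) dx = (k-2)! (b-k)! / (b-1)!."""
    return Fraction(factorial(k - 2) * factorial(b - k), factorial(b - 1))


def gamma(rate, b):
    """Inner sum of (4.12): sum_{k=2}^b (k-1) * C(b,k) * lambda_{b,k}."""
    return sum((k - 1) * comb(b, k) * rate(b, k) for k in range(2, b + 1))


def partial_sum(rate, upper):
    """Partial sum over b = 2..upper of the series in (4.12)."""
    return sum(1 / gamma(rate, b) for b in range(2, upper + 1))


for upper in (10, 50, 200):
    print(upper,
          float(partial_sum(rate_kingman, upper)),
          float(partial_sum(rate_bolthausen_sznitman, upper)))
```

Since the divergence in the Bolthausen-Sznitman case is only of order log log, the numerical partial sums grow very slowly; the printout illustrates the two regimes but is of course no substitute for the analytic argument.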

Remark 4.1.7 (Coming down from infinity). Notice that Condition (4.12) is equivalent to the property that the Λ-coalescent comes down from infinity in infinitesimally small time, i.e.,

(4.13) $\#\xi_t < \infty$, $\mathbb{P}^{\Lambda,P_0}$-almost surely,

for all t > 0 (see Theorem 1 in [Sch00b]).

Proof of Proposition 4.1.6. Since in ultra-metric spaces two balls of the same radius either agree or are disjoint, the number of partition elements $\#\xi_\varepsilon$ at time ε > 0 equals the number of ε-balls one needs to cover the ultra-metric space representing the Λ-coalescent tree. Hence the claim is an immediate consequence of Remark 4.1.7.
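The counting argument in this proof can be checked on a small example. The following sketch uses a hypothetical matrix of coalescence times on four labels and the hypothetical helpers `count_balls` and `balls_partition`; it counts the distinct closed ε-balls and verifies that they are pairwise disjoint.

```python
def count_balls(r, eps):
    """Number of distinct closed eps-balls {j : r(i, j) <= eps} in a finite
    ultra-metric space given by the distance matrix r."""
    n = len(r)
    balls = {frozenset(j for j in range(n) if r[i][j] <= eps) for i in range(n)}
    return len(balls)


def balls_partition(r, eps):
    """True if the distinct eps-balls are pairwise disjoint. Every point lies
    in its own ball, so disjointness is equivalent to the sizes adding to n."""
    n = len(r)
    balls = {frozenset(j for j in range(n) if r[i][j] <= eps) for i in range(n)}
    return sum(len(b) for b in balls) == n


# Hypothetical coalescence times: 0 and 1 merge at time 1, 2 and 3 at time 2,
# and the two resulting blocks merge at time 5.
r = [
    [0, 1, 5, 5],
    [1, 0, 5, 5],
    [5, 5, 0, 2],
    [5, 5, 2, 0],
]
for eps in (0.5, 1, 2, 5):
    print(eps, count_balls(r, eps), balls_partition(r, eps))
```

In this example the number of balls at radius ε coincides with the number of blocks present at time ε, which is the identification used in the proof.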

4.2. Spatially structured Λ-coalescent trees

The main goal of this section is to introduce the spatially structured Λ-coalescent measure tree as the random metric space $(U^{(C,L)}, r^{(C,L)})$ associated with the Λ-coalescent and equipped with the “uniform distribution” on partition elements with a mark in a fixed finite subset.

Let G be an Abelian group: The basic ingredient for our processes is a random walk (RW) on a(n at most) countable Abelian group G. Let a(·, ·) be a recurrent random walk kernel on G, i.e.,

(4.14) $a(x, y) = a(0, y - x)$,

for all x, y ∈ G, and

(4.15) $\sum_{n\in\mathbb{N}} a^{(n)}(0, 0) = \infty$.

Assume in addition that a(·, ·) is aperiodic and irreducible, and denote its rate 1 continuous time transition kernel by

(4.16) $a_t(x, y) := \sum_{n\in\mathbb{N}} a^{(n)}(x, y)\,\frac{t^n e^{-t}}{n!}$,

for all x, y ∈ G, where $a^{(n)}(x, y)$ is the n-step transition probability. Assume in addition that $a_t(x, y)$ is aperiodic and irreducible.
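Formula (4.16) is a Poisson mixture of the n-step kernels and can be evaluated numerically on a finite group. The sketch below is only an illustration under stated assumptions: it uses the nearest-neighbour walk on the cyclic group Z_m as a hypothetical example kernel, truncates the series at `n_max`, and reads the sum as starting at n = 0 (so that a_0 is the identity). All function names are hypothetical.

```python
import math


def step_kernel_cycle(m):
    """Hypothetical example: nearest-neighbour walk on the cyclic group Z_m,
    a(x, y) = 1/2 if y = x + 1 or y = x - 1 (mod m)."""
    a = [[0.0] * m for _ in range(m)]
    for x in range(m):
        a[x][(x + 1) % m] += 0.5
        a[x][(x - 1) % m] += 0.5
    return a


def matmul(p, q):
    """Product of two m x m matrices given as nested lists."""
    m = len(p)
    return [[sum(p[x][z] * q[z][y] for z in range(m)) for y in range(m)]
            for x in range(m)]


def continuous_time_kernel(a, t, n_max=200):
    """Truncation of (4.16): sum_{n=0}^{n_max} a^(n)(x, y) t^n e^{-t} / n!."""
    m = len(a)
    power = [[1.0 if x == y else 0.0 for y in range(m)] for x in range(m)]  # a^(0)
    weight = math.exp(-t)  # Poisson(t) probability of n = 0 jumps
    a_t = [[0.0] * m for _ in range(m)]
    for n in range(n_max + 1):
        for x in range(m):
            for y in range(m):
                a_t[x][y] += weight * power[x][y]
        power = matmul(power, a)   # a^(n+1)
        weight *= t / (n + 1)      # Poisson(t) probability of n + 1 jumps
    return a_t


a_t = continuous_time_kernel(step_kernel_cycle(5), t=3.0)
print([round(sum(row), 12) for row in a_t])
```

For moderate t the truncation error is the Poisson(t) tail beyond `n_max` and is negligible, so each row of `a_t` sums to 1 up to floating-point error.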

There is a standard way to construct particle systems that possibly (if G is countably infinite) start in configurations with countably many particles.


Fix a finite measure α on G with $\alpha_x > 0$, for all x ∈ G, and such that there exists a constant Γ with

(4.17) $\sum_{y\in G} a(x, y)\,\alpha_y \le \Gamma \cdot \alpha_x$,

for all x ∈ G. Denote by N(G) the set of all locally finite N-valued measures on G. Let then $E = E_\alpha$ be the Liggett-Spitzer space (corresponding to α), i.e.,

(4.18) $E := \bigl\{\eta \in N(G) : \sum_{x\in G} \eta_x \alpha_x < \infty\bigr\}$.

Remark 4.2.1. Let $\{(X^i_t)_{t\ge 0} : i \in I\}$ be a countable collection of independent random walks, and put for all t ≥ 0, $\eta_t := \sum_{i\in I} \delta_{X^i_t} \in N(G)$. Notice that if

(4.19) $\eta_0 \in E$, a.s.,

then an easy calculation shows that the process $\bigl(e^{-\Gamma t}\sum_{i\in I} \alpha(X^i_t)\bigr)_{t\ge 0}$ is a super-martingale. In particular, under (4.19),

(4.20) $\mathbb{P}\{\eta_t \in E,\ \forall t \in [0,\infty)\} = 1$.

Note that this implies $\eta_t \in N(G)$, for all t ≥ 0, almost surely.

To combine migration and coalescence, fix a countable index set I. Notice then that from any $P \in \Pi^I$ one can form a marked partition

(4.21) $P^G := \{(\pi, L(\pi));\ \pi \in P\}$,

by assigning each partition element π ∈ P its mark L(π) ∈ G. Denote the space of marked partitions by

(4.22) $\Pi^{I,G}$.

For all I′ ⊆ I, write

(4.23) $\rho_{I'} : \Pi^{I,G} \to \Pi^{I',G}$

for the restriction map. In this way, for all $P \in \Pi^{I,G}$ and $i'_1, i'_2 \in I'$,

(4.24) $i'_1 \sim_{\rho_{I'}P} i'_2$ if and only if $i'_1 \sim_P i'_2$.

Definition 4.2.2 (Topology for marked partitions). We say that a sequence $(P_n)_{n\in\mathbb{N}}$ converges in $\Pi^{I,G}$ if and only if for all finite subsets I′ ⊆ I, the sequence $(\rho_{I'}P_n)_{n\in\mathbb{N}}$ converges in $\Pi^{I',G}$ equipped with the discrete topology.

Let Λ be a non-negative finite measure on B([0, 1]). Recall the Λ-coalescent from Definition 4.1.1. We next define the spatially structured Λ-coalescent.


Definition 4.2.3 (The spatially structured Λ-coalescent). The spatially structured Λ-coalescent, or for short the spatial Λ-coalescent,

(4.25) $(C, L) := (C_t, L_t)_{t\ge 0}$,

is a strong $\Pi^{I,G}$-valued Markov process with cadlag paths such that for all subsets I′ ⊆ I with

(4.26) $\sum_{\pi\in C_0} \delta_{L_0(\pi)} \in E$,

the restricted process $(C, L)^{I'} = \rho_{I'}(C, L)$ is a $\Pi^{I',G}$-valued strong Markov particle system which undergoes the following two independent mechanisms:

• Migration: The marks of the partition elements independently perform continuous time rate 1 random walks with kernel a(·, ·).

• Coalescence: The partition built by restricting to elements with the same mark performs a Λ-coalescent.

Example 4.2.4 (Spatially structured Kingman-coalescent). If $\Lambda = \delta_0$ then the spatial Λ-coalescent is referred to as the spatial Kingman-coalescent, or even shorter the spatial coalescent.

Remark 4.2.5 (Spatially structured Λ-coalescent is well-defined). Existence of the Λ-coalescent for finite groups G is shown in Proposition 3.4 in [GLW05] and Theorem 1 in [LS06]. The extension of the construction to infinite graphs with infinite initial configurations can be done via approximation of G by finite subgroups. This, of course, requires the extra condition that the group G can be suitably approximated by subgroups (compare, for example, Condition 6.1 in [GLW05]), which is satisfied, for example, if $G = \mathbb{Z}^d$, d ∈ N.

In the following we are interested in the spatial Λ-coalescent which starts with locally infinitely many particles at each site in a given finite subset G′ ⊆ G. We will choose indices and marks randomly as follows:

• sample a Poisson field N on G × [0,∞) with intensity n ⊗ λ, where n is counting measure on G, and λ is Lebesgue measure on [0,∞),

• put the random index set

(4.27) $I := \{i \in [0,\infty) : N(G \times \{i\}) > 0\}$,

• and let for i ∈ I, $x_i$ be the unique mark in G such that

(4.28) $N(\{(x_i, i)\}) > 0$.

Let then

(4.29) $P^G_0 := \{(\{i\}, x_i);\ i \in I\} \in \Pi^{I,G}$.

We denote by $\mathbb{P}^{\Lambda,P^G_0}$ the probability distribution governing (C, L) with $(C_0, L_0) = P^G_0$ on the space of cadlag paths with the Skorohod topology.


For each non-negative and finite measure Λ and $\mathbb{P}^{\Lambda,P^G_0}$-almost all (C, L), there is a (random) metric $r^{(C,L)}$ on I defined by

(4.30) $r^{(C,L)}(i, j) := \inf\{t \ge 0 : i \sim_{C_t} j\}$.

That is, for a realization (C, L) of the spatially structured Λ-coalescent, $r^{(C,L)}(i, j)$ is the time it takes i and j to coalesce. Notice that $r^{(C,L)}$ is an ultra-metric on I, almost surely.

We next want to construct the spatially structured Λ-coalescent measure trees as the metric space $(U^{(C,L)}, r^{(C,L)})$ equipped with the “uniform distribution” on partition elements with a mark in a fixed finite subset G′ ⊆ G.

Define for each ρ ∈ (0,∞), $H^{G'}_\rho$ to be the map which takes a realization of the $\Pi^{I,G}$-valued coalescent and a finite subset G′ ⊆ G and maps it to (an isometry class of) a metric measure space as follows:

(4.31) $H^{G'}_\rho : (C, L) \mapsto \Bigl(U^{(C,L)}, r^{(C,L)}, \mu^{(C,L),G'}_\rho := \tfrac{1}{\rho}\sum_{i \in I\cap[0,\rho],\ L(i)\in G'} \delta_i\Bigr)$.

Put then

(4.32) $Q^{\Lambda,\rho,G'} := \bigl(H^{G'}_\rho\bigr)_* \mathbb{P}^{\Lambda,P^G_0}$.

The following result generalizes the characterization of existence and uniqueness of (non-spatial) Λ-coalescent trees to the spatially structured Λ-coalescent measure tree spanned by the partition elements with marks in G′.

Recall the “dust-free” property from (4.6) (see also Remark 4.1.4).

Theorem 4.2.6 (The spatial(ly structured) Λ-coalescent measure tree). The family $\{Q^{\Lambda,\rho,G'};\ \rho \in (0,\infty)\}$ converges in the weak topology with respect to the Gromov-weak topology if and only if the “dust-free” property holds.

Remark 4.2.7. We want to point out that the measures $\mu^{(C,L),G'}_\rho$, for ρ ∈ (0,∞), are not probability measures. However, in analogy to the weak topology on the space of finite measures, we can define the Gromov-weak topology on the space of metric (not necessarily probability) measure spaces by requiring, in addition to convergence of all bounded continuous monomials of degree n ≥ 2, that also the total masses (which can be considered as the monomial of degree 1 with the constant 1 playing the role of a test function) converge. By the law of large numbers, $\mu_\rho(I) \to 1$ as ρ → ∞. Hence by Remark 2.4.3(ii) pre-compactness characterizations are essentially the same as for probability measures, and in the following we therefore ignore the fact that $\mu^{(C,L),G'}_\rho$ is only “almost” a probability measure.

Definition 4.2.8 (The Λ-coalescent measure tree). Assume that the dust-free property holds. The spatially structured Λ-coalescent measure tree $Q^{\Lambda,G'}$ is the limit of the family $\{Q^{\Lambda,\rho,G'};\ \rho \in (0,\infty)\}$.


Proof of Theorem 4.2.6. For existence we will once more apply the characterization of tightness as given in Theorem 2.8.2, and verify the two conditions.

(i) Given i, j ∈ I and $P \in \Pi^{I,G}$, recall from (4.8) that $P^i$ denotes the partition element which contains i, and write:

• $\tau^{i,j} := \inf\{s \ge 0 : i \sim_{C_s} j\}$ for the time the initial partition elements i and j need to coalesce (in particular, $r^{(C,L)}(i, j) = \tau^{i,j}$),

• $\sigma^{i,j} := \inf\{s \ge 0 : L_s(C^i_s) = L_s(C^j_s)\}$ for the first time the partition elements containing i and j share the same mark (in particular, $\sigma^{i,j} \le \tau^{i,j}$),

• $\sigma^{x,y}$ for the hitting time of two random walks started in x and y (in particular, $\sigma^{i,j}$ equals in distribution $\sigma^{L_0(i),L_0(j)}$),

• G for a shifted geometric random variable with success probability $\frac{\lambda_{2,2}}{2+\lambda_{2,2}}$, independent of $\tau^{i,j}$ and $\sigma^{i,j}$, and

• $\{\tau_k;\ k \in \mathbb{N}\}$ for a family of independent random variables which are all distributed as the length of the (almost surely finite) excursion away from 0, and independent of $\tau^{i,j}$, $\sigma^{i,j}$ and G.

Then it is clear that $\tau^{i,j}$ is stochastically bounded from above by $\sigma^{i,j} + \sum_{k=0}^{G} \tau_k$, which is almost surely finite. Moreover, $\sigma^{i,j}$ is stochastically bounded from above by $\max_{x,y\in G} \sigma^{x,y}$, which again is almost surely finite.

Fix ε > 0. Then we can choose $c_\varepsilon$ such that $\mathbb{P}\bigl\{\sum_{k=0}^{G} \tau_k \ge \frac{c_\varepsilon}{2}\bigr\} \le \frac{\varepsilon}{2}$ and $\max_{x,y\in G'} \mathbb{P}\bigl\{\sigma^{x,y} \ge \frac{c_\varepsilon}{2}\bigr\} \le \frac{\varepsilon}{2}$. Therefore by definition, for all ρ ∈ (0,∞),

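(4.33) $Q^{\Lambda,\rho,G'}\bigl[w_X[c_\varepsilon,\infty)\bigr] \le \mathbb{P}^{\Lambda,P^G_0}\bigl[\mu^{(C,L)}_\rho\{(i, j) : \tau^{i,j} \ge c_\varepsilon\}\bigr] \le \mathbb{P}^{\Lambda,P^G_0}\bigl[\mu^{(C,L)}_\rho\{(i, j) : \sigma^{i,j} \ge \tfrac{c_\varepsilon}{2}\}\bigr] + \mathbb{P}\bigl\{\sum_{k=0}^{G} \tau_k \ge \tfrac{c_\varepsilon}{2}\bigr\} \le \sup_{x,y\in G'} \mathbb{P}\bigl\{\sigma^{x,y} \ge \tfrac{c_\varepsilon}{2}\bigr\} + \mathbb{P}\bigl\{\sum_{k=0}^{G} \tau_k \ge \tfrac{c_\varepsilon}{2}\bigr\} \le \varepsilon$,

which proves the tightness of the family $\{Q^{\Lambda,\rho,G'}[w_X];\ \rho \in (0,\infty)\}$.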

(ii) Fix ε > 0. Given a realization (C, L), denote by $I_\varepsilon$ the set of all indices of initial partition elements which have not moved during the time interval [0, −log(1 − ε)). In particular, by the strong law of large numbers,

(4.34) $\lim_{\rho\to\infty} \mu^{(C,L),G}_\rho(I_\varepsilon) = \varepsilon$,

$\mathbb{P}^{\Lambda,P_0}$-almost surely. Write for all x ∈ G′,

(4.35) $I^{\rho,x\setminus\varepsilon} := \{i \in \mathrm{supp}(\mu^{(C,L)}_\rho) \setminus I_\varepsilon : L_0(i) = x\}$.


Notice that under PΛ,PG0 the process (Cx\

ε2 , Lx\

ε2 ) obtained by restrict-

ing (C,L) to Iρ,x\ε2 is a (non-spatial) Λ-coalescent during the time period

(0,− log (1− ε2)) ⊃ (0, ε2), for all x ∈ G′. Hence it follows from Theo-

rem 4.2.6 together with Theorem 2.8.2 that the dust-free property is equiv-alent to the existence of a δε ∈ (0, 1) (for any ε > 0) such that

(4.36) maxx∈G

supρ∈(0,∞)

QΛ,ρ[vδε(HG′ρ ((Cρ,x\

ε2 , Lρ,x\

ε2 )))]

≤ ε

2.

Then for all ρ ∈ (0,∞),(4.37)

QΛ,ρ[vδε(HG′ρ ((C,L))

)]= QΛ,ρ

[infε′ > 0 : µ(C,L),G

′ρ

u ∈ U (C,L) : µ(C,L),G

′ρ (Bε′(x)) ≤ δε

≤ ε′

]≤ QΛ,ρ

[infε′ > 0 : µ(C,L),G

′ρ

(I

ε2)≤ ε′;

µ(C,L),G′

ρ

u ∈ Iρ,x\

ε2 : µ(C,L),G

′ρ (Bε′(x)) ≤ δε

≤ ε′,∀x ∈ G′

]≤ ε

2,

by (4.35) and (4.36). Hence the sequence $Q^{\Lambda,\rho,G'}[v_\delta(X)]$ converges to zero as δ tends to zero, uniformly in ρ ∈ (0,∞), if and only if the dust-free property holds.

Summarizing (i) and (ii), we have shown that the sequence $(Q^{\Lambda,\rho})_{\rho\in(0,\infty)}$ has a limit if and only if the dust-free property holds.

Uniqueness of the limit points follows as in the proof of Theorem 4.1.3 from the projective property, i.e., restricting the observation to a tagged subset of initial individuals is the same as starting in this restricted initial state.

4.3. Scaling limit of spatial Λ-coalescent trees on $\mathbb{Z}^d$, d ≥ 3

In this section we further restrict the setting in the following way:

• The graph G (as well as the subgraph G′) is the d-dimensional torus

(4.38) $G_N := [-N, N]^d \cap \mathbb{Z}^d \subseteq \mathbb{Z}^d$,

where d ≥ 3 is fixed.

• The migration is the corresponding random walk on the torus with periodic boundary conditions, i.e., given the migration kernel a(x, y) from (4.14) and (4.15), we consider the random walk kernel

(4.39) $a_N(x, y) = \sum_{z\sim y} a(x, z), \quad x, y \in G_N$,

where here ∼ denotes equivalence modulo 2N + 1 in each coordinate.


We restrict our attention to random walk kernels with

(4.40) $\sum_{x\in\mathbb{Z}^d} \bigl(a(0, x) + a(x, 0)\bigr)\, \|x\|^{2+d} < \infty$.

• We assume that Λ is a finite measure on B([0, 1]) which satisfies Condition (4.12), i.e.,

(4.41) the Λ-coalescent comes down from infinity.

In particular, $\lambda_{2,2} > 0$.
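The periodised kernel (4.39) can be made concrete in a one-dimensional toy case (the section itself works in d ≥ 3). The sketch below assumes a finitely supported, translation-invariant kernel on Z, given as a hypothetical dictionary `a0` of jump probabilities, and wraps it onto {−N, ..., N} by summing over all representatives modulo 2N + 1; the function name `torus_kernel` is likewise hypothetical.

```python
def torus_kernel(a0, N):
    """One-dimensional illustration of (4.39).

    a0 maps a displacement z in Z to a(0, z) (finitely supported here).
    Returns a_N as a dict of dicts on G_N = {-N, ..., N}, with
    a_N(x, y) = sum of a(x, z) over all z congruent to y modulo 2N + 1.
    """
    side = 2 * N + 1
    sites = range(-N, N + 1)
    a_N = {x: {y: 0.0 for y in sites} for x in sites}
    for x in sites:
        for step, prob in a0.items():
            # Wrap x + step back into {-N, ..., N}; uses a(x, z) = a(0, z - x).
            y = (x + step + N) % side - N
            a_N[x][y] += prob
    return a_N


# Hypothetical kernel on Z with jumps of size 1 and 3.
a0 = {-3: 0.1, -1: 0.4, 1: 0.4, 3: 0.1}
a_N = torus_kernel(a0, N=1)
print(a_N[0])
```

With N = 1 the jumps of size 3 wrap once around the torus and return to the starting site, so they contribute to the diagonal of `a_N`, while each row still sums to 1.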

Remark 4.3.1 (Coming down from infinity in the spatial setting). Recall from Remark 4.1.7 that Condition (4.12) is equivalent to the property that the corresponding non-spatial Λ-coalescent comes down from infinity. This is also true in the spatial setting. Proposition 11 in [LS06] even states that the spatially structured Λ-coalescent comes down from infinity in a uniform way, i.e., if

(4.42) $T_\infty := \inf\{t \ge 0 : \#C_t = 1\}$

then the following are equivalent:

(a) Condition (4.12) holds.

(b) $\mathbb{P}^{\Lambda,P^{G_N}_0}[T_\infty] < \infty$.

(c) $Q^{\Lambda,G_N} \in M_1(U_c)$.

We are concerned with the convergence of the suitably rescaled family of spatially structured Λ-coalescent trees on $G_N$ towards the (non-spatial) Kingman-coalescent measure tree.

As it will turn out, the interesting time scale is given by the order of time that two typical partition elements need to first meet and then be able to coalesce (with a delay). It is known that in d ≥ 3 two random walk particles on a large torus which started at randomly sampled points meet after a time of the order of the volume. Therefore in the present set-up the time scale is given by the map defined by

(4.43) $\beta_N : t \mapsto (2N + 1)^d\, t, \quad t \in [0,\infty)$.

Recall that under the above assumptions, a random walk on $\mathbb{Z}^d$, d ≥ 3, with kernel a(·, ·) is transient. In particular,

(4.44) $g := \sum_{k\in\mathbb{N}} a^{(k)}(0, 0) < \infty$.

In more detail, the mean time until two partition elements interact is typically of order κ times the volume, where

(4.45) $\kappa := \bigl(\lambda_{2,2}^{-1} + \tfrac{g}{2}\bigr)^{-1}$.


Remark 4.3.2 (Probabilistic interpretation of κ). Let I := {1, 2}, and (C, L) the spatially structured Kingman coalescent (with rate $\lambda_{2,2}$) started in $C_0 = \{\{1\}, \{2\}\}$ with $L_0(1) = L_0(2) = 0$. Then

(4.46) $\kappa = \lambda_{2,2} \cdot \mathbb{P}^{\lambda_{2,2}\delta_0,(C_0,L_0)}\{\#C_\infty = 1\}$.

Recall the law $Q^{\kappa\delta_0}$ of the (rate κ) Kingman coalescent tree from Definition 4.1.5 and the law $Q^{\Lambda,G_N}$ from Definition 4.2.8. Put

(4.47) $Q^{\Lambda,N} := (\beta_N)_* Q^{\Lambda,G_N}$,

where, as usual, multiplication of a metric measure space (X, r, µ) by a factor c gives the metric measure space (X, cr, µ).

Theorem 4.3.3 (Convergence to the Kingman measure tree).

(4.48) $Q^{\Lambda,N} \underset{N\to\infty}{\Longrightarrow} Q^{\kappa\delta_0}$.

Before we can prove Theorem 4.3.3 we need two preparatory results. The first proposition states that the suitably rescaled spatially structured Λ-coalescent started with finitely many partition elements within mutual distance of the order of the side length of the torus converges to the (non-spatial) Kingman coalescent. The following result is Proposition 14 in [LS06]. The statement generalizes Lemma 7.3 in [GLW05] in two directions. First, it extends the consideration of spatially structured Kingman coalescents to spatially structured Λ-coalescents which come down from infinity. Secondly, the statement in [GLW05] is phrased for marginals.

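Proposition 4.3.4. Fix n ∈ N and a sequence $(a_N)_{N\in\mathbb{N}}$ which tends to ∞ slowly enough that $\frac{a_N}{N} \to 0$, as N → ∞. Let $(C_0, L_0) \in \Pi^{\{1,2,\ldots,n\},G_N}$ be such that $C_0 = \{\{1\}, \ldots, \{n\}\}$ and

(4.49) $\|L_0(i) - L_0(j)\| \in [a_N, \sqrt{d}\,N]$,

for all 1 ≤ i ≠ j ≤ n. Then

(4.50) $\bigl(C_{\beta_N(t)}\bigr)_{t\ge 0} \underset{N\to\infty}{\Longrightarrow} \bigl(\rho_n K_{\kappa t}\bigr)_{t\ge 0}$,

where $(K_t)_{t\ge 0}$ is the Kingman coalescent (with index set N).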

Remark 4.3.5 (Heuristics on Proposition 4.3.4). The scale factor κ can be explained as follows. Restrict to two typical initial partition elements on the (large) torus $G_N$. As long as they don’t feel the boundary they try to coalesce as they would do on $\mathbb{Z}^d$. If they do not succeed they start wrapping around the torus. This probability is given by

(4.51) $\mathbb{P}^{\lambda_{2,2}\delta_0,(C_0,L_0)}\{\#C_\infty = 1\} = \bigl(\lambda_{2,2}\tfrac{g}{2} + 1\bigr)^{-1}$.


Given that our two partition elements have not coalesced without wrapping, they will soon after wrapping forget all information about their initial spatial distance and become uniformly distributed on the torus. They now need time of the order of the volume to meet and possibly coalesce. The time intervals during which the particles are within distance O(1) are very short when compared to the time intervals until wrapping around. This means that the two elements under focus are now trying to coalesce on time scale $\beta_N(\cdot)$ at rate $\lambda_{2,2}$ according to the Kingman coalescent.

If we restrict to a finite number of initial partition elements, they will soon get distributed uniformly on the torus. Since on time scale $\beta_N(\cdot)$ we won’t find more than two partition elements at the same site, an ensemble of finitely many elements will also perform on time scale $\beta_N(\cdot)$ a rate $\lambda_{2,2}$ Kingman coalescent.

We will apply Proposition 4.3.4 to obtain “f.d.d.”-convergence in Theorem 4.3.3.

Corollary 4.3.6 (“F.d.d.”-convergence). For all polynomials Φ ∈ Π,

(4.52) $Q^{\Lambda,N}[\Phi] \underset{N\to\infty}{\Longrightarrow} Q^{\kappa\delta_0}[\Phi]$.

Proof. Fix a monomial Φ with degree n ∈ N and test function $\phi \in C\bigl((\mathbb{R}_+)^{\binom{n}{2}}\bigr)$. Given ρ ∈ (0,∞) and $(C, L) \in \Pi^{I\cap(0,\rho],G_N}$, denote by

(4.53) $J^{\rho,(C,L)} := \{(i_1, \ldots, i_n) \in (I \cap (0, \rho])^n : L_0(i_1), \ldots, L_0(i_n) \text{ satisfies (4.49)}\}$.

Then by Proposition 4.3.4, for all ρ ∈ (0,∞),

(4.54)

QΛ,ρ,GN

[ ∑i∈J ρ,(C,L)

(µ(C,L),GNρ

)⊗niϕ((r(C,L),GNρ (il, ik)

#GN

)1≤l<k≤n

)]−→N→∞

Qκδ0[ ∫

UC

(µC)⊗n

(d(u1, ..., un))ϕ((rC(ul, uk)

)1≤l<k≤n

)]while(4.55)

(βN )∗QΛ,ρ,GN

[ ∑i∈J ρ,(C,L)

(µ(C,L),GNρ

)⊗niϕ((r(C,L),GNρ (il, ik)

#GN

)1≤l<k≤n

)]≤∥ ϕ ∥ QΛ,ρ,GN

[(µ(C,L),GNρ

)⊗n(J ρ,(C,L))]

−→N→∞

0.

This proves that $(\beta_N)_* Q^{\Lambda,\rho,G_N}[\Phi]$ converges to $Q^{\kappa\delta_0}[\Phi]$, for all ρ ∈ (0,∞), as N → ∞. Since the convergence is uniform in ρ ∈ (0,∞), we also have that $Q^{\Lambda,G_N}[\Phi]$ converges to $Q^{\kappa\delta_0}[\Phi]$, as N → ∞.


The following tightness result is taken from Lemma 16 in [LS06]. The statement goes back to an analogous statement given in Theorem 1 in [BG80] for the instantaneous coalescent (that is, Λ = γδ_0 with “γ = ∞”) and generalizes Lemma 7.4 in [GLW05] for the spatially structured Kingman coalescent (that is, Λ = γδ_0 with γ ∈ (0,∞)).

Proposition 4.3.7. For all d ≥ 3, there is a finite constant $c = c_d$ such that uniformly in sequences $(\#C^N_0)_{N\in\mathbb{N}}$, and uniformly in N ∈ N,

(4.56) $\mathbb{P}^{\Lambda,(C_0,L_0),G_N}\bigl[\#C_t\bigr] \le c_d \max\bigl\{1, \tfrac{\#C_0}{t}\bigr\}$.

Proof of Theorem 4.3.3. For existence of limit points we will construct for every η > 0 a compact set $\Gamma_\eta \subset M$ such that

(4.57) $\inf_{N\in\mathbb{N}} Q^{\Lambda,N}(\Gamma_\eta) \ge 1 - \eta$.

Fix therefore η > 0. Recall $T_\infty$ from (4.42). As discussed in Remark 4.3.1, $\mathbb{P}^{\Lambda,G_N}[T_\infty] < \infty$ since we assumed (4.12). Hence, for all n ∈ N,

(4.58) $\mathbb{P}^{\Lambda,G_N}\bigl\{T_\infty > 2^{n+2}\eta^{-1}\,\mathbb{P}^{\Lambda,G_N}[T_\infty]\bigr\} \le \tfrac{\eta}{4}\,2^{-n}$,

by the Markov inequality.

Moreover, for all realizations (C, L) and ε > 0 recall from (4.34) the set $I_\varepsilon$ of all indices of initial partition elements which have not moved during the time interval [0, −log(1 − ε)). In particular, since particles jump at rate 1,

(4.59) $\mu^{(C,L),G_N}_\rho(I_\varepsilon) = 1 - \varepsilon$.

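Therefore by Proposition 4.3.7,

(4.60) $\mathbb{P}^{\Lambda,G_N}\bigl[\#\rho_{I_{2^{-n}}} C_{\beta_N(2^{-n})}\bigr] \le c_d \max\Bigl\{1, \frac{\mathbb{P}^{\Lambda,G_N}[\#\rho_{I_{2^{-n}}} C_{-\log(1-2^{-n})}]}{\beta_N(2^{-n}) + \log(1-2^{-n})}\Bigr\} = c_d \max\Bigl\{1, \frac{\#G_N\, \mathbb{P}^{\Lambda}[\#C_{-\log(1-2^{-n})}]}{\beta_N(2^{-n}) + \log(1-2^{-n})}\Bigr\} \underset{N\to\infty}{\longrightarrow} 2^n \cdot c_d \cdot \mathbb{P}^{\Lambda}[\#C_{-\log(1-2^{-n})}]$,

and hence

(4.61) $\mathbb{P}^{\Lambda,G_N}\bigl[\#\rho_{I_{2^{-n}}} C_{\beta_N(2^{-n})}\bigr] \le 2^{n+1} \cdot c_d \cdot \mathbb{P}^{\Lambda}[\#C_{-\log(1-2^{-n})}]$,

for all n ∈ N and N sufficiently large. Therefore, by the Markov inequality,

(4.62) $\mathbb{P}^{\Lambda,G_N}\bigl\{\#\rho_{I_{2^{-n}}} C_{\beta_N(2^{-n})} > 8 \cdot 4^n \cdot \eta^{-1} \cdot c_d \cdot \mathbb{P}^{\Lambda}[\#C_{-\log(1-2^{-n})}]\bigr\} \le \frac{\mathbb{P}^{\Lambda,G_N}[\#\rho_{I_{2^{-n}}} C_{\beta_N(2^{-n})}]}{8 \cdot 4^n \cdot \eta^{-1} \cdot c_d \cdot \mathbb{P}^{\Lambda}[\#C_{-\log(1-2^{-n})}]} \le \tfrac{\eta}{4}\,2^{-n}$.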


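Put for each n ∈ N and η > 0,

(4.63) $K_{n,\eta} := \max\Bigl\{\frac{4 \cdot 2^n}{\eta}\,\mathbb{P}^{\Lambda,G_N}[T_\infty];\ \frac{8 \cdot 4^n}{\eta} \cdot c_d \cdot \mathbb{P}^{\Lambda}[\#C_{-\log(1-2^{-n})}]\Bigr\}$,

and let then

(4.64) $\Gamma_\eta := \bigcap_{n\in\mathbb{N}} \bigl\{(U, r, \mu) \in U : \exists\, U_n \subseteq U \text{ such that } \mu(U_n) \ge 1 - 2^{-n};\ \mathrm{diam}(U_n) \le K_{n,\eta};\ U_n \text{ can be covered by } K_{n,\eta} \text{ balls of radius } 2^{-n}\bigr\}$.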

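Notice that $\Gamma_\eta$ is compact by Theorem 2.4.1(c). Moreover, by (4.58), (4.59) and (4.62),

(4.65) $Q^{\Lambda,N}(\Gamma_\eta) = \mathbb{P}^{\Lambda,G_N}\Bigl(\bigcap_{n\in\mathbb{N}} \bigl\{\exists\, I' \subseteq I : \#\rho_{I'} C_{\beta_N(2^{-n})} \le K_{n,\eta};\ \#\rho_{I'} C_{\beta_N(K_{n,\eta})} = 1\bigr\}\Bigr) \ge \mathbb{P}^{\Lambda,G_N}\Bigl(\bigcap_{n\in\mathbb{N}} \bigl\{\#\rho_{I_{2^{-n}}} C_{\beta_N(2^{-n})} \le K_{n,\eta};\ \#\rho_{I_{2^{-n}}} C_{\beta_N(K_{n,\eta})} = 1\bigr\}\Bigr) \ge 1 - \sum_{n\in\mathbb{N}} \Bigl(\mathbb{P}^{\Lambda,G_N}\bigl\{\#\rho_{I_{2^{-n}}} C_{\beta_N(2^{-n})} > K_{n,\eta}\bigr\} + \mathbb{P}^{\Lambda,G_N}\bigl\{\#\rho_{I_{2^{-n}}} C_{\beta_N(K_{n,\eta})} > 1\bigr\}\Bigr) \ge 1 - \sum_{n\in\mathbb{N}} \bigl(\tfrac{\eta}{4}2^{-n} + \tfrac{\eta}{4}2^{-n}\bigr) = 1 - \eta$,

which establishes (4.57) and therefore finishes the proof.

Uniqueness of the limit follows from Corollary 4.3.6 together with Proposition 2.1.8.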

4.4. Scaling limit of spatial Kingman coalescent trees on $\mathbb{Z}^2$

In this section we restrict the setting of Section 4.2 in the following way:

• The graph G is the two-dimensional lattice $\mathbb{Z}^2$.

• Fix α ∈ (0, 1], and consider for t > 0,

(4.66) $G^\alpha_t := \bigl[-t^{\alpha/2}, t^{\alpha/2}\bigr]^2 \cap \mathbb{Z}^2 \subset G$.

• Recall the migration kernel a(x, y) from (4.14) and (4.15) and assume that a(·, ·) has finite exponential moments, i.e.,

(4.67) $\sum_{z=(z_1,z_2)\in\mathbb{Z}^2} e^{\lambda_1 z_1 + \lambda_2 z_2}\, a(0, z) < \infty$,

for all $\lambda_1, \lambda_2 \in \mathbb{R}$.


• Choose the coalescent to be the Kingman coalescent with rate γ, i.e.,

(4.68) $\Lambda(dx) := \gamma\delta_0$.

We are concerned with the convergence of the suitably rescaled family of spatially structured Λ-coalescent trees on $G^\alpha_t$ towards the (non-spatial) Kingman-coalescent measure tree (on a logarithmic scale).

The rescaling is motivated by a well-known result by Erdős and Taylor [ET60] for planar random walks with finite variance: if σ is the first hitting time of the origin of a two-dimensional random walk, then

(4.69) $\lim_{t\to\infty} \mathbb{P}^{x t^{\alpha/2}}\bigl\{\sigma > t^\beta\bigr\} = \frac{\alpha}{\beta} \wedge 1$,

for all α, β ∈ [0, 1], and all $x \in \mathbb{R}^2 \setminus \{(0, 0)\}$ (see, for example, Proposition 1 in [CG86]).

Remark 4.4.1. In particular, the right hand side of (4.69) does not depend on $x \in \mathbb{R}^2 \setminus \{(0, 0)\}$. Due to this peculiar (specific to d = 2) property, the behavior of the spatial coalescent started in $G^\alpha_t$ and observed at time $t^\beta$, asymptotically as t → ∞, depends only on the logarithmic scales α and β, while all the finer distinctions are washed out.

Define the scaling maps λα : [0,∞) → [0,∞) and ταt : [0,∞) → [0,∞)by

(4.70) λα : β 7→ − log(βα∨ 1),

and

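(4.71) $\tau^\alpha_t : r \mapsto t^{\alpha \cdot e^r}$.

In particular, since $(\lambda^\alpha)^{-1}(r) = \alpha \cdot e^r$, for all r > 0, we have $(\tau^\alpha_t)^{-1}(r) = \lambda^\alpha(\log_t(r))$, and therefore (4.69) applied to $y := (\tau^\alpha_t)^{-1}(t^\beta) = \lambda^\alpha(\beta)$ (or equivalently $\beta = (\lambda^\alpha)^{-1}(y)$) yields that

(4.72) $\lim_{t\to\infty} \mathbb{P}^{x t^{\alpha/2}}\bigl\{(\tau^\alpha_t)^{-1}(\sigma) > y\bigr\} = \frac{\alpha}{(\lambda^\alpha)^{-1}(y)} = e^{-y}$.

In words, $(\tau^\alpha_t)^{-1}(\sigma)$ (under the laws $\mathbb{P}^{x t^{\alpha/2}}$) converges as t → ∞ to a rate 1 exponentially distributed random variable (on a logarithmic time scale).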

Recall the law $Q^{\delta_0}$ of the (rate 1) Kingman coalescent tree from Definition 4.1.5 and the law $Q^{\gamma\delta_0,G^\alpha_t}$ of the (rate γ) spatially structured Kingman coalescent tree (spanned by the initial partition elements with marks in $G^\alpha_t$) from Definition 4.2.8. Put

(4.73) $Q^{\gamma\delta_0,t,\alpha} := \bigl(\tau^\alpha_t \circ \lambda^\alpha\bigr)_* Q^{\gamma\delta_0,G^\alpha_t}$,

and

(4.74) $Q^{\delta_0}_\alpha := (\lambda^\alpha)_* Q^{\delta_0}$,


where a map f : (0,∞) → (0, 1] sends an ultra-metric measure space (X, r, µ) to the ultra-metric space (X, f r, µ) with

(4.75) $f r(u_1, u_2) := 2 \cdot f\bigl(\tfrac{1}{2} r(u_1, u_2)\bigr)$.

Remark 4.4.2. To understand the factors 2 and $\tfrac{1}{2}$, recall from Example 1.5.8 that ultra-metric spaces (U, r) can be embedded into an R-tree (X, r), and think then for $u_1, u_2 \in U$ of $u_1 \wedge u_2$ as the unique point in X such that $r(u_1, u_1 \wedge u_2) = r(u_2, u_1 \wedge u_2) = \tfrac{1}{2} r(u_1, u_2)$. The rescaling (4.75) can then be read as rescaling the two distances $r(u_1, u_1 \wedge u_2)$ and $r(u_2, u_1 \wedge u_2)$ on the f-scale.

Theorem 4.4.3 (Convergence to the Kingman measure tree). For all α ∈ (0, 1],

(4.76) $Q^{\gamma\delta_0,t,\alpha} \underset{t\to\infty}{\Longrightarrow} Q^{\delta_0}_\alpha$.

Remark 4.4.4 (Open problem: Pathwise convergence). It is an open problem to show that the family $(Q^{\gamma\delta_0,t,\alpha})_{\alpha\in(0,1]}$ converges to a limit process whose one-dimensional marginals are equal to $Q^{\delta_0}_\alpha$. We conjecture that the limit process is given by suitably rescaled tree-valued Fleming-Viot dynamics which we construct in Chapter 7. In forthcoming work [GLW] we provide the necessary tightness estimates.

Before we can prove Theorem 4.4.3 we need two preparatory results. The first proposition states, in analogy to Proposition 4.3.4, that the rescaled spatially structured Kingman coalescent started with finitely many partition elements within mutual distance of the order of the side length of the torus converges to the (non-spatial) Kingman coalescent on the logarithmic scale. The following result is taken from Proposition 5.1 in [GLW07].

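Proposition 4.4.5 (Finite sparse coalescents: large time scales). Fix n ∈ N, α ∈ (0, 1] and c > 0, and let $(C_0, L_0) \in \Pi^{\{1,2,\ldots,n\},G^\alpha_t}$ be such that $C_0 = \{\{1\}, \ldots, \{n\}\}$, and

(4.77) $\|L_0(i) - L_0(j)\| \in \bigl[(c \log t)^{-1} \cdot t^{\alpha/2},\ (c \log t) \cdot t^{\alpha/2}\bigr]$,

for all 1 ≤ i ≠ j ≤ n. Then

(4.78) $\bigl(C_{(\tau^\alpha_t \circ \lambda^\alpha)(\beta)}\bigr)_{\beta\in[\alpha,\infty)} \underset{t\to\infty}{\Longrightarrow} \bigl(\rho_n K_{\lambda^\alpha(\beta)}\bigr)_{\beta\in[\alpha,\infty)}$,

where $(K_t)_{t\ge 0}$ is the Kingman coalescent (with index set N).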

As done before, we will again apply Proposition 4.4.5 to obtain “f.d.d.”-convergence in Theorem 4.4.3.


Corollary 4.4.6 (“F.d.d.”-convergence). For all polynomials Φ ∈ Π and α ∈ (0, 1],

(4.79) $Q^{\gamma\delta_0,t,\alpha}[\Phi] \underset{t\to\infty}{\Longrightarrow} Q^{\delta_0}_\alpha[\Phi]$.

Proof. The proof follows the same line of argument as the proof of Corollary 4.3.6.

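We will also rely on the following tightness result which is stated in Proposition 6.1 in [GLW07].

Proposition 4.4.7 (Uniformly bounded expectation on logarithmic scale). There are finite constants M and $t_0$ such that for all t ≥ $t_0$, satisfying α ∈ (0,∞), and β ∈ (α,∞),

(4.80) $\mathbb{P}^{\gamma\delta_0,G^\alpha_t}\bigl[\#C_{(\tau^\alpha_t \circ \lambda^\alpha)(\beta)}\bigr] \le M \cdot \max\Bigl\{\frac{\alpha}{2(\beta - \alpha)},\ \frac{\mathbb{P}^{\gamma\delta_0,G^\alpha_t}[\#C_2]}{\#G^\alpha_t},\ 1\Bigr\}$.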

Proof of Theorem 4.4.3. We will proceed similarly to the proof of Theorem 4.3.3.

Fix α ∈ (0, 1] and η > 0. For existence of limit points we will construct a compact set $\Gamma_\eta \subset M$ such that now

(4.81) $\inf_{t\in(0,\infty)} Q^{\gamma\delta_0,t,\alpha}(\Gamma_\eta) \ge 1 - \eta$.

For all realizations (C, L) and ε > 0 recall $I_\varepsilon$ from (4.34). Since $\#\rho_{I_\varepsilon} < \infty$, and any two random walks on $\mathbb{Z}^2$ with kernel a(·, ·) satisfying Assumption (4.67) meet (and coalesce) in finite time,

(4.82) $T^\varepsilon_\infty := \inf\{s \ge 0 : \#\rho_{I_\varepsilon} C_s = 1\} < \infty$,

for all ε > 0, $\mathbb{P}^{\gamma\delta_0,G^\alpha_t}$-almost surely. In particular, for each n ∈ N and η, we can choose $L_{n,\eta} \in (0,\infty)$ such that

(4.83) $\mathbb{P}^{\gamma\delta_0,G^\alpha_t}\bigl\{T^{2^{-n}}_\infty > L_{n,\eta}\bigr\} \le \tfrac{\eta}{4}\,2^{-n}$.

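Moreover, by Proposition 4.4.7 applied to $\beta = 2^n \alpha$, for all n ≥ 1,

(4.84) $P^{\gamma\delta_0,\, G^{\alpha}_{t}}\big[ \#\rho_{I_{2^{-n}}} C_{(\tau^{\alpha}_{t} \circ \lambda^{\alpha})(2^{-n})} \big] \le M \cdot \max\Big\{ \frac{P^{\gamma\delta_0,\, G^{\alpha}_{t}}[\#\rho_{I_{2^{-n}}} C_2]}{\# G^{\alpha}_{t}},\ 1 \Big\} = M \cdot P^{\gamma\delta_0}[\# K_2]$,

for all n ∈ N and t sufficiently large. Therefore

(4.85) $P^{\gamma\delta_0,\, G^{\alpha}_{t}}\Big\{ \#\rho_{I_{2^{-n}}} C_{(\tau^{\alpha}_{t} \circ \lambda^{\alpha})(2^{-n})} > \frac{4 \cdot 2^n}{\eta}\, M \cdot P^{\gamma\delta_0}[\# K_2] \Big\} \le \frac{\eta}{4}\, 2^{-n}$.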
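The passage from (4.84) to (4.85) appears to be Markov's inequality. As a sketch, write $N_n$ for the partition count on the left-hand side of (4.84) and $m := M \cdot P^{\gamma\delta_0}[\# K_2]$ for the bound obtained there; both abbreviations are introduced here only for readability and are not notation from the text. As elsewhere in the text, $P[\,\cdot\,]$ applied to a random variable denotes its expectation.

```latex
% Shorthand (assumed for this sketch only):
%   N_n := partition count on the left-hand side of (4.84),
%   m   := M \cdot P^{\gamma\delta_0}[\# K_2].
\[
  P\Big\{ N_n > \tfrac{4 \cdot 2^n}{\eta}\, m \Big\}
  \;\le\; \frac{P[N_n]}{\tfrac{4 \cdot 2^n}{\eta}\, m}
  \;\le\; \frac{m}{\tfrac{4 \cdot 2^n}{\eta}\, m}
  \;=\; \frac{\eta}{4}\, 2^{-n}.
\]
```

The first inequality is Markov's inequality for the nonnegative variable $N_n$, the second uses (4.84).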


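Put now for each n ∈ N and η > 0,

(4.86) $K_{n,\eta} := \max\Big\{ L_{n,\eta},\ \frac{4 \cdot 2^n}{\eta}\, M \cdot P^{\gamma\delta_0}[\# K_2] \Big\}$,

and let then Γη be the compact set defined in (4.64). Moreover, by (4.59), (4.83) and (4.85),

(4.87)
$$\begin{aligned}
Q^{\gamma\delta_0,t,\alpha}(\Gamma_\eta)
&= P^{\gamma\delta_0,\, G^{\alpha}_{t}}\Big( \bigcap_{n \in \mathbb{N}} \big\{ \exists I' \subseteq I : \#\rho_{I'} C_{(\tau^{\alpha}_{t} \circ \lambda^{\alpha})(2^{-n})} \le K_{n,\eta};\ T^{2^{-n}}_{\infty} \le K_{n,\eta} \big\} \Big) \\
&\ge P^{\gamma\delta_0,\, G^{\alpha}_{t}}\Big( \bigcap_{n \in \mathbb{N}} \big\{ \#\rho_{I_{2^{-n}}} C_{(\tau^{\alpha}_{t} \circ \lambda^{\alpha})(2^{-n})} \le K_{n,\eta};\ T^{2^{-n}}_{\infty} \le K_{n,\eta} \big\} \Big) \\
&\ge 1 - \sum_{n \in \mathbb{N}} \Big( P^{\gamma\delta_0,\, G^{\alpha}_{t}}\big\{ \#\rho_{I_{2^{-n}}} C_{(\tau^{\alpha}_{t} \circ \lambda^{\alpha})(2^{-n})} > K_{n,\eta} \big\} + P^{\gamma\delta_0,\, G^{\alpha}_{t}}\big\{ T^{2^{-n}}_{\infty} > K_{n,\eta} \big\} \Big) \\
&\ge 1 - \sum_{n \in \mathbb{N}} \Big( \frac{\eta}{4}\, 2^{-n} + \frac{\eta}{4}\, 2^{-n} \Big) = 1 - \eta,
\end{aligned}$$

for all sufficiently large t, which establishes (4.81) and therefore finishes the proof.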

Once more, uniqueness of the limit follows from uniqueness of the finite dimensional distributions given by Corollary 4.4.6 together with Proposition 2.1.8.


CHAPTER 5

Root growth and Regrafting

The Aldous–Broder algorithm is a Markov chain on the space of rooted combinatorial trees with N vertices that has the uniform tree as its stationary distribution. In this chapter we construct and study the so-called root growth and regrafting dynamics, which is a Markov process on the space of all rooted compact real trees that has the continuum random tree as its stationary distribution and arises as the scaling limit of the Aldous–Broder chain as N → ∞. Before we outline the chapter we give a more detailed description and motivation of the dynamics.

Given an irreducible Markov matrix P with state space V, there is a natural probability measure on the collection of combinatorial trees with vertices labeled by V that assigns mass

(5.1) $C^{-1} \prod_{(x,y)} P(x, y)$

to the tree T, where C is a normalization constant and the product is over pairs of adjacent vertices (x, y) in T ordered so that y is on the path from the root to x. For example, if P(x, y) ≡ 1/|V| for all x, y ∈ V (so that the associated Markov chain consists of successive uniform random picks from V), then the distribution (5.1) is uniform on the set of $|V|^{|V|-1}$ rooted combinatorial trees labeled by V.
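The count $|V|^{|V|-1}$ can be checked by brute force for very small vertex sets. The following is a minimal sketch, assuming vertices are labeled 0, ..., n−1 and a rooted tree is encoded as a root together with a map sending every other vertex to its neighbour on the path to the root; the function names are illustrative only.

```python
from itertools import product


def reaches_root(v, parent, root, n):
    """Follow parent pointers from v; True if the root is hit within n steps."""
    for _ in range(n):
        if v == root:
            return True
        v = parent[v]
    return v == root


def rooted_labeled_trees(n):
    """Enumerate all rooted trees on vertices 0..n-1 as (root, parent-map) pairs."""
    trees = []
    for root in range(n):
        others = [v for v in range(n) if v != root]
        # Try every assignment of a "parent" to each non-root vertex and
        # keep those in which every vertex is connected to the root.
        for choice in product(range(n), repeat=len(others)):
            parent = dict(zip(others, choice))
            if all(reaches_root(v, parent, root, n) for v in others):
                trees.append((root, parent))
    return trees
```

With P(x, y) ≡ 1/|V| every such tree receives the same mass in (5.1), because each tree has exactly |V| − 1 edges; the enumeration only confirms how many trees there are.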

The Aldous-Broder algorithm [AT89, Bro89, Ald90] is a tree-valued Markov chain that has the distribution in (5.1) as its stationary distribution. The discrete time version of the algorithm has the following transition dynamics (see the left hand side of Figure 5.1 for an illustration).

• Pick a vertex υ at random according to P(ρ, ·), where ρ is the current root.
• If υ = ρ, do nothing.
• If υ ≠ ρ:
– Erase the edge connecting υ to the unique vertex adjacent to υ and on the path from ρ to υ.
– Insert a new edge between υ and ρ.
– Designate υ as the new root.
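The transition just described can be sketched in a few lines. This is only an illustration under an assumed encoding: the tree is stored as a dictionary `parent` mapping each non-root vertex to its neighbour on the path to the root, and `pick` is a hypothetical callback that draws the next vertex from P(ρ, ·).

```python
import random


def aldous_broder_step(parent, root, pick):
    """One step of the discrete-time chain on rooted trees.

    parent: dict mapping each non-root vertex to its neighbour towards the root.
    pick:   callable; pick(root) returns a vertex drawn from P(root, .).
    Returns the new (parent, root) pair; the input dict is not modified.
    """
    v = pick(root)
    if v == root:
        return parent, root          # nothing happens
    new_parent = dict(parent)
    del new_parent[v]                # erase the edge from v towards the old root
    new_parent[root] = v             # insert the new edge between v and the old root
    return new_parent, v             # v is the new root


def uniform_pick(vertices, rng=random):
    """Callback for the case P(x, y) = 1/|V| (an assumption of this sketch)."""
    return lambda root: rng.choice(vertices)
```

In this encoding, erasing the edge towards the old root and inserting the edge between υ and ρ amounts to two dictionary updates, since all other vertices already point towards the new root.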

Since this Markov chain is aperiodic, irreducible on the subspaces of rooted trees with a fixed number of vertices, and symmetric, it is clear that it converges in distribution to the uniform distribution on the space of rooted trees with a given number of vertices.


Figure 5.1. Illustrates how the Aldous–Broder chain evolves (figure omitted; successive roots ρ1, ρ2, ρ3, ρ4).

We can, of course, rephrase the Aldous–Broder chain in a way which stresses it as a rooted-tree-valued process (see the right hand side of Figure 5.1). To do so, assume we are starting with (T, r, ρ) ∈ Troot such that T has exactly n vertices (i.e., points x ∈ T such that T \ {x} either does not get disconnected or gets disconnected into more than 2 components).

• Pick a vertex υ ∈ T at random according to the uniform distribution on the set of vertices.
• If υ = ρ, do nothing.
• If υ ≠ ρ:
– Prune off the sub-tree above υ, erase the edge which is adjacent to υ and lies on [ρ, υ].
– Insert a new edge adjacent to ρ and pointing away from the rest of the tree. Let its end be the new root.
– Regraft the pruned sub-tree by gluing together the new root with υ.


We are now interested in a suitable rescaling of these dynamics. As suggested by (3.22), for the one-dimensional distributions to converge we need to rescale edge lengths by a factor $\frac{1}{\sqrt{n}}$. To find the right rescaling for the time, recall the notion of rooted subtrees and trimmings from Section 1.9. For a fixed η > 0, we want to observe the dynamics of the trimmed subtree $R_{\lfloor \eta \frac{1}{\sqrt{n}} \rfloor}(T)$ (assuming our eyes can only focus on things which are far enough from the fringy margin; the rest, still there, will not be visible to us). Letting the Aldous–Broder chain run, there are two possible scenarios.

• With probability $\frac{1}{\sqrt{n}}\, \mu^{T}\big(R_{\lfloor \eta \frac{1}{\sqrt{n}} \rfloor}(T)\big)$ the picked vertex belongs to the visible sub-tree and we observe that the (sub)subtree above this point gets pruned off (this way losing one edge of side length $\frac{1}{n}$) and gets regrafted to the current root via an additional (and from now on visible) adjoining edge of side length $\frac{1}{\sqrt{n}}$,
• while with probability $1 - \frac{1}{\sqrt{n}}\, \mu^{T}\big(R_{\lfloor \eta \frac{1}{\sqrt{n}} \rfloor}(T)\big)$ a sub-tree from $T \setminus R_{\lfloor \eta \frac{1}{\sqrt{n}} \rfloor}(T)$, invisible to us, gets regrafted to the root via an additional (and from now on visible) adjoining edge of side length $\frac{1}{\sqrt{n}}$.

This suggests that a law of large numbers holds for the second ingredient of the dynamics if we let the chain run at the scale $\sqrt{n}$. Moreover, if we rescale time and edge length as suggested, the following limit dynamics is expected.

Definition 5.0.8 (Root growth with regrafting). The root growth with regrafting dynamics is a strong Markov process X with values in Troot such that if X0 = T ∈ Troot, then for all η > 0, $X^{\eta} := (R_\eta(X_t))_{t \ge 0}$ starts in $R_\eta(T)$ and evolves according to the following dynamics:

• Root growth. The root grows at speed 1.
• Regrafting. At rate $\mu^{X^{\eta}_{\cdot}}(dx)$ the sub-tree above a point $x \in X^{\eta}_{\cdot}$ falls off and gets regrafted with the current root.

Before we can establish such a convergence result, we need to show that the root growth with regrafting dynamics even makes sense for compact real trees with infinite total length. The plan of this chapter is therefore as follows. We construct the extended root growth with re-grafting process in Sections 5.1 and 5.2 via a procedure that is roughly analogous to building a discontinuous Markov process in Euclidean space as the solution of a stochastic differential equation with respect to a sufficiently rich Poisson noise. This approach is particularly well-suited to establishing the strong Markov property. In Section 5.3 we will rely on Aldous’s line-breaking construction to show that the root growth and regrafting dynamics converges to the Brownian CRT. We prove in Section 5.4 that the extended root growth with re-grafting process is recurrent. We verify that the extended process has a Feller semigroup in Section 5.5, and show in Section 5.6 that it is a re-scaling limit of the Markov chain appearing in the Aldous–Broder algorithm


for simulating a uniform rooted tree on some finite number of vertices. We devote Section 5.7 to a discussion on the Rayleigh process described above.

5.1. A deterministic construction

We are now ready to begin in earnest the construction of the Troot-valued Markov process, X, having the root growth with re-grafting dynamics.

Fix a tree (T, r, ρ) ∈ Troot. This tree will be the initial state of X. Following the semi-formal description, the “stochastic inputs” to the construction of X will be a collection of cut times and a corresponding collection of cut points. Based on these the strategy is as follows:

• Construct simultaneously for each finite rooted subtree $T^* \preceq^{\mathrm{root}} T$ a process $X^{T^*}$ with $X^{T^*}_0 = T^*$ that evolves according to the root growth with re-grafting dynamics.
• Carry out this construction in such a way that if $T^*$ and $T^{**}$ are two finite subtrees with $T^* \preceq^{\mathrm{root}} T^{**}$, then $X^{T^*}_t \preceq^{\mathrm{root}} X^{T^{**}}_t$ and the cut points for $X^{T^*}$ are those for $X^{T^{**}}$ that happen to fall on $X^{T^*}_{\tau-}$ for a corresponding cut time τ of $X^{T^{**}}$. Cut times τ for $X^{T^{**}}$ for which the corresponding cut point does not fall on $X^{T^*}_{\tau-}$ are not cut times for $X^{T^*}$.
• The tree (T, ρ) is a rooted Gromov-Hausdorff limit of finite R-trees with root ρ (indeed, any subtree spanned by a finite ε-net and ρ is finite and has rooted Gromov-Hausdorff distance less than ε from (T, ρ)). In particular, (T, ρ) is the “smallest” rooted compact R-tree that contains all of the finite rooted subtrees of (T, ρ).
• Because of the consistent projective nature of the construction, we can define $X_t := X^{T}_t$ for t ≥ 0 as the “smallest” element of Troot that contains $X^{T^*}_t$, for all finite trees $T^* \preceq^{\mathrm{root}} T$.

It will be convenient for establishing features of the process X such as the strong Markov property to introduce randomness later and work initially in a setting where the cut times and cut points are fixed. There are two types of cut points: those that occur at points which were present in the initial tree T and those that occur at points which were added due to subsequent root growth. Accordingly, we consider two countable subsets $\pi_0 \subset \mathbb{R}_{++} \times T^o$ and $\pi \subset \{(t, x) \in \mathbb{R}_{++} \times \mathbb{R}_{++} : x \le t\}$ (compare Figure 5.2). (Once again we note that we are moving backwards and forwards between thinking of T as a metric space or as an equivalence class of metric spaces. As we have written things here, we are thinking of π0 being associated with a particular class representative, but of course π0 corresponds to a similar set for any representative of the same equivalence class by mapping across using the appropriate root invariant isometry.)

Assumption 5.1.1 (Nice point processes). Suppose that the sets π0 and π have the following properties.


Figure 5.2. Illustrates the cut points appearing in the initial tree (above; axes t and $T^o$) and in everything which will be introduced due to root growth (below; axes t and x). (Figure omitted.)

(a) For all t0 > 0, each of the sets $\pi_0 \cap (\{t_0\} \times T^o)$ and $\pi \cap (\{t_0\} \times ]0, t_0])$ has at most one point and at least one of these sets is empty.

(b) For all t0 > 0 and all finite subtrees T′ ⊆ T, the set $\pi_0 \cap (]0, t_0] \times T')$ is finite.

(c) For all t0 > 0, the set $\pi \cap \{(t, x) \in \mathbb{R}_{++} \times \mathbb{R}_{++} : x \le t \le t_0\}$ is finite.

Remark 5.1.2. Conditions (a)–(c) of Assumption 5.1.1 will hold almost surely if π0 and π are realizations of Poisson point processes with respective intensities λ ⊗ µ and λ ⊗ λ (where λ is Lebesgue measure), and it is this random mechanism that we will introduce later to produce a stochastic process having the root growth with re-grafting dynamics.
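To make the random mechanism for π concrete, here is a minimal sketch of how one might sample the points of a Poisson process with intensity λ ⊗ λ on {(t, x) : 0 < x ≤ t ≤ t0}. It relies on the standard time-change argument: the first coordinates form a Poisson process with rate t dt, whose cumulative intensity is t²/2, and given t the second coordinate is uniform on ]0, t]. The function name and the `rng` argument are assumptions of this sketch; the process π0 is not sampled here since it depends on the length measure µ of the tree T.

```python
import math
import random


def sample_pi(t0, rng):
    """Sample the points (t, x), x <= t <= t0, of a Poisson process with
    intensity Lebesgue x Lebesgue on the wedge {(t, x) : 0 < x <= t}."""
    points = []
    gamma = 0.0
    while True:
        gamma += rng.expovariate(1.0)   # arrivals of a unit-rate Poisson process
        t = math.sqrt(2.0 * gamma)      # invert the cumulative intensity t^2 / 2
        if t > t0:
            return points
        x = t * (1.0 - rng.random())    # uniform on ]0, t]
        points.append((t, x))
```

A call such as `sample_pi(5.0, random.Random(0))` returns finitely many points with increasing time coordinates, in line with conditions (a) and (c) of Assumption 5.1.1.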

It will be convenient to use the notations π0 and π to also refer to the integer-valued measures that are obtained by placing a unit point mass at each point of the corresponding set.

Consider a finite rooted subtree $T^* \preceq^{\mathrm{root}} T$. It will avoid annoying circumlocutions about equivalence via root invariant isometries if we work with particular class representatives for $T^*$ and T, and, moreover, suppose that $T^*$ is embedded in T.

Put $\tau^*_0 := 0$, and let $0 < \tau^*_1 < \tau^*_2 < \dots$ (the cut times for $X^{T^*}$) be the points of $\{t > 0 : \pi_0(\{t\} \times T^*) > 0\} \cup \{t > 0 : \pi(\{t\} \times \mathbb{R}_{++}) > 0\}$ (see Figure 5.3 for an illustration).

An explicit construction of $X^{T^*}_t$ is then given in two steps:


Figure 5.3 (figure omitted; its labels are the axes t, $T^o$ and x, the subtree $T^*$, and the cut times $\tau^*_1, \dots, \tau^*_4$).

Step 1 (Root growth) At any time t ≥ 0, $X^{T^*}_t$ as a set is given by the disjoint union $T^* \sqcup ]0, t]$. The root of $X^{T^*}_t$ is the point $\rho_t := t \in ]0, t]$. The metric $r^{T^*}_t$ on $X^{T^*}_t$ is defined inductively as follows. Set $r^{T^*}_0$ to be the metric on $X^{T^*}_0 = T^*$; that is, $r^{T^*}_0$ is the restriction of r to $T^*$. Suppose that $r^{T^*}_t$ has been defined for $0 \le t \le \tau^*_n$. Define $r^{T^*}_t$ for $\tau^*_n < t < \tau^*_{n+1}$ by

(5.2) $r^{T^*}_t(a, b) := \begin{cases} r_{\tau^*_n}(a, b), & \text{if } a, b \in X^{T^*}_{\tau^*_n}, \\ |b - a|, & \text{if } a, b \in ]\tau^*_n, t], \\ |a - \tau^*_n| + r_{\tau^*_n}(\rho_{\tau^*_n}, b), & \text{if } a \in ]\tau^*_n, t],\ b \in X^{T^*}_{\tau^*_n}. \end{cases}$
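The three cases of (5.2) translate directly into code. The sketch below is an illustration only: it assumes that the metric at the last cut time is available as a function `r_prev`, that a predicate `in_prev` tells old points from the newly grown points (which are real numbers in ]τ*_n, t]), and that `root_prev` is the root at time τ*_n. All these names are hypothetical.

```python
def grown_metric(r_prev, in_prev, tau_n, root_prev):
    """Return the metric of (5.2), valid strictly between two cut times.

    r_prev:    metric on the tree at the cut time tau_n.
    in_prev:   in_prev(a) is True if a is a point of that tree, and False
               if a is a newly grown point, i.e. a real number > tau_n.
    root_prev: the root of the tree at time tau_n.
    """
    def r_t(a, b):
        if in_prev(a) and in_prev(b):
            return r_prev(a, b)              # both points are old
        if not in_prev(a) and not in_prev(b):
            return abs(b - a)                # both lie on the new segment
        if in_prev(a):
            a, b = b, a                      # make a the new point
        return abs(a - tau_n) + r_prev(root_prev, b)
    return r_t
```

The mixed case simply routes the path through the old root, which is where the newly grown segment is attached.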

Step 2 (Re-Grafting) Note that the left-limit $X^{T^*}_{\tau^*_{n+1}-}$ exists in the rooted Gromov-Hausdorff metric. As a set this left-limit is the disjoint union

(5.3) $X^{T^*}_{\tau^*_n} \sqcup ]\tau^*_n, \tau^*_{n+1}] = T^* \sqcup ]0, \tau^*_{n+1}]$,

and the corresponding metric $r_{\tau^*_{n+1}-}$ is given by a prescription similar to (5.2).

Define the (n + 1)st cut point for $X^{T^*}$ by

(5.4) $p^*_{n+1} := \begin{cases} a \in T^*, & \text{if } \pi_0(\{(\tau^*_{n+1}, a)\}) > 0, \\ x \in ]0, \tau^*_{n+1}], & \text{if } \pi(\{(\tau^*_{n+1}, x)\}) > 0. \end{cases}$

Let $S^*_{n+1}$ be the subtree above $p^*_{n+1}$ in $X^{T^*}_{\tau^*_{n+1}-}$, that is,

(5.5) $S^*_{n+1} := \{ b \in X^{T^*}_{\tau^*_{n+1}-} : p^*_{n+1} \in [\rho_{\tau^*_{n+1}-}, b[ \}$.


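Define the metric $r_{\tau^*_{n+1}}$ by

(5.6) $r_{\tau^*_{n+1}}(a, b) := \begin{cases} r_{\tau^*_{n+1}-}(a, b), & \text{if } a, b \in S^*_{n+1}, \\ r_{\tau^*_{n+1}-}(a, b), & \text{if } a, b \in X^{T^*}_{\tau^*_{n+1}} \setminus S^*_{n+1}, \\ r_{\tau^*_{n+1}-}(a, \rho_{\tau^*_{n+1}}) + r_{\tau^*_{n+1}-}(p^*_{n+1}, b), & \text{if } a \in X^{T^*}_{\tau^*_{n+1}} \setminus S^*_{n+1},\ b \in S^*_{n+1}. \end{cases}$

In other words $X^{T^*}_{\tau^*_{n+1}}$ is obtained from $X^{T^*}_{\tau^*_{n+1}-}$ by pruning off the subtree $S^*_{n+1}$ and re-attaching it to the root.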

Now consider two other finite, rooted subtrees $(T^{**}, \rho)$ and $(T^{***}, \rho)$ of T such that $T^* \cup T^{**} \subseteq T^{***}$ (with induced metrics). Build $X^{T^{**}}$ and $X^{T^{***}}$ from π0 and π in the same manner as $X^{T^*}$ (but starting at $T^{**}$ and $T^{***}$). It is clear from the construction that:

• $X^{T^*}_t$ and $X^{T^{**}}_t$ are rooted subtrees of $X^{T^{***}}_t$ for all t ≥ 0,
• the Hausdorff distance between $X^{T^*}_t$ and $X^{T^{**}}_t$ as subsets of $X^{T^{***}}_t$ does not depend on $T^{***}$,
• the Hausdorff distance is constant between jumps of $X^{T^*}$ and $X^{T^{**}}$ (when only root growth is occurring in both processes).

The following lemma shows that the Hausdorff distance between $X^{T^*}_t$ and $X^{T^{**}}_t$ as subsets of $X^{T^{***}}_t$ does not increase at jump times.

Lemma 5.1.3. Let T be a finite rooted tree with root ρ and metric r, and let T′ and T′′ be two rooted subtrees of T (both with the induced metrics and root ρ). Fix p ∈ T, and let S be the subtree in T above p (recall (5.5)). Define a new metric $\hat{r}$ on T by putting

(5.7) $\hat{r}(a, b) := \begin{cases} r(a, b), & \text{if } a, b \in S, \\ r(a, b), & \text{if } a, b \in T \setminus S, \\ r(a, p) + r(\rho, b), & \text{if } a \in S,\ b \in T \setminus S. \end{cases}$

Then the sets T′ and T′′ are also subtrees of T equipped with the induced metric $\hat{r}$, and the Hausdorff distance between T′ and T′′ with respect to $\hat{r}$ is not greater than that with respect to r.

Proof. Suppose that the Hausdorff distance between T′ and T′′ under r is less than some given ε > 0. Given a ∈ T′, there then exists b ∈ T′′ such that r(a, b) < ε. Because r(a, a ∧ b) ≤ r(a, b) and a ∧ b ∈ T′′, we may suppose (by replacing b by a ∧ b if necessary) that b ≤ a. We claim that $\hat{r}(a, c) < \varepsilon$ for some c ∈ T′′. This and the analogous result with the roles of T′ and T′′ interchanged will establish the result.

If a, b ∈ S or a, b ∈ T \ S, then $\hat{r}(a, b) = r(a, b) < \varepsilon$. The only other possibility is that a ∈ S and b ∈ T \ S, in which case p ∈ [b, a] (for T equipped with r). Then $\hat{r}(a, \rho) = r(a, p) \le r(a, b) < \varepsilon$, as required (because ρ ∈ T′′).
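The statement of Lemma 5.1.3 can be illustrated numerically on a small example. The sketch below is a discrete analogue only: it works with the vertex set of a finite weighted rooted tree (not with the full R-tree), encodes the tree by hypothetical `parent` and `length` dictionaries, and takes the subtree above p to be the set of strict descendants of p, mirroring the half-open segment in (5.5).

```python
def tree_metric(parent, length, root):
    """Path metric on the vertices of a finite rooted tree.

    parent[v] is the neighbour of v towards the root, length[v] the length
    of the edge between v and parent[v]."""
    def path(v):
        out = [v]
        while v != root:
            v = parent[v]
            out.append(v)
        return out

    def height(v):
        return sum(length[u] for u in path(v) if u != root)

    def meet(a, b):
        ancestors = set(path(a))
        return next(u for u in path(b) if u in ancestors)

    def r(a, b):
        return height(a) + height(b) - 2 * height(meet(a, b))

    return r, path


def regrafted_metric(r, path, p, root):
    """The metric of (5.7): the subtree S above p is re-attached at the root."""
    def in_S(b):
        return b != p and p in path(b)       # p lies on [root, b[

    def r_hat(a, b):
        if in_S(a) == in_S(b):
            return r(a, b)
        if in_S(b):
            a, b = b, a                      # make a the point in S
        return r(a, p) + r(root, b)

    return r_hat


def hausdorff(A, B, d):
    """Hausdorff distance between finite sets A and B under the metric d."""
    return max(max(min(d(a, b) for b in B) for a in A),
               max(min(d(a, b) for a in A) for b in B))
```

For instance, with root 0, edges 0–1 (length 1), 1–2 (length 2), 1–3 (length 1), 0–4 (length 3) and p = 1, the Hausdorff distance between the rooted subtrees {0, 1, 2} and {0} drops from 3 to 2 after re-grafting, while that between {0, 1, 2} and {0, 1, 3} stays at 2.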

Now let $T_1 \subseteq T_2 \subseteq \cdots$ be an increasing sequence of finite subtrees of T such that $\bigcup_{n \in \mathbb{N}} T_n$ is dense in T. Thus $\lim_{n \to \infty} r_H(T_n, T) = 0$. Let $X^1, X^2, \dots$ be constructed from π0 and π starting with $T_1, T_2, \dots$. Applying Lemma 5.1.3 yields

(5.8) $\lim_{m,n \to \infty} \sup_{t \ge 0} d_{\mathrm{GH}^{\mathrm{root}}}(X^m_t, X^n_t) = 0$.

Hence by completeness of Troot, there exists a cadlag Troot-valued process X such that X0 = T and

(5.9) $\lim_{m \to \infty} \sup_{t \ge 0} d_{\mathrm{GH}^{\mathrm{root}}}(X^m_t, X_t) = 0$.

A priori, the process X could depend on the choice of the approximating sequence of trees $(T_n)_{n \in \mathbb{N}}$. To see that this is not so, consider two approximating sequences $T^1_1 \subseteq T^1_2 \subseteq \cdots$ and $T^2_1 \subseteq T^2_2 \subseteq \cdots$. For n ∈ N, write $T^3_n$ for the smallest rooted subtree of T that contains both $T^1_n$ and $T^2_n$. As a set, $T^3_n = T^1_n \cup T^2_n$. Now let $((X^{n,i}_t)_{t \ge 0})_{n \in \mathbb{N}}$ for i = 1, 2, 3 be the corresponding sequences of finite tree-valued processes and let $(X^{\infty,i}_t)_{t \ge 0}$ for i = 1, 2, 3 be the corresponding limit processes. By Lemma 5.1.3,

(5.10)
$$\begin{aligned}
d_{\mathrm{GH}^{\mathrm{root}}}(X^{n,1}_t, X^{n,2}_t)
&\le d_{\mathrm{GH}^{\mathrm{root}}}(X^{n,1}_t, X^{n,3}_t) + d_{\mathrm{GH}^{\mathrm{root}}}(X^{n,2}_t, X^{n,3}_t) \\
&\le d_H(X^{n,1}_t, X^{n,3}_t) + d_H(X^{n,2}_t, X^{n,3}_t) \\
&\le d_H(T^1_n, T^3_n) + d_H(T^2_n, T^3_n) \\
&\le d_H(T^1_n, T) + d_H(T^2_n, T) \underset{n \to \infty}{\longrightarrow} 0.
\end{aligned}$$

Thus, for each t ≥ 0 the sequences $(X^{n,1}_t)_{n \in \mathbb{N}}$ and $(X^{n,2}_t)_{n \in \mathbb{N}}$ do indeed have the same rooted Gromov-Hausdorff limit and the process X does not depend on the choice of approximating sequence for the initial tree T.

5.2. Introducing randomness

In Section 5.1 we constructed a Troot-valued function t ↦ Xt starting with a fixed triple (T, π0, π), where T ∈ Troot and π0, π satisfy the conditions of Assumption 5.1.1. We now want to think of X as a function of time and such triples.

Let Ω∗ be the set of triples (T, π0, π), where T is a rooted compact R-tree (that is, a class representative of an element of Troot) and π0, π satisfy Assumption 5.1.1.

The root invariant isometry equivalence relation on rooted compact R-trees extends naturally to an equivalence relation on Ω∗ by declaring that two triples $(T', \pi'_0, \pi')$ and $(T'', \pi''_0, \pi'')$, where $\pi'_0 = \{(\sigma'_i, x'_i) : i \in \mathbb{N}\}$ and $\pi''_0 = \{(\sigma''_i, x''_i) : i \in \mathbb{N}\}$, are equivalent if there is a root invariant isometry f mapping T′ to T′′ and a permutation γ of N such that $\sigma''_i = \sigma'_{\gamma(i)}$ and


$x''_i = f(x'_{\gamma(i)})$ for all i ∈ N. We write Ω for the resulting quotient space of equivalence classes.

In order to do probability, we require that Ω has a suitable measurable structure. We could do this by specifying a metric on Ω, but the following approach is a little less cumbersome and suffices for our needs.

Let Ωfin denote the subset of Ω consisting of triples (T, π0, π) such that T, π0 and π are finite. We are going to define a metric on Ωfin. Let $(T', \pi'_0, \pi')$ and $(T'', \pi''_0, \pi'')$ be two points in Ωfin, where $\pi'_0 = \{(\sigma'_1, x'_1), \dots, (\sigma'_p, x'_p)\}$, $\pi' = \{\tau'_1, \dots, \tau'_r\}$, $\pi''_0 = \{(\sigma''_1, x''_1), \dots, (\sigma''_q, x''_q)\}$, and $\pi'' = \{\tau''_1, \dots, \tau''_s\}$. Assume that $0 < \sigma'_1 < \cdots < \sigma'_p$, $0 < \tau'_1 < \cdots < \tau'_r$, $0 < \sigma''_1 < \cdots < \sigma''_q$, and $0 < \tau''_1 < \cdots < \tau''_s$. The distance between $(T', \pi'_0, \pi')$ and $(T'', \pi''_0, \pi'')$ will be 1 if either p ≠ q or r ≠ s. Otherwise, the distance is

(5.11) $1 \wedge \Big( \frac{1}{2} \inf_{R^{\mathrm{root,cuts}}} \mathrm{dis}(R^{\mathrm{root,cuts}}) + \max_i |\sigma'_i - \sigma''_i| + \max_j |\tau'_j - \tau''_j| \Big)$,

where the infimum is over all correspondences between T′ and T′′ that contain the pairs $(\rho_{T'}, \rho_{T''})$ and $(x'_i, x''_i)$ for 1 ≤ i ≤ p.

Equip Ωfin with the Borel σ-field corresponding to this metric. For t ≥ 0, let $\mathcal{F}^o_t$ be the σ-field on Ω generated by the family of maps from Ω into Ωfin given by $(T, \pi_0, \pi) \mapsto \big(R_\eta(T),\ \pi_0 \cap (]0, t] \times (R_\eta(T))^o),\ \pi \cap \{(s, x) : x \le s \le t\}\big)$ for η > 0. As usual, set $\mathcal{F}^+_t := \bigcap_{u > t} \mathcal{F}^o_u$ for t ≥ 0. Put $\mathcal{F}^o := \bigvee_{t \ge 0} \mathcal{F}^o_t$.

It is straightforward to establish the following result from Lemma 1.9.2 and the construction of X in Section 5.1, and we omit the proof.

Lemma 5.2.1. The map $(t, (T, \pi_0, \pi)) \mapsto X_t(T, \pi_0, \pi)$ from R+ × Ω into Troot is progressively measurable with respect to the filtration $(\mathcal{F}^o_t)_{t \ge 0}$. (Here, of course, we are equipping Troot with the Borel σ-field associated with the metric $d_{\mathrm{GH}^{\mathrm{root}}}$.)

Given T ∈ Troot, let $P^T$ be the probability measure on Ω defined by the following requirements.

• The measure $P^T$ assigns all of its mass to the set $\{(T', \pi'_0, \pi') \in \Omega : T' = T\}$.
• Under $P^T$, the random variable $(T', \pi'_0, \pi') \mapsto \pi'_0$ is a Poisson point process on the set $\mathbb{R}_{++} \times T^o$ with intensity λ ⊗ µ, where µ is the length measure on T.
• Under $P^T$, the random variable $(T', \pi'_0, \pi') \mapsto \pi'$ is a Poisson point process on the set $\{(t, x) \in \mathbb{R}_{++} \times \mathbb{R}_{++} : x \le t\}$ with intensity λ ⊗ λ restricted to this set.
• The random variables $(T', \pi'_0, \pi') \mapsto \pi'_0$ and $(T', \pi'_0, \pi') \mapsto \pi'$ are independent under $P^T$.

Of course, the random variable $(T', \pi'_0, \pi') \mapsto \pi'_0$ takes values in a space of equivalence classes of countable sets rather than a space of sets per se, so, more formally, this random variable has the law of the image of a Poisson


process on an arbitrary class representative under the appropriate quotient map.

For t ≥ 0, g a bounded Borel function on Troot, and T ∈ Troot, set

(5.12) $P_t g(T) := P^T[g(X_t)]$.

With a slight abuse of notation, let $R_\eta$ for η > 0 also denote the map from Ω into Ω that sends (T, π0, π) to $\big(R_\eta(T),\ \pi_0 \cap (\mathbb{R}_{++} \times (R_\eta(T))^o),\ \pi\big)$.

Our main construction result is the following.

Theorem 5.2.2. (i) If T ∈ Troot is finite, then $(X_t)_{t \ge 0}$ under $P^T$ is a Markov process that evolves via the root growth with re-grafting dynamics on finite trees.

(ii) For all η > 0 and T ∈ Troot, the law of $(X_t \circ R_\eta)_{t \ge 0}$ under $P^T$ coincides with the law of $(X_t)_{t \ge 0}$ under $P^{R_\eta(T)}$.

(iii) For all T ∈ Troot, the law of $(X_t)_{t \ge 0}$ under $P^{R_\eta(T)}$ converges as η ↓ 0 to that of $(X_t)_{t \ge 0}$ under $P^T$ (in the sense of convergence of laws on the space of cadlag Troot-valued paths equipped with the Skorohod topology).

(iv) For g ∈ bB(Troot), the map $(t, T) \mapsto P_t g(T)$ is B(R+) × B(Troot)-measurable.

(v) The process $(X_t, P^T)$ is strong Markov with respect to the filtration $(\mathcal{F}^+_t)_{t \ge 0}$ and has transition semigroup $(P_t)_{t \ge 0}$.

Proof. (i) This is clear from the definition of the root growth and re-grafting dynamics.

(ii) It is enough to check that the push-forward of the probability measure $P^T$ under the map $R_\eta : \Omega \to \Omega$ is the measure $P^{R_\eta(T)}$. This, however, follows from the observation that the restriction of length measure on a tree to a subtree is just length measure on the subtree.

(iii) This is immediate from part (ii), the limiting construction in Section 5.1, and part (iv) of Lemma 1.9.2. Indeed, we have that

(5.13) $\sup_{t \ge 0} d_{\mathrm{GH}^{\mathrm{root}}}(X_t, X_t \circ R_\eta) \le d_H(T, R_\eta(T)) \le \eta$.

(iv) By a monotone class argument, it is enough to consider the case where the test function g is continuous. It follows from part (iii) that $P_t g(R_\eta(T))$ converges pointwise to $P_t g(T)$ as η ↓ 0, and it is not difficult to show using Lemma 1.9.2 and part (i) that $(t, T) \mapsto P_t g(R_\eta(T))$ is B(R+) × B(Troot)-measurable. We omit the details, because we will establish an even stronger result in Proposition 5.5.1.

(v) By construction and part (ii) of Lemma 1.9.4, we have for t ≥ 0 and (T, π0, π) ∈ Ω that, as a set, $X^o_t(T, \pi_0, \pi)$ is the disjoint union $T^o \sqcup ]0, t]$.


Put

(5.14)
$$\begin{aligned}
\theta_t(T, \pi_0, \pi)
&:= \Big( X_t(T, \pi_0, \pi),\ \{(s, x) \in \mathbb{R}_{++} \times T^o : (t + s, x) \in \pi_0\},\ \{(s, x) \in \mathbb{R}_{++} \times \mathbb{R}_{++} : (t + s, t + x) \in \pi\} \Big) \\
&= \Big( X_t(T, \pi_0, \pi),\ \{(s, x) \in \mathbb{R}_{++} \times X^o_t(T, \pi_0, \pi) : (t + s, x) \in \pi_0\},\ \{(s, x) \in \mathbb{R}_{++} \times \mathbb{R}_{++} : (t + s, t + x) \in \pi\} \Big).
\end{aligned}$$

Thus $\theta_t$ maps Ω into Ω. Note that $X_s \circ \theta_t = X_{s+t}$ and that $\theta_s \circ \theta_t = \theta_{s+t}$, that is, the family $(\theta_t)_{t \ge 0}$ is a semigroup. It is not hard to show that $(t, (T, \pi_0, \pi)) \mapsto \theta_t(T, \pi_0, \pi)$ is jointly measurable, and we leave this to the reader.

Fix t ≥ 0 and (T, π0, π) ∈ Ω. Write µ′ for the measure on $T^o \sqcup ]0, t]$ that restricts to length measure on $T^o$ and to Lebesgue measure on ]0, t]. Write µ′′ for the length measure on $X^o_t(T, \pi_0, \pi)$. The strong Markov property will follow from a standard strong Markov property for Poisson processes if we can show that µ′ = µ′′. This equality is clear from the construction if T is finite: the tree $X_t(T, \pi_0, \pi)$ is produced from the tree T and the set ]0, t] by a finite number of dissections and rearrangements. The equality for general T follows from the construction and part (iii) of Lemma 1.9.4.

5.3. Connection to Aldous’s line-breaking construction

Recall the law of the Brownian CRT from Definition 3.2.1(i). The main result of this section is that the root growth dynamics started in the trivial tree (consisting of one point only) converges to Aldous’s Brownian CRT.

Proposition 5.3.1. If ρ ∈ Troot is the trivial tree then the law of $X_t$ under $P^\rho$ converges weakly to the law of the Brownian CRT as t → ∞.

Recall Aldous’s line-breaking process $R = (R_t)_{t \ge 0}$ from (3.23), (3.26) and (3.24) in Section 3.3. By Theorem 3.3.1, the law of $R_t$ converges weakly to that of the Brownian CRT as t → ∞. For the proof of Proposition 5.3.1, we will show the following stronger result.

Proposition 5.3.2. The random finite rooted tree $R_{\tau_n-}$ has the same distribution as $X_{\tau_n-}$ under $P^\rho$, for all n ∈ N.

Proof. We will rely on a coupling of both processes. For that purpose, we let $\pi \subset \{(t, x) \in \mathbb{R}_{++} \times \mathbb{R}_{++} : x \le t\}$ be a Poisson point process on $\{(t, x) \in \mathbb{R}_{++} \times \mathbb{R}_{++} : x \le t\}$ with intensity measure λ ⊗ λ, where again λ denotes Lebesgue measure.

Again using the points $0 < \tau_1 < \tau_2 < \dots$ of $\{t > 0 : \pi(\{t\} \times \mathbb{R}_{++}) > 0\}$ as cut times, the process $(T_t)_{t \ge 0}$ with root


growth and regrafting dynamics starting in the trivial tree $T_0 := \rho$ can be described as follows (here again the tree at time t will have total edge length t):

• Start with the 1-tree (with one end identified as the root and the other as a leaf), $T_0$, of length zero.
• Let this segment grow at unit speed on the time interval $[0, \tau_1[$, and for $t \in [0, \tau_1[$ let $T_t$ be the rooted 1-tree that has its points labeled by the interval [0, t] in such a way that the root is t and the leaf is 0.
• At time $\tau_1$ sample the first cut point uniformly along the tree $T_{\tau_1-}$, prune off the piece of $T_{\tau_1-}$ that is above the cut point (that is, prune off the interval of points that are further away from the root t than the first cut point).
• Re-graft the pruned segment such that its cut end and the root are glued together. Just as we thought of $T_0$ as a tree with two points (a leaf and a root) connected by an edge of length zero, we take $T_{\tau_1}$ to be the rooted 2-tree obtained by “ramifying” the root of $T_{\tau_1-}$ into two points (one of which we keep as the root) that are joined by an edge of length zero.
• Proceed inductively: Given the labeled and rooted n-tree, $T_{\tau_{n-1}}$, for $t \in [\tau_{n-1}, \tau_n[$, let $T_t$ be obtained by letting the edge containing the root grow at unit speed so that the points in $T_t$ correspond to the points in the interval [0, t] with t as the root. At time $\tau_n$, the nth cut point is sampled randomly along the edges of the n-tree, $T_{\tau_n-}$, and the subtree above the cut point (that is, the subtree of points further away from the root than the cut point) is pruned off and re-grafted so that its cut end and the root are glued together. The root is then “ramified” as above to give an edge of length zero leading from the root to the rest of the tree.

Recall from Section 3.3 the construction of $R = (R_t, r_t, \mu_t)_{t \ge 0}$ based on the successive arrival times of an inhomogeneous Poisson process with rate r(dt) = t dt, and build now $R = (R_t, r_t, \mu_t)_{t \ge 0}$ based on the points $\{(\tau_i, x_{\tau_i});\ i \in \mathbb{N}\}$, where for $\tau \in \{t > 0 : \pi(\{t\} \times \mathbb{R}_{++}) > 0\}$, $x_\tau$ denotes the unique element in [0, τ] such that $\pi(\{\tau\} \times \{x_\tau\}) > 0$. The link between these two dynamics for growing trees is illustrated in Figure 5.4.

Let $R_n$ denote the object obtained by taking the rooted finite tree with edge-lengths $R_{\tau_n-}$ and labeling the leaves with 1, ..., n, in the order as they are added in Aldous’s construction. Let $T_n$ be derived similarly from the rooted finite tree with edge-lengths $T_{\tau_n-}$, by labeling the leaves with 1, ..., n in the order that they appear in the root growth with re-grafting construction. It will suffice to show that $R_n$ and $T_n$ have the same distribution. Note that both $R_n$ and $T_n$ are rooted, bifurcating trees with n labeled leaves and edge-lengths. Such a tree $S_n$ is uniquely specified by its shape, denoted by shape($S_n$), which is a rooted, bifurcating, leaf-labeled combinatorial tree,


Figure 5.4. Illustrates how the coupled tree-valued processes $(R_t;\ t \ge 0)$ and $(T_t;\ t \ge 0)$ evolve, shown at the times 0, $\tau_1-$, $\tau_1$, $\tau_2-$ and $\tau_2$ (figure omitted). (The bold dots represent an edge of length zero, while the small dots indicate the position of the cut point that is going to show up at the next moment.)

and by the list of its (2n − 1) edge-lengths in a canonical order determined by its shape, say

(5.15) $\mathrm{lengths}(S_n) := \big( \mathrm{length}(S_n, 1), \dots, \mathrm{length}(S_n, 2n - 1) \big)$,

where the edge-lengths are listed in order of traversal of edges by first working along the path from the root to leaf 1, then along the path joining that path to leaf 2, and so on.

Then, by construction, the common collection of edge-lengths of $R_n$ and of $T_n$ is the collection of lengths of the 2n − 1 subintervals of $]0, \tau_n]$ obtained by cutting this interval at the 2n − 2 points

(5.16) $\{ X^{(n)}_i,\ 1 \le i \le 2n - 2 \} := \bigcup_{i=1}^{n-1} \{ x_{\tau_i}, \tau_i \}$,

where the $X^{(n)}_i$ are indexed to increase in i for each fixed n. Let $X^{(n)}_0 := 0$ and $X^{(n)}_{2n-1} := \tau_n$. Then

(5.17) $\mathrm{length}(R_n, i) = X^{(n)}_i - X^{(n)}_{i-1}$,

and

(5.18) $\mathrm{length}(T_n, i) = \mathrm{length}(R_n, \sigma_{n,i})$,

for all 1 ≤ i ≤ 2n − 1 and for some almost surely unique random indices $\sigma_{n,i} \in \{1, \dots, 2n - 1\}$ such that $i \mapsto \sigma_{n,i}$ is almost surely a permutation of {1, ..., 2n − 1}. According to [Ald93, Lemma 21], the distribution of $R_n$ may be equivalently characterized as follows:


(a) the sequence lengths($R_n$) is exchangeable, with the same distribution as the sequence of lengths of subintervals obtained by cutting $]0, \tau_n]$ at 2n − 2 uniformly chosen points $\{x_{\tau_n} : 1 \le i \le 2n - 2\}$;

(b) shape($R_n$) is uniformly distributed on the set of all 1 × 3 × 5 × · · · × (2n − 3) possible shapes;

(c) lengths($R_n$) and shape($R_n$) are independent.
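Two of the quantities in this characterization are easy to compute explicitly. The sketch below, with hypothetical function names, evaluates the number 1 × 3 × · · · × (2n − 3) of shapes from (b) and the 2n − 1 edge-lengths of (5.16) and (5.17), given the cut times $\tau_1 < \dots < \tau_n$ and the cut points $x_{\tau_1}, \dots, x_{\tau_{n-1}}$ as plain lists.

```python
from math import factorial, prod


def num_shapes(n):
    """1 x 3 x ... x (2n - 3): the number of shapes with n labeled leaves."""
    return prod(range(1, 2 * n - 2, 2))


def edge_lengths(taus, xs):
    """Lengths of the 2n - 1 subintervals of ]0, tau_n] cut at the 2n - 2
    points {x_{tau_i}, tau_i : 1 <= i <= n - 1}, as in (5.16) and (5.17).

    taus: [tau_1, ..., tau_n] in increasing order.
    xs:   [x_{tau_1}, ..., x_{tau_{n-1}}].
    """
    *earlier, tau_n = taus
    cuts = sorted(earlier + list(xs))
    grid = [0.0] + cuts + [tau_n]
    return [b - a for a, b in zip(grid, grid[1:])]
```

The product of odd numbers agrees with the closed form (2n − 2)! / (2^{n−1} (n − 1)!), which gives a quick consistency check for small n.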

In view of this characterization and (5.18), to show that $T_n$ has the same distribution as $R_n$ it is enough to show that

(i) the random permutation $\{i \mapsto \sigma_{n,i} : 1 \le i \le 2n-1\}$ is a function of $\mathrm{shape}(T_n)$;

(ii) $\mathrm{shape}(T_n) = \Psi_n(\mathrm{shape}(R_n))$ for some bijective map $\Psi_n$ from the set of all possible shapes to itself.

This is trivial for $n = 1$, so we assume below that $n \ge 2$. Before proving (i) and (ii), we recall that (b) above involves a natural bijection

(5.19) $(I_1, \ldots, I_{n-1}) \leftrightarrow \mathrm{shape}(R_n)$

where $I_{n-1} \in \{1, \ldots, 2n-3\}$ is the unique $i$ such that $x_{\tau_{n-1}} \in (X^{(n-1)}_{i-1}, X^{(n-1)}_i)$. Hence $I_{n-1}$ is the index in the canonical ordering of edges of $R_{n-1}$ of the edge that is cut in the transformation from $R_{n-1}$ to $R_n$ by attachment of an additional edge, of length $\tau_n - \tau_{n-1}$, connecting the cut-point to leaf $n$. Thus (b) and (c) above correspond via (5.19) to the facts that $I_1, \ldots, I_{n-1}$ are independent and uniformly distributed over their ranges, and independent of $\mathrm{lengths}(R_n)$. These facts can be checked directly from the construction of $(R_n)_{n \in \mathbb{N}}$ from $(\tau_n)_{n \in \mathbb{N}}$ and $(x_{\tau_n})_{n \in \mathbb{N}}$ using standard facts about uniform order statistics.
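The counting behind the bijection (5.19) can be checked by brute force for small $n$. The following is only an illustrative sketch: it enumerates the index sequences $(I_1, \ldots, I_{n-1})$ with $I_m \in \{1, \ldots, 2m-1\}$ and compares their number with $1 \times 3 \times \cdots \times (2n-3)$. The function names are hypothetical and do not come from the text, and the sketch does not construct the tree shapes themselves.

```python
from itertools import product


def index_sequences(n):
    """All tuples (I_1, ..., I_{n-1}) with I_m in {1, ..., 2m - 1}."""
    ranges = [range(1, 2 * m) for m in range(1, n)]
    return list(product(*ranges))


def odd_double_factorial(n):
    """1 x 3 x 5 x ... x (2n - 3), with the empty product equal to 1."""
    result = 1
    for k in range(1, 2 * n - 2, 2):
        result *= k
    return result


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, len(index_sequences(n)), odd_double_factorial(n))
```

Since each $I_m$ ranges over $2m-1$ values, choosing the indices independently and uniformly gives every sequence, and hence every shape, the same probability.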

Now (i) and (ii) follow from (5.19) and another bijection

(5.20) (I1, . . . , In−1) ↔ shape(Tn)

where each possible value $i$ of $I_m$ is identified with edge $\sigma_{m,i}$ in the canonical ordering of edges of $T_m$. This is the edge of $T_m$ whose length equals $\mathrm{length}(R_m, i)$. The bijection (5.20), and the fact that $\sigma_{n,i}$ depends only on $\mathrm{shape}(T_n)$, will now be established by induction on $n \ge 2$. For $n = 2$ the claim is obvious. Suppose for some $n \ge 3$ that the correspondence between $(I_1, \ldots, I_{n-2})$ and $\mathrm{shape}(T_{n-1})$ has been established, and that the length of edge $\sigma_{n-1,i}$ in the canonical ordering of edges of $T_{n-1}$ equals the length of the $i$th edge in the canonical ordering of edges of $R_{n-1}$, for some $\sigma_{n-1,i}$ which is a function of $i$ and $\mathrm{shape}(T_{n-1})$. According to the construction of $T_n$, if $I_{n-1} = i$ then $T_n$ is derived from $T_{n-1}$ by splitting $T_{n-1}$ into two branches at some point along edge $\sigma_{n-1,i}$ in the canonical ordering of the edges of $T_{n-1}$, and forming a new tree from the two branches and an extra segment of length $\tau_n - \tau_{n-1}$. Clearly, $\mathrm{shape}(T_n)$ is determined by $\mathrm{shape}(T_{n-1})$ and $I_{n-1}$, and in the canonical ordering of the edge-lengths of $T_n$ the length of the $i$th edge equals the length of the edge $\sigma_{n,i}$ of $R_n$, for some $\sigma_{n,i}$ which is a function of $\mathrm{shape}(T_{n-1})$ and $I_{n-1}$, and hence a function of $\mathrm{shape}(T_n)$. To complete the proof, it is enough by the inductive hypothesis to show that the map

$(\mathrm{shape}(T_{n-1}), I_{n-1}) \to \mathrm{shape}(T_n)$

just described is invertible. But $\mathrm{shape}(T_{n-1})$ and $I_{n-1}$ can be recovered from $\mathrm{shape}(T_n)$ by the following sequence of moves:

• delete the edge attached to the root of $\mathrm{shape}(T_n)$;

• split the remaining tree into its two branches leading away from the internal node to which the deleted edge was attached;

• re-attach the bottom end of the branch not containing leaf $n$ to leaf $n$ on the other branch, joining the two incident edges to form a single edge;

• the resulting shape is $\mathrm{shape}(T_{n-1})$, and $I_{n-1}$ is the index such that the joined edge in $\mathrm{shape}(T_{n-1})$ is the edge $\sigma_{n-1, I_{n-1}}$ in the canonical ordering of edges on $\mathrm{shape}(T_{n-1})$.

Proof of Proposition 5.3.1. We saw in Proposition 5.3.2 that $T_{\tau_n-}$ has the same distribution as $R_{\tau_n-}$. Moreover, we recalled in Section 3.3 that $R_t$ converges in distribution to the Brownian CRT as $t \to \infty$. Clearly, the rooted Gromov–Hausdorff distance between $T_t$ and $T_{\tau_{n+1}-}$ is at most $\tau_{n+1} - \tau_n$ for $\tau_n \le t < \tau_{n+1}$. It remains to observe that $\tau_{n+1} - \tau_n \to 0$ in probability as $n \to \infty$.

5.4. Recurrence, stationarity and ergodicity

In this section we show that the Brownian CRT is the unique equilibrium under the root growth with regrafting dynamics, and that these dynamics are recurrent.

We first state convergence to the Brownian CRT.

Proposition 5.4.1 (Convergence to the Brownian CRT). For any $T \in \mathbf{T}^{\mathrm{root}}$, the law of $X_t$ under $\mathbf{P}^T$ converges weakly to that of the Brownian CRT as $t \to \infty$.

We prepare the proof with the following lemma.

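Lemma 5.4.2. For any $(T, r, \rho) \in \mathbf{T}^{\mathrm{root}}$ we can build on the same probability space two $\mathbf{T}^{\mathrm{root}}$-valued processes $X'$ and $X''$ such that:

• $X'$ has the law of $X$ under $\mathbf{P}^{\{\rho\}}$, where $\{\rho\}$ is the trivial tree consisting of just the root $\rho$,

• $X''$ has the law of $X$ under $\mathbf{P}^T$,

• for all $t \ge 0$,

(5.21) $d_{\mathrm{GH}^{\mathrm{root}}}(X'_t, X''_t) \le d_{\mathrm{GH}^{\mathrm{root}}}(\{\rho\}, T) = \sup\{r(\rho, x) : x \in T\},$

• (5.22) $\lim_{t \to \infty} d_{\mathrm{GH}^{\mathrm{root}}}(X'_t, X''_t) = 0$, almost surely.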


Proof. The proof follows almost immediately from the construction of $X$ in Section 5.1 and Lemma 5.1.3. The only point requiring some comment is (5.22). For that it will be enough to show for any $\varepsilon > 0$ that for $\mathbf{P}^T$-a.e. $(T, \pi_0, \pi) \in \Omega$ there exists $t > 0$ such that the projection of $\pi_0 \cap (]0, t] \times T^o)$ onto $T$ is an $\varepsilon$-net for $T$.

Note that the projection of $\pi_0 \cap (]0, t] \times T^o)$ onto $T$ is a Poisson process under $\mathbf{P}^T$ with intensity $t\mu$, where $\mu$ is the length measure on $T$. Moreover, $T$ can be covered by a finite collection of $\varepsilon$-balls, each with positive $\mu$-measure. Therefore, the $\mathbf{P}^T$-probability of the set of $(T, \pi_0, \pi) \in \Omega$ such that the projection of $\pi_0 \cap (]0, t] \times T^o)$ onto $T$ is an $\varepsilon$-net for $T$ increases as $t \to \infty$ to 1.
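The last step can be made quantitative with a union bound. A Poisson process with intensity $t\mu$ puts no point in a ball $B$ with probability $\exp(-t\mu(B))$, so the probability that every ball of a finite cover receives at least one point is at least $1 - \sum_i \exp(-t\mu(B_i))$. The sketch below evaluates this lower bound; the ball masses are made-up inputs and the function name is hypothetical.

```python
import math


def net_probability_lower_bound(t, ball_masses):
    """Union-bound lower bound on P(every ball of the cover holds a point by time t).

    ball_masses: the (positive) length measures mu(B_i) of the covering balls.
    """
    miss = sum(math.exp(-t * m) for m in ball_masses)
    return max(0.0, 1.0 - miss)


if __name__ == "__main__":
    masses = [0.5, 1.0, 0.25]  # illustrative values only
    for t in (1.0, 5.0, 20.0, 100.0):
        print(t, net_probability_lower_bound(t, masses))
```

The bound is non-decreasing in $t$ and tends to 1 because each $\mu(B_i)$ is strictly positive, which is exactly what the argument uses.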

Proof of Proposition 5.4.1. This is an immediate consequence of Lemma 5.4.2 together with Proposition 5.3.1.

The next result states recurrence of the root growth with regrafting dynamics.

Proposition 5.4.3 (Recurrence). Consider a non-empty open set $U \subseteq \mathbf{T}^{\mathrm{root}}$. For each $T \in \mathbf{T}^{\mathrm{root}}$,

(5.23) $\mathbf{P}^T\{\text{for all } s \ge 0, \text{ there exists } t > s \text{ such that } X_t \in U\} = 1.$

Proof. It is straightforward, but notationally rather tedious, to show that if $B' \subseteq \mathbf{T}^{\mathrm{root}}$ is any ball and $\{\rho\}$ is the trivial tree, then

(5.24) $\mathbf{P}^{\{\rho\}}\{X_t \in B'\} > 0$

for all $t$ sufficiently large. Thus, for any ball $B' \subseteq \mathbf{T}^{\mathrm{root}}$ there is, by Lemma 5.4.2, a ball $B'' \subseteq \mathbf{T}^{\mathrm{root}}$ containing the trivial tree such that

(5.25) $\inf_{T \in B''} \mathbf{P}^T\{X_t \in B'\} > 0$

for each $t$ sufficiently large.

By a standard application of the Markov property, it therefore suffices to show for each $T \in \mathbf{T}^{\mathrm{root}}$ and each ball $B''$ around the trivial tree that

(5.26) $\mathbf{P}^T\{\text{there exists } t > 0 \text{ such that } X_t \in B''\} = 1.$

By another standard application of the Markov property, equation (5.26) will follow if we can show that there is a constant $p > 0$ depending on $B''$ such that for any $T \in \mathbf{T}^{\mathrm{root}}$

(5.27) $\liminf_{t \to \infty} \mathbf{P}^T\{X_t \in B''\} > p.$

This, however, follows from Proposition 5.4.1 and the observation that for any $\varepsilon > 0$ the law of the Brownian CRT assigns positive mass to the set of trees with height less than $\varepsilon$ (which is just the observation that the law of the Brownian excursion assigns positive mass to the set of excursion paths with maximum less than $\varepsilon/2$).


Proposition 5.4.4. The law of the Brownian CRT is the unique stationary distribution for $X$. That is, if $\xi$ is the law of the CRT, then $\int \xi(dT)\, P_t f(T) = \int \xi(dT)\, f(T)$ for all $t \ge 0$ and $f \in b\mathcal{B}(\mathbf{T}^{\mathrm{root}})$, and $\xi$ is the unique probability measure on $\mathbf{T}^{\mathrm{root}}$ with this property.

Proof. This is a standard argument given Proposition 5.4.1 and the Feller property for the semigroup $(P_t)_{t \ge 0}$ established in Proposition 5.5.1, but we include the details for completeness.

Consider a test function $f : \mathbf{T}^{\mathrm{root}} \to \mathbb{R}$ that is continuous and bounded. By Proposition 5.5.1 below, the function $P_t f$ is also continuous and bounded for each $t \ge 0$. Therefore, by Proposition 5.4.1,

(5.28) $\int \xi(dT)\, f(T) = \lim_{s \to \infty} \int \xi(dT)\, P_s f(T) = \lim_{s \to \infty} \int \xi(dT)\, P_{s+t} f(T) = \lim_{s \to \infty} \int \xi(dT)\, P_s(P_t f)(T) = \int \xi(dT)\, P_t f(T)$

for each $t \ge 0$, and hence $\xi$ is stationary. Moreover, if $\zeta$ is a stationary measure, then

(5.29) $\int \zeta(dT)\, f(T) = \int \zeta(dT)\, P_t f(T) \to \int \zeta(dT) \left( \int \xi(dT)\, f(T) \right) = \int \xi(dT)\, f(T),$

and $\zeta = \xi$, as claimed.

5.5. Feller property

In this section we show that the law of $X_t$ under $\mathbf{P}^T$ is weakly continuous in the initial value $T$ for each $t \ge 0$. This property is sometimes referred to as the Feller property of the semigroup $(P_t)_{t \ge 0}$, although this terminology is often restricted to the case of a locally compact state space and transition operators that map the space of continuous functions that vanish at infinity into itself. A standard consequence of this result is that the law of the process $(X_t)_{t \ge 0}$ is weakly continuous in the initial value (when the space of càdlàg $\mathbf{T}^{\mathrm{root}}$-valued paths is equipped with the Skorohod topology).

Proposition 5.5.1. If the function $f : \mathbf{T}^{\mathrm{root}} \to \mathbb{R}$ is continuous and bounded, then the function $P_t f$ is also continuous and bounded for each $t \ge 0$.

We will prove the proposition by a coupling argument that, inter alia, builds processes with the law of $X$ under $\mathbf{P}^T$ for two different finite values of $T$ on the same probability space. The key to constructing such a coupling is the following pair of lemmas.


We require the following notion. A rooted combinatorial tree is just a connected, acyclic graph with one vertex designated as the root. Equivalently, we can think of a rooted combinatorial tree as a finite rooted tree in which all edges have length one. Thus any finite rooted tree is associated with a unique rooted combinatorial tree by changing all the edge lengths to one, and any two finite rooted trees with the same topology are associated with the same rooted combinatorial tree. If $U$ and $V$ are two rooted combinatorial trees with leaves labeled by $(x_1, \ldots, x_n)$ and $(y_1, \ldots, y_n)$, then we say that $U$ and $V$ are isomorphic if there exists a graph isomorphism between $U$ and $V$ that maps the root of $U$ to the root of $V$ and $x_i$ to $y_i$ for $1 \le i \le n$.

Lemma 5.5.2. Let $(T, \rho)$ be a finite rooted tree with leaves $\{x_1, \ldots, x_n\} = T \setminus T^o$ (recall the definition of the skeleton $T^o$ from (1.19)). Write $\eta$ for the minimum of the (strictly positive) edge lengths in $T$. Suppose that $(T', \rho')$ is another finite rooted tree with $d_{\mathrm{GH}^{\mathrm{root}}}((T', \rho'), (T, \rho)) < \delta < \frac{\eta}{16}$. Then there exists a subtree $(T'', \rho') \preceq^{\mathrm{root}} (T', \rho')$ and a map $\bar f : T \to T''$ such that:

(i) $\bar f(\rho) = \rho'$,

(ii) $T''$ is spanned by $\{\bar f(x_1), \ldots, \bar f(x_n), \rho'\}$,

(iii) $d_{\mathrm{H}}(T', T'') < 3\delta$,

(iv) $\mathrm{dis}(\bar f) < 8\delta$,

(v) $T''$ has leaves $\{\bar f(x_1), \ldots, \bar f(x_n)\}$,

(vi) by possibly deleting some internal edges from the rooted combinatorial tree associated to $T''$ with leaves labeled by $(\bar f(x_1), \ldots, \bar f(x_n))$, one can obtain a leaf-labeled rooted combinatorial tree that is isomorphic to the rooted combinatorial tree associated to $T$ with leaves labeled by $(x_1, \ldots, x_n)$.

Proof. We have from (1.26) that there is a correspondence $\mathcal{R}^{\mathrm{root}}$ containing $(\rho, \rho')$ between $T$ and $T'$ such that $\mathrm{dis}(\mathcal{R}^{\mathrm{root}}) < 2\delta$. For $x \in T \setminus \{\rho\}$, choose $f(x) \in T'$ such that $(x, f(x)) \in \mathcal{R}^{\mathrm{root}}$, and put $f(\rho) := \rho'$. Set $T''$ to be the subtree of $T'$ spanned by $\{f(x_1), \ldots, f(x_n), \rho'\}$. For $x \in T$ define $\bar f(x) \in T''$ to be the point in $T''$ that has minimum distance to $f(x)$. In particular, $\bar f(\rho) = f(\rho) = \rho'$ and $\bar f(x_i) = f(x_i)$ for all $i$, so that (i) and (ii) hold.

For $x' \in T' \setminus \{\rho'\}$ choose $g(x')$ such that $(g(x'), x') \in \mathcal{R}^{\mathrm{root}}$ and put $g(\rho') := \rho$. Then $f(T)$ and $g(T')$ are $2\delta$-nets for $T'$ and $T$, respectively, and $\mathrm{dis}(f) \vee \mathrm{dis}(g) < 2\delta$. For each $i \in \{1, \ldots, n\}$ and $y' \in T'$ we have $(\rho, \rho'), (x_i, f(x_i)), (g(y'), y') \in \mathcal{R}^{\mathrm{root}}$. Hence $r_{T'}(\rho', y') < r_T(\rho, g(y')) + 2\delta$, and $r_{T'}(y', f(x_i)) < r_T(g(y'), x_i) + 2\delta$. Now fix $y' \in T'$, and choose $i \in \{1, \ldots, n\}$ such that $g(y') \in [\rho, x_i]$. Then

(5.30) $r_{T'}(\rho', f(x_i)) + 2 d_{\mathrm{H}}(y', [\rho', f(x_i)]) = r_{T'}(\rho', y') + r_{T'}(y', f(x_i)) < r_T(\rho, x_i) + 4\delta < r_{T'}(\rho', f(x_i)) + 2\delta + 4\delta,$

and hence $d_{\mathrm{H}}(y', T'') < 3\delta$. Thus (iii) holds.

For $x, y \in T$,

(5.31) $|r_T(x, y) - r_{T''}(\bar f(x), \bar f(y))| \le |r_T(x, y) - r_{T'}(f(x), f(y))| + r_{T'}(f(x), \bar f(x)) + r_{T'}(f(y), \bar f(y)) \le \mathrm{dis}(f) + 2 d_{\mathrm{H}}(T', T'') < 8\delta,$

and (iv) holds.

In order to establish (v), it suffices to observe for $1 \le i \ne j \le n$ that, by part (iv),

(5.32) $r_{T''}(\bar f(x_i), \bar f(x_j)) + r_{T''}(\bar f(x_j), \rho') - r_{T''}(\bar f(x_i), \rho') \ge r_T(x_i, x_j) + r_T(x_j, \rho) - r_T(x_i, \rho) - 3\,\mathrm{dis}(\bar f) > 2\eta - 24\delta > 0.$

Similarly, part (vi) follows from part (iv) and the observations in Section 1.6 about re-constructing tree shapes from distances between the points in subsets of size four drawn from the leaves and the root of $T''$ once we observe the inequality $\frac{1}{2} \cdot 4\,\mathrm{dis}(\bar f) < 16\delta < \eta$.

Lemma 5.5.3. Let $(T, \rho)$ be a finite rooted tree and $\varepsilon > 0$. There exists $\delta > 0$ depending on $T$ and $\varepsilon$ such that if $(T', \rho')$ is a finite rooted tree with $d_{\mathrm{GH}^{\mathrm{root}}}((T', \rho'), (T, \rho)) < \delta$, then there exist subtrees $(S, \rho) \preceq^{\mathrm{root}} T$ and $(S', \rho') \preceq^{\mathrm{root}} T'$ for which:

(i) $d_{\mathrm{H}}(S, T) < \varepsilon$ and $d_{\mathrm{H}}(S', T') < \varepsilon$,

(ii) $S$ and $S'$ have the same total length,

(iii) there is a bijective measurable map $\psi : S \to S'$ that preserves length measure and has distortion at most $\varepsilon$,

(iv) the length measure of the set of points $a \in S$ such that $\{b' \in S' : \psi(a) \le b'\} \ne \psi(\{b \in S : a \le b\})$ (that is, the set of points $a$ such that the subtree above $\psi(a)$ is not the image under $\psi$ of the subtree above $a$) is less than $\varepsilon$.

Proof. As in Lemma 5.5.2, denote by $\eta$ the minimum of the (strictly positive) edge lengths of $T$. Let $(T', \rho')$ be a finite rooted tree with

(5.33) $d_{\mathrm{GH}^{\mathrm{root}}}((T', \rho'), (T, \rho)) < \delta < \frac{\eta}{16},$

where $\delta$ depending on $T$ and $\varepsilon$ will be chosen later. Set $(T'', \rho')$ and $\bar f$ to be a subtree of $T'$ and a function from $T$ to $T''$ whose existence is guaranteed by Lemma 5.5.2 for this choice of $\delta$. Let $\{x_1, \ldots, x_n\}$ denote the leaves of $T$ and write $x'_i := \bar f(x_i) = f(x_i)$ for $i = 1, \ldots, n$.

Define inductively subtrees $S_1, \ldots, S_n$ of $T$ (all with root $\rho$) and $S'_1, \ldots, S'_n$ of $T'' \subseteq T'$ (all with root $\rho'$) as follows. Set $S_1 := [\rho, y_1]$ and $S'_1 := [\rho', y'_1]$, where $y_1$ and $y'_1$ are the unique points on the arcs $[\rho, x_1]$ and $[\rho', x'_1]$, respectively, such that

(5.34) $r_T(\rho, y_1) = r_{T'}(\rho', y'_1) = r_T(\rho, x_1) \wedge r_{T'}(\rho', x'_1).$

Suppose that $S_1, \ldots, S_m$ and $S'_1, \ldots, S'_m$ have been defined. Let $z_{m+1}$ and $z'_{m+1}$ be the points on $S_m$ and $S'_m$ closest to $x_{m+1}$ and $x'_{m+1}$. Put $S_{m+1} := S_m \cup\, ]z_{m+1}, y_{m+1}]$ and $S'_{m+1} := S'_m \cup\, ]z'_{m+1}, y'_{m+1}]$, where $y_{m+1}$ and $y'_{m+1}$ are the unique points on the arcs $]z_{m+1}, x_{m+1}]$ and $]z'_{m+1}, x'_{m+1}]$, respectively, such that

(5.35) $r_T(z_{m+1}, y_{m+1}) = r_{T'}(z'_{m+1}, y'_{m+1}) = r_T(z_{m+1}, x_{m+1}) \wedge r_{T'}(z'_{m+1}, x'_{m+1}).$

Set $S := S_n$ and $S' := S'_n$.

Put $z_1 := \rho$, and $z'_1 := \rho'$. By construction, the arcs $]z_k, y_k]$, $1 \le k \le n$, are disjoint and their union is $S \setminus \{\rho\}$. Similarly, the arcs $]z'_k, y'_k]$ are disjoint and their union is $S' \setminus \{\rho'\}$. Moreover, the arcs $]z_k, y_k]$ and $]z'_k, y'_k]$ have the same length (in particular, $S$ and $S'$ have the same length and part (ii) holds). We may therefore define a measure-preserving bijection $\psi$ between $S$ and $S'$ by setting $\psi(\rho) = \rho'$ and letting the restriction of $\psi$ to each arc $]z_k, y_k]$ be the obvious length preserving bijection onto $]z'_k, y'_k]$. More precisely, if $a \in\, ]z_k, y_k]$, then $\psi(a)$ is the uniquely determined point on $]z'_k, y'_k]$ such that $r_{S'}(z'_k, \psi(a)) = r_S(z_k, a)$.

We next estimate the distortion of $\psi$ to establish part (iii). We first claim that for $a, b \in S$,

(5.36) $|r_S(a, b) - r_{S'}(\psi(a), \psi(b))| \le 5\gamma,$

where

(5.37) $\gamma := \max_{1 \le k, m \le n} |r_S(y_k, y_m) - r_{S'}(y'_k, y'_m)| \vee \max_{1 \le k \le n} |r_S(y_k, \rho) - r_{S'}(y'_k, \rho')|.$

To see (5.36), consider $a, b \in S \setminus \{\rho\}$ with $a \in\, ]z_k, y_k]$ and $b \in\, ]z_m, y_m]$ where $k \ne m$. (The case where $a = \rho$ or $b = \rho$ holds “by continuity” and is left to the reader.) Without loss of generality, assume that $k < m$, so that $y_k \wedge y_m \le z_m < b \le y_m$ in the partial order on $S$ and $y'_k \wedge y'_m \le z'_m < \psi(b) \le y'_m$ in the partial order on $S'$. Note that $y_k \wedge y_m$ and $z_k$ are comparable in the partial order, as are $y'_k \wedge y'_m$ and $z'_k$. Moreover, by part (vi) of Lemma 5.5.2, $y_k \wedge y_m \le z_k$ if and only if $y'_k \wedge y'_m \le z'_k$. We then have to consider four cases depending on the relative positions of $y_k \wedge y_m$, $a$ and $y'_k \wedge y'_m$, $\psi(a)$.

Case I: $y_k \wedge y_m < a \le y_k$ and $y'_k \wedge y'_m < \psi(a) \le y'_k$.
We have

(5.38) $r_S(y_k, y_m) = r_S(y_k, a) + r_S(a, b) + r_S(b, y_m)$

and

(5.39) $r_{S'}(y'_k, y'_m) = r_{S'}(y'_k, \psi(a)) + r_{S'}(\psi(a), \psi(b)) + r_{S'}(\psi(b), y'_m).$

By construction,

(5.40) $r_S(y_k, a) = r_{S'}(y'_k, \psi(a))$

and

(5.41) $r_S(b, y_m) = r_{S'}(\psi(b), y'_m).$

Hence

(5.42) $|r_S(a, b) - r_{S'}(\psi(a), \psi(b))| = |r_S(y_k, y_m) - r_{S'}(y'_k, y'_m)| \le \gamma.$

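Case II: $y_k \wedge y_m < a \le y_k$ and $\psi(a) \le y'_k \wedge y'_m < y'_k$.
Note that in this case $z_k \le y_k \wedge y_m$. We again have

(5.43) $r_S(y_k, y_m) = r_S(y_k, a) + r_S(a, b) + r_S(b, y_m),$

but now

(5.44) $r_{S'}(y'_k, y'_m) = r_{S'}(y'_k, \psi(a)) + r_{S'}(\psi(a), \psi(b)) + r_{S'}(\psi(b), y'_m) - 2 r_{S'}(\psi(a), y'_k \wedge y'_m).$

Let $y_\ell$ be such that $z_k = y_\ell \wedge y_k = y_\ell \wedge y_m$ and hence $z'_k = y'_\ell \wedge y'_k = y'_\ell \wedge y'_m$. Observe from Section 1.6 that

(5.45)
$r_{S'}(\psi(a), y'_k \wedge y'_m)$
$\quad = \frac{1}{2}\bigl(r_{S'}(y'_\ell, y'_m) + r_{S'}(y'_k, \rho') - r_{S'}(y'_k, y'_m) - r_{S'}(y'_\ell, \rho')\bigr) - r_{S'}(z'_k, \psi(a))$
$\quad \le \frac{1}{2}\bigl(r_S(y_\ell, y_m) + r_S(y_k, \rho) - r_S(y_k, y_m) - r_S(y_\ell, \rho)\bigr) + \frac{1}{2} \cdot 4\gamma - r_S(z_k, a)$
$\quad = r_S(z_k, y_k \wedge y_m) - r_S(z_k, a) + 2\gamma$
$\quad \le 2\gamma,$

and hence

(5.46) $|r_S(a, b) - r_{S'}(\psi(a), \psi(b))| \le 5\gamma.$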

Case III: $a \le y_k \wedge y_m < y_k$ and $y'_k \wedge y'_m \le \psi(a) < y'_k$.
Note that in this case, $z'_k \le y'_k \wedge y'_m$. This case is similar to Case II, but we record some of the details for use later in the proof of part (iv). Letting the index $\ell$ be as in Case II, we have

(5.47) $r_S(y_k, y_m) = r_S(y_k, a) + r_S(a, b) + r_S(b, y_m) - 2 r_S(a, y_k \wedge y_m)$

and

(5.48) $r_{S'}(y'_k, y'_m) = r_{S'}(y'_k, \psi(a)) + r_{S'}(\psi(a), \psi(b)) + r_{S'}(\psi(b), y'_m).$

We have

(5.49)
$r_S(a, y_k \wedge y_m)$
$\quad = \frac{1}{2}\bigl(r_S(y_\ell, y_m) + r_S(y_k, \rho) - r_S(y_k, y_m) - r_S(y_\ell, \rho)\bigr) - r_S(z_k, a)$
$\quad \le \frac{1}{2}\bigl(r_{S'}(y'_\ell, y'_m) + r_{S'}(y'_k, \rho') - r_{S'}(y'_k, y'_m) - r_{S'}(y'_\ell, \rho')\bigr) + \frac{1}{2} \cdot 4\gamma - r_{S'}(z'_k, \psi(a))$
$\quad = r_{S'}(z'_k, y'_k \wedge y'_m) - r_{S'}(z'_k, \psi(a)) + 2\gamma$
$\quad \le 2\gamma,$

and hence

(5.50) $|r_S(a, b) - r_{S'}(\psi(a), \psi(b))| \le 5\gamma.$

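Case IV: $a \le y_k \wedge y_m < y_k$ and $\psi(a) \le y'_k \wedge y'_m < y'_k$.
Letting the index $\ell$ be as in Case II, we have

(5.51) $r_S(z_k, y_m) = r_S(z_k, a) + r_S(a, b) + r_S(b, y_m)$

and

(5.52) $r_{S'}(z'_k, y'_m) = r_{S'}(z'_k, \psi(a)) + r_{S'}(\psi(a), \psi(b)) + r_{S'}(\psi(b), y'_m).$

Hence, from Section 1.6,

(5.53)
$r_S(a, b) - r_{S'}(\psi(a), \psi(b))$
$\quad = r_S(z_k, y_m) - r_{S'}(z'_k, y'_m)$
$\quad = r_S(y_\ell \wedge y_m, y_m) - r_{S'}(y'_\ell \wedge y'_m, y'_m)$
$\quad = \frac{1}{2}\bigl(r_S(y_m, \rho) + r_S(y_\ell, y_m) - r_S(y_\ell, \rho)\bigr) - \frac{1}{2}\bigl(r_{S'}(y'_m, \rho') + r_{S'}(y'_\ell, y'_m) - r_{S'}(y'_\ell, \rho')\bigr).$

Thus

(5.54) $|r_S(a, b) - r_{S'}(\psi(a), \psi(b))| \le \frac{3}{2}\gamma.$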

Combining Cases I–IV, we see that (5.36) holds. We thus require an estimate of $\gamma$ to complete the estimation of the distortion of $\psi$. Clearly,

(5.55) $|r_S(y_k, y_m) - r_{S'}(y'_k, y'_m)| \le |r_T(x_k, x_m) - r_{T''}(x'_k, x'_m)| + r_T(y_k, x_k) + r_T(y_m, x_m) + r_{T''}(y'_k, x'_k) + r_{T''}(y'_m, x'_m).$

By (5.34),

(5.56) $r_T(y_1, x_1) \vee r_{T''}(y'_1, x'_1) = |r_T(\rho, x_1) - r_{T''}(\rho', x'_1)| \le \mathrm{dis}(f) < 8\delta.$


For $2 \le k \le n$ there exists by construction an index $i \in \{1, 2, \ldots, k-1\}$ such that $z_k \in [z_i, y_i]$ and $z'_k \in [z'_i, y'_i]$. Applying the observations of Section 1.6,

(5.57)
$r_T(y_k, x_k) \vee r_{T''}(y'_k, x'_k)$
$\quad = |r_T(y_k, x_k) - r_{T''}(y'_k, x'_k)|$
$\quad = |r_T(z_k, x_k) - r_{T''}(z'_k, x'_k)|$
$\quad \le \frac{1}{2}\bigl(|r_T(x_i, x_k) - r_{T''}(x'_i, x'_k)| + |r_T(\rho, x_i) - r_{T''}(\rho', x'_i)| + |r_T(\rho, x_k) - r_{T''}(\rho', x'_k)|\bigr)$
$\quad \le \frac{3}{2}\,\mathrm{dis}(f)$
$\quad \le 12\delta.$

Thus, from (5.55),

(5.58) $|r_S(y_k, y_m) - r_{S'}(y'_k, y'_m)| < (8 + 4 \times 12)\delta = 56\delta.$

A similar argument shows that $|r_S(y_k, \rho) - r_{S'}(y'_k, \rho')| < (8 + 2 \times 12)\delta = 32\delta$, and hence $\gamma < 56\delta$. Substituting into (5.36) gives

(5.59) $\mathrm{dis}(\psi) \le 5\gamma < (5 \times 56)\delta = 280\delta.$

Moving to part (i), apply (5.57) to obtain

(5.60) $d_{\mathrm{H}}(S, T) \le \max_{1 \le k \le n} r_T(y_k, x_k) \le \gamma < 56\delta$

and, by similar arguments,

(5.61) $d_{\mathrm{H}}(S', T') \le d_{\mathrm{H}}(S', T'') + d_{\mathrm{H}}(T'', T') < 59\delta.$

Finally, we consider part (iv). Suppose that $a \in S$ is such that the subtree of $S'$ above $\psi(a)$ is not the image under $\psi$ of the subtree of $S$ above $a$. Let $k$ be the unique index such that $a \in\, ]z_k, y_k]$ (and hence $\psi(a) \in\, ]z'_k, y'_k]$). It follows from the construction of $\psi$ that there must exist an index $\ell$ such that either $z_k < a \le z_\ell$ and $z'_k < z'_\ell \le \psi(a)$ or $z_k < z_\ell \le a$ and $z'_k < \psi(a) \le z'_\ell$. These two situations have already been considered in Case III and Case II above (in that order): there we represented $z_\ell$ as $y_k \wedge y_m$ and $z'_\ell$ as $y'_k \wedge y'_m$. It follows from the inequality (5.49) that the mass of the set of points $a$ that satisfy the first alternative is at most $2\gamma n < 112\delta n$. Similarly, from the inequality (5.45) and the fact that $\psi$ is measure-preserving, the mass of the set of points $a$ that satisfy the second alternative is also at most $2\gamma n < 112\delta n$. Thus the total mass of the set of points of interest is at most $224\delta n$.

Before completing the proof of Proposition 5.5.1, we recall the definition of the Wasserstein metric. Suppose that $(E, r)$ is a complete, separable metric space. Write $B$ for the set of continuous functions $f : E \to \mathbb{R}$ such that $|f(x)| \le 1$ and $|f(x) - f(y)| \le r(x, y)$ for $x, y \in E$. The Wasserstein (sometimes transliterated as Vasershtein) distance between two Borel probability measures $\alpha$ and $\beta$ on $E$ is given by

(5.62) $d_{\mathrm{W}}(\alpha, \beta) := \sup_{f \in B} \left| \int f\, d\alpha - \int f\, d\beta \right|.$

The Wasserstein distance is a genuine metric on the space of Borel probability measures and convergence with respect to this distance implies weak convergence (see, for example, Theorem 3.3.1 and Problem 3.11.2 of [EK86]). If $V$ and $W$ are two $E$-valued random variables on the same probability space $(\Sigma, \mathcal{A}, \mathbb{P})$ with distributions $\alpha$ and $\beta$, respectively, then

(5.63) $d_{\mathrm{W}}(\alpha, \beta) \le \sup_{f \in B} |\mathbb{P}[f(V)] - \mathbb{P}[f(W)]| \le \sup_{f \in B} \mathbb{P}[|f(V) - f(W)|] \le \mathbb{P}[r(V, W)].$
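The coupling inequality (5.63) can be illustrated on a toy example. The sketch below takes $E = \mathbb{R}$ with $r(x, y) = |x - y|$, a small hand-picked joint law for $(V, W)$, and a few clipped shifts as test functions in $B$. All numbers and names are assumptions chosen for illustration, and checking a finite family of test functions does not compute the supremum itself.

```python
def expectation(pairs, g):
    """Expectation of g(v, w) for a finite joint law given as ((v, w), prob) pairs."""
    return sum(p * g(v, w) for (v, w), p in pairs)


def clipped(c):
    """A test function in B: bounded by 1 and 1-Lipschitz on the real line."""
    return lambda x: max(-1.0, min(1.0, x - c))


# A coupling of V and W: a joint law on pairs (v, w); the probabilities sum to 1.
joint = [((0.0, 0.1), 0.5), ((1.0, 0.7), 0.3), ((2.0, 2.0), 0.2)]

# Right-hand side of (5.63): the expected distance under the coupling.
coupling_cost = expectation(joint, lambda v, w: abs(v - w))

# Left-hand side for a few test functions f: |P[f(V)] - P[f(W)]|.
gaps = [
    abs(expectation(joint, lambda v, w, f=clipped(c): f(v) - f(w)))
    for c in (-1.0, 0.0, 0.5, 1.0, 2.0)
]

if __name__ == "__main__":
    print(coupling_cost, max(gaps))
```

Each gap is bounded by the coupling cost because $|f(v) - f(w)| \le |v - w|$ holds pointwise for every $f \in B$. This is why a coupling that keeps the two processes close controls the Wasserstein distance between their laws.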

Proof of Proposition 5.5.1. For $(T, \rho) \in \mathbf{T}^{\mathrm{root}}$ and $t \ge 0$, let

(5.64) $P_t((T, \rho), \cdot) := \mathbf{P}^{(T, \rho)}\{X_t \in \cdot\}.$

We need to show that $(T, \rho) \mapsto P_t((T, \rho), \cdot)$ is weakly continuous for each $t \ge 0$. This is equivalent to showing for each $(T, \rho) \in \mathbf{T}^{\mathrm{root}}$ and $t \ge 0$ that

(5.65) $\lim_{(T', \rho') \to (T, \rho)} d_{\mathrm{W}}\bigl(P_t((T, \rho), \cdot), P_t((T', \rho'), \cdot)\bigr) = 0.$

From the coupling argument in the proof of part (iii) of Theorem 5.2.2 (in particular, the inequality (5.13)), we have that

(5.66)
$d_{\mathrm{W}}\bigl(P_t((T, \rho), \cdot), P_t((T', \rho'), \cdot)\bigr)$
$\quad \le d_{\mathrm{W}}\bigl(P_t((T, \rho), \cdot), P_t((R_\eta(T), \rho), \cdot)\bigr) + d_{\mathrm{W}}\bigl(P_t((R_\eta(T), \rho), \cdot), P_t((R_\eta(T'), \rho'), \cdot)\bigr) + d_{\mathrm{W}}\bigl(P_t((R_\eta(T'), \rho'), \cdot), P_t((T', \rho'), \cdot)\bigr)$
$\quad \le d_{\mathrm{W}}\bigl(P_t((R_\eta(T), \rho), \cdot), P_t((R_\eta(T'), \rho'), \cdot)\bigr) + 2\eta.$

By part (ii) of Lemma 1.9.2, $R_\eta(T')$ converges to $R_\eta(T)$ as $(T', \rho')$ converges to $(T, \rho)$. It therefore suffices to establish (5.65) when $(T, \rho)$ and $(T', \rho')$ are finite trees, and we will suppose this for the rest of the proof.

Fix $(T, \rho)$ and $\varepsilon > 0$. Suppose that $\delta > 0$ depending on $(T, \rho)$ and $\varepsilon$ is sufficiently small that the conclusions of Lemma 5.5.3 hold for any $(T', \rho')$ within distance $\delta$ of $(T, \rho)$. Let $(S, \rho)$ and $(S', \rho')$ be the subtrees guaranteed by Lemma 5.5.3. From the coupling argument in the proof of part (iii) of Theorem 5.2.2 we have

(5.67) $d_{\mathrm{W}}\bigl(P_t((T, \rho), \cdot), P_t((S, \rho), \cdot)\bigr) < \varepsilon$

and

(5.68) $d_{\mathrm{W}}\bigl(P_t((T', \rho'), \cdot), P_t((S', \rho'), \cdot)\bigr) < \varepsilon.$

It therefore suffices to give a bound on $d_{\mathrm{W}}\bigl(P_t((S, \rho), \cdot), P_t((S', \rho'), \cdot)\bigr)$ that only depends on $\varepsilon$ and converges to zero as $\varepsilon$ converges to 0.


Construct on some probability space $(\Sigma, \mathcal{A}, \mathbb{P})$ a Poisson point process $\Pi_0$ on the set $\mathbb{R}_{++} \times S^o$ with intensity $\lambda \otimes \mu$, where $\mu$ is the length measure on $S$. Construct on the same space another independent Poisson point process $\Pi$ on the set $\{(t, x) \in \mathbb{R}_{++} \times \mathbb{R}_{++} : x \le t\}$ with intensity $\lambda \otimes \lambda$ restricted to this set. If we set $\Pi'_0 := \{(t, \psi(x)) : (t, x) \in \Pi_0\} \subset \mathbb{R}_{++} \times (S')^o$, then $\Pi'_0$ is a Poisson process on the set $\mathbb{R}_{++} \times (S')^o$ with intensity $\lambda \otimes \mu'$, where $\mu'$ is the length measure on $S'$ (because $\psi$ preserves length measure). Now apply the construction of Section 5.1 to realizations of $\Pi_0$ and $\Pi$ (respectively, $\Pi'_0$ and $\Pi$) to get two $\mathbf{T}^{\mathrm{root}}$-valued processes that we will denote by $(Y_t)_{t \ge 0}$ and $(Y'_t)_{t \ge 0}$. We see from the proof of Theorem 5.2.2 that $Y$ (respectively, $Y'$) has the same law as $X$ under $\mathbf{P}^{(S, \rho)}$ (respectively, $\mathbf{P}^{(S', \rho')}$).

Define a map $\psi_t$ from $Y_t = S \sqcup\, ]0, t]$ to $Y'_t = S' \sqcup\, ]0, t]$ by setting the restriction of $\psi_t$ to $S$ be $\psi$ and the restriction of $\psi_t$ to $]0, t]$ be the identity map. Let $r_t$ and $r'_t$ be the metrics on $Y_t$ and $Y'_t$, respectively. We will bound the rooted Gromov-Hausdorff distance between $Y_t$ and $Y'_t$ by bounding the distortion of $\psi_t$.

The cut-times for $Y$ and $Y'$ coincide. If $\xi$ is a cut-point of $Y$ at some cut-time $\tau$, then the corresponding cut-point for $Y'$ will be $\psi(\xi)$.

It is clear that the distortion of $\psi_t$ is constant between cut-times. Write $B_t$ for the set of points $b \in Y_t$ such that the subtree of $Y'_t$ above $\psi_t(b)$ is not the image under $\psi_t$ of the subtree of $Y_t$ above $b$. The set $B_t$ is unchanged between cut-times.

Consider a cut-time $\tau$ such that the corresponding cut-point $\xi$ is in $Y_{\tau-} \setminus B_{\tau-}$. If $x$ and $y$ are in the subtree above $\xi$ in $Y_{\tau-}$, then they are moved together by the re-grafting operation and their distance apart is unchanged in $Y_\tau$. Also, $\psi_{\tau-}(x)$ and $\psi_{\tau-}(y)$ are in the subtree above $\psi_{\tau-}(\xi)$ in $Y'_{\tau-}$ and these two points are also moved together. More precisely,

(5.69) $r_\tau(x, y) = r_{\tau-}(x, y)$

and

(5.70) $r'_\tau(\psi_\tau(x), \psi_\tau(y)) = r'_{\tau-}(\psi_{\tau-}(x), \psi_{\tau-}(y)).$

The same conclusion holds if neither $x$ nor $y$ is in the subtree above $\xi$ in $Y_{\tau-}$. If $x$ is in the subtree above $\xi$ in $Y_{\tau-}$ and $y$ is not, then

(5.71) $r_\tau(x, y) = r_{\tau-}(x, \xi) + r_{\tau-}(\tau, y)$

and

(5.72) $r'_\tau(\psi_\tau(x), \psi_\tau(y)) = r'_{\tau-}(\psi_{\tau-}(x), \psi_{\tau-}(\xi)) + r'_{\tau-}(\tau, \psi_{\tau-}(y))$

(where we recall that $\tau$ is the root in each of the trees $Y_{\tau-}, Y_\tau, Y'_{\tau-}, Y'_\tau$). Combining these cases, we see that

(5.73) $\mathrm{dis}(\psi_\tau) \le 2\,\mathrm{dis}(\psi_{\tau-}).$

Moreover, if $\xi \in Y_{\tau-} \setminus B_{\tau-}$, then $B_\tau = B_{\tau-}$.


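Also, for any $t \ge 0$ we always have the upper bound

(5.74)
$\mathrm{dis}(\psi_t) \le \mathrm{diam}(Y_t) + \mathrm{diam}(Y'_t)$
$\quad \le \mathrm{diam}(S) + \mathrm{diam}(S') + 2t$
$\quad \le \mathrm{diam}(T) + \mathrm{diam}(T') + 2t$
$\quad \le 2\,\mathrm{diam}(T) + d_{\mathrm{GH}^{\mathrm{root}}}((T, \rho), (T', \rho')) + 2t$
$\quad \le 2\,\mathrm{diam}(T) + \delta + 2t$
$\quad =: D_t.$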

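Set $N_t := |\Pi_0 \cap (]0, t] \times S^o)| + |\Pi \cap \{(s, x) : 0 < x \le s \le t\}|$ and write $I_t$ for the indicator of the event $\{\Pi_0 \cap (]0, t] \times B_0) \ne \emptyset\}$, which, by the above argument, is the event that $\xi \in B_{\tau-}$ for some (cut-time, cut-point) pair $(\tau, \xi)$ with $0 < \tau \le t$. We have

(5.75)
$d_{\mathrm{W}}\bigl(P_t((S, \rho), \cdot), P_t((S', \rho'), \cdot)\bigr)$
$\quad \le \mathbb{P}[d_{\mathrm{GH}^{\mathrm{root}}}(Y_t, Y'_t)]$
$\quad \le \frac{1}{2}\mathbb{P}[\mathrm{dis}(\psi_t)]$
$\quad \le \frac{1}{2}\mathbb{P}\bigl[\varepsilon 2^{N_t} + I_t D_t\bigr]$
$\quad \le \frac{1}{2}\Bigl\{\varepsilon \exp\Bigl(\mu(T)\,t + \frac{t^2}{2}\Bigr) + [1 - \exp(-\varepsilon t)]\, D_t\Bigr\},$

and this suffices to complete the proof.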
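The last step of (5.75) rests on two elementary Poisson facts: if $N$ is Poisson with mean $m$ then $\mathbb{P}[2^N] = e^m$ (here $m = \mu(S)t + t^2/2 \le \mu(T)t + t^2/2$), and the probability that a Poisson process with intensity $t\mu$ charges $B_0$ is $1 - \exp(-t\mu(B_0)) \le 1 - \exp(-\varepsilon t)$. The sketch below checks the first identity numerically and evaluates the final bound for a few values of $\varepsilon$. The parameter values are arbitrary, the function names are hypothetical, and $D_t$ is treated as a given constant.

```python
import math


def mean_two_power_poisson(m, terms=200):
    """Series evaluation of E[2^N] for N ~ Poisson(m); should equal exp(m)."""
    total, term = 0.0, math.exp(-m)  # term = P(N = k) * 2^k, starting at k = 0
    for k in range(terms):
        total += term
        term *= 2.0 * m / (k + 1)
    return total


def wasserstein_bound(eps, t, mu_T, D_t):
    """Right-hand side of (5.75) for given eps, t, mu(T) and D_t."""
    growth = math.exp(mu_T * t + t * t / 2.0)  # bounds the mean of 2^{N_t}
    bad_cut = 1.0 - math.exp(-eps * t)         # bounds the mean of I_t
    return 0.5 * (eps * growth + bad_cut * D_t)


if __name__ == "__main__":
    print(mean_two_power_poisson(1.5), math.exp(1.5))
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, wasserstein_bound(eps, t=1.0, mu_T=2.0, D_t=5.0))
```

For fixed $t$ and $T$ both terms vanish as $\varepsilon \to 0$, provided $D_t$ stays bounded, which holds because $\delta$ can be taken small. This is the sense in which the bound "suffices".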

5.6. Asymptotics of the Aldous-Broder algorithm

Recall the Aldous-Broder Markov chain from the introduction to this chapter. In this section we show that the suitably rescaled Aldous-Broder Markov chain indeed converges to the root growth with regrafting dynamics. It will be more convenient for us to work with the continuous time version of this algorithm in which the above transitions are made at the arrival times of an independent Poisson process with rate $|V|/(|V|-1)$ (so that the continuous time chain makes actual jumps at rate 1).

We can associate a rooted compact real tree with a rooted labeled combinatorial tree in the obvious way by thinking of the edges as line segments with length 1. Because we don't record the labeling, the process that arises from mapping the continuous-time Aldous-Broder algorithm in this way won't be Markovian in general. However, this process will be Markovian in the case where $P$ is the transition matrix for i.i.d. uniform sampling (that is, when $P(x, y) = 1/|V|$ for all $x, y \in V$) and we assume this from now on. The following result says that if we rescale “space” and time appropriately, then this process converges to the root growth with re-grafting process. If $T = (T, r, \rho)$ is a rooted compact real tree and $c > 0$, we write $cT$ for the tree $(T, c\,r, \rho)$ (that is, $cT = T$ as sets and the roots are the same, but the metric is re-scaled by $c$).

Proposition 5.6.1. Let $Y^n = (Y^n_t)_{t \ge 0}$ be a sequence of Markov processes that take values in the space of rooted compact real trees with integer edge lengths and evolve according to the dynamics associated with the continuous-time Aldous-Broder chain for i.i.d. uniform sampling. Suppose that each tree $Y^n_0$ is non-random with total branch length $N_n$, that $N_n$ converges to infinity as $n \to \infty$, and that $N_n^{-1/2} Y^n_0$ converges in the rooted Gromov-Hausdorff metric to some rooted compact real tree $T$ as $n \to \infty$. Then, in the sense of weak convergence of processes on the space of càdlàg paths equipped with the Skorohod topology, $(N_n^{-1/2} Y^n(N_n^{1/2} t))_{t \ge 0}$ converges as $n \to \infty$ to the root growth with re-grafting process $X$ under $\mathbf{P}^T$.

Proof. Define $Z^n = (Z^n_t)_{t \ge 0}$ by

(5.76) $Z^n_t := N_n^{-1/2}\, Y^n(N_n^{1/2} t).$

For $\eta > 0$, let $Z^{\eta,n}$ be the $\mathbf{T}^{\mathrm{root}}$-valued process constructed as follows.

• Set $Z^{\eta,n}_0 = R_{\eta_n}(Z^n_0)$, where $\eta_n := N_n^{-1/2} \lfloor N_n^{1/2} \eta \rfloor$.

• The value of $Z^{\eta,n}$ is unchanged between jump times of $(Z^n_t)_{t \ge 0}$.

• At a jump time $\tau$ for $(Z^n_t)_{t \ge 0}$, the tree $Z^{\eta,n}_\tau$ is the subtree of $Z^n_\tau$ spanned by $Z^{\eta,n}_{\tau-}$ and the root of $Z^n_\tau$.

An argument similar to that in the proof of Lemma 5.1.3 shows that

(5.77) $\sup_{t \ge 0} d_{\mathrm{H}}(Z^n_t, Z^{\eta,n}_t) \le \eta_n,$

and so it suffices to show that $Z^{\eta,n}$ converges weakly as $n \to \infty$ to $X$ under $\mathbf{P}^{R_\eta(T)}$.

Note that $Z^{\eta,n}_0$ converges to $R_\eta(T)$ as $n \to \infty$. Moreover, if $\Lambda$ is the map that sends a tree to its total length (that is, the total mass of its length measure), then $\lim_{n \to \infty} \Lambda(Z^{\eta,n}_0) = \Lambda \circ R_\eta(T) < \infty$ by Lemma 1.9.3.

The pure jump process $Z^{\eta,n}$ is clearly Markovian. If it is in a state $(T', \rho')$, then it jumps with the following rates.

• With rate $N_n^{1/2}\,(N_n^{1/2} \Lambda(T'))/N_n = \Lambda(T')$, one of the $N_n^{1/2} \Lambda(T')$ points in $T'$ that are at distance a positive integer multiple of $N_n^{-1/2}$ from the root $\rho'$ is chosen uniformly at random and the subtree above this point is joined to $\rho'$ by an edge of length $N_n^{-1/2}$. The chosen point becomes the new root and an arc of length $N_n^{-1/2}$ that previously led from the new root toward $\rho'$ is erased. Such a transition results in a tree with the same total length as $T'$.

• With rate $N_n^{1/2} - \Lambda(T')$, a new root not present in $T'$ is attached to $\rho'$ by an edge of length $N_n^{-1/2}$. This results in a tree with total length $\Lambda(T') + N_n^{-1/2}$.


It is clear that these dynamics converge to those of the root growth with re-grafting process, with the first class of transitions leading to re-graftings in the limit and the second class leading to root growth.
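The arithmetic behind the two jump rates can be tabulated directly. The sketch below computes, for a given $N_n$ and total length $\Lambda(T')$, the re-grafting rate, the root-growth rate, and the resulting drift of the total length (the root-growth rate times the increment $N_n^{-1/2}$). The function name and the sample values are illustrative assumptions only.

```python
def jump_rates(N, total_length):
    """Return (regraft_rate, root_growth_rate, length_drift) for the rescaled chain.

    N: total branch length N_n of the unscaled initial tree.
    total_length: Lambda(T'), the total length of the current rescaled tree.
    """
    sqrt_N = N ** 0.5
    regraft = sqrt_N * (sqrt_N * total_length) / N  # simplifies to total_length
    growth = sqrt_N - total_length
    drift = growth / sqrt_N  # each root-growth jump adds length N^{-1/2}
    return regraft, growth, drift


if __name__ == "__main__":
    for N in (10**4, 10**6, 10**8):
        print(N, jump_rates(N, 3.0))
```

As $N_n \to \infty$ the re-grafting rate stays equal to $\Lambda(T')$, while the length drift tends to 1. This matches re-grafting at a rate given by the length measure and root growth at unit speed in the limit.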

An alternative algorithm for simulating from the distribution in (5.1) in the case of i.i.d. uniform sampling is the complete graph special case of Wilson's loop-erased walk algorithm for generating a uniform spanning tree of a graph [PW98, Wil96, WP96]. Asymptotics of the latter algorithm have been investigated in [Pit]. Wilson's algorithm was also used in [PR04] to show that the finite-dimensional distributions of the re-scaled uniform random spanning tree for the $d$-dimensional discrete torus converge to the Brownian CRT as the number of vertices goes to infinity when $d \ge 5$.

5.7. An application: The Rayleigh process

Suppose that we take the root growth with re-grafting process $(X_t)_{t\ge 0}$ under $\mathbf{P}^T$ for some $T \in \mathbf{T}^{\mathrm{root}}$, we fix a point $x \in T$, and we denote by $R_t$ the distance between $x$ and the root $\rho_t$ of $X_t$ (that is, $R_t$ is the height of $x$ in $X_t$). According to the root growth with re-grafting dynamics, $R_t$ grows deterministically with unit speed between cut-times $\tau$ for which the corresponding cut-point falls on the arc $[\rho_{\tau-}, x]$. Such cut-times $\tau$ come along at intensity $R_{t-}\,dt$ in time, and at $\tau-$ the position of the corresponding cut-point is uniformly distributed on the arc $[\rho_{\tau-}, x]$ conditional on the past up to $\tau-$, so that $R_\tau$ is uniformly distributed on $[0, R_{\tau-}]$ conditional on the past up to $\tau-$. Consequently, the $\mathbb{R}_+$-valued process $(R_t)_{t\ge 0}$ is autonomously Markovian. In particular, $(R_t)_{t\ge 0}$ is an example of the class of piecewise deterministic Markov processes discussed in the Introduction.
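The dynamics just described (unit-speed growth, jumps at intensity equal to the current height, uniform re-positioning on the interval below the pre-jump height) can be simulated exactly. The following is a minimal sketch under those assumptions only; the function name `simulate_rayleigh` and its return format are hypothetical and not part of the text. Between jumps the height is $r + s$, so the waiting time $s$ has survival function $\exp(-(rs + s^2/2))$, which can be inverted in closed form.

```python
import math
import random

def simulate_rayleigh(r0, t_max, rng):
    """Simulate the height process started at r0 on the time interval [0, t_max].

    Returns (jump_times, post_jump_values, value_at_t_max).
    """
    t, r = 0.0, r0
    jump_times, values = [], []
    while True:
        # Solve r*s + s*s/2 = e for the waiting time s, with e ~ Exp(1).
        e = rng.expovariate(1.0)
        pre_jump = math.sqrt(r * r + 2.0 * e)  # height r + s just before the jump
        wait = pre_jump - r
        if t + wait > t_max:
            # No further jump before t_max: the height just keeps growing.
            return jump_times, values, r + (t_max - t)
        t += wait
        r = rng.random() * pre_jump            # uniform on [0, pre-jump height]
        jump_times.append(t)
        values.append(r)

# Example run with a fixed seed (the seed is arbitrary).
jumps, post_jump, final_value = simulate_rayleigh(0.0, 1000.0, random.Random(0))
```

Inverting the survival function avoids any time discretization, so the simulated path has the stated dynamics up to floating-point error.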

In order to describe the properties of $(R_t)_{t\ge 0}$, we need the following definitions. A non-negative random variable $R$ is said to have standard Rayleigh distribution if it is distributed as the length of a standard normal vector in $\mathbb{R}^2$, that is,

(5.78) $\mathbb{P}\{R > r\} = \exp\left(-\frac{r^2}{2}\right), \quad r \ge 0.$

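If $R^*$ is distributed according to the size-biased standard Rayleigh distribution, that is,

(5.79) $\mathbb{P}\{R^* \in dr\} = \frac{r\,\mathbb{P}\{R \in dr\}}{\mathbb{P}[R]} = \sqrt{\frac{2}{\pi}}\, r^2 e^{-\frac{1}{2}r^2}\,dr, \quad r \ge 0,$

and if $U$ is a uniform random variable that is independent of $R^*$, then $UR^*$ has the inverse size-biased standard Rayleigh distribution:

(5.80) $\mathbb{P}\{UR^* \in dr\} = \frac{r^{-1}\,\mathbb{P}\{R \in dr\}}{\mathbb{P}[R^{-1}]} = \sqrt{\frac{2}{\pi}}\, e^{-\frac{1}{2}r^2}\,dr, \quad r \ge 0.$

Thus $R^*$ and $UR^*$ are distributed as the length of a standard normal vector in $\mathbb{R}^3$ and $\mathbb{R}$, respectively.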
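The normalizing constants in (5.79) and (5.80) can be checked numerically. The sketch below is only an illustration: the helper names, the midpoint rule, and the truncation point `UPPER` are choices made here, not part of the text. It integrates the three densities and the two moments $\mathbb{P}[R]$ and $\mathbb{P}[R^{-1}]$, both of which should come out close to $\sqrt{\pi/2}$.

```python
import math

def integrate(f, a, b, n=20_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def rayleigh_density(r):
    # Density of R obtained by differentiating (5.78).
    return r * math.exp(-0.5 * r * r)

def size_biased_density(r):
    # Density on the right-hand side of (5.79).
    return math.sqrt(2.0 / math.pi) * r * r * math.exp(-0.5 * r * r)

def inverse_size_biased_density(r):
    # Density on the right-hand side of (5.80).
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * r * r)

UPPER = 12.0  # truncation point (assumption): the tail mass beyond it is negligible

mean_R = integrate(lambda r: r * rayleigh_density(r), 0.0, UPPER)
mean_inv_R = integrate(lambda r: rayleigh_density(r) / r, 0.0, UPPER)
mass_size_biased = integrate(size_biased_density, 0.0, UPPER)
mass_inverse = integrate(inverse_size_biased_density, 0.0, UPPER)
```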


For reasons that are apparent from Proposition 5.7.1 below, we call the process $(R_t)_{t\ge 0}$ the Rayleigh process. We note that there is a body of literature on stationary processes with Rayleigh one-dimensional marginal distributions that arise as the length process of a vector-valued process in $\mathbb{R}^2$ with coordinate processes that are independent copies of some stationary centered Gaussian process (see, for example, [Has70, MBB58, BS02]).

Proposition 5.7.1. Consider the Rayleigh process $(R_t)_{t\ge 0}$. Write $\mathbf{P}^r$ for the law of $(R_t)_{t\ge 0}$ started at $r \ge 0$.

(i) The unique stationary distribution of the Rayleigh process is the standard Rayleigh distribution, and the total variation distance between $\mathbf{P}^r\{R_t \in \cdot\}$ and the standard Rayleigh distribution converges to 0 as $t \to \infty$.

(ii) Under $\mathbf{P}^0$, for each fixed $t > 0$, $R_t$ has the same law as $R \wedge t$, where $R$ has the standard Rayleigh distribution.

(iii) For $x > 0$, the mean return time to $x$ is $x^{-1}e^{\frac{1}{2}x^2}$.

(iv) If $\tau_n$ denotes the $n$th jump time of $(R_t)_{t\ge 0}$, then as $n \to \infty$ the triple $(R_{\tau_n}, R_{\tau_{n+1}-}, R_{\tau_{n+1}})$ converges in law to the triple $(U'R^*, R^*, U''R^*)$, where $U'$ and $U''$ are independent uniform random variables on $]0,1[$ independent of $R^*$, and $R^*$ has the size-biased Rayleigh distribution.

(v) The jump counting process $N(t) := |\{n \in \mathbb{N} : \tau_n \le t\}|$ has asymptotically stationary increments under $\mathbf{P}^r$ for any $r \ge 0$, and

(5.81) $\frac{1}{t}N(t) \to \sqrt{\frac{\pi}{2}}, \quad \mathbf{P}^r\text{-a.s.}$

as $t \to \infty$.

Proof. (i) Let $\Pi$ be a Poisson point process in $\mathbb{R} \times \mathbb{R}_+$ with Lebesgue intensity. For $-\infty < t < \infty$ let

(5.82) $R_t := \inf\{x + (t - s) : (s, x) \in \Pi,\ s \le t\}.$

It is clear that $(R_t)_{t\in\mathbb{R}}$ is a stationary Markov process with the transition dynamics of the Rayleigh process. Similarly, for $r \in \mathbb{R}_+$ and $t \ge 0$, set

(5.83) $R^r_t := (r + t) \wedge \inf\{x + (t - s) : (s, x) \in \Pi,\ 0 \le s \le t\}.$

Then $(R^r_t)_{t\ge 0}$ has the same law as the Rayleigh process under $\mathbf{P}^r$.

Note that the event $\{R_t > r\}$ is the event that $\Pi$ has no points in the triangle with vertices $(t - r, 0)$, $(t, 0)$, $(t, r)$ and area $r^2/2$. Thus $\mathbb{P}\{R_t > r\} = \exp(-r^2/2)$ and the standard Rayleigh distribution is a stationary distribution for the Rayleigh process.

Let $T^r := \inf\{t \ge 0 : R^r_t = R_t\}$. Note that $R^r_t = R_t$ for all $t \ge T^r$. Note that $T^r > t$ if and only if either $R_0 > r$ and $\Pi$ puts no points into the quadrilateral with vertices $(0,0)$, $(t,0)$, $(t, r+t)$, $(0,r)$, or $R_0 \le r$ and $\Pi$ puts no points into the quadrilateral with vertices $(0,0)$, $(t,0)$, $(t, R_0+t)$, $(0, R_0)$. Hence

(5.84)
$$\mathbb{P}\{T^r > t\} = \exp\left(-\frac{r^2}{2}\right)\exp\left(-\frac{1}{2}(r + (r+t))t\right) + \int_0^r \exp\left(-\frac{1}{2}(x + (x+t))t\right) x \exp\left(-\frac{x^2}{2}\right) dx.$$

By the standard coupling inequality, the total variation distance between $\mathbf{P}^r\{R_t \in \cdot\}$ and $\mathbb{P}\{R_t \in \cdot\}$ is at most $2\,\mathbb{P}\{T^r > t\}$, which converges to 0 as $t \to \infty$. This certainly shows that the standard Rayleigh distribution is the unique stationary distribution.

(ii) Note that $R^r_t > x$ if and only if either $r + t \ge t > x$ and there are no points of $\Pi$ in the triangle with vertices $(t - x, 0)$, $(t, 0)$, $(t, x)$ of area $x^2/2$, or $r + t > x \ge t$ and there are no points of $\Pi$ in the quadrilateral with vertices $(0,0)$, $(t,0)$, $(t,x)$, $(0, x-t)$ of area $((x-t) + x)t/2 = x^2/2 - (x-t)^2/2$. In either case,

(5.85) $\mathbf{P}^r\{R_t > x\} = \mathbf{1}\{r + t > x\} \exp\left(-\frac{1}{2}x^2 + \frac{1}{2}\left((x-t)^+\right)^2\right).$

Taking r = 0 gives the result.
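As a quick sanity check of part (ii), the tail formula (5.85) can be coded directly and compared, for $r = 0$, with the tail of $R \wedge t$. This is only an illustrative sketch; the function names are hypothetical.

```python
import math

def tail_started_at_r(r, t, x):
    """Right-hand side of (5.85): the probability that R_t > x when R_0 = r."""
    if not (r + t > x):
        return 0.0
    positive_part = max(x - t, 0.0)
    return math.exp(-0.5 * x * x + 0.5 * positive_part * positive_part)

def tail_of_min(t, x):
    """Probability that min(R, t) > x for a standard Rayleigh variable R, using (5.78)."""
    return math.exp(-0.5 * x * x) if x < t else 0.0

# Compare the two tails on a small grid of times and levels.
grid = [0.0, 0.25, 0.5, 1.0, 1.5, 2.5, 4.0]
max_gap = max(abs(tail_started_at_r(0.0, t, x) - tail_of_min(t, x))
              for t in grid[1:] for x in grid)
```

For $r = 0$ the indicator forces $x < t$, so the positive part vanishes and only the Rayleigh tail $\exp(-x^2/2)$ remains; `max_gap` is therefore zero up to floating-point error.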

(iii) Let $T_y := \inf\{t > 0 : R_t = y\}$. It is obvious from the Poisson construction that, for all $x \ge 0$ and $y > 0$, $\mathbf{P}^x\{0 < T_y < \infty\} = 1$ and $\mathbf{P}^x[\exp(uT_y)] < \infty$ for all $u$ in some neighborhood of 0. The Laplace transforms $\mathbf{P}^x[\exp(-\lambda T_y)]$ are determined by standard methods of renewal theory:

(5.86) $\mathbf{P}^x[\exp(-\lambda T_x)] = \frac{U_x(\lambda)}{1 + U_x(\lambda)}, \quad \lambda > 0,$

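where by (5.85),

(5.87)
$$U_x(\lambda) := \int_0^\infty dt\, e^{-\lambda t}\, \frac{\mathbf{P}^x\{R_t \in dx\}}{dx} = \int_0^x dt\, e^{-\lambda t}\, t\, e^{-xt + t^2/2} + \frac{\mathbb{P}\{R \in dx\}}{dx}\, \frac{e^{-\lambda x}}{\lambda}.$$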

In particular, it follows easily that the mean return time of state $x$,

(5.88) $\mathbf{P}^x[T_x] = -\lim_{\lambda \downarrow 0} \frac{d}{d\lambda}\, \mathbf{P}^x[\exp(-\lambda T_x)],$

is the inverse of the density of $R$ at $x$, that is, $x^{-1}e^{\frac{1}{2}x^2}$, as claimed.

(iv) Let $\tau := \inf\{t > 0 : R_t \ne R_{t-}\}$. By part (i), the joint distribution of $(R_{\tau_n}, R_{\tau_{n+1}-}, R_{\tau_{n+1}})$ converges to the joint distribution of $(R_0, R_{\tau-}, R_\tau)$ conditional on $R_0 \ne R_{0-}$. Let $C$ denote the intensity of the stationary point process $\{t \in \mathbb{R} : R_t \ne R_{t-}\}$. Then

(5.89)
$$\begin{aligned}
\mathbb{P}\{R_0 \in dx,\ R_{\tau-} \in dy,\ R_\tau \in dz \mid R_0 \ne R_{0-}\}
&= C^{-1} \exp\left(-\frac{1}{2}x^2\right) dx\, \exp\left(-\frac{1}{2}(x+y)(y-x)\right) dy\, dz \\
&= \left[\frac{1}{y}\,dx\right] \times \left[C^{-1} y^2 \exp\left(-\frac{1}{2}y^2\right) dy\right] \times \left[\frac{1}{y}\,dz\right]
\end{aligned}$$

for $x < y$ and $z < y$. The result now follows from (5.79), which also identifies $C = \sqrt{\pi/2}$.

(v) The stationary point process $\{t \in \mathbb{R} : R_t \ne R_{t-}\}$ is clearly ergodic by construction, and it has intensity $\sqrt{\pi/2}$ from the argument in part (iv). For any $r > 0$, it follows from the argument in part (i) that $R^r_t = R_t$ for all $t$ sufficiently large, and so the result follows from the ergodic theorem applied to $\{t \in \mathbb{R} : R_t \ne R_{t-}\}$.

The following corollary is a consequence of Proposition 5.6.1. See [DGR02], where similar scaling limits are derived.

Corollary 5.7.2. For each $N \in \mathbb{N}$, let $(R^N_t)_{t\ge 0}$ denote a continuous-time Markov chain with state space $\{1, \ldots, N\}$ and infinitesimal generator matrix

(5.90)
$$Q_N(i,j) := \begin{cases} 1/N, & 1 \le j \le i-1, \\ -(N-1)/N, & j = i, \\ (N-i)/N, & j = i+1, \\ 0, & \text{otherwise.} \end{cases}$$

Write $\mathbf{P}^{N,r}$, $r \in \{1, \ldots, N\}$, for the corresponding family of laws. If a sequence $(r_N)_{N\in\mathbb{N}}$, $r_N \in \{1, \ldots, N\}$, is such that $\lim_{N\to\infty} N^{-1/2} r_N = r_\infty$ exists, then the law of $(N^{-1/2} R^N_{t\sqrt{N}})_{t\ge 0}$ under $\mathbf{P}^{N,r_N}$ converges to that of the Rayleigh process $(R_t)_{t\ge 0}$ under $\mathbf{P}^{r_\infty}$ in the usual sense of convergence of càdlàg processes with the Skorohod topology.
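The generator (5.90) is easy to tabulate for small $N$. The sketch below is an illustration only, with a hypothetical function name. It builds $Q_N$ with exact rational entries, so that the defining properties of a generator matrix (rows summing to zero, non-negative off-diagonal entries) can be checked without rounding error.

```python
from fractions import Fraction

def generator_matrix(N):
    """Return Q_N from (5.90) as an N x N list of Fractions; row i-1 is state i."""
    Q = [[Fraction(0)] * N for _ in range(N)]
    for i in range(1, N + 1):
        for j in range(1, i):
            Q[i - 1][j - 1] = Fraction(1, N)    # jump down to any lower state
        Q[i - 1][i - 1] = Fraction(-(N - 1), N)  # diagonal entry
        if i < N:
            Q[i - 1][i] = Fraction(N - i, N)     # step up by one
    return Q

Q5 = generator_matrix(5)
row_sums = [sum(row) for row in Q5]
```

From state $i$ the total downward rate is $(i-1)/N$, with the target uniform on $\{1, \ldots, i-1\}$, and the upward rate is $(N-i)/N$; these add up to $(N-1)/N$, which matches the diagonal entry.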


CHAPTER 6

Subtree Prune and Regraft

As mentioned in the Introduction, Markov chains that move through a space of finite trees are an important ingredient for several algorithms in phylogenetic analysis. In the present chapter we construct and investigate the asymptotics of the Subtree Prune and Regraft (SPR) dynamics, one of the standard sets of moves that are implemented in several phylogenetic software packages.

Recall from Figure 0.2 in the Introduction that in an SPR move, a binary tree T is cut “in the middle of an edge” to give two subtrees, say T′ and T′′. Another edge is chosen in T′, a new vertex is created “in the middle” of that edge, and the cut edge in T′′ is attached to this new vertex. Lastly, the “pendant” cut edge in T′ is removed along with the vertex it was attached to in order to produce a new binary tree that has the same number of vertices as T.

As motivated in Sections 1.7 and 1.11, any compact real tree has an analogue of the length measure on it, but in general there is no canonical analogue of the weight measure. Consequently, the process we construct has as its state space the set of weighted trees, i.e., its elements are pairs (T, ν), where T is a compact real tree and ν is a probability measure on T. Let µ be the length measure associated with T.

Our candidate for the limiting subtree prune and regraft dynamics will be a pure jump Markov process with values in the space of weighted R-trees (compare Section 1.11). The process jumps away from T by first choosing a pair of points (u, v) ∈ T × T according to the rate measure µ ⊗ ν and then transforming T into a new tree by cutting off the subtree rooted at u that does not contain v and re-attaching this subtree at v. This jump kernel (which typically has infinite total mass, so that jumps are occurring on a dense countable set) is precisely what one would expect for a limit (as the number of vertices goes to infinity) of the particular SPR Markov chain on finite trees described above in which the edges for cutting and re-attachment are chosen uniformly at each stage.

Since the process we want to construct is reversible with respect to the distribution of the weighted Brownian continuum random tree (Definition 3.2.1), the framework of Dirichlet forms allows us to translate the above description into rigorous mathematics.

This chapter is organized as follows. First we need to understand in detail the Dirichlet form arising from the combination of the jump kernel with the distribution of the weighted Brownian CRT as a reference measure. In Section 6.1 we introduce a kernel corresponding to an SPR move and show, based on calculations presented in Section 3.4, that we can control the intensity of “big” and “small” jumps. We construct the Dirichlet form in Section 6.2 and the resulting process in Section 6.3. We use potential theory for Dirichlet forms to show in Section 6.4 that from almost all starting points (with respect to the continuum random tree reference measure) our process does not hit the trivial tree consisting of a single point.

6.1. A symmetric jump measure on (Twt, dGHwt)

Recall the space $(\mathbf{T}^{\mathrm{wt}}, d_{\mathrm{GH}^{\mathrm{wt}}})$ of weighted real trees from Section 1.11. In this section we will construct and study a measure on $\mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}}$ that is related to the decomposition discussed at the beginning of Section 3.4.

Define a map $\Theta$ from $\{((T, r), u, v) : T \in \mathbf{T},\ u \in T,\ v \in T\}$ into $\mathbf{T}$ by setting $\Theta((T, r), u, v) := (T, r^{(u,v)})$, where

(6.1)
$$r^{(u,v)}(x, y) := \begin{cases} r(x, y), & \text{if } x, y \in S^{T,u,v}, \\ r(x, y), & \text{if } x, y \in T \setminus S^{T,u,v}, \\ r(x, u) + r(v, y), & \text{if } x \in S^{T,u,v},\ y \in T \setminus S^{T,u,v}, \\ r(y, u) + r(v, x), & \text{if } y \in S^{T,u,v},\ x \in T \setminus S^{T,u,v}. \end{cases}$$

That is, $\Theta((T, r), u, v)$ is just $T$ as a set, but the metric has been changed so that the subtree $S^{T,u,v}$ with root $u$ is now pruned and re-grafted so as to have root $v$.
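The case distinction in (6.1) can be illustrated on a finite point set. The sketch below is a toy example, not the construction on real trees: the function name `spr_metric` is hypothetical, and the pruned set $S^{T,u,v}$ is assumed to be supplied by the caller rather than computed from the tree.

```python
def spr_metric(r, subtree, u, v):
    """Return the function r^{(u,v)} of (6.1).

    r       : the original distance, a function of two points
    subtree : the set of points in the pruned subtree (assumed given)
    u, v    : cut point and re-graft point
    """
    def r_new(x, y):
        x_in, y_in = x in subtree, y in subtree
        if x_in == y_in:          # both inside or both outside the pruned subtree
            return r(x, y)
        if x_in:                  # x inside, y outside
            return r(x, u) + r(v, y)
        return r(y, u) + r(v, x)  # y inside, x outside
    return r_new

def line(x, y):
    """Toy metric (assumption): the points 0, 1, 2, 3 on a line with unit spacing."""
    return abs(x - y)

# Prune the piece {2, 3} at u = 2 and re-graft it at v = 0.
r_uv = spr_metric(line, subtree={2, 3}, u=2, v=0)
distances = {(x, y): r_uv(x, y) for x in range(4) for y in range(4)}
```

In this toy example the pruned piece keeps its internal distances, the point 3 ends up at distance 1 from the new attachment point 0, and the cut point 2 is at distance 0 from 0, reflecting that the root of the pruned subtree is glued onto $v$.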

If $(T, r, \nu) \in \mathbf{T}^{\mathrm{wt}}$ and $(u, v) \in T \times T$, then we can think of $\nu$ as a weight on $(T, r^{(u,v)})$, because the Borel structures induced by $r$ and $r^{(u,v)}$ are the same. With a slight misuse of notation we will therefore write $\Theta((T, r, \nu), u, v)$ for $(T, r^{(u,v)}, \nu) \in \mathbf{T}^{\mathrm{wt}}$. Intuitively, the mass contained in $S^{T,u,v}$ is transported along with the subtree.

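Define a kernel $\kappa$ on $\mathbf{T}^{\mathrm{wt}}$ by

(6.2) $\kappa((T, r_T, \nu_T), B) := \mu^T \otimes \nu_T\{(u, v) \in T \times T : \Theta(T, u, v) \in B\}$

for $B \in \mathcal{B}(\mathbf{T}^{\mathrm{wt}})$. Thus $\kappa((T, r_T, \nu_T), \cdot)$ is the jump kernel described informally in the Introduction.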

Remark 6.1.1. It is clear that $\kappa((T, r_T, \nu_T), \cdot)$ is a Borel measure on $\mathbf{T}^{\mathrm{wt}}$ for each $(T, r_T, \nu_T) \in \mathbf{T}^{\mathrm{wt}}$. In order to show that $\kappa(\cdot, B)$ is a Borel function on $\mathbf{T}^{\mathrm{wt}}$ for each $B \in \mathcal{B}(\mathbf{T}^{\mathrm{wt}})$, so that $\kappa$ is indeed a kernel, it suffices to observe for each bounded continuous function $F : \mathbf{T}^{\mathrm{wt}} \to \mathbb{R}$ that

$$\int F(\Theta(T, u, v))\, \mu^T(du)\, \nu_T(dv) = \lim_{\varepsilon \downarrow 0} \int F(\Theta(T, u, v))\, \mu^{R_\varepsilon(T)}(du)\, \nu_T(dv)$$

and that

$$(T, r_T, \nu_T) \mapsto \int F(\Theta(T, u, v))\, \mu^{R_\varepsilon(T)}(du)\, \nu_T(dv)$$

is continuous for all $\varepsilon > 0$ (the latter follows from an argument similar to that in Lemma 7.3 of [EPW06], where it is shown that the map $(T, r_T, \nu_T) \mapsto \mu^{R_\varepsilon(T)}(T)$ is continuous). We have only sketched the argument that $\kappa$ is a kernel, because $\kappa$ is just a device for defining the measure $J$ on $\mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}}$ in the next paragraph. It is actually the measure $J$ that we use to define our Dirichlet form, and the measure $J$ can be constructed directly as the push-forward of a measure on $U^1 \times U^1$; see the proof of Lemma 6.1.2.

Figure 6.1. A subtree prune and re-graft operation on an excursion path: the excursion starting at time u in the top picture is excised and inserted at time v, and the resulting gap between the two points marked # is closed up. The two points marked # (resp. ∗) in the top (resp. bottom) picture correspond to a single point in the associated R-tree.

We show in part (i) of Lemma 6.1.2 below that the kernel $\kappa$ is reversible with respect to the probability measure $\mathbf{P}$. More precisely, we show that if we define a measure $J$ on $\mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}}$ by

(6.3) $J(A \times B) := \int_A \mathbf{P}(dT)\, \kappa(T, B)$

for $A, B \in \mathcal{B}(\mathbf{T}^{\mathrm{wt}})$, then $J$ is symmetric, where $\mathbf{P}$ denotes the law of the weighted Brownian CRT (compare Definition 3.2.1).

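Recall $\Delta_{\mathrm{GH}^{\mathrm{wt}}}$ from (1.56).

Lemma 6.1.2. (i) The measure $J$ is symmetric.

(ii) For each compact subset $K \subset \mathbf{T}^{\mathrm{wt}}$ and open subset $U$ such that $K \subset U \subseteq \mathbf{T}^{\mathrm{wt}}$,

$$J(K, \mathbf{T}^{\mathrm{wt}} \setminus U) < \infty.$$

(iii) The function $\Delta_{\mathrm{GH}^{\mathrm{wt}}}$ is square-integrable with respect to $J$, that is,

$$\int_{\mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}}} J(dT, dS)\, \Delta^2_{\mathrm{GH}^{\mathrm{wt}}}(T, S) < \infty.$$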


Proof. (i) Given $e', e'' \in U^1$, $0 \le u \le 1$, and $0 < \rho \le 1$, define $e(\cdot\,; e', e'', u, \rho) \in U^1$ by

(6.4)
$$e(t; e', e'', u, \rho) := \begin{cases} S_{1-\rho}e''(t), & 0 \le t \le (1-\rho)u, \\ S_{1-\rho}e''((1-\rho)u) + S_\rho e'(t - (1-\rho)u), & (1-\rho)u \le t \le (1-\rho)u + \rho, \\ S_{1-\rho}e''(t - \rho), & (1-\rho)u + \rho \le t \le 1. \end{cases}$$

That is, $e(\cdot\,; e', e'', u, \rho)$ is the excursion that arises from Brownian re-scaling $e'$ and $e''$ to have lengths $\rho$ and $1 - \rho$, respectively, and then inserting the re-scaled version of $e'$ into the re-scaled version of $e''$ at a position that is a fraction $u$ of the total length of the re-scaled version of $e''$.

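Define a measure $\mathbf{J}$ on $U^1 \times U^1$ by

(6.5)
$$\begin{aligned}
\int_{U^1 \times U^1} \mathbf{J}(de^*, de^{**})\, K(e^*, e^{**})
:= {} & \int_{[0,1]^2} du \otimes dv\, \frac{1}{2\sqrt{2\pi}} \int_0^1 \frac{d\rho}{\sqrt{(1-\rho)\rho^3}} \int \mathbf{P}(de') \otimes \mathbf{P}(de'') \\
& \times K\big(e(\cdot\,; e', e'', u, \rho),\ e(\cdot\,; e', e'', v, \rho)\big).
\end{aligned}$$

Clearly, the measure $\mathbf{J}$ is symmetric. It follows from the discussion at the beginning of the proof of part (i) of Theorem 3.4.1 and Corollary 3.4.4 that the measure $J$ is the push-forward of the symmetric measure $2\mathbf{J}$ by the map

(6.6) $U^1 \times U^1 \ni (e^*, e^{**}) \mapsto \big((T_{2e^*}, r_{T_{2e^*}}, \nu_{T_{2e^*}}),\ (T_{2e^{**}}, r_{T_{2e^{**}}}, \nu_{T_{2e^{**}}})\big) \in \mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}},$

and hence $J$ is also symmetric.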

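(ii) The result is trivial if $K = \emptyset$, so we assume that $K \ne \emptyset$. Since $\mathbf{T}^{\mathrm{wt}} \setminus U$ and $K$ are disjoint closed sets and $K$ is compact, we have that

(6.7) $c := \inf_{T \in K,\ S \in \mathbf{T}^{\mathrm{wt}} \setminus U} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(T, S) > 0.$

Fix $T \in K$. If $(u, v) \in T \times T$ is such that $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(T, \Theta(T, u, v)) > c$, then $\mathrm{diam}(T) > c$ (so that we can think of $R_c(T)$, recall (1.46), as a subset of $T$). Moreover, we claim that either

• $u \in R_c(T, v)$ (recall (1.32)), or
• $u \notin R_c(T, v)$ and $\nu_T(S^{T,u,v}) > c$ (recall (3.30)).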

Suppose, to the contrary, that $u \notin R_c(T, v)$ and that $\nu_T(S^{T,u,v}) \le c$. Because $u \notin R_c(T, v)$, the map $f : T \to \Theta(T, u, v)$ given by

$$f(w) := \begin{cases} u, & \text{if } w \in S^{T,u,v}, \\ w, & \text{otherwise,} \end{cases}$$

is a measurable $c$-isometry. There is an analogous measurable $c$-isometry $g : \Theta(T, u, v) \to T$. Clearly,

$$d_{\mathrm{Pr}}(f_*\nu_T, \nu_{\Theta(T,u,v)}) \le c$$

and

$$d_{\mathrm{Pr}}(\nu_T, g_*\nu_{\Theta(T,u,v)}) \le c.$$

Hence, by definition, $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(T, \Theta(T, u, v)) \le c$.

Thus we have

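(6.8)
$$\begin{aligned}
J(K, \mathbf{T}^{\mathrm{wt}} \setminus U)
&\le \int_K \mathbf{P}(dT)\, \kappa(T, \{S : \Delta_{\mathrm{GH}^{\mathrm{wt}}}(T, S) > c\}) \\
&\le \int_K \mathbf{P}(dT) \int_T \nu_T(dv)\, \mu^T(R_c(T, v)) \\
&\quad + \int_K \mathbf{P}(dT) \int_T \nu_T(dv)\, \mu^T\{u \in T : \nu_T(S^{T,u,v}) > c\} \\
&< \infty,
\end{aligned}$$

where we have used Theorem 3.4.1.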

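(iii) Similar reasoning yields that

(6.9)
$$\begin{aligned}
\int_{\mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}}} J(dT, dS)\, \Delta^2_{\mathrm{GH}^{\mathrm{wt}}}(T, S)
&= \int_{\mathbf{T}^{\mathrm{wt}}} \mathbf{P}(dT) \int_0^\infty dt\, 2t\, \kappa(T, \{S : \Delta_{\mathrm{GH}^{\mathrm{wt}}}(T, S) > t\}) \\
&\le \int_{\mathbf{T}^{\mathrm{wt}}} \mathbf{P}(dT) \int_0^\infty dt\, 2t \int_T \nu_T(dv)\, \mu^T(R_t(T, v)) \\
&\quad + \int_{\mathbf{T}^{\mathrm{wt}}} \mathbf{P}(dT) \int_0^\infty dt\, 2t \int_T \nu_T(dv)\, \mu^T\{u \in T : \nu_T(S^{T,u,v}) > t\} \\
&\le \int_0^\infty dt\, 2t \int_{\mathbf{T}^{\mathrm{wt}}} \mathbf{P}(dT) \int_T \nu_T(dv)\, \mu^T(R_t(T, v)) \\
&\quad + \int_{\mathbf{T}^{\mathrm{wt}}} \mathbf{P}(dT) \int_T \nu_T(dv) \int_T \mu^T(du)\, \nu_T^2(S^{T,u,v}) \\
&< \infty,
\end{aligned}$$

where we have applied Theorem 3.4.1 once more.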

6.2. Dirichlet forms

Consider the bilinear form

(6.10) $\mathcal{E}(f, g) := \int_{\mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}}} J(dT, dS)\, \big(f(S) - f(T)\big)\big(g(S) - g(T)\big),$

for $f, g$ in the domain

(6.11) $\mathcal{D}^*(\mathcal{E}) := \{f \in L^2(\mathbf{T}^{\mathrm{wt}}, \mathbf{P}) : f \text{ is measurable, and } \mathcal{E}(f, f) < \infty\}$

(here as usual, $L^2(\mathbf{T}^{\mathrm{wt}}, \mathbf{P})$ is equipped with the inner product $(f, g)_{\mathbf{P}} := \int \mathbf{P}(dx)\, f(x) g(x)$). By the argument in Example 1.2.1 in [FOT94] and Lemma 6.1.2, $(\mathcal{E}, \mathcal{D}^*(\mathcal{E}))$ is well-defined, symmetric and Markovian.

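Lemma 6.2.1. The form $(\mathcal{E}, \mathcal{D}^*(\mathcal{E}))$ is closed. That is, if $(f_n)_{n\in\mathbb{N}}$ is a sequence in $\mathcal{D}^*(\mathcal{E})$ such that

$$\lim_{m,n\to\infty} \big(\mathcal{E}(f_n - f_m, f_n - f_m) + (f_n - f_m, f_n - f_m)_{\mathbf{P}}\big) = 0,$$

then there exists $f \in \mathcal{D}^*(\mathcal{E})$ such that

$$\lim_{n\to\infty} \big(\mathcal{E}(f_n - f, f_n - f) + (f_n - f, f_n - f)_{\mathbf{P}}\big) = 0.$$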

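Proof. Let $(f_n)_{n\in\mathbb{N}}$ be a sequence such that $\lim_{m,n\to\infty} \big(\mathcal{E}(f_n - f_m, f_n - f_m) + (f_n - f_m, f_n - f_m)_{\mathbf{P}}\big) = 0$ (that is, $(f_n)_{n\in\mathbb{N}}$ is Cauchy with respect to $\mathcal{E}(\cdot,\cdot) + (\cdot,\cdot)_{\mathbf{P}}$). There exists a subsequence $(n_k)_{k\in\mathbb{N}}$ and $f \in L^2(\mathbf{T}^{\mathrm{wt}}, \mathbf{P})$ such that $\lim_{k\to\infty} f_{n_k} = f$, $\mathbf{P}$-a.s., and $\lim_{k\to\infty} (f_{n_k} - f, f_{n_k} - f)_{\mathbf{P}} = 0$. By Fatou’s Lemma,

(6.12) $\int J(dT, dS)\, \big(f(S) - f(T)\big)^2 \le \liminf_{k\to\infty} \mathcal{E}(f_{n_k}, f_{n_k}) < \infty,$

and so $f \in \mathcal{D}^*(\mathcal{E})$. Similarly,

(6.13)
$$\begin{aligned}
\mathcal{E}(f_n - f, f_n - f)
&= \int J(dT, dS) \lim_{k\to\infty} \big((f_n - f_{n_k})(S) - (f_n - f_{n_k})(T)\big)^2 \\
&\le \liminf_{k\to\infty} \mathcal{E}(f_n - f_{n_k}, f_n - f_{n_k}) \to 0
\end{aligned}$$

as $n \to \infty$. Thus $(f_n)_{n\in\mathbb{N}}$ has a subsequence that converges to $f$ with respect to $\mathcal{E}(\cdot,\cdot) + (\cdot,\cdot)_{\mathbf{P}}$, but, by the Cauchy property, this implies that $(f_n)_{n\in\mathbb{N}}$ itself converges to $f$.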

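Let $\mathcal{L}$ denote the collection of functions $f : \mathbf{T}^{\mathrm{wt}} \to \mathbb{R}$ such that

(6.14) $\sup_{T \in \mathbf{T}^{\mathrm{wt}}} |f(T)| < \infty$

and

(6.15) $\sup_{S, T \in \mathbf{T}^{\mathrm{wt}},\ S \ne T} \frac{|f(S) - f(T)|}{\Delta_{\mathrm{GH}^{\mathrm{wt}}}(S, T)} < \infty.$

Note that $\mathcal{L}$ consists of continuous functions and contains the constants. It follows from (1.61) that $\mathcal{L}$ is both a vector lattice and an algebra. By Lemma 6.2.2 below, $\mathcal{L} \subseteq \mathcal{D}^*(\mathcal{E})$. Therefore, the closure of $(\mathcal{E}, \mathcal{L})$ is a Dirichlet form that we will denote by $(\mathcal{E}, \mathcal{D}(\mathcal{E}))$.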

Lemma 6.2.2. Suppose that $\{f_n\}_{n\in\mathbb{N}}$ is a sequence of functions from $\mathbf{T}^{\mathrm{wt}}$ into $\mathbb{R}$ such that

$$\sup_{n\in\mathbb{N}} \sup_{T \in \mathbf{T}^{\mathrm{wt}}} |f_n(T)| < \infty,$$

$$\sup_{n\in\mathbb{N}} \sup_{S, T \in \mathbf{T}^{\mathrm{wt}},\ S \ne T} \frac{|f_n(S) - f_n(T)|}{\Delta_{\mathrm{GH}^{\mathrm{wt}}}(S, T)} < \infty,$$

and

$$\lim_{n\to\infty} f_n = f, \quad \mathbf{P}\text{-a.s.}$$

for some $f : \mathbf{T}^{\mathrm{wt}} \to \mathbb{R}$. Then $\{f_n\}_{n\in\mathbb{N}} \subset \mathcal{D}^*(\mathcal{E})$, $f \in \mathcal{D}^*(\mathcal{E})$, and

$$\lim_{n\to\infty} \big(\mathcal{E}(f_n - f, f_n - f) + (f_n - f, f_n - f)_{\mathbf{P}}\big) = 0.$$

Proof. By the definition of the measure $J$ (see (6.3)) and the symmetry of $J$ (Lemma 6.1.2(i)), we have that $f_n(x) - f_n(y) \to f(x) - f(y)$ for $J$-almost every pair $(x, y)$. The result then follows from part (iii) of Lemma 6.1.2 and the dominated convergence theorem.

6.3. An associated Markov process

In this section we associate the Dirichlet form $(\mathcal{E}, \mathcal{D}(\mathcal{E}))$ with a nice Markov process. First, we remark that $\mathcal{L}$, and hence $\mathcal{D}(\mathcal{E})$, is quite a rich class of functions: we show in the proof of Theorem 6.3.1 below that $\mathcal{L}$ separates points of $\mathbf{T}^{\mathrm{wt}}$ and hence if $K$ is any compact subset of $\mathbf{T}^{\mathrm{wt}}$, then, by the Arzela-Ascoli theorem, the set of restrictions of functions in $\mathcal{L}$ to $K$ is uniformly dense in the space of real-valued continuous functions on $K$.

The following theorem states that there is a well-defined Markov process with the dynamics we would expect for a limit of the subtree prune and re-graft chains.

Theorem 6.3.1. There exists a recurrent $\mathbf{P}$-symmetric Hunt process $X = (X_t, \mathbf{P}^T)$ on $\mathbf{T}^{\mathrm{wt}}$ whose Dirichlet form is $(\mathcal{E}, \mathcal{D}(\mathcal{E}))$.

Proof. We will check the conditions of Theorem 7.3.1 in [FOT94] to establish the existence of $X$.

Because $\mathbf{T}^{\mathrm{wt}}$ is complete and separable (recall Theorem 1.11.7) there is a sequence $H_1 \subseteq H_2 \subseteq \ldots$ of compact subsets of $\mathbf{T}^{\mathrm{wt}}$ such that $\mathbf{P}(\bigcup_{k\in\mathbb{N}} H_k) = 1$. Given $\alpha, \beta > 0$, write $\mathcal{L}_{\alpha,\beta}$ for the subset of $\mathcal{L}$ consisting of functions $f$ such that

(6.16) $\sup_{T \in \mathbf{T}^{\mathrm{wt}}} |f(T)| \le \alpha$

and

(6.17) $\sup_{S, T \in \mathbf{T}^{\mathrm{wt}},\ S \ne T} \frac{|f(S) - f(T)|}{\Delta_{\mathrm{GH}^{\mathrm{wt}}}(S, T)} \le \beta.$

By the separability of the continuous real-valued functions on each $H_k$ with respect to the supremum norm, it follows that for each $k \in \mathbb{N}$ there is a countable set $\mathcal{L}_{\alpha,\beta,k} \subseteq \mathcal{L}_{\alpha,\beta}$ such that for every $f \in \mathcal{L}_{\alpha,\beta}$

(6.18) $\inf_{g \in \mathcal{L}_{\alpha,\beta,k}} \sup_{T \in H_k} |f(T) - g(T)| = 0.$

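Set $\tilde{\mathcal{L}}_{\alpha,\beta} := \bigcup_{k\in\mathbb{N}} \mathcal{L}_{\alpha,\beta,k}$. Then for any $f \in \mathcal{L}_{\alpha,\beta}$ there exists a sequence $\{f_n\}_{n\in\mathbb{N}}$ in $\tilde{\mathcal{L}}_{\alpha,\beta}$ such that $\lim_{n\to\infty} f_n = f$ pointwise on $\bigcup_{k\in\mathbb{N}} H_k$, and hence $\mathbf{P}$-almost surely. By Lemma 6.2.2, the countable set $\bigcup_{m\in\mathbb{N}} \tilde{\mathcal{L}}_{m,m}$ is dense in $\mathcal{L}$, and hence also dense in $\mathcal{D}(\mathcal{E})$, with respect to $\mathcal{E}(\cdot,\cdot) + (\cdot,\cdot)_{\mathbf{P}}$.

Now fix a countable dense subset $\mathcal{S} \subset \mathbf{T}^{\mathrm{wt}}$. Let $\mathcal{M}$ denote the countable set of functions of the form

(6.19) $T \mapsto p + q\,(\Delta_{\mathrm{GH}^{\mathrm{wt}}}(S, T) \wedge r)$

for some $S \in \mathcal{S}$ and $p, q, r \in \mathbb{Q}$. Note that $\mathcal{M} \subseteq \mathcal{L}$, that $\mathcal{M}$ separates the points of $\mathbf{T}^{\mathrm{wt}}$, and, for any $T \in \mathbf{T}^{\mathrm{wt}}$, that there is certainly a function $f \in \mathcal{M}$ with $f(T) \ne 0$.

Consequently, if $\mathcal{C}$ is the algebra generated by the countable set $\mathcal{M} \cup \bigcup_{m\in\mathbb{N}} \tilde{\mathcal{L}}_{m,m}$, then it is certainly the case that $\mathcal{C}$ is dense in $\mathcal{D}(\mathcal{E})$ with respect to $\mathcal{E}(\cdot,\cdot) + (\cdot,\cdot)_{\mathbf{P}}$, that $\mathcal{C}$ separates the points of $\mathbf{T}^{\mathrm{wt}}$, and, for any $T \in \mathbf{T}^{\mathrm{wt}}$, that there is a function $f \in \mathcal{C}$ with $f(T) \ne 0$.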

All that remains in verifying the conditions of Theorem 7.3.1 in [FOT94] is to check the tightness condition that there exist compact subsets $K_1 \subseteq K_2 \subseteq \ldots$ of $\mathbf{T}^{\mathrm{wt}}$ such that $\lim_{n\to\infty} \mathrm{Cap}(\mathbf{T}^{\mathrm{wt}} \setminus K_n) = 0$, where Cap is the capacity associated with the Dirichlet form (see Remark 6.3.2 below for a definition). This convergence, however, is the content of Lemma 6.3.5 below.

Finally, because constants belong to $\mathcal{D}(\mathcal{E})$, it follows from Theorem 1.6.3 in [FOT94] that $X$ is recurrent.

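Remark 6.3.2. In the proof of Theorem 6.3.1 we used the capacity associated with the Dirichlet form $(\mathcal{E}, \mathcal{D}(\mathcal{E}))$. We remind the reader that for an open subset $U \subseteq \mathbf{T}^{\mathrm{wt}}$,

$$\mathrm{Cap}(U) := \inf\{\mathcal{E}(f, f) + (f, f)_{\mathbf{P}} : f \in \mathcal{D}(\mathcal{E}),\ f(T) \ge 1 \text{ for } \mathbf{P}\text{-a.e. } T \in U\},$$

and for a general subset $A \subseteq \mathbf{T}^{\mathrm{wt}}$

$$\mathrm{Cap}(A) := \inf\{\mathrm{Cap}(U) : A \subseteq U \text{ is open}\}.$$

We refer the reader to Section 2.1 of [FOT94] for details and a proof that Cap is a Choquet capacity.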

The following results were needed in the proof of Theorem 6.3.1.

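Lemma 6.3.3. For $\varepsilon, a, \delta > 0$, put $V_{\varepsilon,a} := \{T \in \mathbf{T} : \mu^T(R_\varepsilon(T)) > a\}$ and, as usual, $V^\delta_{\varepsilon,a} := \{T \in \mathbf{T} : d_{\mathrm{GH}}(T, V_{\varepsilon,a}) < \delta\}$. Then, for fixed $\varepsilon > 3\delta$,

$$\bigcap_{a>0} V^\delta_{\varepsilon,a} = \emptyset.$$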

Proof. Fix $S \in \mathbf{T}$. If $S \in V^\delta_{\varepsilon,a}$, then there exists $T \in V_{\varepsilon,a}$ such that $d_{\mathrm{GH}}(S, T) < \delta$. Observe that $R_\varepsilon(T)$ is not the trivial tree consisting of a single point because it has total length greater than $a$. Write $\{y_1, \ldots, y_n\}$ for the leaves of $R_\varepsilon(T)$. For all $i = 1, \ldots, n$, the connected component of $T \setminus R_\varepsilon(T)^o$ that contains $y_i$ contains a point $z_i$ such that $r_T(y_i, z_i) = \varepsilon$.

Let $\mathcal{R}$ be a correspondence between $S$ and $T$ with $\mathrm{dis}(\mathcal{R}) < 2\delta$ (recall Definition 1.2.3). Pick $x_1, \ldots, x_n \in S$ such that $(x_i, z_i) \in \mathcal{R}$, and hence $|r_S(x_i, x_j) - r_T(z_i, z_j)| < 2\delta$ for all $i, j$.

By Lemma 1.7.3,

(6.20)
$$\mu^T\big(R_\varepsilon(T)\big) = r_T(y_1, y_2) + \sum_{k=3}^n \bigwedge_{1 \le i \le j \le k-1} \frac{1}{2}\big(r_T(y_k, y_i) + r_T(y_k, y_j) - r_T(y_i, y_j)\big).$$

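Now the distance in $S$ from the point $x_k$ to the arc $[x_i, x_j]$ is

(6.21)
$$\begin{aligned}
\frac{1}{2}\big(r_S(x_k, x_i) + r_S(x_k, x_j) - r_S(x_i, x_j)\big)
&\ge \frac{1}{2}\big(r_T(z_k, z_i) + r_T(z_k, z_j) - r_T(z_i, z_j) - 3 \times 2\delta\big) \\
&= \frac{1}{2}\big(r_T(y_k, y_i) + 2\varepsilon + r_T(y_k, y_j) + 2\varepsilon - r_T(y_i, y_j) - 2\varepsilon - 6\delta\big) \\
&> 0
\end{aligned}$$

by the assumption that $\varepsilon > 3\delta$. In particular, $x_1, \ldots, x_n$ are leaves of the subtree spanned by $\{x_1, \ldots, x_n\}$, and $R_\gamma(S)$ has at least $n$ leaves when $0 < \gamma < 2\varepsilon - 6\delta$. Fix such a $\gamma$.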

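Now

(6.22)
$$\begin{aligned}
\mu^S(R_\gamma(S))
&\ge r_S(x_1, x_2) - 2\gamma + \sum_{k=3}^n \bigwedge_{1 \le i \le j \le k-1} \left[\frac{1}{2}\big(r_S(x_k, x_i) + r_S(x_k, x_j) - r_S(x_i, x_j)\big) - \gamma\right] \\
&\ge \mu^T(R_\varepsilon(T)) + (2\varepsilon - 2\delta - 2\gamma) + (n-2)(\varepsilon - 3\delta - \gamma) \\
&\ge a + (2\varepsilon - 2\delta - 2\gamma) + (n-2)(\varepsilon - 3\delta - \gamma).
\end{aligned}$$

Because $\mu^S(R_\gamma(S))$ is finite, it is apparent that $S$ cannot belong to $V^\delta_{\varepsilon,a}$ when $a$ is sufficiently large.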

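Lemma 6.3.4. For $\varepsilon, a > 0$, let $V_{\varepsilon,a}$ be as in Lemma 6.3.3. Set $U_{\varepsilon,a} := \{(T, \nu) \in \mathbf{T}^{\mathrm{wt}} : T \in V_{\varepsilon,a}\}$. Then, for fixed $\varepsilon$,

(6.23) $\lim_{a\to\infty} \mathrm{Cap}(U_{\varepsilon,a}) = 0.$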

Proof. Observe that $(T, r_T, \nu_T) \mapsto \mu^{R_\varepsilon(T)}(T)$ is continuous (this is essentially Lemma 1.9.3), and so $U_{\varepsilon,a}$ is open.

Choose $\delta > 0$ such that $\varepsilon > 3\delta$. Suppressing the dependence on $\varepsilon$ and $\delta$, define $u_a : \mathbf{T}^{\mathrm{wt}} \to [0, 1]$ by

(6.24) $u_a((T, \nu)) := \delta^{-1}\big(\delta - d_{\mathrm{GH}}(T, V_{\varepsilon,a})\big)^+.$

Note that $u_a$ takes the value 1 on the open set $U_{\varepsilon,a}$, and so $\mathrm{Cap}(U_{\varepsilon,a}) \le \mathcal{E}(u_a, u_a) + (u_a, u_a)_{\mathbf{P}}$. Also observe that

(6.25) $|u_a((T', \nu')) - u_a((T'', \nu''))| \le \delta^{-1} d_{\mathrm{GH}}(T', T'') \le \delta^{-1} \Delta_{\mathrm{GH}^{\mathrm{wt}}}((T', \nu'), (T'', \nu'')).$

It therefore suffices by part (iii) of Lemma 6.1.2 and the dominated convergence theorem to show for each pair $((T', \nu'), (T'', \nu'')) \in \mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}}$ that $u_a((T', \nu')) - u_a((T'', \nu''))$ is 0 for $a$ sufficiently large and for each $(T, \nu) \in \mathbf{T}^{\mathrm{wt}}$ that $u_a((T, \nu))$ is 0 for $a$ sufficiently large. However, $u_a((T', \nu')) - u_a((T'', \nu'')) \ne 0$ implies that either $T'$ or $T''$ belongs to $V^\delta_{\varepsilon,a}$, while $u_a((T, \nu)) \ne 0$ implies that $T$ belongs to $V^\delta_{\varepsilon,a}$. The result then follows from Lemma 6.3.3.

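Lemma 6.3.5. There is a sequence of compact sets $K_1 \subseteq K_2 \subseteq \ldots$ such that $\lim_{n\to\infty} \mathrm{Cap}(\mathbf{T}^{\mathrm{wt}} \setminus K_n) = 0$.

Proof. By Lemma 6.3.4, for $n = 1, 2, \ldots$ we can choose $a_n$ so that $\mathrm{Cap}(U_{2^{-n},a_n}) \le 2^{-n}$. Set

(6.26) $F_n := \mathbf{T}^{\mathrm{wt}} \setminus U_{2^{-n},a_n} = \{(T, \nu) \in \mathbf{T}^{\mathrm{wt}} : \mu^T(R_{2^{-n}}(T)) \le a_n\}$

and

(6.27) $K_n := \bigcap_{m \ge n} F_m.$

By Proposition 1.11.5 and Proposition 1.10.1, each set $K_n$ is compact. By construction,

(6.28) $\mathrm{Cap}(\mathbf{T}^{\mathrm{wt}} \setminus K_n) = \mathrm{Cap}\Big(\bigcup_{m \ge n} U_{2^{-m},a_m}\Big) \le \sum_{m \ge n} \mathrm{Cap}(U_{2^{-m},a_m}) \le \sum_{m \ge n} 2^{-m} = 2^{-(n-1)}.$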

6.4. The trivial tree is essentially polar

From our informal picture of the process $X$ evolving via re-arrangements of the initial tree that preserve the total branch length, one might expect that if $X$ does not start at the trivial tree $T_0$ consisting of a single point, then $X$ will never hit $T_0$. However, an SPR move can decrease the diameter of a tree, so it is conceivable that, in passing to the limit, there is some probability that an infinite sequence of SPR moves will conspire to collapse the evolving tree down to a single point. Of course, it is hard to imagine from the approximating dynamics how $X$ could recover from such a catastrophe, which it would have to since it is reversible with respect to the continuum random tree distribution.

In this section we will use potential theory for Dirichlet forms to show that $X$ does not hit $T_0$ from $\mathbf{P}$-almost all starting points; that is, that the set $\{T_0\}$ is essentially polar.

Let $r$ be the map which sends a weighted $\mathbb{R}$-tree $(T, r, \nu)$ to the $\nu$-averaged distance between pairs of points in $T$. That is,

(6.29) $r\big((T, r, \nu)\big) := \int_T \int_T \nu(dx)\nu(dy)\, r(x, y), \quad (T, r, \nu) \in \mathbf{T}^{\mathrm{wt}}.$

In order to show that $\{T_0\}$ is essentially polar, it will suffice to show that the set

(6.30) $\{(T, r, \nu) \in \mathbf{T}^{\mathrm{wt}} : r\big((T, r, \nu)\big) = 0\}$

is essentially polar.

Lemma 6.4.1. The function r belongs to the domain D(E).

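Proof. If we let $r_n\big((T, r, \nu)\big) := \int_T \int_T \nu(dx)\nu(dy)\, [r(x, y) \wedge n]$, for $n \in \mathbb{N}$, then $r_n \uparrow r$, $\mathbf{P}$-a.s. By the triangle inequality,

(6.31) $(r, r)_{\mathbf{P}} \le \int \mathbf{P}(dT)\, (\mathrm{diam}(T))^2 \le \int \mathbf{P}(de) \Big(4 \sup_{t\in[0,1]} e(t)\Big)^2 < \infty,$

and hence $r_n \to r$ as $n \to \infty$ in $L^2(\mathbf{T}^{\mathrm{wt}}, \mathbf{P})$.

Notice, moreover, that for $(T, r, \nu) \in \mathbf{T}^{\mathrm{wt}}$ and $u, v \in T$,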

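(6.32)
$$\begin{aligned}
\Big(r\big((T, r, \nu)\big) - r\big(\Theta((T, r, \nu), u, v)\big)\Big)^2
&= \Big(2 \int_{S^{T,u,v}} \int_{T \setminus S^{T,u,v}} \nu(dx)\nu(dy)\, \big(r(y, u) - r(y, v)\big)\Big)^2 \\
&\le 2\,\nu_T(S^{T,u,v})\,\nu(T \setminus S^{T,u,v})\, r^2(u, v).
\end{aligned}$$

Hence, applying Corollary 3.4.4 and the invariance of the standard Brownian excursion under random re-rooting (see Section 2.7 of [Ald91b]),

(6.33)
$$\begin{aligned}
\int_{\mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}}} J(dT, dS)\, \big(r(T) - r(S)\big)^2
&\le 2 \int_{\mathbf{T}^{\mathrm{wt}}} \mathbf{P}(dT) \int_{T \times T} \nu_T(dv)\mu^T(du)\, \nu_T(S^{T,u,v})\, \nu_T(T \setminus S^{T,u,v})\, r_T^2(u, v) \\
&\le 2 \int \mathbf{P}(de)\, 2 \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e, s, a) - \underline{s}(e, s, a)}\, \zeta(\hat{e}^{s,a})\, \zeta(\check{e}^{s,a})\, (2a)^2 \\
&= \frac{8}{\sqrt{2\pi}} \int_0^1 \frac{d\rho}{\sqrt{(1-\rho)\rho^3}} \int \mathbf{P}(de') \otimes \mathbf{P}(de'')\, \rho(1-\rho) \big(\sup S_{1-\rho}e''\big)^2 \\
&= \frac{8}{\sqrt{2\pi}} \int_0^1 \frac{d\rho}{\sqrt{(1-\rho)\rho^3}}\, \rho(1-\rho)^2 \int \mathbf{P}(de) \Big(\sup_{t\in[0,1]} e(t)\Big)^2 \\
&< \infty.
\end{aligned}$$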
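The finiteness of the $\rho$-integral in the last line of (6.33) can be made explicit. The sketch below is an illustration only (helper names are hypothetical): the integrand $\rho(1-\rho)^2/\sqrt{(1-\rho)\rho^3}$ simplifies to $\rho^{-1/2}(1-\rho)^{3/2}$, so the integral is a Beta integral with value $\Gamma(1/2)\Gamma(5/2)/\Gamma(3) = 3\pi/8$. By contrast, the unweighted integral of $1/\sqrt{(1-\rho)\rho^3}$ diverges at $\rho = 0$, which is the infinite total mass of the jump kernel mentioned at the start of the chapter.

```python
import math

def beta(a, b):
    """Euler Beta function B(a, b) expressed through the Gamma function."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

# rho * (1-rho)^2 / sqrt((1-rho) * rho^3) = rho^(-1/2) * (1-rho)^(3/2),
# which is the integrand of B(1/2, 5/2).
closed_form = beta(0.5, 2.5)

def midpoint(f, n=100_000):
    """Composite midpoint rule for the integral of f over [0, 1]."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))

# The substitution rho = s^2 removes the singularity at 0:
# the integral equals 2 * int_0^1 (1 - s^2)^(3/2) ds.
numeric = 2.0 * midpoint(lambda s: (1.0 - s * s) ** 1.5)
```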

Consequently, by dominated convergence, $\mathcal{E}(r - r_n, r - r_n) \to 0$ as $n \to \infty$.

It is therefore enough to verify that $r_n \in \mathcal{L}$ for all $n \in \mathbb{N}$. Obviously,

(6.34) $\sup_{T \in \mathbf{T}^{\mathrm{wt}}} r_n(T) \le n,$

and so the boundedness condition (6.14) holds. To show that the “Lipschitz” property (6.15) holds, fix $\varepsilon > 0$, and let $(T, \nu_T), (S, \nu_S) \in \mathbf{T}^{\mathrm{wt}}$ be such that $\Delta_{\mathrm{GH}^{\mathrm{wt}}}\big((T, \nu_T), (S, \nu_S)\big) < \varepsilon$. Then there exist $f \in F^\varepsilon_{T,S}$ and $g \in F^\varepsilon_{S,T}$ such that $d_{\mathrm{P}}(\nu_T, g_*\nu_S) < \varepsilon$ and $d_{\mathrm{P}}(f_*\nu_T, \nu_S) < \varepsilon$ (recall $F^\varepsilon_{T,S}$ from (1.55)). Hence

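(6.35)
$$\begin{aligned}
\big|r_n\big((T, \nu_T)\big) - r_n\big((S, \nu_S)\big)\big|
&\le \Big|\int_T \int_T \nu_T(dx)\nu_T(dy)\, (r_T(x, y) \wedge n) - \int_{g(S)} \int_{g(S)} g_*\nu_S(dx)\, g_*\nu_S(dy)\, (r_T(x, y) \wedge n)\Big| \\
&\quad + \Big|\int_{g(S)} \int_{g(S)} g_*\nu_S(dx)\, g_*\nu_S(dy)\, (r_T(x, y) \wedge n) - \int_S \int_S \nu_S(dx')\nu_S(dy')\, (r_S(x', y') \wedge n)\Big|.
\end{aligned}$$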

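For the first term on the right hand side of (6.35) we get

(6.36)
$$\begin{aligned}
&\Big|\int_T \int_T \nu_T(dx)\nu_T(dy)\, (r_T(x, y) \wedge n) - \int_{g(S)} \int_{g(S)} g_*\nu_S(dx)\, g_*\nu_S(dy)\, (r_T(x, y) \wedge n)\Big| \\
&\quad \le \Big|\int_T \int_T \nu_T(dx)\nu_T(dy)\, (r_T(x, y) \wedge n) - \int_T \int_{g(S)} \nu_T(dx)\, g_*\nu_S(dy)\, (r_T(x, y) \wedge n)\Big| \\
&\qquad + \Big|\int_{g(S)} \int_T g_*\nu_S(dx)\, \nu_T(dy)\, (r_T(x, y) \wedge n) - \int_{g(S)} \int_{g(S)} g_*\nu_S(dx)\, g_*\nu_S(dy)\, (r_T(x, y) \wedge n)\Big|.
\end{aligned}$$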

By assumption and Theorem 3.1.2 in [EK86], we can find a probability measure $\nu$ on $T \times T$ with marginals $\nu_T$ and $g_*\nu_S$ such that

(6.37) $\nu\{(x, y) : r_T(x, y) \ge \varepsilon\} \le \varepsilon.$

Hence, for all $x \in T$,

(6.38)
$$\begin{aligned}
\Big|\int_T \nu_T(dy)\, (r_T(x, y) \wedge n) - \int_{g(S)} g_*\nu_S(dy)\, (r_T(x, y) \wedge n)\Big|
&\le \int_{T \times g(S)} \nu\big(d(y, y')\big)\, \big|(r_T(x, y) \wedge n) - (r_T(x, y') \wedge n)\big| \\
&\le \int_{T \times g(S)} \nu\big(d(y, y')\big)\, (r_T(y, y') \wedge n) \\
&\le \big(1 + (\mathrm{diam}(T) \wedge n)\big) \cdot \varepsilon.
\end{aligned}$$

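For the second term in (6.35) we use the fact that $g$ is an $\varepsilon$-isometry, that is, $|(r_S(x', y') \wedge n) - (r_T(g(x'), g(y')) \wedge n)| < \varepsilon$ for all $x', y' \in S$. A change of variables then yields that

\[
\begin{aligned}
&\Bigl| \int_{g(S)} \int_{g(S)} g_*\nu_S(dx)\,g_*\nu_S(dy)\,(r_T(x,y) \wedge n) - \int_S \int_S \nu_S(dx')\,\nu_S(dy')\,(r_S(x',y') \wedge n) \Bigr| \\
&\quad \le \varepsilon + \Bigl| \int_{g(S)} \int_{g(S)} g_*\nu_S(dx)\,g_*\nu_S(dy)\,(r_T(x,y) \wedge n) - \int_S \int_S \nu_S(dx')\,\nu_S(dy')\,(r_T(g(x'),g(y')) \wedge n) \Bigr| \\
&\quad = \varepsilon.
\end{aligned}
\tag{6.39}
\]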

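Combining (6.35) through (6.39) yields finally that

\[ \sup_{(T,\nu_T) \ne (S,\nu_S) \in \mathbf{T}^{\mathrm{wt}}} \frac{\bigl| r_n((T, \nu_T)) - r_n((S, \nu_S)) \bigr|}{\Delta_{\mathrm{GH}^{\mathrm{wt}}}\bigl((T, \nu_T), (S, \nu_S)\bigr)} \le 3 + 2n. \tag{6.40} \]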

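Proposition 6.4.2. The set $\{T \in \mathbf{T}^{\mathrm{wt}} : r(T) = 0\}$ is essentially polar. In particular, the set $\{T_0\}$ consisting of the trivial tree is essentially polar.

Proof. We need to show that $\mathrm{Cap}(\{T \in \mathbf{T}^{\mathrm{wt}} : r(T) = 0\}) = 0$ (see Theorem 4.2.1 of [FOT94]).

For $\varepsilon > 0$ set

\[ W_\varepsilon := \{T \in \mathbf{T}^{\mathrm{wt}} : r(T) < \varepsilon\}. \tag{6.41} \]

By the argument in the proof of Lemma 6.4.1, the function $r$ is continuous, and so $W_\varepsilon$ is open. It suffices to show that $\mathrm{Cap}(W_\varepsilon) \downarrow 0$ as $\varepsilon \downarrow 0$.

Put

\[ u_\varepsilon(T) := \Bigl(2 - \frac{r(T)}{\varepsilon}\Bigr)_+, \qquad T \in \mathbf{T}^{\mathrm{wt}}. \tag{6.42} \]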


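Then $u_\varepsilon \in D(\mathcal{E})$ by Lemma 6.4.1 and the fact that the domain of a Dirichlet form is closed under composition with Lipschitz functions. Because $u_\varepsilon(T) \ge 1$ for $T \in W_\varepsilon$, it thus further suffices to show

\[ \lim_{\varepsilon \downarrow 0} \bigl(\mathcal{E}(u_\varepsilon, u_\varepsilon) + (u_\varepsilon, u_\varepsilon)_{\mathbb{P}}\bigr) = 0. \tag{6.43} \]

By elementary properties of the standard Brownian excursion,

\[ (u_\varepsilon, u_\varepsilon)_{\mathbb{P}} \le 4\,\mathbb{P}\{T : r(T) < 2\varepsilon\} \to 0 \tag{6.44} \]

as $\varepsilon \downarrow 0$. Estimating $\mathcal{E}(u_\varepsilon, u_\varepsilon)$ will be somewhat more involved.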

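Let $\hat{E}$ and $\bar{E}$ be two independent standard Brownian excursions, and let $U$ and $V$ be two independent random variables that are independent of $\hat{E}$ and $\bar{E}$ and uniformly distributed on $[0, 1]$. With a slight abuse of notation, we will write $\mathbb{P}$ for the probability measure on the probability space where $\hat{E}$, $\bar{E}$, $U$ and $V$ are defined.

Set

\[
\begin{aligned}
\hat{D} &:= 4 \int_{0 \le s < t \le 1} ds \otimes dt\, \Bigl[\hat{E}_s + \hat{E}_t - 2 \inf_{s \le w \le t} \hat{E}_w\Bigr], \\
\hat{H} &:= 2 \int_{[0,1]} dt\, \hat{E}_t, \\
\bar{D} &:= 4 \int_{0 \le s < t \le 1} ds \otimes dt\, \Bigl[\bar{E}_s + \bar{E}_t - 2 \inf_{s \le w \le t} \bar{E}_w\Bigr], \\
\bar{H}_U &:= 2 \int_{[0,1]} dt\, \Bigl[\bar{E}_t + \bar{E}_U - 2 \inf_{U \wedge t \le w \le U \vee t} \bar{E}_w\Bigr], \\
\bar{H}_V &:= 2 \int_{[0,1]} dt\, \Bigl[\bar{E}_t + \bar{E}_V - 2 \inf_{V \wedge t \le w \le V \vee t} \bar{E}_w\Bigr].
\end{aligned}
\tag{6.45}
\]

For $0 \le \rho \le 1$ set

\[
D_U(\rho) := (1-\rho)^2 \sqrt{1-\rho}\, \bar{D} + \rho^2 \sqrt{\rho}\, \hat{D} + 2(1-\rho)\rho \sqrt{\rho}\, \hat{H} + 2(1-\rho)\rho \sqrt{1-\rho}\, \bar{H}_U
\tag{6.46}
\]

and

\[
D_V(\rho) := (1-\rho)^2 \sqrt{1-\rho}\, \bar{D} + \rho^2 \sqrt{\rho}\, \hat{D} + 2(1-\rho)\rho \sqrt{\rho}\, \hat{H} + 2(1-\rho)\rho \sqrt{1-\rho}\, \bar{H}_V.
\tag{6.47}
\]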

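Then

\[
\mathcal{E}(u_\varepsilon, u_\varepsilon) = \frac{1}{2\sqrt{2\pi}}\, \mathbb{P}\Biggl[ \int_0^1 \frac{d\rho}{\sqrt{(1-\rho)\rho^3}} \Bigl\{ \Bigl(2 - \frac{D_U(\rho)}{\varepsilon}\Bigr)_+ - \Bigl(2 - \frac{D_V(\rho)}{\varepsilon}\Bigr)_+ \Bigr\}^2 \Biggr].
\tag{6.48}
\]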

Fix $0 < a < \tfrac{1}{2}$ and write $\bar{a} = 1 - a$ for convenience. We can write the right-hand side of (6.48) as the sum of three terms $I(\varepsilon, a)$, $II(\varepsilon, a)$, and $III(\varepsilon, a)$, that arise from integrating $\rho$ over the respective ranges

\[ \{\rho : D_U(\rho) \vee D_V(\rho) \le 2\varepsilon,\ 0 \le \rho \le a\}, \tag{6.49} \]

\[ \{\rho : D_U(\rho) \wedge D_V(\rho) \le 2\varepsilon \le D_U(\rho) \vee D_V(\rho),\ 0 \le \rho \le a\}, \tag{6.50} \]

and

\[ \{\rho : a < \rho \le 1\}. \tag{6.51} \]

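Consider $I(\varepsilon, a)$ first. Note that if $D_U(\rho) \vee D_V(\rho) \le 2\varepsilon$, then

\[
\Bigl\{ \Bigl(2 - \frac{D_U(\rho)}{\varepsilon}\Bigr)_+ - \Bigl(2 - \frac{D_V(\rho)}{\varepsilon}\Bigr)_+ \Bigr\}^2 \le \frac{2^2 \rho^2}{\varepsilon^2} \bigl\{\bar{H}_U - \bar{H}_V\bigr\}^2.
\tag{6.52}
\]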

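Moreover,

\[
\begin{aligned}
&\{0 \le \rho \le a : D_U(\rho) \vee D_V(\rho) \le 2\varepsilon\} \\
&\quad \subseteq \bigl\{0 \le \rho \le a : (1-\rho)^{5/2} \bar{D} + 2(1-\rho)^{3/2} \rho\, (\bar{H}_U \vee \bar{H}_V) \le 2\varepsilon\bigr\} \\
&\quad \subseteq \bigl\{0 \le \rho \le a : \bar{a}^{5/2} \bar{D} + 2\bar{a}^{3/2} \rho\, (\bar{H}_U \vee \bar{H}_V) \le 2\varepsilon\bigr\} \\
&\quad = \Bigl\{\rho : 0 \le \rho \le \frac{(2\varepsilon - \bar{a}^{5/2} \bar{D})_+}{2\bar{a}^{3/2} (\bar{H}_U \vee \bar{H}_V)} \wedge a\Bigr\}.
\end{aligned}
\tag{6.53}
\]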

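Thus $I(\varepsilon, a)$ is bounded above by the expectation of the random variable that arises from integrating $2^2 \rho^2 \{\bar{H}_U - \bar{H}_V\}^2 / \varepsilon^2$ against the measure $\frac{1}{2\sqrt{2\pi}} \frac{d\rho}{\sqrt{(1-\rho)\rho^3}}$ over the interval $\bigl[0, (2\varepsilon - \bar{a}^{5/2} \bar{D})_+ / (2\bar{a}^{3/2} (\bar{H}_U \vee \bar{H}_V))\bigr]$. Note that

\[ \int_0^x \frac{d\rho}{\sqrt{\rho^3}}\, \rho^\alpha = \frac{1}{\alpha - \frac{1}{2}}\, x^{\alpha - \frac{1}{2}}, \qquad \alpha > \frac{1}{2}. \tag{6.54} \]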

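Hence, letting $C$ denote a generic constant with a value that doesn’t depend on $\varepsilon$ or $a$ and may change from line to line,

\[
\begin{aligned}
I(\varepsilon, a) &\le C\, \mathbb{P}\Biggl[ \Bigl( \frac{(2\varepsilon - \bar{a}^{5/2} \bar{D})_+}{\bar{H}_U \vee \bar{H}_V} \Bigr)^{3/2} \frac{\{\bar{H}_U - \bar{H}_V\}^2}{\varepsilon^2} \Biggr] \\
&\le \frac{C}{\varepsilon^2}\, \mathbb{P}\Bigl[ (2\varepsilon - \bar{a}^{5/2} \bar{D})_+^{3/2}\, (\bar{H}_U \vee \bar{H}_V)^{1/2} \Bigr] \\
&\le \frac{C}{\varepsilon^{1/2}}\, \mathbb{P}\Bigl[ (\bar{H}_U + \bar{H}_V)^{1/2}\, \mathbf{1}\{\bar{D} \le 2\bar{a}^{-5/2} \varepsilon\} \Bigr] \\
&\le \frac{C}{\varepsilon^{1/2}}\, \mathbb{P}\Bigl[ \bar{D}^{1/2}\, \mathbf{1}\{\bar{D} \le 2\bar{a}^{-5/2} \varepsilon\} \Bigr] \\
&\le C\, \mathbb{P}\{\bar{D} \le 2\bar{a}^{-5/2} \varepsilon\},
\end{aligned}
\tag{6.55}
\]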

where in the second-to-last line we used the fact that

\[ \mathbb{P}[\bar{H}_U \mid \bar{E}] = \mathbb{P}[\bar{H}_V \mid \bar{E}] = \bar{D}, \tag{6.56} \]

and Jensen’s inequality for conditional expectations to obtain the inequalities $\mathbb{P}[\bar{H}_U^{1/2} \mid \bar{E}] \le \bar{D}^{1/2}$ and $\mathbb{P}[\bar{H}_V^{1/2} \mid \bar{E}] \le \bar{D}^{1/2}$. Thus, $\lim_{\varepsilon \downarrow 0} I(\varepsilon, a) = 0$ for any value of $a$.

Turning to $II(\varepsilon, a)$, first note that $\hat{D} \le 4\hat{H}$ and, by the triangle inequality,

\[ \bar{D} \le 2(\bar{H}_U \wedge \bar{H}_V). \tag{6.57} \]

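Hence, for some constant $K$ that does not depend on $\varepsilon$ or $a$,

\[ |D_U(\rho) \wedge D_V(\rho) - \bar{D}| \le K\bigl(\hat{H} \rho^{3/2} + (\bar{H}_U \wedge \bar{H}_V)\rho\bigr) \tag{6.58} \]

and

\[ |D_U(\rho) \vee D_V(\rho) - \bar{D}| \le K\bigl(\hat{H} \rho^{3/2} + (\bar{H}_U \vee \bar{H}_V)\rho\bigr). \tag{6.59} \]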

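Combining (6.59) with an argument similar to that which established (6.53) gives, for a suitable constant $K^*$,

\[
\begin{aligned}
&\{0 \le \rho \le a : D_U(\rho) \wedge D_V(\rho) \le 2\varepsilon \le D_U(\rho) \vee D_V(\rho)\} \\
&\quad = \{0 \le \rho \le a : 2\varepsilon \le D_U(\rho) \vee D_V(\rho)\} \cap \{0 \le \rho \le a : D_U(\rho) \wedge D_V(\rho) \le 2\varepsilon\} \\
&\quad \subseteq \Bigl\{\rho : \frac{(2\varepsilon - \bar{D})_+}{K^*(\hat{H} + \bar{H}_U \vee \bar{H}_V)} \le \rho \le a\Bigr\} \cap \Bigl\{\rho : 0 \le \rho \le \frac{(2\varepsilon - \bar{a}^{5/2} \bar{D})_+}{2\bar{a}^{3/2} (\bar{H}_U \wedge \bar{H}_V)} \wedge a\Bigr\}.
\end{aligned}
\tag{6.60}
\]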

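Moreover, by (6.58) and the observation $|(2\varepsilon - x)_+ - (2\varepsilon - y)_+| \le |x - y|$, we have for $D_U(\rho) \wedge D_V(\rho) \le 2\varepsilon \le D_U(\rho) \vee D_V(\rho)$ that

\[
\begin{aligned}
\Bigl\{ \Bigl(2 - \frac{D_U(\rho)}{\varepsilon}\Bigr)_+ - \Bigl(2 - \frac{D_V(\rho)}{\varepsilon}\Bigr)_+ \Bigr\}^2
&= \Bigl(2 - \frac{D_U(\rho) \wedge D_V(\rho)}{\varepsilon}\Bigr)_+^2 \\
&\le \frac{2}{\varepsilon^2} (2\varepsilon - \bar{D})_+^2 + \frac{2}{\varepsilon^2} \bigl\{ (2\varepsilon - D_U(\rho) \wedge D_V(\rho))_+ - (2\varepsilon - \bar{D})_+ \bigr\}^2 \\
&\le \frac{2}{\varepsilon^2} (2\varepsilon - \bar{D})_+^2 + \frac{2}{\varepsilon^2} \bigl\{ D_U(\rho) \wedge D_V(\rho) - \bar{D} \bigr\}^2 \\
&\le \frac{C}{\varepsilon^2} \bigl[ (2\varepsilon - \bar{D})_+^2 + \hat{H}^2 \rho^3 + (\bar{H}_U \wedge \bar{H}_V)^2 \rho^2 \bigr]
\end{aligned}
\tag{6.61}
\]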

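for a suitable constant $C$ that doesn’t depend on $\varepsilon$ or $a$. It follows from (6.54) and

\[ \int_x^a \frac{d\rho}{\sqrt{\rho^3}}\, \rho^\beta = \frac{1}{\frac{1}{2} - \beta} \Bigl[ x^{\beta - \frac{1}{2}} - a^{\beta - \frac{1}{2}} \Bigr], \qquad \beta < \frac{1}{2}, \tag{6.62} \]

that

\[
\begin{aligned}
II(\varepsilon, a) \le{} & \frac{C'}{\varepsilon^2}\, \mathbb{P}\Biggl[ (2\varepsilon - \bar{D})_+^2 \Bigl( \frac{(2\varepsilon - \bar{D})_+}{\hat{H} + \bar{H}_U \vee \bar{H}_V} \Bigr)^{-1/2} \Biggr] \\
& + \frac{C''}{\varepsilon^2}\, \mathbb{P}\Biggl[ \hat{H}^2 \Bigl( \frac{(2\varepsilon - \bar{a}^{5/2} \bar{D})_+}{2\bar{a}^{3/2} (\bar{H}_U \wedge \bar{H}_V)} \wedge a \Bigr)^{5/2} \Biggr] \\
& + \frac{C'''}{\varepsilon^2}\, \mathbb{P}\Biggl[ (\bar{H}_U \wedge \bar{H}_V)^2 \Bigl( \frac{(2\varepsilon - \bar{a}^{5/2} \bar{D})_+}{2\bar{a}^{3/2} (\bar{H}_U \wedge \bar{H}_V)} \wedge a \Bigr)^{3/2} \Biggr]
\end{aligned}
\tag{6.63}
\]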

for suitable constants $C'$, $C''$, and $C'''$.

Consider the first term in (6.63). Using Jensen’s inequality for conditional expectations and (6.56) again, this term is bounded above by

\[
\frac{1}{\varepsilon^2}\, \mathbb{P}\Bigl[ (2\varepsilon - \bar{D})_+^{3/2} \bigl(A \bar{D}^{1/2} + B\bigr) \Bigr] \le \frac{1}{\varepsilon^2}\, \mathbb{P}\Bigl[ (2\varepsilon - \bar{D})_+^{3/2} \Bigr] \bigl(2^{1/2} A \varepsilon^{1/2} + B\bigr)
\tag{6.64}
\]

for suitable constants $A, B$. Now, by Jensen’s inequality for conditional expectation yet again, along with the invariance of standard Brownian excursion under random re-rooting (see Section 2.7 of [Ald91b]) and the fact that

\[ \mathbb{P}\{\bar{E}_U \in dr\} = r e^{-\frac{r^2}{2}}\, dr \tag{6.65} \]

(see Section 3.3 of [Ald91b]), we have

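\[
\begin{aligned}
\mathbb{P}\Bigl[ (2\varepsilon - \bar{D})_+^{3/2} \Bigr]
&= \mathbb{P}\Biggl[ \Bigl( \mathbb{P}\Bigl[ 2\varepsilon - 2\Bigl\{ \bar{E}_U + \bar{E}_V - 2 \inf_{U \wedge V \le t \le U \vee V} \bar{E}_t \Bigr\} \,\Big|\, \bar{E} \Bigr] \Bigr)_+^{3/2} \Biggr] \\
&\le \mathbb{P}\Biggl[ \Bigl( 2\varepsilon - 2\Bigl\{ \bar{E}_U + \bar{E}_V - 2 \inf_{U \wedge V \le t \le U \vee V} \bar{E}_t \Bigr\} \Bigr)_+^{3/2} \Biggr] \\
&= \mathbb{P}\Bigl[ (2\varepsilon - 2\bar{E}_U)_+^{3/2} \Bigr] = \int_0^\infty dr\, r e^{-\frac{r^2}{2}} (2\varepsilon - 2r)_+^{3/2} \\
&\le \int_0^\varepsilon dr\, r\, (2\varepsilon - 2r)^{3/2} = 2^{3/2} \varepsilon^{7/2} \int_0^1 ds\, s(1-s)^{3/2}.
\end{aligned}
\tag{6.66}
\]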
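The final integral in (6.66) is a Beta integral, $\int_0^1 s(1-s)^{3/2}\,ds = B(2, 5/2) = 4/35$, so the bound is of order $\varepsilon^{7/2}$ and still tends to zero after division by $\varepsilon^2$. The following Python sketch checks the constant numerically; the helper names `beta`, `midpoint_integral` and `first_term_scale` are illustrative choices made here, and the constants $A$, $B$ of (6.64) are deliberately left out.

```python
import math


def beta(a, b):
    """Euler Beta function B(a, b) expressed through Gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def midpoint_integral(f, lo, hi, n=20_000):
    """Plain midpoint rule; accurate enough for a sanity check."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


# Closed form of the last integral in (6.66).
closed_form = beta(2.0, 2.5)

# Independent numerical evaluation of the same integral.
numeric = midpoint_integral(lambda s: s * (1.0 - s) ** 1.5, 0.0, 1.0)


def first_term_scale(eps):
    """2^{3/2} eps^{7/2} B(2, 5/2) / eps^2, i.e. the eps-dependence of the bound
    on the first term of (6.63), with the constants A and B omitted."""
    return 2.0 ** 1.5 * eps ** 3.5 * closed_form / eps ** 2
```

Evaluating `first_term_scale` along a sequence such as `1e-1, 1e-2, 1e-3` shows the $\varepsilon^{3/2}$ decay that the argument relies on.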

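Thus the limit as $\varepsilon \downarrow 0$ of the first term in (6.63) is 0 for each $a$.

For the second term in (6.63), first observe by Jensen’s inequality for conditional expectation and (6.65) that

\[
\mathbb{P}\{\bar{D} \le r\} \le \mathbb{P}\Bigl[ \Bigl(2 - \frac{\bar{D}}{r}\Bigr)_+ \Bigr] \le \mathbb{P}\Bigl[ \Bigl(2 - \frac{\bar{E}_U}{r}\Bigr)_+ \Bigr] \le 2\,\mathbb{P}\{\bar{E}_U \le 2r\} \le 2\, \frac{(2r)^2}{2} = 4r^2.
\tag{6.67}
\]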


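Combining this observation with (6.57) and integrating by parts gives

\[
\begin{aligned}
\mathbb{P}\Biggl[ \Bigl( \frac{(2\varepsilon - \bar{a}^{5/2} \bar{D})_+}{2\bar{a}^{3/2} (\bar{H}_U \wedge \bar{H}_V)} \wedge a \Bigr)^{5/2} \Biggr]
&\le \mathbb{P}\Biggl[ \Bigl( \frac{(2\varepsilon - \bar{a}^{5/2} \bar{D})_+}{\bar{a}^{3/2} \bar{D}} \wedge a \Bigr)^{5/2} \Biggr] \\
&= \int_0^{2\varepsilon/\bar{a}^{5/2}} \mathbb{P}\{\bar{D} \in dr\}\, \Bigl( \bar{a}^{-3/2} \Bigl( \frac{2\varepsilon}{r} - \bar{a}^{5/2} \Bigr) \wedge a \Bigr)^{5/2} \\
&\le \int_{2\varepsilon/(a \bar{a}^{1/2} + \bar{a}^{5/2})}^{2\varepsilon/\bar{a}^{5/2}} dr\, 4r^2\, \bar{a}^{-15/4}\, \frac{5}{2} \Bigl( \frac{2\varepsilon}{r} - \bar{a}^{5/2} \Bigr)^{3/2} \frac{2\varepsilon}{r^2} \\
&= 40\, \varepsilon^2\, \bar{a}^{-15/4} \int_{1/(a \bar{a}^{1/2} + \bar{a}^{5/2})}^{1/\bar{a}^{5/2}} ds\, \Bigl( \frac{1}{s} - \bar{a}^{5/2} \Bigr)^{3/2}.
\end{aligned}
\tag{6.68}
\]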

If we denote the rightmost term by $L(\varepsilon, a)$, then it is clear that

\[ \lim_{a \downarrow 0} \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon^2} L(\varepsilon, a) = 0. \tag{6.69} \]

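From (6.56) and Jensen’s inequality for conditional expectations, the third term in (6.63) is bounded above by

\[
\frac{C}{\varepsilon^2}\, \mathbb{P}\Bigl[ (\bar{H}_U \wedge \bar{H}_V)^{1/2} (2\varepsilon - \bar{a}^{5/2} \bar{D})_+^{3/2} \Bigr]
\le \frac{C}{\varepsilon^2}\, \mathbb{P}\Bigl[ \bar{D}^{1/2} (2\varepsilon - \bar{a}^{5/2} \bar{D})_+^{3/2} \Bigr]
\le \frac{C}{\varepsilon^{3/2}}\, \mathbb{P}\Bigl[ (2\varepsilon - \bar{a}^{5/2} \bar{D})_+^{3/2} \Bigr],
\tag{6.70}
\]

and the calculation in (6.66) shows that the rightmost term converges to zero as $\varepsilon \downarrow 0$ for each $a$.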

Putting together the observations we have made on the three terms in (6.63), we see that

\[ \lim_{a \downarrow 0} \lim_{\varepsilon \downarrow 0} II(\varepsilon, a) = 0. \tag{6.71} \]

It follows from the dominated convergence theorem that

\[ \lim_{\varepsilon \downarrow 0} III(\varepsilon, a) = 0 \tag{6.72} \]

for all $a$, and this completes the proof.


CHAPTER 7

Tree-valued Fleming-Viot dynamics

The measure-valued Fleming-Viot process is a diffusion which models the evolution of allele frequencies in a multi-type population of constant size. In the neutral setting the Kingman coalescent is known to generate the genealogies of the “individuals” in the population at a fixed time. In this chapter we replace this static and backward point of view on the genealogies by an analysis of the forward evolution of genealogies.

We use techniques from martingale problems to construct the so-called tree-valued Fleming-Viot dynamics in Section 7.1. A duality relation of the tree-valued Fleming-Viot dynamics to the tree-valued Kingman coalescent is given in Section 7.2. Similar to the measure-valued processes, Fleming-Viot dynamics can be approximated by the tree-valued finite population models, which are the tree-valued Moran dynamics constructed in Section 7.3. We show in Sections 7.4 and 7.5 that these discrete tree-valued dynamics have a limit point and characterize the limit as the tree-valued Fleming-Viot dynamics. We then conclude in Section 7.6 that the tree-valued Fleming-Viot dynamics have continuous sample paths and collect all the technical statements verified so far to give the proof of the two main theorems about the well-posedness of the tree-valued Fleming-Viot martingale problem and its particle approximation in Section 7.7.

The next two sections are devoted to the connection to known results. In Section 7.8 we derive the Kingman coalescent measure tree as the equilibrium and in Section 7.9 we recover the measure-valued from the tree-valued Fleming-Viot dynamics.

All the theory developed so far relied on the consistency property of the dual process. This can be extended to more general resampling mechanisms which are dual to more general Λ-coalescents. In Section 7.10 we discuss generalizations and possible extensions of the model.

As an application we study the evolution of the distribution of tree lengths of finite samples in Section 7.11. We show in particular that the distribution of the length of trees spanned by sequentially sampled “individuals” evolves as an autonomous Markov process.

7.1. The tree-valued Fleming-Viot martingale problem

In this section we define the tree-valued Fleming-Viot dynamics as the solution of a well-posed martingale problem. We start by recalling the terminology.


Definition 7.1.1 (Martingale problem). Let $(E, \mathcal{O})$ be a Polish space, $\mathbf{P}_0 \in \mathcal{M}_1(E)$, $\Pi$ a separating algebra of continuous functions on $E$ and $\Omega$ a linear operator on $\Pi$.

The law $\mathbf{P}$ of an $E$-valued stochastic process $X = (X_t)_{t \ge 0}$ is called a solution of the $(\mathbf{P}_0, \Omega, \Pi)$-martingale problem (or a solution of the martingale problem for $\Omega$) if $X_0$ has distribution $\mathbf{P}_0$, $X$ has paths in $D_E(\mathbb{R}_+)$ and for all $\Phi \in \Pi$,

\[ \Bigl( \Phi(X_t) - \Phi(X_0) - \int_0^t ds\, \Omega\Phi(X_s) \Bigr)_{t \ge 0} \tag{7.1} \]

is a $\mathbf{P}$-martingale with respect to the canonical filtration.

The probability measure-valued Fleming-Viot process is a model in population genetics which describes the evolution of allelic frequencies. In particular, for a fixed time $t$, the state $\mu_t \in \mathcal{M}_1(K)$ records the current distribution of allelic types on some (complete and separable metric) type space $K$.

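The measure-valued Fleming-Viot process is the unique solution of a martingale problem for the generator defined on functions $F : \mathcal{M}_1(K) \to \mathbb{R}$, given by

\[ \Omega^{\mathrm{FV}} F(\mu) = \frac{\gamma}{2} \int_K \int_K \bigl( \mu(du)\,\delta_u(dv) - \mu(du)\,\mu(dv) \bigr) \frac{\partial^2 F}{\partial \nu^2}[\delta_u, \delta_v], \tag{7.2} \]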

where

(7.3)∂F

∂ν[ν] := lim

ε→0

1

ε

(F (ν + εν)− F (ν)

),

and

(7.4)∂2F

∂ν2[ν1, ν2] :=

∂ν

(∂F (ν)∂ν

[ν1])[ν2],

whenever these limits exist. Here and in the following $\gamma \in (0, \infty)$ is referred to as the resampling rate.

We next want to lift this construction on the level of trees and therebyconstruct the U-valued Fleming-Viot dynamics. Recall first that a functionF on U is measure preserving isometric if and only if F is of the form (2.3)

for a function F : M1(R(N2)+ ) → R. To lift the construction of the measure-

valued Fleming-Viot process on the level of trees and thereby construct theU-valued Fleming-Viot dynamics, we consider the set

(7.5)F :=

F = F F ∈ B(U) of the form (2.3) :

∀k, l ∈ N, ∀U ∈ U; ∃ ∂∂rF (U),

∂νF (U)[k],

∂2

∂ν2F (U)[k, l]

,

with


• Distance derivative.

(7.6)∂

∂rF (U) := lim

ε↓0

1

ε

(F((σε1)∗ν

U)− F(νU))

where 1 ∈ R(N2)

+ denotes the matrix whose entrees are all 1, and for

each q ∈ R(N2)

+ , σq : R(N2)+ → R(

N2)

+ denotes the shift by q, i.e.,

(7.7) σq(r)= r + q,

• Measure derivative.

(7.8)∂

∂νF (U)[k] :=

∂νF (νU)[(θ1,k)∗ν

U ],

for each k ∈ 1, 2, ..., with the replacement operator

(7.9) θk,l(ri,j)1≤i<j :=

ri,j , if i, j = l,

ri,k, if j = l,

rk,j , if i = l,

,

and therefore for each k, l ∈ 1, 2, ..., ,

(7.10)∂2

∂ν2F (U)[k, l] :=

∂2

∂ν2F (νU)[(θ1,k)∗ν

U , (θ1,l)∗νU ].
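To make the replacement operator $\theta_{k,l}$ of (7.9) concrete: it leaves all distances not involving individual $l$ unchanged and replaces every distance to $l$ by the corresponding distance to $k$, so that $l$ becomes a copy of $k$. The Python sketch below applies this to a finite distance matrix. It is an illustration under assumptions made here, not the text's formalism: the function name `replace_individual` is hypothetical, indices are 0-based, and the full symmetric matrix is stored instead of the upper-triangular array $(r_{i,j})_{1 \le i < j}$.

```python
def replace_individual(r, k, l):
    """Return a new distance matrix in which individual l is a copy of individual k.

    r is a symmetric list-of-lists with zero diagonal (assumed storage format);
    k and l are 0-based indices. Entries not involving l are left unchanged.
    """
    n = len(r)
    out = [row[:] for row in r]
    for i in range(n):
        if i != l:
            # New distance between i and l is the old distance between i and k.
            out[i][l] = r[i][k]
            out[l][i] = r[i][k]
    out[l][l] = 0.0
    return out


# Three individuals: 0 and 1 are at distance 2, both at distance 4 from 2.
example = [[0.0, 2.0, 4.0],
           [2.0, 0.0, 4.0],
           [4.0, 4.0, 0.0]]
resampled = replace_individual(example, 0, 2)
```

After the call, individual 2 sits at distance 0 from individual 0 and inherits its distance 2 to individual 1, which is the effect of a single resampling event on the genealogical distances.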

Remark 7.1.2 (Measure derivative revisited).(i) By exchangeability of the distance matrix distribution,

(7.11) (ρ≥2 θ1,k)∗νU = νU

for all 1 ≤ k, l. Hence ∂∂ν F ρ≥2(ν

U)[k] does not depend on thedirection [k].

(ii) If the metric measure space (U, r, µ) is a representative of U ∈ U,then the direction measure derivatives are defined such that foreach k ∈ 1, 2, ...,

(7.12)∂

∂νF ρ≥2(ν

U)[k] :=

∫Uµ(du)

∂νF (U,r)(µ)[δu]

with F (U,r)(µ) := F ((U, r, µ)).

We therefore consider the martingale problem associated with the oper-ator Ω↑ acting on F where

(7.13) Ω↑F := Ω↑,growF +Ω↑,resF

with

(7.14) Ω↑,growF(U):= 2

∂rF (U),


and

(7.15)

Ω↑,resF(U)

:=γ

2

∑1≤k,l

∂2

∂ν2F ρ≥2(ν

U)[k, l]− ∂2

∂ν2F ρ≥2(ν

U)[k, k]

Here $\Omega^{\uparrow,\mathrm{res}}$ certainly describes the resampling while the additional deterministic part $\Omega^{\uparrow,\mathrm{grow}}$ incorporates the fact that the population gets older and therefore the genealogical distances grow at speed 2 as time goes on.

We first note that the above differential operators are defined for a nice class of functions and give explicit formulas allowing for example to read off the quadratic variation. To prepare this, we introduce some more notation.

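First, let for $n \in \mathbb{N}$ and a test function $\phi \in C^1(\mathbb{R}_+^{\binom{\mathbb{N}}{2}})$,

\[ \mathrm{div}(\phi) := \sum_{1 \le i < j} \frac{\partial}{\partial r_{i,j}} \phi \tag{7.16} \]

denote the divergence of $\phi$. Moreover, we write $\mathrm{Id} : \mathbb{R} \to \mathbb{R}$ for the identity map, i.e.,

\[ \mathrm{Id} : x \mapsto x. \tag{7.17} \]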

In particular, if

(7.18) F0 :=F = F f,ψρ≤n ∈ F : f ∈ C2(R), n ∈ N, ψ ∈ C1(R(

n2)

+ ).

it is clear that Π1 = F = F Id,ψρ≤n ; n ∈ N, ψ ∈ C1(R(n2)

+ ) ⊂ F0.

Proposition 7.1.3 (Action on functions of monomials). For all func-

tions F = Fn,ϕ ∈ F0,

(7.19) Ω↑,growF f,ϕ := 2 · F f ′,ϕ · F Id,div(ϕ)

and

(7.20)

Ω↑,resF f,ϕ

:= γF f′,ϕ ·

∑1≤k<l

(F Id,ϕθk,l − F Id,ϕ

)+γ

2F f

′′,ϕ ·∑

1≤k<lF Id,⟨ϕ⟩k,l ,

where for ϕ ∈ C(R(N2)

+ ),

(7.21) ⟨ϕ, ϕ⟩k,l := 1|k − l| is odd

·(ϕ, ϕ) θk,l − (ϕ, ϕ)

and for ϕ1, ϕ2 ∈ C(R(

N2)

+ ),

(7.22) (ϕ, ψ)(p):= ϕ

((p2i−1,2j−1)1≤i<j≤N

)· ψ((p2i,2j)1≤i<j≤N

).


Proof. Fix F = F f,ϕ ∈ F . The statement for the growth operator isobvious by definition. For the resampling operator, note that for all k, l ≥ 1,

(7.23)

∂νF (U)[k] = lim

ε→0

1

ε

f( ∫

d(νU + ε(θ1,k)∗νU)ϕ

)− f

( ∫dνU ϕ

)= f ′

( ∫dνU ϕ

)· f( ∫

d(θ1,k)∗νU ϕ),

and therefore with ϕ = ψ ρ≥2

(7.24)∂2

∂ν2F f,ψρ≥2(U)[k, l] =

∂ν

( ∂∂νF f,ψρ≥2(·)[k]

)(U)[l]

=∂

∂ν

(F f

′,ϕ ·∫

d(θ1,k)∗νU ψ ρ≥2

)(U)[l]

= F f′′,ϕ ·

∫dνU(ϕ, ϕ) θk,l + F f

′,ϕ ·∫

dνU ϕ θk,l

where we have used that

(7.25)

∫dνU ψ ρ≥2 θ1,k θ1,l =

∫dνU ϕ θk,l.

and

(7.26)

∂νF (U)[k] · ∂

∂νF (U)[l]

=

∫ ∫νU(dr)νU(dq)ϕ ρ≥2 θ1,k

(dr)ϕ ρ≥2 θ1,l

(dq)

=

∫dνU(ϕ, ϕ) θk,l.

Remark 7.1.4 (Quadratic variation revisited). If the metric measurespace (U, r, µ) is a representative of U ∈ U, then

(7.27)∑

1≤k<lF Id,⟨ψρ≤n⟩k,l(U) =

∫Uµ(du)

(ρ(u)−

∫Uµ(du′) ρ(u′)

)2with ρ(u) :=

∫Un−1 µ

⊗(n−1)(d(u2, ..., un))ψ((r(ui, uj))1≤i<j≤n) and u1 := u.

In particular, F Id,⟨ψρ≤n⟩k,l is clearly non-negative and positive if and onlyif ψ is not constant.

Denote then by

(7.28) F := domain of the closure of the operator (Ω↑, F0).

Remark 7.1.5 (Relation between F and F).


(i) Obviously, the algebra A(Π1) generated by polynomials is con-

tained in the algebra generated by functions of F0. On the otherhand, any polynomial in monomials can be written as a monomialfollowing the example

(7.29)( ∫

dνU ϕ)2

=

∫dνU (ϕ, ϕ).

Therefore A(Π1) = F .

(ii) Clearly F0 ⊆ F by Proposition 7.1.3. Notice that U, and equiv-alently, M1(RN

+) are not locally compact, so that the Stone-Weierstraß Theorem does not apply. It is therefore open to decidewhether or not

(7.30) F = domain of the closure of the operator (Ω↑, F)?

Recall from (2.89) the space $\mathbb{M}_c$ of compact metric measure spaces, and put

\[ \mathbb{U}_c := \mathbb{U} \cap \mathbb{M}_c. \tag{7.31} \]

Our first main result states that the martingale problem associated withΩ↑ acting on F is well-posed.

Theorem 7.1.6 (Tree-valued Fleming-Viot dynamics). Fix P0 ∈M1(U). The (P0,Ω

↑,F)-martingale problem has a unique solution U =(Ut)t≥0. Moreover, almost surely,

(i) U has sample paths in CU([0,∞)).(ii) Ut ∈ Uc for all t > 0.(iii) νUtr : r1,2 > 0 = 0 for Lebesgue-almost all t ∈ [0,∞).

Definition 7.1.7 (The tree-valued Fleming-Viot dynamics). Fix P0 ∈M1(U). The tree-valued Fleming-Viot dynamics with initial distribution P0

is the unique solution of the (P0,Ω↑,F)-martingale problem.

7.2. Duality: A unique solution

If applicable, duality is an extremely useful technique in the study of Markov processes. It is well-known that the Kingman coalescent is dual to the neutral measure-valued Fleming-Viot process (see, for example, [Daw93, Eth01]). In this section this duality is lifted to the tree-valued Fleming-Viot dynamics. We will apply the duality to show uniqueness of the (Ω↑,F)-martingale problem for the tree-valued Fleming-Viot process in Section 7.7 and its relaxation into the equilibrium Kingman measure tree in Section 7.8.

We start by introducing the dual process, the tree-valued Kingman coalescent, and then establish the duality relation.


The dual process. Recall from Section 4.1 the space $\mathbb{S}$ of partitions of $\mathbb{N}$. Let $K = (K_t)_{t \ge 0}$ be the $\mathbb{S}$-valued Kingman coalescent. Since we are constructing a dual to the $\mathbb{U}$-valued dynamics, we add a component $r^K = (r^K_t)_{t \ge 0}$ which measures genealogical distances.

Recall from (2.8) the space $\mathbb{R}^{\mathrm{met}}$ of all pseudo-metrics, i.e., generalizations of metrics which allow for distances 0 and $\infty$. The state space of the dual tree-valued Kingman coalescent is

\[ \mathbb{K} := \mathbb{S} \times \mathbb{R}^{\mathrm{met}}, \tag{7.32} \]

where we say that a sequence $(\mathcal{K}_n)_{n \in \mathbb{N}}$ converges in $\mathbb{K}$ if its restriction to $S \subseteq \mathbb{N}$ converges in the product of the discrete and the Euclidean topology, for all finite $S \subseteq \mathbb{N}$. In particular, $G(\cdot, r')$ is continuous for all $r' \in \mathbb{R}^{\mathrm{met}}$ if it depends on $\mathcal{P}$ only via $\rho_n \mathcal{P}$, for some $n \in \mathbb{N}$, where $\rho_n$ is the restriction map from $\mathbb{S}$ to the set of partitions of $\{1, 2, ..., n\}$.

In the following we refer to the $\mathbb{K}$-valued stochastic process $\mathcal{K} = (\mathcal{K}_t)_{t \ge 0}$, where for $t \ge 0$,

\[ \mathcal{K}_t = (K_t, r'_t), \tag{7.33} \]

as the tree-valued Kingman coalescent if it follows the dynamics:

Coalescence. $K = (K_t)_{t \ge 0}$ is the $\mathbb{S}$-valued Kingman coalescent with pair coalescence rate $\gamma$.

Root growth. In between two coalescent events, given the current partition $\mathcal{P} \in \mathbb{S}$, for all $1 \le i < j < \infty$ with $i \not\sim_{\mathcal{P}} j$, the genealogical distance $r'_{ij}$ grows with constant speed 2.

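For later purposes we will see that we can associate a martingale problem with the tree-valued Kingman coalescent. Consider for $\mathcal{P} \in \mathbb{S}$ the coalescent operator $\kappa_{\mathcal{P}} : \mathcal{P}^2 \to \mathbb{S}$ such that for $\pi_1, \pi_2 \in \mathcal{P}$,

\[ \kappa_{\mathcal{P}}(\pi_1, \pi_2) := \{\pi_1 \cup \pi_2\} \cup \bigl(\mathcal{P} \setminus \{\pi_1, \pi_2\}\bigr), \tag{7.34} \]

i.e., $\kappa_{\mathcal{P}}$ sends two partition elements of the partition $\mathcal{P}$ to the new partition obtained by coalescing the two partition elements into one.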
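The coalescent operator of (7.34) is a plain set operation, which the following Python sketch spells out for finite partitions. The representation (a set of `frozenset` blocks) and the function name `coalesce` are choices made here for illustration; the text works with partitions of all of $\mathbb{N}$.

```python
def coalesce(partition, block1, block2):
    """Merge two distinct blocks of a partition into one, as in (7.34).

    partition is a set of frozensets (assumed representation); the result is
    {block1 | block2} together with all other blocks of the partition.
    """
    blocks = set(partition)
    if block1 == block2 or block1 not in blocks or block2 not in blocks:
        raise ValueError("need two distinct blocks of the given partition")
    return (blocks - {block1, block2}) | {block1 | block2}


# Partition {{1}, {2, 3}, {4}} of {1, 2, 3, 4}; merge the blocks {1} and {4}.
start = {frozenset({1}), frozenset({2, 3}), frozenset({4})}
merged = coalesce(start, frozenset({1}), frozenset({4}))
```

Each application reduces the number of blocks by exactly one, which is why a partition with finitely many blocks reaches the trivial partition after finitely many coalescence events.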

We consider the set

\[ \mathcal{G} := \bigl\{ G \in \mathcal{B}(\mathbb{K}) : \mathrm{div}(G) \text{ exists } \forall \mathcal{P} \in \mathbb{S},\ G(\cdot, r') \in C(\mathbb{S})\ \forall r' \in \mathbb{R}^{\mathrm{met}} \bigr\}, \tag{7.35} \]

with the divergence operator as defined in (7.16), where as defined in (4.8) we denote by $\mathcal{P}_i$, for $\mathcal{P} \in \mathbb{S}$, the unique partition element $\pi \in \mathcal{P}$ with $i \in \pi$.

We then consider the martingale problem associated with the operator $\Omega^{\downarrow}$ acting on $\mathcal{G}$ where

\[ \Omega^{\downarrow} G := \Omega^{\downarrow,\mathrm{grow}} G + \Omega^{\downarrow,\mathrm{coal}} G \tag{7.36} \]

with

\[ \Omega^{\downarrow,\mathrm{grow}} G(\mathcal{P}, r') := 2\, \mathrm{div}(G)(\mathcal{P}, r') \tag{7.37} \]

and for $\mathcal{K} = (\mathcal{P}, r')$ with $\#\mathcal{P} < \infty$,

\[ \Omega^{\downarrow,\mathrm{coal}} G(\mathcal{P}, r') := \gamma \sum_{\pi_1 \ne \pi_2 \in \mathcal{P}} \bigl( G(\kappa_{\mathcal{P}}(\pi_1, \pi_2), r') - G(\mathcal{P}, r') \bigr). \tag{7.38} \]

Fix $\mathbf{P}_0 \in \mathcal{M}_1(\mathbb{K})$. By construction, the tree-valued Kingman coalescent solves the $(\mathbf{P}_0, \Omega^{\downarrow}, \mathcal{G})$-martingale problem.

The duality relation. We next state the duality relation between thetree-valued Fleming-Viot dynamics and the tree-valued Kingman coalescent.

Define for ϕ ∈ C1(R(N2)

+ ), functions Hϕ : U×K → R by

(7.39) Hϕ(U, (P, r′)

):=

∫R(

N2)νU(dr)ϕ

((rPi,Pj )1≤i<j + r′

).

The functions Hϕ are duality functions in the following sense.

Proposition 7.2.1 (Duality relation).(i) The collection of functions

(7.40) F Id :=Hϕ(·,K) : K ∈ K, ϕ ∈ C1(R(

N2))∩ F

is separating in M1(U).(ii) For U ∈ Uc, K ∈ K, let U = (Ut)t≥0 and K = (Kt)t≥0 be solutions

of the (δU ,Ω↑,F) and (δK,Ω

↓,G)-martingale problem, respectively.Then

(7.41) PU[Hϕ(Ut,K)]= PK[Hϕ(U,Kt)

],

for all t ≥ 0 and ϕ ∈ C1(R(N2)

+ ).

Proof. (i) Denote by Π1 the set of all polynomials with a test functionwhich is continuously differentiable. Observe then that

(7.42) Π1 ⊆ F Id.

Since Π1 separates points in M by Proposition 2.1.8, part(i) follows.

(ii) We have to establish that for all n ∈ N and ϕ ∈ C1(R(N2)

+ ), U ∈ Ucand K ∈ K,

(7.43) Ω↑Hϕ(·,K)(U) = Ω↓Hϕϕ(U, ·)(K)

(see, for example, [EK86, Section 4.4]). We will verify (7.43) for the twocomponents of the dynamics separately. Observe first that by (7.19) and


(7.37)(7.44)

Ω↑,growHϕ(·, (P, r′))(U) = 2

∫R(N2)+

νU(dr) divr(ϕ)((rPi,Pj )1≤i<j + r′

)= 2

∫R(N2)+

νU(dr) divr′(ϕ)((rPi,Pj )1≤i<j + r′

)= Ω↓,growHϕ

(U, ·)(P, r′).

Similarly,(7.45)

Ω↑,resHϕ(·, (P, r′)

)(U)

= γ

∫R(N2)+

νU(dr)∑

1≤k<l

(ϕ(θk,l(rPi,Pj )1≤i<j + r′

)− ϕ

((rPi,Pj )1≤i<j + r′

))= γ

∫R(N2)+

νU(dr)∑

π1 =π2∈P

(ϕ((rκP (π1,π2)i,κP (π1,π2)j )1≤i<j + r′

)− ϕ

((rPi,Pj )1≤i<j + r′

))= Ω↓,coalHϕ

(U, ·)(P, r′)

Combining (7.44) with (7.45) yields (7.43) and thereby finishes the proof.

7.3. Approximating tree-valued Moran dynamics

In this section we state that the tree-valued Fleming-Viot dynamics can be approximated by the tree-valued resampling dynamics which correspond to a finite particle system, the so-called Moran model. In this model ordered pairs of individuals are replaced by new pairs and the children choose the parent independently at random. This model is most conveniently constructed via its graphical representation (see Figure 0.3). For the sake of having notation and terminology at hand we give a formal description in the sequel.

Fix a population size $N \in \mathbb{N}$, put $I := \{1, 2, ..., N\}$ and choose a metric $r_0$ on $I$.

Let then ξ := ξi,j = (N i,jt )t≥0; 1 ≤ i, j ≤ N be a realization of a family

of rate γ Poisson processes. We say for i, i′ ∈ I and 0 < s < t < ∞ thatthere is a path a path from (i, s) to (i′, t) if there exist n ∈ N, s ≤ u1 < u2 ≤... < un ≤ t and j1, ..., jn ∈ I such that for all k ∈ 1, ..., n+1 (and puttingj0 := i and jn := i′), N jk−1,jkuk = 1.

Notice that for all $(i, t) \in I \times (-\infty, 0]$ and $0 \le s \le t$ there exists a unique

\[ A_s(i, t) \in I \tag{7.46} \]

such that there is a path from $(A_s(i, t), s)$ to $(i, t)$. In the following we refer to the individual $A_s(i, t)$ as the ancestor of $(i, t)$ back at time $s$. The ancestor relationship defines a partial order on $I \times (-\infty, 0]$. For $t \ge 0$, define the pseudo-metric $r_t$ on $I$ for $i, j \in I$ by

\[
r^{\xi}_t(i, j) :=
\begin{cases}
2\bigl(t - \sup\{s \in [0, t] : A_s(i, t) = A_s(j, t)\}\bigr), & \text{if } A_0(i, t) = A_0(j, t), \\
2t + r_0\bigl(A_0(i, t), A_0(j, t)\bigr), & \text{if } A_0(i, t) \ne A_0(j, t).
\end{cases}
\tag{7.47}
\]

We then call $\mathcal{U}^N = (\mathcal{U}^N_t)_{t \ge 0}$ the tree-valued Moran dynamics, where for $t \ge 0$,

\[ \mathcal{U}^N_t := \text{isometry class of } \Bigl( I,\ r^N_t,\ \frac{1}{N} \sum_{i \in I} \delta_i \Bigr). \tag{7.48} \]
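The construction of the distance matrix behind (7.47) and (7.48) can be sketched as an event-driven simulation. The Python code below is a minimal sketch under assumptions made here, not the graphical construction of the text: the function name `simulate_moran_distances` is hypothetical, each ordered pair $(i, j)$ with $i \ne j$ carries one rate-$\gamma$ exponential clock, an event of the pair $(i, j)$ is read as "individual $j$ is replaced by an offspring of $i$" (the direction is a convention chosen for the sketch), and between events all distances between distinct individuals grow at speed 2.

```python
import random


def simulate_moran_distances(n, gamma, t_end, r0=None, seed=0):
    """Simulate the genealogical distance matrix of a Moran population up to t_end.

    n      -- population size (n >= 2)
    gamma  -- resampling rate per ordered pair (i, j), i != j
    r0     -- optional initial symmetric distance matrix; defaults to all zeros
    Returns the symmetric n x n distance matrix at time t_end.
    """
    if n < 2:
        raise ValueError("need at least two individuals")
    rng = random.Random(seed)
    r = [[0.0] * n for _ in range(n)] if r0 is None else [row[:] for row in r0]
    t = 0.0
    total_rate = gamma * n * (n - 1)
    while True:
        wait = rng.expovariate(total_rate)
        remaining = t_end - t
        step = min(wait, remaining)
        # Growth: every distance between distinct individuals increases at speed 2.
        for a in range(n):
            for b in range(n):
                if a != b:
                    r[a][b] += 2.0 * step
        if wait >= remaining:
            return r
        t += step
        # Resampling: j is replaced by an offspring of i and inherits i's distances.
        i, j = rng.sample(range(n), 2)
        for m in range(n):
            if m != j:
                r[m][j] = r[m][i]
                r[j][m] = r[m][i]
        r[j][j] = 0.0
```

Started from the zero matrix, every state of the simulation is an ultra-metric on the $N$ individuals with distances bounded by $2t$, in line with the second case of (7.47) contributing $2t + r_0$.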

In classical population genetics, the evolution of allelic frequencies is considered. Most often, distributions of allelic types on some (complete and separable metric) type space $K$ are given by a probability measure $\nu \in \mathcal{M}_1(K)$. For the graphical representation of finite Moran models this means that all lines carry types which are inherited by resampling. The measure-valued Moran model describes the evolution of the empirical distribution on $K$. It is well-known (see e.g. [Daw93, Eth01]) that the measure-valued Moran models converge in distribution to the measure-valued Fleming-Viot process as the population size $N$ tends to infinity. The next result states that this invariance principle is valid also on the level of trees.

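Theorem 7.3.1 (Convergence of tree-valued Moran to Fleming-Viot dynamics). For $N \in \mathbb{N}$, let $\mathcal{U}^N = (\mathcal{U}^N_t)_{t \ge 0}$ be the tree-valued Moran dynamics with population size $N$, and let $\mathcal{U} = (\mathcal{U}_t)_{t \ge 0}$ be the tree-valued Fleming-Viot dynamics. If $\mathcal{U}^N_0 \Rightarrow \mathcal{U}_0$, as $N \to \infty$, weakly in the Gromov-weak topology, then

\[ \mathcal{U}^N \underset{N \to \infty}{\Longrightarrow} \mathcal{U} \tag{7.49} \]

weakly in the Skorohod topology on $D_{\mathbb{U}}([0, \infty))$.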

Remark 7.3.2 (Diffusion approximation and universality).

(i) Populations are certainly finite. Hence for applications Moran models rather than Fleming-Viot models are of primary interest. Theorem 7.3.1 allows us to give an asymptotic analysis of functionals of the tree-valued Moran dynamics by studying the tree-valued Fleming-Viot dynamics, which is simpler to handle analytically.

(ii) The measure-valued Fleming-Viot process is universal in the sense that it is the limit point of frequency paths of various exchangeable population models of constant size. We conjecture that the same universality holds on the level of trees, i.e., the tree-valued Fleming-Viot dynamics is the point of attraction of various exchangeable tree-valued dynamics. We will discuss universality as well as different potential limit points in more detail in Section 7.11.

7.4. Compact containment: Limit dynamics exist

Recall from (7.48) the definition of a Moran model $\mathcal{U}^N$ of population size $N \in \mathbb{N}$ (compare also Figure 0.3). We consider a sequence of tree-valued Moran dynamics $\mathcal{U}^N$ for which the population sizes increase to infinity. In this section we show that potential limit points take values in the space of compact metric measure spaces, for any positive time.

The following result is an important step in the proof of tightness of the family of tree-valued Moran dynamics. Recall from Definition 2.3.1 the distance distribution, $w_{\mathcal{U}}$.

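Proposition 7.4.1 (Compact containment). For $N \in \mathbb{N}$, let $\mathcal{U}^N = (\mathcal{U}^N_t)_{t \ge 0}$ be the tree-valued Moran dynamics with population size $N$. Assume that the family $\{w_{\mathcal{U}^N_0} : N \in \mathbb{N}\}$ is tight in $\mathcal{M}_1(\mathbb{R}_+)$. Then, for all $T > 0$ and all $\varepsilon > 0$, there exists a compact set $\Gamma_{\varepsilon,T} \subseteq \mathbb{U}_c$ such that

\[ \inf_{N \in \mathbb{N}} \mathbb{P}\bigl\{ \mathcal{U}^N_t \in \Gamma_{\varepsilon,T} \text{ for all } t \in [\varepsilon, T] \bigr\} > 1 - \varepsilon. \tag{7.50} \]

The proof relies on the graphical representation of the tree-valued Moran dynamics as illustrated in Figure 0.3. Recall from (7.46) the ancestor $A_s(i, t)$ back at time $s \le t$ of the individual $i \in \{1, 2, ..., N\}$ living at time $t$. Denote by

\[ S^N_\varepsilon(t) := \#\bigl\{ A_{t-\varepsilon}(i, t) : i \in \{1, 2, ..., N\} \bigr\} \tag{7.51} \]

the number of ancestors a time $\varepsilon > 0$ back of the population of size $N$ at time $t$.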

As a preparation we give uniform bounds on the number of ancestors.

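Lemma 7.4.2 (Uniform bounds on the number of ancestors).

(i) For $t > 0$, there exists a constant $C = C(t) \in (0, \infty)$ such that for all $\varepsilon \in (0, \tfrac{1}{2} \wedge t)$,

\[ \sup_{N \in \mathbb{N}} \mathbb{P}\bigl\{ S^N_\varepsilon(t) \ge \varepsilon^{-4/3} \bigr\} \le C \varepsilon^2. \tag{7.52} \]

(ii) For $T > 0$, there exists a constant $C = C(T) \in (0, \infty)$ such that for all $\varepsilon \in (0, \tfrac{1}{2} \wedge T)$,

\[ \sup_{N \in \mathbb{N}} \mathbb{P}\Bigl\{ \sup_{t \in [\varepsilon, T]} S^N_\varepsilon(t) \ge \varepsilon^{-4/3} \Bigr\} \le C T \varepsilon. \tag{7.53} \]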

Proof. (i) Fix $t > 0$, and let $\varepsilon \in (0, \tfrac{1}{2} \wedge t)$. W.l.o.g. we assume that $N \ge \varepsilon^{-4/3}$. Consider the times

\[
T^N_k(t) :=
\begin{cases}
\inf\{s \ge 0 : S^N_s(t) \le k\}, & \text{if } S^N_0(t) \le k, \\
\infty, & \text{if } S^N_0(t) > k.
\end{cases}
\tag{7.54}
\]

Since any pair of individuals coalesces at rate $\gamma$, $t \wedge T^N_{\lfloor \varepsilon^{-4/3} \rfloor}(t)$ is stochastically bounded by the sum of $(N - k)$ independent random variables $X_i$, where $X_i$ is exponentially distributed with parameter $\gamma \binom{i}{2}$, for $i = k+1, ..., N$. Hence

\[ \mathbb{P}\bigl[ t \wedge T^N_k(t) \bigr] \le \frac{1}{\gamma} \sum_{i=k+1}^{N} \binom{i}{2}^{-1} = \frac{2}{\gamma} \Bigl( \frac{1}{k} - \frac{1}{N} \Bigr) \le \frac{2}{\gamma k}, \tag{7.55} \]

and

\[ \mathrm{Var}\bigl[ t \wedge T^N_k(t) \bigr] \le \frac{1}{\gamma^2} \sum_{i=k+1}^{N} \binom{i}{2}^{-2} \le \frac{4}{\gamma^2} \sum_{i=k}^{\infty} i^{-4} \le \frac{C'}{k^3}, \tag{7.56} \]

for some C ′ > 0.Thus for all ε ≤ 1

2 ∧ t,(7.57)

PTN⌊ε−4/3⌋(t) ≥ ε

= P

t ∧ TN⌊ε−4/3⌋(t) ≥ ε

≤ P

t ∧ TN⌊ε−4/3⌋(t)− P[t ∧ TN⌊ε−4/3⌋(t)] ≥ ε− 2

γ ⌈ε4/3⌉

Var[TN⌊ε−4/3⌋(t)]

ε− 2γ ⌈ε4/3⌉

≤ Cε2,

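for some constant $C > 0$.

Finally, since

\[ \bigl\{ S^N_\varepsilon(t) \ge \varepsilon^{-4/3} \bigr\} = \bigl\{ T^N_{\lfloor \varepsilon^{-4/3} \rfloor}(t) \ge \varepsilon \bigr\}, \tag{7.58} \]

(7.52) follows.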
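The telescoping identity in (7.55) and the comparison with $\sum i^{-4}$ in (7.56) are elementary but easy to get wrong by a factor. The Python sketch below verifies both in exact rational arithmetic for concrete values; the function names `mean_time_bound` and `variance_bound` are illustrative and only encode the sums of means and variances of the independent exponential variables $X_i$ with parameters $\gamma \binom{i}{2}$.

```python
from fractions import Fraction
from math import comb


def mean_time_bound(n, k, gamma=Fraction(1)):
    """Sum over i = k+1..n of the means 1 / (gamma * C(i, 2)) of the X_i."""
    return sum(Fraction(1) / (gamma * comb(i, 2)) for i in range(k + 1, n + 1))


def variance_bound(n, k, gamma=Fraction(1)):
    """Sum over i = k+1..n of the variances 1 / (gamma * C(i, 2))**2 of the X_i."""
    return sum(Fraction(1) / (gamma * comb(i, 2)) ** 2 for i in range(k + 1, n + 1))


# Example values: population size 200, reduction to 7 ancestors, gamma = 1.
n, k = 200, 7
mean_value = mean_time_bound(n, k)
var_value = variance_bound(n, k)
```

For these values `mean_value` equals `2 * (Fraction(1, 7) - Fraction(1, 200))` exactly, which is the closed form $(2/\gamma)(1/k - 1/N)$ stated in (7.55).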

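(ii) Fix $N \in \mathbb{N}$, $T > 0$ and $\varepsilon \in (0, \tfrac{1}{2})$, and let $t \in [\varepsilon, T]$. Observe that for all $s \in [0, t]$ and $\delta > 0$ such that $[s - \delta, s] \subseteq [t - \varepsilon, t]$,

\[ S^N_\delta(s) \ge S^N_\varepsilon(t). \tag{7.59} \]

Hence, for all $k > 0$,

\[ \sup_{t \in [k \frac{\varepsilon}{2}, (k+1) \frac{\varepsilon}{2}]} S^N_\varepsilon(t) \le S^N_{\varepsilon/2}\bigl(k \tfrac{\varepsilon}{2}\bigr). \tag{7.60} \]

Thus

\[
\mathbb{P}\Bigl\{ \sup_{t \in [\varepsilon, T]} S^N_\varepsilon(t) \ge \varepsilon^{-4/3} \Bigr\}
\le \mathbb{P}\Bigl\{ \sup_{k = 1, ..., \frac{2T}{\varepsilon}} S^N_{\varepsilon/2}\bigl(k \tfrac{\varepsilon}{2}\bigr) \ge \varepsilon^{-4/3} \Bigr\}
\le \sum_{k=1}^{2T/\varepsilon} \mathbb{P}\Bigl\{ S^N_{\varepsilon/2}\bigl(k \tfrac{\varepsilon}{2}\bigr) \ge \varepsilon^{-4/3} \Bigr\}
\le 2 C T \varepsilon,
\tag{7.61}
\]

and (7.53) follows.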


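Proof of Proposition 7.4.1. Fix $T > 0$ and $\varepsilon > 0$. For $(U, r, \mu) \in \mathbb{U}$, denote by

\[ \mathcal{N}_\varepsilon(U, r, \mu) := \min\bigl\{ N' \in \mathbb{N} : (\mathrm{supp}(\mu), r) \text{ can be covered by } N' \text{ balls of radius } \varepsilon \bigr\}. \tag{7.62} \]

By assumption, the family $\{w_{\mathcal{U}^N_0} : N \in \mathbb{N}\}$ is tight, and we can therefore choose a constant $C_\varepsilon > 0$ such that

\[ w_{\mathcal{U}^N_0}\bigl([C_\varepsilon; \infty)\bigr) < \varepsilon, \tag{7.63} \]

for all $N \in \mathbb{N}$. Set for each $k \in \mathbb{N}$, $\varepsilon_k := \frac{\varepsilon}{CT \vee 1}\, 2^{-k}$, and put

\[ \Gamma_{\varepsilon,T} := \bigcap_{k=1}^{\infty} \bigl\{ \mathcal{U} \in \mathbb{U}_c : \mathcal{N}_{2\varepsilon_k}(\mathcal{U}) \le \varepsilon_k^{-4/3},\ w_{\mathcal{U}}\bigl([C_{\varepsilon_k} + 2T; \infty)\bigr) < \varepsilon_k \bigr\}. \tag{7.64} \]

Then $\Gamma_{\varepsilon,T}$ is certainly pre-compact in $\mathbb{U}_c$ equipped with the Gromov-weak topology, by Proposition 2.7.2.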

We next show that (7.50) holds. Notice first that (7.63) implies that

(7.65) supN∈N,0≤t≤T

wUNt([Cε + 2T ;∞)) < ε,

for all ε > 0 almost surely, since distances between two sampled points donot grow more than 2T in a time interval of length T .

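We therefore find that for all $N \in \mathbb{N}$,

(7.66) \[ \begin{aligned} \mathbf{P}\big\{ \mathcal{U}^N_t \in \Gamma_{\varepsilon,T} \text{ for all } t \in [\varepsilon, T] \big\} &= 1 - \mathbf{P}\Big( \bigcup_{k=1}^{\infty} \bigcup_{t\in[\varepsilon_k,T]} \big\{ N_{2\varepsilon_k}(U^N_t, r^N_t) \ge \varepsilon_k^{-4/3} \big\} \Big) \\ &\ge 1 - \sum_{k=1}^{\infty} \mathbf{P}\Big\{ \sup_{t\in[\varepsilon_k,T]} S^N_{\varepsilon_k}(t) \ge \varepsilon_k^{-4/3} \Big\} \\ &\ge 1 - \varepsilon, \end{aligned} \]

and we are done.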

7.5. Limit dynamics are tree-valued Fleming-Viot dynamics

Fix $N \in \mathbb{N}$, and recall from (7.48) in Section 7.3 the tree-valued Moran dynamics $\mathcal{U}^N = (\mathcal{U}^N_t)_{t\ge0}$ of population size $N$. In this section we will characterize the tree-valued Moran dynamics as unique solutions of martingale problems. We will then apply this analytic characterization for establishing the existence of the solution to the Fleming-Viot martingale problem.

Notice that the states of the tree-valued Moran dynamics with population size $N$ are restricted to

(7.67) \[ \mathbb{U}_N := \big\{ (U, r, \mu) \in \mathbb{U} : N\mu \in \mathcal{N}(U) \big\} \subset \mathbb{U}_c, \]


where for a set $U$, $\mathcal{N}(U)$ is the set of integer-valued measures on $U$. Moreover, notice that if $\mathcal{U} \in \mathbb{U}_N$ then $\mathcal{U}$ can be represented by the metric measure space

(7.68) \[ \Big( \{1, 2, ..., N\},\ r',\ N^{-1} \sum_{i=1}^{N} \delta_i \Big), \]

for some pseudo-metric $r' \in \mathbb{R}^{\mathrm{met}}$. In the following we refer to the elements $i \in \{1, 2, ..., N\}$ as the individuals of the population of size $N$, and to the elements $u \in \{1, 2, ..., N\}/r'$ of the ultra-metric measure tree as leaves, where $/r'$ denotes the quotient map which identifies elements of pseudo-distance 0.

By construction, the tree-valued Moran dynamics is derived from the following particle dynamics.

Resampling. At the resampling rate $\gamma > 0$, a resampling event occurs between two individuals $i, j \in I = \{1, 2, ..., N\}$, implying that the genealogical distance between $i$ and $j$ is set to zero. Equivalently, for two leaves $u, u' \in I$, the measure changes from $\mu$ to $\mu + \frac1N \delta_u - \frac1N \delta_{u'}$ or to $\mu + \frac1N \delta_{u'} - \frac1N \delta_u$, each happening with probability $\frac12$.

Leaf growth. In between two resampling events the population gets older, meaning that the genealogical distance between two different individuals grows at speed 2. Equivalently, for two leaves $u, u' \in I$ which can be sampled without replacement, the mutual distance grows at speed 2.
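The two mechanisms can be sketched on the level of the $N \times N$ distance matrix. The code below is an illustrative sketch only: it assumes that each unordered pair resamples at rate $\gamma$, that the reproducing individual is chosen uniformly within the pair, and that the replaced individual inherits all distances of its parent (one reading of the replacement operator from (7.9), which lies outside this section). The names `moran_step` and `is_ultrametric` are hypothetical.

```python
import random


def moran_step(r, gamma, rng):
    """Run the particle dynamics up to and including the next resampling event.

    r is a symmetric N x N list of lists (pseudo-metric), modified in place.
    Returns the waiting time until the event.
    """
    n = len(r)
    n_pairs = n * (n - 1) // 2
    # Each unordered pair resamples at rate gamma (assumption stated above).
    wait = rng.expovariate(gamma * n_pairs)
    # Leaf growth: distances between different individuals grow at speed 2.
    for i in range(n):
        for j in range(n):
            if i != j:
                r[i][j] += 2.0 * wait
    # Resampling: k reproduces, l is replaced by a copy of k.
    k, l = rng.sample(range(n), 2)
    for j in range(n):
        if j != l:
            r[l][j] = r[j][l] = r[k][j]
    r[l][k] = r[k][l] = 0.0
    return wait


def is_ultrametric(r, tol=1e-9):
    """Check r(i, j) <= max(r(i, k), r(k, j)) for all triples."""
    n = len(r)
    return all(
        r[i][j] <= max(r[i][k], r[k][j]) + tol
        for i in range(n) for j in range(n) for k in range(n)
    )
```

Starting from the zero pseudo-metric, repeated calls to `moran_step` keep the matrix symmetric and ultra-metric, since growth adds the same constant to all off-diagonal entries and resampling copies a row.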

Example 7.5.1 (Leaf growth for various N). We illustrate the effects of the distance growth for N = 2, N = 4 and large N. Consider the ultra-metric space of all leaves in the tree from Figure 7.1(a). The numbers indicate weights of atoms in µ. After some small time ε, this ultra-metric space evolves due to distance growth to different trees, depending on N. For N = 2, whenever we sample two different individuals, they must be taken from the two leaves. Therefore the distance between the two points in the ultra-metric space grows. For N = 4, we may sample two (but not more) individuals without replacement from the same leaf in the above tree. We therefore may sample two individuals from the same point, which then splits into two branches whose lengths grow. For large N, we may sample a lot of individuals from one and the same leaf, which therefore splits into many branches whose lengths grow.

The martingale problem for a fixed population size N. We next want to characterize these dynamics by a martingale problem. Recall from Definition 2.1.3 the infinite distance matrix distribution $\nu^{\mathcal{U}}$ of an element $\mathcal{U} \in \mathbb{U}$. For each $N \in \mathbb{N}$, $\mathcal{U} \in \mathbb{U}_N$ is characterized uniquely by its N distance matrix distribution, which is defined similarly.


[Figure 7.1, four panels: (a) the starting tree, with two leaves of weight 1/2 each; (b) N = 2, two leaves of weight 1/2 each; (c) N = 4, four leaves of weight 1/4 each; (d) N large.]

Figure 7.1. Leaf growth in finite Moran models. (a) The starting tree has only two distinct points. (b) In a population of size 2 these two points grow in distance, while in (c) and (d) two or more individuals can be sampled from the same point.

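For $N \in \mathbb{N}$ and a metric space $(X, r)$ with $\#X = N$, recall the maps $R^{(X,r)} : X^{\mathbb{N}} \to \mathbb{R}_+^{\binom{\mathbb{N}}{2}}$ and $\rho_{\le N} : \mathbb{R}_+^{\binom{\mathbb{N}}{2}} \to \mathbb{R}_+^{\binom{N}{2}}$ from (2.1) and (2.4), respectively, and define by

(7.69) \[ R^{N,(X,r)} := \rho_{\le N} \circ R^{(X,r)} \]

the map which sends a sequence of $N$ points in $X$ to its N distance matrix, and let for an ultra-metric measure space $(U, r, \mu) \in \mathbb{U}_N$,

(7.70) \[ \nu^{N,(U,r,\mu)} := \big(R^{N,(U,r)}\big)_* \mu^{\otimes\downarrow N} \in \mathcal{M}_1\big(\mathbb{R}_+^{\binom{N}{2}}\big) \]

denote the N distance matrix distribution of $(U, r, \mu)$, where for $n \in \mathbb{N}$, $\mu^{\otimes\downarrow n}$ draws $n$ points of $U$ without replacement, i.e.,

(7.71) \[ \mu^{\otimes\downarrow n}\big(d(u_1, ..., u_n)\big) := \mu(du_1) \otimes \frac{\mu - \frac1N \delta_{u_1}}{1 - \frac1N}(du_2) \otimes \cdots \otimes \frac{\mu - \frac1N \sum_{k=1}^{n-1} \delta_{u_k}}{1 - \frac{n-1}{N}}(du_n). \]

Once more, it is obvious that $\nu^{N,(U,r,\mu)}$ depends on $(U, r, \mu)$ only through its isometry class $\mathcal{U} \in \mathbb{U}_N$, leading to the following definition.

Definition 7.5.2 (N distance matrix distribution). For each $N \in \mathbb{N}$, the N distance matrix distribution $\nu^{N,\mathcal{U}}$ of $\mathcal{U} \in \mathbb{U}_N$ is defined as the N distance matrix distribution $\nu^{N,(U,r,\mu)}$ of an arbitrary representative $(U, r, \mu)$ of $\mathcal{U}$.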


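Recall the class of functions $\mathcal{F}$ from (7.28) and put for each $N \in \mathbb{N}$, $f : \mathbb{R} \to \mathbb{R}$ and $\varphi : \mathbb{R}_+^{\binom{N}{2}} \to \mathbb{R}$,

(7.72) \[ F^{N,f,\varphi}(\mathcal{U}) := f\Big( \int \nu^{N,\mathcal{U}}(dr)\, \varphi(r) \Big). \]

Let then

(7.73) \[ \mathcal{F}_N := \Big\{ F^{N,f,\varphi} : \mathbb{U}_N \to \mathbb{R} \text{ of the form (7.72)} : f \in C^2(\mathbb{R}),\ \varphi \in C^1\big(\mathbb{R}_+^{\binom{N}{2}}\big) \Big\}. \]

Fix $N \in \mathbb{N}$. We are going to define an operator $\Omega^{\uparrow,N}$ whose action on $\mathcal{F}_N$ is given by independent superposition of leaf growth and resampling, i.e.,

(7.74) \[ \Omega^{\uparrow,N} := \Omega^{\uparrow,\mathrm{grow},N} + \Omega^{\uparrow,\mathrm{res},N}. \]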

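We begin with the leaf growth operator $\Omega^{\uparrow,\mathrm{grow},N}$. In periods of length $\varepsilon > 0$ without resampling the N distance matrix distribution changes due to leaf growth,

(7.75) \[ \nu^{N,\mathcal{U}} = \big(R^{N,(U,r)}\big)_* \mu^{\otimes\downarrow N} \mapsto \big(R^{N,(U,r)} + 2\varepsilon (1)_{1\le i<j\le N}\big)_* \mu^{\otimes\downarrow N} =: \nu^{N,\mathcal{U}}_\varepsilon, \]

where we assumed that $\mathcal{U} \in \mathbb{U}_N$ can be represented by $(U, r, \mu)$. Therefore,

(7.76) \[ \Omega^{\uparrow,\mathrm{grow},N} F^{N,f,\varphi} := \lim_{\varepsilon\downarrow0} \frac{1}{\varepsilon} \Big( f\Big( \int d\nu^{N,\mathcal{U}}_\varepsilon\, \varphi \Big) - F^{N,f,\varphi} \Big) = 2 \cdot F^{N,f',\varphi} \cdot F^{N,\mathrm{Id},\mathrm{div}(\varphi)}, \]

where $\mathrm{Id} : \mathbb{R} \to \mathbb{R}$ denotes the identity map, i.e., $\mathrm{Id}(x) = x$, for all $x \in \mathbb{R}$.

To understand the resampling operator $\Omega^{\uparrow,\mathrm{res},N}$, recall from (7.68) that for $\mathcal{U} \in \mathbb{U}_N$ its N distance matrix distribution is of the form

(7.77) \[ \nu^{N,\mathcal{U}} = \frac{1}{N!} \sum_{\pi\in S_N} \delta_{(r_{\pi(i)\wedge\pi(j),\,\pi(i)\vee\pi(j)})_{1\le i<j\le N}}, \]

for non-negative numbers $r_{i,j}$, $1 \le i \ne j \le N$, with $S_N$ denoting the set of all permutations $\pi : \{1, 2, ..., N\} \to \{1, 2, ..., N\}$. Moreover, due to a resampling event between two individuals (say, the $k$th is reproducing, thereby pushing the $l$th out of the population), $\nu^{N,\mathcal{U}}$ is sent to

(7.78) \[ \nu^{N,\mathcal{U}}_{k,l} = \frac{1}{N!} \sum_{\pi\in S_N} \delta_{(\theta_{k,l}(r_{\pi(i)\wedge\pi(j),\,\pi(i)\vee\pi(j)}))_{1\le i<j\le N}}, \]

for some $1 \le k, l \le N$, with the replacement operator $\theta_{k,l}$ as defined in (7.9). You may want to think of a $\pi \in S_N$ as a record of the order in which the $N$ individuals get sampled without replacement, i.e., the $i$th particle sampled is the individual with "name" $\pi(i)$.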
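For small $N$ the representation (7.77) can be evaluated by brute force, which also makes the invariance under relabelling visible. This is a sketch with hypothetical helper names; individuals are indexed from 0, and the symmetric matrix `r` plays the role of the numbers $r_{i,j}$.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def distance_matrix_distribution(r):
    """Law (7.77): uniform mixture over sampling orders pi of the vector
    (r[pi(i)][pi(j)])_{i<j}. Returns a dict mapping vectors to probabilities."""
    n = len(r)
    counts = Counter()
    for pi in permutations(range(n)):
        vec = tuple(r[pi[i]][pi[j]] for i in range(n) for j in range(i + 1, n))
        counts[vec] += 1
    total = sum(counts.values())
    return {vec: Fraction(c, total) for vec, c in counts.items()}


def relabel(r, sigma):
    """Rename the individuals of r according to the permutation sigma."""
    n = len(r)
    return [[r[sigma[i]][sigma[j]] for j in range(n)] for i in range(n)]
```

For the three-point space with $r_{0,1} = 2$ and $r_{0,2} = r_{1,2} = 4$, the six sampling orders produce three distinct vectors, each with probability 1/3, and relabelling the individuals leaves the distribution unchanged.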


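Hence

(7.79) \[ \Omega^{\uparrow,\mathrm{res},N} F^{N,f,\varphi} = \frac{\gamma}{2} \sum_{1\le k,l\le N} \Big( f\Big( \int d\nu^{N,\mathcal{U}}_{k,l}\, \varphi \Big) - F^{N,f,\varphi} \Big) = \frac{\gamma}{2} \sum_{1\le k,l\le N} \big( F^{N,f,\varphi\circ\theta_{k,l}} - F^{N,f,\varphi} \big). \]

It is easy to see that for given $N \in \mathbb{N}$, $\mathcal{F}_N$ separates points in $\mathbb{U}_N$. We can therefore use the operator $\Omega^{\uparrow,N}$ acting on $\mathcal{F}_N$ to characterize the tree-valued Moran models analytically.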
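The chain-rule form of the leaf growth operator in (7.76) can be compared numerically with the difference quotient it is defined by. The sketch below assumes that $\mathrm{div}(\varphi)$ stands for $\sum_{i<j} \partial\varphi/\partial r_{i,j}$, which is how (7.76) reads; the concrete choices $f = \sin$ and $\varphi(r) = \sum_{i<j} r_{i,j}^2$ are illustrative only.

```python
import math
from itertools import permutations


def mean_under_nu(r, phi, shift=0.0):
    """Integrate phi against the N distance matrix distribution of r,
    with every coordinate shifted by `shift` (leaf growth over time shift/2)."""
    n = len(r)
    values = [
        phi([r[p[i]][p[j]] + shift for i in range(n) for j in range(i + 1, n)])
        for p in permutations(range(n))
    ]
    return sum(values) / len(values)


def phi(v):
    """Illustrative test function: sum of squared distances."""
    return sum(x * x for x in v)


def div_phi(v):
    """Sum of the partial derivatives of phi over all coordinates (assumed meaning of div)."""
    return sum(2.0 * x for x in v)


def growth_formula(r):
    """Right-hand side of (7.76) with f = sin, f' = cos."""
    return 2.0 * math.cos(mean_under_nu(r, phi)) * mean_under_nu(r, div_phi)


def growth_difference_quotient(r, eps=1e-6):
    """Left-hand side of (7.76) before taking the limit eps -> 0."""
    grown = math.sin(mean_under_nu(r, phi, shift=2.0 * eps))
    return (grown - math.sin(mean_under_nu(r, phi))) / eps
```

The factor 2 in (7.76) is the growth speed: over a time $\varepsilon$ every sampled distance increases by $2\varepsilon$, so the derivative of $\int d\nu^{N,\mathcal{U}}_\varepsilon\, \varphi$ at $\varepsilon = 0$ is twice the integral of $\mathrm{div}(\varphi)$.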

Proposition 7.5.3 (Tree-valued Moran dynamics). Fix $N \in \mathbb{N}$ and $\mathbf{P}^N_0 \in \mathcal{M}_1(\mathbb{U}_N)$. The $(\mathbf{P}^N_0, \Omega^{\uparrow,N}, \mathcal{F}_N)$-martingale problem is well-posed.

Proof. For given $N \in \mathbb{N}$ and $\mathbf{P}^N_0 \in \mathcal{M}_1(\mathbb{U}_N)$, if we choose in (7.47) the pseudo-metric $r_0$ randomly such that the law of $(\{1, 2, ..., N\}, r_0, \sum_{i=1}^{N} \delta_i)$ equals $\mathbf{P}^N_0$, the tree-valued Moran dynamics defined in (7.48) solve the $(\mathbf{P}^N_0, \Omega^{\uparrow,N}, \mathcal{F}_N)$-martingale problem, by construction. This proves existence.

As for uniqueness there are two ways to proceed. First put

(7.80) \[ \Pi^{\uparrow,N,1} := \Big\{ F^{N,\mathrm{Id},\psi\circ\rho_{\le n}} \in \mathcal{F}_N : n \in \{2, 3, ..., N\},\ \psi \in C^1\big(\mathbb{R}_+^{\binom{n}{2}}\big) \Big\}. \]

Following the same line of argument as given in Section 7.2, one can easily check that the $(\mathbf{P}^N_0, \Omega^{\uparrow,N}, \Pi^{\uparrow,N,1})$-martingale problem is dual to the tree-valued Kingman coalescent, where the duality functions $F \in \Pi^{\uparrow,N,1}$ are smooth monomials that involve sampling without replacement (see, for example, Corollary 3.7 in [GLW05], where a similar duality is proved on the level of the measure-valued processes).

Another line of argument is to notice that here we are dealing with finite trees and hence with finitely many points only. Hence the action of $\Omega^{\uparrow,N}$ on $\Pi^{\uparrow,N,1}$ defines a piece-wise deterministic jump process for which standard theory applies. We therefore omit the details.

Convergence to the Fleming-Viot martingale problem. The main goal of this section is to show that the operator for the tree-valued Fleming-Viot martingale problem is the limit of the operators for the tree-valued Moran martingale problems.

Proposition 7.5.4. For all $F \in \mathcal{F}$ and $N \in \mathbb{N}$ there exists $F_N \in \mathcal{F}_N$ such that

(7.81) \[ \lim_{N\to\infty} \| F_N - F \| = 0, \]

and

(7.82) \[ \lim_{N\to\infty} \| \Omega^{\uparrow,N} F_N - \Omega^{\uparrow} F \| = 0. \]


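Proof. Fix $f : \mathbb{R} \to \mathbb{R}$ and $\varphi : \mathbb{R}_+^{\binom{\mathbb{N}}{2}} \to \mathbb{R}$ such that $F = F^{f,\varphi} \in \mathcal{F}$. Notice first that by construction of the domain $\mathcal{F}$ (compare (??)) we can find for all $\varepsilon > 0$ an $n \in \mathbb{N}$ and a test function $\psi_{n,\varepsilon} : \mathbb{R}_+^{\binom{n}{2}} \to \mathbb{R}$ such that $F^{f,\psi_{n,\varepsilon}\circ\rho_{\le n}} \in \mathcal{F}$ and $\| F^{f,\varphi} - F^{f,\psi_{n,\varepsilon}\circ\rho_{\le n}} \| < \varepsilon$.

W.l.o.g. we therefore assume that $\varphi$ is of the form $\varphi = \psi \circ \rho_{\le n}$ for an $n \in \mathbb{N}$ and a test function $\psi \in C^1(\mathbb{R}_+^{\binom{n}{2}})$. For each $N \ge n$, we can then associate $\varphi$ with a test function $\varphi_N : \mathbb{R}_+^{\binom{N}{2}} \to \mathbb{R}$ such that $\varphi_N \circ \rho_{\le N} = \varphi$, and put

(7.83) \[ F_N := F^{N,f,\varphi_N} \in \mathcal{F}_N. \]

Since for all $n \in \mathbb{N}$, $N \ge n$, and $\varphi_N : \mathbb{R}_+^{\binom{N}{2}} \to \mathbb{R}$,

(7.84) \[ F^{N,\mathrm{Id},\varphi_N}(\mathcal{U}) = F^{\mathrm{Id},\varphi_N}(\mathcal{U}) + O\Big( \frac{1}{N^{n-1}} \Big), \]

uniformly in $\mathcal{U} \in \mathbb{U}$, as $N \to \infty$, (7.81) clearly holds.

Moreover, by (7.19) and (7.76),

(7.85) \[ \begin{aligned} \Omega^{\uparrow,\mathrm{grow},N} F_N &= 2 \cdot F^{N,f',\varphi_N} \cdot F^{N,\mathrm{Id},\mathrm{div}(\varphi_N)} \\ &= 2 \cdot F^{f',\varphi} \cdot F^{\mathrm{Id},\mathrm{div}(\varphi)} \\ &= \Omega^{\uparrow,\mathrm{grow}} F. \end{aligned} \]

It therefore only remains to show convergence for the resampling operator. For that notice that for $1 \le k, l \le N$, $\pi \in S_N$, $n \in \mathbb{N}$ and $\psi \in C^1(\mathbb{R}_+^{\binom{n}{2}})$, if

(7.86) \[ \Delta^{N,\pi,n,\psi}(\mathcal{U}; (k, l)) := \psi\big( \theta_{k,l}(r_{\pi(i)\wedge\pi(j),\,\pi(i)\vee\pi(j)})_{1\le i<j\le n} \big) - \psi\big( (r_{\pi(i)\wedge\pi(j),\,\pi(i)\vee\pi(j)})_{1\le i<j\le n} \big) \]

then $\Delta^{N,\pi,n,\psi}(\mathcal{U}; (k, l)) \ne 0$ only if $l \in \{1, 2, ..., n\}$.

Hence if $\varphi = \psi \circ \rho_{\le n}$ for some $n \in \mathbb{N}$ and $\psi \in C^1(\mathbb{R}_+^{\binom{n}{2}})$ then for all $1 \le k, l \le N$,

(7.87) \[ \begin{aligned} \Delta^{N,n,\psi}\big(\mathcal{U}; (k, l)\big) &:= \int d\big( \nu^{N,\mathcal{U}}_{k,l} - \nu^{N,\mathcal{U}} \big)\, \psi \circ \rho_{\le n} \\ &= \frac{1}{N!} \sum_{\pi\in S_N,\ l\in\{1,2,...,n\}} \Delta^{N,\pi,n,\psi}(\mathcal{U}; (k, l)) \\ &= O\big( \tfrac{1}{N} \big), \end{aligned} \]

as $N \to \infty$.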


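Therefore, applying (7.79) together with the Taylor expansion formula on $f$,

(7.88) \[ \begin{aligned} \Omega^{\uparrow,\mathrm{res},N} F_N &= \gamma \sum_{1\le k<l\le N} \Big( f\Big( \int d\nu^{N,\cdot}_{k,l}\, \varphi_N \Big) - F^{N,f,\varphi_N} \Big) \\ &= \frac{\gamma}{2} F^{N,f',\varphi_N} \cdot \sum_{1\le k,l\le N} \Delta^{N,n,\psi}\big(\cdot; (k, l)\big) + \frac{\gamma}{4} F^{N,f'',\varphi_N} \cdot \sum_{1\le k,l\le N} \big( \Delta^{N,n,\psi}\big(\cdot; (k, l)\big) \big)^2 + R^{N,f,\varphi_N}, \end{aligned} \]

where the error term $R^{N,f,\varphi_N}$ is chosen such that the last equality holds. In particular, $R^{N,f,\varphi_N} = O\big( \sum_{1\le k,l\le N} (\Delta^{N,n,\psi}(\cdot; (k, l)))^3 \big)$ and therefore $R^{N,f,\varphi_N} \to 0$, as $N \to \infty$, by (7.87). Furthermore,

(7.89) \[ \begin{aligned} \sum_{1\le k,l\le N} \Delta^{N,n,\psi}\big(\cdot; (k, l)\big) &= \sum_{1\le k,l\le n} \int_{\mathbb{R}_+^{\binom{n}{2}}} d(\rho_{\le n})_* \nu^{N,\cdot}\, \big( \psi \circ \theta_{k,l} - \psi \big) \\ &= \sum_{1\le k,l\le n} \int_{\mathbb{R}_+^{\binom{N}{2}}} d\nu^{N,\cdot}\, \big( \varphi_N \circ \theta_{k,l} - \varphi_N \big). \end{aligned} \]

To calculate the second order term in (7.88) we note that for all $\varphi : \mathbb{R}_+^{\binom{2N}{2}} \to \mathbb{R}$,

(7.90) \[ F^{N,x^2,\varphi} = F^{2N,\mathrm{Id},(\varphi,\varphi)} + O\big( \tfrac{1}{N} \big), \]

as $N \to \infty$, with $(\varphi, \varphi)$ as defined in (7.22).

Applying then (7.88) together with (7.89) to $F^{N,x^2,\varphi_N}$ and $F^{2N,\mathrm{Id},(\varphi_N,\varphi_N)}$, we obtain

(7.91) \[ \Omega^{\uparrow,\mathrm{res},N} F^{N,x^2,\varphi_N} = \gamma F^{N,\mathrm{Id},\varphi_N} \cdot \sum_{1\le k,l\le N} \Delta^{N,n,\psi}\big(\cdot; (k, l)\big) + \frac{\gamma}{2} \sum_{1\le k,l\le N} \big( \Delta^{N,n,\psi}\big(\cdot; (k, l)\big) \big)^2 + R^{N,x^2,\varphi_N}, \]

and

(7.92) \[ \Omega^{\uparrow,\mathrm{res},N} F^{2N,\mathrm{Id},(\varphi_N,\varphi_N)} = \frac{\gamma}{2} \sum_{1\le k,l\le 2N} \Delta^{2N,2n,(\psi,\psi)}\big(\cdot; (k, l)\big) + R^{2N,\mathrm{Id},(\varphi_N,\varphi_N)}. \]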


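Therefore

(7.93) \[ \begin{aligned} \sum_{1\le k,l\le N} \big( \Delta^{N,n,\psi}\big(\cdot; (k, l)\big) \big)^2 + O\big( \tfrac{1}{N} \big) &= \sum_{1\le k,l\le 2n} F^{2N,\mathrm{Id},(\varphi_N,\varphi_N)\circ\theta_{k,l} - (\varphi_N,\varphi_N)} \\ &\quad - \sum_{1\le k,l\le n} F^{2N,\mathrm{Id},(\varphi_N,\ \varphi_N\circ\theta_{2k,2l} - \varphi_N)} - \sum_{1\le k,l\le n} F^{2N,\mathrm{Id},(\varphi_N\circ\theta_{2k-1,2l-1} - \varphi_N,\ \varphi_N)} \\ &= \sum_{1\le k,l\le 2n;\ |k-l| \text{ odd}} F^{2N,\mathrm{Id},(\varphi_N,\varphi_N)\circ\theta_{k,l} - (\varphi_N,\varphi_N)}. \end{aligned} \]

Hence, as $N \to \infty$ (with $\langle \varphi, \varphi \rangle_{k,l}$ as defined in (7.21)),

(7.94) \[ \begin{aligned} \Omega^{\uparrow,\mathrm{res},N} F_N \underset{N\to\infty}{\longrightarrow}\ & \gamma F^{f',\varphi} \cdot \sum_{1\le k<l\le n} \big( F^{\mathrm{Id},\varphi\circ\theta_{k,l}} - F^{\mathrm{Id},\varphi} \big) + \frac{\gamma}{2} F^{f'',\varphi} \cdot \sum_{1\le k<l\le n} F^{\mathrm{Id},\langle\varphi,\varphi\rangle_{k,l}} \\ &= \Omega^{\uparrow,\mathrm{res}} F. \end{aligned} \]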

7.6. Limit dynamics yield continuous paths

Recall from Section 7.3 the definition of a Moran model $\mathcal{U}^N$ of population size $N \in \mathbb{N}$ (compare also Figure 0.3). In this section we consider a sequence of tree-valued Moran dynamics as the population sizes increase to infinity and show that potential limit points have continuous paths.

Proposition 7.6.1 (Limit points have continuous paths). If $\mathcal{U}^N \underset{N\to\infty}{\Longrightarrow} \mathcal{U}$ for some process $\mathcal{U}$ with sample paths in $D_{\mathbb{U}}([0,\infty))$, then $\mathcal{U} \in C_{\mathbb{U}}([0,\infty))$, almost surely.

Proof. Recall from (7.48) in Section 7.3 the definition of the tree-valued Moran dynamics $\mathcal{U}^N$ of population size $N \in \mathbb{N}$ and its construction based on Poisson point processes $\xi = \{\xi^{i,j};\ 1 \le i, j \le N\}$ (compare also Figure 0.3). In particular, the tree-valued Moran dynamics have paths in $D_{\mathbb{U}_c}(\mathbb{R}_+)$, almost surely.


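Let $\xi := \sum_{1\le i\ne j\le N} \xi^{i,j}$. If $\xi\{t\} = 0$, then $\mathcal{U}^N_{t-} = \mathcal{U}^N_t$. Otherwise, if $\xi^{i,j}\{t\} = 1$, for some $1 \le i \ne j$, then

(7.95) \[ \begin{aligned} d_{\mathrm{GPr}}\big( \mathcal{U}^N_{t-}, \mathcal{U}^N_t \big) &= d_{\mathrm{GPr}}\Big( \big( \{1, 2, ..., N\}, r^\xi_{t-}, \textstyle\sum_{k=1}^{N} \delta_k \big), \big( \{1, 2, ..., N\}, r^\xi_t, \textstyle\sum_{k=1}^{N} \delta_k \big) \Big) \\ &\le d^{(\{1,2,...,N\}, r_{t-})}_{\mathrm{Pr}}\Big( \textstyle\sum_{k=1}^{N} \delta_k,\ \tfrac1N \delta_i - \tfrac1N \delta_j + \sum_{k=1}^{N} \delta_k \Big) \\ &\le \frac{2}{N}, \end{aligned} \]

and therefore

(7.96) \[ \int_0^\infty dT\, e^{-T} \sup_{t\in[0,T]} d_{\mathrm{GPr}}\big( \mathcal{U}^N_{t-}, \mathcal{U}^N_t \big) \le \frac{2}{N}, \]

for all $T > 0$ and almost all sample paths $\mathcal{U}^N$. Hence the assertion follows by Theorem 4.10.2 in [EK86].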

7.7. Proof of the main results (Theorems 7.1.6 and 7.3.1)

The main results of this chapter are the well-posedness of the tree-valued Fleming-Viot martingale problem and its approximation by the tree-valued Moran dynamics stated in Theorems 7.1.6 and 7.3.1, respectively. We will prove both theorems simultaneously.

Proof of Theorems 7.1.6 and 7.3.1. Fix $\mathbf{P}_0 \in \mathcal{M}_1(\mathbb{U})$. Recall, for each $N \in \mathbb{N}$, the state-space $\mathbb{U}_N$, and the $\mathbb{U}_N$-valued Moran dynamics, $\mathcal{U}^N = (\mathcal{U}^N_t)_{t\ge0}$, from (7.67) and (7.48), respectively. Furthermore, let for each $N \in \mathbb{N}$, $\mathbf{P}^N_0 \in \mathcal{M}_1(\mathbb{U}_N)$ be given such that $\mathbf{P}^N_0 \Rightarrow \mathbf{P}_0$, as $N \to \infty$.

By Proposition 7.5.3, the $(\mathbf{P}^N_0, \Omega^{\uparrow,N}, \mathcal{F}_N)$-martingale problem is well-posed, and is solved by $\mathcal{U}^N$. Proposition 7.5.4 now implies with a standard argument (see, for example, Lemma 4.5.1 in [EK86]) that if $\mathcal{U}^N \Rightarrow \mathcal{U}$, for some $\mathcal{U} \in D_{\mathbb{U}}([0,\infty))$, as $N \to \infty$, then $\mathcal{U}$ solves the $(\mathbf{P}_0, \Omega^{\uparrow}, \mathcal{F})$-martingale problem. Hence for existence we need to show that the sequence $\{\mathcal{U}^N;\ N \in \mathbb{N}\}$ is tight, or equivalently by Remark 4.5.2 in [EK86], that the compact containment condition holds. However, the latter was proved in Proposition 7.4.1.

By standard theory (see, for example, Theorem 4.4.2 in [EK86]), uniqueness of the $(\mathbf{P}_0, \Omega^{\uparrow}, \Pi^1)$-martingale problem (and therefore also of the $(\mathbf{P}_0, \Omega^{\uparrow}, \mathcal{F})$-martingale problem) follows from uniqueness of the one-dimensional distributions. The latter can be easily verified using the duality of the tree-valued Fleming-Viot dynamics to the tree-valued Kingman coalescent, $\mathcal{K} := (\mathcal{K}_t)_{t\ge0}$, as defined in (7.33). That is, if $\mathcal{U} = (\mathcal{U}_t)_{t\ge0}$ is a solution of the $(\mathbf{P}_0, \Omega^{\uparrow}, \Pi^1)$-martingale problem, then for $n \in \mathbb{N}$, $\varphi \in C^1(\mathbb{R}_+^{\binom{n}{2}})$

+ )


and functions $H^{n,\varphi} : \mathbb{U} \times \mathbb{K} \to \mathbb{R}$ of the form (7.39), for all $\kappa \in \mathbb{K}$,

(7.97) \[ \mathbf{P}^{\mathbf{P}_0}\big[ H^{n,\varphi}(\mathcal{U}_t, \kappa) \big] = \mathbf{P}^{\mathbf{P}_0\otimes\delta_\kappa}\big[ H^{n,\varphi}(\mathcal{U}_0, \mathcal{K}_t) \big], \]

by (7.41). By Proposition 7.2.1(i), the collection of functions $\{ H^{n,\varphi}(\cdot, \mathcal{K}) : \mathcal{K} \in \mathbb{K},\ n \in \mathbb{N},\ \varphi \in C^1(\mathbb{R}_+^{\binom{n}{2}}) \}$ is separating in $\mathcal{M}_1(\mathbb{U})$, and uniqueness of the one-dimensional distributions follows.

(i) and (ii). So far we have shown that the $(\mathbf{P}_0, \Omega^{\uparrow}, \mathcal{F})$-martingale problem is well-posed and that its solution arises as the weak limit of the solutions of the $(\mathbf{P}^N_0, \Omega^{\uparrow,N}, \mathcal{F}_N)$-martingale problems. Hence, in particular, the tree-valued Moran dynamics converge to the tree-valued Fleming-Viot dynamics, which is the claim of Theorem 7.3.1.

Propositions 7.4.1 and 7.6.1 therefore imply that the tree-valued Fleming-Viot dynamics have values in the space of compact ultra-metric measure spaces for each $t > 0$ and have continuous paths, respectively, almost surely.

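(iii) Note that

(7.98) \[ \begin{aligned} \mathbf{P}\big[ \nu^{\mathcal{U}_t}\{r : r_{1,2} = 0\} \big] &= \lim_{\theta\to\infty} \mathbf{P}\Big[ \int \nu^{\mathcal{U}_t}(dr)\, e^{-\theta r_{1,2}} \Big] \\ &= \lim_{\theta\to\infty} \Big( e^{-2(\theta+\frac{\gamma}{2})t} \int \nu^{\mathcal{U}_0}(dr)\, e^{-\theta r_{1,2}} + \big( 1 - e^{-2(\theta+\frac{\gamma}{2})t} \big) \frac{1}{\frac{2}{\gamma}\theta + 1} \Big) = 0, \end{aligned} \]

where we used explicit calculations for the mean sample Laplace transform which we present later in more generality in Corollary 7.11.6. By dominated and monotone convergence,

(7.99) \[ \mathbf{P}\Big[ \int_0^\infty dt\, \nu^{\mathcal{U}_t}\{r : r_{1,2} = 0\} \Big] = \lim_{T\to\infty} \int_0^T dt\, \mathbf{P}\big[ \nu^{\mathcal{U}_t}\{r : r_{1,2} = 0\} \big] = 0, \]

and hence $\int_0^\infty dt\, \nu^{\mathcal{U}_t}\{r : r_{1,2} = 0\} = 0$, almost surely. This proves that $\mathcal{U}_t$ is non-atomic, i.e., $\nu^{\mathcal{U}_t}\{r : r_{1,2} = 0\} = 0$, for Lebesgue almost all $t \ge 0$.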
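The limit in (7.98) can be explored numerically. The sketch below assumes that the mean sample Laplace transform $g(t) = \mathbf{P}[\int \nu^{\mathcal{U}_t}(dr)\, e^{-\theta r_{1,2}}]$ solves the linear equation $g' = -2\theta g + \gamma(1 - g)$ (growth at speed 2, resampling of the sampled pair at rate $\gamma$); Corollary 7.11.6 is not part of this section, so this is an assumption rather than a restatement. The function name is hypothetical, and `g0` stands for the value at time 0.

```python
import math


def mean_sample_laplace(theta, t, gamma, g0):
    """Solution of g' = -2*theta*g + gamma*(1 - g) with g(0) = g0.

    Equivalently: exp(-(2*theta + gamma)*t) * g0
                  + (1 - exp(-(2*theta + gamma)*t)) * gamma / (2*theta + gamma).
    """
    rate = 2.0 * theta + gamma
    decay = math.exp(-rate * t)
    return decay * g0 + (1.0 - decay) * gamma / rate
```

For fixed $t > 0$ both terms vanish as $\theta \to \infty$: the first because of the factor $e^{-2\theta t}$, the second because $\gamma/(2\theta + \gamma) \to 0$. At $\theta = 0$ the expression equals 1 whenever `g0` is 1, as it must for a probability measure.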

7.8. Long-term behavior

Genealogical relationships in neutral models have been studied frequently since the introduction of the Kingman coalescent in [Kin82b]. This stochastic process describes the limit genealogy of a Moran population in equilibrium and its projective limit as the population size tends to infinity.

In this section we formulate the related convergence result for the tree-valued Fleming-Viot dynamics. Recall from Definition 4.1.5 the Kingman coalescent measure tree as the Λ-coalescent measure tree with Λ = γδ0.


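Theorem 7.8.1 (Convergence to the Kingman measure tree). Let $\mathcal{U} = (\mathcal{U}_t)_{t\ge0}$ be the tree-valued Fleming-Viot dynamics and $\mathcal{U}^{\mathrm{Kin}}$ the Kingman coalescent measure tree. Then,

(7.100) \[ \mathcal{U}_t \underset{t\to\infty}{\Longrightarrow} \mathcal{U}^{\mathrm{Kin}}, \]

for all $\mathcal{U}_0 \in \mathbb{U}$.

Proof. Fix $\varphi \in C^1(\mathbb{R}_+^{\binom{\mathbb{N}}{2}})$. Applying the duality relation (7.41) between the tree-valued Fleming-Viot dynamics and the tree-valued Kingman coalescent which starts in $\mathcal{K}_0 = (\mathcal{P}_0, r'_0)$ with $\mathcal{P}_0 := \{\{n\};\ n \in \mathbb{N}\}$ and $r'_0 \equiv 0$, and using the fact that there exists a random time $\tau < \infty$, almost surely, after which $\mathcal{K}$ is fixed in the trivial partition $\{\mathbb{N}\}$ (with mutual distances not changing anymore), implies that for all $t \ge 0$ and all $\mathbf{P}_0 \in \mathcal{M}_1(\mathbb{U})$,

(7.101) \[ \mathbf{P}^{\mathbf{P}_0}\Big[ \int_{\mathbb{R}^{\binom{\mathbb{N}}{2}}} \nu^{\mathcal{U}_{\tau+t}}(dr)\, \varphi(r) \Big] = \mathbf{P}^{\delta_{\mathcal{K}_0}}\big[ \varphi(r'_{\tau+t}) \big]. \]

Hence the right hand side of (7.101) converges as $t \to \infty$, and by dominated convergence,

(7.102) \[ \begin{aligned} \mathbf{P}^{\mathbf{P}_0}\Big[ \lim_{t\to\infty} \int_{\mathbb{R}^{\binom{\mathbb{N}}{2}}} \nu^{\mathcal{U}_{\tau+t}}(dr)\, \varphi(r) \Big] &= \lim_{t\to\infty} \mathbf{P}^{\mathbf{P}_0}\Big[ \int_{\mathbb{R}^{\binom{\mathbb{N}}{2}}} \nu^{\mathcal{U}_{\tau+t}}(dr)\, \varphi(r) \Big] \\ &= \mathbf{P}^{\delta_{\mathcal{K}_0}}\big[ \varphi(r'_\tau) \big] = \mathbf{Q}^{\gamma\delta_0}\Big[ \int_{\mathbb{R}^{\binom{\mathbb{N}}{2}}} \nu^{\mathcal{U}}(dr)\, \varphi(r) \Big], \end{aligned} \]

where $\mathbf{Q}^{\gamma\delta_0}$ denotes, as before, the (rate $\gamma$) Kingman-coalescent measure tree (compare with Definition 4.1.5). Since $\varphi \in C^1(\mathbb{R}_+^{\binom{\mathbb{N}}{2}})$ was chosen arbitrarily, the statement of Theorem 7.8.1 follows.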

7.9. The measure-valued Fleming-Viot process as a functional

In order to relate the tree-valued Fleming-Viot diffusion to the measure-valued Fleming-Viot diffusion we need the notion of families. Fix a time $t > 0$, and let $\mathcal{U} = (U, r, \mu) \in \mathbb{U}$ be a realization of the tree-valued Fleming-Viot dynamics at time $t$. By Theorem 7.1.6(b) we can assume w.l.o.g. that $\mathcal{U} \in \mathbb{U}_c$. We can therefore cover $(\mathrm{supp}(\mu), r)$ by a finite number of disjoint $2t$-balls (compare with Figure 7.2), i.e., we find a number $M = M^{\mathcal{U}} \in \mathbb{N}$ and a collection of disjoint open balls $\{B^i_{2t};\ i = 1, ..., M\}$ of radius $2t$ such that

(7.103) \[ \mathrm{supp}(\mu) = \bigcup_{i=1}^{M} B^i_{2t}. \]

In the following we refer to the isometry class of

(7.104) \[ \mathcal{F}_t := \{B^i_{2t};\ i = 1, ..., M\} \]

as the families with degree of kinship $t$, and to

(7.105) \[ \mathcal{W}_t := \{\mu_t(B^i_{2t});\ i = 1, ..., M\} \]

as the family weights at time $t$. Note that $\mathcal{F}_t$ and $\mathcal{W}_t$ are intrinsically defined.

Figure 7.2. The tree-valued Fleming-Viot dynamics at time t can be decomposed into disjoint balls of radius 2t. Every tree in the picture represents one of these balls. We remark that the tree has many tiny branches close to the tree top which we cannot plot. So what you see is only the macroscopic structure of the disjoint balls after some time.

Fix the type space $K = [0, 1]$, and recall from (7.2) the $\mathcal{M}_1(K)$-valued Fleming-Viot process, $\mu = (\mu_t)_{t\ge0}$, which starts in $\mu_0(dk) = dk$. It is well-known that for given $t > 0$, $\mathrm{supp}(\mu_t)$ is a finite set $\{k_1, ..., k_{M^{\mu_t}}\} \subset K$, for some random $M^{\mu_t} < \infty$, almost surely (see, for example, [Daw93]).

Denote, for each $t > 0$, by $\widetilde{\mathcal{W}}_t$ and $\widetilde{\nu}_t$ the size-ordered statistics of $\mathcal{W}_t$ and $\nu_t$, respectively. The following result states that the processes $(\widetilde{\mathcal{W}}_t)_{t\ge0}$ and $(\widetilde{\nu}_t)_{t\ge0}$ agree in distribution.

Proposition 7.9.1. The process $\widetilde{\mathcal{W}} = (\widetilde{\mathcal{W}}_t)_{t\ge0}$ is Markovian and equals in distribution $\widetilde{\nu} = (\widetilde{\nu}_t)_{t\ge0}$.

Remark 7.9.2 (The measure-valued Fleming-Viot process). One can also find the measure-valued Fleming-Viot process randomly embedded in the tree-valued Fleming-Viot dynamics (rather than just its size-biased weights). The construction is tedious and we only give a sketch. We need to be able to trace back families at different times to the same ancestor. For that fix a time $T$, and assume now that a realization $\mathcal{U} = (\mathcal{U}_t)_{t\in[0,T]}$ of the tree-valued Fleming-Viot dynamics is given. Consider the process

(7.106) \[ \mathcal{F} := \big( \mathcal{F}^{\mathcal{U}_{T-s}}_{T-s} \big)_{s\in[0,T]} \]

of the family decomposition of the Fleming-Viot population as time runs from $T$ to 0. It is clear that $\mathcal{F}$ satisfies the following: $\#\mathcal{F}_0 < \infty$, $\#\mathcal{F}$ is piece-wise constant, $\#\mathcal{F}$ has jumps of height 1, and $\lim_{s\to T} \#\mathcal{F}_s = \infty$, almost surely. Then, for all $k \ge \#\mathcal{F}_0$,

(7.107) \[ T_k := \inf\{ s \in [0, T] : \#\mathcal{F}_s = k \} < \infty, \]

almost surely. Moreover, the process

(7.108) \[ \mathcal{B} := \big( \{ \mu_{T-s}(B);\ B \in \mathcal{F}_s \} \big)_{s\ge0} \]

of family weights behaves as follows: in between $[T_k, T_{k+1})$, the $k$ elements fluctuate continuously, while for each time $t \in [T_k, T_{k+1})$, the elements of $\mathcal{B}$ take different values in $[0, 1]$, almost surely. We can therefore match continuous paths and thereby the ancestral lines. Notice that whenever two weights cross we cannot identify the ancestries. In that case we have to rely on a family of independent and symmetrically distributed $\{-1, 1\}$-valued variables to decide which path to follow in order to find the ancestor at time 0. At a time of the form $t = T_k$, for some $k$, a new ancestral line appears. This way we get a (random) map which sends an ancestor $B_0 \in \lim_{s\downarrow0} \mathcal{F}_s$ at time 0 to its descendant family $D(B_0, s) \in \mathcal{F}_{t-s}$ at time $s$. Moreover, we can identify each element $B_0 \in \lim_{s\downarrow0} \mathcal{F}_s$ with a type $x \in [0, 1]$. Then the process $W = (W_s)_{s\in[0,T]}$, where for $s \in [0, T]$,

(7.109) \[ W_s := \int_{[0,1]} dx\, \mu_s(D(x, s))\, \delta_x, \]

is distributed like the measure-valued Fleming-Viot process $\nu = (\nu_s)_{s\in[0,T]}$.

Proof of Proposition 7.9.1. Recall for each $N \in \mathbb{N}$ the tree-valued Moran dynamics $\mathcal{U}^N$ from (7.48). Introduce for each $t \ge 0$ the equivalence relation $\sim^N_t$ on $\{1, 2, ..., N\}$ by $i \sim^N_t j$ if and only if $r_t(i, j) < 2t$. Similarly to (7.104) and (7.105), let, for each $t \ge 0$, $\mathcal{F}^N_t$ be the set of equivalence classes with respect to $\sim^N_t$ and $\mathcal{W}^N_t := \{W^N_t(\pi);\ \pi \in \mathcal{F}^N_t\}$ the set of weights of these equivalence classes. That is, for $\pi \in \mathcal{F}^N_t$, $W^N_t(\pi) = \frac1N \#\pi$. It is clear that the process $\nu^N = (\nu^N_t)_{t\ge0}$ defined by

(7.110) \[ \nu^N_t := \int_K \nu_0(dx) \sum_{\pi\in\mathcal{F}^N_t} \mu^N_t(\pi)\, \delta_x, \]

for all $t \ge 0$, is the Moran model with type space $K$. Denote by $\widetilde{\mathcal{W}}^N := (\widetilde{\mathcal{W}}^N_t)_{t\ge0}$ and $\widetilde{\nu}^N := (\widetilde{\nu}^N_t)_{t\ge0}$ the Markov chains of the vectors of size-biased weights. Of course, also

(7.111) \[ \widetilde{\mathcal{W}}^N \overset{d}{=} \widetilde{\nu}^N. \]

Hence the claim follows by Theorem 7.3.1 since the map

(7.112) \[ \phi : \mathbb{U}^{\ne T}_c \ni \mathcal{U} \mapsto \widetilde{\mathcal{W}}_T(\mathcal{U}) \in \ell^{+,\downarrow}_{1,0}, \]

which sends a compact ultra-metric measure space with all distances being different from a given $T > 0$ to its size-biased weight vector "at time $T > 0$" in the space $\ell^{+,\downarrow}_{1,0}$ of non-negative non-increasing sequences whose entries sum up to 1 and of which only finitely many are positive, is continuous. Indeed, since $\mathcal{U}_t \in \mathbb{U}^{\ne 2t}_c$, almost surely, by continuity of $\mathcal{U}$, this implies that the finite dimensional distributions of $\widetilde{\mathcal{W}}$ agree with those of $\widetilde{\nu}$.
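For a finite population the families and their weights are straightforward to compute from the distance matrix. The sketch below uses hypothetical function names and assumes that `r` is an ultra-metric on $\{0, ..., N-1\}$, so that $r(i,j) < 2t$ is indeed an equivalence relation and comparing with a single representative of each class suffices.

```python
def families(r, t):
    """Equivalence classes of i ~ j iff r[i][j] < 2t (r assumed ultra-metric)."""
    classes = []
    for i in range(len(r)):
        for cls in classes:
            # In an ultra-metric space one representative decides membership.
            if r[i][cls[0]] < 2.0 * t:
                cls.append(i)
                break
        else:
            classes.append([i])
    return classes


def size_ordered_weights(r, t):
    """Family weights #pi / N, listed in non-increasing order."""
    n = len(r)
    return sorted((len(cls) / n for cls in families(r, t)), reverse=True)
```

Because balls in an ultra-metric space either agree or are disjoint, the classes returned by `families` are exactly the open $2t$-balls, and the number of classes is the number of such balls needed to cover the space.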


7.10. More general resampling mechanisms and extensions

So far we have considered tree-valued Moran resampling dynamics and their diffusion limit. As it will turn out, the construction of the tree-valued dynamics as solutions of well-posed martingale problems relies only on the consistency property that the processes of population size $N \in \mathbb{N} \cup \{\infty\}$ restricted to a random sample of size $n < N$ behave as the processes of population size $n$. We can therefore extend our construction to resampling dynamics which allow for multiple offspring in the forward picture and for coalescents with multiple mergers in the backward picture. In particular, we may even allow for offspring distributions with infinite variance, in which case the "infinite population limit" is a Lévy-type jump process rather than a diffusion. In this section we discuss extensions in several directions.

A general framework for resampling dynamics is provided by the models of Cannings (see [Can74, Can75] and Section 3.3 in [Ewe04] for a survey). For a fixed population size $N \in \mathbb{N}$, each Cannings model is characterized by a family of exchangeable random variables $V^N_1, ..., V^N_N$ with $\sum_{i=1}^{N} V^N_i = N$ recording the numbers of offspring of the individuals currently alive. After random waiting times the population is replaced by a new population where the $i$th individual, $i = 1, ..., N$, is replaced by $V^N_i$ new individuals. A special family of Cannings models are Wright-Fisher models where (waiting times are discrete and) $(V^N_1, ..., V^N_N)$ is a multinomially distributed random variable. In the following we denote the tree-valued Cannings dynamics of size $N$ by $\mathcal{V}^N = (\mathcal{V}^N_t)_{t\ge0}$.
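Two standard offspring vectors $(V^N_1, ..., V^N_N)$ can be sampled as follows. This is a minimal sketch with hypothetical function names: the Wright-Fisher vector is obtained by letting each of the $N$ new individuals choose its parent uniformly at random (which yields a symmetric multinomial vector), and the Moran vector corresponds to a single resampling event.

```python
import random
from collections import Counter


def wright_fisher_offspring(n, rng):
    """Multinomial(n; 1/n, ..., 1/n) offspring numbers via uniform parent choice."""
    parents = Counter(rng.randrange(n) for _ in range(n))
    return [parents[i] for i in range(n)]


def moran_offspring(n, rng):
    """One resampling event: one individual has two offspring, one has none."""
    k, l = rng.sample(range(n), 2)
    offspring = [1] * n
    offspring[k], offspring[l] = 2, 0
    return offspring
```

Both vectors are exchangeable and sum to $N$, which is all that the definition of a Cannings model requires.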

Kingman has shown in [Kin82b] that genealogies of the measure-valued Cannings dynamics in discrete time, rescaled by $N$ generations, converge to the Kingman coalescent, which is dual to the Fleming-Viot process, provided

(7.113) \[ \sup_{N\in\mathbb{N}} \mathbf{P}\big[ (V^N_1)^k \big] < \infty, \]

for all $k \ge 2$. We claim that the same result applies to the tree-valued dynamics.

Conjecture 7.10.1 ("Light tail" Cannings converges to Fleming-Viot). Let for $N \in \mathbb{N}$, $\mathcal{V}^N$ be the tree-valued Cannings dynamics of population size $N$. Assume that the corresponding offspring distributions satisfy (7.113). Under a suitable time rescaling, $(\mathcal{V}^N)_{N\in\mathbb{N}}$ converges weakly to the tree-valued Fleming-Viot dynamics in the Skorohod topology on $D_{\mathbb{U}}([0,\infty))$, as $N \to \infty$.

However, there is another regime where the Cannings dynamics still converge but the limit is not a diffusion any more. We focus in our discussion on the case where we then have a dual which allows for multiple mergers, i.e., processes where three or more partition elements coalesce.

For example, consider a finite measure $\Lambda \in \mathcal{M}_{\mathrm{fin}}([0, 1])$. The Λ-Cannings model is then a model with fixed population size $N \in \mathbb{N}$ and with individuals carrying a genetic type. The population evolves as follows:


• Each (unordered) pair resamples at constant rate Λ0.• Each individual kills at rate Λ(dx) a binomial number Bin(x,N)of individuals (including itself) and simultaneously reproduces thesame number of copies of itself,

where Bin(x,N) denotes a random variable with binomial distribution withparameters (x,N). In the following we denote by

(7.114) VN,Λ

the corresponding tree-valued Cannings dynamics.

In the limit of a large population size $N$ and in the long term, the genealogy of the Cannings model is given by Λ-coalescents, introduced in [Pit99]. It is known that Λ-coalescents can be associated with the Λ-coalescent measure tree, $\mathcal{U}^\Lambda_\infty \in \mathbb{U}$, provided the dust-free property holds, i.e.,

(7.115) $\int_0^1 \Lambda(dx)\, x^{-1} = \infty$

(compare Lemma 25 in [Pit99], Theorem 4.1.3 and Definition 4.1.5).

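Conjecture 7.10.2 (Λ-Cannings converge to Λ-resampling dynamics). Fix $\Lambda \in \mathcal{M}_{fin}([0,1])$ and assume that the dust-free property (7.115) holds. Assume that there exists a random element $\mathcal{V}^\Lambda_0$ with $\mathcal{V}^{N,\Lambda}_0 \Rightarrow \mathcal{V}^\Lambda_0$, as $N \to \infty$.

(i) There exists a $\mathbb{U}$-valued process $\mathcal{V}^\Lambda \in D_{\mathbb{U}}(\mathbb{R}_+)$ with $\mathcal{V}^{N,\Lambda} \Rightarrow \mathcal{V}^\Lambda$, as $N \to \infty$.

(ii) The process $\mathcal{V}^\Lambda$ is strong Markov and has the Λ-coalescent measure tree as its unique equilibrium.

In the following we refer to the process $\mathcal{V}^\Lambda$ as the tree-valued Λ-resampling dynamics. We are next interested in properties of $\mathcal{V}^\Lambda$.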

Note first that $\Lambda(0,1] > 0$ implies that a substantial fraction of the population can get replaced. Hence we expect continuous paths if and only if $\Lambda(dx) = \gamma\delta_0$.

Conjecture 7.10.3 (Continuous paths versus jumps). The following three properties are equivalent.

(a) The process $\mathcal{V}^\Lambda$ has continuous paths.

(b) $\Lambda = \gamma\delta_0$ for some $\gamma > 0$.

(c) The process $\mathcal{V}^\Lambda$ is the tree-valued Fleming-Viot dynamics.

In $\mathcal{V}^\Lambda$, a substantial fraction of the whole population can be replaced by a single event. If this happens, the measure has an atom, which nevertheless gets destroyed immediately by the leaf growth.

Conjecture 7.10.4 ($\mathcal{V}^\Lambda$ has no atoms). Let $\Lambda \in \mathcal{M}([0,1])$ and $\mathcal{V}^\Lambda = (\mathcal{V}^\Lambda_t)_{t\ge 0}$. For Lebesgue almost all $t$, the metric measure space $\mathcal{V}^\Lambda_t$ has no atoms.


Moreover, it is known that the Λ-coalescent comes down from infinity in infinitesimally small time if and only if (4.12) holds (compare Proposition 4.1.6 and Remark 4.1.7). Since for ultra-metric spaces balls either agree or are disjoint, the number of ancestral lines a time $\varepsilon > 0$ back equals the number of $\varepsilon$-balls one needs to cover the ultra-metric space representing the coalescent tree. Hence we expect the following.

Conjecture 7.10.5 (Compactness and “coming down from infinity”). The following properties are equivalent.

(i) For each $t > 0$, $\mathcal{V}^\Lambda_t \in \mathbb{U}_c$, almost surely.

(ii) The measure $\Lambda \in \mathcal{M}_{fin}([0,1])$ satisfies Condition (4.12).

Moreover, we want to point out that there are more general resampling dynamics in which even two or more individuals may replace substantial fractions of the whole population. Their genealogical trees are described by coalescent processes allowing for simultaneous multiple mergers (see [Sch00a]).

Remark 7.10.6 (Extensions which require more fundamental work). There are several extensions of the tree-valued processes we have constructed which require the investigation of further topological aspects of the corresponding state spaces. These can be grouped into two different directions.

• First, a re-introduction of types on some type space $K$ would allow for type-dependent evolution.

– In particular, this would allow one to construct tree-valued resampling dynamics under mutation and selection.

– Alternatively, if individuals are assigned a location in some geographical space, we can model the tree-valued dynamics of spatially structured resampling models.

• Secondly, more information about the population would result from constructing a genealogical tree relating all individuals which have ever been alive in the population, i.e., from including all the fossils besides the current population.

We address the above-mentioned questions in [GSW, GPW].

7.11. Application: Sample tree lengths distributions

In this section we investigate the functional of the length of sub-trees spanned by subsequently sampled points and discuss applications in population genetics.

We first state that this functional is the solution of a well-posed martingale problem and therefore evolves as an autonomous strong Markov process. The length of the genealogical tree spanned by a finite sample from the population at a given time represents the total rate at which mutations have occurred since the time of the most recent common ancestor of the sample.


Therefore the probability of the event that no mutation separated the individuals of the sample leads to the mean sample Laplace transform of the length of the subtree spanned by a sample of the given size.

We will therefore apply the tree lengths process to study state-independent mutations in the neutral model.

Subtree lengths distribution process is strongly Markovian. Let $(X, r)$ be a metric space satisfying the four point condition given in (1.13). Moreover, recall from Proposition 1.5.7 that $(X, r)$ can then be embedded into an R-tree which spans $(X, r)$. For $n \in \mathbb{N}$ and $x_1, ..., x_n \in X$, let

(7.116) $L^{(X,r)}_n(x_1, ..., x_n)$ := length of the subtree of $(X, r)$ spanned by $x_1, ..., x_n$.

Note that the length of the tree is (indeed) a function of the mutual distances between points (see Lemma 1.7.3 for explicit expressions).

Moreover, recall from Example 1.5.4 that each ultra-metric space $(U, r)$ satisfies the four point condition, and consider its vector of sub-tree lengths, spanned by sequentially sampled points, i.e.,

(7.117) $L^{(U,r)}(u) := \big(L^{(U,r)}_1(u), L^{(U,r)}_2(u), ...\big) \in \mathbb{R}^{\mathbb{N}}_+$

with $L^{(U,r)}_1 := 0$.

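Recall Lemma 1.7.3(ii), and consider the test function $\phi^{\mathrm{length}} \in C^1\big(\mathbb{R}_+^{\binom{\mathbb{N}}{2}}\big)$ given by

(7.118) $\phi^{\mathrm{length}}\big((r_{i,j})_{1\le i<j}\big) := \Big(0,\ \tfrac12 r_{1,2},\ \tfrac12 \inf\Big\{\sum_{i=1}^3 r_{i,\sigma(i)};\ \sigma \in \Sigma^1_3\Big\},\ ...\Big).$

Then for each $\mathcal{U} = (U, r, \mu) \in \mathbb{U}$, the distribution of $L^{(U,r)}(\cdot)$ under the product measure $\mu^{\otimes\mathbb{N}}$ does not depend on the representative and is given by

(7.119) $\kappa(\mathcal{U}) := \big(\phi^{\mathrm{length}}\big)_* \nu^{\mathcal{U}} \in \mathcal{M}_1(\mathbb{R}^{\mathbb{N}}_+).$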

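We investigate the evolution of the subtree length distribution under the tree-valued Fleming-Viot dynamics. That is, given the tree-valued Fleming-Viot dynamics $\mathcal{U} = (\mathcal{U}_t)_{t\ge 0}$, we consider $\Lambda = (\Lambda_t)_{t\ge 0}$, where for $t \ge 0$,

(7.120) $\Lambda_t := \kappa(\mathcal{U}_t).$

For that purpose consider the algebra $\Pi^\Lambda$ generated by functions $\Psi : \mathcal{M}_1(\mathbb{R}^{\mathbb{N}}_+) \to \mathbb{R}$ which are of the form

(7.121) $\Psi(\lambda) := \Psi^\psi(\lambda) = \int_{\mathbb{R}^{\mathbb{N}}_+} \lambda(d\ell)\, \psi(\ell),$

for a test function $\psi \in C^1(\mathbb{R}^{\mathbb{N}}_+)$ which depends on finitely many variables only. By standard arguments, $\Pi^\Lambda$ is separating.

Define for each $n \in \mathbb{N}$ the reproduction operator $\beta_n : \mathbb{R}^{\mathbb{N}}_+ \to \mathbb{R}^{\mathbb{N}}_+$ by

(7.122) $\beta_n : (\ell_1, \ell_2, \ell_3, ...) \mapsto (\ell_1, \ell_2, ..., \ell_{n-1}, \ell_n, \ell_n, \ell_{n+1}, ...).$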


The key observation is the following.

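Lemma 7.11.1. For all $\Psi = \Psi^\psi \in \Pi^\Lambda$,

(7.123) $\Omega^\uparrow(\Psi\circ\kappa)(\mathcal{U}) = \int d\kappa(\mathcal{U}) \sum_{n\ge 2} n \frac{\partial}{\partial \ell_n}\psi + \gamma \sum_{n\ge 1} n \int d\kappa(\mathcal{U})\, \big\{\psi\circ\beta_n - \psi\big\}.$

Proof.

(7.124)
$\Omega^\uparrow(\Psi\circ\kappa)(\mathcal{U})$
$= 2\int d\nu^{\mathcal{U}}\, \mathrm{div}\big(\psi\circ\phi^{\mathrm{length}}\big) + \gamma \sum_{1\le k<l} \int d\nu^{\mathcal{U}}\, \big\{\psi\circ\phi^{\mathrm{length}}\circ\theta_{k,l} - \psi\circ\phi^{\mathrm{length}}\big\}$
$= 2\int d\nu^{\mathcal{U}} \sum_{n\ge 2} \partial_{\ell_n}\big(\psi\circ\phi^{\mathrm{length}}\big) \cdot \sum_{1\le i<j} \partial_{r_{i,j}}\phi^{\mathrm{length}} + \gamma \sum_{n\ge 2} (n-1) \int d\nu^{\mathcal{U}}\, \big\{\psi\circ\beta_{n-1}\circ\phi^{\mathrm{length}} - \psi\circ\phi^{\mathrm{length}}\big\}$
$= \int d\kappa(\mathcal{U}) \sum_{n\ge 2} n \frac{\partial}{\partial \ell_n}\psi + \gamma \sum_{n\ge 1} n \int d\kappa(\mathcal{U})\, \big\{\psi\circ\beta_n - \psi\big\}.$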

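By Lemma 7.11.1 the action of the operator $\Omega^\uparrow$ on functions $F = F^{\mathrm{Id},\phi} \in \mathcal{F}$ depends on $\mathcal{U} \in \mathbb{U}$ only via $\kappa(\mathcal{U})$. We therefore define the operator $\Omega^{\uparrow,\Lambda}$ acting on $\Pi^\Lambda$ by

(7.125) $\Omega^{\uparrow,\Lambda}\Psi(\lambda) := \int d\lambda \sum_{n\ge 2} n \frac{\partial}{\partial \ell_n}\psi + \gamma \sum_{n\ge 1} n \int d\lambda\, \big\{\psi\circ\beta_n - \psi\big\}.$

The process $\Lambda$ then solves, for all $\mathbf{P}_0 \in \mathcal{M}_1(\mathcal{M}_1(\mathbb{R}^{\mathbb{N}}_+))$, the $(\mathbf{P}_0, \Omega^{\uparrow,\Lambda}, \Pi^\Lambda)$-martingale problem by construction. The main result of the section is the following.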

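Theorem 7.11.2 (The subtree lengths distribution process). Fix $\mathbf{P}_0 \in \mathcal{M}_1(\mathcal{M}_1(\mathbb{R}^{\mathbb{N}}_+))$.

(i) The $(\mathbf{P}_0, \Omega^{\uparrow,\Lambda}, \Pi^\Lambda)$-martingale problem is well-posed.

(ii) The process $\Lambda = (\Lambda_t)_{t\ge 0}$ is the unique solution of the $(\mathbf{P}_0, \Omega^{\uparrow,\Lambda}, \Pi^\Lambda)$-martingale problem. In particular, it is a strong Markov process.

(iii) The process $\Lambda$ has continuous sample paths.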

For the proof of uniqueness we use techniques from filtered martingale problems. To prepare them, define recursively (in $n \in \mathbb{N}$) maps which send $n \in \mathbb{N}$ and a length vector $\ell \in \mathbb{R}^{\mathbb{N}}_+$ (with $0 = \ell_1 \le \ell_2 \le ...$) to an $n$-point distance matrix distribution $\nu_n \in \mathcal{M}_1\big(\mathbb{R}_+^{\binom{n}{2}}\big)$. In fact we are going to construct these distributions such that they are supported on ultra-metric distances.


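The heuristics will be the following: Given an (ultra-metric) tree $(U, r)$ spanned by $n$ points of total length $\ell_n$, to obtain a new (ultra-metric) tree spanned by $(U, r)$ and an additional point, now of total length $\ell_{n+1}$, we regraft a branch of length $\ell_{n+1} - \ell_n$ at the “root” of one of the “family clusters” of degree of kinship $\tfrac12(\ell_{n+1} - \ell_n)$, and we choose the cluster at random.

For a formal construction, assume that we are given $\ell$.

• Put $\nu_2(\ell; \cdot) := \delta_{\ell_2}$.

• Given a distance matrix $\underline{r} = (r_{i,j})_{1\le i<j\le n} \in \mathbb{R}_+^{\binom{n}{2}}$ and $r > 0$, recall the relation $\sim_r$ on $\{1, 2, ..., n\}$ defined by $i \sim_r j$ if and only if $r_{i,j} \le \tfrac{r}{2}$. For ultra-metric distances $\sim_r$ is an equivalence relation. Let therefore $\mathcal{B}^n_r$ denote the set of equivalence classes with respect to $\sim_r$.

• Assume we have already defined $\nu_k(\ell; \cdot)$; $k = 2, ..., n$. Using the abbreviation $\ell^\Delta_{n+1} := \ell_{n+1} - \ell_n$, put for all $A_{i,j} \in \mathcal{B}(\mathbb{R}_+)$, $1 \le i < j \le n+1$,

(7.126) $\nu_{n+1}\big(\ell; \times_{1\le i<j\le n+1} A_{i,j}\big) := \int_{\times_{1\le i<j\le n} A_{i,j}} \nu_n(\ell; d\underline{r})\, \frac{1}{\#\mathcal{B}^n_{\ell^\Delta_{n+1}}} \sum_{B \in \mathcal{B}^n_{\ell^\Delta_{n+1}}} \prod_{i\in B,\, j\notin B} 1_{A_{i,n+1}}(r_{i,j}) \prod_{i\in B} 1_{A_{i,n+1}}\big(\tfrac12 \ell^\Delta_{n+1}\big).$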

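Define a transition kernel $\alpha : \mathcal{M}_1(\mathbb{R}^{\mathbb{N}}_+) \to \mathbb{U}$ whose action on $F = F^{\mathrm{Id},\psi\circ\rho_{\le n}} \in \Pi^1$, for some $n \in \mathbb{N}$ and $\psi \in C\big(\mathbb{R}_+^{\binom{n}{2}}\big)$, is defined by

(7.127) $\int_{\mathbb{U}} \alpha(\lambda, d\mathcal{U}) \int_{\mathbb{R}^{\mathbb{N}}_+} \nu^{\mathcal{U}}(dr)\, \psi\circ\rho_{\le n}(r) := \int_{\mathbb{R}^{\mathbb{N}}_+} \lambda(d\ell) \int_{\mathbb{R}_+^{\binom{\mathbb{N}}{2}}} \nu_n(\ell; dr)\, \psi\circ\rho_{\le n}(r).$

The following result states that $\alpha$ indeed defines a transition kernel which satisfies the intertwining condition with respect to the operator of the tree-valued Fleming-Viot dynamics and the filtered subtree lengths distribution.

Proposition 7.11.3 (Intertwining condition).

(i) For all $\lambda \in \mathcal{M}_1(\mathbb{R}^{\mathbb{N}}_+)$ such that $\kappa^{-1}(\lambda) \ne \emptyset$, $\alpha\big(\lambda, \kappa^{-1}(\lambda)\big) = 1$.

(ii) For all $F = F^{\mathrm{Id},\phi} \in \Pi^1$,

(7.128) $G^{\uparrow,\Lambda}\Big(\int_{\mathbb{R}^{\mathbb{N}}_+} \cdot\,(d\ell) \int_{\mathbb{R}^{\mathbb{N}}_+} \alpha(\ell, dr)\, \phi(r)\Big)(\lambda) = \int \alpha(\lambda, d\mathcal{U})\, G^\uparrow F^{\mathrm{Id},\phi}(\mathcal{U}).$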


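Proof. (i) Notice first that applying (7.127) to $\psi \equiv 1$ yields that $\alpha(\lambda, \cdot)$ is a probability measure, while for $\mathcal{U}_0 \in \mathbb{U}$,

(7.129)
$\int_{\mathbb{U}} \alpha\big(\kappa(\mathcal{U}_0), d\mathcal{U}\big) \int_{\mathbb{R}^{\mathbb{N}}_+} \nu^{\mathcal{U}}(dr)\, \psi\circ\rho_{\le n}(r)$
$= \int_{\mathbb{R}^{\mathbb{N}}_+} \big(\phi^{\mathrm{length}}\big)_*\nu^{\mathcal{U}_0}(d\ell) \int_{\mathbb{R}_+^{\binom{\mathbb{N}}{2}}} \nu_n(\ell; dr)\, \psi\circ\rho_{\le n}(r)$
$= \int_{\mathbb{R}^{\mathbb{N}}_+} \nu^{\mathcal{U}_0}(dr)\, \psi\circ\rho_{\le n}(r)$
$= \int_{\kappa^{-1}(\kappa(\mathcal{U}_0))} \alpha\big(\kappa(\mathcal{U}_0), d\mathcal{U}\big) \int_{\mathbb{R}^{\mathbb{N}}_+} \nu^{\mathcal{U}}(dr)\, \psi\circ\rho_{\le n}(r),$

which proves the claim.

(ii) For the second claim note that

(7.130)
$\int_{\mathbb{U}} \alpha(\lambda, d\mathcal{U})\, G^{\uparrow,\mathrm{grow}} F^{\mathrm{Id},\psi\circ\rho_{\le n}}(\mathcal{U})$
$= \int_{\mathbb{R}^{\mathbb{N}}_+} \lambda(d\ell) \int \nu_n(\ell; dr)\, \mathrm{div}_r\big(\psi\circ\rho_{\le n}\big)(r)$
$= \int_{\mathbb{R}^{\mathbb{N}}_+} \lambda(d\ell) \int_{\mathbb{R}_+^{\binom{\mathbb{N}}{2}}} \sum_{k=2}^n k \frac{\partial}{\partial \ell_k} \nu_n(\ell; dr)\, \psi\circ\rho_{\le n}(r)$

and

(7.131)
$\int_{\mathbb{U}} \alpha(\lambda, d\mathcal{U})\, G^{\uparrow,\mathrm{res}} F^{\mathrm{Id},\psi\circ\rho_{\le n}}(\mathcal{U})$
$= \int_{\mathbb{R}^{\mathbb{N}}_+} \lambda(d\ell) \int_{\mathbb{R}^{\mathbb{N}}_+} \nu_n(\ell; dr) \sum_{1\le k<l\le n} \big\{\psi\circ\rho_{\le n}(\theta_{k,l} r) - \psi\circ\rho_{\le n}(r)\big\}$
$= \int_{\mathbb{R}^{\mathbb{N}}_+} \lambda(d\ell) \sum_{k=1}^n k \int_{\mathbb{R}_+^{\binom{\mathbb{N}}{2}}} \big\{\nu_n(\beta_k \ell, dr) - \nu_n(\ell, dr)\big\}\, \psi\circ\rho_{\le n}(r).$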

Proof of Theorem 7.11.2. (i) Existence of a solution of the martingale problem follows by construction. By Theorem 2.3 in [Kur98] together with Proposition 7.11.3, uniqueness of the $(\mathbf{P}_0, \Omega^{\uparrow,\Lambda}, \Pi^\Lambda)$-martingale problem follows from the existence of the $(\mathbf{P}_0, \Omega^\uparrow, \Pi^1)$-martingale problem.

(ii) Since $\Lambda$ is the unique solution of a martingale problem, it has the strong Markov property (see, for example, Theorem 4.4.2 in [EK86]).

(iii) Since the tree-valued Fleming-Viot dynamics has continuous paths by Theorem 7.1.6(ii), it is enough to show that the map $\kappa : \mathbb{U} \to \mathcal{M}_1(\mathbb{R}^{\mathbb{N}}_+)$ defined in (7.119) is continuous. To see this, assume that $(\mathcal{U}_n)_{n\in\mathbb{N}}$ is a sequence in $\mathbb{U}$ such that $\mathcal{U}_n \to \mathcal{U}$, for some $\mathcal{U} \in \mathbb{U}$, in the Gromov-weak topology as $n \to \infty$. Then by Definition 2.1.9, $\Phi(\mathcal{U}_n) \to \Phi(\mathcal{U})$, for all $\Phi \in \Pi$, as $n \to \infty$. In particular, $\langle\kappa(\mathcal{U}_n), \psi\rangle \to \langle\kappa(\mathcal{U}), \psi\rangle$, for all $\psi \in C(\mathbb{R}^{\mathbb{N}}_+)$, or equivalently, $\kappa(\mathcal{U}_n) \Rightarrow \kappa(\mathcal{U})$ in the weak topology on $\mathcal{M}_1(\mathbb{R}^{\mathbb{N}}_+)$, as $n \to \infty$.

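By the Markov property of the subtree lengths distribution, the quadratic variation $\langle\Psi(\Lambda_\cdot)\rangle_t$ for a function $\Psi \in \Pi^{1,\Lambda}$ can be expressed as a functional of $\Lambda_t$ only.

Corollary 7.11.4 (Quadratic variation of the subtree lengths distribution process). For all $\Psi = \Psi^{\psi\circ\rho_{\le n}} \in \Pi^{1,\Lambda}$, for an $n \in \mathbb{N}$ and a test function $\psi \in C^1(\mathbb{R}^{\mathbb{N}}_+)$,

(7.132) $\langle\Psi(\Lambda_\cdot)\rangle_t = 2\int_0^t ds \int_{\mathbb{R}^{\mathbb{N}}_+} \Lambda_s(d\ell) \int_{\mathbb{R}_+^{\binom{\mathbb{N}}{2}}} \nu_n(\ell; dr) \sum_{1\le k,l\le n} \big\langle\psi\circ\rho_{\le n}\circ\phi^{\mathrm{length}}\big\rangle_{k,l}.$

In particular, if $\psi \not\equiv c$ for every $c \in \mathbb{R}$, then the quadratic variation is positive.

Proof. This follows immediately from $(\rho_{\le n})_*\nu^{\mathcal{U}_t} = \int \Lambda_t(d\ell)\, \nu_n(\ell; \cdot)$ together with Theorem 7.1.6.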

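Explicit calculations. To prepare the applications, we investigate for each time $t \ge 0$ the mean sample Laplace transform,

(7.133) $g(t; \theta) := \mathbf{P}\big[\Psi^\theta(\Lambda_t)\big],$

of the subtree lengths distribution $\Lambda_t$, where for $\theta \in \mathbb{R}^{\mathbb{N}}_+$,

(7.134) $\Psi^\theta(\lambda) := \int_{\mathbb{R}^{\mathbb{N}}_+} \lambda(d\ell)\, \psi^\theta(\ell)$

with the test function

(7.135) $\psi^\theta(\ell) := \exp(-\langle\theta, \ell\rangle).$

As usual, $\langle\cdot,\cdot\rangle$ denotes the inner product.

Lemma 7.11.5 (ODE system for the mean sample Laplace transforms). Fix $\theta \in \mathbb{R}^{\mathbb{N}}_+$ and $t \ge 0$. The mean sample Laplace transform under the subtree lengths distribution of the tree-valued Fleming-Viot dynamics at time $t$ satisfies the following system of differential equations:

(7.136) $\frac{d}{dt} g(t; \theta) = -\Big(\sum_{k=2}^\infty k\theta_k\Big) g(t; \theta) + \gamma \sum_{k=1}^\infty k\big(g(t; \tau_k\theta) - g(t; \theta)\big)$

with the merging operator

(7.137) $\tau_k : (\theta_1, ..., \theta_{k-1}, \theta_k, \theta_{k+1}, \theta_{k+2}, ...) \mapsto (\theta_1, ..., \theta_{k-1}, \theta_k + \theta_{k+1}, \theta_{k+2}, ...).$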


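Proof. By standard arguments, $\Psi^\theta \in \Pi^\Lambda$ and

(7.138) $\frac{d}{dt} g(t; \theta) = \mathbf{P}\big[\Omega^{\uparrow,\Lambda}\Psi^\theta(\Lambda_t)\big].$

Hence, inserting (7.125), and using $\psi^\theta(\beta_k\ell) = \psi^{\tau_k\theta}(\ell)$ for all $k = 1, 2, ...$, with $\beta_k$ from (7.122) and $\tau_k$ from (7.137), we find that $g(t; \theta)$ satisfies the system of differential equations

(7.139)
$\frac{d}{dt} g(t, \theta) = \mathbf{P}\Big[-\int \Lambda_t(d\ell) \sum_{k=2}^\infty k\theta_k \psi^\theta(\ell) + \gamma \int \Lambda_t(d\ell) \sum_{k=1}^\infty k\big(\psi^\theta(\beta_k\ell) - \psi^\theta(\ell)\big)\Big]$
$= -\Big(\sum_{k=2}^\infty k\theta_k\Big) g(t, \theta) + \gamma \sum_{k=1}^\infty k\big(g(t, \tau_k\theta) - g(t, \theta)\big),$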

as claimed.
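The identity $\psi^\theta(\beta_k\ell) = \psi^{\tau_k\theta}(\ell)$ used in the proof can be checked numerically on finitely many coordinates. The following is a minimal sketch, assuming that $\theta$ has finite support so that truncated vectors carry all the information; the helper names `beta`, `tau` and `psi` are introduced here for illustration only.

```python
from math import exp, isclose

def beta(k, ell):
    """Reproduction operator (7.122): duplicate the k-th entry (1-based)."""
    return ell[:k] + [ell[k - 1]] + ell[k:]

def tau(k, theta):
    """Merging operator (7.137): replace entries k and k+1 by their sum."""
    return theta[:k - 1] + [theta[k - 1] + theta[k]] + theta[k + 1:]

def psi(theta, ell):
    """Test function (7.135), evaluated on finitely many coordinates."""
    return exp(-sum(t * l for t, l in zip(theta, ell)))

# Truncated vectors: theta is one coordinate longer than ell, so that
# beta(k, ell) pairs with theta and tau(k, theta) pairs with ell.
ell = [0.0, 0.8, 1.1, 1.9, 2.4]
theta = [0.3, 0.0, 1.2, 0.5, 0.7, 0.4]

checks = [(psi(theta, beta(k, ell)), psi(tau(k, theta), ell))
          for k in range(1, len(ell) + 1)]
```

Each pair in `checks` compares the left-hand and right-hand side of the identity for one value of $k$.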

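Fix $n \in \mathbb{N}$. If we consider specifically $\theta = (\theta_k)_{k=1,2,...}$ with

(7.140) $\theta_k := \theta \cdot \delta_{k,n},$

for some $\theta \ge 0$ and $n \in \mathbb{N}$, we obtain the mean sample Laplace transform $g_n$ under the distribution of the length of the subtree spanned by a sample of size $n$. That is, for $t \ge 0$ and $\theta \ge 0$,

(7.141) $g_n(t; \theta) := \mathbf{P}\Big[\int_{\mathbb{R}^{\mathbb{N}}_+} \Lambda_t(d\ell)\, e^{-\theta\ell_n}\Big].$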

Lemma 7.11.5 implies therefore the following explicit expressions.

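Corollary 7.11.6 (Mean sample Laplace transforms). For all $\theta \in \mathbb{R}_+$ and $n \ge 2$,

(7.142)
$g_n(t, \theta) = \Gamma(n) \sum_{k=2}^n \binom{n}{k} (-1)^k \frac{\frac{2}{\gamma}\theta + 2k - 1}{\Gamma(\frac{2}{\gamma}\theta + n + k)} \cdot \Big\{ e^{-k(\theta + \frac{\gamma}{2}(k-1))t} \sum_{m=2}^k \binom{k}{m} (-1)^m \frac{\Gamma(\frac{2}{\gamma}\theta + k + m - 1)}{\Gamma(m)}\, g_m(0; \theta)$
$\qquad + \big(1 - e^{-k(\theta + \frac{\gamma}{2}(k-1))t}\big)(k-1)\big(\tfrac{2}{\gamma}\theta + k\big)\Gamma\big(\tfrac{2}{\gamma}\theta + k - 1\big) \Big\}.$

In particular, if $g_n(\theta) = \lim_{t\to\infty} g_n(t; \theta)$ then

(7.143) $g_n(\theta) = \mathbf{P}\Big[e^{-\theta \sum_{k=2}^n E_k^{(\frac{\gamma}{2}(k-1))}}\Big],$

where $E_k^{(\frac{\gamma}{2}(k-1))}$; $k = 2, ..., n$ are independent and $E_k^{(\frac{\gamma}{2}(k-1))}$ is exponentially distributed with mean $\frac{2}{\gamma(k-1)}$, $k = 2, ..., n$.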


Remark 7.11.7 (The Kingman measure tree lengths distribution). Recall from Definition 4.1.5 the law $Q^\Lambda$ of the Λ-coalescent measure tree and, for the specific choice $\Lambda = \gamma\delta_0$, the (rate $\gamma$) Kingman coalescent tree. If we let the time tend to infinity in Corollary 7.11.6 and apply Theorem 7.8.1, we recover the following well-known fact (see, for example, [Wat75]):

Under $Q^{\gamma\delta_0}$, for all $n \ge 2$,

(7.144) $Q^{\gamma\delta_0}\big[\kappa(\mathcal{U})\big] = \mathcal{L}\Big[\sum_{k=2}^n E_k\Big],$

where $E_k$; $k = 2, ..., n$ are independent and $E_k$ is exponentially distributed with mean $\frac{2}{\gamma(k-1)}$, $k = 2, ..., n$.

The analogous results for more general Λ-coalescents can be found in [DDSJ07, DIMR07].

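Remark 7.11.8. Recall, for each $n \in \mathbb{N}$, the function $g_n$ from (7.141). For each $n \ge 2$ and $\theta \ge 0$, applying (7.136) to $\theta = (\theta\delta_{k,n})_{k\ge 2}$ yields, setting $g_1(t; \theta) := 1$,

(7.145)
$\frac{d}{dt} g_n(t; \theta) = -n\theta\, g_n(t; \theta) + \gamma\binom{n}{2}\big(g_{n-1}(t; \theta) - g_n(t; \theta)\big)$
$= \frac{\gamma}{2} n(n-1)\, g_{n-1}(t; \theta) - \frac{\gamma}{2} n\big(\tfrac{2}{\gamma}\theta + n - 1\big)\, g_n(t; \theta),$

i.e.,

(7.146) $\frac{d}{dt}\big(g_2(t; \theta), g_3(t; \theta), ...\big)^\top = \frac{\gamma}{2}\Big[A\big(\tfrac{2}{\gamma}\theta\big)\big(g_2(t; \theta), g_3(t; \theta), ...\big)^\top + (2, 0, ...)^\top\Big],$

where for $\theta \ge 0$ the matrix $A := A(\theta)$ is defined by

(7.147) $A_{k,l} := \begin{cases} k(k-1), & \text{if } k = l + 1, \\ -k(\theta + k - 1), & \text{if } k = l, \\ 0, & \text{else,} \end{cases}$

for all $k, l \ge 2$.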

We therefore prepare the proof of Corollary 7.11.6 with the following lemma.

Lemma 7.11.9. Fix $\theta \ge 0$. Let $B = (B_{k,l})_{k,l\ge 2}$ and $B^{-1} = (B^{-1}_{k,l})_{k,l\ge 2}$ be matrices defined by

(7.148) $B_{k,l} := \frac{k!}{l!}\binom{k-1}{l-1}\frac{\Gamma(\theta + 2l)}{\Gamma(\theta + k + l)}$, and $B^{-1}_{k,l} := (-1)^{k+l}\frac{k!}{l!}\binom{k-1}{l-1}\frac{\Gamma(\theta + k + l - 1)}{\Gamma(\theta + 2k - 1)}.$

(i) The matrices $B$ and $B^{-1}$ are inverse to each other.

(ii) The matrix $A = A(\theta) = (A_{k,l})_{k,l\ge 2}$ has eigenvalues

(7.149) $\lambda_k := -k(\theta + k - 1), \quad k \ge 2.$

(iii) If $D = (\lambda_k\delta_{k,l})_{k,l\ge 2}$ then

(7.150) $f(A) = B f(D) B^{-1}$

for all functions $f : \mathbb{R}^{\mathbb{N}^2} \to \mathbb{R}^{\mathbb{N}^2}$. Specifically, $A^{-1} = BD^{-1}B^{-1}$ and $e^{At} = Be^{Dt}B^{-1}$ for all $t \ge 0$.

(iv) If for $\theta > 0$, $A^{-1}(\theta) = (A^{-1}_{k,l})_{k,l\ge 2}$ is given by

(7.151) $A^{-1}_{k,l} := -\frac{(k-1)!\,\Gamma(\theta + l - 1)}{l!\,\Gamma(\theta + k)}$ for $k \ge l$, and $A^{-1}_{k,l} := 0$ for $k < l$,

then $A^{-1}$ and $A$ are inverse to each other.

Proof. (i) We need to show that

(7.152) $(B \cdot B^{-1})_{k,l} = \delta_{k,l}$

for all $k, l \ge 2$. This is clear in the case where $k \le l$. For $k > l \ge 2$, with constants $C$ changing from line to line, and using the abbreviations $\tilde{k} := k - l$ and $\tilde{\theta} := \theta + 2l - 1$,

(7.153)
$(B \cdot B^{-1})_{k,l} = \sum_{m=l}^k B_{k,m} B^{-1}_{m,l}$
$= \sum_{m=l}^k \frac{k!}{m!}\binom{k-1}{m-1}\frac{\Gamma(\theta + 2m)}{\Gamma(\theta + k + m)} \cdot (-1)^{m+l}\frac{m!}{l!}\binom{m-1}{l-1}\frac{\Gamma(\theta + m + l - 1)}{\Gamma(\theta + 2m - 1)}$
$= C\sum_{m=l}^k (-1)^{m+l}\frac{(\theta + 2m - 1)\Gamma(\theta + m + l - 1)}{(k - m)!\,(m - l)!\,\Gamma(\theta + k + m)}$
$= C\sum_{m=0}^{\tilde{k}} (-1)^m\frac{(\tilde{\theta} + 2m)\Gamma(\tilde{\theta} + m)}{\Gamma(\tilde{k} - m + 1)\Gamma(m + 1)\Gamma(\tilde{\theta} + \tilde{k} + m + 1)}$
$= C\sum_{m=0}^{\tilde{k}} (-1)^m\frac{(\tilde{\theta} + 2m)\Gamma(\tilde{\theta} + m)}{\Gamma(m + 1)} \cdot \frac{\Gamma(\tilde{\theta} + 2\tilde{k} + 1)}{\Gamma(\tilde{\theta} + \tilde{k} + m + 1)\Gamma(\tilde{k} - m + 1)}$
$= 0,$

where we have used that

(7.154) $C \cdot \frac{(\tilde{\theta} + 2m)\Gamma(\tilde{\theta} + m)}{\Gamma(m + 1)} = \frac{\Gamma(\tilde{\theta} + m + 1)}{\Gamma(m + 1)\Gamma(\tilde{\theta} + 1)} + \frac{\Gamma(\tilde{\theta} + m)}{\Gamma(m)\Gamma(\tilde{\theta} + 1)}$

and then applied Formula (5d) on page 10 in [Rio68].

(ii) Since $A$ is lower triangular, this is obvious.

(iii) Note that

(7.155) $(A \cdot B)_{2,l} - \lambda_l B_{2,l} = 0$

and

(7.156) $\lambda_l - \lambda_k = \theta(k - l) + (k^2 - k - l^2 + l) = (k - l)(\theta + k + l - 1).$

Thus for all $k \ge 3$ and $l \ge 2$ with $k > l$,

(7.157) $B_{k,l} = \frac{k(k-1)}{(k - l)(\theta + k + l - 1)} B_{k-1,l},$

and since $A_{k,k} = \lambda_k$,

(7.158)
$(A \cdot B)_{k,l} - \lambda_l B_{k,l} = A_{k,k-1} B_{k-1,l} + (\lambda_k - \lambda_l) B_{k,l}$
$= \big(k(k-1) - k(k-1)\big) B_{k-1,l}$
$= 0,$

which proves that $B$ contains all eigenvectors of $A$. Hence the claim follows by standard linear algebra.

(iv) It is clear that $(A \cdot A^{-1})_{k,k} = A_{k,k} A^{-1}_{k,k} = 1$, while for $k > l$,

(7.159)
$(A \cdot A^{-1})_{k,l} = A_{k,k-1} \cdot A^{-1}_{k-1,l} + A_{k,k} \cdot A^{-1}_{k,l}$
$= -k(k-1)\frac{(k-2)!\,\Gamma(\theta + l - 1)}{l!\,\Gamma(\theta + k - 1)} + k(\theta + k - 1)\frac{(k-1)!\,\Gamma(\theta + l - 1)}{l!\,\Gamma(\theta + k)}$
$= 0.$

By Theorem 7.8.1 the claim follows.
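Parts (i) to (iii) of Lemma 7.11.9 can be sanity-checked numerically. The sketch below is not part of the proof; it assumes a truncation of all matrices to indices $2 \le k, l \le n$ (which is exact here because every matrix involved is lower triangular) and uses the normalisation $\Gamma(\theta + 2k - 1)$ in the row index for $B^{-1}$, as it appears in the computation (7.153). All function names are chosen for illustration.

```python
from math import comb, factorial, gamma

def A(k, l, theta):
    """Entries of the matrix A(theta) from (7.147)."""
    if k == l:
        return -k * (theta + k - 1)
    if k == l + 1:
        return float(k * (k - 1))
    return 0.0

def B(k, l, theta):
    """Entries of B from (7.148); zero above the diagonal."""
    if l > k:
        return 0.0
    return (factorial(k) / factorial(l) * comb(k - 1, l - 1)
            * gamma(theta + 2 * l) / gamma(theta + k + l))

def B_inv(k, l, theta):
    """Entries of the inverse of B, normalised as in (7.153)."""
    if l > k:
        return 0.0
    return ((-1) ** (k + l) * factorial(k) / factorial(l) * comb(k - 1, l - 1)
            * gamma(theta + k + l - 1) / gamma(theta + 2 * k - 1))

def matrix(entry, theta, n):
    idx = range(2, n + 1)
    return [[entry(k, l, theta) for l in idx] for k in idx]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

theta, n = 0.7, 7  # arbitrary test values
MA, MB, MBinv = (matrix(f, theta, n) for f in (A, B, B_inv))
identity = [[1.0 if i == j else 0.0 for j in range(n - 1)] for i in range(n - 1)]
D = [[A(k, k, theta) if k == l else 0.0 for l in range(2, n + 1)]
     for k in range(2, n + 1)]

err_inverse = max_abs_diff(matmul(MB, MBinv), identity)    # part (i)
err_eigen = max_abs_diff(matmul(MA, MB), matmul(MB, D))    # part (iii): A B = B D
```

Both error terms should vanish up to floating-point rounding; the second one expresses that the columns of $B$ are eigenvectors of $A$ for the eigenvalues (7.149).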

Proof of Corollary 7.11.6. Fix $n \in \mathbb{N}$ and $\theta \ge 0$. Put

(7.160) $h_{\theta,n}(t) := g_n\big(\tfrac{2t}{\gamma}; \theta\big).$

By (7.145), the vector $h := (h_{\theta,2}, h_{\theta,3}, ...)^\top$ satisfies the linear system of ordinary differential equations

(7.161) $\frac{d}{dt} h = Ah + b,$

or equivalently,

(7.162) $h(t) = e^{At} h(0) + e^{At} A^{-1} b - A^{-1} b,$

with $b = (2, 0, 0, ...)^\top$ and $A = A(\tfrac{2}{\gamma}\theta) = (A_{k,l})_{k,l\ge 2}$ as defined in (7.147). Consequently, if $B$, $B^{-1}$ and $D$ are as in Lemma 7.11.9, then

(7.163) $h(t) = Be^{Dt}B^{-1}h(0) + B\big(e^{Dt} - I\big)D^{-1}B^{-1}b,$

where $I$ denotes the unit matrix.

Combining (7.160) with (7.163) yields the explicit expressions given in (7.142).
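As a plausibility check of the explicit expression (7.142), one can compare it with a direct numerical solution of the ODE system (7.145). The following sketch does this with a classical Runge-Kutta scheme; the parameter values, the initial values `g0` and all function names are hypothetical choices made for this illustration.

```python
from math import comb, exp, gamma

def g_explicit(n, t, theta, gam, g0):
    """Evaluate (7.142); g0[m] is the initial value g_m(0; theta)."""
    a = 2.0 * theta / gam
    total = 0.0
    for k in range(2, n + 1):
        decay = exp(-k * (theta + 0.5 * gam * (k - 1)) * t)
        transient = sum(comb(k, m) * (-1) ** m * gamma(a + k + m - 1) / gamma(m) * g0[m]
                        for m in range(2, k + 1))
        stationary = (k - 1) * (a + k) * gamma(a + k - 1)
        total += (comb(n, k) * (-1) ** k * (a + 2 * k - 1) / gamma(a + n + k)
                  * (decay * transient + (1.0 - decay) * stationary))
    return gamma(n) * total

def rhs(g, theta, gam):
    """Right-hand side of (7.145) for g = [g_2, ..., g_N], with g_1 = 1."""
    out = []
    for i, gn in enumerate(g):
        n = i + 2
        prev = 1.0 if n == 2 else g[i - 1]
        out.append(-n * theta * gn + gam * comb(n, 2) * (prev - gn))
    return out

def g_rk4(n_max, t, theta, gam, g0, steps=2000):
    """Integrate (7.145) up to time t with the classical RK4 scheme."""
    g = [g0[n] for n in range(2, n_max + 1)]
    h = t / steps
    for _ in range(steps):
        k1 = rhs(g, theta, gam)
        k2 = rhs([x + 0.5 * h * d for x, d in zip(g, k1)], theta, gam)
        k3 = rhs([x + 0.5 * h * d for x, d in zip(g, k2)], theta, gam)
        k4 = rhs([x + h * d for x, d in zip(g, k3)], theta, gam)
        g = [x + h * (p + 2 * q + 2 * r + s) / 6.0
             for x, p, q, r, s in zip(g, k1, k2, k3, k4)]
    return {n: g[n - 2] for n in range(2, n_max + 1)}

theta, gam, t = 0.7, 1.5, 0.4                   # arbitrary test parameters
g0 = {2: 0.9, 3: 0.8, 4: 0.7, 5: 0.6}           # hypothetical initial values
numeric = g_rk4(5, t, theta, gam, g0)
explicit = {n: g_explicit(n, t, theta, gam, g0) for n in range(2, 6)}
```

For large $t$ the transient part disappears, and `g_explicit` should approach $\Gamma(n)\Gamma(\tfrac{2}{\gamma}\theta + 1)/\Gamma(n + \tfrac{2}{\gamma}\theta)$, the Laplace transform of the sum of independent exponentials in (7.143).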


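By (7.146) together with Theorem 7.8.1,

(7.164) $g_n(t; \theta) \underset{t\to\infty}{\longrightarrow} -2\big(A(\tfrac{2}{\gamma}\theta)^{-1}\big)_{n,2} = \frac{\Gamma(n)\Gamma(\tfrac{2}{\gamma}\theta + 1)}{\Gamma(n + \tfrac{2}{\gamma}\theta)} = \mathbf{E}\Big[e^{-\theta\sum_{k=2}^n E_k}\Big].$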

Connection to empirical population genetics. Consider in addition to the Fleming-Viot dynamics state-independent mutation at rate $\tfrac{\theta\gamma}{2} > 0$. This mutation does not, of course, affect the tree-valued Fleming-Viot dynamics. For an ultra-metric measure tree $(U, r, \mu)$, the Watterson estimator (see [Wat75]) for $\theta$ is based on the number $S_n$ of segregating sites in a sample of size $n \ge 2$, i.e.,

(7.165) $S_n(u_1, ..., u_n) := \#\{\text{mutations which fall on the sub-tree spanned by } u_1, ..., u_n\},$

where $u_1, ..., u_n$ is a random independent sample from $(U, r)$ with respect to $\mu$.

The following result gives the mean sample Laplace transform of $S_n$ at any given time $t > 0$ under the tree-valued Fleming-Viot dynamics.

Proposition 7.11.10 (Number of segregating sites). For all $\theta \ge 0$ and $n \ge 2$,

(7.166) $\mathbf{P}\Big[\int \mu_t^{\otimes n}(d(u_1, ..., u_n))\, e^{-\sigma S_n(u_1,...,u_n)}\Big] = g_n\big(t; \tfrac{\theta\gamma}{2}(1 - e^{-\sigma})\big),$

where $g_n$ is as defined in (7.141) and explicitly calculated in Corollary 7.11.6.

Proof. Notice that given the length $L^{(U,r)}_n(u_1, ..., u_n)$ of the sub-tree spanned by the sample $u_1, ..., u_n$, the number of mutations is Poisson distributed with parameter $\tfrac{\theta\gamma}{2} L^{(U,r)}_n(u_1, ..., u_n)$.
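The step from the Poisson distribution to the argument $\tfrac{\theta\gamma}{2}(1 - e^{-\sigma})$ in (7.166) rests on the Laplace transform of a Poisson random variable: if $S$ is Poisson with parameter $c$, then $\mathbf{P}[e^{-\sigma S}] = \exp(-c(1 - e^{-\sigma}))$. A minimal numerical sketch of this identity follows; the parameter values and the function names are assumptions made for illustration.

```python
from math import exp, isclose

def poisson_laplace_series(rate, sigma, terms=100):
    """E[exp(-sigma * S)] for S ~ Poisson(rate), summed term by term."""
    term = exp(-rate)  # contribution of the event {S = 0}
    total = 0.0
    for j in range(terms):
        total += term
        # P(S = j + 1) * exp(-sigma * (j + 1)) from the previous term
        term *= rate * exp(-sigma) / (j + 1)
    return total

def poisson_laplace_closed(rate, sigma):
    """Closed form exp(-rate * (1 - exp(-sigma)))."""
    return exp(-rate * (1.0 - exp(-sigma)))

theta, gam, length, sigma = 2.0, 1.5, 1.3, 0.8   # hypothetical values
rate = 0.5 * theta * gam * length                # mutation rate times tree length
series_value = poisson_laplace_series(rate, sigma)
closed_value = poisson_laplace_closed(rate, sigma)
```

Conditioning on the tree length and then averaging over the sample therefore replaces $\theta$ in (7.141) by $\tfrac{\theta\gamma}{2}(1 - e^{-\sigma})$.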

Remark 7.11.11 (Number of segregating sites in equilibrium). Proposition 7.11.10 implies that in equilibrium of the tree-valued Fleming-Viot dynamics, i.e., under the law of the (rate $\gamma$) Kingman coalescent measure tree,

(7.167) $Q^{\gamma\delta_0}\Big[\int \mu^{\otimes n}(d(u_1, ..., u_n))\, e^{-\sigma S_n(u_1,...,u_n)}\Big] = \frac{\Gamma(n)\Gamma(\theta(1 - e^{-\sigma}) + 1)}{\Gamma(\theta(1 - e^{-\sigma}) + n)},$

by Remark 7.11.7.


Bibliography

[AE99] David Aldous and Steven N. Evans, Dirichlet forms on totally disconnected spaces and bipartite Markov chains, J. Theoret. Probab. 12 (1999), no. 3, 839–857.

[Ald90] David J. Aldous, The random walk construction of uniform spanning trees and uniform labelled trees, SIAM J. Discrete Math. 3 (1990), no. 4, 450–465.

[Ald91a] David J. Aldous, The continuum random tree I, Ann. Probab. 19 (1991), 1–28.

[Ald91b] David J. Aldous, The continuum random tree. II. An overview, Stochastic analysis (Durham, 1990), London Math. Soc. Lecture Note Ser., vol. 167, Cambridge Univ. Press, Cambridge, 1991, pp. 23–70. MR MR1166406 (93f:60010)

[Ald93] David J. Aldous, The continuum random tree III, Ann. Probab. 21 (1993), 248–289.

[Ald94a] David J. Aldous, Recursive self-similarity for random trees, random triangulations and Brownian excursion, Ann. Probab. 22 (1994), no. 2, 527–545.

[Ald94b] David J. Aldous, Triangulating the circle, at random, Amer. Math. Monthly 101 (1994), no. 3, 223–233.

[Ald99] David J. Aldous, Deterministic and stochastic models for coalescence (aggregation and coagulation): a review on mean field theory by probabilists, Bernoulli 5 (1999), 3–48.

[Ald00] David J. Aldous, Mixing time for a Markov chain on cladograms, Combin. Probab. Comput. 9 (2000), no. 3, 191–204. MR MR1774749 (2001f:05129)

[AP06] Jomy Alappattu and Jim Pitman, Colored loop-erased random walk on the complete graph, 2006.

[AS92] David J. Aldous and J. Michael Steele, Asymptotics for Euclidean minimal spanning trees on random points, Probab. Theory Related Fields 92 (1992), no. 2, 247–258.

[AS01] Benjamin L. Allen and Mike Steel, Subtree transfer operations and their induced metrics on evolutionary trees, Ann. Comb. 5 (2001), no. 1, 1–15. MR MR1841949 (2003c:68175)

[AS02] Romain Abraham and Laurent Serlet, Poisson snake and fragmentation, Electron. J. Probab. 7 (2002), no. 17, 15 pp. (electronic). MR MR1943890 (2003m:60201)

[AT89] D.J. Aldous and P. Tsoucas, A proof of the Markov chain theorem, Stat. Probab. Letters 8 (1989), 189–192.

[BBC+05] M. Birkner, J. Blath, M. Capaldo, A. Etheridge, M. Möhle, J. Schweinsberg, and A. Wakolbinger, Alpha-stable branching and beta coalescents, Elec. J. Probab. 10 (2005), 303–325.

[BBI01] Dmitri Burago, Yuri Burago, and Sergei Ivanov, A course in metric geometry, Graduate studies in mathematics, vol. 33, AMS, Boston, MA, 2001.

[BBS06] J. Berestycki, N. Berestycki, and J. Schweinsberg, Small-time behavior of beta coalescents, Preprint (2006), 1–31.

[Ber96] Jean Bertoin, Lévy processes, Cambridge Tracts in Mathematics, vol. 121, Cambridge University Press, Cambridge, 1996. MR MR1406564 (98e:60117)


[Bes02] Mladen Bestvina, R-trees in topology, geometry, and group theory, Handbook of geometric topology, North-Holland, Amsterdam, 2002, pp. 55–91. MR 2003b:20040

[BG80] M. Bramson and D. Griffeath, Asymptotics for interacting particle systems on $\mathbb{Z}^d$, Z. Wahrscheinlichkeit verw. Gebiete 53 (1980), 183–196.

[BG05] J. Bertoin and J.-F. Le Gall, Stochastic flows associated to coalescent processes III: Limit theorems, Preprint (2005), 1–30.

[BGL98] Carol Bezuidenhout, Geoffrey Grimmett, and Armin Löffler, Percolation and minimal spanning trees, J. Statist. Phys. 92 (1998), no. 1-2, 1–34.

[BH99] Martin R. Bridson and André Haefliger, Metric spaces of non-positive curvature, Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 319, Springer-Verlag, Berlin, 1999. MR 2000k:53038

[BHV01a] Louis J. Billera, Susan P. Holmes, and Karen Vogtman, Geometry of the space of phylogenetic trees, Adv. Appl. Math. 27 (2001), 733–767.

[BHV01b] Louis J. Billera, Susan P. Holmes, and Karen Vogtmann, Geometry of the space of phylogenetic trees, Adv. in Appl. Math. 27 (2001), no. 4, 733–767. MR MR1867931 (2002k:05229)

[BK06] E. Bolthausen and N. Kistler, On a non-hierarchical model of the generalized random energy model, Ann. Appl. Probab. 16 (2006), no. 1, 1–14.

[Bro89] A. Broder, Generating random spanning trees, 30th IEEE Symp. Found. Comp. Sci., 1989, pp. 442–447.

[BRST02] Oliver Bastert, Dan Rockmore, Peter F. Stadler, and Gottfried Tinhofer, Landscapes on spaces of trees, Appl. Math. Comput. 131 (2002), no. 2-3, 439–459. MR MR1920237 (2003g:92001)

[BS02] Andrei N. Borodin and Paavo Salminen, Handbook of Brownian motion - facts and formulae, second ed., Probability and its Applications, Birkhäuser Verlag, Basel, 2002. MR 2003g:60001

[Cai93] Haiyan Cai, Piecewise deterministic Markov processes, Stochastic Anal. Appl. 11 (1993), no. 3, 255–274. MR 94e:60062

[Can74] C. Cannings, The latent roots of certain Markov chains arising in genetics: A new approach I. Haploid models, Adv. in Appl. Probab. 6 (1974), 260–290.

[Can75] C. Cannings, The latent roots of certain Markov chains arising in genetics: A new approach II. Further haploid models, Adv. in Appl. Probab. 7 (1975), 264–282.

[CDP01] G. Colombo and P. Dai Pra, A class of piecewise deterministic Markov processes, Markov Process. Related Fields 7 (2001), no. 2, 251–287. MR 2002h:60179

[CG86] J. Theodore Cox and David Griffeath, Diffusive clustering in the two dimensional voter model, Ann. Probab. 14 (1986), no. 2, 347–370.

[Chi01] Ian Chiswell, Introduction to Λ-trees, World Scientific Publishing Co. Inc., River Edge, NJ, 2001. MR MR1851337 (2003e:20029)

[CKMR05] B. Chauvin, T. Klein, J.-F. Marckert, and A. Rounault, Martingales and Profile of Binary Search Trees, Electron. J. Probab. 10 (2005), no. 12, 420–435.

[Cos90] O. L. V. Costa, Stationary distributions for piecewise-deterministic Markov processes, J. Appl. Probab. 27 (1990), no. 1, 60–73. MR 91d:60169

[CP00] Michael Camarri and Jim Pitman, Limit distributions and random trees derived from the birthday problem with unequal probabilities, Electron. J. Probab. 5 (2000), no. 2, 1–18.

[Dav84] M. H. A. Davis, Piecewise-deterministic Markov processes: a general class of nondiffusion stochastic models, J. Roy. Statist. Soc. Ser. B 46 (1984), no. 3, 353–388, With discussion. MR 87g:60062

[Dav93] M. H. A. Davis, Markov models and optimization, Monographs on Statistics and Applied Probability, vol. 49, Chapman & Hall, London, 1993. MR 96b:90002

[Daw93] Donald A. Dawson, Measure-valued Markov processes, École d’Été de Probabilités de Saint-Flour XXI - 1991 (Berlin) (P.L. Hennequin, ed.), Lecture Notes in Mathematics, vol. 1541, Springer, 1993, pp. 1–260.

[DC99] François Dufour and Oswaldo L. V. Costa, Stability of piecewise-deterministic Markov processes, SIAM J. Control Optim. 37 (1999), no. 5, 1483–1502 (electronic). MR 2000g:60125

[DDSJ07] Jean-François Delmas, Jean-Stéphane Dhersin, and Arno Siri-Jégousse, Asymptotic results on the length of coalescent trees, 2007.

[DEF+99] P. Donnelly, S.N. Evans, K. Fleischmann, T.G. Kurtz, and X. Zhou, Continuum-sites stepping–stone models, coalescing exchangeable partitions, and random trees, Ann. Probab. 28 (1999), 1063–1110.

[DGR02] Vincent Dumas, Fabrice Guillemin, and Philippe Robert, A Markovian analysis of additive-increase multiplicative-decrease algorithms, Adv. in Appl. Probab. 34 (2002), no. 1, 85–111. MR 2003f:60168

[DGV95] D. A. Dawson, A. Greven, and J. Vaillancourt, Equilibria and quasi-equilibria for infinite systems of Fleming-Viot processes, Transactions of the American Math. Society 347 (1995), no. 7, 2277–2360.

[DH98] Persi Diaconis and Susan Holmes, Matching and phylogenetic trees, Proc. Nat. Acad. Sci. U.S.A. 53 (1998), 321–402.

[DH02] Persi Diaconis and Susan P. Holmes, Random walks on trees and matchings, Electron. J. Probab. 7 (2002), no. 6, 17 pp. (electronic). MR MR1887626 (2002k:60025)

[DH05] M. Drmota and H.-K. Hwang, Profiles of random trees: correlation and width of random recursive trees and binary search trees, Adv. Appl. Probab. 37 (2005), no. 2, 321–341.

[DIMR07] M. Drmota, A. Iksanov, M. Möhle, and U. Rösler, Asymptotic results concerning the total branch length of the Bolthausen-Sznitman coalescent, Stoch. Process. Appl. (2007), no. 117.

[DK96] Peter Donnelly and Thomas G. Kurtz, A countable representation of the Fleming-Viot processes, Ann. Probab. 24 (1996), no. 2, 698–742.

[DK98] Peter Donnelly and Thomas G. Kurtz, Particle representation for measure-valued population models, Ann. Probab. 27 (1998), no. 1, 166–205.

[DK99] Peter Donnelly and Thomas G. Kurtz, Genealogical processes for Fleming-Viot models with selection and recombination, Ann. Appl. Probab. 9 (1999), 1091–1148.

[DLG02] Thomas Duquesne and Jean-François Le Gall, Random trees, Lévy processes and spatial branching processes, Astérisque (2002), no. 281, vi+147. MR MR1954248 (2003m:60239)

[DM00] S.L. Degenhardt and S.C. Milne, Weighted inversion statistics and their symmetry groups, J. Combin. Theory Ser. A 90 (2000), no. 49, 49–103.

[DMT96] Andreas Dress, Vincent Moulton, and Werner Terhalle, T-theory: an overview, European J. Combin. 17 (1996), no. 2-3, 161–175, Discrete metric spaces (Bielefeld, 1994). MR 97e:05069

[DP91] D. A. Dawson and E. A. Perkins, Historical processes, Mem. Amer. Math. Soc. 93 (1991), no. 454, iv+179.

[Dre84] Andreas W.M. Dress, Trees, tight extensions of metric spaces, and the cohomological dimension of certain groups: A note on combinatorial properties of metric spaces, Adv. Math. 53 (1984), 321–402.

[DT96] Andreas W.M. Dress and W.F. Terhalle, The real tree, Adv. Math. 120 (1996), 283–301.

[DVJ88] D. J. Daley and D. Vere-Jones, An introduction to the theory of point processes, Springer Series in Statistics, Springer-Verlag, New York, 1988. MR MR950166 (90e:60060)

[EK86] Stewart N. Ethier and Thomas G. Kurtz, Markov processes: Characterization and convergence, Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics, John Wiley & Sons Inc., New York, 1986. MR MR838085 (88a:60130)

[EL07] Steven N. Evans and Tye Lidman, Asymptotic evolution of acyclic random mappings, U.C. Berkeley Department of Statistics Technical Report No. 724, 2007.

[EO94] Steven N. Evans and Neil O’Connell, Weighted occupation time for branching particle systems and a representation for the supercritical superprocess, Canad. Math. Bull. 37 (1994), no. 2, 187–196.

[EP98] Steven N. Evans and Jim Pitman, Stationary Markov processes related to stable Ornstein-Uhlenbeck processes and the additive coalescent, Stochastic Process. Appl. 77 (1998), no. 2, 175–185. MR 99j:60109

[EPW06] Steven N. Evans, Jim Pitman, and Anita Winter, Rayleigh processes, real trees, and root growth with re-grafting, Prob. Theo. Rel. Fields 134 (2006), no. 1, 81–126.

[ET60] Paul Erdős and S. James Taylor, Some problems concerning the structure of random walk paths, Acta Math. Acad. Sci. Hungar. 11 (1960), 137–162.

[Eth01] Alison Etheridge, An introduction to superprocesses, University Lecture Series, 20, American Mathematical Society, Providence, RI, 2000, 2001. MR MR1779100 (2001m:60111)

[Eva00a] S. Evans, Kingman’s coalescent as a random metric space, Stochastic Models: Proceedings of the International Conference on Stochastic Models in Honour of Professor Donald A. Dawson, Ottawa, Canada, June 10-13, 1998 (L.G. Gorostiza and B.G. Ivanoff, eds.), Canadian Mathematical Society, 2000.

[Eva00b] Steven N. Evans, Snakes and spiders: Brownian motion on R-trees, Probab. Theory Rel. Fields 117 (2000), 361–386.

[Eva06] Steven N. Evans, Probability and real trees, 2006.

[EW06] Steven N. Evans and Anita Winter, Subtree prune and re-graft: A reversible real-tree valued Markov chain, Ann. Prob. 34 (2006), no. 3, 918–961.

[Ewe04] W. J. Ewens, Mathematical Population Genetics. I. Theoretical introduction, second edition, Springer, 2004.

[Fel03] Joseph Felsenstein, Inferring phylogenies, Sinauer Associates, Sunderland, Massachusetts, 2003.

[FOT94] Masatoshi Fukushima, Yoichi Oshima, and Masayoshi Takeda, Dirichlet forms and symmetric Markov processes, de Gruyter, 1994.

[FV78] W. H. Fleming and M. Viot, Some measure-valued population processes, Stochastic analysis (Proc. Internat. Conf., Northwestern Univ., Evanston, Ill., 1978), Academic Press, New York, 1978, pp. 97–108.

[FV79] Wendell H. Fleming and Michel Viot, Some measure-valued Markov processes in population genetics theory, Indiana Univ. Math. J. 28 (1979), no. 5, 817–843.

[GLW] Andreas Greven, Vlada Limic, and Anita Winter, Cluster formation in spatial Moran models in critical dimension via particle representation, Manuscript.

[GLW05] Andreas Greven, Vlada Limic, and Anita Winter, Representation theorems for interacting Moran models, interacting Fisher–Wright diffusions and applications, Electronic Journal of Probability 10 (2005), no. 39, 1286–1358.

[GLW07] Andreas Greven, Vlada Limic, and Anita Winter, Coalescent processes arising in a study of diffusive clustering, 2007, Submitted to Ann. Appl. Probab.

[GPW] Andreas Greven, Peter Pfaffelhuber, and Anita Winter, Time-space structure of genealogies of spatial Moran models and applications to interacting Fisher-Wright diffusions, in preparation.

[GPW06a] Andreas Greven, Peter Pfaffelhuber, and Anita Winter, Convergence in distribution of random metric measure spaces (the Λ-coalescent measure tree), 2006, submitted to PTRF.

[GPW06b] Andreas Greven, Lea Popovic, and Anita Winter, Genealogy of catalytic branching models, 2006.

[GPW07] Andreas Greven, Peter Pfaffelhuber, and Anita Winter, Tree-valued resampling dynamics (Martingale Problems and applications), submitted to Annals of Probab., 2007.

[Gro99] Misha Gromov, Metric structures for Riemannian and non-Riemannian spaces, Progress in Mathematics, vol. 152, Birkhäuser Boston Inc., Boston, MA, 1999, Based on the 1981 French original [MR 85e:53051], With appendices by M. Katz, P. Pansu and S. Semmes, Translated from the French by Sean Michael Bates. MR 2000d:53065

[GS02] A. Gibbs and F. Su, On choosing and bounding probability metrics, Intl. Stat. Rev. 7 (2002), no. 3, 419–435.

[GSW] Andreas Greven, Rongfeng Sun, and Anita Winter, Limit genealogies of interacting Fleming-Viot processes on Z.

[Has70] A. M. Hasofer, On the derivative and the upcrossings of the Rayleigh process, Austral. J. Statist. 12 (1970), 150–151. MR 45 #7804

[HMPW] Bénédicte Haas, Grégory Miermont, Jim Pitman, and Matthias Winkel, Continuum tree asymptotics of discrete fragmentations and applications to phylogenetic models, to appear.

[JS96] Jean Jacod and Anatolii V. Skorokhod, Jumping Markov processes, Ann. Inst. Henri Poincaré 32 (1996), 11–67.

[Kal77] O. Kallenberg, Stability of critical cluster fields, Math. Nachr. 77 (1977), 7–43.

[Kal02] Olav Kallenberg, Foundations of modern probability, Springer, 2002.

[Kel75] John L. Kelley, General topology, Springer-Verlag, New York, 1975, Reprint of the 1955 edition [Van Nostrand, Toronto, Ont.], Graduate Texts in Mathematics, No. 27. MR 51 #6681

[Kin82a] J. F. C. Kingman, The coalescent, Stochastic Process. Appl. 13 (1982), no. 3, 235–248.

[Kin82b] J.F.C. Kingman, Exchangeability and the evolution of large populations, Proceedings of the International Conference on Exchangeability in Probability and Statistics, Rome, 6th-9th April, 1981, in honour of Professor Bruno de Finetti, North-Holland Elsevier, Amsterdam, 1982, pp. 97–112.

[KL96] Harry Kesten and Sungchul Lee, The central limit theorem for weighted minimal spanning trees on random points, Ann. Appl. Probab. 6 (1996), no. 2, 495–527.

[Kni81] Frank B. Knight, Essentials of Brownian motion and diffusion, Mathematical Surveys, vol. 18, American Mathematical Society, Providence, R.I., 1981. MR 82m:60098

[Kur92] Thomas G. Kurtz, Averaging for martingale problems and stochastic approximation, Applied stochastic analysis (New Brunswick, NJ, 1991), Springer, Berlin, 1992, pp. 186–209.

[Kur98] Thomas G. Kurtz, Martingale problems for conditional distributions of Markov processes, Electron. J. Probab. 3 (1998), no. 9, 29 pp. (electronic).

[Lei06] Christiane Leidinger, KlasseN Dissertation. Über Arbeitertöchter und das Promovieren aus der Bildungsferne, Geschichten aus 1001 Promotion (Werner Fiedler/Eike Hebecker/Manuela Maschke, ed.), Bad Heilbrunn: Klinkhardt-Verlag, 2006, pp. 19–26.

[LG91] J.-F. Le Gall, Brownian excursions, trees and measure-valued branching processes, Ann. Probab. 19 (1991), no. 4, 1399–1439.

[LG99a] Jean-François Le Gall, Spatial branching processes, random snakes and partial differential equations, Lectures in Mathematics ETH Zürich, Birkhäuser Verlag, Basel, 1999. MR MR1714707 (2001g:60211)

[LG99b] Jean-François Le Gall, Spatial branching processes, random snakes and partial differential equations, Lectures in Mathematics ETH Zürich, Birkhäuser Verlag, Basel, 1999. MR MR1714707 (2001g:60211)

[LS06] V. Limic and A. Sturm, The spatial Λ-coalescent, Elec. J. Probab. 11 (2006), 363–393.

[MBB58] K. S. Miller, R. I. Bernstein, and L. E. Blumenson, Rayleigh processes, Quart. Appl. Math. 16 (1958), 137–145. MR 20 #1371

[Mor92] John W. Morgan, Λ-trees and their applications, Bull. Amer. Math. Soc. (N.S.) 26 (1992), no. 1, 87–112. MR 92e:20017

[MPV87] M. Mézard, G. Parisi, and M.A. Virasoro, The spin glass theory and beyond, World Scientific Lecture Notes in Physics, vol. 9, 1987.

[MS01] M. Möhle and S. Sagitov, A classification of coalescent processes for haploid exchangeable population models, Ann. Probab. 29 (2001), 1547–1562.

[Pau88] Frédéric Paulin, Topologie de Gromov équivariante, structures hyperboliques et arbres réels, Invent. Math. 94 (1988), no. 1, 53–80. MR 90d:57015

[Pau89] Frédéric Paulin, The Gromov topology on R-trees, Topology Appl. 32 (1989), no. 3, 197–221. MR 90k:57015

[Pen03] C. Penssel, Interacting Feller diffusions in catalytic media, Ph.D. thesis, Institute of Math., Erlangen, Germany, 2003.

[Pit] Boris Pittel, Note on exact and asymptotic distributions of the parameters of the loop-erased random walk on the complete graph, Mathematics and computer science, II (Versailles, 2002), Trends Math., Birkhäuser. MR 1940151

[Pit99] Jim Pitman, Coalescents with multiple collisions, Ann. Prob. 27 (1999), no. 4, 1870–1902.

[Pit02] J. Pitman, Combinatorial stochastic processes, Tech. Report 621, Dept. Statistics, U.C. Berkeley, 2002, available via http://www.stat.berkeley.edu/tech-reports/.

[PR04] Yuval Peres and David Revelle, Scaling limits of the uniform spanning tree and loop-erased random walk on finite graphs, Annals of Probab. (2004), math.PR/0410430.

[Pro77] P.E. Protter, On the existence, uniqueness, convergence and explosions of solutions of systems of stochastic integral equations, Annals of Probability 5 (1977), no. 2, 243–261.

[Pru18] H. Prüfer, Neuer Beweis eines Satzes über Permutationen, Arch. Math. Phys. 27 (1918), 742–744.

[PW98] James Gary Propp and David Bruce Wilson, How to get a perfectly random sample from a generic Markov chain and generate a random spanning tree of a directed graph, J. Algorithms 27 (1998), no. 2, 170–217, 7th Annual ACM-SIAM Symposium on Discrete Algorithms (Atlanta, GA, 1996). MR 1622393

[PW06] P. Pfaffelhuber and A. Wakolbinger, The process of most recent common ancestors in an evolving coalescent, Stoch. Proc. Appl. 116 (2006), 1836–1859.

[Rac91] S. T. Rachev, Probability metrics and the stability of stochastic models, Wiley, 1991.

[Rio68] J. Riordan, Combinatorial Identities, John Wiley & Sons, Inc., New York-London-Sydney, 1968.

[RW00] L. C. G. Rogers and David Williams, Diffusions, Markov processes, and martingales. Vol. 2, Cambridge Mathematical Library, Cambridge University Press, Cambridge, 2000, Itô calculus, Reprint of the second (1994) edition. MR MR1780932 (2001g:60189)

[RY99] Daniel Revuz and Marc Yor, Continuous martingales and Brownian motion, third ed., Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 293, Springer-Verlag, Berlin, 1999. MR MR1725357 (2000h:60050)

[Sag99] S. Sagitov, The general coalescent with asynchronous mergers of ancestral lines, J. Appl. Probab. 36 (1999), no. 4, 1116–1125.

[Sch00a] J. Schweinsberg, Coalescents with simultaneous multiple collisions, Elec. J. Prob. 5 (2000), 1–50.

[Sch00b] J. Schweinsberg, A necessary and sufficient condition for the Λ-coalescent to come down from infinity, Elec. Comm. Prob. 5 (2000), 1–11.

[Sch02] Jason Schweinsberg, An O(n²) bound for the relaxation time of a Markov chain on cladograms, Random Structures Algorithms 20 (2002), no. 1, 59–70. MR MR1871950 (2002j:60126)

[Sha87] Peter B. Shalen, Dendrology of groups: an introduction, Essays in group theory, Math. Sci. Res. Inst. Publ., vol. 8, Springer, New York, 1987, pp. 265–319. MR 89d:57012

[Sha91] Peter B. Shalen, Dendrology and its applications, Group theory from a geometrical viewpoint (Trieste, 1990), World Sci. Publishing, River Edge, NJ, 1991, pp. 543–616. MR 94e:57020

[SO90] D.L. Swofford and G.J. Olsen, Phylogeny reconstruction, Molecular Systematics (D.M. Hillis and G. Moritz, eds.), Sinauer Associates, Sunderland, Massachusetts, 1990, pp. 411–501.

[SS03] Charles Semple and Mike Steel, Phylogenetics, Oxford Lecture Series in Mathematics and its Applications, vol. 24, Oxford University Press, Oxford, 2003. MR MR2060009

[Stu06] Karl-Theodor Sturm, On the geometry of metric measure spaces, Acta Mathematica 196 (2006), no. 1, 65–131.

[Ter97] W.F. Terhalle, R-trees and symmetric differences of sets, Europ. J. Combinatorics 18 (1997), 825–833.

[Ver98] A.M. Vershik, The universal Urysohn space, Gromov metric triples and random matrices on the natural numbers, Russian Math. Surveys 53 (1998), no. 3, 921–938.

[Wat75] G. A. Watterson, On the number of segregating sites in genetical models without recombination, Theo. Pop. Biol. 7 (1975), 256–276.

[Wil96] David Bruce Wilson, Generating random spanning trees more quickly than the cover time, Proceedings of the Twenty-eighth Annual ACM Symposium on the Theory of Computing (Philadelphia, PA, 1996) (New York), ACM (1996). MR 1427525

[Win02] Anita Winter, Multiple scale analysis of branching processes under the Palm distribution, EJP 7 (2002), no. 13, 74 pages.

[WP96] David Bruce Wilson and James Gary Propp, How to get an exact sample from a generic Markov chain and sample a random spanning tree from a directed graph, both within the cover time, Proceedings of the Seventh Annual ACM-SIAM Symposium on Discrete Algorithms (Atlanta, GA, 1996) (New York) (1996). MR 1381954

[Zam01] Lorenzo Zambotti, A reflected stochastic heat equation as symmetric dynamics with respect to the 3-d Bessel bridge, J. Funct. Anal. 180 (2001), no. 1, 195–209. MR MR1814427 (2002c:60108)

[Zam02] Lorenzo Zambotti, Integration by parts on Bessel bridges and related stochastic partial differential equations, C. R. Math. Acad. Sci. Paris 334 (2002), no. 3, 209–212. MR MR1891060 (2002m:60104)

[Zam03] Lorenzo Zambotti, Integration by parts on δ-Bessel bridges, δ > 3 and related SPDEs, Ann. Probab. 31 (2003), no. 1, 323–348. MR MR1959795 (2003m:60175)