twitter event networks and the superstar model

University of Pennsylvania University of Pennsylvania

ScholarlyCommons ScholarlyCommons

Statistics Papers Wharton Faculty Research

2015

Twitter Event Networks and the Superstar Model Twitter Event Networks and the Superstar Model

Shankar Bhamidi University of North Carolina

J Michael Steele University of Pennsylvania

Tauhid Zaman Massachusetts Institute of Technology

Follow this and additional works at: https://repository.upenn.edu/statistics_papers

Part of the Physical Sciences and Mathematics Commons

Recommended Citation Recommended Citation Bhamidi, S., Steele, J. M., & Zaman, T. (2015). Twitter Event Networks and the Superstar Model. The Annals of Applied Probability, 25 (5), 2462-2502. http://dx.doi.org/10.1214/14-AAP1053

This paper is posted at ScholarlyCommons. https://repository.upenn.edu/statistics_papers/63 For more information, please contact [email protected].

https://repository.upenn.edu/

https://repository.upenn.edu/statistics_papers

https://repository.upenn.edu/wharton_faculty

https://repository.upenn.edu/statistics_papers?utm_source=repository.upenn.edu%2Fstatistics_papers%2F63&utm_medium=PDF&utm_campaign=PDFCoverPages

http://network.bepress.com/hgg/discipline/114?utm_source=repository.upenn.edu%2Fstatistics_papers%2F63&utm_medium=PDF&utm_campaign=PDFCoverPages

http://dx.doi.org/10.1214/14-AAP1053

https://repository.upenn.edu/statistics_papers/63

mailto:[email protected]

Twitter Event Networks and the Superstar Model Twitter Event Networks and the Superstar Model

Abstract Abstract Condensation phenomenon is often observed in social networks such as Twitter where one “superstar” vertex gains a positive fraction of the edges, while the remaining empirical degree distribution still exhibits a power law tail. We formulate a mathematically tractable model for this phenomenon that provides a better fit to empirical data than the standard preferential attachment model across an array of networks observed in Twitter. Using embeddings in an equivalent continuous time version of the process, and adapting techniques from the stable age-distribution theory of branching processes, we prove limit results for the proportion of edges that condense around the superstar, the degree distribution of the remaining vertices, maximal nonsuperstar degree asymptotics and height of these random trees in the large network limit.

Keywords Keywords Dynamic networks, preferential attachment, continuous time branching processes, characteristics of branching processes, multitype branching processes, Twitter, social networks, retweet graph

Disciplines Disciplines Physical Sciences and Mathematics

This journal article is available at ScholarlyCommons: https://repository.upenn.edu/statistics_papers/63

https://repository.upenn.edu/statistics_papers/63

arX

iv:1

211.

3090

v2 [

mat

h.PR

] 9

Sep

201

5

The Annals of Applied Probability

2015, Vol. 25, No. 5, 2462–2502DOI: 10.1214/14-AAP1053c© Institute of Mathematical Statistics, 2015

TWITTER EVENT NETWORKS AND THE SUPERSTAR MODEL

By Shankar Bhamidi1, J. Michael Steele and Tauhid Zaman

University of North Carolina, University of Pennsylvania andMassachusetts Institute of Technology

Condensation phenomenon is often observed in social networkssuch as Twitter where one “superstar” vertex gains a positive fractionof the edges, while the remaining empirical degree distribution stillexhibits a power law tail. We formulate a mathematically tractablemodel for this phenomenon that provides a better fit to empiricaldata than the standard preferential attachment model across an arrayof networks observed in Twitter. Using embeddings in an equivalentcontinuous time version of the process, and adapting techniques fromthe stable age-distribution theory of branching processes, we provelimit results for the proportion of edges that condense around thesuperstar, the degree distribution of the remaining vertices, maximalnonsuperstar degree asymptotics and height of these random trees inthe large network limit.

1. Retweet graphs and a mathematically tractable model. Our goal hereis to provide a simple model that captures the most salient features of anatural graph that is determined by the Twitter traffic generated by pub-lic events. In the Twitter world (or Twitterverse), each user has a set offollowers; these are people who have signed-up to receive the tweets of theuser. Here, our focus is on retweets; these are tweets by a user who for-wards a tweet that was received from another user. A retweet is sometimesaccompanied with comments by the retweeter.

Let us first start with an empirical example that contains all the character-istics observed in a wide array of such retweet networks. Data was collectedduring the Black Entertainment Television (BET) Awards of 2010. We firstconsidered all tweets in the Twitterverse that were posted between 10 AM

Received November 2012; revised July 2014.1Supported in part by NSF-DMS Grants 1105581 and 1310002 and SES-1357622.AMS 2000 subject classifications. 60C05, 05C80, 90B15.Key words and phrases. Dynamic networks, preferential attachment, continuous time

branching processes, characteristics of branching processes, multitype branching pro-cesses, Twitter, social networks, retweet graph.

This is an electronic reprint of the original article published by theInstitute of Mathematical Statistics in The Annals of Applied Probability,2015, Vol. 25, No. 5, 2462–2502. This reprint differs from the original inpagination and typographic detail.

1

http://arxiv.org/abs/1211.3090v2

http://www.imstat.org/aap/


http://www.imstat.org

http://www.ams.org/msc/

http://www.imstat.org

http://www.imstat.org/aap/


2 S. BHAMIDI, J. M. STEELE AND T. ZAMAN

Fig. 1. Giant component of the 2010 BET Awards retweet graph.

and 4 PM (GMT) on the day of the ceremony, and we then restricted at-tention to all the tweets in the Twitterverse that contained the term “BETAwards.” We view the posters of these tweets as the vertices of an undirectedsimple graph where there is an edge between vertices v and w if w retweetsa tweet received from v, or vice-versa. We call this graph the retweet graph.

In the retweet graph for the 2010 BET Awards, one finds a single giantcomponent (see Figure 1). There are also many small components (withfive or fewer vertices) and a large number of isolated vertices. The giantcomponent is also approximately a tree in the sense that if we remove 91edges from the graph of 1724 vertices and 1814 edges we obtain an honesttree. Finally, the most compelling feature of this empirical tree is that it hasone vertex of exceptionally large degree. This “superstar” vertex has degree992, so it is connected to more than 57% of the vertices. As it happens, this“superstar” vertex corresponds to the pop-celebrity Lady Gaga who receivedan award at the ceremony.

1.1. Superstar model for the giant component. Our main observation isthat the qualitative and quantitative features of the giant component ina wide array of retweet graphs may be captured rather well by a simpleone-parameter model. The construction of the model only makes an obvi-

TWITTER EVENT NETWORKS AND THE SUPERSTAR MODEL 3

ous modification of the now classic preferential attachment model, but thismodification turns out to have richer consequences than its simplicity wouldsuggest. Naturally, the model has the “superstar” property baked into thecake, but a surprising consequence is that the distribution of the degreesof the nonsuperstar vertices is quite different from what one finds in thepreferential attachment model.

Our model is a graph evolution process that we denote by {Gn, n =1,2, . . .}. The graph G1 consists of the single vertex v0, that we call thesuperstar. The graph G2 then consists of the superstar v0, a nonsuperstarv1, and an edge between the two vertices. For n≥ 2, we construct Gn+1 fromGn by attaching the vertex vn to the superstar v0 with probability 0< p< 1while with probability q = 1− p we attach vn to a nonsuperstar accordingto the classical preferential attachment rule. That is, with probability q thenonsuperstar vn is attached to one of the nonsuperstars {v1, v2, . . . , vn−1}with probability that is proportional to the degree of vi in Gn.

1.2. Organization of the paper. In the next section, we state the mainresults for the Superstar model. In Section 3, we consider previous workon Twitter networks and explore the connection between our model andexisting models. In this section, we also describe two variants of the basicSuperstar model (linear attachment and uniform attachment) that can berigorously analyzed using the same mathematical methodology developedin this paper. In Section 4, we study the performance of this model onvarious real networks constructed from the Twitterverse and we compareour model to the standard preferential attachment model. Section 5 is theheart of the paper. Here, we construct a special two-type continuous timebranching process that turns out to be equivalent to the Superstar modeland analyze various structural properties of this continuous time model. InSection 5.2, we prove the equivalence between the continuous time model andthe Superstar model through a surgery operation. In Section 6, we completethe proofs of all the main results.

2. Mathematical results for the Superstar model. Let {Gn, n= 1,2, . . .}denote the graph process that evolves according to the Superstar model withparameter 0< p< 1. We shall think about all the processes constructed ona single probability space through the obvious sequential growth mechanismso that one can make almost sure statements. The degree of the vertex v inthe graph G is denoted by deg(v,G). The first result describes asymptoticsof the condensation phenomenon around the superstar. The result is animmediate consequence of the definition of the model and the strong law oflarge numbers. Since it is a defining element of our model, we set the resultout as a theorem.


Theorem 2.1 (Superstar strong law). With probability one, we have

limn→∞

1

ndeg(v0,Gn) = p.(2.1)

The next result describes the asymptotic degree distribution.

Theorem 2.2 (Degree distribution strong law). With probability one,we have

limn→∞

1

ncard{1≤ j ≤ n : deg(vj ,Gn) = k}= νSM(k, p),

where νSM(·, p) is the probability mass function defined on {1,2, . . .} by

νSM(k, p) =2− p

1− p(k− 1)!

k∏

i=1

(i+

2− p

1− p

)−1

.

Remark 2.3. One should note that the above theorem implies thatthe degree distribution of the nonsuperstar vertices has a power law tail.Specifically,

2− p

1− p(k− 1)!

k∏

i=1

(i+

2− p

1− p

)−1

∼Cpk−β as k→∞,

for the constants

β = 3+ p/(1− p), Cp =

(2− p

1− p

)2

Γ

(2− p

1− p

),

where Γ(x) is the gamma function. This should be contrasted with the stan-dard preferential attachment model (with no superstar attachment) whosedegree distribution scales like k−3 as k→∞. Thus, although one might ex-pect that this variation in the attachment scheme implies that a fraction1− p of the vertices still continue to perform preferential attachment, andthus the degree distribution should still have a power law exponent of 3; inreality, this attachment scheme has a major effect on the degree distribution.One requires a careful analysis of the different time-scales of the associatedcontinuous time branching process to tease out asymptotic properties of themodel.

The next theorem concerns the largest degree amongst all the nonsuper-star vertices {vi : 1≤ i≤ n}. Let

Υn := max1≤i≤n

deg(vi,Gn).


Theorem 2.4 (Maximal nonsuperstar degree). Let γ = (1− p)/(2− p).There exists a random variable ∆∗ with P(0<∆∗ <∞) = 1 such that

limn→∞

1

nγΥn

P−→∆∗.

The almost sure linear growth of the degree of the superstar (Theo-rem 2.1) is endemic to our construction. For standard preferential attach-ment (with no superstar attachment mechanism), the maximal degree growslike ΘP (n

1/2) (cf. [19]). Thus, the superstar attachment affects the scalingof the maximal degree as well.

Recall that Gn is a tree. View this tree as rooted at the superstar vertexv0. Write H(Gn) for the graph distance of the vertex furthest from the root.Thus, H(Gn) is the height of the random tree Gn. Theorem 2.1 impliesthat a fraction p of the vertices in the network are directly connected tothe superstar. One might wonder if this reflects a general property of thenetwork, namely does H(Gn) =Op(1) as n→∞? The next theorem showsthat in fact the height of the tree increases logarithmically in the size of thenetwork. Let Lam(·) be the Lambert special function (cf. [9]) and recall thatLam(1/e)≈ 0.2784.

Theorem 2.5 (Logarithmic height scaling). With probability one, wehave

limn→∞

1

lognH(Gn) =

1− p

Lam(1/e)(2− p).

3. Related results and questions. In this section, we briefly discuss theconnections between this model and some of the more standard models inthe literature as well as extensions of the results in the paper. We also discussprevious empirical research done on the structure of Twitter networks.

3.1. Preferential attachment. This has become one of the standardworkhorses in the complex networks community. It is well nigh impossi-ble to compile even a representative list of references; see [27] where it wasintroduced in the combinatorics community, [4] for bringing this model tothe attention of the networks community, [12, 21] for survey level treatmentsof a wide array of models, [5] for the first rigorous results on the asymptoticdegree distribution and [6, 8, 24] and [13] and the references therein for moregeneral models and results. Let us briefly describe the simplest model in thisclass of models. One starts with two vertices connected by a single edge as inthe Superstar model. Then each new vertex joins the system by connectingto a single vertex in the current tree by choosing this extant vertex withprobability proportional to its current degree. In this case, one can show


[5] that there exists a limiting asymptotic degree distribution, namely withprobability one

limn→∞

1

ncard{1≤ j ≤ n : deg(vj ,Gn) = k}= 4

k(k +1)(k + 2).

Thus, the asymptotic degree distribution exhibits a degree exponent of three.The Superstar model changes the degree exponent of the nonsuperstar ver-tices from three to (3− 2p)/(1− p) (see Theorem 2.4). Further, for the pref-erential attachment model, the maximal degree scales like n1/2 [19], whilefor the Superstar model, the maximal nonsuperstar degree scales like nγ

with γ = (1− p)/(2− p).

3.2. Statistical estimation. We use real data on various Twitter streamsto analyze the empirical performance of the Superstar model and comparethis with typical preferential attachment models in Section 4. Estimating theparameters from the data raises a host of new interesting statistical ques-tions. See [29] where such questions were first raised and likelihood basedschemes were proposed in the context of usual preferential attachment mod-els. Considering how often such models are used to draw quantitative con-clusions about real networks, proving consistency of such procedures as wellas developing methodology to compare different estimators in the contextof models of evolving networks would be of great interest to a number ofdifferent fields.

3.3. Stable age distribution. The proofs for the degree distribution buildheavily on the analysis of the stable age distribution for a single type contin-uous time branching process in [20]. We extend this analysis to the contextof a two-type variant whose evolution mirrors the discrete type model. UsingPerron–Frobenius theory, a wide array of structural properties are knownabout such models (see [16]). The models used in our proof technique arerelatively simpler and we can give complete proofs using special propertiesof the continuous time embeddings, including special martingales that playan integral role in the treatment (see, e.g., Proposition 5.4). There have beena number of recent studies on various preferential attachment models usingcontinuous time branching processes; see, for example, [2, 11, 25]. For theusual preferential attachment model (p = 0), [23] uses embeddings in con-tinuous time and results on the first birth time in such branching processes(see [17]) to show that the height satisfies

H(Gn)

logn

a.s.−→ 1

2Lam(1/e).

Here, we use a similar technique, but we first need to extend [17] to thesetting of multitype branching processes.


3.4. Previous analysis of Twitter networks. The majority of work ana-lyzing Twitter networks has been empirical in nature. In one of the earlieststudies of Twitter networks [18], the authors looked at the degree distribu-tion of the different networks in Twitter, including retweet networks asso-ciated with individual topics. Power-laws were observed, but no model wasproposed to describe the network evolution. In [1], the link between maxi-mum degree and the range of time for which a topic was popular or “trend-ing” was investigated. Correlations between the degree in retweet graphsand the Twitter follower graph for different users was studied in [7]. Theseempirical analyses provided many important insights into the structure ofnetworks in Twitter. However, the lack of a model to describe the evolutionof these networks is one of the important unanswered questions in this field,and the rigorous analysis of such a model has not yet been considered. Ourwork here presents one of the first such models that produces predictionsthat match Twitter data and also provides a rigorous theoretical analysis ofthe proposed model.

3.5. Related models. One of the main aims of this work is to developmathematical techniques that extend in a straightforward fashion to variantsof the Superstar model. We state results for two such models in this section.We will describe how to extend the proofs for the Superstar model to thesevariants in Section 6.4. We first start with the superstar linear preferentialattachment. Fix a parameter a > −1. The linear preferential attachmentmodel is described as follows: As before new vertices attach to vertex v0with probability p. With probability q := 1− p the new vertex attaches toone of the extant vertices v, with probability proportional to the d(v) + awhere d(v) is the present degree of the vertex. As before, by constructionthe degree of the superstar v0 scales like ∼ pn as n→∞. The techniques inthe paper extend with simple modifications to prove the following.

Theorem 3.1 (Linear superstar preferential attachment). Fix a > −1and p ∈ (0,1). In the linear Superstar model, one has for all k ≥ 1, withprobability one

limn→∞

1

ncard{1≤ j ≤ n : deg(vj ,Gn) = k}

=2− p+ a

1− p

∏k−1j=1(j + a)

∏ki=1(i+ ((2− p)/(1− p))(1 + a))

.

Further, for γ(a) = (1− p)/(2− p+ a), there exists a random variable 0<∆∗(a)<∞ a.s. such that the largest degree other than the superstar satisfies

n−γ(a) max{1≤i≤n}

deg(vi)P−→∆∗(a) as n→∞.


Similarly, one can show that the height of the linear Superstar modelscales like κ(a) logn for a limit constant 0<κ(a)<∞.

We next consider the case of the less realistic Superstar model with uni-form attachment. Here, each new vertex attaches to the superstar v0 withprobability p or to one of the remaining vertices uniformly at random (ir-respective of the degree). Although less realistic in the context of socialnetworks, this is the superstar variant of the random recursive tree a modelof a growing tree where each new vertex attaches to a uniformly chosen ex-tant vertex. The random recursive tree has been a model of great interestin the combinatorics and computer science community (see the survey [26]).This model differs from the previous models with the limiting degree distri-bution possessing exponential tails while the maximal degree only growinglogarithmically in the size of the network.

Theorem 3.2 (Superstar uniform attachment). Let q := 1− p. For theuniform attachment model, one has for all k ≥ 1 that with probability one

limn→∞

1

ncard{1≤ j ≤ n : deg(vj ,Gn) = k}= 1

1+ q

(q

1 + q

)k−1

,

and the maximal nonsuperstar degree satisfies

limn→∞

max1≤i≤n deg(vi)

logn

P−→ 1

log (1 + q)/q.

4. Retweet graphs for different public events. We collected tweets fromthe Twitter firehose for thirteen different public events, such as sports matchesand musical performances [10]. The Twitter firehose is the full feed of allpublic tweets that is accessed via Twitter’s Streaming Application Program-ming Interface [28]. By using the Twitter firehose, we were able to access allpublic tweets in the Twitterverse.

For each public event E ∈ {1,2, . . . ,13}, we kept only tweets that havean event specific term and used those tweets to construct the correspondingretweet graph, denoted by GE . Our analysis focuses on the giant compo-nent of the retweet graph, denoted by G0

E . In Table 1 we present importantproperties of each retweet graph’s giant component including the number ofvertices, number of edges, maximal degree, and the Twitter name of the su-perstar corresponding to the maximal degree. A more detailed description ofeach event, including the event specific term, can be found in the Appendix.

The sizes of the giant components range from 239 to 7365 vertices. Thegiant components of the retweet graphs corresponding to these events arenot trees, but they are very tree-like in that they have only a few smallcycles. In Table 1, one sees that for each giant component, the deletion of asmall number of edges will result in an honest tree.


Table 1

For each event E, we list the number of vertices [|V (G0E)|], number of edges [|E(G0

E)|]and maximal degree [dmax(G

0E)] in the giant component G0

E , along with the Twittername of the superstar corresponding to the maximal degree

E |V (G0

E)| |E(G0

E)| dmax(G0

E) Superstar

1 7365 7620 512 warrenellis2 3995 4176 362 anison3 2847 2918 566 FIFAWorldCupTM4 2354 2414 657 taytorswift135 1897 1929 256 FIFAcom6 1724 1814 992 ladygaga7 1659 2059 56 MMFlint8 1408 1459 269 FIFAWorldCupTM9 1025 1045 247 FIFAWorldCupTM

10 1024 1050 229 SkyNewsBreak11 705 710 113 realmadrid12 505 521 186 Wimbledon13 239 247 38 cnnbrk

4.1. Maximal degree. The maximal degree in the retweet graphs is largerthan would be expected under preferential attachment. Write n= |V (G0

E)|for the number of vertices in the giant component. For a preferential at-tachment graph with n vertices, it is known that the maximal degree scalesas n1/2. Figure 2 shows a plot of the maximal degree in the giant compo-nent dmax(G

0E) and a plot of n1/2 versus n for the retweet graphs. It can

be seen from the figure that the sublinear growth predicted by preferentialattachment does not capture the superstar effect in these retweet graphs.

4.2. Estimating p and the degree distribution. The asymptotic degreedistribution of the Superstar model is known (via Theorem 2.2) once thesuperstar parameter p is specified. We were interested in seeing, for eachevent E, how well this model predicted the observed degree distribution inG0

E . For an event E and degree k ∈ {1,2, . . .}, we define the empirical degreedistribution of the giant component as

νE(k) =1

|V (G0E)|

card{vj ∈ V (G0E) : deg(vj ,G

0E) = k}.

To predict the degree distribution using the Superstar model, we need a valuefor p. We estimate p for each event E as p(E) = dmax(G

0E)/|V (G0

E)|. Usingp = p(E) we obtain the Superstar model degree distribution prediction foreach event E and degree k, νSM(k, p) from Theorem 2.2. For comparison,we also compare νE(k) to the preferential attachment degree distribution


Fig. 2. Plot of dmax(G0E) versus n= |V (G0

E)| for the retweet graphs of each event. Theevents are labeled with the same numbers as in Table 1. Also shown is a plot of n1/2.

νPA(k) = 4(k(k + 1)(k + 2))−1 [5]. Figure 3 shows the empirical degree dis-tribution for the retweet graphs of four of the events, along with the predic-tions for the two models. As can be seen, the Superstar model predictionsseem to qualitatively match the empirical degree distribution better thanpreferential attachment. To obtain a more quantitative comparison of thedegree distribution, we calculate the relative error of these models for eachvalue of degree k. The relative error for event E and degree k is defined asrelerrorSM(k,E) = |νSM(k, p)− νE(k)|(νSM(k, p))−1 for the Superstar modeland relerrorPA(k,E) = |νPA(k) − νE(k)|(νPA(k))−1 for preferential attach-ment. In Figure 4, we show the relative errors for different values of k. Ascan be seen, the relative error of the Superstar model is lower than pref-erential attachment for degrees k = 1,2,3,4 and for all of the events withthe exception of k = 4 and E = 7. There is a clear connection between thesuperstar degree and the degree distribution in the giant component of theseretweet graphs that is captured well by the Superstar model.

5. Analysis of a special two-type branching process. Let us now startthe proofs of the main theorems of Section 2. The core of the proof is a spe-cial two-type continuous time branching processes together with a surgeryoperation that establishes the equivalence between this continuous time con-struction and the original Superstar model. We start by describing this con-struction and then prove the equivalence between the two models.


Fig. 3. Plots of the empirical degree distribution for the giant component of the retweetgraphs [νE(k)], and the estimates of the Superstar model [νSM(k, p(E))] and preferentialattachment [νPA(k)] for four different events. Each plot is labeled with the event specificterm and p(E).

5.1. A two-type continuous branching process. We now consider a two-type continuous time branching process BP(t) whose types we call red andblue. For each fixed t≥ 0, we shall view BP(t) as a random tree representingthe genealogical structure of the population till time t. This includes parentchild relationships of vertices as well as the color of each vertex. We use|BP(t)| for the total number of individuals in the population by time t. Everyindividual survives forever. We shall also let {BP(t)}t≥0 be the associatedfiltration of the process. Let us now describe the construction. At time t= 0,we begin with a single red vertex that we call v1. For any fixed time 0< t<∞, let Vt denote the vertex set of BP(t). Each vertex v ∈ Vt in the branchingprocess gives birth according to a Poisson process with rate

λ(v, t) = cB(v, t) + 1,

where cB(v, t) is equal to the number of blue children of vertex v at time t.Also let cR(v, t) denote the number of red children of vertex v by time t. Atthe moment of a new birth, this new vertex is colored red with probabilityp and colored blue with probability q = 1− p. Finally, for n≥ 1, define the


Fig. 4. Plots of the relative errors of the degree distribution predictions of preferentialattachment and the Superstar model for 13 retweet graphs. The errors are plotted for degreek = 1,2,3,4.

stopping times

τn = inf{t : |BP(t)|= n}.(5.1)

Since the counting process |BP(t)| is a nonhomogenous Poisson process witha rate that is always greater than or equal to one, the stopping times τnare almost surely finite. This completes the construction of the branchingprocess.

5.2. Equivalence between the branching process and the Superstar model.Before diving into properties of our two-type branching process constructedas above, let us show how the Superstar model can be obtained from theabove branching process via a surgery operation. We start with an informaldescription of the connection between the Superstar model and the branch-ing process BP(·). To describe this connection, we introduce a new vertex v0namely the superstar vertex to the system. Recall that v1 was the root (theinitial progenitor) of the branching process BP(·). We connect vertex v1, tothe superstar v0 [v0 played no role in the evolution of BP(·)]. This formsthe Superstar model G2 on 2 vertices. All the red vertices in the process


Fig. 5. The surgery passing from BP(τn) to Sn+1 and Gn+1 for n= 6.

BP(·) correspond to the neighbors of the superstar v0. The true degree of a(nonsuperstar) vertex in Gn+1 is one plus the number of its blue children inBP(τn), where the additional factor of one comes from the edge connectingthis vertex to its parent. By elementary properties of the exponential dis-tribution, the dynamics of BP(·) imply that each new vertex that is born isred (connected to the superstar v0) with probability p, else with probabilityq = 1−p is blue and connected to one of the remaining extant (nonsuperstar)vertices with probability proportional to the current degree of that vertex,thus increasing the degree of this chosen vertex by one. These dynamics arethe same as the Superstar model.

Formally, our surgery will take the tree BP(τn) and modify it to get an(n+1)-vertex tree Sn that has the same distribution as the Superstar modelGn+1. From this, we will be able to read off the probabilistic properties ofthe superstar tree Gn+1.

We label the vertices of BP(τn) as {v1, v2, . . . , vn} in order of their birth.Now add a new vertex v0 to this set to give us the vertex set of the treeSn. One can anticipate that v0 will be our superstar. Next, we define theedge set for Sn. To do this, we take each red vertex v in BP(τn), removethe edge connecting v to its parent (if it has one) and then we create a newedge between v and v0. To complete the construction of Sn, it only remainsto ignore the color of the vertices. An illustration of this surgery for n= 6is given in Figure 5.

Proposition 5.1 (Equivalence from surgery operation). The sequenceof trees {Sn :n≥ 1} has the same distribution as the Superstar model {Gn+1 :n≥ 1}.

Proof. Think of Sn as being rooted at v0 so that every vertex except v0in Sn has a unique parent. The parent of all the red individuals is the super-star v0 while the parents of all of the other blue individuals are unchangedfrom BP(τn).


The induction hypothesis will be that Sn has the same distribution asGn+1 and the degree of each nonsuperstar vertex in Sn is the number ofblue children it possesses plus one for the edge connecting the vertex to itsparent in Sn. Condition on BP(τn) and fix v ∈ BP(τn). By the property ofthe exponential distribution, the probability that the next vertex born intothe system is born to vertex v is given

λ(v, τn)∑u∈BP(τn)

λ(u, τn)=

cB(v, τn) + 1∑u∈BP(τn)

cB(u, τn) + 1.

Thus a new vertex vn+1 attaches to vertex v with probability proportionalto the present degree of v in Sn. Further, with probability p, this vertex iscolored red, whence by the surgery operation, the edge to vn+1 is deletedand this new vertex is connected to the superstar v0. In this case the degreeof v in Sn is unchanged. With probability 1− p this new vertex is coloredblue, whence the surgery operation does not disturb this vertex so that thedegree of vertex v is increased by one. These are exactly the dynamics ofGn+2 conditional on Gn+1. By induction the result follows. �

For the rest of the paper, we shall assume Gn+1 is constructed throughthis surgery process from BP(τn) and suppress Sn.

5.3. Elementary properties of the branching process. The previous sec-tion set up an equivalence between the Superstar model and the two typecontinuous time branching process. The aim of this section is to prove prop-erties of this two type branching process. Section 6 uses these results tocomplete the proof of the main results for the Superstar model.

For t ≥ 0, write R(t) and B(t) for the total number of red and bluevertices, respectively, in BP(t). By construction of the process {BP(t) : t≥ 0},every new vertex is independently colored red with probability p and bluewith probability 1−p. In particular, the number of blue vertices B(t) is justa time change of a random walk with Bernoulli(1− p) increments. Thus, bythe strong law of large numbers

B(t)

|BP(t)|a.s.−→ 1− p as t→∞.(5.2)

Before moving onto an analysis of the branching process, we introduce theYule process.

Definition 5.2 (Rate a Yule process). Fix a > 0. A rate a Yule processis defined as a pure birth process Yua(·) that starts with a single individualYua(0) = 1 and with the rate of creating new individuals proportional to thenumber of present individuals in the population, namely

P(Yua(t+ dt)− Yua(t) = 1|{Yua(s) : 0≤ s≤ t}) = aYua(t)dt.


The Yule process is a well-studied probabilistic object. The next lemmacollects some of its standard properties. In particular, part (a) follows from[22], Section 2.5, while (b) follows from [3], Theorem 1, III.7.

Lemma 5.3 (Yule process). (a) For each t > 0, the random variableYua(t) has a geometric distribution with parameter e−at, that is,

P(Yua(t) = k) = e−at(1− e−at)k−1, k ≥ 1.

(b) The process (e−atYua(t) : 0 ≤ t <∞) is an L

2 bounded martingale with

respect to the natural filtration and e−atYua(t)

a.s.−→ W ′, where W ′ has anexponential distribution with mean one.

Now define the process

M(t) = e−(2−p)t(|BP(t)|+B(t)), t≥ 0.

Note that M(0) = 1.

Proposition 5.4 [Asymptotics for BP(t)]. The process (M(t) : t ≥ 0)is a positive L

2 bounded martingale with respect to the natural filtration{BP(t) : t≥ 0}, and thus converges almost surely and in L

2 to a randomvariable W ∗ with E(W ∗) = 1. The random variable W ∗ is positive with prob-ability one. Further, one has

limt→∞

e−(2−p)t|BP(t)|= W ∗

2− pwith probability one.(5.3)

Proof. We write Z(t) = |BP(t)| and Y (t) = Z(t)+B(t) so that M(t) =e−(2−p)tY (t) and we let dM(t) =M(t+ dt)−M(t). We then have

dM(t) = e−(2−p)t dY (t)− (2− p)e−(2−p)tY (t)dt.(5.4)

The processes Z(t),B(t) are all counting processes. For such processes, weshall use the infinitesimal shorthand E(dZ(t)|BP(t)) = a(t)dt to denote the

fact that Z(t)−∫ t0 a(s)ds is a local martingale.

Now the counting process Z(t) = |BP(t)| evolves by jumps of size one with

P(dZ(t) = 1|BP(t)) =( ∑

v∈BP(t)

(cB(v, t) + 1)

)dt,(5.5)

where cB(v, t) always denotes the number of blue children of vertex v at timet. The number of blue vertices can be written as B(t) =

∑v∈BP(t) cB(v, t)

since every blue vertex is an offspring of a unique vertex in BP(t). Using(5.5) results in

E(dZ(t)|BP(t)) = (Z(t) +B(t))dt.


Since B(t) ≤ Z(t), we see that the rate of producing new individuals isbounded by 2|BP(t)|. Thus, the process |BP(t)| can be stochastically boundedby a Yule process with a= 2. This implies by Lemma 5.3 that for all t≥ 0we have E(|BP(t)|2)<∞.

Let us now analyze the process B(t). This process increases by one whenthe new vertex born into BP(·) is colored blue that happens with probability1− p. Thus, we get

E(dB(t)|BP(t)) = (1− p)(Z(t) +B(t))dt.

Combining the last two equation gives us

E(dY (t)|BP(t)) = (2− p)Y (t)dt.

Using (5.4) now gives that E(dM(t)|BP(t)) = 0. This completes the proofthat M(·) is a martingale.

Next, we check that M(·) is an L2 bounded martingale. Since Y 2(t+ dt)

can take values (Y (t) + 1)2 or (Y (t) + 2)2 at rate pY (t) and (1 − p)Y (t),respectively, we have

E(d(M2(t))|BP(t)) = (4− 3p)e−(2−p)tM(t)dt.

Thus, the process U(t) defined by

U(t) =M2(t)− (4− 3p)

∫ t

0e−(2−p)sM(s)ds,

is a martingale. Taking expectations and noting that since M(·) is a mar-tingale, with M(0) = 1 thus E(M(s)) = 1 for all s, we get

E(M2(t)) = 1+ (4− 3p)

∫ t

0e−(2−p)s ds≤ 1 +

4− 3p

2− p.

This L2 boundedness implies that there exists a random variable W ∗ such

that

e−(2−p)t(|BP(t)|+B(t))a.s.,L2

−→ W ∗.

Using (5.2) shows that e−(2−p)t|BP(t)| →W ∗/(2−p). To ease notation, write

W :=W ∗

(2− p).

To complete the proof of the proposition we need to show that W is strictlypositive. First, note that by L

2 convergence, E(W ∗) = 1. So in particularP(W = 0) = r < 1. Let ζ1 < ζ2 < · · · be the times of birth of children (blueor red) of the root vertex v1 and write BPi(·) for the subtree consisting ofthe ith child and its descendants. Then

e−(2−p)t|BP(t)|=∞∑

i=1

e−(2−p)ζi [e−(2−p)(t−ζi)|BPi(t− ζi)|]1{ζi ≤ t}+ e−(2−p)t.


Thus, as t→∞, for any fixed K ≥ 1, we have

W ≥st

K∑

i=1

e−(2−p)ζiWi,

where {Wi}i≥1 are independent and identically distributed with the samedistribution as W (independent of {ζi}i≥1) and ≥st denotes stochastic dom-ination. This independence gives us

P(W = 0)≤ P(Wi = 0 ∀1≤ i≤K) = rK.

Letting K →∞ that P(W = 0) = 0. �

Before ending this section, we derive some elementary properties of theoffspring of an individual in BP(·). Let σv be the time of birth of vertex v inBP(·). Recall that cB(v,σv + s) and cR(v,σv + s) denote the number of blueand red children, respectively, of this vertex s units of time after the birthof v. Since the distribution of the point process representing offspring ofeach vertex is the same, these random variables have the same distributionirrespective of the choice of the vertex v. Define the process

M∗(t) := cR(v,σv + t)− p

∫ t

0(cB(v,σv + s) + 1)ds, t≥ 0.

Lemma 5.5 (Offspring point process: distributional properties).

(a) Conditional on BP(σv) we have

(cB(v,σv + t) : t≥ 0)d= (Yu1−p(t)− 1 : t≥ 0),

and thus one has

E(cB(v,σv + t)) = e(1−p)t − 1, t≥ 0.

(b) The process (M∗(t) : t≥ 0) is a martingale with respect to the filtration{BP(σv + t) : t≥ 0} and one has

E(cR(v,σv + t)) =p

1− p(e(1−p)t − 1), t≥ 0.

Proof. Part (a) is obvious from construction. To prove (b), note that

E(dcR(v,σv + t)|BP(t+ σv)) = p(cB(v,σv + t) + 1)dt,

since vertex v creates a new child at rate cB(v,σv + t) + 1 which is thenmarked red with probability p. �


5.4. Convergence for blue children proportions. The equivalence betweenBP(·) and the Superstar model described in Section 5.2 will imply that thenumber of vertices with degree k+ 1 in Gn+1 is the same as the number ofvertices in BP(τn) with exactly k blue children. We will need general resultson the asymptotics of such counts for the process BP(t) as t→∞. Using theequivalence created by the surgery operation, one can then transfer theseresults to asymptotics for the degree distribution of the original Superstarmodel. Now recall the random variable W ∗ obtained as the martingale limitobtained in Proposition 5.4. Define p≥k(∞) as

p≥k(∞) = k!k∏

i=1

(i+

2− p

1− p

)−1

.(5.6)

Theorem 5.6. Fix k ≥ 1 and let Z≥k(t) denote the number of verticesin BP(t) that have at least k blue children. Then

e−(2−p)tZ≥k(t)a.s.−→ p≥k(∞)

W ∗

2− p

as t→∞.

Proof. The proof uses a variant of the reproduction martingale tech-nique developed in [20] and it is framed in two steps:

(a) Proving convergence of expectations of the desired quantities to theexpectations of the asserted limits. This is proved in Section 5.4.1.

(b) Bootstrapping this convergence to almost sure convergence using lawsof large numbers. This is proved in Section 5.4.2.

We start with some notation required to carry out this program. For avertex v, write

ζv = ((ξvi ,Cvi ) : i≥ 1),

for the point process representing offspring (times of birth and types) ofthis vertex v. More precisely here ξvi denotes the time of birth of the ithoffspring of vertex v after the birth of vertex v into the branching process{BP(t) : t≥ 0} while Cv

i denotes the color of this child (red or blue). Thus, theith offspring of vertex v is born into BP at time σv+ξvi . Write ξv = (ξvi : i≥ 1)for the process that just keeps track of times of birth of these offspring forvertex v. Note that the point processes ζv and ξv have the same distributionacross vertices v. We shall use ζ := ζv1 and ξ := ξv1 to denote a generic pointprocess with the above distributions. We shall view ξ as a counting measure


on (R+,B(R+)). For A ∈ B(R+), write ξ(A) for the number of points in theset A. Define the corresponding intensity measure µ by

µ(A) := E(ξ(A)), A ∈ B(R+).

We start with a simple lemma that has notable consequences.

Lemma 5.7 (Renewal measure). For α= 2− p, we have∫ ∞

0e−αtµ(dt) = 1.

The measure defined by setting µα := e−αtµ(dt) is a probability measure andthis measure has expectation

∫∞0 tµα(dt) = 1.

Proof. As in Lemma 5.5, let cB(v1, t) and cR(v1, t) denote the numberof red and blue children, respectively, of vertex v1 by time t (note thatσv1 = 0 ). Then by definition, the intensity measure µ satisfies µ([0, t]) =E(cR(v1, t) + cB(v1, t)). Further by Fubini’s theorem,

∫ ∞

0e−αtµ(dt) = α

∫ ∞

0e−αtµ[0, t]dt.

Using the expressions for E(cB(v1, t)) and E(cR(v1, t)) from Lemma 5.5 com-pletes the proof. The second assertion regarding the expectation follows sim-ilarly. �

5.4.1. Convergence of expectations. The first step in the proof of Theo-rem 5.6 is convergence of expectations. This follows using standard renewaltheory. We setup notation that allows us to use the linearity of expectationsto derive a renewal equation. We start with the definition of a characteristic[14, 15] that we use to count the number of vertices in the branching processwith some fixed property. For each vertex v ∈ BP(∞), let {φv(s) : s≥ 0} bean independent and identically distributed nonnegative stochastic process,with φv(s) measurable with respect to {(ξvi ,Cv

i ) : ξvi ≤ s}. Thus, the value

of the stochastic process at time s namely φv(s) is determined by the setoffspring of vertex v born before the age s of this vertex v.

The value φv(s) is referred to as the score of vertex v at age s [14],Section 6.9. We write φ := φv1 to denote the process corresponding to theroot when we would like to refer to a generic such process. Throughoutwe shall assume that φ(·) is bounded and nonnegative, namely for someconstant C <∞,

φ(s)≥ 0, φ(s)<C for all s≥ 0.

Define

Zφ(t) =∑

v∈BP(t)

φv(t− σv), t≥ 0


for the branching process BP(·) counted according to characteristic φ. Themain examples of interest are:

(a) Total size: φ(s) = 1 for all s≥ 0. This results in Zφ(t) = |BP(t)|, thetotal size of the branching process by time t.

(b) Degree: φ(s) = 1{k or more blue children at age s} gives Zφ(t) =Z≥k(t), the number of vertices in BP(t) with k or more blue children.

Now fix an arbitrary bounded characteristic φ. For fixed time t > 0, condi-tioning on the offspring process ζ := ζv1 of vertex v1, the branching processcounted according to this characteristic satisfies the recursion

Zφ(t) = φv1(t) +∑

ξv1i ≤t

Z(i)φ (t− ξv1i ),(5.7)

where Z(i)φ (·) d

= Zφ(·) and are independent for i ≥ 1 and correspond to thecontribution of the descendants of the ith child of vertex v1. Taking expec-tations and defining the function mφ(·) by mφ(t) := E(Zφ(t)), this functionsatisfies the renewal equation

mφ(t) = E(φ(t)) +

∫ t

0mφ(t− s)µ(ds).

Define

mφ(t) := e−αtmφ(t), t≥ 0.

Lemma 5.7 and standard renewal theory ([14], Theorem 5.2.8) now implythe next result.

Proposition 5.8. For arbitrary bounded characteristics, writing α =(2− p) we have

limt→∞

mφ(t)=

∫ ∞

0e−αs

E(φ(s))ds := mφ(∞).

Applying this to the two examples which count the size of the branchingprocess and number of vertices with at least k blue children, we get thefollowing result.

Corollary 5.9. Taking the two characteristics of interest one gets forφ(t) = 1

e−αtE(|BP(t)|)→ 1

αas t→∞

and for φ(t) = 1{k or more blue children at time t}

e−αtE(Z≥k(t))→

p≥k(∞)

αas t→∞,

with p≥k(∞) as in (5.6).


Proof. The first assertion in the corollary is obvious [correspondingto the case φ(·) ≡ 1]. To prove the second assertion regarding the numberof blue vertices, observe that the limit constant in Proposition 5.8 can bewritten as

1

α

∫ ∞

0αe−αs

E(1{root v1 has k or more blue children at age s})ds

=1

αP(cB(v1, T )≥ k),

where T is an exponential random variable with mean α−1 that is indepen-dent of the counting process of the number blue offspring cB(v1, ·). Further,by Lemma 5.5(a),

cB(v1, ·) d= Yu1−p(·)− 1,

where Yu1−p(·) is rate 1− p Yule process. The interarrival times Xi betweenblue children i and i+1 are independent exponential random variables withmean (1− p)−1(i+1)−1, independent of T . In particular P(cB(v1, T )≥ k) =

P(T >∑k−1

j=0 Xj). Conditioning on the value of∑k−1

j=0 Xj and using tail prob-abilities for the exponential distribution shows that

P

(T >

k−1∑

j=0

Xj

)= E

(exp

(−α

k−1∑

j=0

Xj

))=

k−1∏

j=0

E(exp(−αXj)).

Using the Laplace transform of the exponential distribution, one can checkthat the last expression equals p≥k(∞). �

5.4.2. Almost sure convergence. The aim of this section is to strengthenthe convergence of expectations to almost sure convergence. A key role isplayed by a reproduction martingale, a close relative of the martingale usedin [20] to analyze single type branching processes as well as in [17] to analyzetimes of first birth in generations. Let v1, v2, v3, . . . denote the vertices ofBP(·) listed in the order of their birth times and let σvi denote the time atwhich vertex vi is born into the branching process BP(·). Note that σv1 =0. Recall that ξvi = (ξvi1 , ξvi2 , . . .) denotes the offspring point process of vi,namely the first offspring of vi is born at time σvi + ξvi1 , the second offspringof vi is born at time σvi + ξvi2 and so on. To ease notation, we shall write

ζ(i) := ζvi and ξ(i) := ξvi . Viewing ξ(i) as a random counting measure on R+

and writing α= 2− p, we have

ξ(i)α :=∞∑

j=1

exp(−αξvij ) =

∫ ∞

0e−αtξ(i)(dt).


For m≥ 1, let Fm be the sigma-algebra generated by vertices {v1, . . . , vm}and their offspring processes, namely

Fm := σ({ζ(i) : 1≤ i≤m}).

For m= 0, let F0 be the trivial sigma-field. Now define R0 = 1 and for m≥ 0define

Rm+1 := Rm + e−ασvm+1 (ξ(m+1)α − 1).

Let Γm be the set of the first m individuals born and all of their offspring.One can check that

Rm =∑

v∈Γm

e−ασv −m∑

j=1

e−ασvj .(5.8)

Thus, Rm is a weighted sum of children of the first m individuals withweight e−ασx for vertex x, the individuals v1, v2, . . . , vm being excluded. Inparticular, Rm > 0 for all m. The next lemma shows that the sequence(Rm :m≥ 0) is much more.

Proposition 5.10 (Reproduction martingale). The sequence (Rm :m≥ 0)is a nonnegative L

2 bounded martingale with respect to the filtration {Fm :m≥ 0}. Thus, there exists a random variable R∞ with E(R∞) = 1 such thatRm →R∞ almost surely and in L

2.

Proof. By the choice of α = 2 − p in Lemma 5.7 for i ≥ 1, we have

E(ξ(i)α ) =

∫∞0 e−αtµ(dt) = 1. Further, σvm+1 is Fm measurable while ξ

(m+1)α

is independent of Fm. This implies

E(Rm+1 − Rm|Fm) = e−ασvm+1E(ξ(m+1)α − 1) = 0.

By the orthogonality of the increments of the martingale Rm, we see that

E((Rm − 1)2)≤ E([ξ(i)α ]2)E

(m∑

i=1

e−2ασvi

).

Thus, to check L2 boundedness it is enough to check that the right-hand side

is bounded. The following lemma bounds the right-hand side of the aboveequation in two steps and completes the proof. �

Lemma 5.11. (a) Let ξα := ξv1α and assume 0< p < 1. Then E([ξα]2)<

∞.(b) For any m, E(

∑mi=1 e

−2ασvi )≤ 1 + α−1.


Proof. To prove (a), we observe that ξα =∫∞0 αe−αtξ[0, t]dt where ξ is

the point process encoding times of birth of offspring of v1. Thus, by Jensen’sinequality with the probability measure αe−αt dt we have

[ξα]2 ≤

∫ ∞

0αe−αt[ξ[0, t]]2 dt.

Let T be an exponential random variable with mean α−1 independent of ξ.Thus, it is enough to show E([ξ[0, T ]]2)<∞. Note that ξ[0, T ] = cR(v1, T )+cB(v1, T ), that is, the number of red and blue vertices born to v1 by therandom time T . Thus, it is enough to show E(c2R(v1, T )) and E(c2B(v1, T ))<∞. Conditioning on T = t first note by using Lemma 5.3 that for fixed t,E(c2B(v1, t))≤Ce2(1−p)t where C <∞ is a constant independent of t. Furtheragain using Lemma 5.3, for any fixed t, conditional on cB(v1, t), cR(v1, t) isstochastically dominated by a Poisson random variable with rate tcB(v1, t).Noting that α= 2− p, we get

E([ξ[0, T ]]2)≤C ′

∫ ∞

0e−(2−p)t(e2(1−p)t + t2e2(1−p)t)dt <∞,

for some constant C ′ <∞. This completes the proof of (a).To prove (b), let S(t) =

∑v∈BP(t) e

−2ασv . Then∑m

i=1 e−2ασvi = S(τm).

Further, by (5.5) the rate of creation of new vertices at time t is |BP(t)|+B(t). Thus, one has

E(dS(t)|BP(t)) = e−2αt(|BP(t)|+B(t))dt.

Taking expectations and noting that e−αt(|BP(t)| + B(t)) is a martingalegives

E(S(t)) = 1+

∫ t

0e−αs ds.

This completes the proof of part (b), and thus completes the proof of thelemma. �

The next theorem completes the proof of Theorem 5.6. Before statingthe main result, we define some new constructs which will be used in theproof. For a bounded characteristic φ, recall the limit constant mφ(∞) inProposition 5.8. In the following theorem, a key role will be played by themartingale (Rm :m≥ 0). Recall that this was a martingale with respect tothe filtration {Fm :m≥ 0}. We shall switch gears and now think about theprocess in continuous time. Define I(t) as the set of individuals born aftertime t whose parents were born before time t and note that

R|BP(t)|=∑

x∈I(t)

e−ασx .(5.9)


To ease notation, set

Rt := R|BP(t)|, Ft := F|BP(t)|.(5.10)

Theorem 5.12 (Convergence of characteristics). For any bounded char-acteristic that satisfies the recursive decomposition in (5.7), one has

e−αtZφ(t)a.s.−→ mφ(∞)R∞.

Taking φ= 1 and using Proposition 5.4 implies that R∞ =W ∗, the a.s. limitof the martingale (e−αt(|BP(t)|+B(t)) : t≥ 0).

Proof. First note that Proposition 5.10 implies that {Rt : t≥ 0} is anL2 bounded martingale with respect to the filtration {Ft : t≥ 0} and thus

Rta.s.−→R∞. For a fixed c > 0, define I(t, c) as the set of vertices born after

time (t+ c) whose parents are born before time t and let

Rt,c :=∑

x∈I(t,c)

e−ασx .(5.11)

Obviously, Rt,c ≤Rt. Intuitively, one should expect Rt,c to be small for largec. The next lemma makes this intuition precise. Recall the random variableξα =

∫∞0 e−αtξ(dt) where ξ = ξv1 denoted the point process corresponding to

births of offspring of vertex v1. For fixed c≥ 0, write ξα(c) :=∫∞c e−αtξ(dt).

Finally, define

U := supc≥0

ec/2ξα(c), A= E(U), K(c) =Aeαe−c/2

1−√e.(5.12)

The proof below will show that A<∞. Also note that K(c)→ 0 as c→∞.Finally, recall from the proof of Proposition 5.4 that we definedlimt→∞ exp(−αt)|BP(t)|=W . �

Theorem 5.13. For any fixed c > 1, we have

lim supt→∞

Rt,c ≤K(c)W a.s.,

where K(c) is as in (5.12).

Proof. The proof uses a variant of the proof used in [20]. Let us startby showing that E(U)<∞. First, note that for any fixed c≥ 0,

ec/2ξα(c)≤∫ ∞

cet/2e−αtξ(dt)≤

∫ ∞

0et/2e−αtξ(dt).

Thus, it is enough to show that E(∫∞0 et/2e−αtξ(dt)) < ∞. By Fubini and

integration by parts, E(∫∞0 et/2e−αtξ(dt)) = (α − 1/2)

∫∞0 et/2e−αtµ[0, t]dt


where µ is the intensity measure of the point process ξ. Using Lemma 5.5shows that for some constant C <∞, we have

∫ ∞

0et/2e−αtµ[0, t]dt≤C

∫ ∞

0et/2e−αte(1−p)t dt

=C

∫ ∞

0e−t/2 dt <∞,

by using α= 2− p. This completes the proof of finiteness.Now note that by definition for any c > 1

Rt,c =

⌊t⌋∑

i=1

∑

v : σv∈[i−1,i)

j : ξvj+σv>t+c

exp(−α(ξvj + σv)) +∑

v : σv∈[⌊t⌋,t)

j : ξvj +σv>t+c

exp(−α(ξvj + σv))

(5.13)

≤⌈t⌉∑

i=1

∑

v : σv∈[i−1,i)

j : ξvj+σv>t+c

exp(−α(ξvj + σv)).

Here, as usual, ⌊t⌋ is the largest integer ≤ t and ⌈t⌉ is the smallest integer≥ t. Analogous to the definition of ξα(·), define for each vertex v, ξvα(·) usingthe offspring point process ξv of v, namely

ξvα(t) :=

∫ ∞

texp(−αt)ξv(dt) =

∑

j : ξvj≥t

exp(−αξvj ).

Further analagous to (5.12), for each vertex v define

Uv(t) := et/2ξvα(t), Uv := supt≥0

et/2ξvα(t).

Note that

Uvd= U, Uv(t)≤st U,(5.14)

where U is as in (5.12) and ≤st represents stochastic domination. Now fora fixed i≥ 1 and vertex v with σv ∈ [i− 1, i),

∑

j : ξvj >t+c−σv

e−α(ξvj +σv) = e−ασvξvα(t+ c− σv)

≤ e−α(i−1)e−(t+c−i)/2Uv(t+ c− σv).

Using this in (5.13) gives

Rt,c ≤⌈t⌉∑

i=1

e−α(i−1)e−(t+c−i)/2∑

v : σv∈[i−1,i)

Uv(t+ c− σv).(5.15)


To proceed, we will need the following generalization of the strong law. Weparaphrase the following from [20], Proposition 4.1.

Proposition 5.14 (Extension of the strong law). Let {ni : i≥ 1} be asequence of integers and let (Uij : 1≤ j ≤ ni) be a collection of independentrandom variables for each fixed i ≥ 1. Suppose that there exists a randomvariable U > 0 with E(U)<∞ such that

|Uij | ≤st U, 1≤ j ≤ ni.(5.16)

Further assume

lim infi→∞

ni+1

n1 + · · ·+ ni> 0.(5.17)

Then

Si :=

∑nij=1(Uij − E(Uij))

ni

a.s.−→ 0 as i→∞,(5.18)

and in fact for any ε > 0

∞∑

i=1

P(|Si|> ε)<∞.(5.19)

Proceeding with the proof, for any interval I ⊆ R+, write BP(I) for thecollection of vertices born in the interval I so that BP(t)≡ BP[0, t]. We willuse the above proposition with ni = |BP[i− 1, i)| and for each fixed i, thecollection of random variables {Uv(t + c − σv) :v ∈ BP[i − 1, i)}. This is alittle subtle since the above proposition is stated for deterministic sequencesbut this justified exactly as in the proof of [20], equation (5.29). First, notethat Uv(t+ c− σv)≤st U for each fixed v. Note that by Proposition 5.4,

ni+1

n1 + · · ·+ ni:=

|BP[i, i+1)|BP[0, i)

a.s.−→ eα − 1> 0,

as i→∞, thus (5.17) is satisfied (almost surely). Using Proposition 5.14 in(5.15) [in particular (5.19)] now shows that for any fixed ε > 0

limsupt→∞

Rt,c ≤ lim supt→∞

⌈t⌉∑

i=1

e−α(i−1)e−(t+c−i)/2(E(U) + ε)|BP[i− 1, i)|.

Using the fact that e−αi|BP[i− 1, i)| ≤ e−αi|BP[0, i)| a.s.−→W , simplifyingthe above bound and recalling that we used A= E(U), shows that for everyfixed ε > 0

limsupt→∞

Rt,c ≤W (A+ ε)eαe−(c−1)/2

1−√e

.


Since ε was arbitrary, this completes the proof. �

Completing the proof of Theorem 5.12. Recall that we are deal-ing with bounded characteristics, that is, ‖φ‖∞ < C for some constant C.Without loss of generality, let C = 1. We shall show that there exists aconstant κ such that for all ε > 0,

lim supt→∞

|e−αtZφ(t)− mφ(∞)R∞| ≤ ε(W +2κR∞).(5.20)

Since this is true for any arbitrary ε, this completes the proof. Fix ε > 0.First, choose c large such that the bound in Theorem 5.13 satisfies K(c)< ε.Next, for fixed s > 0, define the truncated characteristic φs as

φs(u) =

{φ(u), u≤ s,

0, u > s.(5.21)

When the branching process is counted by this characteristic, the contribu-tion of all vertices whose age is more than s is zero. One can view this asa characteristic used to count “young” vertices. The limit constant for thischaracteristic by Proposition 5.8 is

mφs(∞) =

∫ s

0e−αu

E(φ(u))du,

where φ is the original characteristic. Note that mφs(∞)→ mφ(∞) as thetruncation level s→∞. Further, writing φ′ = φ− φs, we can view φ′ as thecharacteristic counting scores for “old” vertices (vertices of age greater thans). With this notation, we have Zφ(u) =Zφs(u) +Zφ′(u).

Define

mφs(u) = e−αuE(Zφs(u)), u≥ 0.

Now choose s > c large enough with e−αs < ε such that for all u > s − cone has e−αs < ε, |mφs(∞)− mφ(∞)|< ε, and |mφs(u)− mφs(∞)|< ε. Theconstructs s and c shall remain fixed for the rest of the argument.

Let us understand Zφs(·), the branching process counted according to thetruncated characteristic. We first observe that since φs(u) = 0 when u > s,this implies that for any time t > s, vertices born before time t − s (oldvertices) do not contribute to Zφs(t). Define I(t − s) as the collection ofindividuals born after time t − s whose parents were born before time t.Then Zφs(t) decomposes as

Zφs(t) =∑

v∈I(t−s)

Zvφs(t− σv),

where Zvφs(t− σv) are the contributions to Zφs(t) by the descendants of a

vertex v born in the interval [t− s, t]. Note that by construction, the parent


of such a vertex v belongs to BP(t− s). Further, recall that in the definitionof Rt,c in (5.11) we used I(t − s, c) for the set of vertices born after time(t− s+ c) whose parents are born before time t− s. Then we can furtherdecompose the above sum as

Zφs(t) =∑

x∈I(t−s)\I(t−s,c)

Zxφs(t− σx) +

∑

x∈I(t−s,c)

Zxφs(t− σx).

To simplify notation, write N (t− s, c) = I(t− s) \ I(t− s, c), that is, the setof individuals born in the interval [t− s, t− s+ c] to parents who were bornbefore time t− s. Then we can decompose the difference as a telescopingsum:

e−αtZφ(t)− mφ(∞)R∞ :=

7∑

j=1

Ej(t).(5.22)

The definition of these seven terms {Ei(t) : 1≤ i≤ 7} are as follows:

(a) E1(t) is defined by setting

E1(t) = e−αtZφ′(t), t≥ 0.

Observe that for E1(t), the only vertices that contribute are those with agegreater than s (since φ′(u) = 0 for u < s). In particular, E1(t) = e−αtZφ′(t)≤e−αt|BP(t − s)|. Thus, by Proposition 5.4, one has lim supt→∞E1(t) ≤e−αsW ≤ εW a.s. by choice of s.

(b) E2(t) is defined by setting

E2(t) :=∑

x∈N (t−s,c)

e−ασx [e−α(t−σx)Zxφs(t− σx)− mφs(t− σx)].

Note that since in the above sum x ∈N (t− s, c), thus σx > t− s. Thus,

|E2(t)| ≤ e−α(t−s)|N (t−s, c)|∑

x∈N (t−s,c) e−α(t−σx)Zx

φs(t− σx)− mφs(t− σx)

|N (t− s, c)| .

For E2(t), N (t− s, c) consists of all children of parents in BP(t − s) thatare born in the interval [t− s, t− s+ c]. Thus, |N (t− s, c)| ≤ BP(t− s+ c).In particular, lim supt→∞ e−α(t−s)|N (t− s, c)| ≤Weαc. Further, each of theindividuals in BP(t − s) reproduce at rate at least 1. One can check bythe strong law of large numbers that lim inft→∞ |N (t− s, c)|/|BP(t− s)| ≥ calmost surely. Finally, the terms in the summand (conditional on BP(t− s))are independent random variables and each such term in the sum looks likeX−E(X), whereX is stochastically bounded by the random variable Zφs(c).A strong law of large numbers argument shows that lim supt→∞ |E2(t)|= 0a.s.


(c) E3(t) is defined as

E3(t) :=∑

x∈N (t−s,c)

e−ασx(mφs(t− σx)− mφs(∞)).

By the choice of s since t− σx ≥ s− c, |mφs(t− σx)− mφs(∞)| ≤ ε. Thus,one has |E3(t)| ≤ εRt. Letting t → ∞, one gets lim supt→∞ |E3(t)| ≤ εR∞

a.s.(d) E4(t) is defined as

E4(t) := mφs(∞)

( ∑

x∈N (t−s,c)

e−ασx −Rt−s

).

For E4(t), we have |(∑x∈N (t−s,c) e−ασx −Rt−s)|=Rt−s,c. Thus,

lim supt→∞

E4(t)≤ mφs(∞)K(c)W ≤ mφ(∞)εW,

almost surely by Theorem 5.13 for the asymptotics of Rt,c. Here, we haveused mφs(∞) ≤ mφ(∞) and that our choice of c guarantees K(c) < ε. Toease notation for the rest of the proof, let κ be a constant chosen such thatmax(supu,s≥0(mφs(u)), mφ(∞))<κ. The uniform boundedness of φ guaran-tees that this can be done. By choice, κ is independent of s,u. Thus, thebound for the fourth term simplifies to lim supt→∞E4(t)≤ κεW .

(e) E5(t) is defined by setting E5(t) := mφs(∞)(Rt−s − R∞). Since

Rt−sa.s.−→R∞, E5(t)

a.s.−→ 0.(f) E6(t) is defined by setting E6(t) :=R∞(mφs(∞)−mφ(∞)). By choice

of s, |E6(t)| ≤ εR∞.(g) E7(t) is defined by setting

E7(t) := e−αt∑

v∈I(t−s,c)

Zvφs(t− σv)

=∑

v∈I(t−s,c)

e−ασv (exp(−α(t− σv))Zvφs(t− σv)− mφs(t− σv))(5.23)

+∑

v∈I(t−s,c)

exp(−αt)mφs(t− σv).

Using the strong law of large numbers and arguing as in (b) shows that thefirst term goes to zero as t→ ∞ a.s. Using the constant κ defined in (d)above we get

∑

v∈I(t−s,c)

exp(−αt)mφs(t− σx)≤ κ∑

v∈I(t−s,c)

exp(−ασx) = κRt−s,c.

Using Theorem 5.13 and the choice of c and letting t→∞, we get

lim supt→∞

E7(t)≤ εκR∞ a.s.


Combining all these bounds, one finally arrives at

lim supt→∞

|e−αtZφ(t)− mφ(∞)R∞| ≤ ε(W +2R∞ + κ(W +R∞)) a.s.

Since ε > 0 was arbitrary, this completes the proof. �

5.5. Time of first birth asymptotics. For a rooted tree with root ρ (hereρ= v1), there is a natural notion of a generation of a vertex v. This is definedas the number of edges on the path between v and ρ. Thus, ρ belongs togeneration zero, all the neighbors of ρ belong to generation one, and so forth.The aim of this section is to define a modified notion of generation in BP(t),owing to the fact that the surgery operation as constructed in Section 5.2that sets up a method to go from the continuous time model to the discretetime model implies that the object of study are the number of edges to theclosest red vertex on the path to the root v1. For each fixed k, we shall definestopping times Bir(k) representing the first time an individual in modifiedgeneration k is born into the process BP(·). We study asymptotics of Bir(k)as k → ∞. In the next section, we use these asymptotics to understandheight asymptotics for the Superstar model.

Fix t > 0. For each vertex v ∈ BP(t), let r(v) denote the first red vertexon the path from v to the original progenitor of the process BP(·), namelyv1. If v is a red vertex then r(v) = v. Let d(v) be the number of edges onthe path between v and r(v) so that d(v) = 0 if v is a red vertex.

Fix k ≥ 1. Let Bir(k) denote the stopping times

Bir(k) = inf{t > 0 :∃v ∈ BP(t), d(v) = k}.In other words, Bir(k) is the first time that there exists a red vertex in BP(t)such that the subtree consisting of all blue descendants of this vertex androoted at this red vertex has an individual in generation k. Here, we use Birto remind the reader that this is the time of the first birth in a particulargeneration. The next theorem proves asymptotics for these stopping times.

Theorem 5.15. Let Lam(·) be the Lambert function [9]. We have

Bir(k)

k

a.s.−→ Lam(1/e)

1− pas k→∞.

Proof. Given any rooted tree T and v ∈ T , we shall let G(v) denotethe generation of this vertex in T . Write BP

v1b (·) for the subtree consisting

of all blue descendants of the original progenitor v1 and rooted at v1. Indistribution, this is just a single type continuous time branching processwhere each vertex has the same distribution as the process Yu1−p(·) − 1.Further, let

Bir∗(k) = inf{t :∃v ∈ BPv1b (t),G(v) = k}.


In words, this is the time of first birth of an individual in generation k forthe branching process BP

v1b (·). From the definitions of Bir(k),Bir∗(k), we

have Bir(k)≤Bir∗(k).Much is know about the time of first birth of a single type supercritical

branching process, in particular implies that for BPv1b (·), there exists a limit

constant β such that

Bir∗(k)/ka.s.−→ β.

Here, β can be derived as follows. Write µb for the expected intensity measureof the blue offspring, that is, as in Lemma 5.5

µb([0, t]) = E(cB [v1, t]) = e(1−p)t − 1, t≥ 0.

For θ > 0, let

Φ(θ) := E

(∫ ∞

0e−θtcB(v1, dt)

), θ ∈R.

It is easy to check that this is finite only for θ > 1− p since

Φ(θ) = θ

∫ ∞

0e−θtµb([0, t])dt=

1− p

θ− (1− p).

For a > 0, define

Λ(a) := inf{Φ(θ)eθa : θ ≥ 1− p}= (1− p)ae(1−p)a+1.(5.24)

Then by [17], Theorem 5, the limit constant β is derived as

β = sup{a > 0 :Λ(a)< 1}.(5.25)

From this, it follows that β = Lam(1/e)/(1−p) where Lam(·) is the Lambertfunction. Then we have

limsupk→∞

Bir(k)

k≤ lim

k→∞

Bir∗(k)

k

a.s.−→ W (1/e)

1− p.

This gives an upper bound in Theorem 5.15. Lemma 5.16 proves a lowerbound and completes the proof.

Lemma 5.16. Fix any ε > 0 and let β = Lam(1/e)/(1−p) be the assertedlimit constant. Then

∞∑

l=1

P(Bir(l)< (1− ε)βl)<∞.

Thus, one has lim infl→∞Bir(l)/l ≥ β a.s.


Proof. For ease of notation, for the rest of this proof we shall writetε(l) = (1− ε)βl. In the full process BP(·), two processes occur simultane-ously:

(a) New “roots” (red vertices) are created. Recall that we used R(·) forthe counting process for the number of red roots.

(b) The blue descendants of each new root have the same distribution asa single type continuous time branching process with offspring process havethe same distribution as the process Yu1−p(·)− 1.

Fix l≥ 2 and suppose a new red vertex v was created at some time σv <tε(l). Let BP

vb(·) denote the subtree of blue descendants of v. Let Bir∗(v, l)>

σv be the time of creation of the first blue vertex in generation l for subtreeBP

vb (·). Now Bir(l) < tε(l) if and only if there exists a red vertex v born

before tε(l) such that the subtree of blue descendants of this vertex has avertex in generation l by this time. For a fixed red vertex v ∈ BP(·), writeAv(l) for this event. Since Bir∗(v, l) − σv

d= Bir∗(l), conditional on BP(σv)

one has

P(Av(l)|BP(σv)) = P(Bir∗(l)≤ tε(l)− σv).

Fix 0< s< (1− ε)βl. Then for θ > 1− p, Markov’s inequality implies

P(Bir∗(l)< (1− ε)βl− s)≤ eθ((1−ε)βl−s)E[e−θBir∗(l)].

One of the main bounds of Kingman ([17], equation (2.5), Theorem 1) isE[e−θBir∗(l)]≤ (Φ(θ))l. Thus, we get

P(Bir∗(l)< (1− ε)βl− s)≤ [Φ(θ)eθ(1−ε)β ]le−θs.(5.26)

By the definition of β,

Λε := Λ(β(1− ε)) := inf{Φ(θ)eθ(1−ε)β : θ > 1− p}< 1,

where Λ is as in (5.24). It is easy to check that the minimizer occurs at

θε = 1− p+1

(1− ε)β.

The final probability bound we shall use is

P(Bir∗(l)< (1− ε)βl− s)≤ [Λε]le−θεs.(5.27)

Let N εl be the number of red vertices born before time tl(ε) whose trees

of blue descendants BPvb(·) have at least one vertex in generation l by time

tε(l). Obviously, P(Bir(l) < (1− ε)βl) ≤ E(N εl ). Conditioning on the times

of birth of red vertices, one gets

E(N εl )≤

∫ tε(l)

0[Λε]

ldE(R(s)) using equation (5.27),

= p[Λε]l

∫ tε(l)

0e−(θε−q)s ds using Lemma 5.5.


Simplifying, we get for all l≥ 2, E(N εl )≤C[Λε]

l for a constant C. Thus,

∞∑

l=1

P (Bir(l)< (1− ε)βl)<∞.�

6. Proofs of the main results. Recall the equivalence created by thesurgery operation between the Superstar model and the two-type branchingprocess as established in Section 5.2. We shall use this equivalence and theproven results on BP(·) in Section 5 to complete the proof of the main re-sults. We record the following fact about the asymptotics for the stoppingtimes τn.

Lemma 6.1 (Stopping time asymptotics). The stopping times τn satisfy

τn −1

2− plogn

a.s.−→− 1

2− plogW.

Proof. Proposition 5.4 proves that |BP(t)|e−(2−p)t a.s.−→ W . Thus

ne−(2−p)τn a.s.−→W . �

Let us now start by proving the main results. We note that Theorem 2.1 isobvious since the degree of the superstar is given by R(τn) =

∑ni=1 1{vi is red},

the total number of red vertices and (1{vi} is red)i≥1 is an i.i.d. sequencewith Bernoulli p as the marginal distribution. We now prove the remainingresults using the correspondence between the continuous time and discretetime processes.

6.1. Proof of the degree distribution strong law. In this section, we shallprove Theorem 2.2. Since Gn+1 is a connected tree, every vertex has degreeat least one. Recall that cB(v, t) denotes the number of blue children ofvertex v by time t. Write deg(v,Gn+1) for the degree of a vertex in Gn+1.The surgery operation implies that for any nonsuperstar vertex

deg(v,Gn+1) = cB(v, τn) + 1.(6.1)

Fixing k ≥ 0, the number of nonsuperstar vertices with degree exactly k+1is the same as the number of vertices in BP(τn) that have exactly k bluechildren. Recall that we used Z≥k(t) for the number of vertices in BP(t) thathave at least k blue children. Proposition 5.4, showed that the total numberof vertices |BP(t)| satisfies

e−(2−p)t|BP(t)| a.s.−→ W ∗

(2− p)as t→∞.(6.2)


Theorem 5.6 showed that

e−(2−p)tZ≥k(t)a.s.−→ k!

k∏

i=1

(i+

2− p

1− p

)−1 W ∗

2− p.

Thus, writing p≥k(t) = Z≥k(t)/BP(t) for the proportion of vertices with de-gree k, Theorem 5.6 implies one has

p≥k(t)a.s.−→ k!

k∏

i=1

(i+

2− p

1− p

)−1

:= p≥k(∞) as t→∞.

Now let k ≥ 1. Writing N≥k(n) for the number of vertices with degree at least

k in Gn+1, one has N≥k(n)/na.s.−→ p≥k−1(∞) as n→∞. Thus, the proportion

of vertices with degree exactly k converges to p≥k−1(∞)−p≥k(∞) = νSM(k).This completes the proof.

6.2. Proof of maximal degree asymptotics. The aim of this is to proveTheorem 2.4. We wish to analyze the maximal nonsuperstar degree that wewrote as

Υn =max{deg(vi,Gn+1) : 1≤ i≤ n}.The plan will be as follows: we will first prove the simpler assertion of con-vergence of the degree of vertex vk for fixed k ≥ 1. Then we shall show thatgiven any ε > 0, we can choose K such that for large n, the maximal degreevertex has to be one of the first K vertices v1, v2, . . . , vK with probabilitygreater than 1− ε. This completes the proof.

Fix k ≥ 1. Recall from (6.1) that deg(vk,Gn+1) = cB(vk, τn) + 1 wherecB(vk, t) are the number of blue vertices born to vertex k by time t. Recallthat cB(vk, t) is a Yule process of rate 1− p started at time τk (i.e., at thebirth of vertex vk). By Lemma 5.3,

cB(vk, t)

e(1−p)(t−τk)

a.s.−→W ′k,(6.3)

where W ′k is an exponential random variable with mean one. Write γ =

(1 − p)/(2 − p) and let ∆k = e−(1−p)τkW ′W−γ . Using (6.2) and (6.3), wehave

n−γ deg(vk,Gn+1) =cB(vk, τn−1) + 1

e(1−p)(τn−1−τk)

(e(2−p)τn−1

|BP(τn−1)|+ 1

)γ

e−(1−p)τk

a.s.−→W ′kW

−γe−(1−p)τk := ∆k.

Now let us prove distributional convergence of the properly normalizedmaximal nonsuperstar degree Υn. Fix L> 0 and let

Mn[0,L] := max{deg(vk,Gn+1) : τk ≤ L}.(6.4)


In other words, this is the largest degree in Gn+1 amongst all vertices bornbefore time L in BP(·). The convergence of the degree of vk for any k ≥ 1implies the next result.

Lemma 6.2 (Convergence near the root). Fix any L > 0. Then thereexists a random variable ∆∗[0,L]> 0 such that

n−γMn[0,L]a.s.−→∆∗[0,L],

where γ = (1− p)/(2− p).

Now if we can show that with high probability, Υn = Mn[0,L] for largefinite L as n → ∞, then we are done. This is accomplished via the nextlemma. Recall that by asymptotics for the stopping times τn in Lemma 6.1,given any ε > 0, we can choose Kε > 0 such that

lim supn→∞

P

(∣∣∣∣τn −1

2− plogn

∣∣∣∣>Kε

)≤ ε.(6.5)

For any 0<L< t, let BP(L, t] denote the set of vertices born in the interval(L, t]. Recall that we used v1 for the original progenitor. For any time tand v ∈ BP(t), let degv(t) = cB(v, t) + 1 denote the degree of vertex v inthe Superstar model G|BP(t)|+1 obtained through the surgery procedure. Forfixed K and L, let An(K,L) denote the event that for some time t ∈ [(2−p)−1 logn±K], there exists a vertex v in BP(L, t] with degv(t)> degv1(t).

Lemma 6.3 (Maxima occurs near the root). Given any K and ε, onecan choose L> 0 such that

lim supn→∞

P(An(K,L))≤ ε.

In particular, given any ε > 0, we can choose L such that

lim supn→∞

P(Υn 6= Mn([0,L]))≤ ε.

Deferring the proof of this result note that Lemma 6.2 now coupled withthe above lemma now shows that there exists a random variable ∆∗ suchthat Υn/n

γ P−→∆∗. This completes the proof of Theorem 2.4.

Proof of Lemma 6.3. For ease of notation, write

t−n = (2− p)−1 logn−K, t+n = (2− p)−1 logn+K.

Since the degree of any vertex is an increasing process it is enough to showthat we can choose L = L(K,ε) such that as n→∞, the probability thatthere is some vertex born in the time interval [L, t+n ] whose degree at time


t+n is larger than the degree of the root v1 at time t−n is smaller than ε. LetM[L,t+n ](t

+n ) denote the maximal degree by time t+n of all vertices born in the

interval [L, t+n ]. Then for any constant C > 0

P(An(K,L))≤ P({degv1(t−n )<Cnγ} ∪ {M[L,t+n ](t+n )>Cnγ})

≤ P(degv1(t−n )<Cnγ) + P(M[L,t+n ](t

+n )>Cnγ).

Since the offspring process of v1 has the same distribution as a rate (1− p)Yule process

e−(1−p)t−n degv1(t−n ) = e(1−p)K/2degv1(t

−n )

nγ

a.s.−→Wv1 ,

where Wv1 has an exponential distribution with mean one. Thus, for a fixedK, we can choose C =C(ε) large enough such that

lim supn→∞

P(degv1(t−n )<Cnγ)≤ ε/2.

Thus, for a fixed ε,C,K, it is enough to choose L large such that

lim supn→∞

P(M[L,t+n ](t+n )>Cnγ)≤ ε/2.

Without loss of generality, we shall assume throughout that Lε and t+n areintegers. For any integer Lε <m< t+n − 1, let M[m,m+1](t

+n ) denote the max-

imum degree by time t+n of all vertices born in the interval [m,m+1]. Then

M[L,t+n ](t+n ) = max

L≤m≤t+n−1M[m,m+1](t

+n ).

Let |BP[m,m+ 1]| denote the number of vertices born in the time interval[m,m+ 1]. Since for a vertex born at some time s < t+n , the degree of thevertex at time t+n has distribution Yu1−p(t

+n −s), an application of the union

bound yields

P(M[L,t+n ](t+n )>Cnγ)≤

t+n−1∑

m=L

E(|BP[m,m+ 1]|)P(Yu1−p(t+n −m)>Cnγ).

Now E(BP[m,m+ 1]) ≤ E(|BP(m + 1)|). By Proposition 5.4, E(|BP(t)|) ≤e(2−p)t. Further by Lemma 5.3, for fixed time s, a rate 1− p Yule processhas a geometric distribution with parameter e−(1−p)s. Thus, we have

P(M[L,t+n ](t+n )>Cnγ)≤

t+n−1∑

m=L

Ae(2−p)m[1− e−(1−p)(t+n−m)]Cnγ

(6.6)

≤t+n−1∑

m=L

Ae((2−p)m−Ce(1−p)(m−K)),


where last inequality follows from the fact that for 0 ≤ x ≤ 1, 1− x ≤ e−x

and

et+n /2 = nγe(1−p)K .

Now choosing L large, one can make the right-hand side of the last inequalityas small as one desires and this completes the proof. �

6.3. Proof of logarithmic height scaling. The aim of this section is tocomplete the proof of Theorem 2.5. Let us first understand the relationshipbetween the distances in BP(τn) and Gn+1 due to the surgery operation. Thedistance of all the red vertices in BP(τn) from the superstar v0 is one. Foreach blue vertex v ∈ BP(τn), let r(v) denote the first red vertex on the pathfrom v to the root v1 in BP(τn). Recall from Section 5.5 that d(v) denotedthe number of edges on the path between v and r(v) with d(v) = 0 if v wasa red vertex. Then the distance of this vertex from the superstar v0 in Gn+1

is just d(v) + 1 since the vertex needs d(v) steps to get to r(v) that is thendirectly connected to v0 in Gn+1 by an edge. Let D(u, v) denote the graphdistance between vertices u and v in Gn+1. Since by convention d(v) = 0for all the red vertices, this argument shows that for all v 6= v0 ∈ Gn+1,D(v, v0) = d(v) + 1. In particular, the height of Gn+1 is given by

H(Gn+1) =max{d(v) + 1 :v ∈ BP(τn)}.(6.7)

Now by the definition of H(Gn+1), there is a vertex in BP(τn) such thatd(v) =H(Gn+1)−1 but no vertex with d(v) =H(Gn+1). Recall the stoppingtimes Bir(k), defined as the first time a vertex with d(v) = k is born in BP(·).Thus, we have

Bir(H(Gn+1)− 1)≤ τn ≤ Bir(H(Gn+1)).(6.8)

Now recall that Theorem 5.15 showed that the stopping times Bir(k) satisfy

Bir(k)/ka.s.−→ Lam(1/e)/(1− p) as k→∞.

Dividing (6.8) throughout by H(Gn+1) by Theorem 5.15

Bir(H(Gn+1)− 1)

H(Gn+1)

a.s.−→ Lam(1/e)

1− p,

while by Lemma 6.1 we get

τnlogn

a.s.−→ 1

2− p.

Rearranging shows that

H(Gn+1)

logna.s.−→ (1− p)

Lam(1/e)(2− p).

This completes the proof. �


6.4. Extension to the variants of the Superstar model. We now describehow the above methodology easily extends to the two variants describedin Section 3, namely the superstar linear preferential attachment and theuniform attachment model (Theorems 3.1 and 3.2). Since the proofs areidentical to the original model, modulo the driving continuous time branch-ing process, we will not give full proofs but rather describe the continuoustime versions that need to be analyzed to understand the correspondingdiscrete model. The surgery operation and the subsequent analysis of thecontinuous time model are identical to the original Superstar model.

For fixed a >−1 and p ∈ (0,1), we write {Glinn (a, p) :n≥ 1} for the corre-

sponding family of growing random trees obtained via following the dynam-ics of the linear attachment scheme (see Section 3). We let {Guni

n (p) :n≥ 1}be the family of random trees obtained via uniform attachment. Now recallthat the analysis of the superstar preferential attachment model start withthe formulation of a continuous time two type branching process (consist-ing of red and blue vertices). One then performs surgery on this two typebranching process at appropriate stopping times τn as defined in (5.1) toobtain the Superstar model. For the two variants, let us now describe thecorresponding continuous time versions.

(a) Superstar linear preferential attachment : We write {BPlin(t)}t≥0 forthis branching process. Here one starts with a single red vertex v1 at timet = 0. Each individual lives forever. For any fixed t ≥ 0, each individualv ∈ BPlin(t) in the branching process reproduces at rate

λ(v, t) := cB(v, t) + 1+ a,

where as before cB(v, t) denotes the number of blue children of vertex v attime t. Each new offspring is colored red with probability p and blue withprobability q := 1− p.

(b) Uniform attachment : Start with a single red vertex v1 at time t= 0.Each individual reproduces at rate one and lives forever. Each new offspringis colored red with probability p and blue with probability q := 1− p. Write{BPuni(t)}t≥0 for this branching process.

Fix n ≥ 1 and recall the stopping time τn from (5.1), namely the time forthe branching process to reach size n. From Section 5.2, recall the surgeryoperation that takes BP(τn) to a random tree Sn on n + 1 vertices. Thefollowing proposition which is the general analog of Proposition 5.1 showingthe equivalence of the continuous time models and the discrete time versions.The result is stated for the linear preferential attachment model, the sameresult is true using the corresponding branching process for the uniformattachment model.


Proposition 6.4. Fix a > −1 and p ∈ (0,1). Let {BPlin(t) : t ≥ 0} be

the continuous time two type branching process constructed as above for

the superstar linear preferential attachment model with parameters a, p. The

sequence of trees {Sn :n ≥ 1} obtained by performing the surgery operation

on {BPlin(τn) :n≥ 1} has the same distribution as {Gn+1(a, p) :n≥ 1}.

Now recall that in the proof of the original Superstar model, a major role

was played by Proposition 5.4 which showed that the associated continuous

time branching process grew at rate exp((2− p)t). This allowed us to make

rigorous the following two ideas (see, e.g., the proof of Corollary 5.9):

(a) As t→∞, the age of an individual chosen uniformly at random from

the population has an exponential distribution with rate (2− p).

(b) For vertex v, let cB(v,σv + t) be the number of blue children t units

after being born and note that {cB(σv + t) : t≥ 0} has the same distribution

for any vertex. Since the number of blue children of a vertex represents

the out-degree in the Superstar model after the surgery operation, using

(i), the limiting degree distribution should be the same as 1 + cB(v1, T ),

where T ∼ exp((2− p)) independent of {cB(v1, t) : t≥ 0}. Here, we use v1 for

convenience since σv1 = 0.

The corresponding version of Proposition of 5.4 is the following.

Proposition 6.5. (a) Fix a > −1 and p ∈ (0,1). Then there exists a

random variable W (a, p)> 0 a.s. such that as t→∞,

exp(−(2− p+ a))|BPlin(t)| a.s.−→W (a, p).

(b) For the uniform attachment model, for any p ∈ (0,1) as t→∞,

exp(−t)|BPuni(t)| a.s.−→W,

where W ∼ exp(1).

Proof. We start with part (b). For the uniform attachment model,

since every individual lives forever and reproduces at rate one, the process

{|BPuni(t)| : t≥ 0} has the same distribution as a rate one Yule process (see

Definition 5.2). Then the result follows from Lemma 5.3.

To prove (a), define the process

M(t) := exp(−(2− p+ a))(|BPlin(t)|+B(t)), t≥ 0,


where as before B(t) denotes the number of blue individuals in the popula-tion by time t. Arguing exactly as in the proof of Proposition 5.4, it is easyto check that this process is a martingale. The rest of the proof now followsalong the same lines as the proof of Proposition 5.4. �

The proof of Theorems 3.1 and 3.2 now proceed as in the analysis ofthe original model. For example, to show the convergence of the degreedistribution for the uniform attachment model Theorem 3.2, first note thatfor any vertex v, since this vertex reproduces at rate one and each newoffspring is colored red with probability p and blue with probability q = 1−p.Thus, the process counting the number of blue children {cB(v1, t) : t≥ 0} is arate q Poisson process. Fix k ≥ 1 and write Z≥k(t) for the number of verticesin BPlin(t) which have k or more blue offspring by time t. The analogousversion of Theorem 5.6 for the uniform attachment model implies that

exp(−t)Z≥k(t)a.s.−→ p≥k(∞)W,

where

p≥k(∞) = P(cB(v,T )≥ k),

where T ∼ exp(1) independent of cB(·). Now note that

P(cB(v,T )≥ k) = P

(k∑

i=1

ξi ≤ T

),

where {ξi}i≥1 is a sequence of independent rate q exponential random vari-ables. Arguing as in the proof of Corollary 5.9, we get

P

(k∑

i=1

ξi ≤ T

)= (E(exp(−ξi)))

k =

(q

q +1

)k

.

For the maximal degree, note that by Proposition 6.5 implies that the stop-ping time τn as in (5.1) for the time the continuous time branching processgrows to be of size n satisfies

τn = logn+OP (1).

Since for each vertex, its true degree is the number of blue offspring, as aneasy lower bound, the root v1 by time τn should have degree ∼ (1− p) logn(since the process describing the blue offspring of the root is just a rate qPoisson process). To get that logn is the correct order for the maximal degreeand in particular the weak law, one argues as in Section 6.2 [in particularsee (6.6)], teasing apart the contribution to this maximal degree of verticesborn at various times. The proof of Theorem 3.1 is similar. We omit thedetails.


APPENDIX

Below we describe each of the thirteen events and show the correspondingevent specific term.

• E = 1: Brazil vs. Netherlands soccer match from the 2010 World Cup.The term is “Brazil” or “Netherlands.”

• E = 2: Basketball player Lebron James announcement of signing with theMiami Heat. The term is “Lebron.”

• E = 3: The 2010 World Cup Kick-Off Celebration Concert. The term is“World Cup.”

• E = 4: Brazil vs. Portugal soccer match from the 2010 World Cup. Theterm is “Brazil” or “Portugal.”

• E = 5: Italy vs Slovakia soccer match from the 2010 World Cup. The termis “Italy” or “Slovakia.”

• E = 6: The 2010 BET Awards show. The term is “BET Awards.”• E = 7: The firing of General Stanly McChrystal by US President Barack

Obama. The term is “McChrystal.”• E = 8: The 2010 World Cup Opening Ceremony. The term is “World

Cup.”• E = 9: Mexico vs. South Africa soccer match from the 2010 World Cup.

The term is “Mexico.”• E = 10: England vs. Slovakia soccer match from the 2010 World Cup. The

term is “England.”• E = 11: Portugal vs. North Korea soccer match from the 2010 World Cup.

The term is “Portugal.”• E = 12: Roger Federer’s tennis match in the first round of the 2010 Wim-

bledon tournament. The term is “Federer.”• E = 13: The UN imposing sanctions on Iran. The term is “Iran.”

Acknowledgements. S. Bhamidi would like to thank the hospitality of theStatistics department at Wharton where this work commenced. We thanktwo referees for detailed comments which lead to a significant improvementin the readability of the paper.

REFERENCES

[1] Asur, S., Huberman, B. A., Szabo, G. and Wang, C. (2011). Trends in socialmedia: Persistence and decay. In AAAI Conference on Weblogs and Social Media,Barcelona, Spain.

[2] Athreya, K. B., Ghosh, A. P. and Sethuraman, S. (2008). Growth of preferen-tial attachment random graphs via continuous-time branching processes. Proc.Indian Acad. Sci. Math. Sci. 118 473–494. MR2450248

[3] Athreya, K. B. and Ney, P. E. (1972). Branching Processes. Springer, New York.MR0373040

http://www.ams.org/mathscinet-getitem?mr=2450248



[4] Barabasi, A.-L. and Albert, R. (1999). Emergence of scaling in random networks.Science 286 509–512. MR2091634

[5] Bollobas, B., Riordan, O., Spencer, J. and Tusnady, G. (2001). The degreesequence of a scale-free random graph process. Random Structures Algorithms18 279–290. MR1824277

[6] Bollobas, B. and Riordan, O. M. (2003). Mathematical results on scale-freerandom graphs. In Handbook of Graphs and Networks 1–34. Wiley, Weinheim.MR2016117

[7] Cha, M., Haddadi, H., Benevenuto, F. and Gummadi, K. P. (2010). Measuringuser influence in Twitter: The million follower fallacy. In AAAI Conference onWeblogs and Social Media, Washington, DC.

[8] Cooper, C. and Frieze, A. (2003). A general model of web graphs. Random Struc-tures Algorithms 22 311–335. MR1966545

[9] Corless, R. M., Gonnet, G. H., Hare, D. E. G., Jeffrey, D. J. andKnuth, D. E. (1996). On the Lambert W function. Adv. Comput. Math. 5

329–359. MR1414285[10] Data courtesy Microsoft Research Cambridge. August 2010.[11] Deijfen, M. (2010). Random networks with preferential growth and vertex death.

J. Appl. Probab. 47 1150–1163. MR2752883[12] Dorogovtsev, S. N. and Mendes, J. F. F. (2003). Evolution of Networks. Oxford

Univ. Press, Oxford. MR1993912[13] Durrett, R. (2007). Random Graph Dynamics. Cambridge Univ. Press, Cambridge.

MR2271734[14] Jagers, P. (1975). Branching Processes with Biological Applications. Wiley, London.

MR0488341[15] Jagers, P. and Nerman, O. (1984). The growth and composition of branching

populations. Adv. in Appl. Probab. 16 221–259. MR0742953[16] Jagers, P. and Nerman, O. (1996). The asymptotic composition of supercritical

multi-type branching populations. In Seminaire de Probabilites, XXX. LectureNotes in Math. 1626 40–54. Springer, Berlin. MR1459475

[17] Kingman, J. F. C. (1975). The first birth problem for an age-dependent branchingprocess. Ann. Probab. 3 790–801. MR0400438

[18] Kwak, H., Lee, C., Park, H. and Moon, S. (2010). What is Twitter, a socialnetwork or a news media? In Proc. WWW. Proceedings of the 19th InternationalConference on World Wide Web 591–600. ACM, New York.

[19] Mori, T. F. (2007). Degree distribution nearby the origin of a preferential attach-ment graph. Electron. Commun. Probab. 12 276–282 (electronic). MR2342706

[20] Nerman, O. (1981). On the convergence of supercritical general (C–M–J) branchingprocesses. Z. Wahrsch. Verw. Gebiete 57 365–395. MR0629532

[21] Newman, M. E. J. (2003). The structure and function of complex networks. SIAMRev. 45 167–256 (electronic). MR2010377

[22] Norris, J. R. (1998). Markov Chains. Cambridge Series in Statistical and Proba-bilistic Mathematics 2. Cambridge Univ. Press, Cambridge. MR1600720

[23] Pittel, B. (1994). Note on the heights of random recursive trees and random m-arysearch trees. Random Structures Algorithms 5 337–347. MR1262983

[24] Rudas, A. and Toth, B. (2009). Random tree growth with branching processes—asurvey. In Handbook of Large-Scale Random Networks. Bolyai Soc. Math. Stud.18 171–202. Springer, Berlin. MR2582389

[25] Rudas, A., Toth, B. and Valko, B. (2007). Random trees and general branchingprocesses. Random Structures Algorithms 31 186–202. MR2343718





















[26] Smythe, R. T. and Mahmoud, H. M. (1995). A survey of recursive trees. TheoryProbab. Math. Statist. 51 1–27.

[27] Szymanski, J. (1987). On a nonuniform random recursive tree. In Random Graphs’85(Poznan, 1985). North-Holland Math. Stud. 144 297–306. North-Holland, Ams-terdam. MR0930497

[28] Twitter developers. Available at https://dev.twitter.com/docs/api/1.1/get/

statuses/firehose, August 2012.[29] Wiuf, C., Brameier, M., Hagberg, O. and Stumpf, M. P. H. (2006). A likelihood

approach to analysis of network data. Proc. Natl. Acad. Sci. USA 103 7566–7570(electronic). MR2221040

S. Bhamidi

Department of Statistics

and Operations Research

University of North Carolina

Chapel Hill, North Carolina 27599

USA

E-mail: [email protected]

J. M. Steele

The Wharton School

Department of Statistics

Huntsman Hall 447

University of Pennsylvania

Philadelphia, Pennsylvania 19104

USA


T. Zaman

Sloan School of Management

Massachusetts Institute of Technology

Cambridge, Massachusetts 02142

USA



https://dev.twitter.com/docs/api/1.1/get/statuses/firehose

https://dev.twitter.com/docs/api/1.1/get/statuses/firehose





twitter event networks and the superstar model

Documents