
Random Graph

Summer School 2016, Jiangsu Normal University

Yusheng Li

Tongji University


Contents

1 Probabilistic Method and Random Graphs
  1.1 Random graphs
  1.2 Elementary examples
  1.3 References

2 Concentration
  2.1 The Chernoff inequality
  2.2 Applications of Chernoff's bounds
  2.3 Martingales on random graphs ⋆
  2.4 Parameters of random graphs ⋆
  2.5 References

3 Properties of Random Graphs
  3.1 Some behavior of almost all graphs
  3.2 Threshold functions
  3.3 Poisson limit
  3.4 References

4 Quasi-random Graphs
  4.1 Properties of dense graphs
  4.2 Graphs with small second eigenvalue
  4.3 Applications of characters ⋆
  4.4 References

5 Real-world Networks
  5.1 Data and empirical research
  5.2 Six degrees of separation
  5.3 Clustering coefficient
  5.4 Small-world networks
  5.5 Power law and scale-free networks
  5.6 Network structure
  5.7 References


Chapter 1

Probabilistic Method and Random Graphs

The probabilistic method, primarily used in combinatorics and pioneered by Paul Erdős, proves the existence of a prescribed kind of mathematical object. The method has since been applied to other areas of mathematics such as number theory, algebra, analysis, and geometry, as well as to computer science (e.g. randomized rounding). When each element of a finite set $\Omega$ is assigned a non-negative weight, and the weights sum to one, we have a probability space. The method works by showing that if one chooses objects from a specified class at random, the probability that the result is of the prescribed kind is greater than zero. The most basic form of the probabilistic method proceeds by calculating the expected value of some random variable.

In the basic probabilistic method we use only the expectation of a random variable. Beyond this, common tools used in the probabilistic method include Markov's inequality, the Chernoff bound, and the Lovász local lemma.

1.1 Random graphs

The main reason that the probabilistic method has become a principal tool in Ramsey theory is that there are many Ramsey problems on which traditional combinatorial methods do not work well. For standard texts on random graphs, we refer to Alon and Spencer (2008), Bollobás (2001), and Janson, Łuczak, and Ruciński (2000). Random graphs began with some sporadic papers of Erdős in the 1940s and 1950s, in which he used random methods to show the existence of graphs with seemingly contradictory properties. The paper "On the Evolution of Random Graphs" of Erdős and Rényi in 1960 was very important for the development of the theory of random graphs. Among all these results, Erdős gave an exponential lower bound for the diagonal Ramsey number $r(n, n)$. Today random methods are used in many other areas.

Every probability space whose points are graphs gives a notion of a random graph. For a family of graphs $\mathcal{G} = \{G_1, G_2, \dots\}$ with probabilities $\Pr(G_i)$ such that $0 \le \Pr(G_i) \le 1$ and $\sum_{i\ge1}\Pr(G_i) = 1$, we have a probability space of random graphs. Each $G_i$ is called a random graph of $\mathcal{G}$ with probability $\Pr(G_i)$. We shall consider the probability space that consists of graphs on a fixed set $V = [n] = \{1, 2, \dots, n\}$, where the vertices in $V$ are distinguishable, so edges on $V$ are distinguishable too. Note that the complete graph $K_n$ on vertex set $V$ has
$$\binom{n}{1} + \binom{n}{2}\,2 + \cdots + \binom{n}{k}2^{\binom{k}{2}} + \cdots + \binom{n}{n}2^{\binom{n}{2}}$$
subgraphs. The general term counts the subgraphs that have exactly $k$ vertices, and the last term $2^{\binom{n}{2}}$ counts all spanning subgraphs.

Let us label all edges of $K_n$ on vertex set $V = [n]$ as $e_1, e_2, \dots, e_m$, where $m = \binom{n}{2}$. Note that the number of graphs on vertex set $[n]$ is $2^m$ since the edges are distinguishable. The space $G(n; p_1, \dots, p_m)$ is defined for $0 \le p_i \le 1$ as follows. To get a random element of this space, one selects each edge $e_i$ independently, with probability $p_i$. Putting it another way, the ground set of $G(n; p_1, \dots, p_m)$ is the set of all $2^m$ graphs on $V = [n]$. For a specific graph $H$ in the space with $E(H) = \{e_j : j \in S\}$, where $S \subseteq \{1, \dots, m\}$ is the index set of the edges of $H$, the probability that $H$ appears is
$$\Big(\prod_{j\in S} p_j\Big)\Big(\prod_{j\notin S}(1 - p_j)\Big).$$
That is to say, each edge of $H$ has to be selected, and no edge outside $H$ is allowed to be selected. Write $q_j = 1 - p_j$, and write $G(p_1, \dots, p_m)$ for a random element in $G(n; p_1, \dots, p_m)$. Then
$$\Pr\big(G(p_1, \dots, p_m) = H\big) = \Big(\prod_{j\in S} p_j\Big)\Big(\prod_{j\notin S} q_j\Big).$$
Since the vertices (and edges) are distinguishable, the event $G(p_1, \dots, p_m) = H$ is different from the event that $G(p_1, \dots, p_m)$ is isomorphic to $H$. To see that $G(n; p_1, \dots, p_m)$ is truly a probability space, let us verify that
$$\sum_H \Pr\big(G(p_1, \dots, p_m) = H\big) = \sum_{S\subseteq[m]} \Big(\prod_{j\in S} p_j\Big)\Big(\prod_{j\notin S} q_j\Big) = \prod_{j=1}^m (p_j + q_j) = 1.$$

We shall concentrate on the case $p_1 = p_2 = \cdots = p_m = p$, for which the probability space $G(n; p_1, \dots, p_m)$ is written as $G(n, p)$.

In the space $G(n, p)$ the probability of a specific graph $H$ with $k$ edges is $p^k(1-p)^{m-k}$: each of the $k$ edges of $H$ has to be selected, and no edge outside $H$ may be selected. Write $G_{n,p}$, or simply $G_p$, for a random element of $G(n, p)$; then
$$\Pr(G_p = H) = p^{e(H)} q^{m-e(H)}.$$
In the space $G(n, 0)$, the probability that the empty graph $\overline{K_n}$ appears is one, and the probability that any other graph appears is zero. Similarly, in the space $G(n, 1)$, the only graph that appears is $K_n$. Other than these two extremal cases, for $0 < p < 1$ any graph on vertex set $[n]$ appears with positive probability. As $p$ increases from 0 to 1, the random graph $G_p$ evolves from empty to full.
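The edge-selection mechanism is easy to simulate. The following Python sketch (our illustration, not part of the original notes; the helper name sample_gnp is ours) samples $G_p$ from $G(n, p)$ and checks empirically that a fixed labeled graph $H$ appears with probability $p^{e(H)}q^{m-e(H)}$.

```python
import itertools, random

# Minimal sketch: sample Gp from G(n, p) by selecting each of the
# m = C(n,2) edge slots independently with probability p, then estimate
# Pr(Gp = H) for a fixed labeled H and compare with p^{e(H)} q^{m-e(H)}.
def sample_gnp(n, p, rng):
    return frozenset(e for e in itertools.combinations(range(1, n + 1), 2)
                     if rng.random() < p)

rng = random.Random(0)
n, p = 3, 0.3
H = frozenset({(1, 2)})          # a specific labeled graph on [3], e(H) = 1
m = n * (n - 1) // 2
trials = 200_000
hits = sum(sample_gnp(n, p, rng) == H for _ in range(trials))
print(hits / trials, p ** len(H) * (1 - p) ** (m - len(H)))  # both ~0.147
```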

In their original paper on random graphs in 1960, Erdős and Rényi let $G(n, e)$ be the random graph with vertex set $V = [n]$ and precisely $e$ edges. For fixed $0 \le e \le m = \binom{n}{2}$, the space $G(n, e)$ consists of all $\binom{m}{e}$ spanning subgraphs with exactly $e$ edges, which can be turned into a probability space by taking its elements to be equiprobable. Thus, writing $G_e$ for a random graph in the space $G(n, e)$ and letting $H$ be a specific graph in the space,
$$\Pr[G_e = H] = \binom{m}{e}^{-1},$$


where the event $G_e = H$ means that $G_e$ is precisely $H$, not merely isomorphic to $H$.

It is interesting, as expected, that for $e \sim p\binom{n}{2}$ the spaces $G(n, e)$ and $G(n, p)$ are close to each other as $n \to \infty$. In most existence proofs, the calculations are easier in $G(n, p)$ than in $G(n, e)$, so we will work with the probability model $G(n, p)$ exclusively.

Another point of view may be convenient, in which one colors each edge of the complete graph $K_n$ independently at random, using the first color with probability $p$. Thus the random graph $G_p$ is viewed as a random coloring of the edge set of $K_n$; a coloring of the edge set of $K_n$ is also called a coloring of $K_n$ for short. Recalling the definition of Ramsey numbers, we see why the relation between the random method and Ramsey theory is so natural and tight.

In Ramsey theory we need to consider the probability that a given subgraph appears. Let $F$ be a given graph on $k$ vertices, and let $S \subseteq [n]$ with $|S| = k$. Let $A_S$ be the event that the subgraph induced by $S$ contains $F$ as a subgraph. Then the event $\cup_S A_S$ signifies that $F$ appears in $G_p$ as a subgraph; its probability is hard to calculate exactly, since the events $A_S$ interact in a complex way. One often bounds this probability from above by the expectation of the number of copies of $F$ in the random graph. To get the expectation, let us first look at the number of copies of $F$ in $K_k$. This is closely related to the automorphism group of $F$. A permutation (a bijection) $\varphi$ of $V(F)$ is called an automorphism of $F$ when $uv \in E(F)$ if and only if $\varphi(u)\varphi(v) \in E(F)$ for any pair of vertices $u$ and $v$. It is straightforward to verify that the set of all automorphisms of $F$ forms a group under composition. This group is called the automorphism group of $F$, denoted by $A(F)$. For example, $A(K_k)$ is the symmetric group $S_k$ of order $k!$, and $A(C_k)$ is the dihedral group $D_k$ of order $2k$. The following facts are simple consequences of the definitions:

For any graph $F$, $A(F) = A(\overline{F})$. Let $k$ be the order of the graph $F$; then $|A(F)|$ is a divisor of $k!$, and it equals $k!$ if and only if $F$ is isomorphic to $K_k$ or $\overline{K_k}$.

In a random graph space the vertices (and hence edges) of the graphs are labeled, and counting different copies of a subgraph depends on the different labelings of the vertices of the subgraph. So we define two graphs $F_1$ and $F_2$ to be identical if $V(F_1) = V(F_2)$ and $E(F_1) = E(F_2)$, where the equalities mean equality of sets. Clearly, identical graphs are isomorphic, but the converse is not generally true. For example, given a labeling of a star $K_{1,k}$ as $V = \{u, v_1, \dots, v_k\}$ with center $u$ and $E = \{uv_1, \dots, uv_k\}$, exchanging the labels of any pair of vertices $v_i$ and $v_j$ yields an identical graph.

Theorem 1.1 Let $F$ be a graph of order $k$. Then the number of labeled copies of $F$ on a set of $k$ labels, no two of which are identical, is $k!/|A(F)|$.

Proof. Let $\{v_1, v_2, \dots, v_k\}$ be the set of labels. There are $k!$ labelings of $F$ from this set, although some of the resulting labeled graphs may be identical. Let $F_1, F_2, \dots, F_{k!}$ be the labeled graphs obtained from $F$. The relation "$F_i$ is identical to $F_j$" is an equivalence relation. For a given labeled graph $F_i$, each automorphism gives rise to a labeled graph that is identical to $F_i$, and conversely. Hence each equivalence class contains $|A(F)|$ elements, implying that there are $k!/|A(F)|$ equivalence classes in all. This proves the theorem. □

For example, if we label the vertices of the star $K_{1,3}$ by $1, 2, 3, 4$, then each equivalence class is uniquely determined by the label of its center. So there are 4 such classes, and each class contains 6 labelings with the same label of the center.
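A brute-force computation confirms Theorem 1.1 on this small example. The sketch below (ours, with a hypothetical helper relabeled) counts the automorphisms of $K_{1,3}$ over all $4!$ permutations and verifies that the distinct labeled copies number $4!/|A(F)| = 4$.

```python
import itertools, math

# Brute-force check of Theorem 1.1 for F = K_{1,3} (illustration only).
V = [1, 2, 3, 4]
E = [(1, 2), (1, 3), (1, 4)]          # the star with center 1

def relabeled(perm):
    m = dict(zip(V, perm))            # relabel F along a permutation
    return frozenset(frozenset((m[u], m[v])) for u, v in E)

base = relabeled(V)
perms = list(itertools.permutations(V))
aut = sum(relabeled(p) == base for p in perms)   # |A(K_{1,3})| = 3! = 6
copies = {relabeled(p) for p in perms}           # distinct labeled graphs
print(aut, len(copies), math.factorial(len(V)) // aut)   # 6 4 4
```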

In a random graph space $G(n, p)$, we need to consider the number of copies of $F$ in a labeled complete graph.

Corollary 1.1 Let $F$ be a graph of order $k$. Then the number of copies of $F$ in a labeled complete graph of order $k$ is $k!/|A(F)|$.

Let $F$ be a graph of order $k$. Let $S \subseteq [n]$ with $|S| = k$, and let $X_S$ be the number of copies of $F$ on $S$. Then $X = \sum_S X_S$ is the number of copies of $F$ in $G_p$. We have
$$E(X_S) = \frac{k!}{|A(F)|}\, p^{e(F)},$$

and
$$E(X) = \binom{n}{k}\frac{k!}{|A(F)|}\, p^{e(F)} = \frac{(n)_k}{|A(F)|}\, p^{e(F)},$$
where $(n)_k = n(n-1)\cdots(n-k+1)$ is the falling factorial.

Similar formulas hold for the number of induced subgraphs. Let $Y$ be the number of induced copies of $F$ in $G_p$. Then
$$E(Y) = \frac{(n)_k}{|A(F)|}\, p^{e(F)} q^{\binom{k}{2}-e(F)}.$$
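As a numeric sanity check (ours, not from the text): for $F = K_3$ in $G(10, 1/2)$ the formula gives $E(X) = (n)_3/|A(K_3)| \cdot p^3 = \binom{10}{3}(1/2)^3 = 15$; a quick Monte Carlo agrees.

```python
import itertools, random

# Monte Carlo check of E(X) = (n)_k / |A(F)| * p^{e(F)} for F = K3,
# where |A(K3)| = 6 and e(K3) = 3 (illustrative sketch).
n, p = 10, 0.5
expected = (n * (n - 1) * (n - 2) / 6) * p ** 3     # = 15.0
rng = random.Random(1)

def triangles():
    adj = {e for e in itertools.combinations(range(n), 2) if rng.random() < p}
    return sum({(a, b), (a, c), (b, c)} <= adj
               for a, b, c in itertools.combinations(range(n), 3))

trials = 2000
print(expected, sum(triangles() for _ in range(trials)) / trials)
```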

Recall that $A_S$ signifies the event that the subgraph induced by $S$ in $G_p$ contains $F$ as a subgraph. Then
$$\Pr(A_S) \le \frac{k!}{|A(F)|}\, p^{e(F)}.$$
Hence
$$\Pr(F \subset G_p) = \Pr(\cup A_S) \le \binom{n}{k}\frac{k!}{|A(F)|}\, p^{e(F)} = \frac{(n)_k}{|A(F)|}\, p^{e(F)}, \qquad (1.1)$$
where the upper bound is exactly $E(X)$. This can also be seen from the fact that $X$ takes only nonnegative integer values and
$$\Pr(\cup A_S) = \Pr(X \ge 1) = \sum_{i\ge1} \Pr(X = i) \le \sum_{i\ge1} i\Pr(X = i) = E(X).$$

It seems necessary to point out that $F$ is not a random element of $G(n, p)$: the above discussion concerns the appearance of $F$ as a subgraph.

It is worth remarking that $p = p(n)$ is often a function of $n$. The space $G(n, p)$ is of great interest for fixed values of $p$ as well; in particular, $G(n, 1/2)$ can be viewed as the uniform space: it consists of all $2^m$ graphs on $V = [n]$, each occurring with the same probability. This is just a classical probability space. Thus $G_{n,1/2}$ is also obtained by picking any of the $2^m$ graphs on $V = [n]$ at random with probability $2^{-m}$. Whether $p$ is fixed or a function of $n$, we tend to be interested in what happens as $n \to \infty$.


Now that we have obtained a space of random graphs, every graph invariant becomes a random variable; the nature of such a random variable depends crucially on $p$. For instance, the number $X_k(G)$ of complete subgraphs of order $k$ in $G$ is a random variable on our space of random graphs.

To be proficient in the probabilistic method one must have a feeling for asymptotic calculation. For convenience, we state some simple inequalities that will be used in the calculations. The following precise formula is Stirling's formula.

Lemma 1.1 For all $n \ge 1$,
$$n! = \sqrt{2\pi n}\left(\frac{n}{e}\right)^n e^{\theta/(12n)},$$
where $0 < \theta = \theta_n < 1$. Thus
$$\sqrt{2\pi n}\left(\frac{n}{e}\right)^n < n! < \sqrt{2\pi n}\left(\frac{n}{e}\right)^n e^{1/(12n)}.$$

Lemma 1.2 For any positive integers $N \ge n$,
$$\left(\frac{N}{n}\right)^n \le \binom{N}{n} \le \left(\frac{eN}{n}\right)^n.$$
If $n = o(\sqrt{N})$ as $n \to \infty$, then $\binom{N}{n} \sim \dfrac{N^n}{n!}$.

Proof. The first inequality comes from $\frac{N}{n} \le \frac{N-i}{n-i}$ for $0 \le i < n$, and the second from $\binom{N}{n} \le \frac{N^n}{n!}$ together with Stirling's formula. For the asymptotic statement it suffices to see that $(N)_n/N^n$ equals
$$\exp\left[\sum_{i=1}^{n-1}\log\left(1 - \frac{i}{N}\right)\right] = \exp\left[-\sum_{i=1}^{n-1}\frac{i}{N} - O\left(\frac{n^3}{N^2}\right)\right] \to 1,$$
and the desired asymptotic formula follows. □

The following simple fact from calculus is often used.

Lemma 1.3 For any $0 \le x \le 1$ and $n \ge 0$,
$$(1-x)^n \le e^{-nx}.$$
If $x = x(n) \to 0$ and $x^2 n \to 0$ as $n \to \infty$, then
$$(1-x)^n \sim e^{-nx}.$$
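These estimates are easy to test numerically. The following sketch (our addition, under the stated ranges) checks the Stirling sandwich of Lemma 1.1 and the asymptotics of Lemmas 1.2 and 1.3.

```python
import math

# Numeric check of Lemmas 1.1-1.3 (illustrative, not from the text).
for n in (5, 10, 50):
    lo = math.sqrt(2 * math.pi * n) * (n / math.e) ** n
    assert lo < math.factorial(n) < lo * math.exp(1 / (12 * n))

N, n = 10**4, 20                      # n = o(sqrt(N)) regime of Lemma 1.2
b = math.comb(N, n)
assert (N / n) ** n <= b <= (math.e * N / n) ** n
print(b * math.factorial(n) / N**n)   # close to 1, as the lemma predicts

x, n = 0.001, 1000                    # x -> 0 with x^2 n -> 0 in Lemma 1.3
print((1 - x) ** n, math.exp(-n * x))  # nearly equal
```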

We obtained an upper bound for $r(n, n)$ in Chapter 1. This and Stirling's formula give
$$r(n, n) \le \binom{2n-2}{n-1} = (1 + o(1))\,\frac{4^{n-1}}{\sqrt{\pi n}}.$$
Thomason (1988) slightly improved the upper bound to
$$\frac{e^{c\sqrt{\log n}}}{\sqrt{n}}\binom{2n-2}{n-1},$$
where $e^{c\sqrt{\log n}} = o(n^{\epsilon})$ for any $\epsilon > 0$. More recently, Conlon improved the upper bound to $n^{-c\log n/\log\log n}\binom{2n-2}{n-1}$. However, this does not change the limit
$$\limsup_{n\to\infty} r(n, n)^{1/n} \le 4.$$
No comparable improvement is known on the lower-bound side.

1.2 Elementary examples

This section is devoted to methodology: we use the basic probabilistic method to estimate $r(m, n)$. All lower bounds in this section have since been improved. These bounds come from an "almost all" argument; that is to say, the probability that a graph in the corresponding space yields such a bound tends to 1. This argument is often combined with the basic method, in which we use expectations of random variables. These graphs imply the bounds directly or indirectly (after some vertices are deleted).

In the original 1947 proof of the exponential lower bound for $r(n, n)$, Erdős did not use formal probabilistic language, so his paper is considered an informal starting point of random graphs. But in two papers published in 1959 and 1961, in which he gave a lower bound $c(n/\log n)^2$ for $r(3, n)$, he even put probability in the titles.

Theorem 1.2 For $n \ge 1$,
$$r(n, n) > \frac{n}{e\sqrt{2}}\, 2^{n/2}.$$

Proof. Consider the random graphs in $G(N, 1/2)$; equivalently, color $K_N$ randomly and independently with probability $p = 1/2$, where $N$ is a positive integer to be chosen. Let $S$ be a set of $n$ vertices and let $A_S$ be the event that $S$ is monochromatic. Then
$$\Pr[A_S] = 2\left(\frac{1}{2}\right)^{\binom{n}{2}} = 2^{1-\binom{n}{2}},$$
since for $A_S$ to hold all $\binom{n}{2}$ edges must be colored the same. Consider the event $\cup A_S$ over all $n$-sets on $[N]$. We use the simple fact that the probability of a disjunction is at most the sum of the probabilities of the events. Thus
$$\Pr[\cup A_S] \le \sum \Pr[A_S] = \binom{N}{n}\, 2^{1-\binom{n}{2}}.$$
If this probability is less than one, then the event $B = \cap\, \overline{A_S}$ has positive probability; therefore $B$ is not the null event. Thus there is a point in the probability space for which $B$ holds. But a point in the probability space is precisely a coloring of the edges of $K_N$, and the event $B$ is precisely that under this coloring there is no monochromatic $K_n$. Hence $r(n, n) > N$.

We need to find the maximum possible $N$ such that $\Pr[\cup A_S] < 1$. From Stirling's formula, we have
$$\binom{N}{n}\, 2^{1-\binom{n}{2}} \le \frac{N^n}{n!}\, 2^{1-\binom{n}{2}} < \frac{2}{\sqrt{2\pi n}}\left(\frac{e\sqrt{2}\,N}{n\,2^{n/2}}\right)^n.$$
Setting $N = \left\lfloor \frac{n}{e\sqrt{2}}\,2^{n/2} \right\rfloor$ makes the fraction in parentheses at most one, so the whole bound is less than one. Therefore $r(n, n) \ge N + 1 > \frac{n}{e\sqrt{2}}\,2^{n/2}$. □
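For a concrete feel (our sketch, taking $n = 20$), one can evaluate the union-bound quantity directly and watch it fall below one.

```python
import math

# Evaluate the union bound of Theorem 1.2 at n = 20 (illustration).
n = 20
N = math.floor(n / (math.e * math.sqrt(2)) * 2 ** (n / 2))
bound = math.comb(N, n) * 2 ** (1 - math.comb(n, 2))
print(N, bound)          # bound < 1, hence r(20, 20) > N
```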

Erdős's original proof used the following counting argument. Fix a set $S \subseteq [N]$ with $|S| = n$. Among all graphs on $[N]$, the proportion of graphs in which $S$ is a clique or an independent set is $2^{1-\binom{n}{2}}$, which can be seen by noting that each edge inside $S$ has two possibilities: to appear or not. Since the number of sets $S$ is $\binom{N}{n}$, if
$$\binom{N}{n}\, 2^{1-\binom{n}{2}} < 1,$$
then there is a graph of order $N$ that contains neither $K_n$ nor an independent set of order $n$, and thus $r(n, n) > N$.

Erdős in fact used the space $G(N, 1/2)$, which, as mentioned, is a classical probability space. It is interesting that this is the only space in which the counting argument works!

Theorem 1.3 Let $m, n$ and $N$ be positive integers. If for some $0 < p < 1$,
$$\binom{N}{m} p^{\binom{m}{2}} + \binom{N}{n} (1-p)^{\binom{n}{2}} < 1,$$
then $r(m, n) > N$. Hence $r(m, n) \ge c\left(\frac{n}{\log n}\right)^{(m-1)/2}$, where $c = c(m) > 0$ is a constant.

Proof. Consider random graphs $G_p$ in $G(N, p)$. Let $S$ be a set of $m$ vertices and let $A_S$ be the event that $S$ induces a complete graph. Then $\Pr[A_S] = p^{\binom{m}{2}}$, and
$$\Pr[\cup A_S] \le \sum \Pr[A_S] = \binom{N}{m} p^{\binom{m}{2}}.$$
Let $T$ be a set of $n$ vertices and let $B_T$ be the event that $T$ induces an independent set. Then
$$\Pr[\cup B_T] \le \sum \Pr[B_T] = \binom{N}{n} (1-p)^{\binom{n}{2}}.$$
Thus
$$\Pr[(\cup A_S) \cup (\cup B_T)] < 1.$$
So there exists a graph on $N$ vertices that contains neither a $K_m$ nor an independent set of $n$ vertices, and thus $r(m, n) > N$.
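The condition of Theorem 1.3 can also be explored numerically. The grid search below (our illustration; the cutoff 200 and the $p$-grid are arbitrary choices) finds the largest small $N$ certified for $r(4, 30)$.

```python
import math

# Illustrative search (ours): the largest N <= 200 for which some p
# satisfies the condition of Theorem 1.3 with m = 4, n = 30.
def cond(N, m, n, p):
    return (math.comb(N, m) * p ** math.comb(m, 2)
            + math.comb(N, n) * (1 - p) ** math.comb(n, 2)) < 1

m, n = 4, 30
best = max(N for N in range(n, 201)
           if any(cond(N, m, n, k / 100) for k in range(1, 100)))
print(best)              # this N certifies r(4, 30) > N
```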

The above result is ineffective in bounding $r(3, n)$. We now examine the lower bound of $r(4, n)$. We shall give the details of the calculation, choosing a suitable value of $p$, and $N$ as large as possible for large $n$. To satisfy the condition in Theorem 1.3, we roughly estimate $\binom{N}{n}$ by $(eN/n)^n$ and $(1-p)^{\binom{n}{2}}$ by $e^{-p\binom{n}{2}}$, hence $\binom{N}{n}(1-p)^{\binom{n}{2}}$ by
$$\left(\frac{eN}{n}\right)^n \exp\left\{-p\binom{n}{2}\right\} = \left(\frac{eN}{n\,e^{p(n-1)/2}}\right)^n.$$

We know from the last chapter that $r(4, n) \le (1+o(1))\,n^3/(\log n)^2$; thus $e^{p(n-1)/2} = n^{a+o(1)}$ for some constant $a$, so we take $p = (c_1\log n)/(n-1)$. Then
$$\binom{N}{4} p^6 \sim \frac{1}{24}\, N^4 \left(\frac{c_1\log n}{n}\right)^6 \sim \frac{c_1^6}{24}\left(N\left(\frac{\log n}{n}\right)^{3/2}\right)^4 < 1,$$

so $N \sim c_2(n/\log n)^{3/2}$ for some constant $c_2$.

Formally, we let $p = (c_1\log n)/(n-1)$ and $N = \lfloor c_2(n/\log n)^{3/2} \rfloor$, where $c_1$ and $c_2$ are positive constants chosen so that $c_1^6 c_2^4 < 24$. Then
$$\binom{N}{4} p^6 < \frac{N^4}{24}\, p^6 \le \frac{c_1^6 c_2^4}{24}\left(\frac{n}{n-1}\right)^6 \le c_3 < 1$$

for large $n$, where $c_3$ is a constant. For the second term, we estimate $(1-p)^{\binom{n}{2}} < e^{-pn(n-1)/2} = n^{-c_1 n/2}$ and
$$\binom{N}{n}(1-p)^{\binom{n}{2}} < \left(\frac{eN}{n}\right)^n n^{-c_1 n/2} = \left(\frac{eN}{n^{1+c_1/2}}\right)^n.$$

In order to make the above tend to zero, we have to take $c_1 \ge 1$. On the other hand, in order to take $c_2$ as large as possible with $c_1^6 c_2^4 < 24$, we have to take $c_1$ as small as possible. So it has to be $c_1 = 1$. Now we may hope to optimize the constant $c_2$: since we need only $c_2 < 24^{1/4}$, any $c_2 = 24^{1/4} - \epsilon$ will do. Thus we have
$$r(4, n) \ge (24^{1/4} - o(1))\left(\frac{n}{\log n}\right)^{3/2}.$$

For general $m \ge 4$, taking $p = (m-3)\log n/(n-1)$, a similar calculation yields
$$r(m, n) \ge c\left(\frac{n}{\log n}\right)^{(m-1)/2}. \qquad \Box$$

Hereafter we will choose $p$ with some foresight. It is often convenient to replace $\log n/(n-1)$ in the expression for $p$ by $\log n/n$. Strictly speaking, $e^{f(n)} \sim e^{g(n)}$ only if $f(n) - g(n) \to 0$; but the replacement works. Using $p = \log n/n$ in the above calculation for the lower bound of $r(4, n)$,
$$\binom{N}{n}(1-p)^{\binom{n}{2}} < \left(\frac{eN}{n}\right)^n n^{-(n-1)/2} = \left(\frac{eN}{n^{3/2}}\right)^n n^{1/2} \to 0.$$
Also, when $N = \lfloor f(n) \rfloor$ or $N = \lceil f(n) \rceil$ with $f(n) \to \infty$, we often ignore the fact that $f(n)$ may not be an integer by simply taking $N = f(n)$, provided this does not affect the proof essentially.

We have seen that the properties of $G_p$ are sensitive to the value of $p$. To ensure that $G_p$ contains no $K_m$ (with positive probability; more precisely, that $\binom{N}{m}p^{\binom{m}{2}}$ is small), it is better to take $p$ small. But it is better to take $p$ large to ensure that there is no independent set of size $n$ (i.e., that $\binom{N}{n}(1-p)^{\binom{n}{2}}$ is small). Our task is to balance the two sides to obtain $N$ as large as possible.

We shall improve the obtained lower bounds for $r(n, n)$ and $r(m, n)$ by proofs using the so-called deletion method.

Theorem 1.4 As $n \to \infty$,
$$r(n, n) \ge (1 - o(1))\,\frac{n}{e}\, 2^{n/2}.$$

Proof. Consider the random graphs in $G(N, 1/2)$. Let $X$ be the number of cliques or independent sets of size $n$. Then
$$X = \sum X_S,$$
the sum over all $n$-sets $S$, where $X_S$ is the indicator random variable of the event $A_S$ that $S$ is a clique or an independent set; that is,
$$X_S = \begin{cases} 1 & \text{if } S \text{ induces } K_n \text{ or } \overline{K_n}, \\ 0 & \text{otherwise.} \end{cases}$$
Therefore
$$E[X_S] = \Pr[A_S] = 2\left(\frac{1}{2}\right)^{\binom{n}{2}}.$$

By linearity of expectation,
$$E[X] = \sum E[X_S] = \binom{N}{n}\, 2^{1-\binom{n}{2}}.$$
There is a point in the probability space for which $X$ does not exceed its expectation; that is, there is a graph with at most $E(X)$ sets $S$ that induce a $K_n$ or a $\overline{K_n}$. Fix that graph. For each such $S$ select a vertex $x \in S$ and delete it from the vertex set. The remaining vertex set $V^*$ spans neither $K_n$ nor $\overline{K_n}$. Thus
$$r(n, n) > |V^*| \ge N - E(X).$$

The rest of the proof is to find $N$ making $|V^*|$ as large as possible. Taking $N = \left\lfloor \frac{n}{e}\,2^{n/2} \right\rfloor$, Stirling's formula gives
$$\binom{N}{n}\, 2^{1-\binom{n}{2}} < \left(\frac{eN}{n}\right)^n 2^{1-\binom{n}{2}} < 2\left(\frac{e\sqrt{2}\,N}{n\,2^{n/2}}\right)^n \le 2^{n/2+1},$$
which is $o(N)$. Thus $r(n, n) \ge (1 - o(1))N$. □
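A small computation (ours, at $n = 20$) shows the gain over Theorem 1.2: the deletion method keeps $N - E(X)$ vertices, which already exceeds the earlier bound.

```python
import math

# Sketch comparing the two lower bounds at n = 20 (illustration):
# Theorem 1.2 gives N0; the deletion method keeps N - E(X) vertices.
n = 20
N0 = math.floor(n / (math.e * math.sqrt(2)) * 2 ** (n / 2))
N = math.floor(n * 2 ** (n / 2) / math.e)
EX = math.comb(N, n) * 2 ** (1 - math.comb(n, 2))
print(N0, N, N - math.ceil(EX))   # deletion wins: N - E(X) > N0
```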

Theorem 1.5 For any positive integers $m, n$ and $N$, and any real number $0 < p < 1$,
$$r(m, n) > N - \binom{N}{m} p^{\binom{m}{2}} - \binom{N}{n} (1-p)^{\binom{n}{2}}.$$
Thus for fixed $m \ge 3$, if $n$ is large, then
$$r(m, n) \ge c\left(\frac{n}{\log n}\right)^{m/2},$$
where $c = c(m) > 0$ is a constant.

where c = c(m) > 0 is a constant.

Proof. Note that in $G_p \in G(N, p)$ the expected number of cliques of size $m$ is $\binom{N}{m}p^{\binom{m}{2}}$, and the expected number of independent sets of size $n$ is $\binom{N}{n}(1-p)^{\binom{n}{2}}$. The same argument as in the proof of Theorem 1.4 gives the first lower bound. For the second, we set $N = a(n/\log n)^{m/2}$ and $p = (m-2)\log n/(n-1)$ such that
$$a - \frac{(m-2)^{\binom{m}{2}}}{m!}\, a^m > 0.$$
This is possible since $a^m = o(a)$ as $a \to 0^+$. Then
$$N_1 = \binom{N}{m} p^{\binom{m}{2}} \sim \frac{(m-2)^{\binom{m}{2}} a^m}{m!}\left(\frac{n}{\log n}\right)^{m/2},$$
and
$$N_2 = \binom{N}{n}(1-p)^{\binom{n}{2}} < \left(\frac{eN}{n}\right)^n e^{-pn(n-1)/2} = \left(\frac{eN}{n^{m/2}}\right)^n \to 0.$$
So if $c < a - \frac{(m-2)^{\binom{m}{2}} a^m}{m!}$, then
$$r(m, n) \ge N - N_1 - N_2 > c\left(\frac{n}{\log n}\right)^{m/2},$$
completing the proof. □

Let $f(n)$ and $g(n)$ be positive functions. Clearly, in order to show $f(n) \ge c\,g(n)$ for all positive $n$, we may assume in the proof that $n$ is sufficiently large. Theorem 1.5 can be generalized as follows.

Theorem 1.6 Let $F$ be a graph with $m \ge 3$ vertices and $e(F) \ge m$ edges. Then
$$r(F, K_n) \ge c\left(\frac{n}{\log n}\right)^{e(F)/(m-1)}$$
for all $n \ge 1$, where $c = c(F) > 0$ is a constant.

The above theorem and the linear formula for $r(T_m, K_n)$ in Chapter 2 give the following conclusion.

Corollary 1.2 For a fixed connected graph $G$, $r(G, K_n)$ has a linear upper bound in $n$ if and only if $G$ is a tree.

In Theorem 1.2 we proved that if $N = \frac{n}{e\sqrt{2}}\,2^{n/2}$, then almost all graphs in $G(N, 1/2)$ satisfy $\omega(G_p) < n$ and $\alpha(G_p) < n$. Can we construct one (in fact, one family of graphs)? Unfortunately, nobody can give a constructive proof of a lower bound like $r(n, n) \ge a^n$ for any constant $a > 1$. This reveals that finite discrete structures are much more complicated than they appear. Graphs of large order that we can imagine must have some uniform description, but typical random graphs admit no such description. This is not very unusual in mathematics: most real numbers that we can write down are rational, with eventually repeating digits, but almost all real numbers are irrational.

Let us make a brief remark on the "almost all" arguments in the proofs of several theorems in this section. They are in fact weighted counting arguments. The probability spaces we deal with are finite: a finite set $\Omega$ together with nonnegative weights on the elements that sum to 1. An event $A$ is a subset of $\Omega$, and the probability $\Pr(A)$ is the sum of the weights of the elements of $A$. In particular, in the space $G(N, p)$, the graphs with around $p\binom{N}{2}$ edges have larger weights. For example, we can say that almost all graphs in $G(N, 1/N^3)$ are empty, as the probability of any non-empty graph is negligible.

We have seen that
$$\sqrt{2} \le \liminf_{n\to\infty} r(n, n)^{1/n} \le \limsup_{n\to\infty} r(n, n)^{1/n} \le 4.$$
There is a big gap between the two bounds. It is likely that the limit of $r(n, n)^{1/n}$ is in fact 2, but both the lower bound $\sqrt{2}$ and the upper bound 4 have stood for more than half a century without improvement.

Problem 1.1 Prove or disprove that $\lim_{n\to\infty} r(n, n)^{1/n}$ exists. Is the limit 2 if it exists?

1.3 References

N. Alon and J. Spencer, The Probabilistic Method, 3rd ed., Wiley-Interscience, New York, 2008.

B. Bollobás, Random Graphs, 2nd ed., Cambridge University Press, Cambridge, 2001.

D. Conlon, A new upper bound for diagonal Ramsey numbers, Ann. of Math., 170 (2009), 941-960.

P. Erdős, Some remarks on the theory of graphs, Bull. Amer. Math. Soc., 53 (1947), 292-294.

P. Erdős, Graph theory and probability, Canad. J. Math., 11 (1959), 34-38.

P. Erdős, Graph theory and probability II, Canad. J. Math., 13 (1961), 346-352.

P. Erdős and A. Rényi, On the evolution of random graphs, Publ. Math. Inst. Hungar. Acad. Sci., 5 (1960), 17-61.

S. Janson, T. Łuczak, and A. Ruciński, Random Graphs, Wiley-Interscience, New York, 2000.

M. Molloy and B. Reed, Graph Colouring and the Probabilistic Method, Springer, Berlin, 2002.


Chapter 2

Concentration

The probability spaces we consider in graph Ramsey theory have only finitely many possible outcomes, and the random variables are often nonnegative. Markov's inequality gives an upper bound for the probability that such a random variable is greater than or equal to some positive constant. It is named after the Russian mathematician Andrey Markov, although it appeared earlier in the work of Pafnuty Chebyshev (Markov's teacher). Chebyshev's inequality guarantees that in any data sample or probability distribution, "nearly all" values are close to the mean. These inequalities have great utility because they can be applied to completely arbitrary distributions (unknown except for mean and variance). In probability theory, the Chernoff bound, named after Herman Chernoff, gives exponentially decreasing bounds on tail distributions of sums of independent random variables. It is sharper than the first- or second-moment tail bounds such as Markov's inequality or Chebyshev's inequality, which only yield power-law bounds on tail decay; on the other hand, it requires that the variates be independent, a condition that neither Markov's nor Chebyshev's inequality requires. This chapter contains various Chernoff inequalities with detailed proofs, and some of their applications, particularly in Ramsey theory. The last two sections are on martingales and on concentration of parameters of dense random graphs; beginning readers may skip them.


2.1 The Chernoff inequality

Let $X$ be a discrete random variable. Then the expected value of $X$ is defined to be $E(X) = \sum_i a_i \Pr(X = a_i)$, where the summation is taken over all values $a_i$ that $X$ can take.

Theorem 2.1 (Markov's Inequality) Let $a > 0$ and let $X$ be a nonnegative random variable. Then
$$\Pr(X \ge a) \le \frac{E(X)}{a}.$$

Proof. Suppose that $\{a_i\}$ is the set of all values that $X$ takes. Then
$$E(X) = \sum_i a_i \Pr(X = a_i) \ge \sum_{a_i \ge a} a_i \Pr(X = a_i) \ge a\sum_{a_i \ge a} \Pr(X = a_i) = a\Pr(X \ge a),$$
as required. □

The following is exactly what we used to obtain lower bounds for Ramsey numbers in the last chapter.

Corollary 2.1 If a random variable $X$ takes only nonnegative integer values and $E(X) < 1$, then $\Pr(X \ge 1) \le E(X) < 1$, hence $\Pr(X = 0) > 0$.

For a positive integer $k$, the $k$th moment of a real-valued random variable $X$ is defined to be $E(X^k)$; the first moment is simply the expected value. Write $\mu = E(X)$, and define the variance of $X$ as $E((X-\mu)^2)$, denoted by $\sigma^2$. Call
$$\sigma = \sqrt{E((X-\mu)^2)}$$
the standard deviation of $X$. A basic identity is
$$\sigma^2 = E(X^2) - \mu^2.$$

Theorem 2.2 (Chebyshev's Inequality) Let $X$ be a random variable and let $a$ be a positive number. Then
$$\Pr(|X - \mu| \ge a) \le \frac{\sigma^2}{a^2}.$$

Proof. By Markov's inequality, for any $a > 0$,
$$\sigma^2 = E((X-\mu)^2) \ge a^2 \Pr((X-\mu)^2 \ge a^2) = a^2 \Pr(|X - \mu| \ge a).$$
The required statement follows. □

In importance, the second moment $E(X^2)$ is second only to the first moment $E(X)$.

Lemma 2.1 (Second Moment Method) If $X$ is a random variable, then
$$\Pr(X = 0) \le \frac{\sigma^2}{\mu^2} = \frac{E(X^2) - \mu^2}{\mu^2},$$
where $\mu = E(X)$. In particular, $\Pr(X = 0) \to 0$ if $E(X^2)/\mu^2 \to 1$.

The proof follows immediately from Chebyshev's inequality and the trivial fact that $\Pr(X = 0) \le \Pr(|X - \mu| \ge \mu)$. Intuitively, if $\sigma$ grows more slowly than $\mu$, then $\Pr(X = 0) \to 0$, since $\sigma$ "pulls" $X$ close to $\mu$ and thus far away from zero.

Chebyshev's inequality is in fact Markov's inequality applied to the random variable $(X-\mu)^2$; it states that the probability that a random variable $X$ is far from $E(X)$ is bounded. When such a bound holds, we say that $X$ is concentrated. A concentration bound is used to show that a random variable is very close to its expected value with high probability, so that it behaves approximately as one may "expect" it to. When $S_n$ is the sum of $n$ independent variables, each equal to 1 with probability $p$ and $-1$ with probability $1-p$, the bound can be sharpened. Sums of such random variables are handled by Chernoff's inequality. Most of the results in this chapter may be found in, or immediately derived from, the seminal paper of Chernoff (1952), though our proofs are self-contained. A set of random variables $X_1, X_2, \dots$ is said to be mutually independent if each $X_i$ is independent of any Boolean expression formed from the other $X_j$'s. In any form of the Chernoff bounds, we make the following assumption.

Assumption A (on the independence of the variables in the Chernoff bound, or Chernoff inequality). Let $X_1, X_2, \dots$ be mutually independent, identically distributed variables. Set
$$S_n = \sum_{i=1}^n X_i.$$

All concentration bounds in the remainder of this section are Chernoff bounds of different forms, which estimate the probability
$$\Pr(S_n \ge n(\mu + \delta)) = \Pr\left(\frac{S_n}{n} \ge \mu + \delta\right),$$
where $\mu = E(X_i)$. The symmetric bound on $\Pr(S_n \le n(\mu - \delta))$ can be obtained similarly.

Theorem 2.3 Under Assumption A, suppose
$$\Pr(X_i = 1) = \Pr(X_i = -1) = \frac{1}{2}$$
for $i = 1, 2, \dots$. Then for any $\delta > 0$,
$$\Pr(S_n \ge n\delta) < \exp\{-n\delta^2/2\},$$
and for any $a > 0$,
$$\Pr(S_n \ge a) < \exp\{-a^2/(2n)\}.$$

Proof. Let $\lambda > 0$ be arbitrary. Then
$$E(e^{\lambda X_i}) = \frac{e^{\lambda} + e^{-\lambda}}{2}.$$
Note that
$$E(e^{\lambda S_n}) = E(e^{\lambda X_1})E(e^{\lambda X_2})\cdots E(e^{\lambda X_n}) = \left(\frac{e^{\lambda} + e^{-\lambda}}{2}\right)^n = \left(\sum_{j=0}^{\infty}\frac{\lambda^{2j}}{(2j)!}\right)^n < \left(\sum_{j=0}^{\infty}\frac{1}{j!}\left(\frac{\lambda^2}{2}\right)^j\right)^n = e^{n\lambda^2/2},$$

where we use the fact that $(2j)! \ge 2^j j!$ for all $j \ge 0$, with strict inequality when $j \ge 2$. Now by Markov's inequality,
$$\Pr(S_n \ge n\delta) = \Pr(e^{\lambda S_n} \ge e^{\lambda n\delta}) \le \frac{E(e^{\lambda S_n})}{e^{\lambda n\delta}} < \exp\{n(\lambda^2/2 - \lambda\delta)\}$$
for all $\lambda > 0$. Setting $\lambda = \delta$, we obtain the desired result. □
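The bound is easy to test empirically. The sketch below (ours; the parameter values are arbitrary) compares the simulated tail of a sum of $\pm 1$ variables with $\exp\{-n\delta^2/2\}$.

```python
import math, random

# Empirical check of Theorem 2.3 (our sketch): the tail of a sum of n
# independent +1/-1 variables stays below the bound exp(-n delta^2 / 2).
rng = random.Random(2)
n, delta, trials = 200, 0.2, 20_000
hits = sum(sum(rng.choice((-1, 1)) for _ in range(n)) >= n * delta
           for _ in range(trials))
print(hits / trials, math.exp(-n * delta ** 2 / 2))   # empirical <= bound
```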

For large $n$, the central limit theorem implies that $S_n$ is approximately normal with zero mean and standard deviation $\sqrt{n}$. For any fixed $u$,
$$\lim_{n\to\infty} \Pr(S_n \ge u\sqrt{n}) = \int_u^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt < e^{-u^2/2}.$$
However, the Chernoff bound holds for all positive $n$ and $a$. Since $X_i$ is often an indicator variable of some random event, taking the value 1 when the event occurs and 0 otherwise, the following form of the Chernoff bound can be used in more cases.

Theorem 2.4 Under Assumption A, suppose
$$\Pr(X_i = 1) = \Pr(X_i = 0) = \frac{1}{2}$$
for $i = 1, 2, \dots$. Then for any $\delta > 0$,
$$\Pr(S_n \ge n(1+\delta)/2) < \exp\{-n\delta^2/2\};$$
namely,
$$\Pr(S_n \ge n(1/2 + \delta)) < \exp\{-2n\delta^2\}.$$

Proof. Set $Y_i = 2X_i - 1$ and $T_n = \sum_{i=1}^n Y_i = 2S_n - n$. Then
$$\Pr(Y_i = 1) = \Pr(Y_i = -1) = \frac{1}{2},$$
and the $Y_i$ satisfy Assumption A. Note that $T_n \ge n\delta$ if and only if $S_n \ge n(1+\delta)/2$. Applying Theorem 2.3 to $Y_i$ and $T_n$, we have
$$\Pr(S_n \ge n(1+\delta)/2) = \Pr(T_n \ge n\delta) < \exp\{-n\delta^2/2\}$$

as claimed. □

Under Assumption A, suppose
$$\Pr(X_i = 1) = p \quad\text{and}\quad \Pr(X_i = 0) = 1 - p$$
for $i = 1, 2, \dots$. Then we say that the sum $S_n = \sum_{i=1}^n X_i$ has the binomial distribution, denoted $B(n, p)$. Theorem 2.4 concerns the special binomial distribution $B(n, 1/2)$. For the general case the calculation is slightly more complicated, but the technique is the same. As usual, write $q$ for $1 - p$.

Theorem 2.5 Under Assumption A, suppose
$$\Pr(X_i = 1) = p \quad\text{and}\quad \Pr(X_i = 0) = q$$
for $i = 1, 2, \dots$. Then there exists $\delta_0 = \delta_0(p) > 0$ such that if $0 < \delta < \delta_0$, then
$$\Pr(S_n \ge n(p + \delta)) < \exp\{-n\delta^2/(3pq)\}.$$

Proof. Let $a = p + \delta$. By the same argument as used before,
$$\Pr(S_n \ge na) = \Pr(e^{\lambda S_n} \ge e^{\lambda na}) \le \frac{1}{e^{\lambda na}}\, E(e^{\lambda S_n}) = \frac{1}{e^{\lambda na}}\,(pe^{\lambda} + q)^n = (pe^{\lambda(1-a)} + qe^{-\lambda a})^n$$
for all $\lambda > 0$. Let $c = 1 - a = q - \delta > 0$, so that $a + c = 1$. Taking $\lambda = \log(aq/(cp))$, we have
$$\min_{\lambda > 0}\,(pe^{\lambda c} + qe^{-\lambda a}) = e^{-\lambda a}(pe^{\lambda} + q) = \left(\frac{cp}{aq}\right)^a \frac{q}{c} = \left(\frac{p}{a}\right)^a \left(\frac{q}{c}\right)^c.$$

Setting $0 < \delta < 1 - p$ and expanding in powers of $\delta$, with the fact that
$$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} + O(x^4),$$
we find
$$\log\left(\frac{p}{a}\right)^a = (p + \delta)\log\left(1 - \frac{\delta}{p+\delta}\right) = -\delta - \frac{\delta^2}{2(p+\delta)} - \frac{\delta^3}{3(p+\delta)^2} + o(\delta^3),$$

and
$$\log\left(\frac{q}{c}\right)^c = (q - \delta)\log\left(1 + \frac{\delta}{q-\delta}\right) = \delta - \frac{\delta^2}{2(q-\delta)} + \frac{\delta^3}{3(q-\delta)^2} + o(\delta^3).$$

Adding these term by term, the first-order terms cancel; the second-order term is
$$-\frac{\delta^2}{2}\left(\frac{1}{p+\delta} + \frac{1}{q-\delta}\right) = -\frac{\delta^2}{2}\left(\frac{1}{p(1+\delta/p)} + \frac{1}{q(1-\delta/q)}\right) = -\frac{\delta^2}{2}\left(\frac{1}{pq} - \frac{(q^2-p^2)\delta}{p^2q^2} + o(\delta)\right) = -\frac{\delta^2}{2pq} + \frac{(q-p)\delta^3}{2p^2q^2} + o(\delta^3),$$
and the third-order term is
$$\frac{\delta^3}{3}\left(\frac{1}{(q-\delta)^2} - \frac{1}{(p+\delta)^2}\right) = \frac{\delta^3}{3}\left(\frac{1}{q^2} - \frac{1}{p^2} + o(1)\right) = -\frac{(q-p)\delta^3}{3p^2q^2} + o(\delta^3).$$

Hence for small $\delta > 0$,
$$\log\left[\left(\frac{p}{a}\right)^a\left(\frac{q}{c}\right)^c\right] = -\frac{\delta^2}{2pq} + \frac{(q-p)\delta^3}{6p^2q^2} + o(\delta^3) < -\frac{\delta^2}{3pq}.$$
Thus
$$\Pr(S_n \ge n(p+\delta)) < \exp\{-n\delta^2/(3pq)\},$$
completing the proof. □

From the above proof for $p > q$, and from Theorem 2.4 for $p = q = 1/2$, we see that if $p \ge 1/2$ the bound can be slightly improved to
$$\Pr(S_n > n(p+\delta)) < \exp\{-n\delta^2/(2pq)\}.$$

We now write out a symmetric form of Theorem 2.5, and omit those of Theorem 2.3 and Theorem 2.4.

Theorem 2.6 Under Assumption A, suppose
$$\Pr(X_i = 1) = p \quad\text{and}\quad \Pr(X_i = 0) = q$$
for $i = 1, 2, \dots$. Then there exists $\delta_0 = \delta_0(p) > 0$ such that if $0 < \delta < \delta_0$, then
$$\Pr(S_n \le n(p - \delta)) < \exp\{-n\delta^2/(3pq)\}.$$
Therefore
$$\Pr(|S_n - np| > n\delta) < 2\exp\{-n\delta^2/(3pq)\}. \qquad \Box$$

From the above proof, we have
$$\Pr(S_n \ge na) \le \left(\left(\frac{p}{a}\right)^a\left(\frac{q}{c}\right)^c\right)^n = \exp\left\{n\left(a\log\frac{p}{a} + (1-a)\log\frac{q}{1-a}\right)\right\},$$
where $a = p + \delta$ and $c = 1 - a$. Setting $k = na$, so that $k > np$,
$$\Pr(S_n \ge k) \le \exp\left\{n\left(\frac{k}{n}\log\frac{p}{k/n} + \left(1 - \frac{k}{n}\right)\log\frac{q}{1 - k/n}\right)\right\}.$$
Let $H(x)$ denote the entropy function
$$H(x) = x\log\frac{p}{x} + (1-x)\log\frac{q}{1-x}, \qquad 0 < x < 1;$$
then
$$\Pr(S_n \ge k) \le \exp\{nH(k/n)\},$$
which is valid also for $k = np$ since $H(p) = 0$. The following form of Chernoff's inequality was used by Beck (1983).

Theorem 2.7 Under Assumption A, suppose
$$\Pr(X_i = 1) = p \quad\text{and}\quad \Pr(X_i = 0) = q$$
for $i = 1, 2, \dots$. If $k \ge np$, then
$$\Pr(S_n \ge k) \le \left(\frac{np}{k}\right)^k\left(\frac{nq}{n-k}\right)^{n-k}.$$
Consequently,
$$\Pr(S_n \ge k) \le \left(\frac{npe}{k}\right)^k.$$

Proof. The right-hand side of the first inequality is just $\exp\{nH(k/n)\}$. For the second inequality, simply note that
$$\left(\frac{nq}{n-k}\right)^{n-k} \le \left(\frac{n}{n-k}\right)^{n-k} = \left(1 + \frac{k}{n-k}\right)^{n-k} < e^k.$$
The required result follows. □
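The two bounds can be compared with the exact binomial tail. The sketch below (ours; the values $n = 100$, $p = 0.3$, $k = 85$ are arbitrary, chosen so that $k \ge npe$ and the second bound is nontrivial) shows exact $\le$ entropy bound $\le$ $(npe/k)^k$.

```python
import math

# Sketch (ours): the entropy bound exp(n H(k/n)) and the weaker (npe/k)^k
# of Theorem 2.7 against the exact tail of S_n ~ B(n, p).
n, p, k = 100, 0.3, 85
H = lambda x: x * math.log(p / x) + (1 - x) * math.log((1 - p) / (1 - x))
exact = sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))
print(exact, math.exp(n * H(k / n)), (n * p * math.e / k) ** k)
```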

2.2 Applications of Chernoff’s bounds

Let us first see that almost all (a.a.) graphs are nearly regular.

Theorem 2.8 Let $0 < p < 1$ and $\epsilon > 0$ be fixed. Then almost all graphs $G$ in $G(n, p)$ satisfy
$$|\deg(v) - (n-1)p| \le \epsilon(n-1)p$$
for every vertex $v$.

Proof. Let $G$ be a random graph in $G(n, p)$ and let $v$ be a fixed vertex of $G$. Then $\deg(v)$ has the binomial distribution $B(n-1, p)$. From Chernoff's bounds (Theorem 2.6 with $\delta = \epsilon p$), we have
$$\Pr\big(|\deg(v) - (n-1)p| > \epsilon(n-1)p\big) < 2\exp\big(-(n-1)\epsilon^2 p/(3q)\big).$$
Hence the probability that there is at least one vertex $v$ with $|\deg(v) - (n-1)p| > \epsilon(n-1)p$ is at most $2n\exp(-(n-1)\epsilon^2 p/(3q))$, which tends to zero as $n \to \infty$. □
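A short simulation (ours; the parameters are arbitrary) illustrates this near-regularity at moderate $n$.

```python
import random

# Simulation sketch of Theorem 2.8 (ours): in G(n, 1/2) every degree lies
# within (1 +/- eps)(n-1)p already for moderate n.
rng = random.Random(3)
n, p, eps = 1000, 0.5, 0.15
deg = [0] * n
for u in range(n):
    for v in range(u + 1, n):
        if rng.random() < p:
            deg[u] += 1
            deg[v] += 1
mu = (n - 1) * p
print(all(abs(d - mu) <= eps * mu for d in deg), min(deg), max(deg))
```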


The condition that $p$ be fixed can be weakened to $p = (\log n/n)\,\omega(n)$ with $\omega(n) \to \infty$; see Alon and Spencer (2008).

Let us enjoy an application of the Chernoff bound in Erdős style, in which the authors disproved a conjecture using almost all graphs.

A suspended path in a graph $G$ is a path $(x_0, x_1, \dots, x_k)$ in which $x_1, \dots, x_{k-1}$ have degree two in $G$. A graph $H$ is a subdivision of $G$ if $H$ is obtained from $G$ by replacing each edge of $G$ with a suspended path; that is to say, $H$ is obtained by adding vertices on the edges of $G$.

An often-used measure of sparseness of graphs is $K_r$-freeness, as we met in Chapter 3. The simplest $K_3$-free graphs are bipartite graphs. However, there are $K_3$-free graphs whose chromatic number is arbitrarily large; see Mycielski's construction (1955) in the exercises. A more general measure of sparseness is to forbid some subdivision. Hajós conjectured that every graph $G$ with $\chi(G) \ge r$ contains a subdivision of $K_r$ as a subgraph. This conjecture is trivial for $r = 2, 3$; it was confirmed by Dirac (1952) for $r = 4$, and it is open for $r = 5, 6$. Catlin (1979) disproved the conjecture for $r \ge 7$ by a constructive proof, but the disproof of Erdős and Fajtlowicz (1981) was more powerful. Let $\gamma(G)$ denote the largest $r$ such that $G$ contains a subdivision of $K_r$ as a subgraph. Hajós's conjecture is equivalent to the assertion that $\gamma(G) \ge \chi(G)$.

Theorem 2.9 Almost all graphs $G = G_p \in G(n, 1/2)$ satisfy
$$\chi(G) \ge \frac{n}{2\log_2 n} \quad\text{and}\quad \gamma(G) \le \sqrt{6n}.$$

Proof. Set $k = \lfloor 2\log_2 n \rfloor$. Since
$$\Pr(\alpha(G) \ge k) \le \binom{n}{k} 2^{-\binom{k}{2}} < \left(\frac{e\sqrt{2}\,n}{k\,2^{k/2}}\right)^k \to 0,$$
and $\alpha(G)\chi(G) \ge n$ for any graph $G$, the first statement follows immediately. Set $r = \lceil \sqrt{6n} \rceil$, so that $n \le r^2/6$. There are
$$\binom{n}{r} \le \left(\frac{en}{r}\right)^r \le \left(\frac{er}{6}\right)^r$$


potential $K_r$ subdivisions, one for each $r$-element subset of $V(G)$. Fix such a subset $X$. Since each subdivided edge has to use a distinct vertex of $V(G) \setminus X$, a subdivision has $\binom{r}{2}$ suspended paths, of which at most $n - r$ have length two or more; these are the "really" subdivided edges. So the subgraph induced by $X$ contains at least
$$\binom{r}{2} - (n - r) \ge \binom{r}{2} + r - \frac{r^2}{6} \ge \frac{2}{3}\binom{r}{2}$$

edges. But the number of edges of the subgraph induced by $X$, denoted $e(X)$, has the binomial distribution $B(N, 1/2)$, where $N = \binom{r}{2}$. From the Chernoff bound of the last section,
$$\Pr\big(e(X) \ge N(1+\delta)/2\big) \le \exp\{-N\delta^2/2\};$$
taking $\delta = 1/3$, so that $\frac{2}{3}\binom{r}{2} = \binom{r}{2}(1+\delta)/2$, we have
$$\Pr\left(e(X) \ge \frac{2}{3}\binom{r}{2}\right) \le \exp\{-N\delta^2/2\} = \exp\left\{-\frac{1}{18}\binom{r}{2}\right\}.$$

Thus we bound the probability that our random graph $G$ contains a subdivision of $K_r$ as follows:
$$\Pr(\gamma(G) \ge r) \le \sum_X \Pr\left(e(X) \ge \frac{2}{3}\binom{r}{2}\right) \le \binom{n}{r}\exp\left\{-\frac{1}{18}\binom{r}{2}\right\} \le \left(\frac{er\, e^{-(r-1)/36}}{6}\right)^r,$$
which tends to zero as $n \to \infty$. □

Since for almost all $G$ in $G(n, 1/2)$,
$$\chi(G) - \gamma(G) \ge \frac{n}{2\log_2 n} - \sqrt{6n} \to \infty$$
as $n \to \infty$, Hajós's conjecture fails badly, and almost all graphs in $G(n, 1/2)$ are counterexamples.

We shall give another application of Chernoff's bounds, to the Ramsey numbers $r_k(K_{m,n})$ and $r(K_{m,n}, K_n)$.

Recall that $r_k(G)$ is the smallest integer $N$ such that any $k$-coloring of the edges of $K_N$ contains a monochromatic $G$. Chung, Erdős, and Graham (1975) proposed the problem of determining $r_k(K_{m,n})$. We now give a lower bound for fixed $k$ and $m$ as $n \to \infty$; in it, $\sqrt{n\log n}$ can be replaced by $\sqrt{n\,\omega(n)}$ for any $\omega(n) \to \infty$.

Theorem 2.10 Let integers $k, m \ge 1$ be fixed. Then there is a constant $C = C(k, m) > 0$ such that
$$r_k(K_{m,n}) \ge k^m n - C\sqrt{n\log n}$$
for all large $n$.

Proof. Set $N = k^m n - C\sqrt{n\log n}$, where $C$ is a constant to be determined. Then
$$n = \left(\frac{1}{k^m} + \frac{C\sqrt{n\log n}}{k^m N}\right)N > (k^{-m} + \delta_n)N = (p + \delta_n)N,$$
where $p = k^{-m}$ and $\delta_n = \frac{C}{2k^{2m}}\sqrt{\frac{\log n}{n}}$. Let us color the edges of $K_{N+m}$ with $k$ colors randomly and independently, so that each edge receives each color with probability $1/k$. Consider a fixed color, say color $A$, and an arbitrary but fixed set $U$ of $m$ vertices. Let $v_1, v_2, \dots, v_N$ be the $N$ vertices outside $U$. For each $j$, define a random variable $X_j$ with $X_j = 1$ if the edges between $v_j$ and $U$ are all of color $A$, and $X_j = 0$ otherwise. Then $\Pr(X_j = 1) = k^{-m} = p$. Set $S_N = \sum_{j=1}^N X_j$. Clearly $S_N$ has the binomial distribution $B(N, p)$, and the event $S_N \ge n$ means that there is a monochromatic $K_{m,n}$ in color $A$ (in which $U$ is the $m$-vertex part). Hence
$$\Pr(\exists\ \text{monochromatic } K_{m,n}) \le k\binom{N+m}{m}\Pr(S_N \ge n).$$
By virtue of the Chernoff bound (Theorem 2.5),
$$\Pr(S_N \ge n) \le \Pr(S_N \ge (p + \delta_n)N) \le \exp\{-N\delta_n^2/(3pq)\}.$$

From the facts that
$$-\frac{N\delta_n^2}{3pq} \sim -\frac{C^2\log n}{12k^m(k^m - 1)}$$
and
$$k\binom{N+m}{m} = O(n^m) = O(\exp\{m\log n\}),$$
we see that the probability that there exists a monochromatic $K_{m,n}$ tends to zero as $N \to \infty$ if $C \ge k^m\sqrt{12m}$. This guarantees the existence of an edge-coloring of $K_{N+m}$ with no monochromatic $K_{m,n}$, implying that $r_k(K_{m,n}) > N + m$ for all large $n$. □

Theorem 2.11 Let the integer $m \ge 2$ be fixed. Then there exists a constant $c = c(m) > 0$ such that
$$r(K_{m,n}, K_n) \ge c\,\frac{n^{m+1}}{(\log n)^m}.$$

Proof. The lower bound is obtained by a simple application of the Chernoff bound (Theorem 2.7). Let
$$N = \left\lfloor \frac{n^{m+1}}{3(2m\log n)^m} \right\rfloor \quad\text{and}\quad p = \frac{2m\log n}{n}.$$
The probability that $m$ chosen vertices of $G(N, p)$ are all joined to another fixed vertex is $p^m$. So the probability that they have at least $n$ common neighbors is $\Pr(S \ge n)$, where $S$ has the binomial distribution $B(N-m, p^m)$. Then $n > Np^m$ and Theorem 2.7 yields
$$\Pr(K_{m,n} \subseteq G(N, p)) \le \binom{N}{m}\left(\frac{(N-m)p^m e}{n}\right)^n < \frac{N^m}{m!}\left(\frac{Np^m e}{n}\right)^n < c_1\,\frac{n^{m(m+1)}}{(\log n)^{m^2}}\left(\frac{e}{3}\right)^n,$$
where $c_1 = c_1(m) > 0$ is a constant. Hence $\Pr(K_{m,n} \subseteq G(N, p)) \to 0$. At the same time, by the standard estimates $\binom{N}{n} \le (Ne/n)^n$ and $1 - p < e^{-p}$, we obtain a bound on the probability that $G(N, p)$ has an independent set of size at least $n$ as follows:

$$\Pr(\alpha(G(N, p)) \ge n) \le \binom{N}{n}(1-p)^{n(n-1)/2} \le \left(\frac{Ne}{n}\, e^{-p(n-1)/2}\right)^n \le \left(\frac{c_2}{3(2m\log n)^m}\right)^n,$$
where $c_2 = c_2(m) > 0$ is a constant, so $\Pr(\alpha(G(N, p)) \ge n) \to 0$. Hence the probability that $G(N, p)$ contains neither $K_{m,n}$ as a subgraph nor an independent set of size $n$ is positive (in fact, close to 1). Thus $r(K_{m,n}, K_n) > N$. □

In Chapter 3 we proved that for any fixed $m \ge 1$,
$$r(K_m + \overline{K_n},\, K_n) \le (1 + o(1))\,\frac{n^{m+1}}{(\log n)^{m-1}}.$$
So the upper bound and the lower bound are just a $\log n$ factor apart. In Chapter 8 we shall show that the obtained lower bound gives the right order of $r(K_{m,n}, K_n)$.

2.3 Martingales on random graphs ⋆

Most parameters of a random graph are concentrated around their expectations. To describe such phenomena, the martingale is a powerful tool, which may liberate us from tedious computations.

Let $X$ and $Y$ be random variables on a probability space $\Omega$. Given $Y = y$ with $\Pr(Y = y) > 0$, we define the conditional expectation $E(X|Y = y)$ as
$$E(X|Y = y) = \sum_x x\Pr(X = x|Y = y),$$
which is a number depending on $y$. As $Y$ is random, we obtain a new random variable $E(X|Y)$: for an element $\omega \in \Omega$ with $Y(\omega) = y$, $E(X|Y)$ takes the value $E(X|Y = y)$ at $\omega$.

Lemma 2.2 $E[E(X|Y)] = E[X]$.

Proof. From the definition, we have
$$E[E(X|Y)] = \sum_y E[X|Y = y]\Pr(Y = y) = \sum_y\left(\sum_x x\Pr[X = x|Y = y]\right)\Pr(Y = y)$$
$$= \sum_x x\left(\sum_y \Pr[X = x|Y = y]\Pr(Y = y)\right) = \sum_x x\Pr(X = x) = E(X),$$
as asserted. □

A martingale is a sequence $X_0, X_1, \dots, X_m$ of random variables such that for $0 \le i < m$,
$$E(X_{i+1}|X_i) = X_i;$$
namely, $E(X_{i+1}|X_i = x) = x$ for any given $X_i = x$.

Imagine someone walking on a line at random: at each step he moves one unit to the left or right, each with probability $p$, or stands still with probability $1 - 2p$. Let $X_i$ be his position after $i$ steps. This is a martingale, as the expected position after $i + 1$ steps equals the actual position after $i$ steps.
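The martingale property of this lazy walk can be seen in simulation. The sketch below (ours; the bookkeeping names tot and cnt are arbitrary) estimates $E(X_{i+1}|X_i = x)$ and observes that it equals $x$.

```python
import random
from collections import defaultdict

# Sketch (ours) of the lazy walk martingale: estimate E(X_{i+1} | X_i = x)
# empirically (steps +/-1 w.p. p each, stay w.p. 1 - 2p).
rng = random.Random(4)
p, steps, walks = 0.3, 10, 100_000
tot, cnt = defaultdict(float), defaultdict(int)
for _ in range(walks):
    x = 0
    for _ in range(steps):
        u = rng.random()
        nxt = x + 1 if u < p else (x - 1 if u < 2 * p else x)
        tot[x] += nxt
        cnt[x] += 1
        x = nxt
for x in range(-2, 3):
    print(x, round(tot[x] / cnt[x], 3))   # conditional mean is about x
```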

Let us look at some martingales used in graph theory. The first is called the edge exposure martingale (here on chromatic numbers), in which we reveal $G_p$ one edge-slot at a time. Let the random graph space $G(n, p)$ be the underlying probability space. Set $m = \binom{n}{2}$, and label the potential edges on vertex set $[n]$ by $e_1, e_2, \dots, e_m$ in any manner. For a given graph $H$ on vertex set $[n]$ we define $X_0(H), X_1(H), \dots, X_m(H)$, which are random variables when $H$ is a random graph in $G(n, p)$. Let $X_0(H) = E(\chi(G_p))$. For general $i$,
$$X_i(H) = E[\chi(G_p)\,|\,e_j \in E(G_p) \text{ iff } e_j \in E(H),\ 1 \le j \le i].$$
In other words, $X_i(H)$ is the expected value of $\chi(G_p)$ under the condition that the set of the first $i$ edges of $G_p$ equals that of $H$, while

the remaining edges are not seen and are considered random. Note that $X_0 = E(\chi(G_p))$ is a constant and $X_m(H) = \chi(H)$.

Figure 5.1 shows why this is a martingale on the random space $G(3, 1/2)$. Of course, we can consider any graph parameter other than $\chi$.

[Fig. 5.1 An edge exposure martingale: the tree of conditional expectations $X_0, X_1, X_2, X_3$ over the eight graphs $H$ on three vertices, with $X_0 = 2$ and $X_1 \in \{2.25, 1.75\}$]

[Fig. 5.2 A vertex exposure martingale: the corresponding tree $X_1, X_2, X_3$]

The second is called the vertex exposure martingale (again on chromatic numbers), in which we reveal $G_p$ one vertex at a time. Let the random graph space $G(n, p)$ be the underlying probability space. We define $X_1 = E(\chi(G_p))$ and
$$X_i(H) = E[\chi(G_p)\,|\,E_i(G_p) = E_i(H)],$$
where $E_i(H)$ is the set of edges induced by the vertex set $\{1, \dots, i\}$. In other words, $X_i(H)$ is the expected value of $\chi(G_p)$ under the condition that the set of edges of $G_p$ induced by the first $i$ vertices equals that of $H$, while the remaining edges are not seen and are considered random. Note that $X_1 = E(\chi(G_p))$ and $X_n(H) = \chi(H)$, and that the vertex exposure martingale is a subsequence of the edge exposure martingale.

In Fig. 5.1 the probability space is $G(3, 1/2)$, so $X_0 = E(\chi(G_p)) = 2$, with $X_1(H) = 2.25$ if $e_1 \in E(H)$ and $X_1(H) = 1.75$ otherwise; thus $E(X_1|X_0) = 2 = X_0$. The random variables $X_2$ and $X_3$ take 4 and 8 values, respectively, and $E(X_{i+1}|X_i) = X_i$.
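The quoted values can be verified by brute force over the eight labeled graphs on three vertices. The sketch below (ours) uses the fact that on three vertices $\chi$ depends only on the number of edges.

```python
import itertools

# Brute-force verification (ours) of the edge-exposure values of Fig. 5.1:
# X0 = 2, and X1 = 2.25 or 1.75 according to whether e1 is present.
def chi(k):                                       # chromatic number on 3
    return 1 if k == 0 else (3 if k == 3 else 2)  # vertices, k = #edges

graphs = list(itertools.product((0, 1), repeat=3))  # coords = e1, e2, e3
print(sum(chi(sum(g)) for g in graphs) / 8)         # X0 = 2.0
for b in (1, 0):                                    # condition on e1
    sub = [g for g in graphs if g[0] == b]
    print(b, sum(chi(sum(g)) for g in sub) / 4)     # 2.25 and 1.75
```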

Lemma 2.3 Let $Y$ be a (discrete) random variable such that $E(Y) = 0$ and $|Y| \le 1$. Then $E(e^{tY}) \le (e^t + e^{-t})/2$ for all $t \ge 0$.

Proof. For a fixed $t \ge 0$, set
$$h(y) = \frac{e^t + e^{-t}}{2} + \frac{e^t - e^{-t}}{2}\, y, \qquad -1 \le y \le 1.$$
The function $f(y) = e^{ty}$ is convex, and $h(y)$ is the line through the points $(-1, f(-1))$ and $(1, f(1))$, since $f(-1) = h(-1)$ and $f(1) = h(1)$; hence $e^{ty} \le h(y)$ on $[-1, 1]$, and
$$E(e^{tY}) \le E(h(Y)) = \frac{e^t + e^{-t}}{2}$$
as $E(Y) = 0$. The assertion follows. □

Theorem 2.12 (Azuma's Inequality) Let $X_0, X_1, \dots, X_m$ be a martingale with
$$|X_{i+1} - X_i| \le 1$$
for all $0 \le i < m$, and let $\lambda > 0$. Then
$$\Pr[X_m - X_0 \ge \lambda\sqrt{m}] < e^{-\lambda^2/2},$$
and
$$\Pr[X_m - X_0 \le -\lambda\sqrt{m}] < e^{-\lambda^2/2}.$$

Proof. We may assume $X_0 = 0$ by translation. Set $Y_i = X_i - X_{i-1}$; then $|Y_i| \le 1$ and $E(Y_i|X_{i-1}) = 0$. Lemma 2.3 yields
$$E(e^{tY_i}|X_{i-1}) \le \frac{e^t + e^{-t}}{2} \le e^{t^2/2}$$
for any $t > 0$, where the last inequality is clear. Hence by Lemma 2.2, we have
$$E(e^{tX_m}) = E\big[e^{tX_{m-1}}e^{tY_m}\big] = E\big[E\big(e^{tX_{m-1}}e^{tY_m}\,|\,X_{m-1}\big)\big] = \sum_x E\big(e^{tX_{m-1}}e^{tY_m}\,|\,X_{m-1} = x\big)\Pr(X_{m-1} = x)$$
$$= \sum_x e^{tx}E\big(e^{tY_m}\,|\,X_{m-1} = x\big)\Pr(X_{m-1} = x) \le e^{t^2/2}\sum_x e^{tx}\Pr(X_{m-1} = x) = e^{t^2/2}E(e^{tX_{m-1}}).$$
This and induction give $E(e^{tX_m}) \le e^{mt^2/2}$. Using Markov's inequality, we obtain
$$\Pr(X_m \ge \lambda\sqrt{m}) = \Pr(e^{tX_m} \ge e^{t\lambda\sqrt{m}}) \le \frac{E(e^{tX_m})}{e^{t\lambda\sqrt{m}}} \le \frac{e^{mt^2/2}}{e^{t\lambda\sqrt{m}}}.$$
The assertion follows by letting $t = \lambda/\sqrt{m}$. □


2.4 Parameters of random graphs ⋆

We are ready to discuss some parameters of the random graph $G_p$ for fixed $p$. It is easy to see that some parameters are concentrated around their expectations. The following result is due to Shamir and Spencer (1987).

Theorem 2.13 Let $n$ and $p$ be arbitrary and let $G_p \in G(n, p)$. Then
$$\Pr\big(|\chi(G_p) - E(\chi(G_p))| > \lambda\sqrt{n-1}\big) < 2e^{-\lambda^2/2}.$$

Proof. Consider the vertex exposure martingale $X_1, \dots, X_n$ on $G(n, p)$ with the parameter $\chi(G)$. A single vertex can always be given a new color, so $|X_{i+1} - X_i| \le 1$ and Azuma's inequality applies. □

Similarly, we have
$$\Pr\big(|\omega(G_p) - E(\omega(G_p))| > \lambda\sqrt{n-1}\big) < 2e^{-\lambda^2/2}$$
and
$$\Pr\big(|e(G_p) - E(e(G_p))| > \lambda\sqrt{m}\big) < 2e^{-\lambda^2/2},$$
where $m = \binom{n}{2}$. However, the proofs give no clue as to what these expectations are.

Lemma 2.4 Let $0 < p < 1$, $a = 1/p$ and $\epsilon > 0$ be fixed, and let $f(x) = \binom{n}{x} p^{\binom{x}{2}}$ for $0 \le x \le n$. Define a positive integer $k$ by
$$f(k-1) > 1 \ge f(k).$$
Then as $n \to \infty$,
$$\lceil \omega_n - \epsilon \rceil \le k \le \lfloor \omega_n + \epsilon \rfloor + 1,$$
where
$$\omega_n = 2\log_a n - 2\log_a\log_a n + 2\log_a(e/2) + 1,$$
and $f(k-4) > c\left(\frac{n}{\log_a n}\right)^3 = n^{3-o(1)}$, where $c > 0$ is a constant.

Proof. It is easy to see that $k \to \infty$ and $k = o(\sqrt{n})$; thus by Stirling's formula we have
$$f(k) = \binom{n}{k} p^{\binom{k}{2}} \sim \frac{n^k}{k!}\, p^{k(k-1)/2} \sim \frac{1}{\sqrt{2\pi k}}\left(\frac{en}{k}\, p^{(k-1)/2}\right)^k.$$

So if $\delta > 0$ is fixed, then for all large $n$,
$$\frac{en}{k}\, p^{(k-1)/2} \le 1 + \delta,$$
as $f(k) \le 1$. This is equivalent to
$$k \ge 2\log_a n - 2\log_a k + 2\log_a e + 1 - 2\log_a(1+\delta).$$
Suppose first that $k \sim 2\log_a n$. Then the difference between the right-hand side of the above inequality and $\omega_n$ is
$$2\log_a\frac{2\log_a n}{k} - 2\log_a(1+\delta) \to -2\log_a(1+\delta),$$
so $k - \omega_n \ge -2\log_a(1+\delta) + o(1) \ge -\epsilon$ if we take $\delta$ small enough. Hence $k \ge \omega_n - \epsilon$.

Similarly, from
$$f(k-1) \sim \frac{1}{\sqrt{2\pi(k-1)}}\left(\frac{en}{k-1}\, p^{(k-2)/2}\right)^{k-1},$$
we have $\frac{en}{k-1}\, p^{(k-2)/2} \ge 1$, which gives
$$k \le 2\log_a n - 2\log_a(k-1) + 2\log_a e + 2.$$
Again taking $k \sim 2\log_a n$ first, we obtain $k \le \omega_n + 1 + o(1) \le \omega_n + \epsilon + 1$, and the desired upper bound for $k$ follows.

Finally, note that

f(k − 2) >f(k − 2)

f(k − 1)=

k − 1

n− k + 2ak−2 ∼ p2

k

nak >

cn

log n,

the assertion for f(k − 4) follows immediately. 2

Page 40: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

2.4. PARAMETERS OF RANDOM GRAPHS ⋆ 37

Lemma 2.5 For fixed 0 < p < 1, a = 1/p and ϵ > 0, almost all graphsGp ∈ G(n, p) satisfy

ω(Gp) < ⌊ωn + ϵ⌋ < 2 loga n,

where ωn is defined in Lemma 2.4.

Proof. Let Xr be the number of r-cliques, where r is referred as aninteger. Then

E(Xr) = f(r) =

(n

r

)p(

r2) ≤ nr

r!pr(r−1)/2 <

1√2πr

(en

rp(r−1)/2

)r

.

We shall find some r = r(n) → ∞ such that E(Xr) → 0. This iscertainly true if enp(r−1)/2/r ≤ 1 (hence r → ∞). The same argumentin the proof of Lemma 2.4 applies that if r = ⌈ωn+ϵ⌉, then E(Xr) → 0,thus Pr[ω(Gp) ≥ r] → 0 and Pr[ω(Gp) ≤ ⌊ωn + ϵ⌋] → 1. 2

Note that the above result can be stated as

Pr (ω(Gp) ≤ ⌈ωn + ϵ⌉ − 1) → 1.

Matula (1970, 1972, 1976) was the first to notice that for fixed valuesof p almost all Gp ∈ G(n, p) have clique numbers concentrated on (atmost) two values,

⌊ωn − ϵ⌋ ≤ ω(Gp) ≤ ⌊ωn + ϵ⌋.

Results asserting this phenomenon were proved by Grimmett and Mc-Diarmid (1975); and these were further strengthened by Bollobas andErdos (1976).

In order to reduce the difficulty of the proof and preserve the typicalflavor, we slightly weaken the above lower bound ⌊ωn − ϵ⌋ by havingits asymptotical form a little bit later. Let us discuss the chromaticnumbers first. A technical lemma is as follows.

Lemma 2.6 Let k be the integer defined in Lemma 2.4 and let ℓ =k − 4. Let Y = Y (G) be the maximum size of a family of edge-disjointcliques of size ℓ in G ∈ G(n, p). Then

E(Y ) ≥ c n2

ℓ4,

where c > 0 is a constant.

Page 41: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

38 CHAPTER 2. CONCENTRATION

Proof. Let L denote the family of ℓ-cliques of G. Then by Lemma 2.4,we have

µ = E(|L|) = f(ℓ) =

(n

)p(

ℓ2) ≥ c1

(n

)3

.

Let W denote the number of unordered pairs A,B of ℓ-cliques of Gwith A ∼ B, where A ∼ B signifies that 2 ≤ |A ∩B| < ℓ. Let

∆ =∑A∼B

Pr(AB),

where the sum is taken over all ordered pairs A,B. Then E(W ) =∆/2 and

∆ =

(n

)ℓ−1∑i=2

(ℓ

i

)(n− ℓ

ℓ− i

)p2(

ℓ2)−(i

2)

= µℓ−1∑i=2

(ℓ

i

)(n− ℓ

ℓ− i

)p(

ℓ2)−(i

2) = µℓ−1∑i=2

Ri.

Setting a = 1/p, we have

Ri+1

Ri

=(ℓ− i)2

(i+ 1)(n− 2ℓ+ i+ 1)ai.

If i is small, say bounded, then this ratio is O((loga n)2/n), and if i islarge, say ℓ − i = O(1), then the ratio is at least

√n. It is increasing

on i, so

∆ = µℓ−1∑i=2

Ri ≤ 2µ(R2 +Rℓ−1).

Here

R2 =

(ℓ

2

)(n− ℓ

ℓ− 2

)p(

ℓ2)−1

=ℓ2(ℓ− 1)2

2p(n− ℓ+ 2)(n− ℓ+ 1)µ ≤ ℓ4

2pn2µ,

andRℓ−1 = ℓ(n− ℓ)p(

ℓ2)−(ℓ−1

2 ) ≤ nℓpℓ−1,

Page 42: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

2.4. PARAMETERS OF RANDOM GRAPHS ⋆ 39

thus

∆ ≤ 2µ

(ℓ4

2pn2µ+ nℓpℓ−1

)≤ C

µ2ℓ4

n2.

Let C be a random subfamily of L defined by setting for each A ∈ L,

Pr[A ∈ C] = p1,

where 0 < p1 < 1 will be determined. Then E(|C|) = µp1. Let W ′ bethe number of unordered pairs A,B of ℓ-cliques in C with A ∼ B.Then

E(W ′) = E(W )p21 =∆p21

2.

Delete from C one set from each such pair A,B. This yields a set C∗

of edge-disjoint ℓ-cliques of G and

E(Y ) ≥ E(|C∗|) ≥ E(|C|) − E(W ′) = µp1 −∆p21

2.

By choosing p1 = µ∆< 1, we have

E(Y ) ≥ µ2

2∆≥ c n2

ℓ4

as asserted. 2

Theorem 2.14 (Bollobas) Let 0 < p < 1, a = 1/p be fixed, andlet m = ⌈n/ log2

a n⌉. Then for almost all graphs Gp ∈ G(n, p), eachinduced subgraph of order m of Gp has a clique of size at least r =2 loga n− 7 loga loga n.

Proof. Let S be an m-set of vertices. We shall bound the probabilitythat S induces no r-clique by e−m1+δ

for all large n (hence all large m),where δ > 0 is a constant. So the probability that there exists an m-setwith no r-clique is at most(

n

m

)e−m1+δ

<(en

m

)m

e−m1+δ

= exp(m loge

en

m−m1+δ

),

which goes to zero, and the assertion follows.

Page 43: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

40 CHAPTER 2. CONCENTRATION

Let X be the maximum number of pairwise edge-disjoint r-cliquessets in this graph (induced by S), where edge-disjoint means they shareat most one vertex. We shall show that X ≥ 1 holds almost surely.To do this, we invoke Azuma’s Inequality. Consider the edge exposuremartingale for X that results from revealing G one-edge slot at a time.We have X0 = E(X) and X(m

2 ) = X. Clearly the Lipschitz condition

|Xi+1 −Xi| ≤ 1 is satisfied, so Azuma’s Lemma gives

Pr(X = 0) ≤ Pr[X − E(X) ≤ −E(X)]

= Pr

X − E(X) ≤ −λ(m

2

)1/2 ≤ e−λ2/2

= exp

(− E2(X)

m(m− 1)

),

where λ = E(X)/(m2

)1/2. Hence it suffices to find δ > 0 such that

E2(X) ≥ m3+δ for all large n.Now, let t0 be the integer such that f(t0 − 1) > 1 ≥ f(t0), where

f(x) =(mx

)p(

x2), and let t = t0 − 4. Then by Lemma 2.4, we have

t ≥ 2 logam− 2 loga logam− 3 > 2 loga n− 7 loga loga n,

so t > r. Let T be the maximum number of edge-disjoint cliques of sizet, Then E(X) ≥ E(T ) and E(T ) ≥ cm2/t4 by Lemma 2.6, hence

E(X) ≥ cm2

t4∼ cn2

16(loga n)8,

implying that E2(X) ≥ n4−o(1) ≥ n3+δ for any 1 > δ > 0 if n is large,which completes the proof. 2

Theorem 2.15 (Bollobas) Let 0 < p < 1 and ϵ > 0 be fixed. Denoteb = 1/q = 1/(1 − p). Then almost all graphs Gp ∈ G(n, p) satisfy

n

2 logb n≤ χ(Gp) ≤ (1 + ϵ)

n

2 logb n.

Page 44: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

2.4. PARAMETERS OF RANDOM GRAPHS ⋆ 41

Proof. The lower bound holds because almost all Gp satisfy α(Gp) ≤2 logb n and χ(G)α(G) ≥ n. The upper bound follows from the abovetheorem, which is applied for independent sets instead of cliques, be-cause we can almost always select independent set of size 2 logb n −7 logb logb n until we have only n/ log2

b n < (ϵ/2)n/(2 logb n) verticesleft. We first use at most

n

2 logb n− 7 logb logb n<(

1 +ϵ

2

)n

2 logb n

colors, and then we can complete the coloring by using distinct newcolors on each of the remaining vertices. 2

Let us remark that Achlioptas and Naor recently obtained a resulton sparser random graphs as follows. Given d > 0, let kd be the smallestinteger k such that d < 2k log k. Then χ(Gp) for almost all Gp ∈G(n, d/n) is either kd or kd + 1. This result improves an earlier resultof Luczak (1991) by specifying the form of kd.

Theorem 2.16 Let 0 < p < 1 and ϵ > 0 be fixed. Then almost allgraphs Gp ∈ G(n, p) satisfy

(1 − ϵ)2 logb n ≤ α(Gp) < 2 logb n.

Proof. The upper bound is the complement of that in Lemma 2.5.The lower bound follows from Theorem 2.15 and the fact that α(G) ≥n/χ(G). 2

Theorem 2.17 Let 0 < p < 1 and ϵ > 0 be fixed. Then almost allgraphs Gp ∈ G(n, p) satisfy

(1 − ϵ)2 loga n ≤ ω(Gp) < 2 loga n.

Proof. This is complement of Theorem 2.16. 2

For some graph parameter f(G), we have seen that there is a func-tion g(n) such that almost all graphs Gp in G(n, p) satisfy that

(1 − ϵ)g(n) ≤ f(Gp) ≤ (1 + ϵ)g(n),

Page 45: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

42 CHAPTER 2. CONCENTRATION

hence f(G) concentrate in a small range. We shall call the functiong(n) as a threshold for the parameter f . We will discuss the thresholdfor probability p = p(n) instead of fixed p, and will consider some othergraph parameters in the next chapter.

2.5 References

D. Achlioptas and A. Naor, The two possible values of the chromaticnumber of a random graph, Ann. of Math., 162 (2005), 1335-1351.

N. Alon and J. Spencer, The Probabilistic Method, 3rd Edition,Wiley-Interscience, New York, 2008.

B. Bollobas, The chromatic numbers of random graphs, Combina-torica, 8 (1988), 49-55.

B. Bollobas and P. Erdos, Cliques in random graphs, Math. Proc.Cambridge Philos. Soc., 80 (1976), 419-427.

P. Catlin, Hajos’ graph-coloring conjecture: variations and coun-terexamples, J. Combin. Theory, Ser. B, 26 (1979), 268-274.

H. Chernoff, A measure of the asymptotic efficiency for tests of ahypothesis based on the sum of observations, Ann. Math. Statis. , 23(1952), 493-509.

F. Chung and R. Graham, On multicolor Ramsey numbers for bi-partite graphs, J. Combin. Theory Ser. B, 18 (1975),164-169.

G. Dirac, A property of 4-chromatic graphs and some remarks oncritical graphs, J. Lond. Math. Soc., 27 (1952), 85-92.

P. Erdos and S. Fajtlowicz, On the conjecture of Hajos, Combina-torica, 1 (1981) , 141-143.

G. Grimmett and C. McDiarmid, On coloring random graphs, Math.Proc. Cambridge Philos. Soc., 77 (1975), 313-324.

Y. Li and C. Rousseau, On the Ramsey number r(H + Kn, Kn),Discrete Math., 170 (1997), 265-267.

Page 46: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

2.5. REFERENCES 43

T. Luczak, The chromatic number of random graphs, Combinator-ica, 11 (1991), 45-54.

T. Luczak, A note on the sharp concentration of the chromaticnumber of random graphs, Combinatorica, 11 (1997), 295-297.

D.W. Matula, On the complete subgraphs of a random graph, in:Proc. Second Chapel Hill Conference on Combinatory Mathematicsand its Applications, University of North Carolina, Chapel Hill, NorthCaroline, 1970.

D.W. Matula, The employee party problem, Notices Amer. Math.Soc., 19 (1972), A-328.

D.W. Matula, The largest clique size in a random graph, Tech. Rep.,Dept. Comput. Sci., Southern Methodist University, Dallas, 1976.

J. Mycielski, Sur le coloring des graphs, Coll. Math., 3 (1955),161-162.

E. Shamir and J. Spencer, Sharp concentration of the chromaticnumber in random graph Gn,p, Combinatorica, 7 (1987), 121-130.

Page 47: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

44 CHAPTER 2. CONCENTRATION

Page 48: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

Chapter 3

Properties of RandomGraphs

As we have seen that the random graph is important in probabilisticmethod for Ramsey theory. In this chapter, let us digress a bit fromRamsey theory to general properties of random graphs. This chaptercontains some of most important topics on random graphs, which andChapter 5 form a short path to the theory. A random graph is obtainedby starting with a set of n vertices and adding edges between them atrandom. Different random graph models produce different probabilitydistributions on graphs, for which the model in this chapter is classic.Erdos and Renyi showed that for many monotone-increasing propertiesof random graphs, graphs of a size slightly less than a certain thresholdare very unlikely to have the property, whereas graphs with a few moregraph edges are almost certain to have it. This is known as a phasetransition. The second section is devoted to this topic. The last sectioncovers some deeper discussion, for which the beginners of readers canskip.

3.1 Some behavior of almost all graphs

Given a graph property A, it is often associated with a family Q ofgraphs as

Q = Q(A) = G : G has A.

45

Page 49: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

46 CHAPTER 3. PROPERTIES OF RANDOM GRAPHS

Slightly abusing notation, we do not distinguish the property A andthe family Q if no danger of confusion. We say that almost all (a.a.)graphs in G(n, p) have property Q if limn→∞ Pr[Gp ∈ Q] = 1. In thiscase we also say that almost surely that Gp ∈ G(n, p) has property Q.We begin at a classic result of Erdos that almost all graphs seem tobehave strangely even though they are sparse. In the following result,we denote by ⟨S⟩ the subgraph of G induced S.

Theorem 3.1 (Erdos) For any k ≥ 1, there exist positive constantsc = c(k) and ϵ = ϵ(k) such that almost all graphs in G(n, p) with p = c/nsatisfy that χ(G) ≥ k, and yet χ(⟨S⟩) ≤ 3 for any vertex subset S with|S| ≤ ϵn.

Proof. Let

H(x) = − log(xx(1 − x)1−x

), 0 < x < 1,

and let constants c and ϵ satisfy

c > 2k2H(1/k) and c3e5ϵ < 33.

Set p = c/n and consider G = Gp in G(n, p). We show that almost allgraphs in this space satisfy the conditions. If χ(G) ≤ k, then α(G) ≥n/k. The expected numbers of such independent sets is(

n

n/k

)(1 − p)(

n/k2 ).

From Stirling formula, we estimate that(n

n/k

)=

n!

(n/k)!(n− n/k)!≤ expnH(1/k),

and

(1 − p)(n/k2 ) ≤ exp−pn

2k(n

k− 1) = exp− cn

2k2(1 − o(1)).

Therefore,(n

n/k

)(1 − p)(

n/k2 ) ≤ exp

−n

(c

2k2−H(

1

k) − o(1)

),

Page 50: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

3.1. SOME BEHAVIOR OF ALMOST ALL GRAPHS 47

which tends to zero by the condition satisfied by c.Suppose some set S with t ≤ ϵn vertices such that χ(⟨S⟩) ≥ 4, we

claim that ⟨S⟩ would have at least 3t/2 edges. Suppose S is a minimalsuch set. For any v ∈ S, there would be a (proper) 3−coloring of S\v.If v has two or fewer neighbors in ⟨S⟩ then it would be extended to a3−coloring of S. Hence the minimum degree of ⟨S⟩ is at least 3 andthe claim follows. The probability that some t ≤ ϵn vertices have atleast 3t/2 edges is less than

∑4≤t≤ϵn

(n

t

)( (t2

)3t/2

)(c

n

)3t/2

=∑

4≤t≤n1/4

+∑

n1/4<t≤ϵn

= s1 + s2.

We bound (n

t

)≤(en

t

)t

and

( (t2

)3t/2

)≤(et

3

)3t/2

.

So each term is at most

(en

t

)t (et3

)3t/2 ( cn

)3t/2

=

(c3/2e5/2t1/2

33/2n1/2

)t

.

Hence the first summation

s1 ≤ n1/4

(c3/2e5/2n1/4

33/2n1/2

)4

→ 0.

Each term in the second summation is at most(c3/2e5/2t1/2

33/2n1/2

)t

≤(c3/2e5/2

33/2ϵ1/2

)n1/4

.

Let δ be the bracketed term c3/2e5/2

33/2ϵ1/2, then δ < 1 by the choice of ϵ.

Thus the second summation

s2 ≤ ϵnδn1/4 → 0.

So almost surely no such set S exists, completing the proof. 2

Page 51: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

48 CHAPTER 3. PROPERTIES OF RANDOM GRAPHS

By the above theorem, we know that in random graphs, the neigh-bors of average vertices distribute evenly in every part of the vertexset. So their clique numbers and independence numbers are small, andchromatic number are big. Let g(G) be the girth of G, which is thesmallest length of a cycle in G. A historic result of Erdos (1959) is thatboth of χ(G) and g(G) can be arbitrarily large.

Theorem 3.2 For any fixed ℓ and k, there exists a graph G withg(G) > ℓ and χ(G) > k.

Proof. Fix 0 < θ < 1/ℓ, let p = nθ−1. Consider random graphs inG(n, p). Let X be the number of cycles of length at most ℓ in G. Then

E(X) =ℓ∑

i=3

(n)i2i

pi ≤ℓ∑

i=3

nθi

2i= o(n)

as θℓ < 1, where (n)i is the falling factorial n(n− 1) · · · (n− i+ 1). Onthe other hand,

E(X) =∑i

iPr(X = i) ≥ n

2Pr(X ≥ n/2),

which implies that Pr(X ≥ n/2) = o(1) by E(X) = o(n).Set m = 3n1−θ log n. Then it is a routine procedure to show

Pr(α(G) ≥ m) ≤(n

m

)(1 − p)(

m2 ) <

(ne−p(m−1)/2

)m= o(1).

There exists a graph G of large order n such that X(G) < n/2 andα(G) < m. By deleting a vertex from each cycle of length at most ℓ,we obtain a graph G∗ of order at least n/2, which satisfies g(G∗) > ℓ,α(G∗) ≤ m, and

χ(G∗) ≥ |V (G∗|α(G∗)

≥ n/2

m≥ nθ

6 log n> k,

completing the proof. 2

Page 52: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

3.2. THRESHOLD FUNCTIONS 49

3.2 Threshold functions

For fixed 0 < p ≤ 1, most graphs in G(n, p) are dense. As we haveseen in the last chapter, Bollobas (1988) proved that the chromaticnumbers χ(Gp) for Gp ∈ G(n, p) are concentrated at n/(2 log1/q n),where q = 1 − p. In this section, we investigate the concentration ofedge probability function p = p(n) associated with a property. We haveseen that random graphs in G(n, p) behave sensitively on p = p(n).A monumental discovery of Erdos and Renyi (1960) was that manynatural graph theoretic properties become true in a very narrow rangeof p = p(n).

A property Q is said to be monotone increasing if G has Q, then anygraph from G by adding new edges has Q. The monotone decreasingproperty can be defined similarly. Thus the property of being connect-ed is monotone increasing and that of being triangle-free is monotonedecreasing. As mentioned in Section 1, a property Q is associated witha family of graphs. We call this family to be monotone increasing if sois Q. Also we do not distinguish the property and its associated family.

Lemma 3.1 Let Q be a monotone increasing property. For Gp ∈G(n, p), the function Pr(Gp ∈ Q) is increasing on p.

Proof. Let 0 ≤ p1(n) < p2(n) ≤ 1. We shall verify

Pr(Gp1 ∈ Q) ≤ Pr(Gp2 ∈ Q).

Set p = (p2 − p1)/(1 − p1), then p2 = p+ p1 − pp1. Choose G ∈ G(n, p)and G1 ∈ G(n, p1), independently, and set G2 = G ∪G1. Namely G2 isa graph on vertex set V = [n] with edge set E(G) ∪ E(G1), in whicheach edge e appears with probability

Pr(e) = Pr(e ∈ E(G) ∪ E(G1)) = p+ p1 − pp1 = p2

as the events that e appears in E(G) and in E(G1) are independent.Thus G2 is exactly a random graph of G(n, p2). As Q is monotoneincreasing, if G1 has Q so does G2, thus Pr(Gp1 ∈ Q) ≤ Pr(Gp2 ∈ Q)as claimed. 2

Page 53: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

50 CHAPTER 3. PROPERTIES OF RANDOM GRAPHS

Let Q be a monotone increasing property. Erdos and Renyi defineda function f(n) with 0 ≤ f(n) ≤ 1 as threshold function for Q if

limn→∞

Pr(Gp ∈ Q

)=

0 if p = f(n)/ω(n)1 if p = f(n)ω(n),

where 0 < ω(n) < 1/f(n) is a function which tends to infinity with n,as slowly as desired. For example, if f(n) = log n/n, we may assumethat 0 < ω(n) < log log n. Note that if f(n) is a threshold function forQ, so is cf(n) for any constant c > 0.

Clearly, the definition of f(n) being a threshold function for a mono-tone increasing property Q is equivalent to

limn→∞

Pr(Gp ∈ Q

)=

0 if p≪ f(n)1 if p≫ f(n),

where p≪ f(n) means p = o(f(n)).For obvious reason, the above threshold function is in fact threshold

probability function. One can certainly define other threshold functionssuch as threshold edge function.

The definition of the threshold function for a monotone decreasingproperty is similar. The definitions means the situation whether or nota.a. Gp have Q changes suddenly even though p = p(n) changes slightlyin the moment.

Let X = X(G) be a non-negative integral parameter of graph G.Since

Pr(X ≥ 1) =∑k≥1

Pr(X = k) ≤ E(X),

so E(X) → 0 implies that a.a. graphs in G(n, p) satisfy X = 0. And inmany cases E(X) → ∞ implies that a.a graphs in G(n, p) satisfy X ≥ 1,which can be shown by Chebyshev’s inequality often. For example, letX be the number of triangles in Gp ∈ G(n, p). Then

E(X) =

(n

3

)p3 ∼ 1

6(np)3.

As we will see in the next theorem that f(n) = 1/n truly is a thresholdfunction for triangle-containedness. Let p = γ/n and let γ → 0 or

Page 54: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

3.2. THRESHOLD FUNCTIONS 51

γ → ∞ signify ω(n) in the denominator or in the numerator in thedefinition, respectively. When γ reaches and passes 1, the structureof Gp changes radically. This is called the double jump because thestructure of Gp is significantly different for γ ≪ 1, γ ∼ 1 and γ ≫ 1.

Let us recall the Second Moment Method in last chapter.

Lemma 3.2 (Second Moment Method) If X is a random variable,then

Pr(X = 0) ≤ E(X2) − µ2

µ2,

where µ = E(X). In particular, Pr(X = 0) → 0 if E(X2)/µ2 → 1.

A graph G with average degree d is called balanced if no subgraphof it has average degree greater than d. Complete graphs, cycles andtrees are balanced.

Theorem 3.3 Let F be a balanced graph with k ≥ 2 vertices and ℓ ≥ 1edges and let Q be the property that a graph contains F as a subgraph.Then f(n) = n−k/ℓ is a threshold function for Q.

Proof. To simplify the notation as before, we shall use p = γnk/ℓ with

γ → 0 and γ → ∞ to signify the function ω(n) in the denominator andnumerator, respectively. Let X = X(G) be the number of copies of Fcontained in G = Gp ∈ G(n, p). Denote by a for the number of graphsisomorphic to F on fixed k labeled vertices. Then

µ = E(X) =

(n

k

)apℓ.

By noting the simple facts that 1 ≤ a ≤ k! with k and ℓ fixed, we haveE(X) ≤ nkpℓ = γℓ and E(X) ∼ a

k!nkpℓ = a

k!γℓ. Thus

c1γℓ ≤ c2n

kpℓ ≤ µ = E(X) ≤ γℓ,

where c1 and c2 henceforth ci are positive constants. So the order of µis γℓ = nkpℓ.

When γ → 0, by Markov’s inequality,

Pr(Gp ∈ Q) = Pr(X ≥ 1) ≤ E(X) → 0.

Page 55: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

52 CHAPTER 3. PROPERTIES OF RANDOM GRAPHS

We then want to show that Pr(Gp ∈ Q) = Pr(X ≥ 1) → 1 as γ → ∞.We turn to the Second Moment Method for help since the Markov’sinequality does not work in this case.

For any k labeled vertices in [n], we have a = k!/|A|, where A is

the automorphism group of F . Then there are total of a(nk

)potential

copies of F on [n]. Denote by

F = F1, F2, . . .

for the family of these copies. Denote by Fi ∪ Fj for the graph withvertex set V (Fi) ∪ V (Fj) and edge set E(Fi) ∪E(Fj). The two criticalobservations are that most pairs Fi and Fj have no vertices in common,and if they have s ≥ 1 common vertices and these s vertices contains tedges of Fj, then t/s ≤ ℓ/k since F is balanced.

Let Xi be the indicator function of Fi. Then

E(Xi) = Pr(Xi = 1) = Pr(G ⊃ Fi).

Since X =∑

iXi,

E(X2) =∑i,j

E(XiXj) =∑i,j

Pr(G ⊃ Fi ∪ Fj),

where the sum is taken over all pairs i, j with Fi, Fj ∈ F . Set

A0 =∑

i,j: E(Fi)∩E(Fj)=∅Pr(G ⊃ Fi ∪ Fj),

and for s ≥ 1,

As =∑i, j

Pr(G ⊃ Fi ∪ Fj) : |V (Fi) ∩ V (Fj)| = s, E(Fi) ∩ E(Fj) = ∅ .

Then E(X2) =∑k

s=0As. Note that if E(Fi) ∩ E(Fj) = ∅, then

Pr(G ⊃ Fi ∪ Fj) = Pr(G ⊃ Fi) Pr(G ⊃ Fj)

from the independency of the events. We thus have A0 ≤ µ2 followingfrom

A0 =∑

V (Fi)∩V (Fj)=∅Pr(G ⊃ Fi ∪ Fj)

Page 56: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

3.2. THRESHOLD FUNCTIONS 53

=∑

V (Fi)∩V (Fj)=∅Pr(G ⊃ Fi) Pr(G ⊃ Fj)

≤(∑

i

Pr(G ⊃ Fi))(∑

j

Pr(G ⊃ Fj))

= E2(X) = µ2.

For s ≥ 1 it is expected that As is much less than µ2. Fix Fi, counting Fj

that has s common vertices with Fi, in which these s common verticescontain t edges of E(Fi)∩E(Fj) with t ≤ sℓ/k since F is balanced, wehave

∑j: |V (Fi)∩V (Fj)|=s

Pr(G ⊃ Fi ∪ Fj) ≤∑

t≤sℓ/k

(k

s

)(n− k

k − s

)p2ℓ−t

≤ c3nk−s

∑t≤sℓ/k

p2ℓ−t

since k, s, ℓ are fixed and t is bounded. From the fact that there area(nk

)elements in F , we obtain

As ≤ a

(n

k

)c3n

k−s∑

t≤sℓ/k

p2ℓ−t

≤ c4n2k−s

∑t≤sℓ/k

p2ℓ−t ≤ c4(nkpℓ)2n−s

∑t≤sℓ/k

p−t

≤ c5γ2ℓn−s

psℓ/k=

c5γ2ℓ

(npℓ/k)s=c5γ

2ℓ

γsℓ/k≤ c6µ

2

γsℓ/k,

where we used the fact that nkpℓ, γℓ and µ have the same order. So fors ≥ 1, we have As/µ

2 ≤ c6/γsℓ/k, and

E(X2)

µ2=A0 +

∑ks=1As

µ2≤ 1 +

k∑s=1

c6γsℓ/k

≤ 1 +c7γℓ/k

.

By the Second Moment Method,

Pr(X = 0) ≤ Pr(|X − µ| ≥ µ) ≤ σ2

µ2

=E(X2) − µ2

µ2≤ c7γℓ/k

,

Page 57: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

54 CHAPTER 3. PROPERTIES OF RANDOM GRAPHS

which tends to zero as γ → ∞. 2

In traditional definition of threshold function f(n) of Erdos andRenyi p1/p2 → 0 with p1(n) = f(n)/ω(n) and p2(n) = f(n)ω(n).On the other hand, if we have such p1(n) and p2(n), then f(n) =√p1(n)p2(n) is a threshold function. In many cases it is just needed

that p1 is slightly less than p2. A more precise definition of thresholdfunction is as follows.

Let Q be a monotone increasing property of graphs. A functionpℓ = pℓ(n) is called a lower threshold function (ltf) if almost no graphsin G(n, pℓ) have Q; a function pu = pu(n) is called an upper thresholdfunction (utf) if almost all graphs in G(n, pu) have Q.

A realistic situation is very interesting. In a conference, a pair ofmathematicians unknown each other can found a common mathemati-cian friend. The following result called distance two theorem gives us agood explanation for this small world phenomenon. The property beingdistance two is not strictly increasing, but it is except the addition ofthe last edge.

Theorem 3.4 For any function ω(n) → ∞ with ω(n) < log n, set

pℓ =

√2 log n− ω(n)

n, and pu =

√2 log n+ ω(n)

n.

Then pℓ and pu are ltf and utf for the property of graph having distancetwo, respectively.

Proof. Enumerate of all pairs of vertices u, v of G(n, p) as e1, e2, . . . , emwith m =

(n2

). For ek = u, v, let d(u, v) be the distance between u

and v. Define

Xk = Xk(G) =

0 d(u, v) ≤ 21 otherwise,

and X =∑m

k=1Xk. A non-complete graph G has distance two if andonly if X = 0. Since the event d(u, v) ≥ 3 for a pair of non-adjacentvertices is equivalent to that none of other n− 2 vertices is adjacent toboth u and v, so

E(Xk) = Pr(Xk = 1) = (1 − p)(1 − p2)n−2.

Page 58: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

3.2. THRESHOLD FUNCTIONS 55

Set µ = E(X), then

µ = E(X) =

(n

2

)(1 − p)(1 − p2)n−2.

(i) Let p = pu =√

(2 log n+ ω(n))/n. We have

µ ∼ n2

2(1 − p2)n ∼ n2

2e−np2

=1

2e−ω(n) → 0.

Thus Pr(X ≥ 1) ≤ E(X) → 0, which proves that a.a. graphs in G(n, p)have distance at most two hence a.a. graphs in G(n, p) have distancetwo as almost no graph in G(n, p) is complete.

(ii) Let p = pℓ =√

(2 log n− ω(n))/n. Suppose that ω(n) <log log n without loss of generality. Consider

E(X2) =∑i,j

E(XiXj) = A0 + A1 + A2,

where As =∑

|ei∩ej |=sE(XiXj), in which the sum is taken over all pairsi, j with ei and ej having s vertices in common. Clearly

µ = E(X) ∼ n2

2(1 − p2)n ∼ n2

2e−np2 =

eω(n)

2→ ∞,

and

A0 =∑

|ei∩ej |=0

E(XiXj) ≤(n

2

)(n− 2

2

)(1 − p)2(1 − p2)2(n−2) < µ2.

Also

A2 =m∑k=1

E(Xk) = µ.

We now estimate A1 that should not be big since A0 counts most ofpairs. For ei = u, v and ej = v, w with |ei ∩ ej| = 1, the eventd(u, v) ≥ 3 and d(v, w) ≥ 3 is equivalent to the following events happen:

B1 : u and v, v and w are non-adjacent;

Page 59: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

56 CHAPTER 3. PROPERTIES OF RANDOM GRAPHS

B2 : none of vertices other than u, v, w is adjacent to all of thethree vertices.

B3 : none of distinct vertices is adjacent to both u and v, anotheris adjacent to both v and w;

We have

A1 =∑

|ei∩ej |=1

E(XiXj) = Pr(B1B2B3) ≤ Pr(B3)

= 3

(n

3

)(1 − p2)2n−7

∼ n3

2(1 − p2)2n ∼ n3

2e−2np2 =

1

2ne2ω(n) → 0.

Henceσ2 = E(X2) − µ2 = A0 + A1 + A2 − µ2 < µ+ 1,

which and the Second Moment Method yield

Pr(X = 0) ≤ Pr(|X − µ| ≥ µ) ≤ σ2

µ2→ 0,

proving that almost no graphs in G(n, p) has distance two with p = pℓ.2

The further solution for diameter of random graphs is as follows.Let diam(G) be the diameter of G and let d ≥ 2 be an integer. If

p = n1/d−1(log(n2/x))1/d,

thenPr(diam(Gp) = d) → e−x/2

andPr(diam(Gp) = d+ 1) → 1 − e−x/2

an n → ∞. See Bollobas (2001) for details. The above limit distribu-tion implies that

pℓ,u = n1/d−1(2 log n± ω(n))1/d

are ltf and utf of graphs being distance d, respectively.

Page 60: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

3.2. THRESHOLD FUNCTIONS 57

The distance two graphs are of interest in graph Ramsey theory.Let Gn be a Ramsey graph of order n = r(3, k) − 1. We suppose thatGn is edge maximal for triangle-freeness. Then Gn is a distance twograph. Since the order of n is k2/ log k, so the maximum degree of Gn isbounded above by k ≤ c

√n log n. It is likely that the minimum degree

of Gn has order√n log n hence the order of its edge density is

√log n/n

as that in above theorem.

Problem 3.1 Let Gn be a Ramsey graph of order n = r(3, k) − 1 thatis edge maximal. Determine the orders of the minimum and maximumdegrees of Gn as k → ∞.

The following result gives threshold functions for the property ofbeing connected. The deeper version of the result will be in the nextsection.

Theorem 3.5 Let ω(n) → ∞ be a function with ω(n) < log n. Set

pℓ =log n− ω(n)

nand pu =

log n+ ω(n)

n.

Then pℓ and pu are ltf and utf for graph in G(n, p) with the property ofbeing connected, respectively.

Proof. Let Q be the family of connected graphs. Since Q is monotoneincreasing, we may assume that ω(n) ≤ log log n without loss of gener-ality by Lemma 3.1. Let Xk = Xk(G) be the number of components ofG ∈ G(n, p) that have exactly k vertices.

(i) We first prove that p = pℓ is a ltf for Q. Set µ = E(X1). Notethat (1 − p)n ∼ e−np since np2 → 0, we have

µ = E(X1) = n(1 − p)n−1

∼ n(1 − p)n ∼ ne−np = eω(n) → ∞.

This may indicate that Pr(X1 = 0) → 0 and Pr(X1 ≥ 1) → 1, so a.a.graphs have isolated vertices hence they are disconnected. In orderto use the Second Moment Method, we need to estimate the varianceσ2 = σ2(X1) hence E(X2

1 ). We first have

E[X1(X1 − 1)] = n(n− 1)(1 − p)2n−3,

Page 61: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

58 CHAPTER 3. PROPERTIES OF RANDOM GRAPHS

which is the expected number of order pairs of isolated vertices. Thereare n(n − 1) ordered pairs of vertices, and such a pair are isolated ifand only if they are neither adjacent each other nor adjacent to anyother n− 2 vertices, which count 2n− 3 edges. Then

E(X21 ) = E[X1(X1 − 1)] + E(X1)

= µ+ n(n− 1)(1 − p)2n−3.

We thus have

σ2 = σ2(X1) = E[(X1 − µ)2] = E(X21 ) − µ2

= µ+ n(n− 1)(1 − p)2n−3 − n2(1 − p)2n−2

≤ µ+ pn2(1 − p)2n−3.

Since p = (log n−ω(n))/n with log log n ≥ ω(n) → ∞ and 1−p ≤ e−p,

pn2(1 − p)2n−3 ≤ (1 + o(1))(log n)ne−2np

= (1 + o(1))(log n)ne−2 logn+2ω(n)

= (1 + o(1))log n

ne2ω(n) → 0.

We thus obtainσ2 = σ2(X1) ≤ µ+ 1.

This and the Second Moment Method give

Pr(Gp ∈ Q) ≤ Pr(X1 = 0) ≤ Pr(|X1 − µ| ≥ µ) ≤ σ2

µ2→ 0,

proving that pℓ is a ltf for property of being connected.(ii) Now let p = pu = (log n + ω(n))/n. Note that if G is not

connected then it must contains a component of order at most ⌊n/2⌋.So

Pr(Gp ∈ Q) = Pr

⌊n/2⌋∑k=1

Xk ≥ 1

≤ E

⌊n/2⌋∑k=1

Xk

=⌊n/2⌋∑k=1

E(Xk).

Since if a set with k vertices induces a component, then any vertex init is not adjacent to any vertex out of it. Thus

E(Xk) ≤(n

k

)(1 − p)k(n−k),

Page 62: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

3.3. POISSON LIMIT 59

where we ignore condition that the set is connected. Therefore,

Pr(Gp ∈ Q) ≤⌊n/2⌋∑k=1

(n

k

)(1 − p)k(n−k).

Let us split the sum into two parts S1 and S2. Note that ekp ≤ 1 + ϵuniformly for k ≤ n3/4, and ne−np = e−ω(n)/n, we have

S1 =∑

1≤k≤n3/4

(n

k

)(1 − p)k(n−k) ≤

∑1≤k≤n3/4

(en

ke−npekp

)k

≤∑

1≤k≤n3/4

((1 + ϵ)

e1−ω(n)

k

)k

≤∑

1≤k≤n3/4

((1 + ϵ)e1−ω(n)

)k≤ (1 + o(1))(1 + ϵ)e1−ω(n) → 0.

Note that for n3/4 < k ≤ n/2, we have(nk

)≤ (en/k)k ≤ (en1/4)k and

(1 − p)k(n−k) ≤ (1 − p)kn/2 ≤ e−knp/2 <1

nk/2,

hence

S2 =∑

n3/4<k≤⌊n/2⌋

(n

k

)(1 − p)k(n−k)

≤∑

n3/4<k≤n/2

(e

n1/4

)k

≤ (1 + o(1))(

e

n1/4

)n3/4

→ 0.

Thus S1 + S2 → 0, proving that a.a. graphs in G(n, p) are connected.2

3.3 Poisson limit

In probability theory, we call a random variable X to have Poissondistribution if it takes non-negative integral values and Pr(X = k) =µk

k!e−µ for some constant µ > 0, which is the expectation of X (and the

Page 63: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

60 CHAPTER 3. PROPERTIES OF RANDOM GRAPHS

variance of X). An elementary fact is that if X =∑n

i=1Xi ∼ B(n, p)

and np → µ as n → ∞, then Pr(X = k) → µk

k!e−µ. This is because for

fixed k

Pr(X = k) =

(n

k

)pk(1 − p)n−k ∼

(n

k

)pk(1 − p)n

∼ nk

k!pke−np → µk

k!e−µ.

In the last section, in order to show that a.a graphs in G(n, p) aredisconnected for p = (log n − ω(n))/n, we in fact have proved thata.a. graphs in G(n, p) have isolated vertices. Let X be the numberof isolated vertices in Gp of G(n, p), where p = (log n + x)/n, thenX =

∑ni=1Xi, where Xi is indicator of ith vertex being isolated. Define

p′ = Pr(Xi = 1). Then

p′ = (1 − p)n−1 ∼ exp(−np) =e−x

n=µ

n,

where µ = e−x, so np′ → µ. The distribution of X is close to B(n, p′)in the sense that X1, X2, · · · , Xn are “almost” mutually independent,so we are expecting that X has limit Poisson distribution.

The approach to the Poisson paradigm introduced in this section iscalled Brun’s sieve for its user T. Brun in number theory. Let us beginwith a basic identity called inclusion-exclusion formula.

In a probability space Ω, let X1, X2, . . . , Xℓ be 0-1 random variablesand set

X = X1 +X2 + · · · +Xℓ.

As usual, denote by [ℓ] for 1, 2, . . . , ℓ. Define S0 = 1 and

Sr =∑[ℓ](r)

Pr(Xi1Xi2 . . . Xir = 1),

where the sum is taken over elements i1, i2, . . . , ir ∈ [ℓ](r), the familyof all r−subsets of [ℓ]. Note the elements ω ∈ Ω such thatXi1Xi2 · · ·Xir =1 are what such that Xi1 = 1, Xi2 = 1, . . . , Xir = 1, and Sr = 0 forr > ℓ. For general r,

Sr =∑ω∈Ω

(X(ω)

r

)Pr(ω)

Page 64: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

3.3. POISSON LIMIT 61

as an element ω of the sample space for which X(ω) = t contributes

to(tr

)of the terms defining Sr. Here and in what follow, we write

the formulas appropriate for finite sample space. Following standardnotation, we define falling factorials by (X)0 = 1 and

(X)r = X(X − 1) · · · (X − r + 1).

Then

Sr =∑ω∈Ω

(X)rr!

Pr(ω) =E((X)r)

r!.

The quantity E((X)r) is called the rth factorial moment of X.

Theorem 3.6 (Inclusion-Exclusion Formula) For each integer k ≥0,

Pr(X = k) =∑r≥0

(−1)r(k + r

r

)Sk+r.

Moreover, for each integer m ≥ 0,

2m−1∑r=0

(−1)r(k + r

r

)Sk+r ≤ Pr(X = k) ≤

2m∑r=0

(−1)r(k + r

r

)Sk+r.

Proof. It is easy to see

Pr(X = 0) = Pr(X1 = 0, . . . , Xℓ = 0)

= 1 − Pr(∃ i,Xi = 1) = S0 − S1 + S2 − · · · .

For general k, using

Sk+r =∑ω∈Ω

(X(ω)

k + r

)Pr(ω),

and interchanging orders of the summation, we obtain

∑r≥0

(−1)r(k + r

r

)Sk+r =

∑ω∈Ω

∑r≥0

(−1)r(k + r

r

)(X(ω)

k + r

)Pr(ω).

For a fixed ω hence fixed X = X(ω), note that

∑r≥0

(−1)r(k + r

r

)(X

k + r

)=

(X

k

)∑r≥0

(−1)r(X − k

r

).

Page 65: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

62 CHAPTER 3. PROPERTIES OF RANDOM GRAPHS

If X < k, all terms vanish. If X = k, then one term (r = 0) contributesand the sum is 1. Finally, if X > k, the sum vanishes since

∑r≥0

(−1)r(X − k

r

)= (1 − 1)X−k = 0.

Thus ∑r≥0

(−1)r(k + r

r

)(X

k + r

)=

1 if X = k0 otherwise,

and ∑r≥0

(−1)r(k + r

r

)Sk+r =

∑ω∈Ω

δkX(ω) Pr(ω) = Pr(X = k),

where δij is the Kronecker delta. To verify the second result, note forall s ≥ 0 and n ≥ 1,

s∑r=0

(−1)r(n

r

)= (−1)s

(n− 1

s

),

which can be proved easily by induction on s, hence

s∑r=0

(−1)r(k + r

r

)(X

k + r

)=

(X

k

)s∑

r=0

(−1)r(X − k

r

)

=

0 if X < k1 if X = k

(−1)s(Xk

)(X−k−1

s

)if X > k.

We thus haves∑

r=0

(−1)r(k + r

r

)Sk+r

=∑ω∈Ω

s∑

r=0

(−1)r(k + r

r

)(X(ω)

k + r

)Pr(ω)

=∑

X(ω)=k

Pr(ω) +∑

X(ω)>k

(−1)s(X

k

)(X − k − 1

s

)Pr(ω)

= Pr(X = k) + (−1)s∑

X(ω)>k

(X

k

)(X − k − 1

s

)Pr(ω).

Page 66: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

3.3. POISSON LIMIT 63

In the last line elements ω such that X(ω) > k make a positive ornegative contribution depending on whether s is even or odd. 2

Suppose that we have defined a sequence of probability spaces andthat in the space Ω = Ωn we have the preceding situation with ℓ = ℓ(n).If E((X)r) → µr as n → ∞, we can make a precise statement aboutthe limiting distribution of X =

∑ℓi=1Xi.

Theorem 3.7 (Poisson Limit) Suppose that there is a positive num-ber µ such that

limn→∞

Sr =µr

r!,

equivalently limn→∞E((X)r)) = µr, for each fixed integer r ≥ 0. Then

limn→∞

Pr(X = k) =µk

k!e−µ.

Namely, the limiting distribution of X is Poisson with mean µ.

Proof. Refer to the inequalities in the last theorem we have

1

k!

2m−1∑r=0

(−1)rE((X)k+r)

r!≤ Pr(X = k) ≤ 1

k!

2m∑r=0

(−1)rE((X)k+r)

r!.

Note that if m is fixed, we can make the limit (as n → ∞) term byterm to get

1

k!

2m−1∑r=0

(−1)rµk+r

r!≤ lim

n→∞Pr(X = k) ≤ 1

k!

2m∑r=0

(−1)rµk+r

r!.

Since m is arbitrary, the result follows. 2

Theorem 3.8 For any fixed real number x, let

p =log n+ x

n,

and let X be the number of isolated vertices in a graph of G(n, p), then

limn→∞

Pr[X = k] =µk

k!e−µ,

where µ = e−x. In particular, the limiting probability that graph inG(n, p) has no isolated vertices is exp(−e−x).

Page 67: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

64 CHAPTER 3. PROPERTIES OF RANDOM GRAPHS

Proof. Define Xi as an indicator that the vertex i is an isolated vertex,and define X =

∑ni=1Xi. Then X counts the number of isolated vertices

in Gp andS1 = E(X) = n(1 − p)n−1 → e−x

as n→ ∞. More generally,

Sr =

(n

r

)(1 − p)r(n−r)+(r

2) ∼ nr

r!(1 − p)rn → µr

r!,

where µ = e−x. The limiting distribution of X follows from Poissonlimit theorem as desired. 2

Corollary 3.1 Suppose log n ≥ ω(n) → ∞. Then pℓ = logn−ω(n)n

and

pu = logn+ω(n)n

are ltf and utf for graph in G(n, p) having no isolatedvertices, respectively.

2

We shall show that for the same p = (log n+ x)/n,

limn

Pr(Gp has no isolates) = limn

Pr(Gp is connected) = exp(−e−x).

So in almost every graph when the last isolated vertex disappears, thegraph Gp becomes connected in the evolution of random graph as xincreases. Sightly before it is connected, a giant component with onlya bounded number vertices outside has formed. In fact, the giant com-ponent are formed by larger components and the smaller componentshave bigger chances to survive.

Theorem 3.9 For any fixed real number x, let

p =log n+ x

n,

and let A denote the event that outside of at most one non-trivial com-ponent, all vertices are isolated. Then

limn→∞

Pr(A) = 1

andlimn→∞

Pr[Gp is connected] = exp(−e−x).

Page 68: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

3.3. POISSON LIMIT 65

Proof. We begin by identifying the following events in G(n, p).

A: Outside of at most one non-trivial component, Gp has only iso-lated vertices.

B: Gp has no isolated vertices.

C: Gp is connected.

Then C = A ∩B and

Pr(B) = Pr(C) + Pr(A ∩B).

To prove that Pr(C) → exp(−e−x) as n → ∞, it suffices to show thatPr(A) → 0 since we have known that Pr(B) → exp(−e−x). Let X ⊆ [n]be the vertex set of the largest component of G and let Y = V \ X.We do not distinguish a vertex set and the subgraph induced by thisset if no danger of confusion. If A holds, then for some X ⊆ [n] with|X| ≥ 2,

1. X is connected;

2. Y contains edge;

3. There is no X − Y edges.

Note that these events are independent. Let the probabilities of theevents 1, 2 and 3 be denoted PX , PY and PXY , respectively and let|X| = k and |Y | = m = n − k. By distinguishing that k ≤ ⌊n/2⌋ orm ≤ ⌊n/2⌋, we have

Pr(A) ≤⌊n/2⌋∑k=2

(n

k

)PXPXY +

⌊n/2⌋∑m=2

(n

m

)PY PXY . (3.1)

To bound Pr(A), we use the following facts:

1. PX ≤ kk−2pk−1;

2. PY = 1 − (1 − p)(m2 )

3. PXY = (1 − p)mk.

Page 69: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

66 CHAPTER 3. PROPERTIES OF RANDOM GRAPHS

The first fact follows since X must contain one of the kk−2 possiblespanning trees. Consider then first term on the right hand side of(3.1), we have

⌊n/2⌋∑k=2

(n

k

)PXPXY ≤

⌊n/2⌋∑k=2

(n

k

)kk−2pk−1(1 − p)k(n−k).

The term corresponding to any fixed k ≥ 2 can be bounded above as

c1nkpk−1(1 − p)kn ≤ c1n

kpk−1e−knp = c1e−kxpk−1 → 0,

where c1 henceforth ci are positive constants. For any k ≤ n/2,

(1 − p)n−k ≤ e−(n−k)p ≤ e−np/2 =e−x/2

√n.

Thus (n

k

)kk−2pk−1(1 − p)k(n−k) ≤

(en

k

)k

kk−2pk−1

(e−x/2

√n

)k

=1

k2p

(enp

e−x/2

√n

)k

≤ n

log n+ x

(c2 log n√

n

)k

.

It follows that

⌊n/2⌋∑k=2

(n

k

)PXPXY ≤ o(1) +

∑k≥4

n

log n+ x

(c2 log n√

n

)k

→ 0

as n→ ∞. Now setting

K = ⌊ 4√n⌋, and M = ⌈2

√n exp(1 − x/2)⌉,

we shall separate the second term on the right hand side of (3.1) intothree parts by K and M . Using the facts that

PXY = (1 − p)m(n−m) ≤ e−m(n−m)p ≤ e−mnp/2 =

(e−x/2

√n

)m

Page 70: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

3.3. POISSON LIMIT 67

for m ≤ ⌊n/2⌋ and(nm

)≤ (en/m)m, we have

⌊n/2⌋∑m=M

(n

m

)PY PXY ≤

⌊n/2⌋∑m=M

(n

m

)PXY ≤

∑m≥M

(ene−x/2

m√n

)m

≤∑

m≥M

1

2m=

1

2M−1→ 0

since M → ∞ as n → ∞. On the other hand, if m < M , we haveemp ≤ 2 for all large n since mp ≤Mp→ 0, so

PXY = (1 − p)m(n−m) ≤(e−npemp

)m≤(

2e−x

n

)m

.

It follows that

M−1∑m=K

(n

m

)PY PXY ≤

M−1∑m=K

(n

m

)PXY ≤

∑m≥K

(2e−x)m

m!→ 0

since the sum in the last line is the tail of a convergent series.

Finally, for all m < K,

PY ≤ 1 − (1 − p)(K2 ),

which tends to zero uniformly on m < M , and it follows that

K−1∑m=2

(n

m

)PY PXY = o

(K−1∑m=2

(n

m

)PXY

)≤ o

∑m≥2

(2e−x)m

m!

,which tends to zero. Combining these results, we find that Pr(A) → 0,completing the proof. 2

Corollary 3.2 Suppose that log n ≥ ω(n) → ∞. Then pℓ = logn−ω(n)n

and pu = logn+ω(n)n

are ltf and utf for graph in G(n, p) of being connected,respectively.

2

Page 71: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

68 CHAPTER 3. PROPERTIES OF RANDOM GRAPHS

3.4 References

N. Alon and J. Spencer, The Probabilistic Method, 3rd ed., Wiley-Interscience, New York, 2008.

B. Bollobas, Random Graphs, 2nd Edition, Cambridge UniversityPress, London, 2001.

P. Erdos, Graph theory and probability, Canad. J. Math., 11(1959), 34-38.

P. Erdos and A. Renyi, On the evolution of random graphs, Publ.Math. Inst. Hungar. Acad. Sci., 5 (1960), 17-61.

S. Jason, T. Luczak, and A. Rucinski, Random Graphs, Wiley-Interscience, New York, 2000.

Page 72: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

Chapter 4

Quasi-random Graphs

Random graphs have been proven to be one of most important tools inmodern graph theory. Their tremendous triumph raises the followinggeneral question: what are the essential properties and how can we tellwhen a given graph behaves like a random graph Gp in G(n, p)? Herea typical property of random graphs is that almost all Gp satisfy. Thisleads us to a concept of quasi-random graphs. It was Thomason (1987)who introduced the notation of jumbled graphs to measure the similar-ity between the edge distribution of quasi-random graphs and randomgraphs. Quasi-random graphs are also called pseudo-graphs. A cor-nerstone contribution of Chung, Graham and Wilson (1989) showedthat many properties of different nature are equivalent to the nota-tion of quasi-random graphs. For a survey on quasi-random graphs,see Krivelevich and Sudakov (2006). This chapter focuses on quasi-random graphs. In recent years, there are some quasi-random familiesof graphs appearing, which are all constructed by finite fields. Theiralgebraic parameters are easier to compute, some of which are relatedto characters of finite fields and thus the third section is devoted tothe topics. The last section is application for quasi-random graphs inRamsey theory.

69

Page 73: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

70 CHAPTER 4. QUASI-RANDOM GRAPHS

4.1 Properties of dense graphs

Speaking formally, a quasi-random G of order n is a graph that behaveslike a random graph G(n, p) with p = e(G)/

(n2

). For 0 < p < 1 ≤ α,

a graph G is called (p, α)-jumbled if each induced subgraph H on hvertices of G satisfies that∣∣∣e(H) − p

(h

2

)∣∣∣ ≤ αh.

Equivalently, G is (p, α)-jumbled if the average degree d(H) of eachinduced subgraph H of G satisfies that

|d(H) − p(h− 1)| ≤ 2α.

The following result of Thomason (1987) contains a simple localcondition of a graph of being jumbled.

Theorem 4.1 Let G be a graph of order n with δ(G) ≥ pn. If any pairof vertices has at most p2n+ ℓ common neighbors, where ℓ > 0, then G

is (p,√

(p+ ℓ)n/2)-jumbled.

Proof. Let H be an induced subgraph of G of order h with d(H) = d,where h < n. Write V (G) = v1, v2, . . . , vn and V (H) = v1, v2, . . . , vh,say. Let di be the number of neighbors of vi in H for 1 ≤ i ≤ n. Then∑h

i=1 di = hd and

n∑j=h+1

dj ≥h∑

i=1

(pn− di) = h(pn− d).

Since any pair of vertices are covered by at most p2n + ℓ vertices, andat most that in H particularly, we have

n∑i=1

(di2

)≤(h

2

)(p2n+ ℓ).

The above and the convexity of the function(x2

)imply that

h

(d

2

)+ (n− h)

(h(pn− d)/(n− h)

2

)≤(h

2

)(p2n+ ℓ).

Page 74: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

4.1. PROPERTIES OF DENSE GRAPHS 71

Equivalently,

(d− ph)2 ≤ n− h

n

[(h− 1)ℓ+ p(1 − p)n

],

which gives that

|d− p(h− 1)| ≤√

(p+ ℓ)n

as claimed. Finally, note that the same inequality holds for h = n. 2

For given graphs G and H, let N∗G(H) be the number of labeled

occurrences of H as an induced subgraph of G, which is the number ofadjacency-preserving injections from V (H) to V (G) whose image is theset of vertices of an induced copy ofH ofG. Namely, these injections areboth adjacency-preserving and non-adjacency-preserving. Let NG(H)be the number of labeled copies of H as a (not necessarily induced)subgraph of G. Then

NG(H) =∑H′N∗

G(H ′),

where H ′ ranges over all graphs on V (H) obtained from H by adding aset of edges. For example, if G = H = Ct, then N∗

G(H) = NG(H) = 2t,and if G = Kn and n ≥ t ≥ 4, then N∗

G(Ct) = 0 and NG(Ct) =

N∗G(Kt) = (n)t. If G = Kn/2,n/2 and n is even, then NG(C4) = 2

(n2(n2−

1))2

∼ 2 ·(n2

)4for large n.

Let G be a (p, α)-jumbled graph of order n, where α = αn = o(n)as n → ∞. Then, as shown by Thomason, for fixed p and fixed graphH of order h

N∗G(H) ∼ pe(H)(1 − p)(

h2)−e(H)nh.

Let x and y be vertices of G. Denote by s(x, y) the number of verticesof G adjacent to x and y the same way: either to both or none. Let λibe eigenvalues of G with |λ1| ≥ |λ2| ≥ · · · ≥ |λn|. Let λ = λ(G) = |λ2|.For two (not necessarily disjoint) subsets B and C, let e(B,C) denotethe number of edges from B to C, in which each edge in B ∩ C iscounted twice. If B ∩ C = ∅, then e(B,C) is simply the number ofedges between B and C.

The quasi-random graph defined by Chung, Graham and Wilsonis in fact a family of simple graphs, which satisfy any (hence all) of

Page 75: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

72 CHAPTER 4. QUASI-RANDOM GRAPHS

equivalent properties in the following theorem. It is remarkable thatthese properties ignore “small” local structures. The expressions of theproperties are related to the edge density p, here p = 1/2.

Theorem 4.2 Let G be a sequence of graphs, where G = Gn is agraph of order n. Then the following properties are equivalent:

P1(h): For any fixed h ≥ 4 and graph H of order h, N∗G(H) ∼ (1

2)(

h2)nh.

P2(t): e(G) ∼ n2

4and NG(Ct) ≤ (n

2)t + o(nt) for any even t ≥ 4.

P3: e(G) ≥ n2

4+ o(n2), λ1 ∼ n

2and λ2 = o(n).

P4: For each U ⊆ V (G), e(U) = 12

(|U |2

)+ o(n2).

P5: For each U ⊆ V (G) with |U | = ⌊n2⌋, e(U) ∼ n2

16.

P6:∑

x,y

∣∣∣s(x, y) − n2

∣∣∣ = o(n3).

P7:∑

x,y

∣∣∣|N(x) ∩N(y)| − n4

∣∣∣ = o(n3).

Proof.⋆ The steps of proof of Chung, Graham and Wilso are P1(h+1) ⇒P1(h) and

P1(2h) ⇒ P2(2t) ⇒ P2(4) ⇒ P3 ⇒ P4 ⇐⇒ P5 ⇒ P6 ⇒ P1(2h),

so that all but P7 are proven to be equivalent. They then add P7 tothe equivalent chain by proving that

P2(t) ⇒ P7 ⇒ P6.

Here, we omit some steps but keep most of them and preserve thetypical flavor.

Fact 1. P1(h+ 1) ⇒ P1(h), and P1(3) implies the property

P0 :∑v

∣∣∣ deg(v) − n

2

∣∣∣ = o(n2).

Let us remark that P0 is equivalent to that

P ′0 : All but o(n) vertices of G have degree (1 + o(n))n

2

Page 76: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

4.1. PROPERTIES OF DENSE GRAPHS 73

by Cauchy-Schwarz inequality, and P0 implies that

e(G) ∼ n2

4.

Assume that P1(h+ 1) holds. Let H be a graph of order h. There are2h ways to extend it to a graph H ′ of order h+ 1, and each copy of His contained in n − h subgraphs H ′ of order h + 1. By P1(h + 1), wehave

N∗G(H ′) ∼ nh+12−(h+1

2 ),

thus

N∗G(H) ∼ nh+12−(h+1

2 ) 2h

n− h∼ nh2−(h

2),

which is P1(h). Suppose that G satisfies P1(3). Let Hi be the graphof order 3 and i edges, 1 ≤ i ≤ 3. By counting how often each edge cancontribute to the various N∗

G(Hi), we have

(n− 2)∑v

deg(v) = N∗G(H1) + 2N∗

G(H2) +N∗G(H3) ∼

n3

2,

thus∑

v deg(v) ∼ n2

2and e(G) ∼ n2

4. Also

∑v

deg(v)(deg(v) − 1) = N∗G(H2) +N∗

G(H3) ∼n3

4,

implying that∑

v deg2(v) ∼ n3

4. Then, by Cauchy-Schwarz,

∑v

∣∣∣ deg(v) − n

2

∣∣∣ ≤√n(∑

v

∣∣∣ deg(v) − n

2

∣∣∣2)1/2=

√n(∑

v

deg2(v) − n∑v

deg(v) +n3

4

)1/2,

which is o(n2).

Fact 2. P1(2t) ⇒ P2(2t) (t ≥ 2). Fact 1 has proved that e(G) ∼ n2

4.

We then show that

NG(C2t) =∑H′N∗

G(C2t) ≤ (1 + o(1))(n

2

)2t.

Page 77: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

74 CHAPTER 4. QUASI-RANDOM GRAPHS

As H ′ ranges over all graphs on V (H) obtained from H by adding to

it a set of edges, the number of such sets is 2(2t2 )−2t. This and P1(2t)

imply P2(2t).

Fact 3. P2(2t) ⇒ P2(4) ⇒ P3. There is nothing to prove for thefirst implication and we prove the second. Let A be the adjacencymatrix of G and d the average degree of G. We first claim that

λ1 ≥ d.

Let us verify that for any unit vector X, λ1 ≥ X tAX. Let Λ be thediagonal matrix with diagonal entries λ1, λ2, . . . , λn and P a normalorthogonal matrix such that PAP t = Λ. Then PX is a unit vector,and

λ1 = λ1(PX)t · (PX) ≥ (PX)tΛ(PX) = X t(P tΛP )X = X tAX.

By taking X = 1√nJ , where J = (1, 1, . . . , 1)t, we obtain that

λ1 ≥1

nJ tAJ =

1

n

∑v

deg(v) = d

as claimed. This and e(G) ∼ n2

2imply λ1 ≥ n

2+ o(n). Next, consider

the trace of A4. Clearly,

tr(A4) =n∑

i=1

λ4i ≥ λ41 ≥ (1 + o(1))n4

16.

On the other hand, as this trace is precisely the number of labeled andclosed walks of length 4 inG, i.e., the number of sequences v0, v1, v2, v3, v4 =v0 such that vivi+1 is an edge. This number is NG(C4) plus the num-ber of such sequences in which v2 = v0, and plus the number of suchsequences in which v2 = v0. Thus

n∑i=1

λ4i = NG(C4) + o(n4) ∼(n

2

)4.

It follows that tr(A4) ∼ n4

16, thus λ1 ∼ n

2and

∑ni=2 λ

4i = o(n4) hence

λ2 = o(n) as desired.

Page 78: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

4.1. PROPERTIES OF DENSE GRAPHS 75

Fact 4. P3 ⇒ P4. To simply the proof, we suppose that G isregular. Then the Fact 4 follows from Corollary 4.2 in the next section.

Fact 5. P4 ⇐⇒ P5. The implication P4 ⇒ P5 is immediate, sowe show P5 ⇒ P4. By ignoring one vertex possibly, we assume that nis even so that n/2 is an integer. Suppose that for any subset S with

|S| = n/2,∣∣∣e(S) − n2

16

∣∣∣ < ϵn2, where ϵ > 0 is fixed. We shall show thatfor any subset T , ∣∣∣e(T ) − 1

2

(t

2

)∣∣∣ < 20ϵn2,

where t = |T |. Let us consider two cases, .Case 1. t = |T | ≥ n/2. By averaging over all S ⊆ T with |S| = n/2,

we have

e(T ) =1(

t−2n/2−2

) ∑e(S) : S ⊆ T, |S| = n/2

as each edge is counted exactly

(t−2

n/2−2

)times. Thus

e(T ) ≤

(t

n/2

)(

t−2n/2−2

)(n2

16+ ϵn2

)≤(t

2

)(1

2+ 9ϵ

).

Similarly,

e(T ) ≥(t

2

)(1

2− 9ϵ

).

Case 2. t = |T | < n/2. We shall show that the assumption

e(T ) ≥ 1

2

(t

2

)+ 20ϵn2

leads to a contradiction. Set T = V \ T . Then |T | = n− t > n/2 andby Case 1, we have(

n− t

2

)(1

2− 9ϵ

)< e(T ) <

(n− t

2

)(1

2+ 9ϵ

).

Consider the average value A of e(T ∪ T ′), where T ′ ranges over allsubsets of T with |T ′| = n/2 − t so that |T ∪ T ′| = n/2, so

A =

(n− t

n/2 − t

)−1∑T ′

e(T ∪ T ′) : T ′ ⊆ T , |T ′| = n/2 − t

.

Page 79: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

76 CHAPTER 4. QUASI-RANDOM GRAPHS

Counting how much different edges contribute to the sum, we knowthat the sum equals to

e(T )

(n− t

n/2 − t

)+ e(T )

(n− t− 2

n/2 − t− 2

)+ e(T, T )

(n− t− 1

n/2 − t− 1

).

From the fact that e(T, T ) = e(G) − e(T ) − e(T ), we obtain that

A =n/2

n− te(T ) − (n/2 − t)n/2

(n− t)(n− t− 1)e(T ) +

n/2 − t

n− te(G),

which satisfies

A ≥ n/2

n− t

1

2

(t

2

)+ 20ϵn2

− (n/2 − t)n/2

(n− t)(n− t− 1)

(n− t

2

)(1

2+ 9ϵ

)+n/2 − t

n− t

(n

2

)(1

2− 9ϵ

)>n2

16+ ϵn2.

Similarly, the assumption

e(T ) <1

2

(t

2

)− 20ϵn2,

leads a contradiction to the property P5, too. 2

A property is called a quasi-random property for p = 1/2 if it isequivalent to any property in Theorem 4.2. It is surprised that P2(4),which seems to be weaker, is a quasi-random property for p = 1/2.

Theorem 4.3 The property

P2(4) : e(G) ∼ n2

4and NG(C4) ≤

(n2

)4+ o(n4)

is a quasi-random property for p = 1/2.

Proof. See Fact 3 in the proof of the last theorem. 2

Some other properties can be added to the list, one of which is inthe next theorem.

Page 80: Random Graph - jsnu.edu.cnmathsqxx.jsnu.edu.cn/_upload/article/a8/72/3931fa5148d5...Random graphs began with some sporadic papers of Erd˝os in the 1940s and 1950s, in which he used

4.1. PROPERTIES OF DENSE GRAPHS 77

Theorem 4.4 The property
$$P_8:\quad \mbox{For all } U, V \subseteq V(G),\quad e(U, V) = \frac{1}{2}|U||V| + o(n^2)$$
is a quasi-random property for $p = 1/2$.

Proof. Let us prove the result by showing $P_4 \Longleftrightarrow P_8$. Since $P_8 \Rightarrow P_4$ is immediate, it suffices to show that $P_4 \Rightarrow P_8$. Suppose that $P_4$ holds. If $U$ and $V$ are disjoint, then
$$e(U, V) = e(U \cup V) - e(U) - e(V) = \frac{1}{4}(u+v)^2 - \frac{1}{4}u^2 - \frac{1}{4}v^2 + o(n^2) = \frac{1}{2}uv + o(n^2),$$
where $u = |U|$ and $v = |V|$. In case $U$ and $V$ are not disjoint, write $|U \cap V| = x$; from $P_4$ and what we just proved, we know that $e(U, V)$ equals
$$e(U \setminus V, V \setminus U) + e(U \cap V, U \setminus V) + e(U \cap V, V \setminus U) + 2e(U \cap V)$$
$$= \frac{1}{2}(u-x)(v-x) + \frac{1}{2}x(u-x) + \frac{1}{2}x(v-x) + \frac{1}{2}x^2 + o(n^2) = \frac{1}{2}uv + o(n^2),$$
which is $P_8$. $\Box$

The following theorem is for a general edge density $p$; here $0 < p < 1$ is fixed.

Theorem 4.5 Let $G$ be a sequence of graphs, where $G = G_n$ is a graph of order $n$. Let $0 < p < 1$ be fixed. Then the following properties are equivalent:

$P_1(h)$: For any fixed $h \ge 4$ and graph $H$ of order $h$,
$$N^*_G(H) \sim p^{e(H)}(1-p)^{\binom{h}{2} - e(H)}\, n^h.$$

$P_2(t)$: $e(G) \sim \frac{pn^2}{2}$ and $N_G(C_t) \le (pn)^t + o(n^t)$ for any even $t \ge 4$.


$P_3$: $e(G) \ge \frac{pn^2}{2} + o(n^2)$, $\lambda_1 \sim pn$ and $\lambda_2 = o(\lambda_1)$.

$P_4$: For each $U \subseteq V(G)$, $e(U) = p\binom{|U|}{2} + o(n^2)$.

$P_5$: For each $U \subseteq V(G)$ with $|U| = \lfloor \frac{n}{2} \rfloor$, $e(U) \sim \frac{p}{8}n^2$.

$P_6$: $\sum_{x,y} \big| s(x, y) - (p^2 + (1-p)^2)n \big| = o(n^3)$.

$P_7$: $\sum_{x,y} \big| |N(x) \cap N(y)| - p^2 n \big| = o(n^3)$.

Let us conclude the section with an example.

Example 4.1 The Paley graph $P_q$ of order $q$.

This graph is defined in Chapter 2, where $q \equiv 1 \pmod 4$ is a prime power. The graph $P_q$ is $(q-1)/2$-regular, and the distinct eigenvalues of $P_q$ are $(q-1)/2$, $(\sqrt{q}-1)/2$ and $-(\sqrt{q}+1)/2$. Therefore,
$$e(P_q) = \frac{q(q-1)}{4} \sim \frac{q^2}{4}, \qquad \lambda_1 = \frac{q-1}{2} \sim \frac{q}{2}, \qquad \lambda = \frac{\sqrt{q}+1}{2} = o(q).$$
Thus $P_q$ satisfies the quasi-random property $P_3$, hence all other quasi-random properties with $p = 1/2$.
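These eigenvalues can be confirmed directly on a small instance. A minimal sketch, assuming the standard construction of the Paley graph and the illustrative choice $q = 13$:

import numpy as np

p = 13  # a prime with p ≡ 1 (mod 4); illustrative choice
squares = {(x * x) % p for x in range(1, p)}          # nonzero quadratic residues
A = np.array([[1.0 if (i - j) % p in squares else 0.0
               for j in range(p)] for i in range(p)])

eigs = np.linalg.eigvalsh(A)
print(sorted(set(np.round(eigs, 6))))
# ≈ [-(sqrt(13)+1)/2, (sqrt(13)-1)/2, 6] = [-2.302776, 1.302776, 6.0]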

4.2 Graph with small second eigenvalue

The last section was devoted to quasi-random graphs of fixed edge density. Let us now switch to the case of density $p = p(n) = o(1)$, which is more important for some applications.

In applications, we shall allow the graphs to be semi-simple, that is, each vertex carries at most one loop. When $p \to 0$, the situation is significantly more complicated, as revealed by Chung and Graham (2002). The first remarkable fact is that the properties defined for quasi-random graphs with fixed edge density may not be equivalent anymore. Let $E^o_q$ be the Erdős-Rényi graph of order $n = q^2 + q + 1$. The graph is $(q+1)$-regular, in which $q+1$ vertices have loops (each such vertex has one). So the edge density is $p \sim \frac{1}{\sqrt{n}}$. We have found in Chapter 9


that $\lambda_1 = q + 1 \sim pn$ and $\lambda = \sqrt{q} = o(d)$. So the property $P_3$ holds. However,
$$p^4(1-p)^2 n^4 \sim n^2,$$
and thus the property $P_1(4)$ does not hold, as $E^o_q$ does not contain $C_4$.

Recall that in the quasi-random property $P_3$, the magnitude of $\lambda = \lambda(G)$ is a measure of quasi-randomness. Following Alon, a graph $G$ is an $(n, d, \lambda)$-graph if $G$ is $d$-regular with $n$ vertices and
$$\lambda = \lambda(G) = \max\{|\lambda_i| : 2 \le i \le n\},$$
where $\lambda_1 = d$ and $\lambda_2, \ldots, \lambda_n$ are all eigenvalues of $G$. Here he connected quasi-randomness to the eigenvalue gap. For sparse graphs with $p = o(1)$, Chung and Graham (2002) found some equivalent properties under certain conditions. One of the properties is that $\lambda_1 \sim pn$ and $\lambda = o(\lambda_1)$.

We shall present more results on $(n, d, \lambda)$-graphs, which are due to Alon et al., particularly Alon and Spencer (2008). For two (not necessarily disjoint) subsets $B$ and $C$, we have defined $e(B, C)$ as the number of ordered pairs $(u, v)$ with $u \in B$, $v \in C$ and $uv \in E$. If $G$ is simple, then $e(B, C)$ is the same as defined in the last section, i.e., it counts each edge from $B \setminus C$ to $C \setminus B$ once, and each edge in $B \cap C$ twice. When $G$ is semi-simple, it also counts each loop in $B \cap C$ once. For disjoint subsets $B$ and $C$ in a random graph, $e(B, C)$ is expected to be $\frac{d}{n}|B||C|$, which is close to the right-hand side of the inequality in the following theorem if $\lambda$ is much smaller than $d$.

Theorem 4.6 Let $G = (V, E)$ be a semi-simple $(n, d, \lambda)$-graph. Then for each partition of $V$ into disjoint subsets $B$ and $C$,
$$e(B, C) \ge \frac{(d - \lambda)|B||C|}{n}.$$

Proof. Let $A$ be the adjacency matrix of $G$ and $I$ the identity matrix of order $n$. Observe that for any real vector $x$ of dimension $n$ (viewed as a real-valued function on $V$), we have
$$((dI - A)x, x) = \sum_{u \in V}\Big( d x_u^2 - \sum_{v:\, uv \in E} x_v x_u \Big) = d\sum_{u \in V} x_u^2 - 2\sum_{uv \in E} x_v x_u = \sum_{uv \in E}(x_u - x_v)^2.$$


Set $b = |B|$ and $c = |C| = n - b$. Define a vector $x = (x_v)$ by
$$x_v = \begin{cases} -c & v \in B, \\ b & v \in C. \end{cases}$$
Note that $dI - A$ and $A$ have the same eigenvectors, and that the eigenvalues of $dI - A$ are precisely $d - \mu$ as $\mu$ ranges over all eigenvalues of $A$. Also, $d$ is the largest eigenvalue of $A$, corresponding to the eigenvector $J = (1, 1, \ldots, 1)^t$, and $(x, J) = 0$. Hence $x$ is orthogonal to the eigenvector of the smallest eigenvalue of $dI - A$.

Since $dI - A$ is a symmetric matrix, its eigenvectors are orthogonal to each other and form a basis of the $n$-dimensional space, and $x$ is a linear combination of these eigenvectors other than $J/\sqrt{n}$. This, together with the fact that every eigenvalue of $dI - A$ other than 0 is at least $d - \lambda$, gives
$$((dI - A)x, x) \ge (d - \lambda)(x, x) = (d - \lambda)(bc^2 + cb^2) = (d - \lambda)bcn.$$
However, as $B$ and $C$ form a partition of $V$,
$$\sum_{uv \in E}(x_u - x_v)^2 = e(B, C)(b + c)^2 = e(B, C)n^2,$$
implying the desired inequality. $\Box$

The next theorem bounds a kind of variance. In a random $d$-regular graph, we expect that a vertex $v$ has $\frac{d}{n}|B|$ neighbors in $B$. The theorem shows that if $\lambda$ is small, then $|N_B(v)|$ is not too far from this expectation for most vertices $v$, where $N_B(v) = N(v) \cap B$.

Theorem 4.7 Let $G = (V, E)$ be a semi-simple $(n, d, \lambda)$-graph. Then for each $B \subseteq V$,
$$\sum_{v \in V}\Big( |N_B(v)| - \frac{d}{n}|B| \Big)^2 \le \lambda^2\, \frac{|B|(n - |B|)}{n}.$$

Proof. Let $A$ be the adjacency matrix of $G$. Define a vector $f: V \to \mathbf{R}$ by
$$f_u = \begin{cases} 1 - \frac{b}{n} & u \in B, \\ -\frac{b}{n} & u \notin B, \end{cases}$$


where $b = |B|$. Then $\sum_u f_u = 0$, and $f$ is orthogonal to the eigenvector $J = (1, 1, \ldots, 1)^t$ of the largest eigenvalue $d$ of $A$. Thus $f$ is a linear combination of eigenvectors other than $J$, and
$$(Af, Af) \le \lambda^2 (f, f) = \lambda^2\, \frac{b(n-b)}{n}.$$
Let $A_v$ be the row of $A$ corresponding to vertex $v$. Then the coordinate $(Af)_v$ of $Af$ at $v$ is
$$A_v f = \Big( 1 - \frac{b}{n} \Big)|N_B(v)| - \frac{b}{n}\big( d - |N_B(v)| \big) = |N_B(v)| - \frac{db}{n},$$
and thus
$$(Af, Af) = \sum_v \Big( |N_B(v)| - \frac{db}{n} \Big)^2;$$
the desired inequality follows. $\Box$

Corollary 4.1 Let $G = (V, E)$ be a semi-simple $(n, d, \lambda)$-graph. Then for every two subsets $B$ and $C$ of $G$, we have
$$\Big| e(B, C) - \frac{d}{n}|B||C| \Big| \le \lambda\sqrt{|B||C|}.$$

Proof. Set $b = |B|$ and $c = |C|$. Note that
$$\Big| e(B, C) - \frac{dbc}{n} \Big| = \Big| \sum_{v \in C}\Big( |N_B(v)| - \frac{db}{n} \Big) \Big| \le \sum_{v \in C}\Big| |N_B(v)| - \frac{db}{n} \Big| \le \sqrt{c}\, \Big[ \sum_{v \in C}\Big( |N_B(v)| - \frac{db}{n} \Big)^2 \Big]^{1/2},$$
where the Cauchy-Schwarz inequality is used. From Theorem 4.7, we have
$$\Big| e(B, C) - \frac{dbc}{n} \Big| \le \sqrt{c}\, \Big[ \sum_{v \in V}\Big( |N_B(v)| - \frac{db}{n} \Big)^2 \Big]^{1/2} \le \lambda\sqrt{c}\sqrt{b\Big( 1 - \frac{b}{n} \Big)} \le \lambda\sqrt{bc},$$
as desired. $\Box$
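Corollary 4.1 is a form of the expander mixing lemma, and it can be sampled numerically. A sketch (illustrative: it reuses the Paley graph with $q = 13$, $d = 6$ and $\lambda = (\sqrt{13}+1)/2$, and random subsets $B$ and $C$):

import numpy as np

p, rng = 13, np.random.default_rng(1)
squares = {(x * x) % p for x in range(1, p)}
A = np.array([[1.0 if (i - j) % p in squares else 0.0
               for j in range(p)] for i in range(p)])
d, lam = (p - 1) / 2, (np.sqrt(p) + 1) / 2

for _ in range(5):
    B = rng.choice(p, size=rng.integers(1, p), replace=False)
    C = rng.choice(p, size=rng.integers(1, p), replace=False)
    eBC = A[np.ix_(B, C)].sum()     # ordered pairs (u,v), u in B, v in C, uv an edge
    dev = abs(eBC - d * len(B) * len(C) / p)
    assert dev <= lam * np.sqrt(len(B) * len(C)) + 1e-9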


Let $e(B)$ and $\ell(B)$ be the numbers of edges and loops in $B$, respectively. Then
$$e(B, B) = 2e(B) + \ell(B).$$
Note that $\ell(B) \le |B|$ if $G$ is semi-simple.

Corollary 4.2 Let $G = (V, E)$ be a semi-simple $(n, d, \lambda)$-graph, and let $B$ be a subset of $G$. Then
$$\Big| e(B) - \frac{d}{2n}|B|^2 \Big| \le \frac{\lambda + 1}{2}|B|.$$

Remark. By setting $e(B) = 0$, we have $\alpha(G) \le \frac{(\lambda+1)n}{d}$, which is slightly weaker than a similar bound obtained in Chapter 9.
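For completeness, the remark follows from Corollary 4.2 by a one-line computation: if $B$ is an independent set, then $e(B) = 0$, so
$$\frac{d}{2n}|B|^2 \le \frac{\lambda + 1}{2}|B|, \qquad \mbox{hence} \qquad |B| \le \frac{(\lambda + 1)n}{d},$$
and taking the maximum over independent sets $B$ gives $\alpha(G) \le (\lambda+1)n/d$.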

For an $(n, d, \lambda)$-graph $G = (V, E)$ and $B \subseteq V$, define $\overline{B}$ as the set of vertices $u$ whose number of neighbors in $B$ is at most half of the expected number $\frac{d}{n}|B|$. Then $|\overline{B}||B|$ is at most $\Theta(n^2/d)$ if $\lambda = \Theta(\sqrt{d})$.

Corollary 4.3 Let $G = (V, E)$ be a semi-simple $(n, d, \lambda)$-graph and $B \subseteq V$. Define
$$\overline{B} = \Big\{ u \in V : |N_B(u)| \le \frac{d}{2n}|B| \Big\},$$
where $N_B(u) = N(u) \cap B$. Then
$$|\overline{B}||B| \le \Big( \frac{2\lambda n}{d} \Big)^2.$$
Consequently, $|B \cap \overline{B}| \le \frac{2\lambda n}{d}$.

Proof. From Theorem 4.7, we have
$$\sum_{v \in V}\Big( |N_B(v)| - \frac{d}{n}|B| \Big)^2 \le \lambda^2\, \frac{|B|(n - |B|)}{n} \le \lambda^2 |B|.$$
Each $v \in \overline{B}$ contributes to the left-hand side at least $\big( \frac{d|B|}{2n} \big)^2$, thus
$$|\overline{B}|\Big( \frac{d|B|}{2n} \Big)^2 \le \lambda^2 |B|,$$


implying the claim; the consequence follows since $|B \cap \overline{B}| \le \sqrt{|B||\overline{B}|}$. $\Box$

For an $(n, d, \lambda)$-graph, the spectral gap between $d$ and $\lambda$ is a measure of its quasi-random property. The smaller the value of $\lambda$ compared to $d$, the closer the edge distribution is to the ideal uniform distribution. How small can $\lambda$ be?

Theorem 4.8 Let $G$ be an $(n, d, \lambda)$-graph and let $\epsilon > 0$. If $d \le (1 - \epsilon)n$, then
$$\lambda \ge \sqrt{\epsilon d}.$$

Proof. Let $A$ be the adjacency matrix of $G$. Then
$$nd = 2e(G) = {\rm tr}(A^2) = \sum_{i=1}^n \lambda_i^2 \le d^2 + (n-1)\lambda^2 \le (1 - \epsilon)nd + n\lambda^2,$$
from which the claim follows. $\Box$

From this estimate we may say, not precisely, that an $(n, d, \lambda)$-graph with $\lambda = O(\sqrt{d})$ has good quasi-randomness. Recall a result in Chapter 2: if $G$ is an ${\rm srg}(n, d, \mu_1, \mu_2)$ with $n \ge 3$, then, apart from $\lambda_1 = d$, the other eigenvalues are solutions of the equation
$$\lambda^2 + (\mu_2 - \mu_1)\lambda + (\mu_2 - d) = 0.$$
Thus when $\mu_1 - \mu_2$ is small compared to $d$, which implies that $\lambda$ is close to $\sqrt{d}$, the graph $G$ has good quasi-randomness.

Example 4.2 The Erdős-Rényi graph $E^o_q$.

Let $E^o_q$ be the Erdős-Rényi graph of order $q^2 + q + 1$ in Chapter 9, which is semi-simple and $(q+1)$-regular. Its distinct eigenvalues are $q + 1$, $\sqrt{q}$ and $-\sqrt{q}$. So $\lambda \sim \sqrt{d}$.

Example 4.3 The projective norm graph $G^o_{q,t}$.

Let $G_{q,t}$ be the projective norm graph of Alon et al. in Chapter 9, whose order is $n = q^{t-1}(q-1)$. Each vertex has degree $q^{t-1} - 1$ or $q^{t-1} - 2$. It can be made semi-simple and $d$-regular, where $d = q^{t-1} - 1$, by attaching loops to some vertices. The distinct eigenvalues of $G^o_{q,t}$ will be found in the next section to be $q^{t-1} - 1$, $q^{(t-1)/2}$, $1$, $0$, $-1$, $-q^{(t-1)/2}$. Thus $\lambda \sim \sqrt{d}$.


Example 4.4 The Alon’s graph Ak.

The Ak of Alon giving a constructive proof for r(3, n) ≥ cn3/2 is asfollows. Let k be a positive integer that is not divisible by 3 and F (2k)be the finite field of 2k elements. The elements of F (2k) are representedby binary vectors of length k. If a, b and c are such vectors, let (a, b, c)denote their concatenation, whose coordinates are those of a, followedby those b and those of c. Let a|i be the ith coordinate of a. Define

S = α ∈ F ∗(2k) : α7∣∣∣1

= 0, T = α ∈ F ∗(2k) α7∣∣∣1

= 1,

where the powers are computed in the field. Then |S| = 2k−1 − 1 and|T | = 2k−1.

The graph Ak is defined on vertex set F 3(2k), whose vertices are alln = 23k binary vectors of length 3k. Two vectors u and v are adjacentif and only if there exist s ∈ S and t ∈ T so that

u+ v = (s, s3, s5) + (t, t3, t5).

Theorem 4.9 If k is not divisible by 3, then1. The order of Ak is n = 23k;2. Ak is d-regular, where d = 2k−1(2k−1 − 1) ∼ 1

4n2/3;

3. Ak is triangle-free;4. Each eigenvalue µ other than the largest one satisfies

−9 · 2k − 3 · 2k/2 − 1

4≤ µ ≤ 4 · 2k + 2k/2 +

1

4.

2

Hence $\lambda(A_k) \le O(\sqrt{d})$. Also, from the results in Chapter 2 or Chapter 9, we have $\alpha(A_k) \le O(n^{2/3})$.

The next example is due to Delsarte and Goethals and to Turyn (unpublished, reported in Seidel (1976)). Define a graph $G^{(k)}_q$ of order $q^2$ as follows, where $q$ is a prime power. Let the vertex set of $G^{(k)}_q$ be $F^2_q$, the two-dimensional vector space over $F_q$. Partition the $q + 1$ lines through the origin of the space into two sets $P$ and $N$, where $|P| = k$. Two vertices $x$ and $y$ of $G^{(k)}_q$ are adjacent if $x - y$ is parallel to a line in $P$. The following result is easy to verify.


Theorem 4.10 Let $q$ be a prime power. Then the graph $G^{(k)}_q$ is an
$${\rm srg}\big( q^2,\ k(q-1),\ q - 2 + (k-1)(k-2),\ k(k-1) \big),$$
and its spectrum is
$$\begin{array}{c|ccc}
\mbox{eigenvalue} & k(q-1) & q-k & -k \\
\mbox{multiplicity} & 1 & k(q-1) & (q-1)(q+1-k)
\end{array}$$

For any fixed $p$ with $0 < p < 1$, if we take $k \sim pq$ as $q \to \infty$, then $G^{(k)}_q$ is quasi-random with edge density $p$.

4.3 Applications of characters⋆

We shall find the spectrum of $G_{q,t}$ defined in Chapter 9. Let us define the characters of a finite field $F(q)$, which are group homomorphisms from $F(q)$ or $F^*(q)$ to
$$S^1 = \{ z : |z| = 1 \} = \{ e^{i\theta} : 0 \le \theta < 2\pi \},$$
respectively, where $S^1$ is the multiplicative group of complex numbers of modulus one. An additive character of $F(q)$ is a function $\psi: F(q) \to S^1$ such that for any $x, y \in F(q)$,
$$\psi(x + y) = \psi(x)\psi(y).$$
Clearly $\psi(0) = 1$ and $\psi(-x) = \overline{\psi(x)}$. The trivial function $\psi_0$ with $\psi_0(x) \equiv 1$ is also called the principal additive character of $F(q)$.

A multiplicative character of $F(q)$ is a function $\chi: F^*(q) \to S^1$ such that for any $x, y \in F^*(q)$,
$$\chi(xy) = \chi(x)\chi(y).$$
Clearly $\chi(1) = 1$ and $\chi(x^{-1}) = \overline{\chi(x)}$. The trivial function $\chi_0$ with $\chi_0(x) \equiv 1$ is also called the principal multiplicative character of $F(q)$. It is customary to extend the domain of a multiplicative character $\chi$ to all of $F(q)$ by setting $\chi(0) = 0$ if $\chi \ne \chi_0$, and $\chi_0(0) = 1$. The character $\chi$ with $\chi(x) = x^{(q-1)/2}$ (taking values $\pm 1$) is often called the quadratic residue character.

In the following proofs, we shall not distinguish the elements of $F(p)$ from the integers $0, 1, \ldots, p - 1$.


Lemma 4.1 The numbers of additive characters and multiplicative characters of $F(q)$ are $q$ and $q - 1$, respectively.

Proof. Let us begin with the multiplicative group $F^*(q)$, which is a cyclic group of order $q - 1$ with $F^*(q) = \{1, \mu, \ldots, \mu^{q-2}\}$, where $\mu$ is a primitive element of $F(q)$. Each multiplicative character $\chi$ of $F(q)$ is uniquely determined by $\chi(\mu)$. From $1 = \chi(\mu^{q-1}) = \chi(\mu)^{q-1}$, we have that $\chi(\mu) = \zeta_{q-1}^k$ for some $0 \le k \le q - 2$, where $\zeta_{q-1} = e^{2\pi i/(q-1)}$. If we use $\chi_1$ to signify the multiplicative character of $F(q)$ with $\chi_1(\mu) = \zeta_{q-1}$, then the set of all multiplicative characters of $F(q)$ is $\{\chi_1^k : 0 \le k \le q - 2\}$, in which $\chi_1^0$ is the trivial character $\chi_0$. Thus $F(q)$ has $q - 1$ multiplicative characters, forming a group isomorphic to $F^*(q)$.

Let $q = p^m$ and let $\zeta_p = e^{2\pi i/p}$. For each $a = (a_1, a_2, \ldots, a_m) \in F^m(p)$, set
$$\psi_a: F(q) \to S^1, \qquad \psi_a(x) = \zeta_p^{a_1 x_1 + a_2 x_2 + \cdots + a_m x_m},$$
where $x = (x_1, x_2, \ldots, x_m)$ is the unique expression of $x$ as a vector of $F^m(p)$. Then $\psi_a$ is an additive character of $F(q)$. For $a \ne a'$, we show that $\psi_a \ne \psi_{a'}$. It suffices to show that $\psi_a$ is not the trivial character for $a \ne 0$. Since for $a \ne 0$ there is some $k$ such that $1 \le a_k \le p - 1$, for $e_k = (0, \ldots, 0, 1, 0, \ldots, 0)$, the unit vector with 1 at the $k$th coordinate, $\psi_a(e_k) = \zeta_p^{a_k} \ne 1$. Thus the group of additive characters of $F(q)$ contains at least, hence exactly, $q$ elements (the character group of an abelian group of order $q$ has at most $q$ elements). $\Box$

As usual, the function $\delta(x, y)$ is the Kronecker symbol defined by
$$\delta(x, y) = \begin{cases} 1 & \mbox{if } x = y, \\ 0 & \mbox{otherwise.} \end{cases}$$
The proof of Lemma 4.1 implies the following result.

Lemma 4.2 Let $\chi$ be a multiplicative character of $F(q)$. Then
$$\sum_{t \in F(q)} \chi(t) = q\, \delta(\chi, \chi_0).$$

Let us define the Gaussian sum as
$$G(\chi, \psi) = \sum_{x \in F(q)} \chi(x)\psi(x).$$


Theorem 4.11 Let $\chi$ be a multiplicative character of $F(q)$ and $\psi$ an additive character of $F(q)$. Then $G(\chi_0, \psi_0) = q$, and $G(\chi, \psi) = 0$ if exactly one of $\chi$ and $\psi$ is trivial. Furthermore,
$$|G(\chi, \psi)| = \sqrt{q}$$
if neither $\chi$ nor $\psi$ is trivial.

Proof. The first two equalities are easy, and we shall verify the last. In order to simplify the proof, we prove it for the case that $q$ is a prime $p$, which is the most important special case. The proof for general $q$ will be given later; it is more involved, and readers are welcome to skip it.

Let $\psi$ and $\chi$ be an additive character and a multiplicative character of $F(p)$, neither of which is trivial. From the proof of Lemma 4.1, we have $\psi(x) = \zeta_p^{ax}$, where $\zeta_p = e^{2\pi i/p}$ and $a \ne 0$. Let $g_a(\chi) = \sum_{x \in F(p)} \chi(x)\zeta_p^{ax}$, which is $G(\chi, \psi)$ on $F(p)$.

We shall verify that $g_a(\chi) = \overline{\chi(a)}\, g_1(\chi)$. This follows from
$$g_a(\chi) = \sum_{x \in F(p)} \chi(x)\zeta_p^{ax} = \sum_{y \in F(p)} \chi(a^{-1}y)\zeta_p^{y} = \chi(a^{-1})\sum_{y \in F(p)} \chi(y)\zeta_p^{y} = \overline{\chi(a)}\, g_1(\chi),$$
and hence
$$|g_a(\chi)|^2 = g_a(\chi)\overline{g_a(\chi)} = |\chi(a)|^2 |g_1(\chi)|^2 = |g_1(\chi)|^2.$$
That is to say, $|g_a(\chi)|^2$ has the same value for every $a \ne 0$. On the other hand, for any $a \in F(p)$,
$$g_a(\chi)\overline{g_a(\chi)} = \sum_{x \in F(p)} \chi(x)\zeta_p^{ax} \sum_{y \in F(p)} \overline{\chi(y)}\zeta_p^{-ay} = \sum_{x,\, y \in F(p)} \chi(x)\overline{\chi(y)}\, \zeta_p^{a(x-y)}.$$
It is easy to see that $\sum_{a \in F(p)} \zeta_p^{a(x-y)} = p\, \delta(x, y)$, as $a(x - y)$ ranges over all of $F(p)$ for $x - y \ne 0$. This, together with the fact that $\chi(0) = 0$ as $\chi \ne \chi_0$, implies that
$$\sum_{a \in F(p)} g_a(\chi)\overline{g_a(\chi)} = \sum_{x,\, y \in F(p)} \chi(x)\overline{\chi(y)}\, \delta(x, y)\, p = (p - 1)p.$$


Since $g_0(\chi) = 0$ as $\chi \ne \chi_0$, we obtain that $(p - 1)|g_1(\chi)|^2 = (p - 1)p$, hence $|g_a(\chi)| = |g_1(\chi)| = \sqrt{p}$. $\Box$
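The equality $|g_a(\chi)| = \sqrt{p}$ is easy to check numerically for the quadratic residue character. A small sketch (illustrative choices: $p = 13$ and the primitive root $\mu = 2$):

import cmath

p, mu = 13, 2                                       # 2 is a primitive root mod 13
log_mu = {pow(mu, l, p): l for l in range(p - 1)}   # discrete logarithm table
chi = lambda x: (-1) ** log_mu[x]                   # quadratic residue character on F_p^*

for a in range(1, p):
    g = sum(chi(x) * cmath.exp(2j * cmath.pi * a * x / p) for x in range(1, p))
    assert abs(abs(g) - p ** 0.5) < 1e-9            # |g_a(chi)| = sqrt(p)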

The proof of Theorem 4.11 for general $q = p^m$.

The forms of the additive characters in the proof of Lemma 4.1 are simple, but we shall express them in another way for proving Theorem 4.11 in the general case.

For $\alpha \in F(q) = F(p^m)$, define the trace of $\alpha$ to be
$${\rm tr}(\alpha) = \alpha + \alpha^p + \alpha^{p^2} + \cdots + \alpha^{p^{m-1}}.$$

Lemma 4.3 If $\alpha, \beta \in F(q)$ and $a \in F(p)$, then
(a) ${\rm tr}(\alpha) \in F(p)$.
(b) ${\rm tr}(\alpha + \beta) = {\rm tr}(\alpha) + {\rm tr}(\beta)$.
(c) ${\rm tr}(a\alpha) = a\, {\rm tr}(\alpha)$.
(d) For fixed $\alpha \ne 0$, ${\rm tr}(\alpha x)$ maps $F(q)$ onto $F(p)$.

Proof. The properties (a), (b) and (c) follow from the facts that ${\rm tr}(\alpha)^p = {\rm tr}(\alpha)$, $(\alpha + \beta)^p = \alpha^p + \beta^p$, $\alpha^q = \alpha$ and $a^p = a$. To show property (d), consider the fact that the polynomial ${\rm tr}(\alpha x)$ has at most $p^{m-1}$ roots while $\alpha x$ ranges over all $p^m$ elements of $F(q)$; so there is $x \in F(q) = F(p^m)$ such that ${\rm tr}(\alpha x) = c \ne 0$, where $c \in F(p)$. If $b \in F(p)$, then using property (c) we see that ${\rm tr}((b/c)\alpha x) = (b/c)\, {\rm tr}(\alpha x) = b$. Thus the trace ${\rm tr}(\alpha x)$ is onto. $\Box$

For fixed $\alpha \in F(q)$, we now define $\psi_\alpha: F(q) \to S^1$ by
$$\psi_\alpha(x) = \zeta_p^{{\rm tr}(\alpha x)},$$
where $\zeta_p = e^{2\pi i/p}$. Note that $\psi_0$ is the trivial additive character of $F(q)$. For the case $q = p$, $\psi_\alpha(x) = \zeta_p^{\alpha x}$ is exactly what we have used.

Lemma 4.4 The function $\psi_\alpha$ has the following properties.
(a) $\psi_\alpha(x + y) = \psi_\alpha(x)\psi_\alpha(y)$ for any $x, y \in F(q)$.
(b) If $\alpha \ne 0$, then there is $x \in F(q)$ such that $\psi_\alpha(x) \ne 1$.
(c) If $\alpha \ne 0$, then $\sum_{x \in F(q)} \psi_\alpha(x) = 0$; if $x \ne 0$, then $\sum_{\alpha \in F(q)} \psi_\alpha(x) = 0$.


Proof. Property (a) follows from ${\rm tr}(\alpha(x + y)) = {\rm tr}(\alpha x) + {\rm tr}(\alpha y)$. Property (b) follows from the fact that ${\rm tr}(\alpha x)$ is onto as $\alpha \ne 0$, so there is $x \in F(q)$ such that ${\rm tr}(\alpha x) = 1$; then $\psi_\alpha(x) = \zeta_p \ne 1$. As $\psi_\alpha(x) = \psi_x(\alpha)$, we shall only verify the first equality in property (c). Let $S = \sum_{x \in F(q)} \psi_\alpha(x)$. Choose $y$ such that $\psi_\alpha(y) \ne 1$; then
$$\psi_\alpha(y)S = \sum_{x \in F(q)} \psi_\alpha(x)\psi_\alpha(y) = \sum_{x \in F(q)} \psi_\alpha(x + y) = S,$$
thus $S = 0$. $\Box$

Lemma 4.5 For any fixed $\alpha \in F(q)$, the function $\psi_\alpha$ is an additive character of $F(q)$, and every additive character of $F(q)$ is of this form. Furthermore, for any $x, y \in F(q)$,
$$\sum_{\alpha \in F(q)} \psi_\alpha(x - y) = q\, \delta(x, y).$$

Proof. The first assertion follows from property (a) in Lemma 4.4. For the second, we shall verify that the number of such functions is $q$. It suffices to show that if $\alpha \ne \beta$, the functions $\psi_\alpha$ and $\psi_\beta$ are distinct. If $\psi_\alpha(x) = \psi_\beta(x)$ for all $x \in F(q)$, then
$$\zeta_p^{{\rm tr}((\alpha - \beta)x)} = \psi_{\alpha - \beta}(x) = 1$$
for all $x \in F(q)$, implying that $\alpha = \beta$ by property (b) in Lemma 4.4.

Furthermore,
$$\sum_{\alpha \in F(q)} \psi_\alpha(x - y) = \sum_{\alpha \in F(q)} \zeta_p^{{\rm tr}(\alpha(x - y))},$$
which is $q$ for $x = y$. If $x \ne y$, the equality follows from the fact that $\alpha(x - y)$ ranges over all of $F(q)$ and property (c) in Lemma 4.4. $\Box$

We now write the Gaussian sum in the form
$$G(\chi, \psi_\alpha) = \sum_{x \in F(q)} \chi(x)\psi_\alpha(x).$$
We shall prove that if $\chi \ne \chi_0$ and $\alpha \ne 0$, then
$$|G(\chi, \psi_\alpha)| = \sqrt{q}.$$


Proof. The proof is an analogue of that for the case $q = p$. For any $\alpha \ne 0$, we first verify that $G(\chi, \psi_\alpha) = \overline{\chi(\alpha)}\, G(\chi, \psi_1)$. This is because
$$G(\chi, \psi_\alpha) = \sum_{x \in F(q)} \chi(x)\zeta_p^{{\rm tr}(\alpha x)} = \sum_{y \in F(q)} \chi(\alpha^{-1}y)\zeta_p^{{\rm tr}(y)} = \chi(\alpha^{-1})\sum_{y \in F(q)} \chi(y)\zeta_p^{{\rm tr}(y)} = \overline{\chi(\alpha)}\, G(\chi, \psi_1).$$
Therefore, we have $|G(\chi, \psi_\alpha)|^2 = |G(\chi, \psi_1)|^2$ for $\alpha \ne 0$. On the other hand, for any $\alpha \in F(q)$,
$$G(\chi, \psi_\alpha)\overline{G(\chi, \psi_\alpha)} = \sum_{x \in F(q)} \chi(x)\zeta_p^{{\rm tr}(\alpha x)} \sum_{y \in F(q)} \overline{\chi(y)}\zeta_p^{-{\rm tr}(\alpha y)} = \sum_{x,\, y \in F(q)} \chi(x)\overline{\chi(y)}\, \zeta_p^{{\rm tr}(\alpha(x - y))}.$$
Since $\chi(0) = 0$ as $\chi \ne \chi_0$, Lemma 4.5 gives
$$\sum_{\alpha \in F(q)} G(\chi, \psi_\alpha)\overline{G(\chi, \psi_\alpha)} = \sum_{x,\, y} \chi(x)\overline{\chi(y)}\, \delta(x, y)\, q = q(q - 1).$$
Observing that $G(\chi, \psi_0) = 0$ as $\chi \ne \chi_0$, we have
$$\sum_{\alpha \in F(q)} G(\chi, \psi_\alpha)\overline{G(\chi, \psi_\alpha)} = \sum_{\alpha \in F^*(q)} |G(\chi, \psi_1)|^2 = (q - 1)|G(\chi, \psi_1)|^2,$$
yielding $(q - 1)|G(\chi, \psi_1)|^2 = q(q - 1)$, hence $|G(\chi, \psi_\alpha)| = |G(\chi, \psi_1)| = \sqrt{q}$. The proof of the general case of Theorem 4.11 is complete. $\Box$

The order of a multiplicative character $\chi$ is the smallest positive integer $d$ such that $\chi^d = \chi_0$. A more sophisticated result on character sums is Weil's theorem, as follows. Let $\chi$ be a multiplicative character of $F_q = F(q)$ of order $d > 1$ and $f(x)$ a polynomial over $F_q$. If $f(x)$ has precisely $s$ distinct zeros and is not of the form $c(g(x))^d$, where $c \in F_q$ and $g(x) \in F_q[x]$, then
$$\Big| \sum_{x \in F(q)} \chi(f(x)) \Big| \le (s - 1)\sqrt{q}. \qquad (4.1)$$


In particular, the inequality holds when $\chi$ is the quadratic residue character and $f(x)$ is not of the form $cg^2(x)$, where $c \in F_q$ and $g(x) \in F_q[x]$. Similarly, for an additive character $\psi \ne \psi_0$, if $g(x)$ is a polynomial of degree $n < q$ with $\gcd(n, q) = 1$, then $\big| \sum_{x \in F(q)} \psi(g(x)) \big| \le (n - 1)\sqrt{q}$.
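The bound (4.1) can also be sampled numerically for the quadratic residue character. A sketch (illustrative choices: $p = 101$ and $f(x) = x(x+1)(x+3)$, which has $s = 3$ distinct zeros and is not a constant times a square):

p = 101
chi = {x: pow(x, (p - 1) // 2, p) for x in range(p)}   # x^((p-1)/2) mod p: 1, p-1, or 0
val = lambda x: 0 if chi[x] == 0 else (1 if chi[x] == 1 else -1)

f = lambda x: (x * (x + 1) * (x + 3)) % p              # three distinct zeros, s = 3
S = sum(val(f(x)) for x in range(p))
print(abs(S), 2 * p ** 0.5)                            # |S| is at most (s-1)*sqrt(p)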

These prepared results are enough for introducing the following result of Szabó (2003) on the spectrum of $G^o_{q,t}$, which is constructed in Chapter 9. Note that $G^o_{q,t}$ is $(q^{t-1} - 1)$-regular, in which each vertex $(A, a) \in F(q^{t-1}) \times F^*(q)$ with $N(2A) = a^2$ has a loop.

Theorem 4.12 Let $t \ge 2$ be an integer and $q$ an odd prime power. The spectrum of $G^o_{q,t}$ is as follows.
$$\begin{array}{c|cccccc}
\mbox{eigenvalue} & q^{t-1}-1 & q^{(t-1)/2} & 1 & 0 & -1 & -q^{(t-1)/2} \\
\mbox{multiplicity} & 1 & \frac{(q^{t-1}-1)(q-2)}{2} & \frac{q^{t-1}-1}{2} & q-2 & \frac{q^{t-1}-1}{2} & \frac{(q^{t-1}-1)(q-2)}{2}
\end{array}$$

Proof. Let $M$ be the adjacency matrix of $G^o_{q,t}$. Let $\psi$ be an additive character of $F(q^{t-1})$ and $\chi$ a multiplicative character of $F(q)$. Let $V(\psi, \chi)$ be the column vector whose coordinates are labeled by the vertices of $G^o_{q,t}$, with entry $\psi(X)\chi(x)$ at the vertex $(X, x)$. Then the entry of the column vector $MV(\psi, \chi)$ at $(A, a)$ is
$$\sum_{\substack{(B, b) \in F(q^{t-1}) \times F^*(q) \\ N(A+B) = ab}} \psi(B)\chi(b) = \sum_{B \in F(q^{t-1}) \setminus \{-A\}} \psi(B)\, \chi\Big( \frac{N(A+B)}{a} \Big) = \sum_{C \in F^*(q^{t-1})} \psi(C - A)\, \chi\Big( \frac{N(C)}{a} \Big)$$
$$= \Big( \sum_{C \in F^*(q^{t-1})} \psi(C)\chi(N(C)) \Big)\, \overline{\psi(A)}\, \overline{\chi(a)}.$$
Setting
$$\lambda = \lambda(\psi, \chi) = \sum_{C \in F^*(q^{t-1})} \psi(C)\chi(N(C)),$$
so that $\lambda(\overline{\psi}, \overline{\chi}) = \overline{\lambda(\psi, \chi)}$, we obtain
$$MV(\psi, \chi) = \lambda\, V(\overline{\psi}, \overline{\chi}), \qquad (4.2)$$


and $MV(\overline{\psi}, \overline{\chi}) = \overline{\lambda}\, V(\psi, \chi)$. Thus we have
$$M^2 V(\psi, \chi) = \lambda\overline{\lambda}\, V(\psi, \chi) = |\lambda|^2 V(\psi, \chi).$$
Hence $V(\psi, \chi)$ is an eigenvector of $M^2$ with corresponding eigenvalue $|\lambda(\psi, \chi)|^2$.

Observe that in the multiplicative group consisting of the additive characters, the inverse $\psi^{-1}$ of $\psi$ is $\overline{\psi}$, and the similar statement holds also for the multiplicative group consisting of the multiplicative characters. We claim that the eigenvectors of the form $V(\psi, \chi)$ are pairwise orthogonal. Let $(\psi', \chi') \ne (\psi, \chi)$, and let $\psi'' = \psi'\psi^{-1} = \psi'\overline{\psi}$ and $\chi'' = \chi'\chi^{-1} = \chi'\overline{\chi}$. Then $(\psi'', \chi'') \ne (\psi_0, \chi_0)$, where $\psi_0$ and $\chi_0$ are the trivial additive character and the trivial multiplicative character, respectively. The inner product of the complex vectors $V(\psi', \chi')$ and $V(\psi, \chi)$ is
$$V^T(\psi', \chi')\, \overline{V(\psi, \chi)} = \sum_{(X, x) \in F(q^{t-1}) \times F^*(q)} \psi''(X)\chi''(x) = \sum_{X \in F(q^{t-1})} \psi''(X) \sum_{x \in F^*(q)} \chi''(x) = 0,$$
as one of the sums in the last expression is 0. The number of vectors of the form $V(\psi, \chi)$ is equal to the order of $G^o_{q,t}$ by Lemma 4.1, hence all eigenvalues of $M^2$ are of the form $|\lambda(\psi, \chi)|^2$. Therefore, any eigenvalue of $M$ is of the form
$$\pm|\lambda(\psi, \chi)| = \pm\Big| \sum_{C \in F^*(q^{t-1})} \psi(C)\chi(N(C)) \Big|.$$
When $\psi = \psi_0$ and $\chi = \chi_0$, the corresponding eigenvalue is $q^{t-1} - 1$, which, by the Perron-Frobenius theorem, has multiplicity 1.

Let $\mu$ be a primitive element of $F(q^{t-1})$, and let
$$A_k = \{ \mu^{k + j(q-1)} : 0 \le j \le \ell - 1 \},$$
where $\ell = (q^{t-1} - 1)/(q - 1)$. Then $A_0, A_1, \ldots, A_{q-2}$ form a partition of $F^*(q^{t-1})$ with $|A_k| = \ell$. It is easy to see that $N(x) = N(y)$ if $x$ and $y$ are in the same $A_k$. Therefore, when $\psi = \psi_0$ and $\chi \ne \chi_0$, as
$$|\lambda(\psi_0, \chi)| = \Big| \sum_{C \in F^*(q^{t-1})} \chi(N(C)) \Big| = \Big| \ell \sum_{c \in F^*(q)} \chi(c) \Big| = 0,$$


thus 0 is an eigenvalue of $M$ with multiplicity $q - 2$, which is the number of multiplicative characters of $F(q)$ other than $\chi_0$.

When $\psi \ne \psi_0$ and $\chi = \chi_0$,
$$\lambda(\psi, \chi_0) = \sum_{C \in F^*(q^{t-1})} \psi(C) = -\psi(0) = -1.$$
So 1 is an eigenvalue of $M^2$ with multiplicity $q^{t-1} - 1$, and then $\pm 1$ are eigenvalues of $M$ with the sum of the multiplicities being $q^{t-1} - 1$. Let $W(\psi) = V(\psi, \chi_0) + V(\overline{\psi}, \chi_0)$. It follows from (4.2) that for any $\psi \ne \psi_0$, $MW(\psi) = -W(\psi)$. For any $\psi, \psi' \ne \psi_0$ with $\psi' \notin \{\psi, \overline{\psi}\}$, it is easy to see that the complex vectors $W(\psi)$ and $W(\psi')$ are orthogonal. So $-1$ is an eigenvalue of $M$ with multiplicity at least $(q^{t-1} - 1)/2$. Similarly, by considering $V(\psi, \chi_0) - V(\overline{\psi}, \chi_0)$, we know that 1 is an eigenvalue of $M$ with multiplicity at least $(q^{t-1} - 1)/2$; hence each multiplicity is exactly $(q^{t-1} - 1)/2$.

When $\psi \ne \psi_0$ and $\chi \ne \chi_0$, observing that $\chi \circ N$ is a non-trivial multiplicative character of $F(q^{t-1})$, by Theorem 4.11 on Gaussian sums,
$$|\lambda| = \Big| \sum_{C \in F^*(q^{t-1})} \psi(C)\chi(N(C)) \Big| = q^{(t-1)/2}.$$
Let $S$ and $T$ be the multiplicities of the eigenvalues $q^{(t-1)/2}$ and $-q^{(t-1)/2}$ of $M$, respectively. As $(q^{t-1} - 1)(q - 2)$ is the number of vectors of the form $V(\psi, \chi)$ with $\psi \ne \psi_0$ and $\chi \ne \chi_0$,
$$S + T = (q^{t-1} - 1)(q - 2).$$
By the definition, the graph $G^o_{q,t}$ has a loop at each vertex $(A, a)$ if and only if $N(2A) = a^2$. Since exactly $(q - 1)/2$ elements of $F^*(q)$ are squares and the equation $N(X) = y$ has $(q^{t-1} - 1)/(q - 1)$ solutions in $X$ for each fixed $y \in F^*(q)$, there are $(q^{t-1} - 1)/2$ elements $A \in F(q^{t-1})$ with $N(2A)$ a non-zero square. Once $N(2A)$ is a non-zero square, there are two distinct elements $a, -a \in F^*(q)$ with $N(2A) = a^2 = (-a)^2$. Thus $G^o_{q,t}$ contains $q^{t-1} - 1$ loops, which is the trace of $M$. Hence
$$q^{t-1} - 1 = {\rm tr}(M) = \sum_{j=1}^{q^{t-1}(q-1)} \lambda_j$$


$$= q^{t-1} - 1 + \frac{q^{t-1} - 1}{2} - \frac{q^{t-1} - 1}{2} + q^{(t-1)/2}(S - T),$$
implying that $S = T = (q^{t-1} - 1)(q - 2)/2$. $\Box$

The above theorem has the following corollary, which together with the lower bound in Chapter 9 implies $\alpha(G_{q,t}) = \alpha(G^o_{q,t}) = \Theta(q^{(t+1)/2})$.

Corollary 4.4 Let $t \ge 2$ be an integer and $q$ an odd prime power. Then
$$\alpha(G_{q,t}) \le \frac{(q^t - q^{t-1})(q^{(t-1)/2} + 1)}{q^{t-1} + q^{(t-1)/2} - 1} \sim q^{(t+1)/2} \sim n^{(t+1)/(2t)},$$
where $n = q^{t-1}(q - 1)$ is the order of $G_{q,t}$.

Let us conclude this section with an algebraic construction that almost matches the probabilistic bound $r_k(K_{m,n}) \ge k^m n - n^{1/2 + o(1)}$ in Chapter 5.

Theorem 4.13 Let positive integers $k$ and $m$ be fixed. Then
$$r_k(K_{m,n}) \ge k^m n - n^{0.525}$$
for large $n$.

Proof. As the assertion is trivial for $k = 1$, we assume that $k \ge 2$. Let $p \equiv 1 \pmod{2k}$ be a prime and $F_p$ the finite field of $p$ elements. Let $\mu$ be a primitive element of $F_p$. Define a logarithm-like function $\log_\mu(x): F_p^* \to Z_{p-1} = \{0, 1, \ldots, p - 2\}$ by
$$\log_\mu(x) = \ell \quad \mbox{if } x = \mu^\ell,\ 0 \le \ell \le p - 2.$$
For every $j$ with $0 \le j \le k - 1$, define a graph $H_j$ on the vertex set $F_p$, in which $x$ and $y$ are adjacent in $H_j$ if and only if
$$\log_\mu(x - y) \equiv j \pmod{k}.$$
As $p \equiv 1 \pmod{2k}$ and $(-1)^2 = 1$, we have $-1 = \mu^{(p-1)/2}$ with $(p-1)/2$ divisible by $k$, and thus $\log_\mu(x - y) \equiv \log_\mu(y - x) \pmod{k}$, so the definition is consistent. In the case $k = 2$, the graph $H_0$ is the Paley graph.


Lemma 4.6 Let $k \ge 2$ be an integer and $p \equiv 1 \pmod{2k}$ a prime. Let $H_j$, $0 \le j \le k - 1$, be the graphs defined with respect to a primitive element $\mu$ of $F_p$. Then these $H_j$ are pairwise isomorphic.

Proof. We shall verify that each $H_j$ is isomorphic to $H_0$. Define a bijection $\phi$ on $F(p)$ by $\phi(z) = \mu^j z$. Then $\{x, y\}$ is an edge of $H_0$ if and only if $x - y = \mu^\ell$ for some $\ell \equiv 0 \pmod{k}$. As $\phi(x) - \phi(y) = \mu^{j + \ell}$, it follows that $\{x, y\}$ is an edge of $H_0$ if and only if $\{\phi(x), \phi(y)\}$ is an edge of $H_j$. Thus $H_j$ is isomorphic to $H_0$. For any vertex $x$, its neighborhood in $H_0$ is
$$\{ x + \mu^k,\ x + \mu^{2k},\ \ldots,\ x + \mu^{p-1} \},$$
so the degree of $x$ in $H_0$ is $(p - 1)/k$. This proves the lemma. $\Box$
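The construction and the isomorphisms $\phi$ are easy to realize for small parameters. A sketch (illustrative choices: $p = 13$, $k = 3$ and the primitive root $\mu = 2$) builds the classes $H_j$ and checks that $\phi(z) = \mu^j z$ carries $H_0$ onto $H_j$:

p, k, mu = 13, 3, 2                       # p ≡ 1 (mod 2k); 2 is a primitive root mod 13
log_mu = {pow(mu, l, p): l for l in range(p - 1)}

def edges(j):
    # edges of H_j: pairs {x, y} with log_mu(x - y) ≡ j (mod k)
    return {frozenset((x, y)) for x in range(p) for y in range(x)
            if log_mu[(x - y) % p] % k == j}

H = [edges(j) for j in range(k)]
assert sum(len(h) for h in H) == p * (p - 1) // 2      # the H_j partition K_p
for j in range(k):                                     # phi(z) = mu^j z is an isomorphism
    phi = lambda z: (pow(mu, j, p) * z) % p
    assert {frozenset(map(phi, e)) for e in H[0]} == H[j]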

Let $\zeta_k = e^{2\pi i/k}$. It is easy to see that the following identity holds:
$$(x - \zeta_k)(x - \zeta_k^2) \cdots (x - \zeta_k^{k-1}) = 1 + x + x^2 + \cdots + x^{k-1}. \qquad (4.3)$$
Define a function $\chi$ on $F_p^*$ by
$$\chi(x) = \zeta_k^\ell, \quad \mbox{where } \ell \equiv \log_\mu x \pmod{k}.$$
Extend $\chi$ to all of $F_p$ by $\chi(0) = 0$. Then $\chi$ is a multiplicative character of $F_p$ of order $k$.

Let $U \subseteq F_p$ be a subset of vertices of the graph $H_0$ with $|U| = m$. Denote by $J(U)$ the common neighborhood $\cap_{u \in U} N(u)$. If $|J(U)| < n$ for every such $U$, then $r_k(K_{m,n}) > p$ by Lemma 4.6. For a fixed $U$, define a function $f(x)$ on $x \in F_p$ by
$$f(x) = \prod_{u \in U} \prod_{j=1}^{k-1} \big( \chi(x - u) - \zeta_k^j \big) = \prod_{u \in U} \sum_{j=0}^{k-1} \chi^j(x - u),$$
where we use the identity (4.3). For $x \notin U$, if $x \notin J(U)$, then $f(x) = 0$, as $\chi(x - u) = \zeta_k^j$ for some $u \in U$ and some $j$ with $1 \le j \le k - 1$. If $x \in J(U)$, then $f(x) = k^m$, as $\log_\mu(x - u) \equiv 0 \pmod{k}$, hence $\chi(x - u) = 1$, for every $u \in U$. Therefore, we have
$$\sum_{x \notin U} f(x) = k^m |J(U)|.$$


Set $U = \{u_1, u_2, \ldots, u_m\}$. Note that $\chi$ is multiplicative; thus
$$f(x) = \prod_{t=1}^m \big( 1 + \chi(x - u_t) + \cdots + \chi^{k-1}(x - u_t) \big) = \sum_{0 \le j_1, \ldots, j_m \le k-1} \chi\big( (x - u_1)^{j_1} \cdots (x - u_m)^{j_m} \big)$$
$$= 1 + \sum_{\substack{0 \le j_1, \ldots, j_m \le k-1 \\ j_1 + \cdots + j_m \ge 1}} \chi\big( (x - u_1)^{j_1} \cdots (x - u_m)^{j_m} \big).$$
Applying Weil's theorem to the polynomial $(x - u_1)^{j_1} \cdots (x - u_m)^{j_m}$ with $j_1 + \cdots + j_m \ge 1$, which is not of the form $c\, h^k(x)$ with $c \in F_p$ and $h(x) \in F_p[x]$ as $j_1, \ldots, j_m < k$, we have from (4.1) that
$$\Big| \sum_{x \in F_p} \chi\big( (x - u_1)^{j_1} \cdots (x - u_m)^{j_m} \big) \Big| \le (m - 1)\sqrt{p}.$$
Hence we obtain that
$$\Big| p - \sum_{x \in F_p} f(x) \Big| = \Big| \sum_{x \in F_p} \sum_{\substack{0 \le j_1, \ldots, j_m \le k-1 \\ j_1 + \cdots + j_m \ge 1}} \chi\big( (x - u_1)^{j_1} \cdots (x - u_m)^{j_m} \big) \Big|$$
$$= \Big| \sum_{\substack{0 \le j_1, \ldots, j_m \le k-1 \\ j_1 + \cdots + j_m \ge 1}} \sum_{x \in F_p} \chi\big( (x - u_1)^{j_1} \cdots (x - u_m)^{j_m} \big) \Big| \le \sum_{\substack{0 \le j_1, \ldots, j_m \le k-1 \\ j_1 + \cdots + j_m \ge 1}} (m - 1)\sqrt{p}.$$

It is well known that the number of solutions in nonnegative integers $(j_1, \ldots, j_m)$ of the equation
$$j_1 + j_2 + \cdots + j_m = s$$
is $\binom{s + m - 1}{s}$ for a fixed integer $s$. Omitting the constraint that $j_1, \ldots, j_m \le k - 1$, we obtain that
$$\Big| p - \sum_{x \in F_p} f(x) \Big| \le \sum_{s=1}^{m(k-1)} \binom{s + m - 1}{s}(m - 1)\sqrt{p} = A\sqrt{p},$$


where $A = A(k, m)$ is independent of $p$. Note that $|f(x)| \le k^m$; thus $|\sum_{x \in U} f(x)| \le m k^m$ and
$$\Big| p - k^m|J(U)| \Big| = \Big| p - \sum_{x \notin U} f(x) \Big| \le \Big| p - \sum_{x \in F_p} f(x) \Big| + \Big| \sum_{x \in U} f(x) \Big| \le A\sqrt{p} + mk^m \le (A + 1)\sqrt{p}$$
for large $p$, which implies that
$$k^m|J(U)| \le p + (A + 1)\sqrt{p}.$$
It is known that there are asymptotically $N/(\phi(2k)\log N)$ primes $p$ of the form $p \equiv 1 \pmod{2k}$ between 1 and $N$, where $\phi(2k)$ is the number of integers from 1 to $2k$ that are relatively prime to $2k$. Let $p \equiv 1 \pmod{2k}$ be a prime between $k^m n - n^{0.525}$ and $k^m n - n^{0.525}/2$. The existence of such a prime for large $n$ is ensured by results estimating the difference between consecutive primes; see Baker, Harman and Pintz (2001). The constant 0.525 is in the process of being improved to $0.5 + o(1)$, as implied by the famous Riemann hypothesis. By choosing such a $p$, we have
$$|J(U)| \le n - \frac{n^{0.525}}{2k^m} + \frac{(A + 1)\sqrt{k^m n}}{k^m} < n$$
for large $n$. Thus $H_0$ contains no $K_{m,n}$, implying
$$r_k(K_{m,n}) > p \ge k^m n - n^{0.525},$$
as each $H_j$ is isomorphic to $H_0$. $\Box$

The largest difference between consecutive primes is conjectured to be $p^{1/2 + o(1)}$. If so, we have $r_k(K_{m,n}) \ge k^m n - n^{1/2 + o(1)}$, which is the same as that in Chapter 5.

4.4 References

N. Alon and V. Rödl, Sharp bounds for some multicolor Ramsey numbers, Combinatorica, 25 (2005), 125-141.

N. Alon and J. Spencer, The Probabilistic Method, 3rd ed., Wiley-Interscience, New York, 2008.


R. Baker, G. Harman and J. Pintz, The difference between consecutive primes, II, Proc. Lond. Math. Soc., 83 (2001), 532-562.

F. R. K. Chung and R. Graham, Sparse quasi-random graphs, Combinatorica, 22 (2002), 217-244.

F. R. K. Chung, R. Graham and R. Wilson, Quasi-random graphs, Combinatorica, 9 (1989), 345-362.

M. Krivelevich and B. Sudakov, Pseudo-random graphs, Bolyai Soc. Math. Stud., 15 (2006), 199-262.

J. Seidel, A survey of two-graphs, in: Colloquio Internazionale sulle Teorie Combinatorie (Rome, 1973), Vol. I, Atti dei Convegni Lincei, No. 17, Accad. Naz. Lincei, Rome, 1976, 481-511.

T. Szabó, On the spectrum of projective norm-graphs, Inform. Process. Lett., 86 (2) (2003), 71-74.

A. Thomason, Pseudo-random graphs, in: Proceedings of Random Graphs, Poznań 1985, M. Karoński, Ed., Ann. Discrete Math., 33 (1987), 307-331.


Chapter 5

Real-world Networks

Complex systems from various fields, such as physics, biology, or sociology, can be systematically analyzed using their network representation. A network (also known as a graph) is composed of vertices (or nodes) and edges, where vertices represent the constituents in the system and edges represent the relationships between these constituents.

We shall introduce some basic concepts for real-world networks in this chapter.

5.1 Data and empirical research

Big data is a term for large or complex data sets. The term often refers simply to the fact that traditional data processing applications are inadequate, and seldom to a particular size of data set. Much big data comes from real-world networks, and analysis of such data sets can find new correlations to spot business trends, prevent diseases, combat crime and so on.

Empirical research is research using empirical evidence, where the empirical evidence is the record of direct observations or experiences in the form of data and big data. Through quantifying the data, a researcher can answer empirical questions from the real world. In particular, we are interested in empirical research and the data from real-world networks.

Researchers usually aim at the common case, so they often describe behavior by ignoring cases that are not significant for the research. This is similar to the situation in random graphs, where we describe an event by saying "almost all" to signify that its probability goes to 1. We are often more concerned with the averages of parameters, since parameters are concentrated at their averages in most cases; the average of a parameter may be more meaningful than its extremal value.

However, this is not always the case. When investigating a social network, the nodes of large degree, called "hubs", such as internet celebrities, attract much attention, as these nodes are very important for the structure of the network.

Collecting the data needed is a challenge before analysis. For example, the data in Barabási and Albert (1999) came from software designed to collect the links in the World Wide Web pointing from one page to another. The data in Backstrom and Kleinberg (2014) and in Ugander, Backstrom, Marlow and Kleinberg (2012) came directly from Facebook Inc., as several of the co-authors are employees of the company.

5.2 Six degrees of separation

Six degrees of separation is the theory that any pair of persons is six or fewer steps away in the world viewed as a network connected by friendship, which means, in the language of graph theory, that the maximum distance between nodes in the network is at most six. However, as noted before, "any pair" in social networks studied by sociology may mean most pairs.

The term small world became famous after a paper of S. Milgram (1967), an American psychologist. Some seminal work had been conducted before Milgram took up the experiments reported as the small world problem, and the experiments are called "the small-world experiments", in which Milgram and other researchers examined the average path length in social networks of people in the United States. The research suggested that human society is a small-world-type network, and the experiments are often associated with the phrase "six degrees of separation", although Milgram did not use this term himself.

Milgram’s experiment developed out of a desire to learn more about


the probability that two randomly selected people would know eachother. This is one way of looking at the small world problem.

Though the experiment went through several variations, Milgram typically chose individuals in the cities of Omaha, Nebraska, and Wichita, Kansas, to be the starting points and Boston, Massachusetts, to be the end point of a chain of correspondence. These cities were selected because they were thought to represent a great distance in the US, both socially and geographically.

Information packets (a letter, a roster and postcards) were initially sent to randomly selected individuals in Omaha or Wichita. In the more likely case that the person did not personally know the target, the person was to think of a friend or relative he knew personally who was more likely to know the target. He was then directed to sign his name on a roster in the information packet and forward the packet to that person. When and if the package eventually reached the contact person in Boston, the researchers could examine the roster to count the number of times it had been forwarded from person to person.

However, a significant problem was that people often refused to pass the letter forward, and thus the chain never reached its destination. In one case, only 64 of the 296 letters eventually reached the target contact. Among these chains, the average path length fell around five and a half or six. Hence, the researchers concluded that people in the US are separated by about six people on average.

Smaller communities, such as mathematicians and actors, have been found to be densely connected by chains of personal or professional associations. Mathematicians have created the Erdős number to describe their distance from Paul Erdős based on shared publications. A similar exercise has been carried out for the actor Kevin Bacon and other actors who appeared in movies together with him.

In 2001, D. Watts attempted to recreate Milgram's experiment on the Internet, using an e-mail message as the "package" that needed to be delivered, with 48,000 senders and 19 targets (in 157 countries). Watts found that the average number of intermediaries was around six. Today, the phrase "six degrees of separation" is often used as a synonym for the idea of the "small world" phenomenon.

Watts and Strogatz (1998) showed that the average path length between two nodes in a random network is equal to $\log N / \log K$, where $N$ is the number of nodes and $K$ is the number of acquaintances per node. Thus, assuming that 10% of the population of the US is too young to participate, $N = 300{,}000{,}000$ (90% of the US population) and $K = 30$ give a degree of separation of 5.7. If $N = 6{,}000{,}000{,}000$ (90% of the world population) and $K = 30$, then the degree of separation is 6.6.
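These two figures come from the one-line computation below.

import math

for N in (300_000_000, 6_000_000_000):    # 90% of the US and world populations
    print(round(math.log(N) / math.log(30), 1))   # prints 5.7 and then 6.6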

However, convenient ways of communication in a social network make the average distance smaller and smaller. Facebook's data team released data in online papers describing that, among all Facebook users at the time of research, the average distances of friendship links were 5.28 in 2008, 4.74 in 2011 and 3.57 in February 2016 (this year). The world has changed from "six degrees of separation" to "four degrees of separation".

5.3 Clustering coefficient

An important measure of network topology, called the clustering coefficient, assesses the triangular pattern as well as the connectivity in a vertex's neighborhood: a vertex has a high clustering coefficient if its neighbors tend to be directly connected with each other. The clustering coefficient $c_v$ of a vertex $v$, where $e_v$ denotes the number of edges among the neighbors of $v$, can be calculated as
$$c_v = \begin{cases} 0 & \mbox{if } d_v = 0, \\ \dfrac{e_v}{\binom{d_v}{2}} & \mbox{if } d_v \ge 2. \end{cases}$$
For $d_v = 1$, it is a convention to define $c_v \in [0, 1]$ depending on the situation. Thus $0 \le c_v \le 1$. The clustering coefficient $c_v$ for $d_v \ge 2$ is the ratio of the number of triangles sharing the vertex $v$ to the number of all possible such triangles.

Let $G^k$ be the graph obtained from $G$ by adding new edges connecting vertices at distance at most $k$ in $G$. It is easy to see that if $n \ge 8$, then $c_v = 1/2$ for each $v$ in the circular lattice $C_n^2$.

For a graph $G$ of order $N$ (i.e., $G$ contains $N$ vertices) and minimum degree $\delta(G) \ge 2$, its average clustering coefficient is defined as
$$c(G) = \frac{1}{N}\sum_{v \in V} c_v = \frac{2}{N}\sum_{v \in V} \frac{e_v}{d_v(d_v - 1)}.$$
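Since $e_v$ is half the $v$th diagonal entry of $A^3$ (compare the proof of Theorem 5.1 below), $c(G)$ can be computed compactly from the adjacency matrix. A minimal sketch, assuming numpy and minimum degree at least 2:

import numpy as np

def average_clustering(A):
    """Average clustering coefficient of a simple graph with adjacency matrix A;
    assumes every degree is at least 2."""
    deg = A.sum(axis=1)
    e_v = np.diag(A @ A @ A) / 2.0           # triangles through each vertex
    return np.mean(2.0 * e_v / (deg * (deg - 1)))

# Example: the complete graph K_5 has c(G) = 1.
print(average_clustering(np.ones((5, 5)) - np.eye(5)))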


The average clustering coefficient quantifies the clustering (triangulation) within a network by averaging the clustering coefficients of all its nodes. The notion of clustering coefficient was proposed (especially in the analysis of social networks) to measure the local connectivity or "cliqueness" of a social network. If a network has a high average clustering coefficient and a small average distance, it is often called a "small-world" network.

Let us label the vertices of $G$ of order $N$ as $v_1, v_2, \ldots, v_N$. Recall that $A = (a_{ij})_{N \times N}$ is the adjacency matrix of $G$, where
$$a_{ij} = \begin{cases} 1 & \mbox{if } v_iv_j \in E, \\ 0 & \mbox{otherwise.} \end{cases}$$
We also call the eigenvalues of $A$ the eigenvalues of $G$. Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N$ be the eigenvalues of $G$ in non-increasing order. Set
$$\lambda = \lambda(G) = \max\{ |\lambda_i| : 2 \le i \le N \}.$$
Following Alon, a graph $G$ is an $(N, d, \lambda)$-graph if $G$ is $d$-regular with $N$ vertices and $\lambda = \lambda(G)$. Note that a $d$-regular connected graph satisfies $\lambda_1 = d$. For an $(N, d, \lambda)$-graph, the spectral gap between $d$ and $\lambda$ is a measure of its quasi-random property. The smaller the value of $\lambda$ compared to $d$, the closer the edge distribution is to the ideal uniform distribution (i.e., the graph resembles a random graph). We may say, not precisely, that an $(N, d, \lambda)$-graph with $\lambda = O(\sqrt{d})$ has good quasi-randomness. Generally, this is a weak condition, as most random graphs are such graphs.

Theorem 5.1 Let $G$ be an $(N, d, \lambda)$-graph that is connected. If $\lambda = O(\sqrt{d})$ as $d \to \infty$, then
$$c(G) \sim \frac{d}{N}.$$

Proof. Let $A$ be the adjacency matrix of $G$. Note that $A$ is symmetric, and thus it is diagonalizable. Let $\lambda_1, \lambda_2, \ldots, \lambda_N$ be the eigenvalues of $A$. Then the eigenvalues of $A^k$ are $\lambda_1^k, \lambda_2^k, \ldots, \lambda_N^k$. Note that the $(i, j)$ element of $A^k$ is the number of walks of length $k$ from vertex $v_i$ to vertex $v_j$, and a closed walk of length 3 is exactly a triangle. Thus the $i$th diagonal element of $A^3$ is $2e_{v_i}$, and
$$c(G) = \frac{2}{N}\sum_v \frac{e_v}{d_v(d_v - 1)} = \frac{1}{Nd(d-1)}\sum_{i=1}^N \lambda_i^3 = \frac{1}{Nd(d-1)}\Big( d^3 + \sum_{i=2}^N \lambda_i^3 \Big),$$


where we used the fact that $\lambda_1 = d$, as $G$ is $d$-regular and connected. The assumption $\lambda = O(\sqrt{d})$ implies that
$$\frac{\big| \sum_{i=2}^N \lambda_i^3 \big|}{Nd(d-1)} \le \frac{N\lambda^3}{Nd(d-1)} = \frac{O(d^{3/2})}{d^2} \to 0.$$
Thus
$$c(G) \sim \frac{d^2}{N(d-1)} \sim \frac{d}{N}$$
for large $d$. $\Box$
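Theorem 5.1 can be checked on the Paley graph of Chapter 4, which is an $(N, d, \lambda)$-graph with $\lambda = (\sqrt{N} + 1)/2 = O(\sqrt{d})$. A sketch with the illustrative choice $q = 101$:

import numpy as np

q = 101                                                  # a prime with q ≡ 1 (mod 4)
squares = {(x * x) % q for x in range(1, q)}
A = np.array([[1.0 if (i - j) % q in squares else 0.0
               for j in range(q)] for i in range(q)])

deg = A.sum(axis=1)                                      # constant (q-1)/2 = 50
e_v = np.diag(A @ A @ A) / 2.0
cG = np.mean(2.0 * e_v / (deg * (deg - 1)))
print(cG, deg[0] / q)                                    # both close to 0.49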

5.4 Small-world networks

The small-world phenomenon is typical for random graphs that have small maximum distances. A definition of a small-world network describes it as a network in which the typical distance $L$ between two randomly chosen nodes grows proportionally to $\log N$, where $N$ is the number of nodes in the network. Namely,
$$E(L) = \Theta(\log N),$$
which grows slowly as $N \to \infty$, so the average distance between nodes is small.

A certain category of small-world networks was identified as a class of random graphs by D. Watts and S. Strogatz in 1998. They measured that in fact many real-world networks have a small average distance, but also a clustering coefficient significantly higher than expected by random chance. They noted that graphs can be classified according to two independent structural features, namely the clustering coefficient and the average distance. Purely random graphs, built according to the Erdős-Rényi (ER) model, exhibit a small average distance (typically $\Theta(\log N)$) along with a small clustering coefficient (typically $d/N$, where $d$ is the expected degree).

Many biological, technological and social networks lie between completely regular and completely random. Typically, these networks have many vertices and are sparse in the sense that the average degrees are much less than the number of vertices.


Watts and Strogatz modelled small-world networks by starting with a graph $C_n^k$ with
$$n \ge k \ge \log n \gg 1,$$
where $k \gg \log n$ guarantees that a random graph will be connected. Then they choose the vertices in order and the edges adjacent to the chosen vertices, and reconnect these edges to vertices chosen uniformly at random. In this process, the average clustering coefficient decreases slowly while the average distance decreases rapidly, and thus they obtained networks between the regular ring lattice $C_n^k$ and a completely random network. The obtained network has a large average clustering coefficient and a small average distance, and is called a small-world network.
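A minimal sketch of the rewiring process, assuming the ring lattice $C_n^k$ and a rewiring probability beta (both illustrative parameters; this is a simplification, not the exact procedure of Watts and Strogatz):

import random

def watts_strogatz(n, k, beta, seed=0):
    """Ring lattice C_n^k (each vertex joined to the k nearest vertices on each
    side), then one endpoint of each edge is rewired with probability beta."""
    rng = random.Random(seed)
    lattice = {tuple(sorted((i, (i + s) % n))) for i in range(n) for s in range(1, k + 1)}
    out = set()
    for (u, v) in sorted(lattice):
        if rng.random() < beta:          # rewire the second endpoint uniformly
            v = rng.randrange(n)
            while v == u:                # avoid self-loops
                v = rng.randrange(n)
        out.add(tuple(sorted((u, v))))
    return out

print(len(watts_strogatz(20, 2, 0.1)))   # close to n*k = 40 (duplicates are merged)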

5.5 Power law and scale-free networks

Let $X$ be a discrete random variable taking positive integer values. If
$$\Pr(X = k) = \frac{c}{k^\gamma},$$
where $c, \gamma > 0$ are constants, then $X$ is said to have a power-law distribution. This distribution is also called the Pareto distribution, as the economist Pareto originally used it to describe the allocation of wealth: a larger portion of the wealth of a society is owned by a smaller percentage of the people (the so-called 80-20 rule). In contrast to the exponential distribution, which decreases rapidly, the power law is also called a heavy-tailed distribution.

Let $P(k)$ be the fraction of nodes in the network $G$ that have degree $k$, namely
$$P(k) = \frac{|\{ v : \deg(v) = k \}|}{N},$$
where $N$ is the number of nodes in $G$. If $P(k)$ is equal (or close) to a power law, then the network $G$ is said to be scale-free. For networks, typically $2 < \gamma \le 3$.

The networks of citations between scientific papers are interesting. In 1965, D. Price (1965) found that the numbers of vertices of degree $k$ in such networks have a heavy-tailed distribution following a power law. He did not, however, use the term "scale-free network", which was not coined


until some decades later. In 1976, Price also proposed a mechanism to explain the occurrence of power laws in citation networks, which he called cumulative advantage but which is today more commonly known under the name preferential attachment.

In 1999, A. Barabási and colleagues coined the term scale-free network when they found that some nodes had much bigger degrees than expected in a random network; surprised by this, they used the term "scale-free", which is now used to describe the class of networks that exhibit a power-law degree distribution.

Albert-László Barabási is a physicist, best known for his work in the research of network theory, and Réka Albert, the co-author of the 1999 paper, is a professor of physics and biology.

In an earlier study, Albert, Jeong and Barabási (1999) found that the World Wide Web is not a random network: the number of links per node, often called the degree distribution, follows a power law. Subsequently, researchers found that not only the WWW but many other networks follow the same distribution. These different datasets together indicated that we are dealing with potentially universal behavior, which might have a common explanation.

Barabási and Albert (1999) proposed a generative mechanism to explain the appearance of power-law distributions, which they called "preferential attachment" and which is essentially the same as that proposed by Price. Analytic solutions for this mechanism (also similar to the solution of Price) were presented by Dorogovtsev, Mendes and Samukhin (2002). Finally, the power law was rigorously proved by the mathematicians Bollobás, Riordan, Spencer and Tusnády (2001).

To explain this phenomenon, Barabási and Albert (1999) suggested the following random graph process as a model, called the BA model.

Consider a random graph process in which vertices are added to the graph one at a time and joined to a fixed number of earlier vertices, selected with probabilities proportional to their degrees. Let $v_1, v_2, \ldots$ be a sequence of vertices. Assume that $m_0 \ge 2$ is the number of vertices at the start of the process, and let $d(v_i)$ be the degree of the early vertex $v_i$ in the existing graph.

They described the process as starting with a small number $m_0$ of vertices; at every time step we add a new vertex with $m \le m_0$ edges that link the new vertex to $m$ different vertices already present in the system. If the new vertex is $v_{t+1}$, then the probability that $v_{t+1}$ is adjacent to $v_i$ is proportional to
$$\frac{d(v_i)}{\sum_{j=1}^t d(v_j)}.$$
This choice of probability is what makes the new vertex incorporate preferential attachment. Note that, for the process to start properly, there should exist at least one edge among the first $m_0$ vertices. A subtle point is that if we connected each early vertex to the new vertex independently with the above probability, then the expected number of new edges would be one, not $m$.
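One common way to realize degree-proportional sampling (a sketch, not the exact process analyzed by Bollobás et al.) keeps a list in which each vertex appears once per unit of its degree, so that uniform sampling from the list is preferential attachment:

import random

def ba_graph(n, m, seed=0):
    """Grow a preferential-attachment graph on n vertices, m edges per new vertex.
    Starts from a star on m+1 vertices so that every early vertex has degree >= 1."""
    rng = random.Random(seed)
    edges = [(i, m) for i in range(m)]        # initial star centered at vertex m
    stubs = [v for e in edges for v in e]     # vertex v appears deg(v) times
    for t in range(m + 1, n):
        targets = set()
        while len(targets) < m:               # m distinct degree-biased targets
            targets.add(rng.choice(stubs))
        for v in targets:
            edges.append((t, v))
            stubs += [t, v]
    return edges

g = ba_graph(10000, 3)
# the degree distribution is heavy-tailed: a few hubs collect many links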

The research in Barabási and Albert (1999) is empirical, and their proof is heuristic. The process defined in Bollobás et al. (2001) preserves the idea of preferential attachment, though its description is much more complex, and the power law has been shown for degrees at most $N^{1/15}$, with $\gamma = 3$.

On a theoretical level, some other abstract definitions of scale-free have been proposed. For example, Li et al. (2005) offered a potentially more precise "scale-free metric". Let $G = (V, E)$ be a simple graph, let $s(G) = \sum_{uv \in E} d(u)d(v)$, and let $S(G) = s(G)/s_{\max}$, where $s_{\max}$ is the maximum value of $s(H)$ among simple graphs $H$ on the same vertex set $V$ with degree distribution identical to that of $G$. The quantity $S(G)$ is a metric between 0 and 1, where a graph $G$ with small $S(G)$ is "scale-rich", and $G$ with $S(G)$ close to 1 is "scale-free". Note that $s(G)$ is maximized when high-degree nodes are connected to other high-degree nodes, and $S(G)$ captures the notion of self-similarity implied in the name "scale-free".
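Computing $s(G)$ from an edge list is immediate; finding $s_{\max}$, by contrast, requires optimizing over degree-preserving rewirings. A small sketch of the easy half:

from collections import Counter

def s_metric(edges):
    """s(G) = sum over edges uv of d(u) * d(v)."""
    deg = Counter(v for e in edges for v in e)
    return sum(deg[u] * deg[v] for u, v in edges)

print(s_metric([(0, 1), (1, 2), (2, 3)]))   # path P_4: 1*2 + 2*2 + 2*1 = 8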

Some properties are often listed as the characteristics of scale-free networks, which are as follows.

• Power-law degree distribution;

• Generated by certain random processes with preferential attachment;

• Highly connected hubs that hold the network together, with the "robust yet fragile" feature of error tolerance: the network is robust when attacked by removing some nodes randomly, and fragile when some hubs are removed deliberately;


• Generic in the sense of being preserved under random degree-preserving rewiring;

• Self-similar;

• Universal in the sense of not depending on domain-specific details.

5.6 Network Structure

As pointed out by Newman (2003), the research on networks may provide new insight into the study of complex systems. Networks have many notable properties, such as the small-world property, the scale-free property, and the community structure property, and the links between two objects usually display diversity.

By collecting data from mobile phones, Eagle, Pentland and Lazer (2009) found that such data have the potential to provide insight into the relational dynamics of individuals, and allow the prediction of individual-level outcomes such as job satisfaction.

The concept of contagion has expanded from its original grounding in epidemic disease to describe many processes that spread across networks, such as fads, political opinions, the adoption of new technologies, and financial decisions; see, e.g., R. Pastor-Satorras and A. Vespignani (2001) and M. Newman, D. Watts and S. Strogatz (2002).

In traditional models of social contagion, the probability that an individual is affected by the contagion grows monotonically with the size of the neighborhood. By analyzing the growth of Facebook, Ugander, Backstrom, Marlow and Kleinberg (2012) found that the probability of contagion is tightly controlled by the number of connected components in an individual's neighborhood, rather than by the actual size of the neighborhood.

A crucial task in the analysis of on-line social-networking systems is to identify important people linked by strong social ties. Drawing data from e-mail, Kossinets and Watts (2006) developed a method of analyzing and estimating tie strength in on-line domains, in which the key structure is embeddedness: the number $|N(u) \cap N(v)|$ of mutual friends of two people $u$ and $v$, a quantity that typically increases with tie strength.


Embeddedness is not necessarily the most appropriate measure for characterizing particular types of strong ties. Backstrom and Kleinberg (2014) proposed a network-based characterization for intimate relationships, those involving spouses or romantic partners. Using data from a large sample of Facebook users, they tried to recognize these relationships with high accuracy. They found that embeddedness is in fact a comparatively weak means of characterizing romantic relations, and that an alternative network measure that they term dispersion is significantly more effective. Roughly, a link between two people has high dispersion when their mutual friends are not well connected to one another. Their research has an important contingent nature: given that a user has declared a relationship partner, they want to understand how effectively they can find the partner.

Note that the links to a person's relationship partner or other closest friends may have lower embeddedness, but they often involve mutual neighbors from several foci, reflecting the fact that the social orbits of these close friends are not bounded within any one focus; consider, for example, a husband who knows several of his wife's co-workers, family members, and former classmates, even though these people belong to different foci and do not know each other. Thus, Backstrom and Kleinberg proposed the following definitions.

For a network G = (V,E) and a pair of nodes u and v, denote by C_uv = N(u) ∩ N(v) the set of mutual friends of u and v, and set c_uv = |C_uv|. Let d(s,t,G) be the graph-theoretic distance between s and t in G, so that d(s,t,C_uv) is the distance within the subgraph induced by C_uv. For distinct s and t in C_uv, define
\[
d_{uv}(s,t) =
\begin{cases}
1, & d(s,t,C_{uv}) \ge 3,\\
0, & d(s,t,C_{uv}) \le 2.
\end{cases}
\]

Then the absolute dispersion of u and v is defined as
\[
\mathrm{disp}(u,v) = \sum_{s,t \in C_{uv},\, s \neq t} d_{uv}(s,t).
\]

Note that disp(u, v) depends on both C_uv and d(s, t, C_uv). Define
\[
\mathrm{norm}(u,v) = \frac{\mathrm{disp}(u,v)}{c_{uv}},
\]


which is called the normalized dispersion. Predicting u's partner to be the individual v maximizing norm(u, v) gives the correct answer in 48.0% of all instances.
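The definitions above translate directly into code. The following Python sketch is our illustration, not Backstrom and Kleinberg's implementation; it reads d(s, t, C_uv) as the breadth-first-search distance within the subgraph induced by C_uv, treating unreachable pairs as distance ∞ ≥ 3:

from collections import deque
from itertools import combinations

def mutual(G, u, v):
    # C_uv = N(u) ∩ N(v), the set of mutual friends of u and v.
    return G[u] & G[v]

def dist_in_subgraph(G, nodes, s, t):
    # BFS distance from s to t in the subgraph induced by `nodes`;
    # returns infinity when t is unreachable from s inside `nodes`.
    seen, queue = {s}, deque([(s, 0)])
    while queue:
        x, d = queue.popleft()
        if x == t:
            return d
        for y in G[x] & nodes:
            if y not in seen:
                seen.add(y)
                queue.append((y, d + 1))
    return float("inf")

def dispersion(G, u, v):
    # disp(u, v): number of unordered pairs {s, t} of mutual friends
    # lying at distance >= 3 in the subgraph induced by C_uv.
    C = mutual(G, u, v)
    return sum(1 for s, t in combinations(C, 2)
               if dist_in_subgraph(G, C, s, t) >= 3)

def norm_dispersion(G, u, v):
    # norm(u, v) = disp(u, v) / c_uv (taken as 0 when C_uv is empty).
    C = mutual(G, u, v)
    return dispersion(G, u, v) / len(C) if C else 0.0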

There are two ways to strengthen normalized dispersion that lead to increased performance. The first is to rank the pairs u and v by a function of the form

\[
\frac{(\mathrm{disp}(u,v) + b)^{\alpha}}{c_{uv} + c}.
\]

Searching over choices of α, b and c leads to a maximum performance of 50.5% at

α = 0.61, b = 0, c = 5.
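With these fitted values the strengthened score is a one-line formula; the sketch below (function names rank_score and predict_partner are ours) reuses mutual and dispersion from the earlier sketch:

def rank_score(G, u, v, alpha=0.61, b=0.0, c=5.0):
    # Strengthened score (disp(u, v) + b)^alpha / (c_uv + c),
    # shown at the reported optimum alpha = 0.61, b = 0, c = 5.
    return (dispersion(G, u, v) + b) ** alpha / (len(mutual(G, u, v)) + c)

def predict_partner(G, u):
    # Predict u's partner as the neighbor v maximizing the score.
    return max(G[u], key=lambda v: rank_score(G, u, v))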

The second way is to apply the idea of dispersion recursively. For a fixed node u, first set x_v = 1 for all neighbors v of u. Then iteratively update each x_v by
\[
x_v \leftarrow \frac{\displaystyle\sum_{w \in C_{uv}} x_w^2 + 2\sum_{s,t \in C_{uv}} d_{uv}(s,t)\, x_s x_t}{c_{uv}}.
\]

Note that after the first iteration, x_v = 1 + 2 · norm(u, v), and hence ranking nodes by x_v after the first iteration is equivalent to ranking nodes by norm(u, v). Backstrom and Kleinberg found that the highest performance comes from ranking nodes by the values of x_v after the third iteration, and they call such x_v the recursive dispersion. The performance of embeddedness and recursive dispersion for romantic relationships is 24.7% and 50.6%, respectively; for (married) spouses it is 32.1% and 60.7%, respectively.
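A sketch of this recursion (again ours, building on dist_in_subgraph from the earlier sketch; the initialization x_v = 1 is inferred from the stated value 1 + 2 · norm(u, v) after one iteration):

from itertools import combinations

def recursive_dispersion(G, u, iterations=3):
    # Iterate x_v <- (sum_w x_w^2 + 2 sum_{s,t} d_uv(s,t) x_s x_t) / c_uv
    # over the neighbors v of u; peak accuracy is reported after 3 iterations.
    x = {v: 1.0 for v in G[u]}  # x_v = 1 initially
    for _ in range(iterations):
        new_x = {}
        for v in G[u]:
            C = G[u] & G[v]  # C_uv; every w in C_uv is also a neighbor of u
            if not C:
                new_x[v] = 0.0
                continue
            squares = sum(x[w] ** 2 for w in C)
            cross = sum(x[s] * x[t] for s, t in combinations(C, 2)
                        if dist_in_subgraph(G, C, s, t) >= 3)
            new_x[v] = (squares + 2 * cross) / len(C)
        x = new_x
    return x  # rank the neighbors of u by x_v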

5.7 References

R. Albert, H. Jeong and A.L. Barabasi, Internet: Diameter of the World-Wide Web, Nature, 401 (6749) (1999), 130-131.

L. Backstrom and J. Kleinberg, Romantic partnerships and the dispersion of social ties: A network analysis of relationship status on Facebook, Proc. 17th ACM Conference on Computer Supported Cooperative Work and Social Computing, 2014.


A. Barabasi and R. Albert, Emergence of scaling in random networks, Science, 286 (5439) (1999), 509-512.

B. Bollobas, O. Riordan, J. Spencer and G. Tusnady, The degree sequence of a scale-free random graph process, Random Struct. Algor., 18 (2001), 279-290.

S. Dorogovtsev and J. Mendes, Evolution of networks, Advances in Physics, 51 (4) (2002), 1079.

N. Eagle, A. Pentland and D. Lazer, Inferring friendship network structure by using mobile phone data, Proc. Natl. Acad. Sci. USA, 106 (36) (2009), 15274-15278.

G. Kossinets and D. Watts, Empirical analysis of an evolving social network, Science, 311 (2006), 88-90.

L. Li, D. Alderson, J. Doyle and W. Willinger, Towards a theory of scale-free graphs: Definitions, properties and implications, Internet Math., 2 (4) (2005), 431-523.

S. Milgram, The small world problem, Psychology Today, 2 (1967), 60-67.

M. Newman, The structure and function of complex networks, SIAM Review, 45 (2003), 167-256.

M. Newman, D. Watts and S. Strogatz, Random graph models of social networks, Proc. Natl. Acad. Sci. USA, 99 (Suppl 1) (2002), 2566-2572.

R. Pastor-Satorras and A. Vespignani, Epidemic spreading in scale-free networks, Phys. Rev. Lett., 86 (2001), 3200-3203.

D. Price, Networks of scientific papers, Science, 149 (3683) (1965), 510-515.

J. Ugander, L. Backstrom, C. Marlow and J. Kleinberg, Structural diversity in social contagion, Proc. Natl. Acad. Sci. USA, 109 (16) (2012), 5962-5966.

D. Watts and S. Strogatz, Collective dynamics of 'small-world' networks, Nature, 393 (6684) (1998), 440-442.