05-smallworlds - stanford universitysnap.stanford.edu › class › cs224w-2015 › slides ›...

Post on 26-Jun-2020

1 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Small world networks

CS 224W

Outline

¤ Small world phenomenon¤ Milgram’s small world experiment

¤ Local structure¤ clustering coefficient¤ motifs

¤ Small world network models:¤ Watts & Strogatz (clustering & short paths)¤ Kleinberg (geographical)¤ Kleinberg, Watts/Dodds/Newman (hierarchical)

¤ Small world networks: why do they arise?

NE

MA

Small world phenomenon:Milgram’s experiment

¤ “Six degrees of separation”

Instructions:Given a target individual (stockbroker in Boston), pass the message to a person you correspond with who is “closest” to the target.

Milgram’s experiment

Outcome:

20% of initiated chains reached targetaverage chain length = 6.5

email experiment Dodds, Muhamad, Watts, Science 301, (2003)

(optional reading)

•18 targets•13 different countries

•60,000+ participants•24,163 message chains •384 reached their targets•average path length 4.0

Source: NASA, U.S. Government;; http://visibleearth.nasa.gov/view_rec.php?id=2429

Milgram’s experiment repeated

Interpreting Milgram’s experiment

n Is 6 is a surprising number?n In the 1960s? Today? Why?

n Pool and Kochen in (1978 established that the average person has between 500 and 1500 acquaintances)

Quiz Q:

¤ Ignore for the time being the fact that many of your friends’ friends are your friends as well. If everyone has 500 friends, the average person would have how many friends of friends?¤ 500¤ 1,000¤ 5,000¤ 250,000

Quiz Q:

¤With an average degree of 500, a node in a random network would have this many friends-of-friends-of-friends (3rd

degree neighbors):¤ 5,000¤ 500,000¤ 1,000,000¤ 125,000,000

Interpreting Milgram’s experiment

n Is 6 is a surprising number?n In the 1960s? Today? Why?

n If social networks were random… ?n Pool and Kochen (1978) -­ ~500-­1500 acquaintances/personn ~ 500 choices 1st linkn ~ 5002 = 250,000 potential 2nd degree neighborsn ~ 5003 = 125,000,000 potential 3rd degree neighbors

n If networks are completely cliquish?n all my friends’ friends are my friendsn what would happen?

Quiz Q:

¤ If the network were completely cliquish, that is all of your friends of friends were also directly your friends, what would be true:¤ (a) None of your friendship edges would be

part of a triangle (closed triad)¤ (b) It would be impossible to reach any node

outside the clique by following directed edges

¤ (c) Your shortest path to your friends’ friends would be 2

complete cliquishness

¤ If all your friends of friends were also your friends, you would be part of an isolated clique.

Uncompleted chains and distance

n Is 6 an accurate number?

n What bias is introduced by uncompleted chains?n are longer or shorter chains more likely to be completed?

average

95 % confidence interval

probability of passing on message

position in chain

Source: An Experimental Study of Search in Global Social Networks: Peter Sheridan Dodds, Roby Muhamad, and Duncan J. Watts (8 August 2003); Science 301 (5634), 827.

Attrition

Quiz Q:

n if each intermediate person in the chain has 0.5 probability of passing the letter on, what is the likelihood of a chain being completedn of length 2?n of length 5?

chain of length 2sends for sure receives

passes on with probability 0.5

Quiz Q:

n if each intermediate person in the chain has 0.5 probability of passing the letter on, what is the likelihood of a chain of length 5 being completed¤ (a) ½¤ (b) ¼¤ (c) 1/8¤ (d) 1/16

‘recovered’histogram of path lengths

Source: An Experimental Study of Search in Global Social Networks: Peter Sheridan Dodds, Roby Muhamad, and Duncan J. Watts (8 August 2003); Science 301 (5634), 827.

Estimating the true distance

inter-­countryintra-­country

observed chain lengths

¤Is 6 an accurate number?

¤Do people find the shortest paths?¤ Killworth, McCarty ,Bernard, & House (2005):¤ less than optimal choice for next link in chain is made ½ of the time

Navigation and accuracy

Small worlds & networking

What does it mean to be 1, 2, 3 hops apart on Facebook, Twitter, LinkedIn, Google Plus?

Transitivity, triadic closure, clustering

¤Transitivity: ¤ if A is connected to B and B is connected to C

what is the probability that A is connected to C?

¤ my friends’ friends are likely to be my friends

A

B

C?

Clustering

C =

¤Global clustering coefficient3 x number of triangles in the graphnumber of connected triples of vertices

3 x number of triangles in the graph

number of connected triples

Local clustering coefficient (Watts&Strogatz 1998)

¤For a vertex i¤ The fraction pairs of neighbors of the node that

are themselves connected¤ Let ni be the number of neighbors of vertex i

Ci =

Ci directed =

Ci undirected =

# of connections between i’s neighborsmax # of possible connections between i’s neighbors

# directed connections between i’s neighborsni * (ni -1)

# undirected connections between i’s neighborsni * (ni -1)/2

Local clustering coefficient (Watts&Strogatz 1998)

¤Average over all n vertices

∑=i

iCnC 1

i

ni = 4max number of connections:4*3/2 = 63 connections presentCi = 3/6 = 0.5

link absentlink present

Quiz Q:

¤The clustering coefficient for vertex i is:

i

(a)0(b)1/3(c)1/2(d)2/3

Explanation

¤ni = 3

¤there are 2 connections present out of max of 3 possible

¤Ci = 2/3

i

Small world phenomenon:

high clustering

low average shortest path

beyond social networks

)ln(network Nl ≈

graph randomnetwork CC >>

what other networks can you think of with these characteristics?

Comparison with “random graph” used to determine whether real-world network is “small world”

Network size av. shortest path

Shortest path in fitted random graph

Clustering(averaged over vertices)

Clustering in random graph

Film actors 225,226 3.65 2.99 0.79 0.00027

MEDLINE co-­authorship

1,520,251 4.6 4.91 0.56 1.8 x 10-­4

E.Coli substrate graph

282 2.9 3.04 0.32 0.026

C.Elegans 282 2.65 2.25 0.28 0.05

Reconciling two observations:• High clustering: my friends’ friends tend to be my friends• Short average paths

Small world phenomenon:Watts/Strogatz model

Source: Watts, D.J., Strogatz, S.H.(1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.

n As in many network generating algorithmsn Disallow self-edgesn Disallow multiple edges

Select a fraction p of edgesReposition on of their endpoints

Add a fraction p of additionaledges leaving underlying latticeintact

Watts-­Strogatz model:Generating small world graphs

Source: Watts, D.J., Strogatz, S.H.(1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.

¤Each node has K>=4 nearest neighbors (local)

¤tunable: vary the probability p of rewiring any given edge

¤small p: regular lattice

¤ large p: classical random graph

Watts-­Strogatz model:Generating small world graphs

Quiz question:

¤ Which of the following is a result of a higher rewiring probability?

(a) Left (b) Right (c) insufficient information

What happens in between?

¤Small shortest path means low clustering?

¤Large shortest path means high clustering?

¤Through numerical simulation¤ As we increase p from 0 to 1

¤ Fast decrease of mean distance¤ Slow decrease in clustering

Clust coeff. and ASP as rewiring increases

10% of links rewired1% of links rewired

Source: Watts, D.J., Strogatz, S.H.(1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.

Trying this with NetLogohttp://web.stanford.edu/class/cs224w/NetLogo/SmallWorldWS.nlogo

WS model clustering coefficient

¤ The probability that a connected triple stays connected after rewiring¤ probability that none of the 3 edges were rewired (1-­p)3¤ probability that edges were rewired back to each other very small, can ignore

¤ Clustering coefficient = C(p) = C(p=0)*(1-­p)3

0.2 0.4 0.6 0.8 1

0.2

0.4

0.6

0.8

1

C(p)/C(0)

pSource: Watts, D.J., Strogatz, S.H.(1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.

Quiz Q

nWhich of the following is a descriptionmatching a small-­world network?

(a)Its average shortest path is close to that of an Erdos-Renyi graph

(b)It has many closed triads(c)It has a high clustering coefficient(d)It has a short average path length

WS Model: What’s missing?

n Long range links not as likely as short range ones

nHierarchical structure / groupsnHubs

Ties and geography

“The geographic movement of the [message] from Nebraska to Massachusetts is striking. There is a progressive closing in on the target area as each new person is added to the chain”

S.Milgram ‘The small world problem’, Psychology Today 1,61,1967

NE

MA

nodes are placed on a lattice andconnect to nearest neighbors

additional links placed withp(link between u and v) = (distance(u,v))-­r

Kleinberg’s geographical small world model

Source: Kleinberg, ‘The Small World Phenomenon, An Algorithmic Perspective’ (Nature 2000).

exponent that will determine navigability

NetLogo demo

¤how does the probability of long-­range links affect search?

http://web.stanford.edu/class/cs224w/NetLogo/SmallWorldSearch.nlogo

When r=0, links are randomly distributed, ASP ~ log(n), n size of gridWhen r=0, any decentralized algorithm is at least a0n2/3

geographical search when network lacks locality

When r<2, expected time at least αrn(2-r)/3

0~p p

Overly localized links on a latticeWhen r>2 expected search time ~ N(r-2)/(r-1)

41~pd

When r=2, expected time of a DA is at most C (log N)2

21~pd

Just the right balance

Navigability

T

S

Rλ2|R|<|R’|<λ|R|

k = c log2n calculate probability that s fails to have a link in R’

R’

Quiz Q:

¤ What is true about a network where the probability of a tie falls off as distance-2

(a)Large networks cannot be navigated(b)A simple greedy strategy (pass the message to the

neighbor who is closest to the target) is sufficient(c)There are fewer long range ties than short range

ones(d)If the number of nodes doubles, the average

shortest path will be twice as long

Origins of small worlds:group affiliations

Source: Kleinberg, ‘Small-­World Phenomena and the Dynamics of Information’ NIPS 14, 2001.

Hierarchical network models:

Individuals classified into a hierarchy, hij = height of the least common ancestor.

Group structure models:Individuals belong to nested groupsq = size of smallest group that v,w belong to

f(q) ~ q-­α

ijhijp b α−:

h b=3

e.g. state-­county-­city-­neighborhoodindustry-­corporation-­division-­group

hierarchical small-world models: Kleinberg

Watts, Dodds, Newman (Science, 2001)individuals belong to hierarchically nested groups

multiple independent hierarchies h=1,2,..,H coexist corresponding to occupation, geography, hobbies, religion…

pij ~ exp(-α x)

Source: Identity and Search in Social Networks: Duncan J. Watts, Peter Sheridan Dodds, and M. E. J. Newman; Science 17 May 2002 296: 1302-1305. < http://arxiv.org/abs/cond-mat/0205383v1 >

hierarchical small-world models: WDN

Navigability and search strategy:Reverse small world experiment

¤ Killworth & Bernard (1978):¤ Given hypothetical targets (name, occupation, location, hobbies, religion…) participants choose an acquaintance for each target

¤ based on (most often) occupation, geography¤ only 7% because they “know a lot of people”¤ Simple greedy algorithm: most similar acquaintance¤ two-­step strategy rare

Source: 1978 Peter D. Killworth and H. Russell Bernard. The Reverse Small World Experiment Social Networks 1:159–92.

Successful chains disproportionately used• weak ties (Granovetter)• professional ties (34% vs. 13%)• ties originating at work/college• target's work (65% vs. 40%)

. . . and disproportionately avoided• hubs (8% vs. 1%) (+ no evidence of funnels)• family/friendship ties (60% vs. 83%)

Strategy: Geography -> Work

Navigability and search strategy:Small world experiment @ Columbia

MotivationPower-law (PL) networks, social and P2P

Analysis of scaling of search strategies in PL networks

Simulationartificial power-law topologies, real Gnutella networks

2

Search in power-law networks

Mary

Bob

Jane

Who couldintroduce me toRichard Gere?

How do we search?

AT&T Call Graph

Aiello et al. STOC ‘00

# o

f tel

epho

ne n

umbe

rsfro

m w

hich

cal

ls w

ere

mad

e

# of telephone numbers called

100 101

100

101

102

number of neighbors

prop

ortio

n of

nod

es datapower-law fit τ = 2.07

Gnutella network

power-law link distribution

summer 2000,data provided by Clip2

Preferential attachment model

Nodes join at different times

The more connections a node has, the more likely it is to acquirenew connections

Growth process produces power-law network

host cache

pingping

file sharing w/o a central index

queries broadcast to every node within radius ttl⇒ as network grows, encounter a bandwidth barrier (dial up modems cannot keep up with query traffic, fragmenting the network)

Gnutella and the bandwidth barrier

Clip 2 reportGnutella: To the Bandwidth Barrier and Beyondhttp://www.clip2.com/gnutella.html#q17

16

54

6367

2

94

number ofnodes found

power-law graph

93

number ofnodes found

13

711

1519

Poisson graph

Search with knowledge of 2nd neighbors

Outline of search strategy

pass query onto only one neighbor at each step

requires that nodes sign query- avoid passing message onto a node twice

requires knowledge of one’s neighbors degree- pass to the highest degree node

requires knowledge of one’s neighbors neighbors- route to 2nd degree neighbors

OPTIONS

Generating functions

¤M.E.J. Newman, S.H. Strogatz, and D.J. Watts

¤‘Random graphs with arbitrary degree distributions and their applications’, PRE, cond-mat/0007235

¤Generating functions for degree distributions

¤Useful for computing moments of degree distribution,

¤component sizes, and average path lengths

=

=∑00

( ) kk

kG x p x

Introducing cutoffs

< −max 1k N a node cannot have more connections than there are other nodes

This is important for exponents close to 2

τ τ

∞ ∞

= =∑ ∑1 1

1 1kp Cx π=2 2

6C

τ∞

∑> = =1000

( 1000, 2) ~ 0.001kp k pProbability that none of the nodes in a 1,000 node graph has 1000 or more neighbors:

τ− > = 1000(1 ( 1000, 2)) ~ 0.36p kwithout a cutoff, for τ = 2have > 50% chance of observing a node with more neighbors than there are nodes

for τ = 2.1, have a 25% chance

# of sites linking to the site

prop

ortio

n of

site

s w/ s

o m

any

links

1000

Selecting from a variety of cutoffs

Nk <max1.

2. κτ /kk eCkp −−= Newman et al.

3.⎩⎨⎧

=−

0

τCkpk

( ) τ1CNk <

otherwise

Generating Function

( )( )

kCN

kxkCxG ∑

=

−=τ

τ

1

10

Aiello et al.

1 million websites (~ 1997)

N

Aiello’s ‘conservative’ vs. Havlin’s ‘natural’ cutoff

τ

τ

− −

=

= 1

1

* 1

~

kN p

Ck N

k N

cutoff where expectednumber of nodes of degreek is 1

k

n(k)

k

n(k)

1

1

cutoff so thatexpected number of nodesof degree > k is 1

τ

τ

τ

=

∞− −

=

− −

=∑

max

max

1

1 1max

11

max

* 1

~

~

~

kk k

k k

N p

ck N

k N

k N

The imposed cutoff can have a dramaticeffect on the properties of the graphdegrees drawn at random, for τ = 2, and N = 1000

=

=∑00

( ) kk

kG x p x

is the probability that a randomlychosen vertex has degree k

~kp k τ−

is a generating function

'0(1)k

kk kp G< >= =∑ is the expected degree of a

randomly chosen vertex

( ) ( )( )

'0

1 '0 1

G xG x

G= is the distribution of remaining

outgoing edges following and edge

assuming neighbors don’t share edges

( ) ( )11 '1

'02 GGz = is the expected number of second

degree neighbors

22

2

2 2

2

1

11

Generating functions for degree distributionsRandom graphs with arbitrary degree distributions and their applicationsby Newman, Strogatz & Watts

search with knowledge of first neighbors

( )

max

max

maxmax

max

max

01

' 1 10 0

1

' 1 1 20 max

1 1'

' 1 101 ' '

10 0

1 2'

203 2

' max1 '

0

( )

( ) ( )

1(1) 12

( )( )(1) (1)

( 1)(1)

( 2) 21(1)(1)

kk

kk

kk

kk

kk

G x c k x

G x G x c k xx

G k c k k dk k

G x cG x k xG G x

c k k xG

kGG

τ

τ

τ τ τ

τ

τ

τ τ

τ

τ

− −

− − −

− −

− −

− −

=

∂= =∂

=< >= = −−

∂= =

= −

− −=

∑ ∫

:

2max( 1) (3 )

( 2)(3 )k ττ τ

τ τ

−− + −

− −

Generating function with cutoff

Average degree of vertex

constant in Nfor 2<τ<3, and kmax~Na, decreaseswith N

Average number of neighborsfollowing an edge

search with knowledge of first neighbors (cont’d)

τ ττ

τ

ττ τ

− −−

−= =

− − −: :

3 3' 3max max

1 1 max' 20 max

1 2(1)(1) (3 ) 1 (3 )B

k kz G kG k

In the limit τ->2, ' max1

max

(1)log( )kGk

:

Let’s for the moment ignore the fact that as we do a random walk, we encounter neighborsthat we’ve seen before

s = number of steps =1B

Nz

Search time with different cutoffs

: 0.18(2.1)s N

τ τττ τ

−−

−= = < <: 3 3

m

2

ax

( ) ,2 3N Nsk N

NIf kmax = N,

τ= =: max

max

log(log( ) , 2)N ks Nk

τττ

τττ τ

−−−

−−= = < <:221

33max 1

( ) ,2 3NN Nsk

NIf kmax = N1/(τ-1),

=: max

max

log( )(2) log( )N ksk

N

: 0.1(2.1) Ns

2 3 /133max

,2 3( )

N Ns Nk

N

ττ

ττ

τ−−

= = < <:If kmax = N1/τ,

So the best we can do is for exponents close to 2 N

2nd neighbor random walk, ignoring overlap:

( ) 15.0~1.2, NNS =τ( ) ( )ττ 213~, −NNS

( )

( )

= 2

2

~

s B

B

n z NNS

z N

search with knowledge of first neighbors (cont’d)

τ

τ

ττ

−=

⎡ ⎤∂ −⎡ ⎤ ⎡ ⎤= = = ⎢ ⎥⎢ ⎥ ⎣ ⎦∂ − −⎣ ⎦ ⎣ ⎦

232' max2 1 1 1 2

1 max

2( ( )) (1)1 (3 )B

x

kz G G x Gx k

Following the degree sequence

a ~ s = # of steps taken

( ) 1.0deg 1.2, NNS ==τ

2nd neighbors, ignoring overlap:

Go to highest degree node, then next highest, … etc.

τ τ− −

−= ∫

max

max

1 11 max~

k

D k az Nk dk Nak

τ

τ τ

− −

max

max

' 2(2 )1 1

2( 2) 2 4 /

( ) ~

~ ~Dz G x Nak

s k N

0 10 20 30 40 50 60 70 80 90 100

1

τ = 2.00τ = 2.25τ = 2.50τ = 2.75τ = 3.00τ = 3.25τ = 3.50τ = 3.75

degree of node

degr

ee o

f nei

ghbo

r -1

degr

ee o

f nod

e

2

20

10

5

Ratio of the degree of a node to the expected degree of its highest degree neighbor for 10,000 node power-law graphs of varying exponents

Actor collaboration graph(imdb database)

τ ~ 2.0-2.2

Exponents τ close to 2 required to search effectively

World Wide Web, τ ~ 2.0-2.3,high degree nodes: directories, search engines

Social networks, AT&T call graph τ ~ 2.1

Gnutella

100 101 102 103 104100

101

102

103

104

105

number of costars

num

ber o

f act

ors/

actre

sses

actors, τ = 2actresses, τ = 2.1

15

50

18 17

8

109

6

Following the degree sequence

Complications

¤Should not visit same node more than once

¤Many neighbors of current node being visited were also neighbors of previously visited nodes, and there is a bias toward high degree nodes being ‘seen’ over and over again

0 100 200 300 400 500 6000

5

10

15

20

25

30

step

degr

ee o

f nod

e

not visitedvisitedneighborsvisited

Status and degree of node visited

1

10-2

0.1

1

step

prop

ortio

n of

nod

es fo

und

at st

ep

random walk

10 102 103 104 105 106

10-3

10-4

0 20 40 60 80 100

0.2

0.4

0.6

0.8

1

step

cum

ulat

ive

node

s fou

nd a

t ste

p

random walkdegree sequence

seeking high degree nodesspeeds up the search process

about 50% of a 10,000 node graphis explored in the first 12 steps

Progress of exploration in a 10,000 node graph knowing2nd degree neighbors

12

degree sequence

101 102 103 104 105100

101

102

103

size of graph

cove

rtim

e fo

r hal

f the

nod

es

random walkα = 0.37 fitdegree sequenceα = 0.24 fit

Scaling of search time with size of graph

Comparison with a Poisson graph

100

101

102

103

100

101

102

degr

ee o

f cur

rent

nod

e

step

Poissonpower-law

101

102

104

106

100

101

102

103

104

105

number of nodes in graph

cove

r tim

e fo

r 1/2

of g

raph

constant av. deg. = 3.4γ = 1.0 fit

( ) ( )10

−= xzexG

( ) ( ) ( )xGxGzxxG 001 =ʹ′=

expected degree and expecteddegree following a link are equal

scaling is linear

0 20 40 60 80 1000

0.2

0.4

0.6

0.8

1

step

cum

ulat

ive

node

s fou

nd a

t ste

p

high degree seeking 1st neighborshigh degree seeking 2nd neighbors

50% of the files in a 700 node network can be found in < 8 steps

Gnutella network

Expander graphsTime permitting

Def: Random k-Regular Graphs

¤We need to define two concepts

¤1) Define: Random k-Regular graph¤ Assume each node has k spokes (half-edges)¤ Randomly pair them up!

¤2) Define: Expansion¤ Graph G(V, E) has expansion α:

if∀ S ⊆ V: #edges leaving S ≥ α⋅ min(|S|,|V\S|)

¤ Or equivalently:

Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 81

|)\||,min(|#min SVS

SleavingedgesVS⊆

S V \ S

Expansion: Intuition

82

S nodes ≥ α·S edges

S’ nodes ≥ α·S’ edges

(A big) graph with “good” expansion|)\||,min(|

#min SVSSleavingedges

VS⊆=α

Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Expansion: k-Regular Graphs

¤ k-regular graph (every node has degree k):¤ Expansion is at most k (when S is a single node)

¤ Is there a graph on n nodes (n→∞), of fixed max deg. k, so that expansion α remains const?

Examples:¤ n×n grid: k=4: α =2n/(n2/4)→0

(S=n/2 × n/2 square in the center)

¤ Complete binary tree:α →0 for |S|=(n/2)-1

¤ Fact: For a random 3-regular graph on n nodes, there is some const α (α >0, independent. of n) such that w.h.p. the expansion of the graph is ≥ α (In fact, α=d/2 as d→∞)

83

S

S

|)\||,min(|#min SVS

SleavingedgesVS⊆

Make this into 6x6 grid!

Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Diameter of 3-Regular Rnd. Graph

¤Fact: In a graph on n nodes with expansion α, for all pairs of nodes s and t there is a path of O((log n) / α) edges connecting them.

¤Proof:¤ Proof strategy:

¤ We want to show that from any node s there is a path of length O((log n)/α) to any other node t

¤ Let Sj be a set of all nodes found within j steps of BFS from s.

¤ How does Sj increase as a function of j?

84

s

S0

S1

S2

Make this into a 3-ary tree

Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Diameter of 3-Regular Rnd. Graph

¤Proof (continued):¤ Let Sj be a set of all nodes found

within j steps of BFS from s. ¤ We want to relate Sj and Sj+1

85

=+≥+ kS

SS jjj

α1

At most k edges “collide” at a node

Expansion

1

01 11+

+ ⎟⎠

⎞⎜⎝

⎛ +=⎟⎠

⎞⎜⎝

⎛ +≥j

jj kS

kSS αα

s

S0

S1

S2

|Sj|nodes

At least α|Sj| edges

|Sj+1|nodes

Each ofdegree k

where S0=1

Make this into a 3-ary tree

Diameter of 3-Regular Rnd. Graph

¤Proof (continued):¤ In how many steps of BFS

do we reach >n/2 nodes?¤ Need j so that:

¤ Let’s set:¤ Then:

¤ In 2k/α·log n steps |Sj| grows to Θ(n). So, the diameter of G is O(log(n)/ α)

86

s21 nk

Sj

j ≥⎟⎠

⎞⎜⎝

⎛ +=α

t

αnkj 2log

=

221 2

2

log

log

nnk

n

nk

>=≥⎟⎠

⎞⎜⎝

⎛ +αα

n

nk

k2

2

log

log

21 ≥⎟⎠

⎞⎜⎝

⎛ +αα

Claim:

Remember n>0, α ≤ k then:

In log(n) steps, we reach >n/2 nodes

In log(n) steps, wereach >n/2 nodes

( )

nnnx

nn

ex

xk

22

2

22

logloglog

loglog11

211 and

: then 0 if

211:k if

>=⎟⎠

⎞⎜⎝

⎛ +

∞→=→

=+=

αα

α

In j steps, we reach >n/2 nodes

In j steps, wereach >n/2 nodes ⇒ Diameter = 2·j

⇒ Diameter = 2 log(n)

x

x xe ⎟

⎞⎜⎝

⎛ +=∞→

11lim

Make this into a 3-ary tree

Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Summary

¤Small world phenomenon:¤ Local structure (e.g. clustering)¤ Short average shortest path

¤The Watts-Strogatz captures both

¤Other models create navigable small-world models

¤Power-law networks are navigable due to presence of hubs

top related