Connectivity and the Small World Overview Background: •de Pool and Kochen: •Random & Biased networks •Rapoport’s work on diffusion Travers and Milgram •Argument •Method •Watts •Argument •Findings •Methods: •Biased Networks •Reachability Curves •Calculating L and C

Upload: brenda-wade

Post on 25-Dec-2015


TRANSCRIPT

Page 1

Connectivity and the Small World
Overview

Background:
• de Pool and Kochen
• Random & Biased networks
• Rapoport’s work on diffusion

Travers and Milgram:
• Argument
• Method

Watts:
• Argument
• Findings

Methods:
• Biased Networks
• Reachability Curves
• Calculating L and C

Page 2

Connectivity and the Small World

The work started by asking the probability that any two people would know each other.

It was then extended to the probability that people could be connected through paths of 2, 3, …, k steps.

Linked to diffusion processes: If people can reach others, then their diseases can reach them as well, and we can use the structure of the network to model the disease.

The reachability structure was captured by comparing curves with a random network, which we will do later today.

Page 3

Connectivity and the Small World

Travers and Milgram’s work on the small world is responsible for the standard belief that “everyone is connected by a chain of about 6 steps.”

Two questions:

Given what we know about networks, what is the longest path (defined by handshakes) that separates any two people?

Is 6 steps a long distance or a short distance?

Page 4

Longest Possible Path: Two hermits on opposite sides of the country

About 12-13 steps.

[Diagram: two mirrored chains, one starting from the OH Hermit and one from the Mt. Hermit, each running Hermit, Store Owner, Truck Driver, Manager, Corporate Manager, Corporate President, Congress Rep.]

Page 5

What if everyone maximized structural holes?

If associates do not know each other, the number of people reached grows exponentially with each step, so the entire planet is reached quickly.
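The growth argument can be made concrete with a small sketch. It assumes every person has the same number of contacts and that no two contacts ever overlap (the "maximized structural holes" case). The function name, the population figure, and the contact count are illustrative assumptions, not values from the slides.

```python
def steps_to_reach(population: int, contacts: int) -> int:
    """Steps needed to reach `population` people when contacts never overlap.

    Assumption: everyone has exactly `contacts` ties, and apart from the tie
    leading back to whoever reached them, all of those ties lead to new people.
    """
    if contacts < 2:
        raise ValueError("need at least 2 contacts for the chain to keep growing")
    reached = 0          # people reached so far, excluding the starting person
    frontier = contacts  # people newly reached at step 1
    steps = 0
    while reached < population:
        steps += 1
        reached += frontier
        # each newly reached person passes the message to (contacts - 1) others
        frontier *= contacts - 1
    return steps


# Hypothetical inputs: 7 billion people, 100 acquaintances each.
print(steps_to_reach(7_000_000_000, 100))  # 5
```

With these assumed numbers the whole population is covered in 5 steps, which is the sense in which non-overlapping ties reach the planet quickly.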

Page 6

What if people know each other randomly?

Random graph theory shows us that we could reach people quite quickly if ties were random.

Page 7

[Figure: Random Reachability, by number of close friends. Percent contacted (0 to 100%) plotted against remove (0 to 15), with one curve each for Degree = 2, Degree = 3, and Degree = 4.]

Page 8

Milgram’s test: Send a packet from sets of randomly selected people to a stockbroker in Boston.

Experimental Setup:

Arbitrarily select people from 3 pools:
a) People in Boston
b) Random in Nebraska
c) Stockholders in Nebraska

Page 9

Milgram’s Findings: Distance to target person, by sending group.

Page 10

Most chains found their way through a small number of intermediaries.

What do these two findings tell us of the global structure of social relations?

Page 11

Milgram’s Findings: Length of completed chains

Page 12

Duncan Watts: Networks, Dynamics and the Small-World Phenomenon

Asks why we see the small world pattern and what implications it has for the dynamical properties of social systems.

His contribution is to show that globally significant changes can result from locally insignificant network change.

Page 13

Duncan Watts: Networks, Dynamics and the Small-World Phenomenon

Watts says there are 4 conditions that make the small world phenomenon interesting:

1) The network is large - O(Billions)
2) The network is sparse - people are connected to a small fraction of the total network
3) The network is decentralized - no single star (or small number of stars)
4) The network is highly clustered - most friendship circles are overlapping

Page 14

Duncan Watts: Networks, Dynamics and the Small-World Phenomenon

Formally, we can characterize a graph through 2 statistics.

1) The characteristic path length, L
The average length of the shortest paths connecting any two actors. (Note that this only works for connected graphs.)

2) The clustering coefficient, C
• Version 1: the average local density. That is, Cv = ego-network density for actor v, and C = ΣCv / n
• Version 2: the transitivity ratio. The number of closed triads divided by the number of closed and open triads.

A small world graph is any graph with a relatively small L and a relatively large C.
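Both statistics are short to compute on a small graph. The sketch below is a plain-Python illustration, not the method used in the slides. It assumes an undirected graph stored as a dict of neighbor sets, treats actors with fewer than two neighbors as having local density 0, and computes the transitivity version in its common form (closed two-paths over all two-paths), which counts each triangle three times.

```python
from collections import deque
from itertools import combinations


def characteristic_path_length(adj):
    """Average shortest-path length over all pairs (connected graphs only)."""
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != len(adj):
            raise ValueError("graph is not connected")
        total += sum(dist.values())
    n = len(adj)
    return total / (n * (n - 1))


def local_density_clustering(adj):
    """Version 1: mean ego-network density across actors."""
    densities = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            densities.append(0.0)  # assumed convention for degree < 2
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        densities.append(links / (k * (k - 1) / 2))
    return sum(densities) / len(densities)


def transitivity_ratio(adj):
    """Version 2: closed two-paths divided by all two-paths."""
    closed = total = 0
    for nbrs in adj.values():
        for a, b in combinations(nbrs, 2):
            total += 1
            if b in adj[a]:
                closed += 1
    return closed / total if total else 0.0


# Hypothetical example: a triangle (0, 1, 2) with a pendant actor 3 tied to 2.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(characteristic_path_length(example))
print(local_density_clustering(example))
print(transitivity_ratio(example))
```

On this toy graph the two versions of C differ (about 0.58 versus 0.6), a reminder to state which one is being reported.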

Page 15

The most clustered graph is Watts’s “Caveman” graph:

$$C_{caveman} \approx 1 - \frac{6}{k^2 - 1}$$

$$L_{caveman} \approx \frac{n}{2(k+1)}$$
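A quick way to get a feel for the caveman graph is to evaluate its two statistics over a range of degrees. The sketch assumes the approximations C ≈ 1 − 6/(k² − 1) and L ≈ n/(2(k + 1)); the function names and the chosen values of k are illustrative.

```python
def caveman_clustering(k: int) -> float:
    """Approximate clustering coefficient of a caveman graph with degree k."""
    return 1 - 6 / (k ** 2 - 1)


def caveman_path_length(n: int, k: int) -> float:
    """Approximate characteristic path length for n actors and degree k."""
    return n / (2 * (k + 1))


# Tabulate both statistics for n = 1000 at a few illustrative degrees.
for k in (5, 10, 50, 100):
    c = caveman_clustering(k)
    length = caveman_path_length(1000, k)
    print(f"k={k:>3}  C={c:.3f}  L={length:.1f}")
```

Clustering is close to 1 for all but the smallest degrees, while the path length stays long until k becomes a sizeable fraction of n.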

Page 16

Duncan Watts: Networks, Dynamics and the Small-World Phenomenon

[Figure: C and L as functions of k for a Caveman graph of n=1000. Clustering coefficient (axis 0 to 1.2) and characteristic path length (axis 0 to 140) plotted against degree k (0 to 120).]

Page 17

Compared to random graphs, C is large and L is long. The intuition, then, is that clustered graphs tend to have (relatively) long characteristic path lengths. But the small world phenomenon rests on just the opposite: high clustering and short path distances. How is this so?

Duncan Watts: Networks, Dynamics and the Small-World Phenomenon

Page 18

A model for pair formation as a function of mutual contacts.

Duncan Watts: Networks, Dynamics and the Small-World Phenomenon

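$$R_{i,j} = \begin{cases} 1, & m_{i,j} \ge k \\ \left[\dfrac{m_{i,j}}{k}\right]^{\alpha}(1 - p) + p, & k > m_{i,j} > 0 \\ p, & m_{i,j} = 0 \end{cases}$$

Using this equation produces networks that range from completely ordered to completely random. (m_{i,j} is the number of friends i and j have in common, p is a baseline probability of a tie, k is the average degree of the graph, and α is a tunable exponent that sets how strongly mutual friends raise the propensity for a tie.)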
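The piecewise rule is easy to explore numerically. The sketch below assumes the form used in Watts's model: propensity 1 once the number of mutual friends m reaches k, baseline p when there are none, and (m/k)^α (1 − p) + p in between, with α a tunable exponent. The function name and the sample values are illustrative.

```python
def tie_propensity(m: int, k: float, p: float, alpha: float) -> float:
    """Propensity for a tie given m mutual friends (assumed alpha-model form)."""
    if m >= k:
        return 1.0  # enough mutual friends: the tie is certain
    if m > 0:
        return (m / k) ** alpha * (1 - p) + p
    return p  # no mutual friends: baseline chance only


# One mutual friend out of k = 10, baseline p = 0.01, at three exponents.
for alpha in (0.01, 1.0, 20.0):
    print(alpha, round(tie_propensity(1, 10, 0.01, alpha), 4))
```

A very small α makes a single mutual friend nearly decisive (the ordered end of the range), while a very large α leaves the propensity at roughly p until m approaches k (the random end).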

Page 19

A model for pair formation as a function of mutual contacts.

Duncan Watts: Networks, Dynamics and the Small-World Phenomenon

Page 20

Duncan Watts: Networks, Dynamics and the Small-World Phenomenon

C is large and L is small = SW (small world) graphs

Page 21

Why does this work? The key is the fraction of shortcuts in the network.

In a highly clustered, ordered network, a single random connection will create a shortcut that lowers L dramatically.

Watts demonstrates that small world behavior occurs in graphs with only a small number of shortcuts.
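The effect of shortcuts can be checked on a toy ordered network. The sketch below builds a ring lattice (an assumption standing in for "a highly clustered, ordered network"), then adds one and two long-range ties and recomputes L each time. The shortcuts are fixed rather than random so the result is reproducible; all names and sizes are illustrative.

```python
from collections import deque


def ring_lattice(n, neighbors_each_side):
    """Ring of n actors, each tied to its nearest neighbors on both sides."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, neighbors_each_side + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs (connected graph)."""
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    n = len(adj)
    return total / (n * (n - 1))


def add_shortcut(adj, a, b):
    """Add an undirected tie between a and b."""
    adj[a].add(b)
    adj[b].add(a)


lattice = ring_lattice(200, 2)
l_ordered = average_path_length(lattice)
add_shortcut(lattice, 0, 100)   # first long-range tie
l_one = average_path_length(lattice)
add_shortcut(lattice, 50, 150)  # second long-range tie
l_two = average_path_length(lattice)
print(l_ordered, l_one, l_two)
```

Two ties out of 402 are a negligible change to local structure, yet every pair whose shortest route now runs through a shortcut gets closer, which is the mechanism the slide describes.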

Page 22

Empirical Examples

(o = observed, r = random)

1) Movie network (actors linked through movies): Lo/Lr = 1.22, Co/Cr = 2925

2) Western Power Grid: Lo/Lr = 1.50, Co/Cr = 16

3) C. elegans: Lo/Lr = 1.17, Co/Cr = 5.6

Page 23

What are the substantive implications? Return to the initial interest in connectivity: disease diffusion

1) Diseases move more slowly in highly clustered graphs (fig. 11) - not a new finding.

2) The dynamics are very non-linear -- with no clear pattern based on local connectivity. Implication: small local changes (shortcuts) can have dramatic global outcomes (disease diffusion)

Page 24

Duncan Watts: Networks, Dynamics and the Small-World Phenomenon

How do we know if an observed graph fits the SW model?

Random expectations: For basic one-mode networks (such as acquaintance nets), we can get approximate random values for L and C as:

Lrandom ~ ln(n) / ln(k)
Crandom ~ k / n

as k and n get large.

Note that Crandom essentially approaches zero as n increases with k held fixed. This formula uses the density-based measure of C.
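These approximations make the small-world check a two-line calculation. The sketch below computes the random expectations and the observed-to-random ratios; the function names and the observed values passed in are hypothetical, chosen only to show the arithmetic.

```python
import math


def random_expectations(n: int, k: float):
    """Approximate L and C for a random graph with n actors and mean degree k."""
    l_random = math.log(n) / math.log(k)
    c_random = k / n
    return l_random, c_random


def small_world_ratios(l_observed, c_observed, n, k):
    """Return (L observed / L random, C observed / C random)."""
    l_random, c_random = random_expectations(n, k)
    return l_observed / l_random, c_observed / c_random


# Hypothetical network: 1000 actors, mean degree 10, observed L = 3.6, C = 0.5.
l_ratio, c_ratio = small_world_ratios(3.6, 0.5, 1000, 10)
print(round(l_ratio, 2), round(c_ratio, 1))
```

A graph fits the small-world pattern when the L ratio is close to 1 while the C ratio is much larger than 1.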

Page 25

Duncan Watts: Networks, Dynamics and the Small-World Phenomenon

How do we know if an observed graph fits the SW model?

One problem with using the simple formulas for most extant data on large graphs is that the data result from people overlapping in groups/movies/publications, so some clustering necessarily results from the assignment to groups.

          G1  G2  G3  G4  G5
Amy        1   0   1   0   0
Billy      0   1   0   1   0
Charlie    0   1   0   1   0
Debbie     1   0   0   0   0
Elaine     1   0   1   0   1
Frank      0   1   0   1   0
George     0   1   0   1   0
. . . . LINES CUT . . . . .
William    0   1   0   0   0
Xavier     0   1   0   1   0
Yolanda    1   0   1   0   0
Zanfir     0   1   1   1   1

Total     12  14   9  14   5
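To see how group assignment alone produces clustering, the first seven rows of the membership table can be projected into a person-by-person network. The sketch assumes two people are tied whenever they share at least one group; the function names are illustrative, and the rows that were cut from the table are not used.

```python
from itertools import combinations

# Person-by-group memberships (G1..G5), copied from the visible rows above.
membership = {
    "Amy":     (1, 0, 1, 0, 0),
    "Billy":   (0, 1, 0, 1, 0),
    "Charlie": (0, 1, 0, 1, 0),
    "Debbie":  (1, 0, 0, 0, 0),
    "Elaine":  (1, 0, 1, 0, 1),
    "Frank":   (0, 1, 0, 1, 0),
    "George":  (0, 1, 0, 1, 0),
}


def project(two_mode):
    """One-mode projection: tie two people if they share any group."""
    adj = {person: set() for person in two_mode}
    for a, b in combinations(two_mode, 2):
        if any(x and y for x, y in zip(two_mode[a], two_mode[b])):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def transitivity(adj):
    """Closed two-paths divided by all two-paths."""
    closed = total = 0
    for nbrs in adj.values():
        for a, b in combinations(nbrs, 2):
            total += 1
            if b in adj[a]:
                closed += 1
    return closed / total if total else 0.0


one_mode = project(membership)
n_ties = sum(len(nbrs) for nbrs in one_mode.values()) // 2
print(n_ties, transitivity(one_mode))
```

Every group becomes a clique in the projection, so these seven people form two fully connected sets and the transitivity is 1 before any social process is involved. That built-in clustering is what a naive comparison with k/n would misread as a small-world signature.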

Page 26

Duncan Watts: Networks, Dynamics and the Small-World Phenomenon

How do we know if an observed graph fits the SW model?

Newman, M. E. J.; Strogatz, S. H.; and Watts, D. J. “Random graphs with arbitrary degree distributions and their applications.” Phys. Rev. E, 2001.

This paper extends the formulas for expected clustering and path length using a generating functions approach, making it possible to calculate E(C,L) for graphs with any degree distribution. Importantly, this procedure also makes it possible to account for clustering in a two-mode graph caused by the distribution of assignment to groups.

Page 27

Duncan Watts: Networks, Dynamics and the Small-World Phenomenon

How do we know if an observed graph fits the SW model?

Newman, M. E. J.; Strogatz, S. H.; and Watts, D. J. “Random graphs with arbitrary degree distributions and their applications.” Phys. Rev. E, 2001.

$$l = \frac{\log(N / z_1)}{\log(z_2 / z_1)} + 1$$

Where N is the size of the graph, z1 is the average number of people 1 step away (degree), and z2 is the average number of people 2 steps away.

Theoretically, these formulas can be used to calculate many properties of the network – including largest component size, based on degree distributions.

A word of warning: the math in these papers is not simple, so sharpen your calculus pencil before reading them…
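The path-length formula itself is simple to apply once z1 and z2 are known. The sketch below evaluates l = log(N/z1) / log(z2/z1) + 1 directly, and also estimates z1 and z2 from a degree sequence. It assumes the standard random-graph result that z2 equals the mean squared degree minus the mean degree; the function names and the degree sequence are illustrative.

```python
import math


def expected_path_length(n: int, z1: float, z2: float) -> float:
    """l = log(N / z1) / log(z2 / z1) + 1."""
    return math.log(n / z1) / math.log(z2 / z1) + 1


def path_length_from_degrees(degrees) -> float:
    """Apply the formula to a degree sequence.

    Assumption: in a random graph with this degree distribution,
    z1 = <k> and z2 = <k^2> - <k>.
    """
    n = len(degrees)
    z1 = sum(degrees) / n
    z2 = sum(d * d for d in degrees) / n - z1
    return expected_path_length(n, z1, z2)


# Hypothetical degree sequence: 500 actors with 2 ties, 500 with 6 ties.
degrees = [2] * 500 + [6] * 500
print(round(path_length_from_degrees(degrees), 3))

# Consistency check: with z2 = z1**2 the formula reduces to ln(N) / ln(z1).
print(expected_path_length(1000, 10, 100), math.log(1000) / math.log(10))
```

The second print shows the link back to the simpler approximation Lrandom ~ ln(n) / ln(k), which is the special case where the number of second neighbors is the square of the number of first neighbors.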

Page 28

Duncan Watts: Networks, Dynamics and the Small-World Phenomenon

How do we know if an observed graph fits the SW model?

Since C is just the transitivity ratio, there are a number of good formulas for calculating the expected value. Using the ratio of complete to (incomplete + complete) triads, we can use the expected values from the triad distribution in PAJEK for a simple graph or we can use the expected value conditional on the dyad types (if we have directed data) using the formulas in SPAN and W&F.

Page 29

The line of work most closely related to the small world is that on biased and random networks. Recall the reachability curves in a random graph:

[Figure: Random Reachability, by number of close friends. Percent contacted (0 to 100%) plotted against remove (0 to 15), with one curve each for Degree = 2, Degree = 3, and Degree = 4.]

Page 30

For a random network, we can estimate the trace curves with the following equation:

$$P_{i+1} = (1 - X_i)\left(1 - e^{-a P_i}\right)$$

Where Pi is the proportion of the population newly contacted at step i, Xi is the cumulative proportion contacted by step i, and a is the mean number of contacts people have.

This model describes the reach curves for a random network. The model is based on a, which (essentially) tells us how many new people we will reach from the new people we just contacted. This is based on the assumption that people’s friends know each other at a simple random rate.
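The recursion can be iterated directly to trace a reach curve. The sketch below assumes the form P(i+1) = (1 − X(i)) (1 − exp(−a P(i))), with P the newly contacted proportion and X the cumulative proportion. The starting seed of 0.1% of the population and the number of steps are illustrative choices.

```python
import math


def reach_curve(a: float, steps: int = 15, seed: float = 0.001):
    """Cumulative proportion reached at each remove in a random network."""
    newly = seed       # P_0: proportion contacted at the start
    cumulative = seed  # X_0: cumulative proportion contacted
    curve = [cumulative]
    for _ in range(steps):
        # those not yet reached, times the chance of being named by a new contact
        newly = (1 - cumulative) * (1 - math.exp(-a * newly))
        cumulative += newly
        curve.append(cumulative)
    return curve


# One curve per mean number of contacts.
for a in (2, 3, 4):
    print(a, [round(x, 3) for x in reach_curve(a)])
```

Each curve is S-shaped and levels off below 100%, and the plateau rises with a, the same qualitative pattern as the random reachability figure.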

Page 31

For a real network, people’s friends are not random, but clustered. We can modify the random equation by adjusting a, such that some portion of the contacts are random and the rest are not. This adjustment is a ‘bias’ (i.e., a non-random element in the model) that gives rise to the notion of ‘biased networks’. People have studied (mathematically) biases associated with:

• Race (and categorical homophily more generally)
• Transitivity (friends of friends are friends)
• Reciprocity (i --> j, j --> i)

There is still a great deal of work to be done in this area empirically, and it promises to be a good way of studying the structure of very large networks.
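One very rough way to see what a bias does to the reach curve is to let only a share of each person's contacts behave randomly. The sketch below is an illustrative simplification, not a specific biased-net model from the literature: it assumes the non-random share of contacts lands on people already reached and so contributes nothing new, which amounts to replacing a with a × random_share in the random-network recursion. All names and values are hypothetical.

```python
import math


def biased_reach_curve(a: float, random_share: float,
                       steps: int = 15, seed: float = 0.001):
    """Reach curve when only `random_share` of contacts are random.

    Simplifying assumption: the remaining contacts are redundant (they point
    back into the already-reached cluster), so the effective mean number of
    contacts is a * random_share.
    """
    effective_a = a * random_share
    newly = cumulative = seed
    curve = [cumulative]
    for _ in range(steps):
        newly = (1 - cumulative) * (1 - math.exp(-effective_a * newly))
        cumulative += newly
        curve.append(cumulative)
    return curve


fully_random = biased_reach_curve(4, 1.0)
half_clustered = biased_reach_curve(4, 0.5)
print(round(fully_random[-1], 3), round(half_clustered[-1], 3))
```

Under this assumption the clustered network reaches fewer people at every remove, which is the gap between observed and random curves that biased-net models try to explain.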

Page 32

Figure 1. Connectivity Distribution for a large Jr. High School (Add Health data)

[Figure: proportion reached (0 to 1) plotted against remove (0 to 14) for “Pine Brook Jr. High”, comparing the random graph curve with the observed curve.]

Page 33

How useful are C & L for characterizing a network?

These two graphs both have high C

Page 34

Uzzi & Spiro: Small worlds on Broadway

Page 35

Uzzi & Spiro: Small worlds on Broadway

Page 36

Uzzi & Spiro: Small worlds on Broadway

$$Q = \frac{CC_{Observed} / CC_{Random}}{PL_{Observed} / PL_{Random}}$$

Page 37

Uzzi & Spiro: Small worlds on Broadway

Page 38

Uzzi & Spiro: Small worlds on Broadway

Page 39

Uzzi & Spiro: Small worlds on Broadway

Figure 1. Q over time, from descriptive table.

[Figure: Q (axis 0 to 3.5) plotted by year, 1940 to 2000.]

Page 40

Uzzi & Spiro: Small worlds on Broadway

[Figure: Components: CC Ratio. CCactual, CCrandom, and CCratio plotted by year, 1940 to 2000, on two vertical axes (0.00 to 4.50 and 0 to 0.45).]

Page 41

Uzzi & Spiro: Small worlds on Broadway

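$$Q = \frac{CC_a / CC_r}{PL_a / PL_r} = \frac{\left(\dfrac{T_{300}}{T_{300} + T_{201}}\right)_a \Big/ \left(\dfrac{T_{300}}{T_{300} + T_{201}}\right)_r}{\left(\dfrac{\sum P_{ij}}{N(N-1)/2}\right)_a \Big/ \left(\dfrac{\sum P_{ij}}{N(N-1)/2}\right)_r}$$

Here a marks the actual network and r the random baseline.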

Beware ratio of ratios (of ratios!)
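The warning is easy to illustrate with a few lines of arithmetic. The sketch below takes Q as the clustering ratio divided by the path-length ratio, each ratio being actual over random, and shows how Q moves when only the random baseline changes. The function name and every number passed in are hypothetical.

```python
def small_world_q(cc_actual, cc_random, pl_actual, pl_random):
    """Q = (CC actual / CC random) / (PL actual / PL random)."""
    cc_ratio = cc_actual / cc_random
    pl_ratio = pl_actual / pl_random
    return cc_ratio / pl_ratio


# Same observed network in both calls; only the random clustering differs.
q_low_baseline = small_world_q(0.30, 0.10, 3.0, 2.5)
q_high_baseline = small_world_q(0.30, 0.20, 3.0, 2.5)
print(round(q_low_baseline, 2), round(q_high_baseline, 2))
```

Q halves even though nothing about the observed network has changed, so a trend in Q over time needs to be decomposed into its actual and random components before it is interpreted.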

Page 42

To calculate Average Path Length and Clustering in UCINET

1) Load the network
2) To stay consistent with Watts, make the network symmetric
• Transform > Symmetrize > Maximum
• Note what you saved the graph as
3) Calculate the clustering coefficient
• Network > Network Properties > Clustering Coefficient
• The local density version is the “overall clustering coefficient”
• The transitivity version is the “weighted clustering coefficient”

Clustering Coefficient

Page 43

To calculate Average Path Length and Clustering in UCINET

1) Load the network
2) To stay consistent with Watts, make the network symmetric
• Transform > Symmetrize > Maximum
• Note what you saved the graph as
3) Calculate distance
• Network > Cohesion > Distance
• Tools > Statistics > Univariate > Matrix

Average Length
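For readers without UCINET, the same two steps (symmetrize by maximum, then average the geodesic distances) can be sketched in plain Python. This is an illustrative equivalent under stated assumptions, not a reproduction of UCINET's routines: the network is a square 0/1 adjacency matrix, unreachable pairs are skipped, and the example matrix is hypothetical.

```python
from collections import deque


def symmetrize_max(matrix):
    """Make a tie mutual if it exists in either direction (the 'Maximum' rule)."""
    n = len(matrix)
    return [[max(matrix[i][j], matrix[j][i]) for j in range(n)] for i in range(n)]


def average_distance(matrix):
    """Mean geodesic distance over all reachable ordered pairs."""
    n = len(matrix)
    total = pairs = 0
    for source in range(n):
        dist = [None] * n
        dist[source] = 0
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in range(n):
                if matrix[u][v] and dist[v] is None:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for target in range(n):
            if target != source and dist[target] is not None:
                total += dist[target]
                pairs += 1
    return total / pairs


# Hypothetical directed chain 0 -> 1 -> 2 -> 3.
directed = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
symmetric = symmetrize_max(directed)
print(average_distance(symmetric))
```

Symmetrizing first matters: in the directed chain actor 3 can reach no one, while in the symmetric version every pair is connected and the average distance is defined over all of them.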