Section 8 – Ec1818

Jeremy Barofsky [email protected]

March 31st and April 1st, 2010

Section 8 Outline (lectures 15, 16)

• Social Network Introduction
• Types of Networks / Graphs
  1) Random
  2) Regular
  3) Small-world
• Erdos / Bacon Numbers
• Review Questions?
• Evaluations

• Office Hours - Thursday, 4/1/10 10-11am, outside 320 CGIS North.

Small World Phenomenon - Milgram

• Question: What is the probability that two randomly selected people know each other?
• Design: In 1967, Milgram sent packages to 160 random people living in Omaha, NE, asking each to send the package on to a friend or acquaintance they thought might know, or be connected to, the final target: a stockbroker in Boston. (Postcards were also sent back to Harvard to track progress.)
• Results:
  1) Of the letters that reached their destination, the average path length was 5.5 to 6.
  2) Significant selection bias: in one experiment, 232 of 296 letters were not sent on.
  3) Most packages reached the target through a few people. In the experiment with 160 packages, 24 reached the target at his home, and 16 of these were handed to him by the same person (a single node in the network).
• Reasons the average path length might be under- or over-estimated?
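One possible reason for an under-estimate is attrition: long chains have more chances to be dropped, so the chains that finish are disproportionately short. The sketch below illustrates this under loudly stated assumptions that are not in the slides: every hop is completed independently with the same probability, the pass-on rate is taken from the 232-of-296 figure, and the "true" chain lengths (one chain of each length 1 to 10) are purely hypothetical.

```python
def observed_mean_length(true_lengths, pass_prob):
    """Mean length of the chains that finish, assuming every hop is
    completed independently with probability pass_prob (assumed model)."""
    # A chain of length n survives all n hops with probability pass_prob ** n.
    weights = [pass_prob ** n for n in true_lengths]
    return sum(n * w for n, w in zip(true_lengths, weights)) / sum(weights)

# Hypothetical true chain lengths: one chain of each length 1..10.
true_lengths = list(range(1, 11))
true_mean = sum(true_lengths) / len(true_lengths)

# 232 of 296 letters were not sent on, so 64 of 296 were forwarded.
# Applying that rate to every hop is an assumption made for illustration.
pass_prob = 64 / 296

observed = observed_mean_length(true_lengths, pass_prob)
print("true mean:", true_mean)
print("mean among completed chains:", round(observed, 2))
```

Under these assumptions the mean among completed chains falls well below the true mean, which is the direction of bias that attrition would produce.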

Social networks

• A graph G consists of a set V(G) of vertices (or nodes) together with a set E(G) of edges (or links) that connect vertices.
• Degree: the number of edges connected to a given vertex.
• Order: the number of vertices in G, |V(G)|.
• Size: the number of edges in G, |E(G)|.
• Directed graph / undirected graph: a graph is directed if all its edges are directional, i.e., the network tells us not just whether two people are friends but whether each person considers the other a friend. If none of the edges are directional, G is undirected.
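These definitions can be checked on a small example. The sketch below uses a hypothetical four-person friendship network (the names and edges are invented for illustration) and computes order, size, and degree from an edge list, then shows how a directed edge list can record one-sided friendships.

```python
# Undirected toy graph stored as an edge list (hypothetical friendships).
edges = [("Ann", "Bob"), ("Bob", "Cat"), ("Cat", "Ann"), ("Cat", "Dan")]

vertices = {v for edge in edges for v in edge}
order = len(vertices)   # order: number of vertices
size = len(edges)       # size: number of edges

# Degree: count the edges touching each vertex.
degree = {v: 0 for v in vertices}
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

# Directed version: (u, v) means "u names v as a friend",
# which does not imply that v names u.
directed_edges = [("Ann", "Bob"), ("Bob", "Ann"), ("Dan", "Cat")]
reciprocated = [(u, v) for u, v in directed_edges if (v, u) in directed_edges]

print(order, size, degree)
print(reciprocated)
```

Each undirected edge adds one to the degree of both endpoints, so the degrees always sum to twice the size.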

Networks / Graphs and 3 elements

Social Networks Metrics

• Characteristic path length L(G, p): measures the average distance between pairs of vertices, where the distance between v and v’ is the length of the shortest path connecting them.
• Clustering coefficient C(G, p): measures a vertex's (person's) level of cliquishness within its neighborhood. It answers: are the friends of my friends also my friends?
• Formally, C(G, p) = actual edges among the vertices in a neighborhood / maximum possible edges in that neighborhood.
• Maximum number of edges (connections) in a network with n vertices: n(n-1)/2.
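Both metrics can be computed directly on a small graph. The sketch below is one way to do it, assuming the common convention that a vertex's neighborhood is its set of direct neighbors (so a vertex with k neighbors has k(k-1)/2 possible edges among them) and that the graph is connected. The toy network and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations

def clustering(adj, v):
    """Share of possible edges among v's neighbors that actually exist."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumed convention: no neighbor pairs, so report 0
    actual = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return actual / (k * (k - 1) / 2)

def average_clustering(adj):
    """Clustering coefficient averaged over all vertices."""
    return sum(clustering(adj, v) for v in adj) / len(adj)

def distances_from(adj, source):
    """Breadth-first search: shortest path length to each reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def characteristic_path_length(adj):
    """Average shortest-path distance over pairs of distinct vertices
    (assumes the graph is connected)."""
    total, pairs = 0, 0
    for v in adj:
        for w, d in distances_from(adj, v).items():
            if w != v:
                total += d
                pairs += 1
    return total / pairs

# Hypothetical friendship network as an adjacency dictionary.
adj = {
    "Ann": {"Bob", "Cat"},
    "Bob": {"Ann", "Cat"},
    "Cat": {"Ann", "Bob", "Dan"},
    "Dan": {"Cat"},
}
print(average_clustering(adj), characteristic_path_length(adj))
```

In this toy network Ann and Bob each have clustering 1 (their two friends know each other), Cat has 1/3, and Dan has 0.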

Types of Graphs

• Regular network: each vertex is connected only to the same number k of its nearest neighbors, so all vertices have the same degree. The characteristic path length is long, because it takes many steps to get from one vertex to a distant one; the clustering coefficient is large, because each vertex's neighbors are also connected to one another.
• Random network: edges between vertices occur independently at random with probability p. Connectedness emerges non-linearly: a giant connected component appears abruptly once p passes roughly 1/|V(G)|. Both the characteristic path length and the clustering coefficient are small.
• Adjacency matrix: a way to represent network data in which each vertex has a row and a column, and each entry records whether that pair of vertices is connected.
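As an illustration of the adjacency matrix, the sketch below builds one for a regular network: vertices arranged on a circle, each linked to its k nearest neighbors. The function name and the choice of 8 vertices with k = 4 are assumptions made for the example.

```python
def ring_lattice_matrix(n, k):
    """Adjacency matrix of a regular ring: each vertex is linked to its
    k nearest neighbors (k even, k < n)."""
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            A[i][j] = 1
            A[j][i] = 1  # undirected graph, so the matrix is symmetric
    return A

A = ring_lattice_matrix(8, 4)

# The degree of a vertex is the sum of its row.
degrees = [sum(row) for row in A]

for row in A:
    print(" ".join(str(x) for x in row))
print("degrees:", degrees)
```

Every row sums to the same value, which is what "regular" means, and the matrix equals its own transpose because the edges are undirected.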

From Regular to Random Graphs via Small Worlds

Regular -> Small World -> Random Graphs

• Rewire: start with a regular graph whose vertices sit on a circle, each connected to its 4 closest neighbors. Rewire each edge at random with probability p.
• Varying p tunes the graph between the two extremes: p = 0 leaves the regular graph, p = 1 gives a random graph.
• Watts and Strogatz define small-world networks by two characteristics:
  1) Large clustering coefficient C(G, p): most of my neighbors are also neighbors of one another.
  2) Small characteristic path length L(G, p): the presence of a few random, long-distance connections means that moving from one part of the graph to another can be done quickly.
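The rewiring procedure can be sketched in a few lines. This is a minimal version under stated assumptions, not necessarily the exact procedure in Watts and Strogatz (1998): each original edge keeps one endpoint and, with probability p, moves its other endpoint to a uniformly chosen vertex, avoiding self-loops and duplicate edges. The function names, the 20-vertex size, and the seed are hypothetical.

```python
import random

def ring_lattice(n, k):
    """Regular graph: vertices on a circle, each linked to its k nearest
    neighbors (k even, k < n)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def rewire(adj, p, rng):
    """Return a copy of adj in which each original edge is, with
    probability p, moved to a randomly chosen new endpoint."""
    n = len(adj)
    new = {v: set(nbrs) for v, nbrs in adj.items()}
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    for u, v in edges:
        if rng.random() < p:
            # Candidate endpoints: no self-loops and no duplicate edges.
            choices = [w for w in range(n) if w != u and w not in new[u]]
            if choices:
                w = rng.choice(choices)
                new[u].discard(v)
                new[v].discard(u)
                new[u].add(w)
                new[w].add(u)
    return new

rng = random.Random(0)  # fixed seed so the sketch is repeatable
regular = ring_lattice(20, 4)
small_world = rewire(regular, 0.1, rng)
print(sum(len(nbrs) for nbrs in small_world.values()) // 2, "edges")
```

Rewiring moves edges but never adds or removes them, so the number of edges stays fixed while p varies; only L and C change.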

Characteristic path length L(p) and clustering coefficient C(p) for rewired graphs as p varies.

(Watts and Strogatz, 1998)

Empirical Examples (Watts and Strogatz, 1998)

Power Laws Again? Are you serious? (For the degree distribution: random means a normal distribution and small world means a power law)

Erdos and Bacon Numbers

• Small-world networks exhibit strong connections between neighbors (cliques), but information can still travel quickly because of random connections to other highly connected groups of vertices.
• Erdos / Bacon numbers: degree of separation from Paul Erdos through co-authorship of peer-reviewed journal articles, or from Kevin Bacon through shared movie credits. A Bacon number of 1 means the individual acted in the same movie as Kevin Bacon. Nearly all actors are connected in this way, so the network exhibits the characteristics of a small-world network.

Is Bacon Best?

• “By processing all of the 1.6 million people in the Internet Movie Database I discovered that there are currently 506 people who are better centers than Kevin Bacon!” –Oracle of Bacon website.

• Compute the average Bacon number over all actors and compare it with the same average computed with other actors as the center.
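The comparison of centers can be sketched with breadth-first search: an actor's number relative to a center is the shortest-path distance in the co-star graph, and a "better center" is one with a lower average. The five-actor graph below is entirely hypothetical, and including the center itself (with number 0) in the average is an assumption.

```python
from collections import deque

def center_numbers(costars, center):
    """Shortest-path distance from center to every reachable actor."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        actor = queue.popleft()
        for other in costars[actor]:
            if other not in dist:
                dist[other] = dist[actor] + 1
                queue.append(other)
    return dist

def average_number(costars, center):
    """Average distance to the center, counting the center itself as 0."""
    dist = center_numbers(costars, center)
    return sum(dist.values()) / len(dist)

# Hypothetical co-star graph: an edge means two actors shared a movie.
costars = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A"},
    "D": {"B", "E"},
    "E": {"D"},
}

averages = {actor: average_number(costars, actor) for actor in costars}
best_center = min(averages, key=averages.get)
print(averages)
print("best center:", best_center)
```

In this toy graph actor B has a lower average than actor A, so B would be the better center even though both have two co-stars.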

Distribution of Bacon / Connery numbers for actors in IMDb.

Bacon / Connery number   # of actors (Bacon)   # of actors (Connery)
0                        1                     1
1                        2349                  2446
2                        223940                282628
3                        666941                674960
4                        153220                89845
5                        9662                  6533
6                        877                   598
7                        134                   118
8                        15                    10
Average                  2.95                  2.82
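The averages can be rechecked from the counts as a frequency-weighted mean: multiply each number by how many actors have it, sum, and divide by the total number of actors. The sketch below does this with the counts copied from the table; the variable and function names are hypothetical.

```python
# Counts copied from the table; list index = Bacon / Connery number (0..8).
counts = {
    "Bacon":   [1, 2349, 223940, 666941, 153220, 9662, 877, 134, 15],
    "Connery": [1, 2446, 282628, 674960, 89845, 6533, 598, 118, 10],
}

def mean_number(freq):
    """Frequency-weighted mean of the numbers 0, 1, 2, ..."""
    total = sum(freq)
    return sum(number * n for number, n in enumerate(freq)) / total

for center, freq in counts.items():
    print(center, sum(freq), round(mean_number(freq), 4))
```

Both columns cover the same number of actors. The Bacon mean reproduces 2.95 to two decimals; the Connery mean comes out near 2.83, marginally above the listed 2.82, which may reflect truncation rather than rounding in the source. Either way Connery's average is lower, so by this measure he is the better center.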

Midterm Questions?