The Science of Networks 1.1 Welcome! CompSci 96: The Science of Networks SocSci 119 M,W 1:15-2:30 Professor: Jeffrey Forbes http://www.cs.duke.edu/courses/spring11/cps096

Upload: erica-mclaughlin

Post on 11-Jan-2016



The Science of Networks 1.1

Welcome!

CompSci 96: The Science of Networks

SocSci 119

M,W 1:15-2:30

Professor: Jeffrey Forbes

http://www.cs.duke.edu/courses/spring11/cps096


The Science of Networks 1.2

Today’s topics

• What is a network?
• Why are they important?
• The Oracle of Bacon
• Network construction

Acknowledgements: Notes taken from Michael Kearns, Lada Adamic, and Nicole Immorlica

Upcoming

• Network Structure: Graph Theory
• GUESS


The Science of Networks 1.3

Course Information

• No background assumed, but we will interpret and work with models both quantitatively and qualitatively

Important Dates

• Midterm: 2/23
• Projects due: 4/21
• Final: 5/5, 9am-Noon

Let me know ASAP if you have any concerns

“The structure and interconnectivity of social, technological, and natural networks. Network structure: graph theory, economic, social, physical, and natural networks. Network behavior: game theory, markets and strategic interaction, aggregate and emergent functions, and dynamics. Information networks: search and integration. Applications in sociology, economics, public policy, and computing.”

Grading Breakdown

Assessment            Weight (approx)
Assignments (5)       30%
Blog Posts (3)        15%
Classwork/Community   15%
Midterm               15%
Final                 25%


The Science of Networks 1.4

A Future for Computer Science?


The Science of Networks 1.5

Emerging science of networks

• Examining apparent similarities between many human and technological systems & organizations
• Importance of network effects in such systems
• How things are connected matters greatly
   Structure, asymmetry and heterogeneity
• Details of interaction matter greatly
   The metaphor of viral spread
   Dynamics of economic and strategic interaction
• Qualitative and quantitative; can be very subtle
• A revolution of
   measurement
   theory
   breadth of vision

(M. Kearns)


The Science of Networks 1.6

What is a network?

• A collection of individual or atomic entities
• Links can represent any pairwise relationship
• Links can be directed or undirected
• Network: entire collection of nodes and links
   might sometimes be annotated by other info (weights, etc.)
• For us, a network is an abstract object (list of pairs) and is separate from its visual layout
   that is, we will be interested in properties that are layout-invariant
• We will be interested in properties of networks
   often structural properties
   often statistical properties of families of networks
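The "list of pairs" view above can be made concrete with a short sketch. The node names and the `build_adjacency` helper are hypothetical, chosen only for illustration; the point is that the same pairs yield the same network however they are ordered or drawn.

```python
from collections import defaultdict

# Hypothetical toy network: each pair is one link between two nodes.
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]

def build_adjacency(edge_list, directed=False):
    """Turn a list of pairs into a node -> set-of-neighbors mapping."""
    adjacency = defaultdict(set)
    for u, v in edge_list:
        adjacency[u].add(v)
        adjacency[v]  # touch v so it appears even if it has no outgoing links
        if not directed:
            adjacency[v].add(u)
    return dict(adjacency)

undirected = build_adjacency(edges)
directed = build_adjacency(edges, directed=True)

# Listing the pairs in a different order describes the same abstract network.
shuffled = build_adjacency(list(reversed(edges)))
```

Weights or other annotations could be carried by storing a dictionary per link instead of a bare pair; that extension is left out to keep the sketch small.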


The Science of Networks 1.7

Representing networks

• Networks are collections of points joined by lines.
• What kinds of questions might we ask?

“Network” ≡ “Graph”

points      lines             field
vertices    edges, arcs       math
nodes       links             computer science
sites       bonds             physics
actors      ties, relations   sociology

[Figure: a small graph with one node and one edge labeled]


The Science of Networks 1.8

Definitions

Path: a sequence of nodes $(v_1, \dots, v_k)$ such that for any adjacent pair $v_i$ and $v_{i+1}$, there is an edge $e_{i,i+1}$ between them.

Distance: the length of the shortest path between two nodes

Diameter: the maximum shortest-path distance between any two nodes

[Figure: example graph with eight nodes labeled 1 through 8]
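The three definitions can be checked on a small example. The sketch below uses breadth-first search, a standard way to get shortest-path distances when every edge counts as one step. The eight-node ring is a hypothetical graph (the edges of the slide's figure are not recoverable from the transcript), and the function names are illustrative.

```python
from collections import deque

def bfs_distances(adjacency, source):
    """Shortest-path distance (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in dist:          # first visit is along a shortest path
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def diameter(adjacency):
    """Maximum shortest-path distance over all pairs; assumes a connected network."""
    return max(max(bfs_distances(adjacency, s).values()) for s in adjacency)

# Hypothetical example: nodes 1..8 arranged in a ring (1-2-3-...-8-1).
ring = {i: {(i % 8) + 1, ((i - 2) % 8) + 1} for i in range(1, 9)}

distances_from_1 = bfs_distances(ring, 1)
ring_diameter = diameter(ring)
```

On the ring, node 5 sits opposite node 1, so it is the farthest node from 1, and that distance is also the diameter.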


The Science of Networks 1.9

Network Definitions

• Network size: total number of vertices (denoted n)
   Maximum possible number of edges (m)?
• If the distance between all pairs is finite, we say the network is connected; else it has multiple components
• Attributes of edges
   Weight or cost
   Direction
• Degree of a node v = number of edges connected to v
   Directed versions (in-degree and out-degree)
• What else might we want to model beyond just the connections?
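A minimal sketch of these definitions follows. It assumes a simple network (no self-loops, at most one edge per pair), under which the maximum number of edges is n(n-1)/2 undirected or n(n-1) directed. The toy data and function names are hypothetical.

```python
from collections import Counter

def max_edges(n, directed=False):
    """Maximum edge count for a simple network on n vertices."""
    return n * (n - 1) if directed else n * (n - 1) // 2

def connected_components(adjacency):
    """Group nodes so that two nodes share a group iff a path joins them."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in adjacency[u]:
                if w not in component:
                    component.add(w)
                    stack.append(w)
        seen |= component
        components.append(component)
    return components

def degrees(adjacency):
    """Undirected degree: number of edges touching each node."""
    return {v: len(neighbors) for v, neighbors in adjacency.items()}

def in_out_degrees(arcs):
    """Directed versions, from a list of (tail, head) pairs."""
    out_deg = Counter(u for u, _ in arcs)
    in_deg = Counter(v for _, v in arcs)
    return in_deg, out_deg

# Hypothetical network with two components: a-b-c and d-e.
toy = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}, "d": {"e"}, "e": {"d"}}
components = connected_components(toy)
in_deg, out_deg = in_out_degrees([("a", "b"), ("c", "b"), ("b", "d")])
```

The network is connected exactly when `connected_components` returns a single group.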


The Science of Networks 1.10

Issues

Why model networks? Structure & dynamics

Models (structure): who is linked to whom?
• How does position within a network (dis)advantage an agent?
• What are the factors that lead people to trust each other?
• Graph theoretic models

Implications (dynamics): individual behavior can have global consequences
• Diffusion of disease and information
• Search by navigating the network
• Resilience
• Population, structural, and aggregate effects
• Game theoretic models


The Science of Networks 1.11

Social networks

Example: Acquaintanceship networks
• vertices: people in the world
• links: have met in person and know last names
• hard to measure

Example: scientific collaboration
• vertices: math and computer science researchers
• links: between coauthors on a published paper
• Erdos numbers: distance to Paul Erdos
• Erdos was definitely a hub or connector; had 507 coauthors

How do we navigate in such networks?


The Science of Networks 1.12

Acquaintanceship & more


The Science of Networks 1.13

Six Degrees of Bacon

Background
• Stanley Milgram’s Six Degrees of Separation?
• Craig Fass, Mike Ginelli, and Brian Turtle invented it as a drinking game at Albright College
• Brett Tjaden, Glenn Wasson, and Patrick Reynolds have run it online as a website from UVa and beyond
• Instance of the Small-World phenomenon

http://oracleofbacon.org handles 2 kinds of requests
1. Find the links from Actor A to Actor B.
2. How good a center is a given actor?

How does it answer these requests?


The Science of Networks 1.14

How does the Oracle work?

• Not using Oracle™
• Queries require traversal of the graph

[Figure: graph centered on Kevin Bacon (BN = 0). Movies: Mystic River, Apollo 13, Footloose. Actors at BN = 1: John Lithgow, Sarah Jessica Parker, Bill Paxton, Tom Hanks, Sean Penn, Tim Robbins]


The Science of Networks 1.15

How does the Oracle Work?

• BN = Bacon Number
• Queries require traversal of the graph

[Figure: the same graph grown one step further.
 BN = 0: Kevin Bacon
 Movies: Mystic River, Apollo 13, Footloose
 BN = 1: John Lithgow, Sarah Jessica Parker, Bill Paxton, Tom Hanks, Sean Penn, Tim Robbins
 Movies: Sweet and Lowdown, Fast Times at Ridgemont High, War of the Worlds, The Shawshank Redemption, Cast Away, Forrest Gump, Tombstone, A Simple Plan
 BN = 2: Morgan Freeman, Sally Field, Helen Hunt, Val Kilmer, Miranda Otto, Judge Reinhold, Woody Allen, Billy Bob Thornton]


The Science of Networks 1.16

How does the Oracle work?

• How do we choose which movie or actor to explore next?
• Queries require traversal of the graph

[Figure: the same actor and movie graph as on the previous slide (Kevin Bacon at BN = 0, six actors at BN = 1, eight actors at BN = 2), redrawn with the Kevin Bacon, Apollo 13, Bill Paxton, Tombstone, Val Kilmer chain set apart]
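One standard answer to "which actor or movie next?" is breadth-first search: explore everything at Bacon Number 1 before anything at Bacon Number 2, and so on, so the first time an actor is reached is along a shortest chain. The slides do not say how the Oracle itself is implemented, so the sketch below is an assumption. The `casts` table is a small hand-entered subset of the movies named in the figures, not the Oracle's data, and the function names are illustrative.

```python
from collections import deque

# Hand-entered toy subset of casts (assumption: illustration only).
casts = {
    "Footloose": ["Kevin Bacon", "John Lithgow", "Sarah Jessica Parker"],
    "Apollo 13": ["Kevin Bacon", "Tom Hanks", "Bill Paxton"],
    "Mystic River": ["Kevin Bacon", "Sean Penn", "Tim Robbins"],
    "Forrest Gump": ["Tom Hanks", "Sally Field"],
    "Cast Away": ["Tom Hanks", "Helen Hunt"],
    "Tombstone": ["Bill Paxton", "Val Kilmer"],
    "The Shawshank Redemption": ["Tim Robbins", "Morgan Freeman"],
}

def build_index(casts):
    """Map each actor to the list of movies they appear in."""
    movies_of = {}
    for movie, actors in casts.items():
        for actor in actors:
            movies_of.setdefault(actor, []).append(movie)
    return movies_of

def find_link(casts, start, target):
    """Return [actor, movie, actor, ..., target] along a shortest chain, or None."""
    movies_of = build_index(casts)
    if start not in movies_of or target not in movies_of:
        return None
    parent = {start: None}            # actor -> (previous actor, shared movie)
    queue = deque([start])
    while queue:
        actor = queue.popleft()       # oldest entry first: breadth-first order
        if actor == target:
            break
        for movie in movies_of[actor]:
            for costar in casts[movie]:
                if costar not in parent:
                    parent[costar] = (actor, movie)
                    queue.append(costar)
    if target not in parent:
        return None
    path, node = [target], target
    while parent[node] is not None:   # walk back to the start
        previous, movie = parent[node]
        path += [movie, previous]
        node = previous
    return path[::-1]

def bacon_number(casts, actor, center="Kevin Bacon"):
    """Number of movie links between the center and the actor, or None."""
    link = find_link(casts, center, actor)
    return None if link is None else len(link) // 2
```

For example, `find_link(casts, "Kevin Bacon", "Morgan Freeman")` walks through Mystic River and Tim Robbins to The Shawshank Redemption. Swapping the queue for a stack would give depth-first search, which still finds a chain but not necessarily a shortest one.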


The Science of Networks 1.17

Center of the Hollywood Universe?

• 1,246,221 people can be connected to Bacon
• Is he the center of the Hollywood Universe?
   Who is?
   Who are other good centers?
   What makes them good centers?

Centrality
• Closeness: the inverse average distance of a node to all other nodes
• Degree: the degree of a node
• Betweenness: a measure of how much a vertex is between other nodes
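Two of these measures are easy to compute directly. The sketch below covers closeness and degree only (betweenness needs shortest-path counting and is left out), and it assumes a connected, undirected network. The star-shaped example and the function names are hypothetical.

```python
from collections import deque

def bfs_distances(adjacency, source):
    """Shortest-path distance (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness(adjacency, node):
    """Inverse of the average distance from node to all other nodes."""
    dist = bfs_distances(adjacency, node)
    others = len(adjacency) - 1
    return others / sum(dist.values())   # dist[node] is 0, so it adds nothing

def degree_centrality(adjacency, node):
    """Degree of the node: how many links touch it."""
    return len(adjacency[node])

# Hypothetical star network: one hub linked to four leaves.
star = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"},
}

best_center = max(star, key=lambda v: closeness(star, v))
```

In the star, the hub is one step from everyone, while each leaf is one step from the hub and two steps from the other three leaves, so the hub wins on both measures.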


The Science of Networks 1.18

Oracle of Bacon

Name someone who is 4 degrees or more away from Kevin Bacon

1.    2.    3.    4.    5.    6.

What characteristics make someone farther away?

What makes someone a good center? Is Kevin Bacon a good center?


The Science of Networks 1.19

Sample Blog Post

I'm Related to Kevin Bacon?

Overview of the Oracle of Bacon: In class we have talked a lot about social and computer networks and all of their component parts. We have learned many important aspects of networks and what makes them operate. One of the most interesting and complex notions is that of centrality and how one can go about calculating centrality within a social network. The Oracle of Bacon is one of the best examples of a project that has created an elaborate social network around the central figure of Kevin Bacon. However, it is interesting that the site shows that Kevin Bacon is not actually the center of the Hollywood network; in fact, there are 1,048 actors who would make better centers than Bacon. Here is a breakdown of the best and worst centers of the Hollywood network. Although the only other actor mentioned who would make a better center is Sean Connery, one can speculate about what makes a great center. A good center would have to be an older actor, have appeared in many movies and many varieties of movies, have appeared in large productions with many actors, and have worked overseas. Alternatively, a bad center would be young and have appeared in only one type of movie, or in only one movie at all!


The Science of Networks 1.20

Why is the Oracle of Bacon Interesting to us?

• In reality, the game is an example of the small world phenomenon. The small world phenomenon was researched by Stanley Milgram as he examined the average path length for social networks of people in the United States. The phenomenon shows that paths between nodes are often much shorter than expected, which the game illustrates. The Oracle of Bacon was designed by computer scientists at the University of Virginia in order to create an engaging way of dealing with the small world phenomenon. The program for calculating a Bacon number was developed by mapping networks from http://imdb.com/ (the database for movie and actor information).

Other related points

• Here is the original paper by Stanley Milgram, upon which all of this information is based. The game works to find links between different actors and find the degree of separation from Bacon. It is amazing that almost any actor, no matter how obscure, can be linked to Bacon within six degrees and the average is under three links (2.960).

• It is also interesting to look at the earlier examples of the small world phenomenon, which inspired the Oracle of Bacon. Erdos numbers refer to the number of coauthorship links separating a mathematician from Paul Erdos, a Hungarian mathematician famous for collaboration. The Erdos number project gives details similar to the Oracle of Bacon about the amount of connectivity within the network of mathematicians. In this network the median Erdos number is 5; the mean is 4.65, and the standard deviation is 1.21. This shows that there is slightly less connectivity, but a high degree of centrality.


The Science of Networks 1.21

Here is a visualization of the Erdos Network.

More recent centrality work

• Many researchers have revisited the six degrees theory in their analysis of the small-world phenomenon, including Judith Kleinfeld. Her paper, "Could It Be a Big World After All? The 'Six Degrees of Separation' Myth" (Society, April 2002), deals with a lot of the important ideas discussed above. Kleinfeld argues that the initial data used to create the notion of the small-world phenomenon was actually skewed, and that the data show there might be less connectivity between people than was previously believed. This paper was published in 2002, and it does not seem to have garnered a large amount of debate amongst the scholarly community. It seems that more work and experimentation needs to be done in this field in an attempt to make claims about the connectedness of the actual world. Although Kleinfeld and others made some really interesting points initially, unfortunately the computer science world seems focused on novelty, not finishing work on a phenomenon, so it may be a while before all of our questions are answered!


The Science of Networks 1.22

Physical Networks

The Internet
• Vertices: Routers
• Edges: Physical connections

Another layer of abstraction
• Vertices: Autonomous systems
• Edges: peering agreements
• Both a physical and business network

Other examples
• US Power Grid
• Interdependence and August 2003 blackout


The Science of Networks 1.23

What does the Internet look like?


The Science of Networks 1.24

US Power Grid


The Science of Networks 1.25

Business & Economic Networks

Example: eBay bidding
• vertices: eBay users
• links: represent bidder-seller or buyer-seller
• fraud detection: bidding rings

Example: corporate boards
• vertices: corporations
• links: between companies that share a board member

Example: corporate partnerships
• vertices: corporations
• links: represent formal joint ventures

Example: goods exchange networks
• vertices: buyers and sellers of commodities
• links: represent “permissible” transactions
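The corporate boards example shows how a network is constructed from membership data: two corporations are linked when their boards overlap. A minimal sketch, with invented company and director names and an illustrative function name:

```python
from itertools import combinations

# Hypothetical board membership data (all names invented).
boards = {
    "Acme": {"Ann", "Bob"},
    "Globex": {"Bob", "Cy"},
    "Initech": {"Cy", "Dee"},
    "Umbrella": {"Eve"},
}

def interlock_links(boards):
    """Link two corporations when they share at least one board member."""
    links = {}
    for a, b in combinations(sorted(boards), 2):
        shared = boards[a] & boards[b]   # set intersection
        if shared:
            links[(a, b)] = shared       # keep the shared members as annotation
    return links

links = interlock_links(boards)
```

Here Umbrella shares no director with anyone, so it ends up as an isolated vertex. The same construction yields the coauthorship network from a list of papers, or the actor network from a list of casts.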


The Science of Networks 1.26

Enron


The Science of Networks 1.27

Content Networks

Example: Document similarity
• Vertices: documents on web
• Edges: Weights defined by similarity
• See TouchGraph GoogleBrowser

Conceptual network: thesaurus
• Vertices: words
• Edges: synonym relationships


The Science of Networks 1.28

Wordnet

Source: http://wordnet.princeton.edu/man/wnlicens.7WN


The Science of Networks 1.29

Biological Networks

Example: the human brain
• Vertices: neuronal cells
• Edges: axons connecting cells
• links carry action potentials
• computation: threshold behavior
• N ~ 100 billion


The Science of Networks 1.30

Gene regulatory networks

• Humans have only 30,000 genes, 98% shared with chimps
• The complexity is in the interaction of genes
• Can we predict what the result of the inhibition of one gene will be?

Source: http://www.zaik.uni-koeln.de/bioinformatik/regulatorynets.html.en


The Science of Networks 1.31

Types of networks

• Pick a class of network:
• Give a real-world example of such a network:
• What are the vertices (nodes)?
• What are the edges (links)?
• How is the network formed? Is it decentralized or centralized? Is the communication or interaction local or global?
• What is the network's topology? For example, is it connected? What is its size? What is the degree distribution?
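The degree distribution asked about in the last question is the fraction of nodes having each degree. A minimal sketch on a hypothetical star-shaped network (the function name is illustrative):

```python
from collections import Counter

def degree_distribution(adjacency):
    """Fraction of nodes having each degree, keyed by degree."""
    counts = Counter(len(neighbors) for neighbors in adjacency.values())
    n = len(adjacency)
    return {k: counts[k] / n for k in sorted(counts)}

# Hypothetical star network: one hub linked to four leaves.
star = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"},
}

size = len(star)                          # network size n
distribution = degree_distribution(star)
```

Four of the five nodes have degree 1 and one has degree 4; a distribution this lopsided is a hint that the network is centralized around a hub.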


The Science of Networks 1.32

Graph properties

Max Degree?

Center?


The Science of Networks 1.33

Wrap up

Networks are everywhere and can be used to describe many, many systems.

By modeling networks, we can start to understand their properties and the implications those properties have for processes occurring on the network.