
Simulating Social Networks

James Moody

Duke University

The Population Sciences and Agent Based Methodology: An Answer to the Macro-Micro Link?

September 27, 2006, NIH

Work reported in this presentation has been supported by NIH grants DA12831, HD41877, and AG024050. Thanks to the Center for Advanced Study in the Behavioral Sciences (CASBS) for office and tech support for this work.

Introduction: Network Structure Puzzles

Distribution of Popularity by size and city type

Why are high school popularity distributions constant across vastly different communities?

Add Health relational change statistics

How can the global structure remain constant given massive changes at the dyad level?


[Figure: adolescent romantic network, with nodes marked as male or female]


What rules can account for adolescent romantic network structure?

[Figure: PhD exchange network blocks: I (N=52), II-a (N=4), II-b (N=15), II-c (N=22), II-d (N=81), III (N=384)]

Han, S-K. Social Networks 2003:251-280. Figure 1


Why are academic PhD exchange markets (hiring/placing) strongly centralized …

Burris, ASR 2004

… and positions within systems stable over generations?



• In each case, the interdependent activity of each actor affects the conditions shaping action for everyone else in the setting.

• History matters in a deterministic (rather than stochastic) sense.

• The processes shaping actors’ choices are locally bounded.

• The resulting network structure is often very far from a random null.

•Statistical Models fail for either data or deep endogeneity reasons.

• Actor-oriented simulation methods:

• Provide a way of thinking about interdependent action.

• Create multiple replications with known variation on independent variables.

1. Introduction

2. Micro-Macro elements in social networks
a) Coleman’s “boat”
b) Structural correlates of micro & macro
c) Linking rules to structures

3. Simulating Network Structure
a) Dynamics of adolescent friendship structure
b) Adolescent romantic exogamy
c) Inequality in PhD exchange networks

4. Network Diffusion & Disease Spread
a) Degree mixing models
b) Relational timing

5. Promises & Pitfalls
a) Good: Theoretical rigor, clarity, & elegance
b) Bad: How to test against observed data
c) Ugly: Rule proliferation & specification

Introduction: outline

Micro-Macro elements in social networks Coleman’s “Boat”

[Diagram: Coleman’s “boat”. Macro level: Contextual State and Global Outcome. Micro level: Individual Response and Resulting Action. Numbered arrows mark the links between them.]


1) Macro → Micro: Typically contextual conditions that enable/constrain individual action.

2) Micro → Micro: A direct-action correlate of the contextually constrained behavior in (1).

3) Micro → Macro: An aggregation or interaction process that can account for the new global-level outcome.

4) The observed macro-level correlation is thus accounted for by actors capable of intent and action.

“[Social facts] assume a shape, a tangible form peculiar to them and constitute a reality sui generis vastly distinct from the individual facts which manifest that reality”

— Durkheim Rules Of Sociological Method


Of these 4 links, the 3rd is often the trickiest. Here we face questions about emergent properties: features of the macro system that cannot be seen as simple (mean, sum, proportion) aggregations of individual action, but instead are seen as some interactive effect of the combined action.


Micro-Macro elements in social networks Structural correlates of micro and macro

Network micro features: Anything you can measure on a local ego-network.

Ego-Net


Local-romantic Networks

Complete Network



Purely local: information on ego about ego’s contacts
- Number of ties (degree)
- Node attribute mixing

Local + Alter interaction: information on ego and ego’s contacts with each other
- Number of ties (degree)
- Node attribute mixing
- Clustering
- Reciprocity
- Structural holes

[Figure: example ego network, with ego (e) tied to alters 1 to 4]


Network macro features: Features resting on (a) paths of length > 2.

The key element that makes a network a system is the path: it’s how sets of actors are linked together indirectly.

A walk is a sequence of nodes and lines, starting and ending with nodes, in which each node is incident with the lines following and preceding it in a sequence.

A path is a walk where all of the nodes and lines are distinct.
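The two definitions above can be checked mechanically. The following is a minimal sketch, assuming an undirected graph stored as a list of node pairs; the function names `is_walk` and `is_path` and the toy edge list are illustrative, not part of the original slides.

```python
def is_walk(edges, nodes):
    """True if every consecutive pair of nodes in the sequence is joined by a line."""
    lines = {frozenset(edge) for edge in edges}
    return all(frozenset((a, b)) in lines for a, b in zip(nodes, nodes[1:]))


def is_path(edges, nodes):
    """A path is a walk with no repeated node.

    In a simple undirected graph, distinct nodes imply distinct lines.
    """
    return is_walk(edges, nodes) and len(set(nodes)) == len(nodes)


# Hypothetical toy graph: a-b, b-c, c-d, b-d
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")]

print(is_walk(edges, ["a", "b", "c", "d", "b"]))  # a walk that revisits b
print(is_path(edges, ["a", "b", "c", "d", "b"]))  # not a path: b repeats
print(is_path(edges, ["a", "b", "c", "d"]))       # a path
```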

Paths are the routes through networks that make diffusion possible; they govern connectivity and clustering, and reflect “clump” structure as well.


Network macro features: Features resting on (a) paths of length > 2.

These two graphs have the exact same local properties, but very different global properties.

[Figure: two example graphs, A and B]
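A small constructed example illustrates the point. This is a sketch under stated assumptions: the two graphs here (one ring of eight nodes versus two separate rings of four) are hypothetical stand-ins, not the graphs shown on the slide, and "local properties" is taken to mean each ego's degree and the number of ties among ego's contacts.

```python
from collections import deque


def ring(nodes):
    """Edges of a cycle passing through the given nodes in order."""
    return [(nodes[i], nodes[(i + 1) % len(nodes)]) for i in range(len(nodes))]


def adjacency(edges):
    """Undirected adjacency sets built from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_profile(adj):
    """Sorted (degree, ties among ego's contacts) for every ego."""
    profile = []
    for alters in adj.values():
        alter_ties = sum(1 for a in alters for b in alters if a < b and b in adj[a])
        profile.append((len(alters), alter_ties))
    return sorted(profile)


def component_sizes(adj):
    """Sizes of connected components, largest first (breadth-first search)."""
    seen, sizes = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adj[node] - seen:
                seen.add(neighbor)
                queue.append(neighbor)
        sizes.append(size)
    return sorted(sizes, reverse=True)


graph_a = adjacency(ring(list(range(8))))                   # one ring of 8
graph_b = adjacency(ring([0, 1, 2, 3]) + ring([4, 5, 6, 7]))  # two rings of 4

print(local_profile(graph_a) == local_profile(graph_b))  # identical local features
print(component_sizes(graph_a), component_sizes(graph_b))
```

Every ego in both graphs has two contacts who are not tied to each other, yet one graph is fully connected and the other splits into two components, so no ego-network measure can distinguish them.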

Micro-Macro elements in social networks: Structural correlates of micro and macro

Network macro features: Features resting on (b) the distribution of local features.

[Figure: Distribution of popularity by size and city type]


Network macro features: Features resting on (c) outcomes distributed across nodes.

Define a general measure of the “diffusion susceptibility” of a graph as the ratio of the area under the observed curve to the area under a random baseline curve. As the ratio approaches 1.0, you get effectively faster diffusion.

Micro-Macro elements in social networks Linking actor rules to network structure

For most simulation settings, we are often interested in identifying behavioral rules that (a) fit the micro network features of interest and (b) give rise (in combination) to the global features of interest.

Types of network rules:
1. Node volume features (number of ties)
2. Dyadic interaction features (reciprocity, race-mixing rules)
3. Indirect interaction features (social balance, relational exogamy rules)
4. Timing rules (relation duration, concurrency, and order)


Two ways to think of rule-action links for network models:

“Explanation”: Identify a (small) set of rules that, when applied, account for features that are difficult to explain otherwise.
Examples:
- Adolescent friendship dynamics
- Romantic network structure
- PhD exchange network structure & stability

“Exploration”: Start with a set of local rules you are confident in, then apply them to a setting to learn what system-level features emerge.
Examples:
- Diffusion potential of low-degree networks
- Diffusion constraints resulting from relational timing

Simulating Network Structure Adolescent Romantic Networks

[Figure: romantic network at “Jefferson” high school, with nodes marked as male or female]

Explanation problem 1: Romantic relations at “Jefferson” high school

Source: Bearman, Moody and Stovel (2004) AJS


Is the network typical? How does it compare to random networks with the same micro-features?

Circle = observed, boxplots = simulated networks w. same volume.



The network is decidedly not random. Moreover, typical network mixing features don’t take us very far (homophily on number of prior partners helps constrain component size, and smoking homophily is evident by inspection).

We propose a network exogamy rule: a prohibition on cycles of length 4:



Introduce a prohibition on forming 4-cycles in the randomly simulated networks.



Here we get a much closer match between the simulated networks and the observed in each of our test statistics…



…and the simulated components have similar qualitative structures as well.


Evaluation:

This single rule addition – more than any other dyadic feature such as homophily on behavior or age mixing – generates networks with the structure we observe in reality. Its theoretical simplicity is the model’s greatest strength. From a simulation methods standpoint, this is a very simple rule set:

a) Constrain each actor to make the same number of partners observed in the real world

b) If a partner choice would close a 4-cycle, choose somebody else.
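The two rules above can be sketched as a small rejection-based simulation. This is an illustrative sketch, not the procedure used in the study: it ignores the sex of partners, picks pairs uniformly at random, and simply gives up after `max_tries` attempts, so some actors may end below their target degree. All names (`simulate_partnerships`, `would_close_4cycle`, `count_4cycles`) and the toy degree targets are assumptions.

```python
import random
from itertools import combinations


def would_close_4cycle(adj, u, v):
    """True if adding the tie u-v would complete a cycle of length 4.

    That happens exactly when a path u - a - b - v of length 3 already exists.
    """
    for a in adj[u]:
        if a == v:
            continue
        for b in adj[v]:
            if b != u and b != a and b in adj[a]:
                return True
    return False


def simulate_partnerships(target_degree, rng, forbid_4cycles=True, max_tries=10_000):
    """Rule (a): never exceed an actor's target degree. Rule (b): reject 4-cycles."""
    adj = {node: set() for node in target_degree}
    for _ in range(max_tries):
        open_actors = [n for n in adj if len(adj[n]) < target_degree[n]]
        if len(open_actors) < 2:
            break
        u, v = rng.sample(open_actors, 2)
        if v in adj[u]:
            continue  # already partners
        if forbid_4cycles and would_close_4cycle(adj, u, v):
            continue  # choose somebody else
        adj[u].add(v)
        adj[v].add(u)
    return adj


def count_4cycles(adj):
    """Each 4-cycle is counted once per diagonal pair, hence the final halving."""
    total = 0
    for x, y in combinations(adj, 2):
        shared = len(adj[x] & adj[y])
        total += shared * (shared - 1) // 2
    return total // 2


rng = random.Random(0)
# Hypothetical target degrees; a real run would use the observed partner counts.
targets = {i: rng.choice([1, 1, 2, 2, 3]) for i in range(60)}
network = simulate_partnerships(targets, rng)
print(count_4cycles(network))  # 0 by construction
```

Comparing runs with `forbid_4cycles=True` and `forbid_4cycles=False` on the same targets gives a rough feel for how much structure the single rule imposes.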


Evaluation:

From an implementation standpoint, the simulation is complicated by an empirical identification problem: there are many possible configurations where these two constraints cannot be met simultaneously.

In the process of making choices, we effectively run out of degrees of freedom – where any new choice would lead to a violation in the degree distribution or create a 4-cycle.

- Theoretically, this implies that the real-world graph is coming from a fairly small region of the overall graph space.

- Methodologically it means that using only a simple rule-based simulation was computationally inefficient. We solved this by adding “graph identification” procedures that forced choices once prior choices implied them.

- This difficulty followed from our desire to fit the distributions exactly.

Explanation problem 2: Academic Caste Systems

Simulating Network Structure Academic Castes: inequality in PhD exchange Networks

[Figure: PhD exchange network blocks: I (N=52), II-a (N=4), II-b (N=15), II-c (N=22), II-d (N=81), III (N=384)]

Han, S-K. Social Networks 2003:251-280. Figure 1

Why is this network so hierarchical and stable?

“Social Capital” = Bonacich Centrality on symmetric version of the PhD exchange Network

The resulting status-based network has a strong correlation between centrality in the hiring network & quality ranking


Simulating Network Structure Persistent inequality in PhD exchange Networks: Simulation Setup

The purpose of this simulation is to examine the effect of market-relevant behavior under ideal-typical conditions. This involves simplifying the real world as much as possible, to isolate how particular factors affect outcomes of interest.

Key real-world properties of interest:
• Stable quality rankings
• Strong correlation between size and quality
• Centralized hiring networks
• Strong correlation between centrality and prestige

Currently, all actors follow the same strategy, and I vary the strategy set across simulation runs.

Future work will vary department strategies within runs to see how these affect competitive advantage.

Actors

• Departments: Collections of faculty who hire applicants & produce new students (N=100). Initial department size is drawn from a normal distribution with mean = 25, std = 12, but I re-draw if size is less than 10, so the actual distribution is slightly skewed.

• Applicants: Students from (other) departments who apply for jobs.

• Departments seek to hire the best students; students want to work at the best departments.

• These actors are rational, honest, and risk-averse. But all actors have individual preferences & errors in vision.

The simulation does not include tenure or senior moves. So you can treat this as the “realized” or “final” position outcomes.


Attributes

Quality. Each faculty member and student has an overall quality score.

• Initial faculty quality is distributed as random normal(0,1).

• This implies that departments are effectively equal at time 1, with only minor differences due to random chance.

• Student quality is a (specifiable) random function of faculty quality.

• Department quality is the mean of faculty quality.

While each person has a given quality score, actor choices are made based on an evaluation of quality, which differs across actors.

This variation reflects both differences in preferences and ability to discern quality.
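The initial conditions described above can be sketched as follows. This is a minimal illustration, not the author's code: rounding sizes to whole faculty, the additive-noise form of `observed_quality`, and all function and field names are assumptions made for the example.

```python
import random
from statistics import mean


def make_departments(n_departments=100, size_mean=25.0, size_sd=12.0,
                     min_size=10, seed=0):
    """Draw department sizes from a normal distribution, re-drawing small ones."""
    rng = random.Random(seed)
    departments = []
    while len(departments) < n_departments:
        size = round(rng.gauss(size_mean, size_sd))  # rounding is an assumption
        if size < min_size:
            continue  # re-draw, which skews the realized distribution slightly
        faculty = [rng.gauss(0.0, 1.0) for _ in range(size)]  # quality ~ N(0, 1)
        departments.append({"faculty": faculty, "quality": mean(faculty)})
    return departments


def observed_quality(true_quality, b, rng):
    """An actor's evaluation: true quality plus actor-specific noise scaled by b."""
    return true_quality + b * rng.gauss(0.0, 1.0)


departments = make_departments()
sizes = [len(d["faculty"]) for d in departments]
print(len(departments), min(sizes), max(sizes))
```

Because every department starts from the same quality distribution, any stable hierarchy that appears later in a run has to come from the hiring process rather than from the initial draw.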


Action: Departments

• Departments hire & produce students.

• For each of 100 years:

• Every department produces students (conditional on size).

• A (random) subset of departments have job openings based on (a) prior retirements and (b) current size relative to a target size.

• Departments rank applicants by their evaluation of applicant quality, and make offers to their top choices.

• If a department’s 1st choice goes elsewhere, they go to the next choice, for a specifiable number of rounds and to a specifiable ‘depth’ into the pool.

• Jobs can go unfilled, which means that departments can both grow and shrink.


Action: Students

• Students rank departments that make them an offer by their evaluation of department quality, and take the best job they are offered.

• If a student does not receive a job offer in a given year, they move out of the system.

• Lots of students don’t get jobs (at PhD granting universities…)

• Students are not strategic: they do not forego a good offer while waiting for a better one. This is the “risk averse” quality, though this could be changed.
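One year of the offer-and-acceptance process can be sketched as below. This is a simplified illustration under several assumptions that are not in the slides: each hiring department has exactly one opening, evaluations are fixed for the year, and the names `hiring_market`, `b_applicant`, and `b_department` are invented for the example.

```python
import random


def hiring_market(dept_quality, student_quality, openings, rounds=3, depth=10,
                  b_applicant=0.6, b_department=0.1, seed=0):
    """Return (placements, unfilled) after a fixed number of offer rounds."""
    rng = random.Random(seed)
    # Each department ranks applicants by its own noisy evaluation,
    # keeping only the top `depth` candidates.
    shortlist = {}
    for d in openings:
        seen = {s: q + b_applicant * rng.gauss(0, 1) for s, q in student_quality.items()}
        shortlist[d] = sorted(seen, key=seen.get, reverse=True)[:depth]
    # Each student has a noisy evaluation of every hiring department.
    view = {s: {d: dept_quality[d] + b_department * rng.gauss(0, 1) for d in openings}
            for s in student_quality}

    placed = {}               # student -> department
    unfilled = set(openings)  # departments still searching
    for _ in range(rounds):
        offers = {}
        for d in sorted(unfilled):
            # Offer to the best shortlisted applicant who is still available.
            choice = next((s for s in shortlist[d] if s not in placed), None)
            if choice is not None:
                offers.setdefault(choice, []).append(d)
        if not offers:
            break
        for s, bidders in offers.items():
            best = max(bidders, key=view[s].get)  # take the best offer now
            placed[s] = best
            unfilled.discard(best)
    return placed, unfilled


rng = random.Random(1)
dept_quality = {d: rng.gauss(0, 0.3) for d in range(8)}
student_quality = {f"s{i}": rng.gauss(0, 1) for i in range(30)}
placed, unfilled = hiring_market(dept_quality, student_quality, openings=list(range(8)))
print(len(placed), "hired;", len(unfilled), "openings unfilled")
```

Students left out of `placed` correspond to those who leave the system, and departments left in `unfilled` correspond to jobs that go unfilled that year.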


Parameter / Description / Specification

Hiring probability
Description: Likelihood of a job opening beyond retirement replacement.
Specification: Cubic function of department size. [3 levels]

Student production
Description: Probability of each faculty member putting a student on the market in a given year.
Specification: Binomial (0,1), p = (0.06 to 0.08). [2 levels]. X1 = 165; X2 = 220

Faculty-Student Quality Correlation
Description: The correlation between student and faculty quality.
Specification: Specify as a correlation from 0.37 to 0.91. [3 levels]

Applicant Quality Evaluation
Description: Used by departments to rank applicants. Each department assigns applicants an observed quality score based on this function.
Specification: Observed = (Student quality) + b(N(0,1)). b: 0.3 to 0.9. [3 levels]

Department Quality Evaluation
Description: Used by applicants to rank job offers. Each student assigns departments an observed quality score based on this function.
Specification: Observed = (Department quality) + b(N(0,1)). b: 0.1 to 0.25. [2 levels]

Hiring Rounds
Description: Number of offer rounds made. Approximates time by limiting opportunity to make alternative offers.
Specification: Specify as number. 3 or 4. [2 levels]

Depth of Search
Description: How deeply into the pool of candidates departments are willing to go.
Specification: Specify as max depth. 10 to 30. [3 levels]

There are 3*2*3*3*2*2*3 = 648 points in the parameter space; 30 draws from each set gives 19,440 observations.
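The size of the parameter space in this summary can be checked by enumerating the full factorial design. The sketch below uses only the number of levels per parameter; the dictionary keys are illustrative labels, and level indices stand in for the actual parameter values, which are not all listed.

```python
from itertools import product

# Number of levels per parameter, as given in the parameter summary.
levels = {
    "hiring_probability": 3,
    "student_production": 2,
    "faculty_student_quality_correlation": 3,
    "applicant_quality_evaluation": 3,
    "department_quality_evaluation": 2,
    "hiring_rounds": 2,
    "depth_of_search": 3,
}

# Every combination of level indices is one point in the parameter space.
grid = list(product(*(range(n) for n in levels.values())))
draws_per_point = 30

print(len(grid))                    # 648
print(len(grid) * draws_per_point)  # 19440
```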

Simulating Network Structure Persistent inequality in PhD exchange Networks: parameter summary


A look under the hood…

All results are presented around the competitive field:

[Figure: the competitive field, a grid of Disagreement on Candidate Quality (0.3, 0.6, 0.9) by Depth of Search (10, 20, 30), with regions labeled High Competition and Low Competition]

Simulating Network Structure Persistent inequality in PhD exchange Networks: Non-network outcomes

Size & Quality: Correlation of Size and Quality, calculated at final year (y=100)


Burris reports the correlation between size and prestige as 0.63

Correlation of Size and Quality over time

Simulating Network Structure Persistent inequality in PhD exchange Networks: A single example run

A single example run – taken from the middle competition cell.

Quality Stability: 10 Year Correlation of Quality, calculated at final year (y=100)


Correlation of Quality 10 years prior


The production and hiring of PhDs generates an exchange network, connecting the “sending” department to the hiring department.

Note that, unlike many simulations, here the edges in the network are actors (rather than simply the result of node action).

I record this network for all hires in the last 10 years of the simulation history, and construct two measures:

a) The network centralization score
b) The correlation between network centrality & quality & size.

Simulating Network Structure Persistent inequality in PhD exchange Networks: Network outcomes

[Figure axes: Disagreement on Candidate Quality by Depth of Search]

For what follows, working within one region of the parameter space

A preliminary regression over the entire space shows that hiring rates & quality correlation matter most for centralization


Network Centralization by Quality Correlation & Job Openings


Correlation of Centrality & Department Size

Bonacich Centrality


Correlation of Centrality & Department Quality

Bonacich Centrality


Real data from all applicants for an open position at a large Midwestern university


OLS line

Most Productive Line (first sort selects here!)

The very simple market model proposed here can account for many of the features we see in real PhD exchange markets:

a) Stable quality rankings
b) Strong correlation between size & quality
c) Highly centralized networks
d) Correlation between quality ranking and centralization

Qualitatively, it appears that you can order most of these networks with a pretty clear distinction between “top” or “core” departments and a periphery, characterized by asymmetric flow of students.

Simulating Network Structure Persistent inequality in PhD exchange Networks: Tentative conclusions

There is still some room for non-market effects here, however, since the resulting hierarchies are not perfect:

a) Self-selection effects
- Students avoiding applying “out of their league”
- Adjusting depth of search to be linked to current quality

b) Social network effects
- Give a positive weight to students who come from departments where current faculty received their PhDs

c) Market segmentation effects
- Add a dimension of substantive “fit” to the market model.
- Should act as an interaction boost for market competition effects
- Will give sending advantages to large, diverse departments.


Simulating Network Structure Exploratory Simulation: Epidemic Potential from Low-degree networks

In this case, we motivate the work with 4 observations:

1. STD epidemics have to travel across a connected network.

2. The connectivity structure should be robust, since transmission is a low probability result.

3. Infectivity is temporally sensitive: for bacterial STDs the window is very short; for viruses like HIV, infectivity probability is highest early and late.
- This implies that the connected set needs to occupy a short infectivity window, which severely limits the number of partners most people will have (i.e. lifetime partner distributions are largely irrelevant).

4. A great deal of recent attention has been placed on extremely heterogeneous (“power law”) activity levels, with implications suggesting that we can only hope to contain epidemics like HIV by targeting the high-activity hubs.

But what kind of networks emerge in settings where there are no high activity hubs? How do these compare to the high-activity distribution networks?

Problem 3: Exploration of STD relevant networks


Here we simulate networks with a single behavior rule limiting the number of partners to a known distribution: the weakest form of an ABM for networks.

We vary the population level constraint on the distribution of relation volume, keeping a maximum of 3 partners and changing the distribution from a mode of 1 to a mode of 3.

Population size of 10,000 nodes.
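The setup above can be approximated with a random stub-matching sketch. This is an illustration under assumptions, not the study's procedure: partners are paired uniformly at random, self-pairings and repeated ties are not rejected (so realized degree can fall slightly below target), only the largest connected component is measured (not the more robust STD cores), and the two degree distributions are invented examples of "mode 1" and "mode 3".

```python
import random


def largest_component_share(degree_probs, n=10_000, seed=0):
    """Share of nodes in the largest component when degrees are drawn from {1, 2, 3}."""
    rng = random.Random(seed)
    degrees = rng.choices([1, 2, 3], weights=degree_probs, k=n)
    # One "stub" per desired tie end; shuffling and pairing stubs forms the ties.
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)

    parent = list(range(n))  # union-find forest over nodes

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in zip(stubs[::2], stubs[1::2]):
        parent[find(u)] = find(v)

    sizes = {}
    for node in range(n):
        root = find(node)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n


# Hypothetical distributions over 1, 2, and 3 partners.
share_mode_1 = largest_component_share([0.80, 0.15, 0.05])
share_mode_3 = largest_component_share([0.05, 0.15, 0.80])
print(share_mode_1, share_mode_3)
```

Sweeping the weights gradually from the first distribution to the second shows where the largest component starts to absorb most of the population, even though no node ever has more than three partners.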


Very small changes in degree generate a quick cascade to large connected components. While not quite as rapid, STD cores follow a similar pattern, emerging rapidly and rising steadily with small changes in the degree distribution.

This suggests that, even in the very short run (days or weeks, in some populations) large connected cores can emerge covering the majority of the interacting population, which can sustain disease, even when nobody is particularly active.

These results occur faster for low-degree populations than for the scale free populations, whose hub structure makes it difficult to form large-reaching robust sets.

Simulating Network Structure Promises and Pitfalls: the good

In both modes of simulation study (explanatory and exploratory), it is possible to change the macro conditions directly by affecting micro-level rules.

This is clearly the strongest factor in bridging the micro-macro problem.

This modeling strategy moves us from this:

[Diagram: Coleman’s “boat”: Contextual State, Individual Response, Resulting Action, with links 1, 2, and 3]


…to this:

[Diagram elements: Initial Conditions; Actor Rules; Action; Interaction; Aggregates of Action; (feedback); Stable Equilibrium; Unstable (?) Conditions that further motivate individual action. Macro and micro levels are marked.]

Simulating Network Structure Promises and Pitfalls: the bad

Still many questions about the empirical etiology of observed phenomena:

-Identifying a particular mechanism that “works” doesn’t mean it is the mechanism active in the settings of interest.

-Social life may be “overdetermined” in that sense.

-The tradeoff between realism and simplicity carries a cost:

-Simplicity is best for identifying the implications of a theoretical mechanism, but tells us little about how the simplified assumption will work in other interactive contexts. Setting a parameter to “0” is still an assumption, even if left unexamined.

-Realism is best for extending external validity, but often at the cost of knowing exactly why changes in one parameter affect an outcome in a given way.

Simulating Network Structure Promises and Pitfalls: the bad

Methodologically, simulation work is still largely done in a “boutique production” manner.

Different modelers use different programs, initial assumptions, etc., making replication difficult and increasing startup costs for everyone.

This is getting better with NetLogo or Repast, widely distributed packages that share modules (such as work in R), but there is still little institutional support for generalized simulation practice.

Simulating Network Structure Promises and Pitfalls: the ugly

Evaluating & presenting results

Often more results than can reasonably be summarized in a single paper (or readable book).

- We’re often interested in the distribution of outcomes, rather than the central tendency, which makes summarization that much more challenging.

- Results are not well suited for paper-journal distribution.

- Color, dynamics, and interaction are best treated with web-based outlets, but these often lack status.

- How do we extend these results to fit or predict in empirical settings where our simulated assumptions are not (cannot be) met?
