Simulating Social Networks
James Moody
Duke University
The Population Sciences and Agent Based Methodology: An Answer to the Macro-Micro Link?
September 27, 2006, NIH
Work reported in this presentation has been supported by NIH grants DA12831, HD41877, and AG024050. Thanks to the Center for Advanced Study in the Behavioral Sciences (CASBS) for office and tech support for this work.
Introduction: Network Structure Puzzles
Distribution of Popularity by size and city type
Why are high school popularity distributions constant across vastly different communities?
Add Health relational change statistics
How can the global structure remain constant given massive changes at the dyad level?
Introduction: Network Structure Puzzles
[Figure: adolescent romantic network components; node legend: Male, Female]
Introduction: Network Structure Puzzles
What rules can account for adolescent romantic network structure?
[Figure: department blocks I (N=52), II-a (N=4), II-b (N=15), II-c (N=22), II-d (N=81), III (N=384)]
Han, S-K. Social Networks 2003:251-280. Figure 1
Introduction: Network Structure Puzzles
Why are academic PhD exchange markets (hiring/placing) strongly centralized …
Burris, ASR 2004
… and positions within systems stable over generations?
Introduction: Network Structure Puzzles
• In each case, the interdependent activity of each actor affects the conditions shaping action for everyone else in the setting.
• History matters in a deterministic (rather than stochastic) sense.
• The processes shaping actors’ choices are locally bounded.
• The resulting network structure is often very far from a random null.
• Statistical models fail for either data or deep endogeneity reasons.
• Actor-oriented simulation methods
  • Provide a way of thinking about interdependent action
  • Create multiple replications with known variation on independent variables.
1. Introduction
2. Micro-Macro elements in social networks
   a) Coleman’s “boat”
   b) Structural correlates of micro & macro
   c) Linking rules to structures
3. Simulating Network Structure
   a) Dynamics of adolescent friendship structure
   b) Adolescent romantic exogamy
   c) Inequality in PhD exchange networks
4. Network Diffusion & Disease Spread
   a) Degree mixing models
   b) Relational timing
5. Promises & Pitfalls
   a) Good: Theoretical rigor, clarity, & elegance
   b) Bad: How to test against observed data
   c) Ugly: Rule proliferation & specification
Introduction: outline
Micro-Macro elements in social networks Coleman’s “Boat”
[Diagram: Coleman’s “boat”. Macro level: Contextual State → Global Outcome (link 4). Micro level: Individual Response → Resulting Action (link 2). Link 1 runs down from Contextual State to Individual Response; link 3 runs up from Resulting Action to Global Outcome.]
1) Macro → Micro: Typically contextual conditions that enable/constrain individual action.
2) Micro → Micro: A direct-action correlate of the contextually constrained behavior in (1).
3) Micro → Macro: An aggregation or interaction process that can account for the new global-level outcome.
4) The observed macro-level correlation is thus accounted for by actors capable of intent and action.
“[Social facts] assume a shape, a tangible form peculiar to them and constitute a reality sui generis vastly distinct from the individual facts which manifest that reality”
— Durkheim Rules Of Sociological Method
Micro-Macro elements in social networks Coleman’s “Boat”
Of these 4 links, the 3rd is often the trickiest. Here we face questions about emergent properties: features of the macro system that cannot be seen as simple (mean, sum, proportion) aggregations of individual action, but instead are seen as some interactive effect of the combined action.
[Diagram: Coleman’s “boat” with links 1, 2, and 3: Contextual State, Individual Response, Resulting Action, Global Outcome]
Micro-Macro elements in social networks Structural correlates of micro and macro
Network micro features: Anything you can measure on a local ego-network.
Ego-Net
Micro-Macro elements in social networks Structural correlates of micro and macro
Local-romantic Networks
Complete Network
Micro-Macro elements in social networks Structural correlates of micro and macro
Network micro features: Anything you can measure on a local ego-network.
Purely local: information on ego about ego’s contacts
• Number of ties (degree)
• Node attribute mixing
[Figure: ego (e) tied to alters 1, 2, 3, 4]
Local + alter interaction: information on ego and ego’s contacts with each other
• Number of ties (degree)
• Node attribute mixing
• Clustering
• Reciprocity
• Structural holes
[Figure: ego (e) tied to alters 1, 2, 3, 4, with ties among the alters]
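As a rough illustration of the local measures listed above, the sketch below computes degree, reciprocity, and clustering for one ego in a small directed network. The edge list and the helper name `ego_measures` are hypothetical, and the definitions used (reciprocity as the share of ego's outgoing ties that are returned, clustering as the share of alter pairs tied in either direction) are one common choice, not necessarily the ones used in this talk.

```python
from itertools import combinations

def ego_measures(edges, ego):
    """Local measures for `ego` from a directed list of (sender, receiver) ties."""
    ties = set(edges)
    out_alters = {r for s, r in ties if s == ego and r != ego}
    in_alters = {s for s, r in ties if r == ego and s != ego}
    alters = out_alters | in_alters
    # Reciprocity: share of ego's outgoing ties that are returned.
    returned = out_alters & in_alters
    reciprocity = len(returned) / len(out_alters) if out_alters else 0.0
    # Clustering: share of alter pairs tied to each other in either direction.
    pairs = list(combinations(alters, 2))
    linked = sum(1 for a, b in pairs if (a, b) in ties or (b, a) in ties)
    clustering = linked / len(pairs) if pairs else 0.0
    return {"degree": len(alters), "reciprocity": reciprocity, "clustering": clustering}

# Hypothetical ego network: ego "e" with alters 1-4.
edges = [("e", 1), (1, "e"), ("e", 2), ("e", 3), (4, "e"), (1, 2), (3, 4)]
measures = ego_measures(edges, "e")
print(measures)
```

Everything here is computed from ego's immediate neighborhood, which is what makes these "micro" features in the sense used above.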
Micro-Macro elements in social networks Structural correlates of micro and macro
Network macro features: Features resting on (a) paths of length > 2.
The key element that makes a network a system is the path: it’s how sets of actors are linked together indirectly.
A walk is a sequence of nodes and lines, starting and ending with nodes, in which each node is incident with the lines following and preceding it in a sequence.
A path is a walk where all of the nodes and lines are distinct.
Paths are the routes through networks that make diffusion possible; they govern connectivity and clustering, and reflect “clump” structure as well.
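The walk and path definitions above can be checked mechanically. This is a minimal sketch for a simple undirected graph stored as an adjacency dictionary; the function names and the toy graph are illustrative assumptions. In a simple graph, a walk with all nodes distinct necessarily has all lines distinct, so the path test only compares nodes.

```python
def is_walk(adj, seq):
    """True if each consecutive pair of nodes in `seq` is joined by a line."""
    return all(b in adj[a] for a, b in zip(seq, seq[1:]))

def is_path(adj, seq):
    """True if `seq` is a walk in which no node is repeated."""
    return is_walk(adj, seq) and len(set(seq)) == len(seq)

# Hypothetical chain a - b - c - d.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(is_walk(adj, ["a", "b", "a", "b", "c"]))  # a walk that revisits nodes
print(is_path(adj, ["a", "b", "a", "b", "c"]))  # not a path
print(is_path(adj, ["a", "b", "c", "d"]))       # a path
```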
Micro-Macro elements in social networks Structural correlates of micro and macro
Network macro features: Features resting on (a) paths of length > 2.
These two graphs have the exact same local properties, but very different global properties.
A B
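A small constructed example of the same point (not the two graphs shown on the slide): one ring of ten nodes and two separate rings of five nodes give every node exactly two ties and no ties among its contacts, yet the first is a single connected component and the second is not. The helper names are hypothetical.

```python
from collections import deque

def ring(nodes):
    """Adjacency dict for a single cycle through `nodes`, in order."""
    adj = {n: set() for n in nodes}
    for i, n in enumerate(nodes):
        m = nodes[(i + 1) % len(nodes)]
        adj[n].add(m)
        adj[m].add(n)
    return adj

def component_sizes(adj):
    """Sizes of connected components, largest first (breadth-first search)."""
    seen, sizes = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        sizes.append(size)
    return sorted(sizes, reverse=True)

graph_a = ring(list(range(10)))                                  # one 10-cycle
graph_b = {**ring(list(range(5))), **ring(list(range(5, 10)))}   # two 5-cycles
print(component_sizes(graph_a), component_sizes(graph_b))
```

Anything that spreads along paths can reach every node in the first graph but at most half the nodes in the second, even though no ego-network measure distinguishes them.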
Micro-Macro elements in social networks Structural correlates of micro and macro
Network macro features: Features resting on (b) the distribution of local features.
[Figure: Distribution of Popularity by size and city type]
Micro-Macro elements in social networks Structural correlates of micro and macro
Network macro features: Features resting on (c) outcomes distributed across nodes.
Define a general measure of the “diffusion susceptibility” of a graph as the ratio of the area under the observed curve to the area under a random baseline curve. As the ratio approaches 1.0, you get effectively faster diffusion.
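A minimal sketch of the ratio just described, assuming both curves are sampled at the same unit steps and using the trapezoid rule for the areas. The slide does not say how the curves are built; the two curves below are made-up numbers for illustration only, and the function names are hypothetical.

```python
def area_under(curve):
    """Trapezoid-rule area under a curve sampled at unit steps."""
    return sum((a + b) / 2 for a, b in zip(curve, curve[1:]))

def susceptibility_ratio(observed, baseline):
    """Area under the observed curve divided by area under the baseline curve."""
    return area_under(observed) / area_under(baseline)

# Illustrative (invented) cumulative proportions reached at steps 0..5.
observed = [0.0, 0.05, 0.20, 0.45, 0.70, 0.85]
baseline = [0.0, 0.10, 0.40, 0.75, 0.95, 1.00]
ratio = susceptibility_ratio(observed, baseline)
print(round(ratio, 3))
```

In this invented case the observed curve lies below the baseline, so the ratio is below 1.0.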
Micro-Macro elements in social networks Linking actor rules to network structure
For most simulation settings, we are often interested in identifying behavioral rules that (a) fit the micro network features of interest and (b) give rise (in combination) to the global features of interest.
Types of network rules:
1. Node volume features (number of ties)
2. Dyadic interaction features (reciprocity, race-mixing rules)
3. Indirect interaction features (social balance, relational exogamy rules)
4. Timing rules (relation duration, concurrency, and order)
Two ways to think of rule-action links for network models:
“Explanation”: Identifying a (small) set of rules that, when applied, account for features difficult to explain otherwise. Examples:
• Adolescent friendship dynamics
• Romantic network structure
• PhD exchange network structure & stability
“Exploration”: Start with a set of local rules you are confident in, then apply them to a setting to learn what system-level features emerge. Examples:
• Diffusion potential of low-degree networks
• Diffusion constraints resulting from relational timing
Simulating Network Structure Adolescent Romantic Networks
[Figure: romantic network components; node legend: Male, Female]
Explanation problem 1: Romantic relations at “Jefferson” high school
Source: Bearman, Moody and Stovel (2004) AJS
Simulating Network Structure Adolescent Romantic Networks
Is the network typical? How does it compare to random networks with the same micro-features?
Circle = observed, boxplots = simulated networks w. same volume.
The network is decidedly not random. Moreover, typical network mixing features don’t take us very far (homophily on number of prior partners helps constrain component size, and smoking homophily is evident by inspection).
We propose a network exogamy rule: a prohibition on cycles of length 4:
Introduce a prohibition on forming 4-cycles in the randomly simulated networks.
Here we get a much closer match between the simulated networks and the observed in each of our test statistics…
…and the simulated components have similar qualitative structures as well.
Simulating Network Structure Adolescent Romantic Networks
Evaluation:
This single rule addition, more than any other dyadic feature such as homophily on behavior or age mixing, generates networks with the structure we observe in reality. Its theoretical simplicity is the model’s greatest strength. From a simulation methods standpoint, this is a very simple rule set:
a) Constrain each actor to make the same number of partners observed in the real world
b) If a partner choice would close a 4-cycle, choose somebody else.
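The two rules above can be sketched as a simple rejection procedure. This is an illustrative sketch only, not the procedure used in the published analysis: the function names, the toy degree targets, and the restart-on-failure strategy are all assumptions. A new tie u-v closes a 4-cycle exactly when a path u-a-b-v of length 3 already exists.

```python
import random

def closes_four_cycle(adj, u, v):
    """True if a path u - a - b - v already exists, so that a tie u-v lies on a 4-cycle."""
    for a in adj[u]:
        if a == v:
            continue
        for b in adj[a]:
            if b != u and b != v and v in adj[b]:
                return True
    return False

def simulate_romantic_network(target_degree, seed=0, max_restarts=500):
    """Match each node's target number of partners while never closing a 4-cycle.

    Returns an adjacency dict, or None if every attempt got stuck.
    """
    rng = random.Random(seed)
    nodes = list(target_degree)
    for _ in range(max_restarts):
        adj = {n: set() for n in nodes}
        while True:
            open_nodes = [n for n in nodes if len(adj[n]) < target_degree[n]]
            if not open_nodes:
                return adj  # rule (a) satisfied for everyone
            u = rng.choice(open_nodes)
            candidates = [v for v in open_nodes
                          if v != u and v not in adj[u]
                          and not closes_four_cycle(adj, u, v)]  # rule (b)
            if not candidates:
                break  # no admissible partner left: discard and restart
            v = rng.choice(candidates)
            adj[u].add(v)
            adj[v].add(u)
    return None

# Hypothetical targets: six nodes want 2 partners, six want 1.
target = {i: (1 if i % 2 else 2) for i in range(12)}
net = simulate_romantic_network(target, seed=3)
print(sorted(len(partners) for partners in net.values()))
```

The restart branch is where a naive simulation spends its time: a run can reach a state in which no remaining choice satisfies both rules at once.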
Simulating Network Structure Adolescent Romantic Networks
Evaluation:
From an implementation standpoint, the simulation is complicated by an empirical identification problem: there are many possible configurations where these two constraints cannot be met simultaneously.
In the process of making choices, we effectively run out of degrees of freedom – where any new choice would lead to a violation in the degree distribution or create a 4-cycle.
- Theoretically, this implies that the real-world graph is coming from a fairly small region of the overall graph space.
- Methodologically it means that using only a simple rule-based simulation was computationally inefficient. We solved this by adding “graph identification” procedures that forced choices once prior choices implied them.
- This difficulty followed from our desire to fit the distributions exactly.
Explanation problem 2: Academic Caste Systems
Simulating Network Structure Academic Castes: inequality in PhD exchange Networks
[Figure: department blocks I (N=52), II-a (N=4), II-b (N=15), II-c (N=22), II-d (N=81), III (N=384)]
Han, S-K. Social Networks 2003:251-280. Figure 1
Why is this network so hierarchical and stable?
“Social Capital” = Bonacich Centrality on symmetric version of the PhD exchange Network
The resulting status-based network has a strong correlation between centrality in the hiring network & quality ranking
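To make the centrality measure concrete, here is a rough sketch that symmetrizes a directed hiring network and scores departments by eigenvector centrality using power iteration. Bonacich centrality has more than one formulation; this sketch assumes the eigenvector form, and the department labels and flow counts are invented for illustration.

```python
def symmetrize(flows):
    """Undirected tie weights from directed (sender, receiver, count) triples."""
    weights = {}
    for sender, receiver, count in flows:
        if sender != receiver:
            key = frozenset((sender, receiver))
            weights[key] = weights.get(key, 0) + count
    return weights

def eigenvector_centrality(weights, iterations=200):
    """Power iteration on A + I (same eigenvectors as A, avoids oscillation)."""
    nodes = sorted({n for pair in weights for n in pair})
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new = dict(score)  # the "+ I" term
        for pair, w in weights.items():
            a, b = tuple(pair)
            new[a] += w * score[b]
            new[b] += w * score[a]
        top = max(new.values())
        score = {n: value / top for n, value in new.items()}
    return score

# Invented PhD flows: (sending department, hiring department, number of hires).
flows = [("A", "B", 5), ("A", "C", 4), ("B", "C", 1), ("A", "D", 2), ("B", "A", 1)]
centrality = eigenvector_centrality(symmetrize(flows))
print({dept: round(value, 2) for dept, value in centrality.items()})
```

Scores are scaled so the most central department has value 1.0; a department is central to the extent that it exchanges PhDs with other central departments.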
Simulating Network Structure Persistent inequality in PhD exchange Networks: Simulation Setup
The purpose of this simulation is to examine the effect of market-relevant behavior under ideal-typical conditions. This involves simplifying the real world as much as possible, to isolate how particular factors affect outcomes of interest.
Key real-world properties of interest:
• Stable quality rankings
• Strong correlation between size and quality
• Centralized hiring networks
• Strong correlation between centrality and prestige
Currently, all actors follow the same strategy, and I vary the strategy set across simulation runs.
Future work will vary department strategies within runs to see how these affect competitive advantage.
Actors
• Departments: Collections of faculty who hire applicants & produce new students (N=100). Initial department size is drawn from a normal distribution with mean = 25, std = 12, but I re-draw if size is less than 10, so the actual distribution is slightly skewed.
• Applicants: Students from (other) departments who apply for jobs.
• Departments seek to hire the best students; students want to work at the best departments.
• These actors are rational, honest, and risk-averse, but all actors have individual preferences & errors in vision.
The simulation does not include tenure or senior moves. So you can treat this as the “realized” or “final” position outcomes.
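The initial department-size draw described above (normal with mean 25 and std 12, re-drawn when below 10) can be sketched as follows. Rounding to whole faculty members, the seed, and the function name are assumptions not stated in the slides.

```python
import random

def draw_department_sizes(n=100, mean=25, sd=12, minimum=10, seed=1):
    """Draw n department sizes, re-drawing any value below `minimum`."""
    rng = random.Random(seed)
    sizes = []
    while len(sizes) < n:
        size = round(rng.gauss(mean, sd))  # rounding is an assumption
        if size >= minimum:
            sizes.append(size)
    return sizes

sizes = draw_department_sizes()
print(len(sizes), min(sizes), sum(sizes) / len(sizes))
```

Because low draws are rejected and high draws are kept, the realized distribution loses its lower tail, which is the slight skew noted above.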
Simulating Network Structure Persistent inequality in PhD exchange Networks: Simulation Setup
Attributes
Quality. Each faculty member and student has an overall quality score.
• Initial faculty quality is distributed as random normal(0,1).
• This implies that departments are effectively equal at time 1, with only minor differences due to random chance.
•Student quality is a (specifiable) random function of faculty quality.
•Department quality is the mean of faculty quality.
While each person has a given quality score, actor choices are made based on an evaluation of quality, which differs across actors.
This variation reflects both differences in preferences and ability to discern quality.
Simulating Network Structure Persistent inequality in PhD exchange Networks: Simulation Setup
Action: Departments
• Departments hire & produce students.
• For each of 100 years:
  • Every department produces students (conditional on size).
  • A (random) subset of departments have job openings based on (a) prior retirements & (b) current size relative to a target size.
  • Departments rank applicants by their evaluation of applicant quality, and make offers to their top choices.
  • If a department’s 1st choice goes elsewhere, it goes to the next for a specifiable number of rounds to a specifiable ‘depth’ into the pool.
  • Jobs can go unfilled, which means that departments can both grow and shrink.
Simulating Network Structure Persistent inequality in PhD exchange Networks: Simulation Setup
Action: Students
• Students rank departments that make them an offer by their evaluation of department quality, and take the best job they are offered.
•If a student does not receive a job offer in a given year, they move out of the system
•Lots of students don’t get jobs (at PhD granting universities…)
•Students are not strategic: they do not forego a good offer while waiting for a better one -- this is the “risk averse” quality, though this could be changed.
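One hiring year under the department and student rules above might look like the following sketch. It is a simplified reading of the slides, not the author's code: the function and argument names are hypothetical, evaluation noise is assumed to be additive normal error scaled by `b_app` and `b_dept`, and the toy inputs are invented.

```python
import random

def hiring_year(dept_quality, openings, student_quality,
                b_app=0.6, b_dept=0.2, rounds=3, depth=10, seed=0):
    """Run one year of offers; return (student -> department, unfilled openings)."""
    rng = random.Random(seed)
    # Each department's noisy ranking of applicants, cut off at its search depth.
    ranking = {}
    for d in dept_quality:
        seen = {s: q + b_app * rng.gauss(0, 1) for s, q in student_quality.items()}
        ranking[d] = sorted(seen, key=seen.get, reverse=True)[:depth]
    # Each student's noisy view of every department.
    view = {s: {d: q + b_dept * rng.gauss(0, 1) for d, q in dept_quality.items()}
            for s in student_quality}
    hired = {}
    open_slots = dict(openings)
    for _ in range(rounds):
        offers = {}
        for d, slots in open_slots.items():
            available = [s for s in ranking[d] if s not in hired]
            for s in available[:slots]:  # one offer per remaining opening
                offers.setdefault(s, []).append(d)
        for s, depts in offers.items():
            best = max(depts, key=view[s].get)  # non-strategic: take best offer in hand
            hired[s] = best
            open_slots[best] -= 1
    return hired, open_slots  # students without offers leave; jobs may stay open

dept_quality = {"D1": 0.8, "D2": 0.1, "D3": -0.6}
openings = {"D1": 1, "D2": 2, "D3": 1}
student_quality = {f"s{i}": q for i, q in enumerate([1.2, 0.7, 0.3, 0.0, -0.4, -0.9])}
hired, unfilled = hiring_year(dept_quality, openings, student_quality)
print(hired, unfilled)
```

Limiting the number of rounds is what lets a department that keeps losing its top choices end the year with an opening unfilled.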
Simulating Network Structure Persistent inequality in PhD exchange Networks: Simulation Setup
Parameter | Description | Specification
Hiring probability | Likelihood of a job opening beyond retirement replacement. | Cubic function of department size. [3 levels]
Student production | Probability of each faculty member putting a student on the market in a given year. | Binomial (0,1), p = 0.06 to 0.08. [2 levels]. X1 = 165; X2 = 220
Faculty-Student Quality Correlation | The correlation between student and faculty quality. | Specify as a correlation from 0.37 to 0.91. [3 levels]
Applicant Quality Evaluation | Used by departments to rank applicants. Each department assigns applicants an observed quality score based on this function. | Observed = (Student quality) + b(N(0,1)). b: 0.3 to 0.9. [3 levels]
Department Quality Evaluation | Used by applicants to rank job offers. Each student assigns departments an observed quality score based on this function. | Observed = (Department quality) + b(N(0,1)). b: 0.1 to 0.25. [2 levels]
Hiring Rounds | Number of offer rounds made. Approximates time by limiting opportunity to make alternative offers. | Specify as number. 3 or 4. [2 levels]
Depth of Search | How deeply into the pool of candidates departments are willing to go. | Specify as max depth. 10 to 30. [3 levels]
There are 3*2*3*3*2*2*3 = 648 points in the parameter space; 30 draws from each set gives 19,440 observations.
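The size of the design can be checked by enumerating the grid. The level counts come from the parameter table; the dictionary keys are shorthand labels introduced here, and the actual level values are left abstract because the table gives only ranges.

```python
from itertools import product

# Number of levels per parameter, as listed in the table.
levels = {
    "hiring_probability": 3,
    "student_production": 2,
    "faculty_student_quality_correlation": 3,
    "applicant_quality_evaluation": 3,
    "department_quality_evaluation": 2,
    "hiring_rounds": 2,
    "depth_of_search": 3,
}

# Every combination of level indices is one point in the parameter space.
grid = list(product(*(range(k) for k in levels.values())))
draws_per_point = 30
total_runs = len(grid) * draws_per_point
print(len(grid), total_runs)
```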
Simulating Network Structure Persistent inequality in PhD exchange Networks: parameter summary
A look under the hood…
All results are presented around the competitive field:
[Figure: the competitive field, ranging from High Competition to Low Competition. Axes: Disagreement on Candidate Quality (0.3, 0.6, 0.9) and Depth of Search (10, 20, 30).]
Simulating Network Structure Persistent inequality in PhD exchange Networks: Non-network outcomes
Size & Quality: Correlation of Size and Quality. Calculated at final year (y=100).
Simulating Network Structure Persistent inequality in PhD exchange Networks: Non-network outcomes
Burris reports the correlation between size and prestige as 0.63
Correlation of Size and Quality over time
Simulating Network Structure Persistent inequality in PhD exchange Networks: A single example run
A single example run – taken from the middle competition cell.
Quality Stability: 10 Year Correlation of Quality. Calculated at final year (y=100).
Simulating Network Structure Persistent inequality in PhD exchange Networks: Non-network outcomes
Correlation of Quality 10 years prior
Simulating Network Structure Persistent inequality in PhD exchange Networks: A single example run
A single example run – taken from the middle competition cell.
The production and hiring of PhDs generates an exchange network, connecting the “sending” department to the hiring department.
Note that, unlike many simulations, here the edges in the network are actors (rather than simply the result of node action).
I record this network for all hires in the last 10 years of the simulation history, and construct two measures:
a) The network centralization score
b) The correlation between network centrality & quality & size.
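The slides do not say which centralization index is used. As one common possibility, the sketch below computes Freeman's degree centralization for an undirected network: the sum of differences between the largest degree and each node's degree, divided by the maximum that sum can take, (n-1)(n-2), which is reached by a star. The function name and toy edge lists are assumptions.

```python
def degree_centralization(edges):
    """Freeman degree centralization of a simple undirected graph (0 to 1)."""
    nodes = {n for edge in edges for n in edge}
    adj = {n: set() for n in nodes}
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    n = len(nodes)
    if n < 3:
        return 0.0
    degrees = [len(adj[v]) for v in nodes]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))

star = [("hub", i) for i in range(4)]            # one department tied to all others
ring = [(i, (i + 1) % 5) for i in range(5)]      # every department equally tied
print(degree_centralization(star), degree_centralization(ring))
```

A fully centralized exchange network scores 1.0 and a perfectly even one scores 0.0.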
Simulating Network Structure Persistent inequality in PhD exchange Networks: Network outcomes
[Figure axes: Disagreement on Candidate Quality; Depth of Search]
For what follows, I work within one region of the parameter space.
A preliminary regression over the entire space shows that hiring rates & quality correlation matter most for centralization
Simulating Network Structure Persistent inequality in PhD exchange Networks: Network outcomes
Network Centralization by Quality Correlation & Job Openings
Simulating Network Structure Persistent inequality in PhD exchange Networks: Network outcomes
Correlation of Centrality & Department Size
Bonacich Centrality
Simulating Network Structure Persistent inequality in PhD exchange Networks: Network outcomes
Correlation of Centrality & Department Quality
Bonacich Centrality
Simulating Network Structure Persistent inequality in PhD exchange Networks: Network outcomes
Real data from all applicants for an open position at a large Midwestern university
Simulating Network Structure Persistent inequality in PhD exchange Networks: Network outcomes
OLS line
Most Productive Line (first sort selects here!)
The very simple market model proposed here can account for many of the features we see in real PhD exchange markets:
a) Stable quality rankings
b) Strong correlation between size & quality
c) Highly centralized networks
d) Correlation between quality ranking and centralization
Qualitatively, it appears that you can order most of these networks with a pretty clear distinction between “top” or “core” departments and a periphery, characterized by asymmetric flow of students.
Simulating Network Structure Persistent inequality in PhD exchange Networks: Tentative conclusions
There is still some room for non-market effects here, however, since the resulting hierarchies are not perfect:
a) Self-selection effects
   • Students avoiding applying “out of their league”
   • Adjusting depth of search to be linked to current quality
b) Social network effects
   • Give a positive weight to students who come from departments where current faculty received their PhDs
c) Market segmentation effects
   • Add a dimension of substantive “fit” to the market model.
     • Should act as an interaction boost for market competition effects
     • Will give sending advantages to large diverse departments.
Simulating Network Structure Exploratory Simulation: Epidemic Potential from Low-degree networks
In this case, we motivate the work with 4 observations:
1. STD epidemics have to travel across a connected network.
2. The connectivity structure should be robust, since transmission is a low-probability result.
3. Infectivity is temporally sensitive: for bacterial STDs the window is very short; for viruses like HIV, infectivity probability is highest early and late.
   • This implies that the connected set needs to occupy a short infectivity window, which severely limits the number of partners most people will have (i.e. lifetime partner distributions are largely irrelevant).
4. A great deal of recent attention has been placed on extremely heterogeneous (“power law”) activity levels, with implications suggesting that we can only hope to contain epidemics like HIV by targeting the high-activity hubs.
But what kind of networks emerge in settings where there are no high activity hubs? How do these compare to the high-activity distribution networks?
Problem 3: Exploration of STD relevant networks
Simulating Network Structure Exploratory Simulation: Epidemic Potential from Low-degree networks
Here we simulate networks with a single behavior rule limiting the number of partners to a known distribution: the weakest form of an ABM for networks.
We vary the population level constraint on the distribution of relation volume, keeping a maximum of 3 partners and changing the distribution from a mode of 1 to a mode of 3.
Population size of 10,000 nodes.
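A minimal sketch of this kind of exploration, assuming random stub matching (a configuration-model style pairing) as the way to build a network from a partner-count distribution capped at 3. The specific probabilities, the smaller population size, and the function names are illustrative assumptions; self-pairings and repeated pairs are simply dropped, so realized degrees can fall slightly below their targets.

```python
import random
from collections import deque

def simulate_low_degree(n, degree_probs, seed=0):
    """Random network in which each node's partner count is drawn from degree_probs."""
    rng = random.Random(seed)
    degrees = rng.choices(list(degree_probs), weights=list(degree_probs.values()), k=n)
    stubs = [node for node, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    adj = {node: set() for node in range(n)}
    for u, v in zip(stubs[::2], stubs[1::2]):
        if u != v:  # drop self-pairings; duplicates collapse in the set
            adj[u].add(v)
            adj[v].add(u)
    return adj

def largest_component_share(adj):
    """Share of all nodes that sit in the largest connected component."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best / len(adj)

mode_1 = simulate_low_degree(2000, {1: 0.7, 2: 0.2, 3: 0.1})
mode_3 = simulate_low_degree(2000, {1: 0.1, 2: 0.2, 3: 0.7})
share_1 = largest_component_share(mode_1)
share_3 = largest_component_share(mode_3)
print(round(share_1, 3), round(share_3, 3))
```

Comparing the two shares shows how much of the population ends up in one connected set when the modal number of partners shifts from 1 to 3, with nobody exceeding 3 partners.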
Simulating Network Structure Exploratory Simulation: Epidemic Potential from Low-degree networks
Very small changes in degree generate a quick cascade to large connected components. While not quite as rapid, STD cores follow a similar pattern, emerging rapidly and rising steadily with small changes in the degree distribution.
This suggests that, even in the very short run (days or weeks, in some populations) large connected cores can emerge covering the majority of the interacting population, which can sustain disease, even when nobody is particularly active.
These results occur faster for low-degree populations than for scale-free populations, whose hub structure makes it difficult to form large-reaching robust sets.
Simulating Network Structure Promises and Pitfalls: the good
In both modes of simulation study (explanatory and exploratory), it is possible to change the macro conditions directly by affecting micro-level rules.
This is clearly the strongest factor in bridging the micro-macro problem.
This modeling strategy moves us from this:
[Diagram: Coleman’s “boat”. Macro: Contextual State, Global Outcome. Micro: Individual Response, Resulting Action. Links 1, 2, 3.]
…to this:
[Diagram: Macro: Initial Conditions; Aggregates of Action; Stable Equilibrium or Unstable (?) Conditions that further motivate individual action (feedback). Micro: Actor Rules, Action, Interaction.]
Simulating Network Structure Promises and Pitfalls: the bad
Still many questions about the empirical etiology of observed phenomena:
-Identifying a particular mechanism that “works” doesn’t mean it is the mechanism active in the settings of interest.
-Social life may be “overdetermined” in that sense.
-The tradeoff between realism and simplicity carries a cost:
-Simplicity is best for identifying the implications of a theoretical mechanism, but tells us little about how the simplified assumption will work in other interactive contexts. Setting a parameter to “0” is still an assumption, even if left unexamined.
-Realism is best for extending external validity, but often at the cost of knowing exactly why changes in one parameter affect an outcome in a given way.
Simulating Network Structure Promises and Pitfalls: the bad
Methodologically, simulation work still largely operates in a “boutique production” manner.
Different modelers use different programs, initial assumptions, etc., making replication difficult and increasing startup costs for everyone.
This is getting better with NetLogo, Repast, and widely distributed packages that share modules (such as work in R), but there is still little institutional support for generalized simulation practice.
Simulating Network Structure Promises and Pitfalls: the ugly
Evaluating & presenting results
Often more results than can reasonably be summarized in a single paper (or readable book).
- We’re often interested in the distribution of outcomes, rather than the central tendency, which makes summarization that much more challenging.
- Results are not well suited for paper-journal distribution.
- Color, dynamics, and interaction are best treated with web-based outlets, but these often lack status.
- How do we extend these results to fit or predict in empirical settings where our simulated assumptions are not (cannot be) met?