
Social Network Analysis and Complex Systems Science

Pip Pattison

University of Melbourne

CSIRO Complex Systems Symposium, Pelican Beach, 10-12 Aug 2004

In collaboration with:

Garry Robins, University of Melbourne

Tom Snijders, University of Groningen

Henry Wong, University of Melbourne

Jodie Woolcock, University of Melbourne

Emmanuel Lazega, University of Lille I

Kim Albert, University of Melbourne

Anne Mische, Rutgers University

John Padgett, University of Chicago

Peng Wang, University of Melbourne

1. Why are social networks important?

For understanding action in relation to its social context

– network ties link actors to each other as well as to groups, cultural resources, neighbourhoods, communities

– networks structure opportunities and constraints

For understanding social dynamics

– social action is interactive: one person’s action changes the context for those to whom they are connected

For understanding the cumulation of local processes into population-level outcomes

– the structure of networks and the dynamics of local processes are critical to understanding how locally interactive, context-dependent actions cumulate into outcomes at higher levels (e.g. communities, populations)

A simplified multi-layered and relational framework for the social world

Social units: individuals, groups, ...

Ties among social units: person-to-person, person-to-group, ...

Settings: geographical, sociocultural, ...

For example:

Interactions between social units depend on proximity through ties

Interactions between ties depend on proximity through settings

There are interactions within and between levels

Social structure: regularities in interactions

2: Typical data structures

Network observations give rise to relational data structures, e.g.:

people × groups, people × attributes, groups × attributes, people × settings, groups × settings, people × people

people × people × types of tie, people × people × settings, …
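As a rough illustration of such relational structures, the sketch below stores a people-by-people tie array and a people-by-groups membership array as nested lists and combines them. The names (`people`, `groups`, `ties`, `membership`, `ties_within_groups`) and the toy data are hypothetical and not taken from any of the studies cited here.

```python
# Hypothetical toy data: 4 people, 2 groups.
people = ["ann", "bob", "cat", "dan"]
groups = ["g1", "g2"]

# people x people: ties[i][j] = 1 if person i has a tie to person j.
ties = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

# people x groups: membership[i][g] = 1 if person i belongs to group g.
membership = [
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
]


def ties_within_groups(ties, membership):
    """Count directed ties whose two endpoints share at least one group."""
    n = len(ties)
    count = 0
    for i in range(n):
        for j in range(n):
            if i != j and ties[i][j] == 1:
                shared = any(a and b for a, b in zip(membership[i], membership[j]))
                count += int(shared)
    return count


print(ties_within_groups(ties, membership))
```

Higher-order structures such as people × people × types of tie could be held the same way, with one such array per tie type.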

Some important design issues:

Network boundaries? Complete: which “nodes” to include?

Which network ties? What are the relevant network links?

How do we best “measure” them?

Example 1: Management consulting firm

node colour codes workgroup membership

node size codes extent of cohesive beliefs

ties: “Who do you ask when you want to find out what is going on..?”

Example 2: Network of Mutual Collaboration Ties (Lazega, 1999)

Example 3: Change in interorganizational networks (Goldman et al, 1994)

Data are from an evaluation of the Robert Wood Johnson Program on Chronic Mental Illness in 6 US cities (one of which was a “control” site)

Organisations: mental health agencies in the “control” site (n = 37)

Networks at time 1 and time 2 (x1, x2): client referrals, information-sharing, fund-sharing

Data are from key informants and were gathered two years apart

Client referrals: time 1

3: Modelling networks and other relational structures

Guiding principles:

1. Network ties (and other observations) are the outcome of unobserved processes that tend to be local and interactive

2. There are both regularities and irregularities in these local interactive processes

Hence we aim for a stochastic model formulation in which:

– local interactions are permitted and assumptions about “locality” are explicit

– regularities are represented by model parameters and estimated from data

– consequences of local regularities for global network properties can be understood and can also provide an exacting approach to model evaluation

Building models for social networks

We model tie variables: $X = [X_{ij}]$, where $X_{ij} = 1$ if i has a tie to j, and $X_{ij} = 0$ otherwise

A realisation of $X$ is denoted by $x = [x_{ij}]$

Two modelling steps:

Methodological: define two network tie variables to be neighbours if they are conditionally dependent, given the values of all other tie variables

Substantive: what are appropriate assumptions about the neighbourhood relation (ie about the network topology)?

Network topologies: which tie variables are neighbours?

Two tie variables are neighbours if:

they share a dyad: dyad-independent model

they share an actor: Markov model

they share a connection with the same tie: realisation-dependent model

they share a connection with two ties: k-triangle model

etc...

Models for interactive systems of variables (Besag, 1974)

Hammersley-Clifford theorem: A model for X has a form determined by its neighbourhoods, where a neighbourhood is a set of mutually neighbouring variables

This general approach leads to:

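$$P(X = x) = \frac{1}{c} \exp\Big\{ \sum_Q \lambda_Q z_Q(x) \Big\}$$

where:

$c = \sum_x \exp\{ \sum_Q \lambda_Q z_Q(x) \}$ is the normalizing quantity

$\lambda_Q$ is the parameter for neighbourhood Q

$z_Q(x) = \prod_{X_{ij} \in Q} x_{ij}$ is the network statistic; it signifies whether all ties in Q are observed in x

the summation is over all neighbourhoods Q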
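For very small networks the normalizing quantity can be computed by enumerating every graph, which makes the general form P(X = x) = (1/c) exp{sum over Q of lambda_Q z_Q(x)} concrete. The sketch below assumes undirected graphs and writes the neighbourhood parameters as `lam` (standing for lambda_Q); the function names are illustrative, and the enumeration is only feasible for a handful of nodes.

```python
import itertools
import math


def all_graphs(n):
    """Yield every undirected graph on n nodes as a frozenset of dyads (i, j), i < j."""
    dyads = list(itertools.combinations(range(n), 2))
    for bits in itertools.product([0, 1], repeat=len(dyads)):
        yield frozenset(d for d, b in zip(dyads, bits) if b)


def z(config, graph):
    """z_Q(x): 1 if every tie in the configuration Q is present in x, else 0."""
    return 1 if all(tie in graph for tie in config) else 0


def probability(graph, n, params):
    """P(X = x), where params maps each configuration Q (a frozenset of dyads) to lambda_Q."""
    def weight(g):
        return math.exp(sum(lam * z(q, g) for q, lam in params.items()))

    c = sum(weight(g) for g in all_graphs(n))  # normalizing quantity
    return weight(graph) / c


# Illustrative parameter values (assumed): each edge has lambda = -1,
# and the single triangle on three nodes has lambda = 2.
dyads3 = list(itertools.combinations(range(3), 2))
params = {frozenset([d]): -1.0 for d in dyads3}
params[frozenset(dyads3)] = 2.0
print(probability(frozenset(dyads3), 3, params))
```

With edge parameters only, the ties are independent and the probability of the empty graph on three nodes reduces to (1 / (1 + e^lambda))^3, which gives a simple check on the enumeration.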

Neighbourhoods depend on proximity assumptions

Assumptions: two ties are neighbours:

if they share a dyad: dyad-independence

if they share an actor: Markov

if they share a connection with the same tie: realisation-dependent

Configurations for neighbourhoods

dyad-independence: edge

+ Markov: 2-star, 3-star, 4-star, ..., triangle

+ realisation-dependent: 3-path, 4-cycle, “coathanger”, ...

Neighbourhoods, continued

k-triangle model

2 ties are neighbours if they create a 4-cycle

configurations include: k-independent 2-path, k-triangle

[Figure: the k-independent 2-path and k-triangle configurations, with the set of k nodes labelled]

useful for higher-order clustering effects

Homogeneous Markov random graphs (Frank & Strauss, 1986)

$$P(X = x) = \frac{1}{c} \exp\{ \theta L(x) + \sigma_2 S_2(x) + \dots + \sigma_k S_k(x) + \dots + \tau T(x) \}$$

where:

$L(x)$ = no of edges in x

$S_2(x)$ = no of 2-stars in x

…

$S_k(x)$ = no of k-stars in x

…

$T(x)$ = no of triangles in x
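The statistics L(x), S_k(x) and T(x) are simple counts, and the sketch below computes them for an undirected graph stored as a symmetric 0/1 adjacency matrix. The function names are illustrative. The k-star count uses the fact that a node of degree d is the centre of C(d, k) k-stars, so it is intended for k of 2 or more.

```python
import math
from itertools import combinations


def degrees(adj):
    """Degree of each node in a symmetric 0/1 adjacency matrix."""
    return [sum(row) for row in adj]


def n_edges(adj):
    """L(x): each edge contributes to two degrees."""
    return sum(degrees(adj)) // 2


def n_k_stars(adj, k):
    """S_k(x) for k >= 2: a node of degree d is the centre of C(d, k) k-stars."""
    return sum(math.comb(d, k) for d in degrees(adj))


def n_triangles(adj):
    """T(x): number of node triples with all three ties present."""
    n = len(adj)
    return sum(
        1
        for i, j, h in combinations(range(n), 3)
        if adj[i][j] and adj[j][h] and adj[i][h]
    )


# A triangle on nodes 0, 1, 2 with a pendant node 3 attached to node 0.
adj = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
print(n_edges(adj), n_k_stars(adj, 2), n_k_stars(adj, 3), n_triangles(adj))
```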

Simulating from homogeneous Markov random graph distributions on 36 nodes: a typical graph

Parameter values: $\theta = -3$, $\sigma_2 = 2$, $\tau = 0$, $\sigma_3 = -2$

Average statistics: edges 57.0, 2-stars 133.8, triangles 2.3, 3-stars 68.4

Typical graphs for $\tau$ = 0, 2, 5, 6

A typical graph for $\tau = 10$

Parameter values: $\theta = -3$, $\sigma_2 = 2$, $\tau = 10$, $\sigma_3 = -2$

Average statistics: edges 92.0, 2-stars 390.0, triangles 130.0, 3-stars 440.0

These models can represent very different network structures, e.g. small worlds: $\theta = -4$, $\sigma_2 = 0.1$, $\sigma_3 = -0.05$, $\tau = 1$ [Robins, Pattison & Woolcock, in press]

No of edges: L = 126

Path length distribution: Q1 = 4 (5), Q2 = 5 (7), Q3 = ()

Clustering coefficient: Cluster = 0.09 (0.02)

Figures for the Bernoulli distribution (in red) are given in parentheses

Longer path worlds: $\theta = -1.2$, $\sigma_2 = 0.05$, $\sigma_3 = -1$, $\tau = 1$, but levels of clustering are still high

No of edges = 118, Q1 = 5 (5), Q2 = 7 (7), Q3 = 9 (), Cluster = 0.08 (0.02)

Very long path worlds: $\theta = -2.2$, $\sigma_2 = 0.05$, $\sigma_3 = -2$, $\tau = 1$ (no clustering)

L = 82, Q1 = (11), Q2 = (), Q3 = (), Cluster = 0.00 (0.02)

Simulations of two-star models (n = 30)

(a) $\theta = 0$, $\sigma_2$ = [0.00, 0.01, …, 0.10]

(see also Handcock, 2004; Park & Newman, 2004; Snijders, 2002)

Metropolis algorithm, multiple random starts

The complete graph has high probability for high values of $\sigma_2$

[Figure: three panels plotting average degree, number of 2-stars, and number of successful proposals in 500,000 steps against the 2-star parameter]
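A simulation of this kind can be sketched with a basic Metropolis sampler that proposes toggling one randomly chosen tie at a time. This is a minimal sketch rather than the code behind these slides: the names `theta` (edge parameter), `sigma2` (2-star parameter), `start_density` and `simulate_two_star` are assumptions, and the random start is taken to be a Bernoulli graph.

```python
import math
import random


def simulate_two_star(n, theta, sigma2, steps, start_density=0.0, seed=0):
    """Metropolis sampler for an undirected graph with weight exp(theta*L + sigma2*S2).

    Returns the adjacency matrix, the degree list, and the number of accepted proposals.
    """
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    deg = [0] * n
    # Random start (assumed form): each tie present independently with start_density.
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < start_density:
                adj[i][j] = adj[j][i] = 1
                deg[i] += 1
                deg[j] += 1
    accepted = 0
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)  # propose toggling the tie between i and j
        if adj[i][j] == 0:
            # Adding the tie creates deg[i] + deg[j] new 2-stars.
            delta = theta + sigma2 * (deg[i] + deg[j])
            new = 1
        else:
            # Removing the tie destroys (deg[i] - 1) + (deg[j] - 1) 2-stars.
            delta = -(theta + sigma2 * (deg[i] + deg[j] - 2))
            new = 0
        # Metropolis rule: accept with probability min(1, exp(delta)).
        if delta >= 0 or rng.random() < math.exp(delta):
            adj[i][j] = adj[j][i] = new
            change = 1 if new else -1
            deg[i] += change
            deg[j] += change
            accepted += 1
    return adj, deg, accepted


adj, deg, accepted = simulate_two_star(30, 0.0, 0.05, 50000, start_density=0.5, seed=1)
print(sum(deg) / len(deg), accepted)
```

Running the sampler over a grid of `sigma2` values from several starting densities, and recording the average degree, the 2-star count and the number of accepted proposals, would give quantities comparable to those plotted in the panels.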

-100000

(b) $\theta = -2.5$, $\sigma_2$ = [-0.50, -0.45, …, 0.25]

Sharp transition from low to high density graphs around $\sigma_2 = -\theta/(n-2)$

[Figure: three panels plotting average degree, number of 2-stars, and number of successful proposals in 500,000 steps against the 2-star parameter]
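One way to see why the transition sits at this particular value is a complement-symmetry argument. The short derivation below is a sketch, writing the edge and 2-star parameters as $\theta$ and $\sigma_2$ (notation assumed here), $d_i$ for the degree of node i, and $\bar{x}$ for the complement of the graph x.

```latex
\begin{align*}
% in the complement graph every degree d_i becomes n - 1 - d_i
L(\bar{x}) &= \binom{n}{2} - L(x) \\
S_2(\bar{x}) &= \sum_i \binom{n-1-d_i}{2}
              = S_2(x) - 2(n-2)\,L(x) + \text{const} \\
\theta L(\bar{x}) + \sigma_2 S_2(\bar{x})
             &= -\big(\theta + 2\sigma_2 (n-2)\big)\,L(x) + \sigma_2 S_2(x) + \text{const}
\end{align*}
```

The last line equals $\theta L(x) + \sigma_2 S_2(x)$ up to a constant exactly when $-(\theta + 2\sigma_2(n-2)) = \theta$, that is, when $\sigma_2 = -\theta/(n-2)$. At that value a graph and its complement are equally probable, so low-density and high-density graphs carry equal weight. For n = 30 and $\theta = -2.5$ this gives $\sigma_2 \approx 0.089$.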

“Freezing” at $\sigma_2 = -\theta/(n-2)$:

$(\theta, \sigma_2) = (-14, 0.5)/t$ for t = 0, 1, …

See Park and Newman (2004) for an analytical solution (including phase diagram)

[Figure: two panels plotting average degree and number of successful proposals in 500,000 steps against the 2-star parameter]

4: Applications: Estimation of model parameters and model evaluation

A. Estimation of model parameters from data: MLE via MCMC approaches (Snijders, 2002; Handcock et al, 2004)

B. Model evaluation: do substantively important global properties of the observed data resemble those of simulated data? For example:

Degree distribution

Path length distribution

Presence of clustering, cycles

The overall aim is to identify regularities in local relational structures, and at the same time build models that reproduce global network structure from empirically-grounded local regularities
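The global properties used for model evaluation can be computed in the same way for the observed network and for each simulated one, and then compared. The sketch below assumes an undirected graph held as a symmetric 0/1 adjacency matrix; the function names are illustrative, and the clustering measure shown (three times the number of triangles divided by the number of 2-stars) is one common definition, not necessarily the one used for the figures reported here.

```python
from collections import Counter, deque


def degree_distribution(adj):
    """Map each degree value to the number of nodes having that degree."""
    return Counter(sum(row) for row in adj)


def clustering(adj):
    """Global clustering: 3 * triangles / 2-stars (0.0 if there are no 2-stars)."""
    n = len(adj)
    two_stars = sum(d * (d - 1) // 2 for d in (sum(row) for row in adj))
    triangles = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        for h in range(j + 1, n)
        if adj[i][j] and adj[j][h] and adj[i][h]
    )
    return 3 * triangles / two_stars if two_stars else 0.0


def path_lengths(adj):
    """Sorted geodesic lengths over all connected pairs, by breadth-first search."""
    n = len(adj)
    lengths = []
    for s in range(n):
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # Keep each unordered pair once (target index greater than source).
        lengths.extend(d for t, d in dist.items() if t > s)
    return sorted(lengths)


# A path on four nodes: 0 - 1 - 2 - 3.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(degree_distribution(path), clustering(path), path_lengths(path))
```

Quartiles of the path length list can then be set against the same quartiles from simulated graphs; pairs with no connecting path are simply omitted in this sketch.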

The alternating k-star, k-independent 2-path and k-triangle hypotheses (Snijders, Pattison, Robins & Handcock, 2004)

Suppose that: $\sigma_k = -\sigma_{k-1}/\lambda$, where $\lambda \geq 1$ is a (fixed) constant (alternating k-star hypothesis)

Then $\sum_{k \geq 2} \sigma_k S_k(x) = \sigma_2 S^{[\lambda]}(x)$, where:

$$S^{[\lambda]}(x) = \lambda^2 \sum_i \{ (1 - 1/\lambda)^{d(i)} + d(i)/\lambda - 1 \}$$

is the alternating k-star statistic and $d(i)$ denotes the degree of node i

Likewise:

If $U_k(x)$ = no of k-independent 2-paths in x, with corresponding parameter $\upsilon_k$

and $T_k(x)$ = no of k-triangles in x, with corresponding parameter $\tau_k$

We can suppose that:

$\upsilon_{k+1} = -\upsilon_k/\lambda$ (alternating independent 2-path hypothesis)

$\tau_{k+1} = -\tau_k/\lambda$ (alternating k-triangle hypothesis)
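The closed form of the alternating k-star statistic can be checked numerically against the alternating sum S_2 - S_3/lambda + S_4/lambda^2 - ..., which is what the hypothesis implies once the star parameters shrink by a factor of -1/lambda at each step. The sketch below takes a list of node degrees and a constant `lam` (standing for lambda); the function names are illustrative.

```python
import math


def alternating_k_star_sum(degs, lam):
    """S_2 - S_3/lam + S_4/lam**2 - ..., with S_k counted from the node degrees."""
    total = 0.0
    for k in range(2, max(degs, default=0) + 1):
        s_k = sum(math.comb(d, k) for d in degs)  # number of k-stars
        total += (-1) ** k * s_k / lam ** (k - 2)
    return total


def alternating_k_star_closed(degs, lam):
    """Closed form: lam**2 * sum_i {(1 - 1/lam)**d(i) + d(i)/lam - 1}."""
    return lam ** 2 * sum((1 - 1 / lam) ** d + d / lam - 1 for d in degs)


# Degrees of a triangle with one pendant node attached, and lam = 3.
degs = [3, 2, 2, 1]
print(alternating_k_star_sum(degs, 3.0), alternating_k_star_closed(degs, 3.0))
```

The agreement follows from the binomial expansion of (1 - 1/lambda)^d, whose terms for k = 0 and k = 1 are removed by the "+ d/lambda - 1" part of the closed form.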

Network of Collaboration Ties

Realisation-dependent model for collaboration ties among lawyers (Pattison & Robins, 2002)

neighbourhood     estimate
_________________________________________
edge              -3.669 (.474)
2-star             0.307 (.053)
3-star            -0.001 (.002)
triangle           0.173 (.047)
3-path            -0.019 (.002)
4-cycle            0.086 (.009)
_________________________________________

MCMCML parameter estimates for collaboration network (SIENA, conditioning on total ties, partners only)

                                            Model 1            Model 2
Parameter                                   est      s.e.      est      s.e.
Alternating k-stars ($\lambda = 3$)         -0.083   0.316
Alternating ind. 2-paths ($\lambda = 3$)    -0.042   0.154
Alternating k-triangles ($\lambda = 3$)      0.572   0.190     0.608    0.089
No pairs connected by a 2-path              -0.025   0.188
No pairs lying on a triangle                 0.486   0.513
Seniority main effect                        0.023   0.006     0.024    0.006
Practice (corp. law) main effect             0.391   0.116     0.375    0.109
Same practice                                0.390   0.100     0.385    0.101
Same gender                                  0.343   0.124     0.359    0.120
Same office                                  0.577   0.110     0.572    0.100

Modelling group cohesion (Albert, 2002)

Network ties are important in understanding social processes, but so are:

– cultural and psychological resources and aspirations (beliefs, values, attitudes, knowledge)

– settings (geographical locations, physical and organisational constraints)

Lindenberg (1997) on groups: three overlapping forms of interdependence:

– functional (common goals and tasks): workgroup membership

– cognitive (psychological representations): beliefs

– structural (patterning of interpersonal ties): network ties

Albert (2002) on group cohesion: an illustrative analysis of interdependent functional, cognitive and structural aspects of group cohesion using generalised relational structures

Management consulting firm

node colour codes group membership

node size codes extent of cohesive beliefs

ties: “Who do you ask when you want to find out what is going on..?”

Functional, structural and cognitive interdependence

Evidence for separable tendencies:

– structural logic of information seeking: hierarchical, with differentiation in information seeking (structural interdependence)

– information ties within groups (structural & functional interdependence)

– shared beliefs within groups (cognitive and functional interdependence)

– shared beliefs within groups among those linked by an information tie (cognitive, structural and functional interdependence)

5: A dynamic perspective: co-evolution of action, networks, settings

Dynamic models

Suppose that Xij(t) are time-dependent relational variables

At any moment t, suppose that there is a possible change in status for some randomly chosen $X_{ij}$ with a transition rate

$$\rho \, \mathrm{logistic}\Big( \sum_Q \lambda_Q \big( z_Q(x^*_{ij}(t)) - z_Q(x(t)) \big) \Big)$$

where:

$x(t)$ denotes the state of the network at time t;

$x^*_{ij}(t)$ equals $x(t)$ but with the value of $X_{ij}(t)$ changed from $x_{ij}(t)$ to $1 - x_{ij}(t)$;

$\rho$ is a rate parameter;

$\mathrm{logistic}(z) = \exp(z)/(1 + \exp(z))$

Then this continuous-time Markov process converges to the distribution $\Pr(X = x) = (1/c) \exp\{ \sum_Q \lambda_Q z_Q(x) \}$

Parameters can be estimated from longitudinal data (using an approach adapted from that developed by Snijders, 2001, 2002)
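The process described above can be sketched as a simple event-driven simulation: opportunities for change arrive at a constant rate, a tie variable is picked at random, and it is toggled with a logistic probability of the weighted change in the network statistics. The sketch below makes several assumptions that go beyond the slides: the network is undirected, the statistics are edges, 2-stars and triangles, and the names `rho`, `params`, `change_statistics` and `simulate` are illustrative.

```python
import math
import random


def logistic(z):
    """exp(z) / (1 + exp(z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


def change_statistics(adj, i, j):
    """Change in (edges, 2-stars, triangles) if the tie between i and j is toggled."""
    n = len(adj)
    sign = -1 if adj[i][j] else 1
    di = sum(adj[i]) - adj[i][j]  # degree of i, not counting the i-j tie
    dj = sum(adj[j]) - adj[i][j]  # degree of j, not counting the i-j tie
    shared = sum(1 for h in range(n) if adj[i][h] and adj[j][h])
    return sign, sign * (di + dj), sign * shared


def simulate(adj, params, rho, t_end, seed=0):
    """Run the toggle process on adj (modified in place) until time t_end."""
    rng = random.Random(seed)
    n = len(adj)
    t = 0.0
    while True:
        t += rng.expovariate(rho)  # waiting time to the next opportunity for change
        if t > t_end:
            return adj
        i, j = rng.sample(range(n), 2)
        delta = sum(p * d for p, d in zip(params, change_statistics(adj, i, j)))
        if rng.random() < logistic(delta):
            adj[i][j] = adj[j][i] = 1 - adj[i][j]


start = [[0] * 6 for _ in range(6)]
end = simulate(start, params=(-1.0, 0.1, 0.5), rho=1.0, t_end=100.0, seed=2)
print(sum(map(sum, end)) // 2)
```

Because each toggle is accepted with probability logistic(delta), the chain satisfies detailed balance with respect to exp{sum of parameter times statistic}, which is the sense in which such a process converges to the stated distribution.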

Modelling client referrals

                    Time 1    Time 2    Time 2             Time 1 → Time 2
                    PLE       PLE       cond. MCMCMLE*     cond. estimate
Edge                -3.02     -3.20     -                  -2.74 (0.35)
2-in-star            0.01      0.05      0.06 (.03)         0.04 (0.03)
2-path              -0.08     -0.07     -0.05 (.02)        -0.05 (0.02)
2-out-star           0.09      0.10      0.08 (.02)         0.09 (0.02)
mutual tie           2.54      1.73      1.72 (.29)         1.39 (0.28)
3-cycle             -0.20     -0.14     -0.15 (.09)        -0.14 (0.09)
transitive triad     0.21      0.19      0.16 (.03)         0.14 (0.03)

*using SIENA, conditioning on number of ties

Early 1990s in Brazil: student, civic, political and business groups

time 1           time 2           time 3
-3.222 (.44)     -3.805 (.44)     -4.678 (.46)
-2.223 (1.1)     -6.665 (1.8)     -10.71 (1.5)
-4.405 (.98)     -6.333 (1.5)     -9.322 (1.8)
 0.099 (.02)      0.116 (.02)      0.170 (.02)
 0.123 (.17)      0.734 (.17)      1.051 (.15)
 0.198 (.02)      0.207 (.03)      0.202 (.02)
 0.204 (.04)      0.309 (.06)      0.459 (.14)
 0.745 (.10)      0.886 (.14)      0.906 (.12)
 0.320 (.06)      0.443 (.09)      0.444 (.06)
-0.177 (.04)     -0.123 (.05)     -0.022 (.04)
-0.461 (.06)     -0.307 (.06)      0.000 (.06)
-0.146 (.07)     -0.041 (.05)     -0.024 (.03)
 0.808 (.08)      0.472 (.07)      0.139 (.06)

Key : organisation project event

6. Concluding comments

Models can display complex behaviour (e.g. nonlinearities, phase transitions), creating some statistical difficulties!

Nonetheless, a statistical approach allows us to stay close to empirical data, and model parameters can be estimated from data. For a well-specified model:

– We can test hypotheses about local contextual effects

– We can predict the evolution of the system (and its variability)

– We can understand the aggregate-level consequences of local contextual effects (and their variability)

Realisation-dependent models appear to be necessary, and reflect a “capacity for actors to transform as well as reproduce long-standing structures, frameworks and networks of interaction” (Emirbayer & Goodwin, 1994)

Some modelling challenges

Scaling up: the role of space

– Spatial random graph models (Henry Wong)

Co-evolution

– Dynamic interactions across levels

– Evolution of multiple networks

Social “innovation” and transformation

– Multiple networks are implicated theoretically

– e.g. Padgett et al on the evolution of markets in Florence

“Emergent” phenomena?

– e.g. emergence of social institutions such as groups

Technical issues

– Sampling, estimation, missing data…
