Measurement Sensitivity
A reasonable way to assess the effect of measurement error on the ties in a network is to ask how the network measures would change if the true ties differed from those observed. This question can be answered simply with Monte Carlo simulations on the observed network. Thus, the procedure I propose is to:
• Generate a probability matrix from the set of observed ties,
• Generate many realizations of the network based on these underlying probabilities, and
• Compare the distribution of generated statistics to those observed in the data.
How do we set pij?
• Range based on observed features (sensitivity analysis)
• Outcome of a model based on observed patterns (ERGM)
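The three steps can be sketched in a few lines of Python. This is only an illustration: the 4-node network, the function names, and the probabilities (0.9 for an observed tie, 0.05 for an unobserved one) are assumptions made up for the example, not values from any dataset.

```python
import random

def tie_probabilities(adj, p_present=0.9, p_absent=0.05):
    """Probability matrix p_ij built from the observed ties.

    p_present and p_absent are illustrative assumptions, not estimates.
    """
    n = len(adj)
    return [[0.0 if i == j else (p_present if adj[i][j] else p_absent)
             for j in range(n)] for i in range(n)]

def realize(prob, rng):
    """Draw one network: each tie is an independent Bernoulli(p_ij) draw."""
    n = len(prob)
    return [[1 if rng.random() < prob[i][j] else 0 for j in range(n)]
            for i in range(n)]

def density(adj):
    """Share of ordered pairs (i != j) that have a tie."""
    n = len(adj)
    return sum(map(sum, adj)) / (n * (n - 1))

def sensitivity(adj, stat, draws=1000, seed=1, **probs):
    """Observed statistic plus the central 95% range over simulated networks."""
    rng = random.Random(seed)
    prob = tie_probabilities(adj, **probs)
    sims = sorted(stat(realize(prob, rng)) for _ in range(draws))
    return stat(adj), sims[int(0.025 * draws)], sims[int(0.975 * draws) - 1]

# Hypothetical 4-node directed network of nominations.
observed = [[0, 1, 1, 0],
            [1, 0, 0, 0],
            [0, 1, 0, 1],
            [0, 0, 1, 0]]
obs, low, high = sensitivity(observed, density)
print(f"observed density {obs:.3f}, simulated 95% range {low:.3f} to {high:.3f}")
```

Any network statistic can be passed in place of `density`. A wide simulated range signals a measure that is sensitive to tie-level measurement error.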
Measurement Sensitivity
As an example, consider the problem of defining “friendship” ties in high schools.
Should we count nominations that are not reciprocated?
Measurement Sensitivity
[Figure panels: “All ties” and “Reciprocated”]
Statistical Analysis of Social Networks
Comparing multiple networks: QAP
The substantive question is how one set of relations (or dyadic attributes) relates to another. For example:
• Do marriage ties correlate with business ties in the Medici family network?
• Are friendship relations correlated with joint membership in a club?
(review)
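As a reminder of the mechanics, a QAP test correlates the off-diagonal cells of two matrices, then compares that correlation with the ones obtained after relabeling the nodes of one matrix (rows and columns permuted together). The sketch below uses two small made-up matrices and hypothetical function names; it is not the Medici data.

```python
import random

def offdiag(m):
    """Off-diagonal cells of a square matrix, in row order."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(n) if i != j]

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def qap(a, b, reps=2000, seed=1):
    """Observed correlation and one-sided permutation p-value."""
    rng = random.Random(seed)
    n = len(a)
    obs = pearson(offdiag(a), offdiag(b))
    idx = list(range(n))
    hits = 0
    for _ in range(reps):
        rng.shuffle(idx)
        # Permute rows and columns of b with the same relabeling of nodes.
        pb = [[b[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if pearson(offdiag(a), offdiag(pb)) >= obs:
            hits += 1
    return obs, hits / reps

# Two hypothetical relations on the same five nodes.
rel_a = [[0, 1, 1, 0, 0],
         [1, 0, 1, 0, 0],
         [1, 1, 0, 1, 0],
         [0, 0, 1, 0, 1],
         [0, 0, 0, 1, 0]]
rel_b = [[0, 1, 0, 0, 0],
         [1, 0, 1, 0, 0],
         [0, 1, 0, 1, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 0]]
r, p = qap(rel_a, rel_b)
print(f"correlation {r:.3f}, permutation p-value {p:.3f}")
```

Permuting rows and columns together keeps the structure of each network intact while breaking the node-to-node alignment between them, which is why the ordinary standard error for a correlation is not used here.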
Modeling Social Networks parametrically:ERGM approaches
The earliest approaches are based on simple random graph theory, but there’s been a flurry of activity in the last 10 years or so.
Key historical references:
- Holland and Leinhardt (1981) JASA
- Frank and Strauss (1986) JASA
- Wasserman and Faust (1994), Chap 15 & 16
- Wasserman and Pattison (1996)
Good practical overview: http://www.jstatsoft.org/v24
Great tutorial: http://statnet.csde.washington.edu/workshops/SUNBELT/EUSN/ergm/ergm_tutorial.html (last year’s Sunbelt)
Or: https://statnet.csde.washington.edu/trac/wiki/Sunbelt2014 (lots of how-to slides)
Modeling Social Networks parametrically:ERGM approaches
The “p1” model of Holland and Leinhardt is the classic foundation – the basic idea is that you can generate a statistical model of the network by predicting the counts of types of ties (asym, null, sym). They formulate a log-linear model for these counts; but the model is equivalent to a logit model on the dyads:
$$\operatorname{logit}\, P(X_{ij}=1) = \theta + \alpha_i + \beta_j + \rho X_{ji}$$
Note the subscripts! This implies a distinct parameter for every node i and j in the model, plus one for reciprocity.
Modeling Social Networks parametrically:ERGM approaches
Results from SAS version on PROSPER datasets
Modeling Social Networks parametrically:ERGM approaches
Once you know the basic model format, you can imagine other specifications:
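$$\operatorname{logit}\, P(X_{ij}=1) = \theta + \alpha_i + \beta_j + \rho X_{ji} \quad \text{(orig)}$$
$$\operatorname{logit}\, P(X_{ij}=1) = \theta + \alpha_i + \beta_j + \rho_g X_{ji} \quad \text{(differential reciprocity)}$$
(node chars): the original form, with terms for node characteristics.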
Key is to ensure that the specification doesn’t imply a linear dependency of terms.
Model fit is hard to judge – newer work shows that the standard errors are “approximate” ;-)
$$p(X = x) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)}$$
Where:
θ is a vector of parameters (like regression coefficients)
z is a vector of network statistics, conditioning the graph
κ is a normalizing constant, to ensure the probabilities sum to 1.
Modeling Social Networks parametrically:ERGM approaches
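The simplest graph is a Bernoulli random graph, where each Xij is independent:
$$p(X = x) = \frac{\exp\{\sum_{i,j} \theta_{ij} x_{ij}\}}{\kappa(\theta)}$$
Where:
$\theta_{ij} = \operatorname{logit}[P(X_{ij} = 1)]$
$\kappa(\theta) = \prod_{i,j} [1 + \exp(\theta_{ij})]$
Note this is one of the few cases where κ(θ) can be written.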
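To see why the constant has a closed form here, a brute-force check on a tiny graph may help. The sketch below sums exp{Σ θij xij} over every possible graph and compares the result with the product formula. The six θ values (one per ordered pair among three nodes) are arbitrary numbers chosen for illustration.

```python
from itertools import product
from math import exp, prod, isclose

def kappa_closed_form(theta):
    """Product over tie variables of [1 + exp(theta_ij)]."""
    return prod(1 + exp(t) for t in theta)

def kappa_brute_force(theta):
    """Sum of exp{sum_ij theta_ij * x_ij} over every possible graph x."""
    return sum(exp(sum(t * x for t, x in zip(theta, xs)))
               for xs in product((0, 1), repeat=len(theta)))

# Hypothetical parameters for the 6 ordered pairs of a 3-node directed graph.
theta = [-1.0, 0.3, 0.8, -0.2, 0.0, 1.5]
print(kappa_closed_form(theta), kappa_brute_force(theta))
```

The brute-force sum has 2^6 = 64 terms for three nodes and doubles with every added tie variable. That growth is why the constant becomes a problem as soon as the ties are no longer independent and the sum no longer factors.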
Modeling Social Networks parametrically:ERGM approaches
Typically, we add a homogeneity condition, so that all isomorphic graphs are equally likely. The homogeneous Bernoulli graph model:
$$p(X = x) = \frac{\exp\{\theta \sum_{i,j} x_{ij}\}}{\kappa(\theta)}$$
Where:
$\kappa(\theta) = [1 + \exp(\theta)]^{g}$
Modeling Social Networks parametrically:ERGM approaches
If we want to condition on anything much more complicated than density, the normalizing constant ends up being a problem. We need a way to express the probability of the graph that doesn’t depend on that constant. First some terms:
$X^{+}_{ij}$ = Sociomatrix with element ij forced to 1
$X^{-}_{ij}$ = Sociomatrix with element ij forced to 0
$X^{c}_{ij}$ = Sociomatrix with no tie between i and j
Modeling Social Networks parametrically:ERGM approaches
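$$\exp(\omega_{ij}) = \frac{p(X_{ij}=1 \mid X^{c}_{ij})}{p(X_{ij}=0 \mid X^{c}_{ij})}$$
$$\frac{p(X_{ij}=1 \mid X^{c}_{ij})}{p(X_{ij}=0 \mid X^{c}_{ij})} = \frac{\exp\{\theta' z(x^{+}_{ij})\}}{\exp\{\theta' z(x^{-}_{ij})\}} = \exp\{\theta'[z(x^{+}_{ij}) - z(x^{-}_{ij})]\}$$
$$\omega_{ij} = \log\left[\frac{p(X_{ij}=1 \mid X^{c}_{ij})}{p(X_{ij}=0 \mid X^{c}_{ij})}\right] = \theta'[z(x^{+}_{ij}) - z(x^{-}_{ij})]$$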
Modeling Social Networks parametrically:ERGM approaches
$$\omega_{ij} = \log\left[\frac{p(X_{ij}=1 \mid X^{c}_{ij})}{p(X_{ij}=0 \mid X^{c}_{ij})}\right] = \theta'[z(x^{+}_{ij}) - z(x^{-}_{ij})]$$
Note that we can now model the conditional probability of the graph, as a function of a set of difference statistics, without reference to the normalizing constant. The model, then, simply reduces to a logit model on the dyads.
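A small numerical check of this cancellation, as a sketch only: it uses an undirected 4-node graph, two statistics (edges and triangles), and made-up parameter values. It computes the conditional log-odds once from full graph probabilities (which need the normalizing constant) and once from the difference statistics alone.

```python
from itertools import combinations
from math import exp, log, isclose

NODES = range(4)
DYADS = list(combinations(NODES, 2))   # the 6 dyads of an undirected 4-node graph

def z(edges):
    """Network statistics z(x): (number of edges, number of triangles)."""
    tri = sum(1 for a, b, c in combinations(NODES, 3)
              if {(a, b), (a, c), (b, c)} <= edges)
    return (len(edges), tri)

def weight(edges, theta):
    """Unnormalized probability exp{theta' z(x)}."""
    return exp(sum(t * s for t, s in zip(theta, z(edges))))

def kappa(theta):
    """Normalizing constant by brute force over all 2^6 graphs."""
    return sum(weight(frozenset(subset), theta)
               for r in range(len(DYADS) + 1)
               for subset in combinations(DYADS, r))

def log_odds_full(edges, dyad, theta):
    """Conditional log-odds of the tie, from full graph probabilities."""
    k = kappa(theta)
    rest = frozenset(edges) - {dyad}
    p_plus = weight(rest | {dyad}, theta) / k
    p_minus = weight(rest, theta) / k
    return log(p_plus / p_minus)

def log_odds_change(edges, dyad, theta):
    """The same quantity from difference statistics: theta'[z(x+) - z(x-)]."""
    rest = frozenset(edges) - {dyad}
    z_plus, z_minus = z(rest | {dyad}), z(rest)
    return sum(t * (p - m) for t, p, m in zip(theta, z_plus, z_minus))

theta = (-1.0, 0.5)        # hypothetical edge and triangle parameters
x = {(0, 1), (1, 2)}       # adding the tie (0, 2) would close one triangle
print(log_odds_full(x, (0, 2), theta), log_odds_change(x, (0, 2), theta))
```

Adding the tie (0, 2) raises the edge count by 1 and the triangle count by 1, so both routes give -1.0 + 0.5 = -0.5. The second route never touches the normalizing constant.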
Modeling Social Networks parametrically:ERGM approaches
$$\omega_{ij} = \log\left[\frac{p(X_{ij}=1 \mid X^{c}_{ij})}{p(X_{ij}=0 \mid X^{c}_{ij})}\right] = \theta'[z(x^{+}_{ij}) - z(x^{-}_{ij})]$$
Consider the simplest possible model: the Bernoulli random graph model, which says the only feature of interest is the number of edges in the graph. What is the change statistic for that feature?
$z(x^{+}_{ij}) = 1$ (assume edge is present, so value is one)
$z(x^{-}_{ij}) = 0$ (assume edge is absent, so value is zero)
$[z(x^{+}_{ij}) - z(x^{-}_{ij})] = 1$ (difference is 1 for all dyads)
Modeling Social Networks parametrically:ERGM approaches
Consider the simplest possible model: the Bernoulli random graph model, which says the only feature of interest is the number of edges in the graph. What is the change statistic for that feature?
The “Edges” parameter is simply an intercept-only model.
NODE ADJMAT
1 0 1 1 1 0 0 0 0 0
2 1 0 1 0 0 0 1 0 0
3 1 1 0 0 1 0 1 0 0
4 1 0 0 0 1 0 0 0 0
5 0 0 1 1 0 1 0 1 0
6 0 0 0 0 1 0 0 1 1
7 0 1 1 0 0 0 0 0 0
8 0 0 0 0 1 1 0 0 1
9 0 0 0 0 0 1 0 1 0
Density: 0.361
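proc logistic descending data=dydat;
  model nom = ;  /* no predictors: intercept-only model */
run; quit;
/* See the results and copy the intercept coefficient into the check below. */
data chk;
  x = exp(-0.5705) / (1 + exp(-0.5705));  /* inverse logit of the intercept, which returns the density */
run;
proc print data=chk;
run;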
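The same check can be done without fitting anything, since the intercept of an intercept-only logit is just the log-odds of the density. A short Python sketch using the 9-node adjacency matrix shown earlier (variable names are my own):

```python
from math import exp, log

# The 9-node adjacency matrix from the example (node labels dropped).
ADJ = [
    [0, 1, 1, 1, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 1, 0, 1, 0],
]

n = len(ADJ)
ties = sum(map(sum, ADJ))                 # number of 1s in the matrix
dens = ties / (n * (n - 1))               # ties over ordered pairs
intercept = log(dens / (1 - dens))        # logit of the density
back = exp(intercept) / (1 + exp(intercept))

print(ties, round(dens, 3), round(intercept, 4))   # 26 0.361 -0.5705
```

There are 26 ties among 72 ordered pairs, a density of about 0.361, and its logit matches the -0.5705 coefficient used in the SAS check.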
Modeling Social Networks parametrically:ERGM approaches
The logit model estimation procedure was popularized by Wasserman & colleagues, and a good guide to this approach is:
A Practical Guide To Fitting p* Social Network Models Via Logistic Regression
The site includes the PREPSTAR program for creating the variables of interest. The following example draws from this work, which nicely walks you through the logic of constructing change variables, model fit, and so forth.
But the estimates are not very good for any parameters other than “dyad independent” parameters!
Modeling Social Networks parametrically:ERGM approaches
Parameters that are often fit include:
1) Expansiveness and attractiveness parameters (dummies for each sender/receiver in the network)
2) Degree distribution
3) Mutuality
4) Group membership (and all other parameters by group)
5) Transitivity / Intransitivity
6) K-in-stars, k-out-stars
7) Cyclicity
8) Node-level covariates (matching, difference)
9) Edge-level covariates (dyad-level features such as exposure)
10) Temporal data, such as relations in prior waves.
Modeling Social Networks parametrically:Exponential Random Graph Models
…and there are LOTS of terms…
Modeling Social Networks parametrically:Exponential Random Graph Models
The terms currently available are listed in help(ergm.terms):
Node Main Effects
nodecov(attrname): Main effect of a covariate
nodefactor(attrname, base=1): Factor attribute effect
nodeicov(attrname): Main effect of a covariate for in-edges
nodeifactor(attrname, base=1): Factor attribute effect for in-edges
nodeocov(attrname): Main effect of a covariate for out-edges
nodeofactor(attrname, base=1): Factor attribute effect for out-edges
receiver(base=1): Receiver effect
sender(base=1): Sender effect
sociality(attrname=NULL, base=1): Undirected degree
Modeling Social Networks parametrically:Exponential Random Graph Models
Attribute Mixing Effects
absdiff(attrname, pow=1): Absolute difference
absdiffcat(attrname, base=NULL): Categorical absolute difference
dyadcov(x, attrname=NULL): Dyadic covariate
edgecov(x, attrname=NULL): Edge covariate. The edgecov and dyadcov terms are equivalent for undirected networks.
hamming(x, cov, attrname=NULL): Hamming distance
hammingmix(attrname, x, base=0): Hamming distance within mixing
match(attrname, diff=FALSE, keep=NULL): Uniform homophily and differential homophily. This is an alias for nodematch(attrname, diff=FALSE).
nodematch(attrname, diff=FALSE, keep=NULL): Uniform homophily and differential homophily
nodemix(attrname, base=NULL): Nodal attribute mixing
Modeling Social Networks parametrically:Exponential Random Graph Models
Structural Effects
Base Volume
density: Density
edges: Edges
meandeg: Mean vertex degree
Degree/Star effects
altkstar(lambda, fixed=FALSE): Alternating k-star
gwdegree(decay, fixed=FALSE, cutoff=30): Geometrically weighted degree distribution
gwidegree(decay, fixed=FALSE, cutoff=30): Geometrically weighted in-degree distribution
gwodegree(decay, fixed=FALSE, cutoff=30): Geometrically weighted out-degree distribution
idegree(d, by=NULL, homophily=FALSE): In-degree
isolates: Isolates
istar(k, attrname=NULL): In-stars
kstar(k, attrname=NULL): k-Stars
odegree(d, by=NULL, homophily=FALSE): Out-degree
ostar(k, attrname=NULL): k-Outstars
Modeling Social Networks parametrically:Exponential Random Graph Models
Structural Effects
Dyadic Effects
asymmetric(attrname=NULL, diff=FALSE, keep=NULL): Asymmetric dyads
degree(d, by=NULL, homophily=FALSE): Degree
degcrossprod: Degree Cross-Product
degcor: Degree Correlation
mutual(same=NULL, diff=FALSE, by=NULL, keep=NULL): Mutuality
Path Effects
m2star: Mixed 2-stars, a.k.a. 2-paths. See also twopath.
threepath(keep=1:4): Three-paths
twopath: 2-Paths
Modeling Social Networks parametrically:Exponential Random Graph Models
Triadic Effects
ctriple(attrname=NULL): Cyclic triples
cycle(k): Cycles
dsp(d): Dyadwise shared partners
esp(d): Edgewise shared partners
balance: Balanced triads
gwdsp(alpha, fixed=FALSE, cutoff=30): Geometrically weighted dyadwise shared partner distribution
gwesp(alpha, fixed=FALSE, cutoff=30): Geometrically weighted edgewise shared partner distribution
gwnsp(alpha, fixed=FALSE, cutoff=30): Geometrically weighted nonedgewise shared partner distribution
intransitive: Intransitive triads
localtriangle(x): Triangles within neighborhoods
nearsimmelian: Near simmelian triads
nsp(d): Nonedgewise shared partners
simmelian: Simmelian triads
simmelianties: Ties in simmelian triads
transitive: Transitive triads
transitiveties(attrname=NULL): Transitive ties
triadcensus(d): Triad census
triangle(attrname=NULL): Triangles
tripercent(attrname=NULL): Triangle percentage
ttriple(attrname=NULL): Transitive triples
Modeling Social Networks parametrically:Exponential Random Graph Models
Two Mode Networks
b1concurrent(by=NULL): Concurrent node count for the first mode in a bipartite (aka two-mode) network
b1degree(d, by=NULL): Degree for the first mode in a bipartite (aka two-mode) network
b1factor(attrname, base=1): Factor attribute effect for the first mode in a bipartite (aka two-mode) network
b1star(k, attrname=NULL): k-Stars for the first mode in a bipartite (aka two-mode) network
b1starmix(k, attrname, base=NULL, diff=TRUE): Mixing matrix for k-stars centered on the first mode of a bipartite network
b1twostar(b1attrname, b2attrname, base=NULL): Two-star census for central nodes centered on the first mode of a bipartite network
b2concurrent(by=NULL): Concurrent node count for the second mode in a bipartite (aka two-mode) network
b2degree(d, by=NULL): Degree for the second mode in a bipartite (aka two-mode) network
b2factor(attrname, base=1): Factor attribute effect for the second mode in a bipartite (aka two-mode) network
b2star(k, attrname=NULL): k-Stars for the second mode in a bipartite (aka two-mode) network
b2starmix(k, attrname, base=NULL, diff=TRUE): Mixing matrix for k-stars centered on the second mode of a bipartite network
b2twostar(b1attrname, b2attrname, base=NULL): Two-star census for central nodes centered on the second mode of a bipartite network
gwb1degree(decay, fixed=FALSE, cutoff=30): Geometrically weighted degree distribution for the first mode in a bipartite (aka two-mode) network
gwb2degree(decay, fixed=FALSE, cutoff=30): Geometrically weighted degree distribution for the second mode in a bipartite (aka two-mode) network
concurrent(by=NULL): Concurrent node count
Modeling Social Networks parametrically:Exponential Random Graph Models
In practice, these models are difficult to estimate by logit, and we have no good sense of how approximate the pseudo-maximum-likelihood estimate (PMLE) is.
The STATNET generalization is to use MCMC methods to better estimate the parameters. This is essentially a simulation procedure working “under the hood” to explore the space of graphs described by the model parameters, searching for the best fit to the observed data.
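To give a feel for what “exploring the space of graphs” means, here is a minimal Metropolis sampler for the edges-only model. It is a sketch of the simulation step only, not of statnet's estimation routine: the function name, step counts, and seed are assumptions, and the parameter -0.5705 is borrowed from the intercept-only example.

```python
import random
from math import exp

def simulate_mean_density(n, theta, steps=50000, burn_in=1000, seed=1):
    """Average density of graphs visited by a tie-toggling Metropolis chain."""
    rng = random.Random(seed)
    x = [[0] * n for _ in range(n)]      # start from the empty directed graph
    edges, total, kept = 0, 0, 0
    for step in range(steps):
        i, j = rng.sample(range(n), 2)   # pick an ordered pair with i != j
        delta = -1 if x[i][j] else 1     # change in the edge count if toggled
        # Accept the toggle with probability min(1, exp(theta * delta)).
        if rng.random() < min(1.0, exp(theta * delta)):
            x[i][j] = 1 - x[i][j]
            edges += delta
        if step >= burn_in:
            total += edges
            kept += 1
    return total / kept / (n * (n - 1))

est = simulate_mean_density(9, -0.5705)
print(f"mean simulated density: {est:.3f}")   # should land near 0.36
```

With more terms in the model, only the `delta` line changes: it becomes the vector of change statistics times the parameter vector. The estimation routine then adjusts the parameters until the simulated statistics match the observed ones.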
Modeling Social Networks parametrically:Exponential Random Graph Models
You can specify a model as a simple statement on terms:
Modeling Social Networks parametrically:Exponential Random Graph Models
A simple example: One of the schools in PROSPER
library(statnet)
library(foreign)
# Read the Pajek network file for one PROSPER school
g <- read.paj("C:/jwmdata/prosper/Network_data_files/PAJEK/MATCHED/SC1C1W1Sch101.net")
# Store in- and out-degree as vertex attributes
g %v% "indegree" <- degree(g, cmode="indegree")
g %v% "outdegree" <- degree(g, cmode="outdegree")
# Read the node attributes and attach them to the network
atr <- read.table("C:/jwmdata/prosper/Network_data_files/Rfiles/ergmfiles/n111101.txt")
g %v% "sex" <- atr[,2]
g %v% "white" <- atr[,3]
g %v% "slun" <- atr[,4]
g %v% "irtuse" <- atr[,5]
g %v% "irtdev" <- atr[,6]
g %v% "tgrad" <- atr[,7]
g %v% "discip" <- atr[,8]
g %v% "church" <- atr[,9]
g %v% "sens" <- atr[,10]
# Plot the network colored by selected attributes
plot(g, vertex.col="sex")
plot(g, vertex.col="slun")
plot(g, vertex.col="white")
Dynamics 1: Simple time-lag model: PROSPER Peers
Modeling Social Networks parametrically:Exponential Random Graph Models
An example: Panel model in PROSPER
Modeling Social Networks parametrically:Exponential Random Graph Models: Degeneracy
"Assessing Degeneracy in Statistical Models of Social Networks" Mark S. Handcock, CSSS Working Paper #39
Modeling Social Networks parametrically:Exponential Random Graph Models:
Quick example (demo)
Modeling Social Networks parametrically:Latent Space Models
Z = a dimension in some unknown space that, once accounted for, makes ties independent. Z is effectively chosen with respect to some latent cluster-space, G. These “groups” define different social sources for association.
Modeling Social Networks parametrically:Latent Space Models
Prosper data, with three groups
Modeling Social Networks parametrically:Latent Space Models
Prosper data, with three groups (posterior density plots)
Modeling Social Networks parametrically:Latent Space Models
…note there is a non-R option…
Generating Random Graph Samples
A conceptual merge between exponential random graph models and QAP/sensitivity models is to attempt to identify a sample of graphs from the universe you are trying to model.
$$p(X = x) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)}$$
That is, generate X empirically, then compare z(x) to see how likely a measure on x would be given X. The difficulty, however, is generating X.
Generating Random Graph Samples
The first option would be to generate all non-isomorphic graphs within a given constraint.
This is possible for small graphs, but the number gets large fast. For a network with 3 nodes, there are 16 possible directed graphs (up to isomorphism). For a network with 4 nodes, there are 218; for 5 nodes, 9,608; for 6 nodes, 1,540,944; and so on…
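These counts can be reproduced by brute force for the smallest cases. The sketch below enumerates every labeled directed graph on n nodes and keeps one canonical representative per isomorphism class. The function name is my own, and the approach is only practical for very small n.

```python
from itertools import permutations, product

def count_digraphs(n):
    """Number of directed graphs on n nodes, counted up to isomorphism."""
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    perms = list(permutations(range(n)))
    seen = set()
    for bits in product((0, 1), repeat=len(pairs)):
        edges = [pair for pair, bit in zip(pairs, bits) if bit]
        # Canonical form: the smallest sorted edge list over all relabelings.
        canon = min(tuple(sorted((p[i], p[j]) for i, j in edges))
                    for p in perms)
        seen.add(canon)
    return len(seen)

print(count_digraphs(3), count_digraphs(4))   # 16 218
```

Already at 5 nodes this loop would visit 2^20 labeled graphs with 120 relabelings each, which illustrates why full enumeration stops being an option almost immediately.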
So, the best approach is to sample from the universe, but, of course, if you had the universe you wouldn’t need to sample from it. How do you sample from a population you haven’t observed?
(a) use a construction algorithm that generates a random graph with known constraints
(b) use an ERGM like the one above.
Generating Random Graph Samples
Romantic Networks
A draw from the simulation; this is what appeared in “Glamour”.
Edge-matching random permutation
Can easily generate networks with appropriate degree distributions by generating “edge stems” and sorting:
Degree distribution: degree 1: 2 nodes; degree 2: 2 nodes; degree 3: 1 node
Edge stems:
di=1: a, b
di=2: c, c, d, d
di=3: f, f, f
(need to ensure you have a valid edge list!)
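A minimal sketch of the stem-and-sort idea, assuming an undirected network: write each node once per unit of degree, shuffle the stems, pair them off, and start over if the pairing contains a self-tie or a repeated edge. The degree sequence, function name, and retry limit are hypothetical choices for the example.

```python
import random
from collections import Counter

def edge_stem_graph(degrees, seed=1, max_tries=10000):
    """Random simple graph with the given degrees, by shuffling edge stems."""
    if sum(degrees.values()) % 2:
        raise ValueError("degrees must sum to an even number")
    rng = random.Random(seed)
    # One stem per unit of degree: a node with degree 3 appears three times.
    stems = [node for node, d in degrees.items() for _ in range(d)]
    for _ in range(max_tries):
        rng.shuffle(stems)
        edges = [tuple(sorted(stems[k:k + 2])) for k in range(0, len(stems), 2)]
        no_loops = all(a != b for a, b in edges)
        no_repeats = len(set(edges)) == len(edges)
        if no_loops and no_repeats:       # a valid edge list
            return sorted(edges)
    raise RuntimeError("no valid edge list found; degrees may not be graphical")

# Hypothetical degree sequence (sums to 12, so 6 edges).
degrees = {"a": 1, "b": 1, "c": 2, "d": 2, "e": 3, "f": 3}
edges = edge_stem_graph(degrees)
realized = Counter(node for edge in edges for node in edge)
print(edges)
print(dict(realized) == degrees)
```

Rejecting the whole shuffle, rather than patching only the bad pairs, keeps every valid edge list equally likely. The even-sum check is the first part of “a valid edge list”: every edge uses exactly two stems.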
Generating Random Graph Samples
Edge-matching random permutation
[Figure panels: Partner Distribution; Component Size/Shape]
Emergent Connectivity in low-degree networks
Generating Random Graph Samples
Development of STD cores in low-degree networks: rapid transition without stars.
Complete Network Analysis: Network Connections: Connectivity
Extend this view across the space of low-degree distributions defined by shape and volume...
ERGMs make it (fairly) easy to simulate networks from models.
• Simple: simulate from an estimated ERGM (this is how the GOF function works)
• Simple II: simulate from a pre-defined ERGM formula (i.e., set the parameters by hand)
• A little harder: simulate from ego networks. Here you can use ERGM to match the observed distribution for mixing by node characteristics reported in an ego-network survey.
  • Can use degree, attribute mixing
• A bit harder: fit global structure features using ego-nets by modeling the distribution of sub-structures (see Jeff Smith’s work)
Generating Random Graph Samples: Model-based estimates
ERGM to simulate networks from Add Health
Modeling Network Dynamics: Rule-based simulation models
Rule-based simulation models: The network-science approach to dynamic networks has been to identify toy behavioral models and play out the implications of these models for network dynamics. Focus is typically on how the network evolves (or reaches a steady state).
Dynamics OF networks: balance, preferential attachment, voter models
Dynamics ON networks: diffusion simulations
These are usually agent-based models and are difficult to specify; there is a tradeoff between simplicity and realism.
Modeling Network Dynamics: Descriptive dynamic techniques
Goal here is to make sense of how networks change or how things flow through them using a clear measurement / metrics approach. Challenge is defining the network.
Time and Social Networks
Examples of looking at change in networks: Roy and interlocking directorates (ASR 1983, 248-257)
Non-financial interlocks, by period: 1886-1890, 1891-1895, 1896-1900, 1901-1905
Bearman and Everett: The Structure of Social Protest
[Figure: positions of groups 1-7 in five periods: (‘61-63), (‘66-68), (‘71-73), (‘76-78), (‘81-83)]
See paper for group compositions
Data on drug users in Colorado Springs, over 5 years
http://csde.washington.edu/statnet/movies/ConcurrencyAndReachability.mov
Animation captures much of the dynamism we care about:
STD Diffusion
Representing dynamic networks?
Modeling Network Dynamics: Random Graph models
Panel ERGM: If you simply want to account for the effect of past structures, you can add temporal covariates to the standard ERGM. Really only good for two waves.
STERGM: Separable Temporal ERGM. This is a two-equation model, with one equation for the formation of ties and a second for the dissolution of ties. The goal, as with ERGM, is to explain the dynamics of the network.
http://statnet.csde.washington.edu/workshops/SUNBELT/current/tergm/tergm_tutorial.pdf
RELEVENT: Relational Events Model. This is really a model of action on a network; think of conversation events or similar. It is meant for dynamic networks of very short-duration events.
http://statnet.csde.washington.edu/workshops/SUNBELT/current/relevent/statnet_sunbelt2014_relevent.pdf
SIENA: Stochastic Actor Oriented Model (SAOM). Used to disentangle selection from influence, by jointly modeling both as functions of each other. Multi-equation model; the simplest has one equation for behavior and one for network formation.
Intro: https://www.stats.ox.ac.uk/~snijders/siena/SnijdersSteglichVdBunt2009.pdf
Manual: https://www.stats.ox.ac.uk/~snijders/siena/RSiena_Manual.pdf
Modeling Network Dynamics: Random Graph models: STERGM
http://statnet.csde.washington.edu/workshops/SUNBELT/current/tergm/tergm_tutorial.html slides adapted from the workshop materials: http://statnet.csde.washington.edu/EpiModel/nme/index.html
Under certain assumptions, you can model a single network with average duration information (this assumes an equilibrium process).
Modeling Network Dynamics: Random Graph models: STERGM
samp.fit <- stergm(samp,
  formation = ~edges+mutual+cyclicalties+transitiveties,
  dissolution = ~edges+mutual+cyclicalties+transitiveties,
  estimate = "CMLE",
  times = 1:3)
SIENA
SIENA: Key Assumptions of the model
SIENA
Key element is how actors make changes. This is based on an evaluation of “utility” functions, similar to discrete choice models.
The model is then implemented as an actor-simulation, where actors are striving to maximize their utility.
Note: Tom is adamant that this is an “as if” model, with no clear ontological commitment to a “choice” model!
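To make the discrete-choice analogy concrete, here is a toy version of one actor's decision step. It is a sketch under loud assumptions: the utility function has only two made-up effects (outdegree and reciprocity), the parameter values are invented, and nothing here reproduces RSiena's actual implementation. The actor considers toggling each outgoing tie (or changing nothing), and each option is chosen with probability proportional to the exponentiated utility of the resulting network.

```python
from math import exp

def utility(x, i, beta_out, beta_recip):
    """Hypothetical utility for actor i: outdegree and reciprocity effects."""
    n = len(x)
    outdegree = sum(x[i])
    reciprocated = sum(1 for j in range(n) if x[i][j] and x[j][i])
    return beta_out * outdegree + beta_recip * reciprocated

def choice_probabilities(x, i, beta_out=-1.0, beta_recip=2.0):
    """Multinomial-logit probabilities over actor i's options.

    Option j != i means "toggle the tie i -> j"; option j == i means "no change".
    """
    n = len(x)
    scores = {}
    for j in range(n):
        y = [row[:] for row in x]         # copy, then apply the candidate change
        if j != i:
            y[i][j] = 1 - y[i][j]
        scores[j] = exp(utility(y, i, beta_out, beta_recip))
    total = sum(scores.values())
    return {j: s / total for j, s in scores.items()}

# Hypothetical 3-actor network: actor 1 nominates actor 0, nothing else.
x = [[0, 0, 0],
     [1, 0, 0],
     [0, 0, 0]]
probs = choice_probabilities(x, 0)
print(probs)
```

For actor 0, reciprocating actor 1's nomination has utility -1 + 2 = 1, doing nothing has utility 0, and nominating actor 2 has utility -1, so reciprocation is the most likely choice without being certain.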
Modeling Network Dynamics: Random Graph models: Siena
Osgood, D. W., Ragan, D. T., Wallace, L., Gest, S. D., Feinberg, M. E., & Moody, J. 2013. “Peers and the emergence of alcohol use: Influence and selection processes in adolescent friendship networks.” Journal of Research on Adolescence 23:500–512.
Modeling Network Dynamics: Random Graph models: RelEvent
For repeated interactions amongst nodes