TRANSCRIPT
Network Models
and Network Comparisons
Anna Mohr
Department of Statistics, The Ohio State University
Catherine Calder
Department of Statistics, The Ohio State University
Observational Data Reading Group, October 9th, 2014
What is a model?
A Statistical Framework
[Diagram: REALITY is only partially observed, yielding STATISTICS; MODELS, indexed by PARAMETERS, are completely observed. SAMPLING takes models to statistics; STATISTICAL INFERENCE takes statistics back to parameters. For networks, the statistics are quantities such as betweenness, embeddedness, clustering, etc.]
A Model for Network Graphs
a collection, {Pθ(G), G ∈ G : θ ∈ Θ}
where G is a collection of possible graphs, Pθ is a probability distribution on G, and θ is a vector of parameters ranging over possible values in Θ.
Keep in mind...
Mathematical Model: approximate relationships, simulations
vs. Statistical Model: describe uncertainty, learn about θ
A Naive Model
adjacency matrix, Y, for an undirected, unweighted network, where each Yij is the tie variable for vertices i and j
Logistic Regression
suppose Yij ∼ iid Bernoulli(p)
logit(p) = θ
p1 Model
Yij ∼ Bernoulli(pij)
logit(pij) = θ + γi + γj
for directed graphs,
P(Yij = y1, Yji = y2) ∝ exp{ y1(θ + αi + βj) + y2(θ + αj + βi) + y1 y2 ρ }
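The Bernoulli version of this naive model is easy to simulate; the sketch below (the parameter value θ = −1 is just for illustration) draws an undirected adjacency matrix with common tie probability p = logit−1(θ):

```python
import numpy as np

def simulate_bernoulli_graph(n_vertices, theta, rng=None):
    """Undirected, unweighted adjacency matrix where each tie Y_ij is
    iid Bernoulli(p) with logit(p) = theta."""
    rng = np.random.default_rng(rng)
    p = 1.0 / (1.0 + np.exp(-theta))  # p = inverse logit of theta
    # sample the upper triangle only, then symmetrize (no self-ties)
    upper = np.triu(rng.random((n_vertices, n_vertices)) < p, k=1)
    return (upper | upper.T).astype(int)

Y = simulate_bernoulli_graph(10, theta=-1.0, rng=42)
```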
Keep Improving...
p2 Model
take the p1 model, Yij ∼ Bernoulli(pij), logit(pij) = θij + γi + γj
and additionally model γ = Xβ + ζ, where ζi ∼ iid Normal(0, σ²ζ), and θij = θ + Zij δ
where the X are covariates for the set of vertices and the Z are dyadic attributes
• accounts for some dependence between the Yij
• can incorporate meaningful covariates
• ∼ mixed-effects logistic regression
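As a rough illustration of the p2 structure (not the fitting procedure), the tie probabilities can be built up from hypothetical covariates X, dyadic attributes Z, and coefficients β, δ; a minimal sketch for the undirected case:

```python
import numpy as np

def p2_tie_probabilities(X, Z, beta, delta, theta, sigma_zeta, rng=None):
    """Sketch of p2-style tie probabilities (undirected case):
    gamma = X @ beta + zeta, with zeta_i iid Normal(0, sigma_zeta^2),
    and logit(p_ij) = theta + Z_ij @ delta + gamma_i + gamma_j.
    X (n x k), Z (n x n x q), beta, delta are hypothetical inputs."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    gamma = X @ beta + rng.normal(0.0, sigma_zeta, size=n)  # vertex effects
    eta = theta + Z @ delta + gamma[:, None] + gamma[None, :]
    p = 1.0 / (1.0 + np.exp(-eta))  # inverse logit
    np.fill_diagonal(p, 0.0)        # no self-ties
    return p
```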
Markov Dependence
A Markov Process
let {Xt} be a stochastic process such that
P(Xn = xn | Xn−1 = xn−1, ..., X1 = x1) = P(Xn = xn | Xn−1 = xn−1)
A Simple Markov Random Field: dependence on nearest neighbors
Markov Dependence
Network Graph
• all possible edges that share a vertex are dependent
Dependence graph: represent each possible edge as a vertex; vertices are connected if they are dependent
let Nv = 4, then
[Figure: a graph on 4 vertices alongside its dependence graph, whose vertices are the six possible edges among the four vertices]
Markov Dependence
Hammersley-Clifford theorem → any undirected graph on Nv vertices with dependence graph D has probability
P(G) = (1/c) exp{ Σ_{A⊆G} αA }
where αA is nonzero only if A is a clique of D.
Markov Model: the cliques of D are the edges, k-stars, and triangles in G
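For Nv = 4 the dependence graph described above is small enough to enumerate directly; a minimal sketch (each possible edge becomes a vertex, connected whenever two possible edges share an endpoint):

```python
from itertools import combinations

def dependence_graph(n_vertices):
    """Markov dependence graph: one vertex per possible edge of the
    complete graph; two are linked iff the edges share an endpoint."""
    possible_edges = list(combinations(range(n_vertices), 2))
    links = [(e, f) for e, f in combinations(possible_edges, 2)
             if set(e) & set(f)]  # shared endpoint => dependent
    return possible_edges, links
```

For Nv = 4 this gives 6 possible edges and 12 links (the 15 pairs minus the 3 disjoint pairs), matching the figure on the slide.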
Markov Model
Pθ(Y = y) = (1/κ) exp{ Σ_{k=1}^{Nv−1} θk Sk(y) + θτ T(y) }
where S1(y) = Ne, Sk(y) = # of k-stars for 2 ≤ k ≤ Nv − 1, and T(y) = # of triangles
“Triad Model”: k ≤ 2 only
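The sufficient statistics of the Markov model can be read off an adjacency matrix; a small sketch computing S1, the k-star counts via Sk = Σi C(di, k), and the triangle count T = tr(Y³)/6:

```python
import numpy as np
from math import comb

def markov_statistics(Y, max_k=2):
    """Markov model sufficient statistics for an undirected adjacency
    matrix Y: edge count S1, k-star counts Sk, triangle count T.
    The 'Triad Model' corresponds to the default max_k=2."""
    Y = np.asarray(Y)
    degrees = Y.sum(axis=1).astype(int)
    S = {1: int(degrees.sum() // 2)}              # S1 = Ne
    for k in range(2, max_k + 1):
        S[k] = sum(comb(int(d), k) for d in degrees)  # Sk = sum_i C(d_i, k)
    T = int(np.trace(Y @ Y @ Y) // 6)             # each triangle = 6 closed walks
    return S, T
```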
Notes on the Markov Model
• intuitive dependence structure
• interpret the sign of θi as a tendency for/against statistic i above expectations for a random graph
• model fitting and simulations done via MCMC (not easy...)
• model degeneracy issues: places lots of mass on only a few outcomes
  • especially so for large Nv
  • related to the phase transitions known for the Ising model
  • change statistics for the MCMC algorithm
Exponential Random Graph Models
Exponential Family: Z belongs to an exponential family if its pmf can be expressed as
Pθ(Z = z) = exp{ θ′g(z) − ψ(θ) }
where ψ(θ) is the normalization term.
ERGM: let Yij = Yji be a binary r.v. indicating the presence of an edge between vertices i and j
Pθ(Y = y) = (1/κ) exp{ ΣH θH gH(y) }
where each H is a configuration, gH(y) is an indicator/count of H in y, and κ = κ(θ) is the normalization constant.
Exponential Random Graph Models
Markov Model
Pθ(Y = y) = (1/κ) exp{ Σ_{k=1}^{Nv−1} θk Sk(y) + θτ T(y) }
ERGM: let Yij = Yji be a binary r.v. indicating the presence of an edge between vertices i and j
Pθ(Y = y) = (1/κ) exp{ ΣH θH gH(y) }
where each H is a configuration, gH(y) is an indicator/count of H in y, and κ = κ(θ) is the normalization constant.
Exponential Random Graph Models
Logistic Regression: Yij ∼ iid Bernoulli(p), logit(p) = θ
⇒ Pθ(Yij = 1) = p = logit−1(θ) = eθ / (1 + eθ)
so now,
Pθ(Y = y) = Π_{i<j} Pθ(Yij = yij)
= [Pθ(Yij = 1)]^{S1(y)} [Pθ(Yij = 0)]^{C(Nv,2) − S1(y)}
= (eθ / (1 + eθ))^{S1(y)} (1 / (1 + eθ))^{C(Nv,2) − S1(y)}
= exp{θ S1(y)} / (1 + eθ)^{C(Nv,2)}
Bernoulli Model: Pθ(Y = y) = (1/κ) exp{θ S1(y)}
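The identity derived above is easy to verify numerically; this sketch (with a hypothetical θ) compares the product-of-Bernoullis form with the exponential-family form:

```python
import numpy as np
from math import comb, exp

def bernoulli_ergm_forms(Y, theta):
    """Evaluate both sides of the derivation: the product of iid
    Bernoulli tie pmfs and exp{theta * S1(y)} / (1 + e^theta)^C(Nv,2)."""
    Y = np.asarray(Y)
    n = Y.shape[0]
    p = exp(theta) / (1 + exp(theta))          # inverse logit
    s1 = int(Y.sum() // 2)                     # S1(y) = number of edges
    product_form = p ** s1 * (1 - p) ** (comb(n, 2) - s1)
    ergm_form = exp(theta * s1) / (1 + exp(theta)) ** comb(n, 2)
    return product_form, ergm_form
```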
Exponential Random Graph Models
• Bernoulli Model: complete independence
  Pθ(Y = y) = (1/κ) exp{ θ S1(y) }
• Markov Model: possible edges that share a vertex are dependent
  Pθ(Y = y) = (1/κ) exp{ Σ_{k=1}^{Nv−1} θk Sk(y) + θτ T(y) }
• General Case: ?? dependence
  Pθ(Y = y) = (1/κ) exp{ ΣH θH gH(y) }
• Snijders et al. (2006)
New Specifications - Snijders et al. (2006)
make use of clique-like structures...
Pθ(Y = y) = (1/κ) exp{ θ1 S1(y) + θ2 u(s)λ1(y) + θ3 u(t)λ2(y) + θ4 u(p)λ2(y) }
where S1(y) = Ne,
u(s)λ(y) = Σ_{k=2}^{Nv−1} (−1)^k Sk(y) / λ^{k−2}   (alternating k-stars),
u(t)λ(y) = Σ_{i<j} yij Σ_{k=1}^{Nv−2} (−1/λ)^{k−1} C(L2ij, k)   (alternating k-triangles),
u(p)λ(y) = λ Σ_{i<j} { 1 − (1 − 1/λ)^{L2ij} }   (alternating independent two-paths),
and L2ij is the number of two-paths between vertices i and j.
New Specifications - Snijders et al. (2006)
[Figure: examples of k-triangles and independent two-paths]
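Of these statistics, the alternating k-stars depend only on the degree sequence, so they are simple to compute; a minimal sketch (λ is a tuning constant chosen by the modeler, here just an example value):

```python
import numpy as np
from math import comb

def alternating_k_stars(Y, lam=2.0):
    """Alternating k-star statistic u^(s)_lambda(y):
    sum_{k=2}^{Nv-1} (-1)^k S_k(y) / lambda^(k-2),
    with S_k = sum_i C(d_i, k) computed from the degree sequence."""
    Y = np.asarray(Y)
    n = Y.shape[0]
    degrees = Y.sum(axis=1).astype(int)
    total = 0.0
    for k in range(2, n):                       # k = 2, ..., Nv - 1
        s_k = sum(comb(int(d), k) for d in degrees)
        total += ((-1) ** k) * s_k / lam ** (k - 2)
    return total
```

On the complete graph K4 with λ = 2 this gives S2/1 − S3/2 = 12 − 2 = 10.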
Some Notes on the Snijders Model
• fewer, less severe issues with model degeneracy
• model fitting and simulations done via MCMC
• interpretation of θ?
• what should λ be? what does it mean? → curved exponential family
• satisfies a (weaker) partial conditional dependence:
  Yiv and Yuj are conditionally dependent only if one of the two conditions holds:
  1. {i, v} ∩ {u, j} ≠ ∅
  2. yiu = yvj = 1
Network Models - Summary
• Statistical Models
  Simple Logistic Regression / Bernoulli Model
  p1 Model
  p2 Model
  Markov Model ← too hard to fit
  Snijders et al. (2006) ← too hard to interpret
  ERGMs or p∗ Models
• Mathematical Models ← too simple
  Random Graphs – CUG, Erdos-Renyi, Generalized
  Small World
  Preferential Attachment
Random Graphs
a conditional uniform graph (CUG) distribution with sufficient statistic t taking on value x:
P(G = g | t, x) = (1 / |{g′ ∈ G : t(g′) = x}|) · I{g′ ∈ G : t(g′) = x}(g)
where t = (t1, ..., tn) is an n-tuple of real-valued functions on G and x ∈ Rⁿ is a known vector.
• pick a particular G and specify uniform probability
Special Cases
an Erdos-Renyi random graph puts uniform probability on GNv,Ne, so that
P(G = g | Nv, Ne) = (1 / C(N, Ne)) · I{g ∈ GNv,Ne}(g), where N = C(Nv, 2)
another variant of this model, suggested by Gilbert around the same time, uses
GNv,p = collection of graphs G with Nv vertices that may be obtained by assigning an edge independently to each possible edge with probability p ∈ (0, 1)
→ Bernoulli Model for large Nv, when p = f(Nv) and Ne ∼ pNv
a generalized random graph puts uniform probability on GNv,t, where t is any other statistic/motif/characteristic of G
• degree distribution ⇒ Ne fixed
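Sampling uniformly from GNv,Ne amounts to choosing Ne of the N = C(Nv, 2) possible edges without replacement; a minimal sketch:

```python
import numpy as np

def sample_gnm(n_vertices, n_edges, rng=None):
    """Draw uniformly from G_{Nv,Ne}: pick Ne of the C(Nv,2) possible
    edges without replacement, returning an adjacency matrix."""
    rng = np.random.default_rng(rng)
    rows, cols = np.triu_indices(n_vertices, k=1)     # all possible edges
    chosen = rng.choice(len(rows), size=n_edges, replace=False)
    Y = np.zeros((n_vertices, n_vertices), dtype=int)
    Y[rows[chosen], cols[chosen]] = 1
    return Y | Y.T                                    # symmetrize
```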
Some Notes about Random Graphs
• mathematical models
• Erdos-Renyi appears to be the most commonly used
• most thoroughly studied: degree distribution, probability of connectedness, etc.
• easy to work with
PROS: intuitive; easy simulations; short path lengths
CONS: unrealistic; degree distribution is not broad enough; levels of clustering too low
Some Other Mathematical Models
Watts-Strogatz Small World Model
0. start with a lattice of Nv vertices
1. randomly “rewire” each edge independently with probability p, changing one endpoint of that edge to a different vertex (chosen uniformly)
• high levels of clustering, yet small distances between most nodes
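The two steps above can be sketched directly (the lattice here connects each vertex to its k nearest neighbors on each side of a ring; the choice of k and the loose handling of edge collisions are implementation choices, not part of the slide):

```python
import numpy as np

def small_world(n, k, p, rng=None):
    """Minimal Watts-Strogatz sketch: ring lattice where each vertex
    links to its k nearest neighbors on each side, then each edge is
    rewired with probability p by moving one endpoint to a uniformly
    chosen new vertex. Returns a set of sorted edge tuples."""
    rng = np.random.default_rng(rng)
    edges = {tuple(sorted((i, (i + j) % n)))
             for i in range(n) for j in range(1, k + 1)}
    rewired = set()
    for (u, v) in edges:
        if rng.random() < p:
            w = int(rng.integers(n))
            cand = tuple(sorted((u, w)))
            # skip self-loops and collisions (loose handling, for a sketch)
            if w != u and cand not in rewired and cand not in edges:
                rewired.add(cand)
                continue
        rewired.add((u, v))
    return rewired
```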
Some Other Mathematical Models
Barabasi-Albert Preferential Attachment Model (a network growth model)
0. start with G(0), having N(0)v vertices and N(0)e edges
...
t. G(t) is created by adding a vertex of degree m ≥ 1 to G(t−1), where the probability that this new vertex is connected to an existing vertex v in G(t−1) is dv / Σ_{v′∈V} dv′, where dv is the degree of vertex v
• can achieve broad degree distributions
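The growth step can be sketched with the standard repeated-node trick for degree-proportional sampling (the seed graph here, a short path, is an arbitrary choice):

```python
import random

def preferential_attachment(n, m, seed=None):
    """Minimal Barabasi-Albert sketch: start from a small seed graph,
    then attach each new vertex to m existing vertices chosen with
    probability proportional to their degrees."""
    rng = random.Random(seed)
    edges = [(i, i + 1) for i in range(m)]   # seed: a path on m+1 vertices
    # degree-weighted sampling: a list where each vertex appears
    # once per unit of degree
    targets = [v for e in edges for v in e]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:               # m distinct attachment targets
            chosen.add(rng.choice(targets))
        for v in chosen:
            edges.append((new, v))
            targets.extend([new, v])
    return edges
```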
Thank you!!
Some References
van Duijn, Marijtje A. J., Tom A. B. Snijders and Bonne J. H. Zijlstra. 2004. “p2: ARandom Effects Model with Covariates for Directed Graphs.” Statistica Neerlandica58(2): 234-254.
Frank, Ove and David Strauss. 1986. “Markov Graphs.” Journal of the AmericanStatistical Association 81: 832-42.
Snijders, Tom A. B., Philippa E. Pattison, Garry L. Robins, and Mark S. Handcock. 2006. “New Specifications for Exponential Random Graph Models.” Sociological Methodology 36(1): 99-153.
Butts, Carter T. 2008. “Social Network Analysis: A Methodological Introduction.”Asian Journal of Social Psychology 11: 13-41.
Erdos, P. and A. Renyi. 1960. “On the Evolution of Random Graphs.” Publications of the Mathematical Institute of the Hungarian Academy of Sciences 5: 17-61.
van Wijk, Bernadette C. M., Cornelis J. Stam, and Andreas Daffertschofer. 2010.“Comparing Brain Networks of Different Size and Connectivity Density Using GraphTheory.” PLoS ONE 5(10): e13701.