random graph models with fixed degree sequences: choices...

Random graph models with fixed degree sequences: choices, consequences and irreducibility proofs for sampling

Joel Nishimura1, Bailey K Fosdick2, Daniel B Larremore3 and Johan Ugander4

1Arizona State Univ. 2Colorado State 3Univ. of Colorado 4Stanford Univ.

ASU Discrete Math Seminar 2018

See paper for the numerous literature connections

What is notable about a graph?

Interpretation requires a Null Model

Example network: karate club

[Figure: null models placed by model error against implementation difficulty and/or required understanding. Models shown: replicated experiments, Erdős–Rényi, application specific simulation, fixed degree sequence.]

Stub Matching

[Figure: stub matching in three steps: edges to stubs, join 2 stubs, drop stub labels. Panels: stub-labeled, vertex-labeled.]
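The three steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `stub_matching` and the edge-list output format are assumptions made for the example.

```python
import random
from collections import Counter

def stub_matching(degrees, seed=None):
    """Pair up stubs uniformly at random; may create self-loops and multiedges."""
    if sum(degrees) % 2 != 0:
        raise ValueError("degree sum must be even")
    rng = random.Random(seed)
    # Edges to stubs: vertex v contributes degrees[v] stubs.
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    # Join 2 stubs: shuffling and pairing consecutive entries gives a
    # uniformly random perfect matching of the stubs.
    rng.shuffle(stubs)
    # Drop stub labels: keep only the vertex at each end of every edge.
    return [tuple(sorted(stubs[i:i + 2])) for i in range(0, len(stubs), 2)]

degrees = [1, 2, 2, 1]
edges = stub_matching(degrees, seed=0)
# A self-loop (v, v) counts twice toward the degree of v.
realized = Counter(v for edge in edges for v in edge)
```

Whatever the random outcome, the realized degrees equal the requested ones; nothing in the procedure prevents pairs such as `(v, v)` or repeated pairs.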

Self-loops

[Figure: stub matching that produces a self-loop: edges to stubs, join 2 stubs, drop stub labels.]

Self-loops and Multiedges

• Self-loops and multiedges are asymptotically rare (for reasonable degree sequences)

• They have frequently been ignored, or simply deleted

• BUT they can also have large impacts on finite-sized null models

• AND in null models which allow self-loops or multiedges, stub matching does not sample adjacency matrices uniformly at random


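[Figure: the same plot of model error against implementation difficulty and/or required understanding (replicated experiments, Erdős–Rényi, application specific simulation, fixed degree sequence), with question marks added at the fixed degree sequence model.]

Choices within the fixed degree sequence model: simple graphs, or graphs with multiedges and/or self-loops; vertex-labeled or stub-labeled.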

Stub labeled graphs are biased

against multiedges and self-loops

Consider the degree sequence {1,2,2,1}.

[Figure: graphs with this degree sequence under each labeling. Panel notes: uniform samples from d and e are the same; uniform samples are different.]
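The bias can be checked by brute force for the degree sequence 1,2,2,1. The sketch below enumerates every stub matching and counts how many matchings collapse to each vertex-labeled graph once stub labels are dropped. The helper name `perfect_matchings` and the tuple encoding of graphs are assumptions made for this example.

```python
from collections import Counter
from fractions import Fraction

def perfect_matchings(items):
    """Yield every way to pair up the items, each pairing exactly once."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching

degrees = [1, 2, 2, 1]
# A stub is (vertex, index), so stubs of the same vertex stay distinguishable.
stubs = [(v, k) for v, d in enumerate(degrees) for k in range(d)]

# Dropping stub labels maps each matching to a vertex-labeled multigraph,
# recorded here as a sorted tuple of vertex pairs.
counts = Counter()
for matching in perfect_matchings(stubs):
    graph = tuple(sorted(tuple(sorted((a[0], b[0]))) for a, b in matching))
    counts[graph] += 1

total = sum(counts.values())
for graph, c in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(graph, Fraction(c, total))
```

There are 15 stub matchings and 6 distinct vertex-labeled graphs. Each of the two simple paths arises from 4 matchings, the graph with a double edge and the two graphs with one self-loop from 2 each, and the graph with two self-loops from only 1. Uniform sampling over stub matchings therefore under-weights graphs with multiedges and self-loops relative to uniform sampling over adjacency matrices.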

There’s a choice of graphs – and it matters

Example 1

• Geometer’s collaboration graph, n = 9,072, m = 22,577

• Nodes: computational geometry researchers.

• Edges: collaboration on a book or paper

• Degree assortativity

• Do high productivity authors coauthor with other high productivity authors?
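As a reminder of what is being measured, degree assortativity can be computed as the Pearson correlation between the degrees found at the two ends of each edge. The sketch below assumes a plain edge list and is not tied to the geometers' data; the function name is chosen for this example.

```python
def degree_assortativity(edges):
    """Pearson correlation of endpoint degrees, each edge read in both directions."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean = sum(xs) / n  # xs and ys contain the same values, so one mean suffices
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        return float("nan")  # undefined when every endpoint has the same degree
    return cov / var

# A star is maximally disassortative: the hub only meets leaves.
star = [(0, 1), (0, 2), (0, 3)]
r = degree_assortativity(star)
```

To judge whether an observed value is notable, the same statistic would be computed on graphs drawn from the chosen null model, which is where the choice of graph space enters.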

Multiedges, self-loops, and labeling?

Stub-labeling isn’t causal

Consider a collaboration network, and two potential stub labelings:

[Figure: two stub labelings of nodes i and j, with stub 1 read as a node's first paper and stub 2 as its second paper.]

Vertex-Labeling is Causal

• Consider a collaboration network:

• Nodes: authors

• Edges: papers/books with unique title

• Suppose you order each edge’s arrival

• Each vertex labeled graph has m! edge orderings

• i.e. all adjacency matrices correspond to the same number of timelines where papers were produced in different orders.

Example 2

• Swallow graph n=17

• Nodes: barn swallows.

• Edges: bird-bird interactions

• Trait assortativity (based on bird color)

• Do birds of a similar color interact together?

Example 3

• South Indian village social support network n=782

• Nodes: villagers

• Edges: reported social support

• Community detection via modularity maximization

• Modularity has a built in stub-labeled Chung-Lu null model

• Do results change if we use vertex labeled model?
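For reference, the usual form of modularity makes the built-in null model visible. The notation below is an assumption for this note rather than something taken from the slides: A is the adjacency matrix, k_i the degree of node i, m the number of edges, and c_i the community of node i.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
```

The term k_i k_j / 2m is the Chung-Lu expected number of edges between i and j; a vertex-labeled version would replace this term with the corresponding expectation under the vertex-labeled model.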

[Figure: Chung-Lu estimation compared with the # of edges observed in configuration models]

Sampling graphs uniformly at random is surprisingly difficult (except pseudo-graphs)

Sampling graphs uniformly at random

Sampling via Markov chain Monte Carlo

G0 G1 G2 G3 G4 G5

Goal: A sequence of degree constrained graphs such that subsampling from this sequence approximates a set of graphs drawn uniformly at random.

Double Edge Swaps

The Graph of Graphs (GoG)

Dealing with Constraints


no self-loops

MCMC requirements

1. Random walks can reach any graph: irreducibility, i.e. the GoG is connected

2. Balanced transition probabilities: P(G_i → G_j) = P(G_j → G_i), i.e. edges will be weighted but undirected

3. Markov chain is aperiodic: otherwise subsampling can be biased

NOTE: There are mixing time results for some degree sequences. There are also numerical methods to gauge convergence. I will not discuss either.
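A minimal sketch of a double edge swap chain on simple graphs follows. It assumes that a proposal creating a self-loop or multiedge is rejected and the chain stays at the current graph for that step; the function names and the subsampling parameter `gap` are illustrative choices, not taken from the talk.

```python
import random

def double_edge_swap_step(edges, edge_set, rng):
    """One MCMC step on simple graphs; a rejected proposal keeps the graph."""
    i, j = rng.sample(range(len(edges)), 2)
    (u, v), (x, y) = edges[i], edges[j]
    if rng.random() < 0.5:
        x, y = y, x  # choose between the two possible rewirings
    new1, new2 = frozenset((u, x)), frozenset((v, y))
    # Reject proposals that would create a self-loop or a multiedge.
    if len(new1) < 2 or len(new2) < 2 or new1 in edge_set or new2 in edge_set:
        return False
    edge_set.difference_update({frozenset(edges[i]), frozenset(edges[j])})
    edge_set.update({new1, new2})
    edges[i], edges[j] = (u, x), (v, y)
    return True

def sample_graphs(edges, n_samples, gap, seed=None):
    """Run the chain and record the graph after every `gap` steps."""
    rng = random.Random(seed)
    edges = list(edges)
    edge_set = {frozenset(e) for e in edges}
    samples = []
    for _ in range(n_samples):
        for _ in range(gap):
            double_edge_swap_step(edges, edge_set, rng)
        samples.append(sorted(tuple(sorted(e)) for e in edges))
    return samples

start = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
samples = sample_graphs(start, n_samples=5, gap=50, seed=1)
```

Counting a rejected proposal as a step, rather than redrawing until a valid swap is found, keeps the proposal probabilities symmetric and gives the chain a chance of staying put. How large `gap` must be is a mixing-time question that this sketch does not address.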

Is the GoG periodic? Nope!

Stub-labeled GoG: GoG is an undirected simple graph

Vertex-labeled GoG: GoG is a directed pseudograph

Are transition probabilities balanced?

[Figure panels: stub-labeled GoG (an undirected simple graph) and vertex-labeled GoG]

Is the GoG connected?

• Most difficult of the 3 questions

• Need a special proof for each choice of self-loops/multiedges

• Stub-labeled GoG connectivity iff vertex-labeled GoG connectivity, because the following swap permutes stubs:

Connectivity of Graph of Pseudographs

[Figure: start graph, target graph, and their difference (diff). In the diff, counting stubs per node, # gold = # maroon.]

A swap can always find a graph one edge closer to the target.

[Figure: a swap moving the start graph toward the target]

Connectivity on other GoGs?

Disconnectivity of loopy graphs

Consider graphs with self-loops but no multiedges

There are no swaps between these graphs
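A small check makes the claim concrete. The sketch assumes the example is the degree sequence 2,2,2, whose only loopy graphs without multiedges are the triangle and the graph of three self-loops; the function name `loopy_swaps` is hypothetical. It lists every graph reachable by a single double edge swap when self-loops are allowed and multiedges are not.

```python
from itertools import combinations

def loopy_swaps(edges):
    """Graphs reachable by one double edge swap, allowing self-loops but
    forbidding multiedges. Edges are sorted vertex pairs; (v, v) is a loop."""
    current = set(edges)
    neighbors = set()
    for (u, v), (x, y) in combinations(sorted(current), 2):
        rest = current - {(u, v), (x, y)}
        for a, b in (((u, x), (v, y)), ((u, y), (v, x))):
            a, b = tuple(sorted(a)), tuple(sorted(b))
            # A multiedge appears if the new edges coincide or already exist.
            if a == b or a in rest or b in rest:
                continue
            new = frozenset(rest | {a, b})
            if new != frozenset(current):
                neighbors.add(new)
    return neighbors

triangle = [(0, 1), (0, 2), (1, 2)]
all_loops = [(0, 0), (1, 1), (2, 2)]
four_cycle = [(0, 1), (0, 3), (1, 2), (2, 3)]
```

Both `loopy_swaps(triangle)` and `loopy_swaps(all_loops)` are empty, so each graph is an isolated vertex of the GoG, while the 4-cycle does have neighbors.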

Two directions for generalizations: cycles and cliques

Degree sequence: {2,2,…,2}

Swaps can:

1. Merge two cycles into a larger cycle (or do the reverse).

2. Swap two edges inside a cycle, preserving cycle length.

3. Make a self-loop & reduce cycle length by 1 (or do the reverse), but only for cycles of length 4 or more.

Swaps cannot make every edge a self-loop

This can be further generalized

1) Let V0 be vertices without a self-loop

2) Vertices in V1 have a neighbor in V0

3) Vk are vertices at distance k from a vertex in V0

A taxonomy of V

Let Vk be vertices k hops from a vertex without a self-loop
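The sets Vk can be computed with a breadth-first search started from all loop-free vertices at once. The sketch below assumes vertices are numbered 0 to n-1 and that a self-loop is written as an edge (v, v); the function name is chosen for this example.

```python
from collections import deque

def loop_distance_classes(n, edges):
    """Return dist with dist[v] = k when v is in V_k.
    dist[v] stays None if no loop-free vertex is reachable from v."""
    adj = [set() for _ in range(n)]
    has_loop = [False] * n
    for u, v in edges:
        if u == v:
            has_loop[u] = True
        else:
            adj[u].add(v)
            adj[v].add(u)
    dist = [None] * n
    queue = deque()
    # V_0: every vertex without a self-loop is a BFS source.
    for v in range(n):
        if not has_loop[v]:
            dist[v] = 0
            queue.append(v)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if dist[w] is None:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

# Path 0-1-2-3 with self-loops on vertices 1, 2 and 3.
example = [(0, 1), (1, 2), (2, 3), (1, 1), (2, 2), (3, 3)]
classes = loop_distance_classes(4, example)
```

In the example, vertex 0 is the only loop-free vertex, so vertices 0, 1, 2, 3 fall in V0, V1, V2, V3 respectively.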

Degree sequence: {n+1,…,n+1,n-1,…,n-1}

No swaps are possible

Q1 and Q2 are exactly the problems

Proof of 4.20 outline

[Figure: graphs with a fixed degree sequence, grouped into connected components and arranged by increasing number of self-loops. Graphs with the most self-loops (the ‘m*-loopy’ graphs) are in ‘yellow’.]


Note: connectivity of […] follows from connectivity of simple graphs and an exchange lemma.


The GoG is disconnected iff there is some component where:

U

Zooming into

Easy case:

Harder case:

What do we know about ?

Maximum number of self-loops implies no open wedges in V0.

No sequence of swaps can net create open wedges in V0.

&

Example: V4 is empty in any

Open Wedge

Q2

Q1

is m*-loopy

Decreasing any degree in K0 leaves Vu1 with excess degree.

is also m*-loopy

By an alternating cycle/path argument.

Thus

Q: Can a different swap connect loopy-graphs?

Triangle swaps connect the GoG

Bonus: other constraints

• Connected Graphs

• GoG known to be connected, but algorithms require complicated data-structures to track effect of edge changes.

• Graphs with the same clustering coefficients

• Or, triangle constraints

Triangle MCMC constraints

• Total number of triangles

• Number of triangles incident at each node

Do these affect connectedness in simple graphs?

Can we constrain the number of triangles?

How about the triangle sequence?

And more!

Thanks for listening!
