
Upload: camron-dalton

Post on 13-Jan-2016

Communities & Roles
Two ways of identifying nodes that “go together”

a) Communities/Groups
   a) Cohesive subgroups literature: start with Freeman
   b) Network operationalization
      a) Graph theoretic
      b) Heuristic algorithms
         a) Graph search & modularity
         b) Cluster analysis
         c) LDA/Principal components
   c) Fundamental limitations

b) Roles/Positions
   a) Literature grounded in structural anthropology & kinship
   b) Roles as relations imply paired sets
   c) Goal is to identify nodes with common patterns
      a) Original is CONCOR
      b) Alternatives based on triads, other clusterings

Social Sub-groups

Lin Freeman: The sociological concept of “Group”

Focus on collectivities that are: “Relatively small, informal, and involve close personal ties.” What we would call “Primary Groups”

What (network) structure characterizes such a group?

Goal: Identify (a) non-overlapping groups that allow one to (b) identify internal group structure.

Social Sub-groups

Lin Freeman: The sociological concept of “Group”

Winship’s Model:

1) Assign people to equivalence classes that are hierarchically nested:

Social Sub-groups

Lin Freeman: The sociological concept of “Group”

In words, this means that, whatever metric you define, a person is closer to themselves than to anyone else, the relation is symmetric, and triads are transitive (which, given the symmetry condition, means that they are complete).

You can then identify partitions by scaling the proximity, such that these three conditions are met.

Winship’s Model:

Social Sub-groups

Lin Freeman: The sociological concept of “Group”

    A  B  C  D  E  F  G  H  I  J  K
A   .  5  5  4  4  4  4  3  3  3  3
B   5  .  5  4  4  4  4  3  3  3  3
C   5  5  .  4  4  4  4  3  3  3  3
D   4  4  4  .  5  5  5  3  3  3  3
E   4  4  4  5  .  5  5  3  3  3  3
F   4  4  4  5  5  .  5  3  3  3  3
G   4  4  4  5  5  5  .  3  3  3  3
H   3  3  3  3  3  3  3  .  5  5  5
I   3  3  3  3  3  3  3  5  .  5  5
J   3  3  3  3  3  3  3  5  5  .  5
K   3  3  3  3  3  3  3  5  5  5  .

Winship’s Model:

Social Sub-groups

Lin Freeman: The sociological concept of “Group”

total
   {A-G}
      {A-C}
      {D-G}
   {H-K}

Winship’s Model:
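The nested partition above can be recovered mechanically from the proximity matrix. The sketch below is an illustration only, not Winship's own procedure: it encodes the 5/4/3 values from the example matrix, checks symmetry and transitivity at every cut-off (written as s(x,z) >= min(s(x,y), s(y,z))), and then cuts the proximities at each level. The helper names are made up for this example.

```python
from itertools import product

labels = list("ABCDEFGHIJK")
blocks = ["ABC", "DEFG", "HIJK"]  # finest blocks in the example matrix

def proximity(a, b):
    """Values read off the example matrix: 5 within a block,
    4 between {A-C} and {D-G}, 3 for every other pair."""
    ga = next(g for g in blocks if a in g)
    gb = next(g for g in blocks if b in g)
    if ga == gb:
        return 5
    if {ga, gb} == {"ABC", "DEFG"}:
        return 4
    return 3

def is_symmetric():
    return all(proximity(a, b) == proximity(b, a)
               for a, b in product(labels, repeat=2) if a != b)

def is_transitive_at_every_level():
    # If x-y and y-z are both at least as close as some cut-off,
    # then x-z must be at least that close as well.
    return all(proximity(x, z) >= min(proximity(x, y), proximity(y, z))
               for x, y, z in product(labels, repeat=3)
               if len({x, y, z}) == 3)

def partition_at(level):
    """Group nodes whose proximity is at least `level`.
    Comparing against one member per group is enough because
    the transitivity condition holds."""
    groups = []
    for node in labels:
        for group in groups:
            if proximity(node, group[0]) >= level:
                group.append(node)
                break
        else:
            groups.append([node])
    return ["".join(g) for g in groups]

print(is_symmetric(), is_transitive_at_every_level())
for level in (3, 4, 5):
    print(level, partition_at(level))
```

Cutting at 3 gives the total group, at 4 gives {A-G} and {H-K}, and at 5 splits {A-G} into {A-C} and {D-G}.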

Social Sub-groups

Lin Freeman: The sociological concept of “Group”

Granovetter’s Model:

Proceed exactly as in Winship, but treat intransitivity differently when looking at strong or weak ties.

If x and y are strongly connected, and y and z are strongly connected, then x and z should be at least weakly connected.
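A minimal sketch of that rule as a check on valued ties. The cut-offs for "strong" and "weak" (2 and 1 here) and the function name are assumptions for illustration; in practice they depend on how the ties were measured.

```python
from itertools import permutations

def g_intransitive_triples(strength, strong=2, weak=1):
    """Return triples (x, y, z) where x-y and y-z are strong
    but x-z is not even weakly connected.

    `strength` maps frozenset({a, b}) to a tie value; pairs that
    are missing count as 0. The cut-offs are illustrative.
    """
    nodes = sorted({n for pair in strength for n in pair})

    def tie(a, b):
        return strength.get(frozenset((a, b)), 0)

    return [
        (x, y, z)
        for x, y, z in permutations(nodes, 3)
        # x < z reports each violating triple once, not twice
        if x < z and tie(x, y) >= strong and tie(y, z) >= strong
        and tie(x, z) < weak
    ]

ties = {frozenset(("x", "y")): 2, frozenset(("y", "z")): 2}
print(g_intransitive_triples(ties))   # x and z are unconnected: one violation

ties[frozenset(("x", "z"))] = 1       # add a weak tie between x and z
print(g_intransitive_triples(ties))   # no violations remain
```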

An example of a graph fitting the prohibition against G-intransitive relations.

Social Sub-groups

Lin Freeman: The sociological concept of “Group”

Granovetter’s Model:

Social Sub-groups

The Davis - “Old South” Example

Social Sub-groups

The Davis - “Old South” Example: Ties > 2

Social Sub-groups

The Davis - “Old South” Example: Ties > 3

Social Sub-groups

The Davis - “Old South” Example: Ties > 4

Meets the G-transitivity condition

Social Sub-groups

The Davis - “Old South” Example: Ties > 5

Stronger than the G-transitivity condition

Social Sub-groups

Lin Freeman: The sociological concept of “Group”

Freeman argues that the G-intransitivity model fits the data best for each of the 7 groups he studies.

Substantively, the types of groups this model predicts are very similar to those predicted by the general transitivity model, except re-cast as a valued relation.

Empirically, if you want to identify groups based on levels like this, you can use PAJEK and walk through the model in just the same way as we did with “Old South,” or you can use UCI-NET (or program it; it’s not hard).

Methods: How do we identify primary groups in a network?

A) Classic graph theoretical methods: Cliques and extensions of cliques
•Cliques
•k-cores
•k-plexes
•Freeman (1992) Models
•K-components (we talked about these already)

B) Algorithmic methods: search through a network trying to maximize for a particular pattern (e.g., Frank & Yasumoto)
•Adjust assignment of actors to groups until a particular pattern of ties (block diagonal, usually) is identified.
•Standard models:
  - Factions (UCI-NET)
  - KliqueFinder (Frank)
  - RNM/CROWDS/JIGGLE (Moody)
  - Principal component analysis (PCA)
  - Flow models (MCL)
  - Modularity Maximization routines
  - General Distance & Clustering Methods

Methods: How do we identify primary groups in a network?

Graph Theoretical Models.

Start with a clique. A clique is defined as a maximal subgraph in which every member of the subgraph is connected to every other member of the subgraph. Cliques are collections of nodes where density = 1.0.

Properties of cliques:
•Density: 1.0
•Everyone connected to n-1 alters
•Distance between every pair is 1
•Ratio of within group ties to between group ties is infinite
•All triads are transitive
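The definition can be checked directly on a small graph. This is a sketch under simple assumptions: an undirected graph stored as a set of two-node frozensets, with helper names invented for the example.

```python
from itertools import combinations

def density(nodes, edges):
    """Share of possible undirected ties among `nodes` that are present."""
    pairs = list(combinations(sorted(nodes), 2))
    present = sum(1 for a, b in pairs if frozenset((a, b)) in edges)
    return present / len(pairs)

def is_complete(nodes, edges):
    """Every pair is directly tied, so density is 1.0 and every distance is 1."""
    return all(frozenset(pair) in edges for pair in combinations(nodes, 2))

def is_clique(nodes, edges, all_nodes):
    """Complete and maximal: no outsider is tied to every member."""
    if not is_complete(nodes, edges):
        return False
    outsiders = set(all_nodes) - set(nodes)
    return not any(all(frozenset((o, n)) in edges for n in nodes)
                   for o in outsiders)

# Hypothetical graph: a triangle a-b-c with a pendant tie c-d.
all_nodes = "abcd"
edges = {frozenset(p) for p in [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]}

print(is_clique({"a", "b", "c"}, edges, all_nodes))  # complete and maximal
print(is_clique({"a", "b"}, edges, all_nodes))       # complete, but c could be added
print(density({"a", "b", "c"}, edges), density(set(all_nodes), edges))
```

The pair {a, b} shows why "maximal" matters: it has density 1.0 but sits inside the larger clique {a, b, c}.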

Methods: How do we identify primary groups in a network?

Graph Theoretical Models.

In practice, complete cliques are not very useful. They tend to overlap heavily and are limited in their size.

Graph theorists have thus relaxed the complete connectivity requirement (with varying degrees of success). See the Moody & White paper on cohesion for a discussion of many of these attempts.

Methods: How do we identify primary groups in a network?

Graph Theoretical Models.

k-cores: Every person is connected to at least k other people in the group.

Ideally, they would look something like this (here two 3-cores).

However, adding a single tie from A to B would make the whole graph a 3-core.
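The fragility is easy to reproduce. The sketch below peels away nodes with fewer than k surviving ties and reports the connected pieces that remain. The two four-person groups are hypothetical stand-ins for the figure, chosen so that A sits in one group and B in the other.

```python
def k_core_components(adj, k):
    """Connected pieces of the k-core of an undirected graph.

    `adj` maps each node to the set of its neighbours.
    """
    alive = set(adj)
    changed = True
    while changed:                      # peel nodes with fewer than k surviving ties
        changed = False
        for node in list(alive):
            if len(adj[node] & alive) < k:
                alive.remove(node)
                changed = True
    components, seen = [], set()
    for start in sorted(alive):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:                    # depth-first walk inside the core
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend((adj[node] & alive) - comp)
        seen |= comp
        components.append(sorted(comp))
    return components

def complete_graph(nodes):
    return {n: set(nodes) - {n} for n in nodes}

# Two separate groups in which everyone knows everyone (degree 3 each).
adj = {**complete_graph("ACDE"), **complete_graph("BFGH")}
print(k_core_components(adj, 3))   # two 3-cores

adj["A"].add("B")                  # a single tie from A to B
adj["B"].add("A")
print(k_core_components(adj, 3))   # one 3-core covering everyone
```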

Methods: How do we identify primary groups in a network?

Graph Theoretical Models.

Extensions of this idea include:

K-plex: Every member is connected to at least n-k other people in the group (recall that in a clique everyone is connected to n-1, so this relaxes that condition).

N-clique: Every pair of members is connected by a path of length N or less (recall that in a clique every distance = 1).

N-clan: Same as an N-clique, but all paths must be inside the group.
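To see how the three relaxations differ, here is a small sketch that tests the defining conditions for a proposed group. It checks the conditions only, not maximality, and the function names and the star-shaped example are assumptions for illustration.

```python
from collections import deque

def distances_from(source, adj, allowed=None):
    """Breadth-first path lengths from `source`, optionally using only `allowed` nodes."""
    allowed = set(adj) if allowed is None else set(allowed)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb in allowed and nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def is_k_plex(group, adj, k):
    """Every member is tied to at least len(group) - k other members."""
    group = set(group)
    return all(len(adj[m] & group) >= len(group) - k for m in group)

def within_distance(group, adj, n, inside_only):
    """N-clique condition (inside_only=False: paths may leave the group)
    or N-clan condition (inside_only=True: paths must stay inside)."""
    allowed = group if inside_only else None
    for member in group:
        dist = distances_from(member, adj, allowed)
        if any(dist.get(other, float("inf")) > n for other in group):
            return False
    return True

# Hypothetical star: a, b and c are tied only to the hub x.
adj = {"a": {"x"}, "b": {"x"}, "c": {"x"}, "x": {"a", "b", "c"}}
group = {"a", "b", "c"}

print(is_k_plex(group, adj, 1))                           # no direct ties at all
print(within_distance(group, adj, 2, inside_only=False))  # reachable in 2 steps via x
print(within_distance(group, adj, 2, inside_only=True))   # but not without leaving the group
```

The group {a, b, c} meets the 2-clique condition without containing a single tie, which is one reason these relaxations can behave oddly on real data.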

I’ve never had much luck with any of these methods empirically. Real data are usually too messy for them to work well. You should try them and gain some intuition for yourself. The place to start is UCINET.

Methods: How do we identify primary groups in a network?

UCINET will compute all of the best-known graph theoretic treatments for subgroups

Graph Theoretical Models.

Methods: How do we identify primary groups in a network?

Consider running different methods on a known group structure:

Graph Theoretical Models.

Methods: How do we identify primary groups in a network?

Graph Theoretical Models.

Methods: How do we identify primary groups in a network?

Graph Theoretical Models.

Cliques

Methods: How do we identify primary groups in a network?

The only way to get something meaningful from this is to analyze the clique overlap matrix, which is what the “Clique by partion” dataset does, using cluster analysis

Cliques

Heuristic strategies for identifying primary groups:

Search:
1) Fit measure: Identify a measure of groupness (usually a function of the number of ties that fall within groups compared to the number of ties that fall between groups).
2) Algorithm to maximize fit: Once we have the index, we need a clever method for searching through the network to maximize the fit.

Destroy: Break apart the network in strategic ways, removing the weakest parts first; what’s left are your primary groups. See “edge betweenness” and “MCL.”

Evade: Don’t look directly; instead, find a simpler problem that correlates. Examples: generalized cluster analysis, factor analysis, RNM.

Methods: How do we identify primary groups in a network?

Segregation Index (Freeman, L. C. 1978. "Segregation in Social Networks." Sociological Methods and Research 6:411-30.)

Freeman asked how we could identify segregation in a social network. Theoretically, he argues, if a given attribute (group label) does not matter for social relations, then relations should be distributed randomly with respect to the attribute. Thus, the difference between the number of cross-group ties expected by chance and the number observed measures segregation.

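$$\mathrm{Seg} = \frac{E(X) - X}{E(X)}$$

where X is the observed number of cross-group ties and E(X) is the number expected by chance.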

Methods: How do we identify primary groups in a network?

Search: Optimize a partition to fit

Consider the (hypothetical) network below. There are two attributes in this network: people with Blue eyes and Brown eyes and people who are square or not (they must be hip).

Methods: How do we identify primary groups in a network?

Search: Optimize a partition to fit

Segregation Index

Mixing Matrix:

          Blue   Brown
Blue         6      17
Brown       17      16

Seg = -0.25

          Hip   Square
Hip        20        3
Square      3       30

Seg = 0.78
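The two values can be reproduced from the mixing matrices. The sketch below assumes that the chance expectation for each cell comes from the matrix margins (row total times column total over the grand total). That assumption is not stated on the slide, but it gives back both numbers; the function name is made up for the example.

```python
def segregation_index(mixing):
    """Segregation index from a square mixing matrix of tie counts.

    Assumption: expected cross-group ties are computed from the margins,
    i.e. row total * column total / grand total for each off-diagonal cell.
    """
    size = len(mixing)
    row = [sum(r) for r in mixing]
    col = [sum(mixing[i][j] for i in range(size)) for j in range(size)]
    total = sum(row)
    off_diagonal = [(i, j) for i in range(size) for j in range(size) if i != j]
    observed = sum(mixing[i][j] for i, j in off_diagonal)
    expected = sum(row[i] * col[j] / total for i, j in off_diagonal)
    return (expected - observed) / expected

eye_colour = [[6, 17], [17, 16]]
hipness = [[20, 3], [3, 30]]

print(round(segregation_index(eye_colour), 2))  # -0.25
print(round(segregation_index(hipness), 2))     # 0.78
```

A negative value means more cross-group ties than chance would produce, so eye colour is not segregating here, while hip versus square clearly is.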

Methods: How do we identify primary groups in a network?

Search: Optimize a partition to fit

Segregation Index

One problem with the segregation index is that it is not ‘margin free.’ That is, if you were to change the distribution of the category of interest (say race) by a constant but not the core association between race and friendship choice, you can get a different segregation level.

One antidote to this problem is to use odds ratios. In this case, an odds ratio tells us the relative likelihood that two people in the same category will choose each other as friends.
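As a rough sketch of the calculation, the odds ratio compares the odds of a tie among same-category pairs with the odds of a tie among cross-category pairs. The counts below are invented for illustration, and the function name is hypothetical.

```python
import math

def same_group_odds_ratio(same_ties, same_pairs, cross_ties, cross_pairs):
    """Odds of a tie within a category divided by the odds of a tie across categories.

    `same_pairs` and `cross_pairs` are the numbers of possible pairs of each kind.
    """
    odds_same = same_ties / (same_pairs - same_ties)
    odds_cross = cross_ties / (cross_pairs - cross_ties)
    return odds_same / odds_cross

# Hypothetical counts: 30 of 100 same-category pairs are tied,
# 10 of 100 cross-category pairs are tied.
ratio = same_group_odds_ratio(30, 100, 10, 100)
print(round(ratio, 2), round(math.log(ratio), 2))
```

A ratio above 1 (log above 0) means same-category pairs are more likely to be tied than cross-category pairs.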

Methods: How do we identify primary groups in a network?

Search: Optimize a partition to fit

Segregation index compared to the odds ratio: r = .95

[Figure: scatter plot of the Friendship Segregation Index against Log(Same-Sex Odds Ratio). Tick labels: log_or from -.602628 to 1.8946; remaining axis from -.176744 to .684106.]

Complete Network Analysis
Network Connections: Social Subgroups

The second problem is that the Segregation index has no clear maximum: if every node is assigned to a single group, the value can be higher than if everyone is assigned to the “right” group. It tends to have a monotonically changing score. This means you can’t just keep adjusting nodes until you see a best fit, but instead have to look for changes in fit.

The modularity score solves this problem by re-organizing the expectation in a way that forces the value to 0 if everyone is in a single group.

Methods: How do we identify primary groups in a network?

Search: Optimize a partition to fit

We can also measure the extent to which ties fall within clusters with the modularity score:

$$Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \gamma \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$$

Where:
m is the number of edges
k is the degree
A_ij is the edge weight between i and j
δ(c_i, c_j) is 1 if i and j are in the same group
γ is the resolution parameter

Q has the advantage of going to 0 if there is only 1 group, which means maximizing the score is sensible. Note that the resolution parameter means the number of groups is not truly “automatic.”
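A minimal sketch of the score for an unweighted, undirected graph, using the standard form Q = (1/2m) Σ [A_ij − γ k_i k_j / 2m] δ(c_i, c_j). The example graph (two triangles joined by one tie) and the function name are assumptions for illustration. With γ = 1, putting everyone in one group gives Q = 0.

```python
def modularity(adj, membership, gamma=1.0):
    """Modularity Q for a graph stored as node -> set of neighbours.

    `membership` maps each node to a group label; `gamma` is the
    resolution parameter.
    """
    two_m = sum(len(nbrs) for nbrs in adj.values())   # each edge counted from both ends
    q = 0.0
    for i in adj:
        for j in adj:
            if membership[i] != membership[j]:
                continue                               # delta(c_i, c_j) = 0
            a_ij = 1.0 if j in adj[i] else 0.0
            q += a_ij - gamma * len(adj[i]) * len(adj[j]) / two_m
    return q / two_m

# Hypothetical graph: triangles {1,2,3} and {4,5,6} joined by the tie 3-4.
adj = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}
one_group = {n: 0 for n in adj}
two_groups = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}

print(modularity(adj, one_group))    # zero up to rounding error
print(modularity(adj, two_groups))   # 5/14, about 0.357
```

Raising `gamma` above 1 penalises large groups more heavily, which is why the number of groups found by maximizing Q depends on that choice.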

Methods: How do we identify primary groups in a network?

Search: Optimize a partition to fit

Modularity Scores Comparison to Segregation Index – comparing values for known solutions

Modularity Score Plotted against Segregation Index for various nets

Methods: How do we identify primary groups in a network?

Search: Optimize a partition to fit

Number of groups

In-group Density

Methods: How do we identify primary groups in a network?

Search: Optimize a partition to fit

•Louvain Method (Blondel et al.) in PAJEK & R
•Factions in UCI-NET
  •Multiple options for the exact factor maximized. I recommend either the density or the correlation function, and I would calculate the distance in each case.
•Frank’s KliqueFinder
•Moody’s CROWDS / JIGGLE
•Generalized blockmodel in PAJEK
•igraph (R) has a couple of routines of this sort (Fast-Greedy is good)

Methods: How do we identify primary groups in a network?

Search: Optimize a partition to fit

Factions in UCI-NET

Methods: How do we identify primary groups in a network?

Search: Optimize a partition to fit

Factions in UCI-NET

Factions in UCI-NET

Factions in UCI-NET

Reduced BlockMatrix

        1    2    3    4    5    6
       --   --   --   --   --   --
  1    59    1    2   14    1    0
  2     1   54    0    1   12    2
  3     1    2   55    0    1   12
  4     9    1    1   51    0    0
  5     0   12    2    0   62    1
  6     1    0    9    2    0   64

Fit perfectly

UCINET: Biggest drawbacks of FACTIONS are:

A) Slow
B) You have to specify the number of groups.

Methods: How do we identify primary groups in a network?

Search: Optimize a partition to fit

R – “Fast Greedy”

This is a direct optimization of Modularity

PAJEK – “Louvain”

This is a direct optimization of Modularity

Cluster analysis

In addition to tools like FACTIONS, we can use the distance information contained in a network to cluster observations that are ‘close’ to each other. In general, cluster analysis is a set of techniques that allows you to identify collections of objects that are similar to each other to some degree.

A very good reference is the SAS/STAT manual section called, “Introduction to clustering procedures.” (http://wks.uts.ohio-state.edu/sasdoc/8/sashtml/stat/chap8/index.htm)

(See also Wasserman and Faust, though the coverage is spotty).

We are going to start with the general problem of hierarchical clustering applied to any set of analytic objects based on similarity, and then transfer that to clustering nodes in a network.

Methods: How do we identify primary groups in a network?

Evade: Find a “cheap” indicator, and cluster/optimize that

Cluster analysis

Imagine a set of objects (say people) arrayed in a two dimensional space. You want to identify groups of people based on their position in that space.

How do you do it?

[Figure: scatter plot of people. x-axis: How Cool you are; y-axis: How Smart you are.]

Start by choosing a pair of people who are very close to each other (such as 15 & 16) and now treat that pair as one point, with a value equal to the mean position of the two nodes.

Methods: How do we identify primary groups in a network?

Evade: Find a “cheap” indicator, and cluster/optimize that

Now repeat that process for as long as possible.

Methods: How do we identify primary groups in a network?

Evade: Find a “cheap” indicator, and cluster/optimize that

This process is captured in the cluster tree (called a dendrogram)
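The merge-and-repeat process can be sketched in a few lines. This is an illustration, not the SAS procedure: each cluster is represented by the mean position of its members (a centroid-style rule), the four points are made up, and the returned merge history is the information a dendrogram draws.

```python
import math

def agglomerate(points):
    """Repeatedly merge the two closest clusters; return the merge history.

    `points` maps a label to an (x, y) position. A cluster is a tuple of
    labels and is located at the mean position of its members.
    """
    clusters = {(label,): pos for label, pos in points.items()}
    history = []
    while len(clusters) > 1:
        keys = sorted(clusters)
        a, b = min(
            ((p, q) for i, p in enumerate(keys) for q in keys[i + 1:]),
            key=lambda pair: math.dist(clusters[pair[0]], clusters[pair[1]]),
        )
        members = tuple(sorted(a + b))
        xs = [points[m][0] for m in members]
        ys = [points[m][1] for m in members]
        del clusters[a], clusters[b]
        clusters[members] = (sum(xs) / len(xs), sum(ys) / len(ys))
        history.append(members)
    return history

# Hypothetical positions: two tight pairs far from each other.
points = {"p1": (0.0, 0.0), "p2": (0.0, 1.0), "p3": (5.0, 5.0), "p4": (5.0, 6.5)}
for step in agglomerate(points):
    print(step)
```

Reading the history from the bottom up gives the tree: the last merge is the root, and earlier merges are the branches.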

Methods: How do we identify primary groups in a network?

Evade: Find a “cheap” indicator, and cluster/optimize that

As with the network cluster algorithms, there are many options for clustering. The three that I use most are:

•Ward’s Minimum Variance -- the one I use almost 95% of the time
•Average Distance -- the one used in the example above
•Median Distance -- very similar

Again, the SAS manual is the best single place I’ve found for information on each of these techniques.

Some things to keep in mind:

Units matter. The example above draws together pairs horizontally because the range there is smaller. Get around this by standardizing your data.

This is an inductive technique. You can find clusters in a purely random distribution of points. Consider the following example.

Methods: How do we identify primary groups in a network?

Evade: Find a “cheap” indicator, and cluster/optimize that

The data in this scatter plot are produced using this code:

data random;
  do i=1 to 20;
    x=rannor(0);   /* standard normal draw */
    y=rannor(0);
    output;
  end;
run;

Cluster analysis

Methods: How do we identify primary groups in a network?

Evade: Find a “cheap” indicator, and cluster/optimize that

Cluster analysis Resulting dendrogram

Methods: How do we identify primary groups in a network?

Evade: Find a “cheap” indicator, and cluster/optimize that

Cluster analysis

Resulting cluster solution

Cluster analysis

Cluster analysis works by building a distance matrix between each pair of points. In the example above, it used the Euclidean distance which in two dimensions is simply the physical distance between the points in a plot.

It can work in any number of dimensions.

To use cluster analysis in a network, we base the distance on the path-distance between pairs of people in the network.
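A path-distance matrix can be built with a breadth-first search from every node. The sketch below is a generic illustration on a made-up five-node graph, not the blue-eye network, and the function name is hypothetical.

```python
from collections import deque

def path_distance_matrix(adj):
    """All-pairs shortest path lengths for an unweighted graph.

    `adj` maps each node to the set of its neighbours. Unreachable
    pairs are reported as None.
    """
    nodes = sorted(adj)
    matrix = []
    for source in nodes:
        dist = {source: 0}
        queue = deque([source])
        while queue:                        # breadth-first search from `source`
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        matrix.append([dist.get(target) for target in nodes])
    return nodes, matrix

# Hypothetical graph: node 2 is tied to 1, 3 and 4; node 5 hangs off node 4.
adj = {1: {2}, 2: {1, 3, 4}, 3: {2}, 4: {2, 5}, 5: {4}}
nodes, dmat = path_distance_matrix(adj)
for label, row in zip(nodes, dmat):
    print(label, row)
```

The resulting matrix can be handed to any clustering routine that accepts precomputed distances.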

Consider again the blue-eye hip example:

Methods: How do we identify primary groups in a network?

Evade: Find a “cheap” indicator, and cluster/optimize that

Cluster analysis

Distance Matrix

0 1 3 2 3 3 4 3 3 2 3 2 2 1 1
1 0 2 2 2 3 3 3 2 1 2 2 1 2 1
3 2 0 3 2 4 3 3 2 1 1 1 2 2 3
2 2 3 0 1 1 2 1 1 2 3 3 3 2 1
3 2 2 1 0 2 1 1 1 1 2 2 3 3 2
3 3 4 1 2 0 1 1 2 3 4 4 4 3 2
4 3 3 2 1 1 0 2 2 2 3 3 4 4 3
3 3 3 1 1 1 2 0 1 2 3 3 4 3 2
3 2 2 1 1 2 2 1 0 1 2 2 3 3 2
2 1 1 2 1 3 2 2 1 0 1 1 2 2 2
3 2 1 3 2 4 3 3 2 1 0 1 2 2 3
2 2 1 3 2 4 3 3 2 1 1 0 1 1 2
2 1 2 3 3 4 4 4 3 2 2 1 0 2 2
1 2 2 2 3 3 4 3 3 2 2 1 2 0 1
1 1 3 1 2 2 3 2 2 2 3 2 2 1 0

Methods: How do we identify primary groups in a network?

Evade: Find a “cheap” indicator, and cluster/optimize that

The distance matrix implies a space that nodes are embedded within. Using something like MDS, we can represent the space implied by the distance matrix in two dimensions. This is the image of the network you would get if you did that.

Methods: How do we identify primary groups in a network?

Evade: Find a “cheap” indicator, and cluster/optimize that

Cluster analysis

When you use variables, the cluster analysis program generates a distance matrix. We can instead use the network distance matrix directly. If we do that with this example network, we get the following:


Cluster analysis

In SAS you use two commands to get a cluster analysis. The first does the hierarchical clustering. The second analyzes the cluster output to create the tree.

Example 1. Using variables to define the space (like income and musical taste):

proc cluster data=a method=ave out=clustd std;   /* average linkage, standardized variables */
  var x y;
  id node;
run;

proc tree data=clustd ncl=5 out=cluvars;   /* cut the tree into 5 clusters */
run;

Cluster analysis

Example 2. Using a pre-defined distance matrix to define the space (as in a social network).

You first create the distance matrix (in IML), then use it in the cluster program.

proc iml;
  %include 'c:\moody\sas\programs\modules\reach.mod';

  /* blue eye example */
  mat2=j(15,15,0);
  mat2[1,{2 14 15}]=1;
  /* lines cut here */
  mat2[15,{1 14 2 4}]=1;

  dmat=reach(mat2);
  mattrib dmat format=1.0;
  print dmat;

  id=1:nrow(dmat);
  id=id`;
  ddat=id||dmat;

  create ddat from ddat;   /* creates the dataset */
  append from ddat;
quit;

data ddat (type=dist);   /* tells SAS it is a distance matrix */
  set ddat;
run;

Cluster analysis

Example 2. Using a pre-defined distance matrix to define the space (as in a social network).

Once you have it, the cluster program is just the same.

proc cluster data=ddat method=ward out=clustd;
  id col1;
run;

proc tree data=clustd ncl=3 out=netclust;
  copy col1;
run;

proc freq data=netclust;
  tables cluster;
run;

proc print data=netclust;
  var col1 cluster;
run;

Moody’s CROWDS algorithm combines the search approach with an initial cluster analysis and a routine for determining how many clusters are in the network. It does so by using the Segregation index and all of the information from the cluster hierarchy, combining two groups only if it improves the segregation fit for both groups.

[Figure: cluster tree with a segregation index value attached to each candidate group, ending at the Total network. Values range from .085 to .762.]

Methods: How do we identify primary groups in a network?

Evade: Find a “cheap” indicator, and cluster/optimize that

The logic behind these algorithms is that you remove some weak links and see what is left. Most popular is the “edge betweenness” algorithm.
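A compact sketch of the edge-betweenness idea: score every tie by the number of shortest paths that run over it, remove the highest-scoring tie, and repeat until the network falls apart. This is an illustrative implementation for small unweighted graphs, not the routine from any particular package; the example graph and function names are assumptions.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path betweenness of each undirected edge."""
    score = {}
    for s in adj:
        order, preds = [], {n: [] for n in adj}
        sigma = {n: 0 for n in adj}          # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {n: 0.0 for n in adj}
        for w in reversed(order):            # push path counts back toward s
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1 + delta[w])
                edge = frozenset((v, w))
                score[edge] = score.get(edge, 0.0) + share
                delta[v] += share
    return {edge: total / 2 for edge, total in score.items()}  # pairs were counted twice

def components(adj):
    seen, parts = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            if n not in comp:
                comp.add(n)
                stack.extend(adj[n] - comp)
        seen |= comp
        parts.append(sorted(comp))
    return parts

def split_once(adj):
    """Remove highest-betweenness edges until the number of components grows."""
    adj = {n: set(nbrs) for n, nbrs in adj.items()}   # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start:
        scores = edge_betweenness(adj)
        if not scores:
            break
        a, b = max(scores, key=scores.get)
        adj[a].discard(b)
        adj[b].discard(a)
    return components(adj)

# Hypothetical graph: triangles {1,2,3} and {4,5,6} joined by the tie 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
print(edge_betweenness(adj)[frozenset((3, 4))])   # 9.0: every cross-triangle path uses 3-4
print(split_once(adj))
```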

Methods: How do we identify primary groups in a network?

Destroy: Remove lines/nodes until what is left over reveals something of interest

UCINET has the MCL (Markov clustering, based on flow betweenness in a random walk sense) algorithm programmed.

Methods: How do we identify primary groups in a network?

Destroy: Remove lines/nodes until what is left over reveals something of interest

“Evade” – look for something that correlates with your split

Newman’s Leading Eigenvector (in R – this is the “bottom” partition, not the best fit, which aggregates/joins from here)

The Recursive Neighborhood Means algorithm creates the variables that are then used in the cluster analysis to identify groups.

•Start by randomly assigning every node a value on k variables
•Then calculate the average for each variable for the people each person is tied to
•Repeat this process multiple times

This results in people who have many ties to each other having similar values on the k random variables. This similarity then gets picked up in a cluster analysis.
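The three steps can be sketched as follows. This is a bare-bones reading of the description above, not Moody's implementation: it averages over neighbours only, applies no rescaling between rounds, and the defaults (`k=2`, `rounds=3`, a fixed `seed`) are assumptions for illustration.

```python
import random

def recursive_neighborhood_means(adj, k=2, rounds=3, seed=0):
    """Return node -> list of k values after repeated neighbour averaging.

    `adj` maps each node to the set of nodes it is tied to.
    """
    rng = random.Random(seed)
    # Step 1: a random starting value on each of the k variables.
    values = {n: [rng.random() for _ in range(k)] for n in adj}
    for _ in range(rounds):
        # Steps 2 and 3: replace each value with the neighbours' mean, repeatedly.
        values = {
            n: ([sum(values[nb][d] for nb in adj[n]) / len(adj[n]) for d in range(k)]
                if adj[n] else values[n])       # isolates keep their value
            for n in adj
        }
    return values

def complete_graph(nodes):
    return {n: set(nodes) - {n} for n in nodes}

# Hypothetical network: two four-person groups joined by the tie A-E.
adj = {**complete_graph("ABCD"), **complete_graph("EFGH")}
adj["A"].add("E")
adj["E"].add("A")

for node, vals in sorted(recursive_neighborhood_means(adj).items()):
    print(node, [round(v, 3) for v in vals])
```

The output columns are the variables that would then be passed to a cluster analysis. Too many rounds on a connected network pull every node toward the same value, so the number of rounds is a tuning choice.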

“Evade” – look for something that correlates with your split

Example of the RNM procedure

Time 1 Time 2 Time 3

Example of the RNM procedure

As an example, consider the process running on a network known to be clustered, starting with k = 2 random variables.

You get something like this, where the nodes are now placed according to their resulting values on the 2 variables.

Page 68: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network
Page 69: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

The algorithm does a good job uncovering clusters in fake datasets.

Page 70: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

The algorithm does a good job uncovering clusters in fake datasets.

Page 71: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Compared to real data:

RNM Partition on the Prison data

Page 72: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Strategies for identifying primary groups: Evade

Factor Analysis: Treat the adjacency/similarity matrix as a set of N variables and look for latent factors that explain the variance in the data.

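[Figure: path diagram in which the concepts SES and IQ are each tied to a single indicator, Income and Math Score; values shown on the diagram: 1.0, 1.0, 0.0, 0.0]

We often use simple indicators and assume they measure our concepts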

Page 73: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Strategies for identifying primary groups: Evade

Factor Analysis: Treat the adjacency/similarity matrix as a set of N variables and look for latent factors that explain the variance in the data.

[Figure: path diagram with latent concepts SES and IQ and indicators Income, Occupation, Highest Degree, House Size, Reading Score, Languages Spoken, Math Score]

But we don’t have to! We can imagine that each latent concept causes our indicators, and build a measurement model.

Page 74: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Strategies for identifying primary groups: Evade

Factor Analysis: Treat the adjacency/similarity matrix as a set of N variables and look for latent factors that explain the variance in the data.

But we don’t have to! We can imagine that each latent concept causes our indicators, and build a measurement model.

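$\text{Income} = \lambda_1(\text{ses}) + \varepsilon_1$

$\text{Occupation} = \lambda_2(\text{ses}) + \varepsilon_2$

$\text{HouseSize} = \lambda_3(\text{ses}) + \varepsilon_3$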

Page 75: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Strategies for identifying primary groups: Evade

Factor Analysis: Treat the adjacency/similarity matrix as a set of N variables and look for latent factors that explain the variance in the data.

In a network, we assume that the tie pattern is an imperfect measure of an underlying latent structure that we can explain with similar factors. Instead of lots of “measurements” we have many columns in the adjacency (sim) matrix, and we can summarize that with factor scores.

-- works best if the similarity matrix has more information, so multiple account data are perfect.
-- or you can transform the data in some way to carry more information (like using a distance matrix).

Page 76: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Strategies for identifying primary groups: Evade

Factor Analysis: Treat the adjacency/similarity matrix as a set of N variables and look for latent factors that explain the variance in the data.

Here is code I used in the PROSPER data:

/* this section builds info on how to weight dyads for in-group, out-group. */
twostp=((adjmat+adjmat`)>0)*adjmat; /* make it either direction w. the first term */
ttie=adjmat#twostp;                 /* =1 if tie contributes to a transitive triple */
ttie=((ttie+ttie`));

adjraw=adjmat;
adjmat=(adjmat+adjmat`);            /* force it to be symmetric, 1=asym 2=reciped */
adjmat=adjmat-diag(adjmat);         /* remove any self ties */
d2=reachlim((adjmat>0),3);

/* re-weight to bias toward recip ties */
wm_4  = (d2=1)#(adjmat=2)#8;        /* recip direct ties */
wm_2a = (d2=1)#(adjmat=1)#4;        /* unrecip direct ties */
wm_1  = 2*(d2=2);                   /* ties 2-steps out */
wm_p5 = 0*(d2=3);                   /* ties 3-steps out - note it's zeroed out here */
wm=wm_4+wm_2a+wm_1+wm_p5+(3*(ttie/(max(ttie)))); /* transitivity is at the end */
wm=wm-diag(wm);

Page 77: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Strategies for identifying primary groups: Evade

Factor Analysis: Treat the adjacency/similarity matrix as a set of N variables and look for latent factors that explain the variance in the data.

Here is code I used in the PROSPER data:

/* run factor analysis. Note nfactors is a high value, should only take those
   w. EV > 2, but this gives us room... */
proc factor rotate=varimax min=&minev out=factset data=symmat nfactors=175
            outstat=fscores noprint;
run; quit;

Page 78: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Strategies for identifying primary groups: Evade

Result:

Page 79: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Strategies for identifying primary groups: Evade

Result:

Each column is a person; the values are the factor loadings for each person on each retained factor.

Page 80: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Strategies for identifying primary groups: Evade

Result:

Sociogram for a single school

Page 81: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Strategies for identifying primary groups: Evade

Result:

Sociogram for a single school.

Problem is that there are no necessary connectivity checks – you can get “groups” that are disconnected.

Biggest strengths are:
a) Really fast
b) Allows for overlapping groups
c) Gives you “embeddedness” scores based on factor loadings

Page 82: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Strategies for identifying primary groups: Hybrid

The Crowds Algorithm

1. Identify members of network bicomponents, remove people not included.

2. Cluster the reduced network.
   - Identify optimal number of groups: (TREEWALK)
   - For each level of the cluster partition tree do (BFS):
     - Move up the tree from smaller to larger groups.
     - If the fit for both groups is improved by joining them then do so.
     - If not, then identify group at that level.
   - End TREEWALK.

Do until all groups are identified (GLOBAL LOOP):

3. Evaluate node fit. Do until nodes cannot be moved:
   For each identified cluster do (GRPCHECK):
   - Ensure group is a bi-component.
   - Calculate effect on group a of moving node j to group a.
   - Calculate effect on j's present group of removing j.
   - If there is a positive net gain to moving j from own group to a, then do so.
   End.

4. Identify Bridging members.
   - If removing j from group a would improve the fit of group a, AND assigning j to any other group would lower the fit for that group, then j is considered a bridge. Place all bridges in separate class.

5. Group Check. Check returns to combining groups.
   - IF merging groups would improve the fit of all groups to be merged, then do so.
   - Evaluate bridges, to be sure that they are not bridging two groups that have now merged.

End Global loop.

Page 83: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Social Sub-groups

Frank & Yasumoto: Action and Structure

They expect to find evidence of enforceable trust within social subgroups and evidence of reciprocity between such groups.

To do so, they must identify primary subgroups within the network. They do so using a density based criterion. Frank’s algorithm iteratively assigns nodes to subgroups until a parameter that maximizes in-group density is reached. Basic model is:

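$\operatorname{logit}(Y_{ij}) = \theta_0 + \theta_1 \delta_{ij}$, where $\delta_{ij} = 1$ if nodes i and j are assigned to the same subgroup and 0 otherwise.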

Seek to find an assignment of nodes to groups (g) that maximizes fit. This results in a ‘block diagonal’ adjacency matrix, where most of the ties fall along the diagonal.
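To make the fit criterion concrete, here is a small sketch that scores a given assignment of nodes to groups. It assumes the model is written as logit(Y_ij) = theta0 + theta1 * samegroup_ij (this notation is an assumption of the example), treats ties as undirected, and uses the fact that with one binary predictor theta1 is simply the log odds ratio of a tie inside versus across groups. The edge list, the function name `frank_logit_fit`, and both partitions are hypothetical; the search over assignments that Frank's algorithm performs is not shown.

```python
import math
from itertools import combinations

def frank_logit_fit(edges, groups):
    """Return (theta0, theta1) for logit(Y_ij) = theta0 + theta1 * samegroup_ij."""
    tie = {frozenset(e) for e in edges}
    counts = {(s, t): 0 for s in (True, False) for t in (True, False)}
    for i, j in combinations(range(len(groups)), 2):
        same = groups[i] == groups[j]
        tied = frozenset((i, j)) in tie
        counts[(same, tied)] += 1
    if 0 in counts.values():
        raise ValueError("an empty cell makes the log odds undefined in this sketch")
    odds_in = counts[(True, True)] / counts[(True, False)]     # within groups
    odds_out = counts[(False, True)] / counts[(False, False)]  # between groups
    return math.log(odds_out), math.log(odds_in / odds_out)

# Hypothetical 8-node network: two dense clusters plus two bridging ties.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (1, 3),
         (4, 5), (5, 6), (6, 7), (4, 7), (4, 6),
         (3, 4), (0, 7)]
good = [0, 0, 0, 0, 1, 1, 1, 1]   # matches the two clusters
bad = [0, 1, 0, 1, 0, 1, 0, 1]    # cuts across them

theta0_good, theta1_good = frank_logit_fit(edges, good)
theta0_bad, theta1_bad = frank_logit_fit(edges, bad)
print(round(theta1_good, 3), round(theta1_bad, 3))
```

An assignment that concentrates ties inside groups gives a large positive theta1; one that cuts across the clusters gives a value near or below zero.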

Page 84: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Relations among the French Financial Elite (as drawn by F&Y)

Group-weighted MDS

Relations within group are weighted heavier than between to generate this picture:

Page 85: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Return to first question: What is a group?

•The simple notions of a complete clique are difficult to square w. real-world data.
•Density is an indicator, but subject to over-grouping (no connectivity) and star-patterns.
•Groups are likely internally differentiated – with “core” vs. “periphery” members

•Most sociological theories of groups rest on transitive closure and short distances
•There’s a sense that members are equal – a tight-knit group
•The group should be fairly small – face-to-face scale
•The social processes underlying the group turn on reciprocity, trust, communication, homogeneity of norms & beliefs.
•Almost all require a comparative set: in-group to out-group. It is relational not essential.
•Cross-cutting social circles – would lead us to expect overlapping groups, but in practice most methods do not do that, as it’s analytically too cumbersome.

Practically, group detection is hard and most methods will give you (slightly) different results. You can compare results using a Rand statistic (proportion of pairs similarly categorized in two partitions), but for small settings these differences can matter.
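The Rand statistic mentioned above is easy to compute directly from its definition. A minimal sketch follows; the function name `rand_index` and the example partitions are illustrative assumptions.

```python
from itertools import combinations

def rand_index(p, q):
    """Proportion of node pairs that two partitions categorize the same way.

    p and q are lists of group labels, one entry per node. A pair counts as
    agreeing if both partitions put it together or both put it apart.
    """
    pairs = list(combinations(range(len(p)), 2))
    agree = sum((p[i] == p[j]) == (q[i] == q[j]) for i, j in pairs)
    return agree / len(pairs)

print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # same grouping, different labels
print(rand_index([0, 0, 1, 1], [0, 1, 0, 1]))  # groupings that cut across each other
```

Because only co-membership is compared, the label values themselves do not matter, so partitions from different algorithms can be compared without matching group names.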

Page 86: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

[Figure panels: Fast & Greedy, Louvain, Edge Betweenness, Markov Chain, Leading Eigenvector, RNM (CROWDS)]

Page 87: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network
Page 88: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Roles & Positions

Overview
•Social life can be described (at least in part) through social roles.
•To the extent that roles can be characterized by regular interaction patterns, we can summarize roles through common relational patterns.
•Identifying these sets is the goal of block-model analyses.

Nadel: The Coherence of Role Systems
•Background ideas for White, Boorman and Breiger. Social life as interconnected system of roles
•Important feature: thinking of roles as connected in a role system = social structure

White, Boorman and Breiger: Social structure from Multiple Networks I. Blockmodels of Roles and Positions
•The key article describing the theoretical and technical elements of block-modeling

Page 89: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Nadel: The Coherence of Role Systems

Elements of a Role:

•Rights and obligations with respect to other people or classes of people

•Roles require a ‘role complement’: another person with respect to whom the role occupant acts

Examples: Parent - child, Teacher - student, Lover - lover, Friend - Friend, Husband - Wife, etc.

Nadel (following functional anthropologists and sociologists) defines ‘logical’ types of roles, and then examines how they can be linked together.

Page 90: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Nadel describes how various roles fit together to form a coherent whole. Roles are collected in people through the ‘summation of roles’.

Necessary: Some roles fit together necessarily. For example, the expected interaction patterns of “son-in-law” are implied through the joint roles of “Husband” and “Spouse-Parent”.

Coincidental: Some roles tend to go together empirically, but they need not (businessman & club member, for example).

Distinguishing the two is a matter of usefulness and judgement, but relates to social substitutability. The distinction reverts to how the system as a whole will be held together in the face of changes in role occupants.

Nadel: The Coherence of Role Systems

Page 91: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Nadel: The Coherence of Role Systems

Given that roles can be identified as ‘going together’ is there a logic that underlies their connection? Nadel uses a functional description based on ascription and achievement:

Page 92: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Nadel: The Coherence of Role Systems

And he gives an example of a simple role system:

Nadel’s task is to make sense of these roles, to identify how they are interconnected to form a system -- a coherent structure.

This is a difficult task to do analytically, as the eventual failure of Parsonian functionalism shows.

Page 93: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

White et al: From logical role systems to empirical social structures

With the fall of Parsons and functionalism in the late 60s, many of the ideas about social structure and system were also tossed. White et al demonstrate how we can understand social structure as the intercalation of roles, without the a priori logical categories.

Start with some basic ideas of what a role is: An exchange of something (support, ideas, commands, etc) between actors. Thus, we might represent a family as:

Page 94: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Start with some basic ideas of what a role is: An exchange of something (support, ideas, commands, etc) between actors. Thus, we might see an exchange network such as:

Provides food for

Romantic Love

Bickers with

White et al: From logical role systems to empirical social structures

Page 95: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Start with some basic ideas of what a role is: An exchange of something (support, ideas, commands, etc) between actors. Which is a summary of a (sort of) family.

[Figure: family network with nodes H, W, and three C, linked by three relations: Provides food for, Romantic Love, Bickers with]

(and there are, of course, many other relations inside the family)

White et al: From logical role systems to empirical social structures

Page 96: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

White et al: From logical role systems to empirical social structures

The key idea is that we can express a role through a relation (or set of relations) and thus a social system by the inventory of roles. If roles equate to positions in an exchange system, then we need only identify particular aspects of a position. But what aspect? Block modeling focuses on equivalence positions.

Structural Equivalence

Two actors are structurally equivalent if they have the same types of ties to the same people. That is, they have the exact same ties.

Page 97: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Structural Equivalence

A single relation

Page 98: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Structural Equivalence

Graph reduced to positions

Page 99: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Alternative notions of equivalence

Instead of exact same ties to exact same alters, you look for nodes with similar ties to similar types of alters

Page 100: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Blockmodeling: basic steps

In any positional analysis, there are 4 basic steps:

1) Identify a definition of equivalence2) Measure the degree to which pairs of actors are equivalent3) Develop a representation of the equivalencies4) Assess the adequacy of the representation

Page 101: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

1) Identify a definition of equivalence

Structural Equivalence: Two actors are equivalent if they have the same type of ties to the same people.
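The definition can be checked mechanically: two actors are structurally equivalent when their rows and columns of the adjacency matrix agree for every third party. The sketch below is an illustration under assumptions, not a standard library routine: the function names and the small toy network are hypothetical, and the entries between the two actors being compared are ignored.

```python
def structurally_equivalent(adj, i, j):
    """True if i and j have identical ties to and from every other node."""
    n = len(adj)
    return all(
        adj[i][k] == adj[j][k] and adj[k][i] == adj[k][j]
        for k in range(n) if k not in (i, j)
    )

def equivalence_classes(adj):
    """Group nodes into sets of mutually structurally equivalent actors."""
    classes = []
    for i in range(len(adj)):
        for cls in classes:
            if structurally_equivalent(adj, cls[0], i):
                cls.append(i)
                break
        else:
            classes.append([i])
    return classes

def make_adj(n, edges):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    adj = [[0] * n for _ in range(n)]
    for a, b in edges:
        adj[a][b] = adj[b][a] = 1
    return adj

# Hypothetical toy network: nodes 0 and 1 are each tied to nodes 2, 3 and 4.
bipartite = make_adj(5, [(0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4)])
print(equivalence_classes(bipartite))

# Adding one extra tie for node 0 breaks its equivalence with node 1.
with_extra = make_adj(6, [(0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4), (0, 5)])
print(equivalence_classes(with_extra))
```

The second network shows how strict the definition is: a single additional alter is enough to separate two otherwise identical actors.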

Page 102: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Automorphic Equivalence:

Actors occupy indistinguishable structural locations in the network. That is, they are in isomorphic positions in the network.

Two graphs are isomorphic if there is some mapping of nodes to positions that equates the two. For example, all 030T triads are isomorphic.

A graph has an automorphism if there are patterns internal to the graph that are equated, that is, if the mapping goes from the set of nodes in the graph to other nodes in the same graph. In general, automorphically equivalent nodes are equivalent with respect to all graph theoretic properties (i.e. degree, number of people reachable, centrality, etc.) and are structurally indistinguishable.

Key difference from structural equivalence is relaxing of the necessity of being linked to the same nodes.
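For very small graphs, automorphic equivalence can be found by brute force: try every relabeling of the nodes, keep those that leave the adjacency matrix unchanged, and group nodes that some such relabeling maps onto each other. This is only a sketch for illustration; the function name and the 5-node path example are assumptions, and the exhaustive search is impractical beyond a handful of nodes.

```python
from itertools import permutations

def automorphic_classes(adj):
    """Group nodes that can be swapped by some automorphism of the graph."""
    n = len(adj)
    nodes = range(n)
    label = list(nodes)  # label[i] = current class representative of node i
    for perm in permutations(nodes):
        # perm is an automorphism if it preserves every tie and non-tie.
        if all(adj[i][j] == adj[perm[i]][perm[j]] for i in nodes for j in nodes):
            for i in nodes:
                a, b = label[i], label[perm[i]]
                if a != b:  # merge the classes of i and its image
                    label = [min(a, b) if x in (a, b) else x for x in label]
    classes = {}
    for i in nodes:
        classes.setdefault(label[i], []).append(i)
    return sorted(classes.values())

# Hypothetical example: a path 0 - 1 - 2 - 3 - 4.
path = [[0] * 5 for _ in range(5)]
for a, b in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    path[a][b] = path[b][a] = 1

print(automorphic_classes(path))
```

In the path, the two end nodes are automorphically equivalent even though they are tied to different alters, so they are not structurally equivalent.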

1) Identify a definition of equivalence

Page 103: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Automorphic Equivalence:

Page 104: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Regular Equivalence: Regular equivalence does not require actors to have identical ties to identical actors or to be structurally indistinguishable.

Actors who are regularly equivalent have identical ties to and from equivalent actors.

If actors i and j are regularly equivalent, and actor i has a tie to/from some actor, k, then actor j must have the same kind of tie to/from some actor l, and actors k and l must be regularly equivalent.

So effectively this is a recursive definition, and not necessarily unique. There may be several ways to assign actors to clusters that satisfy this definition.

(This is related to graph colorings: regular equivalence partitions are colorings in which nodes of the same color have neighbors drawn from the same set of colors.)
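A candidate partition can be tested against this definition by comparing, for every pair of actors in the same class, the set of classes they send ties to and receive ties from. The sketch below is a checker only (it does not search for partitions); the function name, the small hierarchy network, and the candidate partitions are hypothetical.

```python
def is_regular_partition(adj, labels):
    """True if nodes with the same label have ties to/from the same set of labels."""
    n = len(adj)
    seen = {}
    for i in range(n):
        out_labels = frozenset(labels[j] for j in range(n) if adj[i][j])
        in_labels = frozenset(labels[j] for j in range(n) if adj[j][i])
        signature = (out_labels, in_labels)
        # The first node seen in a class fixes the signature for that class.
        if seen.setdefault(labels[i], signature) != signature:
            return False
    return True

# Hypothetical hierarchy: node 0 is tied to 1 and 2; 1 is tied to 3 and 4; 2 is tied to 5.
tree = [[0] * 6 for _ in range(6)]
for a, b in [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5)]:
    tree[a][b] = tree[b][a] = 1

by_level = ["top", "mid", "mid", "low", "low", "low"]
mixed = ["a", "a", "b", "c", "c", "c"]
everyone = ["x"] * 6

print(is_regular_partition(tree, by_level))
print(is_regular_partition(tree, mixed))
print(is_regular_partition(tree, everyone))
```

Nodes 1 and 2 have different numbers of subordinates, yet they are regularly equivalent under the by-level partition. The all-in-one partition also passes on this graph because it has no isolates, which illustrates that the definition does not pin down a unique partition.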

1) Identify a definition of equivalence

Page 105: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Regular Equivalence:

There may be multiple regular equivalence partitions in a network, and thus we tend to want to find the maximal regular equivalence partition, the one with the fewest positions.

Page 106: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Role or Local Equivalence: While most equivalence measures focus on position within the full network, some measures focus only on the patterns within the local tie neighborhood. These have been called ‘local role’ equivalence.

Note that:
Structurally equivalent actors are automorphically equivalent,
Automorphically equivalent actors are regularly equivalent.

Structurally equivalent and automorphically equivalent actors are role equivalent

In practice, we tend to ignore some of these fine distinctions, as they get blurred quickly once we have to operationalize them in real graphs. It turns out that few people are ever exactly equivalent, and thus we approximate the links between the types.

In all cases, the procedure can work over multiple relations simultaneously.

The process of identifying positions is called blockmodeling, and requires identifying a measure of similarity among nodes.

Page 107: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

0 1 1 1 0 0 0 0 0 0 0 0 0 0
1 0 0 0 1 1 0 0 0 0 0 0 0 0
1 0 0 1 0 0 1 1 1 1 0 0 0 0
1 0 1 0 0 0 1 1 1 1 0 0 0 0
0 1 0 0 0 1 0 0 0 0 1 1 1 1
0 1 0 0 1 0 0 0 0 0 1 1 1 1
0 0 1 1 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 0 0 0 0 0 0 0

Blockmodeling is the process of identifying these types of positions. A block is a section of the adjacency matrix - a “group” of people.

Here I have blocked structurally equivalent actors

Page 108: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

. 1 1 1 0 0 0 0 0 0 0 0 0 0
1 . 0 0 1 1 0 0 0 0 0 0 0 0
1 0 . 1 0 0 1 1 1 1 0 0 0 0
1 0 1 . 0 0 1 1 1 1 0 0 0 0
0 1 0 0 . 1 0 0 0 0 1 1 1 1
0 1 0 0 1 . 0 0 0 0 1 1 1 1
0 0 1 1 0 0 . 0 0 0 0 0 0 0
0 0 1 1 0 0 0 . 0 0 0 0 0 0
0 0 1 1 0 0 0 0 . 0 0 0 0 0
0 0 1 1 0 0 0 0 0 . 0 0 0 0
0 0 0 0 1 1 0 0 0 0 . 0 0 0
0 0 0 0 1 1 0 0 0 0 0 . 0 0
0 0 0 0 1 1 0 0 0 0 0 0 . 0
0 0 0 0 1 1 0 0 0 0 0 0 0 .

Blocks 1 to 6 mark rows/columns 1, 2, 3-4, 5-6, 7-10, and 11-14.

Once you block the matrix, reduce it, based on the number of ties in the cell of interest. The key values are a zero block (no ties) and a one-block (all ties present):

  1 2 3 4 5 6
1 0 1 1 0 0 0
2 1 0 0 1 0 0
3 1 0 1 0 1 0
4 0 1 0 1 0 1
5 0 0 1 0 0 0
6 0 0 0 1 0 0

Structural equivalence thus generates 6 positions in the network

Page 109: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Regular equivalence

. 1 1 1 0 0 0 0 0 0 0 0 0 0
1 . 0 0 1 1 0 0 0 0 0 0 0 0
1 0 . 1 0 0 1 1 1 1 0 0 0 0
1 0 1 . 0 0 1 1 1 1 0 0 0 0
0 1 0 0 . 1 0 0 0 0 1 1 1 1
0 1 0 0 1 . 0 0 0 0 1 1 1 1
0 0 1 1 0 0 . 0 0 0 0 0 0 0
0 0 1 1 0 0 0 . 0 0 0 0 0 0
0 0 1 1 0 0 0 0 . 0 0 0 0 0
0 0 1 1 0 0 0 0 0 . 0 0 0 0
0 0 0 0 1 1 0 0 0 0 . 0 0 0
0 0 0 0 1 1 0 0 0 0 0 . 0 0
0 0 0 0 1 1 0 0 0 0 0 0 . 0
0 0 0 0 1 1 0 0 0 0 0 0 0 .

Blocks 1 to 3 mark rows/columns 1-2, 3-6, and 7-14.

Once you partition the matrix, reduce it:

  1 2 3
1 1 1 0
2 1 1 1
3 0 1 0

(here I placed a one in the image matrix if there were any ties in the ij block)
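Both reductions (one-blocks for the structural equivalence positions, any-tie blocks for the regular equivalence positions) can be reproduced with a few lines of Python. The sketch below rebuilds the 14-node example from its tie list; the function names and the `rule` argument are assumptions made for this illustration.

```python
def example_adjacency():
    """The 14-node example network from the slides (nodes numbered 1 to 14)."""
    edges = [(1, 2), (1, 3), (1, 4), (2, 5), (2, 6), (3, 4), (5, 6)]
    edges += [(i, j) for i in (3, 4) for j in (7, 8, 9, 10)]
    edges += [(i, j) for i in (5, 6) for j in (11, 12, 13, 14)]
    adj = [[0] * 14 for _ in range(14)]
    for i, j in edges:
        adj[i - 1][j - 1] = adj[j - 1][i - 1] = 1
    return adj

def image_matrix(adj, positions, rule):
    """Reduce a blocked adjacency matrix to an image matrix.

    rule="one-block": 1 only if every off-diagonal cell in the block is a tie.
    rule="any":       1 if the block contains at least one tie.
    """
    image = []
    for rows in positions:
        image_row = []
        for cols in positions:
            cells = [adj[i - 1][j - 1] for i in rows for j in cols if i != j]
            if rule == "one-block":
                value = int(bool(cells) and all(cells))
            else:
                value = int(any(cells))
            image_row.append(value)
        image.append(image_row)
    return image

adj = example_adjacency()
se_positions = [[1], [2], [3, 4], [5, 6], [7, 8, 9, 10], [11, 12, 13, 14]]
re_positions = [[1, 2], [3, 4, 5, 6], [7, 8, 9, 10, 11, 12, 13, 14]]

se_image = image_matrix(adj, se_positions, rule="one-block")
re_image = image_matrix(adj, re_positions, rule="any")
for row in se_image:
    print(row)
for row in re_image:
    print(row)
```

With real data the blocks are rarely pure, so a density threshold between these two extremes is a common practical choice.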

Page 110: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

To get a block model, you have to measure the similarity between each pair. If two actors are structurally equivalent, then they will have exactly the same pattern of ties to other people. Consider the example again:

. 1 1 1 0 0 0 0 0 0 0 0 0 0
1 . 0 0 1 1 0 0 0 0 0 0 0 0
1 0 . 1 0 0 1 1 1 1 0 0 0 0
1 0 1 . 0 0 1 1 1 1 0 0 0 0
0 1 0 0 . 1 0 0 0 0 1 1 1 1
0 1 0 0 1 . 0 0 0 0 1 1 1 1
0 0 1 1 0 0 . 0 0 0 0 0 0 0
0 0 1 1 0 0 0 . 0 0 0 0 0 0
0 0 1 1 0 0 0 0 . 0 0 0 0 0
0 0 1 1 0 0 0 0 0 . 0 0 0 0
0 0 0 0 1 1 0 0 0 0 . 0 0 0
0 0 0 0 1 1 0 0 0 0 0 . 0 0
0 0 0 0 1 1 0 0 0 0 0 0 . 0
0 0 0 0 1 1 0 0 0 0 0 0 0 .

Blocks 1 to 6 mark rows/columns 1, 2, 3-4, 5-6, 7-10, and 11-14.

Node  C  D  Match
1     1  1  1
2     0  0  1
3     .  1  0
4     1  .  0
5     0  0  1
6     0  0  1
7     1  1  1
8     1  1  1
9     1  1  1
10    1  1  1
11    0  0  1
12    0  0  1
13    0  0  1
14    0  0  1
Sum: 12

C and D match on 12 other people
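The match count for C and D (nodes 3 and 4) can be reproduced by comparing their rows while skipping the two actors themselves. A minimal sketch, assuming a symmetric relation; the function names are hypothetical.

```python
def example_adjacency():
    """The 14-node example network from the slides (nodes numbered 1 to 14)."""
    edges = [(1, 2), (1, 3), (1, 4), (2, 5), (2, 6), (3, 4), (5, 6)]
    edges += [(i, j) for i in (3, 4) for j in (7, 8, 9, 10)]
    edges += [(i, j) for i in (5, 6) for j in (11, 12, 13, 14)]
    adj = [[0] * 14 for _ in range(14)]
    for i, j in edges:
        adj[i - 1][j - 1] = adj[j - 1][i - 1] = 1
    return adj

def matches(adj, i, j):
    """Number of other people on whom nodes i and j (1-based) have the same tie value."""
    n = len(adj)
    return sum(
        adj[i - 1][k] == adj[j - 1][k]
        for k in range(n) if k not in (i - 1, j - 1)
    )

adj = example_adjacency()
print(matches(adj, 3, 4))   # C and D
print(matches(adj, 1, 2))
print(matches(adj, 3, 5))
```

For an asymmetric relation the same comparison would also be made on the columns (ties received), which this sketch leaves out.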

Page 111: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

If the model is going to be based on asymmetric or multiple relations, you simply stack the various relations:

[Figure: family network with nodes H, W, and three C, linked by three relations: Provides food for, Romantic Love, Bickers with]

Romance
0 1 0 0 0
1 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0

Feeds
0 0 1 1 1
0 0 1 1 1
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0

Bicker
0 0 0 0 0
0 0 0 0 0
0 0 0 1 1
0 0 1 0 1
0 0 1 1 0

Stacked (Romance, Feeds, Feeds transposed, Bicker)
0 1 0 0 0
1 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 1 1 1
0 0 1 1 1
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
1 1 0 0 0
1 1 0 0 0
1 1 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 1 1
0 0 1 0 1
0 0 1 1 0
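Stacking can be sketched as concatenating the relation matrices row-wise, adding the transpose of any asymmetric relation so that both sent and received ties enter each actor's profile. The function names are assumptions for this example, and Bicker is entered as a symmetric relation, consistent with the stacked matrix.

```python
def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]

def stack_relations(relations):
    """Stack relation matrices; asymmetric ones are followed by their transpose."""
    stacked = []
    for matrix in relations:
        stacked.extend(row[:] for row in matrix)
        flipped = transpose(matrix)
        if flipped != matrix:  # asymmetric relation: add ties received as well
            stacked.extend(flipped)
    return stacked

# Actor order: H, W, C, C, C
romance = [[0, 1, 0, 0, 0], [1, 0, 0, 0, 0], [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
feeds = [[0, 0, 1, 1, 1], [0, 0, 1, 1, 1], [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
bicker = [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 1, 1],
          [0, 0, 1, 0, 1], [0, 0, 1, 1, 0]]

stacked = stack_relations([romance, feeds, bicker])
for row in stacked:
    print(row)
```

Each column of the stacked matrix is then one actor's full profile across all relations, and those columns are what get compared when measuring equivalence.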

Page 112: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

For the entire matrix, we get:

 0  8  7  7  5  5 11 11 11 11  7  7  7  7
 8  0  5  5  7  7  7  7  7  7 11 11 11 11
 7  5  0 12  0  0  8  8  8  8  4  4  4  4
 7  5 12  0  0  0  8  8  8  8  4  4  4  4
 5  7  0  0  0 12  4  4  4  4  8  8  8  8
 5  7  0  0 12  0  4  4  4  4  8  8  8  8
11  7  8  8  4  4  0 12 12 12  8  8  8  8
11  7  8  8  4  4 12  0 12 12  8  8  8  8
11  7  8  8  4  4 12 12  0 12  8  8  8  8
11  7  8  8  4  4 12 12 12  0  8  8  8  8
 7 11  4  4  8  8  8  8  8  8  0 12 12 12
 7 11  4  4  8  8  8  8  8  8 12  0 12 12
 7 11  4  4  8  8  8  8  8  8 12 12  0 12
 7 11  4  4  8  8  8  8  8  8 12 12 12  0

(number of agreements for each ij pair)

Page 113: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

The metric used to measure structural equivalence by White, Boorman and Breiger is the correlation between each node’s set of ties. For the example, this would be:

 1.00 -0.20  0.08  0.08 -0.19 -0.19  0.77  0.77  0.77  0.77 -0.26 -0.26 -0.26 -0.26
-0.20  1.00 -0.19 -0.19  0.08  0.08 -0.26 -0.26 -0.26 -0.26  0.77  0.77  0.77  0.77
 0.08 -0.19  1.00  1.00 -1.00 -1.00  0.36  0.36  0.36  0.36 -0.45 -0.45 -0.45 -0.45
 0.08 -0.19  1.00  1.00 -1.00 -1.00  0.36  0.36  0.36  0.36 -0.45 -0.45 -0.45 -0.45
-0.19  0.08 -1.00 -1.00  1.00  1.00 -0.45 -0.45 -0.45 -0.45  0.36  0.36  0.36  0.36
-0.19  0.08 -1.00 -1.00  1.00  1.00 -0.45 -0.45 -0.45 -0.45  0.36  0.36  0.36  0.36
 0.77 -0.26  0.36  0.36 -0.45 -0.45  1.00  1.00  1.00  1.00 -0.20 -0.20 -0.20 -0.20
 0.77 -0.26  0.36  0.36 -0.45 -0.45  1.00  1.00  1.00  1.00 -0.20 -0.20 -0.20 -0.20
 0.77 -0.26  0.36  0.36 -0.45 -0.45  1.00  1.00  1.00  1.00 -0.20 -0.20 -0.20 -0.20
 0.77 -0.26  0.36  0.36 -0.45 -0.45  1.00  1.00  1.00  1.00 -0.20 -0.20 -0.20 -0.20
-0.26  0.77 -0.45 -0.45  0.36  0.36 -0.20 -0.20 -0.20 -0.20  1.00  1.00  1.00  1.00
-0.26  0.77 -0.45 -0.45  0.36  0.36 -0.20 -0.20 -0.20 -0.20  1.00  1.00  1.00  1.00
-0.26  0.77 -0.45 -0.45  0.36  0.36 -0.20 -0.20 -0.20 -0.20  1.00  1.00  1.00  1.00
-0.26  0.77 -0.45 -0.45  0.36  0.36 -0.20 -0.20 -0.20 -0.20  1.00  1.00  1.00  1.00

Another common metric is the Euclidean distance between pairs of actors, which you then use in a standard cluster analysis.
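Both metrics can be sketched in a few lines. In this illustration the comparison for a pair (i, j) runs over the other 12 nodes only, skipping i and j themselves; that convention is an assumption of the sketch, chosen because it gives the values shown for the example network. The function names are hypothetical.

```python
import math

def example_adjacency():
    """The 14-node example network from the slides (nodes numbered 1 to 14)."""
    edges = [(1, 2), (1, 3), (1, 4), (2, 5), (2, 6), (3, 4), (5, 6)]
    edges += [(i, j) for i in (3, 4) for j in (7, 8, 9, 10)]
    edges += [(i, j) for i in (5, 6) for j in (11, 12, 13, 14)]
    adj = [[0] * 14 for _ in range(14)]
    for i, j in edges:
        adj[i - 1][j - 1] = adj[j - 1][i - 1] = 1
    return adj

def tie_profiles(adj, i, j):
    """Ties of nodes i and j (1-based) to everyone except each other."""
    others = [k for k in range(len(adj)) if k not in (i - 1, j - 1)]
    return [adj[i - 1][k] for k in others], [adj[j - 1][k] for k in others]

def tie_correlation(adj, i, j):
    """Pearson correlation between the tie profiles of i and j."""
    x, y = tie_profiles(adj, i, j)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile has no defined correlation
    return sxy / math.sqrt(sxx * syy)

def tie_distance(adj, i, j):
    """Euclidean distance between the tie profiles of i and j."""
    x, y = tie_profiles(adj, i, j)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

adj = example_adjacency()
print(round(tie_correlation(adj, 1, 2), 2), round(tie_correlation(adj, 1, 7), 2))
print(tie_distance(adj, 3, 4), tie_distance(adj, 1, 2))
```

Structurally equivalent actors have correlation 1 and distance 0; the distance matrix can be handed to any standard hierarchical clustering routine.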

Page 114: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

The initial method for finding structurally equivalent positions was CONCOR, the CONvergence of iterated CORrelations.

Concor iteration 1:

1.00 -.77 0.55 0.55 -.57 -.57 0.95 0.95 0.95 0.95 -.75 -.75 -.75 -.75
-.77 1.00 -.57 -.57 0.55 0.55 -.75 -.75 -.75 -.75 0.95 0.95 0.95 0.95
0.55 -.57 1.00 1.00 -1.0 -1.0 0.73 0.73 0.73 0.73 -.75 -.75 -.75 -.75
0.55 -.57 1.00 1.00 -1.0 -1.0 0.73 0.73 0.73 0.73 -.75 -.75 -.75 -.75
-.57 0.55 -1.0 -1.0 1.00 1.00 -.75 -.75 -.75 -.75 0.73 0.73 0.73 0.73
-.57 0.55 -1.0 -1.0 1.00 1.00 -.75 -.75 -.75 -.75 0.73 0.73 0.73 0.73
0.95 -.75 0.73 0.73 -.75 -.75 1.00 1.00 1.00 1.00 -.77 -.77 -.77 -.77
0.95 -.75 0.73 0.73 -.75 -.75 1.00 1.00 1.00 1.00 -.77 -.77 -.77 -.77
0.95 -.75 0.73 0.73 -.75 -.75 1.00 1.00 1.00 1.00 -.77 -.77 -.77 -.77
0.95 -.75 0.73 0.73 -.75 -.75 1.00 1.00 1.00 1.00 -.77 -.77 -.77 -.77
-.75 0.95 -.75 -.75 0.73 0.73 -.77 -.77 -.77 -.77 1.00 1.00 1.00 1.00
-.75 0.95 -.75 -.75 0.73 0.73 -.77 -.77 -.77 -.77 1.00 1.00 1.00 1.00
-.75 0.95 -.75 -.75 0.73 0.73 -.77 -.77 -.77 -.77 1.00 1.00 1.00 1.00
-.75 0.95 -.75 -.75 0.73 0.73 -.77 -.77 -.77 -.77 1.00 1.00 1.00 1.00

Page 115: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Concor iteration 2:

1.00 -.99 0.94 0.94 -.94 -.94 0.99 0.99 0.99 0.99 -.99 -.99 -.99 -.99
-.99 1.00 -.94 -.94 0.94 0.94 -.99 -.99 -.99 -.99 0.99 0.99 0.99 0.99
0.94 -.94 1.00 1.00 -1.0 -1.0 0.97 0.97 0.97 0.97 -.97 -.97 -.97 -.97
0.94 -.94 1.00 1.00 -1.0 -1.0 0.97 0.97 0.97 0.97 -.97 -.97 -.97 -.97
-.94 0.94 -1.0 -1.0 1.00 1.00 -.97 -.97 -.97 -.97 0.97 0.97 0.97 0.97
-.94 0.94 -1.0 -1.0 1.00 1.00 -.97 -.97 -.97 -.97 0.97 0.97 0.97 0.97
0.99 -.99 0.97 0.97 -.97 -.97 1.00 1.00 1.00 1.00 -.99 -.99 -.99 -.99
0.99 -.99 0.97 0.97 -.97 -.97 1.00 1.00 1.00 1.00 -.99 -.99 -.99 -.99
0.99 -.99 0.97 0.97 -.97 -.97 1.00 1.00 1.00 1.00 -.99 -.99 -.99 -.99
0.99 -.99 0.97 0.97 -.97 -.97 1.00 1.00 1.00 1.00 -.99 -.99 -.99 -.99
-.99 0.99 -.97 -.97 0.97 0.97 -.99 -.99 -.99 -.99 1.00 1.00 1.00 1.00
-.99 0.99 -.97 -.97 0.97 0.97 -.99 -.99 -.99 -.99 1.00 1.00 1.00 1.00
-.99 0.99 -.97 -.97 0.97 0.97 -.99 -.99 -.99 -.99 1.00 1.00 1.00 1.00
-.99 0.99 -.97 -.97 0.97 0.97 -.99 -.99 -.99 -.99 1.00 1.00 1.00 1.00

The initial method for finding structurally equivalent positions was CONCOR, the CONvergence of iterated CORrelations.

Page 116: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

1.00 -1.0 1.00 1.00 -1.0 -1.0 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.0-1.0 1.00 -1.0 -1.0 1.00 1.00 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.001.00 -1.0 1.00 1.00 -1.0 -1.0 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.01.00 -1.0 1.00 1.00 -1.0 -1.0 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.0-1.0 1.00 -1.0 -1.0 1.00 1.00 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00-1.0 1.00 -1.0 -1.0 1.00 1.00 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.001.00 -1.0 1.00 1.00 -1.0 -1.0 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.01.00 -1.0 1.00 1.00 -1.0 -1.0 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.01.00 -1.0 1.00 1.00 -1.0 -1.0 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.01.00 -1.0 1.00 1.00 -1.0 -1.0 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.0-1.0 1.00 -1.0 -1.0 1.00 1.00 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00-1.0 1.00 -1.0 -1.0 1.00 1.00 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00-1.0 1.00 -1.0 -1.0 1.00 1.00 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00-1.0 1.00 -1.0 -1.0 1.00 1.00 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00

Concor iteration 3:

The initial method for finding structurally equivalent positions was CONCOR, the CONvergence of iterated CORrelations.

Page 117: Communities & Roles Two types ways of identifying nodes that “go together” a)Communities/Groups a)Cohesive subgroups literature: start w. Freeman b)Network

Concor iteration 3 (rows and columns reordered by block; first column is the node number):

 1  1.00 1.00 1.00 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0
 3  1.00 1.00 1.00 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0
 4  1.00 1.00 1.00 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0
 7  1.00 1.00 1.00 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0
 8  1.00 1.00 1.00 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0
 9  1.00 1.00 1.00 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0
10  1.00 1.00 1.00 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0
 2  -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00 1.00 1.00 1.00
 5  -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00 1.00 1.00 1.00
 6  -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00 1.00 1.00 1.00
11  -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00 1.00 1.00 1.00
12  -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00 1.00 1.00 1.00
13  -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00 1.00 1.00 1.00
14  -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00 1.00 1.00 1.00

The initial method for finding structurally equivalent positions was CONCOR, the CONvergence of iterated CORrelations.

Repeat the process on the resulting 1-blocks until you have reached structurally equivalent blocks.

Because CONCOR splits every sub-group into two groups, you get a partition tree that looks something like this:
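A minimal sketch of one CONCOR split, written here only as an illustration and not taken from any particular package. It assumes a binary adjacency matrix stored as a list of lists, uses each node's row plus column as its profile, and treats a constant profile as having correlation 0.0 (that last choice is an assumption made so the sketch never divides by zero). The names `pearson`, `correlate_rows`, `concor_split`, and `two_cliques` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def correlate_rows(rows):
    """Correlate every row of a matrix with every other row."""
    return [[pearson(a, b) for b in rows] for a in rows]


def concor_split(adj, max_iter=50, tol=1e-9):
    """Iterate correlations of correlations, then split on the sign pattern."""
    n = len(adj)
    # Profile of node i: ties sent (row i) followed by ties received (column i).
    profiles = [list(adj[i]) + [adj[j][i] for j in range(n)] for i in range(n)]
    corr = correlate_rows(profiles)
    for _ in range(max_iter):
        nxt = correlate_rows(corr)
        change = max(abs(nxt[i][j] - corr[i][j])
                     for i in range(n) for j in range(n))
        corr = nxt
        if change < tol:
            break
    # Nodes positively correlated with node 0 form one block, the rest the other.
    block_a = [i for i in range(n) if corr[0][i] > 0]
    block_b = [i for i in range(n) if corr[0][i] <= 0]
    return block_a, block_b


# Toy network: two disconnected 3-cliques.
two_cliques = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
first_block, second_block = concor_split(two_cliques)
print(first_block, second_block)
```

To build the partition tree, the same split would be applied again to the sub-matrix of each resulting block.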


CONCOR example:

Consider a simple senate voting network:

The network is dense (every cell has some score) and dynamic (the pattern changes over time).

Color by structural equivalence…


Adjust position to collapse SE positions.


And then adjust color, line width, etc. for clarity.

While we’ve gone some distance with identifying relevant information from the mass, how do we account for time?


Repeat at each wave, linking positions over time


Automorphic and regular equivalence are more difficult to find, and require iteratively searching over possible class assignments for sets that have the same graph-theoretic patterns. One usually starts with a set of nodes defined as similar on a number of network measures, then looks within these classes for automorphic equivalence classes.

The classic reference is REGE (White & Reitz 1985), which recursively defines the degree of equivalence between pairs and then adjusts for as many iterations as you specify.

A theoretically appealing method for finding role equivalence, a structure very similar to regular equivalence, uses the triad census. Each node is involved in (n-1)(n-2)/2 triads, and occupies a particular position in each of these triads. These positions are summarized in the following figure:

Network Sub-Structure: Triads

(Figure: the 16 directed triad types, grouped by number of arcs; the legend marks triads as Intransitive, Transitive, or Mixed.)

(0) 003
(1) 012
(2) 102, 021D, 021U, 021C
(3) 111D, 111U, 030T, 030C
(4) 201, 120D, 120U, 120C
(5) 210
(6) 300

An Example of the triad census

Type          Number of triads
---------------------------------------
 1 - 003          21
---------------------------------------
 2 - 012          26
 3 - 102          11
 4 - 021D          1
 5 - 021U          5
 6 - 021C          3
 7 - 111D          2
 8 - 111U          5
 9 - 030T          3
10 - 030C          1
11 - 201           1
12 - 120D          1
13 - 120U          1
14 - 120C          1
15 - 210           1
16 - 300           1
---------------------------------------
Sum (2 - 16):     63
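As a rough illustration of how such a census is tallied, the sketch below counts triads by their M-A-N label only (number of mutual, asymmetric, and null dyads). It is deliberately coarser than the 16-type census above, because it ignores the D/U/C/T orientation suffixes. The function names and the toy network are hypothetical, and a binary adjacency matrix is assumed. As a consistency check on the table, 21 + 63 = 84 triads is exactly the number of triples among 9 nodes.

```python
from collections import Counter
from itertools import combinations
from math import comb


def dyad_type(adj, i, j):
    """Return 'M' (mutual), 'A' (asymmetric) or 'N' (null) for the pair i, j."""
    ties = adj[i][j] + adj[j][i]  # assumes 0/1 entries
    return "NAM"[ties]


def man_census(adj):
    """Count triads by M-A-N label, e.g. '030' = three asymmetric dyads."""
    census = Counter()
    for i, j, k in combinations(range(len(adj)), 3):
        kinds = [dyad_type(adj, i, j), dyad_type(adj, i, k), dyad_type(adj, j, k)]
        label = f"{kinds.count('M')}{kinds.count('A')}{kinds.count('N')}"
        census[label] += 1
    return census


def triads_per_node(n):
    """Each node sits in (n-1)(n-2)/2 triads."""
    return (n - 1) * (n - 2) // 2


# Toy network: a directed 3-cycle 0->1->2->0 plus a mutual tie between 0 and 3.
toy = [
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
toy_census = man_census(toy)
print(dict(toy_census))
print(triads_per_node(9), comb(9, 3))
```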

Triadic Position Census: 36 Positions within 16 Directed Triads

(Figure: the 16 directed triads; a marked node in each indicates the position.)

003: 003
012: 012_S, 012_E, 012_I
102: 102_D, 102_I
021D: 021D_S, 021D_E
021U: 021U_S, 021U_E
021C: 021C_S, 021C_B, 021C_E
111D: 111D_S, 111D_B, 111D_E
111U: 111U_S, 111U_B, 111U_E
030T: 030T_S, 030T_B, 030T_E
030C: 030C
201: 201_S, 201_B
120D: 120D_S, 120D_E
120U: 120U_S, 120U_E
120C: 120C_S, 120C_B, 120C_E
210: 210_S, 210_B, 210_E
300: 300

Triadic Position Census: 40 positions when all ties are mutual but there are two types of relations

Triad position vectors for the example network, resulting in 3 positions (columns are nodes 1 to 14; only the six non-zero positions are shown, and the other 30 positions are 0 for every node):

Position    1  2  3  4  5  6  7  8  9 10 11 12 13 14
003        36 36 10 10 10 10 43 43 43 43 43 43 43 43
102_D      20 20 41 41 41 41 14 14 14 14 14 14 14 14
102_I       9  9 11 11 11 11 12 12 12 12 12 12 12 12
201_S      10 10  1  1  1  1  8  8  8  8  8  8  8  8
201_B       2  2 10 10 10 10  0  0  0  0  0  0  0  0
300         1  1  5  5  5  5  1  1  1  1  1  1  1  1

Correlating each person’s triad position vector with every other person’s results in the following table, which clearly shows the positions that are equivalent:

1.00 1.00 0.64 0.64 0.64 0.64 0.98 0.98 0.98 0.98 0.98 0.98 0.98 0.98
1.00 1.00 0.64 0.64 0.64 0.64 0.98 0.98 0.98 0.98 0.98 0.98 0.98 0.98
0.64 0.64 1.00 1.00 1.00 1.00 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50
0.64 0.64 1.00 1.00 1.00 1.00 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50
0.64 0.64 1.00 1.00 1.00 1.00 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50
0.64 0.64 1.00 1.00 1.00 1.00 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50
0.98 0.98 0.50 0.50 0.50 0.50 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
0.98 0.98 0.50 0.50 0.50 0.50 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
0.98 0.98 0.50 0.50 0.50 0.50 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
0.98 0.98 0.50 0.50 0.50 0.50 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
0.98 0.98 0.50 0.50 0.50 0.50 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
0.98 0.98 0.50 0.50 0.50 0.50 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
0.98 0.98 0.50 0.50 0.50 0.50 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
0.98 0.98 0.50 0.50 0.50 0.50 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
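The correlations in this table can be reproduced from the triad position vectors with a few lines of Python. The sketch below is an illustration only: it takes the six non-zero counts for nodes 1, 3, and 7 from the position vectors shown earlier (one node per position), pads each vector with zeros to 36 positions, and computes ordinary Pearson correlations. The names `full_vector`, `pearson`, and `nonzero_counts` are hypothetical.

```python
from math import sqrt

N_POSITIONS = 36  # size of the triadic position census

# Non-zero counts per node, read from the example's position vectors.
# The order within each list does not affect the correlation.
nonzero_counts = {
    1: [36, 20, 9, 10, 2, 1],
    3: [10, 41, 11, 1, 10, 5],
    7: [43, 14, 12, 8, 0, 1],
}


def full_vector(counts):
    """Pad the non-zero counts with zeros for the remaining positions."""
    return counts + [0] * (N_POSITIONS - len(counts))


def pearson(x, y):
    """Plain Pearson correlation of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


vectors = {node: full_vector(c) for node, c in nonzero_counts.items()}
r_1_3 = pearson(vectors[1], vectors[3])
r_1_7 = pearson(vectors[1], vectors[7])
r_3_7 = pearson(vectors[3], vectors[7])
print(round(r_1_3, 2), round(r_1_7, 2), round(r_3_7, 2))
```

Each vector sums to 78, which is (n-1)(n-2)/2 for n = 14, so every node's triads are accounted for.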


Jefferson High School Sunshine High School

School provides a good boundary for social relations

School does not provide a good boundary for social relations

Complete Network Analysis. Network Connections: Role Positions

Image networks. Width of tie is proportional to the ratio of cell density to mean cell density.

(Figure: image networks for Jefferson High School and Sunshine High School; percentage labels in the figure: 34%, 32%, 33%, 4%, 43%, 52%.)

Once you have decided on a number of blocks, you need to determine what counts as a ‘one’ block or a ‘zero’ block. Usually this is some function of the density of the resulting block.

General rules:
“Fat Fit”: Only put a one in blocks with all ones in the adjacency matrix.
“Lean Fit”: Put a zero if all the cells are zero, else put a one.
“Density fit”: Put a one if the average value of the cells in the block is above a certain cutoff.
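The three rules can be sketched as a single function that turns a blocked adjacency matrix into an image matrix. This is an illustrative sketch under stated assumptions: binary ties, diagonal cells excluded from every block, and a default density cutoff of 0.5 chosen only for the example. The names `block_density`, `image_matrix`, and `example_adj` are hypothetical.

```python
def block_density(adj, rows, cols):
    """Mean of the cells in the block rows x cols, ignoring diagonal cells."""
    cells = [adj[i][j] for i in rows for j in cols if i != j]
    return sum(cells) / len(cells) if cells else 0.0


def image_matrix(adj, blocks, rule="density", cutoff=0.5):
    """Reduce an adjacency matrix to a 0/1 image over the given blocks."""
    image = []
    for rows in blocks:
        image_row = []
        for cols in blocks:
            density = block_density(adj, rows, cols)
            if rule == "fat":        # one only if every cell is one
                is_one = density == 1.0
            elif rule == "lean":     # zero only if every cell is zero
                is_one = density > 0.0
            elif rule == "density":  # one if mean cell value exceeds cutoff
                is_one = density > cutoff
            else:
                raise ValueError(f"unknown rule: {rule}")
            image_row.append(1 if is_one else 0)
        image.append(image_row)
    return image


# Toy network with two positions: {0, 1} and {2, 3}.
example_adj = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
example_blocks = [[0, 1], [2, 3]]
fat = image_matrix(example_adj, example_blocks, rule="fat")
lean = image_matrix(example_adj, example_blocks, rule="lean")
dens = image_matrix(example_adj, example_blocks, rule="density", cutoff=0.5)
print(fat, lean, dens)
```

In the toy network the single tie from node 0 to node 2 is enough to make the off-diagonal block a one-block under the lean fit, but not under the fat or density fits.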

White, Boorman and Breiger used a ‘lean fit’ (zeroblock) rule for the examples in their paper:

An example: White et al., figure 1. Biomedical Specialty data:

White et al., figure 3. Biomedical Specialty data: Key to structure lies in zero blocks


Recent models

Recent work has generalized blockmodels in two directions:

Specific structural hypotheses. Example: Core-periphery models or Structural Hole ideas.

Generalized blockmodeling based on particular relationship types & patterns. Pat Doreian’s recent work with the PAJEK folks.

Connectivity sets. Identifying sets of nodes with some common pattern of connectivity. This is a merge/mingle of community detection & positions. Moody & White would be an example.

Recent models: Core-Periphery

To identify a core-periphery structure, we compare an observed block structure to an ideal block structure.

An ideal core-periphery network:

(Figure: ideal core-periphery image matrix; the core-core and core-periphery blocks are filled with 1s and the periphery-periphery block is blank.)

Borgatti, S. P. and Everett, M. G. (1999). Models of core/periphery structures. Social Networks 21: 375-395.


(observed blocked network) (ideal CP blocked network)

A core-periphery structure exists to the extent that the correlation between the ideal structure and the observed structure is high. We can search for cores by simply proposing a partition (many times) and then selecting the best-fitting partition. But that’s silly-slow!
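A brute-force version of this search can be sketched as follows. The sketch assumes the ideal image has a 1 wherever at least one endpoint is in the core and a 0 between two periphery nodes, ignores the diagonal, and scores a zero-variance comparison as 0.0. It tries every possible core, which is exactly why the approach is slow: the number of candidate partitions grows as 2^n. The names `cp_fit`, `best_core`, and `star` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def cp_fit(adj, core):
    """Correlation between observed ties and the ideal core-periphery image."""
    core = set(core)
    observed, ideal = [], []
    for i in range(len(adj)):
        for j in range(len(adj)):
            if i == j:
                continue  # diagonal cells are ignored
            observed.append(adj[i][j])
            # Assumed ideal image: 1 unless both nodes are in the periphery.
            ideal.append(1 if (i in core or j in core) else 0)
    return pearson(observed, ideal)


def best_core(adj):
    """Exhaustively propose every core and keep the best-fitting one."""
    n = len(adj)
    best_fit, best_set = -2.0, ()
    for size in range(1, n):
        for core in combinations(range(n), size):
            fit = cp_fit(adj, core)
            if fit > best_fit:
                best_fit, best_set = fit, core
    return best_fit, best_set


# Toy network: a 5-node star with node 0 at the center.
star = [
    [0, 1, 1, 1, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
star_fit, star_core = best_core(star)
print(star_fit, star_core)
```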


A continuous version of “coreness” can be had by generalizing the ideal image seen above. Instead of just 0/1, pairs of “high core” nodes have a very strong tie connecting them, and core-periphery nodes have a very low score.

Coreness can thus be defined as a type of centrality, but one that assumes a particular underlying structure to the network. Nodes with high coreness are more likely to be at the center of a core-periphery structure.

As it turns out, coreness is essentially eigenvector centrality, and UCINET sorts nodes by eigenvector centrality and builds the “core” until the correlation between ideal/observed drops.
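The idea of sorting by eigenvector centrality and growing the core until the fit drops can be sketched as below. This is not UCINET's code; it is a hedged illustration that assumes an undirected binary network, computes centrality by power iteration on A + I (the identity shift is added here only to avoid oscillation on bipartite graphs), and reuses the same assumed ideal image as a plain 0/1 pattern. All function names and the toy network are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def eigenvector_centrality(adj, iterations=200):
    """Power iteration on A + I, normalized to unit length each step."""
    n = len(adj)
    scores = [1.0] * n
    for _ in range(iterations):
        new = [scores[i] + sum(adj[i][j] * scores[j] for j in range(n))
               for i in range(n)]
        norm = sqrt(sum(v * v for v in new))
        scores = [v / norm for v in new]
    return scores


def cp_fit(adj, core):
    """Correlation of observed ties with the assumed ideal 0/1 image."""
    core = set(core)
    pairs = [(i, j) for i in range(len(adj)) for j in range(len(adj)) if i != j]
    observed = [adj[i][j] for i, j in pairs]
    ideal = [1 if (i in core or j in core) else 0 for i, j in pairs]
    return pearson(observed, ideal)


def greedy_core(adj):
    """Add nodes in centrality order until the fit stops improving."""
    scores = eigenvector_centrality(adj)
    order = sorted(range(len(adj)), key=lambda i: -scores[i])
    core = [order[0]]
    best = cp_fit(adj, core)
    for node in order[1:]:
        fit = cp_fit(adj, core + [node])
        if fit <= best:
            break
        core.append(node)
        best = fit
    return sorted(core), best


# Toy network: triangle 0-1-2 as core; nodes 3, 4, 5 each tied to two core nodes.
edges = [(0, 1), (0, 2), (1, 2), (3, 0), (3, 1), (4, 1), (4, 2), (5, 0), (5, 2)]
toy = [[0] * 6 for _ in range(6)]
for u, v in edges:
    toy[u][v] = toy[v][u] = 1
toy_core, toy_fit = greedy_core(toy)
print(toy_core, round(toy_fit, 3))
```

The greedy pass evaluates at most n candidate cores instead of every partition, which is the practical appeal of the centrality shortcut.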


The recent work on generalization focuses on the patterns that determine a block.

Instead of focusing on just the density of a block, you can identify a block as any set that has a particular pattern of ties to any other set.

This work starts from the observation that types of equivalence limit the observed types of blocks. So, for example, regularly equivalent blocks must be either empty, complete, or 1-covered. The “direct” approach is thus to search for these sorts of coverings.

Recent models: Generalized Block Models

From Carrington, Scott & Wasserman. Models & Methods in Social Network Analysis

Compound Relations.

“A friend of a friend is a friend”: F x F = F (signed triad: + + +)
“The enemy of an enemy is a friend”: E x E = F (signed triad: - - +)

We can generalize the balance rule to multitudes of “compound relations”

Use matrices for primary relations and matrix multiplication for compounds

One of the most powerful tools in role analysis involves looking at role systems through compound relations.

A compound relation is formed by combining relations in single dimensions. The best example of compound relations comes from kinship.

Sibling (S)      Child of (C)      Nephew/Niece (SC)
0 1 0 0 0        0 0 1 1 0         0 0 0 0 1
1 0 0 0 0        0 0 0 0 1         0 0 1 1 0
0 0 0 1 0    x   0 0 0 0 0    =    0 0 0 0 0
0 0 1 0 0        0 0 0 0 0         0 0 0 0 0
0 0 0 0 0        0 0 0 0 0         0 0 0 0 0

S x C = SC
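The product above can be checked with a short Boolean matrix multiplication. The sketch below is an illustration: `boolean_product` is a hypothetical helper, and the reading of the "Child of" matrix (entry [i][j] = 1 meaning that j is a child of i) is an assumption chosen because it makes the product match the Nephew/Niece label.

```python
def boolean_product(first, second):
    """Boolean matrix product: cell (i, j) is 1 if some k links i to j."""
    n = len(first)
    return [
        [1 if any(first[i][k] and second[k][j] for k in range(n)) else 0
         for j in range(n)]
        for i in range(n)
    ]


sibling = [
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
# Assumed reading: child_of[i][j] = 1 means j is a child of i.
child_of = [
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

nephew_niece = boolean_product(sibling, child_of)
for row in nephew_niece:
    print(row)
```

Under that reading, the first person's sibling (the second person) has the fifth person as a child, so the fifth person is the first person's nephew or niece.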


An example of compound relations can be found in W&F. This role table catalogues the compounds for two relations “Is boss of” and “Is on the same level as”

Consider a system with two sorts of relations. Here, one is hierarchical and the other defines “within class”.

We can build a role table with Boolean multiplication of the relations:

“Boss” x “Boss” = “Boss” (“boss of my boss is my boss”)
“On the same level” x “On the same level” = “On the same level”

Kinship networks form a foundation to social structures.

In the west, we have 2 primary relations (Parent of, married to) and one partitioning attribute (male or female). So:

Parent of a Parent = Grandparent
Father’s Father = Paternal Grandfather
Mother’s Father = Maternal Grandfather
Wife’s Mother’s Son = Brother-in-law
Mother’s Mother’s Son’s Son = Cousin (mom’s side)

Quality: The entire western kinship structure can be decomposed into a set of equations consisting of only Parent, Child, and Gender.

Quantity: Given a fertility rate of 2 kids, the two-step* kinship neighborhood would have 26 people; if the fertility rate were 3 the same count goes up to 46.

*2-steps includes aunts & uncles, but not their spouses.

The scientist’s second rule has to be to look for regularity and exploit it for theory. Consider, as a good example, Harrison White’s kinship model:

Ego connects to any of these


Kinship networks form a foundation to social structures.

In China, we have the same 2 primary relations:

Parent of
Married to

But 3 partitioning attributes:
Gender
Relative Age
Relational Order (1st wife, 2nd wife, etc.)

This means that compounds we name as equivalent (cousin, uncle) are named differently.

But, while westerners largely ignore gender for anything other than final designation (aunt/uncle, niece/nephew), Chinese kinship terms are differentiated by parent’s line (maternal aunt, maternal uncle, etc.).

We know this designation, but use it rarely.

(Figure: “Uncles”)

The Chinese extended family network – for “normal” relations westerners would recognize – includes 74 unique kinship terms.

The same set in the west has 28 different terms.

Each of these terms carries a different expected gift exchange system at holidays and mourning attire at death.


How has this system changed? Consider the effects of the 1-child policy:

Source: Population research Bureau

With a fertility of 6, 2-step kinship nets would have 166 people; with 2 it’s 26.

A full implementation of 1-child removes the “relative age” operator, erasing every kinship term dependent on “older” or “younger”, and means that families play in either a maternal or a paternal line, but not both.

Using Compound Relations theoretically:

Other work on this general topic:


Methods: How to?

The basic block model formation can be done in multiple ways:

1. Apply any of our group-finding algorithms to a role-based similarity matrix
- Here you’re simply converting the conditions for equivalence to adjacency and solving for modularity. Requires either a community detection algorithm that uses valued ties or a binarization of the similarity matrix.

2. Cluster node-level structural indices (get at regular/automorphic equivalence)
- This is the “evade” correlate to SE from community detection: cluster on a BUNCH of easy-to-calculate node-level network statistics and this gives you nodes that are equivalent (with respect to the measures you used!)

Role-specific algorithms:

Triad Structural Equivalence in SAS

Addendum: A new statistic for determining the number of groups in a network.

Proc cluster gives you a statistic for the basic “fit” of a cluster solution.

This statistic varies depending on the method used, but is usually something like an R². Consider this dendrogram:

The SPRSQ and the RSQ are your fit statistics.

(Figure: SPRSQ and RSQ, on a vertical axis running from 0 to 0.9, plotted against the number of clusters from 15 down to 1.)

A sharp change in the statistic is your best indicator.

Modularity:

$$M = \sum_{s=1}^{N_m} \left[ \frac{l_s}{L} - \left( \frac{d_s}{2L} \right)^2 \right]$$

M is the modularity score
s indexes each group (“module”)
l_s is the number of lines in group s
L is the total number of lines
d_s is the sum of the degrees of the nodes in s
N_m is the number of groups
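A small sketch of this score for an undirected network, using the quantities defined above (lines within each group, total lines, and summed degree per group). The function name `modularity`, the edge-list representation, and the toy network are assumptions made for illustration.

```python
def modularity(edges, groups):
    """Sum over groups of l_s / L - (d_s / 2L)^2 for an undirected edge list."""
    total_lines = len(edges)                      # L
    group_of = {node: s for s, members in enumerate(groups) for node in members}
    lines_within = [0] * len(groups)              # l_s
    degree_sum = [0] * len(groups)                # d_s
    for u, v in edges:
        degree_sum[group_of[u]] += 1
        degree_sum[group_of[v]] += 1
        if group_of[u] == group_of[v]:
            lines_within[group_of[u]] += 1
    return sum(
        l / total_lines - (d / (2 * total_lines)) ** 2
        for l, d in zip(lines_within, degree_sum)
    )


# Toy network: two triangles joined by a single line between nodes 2 and 3.
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_groups = modularity(toy_edges, [[0, 1, 2], [3, 4, 5]])
one_group = modularity(toy_edges, [[0, 1, 2, 3, 4, 5]])
print(round(two_groups, 4), one_group)
```

For the two-triangle split, each group has l_s = 3 and d_s = 7 with L = 7, so M = 2 x (3/7 - 1/4) = 5/14; putting every node in a single group gives M = 0.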


Role Positions

Identifying positions: Could use the Modularity score at each tree cut…

Example positions identified in a single school network (role 7 is a “leading crowd” in the simplest sum-of-in-degree sense)

Repeating this process across all networks generates a population of within-school position profiles.

We then pool & cluster these position profiles in a “2nd-order clustering” to identify a set of roles that can be compared across the populations.

We settle on a 5-position solution:

(Figure: the five positions, labeled Outsiders, Aloofs, Friends, Hangers, and Central Core; count labels in the figure: 89/2313, 260/501, 39/1078, 50/1235, 4/815, 35/263, 50/819, 0/416.)

Uninvolved outsiders (35% of students, 28% of role groups)

Largely uninvolved: nominate few and are rarely nominated by others. Includes isolated dyads & small groups; the mixing matrix shows that their few friends tend to be others in the same position.

Non-Reciprocated (17% of students, 15% of role groups)

Makes nominations that are rarely reciprocated and has low in-degree, targeting highly central nodes with nominations. “Hangers on” position.

Everyday kids: good friends (21% of students, 29% of role groups)

Basically average (positive scores largely because the isolates have been removed): liked by some, like others.

“Popular Aloof” (9% of students, 9% of role groups)

High in-degree and low out-degree, but the few they do nominate tend to reciprocate.

Central Core (17.5% of students, 17.8% of role groups)

Highly reciprocated ties, active, very central; both high in-degree and reciprocation rates.


How stable is occupancy of a school role?
