Latent Feature Models for Network Data over Time

Jimmy Foulds

Advisor: Padhraic Smyth

(Thanks also to Arthur Asuncion and Chris Dubois)

Overview

• The task
• Prior work – Miller, Van Gael, Indian Buffet Processes
• The DRIFT model
• Inference
• Preliminary results
• Future work

The Task

Modeling Dynamic (time-varying) Social Networks

• Interested in prediction
• Model interpretation for sociological understanding
• Continuous time relational events versus panel data?

Applications

Predicting Email Communications

Applications

Predicting Paper Co-authorship (NIPS data)

Prior Work

• Erdos-Renyi Models are “pseudo-dynamic”
• Continuous Markov Process Models (Snijders 2006)
  • The network stochastically optimizes the ERGM likelihood function
• Dynamic Latent Space Model (Sarkar & Moore, 2005)
  • Each node (actor) is associated with a point in a low-dimensional space (Raftery et al. 2002)
  • Link probability is a function of distance between points
  • Gaussian jumps in latent space in each timestep

Prior Work

Nonparametric Latent Feature Relational Model (Miller et al. 2009)

Each actor is associated with an unbounded sparse vector of binary latent features, generated from an Indian Buffet Process prior

The probability of a link between two actors is a function of the latent features of those actors (and additional covariates)

Prior Work

Nonparametric Latent Feature Relational Model (Miller et al. 2009) generative process:

$Z \sim \mathrm{IBP}(\alpha)$

$W_{kk'} \sim \mathcal{N}(0, \sigma_w^2)$

$Y_{ij} \sim \sigma(Z_i W Z_j^\top + \text{covariate terms})$

A kind of blockmodel with overlapping classes
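The link function on this slide can be illustrated with a minimal sketch. It assumes a fixed number of features K and uses random binary vectors as a stand-in for an IBP draw; the names `link_probabilities`, `N`, `K`, and the unit variance on W are illustrative assumptions, not taken from the slides, and covariate terms are reduced to a single optional bias.

```python
import numpy as np

def sigmoid(x):
    """Logistic function, mapping real-valued scores to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def link_probabilities(Z, W, bias=0.0):
    """Return the N x N matrix with entries sigma(Z_i W Z_j^T + bias)."""
    return sigmoid(Z @ W @ Z.T + bias)

rng = np.random.default_rng(0)
N, K = 5, 3                               # assumed sizes: 5 actors, 3 features
Z = rng.integers(0, 2, size=(N, K))       # stand-in for an IBP draw (assumption)
W = rng.normal(0.0, 1.0, size=(K, K))     # feature-interaction weights
P = link_probabilities(Z, W)
Y = rng.binomial(1, P)                    # sample one directed graph
```

Because an actor may hold several features at once, the score for a pair sums the W entries over every combination of their active features, which is what makes the classes overlap.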

(The next few slides are from Zoubin Ghahramani's NIPS 2009 Workshop talk)

How to Make this Model Dynamic For Longitudinal Data?

We would like the Zs to change over time, modeling changing interests, community memberships, …

Want to maintain sparsity property, but model persistence, generation of new features, ...

Infinite Factorial Hidden Markov Models (Van Gael et al., 2010)

• A variant of the IBP
• A probability distribution over a potentially infinite number of binary Markov chains
• Sparsity: At each timestep, introduce new features using the IBP distribution
• Persistence: A coin flip determines whether each feature persists to the next timestep
• Hidden Markov structure: the latent features are hidden but we observe something at each timestep.
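The sparsity and persistence behaviour can be sketched with a finite truncation to M features. This is an approximation for illustration only, not the infinite model: the Beta priors, the symbols `a` (off-to-on probability) and `b` (on-to-on probability), and the function name `sample_feature_chains` are assumptions made here.

```python
import numpy as np

def sample_feature_chains(T, M, alpha, gamma, delta, rng):
    """Sample M binary Markov chains of length T (finite-M sketch)."""
    a = rng.beta(alpha / M, 1.0, size=M)   # P(off -> on): small, so features are sparse
    b = rng.beta(gamma, delta, size=M)     # P(on -> on): the persistence "coin flip"
    Z = np.zeros((T, M), dtype=int)
    prev = np.zeros(M, dtype=int)          # assume every feature starts switched off
    for t in range(T):
        p_on = np.where(prev == 1, b, a)   # per-feature probability of being on now
        Z[t] = rng.random(M) < p_on
        prev = Z[t]
    return Z, a, b

rng = np.random.default_rng(1)
Z, a, b = sample_feature_chains(T=10, M=20, alpha=2.0, gamma=5.0, delta=1.0, rng=rng)
```

With `alpha / M` small, most chains stay off for the whole sequence, while a large `gamma` relative to `delta` keeps an active feature on for several consecutive timesteps.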

DRIFT: the Dynamic Relational Infinite FeaTure Model

The iFHMM models the evolution of one actor's features over time

We use an iFHMM for each actor, but share the transition probabilities

Observed graphs are generated via the latent feature model of Miller et al. (2009): $Y_{ij} \sim \sigma(Z_i W Z_j^\top + \ldots)$
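A minimal generative sketch of this construction follows, assuming a finite truncation to M features and omitting covariates. The function name `sample_drift`, the Beta and Gaussian priors, and all hyperparameter values are illustrative assumptions; the point is only that every actor has its own feature chain while the transition probabilities `a` and `b` are shared.

```python
import numpy as np

def sample_drift(N, T, M, alpha, gamma, delta, sigma_w, rng):
    """Sample per-actor feature chains and one graph per timestep (sketch)."""
    a = rng.beta(alpha / M, 1.0, size=M)       # shared P(off -> on) per feature
    b = rng.beta(gamma, delta, size=M)         # shared P(on -> on) per feature
    W = rng.normal(0.0, sigma_w, size=(M, M))  # feature-interaction weights
    Z = np.zeros((T, N, M), dtype=int)
    Y = np.zeros((T, N, N), dtype=int)
    prev = np.zeros((N, M), dtype=int)         # assume all features start off
    for t in range(T):
        p_on = np.where(prev == 1, b, a)       # a, b broadcast across the N actors
        Z[t] = rng.random((N, M)) < p_on
        P = 1.0 / (1.0 + np.exp(-(Z[t] @ W @ Z[t].T)))
        Y[t] = rng.random((N, N)) < P          # directed links; self-links not masked
        prev = Z[t]
    return Z, Y, W

rng = np.random.default_rng(2)
Z, Y, W = sample_drift(N=6, T=4, M=8, alpha=2.0, gamma=5.0, delta=1.0,
                       sigma_w=1.0, rng=rng)
```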

DRIFT: the Dynamic Relational Infinite FeaTure Model

Inference

• Markov chain Monte Carlo inference
• Use the “slice sampling” trick with the stick-breaking construction of the IBP to effectively truncate the number of features but still perform exact inference
• Blocked Gibbs sampling on the other variables
  • Forward-backward dynamic programming on each actor's feature chain
• Metropolis-Hastings updates for the W's, since they are non-conjugate
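The forward-backward step for a single binary feature chain can be sketched as forward filtering followed by backward sampling. This is a generic two-state version, not the talk's implementation: it assumes the per-timestep log-likelihoods `log_lik` have already been computed with all other variables held fixed, and that the chain is off before the first timestep. The name `ffbs_binary_chain` is hypothetical.

```python
import numpy as np

def ffbs_binary_chain(log_lik, a, b, rng):
    """Draw z_1..z_T jointly from its conditional posterior.

    log_lik[t, s] = log p(observations at time t | z_t = s), s in {0, 1}.
    a = P(off -> on), b = P(on -> on).
    """
    T = log_lik.shape[0]
    trans = np.array([[1.0 - a, a], [1.0 - b, b]])   # trans[s, s_next]
    filt = np.zeros((T, 2))
    pred = trans[0].copy()                           # assumed: off before t = 0
    for t in range(T):                               # forward filtering
        f = pred * np.exp(log_lik[t] - log_lik[t].max())
        filt[t] = f / f.sum()
        pred = filt[t] @ trans
    z = np.zeros(T, dtype=int)
    z[T - 1] = rng.choice(2, p=filt[T - 1])
    for t in range(T - 2, -1, -1):                   # backward sampling
        w = filt[t] * trans[:, z[t + 1]]
        z[t] = rng.choice(2, p=w / w.sum())
    return z

rng = np.random.default_rng(3)
# Near-deterministic likelihoods, so the sample should follow off, on, on, off.
log_lik = np.array([[0.0, -50.0], [-50.0, 0.0], [-50.0, 0.0], [0.0, -50.0]])
z = ffbs_binary_chain(log_lik, a=0.3, b=0.7, rng=rng)
```

Sampling the whole chain in one block, rather than one timestep at a time, tends to mix better when consecutive states are strongly correlated.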

Group DRIFT

• Clustering to reduce the number of chains
• Each actor has a hidden class variable c < C < N
• The chains of infinite binary feature vectors are associated with classes rather than actors
• Allows us to scale up to large numbers of actors
• Clustering may be interpretable

β=1/C

Group DRIFT

Inference for a, b is exactly the same

Inference for z’s is similar:
• Slightly different “emission” probability
• Run forward-backward sampler on M*C chains rather than M*N chains

Inference for c’s (actor’s assignment to specific chain) is easy too

Inference for W is similar (slightly different likelihood). Note we must now assume that the diagonal of W can be non-zero.
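A small sketch may help show why the diagonal of W matters here. It assumes a finite number of features and a single timestep; `group_link_probabilities`, `Z_class`, and the example sizes are illustrative names and values, not from the slides.

```python
import numpy as np

def group_link_probabilities(Z_class, c, W):
    """Link probabilities when actors inherit the features of their class.

    Z_class: (C, K) binary features per class; c: (N,) class index per actor.
    """
    Z_actor = Z_class[c]                      # (N, K): look up each actor's class row
    logits = Z_actor @ W @ Z_actor.T
    return 1.0 / (1.0 + np.exp(-logits))

rng = np.random.default_rng(4)
C, K = 3, 4                                   # assumed: 3 classes, 4 features
Z_class = rng.integers(0, 2, size=(C, K))
W = rng.normal(0.0, 1.0, size=(K, K))
c = np.array([0, 0, 1, 2])                    # actors 0 and 1 share class 0
P = group_link_probabilities(Z_class, c, W)
```

Actors 0 and 1 have identical feature vectors, so the score for a link between them includes W[k, k] for every feature k active in class 0. Within-class links therefore depend directly on the diagonal of W.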

Preliminary Experimental Results (Synthetic Data)

Future work

• Extension to Continuous Time
• It's easy to use the IBP latent factor model as a covariate in the Relational Event Model (Butts 2008)

How to model the Zs changing over time for continuous data?

Thanks for Listening!
