relational factor graphs
1
Relational Factor Graphs
Lin Liao
Joint work with Dieter Fox
2
A Running Example
Collective classification of a person’s significant places
3
Features to Consider
Local features:
  Temporal: time of day, day of week, duration
  Geographic: near restaurants, near stores
Pair-wise features:
  Transitions: which place follows which place
Global features:
  Aggregates: number of homes or workplaces
4
Which Graphical Model?
Option 1: Bayesian networks and Probabilistic Relational Models
But the pair-wise relations may introduce cycles
[Figure: Place 1, Place 2, Place 3, Place 4]
5
Which Graphical Model?
Option 2: Markov networks and Relational Markov Networks
But aggregations can introduce huge cliques and lose independence relations.
[Figure: Place 1, Place 2, Place 3, Place 4, and Number of homes]
6
Motivation
We want a relational probabilistic model that is:
  Suitable to represent both undirected relations (e.g., pair-wise features) and directed relations (e.g., deterministic aggregation)
  Able to address some of the computational issues at the template level
7
Outline
Representation
  Factor graphs [Kschischang et al. 2001, Frey 2003]
  Relational factor graphs
Inference
  Belief propagation
  Inference templates
  Summation template based on FFT
Experiments
8
Factor Graph
Undirected factor graph [Kschischang et al. 2001]
  Bipartite graph that includes both variable nodes (x1,…,xN) and factor nodes (f1,…,fM)
  Joint distribution of variables is proportional to the product of factor functions
[Figure: variable nodes x1, x2, x3, x4 and factor nodes f1, f2, f3]
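The statement that the joint distribution is proportional to the product of factor functions can be sketched in a few lines. This is a minimal illustration, not the talk's implementation: the factor scopes and table values below are assumptions, since the edges of the figure are not recoverable from the transcript.

```python
import itertools
import numpy as np

# Hypothetical scopes and tables for binary variables x1..x4 (illustrative only).
variables = ["x1", "x2", "x3", "x4"]
factors = {
    "f1": (("x1", "x2"), np.array([[2.0, 1.0], [1.0, 3.0]])),
    "f2": (("x2", "x3"), np.array([[1.0, 2.0], [2.0, 1.0]])),
    "f3": (("x3", "x4"), np.array([[3.0, 1.0], [1.0, 1.0]])),
}

def joint_distribution(variables, factors):
    """Return {assignment tuple: probability} by brute-force enumeration."""
    unnormalized = {}
    for values in itertools.product([0, 1], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        score = 1.0
        for scope, table in factors.values():
            # Multiply in the factor value for this assignment of its scope.
            score *= table[tuple(assignment[v] for v in scope)]
        unnormalized[values] = score
    z = sum(unnormalized.values())  # normalization constant
    return {k: v / z for k, v in unnormalized.items()}

joint = joint_distribution(variables, factors)
```

Enumeration is only feasible for tiny graphs; it is used here to make the definition concrete, not as an inference method.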
9
Factor Graph
Directed factor graph [Frey 2003]
  Allow some edges to be directed so as to unify Bayesian networks and Markov networks
  A valid graph should have no directed cycles
[Figure: variable nodes x1, x2, x3, x4 and factor nodes f1, f2, f3, with some edges directed]
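One simple reading of the "no directed cycles" condition is that the directed edges alone, ignoring undirected ones, must form an acyclic graph. The sketch below checks that with Kahn's algorithm. The edge lists are hypothetical, and Frey's full validity conditions may be stricter than this check.

```python
from collections import defaultdict, deque

def has_directed_cycle(directed_edges):
    """Kahn's algorithm over directed edges only; undirected edges are ignored."""
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for src, dst in directed_edges:
        successors[src].append(dst)
        indegree[dst] += 1
        nodes.update((src, dst))
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    # Any node never reaching indegree 0 lies on, or downstream of, a cycle.
    return visited != len(nodes)

# Hypothetical edge lists: parents -> factor -> child.
valid_edges = [("x1", "f1"), ("x2", "f1"), ("f1", "x3")]
cyclic_edges = valid_edges + [("x3", "f2"), ("f2", "x1")]
```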
10
Markov Network to Factor Graph
Factors represent the potential functions
[Figure: a Markov network and the corresponding factor graph]
11
Bayesian Network to Factor Graph
Factors represent the conditional probability tables
[Figure: a Bayesian network and the corresponding factor graph]
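As a small illustration of this conversion, the sketch below turns a two-node Bayesian network A -> B into two factors whose tables are exactly the conditional probability tables. The network and its numbers are made up for the example.

```python
import numpy as np

# Hypothetical network A -> B with invented probabilities.
p_a = np.array([0.8, 0.2])              # P(A)
p_b_given_a = np.array([[0.9, 0.1],     # P(B | A=0)
                        [0.2, 0.8]])    # P(B | A=1)

# One factor per node; each factor table is that node's CPT.
factor_a = p_a                  # scope: (A,)
factor_b = p_b_given_a          # scope: (A, B)

# The product of the factors is the joint P(A, B); indexed as joint[a, b].
joint = factor_a[:, None] * factor_b
```

Because every factor is a normalized conditional distribution, the product already sums to one and needs no separate normalization constant.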
12
Unify MN and BN
[Figure labels: Local features, Place labels, Aggregation factor (+), Number of homes, Aggregate features]
13
Relational Factor Graph
A set of factor templates that can be used to instantiate (directed) factor graphs given data
Representation template
  Use SQL (similar to RMN)
  Guarantee no directed cycles
Inference template
  Optimization within a factor (discussed later)
14
Place Labeling: Schema
15
Place Labeling: Transition Features
[Figure: Label1, Label2, Label3, with pair-wise factors]
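How an SQL representation template might instantiate pair-wise factors can be sketched with Python's built-in sqlite3 module. Everything concrete here is an assumption: the `place` table, its `visit_order` column, and the query are hypothetical, since the talk's actual schema and template syntax are not in the transcript.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per place, ordered by first visit.
conn.execute("CREATE TABLE place (id INTEGER PRIMARY KEY, visit_order INTEGER)")
conn.executemany("INSERT INTO place VALUES (?, ?)", [(1, 0), (2, 1), (3, 2)])

# Hypothetical template: one pair-wise factor for every pair of places
# that are visited consecutively.
TRANSITION_TEMPLATE = """
    SELECT a.id, b.id
    FROM place AS a JOIN place AS b
      ON b.visit_order = a.visit_order + 1
    ORDER BY a.visit_order
"""

# Each returned row instantiates one factor over two label variables.
pairwise_factors = [
    ("transition", f"Label{a}", f"Label{b}")
    for a, b in conn.execute(TRANSITION_TEMPLATE)
]
```

With three places this yields two factors, one over (Label1, Label2) and one over (Label2, Label3); the same template applies unchanged to any number of places.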
16
Place Labeling: Aggregate Features
[Figure: Label1, Label2, Label3, each with a Bool variable (=Home?); a summation factor (+) over the Bool variables gives Num of homes, which connects to the aggregate feature]
17
Outline
Representation
  Factor graphs [Kschischang et al. 2001, Frey 2003]
  Relational factor graphs
Inference
  Belief propagation
  Inference templates
  Summation template based on FFT
Experiments
18
Inference in Factor Graph
Belief propagation: two types of messages
  Message from variable x to factor f
  Message from factor f to variable x
  nx: factors adjacent to x; nf: variables adjacent to f
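The message formulas themselves did not survive in this transcript. Assuming the talk uses the standard sum-product updates of Kschischang et al. 2001, and writing nx and nf as n_x and n_f, they would read:

```latex
\mu_{x \to f}(x) = \prod_{h \in n_x \setminus \{f\}} \mu_{h \to x}(x)
\qquad
\mu_{f \to x}(x) = \sum_{n_f \setminus \{x\}} f(n_f) \prod_{y \in n_f \setminus \{x\}} \mu_{y \to f}(y)
```

The sum in the second formula ranges over all joint assignments of the variables adjacent to f other than x.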
19
Inference Templates
Simplest case: specify the function f(nf) and use the above formula to compute message f -> x
Problem: complexity is exponential in the number of factor arguments. This can be very expensive for aggregation factors
Inference templates allow users to specify optimized algorithms at the template level
  Be in general form and easy to share
  Support template-level complexity analysis
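To make the exponential cost concrete, here is a minimal sketch of the brute-force message from a deterministic summation factor to its output variable. The function name and the message representation (one array per input, indexed by value) are assumptions made for this example.

```python
import itertools
import numpy as np

def brute_force_sum_message(input_messages):
    """Message from a deterministic sum factor to xout by full enumeration.

    input_messages: list of 1-D arrays; entry v of each array is the
    incoming message value for that input taking value v.
    """
    max_sum = sum(len(m) - 1 for m in input_messages)
    out = np.zeros(max_sum + 1)
    # Visits every joint assignment: prod(len(m)) terms in total.
    for values in itertools.product(*(range(len(m)) for m in input_messages)):
        weight = 1.0
        for m, v in zip(input_messages, values):
            weight *= m[v]
        out[sum(values)] += weight  # the factor is 1 only when xout equals the sum
    return out

# Eight Boolean inputs already require 2**8 = 256 terms.
message = brute_force_sum_message([np.array([0.5, 0.5])] * 8)
```

Each additional Boolean input doubles the number of terms, which is the cost an optimized template is meant to avoid.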
20
Summation Templates
[Figure: summation factor (+) with inputs xin1, xin2, …, xin7, xin8 and output xout]
21
Summation: Forward Message
[Figure: summation factor (+) with inputs xin1, xin2, …, xin7, xin8 and output xout]
Compute the distribution of the sum of independent variables xin1, …, xin8
22
Summation: Forward Message
Convolution tree: each node can be computed using FFT; total complexity O(n log^2 n)
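A convolution tree can be sketched with NumPy's FFT routines: adjacent distributions are convolved pairwise, level by level, until one distribution for the total remains. This is an illustrative sketch rather than the talk's code; the function names are assumptions.

```python
import numpy as np

def fft_convolve(a, b):
    """Linear convolution of two 1-D arrays via real FFT."""
    size = len(a) + len(b) - 1
    result = np.fft.irfft(np.fft.rfft(a, size) * np.fft.rfft(b, size), size)
    return np.clip(result, 0.0, None)  # remove tiny negative round-off

def sum_distribution(distributions):
    """Distribution of the sum of independent variables (needs at least one input)."""
    level = [np.asarray(d, dtype=float) for d in distributions]
    while len(level) > 1:
        # Combine neighbours pairwise; each result is one node of the tree.
        next_level = [fft_convolve(level[i], level[i + 1])
                      for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            next_level.append(level[-1])  # odd node carries over unchanged
        level = next_level
    return level[0]

# Eight Boolean inputs, as in the slide's example.
forward = sum_distribution([[0.5, 0.5]] * 8)
```

The tree has about log n levels and each level costs O(n log n) with FFT, which is where the overall bound comes from.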
23
Summation: Backward Message
[Figure: summation factor (+) with inputs xin1, xin2, …, xin7, xin8 and output xout]
Message from xout defines a prior distribution of the sum. For each value of xin2, compute the distribution of the sum and weight it by the prior
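The backward computation can be sketched directly, without the caching that the talk uses to reach its stated complexity. For each value v of the target input, the distribution of the other inputs' sum is shifted by v and weighted by the message from xout. The function name and array layout are assumptions for this example.

```python
import numpy as np

def backward_message(input_messages, out_message, target):
    """Message from the sum factor to input `target` (no caching)."""
    out_message = np.asarray(out_message, dtype=float)
    others = np.array([1.0])
    for i, m in enumerate(input_messages):
        if i != target:
            # Distribution of the sum of all inputs except the target.
            others = np.convolve(others, m)
    n_values = len(input_messages[target])
    msg = np.empty(n_values)
    for v in range(n_values):
        # xout = v + s, so weight each partial sum s by the prior on v + s.
        msg[v] = np.dot(out_message[v:v + len(others)], others)
    return msg

# Three Boolean inputs; the message from xout says the sum must be 3.
inputs = [np.array([0.5, 0.5])] * 3
msg = backward_message(inputs, [0.0, 0.0, 0.0, 1.0], target=1)
```

In this example the sum can only be 3 if the target is 1 and both other inputs are 1, so all of the message's weight falls on the value 1.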
24
Summation: Backward Message
If we reuse the results cached for the forward message, complexity becomes O(n log n)
25
Summation Templates
By using a convolution tree, FFT, and caching, the average complexity of passing a message through a summation factor is O(n log n), instead of exponential.
26
Learning
Estimate the weights for probabilistic factors (local features, pair-wise features, and aggregate features)
Optimize the weights to maximize the conditional likelihood of the labeled training data
The same algorithm as RMN
27
Experiments
Two data sets:
  “Single” data set: one person’s GPS data for 4 months
  “Multiple” data set: one-week GPS data from 5 subjects
Six candidate labels: Home, Work, Shopping, Dining, Friend, Others
Geographic knowledge is obtained from Microsoft MapPoint Web Service
28
How Much Aggregates Help
Error rate        Multiple   Single
No aggregate      28%        9%
With aggregate    18%        6%
Test on “multiple” data set: leave-one-subject-out cross-validation
Test on “single” data set: cross-validation (train on 1 month, test on 3 months)
29
How Efficient the Optimized BP
30
Summary
Relational factor graph is SQL + (directed) factor graph
It is:
  Suitable to represent both undirected relations and directed relations
  Convenient to use: no directed cycles
  Able to address computation issues at the template level