TRANSCRIPT
Probabilistic Models of Object-Relational Domains
Daphne Koller, Stanford University
Joint work with: Lise Getoor, Ming-Fai Wong, Eran Segal, Avi Pfeffer, Pieter Abbeel, Nir Friedman, Ben Taskar, Evan Parker, Drago Anguelov, Rahul Biswas
Bayesian Networks: Problem
Bayesian nets use a propositional representation, but the real world has objects related to each other.
[Figure: the template variables Intelligence, Difficulty, and Grade are instantiated once per student-course pair, e.g. Intell_Jane, Diffic_CS101, Grade_Jane_CS101; Intell_George, Diffic_Geo101, Grade_George_Geo101; Intell_George, Diffic_CS101, Grade_George_CS101.]
These "instances" are not independent.
Probabilistic Relational Models
Combine advantages of relational logic & BNs:
  Natural domain modeling: objects, properties, relations
  Generalization over a variety of situations
  Compact, natural probability models
Integrate uncertainty with the relational model:
  Properties of domain entities can depend on properties of related entities
  Uncertainty over the relational structure of the domain
St. Nordaf University
[Figure: Prof. Smith and Prof. Jones (each with a Teaching-ability attribute) teach the courses "Welcome to CS101" and "Welcome to Geo101" (each with a Difficulty attribute). Students George and Jane (each with an Intelligence attribute) are Registered in the courses, and each registration carries Grade and Satisfaction attributes.]
Relational Schema
Specifies the types of objects in the domain, the attributes of each type of object, and the types of relations between objects.
[Schema diagram: classes Professor (Teaching-Ability), Course (Difficulty), Student (Intelligence), and Registration (Grade, Satisfaction); relations Teach (Professor to Course), In (Registration to Course), Take (Student to Registration).]
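The schema above can be sketched directly as typed records. This is a minimal illustration, not the authors' code; the class and attribute names follow the slide, and the example objects are assumptions.

```python
# Sketch of the relational schema: object classes with attributes, plus
# relations expressed as references between objects.
from dataclasses import dataclass

@dataclass
class Professor:
    name: str
    teaching_ability: str   # e.g. "low" / "high"

@dataclass
class Course:
    name: str
    difficulty: str         # e.g. "easy" / "hard"
    instructor: Professor   # the Teach relation

@dataclass
class Student:
    name: str
    intelligence: str       # e.g. "low" / "high"

@dataclass
class Registration:         # links a Student to a Course (Take / In relations)
    student: Student
    course: Course
    grade: str              # e.g. "A" / "B" / "C"
    satisfaction: str

smith = Professor("Prof. Smith", "high")
cs101 = Course("CS101", "hard", smith)
jane = Student("Jane", "high")
reg = Registration(jane, cs101, "A", "high")
```

Attributes of a registration can then depend on attributes reached through its links, e.g. `reg.course.instructor.teaching_ability`.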
Representing the Distribution
Very large probability space for a given context: all possible assignments of all attributes of all objects.
Infinitely many potential contexts, each associated with a very different set of worlds.
Need to represent an infinite set of complex distributions.
Probabilistic Relational Models
Universals: probabilistic patterns hold for all objects in a class.
Locality: represent direct probabilistic dependencies; links define potential interactions.
[Class diagram: Professor.Teaching-Ability, Student.Intelligence, Course.Difficulty, Reg.Grade, and Reg.Satisfaction, with dependency edges following the links between classes.]
[K. & Pfeffer; Poole; Ngo & Haddawy]
[Figure: template CPD P(Grade | Difficulty, Intelligence), shown as a stacked bar over grades A, B, C for each parent configuration (hard,high), (hard,low), (easy,high), (easy,low).]
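The "universals" idea is that one template CPD is shared by every registration object. A minimal sketch, with made-up probabilities for the parent configurations shown in the figure:

```python
# Hypothetical template CPD P(Grade | Difficulty, Intelligence), shared by
# every Registration object. The numbers are illustrative, not from the talk.
cpd_grade = {
    ("easy", "high"): {"A": 0.80, "B": 0.15, "C": 0.05},
    ("easy", "low"):  {"A": 0.30, "B": 0.50, "C": 0.20},
    ("hard", "high"): {"A": 0.50, "B": 0.40, "C": 0.10},
    ("hard", "low"):  {"A": 0.10, "B": 0.40, "C": 0.50},
}

def p_grade(grade, difficulty, intelligence):
    """Look up P(Grade | Difficulty, Intelligence) from the shared template."""
    return cpd_grade[(difficulty, intelligence)][grade]
```

Every student-course registration in the domain reuses this one table, which is what lets the model generalize over situations of any size.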
PRM Semantics
Instantiated PRM = BN:
  variables: attributes of all objects
  dependencies: determined by links & PRM
[Figure: the ground network for Prof. Smith, Prof. Jones, "Welcome to CS101", and "Welcome to Geo101", with Teaching-ability, Difficulty, Grade, Satisfac, and Intelligence nodes wired according to the relational skeleton.]
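The instantiation step can be sketched as a small grounding routine: given a skeleton of objects and links, emit one ground variable per attribute and derive each variable's parents from the template. The skeleton below is illustrative.

```python
# Sketch of PRM instantiation: a skeleton (student, course) pairs is unrolled
# into a ground BN whose Grade variables depend on the linked Difficulty and
# Intelligence variables, as dictated by the template.
registrations = [
    ("Jane", "CS101"),
    ("George", "CS101"),
    ("George", "Geo101"),
]

def ground_network(registrations):
    variables = set()
    parents = {}
    for student, course in registrations:
        i = f"Intelligence({student})"
        d = f"Difficulty({course})"
        g = f"Grade({student},{course})"
        variables.update([i, d, g])
        parents.setdefault(i, [])
        parents.setdefault(d, [])
        parents[g] = [d, i]   # template dependency, instantiated per link
    return variables, parents

vars_, parents = ground_network(registrations)
```

Note how `Intelligence(George)` is shared between his two registrations: that shared parent is exactly why the instances are not independent.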
The Web of Influence
[Figure: inference in the ground network for George and Jane. Observing grades in "Welcome to CS101" shifts the posterior over a student's Intelligence (low/high), which in turn shifts the posteriors over the Difficulty (easy/hard) of "Welcome to Geo101" and the grades (A/C) predicted there.]
Reasoning with a PRM
Generic approach: instantiate the PRM to produce a ground BN, then use standard BN inference.
In most cases, the resulting BN is too densely connected to allow exact inference.
Use approximate inference: belief propagation.
Improvement: use domain structure (objects & relations) to guide the computation, e.g. a Kikuchi approximation where clusters = objects.
Data, Model, Objects
[Pipeline: a database (Course, Student, Reg tables) plus expert knowledge feed a Learner, which produces a probabilistic model; the model plus data for a new situation feed probabilistic inference, which outputs objects.]
What are the objects in the new situation? How are they related to each other?
Two Recent Instantiations
From a relational dataset with objects & links, classify objects and predict relationships:
  Target application: recognize terrorist networks
  Actual application: from webpages to database
From raw sensor data to categorized objects:
  Laser data acquired by robot
  Extract objects, with their static & dynamic properties
  Discover classes of similar objects
Summary
PRMs inherit key advantages of probabilistic graphical models:
  Coherent probabilistic semantics
  Exploit structure of local interactions
Relational models are inherently more expressive.
"Web of influence": use multiple sources of information to reach conclusions.
Exploit both relational information and the power of probabilistic reasoning.
Discriminative Probabilistic Models for Relational Data
Ben Taskar, Stanford University
Joint work with: Ming-Fai Wong, Daphne Koller, Pieter Abbeel
Web KB
[Figure: a web of pages, Tom Mitchell (Professor), WebKB (Project), and Sean Slattery (Student), connected by Advisor-of, Project-of, and Member links.]
[Craven et al.]
Undirected PRMs: Relational Markov Nets
Universals: probabilistic patterns hold for all groups of objects.
Locality: represent local probabilistic dependencies.
Address limitations of directed models:
  Increase expressive power by removing the acyclicity constraint
  Improve predictive performance through discriminative training
[Figure: a study-group clique template linking Reg.Grade and Reg2.Grade for two students in the same study group, with a template potential over the grade pairs AA, AB, AC, BA, BB, BC, CA, CB, CC.]
[Taskar, Abbeel, Koller '02]
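The clique template can be sketched as a single shared table of log-potentials that is reused for every instantiated pair of study-group registrations. The weights below are made up for illustration.

```python
import math

# Illustrative template potential phi(Grade1, Grade2) over two students in the
# same study group; the same table scores every instantiated pair (universals).
log_potential = {("A", "A"): 0.6, ("A", "B"): 0.1, ("A", "C"): -0.4,
                 ("B", "A"): 0.1, ("B", "B"): 0.3, ("B", "C"): 0.0,
                 ("C", "A"): -0.4, ("C", "B"): 0.0, ("C", "C"): 0.2}

def unnormalized_score(assignment, pairs):
    """Product of template potentials over all instantiated cliques."""
    return math.exp(sum(log_potential[(assignment[a], assignment[b])]
                        for a, b in pairs))

pairs = [("Jane", "Jill")]   # one study-group clique
score = unnormalized_score({"Jane": "A", "Jill": "A"}, pairs)
```

Unlike a CPD, the table need not normalize: the ground Markov net's global partition function handles normalization, which is what lets the acyclicity constraint be dropped.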
RMN Semantics
Instantiated RMN = Markov net:
  variables: attributes of all objects
  dependencies: determined by links & RMN
[Figure: the ground Markov net for George, Jane, and Jill across "Welcome to CS101" and "Welcome to Geo101", with the study-group template instantiated for the CS and Geo study groups over the students' Grade and Intelligence variables.]
Learning RMNs
Parameter estimation is not in closed form, but the objective is convex, with a unique global maximum.
Maximize the conditional log-likelihood of the data. With one template weight lambda_{g1,g2} per joint assignment of a grade pair (Reg1.Grade, Reg2.Grade):

  L = log P(Grades, Intelligence | Difficulty)
    = sum over g1, g2 of #(Grade1 = g1, Grade2 = g2) * lambda_{g1,g2}  -  log Z

[Figure: the shared template potential over grade pairs AA, AB, ..., CC, instantiated once per study-group pair in the ground network for the easy/hard, low/high example.]
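Because the objective is an exponential-family log-likelihood, its gradient is the familiar "empirical counts minus expected counts", and plain gradient ascent finds the unique global maximum. A toy sketch for a single grade pair, with made-up data counts:

```python
import math, itertools

# Toy gradient ascent for a log-linear model over one grade-pair clique.
# The counts are invented; the point is the convex objective and its
# counts-minus-expectations gradient, not these particular numbers.
grades = ["A", "B", "C"]
counts = {("A", "A"): 30, ("B", "B"): 20, ("C", "C"): 10}   # all other pairs: 0
N = sum(counts.values())

lam = {p: 0.0 for p in itertools.product(grades, grades)}
for _ in range(2000):
    Z = sum(math.exp(lam[p]) for p in lam)
    for p in lam:
        # gradient = empirical count - N * model probability
        grad = counts.get(p, 0) - N * math.exp(lam[p]) / Z
        lam[p] += 0.01 * grad

Z = sum(math.exp(lam[p]) for p in lam)
p_AA = math.exp(lam[("A", "A")]) / Z
```

At convergence the model probabilities match the empirical frequencies (here P(A,A) approaches 30/60 = 0.5), which is the maximum-likelihood fixed point.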
Web Classification Experiments
WebKB dataset:
  Four CS department websites
  Five categories (faculty, student, project, course, other)
  Bag of words on each page
  Links between pages
  Anchor text for links
Experimental setup: trained on three universities, tested on the fourth; repeated for all four combinations.
Exploiting Links
[Model diagram: each page has a Category variable and word variables Word1 ... WordN; a link connects the Category of the From-Page to the Category of the To-Page.]
Classify all pages collectively, maximizing the joint label probability.
[Results chart: test error for Logistic (flat), RMN-link, and RMN++.]
35.4% relative reduction in error relative to the strong flat approach.
Scalability
WebKB data set size: 1300 entities, 180K attributes, 5800 links.
Network size per school: 40,000 variables, 44,000 edges.
Training time: 20 minutes. Classification time: 15-20 seconds.
Predicting Relationships
Even more interesting are relationships between objects.
[Figure: Tom Mitchell (Professor), WebKB (Project), and Sean Slattery (Student), linked by Advisor-of and Member relations.]
WebKB++
Four new department web sites: Berkeley, CMU, MIT, Stanford.
Labeled page type (8 types): faculty, student, research scientist, staff, research group, research project, course, organization.
Labeled hyperlinks and virtual links (6 types): advisor, instructor, TA, member, project-of, NONE.
Data set size: 11K pages, 110K links, 2 million words.
Flat Model
[Model diagram: each link has a Rel variable over {NONE, advisor, instructor, TA, member, project-of}, predicted from the link's words (LinkWord1 ... LinkWordN) and the words of the From-Page and To-Page; each relation is classified independently.]
Collective Classification: Links
[Link model diagram: as in the flat model, but the Rel variable is also connected to the Category variables of the From-Page and To-Page, so page labels and link labels are predicted jointly.]
Triad Model
[Triad templates: (1) Professor, Student, and Group, with an Advisor link and two Member links; (2) Professor, Student, and Course, with Advisor, Instructor, and TA links.]
Link Prediction: Results
Error measured over links predicted to be present.
Link presence cutoff is at the precision/recall break-even point (30% for all models).
[Results chart: test error for the Flat, Link, and Triad models.]
72.9% relative reduction in error relative to the strong flat approach.
Summary
Use relational models to recognize entities & relations directly from raw data.
Collective classification:
  Classifies multiple entities simultaneously
  Exploits links & correlations between related entities
  Uses web-of-influence reasoning to reach strong conclusions from weak evidence
Undirected PRMs allow high-accuracy discriminative training & rich graphical patterns.
Learning Object Maps from Laser Range Data
Dragomir Anguelov, Daphne Koller, Evan Parker
Robotics Lab, Stanford University
Occupancy Grid Maps
Static world assumption; inadequate for answering symbolic queries.
[Figure: an occupancy grid with a person and the robot marked.]
Objects
Entities with coherent properties: shape, color, kinematics (motion).
Object Maps
Natural and concise representation.
Exploit prior knowledge about object models: walls are straight, doors open and close.
Learn global properties of the environment: primary orientation of walls, typical door width.
Generalize properties across objects: objects viewed as instances of object classes, with parameter sharing.
Learning Object Maps
1. Define a probabilistic generative model
2. Suggest object hypotheses
3. Optimize the object parameters (EM)
4. Select the highest-scoring model
[Pipeline: laser sensor data → object segmentation → object properties.]
Probabilistic Model
Data: a set of scans. A scan is a set of <robot position, laser beam reading> tuples; each scan is associated with a static map Mt.
Global map M: a set of objects {1, ..., J}.
Each object i = {S[i], Dt[i]}: S[i] are static parameters, Dt[i] are dynamic parameters.
Non-static environment: dynamic parameters vary only between static maps.
Fully dynamic environment: dynamic parameters vary between scans.
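The static/dynamic split can be sketched as a small data model. This is an illustrative sketch, not the authors' code; the field names and the example door are assumptions.

```python
# Sketch of the map representation: each object carries static parameters S[i]
# and dynamic parameters Dt[i] that may differ between static maps t.
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class MapObject:
    static: Dict[str, float]               # S[i], e.g. a door's pivot and width
    dynamic: Dict[int, Dict[str, float]]   # Dt[i], indexed by static map t

@dataclass
class Scan:
    t: int                                 # which static map this scan belongs to
    readings: List[Tuple[Tuple[float, float], float]]  # (robot position, beam range)

# A door whose opening angle differs between two static maps:
door = MapObject(static={"pivot_x": 1.0, "pivot_y": 2.0, "width": 0.9},
                 dynamic={0: {"angle": 0.0}, 1: {"angle": 1.2}})
```

Under the non-static-environment assumption, `door.dynamic` is indexed by static map; in a fully dynamic environment it would be indexed per scan instead.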
Probabilistic Model - II
General map M; static maps M1 ... MT.
Objects θ1 ... θJ.
Robot positions s_k^t.
Laser beams z_k^t.
Correspondence variables C_k^t.
[Graphical model: each beam z_k^t depends on the robot position s_k^t, its correspondence variable C_k^t, and the objects θ1 ... θJ in the map.]
Generative Model Specification
Sensor model.
Object models; particular instantiation: walls and doors.
Model score.
Sensor Model
Modeling occlusion: Reading zt
k generated from: Random model (uniform probability) First object the beam intersects
Actual object (Gaussian probability) MaxRange model (Delta function)
Why we should model occlusion:
Realistic sensor model Helps to infer motion Improved model search
Zmax
P(Zt)
j
Zt
Zt
P(Zt)
Zmax
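The three components combine into a mixture likelihood per beam. A hedged sketch, with illustrative mixture weights, noise level, and max range (none of these constants are from the talk):

```python
import math

# Sketch of the occlusion-aware beam model: a mixture of a uniform (random)
# component, a Gaussian around the first intersected object, and a
# delta-like spike at max range. All constants are illustrative.
Z_MAX = 10.0
W_RAND, W_OBJ, W_MAX = 0.05, 0.9, 0.05
SIGMA = 0.1

def beam_likelihood(z, expected):
    """p(z | distance to the first object along the beam)."""
    p_rand = 1.0 / Z_MAX
    p_obj = math.exp(-0.5 * ((z - expected) / SIGMA) ** 2) \
            / (SIGMA * math.sqrt(2 * math.pi))
    p_max = 1.0 if abs(z - Z_MAX) < 0.01 else 0.0   # discretized delta
    return W_RAND * p_rand + W_OBJ * p_obj + W_MAX * p_max
```

Using the *first* intersected object as the Gaussian's mean is what encodes occlusion: a beam stopped short of a wall by a door is explained by the door, not penalized against the wall.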
Wall Object Model
Wall model i:
  A line in normal form ⟨θi, ρi⟩: points x on the line satisfy θi^T x + ρi = 0, with ||θi|| = 1
  S intervals ⟨λ1, λ2⟩, each denoting a segment along the line
  2S + 2 independent parameters
  Collinear segments bias
Door Object Model
Door model i:
  A pivot p
  Width w
  A set of opening angles φ^t (t = 1, 2, ...)
  Limited rotation (90°)
  Arc "center" d
  4 static + 1 dynamic parameter
Model Score
Maximize the log-posterior probability of map M and data Z:

  log p(M, Z | ...) = log p(M) + log p(Z | M, ...)

i.e. trade off increased data likelihood against an increased number of parameters, via a structure prior p(M) over possible maps:

  p(M) ∝ exp( -κS |S[M]| - κD |D[M]| - κL L[M] )

where |S[M]| is the number of static parameters in M, |D[M]| the number of dynamic parameters in M, and L[M] the total length of segments in M.
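The score can be sketched in a few lines; the κ weights here are illustrative, not the values used in the system:

```python
# Sketch of the model score: a structure prior penalizing static parameters,
# dynamic parameters, and total segment length, traded off against the data
# likelihood. The kappa weights are made-up illustration values.
KAPPA_S, KAPPA_D, KAPPA_L = 1.0, 0.5, 0.1

def log_prior(n_static, n_dynamic, total_length):
    return -(KAPPA_S * n_static + KAPPA_D * n_dynamic + KAPPA_L * total_length)

def model_score(log_likelihood, n_static, n_dynamic, total_length):
    """log p(M, Z | ...) = log p(M) + log p(Z | M, ...)"""
    return log_prior(n_static, n_dynamic, total_length) + log_likelihood

# Adding a door (4 static + 1 dynamic parameter) only pays off if it raises
# the data log-likelihood by more than its prior cost:
cost = log_prior(0, 0, 0) - log_prior(4, 1, 0)
```

This is the usual Occam's-razor trade-off: a hypothesized object survives model selection only if it explains enough beams to cover its parameter cost.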
Learning Model Parameters (EM)
E-step: compute the expectations

  p(C_k^t = c | s_k^t, z_k^t, M)

M-step:
  Walls: optimize line parameters; optimize segment ends
  Doors: optimize pivot and angles; optimize door width
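The E-step soft-assigns each beam to the object that generated it. A minimal sketch, assuming a 1-D toy where each object is summarized by its expected range along the beam and the noise model is the Gaussian-plus-uniform mixture from the sensor model:

```python
import math

# Sketch of the E-step: responsibilities p(C = j | z) over candidate objects,
# plus a final uniform "random measurement" component. The noise level,
# max range, and toy expected ranges are illustrative.
SIGMA = 0.1
Z_MAX = 10.0

def e_step(z, expected_ranges):
    """Return p(C = j | z) for each object j, with a uniform component last."""
    weights = [math.exp(-0.5 * ((z - e) / SIGMA) ** 2) / SIGMA
               for e in expected_ranges]
    weights.append(1.0 / Z_MAX)          # random-measurement component
    total = sum(weights)
    return [w / total for w in weights]

resp = e_step(3.02, [3.0, 6.0])          # a reading near the first object
```

The M-step would then refit each object's parameters using its beams weighted by these responsibilities, which is what makes the wall and door updates weighted least-squares problems.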
Suggesting Object Hypotheses
Wall hypotheses:
  Use the Hough transform (a histogram-based approach)
  Compute the preferred direction of the environment
  Use both to suggest lines
Door hypotheses:
  Use temporal differencing of static maps
  Check whether points along segments in the static maps Mt are well explained in the general map M
  If not, the segment is a potential door
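The Hough step can be sketched as simple bin voting: each laser point votes for the (angle, offset) pairs of lines passing through it, and histogram peaks become wall hypotheses. Bin sizes and the sample points below are illustrative.

```python
import math
from collections import Counter

# Minimal Hough-style line voting: each point votes for (theta, rho) bins;
# peaks in the vote histogram suggest candidate lines.
def hough_votes(points, n_theta=36, rho_res=0.1):
    votes = Counter()
    for x, y in points:
        for i in range(n_theta):
            theta = math.pi * i / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(i, round(rho / rho_res))] += 1
    return votes

# Points on the vertical line x = 2 should all vote for theta = 0, rho = 2.
points = [(2.0, y / 10.0) for y in range(10)]
best_bin, best_count = hough_votes(points).most_common(1)[0]
```

Restricting the voted angles to the environment's preferred direction (and its perpendicular) is a natural way to combine the two wall-hypothesis cues above.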
Results for a Single Pass
Results for Two Passes
Future Work
Simultaneous localization & mapping
Object class hierarchies
Dynamic environments
Learn to recognize object classes
Enrich the object representation: more sophisticated shape models, color, 3D
Hierarchical object maps
Come visit our poster!