TRANSCRIPT
Probabilistic Models of Object-Relational Domains
Daphne Koller, Stanford University
Joint work with: Lise Getoor, Ming-Fai Wong, Eran Segal, Avi Pfeffer, Pieter Abbeel, Nir Friedman, Ben Taskar, Evan Parker, Drago Anguelov, Rahul Biswas
Bayesian Networks: Problem
Bayesian nets use a propositional representation, but the real world has objects related to each other.
[Figure: the template variables Intelligence, Difficulty, and Grade are instantiated once per student-course pair, e.g. Intell_Jane, Diffic_CS101, Grade_Jane_CS101; Intell_George, Diffic_Geo101, Grade_George_Geo101; Intell_George, Diffic_CS101, Grade_George_CS101.]
These "instances" are not independent.
Probabilistic Relational Models
Combine advantages of relational logic & BNs:
  Natural domain modeling: objects, properties, relations
  Generalization over a variety of situations
  Compact, natural probability models
Integrate uncertainty with the relational model:
  Properties of domain entities can depend on properties of related entities
  Uncertainty over the relational structure of the domain
St. Nordaf University
[Figure: Prof. Smith and Prof. Jones (each with a Teaching-ability attribute) teach the courses "Welcome to CS101" and "Welcome to Geo101" (each with a Difficulty attribute). Students George and Jane (each with an Intelligence attribute) are Registered in the courses, and each registration carries Grade and Satisfaction attributes.]
Relational Schema
Specifies the types of objects in the domain, the attributes of each type of object, and the types of relations between objects.
[Schema diagram: classes Professor (Teaching-Ability), Course (Difficulty), Student (Intelligence), and Registration (Grade, Satisfaction); relations Teach (Professor to Course), In (Registration to Course), Take (Student to Registration).]
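The schema above can be sketched directly as typed records. This is a minimal illustration, not the authors' code; the class and attribute names follow the slide, and the example objects are assumptions.

```python
# Sketch of the relational schema: object classes with attributes, plus
# relations expressed as references between objects.
from dataclasses import dataclass

@dataclass
class Professor:
    name: str
    teaching_ability: str   # e.g. "low" / "high"

@dataclass
class Course:
    name: str
    difficulty: str         # e.g. "easy" / "hard"
    instructor: Professor   # the Teach relation

@dataclass
class Student:
    name: str
    intelligence: str       # e.g. "low" / "high"

@dataclass
class Registration:         # links a Student to a Course (Take / In relations)
    student: Student
    course: Course
    grade: str              # e.g. "A" / "B" / "C"
    satisfaction: str

smith = Professor("Prof. Smith", "high")
cs101 = Course("CS101", "hard", smith)
jane = Student("Jane", "high")
reg = Registration(jane, cs101, "A", "high")
```

Attributes of a registration can then depend on attributes reached through its links, e.g. `reg.course.instructor.teaching_ability`.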
Representing the Distribution
Very large probability space for a given context: all possible assignments of all attributes of all objects.
Infinitely many potential contexts, each associated with a very different set of worlds.
Need to represent an infinite set of complex distributions.
Probabilistic Relational Models
Universals: probabilistic patterns hold for all objects in a class.
Locality: represent direct probabilistic dependencies; links define potential interactions.
[Class diagram: Professor.Teaching-Ability, Student.Intelligence, Course.Difficulty, Reg.Grade, and Reg.Satisfaction, with dependency edges following the links between classes.]
[K. & Pfeffer; Poole; Ngo & Haddawy]
[Figure: template CPD P(Grade | Difficulty, Intelligence), shown as a stacked bar over grades A, B, C for each parent configuration (hard,high), (hard,low), (easy,high), (easy,low).]
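The "universals" idea is that one template CPD is shared by every registration object. A minimal sketch, with made-up probabilities for the parent configurations shown in the figure:

```python
# Hypothetical template CPD P(Grade | Difficulty, Intelligence), shared by
# every Registration object. The numbers are illustrative, not from the talk.
cpd_grade = {
    ("easy", "high"): {"A": 0.80, "B": 0.15, "C": 0.05},
    ("easy", "low"):  {"A": 0.30, "B": 0.50, "C": 0.20},
    ("hard", "high"): {"A": 0.50, "B": 0.40, "C": 0.10},
    ("hard", "low"):  {"A": 0.10, "B": 0.40, "C": 0.50},
}

def p_grade(grade, difficulty, intelligence):
    """Look up P(Grade | Difficulty, Intelligence) from the shared template."""
    return cpd_grade[(difficulty, intelligence)][grade]
```

Every student-course registration in the domain reuses this one table, which is what lets the model generalize over situations of any size.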
PRM Semantics
Instantiated PRM = BN:
  variables: attributes of all objects
  dependencies: determined by links & PRM
[Figure: the ground network for Prof. Smith, Prof. Jones, "Welcome to CS101", and "Welcome to Geo101", with Teaching-ability, Difficulty, Grade, Satisfac, and Intelligence nodes wired according to the relational skeleton.]
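The instantiation step can be sketched as a small grounding routine: given a skeleton of objects and links, emit one ground variable per attribute and derive each variable's parents from the template. The skeleton below is illustrative.

```python
# Sketch of PRM instantiation: a skeleton (student, course) pairs is unrolled
# into a ground BN whose Grade variables depend on the linked Difficulty and
# Intelligence variables, as dictated by the template.
registrations = [
    ("Jane", "CS101"),
    ("George", "CS101"),
    ("George", "Geo101"),
]

def ground_network(registrations):
    variables = set()
    parents = {}
    for student, course in registrations:
        i = f"Intelligence({student})"
        d = f"Difficulty({course})"
        g = f"Grade({student},{course})"
        variables.update([i, d, g])
        parents.setdefault(i, [])
        parents.setdefault(d, [])
        parents[g] = [d, i]   # template dependency, instantiated per link
    return variables, parents

vars_, parents = ground_network(registrations)
```

Note how `Intelligence(George)` is shared between his two registrations: that shared parent is exactly why the instances are not independent.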
The Web of Influence
[Figure: inference in the ground network for George and Jane. Observing grades in "Welcome to CS101" shifts the posterior over a student's Intelligence (low/high), which in turn shifts the posteriors over the Difficulty (easy/hard) of "Welcome to Geo101" and the grades (A/C) predicted there.]
Reasoning with a PRM
Generic approach: instantiate the PRM to produce a ground BN, then use standard BN inference.
In most cases, the resulting BN is too densely connected to allow exact inference.
Use approximate inference: belief propagation.
Improvement: use domain structure (objects & relations) to guide the computation, e.g. a Kikuchi approximation where clusters = objects.
Data, Model, Objects
[Pipeline: a database (Course, Student, Reg tables) plus expert knowledge feed a Learner, which produces a probabilistic model; the model plus data for a new situation feed probabilistic inference, which outputs objects.]
What are the objects in the new situation? How are they related to each other?
Two Recent Instantiations
From a relational dataset with objects & links, classify objects and predict relationships:
  Target application: recognize terrorist networks
  Actual application: from webpages to database
From raw sensor data to categorized objects:
  Laser data acquired by robot
  Extract objects, with their static & dynamic properties
  Discover classes of similar objects
Summary
PRMs inherit key advantages of probabilistic graphical models:
  Coherent probabilistic semantics
  Exploit structure of local interactions
Relational models are inherently more expressive.
"Web of influence": use multiple sources of information to reach conclusions.
Exploit both relational information and the power of probabilistic reasoning.
Discriminative Probabilistic Models for Relational Data
Ben Taskar, Stanford University
Joint work with: Ming-Fai Wong, Daphne Koller, Pieter Abbeel
Web KB
[Figure: a web of pages, Tom Mitchell (Professor), WebKB (Project), and Sean Slattery (Student), connected by Advisor-of, Project-of, and Member links.]
[Craven et al.]
Undirected PRMs: Relational Markov Nets
Universals: probabilistic patterns hold for all groups of objects.
Locality: represent local probabilistic dependencies.
Address limitations of directed models:
  Increase expressive power by removing the acyclicity constraint
  Improve predictive performance through discriminative training
[Figure: a study-group clique template linking Reg.Grade and Reg2.Grade for two students in the same study group, with a template potential over the grade pairs AA, AB, AC, BA, BB, BC, CA, CB, CC.]
[Taskar, Abbeel, Koller '02]
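The clique template can be sketched as a single shared table of log-potentials that is reused for every instantiated pair of study-group registrations. The weights below are made up for illustration.

```python
import math

# Illustrative template potential phi(Grade1, Grade2) over two students in the
# same study group; the same table scores every instantiated pair (universals).
log_potential = {("A", "A"): 0.6, ("A", "B"): 0.1, ("A", "C"): -0.4,
                 ("B", "A"): 0.1, ("B", "B"): 0.3, ("B", "C"): 0.0,
                 ("C", "A"): -0.4, ("C", "B"): 0.0, ("C", "C"): 0.2}

def unnormalized_score(assignment, pairs):
    """Product of template potentials over all instantiated cliques."""
    return math.exp(sum(log_potential[(assignment[a], assignment[b])]
                        for a, b in pairs))

pairs = [("Jane", "Jill")]   # one study-group clique
score = unnormalized_score({"Jane": "A", "Jill": "A"}, pairs)
```

Unlike a CPD, the table need not normalize: the ground Markov net's global partition function handles normalization, which is what lets the acyclicity constraint be dropped.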
RMN Semantics
Instantiated RMN = Markov net:
  variables: attributes of all objects
  dependencies: determined by links & RMN
[Figure: the ground Markov net for George, Jane, and Jill across "Welcome to CS101" and "Welcome to Geo101", with the study-group template instantiated for the CS and Geo study groups over the students' Grade and Intelligence variables.]
Learning RMNs
Parameter estimation is not in closed form, but the objective is convex, with a unique global maximum.
Maximize the conditional log-likelihood of the data. With one template weight lambda_{g1,g2} per joint assignment of a grade pair (Reg1.Grade, Reg2.Grade):

  L = log P(Grades, Intelligence | Difficulty)
    = sum over g1, g2 of #(Grade1 = g1, Grade2 = g2) * lambda_{g1,g2}  -  log Z

[Figure: the shared template potential over grade pairs AA, AB, ..., CC, instantiated once per study-group pair in the ground network for the easy/hard, low/high example.]
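Because the objective is an exponential-family log-likelihood, its gradient is the familiar "empirical counts minus expected counts", and plain gradient ascent finds the unique global maximum. A toy sketch for a single grade pair, with made-up data counts:

```python
import math, itertools

# Toy gradient ascent for a log-linear model over one grade-pair clique.
# The counts are invented; the point is the convex objective and its
# counts-minus-expectations gradient, not these particular numbers.
grades = ["A", "B", "C"]
counts = {("A", "A"): 30, ("B", "B"): 20, ("C", "C"): 10}   # all other pairs: 0
N = sum(counts.values())

lam = {p: 0.0 for p in itertools.product(grades, grades)}
for _ in range(2000):
    Z = sum(math.exp(lam[p]) for p in lam)
    for p in lam:
        # gradient = empirical count - N * model probability
        grad = counts.get(p, 0) - N * math.exp(lam[p]) / Z
        lam[p] += 0.01 * grad

Z = sum(math.exp(lam[p]) for p in lam)
p_AA = math.exp(lam[("A", "A")]) / Z
```

At convergence the model probabilities match the empirical frequencies (here P(A,A) approaches 30/60 = 0.5), which is the maximum-likelihood fixed point.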
Web Classification Experiments
WebKB dataset:
  Four CS department websites
  Five categories (faculty, student, project, course, other)
  Bag of words on each page
  Links between pages
  Anchor text for links
Experimental setup: trained on three universities, tested on the fourth; repeated for all four combinations.
Exploiting Links
[Model diagram: each page has a Category variable and word variables Word1 ... WordN; a link connects the Category of the From-Page to the Category of the To-Page.]
Classify all pages collectively, maximizing the joint label probability.
[Results chart: test error for Logistic (flat), RMN-link, and RMN++.]
35.4% relative reduction in error relative to the strong flat approach.
Scalability
WebKB data set size: 1300 entities, 180K attributes, 5800 links.
Network size per school: 40,000 variables, 44,000 edges.
Training time: 20 minutes. Classification time: 15-20 seconds.
Predicting Relationships
Even more interesting are relationships between objects.
[Figure: Tom Mitchell (Professor), WebKB (Project), and Sean Slattery (Student), linked by Advisor-of and Member relations.]
WebKB++
Four new department web sites: Berkeley, CMU, MIT, Stanford.
Labeled page type (8 types): faculty, student, research scientist, staff, research group, research project, course, organization.
Labeled hyperlinks and virtual links (6 types): advisor, instructor, TA, member, project-of, NONE.
Data set size: 11K pages, 110K links, 2 million words.
Flat Model
[Model diagram: each link has a Rel variable over {NONE, advisor, instructor, TA, member, project-of}, predicted from the link's words (LinkWord1 ... LinkWordN) and the words of the From-Page and To-Page; each relation is classified independently.]
Collective Classification: Links
[Link model diagram: as in the flat model, but the Rel variable is also connected to the Category variables of the From-Page and To-Page, so page labels and link labels are predicted jointly.]
Triad Model
[Triad templates: (1) Professor, Student, and Group, with an Advisor link and two Member links; (2) Professor, Student, and Course, with Advisor, Instructor, and TA links.]
Link Prediction: Results
Error measured over links predicted to be present.
Link presence cutoff is at the precision/recall break-even point (30% for all models).
[Results chart: test error for the Flat, Link, and Triad models.]
72.9% relative reduction in error relative to the strong flat approach.
Summary
Use relational models to recognize entities & relations directly from raw data.
Collective classification:
  Classifies multiple entities simultaneously
  Exploits links & correlations between related entities
  Uses web-of-influence reasoning to reach strong conclusions from weak evidence
Undirected PRMs allow high-accuracy discriminative training & rich graphical patterns.
Learning Object Maps from Laser Range Data
Dragomir Anguelov, Daphne Koller, Evan Parker
Robotics Lab, Stanford University
Occupancy Grid Maps
Static world assumption; inadequate for answering symbolic queries.
[Figure: an occupancy grid with a person and the robot marked.]
Objects
Entities with coherent properties: shape, color, kinematics (motion).
Object Maps
Natural and concise representation.
Exploit prior knowledge about object models: walls are straight, doors open and close.
Learn global properties of the environment: primary orientation of walls, typical door width.
Generalize properties across objects: objects viewed as instances of object classes, with parameter sharing.
Learning Object Maps
1. Define a probabilistic generative model
2. Suggest object hypotheses
3. Optimize the object parameters (EM)
4. Select the highest-scoring model
[Pipeline: laser sensor data → object segmentation → object properties.]
Probabilistic Model
Data: a set of scans. A scan is a set of <robot position, laser beam reading> tuples; each scan is associated with a static map Mt.
Global map M: a set of objects {1, ..., J}.
Each object i = {S[i], Dt[i]}: S[i] are static parameters, Dt[i] are dynamic parameters.
Non-static environment: dynamic parameters vary only between static maps.
Fully dynamic environment: dynamic parameters vary between scans.
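The static/dynamic split can be sketched as a small data model. This is an illustrative sketch, not the authors' code; the field names and the example door are assumptions.

```python
# Sketch of the map representation: each object carries static parameters S[i]
# and dynamic parameters Dt[i] that may differ between static maps t.
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class MapObject:
    static: Dict[str, float]               # S[i], e.g. a door's pivot and width
    dynamic: Dict[int, Dict[str, float]]   # Dt[i], indexed by static map t

@dataclass
class Scan:
    t: int                                 # which static map this scan belongs to
    readings: List[Tuple[Tuple[float, float], float]]  # (robot position, beam range)

# A door whose opening angle differs between two static maps:
door = MapObject(static={"pivot_x": 1.0, "pivot_y": 2.0, "width": 0.9},
                 dynamic={0: {"angle": 0.0}, 1: {"angle": 1.2}})
```

Under the non-static-environment assumption, `door.dynamic` is indexed by static map; in a fully dynamic environment it would be indexed per scan instead.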
Probabilistic Model - II
General map M; static maps M1 ... MT.
Objects θ1 ... θJ.
Robot positions s_k^t.
Laser beams z_k^t.
Correspondence variables C_k^t.
[Graphical model: each beam z_k^t depends on the robot position s_k^t, its correspondence variable C_k^t, and the objects θ1 ... θJ in the map.]
Generative Model Specification
Sensor model.
Object models; particular instantiation: walls and doors.
Model score.
Sensor Model
Modeling occlusion: Reading zt
k generated from: Random model (uniform probability) First object the beam intersects
Actual object (Gaussian probability) MaxRange model (Delta function)
Why we should model occlusion:
Realistic sensor model Helps to infer motion Improved model search
Zmax
P(Zt)
j
Zt
Zt
P(Zt)
Zmax
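The three components combine into a mixture likelihood per beam. A hedged sketch, with illustrative mixture weights, noise level, and max range (none of these constants are from the talk):

```python
import math

# Sketch of the occlusion-aware beam model: a mixture of a uniform (random)
# component, a Gaussian around the first intersected object, and a
# delta-like spike at max range. All constants are illustrative.
Z_MAX = 10.0
W_RAND, W_OBJ, W_MAX = 0.05, 0.9, 0.05
SIGMA = 0.1

def beam_likelihood(z, expected):
    """p(z | distance to the first object along the beam)."""
    p_rand = 1.0 / Z_MAX
    p_obj = math.exp(-0.5 * ((z - expected) / SIGMA) ** 2) \
            / (SIGMA * math.sqrt(2 * math.pi))
    p_max = 1.0 if abs(z - Z_MAX) < 0.01 else 0.0   # discretized delta
    return W_RAND * p_rand + W_OBJ * p_obj + W_MAX * p_max
```

Using the *first* intersected object as the Gaussian's mean is what encodes occlusion: a beam stopped short of a wall by a door is explained by the door, not penalized against the wall.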
Wall Object Model
Wall model i:
  A line in normal form ⟨θi, ρi⟩: points x on the line satisfy θi^T x + ρi = 0, with ||θi|| = 1
  S intervals ⟨λ1, λ2⟩, each denoting a segment along the line
  2S + 2 independent parameters
  Collinear segments bias
Door Object Model
Door model i:
  A pivot p
  Width w
  A set of opening angles φ^t (t = 1, 2, ...)
  Limited rotation (90°)
  Arc "center" d
  4 static + 1 dynamic parameter
Model Score
Maximize the log-posterior probability of map M and data Z:

  log p(M, Z | ...) = log p(M) + log p(Z | M, ...)

i.e. trade off increased data likelihood against an increased number of parameters, via a structure prior p(M) over possible maps:

  p(M) ∝ exp( -κS |S[M]| - κD |D[M]| - κL L[M] )

where |S[M]| is the number of static parameters in M, |D[M]| the number of dynamic parameters in M, and L[M] the total length of segments in M.
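The score can be sketched in a few lines; the κ weights here are illustrative, not the values used in the system:

```python
# Sketch of the model score: a structure prior penalizing static parameters,
# dynamic parameters, and total segment length, traded off against the data
# likelihood. The kappa weights are made-up illustration values.
KAPPA_S, KAPPA_D, KAPPA_L = 1.0, 0.5, 0.1

def log_prior(n_static, n_dynamic, total_length):
    return -(KAPPA_S * n_static + KAPPA_D * n_dynamic + KAPPA_L * total_length)

def model_score(log_likelihood, n_static, n_dynamic, total_length):
    """log p(M, Z | ...) = log p(M) + log p(Z | M, ...)"""
    return log_prior(n_static, n_dynamic, total_length) + log_likelihood

# Adding a door (4 static + 1 dynamic parameter) only pays off if it raises
# the data log-likelihood by more than its prior cost:
cost = log_prior(0, 0, 0) - log_prior(4, 1, 0)
```

This is the usual Occam's-razor trade-off: a hypothesized object survives model selection only if it explains enough beams to cover its parameter cost.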
Learning Model Parameters (EM)
E-step: compute the expectations

  p(C_k^t = c | s_k^t, z_k^t, M)

M-step:
  Walls: optimize line parameters; optimize segment ends
  Doors: optimize pivot and angles; optimize door width
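The E-step soft-assigns each beam to the object that generated it. A minimal sketch, assuming a 1-D toy where each object is summarized by its expected range along the beam and the noise model is the Gaussian-plus-uniform mixture from the sensor model:

```python
import math

# Sketch of the E-step: responsibilities p(C = j | z) over candidate objects,
# plus a final uniform "random measurement" component. The noise level,
# max range, and toy expected ranges are illustrative.
SIGMA = 0.1
Z_MAX = 10.0

def e_step(z, expected_ranges):
    """Return p(C = j | z) for each object j, with a uniform component last."""
    weights = [math.exp(-0.5 * ((z - e) / SIGMA) ** 2) / SIGMA
               for e in expected_ranges]
    weights.append(1.0 / Z_MAX)          # random-measurement component
    total = sum(weights)
    return [w / total for w in weights]

resp = e_step(3.02, [3.0, 6.0])          # a reading near the first object
```

The M-step would then refit each object's parameters using its beams weighted by these responsibilities, which is what makes the wall and door updates weighted least-squares problems.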
Suggesting Object Hypotheses
Wall hypotheses:
  Use the Hough transform (a histogram-based approach)
  Compute the preferred direction of the environment
  Use both to suggest lines
Door hypotheses:
  Use temporal differencing of static maps
  Check whether points along segments in the static maps Mt are well explained in the general map M
  If not, the segment is a potential door
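The Hough step can be sketched as simple bin voting: each laser point votes for the (angle, offset) pairs of lines passing through it, and histogram peaks become wall hypotheses. Bin sizes and the sample points below are illustrative.

```python
import math
from collections import Counter

# Minimal Hough-style line voting: each point votes for (theta, rho) bins;
# peaks in the vote histogram suggest candidate lines.
def hough_votes(points, n_theta=36, rho_res=0.1):
    votes = Counter()
    for x, y in points:
        for i in range(n_theta):
            theta = math.pi * i / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(i, round(rho / rho_res))] += 1
    return votes

# Points on the vertical line x = 2 should all vote for theta = 0, rho = 2.
points = [(2.0, y / 10.0) for y in range(10)]
best_bin, best_count = hough_votes(points).most_common(1)[0]
```

Restricting the voted angles to the environment's preferred direction (and its perpendicular) is a natural way to combine the two wall-hypothesis cues above.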
Results for a Single Pass
Results for Two Passes
Future Work
Simultaneous localization & mapping
Object class hierarchies
Dynamic environments
Learn to recognize object classes
Enrich the object representation: more sophisticated shape models, color, 3D
Hierarchical object maps
Come visit our poster!