Learning Probabilistic Relational Models
Nir Friedman (Hebrew University), Lise Getoor (Stanford University), Daphne Koller (Stanford University), Avi Pfeffer (Stanford University)
Learning from Relational Data
• Data sources
  – relational and object-oriented databases
  – frame-based knowledge bases
  – World Wide Web
• Traditional approaches
  – work well with flat representations
  – fixed-length attribute-value vectors
  – assume IID samples
• Problem:
  – must fix attributes in advance, so can represent only some limited set of structures
  – IID assumption may not hold
Our Approach
• Probabilistic Relational Models (PRMs)
  – rich representation language that models
    • relational dependencies
    • probabilistic dependencies
• Learning PRMs from data stored in relational databases
  – parameter estimation
  – model selection
Outline
• Motivation
• Probabilistic relational models
  – Probabilistic Logic Programming [Poole, 1993]; [Ngo & Haddawy, 1994]
  – Probabilistic object-oriented knowledge [Koller & Pfeffer, 1997; 1998]; [Koller, Levy & Pfeffer, 1997]
• Learning PRMs
• Experimental results
• Conclusions
Probabilistic Relational Models
• Combine advantages of predicate logic & BNs:
  – natural domain modeling: objects, properties, relations;
  – generalization over a variety of situations;
  – compact, natural probability models.
• Integrate uncertainty with relational model:
  – properties of domain entities can depend on properties of related entities;
  – uncertainty over relational structure of domain.
Relational Schema
• Describes the types of objects and relations in the database
• Classes and their attributes
  – Student: Intelligence, Performance
  – Registration: Grade, Satisfaction
  – Course: Difficulty, Rating
  – Professor: Popularity, Teaching-Ability, Stress-Level
• Relationships: Teach, In, Take
Example instance I
• Professor Prof. Gump: Popularity high, Teaching-Ability medium, Stress-Level low
• Course Phil142: Difficulty low, Rating high
• Course Phil101: Difficulty low, Rating high
• Reg #5639: Grade A, Satisfaction 3 (three registration objects, all shown with these values)
• Student John Doe: Intelligence high, Performance average
• Student Jane Doe: Intelligence high, Performance average
What’s Uncertain?
• Relations
• Attribute values
• Objects
[Figure: the example instance I, extended with Student Judy Dunn (Intelligence high, Performance high) and Student John Deer (Intelligence ???, Performance ???)]
Attribute Uncertainty
• Fixed skeleton
  – set of objects in each class
  – relations between them
• Uncertainty
  – over assignments of values to attributes
[Figure: the example instance with unknown attribute values: Prof. Gump (Popularity ???, Teaching-Ability ???, Stress-Level ???); Phil142 and Phil101 (Difficulty ???, Rating ???); two registrations with Grade A, Satisfaction 3 and one with Grade ???, Satisfaction ???; Jane Doe (Intelligence ???, Performance ???)]
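PRM: Dependencies
[Figure: schema with classes Student (Intelligence, Performance), Reg (Grade, Satisfaction), Course (Difficulty, Rating), Professor (Popularity, Teaching-Ability, Stress-Level)]
$P(\text{Reg.Grade} \mid \text{Reg.In.Difficulty}, \text{Reg.Taker.Intell})$
CPT (D = Difficulty, I = Intelligence; columns A, B, C are grades):
D  I  |  A    B    C
h  h  |  0.5  0.4  0.1
h  l  |  0.1  0.5  0.4
l  h  |  0.8  0.1  0.1
l  l  |  0.3  0.6  0.1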
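A minimal sketch of how a dependency such as P(Reg.Grade | Reg.In.Difficulty, Reg.Taker.Intell) could be evaluated: follow the slot chains Reg.In and Reg.Taker to the related objects, read their attributes, and index the shared CPT. The object ids, the dictionary layout, and the function name are hypothetical, and the CPT numbers are a reading of the slide's table that should be treated as illustrative.

```python
# Hypothetical toy instance; ids and layout are illustrative assumptions.
students = {"jane": {"Intelligence": "h"}}
courses = {"phil101": {"Difficulty": "l"}}
regs = {"r1": {"Taker": "jane", "In": "phil101"}}

# CPT for P(Reg.Grade | Reg.In.Difficulty, Reg.Taker.Intell),
# keyed by (Difficulty, Intelligence). Values are illustrative.
GRADE_CPT = {
    ("h", "h"): {"A": 0.5, "B": 0.4, "C": 0.1},
    ("h", "l"): {"A": 0.1, "B": 0.5, "C": 0.4},
    ("l", "h"): {"A": 0.8, "B": 0.1, "C": 0.1},
    ("l", "l"): {"A": 0.3, "B": 0.6, "C": 0.1},
}

def grade_distribution(reg_id):
    """Resolve the parents of reg.Grade through slot chains, then look up the CPT."""
    reg = regs[reg_id]
    difficulty = courses[reg["In"]]["Difficulty"]           # Reg.In.Difficulty
    intelligence = students[reg["Taker"]]["Intelligence"]   # Reg.Taker.Intell
    return GRADE_CPT[(difficulty, intelligence)]

print(grade_distribution("r1"))
```

The same CPT is reused for every registration object, which is what lets one class-level dependency cover instances of any size.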
PRM: Dependencies (cont.)
[Figure: the example instance with Student John Deer added (Intelligence low, Performance average); two registrations have Grade A and two have Grade ?; the CPT for Reg.Grade is repeated next to each registration]
PRM: aggregate dependencies
[Figure: schema with classes Reg (Grade, Satisfaction), Student (Intelligence, Performance), Course (Difficulty, Rating), Professor (Popularity, Teaching-Ability, Stress-Level); Student Jane Doe (Intelligence high, Performance average) with registrations #5077 (Grade C, Satisfaction 2), #5054 (Grade C, Satisfaction 1), #5639 (Grade A, Satisfaction 3)]
• Problem!!! Need CPTs of varying sizes
• CPT conditioned on the avg of the grades:
avg |  l    m    h
A   |  0.1  0.2  0.7
B   |  0.2  0.4  0.4
C   |  0.6  0.3  0.1
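PRM: aggregate dependencies
[Figure: schema with classes Student (Intelligence, Performance), Reg (Grade, Satisfaction), Course (Difficulty, Rating), Professor (Popularity, Teaching-Ability, Stress-Level); aggregate dependencies labeled avg, avg, and count]
• Aggregates: sum, min, max, avg, mode, count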
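A small sketch of why aggregation removes the need for CPTs of varying sizes: however many registrations a student has, the multiset of grades is first reduced to a single value, and that value indexes a fixed-size CPT. The numeric grade encoding, the rounding rule, and the function names are assumptions made for illustration, and the CPT numbers are illustrative.

```python
from collections import Counter
from statistics import mean

GRADE_POINTS = {"A": 3, "B": 2, "C": 1}                  # assumed numeric encoding
POINTS_GRADE = {p: g for g, p in GRADE_POINTS.items()}

# Fixed-size CPT keyed by the aggregated grade; columns are the
# child's values l, m, h. Numbers are illustrative.
CPT_GIVEN_AVG = {
    "A": {"l": 0.1, "m": 0.2, "h": 0.7},
    "B": {"l": 0.2, "m": 0.4, "h": 0.4},
    "C": {"l": 0.6, "m": 0.3, "h": 0.1},
}

def avg_grade(grades):
    """Average a multiset of letter grades and round back to a letter."""
    return POINTS_GRADE[round(mean([GRADE_POINTS[g] for g in grades]))]

def mode_grade(grades):
    """Most frequent grade; ties go to the grade seen first."""
    return Counter(grades).most_common(1)[0][0]

jane = ["C", "C", "A"]            # grades of Jane Doe's three registrations
print(avg_grade(jane), CPT_GIVEN_AVG[avg_grade(jane)])
print(mode_grade(jane))
```

Under this encoding the grades C, C, A average to B, while their mode is C, so the choice of aggregate changes which CPT row applies.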
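PRM: Summary
• A PRM specifies
  – a probabilistic dependency structure S
    • a set of parents for each attribute X.A
  – a set of local probability models (parameters θ)
• Given a skeleton structure σ, a PRM specifies a probability distribution over instances I:
  – over attribute values of all objects in σ
$$P(I \mid \sigma, S, \theta) = \prod_{X} \; \prod_{x \in \sigma(X)} \; \prod_{A \in \mathcal{A}(X)} P\bigl(I_{x.A} \mid I_{\mathrm{parents}(x.A)}\bigr)$$
  – the products range over classes X, objects x of class X, and attributes A of X
  – $I_{x.A}$ is the value of attribute A in object x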
Learning PRMs
[Figure: a relational schema and a database (instance I, with Course, Student, and Reg tables) are the inputs to learning]
• Parameter estimation
• Structure selection
Parameter estimation in PRMs
• Assume known dependency structure S
• Goal: estimate PRM parameters θ
  – entries in local probability models, $\theta_{x.A \mid \mathrm{parents}(x.A)}$
• A parameterization θ is good if it is likely to generate the observed data, instance I.
• MLE Principle: Choose θ so as to maximize l
$$l(\theta : I, S) = \log P(I \mid S, \theta)$$
• Crucial property: decomposition
  – separate terms for different X.A
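The decomposition property can be written out as a short derivation. This is a sketch under the usual assumptions: every object of a class shares the same local model, σ(X) denotes the objects of class X in the skeleton (left implicit in the likelihood notation), and $N_{X.A}(v, u)$ is an assumed name for the number of objects whose attribute A has value v while its parents take the joint value u.

```latex
\begin{align*}
l(\theta : I, S)
  &= \log P(I \mid S, \theta) \\
  &= \sum_{X} \sum_{A \in \mathcal{A}(X)} \sum_{x \in \sigma(X)}
     \log P\bigl(I_{x.A} \mid I_{\mathrm{parents}(x.A)}\bigr) \\
  &= \sum_{X.A} \sum_{u} \sum_{v}
     N_{X.A}(v, u) \, \log \theta_{X.A = v \mid \mathrm{parents}(X.A) = u}
\end{align*}
```

Each attribute X.A contributes its own sum, so its parameters can be estimated from its own counts without reference to any other attribute.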
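ML parameter estimation
[Figure: schema with classes Student (Intelligence, Performance), Reg (Grade, Satisfaction), Course (Difficulty, Rating)]
$P(\text{Reg.Grade} \mid \text{Reg.In.Difficulty}, \text{Reg.Taker.Intell})$
$$\theta^{*}_{R.G=A \mid C.D=l,\, S.I=h} = \frac{N(R.G=A,\; C.D=l,\; S.I=h)}{N(C.D=l,\; S.I=h)}$$
• The counts N(...) are the sufficient statistics
• DB technology is well suited to the computation of sufficient statistics:
  – join the Course, Reg, and Student tables
  – count the rows for each combination of S.Int, R.Grade, C.Diff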
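The join-and-count computation can be sketched with an in-memory SQLite database from the Python standard library. The table names, column names, and toy rows below are hypothetical; only the pattern (join the three tables, group by the parent and child values, count) follows the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (sid TEXT PRIMARY KEY, intelligence TEXT);
CREATE TABLE course  (cid TEXT PRIMARY KEY, difficulty TEXT);
CREATE TABLE reg     (rid INTEGER PRIMARY KEY, sid TEXT, cid TEXT, grade TEXT);
INSERT INTO student VALUES ('s1','h'), ('s2','h'), ('s3','h'), ('s4','l');
INSERT INTO course  VALUES ('c1','l'), ('c2','h');
INSERT INTO reg VALUES (1,'s1','c1','A'), (2,'s2','c1','A'),
                       (3,'s3','c1','B'), (4,'s4','c2','C');
""")

# Sufficient statistics N(R.G, C.D, S.I): one join plus GROUP BY / COUNT.
QUERY = """
SELECT c.difficulty, s.intelligence, r.grade, COUNT(*)
FROM reg r
JOIN student s ON r.sid = s.sid
JOIN course  c ON r.cid = c.cid
GROUP BY c.difficulty, s.intelligence, r.grade
"""
counts = {(d, i, g): n for d, i, g, n in conn.execute(QUERY)}

# N(C.D, S.I): marginalize the grade out of the joint counts.
totals = {}
for (d, i, _grade), n in counts.items():
    totals[(d, i)] = totals.get((d, i), 0) + n

# Maximum-likelihood estimate: theta*[g | d, i] = N(g, d, i) / N(d, i).
theta = {(g, d, i): n / totals[(d, i)] for (d, i, g), n in counts.items()}
print(theta[("A", "l", "h")])
```

The database does the heavy lifting: the learner only sees one row per distinct value combination, not one row per registration.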
Model Selection
• Idea:
  – define scoring function
  – do local search over legal structures
• Key Components:
  – scoring models
  – legal models
  – searching model space
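Scoring Models
• Bayesian approach:
$$\mathrm{Score}(S : I) = \log P(S \mid I) = \log \bigl[\, P(I \mid S)\, P(S) \,\bigr] + C$$
  – P(I | S): marginal likelihood
  – P(S): prior
  – C does not depend on S
• closed form solution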
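A sketch of what a closed-form marginal likelihood can look like for one attribute. It assumes multinomial CPTs with a symmetric Dirichlet prior on each row, which is the standard setting for Bayesian network scores; the slide does not state the prior, and the function name, the `alpha` parameter, and the count layout are illustrative.

```python
from math import lgamma, log

def log_marginal_likelihood(counts, alpha=1.0):
    """Log marginal likelihood of one attribute's data.

    counts: {parent_config: {child_value: N}}; every child value must
    appear in each inner dict, with 0 if unseen.
    alpha:  Dirichlet hyperparameter shared by all child values (assumption).
    """
    total = 0.0
    for child_counts in counts.values():
        k = len(child_counts)               # number of child values
        n = sum(child_counts.values())      # data items with this parent config
        total += lgamma(k * alpha) - lgamma(k * alpha + n)
        for n_v in child_counts.values():
            total += lgamma(alpha + n_v) - lgamma(alpha)
    return total

# Two observations of "a" and one of "b", no parents, uniform prior:
# the sequence probability is (1/2) * (2/3) * (1/4) = 1/12.
example = {(): {"a": 2, "b": 1}}
print(log_marginal_likelihood(example), log(1 / 12))
```

Because the score is a sum of such terms, one per attribute, a local change to the structure only requires rescoring the attributes whose parent sets changed.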
Legal Models
• Dependency ordering over attributes: y.b ≺ x.a if X.A depends on Y.B
[Figure: Paper.Accepted and Researcher.Reputation, linked through the author-of relation]
• PRM defines a coherent probability model over skeleton σ if the ordering ≺ is acyclic
Guaranteeing Acyclicity
• How do we guarantee that a PRM is acyclic for every skeleton?
• From the PRM dependency structure S, build the dependency graph: an edge from Y.B to X.A if X.A depends directly on Y.B
• Attribute stratification: dependency graph acyclic ⇒ the dependency ordering is acyclic for any skeleton σ
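The stratification test amounts to checking that the class-level dependency graph has no directed cycle. A minimal sketch using Kahn's topological-sort procedure follows; the function name and the edge-list format are assumptions, and the genetics edge is an assumed example of an attribute depending on the same attribute of a related object.

```python
def is_acyclic(edges):
    """edges: iterable of (parent, child) pairs over attribute names like 'Reg.Grade'."""
    children, indegree = {}, {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
        indegree.setdefault(parent, 0)
        indegree[child] = indegree.get(child, 0) + 1

    # Repeatedly remove nodes with no remaining incoming edges.
    frontier = [node for node, deg in indegree.items() if deg == 0]
    removed = 0
    while frontier:
        node = frontier.pop()
        removed += 1
        for child in children.get(node, []):
            indegree[child] -= 1
            if indegree[child] == 0:
                frontier.append(child)
    # Any node left over lies on, or downstream of, a cycle.
    return removed == len(indegree)

university = [("Course.Difficulty", "Reg.Grade"), ("Student.Intelligence", "Reg.Grade")]
genetics = [("Person.M-chrom", "Person.M-chrom")]   # depends on a relative's M-chrom (assumed)
print(is_acyclic(university), is_acyclic(genetics))
```

The genetics edge is a self-loop at the class level, so this test rejects it even when no individual person can be their own ancestor.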
Limitation of stratification
[Figure: three Person objects, each with M-chromosome, P-chromosome, and Blood-type, linked by Father and Mother relations]
[Dependency graph over Person.M-chrom, Person.P-chrom, Person.B-type: ???]
Guaranteed acyclic relations
[Figure: three Person objects, each with M-chromosome, P-chromosome, and Blood-type, linked by Father and Mother relations]
• Prior knowledge: the Father-of relation is acyclic
  – dependence of Person.A on Person.Father.B cannot induce cycles
Guaranteeing acyclicity
• With guaranteed acyclic relations, some cycles in the dependency graph are guaranteed to be safe.
• We color the edges in the dependency graph
  – yellow: within single object (X.B → X.A)
  – green: via guaranteed acyclic (g.a.) relation (Y.B → X.A)
  – red: via other relations (Y.B → X.A)
• A cycle is safe if
  – it has a green edge
  – it has no red edge
[Figure: colored dependency graph over Person.M-chrom, Person.P-chrom, Person.B-type]
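The coloring rule can be checked by brute force on a small dependency graph: enumerate every simple cycle and require a green edge and no red edge on each. This is a sketch for illustration only, not the procedure used by the authors; the function names and the edge triples are assumptions, and enumeration is exponential in the worst case.

```python
def simple_cycles(edges):
    """Yield each simple cycle once, as a list of (src, dst, color) edges."""
    out = {}
    for edge in edges:
        out.setdefault(edge[0], []).append(edge)
    nodes = sorted({n for src, dst, _ in edges for n in (src, dst)})
    for start in nodes:
        # Only visit nodes greater than start, so every cycle is
        # reported from its smallest node and never duplicated.
        stack = [(start, [])]
        while stack:
            node, path = stack.pop()
            for edge in out.get(node, []):
                nxt = edge[1]
                if nxt == start:
                    yield path + [edge]
                elif nxt > start and nxt != node and all(nxt != e[0] for e in path):
                    stack.append((nxt, path + [edge]))

def all_cycles_safe(edges):
    """A cycle is safe if it has a green edge and no red edge."""
    return all(
        any(color == "green" for _, _, color in cycle)
        and not any(color == "red" for _, _, color in cycle)
        for cycle in simple_cycles(edges)
    )

genetics = [
    ("Person.M-chrom", "Person.M-chrom", "green"),   # via a g.a. relation (assumed)
    ("Person.P-chrom", "Person.P-chrom", "green"),   # via a g.a. relation (assumed)
    ("Person.M-chrom", "Person.B-type", "yellow"),   # within a single object
    ("Person.P-chrom", "Person.B-type", "yellow"),
]
print(all_cycles_safe(genetics))
```

With these edges the only cycles are the two green self-loops, so the structure is accepted even though the plain acyclicity test would reject it.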
Searching Model Space
• Phase 0: consider only dependencies within a class
[Figure: starting from a structure over Student, Course, and Reg, candidate moves such as "Add C.A → C.B" and "Delete S.I → S.P", each labeled with its score]
Phased structure search
• Phase 1: consider dependencies from “neighboring” classes, via schema relations
[Figure: candidate moves over Student, Course, and Reg such as "Add C.A → R.B" and "Add S.I → R.C", each labeled with its score]
Phased structure search
• Phase 2: consider dependencies from “further” classes, via relation chains
[Figure: candidate moves over Student, Course, and Reg such as "Add C.A → S.P" and "Add S.I → C.B", each labeled with its score]
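One way to picture the phases is as a breadth-first expansion over the schema: phase k offers parents from classes reachable through a chain of k relations. The sketch below groups classes by shortest distance, which is a simplification (a relation chain may also return to a class already seen); the schema dictionary and the function name are assumptions.

```python
# Schema relations treated as undirected links (assumption): Reg is
# linked to Student and to Course.
SCHEMA = {
    "Student": ["Reg"],
    "Reg": ["Student", "Course"],
    "Course": ["Reg"],
}

def classes_by_phase(schema, target, max_phase):
    """Map each phase k to the classes first reached from target by k relations."""
    phases = {0: [target]}
    seen = {target}
    frontier = [target]
    for k in range(1, max_phase + 1):
        reached = []
        for cls in frontier:
            for neighbor in schema[cls]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    reached.append(neighbor)
        phases[k] = reached
        frontier = reached
    return phases

# Candidate parent classes for attributes of Student, phase by phase.
print(classes_by_phase(SCHEMA, "Student", 2))
```

For Student this gives Student itself in phase 0, Reg in phase 1, and Course in phase 2, matching the order in which the slides widen the search.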
Experimental Results: Movie Domain (real data)
• 11,000 movies, 7,000 actors
• Classes and attributes
  – Actor: Gender
  – Appears: Role-type
  – Movie: Process, Decade, Genre
• source: http://www-db.stanford.edu/movies/doc.html
Genetics domain (synthetic data)
[Figure: three Person objects, each with M-chromosome, P-chromosome, and Blood-type, linked by Father and Mother relations; Blood-Test with attributes Contaminated and Result]
Experimental Results
[Plot: Score (y-axis, -32000 to -18000) versus Dataset Size (x-axis, 200 to 800); series: Median Likelihood, Gold Standard]
Future directions
• Learning in complex real-world domains
  – drug treatment regimes
  – collaborative filtering
• Missing data
• Learning with structural uncertainty
• Discovery
  – hidden variables
  – causal structure
  – class hierarchy
Conclusions
• PRMs natural extension of BNs:
  – well-founded (probabilistic) semantics
  – compact representation of complex models
• Powerful learning techniques
  – builds on BN learning techniques
  – can learn directly from relational data
• Parameter estimation
  – efficient, effective exploitation of DB technology
• Structure identification
  – builds on well understood theory
  – major issues:
    • guaranteeing coherence
    • search heuristics