Bayesian Logic Programs for Plan Recognition and Machine Reading

Sindhu Raghavan
Advisor: Raymond Mooney

PhD Oral Defense
Nov 29th, 2012
Outline
• Motivation
• Background
  – Bayesian Logic Programs (BLPs)
• Plan Recognition
• Machine Reading
  – BLPs for inferring implicit facts
  – Online Rule Learning
  – Scoring Rules using WordNet
• Future Work
• Conclusions
Machine Reading
Machine reading involves the automatic extraction of knowledge from natural language text.

Example
"Barack Obama is the current President of the USA……. Obama was born on August 4, 1961, in Hawaii, USA……."

Extracted facts
nationState(usa)
person(barackobama)
isLedBy(usa,barackobama)
hasBirthPlace(barackobama,usa)
employs(usa,barackobama)

Data is relational in nature - several entities and several relations between them.
Characteristics of Real-World Data
• Relational or structured data
  – Several entities in the domain
  – Several relations between entities
  – Not always independent and identically distributed (i.i.d.)
• Presence of noise or uncertainty
  – Uncertainty in the types of entities
  – Uncertainty in the relations

Traditional approaches like first-order logic or probabilistic models can handle either structured data or uncertainty, but not both.
Statistical Relational Learning (SRL)
• Integrates first-order logic and probabilistic graphical models [Getoor and Taskar, 2007]
  – Overcomes limitations of traditional approaches
• SRL formalisms
  – Stochastic Logic Programs (SLPs) [Muggleton, 1996]
  – Probabilistic Relational Models (PRMs) [Friedman et al., 1999]
  – Bayesian Logic Programs (BLPs) [Kersting and De Raedt, 2001]
  – Markov Logic Networks (MLNs) [Richardson and Domingos, 2006]
Bayesian Logic Programs (BLPs) [Kersting and De Raedt, 2001]
• Integrate first-order logic and Bayesian networks
• Why BLPs?
  – Efficient grounding mechanism that includes only those variables that are relevant to the query
  – Easy to extend by incorporating any type of logical inference to construct networks
  – Well suited for capturing causal relations in data
Objectives

Plan recognition involves predicting the top-level plan of an agent based on its observed actions.

Machine reading involves the automatic extraction of knowledge from natural language text.
Common characteristics
• Inference and learning from partially observed or incomplete data
• Plan recognition
  – Top-level plan is not observed
  – Some of the executed actions can be unobserved
• Machine reading
  – Information that is implicit is rarely observed in data
  – Common sense knowledge is not always explicitly stated
Thesis Contributions
• Plan Recognition
  – Bayesian Abductive Logic Programs (BALPs) [ECML 2011]
• Machine Reading
  – BLPs for learning to infer implicit facts from natural language text [ACL 2012]
  – Online rule learner for learning common sense knowledge from natural language extractions [In Submission]
  – Approach to scoring first-order rules (common sense knowledge) using WordNet [In Submission]
Bayesian Logic Programs (BLPs) [Kersting and De Raedt, 2001]
• Set of Bayesian clauses a | a1, a2, ...., an
  – Definite clauses that are universally quantified
  – Range-restricted, i.e. variables(head) ⊆ variables(body)
  – Associated conditional probability table (CPT): P(head | body)
• Bayesian predicates a, a1, a2, …, an have finite domains
  – Combining rule like noisy-or for mapping multiple CPTs into a single CPT
• Given a set of Bayesian clauses and a query, SLD resolution is used to construct ground Bayesian networks for probabilistic inference
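The noisy-or combining rule above can be sketched in a few lines of Python: each clause whose body is satisfied contributes its CPT entry P(head | body), and the head fails only if every contributing cause independently fails. The parameters 0.9 and 0.6 are illustrative, not the thesis's learned values.

```python
def noisy_or(cause_probs):
    """Noisy-or combining rule: the head is false only if every
    satisfied clause independently fails to activate it."""
    p_all_fail = 1.0
    for p in cause_probs:
        p_all_fail *= (1.0 - p)
    return 1.0 - p_all_fail

# Two ground clauses support the same head atom with CPT entries 0.9 and 0.6:
p = noisy_or([0.9, 0.6])  # 1 - (0.1 * 0.4) = 0.96
```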
Probabilistic Inference and Learning
• Probabilistic inference
  – Marginal probability
    • Exact inference
    • SampleSearch [Gogate and Dechter, 2007]
• Learning [Kersting and De Raedt, 2008]
  – Parameters
    • Expectation Maximization
    • Gradient-ascent based learning
Plan Recognition
• Predict an agent's top-level plan based on its observed actions
• Abductive reasoning involving inference of cause from effect
• Since SLD resolution used in BLPs is deductive in nature, BLPs cannot be used as is for plan recognition
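The deduction/abduction contrast can be sketched with a toy plan library (the predicate and plan names here are hypothetical, not from the evaluation domains): deduction chains from known bodies to heads, while abduction starts from an observed action and assumes plan literals that would explain it.

```python
# Toy plan library: clause head (observable action) explained by a plan literal.
# These names are illustrative only.
clauses = [
    ("getWeapon", "robbing"),
    ("getWeapon", "hunting"),
    ("goToBank",  "robbing"),
]

def abduce(observations):
    """Abduction: collect every plan hypothesis whose clause
    could have produced one of the observed actions."""
    return {plan for obs in observations
                 for head, plan in clauses if head == obs}

hypotheses = abduce(["getWeapon", "goToBank"])  # {"robbing", "hunting"}
```

Probabilistic inference over the resulting hypotheses (as BALPs do) then ranks "robbing", which explains both observations, above "hunting".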
Extending BLPs for Plan Recognition

BLPs + Logical Abduction (Stickel's abduction algorithm) → BALPs

BALPs – Bayesian Abductive Logic Programs
Experimental Evaluation
• Data
  – Monroe [Blaylock and Allen, 2005]
  – Linux [Blaylock and Allen, 2005]
  – Story Understanding [Ng and Mooney, 1992]
• Systems compared
  – BALPs
  – MLN-HCAM [Singla and Mooney, 2011]
  – Blaylock and Allen's system [Blaylock and Allen, 2005]
  – ACCEL-Simplicity [Ng and Mooney, 1992]
  – ACCEL-Coherence [Ng and Mooney, 1992]
Summary of Results
• Monroe and Linux
  – BALPs outperform both MLN-HCAM and the system by Blaylock and Allen
• Story Understanding
  – BALPs outperform both MLN-HCAM and ACCEL-Simplicity
  – ACCEL-Coherence outperforms BALPs and other systems
    • Specifically developed for text interpretation
• Automatic learning of model parameters using EM
Machine Reading
• Natural language text is typically "incomplete"
  – Some information is always implicit
  – Common sense information is not always explicitly stated
  – Grice's maxim of quantity [1975]
• Information extraction (IE) systems extract information that is explicitly stated [Cowie and Lehnert, 1996; Sarawagi, 2008]
  – Cannot extract information that is implicit
Example
Natural language text: "Barack Obama is the President of the United States of America."
Query: "Barack Obama is a citizen of what country?"

IE systems cannot answer this query since citizenship information is not explicitly stated.
Objective
• Infer implicit facts from explicitly stated information
  – Extract explicitly stated facts using an off-the-shelf IE system
  – Learn common sense knowledge in the form of first-order rules to deduce additional facts
  – Use BLPs for inference of additional facts
Related Work
• Logical deduction based approaches
  – Learning propositional rules [Nahm and Mooney, 2000]
  – Purely logical deduction is brittle since it cannot assign probabilities to inferences
  – Learning probabilistic first-order rules using FOIL and FARMER [Carlson et al., 2010; Doppa et al., 2010]
  – Probabilities are not computed using well-founded probabilistic graphical models
• MLN based approaches for inferring additional facts [Schoenmackers et al., 2010; Sorower et al., 2011]
  – "Brute force" inference could result in intractably large networks for large domains
  – Scaling MLNs to large domains [Schoenmackers et al., 2010; Niu et al., 2012]
Objectives
• BLPs for learning to infer implicit facts from natural language text
• Online rule learner for learning common sense knowledge from natural language extractions
• Approach to scoring first-order common sense knowledge using WordNet
System Architecture

Training documents → Information Extractor (IBM SIRE) → Extracted facts → Rule learner → First-order logical rules → BLP weight learner → Bayesian Logic Program (BLP)

Test document → Extractions → BLP inference engine → Inferences with probabilities

Example walk-through:

Training text: "Barack Obama is the current President of USA……. Obama was born on August 4, 1961, in Hawaii, USA."

Extracted facts:
nationState(USA)
person(BarackObama)
isLedBy(USA,BarackObama)
hasBirthPlace(BarackObama,USA)
hasCitizenship(BarackObama,USA)

Learned rules:
nationState(B) ∧ isLedBy(B,A) → hasCitizenship(A,B)
nationState(B) ∧ employs(B,A) → hasCitizenship(A,B)

Rules with learned weights (Bayesian clauses):
hasCitizenship(A,B) | nationState(B), isLedBy(B,A)    .9
hasCitizenship(A,B) | nationState(B), employs(B,A)    .6

Test extractions:
nationState(malaysia)
person(mahathir-mohamad)
isLedBy(malaysia,mahathir-mohamad)
employs(malaysia,mahathir-mohamad)

Inference with probability:
hasCitizenship(mahathir-mohamad, malaysia)    0.75
System Architecture (instantiated): the rule learner component is Inductive Logic Programming (LIME); the rest of the pipeline is unchanged.
Inductive Logic Programming (ILP) for learning first-order rules

Target relation: hasCitizenship(X,Y)

Positive instances:
hasCitizenship(BarackObama,USA)
hasCitizenship(GeorgeBush,USA)
hasCitizenship(IndiraGandhi,India)
...

Negative instances (generated using the closed-world assumption):
hasCitizenship(BarackObama,India)
hasCitizenship(GeorgeBush,India)
hasCitizenship(IndiraGandhi,USA)
...

KB:
hasBirthPlace(BarackObama,USA)
person(BarackObama)
nationState(USA)
nationState(India)
...

Rules:
nationState(Y) ∧ person(X) ∧ isLedBy(Y,X) → hasCitizenship(X,Y)
...
Inference using BLPs

Test document: "Barack Obama is the current President of the USA……. Obama was born on August 4, 1961, in Hawaii, USA……."

Extracted facts:
nationState(usa)
person(barackobama)
isLedBy(usa,barackobama)
hasBirthPlace(barackobama,usa)
employs(usa,barackobama)

Learned rules:
nationState(B) ∧ person(A) ∧ isLedBy(B,A) → hasCitizenship(A,B)
nationState(B) ∧ person(A) ∧ employs(B,A) → hasCitizenship(A,B)
Logical Inference - Proof 1

Rule: nationState(B) ∧ person(A) ∧ isLedBy(B,A) → hasCitizenship(A,B)
Facts: nationState(usa), person(barackobama), isLedBy(usa,barackobama)
Conclusion: hasCitizenship(barackobama,usa)

Logical Inference - Proof 2

Rule: nationState(B) ∧ person(A) ∧ employs(B,A) → hasCitizenship(A,B)
Facts: nationState(usa), person(barackobama), employs(usa,barackobama)
Conclusion: hasCitizenship(barackobama,usa)
Bayesian Network Construction

Ground network for the query hasCitizenship(barackobama, usa):
– Evidence nodes: nationState(usa), person(barackobama), isLedBy(usa,barackobama), employs(usa,barackobama)
– Each proof contributes a deterministic logical-and node (dummy1, dummy2) over its body atoms
– A noisy-or node combines the dummy nodes into the query node hasCitizenship(barackobama,usa)

Marginal Probability ??
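With this network structure, the marginal of the query atom can be sketched as follows: each proof is a deterministic logical-and of its body atoms, the clause's CPT parameter gives P(head | body), and noisy-or combines the proofs. The parameters 0.9 and 0.6 are illustrative values, not the learned ones.

```python
from functools import reduce

# Evidence: extracted facts, observed true with probability 1.0.
evidence = {
    "nationState(usa)": 1.0,
    "person(barackobama)": 1.0,
    "isLedBy(usa,barackobama)": 1.0,
    "employs(usa,barackobama)": 1.0,
}

# One entry per ground proof: (clause CPT parameter, body atoms).
proofs = [
    (0.9, ["nationState(usa)", "person(barackobama)", "isLedBy(usa,barackobama)"]),
    (0.6, ["nationState(usa)", "person(barackobama)", "employs(usa,barackobama)"]),
]

def marginal(proofs, evidence):
    """Each dummy node is a deterministic AND of its body atoms;
    the noisy-or node fails only if every proof fails."""
    fail = 1.0
    for p_clause, body in proofs:
        p_body = reduce(lambda a, b: a * b,
                        (evidence.get(atom, 0.0) for atom in body), 1.0)
        fail *= 1.0 - p_clause * p_body
    return 1.0 - fail

p_query = marginal(proofs, evidence)  # 1 - (1-0.9)(1-0.6) = 0.96
```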
Experimental Evaluation
• Data
  – DARPA's intelligence community (IC) data set from the Machine Reading Project (MRP)
  – Consists of news articles on politics, terrorism, and other international events
  – 10,000 documents in total
• Perform 10-fold cross validation
Experimental Evaluation
• Learning first-order rules using LIME [McCreath and Sharma, 1998]
  – Learn rules for 13 target relations
  – Learn rules using both positive and negative instances and using only positive instances
  – Include all unique rules learned from different models
• Learning BLP parameters
  – Learn noisy-or parameters using Expectation Maximization (EM)
  – Set priors to maximum likelihood estimates
Experimental Evaluation
• Performance evaluation
  – Lack of ground truth for evaluation
  – Manually evaluated inferred facts from 40 documents, randomly selected from each test set
  – Compute precision: fraction of inferences that are correct
  – Compute two precision scores
    • Unadjusted (UA) – does not account for extractor's mistakes
    • Adjusted (AD) – accounts for extractor's mistakes
  – Rank inferences using marginal probabilities and evaluate top-n
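One plausible implementation of these two scores (the exact adjustment used in the thesis may differ): unadjusted precision counts every judged inference in the top-n, while adjusted precision discards inferences that are wrong only because the extractor supplied a wrong fact.

```python
def precision_scores(ranked, n):
    """ranked: (marginal_prob, is_correct, due_to_extractor_error) tuples.
    Returns (unadjusted, adjusted) precision over the top-n inferences."""
    top = sorted(ranked, key=lambda r: -r[0])[:n]
    unadjusted = sum(1 for _, ok, _ in top if ok) / len(top)
    kept = [ok for _, ok, err in top if not err]  # drop extractor-caused errors
    adjusted = sum(kept) / len(kept) if kept else 0.0
    return unadjusted, adjusted

# Toy judgments: one inference is wrong purely because of an extraction error.
judged = [(0.9, True, False), (0.8, False, True),
          (0.7, True, False), (0.6, False, False)]
ua, ad = precision_scores(judged, 4)  # ua = 0.5, ad = 2/3
```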
Experimental Evaluation
• Systems compared
  – BLP Learned Weights
    • Noisy-or parameters learned using EM
  – BLP Manual Weights
    • Noisy-or parameters set to 0.9
  – Logical Deduction
  – MLN Learned Weights
    • Learn weights using generative online weight learner
  – MLN Manual Weights
    • Assign a weight of 10 to all rules and MLE priors to all predicates
Unadjusted Precision
[results chart]
Inferior performance of EM
• Insufficient training data
• Lack of ground truth information for relations that can be inferred
  – Implicit relations seen less frequently in training data
  – EM learns lower weights for rules corresponding to implicit relations
Performance of MLNs
• Inferior performance of MLNs
  – Insufficient training data for learning
  – Use of closed-world assumption for inference and learning
  – Lack of strictly typed ontology
    • GeopoliticalEntity could be an Agent as well as a Location
• Improvements to MLNs
  – Integrity constraints to avoid inference of spurious facts like employs(a,a)
  – Incorporate techniques proposed by Sorower et al. [2011]
Limitations of LIME
• Assumes data is accurate
  – Artificially generated negative instances are usually noisy and inaccurate
  – Extraction errors result in noisy data
• Does not scale to large corpora

Goal: develop an approach that can learn first-order rules from noisy and incomplete IE extractions.
Online Rule Learning
• Incorporates the incomplete nature of natural language text
  – Body consists of relations that are explicitly stated
  – Head is a relation that can be inferred
• Relations that are implicit occur less frequently than those that are explicitly stated
  – Use frequency of occurrence as a heuristic to distinguish the two types of relations
• Processes examples in an online manner to scale to large corpora
Approach (learning from positive instances only)
• For each example, construct a directed graph of relation extractions
  – Add directed edges between nodes that share one or more constants
  – Relations connected by edges are related and participate in the same rule
• Traverse the graph to learn first-order rules
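The graph-construction step can be sketched on the running example below; the frequency counts follow the slides' illustration, and the direction heuristic (frequent relation toward rarer, more likely implicit relation) is one reading of how rule heads are chosen.

```python
from itertools import combinations

# Relation extractions from one document: (predicate, constants).
extractions = [
    ("isLedBy", ("USA", "BarackObama")),
    ("hasBirthPlace", ("BarackObama", "USA")),
    ("hasCitizenship", ("BarackObama", "USA")),
]
# Corpus-wide frequencies: implicit relations occur less often.
freq = {"isLedBy": 33, "hasBirthPlace": 25, "hasCitizenship": 17}

def build_graph(extractions, freq):
    """Directed edge a -> b whenever two extractions share a constant;
    the edge points toward the rarer (more likely implicit) relation."""
    edges = []
    for a, b in combinations(extractions, 2):
        if set(a[1]) & set(b[1]):                 # share at least one constant
            src, dst = (a, b) if freq[a[0]] >= freq[b[0]] else (b, a)
            edges.append((src[0], dst[0]))
    return edges

edges = build_graph(extractions, freq)
# [('isLedBy', 'hasBirthPlace'), ('isLedBy', 'hasCitizenship'),
#  ('hasBirthPlace', 'hasCitizenship')]
```

Each edge then seeds one rule, with the edge's target as the head.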
Example
"Barack Obama is the current President of the USA……. Obama, citizen of the USA was born on August 4, 1961, in Hawaii, USA……."

Extracted facts:
nationState(USA)
person(BarackObama)
isLedBy(USA,BarackObama)
hasBirthPlace(BarackObama,USA)
hasCitizenship(BarackObama,USA)
Directed graph construction

Nodes: isLedBy(USA, BarackObama), hasBirthPlace(BarackObama, USA), hasCitizenship(BarackObama, USA)

Corpus frequencies: isLedBy 33, hasBirthPlace 25, hasCitizenship 17

Directed edges connect nodes that share constants; the frequency heuristic orients each edge toward the rarer, more likely implicit relation.
Graph Traversal

Each edge yields a ground rule from source to target:
isLedBy(USA, BarackObama) → hasBirthPlace(BarackObama, USA)

Unary predicates of the participating constants are added to the body:
isLedBy(USA, BarackObama) ∧ person(BarackObama) ∧ nationState(USA) → hasBirthPlace(BarackObama, USA)

Constants are then replaced with variables (variablization):
isLedBy(X, Y) ∧ person(Y) ∧ nationState(X) → hasBirthPlace(Y, X)
Rules learned

isLedBy(X, Y) ∧ person(Y) ∧ nationState(X) → hasBirthPlace(Y, X)
isLedBy(X, Y) ∧ person(Y) ∧ nationState(X) → hasCitizenship(Y, X)
hasBirthPlace(X, Y) ∧ person(X) ∧ nationState(Y) → hasCitizenship(X, Y)
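The variablization step, replacing each distinct constant with a variable used consistently across body and head, can be sketched as:

```python
def variablize(literals):
    """Replace each distinct constant with a variable, using the same
    variable everywhere the constant appears (USA -> X, BarackObama -> Y)."""
    mapping = {}
    def var(c):
        if c not in mapping:
            mapping[c] = "XYZUVW"[len(mapping)]  # enough for short rules
        return mapping[c]
    return [(pred, tuple(var(c) for c in args)) for pred, args in literals]

ground = [("isLedBy", ("USA", "BarackObama")),
          ("person", ("BarackObama",)),
          ("nationState", ("USA",)),
          ("hasBirthPlace", ("BarackObama", "USA"))]  # last literal is the head

rule = variablize(ground)
# [('isLedBy', ('X', 'Y')), ('person', ('Y',)),
#  ('nationState', ('X',)), ('hasBirthPlace', ('Y', 'X'))]
```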
Sample rules

employs(X, Y) ∧ commercialOrganization(X) → hasMemberPerson(X, Y)
isLedBy(X, Y) ∧ nationState(X) → hasCitizenship(Y, X)
isLedBy(X, Y) ∧ nationState(X) ∧ person(Y) → hasBirthPlace(Y, X)
Experimental Evaluation
• Learn first-order rules for 14 target relations
  – Full-set: all 14 target relations
  – Subset: 10 target relations
• Manually set noisy-or parameters to 0.9
• Systems compared
  – Online Rule Learner (ORL)
  – LIME [McCreath and Sharma, 1998]
  – Combined
Full-set
[results chart]
Inferior performance of ORL on Full-set
• Several incorrect inferences with high marginal probabilities
  – Instances of thingPhysicallyDamaged and eventLocationGPE
  – High probabilities due to multiple rules inferring these instances
  – Rules not very accurate, resulting in inaccurate inferences
Subset
[results chart]
Running Time
• LIME
  – Learns rules for one target relation at a time
  – Time includes learning from positive-only examples and from positive and negative examples
• ORL
  – Learns rules for all target relations at once

ORL: 3.8 mins    LIME: 11.23 hrs
Scoring first-order rules
• Predicate names employ English words
• Confident rules typically have predicates whose words are semantically related
• Use word similarity or relatedness to calculate weights
  – Word similarity computed using WordNet
• Compute weights between 0 and 1, which are then used as noisy-or parameters
  – Higher weights indicate more confident rules
WordNet[Fellbaum, 1998]
• Lexical knowledge base consisting of 130,000 English words
• Nouns, verbs, adjectives, and adverbs organized into “synsets” (synonym sets)
• wup [Wu and Palmer, 1994] similarity measure to compute word similarity– Computes scaled similarity scores between 0 and 1– Computes the depth of the least common subsumer
of the given words and scales it by the sum of the depths of the given words
68
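The wup computation can be sketched against a toy hypernym taxonomy; the hierarchy below is illustrative, not WordNet's actual one, but the scaled-depth formula matches the definition on the slide:

```python
# Toy hypernym taxonomy (child -> parent); "entity" is the root.
TAXONOMY = {
    "organization": "group",
    "government": "group",
    "group": "entity",
    "member": "person",
    "person": "entity",
}

def path_to_root(word):
    """Chain of hypernyms from word up to the root."""
    path = [word]
    while word in TAXONOMY:
        word = TAXONOMY[word]
        path.append(word)
    return path

def depth(word):
    """Depth in the taxonomy; the root has depth 1."""
    return len(path_to_root(word))

def lcs(w1, w2):
    """Least common subsumer: deepest shared ancestor."""
    ancestors = set(path_to_root(w1))
    for node in path_to_root(w2):
        if node in ancestors:
            return node
    return None

def wup(w1, w2):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(w1) + depth(w2))."""
    common = lcs(w1, w2)
    if common is None:
        return 0.0
    return 2.0 * depth(common) / (depth(w1) + depth(w2))
```

Against the real WordNet hierarchy, the same score is available through NLTK's WordNet interface via `synset.wup_similarity(other)`.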
![Page 69: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/69.jpg)
Scoring rules using WUP
• Compute word similarity using wup for every pair of words (wi, wj)
  – wi refers to words in the body
  – wj refers to words in the head
• Compute average similarity for all pairs of words
• Predicate names like hasCitizenship and hasMember are segmented into has, citizenship, and member
  – Stop words are removed
69
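The segmentation and averaging steps above can be sketched as follows; the stop-word list, helper names, and the exact pairing scheme are illustrative assumptions (the `sim` argument stands in for any word-similarity function such as wup):

```python
import re

# Illustrative stop-word list; the slides only say stop words are removed.
STOP_WORDS = {"has", "is", "of"}

def segment(predicate):
    """Split a camelCase predicate name into lowercase words,
    dropping stop words: hasCitizenship -> ['citizenship']."""
    words = re.findall(r"[A-Z]?[a-z]+", predicate)
    return [w.lower() for w in words if w.lower() not in STOP_WORDS]

def score_rule(head_preds, body_preds, sim):
    """Average pairwise similarity between body words (wi) and
    head words (wj), used as the rule's weight."""
    head_words = [w for p in head_preds for w in segment(p)]
    body_words = [w for p in body_preds for w in segment(p)]
    pairs = [(b, h) for b in body_words for h in head_words]
    if not pairs:
        return 0.0
    return sum(sim(b, h) for b, h in pairs) / len(pairs)
```

The worked example on the following slides applies exactly this recipe: segment the predicate names, drop `has`, score each remaining word pair with wup, and average.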
![Page 70: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/70.jpg)
Example
employs(X,Y) ← governmentOrganization(X) ∧ hasMember(X,Y)
70
![Page 71: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/71.jpg)
Example
employs(X,Y) ← governmentOrganization(X) ∧ hasMember(X,Y)
(employs, government, organization) (member)
71
![Page 72: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/72.jpg)
Example
employs(X,Y) ← governmentOrganization(X) ∧ hasMember(X,Y)
(employs, government, organization) (member)
72
| Word pair            | wup score |
|----------------------|-----------|
| employs, member      | .50       |
| government, member   | .75       |
| organization, member | .85       |
| Average              | .70       |
![Page 73: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/73.jpg)
Example
employs(X,Y) ← governmentOrganization(X) ∧ hasMember(X,Y) (.70)
(employs, government, organization) (member)

employs(X,Y) ← person(Y) ∧ nationState(X) ∧ hasBirthPlace(Y,X) (.67)
(employs, person, nation, state) (birth, place)
73
![Page 74: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/74.jpg)
Scoring rules using WUP
• WUP-AVG
  – Use words from both entities and relations
  – Use the average similarity between all pairs of words as the weight
• WUP-MAX
  – Use words from both entities and relations
  – Use the maximum similarity among all pairs of words as the weight
• WUP-MAX-REL
  – Use words from relations only
  – Use the maximum similarity among all pairs of words as the weight
74
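The three variants differ only in which words contribute pairs and how the pair scores are aggregated. A sketch of the aggregation step (function names are illustrative):

```python
def wup_avg(pair_scores):
    """WUP-AVG: average similarity over all word pairs."""
    return sum(pair_scores) / len(pair_scores)

def wup_max(pair_scores):
    """WUP-MAX / WUP-MAX-REL: maximum similarity over the chosen
    pairs (MAX draws pairs from entity and relation words,
    MAX-REL from relation words only)."""
    return max(pair_scores)

# Pair scores from the earlier example (.50, .75, .85):
# WUP-AVG gives 0.70, while WUP-MAX would give 0.85.
```

Averaging rewards rules where every word pair is related, while taking the maximum lets a single strongly related pair dominate the weight.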
![Page 75: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/75.jpg)
Experimental Evaluation
• Target relations
  – Full-set
  – Subset
• Models
  – COMBINED
• Rule scoring approaches compared
  – WUP-AVG
  – WUP-MAX
  – WUP-MAX-REL
  – Default (manual weights set to 0.9)
  – EM (weights learned using EM)
75
![Page 76: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/76.jpg)
Full-set
76
![Page 77: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/77.jpg)
Subset
77
![Page 78: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/78.jpg)
Summary
• BLP approach for inferring implicit facts with high precision
• Superior performance of BLPs over purely logical deduction and MLNs
• Efficient learning of probabilistic first-order rules using online rule learning
• Efficacy of WUP-AVG for scoring first-order rules
78
![Page 79: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/79.jpg)
Outline
• Motivation
• Background
  – Bayesian Logic Programs (BLPs)
• Plan Recognition
• Machine Reading
  – BLPs for inferring implicit facts
  – Online Rule Learning
  – Scoring Rules using WordNet
• Future Work
• Conclusions
79
![Page 80: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/80.jpg)
Future Work
• Plan recognition
  – Structure learning of abductive knowledge bases for BALPs
  – Comparison of BALPs to other SRL models
    • ProbLog [Kimmig et al., 2008]
    • PRISM [Sato, 1995]
    • Poole’s Horn Abduction [Poole, 1993]
    • Abductive Stochastic Logic Programs [Tamaddoni-Nezhad, Chaleil, Kakas, & Muggleton, 2006]
80
![Page 81: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/81.jpg)
Future Work
• Machine Reading
  – Large-scale evaluation using crowdsourcing
  – Comparison of BLPs to existing approaches to machine reading [Schoenmackers et al., 2010; Carlson et al., 2010; Doppa et al., 2010; Sorower et al., 2011]
  – Alternate approaches to scoring rules
    • Use models from distributional semantics [Garrette et al., 2011]
81
![Page 82: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/82.jpg)
Long-term Directions
• Parameter learning
  – Using approximate inference techniques
  – Discriminative learning of parameters
• Lifted inference for BLPs and BALPs
82
![Page 83: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/83.jpg)
Conclusions
• Demonstrated the efficacy of BLPs on two diverse tasks
  – Plan recognition
    • BALPs
  – Machine reading
    • Infer implicit facts from natural language text
    • Online rule learner for efficient learning of first-order rules from noisy IE extractions
    • Scoring first-order rules using WordNet
83
![Page 84: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/84.jpg)
Conclusions
• Demonstrated superior performance of BLPs over MLNs on both tasks
• Contributions could have a direct impact on the advancement of applications that use plan recognition and machine reading
  – Siri
  – IBM’s Watson system
84
![Page 85: Bayesian Logic Programs for Plan Recognition and Machine Reading](https://reader036.vdocument.in/reader036/viewer/2022062814/56816848550346895dde2ee2/html5/thumbnails/85.jpg)
Questions
85