Knowledge Representation and Semantic Capturing
Albena Strupchanska
Linguistic Modelling Department, Institute for Parallel Processing, Bulgarian Academy of Sciences
A few words about me
Programmer at LMD, 2001-2003
Research Associate at LMD since 2003
Research interests:
  knowledge representation: CGs, LFs in NLU; ontologies, semantic web
  information extraction
  e-learning
  question-answering
Knowledge Representation: Conceptual Graphs
Realization of CG operations (generalization, specialization, projection and join)
Integration of CG operations in CGWorld
Use of those operations in several system prototypes (simple question-answering, eLearning)
Knowledge Acquisition from Text
General approach used in a few prototypes that process text in controlled English (restricted domains):
  Lexical analysis, named-entity recognition and part-of-speech tagging - GATE
  Syntactic analysis - parser developed by Milena Yankova
  Result: translation of text into Logical Forms (LFs) and other similar formalisms, e.g. Conceptual Graphs
Knowledge-based approaches
Resources used:
  type hierarchy
  domain knowledge
Attempts to:
  treat negation (prototype developed)
  recognize scenarios (FRET system)
“Naive” Negation Processing
Sentence/Query -> LF -> CG
The question:
“Who does not buy bonds?”
will be translated to:
¬(all(X,bond(X)&buy(Y)&(Y,agnt,Univ)&(Y,obj,X)))
Set the negation scope to the whole sentence
“Naive” Negation Processing
Construct all possible LFs with localization of the negated phrases:
(2.1) exists(X,¬bond(X)&buy(Y)&(Y,agnt,Univ)&(Y,obj,X))
(2.2) exists(X,bond(X)&¬buy(Y)&(Y,agnt,Univ)&(Y,obj,X))
(2.3) exists(X,¬bond(X)&¬buy(Y)&(Y,agnt,Univ)&(Y,obj,X))
Corresponding readings:
(2.1) Who buys financial instruments other than bonds?
(2.2) Who performs actions with bonds other than buying them?
(2.3) Who performs actions other than buying with something other than bonds?
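The enumeration of localized readings can be sketched as follows. This is only an illustration, not the prototype's code: it assumes the LF is held as a list of predicate strings, that only bond(X) and buy(Y) may be negated, and the function name localized_negations is hypothetical.

```python
from itertools import combinations

def localized_negations(predicates, negatable):
    """Yield one LF string per non-empty subset of the negatable predicates."""
    targets = [p for p in predicates if p in negatable]
    for size in range(1, len(targets) + 1):
        for subset in combinations(targets, size):
            # Prefix the chosen predicates with the negation sign, keep the rest.
            body = "&".join("¬" + p if p in subset else p for p in predicates)
            yield f"exists(X,{body})"

# Predicates of the example question "Who does not buy bonds?"
lf = ["bond(X)", "buy(Y)", "(Y,agnt,Univ)", "(Y,obj,X)"]
readings = list(localized_negations(lf, {"bond(X)", "buy(Y)"}))
for reading in readings:
    print(reading)
```

With two negatable predicates the sketch produces three readings, in the same order as (2.1), (2.2) and (2.3).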
“Naive” Negation Processing
Every negated concept is replaced by its hierarchical environment:
every concept corresponding to a verb is replaced by its "antonym or complementary events";
every object is replaced by the so-called restricted universally quantified concepts.
S(nc) = (Sib(nc) ∪ SonSib(nc)) \ Son(nc), where nc is the negated concept
Projection of the query onto the KB of CGs => retrieval of answers
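A small sketch of how the hierarchical environment S(nc) might be computed over a type hierarchy. It reads the formula as (Sib ∪ SonSib) \ Son and takes Sib, SonSib and Son to mean siblings, direct sons of siblings and direct sons of nc; that interpretation, the toy hierarchy and all function names are assumptions, not taken from the slides.

```python
# Hypothetical toy type hierarchy, stored as child -> parent.
PARENT = {
    "bond": "financial_instrument",
    "share": "financial_instrument",
    "option": "financial_instrument",
    "government_bond": "bond",
    "corporate_bond": "bond",
    "preferred_share": "share",
}

def sons(concept):
    """Direct subtypes of a concept."""
    return {child for child, parent in PARENT.items() if parent == concept}

def siblings(concept):
    """Other direct subtypes of the concept's parent."""
    parent = PARENT.get(concept)
    if parent is None:
        return set()
    return sons(parent) - {concept}

def sons_of_siblings(concept):
    """Direct subtypes of every sibling."""
    result = set()
    for sibling in siblings(concept):
        result |= sons(sibling)
    return result

def hierarchical_environment(nc):
    """S(nc) = (Sib(nc) ∪ SonSib(nc)) \\ Son(nc)."""
    return (siblings(nc) | sons_of_siblings(nc)) - sons(nc)

env = hierarchical_environment("bond")
print(sorted(env))
```

For the negated concept bond, the sketch returns the sibling instruments and their subtypes, which is the set a query such as (2.1) would then be projected against.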
FRET - Football Reports Extraction of Templates
Semantically driven approach to scenario recognition and template filling
Deep understanding only at “certain scenario-relevant points”, by elaborating inference mechanisms
LF representation for effective inference
Text: football reports with a specific paragraph structure (tickers for each minute)
FRET’s Architecture
[Figure: FRET architecture. Components shown: Text, Text Preprocessor, Resource Bank, Logical Form Translator, Templates Filler (Direct Matching, Inference Matching, Filling Templates), KB of filled template forms, STOP. The decision branches are labelled yes/no.]
FRET - Resource Bank
Lexicon
Grammar rules
Rules for translation into logical form
Graphs of events: description of the domain events (nodes) and relations (arcs) between them
Templates description (uninstantiated LFs)
FRET - Graph of Events
Three types of events (nodes in a directed graph):
Main event - LF description of obligatory and optional fields of the template and relations between them
Base events - LF of most important self-dependent events in the chosen domain
Sub-events - kinds of base events that are immediately connected to the main event (i.e. there exists an arc between the nodes of the main and the sub-events)
FRET - Graph of Events
Four types of relations (each an arc with an associated weight in the graph):
  Event E2 invalidates event E1, i.e. event E2 happens after E1 and annuls it.
  Event E1 entails event E2, i.e. when E1 happens, E2 always happens at the same time.
  Event E1 enables event E2, i.e. event E1 happens before the beginning of event E2 and is a precondition for E2.
  Event E2 is a part of event E1.
FRET - Graph of Events
Base Event: Player shoots the ball.
LF: time(Minute) & Action2(A2) & theta(A2,agnt,Player) & theta(A2,obj,D) & ball(D)
Base Event: The ball is in the net.
LF: time(Minute) & ball(D) & theta(D,into,G) & Net(G)
Sub Event: Player’s shot hits the net.
LF: time(Minute) & Action1(A1) & theta(A1,agnt,B) & shot(B) & theta(B,poss,Player) & theta(A1,obj,G) & Net(G)
Main Event: Player scores.
LF Obligatory: time(Minute) & Score(A) & theta(A,agnt,Player)
LF Optional: Action1(C) & ball(D) & theta(C,agnt,Player) & theta(C,obj,D) & Location(E) & theta(C,Loc,E) & Action2(F) & theta(F,agnt,Assistant) & theta(F,obj,D) & theta(F,to,Player)
Arc labels shown in the figure: “is a part of”, “enables”, “entails”
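One way to hold such a graph of events in memory is sketched below. The class EventGraph, the node names and the weights are hypothetical, and the endpoints of the three arcs are an assumed reading of the figure (base events feed the sub-event, the sub-event entails the main event); the slides give only the arc labels.

```python
from dataclasses import dataclass, field

RELATIONS = {"invalidates", "entails", "enables", "is_part_of"}
KINDS = {"main", "base", "sub"}

@dataclass
class EventGraph:
    nodes: dict = field(default_factory=dict)  # name -> (kind, LF string)
    arcs: list = field(default_factory=list)   # (source, relation, target, weight)

    def add_event(self, name, kind, lf):
        if kind not in KINDS:
            raise ValueError(f"unknown event kind: {kind}")
        self.nodes[name] = (kind, lf)

    def add_arc(self, source, relation, target, weight=1.0):
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both events must be added before linking them")
        self.arcs.append((source, relation, target, weight))

    def related(self, source, relation):
        """Targets reachable from source over arcs with the given relation."""
        return [t for s, r, t, _ in self.arcs if s == source and r == relation]

g = EventGraph()
g.add_event("shoot", "base",
            "time(Minute) & Action2(A2) & theta(A2,agnt,Player) & theta(A2,obj,D) & ball(D)")
g.add_event("ball_in_net", "base",
            "time(Minute) & ball(D) & theta(D,into,G) & Net(G)")
g.add_event("shot_hits_net", "sub",
            "time(Minute) & Action1(A1) & theta(A1,agnt,B) & shot(B) & "
            "theta(B,poss,Player) & theta(A1,obj,G) & Net(G)")
g.add_event("score", "main",
            "time(Minute) & Score(A) & theta(A,agnt,Player)")

# Assumed arc endpoints; the weights are placeholders.
g.add_arc("shoot", "enables", "shot_hits_net", weight=0.5)
g.add_arc("ball_in_net", "is_part_of", "shot_hits_net", weight=0.5)
g.add_arc("shot_hits_net", "entails", "score", weight=1.0)

print(g.related("shot_hits_net", "entails"))
```

Keeping the relation name on each arc lets later steps ask, for example, which events a matched sub-event entails.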
FRET - Identification of Negation
Explicit negation
  Short sentences containing “No”
  Complete sentences containing “Not/Non/No”
  Both cases: the marker NEG is attached to the LF of the previous sentence or of the succeeding part of the sentence
Implicit negation
  Sentences with “but”, “however”, “although”
  Markers: ‘BAHpos’ and ‘BAHneg’
Markers are inserted during the parsing process
FRET - Negation
Sentence:
79 mins: Henry fires at goal, but misses from a tight angle.
Logical forms:
time(79) & fire(A) & (A,agnt,‘Henry’) & (A,at,B) & goal(B) & marker(‘BAHpos’,7).
time(79) & miss(A) & (A,agnt,‘Henry’) & (A,from,B) & angle(B) & (B,char,C) & tight(C) & marker(‘BAHneg’,7).
FRET - Treatment of Negation
Interpretation of marked LFs:
NEG
  the matching result is ignored
BAHpos or BAHneg
  there are two possible interpretations: negation, or conjunction of independent statements
  the algorithm checks whether the dual LFs marked with these markers can be matched to events connected by an invalidate relation in the graph
  if this succeeds, the previous matching is ignored.
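The marker rules above can be sketched as a filter over already-matched LFs. This is an illustrative simplification: it assumes each LF has already been matched to an event name, that paired BAHpos/BAHneg LFs share an identifier (as with the 7 in the Henry example), and that the invalidate arcs are available as a set of pairs. The function and event names are hypothetical.

```python
def surviving_events(matches, invalidates):
    """Return the matched events that are kept after negation handling.

    matches: list of (event, marker, pair_id) tuples in text order, where
             marker is None, "NEG", "BAHpos" or "BAHneg".
    invalidates: set of (e2, e1) pairs, read as "e2 invalidates e1".
    """
    # Remember where each BAHpos LF sits so its BAHneg partner can find it.
    positive_index = {pid: i for i, (_, marker, pid) in enumerate(matches)
                      if marker == "BAHpos"}
    ignored = set()
    for i, (event, marker, pid) in enumerate(matches):
        if marker == "NEG":
            ignored.add(i)  # explicit negation: drop this matching result
        elif marker == "BAHneg" and pid in positive_index:
            j = positive_index[pid]
            if (event, matches[j][0]) in invalidates:
                ignored.add(j)  # negation reading: the earlier match is annulled
            # otherwise both LFs are kept as independent statements
    return [event for i, (event, _, _) in enumerate(matches) if i not in ignored]

# Hypothetical event names for "Henry fires at goal, but misses from a tight angle."
matches = [("shot_at_goal", "BAHpos", 7), ("miss", "BAHneg", 7)]
print(surviving_events(matches, {("miss", "shot_at_goal")}))
```

When no invalidate arc links the two events, both matches survive, which corresponds to the "conjunction of independent statements" reading.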
FRET - Templates Filling
The templates filler performs two main steps:
  Matching LFs, based on a modification of the unification algorithm
  Filling templates
The templates filler processes those LFs that are produced from the so-called extended paragraph. Thus each paragraph is treated separately.
FRET - Matching Algorithm
Direct matching: each LF from the extended paragraph is matched to the main event
Inference matching: uses inference rules and the knowledge base
The FRET inference-matching algorithm derives an inference from:
  base events LFs => sub-events LFs => main event LF
  If the necessary information about some sub-event is present => consider the type of relation between this sub-event and the main event => either recognize the main event or not
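The two matching stages might look roughly like the sketch below. It is not FRET's modified unification algorithm: it uses plain one-way pattern matching with backtracking, writes variables with a leading "?", instantiates Action1 as a hypothetical hit predicate, and assumes that only an entails relation lets a matched sub-event trigger the main event. The example facts are invented.

```python
def match_atom(pattern, fact, binding):
    """Extend binding so that pattern equals fact, or return None."""
    if len(pattern) != len(fact):
        return None
    new = dict(binding)
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):
            if new.setdefault(p, f) != f:
                return None  # variable already bound to something else
        elif p != f:
            return None
    return new

def match_template(template, facts, binding=None):
    """Find a binding under which every template atom occurs among the facts."""
    binding = {} if binding is None else binding
    if not template:
        return binding
    first, rest = template[0], template[1:]
    for fact in facts:
        extended = match_atom(first, fact, binding)
        if extended is not None:
            result = match_template(rest, facts, extended)
            if result is not None:
                return result
    return None

def recognise_main_event(facts, main_template, sub_events):
    """Try direct matching first, then sub-events linked to the main event."""
    direct = match_template(main_template, facts)
    if direct is not None:
        return "direct", direct
    for template, relation in sub_events:
        binding = match_template(template, facts)
        if binding is not None and relation == "entails":
            return "inference", binding
    return None

MAIN = [("time", "?Minute"), ("score", "?A"), ("theta", "?A", "agnt", "?Player")]
SHOT_HITS_NET = [("time", "?Minute"), ("hit", "?A1"), ("theta", "?A1", "agnt", "?B"),
                 ("shot", "?B"), ("theta", "?B", "poss", "?Player"),
                 ("theta", "?A1", "obj", "?G"), ("net", "?G")]
SUB_EVENTS = [(SHOT_HITS_NET, "entails")]

# Invented facts standing for "23 mins: Owen scores."
scoring = [("time", 23), ("score", "e1"), ("theta", "e1", "agnt", "Owen")]
# Invented facts standing for "52 mins: Owen's shot hits the net."
hitting = [("time", 52), ("hit", "e1"), ("theta", "e1", "agnt", "s1"), ("shot", "s1"),
           ("theta", "s1", "poss", "Owen"), ("theta", "e1", "obj", "n1"), ("net", "n1")]

print(recognise_main_event(scoring, MAIN, SUB_EVENTS))
print(recognise_main_event(hitting, MAIN, SUB_EVENTS))
```

The returned binding carries the values (minute, player) that a template-filling step would then copy into the template fields.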
Advantages and disadvantages
+ Logical forms: a convenient formalism for making inferences
+ Knowledge representation as a graph of events
+ Partial parsing (better to understand a little than nothing)
- Creation of the graph of events (nodes presented in LFs) and templates (presented in LFs)
- Narrow and restricted domains (not scalable)
Conclusion
Knowledge-based approaches are successful when they are applied to specific domains
Choice of domain representation formalism is crucial for semantic capturing
Domain modelling is difficult and time-consuming
Much effort is needed for semantic capturing of even simple cases. Probably, when these cases are the right ones, the end justifies the means
Thank you!
Any questions?