Network Based Hard/Soft Information Fusion: Tractor and CBIR
Stuart C. Shapiro, Michael Kandefer, Michael Prentice, Daniel Schlegel
University at Buffalo
Objectives
• Extract state estimates from English messages.
• Preserve as much information as possible.
• Enhance with contextually relevant background information.

DoD Benefits
• Improved quality of information automatically provided to analysts.

Accomplishments
• Designed propositional graphs based on FrameNet & SNePS 3.
• Implemented & evaluated CBIR.
• Implemented entire Soft Processing Stream.

Challenges
• Task implies full NLU.

Scientific/Technical Approach
• Use state-of-the-art techniques from Computational Linguistics.
• Represent state estimates as propositional graphs.
• Develop CBIR techniques.
• Advance the state of knowledge of NLU.
Main Scientific/Technical Accomplishments: Tractor and CBIR
• Objectives:
  – Enhanced state estimate data graph from English message.
  – Enhanced with contextually relevant information from Background Knowledge Sources.
• Implement Tractor framework for soft information fusion.
• Utilize:
  – GATE text processing framework.
  – Stanford Dependency Parser.
  – FrameNet database of semantic frames.
  – SNePS 2 and SNePS 3 Knowledge Representation and Reasoning system.
• Design of propositional graphs
  – Based on FrameNet and SNePS.
• Parse complete STEF, SYNCOIN, BomberBusterScene message sets to dependency graphs
  – Identify & tag people, locations, organizations, etc.
  – Perform some intra-message coreference resolution.
• Implement & evaluate CBIR (Context-Based Information Retrieval) algorithms.
  – Use Cyc and NGA GEOnet Names Server.
• Represent messages in propositional graphs
  – Mixing syntactic and semantic information.
  – Including ontological information from Cyc via CBIR.
Publication List: Tractor and CBIR
• Stuart C. Shapiro, The Jobs Puzzle: A Challenge for Logical Expressibility and Automated Reasoning. In E. Davis, P. Doherty, and E. Erdem, Eds., Logical Formalizations of Commonsense Reasoning: Papers from the AAAI Spring Symposium, Technical Report SS‐11‐06, AAAI Press, Menlo Park, CA, 2011, 96‐102.
• Michael Kandefer and Stuart C. Shapiro, Evaluating Spreading Activation for Soft Information Fusion, Proceedings of the 14th International Conference on Information Fusion (Fusion 2011), 2011.
• Michael Prentice and Stuart C. Shapiro, Using Propositional Graphs for Soft Information Fusion, Proceedings of the 14th International Conference on Information Fusion (Fusion 2011), 2011.
• Daniel R. Schlegel and Stuart C. Shapiro, Visually Interacting with a Knowledge Base Using Frames, Logic, and Propositional Graphs, Second IJCAI International Workshop on Graph Structures for Knowledge Representation and Reasoning, 2011.
Project Statistics and Summary: Tractor and CBIR
Personnel supported:
- Michael Prentice, graduate student
- Stuart C. Shapiro, faculty member
Students with leveraged (non-ARO) support:
- Michael Kandefer: Western New York Prosperity Scholarship, supported by the Prentice Family Foundation
- Daniel Schlegel: volunteer graduate student
Degrees awarded:
- Michael Prentice, MS in CSE
Publications:
- Conference papers: 4
Personnel Changes: Tractor and CBIR
As of 1 August 2011
Graduate Students:
- Michael Prentice: Left project
- Michael Kandefer: Left project
- Daniel Schlegel: Joined project as ARO-supported half-time RA
Faculty:
- Stuart C. Shapiro: Moved from a mostly supervisory role to active participant
Tractor and CBIR: Objectives
• Process English messages.
• Produce graph representation of conceptual information in messages.
• Enhance graph with contextually relevant background information.
Tractor: Accomplishments
• Implemented Tractor Framework.
• Used
  – GATE (General Architecture for Text Engineering).
  – Stanford Dependency Parser.
  – FrameNet.
  – SNePS 2 and SNePS 3.
• Implemented and used SNePS 3 GUI.
• Performed some intra-message coreference resolution.
• Produced dependency graphs.
  – Tagged people, locations, organizations, etc.
• Designed Conceptual Propositional Graphs.
• Implemented automatic propositionalizer.
  – Implemented Syntax-Semantics mapper.
  – Initial set of syntax-semantics mapping rules.
• Output to GraphML format for further processing by Data Association and Graph Matching teams.
CBIR: Accomplishments
• Implemented & evaluated several algorithms.
• Enhanced propositional graphs
  – From NGA GEOnet Names Server (GNS).
  – From Research Cyc and Open Cyc.
  – Add higher-level ontological categories.
• Analyzed Research Cyc for noun coverage.
• Developed an interface for creating “cues” into the Cyc ontology from natural language nouns.
Tractor Framework
[Figure: block diagram of the Tractor framework. Labels: Single Message; Syntactic Processing; Propositionalizer; Propositional Graph; Context Enhancement with CBIR; Priority Intelligence Requirements; Context Enhanced Graph; User Enhancements & Corrections; User & Context Enhanced Data Graph. Legend: Installed; To be installed; To be developed (Year 3); Data.]
Syntactic Processing Overview
[Figure: block diagram of syntactic processing. Labels: Single Message; Tokenizer; Sentence Splitter; Part-of-speech Tagger; Stemmer; Named Entity Tagger; Intra-Message Reference Resolution; Dependency Parser; User Enhancements & Corrections; Tagged Dependency Graph. Legend: Installed; To be installed; To be developed (Year 3); Data.]
Example Message (BombBusterScene)
1. 01/31/2010, 0700 hrs. ‐‐ Al Sabah newspaper reports that in response to the new government policy, local presidential candidate Azam Al‐Azhar has called for a protest at the Second District Courthouse. Al‐Azhar said he would personally attend this protest, and that local residents should expect to see his black SUV arrive at the Courthouse at around 1800 hrs.
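Example Tagged Dependency Parse
[Figure: tagged dependency parse of the example message. Tags shown: organization, person, job title, color, facility, location. Coreference links are marked "Coref".]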
FrameNet
• Corpus-based frame semantics of verbs and relations [Fillmore, 1976; Ruppenhofer, et al., 2006].
• Represents situations independent of surface expression.
• Uses slots and fillers (N-ary relation).
• Database of 11,623 lexical units associated with 1,023 frames.
SNePS 3
• Implemented KR system [Shapiro, 1979, 2000, 2004].
• Conceptual structure.
• Basic units are frames: slots and fillers.
• KB may be viewed as a propositional graph or a set of logical assertions about entities.
• Supports n-ary relations.
• Supports assertions about assertions.
  – (Metadata, pedigree, …)
• Supports reasoning.
• Has useful GUI.
SNePS 2
• Current publicly available version of SNePS.
• Supports inference over quantified formulas.
• SNePS 3 has more sophisticated notion of frames.
• SNePS 3 uses sorted logic.
  – Type checking.
• SNePS 2 used for
  – Representation of tagged dependency graphs.
  – Representation of propositional graphs.
  – Syntax-semantics mapper.
• SNePS 3 used for
  – CBIR.
  – GUI.
Propositionalizer Overview
[Figure: block diagram of the propositionalizer. Labels: Tagged Dependency Graph; FrameNet Tagger; FrameNet DB; Syntax-Semantics Mapper; Syntax-Semantics Mapping Rules; Propositional Graph. Legend: Installed; To be installed; To be developed (Year 3); Data.]
Example Syntax-Semantics Mapping Rule
“If a common noun token, v23, has a determiner, v24, as a dependent, and the text of the noun is v25, then the token v23 is an instance of the type v25.”
Examples: “a protest” denotes a protest; “A Baghdad van rental company owner” denotes an owner.
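
The rule above can be sketched in Python to make its matching condition concrete. This is only an illustration: the deck states that the actual mapper is written in SNePS 2, and the `Token` class, the `instance-of` label, the Penn Treebank tags, and the Stanford-style `det`/`nn` relation names used here are assumptions, not taken from the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    ident: str  # hypothetical token identifier, e.g. "t2"
    text: str   # surface text of the token
    pos: str    # assumed Penn Treebank tag: NN/NNS = common noun, DT = determiner

def instance_of_assertions(tokens, dependencies):
    """Apply the rule: a common noun with a determiner dependent is an
    instance of the type named by the noun's text."""
    by_id = {t.ident: t for t in tokens}
    assertions = []
    for head_id, relation, dependent_id in dependencies:
        head, dependent = by_id[head_id], by_id[dependent_id]
        if relation == "det" and head.pos in ("NN", "NNS") and dependent.pos == "DT":
            assertions.append((head.ident, "instance-of", head.text))
    return assertions

# "a protest": the noun "protest" has the determiner "a" as a dependent.
protest_tokens = [Token("t1", "a", "DT"), Token("t2", "protest", "NN")]
protest_deps = [("t2", "det", "t1")]

# "A Baghdad van rental company owner": only the head noun "owner"
# carries the determiner, so only "owner" yields an assertion.
owner_tokens = [
    Token("u1", "A", "DT"), Token("u2", "Baghdad", "NNP"),
    Token("u3", "van", "NN"), Token("u4", "rental", "NN"),
    Token("u5", "company", "NN"), Token("u6", "owner", "NN"),
]
owner_deps = [("u6", "det", "u1"), ("u6", "nn", "u2"), ("u6", "nn", "u3"),
              ("u6", "nn", "u4"), ("u6", "nn", "u5")]

print(instance_of_assertions(protest_tokens, protest_deps))
print(instance_of_assertions(owner_tokens, owner_deps))
```

The second example shows why the rule keys on the determiner dependency: the modifier nouns "van", "rental", and "company" are common nouns too, but none of them has a determiner dependent.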
Example Propositional Graph (Hand-Crafted)
[Figure: hand-crafted propositional graph; graph content not recoverable from the transcript.]
Current Status of Propositionalizer
• Currently passes a modestly revised/enhanced dependency graph on to the Data Association Group.
• Dependency relations (head, nsubj, dobj) are more generic than FrameNet-inspired relations.
• Will revisit use of FrameNet vs. more generic conceptual roles.
CBIR Detail
[Figure: block diagram of Context-based Information Retrieval (CBIR). Labels: Propositional Graph; Contextual Information; Priority Intelligence Requirements; Context-based Information Retrieval (CBIR); Forward Inference; Enhanced Information; Background Knowledge Sources (BKS): Previous Situations, Research Cyc, Axioms, Insurgent data, NGA GNS. Legend: Installed; To be installed; To be developed (Year 3); Data.]
Technical Approach: CBIR
• Identifying Relevant Information
  – State estimate and BKS represented as graphs.
  – Nodes represent entities.
  – Arcs (sets of arcs) represent relations.
  – Forms associative network.
  – Embed state estimate graph into BKS graph.
  – Apply “pulse” to state estimate nodes.
  – Use spreading activation
    • to spread pulse to connected (relevant) nodes.
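
The pulse-and-spread idea can be illustrated with a minimal Python sketch. This is a generic spreading activation under stated assumptions, not the Texai or ACT-R variants evaluated in the project: the function name, the equal split of activation among neighbors, the single-firing policy, and the toy graph are all hypothetical choices made for illustration.

```python
from collections import deque

def spread_activation(graph, seeds, decay=0.5, threshold=0.1):
    """Pulse the seed nodes, spread decayed activation to neighbors, and
    return {node: activation} for nodes at or above the threshold."""
    activation = {node: 1.0 for node in seeds}  # the initial "pulse"
    frontier = deque(seeds)
    fired = set()
    while frontier:
        node = frontier.popleft()
        if node in fired:
            continue  # each node passes its activation on only once
        fired.add(node)
        neighbors = graph.get(node, ())
        if not neighbors:
            continue
        # Assumption: activation is decayed and split equally among neighbors.
        share = activation[node] * decay / len(neighbors)
        for neighbor in neighbors:
            activation[neighbor] = activation.get(neighbor, 0.0) + share
            if neighbor not in fired and activation[neighbor] >= threshold:
                frontier.append(neighbor)
    return {n: a for n, a in activation.items() if a >= threshold}

# Toy associative network: the "protest" node stands in for the state
# estimate, the other nodes for background knowledge.
graph = {
    "protest": ["courthouse", "Al-Azhar"],
    "courthouse": ["protest", "Second District"],
    "Al-Azhar": ["protest", "candidate"],
    "Second District": ["courthouse", "Baghdad"],
    "candidate": ["Al-Azhar"],
    "Baghdad": ["Second District"],
}
retrieved = spread_activation(graph, ["protest"])
print(sorted(retrieved))
```

With these settings the direct neighbors of the seed pass the threshold while nodes two steps away do not, which is how the decay and threshold parameters together bound what counts as contextually relevant.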
CBIR Algorithm Analysis
• Revised experiments on small datasets evaluated two spreading activation techniques.
  – Marker passing ruled out prior to revised experimentation.
  – Based on Kandefer & Shapiro 2009.
• Techniques evaluated:
  – Texai
  – ACT-R Declarative Memory Module
• Comparison showed the F-measures were equal for both algorithms when using the best average parameter settings of:
  – ACT-R (S = 2.0, Activation Threshold = 0.04)
  – Texai (D = 0.9, Activation Threshold = 0.5)
• Results presented at Fusion 2011.
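
For reference, the F-measure used to compare retrieval results can be computed as below. This sketch assumes the balanced F1 form (the harmonic mean of precision and recall); the slides do not state which weighting was used, and the function name and example sets are hypothetical.

```python
def f_measure(retrieved, relevant):
    """Balanced F-measure (F1) of a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0  # also covers empty retrieved or relevant sets
    precision = hits / len(retrieved)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: 4 nodes retrieved, 3 relevant, 2 in common.
score = f_measure({"a", "b", "c", "d"}, {"a", "b", "e"})
print(round(score, 3))
```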
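Texai:
$$A'_j = A_j + \sum_{i \in N} A_i \cdot \frac{1}{N} \cdot D$$

ACT-R:
$$A_i = \ln\left(\sum_{k=1}^{n} t_k^{-0.5}\right) + \sum_{j \in C} \left[\frac{1}{C} \cdot \left(S - \ln(\mathrm{degree}(j))\right)\right]$$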
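
As a worked illustration of the ACT-R activation, the sketch below assumes the usual ACT-R form: a base-level term, the log of the summed presentation ages each raised to the power -0.5, plus a spreading term that averages S minus the log of each context node's degree over the context C. The function and parameter names are hypothetical, and reading 1/C as one over the size of C is an assumption.

```python
import math

def actr_activation(ages, context_degrees, S=2.0, d=0.5):
    """ages: time since each past presentation of the chunk (each > 0).
    context_degrees: degree(j) for each context node j linked to the chunk."""
    base_level = math.log(sum(t ** -d for t in ages))
    if not context_degrees:
        return base_level
    weight = 1.0 / len(context_degrees)  # assumed reading of 1/C
    spreading = sum(weight * (S - math.log(deg)) for deg in context_degrees)
    return base_level + spreading

# One presentation one time unit ago and one context node of degree 1:
# base level ln(1) = 0 and spreading term S - ln(1) = 2.0.
print(actr_activation([1.0], [1]))
```

A context node with many connections contributes less, since the log of its degree is subtracted from S; this penalizes hub nodes that are associated with almost everything.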
CBIR Texai Evaluation (Mean across messages evaluated)
[Figure: surface plot of F-Measure (bands from 0.000 to 0.400) against Activation Threshold (0.00 to 1.00) and Decay (D) (0.00 to 1.00).]
CBIR ACT-R Evaluation (Mean across messages evaluated)
[Figure: surface plot of F-Measure (bands from 0.000 to 0.400) against Activation Threshold (0.00 to 0.18) and a second, unlabeled axis with ticks from 0.50 to 4.50.]
Noun to Cyc Connection
• Goal
  – Align message entities with Cyc ontological information.
• Process
  – Using natural language information about Cyc concepts.
  – Construct a query to Cyc using a NL term encountered
    • e.g., “woman”
  – Retrieve the concept associated with the term in Cyc
    • e.g., AdultHumanFemale
• Progress
  – Direct noun attachment (e.g., “woman” -> AdultHumanFemale).
  – Attachment to Open Cyc import allows for inferences to ontological information.
• Running CBIR on the Cyc server may be a better alternative.
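
The attachment process can be sketched with plain Python dictionaries standing in for Cyc. Nothing here calls a real Cyc API: `LEXICON` and `PARENTS` are hypothetical toy tables, and every concept name other than AdultHumanFemale is illustrative and has not been checked against Cyc.

```python
# Hypothetical stand-ins for Cyc: a noun-to-concept lexicon and a
# concept-to-more-general-concepts table.
LEXICON = {"woman": "AdultHumanFemale"}
PARENTS = {"AdultHumanFemale": ["Person"], "Person": ["Agent"]}

def ancestors(concept, parents):
    """Collect higher-level categories reachable from a concept, nearest first."""
    seen, queue = [], [concept]
    while queue:
        current = queue.pop(0)
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.append(parent)
                queue.append(parent)
    return seen

def attach(noun):
    """Map a natural language noun to a concept and its ontological categories."""
    concept = LEXICON.get(noun.lower())
    if concept is None:
        return None  # no direct attachment found for this noun
    return concept, ancestors(concept, PARENTS)

print(attach("woman"))
```

Once the noun is attached to a concept, the walk up the category table is what makes higher-level ontological information available for enhancing the message graph.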
CBIR: Cyc Coverage Analysis
• Retrieved common nouns (some modified) from the SYNCOIN Bioweapons Thread
  – E.g., bio-weapons, surveys, old man, young boy, Bath’est website

Type          Unique Occurrences    Occurrences
Direct        243 (72.8%)           560 (76.5%)
Indirect       63 (18.9%)           117 (16%)
Relational     28 (8.4%)             55 (7.5%)
Total         334 (100%)            732 (100%)
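
The percentages in the table follow directly from the counts, as this short check shows. The counts are taken from the table; the function and variable names are hypothetical.

```python
# (unique occurrences, occurrences) per attachment type, from the table.
counts = {
    "Direct": (243, 560),
    "Indirect": (63, 117),
    "Relational": (28, 55),
}

def coverage_shares(counts):
    """Return both totals and each type's percentage share, to one decimal."""
    total_unique = sum(unique for unique, _ in counts.values())
    total_all = sum(occurrences for _, occurrences in counts.values())
    shares = {
        kind: (round(100 * unique / total_unique, 1),
               round(100 * occurrences / total_all, 1))
        for kind, (unique, occurrences) in counts.items()
    }
    return total_unique, total_all, shares

total_unique, total_all, shares = coverage_shares(counts)
print(total_unique, total_all, shares)
```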
2011-2012 Plans: Tractor
• Capability Goals:
  – More complete
    • NE Recognizer.
    • Coreference resolution.
  – Automatic propositional graph mapping of most of selected message set.
• Research Goals:
  – Revisit choice of FrameNet vs. other, more generic sets of conceptual roles.
  – Design/implement mapping rules for chosen set of conceptual roles.
2011-2012 Plans: CBIR
• Capability Goal:
  – Enhance propositional graph of message information with contextually relevant background information.
• Research Goals:
  – Investigate differential weighting based on different relations.
  – Investigate large-scale data storage options
    • Server-based
    • DBMS
  – Evaluate CBIR algorithms/parameters using chosen dataset.
  – Evaluate CBIR algorithms for retrieval of base-level category attachment.
2012-2014 Plans: Tractor/CBIR
• Continue to improve Tractor/CBIR speed and quality.
• Improve TRL.
• Design/Implement SNePS 3 inference
  – Use of quantified First-Order Logic
    • Based on Logic of Arbitrary and Indefinite Objects.
  – Concurrent programming approach
    • For massively parallel inference
    • Over very large knowledge bases.
  – Use of forward inference to trigger alerts
    • Meeting analyst interests.
• Army Program Support: A2SF (Patel), CTA (Kott)