Copyright © 2011 Raytheon Company. All rights reserved. Customer Success Is Our Mission is a registered trademark of Raytheon Company.
SCUBA: An Agent-Based Ontology Creation and Alignment Method for
Socio-Cultural Modeling
Donald Kretz, Bruce Peoples, William Phillips
Outline
§ Introduction: Background & Motivation
– Socio-Cultural ISR Challenge
§ Method: The SCUBA Prototype
– Ontology Generation
– Ontology Alignment
– Evaluation and Measurement
§ Results
– Course of Action Scenario
– Measures of Evaluation and Performance
§ Conclusions
§ Questions
Socio-Cultural Information Challenge
§ Incomplete understanding of socio-cultural factors that influence and define a region
§ Information is available, but in dispersed data stores that require manual fusion
§ Ocean of data is not being analyzed since there are few trained analysts and little advanced technology
§ Manual interpretation is too time-consuming
§ Inability to maintain full and up-to-date situational awareness in a dynamically evolving environment
11/30/11
Insufficient knowledge → grave misunderstanding → counterproductive and destructive actions → devastating consequences!
Why is this hard to do?
§ There is no “ground truth” to evaluate quality, accuracy, or consistency
– Information is extracted from sources varying in objectivity
– Discrepancies may arise from perspective or time
§ Context is critical, yet the most difficult part
– Information extraction - contextual references in text
– Knowledge models – representing contextual concepts and relations
§ Modeling in general – what to model, how much, how deep?
§ Problem space evolves and changes – it’s a moving target
– Not only data changing, but model as well
§ Many levels of heterogeneity that cannot be avoided
– Solutions need to be dynamic
– Interoperability requires alignment of similar concepts
§ More information just complicates the problem
– Must apply technology to maintain goal of Balanced Cooperative Modeling
Purely human-centric approaches are costly and impractical
Prior Art
§ No learning mechanism for self-refinement/improvement
§ No automatic merging - requires validation by human analyst
[Figure: tools surveyed, shown around the label "Ontology Learning": TERMINATE, SymOntoX, Text2Onto, MAFRA, KAON, Jaguar, Snoggle, Protégé, Alignment API, Optima, SimMetrics]
Ontology Alignment Techniques
String-based
• Name similarity
• Description similarity
• Adbul = Abdul
Language-based
• Lemmatization
• Morphology
• Caves = Cave
Constraint-based
• Type similarity
• Key properties
• Cell = Organization (members, leader, etc.)
Alignment Reuse
• Entire model
• Model fragments
Statistical
• Frequency distribution
• Probability estimates
• Same name = Same thing
Graph-based
• Path analysis
• Parents and children
• Cell = Organization (Terrorist, Insurgent, etc.)
Taxonomy-based
• Taxonomy structure
Structure-based
• Structure metadata
• Neighborhoods
Model-based
• SAT solvers
• DL reasoners
• poppy crop = heroin field (poppy is part of heroin)
Upper-level domain
• Foundational ontology
• SUMO, DOLCE, etc.
Linguistic
• Lexical networks
• Thesauri
• IED = bomb
Basic Techniques for Matching
• Name-based
• Structure-based
• Extensional
• Semantic-based
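
The string-based and language-based techniques listed on this slide can be illustrated with a minimal Python sketch. This is not the SCUBA implementation: `levenshtein`, `name_similarity`, and `naive_lemma` are hypothetical names, edit distance is only one of many possible string metrics, and the suffix-stripping rule is a crude stand-in for a real lemmatizer.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def name_similarity(a: str, b: str) -> float:
    """Normalize edit distance to a 0..1 score (1.0 means identical names)."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def naive_lemma(word: str) -> str:
    """Assumption: strip one trailing 's' as a toy substitute for lemmatization."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


if __name__ == "__main__":
    # String-based: a transposed name still scores well above unrelated names.
    print(name_similarity("Adbul", "Abdul"))
    # Language-based: plural and singular reduce to the same form.
    print(naive_lemma("Caves") == naive_lemma("Cave"))
```

A score from `name_similarity` would still need a cut-off before two names are treated as the same concept; where that cut-off sits is a tuning decision this sketch does not make.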
Agent Architecture
OA - Ontology Agent: acts as a proxy for an ontology by mediating access to its concepts and responding to inquiries about its metacharacteristics (e.g., depth, breadth, number of concepts).
SA - Similarity Agent: calculates the similarity between concepts.
EA - Evaluation Agent: makes a judgment as to the relatedness of available ontologies along some relevant dimension (e.g., domain relevance, semantic similarity).
MA - Matching Agent: creates mappings of the concepts and relationship types between two ontologies.
HA - Heuristic Agent: determines which ontology pairs make good candidates for matching and which matching behaviors should be applied, and manages the execution of selection and matching workflows.
UA - Utility Agent: performs supporting tasks such as data and ontology storage/retrieval and job ID management.
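
As a rough illustration of the Ontology Agent's proxy role, the sketch below answers inquiries about depth, breadth, and number of concepts for an ontology held as a simple tree. The class layout, the method names, and the exact definitions of depth (edges on the longest root-to-leaf path) and breadth (size of the widest level) are assumptions made for this example, and the concept fragment is only loosely based on the Social branch shown later in the deck.

```python
class OntologyAgent:
    """Hypothetical proxy for one ontology, stored as an acyclic concept tree."""

    def __init__(self, root, children):
        self.root = root
        self.children = children  # maps a concept to its direct sub-concepts

    def level_sizes(self):
        """Number of concepts at each level, starting from the root."""
        sizes, level = [], [self.root]
        while level:
            sizes.append(len(level))
            level = [kid for concept in level
                     for kid in self.children.get(concept, [])]
        return sizes

    def concept_count(self):
        return sum(self.level_sizes())

    def depth(self):
        # Assumed definition: edges on the longest root-to-leaf path.
        return len(self.level_sizes()) - 1

    def breadth(self):
        # Assumed definition: number of concepts on the widest level.
        return max(self.level_sizes())

    def metacharacteristics(self):
        """Answer an inquiry with all three measures at once."""
        return {"concepts": self.concept_count(),
                "depth": self.depth(),
                "breadth": self.breadth()}


if __name__ == "__main__":
    # Illustrative fragment only; the parent/child links are assumed.
    social = OntologyAgent("Social", {
        "Social": ["Community", "Social Stratification"],
        "Community": ["Police", "Community Heads"],
        "Social Stratification": ["Age Stratification", "Gender Status"],
        "Community Heads": ["Tribal Leaders", "Business Owners",
                            "Religious Leaders"],
        "Religious Leaders": ["Sheikh", "Imam", "Ayatollah"],
    })
    print(social.metacharacteristics())
```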
Agent Architecture
Candidate Selection
§ HEURISTIC AGENT determines selection strategy and requests appropriate evaluations
§ EVALUATION AGENT requests observation data from ONTOLOGY AGENTS
§ ONTOLOGY AGENT acts as proxy for ontology, provides observations and metadata, etc.
§ Behaviors provide specific, common observation functions
§ EVALUATION AGENT combines and reports observation data to HEURISTIC AGENT
§ HEURISTIC AGENT chooses an ontology pair for alignment and the alignment techniques to be applied
[Figure: Candidate Selection message flow among the Heuristic, Evaluation, Matching, Similarity, and Ontology Agents, with behaviors labeled OB01 through OB07]
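
The candidate selection flow on this slide can be sketched in a few functions. Everything here is an assumption made for illustration: the function names, the choice of concept labels as the only observation, and the use of Jaccard overlap as the relatedness judgment. The deck does not say which observations or scores SCUBA actually combines.

```python
from itertools import combinations


def observe(ontology):
    """Ontology Agent role: report observation data (here, normalized labels)."""
    return {label.lower() for label in ontology["concepts"]}


def evaluate(labels_a, labels_b):
    """Evaluation Agent role: combine observations into one relatedness score.

    Assumption: Jaccard overlap of concept labels stands in for the real
    evaluation dimensions (domain relevance, semantic similarity, etc.).
    """
    union = labels_a | labels_b
    return len(labels_a & labels_b) / len(union) if union else 0.0


def select_candidate(ontologies):
    """Heuristic Agent role: request evaluations and pick the best pair."""
    observations = {name: observe(o) for name, o in ontologies.items()}
    scores = {
        (a, b): evaluate(observations[a], observations[b])
        for a, b in combinations(sorted(observations), 2)
    }
    best_pair = max(scores, key=scores.get)
    return best_pair, scores[best_pair]


if __name__ == "__main__":
    # Toy ontologies; the concept lists are invented for the example.
    ontologies = {
        "T2O Social": {"concepts": ["Community", "Tribal Leaders",
                                    "Religion", "Village"]},
        "PMESII Social": {"concepts": ["Community", "Tribal Leaders",
                                       "Social Stratification"]},
        "Infrastructure": {"concepts": ["Highway", "Bridge"]},
    }
    print(select_candidate(ontologies))
```

Keeping observation, evaluation, and selection in separate functions mirrors the division of labor among the agents, so a different observation or scoring behavior can be swapped in without touching the selection step.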
Agent Architecture
Ontology Alignment
§ HEURISTIC AGENT requests alignment of chosen ontologies using chosen techniques
§ MATCHING AGENT requests similarity scores across ontological concept and relation types
§ SIMILARITY AGENT calculates similarity scores between concept and relationship types
§ ONTOLOGY AGENT acts as proxy for ontology, provides observations and metadata, etc.
§ Behaviors provide specific, common observation and similarity measurement functions
§ MATCHING AGENT creates mappings of concept and relationship types between ontologies
§ HEURISTIC AGENT produces merged model in desired format
[Figure: Ontology Alignment message flow among the Heuristic, Evaluation, Matching, Similarity, and Ontology Agents, with behaviors labeled OB01 through OB07]
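
A minimal sketch of the alignment flow follows, assuming concepts are plain label strings. The similarity behavior here is Python's standard `difflib.SequenceMatcher`, chosen only because it ships with the language; it is not one of the metrics named in this deck. The function names, the 0.8 threshold, and the merge rule (keep all target concepts, append unmapped source concepts) are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity Agent role: score two concept labels between 0 and 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match(source, target, threshold=0.8):
    """Matching Agent role: map each source concept to its best target concept.

    A mapping is kept only when the best score reaches the (assumed) threshold.
    """
    mappings = {}
    for concept in source:
        best = max(target, key=lambda candidate: similarity(concept, candidate))
        if similarity(concept, best) >= threshold:
            mappings[concept] = best
    return mappings


def merge(source, target, mappings):
    """Heuristic Agent role: build a merged concept list from the mappings."""
    unmapped = [concept for concept in source if concept not in mappings]
    return list(target) + unmapped


if __name__ == "__main__":
    source = ["Caves", "Religion", "Highway"]
    target = ["Cave", "Religious", "Hut"]
    mappings = match(source, target)
    print(mappings)
    print(merge(source, target, mappings))
```

With these toy labels, "Caves" and "Religion" clear the threshold and map to "Cave" and "Religious", while "Highway" finds no acceptable partner and is carried into the merged list unchanged.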
How Would the Warfighter Use SCUBA? A Mission Planning Demonstration
§ Typically a time-consuming, manual, ad hoc process
– Hours to days depending on size of mission and echelon of command
– Skim available info sources: Internet, SIGINT, COMINT, OSINT, HUMINT
– Critical information and cross relationships between information are missed due to time limitations and manual processes
§ Staff members are not SMEs; instead, they compile research written by SMEs
§ Information may not be up to date
§ Number of data sources limited by amount of available staff (and time)
Mission Planning is the basis for successful mission execution
Military Decision Making Process Model
§ Receipt of Mission
§ Mission Analysis
– Analyze the higher HQ Order
– Conduct initial intelligence
– Determine specified, implied, and
– Review available assets
– Determine constraints
– Identify critical facts and assumptions
– Conduct a risk assessment
– Determine the commander’s critical
– Determine the initial reconnaissance
– Plan use of available time
– Write a restated mission
– Conduct a mission analysis briefing
– Approve the restated mission
– Develop the initial commander’s
– Issue commander’s guidance
– Issue a warning order
– Review facts and assumptions
§ COA Development
§ COA Analysis
§ COA Comparison
§ COA Approval
§ Orders Production
Frameworks: ASCOPE, PMESII-PT, METT-TC
PMESII-PT: Political, Military, Economic, Social, Information, Infrastructure, Physical Environment, Time
PMESII-P Ontology Structure
Super Super Class: Political, Military, Economic, Social, Information, Infrastructure, Physical Environment
Super Class: Community, Social Stratification
Class: Police, Community Heads, Age Stratification, Gender Status
Sub Class: Tribal Leaders, Business Owners, Religious Leaders
SubSub Class: Sheikh, Imam, Ayatollah
§ Provided a priori by Yale OCM Taxonomy
§ Determined by SCUBA through concept merge & align behaviors
Instance Database
§ Social - Community - Community Heads - Tribal Leaders: The Pashtu people are more likely to follow the laws of local tribal leaders, than regionally elected officials.
§ Social - Community - Community Heads - Religious Leaders: Sunni Muslims will first seek spiritual guidance from their community Sheikh when a family member falls ill.
Ontology Structure with Instances
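
One way to picture an instance database keyed by class path is the sketch below. It is a guess at a convenient data structure, not a description of SCUBA's storage: the `InstanceDatabase` class and its `add` and `under` methods are hypothetical, and instances are kept as plain text.

```python
from collections import defaultdict


class InstanceDatabase:
    """Hypothetical store mapping an ontology class path to text instances."""

    def __init__(self):
        self._by_path = defaultdict(list)

    def add(self, path, text):
        """Attach one instance to the class identified by the full path."""
        self._by_path[tuple(path)].append(text)

    def under(self, prefix):
        """Return every instance stored at or below the given class path."""
        prefix = tuple(prefix)
        return [text
                for path, texts in self._by_path.items()
                if path[:len(prefix)] == prefix
                for text in texts]


if __name__ == "__main__":
    db = InstanceDatabase()
    heads = ("Social", "Community", "Community Heads")
    db.add(heads + ("Tribal Leaders",),
           "The Pashtu people are more likely to follow the laws of local "
           "tribal leaders, than regionally elected officials.")
    db.add(heads + ("Religious Leaders",),
           "Sunni Muslims will first seek spiritual guidance from their "
           "community Sheikh when a family member falls ill.")
    # Querying the parent class returns instances from both sub classes.
    print(len(db.under(heads)))
```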
Measures of Performance
§ Structural Dimension (syntax)
– Measure of relation instance count
– Measure of concept count
– Measure of maximum depth
– Measure of concept instance count
– Measure of degree centrality
– Measure of relationship type count
§ Functional Dimension (relations between T2O Combined Social/PMESII Social)
– F-Score (Precision & Recall)
– String Metric F-Scores
– Semantic Metric F-Scores
§ Usability Dimension
– User recognition
– Fitness for user
§ Time Dimension
– Time to build
– Time to do alignment
Measure human analyst created ontology and compare with ontology generated by automated process
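
For the functional dimension, precision, recall, and F-score can be computed by comparing automatically produced mappings against an analyst-created reference set. The sketch below assumes the balanced F1 form (the harmonic mean of precision and recall); the slide does not state which F-score weighting was used, and the mapping pairs are invented for the example.

```python
def precision_recall_f1(predicted, reference):
    """Score predicted mappings against an analyst-created reference set.

    Assumption: balanced F1, i.e. 2 * P * R / (P + R).
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical data: four analyst mappings, three automated mappings.
    reference = {("teacher", "instructor"), ("religion", "creed"),
                 ("village", "community"), ("caves", "cave")}
    predicted = {("teacher", "instructor"), ("religion", "creed"),
                 ("highway", "hut")}
    precision, recall, f1 = precision_recall_f1(predicted, reference)
    print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy case two of three predicted mappings are correct (precision 2/3) and two of four reference mappings are found (recall 1/2), giving an F1 of 4/7.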
Measures of Performance – Structure, Usability, Time
[Figure: charts of the structure, usability, and time measures, with logarithmic axis ticks ranging from 0.1 to 10M; plotted values are not recoverable]
Measures of Performance – Determine Thresholds of String and Semantic Outputs
Semantic Output
Observation  Word 1        Word 2        Confidence Level
99           independence  independence  1.000
100          teacher       instructor    1.000
101          teacher       mentor        .997
102          religion      creed         .995
...
490          village       community     .763
...
8470         highway       hut           0.000
String Output
Observation  String 1                String 2      Confidence Level
99           independence            independence  1.000
100          bay                     bay           1.000
101          classes                 class         .997
102          religion                religious     .995
...
203          Maps                    Map           .688
...
10460        military organizations  mention       0.000
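
Choosing a threshold amounts to deciding how far down a ranked list of confidence levels a pair is still accepted as a match. The sketch below sweeps a few cut-offs over the semantic rows visible on this slide. The function names and the candidate thresholds are assumptions; the slide does not report which thresholds were finally chosen.

```python
# Rows copied from the semantic output shown on the slide (elided rows omitted).
SEMANTIC_OUTPUT = [
    ("independence", "independence", 1.000),
    ("teacher", "instructor", 1.000),
    ("teacher", "mentor", 0.997),
    ("religion", "creed", 0.995),
    ("village", "community", 0.763),
    ("highway", "hut", 0.000),
]


def accepted(pairs, threshold):
    """Keep the word pairs whose confidence level reaches the threshold."""
    return [(a, b) for a, b, confidence in pairs if confidence >= threshold]


def sweep(pairs, thresholds):
    """Count accepted pairs at each candidate threshold."""
    return {t: len(accepted(pairs, t)) for t in thresholds}


if __name__ == "__main__":
    # Hypothetical cut-offs, chosen only to show the effect of moving the bar.
    print(sweep(SEMANTIC_OUTPUT, [0.5, 0.9, 1.0]))
    print(accepted(SEMANTIC_OUTPUT, 0.9))
```

A sweep like this is typically paired with a check of precision and recall at each cut-off, since a lower threshold admits pairs such as village/community at the cost of more false matches.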
Measures of Performance – F-Scores (String)
[Chart: String F-Score, Precision, and Recall on a 0 to 1.000 axis for each string metric: Bowler Substring, Chapman Matching Soundex, Hamming, Jaccard, Jaro, Jaro Winkler, Jensen Shannon Dirichlet, Jensen Shannon Unsmoothed, Monge Elkan, Needleman Wunsch, Ngram]
Measures of Performance – F-Scores (String + Semantic)
[Chart: String+Semantic F-Score, Precision, and Recall on a 0 to 1.000 axis for each string metric: Bowler Substring, Chapman Matching Soundex, Hamming, Jaccard, Jaro, Jaro Winkler, Jensen Shannon Dirichlet, Jensen Shannon Unsmoothed, Monge Elkan, Needleman Wunsch, Ngram]
Evaluation/Merge and Align Metrics
§ T2O and PMESII Social
§ Heuristic Agent
– .302 Hours (Semantic Agent Longest)
– Ontology Agent
  § Adv 1 MS
– Evaluation Agent
  § Adv 1.5 MS
– Similarity Agent
  § String (36) 6.1 Seconds
  § Adv. Individual String 1.8 MS
  § Semantic (9) 8 Hours (2 Processors)
  § Lesk Behavior .3 Hours (2 Processors)
– Matching Agent
  § 53 Seconds
Conclusion
§ SCUBA facilitates the rapid analysis of socio-cultural ISR to improve understanding of regional political, religious, and economic inter-relationships through creation, alignment, and merging of ontologies
§ Enhances ontological engineering process through an innovative alignment and merging process
§ Incorporates mechanisms to ensure information consistency, filtering and pruning algorithms to manage complexity, and learning algorithms to enable self-refinement and improvement over time
Acknowledgments
§ Corporate Team
– Dr. John Zolper (Vice President, R&D)
– Joanne Wood & Jane Orsulak (Tech Area Directors)
– Michael Liggett (CRAD Development Manager)
§ Technical Team
– Bill Phillips (Largo, FL)
– Don Kretz (Garland, TX, IIS)
– Bruce E. Peoples (State College, PA, IIS)
– Daniel P. Truitt (State College, PA, IIS)
– Nathan Bowler (State College, PA, IIS)
– Justin Toennies (Largo, FL, NCS)
§ Business Champions
– Eric Rickard (IIS)
– Chris Thompson (NCS)
– Bob Ogden (NCS)
– Bob Mojazza (IIS)
– Dan Levis (NCS)
– Ari Dimitriou (IIS)
– Kimbry Mcclure (IIS)
§ Special thanks to Lymba Corporation for their assistance with the Jaguar ontology runs