ontology case
TRANSCRIPT
-
8/13/2019 Ontology Case
1/41
The DHS Ontology Case
Presentation by OntologyStream Inc
Paul Stephen Prueitt, PhD
19 March 2005
Ontology Tutorial 1, copyright, Paul S Prueitt 2005
-
8/13/2019 Ontology Case
2/41
Ontology technology exists in the market
However,practical problems block progress in designing and implementing DHS wide
ontology technology. Business process mythology, like Earned Value program management,
is not focusing on the right questions and maturity models for software development haveprecluded most innovations that come from outside the relational database paradigm.
These practical problems are also partially a consequent of
1. Some specific institutional behaviors related to traditional program management.
2. Confusion caused by long term, and heavily funded Artificial Intelligence marketing activities
-
8/13/2019 Ontology Case
3/41
As a general proposition, through out the federal government, quality metrics
are not guiding management decisions supporting:
1) Quick transitions from database centered information technology to XML
based Semantic Web technology.
2) Transitions from XML repositories to ontology mediated Total Information
Awareness, with Informational Transparency, in Secure Channels.
Emerging Semantic Web
Standards
DHS Ontology
1) World-wide Trade Data
2) Investigation Targeting
3) Risks, Threats and
Vulnerabilities
4) Policy Enforcement
?
-
8/13/2019 Ontology Case
4/41
Diagram from Prueitt, 2003
-
8/13/2019 Ontology Case
5/41
Diagram from Prueitt, 2003
First two steps
are missing
Seven step
AIPM
is not
completeRDBMS
-
8/13/2019 Ontology Case
6/41
The measurement/instrumentation task
First two steps
in the AIPM
Measurement is part of the semantic
extraction task, and is accomplished with a
known set of techniques
Latent semantic technologies
Some sort of n-gram measurement
with encoding into hash tables or internal
ontology representation (CCM and
NdCore, perhaps AeroText and
Converas process ontology (?), Orbs,
Hilbert encoding, CoreTalk/Cubicon.
Stochastic and neural/ genetic
architectures
-
8/13/2019 Ontology Case
7/41
One model for semantic extraction explicitly focuses on the first two aspects of
the AIPM; e.g. instrumentation/measurement and data-encoding/interpretation
-
8/13/2019 Ontology Case
8/41
Actionable Intelligence Process Modelhas an action-perception event
cycle.
Stratified ontologysupports the use of this cycle to produce knowledge of
attack and anticipatory mechanismsbased on the measurement of sub-structural categorical invariance.
Work flow and process
ontologyis available as a
basis for encoding knowledge
of anticipatory response
mechanisms.
Categorical Invariance is
measured, using Orbs
(Ontology referential bases)
for example or CCM(Contiguous Connection
Model) encoding, and
organized as a resource for
RDF triples using some lite
OWL OIL.
-
8/13/2019 Ontology Case
9/41
Distributed ontology management is already
available in some military activities
-
8/13/2019 Ontology Case
10/41
distributed Ontology
Management Architecture
d-OMA
Ontology ArchitectureMarch 10, 2005
Version 3.0
Part of a series on the nature of machine encoding of sets of concepts
-
8/13/2019 Ontology Case
11/41
Community
Transition
BenchmarkHigher level Ontologies as
part of the d-OMA
Localized Ontologies within
an DAML type agency
structure
CommunityCommunity
-
8/13/2019 Ontology Case
12/41
Higher level Ontologies
Localized Ontologies
Links to Web services
and Internal R&D
Community
Community
Community
-
8/13/2019 Ontology Case
13/41
A reconciliation process
is required between
ontology services.
Ontology users have the
following roles:
Knowledge engineers
Information mediators
( e.g. ontology librarians )
Community workers
(e.g. analysts)
Knowledge
Engineers
Ontology
librarian
Intelligence TargetingAnalyst
Community
-
8/13/2019 Ontology Case
14/41
Higher level Ontologies
Localized Ontologies
Real time acquisition of
new concepts, and
modifications toexisting concepts are
made via an piece of
software called
Ontology Use Agent
Ontology Use Agents
have various type of
functions expressed in
accordance with roles
Mature scholarship from
evolutionary psychology
research communities.
Community
-
8/13/2019 Ontology Case
15/41
Human and system interaction with a
common Ontology Structure
Ontology Presentation
Part of a series on the nature of machine encodingof sets of concepts
-
8/13/2019 Ontology Case
16/41
First, and b efore al l else, an computer based ontology is a
{ concepts }
In the natural, physical, sciences an ontology is the causes of those
things that one can interact with in Nature.
Physical science informs usthat a formative process is involved
in the expression of natural ontology in natural settings.
First principles
The set of our personal and private conceptsis often thought to be the causes of how we
reason as hum ans. This metaphoris operational in many peoples understanding about the
nature and usefulness of machine encoded ontology. But this metaphor can also mislead us!!!!
Extensive literatures indicate that the Artificial Intelligence (AI) mytho logyhas leadmany to believe that the reasoning of an ontology might be the same as the
reasoning of a human in all cases.
This inference is not properbecause the truthful-ness of this inference has not been demonstrated by natural science,
and perhaps cannot be d emonstratedno matter what the levels of government funding for AI.
AI is discounted in Tim Berners Lees notion of the Semantic Web.
-
8/13/2019 Ontology Case
17/41
Tim Berners Lee Notion of Semantic Web
One consequence of acknowledging this difference is to elevate the work of the
authors of the OASIS standards, in part icular Top ic Maps. In Topic Maps wehave an open wor ld assumpt ionand very little emphasis on computational
inference. Human knowledge is represented in a shallow form, and
visualization is used to manage this representation.
Computation with topic maps ANDOWL ontologies work together withXML
repositories.
The point being
made here is thatthe notion of
inference is very
different depending
if one is talking
about the human
sideor themachine sideof
the Semantic Web.
-
8/13/2019 Ontology Case
18/41
{ concepts }First principles
Let is use only set theory to consider Tim Berners Lees
notion of Semantic Web.
Let C = { concepts } and B be a subset ofC
Human and/or machine computation
creates a well formed query
Some software
B is a subset ofC
Subsetting
function
The subsetting function might be an
ask-tell
interaction between two ontologies
-
8/13/2019 Ontology Case
19/41
-
8/13/2019 Ontology Case
20/41
First principles
Situational Ontology
SoftwareThe knowledge repository acts as aperceptual groundin a figure-ground
relationship.
The ontology sub-setting function has pulled part, but not all, of the background into a
situational focus. This first principle is consistent with perceptual physics and thus is
informed by natural science.
-
8/13/2019 Ontology Case
21/41
Extending Ontology Structure over
legacy information systems
Part of a series on the nature of machine encoding of sets of concepts
(The following slides are from OntologyStreams Ontology Presentation VII General Background)
-
8/13/2019 Ontology Case
22/41
Functional specs
Ontology Use Start-up Use Model
Model: Steady State Ontology System
Components: Framework for Query Entity
Data Access: Steady State Ontology System
Framework for Query Entity
Building ontology from data flow
Using Ontology
Ontology Generating Framework
The inverse problem: generating synthetic data
Finding data regularity in its natural contextualization
Presentation Contents
-
8/13/2019 Ontology Case
23/41
Functional specs
Functional specs:
1. Human-centric: must be human (individual) centric in design and
function
2. Support data retrieval:must act as a data retrieval mechanism3. Event structure measurement:must assist in the definition of data
acquisition requirements on an on-going basis
4. Interactive:must support multiple interacting ontologies
5. Real Time:must aid in real time problem solving and in the long term
management of specific sets of concepts
Note:Ontology mediated knowledge systems have operational properties that are quite different from traditional
relational database information systems. These five functional specs have been reviewed by a small community of
professional ontologists, as has been deemed correct for knowledge systems.
-
8/13/2019 Ontology Case
24/41
-
8/13/2019 Ontology Case
25/41
Start-up Use Model II
Transaction process
Knowledge
Base
Entity
updates
ReasonerQuery
entities
startup
inferences
Start-up Use Case Step 2
Since instance data is much larger (2 or more
magnitudes larger) than the knowledge base, the
instance data is managed in a separate start-up
process.
Instance data
-
8/13/2019 Ontology Case
26/41
Model: Steady State Ontology System
Transaction process
Knowledge
Base
Entity
updates
Query
entities
inferences
Inference Mgr
Query Mgr
Data Access Mgr
Reasoner
Ontology Mgr
Instance dataInstance data may be remote or local.
Local data is on the same network as the
knowledge base.
-
8/13/2019 Ontology Case
27/41
-
8/13/2019 Ontology Case
28/41
Knowledge
Base
Data
Data Access Mgr
Data Access: Steady State Ontology System
using the OWL standard **
The RDF knowledge base ** is a set of conceptsexpressed as a set:
{ < subject, verb, predicate > }
and the data is either XML or a data structure
such as one would have as a C construct.
The Data Access Manager must manage the
mapping between local data stores (sometimes
having millions of elements) and the set ofconcepts.
The remote data may have many persistence
forms, and will be accessed via a data object.
Instance data
Data Object
Pipes and
Threads
** We use RDF and OWL as a standard to create minimal and well knowledge inference capabilities.
-
8/13/2019 Ontology Case
29/41
-
8/13/2019 Ontology Case
30/41
Framework from Knowledge Management point of view
Transaction processes
Query
entities
Query Manager
Real time analysis is
supported through the
development and use of query
entities.
These entities have regular
structure and are managed
within a Framework.
-
8/13/2019 Ontology Case
31/41
Building ontology from data flow
Transaction processes A model of the "causes" of transactiondata.
The model is based on, grounded in,
the concept of "occurrence in the real
world", or "event".
Associated with each event, we may
have a "measurement".
So we have a set of events
{ e(i) }
where i is a counter.
Some of the fields MAY not be used.
Later the number of fields in any
"findings data flow" may increase or
decrease without us caring at all.
Objective: We convert a stream of
event measurements into an
transaction ontology, and createauxiliary processes that will use a
general r isk onto log y, an onto logy
about proc ess opt im ization, and other
utility ontologies.
-
8/13/2019 Ontology Case
32/41
Using Ontology
Transaction processes
Queryentities
Query Manager
Consider a set of events { e(i) }where i is a
counter.
Each event will have a weakly wstructured (free
form text) and structured scomponent. So we use
the notation
e = w/s or
e(i) = w(i)/s(i).
Knowledge
Base
Data
Instance data
-
8/13/2019 Ontology Case
33/41
Ontology Generating Framework
Transaction processes
Notation
e(i) = w(i)/s(i)
An event is measured by filling in slots in a data
entry form, and by typing in natural language into
comments fields in these entry forms.
Observation: Given real data, one can categorize
the set of events due to the nature of the
information filled in.
{ e(i) }{ w(i) }
{ s(i) }
Semantic extraction
Discrete analysis
Semantic extraction is performed using one of
several tools, or tools in combination with each other
Discrete analysisis mostly the manual
development of ontology through the study of natural
categories in how the data is understood by
humans.
-
8/13/2019 Ontology Case
34/41
{ w(i) }
{ s(i) }
Semantic extraction
Discrete analysis
Each s(i)is a record from a single table.
Suppose there are 120 columns. Each column has
values, sometimes empty. Fix the counter at *.
Let s(*, j) , j = 1, . . . , 120 be the columns. We can call
these columns also using the term slot.
Now for each s(*, j) list the values that are observed to
be in that column. These values are the possible
fillers for the associated slot.
Free form text is weakly structured.
The set
{ w(i) }
is a text corpus that we would like to associate with
several ontologies.
Each association is made as exploratory activities with
specific goals.
For each eventwe may have zero or more free text
fields. Suppose we concatenate these, into one text
unit, and perhaps also develop some metadata (in
some way) that will help contextualize the semanticextraction process. We label this unit as w(i).
Finding data regularity in its natural contextualization
Regularity in data flowis caused by the events
occurring in the external world. Thus the instances
of specific data in data records provide to the
knowledge system a measurement of the events in
the external world.
-
8/13/2019 Ontology Case
35/41
Data regularity in context . Patterns and invariance
XML and related standards
community open and protected standards (CoreTalk, Rosetta Net,
ebXML)
.NET component management
J2EE frameworksSpring
General framework constructions
autonomous extraction (Hilbert encoding of data into keyless hash
tables, SLIP Shallow Link analysis Iterated Scatter-Gather an
Parcelation) Core, AeroText (etc) semantic extraction using processknowledge and text input
-
8/13/2019 Ontology Case
36/41
The role of community
1) A community of practice provide a reif ication processthat is
Human centric (geographical-community / functional-community)
2) Each community may have locally instantiatedOWL ontology
with topic map visualization.
a) Consistently and completeness is managed locally as a property of the human individuals,
acting within a community, and a locally persisted history of informational tr ansactions
with his/her agent
b) I ndividual agentscan query for and acquire XML based Web Services, negotiate with other
agents and create reconciliation processes involving designated agencies.
3) Knowledge engineers act on the behalf of policy makers to
reify new concepts,
delete concepts and to
instantiate reconcil iation containers
-
8/13/2019 Ontology Case
37/41
Establishing coherence in natural knowledge representation
1) Coherence is a physical phenomenonseen in lasers
a) Brain function depends critically on electromagnetic coherence
b) Incoherence, e.g. non-rationality, and in-completeness are two separate contrasting issues
to the issue of coherence
c) Mathematics , and therefore computer science and logic, have completeness and
consistency issues that are well established and well studied
2) Logical coherence is sometimes treated as consistency in logic
a) One may think that one has logical consistency and yet this property of the full system was lost at the
last transaction within the Ontology
b) The act of attempting to find a complete representation of information organization is sometimes
called reification, and reification efforts works against efforts to retain consistency
3) Human usabil i tyoften is a function of a proper balance
between logic and agility
-
8/13/2019 Ontology Case
38/41
Understanding multiple viewpoints
1) Logical consistency and single minded-nessare operationallinked together in most current generation decision support
systems. Database schema legacy issues. Schema servers, FEA (FederalEnterprise Architecture) standards, schema independent data systems
2) Observation: Human cognitive capabil itieshave far moreagility than current generation decision support systems
3) The topic map standard(2001, Newcomb and Briezski ) was
specifically developed to address the non-agility of Semantic
Web standards based on OWL and RDF. (Ontopia, Steve Pepper)
4) Combining XML repositories, OWL, distributed agent
architectures and Topic Maps is expressed as Stratified
Ontology Management
-
8/13/2019 Ontology Case
39/41
Detection of novelty
Scenario: an targeting and search analyst at the Port of Seattle is
only partially aware of why she feels uncomfortableaboutsome characteristic of a shipment from Sweden. The feeling is
expressed in a hand written finding and fed into a document
management repository for findings. A targeting and search
analyst at the Port of LA expresses a factabout a similarshipment without knowing of her colleagues sense of
discomfort.
1) Conceptual roll-up techniquesare used on a minute by minute basis to create a
viewable topic map over occurrences of concepts expressed in findings.
2) L ink analysisconnects an alert about uncertainty in the Seattle finding to the factfrom LA to produce new information related to a known vulnerability and attack
pattern.
3) New knowledge forms are propagated into OWL instanti ated ontology
and rules and viewed using Topic Maps.
-
8/13/2019 Ontology Case
40/41
Agent architecture
Scenario: Human analysts provide situational awareness via tacit
knowledge, personally agent mediated interactions with agentstructures, and human to human communications. A model of
threats and vulnerabil i tieshas evolved but does not account
for a specific new strategy being developed by a smart
smuggler. The smuggler games the current practices in orderto bring illegal elements into the United States
1) The model of threats and vulnerabilities expresses as a reification process from
various techniques and encoded OWL/Protg ontology with rules
2) Global Ontology: The model is maintained via near real time agencyprocesses
under the observation, active review, of knowledge engineers and program
managers working with knowledge of policy and event structure
3) Local Ontology: Information is propagated to individual analysts via alerts and
ontology management services controlled by the localized agent (of the person)
-
8/13/2019 Ontology Case
41/41
New (1/30/2005) tutorial on automated
extraction of ontology from free form text:
http://www.bcngroup.org/beadgames/anticipatoryWeb/23.htm
http://www.bcngroup.org/beadgames/anticipatoryWeb/23.htmhttp://www.bcngroup.org/beadgames/anticipatoryWeb/23.htm