ontology case


  • 8/13/2019 Ontology Case

    1/41

    The DHS Ontology Case

    Presentation by OntologyStream Inc

    Paul Stephen Prueitt, PhD

    19 March 2005

    Ontology Tutorial 1, copyright, Paul S Prueitt 2005

  • 8/13/2019 Ontology Case

    2/41

    Ontology technology exists in the market

    However, practical problems block progress in designing and implementing DHS-wide ontology technology. Business process mythology, like Earned Value program management, is not focusing on the right questions, and maturity models for software development have precluded most innovations that come from outside the relational database paradigm.

    These practical problems are also partially a consequence of:

    1. Some specific institutional behaviors related to traditional program management.

    2. Confusion caused by long-term, heavily funded Artificial Intelligence marketing activities.

  • 8/13/2019 Ontology Case

    3/41

    As a general proposition, throughout the federal government, quality metrics

    are not guiding management decisions supporting:

    1) Quick transitions from database centered information technology to XML

    based Semantic Web technology.

    2) Transitions from XML repositories to ontology mediated Total Information

    Awareness, with Informational Transparency, in Secure Channels.

    Emerging Semantic Web

    Standards

    DHS Ontology

    1) World-wide Trade Data

    2) Investigation Targeting

    3) Risks, Threats and

    Vulnerabilities

    4) Policy Enforcement

    ?

  • 8/13/2019 Ontology Case

    4/41

    Diagram from Prueitt, 2003

  • 8/13/2019 Ontology Case

    5/41

    Diagram from Prueitt, 2003

    Diagram labels: First two steps are missing. Seven-step AIPM is not complete. RDBMS.

  • 8/13/2019 Ontology Case

    6/41

    The measurement/instrumentation task

    First two steps in the AIPM

    Measurement is part of the semantic extraction task, and is accomplished with a known set of techniques:

    1) Latent semantic technologies

    2) Some sort of n-gram measurement with encoding into hash tables or an internal ontology representation (CCM and NdCore, perhaps AeroText and Convera's process ontology (?), Orbs, Hilbert encoding, CoreTalk/Cubicon)

    3) Stochastic and neural/genetic architectures
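
    The n-gram measurement mentioned on this slide can be illustrated with a minimal sketch. It assumes whitespace tokenization and uses a plain Python hash table (a Counter); the function name ngram_counts and the sample sentence are hypothetical and do not come from CCM, NdCore, Orbs or any other product named here.

```python
from collections import Counter

def ngram_counts(text: str, n: int = 2) -> Counter:
    """Count word n-grams in `text`; each n-gram tuple is a hash-table key."""
    tokens = text.lower().split()
    # zip over n staggered views of the token list yields consecutive n-grams
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(grams)

# Hypothetical free-text finding, used only for illustration
counts = ngram_counts("cargo held at port cargo held for review", n=2)
```

    Repeated n-grams accumulate under the same key, which is the sense in which the table "measures" regularity in the text.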

  • 8/13/2019 Ontology Case

    7/41

    One model for semantic extraction explicitly focuses on the first two aspects of

    the AIPM; e.g. instrumentation/measurement and data-encoding/interpretation

  • 8/13/2019 Ontology Case

    8/41

    The Actionable Intelligence Process Model (AIPM) has an action-perception event cycle.

    Stratified ontology supports the use of this cycle to produce knowledge of attack and anticipatory mechanisms based on the measurement of sub-structural categorical invariance.

    Workflow and process ontology is available as a basis for encoding knowledge of anticipatory response mechanisms.

    Categorical invariance is measured, using for example Orbs (Ontology referential bases) or CCM (Contiguous Connection Model) encoding, and organized as a resource for RDF triples using some lite OWL/OIL.

  • 8/13/2019 Ontology Case

    9/41

    Distributed ontology management is already

    available in some military activities

  • 8/13/2019 Ontology Case

    10/41

    distributed Ontology

    Management Architecture

    d-OMA

    Ontology Architecture

    March 10, 2005

    Version 3.0

    Part of a series on the nature of machine encoding of sets of concepts

  • 8/13/2019 Ontology Case

    11/41

    Community

    Transition

    Benchmark

    Higher level Ontologies as part of the d-OMA

    Localized Ontologies within a DAML-type agency structure

    Community

    Community

  • 8/13/2019 Ontology Case

    12/41

    Higher level Ontologies

    Localized Ontologies

    Links to Web services

    and Internal R&D

    Community

    Community

    Community

  • 8/13/2019 Ontology Case

    13/41

    A reconciliation process

    is required between

    ontology services.

    Ontology users have the following roles:

    1) Knowledge engineers

    2) Information mediators (e.g. ontology librarians)

    3) Community workers (e.g. analysts)

    Diagram labels: Knowledge Engineers, Ontology librarian, Intelligence Targeting Analyst, Community

  • 8/13/2019 Ontology Case

    14/41

    Higher level Ontologies

    Localized Ontologies

    Real-time acquisition of new concepts, and modifications to existing concepts, are made via a piece of software called an Ontology Use Agent.

    Ontology Use Agents have various types of functions, expressed in accordance with roles.

    Mature scholarship from evolutionary psychology research communities.

    Community

  • 8/13/2019 Ontology Case

    15/41

    Human and system interaction with a

    common Ontology Structure

    Ontology Presentation

    Part of a series on the nature of machine encoding of sets of concepts

  • 8/13/2019 Ontology Case

    16/41

    First principles

    First, and before all else, a computer-based ontology is a set of concepts:

    { concepts }

    In the natural, physical sciences, an ontology is the causes of those things that one can interact with in Nature. Physical science informs us that a formative process is involved in the expression of natural ontology in natural settings.

    The set of our personal and private concepts is often thought to be the causes of how we reason as humans. This metaphor is operational in many people's understanding of the nature and usefulness of machine-encoded ontology. But this metaphor can also mislead us!

    Extensive literatures indicate that the Artificial Intelligence (AI) mythology has led many to believe that the reasoning of an ontology might be the same as the reasoning of a human in all cases.

    This inference is not proper, because its truthfulness has not been demonstrated by natural science, and perhaps cannot be demonstrated no matter what the levels of government funding for AI.

    AI is discounted in Tim Berners-Lee's notion of the Semantic Web.

  • 8/13/2019 Ontology Case

    17/41

    Tim Berners-Lee's Notion of the Semantic Web

    One consequence of acknowledging this difference is to elevate the work of the authors of the OASIS standards, in particular Topic Maps. In Topic Maps we have an open world assumption and very little emphasis on computational inference. Human knowledge is represented in a shallow form, and visualization is used to manage this representation.

    Computation with topic maps AND OWL ontologies works together with XML repositories.

    The point being made here is that the notion of inference is very different depending on whether one is talking about the human side or the machine side of the Semantic Web.

  • 8/13/2019 Ontology Case

    18/41

    First principles

    { concepts }

    Let us use only set theory to consider Tim Berners-Lee's notion of the Semantic Web.

    Let C = { concepts } and let B be a subset of C.

    Human and/or machine computation creates a well-formed query.

    Diagram labels: Some software; B is a subset of C; Subsetting function

    The subsetting function might be an ask-tell interaction between two ontologies.
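
    The set-theoretic picture on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: concepts are plain strings, the query is modeled as a predicate over concepts, and the names subset, C and B follow the slide's notation rather than any particular ontology tool.

```python
from typing import Callable, FrozenSet

Concept = str

def subset(C: FrozenSet[Concept], query: Callable[[Concept], bool]) -> FrozenSet[Concept]:
    """Subsetting function: keep only the concepts in C that satisfy the query."""
    return frozenset(c for c in C if query(c))

# Hypothetical concept set, for illustration only
C = frozenset({"shipment", "port", "vessel", "tariff", "invoice"})

# An "ask" from a second ontology that only knows about these concepts;
# "harbor" is not in C, so it cannot appear in the result
asked = {"port", "vessel", "harbor"}
B = subset(C, lambda c: c in asked)
```

    Whatever the query is, the result B is a subset of C by construction, which is the only property the slide relies on.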

  • 8/13/2019 Ontology Case

    19/41

  • 8/13/2019 Ontology Case

    20/41

    First principles

    Situational Ontology

    Software

    The knowledge repository acts as a perceptual ground in a figure-ground relationship.

    The ontology sub-setting function has pulled part, but not all, of the background into a

    situational focus. This first principle is consistent with perceptual physics and thus is

    informed by natural science.

  • 8/13/2019 Ontology Case

    21/41

    Extending Ontology Structure over

    legacy information systems

    Part of a series on the nature of machine encoding of sets of concepts

    (The following slides are from OntologyStream's Ontology Presentation VII: General Background)

  • 8/13/2019 Ontology Case

    22/41

    Functional specs

    Ontology Use Start-up Use Model

    Model: Steady State Ontology System

    Components: Framework for Query Entity

    Data Access: Steady State Ontology System

    Framework for Query Entity

    Building ontology from data flow

    Using Ontology

    Ontology Generating Framework

    The inverse problem: generating synthetic data

    Finding data regularity in its natural contextualization

    Presentation Contents

  • 8/13/2019 Ontology Case

    23/41

    Functional specs:

    1. Human-centric: must be human (individual) centric in design and function

    2. Support data retrieval: must act as a data retrieval mechanism

    3. Event structure measurement: must assist in the definition of data acquisition requirements on an on-going basis

    4. Interactive: must support multiple interacting ontologies

    5. Real time: must aid in real-time problem solving and in the long-term management of specific sets of concepts

    Note: Ontology-mediated knowledge systems have operational properties that are quite different from traditional relational database information systems. These five functional specs have been reviewed by a small community of professional ontologists, and have been deemed correct for knowledge systems.

  • 8/13/2019 Ontology Case

    24/41

  • 8/13/2019 Ontology Case

    25/41

    Start-up Use Model II

    Transaction process

    Knowledge

    Base

    Entity

    updates

    Reasoner

    Query

    entities

    startup

    inferences

    Start-up Use Case Step 2

    Since instance data is much larger (two or more orders of magnitude larger) than the knowledge base, the instance data is managed in a separate start-up process.

    Instance data

  • 8/13/2019 Ontology Case

    26/41

    Model: Steady State Ontology System

    Transaction process

    Knowledge

    Base

    Entity

    updates

    Query

    entities

    inferences

    Inference Mgr

    Query Mgr

    Data Access Mgr

    Reasoner

    Ontology Mgr

    Instance data

    Instance data may be remote or local. Local data is on the same network as the knowledge base.

  • 8/13/2019 Ontology Case

    27/41

  • 8/13/2019 Ontology Case

    28/41

    Knowledge

    Base

    Data

    Data Access Mgr

    Data Access: Steady State Ontology System using the OWL standard **

    The RDF knowledge base ** is a set of concepts expressed as a set of triples:

    { < subject, predicate, object > }

    and the data is either XML or a data structure such as one would have as a C construct.

    The Data Access Manager must manage the mapping between local data stores (sometimes having millions of elements) and the set of concepts.

    The remote data may have many persistence forms, and will be accessed via a data object.

    Instance data

    Data Object

    Pipes and

    Threads

    ** We use RDF and OWL as a standard to create minimal and well-known inference capabilities.
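
    As a rough illustration of the mapping the Data Access Manager has to maintain, the sketch below turns records from a local data store into a set of three-element statements in the usual RDF order (subject, predicate, object). It uses plain Python tuples rather than an RDF library; the function records_to_triples, the field names and the sample records are all hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

def records_to_triples(records: Iterable[Dict[str, str]], key: str) -> Set[Triple]:
    """Map each non-empty field of each record to one triple keyed by `key`."""
    triples: Set[Triple] = set()
    for rec in records:
        subject = rec[key]
        for field, value in rec.items():
            if field != key and value:  # skip the key itself and empty fields
                triples.add((subject, field, value))
    return triples

# Hypothetical local data store, for illustration only
local_store = [
    {"id": "shipment-1", "origin": "Sweden", "port": "Seattle"},
    {"id": "shipment-2", "origin": "", "port": "LA"},
]
kb = records_to_triples(local_store, key="id")
```

    A real store with millions of elements would need streaming and indexing; the point here is only the shape of the mapping from records to the set of triples.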

  • 8/13/2019 Ontology Case

    29/41

  • 8/13/2019 Ontology Case

    30/41

    Framework from Knowledge Management point of view

    Transaction processes

    Query

    entities

    Query Manager

    Real time analysis is

    supported through the

    development and use of query

    entities.

    These entities have regular

    structure and are managed

    within a Framework.

  • 8/13/2019 Ontology Case

    31/41

    Building ontology from data flow

    Transaction processes

    A model of the "causes" of transaction data.

    The model is based on, grounded in,

    the concept of "occurrence in the real

    world", or "event".

    Associated with each event, we may

    have a "measurement".

    So we have a set of events

    { e(i) }

    where i is a counter.

    Some of the fields MAY not be used.

    Later the number of fields in any

    "findings data flow" may increase or

    decrease without us caring at all.

    Objective: We convert a stream of event measurements into a transaction ontology, and create auxiliary processes that will use a general risk ontology, an ontology about process optimization, and other utility ontologies.

  • 8/13/2019 Ontology Case

    32/41

    Using Ontology

    Transaction processes

    Query entities

    Query Manager

    Consider a set of events { e(i) }, where i is a counter.

    Each event will have a weakly structured component w (free-form text) and a structured component s. So we use the notation

    e = w/s or

    e(i) = w(i)/s(i).

    Knowledge

    Base

    Data

    Instance data

  • 8/13/2019 Ontology Case

    33/41

    Ontology Generating Framework

    Transaction processes

    Notation

    e(i) = w(i)/s(i)

    An event is measured by filling in slots in a data

    entry form, and by typing in natural language into

    comments fields in these entry forms.

    Observation: Given real data, one can categorize

    the set of events due to the nature of the

    information filled in.

    { e(i) }

    { w(i) }

    { s(i) }

    Semantic extraction

    Discrete analysis

    Semantic extraction is performed using one of several tools, or tools in combination with each other.

    Discrete analysis is mostly the manual development of ontology through the study of natural categories in how the data is understood by humans.

  • 8/13/2019 Ontology Case

    34/41

    { w(i) }

    { s(i) }

    Semantic extraction

    Discrete analysis

    Each s(i) is a record from a single table. Suppose there are 120 columns. Each column has values, sometimes empty. Fix the counter at *.

    Let s(*, j), j = 1, . . . , 120 be the columns. We can also call these columns slots.

    Now for each s(*, j), list the values that are observed to be in that column. These values are the possible fillers for the associated slot.

    Free-form text is weakly structured. The set

    { w(i) }

    is a text corpus that we would like to associate with several ontologies. Each association is made as an exploratory activity with specific goals.

    For each event we may have zero or more free text fields. Suppose we concatenate these into one text unit, and perhaps also develop some metadata (in some way) that will help contextualize the semantic extraction process. We label this unit as w(i).

    Finding data regularity in its natural contextualization

    Regularity in data flow is caused by the events occurring in the external world. Thus the instances of specific data in data records provide to the knowledge system a measurement of the events in the external world.
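
    The split of each event into e(i) = w(i)/s(i), and the listing of observed fillers per slot, can be sketched as follows. This is a minimal illustration, assuming each event arrives as a flat record and that the caller names which fields are free text; the Event class, make_event, slot_fillers and the sample records are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set

@dataclass
class Event:
    w: str             # weakly structured part: concatenated free text
    s: Dict[str, str]  # structured part: slot name -> value (may be empty)

def make_event(record: Dict[str, str], text_fields: List[str]) -> Event:
    """Concatenate the non-empty free-text fields into w; the rest becomes s."""
    w = " ".join(record[f] for f in text_fields if record.get(f))
    s = {k: v for k, v in record.items() if k not in text_fields}
    return Event(w=w, s=s)

def slot_fillers(events: List[Event]) -> Dict[str, Set[str]]:
    """For each slot (column), collect the values observed in that column."""
    fillers: Dict[str, Set[str]] = defaultdict(set)
    for e in events:
        for slot, value in e.s.items():
            if value:  # empty cells contribute no filler
                fillers[slot].add(value)
    return dict(fillers)

# Hypothetical records with two structured slots and two free-text fields
records = [
    {"origin": "Sweden", "port": "Seattle", "comment": "packaging looks unusual", "note": ""},
    {"origin": "Sweden", "port": "LA", "comment": "", "note": "weights match manifest"},
]
events = [make_event(r, ["comment", "note"]) for r in records]
fillers = slot_fillers(events)
```

    The set of w values is the text corpus handed to semantic extraction, while the fillers dictionary is the raw material for discrete analysis of the structured side.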

  • 8/13/2019 Ontology Case

    35/41

    Data regularity in context. Patterns and invariance

    XML and related standards

    Community open and protected standards (CoreTalk, RosettaNet, ebXML)

    .NET component management

    J2EE frameworks

    Spring

    General framework constructions

    Autonomous extraction (Hilbert encoding of data into keyless hash tables; SLIP: Shallow Link analysis, Iterated scatter-gather, and Parcelation)

    Core, AeroText (etc.) semantic extraction using process knowledge and text input

  • 8/13/2019 Ontology Case

    36/41

    The role of community

    1) A community of practice provides a reification process that is human-centric (geographical-community / functional-community)

    2) Each community may have a locally instantiated OWL ontology with topic map visualization.

    a) Consistency and completeness are managed locally as a property of the human individuals, acting within a community, and a locally persisted history of informational transactions with his/her agent

    b) Individual agents can query for and acquire XML-based Web Services, negotiate with other agents and create reconciliation processes involving designated agencies.

    3) Knowledge engineers act on behalf of policy makers to reify new concepts, delete concepts and instantiate reconciliation containers

  • 8/13/2019 Ontology Case

    37/41

    Establishing coherence in natural knowledge representation

    1) Coherence is a physical phenomenon seen in lasers

    a) Brain function depends critically on electromagnetic coherence

    b) Incoherence (e.g. non-rationality) and incompleteness are two separate issues, each contrasting with the issue of coherence

    c) Mathematics, and therefore computer science and logic, have completeness and consistency issues that are well established and well studied

    2) Logical coherence is sometimes treated as consistency in logic

    a) One may think that one has logical consistency, and yet this property of the full system was lost at the last transaction within the Ontology

    b) The act of attempting to find a complete representation of information organization is sometimes called reification, and reification efforts work against efforts to retain consistency

    3) Human usability is often a function of a proper balance between logic and agility

  • 8/13/2019 Ontology Case

    38/41

    Understanding multiple viewpoints

    1) Logical consistency and single-mindedness are operationally linked together in most current-generation decision support systems. Database schema legacy issues. Schema servers, FEA (Federal Enterprise Architecture) standards, schema-independent data systems

    2) Observation: Human cognitive capabilities have far more agility than current-generation decision support systems

    3) The topic map standard (2001, Newcomb and Biezunski) was specifically developed to address the non-agility of Semantic Web standards based on OWL and RDF. (Ontopia, Steve Pepper)

    4) Combining XML repositories, OWL, distributed agent architectures and Topic Maps is expressed as Stratified Ontology Management

  • 8/13/2019 Ontology Case

    39/41

    Detection of novelty

    Scenario: a targeting and search analyst at the Port of Seattle is only partially aware of why she feels uncomfortable about some characteristic of a shipment from Sweden. The feeling is expressed in a handwritten finding and fed into a document management repository for findings. A targeting and search analyst at the Port of LA expresses a fact about a similar shipment without knowing of her colleague's sense of discomfort.

    1) Conceptual roll-up techniques are used on a minute-by-minute basis to create a viewable topic map over occurrences of concepts expressed in findings.

    2) Link analysis connects an alert about uncertainty in the Seattle finding to the fact from LA to produce new information related to a known vulnerability and attack pattern.

    3) New knowledge forms are propagated into OWL-instantiated ontology and rules and viewed using Topic Maps.
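
    The link-analysis step in this scenario can be illustrated with a deliberately simple sketch: two findings are linked when they share at least one extracted concept. This is a stand-in for illustration, not the SLIP technique or any tool named in these slides; the function link_findings, the finding identifiers and the concept sets are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

def link_findings(findings: Dict[str, Set[str]]) -> List[Tuple[str, str, Set[str]]]:
    """Return (finding_a, finding_b, shared_concepts) for every pair that overlaps."""
    links = []
    for a, b in combinations(sorted(findings), 2):
        shared = findings[a] & findings[b]
        if shared:
            links.append((a, b, shared))
    return links

# Hypothetical concept sets already extracted from each finding
findings = {
    "seattle-17": {"sweden", "container", "uncertainty"},
    "la-04": {"sweden", "container", "reweighed"},
    "la-05": {"mexico", "truck"},
}
links = link_findings(findings)
```

    Here the Seattle finding that carries an "uncertainty" concept is connected to the LA fact through the concepts they share, which is the kind of link an analyst could then view in a topic map.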

  • 8/13/2019 Ontology Case

    40/41

    Agent architecture

    Scenario: Human analysts provide situational awareness via tacit knowledge, personally agent-mediated interactions with agent structures, and human-to-human communications. A model of threats and vulnerabilities has evolved but does not account for a specific new strategy being developed by a smart smuggler. The smuggler games the current practices in order to bring illegal elements into the United States.

    1) The model of threats and vulnerabilities is expressed as a reification process from various techniques and encoded as an OWL/Protégé ontology with rules

    2) Global Ontology: The model is maintained via near-real-time agency processes under the observation, and active review, of knowledge engineers and program managers working with knowledge of policy and event structure

    3) Local Ontology: Information is propagated to individual analysts via alerts and ontology management services controlled by the localized agent (of the person)

  • 8/13/2019 Ontology Case

    41/41

    New (1/30/2005) tutorial on automated

    extraction of ontology from free form text:

    http://www.bcngroup.org/beadgames/anticipatoryWeb/23.htm
