Artificial Intelligence Applications Institute
Centre for Intelligent Systems and their Applications

Stuart Aitken

Artificial Intelligence Applications Institute

A Process Ontology for Cell Biology

Outline

• Rapid Knowledge Formation (RKF) Project
  – RKF Project goals and domain
  – The Cyc knowledge-based system
  – RKF Tools

• Process Ontology
  – General approach
  – Formalisation
  – Example

Rapid Knowledge Formation

• The RKF project aims to develop tools that allow domain experts to enter knowledge directly into the knowledge-based system (KBS).

• DARPA-funded, with two teams:
  – CYCORP
  – SRI

• Organised around ‘Challenge Problems’
  – Cell Biology

RKF

Aim: To enable biologists to construct an ontology/KB from a textbook source

[Diagram: textbook source (Alberts et al., Essential Cell Biology, 1998) → formalise → Ontology]

Rapid Knowledge Formation

Key techniques:
• The KBS has knowledge of the knowledge acquisition (KA) process
  – Knowledge of salience
  – Knowledge of the requirements of an adequate formalisation

• There is a dialogue between expert and system, which clarifies the concept being defined.

Rapid Knowledge Formation

Evaluation:

After a period of tool development,
• trials are organised,
• both expert performance and knowledge engineer (KE) performance are measured,
• and the results are assessed independently.

The evaluation is extensive, running over a period of 2 weeks.

The Cyc KBS

• Cyc (Doug Lenat) is a knowledge-based system, under development since ~1984, aiming to represent common sense knowledge.

• Cyc uses a large upper-level ontology

• Uses a logical language based on first-order logic

The Cyc KBS

Concepts in the Upper Ontology:
  – Thing, Agent, Event
  – TangibleThing, InformationBearingObject
  – … Dog, Book
  – subclass (genls), instance-of (isa)
  – parts, subevent, role predicates
  – 1600 concepts in total in the public release (1998), a small % of Cyc

Classification:
  – Stuff-like vs Object-like
  – Individual vs Set
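The relationship between genls (subclass) and isa (instance-of) can be illustrated with a minimal Python sketch: an individual that is an instance of a collection is also an instance of everything that collection specialises. The concept names come from the slide, but the particular genls links, the individual "Fido", and the function names are assumptions made for illustration, not a copy of the Cyc ontology or its API.

```python
# Toy taxonomy. The specific genls links below are illustrative assumptions,
# not a copy of the Cyc upper ontology.
GENLS = {
    "Dog": {"TangibleThing"},
    "Book": {"InformationBearingObject"},
    "TangibleThing": {"Thing"},
    "InformationBearingObject": {"Thing"},
    "Agent": {"Thing"},
    "Event": {"Thing"},
}

# Direct instance-of assertions; "Fido" is a made-up individual.
ISA = {"Fido": {"Dog"}}


def all_genls(collection):
    """Return the collection plus every collection reachable via genls links."""
    seen, stack = set(), [collection]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(GENLS.get(current, ()))
    return seen


def isa(individual, collection):
    """True if the individual is an instance of the collection, directly or via genls."""
    return any(collection in all_genls(c) for c in ISA.get(individual, ()))


fido_is_thing = isa("Fido", "Thing")
fido_is_event = isa("Fido", "Event")
```

Here `fido_is_thing` is true because Dog reaches Thing through the assumed genls chain, while `fido_is_event` is false because no chain leads from Dog to Event.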

The Cyc KBS

• The upper ontology supports application development:

[Diagram: layered ontology rooted at Thing, with layers Upper-level, Intermediate-level and Application-level]

The Cyc KBS

Cyc includes:
• An inference engine,
• GUI,
• tools for ontology development.
• Until the RKF project, ontology development was by trained knowledge engineers, working with domain experts.

RKF

New tools in Cyc:
• Define a new concept, and place it correctly in the ontology
• Refine a concept definition
• Define a new predicate
• Assert a new fact
• Define a new rule
• State an analogy
• Construct a new process

RKF

User interaction:
• Selection of items in the interface
  – The choice offered is determined ‘intelligently’: the KBS has knowledge of salience and of the KA process, and this knowledge must be authored
• Browsing of the ontology
• Search
• Natural language dialogue

Process Models

[Diagram: RNA Transcription, with BindsTogether and Move]

Process Descriptor

Q: Name the process
A: [ RNA Transcription ]
Q: Select the type of Process that describes the category best
• event localised
• creation or destruction event
• …
• ‘say this:’ [ _ _ _ _ _ _ ]
Q: Define:
• affected object: [ _ _ _ _ _ ]
• location: [ _ _ _ _ _ ]
• actor: [ _ _ _ _ _ ]

Process Models

Describing Processes:
• Complex expressions at the instance level
• Simpler to describe in terms of types

[Diagram: ontology layers Upper-level, Intermediate-level, Application-level?]

Predicates:
subevent(Event,Event)
doneBy(Event,Agent)

Instance-level rule:
ForAll ?E ?F ?G
  implies(
    subevent(?E,?G) and isa(?E,BindsTogether) and
    subevent(?F,?G) and isa(?F,Move),
    before(startOf(?E),startOf(?F)))
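To make the instance-level rule concrete, the following Python sketch checks it over a small set of events. It assumes a numeric timeline for startOf and a single parent per subevent; the `Event` class, the `ordering_violations` function and the sample events are hypothetical and are not part of Cyc.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional


@dataclass(frozen=True)
class Event:
    name: str
    event_type: str                # isa(event, event_type)
    start: float                   # startOf(event), on an assumed numeric timeline
    parent: Optional[str] = None   # subevent(event, parent)


def ordering_violations(events):
    """Return (e, f) name pairs that break the rule: for a BindsTogether
    subevent e and a Move subevent f of the same parent event,
    before(startOf(e), startOf(f)) must hold."""
    binds = [e for e in events if e.event_type == "BindsTogether" and e.parent]
    moves = [f for f in events if f.event_type == "Move" and f.parent]
    return [
        (e.name, f.name)
        for e, f in product(binds, moves)
        if e.parent == f.parent and not e.start < f.start
    ]


# Made-up instances: f2 starts before e1, so it violates the rule.
events = [
    Event("t1", "RNATranscription", 0.0),
    Event("e1", "BindsTogether", 1.0, parent="t1"),
    Event("f1", "Move", 2.0, parent="t1"),
    Event("f2", "Move", 0.5, parent="t1"),
]
violations = ordering_violations(events)
```

Every pairing of a BindsTogether with a Move must be enumerated, which is why the instance-level form becomes unwieldy as processes grow and why a type-level description is simpler.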

Script Vocabulary

The Script theory defines the semantics of Type-Level assertions

(typePlaysRoleInScene RNATranscription DNAMolecule BindsTogether objectActedOn)

• Requires rules for identity
  – Can require complex reasoning
• Good for user input
• Can be extended to cover pre- and postconditions of actions

Scripts

[Diagram: RNA Transcription (t) with subevents BindsTogether (e) and Move (f), related at the type level by startsAfterStartingOfInScript]

For all subevents f of t of type Move, and all subevents e of t of type BindsTogether: (startsAfterStartingOf f e), where t is of type RNATranscription.

Scripts

Type playing role

Types: BindsTogether, Nucleotide (role: objectActedOn)

Instances: e (of BindsTogether), N (set of Nucleotide instances)

For some n in N, (objectActedOn e n)
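A short Python sketch of this type-to-instance reading follows: a type-level assertion is taken to hold for a scene instance when some instance of the participant type fills the role. The `TypePlaysRole` tuple follows the argument order of the typePlaysRoleInScene example, but that reading, the fact encoding and all instance names are assumptions for illustration.

```python
from typing import NamedTuple


class TypePlaysRole(NamedTuple):
    """Type-level assertion, read as (script, participant type, scene type, role)."""
    script: str
    participant_type: str
    scene: str
    role: str


def role_satisfied(assertion, scene_instance, facts, instances_of):
    """True if some instance n of the participant type satisfies
    (role scene_instance n) in the given set of instance-level facts."""
    candidates = instances_of.get(assertion.participant_type, ())
    return any((assertion.role, scene_instance, n) in facts for n in candidates)


assertion = TypePlaysRole(
    "RNATranscription", "Nucleotide", "BindsTogether", "objectActedOn"
)

# Made-up instances: N = {n1, n2}; e1 and e2 are BindsTogether events.
instances_of = {"Nucleotide": ["n1", "n2"]}
facts = {("objectActedOn", "e1", "n2")}

e1_ok = role_satisfied(assertion, "e1", facts, instances_of)
e2_ok = role_satisfied(assertion, "e2", facts, instances_of)
```

The existential reading means that `e1_ok` is true as soon as any one nucleotide is acted on, while `e2_ok` is false because no instance of N fills the role for e2.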

New Script Vocabulary

• Pre and Post conditions

[Diagram: BindsTogether relates N and R; before the scene they are not touchingDirectly, after it they are connectedTo]

(preconditionOfScene-negated BindsTogether touchingDirectly <Ribonucleotide Nucleotide>)

(postconditionOfScene BindsTogether connectedTo <Ribonucleotide Nucleotide>)

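New Script Vocabulary

Types: BindsTogether, with roles filled by Nucleotide and Ribonucleotide

Set of Instances: e (of BindsTogether), N (of Nucleotide), R (of Ribonucleotide)

Precondition: Some ?n in N, some ?r in R (not (touchingDirectly ?n ?r))

Postcondition: Some ?n in N, some ?r in R (connectedTo ?n ?r)

[Diagram: an ‘identity’ link relates the instances in the precondition to those in the postcondition]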
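The role of identity can be sketched in Python: the precondition and postcondition are only meaningful together if they are witnessed by the same ?n and ?r. The example below assumes a closed world (a fact absent from a state is treated as false) and represents states as sets of tuples; the function name and the sample data are hypothetical.

```python
from itertools import product


def binding_witnesses(before, after, ns, rs):
    """Return the (n, r) pairs that are not touchingDirectly in the state
    before the scene and are connectedTo in the state after it.
    Using the same pair in both checks enforces identity."""
    return [
        (n, r)
        for n, r in product(ns, rs)
        if ("touchingDirectly", n, r) not in before
        and ("connectedTo", n, r) in after
    ]


# Made-up instances of Nucleotide (N) and Ribonucleotide (R).
N = ["n1", "n2"]
R = ["r1"]

# n2 is already touching r1 beforehand, so it cannot witness the precondition.
before = {("touchingDirectly", "n2", "r1")}
after = {
    ("touchingDirectly", "n1", "r1"),
    ("touchingDirectly", "n2", "r1"),
    ("connectedTo", "n1", "r1"),
}

witnesses = binding_witnesses(before, after, N, R)
```

Checking the two existentials separately would accept a state in which one pair satisfies the precondition and a different pair satisfies the postcondition, which is the case the identity rules are meant to exclude.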

Script Vocabulary

• The Script vocabulary forms an ‘intermediate level’, which lies behind the Process Descriptor GUI (i.e. the text boxes).
• It is not, in itself, a taxonomy of processes, but it allows processes to be described in detail.
• Defining the subclass relation is just one task.

Vaccinia Virus Life Cycle

• The vaccinia virus life cycle was selected as an example of a complex model to formalise as a set of Scripts.

• The model includes actions, decomposition, ordering, objects-playing-roles and pre/postconditions

• It is a good test for the Script vocabulary

Vaccinia Virus Life Cycle

[Diagram: the scenes mRNATranscription-Early, ViralGeneTranslation-Early and MovementOfProtein, shown in three views]

Temporal: mRNATranscription-Early, ViralGeneTranslation-Early, MovementOfProtein

Participants:
  – Outputs: messengerRNA
  – Inputs: messengerRNA

Conditions:
  – Pre: spatiallySubsumes Cell VirusCore
  – Post: spatiallySubsumes CellCytoplasm Vitf2
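As a rough illustration of how the temporal and participant views interact, the Python sketch below walks a script in order and reports any scene whose input has not been produced by an earlier scene. It assumes that the scenes occur in the order listed, that messengerRNA is the output of mRNATranscription-Early and the input of ViralGeneTranslation-Early; the transcript does not state these attachments explicitly, and the `Scene` class and `unmet_inputs` function are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    name: str
    inputs: set = field(default_factory=set)
    outputs: set = field(default_factory=set)


def unmet_inputs(script, initially_available=frozenset()):
    """Walk the scenes in order and return (scene, input) pairs for inputs
    that no earlier scene has produced and that were not available at the start."""
    available = set(initially_available)
    missing = []
    for scene in script:
        for item in sorted(scene.inputs - available):
            missing.append((scene.name, item))
        available |= scene.outputs
    return missing


# Assumed attachment of participants to scenes (see the lead-in).
script = [
    Scene("mRNATranscription-Early", outputs={"messengerRNA"}),
    Scene("ViralGeneTranslation-Early", inputs={"messengerRNA"}),
    Scene("MovementOfProtein"),
]

in_order = unmet_inputs(script)
swapped = unmet_inputs([script[1], script[0], script[2]])
```

With the listed order `in_order` is empty; swapping the first two scenes makes translation run before its messengerRNA exists, which `swapped` reports.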

Evaluation

• 8 biologists were selected, and trained in the tools, 4 per team

• The knowledge to be formalised was selected (chapter 7 in Alberts)

• The knowledge base was allowed to contain ‘pump-priming’ knowledge

• The biologists entered knowledge using the tools, then tested it against a set of questions

• Ontology/KB was revised

Evaluation

Results (outline)
• A huge amount of data was collected, but analysis is complex (IET Inc)
• Domain experts were able to develop ontologies after ‘light’ training
• Knowledge engineers out-perform domain experts in ontology construction

Summary

‘Power Tools’ for ontology development are being implemented and tested in the RKF project.

• A Script/Process vocabulary has been developed and applied to processes in cell biology, covering:
  – Temporal order
  – Participants
  – Pre/postconditions
  – Repetition