From Words to Knowledge ORION Active Structure

Upload: kathleen-harris

Post on 18-Dec-2015


From Words to Knowledge

ORION Active Structure


Two Approaches

We could separate the process of turning words into knowledge into its components, or we could adopt a more holistic approach.


A Sequence of Activities

This approach segments the process into separate parts, each of which is blind to all the others. This seems easier conceptually, but is obviously not what people do in reading text.

[Diagram: stages labelled Words, POS Tags, Grammar, Semantics, Knowledge]


The Holistic Approach

The lexing, grammar, semantic and structure-building processes proceed simultaneously and synergistically, opportunistically using any information coming from any direction

[Diagram labels: Active Structure, Local Knowledge, Global Knowledge, Semantics, Grammar, Words]


The Basic Elements

These are the basic elements of Active Structure - variables, operators, links, values flowing in the structure
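As a rough illustration of variables, operators, links and values flowing in a structure, here is a minimal Python sketch. The names `Variable` and `PlusOperator` are hypothetical and are not taken from ORION; the sketch only assumes that an operator is linked to variables and may pass values along its links in whichever direction they arrive.

```python
class Variable:
    """A node holding a value (None means unknown) and links to operators."""

    def __init__(self, name, value=None):
        self.name = name
        self.value = value
        self.links = []  # operators attached to this variable

    def set(self, value):
        # Only accept a value if none is known yet, then let it flow onward.
        if self.value is None:
            self.value = value
            for op in self.links:
                op.activate()


class PlusOperator:
    """Hypothetical operator enforcing a + b = total in any direction."""

    def __init__(self, a, b, total):
        self.a, self.b, self.total = a, b, total
        for v in (a, b, total):
            v.links.append(self)

    def activate(self):
        a, b, t = self.a.value, self.b.value, self.total.value
        if a is not None and b is not None and t is None:
            self.total.set(a + b)
        elif a is not None and t is not None and b is None:
            self.b.set(t - a)
        elif b is not None and t is not None and a is None:
            self.a.set(t - b)


x, y, z = Variable("x"), Variable("y"), Variable("z")
PlusOperator(x, y, z)
z.set(10)  # a value enters at what would normally be the "output"
x.set(4)   # y is inferred by flowing backwards through the operator
print(y.value)  # 6
```

The point of the sketch is that nothing in the structure fixes a direction of computation: the same links serve whichever end the information arrives at.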


A Common Substrate

The basic elements of Active Structure can also be seen as Entities, Relations and States.

These three elements are adequate to model everything - including the grammar of language and the world of objects.

[Diagram label: PARSE]


The Reading Process

A document is read, paragraph by paragraph, sentence by sentence, word by word.

As the words are read, they are turned into objects that can be manipulated - objects that have the properties both of words and of the objects they represent - a ligand, a gene

The word objects are assembled through grammar into larger objects - receptor or gene structure

And into larger structures, using the relations between the objects provided by nouns and verbs


Transformation

changes in the conformation of the Tsr dimer induced by serine binding improve methylation efficiency


Building Structure

When a single possible structure match is found, an invocation of the structure is built, leaving a new BRIDGE operator to look for higher level matches

[Diagram labels: Start, The, Red, Car, Stop; Four Word Noun Phrase]
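The build-on-unique-match idea can be sketched in a few lines of Python. This is an illustration only: the pattern table, the part-of-speech categories and the function name `bridge` are assumptions made for the example, and a loop stands in for the new BRIDGE operator that the slide describes.

```python
# Hypothetical pattern table: structure name -> sequence of child categories.
PATTERNS = {
    "NounPhrase": ("Article", "Adjective", "Noun"),
    "Sentence": ("Start", "NounPhrase", "Stop"),
}


def category(node):
    """A node is a (category, word) leaf or a (category, children) invocation."""
    return node[0]


def bridge(nodes):
    """Build an invocation whenever exactly one pattern matches, then look again."""
    while True:
        matches = [
            (name, i)
            for name, pattern in PATTERNS.items()
            for i in range(len(nodes) - len(pattern) + 1)
            if tuple(category(n) for n in nodes[i:i + len(pattern)]) == pattern
        ]
        if len(matches) != 1:  # no match, or an ambiguous one: stop here
            return nodes
        name, i = matches[0]
        span = len(PATTERNS[name])
        # Replace the matched span with one invocation of the structure;
        # the next loop pass plays the role of the new BRIDGE operator.
        nodes = nodes[:i] + [(name, nodes[i:i + span])] + nodes[i + span:]


words = [("Start", "<s>"), ("Article", "The"), ("Adjective", "Red"),
         ("Noun", "Car"), ("Stop", "</s>")]
result = bridge(words)
print(result[0][0])  # Sentence
```

The first pass finds only the noun phrase and builds it; the second pass then sees a higher-level match that did not exist before the noun phrase was built.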


Next Symbol

[Diagram labels: Was, On, Of, Head, Man, Stop, Hat, The, Start; NounPhrase (three times), VerbPhrase, PrepPhrase, Preposition (twice)]

The Next Symbol depends on the local structure - run down from the current symbol, then run up again if other structure exists, otherwise jump across a PARSE


Harpooning the Model

When the noun phrase is recognised, the objects it joins are searched for a connection. One is found for animal and colour through ATTRIBUTE, so the same relation joining the objects is searched for in the model, and a unique match is “harpooned” for use with relations. The type of object changes the grammar.
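A toy version of this lookup might look as follows in Python. Everything except the ATTRIBUTE relation between animal and colour is invented for the example: the object names, the model contents and the function `harpoon` are hypothetical, and the real search over active structure is certainly richer than a list scan.

```python
# Type-level knowledge: which relation can join two kinds of object.
TYPE_RELATIONS = {("Animal", "Colour"): "ATTRIBUTE"}

# Hypothetical domain model: typed objects joined by relations.
OBJECT_TYPES = {
    "rex": "Animal", "tom": "Animal", "felix": "Animal",
    "brown": "Colour", "black": "Colour",
}
MODEL = [
    ("rex", "ATTRIBUTE", "brown"),
    ("tom", "ATTRIBUTE", "black"),
    ("felix", "ATTRIBUTE", "black"),
]


def harpoon(head_type, modifier):
    """Return the single model relation matching the phrase, else None."""
    relation = TYPE_RELATIONS.get((head_type, OBJECT_TYPES.get(modifier)))
    if relation is None:
        return None  # these types cannot be connected at all
    hits = [
        (s, r, o) for (s, r, o) in MODEL
        if r == relation and o == modifier and OBJECT_TYPES.get(s) == head_type
    ]
    # Only a unique match is "harpooned"; ambiguity leaves the model untouched.
    return hits[0] if len(hits) == 1 else None


print(harpoon("Animal", "brown"))  # ('rex', 'ATTRIBUTE', 'brown')
print(harpoon("Animal", "black"))  # None (two candidates)
```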


Automatic Phasing

A BRIDGE operator doing a long match may find not all the information is available. If so, it puts a connection on the missing information and waits to be re-activated.

[Diagram labels: Start, The, Red, Car, Stop; Four Word Noun Phrase]
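The wait-and-reactivate behaviour can be sketched with a simple callback arrangement. The classes `Slot` and `BridgeOperator` below are hypothetical stand-ins, assuming only that an operator can attach itself to a missing piece of information and be woken when that piece arrives.

```python
class Slot:
    """A piece of information that may not be available yet."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.waiters = []  # operators connected to this slot

    def fill(self, value):
        self.value = value
        waiting, self.waiters = self.waiters, []
        for op in waiting:  # re-activate everything that was waiting
            op.activate()


class BridgeOperator:
    """Hypothetical operator that completes only when every slot is filled."""

    def __init__(self, slots):
        self.slots = slots
        self.result = None

    def activate(self):
        for slot in self.slots:
            if slot.value is None:
                slot.waiters.append(self)  # connect to the missing information
                return                     # and go dormant until it arrives
        self.result = [slot.value for slot in self.slots]


slots = [Slot("article"), Slot("adjective"), Slot("noun")]
bridge = BridgeOperator(slots)
slots[0].fill("The")
bridge.activate()     # "adjective" is missing, so the operator waits on it
slots[2].fill("Car")  # nothing is waiting on "noun", so nothing happens
slots[1].fill("Red")  # wakes the operator, which now completes
print(bridge.result)  # ['The', 'Red', 'Car']
```

No central scheduler decides when the match is retried; the arrival of the missing information is what restarts it.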


In the Process of Building

Part of a sentence under construction - hundreds of different active structures are cooperating in the process - building up, cutting out, reversing connections


Tight Integration

The structure combines lexical information, grammar and semantics - we pick up the fact that a word is a noun because it is an Entity, and we know something isn’t a Material because the Verb says not.

This tight and immediate interweaving of lexical, grammatical and semantic analysis allows us to do things that are not possible with a static sequential approach.


Scientific Sentences Are Complex

The synergistic effect of serine and CheW binding to Tsr is attributed to distinct influences on receptor structure; changes in the conformation of the Tsr dimer induced by serine binding improve methylation efficiency, and CheW binding changes the arrangement among Tsr dimers, which increases access to methylation sites.


Grammar Is Not Enough

Grammar alone would turn meaningful scientific text into sludge - a participial phrase “induced by...” has to be anchored on the right object, a relative pronoun “which” has to be anchored on the relation

The reading process demands that domain knowledge be available at every turn - knowledge that is held in object hierarchies and relations, and which is seamlessly intermingled with grammatical knowledge during the parsing


What Does It Rely On

The paradigm relies on dynamic construction and destruction of active structure, where operators in the structure respond to their local environment by changing the local topology, and then respond to the changed environment, and so on. Each operator can only transmit information through its links, change its connections, add structure or destroy itself. Their interaction suffices to cause all the necessary processes to proceed in parallel, in an opportunistic and synergistic manner.
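To make the idea of purely local action concrete, here is a small Python sketch. The `Relay` operator and the agenda loop are assumptions made for illustration: the sketch shows only two of the permitted actions (changing connections and self-destruction) and runs the operators one after another rather than in parallel.

```python
from collections import deque


class Node:
    def __init__(self, name):
        self.name = name
        self.links = set()


def link(a, b):
    a.links.add(b)
    b.links.add(a)


def unlink(a, b):
    a.links.discard(b)
    b.links.discard(a)


class Relay(Node):
    """Hypothetical operator: with exactly two neighbours, it joins them
    directly and removes itself from the structure."""

    def react(self, agenda):
        if len(self.links) == 2:
            a, b = self.links
            link(a, b)       # change connections
            unlink(self, a)  # destroy itself by dropping its own links
            unlink(self, b)
            # Neighbouring operators now see a changed environment.
            agenda.extend(n for n in (a, b) if isinstance(n, Relay))


def run(operators):
    agenda = deque(operators)
    while agenda:  # keep going until no operator has anything left to do
        agenda.popleft().react(agenda)


w1, w2 = Node("w1"), Node("w2")
r1, r2 = Relay("r1"), Relay("r2")
link(w1, r1)
link(r1, r2)
link(r2, w2)
run([r1, r2])
print(w1.links == {w2}, r1.links, r2.links)  # True set() set()
```

Each operator looks only at its own links, yet the chain as a whole collapses to a direct connection between the two end nodes.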


Typical Domain Knowledge Model

Earthquake Event

Attenuation: acceleration attenuation based on magnitude, distance and local site conditions

Greece Info (GIS): find distance between site and epicentre, local conditions, etc.

Intensity/Damage: relations between acceleration, intensity and damage ratio

Frequency/Amplification: relations between magnitude and frequency, building type, number of floors and natural frequency

The model is built out of the same variables, operators, links as the grammatical and semantic structures, so it can interact with them


Genetic Knowledge

ABCA1: ATP-binding cassette, sub-family A (ABC1), member 1

LocusID: 19

Overview

The membrane-associated protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intracellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the ABC1 subfamily. With cholesterol as its substrate, this protein functions as a cholesterol efflux pump in the cellular lipid removal pathway. Mutations in this gene have been associated with Tangier's disease and familial high-density lipoprotein deficiency.

Family: ABC (transporter across membranes)
Subfamily: ABC1 (members ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White)
Gene: ABCA1
Protein: NP_005493
Substrate: Cholesterol
Function: cholesterol efflux pump associated lipid removal pathway
Mutation: causes Tangier’s disease, familial high-density lipoprotein deficiency

Chromosome: 9; Cytogenetic: 9q31.1

[Diagram labels: Genes, Proteins, Diseases, Anatomy; NP_0001, NP_0047; Brain, Liver, Eye, Kidney, Cortex; FFPFICABC HKT, PKC]

The structure is used to understand the text - then the text is used to extend the structure


Why Do This

The automated process of Information Extraction needs to be in the same state as a knowledgeable human reader at every point in the text, so inferences about alternatives and anaphora are made on the same basis - the basis on which the writer expects them to be made.

The automated process also needs the ability to backtrack when reading more text refutes assumptions already built into any part of the structure.
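One simple way to picture backtracking is an undo trail that records what each assumption added to the structure. The Python sketch below is an assumption-laden illustration, not a description of ORION: the `Structure` class is hypothetical, the anchoring of "which" is a made-up example, and plain chronological undo is used where the real mechanism may be more selective.

```python
class Structure:
    """Links plus a trail of which assumption introduced which links."""

    def __init__(self):
        self.links = set()
        self.trail = []  # (assumption, links added under it), oldest first

    def assume(self, assumption, new_links):
        added = set(new_links) - self.links
        self.links |= added
        self.trail.append((assumption, added))

    def refute(self, assumption):
        """Undo the refuted assumption and everything built after it."""
        if all(name != assumption for name, _ in self.trail):
            raise KeyError(assumption)
        while True:
            name, added = self.trail.pop()
            self.links -= added
            if name == assumption:
                return


s = Structure()
# First guess: the relative pronoun attaches to the nearest noun.
s.assume("which refers to 'Tsr dimers'", {("which", "Tsr dimers")})
s.assume("dimers increase access", {("Tsr dimers", "increases access")})
# Later text refutes the guess, so it and its consequences are withdrawn.
s.refute("which refers to 'Tsr dimers'")
s.assume("which refers to 'changes the arrangement'",
         {("which", "changes the arrangement")})
print(s.links)  # {('which', 'changes the arrangement')}
```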


Is It Really So Different

We are asserting that knowledge can only be captured in active structure - structure that is capable of adapting itself to its environment.

Efforts at capturing knowledge in static structure founder on two reefs - the pieces of structure will not fit together statically, and an algorithm that could manage their combination would be more complex than the combination of the pieces, and is thus unmanageable.

Active Structure avoids both problems - the pieces adapt to each other, and the behavior of the combination is managed by the interaction of the pieces.
