Machine Reading as a Process of Partial Question-Answering

Peter Clark and Phil Harrison

Boeing Research & Technology

June 2010

Overview

• Machine Reading and Question-Answering
• Approach
• Algorithm
• Preliminary Results
• Summary

Machine Reading

Machine Reading = A “holy grail” of AI
• Constructing an inference-supporting representation from text
• Connecting what is read with what is already known

• Reader already knows something
• Text is elaborating/deepening that knowledge

• Do I already know this?
• Can I interpret this as something that I know?
• Can I interpret some of this as something I know?

Machine Reading

Machine Reading:
• Do I already know this?
• Can I interpret this as something that I know?
• Can I interpret some of this as something I know?
• Any remainder = new knowledge

Question-Answering:
• Do I already know this?
• Can I interpret this as something that I know?
• Can I interpret some of this as something I know?
• Any remainder = failed query

Main insight: These are similar processes

• Can apply question-answering techniques to machine reading.
• Why is that important? Question-answering is precisely a technology for linking what is said (asked) with what is known.

i.e., to read text T, ask: Is it true that T?
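The idea of reading a text by asking "Is it true that T?" can be illustrated with a toy example. This is only a sketch under strong assumptions: facts are plain (relation, subject, object) triples, "proving" is exact lookup, and the names `KB` and `read` are hypothetical, not part of the authors' system.

```python
# Toy knowledge base: a set of (relation, subject, object) triples.
# The triple format and the contents are assumptions made for illustration.
KB = {
    ("has-part", "mitotic-spindle", "microtubule"),
}


def read(statement_facts, kb):
    """Ask "is it true that T?" for each fact in T.

    Facts the KB can answer are recognized as known; any remainder is
    treated as new knowledge rather than as a failed query.
    """
    known = [fact for fact in statement_facts if fact in kb]
    new = [fact for fact in statement_facts if fact not in kb]
    return known, new


# "The mitotic spindle consists of hollow microtubules."
text_facts = [
    ("has-part", "mitotic-spindle", "microtubule"),
    ("shape", "microtubule", "hollow"),
]

known, new = read(text_facts, KB)
kb_after_reading = KB | set(new)  # the remainder extends the KB
```

In question-answering the second list would be reported as unanswered; in machine reading it is what gets added.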


General Approach

Text: “The mitotic spindle consists of hollow microtubules.”
Question: “Does the mitotic spindle consist of hollow microtubules?”
Partial Answer: “The mitotic spindle has parts [hollow] microtubules”
New Knowledge: “Those microtubules are hollow”

• Knowledge has guided interpretation
• ...and identified the “anchor points” in the KB for new knowledge

Pipelined (KB independent) NLP

[Diagram labels: input text “During prophase, the cell…”; Parse, logical form; Word-Sense Disambiguation; Semantic Role Labeling; “?”; Topic in the KB]

Interleaved Interpretation and Answering

[Diagram, built up over successive slides. Labels: input text “During prophase, the cell…”; Logical Form; Topic in the KB; Existing Knowledge; New Knowledge; Extended KB]

• Suppose this is the best we can do, interpreting text as existing knowledge
• Traditional NLP
• Word sense choices
• Semantic role choices
• Paraphrase rewrites

Some Possible Semantic Role Labels…

“DNA synthesized by the polymerase”

agent? location? means?

KB

Some Possible Paraphrases (DIRT)…

“spindle consists of microtubules”

“microtubules are part of the spindle”

“spindle is staffed by microtubules”

“microtubules participate in the spindle”

…

KB
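One way to picture how the KB selects among paraphrases: generate every candidate rewrite and keep only those the KB already supports. The sketch below assumes a triple-based KB and made-up relation names for the paraphrases (only has-part appears in the slides); `CANDIDATES` and `supported` are hypothetical names.

```python
# Toy KB; the triple encoding is an assumption for this sketch.
KB = {("has-part", "spindle", "microtubule")}

# Candidate readings of "spindle consists of microtubules".
# Relation names other than has-part are invented for illustration.
CANDIDATES = [
    ("microtubules are part of the spindle",
     ("has-part", "spindle", "microtubule")),
    ("spindle is staffed by microtubules",
     ("has-staff", "spindle", "microtubule")),
    ("microtubules participate in the spindle",
     ("has-participant", "spindle", "microtubule")),
]


def supported(candidates, kb):
    """Keep the paraphrases whose KB reading is already known."""
    return [(text, fact) for text, fact in candidates if fact in kb]


matches = supported(CANDIDATES, KB)
for text, fact in matches:
    print(f"KB supports: {text!r} as {fact}")
```

Noisy paraphrases such as "is staffed by" are harmless here because nothing in the KB confirms them.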


Knowledge Representation

Ontology: ~400 biology concepts, ~400 general concepts

Axioms: Mainly “Forall…exists…” axioms, e.g.,
• “All eukaryotic cells contain a nucleus”
• “Subevents of mitosis are prophase, metaphase, …”

Inference:
• Reason about an instance of a concept
• Conclusions apply to all instances of the concept (via universal generalization, UG)
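A “Forall…exists…” axiom can be read as a recipe: given any instance of the concept, some related individual must exist. The sketch below is an assumed encoding, not the authors' KR language: each concept maps to (relation, filler concept) pairs, and `instantiate` creates one instance plus one fresh individual per axiom, following the X0/Y0 naming seen in the slides. It does not expand the fillers recursively.

```python
# Hypothetical encoding of "forall x: C(x) -> exists y: F(y) and r(x, y)"
# as concept -> list of (relation r, filler concept F).
AXIOMS = {
    "Eukaryotic-Cell": [("has-part", "Nucleus")],
    "Mitosis": [("subevent", "Prophase"), ("subevent", "Metaphase")],
}


def instantiate(concept, axioms):
    """Create one instance of `concept` and the individuals it implies."""
    root = "X0"
    facts = [("isa", root, concept)]
    for index, (relation, filler) in enumerate(axioms.get(concept, [])):
        individual = f"Y{index}"  # fresh individual implied by the axiom
        facts.append(("isa", individual, filler))
        facts.append((relation, root, individual))
    return facts


cell_facts = instantiate("Eukaryotic-Cell", AXIOMS)
mitosis_facts = instantiate("Mitosis", AXIOMS)
```

Because X0 stands for an arbitrary instance, whatever is concluded about it is taken to hold for every instance of the concept.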

Topics

• Topic = the concept that a text describes
• We assume a text is about a single topic
• Topic could be identified using ML (we do it by hand)
• Given topic, can find (some) expected “participants” from KB

Text: The centrosomes are pushed apart to opposite ends of the cell nucleus by the action of molecular motors acting on the microtubules. The nuclear envelope breaks down, allowing….

Topic: Prophase

Topics

• Topic = the concept that a text describes
• Participants = Individuals implied to exist given the topic
• Can infer (some) participants using the KB
• Text provides information about participants

Text: The centrosomes are pushed apart to opposite ends of the cell nucleus by the action of molecular motors acting on the microtubules. The nuclear envelope breaks down, allowing….

Topic: Prophase

KB, Prophase:
→ centrosome moves to the pole of a eukaryotic cell
→ nucleus, cytoplasm
→ nuclear membrane, etc.

Algorithm

1. Setup
• Identify the topic of the text
• Create representation of topic + (known) participants in KB
• Parse and create initial “logical form”

“The mitotic spindle consists of hollow microtubules.”

"mitotic-spindle"(s), "consist"(c), "hollow"(h), "microtubule"(m), subject(c,s), "of"(c,m), modifier(m,h).

2. Search
• repeat: interpret + (try to) prove parts of the LF
• until: as much proved as possible
• Interpret remainder (normal NLP) and add to KB
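The search step can be sketched as a loop over the unproved parts of the logical form. This is a simplified illustration, not the authors' implementation: literals are opaque strings, `candidates` is a hypothetical function returning possible KB readings of a literal (it must return at least one), proof is set membership, and the fallback for the remainder is simply the last candidate, standing in for "normal NLP".

```python
def read_sentence(literals, kb, candidates):
    """Interpret and try to prove each literal; add the remainder to the KB."""
    proved, remainder = [], list(literals)
    progress = True
    while progress:                      # repeat ...
        progress = False
        for literal in list(remainder):
            for fact in candidates(literal):
                if fact in kb:           # this interpretation is provable
                    proved.append(fact)
                    remainder.remove(literal)
                    progress = True
                    break
    # ... until as much is proved as possible; the rest is new knowledge.
    new = [candidates(literal)[-1] for literal in remainder]
    kb.update(new)
    return proved, new


def candidates(literal):
    """Hypothetical interpretation choices for two LF fragments."""
    table = {
        "consist-of(spindle, microtubule)": [
            ("material", "spindle", "microtubule"),
            ("has-part", "spindle", "microtubule"),
        ],
        "hollow(microtubule)": [("shape", "microtubule", "hollow")],
    }
    return table[literal]


kb = {("has-part", "spindle", "microtubule")}
proved, new = read_sentence(
    ["consist-of(spindle, microtubule)", "hollow(microtubule)"], kb, candidates
)
```

In this toy version one pass would suffice; the outer loop matters once proving one part binds variables that make other parts provable.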

Create a representation of the topic in the KB

Topic: Prophase

“The mitotic spindle consists of hollow microtubules.”

[KB graph for the topic. Nodes: X0:Prophase, Y0:Move, Y1:Centrosome, Y2:Eukaryotic-Cell, Y3:Elongate, Y4:Mitotic-Spindle, Y5:Pole, Y6:Create, Y7:Microtubule, … Edge labels: subevent, has-part, has-region, object, destination]

Generate Logical Form

“The mitotic spindle consists of hollow microtubules.”

LF interpretation:
"mitotic-spindle"(s), "consist"(c), "hollow"(h), "microtubule"(m), subject(c,s), "of"(c,m), mod(m,h).

Interpret and (try) prove some part of the LF

“The mitotic spindle consists of hollow microtubules.”

LF interpretation:
"mitotic-spindle"(s), "consist"(c), "hollow"(h), "microtubule"(m), subject(c,s), "of"(c,m), mod(m,h).
isa(Y4,MSpindle), "consist"(c), "hollow"(h), "microtubule"(m), subject(c,Y4), "of"(c,m), mod(m,h)

Bind a LF variable (here s is bound to Y4:Mitotic-Spindle in the KB graph)

Interpret and (try) prove some part of the LF

“The mitotic spindle consists of hollow microtubules.”

LF interpretation:
isa(Y4,MSpindle), "consist"(c), "hollow"(h), "microtubule"(m), subject(c,Y4), "of"(c,m), mod(m,h)
isa(Y4,MSpindle), "hollow"(h), "microtubule"(m), material(Y4,m), mod(m,h). ?
isa(Y4,MSpindle), "hollow"(h), "microtubule"(m), has-part(Y4,m), mod(m,h). ?

Recognized Old Knowledge

Interpret and (try) prove some part of the LF

“The mitotic spindle consists of hollow microtubules.”

LF interpretation:
isa(Y4,MSpindle), "hollow"(h), "microtubule"(m), has-part(Y4,m), mod(m,h).
isa(Y4,MSpindle), "hollow"(h), isa(Y7,Microtubule), has-part(Y4,Y7), modifier(Y7,h). !

Recognized Old Knowledge: Y4:Mitotic-Spindle has-part Y7:Microtubule

Traditional NLP for the rest…

“The mitotic spindle consists of hollow microtubules.”

LF interpretation:
isa(Y4,MSpindle), "hollow"(h), isa(Y7,Microtubule), has-part(Y4,Y7), modifier(Y7,h).
isa(Y4,MSpindle), isa(Y8,Hollow), isa(Y7,Microtubule), has-part(Y4,Y7), shape(Y7,Y8).

New Knowledge: Y7:Microtubule shape Y8:Hollow

Add to the KB

[KB graph after reading: the topic graph for X0:Prophase, with Y4:Mitotic-Spindle has-part Y7:Microtubule and the new edge Y7:Microtubule shape Y8:Hollow]
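The variable-binding part of this walkthrough can be reproduced with a very small matcher. The sketch below starts from the already-interpreted logical form (has-part and shape chosen) and assumes a greedy, left-to-right match with no backtracking; `match`, `prove` and `ground` are hypothetical helper names and the three KB facts are the ones visible in the slides.

```python
KB = {
    ("isa", "Y4", "Mitotic-Spindle"),
    ("isa", "Y7", "Microtubule"),
    ("has-part", "Y4", "Y7"),
}
VARIABLES = {"s", "m", "h"}

# Interpreted LF for "The mitotic spindle consists of hollow microtubules."
LF = [
    ("isa", "s", "Mitotic-Spindle"),
    ("has-part", "s", "m"),
    ("isa", "m", "Microtubule"),
    ("isa", "h", "Hollow"),
    ("shape", "m", "h"),
]


def match(literal, fact, bindings):
    """Return extended bindings if the literal unifies with the fact, else None."""
    if literal[0] != fact[0]:
        return None
    extended = dict(bindings)
    for term, value in zip(literal[1:], fact[1:]):
        term = extended.get(term, term)   # follow an existing binding
        if term in VARIABLES:
            extended[term] = value        # bind an unbound LF variable
        elif term != value:
            return None
    return extended


def prove(lf, kb):
    """Greedily prove literals against the KB; collect what cannot be proved."""
    bindings, proved, remainder = {}, [], []
    for literal in lf:
        for fact in sorted(kb):
            extended = match(literal, fact, bindings)
            if extended is not None:
                bindings = extended
                proved.append(fact)
                break
        else:
            remainder.append(literal)
    return bindings, proved, remainder


def ground(remainder, bindings, fresh_names):
    """Turn unproved literals into new facts, inventing individuals as needed."""
    bindings = dict(bindings)
    facts = []
    for relation, *args in remainder:
        grounded = []
        for arg in args:
            if arg in VARIABLES and arg not in bindings:
                bindings[arg] = next(fresh_names)
            grounded.append(bindings.get(arg, arg))
        facts.append((relation, *grounded))
    return facts


bindings, proved, remainder = prove(LF, KB)
new_knowledge = ground(remainder, bindings, iter(["Y8"]))
```

The proved literals are the recognized old knowledge; the grounded remainder corresponds to isa(Y8,Hollow) and shape(Y7,Y8) in the final logical form.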


Illustration

Input Text + Topic (here, Prophase):

“During prophase, chromosomes become visible, the nucleolus disappears, the mitotic spindle forms, and the nuclear envelope disappears. Chromosomes become more coiled and can be viewed under a light microscope. Each duplicated chromosome is seen as a pair of sister chromatids joined by the duplicated but unseparated centromere. The nucleolus disappears during prophase. In the cytoplasm, the mitotic spindle, consisting of microtubules and other proteins, forms between the two pairs of centrioles as they migrate to opposite poles of the cell. The nuclear envelope disappears at the end of prophase. This signals the beginning of the substage called prometaphase.”

Output Axioms (expressed in English):

In all prophase events:
• The chromosome moves.
• The chromatids are attached by the centromere.
• The nucleolus disappears during the prophase.
• The mitotic spindle has parts the microtubule and the protein.
• The mitotic spindle is created between the centrioles in the cytoplasm.
• The centrioles move to the poles.
• The nuclear envelope disappears at the end.
• Something signals.

[Successive builds of this slide annotate individual output axioms as: Good interpretation using paraphrases; Useful New Knowledge; Good interpretation; Not very useful; Bad interpretation]

A Preliminary Experiment

• 10 paragraphs (110 sentences) about prophase, from Web
• 114 logic statements created:
  23 (20%) fully known to the KB
  27 (24%) partially new knowledge
  64 (56%) completely new knowledge
• A biologist ranked the statements (expressed in English) as:
  c = correct; useful knowledge for the KB
  q = questionable; not useful (meaningless, vague)
  i = incorrect

Statements that are:   Fully known   Mixture of known & new   Fully new
Correct                     22                 19                 25
Questionable                 1                  8                 38
Incorrect                    0                  0                  1

• “The membrane break down”: questionable due to poor rendering in English, not the original logic
• Mixture of known & new: 70% judged correct
• Fully new: 39% judged correct
• Is extracting and integrating some useful knowledge
• Potentially useful as interactive tool
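The counts on this slide can be cross-checked against the totals reported earlier (23, 27 and 64 statements). A small sketch, assuming the digit runs read column-wise as fully new, mixture of known & new, fully known; the dictionary layout is just for illustration.

```python
# Counts read off the slide. The column assignment is an assumption that is
# checked below against the totals reported earlier (23, 27 and 64).
counts = {
    "fully known": {"correct": 22, "questionable": 1, "incorrect": 0},
    "mixture of known & new": {"correct": 19, "questionable": 8, "incorrect": 0},
    "fully new": {"correct": 25, "questionable": 38, "incorrect": 1},
}

totals = {name: sum(column.values()) for name, column in counts.items()}
percent_correct = {
    name: round(100 * column["correct"] / totals[name])
    for name, column in counts.items()
}

for name in counts:
    print(f"{name}: {totals[name]} statements, "
          f"{percent_correct[name]}% judged correct")
```

Under this reading the column totals match the earlier breakdown, and the 70% and 39% figures correspond to the mixed and fully new columns respectively.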

Summary

• Clearly only a first step
  Simple KR, single parse, contradictions, noisy, …
• But:
  Interpretation guided by knowledge
  Identifies the “hooks” for new knowledge
  Is a “real” context for machine reading

To read T, ask “Is it true that T?”
