
Thesis Proposal

Raluca Budiu

February 9, 2000

The Role of Background Knowledge in Sentence and Discourse Processing

02/ 09/ 2000 Thesis proposal --- Raluca Budiu2

Metaphors

Time is money.

People from all cultures use metaphors on an everyday basis, irrespective of their level of education.

Language is full of frozen metaphors (Adam’s apple, leg of a table, etc.).

People understand (most) metaphors easily.


“Mistakes”

People make mistakes when they speak.

Often people do not notice mistakes and can still understand the message communicated:

How many animals of each kind did Moses take on the ark?

It’s hard for people not to ignore mistakes.


Memory for Text

People interpret new stories in terms of past experiences.

Doing that helps them remember the new stories better.

Doing that also makes them distort the actual facts.


Motivation

Metaphors

“Mistakes”

Memory for text

Claim: all are facets of the same cognitive mechanism, which:

• accounts for both fallibility and robustness

• uses background knowledge as a heuristic in service of the current goal.


Thesis Topic: Comprehension

At the semantic level, comprehension works
• bottom-up: all the information available is used to find an interpretation;
• top-down: the interpretation is further used to help comprehension or recall.

Proof: a single computational model in ACT-R (Anderson & Lebiere, 1998)
• explaining and unifying phenomena from various domains;
• satisfying a number of computational and empirical (i.e., fitting actual behavioral data) constraints.


Overview

Thesis topic; A model for sentence comprehension; Empirical constraints; Computational constraints; Summary and work plan.


Semantic Interpretation

[Diagram: the model takes words + thematic roles and background knowledge as input and produces a semantic interpretation.]

Understanding a sentence = finding a matching interpretation/context in the background knowledge.

[Diagram: Ark proposition: agent = Noah, verb = take, patient = animals, place-oblique = ark.]


How Does the Model Work?

[Diagram: the sentence "How many animals did Noah take on the ark" is read word by word while candidate contexts (Farm context, Ark context) are considered.]

Farm proposition: agent = father, verb = raise, patient = animals, place-oblique = farm.

Ark proposition: agent = Noah, verb = take, patient = animals, place-oblique = ark.

Callouts: Incremental; From left to right; omitting


Model in the Absence of Context Priming

[Flowchart steps: Read word; Extract Word Meaning; Context?; Word matches context?; Find context; Old words match?; Context found? Branches are labelled yes/no.]

Legend: at the marked step, the model may omit checking all the previous words.
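The incremental, trial-and-error search can be illustrated with a short Python sketch. This is one possible reading of the flowchart, not the ACT-R model itself: the `BACKGROUND` table, the exact-match test, and the `check_old` flag are all assumptions made for illustration (the actual model relies on partial matching). The flag stands in for the marked step, where checking the previous words may be skipped.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical background knowledge: each context is one proposition that maps
# thematic roles to fillers. The fillers come from the Farm and Ark examples.
BACKGROUND = {
    "farm": {"agent": "father", "verb": "raise", "patient": "animals", "place-oblique": "farm"},
    "ark": {"agent": "Noah", "verb": "take", "patient": "animals", "place-oblique": "ark"},
}

Word = Tuple[str, str]  # (thematic role, word); roles are given as input, not parsed


def matches(context: str, role: str, word: str) -> bool:
    """True if the word fills the given role in the context's proposition."""
    return BACKGROUND[context].get(role) == word


def find_context(role: str, word: str, old_words: List[Word],
                 rejected: Set[str], check_old: bool) -> Optional[str]:
    """Search for a context that fits the current word (and, optionally, the old ones)."""
    for name in BACKGROUND:
        if name in rejected or not matches(name, role, word):
            continue
        if check_old and not all(matches(name, r, w) for r, w in old_words):
            continue
        return name
    return None


def comprehend(sentence: List[Word], check_old: bool = True) -> Optional[str]:
    """Read left to right; return the final context, or None if none was found."""
    context: Optional[str] = None
    rejected: Set[str] = set()
    seen: List[Word] = []
    for role, word in sentence:
        if context is None or not matches(context, role, word):
            if context is not None:
                rejected.add(context)  # trial and error: drop the failed context
            context = find_context(role, word, seen, rejected, check_old)
        seen.append((role, word))
    return context


NOAH = [("patient", "animals"), ("agent", "Noah"), ("verb", "take"), ("place-oblique", "ark")]
MOSES = [("patient", "animals"), ("agent", "Moses"), ("verb", "take"), ("place-oblique", "ark")]
```

In this sketch `comprehend(NOAH)` first tries the farm context, rejects it at "Noah", and settles on the ark context. For `MOSES`, the result depends on `check_old`: when the previous words are re-checked no context fits, and when that check is skipped the ark context is accepted despite the distortion.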


Context Priming

How many animals did Noah take on the ark?

1. Boat or ship held to resemble that in which Noah and his family were preserved from the Deluge

2. A repository traditionally in or against the wall of a synagogue for the scrolls of the Torah

[Diagram: Ark story proposition: agent = Noah, verb = took, patient = animals, place-oblique = ark(1).]

Different processing at the beginning and at the end of the sentence.


Model With Context Priming

[Flowchart steps: Read word; Extract Context Role; Context role matches word?; Find context; Old words match?; Context found?; Sentence not comprehended. Branches are labelled yes/no.]

Legend: at the marked step, the model may omit checking all the previous words.


Distributed Meaning Assumption

[Diagram: the word “Noah” and its meaning Noah, connected to the features Bible char, Navigator, Married, and Patriarch.]

• Meaning retrieval = extracting word features;
• Replace word meaning with feature as unit of processing;
• Model remains the same.


Summary of the Model

Incremental;

Trial-and-error strategy;

Mixture of bottom-up and top-down strategies;

Incomplete processing (aka symbolic partial matching)
• at the word meaning level (not all features extracted);
• at the sentence level;

No syntactic processing: thematic roles are inputs.


Overview

Thesis topic; Model; Empirical constraints; Computational constraints; Summary and work plan.


Metaphor-related Phenomena

• Effects of position on metaphor understanding (Gerrig & Healy, 1983);

• Effects of metaphoric truth on the judgement and recall of sentences of the type Some As are Bs (Glucksberg, Gildea & Bookin, 1982);

• Interferences of literal and metaphoric truth on sentence judgements (Keysar, 1989);

• Effects of context length on metaphor understanding (Ortony, Schallert, Reynolds & Antos, 1978);

• Comprehension differences between different types of metaphors (Gibbs, 1990; Ortony et al. 1978; our data).


Metaphor Position Effects

Metaphor-first sentences take longer to comprehend than metaphor-second sentences (Gerrig & Healy, 1983).

[Figure: contexts considered by the model (Container context, Stars context) and comprehension times for two sentences; legend: * = predictions.]

Drops of molten silver filled the sky: 4.21s (4.23s)

The sky was filled with drops of molten silver: 3.53s (2.84s)


What Are Semantic Illusions?

How many animals of each kind did Moses take on the ark?

Semantic illusions are very robust (Reder & Kusbit, 1991); however, not anything can make an illusion.

Good vs. bad illusions: How many animals did Adam take on the ark?


Semantic Illusion Datasets

Illusion rates for good and bad distortions (Ayers, Reder & Anderson, 1996);

Percent correct for good and bad distortions in the gist task (Ayers et al., 1996);

Latencies in the literal and gist task (Reder & Kusbit, 1991);

Processing of semantic anomalies and contradictions (Barton & Sanford, 1993);

When an aircraft crashes, where should the survivors be buried? vs. When a bicycle accident occurs, where should the survivors be buried?


Good vs. Bad Illusions

Illusion rates (Ayers et al., 1996) and model predictions

[Bar chart: illusion rate * 100 (y-axis from 0 to 60) for undistorted sentences, good distortions, and bad distortions; series: Data and Predictions.]

All levels of distortion are significantly different from one another.


Modeling Semantic Illusions

[Diagram: the Ark proposition (agent = Noah, verb = take, patient = animals, place-oblique = ark), with Moses and Adam shown alongside Noah.]

Model says “Distorted” if it finds no interpretation;

Key idea: meaning overlap (supported by van Oostendorp & Mul, 1990; van Oostendorp & Kok, 1990);

Model predicts an effect of position of distortion in the sentence: late distortions are harder to detect.
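The meaning-overlap idea can be made concrete with a small Python sketch. Everything in it is an assumption for illustration. The features for Noah follow the Distributed Meaning slide, while the features for Moses and Adam, the Jaccard overlap measure, and the 0.5 threshold are hypothetical and are not taken from the model.

```python
# Noah's features are the ones shown on the Distributed Meaning slide; the
# features for Moses and Adam are hypothetical, chosen only for illustration.
FEATURES = {
    "Noah": {"bible character", "navigator", "married", "patriarch"},
    "Moses": {"bible character", "married", "patriarch", "lawgiver"},
    "Adam": {"bible character", "married", "first man"},
}


def overlap(word: str, expected: str) -> float:
    """Jaccard overlap between the feature sets of two words (an assumed measure)."""
    a, b = FEATURES[word], FEATURES[expected]
    return len(a & b) / len(a | b)


def judged_distorted(word: str, expected: str, threshold: float = 0.5) -> bool:
    """Report "Distorted" when the overlap is too small to accept the word.

    The threshold is a hypothetical parameter, not a fitted value.
    """
    return overlap(word, expected) < threshold
```

With these made-up features, Moses shares three of five features with Noah and passes unnoticed (a good distortion), whereas Adam shares only two of five and is reported as distorted (a bad distortion).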


Memory for Text

Prior schemas can influence text memory (Bartlett, 1932; Bransford & Johnson, 1972; etc.);

If a text is consistent with a pre-existing script (paradigmatic situation/previous experience)
• subjects recall more propositions from the text,
• but also make more script-consistent intrusions (Owens, Bower & Black, 1979).


Text Memory Datasets

• Recall and recognition of sentences from multiple episodes related or not by a common setting (Owens et al., 1979);
• Interferences from related stories on recall and recognition of text (Bower, Black & Turner, 1979);
• Text recall in the presence or absence of a topic (Bransford & Johnson, 1972);
• Recall of single, related and unrelated facts (Bradshaw and Anderson, 1982).


Interferences Among Related Stories

The number of intrusions can increase if subjects study more variants of the same script (Bower, Black & Turner, 1979):
• At the Dentist’s --- about Bill
• At the Doctor’s --- about Tom

Rates of recall per script version (Bower et al., 1979) and model predictions

[Chart: rate of recall (y-axis from 0 to 0.4) by number of script versions (1, 2); series: stated actions (data, predictions) and unstated actions (data, predictions).]


Modeling Script Effects

[Diagram: Story 1 (dentist’s) and Story 2 (doctor’s) shown together with the Visiting-healthcare-professional script; legend: Studied Propositions, Script Propositions.]


Difficulties With Modeling Script Effects

Parsing the discourse into a unitary and coherent representation (solve the problem of binding);

Text representation that allows recursive schemas;

Modeling different types of intrusions, especially abstract intrusions:

Studied                            Intruded
Bill paid the bill.                Tom paid the bill.
The nurse x-rayed Bill’s teeth.    The nurse checked Tom’s blood pressure.


Lexical Ambiguity Resolution

Although not designed for data from this domain, our model makes strong predictions about ambiguity resolution.

Does context influence meaning access for an ambiguous word?

Possible answer: both meanings are activated, but activation depends additively on both context and individual meaning frequency (Tabossi, 1988; Duffy, Morris & Rayner, 1988; Rayner & Duffy, 1986; Rayner & Frazier, 1989; Lucas, 1999).


Lexical Ambiguity Datasets

Gaze duration on balanced and unbalanced homophones (Duffy et al., 1988);

Mean reading time per character in the disambiguation region (Duffy et al., 1988);


Gaze Durations on Homophones

Duffy et al. (1988) manipulated position of disambiguating region and relative frequency of the homophone’s meanings:

– Disambiguating region before/after the homophone;

– Homophone could be balanced (pitcher) or unbalanced (port);


Gaze Duration on Homophones

Mean gaze duration on homophones in Duffy et al. (1988)

[Bar chart: mean gaze duration in msec (y-axis from 240 to 285) for ambiguous and control words, with the disambiguating region before or after, for balanced and unbalanced homophones.]

• Times longer than controls reflect multiple access.
• Times equal to controls reflect selective access.


Time Spent on Disambiguating Region

Mean time spent on disambiguating region in Duffy et al. (1988)

[Bar chart: mean time in msec/char (y-axis from 0 to 80) for ambiguous and control words, with the disambiguating region before or after, for balanced and unbalanced homophones.]


Fitting the Data

Disambiguation-after:
• no context priming;
• individual meaning activation is proportional to meaning frequency (ACT-R assumption);
• ACT-R is serial (no multiple access), but close competitors can slow down retrieval (tentative ACT-R assumption).

Disambiguation-before:
• context priming: context is an extra source of activation;
• if the wrong meaning is more frequent, context priming may not be enough.
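A toy Python calculation shows how these assumptions interact. It is only a sketch: the meaning labels, the relative frequencies, the priming value of 0.3, and the simple sum "activation = frequency + priming" are hypothetical stand-ins, not ACT-R's actual activation equations or fitted parameters.

```python
from typing import Dict, Optional, Tuple

# Hypothetical relative frequencies of each meaning (not taken from the data).
PITCHER = {"container": 0.5, "ball player": 0.5}  # balanced homophone
PORT = {"harbor": 0.7, "wine": 0.3}               # unbalanced homophone


def activations(freqs: Dict[str, float], primed: Optional[str] = None,
                priming: float = 0.3) -> Dict[str, float]:
    """Activation = meaning frequency, plus a context boost for the primed meaning."""
    return {m: f + (priming if m == primed else 0.0) for m, f in freqs.items()}


def retrieve(freqs: Dict[str, float], primed: Optional[str] = None,
             priming: float = 0.3) -> Tuple[str, float]:
    """Serial retrieval: return the most active meaning and its lead over the runner-up.

    A small lead stands for a close competitor, which may slow retrieval down.
    """
    acts = activations(freqs, primed, priming)
    ranked = sorted(acts, key=acts.get, reverse=True)
    return ranked[0], acts[ranked[0]] - acts[ranked[1]]
```

With these numbers, priming the "ball player" meaning of the balanced homophone is enough to select it, while priming the less frequent "wine" meaning of the unbalanced homophone is not: "harbor" still wins, but only by a narrow lead. Without any priming, the two meanings of the balanced homophone are tied, which is the closest possible competition.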


Overview

Thesis topic; Model; Empirical constraints; Computational constraints; Summary and work plan.


Computational Constraints

Realistic reaction times;

Integration with background knowledge;

Allowing for errors of the syntactic processor (i.e., wrong thematic roles).

[Diagram: the model takes words + thematic roles and background knowledge as input and produces a semantic interpretation.]


Syntactic Ambiguity As a Computational Constraint

Garden path effects have been widely documented in the literature:
• The horse raced past the barn fell;
• The cop arrested by the detective was guilty of taking bribes.

Solution: thematic roles as meaning features later omitted.

[Diagram: the model takes words + candidate thematic roles and background knowledge as input and produces a semantic interpretation.]


Summary

Language comprehension theory to be embodied in a single ACT-R model;

Semantic rather than syntactic level of processing (no parser);

The theory should satisfy:
• Computational constraints:
  – Realistic reaction times;
  – Integration with background knowledge;
  – Syntactic ambiguity.
• Empirical constraints:
  – Metaphor understanding;
  – Semantic illusions;
  – Lexical ambiguity;
  – Memory for text: script effects and elaborations.


Empirical Constraints

Metaphor understanding:
• Effects of position on metaphor understanding (Gerrig & Healy, 1983);
• Effects of metaphoric truth on the judgement and recall of sentences of the type Some As are Bs (Glucksberg et al., 1982);
• Interferences of literal and metaphoric truth on sentence judgements (Keysar, 1989);
• Effects of context length on metaphor understanding (Ortony et al., 1978);
• Comprehension differences between different types of metaphors (Gibbs, 1990; Ortony et al., 1978; our data).


Empirical Constraints (contd.)

Semantic illusions:
• Illusion rates for good and bad distortions in the literal and gist tasks (Ayers et al., 1996);
• Latencies in the literal and gist task (Reder & Kusbit, 1991);
• Processing of semantic anomalies and contradictions (Barton & Sanford, 1993).

Lexical ambiguity:
• Gaze duration on balanced and unbalanced homophones (Duffy et al., 1988);
• Mean reading time per character in the disambiguation region (Duffy et al., 1988).


Empirical Constraints (contd.)

Memory for text (script effects and elaborations):
• Recall and recognition of sentences from multiple episodes related or not by a common setting (Owens et al., 1979);
• Interferences from related stories on recall and recognition of text (Bower et al., 1979);
• Text recall in the presence or absence of a topic (Bransford & Johnson, 1972);
• Recall of single, related and unrelated facts (Bradshaw and Anderson, 1982).


Model Validation

Collect new empirical data to validate “side effects” or other predictions of the model not covered by the previous list of empirical phenomena, e.g., position effects for the Moses illusion.

Test the model on data sets (for the same phenomena) other than the ones it was built for, in order to avoid “overfitting”.


Work Plan

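[Pie chart of planned effort across Garden path, Lexical ambiguity, Text memory, Semantic illusions, and Metaphor; slice values shown: 20%, 10%, 15%, 30%, 25%. The pairing of labels and values is not given in the transcript.]

• Modeling and parameter fitting;
• Data collection: metaphors and semantic illusions;
• The model still has to solve the more difficult problems of discourse representation.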