
27 January 2010

A modality lexicon and its use in automatic tagging

Kathryn Baker, Michael Bloodgood, Bonnie Dorr, Nathanial W. Filardo, Lori Levin, Christine Piatko

May 20, 2010

Presented by Lori Levin

Language Technologies Institute

Carnegie Mellon University

Context

SCALE 2009: Summer Camp in Applied Language Engineering
Johns Hopkins University Human Language Technology Center of Excellence

SIMT: Semantically Informed MT
Can we improve statistical MT with semantic knowledge?

- experiments with modality and named entities

Modality Tagger Output

Example 1:
Input: Americans should know that we can not hand over Dr. Khan to them.
Output: Americans <TrigRequire should> <TargRequire know> that we <TrigAble can> <TrigNegation not> <TargNOTAble hand> over Dr. Khan to them

Example 2:
Input: He managed to hold general elections in the year 2002, but he can not be ignorant of the fact that the world at large did not accept these elections
Output: He <TrigSucceed managed> to <TargSucceed hold> general elections in the year 2002, but he <TrigAble can> <TrigNegation not> <TargNOTAble be> ignorant of the fact that the world at large did <TrigNegation not> <TrigBelief accept> these <TargBelief elections>

Trigger: lexical item that carries a modal meaning.
Target: head of the proposition that the trigger scopes over.
Holder: the experiencer or cognizer of the modality.
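The inline tag format in the examples above can be read back mechanically. The following is a minimal sketch, assuming every tag has the shape `<TrigX word>` or `<TargX word>` with a single tagged word; the function names `extract_tags` and `strip_tags` are made up for illustration and are not part of the tagger.

```python
import re

# Assumed tag shape: "<" + Trig/Targ + modality name + space + tagged text + ">"
TAG_PATTERN = re.compile(r"<(Trig|Targ)(\w+) ([^<>]+)>")


def extract_tags(tagged):
    """Return (role, modality, word) triples in left-to-right order."""
    return [(m.group(1), m.group(2), m.group(3)) for m in TAG_PATTERN.finditer(tagged)]


def strip_tags(tagged):
    """Remove the tag markup and keep only the tagged words."""
    return TAG_PATTERN.sub(lambda m: m.group(3), tagged)


example = (
    "Americans <TrigRequire should> <TargRequire know> that we "
    "<TrigAble can> <TrigNegation not> <TargNOTAble hand> over Dr. Khan to them"
)

for role, modality, word in extract_tags(example):
    print(role, modality, word)
```

Stripping the tags from Example 1 gives back the input sentence without its final period, which the slide's output line also omits.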

Outline

- A modality annotation scheme
- A modality lexicon
- A string based modality tagger
- A tree based modality tagger
- Evaluation of the taggers
- Semantically informed MT

Core Cases of Modality

Epistemic
- Necessity: John must have arrived
- Possibility: John may have arrived

Deontic/Situational
- Necessity: John has to leave now
- Possibility: You may leave now. One can get to Staten Island using a ferry.

(van der Auwera and Ammann, World Atlas of Language Structures)

Related Concepts: Factivity

Did the proposition happen or not?

- John went to New York.
- John may go to New York.
- If John goes to New York, he will visit MOMA.
- John bought a ticket to go to NY.

FactBank: Saurí and Pustejovsky

Related Concepts: Evidentiality

Source of information

First hand experience or hearsay
- They say that John went to NY.

Sensory information
- I heard that John went to NY.

Conclusion from evidence
- I don’t see John, so he must have gone to NY.

Other Related Concepts

- Speaker attitude and sentiment
- Conditionality
- Hypotheticality
- Realis and Irrealis mood
- Tense, aspect, etc.


Modality Annotation and Tagging

Annotation: Humans add labels to text, following instructions from a coding manual that defines an annotation scheme.

Tagging: A program automatically assigns labels.

Goals:

- Design an annotation scheme that can be followed with high intercoder agreement and low annotation time and cost
- Train a tagger on human annotated data
- Build a tagger based on the annotation scheme

The inventory of modalities in the annotation scheme

- Belief: with what strength does H believe P?
- Requirement: does H require P?
- Permissive: does H allow P?
- Intention: does H intend P?
- Effort: does H try to do P?
- Ability: can H do P?
- Success: does H succeed in P?
- Want: does H want P?

Joint work with Sergei Nirenburg, Marge McShane, Teruko Mitamura, Owen Rambow, Mona Diab, Eduard Hovy, Bonnie Dorr, Christine Piatko, Michael Bloodgood

H = Holder (experiencer or cognizer)
P = Proposition

The Annotation Scheme

Identify a modality target P and then choose one of these modalities (choose the first one that applies):

1. H requires [P to be true/false]
2. H permits [P to be true/false]
3. H succeeds in [making P true/false]
4. H does not succeed in [making P true/false]
5. H is trying [to make P true/false]
6. H is not trying [to make P true/false]
7. H intends [to make P true/false]
8. H does not intend [to make P true/false]
9. H is able [to make P true/false]
10. H is not able [to make P true/false]
11. H wants [P to be true/false]
12. H firmly believes [P is true/false]
13. H believes [P may be true/false]

Six Simplifications

- Transparency to negation
- Duality of require and permit
- Ordering for entailment
- Annotators were not asked to nest modalities.
- Default is Firmly Believe
- Annotators were not asked to mark the holder.

Simplifications: Transparency to negation

Some modalities have negatives in the annotation scheme: not intend, not try, not be able, not succeed

Believe and want do not have negatives in our annotation scheme because of the similarity of:

I don’t want him to go / I want him not to go.
- Both are coded as H wants P to be false.

I don’t believe he will go / I believe he will not go.
- Both are coded as H believes P to be false.

Simplifications: Duality of require and permit

Require and permit do not have negations in the annotation scheme because:

- Not require P to be true means Permit P to be false
- Not permit P to be true means Require P to be false

Simplifications: Ordering for entailment

John managed to go to NY.
What modality is this? Success? Intent? Effort? Desire? Ability?

Two entailment groupings ordered with respect to each other:

1. {requires permits}
2. {succeeds tries intends is able wants}

Both apply before “believe”, which is not in an entailment relation with either grouping.

The annotators are instructed to choose the first modality in the list that applies.
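The "first one that applies" rule is a lookup in a priority list. The sketch below assumes lowercase modality names and a hypothetical helper `choose_modality`. The order follows the two groupings, with belief last, and falls back to belief because the scheme's default is Firmly Believe.

```python
# Priority order taken from the two entailment groupings, then belief.
ORDER = ["require", "permit", "succeed", "try", "intend", "able", "want", "believe"]


def choose_modality(applicable):
    """Return the first modality in ORDER that the annotator judged applicable."""
    for modality in ORDER:
        if modality in applicable:
            return modality
    return "believe"  # scheme default (Firmly Believe)


# "John managed to go to NY": success entails effort, intent, ability and desire,
# so only the highest-ranked label, success, is recorded.
print(choose_modality({"succeed", "try", "intend", "able", "want"}))
```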

Simplifications: No embedding of modalities

He might be able to swim
- Only ability is tagged

Modals are never considered as targets of other modals in the annotation process


English Modality Lexicon

Modality trigger words

- might, should, require, permit, need, try, possible, fail, etc.
- About 150 lemmas
- plus five forms for each verb where applicable: bare infinitive, present tense -s, past tense, past participle, present participle

English Modality Lexicon Example

need
Pos: VB
Modality: Require
Trigger word: Need
Subcategorization codes

- V3-passive-basic: Large helicopters are needed to dispatch urgent relief materials.
- V3-I3-basic: The government will need to work continuously for at least a year. We will need them to work continuously.
- T1-monotransitive-for-V3-verbs: We need a Sir Sayyed again to maintain this sentiment.
- T1-passive-for-V3-verb: He is needed to work continuously.
- modal-auxiliary-basic: He need not go.
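One way to picture a lexicon entry like the one above is as a small record. This is only a sketch: the class name `LexiconEntry` and its field names are invented here, and the slides do not show the lexicon's actual file format.

```python
from dataclasses import dataclass, field


@dataclass
class LexiconEntry:
    """Hypothetical in-memory form of one modality lexicon entry."""
    lemma: str
    pos: str
    modality: str
    # subcategorization code -> example sentence from the slide
    subcat: dict = field(default_factory=dict)


need = LexiconEntry(
    lemma="need",
    pos="VB",
    modality="Require",
    subcat={
        "V3-passive-basic": "Large helicopters are needed to dispatch urgent relief materials.",
        "V3-I3-basic": "The government will need to work continuously for at least a year.",
        "T1-monotransitive-for-V3-verbs": "We need a Sir Sayyed again to maintain this sentiment.",
        "T1-passive-for-V3-verb": "He is needed to work continuously.",
        "modal-auxiliary-basic": "He need not go.",
    },
)

print(need.lemma, need.modality, sorted(need.subcat))
```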


String Based English Modality Tagger

Input
- Text that has been tagged with parts of speech.

Mark Triggers
- Mark spans of words that are exact matches to entries in the modality lexicon and that have the same part of speech.

Mark Targets
- Next non-auxiliary verb to the right of a trigger

Spans of words can be marked multiple times with different triggers and targets.
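The trigger and target steps can be sketched in a few lines. Everything named here is an assumption for illustration: the three-entry `LEXICON` stands in for the real lexicon of about 150 lemmas, single-word triggers stand in for spans, and "non-auxiliary verb" is approximated as any token whose Penn Treebank tag starts with `VB` (modals are tagged `MD`, so they are skipped). Negation handling, which yields labels such as `TargNOTAble` in the slides, is left out.

```python
# Toy lexicon (hypothetical): (word, POS) -> modality
LEXICON = {
    ("should", "MD"): "Require",
    ("can", "MD"): "Able",
    ("managed", "VBD"): "Succeed",
}


def tag(tokens):
    """tokens: list of (word, pos) pairs. Returns a list of (index, label) marks."""
    marks = []
    for i, (word, pos) in enumerate(tokens):
        modality = LEXICON.get((word.lower(), pos))
        if modality is None:
            continue
        marks.append((i, "Trig" + modality))
        # Target: next verb to the right of the trigger (approximation, see above)
        for j in range(i + 1, len(tokens)):
            if tokens[j][1].startswith("VB"):
                marks.append((j, "Targ" + modality))
                break
    return marks


sentence = [
    ("Americans", "NNPS"), ("should", "MD"), ("know", "VB"), ("that", "IN"),
    ("we", "PRP"), ("can", "MD"), ("not", "RB"), ("hand", "VB"), ("over", "RP"),
    ("Dr.", "NNP"), ("Khan", "NNP"), ("to", "TO"), ("them", "PRP"),
]

for index, label in tag(sentence):
    print(sentence[index][0], label)
```

Because marks are collected in a list instead of being written onto the tokens, the same word can carry several labels, as the last bullet above allows.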

The Structure-Based English Modality Tagger

[Figure: parse tree of "Americans should know that we can not hand over Dr. Khan to them", with part-of-speech tags Americans/NNPS, should/MD, know/VB, we/PRP, can/MD, not/RB, hand/VB, Dr/NNP, Khan/NNP. Beside it, a template: a VP containing the MD "should" (the Trigger) and a VB (the Target).]

Used Tsurgeon (Stanford NLP tools) to find trees that match templates and mark modality triggers and targets.

The Structure-Based English Modality Tagger

[Figure: parse tree of "Americans should know that we can not hand over Dr. Khan to them" after tagging. The outer VP is labeled VP-require and contains MD-TrigRequire (should) and VB-TargRequire (know); the embedded VP is labeled VP-NOTAble and contains MD-TrigAble (can), RB-TrigNegation (not), and VB-TargNOTAble (hand).]

Steps:

1. Tsurgeon
2. Percolation
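The slides do not spell out the percolation step. One plausible reading of the figure is that the modality on a tagged target node is copied up to the label of its parent phrase. The sketch below implements only that reading, on a hand-built tree stored as nested Python lists; the tree encoding and the function `percolate` are assumptions, and the code does not call Tsurgeon.

```python
def percolate(tree):
    """tree: [label, child, ...], where a child is a subtree (list) or a word (str).
    Appends the modality of a Targ-labeled child to its parent's label, in place."""
    for child in tree[1:]:
        if isinstance(child, list):
            percolate(child)
    for child in tree[1:]:
        if isinstance(child, list) and "-Targ" in child[0]:
            modality = child[0].split("-Targ", 1)[1]
            tree[0] = tree[0] + "-" + modality
            break
    return tree


# Tree as it might look after the trigger/target marking step (hand-built here).
tree = [
    "S",
    ["NP", ["NNPS", "Americans"]],
    ["VP",
     ["MD-TrigRequire", "should"],
     ["VB-TargRequire", "know"],
     "that",
     ["S",
      ["NP", ["PRP", "we"]],
      ["VP",
       ["MD-TrigAble", "can"],
       ["RB-TrigNegation", "not"],
       ["VB-TargNOTAble", "hand"],
       "over",
       ["NP", ["NNP", "Dr"], ["NNP", "Khan"]],
       ["PP", "to", "them"]]]],
]

percolate(tree)
print(tree[2][0])        # label of the outer VP
print(tree[2][4][2][0])  # label of the embedded VP
```

The sketch keeps the capitalization of the target label, so the outer phrase comes out as `VP-Require` where the figure writes `VP-require`.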

What was covered

- 15 subcategorization patterns
- 150 lemmas
- Expressions of modality with lexical triggers

What wasn’t covered

Non-lexical modality
- Imperatives
- Other constructions
  - It will be a long time/a cold day in hell before…

Targets in coordinate structures

To do next

Word sense disambiguation
- Can, must: deontic or epistemic
- Manage: manage to do something vs manage a project

Transitivity alternations: alternate mappings between grammatical relations and semantic roles
- The plan succeeded.
- The government succeeded in its plan.
- The government succeeded ????

Evaluation: agreement between string-based and structure-based taggers

Calculated Kappa on the basis of 88108 sentences from the English side of the Urdu-English corpus for MTEval 2009

Example: TargPermit (John is allowed to <TargPermit go> to NY)

- 585 Matching both taggers
- 163 Matching just structure-based tagger
- 194 Matching just string-based tagger
- 87166 No match either tagger

Triggers: Kappa = .82
Targets: Kappa = .76
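As a worked example, the four TargPermit counts can be turned into an agreement score. The slides say only "Kappa"; the sketch below assumes Cohen's kappa for two raters and a yes/no decision per sentence, and the function name is made up. It covers this single label, not the aggregate trigger and target figures.

```python
def cohen_kappa(both, only_a, only_b, neither):
    """Cohen's kappa for two binary raters, from the cells of a 2x2 table."""
    n = both + only_a + only_b + neither
    observed = (both + neither) / n                  # raw agreement
    a_yes = (both + only_a) / n                      # rate at which tagger A says yes
    b_yes = (both + only_b) / n                      # rate at which tagger B says yes
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)  # chance agreement
    return (observed - expected) / (1 - expected)


# A = structure-based tagger, B = string-based tagger (TargPermit counts above)
kappa = cohen_kappa(both=585, only_a=163, only_b=194, neither=87166)
print(round(kappa, 2))  # about 0.76
```

Raw agreement here is above 99% because almost no sentence contains the label, which is why a chance-corrected measure is the more informative number.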

Evaluation: Structure Based Tagger

Recall: not feasible to look for all expressions of modality that we didn’t tag.
- No gold-standard annotated corpus.

Precision:
- 249 sentences that were tagged with triggers and targets
- From the English side of the MTEval 2009 training sentences
- 86.3% correct
- But ranges from about 82% to about 92% depending on genre

Precision: Errors

Light verb or noun is correct syntactic target but not the correct semantic target.
- Earthquake affected areas in Pakistan will be provided the required number of tents and blankets by November 15.
- The decision should be taken on delayed cases on the basis of merit.

Wrong word sense
- In Bayas, Sikhs attacked a train under cover of night and killed everyone.
- The process of provision of relief goods to needy people should be managed by the Army and the Edhi Trust.
- Should be allowed to work like this in the future.
  - Like: succeed in something

Precision: Errors

Wrong subcategorization pattern.
- The officials should consider themselves as servants of the people.

Coordinate Structures
- Many large helicopters are needed to dispatch urgent relief materials to the many affected in far flung areas of the Neelam Valley and only America can help us in this regard.

Recall: what did we miss?

Special forms of negation
- There was no place to seek shelter.
- The buildings should be reconstructed, not with the RCC, but with the wood and steel sheets.

Constructional and phrasal triggers
- President Pervaiz Musharraf has said that he will not rest unless the process of rehabilitation is completed.

Random lexical omissions
- It is not possible in the middle of winter to re-open the roads.

SIMT: Semantically Informed MT

[Figure: the tagged parse tree of "Americans should know that we can not hand over Dr. Khan to them", with VP-require over MD-TrigRequire (should) and VB-TargRequire (know), and VP-NOTAble over MD-TrigAble (can), RB-TrigNegation (not), and VB-TargNOTAble (hand). Steps: 1. Tsurgeon, 2. Percolation.]

Integration of the modality tagger with Syntax Based SMT

- Joshua Syntax Based SMT system (Callison-Burch)
- Tag modalities on the English side of the training data.

Without modality tags: BLEU 26.4
With modality tags: BLEU 26.7

Advantages of SIMT

Good for translation between a less commonly taught language (LCTL) and a common language
- Modality can be analyzed on the common language and projected via word alignments to the LCTL

Depth of semantic analysis

Robustness of statistical approach
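Projection via word alignments can be pictured as copying each tag from a tagged common-language word to the LCTL word or words aligned with it. The sketch below is a generic illustration with invented indices and a hypothetical function `project_tags`; it is not the procedure used in the SIMT experiments, which the slides do not detail.

```python
def project_tags(source_tags, alignment):
    """source_tags: {source_index: label} on the tagged (common-language) side.
    alignment: iterable of (source_index, target_index) word-alignment links.
    Returns {target_index: set of labels} for the other language."""
    projected = {}
    for src, tgt in alignment:
        if src in source_tags:
            projected.setdefault(tgt, set()).add(source_tags[src])
    return projected


# Invented example: English tokens 1 and 2 carry tags; links map them to
# positions 3 and 4 of a hypothetical aligned sentence.
english_tags = {1: "TrigRequire", 2: "TargRequire"}
links = [(0, 0), (1, 3), (2, 4), (3, 1)]

print(project_tags(english_tags, links))
```

A set is kept per target position so that a word aligned to several tagged words retains every label.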

Summary

- Modality annotation scheme
- Modality lexicon
- Automatic modality tagger
- A method for integrating semantics into SMT
  - Good for translation between LCTLs and common languages

Future work

Improvements to the tagger
- Add patterns for constructions without simple lexical triggers.
- Word sense disambiguation (manage, attack, etc.)
- Semantic composition of multiple modalities and negation.
- Tagging of holders

Applications of the tagger
- Further experiments with SIMT
- Integration into tagger for Committed Belief (factivity)
