princeton 11/06/03 1 penn putting meaning into your trees martha palmer university of pennsylvania...

72
Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Upload: vincent-hutchinson

Post on 19-Jan-2016

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 1

Penn

Putting Meaning Into Your Trees

Martha PalmerUniversity of Pennsylvania

Princeton Cognitive Science Laboratory

November 6, 2003

Page 2: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 2

PennOutline Introduction Background: WordNet, Levin classes, VerbNet Proposition Bank – capturing shallow

semantics Mapping PropBank to VerbNet Mapping PropBank to WordNet

Page 3: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 3

PennWord sense in Machine Translation

Different syntactic framesJohn left the room Juan saiu do quarto. (Portuguese)John left the book on the table. Juan deizou o livro na mesa.

Same syntactic frame?John left a fortune. Juan deixou uma fortuna.

Page 4: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 4

PennAsk Jeeves – A Q/A, IR ex. What do you call a successful movie? Tips on Being a Successful Movie Vampire ... I shall call

the police. Successful Casting Call & Shoot for ``Clash of

Empires'' ... thank everyone for their participation in the making of yesterday's movie.

Demme's casting is also highly entertaining, although I wouldn't go so far as to call it successful. This movie's resemblance to its predecessor is pretty vague...

VHS Movies: Successful Cold Call Selling: Over 100 New Ideas, Scripts, and Examples from the Nation's Foremost Sales Trainer.

Blockbuster

Page 5: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 5

PennAsk Jeeves – filtering w/ POS tagWhat do you call a successful movie? Tips on Being a Successful Movie Vampire ... I shall call

the police. Successful Casting Call & Shoot for ``Clash of

Empires'' ... thank everyone for their participation in the making of yesterday's movie.

Demme's casting is also highly entertaining, although I wouldn't go so far as to call it successful. This movie's resemblance to its predecessor is pretty vague...

VHS Movies: Successful Cold Call Selling: Over 100 New Ideas, Scripts, and Examples from the Nation's Foremost Sales Trainer.

Page 6: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 6

PennFiltering out “call the police”

call(you,movie,what)

call(you,police)

Syntax

Page 7: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 7

PennEnglish lexical resource is required

That provides sets of possible syntactic frames for verbs.

And provides clear, replicable sense distinctions.

AskJeeves: Who do you call for a good electronic lexical database for English?

Page 8: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 8

PennWordNet – call, 28 senses1. name, call -- (assign a specified, proper name to; "They named their son David"; …) -> LABEL2. call, telephone, call up, phone, ring -- (get or try to get into

communication (with someone) by telephone; "I tried to call you all night"; …)

->TELECOMMUNICATE3. call -- (ascribe a quality to or give a name of a common

noun that reflects a quality; "He called me a bastard"; …)

-> LABEL4. call, send for -- (order, request, or command to come; "She was called into the director's office"; "Call the police!")

-> ORDER

Page 9: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 9

PennWordNet – Princeton (Miller 1985, Fellbaum 1998)

On-line lexical reference (dictionary) Nouns, verbs, adjectives, and adverbs grouped into

synonym sets Other relations include hypernyms (ISA), antonyms,

meronyms

Limitations as a computational lexicon Contains little syntactic information No explicit predicate argument structures No systematic extension of basic senses Sense distinctions are very fine-grained, ITA 73% No hierarchical entries

Page 10: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 10

PennLevin classes (Levin, 1993)

3100 verbs, 47 top level classes, 193 second and third level

Each class has a syntactic signature based on alternations. John broke the jar. / The jar broke. / Jars break easily.

John cut the bread. / *The bread cut. / Bread cuts easily.

John hit the wall. / *The wall hit. / *Walls hit easily.

Page 11: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 11

PennLevin classes (Levin, 1993)

Verb class hierarchy: 3100 verbs, 47 top level classes, 193

Each class has a syntactic signature based on alternations. John broke the jar. / The jar broke. / Jars break easily. change-of-state

John cut the bread. / *The bread cut. / Bread cuts easily. change-of-state, recognizable action, sharp instrument

John hit the wall. / *The wall hit. / *Walls hit easily. contact, exertion of force

Page 12: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 12

Penn

Page 13: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 13

PennConfusions in Levin classes? Not semantically homogenous

{braid, clip, file, powder, pluck, etc...} Multiple class listings

homonymy or polysemy? Conflicting alternations?

Carry verbs disallow the Conative, (*she carried at the ball), but include {push,pull,shove,kick,draw,yank,tug}also in Push/pull class, does take the Conative

(she kicked at the ball)

Page 14: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 14

PennIntersective Levin Classes

“at” ¬CH-LOC“across the room”

CH-LOC

“apart” CH-STATE

Dang, Kipper & Palmer, ACL98

Page 15: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 15

PennIntersective Levin Classes More syntactically and semantically coherent

sets of syntactic patternsexplicit semantic componentsrelations between senses

  VERBNETwww.cis.upenn.edu/verbnet

Dang, Kipper & Palmer, IJCAI00, Coling00

Page 16: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 16

PennVerbNet – Karin Kipper

Class entries:Capture generalizations about verb behaviorOrganized hierarchicallyMembers have common semantic elements,

semantic roles and syntactic frames Verb entries:

Refer to a set of classes (different senses)each class member linked to WN synset(s)

(not all WN senses are covered)

Page 17: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 17

PennSemantic role labels:

Christiane broke the LCD projector.

break (agent(Christiane), patient(LCD-projector))

cause(agent(Christiane), broken(LCD-projector))

agent(A) -> intentional(A), sentient(A), causer(A), affector(A)

patient(P) -> affected(P), change(P),…

Page 18: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 18

PennHand built resources vs. Real data

VerbNet is based on linguistic theory – how useful is it?

How well does it correspond to syntactic variations found in naturally occurring text?

PropBank

Page 19: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 19

PennProposition Bank:From Sentences to Propositions

Powell met Zhu Rongji

Proposition: meet(Powell, Zhu Rongji)Powell met with Zhu Rongji

Powell and Zhu Rongji met

Powell and Zhu Rongji had a meeting

. . .When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane.

meet(Powell, Zhu) discuss([Powell, Zhu], return(X, plane))

debate

consult

joinwrestle

battle

meet(Somebody1, Somebody2)

Page 20: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 20

PennCapturing semantic roles*

George broke [ ARG1 the laser pointer.]

[ARG1 The windows] were broken by the hurricane.

[ARG1 The vase] broke into pieces when it toppled over.

SUBJ

SUBJ

SUBJ

*See also Framenet, http://www.icsi.berkeley.edu/~framenet/

Page 21: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 21

PennEnglish lexical resource is required

That provides sets of possible syntactic frames for verbs with semantic role labels.

And provides clear, replicable sense distinctions.

Page 22: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 22

PennA TreeBanked Sentence

Analysts

S

NP-SBJ

VP

have VP

been VP

expectingNP

a GM-Jaguar pact

NP

that

SBAR

WHNP-1

*T*-1

S

NP-SBJVP

wouldVP

give

the US car maker

NP

NP

an eventual 30% stake

NP

the British company

NP

PP-LOC

in

(S (NP-SBJ Analysts) (VP have (VP been (VP expecting

(NP (NP a GM-Jaguar pact) (SBAR (WHNP-1 that)

(S (NP-SBJ *T*-1) (VP would

(VP give (NP the U.S. car maker)

(NP (NP an eventual (ADJP 30 %) stake) (PP-LOC in (NP the British company))))))))))))

Analysts have been expecting a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.

Page 23: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 23

PennThe same sentence, PropBanked

Analysts

have been expecting

a GM-Jaguar pact

Arg0 Arg1

(S Arg0 (NP-SBJ Analysts) (VP have (VP been (VP expecting

Arg1 (NP (NP a GM-Jaguar pact) (SBAR (WHNP-1 that)

(S Arg0 (NP-SBJ *T*-1) (VP would

(VP give Arg2 (NP the U.S. car maker)

Arg1 (NP (NP an eventual (ADJP 30 %) stake)

(PP-LOC in (NP the British company))))))))))))

that would give

*T*-1

the US car maker

an eventual 30% stake in the British company

Arg0

Arg2

Arg1

expect(Analysts, GM-J pact)give(GM-J pact, US car maker, 30% stake)

Page 24: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 24

PennFrames File Example: expectRoles: Arg0: expecter Arg1: thing expected

Example: Transitive, active:

Portfolio managers expect further declines in interest rates.

Arg0: Portfolio managers REL: expect Arg1: further declines in interest rates

Page 25: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 25

PennFrames File example: giveRoles: Arg0: giver Arg1: thing given Arg2: entity given to

Example: double object The executives gave the chefs a standing ovation. Arg0: The executives REL: gave Arg2: the chefs Arg1: a standing ovation

Page 26: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 26

PennWord Senses in PropBank Orders to ignore word sense not feasible for 700+

verbs Mary left the room Mary left her daughter-in-law her pearls in her will

Frameset leave.01 "move away from":Arg0: entity leavingArg1: place left

Frameset leave.02 "give":Arg0: giver Arg1: thing givenArg2: beneficiary

How do these relate to traditional word senses in VerbNet and WordNet?

Page 27: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 27

PennAnnotation procedure

PTB II - Extraction of all sentences with given verb Create Frame File for that verb Paul Kingsbury

(3100+ lemmas, 4400 framesets,118K predicates) Over 300 created automatically via VerbNet

First pass: Automatic tagging (Joseph Rosenzweig) http://www.cis.upenn.edu/~josephr/TIDES/index.html#lexicon

Second pass: Double blind hand correction

Paul Kingsbury Tagging tool highlights discrepancies Scott Cotton Third pass: Solomonization (adjudication)

Betsy Klipple, Olga Babko-Malaya

Page 28: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 28

PennTrends in Argument Numbering Arg0 = agent Arg1 = direct object / theme / patient Arg2 = indirect object / benefactive /

instrument / attribute / end state Arg3 = start point / benefactive / instrument /

attribute Arg4 = end point Per word vs frame level – more general?

Page 29: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 29

Penn

Additional tags (arguments or adjuncts?)

Variety of ArgM’s (Arg#>4): TMP - when? LOC - where at? DIR - where to? MNR - how? PRP -why? REC - himself, themselves, each other PRD -this argument refers to or modifies

another ADV –others

Page 30: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 30

Penn

Function tags for Chinese (arguments or adjuncts?)

Variety of ArgM’s (Arg#>4): TMP - when? LOC - where at? DIR - where to? MNR - how? PRP -why? TPC – topic PRD -this argument refers to or modifies another ADV –others CND – conditional DGR – degree FRQ - frequency

Page 31: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 31

Penn

Additional function tags for Chinese for phrasal verbs-

Correspond to groups of “prepositions”

AS AT

INTO

ONTO

TO

TOWARDS

Page 32: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 32

PennInflection Verbs also marked for tense/aspect

Passive/Active Perfect/Progressive Third singular (is has does was) Present/Past/Future Infinitives/Participles/Gerunds/Finites

Modals and negations marked as ArgMs

Page 33: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 33

PennFrames: Multiple Framesets Framesets are not necessarily consistent between

different senses of the same verb

Framesets are consistent between different verbs that share similar argument structures, (like FrameNet)

Out of the 787 most frequent verbs:

1 FrameNet – 521 2 FrameNet – 169 3+ FrameNet 97 (includes light verbs)

Page 34: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 34

PennErgative/Unaccusative Verbs

Roles (no ARG0 for unaccusative verbs)Arg1 = Logical subject, patient, thing rising

Arg2 = EXT, amount risen

Arg3* = start point

Arg4 = end point

Sales rose 4% to $3.28 billion from $3.16 billion.

The Nasdaq composite index added 1.01 to 456.6 on paltry volume.

Page 35: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 35

PennActual data for leave http://www.cs.rochester.edu/~gildea/PropBank/Sort/Leave .01 “move away from” Arg0 rel Arg1 Arg3Leave .02 “give” Arg0 rel Arg1 Arg2

sub-ARG0 obj-ARG1 44 sub-ARG0 20 sub-ARG0 NP-ARG1-with obj-ARG2 17 sub-ARG0 sub-ARG2 ADJP-ARG3-PRD 10 sub-ARG0 sub-ARG1 ADJP-ARG3-PRD 6 sub-ARG0 sub-ARG1 VP-ARG3-PRD 5 NP-ARG1-with obj-ARG2 4 obj-ARG1 3 sub-ARG0 sub-ARG2 VP-ARG3-PRD 3

Page 36: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 36

PennPropBank/FrameNet

Buy

Arg0: buyer

Arg1: goods

Arg2: seller

Arg3: rate

Arg4: payment

Sell

Arg0: seller

Arg1: goods

Arg2: buyer

Arg3: rate

Arg4: payment

More generic, more neutral – maps readily to VN,TR Rambow, et al, PMLB03

Page 37: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 37

PennAnnotator accuracy – ITA 84%

1000 10000 100000 10000000.86

0.87

0.88

0.89

0.9

0.91

0.92

0.93

0.94

0.95

0.96hertlerb

forbesk

solaman2istreit

wiarmstr kingsbur

ksledge

nryant

malayaojaywang

delilkanpteppercotter

Annotator Accuracy-primary labels only

# of annotations (log scale)

accu

racy

Page 38: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 38

PennEnglish lexical resource is required

That provides sets of possible syntactic

frames for verbs with semantic role labels?

And provides clear, replicable sense distinctions.

Page 39: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 39

PennEnglish lexical resource is required

That provides sets of possible syntactic frames for verbs with semantic role labels

that can be automatically assigned accurately to new text?

And provides clear, replicable sense distinctions.

Page 40: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 40

Penn

Automatic Labelling of Semantic Relations• Stochastic Model• Features:

PredicatePhrase TypeParse Tree PathPosition (Before/after predicate)Voice (active/passive)Head Word

Gildea & Jurafsky, CL02, Gildea & Palmer, ACL02

Page 41: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 41

PennSemantic Role Labelling Accuracy-Known Boundaries

79.673.682.0Automatic parses

83.177.0Gold St. parses

PropBank

≥ 10 instancesPropBankFramene

t

≥ 10 inst

•Accuracy of semantic role prediction for known boundaries--the system is given the constituents to classify.•FrameNet examples (training/test) are handpicked to be unambiguous.• Lower performance with unknown boundaries.• Higher performance with traces. • Almost evens out.

Page 42: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 42

PennAdditional Automatic Role Labelers

Performance improved from 77% to 88% Colorado (Gold Standard parses, < 10 instances) Same features plus

Named Entity tags Head word POS For unseen verbs – backoff to automatic verb clusters

SVM’s Role or not role For each likely role, for each Arg#, Arg# or not No overlapping role labels allowed

Pradhan, et. al., ICDM03, Sardeneau, et. al, ACL03,Chen & Rambow, EMNLP03, Gildea & Hockemaier, EMNLP03

Page 43: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 43

PennWord Senses in PropBank Orders to ignore word sense not feasible for 700+

verbs Mary left the room Mary left her daughter-in-law her pearls in her will

Frameset leave.01 "move away from":Arg0: entity leavingArg1: place left

Frameset leave.02 "give":Arg0: giver Arg1: thing givenArg2: beneficiary

How do these relate to traditional word senses in VerbNet and WordNet?

Page 44: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 44

PennMapping from PropBank to VerbNet

Frameset id = leave.02

Sense =

give

VerbNet class =

future-having 13.3

Arg0 Giver Agent

Arg1 Thing given

Theme

Arg2 Benefactive

Recipient

Page 45: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 45

PennMapping from PB to VerbNet

Page 46: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 46

PennMapping from PropBank to VerbNet

Overlap with PropBank framesets 50,000 PropBank instances < 50% VN entries, > 85% VN classes

Results MATCH - 78.63%. (80.90% relaxed) (VerbNet isn’t just linguistic theory!)

Benefits Thematic role labels and semantic predicates Can extend PropBank coverage with VerbNet classes WordNet sense tags

Kingsbury & Kipper, NAACL03, Text Meaning Workshophttp://www.cs.rochester.edu/~gildea/VerbNet/

Page 47: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 47

PennWordNet as a WSD sense inventory

Senses unnecessarily fine-grained?

Word Sense Disambiguation bakeoffsSenseval1 – Hector, ITA = 95.5%Senseval2 – WordNet 1.7, ITA verbs = 71%Groupings of Senseval2 verbs, ITA =82%

Used syntactic and semantic criteria

Page 48: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 48

PennGroupings Methodology(w/ Dang and Fellbaum) Double blind groupings, adjudication Syntactic Criteria (VerbNet was useful)

Distinct subcategorization frames call him a bastard call him a taxi

Recognizable alternations – regular sense extensions: play an instrument play a song play a melody on an instrument

SIGLEX01, SIGLEX02, JNLE04

Page 49: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 49

PennGroupings Methodology (cont.)

Semantic Criteria Differences in semantic classes of arguments

Abstract/concrete, human/animal, animate/inanimate, different instrument types,…

Differences in the number and type of arguments Often reflected in subcategorization frames John left the room. I left my pearls to my daughter-in-law in my will.

Differences in entailments Change of prior entity or creation of a new entity?

Differences in types of events Abstract/concrete/mental/emotional/….

Specialized subject domains

Page 50: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 50

PennResults – averaged over 28 verbs

Total

WN polysemy 16.28

Group polysemy 8.07

ITA-fine 71%

ITA-group 82%

MX-fine 60.2%

MX-group 69%

MX – Maximum Entropy WSD, p(sense|context)Features: topic, syntactic constituents, semantic classes +2.5%, +1.5 to +5%, +6%

Dang and Palmer, Siglex02,Dang et al,Coling02

Page 51: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 51

PennGrouping improved ITA and Maxent WSD

Call: 31% of errors due to confusion between senses within same group 1:name, call -- (assign a specified, proper name to; They named

their son David)call -- (ascribe a quality to or give a name of a common noun

that reflects a quality; He called me a bastard)call -- (consider or regard as being;I would not call her beautiful)

75% with training and testing on grouped senses vs.43% with training and testing on fine-grained senses

Page 52: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 52

PennWordNet: - call, 28 senses, groups

WN5, WN16,WN12 WN15 WN26

WN3 WN19 WN4 WN 7 WN8 WN9

WN1 WN22

WN20 WN25

WN18 WN27

WN2 WN 13 WN6 WN23

WN28

WN17 , WN 11 WN10, WN14, WN21, WN24,

Loud cry

Label

Phone/radio

Bird or animal cry

Request

Call a loan/bond

Visit

Challenge

Bid

Page 53: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 53

PennWordNet: - call, 28 senses, groups

WN5, WN16,WN12 WN15 WN26

WN3 WN19 WN4 WN 7 WN8 WN9

WN1 WN22

WN20 WN25

WN18 WN27

WN2 WN 13 WN6 WN23

WN28

WN17 , WN 11 WN10, WN14, WN21, WN24,

Loud cry

Label

Phone/radio

Bird or animal cry

Request

Call a loan/bond

Visit

Challenge

Bid

Page 54: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 54

PennOverlap between Groups and Framesets – 95%

WN1 WN2 WN3 WN4

WN6 WN7 WN8 WN5 WN 9 WN10

WN11 WN12 WN13 WN 14

WN19 WN20

Frameset1

Frameset2

developPalmer, Dang & Fellbaum, NLE 2004

Page 55: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 55

PennSense Hierarchy PropBank Framesets – coarse grained distinctions

Sense Groups (Senseval-2) intermediate level (includes Levin classes) – 95% overlap

WordNet – fine grained distinctions

Page 56: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 56

PennEnglish lexical resource is available

That provides sets of possible syntactic

frames for verbs with semantic role labels that can be automatically assigned accurately to new text.

And provides clear, replicable sense distinctions.

Page 57: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 57

PennSummary of English PropBankPaul Kingsbury, Olga Babko-Malaya, Scott Cotton

Genre Words Frames Files Frameset

Tags

Released

Wall Street Journal*

(financial subcorpus)

300K < 2000 400 July, 02

Wall Street Journal*

(Penn TreeBank II)

1000K < 4000 700 Dec, 03?

(March, 03)

English Translation of

Chinese TreeBank *

ITIC funding

100K <1500 July, 04

Sinorama, English corpus

NSF-ITR funding

150K <2000 July, 05

English half of DLI

Military Corpus

ARL funding

50K < 1000 July, 05

Page 58: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 58

PennA Chinese Treebank Sentence

国会 /Congress 最近 /recently 通过 /pass 了 /ASP 银行法 /banking law

“The Congress passed the banking law recently.”

(IP (NP-SBJ (NN 国会 /Congress))

(VP (ADVP (ADV 最近 /recently))

(VP (VV 通过 /pass)

(AS 了 /ASP)

(NP-OBJ (NN 银行法 /banking law)))))

Page 59: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 59

PennThe Same Sentence, PropBanked

通过 (f2) (pass)

arg0 argM arg1

国会 最近 银行法 (law)

(congress)

(IP (NP-SBJ arg0 (NN 国会 ))

(VP argM (ADVP (ADV 最近 ))

(VP f2 (VV 通过 )

(AS 了 )

arg1 (NP-OBJ (NN 银行法 )))))

Page 60: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 60

PennChinese PropBank Status - (w/ Bert Xue and Scott Cotton)

Create Frame File for that verb - Similar alternations – causative/inchoative,

unexpressed object 5000 lemmas, 2000 DONE, (hired Jiang)

First pass: Automatic tagging 2000 DONE Subcat frame matcher (Xue & Kulick, MT03)

Second pass: Double blind hand correction In progress (includes frameset tagging), 600 DONE Ported RATS to CATS, in use since May

Third pass: Solomonization (adjudication)

Page 61: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 61

PennSummary of Chinese PropBankNianwen Xue, Meiyu Chang, Zhiyi, Ping

Genre Words Frames Files Frameset

Tags

Released

Xinhua News

DOD funding

250K 4867 200 July, 04

Sinorama

NSF-ITR funding

150K < 4000 July, 05

Page 62: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 62

PennA Korean Treebank Sentence

(S (NP-SBJ 그 /NPN+ 은 /PAU) (VP (S-COMP (NP-SBJ 르노 /NPR+ 이 /PCA) (VP (VP (NP-ADV 3/NNU

월 /NNX+ 말 /NNX+ 까지 /PAU) (VP (NP-OBJ 인수 /NNC+ 제의 /NNC 시한 /NNC+ 을 /PCA) 갖 /VV+ 고 /ECS)) 있 /VX+ 다 /EFN+ 고 /PAD) 덧붙이 /VV+ 었 /EPF+ 다 /EFN) ./SFN)

그는 르노가 3 월말까지 인수제의 시한을 갖고 있다고 덧붙였다 .

He added that Renault has a deadline until the end of March for a merger proposal.

Page 63: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 63

PennThe same sentence, PropBanked

덧붙이었다

그는 갖고 있다

르노가 인수제의 시한을

덧붙이다 ( 그는 , 르노가 3 월말까지 인수제의 시한을 갖고 있다 ) (add) (he) (Renaut has a deadline until the end of March for a merger proposal)

갖다 ( 르노가 , 3 월말까지 , 인수제의 시한을 ) (has) (Renaut) (until the end of March) (a deadline for a merger proposal)

Arg0 Arg2

Arg0 Arg1

(S Arg0 (NP-SBJ 그 /NPN+ 은 /PAU) (VP Arg2 (S-COMP ( Arg0 NP-SBJ 르노 /NPR+ 이 /PCA) (VP (VP ( ArgM NP-ADV 3/NNU 월 /NNX+ 말 /NNX+ 까지 /PAU) (VP ( Arg1 NP-OBJ 인수 /NNC+ 제의 /NNC 시한 /NNC+ 을 /PCA) 갖 /VV+ 고 /ECS)) 있 /VX+ 다 /EFN+ 고 /PAD) 덧붙이 /VV+ 었 /EPF+ 다 /EFN) ./SFN)

3 월말까지

ArgM

Page 64: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 64

PennKorean PropBank funded by ARO - CORK

50K words Virginia Corpus of Korean Treebank New semantic augmentations

Predicate-argument relations for predicatesLabel arguments: Arg0, Arg1, Arg2 …

Korean lexical resource – Frames filesUse “semantic role glosses” unique to each predicateCreate manually for 900 predicatesRefer to the hand-corrected DSynt files from Virginia

Extend to newswire domain – 130K words

Page 65: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 65

PennPropBank II Nominalizations NYU Lexical Frames DONE Event Variables, (including temporals and

locatives) More fine-grained sense tagging

Tagging nominalizations w/ WordNet senseSelected verbs and nouns

Nominal Coreference not names

Clausal Discourse connectives – selected subset

Page 66: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 66

PennPropBank I

Also, [Arg0substantially lower Dutch corporate tax rates] helped [Arg1[Arg0 the company] keep [Arg1 its tax outlay] [Arg3-PRD flat] [ArgM-ADV relative to earnings growth]].

relative to earnings…

flatits tax outlaythe company

keep

the company keep its tax outlay flat

tax rateshelp

ArgM-ADVArg3-PRD

Arg1Arg0REL

Event variables;

ID#h23

k16

nominal reference;sense tags;

help2,5 tax rate1

keep1 company1

discourse connectives

{ }

I

Page 67: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 67

Penn

Summary of Multilingual TreeBanks, PropBanks

Parallel Corpora

Text Treebank PropBank I

Prop II

Chinese Treebank

Chinese 500K

English 400K

Chinese 500K

English 100K

Chinese 500K

English 350K

Ch 100K

En 100K

Arabic

Treebank

Arabic 500K

English 500K

Arabic 500K

English ?

?

?

Korean

Treebank

Korean 180K

English 50K

Korean 180K

English 50K

Korean 180K

English 50K

Page 68: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 68

PennLevin class: escape-51.1-1 WordNet Senses: WN 1, 5, 8

Thematic Roles: Location[+concrete] Theme[+concrete]

Frames with Semantics Basic Intransitive

"The convict escaped" motion(during(E),Theme) direction(during(E),Prep,Theme, ~Location)

Intransitive (+ path PP) "The convict escaped from the prison"

Locative Preposition Drop "The convict escaped the prison"

Page 69: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 69

PennLevin class: future_having-13.3 WordNet Senses: WN 2,10,13

Thematic Roles: Agent[+animate OR +organization] Recipient[+animate OR +organization] Theme[]

Frames with Semantics Dative

"I promised somebody my time" Agent V Recipient Theme has_possession(start(E),Agent,Theme) future_possession(end(E),Recipient,Theme) cause(Agent,E)

Transitive (+ Recipient PP) "We offered our paycheck to her" Agent V Theme Prep(to) Recipient )

Transitive (Theme Object) "I promised my house (to somebody)" Agent V Theme

Page 70: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 70

PennActual data for leave http://www.cs.rochester.edu/~gildea/PropBank/Sort/Leave .01 “move away from” Arg0 rel Arg1 Arg3Leave .02 “give” Arg0 rel Arg1 Arg2

sub-ARG0 obj-ARG1 44 sub-ARG0 20 sub-ARG0 NP-ARG1-with obj-ARG2 17 sub-ARG0 sub-ARG2 ADJP-ARG3-PRD 10 sub-ARG0 sub-ARG1 ADJP-ARG3-PRD 6 sub-ARG0 sub-ARG1 VP-ARG3-PRD 5 NP-ARG1-with obj-ARG2 4 obj-ARG1 3 sub-ARG0 sub-ARG2 VP-ARG3-PRD 3

Page 71: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 71

Penn

SENSEVAL – Word Sense Disambiguation Evaluation

SENSEVAL1

1998

SENSEVAL2

2001

Languages

Systems

3

24

12

90

Eng. Lexical Sample

Verbs/Poly/Instances

Yes

13/12/215

Yes

29/16/110

Sense Inventory Hector,

95.5%

WordNet, 73+%

NLE99, CHUM01, NLE02, NLE03

DARPA style bakeoff: training data, testing data, scoring algorithm.

Page 72: Princeton 11/06/03 1 Penn Putting Meaning Into Your Trees Martha Palmer University of Pennsylvania Princeton Cognitive Science Laboratory November 6, 2003

Princeton 11/06/03 72

PennMaximum Entropy WSDHoa Dang, best performer on Verbs

Maximum entropy framework, p(sense|context) Contextual Linguistic Features

Topical feature for W: keywords (determined automatically)

Local syntactic features for W: presence of subject, complements, passive? words in subject, complement positions, particles,

preps, etc.Local semantic features for W:

Semantic class info from WordNet (synsets, etc.) Named Entity tag (PERSON, LOCATION,..) for

proper Ns words within +/- 2 word window