VerbNet: A broad-coverage comprehensive lexicon
Karin Kipper Schuler
Department of Computer and Information Science
University of Pennsylvania
August 8th, 2003
Natural language processing tasks require both syntactic
and semantic information.
• Differences between syntactic frames can help:
Eng: John left the room. (exited)
Port: John saiu do quarto.
Eng: John left the book on the table. (left)
Port: John deixou o livro na mesa.
• But syntax alone is not sufficient:
Eng: John left the room. (exited)
Port: John saiu do quarto.
Eng: John left a fortune. (gave away)
Port: John deixou uma fortuna.
1
Available resources
Existing resources focus either on syntax or on semantics, and do not provide a clear association between the two.
In addition:
• frequently domain and language specific
• not available to the whole community
• expensive and time-consuming to build
• verbs in particular are difficult to characterize
2
WordNet focuses on semantics
Miller (1985); Fellbaum (1998)
• on-line lexical database
• nouns, verbs, adjectives and adverbs grouped in synonym sets
• hypernyms, antonyms, entailments
• contains little syntactic information and no explicit predicate-argument structure
• senses are fine-grained
3
VerbNet connects semantics to syntax
Created to overcome problems of existing resources
• computational verb lexicon
• broad-coverage and domain-independent
• clear association between syntax and semantics
– lexical semantic information (pred argument structure)
– syntactic frames and selectional restrictions
– semantic predicates
– links to WordNet senses
• refinement of Levin classes to construct the entries
4
Outline
• Overview
• Building blocks for VerbNet
– Levin classes
– Moens and Steedman event structure
• VerbNet
• Parameterized Action Representation (PARs)
• Evaluation
• Proposed work
5
Levin classes
Levin (1993)
• Verbs are grouped into classes
• Each class is characterized by a set of syntactic patterns
John broke the jar / The jar broke / Jars break easily
John cut the bread / *The bread cut / Bread cuts easily
John hit the wall / *The wall hit / *Walls hit easily
• Hypothesis: syntax reflects implicit semantic components
contact, directed motion, exertion of force, change of state
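The grouping idea behind these examples can be sketched in a few lines: verbs that accept the same set of syntactic patterns end up in the same class. This is only an illustration built from the three verbs above; the pattern labels (transitive, inchoative, middle) and the function name are assumptions made for the sketch, not Levin's own notation.

```python
from collections import defaultdict

# Judgments copied from the example sentences: True = acceptable,
# False = starred (*). The pattern labels are illustrative names.
JUDGMENTS = {
    "break": {"transitive": True, "inchoative": True, "middle": True},
    "cut": {"transitive": True, "inchoative": False, "middle": True},
    "hit": {"transitive": True, "inchoative": False, "middle": False},
}


def group_by_patterns(judgments):
    """Group verbs whose sets of acceptable patterns are identical."""
    groups = defaultdict(list)
    for verb, patterns in judgments.items():
        signature = frozenset(name for name, ok in patterns.items() if ok)
        groups[signature].append(verb)
    return dict(groups)


classes = group_by_patterns(JUDGMENTS)
for signature, verbs in classes.items():
    print(sorted(signature), "->", verbs)
```

With only three verbs, each one lands in its own group; adding verbs such as crack or shatter with the same judgments as break would place them in the same group.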
6
Example Levin class
Break Levin class (change of state):
break, chip, crack, crash, crush, fracture, rip, shatter, smash, snap, splinter, split, tear
7
Problems with Levin classes
• classes are not semantically homogeneous
{braid, clip, file, powder, etc.}
• classes are not completely syntactically homogeneous
• verbs can be in multiple class listings
• alternation contradictions
– Carry verbs disallow the conative, yet the class includes {push, pull, shove, etc.},
which are also in the Push/pull class, which does take the conative
8
Event Structure
Verbs refer to events which can be decomposed into a tripartite structure in a manner similar to Moens and Steedman (1988)
preparatory process → culmination → consequent state
9
Verb classes and event structure
[Figure: tripartite event structure (preparatory process (activity), culmination, consequent state) annotated with verb classes]
RUN class: bounce, jog, jump, hop, run
HIT class: batter, kick, hit, slap
BREAK class: break, chip, crack, tear
10
Outline
• Overview
• Building blocks for VerbNet
• VerbNet
• Parameterized Action Representation (PARs)
• Evaluation
• Proposed work
11
Characteristics of verbs:
• verbs represent processes/events/states
• verbs have complex meaning
• time, space
• can have participants
• can be subdivided into sub-parts to capture during, end, results
12
Examples of verbs and their components
• RUN
– expresses an iterative activity, with no culmination or consequent state
– one participant
– motion of participant is a semantic component
– path is optional
• HIT
– expresses contact between two objects
– happens momentarily, has a well-defined end, has no consequent state
– has three participants
• BREAK
– expresses a change of state
13
VerbNet class entries
• verb classes to capture generalizations about verb behavior
• for each verb class
– class local thematic roles
– syntactic frames
– selectional restrictions for the arguments in each frame
– each frame includes semantic predicates with a time function
14
Thematic roles
• list contains 21 thematic roles:
Actor, Agent, Asset, Attribute, Beneficiary, Cause, Destination, Experiencer, Extent,
Instrument, Location, Material, Patient, Predicate, Product, Recipient, Source,
Stimulus, Time, Theme, Topic
• verbs may have different roles if they belong to different classes
• our set of roles has been mapped to the roles used by the University of Colorado for an experiment in automatic role label assignment
15
Selectional Restrictions
• based on EuroWordNet concepts (Vossen 2003)
• IS-A hierarchy with multiple inheritance and no cycles
• current list contains 36 restrictions
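An IS-A hierarchy with multiple inheritance and no cycles is a directed acyclic graph, and checking a restriction amounts to walking up the parent links. The sketch below uses restriction labels that appear in this talk, but the specific parent links are assumptions chosen for illustration, and the function names are hypothetical.

```python
# Hypothetical fragment: each restriction maps to its parents.
# The labels come from the slides; the parent links are assumed.
PARENTS = {
    "concrete": [],
    "animate": ["concrete"],
    "human": ["animate"],
    "artifact": ["concrete"],
    "int-control": ["concrete"],
    "machine": ["artifact", "int-control"],  # multiple inheritance
}


def ancestors(node, parents=PARENTS):
    """Return every restriction that `node` inherits from, transitively."""
    seen = set()
    stack = list(parents[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def satisfies(node, restriction, parents=PARENTS):
    """True if `node` is `restriction` or inherits from it."""
    return node == restriction or restriction in ancestors(node, parents)


def is_acyclic(parents=PARENTS):
    """The hierarchy has no cycles if no node is its own ancestor."""
    return all(node not in ancestors(node, parents) for node in parents)
```

Under these assumed links, `satisfies("machine", "int-control")` and `satisfies("machine", "artifact")` both hold, which is the kind of case multiple inheritance is meant to cover.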
16
Selectional Restrictions
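[Figure: IS-A hierarchy of selectional restrictions, rooted at SelRestr. Node labels as they appear in the figure: concrete, int-control, force, machine, vehicle, natural, animate, human, animal, body-part, plant, phys-obj, comestible, artifact, machine, tool, garment, solid, rigid, non-rigid, shape, pointed, elongated, substance, abstract, idea, sound, communication, location, region, PP, place, object, time, state, scalar, currency, organization]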
17
Syntactic Frames
Describe possible surface realizations for verbs in a class
• constructions such as transitive, intransitive, resultative, and a large set of Levin’s alternations
• Examples:
1. Agent V Patient
(John hit the ball)
2. Agent V at Patient
(John hit at the window)
3. Agent V Patient[+plural] together
(John hit the sticks together)
18
Semantic Predicates
Semantics of a syntactic frame captured through a conjunction of semantic predicates
• each semantic predicate includes a time function showing at what stage in the event the predicate holds:
start(E), during(E), end(E), result(E)
• semantic predicates can be:
– General predicates such as motion and cause
– Specific predicates such as suffocate
– Variable predicates
• arguments can be: Event, Constant, Thematic Role, Verb Specific
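To make the shape of a semantic predicate concrete, here is a small sketch that parses a predicate string into its name, optional negation, time function, and remaining arguments. The `Predicate` class and `parse_predicate` function are hypothetical helpers written for this illustration; they are not part of VerbNet, and the "!" prefix is read as negation on the assumption that this is what it marks in the class tables.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

TIME_FUNCTIONS = ("start", "during", "end", "result")


@dataclass
class Predicate:
    name: str
    negated: bool
    time: Optional[str]  # one of TIME_FUNCTIONS, or None if absent
    args: List[str]


def parse_predicate(text: str) -> Predicate:
    """Parse strings such as '!contact(during(E), Agent, Patient)'."""
    match = re.fullmatch(r"\s*(!?)\s*(\w+)\s*\((.*)\)\s*", text)
    if match is None:
        raise ValueError(f"not a predicate: {text!r}")
    negated = match.group(1) == "!"
    name = match.group(2)
    time = None
    args = []
    # The only nested item is the time function, which contains no
    # commas, so a plain split on "," is enough here.
    for part in match.group(3).split(","):
        part = part.strip()
        time_match = re.fullmatch(r"(\w+)\(E\)", part)
        if time_match and time_match.group(1) in TIME_FUNCTIONS:
            time = time_match.group(1)
        else:
            args.append(part)
    return Predicate(name, negated, time, args)


example = parse_predicate("!contact(during(E), Agent, Patient)")
print(example)
```

A predicate without a time function, such as `cause(Agent, E)`, parses with `time` left as `None` and the event variable kept among the arguments.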
19
Semantic Predicates
• relations between verbs (or verb classes) captured implicitly by the predicates for the class
• aspect captured by the temporal function present in the predicates:
– activities (e.g., run) have during(E)
– bounded activities (e.g., hit) have during(E) and end(E)
– accomplishments (e.g., break) have result(E)
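The three cases above can be read as a simple decision rule over the time functions that a frame's predicates use. The sketch below is one possible reading; the function name, the order of the checks, and the "unspecified" fallback are assumptions, not something the slides define.

```python
def classify_aspect(time_functions):
    """Map the time functions used in a frame's predicates to an
    aspectual label, following the three cases listed above."""
    used = set(time_functions)
    if "result" in used:
        return "accomplishment"       # e.g., break
    if {"during", "end"} <= used:
        return "bounded activity"     # e.g., hit
    if "during" in used:
        return "activity"             # e.g., run
    return "unspecified"


print(classify_aspect(["during"]))
print(classify_aspect(["during", "end"]))
print(classify_aspect(["during", "result"]))
```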
20
Hit class
Class: hit-18.1
Parent: none
Members: bang(1,3), bash(1), batter(1,2,3), beat(2,5), ..., hit(2,4,7,10), kick(3), ...
Themroles: Agent, Patient, Instrument
Selrestr: Agent[+int control], Patient[+concrete], Instrument[+concrete]
Frames:
– Transitive
  Syntax: Agent V Patient
  Example: “Paula hit the ball”
  Semantic predicates:
  cause(Agent, E) ∧
  manner(during(E), directedmotion, Agent) ∧
  !contact(during(E), Agent, Patient) ∧
  manner(end(E), forceful, Agent) ∧
  contact(end(E), Agent, Patient)
– Transitive with Instrument
  Syntax: Agent V Patient Prep(with) Instrument
  Example: “Paula hit the ball with a stick”
  Semantic predicates:
  cause(Agent, E) ∧
  manner(during(E), directedmotion, Agent) ∧
  !contact(during(E), Instrument, Patient) ∧
  manner(end(E), forceful, Agent) ∧
  contact(end(E), Instrument, Patient)
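A class entry like the one above maps naturally onto a small record type. The following is a minimal sketch of how the hit-18.1 entry might be held in memory; the `Frame` and `VerbClass` types and the `frames_using` helper are hypothetical names for this illustration and do not reflect VerbNet's actual file format. The numbers after each member are copied as given and are assumed to be WordNet sense numbers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Frame:
    name: str
    syntax: str
    example: str
    predicates: List[str]


@dataclass
class VerbClass:
    class_id: str
    parent: Optional[str]
    members: Dict[str, List[int]]  # verb -> numbers given in parentheses
    themroles: List[str]
    selrestr: Dict[str, str]       # thematic role -> selectional restriction
    frames: List[Frame] = field(default_factory=list)


hit_class = VerbClass(
    class_id="hit-18.1",
    parent=None,
    # Only members spelled out on the slide; the "..." entries are omitted.
    members={"bang": [1, 3], "bash": [1], "batter": [1, 2, 3],
             "beat": [2, 5], "hit": [2, 4, 7, 10], "kick": [3]},
    themroles=["Agent", "Patient", "Instrument"],
    selrestr={"Agent": "int control", "Patient": "concrete",
              "Instrument": "concrete"},
    frames=[
        Frame("Transitive", "Agent V Patient", "Paula hit the ball",
              ["cause(Agent, E)",
               "manner(during(E), directedmotion, Agent)",
               "!contact(during(E), Agent, Patient)",
               "manner(end(E), forceful, Agent)",
               "contact(end(E), Agent, Patient)"]),
        Frame("Transitive with Instrument",
              "Agent V Patient Prep(with) Instrument",
              "Paula hit the ball with a stick",
              ["cause(Agent, E)",
               "manner(during(E), directedmotion, Agent)",
               "!contact(during(E), Instrument, Patient)",
               "manner(end(E), forceful, Agent)",
               "contact(end(E), Instrument, Patient)"]),
    ],
)


def frames_using(verb_class, role):
    """Return the names of frames whose syntax mentions the given role."""
    return [f.name for f in verb_class.frames if role in f.syntax.split()]


print(frames_using(hit_class, "Instrument"))
```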
21
Hierarchical organization
Refinement of Levin classes
• verb classes are hierarchically organized
– 74 new subclasses
– members have common semantic predicates, thematic roles, syntactic frames
– a particular verb or subclass inherits from its parent and may add more information
22
Transfer of Message
Class: Transfer_mesg-37.1
Parent: none
Members: cite(1,3,4), demonstrate(1), ...
Themroles: Agent, Topic, Recipient
Selrestr: Agent[+animate], Topic[+message], Recipient[+animate]
Frames:
– Transitive
  Syntax: Agent V Topic
  Example: “Wanda cited the author”
  Semantic predicates: transfer_info(during(E), Agent, ?, Topic) ∧ cause(Agent, E)
– Dative (to-PP variant)
  Syntax: Agent V Topic Prep(to) Recipient
  Example: “Wanda cited the author to her students”
  Semantic predicates: transfer_info(during(E), Agent, Recipient, Topic) ∧ cause(Agent, E)

Class: Transfer_mesg-37.1-1
Parent: Transfer_mesg-37.1
Members: quote(1), read(3)
Themroles: (none listed)
Selrestr: (none listed)
Frames:
– Dative (ditransitive variant)
  Syntax: Agent V Recipient Topic
  Example: “Wanda quoted her students the author”
  Semantic predicates: transfer_info(during(E), Agent, Recipient, Topic) ∧ cause(Agent, E)
23
Transfer of Message – level 2
Class: Transfer_mesg-37.1-1
Parent: Transfer_mesg-37.1
Members: quote(1), read(3)
Themroles: Agent, Topic, Recipient
Selrestr: Agent[+animate], Topic[+message], Recipient[+animate]
Frames:
– Transitive
  Syntax: Agent V Topic
  Semantic predicates: transfer_info(during(E), Agent, ?, Topic) ∧ cause(Agent, E)
– Dative (to-PP variant)
  Syntax: Agent V Topic Prep(to) Recipient
  Semantic predicates: transfer_info(during(E), Agent, Recipient, Topic) ∧ cause(Agent, E)
– Dative (ditransitive variant)
  Syntax: Agent V Recipient Topic
  Semantic predicates: transfer_info(during(E), Agent, Recipient, Topic) ∧ cause(Agent, E)
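The level-2 view of the subclass can be derived mechanically by walking from the subclass up to its parent and merging what each level declares. The sketch below assumes a plain dictionary encoding and a hypothetical `resolve` function; the class names are written with an underscore on the assumption that it was lost in the slide text.

```python
# Each class lists only what it declares itself; inherited
# information is filled in by resolve().
CLASSES = {
    "Transfer_mesg-37.1": {
        "parent": None,
        "themroles": ["Agent", "Topic", "Recipient"],
        "frames": ["Agent V Topic", "Agent V Topic Prep(to) Recipient"],
    },
    "Transfer_mesg-37.1-1": {
        "parent": "Transfer_mesg-37.1",
        "themroles": [],
        "frames": ["Agent V Recipient Topic"],
    },
}


def resolve(class_id, classes=CLASSES):
    """Collect themroles and frames from the root down to `class_id`."""
    chain = []
    while class_id is not None:
        chain.append(classes[class_id])
        class_id = classes[class_id]["parent"]
    themroles, frames = [], []
    for entry in reversed(chain):  # parent first, then subclass
        themroles += [r for r in entry["themroles"] if r not in themroles]
        frames += [f for f in entry["frames"] if f not in frames]
    return {"themroles": themroles, "frames": frames}


resolved = resolve("Transfer_mesg-37.1-1")
print(resolved["frames"])
```

Resolving the subclass yields the two inherited frames followed by the ditransitive frame it adds, matching the three frames shown in the level-2 entry.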
24
VerbNet/WordNet
VerbNet to WordNet mappings
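[Figure: mappings for the verb LEAVE between VerbNet classes and WordNet senses. Classes shown: escape-51.1, leave-51.2, fulfill-13.4.1, keep-15.2, future_having-13.3. WordNet senses shown: wn1, wn2, wn3, wn5, wn9, wn10, wn14. Predicate labels shown: motion, direction; motion, direction, change location; has_possession, transfer; be Prep; has_possession, transfer, future_having]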
25
Current status of VerbNet (on-line version)
• over 4100 verb senses (3004 lemmas)
• 191 first-level classes, 74 new subclasses
• 21 thematic roles
• 314 syntactic frames
• 64 semantic predicates
• 36 selectional restrictions on arguments
• hierarchy of prepositions (57 entries)
26
Related Work
• WordNet (Miller 1985; Fellbaum 1998)
– predicate-argument structure
– relations are explicit
• FrameNet (Baker et al. 1998)
– verb groupings
– frame elements vs. thematic roles
• LCS database (Dorr 2001)
– classes based on Levin
– syntactic frames not explicit
27
Related Work
• CoreLex (Buitelaar 1998)
– syntactic and semantic representation of verbs based on the Generative Lexicon
– concentrated on nouns
• Xtag (Xtag Research Group 2001) and ComLex (Comlex 1994)
– provide detailed syntactic description
28
Potential uses of VerbNet
• information extraction: members of a class are not exact synonyms but share arguments
• word sense disambiguation: use of selectional restrictions, thematic role labels, and semantic predicates
• automatic role labeling: use of thematic role labels
• machine translation: semantic predicates abstract away from surface structure
29
Outline
• Overview
• Building blocks for VerbNet
• VerbNet
• Parameterized Action Representation (PARs)
• Evaluation
• Proposed work
30
Parameterized Action Representation (PAR)
(Badler et al. 1999)
Interface to agents in an animation system.
Needs a semantically precise representation.
• Representation of actions
– instructions to a virtual human
– used in a simulated 3D environment
• Represented as
– parameterized structures
– hierarchical organization
31
PARs include:
• action participants (agents/objects)
• restrictions on the types of objects
• kinematic and dynamic properties (path, manner, ..., force)
• stages of the action
– preparatory specifications
– termination conditions
– post-assertions
32
Uninstantiated PAR for actions of contact
activity : [ACTION]
participants : [agent : AGENT, objects : OBJ1, OBJ2]
preparatory spec : [get_control_of(AGENT, OBJ2)]
termination cond : [contact(OBJ1, OBJ2)]
post assertions :
duration : [momentary]
manner : [MANNER]
33
Example of the PAR inheritance hierarchy
contact/(par:contact)
  hit/(manner:forcefully)
    kick/(OBJ2:foot)
    hammer/(OBJ2:hammer)
  touch/(manner:gently)
A lexical/semantic hierarchy for actions of contact
34
Instantiated PAR: John hit the ball with a stick
activity : [ACTION]
participants : [agent : John, objects : ball, stick]
preparatory spec : [get_control_of(John, stick)]
termination cond : [contact(ball, stick)]
post assertions :
duration : [momentary]
path, motion, force
manner : [forcefully]
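Instantiating a PAR amounts to substituting concrete participants for the placeholders in the uninstantiated structure. The following is a minimal sketch of that substitution, assuming a nested dictionary encoding; the key names, the tuple encoding of conditions, and the `instantiate` function are hypothetical and are not the PAR system's own representation.

```python
# Uninstantiated PAR for actions of contact; upper-case strings are
# placeholders to be bound when the action is instantiated.
CONTACT_TEMPLATE = {
    "activity": "ACTION",
    "participants": {"agent": "AGENT", "objects": ["OBJ1", "OBJ2"]},
    "preparatory_spec": [("get_control_of", "AGENT", "OBJ2")],
    "termination_cond": [("contact", "OBJ1", "OBJ2")],
    "duration": "momentary",
    "manner": "MANNER",
}


def instantiate(template, bindings):
    """Recursively replace placeholder strings using `bindings`."""
    if isinstance(template, str):
        return bindings.get(template, template)
    if isinstance(template, dict):
        return {k: instantiate(v, bindings) for k, v in template.items()}
    if isinstance(template, (list, tuple)):
        return type(template)(instantiate(v, bindings) for v in template)
    return template


# "John hit the ball with a stick"; ACTION is left unbound, as on the slide.
par = instantiate(
    CONTACT_TEMPLATE,
    {"AGENT": "John", "OBJ1": "ball", "OBJ2": "stick", "MANNER": "forcefully"},
)
print(par["preparatory_spec"], par["termination_cond"])
```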
35
PARs and VerbNet
PARs for animating agents require precise semantics associated with syntax provided by VerbNet.
• participants of an action are the arguments of a verb
• selectional restrictions on the arguments
• event structure (during, end, result)
• semantic components expressed by predicates
36
Aggregates
(Allbeck et al. 2002)
• VerbNet also used to describe actions of aggregate entities
• actions decomposed by features based on Laban Movement Analysis (EMOTE system)
• used in a playground scenario with a teacher and 8 kids
• Examples of aggregate actions:
Aggregate actions:
– Gathering: assemble, congregate
– Dispersing: dissipate, scatter
– Obj refer: surround, encircle
– Formation
– Milling
37
Aggregates
• PAR entry for assemble (as in “children assemble in the playground”)
assemble / ARG0-v /
is_concrete(ARG0)
is_plural(ARG0)
!together_group(start(e),ARG0)
transl_motion(during(e),ARG0)
shape_enclosing(during(e),ARG0)
effort_direct(during(e),ARG0)
together_group(end(e),ARG0)
• VerbNet entry
Class: Herd-47.5.2
Parent: none
Members: accumulate, aggregate, amass, assemble, cluster, collect, congregate, convene, flock, gather, group, herd, huddle, mass
Themroles: Theme[+concrete +plural]
Frames:
– Intransitive
  Example: The kids are assembling
  Syntax: Theme V
  Semantics:
  !together(start(E), physical, Theme)
  together(end(E), physical, Theme)
38
Outline
• Overview
• Building blocks for VerbNet
• VerbNet
• Parameterized Action Representation
• Evaluation
• Proposed work
39
PropBank (Univ. of Penn)
(Kingsbury, Palmer, and Marcus, 2002)
• annotation of WSJ part of Penn Treebank with predicate-argument structures
• argument labels defined per verb: Arg0, Arg1, ...
• set of modifiers (ARGMs) are also annotated (LOC, TEMP, DIR, etc.)
• different senses yield different rolesets
• labels are only significant within roleset
40
Sense distinctions in PropBank
Captured by different rolesets, with coarse-grained senses preferred:
Roleset leave.01 “move away from”:
Arg0: entity leaving
Arg1: place left
Arg3: attribute
Ex: [ARG0 John] [rel left] [ARG1 the room]
Roleset leave.02 “give”:
Arg0: giver
Arg1: thing given
Arg2: beneficiary
Ex: [ARG0 John] [rel left] [ARG1 cookies] [ARG2 for Mary]
41
Evaluation
Aimed to establish a baseline and to uncover what needs to be added to VerbNet.
• verify the syntactic coverage of VerbNet vs. independent resource
• approx. 50k instances, 1200 verbs, 178 classes (out of 191 VN classes)
• results computed per verb and per class
42
Syntactic coverage against PropBank (1)
• Mapping between PB rolesets and VN verb classes
• Mapping between PB argument labels and VN thematic roles
[Figure: mappings for the verb LEAVE]
leave.02 "give" → future_having-13.3, keep-15.2, fulfill-13.4.1
arg0 (giver) → Agent
arg1 (thing given) → Theme
arg2 (benefactive) → Recipient
leave.01 "move away from" → escape-51.1, leave-51.2
arg0 (entity leaving) → Theme
arg1 (place left) → Source
arg2 (attribute)
43
Syntactic coverage against PropBank (2)
Example: verb LEAVE
wsj/05/wsj_0568.mrg 12 4:
The tax payments will leave Unisys with $ 225 million in loss carry-forwards that will cut tax payments in future quarters .
[ARG0 The tax payments] [rel leave] [ARG2 Unisys] [ARG1 with 225 million]
leave-51.2: Theme V NP Prep(with) Source
future_having-13.3: Agent V Recipient Prep(with) Theme
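The two frame strings above can be produced by rewriting the PropBank argument labels through each class's argument-to-role mapping. The sketch below reproduces that step for this one instance; the `ROLE_MAP` encoding, the `to_frame` function, and the rule that an unmapped argument is rendered as NP are assumptions made for illustration, not the actual matching algorithm.

```python
# Argument-to-role mappings for the two candidate classes, written
# out by hand for this instance (a hypothetical encoding).
ROLE_MAP = {
    "leave-51.2": {"ARG0": "Theme", "ARG1": "Source"},
    "future_having-13.3": {"ARG0": "Agent", "ARG1": "Theme",
                           "ARG2": "Recipient"},
}


def to_frame(instance, role_map):
    """Render a PropBank instance as a VerbNet-style frame string.

    `instance` is a list of (label, preposition or None) pairs in
    surface order; "rel" marks the verb.
    """
    parts = []
    for label, prep in instance:
        if label == "rel":
            parts.append("V")
            continue
        if prep:
            parts.append(f"Prep({prep})")
        parts.append(role_map.get(label, "NP"))  # unmapped argument -> NP
    return " ".join(parts)


# [ARG0 The tax payments] [rel leave] [ARG2 Unisys] [ARG1 with 225 million]
instance = [("ARG0", None), ("rel", None), ("ARG2", None), ("ARG1", "with")]
for class_id, role_map in ROLE_MAP.items():
    print(f"{class_id}: {to_frame(instance, role_map)}")
```

An exact-match test would then check whether the rendered string is among the frames listed for the class, and the relaxed test would ignore the specific preposition.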
44
Syntactic coverage against PropBank (3)
(A) exact match to a frame in the verb class
(B) match to any value for prepositions
(C) match miscellaneous modifiers to VerbNet roles
Matching any mapped class:
Criterion   Number of instances   Accuracy
A           38,246                0.786
B           39,292                0.808
C           35,519                0.730
(A–C)       39,351                0.809
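As a quick consistency check on the table, dividing each instance count by its accuracy gives the size of the evaluated set implied by that row. This assumes the "number of instances" column counts matched instances, which the slide does not state explicitly; the variable and function names are illustrative.

```python
# (matched instances, accuracy) per matching criterion, from the table.
RESULTS = {
    "A": (38246, 0.786),
    "B": (39292, 0.808),
    "C": (35519, 0.730),
    "A-C": (39351, 0.809),
}


def implied_total(matched, accuracy):
    """Total number of instances implied by a count and its accuracy."""
    return matched / accuracy


totals = {name: implied_total(*row) for name, row in RESULTS.items()}
for name, total in totals.items():
    print(f"{name}: about {total:,.0f} instances")
```

Under that assumption every row implies a total between 48,600 and 48,700, in line with the "approx. 50k instances" stated for the experiment.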
45
Preposition mismatches
When instances with prepositions are removed from the experiment, the exact match rate is 81%.
Looked at the instances that matched under relaxed criterion:
1. preposition should be added to VerbNet class
- either for a particular verb or to a set of verbs
2. usage of verb is not captured by VerbNet
3. differences between PropBank and VerbNet
- argument versus adjunct
- incorrect mappings between rolesets and classes or
between arguments and roles
4. inconsistent PropBank annotation
46
PropBank/VerbNet/WordNet
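[Figure: mappings for the verb LEAVE across PropBank, VerbNet, and WordNet. PropBank rolesets: leave.01 (move away from), leave.02 (give). VerbNet classes shown: escape-51.1, leave-51.2, fulfill-13.4.1, keep-15.2, future_having-13.3. WordNet senses shown: wn1, wn2, wn3, wn5, wn9, wn10, wn14. Predicate labels shown: motion, direction; motion, direction, change location; has_possession, transfer; be Prep; has_possession, transfer, future_having]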
47
Outline
• Overview
• Building blocks for VerbNet
• VerbNet
• Parameterized Action Representation (PARs)
• Evaluation
• Proposed work
48
VerbNet
Computational verb lexicon with explicit association between syntax and semantics:
• broad-coverage and domain-independent
• freely available on-line
Status:
• over 4,100 verb senses (3004 lemmas)
• 191 first-level classes, 74 subclasses
• 314 syntactic frames, and 64 semantic predicates
49
Proposed Work
(1) Complete semantic predicates:
underway, estimated to be finished by the end of the summer; 29 new predicates added so far.
(2) Increase syntactic coverage:
currently 78% exact match. New syntactic frames and verb-specific prepositions based on the syntactic coverage experiment are being added. Also, changes in the matching algorithm, such as looking for specific lexical items in the frame, are underway.
50
Proposed Work
(3) Refinement of the classes:
underway with new subclasses being added. We are using the results of the syntactic coverage experiment (both frames and prepositions), as well as linguistic judgment to refine classes.
So far, we have 132 subclasses, distributed in 64 classes.
51
Proposed Work
(4) Addition of new members:
• using clustering algorithms to find verbs not currently in VerbNet which behave in a similar way as described by the class membership
• Kingsbury and Kipper (2003) did a preliminary investigation using a k-means clustering algorithm on the PropBank annotated corpus:
– 921 verb senses
– 200 distinct syntactic patterns based on surface realization
– split into 150 clusters
– because not all verbs used are in VerbNet, provided additional members to classes
• compare new members and classes suggested to the ones uncovered by Dorr and Jones (1995), and Korhonen (2003)
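The clustering step mentioned above can be illustrated with a very small k-means over per-verb pattern vectors. Everything in this sketch is a toy assumption: the four verbs, the three pattern columns, and the frequencies are invented for illustration, and the real experiment used 921 verb senses, 200 patterns, and 150 clusters. The initialization (first k points) is also a simplification.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = None
    for _ in range(iterations):
        # Assign each point to its nearest centroid (squared distance).
        new_assignment = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2
                                  for x, y in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assignment


# Invented relative frequencies over three surface patterns:
# [NP V, NP V NP, NP V NP PP]
VERBS = ["run", "jog", "hit", "kick"]
VECTORS = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.7, 0.3],
    [0.1, 0.6, 0.3],
]

labels = kmeans(VECTORS, k=2)
print(dict(zip(VERBS, labels)))
```

On this toy input the two mostly intransitive verbs end up in one cluster and the two mostly transitive verbs in the other; a verb absent from VerbNet that fell into a cluster dominated by one class would be a candidate new member of that class.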
52
Proposed Work
(5) Mappings to other resources:
• Xtag
– mappings between syntactic frames and TAG tree families
(or single trees)
– goal is to increase syntactic coverage by having transformations
• FrameNet
– provide a different view of the lexicon
– mappings between verbs and frames
53
Proposed Work
(6) Visual experiments
• odd one out: are predicates used consistently across classes?
a set of 4 videos, one of which does not have the same predicates as the
other three
• multilingual experiment: can predicates be used across languages?
“Mary spoons the chocolate over the ice cream”
54
Other possibilities
Verify the correctness of the frames produced in the PropBank experiment, and do VerbNet annotation on the PropBank corpus:
• are the frames the expected ones for verbs in the classes?
• are the semantic predicates associated with the frame helpful in any way? Could these be used for MT or WSD?
56
Other possibilities
Verify how well our semantic predicates reflect the relations described explicitly in WordNet (at least for relations such as antonymy and entailment)
57
Clustering
• add syntactic information to the patterns (NP, S)
• add semantic roles (Agent, Patient)
• add semantic classes
• undo transformations
• try other clustering algorithms
58
Intersective classes
Around 72% of verbs that belong to intersective classes are clustered in the same subclass in VerbNet.
[Figure: overlap between the Butter and Spray/load classes]
Butter-9.9: asphalt, bait, blanket, blindfold, etc.
Butter-9.9-1: cap, crown, fuel, top
Spray/load-9.7: brush, drizzle, hand, etc.
Spray/load-9.7-1: cram, crowd, jam, pack, etc.
Spray/load-9.7-2: drape, load, dab, daub, etc.
plaster, seed, string: shown in three places in the figure (after Butter-9.9, Spray/load-9.7, and Spray/load-9.7-2)
59