Modelling Human Thematic Fit Judgments IGK Colloquium 3/2/2005 Ulrike Padó


Modelling Human Thematic Fit Judgments

IGK Colloquium, 3/2/2005

Ulrike Padó


Overview

• (Very) quick introduction to my framework

• Testing the Semantic Module
    • Different input corpora
    • Smoothing

• Comparing the Semantic Module to standard selectional preference methods


Modelling Semantic Processing

• General idea: Build a probabilistic, large-scale, broad-coverage model of syntactic and semantic sentence processing


Semantic Processing

• Assign thematic roles on the basis of co-occurrence statistics from semantically annotated corpora

• Corpus-based frequency estimates of:
    • Semantic Subcategorisation (probability of seeing the role with the verb)
    • Selectional Preferences (probability of seeing the argument head in a role given the verb frame)
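
The two estimates on this slide can be read as relative frequencies over role-annotated corpus instances. The sketch below is one possible reading, not the talk's actual implementation: the toy annotations, the tuple layout (verb, frame, role, head) and both function names are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical role-annotated instances: (verb, frame, role, argument head).
ANNOTATIONS = [
    ("arrest", "Arrest", "Agent", "cop"),
    ("arrest", "Arrest", "Patient", "crook"),
    ("arrest", "Arrest", "Patient", "thief"),
    ("arrest", "Arrest", "Patient", "crook"),
]


def subcat_probability(annotations, verb, role):
    """Relative-frequency estimate of P(role | verb)."""
    verb_total = sum(1 for v, _, _, _ in annotations if v == verb)
    with_role = sum(1 for v, _, r, _ in annotations if v == verb and r == role)
    return with_role / verb_total if verb_total else 0.0


def preference_probability(annotations, verb, frame, role, head):
    """Relative-frequency estimate of P(head | verb, frame, role)."""
    heads = [h for v, f, r, h in annotations if (v, f, r) == (verb, frame, role)]
    return Counter(heads)[head] / len(heads) if heads else 0.0


p_role = subcat_probability(ANNOTATIONS, "arrest", "Patient")
p_head = preference_probability(ANNOTATIONS, "arrest", "Arrest", "Patient", "crook")
```

With counts this small, most verb-argument pairs receive no estimate at all, which is the sparseness problem the later slides address.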


Testing the Semantic Module

• Evaluate just thematic fit of verbs and argument phrases

• Evaluation:
    1. Correlate predictions with human judgments
    2. Role labelling (prefer correct role)

• Try
    • Different input corpora
    • Smoothing
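
Both evaluation measures can be sketched in a few lines. The slide only says "correlate", so the use of a Spearman rank correlation here is an assumption, as are the function names and the convention that `None` marks a pair for which the model makes no prediction.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation; assumes neither list is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(predictions, judgments):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(predictions), ranks(judgments))


def labelling_scores(predicted_roles, gold_roles):
    """Coverage = share of items with a prediction; accuracy = share of those that are correct."""
    covered = [(p, g) for p, g in zip(predicted_roles, gold_roles) if p is not None]
    coverage = len(covered) / len(gold_roles)
    accuracy = sum(p == g for p, g in covered) / len(covered) if covered else 0.0
    return coverage, accuracy
```

Reporting coverage next to accuracy matters here because a sparse model can look accurate while predicting for only a handful of items.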


Training Data

Frequency counts from

• PropBank (ca. 3000 verb types)
    • Very specific domain
    • Relatively flat, syntax-based annotation

• FrameNet (ca. 1500 verb types)
    • Deep semantic annotation: frames encode situations and group verbs that describe similar events, together with their arguments
    • Extracted from a balanced corpus
    • Skewed sample through frame-wise annotation


Development/Test Data

• Development: 60 verb-argument pairs from McRae et al. (1998)
    • Two judgments for each data point: Agent/Patient
    • Use to determine optimal parameters of clustering (number of clusters, smoothing)

• Test: 50 verb-argument pairs, 100 data points


Sparse Data

• Raw frequencies are sparse:
    • 1 (Dev)/2 (Test) pairs seen in PropBank
    • 0 (Dev)/2 (Test) pairs seen in FrameNet

• Use semantic classes as level of abstraction: class-based smoothing


Smoothing

Reconstruct probabilities for unseen data

• Smoothing by verb and noun classes
    • Count class members instead of word tokens

• Compare two alternatives:
    • Hand-constructed classes
    • Induced verb classes (clustering)
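
Counting class members instead of word tokens can be sketched as follows. The class inventories, the observed tokens and the function names are all hypothetical; the talk takes its classes from WordNet/VerbNet or from induced clusters, and its exact estimator is not given on the slides.

```python
from collections import Counter

# Hypothetical class assignments, standing in for hand-built or induced classes.
VERB_CLASS = {"arrest": "V_capture", "catch": "V_capture", "eat": "V_ingest"}
NOUN_CLASS = {"cop": "N_person", "thief": "N_person", "crook": "N_person", "apple": "N_food"}

# Hypothetical observed (verb, role, argument head) tokens.
OBSERVED = [
    ("catch", "Agent", "cop"),
    ("catch", "Patient", "thief"),
    ("catch", "Patient", "thief"),
    ("eat", "Patient", "apple"),
]


def class_counts(observed):
    """Count (verb class, role, noun class) triples instead of word triples."""
    counts = Counter()
    for verb, role, head in observed:
        counts[(VERB_CLASS[verb], role, NOUN_CLASS[head])] += 1
    return counts


def smoothed_role_probability(counts, verb, role, head):
    """Estimate P(role | verb class, noun class); None if the class pair is unseen."""
    vc, nc = VERB_CLASS.get(verb), NOUN_CLASS.get(head)
    total = sum(c for (v, _, n), c in counts.items() if v == vc and n == nc)
    return counts[(vc, role, nc)] / total if total else None


counts = class_counts(OBSERVED)
# "arrest crook" never occurs in OBSERVED, but its classes do.
p = smoothed_role_probability(counts, "arrest", "Patient", "crook")
```

The unseen pair receives an estimate because "catch" and "thief" share classes with "arrest" and "crook"; pairs whose class combination was never observed still stay uncovered.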


Hand-constructed Verb and Noun classes

• WordNet: Use top-level ontology and synsets as noun classes

• VerbNet: Use top-level classes for verbs

• Presumably correct and reliable

• Result: No significant correlations with human data for either training corpus


Induced Verb Classes

• Automatically cluster verbs
    • Group by similarities of argument heads, paths from argument to verb, frame, role labels
    • Determine optimal number of clusters and parameters of the clustering algorithm on the development set
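
The slides do not name the clustering algorithm, so the following is only a stand-in: a greedy agglomerative clustering over cosine similarity of feature counts, with the number of clusters left as the parameter that would be tuned on the development set. The feature names and counts are invented for illustration.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical feature counts per verb (argument heads, role labels, ...).
FEATURES = {
    "arrest": Counter({"head=thief": 3, "role=Patient": 3, "head=cop": 1}),
    "catch": Counter({"head=thief": 2, "role=Patient": 2, "head=ball": 1}),
    "eat": Counter({"head=apple": 4, "role=Patient": 1}),
    "devour": Counter({"head=apple": 2, "head=meal": 2}),
}


def cosine(a, b):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(x * x for x in a.values())) * sqrt(sum(x * x for x in b.values()))
    return dot / norm if norm else 0.0


def cluster_verbs(features, n_clusters):
    """Merge the two most similar clusters until n_clusters remain."""

    def centroid(cluster):
        # Summed feature counts of all verbs in the cluster.
        total = Counter()
        for verb in cluster:
            total.update(features[verb])
        return total

    clusters = [[verb] for verb in sorted(features)]
    while len(clusters) > n_clusters:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cosine(centroid(clusters[pair[0]]), centroid(clusters[pair[1]])),
        )
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]  # safe: combinations always yields i < j
    return clusters


verb_classes = cluster_verbs(FEATURES, n_clusters=2)
```

In a setup like the one described, `n_clusters` would be chosen by running the evaluation on the development pairs for several values and keeping the best one.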


Induced Classes, PB/FN

Condition        Corpus   Data points covered   Correlation/Significance
Raw data         PB       2                     -/-
Raw data         FN       2                     -/-
All Arguments    PB       59                    ns
All Arguments    FN       12                    0.55 / p<0.05
Just NPs         PB       48                    ns
Just NPs         FN       16                    0.56 / p<0.05


Results

• Hand-built classes do not work (with this amount of data)

• Module achieves reliable correlations with FN data
    • Important result for the overall feasibility of my model


Adding Noun Classes (PB/FN)

Condition                     Data points covered   Correlation/Significance
Raw data (PB)                 2                     -/-
Raw data (FN)                 2                     -/-
PB, all args, noun classes    4                     1 / p<0.01
FN, just NPs, noun classes    18                    0.63 / p<0.01


Results

• Hand-built classes do not work (with this amount of data)

• Module achieves reliable correlations with FN data

• Adding noun classes helps a little more


Comparison with Selectional Preference Methods

• Have established that our system reliably predicts human data

• How do we do in comparison to standard computational linguistics methods?


Selectional Preference Methods

• Clark & Weir (2002)
    • Add data points by finding the topmost class in WN that still reliably mirrors the target word frequency

• Resnik (1996)
    • Quantify contribution of WN class n to the overall preference strength of the verb

• Both rely on WN noun classes, no verb class smoothing
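
Resnik's measure can be made concrete with a small sketch. The preference strength of a verb is the relative entropy between the class distribution given the verb and the prior class distribution, and the selectional association of a class is its share of that sum. The two toy distributions and the function names below are assumptions for illustration; real class probabilities would be estimated over WN classes.

```python
from math import log2

# Hypothetical distributions over two noun classes.
P_CLASS = {"food": 0.25, "person": 0.75}            # prior P(c)
P_CLASS_GIVEN_EAT = {"food": 0.75, "person": 0.25}  # P(c | verb = "eat")


def preference_strength(p_class_given_verb, p_class):
    """S(v) = sum over c of P(c|v) * log2(P(c|v) / P(c))."""
    return sum(
        p * log2(p / p_class[c]) for c, p in p_class_given_verb.items() if p > 0
    )


def selectional_association(p_class_given_verb, p_class, target_class):
    """A(v, c) = P(c|v) * log2(P(c|v) / P(c)) / S(v)."""
    strength = preference_strength(p_class_given_verb, p_class)
    p = p_class_given_verb.get(target_class, 0.0)
    if strength == 0 or p == 0:
        return 0.0
    return p * log2(p / p_class[target_class]) / strength


a_food = selectional_association(P_CLASS_GIVEN_EAT, P_CLASS, "food")
a_person = selectional_association(P_CLASS_GIVEN_EAT, P_CLASS, "person")
```

The association values sum to 1 across classes, so a positive value marks a preferred class and a negative value a dispreferred one. Note that nothing here generalises across verbs, which is the "no verb class smoothing" point on the slide.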


Selectional Preference Methods (PB/FN)

Method               Data points covered   Correlation/Significance   Labelling (Cov/Acc)
Sem. Module 1        18                    0.63 / p<0.01              38%/47.4%
Sem. Module 2        16                    0.56 / p<0.05              30%/60%
Clark & Weir (PB)    72                    ns                         84%/50%
Clark & Weir (FN)    23                    ns                         36%/50%
Resnik (PB)          75                    ns                         74%/48.6%
Resnik (FN)          46                    ns                         50%/48%


Results

• Too little input data
    • No results for selectional preference models
    • Small coverage for Semantic Module

• Semantic Module manages to make predictions all the same
    • Relies on verb clusters: verbs are less sparse than nouns in small corpora

• Annotate larger corpus with FN roles


Annotating the BNC

• Annotate large, balanced corpus: BNC
    • More data points for verbs covered in FN
    • More verb coverage (though purely syntactic annotation for unknown verbs)

• Results:
    • Annotation relatively sensible and reliable for non-FN verbs
    • Frame-wise annotation in FN causes problems for FN verbs