Page 1
Generalized Inference with Multiple Semantic Role Labeling Systems
Peter Koomen, Vasin Punyakanok, Dan Roth, (Scott) Wen-tau Yih
Department of Computer Science, University of Illinois at Urbana-Champaign
Page 2
Outline
System Architecture
- Pruning
- Argument Identification
- Argument Classification
- Inference [main difference from other systems]

Inference with Multiple Systems
- The same approach the SRL system uses to ensure a coherent output is applied to input produced by multiple systems.
Page 3
System Architecture
Pruning: identify argument candidates
Argument Identifier: binary classification of the candidates
Argument Classifier: multi-class classification of the candidates
Inference:
- uses the estimated probability distribution given by the argument classifier, and expressive structural and linguistic constraints
- infers the optimal global output, modeled as a constrained optimization problem
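The four stages above compose into a simple pipeline. A minimal sketch (the stage functions below are toy stand-ins for illustration, not the authors' implementation):

```python
def prune(words):
    # Stage 1: heuristically collect candidate spans (here: all spans of length <= 3).
    n = len(words)
    return [(i, j) for i in range(n) for j in range(i + 1, min(i + 3, n) + 1)]

def identify(candidates):
    # Stage 2: binary filter; keep spans the identifier accepts (here: length <= 2).
    return [c for c in candidates if c[1] - c[0] <= 2]

def classify(candidates):
    # Stage 3: multi-class scores over argument types, including NULL.
    return {c: {"NULL": 0.5, "A0": 0.3, "A1": 0.2} for c in candidates}

def infer(scores):
    # Stage 4: pick the best legitimate global assignment (here: a plain argmax;
    # the real system solves a constrained optimization problem instead).
    return {c: max(dist, key=dist.get) for c, dist in scores.items()}

words = ["traders", "say", "panic"]
output = infer(classify(identify(prune(words))))
```

The point of the sketch is the data flow: each stage narrows or enriches the candidate set, and only the last stage makes a global decision.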
Page 4
Pruning [Xue & Palmer 2004]

Significant errors are due to PP attachment.
Consider a PP as attached to both the NP and the VP.
Devel      Prec   Rec    F1
Gold       30.19  96.57  46.00
Charniak   26.61  85.47  40.59
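The Xue & Palmer heuristic collects candidates by walking from the predicate up to the root, taking the sisters of every node on the path, and, when a sister is a PP, also taking its immediate children. A bare-bones sketch under those assumptions (the tree class and example tree are made up for illustration):

```python
class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.parent = None
        self.children = children or []
        for child in self.children:
            child.parent = self

def xue_palmer_candidates(predicate):
    """Collect candidate nodes: the sisters of every node on the path from
    the predicate to the root; for a PP sister, also its children."""
    candidates = []
    node = predicate
    while node.parent is not None:
        for sister in node.parent.children:
            if sister is node:
                continue
            candidates.append(sister)
            if sister.label == "PP":
                candidates.extend(sister.children)
        node = node.parent
    return candidates

# Toy parse: (S (NP traders) (VP (V cool) (NP the panic) (PP (IN in) (NP stocks))))
np1 = Node("NP")
v = Node("V")  # the predicate
np2 = Node("NP")
pp = Node("PP", [Node("IN"), Node("NP")])
vp = Node("VP", [v, np2, pp])
tree = Node("S", [np1, vp])
labels = [c.label for c in xue_palmer_candidates(v)]
```

On the toy tree this yields the VP-level sisters (NP, PP), the PP's children (IN, NP), and finally the subject NP.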
Page 5
Modified Pruning
Devel                Prec   Rec    F1
Gold                 30.19  96.57  46.00
Charniak             26.61  85.47  40.59
Charniak (modified)  23.31  87.59  36.83
Page 6
Argument Identification
The argument identifier is trained as a phrase-based classifier.
Learning algorithm: SNoW, a sparse network of linear classifiers.
Weight update: a regularized variant of the Winnow multiplicative update rule.
When a probability estimate is needed, we use the softmax function.
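The softmax conversion mentioned above turns raw classifier activations act_j into a probability distribution via p_i = e^(act_i) / sum_j e^(act_j). A minimal sketch:

```python
import math

def softmax(activations):
    # Subtract the max activation for numerical stability;
    # the resulting distribution is mathematically unchanged.
    m = max(activations)
    exps = [math.exp(a - m) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
```

The outputs are positive, sum to one, and preserve the ordering of the activations, which is why they can serve as the probability estimates the inference stage consumes.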
Page 7
Argument Identification (Features)
Parse tree structure is from Collins' and Charniak's parsers.
Clauses, chunks, and POS tags are from the UPC processors.
Page 8
Argument Classification
Similar to argument identification, using SNoW as a multi-class classifier.
The classes also include NULL, so a candidate can still be rejected at this stage.
Page 9
Inference
Occasionally, the output of the argument classifier violates some constraints.
The inference procedure [Punyakanok et al., 2004]:
- Input: the probability estimates given by the argument classifier, plus structural and linguistic constraints
- Output: the best legitimate global prediction
It is formulated as an optimization problem and solved via Integer Linear Programming (ILP), which allows incorporating expressive (non-sequential) constraints on the variables (the argument types).
Page 10
Integer Linear Programming Inference
For each argument a_i, set up a Boolean variable a_{i,t} indicating whether a_i is classified as type t.
The goal is to maximize

    Σ_i Σ_t score(a_i = t) · a_{i,t}

subject to the (linear) constraints; any Boolean constraint can be encoded this way.
If score(a_i = t) = P(a_i = t), the objective finds the assignment that maximizes the expected number of correct arguments while satisfying the constraints.
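To make the objective concrete, here is a brute-force version of the same optimization: exhaustive search over type assignments instead of an actual ILP solver, with made-up candidate spans, scores, and a single no-duplicates constraint:

```python
from itertools import product

# Hypothetical candidates (word spans) and scores P(a_i = t).
candidates = [(0, 1), (2, 5)]
types = ["NULL", "A0", "A1"]
score = {
    (0, 1): {"NULL": 0.2, "A0": 0.7, "A1": 0.1},
    (2, 5): {"NULL": 0.2, "A0": 0.5, "A1": 0.3},
}

def no_duplicates(assignment):
    # Example constraint: no duplicate core argument classes.
    core = [t for t in assignment if t != "NULL"]
    return len(core) == len(set(core))

def objective(assignment):
    # Sum_i score(a_i = t) over the chosen types, i.e. the ILP objective
    # with a_{i,t} = 1 exactly for the chosen (i, t) pairs.
    return sum(score[c][t] for c, t in zip(candidates, assignment))

best = max(
    (a for a in product(types, repeat=len(candidates)) if no_duplicates(a)),
    key=objective,
)
```

Note that the unconstrained argmax would label both spans A0 (total 1.2); the constraint forces the second span to its next-best type, which is exactly the kind of global trade-off the ILP resolves.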
Page 11
Constraints
No overlapping or embedding arguments:
if a_i and a_j overlap or embed each other, then a_{i,NULL} + a_{j,NULL} ≥ 1
(i.e., at least one of the two must be labeled NULL).
Page 12
Constraints
- No overlapping or embedding arguments
- No duplicate argument classes for A0-A5
- Exactly one V argument per predicate
- If there is a C-V, there must be a V-A1-C-V pattern
- If there is an R-arg, there must be an arg somewhere
- If there is a C-arg, there must be an arg somewhere before it
- Each predicate can take only the core arguments that appear in its frame file; more specifically, we check only the minimum and maximum ids
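Several of these constraints amount to a validity check on a proposed label sequence. A sketch covering a subset of them (the real system encodes each as linear inequalities over the Boolean variables; labels here are illustrative):

```python
def is_legitimate(labels):
    """Check a proposed list of argument labels for one predicate
    against a subset of the constraints above."""
    core = [t for t in labels if t in {"A0", "A1", "A2", "A3", "A4", "A5"}]
    # No duplicate argument classes for A0-A5.
    if len(core) != len(set(core)):
        return False
    # Exactly one V argument per predicate.
    if labels.count("V") != 1:
        return False
    # An R-arg requires the referenced arg somewhere.
    for t in labels:
        if t.startswith("R-") and t[2:] not in labels:
            return False
    # A C-arg requires the referenced arg somewhere before it.
    for i, t in enumerate(labels):
        if t.startswith("C-") and t[2:] not in labels[:i]:
            return False
    return True
```

A checker like this explains why inference is needed at all: the per-candidate argmax can easily produce, say, two A0 labels, and only a global procedure can repair that.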
Page 13
Results
                 Prec   Rec    F1
Dev    Collins   73.89  70.11  71.95
       Charniak  75.40  74.13  74.76
WSJ    Collins   77.09  72.00  74.46
       Charniak  78.10  76.15  77.11
Brown  Collins   68.03  63.34  65.60
       Charniak  67.15  63.57  65.31
Page 14
Inference with Multiple Systems
The performance of SRL depends heavily on the very first stage, pruning [IJCAI 2005], which is derived directly from the full parse trees.
Joint inference allows improvement over individual semantic role labeling classifiers:
- combine different SRL systems through joint inference
- the systems are derived using different full parse trees
Page 15
Inference with Multiple Systems
Multiple systems:
- Train and test with Collins' parse outputs
- Train with Charniak's best parse outputs; test with the 5-best Charniak parse outputs
Page 16
Naïve Joint Inference

[Figure: the example sentence "..., traders say, unable to cool the selling panic in both stocks and futures." Two systems propose argument candidates (a1-a4 from one, b1-b3 from the other) over phrases such as "traders" and "the selling panic in both stocks and futures". Each candidate carries a probability distribution over {NULL, A0, A1, A2}, e.g. (0.2, 0.4, 0.2, 0.2), (0.3, 0, 0.7, 0), (0.1, 0.2, 0.4, 0.3), and (0.1, 0.3, 0.2, 0.4); the naïve joint inference combines the systems' distributions, yielding e.g. (0.3, 0.3, 0.2, 0.2).]
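Assuming the naïve joint combination on this slide simply averages the distributions that different systems assign to the same candidate span (an illustration of the idea, not the exact implementation), the combination step looks like:

```python
def combine(dists):
    """Average several probability distributions over the same label set."""
    labels = dists[0].keys()
    return {t: sum(d[t] for d in dists) / len(dists) for t in labels}

# Two systems score the same candidate span over {NULL, A0, A1, A2}
# (the second system's numbers are made up for the example).
sys_a = {"NULL": 0.2, "A0": 0.4, "A1": 0.2, "A2": 0.2}
sys_b = {"NULL": 0.4, "A0": 0.2, "A1": 0.2, "A2": 0.2}
joint = combine([sys_a, sys_b])
```

The averaged scores then feed into the same ILP inference as before, so the final output is still guaranteed to satisfy all the structural constraints.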
Page 17
Joint Inference – Phantom Candidates

[Figure: candidates proposed by only one system (e.g. a2, a3, and b4) are added to the other system as "phantom" candidates and given a default prior distribution over {NULL, A0, A1, A2}, e.g. (0.55, 0.2, 0.15, 0.1), which favors NULL.]
Page 18
Results of Joint Inference (F1)

[Bar chart: F1 on Devel, WSJ, and Brown for the individual systems (Col, Char, Char-2 through Char-5) and the Combined system; the Combined system reaches 77.35 on Devel, 79.44 on WSJ, and 67.75 on Brown.]
Page 19
Results of Joint Inference (Recall)

[Bar chart: recall on Devel, WSJ, and Brown for Col, Char, Char-2 through Char-5, and the Combined system; the Combined system reaches 74.83 on Devel, 76.78 on WSJ, and 62.93 on Brown.]
Page 20
Results of Joint Inference (Precision)

[Bar chart: precision on Devel, WSJ, and Brown for Col, Char, Char-2 through Char-5, and the Combined system; the Combined system reaches 80.05 on Devel, 82.28 on WSJ, and 73.38 on Brown.]
Page 21
Results of Different Combinations (F1)

[Bar chart: F1 on Devel, WSJ, and Brown comparing the full Combined system, Col+Char1, Char1-5, and the Best Single system.]
Page 22
Conclusion
The ILP inference can naturally be extended to reason over multiple SRL systems.
Page 23
Thank You