
Page 1: Template-Based Event Extraction

Template-Based Event Extraction

Kevin Reschke – Aug 15th 2013

Martin Jankowiak, Mihai Surdeanu, Dan Jurafsky, Christopher Manning

Page 2: Template-Based Event Extraction

Outline

• Recap from last time
  • Distant supervision
  • Plane crash dataset
• Current work
  • Fully supervised setting
  • MUC4 terrorism dataset

Underlying theme: Joint Inference Models

Page 3: Template-Based Event Extraction

Goal: Knowledge Base Population

“… Delta Flight 14 crashed in Mississippi killing 40 …”

<Plane Crash>

<Flight Number = Flight 14>

<Operator = Delta>

<Fatalities = 40>

<Crash Site = Mississippi> …

News Corpus → Knowledge Base
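A minimal sketch of how one such knowledge-base entry could be represented (the field names here are assumptions for illustration, not the system's actual schema):

from dataclasses import dataclass
from typing import Optional

@dataclass
class PlaneCrashTemplate:
    """One knowledge-base entry; each field is a slot to fill from text."""
    flight_number: Optional[str] = None
    operator: Optional[str] = None
    fatalities: Optional[int] = None
    crash_site: Optional[str] = None

# The quoted sentence above would populate the slots like this:
event = PlaneCrashTemplate(flight_number="Flight 14", operator="Delta",
                           fatalities=40, crash_site="Mississippi")
print(event)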

Page 4: Template-Based Event Extraction

Distant Supervision

• Use known events to automatically label training data.

Training Knowledge-Base

<Plane crash>

<Flight Number = Flight 11>

<Operator = USAir>

<Fatalities = 200>

<Crash Site = Toronto>

One year after [USAir]Operator [Flight 11]FlightNumber crashed in [Toronto]CrashSite, families of the [200]Fatalities victims attended a memorial service in [Vancouver]NIL.
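A rough sketch of the distant supervision idea, assuming candidate mentions have already been found; the exact-string matching below is a simplification of the real alignment heuristics:

# Distant supervision: project known slot values from the training
# knowledge base onto mentions in the text to create (noisy) labels.

known_event = {
    "FlightNumber": "Flight 11",
    "Operator": "USAir",
    "Fatalities": "200",
    "CrashSite": "Toronto",
}

sentence_mentions = ["USAir", "Flight 11", "Toronto", "200", "Vancouver"]

def label_mentions(mentions, event):
    value_to_slot = {value: slot for slot, value in event.items()}
    # Any mention that does not match a known slot value is labeled NIL.
    return [(m, value_to_slot.get(m, "NIL")) for m in mentions]

print(label_mentions(sentence_mentions, known_event))
# e.g. [('USAir', 'Operator'), ('Flight 11', 'FlightNumber'),
#       ('Toronto', 'CrashSite'), ('200', 'Fatalities'), ('Vancouver', 'NIL')]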

Page 5: Template-Based Event Extraction

Plane Crash Dataset

• 80 plane crashes from Wikipedia infoboxes.
• Training set: 32; Dev set: 8; Test set: 40.
• Corpus: Newswire data from 1989 – present.

Page 6: Template-Based Event Extraction

Extraction Models

• Local Model
  • Train and classify each mention independently.
• Pipeline Model
  • Classify sequentially; use the previous label as a feature (see the sketch after this list).
  • Captures dependencies between labels.
  • E.g., Passengers and Crew go together: “4 crew and 200 passengers were on board.”
• Joint Model
  • Searn algorithm (Daumé III et al., 2009).
  • Jointly models all mentions in a sentence.
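A minimal sketch of the pipeline idea using scikit-learn; the features and tiny training set below are toy placeholders, not the actual system's features:

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training mentions: a head word plus the label of the previous mention
# in the same sentence ("none" when there is no previous mention).
train = [
    ({"head": "crew", "prev_label": "none"}, "Crew"),
    ({"head": "passengers", "prev_label": "Crew"}, "Passengers"),
    ({"head": "Toronto", "prev_label": "none"}, "CrashSite"),
    ({"head": "memorial", "prev_label": "CrashSite"}, "NIL"),
]

X, y = zip(*train)
clf = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(list(X), list(y))

# At prediction time, classify mentions left to right and feed each
# predicted label in as the prev_label feature of the next mention.
prev = "none"
for head in ["crew", "passengers"]:
    pred = clf.predict([{"head": head, "prev_label": prev}])[0]
    print(head, "->", pred)
    prev = pred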

Page 7: Template-Based Event Extraction

Results

                        Precision   Recall   F1
Baseline (Maj. Class)   0.026       0.237    0.047
Local Model             0.159       0.407    0.229
Pipeline Model          0.154       0.422    0.226
Joint Model             0.213       0.422    0.283

Page 8: Template-Based Event Extraction

Fully Supervised Setting: MUC4 Terrorism Dataset

• 4th Message Understanding Conference (1992).
• Terrorist activities in Latin America.
• 1700 docs (train / dev / test = 1300 / 200 / 200).
• 50/50 mix of relevant and irrelevant docs.

Page 9: Template-Based Event Extraction

MUC4 Task

• 5 slot types:
  • Perpetrator Individual (PerpInd)
  • Perpetrator Organization (PerpOrg)
  • Physical Target (Target)
  • Victim (Victim)
  • Weapon (Weapon)
• Task: identify all slot fills in each document.
  • Don’t worry about differentiating multiple events.

Page 10: Template-Based Event Extraction

MUC4 Example

THE ARCE BATTALION COMMAND HAS REPORTED THAT ABOUT 50 PEASANTS OF VARIOUS AGES HAVE BEEN KIDNAPPED BY TERRORISTS OF THE FARABUNDO MARTI NATIONAL LIBERATION FRONT [FMLN] IN SAN MIGUEL DEPARTMENT.

Slot fills annotated on the slide: 50 PEASANTS → Victim; TERRORISTS → PerpInd; FMLN → PerpOrg.

Page 11: Template-Based Event Extraction

MUC4 Example

THE ARCE BATTALION COMMAND HAS REPORTED THAT ABOUT 50 PEASANTS OF VARIOUS AGES HAVE BEEN KIDNAPPED BY TERRORISTS OF THE FARABUNDO MARTI NATIONAL LIBERATION FRONT [FMLN] IN SAN MIGUEL DEPARTMENT.

Mention-level labels annotated on the slide: ARCE BATTALION COMMAND → NIL; 50 PEASANTS → Victim; TERRORISTS → PerpInd; FMLN → PerpOrg; SAN MIGUEL DEPARTMENT → NIL.

Page 12: Template-Based Event Extraction

Baseline Results

• Local Mention Model
  • Multiclass logistic regression.
• Pipeline Mention Model
  • Previous non-NIL label (or “none”) is a feature for the current mention.

           Precision   Recall   F1
Local      0.522       0.448    0.478
Pipeline   0.578       0.405    0.471

Page 13: Template-Based Event Extraction

Observation 1

• Local context is insufficient.
• Need a sentence-level relevance measure (Patwardhan & Riloff, 2009).

Two bridges were destroyed . . .

  . . . in Baghdad last night in a resurgence of bomb attacks in the capital city.
  . . . and $50 million in damage was caused by a hurricane that hit Miami on Friday.
  . . . to make way for modern, safer bridges that will be constructed early next year.

Page 14: Template-Based Event Extraction

Baseline Models + Sentence Relevance

• Binary relevance classifier – unigram / bigram features (see the sketch after this table).
• HardSent: discard all mentions in irrelevant sentences.
• SoftSent: sentence relevance is a feature for mention classification.

                        Precision   Recall   F1
Local                   0.522       0.448    0.478
Local w/ HardSent       0.770       0.241    0.451
Local w/ SoftSent       0.527       0.446    0.478
Pipeline                0.578       0.405    0.471
Pipeline w/ SoftSent    0.613       0.429    0.500
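A sketch of the binary relevance classifier with unigram/bigram features, and of how HardSent and SoftSent would use its output; the training sentences and the 0.5 threshold are assumptions for illustration:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy relevance training data: 1 = relevant to a terrorism event, 0 = not.
sentences = [
    "two bridges were destroyed in a resurgence of bomb attacks",
    "peasants were kidnapped by terrorists",
    "a hurricane hit miami on friday causing damage",
    "safer bridges will be constructed early next year",
]
labels = [1, 1, 0, 0]

relevance = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),   # unigram + bigram features
    LogisticRegression(max_iter=1000),
)
relevance.fit(sentences, labels)

test = "bridges were destroyed in bomb attacks in the capital"
prob = relevance.predict_proba([test])[0, 1]

# HardSent: drop every mention in a sentence judged irrelevant.
keep_mentions = prob >= 0.5
# SoftSent: pass the relevance score on as a feature of each mention instead.
mention_features = {"head": "bridges", "sent_relevance": float(prob)}
print(prob, keep_mentions, mention_features)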

Page 15: Template-Based Event Extraction

Observation 2

• Sentence relevance depends on surrounding context (Huang & Riloff, 2012).

“Obama was attacked.” (political attack vs. terrorist attack)

“He used a gun.” (weapon in a terrorist event?)

Page 16: Template-Based Event Extraction

Joint Inference Models

• Idea: model sentence relevance and mention labels jointly – yield globally optimal decisions.
• Machinery: Conditional Random Fields (CRFs).
  • Model the joint probability of relevance labels and mention labels conditioned on input features (see the sketch below).
  • Encode dependencies among labels.
• Software: Factorie (http://factorie.cs.umass.edu)
  • Flexibly design CRF graph structures.
  • Learning / classification algorithms with exact and approximate inference.
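As a sketch of what “joint probability of relevance and mention labels” means here (the notation below is mine, not the slides’): with sentence relevance labels $s_i$, mention labels $m_j$, and input text $x$, a CRF of this shape defines

\[
p(\mathbf{s}, \mathbf{m} \mid x) \;=\; \frac{1}{Z(x)} \exp\!\Big( \sum_i \theta^{\top} f(s_i, x) \;+\; \sum_j \phi^{\top} g(m_j, x) \;+\; \sum_{(i,j)} \psi^{\top} h(s_i, m_j, x) \Big),
\]

where the pairwise factors $h$ connect each mention label to the relevance label of its sentence and $Z(x)$ normalizes over all joint label assignments. Training maximizes this conditional likelihood; decoding searches for the highest-scoring joint assignment.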

Page 17: Template-Based Event Extraction

First Pass

• Fully joint model.

[Factor graph: a single sentence-relevance node S connected to the mention nodes M M M in that sentence.]

• Approximate inference is a likely culprit.

                                Precision   Recall   F1
Mention Pipeline w/ SoftSent    0.613       0.429    0.500
Fully Joint Model               0.54        0.39     0.45

Page 18: Template-Based Event Extraction

Second Pass

• Two linear-chain CRFs with a relevance threshold (see the decoding sketch below).

[Factor graph: a linear chain over sentence-relevance nodes S S S and a linear chain over mention nodes M M M.]

                                Precision   Recall   F1
Mention Pipeline w/ SoftSent    0.613       0.429    0.500
Fully Joint Model               0.54        0.39     0.45
CRF: Mention Chain Only         0.485       0.470    0.470
CRF: Sentence Chain Only        0.535       0.422    0.463
CRF: Both Chains                0.513       0.470    0.481
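A sketch of the two-pass decoding: the sentence chain produces relevance scores, a threshold gates which sentences reach the mention chain. Every model call below is a hypothetical stand-in, not the authors’ Factorie code, and the threshold value is an assumption (the slides only say “relevance threshold”):

from typing import List, Tuple

REL_THRESHOLD = 0.5  # assumed value

def sentence_relevance_probs(sentences: List[str]) -> List[float]:
    # Stand-in for per-sentence marginals from the sentence-chain CRF.
    return [0.9 if "kidnapped" in s.lower() else 0.1 for s in sentences]

def label_mentions(sentence: str) -> List[Tuple[str, str]]:
    # Stand-in for decoding the mention-chain CRF over candidate mentions.
    return [("TERRORISTS", "PerpInd")] if "terrorists" in sentence.lower() else []

def extract(sentences: List[str]) -> List[Tuple[str, str]]:
    slot_fills = []
    for sent, prob in zip(sentences, sentence_relevance_probs(sentences)):
        if prob >= REL_THRESHOLD:  # the relevance threshold gates the mention chain
            slot_fills.extend(label_mentions(sent))
    return slot_fills

print(extract(["Peasants were kidnapped by terrorists.", "The weather was mild."]))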

Page 19: Template-Based Event Extraction

Analysis

• Many errors are reasonable extractions, but come from irrelevant documents.

• Learned CRF model weights:
  RelLabel<+,<NIL>> = -0.071687
  RelLabel<+,Vict>  =  0.716669
  RelLabel<-,Vict>  = -1.688919
  ...
  RelRel<+, +> = -0.609790
  RelRel<+, -> = -0.469663
  RelRel<-, +> = -0.634649
  RelRel<-, -> =  0.572855

Example: “The kidnappers were accused of kidnapping several businessmen for high sums of money.”

Page 20: Template-Based Event Extraction

Possibilities for improvement

• Label-specific relevance thresholds.
• Leverage coreference (skip-chain CRFs).
• Incorporate a document-level relevance signal.

Page 21: Template-Based Event Extraction

State of the art

• Huang & Riloff (2012)
  • P / R / F1: 0.58 / 0.60 / 0.59
  • CRF sentence model with local mention classifiers.
  • Textual cohesion features to model sentence chains.
  • Multiple binary mention classifiers (SVMs).

Page 22: Template-Based Event Extraction

Future Work

• Apply CRF models to the plane crash dataset.
• New terrorism dataset from Wikipedia.
• Hybrid models: combine supervised MUC4 data with distant supervision on Wikipedia data.

Page 23: Template-Based Event Extraction


Thanks!