
Multi-Task Transfer Learning for Weakly-Supervised Relation Extraction

Jing Jiang, Singapore Management University

ACL-IJCNLP 2009

Aug 5, 2009 ACL-IJCNLP 2009 2

Relation Extraction

• Task definition: to label the semantic relation between a pair of entities in a sentence (fragment)

…[leader arg-1] of a minority [government arg-2]…

PHYS PER-SOC EMP-ORG NIL

PHYS: Physical
PER-SOC: Personal / Social
EMP-ORG: Employment / Membership / Subsidiary


Supervised Learning

• Current solution: supervised machine learning (e.g. [Zhou et al. 2005], [Bunescu & Mooney 2005], [Zhang et al. 2006])

• Training data is needed for each relation type

…[leader arg-1] of a minority [government arg-2]…

arg-1 word: leader
arg-2 type: ORG
dependency: arg-1 of arg-2

EMP-ORG PHYS PER-SOC NIL


Challenge in Practice

• New relation type (in a new domain): no training data or a few seed instances

• In this work, we study weakly-supervised relation extraction
– A few seed instances of the target relation type
– Many instances of other auxiliary relation types
– Additional human knowledge about the target relation type

• Main idea: Auxiliary relation types can help!


Syntactic Similarity across Relation Types

…[leader arg-1] of a minority [government arg-2]… → EMP-ORG

arg-1 word: leader
arg-2 type: ORG
dependency: arg-1 of arg-2

the youngest [son arg-1] of ex-director [Suharto arg-2] → PER-SOC

the [Socialist People’s Party arg-1] of [Montenegro arg-2] → GPE-AFF


Syntactic Similarity

Syntactic Pattern | Relation Instance | Relation Type (Subtype)
arg-2 arg-1 | Arab leaders | OTHER-AFF (Ethnic)
arg-2 arg-1 | his father | PER-SOC (Family)
arg-2 arg-1 | South Jakarta Prosecution Office | GPE-AFF (Based-in)
arg-1 [verb] arg-2 | Yemen [sent] planes to Baghdad | ART (User-or-Owner)
arg-1 [verb] arg-2 | His wife [had] three young children | PER-SOC (Family)
arg-1 [verb] arg-2 | Jody Scheckter [paced] Ferrari to both victories | EMP-ORG (Employ-Staff)


Problem Formulation based on Transfer Learning

• Domain adaptation and transfer learning (e.g. [Blitzer et al. 2006], [Daumé III 2007])

our goal: PER-SOC EMP-ORG

• We apply our previous framework ([Jiang & Zhai 2007b])

– Similar in spirit to [Evgeniou & Pontil 2004] and [Daumé III 2007]


Review of Relation Extraction Basics

Linear classifier: $f(x) = w^\top x$, where $x$ is the feature vector and $w$ is the weight vector of the linear classifier.

…[leader arg-1] of a minority [government arg-2]…

Feature | x | w
arg-2 type: ORG | 1 | 4.5
arg-2 type: PER | 0 | 0.3
dependency: arg-1 of arg-2 | 1 | 6.7

Label: EMP-ORG
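The linear scoring rule on this slide can be sketched in a few lines of Python. This is only an illustration: the feature names and weights are copied from the slide's example, pairing each weight with a feature is a reading of the figure, and the helper name `score` is hypothetical.

```python
# Illustrative weights; the feature-to-weight pairing is an assumption
# based on the layout of the slide's figure.
weights = {
    "arg-2 type: ORG": 4.5,
    "arg-2 type: PER": 0.3,
    "dependency: arg-1 of arg-2": 6.7,
}


def score(active_features, weights):
    """Compute f(x) = w^T x for a binary x given as a set of active feature names."""
    return sum(weights.get(name, 0.0) for name in active_features)


# Features that fire for "...[leader] of a minority [government]...".
x = {"arg-2 type: ORG", "dependency: arg-1 of arg-2"}
f_x = score(x, weights)
```

Because the features are binary, the dot product reduces to summing the weights of the features that fire; features missing from `weights` contribute zero.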


General vs. Specific Features

Assumption: some features are commonly useful for different relation types, while other features are specific for individual relation types

$w_T = \mu_T + F^\top \nu$
$w_k = \mu_k + F^\top \nu$ for $k = 1, \ldots, K$

$w_T$: weight vector for target type
$w_k$: weight vector for k’th auxiliary type
$\nu$: common weight vector in a lower H dimensional space
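A small sketch may help make the decomposition concrete. It assumes each weight vector is a type-specific part plus a shared part, w = mu + F^T nu, where F is taken to be an H-by-(number of features) matrix and nu is the common H-dimensional vector. The toy sizes, the values, and the helper name `compose_weights` are all hypothetical.

```python
def compose_weights(mu, F, nu):
    """Return w = mu + F^T nu, where F has H rows and len(mu) columns."""
    return [
        mu[j] + sum(F[h][j] * nu[h] for h in range(len(F)))
        for j in range(len(mu))
    ]


# Toy setting (assumed): 4 features, H = 2 shared dimensions.
F = [
    [1.0, 0.0, 0.0, 0.0],   # shared dimension 0 maps onto feature 0
    [0.0, 0.0, 1.0, 0.0],   # shared dimension 1 maps onto feature 2
]
nu = [2.0, 3.0]                    # common to all relation types
mu_target = [0.0, 0.5, 0.0, 0.0]   # specific to the target type
mu_aux = [0.0, 0.0, 0.0, -1.0]     # specific to one auxiliary type

w_target = compose_weights(mu_target, F, nu)
w_aux = compose_weights(mu_aux, F, nu)
```

In this toy case both types receive the same weight on features 0 and 2 (the general features) and differ only on the features covered by their own mu.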


Learning Framework

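$$(\hat{\mu}_T, \{\hat{\mu}_k\}_{k=1}^{K}, \hat{\nu}) = \arg\min_{\mu_T, \{\mu_k\}, \nu, F} \Big[ L(D_T, \mu_T + F^\top \nu) + \sum_{k=1}^{K} L(D_k, \mu_k + F^\top \nu) + \lambda_\mu^T \|\mu_T\|^2 + \sum_{k=1}^{K} \lambda_\mu^k \|\mu_k\|^2 + \lambda_\nu \|\nu\|^2 \Big]$$

• $L(D_T, \cdot)$: loss function on the target seed instances
• $L(D_k, \cdot)$: loss function on the auxiliary training instances
• $\lambda_\mu^k = 10^4$, $\lambda_\nu = 1$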
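The objective on this slide can be read as two loss terms plus squared-norm regularizers on the type-specific vectors and on the shared vector. The sketch below evaluates such an objective under stated assumptions: a logistic loss is used because the slide does not name the loss, weights are composed as mu + F^T nu, one value of lambda is shared by all auxiliary types, and every function and variable name is hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def compose(mu, F, nu):
    """w = mu + F^T nu, with F given as H rows of length len(mu)."""
    return [m + sum(F[h][j] * nu[h] for h in range(len(F))) for j, m in enumerate(mu)]


def log_loss(data, w):
    """Sum of logistic losses over (x, y) pairs with y in {-1, +1} (assumed loss)."""
    return sum(math.log1p(math.exp(-y * dot(w, x))) for x, y in data)


def objective(D_T, D_aux, mu_T, mu_aux, nu, F, lam_mu_T, lam_mu_k, lam_nu):
    """Target loss + auxiliary losses + regularizers on mu_T, each mu_k, and nu."""
    value = log_loss(D_T, compose(mu_T, F, nu))
    for D_k, mu_k in zip(D_aux, mu_aux):
        value += log_loss(D_k, compose(mu_k, F, nu))
    value += lam_mu_T * dot(mu_T, mu_T)
    value += sum(lam_mu_k * dot(mu_k, mu_k) for mu_k in mu_aux)
    value += lam_nu * dot(nu, nu)
    return value


# Toy data (assumed): one target seed instance, one auxiliary type with two instances.
D_T = [([1.0, 0.0], 1)]
D_aux = [[([0.0, 1.0], -1), ([1.0, 1.0], 1)]]
F = [[0.0, 1.0]]
zero_value = objective(D_T, D_aux, [0.0, 0.0], [[0.0, 0.0]], [0.0], F, 1e4, 1e4, 1.0)
```

With all parameters at zero every instance contributes log 2 and the regularizers vanish. A large lambda on a mu vector makes type-specific weight expensive, which pushes the model to explain the data through the shared vector nu.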


General Features

Which subset of features should be captured by the common component $F^\top \nu$?

$w_T = \mu_T + F^\top \nu$
$w_k = \mu_k + F^\top \nu$ for $k = 1, \ldots, K$

$\nu$: common weight vector in a lower H dimensional space


Feature Separation

• Automatic separation within the learning framework (see [Jiang & Zhai 2007b])

• Human guidance
– Argument word features: features that contain the head word of an argument
  • E.g. arg-1 word: sister
– Entity type features: features that contain the entity type (subtype) of an argument
  • E.g. arg-2 type: ORG

• Combined
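The human-guided option above amounts to marking argument word features and entity type features as type-specific by rule. A minimal sketch of that partition follows. The feature-name patterns are assumptions modeled on the slide's two examples, the helper name `split_features` is hypothetical, and how the partition is then enforced inside the learning framework is not shown on the slide.

```python
import re

# Patterns are assumptions modeled on "arg-1 word: sister" and "arg-2 type: ORG".
ARG_WORD = re.compile(r"^arg-[12] word: ")
ENTITY_TYPE = re.compile(r"^arg-[12] (type|subtype): ")


def split_features(feature_names):
    """Partition feature names into (general candidates, hypothesized type-specific)."""
    general, specific = [], []
    for name in feature_names:
        if ARG_WORD.match(name) or ENTITY_TYPE.match(name):
            specific.append(name)
        else:
            general.append(name)
    return general, specific


general, specific = split_features([
    "arg-1 word: sister",
    "arg-2 type: ORG",
    "dependency: arg-1 of arg-2",
])
```

Here the dependency feature stays in the general pool, while the word and entity type features are set aside as type-specific.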


Imposing Entity Type Constraint

• Fix the possible entity types for the arguments for the target relation type

• Filter out the relation instances that do not satisfy the constraint in the end
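As a rough illustration of this post-filtering step, the sketch below keeps only predicted target instances whose argument entity types appear in an allowed set. The `Candidate` record, the example constraint (both arguments of type PER for a Personal/Social target), and the function name are hypothetical and not taken from the slides.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    text: str
    arg1_type: str
    arg2_type: str


# Hypothetical constraint for a Personal/Social target type.
ALLOWED_TYPES = {("PER", "PER")}


def filter_by_entity_types(predicted, allowed):
    """Drop predicted target instances whose argument types violate the constraint."""
    return [c for c in predicted if (c.arg1_type, c.arg2_type) in allowed]


predicted = [
    Candidate("his father", "PER", "PER"),
    Candidate("leader of a minority government", "PER", "ORG"),
]
kept = filter_by_entity_types(predicted, ALLOWED_TYPES)
```

Filtering at the end can only remove predictions, so it would be expected to raise precision while leaving recall unchanged or slightly lower.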


Experiment Setup

• ACE 2004, 7 relation types
– 6 types: auxiliary types
– 1 type: target type

• 5-fold cross validation

• # seed instances: 10


Methods Compared

• BL: train on seed instances only
• BL-A: train on seed and auxiliary training instances together w/o feature separation
• TL-auto: transfer learning w/ automatic feature separation
• TL-guide: transfer learning w/ human-guided feature separation
• TL-comb: automatic feature separation combined with human guidance
• TL-NE: TL-comb + entity type constraint


Comparison

Target Type | Metric | BL | BL-A | TL-auto | TL-guide | TL-comb | TL-NE
Physical | P | 0.000 | 0.1692 | 0.2920 | 0.2934 | 0.3325 | 0.5056
Physical | R | 0.000 | 0.0848 | 0.1696 | 0.1722 | 0.2383 | 0.2316
Physical | F | 0.000 | 0.1130 | 0.2146 | 0.2170 | 0.2777 | 0.3176
Personal/Social | P | 1.000 | 0.0804 | 0.1005 | 0.3069 | 0.3214 | 0.6412
Personal/Social | R | 0.0386 | 0.1708 | 0.1598 | 0.7245 | 0.7686 | 0.7631
Personal/Social | F | 0.0743 | 0.1093 | 0.1234 | 0.4311 | 0.4533 | 0.6969
Employment/Membership/Subsidiary | P | 0.9231 | 0.3561 | 0.5230 | 0.5428 | 0.5973 | 0.7145
Employment/Membership/Subsidiary | R | 0.0075 | 0.1850 | 0.2617 | 0.2648 | 0.3632 | 0.3601
Employment/Membership/Subsidiary | F | 0.0148 | 0.2435 | 0.3488 | 0.3559 | 0.4518 | 0.4789
Average | P | 0.8124 | 0.1475 | 0.2412 | 0.2703 | 0.2992 | 0.4231
Average | R | 0.0212 | 0.2432 | 0.3832 | 0.4764 | 0.5509 | 0.5464
Average | F | 0.0406 | 0.1532 | 0.2532 | 0.2958 | 0.3423 | 0.4132
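The F rows for individual target types appear to be the harmonic mean of the P and R rows. The short check below recomputes F from three (P, R, F) triples copied from the Personal/Social rows. The function name `f1` and the choice of rows are illustrative, and only agreement up to rounding should be expected.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (P, R, F) triples copied from the Personal/Social rows of the table.
rows = {
    "BL": (1.000, 0.0386, 0.0743),
    "TL-comb": (0.3214, 0.7686, 0.4533),
    "TL-NE": (0.6412, 0.7631, 0.6969),
}

# Largest difference between the recomputed and the reported F value.
max_gap = max(abs(f1(p, r) - f) for p, r, f in rows.values())
```

The Average rows do not satisfy this identity, which would be consistent with each of P, R, and F being averaged separately rather than F being recomputed from the averaged P and R.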


Effect of λ

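$\lambda_\mu^T$ | 100 | 1000 | 10000
P | 0.6265 | 0.3162 | 0.2992
R | 0.1170 | 0.3959 | 0.5509
F | 0.1847 | 0.2983 | 0.3423

Performance of TL-comb. $\lambda_\mu^k = 10^4$, $\lambda_\nu = 1$.

Regularization terms: $\lambda_\mu^T \|\mu_T\|^2 + \sum_{k=1}^{K} \lambda_\mu^k \|\mu_k\|^2 + \lambda_\nu \|\nu\|^2$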


Number of Seed Instances


Sensitivity of H


Conclusions

• We proposed to apply a multi-task transfer learning framework to the weakly-supervised relation extraction problem.

• We defined two kinds of type-specific features.

• Our experiments show that automatic feature separation combined with human guidance and entity type constraint can significantly outperform the baselines.


Thank You!

• Questions?


Related Work

• [Zhou et al. 2008]: Different way of modeling commonality among relation types.

• [Banko & Etzioni, 2008]: Open-domain relation extraction. No target relation type.

• [Xu et al. 2008]: Rule-based adaptation. Same type.


Hypothesized Type-Specific Features