research: entity identification on microblogs

Post on 23-Jan-2017


Entity Identification on Microblogs by CRF Model with Adaptive Dependency

Jun-Li Lu, Makoto P. Kato, Takehiro Yamamoto, Katsumi Tanaka

Dept. of Social Informatics, Kyoto University, Japan

@2015 IEEE/WIC/ACM International Conference on Web Intelligence (WI2015)


Outline

• Entity identification

• How an entity is mentioned

• Method
  • Feature
  • Conditional Random Field (CRF) model
  • Adaptive dependency

• Experiment results & conclusion

Problem definition: Entity identification on microblogs

Example microblog: "Jacoby is leaving for the rival and betrays Red Sox; Yankees seems aiming for championship."

Given a mention, which entity does it map to?

How is an entity mentioned?

[Figure: the microblog text "Our baseball team is the rival to Yankees" with its mentions linked to entities' articles. "Our baseball team" is labeled attribute, next to Boston Red Sox (article: "… is a professional baseball team"); "rival" is labeled relationship; "Yankees" is labeled name, next to New York Yankees (article: "… is a professional…").]

Direct-reference
• Name: the mention is a partial or full name of an entity

Indirect-reference
• Attribute: the mention describes an entity
• Relationship: the mention is the relationship between two entities
• Metaphor: the mention contains another entity's name but maps to a different entity


Related work

• Two sub-tasks: NER (Named Entity Recognition), NED (Named Entity Disambiguation)

• NER and NED jointly considered [TKDE2015, WWW2014]

• Mining additional context for NED, in addition to KB [KDD2013]

• Well-written documents vs. short, noisy microblogs [WWW2014]

• Efficient prediction algorithm [WSDM2015]

=> Past work focused on direct-reference


Our contribution

• Survey of indirect-reference
  • indirect-reference was not infrequent in microblogs

• Novel features for indirect-reference
  • topic-specific translation, “entity-known-as” pattern, …

• An efficient model that considers dependency between entities
  • predicting entities together with a CRF model
  • getting proper dependency among entities

Presenting flow

1. Introduction to Entity Identification
2. Feature: How to measure entities?
3. CRF Model with Adaptive Dependency: How to predict entities?
4. Experiment results

Previous-work features

How to measure entities?

1. Keyword
2. Context similarity
3. Entities’ correlation
4. Mention entity’s name
5. Occurrence frequency
6. User interest

[Figure: each feature scores a candidate entity (e.g., New York Yankees, Boston Red Sox) for the mention "yankees" in a microblog ("the Yankees is the rival to …"). Labels in the figure: Jaccard index, bag-of-words similarity, prob.(yankees), match, # of found documents, # of found cases, writer’s recent microblogs. Example texts in the figure: "…are a baseball team…", "In 2015, [New York Yankees] won championship", "[New York Yankees|yankees]".]
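The figure labels above name a Jaccard index and a bag-of-words similarity without giving formulas. The sketch below shows one common way to compute each. The function names, the whitespace tokenization, and the use of cosine similarity for the bag-of-words comparison are assumptions made for illustration, not details taken from the slides.

```python
import math
from collections import Counter


def jaccard_index(items_a, items_b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections, treated as sets."""
    set_a, set_b = set(items_a), set(items_b)
    union = set_a | set_b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(set_a & set_b) / len(union)


def bag_of_words_cosine(text_a, text_b):
    """Cosine similarity of term-count vectors (assumed similarity measure)."""
    bag_a = Counter(text_a.lower().split())
    bag_b = Counter(text_b.lower().split())
    dot = sum(bag_a[term] * bag_b[term] for term in bag_a.keys() & bag_b.keys())
    norm_a = math.sqrt(sum(count * count for count in bag_a.values()))
    norm_b = math.sqrt(sum(count * count for count in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical document ids in which two candidate entities are mentioned.
print(jaccard_index([1, 2, 3], [2, 3, 4]))  # → 0.5
```

With sets of document ids as input, the Jaccard index could serve as a correlation score between two candidate entities; with a microblog and an entity's article as input, the bag-of-words score could serve as a context similarity.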

For indirect-reference: topic-specific translation

• To get a microblog’s meaning based on topic knowledge: effective when the microblog is abstract

• How we did it

[Figure: the microblog "…the player is leaving for…" is translated with topic knowledge. Microblog-related data (news, responses, or the writer’s past microblogs) gives top proper nouns: “New York Yankees”, “Jacoby Ellsbury”. Topics shown: “baseball”, “soccer”. Top terms in the topic (related Wikipedia documents): “pitcher”, “outfielder”, “shortstop”, … Translation by semantic similarity: “player” => “outfielder”. The figure also shows player = “outfielder” / “goalkeeper” next to the two topics.]

For indirect-reference: pattern

• Effective when the mention is a common noun, e.g., it gives no hint of the entity’s name

• Pattern 1: entity-known-as
  • mention + known-as phrase: “pinstripes” + “known as”
  • matched text: “New York Yankees … known as … pinstripes”

• Pattern 2: entity-performing-action
  • action: “hit”
  • matched text: “Jacoby Ellsbury … hit”
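The two patterns can be illustrated as text matches. The sketch below checks whether a piece of text contains an entity name followed, within a bounded gap, by "known as" and the mention, or by an action word. The function names, the 80-character gap, and the regular-expression approach are assumptions for illustration; the slides do not say how the patterns were matched or against which texts.

```python
import re


def _gap(window):
    """Lazy gap of at most `window` characters (hypothetical helper)."""
    return r".{0,%d}?" % window


def matches_known_as(text, entity_name, mention, window=80):
    """True if text reads like '<entity> ... known as ... <mention>'."""
    pattern = (
        re.escape(entity_name) + _gap(window) + r"\bknown as\b"
        + _gap(window) + r"\b" + re.escape(mention) + r"\b"
    )
    return re.search(pattern, text, flags=re.IGNORECASE | re.DOTALL) is not None


def matches_performing_action(text, entity_name, action, window=80):
    """True if text reads like '<entity> ... <action>'."""
    pattern = re.escape(entity_name) + _gap(window) + r"\b" + re.escape(action) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE | re.DOTALL) is not None


# Made-up sentences used only to exercise the two patterns.
sentence = "The New York Yankees, also known as the Pinstripes, play in the Bronx."
print(matches_known_as(sentence, "New York Yankees", "pinstripes"))  # → True
print(matches_performing_action("Jacoby Ellsbury hit a home run.", "Jacoby Ellsbury", "hit"))  # → True
```

A match for a candidate entity could then be counted as evidence that the common-noun mention refers to that entity.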


Conditional Random Field (CRF) model

• To predict multiple entities together by proper dependency

• Linear + non-sequential CRF:
  • to make prediction tractable: linear time, $O(nc^2)$
  • with a cyclic CRF, time is exponential, $O(c^n)$
  • to allow proper dependency among entities

$n$: # of mentions per microblog; $c$: # of candidates per mention

[Figure: probability of $Y_2$ with and without a dependency on $Y_1$.]
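The $O(nc^2)$ bound is what max-product (Viterbi-style) inference costs when the links among entity variables contain no cycle: each link is processed once, and each link needs a $c \times c$ comparison. The sketch below is a generic max-product pass over a forest, not the authors' implementation; the function name, the score tables `unary` and `pairwise`, and their layout are assumptions.

```python
def map_on_forest(unary, edges, pairwise):
    """Best joint assignment when `edges` contains no cycle.

    unary[i][a]: score of candidate a for mention i.
    pairwise[(i, j)][a][b]: score of candidates a and b on link (i, j).
    Returns (assignment, total_score).
    """
    n = len(unary)
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def pair_score(i, a, j, b):
        # Each link is stored once, under the orientation used in `edges`.
        if (i, j) in pairwise:
            return pairwise[(i, j)][a][b]
        return pairwise[(j, i)][b][a]

    # Depth-first order in which every parent precedes its children.
    parent = [None] * n
    order = []
    visited = [False] * n
    for root in range(n):
        if visited[root]:
            continue
        visited[root] = True
        stack = [root]
        while stack:
            node = stack.pop()
            order.append(node)
            for nb in neighbors[node]:
                if not visited[nb]:
                    visited[nb] = True
                    parent[nb] = node
                    stack.append(nb)

    # Upward pass: fold each child's best score into its parent, O(c^2) per link.
    belief = [list(scores) for scores in unary]
    best_child = {}
    for node in reversed(order):
        p = parent[node]
        if p is None:
            continue
        for a in range(len(unary[p])):
            b = max(range(len(unary[node])),
                    key=lambda cand: belief[node][cand] + pair_score(p, a, node, cand))
            best_child[(node, a)] = b
            belief[p][a] += belief[node][b] + pair_score(p, a, node, b)

    # Downward pass: read off the best candidate for each mention.
    assignment = [None] * n
    total = 0.0
    for node in order:
        p = parent[node]
        if p is None:
            assignment[node] = max(range(len(unary[node])), key=lambda a: belief[node][a])
            total += belief[node][assignment[node]]
        else:
            assignment[node] = best_child[(node, assignment[p])]
    return assignment, total


# Toy scores: three mentions, two candidates each, links forming a chain.
unary = [[1.0, 0.5], [0.2, 0.9], [0.3, 0.3]]
edges = [(0, 1), (1, 2)]
pairwise = {(0, 1): [[0.0, -1.0], [0.0, 2.0]], (1, 2): [[0.5, 0.0], [0.0, 1.0]]}
assignment, total = map_on_forest(unary, edges, pairwise)
print(assignment)  # → [1, 1, 1]
```

If the links contained a cycle, this pass would no longer be exact, which is consistent with the exponential cost quoted for the cyclic case.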


Adaptive dependency

• To make proper dependency among entities

• By entities’ correlation

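[Figure: CRF model of adaptive dependency. Mentions $X_1, \ldots, X_5$ are each linked to an entity variable $Y_1, \ldots, Y_5$. Links among the $Y$'s are added by picking the pair $(i, j)$ with maximum adaptive dependency that does not make a cycle. Labels in the figure: “baseball” (to make high), “singing” (to make low).]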
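Picking the pair with maximum dependency while never closing a cycle is the same greedy rule as Kruskal's algorithm for a maximum spanning forest. The sketch below applies that rule with a union-find structure. The function name, the input format, and the choice to keep adding links until no cycle-free pair remains are assumptions; the slides do not give a stopping rule.

```python
def select_dependencies(n, dependency):
    """Greedily keep the strongest links that do not close a cycle.

    n: number of entity variables Y_0 .. Y_{n-1}.
    dependency: dict mapping a pair (i, j) to its dependency value.
    Returns the chosen pairs in the order they were picked.
    """
    parent = list(range(n))

    def find(u):
        # Union-find root lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    chosen = []
    ranked = sorted(dependency.items(), key=lambda item: item[1], reverse=True)
    for (i, j), _value in ranked:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # linking i and j does not make a cycle
            parent[root_i] = root_j
            chosen.append((i, j))
    return chosen


# Hypothetical dependency values among four entity variables.
values = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.7, (2, 3): 0.1}
print(select_dependencies(4, values))  # → [(0, 1), (1, 2), (2, 3)]
```

The pair (0, 2) is skipped because 0 and 2 are already connected through 1, so the result stays cycle-free and exact inference over it remains linear in the number of mentions.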


Prediction probability

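• CRF model

\[
p(\boldsymbol{y} \mid \boldsymbol{x}) = \frac{1}{z(\boldsymbol{x})} \exp\Big( \sum_{i} \Big[ \sum_{f \in F_\alpha} w_f f(y_i) + \sum_{f \in F_\beta} w_f f(x_i, y_i) \Big] + \sum_{(i,j) \in L} \sum_{f \in F_\gamma} w_f f(y_i, y_j) \Big)
\]

• CRF model with adaptive dependency

\[
p(\boldsymbol{y} \mid \boldsymbol{x}) = \frac{1}{z(\boldsymbol{x})} \exp\Big( \sum_{i} \Big[ \sum_{f \in F_\alpha} w_f f(y_i) + \sum_{f \in F_\beta} w_f f(x_i, y_i) \Big] + \sum_{l=(i,j) \in L} \sum_{f \in F_\gamma} \delta(l)\, w_f f(y_i, y_j) \Big)
\]

$\boldsymbol{y} = (y_1, \ldots, y_n)$: a set of entities; $\boldsymbol{x} = (x_1, \ldots, x_n)$: a set of mentions; $z(\boldsymbol{x})$: normalization; $w_f$: weight of feature $f$

$F_\alpha$: a set of features of an entity

$F_\beta$: a set of features of an entity and a mention

$F_\gamma$: a set of features of two entities

$L$: a set of connections between $Y_i$, $1 \le i \le n$

$\delta(l)$, adaptive dependency: top-$k$ value of $\sum_{f \in F_\beta} w_f f(x_i, y_i) + \sum_{f \in F_\beta} w_f f(x_j, y_j) + \sum_{f \in F_\gamma} w_f f(y_i, y_j)$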
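To make the formula concrete, the sketch below evaluates $p(\boldsymbol{y} \mid \boldsymbol{x})$ by brute-force enumeration on a toy instance. It assumes the weighted feature sums have already been collapsed into tables: `node_scores[i][a]` stands for the bracketed per-mention term and `pair_scores[l][a][b]` for the pairwise term on link `l`. These names, and treating $\delta(l)$ as a number supplied per link, are illustrative assumptions. Enumeration costs $O(c^n)$, so this only checks the definition on tiny inputs.

```python
import itertools
import math


def log_score(y, node_scores, links, pair_scores, delta=None):
    """Exponent of the CRF formula for one joint assignment y."""
    total = sum(node_scores[i][y_i] for i, y_i in enumerate(y))
    for link in links:
        i, j = link
        weight = 1.0 if delta is None else delta[link]  # delta(l); 1 means plain CRF
        total += weight * pair_scores[link][y[i]][y[j]]
    return total


def probability(y, node_scores, links, pair_scores, delta=None):
    """p(y | x) with z(x) computed by enumerating every joint assignment."""
    candidate_ranges = [range(len(scores)) for scores in node_scores]
    z = sum(
        math.exp(log_score(other, node_scores, links, pair_scores, delta))
        for other in itertools.product(*candidate_ranges)
    )
    return math.exp(log_score(y, node_scores, links, pair_scores, delta)) / z


# Toy instance: two mentions, two candidates each, one link between them.
node_scores = [[0.0, 1.0], [0.5, 0.0]]
links = [(0, 1)]
pair_scores = {(0, 1): [[0.0, 0.0], [0.0, 2.0]]}
plain = probability((1, 1), node_scores, links, pair_scores)
damped = probability((1, 1), node_scores, links, pair_scores, delta={(0, 1): 0.0})
print(plain > damped)  # → True
```

Setting $\delta(l)$ to 0 removes the link's contribution, so the jointly supported assignment (1, 1) loses the probability mass that the pairwise term gave it.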


Experiment outline

• Microblog annotation

• Candidate entity generation

• Performance
  • Overall: features + model
  • Feature comparison
  • CRF model with adaptive dependency


Microblog annotation

• Credible ground-truth: 3 annotators on 500 random tweets from Twitter (2014/10)

• Annotation result:

=> Multiple mentions in a microblog (2.61 per tweet)

=> Indirect-reference was not infrequent (indirect:direct≈2:3)

Twitter tag      # Tweets   # Mentions   # Direct-ref.   # Indirect-ref.
#Yankees         86         228          153             108
#Obama           92         227          167             87
#Ebola           97         241          151             156
#Nobel           94         287          228             124
#Islam           92         219          151             95
Mean per tweet              2.61         1.84            1.24
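The "Mean per tweet" row can be reproduced from the per-tag counts in the table. The short check below sums each column and divides by the total number of tweets listed; the variable names are illustrative. The per-tag tweet counts in the table sum to 461.

```python
# Rows copied from the annotation table: (tweets, mentions, direct, indirect).
rows = {
    "#Yankees": (86, 228, 153, 108),
    "#Obama": (92, 227, 167, 87),
    "#Ebola": (97, 241, 151, 156),
    "#Nobel": (94, 287, 228, 124),
    "#Islam": (92, 219, 151, 95),
}

tweets = sum(row[0] for row in rows.values())
column_totals = [sum(row[k] for row in rows.values()) for k in (1, 2, 3)]
means = tuple(round(total / tweets, 2) for total in column_totals)

print(tweets)  # → 461
print(means)   # → (2.61, 1.84, 1.24)

# Ratio of indirect to direct references, quoted on the slide as roughly 2:3.
indirect_to_direct = column_totals[2] / column_totals[1]
```

The ratio comes out at about 0.67, which matches the indirect:direct ≈ 2:3 statement.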


Candidate entity generation

• Direct reference: mention is partial or full name of entity

• Indirect reference: mention is included in entity’s main page in Wikipedia

[Figure: percentage of ground-truth entities included in the candidates (%, axis ticks 30 to 90) against the size of the top candidate entities, with one curve for direct reference and one for indirect reference.]

=> Weak for indirect-reference


Baseline method

• Baseline-model: sequence-rank, one by one
  • $\arg\max_{e \in C} \, p(y_i = e \mid y_1, \ldots, y_{i-1}, y_{i+1}, \ldots, y_n)$
  • $C$: candidates for $y_i$

[Figure: entity variables $Y_1, \ldots, Y_5$ with their ranking order, 1st to 5th.]

Baseline-features:
• Context similarity
• Entities’ correlation
• Mention entity’s name
• User interest
• Occurrence frequency
• Keyword

Our features:
• Topic-specific translation
• Pattern
• Writing behavior


Overall performance

• Our CRF model (or all features) was always better

=> CRF model works regardless of features

=> Multiple features are required

$\mathrm{MRR} = \frac{1}{q} \sum_{i=1}^{q} \frac{1}{\mathrm{rank}_i}$, where $q$: # of tests; $\mathrm{rank}_i$: rank position of the ground-truth entity at test $i$
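The MRR definition translates directly into code. In the sketch below the function name is illustrative, and letting a rank of `None` (ground-truth entity not ranked at all) contribute 0 is an assumption; the slides do not say how such cases were handled.

```python
def mean_reciprocal_rank(ranks):
    """Mean of 1/rank over all tests; ranks are 1-based positions.

    A rank of None contributes 0 (assumed handling of a ground-truth
    entity that does not appear in the ranking).
    """
    if not ranks:
        raise ValueError("at least one test is required")
    reciprocal_sum = sum(0.0 if rank is None else 1.0 / rank for rank in ranks)
    return reciprocal_sum / len(ranks)


# Four hypothetical tests with the ground truth at ranks 1, 1, 2 and 4.
print(mean_reciprocal_rank([1, 1, 2, 4]))  # → 0.6875
```

An MRR of 1 means the ground-truth entity was always ranked first; values near 0 mean it was usually ranked low or missing.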

[Figure: bar chart of MRR (+SEM), axis from 0 to 0.7, comparing Our CRF and Baseline-model, each with All-feature (including ours) and Baseline-feature; two comparisons are marked **.]

Feature comparison

• Our feature was effective for indirect-reference

[Figure, panel 1: MRR for indirect-reference (+SEM), axis from 0 to 0.5. Features in the order shown: Topic-specific translation (Eq. 1a-b), Occurrence frequency, Entities' correlation, Topic-specific translation (Eq. 1c-f), Keyword, Context similarity, Pattern, Writing behavior, Mention entity's name, User interest.]

[Figure, panel 2: MRR for direct-reference (+SEM), axis from 0 to 0.8. Features in the order shown: Occurrence frequency, Mention entity's name, Topic-specific translation (Eq. 1a-b), Entities' correlation, Topic-specific translation (Eq. 1c-f), Pattern, Context similarity, Writing behavior, Keyword, User interest.]

Legend: Our feature, Baseline-feature

Effect of CRF model with adaptive dependency

• Our adaptive dependency was a little worse than the best
  • but note that our complexity is linear

[Figure: MRR (+SEM), axis from 0 to 0.7, for five dependency structures: Fully connected, Adaptive, Occurrence order (appearing order), Random, No dependency. Complexity labels as extracted: $O(c^n)$, $O(nc^2)$, $O(nc)$, $O(nc^2)$, $O(nc^2)$.]

Conclusion

• Contribution:
  • Surveyed microblogs for indirect-reference
  • Effective features for indirect-reference
  • Accurate and efficient: CRF model with adaptive dependency

• Finding:
  • Candidate generation was weak for indirect-reference
  • Limited performance of some novel features
  • Multiple features were required when direct and indirect references are mixed

• Thank you for listening
