coreference resolution talk

Entities in Natural Language Anaphora Resolution Coreference Resolution References Coreference Resolution Hinrich Schütze and Desislava Zhekova CIS, LMU [email protected] June 21, 2013 Hinrich Schütze and Desislava Zhekova Coreference Resolution

Upload: nparslow

Post on 13-Dec-2015

Page 1: coreference resolution talk

Entities in Natural Language | Anaphora Resolution | Coreference Resolution | References

Coreference Resolution

Hinrich Schütze and Desislava Zhekova

CIS, LMU [email protected]

June 21, 2013

Hinrich Schütze and Desislava Zhekova Coreference Resolution

Page 2: coreference resolution talk

Outline

1 Entities in Natural Language
    Understanding Natural Language
    The use of Entities in Natural Language
    Reference Resolution

2 Anaphora Resolution
    The Task of Anaphora Resolution
    Types of Anaphora

3 Coreference Resolution
    Rule-based approaches to CR
    Machine Learning approaches to CR
    Subtasks of CR

Page 3: coreference resolution talk

Understanding Natural Language | The use of Entities in Natural Language | Reference Resolution

Understanding Natural Language

John: Mary baked a vanilla slice for the birthday party.

Bob: Really?

Page 4: coreference resolution talk

The use of Entities in Natural Language

John: [Mary]1 baked [a vanilla slice]2 for [the birthday party]3.

Bob: Really?

Page 5: coreference resolution talk

Reference Resolution

John: [Mary]1 baked [a vanilla slice]2 for [the birthday party]3. Unfortunately, [she]4 forgot [the cake]5 in [the oven]6.

Bob: Really?

Page 6: coreference resolution talk

Reference Resolution

#  Referring Expression       Referent
1  Mary, she                  Mary
2  vanilla slice, the cake    the vanilla slice cake
3  the birthday party         the birthday party
4  the oven                   the oven

Page 7: coreference resolution talk

Reference Resolution

Why is this helpful to NLP?

Let us ask the Natural Language Question Answering System START some questions using reference:

What does the question sequence (Who is the Queen of England? What is her age?) return?

What does the question sequence (Who is James Bond? Who is the Queen of England? What is his age?) return?

What does the question sequence (Who is James Bond? Who is the Queen of England? How old is this person?) return?

http://start.csail.mit.edu

Page 8: coreference resolution talk

Reference Resolution

The ambiguity of referring expressions is often resolved by humans via clarification questions:

Did you mean James Bond?

Did you mean the Queen?

How old is who?

Who did you mean?

Page 9: coreference resolution talk

Reference Resolution

John: [Mary]1 baked [a vanilla slice]2 for [the birthday party]3. Unfortunately, [she]4 forgot [the cake]5 in [the oven]6.

Bob: Really?

#  First mention    Reference
1  Mary             she
2  vanilla slice    the cake

Page 10: coreference resolution talk

Reference Resolution

antecedent - the expression that precedes a referring expression and refers to the same discourse entity

anaphor - the referring expression that points back to an entity already introduced into the discourse

anaphoric relation - the relation that binds the antecedent and the anaphor

Page 11: coreference resolution talk

Reference Resolution

Anaphora Resolution

Coreference Resolution

Page 12: coreference resolution talk

The Task of Anaphora Resolution | Types of Anaphora

Anaphora Resolution

Anaphora Resolution (AR) - the task of identifying the antecedent, previously introduced into the discourse, of a target word or phrase. [Mitkov, 2002]

Page 13: coreference resolution talk

Anaphora Resolution

Page 14: coreference resolution talk

Anaphora Resolution

find the correct antecedent for each anaphor

once one antecedent is found, the task is complete - AR does not detect all antecedents in the given discourse

Page 15: coreference resolution talk

Types of Anaphora

The various types of anaphora may be distinguished:

according to the form of the anaphor (e.g. pronominal anaphora, lexical noun phrase anaphora, verb/adverb anaphora, zero anaphora)

according to the locations of the anaphor and the antecedent (e.g. intrasentential anaphora, intersentential anaphora, interdocument anaphora)

other (e.g. identity-of-reference anaphora, identity-of-sense anaphora, cataphora)

Page 16: coreference resolution talk

Rule-based approaches to CR | Machine Learning approaches to CR | Subtasks of CR

Coreference Resolution

Coreference resolution (CR) - the process of identifying the referring expressions in a discourse that are associated with the same entity and grouping them into equivalence classes.

mention/markable - potentially anaphoric phrase

coreference chain - an equivalence class, i.e. a set of mentions referring to the same entity
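
To make the notion of equivalence classes concrete, here is a minimal sketch that groups mentions into coreference chains from a list of pairwise coreference links using union-find. The mentions and links come from the Mary example earlier in the talk; the function name `build_chains` is hypothetical, and mentions are assumed to be unique strings (a real system would use mention IDs or token spans instead).

```python
def build_chains(mentions, links):
    """Group mentions into equivalence classes given pairwise coreference links."""
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent pointers to the class representative, halving the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in links:
        parent[find(a)] = find(b)

    chains = {}
    for m in mentions:  # textual order, so each chain keeps mention order
        chains.setdefault(find(m), []).append(m)
    return list(chains.values())


mentions = ["Mary", "a vanilla slice", "the birthday party",
            "she", "the cake", "the oven"]
links = [("she", "Mary"), ("the cake", "a vanilla slice")]
chains = build_chains(mentions, links)
for chain in chains:
    print(chain)
```

Mentions that take part in no link end up in a chain of length one.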

Page 17: coreference resolution talk

Coreference Resolution

Page 18: coreference resolution talk

Coreference Resolution

Larry King: Hello hello Jay Georgia hello.

caller_3: Ah thank (you) (Larry). And (Mike) (I) loved ((your) book). (It) was great. And toward the end of the (book) (you) said Secretary (Putin of Russia) had asked (you) to come over and (interview) (him). Had (you) done (that)? Uh and (I)’d like to know about (it). Thank (you) so much.

Mike Wallace: Yeah.

Larry King: (I) did interview (Putin) yes.

Mike Wallace: on the sixtieth anniversary of the uh end of World War Two (he) asked (me) to come on over and (interview) (him). And (it) was carried uh in a lot of places. But (I) tell you something. (Putin) to (my) way of thinking who calls (himself) a democrat - (He)’s not our kind of democrat.

Page 19: coreference resolution talk

Hands On

How many coreference chains do these mentions form?

Which are the chains?

Page 20: coreference resolution talk

Coreference Resolution

Larry King: Hello hello Jay Georgia hello.

caller_3: Ah thank (you1) (Larry1). And (Mike2) (I3) loved ((your2) book4). (It4) was great. And toward the end of the (book4) (you2) said Secretary (Putin of Russia5) had asked (you2) to come over and (interview6) (him5). Had (you2) done (that6)? Uh and (I3)’d like to know about (it6). Thank (you2) so much.

Mike Wallace: Yeah.

Larry King: (I1) did interview (Putin5) yes.

Mike Wallace: on the sixtieth anniversary of the uh end of World War Two (he5) asked (me2) to come on over and (interview7) (him5). And (it7) was carried uh in a lot of places. But (I2) tell you something. (Putin5) to (my2) way of thinking who calls (himself5) a democrat - (He5)’s not our kind of democrat.

Page 21: coreference resolution talk

Coreference Resolution

Page 22: coreference resolution talk

Reference Resolution

John: [Mary]1 baked [a vanilla slice]2 for [the birthday party]3. Unfortunately, [she]4 forgot [the cake]5 in [the oven]6.

Bob: Really?

What about mentions such as [the birthday party]3 and [the oven]6? These entities are introduced once but never referred to again!

Page 23: coreference resolution talk

Coreference Resolution

singletons - mentions that refer to an entity in the text that no other mention refers to

Page 24: coreference resolution talk

Coreference Resolution

So, how do we identify coreferential relations? Similar to the WSD task that we previously discussed, we have two different approaches:

rule-based approaches

machine learning approaches

Page 25: coreference resolution talk

Rule-based CR

Rule-based approaches rely on:

the availability of lexical and encyclopedic knowledge

manually handcrafted rules

Page 26: coreference resolution talk

Rule-based CR

Page 27: coreference resolution talk

Rule-based CR

Page 28: coreference resolution talk

Machine Learning for CR

Machine learning for CR tries to address the drawbacks of rule-based approaches:

the cost for manually developing rules

the cost for maintaining and extending the rules

Page 29: coreference resolution talk

Machine Learning for CR

Page 30: coreference resolution talk

Coreference Models

Coreference Resolution is often represented as a binary classification task and there are several CR models that can be used for this purpose [Rahman and Ng, 2011]:

mention-pair model

mention-ranking model

entity-mention model

cluster-ranking model

Page 31: coreference resolution talk

Evaluation

The most widely used evaluation metrics are MUC, B3, both CEAF variants (CEAFe and CEAFm), and BLANC. None of them, however, provides an optimal evaluation, as the two baselines (singletons and all-in-one) demonstrate.

Page 32: coreference resolution talk

Baselines

Page 33: coreference resolution talk

Baselines

             MUC                CEAF               B3                 BLANC
             R     P     F1     R     P     F1     R     P     F1     R     P     BLANC
SINGLETONS   0.0   0.0   0.0    71.2  71.2  71.2   71.2  100   83.2   50.0  49.2  49.6
ALL-IN-ONE   100   29.2  45.2   10.5  10.5  10.5   100   3.5   6.7    50.0  0.8   1.6
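
The behaviour of the two baselines can be reproduced on a toy document. The sketch below is a plain reading of the standard B3 and MUC definitions, not the official scorer; it assumes that key and response contain exactly the same mentions, and it uses the Mary example from earlier in the talk as gold data, so the numbers differ from the table above. The function names are hypothetical.

```python
def b_cubed(key, response):
    """Return (recall, precision) of B3, averaged over all mentions."""
    key_of = {m: set(chain) for chain in key for m in chain}
    resp_of = {m: set(chain) for chain in response for m in chain}
    mentions = list(key_of)
    recall = sum(len(key_of[m] & resp_of[m]) / len(key_of[m])
                 for m in mentions) / len(mentions)
    precision = sum(len(key_of[m] & resp_of[m]) / len(resp_of[m])
                    for m in mentions) / len(mentions)
    return recall, precision


def muc_recall(key, response):
    """MUC recall; swap the arguments to obtain MUC precision."""
    resp_id = {m: i for i, chain in enumerate(response) for m in chain}
    num = den = 0
    for chain in key:
        # Number of response chains the key chain is split across.
        partitions = len({resp_id[m] for m in chain})
        num += len(chain) - partitions
        den += len(chain) - 1
    return num / den if den else 0.0  # no links at all: report 0.0


gold = [["Mary", "she"], ["a vanilla slice", "the cake"],
        ["the birthday party"], ["the oven"]]
singletons = [[m] for chain in gold for m in chain]
all_in_one = [[m for chain in gold for m in chain]]

for name, system in [("SINGLETONS", singletons), ("ALL-IN-ONE", all_in_one)]:
    print(name,
          "MUC R/P:", muc_recall(gold, system), muc_recall(system, gold),
          "B3 R/P:", b_cubed(gold, system))
```

As in the table, the singletons baseline gets perfect B3 precision but zero MUC scores, while the all-in-one baseline gets perfect recall under both metrics.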

Page 34: coreference resolution talk

Evaluation Settings

gold-closed - gold linguistic annotations must be used by the systems and no external tools and resources are allowed for additional preprocessing.

auto-closed - auto linguistic annotations must be used by the systems and no external tools and resources are allowed for additional preprocessing.

gold-open - gold linguistic annotations must be used by the systems and external tools and resources are allowed for additional preprocessing.

auto-open - auto linguistic annotations must be used by the systems and external tools and resources are allowed for additional preprocessing.

Page 35: coreference resolution talk

Data

Word#  Word       POS  ParseBit     PredLemma  PFID  WS  SA         NE             PredArgs  PredArgs  Coref
0      It         PRP  (TOP(S(NP*)  -          -     -   Speaker#1  *              *         (ARG1*)   (22)
1      is         VBZ  (VP*         -          03    -   Speaker#1  *              (V*)      *         -
2      composed   VBN  (VP*         -          01    2   Speaker#1  *              *         (V*)      -
3      of         IN   (PP*         -          -     -   Speaker#1  *              *         (ARG2*    -
4      a          DT   (NP(NP*      -          -     -   Speaker#1  *              *         *         (24
5      primary    JJ   *            -          -     -   Speaker#1  *              *         *         -
6      stele      NN   *)           -          -     -   Speaker#1  *              *         *         24)
7      ,          ,    *            -          -     -   Speaker#1  *              *         *         -
8      secondary  JJ   (NP*         -          -     -   Speaker#1  *              *         *         (13
9      steles     NNS  *)           -          -     -   Speaker#1  *              *         *         13)
10     ,          ,    *            -          -     -   Speaker#1  *              *         *         -
11     a          DT   (NP*         -          -     -   Speaker#1  *              *         *         -
12     huge       JJ   *            -          -     -   Speaker#1  *              *         *         -
13     round      NN   *            -          -     -   Speaker#1  *              *         *         -
14     sculpture  NN   (NML(NML*)   -          -     -   Speaker#1  *              *         *         -
15     and        CC   *            -          -     -   Speaker#1  *              *         *         -
16     beacon     NN   (NML*        -          -     -   Speaker#1  *              *         *         -
17     tower      NN   *)))         -          -     -   Speaker#1  *              *         *         -
18     ,          ,    *            -          -     -   Speaker#1  *              *         *         -
19     and        CC   *            -          -     -   Speaker#1  *              *         *         -
20     the        DT   (NP*         -          -     -   Speaker#1  (WORK_OF_ART*  *         *         -
21     Great      NNP  *            -          -     -   Speaker#1  *              *         *         -
22     Wall       NNP  *)           -          -     -   Speaker#1  *)             *         *         -
23     ,          ,    *            -          -     -   Speaker#1  *              *         *         -
24     among      IN   (PP*         -          -     -   Speaker#1  *              *         *         -
25     other      JJ   (NP*         -          -     -   Speaker#1  *              *         *         -
26     things     NNS  *))))))      -          -     -   Speaker#1  *              *         *)        -
27     .          .    *))          -          -     -   Speaker#1  *              *         *         -

Page 36: coreference resolution talk

Data

#begin document <document ID>
<sentence>
<sentence>
...
<sentence>
#end document <document ID>
...
#begin document <document ID>
<sentence>
<sentence>
...
<sentence>
#end document <document ID>

Page 37: coreference resolution talk

Data

<token#1 column#1> <token#1 column#2> <token#1 column#3> ...
<token#2 column#1> <token#2 column#2> <token#2 column#3> ...
<token#3 column#1> <token#3 column#2> <token#3 column#3> ...
...
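
A reader for this layout might look like the following sketch. It rests on several assumptions not stated on the slides: columns are whitespace-separated, a blank line ends a sentence, the Coref column is the last one, and several mention boundaries on one token are separated by `|`. The document ID `demo` and the function names are hypothetical, and the sample is only the first seven tokens of the sentence shown on the Data slide.

```python
def read_conll(lines):
    """Yield (document_id, sentences); a sentence is a list of column lists."""
    doc_id, sentences, sentence = None, [], []
    for raw in lines:
        line = raw.strip()
        if line.startswith("#begin document"):
            doc_id = line[len("#begin document"):].strip()
            sentences, sentence = [], []
        elif line.startswith("#end document"):
            if sentence:
                sentences.append(sentence)
            yield doc_id, sentences
            doc_id, sentences, sentence = None, [], []
        elif not line:
            # Assumption: a blank line closes the current sentence.
            if sentence:
                sentences.append(sentence)
                sentence = []
        else:
            sentence.append(line.split())


def coref_spans(sentence):
    """Return (chain_id, first_token, last_token) from the last column."""
    spans, open_starts = [], {}
    for i, columns in enumerate(sentence):
        cell = columns[-1]
        if cell == "-":
            continue
        for part in cell.split("|"):
            chain_id = int(part.strip("()"))
            if part.startswith("("):
                open_starts.setdefault(chain_id, []).append(i)
            if part.endswith(")"):
                spans.append((chain_id, open_starts[chain_id].pop(), i))
    return spans


sample = """#begin document demo
0 It PRP (TOP(S(NP*) - - - Speaker#1 * * (ARG1*) (22)
1 is VBZ (VP* - 03 - Speaker#1 * (V*) * -
2 composed VBN (VP* - 01 2 Speaker#1 * * (V*) -
3 of IN (PP* - - - Speaker#1 * * (ARG2* -
4 a DT (NP(NP* - - - Speaker#1 * * * (24
5 primary JJ * - - - Speaker#1 * * * -
6 stele NN *) - - - Speaker#1 * * * 24)
#end document demo
""".splitlines()

docs = list(read_conll(sample))
spans = coref_spans(docs[0][1][0])
print(spans)
```

Each span is reported when its closing bracket is read, so nested mentions come out innermost first.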

Page 38: coreference resolution talk

The CR pipeline

Page 39: coreference resolution talk

Mention Detection

Identification of mentions using:

Heuristic approaches – POS, NEs

Rule-based approaches – syntactic annotation

Machine learning approaches – can use all types of provided annotations

Hybrid approaches – combination of implemented approaches

Page 40: coreference resolution talk

Mention Detection Methods

[Same CoNLL-format sample sentence as on the Data slide (Page 35).]

Page 41: coreference resolution talk

Using NEs

Page 42: coreference resolution talk

Using NEs

[Same CoNLL-format sample sentence as on the Data slide (Page 35).]

Page 43: coreference resolution talk

Using POS-based chunking

[Same CoNLL-format sample sentence as on the Data slide (Page 35).]
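
As an illustration of POS-based chunking, the sketch below scans (word, POS) pairs for either a personal pronoun or the pattern "optional determiner, any adjectives, one or more nouns". The pattern and the function name `chunk_mentions` are assumptions made for this example; the slides do not specify the chunking rules. The input is the first ten tokens of the sample sentence.

```python
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}


def chunk_mentions(tagged):
    """Return (start, end) spans matching PRP or DT? JJ* NOUN+."""
    spans, i, n = [], 0, len(tagged)
    while i < n:
        if tagged[i][1] == "PRP":
            spans.append((i, i))
            i += 1
            continue
        j = i
        if tagged[j][1] == "DT":          # optional determiner
            j += 1
        while j < n and tagged[j][1] == "JJ":   # any adjectives
            j += 1
        k = j
        while k < n and tagged[k][1] in NOUN_TAGS:  # one or more nouns
            k += 1
        if k > j:
            spans.append((i, k - 1))
            i = k
        else:
            i += 1
    return spans


tagged = [("It", "PRP"), ("is", "VBZ"), ("composed", "VBN"), ("of", "IN"),
          ("a", "DT"), ("primary", "JJ"), ("stele", "NN"), (",", ","),
          ("secondary", "JJ"), ("steles", "NNS")]
candidate_spans = chunk_mentions(tagged)
for start, end in candidate_spans:
    print(" ".join(word for word, _ in tagged[start:end + 1]))
```

On these ten tokens the candidates happen to coincide with the spans marked in the Coref column (It, a primary stele, secondary steles).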

Page 44: coreference resolution talk

Using the syntactic annotation

[Same CoNLL-format sample sentence as on the Data slide (Page 35).]
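
With syntactic annotation available, candidate mentions can be read off the ParseBit column by tracking open brackets on a stack and emitting a span whenever an NP closes. This is a minimal sketch under the assumption that every NP is a candidate mention; the function name `np_spans` is hypothetical, and the input is the ParseBit column for the first ten tokens of the sample sentence.

```python
import re


def np_spans(parse_bits):
    """Return (start, end) token spans of all NP constituents that close."""
    spans, stack = [], []
    for i, bit in enumerate(parse_bits):
        # Opening brackets precede the '*' placeholder for the token.
        for label in re.findall(r"\(([^()*\s]+)", bit):
            stack.append((label, i))
        # Each ')' closes the most recently opened constituent.
        for _ in range(bit.count(")")):
            label, start = stack.pop()
            if label == "NP":
                spans.append((start, i))
    return spans


parse_bits = ["(TOP(S(NP*)", "(VP*", "(VP*", "(PP*", "(NP(NP*",
              "*", "*)", "*", "(NP*", "*)"]
noun_phrases = np_spans(parse_bits)
print(noun_phrases)
```

Constituents that are still open at the end of the (truncated) input are simply left on the stack.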

Page 45: coreference resolution talk

Toy Example following the Mention-Pair Model

[Mary1] had [a good idea2]. [She3] wanted to tell [John4].

[a good idea]   [Mary]
[She]           [a good idea]
[She]           [Mary]
[John]          [She]
[John]          [a good idea]
[John]          [Mary]
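
The pair list above can be generated mechanically: every mention is paired with each mention that precedes it, closest candidate first. A minimal sketch, with the hypothetical function name `mention_pairs`:

```python
def mention_pairs(mentions):
    """Pair each mention with every preceding mention, closest first."""
    return [(mentions[j], mentions[i])
            for j in range(1, len(mentions))
            for i in range(j - 1, -1, -1)]


mentions = ["Mary", "a good idea", "She", "John"]
pairs = mention_pairs(mentions)
for later, earlier in pairs:
    print(f"[{later}]  [{earlier}]")
```

For n mentions this yields n(n-1)/2 pairs, here 6 for 4 mentions.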

Page 46: coreference resolution talk

Mention Head Detection

Example:

Page 47: coreference resolution talk

Mention Head Detection

Mention Head Detection is generally realized via:

Heuristics

Rule-based methods
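
One possible heuristic, shown here only as an illustration and not taken from the talk: ignore everything from the first preposition onward, then pick the rightmost noun or pronoun. The function name `head_word` and the tag set are assumptions, and the heuristic is tuned to simple English noun phrases.

```python
HEAD_TAGS = {"NN", "NNS", "NNP", "NNPS", "PRP"}


def head_word(tagged_mention):
    """Guess the head of a mention given as (word, POS) pairs."""
    core = []
    for word, tag in tagged_mention:
        if tag == "IN" and core:   # stop at a post-modifying preposition
            break
        core.append((word, tag))
    for word, tag in reversed(core):
        if tag in HEAD_TAGS:
            return word
    return core[-1][0]             # fall back to the last token


heads = [
    head_word([("a", "DT"), ("good", "JJ"), ("idea", "NN")]),
    head_word([("Putin", "NNP"), ("of", "IN"), ("Russia", "NNP")]),
    head_word([("She", "PRP")]),
]
print(heads)
```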

Page 48: coreference resolution talk

Toy Example following the Mention-Pair Model

[Mary1] had [a good idea2]. [She3] wanted to tell [John4].

idea   Mary
She    idea
She    Mary
John   She
John   idea
John   Mary

Page 49: coreference resolution talk

Example of Features Used by the Mention-Pair Model

Page 50: coreference resolution talk

Example of Features Used by the Mention-Pair Model

Page 51: coreference resolution talk

Toy Example following the Mention-Pair Model

[Mary1] had [a good idea2]. [She3] wanted to tell [John4].

idea   Mary   NN    NNP
She    idea   PRP   NN
She    Mary   PRP   NNP
John   She    NNP   PRP
John   idea   NNP   NN
John   Mary   NNP   NNP

Page 52: coreference resolution talk

Toy Example following the Mention-Pair Model

training: [Mary1] had [a good idea2]. [She1] wanted to tell [John4].

test: [Mary1] had [a good idea2]. [She3] wanted to tell [John4].

Training instances:               Test instances:

idea   Mary   NN    NNP   F       idea   Mary   NN    NNP
She    idea   PRP   NN    F       She    idea   PRP   NN
She    Mary   PRP   NNP   T       She    Mary   PRP   NNP
John   She    NNP   PRP   F       John   She    NNP   PRP
John   idea   NNP   NN    F       John   idea   NNP   NN
John   Mary   NNP   NNP   F       John   Mary   NNP   NNP
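
The training and test step can be sketched in a few lines. The talk does not name a learner, so a majority-vote lookup table over feature tuples stands in for a real classifier here; that substitution, the function names, and the default of "not coreferent" for unseen feature tuples are all assumptions of this example. Features are the POS tags of the two mention heads, with She tagged PRP.

```python
from collections import Counter, defaultdict

# (later mention, earlier mention), (POS later, POS earlier), coreferent?
TRAIN = [
    (("idea", "Mary"), ("NN", "NNP"), False),
    (("She", "idea"), ("PRP", "NN"), False),
    (("She", "Mary"), ("PRP", "NNP"), True),
    (("John", "She"), ("NNP", "PRP"), False),
    (("John", "idea"), ("NNP", "NN"), False),
    (("John", "Mary"), ("NNP", "NNP"), False),
]
# The test sentence on the slide is the same text, without labels.
TEST = [(pair, feats) for pair, feats, _ in TRAIN]


def train(instances):
    """Memorise the majority label for each feature tuple."""
    votes = defaultdict(Counter)
    for _, feats, label in instances:
        votes[feats][label] += 1
    return {feats: counts.most_common(1)[0][0]
            for feats, counts in votes.items()}


def predict(model, feats):
    # Unseen feature tuples default to "not coreferent".
    return model.get(feats, False)


model = train(TRAIN)
links = [pair for pair, feats in TEST if predict(model, feats)]
print(links)
```

The predicted links are pairwise decisions only; turning them into coreference chains still requires a clustering step.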

Page 53: coreference resolution talk

Hands On

How would you employ semantic similarity for the task of coreference resolution?

Page 54: coreference resolution talk

Thank you!

Page 55: coreference resolution talk

Bibliography

Ruslan Mitkov. Anaphora Resolution. Studies in Language and Linguistics. Longman, 2002.

Altaf Rahman and Vincent Ng. Narrowing the Modeling Gap: A Cluster-Ranking Approach to Coreference Resolution. J. Artif. Int. Res., 40(1):469–521, January 2011.
