An Analysis of Causality between Events and its Relation to Temporal Information

Paramita Mirza, Sara Tonelli

COLING 2014

The Story about Causality and his friendship with Time

Upload: paramita-mirza

Post on 06-Jul-2015


DESCRIPTION

In Proceedings of the 25th International Conference on Computational Linguistics. In this work we present an annotation framework to capture causality between events, inspired by TimeML, and a language resource covering both temporal and causal relations. This data set is then used to build an automatic extraction system for causal signals and causal links between given event pairs. The evaluation and analysis of the system’s performance provide insight into explicit causality in text and the connection between temporal and causal relations.

TRANSCRIPT


Page 2

One day at McRel Inc.

Eve McRel, the head of Event department, is introducing the team’s new member…

Let me introduce Casey, who will be responsible for extracting causality from text. Casey, Tim is responsible for temporal relations; you two should work together.

Hello! I have some questions for you, Tim.

Hi! Feel free to ask. We should have a drink sometime.

Page 3

About their jobs

Typhoon Haiyan struck the eastern Philippines on Friday, killing thousands of people.

Relations marked on the sentence: CAUSE (struck → killing), BEFORE (struck → killing), IS_INCLUDED (struck → Friday)

So, when given a piece of text like that, my job is to tell that struck happened before killing, or that struck happened on Friday.

I see. And my job is to determine that struck is the cause of killing. How do you learn to identify the temporal relations?

Oh, I have this annotated corpus from TempEval-3. I learn a lot from that.

Page 4

Resources on causality

Casey is not as lucky as Tim: the TempEval-3 corpus that Tim has doesn’t contain causal information. He asked Eve to provide him with a causality corpus so that he can learn from it.

Eve ordered someone to investigate resources on causality…

• Girju et al. (2007): causality between nominals (SemEval-2007 Task 4)
• Bethard et al. (2008): causality between events under the conjunction “and”
• Rink et al. (2010): use Bethard’s corpus and show that temporal information helps in identifying causal relations
• Do et al. (2011): causality between verb-verb, verb-noun, and noun-noun pairs (20 news articles from CNN)
• Riaz and Girju (2013): causality between verbal events with the markers “because” and “but” (knowledge base of causal associations of verbs)

Page 5

Let’s create a causality corpus!

Since the available resources on causality are not really what they wanted, Eve decided to create a new one, so she hired two interns…

So guys, I want you to add causal information on top of the TempEval-3 corpus. Can you do it?

Banana?

Argh, this won’t do. I need annotation guidelines for them.

Eve then consulted some philosophers: Lewis, Cheng, Wolff and Talmy; and she decided to base the guidelines on the Dynamics Model (Wolff), which builds on Talmy’s force-dynamic account of causality.

Page 6

The Annotator’s Guide to the Causality (a trilogy in two parts)

Part 1: CSIGNAL

A textual element indicating the presence of a causal relation. Parallel to SIGNAL, which marks the presence of a temporal relation in TimeML.

• Prepositions: because of, as a result of, due to, …
• Conjunctions: because, since, so that, …
• Adverbial connectors: as a result, so, therefore, …
• Clause-integrated expressions: the result is, that’s why, …
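The four CSIGNAL categories above can be pictured as a small lexicon with a phrase lookup. The sketch below is only an illustration: the lexicon contains just the examples listed on the slide, and the helper name `find_candidate_signals`, the whitespace tokenization and the longest-match strategy are assumptions, not part of the guidelines.

```python
# Illustrative lexicon built only from the examples listed above; the
# elided entries ("…") of the guidelines are deliberately not filled in.
CSIGNAL_EXAMPLES = {
    "preposition": ["because of", "as a result of", "due to"],
    "conjunction": ["because", "since", "so that"],
    "adverbial connector": ["as a result", "so", "therefore"],
    "clause-integrated expression": ["the result is", "that's why"],
}


def find_candidate_signals(sentence):
    """Return (start_token, end_token, phrase, category) candidates.

    Hypothetical helper: lowercase whitespace tokenization and
    longest-match-first lookup. It over-generates on purpose, since
    words such as "since" or "so" are not always causal.
    """
    tokens = sentence.lower().replace(",", " ").split()
    # Try longer phrases first so "because of" wins over "because".
    entries = sorted(
        ((phrase.split(), category)
         for category, phrases in CSIGNAL_EXAMPLES.items()
         for phrase in phrases),
        key=lambda entry: -len(entry[0]),
    )
    found, i = [], 0
    while i < len(tokens):
        for words, category in entries:
            if tokens[i:i + len(words)] == words:
                found.append((i, i + len(words) - 1, " ".join(words), category))
                i += len(words)
                break
        else:
            i += 1
    return found


hits = find_candidate_signals(
    "Iraq said it invaded Kuwait because of disputes over oil and money")
print(hits)
```

A lookup like this only proposes candidates; deciding whether an ambiguous connector is really causal in context is the job of the annotators or of a classifier.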

Page 7

The Annotator’s Guide to the Causality (a trilogy in two parts)

Part 2: CLINK

A directional one-to-one relation where source = causing event and target = caused event, and (optionally) c-signalID = ID of the related CSIGNAL. Parallel to TLINK for temporal relations in TimeML. In the examples, [S] marks the source event and [T] the target event.

• Expressions containing affect verbs (affect, influence, determine, change)
  – Ogun CAN crisis[S] affects the launch[T] of the All Progressives Congress
• Expressions containing link verbs (link, lead, depend (on))
  – An earthquake[T] in North America was linked to a tsunami[S] in Japan
• Basic construction involving causative verbs of CAUSE, ENABLE, PREVENT type
  – The purchase[S] caused the creation[T] of the current building

Page 8

The Annotator’s Guide to the Causality (a trilogy in two parts)

Part 2: CLINK (continued; [S] = source event, [T] = target event)

• Periphrastic causatives involving causative verbs of CAUSE, ENABLE, PREVENT type
  – The blast[S] prompts the boat to heel[T] violently
• Expressions containing CSIGNALs
  – Iraq said it invaded[T] Kuwait because of disputes[S] over oil and money
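To make the CLINK definition concrete, here is a minimal sketch of how a link and its optional signal could be held in memory. The class names, field names and ID strings are hypothetical; the actual annotation is TimeML-style markup, not Python objects.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CSignal:
    signal_id: str  # hypothetical ID format
    text: str


@dataclass(frozen=True)
class CLink:
    source: str                        # ID of the causing event
    target: str                        # ID of the caused event
    c_signal_id: Optional[str] = None  # ID of the related CSIGNAL, if any


# "Iraq said it invaded Kuwait because of disputes over oil and money"
signal = CSignal(signal_id="c1", text="because of")
link = CLink(source="e_disputes", target="e_invaded",
             c_signal_id=signal.signal_id)

# "The purchase caused the creation of the current building":
# the causality is carried by a causative verb, so no CSIGNAL is attached.
link_no_signal = CLink(source="e_purchase", target="e_creation")

print(link)
print(link_no_signal)
```

Keeping the signal reference optional mirrors the definition: every CLINK has a source and a target, but only some are licensed by an explicit CSIGNAL.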

Page 9

The interns’ discussion

Hmmm… wowee~ Evo in kalarel no anotatata!
(Hmmm… weird. Some events involved in causal relations were not annotated.)

Real?? May para temporel awali jengajenga. Anotatata wuliloo!
(Really? Maybe because it was originally built for temporal relations. Let’s annotate them!)

Page 10

The interns’ reports

http://hlt.fbk.eu/technologies/causal-timebank

Page 11

How to learn causality?

Casey decided to divide the job into two tasks:

1. Labeling CSIGNAL: given a text (annotated with events and time expressions), decide whether a token is part of a causal signal or not

2. Identifying CLINK: given a pair of events, decide whether the events are connected by an explicit causal link

Both tasks are basically classification tasks. I will use the created causality corpus to learn from.

To evaluate my learning ability, I will use the 5-fold cross-validation scheme.
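A 5-fold cross-validation scheme can be sketched as follows. This is a generic illustration: the slides do not say whether folds were built over documents, sentences or event pairs, so splitting placeholder document IDs (and the fixed shuffle seed) is an assumption.

```python
import random


def k_fold_splits(items, k=5, seed=0):
    """Yield (train, test) lists for k-fold cross-validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Fold i takes every k-th item starting at position i.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


docs = [f"doc{n:02d}" for n in range(12)]  # placeholder document IDs
splits = list(k_fold_splits(docs, k=5))
for train, test in splits:
    print(len(train), "train /", len(test), "test")
```

Each item is used for testing exactly once, and the reported scores would be aggregated over the five runs.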

Page 12

Inside Casey’s brain on labeling CSIGNAL

• Text chunking task: a token is classified into B-CSIGNAL, I-CSIGNAL and O (for other)
• Pre-processing:
  – TextPro tool (Pianta et al., 2008) to get NP chunking and named entity information
  – Stanford CoreNLP tool to get lemma, PoS tags and dependency relations between tokens
  – addDiscourse tool (Pitler and Nenkova, 2009) to get discourse connective type
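The chunking formulation above assigns one label per token. A minimal sketch of that encoding, assuming signal spans are given as inclusive token index pairs (the function name and span format are illustrative, not taken from the system):

```python
def bio_encode(tokens, signal_spans):
    """Label tokens with B-CSIGNAL / I-CSIGNAL / O.

    signal_spans: list of (start, end) token indices, end inclusive.
    """
    labels = ["O"] * len(tokens)
    for start, end in signal_spans:
        labels[start] = "B-CSIGNAL"          # first token of the signal
        for i in range(start + 1, end + 1):  # remaining tokens, if any
            labels[i] = "I-CSIGNAL"
    return labels


tokens = ("Iraq said it invaded Kuwait because of disputes "
          "over oil and money").split()
labels = bio_encode(tokens, [(5, 6)])  # "because of"
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

The classifier then predicts one of the three labels for each token from the features obtained in pre-processing.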

Page 13

Inside Casey’s brain on labeling CSIGNAL

• Classifier:
  – Built using the SVM algorithm provided by YamCha
  – Feature vectors: token, lemma, PoS tags, NP chunking, dependency relations, and several binary features indicating whether a token is:
    • part of an event or a temporal expression
    • part of a named entity
    • part of a specific discourse connective type

Page 14

Casey’s note on labeling CSIGNAL

System                  Precision   Recall    F-score
Rule-based (baseline)   54.33%      40.35%    46.31%
Supervised chunking     91.03%      41.76%    57.26%

The rule-based system basically labels as CSIGNAL all causal connectors listed in the annotation guidelines and those appearing in specific syntactic constructions.
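The F-score column is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded percentages in the table, so the results agree with the reported values only up to rounding.

```python
def f_score(precision, recall):
    """Balanced F-score (harmonic mean of precision and recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


rule_based = f_score(54.33, 40.35)
supervised = f_score(91.03, 41.76)
print(f"rule-based: {rule_based:.2f}")
print(f"supervised: {supervised:.2f}")
```

The supervised chunker gains most of its F-score from precision; its recall is barely above the baseline, which is the point Casey raises later.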

Page 15

Inside Casey’s brain on identifying CLINK

• Classification task: an ordered pair of events (e1, e2) is classified into CLINK (e1 as source, e2 as target), CLINK-R (reversed order of source and target) and NO-REL
• Candidate pairs:
  – Every possible combination of events in the same sentence in a forward manner, e.g. for “The e1 and e2 are e3”, the event pairs are (e1, e2), (e1, e3), (e2, e3)
  – Combination of each event in a sentence with events in the following sentence (only events in two consecutive sentences are linked)
• Pre-processing:
  – Stanford CoreNLP tool to get lemma, PoS tags and dependency relations between tokens
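The candidate pair generation described above can be sketched in a few lines. The input format (one list of event IDs per sentence, in textual order) and the function name are assumptions made for the example.

```python
from itertools import combinations


def candidate_pairs(sentences):
    """sentences: list of lists of event IDs, in textual order."""
    pairs = []
    for i, events in enumerate(sentences):
        # Same-sentence pairs, in forward (left-to-right) order only.
        pairs.extend(combinations(events, 2))
        # Pairs with events in the immediately following sentence.
        if i + 1 < len(sentences):
            pairs.extend((a, b) for a in events for b in sentences[i + 1])
    return pairs


pairs = candidate_pairs([["e1", "e2", "e3"], ["e4"], ["e5"]])
print(pairs)
```

Because pairs are only generated in forward order, the CLINK-R label is what allows the classifier to express that the second event is the cause.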

Page 16

Inside Casey’s brain on identifying CLINK (continued)

• Classifier:
  – Built using the SVM algorithm provided by YamCha
  – Feature vectors:
    • String and grammatical: token, lemma and PoS tags of e1 and e2, and a binary feature (e1 and e2 have the same PoS tags)
    • Textual context: sentence distance and event distance of e1 and e2
    • Event attributes: class, tense, aspect and polarity of e1 and e2 as specified in TimeML

Page 17

Inside Casey’s brain on identifying CLINK (continued)

• Classifier:
  – Built using the SVM algorithm provided by YamCha
  – Feature vectors (continued):
    • Dependency information: dependency path between e1 and e2 (if any), type of causative verbs connecting them (if any), and a binary feature (e1/e2 is the root of the sentence)
    • Causal signals: causal signals around e1 and e2, position of the signal (between e1 and e2, or before e1), dependency path between e1/e2 and the signal
    • Temporal relations (TLINKs): temporal relation type of the TLINK connecting e1/e2 (if any), taken from the gold annotated corpus
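As a rough picture of how such features might be collected for one event pair, here is a sketch that covers the string, context, event attribute, signal and TLINK features. All names and the attribute values in the example are illustrative assumptions; dependency features are left out because they need a parser, and the real system feeds its features to YamCha rather than building Python dictionaries.

```python
from dataclasses import dataclass


@dataclass
class Event:
    token: str
    lemma: str
    pos: str
    eclass: str     # TimeML event class
    tense: str
    aspect: str
    polarity: str
    sent_idx: int   # index of the sentence containing the event
    event_idx: int  # running index of the event in the document


def pair_features(e1, e2, tlink=None, csignal=None, csignal_position=None):
    """Build a flat feature dictionary for the ordered pair (e1, e2)."""
    feats = {}
    for name, ev in (("e1", e1), ("e2", e2)):
        feats[f"{name}_token"] = ev.token
        feats[f"{name}_lemma"] = ev.lemma
        feats[f"{name}_pos"] = ev.pos
        feats[f"{name}_class"] = ev.eclass
        feats[f"{name}_tense"] = ev.tense
        feats[f"{name}_aspect"] = ev.aspect
        feats[f"{name}_polarity"] = ev.polarity
    feats["same_pos"] = e1.pos == e2.pos
    feats["sent_distance"] = e2.sent_idx - e1.sent_idx
    feats["event_distance"] = e2.event_idx - e1.event_idx
    feats["tlink"] = tlink or "NONE"
    feats["csignal"] = csignal or "NONE"
    feats["csignal_position"] = csignal_position or "NONE"
    return feats


# "Typhoon Haiyan struck the eastern Philippines on Friday, killing ..."
# Attribute values below are illustrative, not copied from the corpus.
struck = Event("struck", "strike", "VBD", "OCCURRENCE",
               "PAST", "NONE", "POS", sent_idx=0, event_idx=0)
killing = Event("killing", "kill", "VBG", "OCCURRENCE",
                "PRESPART", "NONE", "POS", sent_idx=0, event_idx=1)
feats = pair_features(struck, killing, tlink="BEFORE")
print(feats)
```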

Page 18

Casey’s note on identifying CLINK

System                                                Precision   Recall    F-score
Rule-based (baseline)                                 36.79%      12.26%    18.40%
Supervised classification (with gold CSIGNALs)        74.67%      35.22%    47.86%
  - without dependency feature                        65.77%      30.82%    41.97%
  - without CSIGNAL feature                           57.53%      13.21%    21.48%
  - without TLINK feature                             61.59%      29.25%    39.66%
Supervised classification (with automatic CSIGNALs)   67.29%      22.64%    33.88%

The rule-based system basically looks for specific dependency constructions where an affect verb, a link verb, a causative verb (basic and periphrastic constructions) or a causal signal is connected to two events.
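The ablation rows can be read as F-score drops relative to the full system with gold CSIGNALs. The arithmetic below uses only the numbers in the table; the variable names are made up for the example.

```python
FULL_F = 47.86  # supervised classification with gold CSIGNALs

# F-score when one feature group is removed.
ablations = {
    "dependency": 41.97,
    "CSIGNAL": 21.48,
    "TLINK": 39.66,
}

drops = {name: round(FULL_F - f, 2) for name, f in ablations.items()}
ranked = sorted(drops, key=drops.get, reverse=True)

for name in ranked:
    print(f"without {name} feature: -{drops[name]:.2f} F-score points")
```

Removing the CSIGNAL feature costs by far the most, followed by the TLINK feature and then the dependency feature.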

Page 19

Just another meeting at McRel Inc.

Casey reports some findings from his learning activity…

On labeling CSIGNAL, the low recall is most probably due to data sparseness.

Well, that’s expected: only 47% of the documents in the corpus contain a CSIGNAL. We should enrich the learning data, maybe with the Penn Discourse Treebank (PDTB)?

Yeah, maybe. Furthermore, false negatives are mostly due to ambiguous causal signals, such as “by” and “and”.

For the conjunction “and”, perhaps the corpus by Bethard et al. (2008) can help?

Page 20

Just another meeting at McRel Inc.

Casey reports some findings from his learning activity…

Hmm… right. Meanwhile, on identifying CLINK, most mistakes are caused by dependency parser errors.

Try to use another dependency parser. For example… the C&C tool (Curran et al., 2007), since it has better coverage of long-range dependencies.

Okay, worth a try. And again, data sparseness is an issue. Could you provide me with more learning data?

One option is to hire interns again to annotate the AQUAINT corpus from TempEval-3. Or we could use the causality information in PDTB, but pre-processing is needed because the causality there is not between events. Let’s see what I can do…

Page 21

One evening at an Irish pub

While Tim and Casey are enjoying their Guinness…

So, how’s your work going?

It’s going well. There are some future directions to improve my learning ability.

By the way, the temporal information helps me a lot! Especially to decide the causality direction, because you know, the cause should happen before the effect.

Wow, cool! Perhaps the causal information can help me too?

Page 22

One evening at an Irish pub

While Tim and Casey are enjoying their Guinness…

Well, the number of TLINKs that have underlying CLINKs will be much lower. So… maybe the causal information won’t help that much.

Besides, look at this sentence…

“But some analysts questioned[T] how much of an impact the retirement package will have, because few jobs will end[S] up being eliminated.”

Hmmm… interesting, the cause is after the effect. We should discuss it more. But now… let’s celebrate our future collaboration. Cheers!

Cheers!

…and their story continues, in the next paper ;)

Page 23

Cast

Casey: Causal Relation Extraction System
Tim: Temporal Relation Extraction System
Eve McRel: Event Relation Repository
Minion 1: Paramita Mirza
Minion 2: Sara Tonelli

Thank You!

Page 24

Interns’ additional reports

Page 25

Casey’s note on dependency parser errors

“StatesWest Airlines withdrew[T] its offer to acquire Mesa Airlines because the Farmington carrier did not respond[S] to its offer”

According to the Stanford dependency parser, “because” is a marker of “acquire” instead of “withdrew”.