Learning TFC Meeting, SRI March 2005: On the Collective Classification of Email “Speech Acts” (Vitor R. Carvalho & William W. Cohen, Carnegie Mellon University)


Page 1:

Learning TFC Meeting, SRI March 2005

On the Collective Classification of Email “Speech Acts”

Vitor R. Carvalho & William W. Cohen, Carnegie Mellon University

Page 2:

Classifying Email into Acts

From EMNLP-04, “Learning to Classify Email into Speech Acts”, Cohen, Carvalho & Mitchell.

• An act is described as a verb-noun pair (e.g., propose meeting, request information). Not all pairs make sense. A single email message may contain multiple acts.

• The taxonomy tries to describe commonly observed behaviors rather than all possible speech acts in English. It also includes non-linguistic uses of email (e.g., delivery of files).

[Figure: taxonomy of email acts. Verbs: Commissive, Directive, Deliver, Commit, Request, Propose, Amend. Nouns: Activity, Ongoing, Event, Meeting, Other, Delivery, Opinion, Data.]

Page 3:

Idea: Predicting Acts from Surrounding Acts

• An act has little or no correlation with the other acts of the same message.

• An act is strongly correlated with the acts of the previous and next messages.

[Figure: example of an email sequence. Messages are linked by <<In-ReplyTo>> and labeled with acts such as Delivery, Request, Commit, and Proposal.]

Page 4:

Related Work on the Sequential Nature of Negotiations

• Winograd and Flores, 1986: “Conversation for Action Structure”

• Murakoshi et al., 1999: “Construction of Deliberation Structure in Email”

Page 5:

Data: CSPACE Corpus

• Few large, free, natural email corpora are available.

• CSPACE corpus (Kraut & Fussell):

o Emails associated with a semester-long project for Carnegie Mellon MBA students in 1997.

o 15,000 messages from 277 students, divided into 50 teams (4 to 6 students per team).

o Rich in task negotiation.

o More than 1,500 messages (from 4 teams) were labeled in terms of “Speech Act”.

o One of the teams was double labeled, and the inter-annotator agreement ranges from 72% to 83% (Kappa) for the most frequent acts.
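The agreement figures above are Kappa values. As a minimal sketch of how such a figure could be computed for a single act, assuming the statistic is Cohen's Kappa and that each annotator gives one binary label (act present or absent) per message. The function name `cohen_kappa` and the toy labels are illustrative only and are not taken from the corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa for two equally long label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items labeled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for one act on ten messages (1 = act present).
annotator_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
annotator_2 = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Here the annotators agree on 8 of 10 messages, but the chance-corrected value is lower because both label most messages as negative.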

Page 6:

Evidence of Sequential Correlation of Acts

[Figure: transition diagram for the most common verbs in the CSPACE corpus.]

• It is NOT a probabilistic DFA.

• Act sequence patterns: (Request, Deliver+), (Propose, Commit+, Deliver+), (Propose, Deliver+). The most common act was Deliver.

• Less regularity than expected (considering previous deterministic negotiation state diagrams).
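Patterns such as (Request, Deliver+) can be found by counting which acts follow which along reply links. The sketch below assumes each message records its parent id and a set of verb acts. The thread data and the name `count_transitions` are hypothetical.

```python
from collections import Counter


def count_transitions(messages):
    """Count (parent act, child act) pairs over reply links.

    `messages` maps a message id to (parent id or None, set of verb acts).
    """
    counts = Counter()
    for parent_id, acts in messages.values():
        if parent_id is None or parent_id not in messages:
            continue  # thread root, or parent missing from the data
        parent_acts = messages[parent_id][1]
        for parent_act in parent_acts:
            for child_act in acts:
                counts[(parent_act, child_act)] += 1
    return counts


# Hypothetical toy threads, not taken from CSPACE.
thread = {
    "m1": (None, {"Request"}),
    "m2": ("m1", {"Deliver"}),
    "m3": ("m2", {"Deliver"}),
    "m4": (None, {"Propose"}),
    "m5": ("m4", {"Commit"}),
    "m6": ("m5", {"Deliver"}),
}
transitions = count_transitions(thread)
```

Normalizing each row of these counts would give empirical transition frequencies of the kind a transition diagram summarizes.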

Page 7:

Content versus Context

• Content: bag-of-words features only.

• Context: parent and child features only (table below).

• 8 MaxEnt classifiers, trained on the 3F2 team dataset and tested on the 1F3 team dataset.

• Only the 1st child message was considered (vast majority: more than 95%).

[Figure: Kappa values on 1F3 using relational (Context) features and textual (Content) features, shown per act: Request, Deliver, Commit, Propose, Directive, Commissive, Meeting, dData. Axis: Kappa values, 0 to 0.5.]

Set of Context Features (Relational)

Parent Boolean Features: Parent_Request, Parent_Deliver, Parent_Commit, Parent_Propose, Parent_Directive, Parent_Commissive, Parent_Meeting, Parent_dData

Child Boolean Features: Child_Request, Child_Deliver, Child_Commit, Child_Propose, Child_Directive, Child_Commissive, Child_Meeting, Child_dData
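The 16 relational features can be derived mechanically from the acts of the parent and child messages. A minimal sketch, assuming acts are available as sets of strings. The function name `context_features` is illustrative.

```python
# The eight acts named in the feature table.
ACTS = ["Request", "Deliver", "Commit", "Propose",
        "Directive", "Commissive", "Meeting", "dData"]


def context_features(parent_acts, child_acts):
    """Build the Parent_* and Child_* Boolean features for one message."""
    features = {}
    for act in ACTS:
        features["Parent_" + act] = act in parent_acts
        features["Child_" + act] = act in child_acts
    return features


# Example: the parent carries a Request, the child a Deliver of data.
feats = context_features({"Request"}, {"Deliver", "dData"})
```

A message with no parent or no child would simply get all-False features on that side under this encoding.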

[Figure: a message whose acts are unknown (???), shown between its parent message and its child message, which carry acts such as Delivery, Request, Commit, and Proposal.]

Page 8:

Collective Classification using Dependency Networks

Dependency networks are probabilistic graphical models in which the full joint distribution of the network is approximated with a set of conditional distributions that can be learned independently. The conditional probability distributions in a DN are calculated for each node given its neighboring nodes (its Markov blanket).

$$\Pr(X) \approx \prod_i \Pr\bigl(X_i \mid \mathrm{NeighborSet}(X_i)\bigr)$$

• No acyclicity constraint. Simple parameter estimation; approximate inference (Gibbs sampling).

• In this case, Markov blanket = parent message and child message.

Heckerman et al., JMLR-2000. Neville & Jensen, KDD-MRDM-2003.
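To make the role of the Markov blanket concrete, here is a rough sketch of inference over a reply tree for a single act. It is a deterministic iterative-relabeling simplification, not Gibbs sampling and not the authors' algorithm. The scores, the scalar `context_bonus`, and the name `iterative_relabel` are all hypothetical stand-ins for trained per-act classifiers.

```python
def iterative_relabel(parents, content_score, context_bonus, n_iter=10):
    """Relabel each message from its content score plus neighbor support.

    parents:       message id -> parent id (or None)
    content_score: message id -> score from a content-only classifier
    context_bonus: hypothetical weight added per neighbor (parent or
                   first child) currently labeled with the act
    Returns message id -> bool (act predicted or not).
    """
    # Only the first child of each message is used as a neighbor.
    first_child = {}
    for msg, par in parents.items():
        if par is not None and par not in first_child:
            first_child[par] = msg
    # Initialize labels from content alone.
    labels = {m: content_score[m] > 0.5 for m in parents}
    for _ in range(n_iter):
        changed = False
        for m in parents:
            blanket = [parents[m], first_child.get(m)]
            support = sum(labels[n] for n in blanket if n in labels)
            new_label = content_score[m] + context_bonus * support > 0.5
            if new_label != labels[m]:
                labels[m] = new_label
                changed = True
        if not changed:
            break  # labels are stable
    return labels


# Hypothetical three-message thread: a <- b <- c.
parents = {"a": None, "b": "a", "c": "b"}
scores = {"a": 0.9, "b": 0.4, "c": 0.2}
labels = iterative_relabel(parents, scores, context_bonus=0.2)
```

In this toy run, message b flips to positive because its parent is confidently positive, while c stays negative. A Gibbs sampler would instead draw each label from its conditional distribution and average over many sweeps.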

Page 9:

Collective Classification Algorithm (based on the Dependency Networks model)

Page 10:

Agreement versus Iteration

[Figure: Kappa versus iteration on the 1F3 team dataset, using classifiers trained on 3F2 team data. Series: Deliver, Commissive, Request. Horizontal axis: Iteration (0 to 50). Vertical axis: Kappa (0.25 to 0.55).]

Page 11:

Leave-one-team-out Experiments

• 4 teams: 1f3 (170 msgs), 2f2 (137 msgs), 3f2 (249 msgs) and 4f4 (165 msgs).

• x-axis = bag-of-words only.

• y-axis = collective classification results.

• Different teams present different styles for negotiations and task delegation.

[Figure: scatter plot of Kappa values, both axes 0 to 80, with one series per team (4f4, 1f3, 3f2, 2f2) and a reference line.]
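The leave-one-team-out protocol holds out each team in turn and trains on the other three. A small sketch of the resulting splits, using the message counts listed on this slide. The function name `leave_one_team_out` is illustrative.

```python
# Labeled messages per team, as listed on the slide.
TEAM_SIZES = {"1f3": 170, "2f2": 137, "3f2": 249, "4f4": 165}


def leave_one_team_out(team_sizes):
    """Yield (held-out team, training teams, train size, test size)."""
    for held_out in team_sizes:
        train_teams = [t for t in team_sizes if t != held_out]
        train_size = sum(team_sizes[t] for t in train_teams)
        yield held_out, train_teams, train_size, team_sizes[held_out]


splits = list(leave_one_team_out(TEAM_SIZES))
```

With 721 labeled messages in total, each fold trains on between 472 and 584 messages, so the training sets are small and differ in size across folds.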

Page 12:

Leave-one-team-out Experiments

• Consistent improvement of Commissive, Commit and Meet acts.

[Figure: scatter plot of Kappa values, both axes 0 to 70. Series: Commiss/Commit/Meet, Direct/dData/Request, Proposal/Delivery, and a reference line.]

Page 13:

Leave-one-team-out Experiments

• Deliver and dData performance usually decreases.

• These acts are associated with data distribution, FYI, file sharing, etc.

• For “non-delivery” acts, the improvement in average Kappa is statistically significant (p=0.01 on a two-tailed t-test).

[Figure: scatter plot of Kappa values, both axes 0 to 80. Series: Non-delivery, Deliver/dData, and a reference line.]

Page 14:

Act by Act Comparative Results

Act          Baseline   Collective
Commissive   37.66      42.55
Commit       30.74      32.77
Meeting      47.81      52.42
Directive    58.27      58.37
Request      47.25      49.55
Propose      36.84      40.72
Deliver      42.01      38.69
dData        44.98      43.44

Kappa values (%) with and without collective classification, averaged over the four test sets in the leave-one-team-out experiment.
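The per-act change can be read off directly from the values reported on this slide. The sketch below assumes the chart's numbers pair with the act labels in the order they appear (baseline series first, then collective), which is consistent with the stated drop for Deliver and dData.

```python
# Average Kappa (%) per act, read from the slide in label order (assumed pairing).
BASELINE = {"Commissive": 37.66, "Commit": 30.74, "Meeting": 47.81,
            "Directive": 58.27, "Request": 47.25, "Propose": 36.84,
            "Deliver": 42.01, "dData": 44.98}
COLLECTIVE = {"Commissive": 42.55, "Commit": 32.77, "Meeting": 52.42,
              "Directive": 58.37, "Request": 49.55, "Propose": 40.72,
              "Deliver": 38.69, "dData": 43.44}

# Change in Kappa points when collective classification is added.
deltas = {act: round(COLLECTIVE[act] - BASELINE[act], 2) for act in BASELINE}
improved = [act for act, d in deltas.items() if d > 0]
worsened = [act for act, d in deltas.items() if d < 0]
```

Under this pairing, six acts improve (Commissive by the most, Directive by the least) and only Deliver and dData get worse.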

Page 15:

Discussion and Conclusion

• Sequential patterns of email acts were observed in the CSPACE corpus.

• These patterns, when studied in an artificial experiment, were shown to contain valuable information for the email-act classification problem.

• Different teams present different styles for negotiations and task delegation.

• We proposed a collective classification scheme for the email speech acts of messages (based on a Dependency Network model).

Page 16:

Conclusion

• Modest improvements over the baseline (bag of words) were observed on acts related to negotiation (Request, Commit, Propose, Meet, etc.). A performance deterioration was observed for Delivery/dData (acts less associated with negotiations).

• This agrees with the general intuition about the sequential nature of negotiation steps.

• The degree of linkage in our dataset is small, which makes the observed results encouraging.