Machine Classification and Analysis of Suicide-Related Communication on Twitter

Presentation @ ACM Hypertext 2015

Pete Burnap, Gualtiero (Walter) Colombo & Jonathan Scourfield

Social Data Science Lab
School of Computer Science and Informatics & School of Social Sciences
Cardiff University

@pbFeed @socdatalab



Social Data Science Lab - @socdatalab

•  Formed in 2015 out of the Collaborative Online Social Media Observatory (COSMOS) programme of work (cosmosproject.net)

•  Mission is to continue the work of COSMOS in democratising access to big social data (e.g. Twitter, Foursquare, Instagram) amongst the academic, private, public and third sectors.

•  A significant proportion of research funds have been awarded to collect and analyse social media data in the contexts of Societal Safety and Security e.g. social tension, hate speech, crime reporting and fear of crime, suicidal ideation

•  Working with Metropolitan Police, Department of Health, Food Standards Agency

The Problem

•  Our previous research has studied online social networks as “social machines” that enable spread of malicious or potentially dangerous information (e.g. rumour, hate speech, malware)

•  Concern about suicide and the Internet has moved from dedicated suicide websites to general social media platforms

•  Previous research has shown spikes in recorded suicide rates due to increased risk factors (e.g. celebrity suicide)

The Problem

•  Normalisation of suicidal language (Daine et al., 2013)

•  To date, research has tended to rely on human coding of online content – difficult to scale to ‘volume’ – or on suicide notes (a different state of mind?)

•  Social media analysis has yet to distinguish between different types of suicidal communication

Research Aims

•  To explore the potential of natural language processing and machine learning for automated identification and differentiation of suicide-related communication in very large social media data sets

•  This would enable those responsible for supporting safety and wellbeing (e.g. Samaritans) to establish a more realistic idea of the volume of suicidal information online and possibly identify emerging ‘clusters’

•  While computation is essential, the work was driven from the start by a strong understanding of suicidal communication/language, developed with established suicide researchers

Developing a classifier for suicide-related social media content

•  Anonymised data from suicide discussion fora

•  Human annotated – ‘is this person suicidal?’

•  Identify (TF.IDF) terms & phrases from ‘suicidal texts’

•  Automated collection of data from Twitter & Tumblr using TF.IDF terms

•  Human annotated sample (n=2,000: 1k Twitter + 1k Tumblr) – coding frame

•  c1: Evidence of possible suicidal intent

•  c2: Campaigning (i.e. petitions etc.)

•  c3: Flippant reference to suicide

•  c4: Information or support

•  c5: Memorial or condolence

•  c6: Reporting news of someone’s suicide (not bombing)

•  c7: None of the above
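The TF.IDF step above can be sketched in a few lines of Python. This toy scorer (log-scaled idf, whitespace tokenisation, unigrams only) is illustrative, not the study's actual pipeline, and the example posts are invented:

```python
import math
from collections import Counter

def tfidf_terms(documents, top_n=5):
    """Rank terms in a collection of flagged posts by TF.IDF.

    tf  = term count across the collection
    idf = log(N / df), df counted over documents
    (Illustrative only: the study's exact weighting, tokenisation
    and phrase handling are not published in this deck.)
    """
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)
    df = Counter()                      # document frequency per term
    for doc in tokenised:
        df.update(set(doc))
    tf = Counter()                      # raw term frequency
    for doc in tokenised:
        tf.update(doc)
    scores = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

posts = ["i want to end it all",
         "nobody cares i want to be dead",
         "want to end it all now"]
print(tfidf_terms(posts, top_n=5))
# → ['nobody', 'cares', 'be', 'dead', 'now']
```

Terms appearing in every post (e.g. ‘want’, ‘to’) get an idf of zero and drop out, which is the point of the weighting: the surviving high-scoring terms are the ones distinctive to particular posts.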

Features

(Set 1) Lexical characteristics of the sentences used, such as Parts of Speech (POS), and other language-structural features, such as the most frequently used words and phrases. References to self and others are also captured with POS – these terms have been identified in previous research as being evident within suicidal communication.

(Set 2) Sentiment, affective and emotional features, and levels of the terms used within the text. Emotions such as fear, anger and general aggressiveness are particularly prominent in suicidal communication (WordNet Affect).

(Set 3) Language expressed in short, informal text such as social media posts within a limited number of characters. These were extracted from annotated Tumblr posts.
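The deck does not show the feature extraction as code. A minimal Python sketch of how such binary features might be assembled follows; the word-lists, affect lexicon and regex below are invented miniature stand-ins for the study's full 1444-feature set:

```python
import re

# Hypothetical miniature feature sets (stand-ins, not the study's lists)
WORD_LISTS = ["end it all", "want to be dead", "kill myself"]        # Sets 1/3
AFFECT = {"alarm": ["scared", "terrified"],                          # Set 2
          "misery": ["miserable", "hopeless"]}
REGEXES = [re.compile(r"(depres|sui|sad).*(thoughts|feel)")]         # Set 3

def feature_vector(post):
    """Binary vector: word-list hits, affect-domain hits, regex hits."""
    post = post.lower()
    fv = [int(phrase in post) for phrase in WORD_LISTS]
    fv += [int(any(w in post for w in words)) for words in AFFECT.values()]
    fv += [int(bool(rx.search(post))) for rx in REGEXES]
    return fv

print(feature_vector("so hopeless, i just want to be dead"))
# → [0, 1, 0, 0, 1, 0]
```

Each post becomes a fixed-length 0/1 vector, which is the form the classifiers and PCA on the next slide consume.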

Machine Classification

•  Key question here is: what are the features of suicidal ideation, and what are the features of the other classes?

•  Accuracy important but explanatory value also crucial

•  Methods used for the classifier:

•  Probabilistic (Naïve Bayes), non-probabilistic linear (linear SVM) and rule-based (Decision Tree) machine classifiers

•  Principal Components Analysis (1444 to 255 features)

•  Improvement with an ‘ensemble’ classifier designed to incorporate diverse principal components (Rotation Forest)
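For intuition about the probabilistic baseline named above, here is a from-scratch Bernoulli Naïve Bayes over binary features. This is not the authors' implementation, and the example features and data are invented; it is only a sketch of how such a classifier scores a post:

```python
import math

class BernoulliNB:
    """Tiny Bernoulli Naive Bayes over binary feature vectors.
    Illustrative stand-in, not the classifier used in the paper."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        n_feat = len(X[0])
        count = {c: 0 for c in self.classes}
        feat = {c: [0] * n_feat for c in self.classes}
        for xi, yi in zip(X, y):
            count[yi] += 1
            for j, v in enumerate(xi):
                feat[yi][j] += v
        total = len(y)
        self.log_prior = {c: math.log(count[c] / total) for c in self.classes}
        # Laplace-smoothed P(feature_j = 1 | class)
        self.p = {c: [(feat[c][j] + 1) / (count[c] + 2) for j in range(n_feat)]
                  for c in self.classes}
        return self

    def predict(self, x):
        best, best_lp = None, float("-inf")
        for c in self.classes:
            lp = self.log_prior[c]
            for j, v in enumerate(x):
                pj = self.p[c][j]
                lp += math.log(pj if v else 1 - pj)
            if lp > best_lp:
                best, best_lp = c, lp
        return best

# Hypothetical binary features: contains "kill myself", contains "rip",
# contains a URL — toy data, not the annotated corpus
X = [[1, 0, 0], [1, 0, 0], [0, 1, 1], [0, 1, 1]]
y = ["c1", "c1", "c5", "c5"]
clf = BernoulliNB().fit(X, y)
print(clf.predict([1, 0, 0]))  # → c1
```

The ensemble step (Rotation Forest) then trains many such base learners, each on a differently rotated (PCA-transformed) slice of the feature space, and votes over their outputs.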

Results (all)

Results (suicidal ideation)

Classifier accuracy

Table 2: 10-fold Cross-validation Results: Suicidal Ideation

Feature set      Metric   NB     DT     SVM    RF
Set 1            P        0.509  0.446  0.000  0.627
                 R        0.718  0.474  0.000  0.667
                 F        0.596  0.460  0.000  0.646
Set 2            P        0.479  0.493  0.500  0.640
                 R        0.718  0.474  0.013  0.705
                 F        0.574  0.484  0.025  0.671
Set 3            P        0.491  0.514  0.000  0.684
                 R        0.679  0.474  0.000  0.667
                 F        0.570  0.493  0.000  0.675
Combined         P        0.496  0.393  0.000  0.640
                 R        0.718  0.423  0.000  0.731
                 F        0.586  0.407  0.000  0.683
PCA (combined)   P        0.321  0.345  0.762  0.507
                 R        0.641  0.385  0.205  0.436
                 F        0.427  0.364  0.323  0.469

Table 3: Confusion matrix for the best performing classification model

annotated \ classified as   c1   c2   c3   c4   c5   c6   c7
c1                          57    0   16    0    0    0    5
c2                           0   19    2    4    0    3    0
c3                          13    1  142    0    0    5   16
c4                           0    4    5   20    0    3    3
c5                           1    1    1    0   31    1    1
c6                           0    6    7    6    2   80    3
c7                          18    0   20    1    2    4   98
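Table 4's per-class precision, recall and F scores can be recomputed directly from this confusion matrix (rows are the annotated classes, columns the machine-assigned ones). A quick Python check:

```python
# Table 3 confusion matrix: rows = annotated class, cols = classified as
matrix = [
    [57,  0,  16,  0,  0,  0,  5],   # c1
    [ 0, 19,   2,  4,  0,  3,  0],   # c2
    [13,  1, 142,  0,  0,  5, 16],   # c3
    [ 0,  4,   5, 20,  0,  3,  3],   # c4
    [ 1,  1,   1,  0, 31,  1,  1],   # c5
    [ 0,  6,   7,  6,  2, 80,  3],   # c6
    [18,  0,  20,  1,  2,  4, 98],   # c7
]

def prf(m, k):
    """Precision, recall, F-measure for class index k."""
    tp = m[k][k]
    precision = tp / sum(row[k] for row in m)   # column sum
    recall = tp / sum(m[k])                     # row sum
    f = 2 * precision * recall / (precision + recall)
    return round(precision, 3), round(recall, 3), round(f, 3)

# Suicidal ideation (c1) reproduces Table 4: P=0.640, R=0.731, F=0.683
print(prf(matrix, 0))
# → (0.64, 0.731, 0.683)
```

For c1, precision is 57/89 (its column sum) and recall 57/78 (its row sum), matching the tabulated 0.640 and 0.731.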

6. DISCUSSION

In this section we analyse the main feature components produced by running the PCA procedure on the combined set that resulted in the best set of results, as shown in Tables 1 to 4. The application of PCA reduced the feature set from 1444 to 255 attributes in terms of main components. For the seven suicide-related classes we show in Tables 5 and 6 the most representative principal components and briefly discuss what each class represents in terms of the features in the component and the particular language used in it.

Note that while the distribution of the components per class mirrors the total number of annotations per class (therefore penalising the classes less represented in our data set, such as ‘memorials’), in Tables 5 and 6 and in the related discussion we give priority to the most representative class: posts containing evidence of possible suicidal intent. We can observe the following characteristics of the features included for each class component:

c1: Many of the features that appear dominant in the suicidal ideation class are those related to phrases and expressions identified in the suicide literature as being significantly associated with the language of suicide. In particular, besides a limited number of uni/bi/tri-grams generated directly from the training set, the terms derived from a number of suicide-related web sites were fundamental in classifying suicidal ideation in Twitter data, as were the regular expression features derived from Tumblr posts. Examples like ‘end it all now’ and ‘want to be dead’, and regexes matching expressions of ‘depressive/suicidal/self-harming’ ... ‘thoughts/feelings’, appear strongly related to suicidal ideation and clearly discriminating for this specific class. Other terms (such as ‘killing myself’ and the regex containing ‘die’ ... ‘my sleep’) become effective for classification when used alongside other attributes such as lexical features that express surprise, exaggeration and emphasis (e.g. adverbs (‘really’), predeterminers (e.g. ‘such’, ‘rather’)), and words mapped to specific ‘affective’ domains such as ‘alarm’ and ‘misery’. Note that some other concepts and terms appear with a negative correlation, as expressions of opposite affective states, such as ‘security’ and ‘admiration’.

Table 4: Precision, Recall, and F-measure for the best performing classification model

class   P      R      F
c1      0.640  0.731  0.683
c2      0.613  0.679  0.644
c3      0.736  0.802  0.768
c4      0.645  0.571  0.606
c5      0.886  0.861  0.873
c6      0.883  0.769  0.800
c7      0.778  0.685  0.729
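The regex features discussed for c1 are only partly legible after extraction. A hedged Python reconstruction of two of them follows; the study's exact patterns may differ:

```python
import re

# Reconstructed from the garbled pattern ".+(\bdie).+(\bmy).+\bsleep.+";
# the study's exact regex set is not fully legible in this transcript.
die_in_sleep = re.compile(r".*\bdie\b.*\bmy\b.*\bsleep", re.IGNORECASE)

# Reconstructed "depressive/suicidal/self-harming ... thoughts/feelings"
distress = re.compile(r".*(cutting|depres|sui|bad|sad).*(thoughts|feel)",
                      re.IGNORECASE)

print(bool(die_in_sleep.search("I just want to die in my sleep tonight")))  # → True
print(bool(distress.search("can't shake these suicidal thoughts")))         # → True
print(bool(die_in_sleep.search("slept well, feeling fine")))                # → False
```

Patterns like these capture multi-word templates with variable middles, which single-token or n-gram features miss.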

c2: For the class representing campaigning and petitions we can observe more general concepts, again expressed by regular expressions and language clues (word-lists in our terminology), such as ‘support/help’ and ‘blog’, as well as more specific terms (e.g. ‘safety plea’) and expressions (‘put an end to this’).

c3: As the confusion matrix in Table 3 shows, the class concerning a ‘flippant’ use of suicidal language is the one presenting the major difficulties in classification, since it includes many of the same linguistic features as suicidal ideation. However, the principal components derived for this class identify certain attributes that are the opposite type of sentiment from emotional distress. These include affective states such as ‘levity’, ‘gaiety’, ‘jollity’ and ‘cheerfulness’, as well as popular conversational topics, such as casual remarks about the weather. The confusion occurs where phrases such as ‘kill myself’ are used frivolously.

c4: The class representing posts related to information and support (and prevention) appears mostly represented by specific words (often unigrams and ‘tags’) directly linked to the sources (e.g. #police, #officers, internet) and/or topicality (such as sexual references (‘#lgbt’) and the domains of self-harm and #suicide).

c5: For the class concerning memorial messages, as may be expected, direct mentions of the name of the deceased appear highly influential, as well as ‘time’ references (e.g. ‘a month ago’, ‘a year since’) in association with terms such as ‘killed’ and ‘died’ (well captured by one of our regular expressions). In addition, labels and tags such as ‘rip’ and terms expressing ‘love’ and ‘affection’ are also part of the components associated with this class.

F-measure: c1 = 0.690, all classes: 0.728

Predictive Features

c6: The class concerning news reports related to suicide presents features in large part similar to the information class (c4). In particular, ‘tags’ and words representing sources of information (e.g. #bbc news), types of news (research study or statistical report), and direct mentions of the name of the deceased (as well as general concepts related to the particular case, such as, in the one reported here, the ‘TV’ domain). Note that the last three classes of memorial, information/support and news reporting all share the common characteristic of including URL links within the tweets, which, consequently, does not result in an effective feature for classification among these different classes.

c7: Finally, the class of posts annotated as not related to any of the previous classes exhibits attributes such as general phrases related to self-doubt (such as ‘what’s wrong with me’ and ‘hate myself’) and emotional states (such as ‘jitteriness’ and ‘admiration’). These are phrases that could appear in tweets related to emotional distress but are also clearly evident in general everyday ‘chatter’.

7. CONCLUSION

In this paper we developed a number of machine classifiers built with the aim of classifying text relating to suicide on Twitter. The classifier distinguishes between the more worrying content, such as suicidal ideation, and other suicide-related topics such as reporting of a suicide, memorial, campaigning and support. We built a set of baseline classifiers using lexical, structural, emotive and psychological features extracted from Twitter posts. We then improved on the baseline classifiers by building an ensemble classifier using the Rotation Forest algorithm, achieving an F-measure of 0.74 overall (for 7 classes, including suicidal ideation) and 0.68 for the suicidal ideation class.

We summarised and attempted to explain the results by reflecting on the most significant predictive principal components of each class, to provide insight into the language used on Twitter around suicide-related communication. From this analysis we observed that word-lists and regular expressions (regex) extracted from online suicide-related discussion fora and other microblogging Web sites appear capable of capturing relevant language ‘clues’, both as single words and n-grams (word-lists) and as more complex patterns. These appear particularly effective for the suicidal ideation class, expressing emotional distress. Lexical and grammar features such as POS appear mostly ineffective and are scarcely present in the principal components (only some mentions of predeterminers, existential clauses and superlatives that, however, also relate to more specific ‘affective’ language features than purely lexical ones). Affective lexical domains appear instead very relevant (such as those represented by the WordNet library of ‘cognitive synonyms’) and able to represent well the affective and emotional states associated with this particular type of language.

Concepts and labels representing broader semantic domains (also derived from the WordNet library) are, on the contrary, not effective. In fact, although they appear rather numerous as attributes within the principal components, on close inspection they prove in the majority of cases irrelevant, mostly generated by a ‘confusion’ and ‘misrepresentation’ of words (such as sentences like ‘my reason …

Table 5: Principal components per class

c1 - Evidence of possible suicidal intent

+0.185 word_list1 ‘end it all’ (521)  +0.185 ‘end it all’  +0.179 ‘it all now’  +0.179 ‘all now’  +0.175 ‘it all’

+0.149 word_list1 ‘want to be dead’ (554)  -0.133 …  -0.129 ‘i think’  +0.125 word_list1 ‘to commit suicide’ (547)  +0.114 ‘really’

+0.149 word_list1 ‘want to be dead’ (554)  +0.145 wn_affect11 ‘alarm’ (496)  -0.123 number_of_adverb_superlative (211)  -0.121 word_list7 ‘relationship’ (780)  +0.118 regEx_class6 ‘.+\report.+’ (701)

+0.153 ‘thinking about killing’  +0.153 ‘about killing myself’  +0.153 ‘about killing’  +0.147 ‘so im’  +0.147 wn_affect11 ‘misery’ (314)

+0.119 number_of_predeterminers (206)  +0.117 regEx_class1 ‘.+((\cutting|\depres|\sui)|\these|\bad|\sad).+(\thoughts|\feel).+’ (667)  +0.115 wn_domain ‘astrology’ (160)  -0.106 ‘bombing’

+0.231 regEx_class1 ‘.+(\bdie).+(\bmy).+\bsleep.+’  +0.177 word_list ‘want to be dead’ (554)  -0.155 wn_domain ‘dentistry’ (113)  -0.146 wn_affect11 ‘security’ (277)  -0.129 wn_affect11 ‘admiration’

c2 - Campaigning (i.e. petitions etc.)

+0.25 word_list2 ‘support’ (746)  -0.134 wn_domain ‘racing’ (84)  +0.119 regEx_class2 ‘.+blog.+’ (683)  +0.113 wn_domain ‘jewellery’

+0.189 ‘safety’  +0.188 ‘plea’  +0.188 ‘safety plea’  +0.188 ‘plea over’

+0.187 ‘end to’  +0.187 word_list ‘put an end to this’ (540)  +0.187 ‘an end to’  +0.187 ‘an end’  +0.152 ‘r i’

c3 - Flippant reference to suicide

+0.112 wn_domain ‘meteorology’ (166)  +0.11 ‘to live’  +0.107 wn_affect1 ‘jollity’ (333)  +0.107 wn_affect11 ‘levity’ (327)  +0.107 wn_affect11 ‘levity-gaiety’ (378)

+0.14 word_list ‘want to be here anymore’ (575)  -0.13 number_of_existentials (there) (196)  +0.126 wn_affect11 ‘cheerfulness’ (459)  -0.111 ‘so’  -0.111 ‘really’

+0.162 wn_affect11 ‘jollity’ (333)  +0.162 wn_affect11 ‘levity’ (327)  +0.162 wn_affect11 ‘levity-gaiety’ (378)  +0.128 ‘or’  +0.113 wn_domain ‘meteorology’ (166)

-0.159 ‘myself’  -0.144 regEx_class3 total (662)  -0.136 regEx_class3 ‘.+(\to).+(\kill|\disapp).+’ (672)  -0.125 ‘to kill myself’  -0.125 ‘to kill’

c4 - Information or support

+0.152 ‘and anxiety self-harm’  +0.152 ‘challenge’  +0.152 ‘challengesexps to#lgbt’  +0.152 ‘young people#mylgbthealth’

+0.175 ‘#police #officers in’  +0.175 ‘#suicide preventiontoday’  +0.175 ‘#suicide prevention’  +0.175 ‘officers trained’  +0.175 ‘#police’

+0.21 ‘internet & suicide’  +0.21 ‘between internet &’  +0.21 ‘& suicide http’  +0.21 ‘& suicide’  +0.21 ‘internet’

c5 - Memorial or condolence

+0.155 regEx_class5 ‘.+(\kill|\die|\comm).+(day|month|year).+’ (692)  +0.138 wn_domain ‘mathematics’ (117)  +0.13 wn_domain ‘agriculture’ (104)  -0.12 wn_domain ‘tax’ (126)  -0.116 number_of_interjections (215)

+0.125 wn_affect11 ‘love’ (324)  +0.125 ‘love’  +0.112 ‘rip *name replaced*’  +0.11 ‘rip *name replaced*’  +0.107 ‘rip’

c6 - Reporting news of someone’s suicide (not bombing)

+0.178 ‘bbc news’  +0.15 ‘number’  +0.15 ‘deaths by’  +0.15 ‘deaths by suicide’  +0.15 ‘number of’  +0.15 ‘by suicide from’

+0.158 ‘research’  -0.123 ‘off’  -0.107 ‘self’  +0.106 ‘to study link’  +0.1 ‘see’ (626)

+0.129 regEx_class6 ‘.+friend.+’ (690)  +0.12 ‘friend’ (608)  -0.114 regEx_class2 ‘.+blog.+’ (683)  -0.101 ‘adverb’ (599)  +0.101 ‘killed’

+0.144 ‘self’  +0.121 wn_domain ‘tv’ (184)  +0.101 ‘*name replaced*’ (13)  +0.101 ‘*name replaced*’  +0.093 ‘dead’

Explanatory features

•  Word-lists and regular expressions (regex) extracted from online suicide-related discussion forums and other microblogging Web sites provide ‘clues’ effective for the suicidal ideation class

•  Lexical and grammar features such as POSs appear mostly ineffective

•  ‘Affective’ language features are very relevant (e.g. those represented by the WordNet library of ‘cognitive synonyms’) and represent well the affective and emotional states associated with this type of language

•  Sentiment scores generated by software tools for sentiment analysis also appear ineffective, and are scarcely (or not at all) included within the principal components predictive of each class

Networks of Suicidal Ideation

“…shortest path of retweets of suicidal ideation was higher than previous studies that reported on general retweet path length. Our results found an average of 5, while other research reported metrics between 2 and 4.8.”

Colombo, G., Burnap, P., Hodorog, A. and Scourfield, J. (2015) ‘Analysing the connectivity and communication of suicidal users on Twitter’, Computer Communications - available open access http://tinyurl.com/suicidenetworks
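The retweet path-length metric quoted above is an average shortest-path computation over the retweet graph. A stdlib-only BFS sketch on an invented toy graph (not the paper's data):

```python
from collections import deque

def shortest_paths(graph, start):
    """BFS distances from `start` over an adjacency dict."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in graph.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def average_path_length(graph):
    """Mean shortest-path length over all connected ordered pairs."""
    total = count = 0
    for start in graph:
        for node, d in shortest_paths(graph, start).items():
            if node != start:
                total += d
                count += 1
    return total / count

# Hypothetical retweet chain a-b-c-d, stored undirected for reachability
g = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(average_path_length(g))
# → 1.6666666666666667
```

On a chain, paths are long; on the densely connected graphs typical of general retweet studies they are short, which is the contrast the quoted finding (an average of 5 versus 2–4.8) draws on.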

Thanks

Questions?

@pbFeed