Identifying Actionable Messages on Social Media Nemanja Spasojevic, Adithya Rao October 29, 2015 @ 2015 IEEE International Big Data Conference Workshop on Mining Big Data in Social Networks

TRANSCRIPT

  • Identifying Actionable Messages on Social Media

    Nemanja Spasojevic, Adithya Rao

    October 29, 2015 @ 2015 IEEE International Big Data Conference, Workshop on Mining Big Data in Social Networks

    http://cci.drexel.edu/bigdata/bigdata2015/

  • Motivation

    Customer Support, PR Management, Engagement

  • Motivation - Lithium Social Web

  • Problem Setting

    75 companies / brands, 35 languages, Facebook & Twitter

    Identify actionable social media messages in the context of an agent of the company.

    Actionable message: a message with a clear call to action, e.g. raising an issue to which an agent may provide a helpful response.

  • Examples

  • Challenges - Company Behaviour

    Depending on the company:

    1. Categories
       a. Media Distribution
       b. Telecommunications
       c. Retail (Apparel, Beauty, Electronics...)
       d. Airlines
       e. Manufacturers
    2. Different Objectives
    3. Various Data Coverage Across Companies

  • Challenges - Social Network

    Spoken language depends on platform.

    Twitter

    Facebook

    *public data

  • Challenges - Language

    [Table of sample messages, one per language code: ar, bg, cs, da, de, es, fa, fi, gu, he, hi, id, it, ja, kn, ... Most of the message text, especially non-Latin scripts, did not survive extraction.]

  • Methodology

    Domain attributes: Company (c), Language (l), Source (s)

    E.g. fully specified domain: D = {c, l, s}

    Consider all domains: P(D) = {{}, {c}, {l}, {s}, {c, l}, {l, s}, {c, s}, {c, l, s}}

    Example: P({yahoo, en, tw}) = {{}, {yahoo}, {en}, {tw}, {yahoo, en}, {en, tw}, {yahoo, tw}, {yahoo, en, tw}}

    Model Building: Build classifiers C_{D*} for all D* ∈ P(D), and choose the classifier with the best F measure evaluated on the validation set.
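
    The model-building step can be illustrated with a minimal sketch. It enumerates the power set of a fully specified domain and keeps the candidate with the best F measure on a validation set. The names `power_set`, `f_measure`, `select_best`, and the `train_fn` callback are hypothetical and not from the slides; the real system trains actual classifiers per sub-domain, which this sketch only stands in for.

```python
from itertools import combinations


def power_set(domain):
    """Return every subset of an attribute -> value mapping, smallest first."""
    keys = list(domain)
    return [
        {k: domain[k] for k in combo}
        for size in range(len(keys) + 1)
        for combo in combinations(keys, size)
    ]


def f_measure(y_true, y_pred):
    """Balanced F measure (F1) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_best(domain, train_fn, validation):
    """Train one model per sub-domain and keep the best on validation data.

    train_fn(sub_domain) is a hypothetical callback returning a predict
    function; validation is a list of (message, is_actionable) pairs.
    """
    y_true = [label for _, label in validation]
    best = None
    for sub in power_set(domain):
        predict = train_fn(sub)
        score = f_measure(y_true, [predict(msg) for msg, _ in validation])
        if best is None or score > best[0]:
            best = (score, sub, predict)
    return best
```

    Calling `power_set({"c": "yahoo", "l": "en", "s": "tw"})` yields the eight sub-domains listed in the example above, from `{}` up to the fully specified domain.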

  • System Overview

  • Features

    Lexicon Based (generated, and SentiWordNet 3.0)
    Marker Based (?, !, via, rt, @, #, ...)
    Emoticon (Text emoticons, Emoji, Kaomoji)
    Readability Features (Flesch-Kincaid, Dale-Chall)
    Document Length

  • Lexicon Generation

    For each (domain, class) pair (d, a), generate a dictionary w_{d,a}(t).

    message_{d,a} → tokens_{d,a} → ndf_{d,a} → adf_{d,a} → w_{d,a}
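
    The transcript preserves only the stage names of this pipeline, not the formulas, so the following is a rough sketch under stated assumptions rather than the authors' method. It assumes that ndf is the fraction of a class's messages that contain a token, and that a token's weight is its share of ndf in the actionable class. The adf stage is skipped because its definition does not appear in the slides. All function names are hypothetical.

```python
from collections import Counter


def tokenize(message):
    # Assumption: lowercase whitespace tokenization.
    return message.lower().split()


def ndf(messages):
    """Fraction of messages containing each token (assumed meaning of ndf)."""
    counts = Counter()
    for message in messages:
        counts.update(set(tokenize(message)))  # count each token once per message
    total = len(messages)
    return {token: count / total for token, count in counts.items()}


def lexicon_weights(actionable, non_actionable):
    """Assumed weight: ndf in the actionable class over the sum across classes."""
    ndf_a = ndf(actionable)
    ndf_n = ndf(non_actionable)
    return {
        token: value / (value + ndf_n.get(token, 0.0))
        for token, value in ndf_a.items()
    }
```

    Under these assumptions a token seen only in actionable messages gets weight 1.0, and a token equally common in both classes gets 0.5.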

  • Lexicon Example

  • Marker Features

    ?, ! - punctuation
    rt, via - credit the original content author
    @ - mention, to draw attention to a user or credit authorship
    # - indicator of topics associated with the content
    … - external context
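
    A minimal sketch of presence-style marker features follows. The names CHAR_EXCLAMATION_MARK, CHAR_AT, and CHAR_HASH appear in the Feature Generation Example slide; the remaining feature names, the regular expressions, and the function `marker_features` are assumptions for illustration. The scaled and positional variants shown on that slide (such as CHAR_AT_0) are not reproduced, since the transcript does not define them.

```python
import re

# Feature name -> pattern that detects the marker.
# Only the "!", "@" and "#" names come from the slides; the rest are hypothetical.
MARKERS = {
    "CHAR_QUESTION_MARK": re.compile(r"\?"),
    "CHAR_EXCLAMATION_MARK": re.compile(r"!"),
    "CHAR_AT": re.compile(r"@"),
    "CHAR_HASH": re.compile(r"#"),
    "WORD_RT": re.compile(r"\brt\b", re.IGNORECASE),
    "WORD_VIA": re.compile(r"\bvia\b", re.IGNORECASE),
}


def marker_features(message):
    """Return 1.0 for each marker present in the message, else 0.0."""
    return {
        name: 1.0 if pattern.search(message) else 0.0
        for name, pattern in MARKERS.items()
    }
```

    Word boundaries keep `rt` and `via` from matching inside longer words such as "start" or "viable".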

  • Emoticon Features

    Text Emoticons: :-) , :) , }:-> , 8-) ...
    Emoji: ...
    Kaomoji: (), o(>_

  • Feature Generation Example

    Thanks @ for crashing and keeping us waiting all day for a response! Really appreciate it!! #waste

    "WORD-SENTI_WORD_NET_POSITIVE-SCALED" : 0.088
    "WORD-SENTI_WORD_NET_NEGATIVE-SCALED" : 0.025
    "KEYWORD_ACTIONABLE-GENERAL-SCALED" : 0.305
    "KEYWORD_NON_ACTIONABLE-GENERAL-SCALED" : 0.2
    "CHAR_EXCLAMATION_MARK" : 1.0
    "CHAR_EXCLAMATION_MARK_0" : 0.642
    "CHAR_EXCLAMATION_MARK_1" : 0.821
    "CHAR_AT" : 1.0
    "CHAR_AT_0" : 0.934
    "CHAR_HASH" : 1.0
    "CHAR_HASH_0" : 0.902
    "EMOTICON-GENERAL-PRESENT" : 1.0
    "EMOTICON-POSITIVE-PRESENT" : 1.0
    "EMOTICON-POSITIVE-SCALED" : 0.25
    "EMOTICON-NEGATIVE-PRESENT" : 1.0
    "EMOTICON-NEGATIVE-SCALED" : 0.75
    "WORD-GENERAL-GREATER_THAN-10" : 0.17
    "CHARACTER-GENERAL-GREATER_THAN-100" : 0.121

  • Results - Strategy Performance

    Model:
    A - Logistic model for {c, l, s}
    B - Logistic model for {}
    C - Logistic for best domain
    D - Best model-domain

  • Results - Domain Selected

  • Results - ML Technique Selected

    *https://github.com/myui/hivemall

  • Results - Examples

  • Conclusion

    Scalable framework for actionability detection on social media

    Framework for model building that works across:
    75 companies
    35 languages
    Twitter and Facebook

    User Perceived F measure - 0.78, Accuracy - 0.74

    Future Work - Deep Learning, Better Sentiment Analysis, ...

  • Q & A

  • Results - Network Performance

  • Results - Language Performance

  • Results - Language Performance Examples

  • Lexicon Features

    Autogenerated Lexicons w_{d,a}
    SentiWordNet 3.0 (English-only sentiment lexicon)
