analysis of tweets and tweeters during the 2014 umbrella ... · analysis of tweeters: bots and...

1
Analysis of Tweeters: Bots and Humans Ø Who are the top tweeters? Ø What is the rela/onship between the top tweeters? Ø What is the contribu/on of bots to the amount of tweets? Tweet-retweet graph among top 3000 tweeters (over half of them are bots). Le@: top retweeted users (major news media) Right: top retwee/ng users (bots) Top 3000 tweeters generate 40% of tweets Graph of follower-following rela/onships collected from tweets generated at 4pm,11-15-2014. All are El Universal bot groups that follow each other. Graph of follower-following rela/onships collected from tweets generated at 10am,10-23-2014. All are Falcons News bot groups that follow each other. Tweet-retweet volume before and a@er bot removal.Note the significant reduc/on in noise peaks. Conclusions References Zi Chu, Steven Gianvecchio, Haining Wang, and Sushil Jajodia. 2012. Detec/ng Automa/on of TwiYer Accounts: Are You a Human, Bot, or Cyborg?. IEEE Trans. Dependable Secur. Comput. 9, 6 (November 2012), 811-824. Zhao, Siqi, et al. "Human as real-/me sensors of social and physical events: A case study of twiYer and sports games." arXiv preprint arXiv:1106.4300(2011 Stockmann, Daniela, Big Data from China and Its Implica/on for the Study of the Chinese State -- A Research Report on the 2014 Hongkong Protests on Weibo (May 26, 2015). 90% - 95% of tweets are about news events or are retweets of news events. Only 5% - 10% express opinions about the events. Top 3000 tweeters (out of 240,000 users) generate 40% of tweets. Over half of them are bots. Bot groups have interes/ng internal structures. Sen/ment analysis of tweets reveals the lack of sustained posi/ve feelings about the revolu/on. Ø We collected 2,915,037 tweets about the Hong Kong Revolu/on from 10/05/2014 to 12/06/2014. Ø Goals: Understand the dynamics of the revolu/on using tweets alone. Introduc;on + Computer Science Department, Rice University, * Poli/cal Science Department, Rice University Zhouhan Chen + ([email protected]), Yun Zhou + , Rima Tanash + , Jieya Wen + , Dan Wallach + , Richard Stoll * , Devika Subramanian + Analysis of Tweets and Tweeters during the 2014 Umbrella Revolu;on Data Collec;on Ø Analyze peaks in TwiYer volume. Ø Use TwiYer API to build follower-following and tweet-retweet graphs. Ø Detect bot and human groups with community detec/on algorithms on those graphs. Ø Interpret graphs with visualiza/on tools such as Gephi. Ø Apply natural language processing (NLP) and machine learning for sen/ment analysis on tweets. Ø Tweets collected from: TwiYer Streaming API Ø Number of protest-related tweets: 1,062,606 Ø Number of geo-enabled tweets: 3583 Methods According to TwiYer, loca%on is “The user-defined locaCon for this account’s profile”. The map shows that many people from all over the world were physically present in Hong Kong during the protest. Ø Millions of raw tweets. Ø Many top tweeters are bots, introducing noise in our analysis. Ø Rela/onship between tweeters is complex. Challenges Analysis of Tweets: Events and Sen;ment Ø What are tweets about? Ø What is the rela/onship between tweet volume and major events? Ø How do people feel about the events? Example posi/ve tweet: “Something about this just gives me a liIle more faith in the world. Hong Kong protesters are so freaking nice.” Example nega/ve tweet: “EveryCme I hear about the Umbrella RevoluCon I want to groan for a few years.”

Upload: others

Post on 27-May-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Analysis of Tweets and Tweeters during the 2014 Umbrella ... · Analysis of Tweeters: Bots and Humans Ø Who are the top tweeters? Ø What is the relaonship between the top tweeters?

AnalysisofTweeters:BotsandHumansØ  Whoarethetoptweeters?Ø  Whatistherela/onshipbetweenthetoptweeters?Ø  Whatisthecontribu/onofbotstotheamountoftweets?

Tweet-retweetgraphamongtop3000tweeters(overhalfofthemarebots).Le@:topretweetedusers(majornewsmedia)Right:topretwee/ngusers(bots)

Top3000tweetersgenerate40%oftweets

Graphoffollower-followingrela/onshipscollectedfromtweetsgeneratedat4pm,11-15-2014.AllareElUniversalbotgroupsthatfolloweachother.

Graphoffollower-followingrela/onshipscollectedfromtweetsgeneratedat10am,10-23-2014.AllareFalconsNewsbotgroupsthatfolloweachother.

Tweet-retweetvolumebeforeanda@erbotremoval.Notethesignificantreduc/oninnoisepeaks.

Conclusions ReferencesZiChu,StevenGianvecchio,HainingWang,andSushilJajodia. 2012. Detec/ng Automa/on of TwiYerAccounts: Are You a Human, Bot, or Cyborg?. IEEETrans. Dependable Secur. Comput. 9, 6 (November2012),811-824.Zhao,Siqi,etal."Humanasreal-/mesensorsofsocialand physical events: A case study of twiYer andsportsgames."arXivpreprintarXiv:1106.4300(2011Stockmann, Daniela, Big Data from China and ItsImplica/on for the Study of the Chinese State -- AResearch Report on the 2014Hongkong Protests onWeibo(May26,2015).

�  90%-95%oftweetsareaboutnewseventsorareretweetsofnewsevents.Only5%-10%expressopinionsabouttheevents.

�  Top3000tweeters(outof240,000users)generate40%oftweets.Overhalfofthemarebots.Botgroupshaveinteres/nginternalstructures.

�  Sen/mentanalysisoftweetsrevealsthelackofsustainedposi/vefeelingsabouttherevolu/on.

Ø  Wecollected2,915,037tweetsabouttheHongKongRevolu/onfrom10/05/2014to12/06/2014.

Ø  Goals:Understandthedynamicsoftherevolu/onusingtweetsalone.

Introduc;on

Contact

+ComputerScienceDepartment,RiceUniversity,*Poli/calScienceDepartment,RiceUniversity

ZhouhanChen+([email protected]),YunZhou+,RimaTanash+,JieyaWen+,DanWallach+,RichardStoll*,DevikaSubramanian+

AnalysisofTweetsandTweetersduringthe2014UmbrellaRevolu;on

DataCollec;on

Ø  AnalyzepeaksinTwiYervolume.Ø  UseTwiYerAPItobuildfollower-followingand

tweet-retweetgraphs.Ø  Detectbotandhumangroupswithcommunity

detec/onalgorithmsonthosegraphs.Ø  Interpretgraphswithvisualiza/ontoolssuchas

Gephi.Ø  Applynaturallanguageprocessing(NLP)and

machinelearningforsen/mentanalysisontweets.

Ø  Tweetscollectedfrom:TwiYerStreamingAPIØ  Numberofprotest-relatedtweets:1,062,606Ø  Numberofgeo-enabledtweets:3583

Methods

According to TwiYer, loca%on is “The user-definedlocaCon for thisaccount’sprofile”.Themapshowsthat many people from all over the world werephysicallypresentinHongKongduringtheprotest.

Ø  Millionsofrawtweets.Ø  Manytoptweetersarebots,introducingnoise

inouranalysis.Ø  Rela/onshipbetweentweetersiscomplex.

Challenges

AnalysisofTweets:EventsandSen;ment

Ø  Whataretweetsabout?Ø  Whatistherela/onshipbetweentweetvolume

andmajorevents?Ø  Howdopeoplefeelabouttheevents?

Exampleposi/vetweet:“SomethingaboutthisjustgivesmealiIlemorefaithintheworld.HongKongprotestersaresofreakingnice.”Examplenega/vetweet:“EveryCmeIhearabouttheUmbrellaRevoluConIwanttogroanforafewyears.”