sune lehman: #socialbots and how we created the boston banksy hoax
DESCRIPTION
Prof. Sune Lehman used his course machine learning to create socialbots, that in their turn created a Boston Banksy Hoax.TRANSCRIPT
the thrilling story of the
#socialbots
Sune Lehmann Associate Professor Department of Applied Mathematics and Computer Science Technical University of Denmark @suneman
Art by Allan Criss
Artwork: C. Hidalgo
network structure
LETTERS
Link communities reveal multiscale complexity innetworksYong-Yeol Ahn1,2*, James P. Bagrow1,2* & Sune Lehmann3,4*
Networks have become a key approach to understanding systemsof interacting objects, unifying the study of diverse phenomenaincluding biological organisms and human society1–3. One crucialstep when studying the structure and dynamics of networks is toidentify communities4,5: groups of related nodes that correspondto functional subunits such as protein complexes6,7 or socialspheres8–10. Communities in networks often overlap9,10 such thatnodes simultaneously belong to several groups. Meanwhile, manynetworks are known to possess hierarchical organization, wherecommunities are recursively grouped into a hierarchical struc-ture11–13. However, the fact that many real networks have com-munities with pervasive overlap, where each and every nodebelongs to more than one group, has the consequence that a globalhierarchy of nodes cannot capture the relationships between over-lapping groups. Here we reinvent communities as groups of linksrather than nodes and show that this unorthodox approach suc-cessfully reconciles the antagonistic organizing principles of over-lapping communities and hierarchy. In contrast to the existingliterature, which has entirely focused on grouping nodes, linkcommunities naturally incorporate overlap while revealing hier-archical organization. We find relevant link communities in manynetworks, including major biological networks such as protein–protein interaction6,7,14 and metabolic networks11,15,16, and showthat a large social network10,17,18 contains hierarchically organizedcommunity structures spanning inner-city to regional scales whilemaintaining pervasive overlap. Our results imply that link com-munities are fundamental building blocks that reveal overlap andhierarchical organization in networks to be two aspects of thesame phenomenon.
Although no common definition has been agreed upon, it is widelyaccepted that a community should have more internal than externalconnections19–24. Counterintuitively, highly overlapping communitiescan have many more external than internal connections (Fig. 1a, b).Because pervasive overlap breaks even this fundamental assumption, anew approach is needed.
The discovery of hierarchy and community organization has alwaysbeen considered a problem of determining the correct membership(or memberships) of each node. Notice that, whereas nodes belong tomultiple groups (individuals have families, co-workers and friends;Fig. 1c), links often exist for one dominant reason (two people are inthe same family, work together or have common interests). Instead ofassuming that a community is a set of nodes with many links betweenthem, we consider a community to be a set of closely interrelated links.
Placing each link in a single context allows us to reveal hierarchicaland overlapping relationships simultaneously. We use hierarchicalclustering with a similarity between links to build a dendrogramwhere each leaf is a link from the original network and branches
represent link communities (Fig. 1d, e and Methods). In this den-drogram, links occupy unique positions whereas nodes naturallyoccupy multiple positions, owing to their links. We extract link com-munities at multiple levels by cutting this dendrogram at variousthresholds. Each node inherits all memberships of its links and canthus belong to multiple, overlapping communities. Even though weassign only a single membership per link, link communities can alsocapture multiple relationships between nodes, because multiplenodes can simultaneously belong to several communities together.
The link dendrogram provides a rich hierarchy of structure, but toobtain the most relevant communities it is necessary to determine thebest level at which to cut the tree. For this purpose, we introduce anatural objective function, the partition density, D, based on linkdensity inside communities; unlike modularity20, D does not sufferfrom a resolution limit25 (Methods). Computing D at each level of thelink dendrogram allows us to pick the best level to cut (althoughmeaningful structure exists above and below that threshold). It isalso possible to optimize D directly. We can now formulate overlap-ping community discovery as a well-posed optimization problem,accounting for overlap at every node without penalizing that nodesparticipate in multiple communities.
As an illustrative example, Fig. 1f shows link communities aroundthe word ‘Newton’ in a network of commonly associated Englishwords. (See Supplementary Information, section 6, for details onnetworks used throughout the text.) The ‘clever, wit’ community iscorrectly identified inside the ‘smart/intellect’ community. Thewords ‘Newton’ and ‘Gravity’ both belong to the ‘smart/intellect’,‘weight’ and ‘apple’ communities, illustrating that link communitiescapture multiple relationships between nodes. See SupplementaryInformation, section 3.6, for further visualizations.
Having unified hierarchy and overlap, we provide quantitative,real-world evidence that a link-based approach is superior to exist-ing, node-based approaches. Using data-driven performance mea-sures, we analyse link communities found at the maximum partitiondensity in real-world networks, compared with node communitiesfound by three widely used and successful methods: clique percola-tion9, greedy modularity optimization26 and Infomap21. Clique per-colation is the most prominent overlapping community algorithm,greedy modularity optimization is the most popular modularity-based20 technique and Infomap is often considered the most accuratemethod available27.
We compiled a test group of 11 networks covering many domainsof active research and representing the wide body of available data(Supplementary Table 2). These networks vary from small to large,from sparse to dense, and from those with modular structure to thosewith highly overlapping structure. We highlight a few data sets ofparticular scientific importance: The mobile phone network is the
*These authors contributed equally to this work.
1Center for Complex Network Research, Department of Physics, Northeastern University, Boston, Massachusetts 02115, USA. 2Center for Cancer Systems Biology, Dana-FarberCancer Institute, Harvard University, Boston, Massachusetts 02215, USA. 3Institute for Quantitative Social Science, Harvard University, Cambridge, Massachusetts 02138, USA.4College of Computer and Information Science, Northeastern University, Boston, Massachusetts 02115, USA.
doi:10.1038/nature09182
1Macmillan Publishers Limited. All rights reserved©2010
most comprehensive proxy of a large-scale social network currentlyin existence17,18; the metabolic network iAF1260, from Escherichia coliK-12 MG1655 strain, is one of the most elaborate reconstructionscurrently available16; and the three protein–protein interaction net-works of Saccharomyces cerevisiae are the most recent and completeprotein–protein interaction data yet published14.
These networks possess rich metadata that allow us to describe thestructural and functional roles of each node. For example, the bio-logical roles of each protein in the protein–protein interaction net-work can be described by a controlled vocabulary (Gene Ontologyterms28). By calculating metadata-based similarity measures betweennodes (Methods and Supplementary Information, section 5), we candetermine the quality of communities by the similarity of the nodesthey contain (‘community quality’). Likewise, we can use metadata toestimate the expected amount of overlap around a node, testing thequality of the discovered overlap according to the metadata (‘overlapquality’). For example, metabolites that participate in more meta-bolic pathways are expected to belong to more communities thanmetabolites that participate in fewer pathways. Some methods mayfind high-quality communities but only for a small fraction of thenetwork; coverage measures describe how much of the network wasclassified by each algorithm (‘community coverage’) and how muchoverlap was discovered (‘overlap coverage’). Each community algo-rithm is tested by comparing its output with the metadata, to deter-mine how well the discovered community structure reflects themetadata, according to the four measures. Each measure is normalizedsuch that the best method attains a value of one. ‘Composite perfor-mance’ is the sum of these four normalized measures, such that themaximum achievable score is four. Full details are in Methods andSupplementary Information, sections 5 and 6.
Figure 2 displays the results of this quantitative comparison, show-ing that link communities reveal more about every network’s meta-data than other tested methods. Not only is our approach the overallleader in every network, it is also the winner in most individual aspectsof the composite performance for all networks, particularly the qualitymeasures. The performance of link communities stands out for densenetworks, such as the metabolic and word association networks,which are expected to have pervasively overlapping structure.
University
Joint appointment
Home and work
FamilyBuildings in same neighborhood
Inertia
Law
Newton
Physics
Lab Biology
Scientific
Chemical
Chemistry
Science
EinsteinTheory
Hypothesis
Theorem
Gravity
Relativity
Biologist
Smart
Scientist
Sly
Bright
Genius
Intelligence
Clever
IntelligentGifted
Wise
Inventor Brilliant
Wisdom
Kinetic
Exceptional
Retarded
Invent
Chemist
Wit
Velocity
Intellect
Cunning
Outfox
Flask Beaker
Test tube
Experiment
Research
Apple
Weight
Experiment, science
Newton, gravity, apple
Smart, intellect, scientists
Clever, wit
Science, scientists
21
3
5
3–42–41–42–31–21–34–75–64–64–57–97–88–9
4
67
8
9
a b
c
d
f
e
MeasuresOverlap coverage
Overlap qualityCommunity coverage
Community quality
MethodsLCGI
––––
LinksClique percolationGreedy modularity Infomap
Other networksSocial networksBiological networks
885,9896.34⟨k⟩
N 1,04216.81
1,6473.06
1,00416.57
1,2134.21
2,7298.92
67,4118.90
39038.95
1,2199.80
5,01822.02
18,1425.09
0
1
2
3
4
L C G I L C G I L C G I L C G I L C G I L C G I L C G I L C G I L C G I L C G I L C G I
Com
posi
te p
erfo
rman
ce
Amazon.comWord assoc.PhilosopherUS CongressActorPhonePPI (all)PPI (LC)PPI (AP/MS)PPI (Y2H)Metabolic
Figure 2 | Assessing the relevance of link communities using real-worldnetworks. Composite performance (Methods and SupplementaryInformation) is a data-driven measure of the quality (relevance of discoveredmemberships) and coverage (fraction of network classified) of communityand overlap. Tested algorithms are link clustering, introduced here; cliquepercolation9; greedy modularity optimization26; and Infomap21. Test
networks were chosen for their varied sizes and topologies and to representthe different domains where network analysis is used. Shown for each are thenumber of nodes, N, and the average number of neighbours per node, Ækæ.Link clustering finds the most relevant community structure in real-worldnetworks. AP/MS, affinity-purification/mass spectrometry; LC, literaturecurated; PPI, protein–protein interaction; Y2H, yeast two-hybrid.
Figure 1 | Overlapping communities lead to dense networks and preventthe discovery of a single node hierarchy. a, Local structure in manynetworks is simple: an individual node sees the communities it belongs to.b, Complex global structure emerges when every node is in the situationdisplayed in a. c, Pervasive overlap hinders the discovery of hierarchicalorganization because nodes cannot occupy multiple leaves of a nodedendrogram, preventing a single tree from encoding the full hierarchy.d, e, An example showing link communities (colours in d), the link similaritymatrix (e; darker entries show more similar pairs of links) and the linkdendrogram (e). f, Link communities from the full word association networkaround the word ‘Newton’. Link colours represent communities and filledregions provide a guide for the eye. Link communities capture conceptsrelated to science and allow substantial overlap. Note that the words wereproduced by experiment participants during free word associations.
LETTERS NATURE
2Macmillan Publishers Limited. All rights reserved©2010
It is instructive to examine further the statistics of link communitiesin the metabolic and mobile phone networks (Fig. 3). The communitysize distribution at the optimum value of D is heavy tailed for bothnetworks, whereas the number of communities per node distinguishesthem (Fig. 3, insets): Mobile phone users are limited to a smaller rangeof community memberships, most likely as a result of social and timeconstraints. Meanwhile, the membership distribution of the metabolicnetwork displays the universality of currency metabolites (water, ATPand so on) through the large number of communities they participatein. Notable previous work11,15 removed currency metabolites beforeidentifying meaningful community structure. The statistics presentedhere match current knowledge about the two systems, further con-firming the communities’ relevance.
Having established that link communities at the maximal partitiondensity are meaningful and relevant, we now show that the linkdendrogram reveals meaningful communities at different scales.Figure 4a–c shows that mobile phone users in a community arespatially co-located. Figure 4a maps the most likely geographic loca-tions of all users in the network; several cities are present. In Fig. 4b,we show (insets) several communities at different cuts above theoptimum threshold, revealing small, intra-city communities. Belowthe optimum threshold, larger, yet still spatially correlated, com-munities exist (Fig. 4c). Because we expect a tight-knit communityto have only small geographical dispersion, the clustered structureson the map indicate that the communities are meaningful. The geo-graphical correlation of each community does not suddenly breakdown, but is sustained over a wide range of thresholds. In Fig. 4d, welook more closely at the social network of the largest community inFig. 4c, extracting the structure of its largest subcommunity alongwith its remaining hierarchy and revealing the small-scale structuresencoded in the link dendrogram. This example provides evidence forthe presence of spatial, hierarchical organization at a societal scale. Tovalidate the hierarchical organization of communities quantitatively
throughout the dendrogram, we use a randomized control dendro-gram that quantifies how community quality would evolve if therewere no hierarchical organization beyond a certain point. Figure 4eshows that the quality of the actual communities decays much moreslowly than the control, indicating that real link dendrograms possessa large range of high quality community structures. The quantitativeresults of Fig. 4 are typical for the full test group, implying that rich,meaningful community structure is contained within the link den-drogram. Additional results supporting these conclusions are pre-sented in Supplementary Information, section 7.
Many cutting-edge networks are far from complete. For example,an ambitious project to map all protein–protein interactions in yeastis currently estimated to detect approximately 20% of connections14.As the rate of data collection continues to increase, networks become
Num
ber o
f com
mun
ities
Number of users per community
106
105
104
103
102
101
1000 5 10 15 20 25 30 35
Num
ber o
f use
rs
Number of communitiesper user
103
102
101
106
105
104
103
102
101
100
100
101 102 103
101 102 103
Num
ber o
f com
mun
ities
Number of metabolites per community
103
102
101
1000 50 100 150 200
Num
ber o
fm
etab
olite
s
Number of communitiesper metabolite
Mobile phone
Metabolic
H2O, H+
ATPADP
Pi
Figure 3 | Community and membership distributions for the metabolic andmobile phone networks. The distribution of community sizes and nodememberships (insets). Community size shows a heavy tail. The number ofmemberships per node is reasonable for both networks: we do not observephone users that belong to large numbers of communities and we correctlyidentify currency metabolites, such as water, ATP and inorganic phosphate(Pi), that are prevalently used throughout metabolism. The appearance ofcurrency metabolites in many metabolic reactions is naturally incorporatedinto link communities, whereas their presence hindered communityidentification in previous work11,15.
Threshold, t = 0.20
t = 0.24
t = 0.27
t = 0.27
50 km
a
0.4
D
0.6 0.8 1
d Largest community Largestsubcommunity
Remaininghierarchy
t
e
b
c
Q/Q
max
0 0.2 0.4 0.6 0.8
Word association
0 0.2 0.4 0.6 0.8Link dendrogram threshold, t
Metabolic
0
0.2
0.4
0.6
0.8
1
0 0.2 0.4 0.6 0.8
Phone
ActualControl
LargestcommunitySecondlargestThird largest
Figure 4 | Meaningful communities at multiple levels of the linkdendrogram. a–c, The social network of mobile phone users displays co-located, overlapping communities on multiple scales. a, Heat map of themost likely locations of all users in the region, showing several cities.b, Cutting the dendrogram above the optimum threshold yields small, intra-city communities (insets). c, Below the optimum threshold, the largestcommunities become spatially extended but still show correlation. d, Thesocial network within the largest community in c, with its largestsubcommunity highlighted. The highlighted subcommunity is shown alongwith its link dendrogram and partition density, D, as a function of threshold,t. Link colours correspond to dendrogram branches. e, Community quality,Q, as a function of dendrogram level, compared with random control(Methods).
NATURE LETTERS
3Macmillan Publishers Limited. All rights reserved©2010
relapse
02805 Social graphs and interactions
twitterbots!
twitterbots!
anecdote #1: you must be careful what you say
[FAIL]
competition #1
[v2.0 bots inspired by Tim Hwang]
who can get the most Justin Bieber followers?
strategy
1. manually create a “realistic” profile (including a few tweets)
2. pick users with 50 - 200 followers 3. follow 10-20 new users per day 4. un-follow whoever doesn’t follow you back
within 24 hours 5. repeat step 2-4
some early insights
strategy
1. manually create a “realistic” profile (including a few tweets)
2. pick users with 50 - 200 followers 3. follow 10-20 new users per day 4. un-follow whoever doesn’t follow you back
within 24 hours 5. repeat step 2-4
100[initial goal was approx 50 followers per bot]
bieber has a lot of (dark matter) spam followers!
anecdote #2: an accidental turing test
Noise
1
Twitter mood predicts the stock market.Johan Bollen1,?,Huina Mao1,?,Xiao-Jun Zeng2.
?: authors made equal contributions.
Abstract—Behavioral economics tells us that emotions canprofoundly affect individual behavior and decision-making. Doesthis also apply to societies at large, i.e. can societies experiencemood states that affect their collective decision making? Byextension is the public mood correlated or even predictive ofeconomic indicators? Here we investigate whether measurementsof collective mood states derived from large-scale Twitter feedsare correlated to the value of the Dow Jones Industrial Average(DJIA) over time. We analyze the text content of daily Twitterfeeds by two mood tracking tools, namely OpinionFinder thatmeasures positive vs. negative mood and Google-Profile of MoodStates (GPOMS) that measures mood in terms of 6 dimensions(Calm, Alert, Sure, Vital, Kind, and Happy). We cross-validatethe resulting mood time series by comparing their ability todetect the public’s response to the presidential election andThanksgiving day in 2008. A Granger causality analysis anda Self-Organizing Fuzzy Neural Network are then used toinvestigate the hypothesis that public mood states, as measured bythe OpinionFinder and GPOMS mood time series, are predictiveof changes in DJIA closing values. Our results indicate that theaccuracy of DJIA predictions can be significantly improved bythe inclusion of specific public mood dimensions but not others.We find an accuracy of 87.6% in predicting the daily up anddown changes in the closing values of the DJIA and a reductionof the Mean Average Percentage Error by more than 6%.
Index Terms—stock market prediction — twitter — moodanalysis.
I. INTRODUCTION
STOCK market prediction has attracted much attentionfrom academia as well as business. But can the stock
market really be predicted? Early research on stock marketprediction [1], [2], [3] was based on random walk theoryand the Efficient Market Hypothesis (EMH) [4]. Accordingto the EMH stock market prices are largely driven by newinformation, i.e. news, rather than present and past prices.Since news is unpredictable, stock market prices will follow arandom walk pattern and cannot be predicted with more than50 percent accuracy [5].
There are two problems with EMH. First, numerous studiesshow that stock market prices do not follow a random walkand can indeed to some degree be predicted [5], [6], [7], [8]thereby calling into question EMH’s basic assumptions. Sec-ond, recent research suggests that news may be unpredictablebut that very early indicators can be extracted from onlinesocial media (blogs, Twitter feeds, etc) to predict changesin various economic and commercial indicators. This mayconceivably also be the case for the stock market. For example,[11] shows how online chat activity predicts book sales. [12]uses assessments of blog sentiment to predict movie sales.[15] predict future product sales using a Probabilistic LatentSemantic Analysis (PLSA) model to extract indicators of
sentiment from blogs. In addition, Google search queries havebeen shown to provide early indicators of disease infectionrates and consumer spending [14]. [9] investigates the relationsbetween breaking financial news and stock price changes.Most recently [13] provide a ground-breaking demonstrationof how public sentiment related to movies, as expressed onTwitter, can actually predict box office receipts.
Although news most certainly influences stock marketprices, public mood states or sentiment may play an equallyimportant role. We know from psychological research thatemotions, in addition to information, play an significant rolein human decision-making [16], [18], [39]. Behavioral financehas provided further proof that financial decisions are sig-nificantly driven by emotion and mood [19]. It is thereforereasonable to assume that the public mood and sentiment candrive stock market values as much as news. This is supportedby recent research by [10] who extract an indicator of publicanxiety from LiveJournal posts and investigate whether itsvariations can predict S&P500 values.
However, if it is our goal to study how public moodinfluences the stock markets, we need reliable, scalable andearly assessments of the public mood at a time-scale andresolution appropriate for practical stock market prediction.Large surveys of public mood over representative samples ofthe population are generally expensive and time-consumingto conduct, cf. Gallup’s opinion polls and various consumerand well-being indices. Some have therefore proposed indirectassessment of public mood or sentiment from the results ofsoccer games [20] and from weather conditions [21]. Theaccuracy of these methods is however limited by the lowdegree to which the chosen indicators are expected to becorrelated with public mood.
Over the past 5 years significant progress has been madein sentiment tracking techniques that extract indicators ofpublic mood directly from social media content such as blogcontent [10], [12], [15], [17] and in particular large-scaleTwitter feeds [22]. Although each so-called tweet, i.e. anindividual user post, is limited to only 140 characters, theaggregate of millions of tweets submitted to Twitter at anygiven time may provide an accurate representation of publicmood and sentiment. This has led to the development of real-time sentiment-tracking indicators such as [17] and “Pulse ofNation”1.
In this paper we investigate whether public sentiment, asexpressed in large-scale collections of daily Twitter posts, canbe used to predict the stock market. We use two tools tomeasure variations in the public mood from tweets submitted
1http://www.ccs.neu.edu/home/amislove/twittermood/
arX
iv:1
010.
3003
v1 [
cs.C
E] 1
4 O
ct 2
010
?
a robot education• recognize bots • use ML to recognize “good” content (to tweet and re-
tweet (using features, sentiment, etc) • detect topics in your tweet stream • analyze the twitter network to find communities,
influentials, etc
social influence?
trends are local
so bots must be Bostonian
little is know about exact form of trending algorithm (twitter secret sauce)
we know that there’s a focus on novelty, and that sustained growth + retweets + faves might matter
how?
build convincing avatars and use high follower counts as part of your disguise
how?
build convincing avatars and use high follower counts as part of your disguise
how?
build convincing avatars and use high follower counts as part of your disguise
how?
build convincing avatars and use high follower counts as part of your disguiseuse machine learning to separate bots from humans (so we can focus on humans)
how?
build convincing avatars and use high follower counts as part of your disguiseuse machine learning to separate bots from humans (so we can focus on humans)use natural language processing and ML to find quality content & topics to tweet and re-tweet
how?
build convincing avatars and use high follower counts as part of your disguiseuse machine learning to separate bots from humans (so we can focus on humans)
use network analysis to explore communities surrounding existing followers, making sure bot actions reach entire communities
use natural language processing and ML to find quality content & topics to tweet and re-tweet
how?
use network analysis to explore communities surrounding existing followers, making sure bot actions reach entire communities
difficult41,42, and we interpret complex contagion broadly to includehomophily; we focus on how both social reinforcement and homo-phily effects collectively boost the trapping of memes within densecommunities, not on the distinctions between them.
To examine and quantify the spreading patterns of memes, weanalyze a dataset collected from Twitter, a micro-blogging platformthat allows millions of people to broadcast short messages (‘tweets’).People can ‘follow’ others to receive their messages, forward(‘retweet’ or ‘‘RT’’ in short) tweets to their own followers, or mention(‘@’ in short) others in tweets. People often label tweets with topicalkeywords (‘hashtags’). We consider each hashtag as a meme.
ResultsCommunities and communication volume. Do memes spread likecomplex contagions in general? If social reinforcement andhomophily significantly influence the spread of memes, we expectmore communication within than across communities. Let us definethe weight w of an edge by the frequency of communication betweenthe users connected by the edge. Nodes are partitioned into densecommunities based on the structure of the network, but withoutknowledge of the weights (see Methods). For each community c,the average edge weights of intra- and inter-community links,Æw æc and Æw æc, quantify how much information flows withinand across communities, respectively. We measure weights byaggregating all the meme spreading events in our data. If memesspread obliviously to community structure, like simple contagions,we would expect no difference between intra- and inter-communitylinks. By contrast, we observe that the intra-community links carry
more messages (Fig. 2(A)). Similar results have been reported fromother datasets35,37. In addition, by defining the focus of an individualas the fraction of activity that is directed to each neighbor in the samecommunity, f , or in different communities, f , we find that peopleinteract more with members of the same community (Fig. 2(B)). Allthe results are statistically significant (p=0:001) and robust acrosscommunity detection methods (see Supplementary Information foradditional details).
Meme concentration in communities. These results suggest thatcommunities strongly trap communication. To quantify this effectfor individual memes, let us define the concentration of a meme incommunities. We expect more concentrated communication and
Figure 1 | The importance of community structure in the spreading of social contagions. (A) Structural trapping: dense communities with few outgoinglinks naturally trap information flow. (B) Social reinforcement: people who have adopted a meme (black nodes) trigger multiple exposures to others (rednodes). In the presence of high clustering, any additional adoption is likely to produce more multiple exposures than in the case of low clustering,inducing cascades of additional adoptions. (C) Homophily: people in the same community (same color nodes) are more likely to be similar and to adoptthe same ideas. (D) Diffusion structure based on retweets among Twitter users sharing the hashtag #USA. Blue nodes represent English users and rednodes are Arabic users. Node size and link weight are proportional to retweet activity. (E) Community structure among Twitter users sharing the hashtags#BBC and #FoxNews. Blue nodes represent #BBC users, red nodes are #FoxNews users, and users who have used both hashtags are green. Node size isproportional to usage (tweet) activity, links represent mutual following relations.
Figure 2 | Meme concentration in communities. We measure weightsand focus in terms of retweets (RT) or mentions (@). We show (A)community edge weight and (B) user community focus using box plots. Boxescover 50% of data and whisker cover 95%. The line and triangle in a boxrepresent the median and mean, respectively.
www.nature.com/scientificreports
SCIENTIFIC REPORTS | 3 : 2522 | DOI: 10.1038/srep02522 2
Lilian Weng, Filippo Menczer & Yong-Yeol Ahn Virality Prediction and Community Structure in Social Networks Scientific Reports, 3:2522
interventions: collaboration is key
share list of followers (yy’s result/multiple exposures)
favorite everything with hashtag
coordinate timing
geotag & time w Bostontwo manual tweets per bot
retweet everything with hashtag (but don’t go crazy)
interventions: collaboration is key
share list of followers (yy’s result/multiple exposures)
favorite everything with hashtaggeotag & time w Bostontwo manual tweets per bot
retweet everything with hashtag (but don’t go crazy)
interventions: collaboration is key
share list of followers (yy’s result/multiple exposures)
favorite everything with hashtaggeotag & time w Bostontwo manual tweets per bot
retweet everything with hashtag (but don’t go crazy)
#bostonthanks
#MeInThree
coffee, coffee, coffeemath, insight, irreverenceiron & wine + white buffalo + donavon frankenreiter
red, white, and blue
[intended to work via engagement, but failed]
#banksyinboston
[crude photoshop work]
real world action
“discovery of existing art”
work to verify rumors (similar to london riots)
great investigative journalism by @abtran
work to verify rumors (similar to london riots)
lessons
surprisingly easy to get followers, impersonate humans using crude meansin some areas Twitter is very different from what we experience
perspectivesprobably no real influence (“stickyness” still central)
bots can make a difference
perspectivesprobably no real influence (“stickyness” still central)
bots can make a difference
frightening/dystopian perspectives
[thanks!]