![Page 1: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/1.jpg)
Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms
Eduardo Graells-Garrido, Web Research Group, Universitat Pompeu Fabra, Barcelona, Spain
Mounia Lalmas, Yahoo Labs, London, UK
Hypertext, Sept. 4, 2014, Santiago, Chile
![Page 2: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/2.jpg)
Motivation: Geographical Centralization
Every person behaves in a biased way (homophily, selective exposure, etc.) in both physical and virtual worlds.
Does the same happen with systematic biases?
Chile is a centralized country - public policy, population migration and media are biased towards its capital. This is increasing the population imbalance, and vice versa!
![Page 3: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/3.jpg)
Some Effects of Geographical Centralization
This affects Web users as content is not geographically diverse (mostly related to/from Santiago). Content from other locations is hidden and hard to find.
(I was at WWW when I searched for this. “Everywhere” displays relevant tweets from Santiago only.)
![Page 4: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/4.jpg)
Problem Statement
Detect and Measure Geographical Centralization: Is centralization reflected on micro-blogging platforms?
Tweet Classification into Locations: How to find tweets from other locations in imbalanced contexts?
[Rout et al, HT 2013] studied geolocation in imbalanced populations from a network perspective. We follow a similar approach from a content perspective.
Information Filtering - Geo. Diverse Timeline: How to build a geographically diverse timeline?
We build upon prior work on information diversity filtering: [De Choudhury et al, HT 2011] and [Munson et al, ICWSM 2009].
![Page 5: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/5.jpg)
Case Study: Chile, Municipal Elections 2012. Is Geographical Centralization Reflected on Twitter?
![Page 6: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/6.jpg)
Frequent Terms
![Page 7: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/7.jpg)
Dataset: #municipales2012
Locally Important: denser network discussions; local vocabulary (classification)
National Level: interactions between locations
Query Keywords: hashtags, tenses of to-vote, candidate names, political institutions, locations
Using self-reported location, 27.95% of users are geolocated at the regional level. They published 42.15% of the tweets in the dataset.
Ideal characteristics, but there is a need to classify tweets.
![Page 8: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/8.jpg)
Physical and Virtual Population Distributions
We consider the sample geographically representative.
r = 0.95, p < 0.01. Source: Census 2012*
r = 0.68, p < 0.01. Source: CASEN Survey
Imbalanced Population (Different Orders of Magnitude)
Balanced Representation (Equal Orders of Magnitude)
![Page 9: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/9.jpg)
Is the Chilean Virtual Population in Twitter centralized towards the capital Metropolitan Region?
![Page 10: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/10.jpg)
Interactions Between Locations
Adjacency Matrix of 1-way interactions. [Quercia et al, 2012]
M(i,j) = mentions(Li, Lj) + retweets(Li, Lj)
Each arc in the visualization represents one entry M(i,j). Li is on the left, Lj on the right.
Green edges indicate i = j. Brown edges indicate j = Santiago (RM). The rest are gray.
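The definition of M above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the `(source_location, target_location, kind)` tuple format, the function name, and the toy interactions are all assumptions made for the example.

```python
def interaction_matrix(interactions, locations):
    """Build M where M[i][j] = mentions(Li, Lj) + retweets(Li, Lj).

    `interactions` is a hypothetical list of
    (source_location, target_location, kind) tuples.
    """
    index = {loc: k for k, loc in enumerate(locations)}
    n = len(locations)
    M = [[0] * n for _ in range(n)]
    for source, target, kind in interactions:
        # Only mentions and retweets count as 1-way interactions here.
        if kind not in ("mention", "retweet"):
            continue
        if source in index and target in index:
            M[index[source]][index[target]] += 1
    return M


# Toy data, invented for illustration only.
locations = ["RM", "Valparaiso", "Biobio"]
interactions = [
    ("Valparaiso", "RM", "mention"),
    ("Valparaiso", "RM", "retweet"),
    ("RM", "RM", "mention"),
    ("Biobio", "Valparaiso", "retweet"),
    ("Biobio", "RM", "mention"),
]
M = interaction_matrix(interactions, locations)
```

Rows are the source location Li and columns the target Lj, so the matrix is not symmetric in general.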
![Page 11: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/11.jpg)
Geographical Centralization
We explain the extreme differences between observations and expectations as geographical centralization towards Santiago (Metropolitan Region)
Observed Centrality: estimated from a graph based on M.
Expected Centrality: estimated from a graph with edge weights based on location populations.
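The slides do not say which centrality measure is used, nor how populations become edge weights. The sketch below therefore makes two loud assumptions: centrality is approximated by each location's share of incoming interaction weight, and the expected weight of an edge i → j is the product of the two populations. The populations are toy numbers, not census figures.

```python
def in_strength_share(M):
    """Share of incoming weight per location, ignoring self-loops.

    Assumption: a simple stand-in for the (unspecified) centrality measure.
    """
    n = len(M)
    incoming = [sum(M[i][j] for i in range(n) if i != j) for j in range(n)]
    total = sum(incoming)
    return [x / total if total else 0.0 for x in incoming]


def expected_matrix(populations):
    """Assumption: expected weight of i -> j is population_i * population_j."""
    n = len(populations)
    return [[populations[i] * populations[j] for j in range(n)] for i in range(n)]


# Toy interaction matrix and toy populations (illustrative only).
M = [[1, 0, 0], [2, 0, 0], [1, 1, 0]]
populations = [7, 2, 2]

observed = in_strength_share(M)
expected = in_strength_share(expected_matrix(populations))

# A ratio above 1 means a location attracts more attention than its
# population alone would suggest.
ratio = [o / e if e else 0.0 for o, e in zip(observed, expected)]
```

Under these assumptions, a ratio well above 1 for the capital and below 1 elsewhere would be read as centralization.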
![Page 12: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/12.jpg)
How to make timelines more Geographically Diverse?
Shannon Entropy with respect to geography
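As a minimal sketch, the Shannon entropy of a timeline with respect to geography can be computed from the share of tweets per location. The function name and the list-of-labels input are assumptions for illustration.

```python
import math
from collections import Counter


def geographic_entropy(tweet_locations):
    """Shannon entropy (in bits) of the location distribution of a timeline."""
    counts = Counter(tweet_locations)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A timeline with tweets from a single location has zero entropy;
# a timeline spread evenly over four locations has log2(4) = 2 bits.
centralized = geographic_entropy(["RM", "RM", "RM", "RM"])
mixed = geographic_entropy(["RM", "RM", "Valparaiso", "Biobio"])
uniform = geographic_entropy(["RM", "Valparaiso", "Biobio", "Maule"])
```

Higher values mean the timeline draws more evenly from different locations.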
![Page 13: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/13.jpg)
First: Classifying Tweets into Locations with Diversity
We built a corpus of location documents. For classification, we represent a tweet as a vector of cosine similarities between the tweet and each location document, with terms weighted using TF-IDF. We evaluate with 10-fold cross-validation.
Similarity features recover geographical diversity (otherwise lost because of population imbalance) and are overall more accurate than bag-of-words (BOW) approaches.
Similarity Features
BOW Features
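The similarity features described above can be sketched in pure Python. This is an illustration under assumptions: whitespace tokenization, a smoothed IDF variant, and toy location documents, none of which are taken from the slides. The resulting vector would then be passed to a classifier, which is not shown.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization.
    return text.lower().split()


def build_idf(documents):
    """Smoothed inverse document frequency over tokenized documents."""
    n = len(documents)
    df = Counter()
    for tokens in documents:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector; terms outside the location corpus are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_features(tweet, location_docs):
    """One cosine similarity per location document, in sorted name order."""
    names = sorted(location_docs)
    docs = [tokenize(location_docs[name]) for name in names]
    idf = build_idf(docs)
    tweet_vector = tfidf(tokenize(tweet), idf)
    return names, [cosine(tweet_vector, tfidf(d, idf)) for d in docs]


# Toy location documents, invented for illustration.
location_docs = {
    "RM": "santiago metro providencia alcalde",
    "Valparaiso": "valparaiso puerto cerro alcalde",
}
names, features = similarity_features("nuevo alcalde en valparaiso puerto", location_docs)
```

The feature vector has one dimension per location regardless of vocabulary size, which is one plausible reason such features are less dominated by the largest location than raw term counts.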
![Page 14: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/14.jpg)
Second: Filtering Tweets to build a Geo. Diverse Timeline
We iteratively add tweets to a timeline T. Each added tweet maximizes T's information entropy [De Choudhury et al, 2011], but we enforce geographical diversity of those additions [Munson et al, 2009].
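The iterative filtering can be sketched as a greedy loop. How the two criteria are combined is not stated on the slide, so this sketch assumes a lexicographic rule: among the remaining candidates, keep those whose addition yields the highest geographic entropy, then pick the one that maximizes the entropy of the timeline's term distribution. The tuple format, function names, and toy tweets are hypothetical.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (bits) of a Counter of non-negative counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c)


def build_timeline(candidates, size):
    """Greedy selection of (tweet_id, location, tokens) tuples.

    Assumption: geographic entropy is the primary criterion and
    information (term) entropy breaks ties.
    """
    timeline = []
    location_counts = Counter()
    term_counts = Counter()
    remaining = list(candidates)
    while remaining and len(timeline) < size:
        def score(candidate):
            _, location, tokens = candidate
            geo = entropy(location_counts + Counter([location]))
            info = entropy(term_counts + Counter(tokens))
            # Rounding guards against float noise when comparing geo scores.
            return (round(geo, 9), info)

        best = max(remaining, key=score)
        remaining.remove(best)
        timeline.append(best)
        location_counts[best[1]] += 1
        term_counts.update(best[2])
    return timeline


# Toy candidates, invented for illustration.
candidates = [
    ("t1", "RM", ["alcalde", "santiago"]),
    ("t2", "RM", ["votos", "santiago"]),
    ("t3", "RM", ["alcalde", "votos"]),
    ("t4", "Valparaiso", ["puerto", "alcalde"]),
    ("t5", "Biobio", ["concepcion", "resultados"]),
]
timeline = build_timeline(candidates, size=3)
```

Even though most candidates come from RM, the selected timeline covers three different locations, which is the effect the method aims for.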
![Page 15: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/15.jpg)
Empirical Observations
Election results start to appear!
Unexpected results in some locations! Discussion becomes a bit more global. In all cases, geographical diversity exists.
The Proposed Method (PM) is more geographically diverse than the baselines: DIV [De Choudhury et al, HT 2011] and POP (top-k popular tweets).
In terms of social voting, PM has more representation of popular tweets than DIV.
![Page 16: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/16.jpg)
Overview of Results
Is centralization reflected on micro-blogging platforms? Yes! As with other behavioral biases (homophily, selective exposure), the systematic bias of geographical centralization is also present and is measurable.
How to find tweets from other locations? Consider imbalance-aware features, such as content similarity metrics. This improves diversity of classifications without losing accuracy.
How to build a geographically diverse timeline? A correct mixture of known techniques can have the desired effects without trade-offs! (Gained representation of popularity, did not lose info. diversity.) In contrast to sensitive contexts where selective exposure is crucial, geographical diversity is less likely to generate cognitive dissonance.
![Page 17: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/17.jpg)
Future Work
User Evaluation: is geographical diversity interesting?
Visualization and User Interfaces: is geographical diversity engaging?
![Page 18: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/18.jpg)
Questions?
Thanks for attending!
Contact: @carnby
http://carnby.github.io
Special Thanks: Dany Passarinho, Bárbara Poblete, Diego Sáez-Trumper and Anonymous Reviewers
This work was partially funded by Grant TIN2012-38741 (Understanding Social Media: An Integrated Data Mining Approach) of the Ministry of Economy and Competitiveness of Spain.
https://www.flickr.com/photos/malikaladak/8868491759
https://www.flickr.com/photos/28047774@N04/6312764345
https://www.flickr.com/photos/iron_horses/6274365371
https://www.flickr.com/photos/efimeravulgata/1429969601
![Page 19: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/19.jpg)
Additional Data :)
![Page 20: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/20.jpg)
![Page 21: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/21.jpg)
![Page 22: Balancing Diversity to Counter-measure Geographical Centralization in Microblogging Platforms](https://reader033.vdocument.in/reader033/viewer/2022051323/547e4482b4795993508b4b09/html5/thumbnails/22.jpg)