Inferring User Political Preferences from Streaming Communications
Svitlana Volkova 1, Glen Coppersmith 2 and Benjamin Van Durme 1,2. 1 Center for Language and Speech Processing, 2 Human Language Technology Center of Excellence. ACL 2014, Baltimore.
![Page 1: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/1.jpg)
Inferring User Political Preferences from Streaming Communications
Svitlana Volkova1, Glen Coppersmith2 and Benjamin Van Durme1,2
1Center for Language and Speech Processing
2Human Language Technology Center of Excellence
ACL 2014, Baltimore
![Page 2: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/2.jpg)
Motivation
• Personalized, diverse and timely data
• Can reveal user interests, preferences and opinions
DemographicsPro – http://www.demographicspro.com/
WolframAlpha Analytics – http://www.wolframalpha.com/facebook/
![Page 3: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/3.jpg)
Applications
• Large-scale passive polling and real-time live polling
• Online advertising
• Healthcare analytics
• Personalized recommendation systems and search
![Page 4: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/4.jpg)
User Attribute Prediction
Communications
Political Preference: Rao et al., 2010; Conover et al., 2011; Pennacchiotti and Popescu, 2011; Zamal et al., 2012; Cohen and Ruths, 2013
Gender: Garera and Yarowsky, 2009; Rao et al., 2010; Burger et al., 2011; Van Durme, 2012; Zamal et al., 2012; Bergsma and Van Durme, 2013
Age: Rao et al., 2010; Zamal et al., 2012; Cohen and Ruths, 2013; Nguyen et al., 2011, 2013
![Page 5: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/5.jpg)
Existing Approaches
~1K tweets* per user, treated as a single document
Does an average Twitter user produce thousands of tweets?
*Rao et al., 2010; Conover et al., 2011; Pennacchiotti and Popescu, 2011a; Burger et al., 2011; Zamal et al., 2012; Nguyen et al., 2013
![Page 6: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/6.jpg)
How Active are Twitter Users?
http://www.digitalbuzzblog.com/visualizing-twitter-statistics-x100/
![Page 7: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/7.jpg)
Real-World Predictions
• Not active users: no or limited content
• Average Twitter users: median = 10 tweets per day
• Active users: 1,000+ tweets
• Private users: no content
[Chart shares as extracted: 10%, 50%, 20%, 20%]
![Page 8: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/8.jpg)
Our Approach
1. Take advantage of user local neighborhoods
2. Incremental dynamic real-time predictions
Real-world batch predictions
Streaming predictions
![Page 9: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/9.jpg)
Our Approach
1. Take advantage of user local neighborhoods
2. Incremental dynamic real-time predictions
Real-world batch predictions
![Page 10: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/10.jpg)
Attributed Social Network
User Local Neighborhoods a.k.a. Social Circles
![Page 11: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/11.jpg)
Twitter Network Data
Code, data and trained models for gender, age, political preference prediction
http://www.cs.jhu.edu/~svitlana/
![Page 12: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/12.jpg)
Twitter Social Graph
I. Candidate-Centric: 1,031 users of interest
II. Geo-Centric: 270 users
III. Politically Active*: 371 users
10-20 neighbors of each type per user; ~50K nodes, ~60K edges
What types of neighbors lead to the best attribute prediction for a given user?
*Pennacchiotti and Popescu, 2011; Zamal et al., 2012; Cohen and Ruths, 2013
![Page 13: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/13.jpg)
Experiments
• Log-linear binary unigram models: (I) Users vs. (II) Neighbors and (III) Both
• Evaluate the relative utility of different neighborhood types:
– varying neighborhood size n=[1, 2, 5, 10] and content amount t=[5, 10, 15, 25, 50, 100, 200]
– 10-fold cross validation with 100 random restarts for every n and t parameter combination
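The setup on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the helper names (`binary_unigrams`, `neighborhood_features`, `train_log_linear`, `predict`), the whitespace tokenizer, the plain SGD training loop and the toy tweets are all hypothetical. Only the grids for n and t come from the slide.

```python
import math
import random
from itertools import product

# Grids from the slide: neighborhood size n and tweets per neighbor t.
N_VALUES = [1, 2, 5, 10]
T_VALUES = [5, 10, 15, 25, 50, 100, 200]


def binary_unigrams(tweets):
    """Binary unigram features: the set of lowercased tokens that occur."""
    return {tok for tweet in tweets for tok in tweet.lower().split()}


def neighborhood_features(neighbor_tweets, n, t, rng):
    """Sample up to n neighbors and up to t tweets from each (hypothetical helper)."""
    chosen = rng.sample(sorted(neighbor_tweets), min(n, len(neighbor_tweets)))
    feats = set()
    for name in chosen:
        tweets = neighbor_tweets[name]
        feats |= binary_unigrams(rng.sample(tweets, min(t, len(tweets))))
    return feats


def train_log_linear(examples, epochs=50, lr=0.5):
    """Binary log-linear (logistic regression) model trained with plain SGD."""
    weights = {}
    for _ in range(epochs):
        for feats, label in examples:  # label is 1 or 0
            score = sum(weights.get(f, 0.0) for f in feats)
            pred = 1.0 / (1.0 + math.exp(-score))
            for f in feats:
                weights[f] = weights.get(f, 0.0) + lr * (label - pred)
    return weights


def predict(weights, feats):
    """Probability of class 1 under the trained weights."""
    score = sum(weights.get(f, 0.0) for f in feats)
    return 1.0 / (1.0 + math.exp(-score))


# Toy data (invented for illustration): neighbor name -> tweets, plus a class label.
rng = random.Random(0)
users = [
    ({"n1": ["lower taxes now", "small government"]}, 1),
    ({"n2": ["expand healthcare", "climate action now"]}, 0),
]
grid = list(product(N_VALUES, T_VALUES))  # every (n, t) combination
n, t = grid[0]
train = [(neighborhood_features(nb, n, t, rng), y) for nb, y in users]
model = train_log_linear(train)
```

In a full experiment, the loop over `grid` would wrap 10-fold cross validation and repeat the random sampling of neighbors and tweets for each restart; that outer loop is omitted here to keep the sketch short.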
![Page 14: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/14.jpg)
Neighborhood Comparison
[Figure: accuracy vs. tweets per neighbor; panels: 1 Neighbor, 10 Neighbors]
![Page 15: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/15.jpg)
Optimizing Twitter API Calls
Cand-Centric Graph: Friend Circle
![Page 19: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/19.jpg)
Summary: Batch Real-World Predictions with Limited User Data
More data is better. How to get it?
• More neighbors per user > additional content from the existing neighbors
What kind of data?
• Follower, friend, @mention, retweet
Real-world predictions: no or very limited content!
• Users who recently joined Twitter
• No or limited access to user tweets
![Page 20: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/20.jpg)
Our Approach
1. Take advantage of user local neighborhoods
2. Incremental dynamic real-time predictions
Streaming predictions
![Page 21: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/21.jpg)
Iterative Bayesian Predictions
[Figure: timeline of incoming tweets for a user with an unknown attribute (?)]
![Page 22: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/22.jpg)
Cand-Centric Graph: Belief Updates
[Figure: two panels showing belief updates over time for users with an unknown attribute (?)]
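One way to read the iterative Bayesian predictions on these slides is as a running posterior over the user's class: each new tweet's likelihood multiplies the current belief, and the result is renormalized. The sketch below assumes exactly that reading. The function names, the class labels `R` and `D`, the uniform prior and the per-tweet likelihoods are hypothetical; in practice the likelihoods would come from a trained classifier.

```python
def update_belief(prior, likelihoods):
    """One Bayesian step: posterior is proportional to prior times tweet likelihood."""
    unnorm = {c: prior[c] * likelihoods[c] for c in prior}
    total = sum(unnorm.values())
    return {c: v / total for c, v in unnorm.items()}


def tweets_until_confident(stream, prior, threshold):
    """Consume tweet likelihoods until one class reaches the threshold.

    Returns (number of tweets used, final belief). The count equals the
    stream length if the threshold is never reached.
    """
    belief = dict(prior)
    used = 0
    for likelihoods in stream:
        belief = update_belief(belief, likelihoods)
        used += 1
        if max(belief.values()) >= threshold:
            break
    return used, belief


# Hypothetical per-tweet likelihoods P(tweet | class) for two classes.
stream = [
    {"R": 0.6, "D": 0.4},
    {"R": 0.7, "D": 0.3},
    {"R": 0.5, "D": 0.5},
    {"R": 0.8, "D": 0.2},
    {"R": 0.7, "D": 0.3},
]
prior = {"R": 0.5, "D": 0.5}
used_75, belief_75 = tweets_until_confident(stream, prior, 0.75)
used_95, belief_95 = tweets_until_confident(stream, prior, 0.95)
```

With these invented numbers the belief crosses 75% after two tweets and 95% after five, which illustrates why a higher confidence requirement means waiting longer on the stream.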
![Page 23: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/23.jpg)
Cand-Centric Graph: Prediction Time
[Figure: prediction time in weeks (log scale, 0.001 to 100) for 100 users at 75% and 95% confidence on the Cand, Geo and Active graphs; series: User Stream, User-Neighbor]
![Page 24: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/24.jpg)
Batch vs. Online Performance
[Figure: bar chart (y-axis 0 to 1) over the Cand, Geo and Active graphs; series: User Batch, Neighbor Batch, User Stream, User-Neighbor Stream]
[Bar labels as extracted: 0.72, 0.57, 0.75; +0.03, +0.1, +0.11, +0.27, +0.27, +0.14, +0.28, +0.31, +0.25]
![Page 25: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/25.jpg)
Summary
• Neighborhood content is useful*
• Neighborhoods constructed from friends, user mentions and retweets are most effective
• Signal is distributed in the neighborhood
• Streaming models > batch models
*Pennacchiotti and Popescu, 2011a, 2011b; Conover et al., 2011a, 2011b; Golbeck et al., 2011; Zamal et al., 2012
![Page 26: Inferring User Political Preferences from Streaming Communications](https://reader036.vdocument.in/reader036/viewer/2022062521/568166d9550346895ddaf864/html5/thumbnails/26.jpg)
Thank you!
Labeled Twitter network data for gender, age, political preference prediction: http://www.cs.jhu.edu/~svitlana/
Code and pre-trained models available upon request: [email protected]