system u: computational discovery of personality traits from social media for individualized...
System U: Computational Discovery of Personality Traits from Social Media for Individualized Experience
Michelle Zhou, IBM Research, Almaden
“The perfect solution is to serve each consumer individually. The problem? There are 7 billion of them.”
Consumer products CMO, Singapore (IBM 2011 CMO Study)
Individualization at Scale
• Model personality traits distinguishing individuals [Ford ’05, O’Brien ’96, Neuman ’99, Gosling ’03, Wholan ’06]
• Derive personality traits for hundreds of millions of individuals
Challenges
• Lengthy standard psychometric tests
• Reliability and freshness of test results
“Welcome to our store, would you like to take a personality test?”
A Silver Lining
• Psycholinguistic studies: personality from text [Tausczik and Pennebaker ’10, Yarkoni ’10]
• Hundreds of millions of people leave text footprints on social media
“I love food, .., with … together we … in… very…happy.”
Word category: Inclusive → Agreeableness
System U in a Nutshell
[Figure: System U overview. Labels: Social Media; Psycholinguistic Analytics; Personality Portrait; Big 5, Values, Needs, Emotional Style, Attitude; InkWell, VisWell; Engagement, Recommendation]
Discovering Big 5 Personality Traits
• Psychological characteristics reflecting individual differences
• Consistent and enduring
• Can change
• Link to many aspects of one’s life
  – Problem/emotion coping
  – Relationship selection
  – Occupational proficiency
  – Team performance
  – . . .
[Figure labels: outgoing/energetic vs. solitary/reserved; efficient/organized vs. easy-going/careless]
[O’Brien ’96, Neuman ’99, Gosling ’03, Wholan ’06]
Discovering Fundamental Needs
[Ford, 2005]
• Fundamental needs are universal [Aaker 1995, Maslow 1943]
• Often change with life events
• Link to many aspects of one’s life
  – Brand/product choices
  – Occupational choices
  – . . .
Discovering Values
[Schwartz 2006]
• Values capture personal beliefs and motivators
• Values guide actions
Our Methodology
1. Large-scale psychometric studies
2. Derivation of psycholinguistic evidence (lexicons)
3. Online prediction of personality traits from text
Large-Scale Psychometric Studies
• Designing item-based psychometric studies
• Collecting psychometric scores and text footprints on Amazon Mechanical Turk
Example item: “I tend to pursue perfection”
Deriving Psycholinguistic Evidence
Machine Learning → Psycholinguistic Lexicons
Lexicon excerpt for “Ideal” [Yang & Li, 2013]:
Word     Weight
…
Goal     0.23
Special  0.35
…
Half     -0.26
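The slide names only "Machine Learning" as the step that turns survey scores and text into lexicon weights, so the actual learning method is not stated here. As a minimal sketch under that caveat, the Python below assumes one simple option: each word's weight is the Pearson correlation between how often participants use the word and their survey score for the trait. The function names, the whitespace tokenizer, and the toy data are all hypothetical.

```python
from collections import Counter
from math import sqrt


def relative_frequencies(text, vocabulary):
    """Return each vocabulary word's count divided by the total word count."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty text
    return [counts[w] / total for w in vocabulary]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def derive_lexicon(texts, trait_scores, vocabulary):
    """Map each word to the correlation of its usage with the trait score.

    Assumption: correlation stands in for the unspecified learning step.
    """
    freqs = [relative_frequencies(t, vocabulary) for t in texts]
    return {
        word: pearson([row[i] for row in freqs], trait_scores)
        for i, word in enumerate(vocabulary)
    }


# Hypothetical toy data: three participants' texts and their survey scores.
texts = ["my goal is a special goal", "half of it is done", "a special day"]
scores = [4.5, 1.0, 3.5]
lexicon = derive_lexicon(texts, scores, ["goal", "special", "half"])
```

With this toy data, "goal" and "special" receive positive weights and "half" a negative one, which mirrors the signs in the excerpt above but not its values.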
Online Prediction of Personality Traits from Text
[Figure: Social Media Posts; Predictive Models; Personality Traits (Big 5, Values, Needs, Emotional Style, Attitude, …)]
“… great to have a chauffeur who can help us accomplish our goals …”
Word            Chauffeur  Accomplish  Goal  Special  License  …
Ideal (weight)  0.37       0.94        0.23  0.35     0.13     …
In post         1          1           1     0        0        …
Online Prediction of Personality Traits from Text
Additional processing
– Normalize counts by the total number of words
– Compute trait scores as a linear combination of the counts and the learned coefficients
– Normalize trait scores to give percentile scores
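The three processing steps can be sketched in a few lines of Python. This is an illustration, not System U's implementation: the weights are the "Ideal" values from the slide's example table, while the whitespace tokenizer, the function names, and the reference population used for the percentile are assumptions. The sketch does no stemming, so the example post uses "goal" rather than "goals".

```python
from bisect import bisect_right

# Weights for the trait "Ideal", copied from the slide's example table.
IDEAL_LEXICON = {"chauffeur": 0.37, "accomplish": 0.94, "goal": 0.23,
                 "special": 0.35, "license": 0.13}


def trait_score(text, lexicon):
    """Linear combination of length-normalized word counts and weights.

    Summing a word's weight once per occurrence equals count * weight,
    and dividing by the total word count normalizes the counts.
    """
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(lexicon.get(w, 0.0) for w in words) / len(words)


def percentile(score, reference_scores):
    """Percentage of reference scores less than or equal to `score`."""
    if not reference_scores:
        raise ValueError("reference_scores must not be empty")
    ordered = sorted(reference_scores)
    return 100.0 * bisect_right(ordered, score) / len(ordered)


post = "great to have a chauffeur who can help us accomplish our goal"
score = trait_score(post, IDEAL_LEXICON)

# Hypothetical raw scores of a reference population.
reference = [0.01, 0.05, 0.10, 0.20]
rank = percentile(score, reference)
```

Here three lexicon words appear once each in a 12-word post, so the raw score is (0.37 + 0.94 + 0.23) / 12, roughly 0.128, which falls at the 75th percentile of the hypothetical reference scores.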
How good are our results compared to standard psychometric studies?
How well can our results be used to predict or influence one’s behavior?
System U vs. Standard Surveys
• Participants
  – Invited 1325 Twitter users at IBM; 650 responded and 256 completed
• Method
  – Participants took three sets of psychometric tests: 50-item Big 5 (IPIP), 26-item basic values (Schwartz), and 52-item fundamental needs (our own)
  – Participants rated how well each type of derived trait matched their perception of themselves
Results
• RV-coefficient correlation analysis of each type of trait
• For over 80% of the population, the correlation is statistically significant (80.8%, 98.21%, and 86.6% for Big 5 personality, basic values, and needs, respectively)
[Gou et al. CHI 2014]
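For readers unfamiliar with the RV coefficient, the sketch below computes it for two score tables that share the same rows. It uses the standard definition on column-centered matrices, RV = ||XᵀY||² / sqrt(||XᵀX||² · ||YᵀY||²) with Frobenius norms, which yields a value between 0 and 1. How the study arranged derived and survey scores into matrices, and how significance was tested, is not stated on the slide, so the data layout and numbers here are hypothetical.

```python
from math import sqrt


def center_columns(matrix):
    """Subtract each column's mean from its entries."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, means)] for row in matrix]


def cross_product(a, b):
    """Return A^T B for row-major matrices with the same number of rows."""
    return [[sum(x * y for x, y in zip(col_a, col_b)) for col_b in zip(*b)]
            for col_a in zip(*a)]


def squared_frobenius(m):
    """Sum of squared entries of a matrix."""
    return sum(v * v for row in m for v in row)


def rv_coefficient(x, y):
    """RV coefficient between two matrices with matching rows."""
    xc, yc = center_columns(x), center_columns(y)
    numerator = squared_frobenius(cross_product(xc, yc))
    denominator = sqrt(squared_frobenius(cross_product(xc, xc))
                       * squared_frobenius(cross_product(yc, yc)))
    return numerator / denominator if denominator else 0.0


# Hypothetical scores: rows are observations, columns are trait dimensions.
derived = [[0.2, 0.8], [0.5, 0.4], [0.9, 0.1]]
survey = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rv = rv_coefficient(derived, survey)
```

The coefficient is invariant to rescaling, so a table compared with a scaled copy of itself gives a value of 1 up to floating-point error.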
Field Studies on Twitter
Who is more likely to behave as asked, and how?
– Respond to recommended services (“ads”)
– Answer strangers’ questions
– Help strangers spread information (e.g., SOS)
Study 1: Who Will Respond to Ads
Method
– Identified 7290 Twitter users who tweeted about traveling to NYC in the near future
– Computed personality traits for each identified user
– Sent one of three messages via Twitter to each person
Study 1: Who Will Respond to Ads
Results
• Relationships between traits and responses
  – Average response rates for some top-matched groups are impressive (e.g., for the top 25% of extroverts receiving the social message: CTR 8.65, following 9.12, and RFR 5.66)
• Certain personality traits resulted in significantly more successful responses
  – A combination of high openness and low neuroticism yielded 31% and 45% increases in clicking and following rates, respectively
Study 2: Who Will Answer Questions [Mahmud et al., IUI 2013]
Method
– Model a person’s ability, willingness, and readiness to answer questions
– Predict one’s likelihood to respond
– Optimization-based approach to answerer selection
Study 2: Who Will Answer Questions [Mahmud et al., IUI 2013]
Experiment Results
– Identified 500 Twitter users for each of two domains
– Sent requests to 100 random users, then used our approach to select 100 from the remaining 400 users
– Compared random, baseline, and ours

                 TSA-tracker-1  TSA-tracker-2  Product
Baseline         42%            33%            31%

Live Experiment  Random Selection  Our Algorithm
TSA-tracker-1    29%               66%
Product          26%               60%
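The selection step can be illustrated with a deliberately simple objective. The sketch below assumes each candidate already has a predicted response probability and that the goal is to maximize the expected number of responses from k requests, in which case the best choice is the k highest-probability candidates. The objective actually used in [Mahmud et al., IUI 2013] may differ; the function names and probabilities are hypothetical.

```python
import heapq


def select_answerers(response_probability, k):
    """Pick the k users with the highest predicted response probability."""
    return heapq.nlargest(k, response_probability,
                          key=response_probability.get)


def expected_response_rate(response_probability, selected):
    """Mean predicted response probability over the selected users."""
    if not selected:
        return 0.0
    return sum(response_probability[u] for u in selected) / len(selected)


# Hypothetical predicted probabilities for four candidate users.
probabilities = {"u1": 0.1, "u2": 0.7, "u3": 0.4, "u4": 0.9}
chosen = select_answerers(probabilities, 2)
rate = expected_response_rate(probabilities, chosen)
```

With these toy probabilities the sketch picks u4 and u2, for an expected response rate of 0.8, compared with 0.525 when averaging over all four candidates.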
Study 3: Who Will Spread Information and When [Lee et al., IUI 2014]
Method
– Modeled core features of an “information spreader”
  • Willingness, readiness, activity time pattern
– Predicted the likelihood to respond and time-to-act

Study 3: Who Will Spread Information and When [Lee et al., IUI 2014]
Experiment Results
– Randomly selected 426 candidates who had recently tweeted about “bird flu” in July 2013
– Each approach selected top 100 candidates

Approach                Retweeting Rate
Random People Contact   4%
Popular People Contact  9%
Our Approach            19%

Approach                        Retweeting Rate
Random People Contact           4%
Popular People Contact          8.7%
Our Prediction Approach         18%
Our Approach + Wait time model  18.5%
Key Applications
• Marketing: determine who, what, how, and when to target
• Customer Care: agent-customer matchmaking; real-time agent assistant
• Smarter Workforce: recruitment; talent identification and development; risk identification and mitigation
Summary
• Psycholinguistic analysis derives deep understanding of individuals at scale
• Derived personality traits can be used to predict and influence individuals’ behavior in the real world
• Far-reaching implications for creating hyper-personalized social recommender systems
Acknowledgement
• Jilin Chen
• Eben Habor
• Liang Gou
• Jalal Mahmud
• Nimrod Megiddo
• Jeff Nichols
• Aditya Pal
• Jerre Schoudt
• Barton Smith
• Ying Xuan
• Huahai Yang

• Hernan Badenes
• Mateo Nicolas Bengualid
• Richard Gabriel
• Huiji Gao
• Chris Kau
• Mengdie Hu
• Kyumin Lee
• Tara Matthews
• Ruogu Yang
• Tom Zimmerman
References
• Chen, J., Hsieh, G., Mahmud, J., and Nichols, J. Understanding individuals’ personal values from social media word use. In ACM Proc. CSCW 2014.
• Ford, J. K. Brands Laid Bare. John Wiley & Sons, 2005.
• Gou, L., Zhou, M.X., and Yang, H. KnowMe and ShareMe: Understanding automatically discovered personality traits from social media and user sharing preferences. In ACM Proc. CHI 2014.
• Lee, K., Mahmud, J., Chen, J., Zhou, M.X., and Nichols, J. Who will retweet this? Automatically identifying and engaging strangers on Twitter to spread information. In ACM Proc. IUI 2014.
• Luo, L., Wang, F., Zhou, M.X., Pan, X., and Chen, H. Who’s got answers? Growing the pool of answerers in a smart enterprise Social Q&A system. In ACM Proc. IUI 2014.
• Mahmud, J., Zhou, M.X., Megiddo, N., Nichols, J., and Drews, C. Recommending Targeted Strangers from Whom to Solicit Information in Twitter. In ACM Proc. IUI 2013.
• Schwartz, S. H. Basic human values: Theory, measurement, and applications. Revue française de sociologie, 2006.
• Tausczik, Y. R., and Pennebaker, J. W. The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology 29, 1 (2010), 24–54.
• Yang, H., and Li, Y. Identifying user needs from social media. IBM Tech. Report (2013).
• Yarkoni, T. Personality in 100,000 words: A large-scale analysis of personality and word use among bloggers. Journal of Research in Personality 44, 3 (2010), 363–373.