Learning about Personal Experiences and their Outcomes:
Analyzing Social Media as an Observational Study
Emre Kıcıman
Microsoft Research
should i buy a fixer upper or new home?
should i rent or buy a home in 2014
should i get preapproved before house hunting
should i refinance my mortgage
should i get a divorce
should i break up with my boyfriend
should i pay off my mortgage
should i file bankruptcy
should i get a flu shot
should i lease or buy a car
should i join a gym
should i eat before or after working out
should i text him
should i buy a chromebook
should i quit my job
should i pop a blister
should i get bangs
should i wash my hair before coloring it
should i buy a house
should i cut my hair short
should i retire at 65
should i go to college
should i go to law school
should i take a multivitamin
should i text him or wait for him to text me
should i join the military
should i leave my husband
should i get married
should i pop a burn blister
should i see a doctor
should i consolidate my student loans
should i do cardio before or after weights
should i get a tattoo
should i workout with a cold
should i join the army
should i buy an xbox one
should i move to florida
should i have a third child
should i take creatine
should i get a dog
should i run everyday
should i be a teacher
should i become a real estate agent
should i quit drinking coffee
should i shave my head
100s of millions reporting their experiences
People post about the actions they take.
… and post about what happens later.
Broad goal: Create a system for open-domain querying of the relationship between situations, actions and outcomes
• Enable this for the long tail of unfamiliar situations
• Accommodate many possible data sources
Personal and policy questions
• Explore Situations: What should I expect now?
• Understand Actions: Should I do ____?
• Plan for Outcomes: What might help reach goals?
[Figure: Individual timelines (messages from users u1, u2) mapped to query-aligned timelines, labeled E+ and E-, with precedents before and subsequents after the query event]
1. Extract timeline of event occurrences from social media
2. Select timelines that match a query event; and its control group.
3. Identify outcomes
• Introduction
• Causal inference (very) basics
• Brief sanity check on applying causal inference to social media
• Case Study: Effects of alcohol usage during college
• Case Study: Transitions from mental health to suicidal ideation posts
• Stepping back: Insights & bias
A Brief Introduction to Causal Inference
[Figure: Alice receives the treatment, with outcome $Y_{T=1}$; Alice does not receive it, with outcome $Y_{T=0}$]
Counterfactual Framework
Treatment effect = $Y_{T=1} - Y_{T=0}$
• We have a missing data problem. We can estimate missing values.
• We’ll focus on average treatment effect (ATE)
ATE = $\bar{Y}_{T=1} - \bar{Y}_{T=0}$
• Careful: 𝑌 depends not only on 𝑇, but also on covariates 𝑋
• And 𝑇 can depend on 𝑋 as well
[Diagram: X → T, X → Y, T → Y]
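To make the covariate problem concrete, here is a minimal sketch with made-up units. The data, the names `UNITS`, `true_ate` and `naive_difference`, and the choice to list both potential outcomes per unit are all assumptions for illustration; real observational data never reveals both outcomes.

```python
from statistics import mean

# Hypothetical units as (x, t, y0, y1): covariate, treatment flag, and both
# potential outcomes. Real data reveals only one potential outcome per unit.
UNITS = [
    (0, 0, 1.0, 2.0), (0, 0, 1.0, 2.0), (0, 0, 1.0, 2.0), (0, 1, 1.0, 2.0),
    (1, 0, 5.0, 6.0), (1, 1, 5.0, 6.0), (1, 1, 5.0, 6.0), (1, 1, 5.0, 6.0),
]

def true_ate(units):
    """Average of Y(T=1) minus average of Y(T=0) over every unit."""
    y1_values = [y1 for _, _, _, y1 in units]
    y0_values = [y0 for _, _, y0, _ in units]
    return mean(y1_values) - mean(y0_values)

def naive_difference(units):
    """Observed treated mean minus observed control mean, ignoring X."""
    treated = [y1 for _, t, _, y1 in units if t == 1]
    control = [y0 for _, t, y0, _ in units if t == 0]
    return mean(treated) - mean(control)

print(true_ate(UNITS))          # effect of the treatment alone
print(naive_difference(UNITS))  # inflated, because X raises both T and Y
```

In this toy data the treatment adds exactly 1.0 to every unit, yet the naive comparison reports 3.0, because units with x = 1 are both more likely to be treated and have higher outcomes regardless.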
Disentangling covariates: Randomized Expts.
• Randomized assignment of treatment status provides independence: 𝑇 ⊥ 𝑋
Observational Studies (not random)
• … Identify comparable treated and untreated subgroups such that independence holds
Matching
• For every person who received treatment, find another with identical covariates who did not.
• In high dimensions, cannot find identical or nearest-neighbor matches
• Instead, project units down to a single-dimensional space
A balancing score subdivides observational data so that:
𝑇 ⊥ 𝑋 | 𝑠𝑐𝑜𝑟𝑒
Estimated propensity is one possible balancing score
Match on propensity score.
Generalizes to stratification and weighting approaches
Strong assumptions:
- Ignorability of unobserved covariates: $(Y_0, Y_1) \perp T \mid \text{score}$
- Individual outcome does not depend on whether others receive treatment
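Matching on a propensity score can be sketched as below. This is an illustrative toy, not the talk's implementation: the scores and outcomes are invented, `match_on_score` and `average_effect_on_treated` are hypothetical helper names, and matching is done with replacement to the single nearest control.

```python
def match_on_score(treated, control):
    """Pair each treated unit with the control unit of closest score.

    Units are (propensity_score, outcome) tuples. Controls may be reused
    (matching with replacement), which is an assumption of this sketch.
    """
    pairs = []
    for score_t, y_t in treated:
        _, y_c = min(control, key=lambda unit: abs(unit[0] - score_t))
        pairs.append((y_t, y_c))
    return pairs

def average_effect_on_treated(pairs):
    """Mean outcome difference across matched (treated, control) pairs."""
    return sum(y_t - y_c for y_t, y_c in pairs) / len(pairs)

# Hypothetical (score, outcome) data.
TREATED = [(0.8, 6.0), (0.7, 6.5), (0.3, 2.0)]
CONTROL = [(0.75, 5.0), (0.25, 1.0), (0.2, 1.5)]

PAIRS = match_on_score(TREATED, CONTROL)
print(PAIRS)
print(average_effect_on_treated(PAIRS))
```

Because only treated units are matched, this estimates the average effect on the treated rather than the population ATE.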
Other causal inference methods
• Regression Discontinuities
• Instrumental Variables: Disentangle 𝑋 and 𝑇 through another variable 𝑍.
[Diagram: causal graph over nodes Z, T, X and Y]
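As a rough illustration of the instrumental-variable idea, the sketch below uses the Wald estimator for a binary instrument, which is one standard estimator and not necessarily what the talk had in mind. The records and the name `wald_estimate` are hypothetical, and the sketch assumes Z affects Y only through T.

```python
from statistics import mean

# Hypothetical records as (z, t, y): instrument, treatment, outcome.
RECORDS = [
    (1, 1, 3.0), (1, 1, 3.0), (1, 1, 3.0), (1, 0, 1.0),
    (0, 1, 3.0), (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0),
]

def wald_estimate(records):
    """Shift in mean outcome across Z, divided by shift in treatment uptake."""
    def group_mean(z_value, index):
        return mean([r[index] for r in records if r[0] == z_value])

    outcome_shift = group_mean(1, 2) - group_mean(0, 2)
    uptake_shift = group_mean(1, 1) - group_mean(0, 1)
    return outcome_shift / uptake_shift

print(wald_estimate(RECORDS))
```

In this toy data every treated unit has an outcome 2.0 higher than every untreated unit, and the estimator recovers that value.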
Sanity check: Might this work? With Olteanu and Varol
CSCW17
1. Extract timeline of event occurrences from texts; identify treatment and control groups
2. Learn propensity score estimator and stratify users
3. Calculate population average outcomes
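Steps 2 and 3 can be sketched as follows. This is a simplified stand-in, not the study's pipeline: the propensity "estimator" is just the treated fraction per covariate value (the real system would learn a model over many features), and the data, `estimate_propensity` and `stratified_ate` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observed data as (x, t, y): covariate, treatment, outcome.
OBSERVED = [
    (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0), (0, 1, 2.0),
    (1, 0, 5.0), (1, 1, 6.0), (1, 1, 6.0), (1, 1, 6.0),
]

def estimate_propensity(data):
    """Estimate P(T=1 | X=x) as the treated fraction among units sharing x."""
    by_x = defaultdict(list)
    for x, t, _ in data:
        by_x[x].append(t)
    return {x: mean(ts) for x, ts in by_x.items()}

def stratified_ate(data, n_strata=2):
    """Average within-stratum effects, weighted by stratum size."""
    propensity = estimate_propensity(data)
    strata = defaultdict(lambda: {0: [], 1: []})
    for x, t, y in data:
        # Bucket by propensity score; min() keeps a score of 1.0 in range.
        bucket = min(int(propensity[x] * n_strata), n_strata - 1)
        strata[bucket][t].append(y)
    weighted_sum, total = 0.0, 0
    for groups in strata.values():
        if not groups[0] or not groups[1]:
            continue  # no comparable units in this stratum, so skip it
        size = len(groups[0]) + len(groups[1])
        weighted_sum += size * (mean(groups[1]) - mean(groups[0]))
        total += size
    if total == 0:
        raise ValueError("no stratum contains both treated and control units")
    return weighted_sum / total

print(stratified_ate(OBSERVED))
```

Comparing treated and control units only within a stratum of similar propensity is what removes the influence of the covariate in this toy data.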
Evaluation
Applied technique to 39 specific situations; across 9 high-level topics.
• Chose 9 domains for diversity.
• Selected situations as popularity-weighted random sample within domain.
Data: Firehose Twitter data, March-May, 2014
• Extract ranked outcomes and temporal evolution;
• Correctness judgements via MTurk
Raw Results: Health\Diseases\Triglycerides
Outcome         Count  Absolute Increase  Z-Score
Your_risk       46     24.8%              18.12
Statin          48     23.1%              17.69
Lower           120    35.9%              17.18
Cardiovascular  54     23.0%              16.72
Healthy_diet    55     19.3%              16.54
Fatty_acid      29     18.3%              16.37
Help_prevent    73     26.9%              16.01
Risk_factor     33     18.3%              15.55
Fish_oil        48     24.4%              15.42
inflammation    78     25.1%              15.30
Raw Results: Health\Diseases\Gout
Outcome         Count  Absolute Increase  Z-Score
Flare_up        35     4.1%               12.33
Uric_acid       27     2.9%               10.36
Uric            28     2.9%               10.11
Flare           81     4.9%               9.92
Big_toe         38     2.9%               9.86
Joint           301    7.2%               7.22
Aged            32     1.7%               6.51
Correlation     45     2.8%               6.11
Bollock         53     2.5%               5.96
Shite           108    3.4%               5.93
Raw Results: Society\Issues\Belly Fat
Outcome         Count  Absolute Increase  Z-Score
Burn            156    62.2%              8.96
Ab_workout      13     8.5%               5.82
Workout_lose    13     8.5%               5.82
Help_burn       8      11.1%              5.82
add_video       26     14.0%              5.75
url_playlist    26     14.0%              5.75
Fitness         39     18.6%              5.51
Ab              43     19.1%              5.51
Playlist_metion 30     15.3%              5.39
Biceps          7      4.7%               4.74
Raw Results: Business\Financial\Pension
Outcome         Count  Absolute Increase  Z-Score
Tax             1334   18.9%              18.89
Retire          675    15.6%              17.97
Budget          762    14.0%              17.05
Benefit         920    14.7%              15.82
Vote            1278   13.9%              15.57
Government      876    11.5%              15.47
Financial       673    13.7%              15.22
Income          619    12.3%              14.87
Report          1125   13.7%              14.70
investment      490    10.4%              14.28
Evaluating correctness
• Use Mechanical Turk workers to judge correctness of results:
True or False:
Someone mentioning T will later be more likely to talk about Y
• Provide discussion context to aid annotation
• Calculate precision at rank (P@N), ranked by average effect
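Precision at rank is simple to compute once each outcome has an effect estimate and a human judgment. The sketch below is illustrative only: the outcome names, effects and judgments are made up, and `precision_at_n` is a hypothetical helper.

```python
def precision_at_n(results, n):
    """Fraction of the top-n outcomes, ranked by effect, judged correct.

    `results` is a list of (outcome, effect, judged_correct) tuples.
    """
    ranked = sorted(results, key=lambda r: r[1], reverse=True)
    top = ranked[:n]
    if not top:
        return 0.0
    return sum(1 for _, _, correct in top if correct) / len(top)

# Hypothetical judged outcomes, not results from the study.
JUDGED = [
    ("outcome_c", 0.20, False),
    ("outcome_a", 0.30, True),
    ("outcome_e", 0.05, False),
    ("outcome_b", 0.25, True),
    ("outcome_d", 0.10, True),
]

for n in (2, 3, 5):
    print(n, precision_at_n(JUDGED, n))
```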
Context to aid annotating relevance:
Precision at Rank
P@10 across domains
Data volume vs P@10
Summary
• Propensity score analysis can be applied to social media timelines to extract outcomes, and does extract relevant results
• Developed methodology to allow human judges to evaluate correctness using the discussion context.
• Evaluated on a large variety of domains:
• Outcomes with higher statistical significance and outcomes with larger treatment effects are more likely to be judged correct
• Quality of results is correlated with data volume
Effects of Alcohol Use During College
With Scott Counts (MSR) and Melissa Gasser (UW)
College is an important transition period
• Success in college predicts career success, career happiness, economic achievement.
• High rates of college graduates drive regional income levels and other positive macro-economic indicators
• Over 40% of college students leave without earning a degree.
• Many factors: academic and social integration, financial pressures, …
• Excessive alcohol consumption negatively associated w/ college success, as well as other long-term negative consequences
5-year longitudinal social media analysis
• Existing study methods primarily use surveys.
  • Limited to single institutions and/or small number of participants
  • Rely on participant recall
  • Response biases
• Social media studies can complement
  • Large number of participants
  • In situ reporting of experiences
  • (different) reporting biases
  • Granular observations
What insight are we looking for?
Goal: might intervening to stop early alcohol usage aid college success?
Does early alcohol usage have measurable effects on topics linked to college success?
Does early alcohol usage have measurable effects on future alcohol usage?
5-year longitudinal social media analysis
1. Build a dataset of college students' Twitter timelines:
  • Identify Twitter accounts of entering college students in Fall 2010
  • Gather their tweets from Fall 2010 through Summer 2015
2. Identify relevant events and topics:
  • Drinking and alcohol mentions in Fall 2010
  • Topics known to be related to college success: financial pressures, negative academic outcomes, studying, family, friends, criminal/legal issues.
  • We could not find a simple, reliable indicator of college graduation
3. Infer effects of early drinking on college-success linked topics
  • Stratified propensity score analysis.
Identifying a college cohort
1. Find all tweets matching high-recall, low-precision phrases
2. Build and apply high-precision classifier
Identifying drinking and alcohol mentions
• Previous studies find link between alcohol mentions on social media and real-world alcohol usage
• Identify all tweets that contain a validated list of high-precision alcohol phrases
• Alcohol mentions in first semester (fall 2010) will be our treatment
[Figure: Individual timelines (messages from users u1, u2) mapped to query-aligned timelines, labeled E+ and E-, with precedents before and subsequents after the query event]
1. Extract timeline of word usage per user
2. Treatment = alcohol mention in 1st semester
3. Identify outcome effects in a sliding window
Details on causal inference setup
• Covariates:
  • Word counts for top 50k words
  • Daily tweet frequency and tweet length statistics
  • Featurize word counts as proportions
  • Don’t use words in the week before alcohol mention as covariates
• Treatment:
  • Marker: Alcohol mention in first semester
  • Semantically: the treatment is everything that happens the week before a drinking mention.
• Outcomes
  • Same as covariates
  • Week-long sliding window, starting from drinking mention for ~5 years.
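The setup above can be sketched for a single user as follows. This is a simplified illustration, not the study's code: `featurize` and `proportions` are hypothetical names, the posts are invented, and the windows here are consecutive non-overlapping weeks, which is an assumption since the slide does not give the sliding step.

```python
from collections import Counter, defaultdict

WEEK = 7  # days

def proportions(tokens):
    """Turn a token list into word -> share-of-all-tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}

def featurize(posts, treatment_day):
    """Split one user's (day, tokens) posts into covariates and outcomes."""
    covariate_tokens = []
    window_tokens = defaultdict(list)
    for day, tokens in posts:
        if day < treatment_day - WEEK:
            covariate_tokens.extend(tokens)
        elif day >= treatment_day:
            window_tokens[(day - treatment_day) // WEEK].extend(tokens)
        # Posts in the week before the treatment are deliberately dropped.
    covariates = proportions(covariate_tokens)
    outcomes = {w: proportions(t) for w, t in sorted(window_tokens.items())}
    return covariates, outcomes

# Hypothetical timeline: (day, tokens); the alcohol mention is on day 10.
POSTS = [
    (0, ["exam", "study"]),
    (5, ["party"]),
    (10, ["beer", "party"]),
    (12, ["study", "late", "late", "class"]),
    (18, ["exam"]),
]

COVARIATES, OUTCOMES = featurize(POSTS, treatment_day=10)
print(COVARIATES)
print(OUTCOMES)
```

Using proportions rather than raw counts keeps the features comparable between users who tweet at very different rates.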
High-level results
• Increase in tweet rates after drinking mentions
• Mentioning drinking indicates higher rates of alcohol mentions for roughly the next 2 years, as compared to control.
• Drinking mentions have a 6mo-2yr effect on most college-success linked topics, with a longer effect on study habits and friend mentions.
Increase in tweet rate
[Figure: three panels of tweets/day vs. days before/after alcohol mention, treatment vs. placebo, for the 75th, 50th and 25th percentile strata]
• Before drinking, tweet rates are approximately balanced
• After drinking, tweet rates increase by 0.98 tweets/day
• Further validated with a difference-in-differences analysis, with effect persisting.
• Minor implications for analysis: we have to represent words and tokens as proportions, not counts
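A difference-in-differences check compares the before/after change in the treated group with the same change in the control group. The sketch below uses invented tweets/day values and a hypothetical `diff_in_diff` helper; it does not reproduce the study's numbers.

```python
from statistics import mean

def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change

# Hypothetical tweets/day per user, not figures from the study.
EFFECT = diff_in_diff(
    treated_before=[2.0, 3.0],
    treated_after=[3.5, 4.5],
    control_before=[2.0, 3.0],
    control_after=[2.5, 3.5],
)
print(EFFECT)
```

Subtracting the control group's change removes any trend shared by both groups, such as a general rise in tweeting over the semester.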
Effect on future drinking
[Figure: Relative treatment effect vs. days since first alcohol tweet in Fall 2010, shown at two y-axis scales (0.9 to 16.9, and 0.9 to 1.2)]
[Figure: Proportion of alcohol tweets vs. days since first alcohol tweet in Fall 2010, control vs. treatment]
Relative effect shows early drinking leads to more drinking, with diminishing effect over time. This is because the non-drinkers “catch up”, increasing the drinking mentions over time.
Effect on topics related to college success
Both control and treatment users follow similar temporal patterns.
But generally see positive effect on each outcome initially, with diminishing effect over time.
Effect on topics related to college success
Initially, drinkers are more social, but after about 1-1.5 years, drinkers mention peers less than control group.
Strong effect on negative academic outcome mentions in the first year, then no difference.
Persistently lower proportion of study habit mentions.
Summary of findings & insights
Q: Does early alcohol usage have measurable effects on topics linked to college success?
A: Yes, especially in short-term (6mo-2yrs). However, effects might be due to other factors closely associated with drinking (e.g., socialization at parties)
Q: Does early alcohol usage have measurable effects on future alcohol usage?
A: Not long term. In fact, control group catches up over time. Note, this does not account for frequency of drinking.
Goal: might intervening to stop early alcohol usage aid college success?
Causes of effects: Case study on transitions to suicide ideation
With Munmun De Choudhury, Glen Coppersmith, Mark Dredze, Mrinal Kumar
CHI 2016
Shifts from Mental Health Discussions to Suicide Ideation
• Suicide is one of the 10 leading causes of death in the United States, yet identifying suicide risk and preventing suicide is difficult. History of mental illness is known to be a major factor, but there is little research on characterizing the risk of transition from mental health problems to suicide ideation
• We studied individuals posting to mental health (MH) forums on Reddit, who go on to post in suicide ideation forum (SW).
• What words in MH forum posts indicate that user is more likely to post in SW in the future?
• Causal inference analysis, with same methodology, but sweeping treatments instead of sweeping outcomes
• In particular, we’ll go into more detail looking into heterogeneous effects
1. Not forgetting limitations that restrict actual causal claims
What insight are we looking for?
• Validate that off-line theories of factors that drive suicidal ideation also manifest in online discussions
• Use findings to build better, more nuanced early identification of people at risk of suicidal ideation.
Example posts
MHs (Mental Health Subreddits)
I have been considering going for some formal therapy. Any suggestions?
Everyday I feel sad and lonely
Since past sometime I think I am having panic attacks. I really need help from you guys.
SW (SuicideWatch Subreddit)
I know I was never meant to lead this life
Don’t want to hurt the people I care but I can’t take this anymore
Today I felt I have nothing left, why am I even living… I don’t see a point
Content paraphrased to protect privacy of individuals
Data gathering
To focus on transition from mental health postings to suicide ideation, find people who have been posting in MH forum, but not in SW, before August 11, 2014.
Some of these individuals never post in a SW. Label this group MH
Some of these individuals later do post in SW. Label this group MHSW
PS Cause-of-Effects analysis results
Most statistically significant treatment tokens. Results consistent with social theories on suicide ideation
Stratified results
Opposite Effects across strata
Case study results
• Distinct markers associated w/increased risk of suicide ideation conform with social science theories
• Highlighted heterogeneous effects and importance of context in interpreting risk factors
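Heterogeneous effects matter because opposite effects in different strata can cancel in a pooled estimate. The sketch below shows this with invented data; the stratum names, the outcome values and the `effect` helper are all hypothetical.

```python
from statistics import mean

def effect(rows):
    """Treated mean minus control mean for rows of (treated_flag, outcome)."""
    treated = [y for t, y in rows if t == 1]
    control = [y for t, y in rows if t == 0]
    return mean(treated) - mean(control)

# Hypothetical strata: outcome is the rate of a later event per user group.
STRATA = {
    "low_propensity": [(1, 0.40), (1, 0.40), (0, 0.20), (0, 0.20)],
    "high_propensity": [(1, 0.10), (1, 0.10), (0, 0.30), (0, 0.30)],
}

PER_STRATUM = {name: effect(rows) for name, rows in STRATA.items()}
POOLED = effect([row for rows in STRATA.values() for row in rows])

print(PER_STRATUM)  # effects of opposite sign in the two strata
print(POOLED)       # close to zero once the strata are pooled
```

Reporting only the pooled number would hide that the same token is associated with higher risk in one stratum and lower risk in the other.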
Actionable Insights and Biases
Derived from work with Alexandra Olteanu, Carlos Castillo and Fernando Diaz
Working paper: Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries
Actionable Insight
A deep understanding that affects our actions
Lots of social media data == lots of insight??
Barriers to insight: Do we believe the results?
We have to ensure:
• Construct validity: Are we measuring what we think we’re measuring?
  • E.g., are text classifiers correct?
• Internal validity: Is our analysis correct?
  • E.g., did we need causal inference but not use it?
• External validity: Do our results generalize to target situations?
  • E.g., population biases, platform biases, …
Incomplete list of potential biases
• Data source issues
  • Population biases, behavioral biases, platform biases
  • Temporal biases
  • System errors, adversaries, fake content
• Data collection issues
  • Querying biases, sampling biases, system errors
• Data processing
  • Cleaning, featurization, aggregation biases
• Analysis issues
  • Measures, classifiers, featurization
  • Algorithms and interpretation
Mitigation approaches
• Construct validity
  • Validate measures through qualitative reading.
  • Validate measures in separate experiments, against known ground-truth
  • Use more direct measures (e.g., step counters rather than text interpretation)
• Internal validity
  • When appropriate, use causal inference to reduce bias from observed confounders
• External validity
  • Try to avoid need for significant generalization with an “in population” task and intervention.
  • Otherwise, measure and account for population and other biases.
  • Validate biases over time.
Conclusions
1. Estimating causal relationships among experience reports on social media is a rich and promising approach to understanding a broad set of phenomena
Alcohol usage in college;
Transitions to suicidal ideation online
2. Causal inference reduces some kinds of bias, but not enough for actionable insights!
In particular, still need measurement validity and generalizability.
Achieve through separate validation, and/or expt design & scoping of goals
Questions?
Referenced papers:
- Towards Decision Support and Goal Achievement: Identifying Action-Outcome Relationships from Social Media. Kıcıman, Richardson. KDD15
- Distilling the Outcomes of Personal Experiences: A Propensity-scored Analysis of Social Media. Olteanu, Varol, Kıcıman. CSCW17
- Using Social Media to Understand the Effects of Alcohol Use During College. Kıcıman, Counts, Gasser (email for draft)
- Shifts to Suicidal Ideation from Mental Health Content in Social Media. De Choudhury, Kıcıman, Dredze, Coppersmith, Kumar. CHI16
- Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries. Olteanu, Castillo, Diaz and Kıcıman. Working paper