learning about personal experiences and their outcomes...

72
Learning about Personal Experiences and their Outcomes: Analyzing Social Media as an Observational Study Emre Kıcıman e [email protected] Microsoft Research

Upload: others

Post on 12-Jun-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Learning about Personal Experiences and their Outcomes:

Analyzing Social Media as an Observational Study

Emre Kıcıman

[email protected]

Microsoft Research

Page 2: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

should i buy a fixer upper or new home?

should i rent or buy a home in 2014

should i get preapproved before house hunting

should i refinance my mortgage

should i get a divorce

should i break up with my boyfriend

should i pay off my mortgage

should i file bankruptcy

should i get a flu shot

should i lease or buy a car

should i join a gym

should i eat before or after working out

should i text him

should i buy a chromebook

should i quit my job

should i pop a blister

should i get bangs

should i wash my hair before coloring it

should i buy a house

should i cut my hair short

should i retire at 65

should i go to college

should i go to law school

should i take a multivitamin

should i text him or wait for him to text me

should i join the military

should i leave my husband

should i get married

should i pop a burn blister

should i see a doctor

should i consolidate my student loans

should i do cardio before or after weights

should i get a tattoo

should i workout with a cold

should i join the army

should i buy an xbox one

should i move to florida

should i have a third child

should i take creatine

should i get a dog

should i run everyday

should i be a teacher

should i become a real estate agent

should i quit drinking coffee

should i shave my head

Page 3: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

100s of millions reporting their experiences

People post about the actions they take.

… and post about what happens later.

Page 4: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Broad goal: Create a system for open-domain querying of relationship between situations, actions and outcomes• Enable this for the long tail of unfamiliar situations

• Accommodate many possible data sources

Personal and policy questions• Explore Situations: What should I expect now?

• Understand Actions: Should I do ____?• Plan for Outcomes: What might help reach goals?

Page 5: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Individual timelinesMessages Query-aligned Timelines

C ardigan fanny pac k

Odd Future, B anksy

cre d selvage chil lwa ve

ret ro se lfi es orga nic.

YOLO shabby chic

Thunderca ts , lomo

me di ta tion

Willi ams burg plaid

narwha l cruc ifix Ma rfa

u1

u2u1

u2

E+

E-

Precedents

Subsequents

E+

E+

1. Extract timeline of event occurrences from social media

2. Select timelines that match a query event; and its control group.

3. Identify outcomes

Page 6: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

• Introduction

• Causal inference (very) basics

• Brief sanity check on applying causal inference to social media

• Case Study: Effects of alcohol usage during college

• Case Study: Transitions from mental health to suicidal ideation posts

• Stepping back: Insights & bias

Page 7: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

A Brief Introduction to Causal Inference

Page 8: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Treatment

Page 9: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Alice Treatment

Page 10: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Alice

Page 11: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Alice

Page 12: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

𝑌𝑇=1

Page 13: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

𝑌𝑇=1𝑌𝑇=0

Counterfactual Framework

Page 14: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

𝑌𝑇=1𝑌𝑇=0

Treatment effect = 𝑌𝑇=1 − 𝑌𝑇=0

Counterfactual Framework

Page 15: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Counterfactual Framework

• We have a missing data problem. We can estimate missing values.

• We’ll focus on average treatment effect (ATE)

ATE = ത𝑌𝑇=1 − ത𝑌𝑇=0

• Careful: 𝑌 is dependent not only on 𝑇, but alsocovariates 𝑋• And 𝑇 can depend on 𝑋 as well

X

Y

T

Page 16: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Disentangling covariates: Randomized Expts.

• Randomized assignment of treatment status provides independence: 𝑇 ⊥ 𝑋

Page 17: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Disentangling covariates: Randomized Expts.

• Randomized assignment of treatment status provides independence: 𝑇 ⊥ 𝑋

Page 18: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Disentangling covariates: Randomized Expts.

• Randomized assignment of treatment status provides independence: 𝑇 ⊥ 𝑋

Page 19: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Observational Studies (not random)

• … Identify comparable treated and untreated subgroups such that independence holds

Page 20: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Matching

• For every person who received treatment, find another with identical covariates who did not.

• In high-dimensions, cannot find identical or nearest neighbor matches

• Instead, project units down to a single dimensional space

Page 21: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

A balancing score subdivides observational data so that:

𝑇 ⊥ 𝑋 | 𝑠𝑐𝑜𝑟𝑒

Estimated propensity is one possible balancing score

Match on propensity score.

Generalizes to stratification and weighting approaches

Page 22: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Strong assumptions:- Ignorability of

unobserved covariates- Individual outcome

does not depend on whether others receive treatment

𝑌0, 𝑌1 ⊥ 𝑇 | 𝑠𝑐𝑜𝑟𝑒

Page 23: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Other causal inference methods

Regression Discontinuities Instrumental Variables

Disentangle 𝑋 and 𝑇 through another variable 𝑍.

X

Y

TZ

Page 24: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Sanity check: Might this work? With Olteanu and Varol

CSCW17

Page 25: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

1. Extract timeline of event occurrences from texts;Identify treatment and

control groups

Page 26: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

2. Learn propensity score estimator and stratify users

Page 27: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

3. Calculate population average outcomes

Page 28: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Evaluation

Applied technique to 39 specific situations; across 9 high-level topics.

• Chose 9 domains for diversity.

• Selected situations as popularity-weighted random sample within domain.

Data: Firehose Twitter data, March-May, 2014

• Extract ranked outcomes and temporal evolution;

• Correctness judgements via MTurk

Page 29: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Raw Results: Health\Diseases\TriglyceridesOutcome Count Absolute Increase Z-Score

Your_risk 46 24.8% 18.12

Statin 48 23.1% 17.69

Lower 120 35.9% 17.18

Cardiovascular 54 23.0% 16.72

Healthy_diet 55 19.3% 16.54

Fatty_acid 29 18.3% 16.37

Help_prevent 73 26.9% 16.01

Risk_factor 33 18.3% 15.55

Fish_oil 48 24.4% 15.42

inflammation 78 25.1% 15.30

Page 30: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Raw Results: Health\Diseases\GoutOutcome Count Absolute Increase Z-Score

Flare_up 35 4.1% 12.33

Uric_acid 27 2.9% 10.36

Uric 28 2.9% 10.11

Flare 81 4.9% 9.92

Big_toe 38 2.9% 9.86

Joint 301 7.2% 7.22

Aged 32 1.7% 6.51

Correlation 45 2.8% 6.11

Bollock 53 2.5% 5.96

Shite 108 3.4% 5.93

Page 31: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Raw Results: Society\Issues\Belly Fat

Outcome Count Absolute Increase Z-Score

Burn 156 62.2% 8.96

Ab_workout 13 8.5% 5.82

Workout_lose 13 8.5% 5.82

Help_burn 8 11.1% 5.82

add_video 26 14.0% 5.75

url_playlist 26 14.0% 5.75

Fitness 39 18.6% 5.51

Ab 43 19.1% 5.51

Playlist_metion 30 15.3% 5.39

Biceps 7 4.7% 4.74

Page 32: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Raw Results: Business\Financial\PensionOutcome Count Absolute Increase Z-Score

Tax 1334 18.9% 18.89

Retire 675 15.6% 17.97

Budget 762 14.0% 17.05

Benefit 920 14.7% 15.82

Vote 1278 13.9% 15.57

Government 876 11.5% 15.47

Financial 673 13.7% 15.22

Income 619 12.3% 14.87

Report 1125 13.7% 14.70

investment 490 10.4% 14.28

Page 33: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Evaluating correctness

• Use mechanical turk workers to judge correctness of results:

True or False:

Someone mentioning T will later be more likely to talk about YProvide discussion context to aid annotation

• Calculate precision at rank (P@N), ranked by average effect

Page 34: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Context to aid annotating relevance:

Page 35: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Context to aid annotating relevance:

Page 36: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Precision at Rank

Page 37: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

P@10 across domains

Page 38: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Data volume vs P@10

Page 39: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Summary

• Propensity score analysis can be applied to social media timelines to extract outcomes, and do extract relevant results

• Developed methodology to allow human judges to evaluate correctness using the discussion context.

• Evaluated on a large variety of domains:

• Outcomes with higher statistical significance and outcomes with larger treatment effects are more likely to be judged correct

• Quality of results is correlated with data volume

Page 40: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Effects of Alcohol Use During CollegeWith Scott Counts (MSR) and Melissa Gasser (UW)

Page 41: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

College is an important transition period

• Success in college predicts career success, career happiness, economic achievement.• High rates of college graduates drive regional income levels and other positive

macro-economic indicators

• Over 40% of college students leave without earning a degree.• Many factors: academic and social integration, financial pressures, …

• Excessive alcohol consumption negatively associated w/ college success, as well as other long-term negative consequences

Page 42: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

5-year longitudinal social media analysis

• Existing study methods primarily use surveys.• Limited to single institutions and/or small number of participants

• Rely on participant recall

• Response biases

• Social media studies can complement• Large number of participants

• In situ reporting of experiences

• (different) reporting biases

• Granular observations

Page 43: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

What insight are we looking for?

Goal: might intervening to stop early alcohol usage aid college success?

Does early alcohol usage have measurable effects on topics linked to college success?

Does early alcohol usage have measurable effects on future alcohol usage?

Page 44: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

5-year longitudinal social media analysis

1. Build a dataset of college students twitter timelines:• Identify twitter accounts of entering college students in Fall 2010

• Gather their tweets from Fall 2010 through Summer 2015

2. Identify relevant events and topics:• Drinking and alcohol mentions in Fall 2010

• Topics known to be related to college success: financial pressures, negative academic outcomes, studying, family, friends, criminal/legal issues.

• We could not find a simple, reliable indicator of college graduation

3. Infer effects of early drinking on college-success linked topics• Stratified propensity score analysis.

Page 45: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Identifying a college cohort

1. Find all tweets matching a high-recall, low-precision phrases2. Build and apply high-precision classifier

Page 46: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Identifying drinking and alcohol mentions

• Previous studies find link between alcohol mentions on social media and real-world alcohol usage

• Identify all tweets that contain a validated list of high-precision alcohol phrases

• Alcohol mentions in first semester (fall 2010) will be our treatment

Page 47: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Individual timelinesMessages Query-aligned Timelines

C ardigan fanny pac k

Odd Future, B anksy

cre d selvage chil lwa ve

ret ro se lfi es orga nic.

YOLO shabby chic

Thunderca ts , lomo

me di ta tion

Willi ams burg plaid

narwha l cruc ifix Ma rfa

u1

u2u1

u2

E+

E-

Precedents

Subsequents

E+

E+

1. Extract timeline of word usage per user

2. Treatment = alcohol mention in 1st semester

3. Identify outcome effectsin a sliding window

Page 48: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Details on causal inference setup

• Covariates:• Word counts for top 50k words• Daily tweet frequency and tweet length statistics• Featurize word counts as proportions• Don’t use words in the week before alcohol mention as covariates

• Treatment:• Marker: Alcohol mention in first semester• Semantically: the treatment is everything that happens the week before a drinking

mention.

• Outcomes• Same as covariates• Week-long sliding window, starting from drinking mention for ~5 years.

Page 49: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

High-level results

• Increase in tweet rates after drinking mentions

• Mentioning drinking indicates higher rates of alcohol mentions for ~next 2 years as compared to control.

• Drinking mentions has 6mo-2yr effect on most college-success linked topics; with a longer effect on study habits and friend mentions.

Page 50: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Increase in tweet rate

0

1

2

3

4

5

6

-14

7-1

35

-12

3-1

11

-99

-87

-75

-63

-51

-39

-27

-15 -3 9

21

33

45

57

69

81

93

10

51

17

12

9

Twee

ts/D

ay

Days Before/After Alcohol Mention

75th percentile strata

Treatment Placebo

0

1

2

3

4

5

6

-14

6

-13

3

-12

0

-10

7

-94

-81

-68

-55

-42

-29

-16 -3 10

23

36

49

62

75

88

10

1

11

4

12

7

14

0

Twee

ts/D

ay

Days Before/After Alcohol Mention

50th percentile strata

0

1

2

3

4

5

6

-13

7

-12

5

-11

3

-10

1

-89

-77

-65

-53

-41

-29

-17 -5 7

19

31

43

55

67

79

91

10

3

11

5

12

7

13

9

Twee

ts/D

ay

Days Before/After Alcohol Mention

25th percentile strata

• Before drinking, tweet rates are approximately balanced

• After drinking, tweet rates increase by 0.98 tweets/day

• Further validated with a difference-in-differences analysis, with effect persisting.

• Minor implications for analysis: we have to represent words and tokens as proportions, not counts

Page 51: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Effect on future drinking

0.9

2.9

4.9

6.9

8.9

10.9

12.9

14.9

16.91

11

3

22

5

33

7

44

9

56

1

67

3

78

5

89

7

10

09

11

21

12

33

13

45

14

57

Re

lati

ve T

reat

me

nt

Effe

ct

Days since first alcohol tweet in Fall 2010

Page 52: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Effect on future drinking

0.9

0.95

1

1.05

1.1

1.15

1.21

11

3

22

5

33

7

44

9

56

1

67

3

78

5

89

7

10

09

11

21

12

33

13

45

14

57

Re

lati

ve T

reat

me

nt

Effe

ct

Days since first alcohol tweet in Fall 2010

0

0.005

0.01

0.015

0.02

0.025

0.03

1

11

3

22

5

33

7

44

9

56

1

67

3

78

5

89

7

10

09

11

21

12

33

13

45

14

57

Pro

po

rtio

n o

f al

coh

ol-

twee

ts

Days since first alcohol tweet in Fall2010

control

treatment

Relative effect shows early drinking leads to more drinking, with diminishing effect over time. This is because the non-drinkers “catch up”, increasing the drinking mentions over time.

Page 53: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation
Page 54: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Effect on topics related to college success

Both control and treatment users follow similar temporal patterns.

But generally see positive effect on each outcome initially, with diminishing effect over time.

Page 55: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Effect on topics related to college success

Initially, drinkers are more social, but after about 1-1.5 years, drinkers mention peers less than control group.

Strong effect on negative academic outcome mentions in year, then no difference.

Persistently lower proportion of study habit mentions.

Page 56: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Summary of findings & insights

Q: Does early alcohol usage have measurable effects on topics linked to college success?

A: Yes, especially in short-term (6mo-2yrs). However, effects might be due to other factors closely associated with drinking (e.g., socialization at parties)

Q: Does early alcohol usage have measurable effects on future alcohol usage?

A: Not long term. In fact, control group catches up over time. Note, this does not account for frequency of drinking.

Goal: might intervening to stop early alcohol usage aid college success?

Page 57: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Causes of effects:Case study on transitions to suicide ideationWith Munmun De Choudhury, Glen Coppersmith, Mark Dredze, Mrinal Kumar

CHI 2016

Page 58: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Shifts from Mental health discussions to Suicide Ideation• Suicide is one of the 10 leading causes of death in United States; yet

identifying suicide risk and preventing suicide is difficult. History of mental illness is a known to be a major factor, but little research on characterizing risk of transition from mental health problems to suicide ideation

• We studied individuals posting to mental health (MH) forums on Reddit, who go on to post in suicide ideation forum (SW).

• What words in MH forum posts indicate that user is more likely to post in SW in the future?

• Causal inference analysis, with same methodology, but sweeping treatments instead of sweeping outcomes

• In particular, we’ll go into more detail looking into heterogeneous effects

1. Not forgetting limitations that restrict actual causal claims

Page 59: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

What insight are we looking for?

• Validate that off-line theories of factors that drive suicidal ideation also manifest in online discussions

• Use findings to build better, more nuanced early identification of people at risk of suicidal ideation.

Page 60: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Example posts

MHs (Mental Health Subreddits)

I have been considering going for some formal therapy. Any suggestions?

Everyday I feel sad and lonely

Since past sometime I think I am having panic attacks. I really need help from you guys.

SW (SuicideWatch Subreddit)

I know I was never meant to lead this life

Don’t want to hurt the people I care but I can’t take this anymore

Today I felt I have nothing left, why am I even living… I don’t see a point

Content paraphrased to protect privacy of individuals

Page 61: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Data gathering

To focus on transition from mental health postings to suicide ideation, find people who have been posting in MH forum, but not in SW, before August 11, 2014.

Some of these individuals never post in a SW. Label this group MH

Some of these individuals later do post in SW. Label this group MHSW

Page 62: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

PS CAUSE-OF-Effects analysis results

Tok

Most statistically significant treatment tokens. Results consistent with social theories on suicide ideation

Page 63: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Stratified results

Page 64: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Opposite Effects across strata

Page 65: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Case study results

• Distinct markers associated w/increased risk of suicide ideation conform with social science theories

• Highlighted heterogeneous effects and importance of context in interpreting risk factors

Page 66: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Actionable Insights and BiasesDerived from work with Alexandra Olteanu, Carlos Castillo and Fernando Diaz

Working paper: Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries

Page 67: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Actionable Insight

A deep understanding that affects our actions

Lots of social media data == lots of insight??

Page 68: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Barriers to insight: Do we believe the results?

We have to ensure:

• Construct validity: Are we measuring what we think we’re measuring?• E.g., are text classifiers correct?

• Internal validity: Is our analysis correct? • E.g., did we need causal inference but not use it?

• External validity: Do our results generalize to target situations?• E.g., population biases, platform biases, …

Page 69: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Incomplete list of potential biases

• Data source issues• Population biases, behavioral biases, platform biases• Temporal biases• System errors, adversaries, fake content

• Data collection issues• Querying biases, sampling biases, system errors

• Data processing • Cleaning, featurization, aggregation biases

• Analysis issues• Measures, classifiers, featurization• Algorithms and interpretation

Page 70: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Mitigation approaches

• Construct validity• Validate measures through qualitative reading.• Validate measures in separate experiments, against known ground-truth• Use more direct measures (e.g., step counters rather than text interpretation)

• Internal validity• When appropriate, use causal inference to reduce bias from observed confounders

• External validity• Try to avoid need for significant generalization with an “in population” task and

intervention.• Otherwise, measure and account for population and other biases.• Validate biases over time.

Page 71: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Conclusions

1. Estimating causal relationships among experience reports on social media is a rich and promising approach to understanding broad set of phenomenon

Alcohol usage in college;

Transitions to suicidal ideation online

2. Causal inference reduces some kinds of bias, but not enough for actionable insights!

In particular, still need measurement validity and generalizability.

Achieve through separate validation, and/or expt design & scoping of goals

Page 72: Learning about Personal Experiences and their Outcomes ...kiciman.org/wp-content/uploads/2017/02/maison... · retro selfies organic. YOLO shabby chic Thundercats, lomo meditation

Questions?Referenced papers:

- Towards Decision Support and Goal Achievement: Identifying Action-Outcome Relationships from Social Media. Kıcıman, Richardson. KDD15

- Distilling the Outcomes of Personal Experiences: A Propensity-scored Analysis of Social Media. Olteanu, Varol, Kıcıman. CSCW17

- Using Social Media to Understand the Effects of Alcohol Use During College. Kıcıman, Counts, Gasser (email for draft)

- Shifts to Suicidal Ideation from Mental Health Content in Social Media. De Choudhury, Kıcıman, Dredze, Coppersmith, Kumar. CHI16

- Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries. Olteanu, Castillo, Diaz and Kıcıman. Working paper