8. A/B Test and Experimental Research
Jonathan Zhu 祝建華
MACNM Computational Workshop, Jan 13, 2019
Who Did People Guess Would Win?
Did Facebook Help Trump Win Election 2016?
▪ Yes
o Cambridge Analytica worked for him
o He used Twitter around the clock
o There was a lot of false information about Hillary
o Russian hackers were highly active during the election
o …
▪ No
o More Facebook users supported Clinton than him
o Most of his supporters did not use Twitter
o There was a lot of negative information about him too
o There is no such direct link yet
o …
How to Find the True Cause(s)?
[Diagram: Voting Decision as the outcome, with Personal-Family Background, Social Influence, and Media Influence as candidate causes; rival causes are to be removed or isolated, and the focal cause confirmed or ensured using an experiment.]
What Are the Powerful Doing: A/B Tests
Source: Tan of Millward Brown, 2009
Example of A/B Test (Online Experiment)
Source: Tan of Millward Brown, 2009
Four Types of A/B Test and Online Experiment
Salganik (2018):
1. Partner with the powerful (websites, government, advertisers, NGOs, etc.)
2. Use an existing system (e.g., running on Facebook, Twitter, etc.)
3. Build a standalone experiment platform (built once, solely for the experiment, e.g., MusicLab)
4. Build a product system (real application, e.g., MovieLens)
Partner with the Powerful
Bond, R. M., et al. (2012). A 61-million-person experiment in social influence and political mobilization. Nature, 489(7415), 295.
Build a Standalone Experiment Platform
Randomized Screen
Follow-up Pages for Rating and Downloading
Build a Complete Product System
Comparisons among the Four Types of A/B Test

                             Partner       Use existing   Build own     Build a
                             with power    systems        experiment    real product
Cost (money and time)        Low           Low            Medium        High
Control (subjects, stimuli)  Medium        Low            High          High
Realism (setting)            High          High           High          High
Ethics (impact on systems)   Complex       Complex        Easy          Easy

Source: based on Salganik (2018).
A Larger Picture of Experiment
▪ Major Types:
o Pseudo experiment
o Lab experiment
o Field experiment
o Natural experiment
o A/B Test & online experiment
▪ Key Ingredients:
o Study subjects (S)
o Testing setting (T)
o Random assignment (R)
o Manipulated stimulus (X)
o Observed outcome (O)
Pseudo Experiment
Source: Babbie (2007) Fig 8.3
• Change in Y: unknown; Effect of X: unknown; Confound: unknown
• Change in Y: known; Effect of X: unknown; Confound: possible
• Change in Y: unknown; Effect of X: unknown; Confound: possible
Three Most Deadly Sins at Harrah’s
Gary Loveman, CEO Harrah’s:
“… you don't harass women, you don't steal, and you've got to have a control group. … you can lose your job at Harrah's for not running a control group.”
Source: Salganik (2018)
Lab Experiment
▪ Also known as the “classical experiment” or “randomized controlled experiment,” marked by the presence of the following “controls”:
1. Randomized assignment (R)
2. Control condition (C)
3. Manipulated stimulus (X)
4. Observed outcome (O)
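The four controls above can be illustrated in a small simulation (all numbers here, including the +5 treatment effect, are invented for illustration):

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is reproducible

# Hypothetical subjects: baseline outcome scores before any stimulus
baseline = [random.gauss(50, 10) for _ in range(1000)]

# 1. Randomized assignment (R): each subject is assigned by a fair coin flip
treated = [random.random() < 0.5 for _ in baseline]

# 2. Control condition (C) and 3. manipulated stimulus (X):
# the stimulus adds an assumed +5 to the outcome of treated subjects only
outcomes = [y + 5 if t else y for y, t in zip(baseline, treated)]

# 4. Observed outcome (O): the difference in group means estimates the effect
treat_mean = statistics.mean(y for y, t in zip(outcomes, treated) if t)
ctrl_mean = statistics.mean(y for y, t in zip(outcomes, treated) if not t)
effect = treat_mean - ctrl_mean
print(f"Estimated treatment effect: {effect:.2f}")
```

Because assignment is random, the two groups differ only by the stimulus, so the mean difference recovers the true effect up to sampling error.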
Basic Experimental Design
Source: Babbie (2007) Fig 8.1
Field Experiment
Same as lab experiment
▪ Randomization
▪ Control condition
▪ Manipulated stimulus
▪ Observed outcome
Different from lab experiment
▪ More “realistic” setting (e.g., shopping mall, office, etc.)
▪ More “normal” subjects (e.g., adults)
Natural Experiment
Same as field experiment
▪ Realistic setting
▪ Normal subjects
Different from field experiment
▪ “Naturally occurring” stimulus, uncontrolled by the researcher
Online vs. Offline Experiment
Same as offline experiment
▪ Randomization
▪ Control condition
▪ Manipulated stimulus
▪ Observed outcome
Different from offline experiment
▪ More heterogeneous subjects
▪ Larger sample size
▪ Faster data collection
▪ More “realistic” setting
▪ More likely to run into ethical problems
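A typical online A/B test compares two proportions, such as click-through rates. The sketch below (the click and impression counts are hypothetical) runs a two-proportion z-test using only the standard library:

```python
import math

# Hypothetical A/B test result: clicks out of impressions per variant
n_a, clicks_a = 10_000, 450   # variant A (control)
n_b, clicks_b = 10_000, 520   # variant B (treatment)

p_a, p_b = clicks_a / n_a, clicks_b / n_b
p_pool = (clicks_a + clicks_b) / (n_a + n_b)  # pooled rate under the null

# Standard error of the difference under the null, then the z statistic
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se
print(f"CTR A={p_a:.2%}, CTR B={p_b:.2%}, z={z:.2f}")
```

A |z| above 1.96 indicates a difference significant at the 5% level; the large samples typical of online experiments make such tests sensitive to small effects.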
Online Experiment Stands out as a Winner
[Figure: Survey, Lab Experiment, Field Experiment, and Online Experiment plotted by internal validity (low-high) against external validity (low-high), with Online Experiment rating high on both dimensions.]
9. Overview of Computational Methods
Summary of the Workshop
▪ Python Basics
o Concepts: List, Dictionary, DataFrame, Missing value, Discretization, Permutation/Random sampling, Dummy coding, Merge/Join/Concatenate, GroupBy/apply, Cross tabulations
o Python packages: pandas
o Data (CityU Tweets): user_profiles_ch_anony.csv, tweet_ch_anony.xlsx
▪ Web Data Collection
o Concepts: Webpage scraping, API retrieval, HTML, CSS
o Python packages: Tweepy, BeautifulSoup
▪ Text Mining
o Concepts: Word extraction, Bag of words, TF-IDF, Feature transformation, Topic modeling, Tokenization, Normalization, Stemming, Lemmatization, Word tagging
o Python packages: NLTK, jieba, sklearn
o Data (CityU Tweets): tweet_en_anony, tweet_ch_anony
▪ User Profiling
o Concepts: Audience targeting, Behavior analytics, Timing analysis, Machine learning, Clustering, Classification, Regression
o Python packages: pandas, scikit-learn
o Data (CityU Tweets): user_profiles_ch_anony.csv, user_profiles_en_anony.csv
▪ Visualization
o Concepts: Univariate: histogram & KDE; Bivariate: bar, pie, line charts, hexbin, scatterplot; 3-variate: superposed line, grouped & stacked bar; Multivariate: parallel coordinates, scatterplot matrix
o Python packages: pandas, seaborn, numpy
o Data (CityU Tweets): user_profiles_ch_anony.csv, user_profiles_en_anony.csv
▪ A/B Test
o Concepts: Experiment design, Experiment group, Control group, Stimulus, Randomization, Partner with power, Existing platform, Stand-alone platform, Product systems
▪ Network Analysis
o Concepts: Graph, Edge list, Graph-, node-, and community-level analysis, Degree distribution, Clustering, Path length, Centrality, Component, Community, Ego network
o Python packages: NetworkX
o Data (CityU Tweets): edgelist_following.csv
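To make a few of the pandas concepts above concrete, here is a minimal sketch; the DataFrame and its column names are invented for illustration, not the workshop's actual files:

```python
import pandas as pd

# Invented toy data shaped like a small tweet table
df = pd.DataFrame({
    "user": ["a", "a", "b", "b", "c", "c"],
    "lang": ["en", "en", "ch", "en", "ch", "ch"],
    "retweets": [3, None, 5, 2, 0, 8],
})

df["retweets"] = df["retweets"].fillna(0)        # missing-value handling
per_user = df.groupby("user")["retweets"].sum()  # GroupBy
table = pd.crosstab(df["user"], df["lang"])      # cross tabulation
print(per_user)
print(table)
```

The same three operations (fill missing values, aggregate by group, cross-tabulate two categorical columns) scale unchanged from six rows to millions.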
Real Challenges in Computational Research
▪ Causal inference
▪ Sampling
▪ Mixed methods
▪ Research ethics
Causal Inference
▪ Causality remains the ultimate goal of scientific research
▪ Experimental design is the best tool for establishing causality
▪ Online experiments excel over their offline counterparts
▪ The major challenge for online experiments lies in research ethics
Sampling
▪ Sampling is still necessary in the age of big data
▪ A good sample is more informative than raw big data
▪ A sample size of around 10,000 to 100,000 is optimal, balancing quality and cost
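As an illustration of the sampling claim, the sketch below (simulated data, all numbers invented) draws a 10,000-case random sample from a million-row "population" and shows that the sample mean tracks the population mean closely:

```python
import random
import statistics

random.seed(7)  # reproducible draw

# Simulated "big data": one million observations
population = [random.gauss(100, 15) for _ in range(1_000_000)]

# A random sample in the recommended 10,000 range
sample = random.sample(population, 10_000)

pop_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)
print(f"population mean={pop_mean:.2f}, sample mean={sample_mean:.2f}")
```

With a well-drawn random sample, the standard error of the mean here is about 0.15, so the estimate is accurate at a fraction of the cost of processing the full data.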
Integration of Multi-source Data
▪ Found vs. Made Data
o Found data:
• Log files
• Web content
• etc.
o Made data:
• Experiment observations
• Survey responses
• etc.
▪ Offline vs. Online Data
o Offline data
• Small size
• Ground truth known (→supervised learning)
o Online data
• Big size
• Ground truth unknown (→unsupervised learning)
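The ground-truth distinction above maps directly onto the two branches of machine learning. A minimal scikit-learn sketch (synthetic data with assumed cluster centers, purely for illustration):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Synthetic data with two well-separated groups (centers are assumptions)
X, y = make_blobs(n_samples=300, centers=[[-5, -5], [5, 5]], random_state=0)

# Ground truth known (offline-style data) -> supervised learning
clf = LogisticRegression().fit(X, y)
acc = clf.score(X, y)

# Ground truth unknown (online-style data): ignore y -> unsupervised learning
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(f"supervised accuracy={acc:.2f}")
print(f"recovered cluster sizes={sorted(list(km.labels_).count(c) for c in (0, 1))}")
```

In practice, a small labeled offline dataset often trains the supervised model, which is then applied to the large unlabeled online data.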
The Rwanda Study
[Figure: workflow combining a survey (n=1,000) with mobile phone users (n=1.5 million): 1. Sampling (draw the survey from the user base); 2. Prediction (predict wealth from call logs, n=1,000); 3. Estimation (estimate wealth and location for all 1.5 million users); 4. Aggregation; 5. Cross-validation (against the survey data); 6. Projection (wealth projected onto 2,148 cells in 30 districts).]
Source: Blumenstock et al. (2015). Predicting poverty and wealth from mobile phone metadata.
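The study's predict-then-aggregate logic can be sketched with the standard library alone; the data below are simulated stand-ins (sizes scaled down, and the wealth-calls relation, district counts, and variable names are all assumptions for illustration):

```python
import random
import statistics

random.seed(1)

# Simulated stand-ins: mobile users with a call-log feature and a district
users = [{"calls": random.gauss(30, 10), "district": random.randrange(30)}
         for _ in range(15_000)]  # scaled down from 1.5 million

# 1. Sampling: survey a random subsample and observe its "true" wealth
survey = random.sample(users, 1_000)
for u in survey:
    u["wealth"] = 2.0 * u["calls"] + random.gauss(0, 5)  # assumed relation

# 2. Prediction: fit wealth ~ calls on the survey (one-predictor OLS)
xs = [u["calls"] for u in survey]
ys = [u["wealth"] for u in survey]
mx, my = statistics.mean(xs), statistics.mean(ys)
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

# 3-4. Estimation and aggregation: predict wealth for every mobile user,
# then average the predictions within each district
by_district = {}
for u in users:
    by_district.setdefault(u["district"], []).append(intercept + slope * u["calls"])
district_wealth = {d: statistics.mean(w) for d, w in by_district.items()}

print(f"fitted slope={slope:.2f}, districts covered={len(district_wealth)}")
```

The survey supplies the ground truth for a model that is then projected onto the full user base, which is the core move that lets n=1,000 of made data describe n=1.5 million of found data.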
Data in the Study
Research Ethics
▪ Privacy concerns
▪ National security concerns
▪ Commercial interests
▪ Data ownership
▪ etc.
We’re in a post-API era → relying on data made by you
10. Student Reflections
What Do You Think about the Workshop?
▪ What did you expect to learn?
▪ What have you actually learned?
▪ Which part of what you learned is most relevant/useful for your job?
▪ What else do you think is relevant/useful but has been missed?
▪ Is there anything else you would like to share?
Thank you so much for your participation!