lean experimentation

Lean Experimentation How to leverage online experiments in research and practice Cornell IS Breakfast Talk April 4th, 2012 Thomas Høgenhaven Twitter: @thogenhaven Friday, April 6, 12

Upload: thogenhaven

Post on 17-Jan-2015


DESCRIPTION

Talk given on lean experimentation in research and practice at Cornell Information Science.

TRANSCRIPT

Page 1: Lean Experimentation

Lean Experimentation: How to leverage online experiments in research and practice

Cornell IS Breakfast Talk, April 4th, 2012

Thomas Høgenhaven

Twitter: @thogenhaven

Friday, April 6, 12

Page 2: Lean Experimentation

Agenda

1. Conducting Online Experiments

2. Experimentation Literature

3. Experimentation in SMEs and Government Today

4. Lean Experimentation

Page 3: Lean Experimentation

1. Conducting Online Experiments

Page 4: Lean Experimentation

The Why Bother Question

“While some social scientists engage in small-scale controlled experimentation with dozens of users or groups, the capacity to perform large-scale interventions with thousands of users opens up new opportunities for research.”

(Preece and Shneiderman 2009: 25).

Page 5: Lean Experimentation

What I Mean By Online Experiments

In online experiments, we are interested in examining online behavior, not just in using the internet as a means to examine offline behavior.

Page 6: Lean Experimentation

What I Mean By Online Experiments

[Diagram: Users → Variation A / Variation B / ... / Variation n (independent variable) → online behavior (dependent variable) → statistical test → difference]

Page 7: Lean Experimentation

The High-Level Experimental Process

Thomke 1998: 745.

Page 8: Lean Experimentation

Example: Experimentation At Microsoft

Guess which one performs better in each of these 8 pairs.

Anyone who gets 6/8 right wins a t-shirt.

Page 9: Lean Experimentation

Experimenting At Microsoft

Kohavi et al (2009): Online Experimentation at Microsoft

[Eight pairs of design variations, numbered 1 to 8, each with a version A and a version B]

Which one is significantly better? [ ] A [ ] B [ ] None of them

Page 10: Lean Experimentation

Experimenting At Microsoft

Kohavi et al (2009): Online Experimentation at Microsoft

[Eight pairs of design variations, numbered 1 to 8, each with a version A and a version B]

0 / 200 Microsoft employees got more than 5 / 8 answers right
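As a rough point of comparison for the result on this slide, the sketch below computes how often pure guessing would score at least 6 out of 8. It assumes (this is not stated in the talk) that a guesser picks uniformly at random among the three answer options (A, B, None of them), independently for each pair. The function name `prob_at_least` is illustrative.

```python
from math import comb

def prob_at_least(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# Assumption: three equally likely options per pair, so p = 1/3 per correct guess.
p_six_plus = prob_at_least(6, 8, 1 / 3)
expected_winners = 200 * p_six_plus

print(f"P(at least 6 of 8 by guessing) = {p_six_plus:.4f}")      # 0.0197
print(f"Expected such scores among 200 guessers = {expected_winners:.1f}")  # 3.9
```

Under these assumptions, about four of 200 random guessers would be expected to reach 6 out of 8, so a count of zero suggests that intuition did no better than chance here.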

Page 11: Lean Experimentation

What Is The Effect Of Experiments?

• Improvement: 33%

• No effect: 33%

• Disimprovement: 33%

Kohavi et al (2009): Online Experimentation at Microsoft

Page 12: Lean Experimentation

Is That Just Microsoft Being Microsoft?

No. Estimating the effects of changes is incredibly hard.

Netflix considers 90% of what they try to be wrong.

Page 13: Lean Experimentation

It’s Actually Hard To Predict

https://whichtestwon.com/past-tests

Page 14: Lean Experimentation

2. Experimental Literature

Page 15: Lean Experimentation

Current Experimental Framework in HCI

Psychology & Social Psychology

Experimental methodology literature

HCI

Page 16: Lean Experimentation

Offline And Online Experiments

• Psychology literature sometimes uses the internet to study human behavior

• But it does not use the internet to study the internet

Page 17: Lean Experimentation

For example...

2010

No mentions of experimentation in online environments

Page 18: Lean Experimentation

Offline And Online Experiments

[2×2 matrix: columns Laboratory and Field; rows Offline and Online]

Page 19: Lean Experimentation

Offline And Online Experiments

[2×2 matrix: columns Laboratory and Field; rows Offline and Online]

Psychology covers this

Page 20: Lean Experimentation

Offline And Online Experiments

[2×2 matrix: columns Laboratory and Field; rows Offline and Online]

Psychology covers this

But not this

Page 21: Lean Experimentation

The Research There Is, Is Not Systematic

"To the extent of our knowledge, no research has so far been reported on treating online test design and implementation in a systematic manner"

(Cámara and Kobsa 2009: 18).

Page 22: Lean Experimentation

Online Experiments In Academia

CHI and CSCW use experiments all the time, but more can be invested in methodology literature.

This will help explore the possibilities and limitations of online experimentation.

Page 23: Lean Experimentation

3. Experimentation In SMEs And Government Agencies Today

Page 24: Lean Experimentation

State Of The Art In Industry Today

• Experimentation is increasing

• At least 25 different software vendors

• Prices range from $0 to $320,000 a year*

*Source: whichmvt.com

Page 25: Lean Experimentation

Practice Has Its Own Literature

Page 26: Lean Experimentation

Website Experiments

Several ways to conduct experiments:

1. Server-side / Client-side

2. A/B Test / Multivariate Test

Page 27: Lean Experimentation

Not Overly Expensive Software

Google Website Optimizer (free)

Visual Website Optimizer ($600 - $3000 / year)

Just 2 out of 25+ vendors

Page 28: Lean Experimentation

A/B/n Experiment

[Diagram: Users → Webpage A / Webpage B / ... / Webpage n, served via JavaScript (independent variable) → behavior (dependent variable) → statistical test → difference]

Page 29: Lean Experimentation

Google Website Optimizer

Page 30: Lean Experimentation

Limitations Of Mainstream Experimental Software

1. Limited to between-subject design

2. Lack of data export

3. No control over statistical test

4. Expensive coding necessary

Page 31: Lean Experimentation

Limitation 1: Limited To Between-Subject Design

• Cannot control for individual differences (No such data is collected / made available)

• Requires more experimental subjects

• No pre-experimental data is collected

Page 32: Lean Experimentation

Limitation 2: Lack of Data Export

Page 33: Lean Experimentation

Google Website Optimizer: Data Export

Page 34: Lean Experimentation

Visual Website Optimizer

Page 35: Lean Experimentation

Visual Website Optimizer: Data Export

Page 36: Lean Experimentation

Software Limitations: Data Export

• Some software is better than others

• No data on individual users

• No segmentation on background variables

• This might be the biggest problem, as this is where many significant results lie.
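To illustrate why missing segmentation can matter, here is a minimal sketch with made-up data. The records, the segment names (`new`, `returning`), and the function `conversion_rates` are hypothetical; they only show how two variations can look identical overall while differing within segments.

```python
from collections import defaultdict

# Hypothetical per-user records: (variation, segment, converted)
records = [
    ("A", "new", 1), ("A", "new", 0), ("A", "returning", 0), ("A", "returning", 0),
    ("B", "new", 0), ("B", "new", 0), ("B", "returning", 1), ("B", "returning", 0),
]

def conversion_rates(rows, by_segment=False):
    """Conversion rate per variation, optionally split by background segment."""
    totals = defaultdict(lambda: [0, 0])  # key -> [conversions, users]
    for variation, segment, converted in rows:
        key = (variation, segment) if by_segment else variation
        totals[key][0] += converted
        totals[key][1] += 1
    return {key: conv / users for key, (conv, users) in totals.items()}

overall = conversion_rates(records)                     # A and B look the same
segmented = conversion_rates(records, by_segment=True)  # segments tell another story
print(overall)
print(segmented)
```

In this toy data both variations convert 25% of users overall, yet A only converts new users and B only converts returning users. A tool that exports aggregates alone cannot reveal that pattern.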

Page 37: Lean Experimentation

Limitation 3: No Choice Between Statistical Tests

Okay?

Page 38: Lean Experimentation

Statistical Test = Chance To Beat Original

“The chance to beat original ... displays the probability that a combination will be more successful than the original version.

When numbers in this column are high, perhaps around 95%, that means a given combination is probably a good candidate to replace your original content.

Low numbers in this column mean that the corresponding combination is a poor candidate for replacement.”

http://support.google.com/websiteoptimizer/bin/answer.py?hl=en&answer=55944

Page 39: Lean Experimentation

Visual Website Optimizer Is More Transparent

“Visual Website Optimizer uses z-tests for both A/B tests and multivariate tests”

Standard Error (SE) = Square root of (p * (1-p) / n)

http://visualwebsiteoptimizer.com/split-testing-blog/what-you-really-need-to-know-about-mathematics-of-ab-split-testing/
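The standard error formula quoted on this slide can be turned into a simple two-proportion z-test. The sketch below is one common way to do that, not necessarily what Visual Website Optimizer computes: it assumes the two standard errors are combined without pooling and that a two-sided p-value is wanted. The function names and the example counts are hypothetical.

```python
from math import erf, sqrt

def standard_error(p, n):
    """SE = square root of (p * (1 - p) / n), as quoted on the slide."""
    return sqrt(p * (1 - p) / n)

def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Assumption: unpooled standard error of the difference.
    se_diff = sqrt(standard_error(p_a, n_a) ** 2 + standard_error(p_b, n_b) ** 2)
    if se_diff == 0:
        raise ValueError("standard error is zero; rates are all 0% or 100%")
    z = (p_b - p_a) / se_diff
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 100 of 1000 visitors convert on A, 130 of 1000 on B.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

Having the raw counts and the test in your own hands is what makes it possible to swap in a different test when the z-test assumptions look doubtful.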

Page 40: Lean Experimentation

z-tests

• Focus on a single parameter

• Assume that parametric assumptions are met

We don’t know whether the data meets these assumptions

Page 41: Lean Experimentation

Limitation 4: Coding Required

[Diagram: Users → Webpage A / Webpage B / ... / Webpage n, served via JavaScript (independent variable) → behavior (dependent variable) → statistical test → difference. Annotation on the webpage variations: "Have to be coded"]

Page 42: Lean Experimentation

Software Limitations: Expensive Coding

We already coded it, so we may as well keep it. I hate working for no reason.

Page 43: Lean Experimentation

Software Limitations: Expensive Coding

I knew this wouldn’t work! We should never have spent resources on it...

Page 44: Lean Experimentation

The Challenge

1. Overcome methodological limitations of experimental software

2. Reduce development costs

3. Explore possibilities and limitations of online experimentation

Page 45: Lean Experimentation

4. Lean Experimentation

Page 46: Lean Experimentation

Test Environment

[Diagram: Users → Proxy A / Proxy B / ... / Proxy n (independent variable) → behavior on website (dependent variable) → statistical test → difference]

Page 47: Lean Experimentation

Proxies For Experimentation

• Website

• Email

• Survey

• Ads

Page 48: Lean Experimentation

Comparative Advantages And Disadvantages

Page 49: Lean Experimentation

Lean Experimentation Principles

1. Test assumptions, ideas, and theories

2. Test before coding, not after

3. Test in the field

Page 50: Lean Experimentation

1. Test Assumptions, Ideas, And Theories

Page 51: Lean Experimentation

2. Test Before Coding, Not After

[Diagram: Ideas → Experimentation → Implementation, with experimentation sorting bad ideas from good ideas]

Page 52: Lean Experimentation

3. Test In The Field

• Identical design patterns have different effects in different contexts

• E.g. social comparison information in competitive versus cooperative communities

• Cocktail effects are largely unknown

Page 53: Lean Experimentation

Requirements Of Lean Experimentation

1. Independent groups

2. Random assignment

3. Allows tracking
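The three requirements above can be sketched in a few lines. This is only one possible approach, assuming each user has a stable identifier: a hash of the experiment name and user id assigns every user to exactly one group (independent groups), the hash spreads users pseudo-randomly (a stand-in for true random assignment), and the stored assignment makes later behavior traceable to a group (tracking). The names `assign_group`, `email-test-1`, and the group labels are hypothetical.

```python
import hashlib
from collections import Counter

def assign_group(user_id, experiment, groups):
    """Map a user to one group, pseudo-randomly but reproducibly."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return groups[int(digest, 16) % len(groups)]

groups = ["control", "proxy_A", "proxy_B"]

# Assignment log: needed later to link observed behavior back to a group.
assignment_log = {
    f"user{i}": assign_group(f"user{i}", "email-test-1", groups)
    for i in range(3000)
}

group_sizes = Counter(assignment_log.values())
print(group_sizes)  # roughly 1000 users per group
```

Because the assignment depends only on the experiment name and the user id, the same user lands in the same group whether the proxy is an email, a survey, or an ad landing page.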

Page 54: Lean Experimentation

Why Use Proxies For Experimentation?

Page 55: Lean Experimentation

Test Environment

• Manipulates the independent variable through a proxy

• Examines dependent variable in natural field environment

Page 56: Lean Experimentation

Test Subjects

• Existing users (when using website, email, and survey)

• Potential users (when using advertisements)

Page 57: Lean Experimentation

Proposed Usage And Limitations

Good for

• Ideas

• Theories

• Hypotheses

• Features

Less suited for

• Small changes

• Graphical changes

Can be useful if testing assumptions

Page 58: Lean Experimentation

Data Output

• Mixed sources that need to be combined:

• Open rates / CTR from proxy

• Web analytics

• SQL databases
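Combining these sources usually comes down to joining them on a shared user identifier. The sketch below is a minimal illustration under that assumption; the field names, the `contributions` table, and all values are made up, and an in-memory SQLite database stands in for the site's SQL database.

```python
import sqlite3

# Hypothetical proxy data, e.g. from an email tool: group plus open/click flags.
proxy_log = [
    {"user_id": "u1", "group": "A", "opened": 1, "clicked": 1},
    {"user_id": "u2", "group": "B", "opened": 1, "clicked": 0},
    {"user_id": "u3", "group": "A", "opened": 0, "clicked": 0},
]

# Hypothetical web analytics export, keyed by user id.
web_analytics = {"u1": {"pageviews": 7}, "u2": {"pageviews": 2}}

# Stand-in for the site's SQL database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contributions (user_id TEXT, n_posts INTEGER)")
conn.executemany("INSERT INTO contributions VALUES (?, ?)", [("u1", 3), ("u3", 1)])
posts = dict(conn.execute("SELECT user_id, n_posts FROM contributions").fetchall())
conn.close()

def combine(proxy_rows, analytics, post_counts):
    """Join the three sources on user_id; missing values default to 0."""
    combined = []
    for row in proxy_rows:
        uid = row["user_id"]
        combined.append({
            **row,
            "pageviews": analytics.get(uid, {}).get("pageviews", 0),
            "n_posts": post_counts.get(uid, 0),
        })
    return combined

dataset = combine(proxy_log, web_analytics, posts)
for row in dataset:
    print(row)
```

The result is one row per user with the experimental group and all outcome measures side by side, which is the shape most statistical tests expect.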

Page 59: Lean Experimentation

Durability Of Proxy Experiment Is Short

[Chart: Email experiment, Control vs. Experimentation, from Wk0 to Wk3; y-axis from 0 to 16]

Page 60: Lean Experimentation

Buy In Needed

1. Making changes on websites (hard to sell)

2. Sending emails

3. Conducting surveys

4. Running ads (easy to sell)

Page 61: Lean Experimentation

Feedback Quality

1. Wireframes / early stage development: critical feedback

2. Finished / nearly finished stages: not so critical feedback

Page 62: Lean Experimentation

Influence On Decisions

Increased likelihood of impact when getting experimental effect data early
