Experimenting on Humans - Advanced A/B Tests - QCon SF 2014
DESCRIPTION
How do you know what 55 million users like? Wix.com conducts hundreds of experiments every month in production to understand which features our users like and which hurt or improve our business. In this talk we’ll explain how our engineering team supports our product managers in making the right decisions and getting our product road map on the right path. We will also present some of the open source tools we developed to help us run experiments on humans. A/B testing is a well-known methodology for conducting experiments in production, but doing it at large scale, changing your system's behavior every 9 minutes, poses many challenges at the organizational level for developers, product managers, QA, marketing and management. We will explain the life cycle of an experiment, some of the challenges we faced, and the effect on our development process and product evolution.
TRANSCRIPT
![Page 1: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/1.jpg)
Experimenting on Humans
Aviran Mordo
Head of Back-end Engineering
@aviranm
http://www.linkedin.com/in/aviran
http://www.aviransplace.com
Talya Gendler
Back-end Team Leader
www.linkedin.com/in/talyagendler
![Page 2: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/2.jpg)
![Page 3: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/3.jpg)
Wix In Numbers
Over 55M users + 1M new users/month
Static storage is >1.5 PB of data
3 data centers + 3 clouds (Google, Amazon, Azure)
1.5B HTTP requests/day
800 people work at Wix, of whom ~300 are in R&D
![Page 4: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/4.jpg)
1542 (A/B Tests in 3 months)
![Page 5: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/5.jpg)
Agenda
Basic A/B testing
Experiment driven development
PETRI – Wix’s 3rd generation open source experiment system
Challenges and best practices
Complexities and effect on product
![Page 6: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/6.jpg)
A/B Test
![Page 7: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/7.jpg)
To B or NOT to B?
A
B
![Page 8: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/8.jpg)
Home page results (How many registered)
![Page 9: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/9.jpg)
Experiment Driven Development
![Page 10: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/10.jpg)
This is the Wix editor
![Page 11: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/11.jpg)
Our gallery manager
What can we improve?
![Page 12: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/12.jpg)
Is this better?
![Page 13: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/13.jpg)
Don’t be a loser
![Page 14: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/14.jpg)
Product Experiments Toggles & Reporting Infrastructure
![Page 15: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/15.jpg)
How do you know what is running?
![Page 16: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/16.jpg)
If I “know” it is better, do I really need to test it?
Why so many?
![Page 17: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/17.jpg)
![Page 18: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/18.jpg)
Sign-up → Choose Template → Edit site → Publish → Premium
The theory
![Page 19: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/19.jpg)
Result = Fail
![Page 20: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/20.jpg)
Intent matters
![Page 21: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/21.jpg)
EVERY new feature is A/B tested
We open the new feature to a % of users
Measure success
If it is better, we keep it
If worse, we check why and improve
If flawed, the impact is just for % of our users
Conclusion
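The "open the new feature to a % of users" step can be sketched as below. This is a minimal illustration, not PETRI's actual code: the function names, the SHA-256 bucketing and the 100-bucket granularity are all assumptions made for the example.

```python
import hashlib


def bucket_of(user_id: str, experiment: str) -> int:
    """Map a user to a stable bucket in [0, 100) for one experiment.

    Hypothetical scheme: hash the experiment name together with the user ID
    so that the same user lands in unrelated buckets for different tests.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % 100


def sees_new_feature(user_id: str, experiment: str, exposure_percent: int) -> bool:
    """Return True if this user is among the exposed percentage."""
    return bucket_of(user_id, experiment) < exposure_percent
```

Because a user's bucket never changes, raising `exposure_percent` only adds users to the new feature; nobody who already saw it is switched back. If the feature is flawed, only the exposed share of users is affected.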
![Page 22: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/22.jpg)
Start with 50% / 50% ?
![Page 23: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/23.jpg)
![Page 24: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/24.jpg)
New code can have bugs
Conversion can drop
Usage can drop
Unexpected cross-test dependencies
Sh*t happens (Test could fail)
![Page 25: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/25.jpg)
Minimize affected users (in case of failure)
Gradual exposure (percentage of…)
Language
GEO
Browser
User-agent
OS
Company employees
User roles
Any other criteria you have (extendable)
All users
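One way to picture these criteria is as composable eligibility filters that are checked before a visitor is assigned to a test group. The sketch below is an assumption-laden illustration: the visitor fields (`language`, `country`, `email`), the helper names and the `@example.com` employee rule are hypothetical and not taken from PETRI.

```python
from typing import Callable, Dict, Iterable

Visitor = Dict[str, str]
Filter = Callable[[Visitor], bool]


def language_is(*languages: str) -> Filter:
    """Build a filter that passes visitors using one of the given languages."""
    return lambda visitor: visitor.get("language") in languages


def geo_is(*countries: str) -> Filter:
    """Build a filter that passes visitors from one of the given countries."""
    return lambda visitor: visitor.get("country") in countries


def company_employee(visitor: Visitor) -> bool:
    # Hypothetical rule: employees are recognized by their e-mail domain.
    return visitor.get("email", "").endswith("@example.com")


def is_eligible(visitor: Visitor, filters: Iterable[Filter]) -> bool:
    # Every filter must pass; an empty filter list means "all users".
    return all(check(visitor) for check in filters)
```

An experiment could start with `[company_employee]`, then move to `[language_is("en"), geo_is("US")]`, and finally to an empty list, widening the population at each step.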
![Page 26: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/26.jpg)
First time visitors = Never visited wix.com
New registered users = Untainted users
Not all users are equal
![Page 27: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/27.jpg)
Start new experiment (limited population)
![Page 28: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/28.jpg)
We need that feature
…and failure is not an option
![Page 29: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/29.jpg)
Adding a mobile view
![Page 30: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/30.jpg)
First trial failed
Performance had to be improved
![Page 31: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/31.jpg)
Halting the test results in loss of data.
What can we do about it?
![Page 32: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/32.jpg)
Solution – Pause the experiment!
• Maintain NEW experience for already exposed users
• No additional users will be exposed to the NEW feature
![Page 33: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/33.jpg)
PETRI’s pause implementation
Use cookies to persist assignment
If user changes browser, assignment is unknown
Server side persistence solves this
You pay in performance & scalability
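The pause behavior with cookie-based persistence might look roughly like the sketch below. It is not PETRI's implementation: the cookie name `exp_<experiment>`, the group labels and the decision not to persist anything for new visitors while paused are assumptions made for illustration.

```python
import random
from typing import Callable, Dict


def conduct(
    cookies: Dict[str, str],
    experiment: str,
    paused: bool,
    test_percent: int,
    rng: Callable[[], float] = random.random,
) -> str:
    """Return "A" (old experience) or "B" (new feature) for this visitor."""
    cookie_name = f"exp_{experiment}"  # hypothetical cookie naming scheme
    if cookie_name in cookies:
        # Already exposed: keep the same experience, even while paused.
        return cookies[cookie_name]
    if paused:
        # No additional users are exposed; nothing is persisted, so the
        # visitor can still be assigned once the experiment resumes.
        return "A"
    group = "B" if rng() * 100 < test_percent else "A"
    cookies[cookie_name] = group
    return group
```

The cookie lives in one browser only, which is exactly the limitation the slide points out: a server-side store keyed by user would survive a browser change, at the cost of an extra lookup per request.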
![Page 34: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/34.jpg)
Decision
Keep feature / Drop feature
Improve code & resume experiment
Keep backwards compatibility for exposed users forever?
Migrate users to another equivalent feature
Drop it altogether (users lose data/work)
![Page 35: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/35.jpg)
The road to success
![Page 36: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/36.jpg)
Numbers look good but sample size is small
We need more data!
Expand
Reaching statistical significance
Test Group (B): 25% 50% 75% 100%
Control Group (A): 75% 50% 25% 0%
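The slides do not say which statistical test Wix uses. As one common choice, a two-sided two-proportion z-test shows why "numbers look good" is not enough with a small sample. The function name and the example figures below are illustrative assumptions.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    conv_a: int, n_a: int, conv_b: int, n_b: int
) -> Tuple[float, float]:
    """Compare conversion rates of control (A) and test (B).

    Returns (z, two-sided p-value). Assumes both groups are non-empty and
    that the pooled conversion rate is strictly between 0 and 1.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

With 10 of 100 conversions in A against 15 of 100 in B, the p-value stays well above 0.05. The same rates with 1,000 users per group fall well below it, which is the reason to expand the test group before deciding.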
![Page 37: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/37.jpg)
Keep user experience consistent
Control Group (A)
Test Group (B)
![Page 38: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/38.jpg)
Signed-in user (Editor)
• Test group assignment is determined by the user ID
• Guarantees toss persistency across browsers
Anonymous user (Home page)
• Test group assignment is randomly determined
• Cannot guarantee persistent experience if changing browser
11% of Wix users use more than one desktop browser
Keeping persistent UX
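The difference between the two cases can be sketched as follows. This is an illustration under assumptions, not PETRI's code: the hashing scheme, the cookie name and the function signature are hypothetical.

```python
import hashlib
import random
from typing import Callable, Dict, Optional


def assign(
    experiment: str,
    test_percent: int,
    cookies: Dict[str, str],
    user_id: Optional[str] = None,
    rng: Callable[[], float] = random.random,
) -> str:
    """Return "A" or "B" for a signed-in or an anonymous visitor."""
    if user_id is not None:
        # Signed-in: derive the toss from the user ID, so every browser
        # computes the same group without storing anything.
        key = f"{experiment}:{user_id}".encode("utf-8")
        bucket = int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % 100
        return "B" if bucket < test_percent else "A"
    # Anonymous: toss once and remember the result in this browser's cookies.
    cookie_name = f"exp_{experiment}"
    if cookie_name not in cookies:
        cookies[cookie_name] = "B" if rng() * 100 < test_percent else "A"
    return cookies[cookie_name]
```

Two browsers mean two separate cookie jars, so an anonymous visitor may get a different toss in each, while a signed-in user always gets the same group.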
![Page 39: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/39.jpg)
Robots are users too!
![Page 40: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/40.jpg)
Always exclude robots
Don’t let Google index a losing page
Don’t let bots affect statistics
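A minimal sketch of excluding robots, assuming a simple user-agent heuristic. The pattern and function names are illustrative only; real bot detection usually relies on maintained lists and more signals than a single regular expression.

```python
import re
from typing import Optional, Tuple

# Hypothetical, deliberately small pattern; it can miss bots and can
# misclassify unusual browsers whose user-agent contains these words.
BOT_PATTERN = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)


def is_robot(user_agent: Optional[str]) -> bool:
    """Treat a missing user-agent or one matching the pattern as a robot."""
    return not user_agent or bool(BOT_PATTERN.search(user_agent))


def group_for_request(user_agent: Optional[str], assigned_group: str) -> Tuple[str, bool]:
    """Return (group to serve, whether to count the request in statistics)."""
    if is_robot(user_agent):
        # Robots always get the control experience and are not counted.
        return "A", False
    return assigned_group, True
```

Serving control to robots keeps a possibly losing variant out of search indexes, and the second return value keeps bot traffic out of the experiment's numbers.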
![Page 41: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/41.jpg)
There is MORE than one
![Page 42: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/42.jpg)
| # of active experiments | Possible # of states |
|---|---|
| 10 | 1,024 |
| 20 | 1,048,576 |
| 30 | 1,073,741,824 |

Possible states >= 2^(# experiments)
Wix has ~200 active experiments: at least 2^200 ≈ 1.606938e+60 possible states
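The figures above can be reproduced with a few lines of arithmetic. The function name and the `groups_per_experiment` parameter are assumptions added to show why the bound is ">=": an experiment with more than two groups multiplies the count further.

```python
def possible_states(active_experiments: int, groups_per_experiment: int = 2) -> int:
    """Number of distinct combinations of group assignments a user can be in."""
    return groups_per_experiment ** active_experiments


for n in (10, 20, 30):
    print(n, f"{possible_states(n):,}")

# Python integers are arbitrary precision, so 2**200 is computed exactly
# and only rounded when formatted in scientific notation.
print(200, f"{possible_states(200):.6e}")
```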
![Page 43: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/43.jpg)
Supporting 2^N different users is challenging
How do you know which experiment causes errors?
Managing an ever changing production env.
![Page 44: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/44.jpg)
Override options (URL parameters, cookies, headers…)
Near real time user BI tools
Specialized tools
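Override options let a developer or QA engineer force a specific group to reproduce a user's state. The sketch below is hypothetical: the parameter name `override_<experiment>`, the header name `X-Override-<experiment>` and the precedence order (URL, then cookie, then header) are assumptions, not PETRI's actual conventions.

```python
from typing import Dict, Optional
from urllib.parse import parse_qs, urlparse


def resolve_group(
    experiment: str,
    assigned: str,
    url: str = "",
    cookies: Optional[Dict[str, str]] = None,
    headers: Optional[Dict[str, str]] = None,
) -> str:
    """Return the overridden group if one is requested, else the assigned one."""
    key = f"override_{experiment}"
    query = parse_qs(urlparse(url).query)
    if key in query:
        return query[key][0]
    if cookies and key in cookies:
        return cookies[key]
    header_key = f"X-Override-{experiment}"
    if headers and header_key in headers:
        return headers[header_key]
    return assigned
```

For example, opening `https://example.com/editor?override_gallery=B` would show the B experience of a hypothetical `gallery` experiment regardless of the normal toss.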
![Page 45: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/45.jpg)
Integrated into the product
![Page 46: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/46.jpg)
Why should product care about the system architecture?
![Page 47: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/47.jpg)
Share document with other users
![Page 48: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/48.jpg)
Document owner is part of a test that enables a new video component
![Page 49: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/49.jpg)
What will the other user experience when editing a shared document?
Owner Friend
![Page 50: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/50.jpg)
Assignment may be different than owner’s
Owner (B) Friend (A)
![Page 51: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/51.jpg)
Enable features by existing content
Enable features by document owner’s assignment
Exclude experimental features from shared documents
Possible solutions
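The three options can be compared side by side in a small sketch. Everything here is illustrative: the policy names, the `"video"` component label and the function signature are assumptions, and the slides do not say which option Wix chose.

```python
from enum import Enum
from typing import Set


class SharingPolicy(Enum):
    BY_EXISTING_CONTENT = "existing content"
    BY_OWNER_ASSIGNMENT = "owner assignment"
    EXCLUDE_FROM_SHARED = "exclude from shared documents"


def video_component_enabled(
    policy: SharingPolicy,
    editor_group: str,
    owner_group: str,
    document_components: Set[str],
    is_shared: bool,
) -> bool:
    """Decide whether the experimental video component is available to the editor."""
    if not is_shared:
        return editor_group == "B"
    if policy is SharingPolicy.BY_EXISTING_CONTENT:
        # Anyone may edit a component that is already in the document.
        return editor_group == "B" or "video" in document_components
    if policy is SharingPolicy.BY_OWNER_ASSIGNMENT:
        # The document follows its owner's toss, whoever is editing.
        return owner_group == "B"
    # EXCLUDE_FROM_SHARED: experimental features are off in shared documents.
    return False
```

With an owner in group B, a friend in group A and a document that already contains a video, the first two policies let the friend edit the video while the third hides it for both users.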
![Page 52: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/52.jpg)
A/B testing introduces complexity
![Page 53: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/53.jpg)
Petri is more than just an A/B test framework
Feature toggle
A/B Test
Personalization
Internal testing
Continuous deployment
Jira integration
Experiments
Dynamic configuration
QA
Automated testing
![Page 54: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/54.jpg)
Petri is now an open source project
https://github.com/wix/petri
![Page 55: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/55.jpg)
Q&A
Aviran Mordo
Head of Back-end Engineering
@aviranm
http://www.linkedin.com/in/aviran
http://www.aviransplace.com
Talya Gendler
Back-end Team Leader
www.linkedin.com/in/talyagendler
https://github.com/wix/petri
http://goo.gl/L7pHnd
![Page 56: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/56.jpg)
Credits
http://upload.wikimedia.org/wikipedia/commons/b/b2/Fiber_optics_testing.jpg
http://goo.gl/nEiepT
https://www.flickr.com/photos/ilo_oli/2421536836
https://www.flickr.com/photos/dexxus/5791228117
http://goo.gl/SdeJ0o
https://www.flickr.com/photos/112923805@N05/15005456062
https://www.flickr.com/photos/wiertz/8537791164
https://www.flickr.com/photos/laenulfean/5943132296
https://www.flickr.com/photos/torek/3470257377
https://www.flickr.com/photos/i5design/5393934753
https://www.flickr.com/photos/argonavigo/5320119828
![Page 57: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/57.jpg)
Modeled experiment lifecycle
Open source (developed using TDD from day 1)
Running at scale on production
No deployment necessary
Both back-end and front-end experiments
Flexible architecture
Why Petri
![Page 58: Experimenting on Humans - Advanced A/B Tests - QCon SF 2014](https://reader036.vdocument.in/reader036/viewer/2022062419/557d60b3d8b42abf3d8b5124/html5/thumbnails/58.jpg)
[Architecture diagram: PETRI Server, Your app, Laboratory, DB, Logs]