How Data Science Helps Prevent Churn at Avira, a 100-million User Company Calin-Andrei Burloiu Big Data Engineer Iulia Pașov Machine Learning Engineer Strata + Hadoop World New York, 2015


How Data Science Helps Prevent Churn at Avira, a 100-million User Company

Calin-Andrei Burloiu, Big Data Engineer

Iulia Pașov, Machine Learning Engineer

Strata + Hadoop World, New York, 2015


About Avira

• Headquarters in Tettnang, Germany
• Security applications for
  – Windows
  – Mac OS
  – iOS
  – Android
• Awarded for malware detection


Big Data at Avira

• 430 million global installs
• 100 million users
• On-premise Hadoop cluster
  – 7 worker nodes
  – 30 TB logs and events
  – 5 TB monthly new data


About User Churn

[Diagram labels: New Installs, Active Installs, Uninstalls]


Steps

• Diagnosis
  – What can we measure?
  – What are the churn reasons?
• Understanding
  – Why do users have issues?
  – Who is likely to churn?
• Treatment & Prevention
  – How can we react to prevent this?


Churn Diagnosis

What can we measure?

What are the churn reasons?


What can we measure?

• Metrics
  – Churn rate
  – New installs
  – Active users
  – Usage patterns


Computing Churn from Uninstall Events

• Uninstall events collected as application logs
• Pros:
  – An event is an uninstall for sure
    • Some users reinstall
• Cons:
  – Some events are lost

[Diagram labels: offline, online]
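
One way to account for the reinstall caveat is to count a user as churned only when the last event logged for that user is an uninstall. The following is a minimal sketch under that assumption. The tuple layout and the "install" / "uninstall" labels are hypothetical and are not taken from Avira's log schema.

```python
def churned_by_uninstall(events):
    """Return the set of users whose most recent event is an uninstall.

    `events` is an iterable of (user_id, day, event_type) tuples, where
    event_type is "install" or "uninstall" (hypothetical labels).
    """
    last_event = {}
    # Process events in day order so later events overwrite earlier ones.
    for user_id, day, event_type in sorted(events, key=lambda e: e[1]):
        last_event[user_id] = event_type
    return {u for u, t in last_event.items() if t == "uninstall"}


# Made-up events for three users.
events = [
    ("a", 1, "install"), ("a", 5, "uninstall"),
    ("b", 1, "install"), ("b", 3, "uninstall"), ("b", 4, "install"),  # reinstall
    ("c", 2, "install"),
]
print(churned_by_uninstall(events))
```

Uninstall events that never reach the logs cannot be recovered this way, so the count is still a lower bound.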


Computing Churn from User Inactivity

• Check user event logs
• Users are considered churned after some time of inactivity
• Pros:
  – More accurate
• Cons:
  – Requires waiting
  – Results come too late

[Timeline figure, 0 to 50 days: user inactive for 30 days; user returns on the 31st day]
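
The inactivity rule can be sketched as a cutoff on each user's last logged event. This is an illustration only: the `last_seen` mapping and the 30-day default are assumptions, since the slide says only "some time of inactivity".

```python
from datetime import date, timedelta


def inactive_users(last_seen, as_of, threshold_days=30):
    """Return users with no activity for more than `threshold_days` days.

    `last_seen` maps user_id -> date of the user's latest logged event.
    The 30-day default is illustrative, not a value stated as Avira's rule.
    """
    cutoff = as_of - timedelta(days=threshold_days)
    return {u for u, seen in last_seen.items() if seen < cutoff}


# Made-up last-activity dates.
last_seen = {
    "a": date(2015, 4, 1),   # silent for 40 days on 11 May
    "b": date(2015, 5, 10),  # active the day before
    "c": date(2015, 4, 11),  # silent for exactly 30 days, not yet over the threshold
}
print(inactive_users(last_seen, as_of=date(2015, 5, 11)))
```

The label is provisional: a user who returns one day after crossing the threshold has to be moved back to active, which is why results from this method arrive late.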


User Inactivity Convergence

[Figure: inactive users (y-axis, 0 to 250) by date (x-axis, 1-Apr to 31-May); annotations 31, 18, 10]


Estimating User Churn

• Predict monthly user churn rate
  – Predictor
    • uninstall events
  – Outcome
    • inactive users

[Figure: monthly values, Apr to Sep (y-axis, 0 to 100); series: uninstall, inactive, predicted]
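
The slide names the predictor and the outcome but not the model. As one possible reading, the sketch below fits an ordinary least-squares line on months where both numbers are known, then estimates the inactive count for a recent month where only uninstall events are available. The choice of a one-variable linear model and all the numbers are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up monthly counts (in thousands) for months where both are known.
uninstalls = [40, 50, 45, 60]
inactive = [62, 77, 70, 92]

intercept, slope = fit_line(uninstalls, inactive)

# Estimate the latest month before its inactivity window has elapsed.
latest_uninstalls = 55
predicted_inactive = intercept + slope * latest_uninstalls
print(round(predicted_inactive, 1))
```

The benefit is timing: uninstall events are available immediately, while the inactivity-based figure needs a waiting period before it is known.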


Performing Survival Analysis

[Figure: survival probability (y-axis, 0% to 100%) by month (x-axis, Jul-14 to Sep-15); annotation at 60%]
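
The deck shows only the resulting survival curve and does not name an estimator. A common choice for such curves is the Kaplan-Meier estimator, sketched here in plain Python as an assumption about how a curve like this could be produced. The cohort data is made up.

```python
def kaplan_meier(durations, churned):
    """Kaplan-Meier survival estimate.

    durations: months each user was observed.
    churned:   True if the user churned at that time, False if the user was
               still active when observation ended (right-censored).
    Returns a list of (time, survival_probability), one entry per churn time.
    """
    churn_times = sorted({t for t, c in zip(durations, churned) if c})
    survival = 1.0
    curve = []
    for t in churn_times:
        # Users still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Users who churned exactly at time t.
        events = sum(1 for d, c in zip(durations, churned) if c and d == t)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Made-up cohort of five users.
durations = [1, 2, 2, 3, 4]
churned = [True, True, False, True, False]
for month, prob in kaplan_meier(durations, churned):
    print(month, round(prob, 3))
```

Censored users (those still active at the end of observation) stay in the at-risk count until their last observed month, so recent installs are not miscounted as churn.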


User Profile

• Consider
  – Devices
  – Behavior
  – Technical savviness
  – Business or consumer?
  – Errors

[Diagram labels: Users, User Profile]


Churned Users

[Diagram labels: Active, Churned]


Uninstall Surveys

• Ask users to complete a survey on uninstall
• Find churn reasons
• 1% of users complete surveys
• Complaints from the past

[Diagram label: Uninstall Surveys]


Lifecycle Surveys

• Complaints from the present
• Ask users to give feedback a few weeks after installation
• Questions based on insights from uninstall surveys
• Market research
  – Know your product’s market

[Diagram label: Lifecycle Surveys]


Extracting Sentiments from Surveys

[Diagram labels: Uninstall Surveys, Lifecycle Surveys, Sentiment Analysis]

• Sentiment analysis
  – Negative review
    • Dissatisfaction
  – Positive review
    • Arbitrary reasons (e.g. reinstall)
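
The slides do not say which sentiment method was used. To illustrate the split into negative and positive reviews, the sketch below uses a tiny hand-made word lexicon, a deliberately simple stand-in for a trained classifier. The word lists and sample answers are hypothetical.

```python
import re

# Hypothetical word lists; a real lexicon would be far larger.
NEGATIVE = {"slow", "crash", "annoying", "failed", "bad"}
POSITIVE = {"great", "good", "love", "fast", "thanks"}


def sentiment(text):
    """Label a survey answer by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score < 0:
        return "negative"
    if score > 0:
        return "positive"
    return "neutral"


print(sentiment("The scan is slow and the update failed"))
print(sentiment("Great product, I just need to reinstall Windows"))
```

A positive answer on an uninstall survey, like the second one, points to an arbitrary reason such as a reinstall rather than dissatisfaction.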


Extracting Churn Reasons from Surveys

• Topic detection
  – Churn reasons
• Insights might be misleading

[Diagram labels: Uninstall Surveys, Lifecycle Surveys, Sentiment Analysis, Topic Detection, Reasons]
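
The topic-detection method is not named in the deck. As a rough illustration of turning free-text answers into ranked churn reasons, the sketch below uses plain keyword matching in place of a real topic model. The topics, keywords and sample answers are hypothetical, apart from the quoted "update" complaint that appears elsewhere in the deck.

```python
from collections import Counter

# Hypothetical topics and keywords, for illustration only.
TOPIC_KEYWORDS = {
    "update": {"update", "updates", "upgrade"},
    "performance": {"slow", "lag", "freeze"},
    "price": {"price", "expensive", "cost"},
}


def detect_topics(text):
    """Return the set of topics whose keywords appear in the text."""
    words = set(text.lower().replace(".", " ").replace(",", " ").split())
    return {topic for topic, keys in TOPIC_KEYWORDS.items() if words & keys}


def rank_reasons(answers):
    """Count how many answers mention each topic, most frequent first."""
    counts = Counter()
    for answer in answers:
        counts.update(detect_topics(answer))
    return counts.most_common()


answers = [
    "The product could not update so I uninstalled.",
    "Too slow, and the update never finished.",
    "Too expensive.",
]
print(rank_reasons(answers))
```

Counts like these reflect only the small share of users who answer surveys, which is one way the insights can mislead.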


Churn Understanding

Why do users have issues?

Who is likely to churn?


Matching Profiles with Reasons

• Compare users
  – With churn reasons
  – Loyal
• Find patterns
  – Characteristics
  – Behavior
  – Context

[Diagram labels: Uninstall Surveys, Lifecycle Surveys, Sentiment Analysis, Topic Detection, Reasons, User Profile Match]


How Avira Identified Churnable Users

• Uninstall surveys revealed an “update” issue as a churn reason
  – “The product could not update so I uninstalled.”
• User profile of users with the “update” problem
  – Context
    • A particular version of the antivirus
  – Behavior
    • Antivirus didn’t update for at least 2 weeks
    • Users were active at least 4 times in 2 weeks
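
The three profile conditions on this slide translate directly into a filter over per-user statistics. The sketch below assumes those statistics are already computed. The `UserStats` fields and the `AFFECTED_VERSION` placeholder are hypothetical, and the deck does not name the affected version.

```python
from dataclasses import dataclass


@dataclass
class UserStats:
    version: str                  # installed antivirus version
    days_since_last_update: int   # days since the last successful update
    times_active_last_2_weeks: int


AFFECTED_VERSION = "X.Y"  # placeholder: the deck does not name the version


def has_update_problem(user):
    """Apply the three profile conditions from the slide."""
    return (
        user.version == AFFECTED_VERSION
        and user.days_since_last_update >= 14
        and user.times_active_last_2_weeks >= 4
    )


# Made-up users.
users = {
    "a": UserStats("X.Y", 20, 6),  # matches all three conditions
    "b": UserStats("X.Y", 3, 9),   # updated recently
    "c": UserStats("X.Y", 30, 0),  # not active in the last 2 weeks
    "d": UserStats("Z.W", 20, 6),  # different version
}
print([uid for uid, u in users.items() if has_update_problem(u)])
```

The activity condition matters: it separates users who are present but stuck on a failing update from users who have simply stopped using the product.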


Churn Treatment & Prevention

How can we react to prevent this?


How can we help?

• Find solutions for each churn reason
• Directly
  – Fix bugs
  – Fix UX
  – Add requested features
  – Offer the right price for extra features
• Indirectly
  – Direct them to the support team


To Summarize...

• Know your data
• Diagnose users who leave
• Find and understand reasons
• Treat every reason to prevent churn


Acknowledgements

• Many thanks to our colleagues who worked with us on this project or helped us with the presentation
  – Rodica Coderie, Data Scientist
  – Viacheslav Rodionov, Big Data Engineer
  – Anna Tyrkich, Designer