How Data Science Helps Prevent Churn at Avira, a 100-million User Company Calin-Andrei Burloiu Big...

Post on 04-Jan-2016


How Data Science Helps Prevent Churn at Avira, a 100-million User Company

Calin-Andrei Burloiu, Big Data Engineer

Iulia Pașov, Machine Learning Engineer

Strata + Hadoop World, New York, 2015

About Avira

• Headquarters in Tettnang, Germany

• Security applications for
– Windows
– Mac OS
– iOS
– Android

• Awarded for malware detection

Big Data at Avira

• 430 million global installs
• 100 million users
• On-premise Hadoop cluster
– 7 worker nodes
– 30 TB logs and events
– 5 TB monthly new data

About User Churn

[Diagram: New Installs, Active Installs, Uninstalls]

Steps

Diagnosis

What can we measure?

Which are the churn reasons?

Understanding

Why do users have issues?

Who is likely to churn?

Treatment & Prevention

How can we react to prevent this?

Churn Diagnosis

What can we measure?

Which are the churn reasons?

What can we measure?

• Metrics
– Churn rate
– New installs
– Active users
– Usage patterns

Computing Churn from Uninstall Events

• Uninstall events collected as application logs

• Pros:
– An event is an uninstall for sure
  • Some users reinstall
• Cons:
– Some events are lost

[Diagram labels: offline, online]
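The idea on this slide can be sketched in a few lines. This is a minimal illustration, not Avira's pipeline: the `(user_id, event_type, day)` record layout and the rule that a later install cancels an earlier uninstall are assumptions made for the example.

```python
def churned_from_uninstalls(events):
    """Return the users whose most recent known event is an uninstall.

    `events` is an iterable of (user_id, event_type, day) tuples, where
    event_type is "install" or "uninstall" (hypothetical schema).
    """
    last_event = {}
    # Process events in chronological order so later events overwrite earlier ones.
    for user_id, event_type, _day in sorted(events, key=lambda e: e[2]):
        last_event[user_id] = event_type
    return {user for user, kind in last_event.items() if kind == "uninstall"}


# Made-up events: u2 uninstalls on day 6 but reinstalls on day 9.
events = [
    ("u1", "install", 1), ("u1", "uninstall", 5),
    ("u2", "install", 2), ("u2", "uninstall", 6), ("u2", "install", 9),
    ("u3", "install", 3),
]
churned = churned_from_uninstalls(events)
all_users = {user for user, _, _ in events}
churn_rate = len(churned) / len(all_users)
```

Uninstall events that never reach the logs are invisible to this count, which is the drawback listed under "Cons".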

Computing Churn from User Inactivity

• Check user event logs
• Users are considered churned after some time of inactivity
• Pros:
– More accurate
• Cons:
– Requires waiting
– Results come too late

[Figure: timeline in days (0 to 50); a user inactive for 30 days returns on the 31st day]
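A minimal sketch of the inactivity rule, assuming a hypothetical `last_seen` mapping from user ID to the date of the latest logged event and the 30-day threshold shown in the figure. It also reproduces the weakness in the figure: a user labeled as churned can come back one day later.

```python
from datetime import date, timedelta


def inactive_users(last_seen, as_of, threshold_days=30):
    """Users whose last recorded activity is at least `threshold_days` old."""
    cutoff = as_of - timedelta(days=threshold_days)
    return {user for user, seen in last_seen.items() if seen <= cutoff}


# Made-up activity data.
last_seen = {"u1": date(2015, 4, 1), "u2": date(2015, 4, 25)}

# On 1 May, u1 has been inactive for 30 days and is labeled as churned.
churned_on_day_30 = inactive_users(last_seen, as_of=date(2015, 5, 1))

# u1 returns on the 31st day, so the earlier label was wrong.
last_seen["u1"] = date(2015, 5, 2)
churned_on_day_31 = inactive_users(last_seen, as_of=date(2015, 5, 2))
```

A longer threshold makes such false positives rarer, but the result arrives later.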

User Inactivity Convergence

[Figure: inactive users (0 to 250) by date, 1-Apr to 31-May]

Estimating User Churn

• Predict monthly user churn rate
– Predictor
  • uninstall events
– Outcome
  • inactive users

[Figure: monthly values, Apr to Sep, on a 0 to 100 scale; legend: uninstall, inactive, predicted]
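The slide does not name the model used. As one possible illustration, the sketch below fits an ordinary least-squares line with uninstall events as the predictor and inactive users as the outcome; the choice of linear regression and all numbers are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up monthly values, for illustration only.
uninstall_events = [10.0, 12.0, 14.0, 16.0]  # predictor, available immediately
inactive_users = [25.0, 29.0, 33.0, 37.0]    # outcome, known only after waiting

slope, intercept = fit_line(uninstall_events, inactive_users)

# Estimate the inactivity-based figure for a month with 18 uninstall events,
# before the waiting period for that month has elapsed.
predicted_inactive = slope * 18.0 + intercept
```

The appeal of this setup is that the predictor is available right away, while the more accurate outcome is only needed for past months to fit the model.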

Performing Survival Analysis

[Figure: survival probability (0% to 100%) by month, Jul-14 to Sep-15; annotated value: 60%]
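The slides do not say which estimator produced the survival curve. A common choice is the Kaplan-Meier estimator, sketched here under that assumption with made-up data; `durations` and `observed` are hypothetical names.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival estimate.

    durations[i]: how long user i was followed (e.g. months since install).
    observed[i]:  True if the user churned at that time, False if the user
                  was still active when observation ended (censored).
    Returns a list of (time, survival_probability), one entry per churn time.
    """
    survival = 1.0
    curve = []
    churn_times = sorted({d for d, o in zip(durations, observed) if o})
    for t in churn_times:
        at_risk = sum(1 for d in durations if d >= t)
        churned = sum(1 for d, o in zip(durations, observed) if o and d == t)
        survival *= 1.0 - churned / at_risk
        curve.append((t, survival))
    return curve


# Made-up cohort of six users.
durations = [1, 2, 2, 3, 4, 4]
observed = [True, True, False, True, False, False]
curve = kaplan_meier(durations, observed)
```

Censored users (still active at the end of the observation window) count as "at risk" up to their last observed time, which is what lets the estimate use recent installs without waiting for them to churn.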

User Profile

• Consider
– Devices
– Behavior
– Technical savviness
– Business or consumer?
– Errors

[Diagram: Users; User Profile; Churned Users; Active vs. Churned]

Uninstall Surveys

• Ask users to complete a survey on uninstall
• Find churn reasons
• 1% of users complete surveys
• Complaints from the past

[Diagram: Uninstall Surveys]

Lifecycle Surveys

• Complaints from the present
• Ask users to give feedback a few weeks after installation
• Questions based on insights from uninstall surveys
• Market research
– Know your product’s market

[Diagram: Lifecycle Surveys]

Extracting Sentiments from Surveys

[Diagram: Uninstall Surveys, Lifecycle Surveys, Sentiment Analysis]

• Sentiment analysis
– Negative review
  • Dissatisfaction
– Positive review
  • Arbitrary reasons (e.g. reinstall)
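The slides do not describe how the sentiment analysis was implemented. The sketch below is only a toy lexicon-based classifier to make the negative/positive split concrete; the word lists are hypothetical and far too small for real use.

```python
import re

# Hypothetical, deliberately tiny lexicons.
NEGATIVE_WORDS = {"not", "slow", "annoying", "problem", "crash"}
POSITIVE_WORDS = {"good", "great", "love", "like", "thanks"}


def sentiment(text):
    """Classify a survey answer as 'negative', 'positive' or 'neutral'."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE_WORDS for w in words) - sum(
        w in NEGATIVE_WORDS for w in words
    )
    if score < 0:
        return "negative"
    if score > 0:
        return "positive"
    return "neutral"


label_a = sentiment("The product could not update so I uninstalled.")
label_b = sentiment("Great product, I am just reinstalling it.")
```

In the terms of the slide, the first answer signals dissatisfaction, while the second points to an arbitrary reason such as a reinstall.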

Extracting Churn Reasons from Surveys

• Topic detection
– Churn reasons
• Insights might be misleading

[Diagram: Uninstall Surveys, Lifecycle Surveys, Sentiment Analysis, Topic Detection, Reasons]
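As a rough illustration of turning free-text answers into churn reasons, the sketch below tags each answer by keyword matching and counts the tags. The topics and keywords are hypothetical, and keyword matching stands in for whatever topic detection method was actually used, which the slides do not specify.

```python
import re
from collections import Counter

# Hypothetical topics and keywords.
TOPIC_KEYWORDS = {
    "update": {"update", "updates", "updating"},
    "performance": {"slow", "lag", "freeze"},
    "price": {"price", "expensive", "cost"},
}


def detect_topics(text):
    """Return the sorted list of topics whose keywords occur in `text`."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(t for t, keywords in TOPIC_KEYWORDS.items() if words & keywords)


def count_reasons(responses):
    """Count how many responses mention each topic."""
    counts = Counter()
    for text in responses:
        counts.update(detect_topics(text))
    return counts


responses = [
    "The product could not update so I uninstalled.",
    "Too slow and too expensive.",
    "Updates never finish.",
]
reason_counts = count_reasons(responses)
```

Counts like these come only from the small share of users who answer surveys, one reason the insights might be misleading.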

Churn Understanding

Why do users have issues?

Who is likely to churn?

Matching Profiles with Reasons

• Compare users
– With churn reasons
– Loyal
• Find patterns
– Characteristics
– Behavior
– Context

[Diagram: Uninstall Surveys, Lifecycle Surveys, Sentiment Analysis, Topic Detection, Reasons, User Profile Match]
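One simple way to look for such patterns is to compare how often a profile attribute occurs among users who reported a churn reason versus loyal users. The sketch below assumes profiles are plain dictionaries with a hypothetical `version` field; the data is made up.

```python
def attribute_share(users, attribute, value):
    """Fraction of `users` (a list of dicts) whose `attribute` equals `value`."""
    if not users:
        return 0.0
    return sum(1 for u in users if u.get(attribute) == value) / len(users)


# Made-up profiles.
users_with_reason = [
    {"version": "A"}, {"version": "A"}, {"version": "B"}, {"version": "A"},
]
loyal_users = [
    {"version": "A"}, {"version": "B"}, {"version": "B"}, {"version": "B"},
]

share_with_reason = attribute_share(users_with_reason, "version", "A")
share_loyal = attribute_share(loyal_users, "version", "A")

# A ratio well above 1 suggests the attribute is associated with the churn reason.
ratio = share_with_reason / share_loyal
```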

How Avira Identified Churnable Users

• Uninstall surveys revealed an “update” issue as a churn reason
– “The product could not update so I uninstalled.”
• User profile of users with the “update” problem
– Context
  • A particular version of the antivirus
– Behavior
  • Antivirus didn’t update for at least 2 weeks
  • Users were active at least 4 times in 2 weeks
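The profile above translates directly into a filter over user records. The sketch below assumes a hypothetical `UserSnapshot` schema; `AFFECTED_VERSION` is a placeholder because the slides do not name the version.

```python
from dataclasses import dataclass


@dataclass
class UserSnapshot:  # hypothetical schema
    user_id: str
    antivirus_version: str
    days_since_last_update: int
    activity_count_last_14_days: int


AFFECTED_VERSION = "X.Y"  # placeholder; the actual version is not given


def is_churnable(user):
    """Apply the three conditions of the "update" problem profile."""
    return (
        user.antivirus_version == AFFECTED_VERSION
        and user.days_since_last_update >= 14       # no update for at least 2 weeks
        and user.activity_count_last_14_days >= 4   # active at least 4 times in 2 weeks
    )


# Made-up users.
users = [
    UserSnapshot("u1", "X.Y", 20, 6),   # matches all three conditions
    UserSnapshot("u2", "X.Y", 3, 10),   # updated recently
    UserSnapshot("u3", "Z.0", 30, 8),   # different version
    UserSnapshot("u4", "X.Y", 25, 1),   # barely active
]
churnable_ids = [u.user_id for u in users if is_churnable(u)]
```

The activity condition keeps the filter focused on users who are still using the product, and can therefore still be reached before they uninstall.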

Churn Treatment & Prevention

How can we react to prevent this?

How can we help?

• Find solutions for each churn reason
• Directly
– Fix bugs
– Fix UX
– Add requested features
– Offer the right price for extra features
• Indirectly
– Direct them to the support team

To Summarize...

• Know your data
• Diagnose users who leave
• Find and understand reasons
• Treat every reason to prevent churn

Acknowledgements

• Many thanks to our colleagues who worked with us on this project or helped us with the presentation
• Rodica Coderie
  • Data Scientist
• Viacheslav Rodionov
  • Big Data Engineer
• Anna Tyrkich
  • Designer