The Economics of Recommender Systems
Konstantin Savenkov, COO at Bookmate
http://bookmate.com
Target audience
• RS enthusiasts, to get context they may otherwise lack
• B2C services and apps, to understand how many resources to spend on RS
• data scientists and evangelists, to sell your idea inside the company
• big data startups, to justify the business model and sell it to investors
• big data businesses, to set fair prices and convince potential customers
Agenda
• Academy vs. industrial settings in RS
• Recommender Systems for content discovery
• Business model for B2C content service
• Unit economics and underlying KPI
• Driving business goals with RS:
– conversion
– retention
– catalogue exploitation
– reactivation
RS
Methods, e.g.:
• Preserving locality during matrix factorisation
• Speeding up Gradient Descent using Alternating Least Squares
BASIC RESEARCH
Different shades of RS
Tools, e.g.:
• Achieve better filtering of historical data
• Combine several methods to apply for a new domain and prove NDCG is better
BASIC RESEARCH
APPLIED RESEARCH
Different shades of RS
Using it in Production, e.g.:
• Pick a paper and reproduce the result on live users
• Achieve appropriate response time
• Combine offline and online model updates to simulate feedback on user actions
BASIC RESEARCH
APPLIED RESEARCH
TECHNOLOGY TRANSFER
Different shades of RS
BASIC RESEARCH
Does it push the needle?
What are the benefits?
How to estimate them?
How to justify expenses on RS?
When to start spending resources?
Should we invest in RS, in better UX, or in some social features?
APPLIED RESEARCH
TECHNOLOGY TRANSFER
BUSINESS
Different shades of RS
This course
This lecture
Academy vs. Tech vs. Business
A: How to improve performance by X%?
T: How hard is it to implement that?
B: When do gains match costs?
“It’s tempting, if the only tool you have is a hammer, to treat everything as a nail.”
Abraham Maslow, The Psychology of Science, 1966
* Despite the topic of the course, try to avoid the BigData bias
Setting scope #1: Content discovery
Importance of Recommender Systems for content discovery:
– hard to describe preferences in textual form
– textual relevance doesn’t work well
– preference elicitation
– limited catalogue
“I WANT TO READ SOMETHING…”
EVEN FOR BOOKS!
LOOKING FOR UNKNOWN UNKNOWNS
REGIONAL SEGMENTATION
User with a book problem
Search case Recommendation case
RS in the Interface
• Any place in the interface where the number of objects to show exceeds the available space
• Most of the interfaces are list-based
• Hence, the order and size of the list can be defined by either a personalized or a non-personalized algorithm
• Explaining recommendations is a different topic
There is no “no recommender system” setting. If there’s “just something” or “popularity sorted”, that’s your RS!
Bookmate example
front
search
faceted filter
book page
user library
notifications
social feed
Setting scope #2: B2C Content Service
• User pays either subscription, or per download, or hybrid
• User has limited attention and time to share with the service
• Content may have different cost for service
• Content itself is not a competitive advantage
• Helping users select the proper content is a competitive advantage
Unit Economics
• Business at scale (marginal revenue and expenses per user)
[Diagram labels: LTV, cost of content, CAC, user lifetime, ARPU, ARPU, …, PROFIT!]
How the product works
• Each connection here is driven and improved by business activities
• The content itself fits into a sort of BCG matrix:
[Diagram labels: GROWTH, COSTS, CAC]
Unit Economics & KPI
• CAC = Marketing Expenses ÷ New Customers
• LTV = ARPU × Lifetime
• Content Costs, driven by the Consumed Content Mix
• Underlying KPI: Conversion, Retention, Reactivation, Exposed Content Mix
* recommendation fairy
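The CAC and LTV relations from the Unit Economics & KPI slide can be sketched in a few lines of Python. This is a minimal illustration rather than Bookmate's actual model: the function name `unit_economics`, the per-user content cost parameter, and all input numbers are hypothetical, and reading profit as LTV minus content costs and CAC is an assumption.

```python
def unit_economics(marketing_expenses, new_customers, arpu, lifetime_months,
                   content_cost_per_user):
    """Return per-user CAC, LTV and profit (all inputs are hypothetical)."""
    # CAC = Marketing Expenses / New Customers
    cac = marketing_expenses / new_customers
    # LTV = ARPU x Lifetime
    ltv = arpu * lifetime_months
    # Assumption: profit per user is what remains of LTV after content costs and CAC
    profit = ltv - content_cost_per_user - cac
    return {"cac": cac, "ltv": ltv, "profit": profit}


if __name__ == "__main__":
    # Made-up inputs, for illustration only
    result = unit_economics(marketing_expenses=5000, new_customers=1000,
                            arpu=4.0, lifetime_months=9,
                            content_cost_per_user=18.0)
    print(result)
```

Conversion lowers CAC through the number of new customers, retention and reactivation raise LTV through the lifetime, and the content mix changes the content cost term.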
Recommender Systems & KPI
• Users mostly convert via content (paywall)
– content is responsible for up to 10x difference in conversion
– recommending content for new users raises the conversion
• Users need help to discover content during lifetime
– recurrent reading achieves recurrent payments
– customized aid increases user loyalty
– recommending content for loyal users increases lifetime
• Long tail content costs less
– recommending for diversity reduces costs
Recommender Systems may improve every aspect of the business
however… remember this guy
Setting scope #3: Lean formulation
1. We reduce the waste of resources on everything that doesn’t push the needle.
2. There are no recipes at the start; all we can do is propose a hypothesis and experiment.
Conclusions:
• if there’s a proper place in the interface, you may apply RS and see the effect
• offline and online testing results often don’t correlate
NO ALL-INS AND LEAPS OF FAITH
RS for Conversion / CAC
• Hypotheses to prove:
1. There are enough users who will use RS output
2. Their conversion will be above average
• A/B testing is the only way:
– different channels convert with up to 20x difference
– current traffic mix is unpredictable and hard to control in the case of app installs
• Do pilots:
– run with limited resources, then extrapolate and decide whether to run full-scale
RS for Conversion / CAC
• Two approaches to estimate:
1. increase of revenue from additionally converted users
2. decrease of CAC
– the same marketing expenses attract more customers due to raised conversion, therefore CAC is reduced
• Suitable for estimating various models of RS costs:
– upfront costs (then the investments will return)
– flat fee (monthly license or added headcount)
– variable costs (CPA or PaaS model)
Case Study (Bookmate / E-Contenta)
• New users get 3 books as a starter
– group A – editorial books (non-personalised)
– group B – personalized based on social profile (cold-start recommender), provided by the E-Contenta service
• Two steps in the funnel:
1. User didn’t know what to read and used RS
2. User converted afterwards
• Straight to the results:
– step 1 – 2.17x higher for RS, step 2 – a bit lower
– overall, 1.4x increase of conversion for such users (3 sigma)
• Sounds promising! Did 40% more users become converted?
• Not really, as just 7% of users didn’t know what book to start with
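One common way to check a claim like "3 sigma" for a conversion difference is a two-proportion z-test. The sketch below is an illustration only: the slides do not say which test was used, and the function name `two_proportion_z` and the counts in the example are hypothetical.

```python
import math


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z-score for the difference between two conversion rates (pooled variance)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the hypothesis that both groups convert equally
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / std_err


if __name__ == "__main__":
    # Hypothetical counts: 10% conversion in group A vs 14% in group B (a 1.4x lift)
    z = two_proportion_z(conv_a=200, n_a=2000, conv_b=280, n_b=2000)
    print(f"z = {z:.2f}")  # a value above 3 corresponds to the "3 sigma" level
```

With the same 1.4x lift but only 1000 users per group, the z-score drops below 3, which is why small pilots need enough traffic before any extrapolation.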
Let’s look at the economics
• Let’s assume we attract 1000 new customers/month, CAC = $5 (model data), the conversion from traffic is X%
• Therefore, a 1.4x increase of the conversion for 7% of overall traffic results in a 1.028x increase of overall conversion (0.93 + 0.07 × 1.4 = 1.028)
• That is, we’ll get 28 more new customers for the same $5000
• That’s equivalent to:
– reducing CAC by 14 cents
– reducing marketing budget by $136/month
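The arithmetic of this slide can be reproduced with a short Python sketch. The inputs are the model numbers from the slide (1000 customers, CAC = $5, 1.4x uplift for 7% of traffic); the function name `pilot_economics` and the result keys are hypothetical.

```python
def pilot_economics(customers, cac, uplift, affected_share):
    """Translate a conversion uplift on a share of traffic into CAC and budget terms."""
    budget = customers * cac
    # Only the affected share of traffic converts better; the rest is unchanged
    overall_uplift = 1 + affected_share * (uplift - 1)
    new_customers = customers * overall_uplift
    new_cac = budget / new_customers
    return {
        "overall_uplift": overall_uplift,
        "extra_customers": new_customers - customers,
        "cac_reduction": cac - new_cac,
        # Budget needed to keep the original number of customers at the new CAC
        "budget_saving": budget - customers * new_cac,
    }


if __name__ == "__main__":
    result = pilot_economics(customers=1000, cac=5.0, uplift=1.4, affected_share=0.07)
    for name, value in result.items():
        print(f"{name}: {value:.3f}")
```

The unrounded CAC reduction is about 13.6 cents, which the slide rounds to 14 cents, and the budget saving is about $136 per month.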
Conclusions from the pilot
• In case of using third-party RS on CPA basis (payment per converted user), CPA is limited to 14 cents per user
– actually, it should be less, as both sides should benefit
• In case of a flat license fee of, say, $1000, this is economically efficient starting from 7143 new customers per month
– or about $35,700 of monthly marketing budget (7143 × $5)
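The break-even point for a flat fee follows from dividing the fee by the saving per acquired customer. A minimal sketch, using the slide's rounded 14-cent saving and $5 CAC; the function name `break_even_customers` is hypothetical.

```python
import math


def break_even_customers(flat_fee, saving_per_customer):
    """Smallest number of new customers per month at which the flat fee pays off."""
    return math.ceil(flat_fee / saving_per_customer)


if __name__ == "__main__":
    customers = break_even_customers(flat_fee=1000, saving_per_customer=0.14)
    budget = customers * 5.0  # CAC = $5, as in the model data
    print(customers, budget)
```

The same saving per customer is also the upper bound for a CPA price, since paying more than that per converted user would cost more than the RS brings in.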
RS for Retention / LTV
• Hypotheses to prove:
1. User pays as long as they find what to read
2. There are enough users who will use RS output
3. This channel has a discoverability above average
• Ideal experiment: A/B, then count actual lifetime
– with lifetime close to a year, it’s too long to wait
• Solution:
– do separate A/B for different user cohorts (new, 1 month old, 2 months old etc.)
– estimate significant change in month-to-month retention for each cohort
Model case
• Recommender system led to an increase of month-to-month retention ranging from 3% (fresh cohorts) to 0.5% (old cohorts)*
Here’s the benefit (area is equal to # of ARPU gains)
* the numbers are not from the actual case and are provided to showcase estimations
Let’s look at the economics
• Increase of the month-to-month retention leads to the increase of the user lifetime:
– group A: 9 months
– group B: 11.6 months
• That means 29% increase of LTV
• This gain may be spent either to attract more users with the same marginal earnings or to increase profitability
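The link between month-to-month retention and lifetime can be sketched as follows. This is an illustration under stated assumptions: the function names and the retention values in the usage example are hypothetical, ARPU is assumed constant (so the relative LTV gain equals the relative lifetime gain), and months beyond the supplied retention list are ignored.

```python
def expected_lifetime(retention):
    """Expected number of paid months given month-to-month retention rates.

    retention[i] is the share of users active in month i + 1 who are
    still active in month i + 2. Months beyond the list are ignored.
    """
    lifetime = 0.0
    survival = 1.0  # every user pays for the first month
    for rate in retention:
        lifetime += survival
        survival *= rate
    return lifetime + survival


def ltv_increase(lifetime_a, lifetime_b):
    """Relative LTV gain of group B over group A, assuming the same ARPU."""
    return lifetime_b / lifetime_a - 1


if __name__ == "__main__":
    # Hypothetical retention curve over three month-to-month transitions
    print(expected_lifetime([0.8, 0.85, 0.9]))
    # Lifetimes from the slide: 9 months (group A) vs 11.6 months (group B)
    print(f"{ltv_increase(9, 11.6):.0%}")
```

Running the A/B test per cohort gives one retention rate per cohort age for each group; feeding both lists into `expected_lifetime` yields the two lifetimes without waiting a full year.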
If this is still too long…
• Older cohorts may have too few users to achieve statistical significance
• Proxy metrics may be estimated
– content discovery funnels: conversion of books from opened to read
– to use that, the hypothesis “more reads lead to increase of retention” needs to be proven
RS for catalogue exploitation
• complex case, as it affects both conversion and retention
• hypotheses to prove:
1. Recommender system may expose users to a content mix with more marginal profits
2. Conversion and retention would be the same, or the decrease of costs will outweigh the decrease of conversion and retention
3. There are enough users who will use RS output
Case Study
• A bit too big to roll out in a presentation
• OK, just a bit: adding a recommender system to the interface really drives users out of search:
• as homework, you may estimate how good the RS should be at reducing the costs to justify $1000/month expenses.
Wrapping up
• The proper business approach to Recommender Systems
– run a pilot to estimate some numbers, then conclude if you have enough scale to afford the expenses
• The simplest recommender will probably get you 80% of possible performance
– if it doesn’t, the problem is most likely not in the algorithm
• And again,
Questions?
• Can you provide some data for my academic research? – Yes, probably!
• Do you have enough scale to hire me as a Recommender Systems specialist? – Most likely!
• May I ask some questions via email? – Sure!