Analytical Model Development & Implementation Experience from the Field Bhavani Raskutti

Upload: loraine-phillips

Post on 18-Dec-2015



Topics to be covered

• Model development & implementation process

• Case Study 1: Corporate Customer Modelling at Telcos

• Case Study 2: Sales Opportunities for wholesalers

• Take-Home Points


Model Development & Implementation Process

Business Problem → Solution enabling business to make strategic & operational decisions

Stages
• APD: Analytical Problem Definition
• DAP: Data Acquisition & Preparation
• MM: Mathematical Modelling (algorithms, applied to the data matrix)
• MV: Model Validation
• P: Presentation
• D: Deployment

Model Development
• Iterative
• 90% DAP

Decision-making by users
• Insights via GUI
• Automation
• Training
• Documentation
• IT Support

Case Study 1: Corporate Customer Modelling at Telcos

Business Problem

• Large drops in margins & revenue in corporate customer base

• Partial churn of some corporate customers to other telcos

• Lack of understanding of customer’s needs

Project will target revenue improvement opportunities with an indicative $15 million in sales by:

undertaking a rapid analysis of Customer data from core systems, including front of house, customer satisfaction and marketing for customers with a spend greater than $100k, excluding state and local government

outcomes are to be validated using artificial intelligence tools and rigorous methodology by …

Verbatim from client’s presentation to stakeholders

Using data analysis, increase revenue from corporate customers whose spend is > $100k


1. Analytical Problem Definition

• Increase revenue from corporate customers by
  - Win-back (database look-up)?
  - Churn reduction?
  - Up-sell/cross-sell to an existing customer?

• Customer data
  - Relationship with customer
    – Customer satisfaction survey data
    – Service assurance data (customer complaints)
  - Demographic information about business customer
    – Industry segment information
    – Number of sites
  - Revenue from customer
    – Quarterly revenue from different products

Create models to predict up-sell based on revenue data


2. Data Acquisition & Processing

• Population:

- Customers in a segment who currently do not have the product being modelled

• Target or positive case definition:

- Customers in the segment who take up the product within a time period

• Predictors for modelling

Using revenue data, create models to predict customers likely to take up a specific product


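Population and Target Definition

• Let $r_{i,P}$ be the revenue from a customer on product $P$ in billing period $i$

• Population in period $i$ includes all customers with $r_{i-1,P} = 0$

• Target or product take-up in period $i$ iff $r_{i-1,P} = 0$ and $r_{i,P} > TU_{MIN}$
  - $TU_{MIN} > 0$ is the minimum take-up amount determined by the business

[Figure: timeline of predictors and labels. TRAIN: predictors from period i-1 for customers with $r_{i-1,P} = 0$, labels from period i. Predict: predictors from period i for customers with $r_{i,P} = 0$, labels for period i+1.]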
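The population and target rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `label_takeups`, the dictionary layout and the sample figures are hypothetical and do not come from the original project.

```python
from typing import Dict


def label_takeups(
    prev_rev: Dict[str, float],
    curr_rev: Dict[str, float],
    tu_min: float,
) -> Dict[str, int]:
    """Label customers for one product P.

    prev_rev maps customer id -> revenue on P in period i-1.
    curr_rev maps customer id -> revenue on P in period i.
    Only customers with zero revenue in period i-1 are in the population.
    """
    if tu_min <= 0:
        raise ValueError("TU_MIN must be positive")
    labels: Dict[str, int] = {}
    for customer, prev in prev_rev.items():
        if prev != 0:
            continue  # customer already has the product: outside the population
        # Take-up only if revenue in period i exceeds the minimum take-up amount.
        labels[customer] = 1 if curr_rev.get(customer, 0.0) > tu_min else 0
    return labels


# Hypothetical sample data: customer C already has the product in period i-1.
prev = {"A": 0.0, "B": 0.0, "C": 500.0, "D": 0.0}
curr = {"A": 1200.0, "B": 50.0, "C": 700.0, "D": 0.0}
labels = label_takeups(prev, curr, tu_min=100.0)
print(labels)  # {'A': 1, 'B': 0, 'D': 0}
```

Customer B spends on the product but stays below the minimum take-up amount, so it counts as a negative rather than a target.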


Low take-up rates: not enough targets

• Average number of take-ups for any product in any period is small
  - Large businesses
    – Fewer than 20 take-ups in a period for 70 of the 100+ products
    – Fewer than 10 take-ups for 45 products
  - Medium businesses
    – Fewer than 20 take-ups for 71 products
    – Fewer than 10 take-ups for 60 products

• Reasons
  - “Niche” products
  - Saturated products


Low take-up rates (cont’d)

• Aggregate data over multiple billing periods $k$

• Product take-up in periods $i$ to $i+k-1$ iff $r_{i-j,P} = 0$ for $j = 1..k$ and $\sum_{j=0}^{k-1} r_{i+j,P} > k \cdot TU_{MIN}$

[Figure: timeline of predictors and labels for k=2. TRAIN: predictors from periods i-3 and i-2, labels from periods i-1 and i (TRAIN target: r(i-j)P = 0, j = 0..1). Predict: predictors from periods i-1 and i, labels for periods i+1 and i+2 (Predict if r(i+j)P = 0 or 1; j = 1..2).]

Impact of data aggregation

[Figure: two bar charts, "Large Businesses" and "Medium Businesses". x-axis: minimum take-ups (n) for modelling (n=20, n=10); y-axis: number of unmodellable products (0 to 70); series: k=1, k=2, k=3. Bar labels as extracted: Large Businesses 70, 45, 48, 39, 51, 40; Medium Businesses 71, 66, 54, 71, 59, 60.]

k=2 is useful
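A minimal sketch of the aggregated take-up rule, assuming a customer's revenue on product P is held as a list indexed by billing period. The function name `aggregated_takeup` and the sample series are hypothetical.

```python
from typing import Optional, Sequence


def aggregated_takeup(
    revenue: Sequence[float], i: int, k: int, tu_min: float
) -> Optional[int]:
    """Take-up label over periods i..i+k-1 for one customer and one product.

    revenue[t] is the customer's revenue on the product in period t.
    Returns None when the customer is outside the population, i.e. has
    non-zero revenue in any of the k periods before i.
    """
    if i - k < 0 or i + k > len(revenue):
        raise IndexError("need k periods before i and k periods starting at i")
    if any(revenue[i - j] != 0 for j in range(1, k + 1)):
        return None
    total = sum(revenue[i + j] for j in range(k))
    return 1 if total > k * tu_min else 0


# Hypothetical revenue series for three customers, periods 0..3.
fast = aggregated_takeup([0, 0, 150, 80], i=2, k=2, tu_min=100)   # 230 > 200
slow = aggregated_takeup([0, 0, 150, 20], i=2, k=2, tu_min=100)   # 170 <= 200
existing = aggregated_takeup([0, 30, 150, 80], i=2, k=2, tu_min=100)
print(fast, slow, existing)  # 1 0 None
```

Summing over k periods lets a take-up that ramps up across two billing periods count once as a single positive, which is one way aggregation can add targets.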


Low take-up rates (cont’d)

• Use of time interleaving
  - Aggregate data with k=2
  - Generate 3 sets of data moved forward by a period
  - Concatenate the 3 sets to get 3 times as much training data as for data aggregation with k=2

[Figure: timeline of predictors and labels. TRAIN uses three overlapping four-period windows: i-5 to i-2, i-4 to i-1, and i-3 to i. Prediction uses the window i-1 to i+2.]

Impact of time interleaving

[Figure: two bar charts, "Large Businesses" and "Medium Businesses". x-axis: minimum take-ups (n) for modelling (n=20, n=10); y-axis: number of unmodellable products (0 to 70); series: Raw, DA (k=2), TI. Bar labels as extracted: Large Businesses 70, 45, 39, 28, 19, 48; Medium Businesses 60, 71, 66, 54, 49, 40.]

Time interleaving enormously enhances modellability
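Time interleaving can be sketched as generating three shifted windows and concatenating the labelled rows. The code below is an illustration only: the names `interleaved_windows` and `build_rows`, the single-product simplification (predictor periods are used only to test population membership) and the sample data are assumptions, not the original implementation.

```python
from typing import Dict, List, Sequence, Tuple

Window = Tuple[List[int], List[int]]  # (predictor periods, label periods)


def interleaved_windows(i: int, k: int = 2, n_sets: int = 3) -> List[Window]:
    """Return n_sets windows, each moved forward by one period, the last ending at i."""
    windows: List[Window] = []
    for shift in range(n_sets - 1, -1, -1):
        end = i - shift  # last label period of this window
        label_periods = list(range(end - k + 1, end + 1))
        predictor_periods = list(range(end - 2 * k + 1, end - k + 1))
        windows.append((predictor_periods, label_periods))
    return windows


def build_rows(
    revenue: Dict[str, Sequence[float]], i: int, tu_min: float, k: int = 2
) -> List[Tuple[str, int, int]]:
    """Concatenate labelled rows (customer, first label period, label) over all windows."""
    rows: List[Tuple[str, int, int]] = []
    for predictor_periods, label_periods in interleaved_windows(i, k):
        for customer, series in revenue.items():
            if any(series[t] != 0 for t in predictor_periods):
                continue  # outside the population for this window
            total = sum(series[t] for t in label_periods)
            rows.append((customer, label_periods[0], 1 if total > k * tu_min else 0))
    return rows


# Hypothetical revenue on one product for periods 0..5, with i = 5.
revenue = {
    "A": [0, 0, 0, 0, 150, 90],
    "B": [0, 0, 120, 130, 140, 150],
    "C": [0, 0, 0, 0, 0, 0],
}
windows = interleaved_windows(5)
rows = build_rows(revenue, i=5, tu_min=100.0)
positives = sum(label for _, _, label in rows)
print(windows)
print(len(rows), positives)  # 7 2
```

With only the latest window, this sample would yield a single positive (customer A); the two earlier windows add customer B's take-up as a second positive.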


Predictors for Modelling

• Revenue predictors used
  - $r_{i-3,Q}$: revenue for all products $Q$ in billing period $i-3$
  - Change in revenue from period $i-3$ to $i-2$: $r_{i-3,Q} - r_{i-2,Q}$
  - Projected revenue for period $i-1$: $2r_{i-3,Q} - r_{i-2,Q}$

• All revenue predictors used both as raw values, and normalised by total customer revenue

• Binary predictors indicating churn/take-up in period $i-2$

• All continuous predictors converted to binary using 10 equisize bins
  - Overcomes the negative impact of large variance in revenues
  - Allows generation of non-linear models using linear techniques

[Figure: timeline. Predictors from periods i-3 and i-2; labels from periods i-1 and i. TRAIN target: r(i-j)P = 0, j = 0..1]
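The conversion of a continuous predictor into binary indicators can be sketched as follows. The slide does not say how the 10 "equisize" bins are formed; this sketch assumes equal-frequency bins (each bin holds roughly the same number of customers), and the function names and sample revenues are hypothetical.

```python
from typing import List, Sequence


def bin_lower_edges(values: Sequence[float], n_bins: int = 10) -> List[float]:
    """Lower edges of bins 1..n_bins-1, chosen so bins hold equal counts (assumption)."""
    ordered = sorted(values)
    n = len(ordered)
    if n < n_bins:
        raise ValueError("need at least one value per bin")
    return [ordered[(b * n) // n_bins] for b in range(1, n_bins)]


def one_hot_bin(value: float, edges: Sequence[float]) -> List[int]:
    """Binary vector with a single 1 marking the bin that contains value."""
    index = sum(1 for edge in edges if value >= edge)
    vector = [0] * (len(edges) + 1)
    vector[index] = 1
    return vector


# Hypothetical, heavily skewed revenues: 1, 2, 4, ..., 524288.
revenues = [float(2 ** n) for n in range(20)]
edges = bin_lower_edges(revenues)
smallest = one_hot_bin(1.0, edges)
largest = one_hot_bin(524288.0, edges)
print(edges[:3])          # [4.0, 16.0, 64.0]
print(smallest.index(1))  # 0
print(largest.index(1))   # 9
```

Because only the bin membership is kept, a customer with a revenue thousands of times larger than the median ends up just a few bins away, which is how binning limits the effect of large variance.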


3. Mathematical Modelling

• Imbalance in class sizes
  - Large businesses
    – 51 products with < 0.5% take-up on average
    – 25 products with < 0.1% take-up
  - Medium businesses
    – 74 products with < 0.5% take-up on average
    – 54 products with < 0.1% take-up

• Maximisation of total take-up revenue
  - Identifying new high value customers is a priority
  - Extent of variance
    – Take-up amounts range from $TU_{MIN}$ to over a million dollars
    – Take-up amounts are not correlated with total revenue in previous billing periods


Imbalance in class sizes

• Use of Support Vector Machines (SVMs) instead of decision trees, neural nets or logistic regression
  - Based on Vapnik’s statistical learning theory
  - Maximises the margin of separation between two classes

• Two different SVM implementations
  - SVMstd: equal weight to all training examples
  - SVMbal: class dependent weights so all take-ups have a higher weight than all non-take-ups

$$\frac{C^+}{C^-} = \frac{m^-}{m^+}$$

• $m^+$ and $m^-$: number of +ve and -ve examples

• $C^+$ and $C^-$: weight of +ve and -ve examples
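A small sketch of class-dependent weights in the spirit of SVMbal, assuming the balancing rule $C^+ m^+ = C^- m^-$ (so that both classes carry the same total weight). The function name `balanced_weights` and the sample labels are hypothetical.

```python
from typing import List, Sequence


def balanced_weights(labels: Sequence[int], c_neg: float = 1.0) -> List[float]:
    """Per-example weights with C+ = C- * m- / m+ (assumed balancing rule)."""
    m_pos = sum(1 for y in labels if y == 1)
    m_neg = len(labels) - m_pos
    if m_pos == 0 or m_neg == 0:
        raise ValueError("need at least one example of each class")
    c_pos = c_neg * m_neg / m_pos
    return [c_pos if y == 1 else c_neg for y in labels]


# Hypothetical product with a 0.5% take-up rate: 1 take-up in 200 customers.
labels = [1] + [0] * 199
weights = balanced_weights(labels)
print(weights[0], weights[1])  # 199.0 1.0
```

Weights of this kind could be supplied as per-example weights to an SVM trainer, for instance through the `sample_weight` argument of scikit-learn's `SVC.fit`; the deployment described in these slides used Matlab & C.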


Identifying high value take-up

• SVMval: SVM with different weights for different positive (take-up) training examples
  - All take-up examples have a higher weight than all the non-take-up examples (as for SVMbal)
  - Each take-up training example has a weight proportional to the amount of take-up

[Equation garbled in extraction: defines $C^+(i)$ in terms of $C^-$, $m^-$, $m^+$, $TU(i)$, $TU_{MIN}$ and a constant 2.]

• $m^+$ and $m^-$: number of +ve and -ve examples

• $C^-$: weight of -ve examples

• $TU(i)$: take-up amount of the $i$th +ve example

• $C^+(i)$: weight of the $i$th +ve example
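The two stated properties of SVMval (every take-up outweighs every non-take-up, and take-up weights grow in proportion to the take-up amount) can be illustrated with the sketch below. The exact formula on the slide is not recoverable from this transcript, so the rule used here, $C^+(i) = C^- \cdot \frac{m^-}{m^+} \cdot \frac{TU(i)}{TU_{MIN}}$, is an assumption chosen only to satisfy those two properties. The function name and sample data are hypothetical.

```python
from typing import List, Sequence


def value_weights(
    labels: Sequence[int],
    takeups: Sequence[float],
    tu_min: float,
    c_neg: float = 1.0,
) -> List[float]:
    """Per-example weights; positives scale with take-up amount (assumed formula)."""
    m_pos = sum(1 for y in labels if y == 1)
    m_neg = len(labels) - m_pos
    if m_pos == 0 or m_neg == 0:
        raise ValueError("need at least one example of each class")
    base = c_neg * m_neg / m_pos  # balanced class weight, as in SVMbal
    weights: List[float] = []
    for y, tu in zip(labels, takeups):
        if y == 1:
            if tu <= tu_min:
                raise ValueError("a take-up must exceed TU_MIN by definition")
            weights.append(base * tu / tu_min)  # proportional to take-up amount
        else:
            weights.append(c_neg)
    return weights


# Hypothetical data: two take-ups of different value among six customers.
labels = [1, 1, 0, 0, 0, 0]
takeups = [200.0, 1000.0, 0.0, 0.0, 0.0, 0.0]
weights = value_weights(labels, takeups, tu_min=100.0)
print(weights)  # [4.0, 20.0, 1.0, 1.0, 1.0, 1.0]
```

Since every take-up exceeds $TU_{MIN}$ and negatives outnumber positives, each positive weight under this assumed rule is larger than the negative weight.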


4. Model Validation

• Model assessment
  - Two tests for assessing quality of models (~4,000 models)
    – 10-fold cross validation tests to determine the best of the 3 SVMs
    – Tests in production setting to evaluate time interleaving
  - All tests on 30 product take-up prediction problems in 4 segments
  - Performance measures on unseen test set
    – Area under receiver operating characteristic curve (AUC)
      • Measures quality of sorting
      • Decision threshold independent metric
    – Value weighted AUC (VAUC)
      • Indicates potential revenue from the sorting

• SVMval with time interleaved data is used for generating models
  - SVMval significantly more accurate than the other two
  - Time interleaving produces more stable models
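AUC as a measure of sorting quality can be computed directly from score pairs, as sketched below. The slides do not define VAUC, so `value_weighted_auc` uses an assumed definition (each positive counts in proportion to its take-up value) purely for illustration. All names and sample numbers are hypothetical.

```python
from typing import Sequence


def _pair_credit(pos_score: float, neg_score: float) -> float:
    """1 if the positive is ranked above the negative, 0.5 on a tie, else 0."""
    if pos_score > neg_score:
        return 1.0
    return 0.5 if pos_score == neg_score else 0.0


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (positive, negative) pairs that the scores order correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    credit = sum(_pair_credit(p, n) for p in pos for n in neg)
    return credit / (len(pos) * len(neg))


def value_weighted_auc(
    scores: Sequence[float], labels: Sequence[int], values: Sequence[float]
) -> float:
    """Pairwise AUC with each positive weighted by its value (assumed definition)."""
    neg = [s for s, y in zip(scores, labels) if y == 0]
    num = den = 0.0
    for s, y, v in zip(scores, labels, values):
        if y != 1:
            continue
        for n in neg:
            num += v * _pair_credit(s, n)
            den += v
    if den == 0:
        raise ValueError("need positives with non-zero value and some negatives")
    return num / den


# Hypothetical test set: the high-value take-up is ranked first, the small one is not.
scores = [0.9, 0.4, 0.6, 0.2]
labels = [1, 1, 0, 0]
values = [1000.0, 100.0, 0.0, 0.0]
plain = auc(scores, labels)
weighted = value_weighted_auc(scores, labels, values)
print(plain)  # 0.75
print(round(weighted, 4))
```

Neither measure depends on a decision threshold; both look only at how the scores order the customers.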


Model Validation by Business

• Predictive models identify more sales opportunities than those identified manually
  - 3 times as many in large businesses segment
  - 5 times as many in medium businesses segment

• Results for 2 different regions in medium businesses
  - Region 1: Predictions for just 5 products generated 9 new opportunities with an increase in revenue of ~A$400K
  - Region 2: Predictions identified opportunities that were already being processed by sales consultants

• Predictive modelling spreads the techniques of good sales teams across the whole organisation


5. Presentation

• Output in automatically generated Excel spreadsheet

• One customer list per segment with:
  - Take-up likelihood for all modelled products
  - Last quarter revenue for all products


6. Deployment

• Implementation in Matlab & C with output in Excel

• Automatic quarterly updates of model after consolidated revenue figures are available

• Models for ~50 products for each of the 4 business segments

• Output delivered to business analytics group
  - Different cut-offs for different products/regions
  - Superimposition of other data for filtering/sorting

• Use of output by sales consultants for renegotiating contracts with customers


Project Timeline

• Initial approach to data availability for pilot: 12 weeks

• Data to pilot: 6 weeks

• Model validation by business: 12 weeks

• Pilot deployment (5 products, 1 segment): 6 weeks

• Acceptance by business teams: over 9 months

• Final deployment: 4 weeks

• In operation for more than 8 years!!


Key Success Factors

• Willingness of stakeholders to try non-standard solutions

• Innovative solution: paper published in KDD 2005
  - Target definition using multiple overlapping time periods to boost the number of rare events for modelling
  - Use of support vector machines for customer analytics

• Being lazy
  - Scope change from 4 to 50 products
  - Scope change from 2 to 4 segments
  - Development of ~200 predictive models in one shot
  - No stale models in production

• Working with business analysts to instigate change:
  - Product-centric modelling to customer-centric product packaging

Case Study 2: Sales Opportunities for wholesalers

Case Study: Wholesale Sales

Business problem: Increase wholesale sales into major retailers

Process stages: APD, DAP, MM, MV, P, D

Analytical assumptions
- Sales demand
- Similar products @ similar outlets have similar demand to sales relationship
- Anomaly may be due to lack of stock

Definitions
- Quantify demand
- Define normalised sell-rate
- Define a long term in-stock measure
- Define products & outlets that are similar

Data
- Weekly SOH & sales for each store & SKU
- SKU master
- Store master

Modelling
- Simple univariate regression in SQL
- Perform comparisons & find anomalies with stock issues

Validation
- Absolute error
- Validate with retail

Presentation
- Self-serve report for each sales rep
- Presents list of products with sales opportunities
- Click thru’ to detailed graphs


Case Study: Wholesale Sales (cont’d)

[Figure: two scatter plots against Demand, one of In-stock % and one of Sell Rate, with points marked for retailers R1 and R2.]

Sell rate vs Consumer Demand plot
• Each point is a store
• R1 & R2 are comparable retailers
• Values for the same product

Possible reasons for difference
• Competing product at R2
• Pricing at R2 vs R1
• Lack of stock at R2
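The comparison of sell rate against demand across comparable retailers can be sketched with a simple univariate least-squares fit. The project did this in SQL; Python is used here only for illustration. The approach shown (fit the line on one retailer's stores, then flag stores of the other retailer that fall well below it) is one possible reading of "perform comparisons & find anomalies", and the function names, tolerance and store data are all hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = intercept + slope * x; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("demand values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def find_shortfalls(
    stores: Sequence[Tuple[str, float, float]],
    slope: float,
    intercept: float,
    tolerance: float,
) -> List[str]:
    """Names of stores whose sell rate is more than tolerance below the fitted line."""
    flagged: List[str] = []
    for name, demand, sell_rate in stores:
        expected = intercept + slope * demand
        if expected - sell_rate > tolerance:
            flagged.append(name)
    return flagged


# Hypothetical stores of retailer R1 (reference) for one product.
r1_demand = [10.0, 20.0, 30.0, 40.0]
r1_sell_rate = [5.0, 10.0, 15.0, 20.0]
slope, intercept = fit_line(r1_demand, r1_sell_rate)

# Hypothetical stores of comparable retailer R2: (name, demand, sell rate).
r2_stores = [("R2-a", 20.0, 9.5), ("R2-b", 30.0, 9.0), ("R2-c", 40.0, 20.5)]
flagged = find_shortfalls(r2_stores, slope, intercept, tolerance=2.0)
print(slope, intercept)  # 0.5 0.0
print(flagged)           # ['R2-b']
```

A flagged store is only a candidate: as the slide notes, the gap may come from a competing product or pricing rather than lack of stock, so the in-stock measure is needed to separate those cases.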


Case Study: Wholesale Sales (cont’d)

Deployment
- SQL & Cognos
- Automatic weekly updates
- Training by corporate training team
- Support from IT helpdesk


Take-home points

• Data acquisition & processing phase forms 80-90% of any analytics project

• Business users are tool agnostic

- R, SAS, Matlab, SPSS, … for statistical analysis

- Tableau, Cognos, Excel, VB, … for presentation

• Business adoption of analytics driven by

- Utility of application

- Validation of results by using real-life cases

- Ease of decision-making from insights

- Ability to explain insights