dynamic customer segment analysis and behavior prediction using data mining

Post on 04-Jan-2016



Dynamic customer segment analysis and behavior prediction using data mining

Group 1: Margaret Dlamini, Saumen Bhaumick, Daniel Chen, Ricky Huang, July Panoso

National Tsing Hua University, Department of Industrial Engineering and Engineering Management

Abstract

CRM mainly aims to understand customers well by studying the differences between them through customer segmentation. This study aims to:

• Track customers' shifts from segment to segment

• Discover customer segment knowledge

• Predict customer segments' behavior patterns

CRM

We believe keeping and managing customers is most important:

• Attractive, personalized services to satisfy customer needs

• CRM: closer and deeper relationships with customers

Understanding customers

• Analyze customer information

• Differentiate customers through segmentation

• Increase customer loyalty through customized products

Predict customers' purchase behavior

Contact and serve customers through channels

To understand customers, it is essential to integrate the data collected through:

• Web browsing

• Purchase behavior

• Complaints

• Demographics

THE DATA

The customer segments and related knowledge discovered from multiple data sources change as the customer base changes, so they are valid only for a particular period.

Most existing prediction methods are fundamentally based on numerical and historical data patterns (using simple regression or neural network techniques).

FLUCTUATIONS

Segments can fluctuate considerably because of:

• Promotions

• New product launches

• Customer support policies

Customer Segment

This study tracks customer shifts among customer segments in order to:

• Monitor changes over time

• Discover customer segment knowledge

• Predict customer segments' behavior patterns

Prediction of customer behavior

By studying the segment shifts each customer makes, we can:

• Build a career path for each customer

• Derive the dominant career paths (those the majority of customers follow) by aggregating each customer's career path

Process to Segment Customers

• Choose a basis for segmentation, with appropriate variables (demographic or behavioral).

• Use a multivariate analysis to group or split customers.

• Evaluate and validate the outputs.

• Analyze the results in economic terms.

Segmentation Design schemes

• Measure used for segmentation

• Number of resulting segments

• View about change over time

• Segmentation techniques used

• Number of customers selected

Segmentation Measures

The segmentation variables consist of one or a combination of the following:

• Demographic

• Geographic

• Psychographic

• Behavioral

The behavioral purchase pattern can be described by:

• RFM (Recency, Frequency and Monetary)

• FRAT (Frequency, Recency, Amount and Type)

Number of Resulting Segments

Criteria for choosing the number of segments:

• Minimize the combined direct and opportunity cost of the segmentation (the criterion for the optimum number of segments)

• Allow the derivation of equal-sized segments

• Decide the number of segments on a judgmental basis

View about change over time

One way is an occasion-based design, which assumes that people's needs vary across occasions of product purchase.

The other way is to consider time-segmented customers through repeated measurements of the same customer at different points in time.

Segmentation Techniques

Statistical methods:

• K-means algorithm

• Discriminant analysis

• Logistic regression

Machine learning techniques:

• Neural networks (normally considered more accurate than statistical methods)

Number of Customers

Customer segmentation can incorporate all customers or be limited to a sample of them.

If the segmentation is based on a sample, it is important to estimate how many customers fall in each group (via inferential statistics).
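One common way to make such an estimate is a normal-approximation confidence interval on the segment's share of the sample, scaled up to the whole customer base. The sketch below is only an illustration under that assumption; the function name, the 95% level (z = 1.96), and the example figures are hypothetical and do not come from the study.

```python
import math

def segment_size_interval(sample_in_segment, sample_size, population_size, z=1.96):
    """Approximate interval for the number of customers in a segment.

    Uses the normal approximation to the sample proportion, which is only
    reasonable for fairly large samples (an assumption of this sketch).
    """
    p = sample_in_segment / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    low = max(0.0, p - margin) * population_size
    high = min(1.0, p + margin) * population_size
    return low, high

# Hypothetical figures: 300 of 1,000 sampled customers fall in the segment,
# and the full customer base has 100,000 customers.
low, high = segment_size_interval(300, 1000, 100_000)
print(f"Estimated segment size: {low:,.0f} to {high:,.0f} customers")
```

A larger sample narrows the interval, which is the usual trade-off against the cost of segmenting more customers.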

Profitability

Predict changes in a segment to derive the static characteristics of the segment.

Changes in a segment closely relate to an increase or decrease in the profitability obtained from the segment.

Research Overview

• This study focuses on behavioral variables, including customers' product usage.

• Recency, Frequency, Monetary (RFM) analysis.

• Self-Organizing Map (SOM): uses a neural clustering method to divide the retailer's customers into numerous groups.

Cont.

• This paper collects data from July 2001 to September 2002.

• Customers are segmented five times during the fifteen months.

• One quarter is the time window for creating a new segmentation.

Cont.

• Individual career path: presents a single customer's history of shifts.

• Dominant career path: a descriptive pattern that explains the common histories most customers might follow, one leading to a loyal segment and the other leading to a vulnerable segment.

• This study also provides an analytical method for predicting the time-variant segment movement a customer might show.

SEGMENTING CUSTOMERS

We should use a clustering analysis of product usage or purchases. Purchase transactions have four features:

• Customer number or customer ID

• Recency value

• Frequency value

• Monetary value

Data preparation for the segmentation

We have 3 situations:

• Newcomers (no purchases before period t)

• Old customers who made purchases during period t

• Old customers who made no purchases during period t

How can we calculate RFM?

Newcomers (no purchase history before period t)

r_t = measures how long ago they made a purchase

f_t = measures how frequently they make purchases

m_t = measures how much money they spend
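The three per-period measures can be computed directly from a customer's transactions. The following is a minimal sketch, assuming each transaction is a (date, amount) pair and that recency is counted in days back from the end of the period; both the data layout and the function name are illustrative assumptions, not details given in the slides.

```python
from datetime import date

def period_rfm(transactions, period_start, period_end):
    """Summarise one customer's purchases inside [period_start, period_end].

    transactions: list of (purchase_date, amount) tuples (hypothetical layout).
    Returns (r_t, f_t, m_t), or None when no purchase falls in the period.
    """
    in_period = [(d, a) for d, a in transactions if period_start <= d <= period_end]
    if not in_period:
        return None
    last_purchase = max(d for d, _ in in_period)
    r_t = (period_end - last_purchase).days  # days since the latest purchase
    f_t = len(in_period)                     # number of purchases in the period
    m_t = sum(a for _, a in in_period)       # total amount spent in the period
    return r_t, f_t, m_t

# Made-up purchases for one customer, evaluated for the quarter Jul-Sep 2001.
purchases = [(date(2001, 7, 10), 40.0), (date(2001, 9, 20), 25.5), (date(2001, 11, 2), 60.0)]
print(period_rfm(purchases, date(2001, 7, 1), date(2001, 9, 30)))
```

Returning None for a quarter without purchases keeps the "no purchase during period t" case distinct, so it can be handled by a separate rule.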

Old customers who made purchases during period t

R_{t-1} - r_t = Recency value for period t

F_{t-1} - f_t = Frequency value for period t

M_{t-1} - m_t = Monetary value for period t

Note: R_{t-1}, F_{t-1}, and M_{t-1} stand for the cumulative values up to period t-1.

Old customers who made no purchases during period t

R_{t-1} + 3 months = Recency value for period t

F_{t-1} + 3 months = Frequency value for period t

M_{t-1} + 3 months = Monetary value for period t

Self-organization of customers

The SOM performs unsupervised clustering:

• Records within a group or cluster tend to be similar to each other

• Records in different groups are dissimilar

The SOM will end up with a few output units:

• Strong units

• Weak units

The strong units represent probable cluster centers.

Segmentation results

Two techniques speed up the SOM:

• One is to vary the size of the neighborhoods, from large to small.

• The other is to have the winning neuron use a larger learning rate than the neighboring neurons.
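Both techniques can be seen in a very small self-organizing map. The sketch below is an illustration only, not the configuration used in the study: it assumes a one-dimensional row of units, RFM values already scaled to the range 0 to 1, a linearly shrinking neighborhood radius, and hypothetical learning rates of 0.5 for the winner and 0.1 for its neighbors.

```python
import random

def best_unit(units, x):
    """Index of the unit whose weights are closest to x (squared distance)."""
    return min(range(len(units)),
               key=lambda j: sum((w - xi) ** 2 for w, xi in zip(units[j], x)))

def train_som(data, n_units=4, epochs=30, lr_winner=0.5, lr_neighbor=0.1, seed=0):
    """Train a 1-D SOM and return the list of unit weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    units = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        # Technique 1: the neighborhood radius shrinks from large to small.
        radius = int((n_units // 2) * (1 - epoch / epochs))
        for x in rng.sample(data, len(data)):  # visit records in random order
            winner = best_unit(units, x)
            lo = max(0, winner - radius)
            hi = min(n_units, winner + radius + 1)
            for j in range(lo, hi):
                # Technique 2: the winner learns faster than its neighbors.
                lr = lr_winner if j == winner else lr_neighbor
                units[j] = [w + lr * (xi - w) for w, xi in zip(units[j], x)]
    return units

# Made-up, pre-scaled (recency, frequency, monetary) records.
records = [[0.1, 0.9, 0.8], [0.2, 0.8, 0.9], [0.9, 0.1, 0.2], [0.8, 0.2, 0.1]]
som = train_som(records)
for r in records:
    print(r, "-> unit", best_unit(som, r))
```

Units that win many records play the role of the strong units; units that rarely or never win correspond to the weak ones.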

Summary of customer statistics per quarter

Summary of customer segment characteristics for the third quarter of 2001

Loyal

Vulnerable

Newcomer

Result of the successive five-time segmentation

Discovering individual career path and dominant career path

Five-time segmentation makes it possible to combine segment shift histories into a career path.

• Natural life cycle

• Migration

• External factors

Changes in segments

Over successive quarters, the number of customers in a segment changes, indicating certain strategies that management should review for the CRM.

         

From Q3 2001 \ To Q4 2001    R↓F↑M↑    R↓F↓M↓    R↑F↓M↓    Customers before shifts
R↓F↑M↑                       24,577     2,267     3,952    30,796
R↓F↓M↓                        5,472    16,181    14,563    36,216
R↑F↓M↓                        2,778     9,387    17,788    29,953
R↑F↑M↑                          461       148       282       891
Customers after shifts       33,288    27,983    36,585    97,856

Segment shifts of customers from Q3 2001 to Q4 2001
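The shift counts above can be turned into shift probabilities by dividing each row by its "before shifts" total. The sketch below does only that arithmetic on the figures in the table; the variable and function names are illustrative.

```python
# Columns are the Q4 2001 segments, in the same order as the table.
SEGMENTS_TO = ["R↓F↑M↑", "R↓F↓M↓", "R↑F↓M↓"]

# Rows are the Q3 2001 segments; counts are copied from the table.
shifts = {
    "R↓F↑M↑": [24577, 2267, 3952],
    "R↓F↓M↓": [5472, 16181, 14563],
    "R↑F↓M↓": [2778, 9387, 17788],
    "R↑F↑M↑": [461, 148, 282],
}

def transition_probabilities(counts):
    """Divide each row by its total to get P(to-segment | from-segment)."""
    return {src: [c / sum(row) for c in row] for src, row in counts.items()}

probs = transition_probabilities(shifts)
for src, row in probs.items():
    print(src, "  ".join(f"{to}: {p:.1%}" for to, p in zip(SEGMENTS_TO, row)))
```

For example, 24,577 of the 30,796 customers who started in R↓F↑M↑ stayed there, a retention share of about 79.8%.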

Path                          No. of customers   Probability (%)
R↓F↑M↑ → R↓F↑M↑ → R↓F↑M↑      20,495             42.0
R↓F↓M↓ → R↑F↓M↓ → R↓F↑M↑       5,658             11.6
R↑F↓M↓ → R↓F↑M↑ → R↓F↑M↑       3,386              6.9
R↑F↓M↓ → R↓F↑M↑ → R↓F↑M↑       2,999              6.1
R↓F↓M↑ → R↓F↑M↑ → R↓F↑M↑       2,457              5.0

Dominant career paths of length 3, which lead to segment R↓F↑M↑

Dominant career paths of length 5, which lead to segment R↑F↓M↓

Path                                          No. of customers   Probability (%)
R↑F↓M↓ → R↑F↓M↓ → R↑F↓M↓ → R↑F↓M↓ → R↑F↓M↓    8,645              20.90
R↓F↓M↓ → R↑F↓M↓ → R↑F↓M↓ → R↑F↓M↓ → R↑F↓M↓    5,460              13.20
R↓F↓M↓ → R↓F↓M↓ → R↑F↓M↓ → R↑F↓M↓ → R↑F↓M↓    2,010               4.86
R↑F↓M↓ → R↓F↓M↓ → R↑F↓M↓ → R↑F↓M↓ → R↑F↓M↓    1,924               4.65
R↓F↑M↑ → R↑F↓M↓ → R↑F↓M↓ → R↑F↓M↓ → R↑F↓M↓      875               2.11
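Tables like the ones above can be produced by counting identical individual career paths. The sketch below is one plausible way to do it; it assumes each customer's path is a tuple of segment labels and that the probability is the share among customers whose path ends in the target segment. The slides do not state how the probability column was normalized, so that choice, the function name, and the toy data are all assumptions.

```python
from collections import Counter

def dominant_paths(career_paths, target_segment, top_n=5):
    """Rank the most common career paths that end in target_segment.

    career_paths: iterable of tuples of segment labels, one per customer.
    Returns (path, count, share_percent) tuples; the share is taken over
    all customers whose path ends in target_segment (an assumption).
    """
    ending = [tuple(p) for p in career_paths if p and p[-1] == target_segment]
    counts = Counter(ending)
    total = len(ending)
    return [(path, n, 100.0 * n / total) for path, n in counts.most_common(top_n)]

# Toy data: six made-up customers observed over three quarters.
paths = (
    [("R↓F↑M↑", "R↓F↑M↑", "R↓F↑M↑")] * 3
    + [("R↑F↓M↓", "R↓F↑M↑", "R↓F↑M↑")]
    + [("R↓F↑M↑", "R↑F↓M↓", "R↑F↓M↓")] * 2
)
for path, n, share in dominant_paths(paths, "R↓F↑M↑"):
    print(" → ".join(path), n, f"{share:.1f}%")
```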

Predicting Career Paths

Predicting a customer's segment shifts can be treated as a classification task from the data mining perspective.

This case study uses a decision tree induction technique and chooses C5.0 to predict the time-variant career paths.

Decision Tree Induction Technique

The C5.0 algorithm has a special method for improving its accuracy rate, called boosting.

Boosting works by building multiple models in a sequence.

Each new tree is used to modify and improve on the previous one.
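The idea of building models in sequence can be illustrated with an AdaBoost-style loop over one-split trees (decision stumps). This is a generic sketch of boosting and not C5.0's own implementation: the stump learner, the weight update, the labels of +1 and -1, and the toy data are all assumptions made for illustration.

```python
import math

def stump_predict(x, threshold, polarity):
    """A one-split tree: predicts `polarity` at or above the threshold."""
    return polarity if x >= threshold else -polarity

def fit_stump(xs, ys, weights):
    """Pick the (threshold, polarity) with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def boost(xs, ys, rounds=5):
    """Build stumps in sequence, re-weighting the cases the last one missed."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)  # voting weight of this stump
        ensemble.append((alpha, threshold, polarity))
        # Misclassified cases gain weight, so the next stump focuses on them.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all stumps; labels are +1 / -1."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy 1-D data that no single stump can separate.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = boost(xs, ys, rounds=5)
hits = sum(predict(model, x) == y for x, y in zip(xs, ys))
print(f"training accuracy: {hits}/{len(xs)}")
```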

Data Preparation for the Prediction

The case generates 6 models for categorical predictions.

The best model is the one with the highest accuracy.

Training six prediction models

Quarters covered: Q3 2001, Q4 2001, Q1 2002, Q2 2002, Q3 2002

Model   Role of the quarters used
PMa     Attribute, Attribute, Class
PMb     Attribute, Attribute, Class
PMc     Attribute, Attribute, Class
PMd     Attribute, Attribute, Attribute, Class
PMe     Attribute, Attribute, Attribute, Class
PMf     Attribute, Attribute, Attribute, Attribute, Class
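Each model's training rows can be cut out of the customers' career paths. The sketch below assumes that the attributes are the segment labels of consecutive quarters and that the class is the label of the quarter immediately after them; the slides list only the number of attributes per model, so the exact quarters, the function name, and the example path are assumptions.

```python
def make_training_row(career_path, n_attributes, class_index):
    """Split one customer's career path into (attributes, class label).

    career_path: segment labels ordered by quarter.
    class_index: position of the quarter to predict; the n_attributes
    quarters immediately before it become the input attributes.
    """
    start = class_index - n_attributes
    if start < 0 or class_index >= len(career_path):
        raise ValueError("window does not fit inside the career path")
    return tuple(career_path[start:class_index]), career_path[class_index]

# Made-up path over five quarters (Q3 2001 to Q3 2002).
path = ("R↓F↓M↓", "R↑F↓M↓", "R↑F↓M↓", "R↓F↑M↑", "R↓F↑M↑")

# A three-attribute model predicting the last quarter from the three before it.
attributes, label = make_training_row(path, n_attributes=3, class_index=4)
print(attributes, "->", label)
```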

Summary of the prediction accuracy of C5.0 models

Model   No. of attributes   Pruning severity   Prediction accuracy (%)
PMa     2                   75                 59.74
PMb     2                   80                 61.68
PMc     2                   70                 71.27
PMd     3                   94                 62.28
PMe     3                   78                 71.38
PMf     4                   75                 71.13
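Selecting the best model from this summary is a single comparison. The short sketch below uses the accuracy figures from the table; the dictionary layout is just an illustrative choice.

```python
# Prediction accuracy (%) per model, copied from the summary table.
accuracy = {
    "PMa": 59.74, "PMb": 61.68, "PMc": 71.27,
    "PMd": 62.28, "PMe": 71.38, "PMf": 71.13,
}

best_model = max(accuracy, key=accuracy.get)
print(best_model, accuracy[best_model])  # PMe 71.38
```

PMe wins only narrowly: PMc and PMf are within about a quarter of a percentage point of it.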

Prediction accuracy statistics for the best model, PMe

                            Predicted values at Q4 2002
Actual values at Q4 2002    R↓F↑M↑    R↑F↓M↓    Total
R↓F↓M↓                        2102      2009     4111
R↓F↓M↑                        3907      3071     6978
R↓F↑M↑                       36286      9165    45451
R↑F↓M↓                        8183     33133    41316
Total                        50478     47378    97856


Newcomer Segment


Total prediction accuracy: (36286 + 33133) / 97856 × 100% = 71%

R↓F↑M↑ prediction accuracy: 36286 / 50478 × 100% = 72%

R↑F↓M↓ prediction accuracy: 33133 / 47378 × 100% = 70%
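The three figures can be reproduced from the PMe table with a few lines of arithmetic. The sketch below copies the counts from that table; note that each per-segment figure divides the correct predictions by the number of customers predicted to be in that segment (the column total). The variable names are illustrative.

```python
# actual segment -> (predicted R↓F↑M↑, predicted R↑F↓M↓), copied from the table
confusion = {
    "R↓F↓M↓": (2102, 2009),
    "R↓F↓M↑": (3907, 3071),
    "R↓F↑M↑": (36286, 9165),
    "R↑F↓M↓": (8183, 33133),
}
predicted_labels = ("R↓F↑M↑", "R↑F↓M↓")

total = sum(sum(row) for row in confusion.values())
# Correct predictions sit where the actual and predicted segment agree.
correct = {lab: confusion[lab][i] for i, lab in enumerate(predicted_labels)}
column_totals = {lab: sum(row[i] for row in confusion.values())
                 for i, lab in enumerate(predicted_labels)}

overall = 100 * sum(correct.values()) / total
per_segment = {lab: 100 * correct[lab] / column_totals[lab] for lab in predicted_labels}

print(f"Total: {overall:.0f}%")
for lab, pct in per_segment.items():
    print(f"{lab}: {pct:.0f}%")
```

Because the model only ever predicts these two segments, customers whose actual segment is R↓F↓M↓ or R↓F↓M↑ can never be counted as correct.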

Performance evaluation of the PMe model

• Because the training set contains only a few cases from the newcomer segments (7.8%), the model PMe could hardly learn the patterns in them.

• Accurate predictions for rare categories will earn a higher performance evaluation.

Conclusion

• This paper has proposed a segment-based knowledge discovery method for deriving descriptive patterns and predicting the path along which a customer will shift.

• It tries to resolve two fundamental problems: the changing characteristics of the customers in a segment and the change in the segment's composition.

Cont.

Further research:

• Extending the prediction accuracy

• Using a neural network

• Building a separate classifier for different segments and combining the results from multiple classifiers
