Customer insights from telecom data using deep learning

Post on 13-Jan-2017

Category:

Data & Analytics

Analytics on Telecom CDR Data

RedZebra Analytics, Oct 2014

Problem statement

1. How to segment Telecom customers and track their dynamics

2. How to optimize / reformulate tariff plans

3. How to predict churn

The data

• 3 months of CDR
  – Data consumption
  – Phone calls and top-ups
  – SMS

• User description (geo, sociodemographics)

The techniques

Deep Neural Networks and Autoencoders (Keras framework)

Random Forest

Extreme Gradient Boosting

Graph analysis (Igraph)

SOM and tSNE

Scikit Learn (Python)

Data processing (for churn prediction)

Churn (1) / no churn (0)

Customer activity is converted into heatmaps
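A minimal sketch of how a user's CDR events could be turned into a heatmap. The slides do not state the heatmap layout, so the days x hours grid, the `Event` record, and the max-scaling step are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical CDR event: day index within the observation window, hour of
# day, and an activity amount (e.g. call seconds, SMS count, or megabytes).
Event = namedtuple("Event", ["day", "hour", "amount"])

def activity_heatmap(events, n_days):
    """Sum activity into an n_days x 24 grid (rows: days, columns: hours)."""
    grid = [[0.0] * 24 for _ in range(n_days)]
    for e in events:
        # Ignore events that fall outside the window instead of failing.
        if 0 <= e.day < n_days and 0 <= e.hour < 24:
            grid[e.day][e.hour] += e.amount
    return grid

def normalize(grid):
    """Scale the grid to [0, 1] by its maximum so users are comparable."""
    peak = max(max(row) for row in grid)
    if peak == 0:
        return grid
    return [[v / peak for v in row] for row in grid]

events = [Event(0, 9, 120.0), Event(0, 9, 60.0), Event(1, 20, 90.0)]
heatmap = normalize(activity_heatmap(events, n_days=2))
```

One such grid per activity type (data, calls, SMS) could be stacked as channels, which is the shape an image-style model expects.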

Network data also considered

We also include network data (like the number of churners connected to a node)
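The network feature mentioned above (number of churners connected to a node) can be sketched as follows. The deck lists Igraph for graph analysis; plain dictionaries are used here only to keep the example self-contained, and the edge list and user ids are hypothetical.

```python
from collections import defaultdict

def churner_neighbor_counts(edges, churners):
    """For each user, count distinct contacts who are known churners."""
    churners = set(churners)
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # skip self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {user: len(contacts & churners) for user, contacts in neighbors.items()}

# Hypothetical call graph: each pair means the two users called or texted.
edges = [("u1", "u2"), ("u1", "u3"), ("u2", "u3"), ("u3", "u4")]
churners = {"u2", "u4"}
counts = churner_neighbor_counts(edges, churners)
```

The resulting count can be appended to each user's feature vector alongside the activity heatmap.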

Activity of three distinct users

Approach: Convolutional Neural Network

INPUT: User activity heatmap

OUTPUT: Churn / no churn

Results

Method                          AUC (train)   AUC (test)
Random Forest                   0.75          0.74
Extreme Gradient Boosting       0.80          0.76
Variational Autoencoders        0.78          0.75
Convolutional Neural Networks   0.79          0.77

Convolutional Neural Networks have the best test performance (AUC 0.77)
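For reference, AUC can be read as the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner. The sketch below computes it directly from that definition; the labels and scores are made-up values, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Illustrative values only: 1 = churn, 0 = no churn.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.1]
auc = roc_auc(labels, scores)
```

Computing the same quantity on both the training and the held-out set gives the two columns reported above; a large gap between them usually points to overfitting.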

Some templates of user activity discovered by the neural network

SMS activity per age group

Clustering

Techniques used to cluster and visualize data:
• K-means
• Self-organizing maps (SOM)
• tSNE

Visualization of a sample of users with tSNE

Segmentation with Self-Organizing Maps

Distance to code-vectors: how stable is the population
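The slides do not say how the distance to code-vectors was measured. One plausible reading, sketched here as an assumption, is to map each user to the closest code-vector of a trained SOM and compare the average distance between two periods. The codebook and the two monthly samples below are hypothetical.

```python
import math

def best_matching_unit(x, codebook):
    """Return (index, distance) of the code-vector closest to x."""
    distances = [math.dist(x, c) for c in codebook]
    idx = min(range(len(codebook)), key=distances.__getitem__)
    return idx, distances[idx]

def mean_quantization_error(samples, codebook):
    """Average distance from each sample to its closest code-vector."""
    return sum(best_matching_unit(x, codebook)[1] for x in samples) / len(samples)

# Hypothetical 2-D features; the codebook is assumed to come from a trained SOM.
codebook = [(0.0, 0.0), (1.0, 1.0)]
month_1 = [(0.0, 0.0), (1.0, 1.0)]
month_2 = [(0.0, 0.5), (1.0, 2.0)]
drift = mean_quantization_error(month_2, codebook) - mean_quantization_error(month_1, codebook)
```

A drift close to zero would suggest the population still fits the existing segments; a growing value would suggest users are moving away from them.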

Conclusions

• Deep Convolutional Networks achieve top performance
• Network data is very important (who is connected to whom)
• We found 5 well-defined segments
• Payments are determined by calls, not data
• SOMs create relatively stable segments
• Inter-community diversity in some cases
