Customer insights from telecom data using deep learning

Analytics on Telecom CDR Data, RedZebra Analytics, Oct 2014

Author: armando-vieira

Post on 13-Jan-2017







Problem statement

How to segment telecom customers and track their dynamics

How to optimize / reformulate tariff plans

How to predict churn

The data

3 months of CDR

Data consumption

Phone calls and top-ups

SMS

User description (geo, sociodemographics)

The techniques

Deep Neural Networks and Autoencoders (Keras framework)

Random Forest

Extreme Gradient Boosting

Graph analysis (igraph)

SOM and t-SNE

scikit-learn (Python)

Data processing (for churn prediction)

Target: churn (1) / no churn (0)

Customer activity is converted into heatmaps

Network data is also considered, such as the number of churners connected to a node
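The network feature mentioned above (number of churners connected to a node) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the slides list igraph for graph analysis, while this sketch uses plain dictionaries, and the edge list, user IDs, and the function name `churned_neighbor_counts` are hypothetical.

```python
def churned_neighbor_counts(edges, churners):
    """For each user, count the distinct contacts who have churned."""
    neighbors = {}
    for a, b in edges:
        # Treat a call or SMS between two users as an undirected link.
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return {user: len(contacts & churners) for user, contacts in neighbors.items()}


# Hypothetical call graph and churner set (not from the slides).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
churners = {"B", "D"}
counts = churned_neighbor_counts(edges, churners)
```

The resulting count per user could then be appended to that user's other features before training.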

Activity of three distinct users
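A possible way to build such an activity heatmap is sketched below. The slides do not state the heatmap layout, so the 7 x 24 grid (day of week by hour of day), the ISO timestamp format, and the function name `activity_heatmap` are all assumptions made for illustration.

```python
from datetime import datetime


def activity_heatmap(event_times):
    """Count events per (weekday, hour) cell, giving a 7 x 24 grid."""
    grid = [[0] * 24 for _ in range(7)]
    for ts in event_times:
        t = datetime.fromisoformat(ts)
        grid[t.weekday()][t.hour] += 1  # weekday(): Monday is 0
    return grid


# Hypothetical call timestamps for one user (not from the slides).
calls = ["2014-10-06T09:15:00", "2014-10-06T09:40:00", "2014-10-08T21:05:00"]
heatmap = activity_heatmap(calls)
```

One grid per activity type (calls, SMS, data, top-ups) could be stacked as image channels, although the slides do not say whether this was done.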

Approach: Convolutional Neural Network

Input: user activity heatmap

Output: churn / no churn
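To show what a convolutional model computes on a heatmap, here is a dependency-free sketch of a single forward pass: one convolution filter, ReLU, global max pooling, and a sigmoid output. The slides used Keras and do not describe the architecture, so the layer choice, kernel values, weight, and bias below are illustrative assumptions, not the trained model.

```python
import math


def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    This is cross-correlation, which is what CNN layers compute in practice.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def churn_probability(heatmap, kernel, weight, bias):
    """Convolution -> ReLU -> global max pooling -> sigmoid."""
    feature_map = conv2d_valid(heatmap, kernel)
    pooled = max(max(0.0, v) for row in feature_map for v in row)
    return 1.0 / (1.0 + math.exp(-(weight * pooled + bias)))


# Illustrative values only: a tiny 3 x 4 "heatmap" and a hand-picked 2 x 2 kernel.
heatmap = [[0, 1, 2, 0],
           [0, 3, 1, 0],
           [0, 0, 0, 0]]
kernel = [[1, 0],
          [0, 1]]
p = churn_probability(heatmap, kernel, weight=0.5, bias=-1.0)
```

In a real model the kernel, weight, and bias are learned from labelled users, and many filters and layers are stacked.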

Results

Method                          AUC (train)   AUC (test)
Random Forest                   0.75          0.74
Extreme Gradient Boosting       0.80          0.76
Variational Autoencoders        0.78          0.75
Convolutional Neural Networks   0.79          0.77

Convolutional Neural Networks have the best test AUC (0.77); Extreme Gradient Boosting has the highest train AUC (0.80) but a larger train-test gap
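For readers unfamiliar with the AUC metric used in the results, a minimal sketch of how it can be computed follows. It uses the pairwise definition (the probability that a randomly chosen churner is scored above a randomly chosen non-churner); the labels and scores are made up for illustration and are not the slide data.

```python
def roc_auc(labels, scores):
    """Fraction of (churner, non-churner) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = churn) and model scores.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.1]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the number of users; a rank-based implementation is preferable for large datasets.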

Some templates of user activity discovered by the neural network

SMS activity per age group

Clustering

Techniques used to cluster and visualize data:

K-means

Self-organizing maps (SOM)

t-SNE
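As an illustration of the first technique in the list, here is a minimal K-means sketch (Lloyd's algorithm). The two features, the data points, and the initial centroids are hypothetical; the slides list scikit-learn among the tools but do not give features or settings.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm with caller-supplied initial centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[k]
            for k, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical users described by (calls per day, MB of data per day).
users = [(1, 1), (1, 2), (8, 8), (9, 8)]
centroids, clusters = kmeans(users, centroids=[(0, 0), (10, 10)])
```

In practice features would be scaled first, and the number of clusters chosen by inspecting the resulting segments.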

Visualization of a sample of users with t-SNE

Segmentation with Self-Organizing Maps

Distance to code-vectors: how stable is the population
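One plausible reading of this slide is sketched below: each user is assigned to the nearest SOM code vector (the best-matching unit), and stability is measured as the share of users whose best-matching unit is unchanged between two periods. The slides do not define the stability measure, so this definition, the code vectors, and the user data are assumptions.

```python
import math


def best_matching_unit(x, codebook):
    """Return (index, Euclidean distance) of the closest code vector."""
    dists = [math.dist(x, c) for c in codebook]
    idx = min(range(len(codebook)), key=dists.__getitem__)
    return idx, dists[idx]


def segment_stability(before, after, codebook):
    """Fraction of users whose best-matching unit is the same in both periods."""
    same = sum(
        best_matching_unit(b, codebook)[0] == best_matching_unit(a, codebook)[0]
        for b, a in zip(before, after)
    )
    return same / len(before)


# Hypothetical code vectors and per-user feature vectors for two months.
codebook = [(0.0, 0.0), (5.0, 5.0)]
month1 = [(0.5, 0.0), (4.0, 5.0), (1.0, 1.0)]
month2 = [(0.0, 1.0), (5.0, 4.0), (4.0, 4.0)]
idx, dist = best_matching_unit(month1[0], codebook)
stability = segment_stability(month1, month2, codebook)
```

Here the third user moves from the first code vector to the second, so two of the three users stay in the same segment.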

Conclusions

Deep Convolutional Networks achieve top performance

Network data is very important (who is connected to whom)

We found 5 well-defined segments

Payments are determined by calls, not data

SOMs create relatively stable segments

Intercommunity diversity in some cases