Customer insights from telecom data using deep learning
Analytics on Telecom CDR Data
RedZebra Analytics
Oct 2014
Problem statement
How to segment telecom customers and track their dynamics
How to optimize / reformulate tariff plans
How to predict churn
The data
3 months of CDR
Data consumption
Phone calls and top-ups
SMS
User description (geo, sociodemographics)
The techniques
Deep Neural Networks and Autoencoders (Keras framework)
Random Forest
Extreme Gradient Boosting
Graph analysis (Igraph)
SOM and tSNE
Scikit Learn (Python)
Data processing (for churn prediction)
Churn (1) / no churn (0)
Customer activity is converted into heatmaps
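The slides do not say how the heatmaps are laid out. As a minimal sketch, assuming one grid per user with day of week on one axis and hour of day on the other, the conversion could look like the following. The function name `activity_heatmap` and the sample timestamps are hypothetical.

```python
from datetime import datetime


def activity_heatmap(timestamps, n_days=7, n_hours=24):
    """Count events per (day-of-week, hour-of-day) cell for one user.

    Assumption: the grid layout (7 x 24) is illustrative; the original
    slides do not specify the axes of the heatmap.
    """
    grid = [[0] * n_hours for _ in range(n_days)]
    for ts in timestamps:
        # weekday(): Monday is 0, Sunday is 6
        grid[ts.weekday()][ts.hour] += 1
    return grid


# Hypothetical call records for a single user (illustrative values only).
calls = [
    datetime(2014, 10, 6, 9, 15),   # Monday, 09:15
    datetime(2014, 10, 6, 9, 40),   # Monday, 09:40
    datetime(2014, 10, 8, 20, 5),   # Wednesday, 20:05
]
heatmap = activity_heatmap(calls)
```

One grid per activity type (calls, SMS, data, top-ups) could be stacked as channels, although that is also an assumption rather than something the slides state.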
Network data is also considered
We also include network features, such as the number of churners connected to a node
Activity of three distinct users
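The network feature mentioned here (number of churners connected to a node) can be sketched with the standard library alone. The slides list igraph for graph analysis; the version below is a plain-Python stand-in, and the edge list, user labels, and function name are hypothetical.

```python
from collections import defaultdict


def churner_neighbor_counts(edges, churners):
    """For each user, count how many direct contacts are known churners.

    edges: iterable of (user_a, user_b) pairs, treated as undirected.
    churners: set of users labelled as churned.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {
        user: sum(1 for other in contacts if other in churners)
        for user, contacts in neighbors.items()
    }


# Hypothetical call graph: an edge means two users contacted each other.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
churners = {"B", "D"}
counts = churner_neighbor_counts(edges, churners)
```

Here user C is connected to two churners (B and D), which is the kind of signal that would be added to that user's feature vector.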
Approach: Convolutional Neural Network
INPUT: User activity heatmap
OUTPUT: Churn / no churn
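To illustrate what a convolutional layer does with an activity heatmap, here is a dependency-free sketch of a single forward pass: one filter, ReLU, global max pooling, and a sigmoid output. It is not the model from the slides; the kernel, weight, and bias are hand-picked for illustration, whereas a real network would learn them from labelled data.

```python
import math


def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding.

    As in most deep learning frameworks, this is cross-correlation
    (the kernel is not flipped).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def churn_score(heatmap, kernel, weight, bias):
    """One filter -> ReLU -> global max pool -> sigmoid probability."""
    feature_map = conv2d_valid(heatmap, kernel)
    pooled = max(max(0.0, v) for row in feature_map for v in row)
    return 1.0 / (1.0 + math.exp(-(weight * pooled + bias)))


# Hypothetical 3 x 4 heatmap: activity stops after the second column.
heatmap = [
    [4, 4, 0, 0],
    [4, 4, 0, 0],
    [4, 4, 0, 0],
]
# Hand-picked filter that responds to a drop from one column to the next.
drop_kernel = [
    [1, -1],
    [1, -1],
]
feature_map = conv2d_valid(heatmap, drop_kernel)
score = churn_score(heatmap, drop_kernel, weight=0.5, bias=-2.0)
```

The filter responds most strongly where activity falls off, so the pooled value is large for a user who goes quiet. Learned filters play the same role as templates of user activity.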
Results
Method                          AUC (train)   AUC (test)
Random Forest                   0.75          0.74
Extreme Gradient Boosting       0.80          0.76
Variational Autoencoders        0.78          0.75
Convolutional Neural Networks   0.79          0.77
Convolutional Neural Networks have the best test performance (AUC 0.77); Extreme Gradient Boosting has the highest train AUC (0.80)
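For readers unfamiliar with the AUC metric used in the results, the sketch below computes it from its pairwise definition: the fraction of (churner, non-churner) pairs in which the churner receives the higher score, with ties counted as half. The labels and scores are made-up illustrative values, not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for churn, 0 for no churn.
    scores: predicted churn scores, higher means more likely to churn.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Illustrative values only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```

This quadratic-time version is meant for clarity; library implementations use sorting and handle large datasets far more efficiently.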
Some templates of user activity discovered by the neural network
SMS activity per age group
Clustering
Techniques used to cluster and visualize data:
K-means
Self-organized maps (SOM)
tSNE
Visualization of a sample of users with tSNE
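Of the techniques listed, K-means is the simplest to show in a few lines. The following is a minimal sketch of Lloyd's algorithm in plain Python. The initialization (first k points) is a naive choice made only to keep the example deterministic, and the two-feature user vectors are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=20):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    # Assumption: naive deterministic init using the first k points.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments


# Hypothetical per-user features, e.g. (calls per day, SMS per day).
users = [(1.0, 1.0), (9.0, 8.0), (1.5, 2.0), (8.0, 9.0), (1.0, 1.5), (9.0, 9.0)]
centroids, segments = kmeans(users, k=2)
```

In practice the features would be standardized first and the initialization randomized or seeded with a scheme such as k-means++.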
Segmentation with Self Organized Maps
Distance to code-vectors: how stable is the population
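The slides do not define how stability is measured. One plausible reading, shown here as an assumption, is to take a trained SOM codebook, find each user's nearest code-vector (best matching unit), and track the mean distance from month to month. The codebook and user vectors below are hypothetical.

```python
import math


def best_matching_unit(vector, codebook):
    """Return (index, distance) of the code-vector closest to `vector`."""
    distances = [math.dist(vector, code) for code in codebook]
    best = min(range(len(codebook)), key=distances.__getitem__)
    return best, distances[best]


def mean_quantization_error(vectors, codebook):
    """Average distance from each vector to its nearest code-vector."""
    return sum(best_matching_unit(v, codebook)[1] for v in vectors) / len(vectors)


# Hypothetical codebook of a trained SOM with two units.
codebook = [(0.0, 0.0), (10.0, 10.0)]

# Hypothetical user feature vectors in two consecutive months.
month_1 = [(0.0, 1.0), (10.0, 9.0)]
month_2 = [(3.0, 4.0), (10.0, 10.0)]

error_1 = mean_quantization_error(month_1, codebook)
error_2 = mean_quantization_error(month_2, codebook)
```

A rising mean distance, as between the two months here, would suggest that users are drifting away from the segments the map was trained on.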
Conclusions
Deep Convolutional Networks achieve top performance
Network data is very important (who is connected to whom)
We found 5 well-defined segments
Payments are determined by calls, not data
SOMs create relatively stable segments
Inter-community diversity in some cases