phone calls connections relevance to churn in mobile networks
8/13/2019 PHONE CALLS CONNECTIONS RELEVANCE TO CHURN IN MOBILE NETWORKS
PHONE CALLS CONNECTIONS RELEVANCE TO CHURN IN
MOBILE NETWORKS
Niko GAMULIN1, dr. Mitja ŠTULAR1, dr. Sašo TOMAŽIČ2
1Telekom Slovenije, d.d., Cigaletova 15, 1000 Ljubljana
2Faculty of Electrical Engineering, University of Ljubljana, Tržaška 25, 1000 Ljubljana
Abstract: The telecommunications market has reached the mature stage:
the majority of the population in developed areas has already adopted
mobile services and there are many competitors on the market. Churn
prediction has therefore become critical for companies that want to
retain their market share. By analyzing the data that
telecommunications service providers store, originally for billing
purposes, it is possible to observe their users in the context of a
social network and gain additional insights about the spread of
influence relevant to churn. In this paper, we examine the
communication patterns of mobile phone users and subscription plan
logs. Our primary goal is to discover whether it is possible to
determine which users are more likely to churn by observing their
outgoing calls and churn among their neighbors (friends).
Keywords: churn, social network analysis, machine
learning
1. INTRODUCTION
In order to attract new service consumers and retain the existing
ones, telecommunications service providers have been constantly
forming new subscription plans, adapting the terminal equipment offer
according to actual trends and improving the quality of services by
upgrading network equipment. Along with the development of machine
learning methods and their efficiency, service providers from all
industries have become aware of the importance of data, which can be
used to gain additional insights about their service consumers and
consequently target the important customers who are more prone to
churn. Many methods have already been proposed for churn prediction
from past data. In some of these, the user is treated as an
individual, independent of his acquaintances' data, while in others
network effects have been considered as well.
The main question that motivated our research is whether it is
possible to spot a spread of behavior, i.e. churn, among connected
users solely from observing the strength of call connections among
them. In order to construct a social network, we observed the Call
Detail Record (CDR) data along with the subscription plan log to
determine the users' subscription state in the observed time period.
Each user represented a node and the aggregated outgoing calls towards
each neighbor represented a directed edge, weighted with the number of
calls and the sum of the duration of all calls. The observed user's
subscription state, along with his neighbors' subscription states,
served as an indicator of the spread of behavior. As this is a dynamic
process, the time variable is also important. If the time period that
has elapsed between two acts of the same kind performed by two
connected users is large, it might not be appropriate to state that
the second actor followed the first one and that the same state of two
connected users is not a mere coincidence. On the other hand, if the
observation period is too short, we might not notice that the two
connected users performed the same act after some additional time
elapsed. In order to check whether the observation period is relevant,
we observed the users' states over different time period lengths.
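The graph construction described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the record layout (caller, callee, number of calls, duration in seconds) and the name build_call_graph are assumptions made for the example.

```python
from collections import defaultdict


def build_call_graph(daily_records):
    """Aggregate daily call rows into directed, weighted edges.

    Each row is assumed to be (caller, callee, n_calls, duration_s).
    Returns {(caller, callee): (total_calls, total_duration_s)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for caller, callee, n_calls, duration_s in daily_records:
        edge = totals[(caller, callee)]
        edge[0] += n_calls      # edge weight 1: number of calls
        edge[1] += duration_s   # edge weight 2: summed call duration
    return {pair: tuple(weights) for pair, weights in totals.items()}


# Hypothetical daily records for three users.
records = [
    ("A", "B", 2, 120),
    ("A", "B", 1, 30),
    ("B", "A", 1, 60),
    ("A", "C", 3, 300),
]
graph = build_call_graph(records)
```

Because the edges are directed, calls from A to B and from B to A are kept as two separate edges, which matches the use of outgoing calls only in the later analysis.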
As we anticipated that the individual subscriber's choice about churn
is partially motivated by prior churners among his acquaintances, we
tried to determine the acquaintances' impact on the churn decision by
observing the number of phone calls established, the duration of phone
calls and the number of prior churners among the observed subscriber's
acquaintances. In order to demonstrate the relation between the churn
decision and the number of prior churners among acquaintances, along
with the connection strength measured by the relative number and
duration of phone calls, we observed the users in 3D space, defined by
axes that represent the relative number of prior churners among
acquaintances and the number and duration of calls established with
these, relative to all acquaintances. In order to demonstrate the
relation between the observed subscriber's behavior and prior acts of
his neighbors, we calculated and plotted the lift curve to show the
significant influence of prior acts made by acquaintances for
different time period lengths.
2. PROBLEM STATEMENT
Although the awareness of the importance of social networks has
increased significantly along with the spread of online social
networks, such as Facebook, Twitter, Google+ and LinkedIn, the
majority of service providers from non-internet industries haven't
exploited the potential of real social networks of interconnected
people, who influence each other in the real world. While
several online services and retailers have already
developed marketing campaigns based on social networks, the majority
of businesses from other industries still try to attract new customers
and keep the existing ones by treating each of them as an individual,
and mostly invest in broadcast marketing campaigns and form offers on
a global level. Some service providers and retailers are certainly not
able to treat their customers as interconnected peers influenced by
their friends, due to the lack of data needed to represent users as
nodes and form edges between them. On the other hand,
telecommunications service providers have to keep the data on their
customers' phone conversations for charging purposes. These same data
could be used to form a social network of interconnected users and
anticipate how they influence each other.
Our motivation for this research was based on the assumption that
users influence each other over phone conversations and that the power
of influence depends on the duration and number of conversations.
According to this assumption, a user with many friends who have
churned is also more likely to churn. Furthermore, we guessed that if
the observed user is influenced by his peers, he follows them within a
shorter amount of time.
The following section gives an overview of some previously proposed
churn prediction methods in which users are observed in a social
network context.
3. EXISTING SOLUTIONS
In the past, numerous studies have dealt with the churn problem in the
telecommunications services sector. When dealing with the churn
problem, one first has to be aware of the big difference between
prepaid and postpaid users. The former are, as opposed to the latter,
not bound by a contract. In the case of prepaid users, it is easier to
observe users in a social network context and extract the rules for
the diffusion of churn, as these users are free to decide to change
their service plan at any time. A model where prepaid users are
observed in the context of a social network is presented in [1]. On
the other hand, while it is easier to observe prepaid users in the
context of a social network, it is not trivial to determine their
churn status, as such users don't explicitly cancel the subscription
plan. In [2], a model for prepaid user labeling is proposed along with
a churn prediction technique where users are observed as individuals,
without the influence of interconnected users. Dierkes et al. [3] use
Markov Logic Networks (MLNs) to examine whether the churn decisions of
individuals in previous time periods have an impact on other users
with whom the target customer interacted via voice call, short message
service (SMS) or multimedia message service (MMS).
4. THE PROPOSED SOLUTION
The main motivation for this research was to prove that users
connected to each other by phone calls influence each other, and to
determine the importance of the strength of connections for the spread
of churn. Although there are many factors that influence a user's
decision about a subscription plan change, such as service price,
marketing campaigns and special offers from competing providers, we
wanted to prove that the social factor itself plays an important role
and for
5. DATASET
For our analysis, we used anonymised historical data for about 790,000
users and about 42,000,000 aggregated daily call connection records
from CDR for September and October 2010. Each call connection record
contained the number of calls between the caller and the called person
and the sum of call durations for the observed day. Along with the
call connection data, we had available the churn log for the period
from 2005 to 2011, from which it was possible to label each observed
user as either churner or non-churner.
In order to perform the experiment, the original data had to be
reshaped in the following way. First, the list of all active postpaid
users was made from the aggregated daily call records, i.e. all
postpaid callers were selected. With this list, we defined three
different observation period lengths: 60 days, 30 days and 15 days.
For each period length we looped through the list of active users and
for each one computed the total number of called neighbors, the number
of all outgoing calls and the duration sum of all outgoing calls.
Then, if the user churned within the period covered by the observed
CDR data, he was labeled as churner, otherwise as non-churner. Then,
according to the observed period length, all of his neighbors that
churned before him, inside the defined period length, were counted and
the duration and number of outgoing calls to them were summed. Each
neighbor that churned outside the defined period length or after the
observed user was labeled as non-churner and the call data were
treated as a non-churner's, i.e. the relative number of calls and
duration for non-churners was increased. From these data, the relative
values were calculated for the number of neighbors that churned
before, the number of outgoing calls and the sum of duration of
outgoing calls to users that churned before. After reshaping the data,
we performed the experiment described in the following section.
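The reshaping step for a single user might look like the sketch below. It is only an illustration under stated assumptions: the names relative_features, out_edges, churn_dates and reference_date are hypothetical, and the reference date is assumed to be the user's own churn date for churners and the end of the observed CDR period for non-churners, which the text does not specify.

```python
from datetime import date, timedelta


def relative_features(out_edges, churn_dates, reference_date, period_days):
    """Return (x, y, z) for one user.

    out_edges:   {neighbor: (n_calls, duration_s)} for outgoing calls
    churn_dates: {user: churn date} for users found in the churn log
    A neighbor counts as a prior churner only if it churned before
    reference_date and no more than period_days earlier (assumption).
    """
    total_calls = sum(calls for calls, _ in out_edges.values())
    total_duration = sum(dur for _, dur in out_edges.values())
    if not out_edges or total_calls == 0 or total_duration == 0:
        return 0.0, 0.0, 0.0

    window_start = reference_date - timedelta(days=period_days)
    prior = [
        n for n in out_edges
        if n in churn_dates and window_start <= churn_dates[n] < reference_date
    ]
    x = len(prior) / len(out_edges)                            # share of neighbors
    y = sum(out_edges[n][1] for n in prior) / total_duration   # share of duration
    z = sum(out_edges[n][0] for n in prior) / total_calls      # share of calls
    return x, y, z


# Hypothetical user with three neighbors, two of whom churned earlier.
out_edges = {"B": (3, 150), "C": (1, 50), "D": (6, 300)}
churn_dates = {"B": date(2010, 10, 1), "D": date(2010, 9, 5)}
ref = date(2010, 10, 20)
features_30 = relative_features(out_edges, churn_dates, ref, 30)
features_60 = relative_features(out_edges, churn_dates, ref, 60)
```

With the 30-day window only neighbor B counts as a prior churner, while the 60-day window also includes D, so the same user moves much closer to the point (1, 1, 1) when the longer period is used.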
6. EXPERIMENT
Each record from the reshaped data can be represented as a pair of an
input vector of independent variables and an output dependent target
class variable. In our case, the independent variables were the number
of neighbors that churned before, relative to the number of all
neighbors, and the number of calls and the sum of duration of all
calls to neighbors that churned before, relative to all neighbors; the
target variable was the class that represented whether the observed
user has churned or not. As the maximum length of the input vector is
3 and the output class variable has 2 possible states, it is possible,
for simpler visual interpretation, to represent the observed users as
colored points scattered in 3D space (Picture 1). The main aim of
representing observed users in 3D space was to gain intuition for
further analysis; the reduction of observed variables to one can
potentially reduce the quality of results,
while, on the other hand, with a large number of input variables the
result might be difficult to interpret.
Picture 1 - Sample set of users from the
observed dataset, represented as points
in space. Blue points represent non-
churners while red points represent
churners. The sphere with centre point
(1,1,1) represents a classification shape;
all users inside are classified as
churners.
From the visual representation it was possible to draw the following
intuitive conclusions. If many of the observed user's neighbors churn
and if the user spends a relatively long time talking with churners,
the probability that the observed user will churn is higher than in
the case where the observed user doesn't have many neighbors who
churned. Certainly, visual observation might lead to a false
conclusion, and therefore we decided to draw a sphere and observe how
many users of each class (churners and non-churners) were captured
inside or outside the sphere, circle or region (depending on the
number of observed independent variables and the base point) with
varying range from different base points.
With 3 dimensions, represented by the independent input variables, it
is possible to observe the users' churn state depending on either a
single input variable or combinations of 2 or all 3 variable values.
To select a segment of observed users, we first set a base point and
then increased the observed area range from the minimum value, 0, to
the maximum value. For all possible combinations of input variables,
we first set the base point to the origin of the coordinate system and
then to the point furthest away from the origin of the coordinate
system. In the first case, the observed area consisted of the points
whose distance from the origin of the coordinate system was greater
than or equal to the current range. In the second case, where the base
point was set to the point furthest away from the origin of the
coordinate system, the observed area consisted of the points whose
distance from the base point was smaller than or equal to the current
range. Depending on the number of observed variables, the observed
area was defined by a line, a circle (Picture 2) or a sphere.
The following description of the experiment is limited to the case of
selecting the point farthest from the origin of the coordinate system
as the base point, as this model achieved better results, although the
difference was not significant.
Picture 2 - Examples of observation area for 2 input variables, marked
with grey color: relative number of neighbors churned before and
relative duration of calls to neighbors that churned before, for base
points set to the origin of the coordinate system (a) and the point
furthest away, (1,1) (b). In case of (a), the observation area
consists of the points whose distance from the base point (0,0) is
larger than or equal to R, whereas in case of (b), the observation
area consists of the points whose distance from the base point (1,1)
is smaller than or equal to R.
In the case of observing a single variable, while its value was
gradually increased from the minimum value, 0, to the maximum value,
1, the users' churn states inside and outside the range were observed
and the percentage of churners and non-churners inside and outside the
range was calculated. Similarly, in the case of combinations of 2
independent variables, the circle center was set at point (1, 1) and
the circle radius, which represented the observed range, was gradually
increased from 0 (the base point itself) to the value √2, the largest
possible distance when both observed variables have maximum value 1.
In the case of the sphere, the center was set at point (1, 1, 1) and
the radius was increased from 0 to √3.
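For 2 input variables, the distance d of a point (x, y) from the base point B is
$$
d = \begin{cases}
\sqrt{x^2 + y^2}, & B = (0,0) \\
\sqrt{(1-x)^2 + (1-y)^2}, & B = (1,1)
\end{cases}
\qquad (1)
$$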
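The selection of an observation area for a given range can be sketched as below. This is a hedged illustration of the two cases described in the text, not the authors' code; the function names and the sample points are hypothetical.

```python
import math


def in_observation_area(point, base, r):
    """True if `point` lies in the observation area for range r."""
    d = math.dist(point, base)
    if all(coord == 0 for coord in base):
        return d >= r   # base at the origin: keep points at distance >= r
    return d <= r       # base at the far corner: keep points at distance <= r


def captured_counts(points, is_churner, base, r):
    """Return (churners, non_churners) captured inside the observation area."""
    churners = non_churners = 0
    for point, churned in zip(points, is_churner):
        if in_observation_area(point, base, r):
            if churned:
                churners += 1
            else:
                non_churners += 1
    return churners, non_churners


# Three hypothetical users described by two relative variables.
points = [(0.9, 0.8), (0.1, 0.0), (0.5, 0.5)]
is_churner = [True, False, False]
near_corner = captured_counts(points, is_churner, base=(1, 1), r=0.3)
from_origin = captured_counts(points, is_churner, base=(0, 0), r=0.6)
```

Sweeping r from 0 to the maximum distance and recording the counts at each step yields the percentages of churners and non-churners inside and outside the range; the same functions work for 1 or 3 variables because math.dist accepts points of any equal dimension.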
Having calculated the percentage of churners and non-churners inside
and outside the observed range for each combination of input variables
at each step, it is possible to draw conclusions about how the
observed user's
interaction with his neighbors and the neighbors' churn affect the
observed user's decision about churn. To measure the relevance of
different segmentations, based on the combination of independent
variables, we used some standard data mining measures, which are
described in the following section.
7. RESULTS
In this section, we present the results of our experiment for
different observation period lengths, comparing different combinations
of observed variables for churner segmentation.
For each defined time period length, we present and discuss the
results of all combinations and present the Receiver Operating
Characteristic (ROC) along with precision and recall values for a
selected fraction of segmented users.
The aforementioned measures are defined as follows. Let TP be the true
positives, TN the true negatives, FP the false positives and FN the
false negatives. In this experiment, TP represents the number of
churners captured inside the observed range, FP the non-churners
inside the observed range, TN the non-churners outside the observed
range and FN the churners outside the observed range. Precision is
defined as the fraction of retrieved instances that are relevant
(Equation 2), while recall is the fraction of relevant instances that
are retrieved (Equation 3).
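$$
\text{Precision} = \frac{|\{\text{relevant}\} \cap \{\text{retrieved}\}|}{|\{\text{retrieved}\}|} = \frac{TP}{TP + FP} \qquad (2)
$$
$$
\text{Recall} = \frac{|\{\text{relevant}\} \cap \{\text{retrieved}\}|}{|\{\text{relevant}\}|} = \frac{TP}{TP + FN} \qquad (3)
$$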
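As a small worked example of Equations 2 and 3, the sketch below computes both measures from the counts inside and outside the observed range. The function name and the counts are hypothetical and are not taken from the dataset.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical segment: 500 users inside the range, 30 of them churners,
# and 70 further churners left outside the range.
precision, recall = precision_recall(tp=30, fp=470, fn=70)
```

Here 6% of the captured users are churners (precision) and 30% of all churners are captured (recall). TN does not enter either measure, which is why precision and recall remain informative when non-churners vastly outnumber churners.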
Having calculated precision and recall values for different fractions
of the population captured inside the observation area, it is possible
to represent these values graphically (Picture 3) and interpret the
significance of the segmentation against random selection. In the case
of labeling all users as churners, the precision would be equal to the
actual churn rate and the recall value would be equal to 1, as all
churners among all users would be selected. In the case of random
selection, the precision value is always close to the actual churn
rate, while the recall value increases proportionally with the
selected population size, assuming that churners and non-churners are
equally distributed among the population. When criteria are defined to
select a specific segment of the population with the aim of increasing
the precision and recall values, the segmentation efficiency can be
measured by comparing the values for segmented users with the values
for random selection. Certainly, it is very difficult to design a
perfect segmentation model valid for general usage, and therefore in
real models there is a certain amount of samples, in this case
non-churners, that are classified as churners.
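The random-selection baseline above follows directly from Equations 2 and 3. The short derivation below uses notation that is not in the original and is introduced only for illustration: N is the number of users, c the actual churn rate and q the fraction of users selected at random.

```latex
% Assumed notation (not from the paper): N users, churn rate c, selected fraction q.
\begin{align*}
TP &\approx q\,c\,N, \qquad FP \approx q\,(1-c)\,N, \qquad FN \approx (1-q)\,c\,N \\
\text{Precision} &= \frac{TP}{TP+FP} \approx \frac{q\,c\,N}{q\,N} = c \\
\text{Recall} &= \frac{TP}{TP+FN} \approx \frac{q\,c\,N}{c\,N} = q
\end{align*}
```

Setting q = 1 gives the case of labeling all users as churners: precision equals c and recall equals 1.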
Picture 3 - Precision and recall values for random selection, ideal segmentation and segmentation
with observation of prior neighbor churners rate, relative number of calls to neighbors that
churned before and relative duration of calls to neighbors that churned before, with time period
length 60 days
Picture 4 - ROC curve and AuC, colored with grey, for observation of prior neighbor churners
rate, relative number of calls to neighbors that churned before and relative duration of calls to
neighbors that churned before, with time period length 60 days
Besides the precision and recall values, a useful measure for model
evaluation is the Area under Curve value (AuC), which is derived from
the Receiver Operating Characteristic (ROC) [4]. The ROC curve is a
graphical plot which illustrates the performance of binary
classification for different rates of captured users and enables the
observer to visually estimate the cost/benefit ratio of the
segmentation model for a selected size of population. The best
possible prediction method would yield a point in the upper left
corner, at coordinate (0, 1) of the ROC space, in which case all
actual churners would be selected (TP) and no non-churners would be
selected (FP). The actual ROC curve values depict relative trade-offs
between the benefit of selecting actual churners and the cost of
classifying actual non-churners as churners (FP). The AuC value is the
proportion of the area of the unit square that lies under the ROC
curve and is equivalent to the probability that the classifier will
rank a randomly chosen positive instance (churner in this case) higher
than a randomly chosen negative instance (non-churner). As random
guessing produces the diagonal line between (0,0) and (1,1), which
splits the whole observation space in half, the AuC value for random
guessing is equal to 0.5. As in this discussion we observed the
difference between random guessing and using the classification model,
we calculated the size of the area between the random guessing curve
and the actual classification model ROC curve, and report this area as
AuC, as shown in [4]. Similarly, besides the AuC value derived from
the ROC curve, we used the recall values of random guessing and of the
actual model to calculate the size of the area between the actual
model recall curve and the random guessing recall curve, reported as
AuC (Recall). By adjusting the capture range, which in this case
represents a side of a rectangle when observing 1 dimension, the
circle radius for 2 dimensions and the sphere radius for 3 dimensions,
it is not possible to capture an exact percentage of users, and
therefore we used the AuC for model estimation. The precision, recall,
AuC and AuC (Recall) values for different combinations of input
variables, for an observation period length of 60 days, are listed in
Table 1.
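The area between the ROC curve and the random-guessing diagonal can be computed as in the sketch below. It is an illustration under assumptions, not the procedure used to produce the reported values: users are assumed to carry a score that is higher the closer they are to the base point, and the function names and sample data are hypothetical.

```python
def roc_points(scores, labels):
    """ROC points (FPR, TPR) obtained by lowering the score threshold step by step."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, is_churner) in enumerate(ranked):
        if is_churner:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last user of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def area_under(points):
    """Trapezoidal area under a piecewise-linear curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Six hypothetical users: higher score means closer to the base point.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 0, 0, 0]  # 1 = churner, 0 = non-churner
auc = area_under(roc_points(scores, labels))
area_above_diagonal = auc - 0.5  # area between the ROC curve and random guessing
```

In this toy example 7 of the 8 churner/non-churner pairs are ranked correctly, which matches the probabilistic interpretation of the area under the ROC curve given in the text.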
Dimensions  AuC      Precision (% of users)        Recall (% of users)   AuC (Recall)
                     ~5% (actual)   ~10% (actual)  ~5%        ~10%
x           0.13075  0.0377 (5)     0.0237 (9.6)   0.2814     0.3397     0.12987
y           0.1353   0.0419 (4.93)  0.0299 (7.3)   0.3081     0.3264     0.1344
z           0.1368   0.0443 (4.72)  0.0263 (8.55)  0.3126     0.3362     0.13588
x, y        0.13395  0.0425 (4.86)  0.0238 (9.53)  0.3084     0.3393     0.13305
x, z        0.13466  0.0422 (4.99)  0.0238 (9.55)  0.3147     0.3393     0.13376
y, z        0.13643  0.0446 (4.64)  0.0271 (8.25)  0.3095     0.334      0.13551
x, y, z     0.02355  0.0425 (4.95)  0.0236 (9.64)  0.3143     0.3397     0.02359
Table 1 - Model performance for the time period length of 60 days, using different
combinations of input variables, where x represents the relative number of neighbors that churned
before, y the relative duration of calls to neighbors that churned before and z the relative number
of calls to neighbors that churned before
Comparing the results for different period lengths, we can see that
the number of calls to neighbors that churned before, relative to the
total number of calls, is as a single input variable the best
predictor for period lengths of 60 and 30 days, while it is close to
the best one for the 15-day period length as well, where the best
predictor is the relative duration of calls to neighbors that churned
before.
By comparing model performances for different period lengths and the
same combinations of input variables, we can see that the model
achieves the best values for the 60-day period length. To describe the
model's usefulness in practice, we can consider the case of observing
the relative number of neighbors that churned before and the relative
number of calls to neighbors that churned before for the time period
of 60 days. For this case, if we set the range threshold to a value
for which around 5% (4.99%) of segmented users are captured and
treated as churners, the AuC value is equal to 0.13466, precision is
equal to 0.0422, recall is equal to 0.3147 and AuC (Recall) is equal
to 0.13376.
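To put these figures in perspective, the sketch below compares them with random selection. It is a back-of-the-envelope calculation, not a result reported by the authors: the function names are hypothetical, the inputs are the rounded values quoted above, and the implied churn rate follows only from the definitions of precision and recall.

```python
def lift_over_random(recall, captured_fraction):
    """How many times more churners are captured than by random selection."""
    # Random selection of a fraction q of users has an expected recall of q.
    return recall / captured_fraction


def implied_churn_rate(precision, recall, captured_fraction):
    """Churn rate implied by TP = precision * selected = recall * churners."""
    return precision * captured_fraction / recall


# Values quoted for the combination x, z with the 60-day period.
lift = lift_over_random(recall=0.3147, captured_fraction=0.0499)
rate = implied_churn_rate(precision=0.0422, recall=0.3147, captured_fraction=0.0499)
```

Under these assumptions the segment of about 5% of users contains roughly 6.3 times as many churners as a random sample of the same size, and the implied overall churn rate is roughly 0.67%, which explains why the precision remains low in absolute terms.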
8. CONCLUSIONS AND FUTURE
RESEARCH DIRECTIONS
Where the majority of the population has already adopted mobile
services, it is critical to implement churn prediction methods in
order to retain market share. Besides observing the behavior of users
as individuals, it is crucial to discover patterns and rules that hold
for a network of interconnected users. In this research, we proved
that the observed user's behavior in terms of churn depends on his
neighbors' prior behavior.
When users are observed as individuals, the importance of many users
might be overlooked. From billing records, the service provider can
measure a user's importance by the amount of monthly charges, whereas
users who are not active do not stand out in this case but are
nevertheless important if they receive many incoming calls and
therefore indirectly generate significant profit as well. If the churn
of such users was prevented, the spread of churn to active users, who
directly generate profit, could be prevented by targeting the
influential neighbors.
As the existence of influence among connected users has been proved,
our plan for future research is to observe the connections over a
longer time period and to distinguish the contribution of each
neighbor's influence on the observed user separately.
9. ACKNOWLEDGEMENTS
The authors would like to thank Telekom Slovenije for cooperation. The
work was supported in part by the Ministry of Education, Science,
Culture and Sport of Slovenia and the Slovenian Research Agency.
Special thanks go to the European Union for partly financing a young
researcher training program from the European Social Fund, under the
Operational Programme Human Resources Development for the period
2007-2013.
10. REFERENCES
[1] K. Dasgupta, R. Singh, B. Viswanathan, D. Chakraborty,
S. Mukherjea, A. A. Nanavati, and A. Joshi, "Social ties and their
relevance to churn in mobile telecom networks," presented at the
Proceedings of the 11th international conference on Extending
database technology: Advances in database technology, Nantes,
France, 2008.
[2] L. Alberts, I. R. L. M. Peeters, R. Braekers, and C. Meijer,
"Churn Prediction in the Mobile Telecommunications Industry,"
Citeseer.
[3] T. Dierkes, M. Bichler, and R. Krishnan, "Estimating the effect of
word of mouth on churn and cross-buying in the mobile phone market
with Markov logic networks," Decision Support Systems, vol. 51,
pp. 361-371, 2011.
[4] T. Fawcett, "An introduction to ROC analysis," Pattern recognition
letters, vol. 27, pp. 861-874, 2006.