phone calls connections relevance to churn in mobile networks


PHONE CALLS CONNECTIONS RELEVANCE TO CHURN IN MOBILE NETWORKS

Niko GAMULIN¹, dr. Mitja ŠTULAR¹, dr. Sašo TOMAŽIČ²

¹Telekom Slovenije, d.d., Cigaletova 15, 1000 Ljubljana
²Faculty of Electrical Engineering, University of Ljubljana, Tržaška 25, 1000 Ljubljana

[email protected]

Abstract: As the telecommunications market has reached the mature stage, the majority of the population in developed areas has already adopted mobile services, and there are many competitors on the market, churn prediction has become critical for companies that want to retain their market share. By analyzing the data that telecommunications service providers store, originally for billing purposes, it is possible to observe their users in the context of a social network and gain additional insights about the spread of influence relevant to churn. In this paper, we examine the communication patterns of mobile phone users and subscription plan logs. Our primary goal is to discover whether it is possible to determine which users are more likely to churn by observing their outgoing calls and the churn among their neighbors (friends).

Keywords: churn, social network analysis, machine

learning

1. INTRODUCTION

In order to attract new service consumers and retain the existing ones, telecommunications service providers have been constantly forming new subscription plans, adapting the terminal equipment offer according to current trends and improving the quality of services by upgrading network equipment. Along with the development of machine learning methods and their efficiency, service providers from all industries have become aware of the importance of data, which can be used to gain additional insights about their service consumers and consequently to target the important customers who are more prone to churn. Many methods have already been proposed for churn prediction using past data. In some of these, the user is treated as an individual, independent of his acquaintances' data, while in others network effects are considered as well.

The main question that motivated our research is whether it is possible to spot a spread of behavior, i.e. churn, among connected users solely by observing the strength of call connections among them. In order to construct a social network, we observed the Call Detail Record (CDR) data along with the subscription plan log to determine each user's subscription state in the observed time period. Each user represented a node, and the aggregated outgoing calls towards each neighbor represented a directed edge, weighted with the number of calls and the sum of the durations of all calls. The observed user's subscription state, along with his neighbors' subscription states, served as an indicator of the spread of behavior. As this is a dynamic process, the time variable is also important. If the time period that elapsed between two acts of the same kind performed by two connected users is large, it might not be appropriate to state that the second actor followed the first one and that the same state of the two connected users is not a mere coincidence. On the other hand, if the observation period is too short, we might not notice that the two connected users performed the same act after some additional time elapsed. In order to check whether the observation period length is relevant, we observed the users' states over different time period lengths.
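The graph construction described above can be sketched as follows. This is only a minimal illustration, assuming each aggregated record is a tuple of caller, called person, number of calls and total duration; the function name `build_call_graph` and the record layout are hypothetical and not taken from the original data set.

```python
from collections import defaultdict


def build_call_graph(records):
    """Aggregate call records into a directed, weighted graph.

    records: iterable of (caller, callee, n_calls, duration) tuples
             (hypothetical layout, chosen for illustration).
    Returns {caller: {callee: [total_calls, total_duration]}}.
    """
    graph = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for caller, callee, n_calls, duration in records:
        edge = graph[caller][callee]  # directed edge caller -> callee
        edge[0] += n_calls            # weight 1: number of calls
        edge[1] += duration           # weight 2: sum of call durations
    return graph


# Toy daily records (invented values)
records = [
    ("A", "B", 2, 120),
    ("A", "B", 1, 30),
    ("A", "C", 1, 60),
    ("B", "A", 4, 300),
]
graph = build_call_graph(records)
```

Keeping both weights on the same edge makes it possible to derive the call-count and call-duration measures later without a second pass over the records.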

As we anticipated that an individual subscriber's decision to churn is partially motivated by prior churners among his acquaintances, we tried to determine the acquaintances' impact on the churn decision by observing the number of phone calls established, the duration of the phone calls and the number of prior churners among the observed subscriber's acquaintances. In order to demonstrate the relationship between the churn decision and the number of prior churners among acquaintances, along with the connection strength measured by the relative number and duration of phone calls, we observed the users in 3D space. Its axes represented the relative number of prior churners among acquaintances and the number and duration of calls established with these churners, relative to all acquaintances. In order to demonstrate the relationship between the observed subscriber's behavior and the prior acts of his neighbors, we calculated and plotted the lift curve to show the significant influence of prior acts made by acquaintances for different time period lengths.

2. PROBLEM STATEMENT

Although awareness of the importance of social networks has increased significantly along with the spread of online social networks, such as Facebook, Twitter, Google+ and LinkedIn, the majority of service providers from non-internet industries have not exploited the potential of real social networks of interconnected people who influence each other in the real world. While several online services and retailers have already


developed marketing campaigns based on social networks, the majority of businesses from other industries still try to attract new customers and keep the existing ones by treating each of them as an individual; they mostly invest in broadcast marketing campaigns and form offers on a global level. Some service providers and retailers are certainly not able to treat their customers as interconnected peers influenced by their friends, because they lack the data needed to represent users as nodes and form edges between them. Telecommunications service providers, on the other hand, have to keep data about their customers' phone conversations for charging purposes. The same data could be used to form a social network of interconnected users and to anticipate how they influence each other.

Our motivation for this research was based on the assumption that users influence each other over phone conversations and that the power of influence depends on the duration and number of conversations. According to this assumption, a user with many friends who have churned is also more likely to churn. Furthermore, we hypothesized that if the observed user is influenced by his peers, he follows them within a shorter amount of time.

The following section gives an overview of some previously proposed churn prediction methods in which users are observed in a social network context.

3. EXISTING SOLUTIONS

Numerous studies have dealt with the churn problem in the telecommunications services sector. When dealing with the churn problem, one first has to be aware of the big difference between prepaid and postpaid users. The former, as opposed to the latter, are not bound by a contract. In the case of prepaid users, it is easier to observe users in a social network context and extract the rules for the diffusion of churn, as these users are free to decide to change their service plan at any time. A model where prepaid users are observed in the context of a social network is presented in [1]. On the other hand, while it is easier to observe prepaid users in the context of a social network, it is not trivial to determine their churn status, as such users don't explicitly cancel the subscription plan. In [2], a model for prepaid user labeling is proposed along with a churn prediction technique where users are observed as individuals, without the influence of interconnected users. Dierkes et al. [3] use Markov Logic Networks (MLNs) to examine whether the churn decisions of individuals in previous time periods have an impact on the other users with whom they interacted via voice call, short message service (SMS), or multimedia message service (MMS).

4. THE PROPOSED SOLUTION

The main motivation for this research was to prove that users, connected among each other with phone calls [...] and to determine the importance of the strength of connections for the spread of churn. Although there are many factors that influence a user's decision about a subscription plan change, such as service price, marketing campaigns and special offers from competing providers, we wanted to prove that the social factor itself plays an important role and for [...]

5. DATASET

For our analysis, we used anonymised historical data for about 790,000 users and about 42,000,000 aggregated daily call connection records from the CDR for September and October 2010. Each call connection record contained the number of calls between the caller and the called person and the sum of call durations for the observed day. Along with the call connection data, we had available the churn log for the period from 2005 to 2011, from which it was possible to label each observed user either as a churner or a non-churner.

In order to perform the experiment, the original data had to be reshaped in the following way. First, a list of all active postpaid users was made from the aggregated daily call records, i.e. all postpaid callers were selected. We then defined three different observation period lengths: 60 days, 30 days and 15 days. For each period length, we looped through the list of active users and, for each one, determined the total number of called neighbors, the number of all outgoing calls and the total duration of all outgoing calls. If the user churned within the period covered by the observed CDR data, he was labeled as a churner, otherwise as a non-churner. Then, according to the observed period length, all of his neighbors who churned before him, inside the defined period length, were counted, and the number and duration of outgoing calls to them were summed. Each neighbor who churned outside the defined period length or after the observed user was labeled as a non-churner, and the corresponding call data were treated as calls to non-churners, i.e. the relative number of calls and duration for non-churners was increased. From these data, relative values were calculated for the number of neighbors who churned before, the number of outgoing calls to them and the sum of the durations of outgoing calls to them. After reshaping the data, we performed the experiment described in the following section.
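The reshaping step can be illustrated with a short sketch. It is not the original implementation: the function name, the argument layout and, in particular, the choice of the end of the CDR window as the reference date for non-churners are assumptions made for illustration only.

```python
from datetime import date, timedelta


def neighbor_churn_features(edges, churn_dates, user, period_days, window_end):
    """Return the three relative values (x, y, z) for one user.

    edges:       {neighbor: (n_calls, duration)} for outgoing calls of `user`
    churn_dates: {user_id: date} for every user found in the churn log
    x: share of neighbors who churned before, y: share of call duration
    to them, z: share of calls to them.
    """
    # Assumption: churners are referenced to their own churn date,
    # non-churners to the end of the observed CDR window.
    reference = churn_dates.get(user, window_end)
    earliest = reference - timedelta(days=period_days)
    # Neighbors who churned before the user and inside the period length;
    # all other neighbors are treated as non-churners.
    prior = [n for n in edges
             if n in churn_dates and earliest <= churn_dates[n] < reference]
    total_calls = sum(calls for calls, _ in edges.values())
    total_duration = sum(dur for _, dur in edges.values())
    if not edges or total_calls == 0 or total_duration == 0:
        return (0.0, 0.0, 0.0)
    x = len(prior) / len(edges)
    y = sum(edges[n][1] for n in prior) / total_duration
    z = sum(edges[n][0] for n in prior) / total_calls
    return (x, y, z)


# Toy example (invented values): B churned 15 days before A, C long before.
edges = {"B": (3, 150), "C": (1, 50)}
churn_dates = {"A": date(2010, 10, 20), "B": date(2010, 10, 5),
               "C": date(2010, 8, 1)}
features = neighbor_churn_features(edges, churn_dates, "A", 30,
                                   date(2010, 10, 31))
```

With a 30-day period only B counts as a prior churner, so half of the neighbors, three quarters of the call duration and three quarters of the calls fall on prior churners.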

6. EXPERIMENT

Each record from the reshaped data can be represented as a pair consisting of an input vector of independent variables and an output dependent target class variable. In our case, the independent variables were the number of neighbors who churned before, relative to the number of all neighbors, and the number of calls and the sum of durations of all calls to neighbors who churned before, relative to all neighbors. The target variable was the class that represented whether the observed user has churned or not. As the maximum length of the input vector is 3 and the output class variable has 2 possible states, it is possible, for simpler visual interpretation, to represent the observed users as colored points scattered in 3D space (Picture 1). The main aim of representing the observed users in 3D space was to gain intuition for further analysis; reducing the number of observed variables to one can potentially reduce the quality of the results,


while, on the other hand, with a large number of input variables the results might be difficult to interpret.

Picture 1 - Sample set of users from the observed dataset, represented as points in space. Blue points represent non-churners while red points represent churners. The sphere with centre point (1,1,1) represents a classification shape; all users inside are classified as churners.

From the visual representation it was possible to draw the following intuitive conclusion. If many of the observed user's neighbors churn and if the user spends a relatively long time talking with churners, the probability that the observed user will churn is higher than when the observed user doesn't have many neighbors who churned. Visual observation alone might lead to a false conclusion, and therefore we decided to draw a sphere and observe how many users of each class (churners and non-churners) were captured inside or outside the sphere, circle or region (depending on the number of observed independent variables and the base point), with the range varying from different base points.

With 3 dimensions, represented by the independent input variables, it is possible to observe the users' churn state depending on a single input variable, on a combination of 2 variables or on all 3 variables. To select a segment of observed users, we first set a base point and then increased the range of the observed area from the minimum value, 0, to the maximum value. For all possible combinations of input variables, we first set the base point to the origin of the coordinate system and then to the point furthest away from the origin. In the first case, the observed area consisted of the points whose distance from the origin was greater than or equal to the current range. In the second case, with the base point set to the point furthest away from the origin, the observed area consisted of the points whose distance from the base point was smaller than or equal to the current range. Depending on the number of observed variables, the observed area was defined by a line, a circle (Picture 2) or a sphere.
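The two selection rules can be written down compactly. The sketch below is an illustration under the assumption that every input variable is already scaled to the interval [0, 1]; the function name `captured` is hypothetical.

```python
import math


def captured(point, base, radius):
    """Return True if `point` lies in the observation area.

    Works for 1, 2 or 3 input variables (line, circle or sphere).
    """
    dist = math.dist(point, base)  # Euclidean distance to the base point
    if all(b == 0 for b in base):
        # Base point at the origin: the area lies at or beyond the range.
        return dist >= radius
    # Base point at the far corner (1, ..., 1): the area lies within the range.
    return dist <= radius


# Toy points (invented values)
near_corner = captured((0.9, 0.8), (1, 1), 0.3)   # close to (1,1)
near_origin = captured((0.1, 0.1), (0, 0), 0.3)   # too close to (0,0)
```

In both variants, users with high values of the input variables are the ones that end up inside the observation area.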

The following description of the experiment is limited to the case where the point farthest from the origin of the coordinate system is selected as the base point, as this model achieved better results, although the difference was not significant.

In the case of observing a single variable, its value was gradually increased from the minimum value, 0, to the maximum value, 1; the users' churn states inside and outside the range were observed and the percentage of churners and non-churners inside and outside the range was calculated. Similarly, for combinations of 2 independent variables, the circle center was set at the point (1,1), where both observed variables have their maximum value, 1, and the circle radius, which represented the observed range, was gradually increased from 0 to √2. In the case of the sphere, the center was set at the point (1, 1, 1) and the radius was increased from 0 to √3.

Picture 2 - Examples of the observation area, marked in grey, for 2 input variables (the relative number of neighbors who churned before and the relative duration of calls to neighbors who churned before), with the base point set to the origin of the coordinate system (a) and to the point furthest away, (1,1) (b). In case (a), the observation area consists of the points whose distance from the base point (0,0) is larger than or equal to R, whereas in case (b), it consists of the points whose distance from the base point (1,1) is smaller than or equal to R.

\[
d(x, y) =
\begin{cases}
\sqrt{x^{2} + y^{2}}, & B = (0,0) \\
\sqrt{(1 - x)^{2} + (1 - y)^{2}}, & B = (1,1)
\end{cases}
\qquad (1)
\]

Having calculated the percentage of churners and non-churners inside and outside the observed range for each combination of input variables at each step, it is possible to draw conclusions about how the observed user's


interaction with his neighbors and his neighbors' churn affect the observed user's decision about churn. To measure the relevance of different segmentations, based on the combination of independent variables, we used some standard data mining measures, which are described in the following section.
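The stepwise increase of the range can be sketched as a simple sweep. This is an illustration only: it assumes the base point at the far corner (1, ..., 1), inputs scaled to [0, 1], at least one churner and one non-churner in the sample, and an arbitrarily chosen number of steps; the function name `sweep` is hypothetical.

```python
import math


def sweep(points, labels, steps=10):
    """Return (radius, share of churners inside, share of non-churners inside)
    for each step of the range.

    points: list of equal-length tuples with values in [0, 1]
    labels: list of 0/1 flags, 1 = churner
    """
    dim = len(points[0])
    base = (1.0,) * dim               # far corner of the unit cube
    max_radius = math.sqrt(dim)       # distance from the corner to the origin
    n_churn = sum(labels)
    n_non = len(labels) - n_churn
    rows = []
    for i in range(steps + 1):
        r = max_radius * (i / steps)
        inside = [math.dist(p, base) <= r for p in points]
        churn_in = sum(1 for f, lab in zip(inside, labels) if f and lab)
        non_in = sum(1 for f, lab in zip(inside, labels) if f and not lab)
        rows.append((r, churn_in / n_churn, non_in / n_non))
    return rows


# Toy sample (invented values): one churner near (1,1), two non-churners
points = [(0.95, 0.9), (0.1, 0.2), (0.2, 0.1)]
labels = [1, 0, 0]
rows = sweep(points, labels)
```

In this toy sample the churner is captured at the first non-zero radius while both non-churners are only captured near the maximum radius, which is the kind of separation the experiment looks for.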

7. RESULTS

In this section, we present the results of our experiment for different observation period lengths, comparing different combinations of observed variables for the segmentation of churners.

For each defined time period length, we present and discuss the results of all combinations and present the Receiver Operating Characteristic (ROC) along with precision and recall values for a selected fraction of segmented users.

These measures are defined as follows. Let TP be the true positives, TN the true negatives, FP the false positives and FN the false negatives. In this experiment, TP represents the number of churners captured inside the observed range, FP the non-churners inside the observed range, TN the non-churners outside the observed range and FN the churners outside the observed range. Precision is defined as the fraction of retrieved instances that are relevant (Equation 2), while recall is the fraction of relevant instances that are retrieved (Equation 3):

\[
\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (2)
\]

\[
\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (3)
\]
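As a small illustration of these two definitions, the sketch below counts TP, FP and FN from two parallel lists of flags. The function name and the input layout are assumptions made for this example.

```python
def precision_recall(inside, is_churner):
    """Compute precision and recall for one observation range.

    inside:     flags, True if the user is captured inside the range
    is_churner: flags, True if the user actually churned
    """
    pairs = list(zip(inside, is_churner))
    tp = sum(1 for i, c in pairs if i and c)        # churners inside
    fp = sum(1 for i, c in pairs if i and not c)    # non-churners inside
    fn = sum(1 for i, c in pairs if not i and c)    # churners outside
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy sample (invented values): 3 users captured, 2 actual churners
inside = [True, True, False, False, True]
is_churner = [True, False, True, False, False]
precision, recall = precision_recall(inside, is_churner)
```

Here one of the three captured users is a churner (precision 1/3) and one of the two churners is captured (recall 1/2).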

Having calculated the precision and recall values for different rates of population captured inside the observation area, it is possible to represent these values graphically (Picture 3) and interpret the significance of the segmentation against random selection. If all users were labeled as churners, the recall would be equal to 1, as all churners among all users would be selected, while the precision would be equal to the actual churn rate. In the case of random selection, the precision value is always close to the actual churn rate, while the recall value increases in proportion to the selected population size, assuming that churners and non-churners are evenly distributed among the population. When criteria are defined to select a specific segment of the population with the aim of increasing the precision and recall values, the segmentation efficiency can be measured by comparing the values for the segmented users with the values for random selection. It is very difficult to design a perfect segmentation model that is valid for general usage, and therefore in real models a certain number of samples, in this case non-churners, are classified as churners.

Picture 3 - Precision and recall values for random selection, ideal segmentation and segmentation with observation of the prior neighbor churners rate, the relative number of calls to neighbors that churned before and the relative duration of calls to neighbors that churned before, with a time period length of 60 days


Picture 4 - ROC curve and ΔAuC, colored in grey, for observation of the prior neighbor churners rate, the relative number of calls to neighbors that churned before and the relative duration of calls to neighbors that churned before, with a time period length of 60 days

Besides the precision and recall values, a useful measure for model evaluation is the Area under Curve value (AuC), which is derived from the Receiver Operating Characteristic (ROC) [4]. The ROC curve is a graphical plot which illustrates the performance of binary classification for different rates of captured users and enables the observer to visually estimate the cost/benefit ratio of the segmentation model for a selected size of population. The best possible prediction method would yield a point in the upper left corner, at coordinate (0, 1) of the ROC space, in which case all actual churners (TP) and no non-churners (FP) would be selected. The actual ROC curve values depict relative trade-offs between the benefit of selecting actual churners and the cost of classifying actual non-churners as churners (FP). The AuC value is the proportion of the area of the unit square that lies under the ROC curve and is equivalent to the probability that the classifier will rank a randomly chosen positive instance (a churner in this case) higher than a randomly chosen negative instance (a non-churner). As random guessing produces the diagonal line between (0,0) and (1,1), which splits the whole observation space in half, the AuC value for random guessing is equal to 0.5. As in this discussion we observed the difference between random guessing and using the classification model, we calculated the size of the area between the random guessing curve and the ROC curve of the actual classification model (ΔAuC), as shown in [4]. Similarly, besides the actual ΔAuC value derived from the ROC curve, we used the recall values of random guessing and of the actual model to calculate the size of the area between the recall curve of the actual model and the recall curve of random guessing.

By adjusting the capture range, which represents the side of a rectangle when observing 1 dimension, the circle radius for 2 dimensions and the sphere radius for 3 dimensions, it is not possible to capture an exact percentage of users, and therefore we used the ΔAuC for model estimation. The precision, recall, ΔAuC and ΔAuC (Recall) values for different combinations of input variables, for an observation period length of 60 days, are listed in Table 1.
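One way to compute the area measures described above is the trapezoidal rule over the ROC points. The sketch below is an assumption about how this could be done, not the original implementation: it takes a score per user (for example the negative distance from the base point), assumes no tied scores and at least one user of each class, and reads ΔAuC as the AuC minus the 0.5 obtained by random guessing.

```python
def roc_points(scores, labels):
    """ROC points (FP rate, TP rate), capturing users from the highest score down."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, lab in ranked:
        if lab:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def delta_auc(points):
    """Area between the ROC curve and the random-guessing diagonal."""
    return auc(points) - 0.5


# Toy sample (invented values): two churners, two non-churners
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
points = roc_points(scores, labels)
area = auc(points)
gain = delta_auc(points)
```

In the toy sample, three of the four churner/non-churner pairs are ranked correctly, which matches the AuC of 0.75 and a ΔAuC of 0.25.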

Dimensions  ΔAuC     Precision (% of users)          Recall (% of users)   ΔAuC (Recall)
                     ~5% (actual)    ~10% (actual)   ~5%       ~10%
x           0.13075  0.0377 (5)      0.0237 (9.6)    0.2814    0.3397      0.12987
y           0.1353   0.0419 (4.93)   0.0299 (7.3)    0.3081    0.3264      0.1344
z           0.1368   0.0443 (4.72)   0.0263 (8.55)   0.3126    0.3362      0.13588
x, y        0.13395  0.0425 (4.86)   0.0238 (9.53)   0.3084    0.3393      0.13305
x, z        0.13466  0.0422 (4.99)   0.0238 (9.55)   0.3147    0.3393      0.13376
y, z        0.13643  0.0446 (4.64)   0.0271 (8.25)   0.3095    0.334       0.13551
x, y, z     0.02355  0.0425 (4.95)   0.0236 (9.64)   0.3143    0.3397      0.02359

Table 1 - Model performance for the time period length of 60 days, using different combinations of input variables, where x represents the relative number of neighbors that churned before, y the relative duration of calls to neighbors that churned before and z the relative number of calls to neighbors that churned before


Comparing the results for different period lengths, we can see that the number of calls to neighbors that churned before, relative to the total number of calls, is the best single-variable predictor for period lengths of 60 and 30 days. It is also close to the best one for the 15-day period length, where the best predictor is the relative duration of calls to neighbors that churned before.

By comparing model performance for different period lengths and the same combinations of input variables, we can see that the model achieves the best values for the 60-day period length. To describe the model's usefulness in practice, we can consider the case of observing the relative number of neighbors that churned before and the relative number of calls to neighbors that churned before for the time period of 60 days. In this case, if we set the range threshold to a value for which around 5% (4.99%) of segmented users are captured and treated as churners, the ΔAuC value is equal to 0.13466, precision is equal to 0.0422, recall is equal to 0.3147 and ΔAuC (Recall) is equal to 0.13376.
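To put these figures into perspective, the lift over random selection can be derived from them. The short calculation below is not part of the reported results; it only applies the identity lift = recall / captured fraction = precision / churn rate to the rounded values quoted above, so the outcome is approximate.

```python
# Reported values for x, z at the 60-day period length (rounded in the source)
fraction_captured = 0.0499   # share of users inside the range
precision = 0.0422
recall = 0.3147

# Random selection of the same share would reach recall == fraction_captured,
# so the ratio of the two is the lift of the segmentation.
lift = recall / fraction_captured

# Precision of random selection equals the churn rate, hence
# churn rate = precision / lift (a derived, approximate figure).
implied_churn_rate = precision / lift
```

Under these assumptions the segment contains roughly 6.3 times more churners than a random sample of the same size, and the implied overall churn rate in the observed period is about 0.67%.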

8. CONCLUSIONS AND FUTURE RESEARCH DIRECTIONS

Where the majority of the population has already adopted mobile services, it is critical to implement churn prediction methods in order to retain market share. Besides observing the behavior of each user as an individual, it is crucial to discover patterns and rules that hold for a network of interconnected users. In this research, we proved that the observed user's behavior in terms of churn depends on his neighbors' prior behavior.

When users are observed as individuals, the importance of many users might be overlooked. From billing records, the service provider can measure a user's importance by the amount of monthly charges. Users who are not active do not stand out by this measure, but they are nevertheless important if they receive many incoming calls and therefore indirectly generate significant profit as well. If the churn of such users were prevented by targeting these influential neighbors, the spread of churn to active users, who directly generate profit, could be prevented as well.

As the existence of influence among connected users has been proved, our plan for future research is to observe the connections over a longer time period and to distinguish the contribution of each neighbor's influence on the observed user separately.

9. ACKNOWLEDGEMENTS

The authors would like to thank Telekom Slovenije for cooperation. The work was supported in part by the Ministry of Education, Science, Culture and Sport of Slovenia and the Slovenian Research Agency. Special thanks go to the European Union for partly financing a young researcher training program from the European Social Fund, under the Operational Programme Human Resources Development for the period 2007-2013.

10. REFERENCES

[1] K. Dasgupta, R. Singh, B. Viswanathan, D. Chakraborty, S. Mukherjea, A. A. Nanavati, and A. Joshi, "Social ties and their relevance to churn in mobile telecom networks," presented at the Proceedings of the 11th international conference on Extending database technology: Advances in database technology, Nantes, France, 2008.

[2] L. Alberts, I. R. L. M. Peeters, R. Braekers, and C. Meijer, "Churn Prediction in the Mobile Telecommunications Industry," Citeseer.

[3] T. Dierkes, M. Bichler, and R. Krishnan, "Estimating the effect of word of mouth on churn and cross-buying in the mobile phone market with Markov logic networks," Decision Support Systems, vol. 51, pp. 361-371, 2011.

[4] T. Fawcett, "An introduction to ROC analysis," Pattern recognition letters, vol. 27, pp. 861-874, 2006.
