Dynamics of interactions in a large communication network
Márton Karsai, Mikko Kivelä, Raj Pan, Jari Saramäki, Kimmo Kaski, Albert-László Barabási
János Kertész, Budapest University of Technology and Economics and Aalto University, Helsinki
Support: EU FP7 FET-Open 238597, FiDiPro, OTKA
In collaboration with
Temporal aspects of social behavior from mobile phone data
Transmission on a linear chain
VERY slow
May 29, 2003: callers told the Hungarian Central Office for Combating Catastrophes that people had received SMS messages like: "Large nuclear accident in Paks. Stay home, close doors and windows, don't eat lettuce."
In reality nothing happened.
A police investigation revealed that a nurse had heard children in the kindergarten talking about such things (they had heard them from their parents as a rumor); she called her relatives, the "information" reached a journalist, who started an SMS campaign...
Was the spreading fast or slow?
Outline
- Spreading phenomena in complex networks, small world
- Mobile phone call network: a proxy for the social network. Nodes, links and weights
- The Granovetterian structure of the society
- Event list and modeling spreading
- Different sources of correlations:
  - topology
  - weight-topology
  - daily (weekly) patterns
  - burstiness
  - link-link
- Differentiating between contributions to spreading
- Burst statistics
- Summary
Spreading phenomena in networks
- epidemics (bio- and computer)
- rumors, information, opinion
- innovations
- etc.
Nodes of a network can be:
- Susceptible
- Infected
- Recovered (immune)
Corresponding models: SI, SIR, SIS...
Important: speed of spreading (SLOW)
Spreading curve (SI)
m(t) = N_inf / N_tot
[Figure: m(t) versus time, with early, intermediate and late stages marked]
Spreading in the society
Small world property; “Six Degrees of Separation”, Erdős number, WWW,
collaboration network, Kevin Bacon game etc.
Not only social networks: Internet, genetic transcription, etc.
Distance: length of the shortest path between two nodes.
In many networks the average distance between two arbitrary nodes is small (it grows at most logarithmically with system size).
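The distance definition above can be illustrated with a minimal sketch. The function names and the toy ring network are assumptions made for illustration only; they do not come from the slides.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def average_distance(adj):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0

# Toy example (hypothetical): a ring of 6 nodes, then the same ring with one shortcut.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
shortcut = {i: list(nbrs) for i, nbrs in ring.items()}
shortcut[0].append(3)
shortcut[3].append(0)
```

Adding a single long-range link lowers the average distance, which is the mechanism behind the small-world property.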
Impossible to know – use a proxy:
The usage of mobile phones in the adult population is close to 100%
All interactions are recorded – use call network as a proxy of the network at the societal level
Small world: fast spreading?
There are short, efficient paths. Are they used?
Needed information:
- Structure of the society: network at the societal level
- Local transmission dynamics: detailed description of how information (rumor, opinions, etc.) is transmitted
Constructing social network from mobile phone data
• Over 7 million private mobile phone subscriptions
• Focus: voice calls within the home operator
• Data aggregated from a period of 18 weeks
• Require reciprocity (X→Y AND Y→X) for a link
• Customers are anonymous (hash codes)
• Data from a European mobile operator (20% market share)
• Weights: either call duration or number of calls
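The construction rules listed on this slide (reciprocity required, weight as total duration or number of calls) could be sketched as follows. The record layout `(caller, callee, duration_seconds)` and the function name are assumptions for illustration; the actual data format is not given in the slides.

```python
from collections import defaultdict

def build_network(calls, weight="duration"):
    """Aggregate directed call records into undirected, reciprocated links.

    calls:  iterable of (caller, callee, duration_seconds)  [assumed layout]
    weight: "duration" sums call durations, "count" counts calls
    """
    directed = defaultdict(float)
    for caller, callee, duration in calls:
        directed[(caller, callee)] += duration if weight == "duration" else 1
    directed = dict(directed)

    links = {}
    for (a, b), w in directed.items():
        # Keep a link only if calls exist in both directions (X->Y AND Y->X).
        if a < b and (b, a) in directed:
            links[(a, b)] = w + directed[(b, a)]
    return links

# Hypothetical records: X and Y call each other, X calls Z without reciprocation.
calls = [("X", "Y", 900), ("Y", "X", 300), ("X", "Z", 60)]
links_by_duration = build_network(calls, weight="duration")
links_by_count = build_network(calls, weight="count")
```

The unreciprocated X→Z call is dropped, so only the X-Y link survives.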
[Figure: directed calls between X and Y of 15 min and 5 min are merged into one undirected link X-Y of weight 20 min]
J.-P. Onnela, et al. PNAS 104, 7332-7336 (2007)
J.-P. Onnela, et al. New J. Phys. 9, 179 (2007)
Huge network: proxy for network at societal level
Small world
The strength of weak ties (M.Granovetter, 1973)
Hypothesis about the small scale (micro-) structure of the society:
1. “The strength of a tie is a (probably linear) combination of the amount of time, the emotional intensity, the intimacy (mutual confiding), and the reciprocal services which characterize the tie.”
2. “The stronger the tie between A and B, the larger the proportion of individuals S to whom both are tied.”
Consequences on large (macro-) scale:
Society consists of strongly wired communities linked by weak ties. The latter hold the society together.
Granovetter, Mark S. (May 1973), "The Strength of Weak Ties", American Journal of Sociology 78 (6): 1360–1380
Overlap
• Definition: relative neighborhood overlap (topological)
$$O_{ij} = \frac{n_{ij}}{(k_i - 1) + (k_j - 1) - n_{ij}}$$
where n_ij is the number of triangles around edge (v_i, v_j) and k_i, k_j are the degrees of v_i and v_j
• Illustration of the concept:
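As a minimal sketch of the overlap definition, assuming the network is stored as a dictionary mapping each node to the set of its neighbours (a representation chosen here for illustration, not taken from the slides):

```python
def overlap(adj, i, j):
    """Relative neighbourhood overlap O_ij of the edge (i, j).

    n_ij counts common neighbours (triangles around the edge);
    the denominator counts all distinct neighbours of i and j other than i and j.
    """
    n_ij = len(adj[i] & adj[j])
    denom = (len(adj[i]) - 1) + (len(adj[j]) - 1) - n_ij
    return n_ij / denom if denom > 0 else 0.0

# Hypothetical toy graph: a triangle 1-2-3 with a pendant node 4 attached to 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
```

In this toy graph the edge inside the triangle that touches no outside node has overlap 1, while the bridge to the pendant node has overlap 0.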
Empirical Verification
• Let <O>_w denote O_ij averaged over a bin of w-values
• Use the cumulative link weight distribution (the fraction of links with weights less than w'):
$$P_{\mathrm{cum}}(w') = \sum_{w < w'} P(w)$$
• Relative neighbourhood overlap increases as a function of link weight
• This verifies Granovetter's hypothesis for ~95% of the links (exception: top 5% of weights)
Blue curve: empirical network
Red curve: weight randomised network
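The averaging of the overlap over bins of the cumulative weight distribution could be sketched like this. The input layout (a list of `(weight, overlap)` pairs, one per link) and the equal-width binning in P_cum are assumptions for illustration.

```python
from bisect import bisect_left

def cumulative_fraction(weights):
    """Return P_cum: w' -> fraction of links with weight strictly less than w'."""
    ordered = sorted(weights)
    n = len(ordered)
    return lambda w: bisect_left(ordered, w) / n

def mean_overlap_by_bin(weight_overlap_pairs, n_bins):
    """Average overlap in n_bins equal-width bins of P_cum(w)."""
    p_cum = cumulative_fraction([w for w, _ in weight_overlap_pairs])
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for w, o in weight_overlap_pairs:
        b = min(int(p_cum(w) * n_bins), n_bins - 1)
        sums[b] += o
        counts[b] += 1
    # Empty bins are reported as None rather than as zero overlap.
    return [s / c if c else None for s, c in zip(sums, counts)]

# Hypothetical data: overlap grows with weight.
pairs = [(1, 0.0), (2, 0.2), (3, 0.4), (4, 0.6)]
binned = mean_overlap_by_bin(pairs, n_bins=2)
```

Plotting against P_cum(w) rather than w itself spreads the links evenly along the horizontal axis, which is why the slide uses it.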
High Weight Links?
• Weak links: strength of both adjacent nodes (min & max) considerably higher than link weight
• Strong links: strength of both adjacent nodes (min & max) about as high as the link weight
• Indication: high-weight relationships clearly dominate the on-air time of both nodes, others are negligible
• The fraction of time spent communicating with one other person converges to 1 at roughly w ≈ 10^4
• Consequence: less time to interact with others
• This explains the onset of the decreasing trend of <O>_w
Node strength: $s_i = \sum_j w_{ij}$
Plotted quantities: $\min(s_i, s_j)/w_{ij}$ and $\max(s_i, s_j)/w_{ij}$
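A small sketch of the node strength and of the two ratios min(s_i, s_j)/w_ij and max(s_i, s_j)/w_ij, assuming links are stored as a dictionary `{(i, j): w_ij}` (an illustrative choice, not the authors' data structure):

```python
from collections import defaultdict

def strengths(links):
    """Node strength s_i: sum of the weights of all links attached to node i."""
    s = defaultdict(float)
    for (i, j), w in links.items():
        s[i] += w
        s[j] += w
    return dict(s)

def strength_ratios(links):
    """For each link, (min(s_i, s_j) / w_ij, max(s_i, s_j) / w_ij)."""
    s = strengths(links)
    return {
        (i, j): (min(s[i], s[j]) / w, max(s[i], s[j]) / w)
        for (i, j), w in links.items()
    }

# Hypothetical example: one dominant tie a-b and two weak ties.
links = {("a", "b"): 100.0, ("a", "c"): 1.0, ("b", "d"): 1.0}
ratios = strength_ratios(links)
```

For the dominant tie both ratios are close to 1, meaning that almost all of the communication of both nodes goes through that single link.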
Possible to ask unprecedented questions and even find the answers to them
The study revealed the structure of the network, the interplay between weights and communities, and the relations between local, mesoscopic and global structure
Spreading of information
Knowledge of information diffusion is based on unweighted networks
Use the present network to study diffusion on a weighted network: does the local relationship between topology and tie strength have an effect?
Spreading simulation: infect one node with new information
(1) Granovetterian: p_ij ∝ w_ij
(2) Reference: p_ij ∝ <w>
Spreading is significantly faster on the reference (average weight) network
Information gets trapped in communities in the real network
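The two transmission rules can be compared with a minimal SI sketch on a static weighted network. The proportionality constant `x`, the synchronous update and the function name are assumptions for illustration; the slides only state that p_ij is proportional to w_ij or to <w>.

```python
import random

def si_spread(links, seed_node, x, steps, use_average_weight=False, rng=None):
    """SI spreading where an infected node infects a neighbour with probability
    p_ij = x * w_ij (Granovetterian) or p_ij = x * <w> (reference) per step.

    links: {(i, j): w_ij}; x is a hypothetical proportionality constant.
    Returns the fraction of infected nodes after each step.
    """
    rng = rng or random.Random(0)
    mean_w = sum(links.values()) / len(links)
    nodes = {n for pair in links for n in pair}
    infected = {seed_node}
    curve = [len(infected) / len(nodes)]
    for _ in range(steps):
        newly = set()
        for (i, j), w in links.items():
            # Only links joining an infected and a susceptible node can transmit.
            if (i in infected) != (j in infected):
                p = min(1.0, x * (mean_w if use_average_weight else w))
                if rng.random() < p:
                    newly.add(j if i in infected else i)
        infected |= newly  # synchronous update
        curve.append(len(infected) / len(nodes))
    return curve

# Hypothetical chain a-b-c with one strong and one zero-weight tie.
links = {("a", "b"): 4.0, ("b", "c"): 0.0}
real = si_spread(links, "a", x=0.25, steps=3)
reference = si_spread(links, "a", x=0.25, steps=50, use_average_weight=True)
```

In the Granovetterian case the information cannot cross the weak tie, while in the reference case every link transmits with the same probability.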
Reference
Granovetterian
Small but slow world
We have data about
- who called whom, voice, SMS, MMS
- when
- how long they talked
(+ metadata: gender, age, postal code, mostly used tower, …)
306 million mobile call records of 4.9 million individuals during 4 months with 1 s resolution
M. Karsai et al., http://arxiv.org/abs/1006.2125
voice calls SMS
Time sequence is made periodic
Is this fast or slow? What to compare with?
The problem of null models
A more accurate study of spreading is possible: infect a node (with information, gossip, etc.) at time t0 = 0.
Transmission occurs whenever a call between an infected and an uninfected node takes place (SI model). Watch m(t) = <I(t)/N>, the fraction of infected nodes averaged over initiators.
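A minimal sketch of this event-driven SI process, including the periodic repetition of the time sequence mentioned earlier. The event layout `(t, a, b)`, the `period` parameter and the cap on the number of repetitions are assumptions for illustration.

```python
def si_on_events(events, seed_node, period, max_cycles=10):
    """Deterministic SI spreading along a time-ordered call sequence.

    events: list of (t, a, b) with 0 <= t < period  [assumed layout]
    The sequence is repeated periodically up to max_cycles times.
    Returns {node: infection time}.
    """
    events = sorted(events, key=lambda e: e[0])
    nodes = {n for _, a, b in events for n in (a, b)}
    infected_at = {seed_node: 0.0}
    for cycle in range(max_cycles):
        for t, a, b in events:
            # A call transmits when exactly one of the two parties is infected.
            if (a in infected_at) != (b in infected_at):
                target = b if a in infected_at else a
                infected_at[target] = cycle * period + t
        if len(infected_at) == len(nodes):
            break
    return infected_at

def prevalence(infected_at, n_nodes, t):
    """m(t): fraction of nodes infected at or before time t."""
    return sum(1 for ti in infected_at.values() if ti <= t) / n_nodes

# Hypothetical events: the b-c call happens before b is infected,
# so c is reached only in the second pass through the sequence.
events = [(1, "b", "c"), (2, "a", "b"), (3, "c", "d")]
times = si_on_events(events, "a", period=10)
```

Averaging `prevalence` over many seed nodes would give the m(t) curve; the example shows how the order of events, not just the topology, delays the spreading.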
Correlations influence spreading speed
- Topology (community structure)
- Weight-topology (Granovetter structure)
- Bursty dynamics
- Daily pattern
- Link-link dynamic correlations
A.-L. Barabási, Nature 435, 207 (2005)
Bursty dynamics: inhomogeneous activity patterns
Poissonian
Bursty
Average user
Busy user
Note the different scales
Bursty call patterns for individual users
Daily pattern of call density.
Weekly pattern too (here disregarded)
Calls are non-Poissonian
Scaled inter-event time distribution
Binned according to weights (here: number of calls)
Inset: time shuffled
Dynamic link-link correlations: triggered calls, cascades, etc.
How to identify the effect of the different correlations on the spreading?
Introduce different null models by appropriate shuffling of the data.
Time shuffling
Link1    Link2    Link3    ...  LinkN
t_11     t_21     t_31     ...  t_N1
t_12     t_22     t_32     ...  t_N2
...      ...      ...      ...  ...
t_1n_1   t_2n_2   t_3n_3   ...  t_Nn_N
Destroys burstiness (and link-link correlations) but keeps weight and daily pattern
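One way to sketch the time-shuffled null model: the time stamps of all events are permuted at random while each event keeps its link. The event layout `(t, a, b)` and the function name are assumptions for illustration.

```python
import random

def time_shuffle(events, rng=None):
    """Randomly permute the time stamps among all events.

    Every link keeps its number of events (its weight) and the overall set of
    time stamps is unchanged (so the daily pattern survives), but the bursty
    timing on individual links is destroyed.
    """
    rng = rng or random.Random(0)
    times = [t for t, _, _ in events]
    rng.shuffle(times)
    shuffled = [(t, a, b) for t, (_, a, b) in zip(times, events)]
    return sorted(shuffled, key=lambda e: e[0])

# Hypothetical event list.
events = [(1, "a", "b"), (2, "a", "b"), (3, "b", "c"), (40, "c", "d")]
null_events = time_shuffle(events, random.Random(3))
```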
Link sequence shuffling
Link1    Link2    Link3    ...  LinkN
t_11     t_21     t_31     ...  t_N1
t_12     t_22     t_32     ...  t_N2
...      ...      ...      ...  ...
t_1n_1   t_2n_2   t_3n_3   ...  t_Nn_N
Select random pairs of link sequences and exchange them
Destroys topology-weight and link-link correlations, keeps burstiness
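A minimal sketch of link sequence shuffling, in which whole event sequences are exchanged between randomly chosen pairs of links. It assumes each link appears with a single orientation `(a, b)` in the event list; the number of swaps is a free parameter chosen here for illustration.

```python
import random
from collections import defaultdict

def link_sequence_shuffle(events, n_swaps=None, rng=None):
    """Exchange complete event-time sequences between random pairs of links.

    The timing within each sequence (burstiness) is kept, but a sequence ends
    up on an arbitrary link, so weight-topology correlations are destroyed.
    """
    rng = rng or random.Random(0)
    sequences = defaultdict(list)
    for t, a, b in events:
        sequences[(a, b)].append(t)
    links = list(sequences)
    seqs = [sequences[link] for link in links]
    n_swaps = len(links) if n_swaps is None else n_swaps
    for _ in range(n_swaps):
        i, j = rng.randrange(len(links)), rng.randrange(len(links))
        seqs[i], seqs[j] = seqs[j], seqs[i]
    shuffled = [(t, a, b) for (a, b), seq in zip(links, seqs) for t in seq]
    return sorted(shuffled, key=lambda e: e[0])

# Hypothetical event list with links of different weights.
events = [(1, "a", "b"), (2, "a", "b"), (3, "b", "c"),
          (4, "c", "d"), (5, "c", "d"), (6, "c", "d")]
null_events = link_sequence_shuffle(events)
```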
Equal weight link sequence shuffling
Link1(w)   Link2(w)   Link3(w)   ...  LinkN(w)
t_1'1      t_2'1      t_3'1      ...  t_N'1
t_1'2      t_2'2      t_3'2      ...  t_N'2
...        ...        ...        ...  ...
t_1'n_1'   t_2'n_2'   t_3'n_3'   ...  t_N'n_N'
Destroys link-link correlations
Keeps weight-topology correlations and bursty dynamics
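The equal-weight variant can be sketched by restricting the exchange to links with the same number of events. Taking the weight to be the number of calls and permuting sequences within each weight class are assumptions made for this illustration.

```python
import random
from collections import defaultdict

def equal_weight_link_sequence_shuffle(events, rng=None):
    """Permute event-time sequences only among links with equal event counts.

    Each link keeps its weight (number of events) and every sequence keeps its
    internal timing, but timing correlations between neighbouring links are lost.
    """
    rng = rng or random.Random(0)
    sequences = defaultdict(list)
    for t, a, b in events:
        sequences[(a, b)].append(t)

    by_weight = defaultdict(list)
    for link, seq in sequences.items():
        by_weight[len(seq)].append(link)

    shuffled = []
    for links in by_weight.values():
        donors = links[:]
        rng.shuffle(donors)  # random reassignment within the weight class
        for link, donor in zip(links, donors):
            shuffled.extend((t, link[0], link[1]) for t in sequences[donor])
    return sorted(shuffled, key=lambda e: e[0])

# Hypothetical event list: two links of weight 2 and one link of weight 1.
events = [(1, "a", "b"), (2, "a", "b"), (3, "b", "c"),
          (7, "c", "d"), (9, "c", "d")]
null_events = equal_weight_link_sequence_shuffle(events)
```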
Long time behavior
The role of the daily pattern
Model calculation: take the empirical topology, with weights
Compare homogeneous and inhomogeneous Poissonians
Little effect
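For illustration, homogeneous and inhomogeneous Poisson event sequences can be generated as below. The thinning method and the sinusoidal 24-hour rate are assumptions chosen for the sketch; the slides do not say how the model sequences were produced.

```python
import math
import random

def homogeneous_poisson(rate, t_end, rng):
    """Event times of a Poisson process with constant rate on [0, t_end)."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)
        if t >= t_end:
            return times
        times.append(t)

def inhomogeneous_poisson(rate_fn, rate_max, t_end, rng):
    """Thinning: draw candidates at rate_max, keep each with prob rate_fn(t)/rate_max."""
    candidates = homogeneous_poisson(rate_max, t_end, rng)
    return [t for t in candidates if rng.random() < rate_fn(t) / rate_max]

def daily_rate(t, mean_rate=1.0):
    """Hypothetical daily pattern (t in hours): rate oscillates between 0 and 2 * mean_rate."""
    return mean_rate * (1.0 + math.cos(2.0 * math.pi * t / 24.0))

rng = random.Random(1)
flat = homogeneous_poisson(1.0, 240.0, rng)
daily = inhomogeneous_poisson(daily_rate, 2.0, 240.0, rng)
```

Both sequences have the same mean rate, so any difference in spreading speed between them would come from the daily modulation alone.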
Slowing down mainly due to Granovetterian structure and bursty character of human activity
A closer look at burstiness
Define a bursty period (BP)
In a series of signals, a BP(Δt) is a sequence of signals with an empty period of length Δt both at the beginning and at the end of the sequence
The end is measured from the end of the talk.
Bursty dynamics: a closer look
What is a burst?
Define it relative to a window Δt:
A bursty period (BP) is a sequence of events separated from the rest by empty periods of length at least Δt.
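The definition of a bursty period translates into a short sketch. It treats events as points in time, which is a simplification: call durations are ignored here, and the function names are illustrative.

```python
def bursty_periods(times, dt):
    """Split event times into bursty periods separated by gaps of at least dt."""
    times = sorted(times)
    if not times:
        return []
    bursts = [[times[0]]]
    for prev, cur in zip(times, times[1:]):
        if cur - prev >= dt:
            bursts.append([cur])    # empty period of at least dt: start a new BP
        else:
            bursts[-1].append(cur)  # still inside the current BP
    return bursts

def burst_sizes(times, dt):
    """Number of events E in each bursty period."""
    return [len(b) for b in bursty_periods(times, dt)]

# Hypothetical event times.
times = [0, 1, 2, 10, 11, 30]
periods = bursty_periods(times, dt=5)
```

A histogram of `burst_sizes` over all users gives the distribution of the number of events E per burst for a given Δt.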
Statistics of bursts:
[Figure: frequency of BP length for several values of Δt]
Δt = 10 is too small
Distribution of number of events E in bursts
[Figure: distribution of E with a guide line of slope -4.2]
Autocorrelation of events
Modeling:
1. Independent events:
Whatever the distribution of inter-event times, independent events give P(E = n) ~ exp(-An), in contrast to the observed power law
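The exponential form can be obtained in a few lines. This is a sketch that assumes the inter-event times τ are independent and identically distributed with density P(τ); the symbol q is introduced here for convenience and does not appear in the slides.

```latex
% q: probability that a single inter-event time is shorter than the window \Delta t
q = \int_0^{\Delta t} P(\tau)\, d\tau

% A burst of exactly n events needs n-1 consecutive gaps shorter than \Delta t,
% followed by one gap of at least \Delta t:
P(E = n) = q^{\,n-1}(1 - q)
         = \frac{1 - q}{q}\, e^{-A n},
\qquad A = -\ln q > 0
```

The decay is exponential for any P(τ), so a power-law P(E) signals correlations between consecutive inter-event times.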
Queuing model (Barabási, 2005)
Qualitatively good (power-law waiting times and number of events in BP), but wrong exponents
We have a to-do list that contains tasks in a hierarchical order. The highest-priority task is always executed. Tasks arrive at random and get a hierarchy parameter at random.
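A rough sketch of such a priority-list model is given below. The fixed list length, the uniform priorities and the parameter `p` (probability of executing the highest-priority task, otherwise a random one) are assumptions in the spirit of Barabási (2005); the slide itself only describes the `p = 1` rule.

```python
import random

def priority_queue_waiting_times(n_steps, list_length=2, p=1.0, rng=None):
    """Simulate a fixed-length to-do list and record task waiting times.

    Each step one task is executed: the highest-priority one with probability p,
    otherwise a randomly chosen one. The executed task is replaced by a new task
    with a random priority. Returns the waiting time (in steps) of each executed task.
    """
    rng = rng or random.Random(0)
    # Each task is (priority, step at which it was added to the list).
    tasks = [(rng.random(), 0) for _ in range(list_length)]
    waits = []
    for step in range(1, n_steps + 1):
        if rng.random() < p:
            k = max(range(list_length), key=lambda i: tasks[i][0])
        else:
            k = rng.randrange(list_length)
        waits.append(step - tasks[k][1])
        tasks[k] = (rng.random(), step)
    return waits

waits = priority_queue_waiting_times(10_000, list_length=2, p=0.999)
```

Low-priority tasks can stay on the list for a very long time, which produces a broad waiting-time distribution.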
Summary
- Mobile phone call network used as a proxy for the human network at the societal level
- Structure of the society follows Granovetter's picture (up to 95%)
- Micro and macro structures are related
- Several different types of correlations
- Spreading slowed down mainly by weight-topology correlations and burstiness of human activities
- Bursts are highly correlated events (not explainable by circadian patterns)
- Strong short-time correlations, no model: a new explanation is needed
J.-P. Onnela, et al. PNAS 104, 7332-7336 (2007)
J.-P. Onnela, et al. New J. Phys. 9, 179 (2007)
J. Kumpula et al. PRL 99, 228701 (2007)
M. Karsai et al. arXiv:1006.2125