Scaling, renormalization and self-similarity in complex networks
Chaoming Song (CCNY), Lazaros Gallos (CCNY), Shlomo Havlin (Bar-Ilan, Israel)
Hernan A. Makse
Levich Institute and Physics Dept., City College of New York
Protein interaction network
Are “scale-free” networks really “free-of-scale”?
“If you had asked me yesterday, I would have said surely not” - said Barabasi (Science News, February 2, 2005).
Small world contradicts self-similarity!!!
Small World effect shows that distance between nodes grows logarithmically with N (the network size):
$\langle l \rangle \sim \ln(N)$, or equivalently $N \sim e^{\langle l \rangle / l_0}$
OR
Self-similar = fractal topology is defined by a power-law relation:
$N \sim l^{d_B}$
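The apparent contradiction can be made explicit by inverting the two relations. This is only a restatement of the two laws on the slide, with $l_0$ the characteristic length and $d_B$ the box dimension:

```latex
\langle l \rangle \sim l_0 \ln N
\;\Longleftrightarrow\;
N \sim e^{\langle l \rangle / l_0}
\qquad \text{versus} \qquad
N \sim l^{d_B}
\;\Longleftrightarrow\;
l \sim N^{1/d_B}
```

An exponential grows faster than any power of $l$, so a network obeying the first law for all node pairs cannot be described by a finite $d_B$ using the same length measure.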
How the network behaves under a scale transformation.
WWW nd.edu
300,000 web-pages
R. Albert, et al., Nature (1999)
$P(k) \sim k^{-\gamma}$, $\gamma = 2.3$
Internet connectivity, with selected backbone ISPs (Internet Service Provider) colored separately.
Faloutsos et al., SIGCOMM ’99
Internet
$P(k) \sim k^{-\gamma}$, $\gamma = 2.4$
(Plot: P(k) versus k.)
J. Han et al., Nature (2004)
Yeast Protein-Protein Interaction Map
Individual proteins
Physical interactions from the “filtered yeast interactome” database: 2493 high-confidence interactions observed by at least two methods (yeast two-hybrid). 1379 proteins, <k> = 3.6
Colored according to protein function in the cell: Transcription, Translation, Transcription control, Protein-fate, Genome maintenance, Metabolism, Unknown, etc.
Modular structure according to function!
from MIPS database, mips.gsf.de
Metabolic network of biochemical reactions in E.coli
Chemical substrates
Biochemical interactions: enzyme-catalyzed reactions that transform one metabolite into another.
Modular structure according to the biochemical class of the metabolic products of the organism.
Colored according to product class: Lipids, essential elements, protein, peptides and amino acids, coenzymes and prosthetic groups, carbohydrates, nucleotides and nucleic acids.
H. Jeong, et al., Nature 407, 651 (2000)
How long is the coastline of Norway?
It depends on the length of your ruler.
Fractal Dimension $d_B$: Box Covering Method
Fractals look the same on all scales = “scale-invariant”.
$N_B(l_B) \sim l_B^{-d_B}$
$l_B$: box length
$N_B$: total no. of boxes
Boxing in Biology
How to “zoom out” of a complex network?
Generate boxes where all nodes are within a distance $l_B$
Calculate number of boxes, $N_B$, of size $l_B$ needed to cover the network
$N_B(l_B) \sim l_B^{-d_B}$
We need the minimum number of boxes: NP-complete optimization problem!
Boxing in Biology
Most efficient tiling of the network
(Figure: two tilings, one with 4 boxes and one with 5 boxes.)
8 node network: Easy to solve
300,000 node network: Mapping to graph colouring problem. Greedy algorithm to find minimum boxes
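The mapping to graph colouring can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the convention that two nodes may share a box only if their distance is strictly less than `l_b`, stores all pairwise distances (so it only suits small graphs), and colours nodes in sorted order. The function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def greedy_box_cover(adj, l_b):
    """Map each node to a box id so that nodes sharing a box are closer than l_b.

    Two nodes at distance >= l_b conflict (they may not share a box), so a
    box assignment is a colouring of the conflict graph. Greedy colouring
    gives an upper bound on the minimum number of boxes, not the optimum.
    """
    nodes = sorted(adj)
    dist = {u: bfs_distances(adj, u) for u in nodes}
    conflicts = {u: set() for u in nodes}
    for u, v in combinations(nodes, 2):
        # Unreachable pairs are treated as infinitely far apart.
        if dist[u].get(v, float("inf")) >= l_b:
            conflicts[u].add(v)
            conflicts[v].add(u)
    box_of = {}
    for u in nodes:
        used = {box_of[v] for v in conflicts[u] if v in box_of}
        box = 0
        while box in used:
            box += 1
        box_of[u] = box
    return box_of


# Toy example (assumed, not from the slides): a chain of 8 nodes.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
for l_b in (1, 2, 3, 8):
    print(l_b, len(set(greedy_box_cover(path, l_b).values())))
```

Repeating the count for several values of `l_b` gives the $N_B(l_B)$ curve whose log-log slope is read off as $-d_B$.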
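Larger distances need fewer boxes
$N_B(l_B) \sim l_B^{-d_B}$
(Figure: three coverings of the network, with $N_B = 4$, $N_B = 3$ and $N_B = 2$.)
(Plot: $\log(N_B)$ versus $\log(l_B)$. Fractal: straight line with slope $-d_B$. Non fractal: $d_B \to \infty$.)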
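Reading $d_B$ off the log-log plot amounts to a straight-line fit. The sketch below uses an ordinary least-squares slope on synthetic data; the function name and the sample numbers are illustrative assumptions, and a real analysis would also need to choose the fitting range.

```python
import math


def fit_box_dimension(box_sizes, box_counts):
    """Least-squares slope of log N_B versus log l_B; returns d_B = -slope."""
    xs = [math.log(l) for l in box_sizes]
    ys = [math.log(n) for n in box_counts]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Synthetic data that follows N_B = 1000 * l_B^(-2.3) exactly.
sizes = [2, 3, 4, 6, 8, 12]
counts = [1000 * l ** -2.3 for l in sizes]
print(fit_box_dimension(sizes, counts))
```

For a non-fractal network the points curve downward on this plot (an exponential decay), so the fitted slope keeps growing with the fitting range instead of settling on a value.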
Box covering in yeast: protein interaction network
Most complex networks are Fractal
Song, Havlin, Makse, Nature (2005)
$N(l_B) \sim l_B^{-d_B}$
Metabolic: $d_B = 3.5$
Protein interaction: $d_B = 2.3$
Biological networks
Three domains of life: archaea, bacteria, eukarya
E. coli, H. sapiens, yeast
43 organisms - all scale
Metabolic networks are fractals
Technological and Social Networks TOO
$N(l_B) \sim l_B^{-d_B}$
WWW (nd.edu domain, 300,000 web-pages): $d_B = 4.15$
Hollywood film actors (212,000 actors): $d_B = 6.25$
Other bio networks: Khang and Bremen groups
Internet is not fractal!
Two ways to calculate fractal dimensions
Box covering method: $N_B(l_B) \sim l_B^{-d_B}$, so the average box mass is $\langle M_B \rangle = N/N_B \sim l_B^{d_B}$
Cluster growing method: $M_C(l_C) \sim l_C^{d_f}$
In homogeneous systems (all nodes with similar k, e.g. percolation) both definitions agree: $d_B = d_f$
Box Covering = flat average (power law)
Cluster Growing = biased (exponential)
Different methods yield different results due to heterogeneous topology
Box covering reveals the self similarity. Cluster growth reveals the small world. NO CONTRADICTION! SAME HUBS ARE USED MANY TIMES IN CG.
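For comparison with box covering, the cluster growing measurement can be sketched as below. This is an illustrative version that averages over every node as a seed (the slides do not say how seeds are chosen), and the function names are hypothetical.

```python
from collections import deque


def cluster_mass(adj, seed, l_c):
    """Number of nodes within l_c steps of seed (the seed itself included)."""
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        if dist[u] == l_c:
            continue  # do not expand beyond the cluster radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return len(dist)


def mean_cluster_mass(adj, l_c):
    """Average cluster mass <M_C(l_c)> over every node used as a seed."""
    return sum(cluster_mass(adj, s, l_c) for s in adj) / len(adj)


# Toy example (assumed): one hub with five leaves.
star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
print(mean_cluster_mass(star, 1), mean_cluster_mass(star, 2))
```

In the star, every cluster grown from a leaf already contains the hub at radius 1, which is the bias named on the slide: the same hubs are counted in many clusters, whereas box covering assigns each node to exactly one box.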
Is evolution of the yeast fractal?
Following the phylogenetic tree of life:
Tree labels: Ancestral Prokaryote Cell, Ancestral Eukaryote, Ancestral Fungus, Ancestral yeast, Yeast, Other Fungi, Animals + Plants, Archaea + Bacteria
Time axis: 3.5 billion years ago, 1.5 billion years ago, 1 billion years ago, ~300 million years ago, present day
COG database, von Mering, et al., Nature (2002)
Suggests that present-day networks could have been created following a self-similar, fractal dynamics.
Same fractal dimension and scale-free exponent over 3.5 billion years…
$d_B = 2.3$
Renormalization in Complex Networks
NOW, REGARD EACH BOX AS A SINGLE NODE AND ASK: WHAT IS THE DEGREE DISTRIBUTION OF THE NETWORK OF BOXES AT DIFFERENT SCALES?
Renormalization of WWW network with $l_B = 3$
The degree distribution is invariant under renormalization
Internet is not fractal ($d_B \to \infty$), but it is renormalizable
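One renormalization step can be sketched as follows, assuming a box assignment is already available as a dictionary from node to box id. The rule used here (two boxes are linked if any of their members are linked) is a common choice and is stated as an assumption; the function names are hypothetical.

```python
def renormalize(adj, box_of):
    """Collapse each box to one node; link two boxes if any of their members are linked."""
    coarse = {box: set() for box in set(box_of.values())}
    for u, neighbours in adj.items():
        for v in neighbours:
            bu, bv = box_of[u], box_of[v]
            if bu != bv:
                coarse[bu].add(bv)
                coarse[bv].add(bu)
    return coarse


def degree_histogram(adj):
    """Map degree k to the number of nodes with that degree."""
    hist = {}
    for neighbours in adj.values():
        k = len(neighbours)
        hist[k] = hist.get(k, 0) + 1
    return hist


# Toy example (assumed): a chain of 6 nodes covered by two boxes of 3 nodes.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
boxes = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
coarse = renormalize(chain, boxes)
print(coarse, degree_histogram(coarse))
```

Comparing `degree_histogram` before and after the step, for several box sizes, is the invariance check described on the slide.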
Turning back the time
Repeatedly BOXING the network is the same as going back in time: from a single node to present day.
Can we “predict” the past…? If not the future.
(Diagram: time evolution runs from the ancestral node to the present day network; renormalization runs in the reverse direction.)
THE RENORMALIZATION SCHEME
Evolution of complex networks: opening boxes
How does Modularity arise?
The boxes have a physical meaning = self-similar nested communities
(Diagram: time evolution runs from the ancestral node to the present day network; renormalization runs in the reverse direction.)
How to identify communities in complex networks?
Emergence of Modularity in PIN
Boxes are related to the biologically relevant functional modules in the yeast protein interactome
(Diagram: time evolution from the ancestral cell to the present day network; renormalization in the reverse direction. Modules: translation, transcription, protein-fate, cellular-fate/organization.)
Emergence of modularity in metabolic networks
Appearance of functional modules in E. coli metabolic network.
More robust network than non-fractals.
Theoretical approach
How are the communities/modules linked?
k: degree of the nodes
k': degree of the communities
$k' = s(l_B)\,k$, where k' is the community degree, k the node degree, and $s(l_B) < 1$ a scaling factor
Example (renormalization): k = 8, s = 1/4, k' = 2
Theoretical approach to modular networks: Scaling theory to the rescue
WWW
$k' = s(l_B)\,k$
$s(l_B) \sim l_B^{-d_k}$
The larger the community, the smaller its connectivity
$d_k$: new exponent describing how families link
Scaling relations
A theoretical prediction relating the different exponents
boxes versus distance: $N_B(l_B) \sim l_B^{-d_B}$
degree versus distance: $s(l_B) \sim l_B^{-d_k}$ ($d_k$: new exponent)
new scaling relation: $\gamma = 1 + d_B/d_k$
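A short sketch of why the three exponents are linked. It assumes that $P(k) \sim k^{-\gamma}$ keeps the same exponent after renormalization, that the number of boxes is $N' \sim N\,l_B^{-d_B}$, and that degrees map as $k' = s(l_B)\,k$ with the number of nodes in each degree range conserved; $n(k)$ and $n'(k')$ are notation introduced here for the node counts before and after.

```latex
\begin{align}
n(k)\,dk &= n'(k')\,dk', \qquad n(k) = N\,k^{-\gamma}, \quad n'(k') = N'\,k'^{-\gamma} \\
N\,k^{-\gamma}\,dk &= N'\,(s k)^{-\gamma}\,s\,dk
  \;\Longrightarrow\; \frac{N'}{N} = s^{\gamma - 1} \\
l_B^{-d_B} &\sim l_B^{-d_k(\gamma - 1)}
  \;\Longrightarrow\; \gamma = 1 + \frac{d_B}{d_k}
\end{align}
```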
Scaling relations
The communities also follow a self-similar pattern
Network            dB    dk    1 + dB/dk    γ
WWW                4.1   2.5   2.6          2.6
Actor              6.3   5.3   2.2          2.2
E. coli (PIN)      2.3   2.1   2.1          2.2
H. sapiens (PIN)   2.3   2.2   2.0          2.1
43 Metabolic       3.5   3.2   2.1          2.2
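The predicted column of the table can be recomputed from the first two. The snippet below only redoes that arithmetic with the $d_B$ and $d_k$ values listed above; the dictionary layout and function name are illustrative.

```python
# (d_B, d_k) pairs copied from the table above.
exponents = {
    "WWW": (4.1, 2.5),
    "Actor": (6.3, 5.3),
    "E. coli (PIN)": (2.3, 2.1),
    "H. sapiens (PIN)": (2.3, 2.2),
    "43 Metabolic": (3.5, 3.2),
}


def predicted_gamma(d_b, d_k):
    """Scaling relation gamma = 1 + d_B / d_k."""
    return 1 + d_b / d_k


for name, (d_b, d_k) in exponents.items():
    print(f"{name:18s} {predicted_gamma(d_b, d_k):.1f}")
```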
Scaling relation works
(Plots: $s(l_B)$ versus $l_B$ for the WWW and for metabolic networks.)
(Diagram labels: fractals, communities/modules, scale-free, prediction.)
Why fractality?
Some real networks are not fractal
Other models fail too: Erdos-Renyi, hierarchical model, fitness model, JKK model, pseudo-fractal models, etc.
The Barabasi-Albert model of preferential attachment does not generate fractal networks
All the models fail to predict self-similarity
INTERNET
What is the origin of self-similarity?
HINT: the key to understanding fractals is in the degree correlations P(k1,k2), not in P(k)
Can you see the difference?
NON FRACTAL: Internet map
FRACTAL: Yeast protein map, E. coli metabolic map
Quantifying correlations P(k1,k2):
Probability to find a node with k1 links connected with a node of k2 links
Internet map (non fractal): hubs connected with hubs
Metabolic map (fractal): hubs connected with non-hubs
(Plots: P(k1,k2) on log(k1) and log(k2) axes, shaded from low prob. to high prob.)
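An empirical P(k1,k2) can be estimated directly from an edge list. This is a minimal sketch assuming an undirected simple graph with each link listed once; the function name is hypothetical, and the normalization (each link counted in both directions) is one choice among several.

```python
from collections import Counter


def joint_degree_distribution(edges):
    """Estimate P(k1, k2): fraction of link ends joining a degree-k1 node to a degree-k2 node."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter()
    for u, v in edges:
        # Count each undirected link in both directions so P is symmetric.
        counts[(degree[u], degree[v])] += 1
        counts[(degree[v], degree[u])] += 1
    total = 2 * len(edges)
    return {pair: c / total for pair, c in counts.items()}


# Toy example (assumed): a hub with three leaves, so the hub only meets non-hubs.
print(joint_degree_distribution([(0, 1), (0, 2), (0, 3)]))
```

Weight concentrated at large k1 and large k2 corresponds to hubs linked to hubs; weight at large k1 and small k2 corresponds to hubs linked to non-hubs.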
Quantify anticorrelation between hubs at all length scales
(Diagram: hubs before and after renormalizing; hubs connected directly. In the example shown, $\varepsilon(l_B) = 2/3$.)
Hub-Hub Correlation function $\varepsilon(l_B)$: fraction of hub-hub connections
Hub-hub connection organized in a self-similar way
A larger $d_e$ implies more anticorrelation
Anticorrelations are essential for fractal structure
(Plot labels: fractal, non-fractal.)
What is the origin of self-similarity?
Non-fractal networks
• very compact networks
• hubs connected with other hubs
• strong hub-hub “attraction”
• assortativity
• Examples: Internet; all available models (BA model, hierarchical, random scale-free, JKK, etc.)
Fractal networks
• less compact networks
• hubs connected with non-hubs
• strong hub-hub “repulsion”
• disassortativity
• Examples: WWW, PIN, metabolic, genetic, neural networks, some sociological networks
How to model it? renormalization reverses time evolution
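Mode I and Mode II (diagram: the network grows forward in time; renormalizing runs it backward)
$N(t+1) = n\,N(t)$
$k_i(t+1) = s\,k_i(t)$
Both mass and degree increase exponentially with time
Scale-free: $P(k) \sim k^{-\gamma}$, with $\gamma = 1 + \ln n / \ln s$
$m k_i$ offspring nodes attached to their parents ($m = 2$ in this case)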
Song, Havlin, Makse, Nature Physics, 2006
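The exponent $\gamma = 1 + \ln n/\ln s$ follows from the two growth laws by a counting argument, sketched here under the assumption that all nodes enter with similar degree; $t_0$ (the step at which a node was born) is notation introduced for this sketch.

```latex
\begin{align}
\#\{\text{nodes born by step } t_0\} &\sim n^{t_0}, \qquad
k(t) \sim s^{\,t - t_0} \;\Longrightarrow\; t_0 = t - \frac{\ln k}{\ln s} \\
P(\text{degree} \ge k) &\sim \frac{n^{t_0}}{n^{t}} = k^{-\ln n/\ln s} \\
P(k) &\sim k^{-\gamma}, \qquad \gamma = 1 + \frac{\ln n}{\ln s}
\end{align}
```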
How does the length increase with time?
$N(t+1) = n\,N(t)$, so $N(t) \sim e^{t \ln n}$
Mode I: NONFRACTAL, SMALL WORLD
$L(t+1) = L(t) + 2$, so $L(t) = 2t$
$N(t) \sim e^{L(t)/l_0}$, with $l_0 = 2/\ln n$ and $d_B = \infty$
Mode II: FRACTAL
$L(t+1) = 3\,L(t)$, so $L(t) = e^{t \ln 3}$
$N(t) \sim L(t)^{d_B}$, with $d_B = \ln n/\ln 3$
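The two length recursions can be iterated numerically to see both laws emerge. This sketch assumes starting values L(0) = 1 and N(0) = 1, which the slides do not specify, and uses n = 5 purely as an example.

```python
import math


def grow(mode, n, steps, length=1.0, mass=1.0):
    """Iterate N(t+1) = n N(t) with L(t+1) = L(t) + 2 (mode "I") or 3 L(t) (mode "II")."""
    history = [(length, mass)]
    for _ in range(steps):
        length = length + 2 if mode == "I" else 3 * length
        mass = n * mass
        history.append((length, mass))
    return history


n = 5
L2, N2 = grow("II", n, 10)[-1]
print(math.log(N2) / math.log(L2), math.log(n) / math.log(3))  # power law: d_B

L1, N1 = grow("I", n, 200)[-1]
print(math.log(N1) / L1, math.log(n) / 2)  # exponential: 1 / l_0
```

In Mode II the ratio ln N / ln L matches ln n / ln 3 for this starting point; in Mode I it is ln N / L that approaches a constant, ln n / 2, so no finite $d_B$ exists.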
Combine two modes together
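$L(t+1) + L_0 = (3 - 2e)\,(L(t) + L_0)$, with $L_0 = e/(1-e)$
$d_B = \ln n / \ln(3 - 2e)$
$d_e = -\ln e / \ln(3 - 2e)$
$N_B(t) \sim l_B(t)^{-d_B}$
$\varepsilon(t) \sim l_B(t)^{-d_e}$
Mode I with probability e, Mode II with probability 1-e (here e is a probability, not Euler's number)
(Diagram: growth in time and renormalization for e = 0.5.)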
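A toy implementation of the combined growth rule may help fix ideas. It is one possible reading of the slides, not the authors' code: each link gives m offspring to each of its two end nodes (so a node of degree k gains m*k offspring), the old link is kept with probability e (Mode I), and otherwise it is replaced by a link between one offspring of each end (Mode II). The function name, the single-link starting graph and the seeded random generator are assumptions.

```python
import random


def grow_network(steps, m=2, e=0.5, seed=0):
    """Grow an edge list from a single link; returns (number of nodes, edges)."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    n_nodes = 2
    for _ in range(steps):
        new_edges = []
        for u, v in edges:
            # m offspring per link end, so a node of degree k gains m*k in total.
            kids_u = list(range(n_nodes, n_nodes + m))
            kids_v = list(range(n_nodes + m, n_nodes + 2 * m))
            n_nodes += 2 * m
            new_edges += [(u, c) for c in kids_u]
            new_edges += [(v, c) for c in kids_v]
            if rng.random() < e:
                # Mode I: the two parents stay directly linked.
                new_edges.append((u, v))
            else:
                # Mode II: the parents' link is replaced by an offspring-offspring link.
                new_edges.append((kids_u[0], kids_v[0]))
        edges = new_edges
    return n_nodes, edges


n_nodes, edges = grow_network(3, m=2, e=0.5)
print(n_nodes, len(edges))
```

In this sketch the number of links is multiplied by 2m + 1 at every step whatever the value of e, while the degree of an old node is multiplied by m + 1 under pure Mode I and by m under pure Mode II.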
Predictions
Model reproduces local small world, scale-free and fractality
model with e=1
• attraction between hubs
• non-fractal
• small world globally
model with e=0.2
• repulsion between hubs leads to fractal topology
• small world locally inside well defined communities
H. sapiens, yeast
The model reproduces the main features of real networks
Case 1: e = 0.8: FRACTALS
Case 2: e = 1.0: NON-FRACTALS
Model predicts all exponents in terms of growth rates
At each step the total mass scales with a constant n, all the degrees scale with a constant s, and the length scales with a constant a:
$l_B(t) + l_0 = a^t$, $N(t) = n^t$, $k(t) = s^t$
$N(l_B) \sim l_B^{d_B}$, $s(l_B) \sim l_B^{d_k}$
We predict the fractal exponents:
$d_B = \ln n / \ln a$
$d_k = \ln s / \ln a$
$\gamma = 1 + \ln n / \ln s = 1 + d_B / d_k$
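The three predictions are simple enough to evaluate directly. The function below just encodes the formulas above; its name and the sample growth rates (n = 5, s = 2, a = 3) are illustrative assumptions rather than fitted values.

```python
import math


def predicted_exponents(n, s, a):
    """Exponents from per-step growth factors of mass (n), degree (s) and length (a)."""
    d_b = math.log(n) / math.log(a)
    d_k = math.log(s) / math.log(a)
    gamma = 1 + math.log(n) / math.log(s)
    return d_b, d_k, gamma


d_b, d_k, gamma = predicted_exponents(n=5, s=2, a=3)
print(d_b, d_k, gamma, 1 + d_b / d_k)
```

The last two printed numbers agree because the length factor a cancels in the ratio $d_B/d_k$.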
Time evolution in yeast network
Multiplicative and exponential growth in yeast PIN
Length-scales, number of conserved proteins and degree:
$l(t) = e^{\alpha_l t}$, $N(t) = e^{\alpha_N t}$, $k(t) = e^{\alpha_k t}$
$d_B = \alpha_N / \alpha_l$
$d_k = \alpha_k / \alpha_l$
$\gamma = 1 + \alpha_N / \alpha_k$
A new principle of network dynamics
1930: solid-state physics, big world
1960: Erdos-Renyi model, small world; democracy = socialism
1999: BA model, “rich-get-richer” = capitalism
2005: fractal model, “rich-get-richer” at the expense of the “poor” = globalization; less vulnerable to intentional attacks
Summary
• In contrast to common belief, many real world networks are self-similar.
• FRACTALS: WWW, Protein interactions, metabolic networks, neural networks, collaboration networks.
• NON-FRACTALS: Internet, all models.
• Communities/modules are self-similar, as well.
• Scaling theory describes the dynamical evolution.
• Boxes are related to the functional modules in metabolic and protein networks.
• Origin of self similarity: anticorrelation between hubs
• Fractal networks are less vulnerable than non-fractal networks
Positions available: jamlab.org
m = 2
And finally, a model to put all this together
A multiplicative growth process of the number of nodes and links
Probability e: hubs always connected; strong hub attraction should lead to non-fractal
Probability 1-e: hubs never connected; strong hub repulsion should lead to fractal
Analogous to duplication/divergence mechanism in proteins??
For both models, at each step the total number of nodes scales as n = 2m + 1, i.e. $N(t+1) = n\,N(t)$. Now we investigate how the lengths transform, which differs between the two models:
$L_a(t+1) = L_a(t) + 2$ and $L_b(t+1) = 3\,L_b(t)$
This leads to two different scaling laws for N versus L:
$N_a \sim e^{L_a/L_0}$, with $L_0 = 2/\ln n$
$N_b \sim L_b^{d_B}$, with $d_B = \ln n/\ln 3$
Mode I: L(t+1) = L(t)+2
Mode II: L(t+1) = 2L(t)+1
Mode III: L(t+1) = 3L(t)
Different growth modes lead to different topologies
Suppose mode I occurs with probability e, and mode II and mode III with the remaining probability 1-e. Then we have:
$L(t+1) = (3 - 2e)\,L(t) + 2e$
or
$(L(t) + l_0) \sim a^t$, with $a = 3 - 2p$, $l_0 = p/(1-p)$, and $p = e + f \in [e, (1+e)/2]$
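The offset $l_0$ comes from shifting a linear recursion so that it becomes purely multiplicative. The sketch below writes the recursion as $L(t+1) = a\,L(t) + b$, with $a$ and $b$ as generic placeholders introduced here:

```latex
\begin{align}
L(t+1) + l_0 &= a\,\bigl(L(t) + l_0\bigr)
  \quad\text{requires}\quad a\,l_0 - l_0 = b
  \;\Longrightarrow\; l_0 = \frac{b}{a - 1} \\
a = 3 - 2e,\; b = 2e
  &\;\Longrightarrow\; l_0 = \frac{2e}{2 - 2e} = \frac{e}{1 - e},
  \qquad L(t) + l_0 = a^{t}\,\bigl(L(0) + l_0\bigr)
\end{align}
```

The same algebra with $a = 3 - 2p$ and $b = 2p$ gives $l_0 = p/(1-p)$.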
Dynamical model
Graph theoretical representation of a metabolic network
(a) A pathway (catalyzed by Mg2+-dependent enzymes). (b) All interacting metabolites are considered equally. (c) For many biological applications it is useful to ignore co-factors, such as the high energy-phosphate donor ATP, which results in a second type of mapping that connects only the main source metabolites to the main products.
Classes of genes in the yeast proteome
Renormalization following the phylogenetic tree
P. Uetz, et al. Nature 403 (2000).