statistical analysis of bus networks in india · chennai-600036, india abstract through the past...

16
Statistical Analysis of Bus Networks in India Atanu Chatterjee * , Manju Manohar , and Gitakrishnan Ramadurai Department of Civil Engineering Indian Institute of Technology Madras Chennai-600036, India Abstract Through the past decade the field of network science has established itself as a common ground for the cross-fertilization of exciting inter-disciplinary studies which has motivated researchers to model almost every physical sys- tem as an interacting network consisting of nodes and links. Although public transport networks such as airline and railway networks have been extensively studied, the status of bus networks still remains in obscurity. In developing countries like India, where bus networks play an important role in day-to-day commutation, it is of significant interest to analyze its topological structure and answer some of the basic questions on its evolution, growth, robustness and resiliency. In this paper, we model the bus networks of major Indian cities as graphs in L-space, and evaluate their various statistical properties using concepts from network science. Our analysis reveals a wide spectrum of network topology with the common underlying feature of small-world prop- erty. We observe that the networks although, robust and resilient to random attacks are particularly degree-sensitive. Unlike real-world networks, like In- ternet, WWW and airline, which are virtual, bus networks are physically constrained. The presence of various geographical and economic constraints allow these networks to evolve over time. Our findings therefore, throw light on the evolution of such geographically and socio-economically constrained networks which will help us in designing more efficient networks in the future. 1 Introduction From the neural architecture of the brain to the patterns of social interactions, many physical systems and real-world phenomena are being formulated as network models [1, 2, 3, 4, 5, 6, 7, 8]. These models are complex not only because of their size but also because of the various emergent properties that arise due to the pattern of their inter-nodal connections. Any physical, chemical, biological or social system * [email protected] [email protected] [email protected] 1 arXiv:1509.04554v1 [physics.soc-ph] 14 Sep 2015

Upload: others

Post on 22-Jun-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

Statistical Analysis of Bus Networks in India

Atanu Chatterjee∗, Manju Manohar†, and Gitakrishnan Ramadurai‡

Department of Civil EngineeringIndian Institute of Technology Madras

Chennai-600036, India

Abstract

Through the past decade the field of network science has established itselfas a common ground for the cross-fertilization of exciting inter-disciplinarystudies which has motivated researchers to model almost every physical sys-tem as an interacting network consisting of nodes and links. Although publictransport networks such as airline and railway networks have been extensivelystudied, the status of bus networks still remains in obscurity. In developingcountries like India, where bus networks play an important role in day-to-daycommutation, it is of significant interest to analyze its topological structureand answer some of the basic questions on its evolution, growth, robustnessand resiliency. In this paper, we model the bus networks of major Indiancities as graphs in L-space, and evaluate their various statistical propertiesusing concepts from network science. Our analysis reveals a wide spectrum ofnetwork topology with the common underlying feature of small-world prop-erty. We observe that the networks although, robust and resilient to randomattacks are particularly degree-sensitive. Unlike real-world networks, like In-ternet, WWW and airline, which are virtual, bus networks are physicallyconstrained. The presence of various geographical and economic constraintsallow these networks to evolve over time. Our findings therefore, throw lighton the evolution of such geographically and socio-economically constrainednetworks which will help us in designing more efficient networks in the future.

1 Introduction

From the neural architecture of the brain to the patterns of social interactions,many physical systems and real-world phenomena are being formulated as networkmodels [1, 2, 3, 4, 5, 6, 7, 8]. These models are complex not only because of their sizebut also because of the various emergent properties that arise due to the patternof their inter-nodal connections. Any physical, chemical, biological or social system

[email protected][email protected][email protected]

1

arX

iv:1

509.

0455

4v1

[ph

ysic

s.so

c-ph

] 1

4 Se

p 20

15

Page 2: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

can be visualized as a complex network whose constituting elements are known asnodes, and the interactions between them identified as links. Based on the natureof the links, these networks can be broadly classified into virtual and spatial net-works. In the former category, the links are physically absent, e.g., social networksor collaboration networks whereas, in the later case, the links are physically present,i.e., geographically embedded road or railway networks [9, 10, 11, 12]. In betweenthese two broad classes there exist networks in which the links although physicallyabsent are however, geographically constrained. The structure of the real-worldnetworks such as bus or electric power grid are dependent upon the structure ofthe physically constrained, geographically embedded networks on which they growand evolve. Particular to the field of transportation science, the use of networksto understand the flow of entities including vehicles, cargo and pedestrians, has along history. This traditional network flow formulation has answered many inter-esting engineering questions related to optimality of cost, maximality of flows andthe classical, shortest path determination [13, 14]. But there exist questions thatdeal with the topological structure of the network, which are primarily concernedwith the inter-nodal connectivity and static (dynamic) evolution of the network,which the traditional formulation fails to address. In order to answer interestingquestions, like estimating the importance of a particular node in a network, identi-fying existence of hubs, analyzing the pattern of variation in shortest paths with thenetwork size, or the robustness and resiliency of the network, we need to look at thestatistical and topological properties of the nodes and the links that constitute thenetwork. Mathematically, a network is a graph, G, characterized by the presenceof nodes, N , and links, L, connecting the nodes, such that G = (N,L) where theset of nodes belong to the Euclidean space of two or three dimension. Specific topublic transit, networks are often modeled either in L-space or in P -space [15, 16].In both the configurations, the nodes remain the same, for example, bus stops,metro or railway stations, whereas the pattern of the link connectivity changes. InL-space formulation, each pair of consecutive neighboring nodes lying along a routeis considered to be connected by a link, whereas in P -space formulation, every pos-sible pair of nodes accessed without transfers are considered as links. Thus, L-spaceconfiguration helps in understanding the relationship between the stops or nodes ingeneral, and P -space helps in studying the transfers between different routes in thenetwork. For each node in the set of nodes, N = {n1, n2, n3...ni|i ∈ I,∀ni ∈ Rn},we identify the degree of a node, ki as the number of links to which that particularnode is connected to. Note that it is in the pattern of the inter-nodal connectivity,specifically the degree-distribution, P (k) of the nodes, that the answer lies to theemergence of several interesting properties of the network. Based on the degree-distribution of nodes, two prominent network models have been identified: a) therandom network model and b) the scale-free network model. The random networkmodel was first studied by Erdos and Renyi, and they provided two generative mod-els where either the number of nodes and edges are fixed or each node is associatedwith some probability [17]. Although the Erdos-Renyi random graph is an impor-tant model for comparison purposes, it fails to capture the essence of real-worldnetworks, like presence of clusters, communities and the small-world phenomena.A more interesting model was proposed by Watts and Strogatz (WS) in order to

2

Page 3: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

understand real-world networks in greater depth, which is commonly known as thesmall-world network [6, 18]. However, it has been observed that most of the real-world networks show a heavy-tailed degree distribution, where the degree of someof the nodes significantly exceeds the average degree of the nodes in the network.This inhomogeneity in degree-distribution often gives rise to striking propertiesin the network, that has been extensively studied by Barabasi and Albert (BA)[1, 2, 3]. Both BA and WS models advocate the small-world phenomenon, which isa characteristic feature of real-world networks, such as electric power grids, WWW,Internet, social-networks, protein-yeast (metabolite) interaction networks, citationnetworks, movie-actors collaboration networks [1, 2, 3, 5, 6, 19, 20, 21, 22, 23].

Interestingly, the above mentioned properties have been reported in variouspublic transit networks as well [11, 15, 16, 24, 25, 26, 27, 28, 29]. The small-worldphenomenon in transportation networks makes sense as transportation facilitiesin a city are planned to provide maximum convenience to its people by allowingthem to travel between places in minimum possible time. Most transportationnetworks are pre-planned networks, where the initial design of the network decidesthe presence of hubs. Also, the size of transportation networks is not as largecompared to social-networks or the Internet, and are subjected to geographical aswell as socio-economical constraints. Studies on public transit networks for differentcities around the world (inclusive of all modes: buses, trams, metros and monorails)in general have been shown to exhibit scale-free behaviour with varying values of thepower-law exponent, γ [15, 27, 28, 29]. Airline and metro-networks in specific showscale-free degree distribution patterns whereas degree-distribution in bus and railnetworks as we shall see tend more towards exponential patterns. The reason forthis contrasting behaviour could be attributed to the two following observations:(i) airline-networks are not bounded by geographical constraints and (ii) metro-networks are local often catering to a part of the city whereas, bus and railway-networks are global as they are spread throughout the entire state and sometimesacross the entire country. Specific to Indian scenarios exhaustive studies on publictransit networks as a whole, are yet to conducted. Previous works although haveshown that the pattern of nodal connectivity of the Indian Railway Network (IRN)drastically differs from that of the Airport Network of India (ANI) [11, 24] thenature of bus networks still remains an unsolved problem.

Analysis of the statistical properties of bus transport networks (BTNs) in Chinarevealed scale-free degree distribution and small-world properties in them. Thepresence of nontrivial clustering indicated a hierarchical and modular structure inthe BTN. Weighted analysis of the network was done considering routes as nodesand weights as the number of common stations between the routes. The weightdistribution followed a heavy tailed power law, and the strength and degree werelinearly dependent. [? ] In another study, an empirical investigation was conductedon the bus transport networks (BTNs) of four major cities of China. When ana-lyzed using P -space topology, the degree distribution had exponential distribution,indicating a tendency for random attachment of the nodes. The authors also eval-uated two statistical properties of BTNs, viz., the distribution of number of stopsin a bus route (S) and the number of bus routes a stop joins (R). While the formerhad an exponential functional form, the latter had asymmetric unimodal functional

3

Page 4: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

forms [? ]. The statistical analysis of the urban public bus networks of two Chinesecities, Beijing and Chengdu revealed scale free topology and small world character-istics. Presence of more hubs in the Beijing network led to a comparatively smallerexponent of degree distribution and larger clustering coefficient. Location of busstops in a similar way in the two cities has led to a hierarchical structure, denotedby power law behaviour (with nearly same exponents) of the weights characterizingthe passenger flows [? ]. The rail (RTS) and bus transportation systems (BUS)in Singapore were studied with respect to their topological as well as dynamicperspectives. The stations in RTS had high average degree indicating high connec-tivity amongst them, while the BUS had a small average degree. Both networkshad an exponential degree distribution indicative of randomly evolved connectiv-ity. Strength of nodes defined as the sum of weight of incident edges, appearedscale free for both networks indicating the existence of high traffic hub nodes. TheBUS network exhibited small world characteristics and had hierarchical star liketopology. RTS had slightly negative topological assortativity, while the weightedBUS displayed disassortative nature [? ]. An extended space (ES) model with in-formation on geographical location of bus stations and routes was used to analyzethe spatial characteristics of bus transport networks (BTNs) in China [? ]. The ESmodel consisted of directed weighted variations of the L- and P -space networks des-ignated as ESL and ESP networks respectively, and the symmetry-weighted ESWnetwork, which stored information of the SSPs. Often, two bus stations which aregeographically close to each other may not have any direct bus route link betweenthem. Such stations which are at walkable distances from each other, are definedas short-distance station pairs (SSPs). The SSPs greatly influence the BTNs byreducing the transfer times as well as the number of bus routes. The average clus-tering coefficient of the ESW networks was considerably large, denoting a nearlycircular location of the SSPs around a station. Majority of the route sections inthe bus routes were short, while a few route sections connecting cities downtownsand satellite towns or special purpose BRT routes were long, leading to a power lawedge length distribution of the ESL networks.

Majority of the above studies have looked into the structural properties of thebus networks in both L- and P -spaces. The ESW network is one such network whichhas looked into the aspect of network redundancy due to geographical placement ofthe nodes. In this paper, we do a comparative study of the bus networks of some ofthe major Indian cities, namely Ahmedabad (ABN), Chennai (CBN), Delhi (DBN),Hyderabad (HBN), Kolkata (KBN) and Mumbai (MBN). In order to understandthe structure of bus networks in India, in general, we calculate various metrics,such as clustering coefficients, characteristic path lengths, degree-distribution andassortativity. We also simulate network robustness and resiliency by first remov-ing nodes at random, followed by targeted removal based on degree, closeness andbetweenness. This provides us with interesting results on network (nodal) redun-dancy, as well as structural invariance. It may seem at first that the complexity ofa bus transportation network is much lesser than that of other large-scale networks,however it is the nature of the growth and the penetrative effect of these networksthat makes them not only complex but interesting and worthwhile to investigate.

4

Page 5: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

2 Methodology

We obtain the route data for all the bus networks from the respective state gov-ernment websites. Every stop is considered a node, and the routes joining thestops form the set of links. We define a graph, G = (N,L) where the set N =(n1, n2, n3, ...) with each ni as a bus-stop, and the set L = (l1, l2, l3, ...) where eachli connects the node pair (ni, nj). The set of nodes belong to the n-dimensionalEuclidean space, Rn, and the set of links form the Cartesian product over Rn.We define the set of routes as the set R such that ∪ili ∈ R for some i. In orderto analyze the networks, we generate the graph adjacency matrix, Aij such thatany matrix element aij of Aij is either equal to one or zero depending upon theexistence of a connecting link between node-pair (i, j). The degree of any node isgiven as ki = Σiaij. The above formulation generates a L-space network withoutweights. In order to assign weights, we calculate the route overlaps between a pairof nodes which we call edge-weights, wij. The degree strength matrix is given bysij = aij × wij and the weighted degree or node-strength as si = Σisij. Since theflow of transport is along both the directions, we consider the network links to beundirected. The local clustering coefficient is given by C(i) =

2|aij :(ni,nj)∈N,aij∈Aij |ki(ki−1)

where aij is the link connecting node pair (i, j), and ki are the neighbours of thenode ni. The neighbourhood, ni, for a node, i is defined as the set of its immediatelyconnected neighbours, as ni = {nj : li ∈ L ∧ lj ∈ L}. For the complete network,Watts and Strogatz defined a global clustering coefficient[5, 6], C = ΣiCi/n. Theweighted clustering coefficient is given as[30] Cw(i) = 1

si(ki−1)Σj,h

wij+wih

2aijaihajh.

Another important measure is the characteristic path length, lij which is defined asthe average number of nodes crossed along the shortest paths for all possible pairsof network nodes. The average distance from a certain vertex to every other vertexis given by di = Σi 6=j

dij|N(G)|−1

. Then, lij is calculated by taking the median of allthe calculated di ∀i ∈ Rn. In order to check the small-world property, we generaterandom graphs of same size, i.e., keeping network size N constant. However, thenetwork topology of a random graph is governed by a wiring probability, pw whichdetermines the connectedness of the network (or the number of edges of the net-work). In order to generate random networks of comparable sizes (similar number

of nodes and edges), we calculate the wiring probability as pwN2

2∼ N . The cen-

tralities, betweenness and closeness tell us the relative importance of nodes in thenetwork. Betweenness centrality of any node is calculated as, CB(i) = Σs 6=i 6=t

σs,t(i)

σs,t,

where σs,t is the number of shortest paths connecting s to t and σs,t(i) number ofshortest paths connecting s to t but passing through i. Likewise, closeness centralityfor any node is calculated by CC(i) = Σi

1aij

. The average closeness is the harmonic

mean of the shortest paths from any node to every other node. In weighted net-works, usually the edge weights are considered as cost functions, therefore, largerthe edge weight, lesser is the node’s closeness, as the cost of travel would be large.However, in our case the edge weights play an altogether different role signifyingthe ‘ease’ of travel, hence, we take the inverse of edge weights during the calcu-lation of weighted CC as in collaboration networks given by Cw

C (i) = min Σi(1wij

).

The degree-assortativity or the Pearson correlation coefficient of degree between

5

Page 6: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

pairs of linked nodes is given by Σjkjk(ejk−qjqk)

σ2q

, where ejk is the joint probability

distribution of the remaining degrees of the two vertices at either end of a randomlychosen edge with Σj,kejk = 1 and Σjejk = qk. Here, qk is the normalized degree-distribution of the remaining degrees, and σ2

q is the variance of the distribution qkgiven by[31] σ2

q = Σkk2qk − [Σkkqk]

2. The degree-distribution P (k) gives the prob-ability of finding a node with a degree, k in the network, which basically representsthe ratio of all the nodes with degree equal to k to the size of the network, N . Thedegree-distribution is observed to follow a heavy-tailed function. The equation forthe power-law or exponential fits (in Table 1 and Figure 2) are calculated usingMaximum Likelihood Estimation (MLE) and the Kolmogorov-Smirnov test is em-ployed to check for goodness of fit [32]. The degree-strength correlation is evaluatedusing linear-regression model, and the least-square error is calculated.

3 Results

Figure 1: Figure shows the network structure of the different bus routes where eachnode represents a bus stop. The plots are generated using force directed algorithmsand the colour of nodes partition the networks into different communities.

In Figure 1, we plot the network structure using force directed algorithms. Thefigure compares the structural construct of the networks. We can clearly observethe nature of connectivity between the nodes in the different networks. While DBNis densely packed, CBN, HBN and KBN are sparse. The network structure of MBNis particularly striking. The long branches with multiple intermediate nodes asseen from the figure cause the characteristic path-length, lij of MBN to increase

6

Page 7: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

Bus routes Nodes Edges lij Cav γ λ Assortativity < k >ABN 1103 2582 5.59 0.19 2.47 - 0.07 3.67CBN 1009 1610 8.73 0.07 3.81 - 0.12 24.58DBN 1557 4287 5.51 0.18 7.03 0.06 0.07 9.88HBN 1088 2954 3.87 0.26 3.52 - -0.03 23.88KBN 518 884 5.72 0.08 4.96 - -0.01 6.72MBN 2267 3042 25.69 0.15 8.53 0.13 0.18 10.38

Table 1: Tabular representation of the statistical data for the bus routes of six majorIndian cities (lij = characteristic path length, Cav = average weighted clusteringcoefficient, γ = power-law exponent, λ = exponential decay exponent and < k > =average node degree).

abnormally (see Table 1). We also calculate the modularity of the networks toidentify community structure. Networks with high modularity have dense connec-tions between the nodes within the same modularity class but weak connectionsbetween nodes in different modularity class. In order to identify communities wecolour-code the nodes based upon the modularity classes. Community detectionin bus networks help us in identifying the different zones of operation. As largeas six communities were identified for CBN and MBN whereas fewer (four or less)communities were identified for ABN, DBN, HBN and KBN. In Table 1, we presentthe statistical analysis for the various networks in a tabular form. The datasetswere obtained from the government websites of Ahmedabad BRTS (ABN), MTC(CBN), DTC (DBN), APSRTC (HBN), CSTC (KBN) and BEST (MBN). It can beseen from the table that the network sizes of all the cities are comparable to eachother, except that of KBN because CSTC is localized and operates as a subdivisionof West Bengal Surface Transport Corporation (WBSTC) that operates buses inthe entire state. The network density, ρ, which is the ratio of the number of edgesin a given network to the corresponding complete graph varies from 0.001 to 0.006.An interesting feature is the variation of the characteristic path length lij from aslow as 3.87 to as high as 25.69 . In order to get a deeper insight into the structureof these networks, we carried out a weighted analysis by assigning a weight corre-sponding to the overlap of routes connecting a particular pair of nodes which helpsus understand the flow of traffic between that nodal pair. The weighted degree ofa node or its strength is observed to follow a heavy-tailed distribution on a doublelogarithmic scale. Also, the node strength and node degree are found to be relatednon-linearly. This implies that the traffic at a node due to route overlaps ratherincreases exponentially as compared to the actual number of routes it is connectedto [30]. We observe that the average clustering coefficient, Cav also shows a remark-able variation from 0.07 to as high as 0.26. We check the presence of small-worldphenomenon in the above networks by generating random graphs with the samenumber of nodes and comparable number of edges, and calculate the characteristicpath length, lrandij and average clustering coefficient, Crand

av in each case. Upon com-paring with the data in Table 1, we find that Cav >> Crand

av each time, whereas lijis either comparable to lrandij or lij < lrandij . Based upon the above comparisons, we

7

Page 8: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

can state that the bus networks show small-world phenomenon. As we discussedearlier, L-space formulation merely gives the relationship between bus stops andbus routes, whereas it is the P -space formulation which helps in predicting numberof transfers, or in this case, number of bus changes. We can estimate the numberof bus changes required by looking at the average number of bus stops present ineach of the routes. For both CBN and MBN, 6 and 15 bus stops are present on anaverage in all the bus routes, which makes the number of transfers as low as 2− 3for all the networks studied.

As discussed earlier, node-degree distribution plays an important role in un-derstanding the structure and evolution of complex networks. In Figure 2 (a), weplot the degree distribution for all the networks on a double logarithmic scale. Thedegree-distribution patterns show heavy-tailed characteristics, as can be seen fromthe plots. More specifically, ABN, CBN, HBN and KBN show power-law behaviour,whereas DBN and MBN have an exponential distribution. The shaded region ineach of the graphs represents the exponential degree cut-offs (kmin). Therefore, thedegree-distribution function can be represented as a truncated power-law, P (k) ∼exp(−k/kmin)k−γ for ABN, CBN, HBN and KBN, whereas for DBN and MBN thedegree-distribution function follows an exponential decay, P (k) ∼ exp(−λk) whereγ and λ are respectively the decay exponents (see Table 1).

Figure 2: (a) Figure shows the degree-distribution, P (k) on a double logarith-mic scale with the power-law exponent, γ and degree-cutoff kmin (inset); (b) Fig-ure shows centrality distribution for betweenness (CB) and closeness centralities(CC) with the decay exponent λ (inset). The plots in the last row show degree-betweenness dependency with exponent α (inset).

In Figure 2 (b), we plot the centrality distributions (closeness and betweenness),P (CC) and P (CB) in the first two rows for ABN, HBN and MBN on a double log-arithmic scale. Only three specific bus networks are chosen since they capture therange of all network topologies shown by the six cities. While ABN shows moreof a strict power-law (due to low exponential cut-off), HBN shows a truncatedpower-law, and MBN an exponential form of degree-distribution. We find that thedistribution function follows an exponential decay given by P (CC) ∼ exp(−λCC)

8

Page 9: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

(similarly for CB) where the value of the exponent λ is shown in each of the plots.Unlike degree-distribution plots which show a power-law decay signifying preferen-tial attachment for the new nodes to connect to the existing ones, the centralitydistribution plots rather imply random attachment. Thus, new nodes do not pref-erentially choose existing central nodes to connect to rather they connect to theirexisting neighbors. This can be observed from the degree-betweenness plots. In thelast row, we plot the variation of betweenness centrality with the degree of a nodewhich follows a power-law relationship, given as CB ∼ kα with the magnitude ofthe exponent α also shown in the plots. A close observation reveals that nodes withhigh betweenness certainly have high degrees however, the reverse is not true. Wealso observe that the clustering coefficient C varies with node degree as C ∼ k−1

which implies that the nodes with low clustering coefficients tend to have higherdegrees and vice-versa. This can be explained from the fact that nodes (bus stops)having higher degree will be a part of multiple bus routes whereas, those bus stopsthrough which fewer bus routes pass will have lower degree. Thus, it is more likelyfor the nodes (bus stops) in the later case to form clusters as compared to the oneswhich are connected to multiple bus routes.

Figure 3: Figure shows the variation in characteristic path length, lij with percent-age node removal, p for ABN, DBN, HBN and KBN. In the first row, the plotsdenote variation in lij with random (green) and degree-biased (red) node removal.In the second row, the plots denote variation in lij with betweenness (black) andcloseness-biased (blue) node removal.

In Figure 3, we plot the response of the network’s characteristic path length, lijto random and systematic perturbation. We simulate the robustness and resiliencyof the networks by modeling perturbations as node removals. Due to their strong

9

Page 10: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

Figure 4: Figure shows degree-distribution plots for ABN subjected to random anddirected attacks for percentage of node removals, p = 20%, 40% and 60%.

assortative nature MBN and CBN disintegrate into separate entities very quicklywhereas, the other networks remain connected upto atleast 4% of node removals. Inthe first row, we plot the variation in lij with respect to random (green) and degree-biased (red) removal of nodes. In almost all the cases, we can clearly observe that themagnitude of lij increases in case of directed or degree-biased attacks over randomattacks. In the second row, we plot the variation in the magnitude of lij with respectto betweenness (black) and closeness-biased (blue) node removals. It is observedthat in the case of ABN, betweenness plays a crucial role due to sudden increase inthe value of lij, while the same sensitivity is shown by DBN for closeness. In cases ofHBN and KBN, both closeness and betweenness show almost similar characteristics.However, a comparison across rows 1 and 2 in Figure 3 reveals that the upper limiton the magnitude of lij remains almost the same, and does not vary much. Finally,we check the topological structure of the network based upon random and directedattacks by plotting the degree-distribution function in each case for ABN and MBNin Figure 4a and 4b respectively.

4 Discussion

In this paper, we analyzed the statistical properties of the bus routes of the six In-dian cities, namely Ahmedabad, Chennai, Delhi, Hyderabad, Kolkata and Mumbai.Our analysis suggests that the bus networks show a wide spectrum of topologicalstructure from power-law to exponential, with varying magnitude of the power-lawexponent γ. Ahmedabad (ABN) is particularly interesting in this regard becauseit has a BRTS (Bus Rapid Transit System) with dedicated lanes, a type of pub-

10

Page 11: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

Figure 5: Figure shows degree-distribution plots for MBN subjected to random anddirected attacks for percentage of node removals, p = 20%, 40% and 60%.

lic transit system which is yet to be introduced at a large scale in India. ABN’sBRTS, thus, holds a structural advantage by the presence of many hubs to whichextreme routes are connected, a structure similar to WWW or the airline networks(WAN and ANI) [24, 30]. As we saw in the earlier sections, CBN and MBN do notshow the small-world property in L-space. They, however, do show the small-worldproperty in terms of transfers, as majority of the places can be visited by makingas little as 2 to 3 bus changes. The structural relationship between bus stops asobserved from the degree-distribution plots in Figure 2 is particularly of interest.In Figure 2, we plot the weighted degree-distribution of the networks which cap-ture the strength of the nodes with respect to the traffic handled in terms of thenumber of routes. In order to check for correlations between node degree, k andnode weighted-degree, s we plot them on a double-logarithmic scale. Interestingly,ABN shows a strong correlation as, s ∼ kβ with β = 1.27 and R2 = 0.91, whereasthe other networks fail to show such strong relationships (CBN, KBN and HBNshow similar relationships with β ∼ 1.44− 2.08, however, with very low correlationcoefficient, R2 ∼ 0.60 − 0.74). The degree-distribution in case of ABN has thepower-law exponent, γ as 2.47, whereas the degree-strength exponent, β is foundto be 1.27. This implies that the strength of a node increases faster as comparedto its degree, which basically shows a sense of order in ABN where higher degreenodes, for example, large or important bus stops, handle heavy traffic as majorityof the routes pass through them. This is definitely missing in the other networkswhere the edge weights or routes seem to be more randomly distributed. Also,the presence of exponential cutoffs (as seen from Figure 2 (a)) further validate ourfindings regarding the randomness associated with the growth and evolution of the

11

Page 12: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

bus networks with time (clearly, ABN shows lower kmin, and hence, a lower valueof γ with respect to the other networks). Barabasi and Albert suggested growthand preferential attachment as the underlying mechanisms giving rise to scale-freedegree-distributions, with P (k) ∼ k−γ. The growth equation in BA model follows,∂ki∂t

= mΠ(ki), where m signifies growth and Π(ki) preferential attachment with

Π(ki) = kiΣjkj

for all existing j nodes, later proved analytically [2, 4]. Recently, Wang

et al. simulated exponential growth models for networks, with growth and adjacentnode attachment as underlying processes [33]. The growth equation according to

their model is as follows, ∂P (k)∂t

+ P (k, t) = t∂P (k,t)∂t

= P (k − 1, t) − P (k, t) + δk,1,where P (k, s = t, t > 2) = δk,1. The variable s = 1, 2, 3... marks the vertices, andP (k, s, t) represents the probability that a vertex s has a degree k at time t. In the

continuum limit, the above equation approaches the form, ∂P (k)∂t

= −P (k) givingrise to the decay equation P (k) ∼ exp(−λk) with λ =< k >−1. From Table 1,we can observe that our values for the exponent λ for DBN and MBN agree withthe above analytical prediction. In Figure 2 (b), we plot the centrality distributionfor betweenness (CB) and closeness (CC). We consider betweenness and closenessbecause they play a crucial role from a transportation perspective. CC is a mea-sure of a node’s relative importance in the network due to the existence of shortestpaths from that particular node to every other node in the entire network. CB onthe other hand acts as a bridging node connecting different parts of the networktogether. When traveling from one node to the other, it is often beneficial to get tothe node with the highest value of CC first if a direct path does not exist betweenthe origin-destination pair. Often transportation network of a city is planned in away such that the hubs allow maximum number of routes to pass through them, andall other nodes in the network to be easily reachable from them. Since, centralityis positively correlated to node degree, the hubs in a network also tend to have thelargest degrees. We found this pattern in all the networks, (CB ∼ kα) however, inDBN and MBN the relationship between degree and centrality is not that strong,which is due to the presence of noise in the network due to random attachment ofnodes (see Figure 2 (b) last row). The noise or the presence of redundant nodes(links) due to random attachment of the nodes in the network causes the degree-distribution patterns to shift from a purely power-law decay to truncated power-lawand exponential decays. The presence of these redundant nodes increase the degreeof non-central nodes which is observed in the degree-centrality plots (see Figure2). These nodes due to their random placement tend to appear at random placesin the network causing hindrance in the direct connectivity of the hubs. The net-works (except CBN and MBN) therefore show disassortative or weakly assortativebehaviours. We also observe that the distribution function follows an exponentialdecay, as P (CC) ∼ exp(−λCC) (similarly, for CB) which again confirms our claimsregarding random node attachment. The above centralities show us that all nodesin a network are not the same, i.e., some nodes are more ‘central’ as compared toother nodes. Similarly, some nodes do not play any significant role in the network’soverall functionality, i.e., they are redundant. In Figure 3, we evaluate the network’sresponse to external perturbations by random and directed removal of nodes. We fixan important measure lij and check its variation upon percentage removal of nodes.As we saw earlier, CBN and MBN due to their strong assortative behaviour, seem

12

Page 13: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

to be very sensitive to node removals as they quickly disintegrate, whereas ABN,DBN, HBN and KBN do not show any significant change in lij upto 4% of noderemoval. This basically amounts to roughly 40−70 nodal redundancy (in numbers),which can reduce cost of construction, operation and maintenance significantly inthe network. Also, observe that the topological structure of the networks are pre-served (Figure 4a and 4b) even when large number of nodes are removed. Thedegree-biased node removal renders the degree-distribution function discontinuous,signifying the fact that the networks are particularly degree-sensitive. In Figure 4b,we plot the degree-distributions for MBN with respect to percentage node removal.It is particularly interesting to note that MBN originally shows a better fit forexponential distribution (as discussed earlier), whereas degree-removal allows thenetwork to evolve into a scale-free topology with varying power-law exponent, γ.The above phenomenon could be attributed to the reduction of noise (randomnessof connectivity and nodal redundancy) due to removal of nodes. Also, from Table 1,we observe that the bus networks are assortative with HBN and KBN showing weakdisassortative behaviour. From a transportation perspective, assortative mixing isbeneficial as this will allow direct connectivity between hubs.

As noted earlier, bus networks form a specific class of complex networks thatgrow and evolve over physically constrained spatial networks. Interesting in thisregard is the city of Ahmedabad and the Ahmedabad BRTS (ABN). Statisticalanalysis of road networks (by considering intersections as nodes and roads as links)has shown that the topological structure of the road networks in the city of Ahmed-abad show a scale-free degree distribution with γ = 2.5 and lij = 5.20, which is verymuch similar to ABN [10] (see Table 1). Road intersections are usually separatedby a distance which is geographically much smaller as compared to the distancebetween bus stops, therefore, our results emphasize on the fact that transportationundoubtedly brings the world closer. What we observed from our paper is that busnetworks although planned, have certain magnitude of randomness associated withthem which in turn gives rise to a broad spectrum of network topology. This ismore evident from our observations about node-redundancy and structural invari-ance. Also, from the above analysis we can observe that ABN and DBN althoughtopologically different have an efficient structure as compared to others due to lowcharacteristic path length and high clustering coefficient unlike CBN and MBN,weakly assortative mixing unlike HBN and KBN yet robust and resilient unlikeCBN and MBN. Although being topologically very different, ABN and DBN sharesimilar statistical properties. One striking feature that separates the one from theother is that betweenness plays a vital role in ABN whereas, DBN is particularlysensitive to closeness (see Figure 3). This opens before us new horizons for efficienttransportation network designing and planning. Questions like, what are the statis-tical properties of the network that will ensure an efficient network or how networktopology is related to the statistical properties of the network and vice-versa wouldbe both challenging and worthwhile to answer. It would be exciting to come upwith innovative models to capture the growth and evolution of real-world large scalepublic transit networks and suggest generative methods to reduce noise (in the net-work) due to random node attachment by including geographic and socio-economicconstraints, like demand, flow, cost etc., to maximize certain network parameter(s)

13

Page 14: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

or node-utility function(s) based on the above constraints.

References

[1] Albert-Laszlo Barabasi and Reka Albert. Emergence of scaling in randomnetworks. Science, 286(5439):509–512, 1999.

[2] Reka Albert and Albert-Laszlo Barabasi. Statistical mechanics of complexnetworks. Reviews of Modern Physics, 74(1):47, 2002.

[3] Reka Albert, Hawoong Jeong, and Albert-Laszlo Barabasi. Internet: Diameterof the world-wide web. Nature, 401(6749):130–131, 1999.

[4] Sergey N Dorogovtsev, AV Goltsev, and Jose Ferreira F Mendes. Pseudofractalscale-free web. Physical Review E, 65(6):066122, 2002.

[5] Mark EJ Newman. The structure and function of complex networks. SIAMReview, 45(2):167–256, 2003.

[6] Duncan J Watts and Steven H Strogatz. Collective dynamics of ‘small-world’networks. Nature, 393(6684):440–442, 1998.

[7] Erzsebet Ravasz and Albert-Laszlo Barabasi. Hierarchical organization in com-plex networks. Physical Review E, 67(2):026112, 2003.

[8] Hiro-Sato Niwa. Power-law versus exponential distributions of animal groupsizes. Journal of Theoretical Biology, 224(4):451–457, 2003.

[9] J Jiang, M Calvao, A Magalhases, D Vittaz, R Mirouse, F Kouramavel, F Tsob-nange, and QA Wang. Study of the urban road networks of le mans. arXivpreprint arXiv:1002.0151, 2010.

[10] Sergio Porta, Paolo Crucitti, and Vito Latora. The network analysis of urbanstreets: a dual approach. Physica A: Statistical Mechanics and its Applications,369(2):853–866, 2006.

[11] Parongama Sen, Subinay Dasgupta, Arnab Chatterjee, PA Sreeram,G Mukherjee, and SS Manna. Small-world properties of the indian railwaynetwork. Physical Review E, 67(3):036106, 2003.

[12] Atanu Chatterjee and Gitakrishnan Ramadurai. Scaling laws in chennaibus network. In 4th International Conference on Complex Systems and Ap-plications, France, pages 137–141. https://halshs.archives-ouvertes.fr/halshs-01060875/document, 2014.

[13] Ravindra K Ahuja, Thomas L Magnanti, and James B Orlin. Network flows.Technical report, DTIC Document, 1988.

[14] Dimitris Bertsimas and Melvyn Sim. Robust discrete optimization and networkflows. Mathematical programming, 98(1):49–71, 2003.

14

Page 15: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

[15] Sybil Derrible and Christopher Kennedy. Network analysis of world subwaysystems using updated graph theory. Transportation Research Record: Journalof the Transportation Research Board, (2112):17–25, 2009.

[16] Yihan Zhang, Qingnian Zhang, and Jigang Qiao. Analysis of guangzhou metronetwork based on l-space and p-space using complex network. In Geoinformat-ics (GeoInformatics), 2014 22nd International Conference on, pages 1–6, June2014.

[17] Paul Erdos and A Renyi. On the evolution of random graphs. Publ. Math.Inst. Hungar. Acad. Sci, 5:17–61, 1960.

[18] Steven H Strogatz. Exploring complex networks. Nature, 410(6825):268–276,2001.

[19] Reka Albert, Istvan Albert, and Gary L Nakarado. Structural vulnerability ofthe north american power grid. Physical Review E, 69(2):025103, 2004.

[20] Peer Bork, Lars J Jensen, Christian von Mering, Arun K Ramani, Insuk Lee,and Edward M Marcotte. Protein interaction networks from yeast to human.Current opinion in structural biology, 14(3):292–299, 2004.

[21] Hawoong Jeong, Sean P Mason, A-L Barabasi, and Zoltan N Oltvai. Lethalityand centrality in protein networks. Nature, 411(6833):41–42, 2001.

[22] David Easley and Jon Kleinberg. Networks, crowds, and markets: Reasoningabout a highly connected world. Cambridge University Press, 2010.

[23] Ramon Ferrer i Cancho and Ricard V Sole. Least effort and the origins ofscaling in human language. Proceedings of the National Academy of Sciences,100(3):788–791, 2003.

[24] Ganesh Bagler. Analysis of the airport network of india as a complex weightednetwork. Physica A: Statistical Mechanics and its Applications, 387(12):2972–2980, 2008.

[25] C Von Ferber, T Holovatch, Yu Holovatch, and V Palchykov. Public transportnetworks: empirical analysis and modeling. The European Physical Journal B,68(2):261–275, 2009.

[26] O Woolley-Meza, C Thiemann, D Grady, JJ Lee, H Seebens, B Blasius, andD Brockmann. Complexity in human transportation networks: a comparativeanalysis of worldwide air transportation and global cargo-ship movements. TheEuropean Physical Journal B, 84(4):589–600, 2011.

[27] Roger Guimera, Stefano Mossa, Adrian Turtschi, and LA Nunes Amaral. Theworldwide air transportation network: Anomalous centrality, community struc-ture, and cities’ global roles. Proceedings of the National Academy of Sciences,102(22):7794–7799, 2005.

15

Page 16: Statistical Analysis of Bus Networks in India · Chennai-600036, India Abstract Through the past decade the eld of network science has established itself as a common ground for the

[28] Julian Sienkiewicz and Janusz A Ho lyst. Statistical analysis of 22 public trans-port networks in poland. Physical Review E, 72(4):046127, 2005.

[29] Panagiotis Angeloudis and David Fisk. Large subway systems as complexnetworks. Physica A: Statistical Mechanics and its Applications, 367:553–558,2006.

[30] Alain Barrat, Marc Barthelemy, Romualdo Pastor-Satorras, and AlessandroVespignani. The architecture of complex weighted networks. Proceedings of theNational Academy of Sciences of the United States of America, 101(11):3747–3752, 2004.

[31] Mark EJ Newman. Assortative mixing in networks. Physical review letters,89(20):208701, 2002.

[32] Aaron Clauset, Cosma Rohilla Shalizi, and Mark EJ Newman. Power-lawdistributions in empirical data. SIAM review, 51(4):661–703, 2009.

[33] Weibing Deng, Wei Li, Xu Cai, and Qiuping A Wang. The exponential degreedistribution in complex networks: Non-equilibrium network theory, numeri-cal simulation and empirical data. Physica A: Statistical Mechanics and itsApplications, 390(8):1481–1485, 2011.

5 Acknowledgments

The authors acknowledge the support from Center of Excellence in Urban Trans-port at the Indian Institute of Technology, Madras, sponsored by the Ministry ofUrban Development, Government of India and the Information Technology Re-search Academy, a Division of Media Labs Asia, a non-profit organization of theDepartment of Electronics and Information Technology, funded by the Ministry ofCommunications and Information Technology, Government of India.

6 Author contributions statement

A.C. and G.R. conceived the idea, A.C. and M.M. collected the data and wrote themanuscript, A.C. wrote the codes and ran the simulations, A.C. and G.R. analyzedthe results. All authors reviewed the manuscript.

7 Additional information

Competing financial interests The authors declare no competing financial in-terests.

16