Core-periphery Structures in Networks


  • Sang Hoon Lee School of Physics, Korea Institute for Advanced Study

    http://newton.kias.re.kr/~lshlj82

    Core-periphery Structures in Networks

    The 18th Statistical Physics Workshop, Chonbuk National University, 20-22 August, 2015

    SHL, M. Cucuringu, and Mason Porter, Density-based and Transport-based Core-periphery Structures in Networks, Phys. Rev. E 89, 032810 (2014); M. Cucuringu, M. P. Rombach, SHL, and M. A. Porter, Detection of Core-periphery Structure in Networks Using Spectral Methods and Geodesic Paths, e-print arXiv:1410.6572; SHL, Is Nestedness in Networks Generalized Core-periphery Structures?, in preparation.


  • Mason Porter

  • Community structure in networks

    adjacency matrix

    modularity (the objective function to be maximized):

    \[ Q = \frac{1}{2S} \sum_{ij} \left[ W_{ij} - \frac{s_i s_j}{2S} \right] \delta(g_i, g_j), \qquad s_i = \sum_j W_{ij} \]

    g_i: the community to which node i belongs; s_i: the strength of node i; S: the sum of edge weights in the network (so that 2S = \sum_{ij} W_{ij})

    Mason Porter, J.-P. Onnela, and P. J. Mucha, Not. Am. Math. Soc. 56, 1082 (2009); S. Fortunato, Phys. Rep. 486, 75 (2010).

    [Adjacency-matrix spy plot of a network with community structure: nz = 2730; p1 = 0.5, p2 = 0.05, p3 = 0.5; pS = 0, dS = 0]
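The modularity objective above can be evaluated with a minimal NumPy sketch; this only scores a given partition (the function name `modularity` and the toy graph are illustrative, and the hard part, maximizing Q over partitions, is not attempted here):

```python
import numpy as np

def modularity(W, groups):
    """Evaluate Q = (1/2S) * sum_ij [W_ij - s_i*s_j/(2S)] * delta(g_i, g_j)
    for a symmetric weight matrix W, where s_i = sum_j W_ij is the strength
    of node i and 2S = sum_ij W_ij is twice the total edge weight."""
    W = np.asarray(W, dtype=float)
    s = W.sum(axis=1)                              # strengths s_i
    two_S = W.sum()                                # 2S
    g = np.asarray(groups)
    same = g[:, None] == g[None, :]                # delta(g_i, g_j)
    return float(((W - np.outer(s, s) / two_S) * same).sum() / two_S)

# Two triangles joined by a single edge: the natural two-community split.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0
print(round(modularity(W, [0, 0, 0, 1, 1, 1]), 4))  # → 0.3571
```

Putting all nodes in one community gives Q = 0 exactly, which is a quick sanity check on any implementation.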



  • Core-periphery structure in networks

    P. Csermely, A. London, L.-Y. Wu, and B. Uzzi, J. Complex Networks 1, 93 (2013); M. P. Rombach, M. A. Porter, J. H. Fowler, and P. J. Mucha, SIAM J. App. Math 74, 167 (2014).

    SHL, M. Cucuringu, and M. A. Porter, Phys. Rev. E 89, 032810 (2014); M. Cucuringu, M. P. Rombach, SHL, and M. A. Porter, e-print arXiv:1410.6572.

    adjacency matrix

    [Adjacency-matrix spy plot of a network with core-periphery structure: nz = 2258; p1 = 0.5, p2 = 0.2, p3 = 0.02; pS = 0, dS = 0]
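Spy plots like the ones on these slides can be reproduced with a two-block sampling sketch. The reading of the parameters is an assumption from the plot labels: p1 = edge probability within block 1, p3 = within block 2, p2 = between blocks; the pS and dS labels are not modeled here.

```python
import numpy as np

def block_model(n1, n2, p1, p2, p3, seed=0):
    """Sample a symmetric 0/1 adjacency matrix from a two-block model.
    p1 > p2 > p3 gives a core-periphery-like matrix (dense core block,
    sparse periphery block); p1, p3 > p2 gives two communities."""
    rng = np.random.default_rng(seed)
    n = n1 + n2
    P = np.full((n, n), p2)          # between-block probability everywhere...
    P[:n1, :n1] = p1                 # ...overwritten inside block 1
    P[n1:, n1:] = p3                 # ...and inside block 2
    upper = np.triu(rng.random((n, n)) < P, k=1)   # sample the upper triangle
    return (upper | upper.T).astype(int)           # symmetric, no self-loops

A = block_model(50, 50, p1=0.5, p2=0.2, p3=0.02, seed=42)
print(A.shape, A.sum() // 2)   # matrix size and number of edges
```

With p1 = 0.5, p2 = 0.05, p3 = 0.5 the same sketch yields the community-structured matrix from the earlier slide.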


  • …value of the degree k (in particular, k ≥ 20) [34] and rich club [39, 40], but also by a knotty center of nodes that have a high geodesic betweenness centrality but not necessarily a high degree [36]. A k-core decomposition has also been applied to functional brain imaging data to demonstrate a relationship between network reconfiguration and errors in task performance [41].

    A novel approach that is able to overcome many of these conceptual limitations is the geometrical core-score [30], which is an inherently continuous measure, is defined for weighted networks, and can be used to identify regions of a network core without relying solely on their degree or strength (i.e., weighted degree). Moreover, by using this measure, one can produce (i) continuous results, which make it possible to measure whether a brain region is more core-like or periphery-like; (ii) a discrete classification of core versus periphery; or (iii) a finer discrete division (e.g., into 3 or more groups). In addition, this method can identify multiple geometrical cores in a network and rank nodes in terms of how strongly they participate in different possible cores. This sensitivity is particularly helpful for the examination of brain networks for which multiple cores are hypothesized to mediate multimodal integration [42]. In this paper, we have demonstrated that functional brain networks derived from task-based data acquired during goal-directed brain activity exhibit geometrical core-periphery organization. Moreover, they are specifically characterized by a straightforward core-periphery landscape that includes a relatively small core composed of roughly 10% of the nodes in the network.

    In this paper, we have introduced a method and associated definitions to identify a temporal core-periphery organization based on changes in a node's module allegiance over time. We have defined the notion of a temporal core as a set of regions that exhibit fewer changes in module allegiance over time than expected in a dynamic-network null model. Neurobiologically, the temporal core contains brain areas that show consistent task-based mesoscale functional connectivity over the course of an experiment, and it is therefore perhaps unsurprising that their …

    Figure 6. Relationship between temporal and geometrical core-periphery organizations. A strong negative correlation exists between flexibility and the geometrical core score for networks constructed from blocks of (A) extensively, (B) moderately, and (C) minimally trained sequences on scanning session 1 (day 1; circles), session 2 (after approximately 2 weeks of training; squares), session 3 (after approximately 4 weeks of training; diamonds), and session 4 (after approximately 6 weeks of training; stars). This negative correlation indicates that the temporal core-periphery organization is mimicked in the geometrical core-periphery organization and therefore that the core of dynamically stiff regions also exhibits dense connectivity. We show temporal core nodes in cyan, temporal bulk nodes in gold, and temporal periphery nodes in maroon. The darkness of data points indicates scanning session; darker colors indicate earlier scans, so the darkest colors indicate scan 1 and the lightest ones indicate scan 4. The grayscale lines indicate the best linear fits; again, darker colors indicate earlier scans, so session 1 is in gray and session 4 is in light gray. The Pearson correlation between the flexibility (averaged over 100 multilayer modularity optimizations, 20 participants, and 4 scanning sessions) and the geometrical core score (averaged over 20 participants and 4 scanning sessions) is significant for the EXT (r = −0.92, p = 3.4 × 10⁻⁴⁵), MOD (r = −0.93, p = 2.2 × 10⁻⁴⁹), and MIN (r = −0.93, p = 4.8 × 10⁻⁵⁰) data. doi:10.1371/journal.pcbi.1003171.g006

    Figure 7. Core-periphery organization of brain dynamics during learning. The relationship between temporal and geometrical core-periphery organization and their associations with learning are present in individual subjects. We represent this relationship using spirals in a plane; data points in this plane represent brain regions located at the polar coordinates (f s, −f k), where f is the flexibility of the region, s is the skewness of flexibility over all regions, and k is the learning parameter (see the Materials and Methods) that describes each individual's relative improvement between sessions. The skewness predicts individual differences in learning; the Spearman rank correlation is r = −0.480 and p = 0.034. Poor learners (straighter spirals) tend to have a low skewness (short spirals), whereas good learners (curvier spirals) tend to have high skewness (long spirals). Color indicates flexibility: blue nodes have lower flexibility, and brown nodes have higher flexibility. doi:10.1371/journal.pcbi.1003171.g007


    D. S. Bassett, N. F. Wymbs, M. A. Porter, P. J. Mucha, J. M. Carlson, and S. T. Grafton, PNAS 108, 7641 (2011); D. S. Bassett, N. F. Wymbs, M. P. Rombach, M. A. Porter, P. J. Mucha, and S. T. Grafton, PLOS Comput. Biol. 9, e1003171 (2013).

    …anatomy where few modules uncovered at large spatial scales are complemented by more modules at smaller spatial scales (27).

    Dynamic Modular Structure. We next consider evolvability, which is most readily detected when the organism is under stress (29) or when acquiring new capacities such as during external training in our experiment. We found that the community organization of brain connectivity reconfigured adaptively over time. Using a recently developed mathematical formalism to assess the presence of dynamic network reconfigurations (25), we constructed multilayer networks in which we link the network for each time window (Fig. 3A) to the network in the time windows before and after (Fig. 3B) by connecting each node to itself in the neighboring windows. We then measured modular organization (30-32) on this linked multilayered network to find long-lasting modules (25).

    To verify the reliability of our measurements of dynamic modular architecture, we introduced three null models based on permutation testing (Fig. 3C). We found that cortical connectivity is specifically patterned, which we concluded by comparison to a connectional null model in which we scrambled links between nodes in each time window (33). Furthermore, cortical regions maintain these individual connectivity signatures that define community organization, which we concluded by comparison to a nodal null model in which we linked a node in one time window to a randomly chosen node in the previous and next time windows. Finally, we found that functional communities exhibit a smooth temporal evolution, which we identified by comparing diagnostics computed using the true multilayer network structure to those computed using a temporally permuted version (Fig. 3D). We constructed this temporal null model by randomly reordering the multilayer network layers in time.

    By comparing the structure of the cortical network to those of the null models, we found that the human brain exhibited a heightened modular structure in which more modules of smaller size were discriminable as a consequence of the emergence and extinction of modules in cortical network evolution. The stationarity of communities, defined by the average correlation between partitions over consecutive time steps (34), was also higher in the human brain than in the connectional or nodal null models, indicating a smooth temporal evolution.

    Learning. Given the dynamic architecture of brain connectivity, it is interesting to ask whether the specific architecture changes …

    Fig. 1. Structure of the investigation. (A) To characterize the network structure of low-frequency functional connectivity (24) at each temporal scale, we partitioned the raw fMRI data (Upper Left) from each subject's brain into signals originating from N = 112 cortical structures, which constitute the network's nodes (Upper Right). The functional connectivity, constituting the network edges, between two cortical structures is given by a Pearson correlation between the mean regional activity signals (Lower Right). We then statistically corrected the resulting N × N correlation matrix using a false discovery rate correction (54) to construct a subject-specific weighted functional brain network (Lower Left). (B) Schematic of the investigation that was performed over the temporal scales of days, hours, and minutes. The complete experiment, which defines the largest scale, took place over the course of three days. At the intermediate scale, we conducted further investigations of the experimental sessions that occurred on each of those three days. Finally, to examine higher-frequency temporal structure, we cut each experimental session into 25 nonoverlapping windows, each of which was a few minutes in duration.
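The network-construction step of Fig. 1A (regional time series → Pearson correlation matrix) can be sketched as follows. The signals here are random stand-ins for real fMRI data, and the false-discovery-rate correction used in the paper is omitted:

```python
import numpy as np

# Regional time series -> Pearson functional-connectivity matrix.
rng = np.random.default_rng(0)
N, T = 112, 200                        # N = 112 cortical regions, T time points
signals = rng.standard_normal((N, T))  # placeholder for mean regional activity
FC = np.corrcoef(signals)              # N x N Pearson correlation matrix
np.fill_diagonal(FC, 0.0)              # discard trivial self-correlations
print(FC.shape)  # → (112, 112)
```

The resulting symmetric matrix FC plays the role of the weighted adjacency matrix W in the modularity and core-score formulas on the other slides.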

    Fig. 2. Multiscale modular architecture. (A) Results for the modular decomposition of functional connectivity across temporal scales. (Left) The network plots show the extracted modules; different colors indicate different modules, and larger separation between modules is used to visualize weaker connections between them. (A) and (B) correspond to the entire experiment and individual sessions, respectively. Boxplots show the modularity index Q (Left) and the number of modules (Right) in the brain network compared to randomized networks. See Materials and Methods for a formal definition of Q. (C) Modularity index Q and the number of modules for the cortical (blue) compared to randomized networks (red) over the 75 time windows. Error bars indicate standard deviation in the mean over subjects.


    node: brain region, i.e., ROI (region of interest); edge: functional connection

    Core-periphery structure in functional brain networks

  • edge-density-based definition: Core Score (CS) for nodes

    Core-periphery structure from edge density

    core quality:

    \[ R(\alpha, \beta) = \sum_{i,j} W_{ij}\, C_i(\alpha, \beta)\, C_j(\alpha, \beta) \]

    C(\alpha, \beta): core vector. Plotted against the rank i = 1, \dots, N of node i, C_i(\alpha, \beta) increases from 0 to (1 - \alpha)/2 over the periphery (ranks up to \lfloor \beta N \rfloor) and then from (1 + \alpha)/2 to 1 over the core, for \alpha \in [0, 1] and \beta \in [0, 1].

    optimization: deciding the node sequence i = (1, \dots, N) that maximizes R(\alpha, \beta)

    final (normalized) core score:

    \[ \mathrm{CS}(i) = Z \sum_{(\alpha, \beta)} C_i(\alpha, \beta)\, R(\alpha, \beta) \]

    M. P. Rombach, M. A. Porter, J. H. Fowler, and P. J. Mucha, SIAM J. App. Math 74, 167 (2014).

    SHL, M. Cucuringu, and M. A. Porter, Phys. Rev. E 89, 032810 (2014).

    [Adjacency-matrix spy plot with core (dense) and periphery (sparse) blocks: nz = 2258; p1 = 0.5, p2 = 0.2, p3 = 0.02; pS = 0, dS = 0]
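A minimal sketch of the transition function and core quality defined above, assuming the piecewise-linear form of C_i(α, β) from Rombach et al.; the full method additionally optimizes the node order (e.g., by simulated annealing) and aggregates R over a grid of (α, β), neither of which is attempted here:

```python
import numpy as np

def core_vector(N, alpha, beta):
    """Piecewise-linear transition function C_i(alpha, beta) for ranks
    i = 1..N: rises toward (1 - alpha)/2 over the periphery (ranks up to
    floor(beta * N)), then from (1 + alpha)/2 toward 1 over the core."""
    i = np.arange(1, N + 1, dtype=float)
    bN = np.floor(beta * N)
    periphery = i * (1 - alpha) / (2 * max(bN, 1.0))
    core = (i - bN) * (1 - alpha) / (2 * max(N - bN, 1.0)) + (1 + alpha) / 2
    return np.where(i <= bN, periphery, core)

def core_quality(W, order, alpha, beta):
    """Core quality R(alpha, beta) = sum_ij W_ij C_i C_j, where order[k]
    is the node placed at rank k + 1 (later ranks are more core-like)."""
    C = np.empty(len(order))
    C[np.asarray(order)] = core_vector(len(order), alpha, beta)
    return float(C @ np.asarray(W, dtype=float) @ C)

# Toy graph: nodes 0 and 1 connect to everyone; nodes 2 and 3 only to them.
W4 = np.zeros((4, 4))
for i, j in [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3)]:
    W4[i, j] = W4[j, i] = 1.0
# Ranking the hubs as the core beats the reverse ordering:
print(core_quality(W4, [2, 3, 0, 1], 0.5, 0.5) >
      core_quality(W4, [0, 1, 2, 3], 0.5, 0.5))  # → True
```

Because R rewards weight between high-C pairs, any ordering that pushes densely connected nodes to the high-rank (core) end increases the objective, which is exactly what the optimization step exploits.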


  • two modes or modules (or communities)

    mesoscale structures of a network in terms of transport

    [Example network with seven nodes, labeled 1-7]

    R. Lambiotte, J.-C. Delvenne, and M. Barahona, IEEE Transactions on Network Science and Engineering 1, 76 (2015); M. Rosvall and C. T. Bergstrom, PNAS 104, 7327 (2007); PNAS 105, 1118 (2008).



  • backup-pathway-based definition of coreness measure: Path Score (PS) for nodes and edges

    Core-periphery structure from transport

    \[ \mathrm{PS}(i) = \frac{1}{|E|} \sum_{(j,k) \in E} \sum_{\{p_{jk}\}} \sigma_{jik}[E \setminus (j,k)] \]

    where \sigma_{jik}[E \setminus (j,k)] = 1/|\{p_{jk}\}| if node i is in the set \{p_{jk}\} that consists of optimal backup paths from node j to node k, where we stress that the edge (j,k) is removed from E, and \sigma_{jik}[E \setminus (j,k)] = 0 otherwise.

    the set of edges: E = \{(j,k) \mid node j is connected to node k\}

    SHL, M. Cucuringu, and M. A. Porter, Phys. Rev. E 89, 032810 (2014).
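The Path Score above can be sketched with NetworkX under a simplified reading of the definition: for each edge (j, k), remove it, enumerate all shortest "backup" paths from j to k, and credit every intermediate node on each such path with 1/(number of backup paths). Details of the published computation (e.g., tie handling) may differ.

```python
import networkx as nx

def path_scores(G):
    """Path Score sketch, normalized by |E|.  Endpoint pairs left
    disconnected by the edge removal are simply skipped."""
    PS = dict.fromkeys(G, 0.0)
    m = G.number_of_edges()
    for j, k in list(G.edges()):
        G.remove_edge(j, k)
        try:
            paths = list(nx.all_shortest_paths(G, j, k))
            for p in paths:
                for v in p[1:-1]:              # intermediate nodes only
                    PS[v] += 1.0 / len(paths)
        except nx.NetworkXNoPath:
            pass
        G.add_edge(j, k)                        # restore the removed edge
    return {v: s / m for v, s in PS.items()}

# On a 4-cycle, every node is the unique intermediary on exactly two
# backup paths, so each node scores 2/4 = 0.5.
print(path_scores(nx.cycle_graph(4)))
```

Applying it to nx.karate_club_graph() gives node-level scores analogous to those visualized on the following slides; degree-1 nodes necessarily score 0, since they can never lie in the interior of a path.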


  • Core-periphery structure from transport

    Zachary's karate club network with a community structure

    Zachary's karate club network with the PS values on the nodes and edges

    ?!

  • Core-periphery structure based on rank-2 matrix approximation

    M. Cucuringu, M. P. Rombach, SHL, and M. A. Porter, e-print arXiv:1410.6572.

    interpreting the adjacency matrix as a perturbation of a low-rank matrix

    …other way around) that leads to the largest increase in the objective function (6). Alternatively, if one wishes to maintain the current size of the core and periphery sets, then one can choose to swap a pair of vertices from their assignments (of core or periphery) that leads to the largest increase in the objective function.

    Another interesting avenue to explore is the connection to group synchronization over Z2 [22, 23]. Despite the common terminology (which is a historical accident), we note that this problem is very different from classical synchronization phenomena in ensembles of coupled oscillators [73]. In group synchronization, one seeks to estimate the unknown values z_i \in \{-1, +1\} for i \in \{1, \dots, n\} associated to the vertices of a graph G = (V, E), given a sparse, noisy subset of pairwise measurements on the edges of the graph (Z_{ij} = z_i z_j \in \{-1, 1\}). For each edge (i, j) \in E, the stochastic variable is either +1 or −1; in other words, the measurement is either accurate or noisy. In group synchronization over Z2, one maximizes the objective function

    \[ (9) \quad \max_{z \in \mathbb{Z}_2^n} \sum_{(i,j) \in E} z_i z_j Z_{ij}. \]

    For each edge in the set E such that Z_{ij} = 1 and for the estimated vertices z_i = z_j = 1 (or Z_{ij} = 1 and z_i = z_j = −1), one adds a value of +1 to the sum in (9). However, whenever Z_{ij} = 1 with z_i = 1 and z_j = −1, one adds a value of −1 to the objective function. The goal of the synchronization problem over the group Z2 (whose table we show in Table 2) is to maximize the number of pairwise agreements.

    In light of the objective function (5) for detecting core-periphery structure, consider the following group-synchronization-like maximization problem:

    \[ (10) \quad \max_{z \in \mathbb{Z}_2^n} \sum_{(i,j) \in E} (z_i \circ z_j) A_{ij}, \]

    where A_{ij} are (as usual) the adjacency-matrix elements of the graph G and \circ denotes the operation of the underlying semigroup S (whose table we show in Table 3), +1 denotes a vertex from the core set, and −1 denotes a vertex from the periphery set. The objective function (10) is equivalent to the one in function (5): for a given proposed solution, two adjacent core vertices add +1 to the objective function; we also add +1 to the objective function when a core vertex is adjacent to a peripheral vertex, and we add −1 to the objective function when two peripheral vertices are adjacent to each other. The difference between the two optimization problems arises from the fact that their underlying algebraic structures are different. Clearly, it would be interesting to investigate whether one can use methods for solving the group-synchronization problem over Z2 (such as the eigenvector method and semidefinite programming [32, 34, 68]) for the detection of core-periphery structure. See Refs. [23, 24] for an application of group synchronization to the graph-realization problem arising in distance geometry and Ref. [22] for a very recent application to detecting communities in signed multiplex networks.²

    Table 2 (table for the group Z2):
         ×  | +1  −1
        +1  | +1  −1
        −1  | −1  +1

    Table 3 (semigroup table):
         ∘  | +1  −1
        +1  | +1  +1
        −1  | +1  −1

    5. LowRank-Core: Core-Periphery Detection Via Low-Rank Matrix Approximation. Another approach for detecting core-periphery structure in a network is to interpret its adjacency matrix as a perturbation of a low-rank matrix. Consider, for instance, the block model

    \[ (11) \quad G_0 = \begin{pmatrix} 1_{n_c \times n_c} & 1_{n_c \times n_p} \\ 1_{n_p \times n_c} & 0_{n_p \times n_p} \end{pmatrix}, \]

    which assumes that core vertices are fully connected among themselves and with all vertices in the periphery set and that no edges exist between any pair of peripheral vertices. The block model in equation (11) corresponds to an idealized block model that Borgatti and Everett [10] employed in a discrete notion of core-periphery structure. The rank of the matrix G_0 is 2, as any 3 × 3 submatrix has at least two identical rows or columns. Consequently,

    ² The former application involves synchronization over the orthogonal group O(n), and the latter involves synchronization over Z2.
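The rank-2 claim for the ideal block model G0 is easy to verify numerically; the helper name below is illustrative:

```python
import numpy as np

def ideal_block_model(nc, n_p):
    """Borgatti-Everett ideal matrix G0: the nc core vertices connect to
    everyone (including each other); the n_p peripheral vertices connect
    only to the core."""
    n = nc + n_p
    G0 = np.zeros((n, n))
    G0[:nc, :] = 1.0    # core rows: all ones
    G0[:, :nc] = 1.0    # core columns: all ones
    return G0

G0 = ideal_block_model(4, 6)
print(np.linalg.matrix_rank(G0))  # → 2, as stated in the text
```

There are only two distinct row patterns (core rows of all ones, periphery rows that are ones on the core columns and zeros elsewhere), which is why the rank is exactly 2 and why a rank-2 approximation of a noisy adjacency matrix can recover the core/periphery split.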



    decompose the original adjacency matrix as G = G_0 + W, where W is a noise matrix

    Note that W is a random block-structured matrix with independent entries, and its expected value is the rank-2 matrix with entries

    (16) E(W_{ij}) =