triadic data analysis in temporal and higher-order networks

Post on 06-Jun-2022


1

Triadic data analysis in temporal and higher-order networks

Austin Benson · Cornell University
DynaMo@Networks 2021

The humble triangle is fundamental in network science.

2

The Strength of Weak Ties, Granovetter, 1973.
Collective dynamics of ‘small-world’ networks, Watts & Strogatz, 1998.

3

The Structure of Positive Interpersonal Relations in Small Groups, 1967.
James Davis and Samuel Leinhardt analyzing triangles to test a sociological theory of George Homans using data from Theodore Newcomb.

4

Network Motifs: Simple Building Blocks of Complex Networks, Milo et al., 2002.

Structure and function of the feed-forward loop network motif, Mangan & Alon, 2003.

The Coherent Feedforward Loop Serves as a Sign-sensitive Delay Element in Transcription Networks, Mangan, Zaslaver, & Alon, 2003.

5

• Higher-order / multi-way interactions
• Temporal information
• Multilayer, multiplex, heterogeneous, attributed
• Features / covariates
• Large-scale with millions or billions of edges

Modern network data is rich…

@zhangqian_rach

Triangles are super useful for this rich data!

Triadic analysis for modern network data.

6

w/ R Abebe, M Schaub, J Kleinberg, A Jadbabaie

1. Open and closed triangles in temporal, higher-order interactions.
Simplicial closure and higher-order link prediction, PNAS, 2018.

2. Triadic motifs in temporal networks.
Motifs in temporal networks, WSDM, 2017.
Sampling methods for counting temporal motifs, WSDM, 2019.

Real-world systems are composed of “higher-order” interactions that we often reduce to pairwise ones.

7

Commerce
nodes are products
several products can be purchased at once

Communications
nodes are people/accounts
emails often have several recipients, not just one

Physical proximity
nodes are people
people gather in groups

Cell biology
nodes are proteins
protein complexes may involve several proteins

We collected many datasets of timestamped hyperedges

8

1. Coauthorship in different domains.
2. Emails with multiple recipients.
3. Tags on Q&A forums.
4. Threads on Q&A forums.
5. Contact/proximity measurements.
6. Musical artist collaboration.
7. Substance makeup and classification codes applied to drugs the FDA examines.
8. U.S. Congress committee memberships and bill sponsorship.
9. Combinations of drugs seen in patients in ER visits.

https://math.stackexchange.com/q/80181

bit.ly/sc-holp-data

Thinking of higher-order data as a weighted projected graph with filled-in structures is a convenient viewpoint.

9

[Figure: projected graph on nodes 1 to 9 with integer edge weights.]

Data.
t1: {1, 2, 3, 4}
t2: {1, 3, 5}
t3: {1, 6}
t4: {2, 6}
t5: {1, 7, 8}
t6: {3, 9}
t7: {5, 8}
t8: {1, 2, 6}

Projected graph W.
Wij = # of hyperedges containing nodes i and j.
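As a minimal sketch of the definition above, the projected graph can be built from the eight hyperedges on the slide by counting, for every pair of nodes, how many hyperedges contain both. The function name `project` and the choice to key pairs as (i, j) with i < j are illustrative assumptions, not part of the talk.

```python
from collections import Counter
from itertools import combinations

# Timestamped hyperedges from the slide, listed in time order t1..t8.
hyperedges = [
    {1, 2, 3, 4}, {1, 3, 5}, {1, 6}, {2, 6},
    {1, 7, 8}, {3, 9}, {5, 8}, {1, 2, 6},
]

def project(hyperedges):
    """Return W as a Counter keyed by node pairs (i, j) with i < j."""
    W = Counter()
    for h in hyperedges:
        # Every pair of nodes inside a hyperedge gains one unit of weight.
        for pair in combinations(sorted(h), 2):
            W[pair] += 1
    return W

W = project(hyperedges)
print(W[(1, 3)])  # nodes 1 and 3 co-appear in t1 and t2
```

Pairs that never co-appear are simply absent from `W`, and a `Counter` reports weight 0 for them.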

10


11

[Figure: two triangles on nodes i, j, k, one open and one closed.]

Open triangle: each pair has been in a hyperedge together, but all 3 nodes have never been in the same hyperedge.

Closed triangle: there is some hyperedge that contains all 3 nodes.
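The two definitions can be checked directly on the small example dataset from the earlier slide. The sketch below is a brute-force enumeration over all node triples, which is only reasonable for toy inputs; the function name `open_and_closed_triangles` is an assumption made for illustration.

```python
from itertools import combinations

# Timestamped hyperedges from the earlier slide, in time order t1..t8.
hyperedges = [
    {1, 2, 3, 4}, {1, 3, 5}, {1, 6}, {2, 6},
    {1, 7, 8}, {3, 9}, {5, 8}, {1, 2, 6},
]

def open_and_closed_triangles(hyperedges):
    """Return (open, closed) sets of sorted node triples."""
    nodes, pairs, closed = set(), set(), set()
    for h in hyperedges:
        nodes |= h
        pairs.update(combinations(sorted(h), 2))   # pairs that co-appear
        closed.update(combinations(sorted(h), 3))  # triples inside one hyperedge
    open_ = {
        (i, j, k)
        for i, j, k in combinations(sorted(nodes), 3)
        if {(i, j), (i, k), (j, k)} <= pairs and (i, j, k) not in closed
    }
    return open_, closed

open_tris, closed_tris = open_and_closed_triangles(hyperedges)
print(len(open_tris) / (len(open_tris) + len(closed_tris)))  # fraction open
```

On this toy data the only open triangle is (1, 5, 8): each pair shares a hyperedge (t2, t5, t7), but no single hyperedge contains all three nodes.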

What’s more common in empirical data?

[Figure: fraction of triangles that are open (y-axis, 0 to 1) versus edge density in the projected graph (x-axis, log scale), one point per dataset: music-rap-genius, NDC-substances, NDC-classes, DAWN, coauth-DBLP, coauth-MAG-geology, coauth-MAG-history, congress-bills, congress-committees, tags-stack-overflow, tags-math-sx, tags-ask-ubuntu, email-Eu, email-Enron, threads-stack-overflow, threads-math-sx, threads-ask-ubuntu, contact-high-school, contact-primary-school.]

There is lots of variation in the fraction of triangles that are open, but datasets from the same domain are similar.

12

See also Topological analysis of data by Patania, Vaccarino, & Petri, 2017.

Dataset domain separation also occurs at the local level.

13

• Randomly sample 100 egonets per dataset and measure log of average degree and fraction of open triangles.

• Logistic regression model to predict domain (coauthorship, tags, threads, email, contact).

• 75% model accuracy vs. 21% with random guessing.

Triangles close over time.

14

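t1: {1, 2, 3, 4}
t2: {1, 3, 5}
t3: {1, 6}
t4: {2, 6}
t5: {1, 7, 8}
t6: {3, 9}
t7: {5, 8}
t8: {1, 2, 6}

For example, nodes 1, 2, and 6 form an open triangle after t4 (the pairs {1, 2}, {1, 6}, and {2, 6} have each co-appeared), and the triangle closes at t8.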

15

Substances in marketed drugs recorded in the National Drug Code directory.

Bin weighted edges into “weak” and “strong” ties in the projected graph W.
Wij = # of simplices containing nodes i and j.

• Weak ties. Wij = 1 (one hyperedge contains i and j)
• Strong ties. Wij ≥ 2 (at least two hyperedges contain i and j)

Weak and strong ties are useful characterizations.

Closure depends on structure in projected graph.

16

• First 80% of the data (in time) ⟶ record configurations of triplets not in closed triangle.
• Remainder of data ⟶ find fraction that are now closed triangles.
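The two-step experiment above can be sketched on the toy dataset from the earlier slides. This is a brute-force illustration under stated assumptions: a configuration is taken to be the sorted tie types of the three pairs in a triple, strong ties are assumed to be pairs sharing at least two hyperedges, and the names `tie` and `closure_by_configuration` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Timestamped hyperedges from the earlier slide, in time order t1..t8.
hyperedges = [
    {1, 2, 3, 4}, {1, 3, 5}, {1, 6}, {2, 6},
    {1, 7, 8}, {3, 9}, {5, 8}, {1, 2, 6},
]

def tie(weight):
    """Tie type of a pair: absent, weak (weight 1), or strong (weight >= 2)."""
    return "none" if weight == 0 else ("weak" if weight == 1 else "strong")

def closure_by_configuration(hyperedges, train_frac=0.8):
    """Map each training configuration to the fraction of triples that close later."""
    split = int(train_frac * len(hyperedges))
    train, test = hyperedges[:split], hyperedges[split:]
    W, nodes, closed_train = Counter(), set(), set()
    for h in train:
        nodes |= h
        for pair in combinations(sorted(h), 2):
            W[pair] += 1
        closed_train.update(combinations(sorted(h), 3))
    closed_test = set()
    for h in test:
        closed_test.update(combinations(sorted(h), 3))
    seen, closed = Counter(), Counter()
    for i, j, k in combinations(sorted(nodes), 3):
        if (i, j, k) in closed_train:
            continue  # only triples not yet in a closed triangle are recorded
        config = tuple(sorted(tie(W[p]) for p in ((i, j), (i, k), (j, k))))
        seen[config] += 1
        closed[config] += (i, j, k) in closed_test
    return {c: closed[c] / seen[c] for c in seen}

rates = closure_by_configuration(hyperedges)
print(rates[("weak", "weak", "weak")])
```

With an 80/20 split the training part is t1 to t6. The only open triangle there is (1, 2, 6), built from three weak ties, and it closes at t8, so that configuration has closure fraction 1.0 on this toy data. Real datasets need a smarter enumeration than looping over all triples.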

Increased edge density increases closure probability.

Increased tie strength increases closure probability.

Tension between edge density and tie strength.

[Figure: three panels, each plotting closure probability.]

We used this for a new higher-order link prediction task.

17

Data.
t1: {1, 2, 3, 4}
t2: {1, 3, 5}
t3: {1, 6}
t4: {2, 6}
t5: {1, 7, 8}
t6: {3, 9}
t7: {5, 8}
t8: {1, 2, 6}

• Observe simplices up to time t.
• Predict which groups of > 2 nodes will appear after time t.

We predict structure that graph models would not even consider!

18

[Figure: triangle on nodes i, j, k with weighted edges and a question mark.]

$$\mathrm{score}_p(i, j, k) = \left( W_{ij}^p + W_{jk}^p + W_{ik}^p \right)^{1/p}$$
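The score above is a power mean (up to a constant factor) of the three edge weights of a candidate triangle. A minimal sketch, assuming the weights come from the projected graph W and are positive; the function name `score_p` mirrors the slide, while the argument order and the error handling are illustrative choices.

```python
def score_p(w_ij, w_jk, w_ik, p):
    """Score of the triple (i, j, k) from its three projected-graph weights."""
    if p == 0:
        raise ValueError("p must be nonzero")
    return (w_ij ** p + w_jk ** p + w_ik ** p) ** (1.0 / p)

print(score_p(2, 2, 1, p=1))   # p = 1 is the plain sum of the three weights
print(score_p(2, 2, 1, p=-1))  # negative p is dominated by the smallest weight
```

Candidate triples can then be ranked by this score, with p controlling how much a single weak edge pulls the score down.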

Finding the top-k weighted triangles in large graphs required new algorithms.

19


w/ R Kumar, P Liu, M Charikar

Retrieving Top Weighted Triangles in Graphs, WSDM, 2020.

Finding top 1000 triangles

dataset        # nodes   # edges   time (existing)   time (ours)
reddit-reply   8.4M      435M      1.1 hours         5 seconds
spotify        3.6M      1.9B      > 24 hours        31 seconds
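To make the task concrete, here is a naive baseline that enumerates every triangle and keeps the k heaviest. It is not the algorithm from the WSDM 2020 paper, which the slides do not describe; it is the kind of exhaustive approach that becomes too slow on graphs of the size in the table. The sketch assumes triangle weight is the sum of its three edge weights (the p = 1 score) and that `W` is keyed by pairs (i, j) with i < j; the function name and the small graph are made up for illustration.

```python
import heapq
from itertools import combinations

def top_k_triangles(W, k):
    """Return the k heaviest triangles as (weight, (i, j, l)) tuples."""
    adj = {}
    for i, j in W:
        adj.setdefault(i, set()).add(j)
        adj.setdefault(j, set()).add(i)
    triangles = []
    for i in adj:
        # Only look at higher-numbered neighbors so each triangle appears once.
        higher = sorted(n for n in adj[i] if n > i)
        for j, l in combinations(higher, 2):
            if l in adj[j]:
                weight = W[(i, j)] + W[(j, l)] + W[(i, l)]
                triangles.append((weight, (i, j, l)))
    return heapq.nlargest(k, triangles)

# Hypothetical weighted graph, keyed by (i, j) with i < j.
W = {(1, 2): 2, (1, 3): 2, (2, 3): 1, (1, 6): 2, (2, 6): 2, (3, 4): 1, (1, 4): 1}
print(top_k_triangles(W, 2))
```

This graph has three triangles, with weights 6, 5, and 4, so the call returns the two heaviest.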

Triadic analysis for modern network data.

20

1. Open and closed triangles in temporal, higher-order interactions.
Simplicial closure and higher-order link prediction, PNAS, 2018.

2. Triadic motifs in temporal networks.
Motifs in temporal networks, WSDM, 2017.
Sampling methods for counting temporal motifs, WSDM, 2019.

w/ A Paranjape, J Leskovec, P Liu, M Charikar

Temporal network data is extremely common.

21

Private communication
e-mail, phone calls, text messages, instant messages

Public communication
Q&A forums, Facebook walls, Wikipedia edits

Payment systems
credit card transactions, cryptocurrencies, Venmo

Technical infrastructure
packets over the Internet, messages over supercomputer

22

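source   destination   timestamp
a        d             14s
c        a             15s
a        c             17s
a        b             25s
a        c             28s
a        c             30s
c        d             31s
c        a             32s
a        c             35s

[Figure: three-node motif with edges labeled 1, 2, 3.]

δ = 10s

Temporal network motif
1. Directed multigraph with k edges
2. Edge ordering
3. Maximum time span δ

Motif instance
k temporal edges that match the pattern that all occur within δ time

[Figure: instance on nodes a, b, c with edge timestamps 25s, 28s, 32s.]

[Figure: non-instance on nodes a, d, c with edge timestamps 14s, 17s, 15s.]

Wrong order! (c, a) before (a, c)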
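The definition can be exercised on the nine temporal edges listed on the slide. The motif drawing itself did not survive extraction, so the pattern below is an assumption read off the slide's instance (a→b at 25s, a→c at 28s, c→a at 32s): edge 1 is u→v, edge 2 is u→w, edge 3 is w→u. The counter is a brute-force sketch over all edge triples, not the fast counting algorithm mentioned in the talk, and the name `count_instances` is hypothetical.

```python
from itertools import combinations

# Temporal edges (source, destination, timestamp in seconds) from the slide.
edges = [
    ("a", "d", 14), ("c", "a", 15), ("a", "c", 17),
    ("a", "b", 25), ("a", "c", 28), ("a", "c", 30),
    ("c", "d", 31), ("c", "a", 32), ("a", "c", 35),
]

# Assumed motif: ordered edges u->v, u->w, w->u on three distinct nodes.
motif = [("u", "v"), ("u", "w"), ("w", "u")]

def count_instances(edges, motif, delta):
    """Count time-ordered edge sequences that match the motif within delta."""
    ordered = sorted(edges, key=lambda e: e[2])
    count = 0
    for combo in combinations(ordered, len(motif)):
        times = [t for _, _, t in combo]
        # Timestamps must strictly increase and span at most delta.
        if any(s >= t for s, t in zip(times, times[1:])):
            continue
        if times[-1] - times[0] > delta:
            continue
        # Build a consistent, one-to-one map from motif nodes to graph nodes.
        mapping = {}
        consistent = all(
            mapping.setdefault(m, n) == n
            for (mu, mv), (u, v, _) in zip(motif, combo)
            for m, n in ((mu, u), (mv, v))
        )
        if consistent and len(set(mapping.values())) == len(mapping):
            count += 1
    return count

print(count_instances(edges, motif, delta=10))
```

Under this assumed pattern the sequence at 25s, 28s, 32s is counted, while the edges among a, d, c at 14s, 15s, 17s are not, because (c, a) at 15s precedes (a, c) at 17s.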

See also Temporal Network Motifs: Models, Limitations, Evaluation, Liu, Guarrasi, & Sarıyüce, 2021.

We developed a model for temporal motifs.
Motifs in Temporal Networks, WSDM, 2017.

We also developed fast counting algorithms.

23

[Figure: 6 × 6 grid of temporal motifs labeled M1,1 through M6,6, each drawn with its edges numbered 1, 2, 3 in time order.]

It takes ~2.5 hours to count all instances of these motifs in a 2B edge phone call network (single threaded).

Cyclic triangles are much more frequent in payment networks than in social networks.

24

Sampling algorithms let us go even faster for large datasets and more complicated motifs.

25

δ = 1 day, 16 threads

running time (seconds)

dataset          # temporal edges   exact   sampling   par. sampling
EquinixChicago   345M               481.2   45.50      5.666           1.3%
RedditComments   636M               X       6739       2262            –

26

THANKS!
Austin Benson
http://cs.cornell.edu/~arb
@austinbenson
arb@cs.cornell.edu

Triadic data analysis in temporal and higher-order networks

Supported by ARO MURI, ARO Award W911NF19-1-0057, NSF Award DMS-1830274, and JP Morgan Chase & Co.

Lots of data available at https://www.cs.cornell.edu/~arb/data/

Santa Fe, NM
