triadic data analysis in temporal and higher-order networks

26
1 Triadic data analysis in temporal and higher-order networks Austin Benson · Cornell University DynaMo@Networks 2021

Upload: others

Post on 06-Jun-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Triadic data analysis in temporal and higher-order networks

1

Triadic data analysis in temporal and higher-order networks

Austin Benson · Cornell UniversityDynaMo@Networks 2021

Page 2: Triadic data analysis in temporal and higher-order networks

The humble triangle is fundamental in network science.

2

The Strength of Weak Ties, Granovetter, 1973.Collective dynamics of ‘small-world’ networks. Watts & Strogatz, 1998.

Page 3: Triadic data analysis in temporal and higher-order networks

3

The Structure of Positive Interpersonal Relations in Small Groups, 1967.James Davis and Samuel Leinhardt analyzing triangles to test a sociological theory of George Homans using data from Theodore Newcomb.

Page 4: Triadic data analysis in temporal and higher-order networks

4

Network Motifs: Simple Building Blocks of Complex Networks, Milo et al., 2002.

Structure and function of the feed-forward loop network motif,Mangan & Alon, 2003.

The Coherent Feedforward Loop Serves as a Sign-sensitive Delay Element in Transcription Networks,Mangan, Zaslaver, & Alon, 2003.

Page 5: Triadic data analysis in temporal and higher-order networks

5

• Higher-order / multi-way interactions• Temporal information• Multilayer, multiplex, heterogeneous, attributed• Features / covariates• Large-scale with millions or billions of edges

Modern network data is rich…

@zhangqian_rach

Triangles are super useful for this rich data!

Page 6: Triadic data analysis in temporal and higher-order networks

Triadic analysis for modern network data.

6

w/ R Abebe, M Schaub, J Kleinberg, A Jadbabaie

1. Open and closed triangles in temporal, higher-order interactions.Simplicial closure and higher-order link prediction, PNAS, 2018.

2. Triadic motifs in temporal networks.Motifs in temporal networks, WSDM 2017.Sampling methods for counting temporal motifs, WSDM, 2019.

Page 7: Triadic data analysis in temporal and higher-order networks

Real-world systems are composed of “higher-order” interactions that we often reduce to pairwise ones.

7

Commercenodes are productsseveral products can be purchased at once

Communicationsnodes are people/accountsemails often have several recipients, not just one.

Physical proximitynodes are peoplepeople gather in groups

Cell biologynodes are proteinsprotein complexes may involve several proteins

Page 8: Triadic data analysis in temporal and higher-order networks

We collected many datasets of timestamped hyperedges

8

1. Coauthorship in different domains.2. Emails with multiple recipients. 3. Tags on Q&A forums.4. Threads on Q&A forums.5. Contact/proximity measurements.6. Musical artist collaboration.7. Substance makeup and

classification codes applied to drugs the FDA examines.

8. U.S. Congress committee memberships and bill sponsorship.

9. Combinations of drugs seen in patients in ER visits. https://math.stackexchange.com/q/80181

bit.ly/sc-holp-data

Page 9: Triadic data analysis in temporal and higher-order networks

Thinking of higher-order data as a weighted projected graph with filled-in structures is a convenient viewpoint.

9

1

2

3

4

5

6

7

8

9

2

2

11

1

1

11 1

11

1

1

1

1

t1 : {1, 2, 3, 4}t2 : {1, 3, 5}t3 : {1, 6}t4 : {2, 6}t5 : {1, 7, 8}t6 : {3, 9}t7 : {5, 8}t8 : {1, 2, 6}

Data.

Projected graph W.Wij = # of hyperedges containing nodes i and j.

Page 10: Triadic data analysis in temporal and higher-order networks

10

5

113

16

20

Page 11: Triadic data analysis in temporal and higher-order networks

11

i

j k

i

j k

or

Open triangleeach pair has been in a hyperedge together but all 3 nodes have never been in the same hyperedge

Closed trianglethere is some hyperedge that contains all 3 nodes

What’s more common in empirical data?

Page 12: Triadic data analysis in temporal and higher-order networks

music-rap-genius

NDC-substances

NDC-classes

DAWN

coauth-DBLP

coauth-MAG-geology

coauth-MAG-history

congress-bills

congress-committees

tags-stack-overflow

tags-math-sx

tags-ask-ubuntu

email-Eu

email-Enron

threads-stack-overflow

threads-math-sx

threads-ask-ubuntu

contact-high-school

contact-primary-school10 5 10 4 10 3 10 2 10 1

Edge density in projected graph

0.00

0.25

0.50

0.75

1.00

Fract

ion o

f tr

iangle

s open

There is lots of variation in the fraction of triangles that are open, but datasets from the same domain are similar.

12See also Topological analysis of data by Patania, Vaccarino, & Petri, 2017.

Page 13: Triadic data analysis in temporal and higher-order networks

Dataset domain separation also occurs at the local level.

13

• Randomly sample 100 egonets per dataset and measure log of average degree and fraction of open triangles.

• Logistic regression model to predict domain (coauthorship, tags, threads, email, contact).

• 75% model accuracy vs. 21% with random guessing.

Page 14: Triadic data analysis in temporal and higher-order networks

Triangles close over time.

14

t1 : {1, 2, 3, 4}t2 : {1, 3, 5}t3 : {1, 6}t4 : {2, 6}t5 : {1, 7, 8}t6 : {3, 9}t7 : {5, 8}t8 : {1, 2, 6}

Page 15: Triadic data analysis in temporal and higher-order networks

15

Substances in marketed drugs recorded in the National Drug Code directory.

Bin weighted edges into “weak” and “strong ties” in the projected graph W.Wij = # of simplices containing nodes i and j.

• Weak ties. Wij = 1 (one hyperedge contains i and j)• Strong ties. Wij > 2 (at least hyperedges contain i and j)

Weak and strong ties are useful characterizations.

Page 16: Triadic data analysis in temporal and higher-order networks

Closure depends on structure in projected graph.

16

• First 80% of the data (in time) ⟶ record configurations of triplets not in closed triangle. • Remainder of data ⟶ find fraction that are now closed triangles.

Increased edge density increases closure probability.

Increased tie strength increases closure probability.

Tension between edge density and tie strength.

Closure probability Closure probability Closure probability

Page 17: Triadic data analysis in temporal and higher-order networks

We used this for a new higher-order link prediction task.

17

t1 : {1, 2, 3, 4}t2 : {1, 3, 5}t3 : {1, 6}t4 : {2, 6}t5 : {1, 7, 8}t6 : {3, 9}t7 : {5, 8}t8 : {1, 2, 6}

Data.

• Observe simplices up to time t. • Predict which groups of > 2

nodes will appear after time t.

t We predict structure that graph models would not even consider!

Page 18: Triadic data analysis in temporal and higher-order networks

18

i

j k

Wij

Wjk

Wjk

i

j k

?<latexit sha1_base64="ay03YVMwaV0q+u3rGZz2v3X7FoY=">AAAfcnicjZlbc9vGFYDptE1S9ua0b+20hqtRx3EhiVQsy2rHM7Kt+JLYsayL5URQ1AWwJCAuLjq7oChj8Ds7/Ql972tnepagLsA51JQaCcvdby/Y3fMtBfq5irXp9f5165Of/PRnn372+c+7v/jlr379m9tf/Pa9zgoI5H6QqQw++EJLFady38RGyQ85SJH4Sh74o2e2/GAsQcdZumfOc3mUiGEaD+JAGMw6vn36F8/IiZk2VJ5FsZFVOc0pdZCBrI7ze7HrnLjO6MvK8bwuh58WInQeO/cOjsv4pPoxd/7qYPJkdJmMbfLLH8v+Sl5Vx7cXesu96cuhif4ssdCZvbaPv/jjv70wC4pEpiZQQuvDfi83R6UAEwdKVl2v0DIXwUgM5SEmU5FIfVROx1g5i5gTOoMM8Dc1zjS3e70KtgPivNFKaYRfKAGTZq6fZSMs0c1ckWjbMs1NhImamTYH9EBXTpvV54nfZIcg8igO/q8hyLRIcCkS2pvJMtWCk0KZGLKz5rxho0oelZN61rqNGTq0m0tjH74EGbpQKBni9KthBrGJklVcguYamMGjozJO88LINKiXYFAox2SO3YJOGIMMjDp3mutg4tFHN40DOQARuLMJdPPYzq6biJEMpFL1qC2qYh8EnNvly86062MrQ8iKNNRuLoyRkGqsZSCeuDoSudTuIDZuIFRg34e2Tq4ykwgY6XmtLifSCCyczoqSptwrBkbuyLAqcSbuPurd9RX2e50wkRyClGlVTi+WqeOkyfiqwNCxf68RXS+UA5zrOrzy3IDtaefF06pc7W24D3rug4dILTqRMbn+28oKxuKyNjgCOQkikQ7lcpAlK6eF1Da69Ur/4drG6saKlkmMEvAx5pOlM1yzJXurS3G65KMqJEy5r9YX6kvXs5MtUCV2FrveUGW+UB6+9Wy1TZnqAuRmmCkMkU0USZCF8rEHUonJRd0Mb7EZZod7/aPSLqXdEo11397bFaldApCpPMMbSEQalt5AJLE6xwkRuGHRMnpwkW5uGz2oA23xemca11mGj3vLG26QxNgphpNCKWAHZqIHtonmTWLbXmomtqnNunKp7x+ijdaOqvZNbUnUEMhdjNlMPcdbKutWdFW+ffO6KlPbRRJXZVKVMQ7X25WGgzEjbFfxZ1VmfdgKu4WPy2kKu6R8B+0edp+/sVNy0cFevzF9pT+pSq2uOrFwXbt8haSdA6HySFRXQ/3HK7vvrk97OFQyDqIlOvmXRbjU2sqkIbDEDvdqne97E1/AIe4NL/KzSemN7d/FrhdZzTiRjIeRweNhfS03zqKzF0lHBKYQysFqXW+EYd5bXl2Tk0Xn4rXobOG5KdJAOr40ZxiElnWwM0dP77Jbd7XYdZxpA0u95b5MFi9q70YZ4NDjdOhkqYNr7ig5MI6OQ2lr1Nvf3na50K8uG8ET7KsbG4HpnUxbqfB1tY+2Bfow3MIj3EocvEDGqvSUveBqwPQ6Fx6oDFXhqenV4nWiMcelFxRmdqz7g9K+qZrlaOQGMnvfosaZukLsm1b5sAEMa2KxgdhDIw5nmwzDrPRGIs9FuyWCPanat4Ql1XTb+Ildh/ZQg1ZpsxiuamOI1Lui7gxqHte1WeGqwTkVuo1weptLQGHAfYTiFNfqI9r+MnUjWiQzskjmg2Jy0eZFah6qC/8ET1uTYczXSe/v+IbFceJhiC1W5ex6AxWnNYXXOZTBYxxHaGAeEMZimKUC94xNzaPGOGT8oGFvFpNzh4TnG/oFxzSe21IOmW9DJMra+0Gey3qzoQBbOyl5giXXYn5KPakI9pTBnlLsGYM9o9gWg21R7GsG+5pizxnsOcVeMNgLir1ksJcUe8Vgryj2DYN9Q7FvGexbir1msNcUe8Ngbyj2HYN9R7G3DPaWYtsMtk2xdwz2jmI7DLZDsV0G26XYHoPtUWyfwfYp9p7B3lPsgMEOKPaBwT5Q7HsG+55iPzDYDxT7KCFjyB4TqlLh/wgU9eoCuiIxqpXj6wIalSLh+bqAhgD+8x+yFWYldGNGMYfbbLpRIsnfbV3AnLRtfwIvUCAGBV6hQBwKvESBWBR4jQLxKPAiBWJS4FUKxKXAyxSITYHXKRCfAi9UIEYFXqlAnAq8VIFYFXitAvEq8GIFYlbg1QrErcDLFYhdgdcrEL8CL1gghgVesUAcC7xkgVgWeM0C8SzwogViWuBVC8S1wMsWiG2B1y0Q3wIvXCDGhUvlNkmrXN1i8WM349wslRzYJ+C4rSqLUQOOfQbzKRYwWECxkMFCikkGkxQbMNiAYkMGG1IsYrCIYu0DwWL0NBifMNgJxUYMNqKYYjBFsYTBEoqlDJZSrH3KWyyjWM5gOcVOGeyUYsBgQDFuk2uKGQYzFCsYrKDYmMHGFDtjsDOKTRhsQrFzBjun2EcG+8h86CBxD3zgA4l84EMfSOwDH/xAoh/48AcS/8ALAIgBgFcAEAcALwEgFgBeA0A8ALwIgJgAeBUAcQHwMgBiA+B1AMQHwAsBiBGAVwIQJwAvBSBWAF4LQLwAvBiAmAF4NQBxA/ByAGIH4PUAxA/ACwKIIYBXBBBHAC8JIJYAXhNAPAG8KICYAuapot1iUj/Tm7bZYiPbaFQMZas3ZfOVAFIwsAWDLDNpZqSun9I1Hsvach1AnBumVE9LE2G/9GqWpJDUDyMvv4s93Hnx9KjsufhTMY9DZZLfVOHike0C/QTla8PWXF13+2uP3H5/48bqZ/Oq99fdjTV3tV35+PZCv/1VME28X13uP1zuv3uwsPlo9jXx550/dP7cudfpd9Y7m52Xne3Ofifo/LPz31uf3vrsT/+58/s7d+/MvlP+5Naszu86jdcd93/LqXOt</latexit>

scorep(i, j, k)

= (Wpij +W

pjk +W

pik)1/p

Page 19: Triadic data analysis in temporal and higher-order networks

Finding the top-k weighted triangles in large graphs required new algorithms.

19

scorep(i, j, k)

= (Wpij +W

pjk +W

pik)1/p

<latexit sha1_base64="wECyDT1irjpegMdv/Iox6i4U4iU=">AAAHdXicfVVtb9s2EFa7Lem0t7T7OAxglzlIO/ktXZZkQAADK4oVa7FsdpoCoZtR0sliTEoqSTV2Cf2o/Zph37Zfsa872k5jOdkI2DqR99zDu3tIhYXg2nQ6f966/d77H6yt3/nQ/+jjTz79bOPuvRc6L1UEx1EucvUyZBoEz+DYcCPgZaGAyVDASTj+wa2fvAGleZ4NzLSAoWSjjCc8YganzjZ+2qIGJsbqKFdQnRXbPCDnARk/IJT6W/R1yWJySLZPziw/r14V5BuC5vn4ncmd+eCV7baL6mxjs9PqzAa5bnQXxqa3GEdnd9fu0TiPSgmZiQTT+rTbKczQMmV4JKDyaamhYNGYjeAUzYxJ0EM7y7oiDZyJSZIr/GWGzGb9ZQjGUWxai2INC0vB1KQ+G+b5GFd05dcpTbI/tDwrSgNZNGdMSkFMTlwtScwVREZMSZ3W8PHbIOMRJIpFAZNaMpMGBXfbDMz4bXOkWJEGko0hAiGupuabcnDBQ8XU1GWQX+ggxMgjlZdZrIOCGQMq04g3ik8CnbICdJBwE0RMRO49dphC5EYyNdb/FbUlwTBcnBVOgLGDMjHwK8SVVRDf3+/cDwXyLnuYFEYKIKvs7OF8LlJuYMUnFCVU1v0vefgNkhpT6O/bbVRcSxuMDZMoZdkIWlEu269L0E6Uut39bvdg56CtQXLUbohSlc0LbtKmS6LJs2aICgc183u0tzl/+NQVlOEJcPXx6UjkIRMUX6mD9SDTpYJenAvsfw/1H+UxHFIFgk0usTluvq6h00F3aF3jnABqXT4a9FnmiqsggwtMQLIstjRhkotpDAkrhaks1cmlXReJTpwqKr+xTKaxgxAfdloHQSQ5kqIsBCoeCcxEJy5EPUmMTTMzcaF6c7DVD0/xqO0Oq9WkHgOeMQX9qQxz8QRTsvMourI/P39W2cxRSF5ZWVmO26V9MDc540S8CgkXkAWHA/TLENtpStfSmwlWGfpPnruSXBIMurXy2XBSWS2uSJzzHG2foqerARNFyqqrrf72dKXq8UgAj9LmvPY3rWCjNd4u9etBujDLXZZ9PpLIROeqcuEsDaWl8/nqmizkM7yT45sQi4WqTvGQTkKmTlF8NA3ziaVv3H/Dp6kqBZAU+Cg1eLnu7RaGNMggBcIiUzJBEObTMd4QndbOLkwa5HI0yGP8nrAsAhKCucDz63wJkhE9K6M/p2r4hMwCNDutLsjGJbqf5gqrw7MRyTOCoiICEkM0j8EhlvLa7FbvguD9/+h/g6hZJrMolasCfkW6q9+M68aLnVYXt/fLt5u9/cX35I73hfeVt+11vT2v5/3oHXnHXuT97v3h/eX9vfbP+pfrX69vzV1v31pgPvdqY739L4qSnVk=</latexit><latexit sha1_base64="wECyDT1irjpegMdv/Iox6i4U4iU=">AAAHdXicfVVtb9s2EFa7Lem0t7T7OAxglzlIO/ktXZZkQAADK4oVa7FsdpoCoZtR0sliTEoqSTV2Cf2o/Zph37Zfsa872k5jOdkI2DqR99zDu3tIhYXg2nQ6f966/d77H6yt3/nQ/+jjTz79bOPuvRc6L1UEx1EucvUyZBoEz+DYcCPgZaGAyVDASTj+wa2fvAGleZ4NzLSAoWSjjCc8YganzjZ+2qIGJsbqKFdQnRXbPCDnARk/IJT6W/R1yWJySLZPziw/r14V5BuC5vn4ncmd+eCV7baL6mxjs9PqzAa5bnQXxqa3GEdnd9fu0TiPSgmZiQTT+rTbKczQMmV4JKDyaamhYNGYjeAUzYxJ0EM7y7oiDZyJSZIr/GWGzGb9ZQjGUWxai2INC0vB1KQ+G+b5GFd05dcpTbI/tDwrSgNZNGdMSkFMTlwtScwVREZMSZ3W8PHbIOMRJIpFAZNaMpMGBXfbDMz4bXOkWJEGko0hAiGupuabcnDBQ8XU1GWQX+ggxMgjlZdZrIOCGQMq04g3ik8CnbICdJBwE0RMRO49dphC5EYyNdb/FbUlwTBcnBVOgLGDMjHwK8SVVRDf3+/cDwXyLnuYFEYKIKvs7OF8LlJuYMUnFCVU1v0vefgNkhpT6O/bbVRcSxuMDZMoZdkIWlEu269L0E6Uut39bvdg56CtQXLUbohSlc0LbtKmS6LJs2aICgc183u0tzl/+NQVlOEJcPXx6UjkIRMUX6mD9SDTpYJenAvsfw/1H+UxHFIFgk0usTluvq6h00F3aF3jnABqXT4a9FnmiqsggwtMQLIstjRhkotpDAkrhaks1cmlXReJTpwqKr+xTKaxgxAfdloHQSQ5kqIsBCoeCcxEJy5EPUmMTTMzcaF6c7DVD0/xqO0Oq9WkHgOeMQX9qQxz8QRTsvMourI/P39W2cxRSF5ZWVmO26V9MDc540S8CgkXkAWHA/TLENtpStfSmwlWGfpPnruSXBIMurXy2XBSWS2uSJzzHG2foqerARNFyqqrrf72dKXq8UgAj9LmvPY3rWCjNd4u9etBujDLXZZ9PpLIROeqcuEsDaWl8/nqmizkM7yT45sQi4WqTvGQTkKmTlF8NA3ziaVv3H/Dp6kqBZAU+Cg1eLnu7RaGNMggBcIiUzJBEObTMd4QndbOLkwa5HI0yGP8nrAsAhKCucDz63wJkhE9K6M/p2r4hMwCNDutLsjGJbqf5gqrw7MRyTOCoiICEkM0j8EhlvLa7FbvguD9/+h/g6hZJrMolasCfkW6q9+M68aLnVYXt/fLt5u9/cX35I73hfeVt+11vT2v5/3oHXnHXuT97v3h/eX9vfbP+pfrX69vzV1v31pgPvdqY739L4qSnVk=</latexit><latexit sha1_base64="wECyDT1irjpegMdv/Iox6i4U4iU=">AAAHdXicfVVtb9s2EFa7Lem0t7T7OAxglzlIO/ktXZZkQAADK4oVa7FsdpoCoZtR0sliTEoqSTV2Cf2o/Zph37Zfsa872k5jOdkI2DqR99zDu3tIhYXg2nQ6f966/d77H6yt3/nQ/+jjTz79bOPuvRc6L1UEx1EucvUyZBoEz+DYcCPgZaGAyVDASTj+wa2fvAGleZ4NzLSAoWSjjCc8YganzjZ+2qIGJsbqKFdQnRXbPCDnARk/IJT6W/R1yWJySLZPziw/r14V5BuC5vn4ncmd+eCV7baL6mxjs9PqzAa5bnQXxqa3GEdnd9fu0TiPSgmZiQTT+rTbKczQMmV4JKDyaamhYNGYjeAUzYxJ0EM7y7oiDZyJSZIr/GWGzGb9ZQjGUWxai2INC0vB1KQ+G+b5GFd05dcpTbI/tDwrSgNZNGdMSkFMTlwtScwVREZMSZ3W8PHbIOMRJIpFAZNaMpMGBXfbDMz4bXOkWJEGko0hAiGupuabcnDBQ8XU1GWQX+ggxMgjlZdZrIOCGQMq04g3ik8CnbICdJBwE0RMRO49dphC5EYyNdb/FbUlwTBcnBVOgLGDMjHwK8SVVRDf3+/cDwXyLnuYFEYKIKvs7OF8LlJuYMUnFCVU1v0vefgNkhpT6O/bbVRcSxuMDZMoZdkIWlEu269L0E6Uut39bvdg56CtQXLUbohSlc0LbtKmS6LJs2aICgc183u0tzl/+NQVlOEJcPXx6UjkIRMUX6mD9SDTpYJenAvsfw/1H+UxHFIFgk0usTluvq6h00F3aF3jnABqXT4a9FnmiqsggwtMQLIstjRhkotpDAkrhaks1cmlXReJTpwqKr+xTKaxgxAfdloHQSQ5kqIsBCoeCcxEJy5EPUmMTTMzcaF6c7DVD0/xqO0Oq9WkHgOeMQX9qQxz8QRTsvMourI/P39W2cxRSF5ZWVmO26V9MDc540S8CgkXkAWHA/TLENtpStfSmwlWGfpPnruSXBIMurXy2XBSWS2uSJzzHG2foqerARNFyqqrrf72dKXq8UgAj9LmvPY3rWCjNd4u9etBujDLXZZ9PpLIROeqcuEsDaWl8/nqmizkM7yT45sQi4WqTvGQTkKmTlF8NA3ziaVv3H/Dp6kqBZAU+Cg1eLnu7RaGNMggBcIiUzJBEObTMd4QndbOLkwa5HI0yGP8nrAsAhKCucDz63wJkhE9K6M/p2r4hMwCNDutLsjGJbqf5gqrw7MRyTOCoiICEkM0j8EhlvLa7FbvguD9/+h/g6hZJrMolasCfkW6q9+M68aLnVYXt/fLt5u9/cX35I73hfeVt+11vT2v5/3oHXnHXuT97v3h/eX9vfbP+pfrX69vzV1v31pgPvdqY739L4qSnVk=</latexit><latexit sha1_base64="wECyDT1irjpegMdv/Iox6i4U4iU=">AAAHdXicfVVtb9s2EFa7Lem0t7T7OAxglzlIO/ktXZZkQAADK4oVa7FsdpoCoZtR0sliTEoqSTV2Cf2o/Zph37Zfsa872k5jOdkI2DqR99zDu3tIhYXg2nQ6f966/d77H6yt3/nQ/+jjTz79bOPuvRc6L1UEx1EucvUyZBoEz+DYcCPgZaGAyVDASTj+wa2fvAGleZ4NzLSAoWSjjCc8YganzjZ+2qIGJsbqKFdQnRXbPCDnARk/IJT6W/R1yWJySLZPziw/r14V5BuC5vn4ncmd+eCV7baL6mxjs9PqzAa5bnQXxqa3GEdnd9fu0TiPSgmZiQTT+rTbKczQMmV4JKDyaamhYNGYjeAUzYxJ0EM7y7oiDZyJSZIr/GWGzGb9ZQjGUWxai2INC0vB1KQ+G+b5GFd05dcpTbI/tDwrSgNZNGdMSkFMTlwtScwVREZMSZ3W8PHbIOMRJIpFAZNaMpMGBXfbDMz4bXOkWJEGko0hAiGupuabcnDBQ8XU1GWQX+ggxMgjlZdZrIOCGQMq04g3ik8CnbICdJBwE0RMRO49dphC5EYyNdb/FbUlwTBcnBVOgLGDMjHwK8SVVRDf3+/cDwXyLnuYFEYKIKvs7OF8LlJuYMUnFCVU1v0vefgNkhpT6O/bbVRcSxuMDZMoZdkIWlEu269L0E6Uut39bvdg56CtQXLUbohSlc0LbtKmS6LJs2aICgc183u0tzl/+NQVlOEJcPXx6UjkIRMUX6mD9SDTpYJenAvsfw/1H+UxHFIFgk0usTluvq6h00F3aF3jnABqXT4a9FnmiqsggwtMQLIstjRhkotpDAkrhaks1cmlXReJTpwqKr+xTKaxgxAfdloHQSQ5kqIsBCoeCcxEJy5EPUmMTTMzcaF6c7DVD0/xqO0Oq9WkHgOeMQX9qQxz8QRTsvMourI/P39W2cxRSF5ZWVmO26V9MDc540S8CgkXkAWHA/TLENtpStfSmwlWGfpPnruSXBIMurXy2XBSWS2uSJzzHG2foqerARNFyqqrrf72dKXq8UgAj9LmvPY3rWCjNd4u9etBujDLXZZ9PpLIROeqcuEsDaWl8/nqmizkM7yT45sQi4WqTvGQTkKmTlF8NA3ziaVv3H/Dp6kqBZAU+Cg1eLnu7RaGNMggBcIiUzJBEObTMd4QndbOLkwa5HI0yGP8nrAsAhKCucDz63wJkhE9K6M/p2r4hMwCNDutLsjGJbqf5gqrw7MRyTOCoiICEkM0j8EhlvLa7FbvguD9/+h/g6hZJrMolasCfkW6q9+M68aLnVYXt/fLt5u9/cX35I73hfeVt+11vT2v5/3oHXnHXuT97v3h/eX9vfbP+pfrX69vzV1v31pgPvdqY739L4qSnVk=</latexit>

i

j k

Wij

Wjk

Wjk

w/ R Kumar, P Liu, M Charikar

Retrieving Top Weighted Triangles in Graphs, WSDM, 2020.

<latexit sha1_base64="vp8S9+HGTTNptdfFhk9UBe0Sf/4=">AAAf9HicjZlbc9y2FYBX6S3dXuK0j52OmSrqJJ7Vele2bKuddGTZ8SWxY9mSbxE1LkiCS1ogQR+Aq5U5nL62/6Fvnb72//Qn9Ll/oAfL1YU8Zz1djUWQ+A4AAjgf11RQqNTY0ejfKx/94Ic/+vFPPv5p/2c//8UvP7n06a9eGF1CKJ+HWml4FQgjVZrL5za1Sr4qQIosUPJlcHTH1b+cSjCpzvftSSEPMzHJ0zgNhcVLby791w/kJM0rK4JSCagr5S1+6r5vdQGlkv1IWOzCer/3/FUv15E0TVFGk3nRppn0vpAzHG6aT748u4KDNF96vu/5WRrNWwIZRaldB1moE8RuDa8/xsP1a5vu8Pl4OP7cS1wUnm16RoY6j4yHLfRNoW0au5hrwxsOHg+3dlzMnzaun8dcG58GYYgfaGt1Nu/Xl3l0do9vLq2OhqP5x6OF8aKw2lt8dt98+tv/+JEOy0zmNlTCmIPxqLCHlQCbhkriTJVGFiI8EhN5gMVcZNIcVvPVqb01vBJ5sQb8l1tvfrV/MQTbAXHSauV0rLP21UDrI6wx7asiM65lejUTNmlfdFfAxKb2uqw5yYI2OwFRJGn4fw1B5mWWWpnR3qzWqgNnpbIp6OP2vGGjSh5Ws2bW+q0ZOnDb22AfgcQdNHBLGuH0q4mG1CbZBi5Bew1sfOuwSvOitDIPmyWIS+VZ7bkk8KIUZGhxC7bXwaZH7wd5GsoYRDhYTOCgSN3sDjJxJEOpVDNqh6o0AAEnbvn0sRkE2MoEdIm7b1AIayXkBqMspLOBSUQhzSBO7SAUKnTnkYsplLaZgCOzrNVhJq3AyvmsKGmr/TK28pmM6gpn4rNbo88Chf1eJGwiJyBlXlfzg2OOE1ycDhOoUtaV+32B6PuRjHGu52BVFBZcT8/u79TVxmhrcH00uH4DqTUvsbYwf7h61crZ0FgcgZyFicgnchjq7Oq7UhrnF3N1fGNza2PrqpFZihoK0DrZ+jGu2bq71fU0Xw9QVhLm3LWbq82h77vJFigzN4t9f6J0IJSPp74L25a5KUFuR1phimyjykJ00lc+SCVmp7Eab7GdZgf748PKLaXbEq11393fE7lbApC5PMYbyATqwo9FlqoTnBCBG7aufBOfltvbxsRNoq1d7MzgOsvoq9FwaxBmKXaK6aRQCtiBnZnYNdG+SWzbz+3MNbXdBFfmygHaaPOw7t7UXYkaArmHOavVPbylqmnF1NWTx4/qKnddZGldZXWV4nD9PWk5GC9E3ZBgEbLowwXslQEupy3dkvIddHvYu/fYTclpB/vj1vRVwayujDrvxMFNdPUQSTcHQhWJqM+H+ueHbt9dnPZoomQaJut08s+qcKmNk0lLYJkb7vk6X/FngYAD3Bt+EuhZ5U/d77W+nzjNeIlMJ4nFx8PNzcJ6a95+Ij0R2lIoD8P6/hGm+Wi4sSlna97pZ827i49CkYfSC6Q9xiR0rIedeWZ+l/2mq7U+PttcA+uj4Vhma6fRe4kGHDo+Sz2de7jmnpKx9UwaSRfRbH9329XquD5rBJ9g1z7YCMzvZN5KjZ/zfbQr0IfRXfwS4SQOfihTVfnKHXA1YH5cCsdKoyp8NT86vCm05rjyw9JlENrCBnHlTup2PRq5hSzOO9RUq3PEnXTqJy1g0hBrLcQ9NNJosckwzSr/SBSF6LZEsNt195awpp5vmyBz69AdatipbVfDeTSmSLMrms6g4XFd2wHnDS4J6LfS6UkhAYUBVxBKc1yr92j7s9IH0TJbkGW2HBSz0zZPS8tQUwZv8WlrNeZ8U/T/iCcsjhMPE2yxrhbHD1Bp3lB4XEJZfIzjCC0sA6JUTHQucM+40jJqikPGLxruZrG4dEj4fEO/4JimS1sqQAcuRRLd3Q/yRDabDQXY2UnZbay5kPNz6nZNsB0G26HYHQa7Q7G7DHaXYl8z2NcUu8dg9yh2n8HuU+wBgz2g2EMGe0ixbxjsG4p9y2DfUuwRgz2i2GMGe0yx7xjsO4o9YbAnFNtlsF2KPWWwpxR7xmDPKLbHYHsU22ewfYo9Z7DnFHvBYC8o9pLBXlLsFYO9othrBntNse8Z7HuKvZegGXLEpKpU+H8EivpNBV2RFNXK8U0FzUqR8XxTQVNAZEHEBixq6MZMUg53l+lGSSR/t00F86Tt+hN4gQIxKPAKBeJQ4CUKxKLAaxSIR4EXKRCTAq9SIC4FXqZAbAq8ToH4FHihAjEq8EoF4lTgpQrEqsBrFYhXgRcrELMCr1YgbgVerkDsCrxegfgVeMECMSzwigXiWOAlC8SywGsWiGeBFy0Q0wKvWiCuBV62QGwLvG6B+BZ44QIxLpwpt0065ZoOi1+7GefqXHLgmIDTrqocRg04DRgsoFjIYCHFIgaLKCYZTFIsZrCYYhMGm1AsYbCEYt0HgsPo02D6lsHeUuyIwY4ophhMUSxjsIxiOYPlFOs+5R2mKVYwWEGxdwz2jmLAYEAxbpMbilkGsxQrGayk2JTBphQ7ZrBjis0YbEaxEwY7odh7BnvPfOkgeQ984gPJfOBTH0juA5/8QLIf+PQHkv/ACwCIAYBXABAHAC8BIBYAXgNAPAC8CICYAHgVAHEB8DIAYgPgdQDEB8ALAYgRgFcCECcALwUgVgBeC0C8ALwYgJgBeDUAcQPwcgBiB+D1AMQPwAsCiCGAVwQQRwAvCSCWAF4TQDwBvCiAmAKWqaLbYta805u32WET12hSTmSnN+WuKwGkInYVsdY211aa5i1d67WsqzchpIVlas28NhPuj17tmhyy5mWke9U6/2PRwbP7O4fVaIA/NfM6VGbFhwJOX9mu0m9QgbFs5MbNwXjz1mA83vpg+PGy8PHNwdbmYKMb/ObS6rj7p2BaeLExHN8Yjp9urG7vLP5M/HHvN73f9b7ojXs3e9u9B73d3vNeuPJ65S8rf1352+Xp5b9f/sflfzboRyuLmF/3Wp/L//ofMv2Yrw==</latexit>

dataset # nodes # edges time (existing) time (ours)

reddit-reply 8.4M 435M 1.1 hours 5 secondsspotify 3.6M 1.9B > 24 hours 31 seconds

Finding top 1000 triangles

Page 20: Triadic data analysis in temporal and higher-order networks

Triadic analysis for modern network data.

20

1. Open and closed triangles in temporal, higher-order interactions.Simplicial closure and higher-order link prediction, PNAS, 2018.

2. Triadic motifs in temporal networks.Motifs in temporal networks, WSDM 2017.Sampling methods for counting temporal motifs, WSDM, 2019.

w/ A Paranjape, J Leskovec, P Liu, M Charikar

Page 21: Triadic data analysis in temporal and higher-order networks

Temporal network data is extremely common.

21

Private communicatione-mail, phone calls, text messages, instant messages

Public communicationQ&A forums, Facebook walls, Wikipedia edits

Payment systemscredit card transactions, cryptocurrencies, Venmo

Technical infrastructurepackets over the Internet, messages over supercomputer

Page 22: Triadic data analysis in temporal and higher-order networks

22

source destination timestampa d 14sc a 15sa c 17sa b 25sa c 28sa c 30sc d 31sc a 32sa c 35s

1 23

δ = 10s

Temporal network motif1. Directed multigraph

with k edges2. Edge ordering3. Maximum time span δ

a

b c

25s 28s32s

Motif instancek temporal edges that match the pattern that all occur within δ time

a

d c

14s 17s15s

Wrong order!(c, a) before (a, c)

See also Temporal Network Motifs: Models, Limitations, Evaluation, Liu, Guarrasi, & Sarıyüce, 2021.

We developed a model for temporal motifs.Motifs in Temporal Networks, WSDM, 2017.

Page 23: Triadic data analysis in temporal and higher-order networks

We also developed fast counting algorithms.

23

M6,1 M6,2 M6,3 M6,4 M6,5 M6,6

M5,1 M5,2 M5,3 M5,4 M5,5 M5,6

M4,1 M4,2 M4,3 M4,4 M4,5 M4,6

M3,1 M3,2 M3,3 M3,4 M3,5 M3,6

M2,1 M2,2 M2,3 M2,4 M2,5 M2,6

M1,1 M1,2 M1,3 M1,4 M1,5 M1,6

1, 2, 3 1, 23 1, 2 3 1, 2 3 1, 2

3

1, 2

3

1, 32 12, 3 12 3 12 3 12

3

12

3

1, 3 2 13 2 1 2, 3 1 23

1 2

3

1 2

3

1, 3 2 13 2 123 1

2, 31 2

3

1 2

3

1, 3

2

13

2

1

2

3 1

2

3 1

2, 3

123

1, 3

2

13

2

1

2

3 1

2

3 1

23

1

2, 3

It takes ~2.5 hours to count all instances of these motifs in a 2B edge phone call network (single threaded).

Page 24: Triadic data analysis in temporal and higher-order networks

Cyclic triangles are much more frequent in payment networks than in social networks.

24

Page 25: Triadic data analysis in temporal and higher-order networks

Sampling algorithms let us go even faster for large datasets and more complicated motifs.

25

δ = 1 day, 16 threads

<latexit sha1_base64="acPXbWFGBGdsOcIBu/atUAMlp+8=">AAAgXnicjZlbc9vGFYCppE1Ttmns9qUznU6RauhxPBRNUqZktZMZ+W4ndixb8iURNM4CWBCwFhedXUiUMfh1/RV962te2z/QswR1Ac6hJ9RYAHe/ve/5Foa8XMXaDIf/Xvnk01/9+rPffP7b7u9+/8Ufvrxy9Y+vdVaAL1/5mcrgrSe0VHEqX5nYKPk2BykST8k33uE9m//mWIKOs3TPnObyIBHTNA5jXxhMend1xXU9OY3T0givUAKqEpzt0j06KkRQOf7ZT9V1TZZDoWT3mnPNcZNCmRhbL5K0XK9KH4sVaRqnU8fEiXSua+lnaaC/rhzXdVw/iQNb9rqCr8v1tUnVDYTBThtb1apjZJJnIJQjg6nUmCZnwrd5WiQ4B1jpNScXMHAvEmyti0q7zoOjIk7j2b0IRzXNEF6/NXnm4PXW7dFg7NgP3k8GkyFeJ4ONjQ28jgbrbg/r6TovZRDE5l6WJDI1tvmN9Y158dL1QudtZVM217dsyni8MbbXtTXn7GN74mXGZMm8M65Mg/O57L67sjocDOcfh96MFjerncVn593Vv/7sBplf2K74Smi9Pxrm5qAUgNOtJC5DoWUu/EMxlft4m4pE6oNyvg8qp4cpgRNmgP9S48xTu5eLYD0gThu1nHV21kz1suwQc3QzVSTa1kxTE2GiZqJNAR3qymmz+jTxmuwURI6L94u6INMiiXHH0NZMlqkWPN+mkJ005w0rVfKgnNWz1m3M0L4NJI1teBJk0LdrGuD0q2kGsYmSMS5Bcw1MePugjNO8MDL16yUIC+WYzLHh5gQxSN+oU6e5DiY+/NBPY1+GIPz+YgL7eWxnt5+IQ+lLpepeW1TFHgg4tcuXnei+h7VMISswwvq5MEZCqrGUgXjW15HIpe6Hsen7Qvn2e2DL5CoziYBDvazWQSKNwMz5rChpyr0iNBKDA2NbBl/dHn7lKWz3MmEiOQUp06qcXyxzEuHitBhPFbIq7e9LRNcNZIhzPQfLPDdgW3r56G5Vjodb/VvD/q0NpHpOZEyu/3HzppGzgTbYAznzI5FO5cDPkptHhdTWZPrmaGOyNd66qWUSo/A89FuydoJrtmaHuhanax5qUcKcW99crS9d1062QG3aWey6U5V5Qrn41bXFtmWqC5DbQaYwRLZRmn4WyG9ckErMzspmOMRmmO3vjQ5Ku5R2SzTWfWdvV6R2CUCm8gQHkAj0hRuKJFanOCECN2xVujo8u29uGx3Wgda73JjGdZbBN8PBVh9Fi41iOCmUAjZgZjq0VTQHiXW7qZnZqrbrwqW+sY82mhxU7UHdl6ghkLsYs5l6iEMq61p0VT5/9rQqU9tEEldlUpUxdtfdlYaDMSFoF/EWRRZt2AK7hYfLaQq7pHwD7RZ2Hz6zU3LWwN6oMX2lN6tKrS4asXBdunyCpJ0DofJIVBdd/emJ3XeXpz2YKhn70Rqd/PMsXGptZdIQWGK7e7HON9yZJ2Af94YbedmsdI/t717XjaxmnEjG08jg8bA5yY3Tc/Yi6eA5WODBiMW67iGG+XAwnshZ7/z06Tn38RlBpL50PGlOMAgt62Bjjp6Psls31eviSWUrWBsORjLpnZXejTLArtsjNUsdXHNHydA4Og6kLVFvfzvscnVUnVeCJ9j6RyuB+UjmtVT4udhHOwJ9GNzHxxUrcXB9GavSVfaCqwHz61I4VBmqwlXzq8Xrm8Ycl65f2AhCWxgvLO2XqpmPRm4gi+8t6jhTF4j90sqfNoBpTfQaiD004mCxyTDMSvdQ5Llo10SwO1V7SJhTzbeNl9h1aHfVb+U2s+GiNIZIvSvqxqDmcV2bBS4qXFKg2win57kEFAbcQAifw5L4A9r+/O6jaJEsyCJZDorZWZ1nd8tQXXjv8bQ1GcZ8fev+E7+wOE48TLHGqlxcP0LFaU3hdQll8BjHHhpYBgQxPpqmAveMvVtGHWOX8UHDDhZvl3YJzzf0C/bpeGlNOWSeDZEoa+8HeSrrzYYCbO2k5A7mXIr5OXWnIthdBrtLsXsMdo9i9xnsPsUeMNgDij1ksIcUe8Rgjyj2mMEeU+wJgz2h2LcM9i3FvmOw7yj2lMGeUuwZgz2j2PcM9j3FnjPYc4rtMNgOxV4w2AuKvWSwlxTbZbBdiu0x2B7FXjHYK4q9ZrDXFHvDYG8o9pbB3lLsBwb7gWI/MtiPFPsgIWPIIROqUuH/ESjq1hl0RWJUK8fXGTQqRcLzdQYNAZF4AVtgkUM3ZhRzuE2mGyWS/GjrDOakbfsTeIECMSjwCgXiUOAlCsSiwGsUiEeBFykQkwKvUiAuBV6mQGwKvE6B+BR4oQIxKvBKBeJU4KUKxKrAaxWIV4EXKxCzAq9WIG4FXq5A7Aq8XoH4FXjBAjEs8IoF4ljgJQvEssBrFohngRctENMCr1ogrgVetkBsC7xugfgWeOECMS6cK7dJWuXqFouP3Yxzs1Ry4IiAx21VWYwa8NhjMI9iPoP5FAsYLKCYZDBJsZDBQopNGWxKsYjBIoq1DwSL0dPg+D2DvafYIYMdUkwxmKJYwmAJxVIGSynWPuUtllEsZ7CcYkcMdkQxYDCgGLfJNcUMgxmKFQxWUOyYwY4pdsJgJxSbMdiMYqcMdkqxDwz2gXnoIHEPfOADiXzgQx9I7AMf/ECiH/jwBxL/wAsAiAGAVwAQBwAvASAWAF4DQDwAvAiAmAB4FQBxAfAyAGID4HUAxAfACwGIEYBXAhAnAC8FIFYAXgtAvAC8GICYAXg1AHED8HIAYgfg9QDED8ALAoghgFcEEEcALwkglgBeE0A8AbwogJgClqmiXWNSv9Ob19liI1tpVExlqzVl05UAkhHajDDLTJoZqeu3dI3XsjZf+xDnhsnV89xE2D96NXNSSOqXkfZV6/yPRfsvH909KId9/KmY16EyyT9W4OyV7Sp9gvK0YUuON/ujye3+aLT10eIny4qPNvtbk/64XfjdldVR+0/B9Ob1eDDaGIxejFe37y7+TPx55y+dv3eud0adzc5253Fnp/Oq46/8a+Xnlf+u/O9v/3E+c75wvqzRT1YWZf7UaXycP/8fYJC3JQ==</latexit>

running time (seconds)

dataset # temporal edges exact sampling par. sampling

EquinixChicago 345M 481.2 45.50 5.666 1.3%RedditComments 636M X 6739 2262 –

Page 26: Triadic data analysis in temporal and higher-order networks

26

THANKS! Austin Bensonhttp://cs.cornell.edu/~arb

@[email protected]

Triadic data analysis in temporal and higher-order networks

Supported by ARO MURI, ARO Award W911NF19- 1-0057, NSF Award DMS-1830274, and JP Morgan Chase & Co.

Lots of data available at https://www.cs.cornell.edu/~arb/data/

Santa Fe, NM