the why, how, and when of representations for complex systemsleotrs.com/static/hyper_poster.pdf ·...

1
The why, how, and when of representations for complex systems Leo Torres 1 and Ann Sizemore Blevins 2 , Danielle S. Bassett 2 , Tina Eliassi-Rad 1 1 Northeastern University, 2 University of Pennsylvania Conclusions - Choosing the proper formalism for complex system analyses requires knowledge of data dependencies and question requirements. - The data abstraction method can significantly affect analysis results. References 1. Benson, Austin R., David F. Gleich, and Jure Leskovec. "Higher-order organization of complex networks." Science 353.6295 (2016): 163-166. Comple Systems How do we analyze a complex system? Complex systems abound in many disciplines from neuroscience and computer science to ecology and economics. Despite (or perhaps due to) their prevalence, researchers often have different answers for how best to represent, encode, and analyze a complex system. Here we colelct analysis frameworks, highlight assumptions made within pipelines, and distinguish use cases. SYSTEM I NSIGHT ! I NSIGHT ! D EPENDENCIES preserve dependencies A NALYSES F ORMALISMS R ELATIONS Graph Simplicial complex Hypergraph Removing independent sets or higher relations Forcing structure or relations Graph Simplicial complex Hypergraph Centrality Homology Communities Are subgroups of groups implied? Are nearby nodes likely to connect? small Relational Spatial Are walks Markovian 1 ? Temporal perspectives Edge Simplices Hyperedges mean node clustering (in hypergraph) mean node clustering (in hypergraph) log fill coefficient log fill coefficient 0.0 -0.5 -1.0 -1.5 -2.0 -2.5 -3.0 mean node clustering (in the graph) 0.8 0.7 0.6 0.5 0.4 0.3 0 -1 -2 -3 -4 -5 0.6 0.5 0.4 0.3 0.2 0.1 0.0 0.2 0.4 0.6 0.8 1.0 mean node clustering (in the graph) 0.4 0.5 0.6 0.7 0.8 Enron emails 3 Human protein complexes 4 2. Zhou, Wanding, and Luay Nakhleh. "Properties of metabolic graphs: biological organization or representation artifacts?." BMC bioinformatics 12.1 (2011): 132. extra overlap between hyperedges Hyperedges can have absent substructures. We measure the fullness of a hyperedge with the fill coefficient: Formalisms offer different perspectives Similar to the graph formalism, we can calculate the hypergraph clustering coefficient 2 : Does our interpretation of average clustering change when using different representations? Graph Hypergraph 4. Giurgiu, Madalina, et al. "CORUM: the comprehensive resource of mammalian protein complexes—2019." Nucleic acids research 47.D1 (2018): D559-D563. 3. Klimt, Bryan, and Yiming Yang. "The enron corpus: A new dataset for email classification research." European Conference on Machine Learning. Springer, Berlin, Heidelberg, 2004. 8 7 6 5 4 3 2 Hyperedge size

Upload: others

Post on 21-Jun-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The why, how, and when of representations for complex systemsleotrs.com/static/hyper_poster.pdf · - Choosing the proper formalism for complex system analyses requires knowledge of

The why, how, and when of representations for complex systemsLeo Torres1 and Ann Sizemore Blevins2, Danielle S. Bassett2, Tina Eliassi-Rad1

1Northeastern University, 2University of Pennsylvania

Conclusions- Choosing the proper formalism for complex system analyses requires knowledge of data dependencies and question requirements.- The data abstraction method can significantly affect analysis results.

References1. Benson, Austin R., David F. Gleich, and Jure Leskovec. "Higher-order organization of complex networks." Science 353.6295 (2016): 163-166.

CompleSystems

How do we analyze a complex system?Complex systems abound in many disciplines from neuroscience and computer science to ecology and economics. Despite (or perhaps due to) their prevalence, researchers often have different answers for how best to represent, encode, and analyze a complex system. Here we colelct analysis frameworks, highlight

assumptions made within pipelines, and distinguish use cases.

SYSTEM

INSIGHT!INSIGHT!DEPENDENCIES

preservedependencies ANALYSES

FORMALISMS

RELATIONS

Graph

Simplicialcomplex

Hypergraph

Removing independent sets or higher relations

Forcing structure or relations

GraphSimplicialcomplex

Hypergraph

Centrality

Homology

Communities

Are subgroups of groups implied?

Are nearby nodes likely to connect?small

Relational

Spatial

Are walks Markovian1?

Temporal

perspectives

Edge

SimplicesHyperedges

mea

n no

de c

luste

ring

(in h

yper

grap

h)m

ean

node

clus

terin

g (in

hyp

ergr

aph)

log fill coefficientlog fill coefficient

0.0

-0.5

-1.0

-1.5

-2.0

-2.5

-3.0

mean node clustering (in the graph)

0.8

0.7

0.6

0.5

0.4

0.3

0

-1

-2

-3

-4

-5

0.6

0.5

0.4

0.3

0.2

0.1

0.00.2 0.4 0.6 0.8 1.0

mean node clustering (in the graph)0.4 0.5 0.6 0.7 0.8

Enron emails3

Human protein complexes4

2. Zhou, Wanding, and Luay Nakhleh. "Properties of metabolic graphs: biological organization or representation artifacts?." BMC bioinformatics 12.1 (2011): 132.

extra overlap between hyperedges

Hyperedges can have absent substructures. We measure the fullness of a hyperedge with the fill coefficient:

Formalisms offer different perspectives

Similar to the graph formalism, we can calculate the hypergraph clustering coefficient2 :

Does our interpretation of average clustering change when using different representations?

GraphHypergraph

4. Giurgiu, Madalina, et al. "CORUM: the comprehensive resource of mammalian protein complexes—2019." Nucleic acids research 47.D1 (2018): D559-D563.

3. Klimt, Bryan, and Yiming Yang. "The enron corpus: A new dataset for email classification research." European Conference on Machine Learning. Springer, Berlin, Heidelberg, 2004.

8765432

Hyperedgesize