cinet: a cyberinfrastructure for network science
![Page 1: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/1.jpg)
CINET: A CyberInfrastructure for Network Science
S. M. Shamimul Hasan, on behalf of the CINET team
Technical Report # 15-060
Network Dynamics and Simulation Science Lab (NDSSL), Virginia Bioinformatics Institute
Virginia Tech
![Page 2: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/2.jpg)
CINET Team
• Virginia Tech: Keith Bisset, Abhijin Adiga, Edward Fox, Maleq Khan, Chris Kuhlman, Henning Mortveit, Madhav Marathe, Samarth Swarup, Anil Vullikanti
• Indiana University: Geoff Fox, Judy Qiu, Stephen Wu
• SUNY Albany: S. S. Ravi
• Jackson State University: Richard Aló, Chris Cassidy
• University of Houston Downtown: Ongard Sirisaengtaksin
• Argonne National Lab and U. Chicago: Pete Beckman
• VT Students: S. M. Shamimul Hasan, Md Hasanuzzaman, S M Arifuzzaman, Maksudul Alam, Sherif Abdelhamid, Zalia Shams, Tirtha Bhattacharjee
• Persistent Systems: Harsha, Gaurav, Tanmay, Rakhi, Abhijeet, Niranjan and Team
![Page 3: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/3.jpg)
CINET: Team (cont.)
• Several evaluators are incorporating CINET into courses
  – S. S. Ravi at the University at Albany, SUNY
  – Edward Fox at Virginia Tech
  – Anil Vullikanti at Virginia Tech
  – Henning Mortveit at Virginia Tech
  – Aravind Srinivasan at University of Maryland
  – Albert Esterline (NCAT)
• Other evaluators planning to use CINET in research
  – Zsuzsanna Fagyal at UIUC
  – Matt Macauley at Clemson University
  – T. M. Murali at Virginia Tech
![Page 4: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/4.jpg)
Network
“Network is a group or system of interconnected people or things” - Oxford Dictionaries
“Network science is the study of network representations of physical, biological, and social phenomena” - National Research Council
![Page 5: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/5.jpg)
Network Science
• Research in network science has grown rapidly over the last decade across many scientific fields.
• Networks can be very large: ~10^8 nodes and ~10^10 edges, requiring HPC for analysis
• There is a need for middleware, i.e., an interface layer
  o Domain experts don’t need to become experts in graph theory, data mining, and high-performance computing
  o Provides an abstraction layer that allows separation of innovation above and below this layer
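To make the scale claim concrete, here is a rough back-of-the-envelope sketch of the memory needed just to hold an edge list of that size. The storage layout (two 64-bit node IDs per edge, no weights or indexes) is an assumption made for illustration, not a description of how CINET stores graphs.

```python
def edge_list_bytes(num_edges, bytes_per_id=8):
    """Bytes needed for a bare edge list: two node IDs per edge.

    Assumes fixed-width IDs (8 bytes by default) and no extra metadata.
    """
    return num_edges * 2 * bytes_per_id


# ~10^10 edges, as quoted on the slide, in decimal gigabytes.
size_gb = edge_list_bytes(10**10) / 10**9
```

Under these assumptions the raw edge list alone exceeds the memory of a typical workstation, which is consistent with the slide's point that analysis at this scale needs HPC resources.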
![Page 6: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/6.jpg)
CINET: Vision
• Self-sustainable
  – Users can contribute new networks, data, algorithms, hardware, and research results
• Self-manageable
  – End users will be insulated from the complexities of resource allocation, scheduling, cross-platform interactions, and other low-level concerns
• Repeatable Science
  – The exact version of a model that produced a result is kept
  – All model input parameters are captured
  – Any system configuration information is captured
  – All input data versions are kept
  – The entire set of configuration information for an experiment (multiple runs) should be accessible by providing a URL
  – Encourage users of the system to include pointers to results in published work
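One way to picture the "Repeatable Science" goals is a single immutable record that captures the model version, input data versions, parameters, and system configuration, and derives a stable identifier that can be embedded in a URL. The sketch below is only an illustration; the class, its field names, and the `example.org` base URL are all hypothetical and not part of CINET.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ExperimentRecord:
    """Everything needed to rerun one experiment (hypothetical field names)."""
    model_version: str
    input_data_versions: tuple
    parameters: tuple      # (name, value) pairs
    system_config: tuple   # (name, value) pairs

    def record_id(self):
        # Hash the full configuration so identical setups map to the same ID.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]

    def url(self, base="https://example.org/experiments"):
        # Placeholder base URL; a real deployment would supply its own.
        return f"{base}/{self.record_id()}"
```

Because the identifier is derived from the content, two runs with identical configuration resolve to the same URL, while any change to a parameter or data version yields a new one.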
![Page 7: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/7.jpg)
System Architecture
![Page 8: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/8.jpg)
Version 2.0
• Provides 150+ networks, 18 graph generators, and 80+ measures
• New, improved UI for Granite
• Components (apps) that allow researchers to interact with CINET: visualization of networks, adding networks, adding structural analysis tools
• Structural analysis using Galib, NetworkX, and SNAP
• Version 1.0 of a Python-based DSL for computing complex workflows
• Resource manager 1.0 completed: allows multiple computational and analytical resources to be used and selected
• Website with additional resources (course notes, etc.)
![Page 9: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/9.jpg)
Digital Library
Digital Library:
• Support network science research
• Manage continuously produced, large-scale scientific output
• Provide simulation-specific services to support science
• Manage large network graphs and workflow of content collections
![Page 10: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/10.jpg)
Digital Library
Data:
– List of networks & metadata.
– List of measures & metadata.
– Parameters for measures.
– List of generators & metadata.
– Parameters for generators.
Services:
– Memoization: Record details of every experiment run
– Incentivization: Report how many times a particular graph was used
– Browsing and Searching: graphs, measures, results
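The memoization and incentivization services can be sketched together: results are cached under the (network, measure, parameters) combination that produced them, and a counter records how often each network is used. This is a minimal in-memory illustration with hypothetical names; the actual digital library persists this information and is not shown in the slides.

```python
class MemoizingRunner:
    """Caches analysis results and counts how often each network is used."""

    def __init__(self):
        self._results = {}     # (network, measure, params) -> result
        self.usage_count = {}  # network name -> number of requests

    def run(self, network, measure, params, compute):
        # Sort parameters so that dict ordering does not change the key.
        key = (network, measure, tuple(sorted(params.items())))
        self.usage_count[network] = self.usage_count.get(network, 0) + 1
        if key not in self._results:
            # Only compute when this exact experiment has not been run before.
            self._results[key] = compute()
        return self._results[key]
```

A repeated request with identical parameters returns the stored result without invoking `compute` again, while still incrementing the usage count for the network.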
![Page 11: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/11.jpg)
Transactional Data
• The following data is stored in the database
  – Users
  – Details of network analyses run by users, including the parameters set for each
  – Details of generator analyses run by users, including the parameters set for each
• The following is stored in the file system
  – Output files of network and generator analyses
• A mapping exists between the data stored in the database and in the file system
![Page 12: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/12.jpg)
Performance Improvements
• The blackboard is used ONLY for placing job requests
• Simpler and fewer components
• Components are fully distributed
  – Web app, blackboard, and brokers exist on separate VMs
• Brokers no longer need to poll for data; the blackboard container notifies them directly.
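The move from polling to notification can be illustrated with a small publish/subscribe sketch: brokers register a callback once, and the blackboard invokes it when a matching job request is posted. The class and method names are hypothetical, and the real components run on separate VMs rather than in one process.

```python
class Blackboard:
    """Holds job requests and notifies subscribed brokers directly."""

    def __init__(self):
        self._subscribers = {}  # job type -> list of broker callbacks

    def subscribe(self, job_type, callback):
        self._subscribers.setdefault(job_type, []).append(callback)

    def post(self, job_type, request):
        # Push the request to every interested broker; no polling loop needed.
        for callback in self._subscribers.get(job_type, []):
            callback(request)
```

With this design a broker does no work until a request of its type arrives, which avoids the repeated empty checks that polling incurs.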
![Page 13: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/13.jpg)
Resource Manager
• Decides which resource is best for a given job request
  – Through a set of defined rules
• Tracks the health of and load on compute resources
  – And considers this knowledge in determining the best resource(s)
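A minimal sketch of rule-based selection that also takes health and load into account might look like the following. The specific rules (skip unhealthy resources, require enough capacity, prefer the least loaded) and all field names are assumptions for illustration; the slides do not list the resource manager's actual rules.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    healthy: bool
    load: float     # fraction of capacity in use, 0.0 to 1.0
    max_edges: int  # largest graph this resource is assumed to handle


def pick_resource(resources, job_edges):
    """Return the least-loaded healthy resource that can fit the job, or None."""
    candidates = [r for r in resources
                  if r.healthy and r.max_edges >= job_edges]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.load)
```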
![Page 14: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/14.jpg)
Granite: Structural Analysis of Complex Networks
![Page 15: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/15.jpg)
Graph Analysis Resources and Challenges
• Resources:
  – Static analysis tools: provide efficient implementations of various graph measures or algorithms (e.g., Galib, NetworkX).
  – Large collection of data sets (of networks)
• Challenge 1: How can we make an analytic engine that will
  – Reduce programming overhead,
  – Reuse existing resources
• Challenge 2: Provide a simple computational interface that lets domain experts use available resources and program interactively
![Page 16: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/16.jpg)
CINET - Granite
• Granite allows users to run various network measures on a variety of networks
  – Measures can be either static (e.g., degree distribution, clustering coefficient) or dynamic (e.g., disease diffusion)
  – Network size can range from tiny (10s of nodes) to very large (100s of millions of nodes)
• Granite automatically picks the best implementation of the specified measure
• Granite automatically picks the most appropriate compute resource
![Page 17: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/17.jpg)
Graph Libraries
• Granite includes modules from three graph algorithm libraries:
  – Galib (developed at NDSSL)
  – NetworkX (developed at Los Alamos National Lab)
  – SNAP (developed at Stanford University)
![Page 18: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/18.jpg)
Graph Centrality Measures in CINET
• Degree list <Node-ID, Degree>
• Degree statistics
• Degree distribution
• Average neighbor degree
• Hub-authority
• PageRank
• Clustering coefficient distribution
• Streaming-based CC distribution (approx.)
• Betweenness centrality
• Closeness centrality
• Degree centrality
• Eigenvalue centrality
• k-core
• k-crust
• k-corona
• k-clique coefficient
• Core number
• Ro distribution
• Coreness of nodes <ID, coreness>
• CC list <Node-ID, CC>
• External-memory CC algorithm (exact)
• Parallel CC algorithm
• Generate degree sequence
• Closeness centrality - weighted
• Ro distribution
• Closeness vitality - unweighted
• Closeness vitality - weighted
• Communicability centrality
• In-degree centrality
• Out-degree centrality
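Two of the measures listed above, degree distribution and the clustering coefficient (CC) of a node, can be sketched in a few lines of plain Python. This is a small illustrative implementation for simple undirected graphs without self-loops, not the Galib, NetworkX, or SNAP code that Granite actually calls.

```python
from collections import Counter
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_distribution(adj):
    """Map each degree value to the number of nodes with that degree."""
    return Counter(len(neighbors) for neighbors in adj.values())


def clustering_coefficient(adj, node):
    """Fraction of pairs of a node's neighbors that are themselves connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))
```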
![Page 19: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/19.jpg)
Graph Shortest Path and Connectivity Measures in CINET
• Number of connected components
• Component graph
• Component size distribution
• Strongly connected component
• Weakly connected component
• Bi-connected component
• Check bi-connectivity
• BFS tree / forest
• BFS predecessor list
• BFS successor list
• Partitioning by BFS traversal
• DFS predecessor list
• DFS successor list
• DFS: nodes in post-order visits
• DFS tree
• Articulation point
• Bridge edges
• Diameter
• Center
• Periphery
• Check connectivity
• Eccentricity
• Radius
• DFS: nodes in pre-order visits
• Check if graph is a DAG
• Topological sort
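Several of the measures above (connected components, eccentricity, and from it diameter and radius) reduce to breadth-first search. The sketch below shows that idea for an unweighted, undirected graph given as an adjacency map; it is an illustration only and makes no claim about how the CINET libraries implement these measures.

```python
from collections import deque


def connected_components(adj):
    """Return a list of components, each a list of nodes, found by BFS."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


def eccentricity(adj, source):
    """Largest hop distance from source to any node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return max(dist.values())
```

For a connected graph, the diameter is the maximum eccentricity over all nodes and the radius is the minimum.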
![Page 20: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/20.jpg)
Weighted Shortest Path and Motif Counting
Weighted shortest path related
• Minimum spanning tree
• Single source shortest path
• Shortest path tree/forest
• Weighted diameter (exact and approx.)
• Average pairwise distance (exact and approx.)
• Distribution of pair-wise distance (exact and approx.)
Subgraph / Motif counting
• Count triangle
• Clique counts (specialized)
• Graph transitivity
• All maximal clique
• Clique number
• Largest clique containing a node
Flow
• Maximum flow
• Minimum cut
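As an illustration of the motif-counting measures listed above, triangle counting can be sketched by visiting each triple of mutually adjacent nodes exactly once in increasing node order. The function below assumes an undirected adjacency map with comparable node IDs; it is a simple reference version, not the specialized or parallel algorithm a large network would need.

```python
def count_triangles(adj):
    """Count triangles in an undirected graph given as {node: set(neighbors)}."""
    count = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                for w in adj[v]:
                    # Enforce u < v < w so each triangle is counted once.
                    if v < w and w in adj[u]:
                        count += 1
    return count
```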
![Page 21: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/21.jpg)
Other Measures
• Shuffle edges
• Degree-assortative shuffle
• Age-assortative shuffle
• Compare graphs
• Remove nodes
• Remove edges
• Remove high degree nodes (top x%)
• Remove high degree nodes (degree >= x)
• Check if a degree sequence is graphical
• Compare graphs
• Isolated nodes
• Vertex cover
• Dominating set
• Minimum edge dominating set
• Check graph consistency
• Check if bipartite graph
• Check if chordal graph
• Maximal independent set
• Number of common neighbors
![Page 22: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/22.jpg)
Simple Generative Models of Networks in CINET
• Random graph generators
• Erdos-Renyi random graph
• G(n, p) graph
• G(n, p) component
• G(n, m) graph
• G(n, r) graph
• Watts-Strogatz small-world graph
• Waxman random graph
• Chung-Lu
• Havel-Hakimi
• Preferential Attachment
• Small world
• Circle
• Star
• Chain
• Lattice
• Deterministic graph generators
• Binary tree graph
• Star
• Wheel
• Grid
• Torus
• Hypercube
• Petersen
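To illustrate what one of the listed generators does, here is a minimal sketch of the G(n, p) model: every one of the n(n-1)/2 possible undirected edges is included independently with probability p. The function name and the `seed` parameter are choices made for this example; the quadratic loop is fine for small graphs but is not how a generator for very large networks would be written.

```python
import random


def gnp_random_graph(n, p, seed=None):
    """Return the edge list of a G(n, p) random graph on nodes 0..n-1."""
    rng = random.Random(seed)  # private generator so results are reproducible
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                edges.append((u, v))
    return edges
```

Passing the same `seed` reproduces the same graph, which fits the deck's emphasis on repeatable experiments.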
![Page 23: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/23.jpg)
Currently Available Networks
• 150+ small and large networks
  – Sizes vary from 100 edges to 110M edges
  – Social contact networks
    • Chicago, Washington DC, Detroit, New York, Seattle
  – Multi-modal urban transportation networks (e.g., subway, cars, buses)
    • Portland, OR
  – Adolescent friendship networks
    • High school in New River Valley
  – Blog and other online networks
    • Slashdot, Epinions
  – Infrastructure networks
    • Ad hoc and mesh, phone call, electrical power
  – Biological networks
![Page 24: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/24.jpg)
Networks in CINET (cont.)
Types of Networks
• Web graph
• Autonomous System/Internet
• Road/transport networks
• Collaboration networks
• Co-appearance networks
• Social networks
• Biological networks
• Infrastructure (e.g., power)
• Others
Original Sources
• Stanford SNAP
• Pajek Dataset
• http://www-personal.umich.edu/~mejn/netdata/
• Some other publicly available sources
![Page 25: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/25.jpg)
List of Networks
Autonomous System/Internet
• Autonomous systems - Oregon-1 - 010331
• Autonomous systems - Oregon-1 - 010407
• Autonomous systems - Oregon-1 - 010414
• Autonomous systems - Oregon-1 - 010421
• Autonomous systems - Oregon-1 - 010428
• Autonomous systems - Oregon-1 - 010505
• Autonomous systems - Oregon-1 - 010512
• Autonomous systems - Oregon-1 - 010519
• Autonomous systems - Oregon-1 - 010526
• Autonomous systems - Oregon-2 - 010331
• Autonomous systems - Oregon-2 - 010407
• Autonomous systems - Oregon-2 - 010414
• Autonomous systems - Oregon-2 - 010421
• Autonomous systems - Oregon-2 - 010428
• Autonomous systems - Oregon-2 - 010505
• Autonomous systems - Oregon-2 - 010512
• Autonomous systems - Oregon-2 - 010519
• Autonomous systems - Oregon-2 - 010526
• The Internet Topology Zoo - AboveNet
• The Internet Topology Zoo - AGIS
Web Graph
• California Web Graph
• EPA Web Graph
• EuroSiS web mapping study
• Web Graph of Berkeley and Stanford
Collaboration Graph
• Condensed Matter collaboration network
• Condensed Matter collaborations 1999
• Condensed Matter collaborations 2003
• Condensed Matter collaborations 2005
• CS PhD supervision relation graph
• Erdos Collaboration Network
• General Relativity and Quantum Cosmology collaboration network
• High-Energy Theory Collaboration Network 2001
• High-Energy Theory Collaboration network 2003
• Network Science Collaboration
• Phenomenology Collaboration Network
![Page 26: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/26.jpg)
Social, Proximity and Infrastructure Networks
• Dolphins' Social Network in NZ
• Brightkite Friendship network
• Enron Email Data with Manager-Subordinate Relationship Metadata
• Enron email Network
• Enron Giant Component
• Epinions Social Network
• Giant Component of Brightkite Network
• Giant Component of Epinions Networks
• Giant Component of Gowalla Network
• Giant Component of Max Planck's Facebook Network
• Giant Component of Slashdot0811 Network
• Giant Component of Slashdot0902 Network
• Gowalla friendship network
• Hypertext 2009 dynamic contact network
• Hyves Social Network
• Infectious SocioPatterns - 2009-04-28
• Infectious SocioPatterns - 2009-04-29
• Karate network
• LiveJournal Social Network
• Max Planck - Flickr Social Network
• Miami Chung-Lu
• Miami Contact Network
• Portland Contact Network
• Primary School Cumulative Networks 1
• Primary School Cumulative Networks 2
• Seattle Contact Network
• Slashdot Social Network 2008
• Slashdot Social Network 2009
• Youtube Social Network
Road/Transport/Infrastructure Networks
• Airlines
• California transportation Network
• Pennsylvania transportation network
• Texas transportation network
• US Air Lines
• US Power Grid
• Western States Power Grid
![Page 27: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/27.jpg)
List of Networks (Contd.)
Biological Networks
• C. Elegans Neural Network
• Yeast PPI network
Games/Sports Networks
• American College Football Network
• Soccer WorldCup'98
Co-appearance/co-purchase Networks
• Les Miserables
• Network Glossary
• Politics books
• Word adjacencies
Others/misc. Networks
• Dynamic Java code
• Small World Network
![Page 28: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/28.jpg)
Making Granite Self-Sustainable: Concept of Services and Apps
![Page 29: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/29.jpg)
User Management
• A user can request an account. The account becomes operational only after an admin activates it.
• An admin can activate or deactivate accounts.
• A user can change their password.
• All the entities (Networks, Measures, Generators, Analyses) have owners.
![Page 30: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/30.jpg)
User Management
![Page 31: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/31.jpg)
Add Network
• A user can add a network by uploading a network file
• The uploaded network is validated
• For valid networks, the numbers of edges and nodes are calculated automatically
• Networks are converted into .gph and .nx formats
• The user can specify metadata for the uploaded network
• The user can specify whether the network is
  – Public: available to all users for analysis.
  – Private: available only to the owner; this is the default option
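The validate-then-count step of the upload flow can be sketched as follows. The input format here (one whitespace-separated pair of node IDs per line, `#` for comments, undirected edges) is an assumption chosen for illustration; the slides do not specify the accepted upload formats or the layout of the .gph and .nx files.

```python
def validate_edge_list(text):
    """Validate a plain-text edge list and return (num_nodes, num_edges).

    Raises ValueError on the first malformed line.
    """
    nodes = set()
    edges = set()
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(
                f"line {line_no}: expected two node IDs, got {len(parts)} fields"
            )
        u, v = parts
        nodes.update((u, v))
        # Store each undirected edge in a canonical order to ignore duplicates.
        edges.add((min(u, v), max(u, v)))
    return len(nodes), len(edges)
```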
![Page 32: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/32.jpg)
Add Network
![Page 33: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/33.jpg)
Visualization
• The CINETViz app is fully integrated in Granite.
• A user can submit a visualization job for a network.
• The visualization process is scalable and abstracted from the backend through middleware (blackboard and brokers)
• Once a visualization job is completed, the user can view and download the generated visualization.
• Visualization has 2 user interfaces in Granite
  – Quick view while selecting a network for analysis
  – Detailed view in the Visualization tab
![Page 34: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/34.jpg)
Features – Visualization
![Page 35: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/35.jpg)
Visualization of Networks (Contd.)
Karate Club Network Miami Graph
![Page 36: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/36.jpg)
Visualization of Networks (Contd.)
Amazon Co-purchase Network
![Page 37: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/37.jpg)
CINET website
• Central location of CINET
• Portal for course materials
• Web address: http://www.vbi.vt.edu/ndssl/cinet
![Page 38: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/38.jpg)
Graph Dynamical Systems Calculator (GDSC)
Overview
• Provide a web application that enables users to compute dynamics for their systems.
• Evaluate arbitrary (small) graphs, a range of vertex functions, and update schemes.
• GDSC is an application in CINET.
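The three ingredients named above (a graph, a vertex function, and an update scheme) can be sketched on a tiny example. The threshold function and both update schemes below are common textbook choices picked for illustration; the slides do not say which vertex functions or schemes GDSC supports.

```python
def threshold_update(adj, state, node, k=1):
    """Vertex function: 1 if at least k nodes in the closed neighborhood are 1."""
    active = state[node] + sum(state[n] for n in adj[node])
    return 1 if active >= k else 0


def synchronous_step(adj, state, k=1):
    """All vertices update at once, each reading the old state."""
    return {v: threshold_update(adj, state, v, k) for v in adj}


def sequential_step(adj, state, order, k=1):
    """Vertices update one at a time in the given order, seeing earlier updates."""
    new_state = dict(state)
    for v in order:
        new_state[v] = threshold_update(adj, new_state, v, k)
    return new_state
```

On a three-node path with only the first node active, the synchronous scheme activates just the middle node in one step, while the sequential scheme in path order activates the whole path, which shows why the update scheme is part of the system definition.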
![Page 39: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/39.jpg)
Future Work
• Add graph modification algorithms
  – Remove edges
  – Swap edges
• Add data model to manage system workflow
• Domain-specific language
• Registry service
![Page 40: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/40.jpg)
Digital Library to Support Computational Epidemiology Datasets
![Page 41: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/41.jpg)
Synthetic Information Based Epidemiological Laboratory (SIBEL)
![Page 42: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/42.jpg)
The Problem
• Computational epidemiology employs computer models and informatics tools to reason about the spatio-temporal spread of diseases.
• Studies are conducted, in general, through the use of a simulation and require information on the population structure, agent behavior, disease transmission, and a model of the disease.
• The heterogeneous content includes metadata, text, tables, spreadsheets, experimental descriptions, and large result files.
![Page 43: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/43.jpg)
NDSSL’s networked epidemiology data repository
| Category | Data | Size | Representation |
|---|---|---|---|
| Synthetic Population | Household, Person Activity | 566 GB | Relational |
| Social Network and Output | Contact Network, Simulation Output | 1.84 TB | File |
| Experiment | Experiment | 240 GB | Relational |
![Page 44: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/44.jpg)
The Problem (cont.)
• Data access and digital library services in current setups are cumbersome due to heterogeneity and fragmentation across datasets.
• There is no accepted framework that allows unified access to such content.
• The diversity of models, data sources, data representations, and modalities that are collected, used, and modified motivates the development of a digital library (DL) framework to support computational epidemiology.
• We propose a data mapping framework for digital library systems for computational epidemiology datasets.
• The proposed framework provides a unified view to access and query complete epidemiology workflow data.
![Page 45: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/45.jpg)
Unified View to Access and Query Complete Epidemiology Workflow Data
![Page 46: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/46.jpg)
Resource Description Framework (RDF)
• Directed labeled graphs
• Model elements
  – Resource: the thing being described by RDF expressions.
  – Property: a specific aspect, characteristic, attribute, or relation used to describe a resource.
  – Value
  – Statement: a statement in RDF consists of resource + property + value, that is, subject + predicate + object.
![Page 47: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/47.jpg)
RDF Example
• For the statement “Shamimul Hasan is the creator of the web page www.vt.edu/~shasan2”
• We have the RDF statement as
  – Subject (resource): www.vt.edu/~shasan2
  – Predicate (property): creator
  – Object (literal): “Shamimul Hasan”
• Node and arc diagram as
  www.vt.edu/~shasan2 --creator--> “Shamimul Hasan”
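The subject/predicate/object structure of this example can be mirrored in a few lines of plain Python, together with a simple pattern match in the spirit of a query over triples. This is only a sketch of the data model using hypothetical helper names; it does not use an RDF library and is not the D2RQ, Virtuoso, or SPARQL tooling used in the study.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples that agree with every field that is not None."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# The statement from the slide, written as one triple.
statement = Triple("www.vt.edu/~shasan2", "creator", "Shamimul Hasan")
```

Leaving a field as `None` plays the role of a query variable: `match([statement], predicate="creator")` returns every triple whose predicate is `creator`.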
![Page 48: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/48.jpg)
Framework
• Data mapping gives us the flexibility to switch between various databases and execute queries on them.
![Page 49: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/49.jpg)
Experimental Study
• We considered a real-time epidemiology simulation study conducted in the Seattle area. The study assumed that influenza transmits in various regional populations through person-to-person contact.
• We use the D2RQ Mapping Language to convert relational and file data to RDF graphs, Virtuoso Open-Source Edition 6.1.6 as the RDF data engine, and the SPARQL query language.
![Page 50: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/50.jpg)
Experimental Study (cont.)
| Databases | RDF Graph Size (GB) | Number of Triples | RDF Graph Generation Time (Minutes) |
|---|---|---|---|
| Seattle Synthetic Population | 177 | 661,848,662 | 317 |
| Output | 3.10 | 12,979,996 | 6 |
| Experiment | 0.01 | 66,654 | 0.37 |
![Page 51: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/51.jpg)
Experimental Study (cont.)

| Queries | Bottom-up Approach (SPARQL Query Runtime in Seconds) | Top-down Approach (SPARQL Query Runtime in Seconds) |
|---|---|---|
| How many people of a particular demographic are sick? | 0.04 | 7.18 |
| Find who infected whom of a particular demographic | 0.38 | 9.18 |
| How many people get infected on a particular simulation day? | 0.03 | 5.76 |
![Page 52: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/52.jpg)
Reference
• Sherif Hanie El Meligy Abdelhamid, Md. Maksudul Alam, Richard Aló, Shaikh Arifuzzaman, Peter H. Beckman, Tirtha Bhattacharjee, Md Hasanuzzaman Bhuiyan, Keith R. Bisset, Stephen Eubank, Albert C. Esterline, Edward A. Fox, Geoffrey Fox, S. M. Shamimul Hasan, Harshal Hayatnagarkar, Maleq Khan, Chris J. Kuhlman, Madhav V. Marathe, Natarajan Meghanathan, Henning S. Mortveit, Judy Qiu, S. S. Ravi, Zalia Shams, Ongard Sirisaengtaksin, Samarth Swarup, Anil Kumar S. Vullikanti, Tak-Lon Wu: CINET 2.0: A CyberInfrastructure for Network Science. eScience 2014: 324-331
• S. M. Shamimul Hasan, Sandeep Gupta, Edward A. Fox, Keith R. Bisset, Madhav V. Marathe: Data mapping framework in a digital library with computational epidemiology datasets. JCDL 2014: 449-450
• S. M. Shamimul Hasan, Keith R. Bisset, Edward A. Fox, Kevin Hall, Jonathan Leidig, Madhav V. Marathe: An Extensible Digital Library Service to Support Network Science. ICCS 2013: 419-428
• Sherif Elmeligy Abdelhamid, Richard Aló, S. M. Arifuzzaman, Peter H. Beckman, Md Hasanuzzaman Bhuiyan, Keith R. Bisset, Edward A. Fox, Geoffrey Charles Fox, Kevin Hall, S. M. Shamimul Hasan, Anurodh Joshi, Maleq Khan, Chris J. Kuhlman, Spencer J. Lee, Jonathan Leidig, Hemanth Makkapati, Madhav V. Marathe, Henning S. Mortveit, Judy Qiu, S. S. Ravi, Zalia Shams, Ongard Sirisaengtaksin, Rajesh Subbiah, Samarth Swarup, Nick Trebon, Anil Vullikanti, Zhao Zhao: CINET: A cyberinfrastructure for network science. eScience 2012: 1-8
• Resource Description Framework (RDF) developed by World Wide Web Consortium (W3C) - http://bit.ly/1aXP5k2
![Page 53: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/53.jpg)
Student Activity
• Please visit the Granite website: http://ndssl.vbi.vt.edu/apps/cinet/
• Launch App
• Login
  – Username: demo
  – Password: demo1234
• Start a New Analysis with the “Karate” network and the “PageRank” measure.
• Check the analysis report.
![Page 54: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/54.jpg)
Many Thanks!
![Page 55: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/55.jpg)
Additional Slides
![Page 56: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/56.jpg)
Extensible Memoization Service
• Query a set of digital objects that exactly match a metadata pattern
• Utilization
  – Education – students
  – Baseline scenarios
  – Comparisons, body base, similar regions
![Page 57: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/57.jpg)
Architecture
![Page 58: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/58.jpg)
Architecture (Cont.)
![Page 59: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/59.jpg)
Architecture (Cont.)
![Page 60: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/60.jpg)
Network Category
• Small: |G| < 100,000
  – Example: RND-G(n,p) Random Graph 1 (nodes: 1,000, edges: 4,971)
• Medium: 100,000 ≤ |G| < 10,000,000
  – Example: RND-G(n,p) Random Graph 500 (nodes: 500,000, edges: 5.00E+06)
• Large: |G| ≥ 10,000,000
  – Example: Seattle contact network (nodes: 3,207,037, edges: 8.66E+07)
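The size thresholds above translate directly into a small classifier. The slide does not define |G|; the sketch assumes it is the number of edges, which is consistent with all three examples (4,971 edges is small, 5.00E+06 is medium, 8.66E+07 is large), though a nodes-plus-edges reading would classify them the same way.

```python
SMALL_LIMIT = 100_000       # |G| below this is "small"
MEDIUM_LIMIT = 10_000_000   # |G| below this (and at least SMALL_LIMIT) is "medium"


def network_category(size):
    """Classify a network by size |G| (assumed here to be its edge count)."""
    if size < SMALL_LIMIT:
        return "small"
    if size < MEDIUM_LIMIT:
        return "medium"
    return "large"
```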
![Page 61: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/61.jpg)
Performance
• Shadowfax (Virginia Tech)
• 912 cores, 5 TB RAM, 80 TB storage, 7168 CUDA cores
• 100+ networks
• 100+ measures
![Page 62: CINET: A CyberInfrastructure for Network Science](https://reader034.vdocument.in/reader034/viewer/2022042818/55c3a73bbb61ebe57b8b4601/html5/thumbnails/62.jpg)
Performance