attributes to org charts - ufdcimages.uflib.ufl.edu



Attributes to Org Charts: Using R and VIVO for Visualization of Research Activity

Mike Conlon, UF Clinical and Translational Science Institute, Gainesville, Florida

VIVO: Enabling National Networking of Scientists is supported by NIH grant U24 RR 029822. The UF CTSI is supported in part by NIH awards UL1 RR029890, KL2 RR029888 and TL1 RR029889.

VIVO and R

VIVO provides a unique opportunity to collect, curate and use data regarding research activity at and across institutions. VIVO makes its data publicly available as RDF (see RDF Schema [1]) in an XML serialization. Tools that consume RDF can use any VIVO installation as a data source.

R [2], an open source system for data management, analysis and visualization, is well suited to reading and displaying network data. R libraries for reading XML [3] and for constructing and displaying social networks [4] make the programming easier.

Here we extend work from last year [5] with R functions that collect attributes for VIVO people, orgs, papers and grants, and use those attributes in large-scale displays of an organization’s research activity. We use the University of Florida as an example.

Objects, networks, and displays

R code generates objects (people, pubs, grants, orgs) from VIVO URIs. Each function takes a URI and returns a list of attribute name/value pairs.

    org <- get.vivo.org(org.uri)
    person <- get.vivo.person(person.uri)
    grant <- get.vivo.grant(grant.uri)
    pub <- get.vivo.publication(pub.uri)
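The returned value is described only as a list of attribute name/value pairs, so its exact fields are not known from the poster. As a minimal sketch, assuming hypothetical attribute names (`uri`, `name`, `type`, `n.people`) and a hypothetical helper `get.attr`, such an object could be modeled as a named R list and queried like this:

```r
# Hypothetical stand-in for the result of get.vivo.org(); the attribute
# names below are assumptions for illustration, not the actual VIVO fields.
org <- list(
  uri      = "http://example.org/individual/org1",
  name     = "Example Department",
  type     = "AcademicDepartment",
  n.people = 42
)

# Look up one attribute by name, returning a default when it is absent.
get.attr <- function(obj, attr.name, default = NA) {
  if (attr.name %in% names(obj)) obj[[attr.name]] else default
}

print(get.attr(org, "name"))      # the stored name
print(get.attr(org, "n.grants"))  # NA, because this attribute is missing
print(names(org))                 # the attribute names carried by the object
```

Returning a default for missing attributes is a design choice for this sketch; it lets later display code treat sparse records uniformly.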

A network can be constructed by pointing a driver function at an org node. The driver function recursively processes the sub-orgs, assembling a network object. The network object can be saved as a CSV file for processing in other tools.

    uf.uri <- "http://vivo.ufl.edu/individual/UF/UF.rdf"
    uf.n <- vivo.network("UF", get.vivo.orgs(uf.uri, 0))
    write.csv(vivo.data.frame(uf.n), file = "uf-data.csv")
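The internals of the driver are not shown on the poster. The sketch below illustrates the general idea of a recursive walk over sub-orgs, under stated assumptions: the hierarchy is a small hypothetical in-memory list (`org.children`) rather than data fetched from VIVO, and `walk.orgs` is an illustrative name, not one of the poster's functions. It records each org's parent and depth and writes the result to a CSV file.

```r
# Hypothetical in-memory stand-in for VIVO: each org lists its sub-orgs.
# In the real tools this information would come from the VIVO RDF.
org.children <- list(
  UF       = c("CollegeA", "CollegeB"),
  CollegeA = c("Dept1", "Dept2"),
  CollegeB = character(0),
  Dept1    = character(0),
  Dept2    = character(0)
)

# Recursively visit an org and its sub-orgs, returning one row per org
# with its parent and its depth (distance from the starting org).
walk.orgs <- function(org, parent = NA_character_, depth = 0) {
  row <- data.frame(org = org, parent = parent, depth = depth,
                    stringsAsFactors = FALSE)
  kids <- org.children[[org]]
  if (length(kids) == 0) return(row)
  sub.rows <- lapply(kids, walk.orgs, parent = org, depth = depth + 1)
  do.call(rbind, c(list(row), sub.rows))
}

orgs <- walk.orgs("UF")
print(orgs)

# Save the flattened hierarchy for use in other tools.
out.file <- file.path(tempdir(), "orgs-sketch.csv")
write.csv(orgs, file = out.file, row.names = FALSE)
```

Each row's `parent` column doubles as an edge list, so the same data frame can seed a network object or be loaded directly into another tool.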

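The network can then be displayed with node colors, sizes, shapes and labels determined by object attributes.

    # Open a 72 x 72 inch PNG device at 72 dpi
    png(file = "BigUFDepth.png", height = 72, width = 72, units = "in", res = 72)
    # Color each node by its depth (distance from UF)
    plot(uf.n, displaylabels = TRUE,
         vertex.col = get.vertex.attribute(uf.n, "depth"))
    # Close the device so the file is written
    dev.off()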
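Node size can be driven by an attribute in the same way as node color. The following is a sketch only: the paper counts are hypothetical, and `scale.size` is an illustrative helper rather than part of the poster's code. It converts a count attribute into a vector of plotting sizes.

```r
# Hypothetical per-org paper counts; real values would come from VIVO attributes.
papers <- c(UF = 0, CollegeA = 400, CollegeB = 25, Dept1 = 100, Dept2 = 9)

# Square-root scaling makes node area grow roughly with the count, and the
# minimum size keeps orgs with zero papers visible.
scale.size <- function(x, min.cex = 0.5, max.cex = 4) {
  s <- sqrt(x)
  if (max(s) == 0) return(rep(min.cex, length(x)))
  min.cex + (max.cex - min.cex) * s / max(s)
}

node.cex <- scale.size(papers)
print(round(node.cex, 2))
```

A vector like `node.cex` could then be supplied as the `vertex.cex` argument when plotting a network object, provided its order matches the order of the vertices.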

Use Cases

The R tools developed can be used to explore research activities:

1. Show an organization’s “size” in terms of papers produced, grants awarded, grant dollars, personnel and faculty
2. Compare organizations visually, within an institution or across institutions
3. Compare results for subsets by time, e.g. comparing years
4. Show results for subsets of organizations
5. Show grants and publications as nodes attached to people, and people as nodes attached to organizations
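Use cases 1 and 3 in the list above reduce to counting records by organization and by year. A minimal sketch, assuming a hypothetical data frame of publication records (the `org` and `year` columns and their values are invented for illustration):

```r
# Hypothetical publication records; in practice these would be assembled
# from the attributes returned for each publication.
pubs <- data.frame(
  org  = c("Dept1", "Dept1", "Dept2", "Dept1", "Dept2", "Dept1"),
  year = c(2009, 2010, 2009, 2010, 2010, 2010),
  stringsAsFactors = FALSE
)

# Use case 1: organization "size" as the number of papers produced.
papers.by.org <- table(pubs$org)

# Use case 3: compare years within each organization.
papers.by.org.year <- table(pubs$org, pubs$year)

print(papers.by.org)
print(papers.by.org.year)
```

The same pattern applies to grants, grant dollars (summing rather than counting) or personnel once those attributes are collected into a data frame.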

References

1. RDF Vocabulary Description Language 1.0: RDF Schema. http://www.w3.org/TR/rdf-schema/
2. R Project Home Page. www.r-project.org
3. Lang, Duncan Temple. Tools for parsing and generating XML in R. http://www.omegahat.org/RSXML/
4. Handcock, M., Hunter, D.R., Butts, C.T., Goodreau, S.M. and Morris, M. (2003) Software Tools for the Statistical Modeling of Network Data. Version 2.1-1. Project home page at http://statnet.org, URL http://CRAN.R-project.org/package=statnet
5. Conlon, M. and the VIVO Collaboration. “Using the R Programming Language for VIVO Application Programming,” poster presented at 2010 VIVO Conference, New York City, August 2010.
6. Bastian M., Heymann S., Jacomy M. Gephi: an open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media (2009), pp. 361-362.
7. R Project Archive. http://cran.r-project.org

Visualizing the University of Florida

The University of Florida consists of 537 organizations (excluding Shands Hospital organizations, which are not shown here). Each org is colored by its “distance” from UF: red nodes are distance 1, orange distance 2, light green distance 3, green distance 4, cyan distance 5 and blue distance 6. The figure was produced using R software reading VIVO data from vivo.ufl.edu.
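The color scheme described for the figure can be expressed as a lookup from distance to color name. This is a sketch under assumptions: the depths are hypothetical, the color for the root (distance 0) is not stated on the poster and is set to black here, and the poster's own plotting call may instead rely on R's numeric palette.

```r
# Colors for distances 1 through 6, as described for the figure.
depth.colors <- c("red", "orange", "lightgreen", "green", "cyan", "blue")

# Hypothetical depths for a handful of orgs; 0 is the root org itself.
depth <- c(0, 1, 2, 2, 3, 6)

# Index the palette by depth to get one color per node. The root color
# ("black") is an assumption; the figure description does not specify it.
node.col <- ifelse(depth == 0, "black", depth.colors[pmax(depth, 1)])
print(node.col)
```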

Future Work

Future work for visualization of research activity:

1. Simplify aggregation of attributes at levels of the network
2. Add additional objects: events, projects, data sets
3. Create a web site with a user interface for specifying visualizations. Visualization can begin at any VIVO org URI
4. Consider using Gephi [6] for network visualization
5. Create a CRAN [7] package for distribution
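Aggregating attributes at levels of the network amounts to rolling values up from sub-orgs to their parents. One possible shape for this, sketched with a hypothetical hierarchy and hypothetical paper counts (`parent`, `own.papers` and `rollup` are illustrative names, not existing functions):

```r
# Hypothetical org hierarchy (child -> parent) and per-org paper counts.
parent <- c(CollegeA = "UF", CollegeB = "UF",
            Dept1 = "CollegeA", Dept2 = "CollegeA")
own.papers <- c(UF = 0, CollegeA = 5, CollegeB = 25, Dept1 = 100, Dept2 = 9)

# Total for an org = its own count plus the totals of all of its sub-orgs.
rollup <- function(org) {
  kids <- names(parent)[parent == org]
  own.papers[[org]] + sum(vapply(kids, rollup, numeric(1)))
}

totals <- vapply(names(own.papers), rollup, numeric(1))
print(totals)
```

Simple summation like this would count a paper more than once if its authors belong to several sub-orgs, so a real implementation would likely need to aggregate distinct publication URIs instead of counts.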