vivo: sharing data for research discovery mike conlon university of florida [email protected]

19
VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida [email protected]

Upload: posy-freeman

Post on 12-Jan-2016

218 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu

VIVO: Sharing Data for Research

Discovery

Mike ConlonUniversity of Florida

[email protected]

Page 2: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu

Public, structured linked data about investigators interests, activities and accomplishments, and toolsto use thatdata to advancescience

Page 3: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu
Page 4: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu

VIVO Searchlight

Page 5: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu
Page 6: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu
Page 7: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu

Data Production

Page 8: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu

Producing Data

Page 9: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu

processOrg<-function(uri){ x<-xmlParse(uri) u<-NULL name<-xmlValue(getNodeSet(x,"//rdfs:label")[[1]]) subs<-getNodeSet(x,"//j.1:hasSubOrganization") if(length(subs)==0) list(name=name,subs=NULL) else { for(i in 1:length(subs)){ sub.uri<-getURI(xmlAttrs(subs[[i]])["resource"]) u<-c(u,processOrg(sub.uri)) } list(name=name,subs=u) }}

VIVO produces human and machine

readable formats

Software reads RDF from VIVO

and displays

Page 10: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu

Data Sharing

Photograph by J. G. Park. Flicker.com Photograph by Ell Brown Flicker.com

Page 11: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu

Information is stored using the Resource Description

Framework (RDF) as subject-predicate-object “triples”

Jane Smith

professor in

author ofhas affiliation with

Dept. of Genetic

s College of

Medicine

Journal article

Book chapte

r

Book

Genetics Institute

Subject Predicate Object

A Web of Data – The Semantic Web

Page 12: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu
Page 13: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu

The Role of the Archive• Collate data, final semantics, ready for

consumption

Data Archive

Page 14: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu

Institutions record activities, interests, accomplishments

Page 15: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu

Data, Tools and Scientists

Page 16: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu

Data Consumption

Photograph by ScoopMedia. Flicker.com

Photograph by Janet Tarbox. Flicker.com

Page 17: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu

A Consumption ScenarioFind all faculty members whose genetic work

is implicated in breast cancer

VIVO will store information about faculty and associate to genes. Diseaseome associates genes to diseases.

Query resolves across VIVO and data sources it links to.

Page 18: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu

Data ReasoningData integration continues to be a serious bottleneck for the expectations of increased productivity in the pharmaceutical and biotechnology domain.

“Linked Life Data” integrates common public datasets that describe the relationships between gene, protein, interaction, pathway, target, drug, disease and patient and currently consist of more than 5 billion RDF statements.

The dataset interconnects more than 20 complete data sources and helps to understand the “bigger picture” of a research problem by linking previously unrelated data from heterogeneous knowledge.

From the LarKC (Large Knowledge Collider) http://www.larkc.eu/overview/

Page 19: VIVO: Sharing Data for Research Discovery Mike Conlon University of Florida mconlon@ufl.edu

http://vivo.ufl.edu/individual/mconlon