


LS DAM Overview

January 31, 2012

Core Team: Ian Fore, D.Phil., NCI CBIIT, Robert Freimuth, Ph.D., Mayo Clinic, Elaine Freund, Ph.D., 3rd Millennium, Joyce Hernandez, Merck, Jason Hipp, M.D./Ph.D., University of Michigan, Jenny Kelley, M.A., NCI LPG, Juli Klemm, Ph.D., NCI-CBIIT, Jim McKusker, Yale, Lisa Schick, ScenPro, Inc., Mukesh Sharma, Ph.D., Washington University in St Louis, Grace Stafford, Ph.D., The Jackson Laboratory, Baris Suzek, M.S., Georgetown University


Agenda

• What is it?

• Informing resources and alignment with BRIDG

• LS DAM 2.2.2

• Core Components

• Examples of the LS DAM in use

• Publishing the LS DAM

• Life Sciences Ontology Pilot Subgroup (LS OPS)


Life Sciences Domain Analysis Model (LS DAM) – What is it?

• The LS DAM Project grew out of the desire to support interoperability across several NCI cancer Biomedical Informatics Grid (caBIG®) life sciences applications

• The project was initiated in February 2009 with the NCI caBIG life sciences community as the initial stakeholder

• Collaboration with the HL7 Clinical Genomics Work Group (HL7 CG WG) began in June 2010

• Project Goals

  • To produce a shared view, in UML, of the static semantics of the domain of hypothesis-driven and discovery research at the organism, cell, and molecular level in order to facilitate human and computable semantic interoperability in that domain

  • To maintain alignment of the LS DAM with BRIDG as it also evolves, in order to facilitate human and computable semantic interoperability across the translational research continuum


Life Sciences DAM Scope Statement

Life Sciences research includes (i) in vivo experiments, (ii) ex vivo, in vitro, or in situ experiments, and (iii) in silico experiments modeling and analyzing processes or other biological phenomena.

The Life Sciences Domain Analysis Model (LS DAM) focuses on concepts important for conducting hypothesis-driven and discovery science at the organismal, cellular, and molecular level. The LS DAM includes and defines relationships between concepts that are central to specimen collection, processing, and banking (human, model organism, cell lines, etc.), in vitro imaging, and molecular biology.


Informing Resources

• Many concepts drawn from well-established, community-vetted caBIG projects

• E.g., tissue banking related (caTissue), microarray data management (caArray), laboratory information management system (caLIMS), nanomaterial data sharing portal (caNanoLab)

• Standard reference models and data exchange formats have informed modeling decisions

• ISA-TAB, FuGE, MAGE, BRIDG

• Bound to the ISO 21090 data type standard

• Aligned with the Biomedical Research Integrated Domain Group (BRIDG) model on touchpoints

• Alignment is ongoing as each model continues to evolve

• Re-using concepts “owned” by BRIDG that apply to our domain

• 53 of the 129 LS DAM 2.2.2 classes originate from BRIDG

• E.g., Person, Organization, Activity (and much of the hierarchy), Material


Alignment with BRIDG to support translational research

• Subject touch point supports traversing from LS DAM to BRIDG to support use cases as needed

• LS DAM includes a distinct specialization of BRIDG.Subject to support the specific requirements around SpecimenCollectionProtocolSubject

• Touch points can be leveraged in support of the translational research continuum


Example of coordination with the BRIDG SCC

• LS DAM has a requirement to support the concept of “equipment”

• Seemed aligned to BRIDG Device

• “Device” as a class name didn’t suit the domain well

• Definition and examples seemed to convey a more clinical nature

• Worked with BRIDG SCC to modify the definition and add examples so the single concept would suit both BRIDG and LS DAM needs

• LS DAM class name is Equipment, but it is the BRIDG Device (some attributes constrained and added)


DEFINITION: An object intended for use whether alone or in combination for diagnostic, prevention, monitoring, therapeutic, scientific, and/or experimental purposes.

EXAMPLE(S): tongue depressor, pacemaker, insulin pump, EKG machine, x-ray machine, mass spectrometer, polymerase chain reaction (PCR) machine, microscope, pH meter
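
One way to picture the Equipment/Device relationship described above is as a subclass that reuses the parent concept while adding and constraining attributes. The following is a minimal Python sketch, not the UML model itself: the attribute names (`name`, `manufacturer`, `calibration_date`) and the "manufacturer is required" constraint are assumptions invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Device:
    """Stand-in for the BRIDG Device concept (attributes are hypothetical)."""
    name: str
    manufacturer: Optional[str] = None


@dataclass
class Equipment(Device):
    """LS DAM name for the same concept, with one added and one constrained attribute."""
    calibration_date: Optional[str] = None  # hypothetical added attribute

    def __post_init__(self) -> None:
        # Hypothetical constraint: this sketch makes the optional parent
        # attribute mandatory in the LS DAM view.
        if not self.manufacturer:
            raise ValueError("Equipment requires a manufacturer in this sketch")


# An Equipment instance is still a Device, so the shared definition applies to both.
pcr = Equipment(name="PCR machine", manufacturer="ExampleCo")
```

Because `Equipment` inherits from `Device`, anything written against the shared concept keeps working while the life sciences view uses its own class name.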


LS DAM 2.2.2

[Diagram: overview of the LS DAM 2.2.2 model. Labeled areas: Person, Organization, and their Roles; Document; Experiment Core; Specimen Core; Molecular Biology Core; Molecular Databases Core; Performed Activities (imaging, observations, etc.); Material, Equipment]


LS DAM Experiment Core Component

• Conceptual representation of the process of experimentation in both hypothesis-driven and discovery research in the Life Sciences domain

• Collaborative effort with HL7 CG WG to model a common representation that can support interoperability around an expanding array of ‘omics technologies

• Heavily informed by ISA-TAB and MAGE

• Includes concepts such as: ExperimentalStudy, Experiment, ExperimentalParameter, ExperimentalFactor
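
The concepts named above can be pictured as a small containment hierarchy. The sketch below is illustrative only: the class names come from the slide, but the attributes, the cardinalities, and the reading of a factor as "a variable that is deliberately varied" and a parameter as "a setting held fixed" are assumptions, not taken from the LS DAM UML.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExperimentalFactor:
    name: str            # assumed: a variable deliberately varied
    values: List[str]


@dataclass
class ExperimentalParameter:
    name: str            # assumed: a setting held fixed for the experiment
    value: str


@dataclass
class Experiment:
    title: str
    factors: List[ExperimentalFactor] = field(default_factory=list)
    parameters: List[ExperimentalParameter] = field(default_factory=list)


@dataclass
class ExperimentalStudy:
    title: str           # assumed: a study groups one or more experiments
    experiments: List[Experiment] = field(default_factory=list)


study = ExperimentalStudy(title="Drug response study")
study.experiments.append(
    Experiment(
        title="Dose series",
        factors=[ExperimentalFactor("dose", ["0 uM", "1 uM", "10 uM"])],
        parameters=[ExperimentalParameter("incubation temperature", "37 C")],
    )
)
```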


LS DAM Experiment Core


LS DAM Experiment Core - ExperimentalItem


LS DAM Molecular Biology Core Component

• Modeling of the molecular components of cells and organisms that will enable linking to genomic and proteomic annotations and experimental findings

• Heavily informed by ISA-TAB, MAGE-TAB, CGOMS, FuGE, dbSNP, EndNote, and various public molecular databases

• Includes concepts such as Gene, Protein, NucleicAcidSequence, AminoAcidSequence, GeneticVariation


LS DAM Molecular Biology Core


LS DAM Molecular Databases Core Component

• Supports linking identifying information held in databases (such as Ensembl, UniProt, GenBank) to the molecular components defined in the molecular biology core

• Includes concepts such as GeneIdentifier, ProteinIdentifier, SNPIdentifier
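
As a rough illustration of what "linking identifying information to molecular components" can mean in practice, the sketch below attaches database identifiers to a gene and resolves a gene from a (database, accession) pair. `Gene` and `GeneIdentifier` are class names from the slides; the attributes and the `build_lookup` helper are hypothetical and not part of the LS DAM.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class GeneIdentifier:
    database: str    # source database, e.g. "Ensembl"
    accession: str   # identifier assigned by that database


@dataclass
class Gene:
    symbol: str
    identifiers: List[GeneIdentifier] = field(default_factory=list)


def build_lookup(genes: List[Gene]) -> Dict[Tuple[str, str], Gene]:
    """Index genes by (database, accession) so an external ID resolves to a Gene."""
    lookup: Dict[Tuple[str, str], Gene] = {}
    for gene in genes:
        for ident in gene.identifiers:
            lookup[(ident.database, ident.accession)] = gene
    return lookup


tp53 = Gene("TP53", [GeneIdentifier("Ensembl", "ENSG00000141510")])
lookup = build_lookup([tp53])
found = lookup[("Ensembl", "ENSG00000141510")]
```

Keeping the identifier separate from the gene lets one molecular component carry identifiers from several databases without changing the component itself.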


LS DAM Molecular Databases Core


LS DAM Specimen Core Component

• Modeling of the collection, processing and storage of physical substances originally obtained from a biological entity

• Heavily informed by caBIG caTissue, caNanoLab

• Includes concepts such as Material, BiologicSpecimen, PerformedSpecimenCollection, PerformedSpecimenProcessing, SpecimenCollectionProtocol


LS DAM Specimen Core Component


Examples of LS DAM in use

• caBIG® Molecular Annotation Service (MAS)

  • A resource for accessing molecular annotations from curated data sources

  • Information model supporting this service derived from the LS DAM

  • Requested support for annotation of genes and gene products in LS DAM

• HL7 CG WG Omics DAM

  • A DAM to support genetic testing scenarios from contact with a participant through the collection of specimens, the molecular assays performed on the specimen or a derivative, and the lab reports generated as a result

  • LS DAM leveraged as the backbone of the effort

  • Several requests submitted, many addressed in next LS DAM release

• caBIG® caLIMS2

  • A laboratory information management system

  • Leveraged several LS DAM concepts that are relevant to a laboratory environment while developing the project’s information model

  • Requested support for several identified gap concepts in LS DAM

• nano-TAB

  • An ISA-TAB based standard format for describing data related to investigations, nanomaterials, specimens, and assays in nanomedicine

  • Has been mapped to LS DAM in an effort to align the two standards


LS DAM – Publishing

• Needs of the community drive each cycle

• LS DAM 1.0 – July 2009

• LS DAM 2.2.1 – May 2011

• LS DAM 2.2.2 – January 2012

• Robust documentation set made publicly available on the LS DAM wiki (https://wiki.nci.nih.gov/x/cxRlAQ) with each iteration

• Release Summary, Change List, Mapping Document, etc.

• LS DAM User Guide

• Experiment Implementation Guide (IG)

  • Provides guidance and an illustration of how the model can be instantiated to support a sample scenario

  • Anticipate writing additional IGs as model evolves and user base grows

• Results of pilot investigation exploring the possibility of an ontological representation made available in January 2012 (LS OPS)


Life Sciences Ontology Pilot Subgroup (LS OPS)

• The LS DAM was developed and is published in UML as per CBIIT practices

• Interaction with external communities revealed that ontologies are frequently used to represent information (vs. UML) in the life sciences domain

• An ontological representation of the LS DAM would not only enable the LS DAM to leverage semantics from existing ontologies that are widely used within the life sciences community in general, but would also facilitate the adoption of the LS DAM by members of those ontology-based communities

• LS OPS was formed to explore the feasibility of creating an ontological representation from the LS DAM


LS OPS: Goals

• To explore the feasibility of expressing a portion of the LS DAM as an ontology that uses components from existing ontologies developed by other groups

• To continue engaging a variety of groups from the broader scientific community with the intent to form collaborations, exchange information regarding domain information models/ontologies, and identify areas of overlap and gaps among the various communities

• To identify opportunities to reuse components from existing ontologies to express and extend the semantics of the LS DAM


LS DAM: Ontological Representation

• Information Sources

  • Dublin Core Ontology

  • NCI Thesaurus

  • NCBI Taxonomy

  • Phenotypic Quality Ontology (PATO)

  • W3C Provenance Ontology (PROV)

  • Relations Ontology (RO)

  • Open Biological and Biomedical Ontologies (OBO)

  • Experimental Factors Ontology (EFO)

  • Cell Type Ontology (CL)

  • vCard

  • Friend of a Friend (FOAF)

• 18 of 20 classes in the constrained model were represented by components from existing ontologies
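
The coverage figure above is the kind of number a simple class-to-term mapping table produces. The sketch below shows the bookkeeping under stated assumptions: the three entries in `mapping` are invented for illustration (they are not the actual LS OPS mapping, which the slides do not list), and only the 18-of-20 count comes from the source.

```python
from typing import Dict, Optional, Tuple

# Hypothetical mapping from model class name to a term in an existing ontology.
# None marks a class with no suitable existing term (a gap).
mapping: Dict[str, Optional[str]] = {
    "Person": "foaf:Person",
    "Organization": "foaf:Organization",
    "Equipment": None,  # hypothetical gap, for illustration only
}


def coverage(m: Dict[str, Optional[str]]) -> Tuple[int, int]:
    """Return (number of classes with an existing term, total classes)."""
    mapped = sum(1 for term in m.values() if term is not None)
    return mapped, len(m)


mapped, total = coverage(mapping)

# The figure reported on the slide, expressed as a fraction of the constrained model.
reported_fraction = 18 / 20
```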


LS OPS Conclusions

• Producing an ontological representation of the LS DAM would be a worthwhile effort

• By aligning ourselves with projects on the national level we are facilitating interoperability not only within the caBIG program but also across the entire research community

• There is a wealth of existing ontologies that are widely used in the life sciences community that could be leveraged to support the semantics represented in the LS DAM


LS DAM Constrained for LS OPS