

1

LS DAM Overview

January 31, 2012

Core Team: Ian Fore, D.Phil., NCI CBIIT, Robert Freimuth, Ph.D., Mayo Clinic, Elaine Freund, Ph.D., 3rd Millennium, Joyce Hernandez, Merck, Jason Hipp, M.D./Ph.D., University of Michigan, Jenny Kelley, M.A., NCI LPG, Juli Klemm, Ph.D., NCI-CBIIT, Jim McKusker, Yale, Lisa Schick, ScenPro, Inc., Mukesh Sharma, Ph.D., Washington University in St Louis, Grace Stafford, Ph.D., The Jackson Laboratory, Baris Suzek, M.S., Georgetown University

2

Agenda

• What is it?

• Informing resources and alignment with BRIDG

• LS DAM 2.2.2

• Core Components

• Examples of the LS DAM in use

• Publishing the LS DAM

• Life Sciences Ontology Pilot Subgroup (LS OPS)

3

Life Sciences Domain Analysis Model (LS DAM) – What is it?

• The LS DAM Project grew out of the desire to support interoperability across several NCI cancer Biomedical Informatics Grid (caBIG®) life sciences applications

• The project was initiated in February 2009 with the NCI caBIG life sciences community as the initial stakeholder

• Collaboration with the HL7 Clinical Genomics Work Group (HL7 CG WG) began in June 2010

• Project Goals

  • To produce a shared view, in UML, of the static semantics of the domain of hypothesis-driven and discovery research at the organism, cell, and molecular level in order to facilitate human and computable semantic interoperability in that domain

  • To maintain alignment of the LS DAM with BRIDG as it also evolves, in order to facilitate human and computable semantic interoperability across the translational research continuum

4

Life Sciences DAM Scope Statement

Life Sciences research includes (i) in vivo experiments, (ii) ex vivo, in vitro, or in situ experiments, and (iii) in silico experiments modeling and analyzing processes or other biological phenomena.

The Life Sciences Domain Analysis Model (LS DAM) focuses on concepts important for conducting hypothesis-driven and discovery science at the organismal, cellular, and molecular level. The LS DAM includes and defines relationships between concepts that are central to specimen collection, processing, and banking (human, model organism, cell lines, etc.), in vitro imaging, and molecular biology.

5

Informing Resources

• Many concepts drawn from well-established, community-vetted caBIG projects

  • E.g., tissue banking related (caTissue), microarray data management (caArray), laboratory information management system (caLIMS), nanomaterial data sharing portal (caNanoLab)

• Standard reference models and data exchange formats have informed modeling decisions

  • ISA-TAB, FuGE, MAGE, BRIDG

• Bound to the ISO 21090 data type standard

• Aligned with the Biomedical Research Integrated Domain Group (BRIDG) model on touchpoints

  • Alignment is ongoing as each model continues to evolve

  • Re-using concepts “owned” by BRIDG that apply to our domain

  • 53 of the 129 LS DAM 2.2.2 classes originate from BRIDG

    • E.g., Person, Organization, Activity (and much of the hierarchy), Material
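
To put the reuse figure in proportion, the short calculation below works out the share of LS DAM 2.2.2 classes that originate from BRIDG. The two counts come from the slide; the variable names are only illustrative.

```python
# Counts taken from the slide: 53 of the 129 LS DAM 2.2.2 classes come from BRIDG.
bridg_classes = 53
total_classes = 129

# Fraction of the model that is reused from BRIDG.
share = bridg_classes / total_classes
print(f"{share:.1%} of LS DAM 2.2.2 classes originate from BRIDG")
```

That is roughly 41%, or about two in every five classes.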

6

Alignment with BRIDG to support translational research

• Subject touch point supports traversing from LS DAM to BRIDG to support use cases as needed

• LS DAM includes a distinct specialization of BRIDG.Subject to support the specific requirements around SpecimenCollectionProtocolSubject

• Touch points can be leveraged in support of the translational research continuum

7

Example of coordination with the BRIDG SCC

• LS DAM has a requirement to support the concept of “equipment”

  • Seemed aligned to BRIDG Device

  • “Device” as a class name didn’t suit the domain well

  • Definition and examples seemed to convey a more clinical nature

• Worked with the BRIDG SCC to modify the definition and add examples so the single concept would suit both BRIDG and LS DAM needs

• The LS DAM class name is Equipment, but it is the BRIDG Device (with some attributes constrained and others added)

DEFINITION: An object intended for use whether alone or in combination for diagnostic, prevention, monitoring, therapeutic, scientific, and/or experimental purposes.

EXAMPLE(S): tongue depressor, pacemaker, insulin pump, EKG machine, x-ray machine, mass spectrometer, polymerase chain reaction (PCR) machine, microscope, pH meter

8

LS DAM 2.2.2

[Diagram of LS DAM 2.2.2 components: Person, Organization, and their Roles; Document; Experiment Core; Specimen Core; Molecular Biology Core; Molecular Databases Core; Performed Activities (imaging, observations, etc.); Material, Equipment]

9

LS DAM Experiment Core Component

• Conceptual representation of the process of experimentation in both hypothesis driven and discovery research in the Life Sciences domain

• Collaborative effort with HL7 CG WG to model a common representation that can support interoperability around an expanding array of ‘omics technologies

• Heavily informed by ISA-TAB and MAGE

• Includes concepts such as: ExperimentalStudy, Experiment, ExperimentalParameter, ExperimentalFactor
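
As a rough illustration of how the concepts named above might fit together, here is a minimal Python sketch. Only the four class names come from the slide. Every attribute, the containment relationships between the classes, and the example values are assumptions made for illustration and are not taken from the LS DAM itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExperimentalFactor:
    # Hypothetical attributes: a variable deliberately varied across runs.
    name: str
    values: List[str]


@dataclass
class ExperimentalParameter:
    # Hypothetical attributes: a setting held fixed for an experiment.
    name: str
    value: str


@dataclass
class Experiment:
    # Assumed here to own its factors and parameters.
    name: str
    factors: List[ExperimentalFactor] = field(default_factory=list)
    parameters: List[ExperimentalParameter] = field(default_factory=list)


@dataclass
class ExperimentalStudy:
    # Assumed here to group one or more experiments.
    title: str
    experiments: List[Experiment] = field(default_factory=list)

    def factor_names(self) -> List[str]:
        """Return distinct factor names across all experiments, in first-seen order."""
        seen: List[str] = []
        for experiment in self.experiments:
            for factor in experiment.factors:
                if factor.name not in seen:
                    seen.append(factor.name)
        return seen


# Made-up example data, for illustration only.
study = ExperimentalStudy(
    title="Example expression study",
    experiments=[
        Experiment(
            name="Microarray run 1",
            factors=[ExperimentalFactor("dose", ["0 mg", "10 mg"])],
            parameters=[ExperimentalParameter("hybridization temperature", "45 C")],
        )
    ],
)
print(study.factor_names())
```

The actual model is expressed in UML and bound to ISO 21090 data types, so a real implementation would likely look quite different; the sketch only shows the general idea of a study grouping experiments that carry factors and parameters.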

10

LS DAM Experiment Core

11

LS DAM Experiment Core - ExperimentalItem

12

LS DAM Molecular Biology Core Component

• Modeling of the molecular components of cells and organisms that will enable linking to genomic and proteomic annotations and experimental findings

• Heavily informed by ISA-TAB, MAGE-TAB, CGOMS, FuGE, dbSNP, EndNote, and various public molecular databases

• Includes concepts such as Gene, Protein, NucleicAcidSequence, AminoAcidSequence, GeneticVariation

13

LS DAM Molecular Biology Core

14

LS DAM Molecular Databases Core Component

• Supports linking identifying information held in databases (such as Ensembl, UniProt, GenBank) to the molecular components defined in the molecular biology core

• Includes concepts such as GeneIdentifier, ProteinIdentifier, SNPIdentifier
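
The linking idea described above can be sketched as follows. The class names `Gene` and `GeneIdentifier` appear in the slides; the attributes (`database`, `accession`, `symbol`), the `build_index` helper, and the lookup approach are hypothetical and chosen only to illustrate resolving an external database identifier to a molecular component.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class GeneIdentifier:
    # Hypothetical attributes: the source database and the accession it assigns.
    database: str
    accession: str


@dataclass
class Gene:
    # Hypothetical attributes: a symbol plus any number of external identifiers.
    symbol: str
    identifiers: List[GeneIdentifier] = field(default_factory=list)


def build_index(genes: List[Gene]) -> Dict[Tuple[str, str], Gene]:
    """Index genes by (database, accession) so an external ID resolves to a Gene."""
    index: Dict[Tuple[str, str], Gene] = {}
    for gene in genes:
        for identifier in gene.identifiers:
            index[(identifier.database, identifier.accession)] = gene
    return index


# ENSG00000141510 is the Ensembl gene identifier for human TP53.
tp53 = Gene("TP53", [GeneIdentifier("Ensembl", "ENSG00000141510")])
index = build_index([tp53])

found = index.get(("Ensembl", "ENSG00000141510"))
print(found.symbol if found else "not found")
```

Keeping the identifier separate from the gene lets one molecular component carry identifiers from several databases without changing the component itself, which is one plausible reading of why the model splits these concepts into their own core.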

15

LS DAM Molecular Databases Core

16

LS DAM Specimen Core Component

• Modeling of the collection, processing and storage of physical substances originally obtained from a biological entity

• Heavily informed by caBIG caTissue, caNanoLab

• Includes concepts such as Material, BiologicSpecimen, PerformedSpecimenCollection, PerformedSpecimenProcessing, SpecimenCollectionProtocol

17

LS DAM Specimen Core Component

18

Examples of LS DAM in use

• caBIG® Molecular Annotation Service (MAS)

  • A resource for accessing molecular annotations from curated data sources

  • Information model supporting this service derived from the LS DAM

  • Requested support for annotation of genes and gene products in LS DAM

• HL7 CG WG Omics DAM

  • A DAM to support genetic testing scenarios from contact with a participant through the collection of specimens, the molecular assays performed on the specimen or a derivative, and the lab reports generated as a result

  • LS DAM leveraged as the backbone of the effort

  • Several requests submitted, many addressed in next LS DAM release

• caBIG® caLIMS2

  • A laboratory information management system

  • Leveraged several LS DAM concepts that are relevant to a laboratory environment while developing the project’s information model

  • Requested support for several identified gap concepts in LS DAM

• nano-TAB

  • An ISA-TAB based standard format for describing data related to investigations, nanomaterials, specimens, and assays in nanomedicine

  • Has been mapped to LS DAM in an effort to align the two standards

19

LS DAM – Publishing

• Needs of the community drive each cycle

• LS DAM 1.0 – July 2009

• LS DAM 2.2.1 – May 2011

• LS DAM 2.2.2 – January 2012

• Robust documentation set made publicly available on the LS DAM wiki (https://wiki.nci.nih.gov/x/cxRlAQ) with each iteration

• Release Summary, Change List, Mapping Document, etc.

• LS DAM User Guide

• Experiment Implementation Guide (IG)

  • Provides guidance and an illustration of how the model can be instantiated to support a sample scenario

  • Anticipate writing additional IGs as model evolves and user base grows

• Results of pilot investigation exploring the possibility of an ontological representation made available in January 2012 (LS OPS)

20

Life Sciences Ontology Pilot Subgroup (LS OPS)

• The LS DAM was developed and is published in UML as per CBIIT practices

• Interaction with external communities revealed that ontologies are frequently used to represent information (vs. UML) in the life sciences domain

• An ontological representation of the LS DAM would not only enable the LS DAM to leverage semantics from existing ontologies that are widely used within the life sciences community in general, but would also facilitate the adoption of the LS DAM by members of those ontology-based communities

• LS OPS was formed to explore the feasibility of creating an ontological representation from the LS DAM

21

LS OPS: Goals

• To explore the feasibility of expressing a portion of the LS DAM as an ontology that uses components from existing ontologies developed by other groups

• To continue engaging a variety of groups from the broader scientific community with the intent to form collaborations, exchange information regarding domain information models/ontologies, and identify areas of overlap and gaps among the various communities

• To identify opportunities to reuse components from existing ontologies to express and extend the semantics of the LS DAM

22

LS DAM: Ontological Representation

• Information Sources

  • Dublin Core Ontology

  • NCI Thesaurus

  • NCBI Taxonomy

  • Phenotypic Quality Ontology (PATO)

  • W3C Provenance Ontology (PROV)

  • Relations Ontology (RO)

  • Open Biological and Biomedical Ontologies (OBO)

  • Experimental Factors Ontology (EFO)

  • Cell Type Ontology (CL)

  • vCard

  • Friend of a Friend (FOAF)

• 18 of 20 classes in the constrained model were represented by components from existing ontologies

23

LS OPS Conclusions

• Producing an ontological representation of the LS DAM would be a worthwhile effort

• By aligning ourselves with projects on the national level we are facilitating interoperability not only within the caBIG program but also across the entire research community

• There is a wealth of existing ontologies, widely used in the life sciences community, that could be leveraged to support the semantics represented in the LS DAM

24

LS DAM Constrained for LS OPS
