MMHCC Informatics
Providing Innovative and Integrative Informatics Solutions
Johnita Beasley (SAIC)
Dana Zhang (SAIC)
Sharon Settnek (SAIC)
2
Cancer Models Database (caMDB)
• The Cancer Models Database allows both intramural and extramural researchers to search and submit mouse models
– All models submitted by extramural researchers are curated to ensure data integrity
3
Model Search
• A simple search is available to allow researchers to search by model name, tumor, organ system, tissue, or species
• An advanced search includes genetic descriptions, carcinogenic agents, the model phenotype, cell lines, and therapeutic approaches
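As a rough illustration of the difference between the two search modes, the Python sketch below filters an in-memory list of model records. The record fields, sample values, and function names are hypothetical and are not taken from the caMDB application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ModelRecord:
    # Hypothetical fields mirroring some of the search criteria named on the slide.
    name: str
    organ: str
    species: str
    carcinogenic_agents: List[str] = field(default_factory=list)
    cell_lines: List[str] = field(default_factory=list)


def simple_search(records, keyword):
    """Match one keyword against the basic fields (name, organ, species)."""
    needle = keyword.lower()
    return [
        r for r in records
        if needle in r.name.lower()
        or needle in r.organ.lower()
        or needle in r.species.lower()
    ]


def advanced_search(records, **criteria):
    """Require every supplied criterion to match; list fields match by membership."""
    def matches(record):
        for attr, wanted in criteria.items():
            value = getattr(record, attr)
            if isinstance(value, list):
                if wanted not in value:
                    return False
            elif value.lower() != wanted.lower():
                return False
        return True
    return [r for r in records if matches(r)]


# Illustrative records only; not real caMDB entries.
records = [
    ModelRecord("Example model A", "Lung", "Mouse", ["agent-x"], ["line-1"]),
    ModelRecord("Example model B", "Liver", "Mouse", [], ["line-2"]),
    ModelRecord("Example model C", "Lung", "Rat", ["agent-y"], []),
]

lung_models = simple_search(records, "lung")
mouse_with_agent_x = advanced_search(
    records, species="mouse", carcinogenic_agents="agent-x"
)
```

Here the simple search is a single keyword over a few fields, while the advanced search combines several field-specific criteria, all of which must hold.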
4
Search Results
• Model search results are displayed in table format
– Model details can be displayed by selecting the model descriptor link
– Specific model components can be viewed by selecting each component tab
5
Image Constructs
• Search results include any associated image constructs and annotations
– Full image views with zoom and pan capabilities are available
6
Model Submission
• Users can submit new models via the Add new model link
• Users can edit previously submitted models by selecting the model descriptor
• Users can clone or delete a model that they previously submitted
7
Model Components
• Submitting a new model involves entering model information including:
– General Information
– Genetic Descriptions
– Carcinogenic Agents
– Publications
– Histopathology
– Therapeutic Approaches
– Cell Lines
– Images
– Microarray Data
• Required fields appear in bold red text
• Data is saved throughout the submission process
8
Model Curation
• The Cancer Models review process is automated in the CMD Admin Function
– A model coordinator assigns users to review specific models
– Reviewers inform the coordinator of any recommendations and modify the status of the model
– The coordinator contacts the model originator concerning any modifications
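The review steps above can be pictured as a small state machine. The Python sketch below is only an illustration of that idea: the status names, the allowed transitions, and the `Review` class are assumptions, since the slides do not list the actual status values used by the admin function.

```python
from enum import Enum


class Status(Enum):
    # Hypothetical status names; the slides do not give the real ones.
    SUBMITTED = "submitted"
    ASSIGNED = "assigned"
    CHANGES_REQUESTED = "changes requested"
    APPROVED = "approved"


# Allowed transitions, keyed by the current status.
TRANSITIONS = {
    Status.SUBMITTED: {Status.ASSIGNED},
    Status.ASSIGNED: {Status.CHANGES_REQUESTED, Status.APPROVED},
    Status.CHANGES_REQUESTED: {Status.ASSIGNED},
    Status.APPROVED: set(),
}


class Review:
    def __init__(self, model_name):
        self.model_name = model_name
        self.status = Status.SUBMITTED
        self.reviewer = None
        self.recommendations = []

    def _move_to(self, new_status):
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(
                f"cannot go from {self.status.name} to {new_status.name}"
            )
        self.status = new_status

    def assign(self, reviewer):
        """Coordinator assigns a reviewer to the model."""
        self._move_to(Status.ASSIGNED)
        self.reviewer = reviewer

    def recommend_changes(self, note):
        """Reviewer records a recommendation for the coordinator to relay."""
        self._move_to(Status.CHANGES_REQUESTED)
        self.recommendations.append(note)

    def approve(self):
        """Reviewer marks the model as approved."""
        self._move_to(Status.APPROVED)


review = Review("Example model")
review.assign("reviewer-1")
review.recommend_changes("Clarify the histopathology description")
review.assign("reviewer-1")  # the revised model goes back to the reviewer
review.approve()
```

Keeping the transitions in one table makes an out-of-order step (for example, approving a model that was never assigned) fail loudly instead of silently changing the status.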
9
New Features
• In recent months we’ve added or enhanced several caMDB features to include:
– Updating the delete and clone capabilities
– Adding the Model Vocabulary
– Adding the ability to interface with the GEDP
– Adding the Comment capability
• To stay abreast of all new features, visit our “What’s New” link on the caMDB home page.
10
Model Vocabulary
• The Enterprise Vocabulary System (EVS) is a set of services and resources that address NCI’s needs for controlled vocabulary
• The CMD utilizes EVS to access controlled vocabulary
– Organ/Tissue
– Diagnosis
11
GEDP Interface
• Users can now connect to the Gene Expression Data Portal (GEDP) to submit microarray data they’d like to be associated with a model.
12
Comments
• This newly added feature allows users of the database to comment on models and to add data to models previously entered by other labs.
13
caMDB Object Model
• The caMDB object model is represented with Unified Modeling Language (UML).
• The object model evolved through a use-case driven, architecture-centric, iterative, and incremental process.
• The object model establishes a standard set of genomic components related to an animal model.
• The object model is extensible and employs reuse (e.g., the caBIO API is used to provide access to other data sources).
14
caMDB Object Model
[UML class diagram]
• Interfaces: gov.nih.nci.caBIO.bean.EVSInterface, gov.nih.nci.caBIO.bean.TaxonInterface, gov.nih.nci.caBIO.bean.AgentInterface, ApprovalInterface
• Classes: AnimalModel, Agent, Disease, GeneticAlteration, Organ, Role, ContactInfo, TreatmentSchedule, GeneDelivery, EnvironmentalFactor, InducedMutation, Therapy, MicroArrayData, CellLine, ApprovalStatus, TargetedModification, Image, Availability, Histopathology, GenomicSegment, Transgene, Publication, Species, Xenograft, Person, CarcinogenicIntervention, SexDistribution, Phenotype
• Association roles shown near AnimalModel (with multiplicities): inducedMutations (0..*), therapies (0..*), microarrayData (0..*), cellLines (0..*), status (1), targetedModifications (0..*), images (0..*), releaseDate (1), histopathologies (0..*), genomicSegments (0..*), transgenes (0..*), publications (0..*), species (1), xenografts (0..*), principalInvestigator (1), submitter (1), carcinogenicInterventions (0..*), phenotype (0..1)
• Other association roles shown: drug (1), treatmentSchedule (0..1), organ (1), availability (1), matastasisOf (0..1), diagnoses (0..*), geneticAlteration (0..*), hostSpecies (1), roles (1..*), contactInfo (1), intervention (1), protocol (0..1), geneDelivery (1), environmentalFactor (1)
15
caMDB API
• The caMDB API provides a means of accessing animal model data submitted via the Cancer Models Database (caMDB) application.
• The API is based on the caMDB Object Model
• The caMDB objects, through their relationships, simulate the behaviors of an animal model. The model components include:
– Cell Lines
– Histopathologies
– Images
– Genetic Descriptions (Transgenes, Genomic Segments, Targeted Modifications, and Induced Mutations)
– Carcinogenic Interventions
– Therapeutic Approaches
– Microarray Data
– Publications
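The component list above can be pictured as a single aggregate object. The Python sketch below is only an illustration: the class names echo those in the object model, but the field types, the `model_descriptor` attribute, and the `component_counts` helper are assumptions. The actual caMDB API is a Java API whose signatures are not shown in these slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Publication:
    title: str


@dataclass
class CellLine:
    name: str


@dataclass
class Transgene:
    name: str


@dataclass
class AnimalModel:
    # Attribute names loosely follow the association roles in the object model;
    # the types are simplified assumptions.
    model_descriptor: str
    species: str
    cell_lines: List[CellLine] = field(default_factory=list)
    transgenes: List[Transgene] = field(default_factory=list)
    publications: List[Publication] = field(default_factory=list)

    def component_counts(self):
        """Hypothetical helper: summarize how many components are attached."""
        return {
            "cell_lines": len(self.cell_lines),
            "transgenes": len(self.transgenes),
            "publications": len(self.publications),
        }


# Illustrative values only; not real caMDB data.
model = AnimalModel(model_descriptor="Example model", species="Mouse")
model.cell_lines.append(CellLine("line-1"))
model.publications.append(Publication("Example publication"))
counts = model.component_counts()
```

Each zero-or-more association becomes a list that starts empty, so a model can be created with only its required fields and filled in as components are submitted.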
16
caMDB API
• The caMDB objects can simulate the behavior of actual genomic components linked to an animal model, such as genes, organs, and diseases, by accessing other genomic data sources such as:
- caBIO
- Enterprise Vocabulary Services (EVS)
17
caMDB API Architecture
[Architecture diagram with four columns: Clients, Presentation Layer, Object Layer, Data Sources]
• Clients: Browsers, External Java Apps, Internal Java Apps, Other Apps
• Presentation Layer: Web Server, Servlet Container, JSPs, Servlets, UI Bean, XML Builder, XSLT Engine, SOAP Engine, XML Docs, DTDs, XSL Style Sheet
• Object Layer: Object Managers, Domain Objects (Histopathologies, Phenotypes, Publications, CellLines, InducedMutations, Transgenes, Therapies, Other), Data Access Objects
• Data Sources: Databases, URLs, Flat Files
• Protocols shown on the connectors: HTML/HTTP, XML/HTTP, RMI, SOAP, JDBC, HTTP, FTP
18
caMDB API Architecture
• The caMDB API models the n-tiered architecture of caBIO with client interfaces, server components, back-end objects and data sources
• Clients (browsers, applications) can receive information (HTML and XML) from back-end objects over HTTP
– Client applications can also communicate with back-end objects via Java RMI (Java applications)
– Non-Java based applications will communicate via SOAP
• Server components communicate with back-end objects via Java RMI
• Back-end objects communicate directly with data sources (database, URLs, flat files)
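The layering described above can be sketched in a few lines. The Python example below is a simplified, single-process illustration in which a dict stands in for the data source and plain function calls stand in for HTTP, RMI, or SOAP. All class and function names are hypothetical and do not come from the caMDB API.

```python
from html import escape


class InMemoryPublicationDAO:
    """Data access object: the only layer that touches the data source.
    A dict stands in for the database, URL, or flat file."""

    def __init__(self, rows):
        self._rows = rows

    def find_by_model(self, model_id):
        return list(self._rows.get(model_id, []))


class PublicationManager:
    """Back-end object: callers use this and never reach the data source directly."""

    def __init__(self, dao):
        self._dao = dao

    def titles_for(self, model_id):
        return sorted(self._dao.find_by_model(model_id))


def render_html(titles):
    """Presentation layer: turn back-end results into markup for a browser client."""
    items = "".join(f"<li>{escape(t)}</li>" for t in titles)
    return f"<ul>{items}</ul>"


# Illustrative data only.
dao = InMemoryPublicationDAO({"model-1": ["Paper B", "Paper A"]})
manager = PublicationManager(dao)
page = render_html(manager.titles_for("model-1"))
```

Because the manager depends only on the DAO's `find_by_model` method, the dict-backed DAO could be swapped for one that reads a database or a flat file without changing the presentation code.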
19
Image Portal
• The NCICB has developed an image portal to allow researchers to search and submit rodent and human images with annotations
20
Image Portal
• Image annotations may include a detailed description, species, organ, diagnosis, strain, and image dimensions
21
Imaging Technologies
• The NCICB is investigating imaging technologies to facilitate efficient image retrieval and annotation integration
• Imaging technologies include JPEG 2000, DICOM 3, and Image Content Servers
– JPEG 2000 is a standard currently under development that defines a set of lossless (bit-preserving) and lossy compression methods for coding continuous-tone, bi-level, gray-scale, or color digital still images
– DICOM 3 (Digital Imaging and Communications in Medicine) is the industry standard for the transferal of radiology images and other medical information between computers
• DICOM-SR (Structured Reporting) is a UML and XML representation of the DICOM specification
– Image Content Servers provide a mechanism to speed image transmission and improve image quality
• NCICB is exploring interfacing with the MIRC Project (RSNA)
22
Image Annotation Standards
• To facilitate image sharing, a “minimal” set of image annotations is necessary
• Image annotations should leverage existing standards and may be derived from use cases for image retrieval and analysis
• Annotations should include parent-child relationships
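One way to picture parent-child annotations is as a small tree in which each annotation knows its parent and its children. The Python sketch below is an illustration under that assumption. The `Annotation` class, its methods, and the sample values are hypothetical, and only the annotation labels (species, strain) come from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Annotation:
    label: str
    value: str
    # The parent link is excluded from repr and equality to avoid
    # infinite recursion through the parent/children cycle.
    parent: Optional["Annotation"] = field(default=None, repr=False, compare=False)
    children: List["Annotation"] = field(default_factory=list)

    def add_child(self, label, value):
        """Create a child annotation and link it in both directions."""
        child = Annotation(label, value, parent=self)
        self.children.append(child)
        return child

    def path(self):
        """Labels from the root annotation down to this one."""
        node, labels = self, []
        while node is not None:
            labels.append(node.label)
            node = node.parent
        return list(reversed(labels))


# Illustrative values only.
species = Annotation("species", "Mouse")
strain = species.add_child("strain", "Example strain")
```

With this structure, a retrieval use case such as "all images for a species, regardless of strain" can be answered by walking from a parent annotation down through its children.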
23
Image Object Model
24
caIMAGE Architecture
[Architecture diagram. Components: Browser Client; caIMAGE Web Application Server; LizardTech Image Content Server; Lizard Image Converter; Network File System (Images, Sid Files); caIMAGE Database (Image Annotations)]
25
MMHCC Links
• EMICE Website: http://emice.nci.nih.gov
• Cancer Models Database (caMDB): http://cancermodels.nci.nih.gov
• Cancer Image Portal (caImage): http://cancerimages.nci.nih.gov
• caMDB API (including JavaDocs and Object Model): http://emice.nci.nih.gov/MMHCC/mmhcc_organization/members/bioinformatics