1 Introduction to the caDSR Presented to HL7 Vocab SIG January 24, 2005 Denise Warzel National Cancer Institute, Center for Bioinformatics caDSR Project Officer, Software Development


Introduction to the caDSR

Presented to HL7 Vocab SIG

January 24, 2005

Denise Warzel

National Cancer Institute, Center for Bioinformatics

caDSR Project Officer, Software Development


Presentation Outline

• caCORE Overview

• ISO/IEC 11179 Overview

• caDSR Implementation and tooling


caCORE Components

Enterprise Vocabulary

Data Standards

Bioinformatics Objects

• caCORE is the open-source foundation upon which the NCICB builds its research information management systems


caCORE Infrastructure wiring

Vocabulary for CDE specification

Dictionary, thesaurus services

Domain object metadata

Common data elements

Public APIs

Common data elements (CDEs)


Presentation Outline

• caCORE Overview

• ISO/IEC 11179 Overview

• caDSR Implementation and tooling


Terms and Definitions for ISO/IEC 11179

Administered Item: A registry item for which administrative information is recorded in an Administration Record.

Data Element: A unit of data for which the definition, identification, representation, and permissible values are specified by means of a set of attributes.

Data Element Concept: An idea that can be represented in the form of a data element, described independently of any particular representation.

Value Domain: A set of attributes describing representational characteristics of instance data with or without enumerated permissible values.

Value Meaning: A member of the set of finite allowed inventory of notions that can be categorized for a conceptual domain.

Permissible Value: An expression of a value meaning in a specific value domain.

Representation Class: A classification of data elements based upon the type of representational form.

Conceptual Domain: A set of possible value meanings of a data element expressed without representation.

Data Element Representation: The part of a data element having a value domain, datatype, and other representational specifications.
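The relationships among these terms can be easier to follow as a small data model. The following is a minimal sketch, not the ISO/IEC 11179 metamodel or the caDSR schema: the class and field names are illustrative assumptions, the "CHARACTER" datatype and the rejected value "Aspirin" are made up for the example, and the agent names are borrowed from the chemopreventive agent example later in this deck.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ConceptualDomain:
    """A set of possible value meanings, expressed without representation."""
    name: str
    value_meanings: Tuple[str, ...] = ()


@dataclass(frozen=True)
class DataElementConcept:
    """An idea described independently of any particular representation."""
    name: str
    conceptual_domain: ConceptualDomain


@dataclass(frozen=True)
class PermissibleValue:
    """An expression of a value meaning in a specific value domain."""
    value: str          # how the value is written in this value domain
    value_meaning: str  # the notion it expresses


@dataclass(frozen=True)
class ValueDomain:
    """Representational characteristics, with or without enumerated values."""
    name: str
    conceptual_domain: ConceptualDomain
    datatype: str  # illustrative free-text datatype label (assumption)
    permissible_values: Tuple[PermissibleValue, ...] = ()

    def accepts(self, value: str) -> bool:
        # A non-enumerated value domain places no restriction in this sketch.
        if not self.permissible_values:
            return True
        return any(pv.value == value for pv in self.permissible_values)


@dataclass(frozen=True)
class DataElement:
    """A data element concept paired with a value domain."""
    name: str
    concept: DataElementConcept
    value_domain: ValueDomain


agents = ConceptualDomain("Agent", ("Eflornithine", "Ursodiol"))
concept = DataElementConcept("Chemopreventive Agent", agents)
names = ValueDomain(
    "Chemopreventive Agent Name",
    agents,
    "CHARACTER",
    tuple(PermissibleValue(m, m) for m in agents.value_meanings),
)
element = DataElement("Chemopreventive Agent Name", concept, names)

print(element.value_domain.accepts("Ursodiol"))  # True
print(element.value_domain.accepts("Aspirin"))   # False
```

Keeping the concept and the value domain as separate objects is the point of the sketch: the same Data Element Concept could be paired with a different Value Domain to give a different Data Element.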


What is ISO/IEC 11179?

• ISO/IEC 11179 Parts 1-6: Information technology – Specification and Standardization of data elements
  – A metamodel for ‘data element’ metadata
  – Standard by which to convey semantic, syntactic and lexical meaning
    • Human and machine understandable
    • Unambiguous


ISO/IEC 11179 Information technology Standard

• ISO/IEC 11179 Part 1: Framework for the specification and standardization of data elements

• ISO/IEC 11179 Part 2: Classification for data elements

• ISO/IEC 11179 Part 3: Registry metamodel and basic attributes

• ISO/IEC 11179 Part 4: Rules and Guidelines for the Formulation of Data Elements

• ISO/IEC 11179 Part 5: Naming and Identification Principles for Data Elements

• ISO/IEC 11179 Part 6: Registration of data elements

• Publicly Available from:

• http://isotc.iso.ch/livelink/livelink/fetch/2000/2489/Ittf_Home/PubliclyAvailableStandards.htm??Redirect=1


Basic Metamodel Components

[Figure: UML class diagram relating Data_Element_Concept, Conceptual_Domain, Data_Element and Value_Domain. The associations are labelled data_element_concept_conceptual_domain_relationship, expression, representation and specification, each with 0..* to 1..1 multiplicities. Data Element Concept and Conceptual Domain are grouped under "Perception"; Data Element and Value Domain are grouped under "Representation".]


Why ISO/IEC 11179?

• “What is this datum?”
  – Provides concrete guidance on the creation and maintenance of discrete data element attributes and metadata (semantics), enabling the formulation of data elements in a consistent, standard manner
• “Metadata Repository/Registry”
  – A framework for data element standardization and registration allows the creation of a shared data environment in much less time and with much less effort than conventional data management methodologies require
• Adoption of 11179 allowed us to “Get on with it”


ISO/IEC 11179 Administered Items

Administered_Item

Context (for administered item)

Classification_Scheme

Object_Class

Property

Data_Element_Concept

Conceptual_Domain

Data_Element

Value_Domain

Representation_Class

Derivation_Rule


ISO/IEC Administered Item

Administration Record and Common Attributes

• Unique Identifier
• Administrative Status
• Registration Status
• Creation Date
• Administrative Note(s)
• Effective Date
• Change Date(s)
• Change Description(s)
• Origin
• Until Date
• Created By
• Modified By
• Name(s)
• Definition(s)
• Stewardship Information
• Submitter Information
• Reference Document(s)
• Classifications


ISO/IEC 11179 NCICB Extensions

Administered_Item

Context (for administered item)

Classification_Scheme

Object_Class

Property

Data_Element_Concept

Conceptual_Domain

Data_Element

Value_Domain

Representation_Class

Derivation_Rule

The Concept Class Provides Semantic Linkage

Form

Concept Class


caDSR Implementation of ISO/IEC 11179 Model

• Context: caCORE
• Classification Schemes: caDSR Training
• Object: Agent
• Property: Chemopreventive
• Conceptual Domain: Agent
• Data Element Concept: Chemopreventive Agent
• Representation: Name
• Data Element: Chemopreventive Agent Name
• Value Domain: Chemopreventive Agent Name
• Valid Values: Cyclooxygenase Inhibitor, Doxercalciferol, Eflornithine, …, Ursodiol
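The example on this slide shows names being built up from their parts: the Property and Object give the Data Element Concept name, and adding the Representation gives the Data Element name. A minimal sketch of that composition follows. The function name is hypothetical, and the ordering (property, then object, then representation) is inferred from this single example rather than taken from a documented caDSR naming rule.

```python
from typing import Tuple


def compose_names(object_class: str, prop: str, representation: str) -> Tuple[str, str]:
    """Return (data element concept name, data element name).

    Assumption: the property precedes the object class, as in the
    "Chemopreventive Agent" example; other items may be named differently.
    """
    concept_name = f"{prop} {object_class}"
    element_name = f"{concept_name} {representation}"
    return concept_name, element_name


concept_name, element_name = compose_names("Agent", "Chemopreventive", "Name")
print(concept_name)  # Chemopreventive Agent
print(element_name)  # Chemopreventive Agent Name
```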


NCICB Concept Class Common Attributes

• Concept Class
  – Administered Item attributes +
• Concept Unique Identifier
  – Pointer to an externally defined concept
• Concept Definition Source
  – Names the source terminology/ontology/vocabulary
• Concept Relationship
  – Semantic order of the concepts
  – NOTE: ISO describes a ‘Concept Relationship’ as a semantic link among two or more concepts. There is a subtlety in our implementation: in caDSR we use concept relationships more as a derivation rule, naming the order of the concepts, not as semantic relationships in the ontologic or object-model sense of ‘relationship’.
• Object Class, Property, Representation term, Qualifier terms, Value Domains


Why are vocabularies/ontologies important?

• Goal: “Semantically unambiguous interoperability”
• Data Element curators are not necessarily vocabulary experts
• NCI had a terminology and vocabulary services group: EVS
• Semantic integration is achieved by tying standard vocabulary identifier codes to the caDSR metadata
• ISO 11179 provides the framework; we were looking for something that could be computed without a human having to read and interpret definitions
• By abstracting the curation of concepts in caDSR and instead relying on external vocabularies


EVS and caDSR Distinctions

• caDSR is a metadata repository
  – Maintains metadata to permit a user to locate the correct data element defining the characteristics of a datum, an instance of a specific concept, in sufficient detail to be collected and stored on a computer
• EVS is a terminology server
  – Provides services for synonymy, mapping between vocabularies, hierarchical structures, subconcepts, superconcepts, roles, semantic type, etc.


Presentation Outline

• caCORE Overview

• ISO/IEC 11179 Overview

• caDSR Implementation and tooling


caDSR Overview

• NCI Data Element Metadata repository and registry
• Based on ISO/IEC 11179
• Designed to integrate caCORE infrastructure
• Supports the development and deployment of Data Elements that are used as metadata descriptors, primarily for NCI-sponsored research, with an ever-widening range of end users

• Available as an open-source download


caDSR Tools

• Goals of caDSR Tools development:
  – Simplify development and creation of ISO/IEC 11179 compliant metadata by Data Element Curators and UML Modelers
  – Simplify consumption of Data Elements by end users and application developers
  – Enhance reuse of Data Elements for all
  – Enable semantic consistency across research domains
  – Support metadata life-cycle and governance processes


caDSR Home Page

Curators Developers General


Introduction to caDSR Tools

– CDE Browser to Search for and Download
– Form Builder to Create user specified collections of CDEs
– Side-by-Side Compare
– CDE Curation Tool to Create Data Elements
– Admin Tool to Curate and Administer caDSR - “Power Users”
– Sentinel Tool (3.0)
  • Generates end user ‘Alerts’ triggered by metadata changes
– Batch Load to import Administered Items
  • Excel Loader (MS Excel)
  • UML Loader (XMI)
  • Case Report Form Loader (MS Excel)

Access, Develop, Manage, Consume


CDE Browser

• View, Search, Download
  – Shopping cart feature
• FormBuilder to Build / Download Forms and Data Elements
• “Context Browsing” Tree
  – By Classification Schemes
  – By Forms
• CDE Basic Search Criteria
  – Google-like search
  – Sortable search results by clicking on column headings


CDE Browser

• Advanced Search Criteria
  – Leverages ISO attributes
    • Find all with “18254-3” permissible value
    • Find all with “Gene*”
    • Find all with “Released” workflow status
    • Find all with “Standard” Registration status
    • Etc.
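The advanced search criteria listed for the CDE Browser amount to filtering data elements on their ISO attributes. A minimal sketch of such a filter follows. It is not the CDE Browser's implementation: the record layout, field names, sample records and the "Draft" and "Candidate" status values are illustrative assumptions, and only the search terms themselves ("Gene*", "Released", "Standard", "18254-3") come from the slide.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List, Optional

# Hypothetical records standing in for data element metadata.
CDES: List[Dict] = [
    {"long_name": "Gene Symbol", "workflow_status": "Released",
     "registration_status": "Standard", "permissible_values": []},
    {"long_name": "Gene Expression Level", "workflow_status": "Draft",
     "registration_status": "Candidate", "permissible_values": []},
    {"long_name": "Lab Test Code", "workflow_status": "Released",
     "registration_status": "Standard", "permissible_values": ["18254-3"]},
]


def search(
    cdes: Iterable[Dict],
    name_pattern: Optional[str] = None,
    workflow_status: Optional[str] = None,
    registration_status: Optional[str] = None,
    permissible_value: Optional[str] = None,
) -> List[Dict]:
    """Return the records that satisfy every criterion that was supplied."""
    results = []
    for cde in cdes:
        # "*" wildcards are matched shell-style, case-sensitively.
        if name_pattern is not None and not fnmatchcase(cde["long_name"], name_pattern):
            continue
        if workflow_status is not None and cde["workflow_status"] != workflow_status:
            continue
        if registration_status is not None and cde["registration_status"] != registration_status:
            continue
        if permissible_value is not None and permissible_value not in cde["permissible_values"]:
            continue
        results.append(cde)
    return results


print([c["long_name"] for c in search(CDES, name_pattern="Gene*")])
print([c["long_name"] for c in search(CDES, name_pattern="Gene*", workflow_status="Released")])
print([c["long_name"] for c in search(CDES, permissible_value="18254-3")])
```

Criteria combine with AND in this sketch, so adding a filter can only narrow the result.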


Form Builder

• Create and Manage Forms
  – Organize CDEs into modules within a Form
  – Attach PDF or Word format
  – Classify Forms into groupings for specific end user communities
  – “Publish” “Un-Publish” for Browser Catalog visibility

• “Printer Friendly” version

• Download CDEs


CDE Side-by-Side Compare

• CDE Side-by-Side Compare
  – Build shopping cart, compare CDE metadata side by side
  – Download to Excel spreadsheet


Curation Tool

• To Create, Edit or Version:
  – Data Element Concepts
  – Value Domains
  – Data Elements
• ISO 11179 Wizard
  – Construct ISO compliant Data Elements by building up the pieces
    • Builds Names and Definitions from underlying components
• “Get Associated”
  – Leverage ISO to retrieve related CDEs
• “Block Edit”
• “Shopping cart”
• Assign classification schemes
• Versioning


Administration Tool

• System Administration
  – User Accounts and Security
  – Lists of Values (LOVs) used in content creation
• Create “Framework”:
  – Conceptual Domains
  – Classification Schemes (basis for organizing CDEs in Browser)
  – Protocols


Sentinel Tool

• Create “Alerts”
  – User defined triggers based on data element metadata attributes
  – “Notify me of any change to the Value Domain for any CDE on the Adverse Event Form”

• Generates and emails a report of changes matching “Alert” criteria
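The alert described above can be read as two steps: find what changed between two snapshots of the metadata, then keep only the changes that match the alert criteria. The following is a minimal sketch of that idea, not the Sentinel Tool's implementation. The snapshot layout, the identifiers "CDE-1" and "CDE-2", the attribute names and the sample values are all hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, NamedTuple, Sequence


class Change(NamedTuple):
    cde_id: str
    attribute: str
    old: object
    new: object


def diff(before: Dict[str, dict], after: Dict[str, dict]) -> Iterator[Change]:
    """Yield one Change per attribute whose value differs between snapshots."""
    for cde_id, new_record in after.items():
        old_record = before.get(cde_id, {})  # a new CDE reports every attribute
        for attribute, new_value in new_record.items():
            if old_record.get(attribute) != new_value:
                yield Change(cde_id, attribute, old_record.get(attribute), new_value)


def alert_report(
    changes: Iterable[Change],
    forms_by_cde: Dict[str, Sequence[str]],
    form: str,
    attribute: str,
) -> List[Change]:
    """Keep changes to `attribute` on CDEs that appear on `form`."""
    return [
        change for change in changes
        if change.attribute == attribute and form in forms_by_cde.get(change.cde_id, ())
    ]


before = {
    "CDE-1": {"value_domain": "Grade 1-4", "definition": "Severity grade"},
    "CDE-2": {"value_domain": "Yes/No", "definition": "Was the event serious?"},
}
after = {
    "CDE-1": {"value_domain": "Grade 1-5", "definition": "Severity grade"},
    "CDE-2": {"value_domain": "Yes/No", "definition": "Was the event serious"},
}
forms_by_cde = {"CDE-1": ["Adverse Event"], "CDE-2": ["Adverse Event"]}

report = alert_report(diff(before, after), forms_by_cde, "Adverse Event", "value_domain")
for change in report:
    print(f"{change.cde_id}: {change.attribute} {change.old!r} -> {change.new!r}")
```

Only the CDE-1 change is reported here: CDE-2 is on the same form, but its change is to the definition, which the alert does not watch.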


Batch Loading

OC caDSR DEFAULT VALUES: Workflow status = "Released" always. Version = 1.0 always. Create Date = date loaded by Loader. Created by = EVS. Long Name = EVS Preferred Name.

Column                  Type             Nullable  Mapped to / format
EVS Preferred Name      VARCHAR2 (20)    Not Null  Mapped to Long Name and Preferred Name
Definition              VARCHAR2 (2000)  Not Null  PreferredDefinition
Definition Source       VARCHAR2 (2000)  Null      Definition Source
Database                VARCHAR2 (255)   Not Null  Database
Context Preferred Name  VARCHAR2 (20)    Not Null  Requestors Context
Effective Begin Date    VARCHAR2 (30)    Null      YY.MM.B
Change Note             VARCHAR2 (2000)  Null      Text
Alternate Name Type     VARCHAR2 (20)    Not Null  AlternateName.Type

Sample rows (all three share Definition Source = NCI, Database = NCI Thesaurus, Context Preferred Name = caBIG, Effective Begin Date = 11/18/2004, Change Note = Requested by Dianne Reeves, Alternate Name Type = NCI_Concept_Code):

• Celsius Scale: The temperature scale defined by the values 0 degree Celsius for the freezing point of water and 100 degrees Celsius for the boiling point of water. The Celsius degree (C) is the same size as a Kelvin and equal to (F - 32)/1.8. To convert Celsius to Fah
• HEENT: HEENT is the Head, Ears, Eyes, Nose and Throat, and is referred to as a body system on a physical or medical examination. The term is typically used as 'HEENT' in a physician or caregiver notes.
• Gracely Pain Unpleasantness Scale: The Gracely Pain Unpleasantness Scale is a visual analog scale of 0 to 20 used by a subject to define their pain unpleasantness experience. Together with the intensity scale these tools serve to differentiate the patient's sensory perception of pain inte
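The column types and Null / Not Null flags in the loader worksheet describe constraints that a row can be checked against before loading. A minimal sketch of such a check follows. It is not the Excel Loader itself: the function name is hypothetical, the pairing of each column with its length and nullability assumes the worksheet's rows line up left to right, and the sample definition text is shortened for the example.

```python
from typing import Dict, List, Optional

# (column name, maximum length, nullable), as read from the worksheet header.
COLUMN_SPEC = [
    ("EVS Preferred Name", 20, False),
    ("Definition", 2000, False),
    ("Definition Source", 2000, True),
    ("Database", 255, False),
    ("Context Preferred Name", 20, False),
    ("Effective Begin Date", 30, True),
    ("Change Note", 2000, True),
    ("Alternate Name Type", 20, False),
]


def validate_row(row: Dict[str, Optional[str]]) -> List[str]:
    """Return a list of problems; an empty list means the row passes."""
    problems = []
    for column, max_length, nullable in COLUMN_SPEC:
        value = row.get(column)
        if value is None or value == "":
            if not nullable:
                problems.append(f"{column}: value required")
        elif len(value) > max_length:
            problems.append(f"{column}: {len(value)} characters exceeds {max_length}")
    return problems


row = {
    "EVS Preferred Name": "Celsius Scale",
    "Definition": "The temperature scale defined by the freezing and boiling points of water.",
    "Definition Source": "NCI",
    "Database": "NCI Thesaurus",
    "Context Preferred Name": "caBIG",
    "Effective Begin Date": "11/18/2004",
    "Change Note": "Requested by Dianne Reeves",
    "Alternate Name Type": "NCI_Concept_Code",
}
print(validate_row(row))

incomplete = dict(row, Database=None)
print(validate_row(incomplete))
```

Collecting every problem, rather than stopping at the first, lets a curator fix a whole worksheet row in one pass.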

• Excel Loaders
  – Formatted MS Worksheet
    • Administered Item
    • Form
• UML Loader
  – XMI representation of a UML Class Diagram
    • Class → Object Class
    • Attribute → Property
    • Data Element Concept, Value Domain and Data Element derived from the above
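The UML Loader mapping can be pictured as a walk over each class and its attributes. The following is a minimal sketch under several assumptions: the input is an already parsed dictionary rather than XMI, the function and key names are hypothetical, the Value Domain is taken directly from the attribute's declared type, and the "Gene" class is a made-up example. It is not the loader's actual logic.

```python
from typing import Dict, List


def map_uml_class(uml_class: Dict) -> List[Dict]:
    """Derive one set of metadata items per attribute of a UML class."""
    object_class = uml_class["name"]             # Class -> Object Class
    items = []
    for attribute in uml_class["attributes"]:
        prop = attribute["name"]                 # Attribute -> Property
        concept = (object_class, prop)           # Data Element Concept
        value_domain = attribute["type"]         # assumption: type names the Value Domain
        items.append({
            "object_class": object_class,
            "property": prop,
            "data_element_concept": concept,
            "value_domain": value_domain,
            "data_element": (concept, value_domain),
        })
    return items


gene = {
    "name": "Gene",
    "attributes": [
        {"name": "symbol", "type": "String"},
        {"name": "title", "type": "String"},
    ],
}
for item in map_uml_class(gene):
    print(item["data_element"])
```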


Current User Base

• Cancer Biomedical Informatics Grid (caBIG) – 820/466/180/61% *
• Center for Cancer Research (CCR) – 821/573/506/12%
• Clinical Data Interchange Standards Consortium (CDISC) – 3/0
• Center for Cancer Imaging (CIP) – 238/151/148/2%
• Cancer Therapy Evaluation Program (CTEP) – 8029/2432/2428/0.1%
• Division of Cancer Prevention (DCP) – 427/321/286/11%
• National Heart Lung and Blood Institute (NHLBI) – 0/0
• Early Detection Research Network (EDRN) – 121/1/1/100%
• Divisions of Population Sciences and Cancer Control (PS & CC) – 85/9
• Specialized Programs of Research Excellence (SPOREs) – 719/197/120/39%
• Cancer Ontologic Research Environment (caCORE) – 1028/810/810/0%

* Total CDEs in this Context / “Released” workflow status / “Released” and developed by this context / “Reused” from other contexts
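The slide does not say how the "Reused" percentage is derived. One reading that reproduces several of the printed figures is the share of a context's "Released" CDEs that were not developed by that context. The sketch below works through that arithmetic. The formula is an inference from the caBIG, CCR, CIP, DCP, SPOREs and caCORE rows, it does not reproduce every row (EDRN, for example), and the function name is hypothetical.

```python
def reuse_percent(released: int, released_own: int) -> float:
    """Percentage of released CDEs that did not originate in this context."""
    if released == 0:
        return 0.0
    return 100.0 * (released - released_own) / released


# (total, released, released and developed by this context), from the slide.
contexts = {
    "caBIG": (820, 466, 180),
    "CCR": (821, 573, 506),
    "CIP": (238, 151, 148),
    "DCP": (427, 321, 286),
    "SPOREs": (719, 197, 120),
    "caCORE": (1028, 810, 810),
}

for name, (_total, released, released_own) in contexts.items():
    print(f"{name}: {round(reuse_percent(released, released_own))}%")
```

For caBIG this gives (466 - 180) / 466, or about 61%, which matches the figure on the slide.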


Exploring

• National Institute of Neurological Disorders and Stroke (NINDS)

• National Icelandic Center for Oncology

• Cancergrid – UK


Operating Environments

• Database Repository
  – Oracle 9i
• Administration Tool
  – Oracle PL/SQL, Oracle 9i Application Server
• CDE Browser
  – Java, Oracle 9i Application Server
• CDE Curation Tool
  – Jakarta Tomcat


Support

• NCICB Help Desk
  – [email protected] and telephone support
• Bi-weekly Software meetings
  – Hosted by Denise Warzel
  – Teleconference and web-cast
• Bi-weekly Content Development Meetings
  – Hosted by George Komatsoulis
  – Teleconference and web-cast
• Open end user requirements meetings, design reviews and prototyping/feedback sessions
• Training
  – Web-cast and teleconference


Contact Information

• caDSR Home Page
  – http://ncicb.nci.nih.gov/core/caDSR
• caDSR Users ListServ
  – http://list.nih.gov to subscribe to [email protected]
• caDSR Training Home Page
  – http://ncicb.nci.nih.gov/NCICB/core/caDSR/Training
• caDSR Training ListServ
  – http://list.nih.gov to subscribe to caDSR_Training-[email protected]


Documentation/Recommended Reading Materials

• caDSR Homepage:
  – http://ncicb.nci.nih.gov/core/caDSR
• caCORE User Application Manual:
  – ftp://ftp1.nci.nih.gov/pub/cacore/NCICBapplications/NCICBAppManual.pdf
• caCORE Technical Guide:
  – ftp://ftp1.nci.nih.gov/pub/cacore/caCORE2.0_Tech_Guide.pdf
  – caDSR APIs
• caDSR API Guide:
  – ftp://ftp1.nci.nih.gov/pub/cacore/caDSR/caCORE2.0_caDSR_API.pdf
• caDSR Business Rules
  – http://ncicb.nci.nih.gov/NCICB/core/caDSR/BusinessRules
• caDSR Content Meetings
  – http://ncicb.nci.nih.gov/NCICB/core/caDSR/Content
• caDSR_Users ListServ subscribe:
  – http://list.nih.gov
  – Send Request for caDSR Account to: [email protected]


caDSR Tools Team

• NCICB
  – Peter Covitz
  – Denise Warzel
• ScenPro
  – Bill McCurry
  – Tom Phillips
  – Robert Harding
  – Jennifer Brush
  – Larry Hebel
  – Smita Hastak
• Oracle
  – Edmond Mulaire
  – Ram Chilukuri
  – Prerna Aggarwal
  – Dan Ladino
  – Christophe Ludet
  – Shaji Kakkodi
  – Jane Jiang
• SAIC
  – Kathleen Gundry
  – Tommie Curtis
  – Brenda Maeske