OASIS Electronic Trial Master File Standard Technical Committee Metadata Component Layer Discussion February 3, 2014 9:00 – 10:00 AM PST

Upload: manju

Post on 24-Feb-2016


Page 1: February 3, 2014 9:00 – 10:00 AM PST

OASIS Electronic Trial Master File Standard Technical Committee

Metadata Component Layer Discussion

February 3, 2014, 9:00 – 10:00 AM PST

Page 2: February 3, 2014 9:00 – 10:00 AM PST

Agenda

Time | Topic | Presenter
9:00-9:05 | Call to Order & Roll Call | Zack Schmidt
9:05-9:10 | Approval of Minutes (https://www.oasis-open.org/committees/documents.php?wg_abbrev=etmf) | All
9:10-9:20 | Outreach Subcommittee - All | Jennifer Alpert
9:20-9:50 | Tech presentation – Metadata Component | Z. Schmidt/Aliaa
9:50-9:55 | New Business | All
9:55-10:00 | Next meeting agenda / Date | Z. Schmidt

Page 3: February 3, 2014 9:00 – 10:00 AM PST

Roll Call

Name | Company | Voting Status | Present?
Jennifer Alpert Palchak | CareLex | Voter | Y
Aliaa Badr | CareLex | Voter | Y
Oleksiy (Alex) Palinkash | CareLex | Voter | Y
Troy Jacobson | Forte Research | Voter | Y
Mead Walker | HL7 | Non-Voter (1st mtg) | Y
Lou Chappuie | Individual | Voter | Y
Lisa Mulcahy | Individual | Non-Voter | N
Robert Gehrke | Mayo Clinic | Non-Voter (np last mtg) | N
Rich Lustig | Oracle | Non-Voter (2d mtg as member) | Y
Michael Agard | Paragon Solutions | Voter | N
Christopher McSpiritt | Paragon Solutions | Voter | N
Jamie O’Keefe | Paragon Solutions | Non-Voter (np last mtg) | N
Fran Ross | Paragon Solutions | Voter | Y
Peter Alterman | SAFE-BioPharma | Voter | N
Catherine Schmidt | SterlingBio | Voter | Y
Zack Schmidt | SureClinical | Voter | Y
Trish Whetzel, PhD | SureClinical | Voter | Y
Peter Junge | Beijing Sursen | Observer | N
Laura Hilty | Forte Research | Observer | N
Tony O’Hare | Forte Research | Observer | N
Eldin Rammell | Rammell Consulting | Observer | Y
Robin Cover | OASIS staff | Non-Voter | N
Chet Ensign | OASIS staff | Non-Voter | N

Page 4: February 3, 2014 9:00 – 10:00 AM PST

Meeting Etiquette

• Announce your name prior to making comments or suggestions
• Keep your phone on mute when not speaking (#6)
• Do not put your phone on hold
  – Hang up and dial in again when finished with your other call
  – Hold = Elevator Music = very frustrated speakers and participants
• Meetings will be recorded and posted
  – Another reason to keep your phone on mute when not speaking!
• Use the join.me “Chat” feature for questions / comments / votes
• We will follow Robert’s Rules of Order

NOTE: This meeting is being recorded and minutes will be posted on the TC page after the meeting.

From eTMF Std TC to Participants: Hi everyone: remember to keep your phone on mute

Page 5: February 3, 2014 9:00 – 10:00 AM PST

Outreach Subcommittee

• Status – New Members:
  – Joined: HL7, Kaiser Permanente
  – In Progress: EMC, Shire
  – Activities / Milestones

Page 6: February 3, 2014 9:00 – 10:00 AM PST

• Address open issues on Content Classification component discussion

• Begin Metadata Component discussion

Tech Discussion

Page 7: February 3, 2014 9:00 – 10:00 AM PST

Metadata Component

• Metadata (‘Tags’)
  – Characterizes content
  – Allows users to precisely search for information, create reports, share data online
  – Use of standards-based terms is critical for interoperability between systems

Page 8: February 3, 2014 9:00 – 10:00 AM PST

Metadata Component Example

– Each Content Type contains metadata that describes it:

Metadata Component

Page 9: February 3, 2014 9:00 – 10:00 AM PST

Metadata Component

– Metadata is used to tag or index digital content items

– Two primary types of metadata:

• Data Properties: Describe the content type – Study ID, Site ID, Org, etc.

• Annotation Properties: Describe attributes of content classifications and attributes of data properties.

– All content types are required to have Core metadata – like file properties: Content Type Name, URI, Date Created, Date Modified, etc.

• Core is to include ‘Business Process Metadata’ to support business process models using BPMN 2.0 terms and digital signatures for clinical process automation

– Organizations can add their own metadata (Org specific)

– Metadata can be described, edited and validated using an OWL editor (like the open source editor Protégé)

– Machine readable using W3C OWL 2 and RDF/XML

[Figure labels: Study Digital Content; Classification Categories Hierarchy; a.k.a. Data Properties]
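The idea of tagging content items with data properties and then searching on them can be sketched in Python. This is an illustrative sketch only: the `ContentItem` class, the `find_items` helper, and every sample value (study and site identifiers, file URIs) are assumptions made for the example and do not come from the eTMF specification.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """A digital content item tagged with metadata (data property values)."""
    content_type: str                      # e.g. "CV"
    uri: str                               # hypothetical location of the file
    metadata: dict = field(default_factory=dict)

    def tag(self, term: str, value: str) -> None:
        """Attach one metadata value to this item."""
        self.metadata[term] = value


def find_items(items, criteria):
    """Return the items whose metadata matches every (term, value) pair."""
    return [
        item for item in items
        if all(item.metadata.get(term) == value for term, value in criteria.items())
    ]


# Sample data: all identifiers below are made up for illustration.
cv = ContentItem("CV", "file:///etmf/site-001/cv.pdf")
cv.tag("Study ID", "STUDY-A")
cv.tag("Site ID", "SITE-001")

protocol = ContentItem("Protocol", "file:///etmf/protocol.pdf")
protocol.tag("Study ID", "STUDY-A")

items = [cv, protocol]
matches = find_items(items, {"Study ID": "STUDY-A", "Site ID": "SITE-001"})
print([m.content_type for m in matches])
```

Because both systems in an exchange would need to agree on term names such as "Study ID", a search like this only works across organizations when the terms are standards-based.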

Page 10: February 3, 2014 9:00 – 10:00 AM PST

Metadata Classes – Summary:

Core – Always included: File Properties, Classification, Audit Trail, Business Process, digital signatures

Domain-specific – Metadata for a domain in life sciences such as eTMF. In addition to the eTMF domain, future domains could be added in many areas (not this TC’s charter), for example finance, legal administration, healthcare. Domains should use standards-based terms from groups like W3C, NIH NCIt, HL7

Org Specific – Metadata that meets an organization’s needs – not standards based

General – Non-core terms that are relevant to the content type. Terms are obtained from public standards-based vocabulary resources like Dublin Core, DICOM

Annotation Properties – Metadata about classification categories and metadata: Core, Org-Specific metadata

[Figure: Core Metadata Example – File Properties]

Page 11: February 3, 2014 9:00 – 10:00 AM PST

Metadata – Term Sources

Term Sourcing Concepts:
• Terms adopted by standards bodies should be used first in eTMF model

Primary Term Sources for eTMF Metadata:
– Internet Standards Dev Orgs: W3C, IETF, ISO, etc.
  » Required for interoperability of machine code
– NIH NCI Thesaurus: Term database for FDA, CDISC, HL7, other orgs
  » Required for interoperability of clinical / health sciences data

Secondary, Tertiary Term Sources for eTMF Metadata:
• Medical & Published Standards metadata: DICOM (med imaging); Dublin Core
• Industry sources – widely used terms in enterprise content mgmt software, TMF RM

*Spec, Table 6, p21
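The "standards bodies first" sourcing order can be sketched as a simple ranking. The tier numbers below are one reading of the slide (primary = 1, published medical/general standards = 2, industry sources = 3); the slide groups the last two as "Secondary, Tertiary" without saying which is which, so that split, the `SOURCE_TIERS` table, and the `preferred_source` helper are all assumptions for illustration.

```python
# Assumed tiers: lower number = preferred source for a metadata term.
SOURCE_TIERS = {
    "W3C": 1,
    "IETF": 1,
    "ISO": 1,
    "NCI Thesaurus": 1,
    "DICOM": 2,
    "Dublin Core": 2,
    "Industry": 3,
}


def preferred_source(candidates):
    """Pick the best-ranked known source from a list of candidate sources.

    Returns None when no candidate is a known source. Ties keep the
    candidate that appears first in the list.
    """
    known = [c for c in candidates if c in SOURCE_TIERS]
    if not known:
        return None
    return min(known, key=SOURCE_TIERS.get)


print(preferred_source(["Industry", "Dublin Core", "NCI Thesaurus"]))
```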

Page 12: February 3, 2014 9:00 – 10:00 AM PST

Metadata Example: eTMF Domain

http://purl.bioontology.org/ontology/CareLex/

[Figure labels: Content Type; Annotation Properties; Metadata (Data Properties)]

Example: eTMF Content Model

• Published at National Center for Biomedical Ontology (NIH funded)
• Each Content Type has core metadata (Data Properties)
• Each Content Type has eTMF domain metadata (Data Properties)
• All Content Types, Categories have Annotation properties

Page 13: February 3, 2014 9:00 – 10:00 AM PST

Metadata Term (Data Properties) Modification Rules

*Spec, Table 6, p21

1. Core Metadata Terms: Cannot be modified

2. Domain-specific (e.g., eTMF), General, and Organization-specific metadata terms: Can be added to content models:

   1. Domain specific – Terms sourced from the NCI Thesaurus or a standards development organization (SDO).

   2. Org specific – When possible, new org-specific terms should be sourced from the NCI Thesaurus or another SDO for interoperability; helpful but not required.

   3. General – Sourced from Dublin Core, DICOM, SDO

3. To ensure interoperability, the unique code value assigned to each metadata term cannot be modified.

4. Core and Business Process Metadata Properties can be reserved/unreserved. Other types of Metadata Properties can be deleted.

5. Only certain Annotation Properties' values can be modified for different types of Metadata Properties. See CareLex section 2.1.2 for details

• Metadata Terms (data properties) can be edited in OWL format with the open source Protégé Editor and saved as RDF/XML
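A minimal sketch of how a tool might enforce these rules, assuming the following reading: Core terms are never edited, a term's unique code never changes, and only terms outside Core and Business Process may be deleted. The `MetadataTerm` class, the class labels, the helper names, and the placeholder codes ("X0001", "X0002") are hypothetical; real codes would come from the source vocabulary.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MetadataTerm:
    code: str            # unique code value; never modified
    name: str
    metadata_class: str  # "core", "business_process", "domain", "general", "org"


# Assumed reading of the rules: these classes may be reserved/unreserved
# but not deleted.
NOT_DELETABLE = {"core", "business_process"}


def can_delete(term: MetadataTerm) -> bool:
    """Only terms outside Core and Business Process may be deleted."""
    return term.metadata_class not in NOT_DELETABLE


def rename_term(term: MetadataTerm, new_name: str) -> MetadataTerm:
    """Return a renamed copy of a non-core term; the code is left untouched."""
    if term.metadata_class == "core":
        raise ValueError("core metadata terms cannot be modified")
    return replace(term, name=new_name)


core_term = MetadataTerm("X0001", "Created", "core")
org_term = MetadataTerm("X0002", "Internal Tracking Number", "org")

renamed = rename_term(org_term, "Tracking Number")
print(renamed.code, renamed.name, can_delete(org_term), can_delete(core_term))
```

Making the dataclass frozen means a term's code cannot be reassigned by accident; any change has to go through a helper that copies the term.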

Page 14: February 3, 2014 9:00 – 10:00 AM PST

Metadata Term (Annotation Properties) Modification Rules

*Spec, Table 6, p21

Rules to Modify Annotation Properties:

1. Core Annotation properties can neither be deleted nor reserved. However, Organization-specific annotation properties can be deleted.

2. Only certain Annotation properties' values can be modified for different types of metadata properties (see CareLex section 8.2 for further details)

• Annotation properties can be edited in OWL format with the open source Protégé Editor and saved as RDF/XML

Page 15: February 3, 2014 9:00 – 10:00 AM PST

Metadata Editing Tool – Free, Open Source Protégé (From Stanford University: http://protege.stanford.edu/ )

*Spec, Table 6, p21

Protégé Editor:
- Edit Metadata:
  - Annotation Properties
  - Data Properties
- Validates metadata relationships and W3C Term name compliance
- Creates valid machine readable RDF/XML Ontology

Page 16: February 3, 2014 9:00 – 10:00 AM PST

Core Metadata Terms

Term | Definition | Source

File Properties
* Created | The date and time at which the resource is created. For a digital file, this need not match a file-system creation time. For a freshly created resource, it should be close to that time. Later file transfer, copying, etc., may make the file-system time arbitrarily different. | NIH/NCI
* Modified | The date and time the resource was last modified. | NIH/NCI
* Content Identifier | The unique identifier for a content item, such as a document, image, or other media in a specified context. (Document name.) | NIH/NCI
* URI | The unique Uniform Resource Identifier (URI) or path for a content item such as a document, image, or other media in a specified context. | NIH/NCI
* Format | Content item file format, e.g., PDF, JPG, GIF, XLS, DOC, DOCX, XLSX, PPT, PPTX. It uses a filename extension as the format value. | NIH/NCI

Basic Audit Trail
* Created By | Indicates the username of the person who brought the item into existence. | NIH/NCI
* Modified By | Indicates the username of the person who changed an item. | NIH/NCI

Classification
* Content Type Name | The name of the Content Type such as 'CV.' A Content Type is a reusable collection of metadata, workflow, behavior, and other settings for a category of items in electronic content material. | NIH/NCI

Note: Core metadata terms should be included for each content item. Terms with required Data values = *

*For additional info, see Spec, Appendix 8
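A check that a content item carries every starred core term from the table above might look like the following sketch. The term names are taken from the slide; the dictionary representation of a content item, the `missing_core_terms` helper, and the sample values are assumptions for illustration.

```python
# Starred (required) core terms from the File Properties, Basic Audit Trail,
# and Classification groups.
REQUIRED_CORE_TERMS = (
    "Created",
    "Modified",
    "Content Identifier",
    "URI",
    "Format",
    "Created By",
    "Modified By",
    "Content Type Name",
)


def missing_core_terms(item: dict) -> list:
    """List the required core terms that are absent or empty on an item."""
    return [term for term in REQUIRED_CORE_TERMS if item.get(term) in (None, "")]


# Hypothetical content item; every value is made up.
item = {
    "Created": "2014-02-03T09:00:00",
    "Modified": "2014-02-03T09:30:00",
    "Content Identifier": "cv-smith.pdf",
    "URI": "file:///etmf/site-001/cv-smith.pdf",
    "Format": "PDF",
    "Created By": "jdoe",
    "Content Type Name": "CV",
}

print(missing_core_terms(item))
```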

Page 17: February 3, 2014 9:00 – 10:00 AM PST

Core Metadata Terms, Continued

*For additional info, see Spec, Appendix 8

Term | Definition | Source

Business Process Metadata (includes Digital Signatures)
Date | Date of task or event, or date in the context of document or Content Type. Date can be different from date created. | NIH/NCI
Process | A sequence or flow of activities in an organization with the objective of carrying out work. Source: BPMN V2.0 Spec (4). Tasks are atomic activities. They are included within a Process. | NIH/NCI
Task | A single activity that has occurred within a business process. Generally, an end-user, an application, or both will perform the Task. Concept derives from BPMN V2.0. Example task values are: Submitted, Approved, Reviewed, Signed, etc., indicating that a task has been completed. Each task is date stamped and captured in a single record of the business process metadata history log. | NIH/NCI
Source | Where the content item is from or its origin. Example values: Import, Scan, Fax, email, system, and other. | NIH/NCI
Person | The full name of the person who performed the workflow action (e.g., approved or submitted a document) or the person to whom this document is linked. | NIH/NCI
Person Role | The role of the person who is responsible for or linked to a content item, such as Principal Investigator, Sub-Investigator, Study Coordinator, Sponsor Project Manager, CRO Project Manager, or Data Manager. | NIH/NCI
Subject Identifier | Subject Identifier is a unique sequence of characters used to identify, name, or characterize the study subject individual in a clinical trial study. | NIH/NCI
* Organization | The full name of the Organization linked to the resource. | NIH/NCI
Organization Role | Denotes the role of the organization, which is responsible for or linked to the Content Item. Values include Sponsor, Site, CRO, and Vendor. | NIH/NCI
Username | The account name used by a person to access a computer system (used for system generated tasks). | NIH/NCI
Digital Signature | Extra data embedded in a document or metadata linked to a document. It identifies and authenticates the signer of a document using public-key encryption. May be a URI or path to digital signature resource or certificate. | NIH/NCI
Digital Signature Status | Specifies whether a document or content item has been digitally signed. If no signature is required, status = null. Values: Signed, Not Signed, Null | NIH/NCI
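The history log described under Task (one date-stamped record per completed task) and the three Digital Signature Status values can be sketched as follows. The `TaskRecord` fields mirror term names from the table, but the record layout, the helper functions, and especially the idea of deriving the signature status from the log are assumptions for illustration, not behavior defined by the specification.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class TaskRecord:
    """One entry in the business process metadata history log."""
    process: str
    task: str          # e.g. "Submitted", "Approved", "Reviewed", "Signed"
    date: datetime     # date stamp of the completed task
    person: str
    person_role: str


def record_task(log: List[TaskRecord], process: str, task: str, person: str,
                person_role: str, when: Optional[datetime] = None) -> TaskRecord:
    """Append one date-stamped record to the log and return it."""
    entry = TaskRecord(process, task, when or datetime.now(timezone.utc),
                       person, person_role)
    log.append(entry)
    return entry


def signature_status(log: List[TaskRecord], signature_required: bool) -> Optional[str]:
    """Return "Signed", "Not Signed", or None (null) when no signature is required."""
    if not signature_required:
        return None
    return "Signed" if any(r.task == "Signed" for r in log) else "Not Signed"


# Hypothetical process name and people, for illustration only.
log: List[TaskRecord] = []
record_task(log, "CV Approval", "Submitted", "Jane Doe", "Study Coordinator")
status_before = signature_status(log, signature_required=True)
record_task(log, "CV Approval", "Signed", "John Roe", "Principal Investigator")
status_after = signature_status(log, signature_required=True)
print(status_before, status_after, signature_status(log, signature_required=False))
```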

Page 18: February 3, 2014 9:00 – 10:00 AM PST

eTMF Domain Metadata Terms

*For additional info, see Spec, Appendix 8

Term | Definition | Source

eTMF Domain Metadata
* Study ID | A sequence of characters used to identify, name, or characterize the study. | NIH/NCI
* Country | Name of country using ISO 3166-1 alpha-3 country codes. Example: USA. | NIH/NCI
Site ID | A unique symbol that establishes the identity of the study site. | NIH/NCI
Credential | Professional credential, for the study, of the Person linked to a content item / document. Examples: MD, RN, PhD, MS, MA, BA, MBA. | NIH/NCI
Visit Number | The numerical identifier of the visit. | NIH/NCI

Note: Study ID and Country metadata terms should be included for each content item in the eTMF Domain and are marked *

All other terms are optional and are assigned to content types based on the published domain content model. For example, ‘Site ID’ is assigned to content types within the ‘Site Management’ category. See the published eTMF content model for details. Additional eTMF Domain Metadata terms may be added as needed in ‘Phase 2’ of the eTMF TC project.
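The two required eTMF domain terms can be checked with a short sketch like the one below. It only tests that Country has the shape of an ISO 3166-1 alpha-3 code (three uppercase letters); it does not look the code up in the ISO list. The dictionary layout and the `check_etmf_domain` helper are hypothetical.

```python
import re

# Shape of an ISO 3166-1 alpha-3 code: exactly three uppercase ASCII letters.
ALPHA3_SHAPE = re.compile(r"[A-Z]{3}")


def check_etmf_domain(item: dict) -> list:
    """Return a list of problems with the required eTMF domain terms."""
    problems = []
    if not item.get("Study ID"):
        problems.append("Study ID is missing")
    country = item.get("Country")
    if not country:
        problems.append("Country is missing")
    elif not ALPHA3_SHAPE.fullmatch(country):
        problems.append(f"Country {country!r} is not a three-letter uppercase code")
    return problems


# Sample values are made up for illustration.
print(check_etmf_domain({"Study ID": "STUDY-A", "Country": "USA"}))
print(check_etmf_domain({"Study ID": "STUDY-A", "Country": "US"}))
```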

Page 19: February 3, 2014 9:00 – 10:00 AM PST

General Metadata

Term | Definition | Source

General Metadata
Description | An account of the resource or content item. | Dublin Core
Location | A spatial region or named place. | Dublin Core
Title | A name given to the resource or content item. | Dublin Core
Type | The nature or genre of the resource or content item. | Dublin Core

Note: General Metadata is not required. When used, its terms are obtained from published standards such as Dublin Core and DICOM, or from other standards organizations.

Page 20: February 3, 2014 9:00 – 10:00 AM PST

The proposed Metadata Component has the following properties:

• Based on metadata terms from published, standards-based databases:

– W3C, NIH NCI thesaurus, HL7, Dublin Core, DICOM

• W3C XML compliant

– No special characters: ( ) & # @ / … etc. per W3C rules

• Flexible and customizable for organizations, yet interoperable

– Core metadata – allows interoperable data exchange between orgs for a domain

• Core: Always included with domain; metadata attributes not modifiable, for interoperability

– Org specific metadata – allows use of custom metadata for a specific study, instance

• Defined set of rules for exchanging, adding, modifying non-core metadata

• Any Organization can Modify/Edit org-specific metadata using open source editors like Protégé

• Additional eTMF domain metadata may be added in Phase 2 of eTMF TC if required

Metadata Component - Summary

*Spec, Table 6, p21
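The "no special characters" rule for term names can be approximated with a small check. The pattern below is a deliberately conservative, ASCII-only subset of what XML allows in names (XML itself also permits many non-ASCII letters); the helper names are hypothetical and this is not the validation that Protégé or the specification performs.

```python
import re

# Conservative subset of XML names: starts with a letter or underscore,
# then letters, digits, underscore, hyphen, or period. No colons or spaces.
SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")
SAFE_CHAR = re.compile(r"[A-Za-z0-9_.-]")


def is_xml_safe_name(name: str) -> bool:
    """True when the whole name fits the conservative pattern above."""
    return SAFE_NAME.fullmatch(name) is not None


def find_bad_characters(name: str) -> list:
    """Sorted list of the distinct characters that break the pattern."""
    return sorted({ch for ch in name if not SAFE_CHAR.fullmatch(ch)})


for candidate in ("StudyID", "Content_Type_Name", "Site ID", "Org (Role)", "2ndVisit"):
    print(candidate, is_xml_safe_name(candidate), find_bad_characters(candidate))
```

Note that "2ndVisit" contains no special characters but still fails, because an XML name may not begin with a digit.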

Page 21: February 3, 2014 9:00 – 10:00 AM PST

Content Model – OWL RDF/XML Format

Content Model:
• Comprised of Classification Categories, Metadata – a ‘filing plan’
• Represented as RDF/XML machine readable code
• Created for a specific domain instance (e.g., Study)
• Content models can be created and shared with anyone, online or offline
• Easily editable in Protégé
• See specification section 8.3.1 for additional details on RDF/XML file format and content model interoperability

Content Model: Classification Categories + Metadata in RDF/XML

Data Model: Content Model instance and instance data in XML packet*

[Figure labels: Content Model; Instance Data Values; Content Resources – PDFs, media (URI)]

*Data model – for data exchange (future TC meetings)

Content model – for conceptual model design, editing, exchange
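To give a feel for what a content model looks like as RDF/XML, the sketch below writes one content type (as an OWL class) and one data property using only the Python standard library. The RDF, RDFS, and OWL namespace URIs are the standard W3C ones; the base namespace `http://example.org/etmf-sketch#`, the choice of `rdfs:label` as the only annotation, and the overall layout are assumptions for illustration and are not the format defined in specification section 8.3.1.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/etmf-sketch#"   # hypothetical namespace

# Use the conventional prefixes when serializing.
for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def q(namespace: str, local: str) -> str:
    """Build an ElementTree qualified name: {namespace}local."""
    return f"{{{namespace}}}{local}"


def build_content_model() -> ET.Element:
    """Build a tiny model: content type 'CV' with data property 'StudyID'."""
    root = ET.Element(q(RDF, "RDF"))

    # Content type, modeled as an OWL class with a label annotation.
    content_type = ET.SubElement(root, q(OWL, "Class"),
                                 {q(RDF, "about"): BASE + "CV"})
    ET.SubElement(content_type, q(RDFS, "label")).text = "CV"

    # Metadata term, modeled as an OWL datatype (data) property on that class.
    prop = ET.SubElement(root, q(OWL, "DatatypeProperty"),
                         {q(RDF, "about"): BASE + "StudyID"})
    ET.SubElement(prop, q(RDFS, "domain"), {q(RDF, "resource"): BASE + "CV"})
    ET.SubElement(prop, q(RDFS, "label")).text = "Study ID"
    return root


xml_text = ET.tostring(build_content_model(), encoding="unicode")
print(xml_text)
```

A file produced this way is plain text, which is what makes a content model easy to share online or offline and to reopen in an OWL editor.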

Page 22: February 3, 2014 9:00 – 10:00 AM PST

Content Classification System Discussion Summary

– Classification Categories Component: Naming, Numbering

– Metadata Component: Interoperable metadata

– Content Model: Machine readable, standard format for exchange of models

*For additional info, see Spec details on Content Classification System

Page 23: February 3, 2014 9:00 – 10:00 AM PST

Draft Agenda: Next Meeting

• Roll call
• Reports
  – Outreach
  – Tech Discussion: Electronic/Digital Signatures, eTMF Archive File Exchange Formats
• New business