Medical Intelligence EDW, 20 June: Radboudumc
TRANSCRIPT
Research DWH: testing ideas
Radboudumc Technology Centers
• Proteins • Metabolites
• Preclinical • Clinical
• Behavioural
• Animal facility • Pathology
• Cell analysis • Sorting
• Pediatric • Adult • Phase 1, 2, 3, 4
• Vaccines • Pharmaceutics • Radio-isotopes • Malaria parasites
• Management • MDM • Disclosure • Analysis • Logging • Audit • Sharing • …
• DNA • RNA
• Internal • External
• HTA • Field lab
• Statistics • Biological • Structural
• Preclinical • Clinical
Information principles
1. Data shall be used consistently across all processes; the definitions should be understandable and available for users and adhere to standards where possible.
2. Data is a vital asset.
3. All data has an identifiable source and owner.
4. Data is shared across processes: enter once, reuse many times.
5. Data is available to authorized users, managed and protected.
Data Stewardship Handbook (Data4lifesciences, HANDS)
As a data steward, you:
• are accountable and responsible for your research data;
• are 'in control' of the complete research data flow;
• reuse existing data when possible;
• collaborate with patient organisations throughout your research;
• protect the privacy and safety of study subjects;
• apply the FAIR principle;
• protect the quality of your research and ensure its reproducibility;
• use available expertise and UMC-advised infrastructure;
• think ahead about intellectual property rights;
• share your data responsibly.
FAIR principle
• Findable: the data should be uniquely and persistently identifiable, and other researchers should be able to find your data.
• Accessible: the conditions under which the data can be used should be clear to machines and humans.
• Interoperable: data should be machine-readable and use terminologies, vocabularies, or ontologies that are commonly used in the field.
• Reusable: compliant with the above and sufficiently well described with metadata and provenance information so that the data sources can be linked or integrated with other data sources and enable proper citation.
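One way to make the four FAIR points checkable is to record them as explicit metadata fields per dataset. The sketch below is illustrative only: the `DatasetRecord` fields and the `missing_fair_fields` helper are assumptions, not taken from the handbook, and the identifier is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical minimal metadata record; all field names are assumptions."""
    identifier: str                 # Findable: unique, persistent identifier
    title: str                      # Findable: searchable description
    access_conditions: str          # Accessible: conditions of use
    vocabularies: list[str] = field(default_factory=list)  # Interoperable
    provenance: str = ""            # Reusable: where the data came from
    citation: str = ""              # Reusable: how to cite the dataset


def missing_fair_fields(record: DatasetRecord) -> list[str]:
    """Return the names of metadata fields that are still empty."""
    return [name for name, value in vars(record).items() if not value]


# Placeholder values, for illustration only.
example = DatasetRecord(
    identifier="doi:10.0000/placeholder",
    title="Example research dataset",
    access_conditions="On request",
    vocabularies=["SNOMED CT"],
)
print(missing_fair_fields(example))  # ['provenance', 'citation']
```

A check like this could run before a dataset is published to a catalogue, so that incomplete descriptions are caught early.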
Clinical DWH functional requirements
• Historical traceability
• Source tracing
• Flexible design, integration without remake
• Scalability
• Data integrity
• Metadata
• Master data
• Findable data structures
• User-friendly searching
• Standardization
• Data cleansing
Old (?) style data collection
Research project          Scores (GCS, CAM-ICU, DOS, RASS, NRS sleep, AVPU)
Practitioner              Glasgow Coma Scale
Research Coworker         MUST Score
Patient                   SNAQ score
Diagnosis Term            Pain assessment score
Lab Component             Ventilator settings
Enrollment                Allergy Intolerance
Episode                   Family History
Hospital Admission        Social History
Visit                     Intoxications
ED visit                  Tobacco use
ProblemList               Alcohol intake
Complication              Procedure
Lab results               Surgical Case
Vital Signs               Imaging Study Report
Height                    EKG
Weight                    Lung Function
BSA                       Pathology
Temperature               QuestionnaireResponse
Heart rate
Respirations              Metadata
Blood pressure            CodeSystem
Oxygen saturation         NamingSystem
Adverse Event reporting   ValueSet
Blood transfusion         Concept
Home medication use       Concept Map
Medication Order          Questionnaire
Observations              Question
Objects proposed by researchers are findable and modelable (n=56)
Academisch Medisch Centrum Algemene Dienst ICT - www.amc.nl/ict
Architecture sketch ACRIS
[Diagram. Columns: Source; Integration / Transformation. Sources: Care; Research; AMC / other; External / other. Integration / Transformation: Staging, DWH, Pseudonymisation / TTP. Research environments (three shown), each with Workspace / Portal, Data Space, Compute Space. Other labels: ad hoc data, metadata, data mining, research-specific data, research dataset, catalogue, archive, publish, workflow.]
Research datasets can be sources for future studies
Common components:
• Identity and Access management
• Traceability
• Pseudonymisation
• Sharing & Collaboration
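The slides name pseudonymisation via a TTP as a common component but do not say how it is implemented. A minimal sketch of one common approach, a keyed hash whose secret key stays with the trusted third party, is shown below; the function name, key, and identifiers are all made up for illustration.

```python
import hashlib
import hmac


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from an identifier using HMAC-SHA256."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Made-up key and identifier, for illustration only.
ttp_key = b"example-key-held-by-the-ttp"
pseudonym = pseudonymise("patient-0001", ttp_key)
```

Because the same identifier and key always give the same pseudonym, records from different sources can still be linked, while someone without the key cannot recompute the mapping from known identifiers.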
Enterprise Data warehouse
[Diagram. Layers: STG, HSTG, DV, Universe, Dataset. Source systems: Epic (Chronicles, Clarity, Star); Palga, Glims; SAP. Masterdata: Matrix, DBC, etc. Target systems (e.g. DRE, Objects).]
Enterprise Data warehouse model
[Diagram. Steps: Staging > Data Integration > Data Transformation and Quality > Data Presentation. Sources: Clarity; Palga, Glims, etc.; Matrix, HR, AD, etc. Components: Masterdata, Integrated model, Enterprise Data warehouse, Specific Data mart, Master data Datamart, Delivery, Direct Delivery.]
* Links to linked data such as images, genetics, or audio recordings will be preserved as-is throughout all steps.
Views (can be materialized for performance reasons)
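The difference between a view and its materialized form can be sketched with Python's built-in `sqlite3` module. SQLite has no materialized views, so a snapshot table created from the view stands in for one here. Table names, column names, and rows are made up; Glims is used only as a source label.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Staging table with made-up demo rows.
    CREATE TABLE stg_lab_result (patient_id TEXT, component TEXT, value REAL, record_source TEXT);
    INSERT INTO stg_lab_result VALUES ('P1', 'Hb', 8.1, 'Glims'), ('P2', 'Hb', 7.4, 'Glims');

    -- Virtual layer: the view always reflects the current staging content.
    CREATE VIEW v_lab_result AS
        SELECT patient_id, component, value, record_source FROM stg_lab_result;

    -- Materialized stand-in: a snapshot that must be refreshed explicitly.
    CREATE TABLE m_lab_result AS SELECT * FROM v_lab_result;
    """
)


def row_count(name: str) -> int:
    """Count rows in one of the demo tables or views."""
    return conn.execute(f"SELECT COUNT(*) FROM {name}").fetchone()[0]


# A new staging row is visible through the view but not in the snapshot.
conn.execute("INSERT INTO stg_lab_result VALUES ('P3', 'Hb', 9.0, 'Glims')")
```

The trade-off is the usual one: the view is always current but recomputed on every query, while the snapshot is fast to read but goes stale until it is rebuilt.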
Proposal
• One ready-to-use integration and staging layer
• Clinical data for research in the same environment
• Staging with Data Vault is an option, made easier by tools like CDT
• General data mart with dimensional modeling
• Model compliant with Epic Star, ZIB, and FHIR
• Demand-driven development
• Common tool like MS SSDT
Requirements and proposed techniques
Staging
• Model: As-Is
• View on data sources
• Can be materialized for performance reasons
Data integration
• Model: Data Vault
• No (input or output) business logic
• Decoupled from staging and data warehouse
• Change should not invalidate data warehouse or data marts
• Traceable to the source
• Time dimension
• Completeness
• Can be virtualized
Data transformation (Enterprise data warehouse)
• Model: Star model or 3NF-like
• View on integration layer
• Can be materialized for performance reasons
• Limited business logic (only for optimizing output usage)
• For healthcare data based on FHIR
• Adds terminology translation
• Data quality check
Presentation (Data mart)
• Model: Star model (Kimball)
• View on transformation layer
• Can be materialized for performance reasons
• Multi-purpose data delivery
• Focus on usability
• Adds business logic for output optimization
Tools
• Microsoft Data Tools
• Microsoft Data Tools or a specific tool for Data Vault (if chosen), e.g. CDT
• Microsoft Data Tools
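The integration-layer requirements (traceable to the source, time dimension, no business logic) match the hub-and-satellite pattern of Data Vault. The sketch below shows that pattern with plain dictionaries instead of database tables. Function names, attribute names, and dates are assumptions for illustration, not the tooling named in the slides.

```python
import hashlib


def hash_key(business_key: str) -> str:
    """Hash a normalised business key, as is common for Data Vault hub keys."""
    return hashlib.sha256(business_key.strip().upper().encode("utf-8")).hexdigest()


def load_patient(hub: dict, satellite: dict, patient_id: str,
                 attributes: dict, record_source: str, load_ts: str) -> str:
    """Insert-only load: register the key once, append a satellite row per change."""
    key = hash_key(patient_id)
    hub.setdefault(key, {"patient_id": patient_id,
                         "record_source": record_source, "load_ts": load_ts})
    history = satellite.setdefault(key, [])
    if not history or history[-1]["attributes"] != attributes:
        history.append({"attributes": dict(attributes),
                        "record_source": record_source, "load_ts": load_ts})
    return key


# Made-up loads: the second is unchanged (no new row), the third is a change.
hub, satellite = {}, {}
key = load_patient(hub, satellite, "P1", {"weight_kg": 70}, "Epic", "2016-05-01")
load_patient(hub, satellite, "P1", {"weight_kg": 70}, "Epic", "2016-05-02")
load_patient(hub, satellite, "P1", {"weight_kg": 72}, "Epic", "2016-06-01")
```

Every stored row carries its record source and load timestamp, and nothing is ever overwritten, which is what gives the layer its historical traceability.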
Different environments, different requirements
Operational               Data Warehouse            Data Marts
Data Capture              Data Integration          Data Delivery
Departmental              Enterprise Wide           Demand Driven
Transaction Processing    Historization             Online Analytics
Business Function         Core Business Concept     Fact based analysis
Accuracy                  Completeness              Flexibility
Speed                     Auditability              Usability
Record for Use Case       All data all time         Right data right time
Business Operations       Enterprise Knowledge      Specific analytics
Capture and Log           Time Slice                Prepare and deliver
Stable and firm           Agile                     Responsive
[Diagram labels: Chronicles, Clarity, Star-like DWH, Job, Job, Dataset???]
Statements from the meeting of 27-5-2016
• Historical staging lends itself very well to automation
• DV creation can also be virtual
• End-user needs are leading
• Follow ZIB conceptually where possible, but only if the source system is compliant
• ZIB is insufficiently rich
• The logical model can be based on FHIR
• Object list is available
• Naming is irrelevant
• Delivery via views
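To illustrate what basing the logical model on FHIR could look like, the sketch below maps one source row to a FHIR-style Observation and translates a local code to LOINC along the way. The local codes, source column names, and sample row are made up; the two LOINC codes are the standard ones for heart rate and body temperature.

```python
# Hypothetical local-to-LOINC mapping; the local codes are made up.
LOCAL_TO_LOINC = {
    "HR": ("8867-4", "Heart rate"),
    "TEMP": ("8310-5", "Body temperature"),
}


def to_fhir_observation(row: dict) -> dict:
    """Map one source row (hypothetical column names) to a FHIR-style Observation."""
    loinc_code, display = LOCAL_TO_LOINC[row["local_code"]]  # KeyError if unmapped
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {"coding": [{"system": "http://loinc.org",
                             "code": loinc_code, "display": display}]},
        "subject": {"reference": f"Patient/{row['patient_id']}"},
        "effectiveDateTime": row["measured_at"],
        "valueQuantity": {"value": row["value"], "unit": row["unit"]},
    }


# Made-up source row, for illustration only.
observation = to_fhir_observation({
    "patient_id": "P1", "local_code": "HR", "value": 72,
    "unit": "beats/min", "measured_at": "2016-05-27T10:00:00Z",
})
```

Raising an error on unmapped local codes, rather than passing them through, makes gaps in the terminology mapping visible as a data quality issue.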
Objects proposed by researchers are findable and modelable
The bigger picture
• Digital Research Environment
  • Making data available to researchers
  • Needs and priorities have been inventoried
  • Delivering datasets
  • Castor integration
• Architecture and Data Governance (EVAMEG)
• Master Data Management (not centrally assigned)
  • Provisioning Epic, GLIMS, Helix
  • Use of LOINC, SNOMED, G-standaard, Vektis, zorgmail, postal code table
• Integration issues
• Web site and services for internal customers (KCC FOS)
• Web site and services for external customers (RadboudOnline)
• Need for central direction over data quality and data exchange
• Projects on track and trace across platforms