Octagon Research Solutions, Inc.
Leading the Electronic Transformation of Clinical R&D
© 2008 Octagon Research Solutions, Inc. All Rights Reserved.
Dan Crawford
Director, Clinical Data Strategies
March 12, 2008
Basic Concepts of SDTM
• Captures all the submitted tabulation data as a series of observations in domains based on a standard specified structure – SDTM does not specify content!
• Raw collected data
• No imputed values
• Defines specific rules for variable names and structure within each domain
• No derived or analysis variables except for those in SDTM
– RFSTDTC (Reference Start Date)
– RFENDTC (Reference End Date)
Some Common Mistakes when Converting to SDTM
• Adding derived variables to CRT datasets
• Imputing data
– Completing partial dates
– Example: AE start date is 06/2005. Do not record it as 06/01/2005. That work is done in analysis datasets
• Plugging holes in the data
– If you didn’t collect it, don’t try to create it now
– Example: If the collection date is missing, do not create an algorithm to populate it. Leave it blank
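The partial-date rule above can be sketched in a few lines of Python. This is only an illustration, not the presenter's tooling: the function name `to_iso8601_partial` is hypothetical, and it assumes the source stores dates as MM/YYYY or MM/DD/YYYY text. A partial date keeps only the parts that were collected, and a missing date stays blank.

```python
import re


def to_iso8601_partial(raw: str) -> str:
    """Convert a collected date to ISO 8601 without imputing missing parts.

    Assumption: source dates are MM/YYYY or MM/DD/YYYY character strings.
    """
    raw = raw.strip()
    if not raw:
        return ""  # missing stays missing: no algorithm to populate it
    month_year = re.fullmatch(r"(\d{1,2})/(\d{4})", raw)
    if month_year:
        month, year = month_year.groups()
        return f"{year}-{int(month):02d}"  # day was not collected, so it is left out
    full = re.fullmatch(r"(\d{1,2})/(\d{1,2})/(\d{4})", raw)
    if full:
        month, day, year = full.groups()
        return f"{year}-{int(month):02d}-{int(day):02d}"
    raise ValueError(f"Unrecognized date format: {raw!r}")


print(to_iso8601_partial("06/2005"))  # 2005-06
```

Any imputation of the missing day would then happen later, in the analysis datasets.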
SDTM and ADaM
• SDTM
– Source or raw data
– Vertical
– No redundancy
– Character variables
– Each domain is specific to itself
– Dates are ISO 8601 character strings
• ADaM
– Derived data
– Structure may not necessarily be vertical
– Redundancy is needed for easy analysis
– Numeric variables
– Combines variables across multiple domains
– Dates are formatted as numeric (e.g. SAS dates) to allow manipulation
Source: Susan Kenny, Inspire Pharmaceuticals Inc
BOTH ARE NEEDED FOR FDA REVIEW!
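The difference in date handling can be illustrated with a minimal Python sketch. The helper name `iso_to_sas_date` is hypothetical. It relies on the fact that a SAS date value is the number of days since 1 January 1960, and it assumes date-only strings (no time part). A partial or missing date is returned as `None` here, because deciding how to impute it is an analysis decision.

```python
from datetime import date

SAS_EPOCH = date(1960, 1, 1)  # SAS date values count days from 1 January 1960


def iso_to_sas_date(dtc: str):
    """Return the SAS date number for a complete YYYY-MM-DD string, else None."""
    if len(dtc) != 10:
        return None  # partial (e.g. "2005-06") or missing: nothing to convert
    try:
        return (date.fromisoformat(dtc) - SAS_EPOCH).days
    except ValueError:
        return None


print(iso_to_sas_date("1961-01-01"))  # 366 (1960 was a leap year)
```

The character string stays in SDTM; the numeric value is what an analysis dataset would carry for date arithmetic.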
Process Flow
1. Source Data Evaluation
2. Author Data Conversion Specifications
3. Migrate Data from Source to SDTM Target
4. Data Pooling to Create Integrated Database
5. Data Standardization
Required Items
• Normalized datasets
• All-inclusive lab data
• Gaps between record content and formats catalog will be identified
• Verification that all fields on the CRF are captured in datasets
• Supporting documentation on study design and data collection
Source Data Evaluation
• Source Data Review is the key to a successful SDTM Conversion Project. Due to the granularity of SDTM, it requires a thorough knowledge of legacy data and supporting documentation.
Legacy Data Challenges
• Issue: Missing data/documentation
• Resolution: Perform QC data audits, which may include individual case report forms and/or utilize CSR listings. Work with the sponsor/vendor to identify and locate missing documentation (if it exists)
• Issue: Non-English databases and/or documentation
• Resolution: Identify early and perform translation
• Issue: Incomplete/incorrect formats catalogs
• Resolution: Identify discrepancies and update format catalogs, or manually link metadata with proper formats, and then programmatically update data and apply decodes
• Issue: Data discrepancies/oddities
• Resolution: Indicate data anomalies in the “Reviewers Guide” or create “Notes to Reviewer” in the define file
Source Data Evaluation
• Metadata Analyses: conduct a series of metadata analyses to scan for common attributes and structures against the clinical data.
The results will allow you to create groups of similar studies to reduce units of work and maximize efficiency.
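One way such a metadata analysis could work is sketched below in Python. This is an assumption about the approach, not the automated report used in the presentation: the function `group_similar_studies` and the study, dataset, and variable names are all made up, and studies are grouped only when their dataset and variable names match exactly.

```python
from collections import defaultdict


def group_similar_studies(study_metadata):
    """Group studies whose datasets and variable names are identical.

    study_metadata maps study -> {dataset name -> list of variable names}.
    """
    groups = defaultdict(list)
    for study, datasets in study_metadata.items():
        # The signature ignores ordering of datasets and variables
        signature = frozenset(
            (name, frozenset(variables)) for name, variables in datasets.items()
        )
        groups[signature].append(study)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical legacy studies
example = {
    "STUDY-A": {"DEMOG": ["PT", "SEX", "BIRTHDT"], "AE": ["PT", "AETXT"]},
    "STUDY-B": {"DEMOG": ["PT", "SEX", "BIRTHDT"], "AE": ["PT", "AETXT"]},
    "STUDY-C": {"DM": ["SUBJ", "GENDER"]},
}
print(group_similar_studies(example))  # [['STUDY-A', 'STUDY-B'], ['STUDY-C']]
```

A real analysis would probably also compare variable types, lengths, and formats, and would tolerate near matches rather than requiring identical structures.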
Project Design
• Group similar studies based on review of automated report, source documentation, and source data.
Example: Studies coming from the same CDM system.
Example: Studies with the same phase or conducted by the same CRO
• Data conversion specifications are developed based on similarities within groups of studies.
Example: Data conversion specifications created for the first study in a group will serve as template for subsequent studies in that group.
Project Design
• Data conversion specifications will be created based on type of domain (Standard or Custom), and then by class: Interventions, Events and Findings.
[Figure: domain types (Standard, Custom) and classes (Interventions, Events, Findings)]
Project Design
• To ensure full accountability for all data points, each study should include a Mapping Specifications document, detailing the CDISC SDTM target for each source dataset and variable.
• Utilizing a database (Excel or Access) that stores these instructions will allow you to replicate the process for studies that have identical (or similar) structures.
[Figure: CRF-to-SDTM mapping relationships: one-to-one, one-to-many, many-to-one, many-to-many]
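The accountability idea can be sketched with a small Python example. The row layout (`source_dataset`, `source_variable`, `sdtm_domain`, `sdtm_variable`) and the function `unmapped_source_variables` are hypothetical, chosen only to show how stored mapping instructions can be checked against the source variables so that no data point is left without a target. The source names are made up.

```python
def unmapped_source_variables(source_variables, mapping_spec):
    """Return source (dataset, variable) pairs that have no mapping row."""
    mapped = {(row["source_dataset"], row["source_variable"]) for row in mapping_spec}
    return sorted(set(source_variables) - mapped)


# Hypothetical mapping rows, as they might be stored in a spreadsheet or database
mapping_spec = [
    {"source_dataset": "AE", "source_variable": "AETXT",
     "sdtm_domain": "AE", "sdtm_variable": "AETERM"},
    # One source variable feeding two targets (one-to-many)
    {"source_dataset": "DEMOG", "source_variable": "PT",
     "sdtm_domain": "DM", "sdtm_variable": "SUBJID"},
    {"source_dataset": "DEMOG", "source_variable": "PT",
     "sdtm_domain": "DM", "sdtm_variable": "USUBJID"},
]
source_variables = [("AE", "AETXT"), ("DEMOG", "PT"), ("DEMOG", "SEX")]

print(unmapped_source_variables(source_variables, mapping_spec))  # [('DEMOG', 'SEX')]
```

Because the rows are plain data, the same specification can be copied and adjusted for the next study in a group.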
Data Migration Process
• An Extract, Transform, Load (ETL) tool is used to migrate data from source to target datasets
• Graphical modeling of data flow
• Pluggable maps for reusability of logic
Quality Control
Automated Quality Control
– Mapping specification utility: Built-in SDTM compliance wizard
– CDISC SDTM compliance verified using software developed in-house or manual review
Manual Quality Control
– Completion of quality control checklists:
• 100% QC of converted data against mapping specifications
• QC of 2 subjects per domain for all data points against raw data
Recommendations
• Adopt and move SDTM standards as far “upstream” as possible
• Design CRFs with SDTM in mind
• Standardize Controlled Terminology
• Convert Datasets to SDTM
• Generate Analysis datasets and CSR from SDTM
Data Pooling Challenges
Data pooling
• During pooling, data content is standardized
– Unique Subject Identifiers
• Terms are mapped to a common standard
– Laboratory Data
– Any collected data with “controlled terminology”
» AE Outcome
» AE Relationship
» Race
• Dictionary encoding of Adverse Events and Concomitant Medications, and possibly Medical History
Challenge #1: Standardization of Data
• Laboratory Data
– Standardization of units
• Impacts results and normal range values
– Normal Ranges
• Many times the normal ranges are not incorporated into the laboratory datasets. Find them
• Some laboratory normal ranges are based on gender and age
– Create a library of standard analyte names and units (SI), along with all conversion factors
Laboratory Standardization Example
LBTEST    LBORRES   LBORRESU   Conversion Factor   LBSTRESN   LBSTRESU
Albumin   36        g/L (SI)   N/A                 36         g/L
Albumin   3.4       g/dL       10                  34         g/L
Albumin   612       µmol/L     0.06600             40         g/L
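The albumin rows can be reproduced with a small conversion library. The following Python sketch is illustrative only: the `CONVERSIONS` table and the `standardize` function are hypothetical names, a factor of 1.0 stands in for the "N/A" on the row that is already in SI units, and results are rounded to whole numbers to match the values shown on the slide.

```python
UMOL_PER_L = "\u00b5mol/L"  # micro sign, as written on the slide

# (analyte, original unit) -> (conversion factor, standard unit)
CONVERSIONS = {
    ("Albumin", "g/L"): (1.0, "g/L"),  # already SI, no conversion needed
    ("Albumin", "g/dL"): (10.0, "g/L"),
    ("Albumin", UMOL_PER_L): (0.066, "g/L"),
}


def standardize(lbtest, lborres, lborresu):
    """Return (LBSTRESN, LBSTRESU) for an original result and unit."""
    factor, lbstresu = CONVERSIONS[(lbtest, lborresu)]
    # Rounding to a whole number is an assumption that mirrors the slide
    return round(lborres * factor), lbstresu


SLIDE_ROWS = [
    ("Albumin", 36, "g/L"),
    ("Albumin", 3.4, "g/dL"),
    ("Albumin", 612, UMOL_PER_L),
]
results = [standardize(*row) for row in SLIDE_ROWS]
print(results)  # [(36, 'g/L'), (34, 'g/L'), (40, 'g/L')]
```

The same factors would also need to be applied to the normal range limits so that results and ranges stay comparable.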
Challenge # 2: USUBJID
• Do you have the same subjects enrolled in more than one trial?
• If so, do you have a database that tracks these subjects?
Challenge # 2: USUBJID
• When combining studies to create a pooled database for an ISS/ISE – those subjects will need to have the same USUBJID across all studies.
• USUBJID Database:
– Pool necessary variables from all studies (these will most likely come from different source datasets: DM, VS, MH)
– Output all subjects with matching DOB and gender
– Use additional information to determine if the subject is a match
– Assign USUBJID
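The "output all subjects with matching DOB and gender" step might look like the Python sketch below. The field names (`study`, `subject`, `sex`, `dob`) and the function `candidate_matches` are assumptions for illustration, and the records are invented. The function only produces candidates; deciding whether two records are really the same person, and assigning the USUBJID, remains a review step that uses the additional information.

```python
from collections import defaultdict


def candidate_matches(subjects):
    """Group pooled subjects by (sex, date of birth) and keep the groups
    that span more than one study, i.e. possible repeat enrollments."""
    by_key = defaultdict(list)
    for subject in subjects:
        by_key[(subject["sex"], subject["dob"])].append(subject)
    return {
        key: rows
        for key, rows in by_key.items()
        if len({row["study"] for row in rows}) > 1
    }


# Hypothetical pooled records
subjects = [
    {"study": "S1", "subject": "0097", "sex": "M", "dob": "1955-05-10"},
    {"study": "S5", "subject": "0141", "sex": "M", "dob": "1955-05-10"},
    {"study": "S5", "subject": "0200", "sex": "F", "dob": "1947-07-11"},
]
candidates = candidate_matches(subjects)
print(sorted(candidates))  # [('M', '1955-05-10')]
```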
Gender=Female Date of Birth=11JUL1947
OBS  STUDYID          USUBJID  SITEID  PID  PIN  RACE       HGT  Diagnosis Date  COUNTRYCD  MCHD_OBS  REVIEWER  COMMENTS
59   DC2005-015-0023  000459   15      23   -NP  Asian      146  NDJUN97         UK         .         DK        NO MATCH
205  DC2005-010-0045  000620   10      45   MP   Caucasian  157  NDJUL97         UK         .         DK        NO MATCH
Gender=Male Date of Birth=10MAY1955
OBS  STUDYID          USUBJID  SITEID  PID  PIN  RACE       HGT  Diagnosis Date  COUNTRYCD  MCHD_OBS  REVIEWER  COMMENTS
87   DC2005-005-0141  000041   5       141  PAC  Caucasian  160  -8MAY1997       USA        143       DK        MATCH
143  DC2005-001-0097  000041   1       97   PAC  Caucasian  160  XXMAY1997       USA        87        DK        MATCH
Challenge # 3: Recoding – Medical History
• Does your ISE require analysis based on a subset of the population – i.e. subjects with Cardiovascular disease?
• Medical History is not coded in many studies and can be problematic to code for an ISS/ISE
• Some CRFs may be designed to allow for more than one term per line
• Coding Medical History typically involves the splitting of many terms
MHTERM (identical in all five records): HIP DYSPLASIA OPERATED IN 1976. TORN MENISCUS (R) OPERATED ON 1994. OCCASIONAL LEFT KNEE PAIN
MHMODIFY                                        MHBODSYS                                          MHDECOD
HIP DYSPLASIA                                   CONGENITAL, FAMILIAL AND GENETIC DISORDERS        HIP DYSPLASIA
HIP DYSPLASIA OPERATED IN APPROXIMATELY 1976.   SURGICAL AND MEDICAL PROCEDURES                   HIP SURGERY
TORN MENISCUS (R) OPERATED ON 1994              SURGICAL AND MEDICAL PROCEDURES                   MENISCUS OPERATION
TORN MENISCUS (R)                               INJURY, POISONING AND PROCEDURAL COMPLICATIONS    MENISCUS LESION
OCCASIONAL LEFT KNEE PAIN                       MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS   ARTHRALGIA
Challenge # 3: Coding – “Splits”
Dataset  USUBJID  AESEQ  AETERM           AEMODIFY  AEGRPID
CRT      000146   2      Nausea/Vomiting
Pool     000146   2.1    Nausea/Vomiting  Nausea    2
Pool     000146   2.2    Nausea/Vomiting  Vomiting  2
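The split shown in the table can be sketched in Python as follows. The function `split_ae_record` is a hypothetical helper: it assumes the individual terms are supplied by whoever codes the record (they are not derived automatically from the "/"), and it stores the sub-sequence as text ("2.1", "2.2") purely to mirror the slide.

```python
def split_ae_record(record, parts):
    """Return one pooled record per coded part of a multi-term AE.

    The verbatim AETERM is kept unchanged, each part goes to AEMODIFY, and
    AEGRPID carries the original AESEQ so the split records stay linked.
    """
    original_seq = str(record["AESEQ"])
    return [
        {
            **record,
            "AESEQ": f"{original_seq}.{index}",
            "AEMODIFY": part,
            "AEGRPID": original_seq,
        }
        for index, part in enumerate(parts, start=1)
    ]


crt_record = {"USUBJID": "000146", "AESEQ": 2, "AETERM": "Nausea/Vomiting"}
pooled = split_ae_record(crt_record, ["Nausea", "Vomiting"])
for row in pooled:
    print(row["AESEQ"], row["AETERM"], row["AEMODIFY"], row["AEGRPID"])
```

Keeping the verbatim term and the group identifier on every split record lets a reviewer trace each coded term back to the single line on the CRF.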