handling interim and immature data in a clinical trials ... interim and immature data in a clinical...
TRANSCRIPT
Proprietary and confidential. Do not distribute.
Proprietary and Confidential
Handling Interim and Immature Data In a
Clinical Trials Setting
Paul Stutzman
Axio Research
22-Oct-2015
Proprietary and confidential. Do not distribute.
Introduction
Outline
2
Motivation
Definition
Preparation
Potential pitfalls
Reassess
Tools and processes
Concluding remarks
Proprietary and confidential. Do not distribute.
Motivation
3
Given:
Variety of statistical projects
Early involvement
Maintain quality/accuracy
Minimize cost/effort
Timely deliverables
Proprietary and confidential. Do not distribute.
Definition
Context
4
Preparatory programming - standards based
Raw SDTM ADaM
Medical monitoring
Interim analyses
Data Monitoring Committee (DMC) / Data Safety
Monitoring Board (DSMB)
Proprietary and confidential. Do not distribute.
Definition
What is interim and incomplete data?
5
Interim:
Before database lock
Often iterative – multiple snapshots over time
Unclean
Incomplete:
Missing records (subjects, visits, parameters…)
Missing values
Proprietary and confidential. Do not distribute.
Definition
Dirty data
6
Good: Accurate and Complete
Fast: Timely
Cheap: Cost & Effort
Pick any 2:
Proprietary and confidential. Do not distribute.
Preparation
What can we know ahead of time?
7
Protocol
Annotated CRFs
SAP
DMC/DSMB Charter
Look at the actual data!
Proprietary and confidential. Do not distribute.
Preparation
What can we know ahead of time?
8
Visit structure
The protocol is your friend.
However…
“Patients randomized to both study arms will have
21-day chemotherapy cycles until disease
progression, unacceptable toxicity, withdrawal of
consent, or protocol specified parameters to stop
treatment.”
Proprietary and confidential. Do not distribute.
Preparation
What can we know ahead of time?
9
All permissible values
Annotated CRF
Data dictionary
Standards constraints
Raw Analysis datasets TLFs
Raw SDTM ADaM TLFs (metadata is critical)
Proprietary and confidential. Do not distribute.
Preparation
Use defensive programming/coding techniques
10
Your program should not fail
Handle all values (anticipated or not)
Unanticipated values should be flagged
Code for all variables – even if insufficient data
Use the SAS log!
Proprietary and confidential. Do not distribute.
Potential pitfalls
Initial considerations
11
All missing
In sufficient data, no records meet criteria yet
For example: No deaths
In DS?
In AE?
Separate deaths dataset?
Will they need to be reconciled?
Should be noted in the SAS log
Proprietary and confidential. Do not distribute.
Potential pitfalls
Initial considerations
12
Not all values are present. Unsure of:
What actual values will be (categorical)
The relationship between variables
xxCAT, xxSCAT
xxTEST, xxTESTCD
Numeric coded categorical variables
Visits
When in doubt, note in the SAS log
Proprietary and confidential. Do not distribute.
Potential pitfalls
Initial considerations
13
Invalid/incorrect data
Out of range
Mismatched related variables’ values
Invalid categorical values
Should be noted in the SAS log
Proprietary and confidential. Do not distribute.
Reassess as you go
Hey, I got a new data transfer!
14
New or missing datasets
How did the number of records in each dataset change
New or missing variables
Variable attributes have changed
New values in categorical variables
New ranges (or outliers) in continuous variables
Changes in related variables (xxTEST vs. xxTESTCD)
Proprietary and confidential. Do not distribute.
Reassess as you go
Hey, I got a new data transfer!
15
Scrutinize lab values and units
Visits
Sort orders no longer provide uniqueness
What does any of this mean for my metadata?
Proprietary and confidential. Do not distribute.
Tools and processes
Defensive programming/coding defined
16
Typical definition (www.drdobbs.com)
“Defensive programming is a practice where you
anticipate failures in your code, then add supporting code
to detect, isolate, and in some cases, recover from the
anticipated failure.”
In this context, however…
Writing programs that handle all input data (both
anticipated and unanticipated) gracefully, and notify the
user/programmer when data or processing issues arise.
Proprietary and confidential. Do not distribute.
Tools and processes
Defensive programming/coding: Notify!
17
Look for all possible known values
Make unanticipated values obvious
Use the SAS log to highlight unexpected values
Use the SAS log to identify variables that can’t be
calculated yet
Proprietary and confidential. Do not distribute.
Tools and processes
Defensive programming/coding techniques
18
For unanticipated values: flag and notify
Final ELSE on IF/ELSE IF constructs
OTHERWISE on SELECT statements
ELSE on SQL CASE statements
PROC FORMAT OTHER
Make unanticipated values to something obvious
Use the SAS log to highlight unexpected values
Proprietary and confidential. Do not distribute.
Tools and processes
Defensive programming/coding SAS log messages
19
Standardize and differentiate from SAS’ messages
!!! Warning:
### Note:
Make the code invisible
put "!!" "! War" "ning: message text";
put "##" "# Note: message text";
Proprietary and confidential. Do not distribute.
Tools and processes
Misc: %FindDups macro
20
%finddups(InDS=ds1 ds2 ds3, SortOrd=USUBJID);
data combined;
merge ds1 ds2 ds3;
by USUBJID;
### Note: No duplicates found in ds1 with SortOrd=USUBJID
!!! Warning: Duplicates in ds2: USUBJID=111-123-0001
!!! Warning: Duplicates in ds2: USUBJID=111-345-0001
### Note: No duplicates found in ds3 with SortOrd=USUBJID
%finddups(InDS=ADLB, SortOrd=USUBJID PARAMCD AVISITN ADT);
Proprietary and confidential. Do not distribute.
Tools and processes
Process an iterative data transfer
21
Archive old data – but keep available
Load new data
Compare
Proprietary and confidential. Do not distribute.
Tools and processes
Process an iterative data transfer
22
Comparison criteria
New or missing datasets
How did the number of records in each dataset change
New or missing variables
Variables with different attributes
New values in categorical variables
New ranges (or outliers) in continuous variables
Changes in related variables (xxTEST vs. xxTESTCD)
Visits!!!
Proprietary and confidential. Do not distribute.
Tools and processes
Data transfer comparison macro
23
Compare current data to previous transfer
Datasets: presence, record counts
Variables: presence, attributes
Values (optional): categorical, continuous
Output Spreadsheet (ODS EXCELXP Tagset)
%mLibComp(Folder=SDTM);
%mLibComp(Folder=SDTM, Values=_all_);
%mLibComp(Folder=SDTM, Values=DM EX, MaxCats=25);
Proprietary and confidential. Do not distribute.
Tools and processes
Datasets comparison spreadsheet
24
Proprietary and confidential. Do not distribute.
Tools and processes
Variables and attributes spreadsheet
25
Proprietary and confidential. Do not distribute.
Tools and processes
Values comparison (categorical) spreadsheet
26
Proprietary and confidential. Do not distribute.
Tools and processes
Values comparison (continuous) spreadsheet
27
Proprietary and confidential. Do not distribute.
Tools and processes
Examine the relationships
28
PROC FREQ is your friend
VISITNUM VISIT (across all domains)
xxTESTCD xxTEST (what about units?)
PARAMN PARAMCD PARAM
Proprietary and confidential. Do not distribute.
Conclusions
29
Understand potential impact
Prepare
Reassess
Tools and processes