The CDISC SDTM/ADaM
Pilot Project -
an Example of Applying
CDISC Standards
Cathy Barrows
SAS® Drug Development
Users Connection Conference
January 23, 2008
Barrows, Jan 2008 - 2
OBJECTIVE TODAY
Sometimes it helps to see a bigger picture…
Goal of this presentation
Show an example
of the results of the work we do as statisticians and statistical programmers
but better
because it is also an example of the use of CDISC standards
and even better
the example was vetted by the FDA
This example is
something we can look at as an illustration
showing us one way of applying the CDISC standards
that meets the expectations and requirements of FDA reviewers
Will also highlight learnings regarding the creation of a CDISC-adherent submission.
Presentation Outline
Overview of CDISC
The CDISC SDTM/ADaM Pilot
Illustration of Define.xml file
CDISC and the FDA - Why this matters
Acknowledgements
These slides use, or make reference to, materials
originally created by:
Cathy Barrows, GlaxoSmithKline
Musa Nsereko, Shire Pharm.
Chris Holland, FDA
Susan Kenny, Inspire Pharm.
Greg Anglin, Eli Lilly
Becky Kush, CDISC
Ed Helton, SAS
CDISC SDTM/ADaM Pilot Team
and various webpages (noted).
Overview of CDISC
Clinical Data Interchange
Standards Consortium
Global, open, multidisciplinary, not-for-profit
organization initiated in 1997 as a volunteer group
Incorporated in 2000
Now > 170 member corporations
Including academic research centers, global
biopharmaceutical companies, technology and
service providers, IRBs….
Active Coordinating Committees in Europe and
Japan
Plus 7 User Networks spanning the U.S.
Through an open, consensus-based approach, CDISC
has established worldwide industry standards to
support the electronic acquisition, exchange,
submission and archiving of clinical research data and
metadata to improve data quality and streamline
medical and biopharmaceutical product development
and research processes.
Standards are freely available on the CDISC website
(www.cdisc.org).
So what does CDISC consist of exactly?
The “CDISC environment” is made up of a
number of elements and concepts:
SDTM (Study Data Tabulation Model)
ADaM (Analysis Data Model)
Protocol Representation / Trial Design Model
ODM (Operational Data Model)
Define XML
(XML = eXtensible Markup Language)
CDASH (Clinical Data Acquisition Standards Harmonization)
Terminology
Note: For the purposes of this presentation, the CDISC standards (SDTM and ADaM) will be discussed within the context of electronic submissions to the FDA.
However, CDISC standards are applicable to a wide range of drug development activities in addition to regulatory submissions, for example:
Transferring datasets between sponsors and CROs, development partners and independent data monitoring committees
Reviewing and transferring datasets for in-licensing, out-licensing and mergers
The same principles and standards will apply, regardless of the purpose of the datasets.
SDTM and ADaM - What?
SDTM:
SDTM is basically a
data tabulation model
used to submit data to
FDA
ADaM:
Is a standard for
Analysis Datasets
delivered to the FDA
Builds on the
nomenclature of
SDTM, adding
attributes, data
structure, and
variables required as
appropriate for
statistical analyses
SDTM and ADaM - Content
SDTM datasets
(domains) contain:
Observations from a
clinical trial i.e. CRF
collected data
Trial design information
e.g. treatment arms and
time/events schedule
Some derived data, though
minimal (e.g. baseline
flags)
ADaM datasets (support
the statistical analyses in
a report) contain:
Restructured tabulation
data plus derived data,
flags and other details to
support the analysis
Does NOT have to contain
all tabulation data
NOT just adding columns
to SDTM dataset!
SDTM and ADaM - Design
Both are structured as one row per logical piece of information. e.g. one adverse event is one record
ADaM - record content is defined based on topic analysis parameter
Both standards contain predefined naming conventions and formats for many variables
ADaM datasets must be analysis-ready
Can be analyzed with little or no programming or data processing - i.e., one proc away from an analysis result / one step away from knowledge
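As a rough illustration of "one step away from an analysis result", the sketch below summarizes a small analysis-ready dataset with a single selection-and-summary step. The rows, the treatment labels and the helper name mean_change are made up for this example (and Python stands in for a SAS procedure); they are not taken from the pilot.

```python
from statistics import mean

# Made-up ADaM-style rows: one record per subject, parameter and analysis visit.
# CHG (change from baseline) is already derived, so no further processing is needed.
adqs = [
    {"USUBJID": "S-001", "TRT": "Placebo", "PARAMCD": "ACTOT", "AVISIT": "WEEK 24", "CHG": 4.0},
    {"USUBJID": "S-002", "TRT": "Placebo", "PARAMCD": "ACTOT", "AVISIT": "WEEK 24", "CHG": 2.0},
    {"USUBJID": "S-003", "TRT": "Active", "PARAMCD": "ACTOT", "AVISIT": "WEEK 24", "CHG": -1.0},
    {"USUBJID": "S-004", "TRT": "Active", "PARAMCD": "ACTOT", "AVISIT": "WEEK 24", "CHG": 1.0},
]


def mean_change(rows, trt, paramcd="ACTOT", visit="WEEK 24"):
    """The 'one step': select the analysis rows and summarize them."""
    return mean(
        r["CHG"]
        for r in rows
        if r["TRT"] == trt and r["PARAMCD"] == paramcd and r["AVISIT"] == visit
    )


print(mean_change(adqs, "Placebo"), mean_change(adqs, "Active"))
```

The point of the design is that every derivation lives in the dataset itself, so the analysis code only selects and summarizes.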
A word about the new ADaM IG
SDTM is necessarily more specifically defined than ADaM
New ADaM IG presents more requirements than previous ADaM directives, for example:
ADaM Dataset Structure and Metadata
Defines the ADaM structure as one record per analysis parameter
Columns added for analysis purposes (e.g. covariates)
NOT just adding columns to SDTM dataset!
Defines required variables including:
PARAM, PARAMCD, PARAMN
AVAL / AVALC
Identifies classes of ADaM columns & rows
Standard variable names
Conventions for flag variables
Implementation Issues and Solutions, such as:
When should derived columns versus derived rows be created?
Should all observed data be included in analysis datasets?
How should rows used for analyses be identified?
How should missing time points be handled?
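To make the "one record per analysis parameter" structure concrete, here is a minimal sketch that reshapes one wide row into rows carrying PARAMCD, PARAMN, PARAM and AVAL. The parameter catalogue, the input row and the function name are hypothetical, chosen only for illustration.

```python
# Hypothetical parameter catalogue: PARAMCD -> (PARAMN, PARAM).
PARAMS = {
    "SYSBP": (1, "Systolic Blood Pressure (mmHg)"),
    "DIABP": (2, "Diastolic Blood Pressure (mmHg)"),
}


def to_one_record_per_parameter(wide_row):
    """Turn one wide row into one row per analysis parameter."""
    keys = {"USUBJID": wide_row["USUBJID"], "AVISIT": wide_row["AVISIT"]}
    long_rows = []
    for paramcd, (paramn, param) in PARAMS.items():
        if paramcd in wide_row:
            long_rows.append(
                {**keys, "PARAMCD": paramcd, "PARAMN": paramn,
                 "PARAM": param, "AVAL": wide_row[paramcd]}
            )
    return long_rows


wide = {"USUBJID": "S-001", "AVISIT": "WEEK 8", "SYSBP": 128.0, "DIABP": 82.0}
for row in to_one_record_per_parameter(wide):
    print(row)
```

Each output row has the same columns regardless of the parameter, which is what lets a single analysis step be reused across parameters.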
SDTM and ADaM Datasets
Both are representations of the clinical trial
data
Complementary
Mutually supportive
Each with a specific purpose
BOTH ARE NEEDED
FOR FDA REVIEW
Metadata
Data about the data, providing clear, unambiguous communication of the science and statistics of the trial
SDTM metadata
Describes the structure, content, and source of the tabulation datasets
Analysis metadata
Structured documentation of the analysis datasets and key analysis results
Analysis dataset metadata / Analysis variable metadata / Analysis variable value-level metadata
Describes the structure, content, and source of the analysis datasets
Analysis results metadata
Describes what (analysis data) and how (methods and algorithms followed) and why (pre-specified, ad hoc, etc) for key analysis results
Supports the statistical analyses of a clinical trial, by communicating:
statistical methods
transformations
assumptions
derivations
imputations
Codifies the analyses performed
Metadata Standards for Data and Analysis
(as stated by an FDA reviewer)
Enable reviewers to understand, replicate, explore, confirm, reuse, etc.
Analysis metadata:
Clear, unambiguous communication of decisions, analysis and results
Underlying principles:
Can a reviewing statistician understand?
Can a reviewing statistician efficiently:
Quality assure?
Validate?
Analyze?
Using ADaM and SDTM dataset standards and metadata to describe the submitted data:
Facilitates a common look and feel for data, both within and across submissions
Allows standard tools to be used for navigation, browsing and analysis
ODM and Define XML
Define XML
Formal name is “Case Report Tabulation Data Definition Specification” (CRT-DDS)
Specifies the standard for providing Data Definitions in an XML format
Define.xml file
Data Definition Document (e.g., "define.pdf" in the 1999 guidance) in a machine-readable format
Increase the level of automation
Improve the efficiency of the Regulatory Review process
Contains data (ideally) and metadata
Define.xml will replace Define.pdf in submissions (eventually)
Requires a style sheet to be user-friendly
XML schema used to define the expected structure for these XML files is based on CDISC ODM
ODM
ODM is a specification of a standard XML schema for the interchange and archive of clinical trials data and metadata
System-independent
Essentially a transport format or “wrapper” for transferring information
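To give a feel for what machine-readable data definitions look like, the sketch below builds a tiny ODM-style metadata tree for one dataset. The element names (ODM, Study, MetaDataVersion, ItemGroupDef, ItemRef, ItemDef) come from the ODM model, but the OIDs, the placement of the label and the helper name build_metadata_xml are simplifying assumptions; the output is not a schema-valid Define.xml.

```python
import xml.etree.ElementTree as ET


def build_metadata_xml(dataset_name, variables):
    """Build a small ODM-style metadata tree for one dataset.

    `variables` is a list of (name, label, data_type) tuples.
    """
    odm = ET.Element("ODM")
    study = ET.SubElement(odm, "Study", OID="STUDY.EXAMPLE")
    mdv = ET.SubElement(study, "MetaDataVersion", OID="MDV.1", Name="Example metadata")
    group = ET.SubElement(
        mdv, "ItemGroupDef", OID=f"IG.{dataset_name}", Name=dataset_name, Repeating="Yes"
    )
    for name, label, data_type in variables:
        oid = f"IT.{dataset_name}.{name}"
        # The dataset refers to each variable by OID ...
        ET.SubElement(group, "ItemRef", ItemOID=oid, Mandatory="No")
        # ... and the variable itself is defined once at the MetaDataVersion level.
        item = ET.SubElement(mdv, "ItemDef", OID=oid, Name=name, DataType=data_type)
        ET.SubElement(item, "Description").text = label  # simplified label placement (assumption)
    return odm


root = build_metadata_xml(
    "AE",
    [("USUBJID", "Unique Subject Identifier", "text"),
     ("AESEQ", "Sequence Number", "integer")],
)
print(ET.tostring(root, encoding="unicode"))
```

A stylesheet would then render this XML as the browsable tables reviewers see; the XML itself stays system-independent.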
Data Flow Using CDISC
[Figure: data flow diagram. Boxes shown: Clinical Trial Protocol; Protocol Representation; Trial Design (SDTM); Analysis Plan; (e)Source Document; Patient Info; Clinical (CRF or eCRF) Trial Data (defined by SDTM); Administrative, Tracking, Lab Acquisition Info; Operational & Analysis Databases; Integrated Reports; Reporting and/or Regulatory Submissions (SDTM Data, Analysis Data, Metadata; CRF, Analysis Data). Connectors labeled: ODM XML, Define.xml. Legend: SDTM and Analysis Data (content); ODM (transport); Protocol information (content); Source data (other than SDTM/CRF data).]
CDISC SDTM /
ADaM Pilot Project
Disclaimer
All comments, statements, and opinions attributed in this presentation to the regulatory (FDA) review team reflect views of those individuals conveyed as informal feedback to the pilot project team, and must not be taken to represent guidance, policy, or evaluation from the Food and Drug Administration.
CDISC SDTM / ADaM Pilot Project
Goal for the Pilot was to get initial answers to key questions
What does a CDISC-format submission look like, including both SDTM and ADaM datasets?
Where are the overlaps and differences between SDTM and ADaM?
Do the current CDISC standards and models meet the FDA's requirements and expectations (both medical and statistical reviewers)?
What improvements can be considered to optimize the SDTM and ADaM models?
And to produce a worked example implementation of the available CDISC standards.
http://philhord.com/phord/wp-content/development.jpg
The reason for the pilot project
SDTM / ADaM Pilot Focus
Focus on the package and not on the process
Choices/decisions guided by
timeline
realities of a team of volunteers from multiple
companies
goal was the submission package and the FDA
review
quick, efficient, effective - not necessarily the most
preferred option
SDTM / ADaM Pilot Focus
Attention to the process would detract from Pilot Objectives:
Do current standards result in a package that meets expectations?
Are there holes or issues lacking clarity in the current standards?
The Pilot results, when released, should be reviewed with the project objective in mind
Utilize information on the process as a basis for discussion within your organization
SDTM / ADaM Pilot CDISC Tools
Used the tools currently available (with very minor modifications if any) to produce the pilot submission.
SDTM IG Version 3.1.1
SDTM Version 1.1
ADaM Version 2.0 (for public comment in March, 2006)
CRT-DDS version 3.1.1
ODM version 1.3 (public comment closed May 2, 2006)
Custom stylesheet developed by team members
Datasets as XPT not XML
SDTM / ADaM Pilot Deliverables
1. Submission package
Includes SDTM datasets, ADaM datasets, all
relevant metadata, analysis tables and
figures, abbreviated final study report,
annotated CRFs
Review package tied together using
metadata in DEFINE.XML
SDTM / ADaM Pilot Deliverables
2. Summary report of the pilot
submission project
issues encountered, strengths and
weaknesses
incorporate what we learned from the FDA
feedback
Both the Package and the Report will be
made available via the CDISC website
SDTM / ADaM Pilot Timelines
Most of the work was done Feb 2006 –
August 2006
FDA made constructive comments about
issues that made their review more
difficult
Pilot team addressed and fixed these
issues November 2006 – Feb 2007
Final package (abbreviated) and pilot
report will be available soon
SDTM / ADaM Pilot Team
Cathy Barrows (GSK)
Musa Nsereko (Cephalon
/ Shire)
FDA Co-Leaders:
Lonnie Smith (previous)
Chris Holland
Mina Hohlen
Greg Anglin (Lilly)
T Friebel (SAS)
John Gorden (Quintiles)
Tom Guinter (Octagon)
Joel Hoffman (Insightful)
Susan Kenny (Inspire Pharm.)
Sandy Lei (J&J)
Richard Lewis (Octagon)
Arline Nakanishi (Amgen)
Gregory Steffens (Lilly)
Gary Walker (Quintiles)
Aileen Yam (sanofi-aventis)
Yuguang Zhao (sanofi-aventis)
FDA Participation
Unprecedented level of involvement
Co-Leadership of the project
18-20 FDA employees involved
included medical and statistical reviewers
12 consistently in contact with team
Interactions:
Regular team teleconferences
Face-to-face meeting to define the project
(expectations/requirements)
Pre-submission encounter
Feedback from review
Use of SDD
Repository
data
documents
metadata
Worked well because everyone could access it, and it allowed us to keep up with current versions
Did not fully use the capabilities of SDD
Building the
submission package
for the Pilot Project
Building the CDISC Pilot Submission Package from real clinical trial data
[Flowchart; boxes listed in extraction order, which may not match the order of the flow:]
Legacy documents received
Decisions regarding data analysis
Write SAP
Map blank CRF to SDTM (aCRF)
Create SDTM data metadata
Create analysis data metadata
Create SDTM datasets (little derived data)
Create analysis datasets
Receive legacy data
Create 0-obs analysis datasets
Coding of events data & con.med. data
Write study report
Create 0-obs SDTM datasets
Finalize SDTM datasets
Generate analyses
Derived data to SDTM
Create analysis results metadata
Write reviewer's guide
Write cover letter
Create DEFINE
Create XPT files
Note that “create” includes QC steps.
Creation of SDTM domains
Redacted data, as originally databased, were
mapped into SDTM
Only domains needed for package were mapped
Mapping documents were created for each
domain and used by programmers to create
domains
SAS® ETL Studio (now marketed as Data
Integration) was the tool used for mapping
Creation of ADaM Datasets
Used mapped SDTM domains as input
Created analysis specifications in series of
Excel spreadsheets
Contained all metadata needed for define file
Traditional SAS programming was used to
create analysis datasets and analysis
results (tables, listings, figures)
Use of metadata
A 'prescriptive' approach was used to make more
effective use of the metadata
Create metadata first
Via SAS macros, use metadata to drive creation of
datasets (e.g., variable labels, lengths)
Use metadata as input into creation of define.xml
Ensured that metadata and datasets were in sync
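The prescriptive approach can be sketched as follows: the metadata are written first, one function uses them to shape the data, and another derives the documentation rows from the same source. The pilot did this with SAS macros; the Python below is only an illustration, and the METADATA entries and the names conform and define_rows are hypothetical.

```python
# Hypothetical variable-level metadata, written before any dataset exists.
METADATA = [
    {"name": "USUBJID", "label": "Unique Subject Identifier", "type": "char", "length": 11},
    {"name": "AVISIT", "label": "Analysis Visit", "type": "char", "length": 8},
    {"name": "AVAL", "label": "Analysis Value", "type": "num", "length": 8},
]


def conform(record, metadata):
    """Shape one raw record to the metadata: keep declared variables only,
    in declared order, truncating character values to the declared length."""
    out = {}
    for spec in metadata:
        value = record.get(spec["name"])
        if spec["type"] == "char":
            value = "" if value is None else str(value)[: spec["length"]]
        elif value is not None:
            value = float(value)
        out[spec["name"]] = value
    return out


def define_rows(metadata):
    """Derive documentation rows from the same metadata, so data and define cannot drift apart."""
    return [(m["name"], m["label"], m["type"], m["length"]) for m in metadata]


raw = {"USUBJID": "01-709-1259-EXTRA", "AVISIT": "WEEK 24", "AVAL": "19", "SCRATCH": 1}
print(conform(raw, METADATA))
print(define_rows(METADATA))
```

Because both outputs are generated from METADATA, a change to a label or length is made once and shows up in the dataset and in the define file together.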
Creation of Define.XML
The Pilot Define.xml file integrated metadata for:
SDTM datasets
ADaM datasets
Analysis Results metadata
Used ODM 1.3 with user-written extensions to handle ADaM metadata
Should be viewed as ONE possible way to approach the need and not definitive of the syntax to use in the future
The Pilot team developed their own stylesheet that could accommodate the ODM extensions
Content and General Structure of Pilot Submission Package
[Figure: folder tree of the submission package. Labels shown: CDISCPILOT01; M1 (Administrative); M5 (Clinical Study Reports); Clinical Study Report; Tabulations; Analysis Datasets; Cover Letter & Reviewer's Guide (PDF); Study Report (PDF); Patient Narratives (ASCII text); Annotated CRF (PDF); SDTM datasets (XPT); Analysis datasets (XPT); Define.xml (shown twice). Legend: Folders; PDF docs; XPT files; DEFINE files.]
FDA Feedback
re the Pilot Project
FDA review team comments after reviewing 1st submission
Overall favorable impression
Expect learning curve to be less steep when standards are being followed
Several notable comments
ADaM datasets were an important component since SDTM datasets are not analysis ready
ADaM ADSL was very useful for both medical and statistical reviewers
Some issues…
Difficulties with transparency in some analysis datasets
Difficulties with Define.xml file - primarily navigation
Changes made to Define.xml file
Modifications to style sheet took care of
numerous issues
navigation
back button
additional links
Printing issue remains
Top of Define.XML
Regarding analysis datasets
Need a clear data lineage from CRF to analysis
Traceability and Transparency are key
Allows reviewers to understand (and trust) what was done
Allows reviewers to examine the sensitivity of what was done to alternative methodologies
Through data (e.g. flag variables) and metadata
Clear, unambiguous communication of decisions, analysis and results
What was lacking -
Though the algorithm for performing windowing
and selecting LOCF'ed visits was pre-specified
in the SAP, the procedure actually followed could
not be verified without significant investigative work
Reviewers were unable to test other strategies
(e.g., including all data in the LOCF imputation
rather than only the windowed visits)
Traceability & Transparency Illustration
Total score data from modified analysis dataset
USUBJID | VISITNUM | AVISITN | AVISITC | VISIT | AVISITCD | ANLDY | QSDY | AWEEK | VISITDT | ITYPE | ITTV | VAL | BASE | CHG | ADD_REC
01-709-1259 | 3 | 3 | BASELINE | BASELINE | BL | 1 | 1 | 0 | 26-Jan-13 | | Y | 15 | 15 | 0 |
01-709-1259 | 8 | 8 | WEEK 8 | WEEK 8 | Wk8 | 57 | 57 | 8 | 23-Mar-13 | | Y | 21 | 15 | 6 |
01-709-1259 | 10 | 10 | WEEK 16 | WEEK 16 | Wk16 | 106 | 106 | 16 | 11-May-13 | | Y | 19 | 15 | 4 |
01-709-1259 | 10 | 12 | WEEK 24 | WEEK 16 | Wk24 | 106 | 106 | 24 | 11-May-13 | LOCF | Y | 19 | 15 | 4 | Y
01-709-1259 | 11 | . | WEEK 20 | WEEK 20 | | 139 | 139 | . | 13-Jun-13 | | N | 23 | 15 | 8 |
WARNING: Because the ADaM Implementation Guide was still being developed at the time of the pilot project, the variable names and definitions do not correspond to recommendations by the ADaM working group. Not all columns and rows are shown for datasets.
Understanding the data for this patient, via metadata:
VAL can be traced back to QS domain
Week 24 row (AVISITC=WEEK 24) is a created record (ADD_REC=Y). Created due to missing data (ITYPE=LOCF).
The data carried forward for Week 24 are from Week 16 (VISIT=WEEK 16).
Why not Week 20? Because the Week 20 data are not eligible for analysis (ITTV=N). Not eligible because VISITNUM is not one of those specified.
Metadata for some of the variables:
Variable: ADD_REC
Label: Record created for analysis purposes?
Computational Algorithm or Method: ADD_REC='Y' if record created because of change in data due to windowing or because of missing data (LOCF). Regarding windowing - If the visit window is other than that noted in QS.VISIT, then another record is created containing a copy of the observed data. The first record has AVISITC=VISIT and ITTV='N' and ADD_REC=' '. The new record has AVISITC=the name of the windowed visit and ITTV='Y' and ADD_REC='Y'.
Variable: ITTV
Label: Intent to Treat Visit Flag
Computational Algorithm or Method: If the observed data are eligible for analysis (i.e., QS.VISITNUM in 3,8,10,12,201) and if QS.VISIT = the name of the visit window containing ADQSADAS.ANLDY then ITTV='Y'; ITTV='N' otherwise
Variable: ITYPE
Label: Imputation Type
Computational Algorithm or Method: ITYPE='LOCF' if record was created to replace missing value
Variable: VAL
Label: Numeric value of PARAM [ACTOT for this example]
Computational Algorithm or Method: Sum(ACITM01:ACITM02, ACITM04:ACITM08, ACITM11:ACITM14), see SAP section 14.1.1 for detailed scoring algorithm, adjusted for missing values; ACITMxx are the corresponding values of QS.QSSTRESN when QS.QSTESTCD=ACITMxx
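A minimal sketch of how flagged LOCF rows like the Week 24 record might be produced, and how a reviewer could try the alternative strategy of carrying forward from all data rather than only eligible visits. The variable names follow the pilot illustration (not the ADaM IG); the function name locf, the visit schedule and the "on or before week x 7 days" cutoff are assumptions made for this example, not the pilot's SAP algorithm.

```python
# Observed rows for one subject, mirroring the illustration.
# ITTV='Y' marks rows eligible for analysis.
observed = [
    {"AVISITC": "BASELINE", "ANLDY": 1, "ITTV": "Y", "VAL": 15},
    {"AVISITC": "WEEK 8", "ANLDY": 57, "ITTV": "Y", "VAL": 21},
    {"AVISITC": "WEEK 16", "ANLDY": 106, "ITTV": "Y", "VAL": 19},
    {"AVISITC": "WEEK 20", "ANLDY": 139, "ITTV": "N", "VAL": 23},
]
# Assumed analysis schedule: (visit name, nominal week).
SCHEDULE = [("BASELINE", 0), ("WEEK 8", 8), ("WEEK 16", 16), ("WEEK 24", 24)]


def locf(rows, schedule, eligible_only=True):
    """Return analysis rows for every scheduled visit, imputing gaps by LOCF.

    Imputed rows carry ITYPE='LOCF' and ADD_REC='Y', and keep the source
    visit in VISIT so the carried value can be traced back.
    """
    base = next(r["VAL"] for r in rows if r["AVISITC"] == "BASELINE")
    pool = sorted(
        (r for r in rows if r["ITTV"] == "Y" or not eligible_only),
        key=lambda r: r["ANLDY"],
    )
    out = []
    for name, week in schedule:
        hit = next((r for r in rows if r["ITTV"] == "Y" and r["AVISITC"] == name), None)
        if hit is not None:
            src, itype, add_rec = hit, "", ""
        else:
            # Assumed cutoff: carry the latest value observed by the nominal day.
            earlier = [r for r in pool if r["ANLDY"] <= week * 7]
            if not earlier:
                continue  # nothing to carry forward
            src, itype, add_rec = earlier[-1], "LOCF", "Y"
        out.append({"AVISITC": name, "VISIT": src["AVISITC"], "VAL": src["VAL"],
                    "BASE": base, "CHG": src["VAL"] - base,
                    "ITYPE": itype, "ADD_REC": add_rec})
    return out


for row in locf(observed, SCHEDULE):
    print(row)
```

Calling locf(observed, SCHEDULE, eligible_only=False) carries the Week 20 value forward instead, which is the kind of sensitivity check the flags are meant to make possible.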
FDA Feedback after 2nd submission
Define file much improved
The analysis dataset modifications met
their needs
The new structure and metadata provide a
good model of what information is critical
to a reviewer's understanding of the data
lineage from CRF to analysis
Screenshots of the
Define.XML File
from the Pilot Project
Top of Define.XML
SDTM Dataset Metadata
Individual Domain Metadata
Computational Methods for Study Day
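A rough sketch of the usual SDTM study-day convention (day 1 is the reference start date and there is no day 0), which may differ in detail from the method documented in the pilot's Define.xml. The function name study_day is hypothetical, and the two-digit years in the traceability illustration are assumed to mean 2013.

```python
from datetime import date


def study_day(event_date, reference_start):
    """SDTM-style study day: day 1 is the reference start date; there is no day 0."""
    delta = (event_date - reference_start).days
    return delta + 1 if delta >= 0 else delta


ref = date(2013, 1, 26)  # baseline visit date from the illustration (year assumed)
for visit_date in (date(2013, 1, 26), date(2013, 3, 23), date(2013, 5, 11), date(2013, 6, 13)):
    print(visit_date.isoformat(), study_day(visit_date, ref))
```

Under these assumptions the results agree with the QSDY values 1, 57, 106 and 139 shown in the illustration.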
SDTM Dataset Metadata
QS Variable Metadata
Example of a more involved
computational algorithm
Variable: QSSTRESN
Computational Method or Algorithm: COMP_QSAD_QSSTRESN
if QSDRVFL='Y' and the QS data pertain to ADAS-Cog or NPIX, then QSSTRESN is from ADQSADAS.ACTOT or ADQSNPIX.NPTOT, respectively, using the windowed data (i.e., where VISIT=AVISITC and ITYPE=' '),
else if QSDRVFL=' ' then QSSTRESN is from the CRF Page
Top of Define.XML
ADaM Dataset Metadata
ADSL Metadata
Value list for DSREAS
Analysis Results Metadata
Outcomes
of Pilot Project
Did we achieve goals?
Can the various CDISC components be used in creating
a submission of electronic data that is in a format that is
acceptable to the FDA and meets the needs of both
medical and statistical reviewers?
Package met the needs and
the expectations of both
medical and statistical FDA
reviewers participating in the
review
Demonstrated the importance of metadata that
provides clear, unambiguous communication of
the science and statistics of the trial
Showed that it is important to include both data in
SDTM format and data in ADaM format
Did we achieve goals?
Produce a worked example implementation of the
available CDISC standards?
Provided a good model of what information
is critical to a reviewer's understanding of
the data lineage from CRF to analysis (FDA review team)
FDA Opinion on CDISC Standards?
Standards have great promise!
Reviewers will need experience with standardized
data
Tools will need development to assist with reviewer
needs
Expect learning curve for a submission to be less
steep when standards are being followed
FDA review team expressed optimism about the
impact that these data standards
will have on the work associated
with their review of new drug
applications
Pilot #1 Submission Deliverables
Submission package
Includes SDTM datasets, analysis datasets, Define.XML (including all relevant metadata), analysis results, abbreviated report
Summary report of the pilot submission project
Report will be available to general public on the CDISC webpage
Report and submission package will be available to CDISC members on members-only section of CDISC webpage
So Why Should
YOU Care?
Is it worth implementing the
CDISC Standards?
What does the
future look like?
FDA has endorsed the CDISC standards by including
them as specifications in FDA Final Guidance.
Study Data Tabulation Model (SDTM) define.xml (CRTDDS): Specifications for FDA
implementation of the ICH eCommon Technical Document
October 2005 – Federal Register Notice of Proposed Rule and listed as DHHS Priority
http://www.fda.gov/oc/datacouncil/cdisc.html
SDTM is noted as an FDA-adopted standard on the
FDA Data Standards Council's website
About eSubmissions to FDA
The FDA has issued the Final Guidance on eSubmissions (eCTD), which provides reference to the CDISC SDTM as a Study Data Specification when providing electronic submissions to FDA.
A regulation is being proposed that will require all regulatory submissions to be in electronic format and to use CDISC SDTM. There will be an appropriate period of time allowed for compliance with this regulation. It is not known when the regulation will be in place.
The FDA is working with CDISC and HL7 on transport technologies that may ultimately replace the current SAS file method used for submissions today (see below).
Regardless of the transport technology, the FDA is committed to using SDTM for content and for evaluation of eSubmissions by regulatory reviewers.
from the CDISC webpage http://www.cdisc.org/publications/fda.html
CDISC SDTM will be a preferred
but not required data specification
until the proposed rule is approved.
What benefits does CDISC present for
Industry?
Having a standard submission format could improve review times with the FDA
Improve efficiencies in evaluation of submissions
Easier communications between regulator and applicant
Allow adoption of common tools by both FDA and sponsor
Utilising submission standards could provide further benefit in a sponsor's own clinical process
Making communications between partners more efficient (e.g. Labs, CROs, Affiliates, Sponsors)
Facilitate aggregation and exploitation of data assets
Wisdom is scar tissue in disguise
Or, as one FDA Team
Member said:
In order to get
a standard
we have to
suffer