1 Getting to SDTM Jeff Millstein, Ph.D. Principal Consultant Lincoln Safety Group Phase Forward, Inc. The presentation was made at the Washington, DC CDISC User’s Group, held at MedImmune, Gaithersburg, MD, on October 15, 2009. Please contact the author ([email protected]) before using any of this material.

Page 1: Getting to SDTM - CDISCportal.cdisc.org/CDISC User Networks/North America/Washington DC... · FDA recognizes SDTM as ... Creating submission-ready SDTM conversions means more than

1

Getting to SDTM

Jeff Millstein, Ph.D.
Principal Consultant
Lincoln Safety Group
Phase Forward, Inc.

The presentation was made at the Washington, DC CDISC User’s Group, held at MedImmune, Gaithersburg, MD, on October 15, 2009.

Please contact the author ([email protected]) before using any of this material.

2

Qualifications

Product Guidance: This presentation contains directional statements related to the features and availability of one or more unreleased products. These statements represent current intentions and goals which are subject to change or withdrawal at any time without further notice.

Safe Harbor: This presentation contains forward-looking statements within the meaning of the Private Securities Litigation Reform Act of 1995, including, without limitation, statements related to the features and availability of one or more unreleased products. These statements are subject to a variety of risks and uncertainties, including without limitation, technical difficulties encountered in the development of the planned product or changes in product plans which might result in the failure of Phase Forward to release the product as scheduled, as described, or at all, as well as the risks set forth in Phase Forward's public filings with the Securities and Exchange Commission. We do not assume any obligation to update the forward-looking statements contained in this presentation.

This presentation contains forward-looking statements related to the features and availability of one or more unreleased products. These statements are subject to a variety of risks and uncertainties including those listed on the screen and set forth in our public filings.

3

Introduction

Phase Forward’s headquarters, in Waltham, MA, where I work.

4

History of the SDTM

[Timeline figure, 1999 to 2009, with axis marks at 2002, 2004, and 2006]

CDISC milestones shown on the timeline:

• CDISC SDS meets (1999)

• SDS 1.0, SDS 2.0, SDS 3.0

• SDTM 1.0 / SDTMIG 3.1

• SDTM 1.1 / SDTMIG 3.1.1

• SDTM 1.2 / SDTMIG 3.1.2: posted, final draft, released

FDA milestones shown on the timeline:

• FDA recognizes SDTM as approved method for submitting data tabulation component of CRT

• FDA releases Critical Path Opportunities List, Item 44 = SDTM

• FDA releases Notice of Proposed Rulemaking regarding the SDTM

It has been about ten years since the CDISC Submission Data Standards (SDS) Committee first met, and about five years since we have had SDTM 3.1, the first stable and usable version of the data model.

Over the past five years we at Phase Forward, and in particular, the consulting arm within the Lincoln Safety Group, have seen a steady increase in requests for conversion of InForm trials to SDTM.

This year SDTM 3.1.2 was released. Although the standard has improved by fixing issues and adding flexibility, this version currently sits in a “no man’s land”: the FDA has not yet adopted 3.1.2, but some sponsors, believing that adoption is near, are already converting to it, while others use only what the FDA currently accepts.

5

Planning an SDTM Conversion

Albert Bierstadt, “Storm in the Mountains,” c. 1870, oil on canvas (collection of MFA, Boston, MA)

6

How much effort is this conversion going to be?

Victor Dubreuil, “Barrels of Money,” c. 1890s, oil on canvas (collection of Fed Reserve)

Invariably, the first question our consulting group is asked is something like “how long will this take?”, which can be converted into units of effort (e.g., staffing, meeting time) and/or money.

Conversions of any database are never trivial, and conversions of clinical trial data can be complex, time consuming, tedious, and often expensive. In addition, there may be considerable time pressure to complete the job. Therefore, it is important to have an organized methodology and perform the conversion carefully, step-by-step.

My procedure includes several stop points: places in the process where I have found it important to halt until the current step is complete and approved.

7

How much effort is this conversion going to be?

Conversions are not often

• Completed ahead of time

• Completed within budget

• Simpler than you imagined

• Easily tested

• Done and forgotten

Data conversions are always a significant and non-trivial effort. Because of many factors, conversion project budgeting (time and money) often slips.

More often than not, conversions become more complicated as you understand the nuances and details of the CRF case book. For example, details of the exposure or treatment schedule, derivation of baselines, unit conversions, and coding issues are not always completely understood at project start.

Clinical trials can have a long lifespan, and a conversion once completed may re-appear months or years later for further work. Sometimes, old conversions are revived for comparison with current trials. Therefore, archive your work carefully.

8

How much effort is this conversion going to be?

Need to assign staff and estimate effort for

• Project management

• Project documents

• Issue management

• CRF Annotation

• Data review

• Client meetings

• Trial Design development

• Programming

• Mapping specification

• Technical assistance

• Testing

• Converted data review

• Change management

• Preparation of deliverables

The colors in this slide indicate the kind of expertise needed to best perform each task and stay within budget:

• Blue: managers

• Green: technical staff with programming skills

• Purple: QA staff with testing experience

9

Planning: Why does the client want SDTM?

Submission-ready

SDTM to load a CDW

SDTM for safety review

Sharing and comparing

Ye Olde Data – Legacy conversions

Business planning

The reason for conducting an SDTM conversion is no longer always a submission. It may include standardizing for clinical data warehouse loading, safety review, sharing study results with partners, converting old trials or trials acquired in a merger. Sometimes conversions are performed to help measure effort to plan for future trial conversions.

10

Planning: What does the client want?

Conversion Outputs

• Issues log

• Executable code

• Tests & results

• Mapping spec

• Define.xml

• Annotated CRF

• SDTM datasets

• Project docs

• SOWs, contracts

There’s more that a client wants than just SDTM data sets. Knowing the complete list of deliverables helps you determine project work sequence, task staffing, and budgeting.

11

Planning: What does the client have?

Conversion Inputs

• Other requirements, constraints

• Staff available to review

• Protocol

• Additional source data

• Company metadata

• CRFs [pdf or scans]

• Source data

• Time frame

• Budget constraints

• SOWs, contracts

Clients usually have contracts, budgets, time constraints, source data and CRFs, but they often have additional information which can be extraordinarily useful if you request it at project start. For example, trial design domains, visit naming, and treatment schedules are usually explained in the study protocol document. Most pharma companies have company codelists that are useful in standardizing data.

Note that there are cases where a client has all except data because they are just starting the trial but want SDTM data sets periodically, as data is collected. In this case you might ask about test or UAT data used in trial development.

12

SDTM Submissions

Typical submission includes SDTM datasets + define.xml

[Diagram: data flow from sponsor to FDA review. Stage headings: Sponsor, Gateway, Repository, Review Environment. Labeled components:]

• Sponsor Data Repository

• eSub: SDTM XPT data, Define.XML

• FDA Electronic Document Room servers

• Staging area

• WebSDM data load & validation

• WebSDM DB

• FDA/NCI Janus Data Warehouse

• Janus SDTM materialized views

• Review tools: i-Review, SAS/JMP, WebSDM/CTSD

This diagram illustrates how WebSDM™ (a Phase Forward commercial product, see www.phaseforward.com) is currently used at the FDA.

WebSDM contains a load and check feature which can act as a roadblock for submission acceptance if your SDTM submission is incorrectly formatted. WebSDM’s rule set is also part of the Janus Data Warehouse load checking functionality.

13

Planning: Submission-Ready Data

• Data converted to SDTM

• Extra collected data in SUPPQUAL

• External data (labs, ECGs) integrated

• Clinical comments in CO, integrated

Derivations

• Data recoded to standards (e.g., MedDRA)

• Data mapped to internal test codes

• Data standardized (e.g., labs to SI units)

• Data derivations

– Baselines, Timepoints

– Categories, Groups

Metadata

• Trial design domains

• Related observations in RELREC

• Define.xml

QA Documents

• Mapping specification

• Tests and test results

Creating submission-ready SDTM conversions means more than SDTM-formatted data sets. A submission-ready package includes:

• SDTM-formatted data

• external data integrated into domains

• derivations and clinical comments (linked to observations)

• Trial design domains

• Define.xml data dictionary document

• SDTM-annotated CRF

• Project documents including mapping specifications, tests and test results, etc.

14

Budgeting, Estimating, Guesstimating

SDTM conversion job estimation is often tricky, particularly if the request is from a new client for a new study.

I start by examining the CRF, taking a guess as to the SDTM domain target, and estimating my programming effort. Then I add time for other factors including define.xml creation, mapping spec development, annotation of the CRF, and testing.

This spreadsheet is then provided to my manager who then meets with the sales staff. They negotiate a cost between company billing needs and the need to complete the sale.
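The estimation approach described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the form names, hour figures, and overhead percentages below are invented assumptions, not figures from any real project or from my own estimating spreadsheet.

```python
from dataclasses import dataclass


@dataclass
class FormEstimate:
    form: str                 # CRF form name
    target_domain: str        # first guess at the SDTM target domain
    programming_hours: float  # guessed programming effort for this form


# Hypothetical figures, for illustration only.
forms = [
    FormEstimate("Demographics", "DM", 8.0),
    FormEstimate("Adverse Events", "AE", 16.0),
    FormEstimate("Vital Signs", "VS", 12.0),
]

# Other tasks, expressed as assumed fractions of the programming effort.
overhead_factors = {
    "define.xml creation": 0.10,
    "mapping spec development": 0.25,
    "CRF annotation": 0.25,
    "testing": 0.50,
}


def estimate_hours(forms, overhead_factors):
    """Return a dict of effort lines (in hours) plus a 'total' line."""
    programming = sum(f.programming_hours for f in forms)
    lines = {"programming": programming}
    for task, factor in overhead_factors.items():
        lines[task] = programming * factor
    lines["total"] = sum(lines.values())
    return lines


estimate = estimate_hours(forms, overhead_factors)
for task, hours in estimate.items():
    print(f"{task:<26}{hours:8.1f}")
```

Keeping the overhead factors in one place makes it easy to revisit them after each completed project, which is where most of the estimating error tends to be.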

15

Performing SDTM Conversions

Jean-François Millet, “The Gleaners” c. 1857, oil on canvas (collection of Musée d'Orsay, Paris)

16

General Conversion Process (step 1 of 12)

1. Annotate CRF ↔ Review ↔ Update ↔ Review → Acceptance

Often takes several cycles

Most knowledge-intensive part of the SDTM conversion process

May require sponsor domain experts

Advisable to host a meeting to walk through annotation

Stop here until accepted by all parties

My overall conversion process has 12 steps.

Note: This is my current process, developed over a five-year period in which I was involved in well over 100 SDTM conversion projects. Your process may differ.

Step 1 is annotation of the CRF (case report form). I view this as a critical step as the SDTM-annotated CRF becomes the master specification for all the downstream programming and testing. Further, virtually everyone involved in the conversion process understands the structure and function of the CRF, so this is a good place to start.

I usually halt the conversion process until the CRF is completely annotated and approved. Continuing without an approved annotated CRF always creates problems and wastes time and money downstream.

CDISC provides some guidance as to CRF annotation style (search www.cdisc.org).

17

Annotate the CRF

• Center of the conversion process

• Knowledge-intensive component

• Everyone knows about / understands the CRF or knows who knows

• Considerable effort to review

Never underestimate the amount of work that is required to annotate a CRF. While not technically complex (use Adobe® Acrobat®), it is without a doubt the most knowledge-intensive aspect of an SDTM conversion. The director/manager of the annotation process must understand both the trial structure and the SDTM target model. The most experienced SDTM developer available should direct or participate in this process along with someone who is familiar with the trial.

18

Annotate the CRF

Domain 1

Domain 2

More often than not, data from one form maps to multiple domains.

This increases work everywhere:

• More training

• More review

• More programming effort

• More QA

19

Annotate the CRF

Requires careful review and thought

Biostatisticians should be part of the review

[Figure labels: Domain 1, Domain 2, Domain 3, Domain 4]

SDTM annotations require thoughtful and careful review of every page, as some domains will map just one question from some pages.

The bottom image shows a client-created mapping of four domains from a nine-question section. It probably needs more thought, as it is quite messy.

20

Annotate the CRF

[Figure labels: Domain 1, Domain 2, Domain 3, Domain 4]

Indices that measure disease severity, such as the APACHE II Index shown here (APACHE II = “Acute Physiology and Chronic Health Evaluation II”), collect data across many dimensions and often map to multiple domains. Here’s a single form that maps to four findings domains. It probably needs a RELREC as well.

21

General Conversion Process (step 2 of 12)

2. Install and review data

Need to check that:

• All subjects can be identified by treatment / ARM

• Data set keys can be identified and data sets can be joined together

• Topic variables exist and --TESTCD values can be derived

• Visit numbering is understandable

• Disposition events can be identified

Many other things, including:

• SDTM keys

• Codes for items with controlled terminology

• Data for timing variables

• Potential RELREC relationships can be established

• Comments can be properly handled

• External data (labs, ECGs) is formatted in an understandable way

Useful to do this as soon as data’s available while sponsor staff is engaged

Stop here until confident about source data

It’s prudent to spend a couple of days thoroughly reviewing source data. Do not proceed until you are confident that you have what you need to complete the task.
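A few of the source data checks listed on this slide can be automated early. The sketch below is a minimal illustration in plain Python, assuming the source tables have already been read into lists of dictionaries; the table names, column names (SUBJID, TRT_GROUP, VISIT, TEST), and data are hypothetical.

```python
from collections import Counter

# Hypothetical source extracts; table and column names are illustrative.
enrollment = [
    {"SUBJID": "001", "TRT_GROUP": "DRUG A"},
    {"SUBJID": "002", "TRT_GROUP": "PLACEBO"},
    {"SUBJID": "003", "TRT_GROUP": ""},                     # no treatment assignment
]
vitals = [
    {"SUBJID": "001", "VISIT": "WEEK 1", "TEST": "SYSBP"},
    {"SUBJID": "001", "VISIT": "WEEK 1", "TEST": "SYSBP"},  # duplicate key
    {"SUBJID": "004", "VISIT": "WEEK 1", "TEST": "SYSBP"},  # subject not enrolled
]


def subjects_without_arm(rows, id_col, arm_col):
    """Subjects that cannot be identified by treatment / arm."""
    return sorted(r[id_col] for r in rows if not r[arm_col].strip())


def duplicate_keys(rows, key_cols):
    """Key combinations that occur more than once, so the key is not unique."""
    counts = Counter(tuple(r[c] for c in key_cols) for r in rows)
    return sorted(key for key, n in counts.items() if n > 1)


def orphan_subjects(child_rows, parent_rows, id_col):
    """Subjects in a child table that cannot be joined to the parent table."""
    parents = {r[id_col] for r in parent_rows}
    return sorted({r[id_col] for r in child_rows} - parents)


print(subjects_without_arm(enrollment, "SUBJID", "TRT_GROUP"))
print(duplicate_keys(vitals, ["SUBJID", "VISIT", "TEST"]))
print(orphan_subjects(vitals, enrollment, "SUBJID"))
```

Each function returns the offending values rather than a pass/fail flag, so the output can go straight into the issues log for the sponsor to resolve.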

22

General Conversion Process (step 3 of 12)

1. Annotate CRF

2. Install and review data

3. Develop trial design domains as a spreadsheet

Requires protocol

May require review cycles

Important to have TA, TE, and TV settled at the start

Useful to do this as soon as data’s available while sponsor staff is engaged, particularly for legacy conversions

Very hesitant to continue conversions for submissions without TD (especially TA and TV)

Quite often, trial design domain creation is left undone until the end of the project. This is often because trial design data is not collected, but is recorded in the study protocol document and must be extracted. In my opinion, this is a big mistake, and I urge SDTM developers to develop the trial design domains, in Excel spreadsheet form, first, and have the client review the form before proceeding.

My reason is that many domains, such as DM, EX, DS, IE, SV, and SE, depend to some degree on trial design data.

23

Develop Trial Design Spreadsheet

Example of a trial design spreadsheet showing TA for three studies.

Develop these from the protocol document then have client review and approve before proceeding.
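Since the slide image is not reproduced here, the following Python sketch shows roughly what a Trial Arms (TA) spreadsheet might contain, written out as CSV for client review. The column names are the standard TA variables; the study identifier, arms, and elements are invented for illustration, and a real protocol will usually need TABRANCH and TATRANS filled in.

```python
import csv
import io

TA_COLUMNS = ["STUDYID", "DOMAIN", "ARMCD", "ARM", "TAETORD",
              "ETCD", "ELEMENT", "TABRANCH", "TATRANS", "EPOCH"]


def ta_rows(studyid, armcd, arm, elements):
    """Build TA rows for one arm.

    elements: ordered list of (ETCD, ELEMENT, EPOCH) tuples.
    """
    return [
        {"STUDYID": studyid, "DOMAIN": "TA", "ARMCD": armcd, "ARM": arm,
         "TAETORD": order, "ETCD": etcd, "ELEMENT": element,
         "TABRANCH": "", "TATRANS": "", "EPOCH": epoch}
        for order, (etcd, element, epoch) in enumerate(elements, start=1)
    ]


# Hypothetical two-arm study.
rows = (
    ta_rows("STUDY01", "A", "Drug A",
            [("SCRN", "Screen", "SCREENING"),
             ("A", "Drug A", "TREATMENT"),
             ("FU", "Follow-up", "FOLLOW-UP")])
    + ta_rows("STUDY01", "P", "Placebo",
              [("SCRN", "Screen", "SCREENING"),
               ("P", "Placebo", "TREATMENT"),
               ("FU", "Follow-up", "FOLLOW-UP")])
)


def to_csv(rows):
    """Serialize TA rows as CSV text that opens directly in Excel."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=TA_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_csv(rows))
```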

24

Develop Trial Design Spreadsheet

Example of a trial design spreadsheet showing TE and TV.

Develop these from the protocol document then have client review and approve before proceeding.
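Before sending the trial design spreadsheet out for review, it can help to cross-check the tables against each other. The Python sketch below shows two such checks under stated assumptions: every element code used in TA should be defined in TE, and VISITNUM values in TV should be unique and ascending (which assumes one visit schedule shared by all arms). The data are invented and deliberately contain errors.

```python
def check_trial_design(ta, te, tv):
    """Return a list of human-readable problems found across TA, TE and TV."""
    problems = []

    # Every element referenced by an arm must be defined in TE.
    defined_elements = {row["ETCD"] for row in te}
    for row in ta:
        if row["ETCD"] not in defined_elements:
            problems.append(
                f"TA arm {row['ARMCD']} uses element {row['ETCD']} not defined in TE"
            )

    # Assumes a single visit schedule shared by all arms.
    visit_numbers = [row["VISITNUM"] for row in tv]
    if len(visit_numbers) != len(set(visit_numbers)):
        problems.append("TV contains duplicate VISITNUM values")
    if visit_numbers != sorted(visit_numbers):
        problems.append("TV visits are not in ascending VISITNUM order")

    return problems


# Hypothetical tables with two deliberate errors.
te = [{"ETCD": "SCRN", "ELEMENT": "Screen"},
      {"ETCD": "A", "ELEMENT": "Drug A"}]
ta = [{"ARMCD": "A", "ETCD": "SCRN"},
      {"ARMCD": "A", "ETCD": "A"},
      {"ARMCD": "A", "ETCD": "FU"}]          # FU is missing from TE
tv = [{"VISITNUM": 1, "VISIT": "SCREENING"},
      {"VISITNUM": 2, "VISIT": "WEEK 1"},
      {"VISITNUM": 2, "VISIT": "WEEK 2"}]    # duplicate visit number

problems = check_trial_design(ta, te, tv)
for problem in problems:
    print(problem)
```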

25

General Conversion Process (steps 4 and 5 of 12)

1. Annotate CRF

2. Install and review data

3. Develop trial design domains as a spreadsheet

4. Program “central 4” domains & start mapping spec document

5. Send “central 4” domains to client with mapping specification

• For review of a small set of data

– Gives you an idea of sponsor responsiveness, quality

• For comment on the structure of the mapping spec

– Mapping specs are very time consuming and risky to modify

Now that you have a set of approved SDTM-annotated CRFs, a set of reviewed source data, and a trial design spreadsheet, you are ready to proceed.

If possible, I will next map four domains that I consider central to the SDTM and the conversion process. Then I will provide these to the client for review.

26

My Central Four Domains

Trial Arms (TA)

• Planned structure of trial

• Shows all treatment regimens

• Links study protocol to demographic information

Demography (DM)

• Who’s enrolled in the trial

• Who’s assigned to study ARMs

Exposure (EX)

• What treatment did subjects get?

• Check on DM.USUBJID x DM.ARM

Disposition (DS)

• What was each subject’s actual path through the trial?

• Final resolution of each subject

TA DM EX DS

The domains TA, DM, EX, and DS cover the trial from planning (TA), to enrollment (DM), to treatment (EX), to trial end (DS).

Creating these four domains lets you establish with certainty the set of subjects in the trial, the set of subjects treated, and the fate of those subjects (these are not always clear in legacy studies). Everyone wants this information, it is required, and many other domains depend on getting these four right, so do them first. Besides, it can be a lot harder than you think.
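The cross-checks among the central four domains can be sketched as follows. This is a simplified Python illustration with invented data: it flags DM subjects whose ARMCD is not a planned arm in TA, EX subjects missing from DM, and DM subjects with no DS record. Flagged items are candidates for review rather than automatic errors (screen failures, for example, may legitimately carry an arm code that is not in TA).

```python
def central_four_issues(ta, dm, ex, ds):
    """Cross-check TA, DM, EX and DS and return a list of items to review."""
    issues = []
    planned_arms = {row["ARMCD"] for row in ta}
    dm_subjects = {row["USUBJID"] for row in dm}

    # DM arm assignments should match a planned arm in TA.
    for row in dm:
        if row["ARMCD"] not in planned_arms:
            issues.append(
                f"{row['USUBJID']}: DM.ARMCD {row['ARMCD']!r} is not a planned arm in TA"
            )

    # Everyone with exposure records should be in DM.
    for usubjid in sorted({row["USUBJID"] for row in ex} - dm_subjects):
        issues.append(f"{usubjid}: in EX but not in DM")

    # Everyone in DM should have at least one disposition record.
    for usubjid in sorted(dm_subjects - {row["USUBJID"] for row in ds}):
        issues.append(f"{usubjid}: in DM but has no DS record")

    return issues


# Hypothetical data with one problem of each kind.
ta = [{"ARMCD": "A"}, {"ARMCD": "P"}]
dm = [{"USUBJID": "S01-001", "ARMCD": "A"},
      {"USUBJID": "S01-002", "ARMCD": "X"},
      {"USUBJID": "S01-003", "ARMCD": "P"}]
ex = [{"USUBJID": "S01-001"}, {"USUBJID": "S01-004"}]
ds = [{"USUBJID": "S01-001"}, {"USUBJID": "S01-002"}]

issues = central_four_issues(ta, dm, ex, ds)
for issue in issues:
    print(issue)
```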

27

Mapping Specification Document

• Needs to clearly record the relationship between

1. CRF questions

2. source data items

3. SDTM domains and variables

In most cases clients want to have documentation of exactly what was programmed in a conversion. To do this, create a “mapping specification” document. A mapping spec shows the name (e.g., table.column or file.column) of source data, and the target (SDTM domain.variable) and a description of what was done to the data, or an algorithm of data processing.

This document is time-expensive to make, and difficult to maintain.

Creation of this document depends on the annotated CRF and the structure of the source data. If these change, then the mapping spec must be updated.
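One way to picture a mapping specification is as a table with one row per CRF question, source item, and SDTM target. The Python sketch below is an assumed layout, not a prescribed format: the source table and column names (DEMOG, VITALS) and the rule text are hypothetical, while the targets are standard SDTM variables.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class MappingRow:
    crf_question: str  # question text as it appears on the CRF
    source_item: str   # table.column or file.column in the source data
    sdtm_target: str   # DOMAIN.VARIABLE in the SDTM
    rule: str          # what was done to the data


# Hypothetical rows for illustration.
spec = [
    MappingRow("Date of birth", "DEMOG.BIRTHDT", "DM.BRTHDTC",
               "Convert to ISO 8601 date"),
    MappingRow("Sex", "DEMOG.SEX", "DM.SEX",
               "Map 1 to M, 2 to F"),
    MappingRow("Systolic blood pressure", "VITALS.SYSBP", "VS.VSORRES",
               "Transpose to one row per test; set VSTESTCD to SYSBP"),
]


def targets_by_domain(spec):
    """Group SDTM targets by domain, in specification order."""
    grouped = {}
    for row in spec:
        domain = row.sdtm_target.split(".", 1)[0]
        grouped.setdefault(domain, []).append(row.sdtm_target)
    return grouped


def spec_to_csv(spec):
    """Serialize the specification as CSV text for review in a spreadsheet."""
    buffer = io.StringIO()
    names = [f.name for f in fields(MappingRow)]
    writer = csv.DictWriter(buffer, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    writer.writerows(asdict(row) for row in spec)
    return buffer.getvalue()


print(targets_by_domain(spec))
print(spec_to_csv(spec))
```

Keeping the specification in a structured form like this makes it cheaper to regenerate when the annotated CRF or the source data structure changes.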

28

Mapping Specification Spreadsheet

Some (many?) developers create the mapping specification first, seek approval, and then program the instructions it contains.

Although that is a reasonable strategy, I have followed this process over several years and dozens of trials and have found that it just doesn’t work well. This is because it is very difficult (at least for me) to mentally translate a complex CRF into spreadsheet form before you are “into” and familiar with the data.

I tell clients that the annotated CRF is the specification, and then I develop the mapping specification as I code the conversion; the mapping spec reflects what I’ve programmed.

29

General Conversion Process (step 6 of 12)

1. Annotate CRF

2. Install and review data

3. Develop trial design domains as a spreadsheet

4. Program “central 4” domains & start mapping specification

5. Send “central 4” domains & map spec out for review, comment

6. Development continues

• Program safety domains next

• Programming other domains after safety domains

• Update mapping specification as development proceeds

• Unit test coding during development

• WebSDM load & check as domains completed

Continue on.

Generally I develop the safety domains next (e.g., AE, CM, LB, VS, PE, EG, maybe one or two others) because this is often what clients want to see first.

I program and update the mapping spec and test SDTM model compliance with WebSDM continuously as the conversion develops.

30

Program, Test, Repeat

[Diagram labeled WebSDM, with these steps:]

• Program domain by domain; update map spec in parallel

• Generate SDTM datasets periodically

• Load & check in WebSDM

• Further refinement

• Annotations / conversion plans

• Formal testing

Review your work early and often. We use WebSDM SDTM load checking for this purpose because it’s fast and the issues are well-defined and traceable. This helps us detect inconsistencies within and across domains early, and it helps us differentiate errors due to logic from errors due to programming. It also highlights issues with source data (e.g., missing data, inconsistent units).
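As an illustration of the kind of source data issue mentioned above, here is a small Python sketch that looks for lab tests reported in more than one original unit, or with the unit missing. It is not how WebSDM performs its checks; it only shows the idea, using the standard LB variable names LBTESTCD and LBORRESU and invented data.

```python
from collections import defaultdict


def units_by_test(rows, test_col="LBTESTCD", unit_col="LBORRESU"):
    """Collect the set of original units seen for each test code."""
    units = defaultdict(set)
    for row in rows:
        units[row[test_col]].add(row[unit_col] or "<missing>")
    return units


def inconsistent_units(rows, test_col="LBTESTCD", unit_col="LBORRESU"):
    """Return tests that have more than one distinct unit (missing counts as one)."""
    return {
        test: sorted(units)
        for test, units in units_by_test(rows, test_col, unit_col).items()
        if len(units) > 1
    }


# Hypothetical lab records.
lb = [
    {"USUBJID": "S01-001", "LBTESTCD": "GLUC", "LBORRESU": "mg/dL"},
    {"USUBJID": "S01-002", "LBTESTCD": "GLUC", "LBORRESU": "mmol/L"},
    {"USUBJID": "S01-001", "LBTESTCD": "HGB", "LBORRESU": "g/dL"},
    {"USUBJID": "S01-002", "LBTESTCD": "HGB", "LBORRESU": "g/dL"},
    {"USUBJID": "S01-001", "LBTESTCD": "ALT", "LBORRESU": "U/L"},
    {"USUBJID": "S01-002", "LBTESTCD": "ALT", "LBORRESU": ""},
]

flagged = inconsistent_units(lb)
for test, units in flagged.items():
    print(test, units)
```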

31

General Conversion Process (step 7 of 12)

1. Annotate CRF

2. Install and review data

3. Develop trial design domains as a spreadsheet

4. Program “central 4” domains & start mapping specification

5. Send “central 4” domains & map spec out for review, comment

6. Development continues

• Programming other domains on annotated CRF

• Update mapping specification as development proceeds

• Unit test during development

• WebSDM load & check as domains completed

7. Begin development of formal tests

Time to test.

Testing might be informal or ad hoc, or more formal including the development of written test scripts, and the execution of test scripts by experienced QA staff.

32

Testing SDTM Conversions

• Subject and row counts

– Source data vs. SDTM data

– Non-trivial

• Data value subset review & topic variable comparisons

• Categorical aggregate review

• Continuous aggregate review

• Manual review

– By individual subject

– By proportion of subjects

• Load, check, review with WebSDM or Empirica Study

• Sponsor review

• Data validation

– Usually a separate project

• Combination or custom testing

There are many dimensions to testing.

Row counting between source data sets and SDTM data sets is often harder than you’d think. This is because the SDTM allows you to grow rows (e.g., a source row with multiple AEs that must be split, or a transposition to tall-narrow tables in findings domains), reduce rows (e.g., suppress rows where AETERM is null), or create rows (e.g., a unique disposition milestone created when subjects enter a specific phase or set of circumstances).
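To make the row-count problem concrete, the Python sketch below transposes a wide vital signs table into a tall VS-like structure, dropping empty results, and then reconciles the counts. The source layout and data are invented, and USUBJID is simply copied from SUBJID for brevity. In a real test the expected count would be computed independently of the conversion code (for example, by a query against the source database).

```python
# Hypothetical wide source table: one row per subject visit.
source_vitals = [
    {"SUBJID": "001", "VISIT": "WEEK 1", "SYSBP": "120", "DIABP": "80", "PULSE": "72"},
    {"SUBJID": "001", "VISIT": "WEEK 2", "SYSBP": "118", "DIABP": "", "PULSE": "70"},
    {"SUBJID": "002", "VISIT": "WEEK 1", "SYSBP": "", "DIABP": "", "PULSE": ""},
]
TESTS = ["SYSBP", "DIABP", "PULSE"]


def to_tall(rows, tests):
    """Transpose to one row per subject, visit and test; skip empty results."""
    tall = []
    for row in rows:
        for test in tests:
            if row[test] != "":
                tall.append({"USUBJID": row["SUBJID"], "VISIT": row["VISIT"],
                             "VSTESTCD": test, "VSORRES": row[test]})
    return tall


def expected_tall_rows(rows, tests):
    """Count the non-empty result cells in the wide source table."""
    return sum(1 for row in rows for test in tests if row[test] != "")


def reconcile(rows, tall, tests):
    """Summarize source and SDTM counts and list subjects that disappeared."""
    return {
        "source_rows": len(rows),
        "sdtm_rows": len(tall),
        "expected_sdtm_rows": expected_tall_rows(rows, tests),
        "subjects_lost": sorted({r["SUBJID"] for r in rows}
                                - {r["USUBJID"] for r in tall}),
    }


vs = to_tall(source_vitals, TESTS)
report = reconcile(source_vitals, vs, TESTS)
print(report)
```

Here three source rows grow to five SDTM rows, and subject 002 vanishes from the domain entirely because every result was empty. Both outcomes are legitimate, which is why a simple "row counts match" test is not enough.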

33

General Conversion Process (steps 8-11 of 12)

1. Annotate CRF

2. Install and review data

3. Develop trial design domains as a spreadsheet

4. Program “central 4” domains & start mapping specification

5. Send “central 4” domains & map spec out for review, comment

6. Development continues

• Programming other domains on annotated CRF

• Update mapping specification as development proceeds

• Unit test during development

• WebSDM load & check as domains completed

7. Begin development of formal tests

8. Intermediate dataset delivery for review

9. Testing commences

10. Issues Management as a result of client review

11. Define.xml

Development continues.

It’s generally a good idea to periodically supply your client with SDTM datasets that you’ve preliminarily completed for review; the more client review (even if it’s an internal client) the better, as they should understand their study best. Plus, without a doubt, most clients make changes of some sort when they actually see trial data reformatted to SDTM. Biostats, for example, may want data organized or grouped a specific way to prepare for analysis dataset creation.

It’s often best to create the define.xml data dictionary document absolutely last, to avoid rework.

34

Representing SDTM Metadata in define.xml

Define.xml: Description of datasets and their contents

Components

• Table of Contents: lists all domains

• Data Definition Tables: provide 7 kinds of information about variables

• Controlled Terminology: Codelists referenced

• Value Level Metadata: Map test codes to test name values

• Comments

Created programmatically by reading SDTM datasets

• Create from WebSDM, SDTM Mapper Tool or proprietary code

Generally needs editing to add:

• Link to blank annotated CRF and other documents

• Origins of each SDTM item (e.g., CRF pages, Derived, eDT, Protocol)

• Computational methods or derivations used

• Dictionary names and versions (e.g., MedDRA)

Once we have a set of SDTM-formatted data files, the submission needs a dictionary-like document that describes the contents of these files to their recipients.

We do this by creating a document called a “define”. We create this document in XML so that it can be displayed as a web page.

The define.xml document consists of a description of datasets and their contents:

1. Table of Contents: lists all domains

2. Data Definition Tables: provide 7 kinds of information about variables

3. Controlled Terminology: Codelists referenced

4. Value Level Metadata: Map test codes to test name values

5. Comments

The define is created programmatically by reading SDTM datasets (we create it from WebSDM v3 and then edit or post-process that document as needed).

Any define document generally needs editing to add:

• Link to blank annotated CRF

• Origins of each SDTM item (e.g., CRF pages, Derived, eDT, Protocol)

• Computational methods or derivations used

• Dictionary names and versions (e.g., MedDRA)
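To give a feel for what "created programmatically by reading SDTM datasets" means, here is a heavily simplified Python sketch that builds an ODM-style skeleton from dataset metadata. It is not schema-valid define.xml and is not how WebSDM generates its output: the metadata dictionary is invented, labels are written as a plain Label attribute for readability (an actual define.xml uses namespaced extension attributes), and required pieces such as namespaces, codelists, origins, and value-level metadata are omitted.

```python
import xml.etree.ElementTree as ET


def build_define_skeleton(study_name, datasets):
    """Build a simplified ODM-style tree from dataset metadata.

    datasets: {domain: {"label": str,
                        "variables": [(name, label, datatype, length), ...]}}
    """
    odm = ET.Element("ODM")
    study = ET.SubElement(odm, "Study", OID=study_name)
    mdv = ET.SubElement(study, "MetaDataVersion",
                        OID=f"{study_name}.MDV", Name=f"{study_name} SDTM metadata")

    # One ItemGroupDef per dataset, with ordered references to its variables.
    for domain, info in datasets.items():
        group = ET.SubElement(mdv, "ItemGroupDef",
                              OID=f"IG.{domain}", Name=domain, Label=info["label"])
        for order, (name, _label, _datatype, _length) in enumerate(
                info["variables"], start=1):
            ET.SubElement(group, "ItemRef",
                          ItemOID=f"IT.{domain}.{name}", OrderNumber=str(order))

    # One ItemDef per variable.
    for domain, info in datasets.items():
        for name, label, datatype, length in info["variables"]:
            ET.SubElement(mdv, "ItemDef", OID=f"IT.{domain}.{name}", Name=name,
                          Label=label, DataType=datatype, Length=str(length))
    return odm


# Hypothetical metadata, as if read from the SDTM datasets.
datasets = {
    "DM": {"label": "Demographics",
           "variables": [("STUDYID", "Study Identifier", "text", 12),
                         ("USUBJID", "Unique Subject Identifier", "text", 20),
                         ("SEX", "Sex", "text", 1)]},
    "AE": {"label": "Adverse Events",
           "variables": [("USUBJID", "Unique Subject Identifier", "text", 20),
                         ("AETERM", "Reported Term for the Adverse Event", "text", 200)]},
}

root = build_define_skeleton("STUDY01", datasets)
print(ET.tostring(root, encoding="unicode"))
```

The manual edits listed above (CRF links, origins, computational methods, dictionary versions) are exactly the items that cannot be read from the datasets themselves, which is why a generated define always needs post-processing.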

35

Define.xml: TOC

Hyperlink to dataset description

Hyperlink to dataset

Natural keys

The top of a displayed define is a table of contents with a domain order specified by CDISC.

Hyperlinks can move you deeper into the document or invoke an external file.

The appearance of the define, its fonts, background colors, and so forth is controlled by another document called an XML style sheet.

Shown here is the standard CDISC style sheet, but you are free to create your own.

36

Define.xml: Data Definition Table

Hyperlink to Value Level Metadata

Hyperlink to Codelist

Hyperlink to description of Computational Method

Hyperlink to pages in CRF

Hyperlink to dataset

Within the define each domain and its contents display. If you excluded empty permissible variables then they do not display here.

Within the domain description section hyperlinks take you to:

1. Value-level metadata, or lists of code-like values in your data with their descriptions, such as lab test codes and lab test names

2. Lists of controlled terminology

3. The origin or source for each variable

4. A comment which may also link to a description of a derivation.

37

General Conversion Process (step 12)

1. Annotate CRF

2. Install and review data

3. Develop trial design domains as a spreadsheet

4. Program “central 4” domains & start mapping specification

5. Send “central 4” domains & map spec out for review, comment

6. Development continues

• Programming other domains on annotated CRF

• Update mapping specification as development proceeds

• Unit test during development

• WebSDM load & check as domains completed

7. Begin development of formal tests

8. Intermediate dataset delivery for review

9. Testing commences

10. Issues Management as a result of client review

11. Define.xml

12. Delivery

Finally, you get to deliver the SDTM-formatted data sets and other documents.

38

Final Delivery

• Labeled SAS transport files

• SQL scripts to be run on demand

• SQL*Loader files

• Other

• Oracle export .dmp

• SAS data sets

• ASCII files

• Excel workbooks

Although SDTM datasets are often SAS V5 transport files (*.xpt) created with SAS PROC COPY and the XPORT engine, SDTM delivery might include other types of content, such as executable programs (e.g., SQL scripts, SAS, Java) to be run by the client on demand, or other types of files.

Delivery should always include a manifest document that lists each delivered file with its time stamp.
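A manifest like this is easy to generate. The Python sketch below walks a delivery folder and records each file's relative path, size, modification time stamp, and a SHA-256 checksum. The checksum and the output layout are my own additions for illustration; the only requirement stated above is a list of time-stamped files.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def build_manifest(delivery_dir):
    """Return one entry per file under delivery_dir, sorted by path."""
    root = Path(delivery_dir)
    entries = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        stat = path.stat()
        modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
        entries.append({
            "file": path.relative_to(root).as_posix(),
            "bytes": stat.st_size,
            "modified_utc": modified.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
        })
    return entries


def format_manifest(entries):
    """Render the manifest as plain text, one line per file."""
    return "\n".join(
        f"{e['modified_utc']}  {e['bytes']:>10}  {e['sha256'][:12]}  {e['file']}"
        for e in entries
    )


if __name__ == "__main__":
    # Lists the current directory; point this at the delivery folder instead.
    print(format_manifest(build_manifest(".")))
```

The checksum lets the recipient confirm that the files received are the files sent, which is useful when an old conversion resurfaces months or years later.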