-
OpenCDISC Validator 1.4
What's New?
David Borbas
Sr Director, Data Management
Jazz Pharmaceuticals, Inc.
Bay Area CDISC Implementation Network
23 May 2013
-
2
Disclaimers
The opinions expressed in this presentation
are not the official views or policies of Jazz
Pharmaceuticals, Inc.
-
3
Topics
What is Open CDISC? Why is it important?
View of the FDA use of OpenCDISC Validator
Using the application
Understanding the Validation Reports
-
4
Open CDISC
Open source application (Java and XML)
Can check/validate SDTM, ADaM, SEND and
Define.xml files
Can generate Define.xml (limited)
Based on published standards, e.g., draft Janus
validation rules, and user-suggested rules
Currently used in the FDA submission checking
process
-
5
FDA Review Past, Present & Future
[Diagram: FDA Review Environment. Labels shown: Sponsor data in eCTD format for NDA or sNDA; Sponsor review/validation (WebSDM, OpenCDISC Validator); FDA Electronic Document Room servers; Past / Present / Future 2013? (Data Fit, JANUS, OpenCDISC Validator); FDA review tools (WebSDM/Empirica, jReview, SAS/JMP); future additional data visualization tools under evaluation (JMP Clinical, iReview, Spotfire); data views.]
-
6
Open CDISC Application
Website location http://www.opencdisc.org/
PC specs
Minimum RAM 1 GB
Any Java capable PC
Java version required >= ver 1.5
run check-java.bat file to confirm version
Command line option available
http://www.opencdisc.org/projects/validator/using-opencdisc-validator
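The Java version requirement above can also be checked without the bundled batch file. The following is a minimal sketch, not part of OpenCDISC: it assumes a `java` executable is on the PATH, and the function names are hypothetical. It parses the banner printed by `java -version` and compares it with the 1.5 minimum from this slide.

```python
import re
import subprocess


def parse_java_version(banner):
    """Extract (major, minor) from a `java -version` banner, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2) or 0))


def meets_minimum(version, minimum=(1, 5)):
    """True when a parsed version is at least the required minimum."""
    return version is not None and version >= minimum


def installed_java_version():
    """Run `java -version`; the JVM normally prints its banner to stderr."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return None  # no java executable on the PATH
    return parse_java_version(result.stderr or result.stdout)
```

Calling `meets_minimum(installed_java_version())` gives a yes/no answer on the local machine.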
-
7
What is New in Version 1.4 - 1
Support for SDTM v1.3 and SDTMIG v3.1.3
SDTMIG 3.1.2 Amendment 1 removed; upgraded to
SDTMIG 3.1.3
Controlled Terminology
Added ability to specify CDISC Controlled
Terminology versions
ADaM and SEND validation
Added ability to specify SEND or ADaM validation
types
-
8
What is New in Version 1.4 - 2
New SDTM Validation Rules
Validator 1.3 Rule Count = 227
Validator 1.4 Rule Count = 360
Updates to SDTM, SEND and ADaM rules
Better support for split datasets
Specify MedDRA/CDISC Controlled Terminology
versions
Specify SEND/ADaM validation types
-
9
GUI Screen View - 1
-
10
Directories config
Note: Version 1.4 on the right panel in this and all following slides
-
Directories - data
11
-
Directories cdisc
12
New directory
Not present in ver 1.3
-
Directories cdisc/sdtm
Terminology Directories
13
-
Directories 2012-12-21
14
-
Terminology Files
Updating Terminology files
Create a new directory
Name with terminology version date
Example: 2013-04-12 was added after OpenCDISC Validator ver
1.4 was downloaded
15
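The directory-per-version convention above can be scripted. This is a sketch under assumptions: the base directory is passed in by the caller (the exact location inside the Validator installation is not stated on the slide), and the function name is hypothetical. It only creates a folder named with the terminology version date; copying the terminology files into it remains a manual step.

```python
from datetime import datetime
from pathlib import Path


def add_terminology_version(base, version_date):
    """Create base/<YYYY-MM-DD> and return its path.

    Raises ValueError if version_date is not a valid year-month-day date.
    """
    parsed = datetime.strptime(version_date, "%Y-%m-%d").date()
    target = Path(base) / parsed.isoformat()  # zero-padded folder name
    target.mkdir(parents=True, exist_ok=True)
    return target
```

For example, `add_terminology_version(sdtm_terminology_dir, "2013-04-12")` would create the folder for the April 2013 release next to the existing 2012-12-21 folder, where `sdtm_terminology_dir` is whatever path holds the dated folders in your installation.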
-
Updated Terminology File added!
16
-
MedDRA -1
Same process as Terminology versions
17
-
MedDRA - 2
Add the MedDRA ASCII files for each version to a
directory
18
-
MedDRA versions in the GUI
19
-
Using MedDRA with Define.xml
OpenCDISC Validator is programmed to use the
MedDRA Version that is specified in the define.xml file.
So running validation with a define.xml file will support
checking against that MedDRA version. NOTE: You
cannot override the MedDRA version specified in the
define.xml file by selecting a different version of
MedDRA from the Options menu of the GUI.
If you do not use a define.xml file OR the define file does
not specify the MedDRA version, you can choose a version
from the drop-down list during the Options selection phase
of the GUI.
20
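The precedence rule on this slide (the define.xml version wins; the GUI choice is only a fallback) can be sketched as follows. This is an illustration, not the Validator's own code. It assumes the MedDRA version is declared in define.xml through an `ExternalCodeList` element with `Dictionary` and `Version` attributes, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def meddra_version_from_define(define_xml):
    """Return the MedDRA version declared in a define.xml string, or None."""
    root = ET.fromstring(define_xml)
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # drop any XML namespace
        dictionary = element.get("Dictionary", "")
        if local_name == "ExternalCodeList" and dictionary.upper() == "MEDDRA":
            return element.get("Version")
    return None


def effective_meddra_version(define_xml, gui_choice):
    """Prefer the version in define.xml; fall back to the GUI selection."""
    if define_xml is not None:
        declared = meddra_version_from_define(define_xml)
        if declared:
            return declared  # the GUI choice cannot override this
    return gui_choice
```

The fallback branch covers both cases named on the slide: no define.xml at all, or a define.xml that does not state a MedDRA version.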
-
21
Running OpenCDISC Validator - GUI
Start in the main Validator directory where you
unzipped the files and find the file client.bat.
This is the file that will start the GUI and for most
users will be the easiest way to run validation
checking of CDISC Standards files.
Select client.bat to start the program.
-
22
Client Application - Options
Validate (data) or Generate Define.xml
Standard choices: SDTM, ADaM, Define.xml, SEND, Custom (Tabular)
Source / Input files (select location): SAS V5 Transport or Delimited (CSV, Tab, etc.)
Configuration: depends on Standard choice
Define.xml: optional
Reports: Report settings link, used to customize report filename, etc.
-
23
Viewing the Validation Report - 1
Once the validation is complete you will see a brief
summary of the validation statistics
-
24
Viewing the Validation Report - 2
Once the validation is complete you have three
options
View the validation report - click the View Report
button OR
Start another validation or define generation activity -
click the New Session button OR
Exit the GUI application by clicking File -> Exit or
the red X that closes the window, then confirm the exit.
-
25
Validation Report - 1
At the end of a validation run you can choose to view the
report. When you select this option the spreadsheet file
will open.
You can review the report, save the file in another location, or
attach to an email.
Since version 1.3 the Excel report has been revised to
improve readability.
The issues summary tab is now grouped by domain
The terminology issues are collapsed into instances of distinct
terms. This reduces the size of the details tab.
-
26
Validation Report - 2
Formats
CSV file: unlimited rows
Excel: 4 tabbed worksheets
1. Dataset Summary with error counts: Errors (High), Warnings (Med), Infos (Low)
2. Issue Summary by Rule ID with error counts
3. Details: dataset, variables, values, rules, messages. Contains the complete list of the validation findings sorted by domain name.
4. Rules: Rule ID with message, description and type
Report file name and headers can be input with Protocol, date, time, other text, etc.
Useful to communicate with vendors and internal team
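When the CSV format is used instead of Excel, a Dataset Summary like the first worksheet can be rebuilt with a few lines of code. This is a sketch only: the column names `Domain` and `Severity` are assumptions for illustration and should be replaced with the headers found in your own report file.

```python
import csv
import io
from collections import Counter


def dataset_summary(csv_text, domain_col="Domain", severity_col="Severity"):
    """Count findings per (dataset, severity) from CSV report text.

    The column names are hypothetical defaults, not the documented layout.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[(row[domain_col], row[severity_col])] += 1
    return counts
```

The result maps each dataset and severity pair to a count, which mirrors the Errors / Warnings / Infos columns of the summary tab.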
-
27
Validation Report - 3
The Details worksheet can be filtered and sorted
to identify the individual data items you want to
investigate by dataset or by rule or by type of
check.
There are two characteristics of this list to
highlight that will aid the interpretation of your
results.
-
Validation Report - 4
Unique Row Findings
If the finding is related to a specific row in the data
then a record number will be present. And the specific
variable(s) and value(s) will be identified.
If character values are placed into numeric fields OR
numeric values placed into text fields these will be
treated as unique records in the finding list.
Global Findings
Controlled Terminology findings for a variable that are
present across multiple records will be counted for
each specific term.
28
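The difference between row-level and global findings can be illustrated with a small sketch. It is not the Validator's logic; the dictionary keys `variable` and `value` are hypothetical. It collapses repeated controlled terminology findings into one entry per distinct term, with a count of the records affected.

```python
from collections import Counter


def collapse_by_term(findings):
    """Collapse findings into counts per distinct (variable, value) pair.

    Each finding is a dict with hypothetical 'variable' and 'value' keys.
    """
    return Counter((f["variable"], f["value"]) for f in findings)
```

With three records that all carry the same unmatched VSTESTCD value, the collapsed list holds a single entry with a count of 3 rather than three separate rows.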
-
29
Interpreting the Validation Results - 1
What OpenCDISC can automate for you
Dataset structure: variable names / labels
Data integrity checks: reference to DM subjects, presence of baseline flags, dates after disposition
Results: units consistency
Terminology checks
Referential integrity: start date before end date, disposition references (sometimes)
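Two of the automated checks listed above are simple enough to sketch by hand, which may help when reading the findings. This is an illustration under assumptions, not the Validator's rule implementation: records are plain dictionaries keyed by SDTM variable names, only complete dates (at least YYYY-MM-DD) are compared, and the function names are hypothetical.

```python
from datetime import date


def subjects_missing_from_dm(dm_rows, domain_rows):
    """Return USUBJID values used in a domain but absent from DM."""
    known = {row["USUBJID"] for row in dm_rows}
    return sorted({row["USUBJID"] for row in domain_rows} - known)


def start_after_end(rows, start_var, end_var):
    """Return 1-based record numbers whose start date is after the end date."""
    flagged = []
    for number, row in enumerate(rows, start=1):
        start, end = row.get(start_var) or "", row.get(end_var) or ""
        if len(start) < 10 or len(end) < 10:
            continue  # skip missing or partial dates in this sketch
        if date.fromisoformat(start[:10]) > date.fromisoformat(end[:10]):
            flagged.append(number)
    return flagged
```

For an AE dataset the second check would be called as `start_after_end(ae_rows, "AESTDTC", "AEENDTC")`.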
-
30
Interpreting the Validation Results - 2
What you still have to do
Dataset structure: do you have the right variables per spec? Custom domains
Data integrity: baseline flags (the validator does not see 2 per subject)
Review terminology flags: if a codelist is extensible, is it correct?
Data validation (more content focused): right subjects, right dates, right codelists, right test codes
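The duplicate baseline flag case is one of the manual checks named above, and it is easy to script. The sketch below assumes that at most one baseline record is expected per subject and test code, which should be confirmed against your own specification. The defaults use the Vital Signs variables VSTESTCD and VSBLFL, and the function name is hypothetical.

```python
from collections import Counter


def multiple_baseline_flags(rows, testcd_var="VSTESTCD", flag_var="VSBLFL"):
    """Return {(USUBJID, test code): count} where more than one record is flagged."""
    counts = Counter(
        (row["USUBJID"], row[testcd_var])
        for row in rows
        if row.get(flag_var) == "Y"
    )
    return {key: n for key, n in counts.items() if n > 1}
```

An empty result means no subject has two baseline records for the same test in that dataset.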
-
31
Interpreting the Validation Results - 3
False positive results may occur
Lab tests without units will generate errors SD0026
and SD0029, e.g., urine pH = 5, specific gravity 1.012
Terminology
Extensible codelists with non matches in the terminology file
where the sponsor has added codes / values not present
E.g. Oxygen Saturation as O2SAT VSTESTCD
Some program bugs may generate false error
messages
Post to Open CDISC forum
-
Interpreting the Validation Results - 4
View of Validation Report
32
-
OpenCDISC Forum Feedback on this New Rule! - 1
SD1082 -- variable length is too long for actual data
Graded as an ERROR in the AE domain
Remember: all Errors should be addressed and fixed if
possible prior to eSubmission.
Note that the variable length issue is not an SDTM compliance issue. It is very
specific to the SAS format used for regulatory submission.
33
-
OpenCDISC reply - 2
Defining requirements for variable lengths is a very interesting and
controversial topic. This week we had a productive discussion at the
FDA/PhUSE Data Validation workgroup meeting. There are several
basic and mutually exclusive needs or risks.
1. Datasets should be re-sized (not compressed!) to minimal size
because (very long list of reasons, e.g., data transfer, archiving
storage, analysis tools limitations, hardware limitations, etc.)
2. Variables in SAS XPORT datasets should have consistent/pre-
defined length to avoid data truncation during data integration. There
are many imperfect and incomplete recommendations on how to achieve
this. E.g., --TESTCD variable lengths should always be 8 chars.
3. Variable length should be defined by the data collection process. E.g., it
can be set to the maximum length of a value in your data collection or controlled
terminology codelist.
34
-
Rule change 3 New Release v 1.4.1
Note that the variable length issue is not an SDTM compliance issue.
It is very specific to the SAS format used for regulatory
submission. OpenCDISC Validator will remove all current
Variable Length related checks. Only one new Rule will
be introduced. Before sending data to FDA you need to
re-size your data, variable by variable, to the actual maximum
value length.
Note that this is needed only for the actual data transfer to
FDA. You can do whatever you want before this event.
It is very easy to do.
Profiles will be updated soon. Watch for v1.4.1
OpenCDISC Validator release in the next weeks.
35
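The re-sizing step described in the reply comes down to finding the longest actual value of each character variable. The sketch below shows that calculation only. It assumes records are dictionaries of variable name to value and that text is plain ASCII, so character count equals stored length; the function names are hypothetical, and writing the re-sized transport file is left to your usual tools.

```python
def max_lengths(rows):
    """Return the longest observed length for each character variable."""
    lengths = {}
    for row in rows:
        for name, value in row.items():
            if isinstance(value, str):
                lengths[name] = max(lengths.get(name, 0), len(value))
    return lengths


def oversized(declared, rows):
    """Return {variable: (declared length, actual maximum)} where declared is larger."""
    actual = max_lengths(rows)
    return {
        name: (length, actual[name])
        for name, length in declared.items()
        if name in actual and length > actual[name]
    }
```

A variable declared at length 200 whose longest value has 8 characters would be reported as `(200, 8)`, which is the kind of gap the new rule is meant to flag.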
-
36
Suggested Process for OpenCDISC Validator
Review Setup / Parameters
Confirm SDTM Version
Confirm MedDRA version compared to Data Mgt Plan
Confirm Terminology version
Include the define.xml file if present
Set report parameters
Study Name / Number / Dates / other text / Excel message limit
Run validation
Review Error report
Update Issue Summary tab with Comments
Refer to details tab as needed
Identify / report any new bugs
Consider submitting an update to terminology team
If this is the final dataset for submission to CBER, complete the Word document with validation findings (CBER doc UCM209772).
-
37
Questions?
-
38
Thank you!
David Borbas RN, MIS
Sr. Director, Data Management
Jazz Pharmaceuticals, Inc.
[email protected]. 650-496-2637