www.d-Wise.com
XML in a SAS World
Mike Molter, d-Wise Technologies
Agenda
• What is XML?
• Examples of industry XML standards (schemas)
• SAS tools for working with XML
What is XML?
• eXtensible Markup Language
• Used for structure, storage, and transport of data (w3schools.com)
• Like any other computer language…
  • textual gibberish
  • set of rules (structural, syntax)
  • vocabulary
    • elements
    • attributes
    • tags
    • schemas
<nhl>
  <team name="Red Wings">
    <conference>Eastern</conference>
    <division>Atlantic</division>
    <location>Detroit</location>
  </team>
  <team name="Flames">
    <conference>Western</conference>
    <division>Pacific</division>
    <location>Calgary</location>
  </team>
  <team name="Devils">
    <conference>Eastern</conference>
    <division>Metropolitan</division>
    <location>New Jersey</location>
  </team>
</nhl>
• An XML document is made of elements (nhl, team, conference)
• Elements are marked with a start tag and an end tag (<division>, </division>)
• Elements may be nested within other elements (location is nested within team)
• Elements may have attributes, written in the start tag (the team element has the name attribute)
• An element's value is the text between its start and end tags, excluding any nested elements (Pacific is the value of a division element)
• Each XML document must have exactly one root element (nhl)
What is XML?
• Like any other computer language…
  • textual gibberish
  • set of rules (structural, syntax)
  • vocabulary
    • elements
    • attributes
    • tags
    • schemas
• Unlike other computer languages…
  • no keywords
  • no processor
XML Schema (or standard)
• XML Schema (informal) - A specific set of elements and attributes, along with a set of rules that govern their use, for the purpose of transferring data between systems and developing applications for processing such data.
• An XML schema can be a combination of new elements along with other XML schemas (extensible)
• XML schema file - A well-formed XML file used for enforcing the rules of an XML schema, or validating an XML document.
XML Schema Examples
• NHL (OK, I made this one up)
• XSL (eXtensible Stylesheet Language, .xsl)
  • Transforms XML into something else
• XML schema files (.xsd)
  • Validate an XML document
• XML Spreadsheet 2003 (.xml)
  • Read and displayed by Excel
• ODM, Define, SDS
  • Clinical trials data, metadata
XML in Pharma
• Operational Data Model (ODM)
  • Collected clinical trial data, metadata, administrative data, reference data, audit information
• Define-XML
  • Metadata for submitted data in ODM structure
  • Value-level metadata is in the define extension
• SDS-XML
  • Submission data in ODM structure
XML in Pharma
[Diagram: Collected Data (ODM.XML), Data Transformations (SAS), Data Submission (SDS.XML), Metadata Submission (Define.XML)]
ODM
[Figure: ODM document annotated with Clinical Data, ItemGroup (dataset-level) Metadata, Item (variable-level) Metadata, and Codelist Metadata (allowable values)]
Define-XML
Exporting XML
[Data set: Teams.sas7bdat]
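The export examples all read a TEAMS data set whose contents are not reproduced in the text. A minimal sketch for building one is shown here, using the three teams from the NHL document earlier in the deck. The variable lengths ($20) are an assumption, chosen to match the LENGTH used in the XML map later on.

```sas
/* Sketch: rebuild a TEAMS data set from the values in the NHL example. */
/* The $20 lengths are an assumption, not taken from the original data. */
data teams ;
  length name conference division location $20 ;
  infile datalines dsd dlm=',' ;
  input name conference division location ;
  datalines ;
Red Wings,Eastern,Atlantic,Detroit
Flames,Western,Pacific,Calgary
Devils,Eastern,Metropolitan,New Jersey
;
run ;
```

Running PROC PRINT on TEAMS afterwards is a quick way to confirm the three observations before exporting.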
Exporting XML with the LIBNAME statement
libname xmlout xml 'C:\teams_generic.xml' ;
data xmlout.xteams ;
  set teams ;
run ;
Exporting XML with the LIBNAME statement
libname xmlout xml 'C:\teams_oracle.xml' xmltype=oracle ;
data xmlout.xteams ;
  set teams ;
run ;
Exporting XML with a DATA step
filename xmlout4 'C:\teams_datastep.xml' ;
data _null_ ;
  file xmlout4 ;
  set teams end=thatsit ;
  /* +(-1) backs up over the blank that list-style PUT writes after each variable */
  if _n_ eq 1 then put '<nhl>' ;
  put '<team name="' name +(-1) '">' ;
  put '<conference>' conference +(-1) '</conference>' ;
  put '<division>' division +(-1) '</division>' ;
  put '<location>' location +(-1) '</location>' ;
  put '</team>' ;
  if thatsit then put '</nhl>' ;
run ;
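A hand-written DATA step does not protect against markup characters in the data. A value containing & or < would produce XML that is not well-formed. One possible safeguard, sketched here, is to pass each value through the HTMLENCODE function before writing it. The fileref xmlout5, the output path, and the x-prefixed work variables are hypothetical names introduced for illustration.

```sas
filename xmlout5 'C:\teams_escaped.xml' ;   /* hypothetical output path */

data _null_ ;
  file xmlout5 ;
  set teams end=thatsit ;
  length xname xconf xdiv xloc $200 ;   /* room for expanded entities */
  /* HTMLENCODE replaces &, < and > with entities; 'quot' also encodes " */
  xname = htmlencode(strip(name), 'amp lt gt quot') ;
  xconf = htmlencode(strip(conference)) ;
  xdiv  = htmlencode(strip(division)) ;
  xloc  = htmlencode(strip(location)) ;
  if _n_ eq 1 then put '<nhl>' ;
  put '<team name="' xname +(-1) '">' ;
  put '<conference>' xconf +(-1) '</conference>' ;
  put '<division>' xdiv +(-1) '</division>' ;
  put '<location>' xloc +(-1) '</location>' ;
  put '</team>' ;
  if thatsit then put '</nhl>' ;
run ;
```

The 'quot' option matters only for the attribute value, which sits inside double quotes. Element text needs just the default encoding.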
Exporting XML with the LIBNAME statement or ODS using tagsets
libname xmlout xml 'C:\teams_tagset_libname.xml' tagset=<tagset-name> ;
data xmlout.xteams ;
  set teams ;
run ;
ods markup tagset=<tagset-name> file='C:\teams_tagset_ods.xml' ;
proc print noobs data=teams ;
run ;
ods markup close ;
Exporting XML with ODS using SAS's ExcelXP tagset
ods markup tagset=excelxp file='C:\teams_excel.xml' ;
proc print noobs data=teams ;
run ;
ods markup close ;
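The ExcelXP tagset also accepts options that control the resulting workbook. The sketch below names the worksheet, freezes the header row, and turns on autofilter. The output path and the sheet name are illustrative assumptions. The three options shown are ones the ExcelXP tagset documents, though available options vary by tagset version.

```sas
/* Sketch: the same PROC PRINT export, with a few ExcelXP options.     */
/* File path and sheet name are hypothetical.                          */
ods markup tagset=excelxp file='C:\teams_excel_options.xml'
    options(sheet_name='Teams' frozen_headers='yes' autofilter='all') ;
proc print noobs data=teams ;
run ;
ods markup close ;
```

Specifying options(doc='help') on the ODS statement should write the list of options supported by the installed tagset version to the SAS log.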
References
A SAS Programmer's Guide to Generating Define.xml, SAS Global Forum 2009
ods markup tagset=mydefine file='define.xml' ;
proc print noobs data=meta-dataset1 ; run ;
proc print noobs data=meta-dataset2 ; run ;
proc print noobs data=meta-dataset3 ; run ;
/* etc. */
ods markup close ;
References
Tips and Tricks for Creating Multi-Sheet Microsoft Excel Workbooks, Vince DelGobbo, SAS Global Forum 2009
ODS Markup: The SAS Reports You've Always Dreamed of, Eric Gebhart, SUGI 30
References
ExcelXP on Steroids: Adding Custom Options to the ExcelXP Tagset, SAS Global Forum 2011
ods markup tagset=myexcel file='define.xml' options (tab_color='45') ;
proc print noobs data=dataset1 ; run ;
ods markup close ;
Importing XML
Export:
libname xmlout xml 'C:\teams_generic.xml' ;
data xmlout.xteams ;
  set teams ;
run ;
Import:
data sasteams ;
  set xmlout.xteams ;
run ;
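A quick way to confirm that the export and import round-trip cleanly is to compare the imported copy with the original. The sketch below assumes the TEAMS data set and the C:\teams_generic.xml file from the export step. The libref xmlchk is a hypothetical name, assigned separately and read-only as a precaution so the check does not depend on the state of the export libref.

```sas
/* Sketch: re-read the exported file through a separate, read-only libref */
libname xmlchk xml 'C:\teams_generic.xml' access=readonly ;

data sasteams ;
  set xmlchk.xteams ;
run ;

/* Report any differences in values or variable attributes */
proc compare base=teams compare=sasteams listall ;
run ;

libname xmlchk clear ;
```

Differences in variable length or order are possible after a round trip through XML, so the PROC COMPARE report is worth reading even when the values match.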
NHL.XML
<nhl>
  <team name="Red Wings">
    <conference>Eastern</conference>
    <division>Atlantic</division>
    <location>Detroit</location>
  </team>
  <team name="Flames">
    <conference>Western</conference>
    <division>Pacific</division>
    <location>Calgary</location>
  </team>
  <team name="Devils">
    <conference>Eastern</conference>
    <division>Metropolitan</division>
    <location>New Jersey</location>
  </team>
</nhl>
libname xmlin xml 'C:\teams_nhl.xml' ;
data sasteam ;
  set xmlin.team ;
run ;
[Data set: SASTEAM.SAS7BDAT]
Importing XML with an XML map
• An XML map is an XML schema
• Provides instructions to the XML LIBNAME engine for reading XML
  • Name and label for the data set
  • Which XML elements define observations
  • How to define variables (attributes and values)
• Uses XPath syntax to navigate the XML document and identify its components
filename mymap 'C:\mymap.map' ;
libname xmlin xml 'C:\nhl.xml' xmlmap=mymap ;
/* the member name must match the TABLE name given in the map */
data sasteams ;
  set xmlin.sasteams ;
run ;
Importing XML with an XML map
<?xml version="1.0" encoding="UTF-8"?>
<SXLEMAP version="1.2">
  <!-- Name of data set to be created -->
  <TABLE name="SASTeams">
    <!-- Observation boundary -->
    <TABLE-PATH syntax="XPath">/nhl/team</TABLE-PATH>
    <!-- Variable definition -->
    <COLUMN name="name">
      <PATH syntax="XPath">/nhl/team/@name</PATH>
      <TYPE>character</TYPE>
      <DATATYPE>string</DATATYPE>
      <LENGTH>20</LENGTH>
    </COLUMN>
    <COLUMN name="conference">
      <PATH syntax="XPath">/nhl/team/conference</PATH>
      <TYPE>character</TYPE>
      <DATATYPE>string</DATATYPE>
      <LENGTH>20</LENGTH>
    </COLUMN>
  </TABLE>
</SXLEMAP>
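To try the map end to end, the sketch below writes the NHL document and a four-column version of the map to the WORK directory and then reads them back through the XML engine. The division and location columns are added here by analogy with the two columns on the slide. The macro variable dir, the filerefs nhlxml and nhlmap, and the file names are hypothetical.

```sas
%let dir = %sysfunc(pathname(work)) ;   /* assumption: scratch files go to WORK */
filename nhlxml "&dir/nhl.xml" ;
filename nhlmap "&dir/nhl.map" ;

/* Write the NHL document from the earlier slide */
data _null_ ;
  file nhlxml ;
  put '<nhl>' ;
  put '<team name="Red Wings"><conference>Eastern</conference>'
      '<division>Atlantic</division><location>Detroit</location></team>' ;
  put '<team name="Flames"><conference>Western</conference>'
      '<division>Pacific</division><location>Calgary</location></team>' ;
  put '<team name="Devils"><conference>Eastern</conference>'
      '<division>Metropolitan</division><location>New Jersey</location></team>' ;
  put '</nhl>' ;
run ;

/* Write the map: one observation per team, one column per attribute or child */
data _null_ ;
  file nhlmap ;
  length col cname $10 ;
  put '<?xml version="1.0" encoding="UTF-8"?>' ;
  put '<SXLEMAP version="1.2">' ;
  put '<TABLE name="SASTeams">' ;
  put '<TABLE-PATH syntax="XPath">/nhl/team</TABLE-PATH>' ;
  do col = '@name', 'conference', 'division', 'location' ;
    cname = compress(col, '@') ;        /* column name without the XPath @ */
    put '<COLUMN name="' cname +(-1) '">'
        '<PATH syntax="XPath">/nhl/team/' col +(-1) '</PATH>'
        '<TYPE>character</TYPE><DATATYPE>string</DATATYPE>'
        '<LENGTH>20</LENGTH></COLUMN>' ;
  end ;
  put '</TABLE>' ;
  put '</SXLEMAP>' ;
run ;

/* Read the document through the map; the member name comes from the map */
libname xmlin xml "&dir/nhl.xml" xmlmap="&dir/nhl.map" access=readonly ;
proc print data=xmlin.sasteams noobs ;
run ;
libname xmlin clear ;
```

If the map is read as intended, the printed data set should have one observation per team element, with name coming from the attribute and the other three variables from the child elements.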
XML Mapper
Clinical Standards Toolkit (CST)
• A Base SAS framework for executing clinical data tasks such as verification of data compliance against standards and importing/exporting ODM and Define.xml.
• Contains all necessary files (SAS macros and driver programs, maps, XSL stylesheets)
• Learning curve
Clinical Standards Toolkit (CST)
…or PROC XSL
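For readers who have not used it, PROC XSL applies an XSL stylesheet to an XML document and writes the transformed result to a file. A minimal sketch of a call follows. All three paths are hypothetical, and the stylesheet itself is assumed to exist already. The procedure is available only in more recent SAS 9 releases.

```sas
/* Sketch: transform an XML document with an existing stylesheet. */
/* All three file paths are hypothetical.                         */
proc xsl in='C:\define.xml'
         xsl='C:\define.xsl'
         out='C:\define.html' ;
run ;
```

The same pattern could, in principle, reshape an ODM file into a flatter XML layout that the XML LIBNAME engine can read, provided a suitable stylesheet is written.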
References
• Using the SAS Clinical Standards Toolkit 1.5 to Import CDISC ODM Files, Lex Jansen, PharmaSUG 2013
• Using the SAS Clinical Standards Toolkit for Define.xml Creation, Lex Jansen, PharmaSUG 2011
• Accessing the Metadata from the Define.xml Using XSLT Transformation, Lex Jansen, PhUSE 2010
In Summary…
• Options for Exporting XML
  • XML LIBNAME engine (XMLTYPE=, TAGSET= options)
  • ODS (SAS XML destinations or user-defined tagsets)
  • DATA step
  • XSL stylesheets
  • CST (clinical)
• Options for Importing XML
  • XML LIBNAME engine (XMLTYPE=, TAGSET= options)
  • XML maps
  • XSL stylesheets
  • CST (clinical)