© Assero Limited, 2010
Introduction to define.xml
Dave Iberson-Hurst
27th May 2010
ESUG Webinar
Outline
• Introduction
• Purpose of define.xml
• XML
• How define works
• FAQ
• Q&A
Purpose
• Describes
– What is included within the data
– Where did the data come from
– Derivations, code lists, annotated PDF etc to aid understanding
• Machine Readable
• Human Readable (after processing)
• To aid/inform the reviewer, unambiguous communication
Submission & eCTD
http://www.fda.gov/ForIndustry/DataStandards/StudyDataStandards/default.htm
Revision 2, June 2008
Dark Side of the Moon
<CDCollection>
<CD TotalTime="45.02">
<Artist>Pink Floyd</Artist>
<Title>Dark Side of the Moon</Title>
<Track Label="1a">Speak To Me</Track>
<Track Label="1b">Breathe</Track>
<Track Label="2">On the Run</Track>
<Track Label="3">Time</Track>
<Track Label="4">The Great Gig in the Sky</Track>
<Track Label="5">Money</Track>
<Track Label="6">Us and Them</Track>
<Track Label="7">Any Colour You Like</Track>
<Track Label="8">Brain Damage</Track>
<Track Label="9">Eclipse</Track>
</CD>
</CDCollection>
Labels on the example: Element, Attribute, Structure
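The three labels can be seen by reading the example with a parser. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module and a shortened copy of the track list; the names `CD_XML` and `list_tracks` are illustrative and not part of the presentation.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the CD example (first three tracks only).
CD_XML = """<CDCollection>
  <CD TotalTime="45.02">
    <Artist>Pink Floyd</Artist>
    <Title>Dark Side of the Moon</Title>
    <Track Label="1a">Speak To Me</Track>
    <Track Label="1b">Breathe</Track>
    <Track Label="2">On the Run</Track>
  </CD>
</CDCollection>"""


def list_tracks(xml_text):
    """Return (label, title) pairs for every Track in the collection."""
    root = ET.fromstring(xml_text)           # root element: <CDCollection>
    tracks = []
    for cd in root.findall("CD"):            # structure: CD is nested in CDCollection
        for track in cd.findall("Track"):    # element: <Track>
            # attribute: Label; element text: the track title
            tracks.append((track.get("Label"), track.text))
    return tracks


tracks = list_tracks(CD_XML)
print(tracks)
```

The nesting of the loops follows the structure of the document, the tag names select elements, and `get("Label")` reads an attribute.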
XML Schemas in Simple Terms
• Defines elements, attributes, data types etc. and their relationships
• Provides the specification for an XML document
• Enables validation of XML documents
Transformations
• XSL – Extensible Stylesheet Language
• Used to transform an XML document
• Requires a tool known as an XSLT processor
• Focuses on presentation, while XML focuses on content and structure
[Diagram: an XML Document and an XSL Document (beginning <?xml version="1.0"?> <xsl:stylesheet version="1.0" ...) are fed into the XSLT Processor, which produces a New Document]
Overall Structure
• ODM
  – Study
    – GlobalVariables
    – MetaDataVersion
      – ItemGroupDef - Domains
      – ItemDef - Variables
      – CodeList - Code lists
      – Links and Variable Level
Overall Structure
<ODM
xmlns="http://www.cdisc.org/ns/odm/v1.2"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:def="http://www.cdisc.org/ns/def/v1.0"
xsi:schemaLocation="http://www.cdisc.org/ns/odm/v1.2 define1-0-0.xsd"
FileOID="Study1234"
ODMVersion="1.2"
FileType="Snapshot"
CreationDateTime="2004-07-28T12:34:13-06:00">
<Study OID="1234">
<GlobalVariables>
<StudyName>1234</StudyName>
<StudyDescription>1234 Data Definition</StudyDescription>
<ProtocolName>1234</ProtocolName>
</GlobalVariables>
<MetaDataVersion OID="CDISC.SDTM.3.1.0"
Name="Study 1234, Data Definitions"
Description="Study 1234, Data Definitions"
def:DefineVersion="1.0.0"
def:StandardName="CDISC SDTM"
def:StandardVersion="3.1.0">
... All the content is here ...
</MetaDataVersion>
</Study>
</ODM>
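As a rough illustration of how a program might read this outer structure, the sketch below uses Python's standard `xml.etree.ElementTree` on a trimmed copy of the example. The namespace URIs are copied from the slide; the function name `summarise_define` and the dictionary keys are hypothetical, and the MetaDataVersion content is left empty.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the ODM root element in the example.
ODM_NS = "http://www.cdisc.org/ns/odm/v1.2"
DEF_NS = "http://www.cdisc.org/ns/def/v1.0"
NS = {"odm": ODM_NS, "def": DEF_NS}

# Trimmed copy of the example; MetaDataVersion is left without content.
DEFINE_XML = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.2"
     xmlns:def="http://www.cdisc.org/ns/def/v1.0"
     FileOID="Study1234" ODMVersion="1.2" FileType="Snapshot"
     CreationDateTime="2004-07-28T12:34:13-06:00">
  <Study OID="1234">
    <GlobalVariables>
      <StudyName>1234</StudyName>
      <StudyDescription>1234 Data Definition</StudyDescription>
      <ProtocolName>1234</ProtocolName>
    </GlobalVariables>
    <MetaDataVersion OID="CDISC.SDTM.3.1.0"
        Name="Study 1234, Data Definitions"
        def:DefineVersion="1.0.0"
        def:StandardName="CDISC SDTM"
        def:StandardVersion="3.1.0"/>
  </Study>
</ODM>"""


def summarise_define(xml_text):
    """Collect a few identifying values from the ODM / Study / MetaDataVersion levels."""
    root = ET.fromstring(xml_text)
    study = root.find("odm:Study", NS)
    mdv = study.find("odm:MetaDataVersion", NS)
    return {
        # Unprefixed attributes are in no namespace.
        "file_oid": root.get("FileOID"),
        "study_name": study.findtext(
            "odm:GlobalVariables/odm:StudyName", namespaces=NS
        ),
        # def: attributes must be addressed with their full namespace URI.
        "standard": mdv.get("{%s}StandardName" % DEF_NS),
        "standard_version": mdv.get("{%s}StandardVersion" % DEF_NS),
    }


summary = summarise_define(DEFINE_XML)
print(summary)
```

Elements inherit the default ODM namespace, while the define-specific attributes carry the `def:` prefix, which is why the two are looked up differently.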
Domain Meta Data
• Dataset Name – 2 character prefix
• Description – The description for the domain
• Location – Folder and filename
• Structure – level of detail provided
• Purpose – Purpose
• Key Fields – Used to identify and index records
Domain Meta Data
<ItemGroupDef OID="DM"
Name="DM" Repeating="No"
IsReferenceData="No"
Purpose="Tabulation"
def:Label="Demographics"
def:Structure="One record per event per subject"
def:DomainKeys="STUDYID, USUBJID"
def:Class="Special Purpose"
def:ArchiveLocationID="Location.DM">
<ItemRef ItemOID="STUDYID"
OrderNumber="1" Mandatory="Yes" Role="Identifier"/>
<ItemRef ItemOID="DOMAIN"
OrderNumber="2" Mandatory="Yes" Role="Identifier"/>
<ItemRef ItemOID="USUBJID"
OrderNumber="3" Mandatory="Yes" Role="Identifier"/>
... More itemRefs Here ...
</ItemGroupDef>
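The domain metadata listed earlier maps directly onto attributes of ItemGroupDef and its ItemRef children. A minimal sketch of reading them, assuming Python's standard `xml.etree.ElementTree`; the namespace declarations are added to the fragment so it parses on its own, and `describe_domain` is an illustrative name.

```python
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.2"
DEF_NS = "http://www.cdisc.org/ns/def/v1.0"
NS = {"odm": ODM_NS}

# Copy of the DM example with only the three ItemRefs shown on the slide.
# The xmlns declarations are added here so the fragment is a complete document.
ITEM_GROUP_XML = """<ItemGroupDef xmlns="http://www.cdisc.org/ns/odm/v1.2"
    xmlns:def="http://www.cdisc.org/ns/def/v1.0"
    OID="DM" Name="DM" Repeating="No" IsReferenceData="No" Purpose="Tabulation"
    def:Label="Demographics"
    def:DomainKeys="STUDYID, USUBJID"
    def:ArchiveLocationID="Location.DM">
  <ItemRef ItemOID="STUDYID" OrderNumber="1" Mandatory="Yes" Role="Identifier"/>
  <ItemRef ItemOID="DOMAIN" OrderNumber="2" Mandatory="Yes" Role="Identifier"/>
  <ItemRef ItemOID="USUBJID" OrderNumber="3" Mandatory="Yes" Role="Identifier"/>
</ItemGroupDef>"""


def describe_domain(xml_text):
    """Return the name, label, key fields and ordered variable OIDs of a domain."""
    group = ET.fromstring(xml_text)
    refs = sorted(
        group.findall("odm:ItemRef", NS),
        key=lambda ref: int(ref.get("OrderNumber")),
    )
    return {
        "name": group.get("Name"),
        "label": group.get("{%s}Label" % DEF_NS),
        # def:DomainKeys is a comma-separated list in the example.
        "keys": [k.strip() for k in group.get("{%s}DomainKeys" % DEF_NS).split(",")],
        "variables": [ref.get("ItemOID") for ref in refs],
    }


domain = describe_domain(ITEM_GROUP_XML)
print(domain)
```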
Variable Meta Data
• Variable Name – 8 character name
• Variable Description – The description
• Type – Character String or Numeric
• Format – Identifies controlled terminology or presentation
• Origin – Indicator of variable origin – CRF or Derived
• Role – How variable is used within a dataset (ID, Topic, Timing, Qualifier)
• Comments – Used by sponsor to assist reviewer in interpreting the data
• Label – Variable Label
• References – Computational Method, Code Lists & Value Lists
Variable Meta Data
<ItemDef OID="DOMAIN"
Name="DOMAIN"
DataType="text"
Length="2"
Origin="CRF Page"
Comment="DOMAIN ABBREVIATION"
def:Label="DOMAIN ABBREVIATION">
</ItemDef>
<ItemDef OID="STUDYID"
Name="STUDYID"
DataType="text"
Length="8"
Origin="CRF Page"
Comment="Demographics CRF Page 4"
def:Label="STUDY IDENTIFIER">
</ItemDef>
<ItemDef OID="SUBJID"
Name="SUBJID"
DataType="text"
Length="60"
Origin="CRF Page"
Comment="Demographics CRF Page 4"
def:Label="SUBJECT IDENTIFIER">
</ItemDef>
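Each ItemRef in a domain points, by OID, at one of these ItemDef elements. The sketch below builds a lookup table keyed on OID, assuming Python's standard `xml.etree.ElementTree`; the enclosing MetaDataVersion wrapper is added so the three definitions form one parseable document, and `variable_table` is an illustrative name.

```python
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.2"
DEF_NS = "http://www.cdisc.org/ns/def/v1.0"
NS = {"odm": ODM_NS}

# The three ItemDefs from the example, wrapped in a MetaDataVersion element
# (the wrapper and its xmlns declarations are added for this sketch).
ITEM_DEFS_XML = """<MetaDataVersion xmlns="http://www.cdisc.org/ns/odm/v1.2"
    xmlns:def="http://www.cdisc.org/ns/def/v1.0">
  <ItemDef OID="DOMAIN" Name="DOMAIN" DataType="text" Length="2"
           Origin="CRF Page" Comment="DOMAIN ABBREVIATION"
           def:Label="DOMAIN ABBREVIATION"/>
  <ItemDef OID="STUDYID" Name="STUDYID" DataType="text" Length="8"
           Origin="CRF Page" Comment="Demographics CRF Page 4"
           def:Label="STUDY IDENTIFIER"/>
  <ItemDef OID="SUBJID" Name="SUBJID" DataType="text" Length="60"
           Origin="CRF Page" Comment="Demographics CRF Page 4"
           def:Label="SUBJECT IDENTIFIER"/>
</MetaDataVersion>"""


def variable_table(xml_text):
    """Map each ItemDef OID to its name, data type, length, origin and label."""
    mdv = ET.fromstring(xml_text)
    table = {}
    for item in mdv.findall("odm:ItemDef", NS):
        table[item.get("OID")] = {
            "name": item.get("Name"),
            "type": item.get("DataType"),
            "length": int(item.get("Length")),
            "origin": item.get("Origin"),
            "label": item.get("{%s}Label" % DEF_NS),
        }
    return table


variables = variable_table(ITEM_DEFS_XML)
for oid, info in variables.items():
    print(oid, info)
```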
Variable Meta Data
<ItemDef OID="VS.VSTESTCD.FRAME"
Name="FRAME"
DataType="float"
Length="8"
SignificantDigits="1"
Origin="CRF Page"
Comment="Vital Signs CRF Page 4"
def:Label="Frame">
<CodeListRef CodeListOID="FRAME"/>
</ItemDef>
<CodeList OID="FRAME" Name="FRAME" DataType="text">
<CodeListItem CodedValue="S">
<Decode><TranslatedText xml:lang="en">Small</TranslatedText></Decode>
</CodeListItem>
<CodeListItem CodedValue="M">
<Decode><TranslatedText xml:lang="en">Medium</TranslatedText></Decode>
</CodeListItem>
<CodeListItem CodedValue="L">
<Decode><TranslatedText xml:lang="en">Large</TranslatedText></Decode>
</CodeListItem>
<CodeListItem CodedValue="XL">
<Decode><TranslatedText xml:lang="en">Extra large</TranslatedText></Decode>
</CodeListItem>
</CodeList>
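The link from a variable to its code list is the CodeListRef, whose CodeListOID matches the OID of a CodeList. A minimal sketch of following that link to decode a value, assuming Python's standard `xml.etree.ElementTree`; the MetaDataVersion wrapper is added for the sketch, most ItemDef attributes are omitted, and `decode` is an illustrative name.

```python
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.2"
NS = {"odm": ODM_NS}

# ItemDef and CodeList from the example; the wrapper element is added here and
# ItemDef attributes other than OID and Name are omitted for brevity.
CODE_LIST_XML = """<MetaDataVersion xmlns="http://www.cdisc.org/ns/odm/v1.2">
  <ItemDef OID="VS.VSTESTCD.FRAME" Name="FRAME">
    <CodeListRef CodeListOID="FRAME"/>
  </ItemDef>
  <CodeList OID="FRAME" Name="FRAME" DataType="text">
    <CodeListItem CodedValue="S">
      <Decode><TranslatedText xml:lang="en">Small</TranslatedText></Decode>
    </CodeListItem>
    <CodeListItem CodedValue="M">
      <Decode><TranslatedText xml:lang="en">Medium</TranslatedText></Decode>
    </CodeListItem>
    <CodeListItem CodedValue="L">
      <Decode><TranslatedText xml:lang="en">Large</TranslatedText></Decode>
    </CodeListItem>
    <CodeListItem CodedValue="XL">
      <Decode><TranslatedText xml:lang="en">Extra large</TranslatedText></Decode>
    </CodeListItem>
  </CodeList>
</MetaDataVersion>"""


def decode(xml_text, item_oid, coded_value):
    """Follow ItemDef -> CodeListRef -> CodeList and return the decoded text."""
    mdv = ET.fromstring(xml_text)
    items = {i.get("OID"): i for i in mdv.findall("odm:ItemDef", NS)}
    code_lists = {c.get("OID"): c for c in mdv.findall("odm:CodeList", NS)}

    ref = items[item_oid].find("odm:CodeListRef", NS)
    code_list = code_lists[ref.get("CodeListOID")]
    for entry in code_list.findall("odm:CodeListItem", NS):
        if entry.get("CodedValue") == coded_value:
            return entry.findtext("odm:Decode/odm:TranslatedText", namespaces=NS)
    return None  # coded value is not in the code list


frame_xl = decode(CODE_LIST_XML, "VS.VSTESTCD.FRAME", "XL")
print(frame_xl)
```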
Value Level Meta Data
• SDS Version 3 makes use of "Tall Skinny" structure. Findings domains consist of
– Test/Result pairs (xxTESTCD/xxORRES)
• Interpretation of information in the Results depends on the value of xxTESTCD
• Results for different tests may have different data types, formats, labels, etc
Value Level Meta Data
<def:ValueListDef OID="ValueList.VS.VSTESTCD">
<ItemRef ItemOID="VS.VSTESTCD.FRAME"
OrderNumber="10" Mandatory="No"/>
<ItemRef ItemOID="VS.VSTESTCD.HTRAW"
OrderNumber="11" Mandatory="No"/>
<ItemRef ItemOID="VS.VSTESTCD.WTRAW"
OrderNumber="12" Mandatory="No"/>
<ItemRef ItemOID="VS.VSTESTCD.MEANBP"
OrderNumber="13" Mandatory="No"/>
</def:ValueListDef>
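A value list is read in the same way as a domain: each ItemRef names the ItemDef that describes the result for one test. The sketch below lists the references in order, assuming Python's standard `xml.etree.ElementTree`. Taking the test code from the last part of the ItemOID is only a reading of the naming pattern in this example, not a rule stated in the presentation, and `value_level_items` is an illustrative name.

```python
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.2"
NS = {"odm": ODM_NS}

# Copy of the example; xmlns declarations are added so the fragment parses alone.
VALUE_LIST_XML = """<def:ValueListDef xmlns="http://www.cdisc.org/ns/odm/v1.2"
    xmlns:def="http://www.cdisc.org/ns/def/v1.0"
    OID="ValueList.VS.VSTESTCD">
  <ItemRef ItemOID="VS.VSTESTCD.FRAME" OrderNumber="10" Mandatory="No"/>
  <ItemRef ItemOID="VS.VSTESTCD.HTRAW" OrderNumber="11" Mandatory="No"/>
  <ItemRef ItemOID="VS.VSTESTCD.WTRAW" OrderNumber="12" Mandatory="No"/>
  <ItemRef ItemOID="VS.VSTESTCD.MEANBP" OrderNumber="13" Mandatory="No"/>
</def:ValueListDef>"""


def value_level_items(xml_text):
    """Return (test code, ItemOID) pairs in OrderNumber order."""
    value_list = ET.fromstring(xml_text)
    refs = sorted(
        value_list.findall("odm:ItemRef", NS),
        key=lambda ref: int(ref.get("OrderNumber")),
    )
    pairs = []
    for ref in refs:
        item_oid = ref.get("ItemOID")
        # Assumption: the text after the last "." is the xxTESTCD value.
        test_code = item_oid.rsplit(".", 1)[-1]
        pairs.append((test_code, item_oid))
    return pairs


value_items = value_level_items(VALUE_LIST_XML)
print(value_items)
```

Each ItemOID returned here would then be looked up among the ItemDef elements to find the data type, label and code list that apply to that test.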
Additional Information
• Annotated CRF – Link to file containing annotated CRF
• See the draft Meta Data Guidelines at http://www.cdisc.org/msg-draft
Annotated CRF
<def:AnnotatedCRF>
<def:DocumentRef leafID="blankcrf"/>
</def:AnnotatedCRF>
<def:leaf ID="blankcrf" xlink:href="blankcrf.pdf">
<def:title>Annotated Case Report Form</def:title>
</def:leaf>
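The annotated CRF is found by matching the leafID of the DocumentRef against the ID of a leaf. A minimal sketch of resolving that link to a title and file name, assuming Python's standard `xml.etree.ElementTree`; the MetaDataVersion wrapper and its namespace declarations are added for the sketch, and `annotated_crf` is an illustrative name.

```python
import xml.etree.ElementTree as ET

DEF_NS = "http://www.cdisc.org/ns/def/v1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"
NS = {"def": DEF_NS}

# The fragment from the example, wrapped so that it is a complete document.
CRF_XML = """<MetaDataVersion xmlns="http://www.cdisc.org/ns/odm/v1.2"
    xmlns:def="http://www.cdisc.org/ns/def/v1.0"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <def:AnnotatedCRF>
    <def:DocumentRef leafID="blankcrf"/>
  </def:AnnotatedCRF>
  <def:leaf ID="blankcrf" xlink:href="blankcrf.pdf">
    <def:title>Annotated Case Report Form</def:title>
  </def:leaf>
</MetaDataVersion>"""


def annotated_crf(xml_text):
    """Return (title, href) of the leaf referenced by def:AnnotatedCRF."""
    mdv = ET.fromstring(xml_text)
    leaves = {leaf.get("ID"): leaf for leaf in mdv.findall("def:leaf", NS)}
    ref = mdv.find("def:AnnotatedCRF/def:DocumentRef", NS)
    leaf = leaves[ref.get("leafID")]
    title = leaf.findtext("def:title", namespaces=NS)
    href = leaf.get("{%s}href" % XLINK_NS)  # xlink:href attribute
    return title, href


crf = annotated_crf(CRF_XML)
print(crf)
```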
Define is an ODM Extension?
• Define.xml is built from the components used by CDISC to build the Operational Data Model (ODM)
• The ODM is used to transport Case Report Form (CRF) data
• Define.xml is used to transport tabulation metadata
• They are quite different use cases
Define is Machine Readable?
• Define.xml is built using XML technology
• A computer can consume and process (and understand) the information within the define.xml file
Define is Human Readable?
• As we said, define.xml is built using XML technology
• A computer can consume and process (and understand) the information within the define.xml file
• But using style sheet technology we can also transform the XML into a form that humans can understand
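To give a feel for what such a transformation produces, the sketch below turns two variable definitions into an HTML table. It is written in plain Python with the standard `xml.etree.ElementTree` and `html` modules rather than as an XSLT style sheet, so it only imitates the result of the style sheet approach described above; the column headings and the name `render_html` are illustrative.

```python
import html
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.2"
DEF_NS = "http://www.cdisc.org/ns/def/v1.0"
NS = {"odm": ODM_NS}

# Two ItemDefs from the variable metadata example, in an added wrapper element.
ITEM_DEFS_XML = """<MetaDataVersion xmlns="http://www.cdisc.org/ns/odm/v1.2"
    xmlns:def="http://www.cdisc.org/ns/def/v1.0">
  <ItemDef OID="DOMAIN" Name="DOMAIN" DataType="text" Length="2"
           def:Label="DOMAIN ABBREVIATION"/>
  <ItemDef OID="STUDYID" Name="STUDYID" DataType="text" Length="8"
           def:Label="STUDY IDENTIFIER"/>
</MetaDataVersion>"""


def render_html(xml_text):
    """Render the ItemDefs as an HTML table, one row per variable."""
    mdv = ET.fromstring(xml_text)
    lines = ["<tr><th>Variable</th><th>Type</th><th>Length</th><th>Label</th></tr>"]
    for item in mdv.findall("odm:ItemDef", NS):
        cells = [
            item.get("Name"),
            item.get("DataType"),
            item.get("Length"),
            item.get("{%s}Label" % DEF_NS),
        ]
        # Escape each value so that it is safe to place inside HTML.
        row = "".join("<td>%s</td>" % html.escape(c or "") for c in cells)
        lines.append("<tr>" + row + "</tr>")
    return "<table>\n" + "\n".join(lines) + "\n</table>"


page = render_html(ITEM_DEFS_XML)
print(page)
```

The same metadata that a program reads attribute by attribute is presented here as a table a reviewer can scan.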
Tools
• OpenCDISC
– Validator
– http://www.opencdisc.org/
• XML4Pharma
– CDISC Define.xml Checker
– http://www.xml4pharma.com/CDISC_Define_Checker/index.html
• SAS tool set
– http://www.sas.com/industry/pharma/cdisc/
• Formedix
– Origin Submission Modeller
– http://www.formedix.com/cms/index.php?option=com_content&task=view&id=28&Itemid=53
• Entimo
– entimICE DARE
– http://www.entimo.com/solution/entimICE_DARE.html
• Octagon
– Checkpoint
– http://www.octagonresearch.com/checkpoint-data-validation.html
CDISC Plans
• New release of define Q3/Q4 2010
• Support ADaM metadata
• Support SDTM V3.1.2
Purpose
• Describes
– What is included within the data
– Where did the data come from
– Derivations, code lists, annotated PDF etc to aid understanding
• Machine Readable
• Human Readable (after processing)
• To aid/inform the reviewer, unambiguous communication
Q&A