Mapping Corporate Data Standards to the CDISC Model

David Parker, AstraZeneca UK Ltd,

Manchester, UK

Introduction

Discuss CDISC
– Case Report Tabulation Data Definition Specification (define.xml)
– Study Data Tabulation Model (SDTM)
– Operational Data Model (ODM)
and mapping strategies from existing standards

• Why do we need to map?
• Could we make more use of XML?

Standards in the Industry

• Need for standards (compare the past with the vision of the future)

• Use of SAS
• Use of XML

Discussing Implementation Options

Kenny & Litzsinger, PharmaSUG 2005

• Parallel Method: DBMS Extract → SDTM Domains; DBMS Extract → Analysis Datasets
• Retrospective Development: DBMS Extract → Analysis Datasets → SDTM Domains
• Linear Method: DBMS Extract → SDTM Domains → Analysis Datasets
• Hybrid Method: DBMS Extract → SDTM Draft Domains → Analysis Datasets → SDTM Final Domains

Parallel

[Diagram labels: Database, ETL, SDTM Domains, Analysis Data Sets, Algorithms]

Advantages

• SDTM and ADaM independent
• SDTM created at time of submission
• Parallel project teams
• Minimum re-engineering of existing processes

Disadvantages

• Documentation differs for each dataset, which decreases efficiency
• ADaM derived variables do not reference SDTM
• Regulators do not have original extracts for ADaM derived variables
• Analysis programs submitted to regulators do not point to raw data
• Validation between SDTM and ADaM needed

Retrospective

[Diagram labels: Database, Analysis Data Sets, SDTM, RAW CRTs, ETL, Algorithms]

Advantages

• SDTM created at time of submission
• Not affected by enhancements/new releases of SDTM

Disadvantages

• Regulators do not have original extracts for ADaM derived variables
• Analysis programs submitted to regulators do not point to raw data
• Date imputations have to be undone to match the SDTM data standard
• All CRF variables in the SDTM would have to be retained in the analysis datasets
• Validation between SDTM and source needed
• Analysis datasets needed before SDTM

Linear

[Diagram labels: Database, RAW CRTs, SDTM, Analysis Data Sets, ETL, Algorithms]

Advantages

• Analysis programs utilise the SDTM
• SDTM domains facilitate standardisation of analysis datasets
• Logical flow of software development (A to B to C)

Disadvantages

• Development of analysis datasets depends on completion of SDTM
• SDTM domains created for all studies, regardless of whether they are part of a submission
• Potential outsourcing problems

Hybrid

[Diagram labels: Database, RAW CRTs, Draft SDTM, Final SDTM, ETL, Algorithms]

Advantages

• SDTM created at the time of submission
• SDTM domains facilitate standardisation of analysis datasets
• Analysis programs submitted to the Agency take SDTM as input, so they are usable and informative to the reviewer
• Baseline and population flags are created consistently between SDTM and ADaM

Disadvantages

• Development of analysis datasets depends on completion of SDTM
• SDTM domains created for all studies, regardless of whether they are part of a submission
• Potential outsourcing problems

SDTM Modelling

Modelling a Mapping Process (common issues in mapping dummy corporate standards to CDISC standards)

• Character variables defined as numeric and vice versa
• Variables collected without an obvious corresponding domain in the CDISC SDTM mapping, so they must go into SUPPQUAL
• Several corporate modules that map to one corresponding domain in CDISC SDTM
• Dictionary codes not in the SDTM parent module, so if needed they must be collected in SUPPQUAL
• Core SDTM is a subset of the existing corporate standards
• Different structure of data, e.g. baseline flag in the Lab CDISC domain; Vitals horizontal vs vertical
• Additional metadata needed to describe the source in SUPPQUAL; metadata also needed for laboratory data standardization
• Dates: combining dates and times; partial dates
• Data collapsing issues, e.g. Adverse Events and Concomitant Medications; Adverse Events maximum intensity
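Two of the issues above, date handling and horizontal versus vertical vitals, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `to_dtc` and `vitals_to_vertical`, the corporate column names `HR_SUPINE` and `SBP_SUPINE`, and the test code `SBPSUP` are invented here. `VSTESTCD` and `VSORRES` are standard SDTM vital signs variable names, and partial dates are written by truncating the ISO 8601 string at the first missing component.

```python
from typing import Optional


def to_dtc(year: Optional[int], month: Optional[int] = None,
           day: Optional[int] = None, time: Optional[str] = None) -> str:
    """Build an ISO 8601 string, truncating at the first missing component."""
    if year is None:
        return ""
    dtc = f"{year:04d}"
    if month is None:
        return dtc
    dtc += f"-{month:02d}"
    if day is None:
        return dtc
    dtc += f"-{day:02d}"
    # Append the time only when the full date is known.
    return f"{dtc}T{time}" if time else dtc


def vitals_to_vertical(row: dict, id_vars: tuple, test_codes: dict) -> list:
    """Turn one horizontal row (one column per test) into one record per test."""
    ids = {name: row[name] for name in id_vars}
    records = []
    for column, testcd in test_codes.items():
        if row.get(column) is not None:
            records.append({**ids, "VSTESTCD": testcd, "VSORRES": row[column]})
    return records


# Hypothetical corporate-standard row; the column names are invented.
row = {"USUBJID": "18", "VISITNUM": 1, "HR_SUPINE": 100, "SBP_SUPINE": None}
vertical = vitals_to_vertical(row, ("USUBJID", "VISITNUM"),
                              {"HR_SUPINE": "HRSUP", "SBP_SUPINE": "SBPSUP"})

print(to_dtc(2005, 7, 19, "08:30"))  # full date and time
print(to_dtc(2005, 7))               # partial date: day unknown
print(vertical)
```

Missing results are skipped rather than emitted as empty records; whether that is appropriate depends on the corporate standard being mapped.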

Operational Data Model (ODM)

ODM used for Data Interchange

Map Data from SAS to the ODM structure

Use of XSLT Stylesheet to subset the XML dataset
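The slides use an XSLT stylesheet for subsetting. As a rough stand-in (not XSLT, and not the author's stylesheet), the same idea can be sketched with Python's standard `xml.etree.ElementTree`: keep only the records whose field matches a value. The root element name `dataroot`, the sample records, and the function name `subset` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical dataset-style XML: one child element per record.
SOURCE = """<dataroot>
  <VIT><USUBJID>18</USUBJID><VISITNUM>1</VISITNUM><VSTESTCD>HRSUP</VSTESTCD></VIT>
  <VIT><USUBJID>18</USUBJID><VISITNUM>2</VISITNUM><VSTESTCD>HRSUP</VSTESTCD></VIT>
  <VIT><USUBJID>19</USUBJID><VISITNUM>1</VISITNUM><VSTESTCD>HRSUP</VSTESTCD></VIT>
</dataroot>"""


def subset(xml_text: str, record_tag: str, field: str, value: str) -> ET.Element:
    """Return a new root holding only the records whose `field` equals `value`."""
    root = ET.fromstring(xml_text)
    result = ET.Element(root.tag)
    for record in root.findall(record_tag):
        if record.findtext(field) == value:
            result.append(record)
    return result


subject_18 = subset(SOURCE, "VIT", "USUBJID", "18")
print(len(subject_18))                               # number of records kept
print([r.findtext("VISITNUM") for r in subject_18])  # visits for that subject
```

An XSLT stylesheet would express the same filter declaratively with a template that matches the wanted records and copies them to the output.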

XML file of SAS Data

SDTM view of the Data

Global Ordering of Elements: Case Report Tabulation Data Definition Specification (define.xml), Standards Version 1.0.0 (CDISC, page 44 of 45)

• ODM
• Study
• GlobalVariables
• MetaDataVersion
• ItemGroupDef
  – ItemRef
  – def:leaf
• def:title
  – def:ValueListRef
• ItemDef
  – CodeListRef
• CodeList
  – CodeListItem
• Decode
  – TranslatedText
• ExternalCodeList

ItemGroupDef

<xs:element name="ItemGroupDef" type="ODMcomplexTypeDefinition-ItemGroupDef"/>

<xs:complexType name="ODMcomplexTypeDefinition-ItemGroupDef">
  <xs:sequence>
    <xs:element ref="ItemRef" maxOccurs="unbounded"/>
    <xs:element ref="Alias" minOccurs="0" maxOccurs="unbounded"/>
    <xs:group ref="ItemGroupDefElementExtension" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
  <xs:attributeGroup ref="ItemGroupDefAttributeDefinition"/>
  <xs:attributeGroup ref="ItemGroupDefAttributeExtension"/>
</xs:complexType>

ItemGroupDefAttributeDefinition

<xs:attributeGroup name="ItemGroupDefAttributeDefinition">
  <xs:attribute name="OID" type="oid" use="required"/>
  <xs:attribute name="Name" type="name" use="required"/>
  <xs:attribute name="Repeating" type="YesOrNo" use="required"/>
  <xs:attribute name="IsReferenceData" type="YesOrNo"/>
  <xs:attribute name="SASDatasetName" type="sasName"/>
  <xs:attribute name="Domain" type="text"/>
  <xs:attribute name="Origin" type="text"/>
  <xs:attribute name="Role" type="name"/>
  <xs:attribute name="Purpose" type="text"/>
  <xs:attribute name="Comment" type="text"/>
</xs:attributeGroup>
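The attribute group above marks `OID`, `Name` and `Repeating` as required and types two attributes as `YesOrNo`. A lightweight pre-check of those two rules can be sketched as follows. It is not a substitute for validating against the schema itself; the function name `check_item_group_def` is hypothetical, and the sketch assumes `YesOrNo` admits only the literal values `Yes` and `No`.

```python
import xml.etree.ElementTree as ET

# Taken from the attribute group above: attributes with use="required",
# and the attributes typed YesOrNo.
REQUIRED = ("OID", "Name", "Repeating")
YES_OR_NO = ("Repeating", "IsReferenceData")


def check_item_group_def(element: ET.Element) -> list:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    for name in REQUIRED:
        if name not in element.attrib:
            problems.append(f"missing required attribute {name}")
    for name in YES_OR_NO:
        value = element.get(name)
        # Assumption: YesOrNo allows only the literals "Yes" and "No".
        if value is not None and value not in ("Yes", "No"):
            problems.append(f"{name} must be Yes or No, got {value!r}")
    return problems


good = ET.fromstring('<ItemGroupDef OID="DM" Name="DM" Repeating="No"/>')
bad = ET.fromstring('<ItemGroupDef OID="VS" Repeating="Maybe"/>')
print(check_item_group_def(good))
print(check_item_group_def(bad))
```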

ItemGroupDefAttributeDefinition

<ItemGroupDef>
  <ID>1</ID>
  <OID>Demographics</OID>
  <Name>DM</Name>
  <Repeating>No</Repeating>
  <IsReferenceData>No</IsReferenceData>
  <Purpose>Tabulation</Purpose>
  <def_x003A_Label>Demographics</def_x003A_Label>
  <def_x003A_Structure>One record per subject</def_x003A_Structure>
  <def_x003A_DomainKeys>STUDYID,USUBJID</def_x003A_DomainKeys>
  <def_x003A_Class>Special Purpose Domains</def_x003A_Class>
  <def_x003A_ArchiveLocation>DM.XPT</def_x003A_ArchiveLocation>
</ItemGroupDef>
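The record above holds each property as a child element, whereas the schema on the previous slide declares them as attributes, and `_x003A_` is an escaped colon (hex 3A) standing in for the `def:` prefix. One possible way to move from the element form to the attribute form is sketched below. The namespace URI, the decision to drop `ID` (assumed here to be a key added by the export tool), and the function name `to_attribute_form` are all assumptions, and the result is not claimed to be a complete or valid define.xml fragment.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the define.xml 1.0 "def" extension.
DEF_NS = "http://www.cdisc.org/ns/def/v1.0"
ET.register_namespace("def", DEF_NS)

ESCAPED_PREFIX = "def_x003A_"  # "def:" with the colon escaped as _x003A_

EXPORTED = (
    "<ItemGroupDef><ID>1</ID><OID>Demographics</OID><Name>DM</Name>"
    "<Repeating>No</Repeating><Purpose>Tabulation</Purpose>"
    "<def_x003A_Label>Demographics</def_x003A_Label>"
    "<def_x003A_ArchiveLocation>DM.XPT</def_x003A_ArchiveLocation>"
    "</ItemGroupDef>"
)


def to_attribute_form(xml_text: str, drop: tuple = ("ID",)) -> ET.Element:
    """Rebuild an element so that its child elements become attributes."""
    source = ET.fromstring(xml_text)
    target = ET.Element(source.tag)
    for child in source:
        if child.tag in drop:
            continue  # assumed export key, not declared in the schema
        if child.tag.startswith(ESCAPED_PREFIX):
            # Restore the namespaced name behind the escaped prefix.
            name = f"{{{DEF_NS}}}{child.tag[len(ESCAPED_PREFIX):]}"
        else:
            name = child.tag
        target.set(name, child.text or "")
    return target


item_group = to_attribute_form(EXPORTED)
print(ET.tostring(item_group, encoding="unicode"))
```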

ItemGroupDefDefinition

ITEMDEF

<xs:element name="ItemDef" type="ODMcomplexTypeDefinition-ItemDef"/>

<xs:complexType name="ODMcomplexTypeDefinition-ItemDef">
  <xs:sequence>
    <xs:element ref="Question" minOccurs="0"/>
    <xs:element ref="ExternalQuestion" minOccurs="0"/>
    <xs:element ref="MeasurementUnitRef" minOccurs="0" maxOccurs="unbounded"/>
    <xs:element ref="RangeCheck" minOccurs="0" maxOccurs="unbounded"/>
    <xs:element ref="CodeListRef" minOccurs="0"/>
    <xs:element ref="Role" minOccurs="0" maxOccurs="unbounded"/>
    <xs:element ref="Alias" minOccurs="0" maxOccurs="unbounded"/>
    <xs:group ref="ItemDefElementExtension" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
  <xs:attributeGroup ref="ItemDefAttributeDefinition"/>
  <xs:attributeGroup ref="ItemDefAttributeExtension"/>
</xs:complexType>

ItemDefAttributeDefinition

<xs:attributeGroup name="ItemDefAttributeDefinition">
  <xs:attribute name="OID" type="oid" use="required"/>
  <xs:attribute name="Name" type="name" use="required"/>
  <xs:attribute name="DataType" type="DataType" use="required"/>
  <xs:attribute name="Length" type="integer"/>
  <xs:attribute name="SignificantDigits" type="integer"/>
  <xs:attribute name="SASFieldName" type="sasName"/>
  <xs:attribute name="SDSVarName" type="sasName"/>
  <xs:attribute name="Origin" type="text"/>
  <xs:attribute name="Comment" type="text"/>
</xs:attributeGroup>

<ItemDef>
  <ItemOID>VISITNUM</ItemOID>
  <Name>VISITNUM</Name>
  <DataType>Num</DataType>
  <Origin>CRF</Origin>
  <Comment>Vital Signs CRF Page x</Comment>
  <def_x003A_Label>Visit Number</def_x003A_Label>
  <def_x003A_DisplayFormat>8</def_x003A_DisplayFormat>
</ItemDef>

ITEMREF

<xs:element name="ItemRef" type="ODMcomplexTypeDefinition-ItemRef"/>

<xs:complexType name="ODMcomplexTypeDefinition-ItemRef">
  <xs:sequence>
    <xs:group ref="ItemRefElementExtension" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
  <xs:attributeGroup ref="ItemRefAttributeDefinition"/>
  <xs:attributeGroup ref="ItemRefAttributeExtension"/>
</xs:complexType>

ItemRefAttributeDefinition

<xs:attributeGroup name="ItemRefAttributeDefinition">
  <xs:attribute name="ItemOID" type="oidref" use="required"/>
  <xs:attribute name="OrderNumber" type="integer"/>
  <xs:attribute name="Mandatory" type="YesOrNo" use="required"/>
  <xs:attribute name="KeySequence" type="integer"/>
  <xs:attribute name="ImputationMethodOID" type="oidref"/>
  <xs:attribute name="Role" type="xs:NMTOKENS"/>
  <xs:attribute name="RoleCodeListOID" type="oidref"/>
</xs:attributeGroup>

<ItemRef>
  <ID>1</ID>
  <ItemOID>USUBJID</ItemOID>
  <OrderNumber>1</OrderNumber>
  <Mandatory>Yes</Mandatory>
  <Role>Identifier</Role>
</ItemRef>

ITEMDATA

<xs:element name="ItemData" type="ODMcomplexTypeDefinition-ItemData"/>

<xs:complexType name="ODMcomplexTypeDefinition-ItemData">
  <xs:sequence>
    <xs:element ref="AuditRecord" minOccurs="0"/>
    <xs:element ref="Signature" minOccurs="0"/>
    <xs:element ref="MeasurementUnitRef" minOccurs="0"/>
    <xs:element ref="Annotation" minOccurs="0" maxOccurs="unbounded"/>
    <xs:group ref="ItemDataElementExtension" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
  <xs:attributeGroup ref="ItemDataAttributeDefinition"/>
  <xs:attributeGroup ref="ItemDataAttributeExtension"/>
</xs:complexType>

<xs:attributeGroup name="ItemDataAttributeDefinition">
  <xs:attribute name="ItemOID" type="oidref" use="required"/>
  <xs:attribute name="TransactionType" type="TransactionType"/>
  <xs:attribute name="Value" type="value"/>
  <xs:attribute name="IsNull" type="YesOnly"/>
</xs:attributeGroup>

And Finally the Data

<VIT>
  <STUDYID>1234C00001</STUDYID>
  <SITEID>1</SITEID>
  <USUBJID>18</USUBJID>
  <VISITNUM>1</VISITNUM>
  <VSDTC>2005-07-19T00:00:00</VSDTC>
  <VSTESTCD>HRSUP</VSTESTCD>
  <Field7>100</Field7>
  <DOMAIN>VIT</DOMAIN>
</VIT>
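The record above stores each value as element text, while the `ItemData` schema carries it in `ItemOID` and `Value` attributes. A minimal sketch of that reshaping follows, assuming each field name can serve directly as the `ItemOID` and that an empty field maps to `IsNull="Yes"`. The function name `to_item_group_data` and the shortened sample record are illustrative only, and the surrounding ODM hierarchy (subject, study event, form) is left out.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the record above, with one empty field added for illustration.
RECORD = (
    "<VIT><STUDYID>1234C00001</STUDYID><USUBJID>18</USUBJID>"
    "<VISITNUM>1</VISITNUM><VSTESTCD>HRSUP</VSTESTCD><VSSTAT></VSSTAT></VIT>"
)


def to_item_group_data(xml_text: str, item_group_oid: str) -> ET.Element:
    """Wrap each field of a dataset-style record in an ItemData element."""
    record = ET.fromstring(xml_text)
    group = ET.Element("ItemGroupData", {"ItemGroupOID": item_group_oid})
    for field in record:
        # Assumption: the field name doubles as the ItemOID.
        item = ET.SubElement(group, "ItemData", {"ItemOID": field.tag})
        if field.text:
            item.set("Value", field.text)
        else:
            item.set("IsNull", "Yes")
    return group


group = to_item_group_data(RECORD, "VIT")
for item in group:
    print(item.attrib)
```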

XML Mapping

Summary and Questions

• Mapping options
• Common problems
• Potential further use of XML

