Thursday, November 13, 2008

CDISC SDTM versus ODM?

I have read with great interest an article posted in ClinPage - The Future of ODM, SDTM and CDISC.   These discussions relate primarily to the proposed requirement from the FDA for data submissions to be made in XML format rather than SAS Transport file format.   I don't think we will see many arguments around this point - XML is now the accepted extensible method of describing the combined data and metadata.  What is more contentious is that it is requested that data be provided in the HL7 v3 Message format.  FDA Docket No. FDA-2008-N-0428 from August 2008 elaborates on where the FDA are in the process.

In addition to the move to an HL7 Message format rather than SAS XPT, commentary exists on a suggestion that a move to ODM rather than SDTM would be considered.   This point is also put forward by Jozef Aerts of xml4pharma.

I would like to comment on a comparison of SDTM versus ODM.

Operational Data Model

ODM was the first CDISC standard to successfully go through the authoring process.  It was intended as a means of representing data in the context of data capture. Data was indexed to Visits and Forms. The syntax was designed to describe data not as an efficient storage format, but as a source-to-destination format.  You could get data from System A by Visit and Form to System B by Visit and Form.   This is great where the presentation of the data has importance and meaning.
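To make the Visit-and-Form indexing concrete, here is a minimal sketch that walks a simplified ODM-style fragment. The element names (SubjectData, StudyEventData, FormData, ItemData) follow ODM, but the XML namespace is omitted for brevity, and every OID and value is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified ODM-style clinical data. The real standard uses an XML
# namespace and more attributes; all OIDs and values here are invented.
ODM_FRAGMENT = """
<ODM>
  <ClinicalData StudyOID="ST001" MetaDataVersionOID="v1">
    <SubjectData SubjectKey="1001">
      <StudyEventData StudyEventOID="SCREENING">
        <FormData FormOID="DEMOG">
          <ItemGroupData ItemGroupOID="DEMOG_IG">
            <ItemData ItemOID="SEX" Value="F"/>
            <ItemData ItemOID="BRTHDT" Value="1970-02-14"/>
          </ItemGroupData>
        </FormData>
        <FormData FormOID="VITALS">
          <ItemGroupData ItemGroupOID="VITALS_IG">
            <ItemData ItemOID="SYSBP" Value="120"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>
"""


def items_by_visit_and_form(xml_text):
    """Return {(subject, visit, form): {item_oid: value}}."""
    root = ET.fromstring(xml_text)
    result = {}
    for subject in root.iter("SubjectData"):
        for event in subject.iter("StudyEventData"):
            for form in event.iter("FormData"):
                key = (
                    subject.get("SubjectKey"),
                    event.get("StudyEventOID"),
                    form.get("FormOID"),
                )
                # Collect every item captured on this form at this visit.
                result[key] = {
                    item.get("ItemOID"): item.get("Value")
                    for item in form.iter("ItemData")
                }
    return result


indexed = items_by_visit_and_form(ODM_FRAGMENT)
for key, values in indexed.items():
    print(key, values)
```

Every value is reached through its subject, visit and form, which is what makes the format convenient for moving data between capture systems.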

Study Data Tabulation Model

SDTM, unlike ODM, focuses on groupings of data - not by CRF Form - but by the use of data.  All Demographics information appears on the same record, for example.  The SDTM structure has now also become the basis for data delivery and storage within many organizations.  A number of large PharmaBio companies have based internal cross-company standards on SDTM.

Modelling from Data Captured

The format of data will differ depending on the medium used to capture the data.  Some form factors might have 30 questions on a form; others, such as Patient Diaries, might only have 1 or 2 questions per form. In addition, when designing a CRF for ease of use, it may not make sense to use the content of each SDTM domain as the basis for deciding what does and does not go onto a single form. Whether the data appeared on one form or across many forms is not important when it comes to the value of the data.  Many EDC vendors have gone down the route of designing the database for data capture according to EAV rules - Entity-Attribute-Value form - where each value captured on any form is dropped into a single table. Once captured, data is then re-modelled into a relational structure that may or may not model the layout of the page. (xForms is a generic technology touted as being a potential means of addressing this challenge - I will leave further discussion on this to a later article).
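As a rough illustration of the EAV approach, the sketch below stores each captured value as one row and then groups the rows back into one record per form instance. The row layout and all field names are hypothetical; real EDC schemas differ from vendor to vendor.

```python
from collections import defaultdict

# Hypothetical EAV rows as an EDC system might store them:
# (subject, visit, form, field, value). All names are invented.
EAV_ROWS = [
    ("1001", "SCREENING", "DEMOG", "SEX", "F"),
    ("1001", "SCREENING", "DEMOG", "BRTHDT", "1970-02-14"),
    ("1001", "SCREENING", "VITALS", "SYSBP", "120"),
    ("1001", "WEEK4", "VITALS", "SYSBP", "118"),
]


def pivot_eav(rows):
    """Group EAV rows into one record per (subject, visit, form)."""
    records = defaultdict(dict)
    for subject, visit, form, field, value in rows:
        records[(subject, visit, form)][field] = value
    return dict(records)


records = pivot_eav(EAV_ROWS)
for key, fields in records.items():
    print(key, fields)
```

The same rows could just as easily be grouped by something other than the form, which is why the page layout need not dictate the relational structure.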

Based on the above, it would seem logical that SDTM is of greater value when used as the method of delivery of data for submission or analysis than ODM.

However, that is not the only reason why SDTM makes sense over ODM when developing and executing eClinical studies.  The primary reason relates to metadata re-use.

ODM is not a suitable format for modelling studies because it does not lend itself to ensuring that similar studies are able to effectively re-use metadata.  Sure - I can take a study, copy the metadata, and I have another study... easy... but what about changes?   What if I remove a few fields, add a few fields, change the visit structure? That will of course change the format of the data outputs if ODM was the format - an issue, see above - but, more importantly, it will greatly impact any rules that might exist on the forms.  Rules that use some form of wildcarding mechanism may or may not work.   Anyway, this is not a posting on metadata architecture, so I will leave it at that.

Bringing together SDTM and HL7 v3

So back to SDTM and HL7.  Is this the right way to go?   I can understand the logic behind this.  Being able to bring EHR and Clinical Trial data together within a common standard could be very useful.  However, at what cost?  

I am not aware of any eClinical application that automatically creates SDTM-compliant data sets - regardless of transport layer.  The mapping of proprietary metadata to SDTM is quite involved, with varying degrees of software development required from the various system vendors.  Typically, either SAS macro transformations or some form of ETL (Extract, Transform and Load) tool is used.  This is all complicated enough. Creating a tool that creates SDTM datasets in HL7 v3 is considerably more complicated. Even for large companies it will be a major development undertaking.  The complexity is such that smaller companies will simply fail to deliver the data in a cost-effective way.
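To give a flavour of what such a mapping step involves, here is a minimal sketch that turns captured rows into Vital Signs (VS) style records. The variable names (STUDYID, USUBJID, VSSEQ, VSTESTCD, VSORRES and so on) are standard SDTM variables, but the source field names, the mapping table and the function are hypothetical, and a real conversion needs far more (controlled terminology, dates, standardized results, validation).

```python
# Hypothetical mapping from proprietary field names to SDTM Vital Signs
# test codes, test names and units. The source field names are invented.
VS_MAP = {
    "SYSBP": ("SYSBP", "Systolic Blood Pressure", "mmHg"),
    "DIABP": ("DIABP", "Diastolic Blood Pressure", "mmHg"),
}


def to_vs_records(study_id, captured):
    """Turn captured rows (subject, visit, field, value) into VS-style rows."""
    out = []
    seq = {}  # per-subject sequence counter for VSSEQ
    for subject, visit, field, value in captured:
        if field not in VS_MAP:
            continue  # not a vital sign; another domain's mapping would handle it
        testcd, test, unit = VS_MAP[field]
        usubjid = f"{study_id}-{subject}"
        seq[usubjid] = seq.get(usubjid, 0) + 1
        out.append({
            "STUDYID": study_id,
            "DOMAIN": "VS",
            "USUBJID": usubjid,
            "VSSEQ": seq[usubjid],
            "VSTESTCD": testcd,
            "VSTEST": test,
            "VSORRES": value,
            "VSORRESU": unit,
            "VISIT": visit,
        })
    return out


captured = [
    ("1001", "SCREENING", "SYSBP", "120"),
    ("1001", "SCREENING", "HEIGHT", "170"),  # skipped: not in VS_MAP
    ("1001", "WEEK4", "SYSBP", "118"),
]
vs_rows = to_vs_records("ST001", captured)
for row in vs_rows:
    print(row)
```

Even this toy version needs a hand-maintained mapping table per study, which hints at why the full job usually ends up in SAS macros or an ETL tool.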

Tool providers may step in - they may offer a means to convert a basic SDTM ASCII file with additional information into an SDTM HL7 v3 file. XML4Pharma, based on its recent critique of the approach, does not appear to want to jump into supporting this, but if this becomes a mandate, some companies will.

Playing on the other side of the argument - one of the principles of XML is that the data is also human readable.  In reality, once you add all of the 'overhead', especially with a complicated syntax such as HL7 v3, you end up with something that is only readable by technical gurus.  But then, maybe it shouldn't be people that interpret these files; maybe the complexity has got to the point where it only makes sense that a computer application interprets the files and then presents the appropriate information to the user.  Modern eClinical systems offer views on data. Maybe the presentation of the Submission data is managed in the same way - through an application that presents a view based on purpose.

6 comments:

GB said...

I am intrigued by your reference to xForms. What is your opinion on extending ODM's Presentation and/or ArchiveLayout elements to represent/model the eCRF - say, using ODM compliant vendor extensions?

-GB.

Doug Bain said...

xForms has been around for a number of years. W3C originally put it forward as a step up from basic HTML. I thought it might be a dead technology, but it would appear that, partly due to the uptake of Web 2 and Ajax technologies, it has a 2nd life.

I have not been a strong advocate of using ODM as the internal metadata for an EDC application as it has too many limitations to make it competitive IMHO. However, as a means to present a standard cross platform Archive or View of data, it does make some sense. Many EDC products present a subset of CRF page information based on a user role and permission. Blinding for example is often handled through this method. ODM does not support this currently, but, with a technology such as xForms, rules could be added to conditionally manage the 'view' of data. Today, a site archive based on ODM needs to offer a snapshot based subset of the actual CRF content based on the anticipated user accessing the archive - not ideal.

With an ODM/xForms standard, we could potentially have archives standardized and centralized regardless of the original vendor system that captured it in the first place.

Doug Bain said...

I should add that xForms is a technology that is well used in Healthcare / EHR systems and standards. That alone could give CDISC a reason for examining the potential of xForms.

GB said...

Thank you for sharing your thoughts. If xForms _are_ indeed used, they could perhaps be tied to ODM through the vendor extension mechanism.

-GB.

XML4Pharma said...

ODM and XForms are two different, but complementary things. ODM defines studies and visits, including forms with questions, codelists, etc., whereas XForms implements forms for the web.
XForms, for example, does not allow you to define something like visits.
It is also a misunderstanding that XForms is about presentation - that is done by the CSS which can be attached to an XForms.

As such, an ODM study design can easily be transformed (automatically) into a set of XForms for web usage.
You can find some samples at:
http://www.xml4pharma.com/ODMinEDC/Samples.html but you can also try it out yourself (i.e. submit ODM and get XForms) at a demo application server at:
http://www.xml4pharmaserver.com:8080/XML4PharmaServer/

You do need to register, but after that you can just submit ODM files which are automatically transformed into XForms, which you can immediately try out.

This nicely demos that ODM and XForms are complementary, not competitors. Also there is no need to extend ODM with XForms, as XForms is NOT about presentation.

XML4Pharma said...

User "EDC Consultant" does not seem to understand what ODM and SDTM exactly are about.
It is not ODM "VERSUS" SDTM, as both are complementary: each of them has its own task, and they even nicely work together.
First of all, ODM is a "transport" (or "format") standard, whereas SDTM is a "content" standard.
ODM is essentially a "framework".
Using SDTM, one cannot set up a clinical study; you will need ODM. SDTM is about grouping of data (categorization). The SDTM standard says which categories you can choose from, but does not say much about the format you should use.
So you can have your SDTM data in SAS datasets, in any kind of XML (e.g. ODM), as a set of tables in a relational database, maybe even in CSV.
What is however important (and I have been evangelizing this for a number of years), is that you design your study with submission in mind from the start. "Think SDTM, write ODM". So, when designing your CRFs, think already about how these data will need to be submitted to the FDA.
The CDISC ODM Standard does have a number of fields available for this: "SDSVarName", "Domain" and "Alias", which allow you to "annotate" CRF questions and groups of questions with SDTM information.
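A minimal sketch of how such annotations might be read back out of a study design: the attribute names SDSVarName (on ItemDef) and Domain (on ItemGroupDef) are from ODM, while the fragment itself is simplified, omits the XML namespace, and uses invented OIDs.

```python
import xml.etree.ElementTree as ET

# Simplified ODM-style metadata; namespace omitted, OIDs invented.
METADATA = """
<MetaDataVersion OID="v1" Name="Demo">
  <ItemGroupDef OID="DEMOG_IG" Name="Demographics" Repeating="No" Domain="DM">
    <ItemRef ItemOID="SEX" Mandatory="Yes"/>
    <ItemRef ItemOID="BRTHDT" Mandatory="Yes"/>
  </ItemGroupDef>
  <ItemDef OID="SEX" Name="Sex" DataType="text" SDSVarName="SEX"/>
  <ItemDef OID="BRTHDT" Name="Date of birth" DataType="date" SDSVarName="BRTHDTC"/>
</MetaDataVersion>
"""


def sdtm_annotations(xml_text):
    """Return {item_oid: (sdtm_domain, sdtm_variable)} from the annotations."""
    root = ET.fromstring(xml_text)
    # SDTM variable name declared on each question (ItemDef).
    var_names = {d.get("OID"): d.get("SDSVarName") for d in root.iter("ItemDef")}
    mapping = {}
    for group in root.iter("ItemGroupDef"):
        for ref in group.iter("ItemRef"):
            oid = ref.get("ItemOID")
            # The domain comes from the group, the variable from the item.
            mapping[oid] = (group.get("Domain"), var_names.get(oid))
    return mapping


annotations = sdtm_annotations(METADATA)
for oid, target in annotations.items():
    print(oid, "->", target)
```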
EDC consultant also complains "ODM is not a suitable format for modelling studies because it does not lend itself to ensuring that similar studies are able to effectively re-use metadata" (sic). This is not true at all. Many companies have developed metadata repositories in ... ODM format, so that they can use the same questions, codelists, etc. in different studies. Some have even announced that they will make these repositories public.
Furthermore, the recently published CDASH "recommendation" defines a number of standardized forms (and the questions in it) for reuse. These CDASH forms almost map 1:1 with SDTM domains. Many vendors have already formatted these as ODM, and a mixed ODM-CDASH team will publish ODM-implementations of CDASH in the next few weeks.
So CDASH is a nice example of a "bridge" between SDTM and ODM.
Another reason that SDTM is not suitable for setting up a study, is that it does not know the concept of "audit trail". SDTM is only about cleaned, final data.
So how could one set up a 21-CFR-11 compliant system using SDTM ?
Once the data is collected, however, SDTM is the BEST way to compare results between different studies. Due to the categorization, it allows data to be compared even when the drug and disease indications are different. It is also THE way to compare studies from different sponsors, thanks again to the categorization.
So many companies and universities (academic research) use SDTM to build data warehouses, even when the data will never be submitted to the FDA.

SDTM and ODM complement each other, so ODM "versus" SDTM is not an issue at all.
I would invite "EDC Consultant" to attend a CDISC "End to End Workshop" at the next US Interchange, where it is demonstrated how these standards work together.


Now, moving on to the HL7-v3 message discussion. My major objections are that we do already have a format for exchange of clinical data (or is submission not "exchange"?). So formatting SDTM as ODM is a very nice, simple and efficient way. It also nicely works together with define.xml.
Furthermore, whereas HL7-v2 has been very successful, HL7-v3 has been found to be almost not implementable (too complicated). Third, the development of an HL7-v3 message will take at least 5 years, and the industry cannot afford to have to use SAS transport for so many years. After all, SAS transport dates back to the IBM mainframe age, when we entered our data using punch cards or punch tape.