Thursday, February 26, 2009

Modelling ODM Metadata - a response

Today, I would like to respond to a recent comment from XML4Pharma regarding the use, or otherwise, of ODM in modelling CDISC-based studies. XML4Pharma makes some valid points, but some of the concerns are based on a misunderstanding of the proposals.

First of all, thank you XML4Pharma for your input – all input is good input as far as I am concerned.

Hopefully, with this posting, I can clarify that I do know the differences between ODM and SDTM! And yes, I do believe that ODM and SDTM complement each other.

I think you have misunderstood how I am suggesting SDTM be used versus ODM. If you look at my recent post, what I am suggesting is precisely what you have been evangelizing about - to think about the Outputs - SDTM - in order to create the inputs. The principle defined was for the modelling of metadata – how you potentially get to an ODM based definition of a study - not how the metadata is used and processed in an EDC product. 

The difference between my proposal and yours is that I am suggesting a 3 tier model to arrive at the underlying definition of forms and rules in a study. The end result may well be ODM, but what I am suggesting concerns how that ODM is prepared. To understand where I believe the challenge lies, we need to think about the definition of a whole study.

Hypothetically, a typical EDC study build includes, let's say, 8 days of forms development and 20 days of rules development. Looking at just the forms, re-use can be effective from study to study: a second study can take 4 days, a third study 2 days, etc. But what about all the associated rules? How will they work when the visit structures change, new forms are included, and fields are taken away? Will we see 20 days going down to 10 days and then down to 5 days? That depends on how much the use and content of the forms change from study to study, and it will not happen if the logic is hanging off the forms.

Let's take an example problem. I have 5 similar forms that all need to populate the same SDTM domain. With the proposed model we start with the definition of the SDTM Domain (Tier 1). This contains the definition of what we are aiming for. It contains all fields, not just the ones we might use on a form. Next comes the definition of a superset Logical structure that contains all of the appropriate fields that might be used by a sponsor, together with logic that is applied regardless of the capture method (Tier 2). Finally, at Tier 3, we have the 5 different forms. These all subset the Logical structure defined at Tier 2 and inherit the rules. As we have a consistent thread from Tier 3 through Tier 2 to Tier 1, it is possible to create a definition that can be used by an eventual target EDC product to populate the same Tier 1 (SDTM Domain) regardless of the form structure or logic.

At Tier 1, you define as much information as you can that will be consistent across Tiers 2 and 3: field name, data type, etc. Tier 2 inherits information from Tier 1 and adds relationships and rules. You might have many logical structures to capture the same information, but the information in Tier 1 is only defined once. Tier 3 applies the same idea. You might have many instances of a form that in turn apply the same rules, but one form might be visualized differently from another.
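The three tiers can be sketched roughly in code. This is only an illustration in Python: the class names (`SdtmDomain`, `LogicalStructure`, `Form`), the `check` method, and the example rule are assumptions made for this sketch and are not taken from ODM, SDTM, or any EDC product.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A rule takes one captured record and returns a list of problem messages.
Rule = Callable[[Dict[str, str]], List[str]]


@dataclass
class SdtmDomain:
    """Tier 1: the target domain; each field is defined exactly once."""
    code: str
    fields: Dict[str, str]  # variable name -> data type


@dataclass
class LogicalStructure:
    """Tier 2: a superset of usable fields plus capture-independent rules."""
    domain: SdtmDomain
    field_names: List[str]
    rules: List[Rule] = field(default_factory=list)

    def __post_init__(self) -> None:
        unknown = [n for n in self.field_names if n not in self.domain.fields]
        if unknown:
            raise ValueError(f"not in domain {self.domain.code}: {unknown}")

    def data_type(self, name: str) -> str:
        # Looked up in Tier 1 rather than redefined here.
        return self.domain.fields[name]


@dataclass
class Form:
    """Tier 3: one presentation that subsets a logical structure."""
    logical: LogicalStructure
    name: str
    field_names: List[str]
    layout: str = "single page"

    def __post_init__(self) -> None:
        extra = [n for n in self.field_names if n not in self.logical.field_names]
        if extra:
            raise ValueError(f"not in logical structure: {extra}")

    def check(self, record: Dict[str, str]) -> List[str]:
        # Rules are inherited from Tier 2, never attached to the form itself.
        return [msg for rule in self.logical.rules for msg in rule(record)]


def end_not_before_start(record: Dict[str, str]) -> List[str]:
    """Hypothetical rule; ISO 8601 date strings compare correctly as text."""
    start, end = record.get("AESTDTC"), record.get("AEENDTC")
    if start and end and end < start:
        return ["AEENDTC is before AESTDTC"]
    return []


ae_domain = SdtmDomain(
    "AE", {"AETERM": "Char", "AESTDTC": "Char", "AEENDTC": "Char", "AESEV": "Char"}
)
ae_logical = LogicalStructure(
    ae_domain, ["AETERM", "AESTDTC", "AEENDTC"], [end_not_before_start]
)
one_page = Form(ae_logical, "AE (one page)", ["AETERM", "AESTDTC", "AEENDTC"])
short = Form(ae_logical, "AE (short)", ["AETERM", "AESTDTC"], layout="compact")

record = {"AETERM": "Headache", "AESTDTC": "2009-02-20", "AEENDTC": "2009-02-18"}
print(one_page.check(record))  # ['AEENDTC is before AESTDTC']
```

Both forms share one rule list and one set of field definitions, so changing the layout of a form, or dropping a field from it, does not touch the rules.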

That was a simple example. Let's imagine I wanted to capture data for 3 domains on the same form. We could simply say to the Investigator, "Sorry, we need to split this into separate pages, because that is what SDTM demands", when the true answer is that "this is what our modelling structure demands". I don't think the model should demand it.

I could have an AE form that is captured on a single page in one study, but captured across 3 pages in another study. I could even reach the point where I want to capture AE information on a PDA (though unlikely!). The eventual SDTM Data domain is the same, and the rules are the same (I only want to define them once). The only thing that is changing is the presentation.

As you say, ODM does contain cross references back to SDTM - Fields and Domains. This will allow you to map information back to SDTM.  With the 3 tier model suggested, the way in which the Field and Domain (SDSVarName/Domain) are defined is through inheritance. 
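As a rough illustration of where those cross references end up, the sketch below writes an ODM-style metadata fragment in which the `Domain` and `SDSVarName` attributes are filled in from the domain definition rather than typed in per form. It is a fragment, not a complete or validated ODM file, and the function name and the OID naming scheme are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def form_to_odm_fragment(
    domain_code: str, form_name: str, variables: List[Tuple[str, str]]
) -> ET.Element:
    """Build an ODM-style fragment for one form.

    `variables` is a list of (SDTM variable name, ODM data type) pairs
    taken from the domain definition. The OIDs are hypothetical.
    """
    root = ET.Element("MetaDataVersion", {"OID": "MDV.1", "Name": "Sketch"})
    group = ET.SubElement(
        root,
        "ItemGroupDef",
        {
            "OID": f"IG.{form_name}",
            "Name": form_name,
            "Repeating": "No",
            "Domain": domain_code,  # inherited from the domain definition
        },
    )
    for var_name, data_type in variables:
        ET.SubElement(
            group, "ItemRef", {"ItemOID": f"IT.{var_name}", "Mandatory": "No"}
        )
        ET.SubElement(
            root,
            "ItemDef",
            {
                "OID": f"IT.{var_name}",
                "Name": var_name,
                "DataType": data_type,
                "SDSVarName": var_name,  # inherited, not retyped per form
            },
        )
    return root


fragment = form_to_odm_fragment(
    "AE", "AE_SHORT", [("AETERM", "text"), ("AESTDTC", "date")]
)
print(ET.tostring(fragment, encoding="unicode"))
```

Two forms built from the same variable list would carry identical `SDSVarName` and `Domain` values, which is what lets either of them be mapped back to the same SDTM domain.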

I won't respond to the Audit trail and 21CFRPart 11 comments that XML4Pharma made. Hopefully, by this point my explanations have corrected this misunderstanding.

XML4Pharma's last point regarding HL7-v3...

we do already have a format for exchange of clinical data (or is submission not "exchange"?). So formatting SDTM as ODM is a very nice, simple and efficient way.

If I were part of CDISC and wanted to work closely with HL7, or if I wanted to be considered a bridge builder to Electronic Health Record systems, I would probably embrace HL7-v3 without even opening the specification. It may or may not be the best solution... but on paper it is probably the most marketable. In response, I would recommend that you constructively and diplomatically present the benefits of an ODM/SDTM based standard over HL7-v3 with further real-world examples. Without this sort of approach, XML4Pharma, as a leading proponent of ODM based systems, may come across as simply showing bias due to home-grown interests. Personally, my gut feel is that you are correct, but I do not yet know enough about HL7-v3 to form a fully considered opinion. However, I believe there are sufficient intelligent individuals in the CDISC organization to take a considered technical argument on board, provided it is delivered in the right way.

1 comment:

XML4Pharma said...

Thanks, I will only comment on the last part (HL7-v3 for SDTM submission). I will respond on the SDTM-ODM in a separate contribution.

The thing is that the FDA decided on an HL7-v3 message for SDTM submissions without even consulting the technical people at CDISC. They 'just decided it', though there is no XML knowledge at all at the FDA. So they probably went to HL7 asking for a solution, and as you can imagine, HL7 did not say "ODM is a better choice" ...
I have the impression that the technical issues did not play a role at all in the decision, and that it was a purely political decision.
Yes, we do need to work closely with HL7, as it is a major player in the healthcare world: their HL7-v2 (version two: non-XML!) messages are a huge success, but their HL7-v3 messages are very controversial (see e.g. http://hl7-watch.blogspot.com/ - no, it's not me), and have been found to be almost impossible to implement. The HL7-v3 datatypes have been rejected by the European authorities, for technical (XML) reasons.
Integration with healthcare is of utmost importance, and we need HL7 for that (among others). A team (which I am part of) has already developed an interface between HL7-CCD and CDASH-ODM based EDC, see e.g. http://wiki.ihe.net/index.php?title=Clinical_Research_Data_Capture. We use XSLT for that. We did not develop a new HL7-v3 message, as some (at the FDA?) believe is necessary to be able to integrate Healthcare and clinical research.

For an FDA-submission format, I am not completely biased towards ODM.
Essentially, any pretty simple XML structure could be used (SDTM is just a set of two-dimensional tables). So, one could just have the domain names and SDTM variable names as tag names for the XML, with some extra attributes and maybe a few extra elements to glue things between domains together.
Developing an HL7-v3 message, however, is complete overkill for this.
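To make the "pretty simple XML structure" idea concrete, here is a minimal sketch in Python that uses the domain name and the SDTM variable names directly as tag names. The `Record` wrapper element and its `seq` attribute are assumptions added for this example; they are not part of any CDISC or FDA format.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def domain_to_xml(domain_code: str, rows: List[Dict[str, str]]) -> ET.Element:
    """Serialize one two-dimensional SDTM table as plain XML.

    The root tag is the domain name and each value sits in a tag named
    after its SDTM variable. "Record" and "seq" are hypothetical glue.
    """
    root = ET.Element(domain_code)
    for seq, row in enumerate(rows, start=1):
        rec = ET.SubElement(root, "Record", {"seq": str(seq)})
        for var_name, value in row.items():
            ET.SubElement(rec, var_name).text = str(value)
    return root


ae = domain_to_xml(
    "AE",
    [
        {"USUBJID": "001", "AETERM": "Headache"},
        {"USUBJID": "002", "AETERM": "Nausea"},
    ],
)
print(ET.tostring(ae, encoding="unicode"))
```

Anything needed to link records across domains would be layered on top of this with a few extra attributes or elements, as described above.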

Back to Healthcare-Clinical integration. Many of my American colleagues often think there is nothing else in the healthcare world than HL7. That is not correct. When looking at Electronic Health Records, HL7 is only one of the players, with some competitors (or "competing standards") being considerably further along in implementation (e.g. in Scandinavia and Australia), and technically (XML) in much better shape.

So yes, we need HL7, and we should work with them in a very close manner, but I disagree that we should just be taking everything that HL7 has developed for granted, as we know some of their stuff has major design errors, and is extremely costly to implement.
Especially as there are better alternatives.