
Thursday, April 1, 2010

Linking SDTM / ODM for better FDA Submissions

If this is not already underway, I think it is time we examined how we can do a better job of combining SDTM data submissions with ODM data and metadata.  The following commentary will hopefully prompt comments, and ideas for next steps.

The Challenge

CDISC SDTM – the standard for the submission of clinical data for New Drug Applications – is challenging to work with for a number of reasons.  Data is structured per domain rather than per CRF – quite rightly, in my view – but this remodeling does create a number of issues that deserve to be addressed.

First of all – getting the data into this format in the first place.  In a typical EDC system, you have data captured across many pages.  Often these pages contain one or more domains of data.  Data is presented in friendly, CRF-page-like formats, with quick links to audit trails, queries, comments and so on.   In the SDTM world, things are not quite as friendly.  You have long lists of records structured by domain.    You cannot see the audit trail.   You struggle to relate the comments.  The queries may not even exist.

Now – ok – maybe the audit trail and query log should be of no significant relevance to the medical reviewers: they are, supposedly, merely evidence that the data cleaning process has taken place. Maybe that is just a carry-over from working with paper CRFs for decades. However, I would argue that data is never 100% clean – it is only ever sufficiently clean to merit safe statistical analysis.  In a position of doubt, reviewers may want to see the context behind the data being recorded, and therefore see the query and audit logs.

So – after that short ramble – how could we make the life for Reviewers better?

Combining ODM with SDTM

Well, first of all, why not leverage the standards we already have for clinical data – ODM and SDTM – but combine them more effectively, offering SDTM directly linked to ODM?

What I mean by this – and I am sure XML4Pharma will point out that this has been suggested previously – is that we extend the ODM specification to accommodate SDTM domains, AND that we provide the means to link the SDTM domain content with the associated eCRF data and metadata in the present ODM.

To the end user – this would result in a mechanism to switch between a tabular SDTM view of data to an eCRF view of data, with ready access to the audit trail and queries, as well as potentially a better context of the data in the way it was captured.

Easy to achieve?

Sponsors struggle to create SDTM today.  However, I am not certain that this is due to underlying faults in SDTM itself.  I think the tools will mature, the standards will mature, and sponsor companies will simply get better at it.

Creating related ODM is also not too difficult. Any EDC vendor that wishes to be credible in the marketplace needs to be able to offer data and metadata in the CDISC ODM format.

Programmatically linking the SDTM and ODM is probably the hardest part.

In theory, you could have a situation where every field on an SDTM record belongs to a separate ODM page instance.   The problem can be solved, but it is not going to be easy.
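As a sketch of what such a field-level linkage might look like – every name here (the OIDs, the form names, the record fields) is hypothetical, not part of any published ODM or SDTM specification – each SDTM variable could carry a pointer back to the ODM "coordinates" of the item it was derived from:

```python
# Hypothetical sketch: link each SDTM variable back to the ODM coordinates
# (subject, form, item group repeat, item) it was derived from. All OIDs
# and field names are invented for illustration.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class OdmRef:
    subject_key: str
    form_oid: str
    form_repeat_key: int
    item_group_oid: str
    item_group_repeat_key: int
    item_oid: str

# One SDTM record (a Vital Signs row, say)...
sdtm_record = {"USUBJID": "STUDY01-0001", "VSTESTCD": "SYSBP", "VSORRES": "120"}

# ...with a per-variable provenance map pointing back into the ODM.
provenance = {
    "VSTESTCD": OdmRef("0001", "F_VITALS", 1, "IG_VS", 2, "I_SYSBP"),
    "VSORRES": OdmRef("0001", "F_VITALS", 1, "IG_VS", 2, "I_SYSBP"),
}

def odm_source(variable: str) -> Optional[OdmRef]:
    """Return the ODM coordinates behind an SDTM variable, if known."""
    return provenance.get(variable)

ref = odm_source("VSORRES")
```

A reviewer tool could then resolve `ref` against the ODM file to pull up the eCRF rendering, audit trail and queries for that single datapoint – the view-switching described above.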

Conclusion

Delivering SDTM data in an ODM-style format, and creating a means to link the SDTM/ODM to the present ODM data and metadata, will create data that is considerably more useful for analysis, submission and archiving.

Tuesday, March 23, 2010

Apple iPad and eSource

The Apple iPhone broke new ground in offering an all-in-one device for music, phone, games, organizer and general applications.  Other companies had introduced almost all the concepts; it was Apple that brought them all together so effectively.

I have considered the suitability of the iPhone specifically as a device for capturing clinical trial data.  The obvious application is an eDiary device. People familiar with the iPhone will be aware of the phenomenal success of the App Store.  It may be possible to write an eDiary app.  However, there are challenges with metadata deployment and application patching that might prove insurmountable given the restrictions Apple places on deployment.  As for using the native browser on the iPhone – the form factor is just not that suitable. Yes, you can fill in a browser-based form, and you can do the two-fingered pan and zoom. But it just doesn't quite fly when it comes to regular data entry operations.

Shortly, Apple will release the iPad.  In many ways this is like an iPhone or iPod Touch, but larger. Form-factor wise, it is similar to many Tablet PCs. However, it has the advantage of being tied to the iPhone OS, and it will be offered with both WiFi and 3G connectivity.

One of the real barriers to eSource in clinical trials is the portability and availability of the device at the appropriate times.  With the larger touch screen of the iPad and the option of 3G or WiFi connectivity, it will be increasingly possible to capture data efficiently at the place where the data becomes available.

So – could the iPad break the capture-to-paper / transpose-to-EDC bottleneck at the sites?  I think so.  The solution is likely to be browser based, though – App deployment is still too restrictive.  It needs to be fully touch-screen friendly.   It needs to make it beautifully easy for an investigator to 'interview' a patient and key the data during the interview where appropriate.   It needs to provide a means for the investigator to indicate, through a simple highlighter-pen-style UI metaphor, whether data is being entered as source or transposed from source.

I appreciate that other devices exist today that perform a similar function, but I believe the connectivity, general ease of use and low price point will make the iPad stand out.

One critical feature may be the Electronic Health Records link.  I think the data that should be copied needs to be at the discretion of the site personnel. I am not yet convinced that data privacy, combined with a sponsor controlling the study build and data propagation, is viable right now. A really simple copy/paste mechanism might be better than nothing for the time being.  We will need to see how well the Safari browser performs.

Initially, I see the iPad making inroads within Phase I units.  Hardware device interfacing is less of an issue here now – most devices should be centralizing the data interchange rather than sending it directly to the data entry device.  Web 2.0 interactive technologies will allow developers to create some of the realtime functionality that dedicated Phase I solutions have enjoyed in the past.

I am looking forward to seeing the first iPad EDC demonstrations at the DIA in June!

Sunday, February 14, 2010

Value of Batch validation?

One of the questions often asked of EDC systems is 'Where is the batch validation?'.  The question I would like to ask is: what is the value of batch validation versus online validation?

I should start by saying that I have a personal dislike of technology that works in a particular way – because that is the way it has always worked – rather than because a pressing requirement exists to make it work the way it does today.

Performance – Batch validation generally dates back to the good old days of batch data processing.  With Clinical Data Management systems, where the act of entering data and the triggering of queries were not time critical, batch processing made sense.   The centralized Clinical Data Coordinators would double-enter the data rapidly and, at an appropriate point in time, the batch processing would be triggered and the appropriate DCFs lined up for review and distribution.

For EDC – things are different.   It is all about Cleaner Data Faster, so not checking data immediately after entry creates an inherent delay.   No site personnel want to be hit with a Query/DCF hours or even days after data was keyed if it could have been highlighted to them when they originally entered the data – and, presumably, had the source data at hand.

A couple of CDM-based tools provide both online edit checking and offline batch validation.   The batch validation elements come from the legacy days of paper CDM, as per above.  The online checking is a subsequent add-on, created because of the difficulty of efficiently parameterizing and executing batch validation checks per subject eCRF.

Let's have a look at a couple of other differentiators.

1). Online edit checking tends to run within the same transaction scope as the page – so when a user sees the submitted page, they are able to immediately see the results of the edit check execution.   This means the data submission and edit check execution for all checks must occur in less than a couple of seconds in order to be sufficiently responsive.  With batch validation, running across the data can be more efficient, and the user experience is not impacted by waiting for a page refresh.

I believe most leading EDC products have the performance aspects of real-time edit check execution cracked. Networks are faster, and computers are maybe 10 times faster than 4 years ago. I don't believe that performance is an issue in a modern EDC system with properly designed edit checks.

2). Scope – Batch validation is able to read all data within a subject regardless of visit. In addition, some tools are also capable of checking across patients.   EDC systems with online validation also generally manage to read all data for a subject, but do not permit reading across subjects.

3). Capabilities – Most EDC systems' edit checking mechanisms are application intelligent, rather than based on SQL, or on a syntax that interprets down to SQL, as with batch validation.   As a result, the syntaxes tend to be more business-aware. If you are having to write code – SQL or another syntax – then you have a demand to validate that code in a similar fashion to the vendor's validation of the system itself.  Avoiding coding in favor of a configuration / point-and-click tool makes the testing considerably easier, with automation possible.

4). Architectural Simplicity – If you were a software designer and you saw a requirement to check data entered into a database, would you create one syntax, or multiple syntaxes?  Even if you saw a need for offline batch validation, I think you would go with a single syntax.  If you have a means to balance where and when the rules run – at the client side, the application side, or the database layer – then that might be ideal.  Using two or more syntaxes is something you would avoid.

5). Integration implications – Data that is imported into an EDC or CDM system should go through exactly the same rules regardless of the medium used to capture it – browser, PDA, lab, ECG, etc. This even applies if you are importing ODM data.  If this is not the case, then downstream data analysis needs to confirm that the validity of the data against the protocol was assured across the devices.  Achieving this when you have separate batch and online edit checking is difficult.
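The point can be made concrete with a short sketch – the rule, field names and sources below are all invented for illustration, not taken from any particular product: a single set of checks applied at one choke point, whatever the capture medium.

```python
# Sketch: one rule set applied to every inbound record, regardless of
# capture medium. Rule logic and field names are invented for illustration.

def check_dbp_lt_sbp(rec):
    """Raise a query text if diastolic BP is not below systolic BP."""
    if rec.get("DBP") is not None and rec.get("SBP") is not None:
        if rec["DBP"] >= rec["SBP"]:
            return "Diastolic BP should be below systolic BP"
    return None

RULES = [check_dbp_lt_sbp]

def validate(record, source):
    """Run every rule once; tag resulting queries with the capture medium."""
    queries = []
    for rule in RULES:
        message = rule(record)
        if message:
            queries.append({"source": source, "message": message})
    return queries

# The same rule fires whether the record arrived from the browser or a lab feed.
from_browser = validate({"SBP": 120, "DBP": 130}, "browser")
from_lab = validate({"SBP": 120, "DBP": 130}, "lab-import")
```

Because every source funnels through the same `validate`, downstream analysis does not need to re-establish that each device enforced the protocol.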

 

On re-reading the details above, it sounds a bit like I am bashing systems that do batch validation.  That is probably slightly unfair.  I have worked with both EDC and CDM systems, and written checks for both. In the paper CDM world, the user interface for the batch execution of rules makes sense: you choose the appropriate point in time, and you can determine the scheduling and scope of DCFs.  So, for a pure paper environment, this meets requirements.

However, in an increasingly EDC world, I am not sure this has value.  It could be argued that it gives you the best of both worlds.  However, I think it is an unsatisfactory compromise that increases complexity when migrating to an EDC focus. It simply does not create a good, scalable solution, and users will be left wondering why things are so complex.

Thursday, February 11, 2010

CDASH, SDTM and the FDA

Hurrah!  The FDA have made an announcement on their preference towards SDTM!!  Well.   Sort of.   They met with representatives from CDISC. The CDISC organization wrote down some notes on the discussion and posted them to their blog.

Ok – maybe I am being overly flippant. However, why does this message need to come out by proxy from CDISC?  Why can the FDA CDER / CBER not step off the fence and make a firm statement on what they want, and when they want it?

One point made was that applying CDASH is the key to attaining SDTM datasets.  Well.  Sort of.  It is a good start point. But, it is only a start point.

The CDASH forms are very closely modeled on the structure of SDTM domains.   Do I always want to capture one domain on one eCRF form? Not always.  Do I sometimes want to capture information that is logically grouped according to source documents but belongs to multiple domains on the same eCRF? Often I do.  We should not compromise the user friendliness – and therefore compliance – at the sites because of a need to capture data according to the structure of the data extracts.

CDASH was developed around the principle that the EDC or CDM system modeled eCRF’s to equal SDTM domains.   If your EDC or CDM system does not do that, then compliance with CDASH is not entirely valuable.

However – or rather, HOWEVER – if you fail to apply equivalent naming conventions to CDASH/SDTM, fail to use matching Controlled Terminology, and still expect to achieve SDTM, you will be severely disappointed. Achieving SDTM will not merely be hard – it will be virtually impossible.

As for the statement that applying CDASH can create 70-90% savings: that is not the whole story.  Apply CDASH, plus standardization of all the other elements such as rules and visits, and automate testing and documentation – yes, then you can achieve 70-90% savings.

Sunday, January 24, 2010

CDISC Rules 2

In my last posting, I discussed potentially using ARDEN as a syntax for expanding CDISC ODM with rules.

After a couple of months of on and off investigation, I have decided that ARDEN is dead as an option. Actually, ARDEN is largely dead as a potential syntax in general.

The value of a rules syntax lies primarily in the potential ability to put context around data once it reaches a repository or data warehouse.

In theory, the transfer of rules would be of value in transferring a study definition between systems. However, I cannot think of a really valuable situation where this might happen. If data is captured into an IVR System and then transferred across to EDC - does it really matter if they both have access to the rules? Instead, the rules could be applied by one of the systems.

That last point takes me to the other reason why rules are less critical. If the last decade was about standards development, this new decade must be about standards application - and, in particular, the real-time exchange of data between systems. The need to validate data in System A first, before transferring it to System B, is only really necessary if System A cannot check directly with System B. With increasingly prevalent web services combined with standards, it will be possible to carry out these checks online.
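As a sketch of that last idea – the service interface below is entirely invented, with a stand-in function where a real deployment would make an HTTP call – System A asks System B to check each value at capture time, rather than batch-validating everything before a bulk transfer:

```python
# Sketch: System A checks each captured value directly with System B via a
# service call, instead of pre-validating before a bulk transfer. The
# "endpoint" here is a stand-in function; all names are illustrative.

def system_b_validate(item_oid, value):
    """Stand-in for System B's validation endpoint."""
    known_ranges = {"I_SYSBP": (60, 260)}  # invented reference data
    low, high = known_ranges.get(item_oid, (None, None))
    if low is not None and not (low <= value <= high):
        return {"ok": False, "message": f"{item_oid} out of range"}
    return {"ok": True, "message": None}

def capture(item_oid, value, validator=system_b_validate):
    """Capture a value in System A, checking with System B in real time."""
    result = validator(item_oid, value)
    return {"item": item_oid, "value": value,
            "accepted": result["ok"], "query": result["message"]}

good = capture("I_SYSBP", 120)
bad = capture("I_SYSBP", 400)
```

The design point is that neither system needs a transferable rules syntax: System B keeps its rules private and simply exposes a yes/no (plus query text) interface.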

Friday, October 30, 2009

CDISC Rules!

Ok, so a play on words. CDISC may rule in the field of Clinical Data standards, but, it does not rule in the standardisation of rules associated with data.


Let me expand here for those not familiar with the issue.


CDISC ODM provides a syntax for the definition of metadata (and data) used in the interchange of information between (and sometimes within) systems. CDISC ODM does not scope the definition of the edit check rules that are applied to the data when it is captured. I feel that is a significant omission, as the rules a) take considerable time to develop and b) provide context to the data.


Question - So, why do we not already have rules built into the standards?


Answer - rules are often technology or vendor specific. There are almost as many methods of implementing rules, as there are EDC products.


Question - Why not define a standard mechanism for creating rules that vendors could either comply with, or, support as part of interfacing?


Answer - Well, it all depends on what you want the rules to do. In their simplest form, rules are boolean expressions that result in the production of a Query or Discrepancy. However, many systems go well beyond simply raising queries. The boolean element of the rule may be consistent, but the activity performed when the boolean returns true is often very vendor specific.
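In sketch form – everything below is illustrative, not any vendor's actual rule model – the portable part of a rule is the boolean condition; the action it triggers is where systems diverge:

```python
# Sketch of a rule as "boolean condition + action". The condition is the
# portable, lowest-common-denominator part; the action (raise a query, send
# an email, derive a value...) is vendor territory. All names are invented.

def rule_weight_positive(data):
    """Boolean condition: true when the rule should fire."""
    return data.get("WEIGHT") is not None and data["WEIGHT"] <= 0

def action_raise_query(data):
    """The lowest-common-denominator action: produce a query."""
    return {"type": "query", "text": "Weight must be greater than zero"}

EDIT_CHECKS = [(rule_weight_positive, action_raise_query)]

def run_checks(data):
    """Fire each action whose boolean condition evaluates to true."""
    return [action(data) for cond, action in EDIT_CHECKS if cond(data)]

queries = run_checks({"WEIGHT": -4})
```

A standard could plausibly pin down the condition half and the raise-a-query action, while leaving richer actions as vendor extensions.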


Lowest Common Denominator


So - let's assume that we are looking at implementing a lowest common denominator of rules and actions that the majority of systems support and require.


What can we do to standardize a syntax? Three options, I think:


1) Choose a syntax from one of the leading vendors,


2). Develop a new syntax building on existing ODM conventions


3). Bring in another standard syntax, potentially already in the Health or LifeScience field


Let's look at them in order.


No. 1 - Choosing a leading vendor syntax is probably great for the chosen leading vendor, but bad for most other vendors. A benefit, though, would be that it is already proven as a means to represent rules and actions in a clinical study. Some syntaxes are based around standard tools such as Visual Basic for Applications, JavaScript or even SQL. This approach may create almost insurmountable barriers for other vendor systems that do not, or cannot, implement the technology - for example, it is not easy to interpret VBA on a non-Microsoft platform. So - option 1 has some potential but, depending on the chosen vendor, may result in closing the door to the standard for others.


No. 2 - Creating a new syntax would result in something most vendors would be happy with, but would require considerable effort from the contributors in order to develop a complete specification for the standard, as well as a reference implementation. The advantage of such an approach would of course be that it would be seen as a common standard open to all, and not specifically biased towards any one vendor company. In practice, the technology approach chosen would favor some more than others.


No. 3 - Leverage an existing Syntax may well bring the benefits of No. 2 without all the costs of designing something from scratch.


Ok, so let's say we go ahead with option 3 - what are the candidate standards for rules in the Health and/or LifeSciences fields?


As far as I can tell, not many. In fact, I was only able to find one candidate that had any level of success - a syntax called ARDEN.


ARDEN has existed since 1989 as a syntax for describing medical logic. Similar to typical rules in EDC, ARDEN rules are defined in modules - Medical Logic Modules - and called based on the triggering of an event.


[For an accurate definition of ARDEN and its roots, check Google Books - search for ARDEN Syntax and examine Clinical knowledge management: opportunities and challenges By Rajeev K. Bali, Pages 209 --> 211]


As a syntax, it is mostly general purpose. Here is a snippet from an Arden Syntax module:




logic:
    if last_creat is null and last_BUN is null then
        alert_text := "No recent serum creatinine available. Consider patient's kidney function before ordering contrast studies.";
        conclude true;
    elseif last_creat > 1.5 or last_BUN > 30 then
        alert_text := "Consider impaired kidney function when ordering contrast studies for this patient.";
        conclude true;
    else
        conclude false;
    endif;
;;




In the example, you can see that the syntax uses standard if/then/elseif/endif constructs, and assignments use the := operator.
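To illustrate just how general purpose the logic is, here is the same module transcribed into Python - a rough paraphrase for illustration, not an official Arden-to-Python mapping:

```python
# Rough Python paraphrase of the Arden MLM logic slot shown above - an
# illustration only, not an official mapping. In Arden, a comparison
# against null is false, which the None guards below replicate.

def kidney_alert(last_creat, last_bun):
    """Return (conclude, alert_text), mirroring the Arden logic slot."""
    if last_creat is None and last_bun is None:
        return True, ("No recent serum creatinine available. Consider "
                      "patient's kidney function before ordering contrast "
                      "studies.")
    elif (last_creat is not None and last_creat > 1.5) or \
         (last_bun is not None and last_bun > 30):
        return True, ("Consider impaired kidney function when ordering "
                      "contrast studies for this patient.")
    else:
        return False, None
```

The fact that the translation is nearly line-for-line is the point: the boolean-plus-action shape maps readily onto almost any host language, which is both ARDEN's strength and the reason vendors find it easy to invent their own equivalents instead.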


HL7 have a section dedicated to ARDEN here. The activity appears to be limited, with no postings or documentation since 2004. Walking through some of the presentations, some of the consumer companies, such as Eclipsys, were proposing extensions to the syntax - for example, to add object definition support. It would appear that take-up of the standard has been limited to those organizations that had a problem to solve in the EHR area and wanted to re-use a syntax instead of inventing their own.


The fact that HL7 has lent support to ARDEN may be sufficient in itself. However, we would need to dig considerably deeper to understand how the ARDEN syntax would fit with a syntax such as ODM. The first challenge is the conversion to an XML form; there are plenty of articles on ARDEN XML for further reading.


RuleML is another standard that may address the need to create Rules, as well as meeting the perceived need of being XML based.


More about ARDEN and RuleML in a later posting, I think. This one is quite long enough for today.



Friday, October 9, 2009

Source to eSource with EDC

One of the areas that I have felt for some time to be a compromise to the effectiveness of EDC is the subject of source data - or rather, the fact that source data is often not entered directly into an EDC system.


I appreciate that we have situations where this is impractical for logistical reasons - location of computers, circumstances of source data capture etc.


However, it is often mandated by the sponsor that source data is not logged in the EDC system, but is instead recorded elsewhere first.   Some advocates will point to the need to comply with the regulations that state data must remain 'at the site'.   Personally, I don't concur with this assessment. The data is 'at the site'. The cable connecting the screen to the computer might be very, very long (the internet), but the data is constantly available at the site. I probably shouldn't be flippant on this point, but the conservatism in conflict with progress strikes a nerve.


Transposing information from paper to an EDC screen introduces the potential for error.   In the old world of paper CDM, we had Double Data Entry as a method to catch transcription errors - two staff enter the same data from the paper CRF, and the differences are flagged and corrected.   With onsite EDC we don't have Double Data Entry, but we do have Source Data Verification. Instead of two staff sitting next to each other double-keying data, we have a monitor fly in to perform the second check. Yes, I know, they carry out other duties, but still. This seems like an enormous effort to check for transcription issues. It also has a massive effect in slowing down time to database lock.


So – where are the solutions?


1). Data at Site


We could put on our 'common sense hats' and make a statement that allows the entry of source data into the online EDC system. I know of a number of EDC companies that have simply placed this in their procedures; no audit findings or concerns have been raised as a result that I am aware of. Come on, regulators – why the delay in making a statement on this subject? The confusion caused is creating a measurable delay in bringing drugs to market!


2). Physical availability


This one is harder to tackle. When you have data to log, do you always have access to a system or device to log it on? Possibly not. Do you simply try to remember it, like an Italian waiter remembering a dinner order? I don't think so. We either need to provide portable data entry devices, or we accept paper transcription for these elements.


3). Differentiating Source from eSource


This does open up one area of concern. If we have some data as source, and some as eSource, how do we know which is which? When a monitor goes looking for the source data and doesn't find it, does that make it eSource? No.   There needs to be a very simple flagging and visual indication system built into such a mixed source/eSource system. I have seen this in one system, but it is very rare. Come on, EDC vendors – your turn here!


4). Other systems


Health Record systems, amongst others, will often be the first point of entry for data that may apply to an eCRF.   The current approach for organizations such as CDISC and HL7 is to create interfaces between systems. This will be a slow burner, I predict. There are so many hurdles in the way, and it requires active cooperation from both sides – EDC providers may be fully onboard, but I am not so sure about most EHR providers.


We may see a company emerge (maybe this is what some former PhaseForward employees are up to – who knows!) and develop an online Clinical Development Electronic Health Record system that is immediately EDC ready. Of course, with such a small deployment potential, I am not sure that we will see this appear at all sites in a global study; but for domestic application – say within a single country, inside research hospitals – it could work. I digress – back to the integration issue. Yes, when we have standards, the feeds will start to happen, but not for many years.


5). Patient Recorded Information


ePro, patient diaries and the like are increasingly realistic for the accurate capture of patient data in the right sort of study. If used appropriately, they can cut down the volume of data that needs to be transposed into EDC, and therefore source data verified.


On a side note, I am sure we will see a downloadable iPhone app that will make the current hardware/software dependent or basic PDA browser based systems seem old and tired.


The advantage of diary-based systems is that instead of an investigator quizzing a patient, writing down the responses, transposing the responses and so on, the information is captured 'at source', often considerably closer to the time of the event.


Expanding from my previous post, I expect to see systems like PatientsLikeMe expand onto portable devices and act as an entry point for both patient identification and data entry. Internet enabled device pervasiveness will simply make this happen.




Conclusion


A lack of eSource has a direct impact on the time it takes to lock data. With adaptive clinical trials executed by forward-thinking sponsor companies, the point at which an adaptation can occur corresponds directly with how long it takes for the endpoint datapoints that drive the sample-size decision to reach locked status. To simplify: the less eSource you have, the longer your studies will take. Forget a few days - we are talking wasted months.