
Thursday, October 30, 2008

EDC Rules - When should they run?

All EDC systems have some form of rules facility. The rules are typically designed to check data; if a check fails, a query is produced.

Web-based EDC systems typically run edit checks when a page is submitted.

Prior to the arrival of web-based systems, it was typical to check the data as soon as it was entered - between fields.

With Web 2.0 technologies, it may prove possible to routinely run checks as soon as data is entered - prior to page submit.
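To make the contrast concrete, here is a minimal sketch - the field name, range and query text are hypothetical, and this reflects no particular vendor's implementation - showing how a single edit check definition might be run either per field, the moment a value is entered, or in a batch at page submit:

    // Hypothetical sketch: one edit check definition, two run timings.
    type EditCheck = {
      field: string;
      test: (value: string) => boolean;
      queryText: string;
    };

    const checks: EditCheck[] = [{
      field: "SYSBP", // hypothetical field name
      test: (v) => Number(v) >= 60 && Number(v) <= 250,
      queryText: "Systolic BP outside expected range (60-250).",
    }];

    // Web 2.0 style: check each field as soon as it loses focus.
    function attachFieldLevelChecks(form: HTMLFormElement): void {
      for (const check of checks) {
        const input = form.elements.namedItem(check.field) as HTMLInputElement | null;
        input?.addEventListener("blur", () => {
          if (input.value !== "" && !check.test(input.value)) {
            console.warn(`Query on ${check.field}: ${check.queryText}`);
          }
        });
      }
    }

    // Classic web EDC style: run every check only at page submit.
    function runChecksOnSubmit(form: HTMLFormElement): string[] {
      const queries: string[] = [];
      for (const check of checks) {
        const input = form.elements.namedItem(check.field) as HTMLInputElement | null;
        if (input && input.value !== "" && !check.test(input.value)) {
          queries.push(`${check.field}: ${check.queryText}`);
        }
      }
      return queries;
    }

The check logic is identical in both cases - the question is purely one of timing and user experience.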

So - a question - which approach is best?

Monday, October 27, 2008

Why eClinical fails to deliver significant ROI


I stumbled across an article posted on ClinPage back in May 2008 that reported on a presentation given by Ron Waife. I cannot say I always agree with Ron's assessments; however, I believe he is 100% accurate with his analysis on this occasion. Steve Woody also made an interesting point regarding a potential solution to the problem.

The crux of Ron's position is that sponsor companies are fundamentally failing to take advantage of eClinical technologies, primarily due to a failure to embrace new processes and to break down silo-based working models.

Ron makes a sensible suggestion regarding a potential model that could work - a skunkworks approach - and it is one I fully share.

If there are any Pharma execs out there with the power to make change happen, they would do very well to listen to Ron's advice.

An interpretation of the proposal - purposely greatly simplified! - is as follows:

  1. Take an adaptive-friendly drug program...
  2. Create a skunkworks team comprising a small number of open-minded individuals from each existing department - Protocol Writer, (e)Study Builder, Statistician, Safety Manager, Clinical Lead etc.
  3. Put them in a 'virtual' room, and ask them to work tightly together.
  4. The team must work on an 'Agile' style development approach - [ I will expand on this in a later post ]
  5. The program / studies will be adaptive - the data will be available early and the decisions made rapidly.
  6. The Statistician - playing an active, leading role throughout the program - will model the original program, assess the ongoing (daily) execution against the model and adapt accordingly.
  7. The leader of this team should be measured on the effectiveness of the program - positive or negative - against the plan.

Sometimes, I think we are too focused on shaving a few days off the time to DB lock. With an agile, adaptive approach, could we not be thinking in terms of months or even years of savings?

Steve's suggestion was that a business model approach might concentrate the minds of the sponsor companies. His statement regarding the CRO industry:

... which was created and is sustained by the inefficiency of clinical research, is hooked on the heroin (money).

may come across as rather strong, but I believe there is a degree of truth here. CROs are often the most conservative when it comes to change... 'let's do whatever the client that pays the money wants...' even if it is not necessarily good for them...

However, and this is a big 'however'... CRO companies act conservatively because of a need to provide low-risk solutions. How many sponsor organizations want to hear about a new 'high risk' implementation method that will be applied to the trial they are responsible for? So - I don't think the blame is entirely merited.

Moving off topic now, so I will close this post... I am interested in hearing comments...

Friday, October 24, 2008

Should eClinical systems be 'EndPoint' aware?


EDC Wizard made some interesting points in response to the earlier posting 'EDC Endpoints'.

The original posting was probably incorrectly titled. It should perhaps have said - "Should eClinical systems be EndPoint aware?"

I tend to stay away from the term EDC when I can. I think this term no longer really applies to some of the leading 'EDC' vendors. They are still labelled as EDC because customers expect to purchase an EDC solution; however, today, they are more 'Clinical Trial Support Suites'. Vendors are adding more and more upstream and downstream functionality, and in doing so, some are waking up to the fact that the 'bit in the middle' - the data capture and cleaning part - may benefit from early involvement from parties traditionally left out of the mix.

SDV'ing is an interesting point. EDC Wizard states that

Many sponsors are implementing reduced SDV plans that take a risk-based approach to comparing source data to EDC entries

The activity list for Monitors will increasingly be led by the eClinical system tools, which track what has, and what has not, been SDV'd. With a percentage being applied, the eClinical tool needs a model that applies this percentage appropriately. I am not aware of a tool that has successfully implemented this. Another challenge exists regarding the classification of data that is eSource and that which is not.
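Purely as an illustration - the types and the 'always verify critical fields' rule below are my own assumptions, not any vendor's actual model - a reduced-SDV selection might take a shape like this:

    // Hypothetical sketch of applying a reduced-SDV percentage.
    type DataValue = { subject: string; field: string; critical: boolean };

    function selectForSdv(values: DataValue[], percent: number): DataValue[] {
      // Critical (e.g. endpoint or safety) fields are always verified;
      // the remainder are sampled to meet the target percentage.
      return values.filter((v) => v.critical || Math.random() * 100 < percent);
    }

A real tool would of course need the sample to be reproducible and auditable rather than drawn from Math.random() - which is precisely where the modelling difficulty lies.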

What has, and what has not, been SDV'd should not be shown to the Investigator by the tool. I believe most tools support differing views based on user roles; this functionality should be applied here.

EDC Wizard goes on to say:

I am not sure I would recommend that EDC systems be modified to flag data as primary, secondary, SDV, or non-SDV. It's hard enough to move from protocol to EDC database to study start without adding more complications to database builds.

A very valid point - tools are becoming increasingly complex, and 'Keep it Simple' is certainly a solid principle to hold where possible. However, with the current model of blanket significance / locking --> data delivery, I think we are missing an opportunity for early decision making. If the move towards 'define once, use many times' continues to be applied within eClinical systems, then complexity may reduce rather than increase - define the endpoint criteria up-front in one place, and have this information flow downstream into EDC and on to data delivery.

Thursday, October 23, 2008

EDC Endpoints

An endpoint is defined as an event or outcome that can be measured objectively to determine whether the intervention being studied is beneficial.

EDC systems often ignore the importance of the definition of an endpoint. As far as an EDC system is concerned, all data is effectively considered equally significant. [Possibly correspondents from Medidata and/or Phase Forward can correct me on how Rave and/or InForm, respectively, handle this.]

Let's say that in a sample clinical trial you have 100 pages of information captured for a subject, and 10 questions per page. That is a total of 1000 data values that potentially have to be captured. The capture and cleaning process typically involves entry, review, SDV and freeze/lock. The time to perform this for a key data value is the same as the time for an item that has limited significance.

EDC systems typically use a hierarchical tree structure for status handling. Every data value is associated with a status. A page status reflects the status of all the data values on the page; the visit status reflects all the CRF pages in the visit, and so on. However, this does place a common blanket significance on all the data that is captured.
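As a simplified sketch - the statuses and structure here are illustrative only - the roll-up looks something like this:

    // Illustrative roll-up: every value carries equal weight.
    type Status = "clean" | "dirty";

    interface Item { status: Status; }
    interface Page { items: Item[]; }
    interface Visit { pages: Page[]; }

    const pageStatus = (p: Page): Status =>
      p.items.every((i) => i.status === "clean") ? "clean" : "dirty";

    const visitStatus = (v: Visit): Status =>
      v.pages.every((p) => pageStatus(p) === "clean") ? "clean" : "dirty";

    // A single unverified, low-significance item keeps the whole
    // visit - and ultimately the subject - 'dirty'.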

It could be argued that all the data captured in a study is of equivalent significance - the protocol stated a requirement to capture it for some reason. However, I believe the subset of captured information that actually carries endpoint significance can be defined at the outset. The question is - going back to our example with 1000 data values per subject - is it possible to make an early assessment of the data, based on a statistically safe error threshold, rather than wait until all subjects, all visits, all pages and all data values are locked?

For example, let us consider efficacy, and in particular efficacy in a Phase II dose escalation study. Information on the dosing of a subject, followed by the resulting measurements of effectiveness, may occur relatively early in the overall duration of a trial. However, a blanket 'clean versus not clean' rule means that none of the data can be examined until either ALL the data achieves a full DB lock, or an interim DB lock (all visits up to a defined point) is achieved.

So - a question to the readers - is it possible to make assessments on data even if a portion of the data is either missing, or unverified?

One potential solution might be a sub-classification of data (or rather metadata).

When defining fields, a classification could be assigned that identifies a recorded value as 'endpoint' significant. The actual number of potential endpoints could be list-based and defined at a system level. One primary endpoint would be supported, with as many secondary endpoints as necessary. A value might be classified against one or more endpoint classifications.

The key to the value of this would lie in the cleaning and data delivery. Rather than determining a tree status based on all the data values captured, the tree status would be an accumulation of only those data values that fall within the endpoint classification.
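Building on the roll-up sketch above - again with invented names and types rather than any vendor's - endpoint-aware status accumulation might look like this:

    // Illustrative endpoint-aware roll-up.
    type ItemStatus = "clean" | "dirty";

    interface ClassifiedItem {
      status: ItemStatus;
      endpoints: string[]; // e.g. ["primary-efficacy"]; empty = no endpoint significance
    }

    // A subject is 'clean' for a given endpoint as soon as every value
    // classified against that endpoint is clean - regardless of the rest.
    function endpointStatus(items: ClassifiedItem[], endpoint: string): ItemStatus {
      const relevant = items.filter((i) => i.endpoints.includes(endpoint));
      return relevant.every((i) => i.status === "clean") ? "clean" : "dirty";
    }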

So - with our example, let's say that of the 1000 data values captured per subject, only 150 might be considered of endpoint significance for efficacy. Once all of those data values are captured and designated as 'clean', the data would be usable for immediate statistical analysis. Of course, other secondary endpoints may exist that will demand longer-term analysis of the subject data - for example, follow-ups.

[Chart: Mock Endpoint Significance]

The chart models that, with a typical data capture / cleaning cycle plus ongoing analysis of endpoint-significant data, statistically significant efficacy is determined at 3 months rather than 5.

The potential value that can be gained from making early decisions has been well proven; adaptive clinical trials often rely on the principle. By delivering data in a statistically safe state of cleanliness earlier, we could potentially greatly accelerate the overall development process.

Friday, October 3, 2008

Paul Bleicher departs PhaseForward

I was interested to hear that Paul Bleicher has stepped down from the Chair of PF to focus on a new venture in Healthcare Informatics. I wonder if this is in any way related to the (potentially scurrilous) gossip that they were looking at procuring ClickFind from Datatrak.

Probably not, but the timing is interesting - just when senior management must be looking at the core technology to determine whether it has it in it to go after the SaaS or PaaS market.

Dr Bleicher's departure means that the originals at PF - Richard Dale, Paul Bleicher, Jeff Klofft and Gilbert Benghiat - have all now moved on. Richard and Jeff, in my view, were the original technical visionaries, supported by some good initial developers led by Gil. Bleicher gave the company its initial credibility with his CRO and medical background - in some ways he was the 'expert' user, and he clearly has a good head for entrepreneurial business.

Anyway - good luck to Dr Bleicher!

Interesting times...

Thursday, October 2, 2008

Return of the 4GL for eClinical? - Part 1


In the 1980s, the thing of the day as far as database application development was concerned was the RAD 4GL - the Rapid Application Development Fourth Generation Language. These tools were popular because they tackled the problem of slow software development with complex generic tools, offering high-level constructs for developing database applications. If you wanted to draw pretty pictures - sorry. If you wanted to control real-time machinery - sorry. But if you wanted to write a database application - yes, they worked very well.

In the last couple of years, two particular technologies have been popular with developers - Ruby on Rails and, more recently, Django. These are based on third-generation languages - Ruby and Python respectively - extended through the development of a standard framework. The frameworks have been developed to support database-driven website applications. These are, in a way, the 4GLs of the 21st century.

I was one of these early 4GL developers for a number of years. In my young, exuberant days, I used to boast that I could write a full multi-user stock control system from scratch in 3 days. [The truth was that, due to a failed backup, I did actually have to write (again) a full stock control system for a client in 3 days!]

One particular 4GL tool that I was especially proficient with produced database tables, menus, forms, event-driven code, database triggers, reports etc. I suppose, looking back, it was a bit like Oracle Forms, but without the nasty complex parts or the heavyweight toolset.

One of the attributes of the tool was that it provided a programming syntax that was sufficiently business-aware to be relevant for business functions, while at the same time sufficiently flexible to be capable of developing complex database applications. It was the closest syntax I have seen to natural language - the sort of syntax that the developers of SQL and PL/SQL might have produced if they had started again in the mid-80s. The language was sufficient for even the most complex database applications without having to resort to a third-generation language such as C or Fortran. [Oh dear - I am sounding a bit like an IBM OS/2 user, bitter about Microsoft winning through with Windows!]

Anyway, I am getting off the point.

In thinking about eClinical technologies, and in particular EDC tools, I have wondered why a company has not created a fourth-generation trial development tool that offers similar generic features for database, forms and rules authoring, while embedding standard features such as audit trailing, flag setting, security and web enablement. At this point, I am sure some readers will be saying - oh, but such tools do exist. Well, yes, you do have 'Study Building' tools, but they are very specific. A general language is not provided that can be used across the toolset.

Oracle Corp, eResearch Technology and Domain went down similar routes with Oracle Clinical, eDM (DLB Recorder) and ClinTrial, attempting to leverage existing tools - Oracle Forms in the first two cases, and PowerBuilder for ClinTrial. However, these tools were not really designed specifically for eClinical. You ended up using a high-level language to dynamically create high-level syntax - for example, dynamic SQL. This became very complicated, proprietary and often slow. The normalization of the Oracle Clinical database is an example of where the natural attributes of the Oracle RDBMS and the Forms tools just weren't sufficiently flexible to handle fully dynamic data structures.

Why might an eClinical 4GL make sense today?

Two principles of a 4GL were high abstraction and greater statement power.

Abstraction, in that you could create data capture forms and reports that were sufficiently abstracted from the database to ensure the user did not need to understand the underlying data structure in order to use the application effectively.

Greater statement power allowed a small amount of readable code to do a large amount of work.

Both of the above attributes are relevant to the world of eClinical. 
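As a thought experiment - defineForm below is an imagined framework call, not a feature of any real product - here is how the two attributes might combine in an eClinical setting: one short, readable statement from which storage, screens and range checks could all be derived.

    // Hypothetical 'form definition' with 4GL-like statement power.
    type FieldSpec = { type: "integer" | "text"; range?: [number, number]; units?: string };

    // In a real framework this call would generate tables, screens and
    // edit checks; here it just returns a validator to show the reach
    // of a single statement.
    function defineForm(name: string, fields: Record<string, FieldSpec>) {
      return {
        name,
        validate(values: Record<string, string>): string[] {
          return Object.entries(fields).flatMap(([field, spec]) => {
            const v = values[field];
            if (spec.type === "integer" && spec.range && v !== undefined) {
              const n = Number(v);
              if (Number.isNaN(n) || n < spec.range[0] || n > spec.range[1]) {
                return [`${field}: expected ${spec.range[0]}-${spec.range[1]} ${spec.units ?? ""}`];
              }
            }
            return [];
          });
        },
      };
    }

    // One readable, high-level authoring statement:
    const vitals = defineForm("VITALS", {
      SYSBP: { type: "integer", range: [60, 250], units: "mmHg" },
      DIABP: { type: "integer", range: [30, 150], units: "mmHg" },
    });

    console.log(vitals.validate({ SYSBP: "300", DIABP: "80" }));
    // -> ["SYSBP: expected 60-250 mmHg"]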

The challenge when designing a good EDC tool is to provide a framework that is as friendly as possible, while at the same time providing sufficient flexibility to perform all of the functions that might be required. Vendors have achieved this by going down one of two routes: either a data-driven approach, where the syntax for rules is built up from menus (i.e. lists of Visits, Forms etc.), or a free-form syntax route using something like VBScript. Both approaches fail to a degree.

A purely data-tables-driven approach is very limited in the constructs that can be built up. Often, tools have had to fall back on lower-level approaches in order to fill the gaps. Also, because the syntax is effectively built from parameters that are fed into routines within the application tool, performance can be poor - optimization is very difficult to achieve.

A free-form syntax route also causes problems. You need to test the validity of the script in a similar fashion to the testing of the actual core product. The more flexibility, the more room for unexpected actions or results in the deployed system.
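To put the two routes side by side - both snippets below are invented for this post, not taken from any product - consider the same range check expressed each way:

    // Route 1 - data-driven: the rule is just parameters fed into a
    // fixed routine inside the tool. Safe, but only the checks that
    // routine already understands can ever be expressed.
    const parameterisedRule = {
      form: "VITALS",
      field: "SYSBP",
      operator: "BETWEEN",
      low: 60,
      high: 250,
    };

    // Route 2 - free-form script: anything can be expressed, but the
    // script itself now needs validating much as the core product does.
    const scriptedRule = `
      if (value("VITALS", "SYSBP") < 60 || value("VITALS", "SYSBP") > 250) {
        raiseQuery("Systolic BP out of range");
      }
    `;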

So - what is the answer?

Could a hybrid - in this context, a 4GL hybrid syntax that runs within a 4GL application framework - be the solution?

  • Should the hybrid syntax be based on a pre-existing language such as ECMAScript, Ruby, Python or some other?
  • Should the database interaction be transparently built into the language (a la MUMPS)?
  • Should datatyping be strict or loose?...   [ what is datatyping anyway? ]
  • MVC - what is it, and is it relevant?

I plan on answering these questions in a future posting.