Electronic Data Capture - Technology Blog

Thursday, October 30, 2008

EDC Rules - When should they run?

All EDC systems have some form of rules facility. The rules are typically designed to check data. If the check fails, then typically a query is produced.

Web based EDC systems typically run edit checks when a page is submitted.

Prior to the release of web based systems, it was typical to check the data as soon as it was entered - between fields.

With Web 2.0 technologies, it may prove possible to routinely run checks as soon as data is entered - prior to page submit.
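To make the two options concrete, here is a minimal Python sketch. It is not taken from any EDC product; the `EditCheck` class, the `run_checks` function and the blood pressure fields are all made up for illustration. The same set of rules is run either when a single field changes or when the whole page is submitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class EditCheck:
    """A hypothetical rule: yields a query message when its test fails."""
    name: str
    fields: List[str]                     # fields the rule depends on
    test: Callable[[Dict], bool]          # returns True when the data is acceptable
    message: str


def run_checks(checks: List[EditCheck], page: Dict,
               changed_field: Optional[str] = None) -> List[str]:
    """Run checks against one page of data.

    changed_field=None -> page submit: consider every check.
    changed_field="x"  -> field exit: consider only checks that use that field.
    In both modes a check is skipped while any of its inputs is still empty.
    """
    queries = []
    for check in checks:
        if changed_field is not None and changed_field not in check.fields:
            continue
        if any(page.get(f) is None for f in check.fields):
            continue  # cannot evaluate yet
        if not check.test(page):
            queries.append(check.message)
    return queries


checks = [
    EditCheck("SBP_RANGE", ["sbp"], lambda p: 60 <= p["sbp"] <= 250,
              "SBP out of range"),
    EditCheck("SBP_GT_DBP", ["sbp", "dbp"], lambda p: p["sbp"] > p["dbp"],
              "SBP must exceed DBP"),
]

page = {"sbp": 300, "dbp": None}
on_field_exit = run_checks(checks, page, changed_field="sbp")  # between fields
page["dbp"] = 310
on_submit = run_checks(checks, page)                           # at page submit
```

In this sketch the single-field check can fire as soon as the field is left, while the cross-field check has to wait until both values exist. That is one reason the "between fields" approach and the "on submit" approach may not be interchangeable for every rule.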

So - a question - which approach is best?

Monday, October 27, 2008

Why eClinical fails to deliver significant ROI

I stumbled across an article posted in ClinPage back in May 2008 that reported on a presentation given by Ron Waife. I cannot say I always agree with Ron's assessments; however, I believe he is 100% accurate with his analysis on this occasion. Steve Woody also made an interesting point regarding a potential solution to the problem.

The crux of Ron's position is that Sponsor companies are fundamentally failing to take advantage of eClinical technologies, primarily due to a failure to embrace new processes and to break down silo-based working models.

Ron makes a sensible suggestion regarding a potential model that will work - a skunk works approach - and it is a view that I fully share.

If there are any Pharma execs out there with the power to make change happen - they would do very well to listen to Ron's advice.

An interpretation of the proposal follows - purposely greatly simplified!

Take an adaptive friendly drug program...
Create a skunk works team comprising a small number of open minded individuals from each existing department - Protocol Writer, (e)Study Builder, Statistician, Safety Manager, Clinical Lead etc.
Put them in a 'virtual' room, and ask them to work tightly together.
The team must work on an 'Agile' style development approach - [ I will expand on this in a later post ]
The program / studies will be adaptive - the data will be available early and the decisions made rapidly.
The Statistician - playing an active, leading role throughout the program - will model the original program, assess the ongoing (daily) execution against the model and adapt accordingly.
A leader of this team should be measured based on the effectiveness of the Program - positive or negative - against a plan.

Sometimes, I think we are too focused on shaving a few days off the time to DB Lock. With an agile adaptive approach - could we not be thinking months and even years of savings?

Steve's suggestion was that a focus on a business model approach might focus the minds of the sponsor companies. His statement regarding the CRO industry;

... which was created and is sustained by the inefficiency of clinical research, is hooked on the heroin (money).

may come across as rather strong, but I believe there is a degree of truth here. CROs are often the most conservative when it comes to change... 'let's do whatever the client that pays the money wants...' even if it is not necessarily good for them...

However, and this is a big 'however'... CRO companies do act on a conservative basis due to a need to provide low risk solutions. How many sponsor organizations want to hear about a new 'high risk' implementation method that will be applied to the trial they are responsible for? So - I don't think the blame is entirely merited.

Moving off topic now, so I will close this post... I am interested in hearing comments...

Friday, October 24, 2008

Should eClinical systems be 'EndPoint' aware

EDC Wizard made some interesting points in response to the earlier posting 'EDC EndPoints'.

The original posting was probably incorrectly titled. It should have maybe said - "Should eClinical systems be EndPoint aware?"

I tend to stay away from the term EDC when I can. I think this term does not really apply now to some of the leading 'EDC' vendors. I think they are still labelled as EDC as customers expect to purchase an EDC solution. However, today, they are more 'Clinical Trial Support Suites'. Vendors are adding more and more upstream and downstream functionality. In doing so, some are clueing up to the fact that the 'bit in the middle' - the data capture and cleaning part - may benefit from early involvement from other parties traditionally left out of the mix.

SDV'ing is an interesting point. EDC Wizard states that

Many sponsors are implementing reduced SDV plans that take a risk-based approach to comparing source data to EDC entries

The activity list for Monitors will increasingly be led by the eClinical system tools. They track what has, and what has not been SDV'd. Where only a percentage of the data is to be SDV'd, the eClinical tool needs a model that applies this percentage appropriately. I am not aware of a tool that has successfully implemented this. Another challenge exists regarding the classification of data that is eSource and that which is not.

What has, and what has not been SDV'd should not be shown to the Investigator by the tool. I believe most tools support differing views based on user roles. This functionality should be applied.

EDC Wizard goes on to say;

I am not sure I would recommend that EDC systems be modified to flag data as primary, secondary, SDV, or non-SDV. It's hard enough to move from protocol to EDC database to study start without adding more complications to database builds.

A very valid point - tools are becoming increasingly complex. 'Keep it Simple' is certainly a solid principle to hold where possible. However - with the current model of blanket significance / locking --> data delivery, I think we are missing an opportunity for early decision making. If the move towards define once, use many times continues to be applied with eClinical systems, then complexity may reduce rather than increase - define the endpoint criteria up-front in one place, and have this information flow downstream into EDC and onto data delivery.

Thursday, October 23, 2008

EDC Endpoints

Endpoints are defined as an event or outcome that can be measured objectively to determine whether the intervention being studied is beneficial.

EDC systems often ignore the importance of the definition of an EndPoint. As far as an EDC system is concerned, all data is effectively considered equally significant. [Possibly correspondents from Medidata and/or Phaseforward can correct me on how Rave and/or Inform respectively, handle this.]

Let's say in a sample clinical trial, you have 100 pages of information captured for a subject, and 10 questions per page. That is a total of 1000 data values that potentially have to be captured. The capture and cleaning process typically involves entry, review, SDV and freeze/lock. The time to perform this for a key data value is the same as the time for an item that has limited significance.

EDC systems typically use a hierarchical tree structure of status handling. Every data value is associated with a status. A Page status is reflective of the status of all the data values on the page. The visit status is reflective of all the CRF Pages in the visit etc. However, this does place a common blanket significance on all data that is captured.
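As a rough illustration of that roll-up, the Python sketch below treats a parent as being only as far along as its least advanced child. The status names and the `roll_up` helper are my own assumptions for the example; real systems will have their own status models.

```python
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical cleaning states, ordered from least to most advanced."""
    ENTERED = 1
    REVIEWED = 2
    VERIFIED = 3   # SDV complete
    LOCKED = 4


def roll_up(statuses):
    """A parent node takes the status of its least advanced child."""
    return min(statuses)


# One visit with two pages; every data value carries its own status.
visit = {
    "Vitals": [Status.LOCKED, Status.LOCKED],
    "Adverse Events": [Status.LOCKED, Status.ENTERED],
}

page_status = {page: roll_up(values) for page, values in visit.items()}
visit_status = roll_up(page_status.values())
```

Here a single value that has only been entered holds the whole visit at 'entered', regardless of how important that value is. That is the blanket significance in action.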

It could be argued that all data is of equivalent significance in the execution of a study - the protocol stated a requirement to capture the data for some reason. However, I believe the subset of captured information that actually carries endpoint significance can be defined at the outset. The question is - going back to our example with 1000 data values per subject - is it possible to make an early assessment of data, based on a statistically safe error threshold, rather than wait until all subjects, all visits, all pages and all data values are locked?

For example, let us consider efficacy and in particular efficacy in a Phase II Dose Escalation study. Information on the dosing of a subject, followed by the resulting measurements of effectiveness, may occur relatively quickly in the overall duration of a trial. However, a blanket 'clean versus not clean' rule means that none of the data can be examined until either ALL the data achieves a full DB lock, or an Interim DB Lock (all visits up to a defined point) is achieved.

So - a question to the readers - is it possible to make assessments on data even if a portion of the data is either missing, or unverified?

One potential solution might be a sub-classification of data (or rather metadata).

When defining fields, a classification could be assigned that identifies a recorded value as 'end-point' significant. The actual number of potential endpoints could be list based and defined at a system level. One Primary end-point would be supported with as many secondary end-points as necessary. A value might be classified against 1 or more endpoint classifications.

The key to the value of this would be on the cleaning and data delivery. Rather than determining a tree status based on all data values captured, the tree status would be an accumulation of the data values that fell within the endpoint classification.

So - with our example, let's say that of the 1000 data values captured per subject only 150 might be considered of endpoint significance for efficacy. Once all of these data values are captured and designated as 'clean', then the data would be usable for immediate statistical analysis. Of course other secondary end-points may exist that will demand longer term analysis of the subject data - for example follow-ups.

The chart models a typical data capture / cleaning cycle with ongoing analysis of end-point significant data - statistically significant efficacy is determined at 3 months rather than 5.

The potential value that can be gained when making early decisions has been well proven. Adaptive Clinical trials often rely on the principle. By delivering data of a statistically safe state of cleanliness earlier, we could potentially greatly accelerate the overall development process.
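A small Python sketch of the idea, under my own assumptions: each data value carries a set of endpoint classifications, and the 'clean' roll-up can be asked about one classification instead of everything. The `DataValue` class, the `is_clean` function and the `PRIMARY_EFFICACY` label are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class DataValue:
    name: str
    clean: bool
    endpoints: FrozenSet[str] = frozenset()   # zero or more endpoint classifications


def is_clean(values: Iterable[DataValue], endpoint: Optional[str] = None) -> bool:
    """endpoint=None reproduces the blanket rule (every value must be clean).
    Otherwise only values classified against that endpoint are considered."""
    relevant = [v for v in values if endpoint is None or endpoint in v.endpoints]
    return bool(relevant) and all(v.clean for v in relevant)


subject = [
    DataValue("DOSE", True, frozenset({"PRIMARY_EFFICACY"})),
    DataValue("RESPONSE", True, frozenset({"PRIMARY_EFFICACY"})),
    DataValue("CONMED_NOTE", False),           # not endpoint significant
]

blanket_ready = is_clean(subject)                       # blanket rule
efficacy_ready = is_clean(subject, "PRIMARY_EFFICACY")  # endpoint view
```

Under the blanket rule the outstanding note blocks everything. Under the endpoint view the efficacy data is already available for analysis.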

Friday, October 3, 2008

Paul Bleicher departs PhaseForward

I was interested to hear that Paul Bleicher has stepped down from the Chair of PF to focus on a new venture in Healthcare Informatics. I wonder if this is in any way related to the (potentially scurrilous) gossip that they were looking at procuring ClickFind from Datatrak.

Probably not, but the timing is interesting - just when senior management must be looking at the core technology to determine if it has it in it to go after the SaaS or PaaS market.

Dr Bleicher's departure means that the originals at PF - Richard Dale, Paul Bleicher, Jeff Klofft and Gilbert Benghiat - have all now moved on. Richard and Jeff in my view were the original technical visionaries that were supported by some good initial developers led by Gil. Bleicher gave the company that initial credibility with his CRO and Medical background - in some ways he was the 'expert' user, and clearly has a good head for entrepreneurial business.

Anyway - good luck to Dr Bleicher!

Interesting times...

Thursday, October 2, 2008

Return of the 4GL for eClinical? - Part 1

In the 1980s the thing of the day as far as Database Application development was concerned was RAD 4GLs. That is - Rapid Application Development Fourth Generation Languages. They were popular because they tackled the problem of slow software development with complex generic tools. They offered high level constructs for developing Database applications. If you wanted to draw pretty pictures - sorry. If you wanted to control real-time machinery - sorry. But, if you wanted to write a Database Application - yes, they worked very well.

In the last couple of years, two particular technologies have been popular with developers - Ruby on Rails and, more recently, Django. These are based on 3rd generation tools - Ruby and Python respectively - extended through the development of a standard framework. The frameworks have been developed for supporting Database Driven Website applications. These are, in a way, the 4GLs for the 21st Century.

I was one of these early 4GL Developers for a number of years. In my young exuberant days, I used to boast that I could write a full multi-user Stock Control System from scratch in 3 days. [The truth was, that due to a failed backup - I did actually have to write (again) a full Stock Control System for a client in 3 days!]

One particular 4GL Tool that I was particularly proficient at produced database tables, menus, forms, event driven code, database triggers, reports etc. I suppose looking back it was a bit like Oracle Forms, but without the nasty complex parts, or the heavy weight toolset.

One of the attributes of the tool was that it provided a programming syntax that was sufficiently business aware to make it relevant for business functions, while at the same time sufficiently flexible to be capable of developing complex database applications. It was the closest syntax I have seen to natural language. It was the sort of syntax that the developers of SQL and PL/SQL might have produced if they had started again in the mid 80s. The language was sufficient for even the most complex Database applications without having to resort to a 3rd Generation language such as C or Fortran. [Oh dear - I am sounding a bit like an IBM OS/2 user, bitter about Microsoft winning through with Windows!]

Anyway, I am getting off the point.

In thinking about eClinical technologies, and, in particular EDC Tools, I have wondered why a company has not created a 4th Generation Trial Development Tool that offers similar generic features for database, forms and rules authoring while embedding standard features such as standard audit trailing, flag setting, security and web enablement. At this point, I am sure some readers will be saying - oh but such tools do exist. Well, yes, you do have 'Study Building' tools, but, they are very specific. A general language is not provided that can be used across the tool set.

Oracle Corp, eResearch Technology and Domain went down similar routes with Oracle Clinical, eDM(DLB Recorder) and ClinTrial by attempting to leverage existing tools from Oracle Forms x 2, and Powerbuilder (ClinTrial). However, these tools were not really designed for eClinical specifically. You ended up using a high level language to dynamically create high level syntax - for example Dynamic SQL. This became very complicated, proprietary and often slow. The normalization of the Oracle Clinical Database is an example of where the natural attributes of the Oracle RDBMS and the Forms tools just weren't sufficiently flexible to handle fully dynamic data structures.

Why might an eClinical 4GL make sense today?

Two principles of a 4GL were High abstraction and Greater statement power.

Abstraction in that you could create data capture forms and reports that were sufficiently abstracted from the database to ensure the user did not need to understand the underlying data structure in order to effectively use the application.

Greater Statement power allowed a small amount of readable code to do a large amount of work.

Both of the above attributes are relevant to the world of eClinical.

The challenge when designing a good EDC tool is to provide a framework that is as friendly as possible, while at the same time providing sufficient flexibility to perform all of the functions that might be required. Vendors have achieved this by going down one of two routes. Either the data driven approach, where the syntax for rules is built up from menus (i.e. lists of Visits, Forms etc), or a free form syntax route using something like VBScript. Both approaches fail to a degree.

A purely data tables driven approach is very limited in the constructs that can be built up. Often, tools have had to fall back to lower level approaches in order to fill the gaps. Also, because the syntax is effectively built from parameters that are fed into routines within the application tool, the performance can be poor. Optimization is very difficult to achieve.

A free form syntax route also causes problems. You need to test the validity of the script in a similar fashion to the testing of the actual core product. The more flexibility - the more room for unexpected actions or results in the deployed system.
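To show what I mean by the data driven route, here is a deliberately tiny Python sketch. The rule is nothing more than a row of parameters (the sort of thing picked from menus), and a generic routine interprets it. The field names, the `OPERATORS` table and the `evaluate` function are invented for the example and do not describe any particular vendor's tool.

```python
import operator

# The only comparisons this hypothetical rule engine understands.
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}

# A rule as data: each entry could have come from a drop-down list.
rule = {
    "visit": "SCREENING",
    "form": "VITALS",
    "field": "SBP",
    "op": "<=",
    "value": 250,
    "query": "SBP above expected range",
}


def evaluate(rule, data):
    """Interpret one parameterised rule; return the query text on failure."""
    actual = data[(rule["visit"], rule["form"], rule["field"])]
    passed = OPERATORS[rule["op"]](actual, rule["value"])
    return None if passed else rule["query"]


raised = evaluate(rule, {("SCREENING", "VITALS", "SBP"): 300})
quiet = evaluate(rule, {("SCREENING", "VITALS", "SBP"): 120})
```

The limitation shows quickly. Anything the parameter set did not anticipate (a comparison across visits, a date calculation) cannot be expressed without extending the engine, and every rule pays the cost of being interpreted. The free form route removes that ceiling, but then every script becomes code that has to be tested.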

So - what is the answer?

Could a hybrid - and in this context, a 4GL hybrid syntax that runs within a 4GL application framework - be the solution?

Should the hybrid syntax be based on a pre-existing language such as ECMAScript, Ruby, Python or some other?
Should the database interaction be transparently built into the Language (a la MUMPS)?
Should datatyping be strict or loose?... [ what is datatyping anyway? ]
MVC - what is it, and is it relevant?

I plan on answering these questions in a future posting.

Sunday, September 28, 2008

Web Services in eClinical

Web Services is one of these technical terms that many folk have heard of, some people understand, and a very few people can actually use. The definition of a Web Service from a technical perspective - courtesy of Wikipedia - is "a software system designed to support interoperable machine-to-machine interaction over a network."

From an eClinical perspective, Web Services allow disparate eClinical systems to communicate on a (near) real-time basis.

I believe that Web Services will help resolve many of the integration issues that eClinical systems suffer from today. You can procure 2 great systems, but if they don't speak properly together a lot of business value is lost. Combining CDISC with Web Services may well be a solution to many problems encountered.

Web Services - the Basics

Technologies similar to web services have been around for many many years. For example, when you visit an autoteller and put your Visa card in the slot to withdraw cash, a Web Service 'type' of communication goes on between the bank you communicate with and the Credit Card company actually releasing the funds. What they actually say when such communications occur will of course differ depending on application, but, with Web Services, the way they say it is standardized.

Web Services have evolved into many different things, but, the underlying principles remain the same. Generally, they communicate using XML (Extensible Markup Language) based text over a Protocol called SOAP.

Many folk will be familiar with XML already - CDISC ODM is built around XML as a means to give meaning to clinical data that is transferred. SOAP though may be a new term. SOAP, put simply, provides a means to transfer XML messages from system to system - typically over the Internet - over firewall friendly channels (or Ports).

When you open up a browser, and enter something like http://www.google.com/ what you are actually doing is asking to communicate with Google on internet port 80. The http equates to Port 80 by default. You might also see https. The 's' part signifies 'secure' and indicates the use of Port 443, with the traffic encrypted using SSL/TLS. Many corporate and site networks place restrictions on the ports that are open to the internet. Ports 80 and 443 are some of the few ports almost always open, and therefore usable for communication. SOAP messages are typically carried over HTTP or HTTPS, so they can use both these ports. Therefore, web services running on SOAP can usually speak between systems without firewall conflicts. This means that if you want System A to speak to System B via Web Services, all you need to do is ensure that an Internet link is available, and you're off and running.
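For the curious, the Python sketch below shows the two ideas from the last few paragraphs using only the standard library: the default port implied by http or https, and an XML payload wrapped in a SOAP 1.1 envelope. The `port_for` and `soap_envelope` helpers, the `Ping` payload and the example.org addresses are placeholders I have made up; nothing is actually sent over the network.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
DEFAULT_PORTS = {"http": 80, "https": 443}


def port_for(url: str) -> int:
    """Return the explicit port in the URL, or the default for its scheme."""
    parts = urlsplit(url)
    return parts.port or DEFAULT_PORTS[parts.scheme]


def soap_envelope(payload: ET.Element) -> str:
    """Wrap an XML payload in a SOAP Envelope/Body and return it as text."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return ET.tostring(envelope, encoding="unicode")


request = ET.Element("Ping")      # hypothetical payload element
request.text = "hello"
message = soap_envelope(request)  # XML text ready to POST over HTTP or HTTPS
```

The message is plain XML text, which is why it travels happily over the same ports a browser uses.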

CDISC & eClinical before Web Services

So, what about Web Services, CDISC and eClinical. Why should I care?

Well, traditionally, eClinical systems have been relatively 'dumb' when it has come to communicating. An IVR system would be used to capture the recruitment, or randomization of a subject. The IVR would then send a Text file via old fashioned FTP file transfer to an EDC system, and the EDC system would - at some time in the future - process the text file - creating a new subject, or recording the randomization in the EDC system tables. Sounds ok... but.. what if things go wrong?

With this model, the EDC and IVR systems don't really speak to each other. The IVR system sends something - yes, but if the EDC system doesn't like it - then oops! The IVR will keep sending things regardless. That is one issue. The second issue is that because the two systems don't actively communicate, they cannot cross check (or Handshake) with each other. Imagine if the EDC system held information that the IVR did not. Let's say for instance that the investigator recorded in the EDC system that the subject had dropped out. If the investigator later used the IVR to randomize this same patient, the IVR could check against the EDC system that the subject was valid and current. Maybe not a perfect example, but, the capability exists.

Web Services provides the mechanism for system A to speak with system B. CDISC ODM provides the syntax with which to communicate. When both systems make reference to a 'FORM', both systems know what is meant.

Web Services - eClinical - so...

In traditional systems design, you had a decision to make when you developed new modules of software as part of a suite of applications. Do I store database information in the same place - sharing a common database, or, do I store it in a separate database and communicate / synchronize between the two systems? If you stored everything in the same database - you simplified the table structure, and didn't need to worry about data replication, but, systems were tied together. If you separated the databases, then of course you had duplicate data between the databases, and you had to replicate. This replication was complicated and problematic.

Ok, now let's imagine that the systems come from different vendors. Of course each vendor wants to sell their own system independently - a separate database is mandatory. They hold common information.... no problem, we write interfaces.

Complicated software is designed to examine information that is common between systems, and transfer this by batch transfer. So, for example, we have a list of Sites in System A - we also have a list of Sites in System B. We have a list of site personnel in system A, we also have a list of site personnel in system B - no problem I hear you say. Let's imagine that System A doesn't fully audit trail the information that has changed on the Sites tables. How would System B know what to take.... we need to transfer all the sites, and compare the site information with the previous site information... getting tricky...and this is just a simple list of sites.

Now, let's imagine a more complicated situation, common in an eClinical system. A Protocol amendment occurs, and a new arm has been added to a study whereby subjects meeting particular criteria are branched into two separate dosing schemes.

Transferring or Synchronizing this sort of information between 2 systems would be possible, but very very difficult. System A may not have a good place to put the information from System B. The question is though - do both systems really need the same data? If System B wants to know something, why doesn't it just ask System A at the time it needs the answer, instead of storing all the same data itself?
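Even the 'simple list of sites' case needs something like the snapshot comparison sketched below in Python. This is only an illustration of the compare-everything approach; the `diff_sites` function and the site records are hypothetical.

```python
def diff_sites(previous, current):
    """Compare two full snapshots of a site list, keyed by site id.

    With no audit trail on the source system, System B has to receive the
    whole list each time and work out for itself what was added, removed
    or changed since the last transfer.
    """
    added = {k: v for k, v in current.items() if k not in previous}
    removed = {k: v for k, v in previous.items() if k not in current}
    changed = {k: v for k, v in current.items()
               if k in previous and previous[k] != v}
    return added, removed, changed


last_transfer = {
    "001": {"name": "City Hospital", "pi": "Dr A"},
    "002": {"name": "County Clinic", "pi": "Dr B"},
}
this_transfer = {
    "001": {"name": "City Hospital", "pi": "Dr C"},   # investigator changed
    "003": {"name": "Harbour Practice", "pi": "Dr D"},
}

added, removed, changed = diff_sites(last_transfer, this_transfer)
```

Note what the comparison cannot tell you: who made the change, when, or why. That information lives only in System A.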

This is where Web Services can come in.

Let's imagine an IVR system wanted to check with an EDC system if a subject was current in a study (current meaning not dropped out, early terminated or a screen failure). A Web Service could be offered by the EDC system to respond with a 'True' or 'False' to a call 'IS_SUBJECT_CURRENT'. Of course hand-shaking would need to occur beforehand for security and so on, but following this, the IVR system would simply need to make the call, provide a unique Subject identifier, and the EDC system web service would respond with either 'True' or 'False'. With Web Services, this can potentially occur in less than a second.

Let's take this one step further. The EDC system would like to record a subject randomization. The site personnel enter all the key information into the EDC system. The EDC system then makes a Web Service call to the IVR system - passing all of the necessary details. The IVR takes these details, checks them, and if valid, returns the appropriate subject randomization number. The EDC system presents the randomization number for the subject on the eCRF for the site personnel to use. This all happens in real time, via web service calls between systems located in completely different locations.
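Stripped of the transport, the EDC side of such a call could be as small as the Python sketch below. Everything here is an assumption made for illustration: the `EdcService` class, the subject states, and the use of a simple token to stand in for whatever hand-shaking a real deployment would use.

```python
from enum import Enum


class SubjectState(Enum):
    ENROLLED = "enrolled"
    DROPPED_OUT = "dropped out"
    EARLY_TERMINATED = "early terminated"
    SCREEN_FAILURE = "screen failure"


# States that mean the subject is no longer 'current' in the study.
NOT_CURRENT = {
    SubjectState.DROPPED_OUT,
    SubjectState.EARLY_TERMINATED,
    SubjectState.SCREEN_FAILURE,
}


class EdcService:
    """Hypothetical EDC-side handler for an IS_SUBJECT_CURRENT style call."""

    def __init__(self, subjects, trusted_callers):
        self._subjects = dict(subjects)
        self._trusted = set(trusted_callers)

    def is_subject_current(self, caller_token, subject_id):
        # Stand-in for the security hand-shake that must happen first.
        if caller_token not in self._trusted:
            raise PermissionError("caller has not completed the handshake")
        state = self._subjects.get(subject_id)
        # Unknown subjects are treated as not current.
        return state is not None and state not in NOT_CURRENT


edc = EdcService(
    {"1001": SubjectState.ENROLLED, "1002": SubjectState.DROPPED_OUT},
    trusted_callers={"ivr-token"},
)
answer = edc.is_subject_current("ivr-token", "1001")
```

The IVR never sees the EDC tables. It asks one question and gets one answer.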
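The round trip can be sketched in a few lines of Python. Both classes are hypothetical stand-ins that call each other directly; in practice the `randomize` call would be a web service request, and the validity check would be whatever the IVR actually requires rather than the simple eligibility flag assumed here.

```python
class IvrService:
    """Hypothetical IVR side: owns the randomization list."""

    def __init__(self, randomization_numbers):
        self._unused = list(randomization_numbers)

    def randomize(self, study, site, subject_id, eligible):
        # Check the details supplied by the EDC system.
        if not (study and site and subject_id and eligible):
            return None
        if not self._unused:
            return None
        return self._unused.pop(0)


class EdcSystem:
    """Hypothetical EDC side: collects the details, then asks the IVR."""

    def __init__(self, ivr):
        self._ivr = ivr
        self.ecrf = {}   # subject id -> randomization number shown on the eCRF

    def record_randomization(self, study, site, subject_id, eligible):
        number = self._ivr.randomize(study, site, subject_id, eligible)
        if number is not None:
            self.ecrf[subject_id] = number
        return number


edc = EdcSystem(IvrService(["R-001", "R-002"]))
first = edc.record_randomization("STUDY-A", "SITE-01", "1001", eligible=True)
rejected = edc.record_randomization("STUDY-A", "SITE-01", "1002", eligible=False)
```

The randomization list never leaves the IVR, and the EDC system only stores the one number it was given.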

Web Services - Metadata independence

Web Services are significant for a number of reasons. Yes, they allow systems to communicate in a near real-time basis over the internet - that's quite cool in itself. What's more significant though in terms of eClinical systems is that Systems A and B don't really need to understand how the other systems do what they do.

If System A had to read the database of System B, it would need to understand how System B actually used the data in the database. The same applies to an interface. If System A received data from System B, it would need to process that data with an understanding of how System B works before it could use it, or potentially update it.

Web Services - beyond CDISC?

CDISC ODM allows you to transfer data, and to some extent metadata from one system to another. To ensure it works for all, the support is to some extent, the 'lowest common denominator' of metadata. It is only really able to describe data that is common and understandable to every other system - (barring extensions - see eClinicalOpinion on these).

Imagine if we could create a common set of Web Service calls. The common calls would take certain parameters, and, return a potential set of responses. The Messaging might be based on CDISC ODM, but the actions would be new and common.

Add_Subject(Study, Site, SubjectId) returns ScreeningNo
Add_DataValue(Study, Site, Subject, Visit....) returns Success, QueryResponse
Read_DataValue(Study, Site, Subject, Visit....) returns DataValue,QueryResponse, DataStatus

With this sort of mechanism, the degree of processing of data and metadata between systems is limited. The 'owning' system does all the work. The data and metadata that the systems need, stay with the original system.
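A toy Python version of the owning system, covering the three calls above, might look like the sketch below. The original signatures trail off after Visit, so rather than guess the remaining parameters the sketch passes the whole index as one opaque tuple. The class name, the result types and the status strings are my own assumptions and not part of any CDISC standard.

```python
from typing import Dict, NamedTuple, Optional, Tuple

Index = Tuple[str, ...]   # (study, site, subject, visit, ...) left open-ended


class AddResult(NamedTuple):
    success: bool
    query_response: Optional[str]


class ReadResult(NamedTuple):
    data_value: Optional[str]
    query_response: Optional[str]
    data_status: str


class OwningSystem:
    """Hypothetical in-memory stand-in for the system that owns the data."""

    def __init__(self) -> None:
        self._subjects: Dict[Tuple[str, str, str], int] = {}
        self._values: Dict[Index, str] = {}

    def add_subject(self, study: str, site: str, subject_id: str) -> int:
        """Register a subject and return a screening number."""
        key = (study, site, subject_id)
        if key not in self._subjects:
            self._subjects[key] = len(self._subjects) + 1
        return self._subjects[key]

    def add_data_value(self, index: Index, value: str) -> AddResult:
        """Store a value; the owning system decides whether it is acceptable."""
        if index[:3] not in self._subjects:
            return AddResult(False, "Unknown subject")
        self._values[index] = value
        return AddResult(True, None)

    def read_data_value(self, index: Index) -> ReadResult:
        if index not in self._values:
            return ReadResult(None, None, "MISSING")
        return ReadResult(self._values[index], None, "ENTERED")


edc = OwningSystem()
screening_no = edc.add_subject("STUDY-A", "SITE-01", "1001")
stored = edc.add_data_value(("STUDY-A", "SITE-01", "1001", "SCREENING", "DBP"), "80")
refused = edc.add_data_value(("STUDY-A", "SITE-01", "9999", "SCREENING", "DBP"), "80")
read_back = edc.read_data_value(("STUDY-A", "SITE-01", "1001", "SCREENING", "DBP"))
```

The caller only ever sees the parameters and the responses. How the owning system stores, checks or audits the value stays its own business.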

One remaining challenge exists - the common indexing of information - if a data value is targeted towards a particular site, subject, Visit, Page and Line, then they all must be known and specified. That said, a bit of business logic (protocol knowledge) can be applied. For example, if a DBP is captured for a subject, and the target study only has one reference to DBP for a subject in the whole CRF, should I really need to specify the Visit, Page and Instance? Sufficient uniqueness rules could apply.
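A uniqueness rule of that kind could be sketched in Python as follows. The `resolve` function and the CRF layout are invented for the example: an item that appears once in the CRF can be addressed by name alone, while an item that appears more than once needs enough of the index to pin it down.

```python
def resolve(item, crf_locations, visit=None, page=None):
    """Return the single (visit, page) location an item refers to.

    Raises LookupError when the information supplied does not identify
    exactly one location.
    """
    candidates = [
        loc for loc in crf_locations.get(item, [])
        if (visit is None or loc[0] == visit) and (page is None or loc[1] == page)
    ]
    if len(candidates) == 1:
        return candidates[0]
    raise LookupError(f"{item}: {len(candidates)} matching locations")


# Hypothetical study layout: where each item appears in the CRF.
crf_locations = {
    "DBP": [("SCREENING", "VITALS")],
    "WEIGHT": [("SCREENING", "VITALS"), ("WEEK4", "VITALS")],
}

dbp_location = resolve("DBP", crf_locations)                      # unique already
weight_location = resolve("WEIGHT", crf_locations, visit="WEEK4")  # needs the visit
```

Asking for WEIGHT without a visit would raise an error in this sketch, which is arguably the right behaviour: the sending system should be told it has not been specific enough, rather than have the value land in the wrong place.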

If CDISC were to create a standard set of InBound and OutBound Web Service calls, you would see a great simplification in how normally disconnected systems inter-operate. Not only could we send data from System A to System B, we could appreciate what happens when it gets there - 'Can I login', 'Did that Verbatim Code?' 'Can I have lab data for subject x'... etc etc.

Will Web Services technologies change the eClinical landscape? No. But, technology advances such as these all help to make the whole eClinical process somewhat less complicated.
