Thursday, October 2, 2008

Return of the 4GL for eClinical? - Part 1

 

In the 1980s, the thing of the day in Database Application development was RAD 4GLs. That is - Rapid Application Development Fourth Generation Languages.  They were popular because they tackled the problem that developing software with complex, general-purpose tools was slow. They offered high-level constructs for developing Database applications.  If you wanted to draw pretty pictures - sorry. If you wanted to control real-time machinery - sorry.  But if you wanted to write a Database Application - yes, they worked very well.

In the last couple of years, two particular technologies have been popular with developers - Ruby on Rails and, more recently, Django. These are based on 3rd generation languages - Ruby and Python respectively - extended through the development of a standard framework. The frameworks were developed to support database-driven website applications.  These are, in a way, the 4GLs for the 21st Century.

I was one of these early 4GL developers for a number of years. In my young, exuberant days, I used to boast that I could write a full multi-user Stock Control System from scratch in 3 days.  [The truth was that, due to a failed backup, I did actually have to write (again) a full Stock Control System for a client in 3 days!]

One 4GL tool that I was particularly proficient with produced database tables, menus, forms, event-driven code, database triggers, reports etc.  I suppose, looking back, it was a bit like Oracle Forms, but without the nasty complex parts or the heavyweight toolset.

One of the attributes of the tool was that it provided a programming syntax that was sufficiently business-aware to be relevant for business functions, while at the same time sufficiently flexible to be capable of developing complex database applications.  It was the closest syntax I have seen to natural language.  It was the sort of syntax that the developers of SQL and PL/SQL might have produced if they had started again in the mid 80s. The language was sufficient for even the most complex Database applications without having to resort to a 3rd Generation language such as C or Fortran. [Oh dear - I am sounding a bit like an IBM OS/2 user, bitter about Microsoft winning through with Windows!]

Anyway, I am getting off the point.

In thinking about eClinical technologies and, in particular, EDC Tools, I have wondered why no company has created a 4th Generation Trial Development Tool that offers similar generic features for database, forms and rules authoring while embedding standard features such as audit trailing, flag setting, security and web enablement.   At this point, I am sure some readers will be saying - oh, but such tools do exist.  Well, yes, you do have 'Study Building' tools, but they are very specific.  They do not provide a general language that can be used across the tool set.

Oracle Corp, eResearch Technology and Domain went down similar routes with Oracle Clinical, eDM (DLB Recorder) and ClinTrial by attempting to leverage existing tools: Oracle Forms for the first two, and PowerBuilder for ClinTrial. However, these tools were not really designed for eClinical specifically.  You ended up using a high-level language to dynamically create high-level syntax - for example, Dynamic SQL.  This became very complicated, proprietary and often slow.  The normalization of the Oracle Clinical Database is an example of where the natural attributes of the Oracle RDBMS and the Forms tools just weren't sufficiently flexible to handle fully dynamic data structures.

Why might an eClinical 4GL make sense today?

Two principles of a 4GL were High abstraction and Greater statement power.

Abstraction in that you could create data capture forms and reports that were sufficiently abstracted from the database to ensure the user did not need to understand the underlying data structure in order to effectively use the application.

Greater Statement power allowed a small amount of readable code to do a large amount of work.

Both of the above attributes are relevant to the world of eClinical. 
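To make the two principles concrete, here is a minimal Python sketch of what "statement power" could look like in an eClinical setting. Everything in it is hypothetical: the `Item` and `Form` classes, the `enter` method and the field names are invented for illustration and do not come from any real EDC product. The point is only that the study builder writes the short declaration at the bottom, while the framework supplies type coercion, range checks and the audit trail.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Item:
    """One declared item on a form (hypothetical): name, type, optional range."""
    name: str
    kind: type
    low: Optional[float] = None
    high: Optional[float] = None


@dataclass
class Form:
    """Hypothetical framework class: stores values, validates and audits."""
    name: str
    items: list
    values: dict = field(default_factory=dict)
    audit_trail: list = field(default_factory=list)

    def enter(self, item_name, value, user):
        # Find the declaration for this item; unknown items are rejected.
        spec = next((i for i in self.items if i.name == item_name), None)
        if spec is None:
            raise KeyError(f"{item_name} is not declared on form {self.name}")
        value = spec.kind(value)  # coerce to the declared type
        if spec.low is not None and value < spec.low:
            raise ValueError(f"{item_name} below {spec.low}")
        if spec.high is not None and value > spec.high:
            raise ValueError(f"{item_name} above {spec.high}")
        old = self.values.get(item_name)
        self.values[item_name] = value
        # Every accepted change is audited without the study builder asking.
        self.audit_trail.append(
            (datetime.now(timezone.utc), user, item_name, old, value)
        )


# The "4GL" part: this declaration is all the study builder writes.
vitals = Form("Vitals", [
    Item("systolic", int, low=60, high=250),
    Item("weight_kg", float, low=1, high=400),
])

vitals.enter("systolic", "120", user="alice")
vitals.enter("systolic", 125, user="bob")
print(vitals.values)
print(len(vitals.audit_trail), "audit entries")
```

The person entering data never sees how values or audit entries are stored, which is the abstraction half of the argument.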

The challenge when designing a good EDC tool is to provide a framework that is as friendly as possible, while at the same time providing sufficient flexibility to perform all of the functions that might be required. Vendors have achieved this by going down one of two routes.  Either the data-driven approach, where the syntax for rules is built up from menus (i.e. lists of Visits, Forms etc), or a free-form syntax route using something like VBScript.  Both approaches fail to a degree.

A purely data-table-driven approach is very limited in the constructs that can be built up.  Often, tools have had to fall back on lower-level approaches in order to fill the gaps.  Also, because the syntax is effectively built from parameters that are fed into routines within the application tool, the performance can be poor. Optimization is very difficult to achieve.
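As a rough illustration of the data-driven route, the sketch below stores a rule as a row of parameters (the kind of thing picked from menus of Visits, Forms and fields) and feeds it into one generic routine at run time. The dictionary layout, the `evaluate_rule` function and the sample values are assumptions made up for this example, not the design of any particular tool.

```python
import operator

# The only comparisons the rule builder's menu offers (hypothetical set).
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def evaluate_rule(rule, data):
    """Interpret one parameter-driven rule against captured data.

    rule: dict with visit, form, field, op, value and message keys.
    data: dict keyed by (visit, form, field).
    Returns the rule's message if it fires, otherwise None.
    """
    key = (rule["visit"], rule["form"], rule["field"])
    actual = data.get(key)
    if actual is None:  # nothing entered yet, so nothing to check
        return None
    if OPERATORS[rule["op"]](actual, rule["value"]):
        return rule["message"]
    return None


rule = {
    "visit": "Screening",
    "form": "Vitals",
    "field": "systolic",
    "op": ">",
    "value": 180,
    "message": "Systolic above 180 - please confirm",
}
data = {("Screening", "Vitals", "systolic"): 190}
print(evaluate_rule(rule, data))
```

A rule of this shape can only compare one field against one constant. Anything richer, such as a calculation across two fields, needs either more parameter columns or a drop to a lower-level approach, and every rule pays for the lookup and dispatch each time it is interpreted.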

A free form syntax route also causes problems.  You need to test the validity of the script in a similar fashion to the testing of the actual core product.   The more flexibility - the more room for unexpected actions or results in the deployed system.

So - what is the answer?

Could a hybrid - in this context, a 4GL hybrid syntax that runs within a 4GL application framework - be the solution?

  • Should the hybrid syntax be based on a pre-existing language such as ECMAScript, Ruby, Python or some other?
  • Should the database interaction be transparently built into the language (à la MUMPS)?
  • Should datatyping be strict or loose?...   [ what is datatyping anyway? ]
  • MVC - what is it, and is it relevant?
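To picture what a hybrid might look like in practice, here is one possible sketch, offered as an assumption rather than an answer to the questions above. Rules are written as free-form expressions in a pre-existing language's syntax (Python here), but the framework parses them with the standard `ast` module and only evaluates a small whitelist of constructs. The `evaluate` function and the field names are hypothetical.

```python
import ast
import operator

# Only these operators are permitted inside a rule expression.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_CMP_OPS = {
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Gt: operator.gt,
    ast.GtE: operator.ge,
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
}


def evaluate(expression, values):
    """Evaluate a rule expression against a dict of field values."""
    tree = ast.parse(expression, mode="eval")
    return _eval(tree.body, values)


def _eval(node, values):
    # Numeric literals.
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    # Bare names refer to fields on the form.
    if isinstance(node, ast.Name):
        return values[node.id]
    # Arithmetic between two sub-expressions.
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](
            _eval(node.left, values), _eval(node.right, values)
        )
    # A single comparison (chained comparisons are rejected below).
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and type(node.ops[0]) in _CMP_OPS):
        return _CMP_OPS[type(node.ops[0])](
            _eval(node.left, values), _eval(node.comparators[0], values)
        )
    # "and" / "or"; all operands are evaluated (no short-circuiting).
    if isinstance(node, ast.BoolOp):
        results = [_eval(v, values) for v in node.values]
        return all(results) if isinstance(node.op, ast.And) else any(results)
    # Function calls, attribute access, imports etc. all end up here.
    raise ValueError(f"construct not allowed in a rule: {type(node).__name__}")


patient = {"weight_kg": 100, "height_m": 1.8, "systolic": 120, "diastolic": 115}
print(evaluate("weight_kg / (height_m * height_m) > 30", patient))
print(evaluate("systolic > 180 or diastolic > 110", patient))
```

The author of a rule gets readable, free-form syntax, while the set of things a rule can do stays small enough to validate once in the framework rather than once per script.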

I plan on answering these questions in a future posting.

2 comments:

Eco said...

I should be sleeping but this is a very interesting post and your ideas closely match with my own vision.

If you look at the leading EDC systems they show every sign of being custom coded from the ground up. They are not, as you say, based on a higher-level framework that gives them anything for free.

This shows in many places. All the clinical data may be intensively audit trailed and have complex workflow associated with it but study settings data, user admin and site records do not.

In one place you may be able to extract a grid of data as an Excel spreadsheet, but in others you can't.

The vision I share with you is that EDC systems should be based on an environment that provides you with a set of things for free - audit trail of everything, the ability to extend with a simple programming language at any point, data migration and export of all data including non-patient data, workflow for all content and so on.

This is very much the Salesforce.com or SAP model but built for the rigorous demands of this regulated environment.

There are many issues with building such an environment - scalability, for instance, is a big issue with something so dynamic - but the fact that we are all thinking the same way suggests that this is a concept that is coming of age.

I look forward to more from you on this.

Eco said...

Would be interested in seeing an example piece of code for a 4GL. That would spark discussion I think.

Also, how do you propose a 4GL should be created? By creating a Domain Specific Language through the metaprogramming capabilities of languages like Ruby? Or by breaking out Lex/Yacc and starting from scratch with _insert language here_?