MIRC CTP
This article describes the stand-alone processing application for clinical trials data using MIRC components and the MIRC internet transport mechanism.
1 Background
MIRC supports clinical trials through two applications, one for data acquisition at an imaging center (FieldCenter) and one for management of the data at a principal investigator's site (MIRC).
The FieldCenter application acquires images via the DICOM protocol, anonymizes them, and transfers them (typically using HTTP, although DICOM is also supported) to a principal investigator's MIRC site. It also supports other types of data files and includes an anonymizer for XML files. FieldCenter also contains a client for the Update Service of a MIRC site, allowing the application to save data on, and obtain software updates from, the principal investigator's site.
The MIRC site software contains a partially configurable processing pipeline for clinical trials data, consisting of:
- HttpImportService
- A receiver for HTTP connections from FieldCenter applications transferring data files into the processing pipeline.
- DicomImportService
- A receiver for DICOM datasets for insertion into the processing pipeline.
- Preprocessor
- A user-defined component for processing data received by the HttpImportService before it is further processed by other components.
- Anonymizer
- A component for anonymizing DICOM objects or XML objects.
- DatabaseExportService
- A component providing queue management and submission of data objects to a user-defined interface to an external database management system.
- HttpExportService
- A component in the DicomImportService pipeline providing queue management and transmission of data objects to one or more external systems using the HTTP protocol.
- DicomExportService
- A component in the HttpImportService pipeline providing queue management and transmission of data objects to one or more external systems using the DICOM protocol.
The processing pipelines for the HttpImportService and DicomImportService are different and not symmetrical. For example, the HttpImportService does not have access to the anonymizer except as part of the DatabaseExportService. Another limitation is that objects received via one protocol can only be exported via the other. While these limitations are consistent with the requirements of most trials, it became clear that a completely symmetrical design would provide better support for more sophisticated trials while still satisfying the requirements of simple ones.
2 ClinicalTrialProcessor
ClinicalTrialProcessor is a stand-alone application that provides all the features of a MIRC site for clinical trials in a highly configurable and extensible way. It connects to FieldCenter applications and can also connect to MIRC sites when necessary. ClinicalTrialProcessor has the following key features:
- Single-click installation.
- Support for multiple pipelines.
- Processing pipelines supporting multiple configurable stages.
- Support for multiple quarantines for data objects which are rejected during processing.
- Pre-defined implementations for key components:
  - HTTP Import
  - DICOM Import
  - DICOM Anonymizer
  - XML Anonymizer
  - File Storage
  - Database Export
  - HTTP Export
  - DICOM Export
- Web-based monitoring of the application's status, including:
  - configuration
  - logs
  - quarantines
  - status
- Support for the FieldCenter Update Service client.
2.1 Installation
2.2 Pipelines
A pipeline is a manager that moves data objects through a sequence of processing stages. Each stage in the pipeline performs a specific function on one or more of the four basic object types supported by MIRC:
- FileObject
- DicomObject
- XmlObject
- ZipObject
Each pipeline must contain one ImportService as its first stage. Each pipeline stage is provided access to a quarantine directory, which may be unique to the stage or shared with other stages, into which the pipeline places objects that are rejected by a stage, thus aborting further processing. At the end of the pipeline, the manager calls the ImportService to remove the processed object from its queue.
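The flow described above can be illustrated with a minimal Java sketch. It is not the ClinicalTrialProcessor implementation: the class PipelineSketch, its nested Stage class, and the use of strings as data objects and in-memory lists as quarantines are all assumptions made for illustration. The sketch also assumes that a quarantined object is removed from the import queue, which the text above does not state.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical sketch: these names are illustrative, not the ClinicalTrialProcessor API.
class PipelineSketch {

    // A stage returns the (possibly modified) object, or null to reject it.
    static class Stage {
        final String name;
        final UnaryOperator<String> action;
        final List<String> quarantine = new ArrayList<>(); // stands in for a quarantine directory

        Stage(String name, UnaryOperator<String> action) {
            this.name = name;
            this.action = action;
        }
    }

    // Passes one object through the stages in order; returns null if a stage rejects it.
    static String process(List<Stage> stages, String object) {
        String current = object;
        for (Stage stage : stages) {
            String result = stage.action.apply(current);
            if (result == null) {
                stage.quarantine.add(current); // quarantine the rejected object and abort
                return null;
            }
            current = result;
        }
        return current;
    }

    // Drains the import queue, releasing each object once the pipeline is done with it.
    static List<String> run(Deque<String> importQueue, List<Stage> stages) {
        List<String> completed = new ArrayList<>();
        while (!importQueue.isEmpty()) {
            String result = process(stages, importQueue.peek());
            if (result != null) completed.add(result);
            importQueue.remove(); // assumption: quarantined objects also leave the queue
        }
        return completed;
    }

    public static void main(String[] args) {
        Stage filter = new Stage("filter", s -> s.startsWith("bad") ? null : s);
        Stage marker = new Stage("marker", s -> s + "-processed");
        Deque<String> queue = new ArrayDeque<>(Arrays.asList("obj1", "bad-obj", "obj2"));
        System.out.println("completed: " + run(queue, Arrays.asList(filter, marker)));
        System.out.println("quarantined by filter: " + filter.quarantine);
    }
}
```

Because the object stays in the import queue until the pipeline has finished with it, an interruption in the middle of processing does not lose the object in this sketch.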
2.2.1 ImportService
An ImportService receives objects via a protocol and enqueues them for processing by subsequent stages.
2.2.2 StorageService
A StorageService stores an object in a file system. It is not queued, and it therefore must complete before subsequent stages can proceed. A StorageService may return the current object or the stored object in response to a request for the output object, depending on its implementation.
2.2.3 Processor
A Processor performs some kind of processing on an object. It is not queued. A processor exposes methods with calling signatures that are unique to the object type. In the context of the current MIRC implementation, a Preprocessor is a Processor, as is an Anonymizer. The result of a processing stage is an object that is passed to the next stage in the pipeline.
2.2.4 ExportService
An ExportService provides queued transmission to an external system via a defined protocol. Objects in the queue are full copies of the objects submitted; therefore, subsequent processing is not impeded if a queue is paused, and modifications made subsequently do not affect the queue entry, even if they occur before transmission. (Note: This is different from the current MIRC implementation.)
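The copy-on-enqueue behavior can be sketched as follows. This is an illustration under stated assumptions, not the actual ExportService code: the class ExportQueueSketch, its queue-entry naming scheme, and the demo file names are hypothetical. The point it shows is that the queue entry is a separate file, so later changes to the original do not reach it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of copy-on-enqueue; not the ClinicalTrialProcessor implementation.
class ExportQueueSketch {
    private final Path queueDir;
    private long sequence = 0;

    ExportQueueSketch(Path queueDir) {
        try {
            this.queueDir = Files.createDirectories(queueDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Copies the file into the queue directory and returns the path of the copy.
    Path enqueue(Path source) {
        // Assumed naming scheme: a sequence number keeps queue entries ordered and unique.
        Path entry = queueDir.resolve(String.format("%08d-%s", sequence++, source.getFileName()));
        try {
            return Files.copy(source, entry);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Enqueues a file, changes the original, and returns what the queue entry still holds.
    static String demo() {
        try {
            Path work = Files.createTempDirectory("export-queue-sketch");
            Path original = work.resolve("object.xml");
            Files.write(original, "<report>original</report>".getBytes(StandardCharsets.UTF_8));
            Path queued = new ExportQueueSketch(work.resolve("queue")).enqueue(original);
            // A later stage modifies the object before the queue entry is transmitted.
            Files.write(original, "<report>modified later</report>".getBytes(StandardCharsets.UTF_8));
            return new String(Files.readAllBytes(queued), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("queue entry holds: " + demo());
    }
}
```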
2.3 Configuration
The ClinicalTrialProcessor configuration is specified by an XML file. There can be one Server element specifying the port on which the HTTP server is to operate, and multiple Pipeline elements, each specifying the stages which comprise it. The name of the element defining a stage is irrelevant and can be chosen for readability; each stage in a pipeline is actually defined by its Java class, specified in the class attribute. Stages are loaded automatically when the program starts, and the loader tests the stage's class to see what kind of stage it represents. It is possible to extend the application beyond the pre-defined stages available in the implementation as described in the section on Extending ClinicalTrialProcessor.
The following is an example of a simple configuration with one pipeline which receives objects via the HTTP protocol, stores them, and exports them to a DICOM destination:
<Configuration>
    <Server port="80" />
    <Pipeline name="Main Pipeline">
        <ImportService
            name="HTTP Import"
            class="org.rsna.ctp.stdstages.HttpImportService"
            root="roots/http-import"
            port="7777"
            quarantine="quarantines/HttpImportQuarantine" />
        <StorageService
            name="Storage"
            class="org.rsna.ctp.stdstages.StorageService"
            root="D:/storage"
            return-stored-file="no"
            quarantine="quarantines/StorageQuarantine" />
        <ExportService
            name="PACS Export"
            class="org.rsna.ctp.stdstages.DicomExportService"
            root="roots/pacs-export"
            dest-url="dicom://DestinationAET:ThisAET@ipaddress:port"
            quarantine="quarantines/PacsExportQuarantine" />
    </Pipeline>
</Configuration>
Note that in the example above, non-DICOM objects are stored in the StorageService, but they are not exported by the DicomExportService. Each pipeline stage is responsible for testing the class of the object which it receives and processing (or ignoring) the object accordingly.
The following is an example of a more complex configuration. It receives objects, passes them to a trial-specific Processor stage that tests whether each object is appropriate for the trial, anonymizes the objects that pass the preprocessor, exports them to a database, anonymizes them again to remove information that is not intended for storage, stores them, and finally exports them to a DICOM destination.
<Configuration>
    <Server port="80" />
    <Pipeline name="Main Pipeline">
        <ImportService
            name="HTTP Import"
            class="org.rsna.ctp.stdstages.HttpImportService"
            root="roots/http-import"
            port="7777"
            quarantine="quarantines/HttpImportQuarantine" />
        <Processor
            name="The Preprocessor"
            class="org.myorg.MyPreprocessor"
            quarantine="quarantines/PreprocessorQuarantine" />
        <Processor
            name="Main Anonymizer"
            class="org.rsna.ctp.stdstages.Anonymizer"
            dicom-script="dicom-anonymizer-1.properties"
            xml-script="xml-anonymizer-1.script"
            zip-script="zip-anonymizer-1.script"
            quarantine="quarantines/MainAnonymizerQuarantine" />
        <ExportService
            name="Database Export"
            class="org.rsna.trials.DatabaseExportService"
            adapter-class="org.myorg.MyDatabaseAdapter"
            root="roots/database-export"
            quarantine="quarantines/DatabaseExportQuarantine" />
        <Processor
            name="Provenance Remover"
            class="org.rsna.ctp.stdstages.Anonymizer"
            dicom-script="dicom-anonymizer-2.properties"
            xml-script="xml-anonymizer-2.script"
            zip-script="zip-anonymizer-2.script"
            quarantine="quarantines/ProvenanceRemoverQuarantine" />
        <StorageService
            name="Storage"
            class="org.rsna.ctp.stdstages.StorageService"
            root="D:/storage"
            return-stored-file="no"
            quarantine="quarantines/StorageQuarantine" />
        <ExportService
            name="PACS Export"
            class="org.rsna.ctp.stdstages.DicomExportService"
            root="roots/pacs-export"
            dest-url="dicom://DestinationAET:ThisAET@ipaddress:port"
            quarantine="quarantines/PacsExportQuarantine" />
    </Pipeline>
</Configuration>
Multiple Pipeline elements may be included, but each must have its own ImportService element, and their ports must not conflict.
Each pipeline stage class has a constructor that is called with its configuration element, making it possible for special processor implementations to be passed additional parameters from the configuration.
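As an illustration of this loading mechanism, the following Java sketch parses a configuration, ignores the element names, and instantiates each stage from its class attribute by calling a constructor that takes the configuration Element. It is a sketch under assumptions, not the actual loader: StageLoaderSketch, DemoStage, and the threshold attribute are hypothetical names, and a real stage would also implement the stage interfaces described in the section on Extending ClinicalTrialProcessor.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical sketch of class-attribute loading; not the actual CTP loader.
class StageLoaderSketch {

    // Stand-in for a custom stage. A real stage would also implement the CTP stage interfaces.
    public static class DemoStage {
        public final String name;
        public final String threshold; // extra parameter read from the configuration element

        public DemoStage(Element element) {
            this.name = element.getAttribute("name");
            this.threshold = element.getAttribute("threshold");
        }
    }

    // Instantiates every child element of every Pipeline from its class attribute.
    static List<Object> loadStages(String configXml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(configXml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            List<Object> stages = new ArrayList<>();
            NodeList pipelines = root.getElementsByTagName("Pipeline");
            for (int p = 0; p < pipelines.getLength(); p++) {
                NodeList children = pipelines.item(p).getChildNodes();
                for (int i = 0; i < children.getLength(); i++) {
                    Node child = children.item(i);
                    if (child.getNodeType() != Node.ELEMENT_NODE) continue; // skip whitespace
                    Element element = (Element) child;
                    // The element name is ignored; only the class attribute selects the stage.
                    Class<?> stageClass = Class.forName(element.getAttribute("class"));
                    stages.add(stageClass.getConstructor(Element.class).newInstance(element));
                }
            }
            return stages;
        } catch (Exception e) {
            throw new IllegalStateException("Unable to load configuration", e);
        }
    }

    // A one-stage configuration that points at DemoStage.
    static String demoConfig() {
        return "<Configuration><Pipeline name=\"Main Pipeline\">"
                + "<AnyElementName name=\"Demo\" class=\"" + DemoStage.class.getName()
                + "\" threshold=\"5\" />"
                + "</Pipeline></Configuration>";
    }

    public static void main(String[] args) {
        DemoStage stage = (DemoStage) loadStages(demoConfig()).get(0);
        System.out.println(stage.name + " threshold=" + stage.threshold);
    }
}
```

Passing the whole element to the constructor, rather than a fixed set of arguments, is what lets a special-purpose stage define attributes of its own without any change to the loader.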
2.4 Server
To provide access to the status of the components, the application includes an HTTP server which serves files and provides servlet-like functionality. The web pages are served from a directory tree whose root is named ROOT.
2.4.1 ConfigurationServlet
The ConfigurationServlet displays the contents of the configuration file.
2.4.2 StatusServlet
The StatusServlet displays the status of all the pipeline stages.
2.4.3 LogServlet
The LogServlet provides web access to all the log files in the logs directory.
2.4.4 UpdateService
The UpdateService supports the Update Service clients in FieldCenter applications, serving software updates and saving the remapping tables in trials configured to use it.
2.5 The Standard Stages
The application includes standard stages which allow most trials to be operated without writing any software.
2.5.1 HttpImportService
The HttpImportService listens on a defined port for HTTP connections from FieldCenter applications and receives files transmitted using the MIRC protocol. The configuration element for the HttpImportService is:
<ImportService name="stage name" class="org.rsna.ctp.stdstages.HttpImportService" root="base-directory" port="7777" quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- root is a directory for use by the ImportService for internal storage and queuing.
- port is the port on which the ImportService listens for connections.
- quarantine is a directory in which the ImportService is to quarantine objects that it cannot handle.
Note: directories can be absolute paths (e.g., D:/HttpImport) or relative paths (e.g., quarantines/http-import-quarantine). Relative paths are relative to the directory in which the ClinicalTrialProcessor is located.
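The path rule in the note can be sketched in a few lines of Java. The class and method names are hypothetical, and installDir stands for the directory in which the ClinicalTrialProcessor is located. Note that a path such as D:/HttpImport is treated as absolute only on a Windows system.

```java
import java.io.File;

// Hypothetical sketch of the absolute/relative directory rule; not the CTP implementation.
class DirectoryResolverSketch {

    // Returns an absolute path unchanged; resolves a relative path against the install directory.
    static File resolve(File installDir, String configured) {
        File dir = new File(configured);
        return dir.isAbsolute() ? dir : new File(installDir, configured);
    }

    public static void main(String[] args) {
        File installDir = new File("ClinicalTrialProcessor").getAbsoluteFile();
        System.out.println(resolve(installDir, "quarantines/http-import-quarantine"));
        System.out.println(resolve(installDir, new File(installDir, "roots").getPath()));
    }
}
```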
2.5.2 DicomImportService
The DicomImportService listens on a defined port for DICOM connections and receives DICOM datasets transmitted using the DICOM protocol. The configuration element for the DicomImportService is:
<ImportService name="stage name" class="org.rsna.ctp.stdstages.DicomImportService" root="base-directory" port="7777" quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- root is a directory for use by the ImportService for internal storage and queuing.
- port is the port on which the ImportService listens for connections.
- quarantine is a directory in which the ImportService is to quarantine objects that it cannot handle.
2.5.3 Anonymizer
The Anonymizer is a Processor stage that includes an anonymizer for each object type that contains defined data. When the Anonymizer stage is called to process an object, it calls the anonymizer appropriate to the object type. Each anonymizer is configured with a script file. If the script file for an object type is not configured or is absent, objects of that type are quarantined. The configuration element for the Anonymizer is:
<Anonymizer name="stage name" class="org.rsna.ctp.stdstages.Anonymizer" dicom-script="dicom-anonymizer.properties" xml-script="xml-anonymizer.script" zip-script="zip-anonymizer.script" quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- dicom-script specifies the path to the script for the DICOM anonymizer.
- xml-script specifies the path to the script for the XML anonymizer.
- zip-script specifies the path to the script for the Zip anonymizer (which anonymizes the manifest in a ZipObject).
- quarantine is a directory in which the Anonymizer is to quarantine objects that it cannot handle.
2.5.4 StorageService
The StorageService stores objects in a file system. It automatically defines subdirectories (based on dates) beneath its root directory and populates them accordingly. The configuration element for the StorageService is:
<StorageService name="stage name" class="org.rsna.ctp.stdstages.StorageService" root="D:/storage" return-stored-file="no" quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- root is the base directory of the storage tree.
- return-stored-file specifies whether the original object or a new object pointing to the file in the storage system is to be returned for processing by subsequent stages. Values are "yes" and "no". The default is "yes".
- quarantine is a directory in which the StorageService is to quarantine objects that it cannot handle.
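A date-based storage tree and the effect of return-stored-file can be sketched as follows. The directory layout root/yyyy/MM/dd is an assumption made for illustration (the actual subdirectory scheme is not specified here), and DateStorageSketch and its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch of date-based storage; not the StorageService implementation.
class DateStorageSketch {
    // Assumed layout: root/yyyy/MM/dd. The real directory scheme may differ.
    private static final DateTimeFormatter LAYOUT = DateTimeFormatter.ofPattern("yyyy/MM/dd");

    // Copies the file into the dated subdirectory, then returns either the stored
    // copy or the original, mirroring the return-stored-file attribute.
    static Path store(Path root, Path file, LocalDate date, boolean returnStoredFile) {
        try {
            Path dir = Files.createDirectories(root.resolve(LAYOUT.format(date)));
            Path stored = Files.copy(file, dir.resolve(file.getFileName()));
            return returnStoredFile ? stored : file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stores a placeholder file and reports the returned path relative to the work directory.
    static String demo(boolean returnStoredFile) {
        try {
            Path work = Files.createTempDirectory("storage-sketch");
            Path incoming = work.resolve("object.dcm");
            Files.write(incoming, "placeholder".getBytes(StandardCharsets.UTF_8));
            Path result = store(work.resolve("storage"), incoming, LocalDate.of(2024, 3, 5), returnStoredFile);
            return work.relativize(result).toString().replace('\\', '/');
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("return-stored-file=yes: " + demo(true));
        System.out.println("return-stored-file=no:  " + demo(false));
    }
}
```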
2.5.5 HttpExportService
The HttpExportService queues objects and transmits them via HTTP using the MIRC-defined Content-Type for each object type. The configuration element for the HttpExportService is:
<ExportService name="stage name" class="org.rsna.ctp.stdstages.HttpExportService" root="base-directory" dest-url="http://ipaddress:port/path" quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- root is the base directory of the queuing storage for the ExportService.
- dest-url specifies the destination system's URL.
- quarantine is a directory in which the ExportService is to quarantine objects that it cannot handle.
2.5.6 DicomExportService
The DicomExportService queues objects and transmits them to a DICOM Storage SCP. The configuration element for the DicomExportService is:
<ExportService name="stage name" class="org.rsna.ctp.stdstages.DicomExportService" root="base-directory" dest-url="dicom://DestinationAET:ThisAET@ipaddress:port" quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- root is the base directory of the queuing storage for the ExportService.
- dest-url specifies the destination DICOM Storage SCP's URL.
- quarantine is a directory in which the ExportService is to quarantine objects that it cannot handle.
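The structure of the dest-url value can be made concrete with a small parsing sketch. It follows the form shown above, dicom://DestinationAET:ThisAET@ipaddress:port, but the class DicomUrlSketch and its regular expression are assumptions for illustration and are not taken from the DicomExportService source. The AE titles and address in the example are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of dest-url parsing; not the DicomExportService implementation.
class DicomUrlSketch {
    // dicom://DestinationAET:ThisAET@ipaddress:port
    private static final Pattern URL =
            Pattern.compile("dicom://([^:@/]+):([^:@/]+)@([^:@/]+):(\\d+)");

    final String destinationAET; // AE title of the receiving Storage SCP
    final String thisAET;        // AE title under which the exporter identifies itself
    final String host;
    final int port;

    private DicomUrlSketch(String destinationAET, String thisAET, String host, int port) {
        this.destinationAET = destinationAET;
        this.thisAET = thisAET;
        this.host = host;
        this.port = port;
    }

    // Splits the URL into its four parts, rejecting anything that does not match the form.
    static DicomUrlSketch parse(String url) {
        Matcher m = URL.matcher(url.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a dicom:// destination URL: " + url);
        }
        return new DicomUrlSketch(m.group(1), m.group(2), m.group(3), Integer.parseInt(m.group(4)));
    }

    public static void main(String[] args) {
        DicomUrlSketch d = parse("dicom://PACS:CTP@192.168.0.10:104");
        System.out.println(d.thisAET + " -> " + d.destinationAET + " at " + d.host + ":" + d.port);
    }
}
```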
2.5.7 DatabaseExportService
The DatabaseExportService queues objects and submits them to a DatabaseAdapter class, which must be written specially for the database in question. The configuration element for the DatabaseExportService is:
<ExportService name="stage name" class="org.rsna.trials.DatabaseExportService" adapter-class="org.myorg.MyDatabaseAdapter" root="base-directory" quarantine="quarantines/DatabaseExportQuarantine" />
where:
- name is any text to be used as a label on configuration and status pages.
- adapter-class is the class name of the database's adapter class.
- root is the base directory of the queuing storage for the ExportService.
- quarantine is a directory in which the ExportService is to quarantine objects that it cannot handle.
See Implementing a DatabaseAdapter for more information.
3 Extending ClinicalTrialProcessor
ClinicalTrialProcessor is designed to be extended with pipeline stages of new types. Stages implement one or more Java interfaces, so it is necessary to get the source code for ClinicalTrialProcessor in order to extend it, even though in principle you don't need to modify the code itself.
3.1 Obtaining the Source Code
The software for ClinicalTrialProcessor is open source. All the software written by the RSNA is released under the RSNA Public License. It is maintained on a CVS server at RSNA headquarters. To obtain the source code, configure a CVS client as follows:
- Protocol: Password server (:pserver)
- Server: mirc.rsna.org
- Port: 2401
- Repository folder: /RSNA
- User name: cvs-reader
- Password: cvs-reader
- Module: ClinicalTrialProcessor
Together, this results in the following CVSROOT (which is constructed automatically if you use something like Tortoise-CVS on a Windows system):
- :pserver:cvs-reader@mirc.rsna.org:2401/RSNA
This account has read privileges, but it cannot write into the repository, so it can check out but not commit. If you wish to be able to commit software to the CVS library, contact the MIRC project manager.
3.2 Building the Software
When you check out the ClinicalTrialProcessor module from CVS, you obtain a directory tree containing the sources and libraries for building the application. The top of the directory tree is ClinicalTrialProcessor. It contains several subdirectories. The source code is in the source directory, which has two child directories: one for the Java sources and one for any other files required by the application.
ClinicalTrialProcessor requires the Java 1.5 JDK and the JAI ImageIO Tools.
The Ant build file for ClinicalTrialProcessor is in the ClinicalTrialProcessor directory and is called build.xml. To build the software on a Windows system, launch a command window, navigate to the ClinicalTrialProcessor directory, and enter ant all.
The build file contains several targets. The all target does a clean build of everything, including the Javadocs, which are put into the documentation directory. The Javadocs can be accessed with a browser by opening the file:
- ClinicalTrialProcessor/documentation/index.html
The default target, ctp-installer, just builds the application and places the installer in the products directory.
3.3 Implementing a Pipeline Stage
To be recognized as a pipeline stage, a class must implement the org.rsna.ctp.pipeline.PipelineStage interface. An abstract class, org.rsna.ctp.pipeline.AbstractStage, is provided to supply some of the basic methods required by the PipelineStage interface. All the standard stages extend this class.
Each stage type must also implement its own interface. The interfaces are:
- org.rsna.ctp.pipeline.ImportService
- org.rsna.ctp.pipeline.Processor
- org.rsna.ctp.pipeline.StorageService
- org.rsna.ctp.pipeline.ExportService
Each stage must have a constructor which takes the XML Element from the configuration file as its argument. The constructor must obtain any configuration information it requires from the element. While it is not required that all configuration information be placed in attributes of the element, the getConfigHTML method provided by AbstractStage expects it, and if you choose to encode configuration information in another way, you must override the getConfigHTML method.
3.4 Implementing a DatabaseAdapter
Information on implementing a DatabaseAdapter can be found in Implementing an External Database Interface for MIRC Clinical Trials. It is important to note, however, that since the object types for ClinicalTrialProcessor are not in the same packages as those for MIRC, the DatabaseAdapter must be compiled as part of the ClinicalTrialProcessor build.