MIRC CTP
This article describes the stand-alone processing application for clinical trials data using MIRC components and the MIRC internet transport mechanism.
1 Background
MIRC supports clinical trials through two applications, one for data acquisition at an imaging center (FieldCenter) and one for management of the data at a principal investigator's site (MIRC).
The FieldCenter application acquires images via the DICOM protocol, anonymizes them, and transfers them (typically via HTTP, although DICOM is also supported) to a principal investigator's MIRC site. It also supports other types of data files and includes an anonymizer for XML files as well. FieldCenter also contains a client for the Update Service of a MIRC site, allowing the application to save data on, and obtain software updates from, the principal investigator's site.
The MIRC site software contains a partially configurable processing pipeline for clinical trials data, consisting of:
- HttpImportService
- A receiver for HTTP connections from FieldCenter applications transferring data files into the processing pipeline.
- DicomImportService
- A receiver for DICOM datasets from modalities, PACS, workstations, etc. for insertion into the processing pipeline.
- Preprocessor
- A user-defined component for processing data received by the HttpImportService before it is further processed by other components.
- Anonymizer
- A component for anonymizing DICOM objects or XML objects.
- DatabaseExportService
- A component providing queue management and submission of data objects to a user-defined interface to an external database management system.
- HttpExportService
- A component in the DicomImportService pipeline providing queue management and transmission of data objects to one or more external systems using the HTTP protocol.
- DicomExportService
- A component in the HttpImportService pipeline providing queue management and transmission of data objects to one or more external systems using the DICOM protocol.
The processing pipelines for the HttpImportService and DicomImportService are different. They are not symmetrical. For example, the HttpImportService does not have access to the anonymizer except as part of the DatabaseExportService. Another limitation is that objects received via one protocol can only be exported via the other. While these limitations are consistent with the requirements of most trials, it became clear that a more general design would provide better support for trials requiring complex processing while still satisfying the normal requirements.
2 ClinicalTrialProcessor (CTP)
ClinicalTrialProcessor is a stand-alone program that provides all the processing features of a MIRC site for clinical trials in a highly configurable and extensible application. It connects to FieldCenter applications and can also connect to MIRC sites when necessary. ClinicalTrialProcessor has the following key features:
- Single-click installation.
- Support for multiple pipelines.
- Processing pipelines supporting multiple configurable stages.
- Support for multiple quarantines for data objects which are rejected during processing.
- Pre-defined implementations for key components:
- HTTP Import
- DICOM Import
- DICOM Anonymizer
- XML Anonymizer
- File Storage
- Database Export
- HTTP Export
- DICOM Export
- FTP Export
- Web-based monitoring of the application's status, including:
- configuration
- logs
- quarantines
- status
2.1 Installation
The installer for ClinicalTrialProcessor is available on the RSNA MIRC site. It requires the Java 1.6 (or better) JRE to be present on the system.
To run the installer, double-click the CTP-installer.jar file and choose a directory in which to install the program. The installer can also be run in a command window using the command:
- java -jar CTP-installer.jar
To run the ClinicalTrialProcessor program, the Java Advanced Imaging ImageIO Tools must be present on the system. Java and all its components are available on the java.sun.com website. When obtaining the ImageIO Tools, note that the Java Advanced Imaging component is not the same as the Java Advanced Imaging ImageIO Tools; only the latter are required.
The ClinicalTrialProcessor has no user interface. It can be run by double-clicking the CTP.jar file, or it can be run in a command window. To do so, open a command window, navigate to the directory in which the program was installed, and enter the command:
- java -jar CTP.jar
If large images are to be acquired, it is generally advisable to run the program with a large memory pool. When running from a command window, the parameters can be set using the command:
- java -Xmx512m -Xms128m -jar CTP.jar
When the program starts, it runs without intervention. Status and other information can be obtained through the program's integrated webserver. Accessing the server with no path information displays a page presenting buttons for each of the servlets.
When the program is first installed, a single user is created with the name admin and password password. After logging in as this user, the UserManagerServlet can be used to change the name and/or password and to create other users if necessary.
To stop the program, log in as an admin user and click the Shutdown button on the main page.
2.2 Configuration Files
The program uses two configurable files: config.xml, which is located in the same directory as the program itself, and index.html, which is located in the server's ROOT directory. Both files are intended to be configured for the specific application. The installer does not overwrite these files when it runs; instead, it installs two example files, example-config.xml and example-index.html. When ClinicalTrialProcessor starts, it checks whether the non-example files are missing and, if so, copies the example files into the non-example ones. This process allows upgrades to be done without losing any configuration work. After installing the program for the first time, run it once to make the copies, and then configure the copies. Configuration is done by hand with any text editor (e.g., TextPad or NotePad). Care should be taken, especially with config.xml, to keep it well-formed; opening it with a program like Internet Explorer will check it for errors.
2.3 Server
To provide access to the status of the components, the application includes an HTTP server which serves files and provides servlet-like functionality. Files are served from a directory tree whose root is named ROOT. The ROOT directory contains a file, index.html, which provides buttons which link to several servlets providing information about the operation of the program. This file is intended to be configured with logos, additional links, etc., and upgrades do not overwrite it. The standard servlets are:
- LoginServlet allows a user to log into the system.
- UserManagerServlet allows an admin user to create users and assign them privileges.
- ConfigurationServlet displays the contents of the configuration file.
- StatusServlet displays the status of all pipeline stages.
- LogServlet provides web access to all log files in the logs directory.
- QuarantineServlet provides web access to all quarantine directories and their contents.
- SysPropsServlet displays the Java system properties.
- DicomAnonymizerServlet allows the user to configure any DICOM anonymizers in the pipelines.
- ShutdownServlet allows an admin user to shut the program down.
The configuration element for the HTTP server is:
<Server port="80" require-authentication="no" />
where:
- port is the port number on which the HTTP server listens for connections.
- require-authentication determines whether users are forced to log in to the HTTP server. Values are "yes" and "no". The default is "no". The HTTP server is typically operated without requiring authentication to allow users to monitor the status of their transmissions.
2.4 Pipelines
A pipeline is a manager that moves data objects through a sequence of processing stages. Each stage in the pipeline performs a specific function on one or more of the four basic object types supported by MIRC:
- FileObject
- DicomObject
- XmlObject
- ZipObject
Each pipeline must contain one ImportService as its first stage. Each pipeline stage may be provided access to a quarantine directory into which the stage places objects that it rejects, thus aborting further processing. Quarantine directories may be unique to each stage or shared with other stages. At the end of the pipeline, the manager calls the ImportService to remove the object from its queue.
There are four types of pipeline stages. Each is briefly described in subsections below.
2.4.1 ImportService
An ImportService receives objects via a protocol and enqueues them for processing by subsequent stages.
2.4.2 StorageService
A StorageService stores an object in a file system. It is not queued, and it therefore must complete before subsequent stages can proceed. A StorageService may return the current object or the stored object in response to a request for the output object, depending on its implementation.
2.4.3 Processor
A Processor performs some kind of processing on an object. Processors are not intended to be queued. In the context of the current MIRC implementation, a Preprocessor is a Processor, as is an Anonymizer. The result of a processing stage is an object that is passed to the next stage in the pipeline.
2.4.4 ExportService
An ExportService provides queued transmission to an external system via a defined protocol. Objects in the queue are full copies of the objects submitted; therefore, subsequent processing is not impeded if a queue is paused, and modifications made subsequently do not affect the queue entry, even if they occur before transmission. (Note: This behavior is different from that of the current MIRC implementation.) After entering an object in its queue, an ExportService returns immediately.
2.5 System Configuration
The ClinicalTrialProcessor configuration is specified by an XML file called config.xml located in the same directory as the program. There can be one Server element specifying the port on which the HTTP server is to operate, and multiple Pipeline elements, each specifying the stages which comprise it. The name of the element defining a stage is irrelevant and can be chosen for readability; each stage in a pipeline is actually defined by its Java class, specified in the class attribute. Stages are loaded automatically when the program starts, and the loader tests the stage's class to determine what kind of stage it represents. It is possible to extend the application beyond the pre-defined stages available in the implementation as described in Extending ClinicalTrialProcessor.
The following is an example of a simple configuration that might be used at an image acquisition site. It contains one pipeline which receives objects via the DICOM protocol, stores the objects locally, anonymizes them, and transmits them via HTTP (using secure sockets layer for encryption) to a principal investigator's site.
<Configuration>
    <Server port="80" />
    <Pipeline name="Main Pipeline">
        <ImportService
            name="DICOM Import"
            class="org.rsna.ctp.stdstages.DicomImportService"
            root="roots/dicom-import"
            port="1104" />
        <StorageService
            name="Storage"
            class="org.rsna.ctp.stdstages.FileStorageService"
            root="storage"
            return-stored-file="no"
            quarantine="quarantines/storage" />
        <Anonymizer
            name="Anonymizer"
            class="org.rsna.ctp.stdstages.Anonymizer"
            root="roots/anonymizer"
            acceptDicomObjects="yes"
            dicom-script="roots/anonymizer/scripts/da.script"
            quarantine="quarantines/anonymizer" />
        <ExportService
            name="HTTP Export"
            class="org.rsna.ctp.stdstages.HttpExportService"
            root="roots/http-export"
            url="https://university.edu:1443" />
    </Pipeline>
</Configuration>
The configuration above is the default file which is installed when the program is first run. Changes to the configuration made subsequently are not overwritten during an upgrade. Note that because the storage service appears in the pipeline before the anonymizer, the objects which are stored contain the PHI which was originally received, and because the anonymizer appears before the export service, anonymized objects are exported.
The following is an example of a simple configuration that might be used at a principal investigator's site. It contains one pipeline which receives objects via the HTTP protocol, stores them, and exports them to a DICOM destination:
<Configuration>
    <Server port="80" />
    <Pipeline name="Main Pipeline">
        <ImportService
            name="HTTP Import"
            class="org.rsna.ctp.stdstages.HttpImportService"
            root="roots/http-import"
            ssl="yes"
            port="1443" />
        <StorageService
            name="Storage"
            class="org.rsna.ctp.stdstages.FileStorageService"
            root="D:/storage"
            return-stored-file="no"
            quarantine="quarantines/StorageServiceQuarantine" />
        <ExportService
            name="PACS Export"
            class="org.rsna.ctp.stdstages.DicomExportService"
            root="roots/pacs-export"
            url="dicom://DestinationAET:ThisAET@ipaddress:port" />
    </Pipeline>
</Configuration>
Note that in the example above, non-DICOM objects are stored in the StorageService, but they are not exported by the DicomExportService. Each pipeline stage is responsible for testing the class of the object which it receives and processing (or ignoring) the object accordingly.
The following is an example of a more complex configuration intended only to illustrate more possibilities than the simple configurations above. This configuration receives objects, passes them to a trial-specific Processor stage to test whether they are appropriate for the trial, anonymizes objects which make it through the preprocessor, exports them to a database, anonymizes them again (with a different anonymizer script) to remove information which is not intended for storage, then stores them and exports them to a PACS.
<Configuration>
    <Server port="80" />
    <Pipeline name="Main Pipeline">
        <ImportService
            name="HTTP Import"
            class="org.rsna.ctp.stdstages.HttpImportService"
            root="roots/http-import"
            port="7777" />
        <Processor
            name="The Preprocessor"
            class="org.myorg.MyPreprocessor"
            quarantine="quarantines/PreprocessorQuarantine" />
        <Processor
            name="Main Anonymizer"
            class="org.rsna.ctp.stdstages.Anonymizer"
            root="roots/main-anonymizer"
            dicom-script="dicom-anonymizer-1.properties"
            xml-script="xml-anonymizer-1.script"
            zip-script="zip-anonymizer-1.script"
            quarantine="quarantines/MainAnonymizerQuarantine" />
        <ExportService
            name="Database Export"
            class="org.rsna.trials.DatabaseExportService"
            adapter-class="org.myorg.MyDatabaseAdapter"
            root="roots/database-export" />
        <Processor
            name="Provenance Remover"
            class="org.rsna.ctp.stdstages.Anonymizer"
            dicom-script="dicom-anonymizer-2.properties"
            root="roots/provenance-remover"
            xml-script="xml-anonymizer-2.script"
            zip-script="zip-anonymizer-2.script"
            quarantine="quarantines/ProvenanceRemoverQuarantine" />
        <StorageService
            name="Storage"
            class="org.rsna.ctp.stdstages.FileStorageService"
            root="D:/storage"
            return-stored-file="no" />
        <ExportService
            name="PACS Export"
            class="org.rsna.ctp.stdstages.DicomExportService"
            root="roots/pacs-export"
            url="dicom://DestinationAET:ThisAET@ipaddress:port" />
    </Pipeline>
</Configuration>
Multiple Pipeline elements may be included, but each must have its own ImportService element, and their ports must not conflict.
Each pipeline stage class has a constructor that is called with its configuration element, making it possible for special processor implementations to be passed additional parameters from the configuration. See Implementing a Pipeline Stage for details.
2.6 Standard Stages
The application includes several built-in, standard stages which allow most trials to be operated without writing any software. The sections below show all the configuration attributes recognized by the standard stages.
Attributes which specify directories can contain either absolute paths (e.g., D:/TrialStorage) or relative paths (e.g., quarantines/http-import-quarantine). Relative paths are relative to the directory in which the ClinicalTrialProcessor is located.
Most standard stages have attributes which determine which object types are to be accepted by the stage. These attributes are:
- acceptDicomObjects
- acceptXmlObjects
- acceptZipObjects
- acceptFileObjects
The allowed values are "yes" and "no". The default value for all these attributes is "yes". The DicomImportService and DicomExportService, which are both restricted to DicomObjects, ignore the values of these attributes.
If a standard ImportService receives an object which it is not configured to accept, it quarantines the object, or if no quarantine has been defined for the stage, it discards the object.
If a Processor, StorageService, or ExportService receives an object that it is not configured to accept, it either ignores the object or passes it unmodified to the next pipeline stage. Thus, if an anonymizer which is not configured to anonymize XmlObjects receives an XmlObject, it passes the object on without anonymization.
2.6.1 HttpImportService
The HttpImportService listens on a defined port for HTTP connections from FieldCenter applications and receives files transmitted using the HTTP protocol with Content-Type equal to application/x-mirc. The configuration element for the HttpImportService is:
<ImportService
    name="stage name"
    id="stage ID"
    class="org.rsna.ctp.stdstages.HttpImportService"
    root="root-directory"
    port="7777"
    ssl="yes"
    acceptDicomObjects="yes"
    acceptXmlObjects="yes"
    acceptZipObjects="yes"
    acceptFileObjects="yes"
    quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage. This attribute is required only when the stage is accessed by other stages. Its value, when supplied, must be unique across all pipelines.
- root is a directory for use by the ImportService for internal storage and queuing.
- port is the port on which the ImportService listens for connections.
- ssl determines whether the port uses secure sockets layer (yes) or unencrypted http (no). The default is no.
- acceptDicomObjects determines whether DicomObjects are to be enqueued when received.
- acceptXmlObjects determines whether XmlObjects are to be enqueued when received.
- acceptZipObjects determines whether ZipObjects are to be enqueued when received.
- acceptFileObjects determines whether FileObjects are to be enqueued when received.
- quarantine is a directory in which the ImportService is to quarantine objects that it receives but is configured not to accept.
2.6.2 PollingHttpImportService
The PollingHttpImportService obtains files by initiating HTTP connections to an external system. This ImportService is designed to work in conjunction with the PolledHttpExportService to allow penetration of a firewall without having to open an inbound port, as described in Security Issues. The configuration element for the PollingHttpImportService is:
<ImportService
    name="stage name"
    id="stage ID"
    class="org.rsna.ctp.stdstages.PollingHttpImportService"
    root="root-directory"
    url="http://ip:port"
    acceptDicomObjects="yes"
    acceptXmlObjects="yes"
    acceptZipObjects="yes"
    acceptFileObjects="yes"
    quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage. This attribute is required only when the stage is accessed by other stages. Its value, when supplied, must be unique across all pipelines.
- root is a directory for use by the ImportService for internal storage.
- url is the URL of the PolledHttpExportService.
- acceptDicomObjects determines whether DicomObjects are to be enqueued when received.
- acceptXmlObjects determines whether XmlObjects are to be enqueued when received.
- acceptZipObjects determines whether ZipObjects are to be enqueued when received.
- acceptFileObjects determines whether FileObjects are to be enqueued when received.
- quarantine is a directory in which the ImportService is to quarantine objects that it receives but is configured not to accept.
Note: The protocol part of the url can be http or https, the latter causing connections to be initiated using secure sockets layer.
2.6.3 DicomImportService
The DicomImportService listens on a defined port for DICOM associations from modalities, PACS, workstations, etc., and receives datasets transmitted using the DICOM protocol. The DicomImportService accepts all Application Entity Titles. The configuration element for the DicomImportService is:
<ImportService
    name="stage name"
    id="stage ID"
    class="org.rsna.ctp.stdstages.DicomImportService"
    root="root-directory"
    port="port number"
    called-aet-tag="00097770"
    calling-aet-tag="00097772" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage. This attribute is required only when the stage is accessed by other stages. Its value, when supplied, must be unique across all pipelines.
- root is a directory for use by the ImportService for internal storage and queuing.
- port is the port on which the ImportService listens for connections.
- called-aet-tag is an optional DICOM element, specified in hex with no comma separating the group and element numbers, into which to store the AE Title which was used by the sender to specify the receiver in the association. If the attribute is missing or zero, or if the value does not parse as a hex integer, the DicomImportService does not store the AE Title of the receiver in the received DICOM object.
- calling-aet-tag is an optional DICOM element, specified in hex with no comma separating the group and element numbers, into which to store the AE Title which was used by the sender to identify itself in the association. If the attribute is missing or zero, or if the value does not parse as a hex integer, the DicomImportService does not store the AE Title of the sender in the received DICOM object.
2.6.4 Anonymizer
The Anonymizer is a processor stage that includes anonymizers for each of the object types which contain defined data. When the anonymizer stage is called to process an object, it calls the anonymizer which is appropriate to the object type. Each anonymizer is configured with a script file. If a script file is either not configured or absent for an object type, objects of that type are returned unmodified. The configuration element for the Anonymizer is:
<Anonymizer
    name="stage name"
    id="stage ID"
    class="org.rsna.ctp.stdstages.Anonymizer"
    root="root-directory"
    acceptDicomObjects="yes"
    acceptXmlObjects="yes"
    acceptZipObjects="yes"
    lookup-table="lookup-table.properties"
    dicom-script="dicom-anonymizer.properties"
    xml-script="xml-anonymizer.script"
    zip-script="zip-anonymizer.script"
    quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage. This attribute is required only when the stage is accessed by other stages. Its value, when supplied, must be unique across all pipelines.
- root is a directory for use by the Anonymizer for temporary storage.
- acceptDicomObjects determines whether DicomObjects are to be anonymized.
- acceptXmlObjects determines whether XmlObjects are to be anonymized.
- acceptZipObjects determines whether ZipObjects are to be anonymized.
- lookup-table specifies the path to the lookup table used by the anonymizer.
- dicom-script specifies the path to the script for the DICOM anonymizer.
- xml-script specifies the path to the script for the XML anonymizer.
- zip-script specifies the path to the script for the Zip anonymizer (which anonymizes the manifest in a ZipObject).
- quarantine is a directory in which the Anonymizer is to quarantine objects that generate quarantine calls during processing.
Notes:
- Any object which is accepted but for which no script has been defined is quarantined. If no quarantine has been defined for the stage, the object is passed on unmodified.
- Since FileObjects do not contain formatted information, the anonymizer does not modify such objects.
- If the file identified by the dicom-script attribute is missing, the example-dicom-anonymizer.script file is copied to the specified file. The copy can then be modified using the DicomAnonymizerServlet.
- If the lookup-table attribute is missing, the lookup anonymizer function is disabled.
2.6.5 FileStorageService
The FileStorageService stores objects in a file system. It automatically defines subdirectories beneath its root directory and populates them accordingly. The configuration element for the StorageService is:
<StorageService
    name="stage name"
    id="stage ID"
    class="org.rsna.ctp.stdstages.FileStorageService"
    root="D:/storage"
    type="month"
    acceptDicomObjects="yes"
    acceptXmlObjects="yes"
    acceptZipObjects="yes"
    acceptFileObjects="yes"
    return-stored-file="yes"
    require-authentication="no"
    set-world-readable="no"
    set-world-writable="no"
    fs-name-tag="00097770"
    port="85"
    quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage. This attribute is required only when the stage is accessed by other stages. Its value, when supplied, must be unique across all pipelines.
- root is the root directory of the storage tree.
- type determines the structure of the storage tree. The allowed values are:
- year: root/FSNAME/year/studyName, e.g. root/2008/123.456.789
- month: root/FSNAME/year/month/studyName, e.g. root/2008/06/123.456.789
- week: root/FSNAME/year/week/studyName, e.g. root/2008/36/123.456.789
- day: root/FSNAME/year/day/studyName, e.g. root/2008/341/123.456.789
- none: root/FSNAME/studyName, e.g. root/123.456.789
(FSNAME is the name of the file system to which the study belongs. See the fs-name-tag attribute below for additional information.)
- acceptDicomObjects determines whether DicomObjects are to be stored.
- acceptXmlObjects determines whether XmlObjects are to be stored.
- acceptZipObjects determines whether ZipObjects are to be stored.
- acceptFileObjects determines whether FileObjects are to be stored.
- return-stored-file specifies whether the original object or a new object pointing to the file in the storage system is to be returned for processing by subsequent stages. Values are "yes" and "no". The default is "yes".
- require-authentication determines whether users are forced to log in to the web server. Values are "yes" and "no". The default is "no".
- set-world-readable determines whether the FileStorageService makes all files and directories readable by all users. Values are "yes" and "no". The default is "no". This attribute should only be used if users or other programs are to be allowed to access files without using the FileStorageService web server. This feature is not recommended.
- set-world-writable determines whether the FileStorageService makes all files and directories writable by all users. Values are "yes" and "no". The default is "no". This attribute should only be used if users or other programs are to be allowed to write files into the FileStorageService without using the FileStorageService web server. This feature is not recommended.
- fs-name-tag is an optional DICOM element, specified in hex with no comma separating the group and element numbers, which is used to specify the name of the root directory's child under which to store a received object. If the attribute is missing from the configuration or if the specified element is missing from the received object, or if the contents of the specified element are blank, the object is stored under the "__default" tree.
- port specifies the port on which a web server is to be started to provide access to the stored studies. If the attribute is missing, no web server is started for the FileStorageService.
- quarantine is a directory in which the StorageService is to quarantine objects that cannot be stored.
For more information on the embedded web server in the FileStorageService, see The CTP FileStorageService Web Server.
Notes:
- Files are stored in a tree of directories with the root of the tree as defined in the root attribute of the configuration element.
- Below the root, directories are organized according to the type attribute (for example, year and month directories such as "2007/09" for the month type).
- At the bottom of the tree, directories are organized by StudyInstanceUID, thus grouping all objects received for a specific study together in one directory.
- Objects are stored with standard file extensions:
- .dcm for DicomObjects
- .xml for XmlObjects
- .zip for ZipObjects
- .md for FileObjects
- Any object not containing a StudyInstanceUID or StudyUID is stored in the month's bullpen directory.
- The fs-name-tag attribute is designed to work in combination with the called-aet-tag attribute of the DicomImportService to allow separation of objects into different storage trees based on the AE Title used to specify the destination.
2.6.6 HttpExportService
The HttpExportService queues objects and transmits them via HTTP with Content-Type application/x-mirc. The configuration element for the HttpExportService is:
<ExportService
    name="stage name"
    id="stage ID"
    class="org.rsna.ctp.stdstages.HttpExportService"
    root="root-directory"
    url="http://ipaddress:port/path"
    acceptDicomObjects="yes"
    acceptXmlObjects="yes"
    acceptZipObjects="yes"
    acceptFileObjects="yes"
    interval="10000" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage. This attribute is required only when the stage is accessed by other stages. Its value, when supplied, must be unique across all pipelines.
- root is a directory for use by the ExportService for internal storage and queuing.
- url specifies the destination system's URL.
- acceptDicomObjects determines whether DicomObjects are to be exported.
- acceptXmlObjects determines whether XmlObjects are to be exported.
- acceptZipObjects determines whether ZipObjects are to be exported.
- acceptFileObjects determines whether FileObjects are to be exported.
- interval is the sleep time (in milliseconds) between polls of the export queue.
Notes:
- The default interval is 10 seconds. The minimum allowed value is one second. The maximum allowed value is 20 seconds.
- The protocol part of the url can be http or https, the latter causing connections to be initiated using secure sockets layer.
2.6.7 PolledHttpExportService
The PolledHttpExportService queues objects and transmits them in the HTTP response stream of a received connection. Files are transmitted with Content-Type equal to application/x-mirc. This ExportService is designed to work in conjunction with the PollingHttpImportService to allow penetration of a firewall without having to open an inbound port, as described in Security Issues. The configuration element for the PolledHttpExportService is:
<ExportService
    name="stage name"
    id="stage ID"
    class="org.rsna.ctp.stdstages.PolledHttpExportService"
    root="root-directory"
    port="listening-port"
    ssl="yes"
    acceptDicomObjects="yes"
    acceptXmlObjects="yes"
    acceptZipObjects="yes"
    acceptFileObjects="yes" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage. This attribute is required only when the stage is accessed by other stages. Its value, when supplied, must be unique across all pipelines.
- root is a directory for use by the ExportService for internal storage and queuing.
- port is the port on which the ExportService listens for connections.
- ssl determines whether the port uses secure sockets layer (yes) or unencrypted http (no).
Note: The ssl attribute must correspond to the protocol used in the PollingHttpImportService which connects to the PolledHttpExportService. Since these services are typically used to penetrate a firewall within an institution, secure sockets layer is not normally needed for security.
- acceptDicomObjects determines whether DicomObjects are to be exported.
- acceptXmlObjects determines whether XmlObjects are to be exported.
- acceptZipObjects determines whether ZipObjects are to be exported.
- acceptFileObjects determines whether FileObjects are to be exported.
2.6.8 DicomExportService
The DicomExportService queues objects and transmits them to a DICOM Storage SCP. The configuration element for the DicomExportService is:
<ExportService
    name="stage name"
    id="stage ID"
    class="org.rsna.ctp.stdstages.DicomExportService"
    root="root-directory"
    url="dicom://DestinationAET:ThisAET@ipaddress:port"
    interval="10000" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage. This attribute is required only when the stage is accessed by other stages. Its value, when supplied, must be unique across all pipelines.
- root is a directory for use by the ExportService for internal storage and queuing.
- url specifies the destination DICOM Storage SCP's URL.
- interval is the sleep time (in milliseconds) between polls of the export queue.
Note: The default interval is 10 seconds. The minimum allowed value is one second. The maximum allowed value is 20 seconds.
2.6.9 FtpExportService
The FtpExportService queues objects and transmits them to an FTP server. The configuration element for the FtpExportService is:
<ExportService
    name="stage name"
    id="stage ID"
    class="org.rsna.ctp.stdstages.FtpExportService"
    root="root-directory"
    url="ftp://@ipaddress:port/path"
    username="..."
    password="..."
    acceptDicomObjects="yes"
    acceptXmlObjects="yes"
    acceptZipObjects="yes"
    acceptFileObjects="yes"
    interval="10000" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage. This attribute is required only when the stage is accessed by other stages. Its value, when supplied, must be unique across all pipelines.
- root is a directory for use by the ExportService for internal storage and queuing.
- url specifies the destination FTP server's URL:
- ipaddress can be a numeric address or a domain name.
- port is the port on which the FTP listener listens. The default port is 21.
- path is the base directory on the FTP server in which the FtpExportService will create StudyInstance directories.
- username specifies the username under which the FtpExportService will log in to the FTP server.
- password specifies the password to be used in the login process.
- acceptDicomObjects determines whether DicomObjects are to be exported.
- acceptXmlObjects determines whether XmlObjects are to be exported.
- acceptZipObjects determines whether ZipObjects are to be exported.
- acceptFileObjects determines whether FileObjects are to be exported.
- interval is the sleep time (in milliseconds) between polls of the export queue.
Notes:
- The default interval is 10 seconds. The minimum allowed value is one second. The maximum allowed value is 20 seconds.
- The FtpExportService stores files in subdirectories of the path part of the URL, organized by StudyInstanceUID (or StudyUID in the case of non-DICOM files). Files not containing a StudyUID are stored in the bullpen directory under the path directory.
- If the directory specified by the path does not exist, it is created.
- Files are stored within their directories with names that consist of the date and time (to the millisecond) when they were transferred to the server.
- If a file is transmitted multiple times, multiple copies of the file will appear within its directory, each named with the date/time of its transfer.
- No index of the studies and files is created on the server.
2.6.10 DatabaseExportService
The DatabaseExportService queues objects and submits them to a DatabaseAdapter class, which must be written specially for the database in question. The configuration element for the DatabaseExportService is:
<ExportService
    name="stage name"
    id="stage ID"
    class="org.rsna.ctp.stdstages.DatabaseExportService"
    adapter-class="org.myorg.MyDatabaseAdapter"
    root="root-directory"
    acceptDicomObjects="yes"
    acceptXmlObjects="yes"
    acceptZipObjects="yes"
    acceptFileObjects="yes"
    interval="10000" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage. This attribute is required only when the stage is accessed by other stages. Its value, when supplied, must be unique across all pipelines.
- adapter-class is the class name of the database's adapter class. See Implementing a DatabaseAdapter for more information.
- root is a directory for use by the ExportService for internal storage and queuing.
- acceptDicomObjects determines whether DicomObjects are to be exported.
- acceptXmlObjects determines whether XmlObjects are to be exported.
- acceptZipObjects determines whether ZipObjects are to be exported.
- acceptFileObjects determines whether FileObjects are to be exported.
- interval is the sleep time (in milliseconds) between polls of the export queue.
Note: The default interval is 10 seconds. The minimum allowed value is one second. The maximum allowed value is 20 seconds.
3 Extending ClinicalTrialProcessor
ClinicalTrialProcessor is designed to be extended with pipeline stages of new types. Because stages implement one or more Java interfaces, extending the application requires obtaining the ClinicalTrialProcessor source code, even though in principle the existing code does not need to be modified.
3.1 Obtaining the Source Code
The software for ClinicalTrialProcessor is open source. All the software written by the RSNA for the project is released under the RSNA Public License. It is maintained on a CVS server at RSNA headquarters. To obtain the source code, configure a CVS client as follows:
- Protocol: Password server (:pserver)
- Server: mirc.rsna.org
- Port: 2401
- Repository folder: /RSNA
- Username: cvs-reader
- Password: cvs-reader
- Module: ClinicalTrialProcessor
Together, this results in the following CVSROOT (which is constructed automatically if you use a client like TortoiseCVS on a Windows system):
- :pserver:cvs-reader@mirc.rsna.org:2401/RSNA
This account has read privileges, but it cannot write into the repository, so it can check out but not commit. If you wish to be able to commit software to the CVS library, contact the MIRC project manager.
3.2 Building the Software
When you check out the ClinicalTrialProcessor module from CVS, you obtain a directory tree full of the sources and libraries for building the application. The top of the directory tree is ClinicalTrialProcessor. It contains several subdirectories. The source code is in the source directory, which has two subdirectories, one each for the Java sources and the files required by the application.
Building ClinicalTrialProcessor requires the Java 1.6 JDK and Ant. Running ClinicalTrialProcessor requires the JDK or JRE and the JAI ImageIO Tools.
The Ant build file for ClinicalTrialProcessor is in the ClinicalTrialProcessor directory and is called build.xml. To build the software on a Windows system, launch a command window, navigate to the ClinicalTrialProcessor directory, and enter the command:
- ant all
The build file contains several targets. The all target does a clean build of everything, including the Javadocs, which are put into the documentation directory. The Javadocs can be accessed with a browser by opening the file:
- ClinicalTrialProcessor/documentation/index.html
The default target, ctp-installer, just builds the application and places the installer in the products directory.
3.3 The Object Classes
ClinicalTrialProcessor provides four classes to encapsulate files of various types. The classes are located in the org.rsna.ctp.objects package:
- DicomObject - a DICOM dataset
- XmlObject - an XML file containing identifiers relating the data to the trial and the trial subject
- ZipObject - a zip file containing a manifest.xml file providing identifiers relating the zip file's contents to the trial and the trial subject
- FileObject - a generic file of unknown contents and format
Each class provides methods allowing pipeline stages or database adapters to access the internals of an object without having to know how to parse it. See the Javadocs for a list of all the methods provided by these classes.
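For example, a pipeline stage or database adapter can branch on the concrete class of the object it receives rather than parsing the underlying file itself. The sketch below is illustrative only: it assumes, as the behavior of the standard stages suggests, that the more specific object classes extend FileObject, and the helper class itself is hypothetical. The extensions it returns are the standard storage extensions listed in the FileStorageService notes; consult the Javadocs for the actual methods each class provides.

import org.rsna.ctp.objects.DicomObject;
import org.rsna.ctp.objects.FileObject;
import org.rsna.ctp.objects.XmlObject;
import org.rsna.ctp.objects.ZipObject;

// Hypothetical helper: map an object to the standard storage extension
// used by the FileStorageService (.dcm, .xml, .zip, .md).
public class ObjectTypes {
    public static String standardExtension(FileObject object) {
        if (object instanceof DicomObject) return ".dcm";
        if (object instanceof XmlObject) return ".xml";
        if (object instanceof ZipObject) return ".zip";
        return ".md"; // generic FileObject of unknown contents
    }
}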
3.4 Implementing a Pipeline Stage
To be recognized as a pipeline stage, a class must implement the org.rsna.ctp.pipeline.PipelineStage interface. An abstract class, org.rsna.ctp.pipeline.AbstractStage, is provided to supply some of the basic methods required by the PipelineStage interface. All the standard stages extend this class.
Each stage type must also implement its own interface. The interfaces are:
- org.rsna.ctp.pipeline.ImportService
- org.rsna.ctp.pipeline.Processor
- org.rsna.ctp.pipeline.StorageService
- org.rsna.ctp.pipeline.ExportService
The Javadocs explain the methods which must be implemented in each stage type.
Each stage class must have a constructor which takes its configuration file XML Element as its argument. The constructor must obtain any configuration information it requires from the element. While it is not required that all configuration information be placed in attributes of the element, the getConfigHTML method provided by AbstractStage expects it, and if you choose to encode configuration information in another way, you must override the getConfigHTML method to make that information available to the configuration servlet.
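As an illustration of these requirements, the following sketch shows what a minimal custom Processor stage (such as the org.myorg.MyPreprocessor referenced in the earlier example configuration) might look like. It is only a sketch: the exact method signatures required by the PipelineStage and Processor interfaces are documented in the Javadocs, and the process method shown here, the trial-id attribute, and the assumption that AbstractStage has a constructor taking the configuration Element are illustrative assumptions.

import org.rsna.ctp.objects.DicomObject;
import org.rsna.ctp.objects.FileObject;
import org.rsna.ctp.pipeline.AbstractStage;
import org.rsna.ctp.pipeline.Processor;
import org.w3c.dom.Element;

// Sketch of a minimal custom Processor stage (consult the Javadocs for
// the exact signatures required by the Processor interface).
public class MyPreprocessor extends AbstractStage implements Processor {

    String trialID; // hypothetical extra parameter taken from the config element

    // The constructor receives the stage's configuration element and can
    // read any additional attributes it needs from it.
    public MyPreprocessor(Element element) {
        super(element); // assumes AbstractStage accepts the element
        trialID = element.getAttribute("trial-id"); // hypothetical attribute
    }

    // Process one object and return the object to pass to the next stage.
    // Objects this stage does not handle are passed on unmodified.
    public FileObject process(FileObject fileObject) {
        if (fileObject instanceof DicomObject) {
            DicomObject dob = (DicomObject) fileObject;
            // ... trial-specific checks or modifications would go here ...
        }
        return fileObject;
    }
}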
3.5 Implementing a DatabaseAdapter
The DatabaseExportService pipeline stage provides a queuing mechanism for submitting files to a database interface, relieving the interface from having to manage the queue. It calls the overloaded process method of the interface with one of the four object types. Each of the objects includes methods providing access to the internals of its file, allowing the interface to interrogate objects to obtain some or all of their data to insert into an external system.
The DatabaseExportService dynamically loads the database interface class, obtaining the name of the class from the configuration element's adapter-class attribute.
3.5.1 The DatabaseAdapter Class
The DatabaseAdapter class, org.rsna.ctp.stdstages.database.DatabaseAdapter, is a base class for building an interface between the DatabaseExportService and an external database. To be recognized and loaded by the DatabaseExportService, an external database interface class must be an extension of DatabaseAdapter.
The DatabaseAdapter class provides a set of methods allowing the DatabaseExportService to perform various functions, all of which are explained in the Javadocs. The basic interaction model is:
- When the DatabaseExportService detects that files are in its queue, it determines whether the database interface class is loaded and loads it if necessary.
- It then calls the database interface’s connect() method.
- For each file in the queue, it instantiates an object matching the file’s contents and calls the database interface’s process() method. There are four overloaded process methods, one for each object class.
- When the queue is empty, it calls the database interface’s disconnect() method.
All the methods of the DatabaseAdapter class return a static instance of the org.rsna.ctp.pipeline.Status class to indicate the result. The values are:
- Status.OK means that the operation succeeded completely.
- Status.FAIL means that the operation failed and trying again will also fail. This status value indicates a problem with the object being processed.
- Status.RETRY means that the operation failed but trying again later may succeed. This status value indicates a temporary problem accessing the external database.
All the methods of the base DatabaseAdapter class return the value Status.OK.
3.5.2 Extending the DatabaseAdapter Class
To implement a useful interface to an external database, you must extend the DatabaseAdapter class.
Since the DatabaseAdapter class implements dummy methods returning Status.OK, your class that extends DatabaseAdapter only has to override the methods that apply to your application. If, for example, you only care about XML objects, you can just override the process(XmlObject xmlObject) method and let DatabaseAdapter supply the other process() methods, thus ignoring objects of other types.
Although the DatabaseAdapter class includes reset() and shutdown() methods, they are not called by the DatabaseExportService because restarts are not done in ClinicalTrialProcessor and there is no notice of an impending shutdown. You should therefore ensure that the data is protected in the event of, for example, a power failure. Similarly, since one connect() call is made for possibly multiple process() method calls, it is possible that a failure could result in no disconnect() call. Thus, depending on the design of the external system, it may be wise to commit changes in each process() call.
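The following sketch illustrates this pattern for a trial that only cares about XmlObjects, in the spirit of the org.myorg.MyDatabaseAdapter class referenced in the earlier configuration examples. It is not a definitive implementation: the method signatures should be checked against the Javadocs, and the database operations are represented only by comments.

import org.rsna.ctp.objects.XmlObject;
import org.rsna.ctp.pipeline.Status;
import org.rsna.ctp.stdstages.database.DatabaseAdapter;

// Sketch of a DatabaseAdapter extension that records XmlObjects only.
// The other process() methods are inherited from DatabaseAdapter and
// return Status.OK, so objects of other types are effectively ignored.
public class MyDatabaseAdapter extends DatabaseAdapter {

    public Status connect() {
        // open the connection to the external database (details omitted)
        return Status.OK;
    }

    public Status disconnect() {
        // close the connection
        return Status.OK;
    }

    public Status process(XmlObject xmlObject) {
        try {
            // insert the identifiers of interest into the external system here;
            // committing in each call guards against a missed disconnect().
            return Status.OK;
        }
        catch (Exception ex) {
            // a temporary problem accessing the database: ask to try again later
            return Status.RETRY;
        }
    }
}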
3.5.3 Connecting Your Database Interface Class to ClinicalTrialProcessor
The easiest way to connect your class into ClinicalTrialProcessor is to create a package for it under the ctp tree and then build the entire application. This will avoid your having to change the manifest declaration in the build.xml file, and it will ensure that the class is included in the installation without having to add a jar file.
4 Security Issues
In a clinical trial, transmission of data from image acquisition sites to the principal investigator site involves penetrating at least one firewall. Since the image acquisition site initiates an outbound connection, this only rarely requires special action. At the principal investigator's site, where the connection is inbound, some provision must be made to allow the connection to reach its destination. There are two basic solutions. The simplest solution is to open a port in the firewall at the principal investigator's site and route connections for that port to the computer running the ClinicalTrialProcessor application. In some institutions, however, security policies prohibit this solution. The alternative is to use the PolledHttpExportService and PollingHttpImportService to allow data to flow without having to open any ports on the internal network to inbound connections.
Using this latter solution requires two computers, each running ClinicalTrialProcessor. One computer is placed in the border router's DMZ with one port open to the internet, allowing connections to the HttpImportService of the program running in that computer. That program's pipeline includes a PolledHttpExportService which queues objects and waits for a connection before passing them onward. The second computer is placed on the internal network. Its program has a pipeline which starts with a PollingHttpImportService. That ImportService is configured to make outbound connections to the DMZ computer when a file is requested. This allows files to pass through the firewall on the response stream of the outbound connection without having to open any ports to the internal network.
5 Notes
5.1 Important Note for Unix/Linux Platforms
Unix and its derivatives require that applications listening on ports with numbers less than 1024 have special privileges. On such systems, it might be best to put all import services and all web servers on ports above this value. The default configuration puts the web server on port 80 because that is the default port for the web. This can be changed before the program is run for the first time by editing the example-config.xml file, or after it has been run the first time by editing the config.xml file.
5.2 The Central Remapping Service
Anonymizers have many functions for mapping PHI into trial-specific identifiers which cannot be traced back to the patient. These functions can be grouped into two categories, one which uses tables to maintain correspondences between PHI and its replacement values, and one which avoids the need for tables by using hashing algorithms to compute the replacement values.
The FieldCenter application includes anonymizers which have both categories of mappers. In the category of mappers which use tables, FieldCenter provides both a local mapping table capability and one which can use a central mapping agent provided by a MIRC site.
Hashing algorithms are both more efficient and more secure, so they are preferred over those that use tables, and all new trials are encouraged to use them.
ClinicalTrialProcessor does not provide a central remapping agent. It also does not provide table-based functions in its anonymizers.
5.3 The Update Service
The Update Service is not currently implemented in ClinicalTrialProcessor. One of the primary motivations for the Update Service was for automatic backups of FieldCenter applications' local mapping tables. As this approach is being deprecated, the motivation now is only for software and anonymizer table distribution. Trials which need those functions today can install MIRC sites to provide them in the interim until the service is implemented in ClinicalTrialProcessor.