Difference between revisions of "MIRC CTP"
Revision as of 16:47, 9 June 2009
This article describes the stand-alone processing application for clinical trials data using MIRC components and the MIRC internet transport mechanism.
1 Background
MIRC supports clinical trials through two applications, one for data acquisition at an imaging center (FieldCenter) and one for management of the data at a principal investigator's site (MIRC).
The FieldCenter application acquires images via the DICOM protocol, anonymizes them, and transfers them (typically via HTTP, although DICOM is also supported) to a principal investigator's MIRC site. It also supports other types of data files and includes an anonymizer for XML files as well. FieldCenter also contains a client for the Update Service of a MIRC site, allowing the application to save data on, and obtain software updates from, the principal investigator's site.
The MIRC site software contains a partially configurable processing pipeline for clinical trials data, consisting of:
- HttpImportService
- A receiver for HTTP connections from FieldCenter applications transferring data files into the processing pipeline.
- DicomImportService
- A receiver for DICOM datasets from modalities, PACS, workstations, etc. for insertion into the processing pipeline.
- Preprocessor
- A user-defined component for processing data received by the HttpImportService before it is further processed by other components.
- Anonymizer
- A component for anonymizing DICOM objects or XML objects.
- DatabaseExportService
- A component providing queue management and submission of data objects to a user-defined interface to an external database management system.
- HttpExportService
- A component in the DicomImportService pipeline providing queue management and transmission of data objects to one or more external systems using the HTTP protocol.
- DicomExportService
- A component in the HttpImportService pipeline providing queue management and transmission of data objects to one or more external systems using the DICOM protocol.
The processing pipelines for the HttpImportService and the DicomImportService are different and not symmetrical. For example, the HttpImportService does not have access to the anonymizer except as part of the DatabaseExportService. Another limitation is that objects received via one protocol can only be exported via the other. While these limitations are consistent with the requirements of most trials, it became clear that a more general design would provide better support for trials requiring complex processing while still satisfying the normal requirements.
2 Clinical Trial Processor (CTP)
CTP is a stand-alone program that provides all the processing features of a MIRC site for clinical trials in a highly configurable and extensible application. It connects to FieldCenter applications and can also connect to MIRC sites when necessary. CTP has the following key features:
- Single-click installation.
- Support for multiple pipelines.
- Processing pipelines supporting multiple configurable stages.
- Support for multiple quarantines for data objects which are rejected during processing.
- Pre-defined implementations for key components:
- HTTP Import
- DICOM Import
- DICOM Anonymizer
- XML Anonymizer
- File Storage
- Database Export
- HTTP Export
- DICOM Export
- FTP Export
- Web-based monitoring of the application's status, including:
- configuration
- logs
- quarantines
- status
2.1 Installation
The installer for CTP is available on the RSNA MIRC site. To run the installer, the Java 1.6 (or later) JRE must be present on the system. Java and all its components are available through the Java website.
To run the CTP installer, double-click the CTP-installer.jar file and choose a directory in which to install CTP. The installer can also be run in a command window using the command:
- java -jar CTP-installer.jar
Certain CTP pipeline stages (FileStorageService, BasicFileStorageService) require that the Java Advanced Imaging ImageIO Tools be present on the system. Note that the Java Advanced Imaging component is not the same as the Java Advanced Imaging ImageIO Tools; only the latter is required. (Macintosh users should read ImageIO Tools for Macintosh in the Notes section.)
CTP has no user interface. It can be run by double-clicking the CTP.jar file, or it can be run in a command window. To do so, open a command window, navigate to the directory in which the program was installed, and enter the command:
- java -jar CTP.jar
If large images are to be acquired, it is generally advisable to run the program with a large memory pool. When running from a command window, the heap-size parameters can be set on the command line (here, a 512 MB maximum heap and a 128 MB initial heap):
- java -Xmx512m -Xms128m -jar CTP.jar
When the program starts, it runs without intervention. Status and other information can be obtained through the program's integrated webserver. Accessing the server with no path information displays a page presenting buttons for each of the servlets.
When the program is first installed, a single user is provided with the name admin and password password. After logging in as this user, the UserManagerServlet can be used to change the name and/or password and to create other users if necessary.
To stop the program, log in as an admin user and click the Shutdown button on the main page.
2.2 Configuration Files
The program uses two configurable files: config.xml, which is located in the same directory as the program itself, and index.html, which is located in the server's ROOT directory. Both files are intended to be configured for the specific application. The installer does not overwrite these files when it runs; instead, it installs two example files: example-config.xml and example-index.html. When CTP starts, it checks whether the non-example files are missing, and if so, it copies the example files into the non-example ones. This process allows upgrades to be done without losing any configuration work. After installing the program for the first time, run it once to create the copies, and then configure the copies. Configuration is done by hand with any text editor (e.g., TextPad or Notepad). Care should be taken, especially with config.xml, to keep the file well-formed. Opening it in a program such as Internet Explorer will check it for errors.
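As an alternative to opening the file in a browser, well-formedness can also be checked with a short script. The following is a minimal sketch using Python's standard XML parser; the function name check_well_formed is hypothetical and not part of CTP, and the script assumes it is run from the CTP installation directory.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_data):
    """Return (True, "") if xml_data parses as XML, else (False, parser message).

    Hypothetical helper, not part of CTP. Accepts str or bytes.
    """
    try:
        ET.fromstring(xml_data)
        return True, ""
    except ET.ParseError as err:
        return False, str(err)


if __name__ == "__main__":
    # Assumption: the current directory is the CTP installation directory.
    with open("config.xml", "rb") as f:
        ok, message = check_well_formed(f.read())
    print("config.xml is well-formed" if ok else "config.xml error: " + message)
```

The parser message includes the line and column of the first error, which is usually enough to locate an unclosed element or a missing quote.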
2.3 Server
To provide access to the status of the components, the application includes an HTTP server which serves files and provides servlet-like functionality. Files are served from a directory tree whose root is named ROOT. The ROOT directory contains a file, index.html, containing buttons that link to several servlets providing information about the operation of the program. This file is intended to be configured with logos, additional links, etc., and upgrades do not overwrite it. The standard servlets are:
- LoginServlet allows a user to log into the system.
- UserManagerServlet allows an admin user to create users and assign them privileges.
- ConfigurationServlet displays the contents of the configuration file.
- StatusServlet displays the status of all pipeline stages.
- LogServlet provides web access to all log files in the logs directory.
- QuarantineServlet provides web access to all quarantine directories and their contents.
- IDMapServlet allows an admin user to access a database of PHI and anonymized replacements for patient IDs, accession numbers, and UIDs.
- SysPropsServlet displays the Java system properties.
- DicomAnonymizerServlet allows an admin user to configure any DICOM anonymizers in the pipelines.
- ScriptServlet allows an admin user to configure the scripts for script-based pipeline stages, including the various Filter stages and the XML and Zip anonymizers.
- LookupServlet allows an admin user to configure lookup tables used by the anonymizers.
- ShutdownServlet allows an admin user to shut the program down.
The configuration element for the HTTP server is:
<Server port="80" ssl="no" requireAuthentication="no" usersClassName="org.rsna.ctp.server.UsersXmlFileImpl" />
where:
- port is the port number on which the HTTP server listens for connections.
- ssl determines whether the HTTP server uses SSL. Values are "yes" and "no". The default is "no".
- requireAuthentication determines whether users are forced to log in to the HTTP server. Values are "yes" and "no". The default is "no". (The HTTP server is typically operated without requiring authentication, thus allowing users to monitor the status of the system without having to log in.)
- usersClassName specifies the Java class to be used for authentication of users and their privileges. If this attribute is missing, the standard CTP implementation (shown above) is employed. This attribute should only be included if a special mechanism has been implemented on the site (for example, using LDAP).
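The defaults listed above can be illustrated by reading the Server element and filling in the missing attributes. This is a sketch only: server_settings is a hypothetical helper, not CTP's own loader, and because no default for port is stated here, a missing port is reported as None.

```python
import xml.etree.ElementTree as ET

# Default implementation class named in the Server element shown above.
DEFAULT_USERS_CLASS = "org.rsna.ctp.server.UsersXmlFileImpl"


def server_settings(config_xml):
    """Return the Server attributes with the documented defaults applied.

    Hypothetical helper for illustration; returns None if there is no
    Server element.
    """
    server = ET.fromstring(config_xml).find("Server")
    if server is None:
        return None
    port = server.get("port")
    return {
        # No default port is documented, so a missing port stays None.
        "port": int(port) if port is not None else None,
        "ssl": server.get("ssl", "no") == "yes",
        "requireAuthentication": server.get("requireAuthentication", "no") == "yes",
        "usersClassName": server.get("usersClassName", DEFAULT_USERS_CLASS),
    }
```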
2.4 Pipelines
A pipeline is a manager that moves data objects through a sequence of processing stages. Each stage in the pipeline performs a specific function on one or more of the four basic object types supported by MIRC:
- FileObject
- DicomObject
- XmlObject
- ZipObject
Each pipeline must contain at least one ImportService. Each pipeline stage may be provided access to a quarantine directory into which the stage places objects that it rejects, thus removing them from the pipeline and aborting further processing. Quarantine directories may be unique to each stage or shared with other stages. At the end of the pipeline, the manager calls the ImportService which provided the object to remove it from its queue.
There are four types of pipeline stages. Each is briefly described in subsections below.
2.4.1 ImportService
An ImportService receives objects via a protocol and enqueues them for processing by subsequent stages.
2.4.2 Processor
A Processor performs some kind of processing on an object. Processors are not intended to be queued. In the context of the current MIRC implementation, a Preprocessor is a Processor, as is an Anonymizer. The result of a processing stage is an object that is passed to the next stage in the pipeline.
2.4.3 StorageService
A StorageService stores an object in a file system. It is not queued, and it therefore must complete before subsequent stages can proceed. A StorageService may return the current object or the stored object in response to a request for the output object, depending on its implementation.
2.4.4 ExportService
An ExportService provides queued transmission to an external system via a defined protocol. Objects in the queue are full copies of the objects submitted; therefore, subsequent processing is not impeded if a queue is paused, and modifications made subsequently do not affect the queue entry, even if they occur before transmission. (Note: This behavior is different from that of the current MIRC implementation.) After entering an object in its queue, an ExportService returns immediately.
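The behavior described in this section can be modeled in a few lines. The sketch below is a conceptual model only and does not reflect CTP's Java classes: all class and function names are hypothetical, data objects are represented as plain dictionaries, and a stage signals rejection by returning None after placing the object in its quarantine.

```python
import copy


class Stage:
    """Hypothetical stage: process() returns the object to pass on, or None if rejected."""

    def __init__(self, name):
        self.name = name
        self.quarantine = []  # stands in for a quarantine directory

    def process(self, obj):
        return obj


class RequirePatientID(Stage):
    """A Processor that quarantines objects lacking a PatientID."""

    def process(self, obj):
        if not obj.get("PatientID"):
            self.quarantine.append(obj)
            return None
        return obj


class Anonymize(Stage):
    """A Processor that replaces the PatientID."""

    def process(self, obj):
        obj["PatientID"] = "ANON"
        return obj


class Export(Stage):
    """An ExportService: enqueues a full copy and returns immediately."""

    def __init__(self, name):
        super().__init__(name)
        self.queue = []

    def process(self, obj):
        self.queue.append(copy.deepcopy(obj))
        return obj


def run_pipeline(stages, obj):
    """Pass obj through each stage in order; stop as soon as a stage rejects it."""
    for stage in stages:
        obj = stage.process(obj)
        if obj is None:
            return None
    return obj
```

Because Export stores a deep copy, placing it before Anonymize in the stage list leaves the queued entry unchanged when the anonymizer later modifies the object, which mirrors the copy semantics described for the ExportService.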
2.5 System Configuration
The CTP configuration is specified by an XML file called config.xml located in the same directory as the program. There can be one Server element specifying the port on which the HTTP server is to operate, and multiple Pipeline elements, each specifying the stages which comprise it. The name of the element defining a stage is irrelevant and can be chosen for readability; each stage in a pipeline is actually defined by its Java class, specified in the class attribute. Stages are loaded automatically when the program starts, and the loader tests the stage's class to determine what kind of stage it represents. It is possible to extend the application beyond the pre-defined stages available in the implementation as described in Extending CTP.
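Since a stage is identified by its class attribute rather than by its element name, a quick way to review a configuration is to list each pipeline's stages by class. The following sketch assumes only the structure described above (Pipeline elements containing stage elements); list_stages is a hypothetical helper, not part of CTP.

```python
import xml.etree.ElementTree as ET


def list_stages(config_xml):
    """Return {pipeline name: [(element name, class attribute), ...]}.

    Hypothetical helper: the element name is reported only for readability,
    since the class attribute is what defines the stage.
    """
    root = ET.fromstring(config_xml)
    return {
        pipeline.get("name"): [(stage.tag, stage.get("class")) for stage in pipeline]
        for pipeline in root.findall("Pipeline")
    }
```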
The following is an example of a simple configuration that might be used at an image acquisition site. It contains one pipeline which receives objects via the DICOM protocol, stores the objects locally, anonymizes them, and transmits them via HTTP (using secure sockets layer for encryption) to a principal investigator's site.
<Configuration>
    <Server port="80" />
    <Pipeline name="Main Pipeline">
        <ImportService
            name="DICOM Import"
            class="org.rsna.ctp.stdstages.DicomImportService"
            root="roots/dicom-import"
            port="1104" />
        <StorageService
            name="Storage"
            class="org.rsna.ctp.stdstages.FileStorageService"
            root="storage"
            returnStoredFile="no"
            quarantine="quarantines/storage" />
        <Anonymizer
            name="Anonymizer"
            class="org.rsna.ctp.stdstages.DicomAnonymizer"
            root="roots/anonymizer"
            script="scripts/da.script"
            quarantine="quarantines/anonymizer" />
        <ExportService
            name="HTTP Export"
            class="org.rsna.ctp.stdstages.HttpExportService"
            root="roots/http-export"
            url="https://university.edu:1443" />
    </Pipeline>
</Configuration>
The configuration above is the default file which is installed when the program is first run. Changes to the configuration made subsequently are not overwritten during an upgrade. Note that because the storage service appears in the pipeline before the anonymizer, the objects which are stored contain the PHI which was originally received, and because the anonymizer appears before the export service, anonymized objects are exported.
The following is an example of a simple configuration that might be used at a principal investigator's site. It contains one pipeline which receives objects via the HTTP protocol, stores them, and exports them to a DICOM destination:
<Configuration>
    <Server port="80" />
    <Pipeline name="Main Pipeline">
        <ImportService
            name="HTTP Import"
            class="org.rsna.ctp.stdstages.HttpImportService"
            root="roots/http-import"
            ssl="yes"
            port="1443" />
        <StorageService
            name="Storage"
            class="org.rsna.ctp.stdstages.FileStorageService"
            root="D:/storage"
            returnStoredFile="no"
            quarantine="quarantines/StorageServiceQuarantine" />
        <ExportService
            name="PACS Export"
            class="org.rsna.ctp.stdstages.DicomExportService"
            root="roots/pacs-export"
            url="dicom://DestinationAET:ThisAET@ipaddress:port" />
    </Pipeline>
</Configuration>
Note that in the example above, non-DICOM objects are stored in the StorageService, but they are not exported by the DicomExportService. Each pipeline stage is responsible for testing the class of the object which it receives and processing (or ignoring) the object accordingly.
Multiple Pipeline elements may be included, but each must have its own ImportService element, and their ports must not conflict.
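Port conflicts are easy to introduce when pipelines are copied and edited by hand. The sketch below reports any port value used more than once in a configuration. It is not part of CTP; port_conflicts is a hypothetical helper, and it makes the assumption that every element carrying a port attribute (the Server element and the listening ImportServices) opens that port on the same host.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def port_conflicts(config_xml):
    """Return {port: [names of elements using it]} for ports used more than once.

    Hypothetical helper. Assumption: any element with a port attribute
    listens on that port.
    """
    root = ET.fromstring(config_xml)
    users = defaultdict(list)
    for element in root.iter():
        port = element.get("port")
        if port is not None:
            # Fall back to the element name when no name attribute is present.
            users[port].append(element.get("name") or element.tag)
    return {port: names for port, names in users.items() if len(names) > 1}
```

An empty result means no two elements in the file share a port value.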
Each pipeline stage class has a constructor that is called with its configuration element, making it possible for special processor implementations to be passed additional parameters from the configuration. See Implementing a Pipeline Stage for details.
2.6 Standard Stages
The application includes several built-in, standard stages which allow most trials to be operated without writing any software. The sections below show all the configuration attributes recognized by the standard stages.
Attributes which specify directories can contain either absolute paths (e.g., D:/TrialStorage) or relative paths (e.g., quarantines/http-import-quarantine). Relative paths are relative to the directory in which the CTP application is located.
All standard stages have a root attribute which defines a directory for use by the stage for internal storage. All root directories must be unique, i.e., not shared with other stages.
All standard stages have an optional id attribute which, if present, must uniquely identify the stage across all pipelines. This attribute is required only when the stage must be accessed by another stage, for example, when a DatabaseExportService must interrogate a FileStorageService to determine the URL under which an object is stored.
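The uniqueness requirements for root and id can be checked mechanically. The following is a sketch under stated assumptions: duplicates is a hypothetical helper, not part of CTP, and root values are compared as normalized strings only, so an absolute path and a relative path that resolve to the same directory are not detected as duplicates.

```python
import posixpath
import xml.etree.ElementTree as ET
from collections import defaultdict


def duplicates(config_xml, attribute, normalize=lambda value: value):
    """Return {value: [stage names]} for attribute values shared by several stages.

    Hypothetical helper. Stages are taken to be the child elements of each
    Pipeline element; normalize is applied to each value before comparison.
    """
    root = ET.fromstring(config_xml)
    seen = defaultdict(list)
    for pipeline in root.findall("Pipeline"):
        for stage in pipeline:
            value = stage.get(attribute)
            if value is not None:
                seen[normalize(value)].append(stage.get("name"))
    return {value: names for value, names in seen.items() if len(names) > 1}
```

For example, duplicates(config_xml, "root", normalize=posixpath.normpath) treats roots/a and roots/./a as the same directory, while duplicates(config_xml, "id") compares the id values exactly.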
Some standard stages have attributes which determine which object types are to be accepted by the stage. These attributes are:
- acceptDicomObjects
- acceptXmlObjects
- acceptZipObjects
- acceptFileObjects
The allowed values are "yes" and "no". The default value for all these attributes is "yes". Stages which are restricted to specific object types (e.g. DicomImportService, DicomAnonymizer, XmlAnonymizer, DicomFilter, DicomExportService) ignore the values of these attributes.
If a standard ImportService receives an object which it is not configured to accept, it quarantines the object, or if no quarantine has been defined for the stage, it discards the object.
If a Processor, StorageService, or ExportService receives an object that it is not configured to accept, it either ignores the object or passes it unmodified to the next pipeline stage. Thus, if an anonymizer which is not configured to anonymize XmlObjects receives an XmlObject, it passes the object on without anonymization.
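The acceptance rules for a standard ImportService can be summarized as a small decision function. This sketch is illustrative only: the function names are hypothetical, stage attributes are represented as a dictionary, and it does not model the stages that are restricted to specific object types and ignore these attributes.

```python
# Mapping from the four basic object types to their accept attributes.
ACCEPT_ATTRIBUTES = {
    "DicomObject": "acceptDicomObjects",
    "XmlObject": "acceptXmlObjects",
    "ZipObject": "acceptZipObjects",
    "FileObject": "acceptFileObjects",
}


def accepts(stage_attributes, object_type):
    """True if the stage is configured to accept object_type (default is "yes")."""
    return stage_attributes.get(ACCEPT_ATTRIBUTES[object_type], "yes") == "yes"


def import_service_action(stage_attributes, object_type):
    """Return "enqueue", "quarantine", or "discard" for a standard ImportService."""
    if accepts(stage_attributes, object_type):
        return "enqueue"
    # A rejected object is quarantined only if a quarantine is configured.
    return "quarantine" if "quarantine" in stage_attributes else "discard"
```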
2.6.1 Import Services
2.6.1.1 HttpImportService
The HttpImportService listens on a defined port for connections from HTTP clients and receives files transmitted using the HTTP protocol with Content-Type equal to application/x-mirc. The configuration element for the HttpImportService is:
<ImportService name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.HttpImportService" root="root-directory" port="7777" ssl="yes" zip="no" requireAuthentication="no" acceptDicomObjects="yes" acceptXmlObjects="yes" acceptZipObjects="yes" acceptFileObjects="yes" quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- root is a directory for use by the ImportService for internal storage and queuing.
- port is the port on which the ImportService listens for connections.
- ssl determines whether the port uses secure sockets layer (yes) or unencrypted http (no). The default is no.
- zip determines whether files will be unzipped after reception (yes) or not (no). The default is no. This feature is intended for use with zipped transmissions from the HttpExportService.
- requireAuthentication determines whether transmissions are required to have headers which identify a user who has the import privilege. The allowed values are yes and no. The default is no.
- acceptDicomObjects determines whether DicomObjects are to be enqueued when received.
- acceptXmlObjects determines whether XmlObjects are to be enqueued when received.
- acceptZipObjects determines whether ZipObjects are to be enqueued when received.
- acceptFileObjects determines whether FileObjects are to be enqueued when received.
- quarantine is a directory in which the ImportService is to quarantine objects that it receives but is configured not to accept.
2.6.1.2 PollingHttpImportService
The PollingHttpImportService obtains files by initiating HTTP connections to an external system. This ImportService is designed to work in conjunction with the PolledHttpExportService to allow penetration of a firewall without having to open an inbound port, as described in Security Issues. The configuration element for the PollingHttpImportService is:
<ImportService name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.PollingHttpImportService" root="root-directory" url="http://ip:port" acceptDicomObjects="yes" acceptXmlObjects="yes" acceptZipObjects="yes" acceptFileObjects="yes" quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- root is a directory for use by the ImportService for internal storage.
- url is the URL of the PolledHttpExportService.
- acceptDicomObjects determines whether DicomObjects are to be enqueued when received.
- acceptXmlObjects determines whether XmlObjects are to be enqueued when received.
- acceptZipObjects determines whether ZipObjects are to be enqueued when received.
- acceptFileObjects determines whether FileObjects are to be enqueued when received.
- quarantine is a directory in which the ImportService is to quarantine objects that it receives but is configured not to accept.
Notes:
- The protocol part of the url must be http, although the actual protocol used by the PollingHttpImportService is not HTTP.
- The PollingHttpImportService does not support SSL.
- The PollingHttpImportService does not support a proxy server for its connections.
2.6.1.3 DicomImportService
The DicomImportService listens on a defined port for connections from DICOM Storage SCUs and receives files transmitted using the DICOM protocol. The DicomImportService accepts all Application Entity Titles. The configuration element for the DicomImportService is:
<ImportService name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.DicomImportService" root="root-directory" port="port number" calledAETTag="00097770" callingAETTag="00097772" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- root is a directory for use by the ImportService for internal storage and queuing.
- port is the port on which the ImportService listens for connections.
- calledAETTag is an optional DICOM element, specified in hex with no comma separating the group and element numbers, into which to store the AE Title which was used by the sender to specify the receiver in the association. If the attribute is missing or zero, or if the value does not parse as a hex integer, the DicomImportService does not store the AE Title of the receiver in the received DICOM object.
- callingAETTag is an optional DICOM element, specified in hex with no comma separating the group and element numbers, into which to store the AE Title which was used by the sender to identify itself in the association. If the attribute is missing or zero, or if the value does not parse as a hex integer, the DicomImportService does not store the AE Title of the sender in the received DICOM object.
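The rules for calledAETTag and callingAETTag (eight hex digits, no comma, with missing, zero, or unparsable values disabling the feature) can be sketched as follows. The function parse_tag is hypothetical and is not CTP's parser; edge cases such as surrounding whitespace or a 0x prefix may be handled differently by the real implementation.

```python
def parse_tag(value):
    """Return (group, element) for a hex tag string, or None if storage is disabled.

    Hypothetical helper. None is returned when the attribute is missing,
    zero, or does not parse as a hex integer.
    """
    if value is None:
        return None
    try:
        tag = int(value.strip(), 16)
    except ValueError:
        return None
    if tag <= 0:
        return None
    # The high 16 bits are the group number, the low 16 bits the element number.
    return (tag >> 16) & 0xFFFF, tag & 0xFFFF
```

With the values shown in the configuration element above, 00097770 corresponds to group 0009, element 7770, and 00097772 to group 0009, element 7772.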
2.6.1.4 DirectoryImportService
The DirectoryImportService watches a directory and imports any files it finds in it. After the files are passed down the pipeline, they are moved to the quarantine directory, if configured, or deleted from the import directory if no quarantine is available. The purpose of this ImportService is to allow manual input of objects to the pipeline. The configuration element for the DirectoryImportService is:
<ImportService name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.DirectoryImportService" root="root-directory" minAge="5000" fsName="..." fsNameTag="" acceptDicomObjects="yes" acceptXmlObjects="yes" acceptZipObjects="yes" acceptFileObjects="yes" quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- root is a directory monitored by the ImportService for files to import.
- minAge is the minimum age (in milliseconds) for files which are imported from the root directory. The default value is 5000; the minimum accepted value is 1000. The purpose of this attribute is to ensure that files are completely stored in the root directory before being imported.
- fsName is the name of the FileSystem to be used by a FileStorageService that receives this object.
- fsNameTag is the name of a DICOM element, specified in hex with no comma separating the group and element numbers, in which to store the value of the fsName attribute.
- acceptDicomObjects determines whether DicomObjects are to be accepted.
- acceptXmlObjects determines whether XmlObjects are to be accepted.
- acceptZipObjects determines whether ZipObjects are to be accepted.
- acceptFileObjects determines whether FileObjects are to be accepted.
- quarantine is a directory in which the ImportService is to quarantine objects that it has handled (whether they were accepted for importing or not).
Note: For DicomObjects, if both the fsName and fsNameTag attributes are specified, the stage places the value of the fsName attribute in the element identified by fsNameTag. This allows a DirectoryImportService to mimic the behavior of a DicomImportService which stores the Called AE Title or Calling AE Title in an element for later use by a FileStorageService to assign the object to a FileSystem.
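The purpose of minAge can be illustrated with a short sketch that selects only files old enough to be considered completely written. The helper names are hypothetical and the code is not CTP's implementation. Two assumptions are made that the text does not confirm: a value below the minimum is raised to 1000, and an unparsable or missing value falls back to the default of 5000.

```python
import os
import time

DEFAULT_MIN_AGE_MS = 5000
MINIMUM_MIN_AGE_MS = 1000


def effective_min_age(value):
    """Return the minAge to use, in milliseconds.

    Assumptions: missing or unparsable values use the default, and values
    below the minimum are raised to the minimum.
    """
    try:
        age = int(value)
    except (TypeError, ValueError):
        return DEFAULT_MIN_AGE_MS
    return max(age, MINIMUM_MIN_AGE_MS)


def is_old_enough(modified_s, min_age_ms, now_s):
    """True if a file last modified at modified_s (seconds) is at least min_age_ms old."""
    return (now_s - modified_s) * 1000 >= min_age_ms


def ready_files(directory, min_age_ms):
    """Return the paths of regular files in directory that are old enough to import."""
    now_s = time.time()
    return [
        entry.path
        for entry in os.scandir(directory)
        if entry.is_file() and is_old_enough(entry.stat().st_mtime, min_age_ms, now_s)
    ]
```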
2.6.2 Processors
2.6.2.1 ObjectLogger
The ObjectLogger is a processor stage that logs the passage of objects as they flow past the stage. Objects are passed on unmodified. The log entries are made in the system log and can be viewed in the log viewer servlet on the main admin server page. This stage is intended for initial configuration testing for a new trial; it creates one line in the system log for each object it receives, and using it in a large production system would produce an unwieldy log file. The configuration element for the ObjectLogger is:
<Processor name="stage name" class="org.rsna.ctp.stdstages.ObjectLogger" verbose="yes" />
where:
- name is any text to be used as a label on configuration and status pages.
- verbose increases the amount of information logged about each object.
2.6.2.2 DicomFilter
The DicomFilter is a processor stage that interrogates a DicomObject to determine whether it meets criteria specified in a script file. If a script file is either not configured or absent, objects are passed on unmodified. If a DicomObject meets the specified criteria, it is passed on unmodified. If it does not meet the criteria, it is quarantined. If no quarantine is specified, the object is deleted. Objects of any other type (XmlObject, ZipObject, FileObject) are passed on unmodified. The configuration element for the DicomFilter is:
<Processor name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.DicomFilter" root="root-directory" script="scripts/dicom-filter.script" quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- root is a directory for use by the DicomFilter for temporary storage.
- script specifies the path to the script for the DicomFilter.
- quarantine is a directory in which the DicomFilter is to quarantine objects that do not meet the criteria specified in the script file.
For information on specifying acceptance criteria in the script, see The CTP DICOM Filter.
2.6.2.3 XmlFilter
The XmlFilter is a processor stage that interrogates an XmlObject to determine whether it meets criteria specified in a script file. If a script file is either not configured or absent, objects are passed on unmodified. If an XmlObject meets the specified criteria, it is passed on unmodified. If it does not meet the criteria, it is quarantined. If no quarantine is specified, the object is deleted. Objects of any other type (DicomObject, ZipObject, FileObject) are passed on unmodified. The configuration element for the XmlFilter is:
<Processor name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.XmlFilter" root="root-directory" script="scripts/xml-filter.script" quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- root is a directory for use by the XmlFilter for temporary storage.
- script specifies the path to the script for the XmlFilter.
- quarantine is a directory in which the XmlFilter is to quarantine objects that do not meet the criteria specified in the script file.
For information on specifying acceptance criteria in the script, see The CTP XML and Zip Filters.
2.6.2.4 ZipFilter
The ZipFilter is a processor stage that interrogates the manifest of a ZipObject to determine whether it meets criteria specified in a script file. If a script file is either not configured or absent, objects are passed on unmodified. If a ZipObject meets the specified criteria, it is passed on unmodified. If it does not meet the criteria, it is quarantined. If no quarantine is specified, the object is deleted. Objects of any other type (DicomObject, XmlObject, FileObject) are passed on unmodified. The configuration element for the ZipFilter is:
<Processor name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.ZipFilter" root="root-directory" script="scripts/zip-filter.script" quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- root is a directory for use by the ZipFilter for temporary storage.
- script specifies the path to the script for the ZipFilter.
- quarantine is a directory in which the ZipFilter is to quarantine objects that do not meet the criteria specified in the script file.
For information on specifying acceptance criteria in the script, see The CTP XML and Zip Filters.
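The three filter stages share the same decision logic, which can be summarized in one function. This is a sketch for illustration only: filter_outcome and its parameters are hypothetical, and the evaluation of the script itself is represented by a boolean.

```python
def filter_outcome(filter_type, object_type, script_available, meets_criteria, has_quarantine):
    """Return "pass", "quarantine", or "delete" for an object arriving at a filter stage.

    Hypothetical summary of the documented behavior. filter_type is the object
    type the filter examines ("DicomObject", "XmlObject", or "ZipObject").
    """
    if object_type != filter_type:
        return "pass"  # other object types are passed on unmodified
    if not script_available:
        return "pass"  # no script configured, or the script file is absent
    if meets_criteria:
        return "pass"
    return "quarantine" if has_quarantine else "delete"
```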
2.6.2.5 IDMap
The IDMap is a processor stage that constructs map tables for UID elements, AccessionNumber elements, and PatientID elements. The map tables contain the original values and the replacement values created by the first downstream anonymizer. These tables can be accessed by administrators using the IDMap servlet. The configuration element for the IDMap is:
<Processor name="stage name" class="org.rsna.ctp.stdstages.IDMap" root="root-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- root is a directory for use by the IDMap for permanent storage of the map tables.
2.6.2.6 ObjectTracker
The ObjectTracker is a processor stage that tracks objects by date, PatientID, StudyInstanceUID, SeriesInstanceUID, and SOPInstanceUID. The values tracked are the ones in the objects at the time they arrive at the stage, so if the stage is placed before anonymization, the tracked values contain PHI. The tracking tables can be accessed by administrators using the IDTracker servlet. The configuration element for the ObjectTracker is:
<Processor name="stage name" class="org.rsna.ctp.stdstages.ObjectTracker" root="root-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- root is a directory for use by the ObjectTracker for permanent storage of its tracking tables.
2.6.2.7 DicomAnonymizer
The DicomAnonymizer is a processor stage that anonymizes DicomObjects and passes all other object types unmodified. The DicomAnonymizer is configured with a script file. If a script file is either not configured or absent, objects are passed unmodified. The configuration element for the DicomAnonymizer is:
<Anonymizer name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.DicomAnonymizer" root="root-directory" lookupTable="scripts/lookup-table.properties" script="scripts/dicom-anonymizer.properties" quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- root is a directory for use by the DicomAnonymizer for temporary storage.
- lookupTable specifies the path to the lookup table used by the DicomAnonymizer.
- script specifies the path to the script for the DicomAnonymizer.
- quarantine is a directory in which the DicomAnonymizer is to quarantine objects that generate quarantine calls during processing.
Notes:
- If the lookupTable attribute is missing, the lookup anonymizer function is disabled.
2.6.2.8 XmlAnonymizer
The XmlAnonymizer is a processor stage that anonymizes XmlObjects and passes all other object types unmodified. The XmlAnonymizer is configured with a script file. If a script file is either not configured or absent, objects are passed unmodified. The configuration element for the XmlAnonymizer is:
<Anonymizer name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.XmlAnonymizer" root="root-directory" script="scripts/xml-anonymizer.script" quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- root is a directory for use by the XmlAnonymizer for temporary storage.
- script specifies the path to the script for the XmlAnonymizer.
- quarantine is a directory in which the XmlAnonymizer is to quarantine objects that generate quarantine calls during processing.
2.6.2.9 ZipAnonymizer
The ZipAnonymizer is a processor stage that anonymizes ZipObjects and passes all other object types unmodified. The ZipAnonymizer is configured with a script file. If a script file is either not configured or absent, objects are passed unmodified. The configuration element for the ZipAnonymizer is:
<Anonymizer name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.ZipAnonymizer" root="root-directory" script="scripts/zip-anonymizer.script" quarantine="quarantine-directory" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- root is a directory for use by the ZipAnonymizer for temporary storage.
- script specifies the path to the script for the ZipAnonymizer.
- quarantine is a directory in which the ZipAnonymizer is to quarantine objects that generate quarantine calls during processing.
2.6.3 Storage Services
2.6.3.1 FileStorageService
The FileStorageService stores objects in a file system. It automatically defines subdirectories beneath its root directory and populates them accordingly. The configuration element for the FileStorageService is:
<StorageService name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.FileStorageService" root="D:/storage" type="month" timeDepth="0" acceptDuplicateUIDs="yes" acceptDicomObjects="yes" acceptXmlObjects="yes" acceptZipObjects="yes" acceptFileObjects="yes" returnStoredFile="yes" setWorldReadable="no" setWorldWritable="no" fsNameTag="00097770" autoCreateUser="no" port="85" ssl="no" requireAuthentication="no" quarantine="quarantine-directory" > <jpeg wmax="10000" wmin="96" q="-1" /> </StorageService>
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- root is the root directory of the storage tree.
- type determines the structure of the storage tree. The allowed values are:
- year: root/FSNAME/year/studyName, e.g. root/FSNAME/2008/123.456.789
- month: root/FSNAME/year/month/studyName, e.g. root/FSNAME/2008/06/123.456.789
- week: root/FSNAME/year/week/studyName, e.g. root/FSNAME/2008/36/123.456.789
- day: root/FSNAME/year/day/studyName, e.g. root/FSNAME/2008/341/123.456.789
- none: root/FSNAME/studyName, e.g. root/FSNAME/123.456.789
(FSNAME is the name of the file system to which the study belongs. See the fsNameTag attribute below for additional information.)
- timeDepth specifies the length of time in days that studies are stored. The default value is 0, which means forever. If timeDepth is greater than zero, studies older than that value are automatically removed from storage.
- acceptDuplicateUIDs determines whether objects with duplicate UIDs are to be stored under separate names or if a newer duplicate object is to overwrite an older one. Values are "yes" and "no". The default is "yes".
- acceptDicomObjects determines whether DicomObjects are to be stored. Values are "yes" and "no". The default is "yes".
- acceptXmlObjects determines whether XmlObjects are to be stored. Values are "yes" and "no". The default is "yes".
- acceptZipObjects determines whether ZipObjects are to be stored. Values are "yes" and "no". The default is "yes".
- acceptFileObjects determines whether FileObjects are to be stored. Values are "yes" and "no". The default is "yes".
- returnStoredFile specifies whether the original object or a new object pointing to the file in the storage system is to be returned for processing by subsequent stages. Values are "yes" and "no". The default is "yes".
- setWorldReadable determines whether the FileStorageService makes all files and directories readable by all users. Values are "yes" and "no". The default is "no". This attribute should only be used if users or other programs are to be allowed to access files without using the FileStorageService web server. This feature is not recommended.
- setWorldWritable determines whether the FileStorageService makes all files and directories writable by all users. Values are "yes" and "no". The default is "no". This attribute should only be used if users or other programs are to be allowed to write files into the FileStorageService without using the FileStorageService web server. This feature is not recommended.
- fsNameTag is an optional DICOM element, specified in hex with no comma separating the group and element numbers, which is used to specify the name of the root directory's child under which to store a received object. If the attribute is missing from the configuration or if the specified element is missing from the received object, or if the contents of the specified element are blank, the object is stored under the "__default" tree.
- autoCreateUser determines whether the StorageService is to create a user for each new value of the element specified by the fsNameTag. The default is "no". The user is created with both the username and the password set to the value of the element specified by the fsNameTag.
- port specifies the port on which a web server is to be started to provide access to the stored studies. If the attribute is missing, no web server is started for the FileStorageService.
- ssl determines whether the web server uses SSL. Values are "yes" and "no". The default is "no".
- requireAuthentication determines whether users are forced to log in to the web server. Values are "yes" and "no". The default is "no".
- quarantine is a directory in which the StorageService is to quarantine objects that cannot be stored.
For more information on the embedded web server in the FileStorageService, see The CTP FileStorageService Web Server and The CTP FileStorageService Access Mechanism.
Notes:
- Files are stored in a tree of directories with the root of the tree as defined in the root attribute of the configuration element.
- Below the root are directories called FileSystems. A FileSystem may be defined by the value of an element in the object being stored by using the fsNameTag attribute. If no FileSystem is defined by the object, it is stored in the default FileSystem (__default).
- Within a FileSystem, studies are grouped depending on the value of the type attribute. For example, if the type attribute has the value "month", studies are organized by year and month, e.g. "2007/09".
- At the bottom of the hierarchy are directories organized by StudyInstanceUID, thus grouping all objects received for a specific study together in one directory.
- Objects are stored with standard file extensions:
- .dcm for DicomObjects
- .xml for XmlObjects
- .zip for ZipObjects
- .md for FileObjects
- Any object not containing a StudyInstanceUID or StudyUID is stored in the bullpen directory.
- The fsNameTag attribute can be used to create storage trees based on element values like PatientID. It can also access elements in private groups, as might be done if the calledAETTag attribute of the DicomImportService is used to pass destination information. Similarly, the callingAETTag attribute of the DicomImportService could be used to pass source information, allowing separation of objects into FileSystems based on the sending system.
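The directory layout described in the notes above can be sketched in Python. This is an illustration only, not CTP's code: the function name storage_dir, the zero-padded month, the use of ISO week numbers, and the three-digit day of the year are all assumptions made for the example. The bullpen directory is not modelled because its location is not specified here.

```python
from datetime import date
from pathlib import PurePosixPath


def storage_dir(root, study_uid, when, tree_type="month", fs_name=None):
    """Return the directory in which a study would be stored (hypothetical helper)."""
    # A missing or blank FileSystem name falls back to the "__default" tree.
    fs = fs_name.strip() if fs_name and fs_name.strip() else "__default"
    year = f"{when.year:04d}"
    if tree_type == "year":
        parts = [year]
    elif tree_type == "month":
        parts = [year, f"{when.month:02d}"]
    elif tree_type == "week":
        # Assumption: ISO week number; CTP may number weeks differently.
        parts = [year, f"{when.isocalendar()[1]:02d}"]
    elif tree_type == "day":
        # Assumption: day of the year, zero-padded to three digits.
        parts = [year, f"{when.timetuple().tm_yday:03d}"]
    elif tree_type == "none":
        parts = []
    else:
        raise ValueError(f"unknown type: {tree_type!r}")
    return PurePosixPath(root, fs, *parts, study_uid)


if __name__ == "__main__":
    print(storage_dir("root", "123.456.789", date(2008, 6, 15)))
    print(storage_dir("root", "123.456.789", date(2008, 6, 15), "none", "SiteA"))
```

Passing the value of the element named by fsNameTag as fs_name reproduces the per-FileSystem separation described above.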
Notes on the jpeg child element:
- The optional jpeg child element causes the FileStorageService to create a JPEG image for each DICOM image when it is stored.
- The jpeg element should be included only when pre-computed JPEG images are required by an external application which directly references the disk drive containing the stored files (for example, the NCIA application).
- When images are accessed through the FileStorageService web server's Storage Servlet or Ajax Servlet, they are produced dynamically, and jpeg elements are not required.
- Multiple jpeg elements may appear if multiple images must be created (with different parameters) for each stored DICOM image.
- The attributes specify the width and quality parameters of the created image:
- If the width of the parent DICOM image lies between the values of wmax and wmin, the width of the created image will be equal to the width of the parent.
- wmax specifies the maximum width of the created image. The default is 10000.
- wmin specifies the minimum width of the created image. The default is 96.
- The height of the created image is automatically computed to provide the same aspect ratio as the parent.
- q specifies the compression quality for the created image. Allowed values are 1 through 100, with larger values producing better quality (and larger file sizes). The system default is triggered by specifying -1, which generally produces a good image.
- A JPEG image file created in response to a jpeg child element is stored in the same directory as the parent DICOM image.
- The filename of a created image is constructed from:
- the name of the parent file,
- a suffix in square brackets identifying the parameters used to create it,
- and a .jpeg extension.
- The parameters appear in the order: wmax, wmin, q, and are separated by semicolons.
- For example, a JPEG image created from FO-29126.dcm might have the name FO-29126.dcm[400;96;-1].jpeg.
- If a 96-pixel wide thumbnail is required for an external application, the following jpeg child element could be specified:
<jpeg wmax="96" wmin="96" q="-1" />
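The width and filename rules for the jpeg child element can be sketched in Python as follows. The function names are hypothetical, and rounding the computed height to the nearest pixel is an assumption; the text above states only that the aspect ratio is preserved.

```python
def jpeg_width(parent_width, wmax=10000, wmin=96):
    """Width of the created image: the parent's width, limited to [wmin, wmax]."""
    return max(wmin, min(parent_width, wmax))


def jpeg_height(parent_width, parent_height, width):
    """Height that keeps the parent's aspect ratio (rounding is an assumption)."""
    return round(parent_height * width / parent_width)


def jpeg_name(parent_name, wmax=10000, wmin=96, q=-1):
    """Parent filename, then the parameters in square brackets, then .jpeg."""
    return f"{parent_name}[{wmax};{wmin};{q}].jpeg"


if __name__ == "__main__":
    width = jpeg_width(512, wmax=400, wmin=96)
    print(width, jpeg_height(512, 256, width))
    print(jpeg_name("FO-29126.dcm", wmax=400, wmin=96, q=-1))
```

With wmax and wmin both set to 96, as in the thumbnail example, jpeg_width returns 96 for any parent width.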
2.6.3.2 BasicFileStorageService
The BasicFileStorageService stores objects in a file system. It provides no organization of the files and no means of access to them. It is intended for use in situations where direct file access is provided through an external application like NCIA. The configuration element for the BasicFileStorageService is:
<StorageService name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.BasicFileStorageService" index="D:/storage" root="D:/storage/root" nLevels="3" maxSize="200" acceptDicomObjects="yes" acceptXmlObjects="yes" acceptZipObjects="yes" returnStoredFile="yes" quarantine="quarantine-directory" > <jpeg wmax="10000" wmin="96" q="-1" /> </StorageService>
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- index is the directory in which the index is stored.
- root is the root directory of the storage tree.
- nLevels defines the depth of the storage tree. The default is 3.
- maxSize defines the maximum number of files or directories in each node of the storage tree. The default is 200.
- acceptDicomObjects determines whether DicomObjects are to be stored. Values are "yes" and "no". The default is "yes".
- acceptXmlObjects determines whether XmlObjects are to be stored. Values are "yes" and "no". The default is "yes".
- acceptZipObjects determines whether ZipObjects are to be stored. Values are "yes" and "no". The default is "yes".
- returnStoredFile specifies whether the original object or a new object pointing to the file in the storage system is to be returned for processing by subsequent stages. Values are "yes" and "no". The default is "yes".
- quarantine is a directory in which the StorageService is to quarantine objects that cannot be stored.
Notes:
- Files are stored in a tree of directories with the root of the tree as defined in the root attribute of the configuration element.
- The index directory must not appear under the root directory. A convenient approach is shown in the example above, where the root directory appears under the index directory.
- The number of files which can be stored under any top-level directory in the root is maxSize**(nLevels-1). For the default values of the nLevels and maxSize parameters (3 and 200), this is 40,000.
- Directories are created at the top level (root) when existing directories are full. There is no maximum number of top-level directories; however, it is wise to consider the number of objects which are expected to be stored and to select values of nLevels and maxSize which would keep the number of top-level directories below 1000.
- For storage requirements of 10 million files, the default values of nLevels and maxSize are fine.
- For storage requirements of 10 billion files, nLevels="4" and maxSize="300" would work well.
- Files are stored as leaves at the bottom of the tree.
- No organization into related groups (e.g. by StudyInstanceUID) is provided.
- Files are indexed by UID (e.g., SOPInstanceUID).
- A duplicate object (i.e., one whose UID matches the UID of an object already stored) overwrites the stored object in the same place in the storage system (i.e., the same directory and the same filename), and any required jpeg images are recreated, overwriting the previously stored ones.
- FileObjects, which do not contain UIDs, are not stored; they are simply passed to the next stage.
- Files are stored with standard file extensions:
- .dcm for DicomObjects
- .xml for XmlObjects
- .zip for ZipObjects
- See the section on the FileStorageService for a description of the jpeg child element.
- If jpeg child elements appear, the files which they create are not counted against the maxSize parameter. Thus, if maxSize is 200 and two jpeg child elements appear, the bottom directories in the tree could contain 600 files. In situations where a given choice of maxSize could result in more than 1000 files in one directory, it is advisable to reduce maxSize and increase nLevels.
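The sizing arithmetic in the notes above can be checked with a short Python sketch. The function names are hypothetical; the formulas are the ones given in the notes.

```python
def files_per_top_level_dir(n_levels=3, max_size=200):
    """Files that fit under one top-level directory: maxSize**(nLevels-1)."""
    return max_size ** (n_levels - 1)


def top_level_dirs_needed(total_files, n_levels=3, max_size=200):
    """Top-level directories needed for total_files, rounded up."""
    capacity = files_per_top_level_dir(n_levels, max_size)
    return -(-total_files // capacity)  # integer ceiling division


def max_files_in_leaf_dir(max_size=200, jpeg_elements=0):
    """Each stored object may add one JPEG per jpeg child element."""
    return max_size * (1 + jpeg_elements)


if __name__ == "__main__":
    print(files_per_top_level_dir())                        # 40000
    print(top_level_dirs_needed(10_000_000))                # 250
    print(top_level_dirs_needed(10_000_000_000, 4, 300))    # 371
    print(max_files_in_leaf_dir(200, jpeg_elements=2))      # 600
```

Both recommended settings keep the number of top-level directories well below 1000.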
2.6.4 Export Services
2.6.4.1 HttpExportService
The HttpExportService queues objects and transmits them via HTTP with Content-Type application/x-mirc. The configuration element for the HttpExportService is:
<ExportService name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.HttpExportService" root="root-directory" url="http://ipaddress:port/path" zip="no" username="username" password="password" acceptDicomObjects="yes" acceptXmlObjects="yes" acceptZipObjects="yes" acceptFileObjects="yes" dicomScript="scripts/df.script" xmlScript="scripts/xf.script" zipScript="scripts/zf.script" interval="5000" proxyIPAddress="..." proxyPort="..." proxyUsername="..." proxyPassword="..." />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- root is a directory for use by the ExportService for internal storage and queuing.
- url specifies the destination system's URL.
- zip determines whether files will be zipped before transmission (yes) or not (no). The default is no. This feature is intended for use with the HttpImportService.
- username specifies the username credential for inclusion in the header during the transmission. This allows an HttpImportService to authenticate transmissions. If the username attribute is not present or has a whitespace value, no header is generated.
- password specifies the password credential for inclusion in the header during the transmission.
- acceptDicomObjects determines whether DicomObjects are to be exported.
- acceptXmlObjects determines whether XmlObjects are to be exported.
- acceptZipObjects determines whether ZipObjects are to be exported.
- acceptFileObjects determines whether FileObjects are to be exported.
- dicomScript specifies the path to a script which examines the contents of a DicomObject and determines whether the object is to be exported.
- xmlScript specifies the path to a script which examines the contents of an XmlObject and determines whether the object is to be exported.
- zipScript specifies the path to a script which examines the contents of a ZipObject and determines whether the object is to be exported.
- interval is the sleep time (in milliseconds) between polls of the export queue.
- proxyIPAddress is the IP address of the network proxy server, if present.
- proxyPort is the port of the network proxy server, if present.
- proxyUsername is the username of the network proxy server, if one is required.
- proxyPassword is the password of the network proxy server, if one is required.
Notes:
- The default interval is 5 seconds. The minimum allowed value is one second. The maximum allowed value is 10 seconds.
- The protocol part of the url can be http or https, the latter causing connections to be initiated using secure sockets layer.
- If a proxy server is not in use, the proxy attributes must be omitted.
- For an object to be accepted for export, the object type must be accepted (e.g., acceptDicomObjects="yes") and the object must pass the script test. If the script attribute is not supplied, the test returns true by default and the object is accepted. See The CTP DICOM Filter and The CTP XML and Zip Filters for information about the script languages.
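The acceptance rule in the last note can be summarized in a few lines of Python. This is a sketch of the logic only: the function name, the string labels for object types, and the representation of a filter script as a callable are assumptions made for the example and do not reflect how CTP evaluates its scripts.

```python
def accept_for_export(object_type, accepted_types, script_test=None):
    """Return True when an object should be queued for export (hypothetical helper)."""
    # The object type must be enabled, e.g. acceptDicomObjects="yes".
    if object_type not in accepted_types:
        return False
    # With no script configured, the test returns true by default.
    if script_test is None:
        return True
    # Otherwise the script's verdict decides.
    return bool(script_test())


if __name__ == "__main__":
    enabled = {"dicom", "xml"}
    print(accept_for_export("dicom", enabled))                  # True
    print(accept_for_export("zip", enabled))                    # False
    print(accept_for_export("dicom", enabled, lambda: False))   # False
```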
2.6.4.2 PolledHttpExportService
The PolledHttpExportService queues objects and transmits them in the HTTP response stream of a received connection. Files are transmitted with Content-Type equal to application/x-mirc. This ExportService is designed to work in conjunction with the PollingHttpImportService to allow penetration of a firewall without having to open an inbound port, as described in Security Issues. The configuration element for the PolledHttpExportService is:
<ExportService name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.PolledHttpExportService" root="root-directory" port="listening-port" acceptDicomObjects="yes" acceptXmlObjects="yes" acceptZipObjects="yes" acceptFileObjects="yes" dicomScript="scripts/df.script" xmlScript="scripts/xf.script" zipScript="scripts/zf.script" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- root is a directory for use by the ExportService for internal storage and queuing.
- port is the port on which the ExportService listens for connections.
- acceptDicomObjects determines whether DicomObjects are to be exported.
- acceptXmlObjects determines whether XmlObjects are to be exported.
- acceptZipObjects determines whether ZipObjects are to be exported.
- acceptFileObjects determines whether FileObjects are to be exported.
Notes:
- For an object to be accepted for export, the object type must be accepted (e.g., acceptDicomObjects="yes") and the object must pass the script test. If the script attribute is not supplied, the test returns true by default and the object is accepted. See The CTP DICOM Filter and The CTP XML and Zip Filters for information about the script languages.
- The PolledHttpExportService does not support SSL.
2.6.4.3 DicomExportService
The DicomExportService queues objects and transmits them to a DICOM Storage SCP. The configuration element for the DicomExportService is:
<ExportService name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.DicomExportService" root="root-directory" url="dicom://DestinationAET:ThisAET@ipaddress:port" dicomScript="scripts/df.script" interval="10000" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage. This attribute is required only when the stage is accessed by other stages. Its value, when supplied, must be unique across all pipelines.
- root is a directory for use by the ExportService for internal storage and queuing.
- url specifies the destination DICOM Storage SCP's URL.
- dicomScript specifies the path to a script which examines the contents of a DicomObject and determines whether the object is to be exported. For an object to be accepted for export, it must pass the script test. If the dicomScript attribute is not supplied, the test returns true by default and the object is accepted. See The CTP DICOM Filter for information about the script language.
- interval is the sleep time (in milliseconds) between polls of the export queue.
Note: The default interval is 10 seconds. The minimum allowed value is one second. The maximum allowed value is 20 seconds.
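The structure of the url attribute can be illustrated by splitting it with Python's standard urllib.parse module. This is not how CTP parses the value; the function name, the dictionary keys, and the example address are assumptions chosen to mirror the DestinationAET:ThisAET@ipaddress:port template.

```python
from urllib.parse import urlsplit


def parse_dicom_url(url):
    """Split dicom://DestinationAET:ThisAET@ipaddress:port into its parts."""
    parts = urlsplit(url)
    if parts.scheme != "dicom":
        raise ValueError(f"not a dicom URL: {url!r}")
    return {
        "destination_aet": parts.username,  # text before the colon
        "this_aet": parts.password,         # text between the colon and '@'
        "host": parts.hostname,
        "port": parts.port,
    }


if __name__ == "__main__":
    # Hypothetical AE titles and address, for illustration only.
    print(parse_dicom_url("dicom://PACS:CTP@192.168.0.10:104"))
```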
2.6.4.4 FtpExportService
The FtpExportService queues objects and transmits them to an FTP server. The configuration element for the FtpExportService is:
<ExportService name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.FtpExportService" root="root-directory" url="ftp://ipaddress:port/path" username="..." password="..." acceptDicomObjects="yes" acceptXmlObjects="yes" acceptZipObjects="yes" acceptFileObjects="yes" dicomScript="scripts/df.script" xmlScript="scripts/xf.script" zipScript="scripts/zf.script" interval="10000" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- root is a directory for use by the ExportService for internal storage and queuing.
- url specifies the destination FTP server's URL:
- ipaddress can be a numeric address or a domain name.
- port is the port on which the FTP listener listens. The default port is 21.
- path is the base directory on the FTP server in which the FtpExportService will create one directory per study, named by StudyInstanceUID.
- username specifies the username under which the FtpExportService will log in to the FTP server.
- password specifies the password to be used in the login process.
- acceptDicomObjects determines whether DicomObjects are to be exported.
- acceptXmlObjects determines whether XmlObjects are to be exported.
- acceptZipObjects determines whether ZipObjects are to be exported.
- acceptFileObjects determines whether FileObjects are to be exported.
- interval is the sleep time (in milliseconds) between polls of the export queue.
Notes:
- For an object to be accepted for export, the object type must be accepted (e.g., acceptDicomObjects="yes") and the object must pass the script test. If the script attribute is not supplied, the test returns true by default and the object is accepted. See The CTP DICOM Filter and The CTP XML and Zip Filters for information about the script languages.
- The default interval is 10 seconds. The minimum allowed value is one second. The maximum allowed value is 20 seconds.
- The FtpExportService stores files in subdirectories of the path part of the URL, organized by StudyInstanceUID (or StudyUID in the case of non-DICOM files). Files not containing a StudyUID are stored in the bullpen directory under the path directory.
- If the directory specified by the path does not exist, it is created.
- Files are stored within their directories with names that consist of the date and time (to the millisecond) when they were transferred to the server.
- If a file is transmitted multiple times, multiple copies of the file will appear within its directory, each named with the date/time of its transfer.
- No index of the studies and files is created on the server.
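The naming scheme described in the notes above can be sketched in Python. The function name and the exact timestamp layout (yyyyMMddHHmmssSSS, with no extension) are assumptions; the notes state only that names consist of the date and time to the millisecond.

```python
import posixpath
from datetime import datetime


def ftp_remote_path(base_path, study_uid, transferred_at):
    """Build the remote path for one transferred file (hypothetical helper)."""
    # Files without a StudyUID go to the bullpen directory under the path.
    directory = study_uid if study_uid else "bullpen"
    # Assumed layout: date and time down to the millisecond, no separators.
    millis = transferred_at.microsecond // 1000
    stamp = transferred_at.strftime("%Y%m%d%H%M%S") + f"{millis:03d}"
    return posixpath.join(base_path, directory, stamp)


if __name__ == "__main__":
    when = datetime(2009, 6, 9, 16, 47, 0, 123000)
    print(ftp_remote_path("/trial", "1.2.3", when))
    print(ftp_remote_path("/trial", None, when))
```

Because the name depends only on the transfer time, sending the same file twice produces two entries in the same directory, which matches the behavior described above.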
2.6.4.5 DatabaseExportService
The DatabaseExportService queues objects and submits them to a DatabaseAdapter class, which must be written specially for the database in question. The configuration element for the DatabaseExportService is:
<ExportService name="stage name" id="stage ID" class="org.rsna.ctp.stdstages.DatabaseExportService" adapterClass="org.myorg.MyDatabaseAdapter" poolSize="1" fileStorageServiceID="ID of referenced FileStorageService" root="root-directory" acceptDicomObjects="yes" acceptXmlObjects="yes" acceptZipObjects="yes" acceptFileObjects="yes" interval="10000" />
where:
- name is any text to be used as a label on configuration and status pages.
- id is any text to be used to uniquely identify the stage.
- adapterClass is the class name of the database's adapter class. See Implementing a DatabaseAdapter for more information.
- poolSize specifies the number of subordinate export threads. The default is 1. The allowed values are from 1 to 10.
- fileStorageServiceID is the ID of the FileStorageService which manages objects referenced by the data stored in the database. This attribute is required only when the database must have direct access to the files on the FileStorageService.
- root is a directory for use by the ExportService for internal storage and queuing.
- acceptDicomObjects determines whether DicomObjects are to be exported.
- acceptXmlObjects determines whether XmlObjects are to be exported.
- acceptZipObjects determines whether ZipObjects are to be exported.
- acceptFileObjects determines whether FileObjects are to be exported.
- interval is the sleep time (in milliseconds) between polls of the export queue.
Note: The default interval is 10 seconds. The minimum allowed value is one second. The maximum allowed value is 20 seconds.
3 Security Issues
In a clinical trial, transmission of data from image acquisition sites to the principal investigator site involves penetrating at least one firewall. Since the image acquisition site initiates an outbound connection, this only rarely requires special action. At the principal investigator's site, where the connection is inbound, some provision must be made to allow the connection to reach its destination. There are two basic solutions. The simplest solution is to open a port in the firewall at the principal investigator's site and route connections for that port to the computer running the CTP application. In some institutions, however, security policies prohibit this solution. The alternative is to use the PolledHttpExportService and PollingHttpImportService to allow data to flow without having to open any ports on the internal network to inbound connections.
Using this latter solution requires two computers, each running CTP. One computer is placed in the border router's DMZ with one port open to the internet, allowing connections to the HttpImportService of the program running in that computer. That program's pipeline includes a PolledHttpExportService which queues objects and waits for a connection before passing them onward. The second computer is placed on the internal network. Its program has a pipeline which starts with a PollingHttpImportService. That ImportService is configured to make outbound connections to the DMZ computer when a file is requested. This allows files to pass through the firewall on the response stream of the outbound connection without having to open any ports to the internal network.
4 Notes
4.1 Important Note for Unix/Linux Platforms
Unix and its derivatives require that applications listening on ports with numbers less than 1024 have special privileges. On such systems, it might be best to put all import services and all web servers on ports above this value. The default configuration puts the web server on port 80 because that is the default port for the web. This can be changed before the program is run for the first time by editing the example-config.xml file, or after it has been run the first time by editing the config.xml file.
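As an aid when reviewing a configuration for this issue, the following Python sketch lists every element whose port attribute is below 1024. It is an illustration only: the function name is hypothetical, and the sample document (including its Configuration root element) is made up for the example rather than taken from a real config.xml.

```python
import xml.etree.ElementTree as ET


def privileged_ports(config_xml):
    """Return (tag, name, port) for every element with a port below 1024."""
    found = []
    for element in ET.fromstring(config_xml).iter():
        port = element.get("port")
        if port is not None and port.isdigit() and int(port) < 1024:
            found.append((element.tag, element.get("name"), int(port)))
    return found


if __name__ == "__main__":
    # Hypothetical configuration fragment, for illustration only.
    sample = """
    <Configuration>
        <StorageService name="storage" port="85" />
        <ExportService name="polled export" port="9000" />
    </Configuration>
    """
    for tag, name, port in privileged_ports(sample):
        print(f"{tag} '{name}' uses privileged port {port}")
```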
4.2 ImageIO Tools for Macintosh
The Java Advanced Imaging ImageIO Tools is a component which provides methods for creating and reading image files. It supports many image file types, and it is designed to be extensible. The authors (Gunter Zeilinger, et al.) of the DICOM toolkit (dcm4che) used by CTP extended the ImageIO Tools to support DICOM.
The ImageIO Tools consists of two parts, a top-level library written in Java and runnable on any platform, and a native library which implements certain compression/decompression functions. The latter library is unique to each platform.
The top-level library consists of two files:
- jai_imageio.jar
- clibwrapper_jiio.jar
There is no Macintosh installer for the ImageIO Tools. You can obtain the two files above by getting the zip file for a Linux installation and unpacking it. Place the two files into /System/Library/Java/Extensions.
There is no native library available for the Macintosh. As a consequence, a CTP installation running on a Macintosh cannot support the viewing of images which contain encapsulated pixel data. Most clinical images do not contain such data, but some modalities produce it. If a problem appears in viewing images stored in a FileStorageService or BasicFileStorageService stage, or if a log entry appears indicating a problem in the ImageIO Tools, check the SOP Class of the image to see if it is one which has encapsulated pixel data.