CTP Configuration for NBIA

From MircWiki

Revision as of 15:50, 29 November 2011

This article describes how the CTP application can be configured to work with the National Biomedical Image Archive (NBIA). The intended audience for this article is system administrators at sites running instances of NBIA. There are several articles on CTP on the RSNA MIRC wiki. All would be useful references when reading this article.

1 A Simple Configuration File

The configuration of a CTP instance is specified by an XML file called config.xml located in the same directory as the program. The example configuration shown below assumes that anonymized data objects are transmitted to CTP by an external application using the HTTPS protocol. This configuration might be useful for testing the NCIADatabase DatabaseAdapter.

<Configuration>

   <Server port="80" />

   <Pipeline name="Main Pipeline">

        <ImportService 
            name="HTTP Import"
            class="org.rsna.ctp.stdstages.HttpImportService"
            root="roots/http-import" 
            port="7777"
            ssl="yes"
            acceptDicomObjects="yes"
            acceptXmlObjects="yes"
            acceptZipObjects="yes"
            acceptFileObjects="no"
            quarantine="quarantines/http-import" />

        <StorageService
            name="Storage"
            id="storage"
            class="org.rsna.ctp.stdstages.BasicFileStorageService"
            index="D:/storage" 
            root="D:/storage/root" 
            port="80"
            nLevels="4"
            maxSize="300"
            returnStoredFile="yes"
            quarantine="quarantines/storage" />

        <ExportService
            name="Database Export"
            class="org.rsna.ctp.stdstages.DatabaseExportService"
            adapterClass="org.rsna.ctp.stdstages.database.DatabaseAdapter"
            fileStorageServiceID="storage"
            root="roots/database-export"
            quarantine="quarantines/database-export" />

    </Pipeline>
</Configuration>

1.1 Commentary

This section draws attention to certain attributes in the configuration and explains the reasons for them.

1.1.1 Server

The configuration file above places the CTP admin web server on port 80.

1.1.2 HttpImportService

The configuration file puts the HttpImportService on port 7777 and enables secure sockets layer.

The HttpImportService is configured to accept all object types which contain UIDs. Since the BasicFileStorageService does not accept FileObjects (which do not contain UIDs), the HttpImportService is configured to reject any which are received.

An object which is received but not accepted is quarantined. If copies of such objects need not be kept, the quarantine attribute can be omitted.
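
As a sketch of that option, the ImportService element from the configuration above could be written without the quarantine attribute. All other attribute values are unchanged from the example, and objects which are not accepted would then not be kept:

        <ImportService 
            name="HTTP Import"
            class="org.rsna.ctp.stdstages.HttpImportService"
            root="roots/http-import" 
            port="7777"
            ssl="yes"
            acceptDicomObjects="yes"
            acceptXmlObjects="yes"
            acceptZipObjects="yes"
            acceptFileObjects="no" />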

1.1.3 Storage Service

The id attribute is critical in this configuration. It provides a way for the DatabaseExportService to access the FileStorageService to obtain the URL of an object being exported. The NBIA database requires the URL to provide access to the object as well as a thumbnail image of the object (if the object is an image).

The returnStoredFile attribute is not critical in this configuration. It is set to "yes" just to keep it consistent with the more complex configuration shown below. The reason it is not critical here is that the DatabaseExportService obtains the path to the stored object based on its UID, which is the same in the queued file as it is in the stored file.

The index attribute points to any location where the BasicFileStorageService can store its index.

The root attribute points to any location where the FileStorageService can build its directory structure. It is shown in the configuration above as an absolute path to some device, and the device could be on another drive or another computer from the one where the CTP application is stored. The root could also be placed under the CTP directory tree, for example, by specifying it to be roots/storage/root. In the configuration shown above, it is assumed that the files will be stored outside the CTP tree.

The index directory must not be contained within the root directory tree. As shown above, it is acceptable to have the root within the index.
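
For comparison, a StorageService kept inside the CTP directory tree might look like the sketch below. The root value roots/storage/root is the one suggested in the text. The index value roots/storage/index is a hypothetical path chosen only for illustration; it sits beside the root rather than inside it, so the index is not contained within the root directory tree.

        <StorageService
            name="Storage"
            id="storage"
            class="org.rsna.ctp.stdstages.BasicFileStorageService"
            index="roots/storage/index" 
            root="roots/storage/root" 
            port="80"
            nLevels="4"
            maxSize="300"
            returnStoredFile="yes"
            quarantine="quarantines/storage" />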

1.1.4 DatabaseExportService

The adapterClass attribute specifies the fully qualified name of the Java class which implements the NCIADatabase DatabaseAdapter.

The fileStorageServiceID attribute is critical. It tells the DatabaseExportService which BasicFileStorageService to query for the file path of the object being exported.
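
The link between the two stages can be seen by placing the relevant attributes side by side. The fragment below is abbreviated to show only that link and is not a complete configuration: the value of fileStorageServiceID in the ExportService matches the value of id in the StorageService.

        <!-- Other attributes omitted for brevity; see the full configuration above. -->
        <StorageService
            name="Storage"
            id="storage"
            class="org.rsna.ctp.stdstages.BasicFileStorageService" />

        <ExportService
            name="Database Export"
            class="org.rsna.ctp.stdstages.DatabaseExportService"
            fileStorageServiceID="storage" />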

Note that when an object is received by an ExportService for export, a copy of the object, not a reference to it, is entered into the ExportService's queue. Thus, the object in the queue is independent of the one which was received, and subsequent stages, if any, can modify the object being passed down the pipeline without affecting the object being exported. This is important when removing provenance information as explained below.

2 A Production Configuration File

The example configuration shown below assumes that the anonymized data objects received by the HttpImportService contain provenance information which must be removed from the stored objects after the information has been passed to the NBIA database.

<Configuration>

   <Server port="80" />

   <Pipeline name="Main Pipeline">

        <ImportService 
            name="HTTP Import"
            class="org.rsna.ctp.stdstages.HttpImportService"
            root="roots/http-import" 
            port="7777"
            ssl="yes"
            acceptDicomObjects="yes"
            acceptXmlObjects="yes"
            acceptZipObjects="yes"
            acceptFileObjects="no"
            quarantine="quarantines/http-import" />

        <StorageService
            name="Storage"
            id="storage"
            class="org.rsna.ctp.stdstages.BasicFileStorageService"
            index="D:/storage" 
            root="D:/storage/root" 
            port="80"
            nLevels="4"
            maxSize="300"
            returnStoredFile="yes"
            quarantine="quarantines/storage" />

        <ExportService
            name="Database Export"
            class="org.rsna.ctp.stdstages.DatabaseExportService"
            adapterClass="org.rsna.ctp.stdstages.database.DatabaseAdapter"
            fileStorageServiceID="storage"
            root="roots/database-export"
            quarantine="quarantines/database-export" />

        <Processor
            name="DICOM Provenance Remover"
            class="org.rsna.ctp.stdstages.DicomAnonymizer"
            root="roots/provenance-remover" 
            script="roots/provenance-remover/dicom-anonymizer.script"
            quarantine="quarantines/provenance-remover" />

        <Processor
            name="XML Provenance Remover"
            class="org.rsna.ctp.stdstages.XmlAnonymizer"
            root="roots/provenance-remover" 
            script="roots/provenance-remover/xml-anonymizer.script"
            quarantine="quarantines/provenance-remover" />

    </Pipeline>
</Configuration>

2.1 Commentary

The commentary in the previous section applies to this configuration as well, but it is important to note several points about the Anonymizer stages.

2.1.1 FileStorageService

It is critical that the FileStorageService's returnStoredFile attribute be set to yes in order that the object received by the Anonymizer stages points to the stored file. This allows the Anonymizer stages to anonymize the object in place in the BasicFileStorageService. If the BasicFileStorageService did not return the stored file, then an Anonymizer would anonymize the object in the HttpImportService queue, leaving the non-anonymized version in the BasicFileStorageService.

2.1.2 Anonymizer

CTP actually contains three types of anonymizers (DICOM, XML, and Zip), each configured as a separate Processor stage with its own script file. A DicomAnonymizer and an XmlAnonymizer are shown; a ZipAnonymizer could be added if ZipObjects containing provenance information are to be received.

Note that the Anonymizer appears after the DatabaseExportService. This ensures that the database gets a copy of the object containing the provenance information.
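
If ZipObjects containing provenance information are to be received, a third stage could be added after the two anonymizers shown in the configuration, and therefore also after the DatabaseExportService. The sketch below is written by analogy with those stages. The class name org.rsna.ctp.stdstages.ZipAnonymizer and the script file name zip-anonymizer.script are assumptions which should be checked against the CTP documentation before use.

        <Processor
            name="Zip Provenance Remover"
            class="org.rsna.ctp.stdstages.ZipAnonymizer"
            root="roots/provenance-remover" 
            script="roots/provenance-remover/zip-anonymizer.script"
            quarantine="quarantines/provenance-remover" />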

When CTP starts, it looks for the script files in the specified locations, and if they are not present, example files are copied into place. The CTP admin web server includes a special servlet for editing DICOM anonymizer scripts. It also includes a servlet for editing all other script types (including those for the XML and Zip anonymizers as well as the DICOM, XML, and Zip filters). See The CTP XML Anonymizer for information on configuring the anonymizers for XmlObjects and the manifests in ZipObjects. The DICOM anonymizer is described in The CTP DICOM Anonymizer.

3 Advanced Configuration

This section briefly notes two other points which may be of interest in special cases.

3.1 Importing from a File Tree

CTP has several standard ImportServices, and it is designed to be extended. One standard ImportService which might be useful in capturing the contents of a large archive for the NBIA database is the ArchiveImportService, which walks a directory tree and supplies objects to its pipeline. See the wiki article and the CTP Javadocs for org.rsna.ctp.pipeline.ArchiveImportService for more information.

3.2 Preprocessing

There are three other pipeline stages which might be of use in filtering out data objects which are inappropriate for inclusion in the database. They are:

  • org.rsna.ctp.stdstages.DicomFilter
  • org.rsna.ctp.stdstages.XmlFilter
  • org.rsna.ctp.stdstages.ZipFilter

These stages are described in the main article on CTP.
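
As a rough sketch only, a filter might be configured as a Processor stage in the same way as the anonymizers. The class name is taken from the list above. The script and quarantine attributes, and all of the paths, are assumptions made by analogy with the anonymizer stages and should be confirmed in the main article on CTP.

        <Processor
            name="DICOM Filter"
            class="org.rsna.ctp.stdstages.DicomFilter"
            root="roots/dicom-filter"
            script="roots/dicom-filter/dicom-filter.script"
            quarantine="quarantines/dicom-filter" />

In the configurations shown earlier, such a stage would presumably be placed between the ImportService and the StorageService, so that objects which fail the filter reach neither the stored files nor the database.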

If the required preprocessing goes beyond the capabilities of the filters' script languages, it is possible to create a special-purpose preprocessing stage. See the CTP Javadocs for org.rsna.ctp.pipeline.Processor and org.rsna.ctp.pipeline.AbstractPipelineStage for more information.