DICOM Anonymizer Configuration for Assigning Subject IDs

From MircWiki
Revision as of 13:21, 8 October 2013 by Johnperry

The CTP DICOM Anonymizer provides a way to replace Protected Health Information (PHI) with subject identifiers, a process that is usually required when acquiring data for clinical research. Most images in clinical imaging trials are stored in the format defined by the DICOM standard. The anonymizer, which runs in CTP as the DicomAnonymizer pipeline stage, has three functions that can be used to replace a PHI patient identifier with a non-PHI subject identifier. This article describes the situations in which each function is useful and how to configure the anonymizer to use it. The intended audience is clinical trial administrators and coordinators. To fully understand this article, it may be necessary to refer to the article The DICOM Anonymizer.

The patient identifier in a DICOM image is contained in the DICOM PatientID element (0010,0020). When a DICOM image is sent to CTP at an image acquisition site, it is passed through a DicomAnonymizer stage to be de-identified before transmission to the principal investigator site. One of the key steps in that process is the replacement of the PHI value of the PatientID element with the subject ID. The configuration of the DicomAnonymizer is done in the DICOM Anonymizer Configurator. For information on the DicomAnonymizer and its configurator, see the articles listed in CTP Articles.

1 The @lookup Function

For trials in which not all the images and data objects will be processed by CTP, it is necessary to assign subject identifiers manually before the images are acquired. This is often done on a spreadsheet or in a notebook. In such trials, the @lookup function is used to insert the subject ID into the PatientID element. The @lookup function uses a lookup table that is manually updated by the clinical trial coordinator when new patients are recruited and both their PatientID (PHI) and their subject ID are known.

This is the anonymizer replacement script that is typically used for the PatientID element:

@lookup(this,ptid)

The first argument of the function specifies the element whose value is to be used as the key into the lookup table. The special keyword this tells the function to use the current element (PatientID) as the source of the key. The second argument identifies the type of key (ptid in the script above). The key type can be any text that distinguishes different types of keys in the table. In general, it is convenient to use text that is meaningful to a human reader, but the program itself doesn't assign any meaning to it. The only requirement is that it not contain whitespace.
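The role of the two arguments can be illustrated with a small Python model. This is only a sketch of the idea of a keyed lookup table, not CTP's implementation: the table contents, the patient identifiers, the subject IDs and the function name lookup are all hypothetical, and what CTP does when no entry exists is not modeled here.

```python
from typing import Optional

# Hypothetical table contents. Each key pairs a key type with the original
# (PHI) value; the stored value is the pre-assigned subject ID.
LOOKUP_TABLE = {
    ("ptid", "MRN-123456"): "SUBJ-0001",
    ("ptid", "MRN-987654"): "SUBJ-0002",
}


def lookup(value: str, key_type: str) -> Optional[str]:
    """Return the replacement for (key_type, value), or None if there is no entry."""
    # The only stated requirement on the key type is that it has no whitespace.
    if any(ch.isspace() for ch in key_type):
        raise ValueError("key type must not contain whitespace")
    return LOOKUP_TABLE.get((key_type, value))


# The value of the current element (PatientID) is the key; "ptid" is the key type.
print(lookup("MRN-123456", "ptid"))
```

Because the key type is part of the key, the same table could hold entries for other kinds of values without their keys colliding with the patient identifiers.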

The CTP admin user has access to the Lookup Table Editor page for adding lookup table entries through a browser. For information on updating the lookup table, see The CTP Lookup Table Editor.

For more information, see the lookup section of the DICOM Anonymizer article.

2 The @integer Function

For trials in which all the images and data objects will be processed by CTP and no pre-assigned subject IDs need to be created, a convenient method of assigning subject IDs is to use the @integer function. This function assigns sequential integers to subjects as they are encountered.

This is the anonymizer replacement script that is typically used for the PatientID element:

@integer(this,ptid,3)

The first two arguments have the same meanings as in the @lookup function. The third argument specifies the width of the resulting integer in characters; the function prepends leading zeroes to fill out the width. For more information, see the integer section of the DICOM Anonymizer article.
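The behavior described above (one integer per distinct value, assigned in order of first appearance and padded with leading zeroes) can be modeled in a few lines of Python. This is a sketch under assumptions, not CTP's code: the class name SequentialIds is hypothetical, the table is kept in memory only, and numbering is assumed to start at 1.

```python
class SequentialIds:
    """Toy model of sequential ID assignment, with one table per key type."""

    def __init__(self):
        # key type -> {original value -> assigned integer}
        self._tables = {}

    def integer(self, value, key_type, width):
        table = self._tables.setdefault(key_type, {})
        if value not in table:
            # First time this value is seen: give it the next integer.
            table[value] = len(table) + 1
        # Prepend leading zeroes to fill out the requested width.
        return str(table[value]).zfill(width)


ids = SequentialIds()
print(ids.integer("MRN-A", "ptid", 3))
print(ids.integer("MRN-B", "ptid", 3))
print(ids.integer("MRN-A", "ptid", 3))  # same patient, same ID as before
```

The point of the model is that a repeated PatientID always maps back to the integer it was first given, so all images of one patient receive the same subject ID.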

In situations where multiple image acquisition sites contribute patients to a single trial, it is necessary to add a site-unique prefix to the numeric value to avoid assigning the same ID to multiple subjects. This is typically done in the anonymizer script, like this:

XYZ-@integer(this,ptid,3)

The result would be a sequence of subject IDs in the form XYZ-001, XYZ-002, etc.
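A short Python sketch shows why the prefix matters when each site keeps its own counter. The function make_site_assigner and the second prefix ABC are hypothetical, and the counters are modeled in memory rather than as CTP does it.

```python
def make_site_assigner(prefix, width=3):
    """Return a function that assigns prefixed sequential IDs for one site."""
    seen = {}  # original PatientID -> integer assigned at this site

    def assign(patient_id):
        if patient_id not in seen:
            seen[patient_id] = len(seen) + 1
        # Literal site prefix, a hyphen, then the zero-padded integer.
        return f"{prefix}-{seen[patient_id]:0{width}d}"

    return assign


site_xyz = make_site_assigner("XYZ")
site_abc = make_site_assigner("ABC")

# Each site independently starts counting at 1, so without the prefix the
# first patient at both sites would receive the same subject ID.
print(site_xyz("MRN-1"), site_xyz("MRN-2"))
print(site_abc("MRN-1"))
```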

3 The @hashptid Function

The @hashptid function can be used as an alternative to the @integer function. It computes a scrambled numeric value based on the site ID of the image acquisition site and the PatientID. It has the advantage that the result does not indicate the provenance of the data. It has the disadvantage that, to keep the probability of assigning the same ID to two different subjects low, the IDs must be somewhat longer strings.
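The trade-off between ID length and the chance of duplicates can be estimated with the standard birthday approximation. The Python sketch below assumes that hashed IDs are spread uniformly over all decimal strings of the given length, which is an idealization and not a statement about the actual hash used by CTP; the function name and the subject counts are hypothetical.

```python
import math


def collision_probability(num_subjects, digits):
    """Approximate chance that at least two subjects share an ID.

    Birthday approximation: p ~= 1 - exp(-n(n-1) / (2N)), with N = 10**digits
    possible IDs, assuming IDs are uniformly distributed (an idealization).
    """
    space = 10 ** digits
    exponent = -num_subjects * (num_subjects - 1) / (2 * space)
    # -expm1(x) equals 1 - exp(x) and stays accurate for tiny probabilities.
    return -math.expm1(exponent)


for digits in (6, 10, 16):
    p = collision_probability(10_000, digits)
    print(f"{digits} digits, 10,000 subjects: p ~ {p:.3g}")
```

Under this assumption, short IDs make a duplicate almost certain for a trial of a few thousand subjects, while each additional digit reduces the probability by roughly a factor of ten once it is small.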

This is the anonymizer replacement script that is typically used for the PatientID element:

@hashptid(@SITEID,this,16)

In this case, the first argument refers to a parameter (SITEID) defined in the DicomAnonymizer script. This parameter should be set to a different value for each image acquisition site. The second argument, the keyword this, selects the current element (PatientID) as the value to be hashed. The third argument specifies the width of the resulting subject ID. It must be less than or equal to 64 to be compliant with the DICOM standard.
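The general idea of deriving a numeric subject ID from the site ID and the PatientID can be sketched in Python. This is not the algorithm CTP uses: SHA-256, the "|" separator, the function name hash_ptid and the sample identifiers are all assumptions chosen for illustration, so the output will not match what CTP produces.

```python
import hashlib


def hash_ptid(site_id, patient_id, max_len):
    """Return a decimal string of max_len digits derived from both inputs."""
    # The 64-character limit mirrors the constraint stated in the text.
    if not 1 <= max_len <= 64:
        raise ValueError("max_len must be between 1 and 64")
    # Combining the site ID with the PatientID means the same PatientID at
    # two different sites hashes to different values.
    data = f"{site_id}|{patient_id}".encode("utf-8")
    number = int.from_bytes(hashlib.sha256(data).digest(), "big")
    # Keep the low-order max_len decimal digits, zero-padded to full width.
    return str(number % 10 ** max_len).zfill(max_len)


print(hash_ptid("SITE-01", "MRN-123456", 16))
print(hash_ptid("SITE-02", "MRN-123456", 16))
```

The function is deterministic, so every image of the same patient at the same site receives the same subject ID without any table having to be stored.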

For more information, see the hashptid section of the DICOM Anonymizer article.

4 The IDMap Pipeline Stage

In some situations, an image acquisition site may want to keep a record of the correspondence between PatientID and subject ID values. This can be done by inserting an IDMap pipeline stage before the DicomAnonymizer. See the IDMap section of the main CTP article for details.