The CTP DICOM Filter

Latest revision as of 15:28, 18 January 2016

The CTP DicomFilter is a pipeline stage that provides preprocessing of DicomObjects, quarantining those which do not meet the conditions of a script program. This article describes the script language. The intended audience for this article is CTP administrators setting up a processing pipeline.

The Script Language

The script language interrogates a DICOM object and computes a boolean result. If the result is true, the object is accepted for further processing in the pipeline; if it is false, the object is quarantined and further processing is aborted.

An expression in the language consists of terms separated by operators and/or parentheses. There are three operators, listed in order of increasing precedence:

  • + is logical or
  • * is logical and
  • ! is unary logical negation

Expression Examples:

  • term
  • !term
  • term + term * term
  • term * (term + term) + term * !term
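The precedence rules above can be modelled with a small recursive-descent evaluator. This is only an illustrative sketch in Python, not the CTP implementation: the names tokenize and evaluate are hypothetical, and each term is reduced to a bare placeholder name whose truth value is supplied in a dictionary.

```python
import re

# Operators, parentheses and bare placeholder names only. Real terms such as
# ImageType.contains("X") are stood in for by names like a, b, c.
TOKEN = re.compile(r"\s*([+*!()]|[A-Za-z_]\w*)")


def tokenize(text):
    tokens, pos = [], 0
    text = text.strip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text, values):
    """Evaluate an expression; values maps placeholder names to booleans."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def parse_or():  # lowest precedence: +
        result = parse_and()
        while peek() == "+":
            take()
            right = parse_and()
            result = result or right
        return result

    def parse_and():  # middle precedence: *
        result = parse_unary()
        while peek() == "*":
            take()
            right = parse_unary()
            result = result and right
        return result

    def parse_unary():  # highest precedence: ! and parentheses
        token = take()
        if token == "!":
            return not parse_unary()
        if token == "(":
            result = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return result
        if token in ("+", "*", ")"):
            raise ValueError(f"unexpected token {token!r}")
        return bool(values[token])

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result


# a + b * c groups as a + (b * c), so a true "a" is enough to accept.
print(evaluate("a + b * c", {"a": True, "b": False, "c": False}))
```

Because * binds more tightly than +, the last call prints True; grouping the other way, as (a + b) * c, would give False for the same inputs.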

Terms in the language are either the reserved words true. and false. (note the period after each word) or expressions in the form:

identifier.method("string")

An identifier is either a DICOM element name as defined in the CTP DICOM Anonymizer (e.g. SOPInstanceUID) or a DICOM tag, specified in square brackets (e.g. [0008,0018]). No spaces are permitted in identifiers, and tags are required to contain all eight hexadecimal digits identifying the group and element.

Note that the identifier syntax supported by the DicomFilter is the same as that supported by the DicomAnonymizer, except that while the DicomAnonymizer supports enclosing element identifiers in either parentheses or square brackets, the DicomFilter supports only square brackets.
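The numeric tag rules (square brackets, no spaces, all eight hexadecimal digits) can be expressed as a pattern. The following Python check is a sketch of those rules only; the function name is hypothetical, and accepting lower-case hex digits is an assumption not stated in this article.

```python
import re

# Assumption: both upper- and lower-case hexadecimal digits are accepted.
NUMERIC_TAG = re.compile(r"\[[0-9A-Fa-f]{4},[0-9A-Fa-f]{4}\]")


def is_numeric_tag(identifier):
    """Return True if identifier looks like [gggg,eeee] with no spaces."""
    return NUMERIC_TAG.fullmatch(identifier) is not None


for candidate in ["[0008,0018]", "(0008,0018)", "[8,18]", "[0008, 0018]"]:
    print(candidate, is_numeric_tag(candidate))
```

Only the first candidate passes: the second uses parentheses, the third omits leading zeros, and the fourth contains a space.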

An element in the first item dataset of a sequence element may be referenced by connecting identifiers with pairs of colons. There is no limit to the length of the chain of identifiers. All identifiers in the chain except the last must be sequence elements, and the last must not be a sequence element. Examples:

SeqOfUltrasoundRegions::RegionLocationMinY0
[0018,6011]::[0018,601A]
SeqOfUltrasoundRegions::[0018,601A]
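The way a chain of identifiers selects the first item dataset at each level can be sketched in Python. The data model here is hypothetical: a dataset is a dictionary keyed by the identifier text exactly as written, a sequence element is a list of item datasets, and the values are made up. Returning an empty string when any link in the chain is missing is an assumption extended from the rule for missing elements.

```python
def resolve(dataset, path):
    """Follow a chain like A::B::C through the first item of each sequence."""
    parts = path.split("::")
    current = dataset
    for name in parts[:-1]:
        items = current.get(name)
        if not isinstance(items, list) or not items:
            return ""  # assumption: a missing or empty sequence yields ""
        current = items[0]  # only the first item dataset is consulted
    value = current.get(parts[-1], "")
    return value if isinstance(value, str) else ""


# Illustrative data only: two ultrasound regions with made-up values.
dataset = {
    "[0018,6011]": [
        {"[0018,601A]": "40"},
        {"[0018,601A]": "80"},
    ],
}

print(resolve(dataset, "[0018,6011]::[0018,601A]"))
```

The call prints 40, the value from the first item; the second item is never examined.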

Elements in private groups can be referenced by their numeric group and element numbers like standard elements, as in [0029,1140]. Such elements can also be referenced through their Private Creator elements as in [0029[XYZ CT HEADER]40]. This is an example that references an element buried two levels down in a private group:

  • [0029[XYZ CT HEADER]40]::[0017[ALIGNMENT HEADER]42]

In the above example, group 29 exists in the root dataset of the object. In that group, element [0029,0011] contains the text, XYZ CT HEADER, thus reserving the block of elements from [0029,1100] through [0029,11FF]. In that block, there is an SQ element [0029,1140]. This is the element referenced by [0029[XYZ CT HEADER]40]. The first item dataset of that element contains private group 17, and in that group, there is an element [0017,0010] containing the text, ALIGNMENT HEADER, which reserves the block of elements from [0017,1000] through [0017,10FF]. In that block, there is an element [0017,1042]. This is the element referenced by [0017[ALIGNMENT HEADER]42].
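The block arithmetic in this walkthrough can be reproduced with a short Python sketch. The function names and the dictionary-based dataset (keyed by integer group and element pairs) are hypothetical; the sketch only mirrors the mapping described above, in which the low byte of the Private Creator element number becomes the high byte of the reserved element numbers.

```python
def resolve_private(dataset, group, creator, offset):
    """Return the (group, element) reserved by creator, or None if absent."""
    # Private Creator elements occupy (gggg,0010) through (gggg,00FF).
    for block in range(0x10, 0x100):
        if dataset.get((group, block)) == creator:
            return (group, (block << 8) | offset)
    return None


def format_tag(tag):
    group, element = tag
    return f"[{group:04X},{element:04X}]"


# Root dataset and first item dataset from the example in the text.
root = {(0x0029, 0x0011): "XYZ CT HEADER"}
item = {(0x0017, 0x0010): "ALIGNMENT HEADER"}

print(format_tag(resolve_private(root, 0x0029, "XYZ CT HEADER", 0x40)))
print(format_tag(resolve_private(item, 0x0017, "ALIGNMENT HEADER", 0x42)))
```

The two calls print [0029,1140] and [0017,1042], matching the elements named in the walkthrough.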

The language supports these methods:

  • equals returns true if the value of the identifier exactly equals the string argument; otherwise, it returns false.
  • equalsIgnoreCase is the case-insensitive version of equals.
  • matches returns true if the value of the identifier matches the regular expression specified in the string argument; otherwise, it returns false.
  • contains returns true if the value of the identifier contains the string argument anywhere within it; otherwise, it returns false.
  • containsIgnoreCase is the case-insensitive version of contains.
  • startsWith returns true if the value of the identifier starts with the string argument; otherwise, it returns false.
  • startsWithIgnoreCase is the case-insensitive version of startsWith.
  • endsWith returns true if the value of the identifier ends with the string argument; otherwise, it returns false.
  • endsWithIgnoreCase is the case-insensitive version of endsWith.

The value of an identifier is the string value stored in the DICOM object in the element associated with the identifier. If an identifier is missing from the received DICOM object, an empty string is provided.
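The methods and the empty-string rule for missing elements can be modelled roughly as follows. This Python sketch is not the CTP code: the dataset is a hypothetical dictionary of element names to strings with made-up values, and treating matches as a whole-value match is an assumption, since this article does not say whether a partial match is sufficient.

```python
import re

METHODS = {
    "equals": lambda v, s: v == s,
    "equalsIgnoreCase": lambda v, s: v.lower() == s.lower(),
    # Assumption: the regular expression must match the entire value.
    "matches": lambda v, s: re.fullmatch(s, v) is not None,
    "contains": lambda v, s: s in v,
    "containsIgnoreCase": lambda v, s: s.lower() in v.lower(),
    "startsWith": lambda v, s: v.startswith(s),
    "startsWithIgnoreCase": lambda v, s: v.lower().startswith(s.lower()),
    "endsWith": lambda v, s: v.endswith(s),
    "endsWithIgnoreCase": lambda v, s: v.lower().endswith(s.lower()),
}


def term(dataset, identifier, method, argument):
    """Evaluate identifier.method("argument") against a dict-based dataset."""
    value = dataset.get(identifier, "")  # a missing element gives ""
    return METHODS[method](value, argument)


# Illustrative values only.
dataset = {"Modality": "CT", "InstitutionName": "Mayo Clinic Jacksonville"}

print(term(dataset, "Modality", "matches", "CT|MR"))
print(term(dataset, "InstitutionName", "containsIgnoreCase", "JACKSONVILLE"))
print(term(dataset, "ImageType", "equals", ""))
```

All three calls print True. The last one succeeds because ImageType is absent from the sample dataset, so its value is the empty string.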

Comments:

All text starting with two '/' characters and proceeding to the end of the line is treated as a comment.
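Read literally, the comment rule amounts to discarding everything from the first pair of slashes on each line. The Python sketch below follows that literal reading; the function name is hypothetical, and whether the real parser treats a // inside a quoted string argument differently is not stated here, so that case is deliberately not handled.

```python
def strip_comments(script):
    """Drop // comments and blank lines, joining what is left with spaces."""
    kept = []
    for line in script.splitlines():
        code = line.split("//", 1)[0].strip()
        if code:
            kept.append(code)
    return " ".join(kept)


script = """//This is a comment
    !PatientName.equals("xyz") //accept anybody but xyz
    + !PatientID.contains("1") //or anybody without a 1 in the PatientID
    //+ InstitutionName.containsIgnoreCase("JACKSONVILLE")
//This is another comment"""

print(strip_comments(script))
```

The output is the single expression !PatientName.equals("xyz") + !PatientID.contains("1"); the commented-out InstitutionName line contributes nothing.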

Script Examples:

Suppose that images are to be rejected if they are of type "SECONDARY". Such images could be filtered out of the pipeline with a script like:

!ImageType.contains("SECONDARY")

Note the unary negation operator, which is necessary to generate true for images which do not contain the string SECONDARY.

Suppose that images are to be rejected if they are of type "SECONDARY" or of type "DERIVED". Such images could be filtered out of the pipeline with a script like:

!(ImageType.contains("SECONDARY") + ImageType.contains("DERIVED"))

Note again the unary negation operator, and also note the parentheses and the logical or operator, all of which combine to generate true only if the type is neither SECONDARY nor DERIVED.

The same effect could be achieved with a script like:

!ImageType.contains("SECONDARY") * !ImageType.contains("DERIVED")

Note the use of the logical and operator and the way that each term is individually negated.
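The equivalence of the two scripts is an instance of De Morgan's law, and it can be checked exhaustively. In this Python sketch the two booleans stand for the results of ImageType.contains("SECONDARY") and ImageType.contains("DERIVED"); the function names are hypothetical.

```python
from itertools import product


def negated_or(secondary, derived):
    # !(ImageType.contains("SECONDARY") + ImageType.contains("DERIVED"))
    return not (secondary or derived)


def and_of_negations(secondary, derived):
    # !ImageType.contains("SECONDARY") * !ImageType.contains("DERIVED")
    return (not secondary) and (not derived)


# Compare the two forms for every combination of term results.
for secondary, derived in product([False, True], repeat=2):
    accepted = negated_or(secondary, derived)
    assert accepted == and_of_negations(secondary, derived)
    print(secondary, derived, accepted)
```

Only the combination in which both terms are false is accepted, which corresponds to an image that is neither SECONDARY nor DERIVED.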

Finally, suppose that images containing any non-empty value in the ImageType element are to be rejected. Such images could be filtered out with a script like:

ImageType.equals("")

Note that in this case the unary negation operator is not used because if the element is missing or empty, the equals method will generate true, which is the value necessary to pass the object down the pipeline. This script could also be coded using the DICOM group and element numbers like this:

[0008,0008].equals("")

Here is an example with comments:

//This is a comment
    !PatientName.equals("xyz") //accept anybody but xyz
    + !PatientID.contains("1") //or anybody without a 1 in the PatientID
    //+ InstitutionName.containsIgnoreCase("JACKSONVILLE") //note: this line is ignored because it starts with //
//This is another comment