The MIRCquery Schema

From MircWiki
Revision as of 12:18, 7 July 2006 by Johnperry (Talk | contribs) (MIRCquery Attributes)

This article describes technical requirements for participation in the MIRC community as a storage service. Additional information is provided in The MIRCqueryresult Schema and The MIRCdocument Schema.

The intended audience for this article is the system administrator or developer of a teaching file system who is implementing a MIRC storage service to make the system’s teaching files available to the MIRC community.

1 Overview

MIRC may be defined as a collection of web sites that share a common query mechanism.

The two key components involved in the query process are:

  • Query Service: a web site that is accessed by a MIRC user with a web browser. The query service provides web pages that allow a user to define search criteria and select sites in the MIRC community to be searched. It queries the selected sites, organizes the query results into a web page or pages, and sends the pages to the browser.
  • Storage Service: a web site that stores information of interest to MIRC users. In response to a query received from a query service, it identifies all the information it stores which meets the search criteria, constructs a query response containing an abstract of the information to allow a user to determine whether the information is of interest, and sends the response to the query service. It also serves the information directly to a MIRC user in response to a request from the user.

The query service and storage service are independent. A MIRC site may include a query service or a storage service, or both.

2 MIRC Query

A MIRC query is an XML object in the form defined by the MIRCquery schema. It is passed from a query service to a storage service via an HTTP POST of content type text/xml.

The following is an example of a MIRCquery:

<MIRCquery firstresult="…" maxresults="…" queryUID="…" unknown="…">
    <title> . . . </title>
    <author> . . . </author>
    <abstract> . . . </abstract>
    <keywords> . . . </keywords>
    <history> . . . </history>
    <findings> . . . </findings>
    <diagnosis> . . . </diagnosis>
    <differential-diagnosis> . . . </differential-diagnosis>
    <discussion> . . . </discussion>
    <pathology> . . . </pathology>
    <anatomy> . . . </anatomy>
    <organ-system> . . . </organ-system>
    <code coding-system="…"> . . . </code>
    <modality> . . . </modality>
    <patient>
        <pt-age>
            <years> . . . </years>
            <months> . . . </months>
            <weeks> . . . </weeks>
            <days> . . . </days>
        </pt-age>
        <pt-sex> . . . </pt-sex>
        <pt-race> . . . </pt-race>
        <pt-species> . . . </pt-species>    <!--veterinary-->
        <pt-breed> . . . </pt-breed>        <!--veterinary-->
    </patient>
    <image>
        <format> . . . </format>
        <compression> . . . </compression>
        <modality> . . . </modality>
        <anatomy> . . . </anatomy>
        <pathology> . . . </pathology>
    </image>
    <document-type> . . . </document-type>
    <category> . . . </category>
    <level> . . . </level>
    <access> . . . </access>
    <peer-review/>
    <language code="…"> . . . </language>
    … free text search field …
</MIRCquery>
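
As an illustration of how a query service might assemble and transmit such a query, the following is a minimal sketch only. The class name MircQueryClient, its method names, and the storage service URL passed to post are hypothetical; the charset parameter on the content type is an assumption, since the requirement stated above is only that the content type be text/xml.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any MIRC distribution.
class MircQueryClient {

    // Builds a small MIRCquery with a single child element.
    // All child elements are optional; only <anatomy> is shown here.
    static String buildQuery(int firstResult, int maxResults, String anatomy) {
        return "<MIRCquery firstresult=\"" + firstResult + "\""
             + " maxresults=\"" + maxResults + "\">"
             + "<anatomy>" + escape(anatomy) + "</anatomy>"
             + "</MIRCquery>";
    }

    // Escapes the characters that are significant in XML text content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Sends the query as the body of an HTTP POST and returns the status code.
    // The URL is supplied by the caller; no real endpoint is assumed here.
    static int post(String storageServiceUrl, String query) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(storageServiceUrl).toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        // Assumption: UTF-8 is declared explicitly alongside text/xml.
        conn.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(query.getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode();
    }
}
```

The query text is escaped before it is placed in the element so that search text containing characters such as & or < still yields well-formed XML.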

2.1 MIRCquery Attributes

The firstresult and maxresults attributes of the MIRCquery element are used to allow the query service to break the responses into groups. The firstresult attribute specifies the first result to be returned by the storage service. The value 0 corresponds to the first result in the list. If the attribute is missing, the default value 0 is to be used.

The maxresults attribute specifies the maximum number of results to be returned by the storage service. For example, if the query service is grouping results into sets of 10 and is asking for the third group, firstresult would be set to 20 and maxresults would be set to 10. If the maxresults attribute is missing or 0, the storage service is to return 1 result.
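
The paging arithmetic and the default values described above can be sketched as follows. This is an illustrative sketch, not the RSNA implementation: the class ResultPager and its methods are hypothetical, and treating a non-numeric or negative attribute value like a missing one is an assumption that the text above does not address.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating firstresult / maxresults handling.
class ResultPager {

    // Query-service side: firstresult value for a 1-based group number.
    static int firstResultForGroup(int group, int groupSize) {
        return (group - 1) * groupSize;
    }

    // Parses an attribute value that may be absent (null).
    // Assumption: a non-numeric value is treated like a missing attribute.
    static int parseOrDefault(String value, int defaultValue) {
        if (value == null) return defaultValue;
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    // Storage-service side: selects the requested slice of the full match list.
    static <T> List<T> page(List<T> matches, String firstresult, String maxresults) {
        // A missing firstresult defaults to 0 (the first result in the list).
        int first = Math.max(0, parseOrDefault(firstresult, 0));
        // A missing or zero maxresults means that one result is returned.
        int max = parseOrDefault(maxresults, 0);
        if (max <= 0) max = 1;
        if (first >= matches.size()) return new ArrayList<T>();
        int end = (int) Math.min((long) matches.size(), (long) first + max);
        return new ArrayList<T>(matches.subList(first, end));
    }
}
```

With groups of 10, firstResultForGroup(3, 10) gives 20, which matches the example in the paragraph above.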

The queryUID attribute may be generated by query services to uniquely identify the query. If present, it can be used by the storage service to cache the results of the query. When provided, all page requests (that is, all MIRCquery elements with different firstresult attribute values but otherwise containing the same child elements) have the same queryUID attribute value.
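
One way a storage service might use the queryUID for caching is sketched below. The class QueryResultCache is hypothetical, results are represented simply as a list of strings, and the sketch deliberately omits eviction and expiry, which a real service would need.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical cache keyed on the queryUID attribute.
class QueryResultCache {

    private final Map<String, List<String>> cache =
        new ConcurrentHashMap<String, List<String>>();

    // Returns the full result list for a query. The search is run only once
    // per queryUID; later page requests with the same queryUID reuse the list.
    List<String> resultsFor(String queryUID, Supplier<List<String>> search) {
        // The attribute is optional: without it, nothing can be cached.
        if (queryUID == null || queryUID.isEmpty()) {
            return search.get();
        }
        return cache.computeIfAbsent(queryUID, uid -> search.get());
    }
}
```

Each page request would then take its slice of the cached list according to its own firstresult and maxresults values.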

The unknown attribute is optionally provided by the query service to instruct the storage service to return the query results as a set of unknowns, providing an alternative title and abstract that conceal the diagnostic result from the student. The value of the attribute may be yes or no. If the attribute is missing, the default value of no is to be used.

2.2 MIRCquery Child Elements

All the child elements are optional in a MIRCquery. A storage service uses the value of any child element included in a MIRCquery as a query field and searches its index for documents containing the contents of the query field in data that is identified to be of the type defined by the name of the child element. Thus, if a MIRCquery contains an

<anatomy>chest</anatomy>

child, the storage service searches its index for documents containing the word chest in a field identified as anatomy.

Certain elements have enumerated values:

  • pt-sex
  • format
  • compression
  • modality
  • document-type
  • level
  • access
  • language code="…"

These values are defined in The MIRCdocument Schema.

The author element is a special case. Any text contained within the author element of a MIRCquery is intended to be used as a match against any information associated with an author. The RSNA storage service, for example, uses the contents of the author element of the MIRCquery to search all the child elements of the author element in a MIRCdocument (e.g., the name, affiliation, and contact elements).

The peer-review element is another special case. If the element is present in the MIRCquery, all documents listed in search results are required to have been peer-reviewed. If it is missing from the MIRCquery, no constraints are placed on the peer-review status of documents listed in search results. Any text value of the peer-review element is ignored.
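
The handling of child elements described in this section can be sketched with the standard Java DOM parser. The class MircQueryFields is hypothetical, and the convention of naming nested fields by path (for example image/modality) is an assumption made for this sketch only; attributes such as coding-system are not collected.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical holder for the query fields found in a MIRCquery.
class MircQueryFields {
    final Map<String, String> fields = new LinkedHashMap<String, String>();
    boolean peerReviewRequired = false;

    static MircQueryFields parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject DOCTYPE declarations, since the XML comes from another site.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            MircQueryFields result = new MircQueryFields();
            collect(doc.getDocumentElement(), "", result);
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a well-formed MIRCquery", e);
        }
    }

    // Walks the element tree and records each leaf element as a query field.
    private static void collect(Element parent, String prefix, MircQueryFields out) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) continue;
            Element child = (Element) node;
            String path = prefix + child.getTagName();
            if (path.equals("peer-review")) {
                // Presence alone matters; any text value is ignored.
                out.peerReviewRequired = true;
            } else if (hasElementChildren(child)) {
                collect(child, path + "/", out);   // e.g. patient, image, pt-age
            } else {
                out.fields.put(path, child.getTextContent().trim());
            }
        }
    }

    private static boolean hasElementChildren(Element element) {
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) return true;
        }
        return false;
    }
}
```

Child elements absent from the query simply never appear in the map, which is consistent with all child elements being optional.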

3 Query Rules

There may be at most one child element of each type in a MIRCquery. Complex searches within an element type are done using the boolean syntax described below.

The RSNA query and storage services implement the following search rules:

  • If text appears in any MIRCquery child element or attribute, it is a required match for a corresponding element in a document to be listed in the search results.
  • All child elements appearing within a MIRCquery are required matches, i.e., a matching document is one that matches all query fields.
  • Child elements that are not included in a MIRCquery are not required matches. Thus, an empty MIRCquery (<MIRCquery/>) is a match to all documents in the storage service.
  • Text is not case-sensitive.
  • A free-text search, matching text anywhere in a document, is done by placing the search text in the text value of the <MIRCquery> element.
  • Search text containing separate words with no intervening operator characters results in a logical AND of all the words, but not necessarily in order.
  • Search text can be constructed with a logical OR using the “|” character.
  • Search text can be constrained to appear in order by placing it in quotes within the search string.
  • Complex combinations of logical AND and logical OR operations can be created using parentheses, ( … ).

Note that the first bullet above implies that if text appears in a MIRCquery element or attribute that is not supported by a site’s software implementation, the site must return zero matches.
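
The boolean syntax in the rules above can be sketched as a small recursive-descent evaluator. This is not the RSNA implementation. The class SearchExpression is hypothetical, and several points that the rules leave open are assumptions here: implicit AND binds more tightly than |, words and quoted phrases are matched as case-insensitive substrings, and malformed input (such as an unclosed quote or parenthesis) is handled leniently.

```java
// Hypothetical evaluator for the search-text syntax: implicit AND, | for OR,
// "quoted phrases" and ( ... ) grouping.
class SearchExpression {
    private final String query;
    private final String text;   // lower-cased field or document text
    private int pos = 0;

    private SearchExpression(String query, String text) {
        this.query = query;
        this.text = text.toLowerCase();
    }

    // Returns true if the text satisfies the search expression.
    static boolean matches(String query, String documentText) {
        return new SearchExpression(query, documentText).parseOr();
    }

    // or := and ( '|' and )*
    private boolean parseOr() {
        boolean result = parseAnd();
        skipSpaces();
        while (pos < query.length() && query.charAt(pos) == '|') {
            pos++;
            boolean right = parseAnd();   // always parse, so input is consumed
            result = result || right;
            skipSpaces();
        }
        return result;
    }

    // and := factor*   (an empty sequence is treated as a match)
    private boolean parseAnd() {
        boolean result = true;
        skipSpaces();
        while (pos < query.length() && query.charAt(pos) != '|' && query.charAt(pos) != ')') {
            boolean factor = parseFactor();
            result = result && factor;
            skipSpaces();
        }
        return result;
    }

    // factor := '(' or ')' | '"' phrase '"' | word
    private boolean parseFactor() {
        char c = query.charAt(pos);
        if (c == '(') {
            pos++;
            boolean inner = parseOr();
            skipSpaces();
            if (pos < query.length() && query.charAt(pos) == ')') pos++;
            return inner;
        }
        if (c == '"') {
            int end = query.indexOf('"', pos + 1);
            if (end < 0) end = query.length();          // unclosed quote
            String phrase = query.substring(pos + 1, end);
            pos = Math.min(end + 1, query.length());
            return text.contains(phrase.toLowerCase()); // words in order
        }
        int start = pos;
        while (pos < query.length() && !Character.isWhitespace(query.charAt(pos))
                && "|()\"".indexOf(query.charAt(pos)) < 0) {
            pos++;
        }
        return text.contains(query.substring(start, pos).toLowerCase());
    }

    private void skipSpaces() {
        while (pos < query.length() && Character.isWhitespace(query.charAt(pos))) pos++;
    }
}
```

For example, under these assumptions the search text chest pneumonia matches a field containing "Pneumonia of the chest", while the quoted form "chest pneumonia" does not, because the words do not appear in that order.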

4 Notes and Suggestions for Implementers

4.1 Content Type

To provide the most efficient support for all languages, query services and storage services should transmit their MIRCquery and MIRCqueryresult contents in UTF-8. Some MIRC query service implementations, in order to convert to UTF-8, transmit a content type of:

text/xml; charset=utf-8

For that reason, storage services should not test the Content-Type using something like:

if (contentType.equals("text/xml")) { … }

but instead:

if (contentType.indexOf("text/xml") != -1) { … }
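
A somewhat stricter alternative to the substring test is to compare only the media type, ignoring any parameters such as charset. The following is a sketch under that assumption; the class ContentTypes and the method isTextXml are hypothetical names. It also tolerates a missing header and differences in letter case, since media type names are case-insensitive in HTTP.

```java
import java.util.Locale;

// Hypothetical helper for testing the Content-Type of an incoming MIRCquery.
class ContentTypes {

    // Returns true when the media type, without parameters, is text/xml.
    static boolean isTextXml(String contentType) {
        if (contentType == null) return false;   // header absent
        int semicolon = contentType.indexOf(';');
        String mediaType = semicolon < 0
            ? contentType
            : contentType.substring(0, semicolon);
        return mediaType.trim().toLowerCase(Locale.ROOT).equals("text/xml");
    }
}
```

Unlike the substring test, this form does not accept an unrelated header value that merely happens to contain the string text/xml.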

4.2 The Abstract

The contents of the abstract element should be brief – fewer than 10 lines of text – in order to keep the complete set of query results short enough to allow the user to look through the results and select the desired document. The RSNA query service imposes a 1000-character limit on the length of the abstract. Abstracts that exceed the limit are truncated, and any embedded element tags (e.g., HTML) are suppressed to ensure that the result remains well-formed.
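
One reading of the truncation behavior described above is sketched below. It is an assumption-laden illustration, not the RSNA query service code: the class AbstractTrimmer is hypothetical, the length is measured on the abstract as received, tags are removed with a simple pattern only when the limit is exceeded, and a trailing partial entity reference is dropped so that the cut does not fall inside it.

```java
// Hypothetical helper illustrating a 1000-character abstract limit.
class AbstractTrimmer {
    static final int MAX_LENGTH = 1000;

    static String trim(String abstractText) {
        // Abstracts within the limit are passed through unchanged.
        if (abstractText.length() <= MAX_LENGTH) return abstractText;

        // Suppress embedded element tags so that truncation cannot leave
        // an element open or cut a tag in half.
        String plain = abstractText.replaceAll("<[^>]*>", "");
        if (plain.length() <= MAX_LENGTH) return plain;

        String cut = plain.substring(0, MAX_LENGTH);
        // Avoid ending in the middle of an entity reference such as &amp;
        int amp = cut.lastIndexOf('&');
        if (amp >= 0 && cut.indexOf(';', amp) < 0) {
            cut = cut.substring(0, amp);
        }
        return cut;
    }
}
```

A storage service that keeps its abstracts under the limit avoids this processing altogether and retains control over how its abstracts are displayed.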

If the document is not in HTML format, its format can be noted in the abstract to assist the user in deciding whether the document is of interest.