The Storage Service Index File

From MircWiki
Revision as of 13:41, 14 December 2006 by Rboden (talk | contribs) (Reverted edits by Shadoe (Talk); changed back to last version by Johnperry)

This article describes the index file used by storage services in the RSNA MIRC implementation. The intended audience is MIRC site administrators and software developers who are modifying the open source version of the RSNA software.

1 The Storage Service Index File

When a storage service starts, it looks in its root directory (for example, /webapps/mircstorage) for a file called siteindex.xml containing the names of the documents it should index. From this file, it builds an XML object in memory containing the searchable parts of all the documents. The XML object in memory is searched whenever a query is received, providing rapid responses. The index file is an XML file with the following format:

<MIRCindex>
    <doc>case1folderpath/case1.xml</doc>
    <doc>case2folderpath/case2.xml</doc>
    <doc>case3folderpath/case3.xml</doc>
    ...
</MIRCindex>

There is one doc element for each document to be indexed. Each doc element contains the path from the servlet root directory (/webapps/[storage service name]) to the XML file to be indexed.
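As an illustration of the format, the following Python sketch reads an index file and resolves each doc element against the servlet root directory. It is not the MIRC implementation (which is a Java servlet); the function name list_indexed_documents and its parameters are hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def list_indexed_documents(servlet_root, index_name="siteindex.xml"):
    """Return the full path of every document named in the index file.

    servlet_root is the storage service root directory; each doc element
    is assumed to hold a path relative to that directory.
    """
    root_dir = Path(servlet_root)
    tree = ET.parse(str(root_dir / index_name))
    paths = []
    for doc in tree.getroot().findall("doc"):
        text = (doc.text or "").strip()
        if text:  # skip empty doc elements
            paths.append(root_dir / text)
    return paths
```

The returned list keeps the order of the doc elements in the file.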

The submit service servlet inserts new doc elements without parsing the XML file. Its algorithm requires exactly the form shown above: the MIRCindex start tag, each doc element, and the MIRCindex end tag must each be on a separate line in the file.
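To show why the line layout matters, here is a minimal Python sketch of a line-based insertion that never parses the XML. It is an assumption about how such an algorithm could work, not the servlet's actual code; in particular, placing the new element just before the end tag is a choice made for this example, and insert_doc_line is a hypothetical name.

```python
def insert_doc_line(index_text, doc_path):
    """Add a doc element using line operations only (no XML parsing).

    Assumption: the new element goes on its own line immediately before
    the line that contains only the MIRCindex end tag.
    """
    lines = index_text.splitlines()
    new_line = "    <doc>" + doc_path + "</doc>"
    # Search from the bottom for a line holding nothing but the end tag.
    for i in range(len(lines) - 1, -1, -1):
        if lines[i].strip() == "</MIRCindex>":
            lines.insert(i, new_line)
            return "\n".join(lines) + "\n"
    raise ValueError("no line containing only </MIRCindex> was found")
```

With this approach, an index written on a single line (for example, with the start and end tags together) is rejected, which is the kind of failure the formatting rule guards against.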

The MIRC software does not include a siteindex.xml file, as a precaution against overwriting an existing file during an upgrade installation. If no index file exists, the storage service servlet automatically creates an empty one containing only:

<MIRCindex>
</MIRCindex>
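The create-if-missing behavior can be sketched in Python as follows. This is an illustration only; ensure_index_file and EMPTY_INDEX are hypothetical names, and the real servlet may differ in details such as encoding or line endings.

```python
from pathlib import Path

# The empty index shown above, with each tag on its own line.
EMPTY_INDEX = "<MIRCindex>\n</MIRCindex>\n"


def ensure_index_file(servlet_root, index_name="siteindex.xml"):
    """Create an empty index file if none exists.

    Returns True if a file was created, False if one was already present.
    An existing file is never overwritten.
    """
    index_path = Path(servlet_root) / index_name
    if index_path.exists():
        return False
    index_path.write_text(EMPTY_INDEX, encoding="utf-8")
    return True
```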

The software maintains the storage service index file and its corresponding XML index object automatically as documents are submitted through the author service or the submit service, or created by the DICOM service. In unusual circumstances, such as when an administrator inserts a document into the system by hand, the administrator can reload the memory-resident index object or rebuild the index file using buttons on the storage service admin page.

The XML index object contains MIRCdocument elements in the order in which they occur in the index file. An administrator who wishes to give a document preferential treatment, so that it appears earlier in query result lists, can move the doc element for that document higher in the index file using a text editor such as TextPad.
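The same reordering could be scripted instead of done by hand. The Python sketch below moves one doc element to the first position while keeping the one-element-per-line layout; promote_doc is a hypothetical helper, not part of the MIRC software.

```python
def promote_doc(index_text, doc_path):
    """Move the doc element for doc_path to the top of the index.

    Works on whole lines so that every element stays on its own line.
    """
    lines = index_text.splitlines()
    target = "<doc>" + doc_path + "</doc>"
    for i, line in enumerate(lines):
        if line.strip() == target:
            moved = lines.pop(i)
            break
    else:
        raise ValueError("document is not listed in the index: " + doc_path)
    # Re-insert the line directly after the MIRCindex start tag.
    for j, line in enumerate(lines):
        if line.strip() == "<MIRCindex>":
            lines.insert(j + 1, moved)
            return "\n".join(lines) + "\n"
    raise ValueError("no line containing only <MIRCindex> was found")
```

After the file is changed, the memory-resident index object presumably still needs to be reloaded from the storage service admin page before the new order takes effect.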

2 Remote Site Indexing

Rarely, it may be desirable to index the contents of a remote web site on a storage service. Such indexes are usually created programmatically. To simplify the process, storage services can accept XML files containing index cards for groups of documents. A remote site index is an XML file with the following structure:

<MIRCsiteindex>
    <MIRCdocument docref="...">...</MIRCdocument>
    <MIRCdocument docref="...">...</MIRCdocument>
    ...
</MIRCsiteindex>

where each MIRCdocument element is an index card with a docref attribute containing a fully qualified URL pointing to a document on another site. An index card is a MIRCdocument that has at least title, author, and abstract elements and contains no local references.
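Because remote site indexes are usually generated by a program, a small check on each index card can be useful. The Python sketch below tests the two requirements that are easy to verify mechanically: a fully qualified docref URL and the presence of the three required elements. It assumes that title, author, and abstract are direct children of MIRCdocument, and it does not check for local references; check_index_card is a hypothetical name.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

REQUIRED_ELEMENTS = ("title", "author", "abstract")


def check_index_card(card):
    """Return a list of problems found in one MIRCdocument index card.

    card is an xml.etree.ElementTree.Element. An empty list means the
    card passed these checks (local references are not examined).
    """
    problems = []
    parsed = urlparse(card.get("docref", ""))
    # A fully qualified URL has both a scheme and a host part.
    if not (parsed.scheme and parsed.netloc):
        problems.append("docref is not a fully qualified URL")
    for name in REQUIRED_ELEMENTS:
        if card.find(name) is None:  # assumption: direct child elements
            problems.append("missing " + name + " element")
    return problems
```

A generator script could call this on every child of MIRCsiteindex before writing the file, for example with ET.parse(path).getroot().findall("MIRCdocument").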

A remote site index is represented in the storage service index file by an index element containing the path from the servlet root directory to the file that holds the remote site index. The doc elements and index elements can be mixed in the same storage service index file, as in this example:

<MIRCindex>
    <doc>case1folderpath/case1.xml</doc>
    <doc>case2folderpath/case2.xml</doc>
    ...
    <index>index1path/index1.xml</index>
    <index>index2path/index2.xml</index>
    ...
</MIRCindex>
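A mixed index like the one above could be separated into its two kinds of entries as in the following Python sketch. It is illustrative only; split_index_entries is a hypothetical name and the code makes no claim about how the storage service itself reads the file.

```python
import xml.etree.ElementTree as ET


def split_index_entries(index_xml):
    """Split a storage service index into doc paths and index paths.

    Returns (docs, indexes), each a list of paths in file order.
    """
    root = ET.fromstring(index_xml)
    docs, indexes = [], []
    for child in root:
        text = (child.text or "").strip()
        if not text:
            continue  # ignore empty elements
        if child.tag == "doc":
            docs.append(text)       # a local document
        elif child.tag == "index":
            indexes.append(text)    # a remote site index file
    return docs, indexes
```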

There is no support in the admin service or the submit service for managing remote site indexes, so they must be entered by hand.