The CTP JarClassLoader

From MircWiki
Revision as of 13:30, 5 July 2012 by Johnperry

DEPRECATED - The mechanism described in this article has been superseded by a new approach.

This article describes the startup mechanism used by CTP. Nobody needs to read this article in order to use, or even to extend, CTP. The purpose of the article is simply to capture the thought process behind the startup mechanism in case changes are contemplated in the future. This article contains secret, highly technical information which will make you extremely attractive to women. Proceed at your own risk.

CTP is a pipeline processor which is designed to be extensible. The idea is that programmers can implement new PipelineStages or DatabaseAdapters and couple them into the program without having to modify CTP itself. See Extending CTP for more information. Once an extension has been created, it must be recognized by CTP in order to be available for loading. The recommended way to do that is to build extensions into JAR files and place those files in the libraries directory of the CTP installation. To support this mechanism, CTP must have a way to load classes that are not on the classpath.

A brief diversion on ClassLoaders

ClassLoaders load classes and resources. There are three standard ClassLoaders in Java:

  • The primordial ClassLoader (also known as the bootstrap ClassLoader) is responsible for loading the classes of the Java class library. This ClassLoader is part of the JVM, and it is written in native code for the platform.
  • The extension ClassLoader is responsible for loading Java extensions. These are typically classes in JARs in the jre/lib/ext directory, although it is possible to specify additional extensions directories on the command line when starting a Java program.
  • The application ClassLoader is responsible for loading classes found on the classpath. The classpath is defined by an environment variable or an attribute in the manifest of a JAR file. It can also be specified on the command line when starting a Java program.

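Except for the primordial ClassLoader, a ClassLoader is a Java class, so it must itself be loaded by another ClassLoader. Each ClassLoader also has a parent, which is assigned when the ClassLoader is constructed; if no parent is specified, the application ClassLoader is used. ClassLoaders typically obey delegation rules along these lines: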

  1. When asked to load a class (by a call to its loadClass method), the ClassLoader first checks to see if it has already loaded the class. If so, it must return exactly the same Class object for the class that it returned before.
  2. If the requested class has not been previously loaded, the ClassLoader calls the loadClass method of its parent. If the parent supplies a Class object for the requested class, the ClassLoader returns it.
  3. If the parent failed to load the requested class (indicated by receiving a ClassNotFoundException from the parent), the ClassLoader searches its domain to find the class. If it finds the class, it returns it. If not, it throws a ClassNotFoundException.
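
The three steps above can be sketched as a small ClassLoader subclass. This is only an illustration of the usual parent-first order, not code from CTP: the class name ParentFirstLoader is hypothetical, the sketch assumes a non-null parent, and it has no domain of its own, so step 3 always ends in a ClassNotFoundException.

```java
/**
 * Illustrative only: a ClassLoader that spells out the usual
 * parent-first delegation steps. The class name is hypothetical.
 */
class ParentFirstLoader extends ClassLoader {

  ParentFirstLoader(ClassLoader parent) {
    super(parent); // this sketch assumes the parent is not null
  }

  @Override
  protected Class<?> loadClass(String name, boolean resolve)
      throws ClassNotFoundException {
    synchronized (getClassLoadingLock(name)) {
      // Step 1: return the same Class object if it was loaded before.
      Class<?> c = findLoadedClass(name);
      if (c == null) {
        try {
          // Step 2: give the parent the first chance.
          c = getParent().loadClass(name);
        }
        catch (ClassNotFoundException parentFailed) {
          // Step 3: search this loader's own domain. The inherited
          // findClass always throws ClassNotFoundException, because
          // this sketch has nowhere of its own to look.
          c = findClass(name);
        }
      }
      if (resolve) resolveClass(c);
      return c;
    }
  }
}
```

A real loader would override findClass to search its own JARs or directories; URLClassLoader does exactly that while keeping this delegation order.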

The effect of the delegation mechanism is that the highest level ClassLoader which can load a class is the one that loads it. Every Class object knows the ClassLoader which loaded it. When an object attempts to instantiate another object using the new instruction, the instantiating object's ClassLoader is used to load the Class of the instantiated class, but the delegation mechanism may result in a higher level ClassLoader being the one that does the actual loading (and being the one that the loaded class knows as its ClassLoader). This little bit of arcana will be significant later.
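
To see which ClassLoader a class knows as its own, one can walk the chain of parents. The following is a minimal sketch; the class and method names are made up for the illustration. It assumes a JVM (such as HotSpot) in which getClassLoader() returns null for classes loaded by the primordial ClassLoader, and the loader names it prints vary by Java version.

```java
class LoaderChain {

  /** Describe a loader; null stands for the primordial (bootstrap) loader. */
  static String describe(ClassLoader loader) {
    return (loader == null) ? "primordial (bootstrap)" : loader.getClass().getName();
  }

  /** Print the loader of a class and then each parent up the chain. */
  static void printChain(Class<?> c) {
    ClassLoader loader = c.getClassLoader();
    System.out.println(c.getName() + " was loaded by " + describe(loader));
    while (loader != null) {
      loader = loader.getParent();
      System.out.println("  parent: " + describe(loader));
    }
  }

  public static void main(String[] args) {
    printChain(String.class);      // a class from the Java class library
    printChain(LoaderChain.class); // a class found on the classpath
  }
}
```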

Parenthetic note: Two Class objects are equal only if they have the same fully qualified class name and were loaded by the same ClassLoader.

CTP startup

CTP is packaged in the CTP.jar file. The main class is org.rsna.ctp.ClinicalTrialProcessor. When the program starts, the main method of the main class is called. That method is shown below.

static final File libraries = new File("libraries");
static final String ctp = "org.rsna.ctp.ClinicalTrialProcessor";

public static void main(String[] args) {

  //Make sure the libraries directory is present
  libraries.mkdirs();

  //Get a JarClassLoader pointing to this program plus the libraries directory
  JarClassLoader cl = 
    JarClassLoader.getInstance(new File[] { new File("CTP.jar"), libraries });

  //Set the context classloader to the JarClassLoader
  Thread.currentThread().setContextClassLoader(cl);

  //Load the class and instantiate it
  try {
    Class ctpClass = cl.loadClass(ctp);
    ctpClass.getConstructor( new Class[0] ).newInstance( new Object[0] );
  }
  catch (Exception unable) { unable.printStackTrace(); }
}

The key objective of the main method is to get the program started, with a ClassLoader in place which knows about the JARs in the libraries directory. There are several subtleties in this method, so it is worthwhile to examine each instruction.

  //Make sure the libraries directory is present
  libraries.mkdirs();

This simply ensures that the libraries directory exists.

  //Get a JarClassLoader pointing to this program plus the libraries directory
  JarClassLoader cl = 
    JarClassLoader.getInstance(new File[] { new File("CTP.jar"), libraries });

This gets an instance of a JarClassLoader which knows about the CTP.jar file (which is located in the top-level CTP directory) and all the JAR files located in the libraries directory. Unlike a standard ClassLoader, it searches its own JARs before delegating to its parent ClassLoader, thus ensuring that if it can load a class, it becomes the ClassLoader of that class. The JarClassLoader is described in detail below.

  //Set the context classloader to the JarClassLoader
  Thread.currentThread().setContextClassLoader(cl);

This sets the context ClassLoader of the Thread in which the program is starting to the JarClassLoader. This is necessary in the case of CTP because the DICOM library (dcm4che) has certain methods which use the context ClassLoader to load classes which are in its JAR. When the program starts, the context ClassLoader is the ApplicationClassLoader, which doesn't know about the dcm4che JAR, so it is necessary to replace the context ClassLoader with one that does.
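
The role of the context ClassLoader can be illustrated with a small sketch. It is not CTP code: the class and method names are hypothetical, an empty URLClassLoader stands in for the JarClassLoader, and loadViaContext only imitates what a library like dcm4che might do internally. Unlike CTP's main method, which replaces the context ClassLoader for the life of the program, this sketch restores the previous one when the task finishes.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.Callable;

class ContextLoaderDemo {

  /** Run a task with the given context ClassLoader, then restore the old one. */
  static <T> T withContextLoader(ClassLoader cl, Callable<T> task) throws Exception {
    Thread t = Thread.currentThread();
    ClassLoader previous = t.getContextClassLoader();
    t.setContextClassLoader(cl);
    try {
      return task.call();
    }
    finally {
      t.setContextClassLoader(previous);
    }
  }

  /** Load a class the way a context-aware library might. */
  static Class<?> loadViaContext(String name) throws ClassNotFoundException {
    return Class.forName(name, true, Thread.currentThread().getContextClassLoader());
  }

  public static void main(String[] args) throws Exception {
    // An empty URLClassLoader stands in for the JarClassLoader here.
    ClassLoader standIn = new URLClassLoader(new URL[0]);
    Class<?> c = withContextLoader(standIn, () -> loadViaContext("java.util.ArrayList"));
    System.out.println(c.getName());
  }
}
```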

  //Load the class and instantiate it
  try {
    Class ctpClass = cl.loadClass(ctp);
    ctpClass.getConstructor( new Class[0] ).newInstance( new Object[0] );
  }
  catch (Exception unable) { unable.printStackTrace(); }

This instantiates the ClinicalTrialProcessor class. Note that when the program starts, the ApplicationClassLoader loads the main class, so it would be the one that would be used to load the class by an instruction like the following:

    new ClinicalTrialProcessor();

This would have the effect of making the ApplicationClassLoader the ClassLoader for the entire application. To ensure that the JarClassLoader becomes the ClassLoader for everything, it is necessary to load the ClinicalTrialProcessor class again, using the JarClassLoader directly.
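
The general pattern of loading a class by name through a chosen ClassLoader and then instantiating it reflectively can be sketched as follows. The class and method names are hypothetical, and java.util.ArrayList with the system ClassLoader are stand-ins chosen only so that the sketch runs anywhere; in CTP the loader is the JarClassLoader and the class is ClinicalTrialProcessor.

```java
class ReflectiveStart {

  /** Load a class by name through a specific ClassLoader and instantiate it. */
  static Object instantiate(ClassLoader cl, String className)
      throws ReflectiveOperationException {
    Class<?> c = cl.loadClass(className);
    // The class needs a public no-argument constructor.
    return c.getConstructor().newInstance();
  }

  public static void main(String[] args) throws Exception {
    ClassLoader cl = ClassLoader.getSystemClassLoader();
    Object o = instantiate(cl, "java.util.ArrayList");
    // Report which ClassLoader the new object's class knows as its own.
    System.out.println(o.getClass().getName()
        + " loaded by " + o.getClass().getClassLoader());
  }
}
```

The loader passed to instantiate is only the one that is asked first. Which loader ends up as the ClassLoader of the class depends on the delegation rules that loader follows.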

Important note: If the JarClassLoader obeyed the normal delegation rules, then it would call its parent ClassLoader (the ApplicationClassLoader, which is the default parent when none is passed to the constructor), and that would try to load the ClinicalTrialProcessor class. Since the ApplicationClassLoader had already loaded the class when the program was started, it would return the same Class object, which would have as its ClassLoader the ApplicationClassLoader. This would subvert the whole intent. Thus, it is imperative that the JarClassLoader use a reversed delegation model.

Parenthetic note: The JAR containing the main class is always implicitly on the classpath of the ApplicationClassLoader. Thus, the ApplicationClassLoader always knows about all the classes in the JAR containing the main class, and you could not depend on normal delegation to fail when loading the ClinicalTrialProcessor class even if the JAR is not explicitly on the classpath.

The org.rsna.util.JarClassLoader class

The java.net package has a URLClassLoader class which loads classes from an array of URLs pointing to JARs or directories. Like all standard ClassLoaders, it obeys the standard delegation rules.

The org.rsna.util.JarClassLoader class extends URLClassLoader and overrides the loadClass method to enforce a non-standard set of delegation rules. It also provides a static getInstance method to provide an API which doesn't require the caller to create URLs for each of the JARs.

The entire code of the class is shown below.

/*---------------------------------------------------------------
*  Copyright 2009 by the Radiological Society of North America
*
*  This source software is released under the terms of the
*  RSNA Public License (http://mirc.rsna.org/rsnapubliclicense)
*----------------------------------------------------------------*/

package org.rsna.util;

import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Enumeration;
import java.util.LinkedList;

/**
 * A ClassLoader that finds classes in an array of JARs,
 * giving precedence in delegation to the JARs.
 */
public class JarClassLoader extends URLClassLoader {

  /**
   * Get a JarClassLoader initialized to a set of files and directories.
   * This method places all individual files in the array of JARs.
   * It searches any directories to find all the files that end in ".jar"
   * (not case sensitive) and places them in the array as well.
   * @param files the array of File objects to include in the array of JARs.
   * The items in the array can be individual JAR files or directories.
   * @return a JarClassLoader initialized to the set of JARs found in the
   * files array.
   */
  public static JarClassLoader getInstance(File[] files) {
    LinkedList<URL> urlList = new LinkedList<URL>();
    for (int i=0; i<files.length; i++) {
      if (files[i].exists()) {
        if (files[i].isFile()) {
          try { urlList.add( files[i].toURL() ); }
          catch (Exception skip) {
            System.out.println("Unable to add file to classpath: "+files[i]);
          }
        }
        else if (files[i].isDirectory()) {
          File[] jars = files[i].listFiles();
          for (int k=0; k<jars.length; k++) {
            if (jars[k].getName().toLowerCase().endsWith(".jar")) {
              try { urlList.add( jars[k].toURL() ); }
              catch (Exception skip) {
                System.out.println("Unable to add file to classpath: "+jars[k]);
              }
            }
          }
        }
      }
    }
    URL[] urls = new URL[urlList.size()];
    urls = urlList.toArray(urls);
    return new JarClassLoader(urls);
  }

  public JarClassLoader(URL[] urls) {
    super(urls);
  }

  public JarClassLoader(URL[] urls, ClassLoader parent) {
    super(urls, parent);
  }

  protected synchronized Class loadClass(String classname, boolean resolve)
      throws ClassNotFoundException {

    Class theClass = findLoadedClass(classname);
    if (theClass != null) return theClass;

    //If it looks like a system class, try the parent first.
    if (classname.startsWith("java.") || classname.startsWith("javax.")) {
      try { theClass = findBaseClass(classname); }
      catch (ClassNotFoundException cnfe) {
        theClass = findClass(classname);
      }
    }

    //If it didn't look like a system class, then try the jars first.
    //This violates the normal delegation mechanism, but it is done
    //to ensure that this classloader becomes the classloader of all
    //classes that it can load, even those that could have been loaded
    //by the application classloader from the classpath.
    else {
      try { theClass = findClass(classname); }
      catch (ClassNotFoundException cnfe) {
        theClass = findBaseClass(classname);
      }
    }
    if (resolve) { resolveClass(theClass); }
    return theClass;
  }

  private Class findBaseClass(String name) throws ClassNotFoundException {
    return (getParent() == null) ? findSystemClass(name) : getParent().loadClass(name);
  }

}
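
For readers adapting the getInstance logic to a newer Java version, here is a hedged sketch of the same file-and-directory scan. It is not part of CTP: the class name JarUrls and its methods are hypothetical. It differs from the listing above in two ways: it converts files with toURI().toURL() because File.toURL() is deprecated, and it skips a directory whose listFiles() call returns null.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class JarUrls {

  /** Collect URLs for plain files and for *.jar files inside directories. */
  static URL[] collect(File... files) {
    List<URL> urls = new ArrayList<>();
    for (File f : files) {
      if (f.isFile()) {
        add(urls, f);
      }
      else if (f.isDirectory()) {
        File[] children = f.listFiles(); // null if the directory cannot be read
        if (children == null) continue;
        for (File child : children) {
          // Match the ".jar" extension without regard to case.
          if (child.getName().toLowerCase(Locale.ROOT).endsWith(".jar")) {
            add(urls, child);
          }
        }
      }
    }
    return urls.toArray(new URL[0]);
  }

  private static void add(List<URL> urls, File f) {
    try {
      // Going through a URI escapes characters that are illegal in URLs.
      urls.add(f.toURI().toURL());
    }
    catch (MalformedURLException skip) {
      System.out.println("Unable to add file to classpath: " + f);
    }
  }
}
```

The resulting array could be passed to either JarClassLoader constructor, for example new JarClassLoader(JarUrls.collect(new File("CTP.jar"), new File("libraries"))).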

Note that the loadClass method uses normal delegation (i.e., parent ClassLoader first) if the fully qualified name of the requested class begins with "java." or "javax.", which makes it appear to be a Java system class. This is for more than efficiency. If the parents didn't get first crack at Java system classes, it would be possible to use the ClassLoader to load classes that override the security features in Java.
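
The name test that selects the delegation order can be isolated in a few lines. This sketch repeats the prefix check from loadClass; the class name DelegationOrder, the helper names, and the third example class name are made up for the illustration.

```java
class DelegationOrder {

  /** The same prefix test that JarClassLoader.loadClass applies. */
  static boolean looksLikeSystemClass(String classname) {
    return classname.startsWith("java.") || classname.startsWith("javax.");
  }

  /** Name the source that would be searched first for a class name. */
  static String firstSource(String classname) {
    return looksLikeSystemClass(classname) ? "parent" : "jars";
  }

  public static void main(String[] args) {
    System.out.println(firstSource("java.lang.String"));                    // prints "parent"
    System.out.println(firstSource("org.rsna.ctp.ClinicalTrialProcessor")); // prints "jars"
    System.out.println(firstSource("javassist.CtClass"));                   // prints "jars"
  }
}
```

Because the prefixes include the trailing dot, a package whose name merely starts with the letters "java" (as in the third example) is still searched in the JARs first.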