This chapter provides an overview of many different technologies that comprise a typical Java and XSLT software development environment. Once the most commonly used tools are introduced, strategies for testing XSLT and tuning performance are presented. Instead of presenting specific performance benchmarks for various XSLT processors, this chapter's focus is on effective programming techniques that should be applicable to a wide range of tools. XSLT is a very young technology, and tools are improving all the time.
Specialized, lightweight development tools have never been more important to Java developers. Commercial integrated development environments (IDEs) are now only one small piece of a larger suite of essential tools used by a majority of Java development projects. These include build tools such as Ant, testing tools such as JUnit, and various XML parsers and XSLT processors. Figure 9-1 illustrates some of the tools found in a typical Java and XSLT development environment.
Although this development environment is typical, it involves a large number of tools to keep track of. Table 9-1 summarizes how each of these tools is used.
Tool | Description |
---|---|
Java 2 SDK | The Java 2 software development kit, i.e., the JDK. |
Apache's JMeter | A stress-testing tool, primarily used to test scalability and performance of servlets and web sites. |
EJB Container | Enterprise JavaBeans server, such as JBoss, Enhydra, WebLogic, or WebSphere. |
XML Parser | Xerces, Crimson, or another SAX and/or DOM parser. |
XSLT Processor | Xalan, SAXON, or any other XSLT processor. |
Servlet Container | Apache's Tomcat or any other servlet host. Many application servers include both servlet containers and EJB containers. |
JAXP | Provides a common API to XML parsers and XSLT processors. |
Apache's Ant | A Java replacement for make. Ant build files provide a consistent way for every member of the development team to compile and test code. |
IDE | An integrated development environment, such as Borland's JBuilder. |
JUnit | An open source unit testing framework. |
Some individual tools are much more powerful when used in the context of an overall development environment. JUnit is much more effective when used in combination with Ant, because Ant ensures that every developer on the team is compiling and testing with the same settings. This means that unit tests executed by one developer should work the same way for everyone else. Without Ant, unit tests that succeed for one developer may fail for others, since they may be using different versions of some tools.
The migration from first generation XML parsers and XSLT processors has been a somewhat painful experience for Java developers. Although the newer APIs are great, older JAR files linger throughout many applications and directories, causing more than their fair share of configuration difficulties. This section describes some of the most common problems and offers advice for configuring several popular tools.
A common complaint against Microsoft Windows systems is known as "DLL Hell." This refers to problems that occur when two applications require different versions of the same DLL file. Installing a new application may overwrite an older version of a DLL file that existing applications depend on, causing erratic behavior or outright system crashes.[45]
[45] Commonly referred to as the blue screen of death.
More frequently than ever before, Java developers must contend with incompatible JAR file versions. For instance, JAXP 1.0 and JAXP 1.1 both ship with a JAR file named jaxp.jar. Applications that require JAXP 1.1 functionality will fail if the 1.0 version of jaxp.jar is listed on the CLASSPATH earlier than the newer version. This happens more often than developers expect, because many commercial and open source development tools ship with XML parsers and XSLT processors. The installation routines for these tools may install JAR files without informing developers or asking for their consent.
The simple fix is to locate and remove old versions of JAR files. This is easier said than done, because in many cases (such as JAXP), the version number is not part of the JAR filename. Since many tools ignore or modify the CLASSPATH when they are executed, simply removing older JAR files from the CLASSPATH will not eradicate all problems. Instructions for fixing this problem in Ant, Tomcat, and JBuilder are coming up.
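Because identically named JAR files are easy to overlook, a small check can help locate them. The following is a minimal sketch, not part of any tool mentioned in this chapter: the class name DuplicateJars and its method are hypothetical, and it only compares filenames, so it cannot tell which copy is the newer one.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class DuplicateJars {

    /** Returns the JAR filenames that occur more than once on the given class path. */
    public static List<String> findDuplicates(String classPath, String separator) {
        // Count how often each JAR filename appears, preserving CLASSPATH order.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String entry : classPath.split(Pattern.quote(separator))) {
            String name = new File(entry).getName();
            if (name.toLowerCase().endsWith(".jar")) {
                counts.merge(name, 1, Integer::sum);
            }
        }
        List<String> duplicates = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > 1) {
                duplicates.add(e.getKey());
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Inspect the class path of the running VM.
        String cp = System.getProperty("java.class.path");
        for (String name : findDuplicates(cp, File.pathSeparator)) {
            System.out.println("Listed more than once: " + name);
        }
    }
}
```

Running the class with the same CLASSPATH as the failing application reports names such as jaxp.jar that appear in more than one directory. It only sees the CLASSPATH itself, so JAR files that a tool adds on its own will not be reported.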
Some JAR files are beginning to include version information inside of the META-INF/MANIFEST.MF file. This is called the manifest and can be extracted with the following command, where filename.jar is the name of the JAR file:
jar -xf filename.jar META-INF/MANIFEST.MF
Once extracted, the manifest can be viewed with any text editor. Example 9-1 shows the content of the manifest from Version 1.0 of jaxp.jar:
Manifest-Version: 1.0
Specification-Title: Java API for XML Parsing Interfaces
Specification-Vendor: Sun Microsystems
Created-By: 1.2.2 (Sun Microsystems Inc.)
Specification-Version: 1.0.0

Name: javax/xml/parsers
Package-Version: 1.0.0
Specification-Title: Java API for XML Parsing
Specification-Vendor: Sun Microsystems
Sealed: true
Specification-Version: 1.0.0
Package-Vendor: Sun Microsystems, Inc.
Package-Title: javax.xml.parsers
This manifest makes it quite easy to identify the contents of this JAR file. Although Sun's products tend to be very good about this, the manifest contents are entirely optional, and many other products omit all manifest information.
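The same information can also be read programmatically with the standard java.util.jar classes, which avoids extracting the file by hand. This is a rough sketch under a few assumptions: the class name ManifestInfo is hypothetical, only three headers are reported, and only the main section of the manifest is examined, so per-package entries such as the Sealed header in Example 9-1 are not shown.

```java
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

public class ManifestInfo {

    /** Summarizes a few version-related headers from the manifest's main section. */
    public static String describe(Manifest manifest) {
        if (manifest == null) {
            return "(no manifest)";
        }
        Attributes main = manifest.getMainAttributes();
        StringBuilder sb = new StringBuilder();
        // Headers that are absent from the manifest are reported as null.
        for (String key : new String[] {
                "Specification-Title", "Specification-Version", "Sealed"}) {
            sb.append(key).append(": ").append(main.getValue(key)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a JAR file, for example jaxp.jar
        try (JarFile jar = new JarFile(args[0])) {
            System.out.print(describe(jar.getManifest()));
        }
    }
}
```

Invoked as java ManifestInfo jaxp.jar, this prints whatever the JAR's author chose to record. For products that omit manifest information, every value is simply null.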
The dreaded "sealing violation" is one of the more cryptic exceptions encountered. Example 9-2 shows a stack trace that is displayed when a sealing violation occurs.
Exception in thread "main" java.lang.SecurityException: sealing violation
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:234)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:56)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:297)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:286)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:253)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:313)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:120)
    at javax.xml.transform.TransformerFactory.newInstance(TransformerFactory.java:117)
    at Test.main(Test.java:17)
This exception is hard to diagnose because the error message is not very descriptive, and the stack trace consists mostly of internal Java classes. According to the stack trace, line 17 of Test.java caused the problem. Here it is:
TransformerFactory transFact = TransformerFactory.newInstance( );
Actually, this line of code is perfectly correct. The problem lies in the CLASSPATH instead. The key to understanding this error is the sealing violation description. This indicates that one or more sealed JAR files are on the CLASSPATH in the wrong order.
A sealed JAR file has a manifest entry Sealed: true, as shown in Example 9-1.[46] The package sealing mechanism was introduced in Java Version 1.2 to enforce version consistency. Whenever a package is sealed, all classes in that package must be loaded from the same JAR file. If some of the classes are loaded from one JAR file and others from another, an instance of java.lang.SecurityException is thrown. Figure 9-2 illustrates the problem.
[46] It is also possible to seal individual packages within a JAR file. Refer to the Java 2 Standard Edition documentation for more information.
In this diagram, parser.jar is listed on the CLASSPATH before crimson.jar. This is a problem because Java applications search JAR files in the order in which they appear on the CLASSPATH. Once the org.xml.sax.SAXException class has been loaded by the JVM, any additional classes or interfaces in the org.xml.sax package must be loaded from parser.jar because it is sealed. When the application requests an instance of ContentHandler, the class loader attempts to load the requested class file from crimson.jar, which triggers the SecurityException shown in Example 9-2. The simple fix to this problem is to remove parser.jar from the CLASSPATH, which will load all classes in the org.xml.sax package from crimson.jar.
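When it is unclear which JAR files contribute classes to a sealed package, it can help to ask the class loader where it finds a given class. The following sketch uses the standard ClassLoader.getSystemResources method; the class name WhereIs is hypothetical, and the order of the results typically, but not necessarily, mirrors the search order used when the class is loaded.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class WhereIs {

    /** Returns every location from which the system class loader can see the class. */
    public static List<URL> locate(String className) {
        // A class is stored as a resource named after its package path.
        String resource = className.replace('.', '/') + ".class";
        try {
            return Collections.list(ClassLoader.getSystemResources(resource));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Example argument: org.xml.sax.SAXException
        for (URL url : locate(args[0])) {
            System.out.println(url);
        }
    }
}
```

If a class such as org.xml.sax.SAXException is reported in more than one JAR file, those JAR files are candidates for the kind of conflict described above.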
Other "configuration" exceptions defined by JAXP are javax.xml.transform.TransformerConfigurationException and javax.xml.parsers.FactoryConfigurationError. These may occur when an older version of jaxp.jar is still listed on the CLASSPATH. Since JAXP 1.0 is not aware of SAX 2, DOM 2, or XSLT transformations, applications requesting any of these new features may see one of these exceptions when JAXP 1.0 is installed instead of JAXP 1.1.
As mentioned earlier, the filename jaxp.jar is used with Versions 1.0 and 1.1 of JAXP. Therefore, special care must be taken to ensure that the newer copy is present instead of the old one. Since JAXP 1.1 is backwards-compatible with Version 1.0, the older version can be safely replaced without breaking currently installed applications that depend on it.
The easiest exception to debug is java.lang.ClassNotFoundException. This may occur when JAXP 1.1 is listed on the CLASSPATH but an XSLT processor or XML parser is not. To remedy this situation, merely add a JAXP-compliant parser and XSLT processor to the CLASSPATH.
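A quick way to confirm that a parser and processor are actually installed is to ask the JAXP 1.1 factories which implementation classes they select. This is a minimal sketch using only standard JAXP factory methods; the class name JaxpReport and the report format are assumptions made for illustration.

```java
import java.util.function.Supplier;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

public class JaxpReport {

    /** Returns the implementation class name, or a note if no implementation is found. */
    private static String name(Supplier<Object> factory) {
        try {
            return factory.get().getClass().getName();
        } catch (FactoryConfigurationError | TransformerFactoryConfigurationError e) {
            // Thrown by newInstance() when no usable implementation can be located.
            return "not available (" + e.getMessage() + ")";
        }
    }

    /** Lists the SAX, DOM, and XSLT implementations that JAXP selects. */
    public static String report() {
        return "SAX: " + name(SAXParserFactory::newInstance) + "\n"
             + "DOM: " + name(DocumentBuilderFactory::newInstance) + "\n"
             + "XSLT: " + name(TransformerFactory::newInstance) + "\n";
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

The package names in the output usually make it obvious whether Crimson, Xerces, Xalan, SAXON, or some other product was picked up, which in turn hints at which JAR file won the CLASSPATH search.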
The Java VM does not simply load classes based on the CLASSPATH environment variable. Before searching the CLASSPATH, the VM attempts to load classes from an optional package directory. An installed optional package is a JAR file located in the Java 2 Runtime Environment's lib/ext directory or in the jre/lib/ext directory of the Java 2 SDK.
If an installed optional package is not located, the VM then searches for download optional packages. These are JAR files that are explicitly referenced by the Class-Path manifest header of another JAR file. For example, a manifest might contain the following line:
Class-Path: jaxp.jar
In this case, the VM would look for jaxp.jar in the same directory as the JAR file that contains the manifest entry.
The best way to ensure that the correct version of XML parser, XSLT processor, and JAXP are installed is to manually copy the required JAR files to the installed optional package directory. Software developers should have the Java 2 SDK installed and should place JAR files in the JAVA_HOME/jre/lib/ext directory. End users, however, will probably use the Java 2 Runtime Environment instead of the entire SDK. For these users, the JAR files can be installed to the lib/ext directory where the JRE is installed.
To uninstall a Java optional package, merely delete the JAR file from the appropriate directory.
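To see which installed optional packages a particular VM will consult, the extension directories can be listed from Java. The sketch below assumes the VM exposes them through the java.ext.dirs system property, as the Java 2 platform described here does; later Java releases (9 and newer) removed the extension mechanism, in which case the property is absent and the list is empty. The class name ExtDirs is hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ExtDirs {

    /** Lists JAR files in the given path-separated list of extension directories. */
    public static List<File> listJars(String extDirs) {
        List<File> jars = new ArrayList<>();
        if (extDirs == null) {
            return jars; // property not defined on this VM
        }
        for (String dir : extDirs.split(Pattern.quote(File.pathSeparator))) {
            // listFiles returns null when the directory does not exist.
            File[] found = new File(dir).listFiles((d, n) -> n.endsWith(".jar"));
            if (found != null) {
                jars.addAll(Arrays.asList(found));
            }
        }
        return jars;
    }

    public static void main(String[] args) {
        for (File jar : listJars(System.getProperty("java.ext.dirs"))) {
            System.out.println(jar);
        }
    }
}
```

Any jaxp.jar or parser.jar that shows up in this list is loaded ahead of the CLASSPATH, so it is the first place to look when a newer version seems to be ignored.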
Many developers use tools such as Borland's JBuilder as Java development environments. These tools can introduce problems because they typically include a copy of the Java 2 SDK. When running and compiling within the IDE, the VM uses the tool's own Java directory rather than the Sun Java 2 SDK that is probably already installed elsewhere on the system. Figure 9-3 is a typical directory structure that illustrates this potential problem.
In this example, JBuilder is properly configured with JAXP 1.1, the Crimson XML parser, and the Xalan 2.0 JAR file. This means that compilation, running, and debugging will all work properly within the JBuilder IDE. But once the application is executed outside of JBuilder, it will probably fail. This is because the Java 2 SDK contains the older JAXP 1.0 JAR file and its older XML parser.
Merely adding the newer JAXP-related JAR files to the CLASSPATH will probably introduce a sealing exception rather than fix the problem, because the VM will still load files from the installed optional package directory before searching the CLASSPATH. One way to fix this problem is to replace jaxp.jar and parser.jar with the same JAR files found in JBuilder's directory. Another option is to update the JAVA_HOME environment variable and PATH to point to JBuilder's version of Java.
Configuring JAR files and the CLASSPATH on a single developer's machine can be difficult; keeping an entire team of developers in sync requires support from tools. For this reason, it is critical that every team member use the same build process when testing and integrating code changes. In this section, we take a brief look at Apache's Ant.
As discussed in Chapter 3, "XSLT Part 2 -- Beyond the Basics", Apache's Ant is a Java-based build tool that is an excellent alternative to make.[47] Ant is good for numerous reasons, including:
[47] These notes apply to Ant 1.3. Newer versions of Ant may handle JAR files differently.
Its XML build files are easier to create than Makefiles.
It is written in Java and is quite portable.
Builds are extremely fast because the same VM instance is used for most steps in the build process.
Ant can be acquired from http://jakarta.apache.org/ant.[48]
[48] The original author of Ant is working on a new Java build tool called Amber, available at http://www.xiyo.org/Amber.
To install Ant, simply download the binary distribution and uncompress to a convenient directory. Then set the ANT_HOME environment variable to point to this directory and JAVA_HOME to point to the Java installation directory. To test, type ant -version. This should display something similar to the following:
Ant version 1.3 compiled on March 2 2001
Since Ant is written in Java, care must be taken to avoid conflicts with Ant's JAR files and JAR files found in the system CLASSPATH. This is a particular concern when using Ant to drive the XSLT transformation process because Ant ships with JAXP 1.0 JAR files that are not compatible with newer JAXP 1.1 implementations.
Once Ant is installed, update ANT_HOME/lib/jaxp.jar and ANT_HOME/lib/parser.jar, which are part of the older JAXP 1.0 reference implementation. Any JAR files added to the ANT_HOME/lib directory are automatically added to Ant's CLASSPATH and will be seen by the various Ant tasks during the build process. Simply adding JAXP 1.1-compatible JAR files to the ANT_HOME/lib directory will prevent most conflicts with newer applications that require DOM 2, SAX 2, or support for XSLT transformations.
The best way to learn about Ant is to download it, read the first part of the user manual, and then study several example build files. Example 9-3 presents one such build file, which can be used to compile some of the example code in this chapter as well as perform an XSLT transformation.
<?xml version="1.0"?>
<!-- *******************************************************
  ** Example Ant build file as shown in Chapter 9,
  ** "Development Environment, Testing, and Performance".
  **
  ** Assumes the following directory structure:
  **   examples
  **     +-chapters
  **     |   +-chap9
  **     |        build.xml (this file)
  **     |        aidan.xml (example XML file)
  **     |        condensePerson.xslt (example XSLT file)
  **     +-common
  **     |   +-src
  **     |        +-com/oreilly/javaxslt/swingtrans/...
  **     |
  **     +-build (created by this build file)
  **
  ****************************************************-->
<project name="chap9" default="help" basedir="../..">

  <!-- *******************************************************
    ** Global properties.
    ****************************************************-->
  <property name="builddir" value="build"/>

  <path id="thirdparty.class.path">
    <pathelement path="lib/saxon_6.2.2.jar"/>
    <pathelement path="lib/jaxp_1.1.jar"/>
    <pathelement path="lib/servlet_2.3.jar"/>
    <pathelement path="lib/junit_3.5.jar"/>
    <pathelement path="lib/jdom_beta6.jar"/>
  </path>

  <!-- *******************************************************
    ** Create the output directory structure.
    ****************************************************-->
  <target name="prepare">
    <mkdir dir="${builddir}"/>
  </target>

  <!-- *******************************************************
    ** Show a brief usage message. This is the default
    ** target, and shows up when the user types "ant"
    ****************************************************-->
  <target name="help" description="Show a brief help message">
    <echo message="Chapter 9, &quot;Development Environment, Testing, and Performance&quot; Example Ant Build File"/>
    <echo message="Type 'ant -projecthelp' for more assistance..."/>
  </target>

  <!-- ********************************************************
    ** Remove the entire build directory
    *****************************************************-->
  <target name="clean" description="Remove all generated code">
    <delete dir="${builddir}"/>
  </target>

  <!-- ********************************************************
    ** Compile the com.oreilly.javaxslt.swingtrans package
    *****************************************************-->
  <target name="compile" depends="prepare"
          description="Compile the SwingTransformer application">
    <javac srcdir="common/src" destdir="${builddir}"
           includes="com/oreilly/javaxslt/swingtrans/**">
      <classpath refid="thirdparty.class.path"/>
    </javac>
  </target>

  <!-- ********************************************************
    ** Run com.oreilly.javaxslt.swingtrans.SwingTransformer
    *****************************************************-->
  <target name="run" depends="compile"
          description="Run the SwingTransformer application">
    <java fork="yes"
          classname="com.oreilly.javaxslt.swingtrans.SwingTransformer">
      <classpath>
        <pathelement path="${builddir}"/>
      </classpath>
      <classpath refid="thirdparty.class.path"/>
    </java>
  </target>

  <!-- ********************************************************
    ** Performs an XSLT transformation. If either the XML
    ** file or the XSLT stylesheet change, the transformation
    ** is performed again.
    **
    ** basedir - specifies the location of the XSLT
    ** destdir - a required attribute, however Ant 1.3 is
    **           ignoring this. The messages on the console
    **           indicate that the destdir is being used,
    **           however it was found that the "out"
    **           attribute also has to specify the output
    **           directory.
    *****************************************************-->
  <target name="transform" description="Perform an XSLT transformation">
    <style processor="trax" basedir="chapters/chap9"
           destdir="${builddir}"
           style="condensePerson.xslt"
           in="chapters/chap9/aidan.xml"
           out="${builddir}/aidan_condensed.xml">
      <!-- pass a stylesheet parameter -->
      <param name="includeMiddle" expression="yes"/>
    </style>
  </target>
</project>
All Ant build files are XML and have a <project> root element. This specifies the default target, as well as the base directory. Each of the targets is specified using <target> elements, which can have dependencies on each other. Targets, in turn, contain tasks, which are responsible for performing individual units of work.
The CLASSPATH used by various tasks can be defined once and reused throughout the build file. The <path> element is emphasized in Example 9-3, including several JAR files from the lib directory. For instance:
<pathelement path="lib/servlet_2.3.jar"/>
This illustrates two key points about defining a consistent development environment. First, it is a good idea to rename JAR files to include version numbers. This is a great way to avoid conflicts and unexpected errors, because different versions of most tools use the same filenames for JAR files. By renaming them, it is easier to keep track of what is installed on the system. The only drawback to this approach is that build files must be manually updated whenever new versions of JAR files are installed.
Second, this particular Ant build file defines its own CLASSPATH, rather than relying on the developer's CLASSPATH. Relying on the CLASSPATH environment variable introduces problems because each developer on a team may have a completely different set of JAR files defined in his environment. By encoding everything in the Ant build file, everyone will compile and test with the same setup.
The following target shows how the build file compiles the application:
<target name="compile" depends="prepare"
        description="Compile the SwingTransformer application">
  <javac srcdir="common/src" destdir="${builddir}"
         includes="com/oreilly/javaxslt/swingtrans/**">
    <classpath refid="thirdparty.class.path"/>
  </javac>
</target>
To execute this target, simply type ant compile from the command prompt. Since this target depends on the prepare target, the build directory will be created before the code is compiled. Fortunately, the <javac> task is smart enough to compile only source code files that have changed since the last build, making Ant much faster than manually typing javac *.java.
The srcdir and destdir attributes are relative to the basedir that was specified in the <project> element. Since Ant always uses forward slashes (/) as path separators, these relative directories will work on Windows and Unix/Linux systems. As you might guess, the includes attribute defines a filter that limits which files are included in the build.
The last target in this build file performs an XSLT transformation using Ant's <style> task, which is described next.
Of particular interest to XSLT developers is Ant's <style> task. This is a core task that performs one or more XSLT transformations. Ant's JAXP JAR files must be updated as described earlier for this task to work. Here is a simple example of this task:
<style basedir="." destdir="." style="sample.xslt" processor="trax" in="company.xml" out="report.txt"/>
This will look in the project's base directory for the specified XML and XSLT files, placing the output into report.txt. The processor is trax, which means the same thing as JAXP 1.1. Ant will use the first JAXP-compliant processor found on the CLASSPATH. Table 9-2 lists the complete set of attributes for the style task.
Attribute | Description | Required? |
---|---|---|
basedir | The directory where XML files are located. | yes |
destdir | The directory where the result tree should be placed. | yes |
extension | The default filename extension for the result of the transformation(s). | no |
style | The XSLT stylesheet filename. | yes |
processor | Specifies which XSLT processor is used. Legal values are "trax" for a TrAX-compliant processor, "xslp" for the XSL:P processor, and "xalan" for Xalan Version 1.x. May also contain the name of a class that implements org.apache.tools.ant.taskdefs.XSLTLiaison. When omitted, defaults to "trax." | no |
includes | The comma-separated list of file patterns to include. | no |
includesfile | The name of a file that contains include patterns. | no |
excludes | The comma-separated list of file patterns to exclude. | no |
excludesfile | The name of a file that contains exclude patterns. | no |
defaultexcludes | May be "yes" or "no," defaults to "yes." | no |
in | A single XML file input. | no |
out | A single output filename. | no |
The pattern attributes, such as includes and excludes, work just like other patterns in Ant. Basically, these allow the task to filter which files are included and excluded from the transformations. When omitted, all files in the base directory are included. Here is how an entire directory of XML files can be transformed:
<style basedir="xmlfiles" includes="*.xml" destdir="build/doc" style="report.xslt" extension="html"/>
As shown back in Example 9-3, parameters can be passed using nested <param> elements. This element has required name and expression attributes:
<style basedir="xmlfiles" includes="*.xml" destdir="build/doc"
       style="report.xslt" extension="html">
  <param name="reportType" expression="'detailed'"/>
</style>
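For comparison, the same kind of parameterized transformation can be performed directly through the JAXP 1.1 API, which is what the trax processor setting relies on. The following is only an illustrative sketch: the class name ParamTransform, the inline stylesheet, and the sample XML are all hypothetical, chosen so that the example is self-contained rather than reading files from disk.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ParamTransform {

    // Hypothetical stylesheet: prints the reportType parameter and the company name.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:param name='reportType' select=\"'summary'\"/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select='$reportType'/>:"
      + "<xsl:value-of select='/company/name'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Transforms the XML text, overriding the stylesheet's reportType parameter. */
    public static String transform(String xml, String reportType) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(XSLT)));
            // The value is passed as a Java object, so no XPath quoting is needed here.
            transformer.setParameter("reportType", reportType);
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<company><name>Acme</name></company>";
        System.out.println(transform(xml, "detailed")); // prints detailed:Acme
    }
}
```

Code like this is useful for unit tests that verify a stylesheet's behavior, while the Ant task remains the more convenient choice for transforming whole directories during a build.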
Apache's Tomcat is a Servlet and JSP container and has been mentioned throughout this book. It is available from http://jakarta.apache.org/tomcat. Tomcat is fairly easy to install and configure:
Download the latest Tomcat release build for your operating system.
Uncompress the distribution to a directory.
Set the TOMCAT_HOME environment variable to point to this directory.
Set the JAVA_HOME environment variable to point to your Java distribution.
Since web applications are required to read configuration information from their XML deployment descriptors (web.xml), all current versions of Tomcat ship with an XML parser.
Tomcat 3.2.x includes several JAR files in its $TOMCAT_HOME/lib directory. Among these are jaxp.jar and parser.jar, which support JAXP Version 1.0 along with a SAX 1.0 and DOM 1.0 XML parser. Any JAR file added to the lib directory becomes available to every web application. Tomcat uses a simple script to locate *.jar in the lib directory, adding each JAR file to the CLASSPATH as it is encountered. The order of inclusion depends on how the operating system lists files, which is generally alphabetical. The complete CLASSPATH used by Tomcat 3.2.x includes the following, in this order:
$TOMCAT_HOME/classes
$TOMCAT_HOME/lib/*.jar
Any existing CLASSPATH
$JAVA_HOME/jre/lib/tools.jar
Although the lib directory provides a convenient way to install utility code that all web applications must use, conflicts arise when individual applications require different versions of SAX, DOM, or JAXP. If Tomcat finds an older version of one of these tools before it finds a newer version, exceptions typically occur. For instance, a sealing violation exception may occur if the existing CLASSPATH contains the newer crimson.jar, but an older version of parser.jar is still present.
The best approach to fully configure Tomcat 3.2.x for XML support is as follows:
Remove jaxp.jar and parser.jar from the $TOMCAT_HOME/lib directory.
Install the following files from the JAXP 1.1 distribution into the $TOMCAT_HOME/lib directory: jaxp.jar, crimson.jar, and xalan.jar.
Of course, JAXP 1.1 supports other tools besides Crimson and Xalan. If you prefer, simply replace crimson.jar and xalan.jar with competing products that are JAXP 1.1-compatible.
Tomcat 4.0 improves upon Tomcat 3.2.x configuration issues in two key ways. First, the user's existing CLASSPATH is no longer appended to Tomcat's CLASSPATH. This helps to avoid situations where code works for one developer (who happens to have some critical file on her CLASSPATH) but fails for other developers who have slightly different personal CLASSPATH configurations.
Secondly, Tomcat 4.0 no longer places JAXP JAR files in a location visible to web applications. This means that if XML support is required, you must install the proper XML JAR files before anything will work. This is far better than the old Tomcat model, because it avoids unexpected collisions with XML libraries used internally by Tomcat. Instead, if you forget to install XML support, you simply see a java.lang.NoClassDefFoundError.
To install XML support into Tomcat 4.0, simply install the required JAR files into the $TOMCAT_HOME/lib directory. These will then be available to all web applications. The other option is to install JAR files into the WEB-INF/lib directory of individual web applications. With this approach, each application can use different versions of various packages without fear of conflicts.
Copyright © 2002 O'Reilly & Associates. All rights reserved.