Extensible Markup Language (XML) has taken off in the last few years as a generic format for storing information. XML looks much like HTML, with a similar combination of elements and attributes for marking up text, but it lets developers create their own vocabularies. Some XML is shared directly over the Web; some XML is used by web services applications; and some XML is used as a foundation for web sites that need to present information in multiple forms. Serving XML documents is just like serving any other files in Apache, requiring only putting the files up and setting a MIME type identifier for them. Web services generally require the installation of modules specific to a particular web-service protocol, which then act as a gateway between the web server and application logic elsewhere on the computer.
The last option — using XML as a foundation for information the Apache server needs to be able to present in multiple forms — is growing more common and fits well in more typical web-server applications. In this case, XML typically provides a format for storing information separate from its presentation details. When the Apache server gets a request for a particular file, say in HTML, it passes it to a tool that deals with the XML. That tool typically loads the XML document, generates a file in the format requested, and passes it back to Apache, which then transmits it to the user. (The XML processor may pull the file from a cache if the file has been requested previously.) If a site is only serving up HTML files, all this extra work is probably unnecessary, but sites that provide HTML, PDF, WML (Wireless Markup Language), and plain-text versions of the same content will likely find this approach very useful. Even sites that offer multiple HTML renditions of the same information may find this approach easier than managing multiple files.
Most commonly, the transformation between the original XML document and the result the user wants is defined using Extensible Stylesheet Language Transformations (XSLT). Developers use XSLT to create templates that define the production of result documents from original XML documents, and these templates can generally be applied to many originals to produce many results.
Making this work on Apache requires adding some parts that support XSLT and manage the caching process. Chapter 19 will explore Cocoon, a Java-based sub-project of the Apache Project that is widely used for this work. Perl devotees may want to explore AxKit, another Apache project that does similar work in Perl. (For a complete list of XML-related projects at Apache, visit http://xml.apache.org/.)
XML and XSLT are subjects that go well beyond the scope of this book. Chapter 19 will provide a brief introduction, but you may also want to explore Learning XML by Erik Ray (O'Reilly, 2001), XSLT by Doug Tidwell (O'Reilly, 2001), and XML in a Nutshell by Elliotte Rusty Harold and Scott Means (O'Reilly, 2002).
Copyright © 2003 O'Reilly & Associates. All rights reserved.