Using a good publishing framework like Cocoon doesn't require any special instruction; it is not a complex application that users must learn to adapt to. In fact, all Cocoon's uses are based on simple URLs entered into a standard web browser. Generating dynamic HTML from XML, viewing XML transformed into PDF files, and even generating VRML applications from XML is simply a matter of typing the URL to the desired XML file into your browser and watching Cocoon and the power of XML take action.
Now that your framework is in place and is correctly handling requests ending in .xml, we begin to see it publish our XML files. Cocoon comes with several sample XML files and associated XSL stylesheets in the project's samples/ subdirectory. However, you have your own XML and XSL from earlier chapters by now, so let's transform the XML table of contents for this book (contents.xml) with the XSL stylesheet (JavaXML.html.xsl), both from Chapter 2, "Nuts and Bolts". Locate where you saved the XML file, and copy it into Cocoon's document root, webapps/cocoon/. The document refers to the stylesheet XSL/JavaXML.html.xsl. Create the XSL/ directory in your web document root, and copy the stylesheet into that directory. The XML document also references a DTD; you will need to either comment that out, or create a DTD/ directory and copy the JavaXML.dtd file, also from Chapter 2, "Nuts and Bolts", into that directory.
Once you have the XML document and its stylesheet in place, you can access it with the URL http://<hostname>:<port>/cocoon/contents.xml in your web browser. Assuming you followed the earlier instructions to get Cocoon running, the transformed XML should look like Figure 10-3.
This should be almost trivial; once Cocoon is set up and configured, serving up dynamic content is a piece of cake! The mapping from XML extensions to Cocoon works for any requests within the context in which you set up Cocoon.
In the discussions concerning using XML for presentation, I've focused on XML converted to HTML. However, that's just scratching the surface of formats that XML can be converted to. Not only is a variety of markup languages supported as final document formats, but in addition, Java provides libraries for converting XML to some non-markup-based formats. The most popular and stable library in this category is the Apache XML group's Formatting Objects Processor, FOP. This gives Cocoon or any other publishing framework the ability to turn XML documents into Portable Document Format (PDF) documents, which are generally viewed with Adobe Acrobat (http://www.adobe.com).
The importance of converting a document from XML into a PDF cannot be overstated; particularly for document-driven web sites, such as print media or publishing companies, it could revolutionize web delivery of data. Consider the following XML document, an XML-formatted excerpt from this chapter, shown in Example 10-1.
<?xml version="1.0"?>
<?cocoon-process type="xslt"?>
<?xml-stylesheet href="XSL/JavaXML.fo.xsl" type="text/xsl"?>

<book>
  <cover>
    <title>Java and XML</title>
    <author>Brett McLaughlin</author>
  </cover>
  <contents>
    <chapter title="Web Publishing Frameworks" number="10">
      <paragraph> This chapter begins looking at specific Java and XML topics. So far, I have covered the basics of using XML from Java, looking at the SAX, DOM, JDOM, and JAXP APIs to manipulate XML and the fundamentals of using and creating XML itself. Now that you have a grasp on using XML from your code, I want to spend time on specific applications. The next six chapters represent the most significant applications of XML, and, in particular, how those applications are implemented in the Java space. While there are literally thousands of important applications of XML, the topics in these chapters are those that continually seem to be in the spotlight, and that have a significant potential to change the way traditional development processes occur. </paragraph>
      <sidebar title="The More Things Change, the More They Stay the Same"> Readers of the first edition of this book will find that much of this chapter on Cocoon is the same as the first edition. Although I promised you that Cocoon 2 would be out by now, and although I expected to be writing a chapter on Cocoon 2, things haven't progressed as quickly as expected. Stefano Mazzochi, the driving force behind Cocoon, finally got around to finishing school (good choice, Stefano!), and so development on Cocoon 2 has significantly slowed. The result is that Cocoon 1.x is still the current development path, and you should stick with it for now. I've updated the section on Cocoon 2 to reflect what is coming, and you should keep an eye out for more Cocoon-related books from O'Reilly in the months to come.</sidebar>
      <paragraph> I'll begin this look at hot topics with the one XML application that seems to have generated the largest amount of excitement in the XML and Java communities: the web publishing framework. Although I have continually emphasized that generating presentation from content is perhaps over-hyped when compared to the value of the portable data that XML provides, using XML for presentation styling is still very important. This importance increases when looking at web-based applications.</paragraph>
    </chapter>
  </contents>
</book>
You saw how an XSL stylesheet allows you to transform this document into HTML. But converting an entire chapter of a book into HTML could result in a gigantic HTML document, and certainly an unreadable format; potential readers wanting online delivery of a book generally prefer a PDF document. On the other hand, generating PDF statically from the chapter means that changes to the chapter must be matched with subsequent PDF file generation. Keeping a single XML document format means the chapter can be easily updated (with any XML editor), formatted into SGML for printing hard copy, transferred to other companies and applications, and included in other books or compendiums. Now add the ability for web users to type in a URL and access the book in PDF format to this robust set of features, and you have a complete publishing system.
Although I don't cover formatting objects and the FOP for Java libraries in detail, you can review the entire formatting objects definition within the XSL specification at the W3C at http://www.w3.org/TR/xsl/. Example 10-2 is an XSL stylesheet that uses formatting objects to specify a transformation from XML to a PDF document, appropriate for the XML version of this chapter.
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <xsl:template match="book">
    <xsl:processing-instruction name="cocoon-format">
      type="text/xslfo"
    </xsl:processing-instruction>
    <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
      <fo:layout-master-set>
        <fo:simple-page-master master-name="right"
                               margin-top="75pt" margin-bottom="25pt"
                               margin-left="100pt" margin-right="50pt">
          <fo:region-body margin-bottom="50pt"/>
          <fo:region-after extent="25pt"/>
        </fo:simple-page-master>
        <fo:simple-page-master master-name="left"
                               margin-top="75pt" margin-bottom="25pt"
                               margin-left="50pt" margin-right="100pt">
          <fo:region-body margin-bottom="50pt"/>
          <fo:region-after extent="25pt"/>
        </fo:simple-page-master>
        <fo:page-sequence-master master-name="psmOddEven">
          <fo:repeatable-page-master-alternatives>
            <fo:conditional-page-master-reference master-name="right" page-position="first"/>
            <fo:conditional-page-master-reference master-name="right" odd-or-even="even"/>
            <fo:conditional-page-master-reference master-name="left" odd-or-even="odd"/>
            <!-- recommended fallback procedure -->
            <fo:conditional-page-master-reference master-name="right"/>
          </fo:repeatable-page-master-alternatives>
        </fo:page-sequence-master>
      </fo:layout-master-set>
      <fo:page-sequence master-name="psmOddEven">
        <fo:static-content flow-name="xsl-region-after">
          <fo:block text-align-last="center" font-size="10pt">
            <fo:page-number/>
          </fo:block>
        </fo:static-content>
        <fo:flow flow-name="xsl-region-body">
          <xsl:apply-templates/>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

  <xsl:template match="cover">
    <fo:block font-size="10pt" space-before.optimum="10pt">
      <xsl:value-of select="title"/> (<xsl:value-of select="author"/>)
    </fo:block>
  </xsl:template>

  <xsl:template match="contents">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="chapter">
    <fo:block font-size="24pt" text-align-last="center" space-before.optimum="24pt">
      <xsl:value-of select="@number" />. <xsl:value-of select="@title" />
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <xsl:template match="paragraph">
    <fo:block font-size="12pt" space-before.optimum="12pt" text-align="justify">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <xsl:template match="sidebar">
    <fo:block font-size="14pt" font-style="italic" color="blue"
              space-before.optimum="16pt" text-align="center">
      <xsl:value-of select="@title" />
    </fo:block>
    <fo:block font-size="12pt" color="blue"
              space-before.optimum="16pt" text-align="justify">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
If you create both of these files, saving the chapter as chapterTen.xml, and the XSL stylesheet as JavaXML.fo.xsl within a subdirectory called XSL/, you can see the result of the transformation in a web browser. Make sure you have the Adobe Acrobat Reader and plug-in for your web browser, and then access the XML document just created. Figure 10-4 shows the results.
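If you want to inspect the formatting objects before they are rendered, the XSLT half of this pipeline can be tried outside Cocoon with the JAXP transformation API bundled with the JDK. The following is only a minimal sketch under stated assumptions: the class name FoTransformSketch is hypothetical, the document and stylesheet are cut-down inline stand-ins for chapterTen.xml and JavaXML.fo.xsl (the page layout is omitted, so the result is not a complete FO document), and the step in which FOP turns the formatting objects into PDF is not shown.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of Cocoon or FOP: it runs only the XSLT step
// that turns a document into formatting objects (FO).
class FoTransformSketch {

    // Cut-down stand-in for chapterTen.xml.
    static final String XML =
        "<book><contents>"
      + "<chapter title='Web Publishing Frameworks' number='10'>"
      + "<paragraph>Sample paragraph.</paragraph>"
      + "</chapter></contents></book>";

    // Cut-down stand-in for JavaXML.fo.xsl (page masters and sequences omitted).
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "<xsl:template match='book'>"
      + "<fo:root><xsl:apply-templates/></fo:root>"
      + "</xsl:template>"
      + "<xsl:template match='chapter'>"
      + "<fo:block font-size='24pt'><xsl:value-of select='@number'/>. "
      + "<xsl:value-of select='@title'/></fo:block>"
      + "<xsl:apply-templates/>"
      + "</xsl:template>"
      + "<xsl:template match='paragraph'>"
      + "<fo:block font-size='12pt'><xsl:apply-templates/></fo:block>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the document and returns the serialized result.
    static String toFo(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Prints the generated fo:root element with one fo:block per template hit.
        System.out.println(toFo(XML, XSL));
    }
}
```

Running the real stylesheet this way is a convenient check that the templates match before you involve the servlet engine and the Acrobat plug-in.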
In addition to specifically requesting certain types of transformations, such as a conversion to a PDF, Cocoon allows for dynamic processing to occur based on the request. A common example of this is applying different formatting based on the client's media type. In a traditional web environment, this allows an XML document to be transformed differently based on the browser being used. A client using Internet Explorer could be served a different presentation than a client using Netscape; given the ongoing incompatibilities among the versions of HTML, DHTML, and JavaScript supported by Netscape and Microsoft, this is a powerful feature to have available. Cocoon provides built-in support for many common browser types. Locate the cocoon.properties file you referenced earlier, open it, and scroll to the bottom of the file. You will see the following section (this may be slightly different for newer versions):
##########################################
# User Agents (Browsers)                 #
##########################################

# NOTE: numbers indicate the search order. This is VERY VERY IMPORTANT
# since some words may be found in more than one browser description.
# (MSIE is presented as "Mozilla/4.0 (Compatible; MSIE 4.01; ...")
#
# for example, the "explorer=MSIE" tag indicates that the XSL stylesheet
# associated to the media type "explorer" should be mapped to those
# browsers that have the string "MSIE" in their "user-Agent" HTTP header.

browser.0 = explorer=MSIE
browser.1 = pocketexplorer=MSPIE
browser.2 = handweb=HandHTTP
browser.3 = avantgo=AvantGo
browser.4 = imode=DoCoMo
browser.5 = opera=Opera
browser.6 = lynx=Lynx
browser.7 = java=Java
browser.8 = wap=Nokia
browser.9 = wap=UP
browser.10 = wap=Wapalizer
browser.11 = mozilla5=Mozilla/5
browser.12 = mozilla5=Netscape6/
browser.13 = netscape=Mozilla
The keywords after the first equals sign are the items to take note of: explorer, lynx, java, and mozilla5, for example, all differentiate between different user-agents, the codes the browsers send with requests for URLs. As an example of applying stylesheets based on this property, you can create a sample XSL stylesheet to apply when the client accesses the XML table of contents (contents.xml) document with Internet Explorer. Copy the original XML-to-HTML stylesheet, JavaXML.html.xsl, to JavaXML.explorer-html.xsl. Then make the modifications shown in Example 10-3.
<?xml version="1.0"?>

<xsl:stylesheet xmlns:javaxml2="http://www.oreilly.com/javaxml2"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:ora="http://www.oreilly.com"
                version="1.0"
>

  <xsl:template match="javaxml2:book">
    <xsl:processing-instruction name="cocoon-format">
      type="text/html"
    </xsl:processing-instruction>
    <html>
      <head>
        <title>
          <xsl:value-of select="javaxml2:title" /> (Explorer Version)
        </title>
      </head>
      <body>
        <xsl:apply-templates select="*[not(self::javaxml2:title)]" />
      </body>
    </html>
  </xsl:template>

  <xsl:template match="javaxml2:contents">
    <center>
      <h2>Table of Contents (Explorer Version)</h2>
      <small>
        Try <a href="http://www.mozilla.org">Mozilla</a> today!
      </small>
    </center>
    <!-- Other XSL directives -->
  </xsl:template>

  <!-- Other XSL template matches -->

</xsl:stylesheet>
While this is a trivial example, dynamic HTML could be inserted for Internet Explorer 5.5, and standard HTML could be used for Netscape Navigator or Mozilla, which have less DHTML support. With this in place, you need to let your XML document know that if the media type (or user-agent) matches up with the explorer type defined in the properties file, a different XSL stylesheet should be used. The additional processing instruction shown in Example 10-4 handles this, and can be added to the contents.xml file.
<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "DTD/JavaXML.dtd">
<?xml-stylesheet href="XSL/JavaXML.html.xsl" type="text/xsl"?>
<?xml-stylesheet href="XSL/JavaXML.explorer-html.xsl" type="text/xsl"
                 media="explorer"?>
<?cocoon-process type="xslt"?>

<!-- Java and XML Contents -->
<book xmlns="http://www.oreilly.com/javaxml2"
      xmlns:ora="http://www.oreilly.com"
>
  <!-- XML content -->
</book>
Accessing the XML in your Netscape browser yields the same results as before; however, if you access the page in Internet Explorer, you will see that the document has been transformed with the alternate stylesheet, and looks like Figure 10-5.
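The search-order note in the properties file explains why Internet Explorer and Netscape end up with different stylesheets even though both send a User-Agent header containing "Mozilla". The lookup can be sketched as a first-match substring search. This is only an illustration under stated assumptions, not Cocoon's source code: the class name UserAgentMediaSketch and the method mediaFor are hypothetical, and only some of the browser entries are reproduced.

```java
// Hypothetical illustration of the ordered lookup described in the
// cocoon.properties comments; this is not Cocoon's actual implementation.
class UserAgentMediaSketch {

    // {media type, marker string} pairs kept in search order. Only a subset
    // of the entries from the properties file is reproduced here.
    static final String[][] BROWSERS = {
        {"explorer", "MSIE"},
        {"opera",    "Opera"},
        {"lynx",     "Lynx"},
        {"wap",      "Nokia"},
        {"wap",      "UP"},
        {"mozilla5", "Mozilla/5"},
        {"netscape", "Mozilla"},
    };

    // Returns the media type of the first entry whose marker occurs in the
    // User-Agent header, or null when no entry matches.
    static String mediaFor(String userAgent) {
        if (userAgent == null) {
            return null;
        }
        for (String[] entry : BROWSERS) {
            if (userAgent.contains(entry[1])) {
                return entry[0];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Both headers contain "Mozilla"; the earlier MSIE entry wins for the first.
        System.out.println(mediaFor("Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"));
        System.out.println(mediaFor("Mozilla/4.76 [en] (X11; U; Linux 2.2.16 i686)"));
    }
}
```

The first call yields explorer and the second netscape. If the netscape entry were listed before explorer, every Internet Explorer request would be classified as Netscape, which is why the numbering in the properties file matters.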
One of the real powers in this dynamic application of stylesheets lies in the use of wireless devices. Remember our properties file?
##########################################
# User Agents (Browsers)                 #
##########################################

# NOTE: numbers indicate the search order. This is VERY VERY IMPORTANT
# since some words may be found in more than one browser description.
# (MSIE is presented as "Mozilla/4.0 (Compatible; MSIE 4.01; ...")
#
# for example, the "explorer=MSIE" tag indicates that the XSL stylesheet
# associated to the media type "explorer" should be mapped to those
# browsers that have the string "MSIE" in their "user-Agent" HTTP header.

browser.0 = explorer=MSIE
browser.1 = pocketexplorer=MSPIE
browser.2 = handweb=HandHTTP
browser.3 = avantgo=AvantGo
browser.4 = imode=DoCoMo
browser.5 = opera=Opera
browser.6 = lynx=Lynx
browser.7 = java=Java
browser.8 = wap=Nokia
browser.9 = wap=UP
browser.10 = wap=Wapalizer
browser.11 = mozilla5=Mozilla/5
browser.12 = mozilla5=Netscape6/
browser.13 = netscape=Mozilla
The three wap entries (browser.8 through browser.10) detect that a wireless agent, such as an Internet-capable phone, is being used to access content. Just as Cocoon detected whether the incoming web browser was Internet Explorer or Netscape and responded with the correct stylesheet, it can handle a WAP device with yet another stylesheet. Add another stylesheet reference to your contents.xml document:
<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "DTD/JavaXML.dtd">
<?xml-stylesheet href="XSL/JavaXML.html.xsl" type="text/xsl"?>
<?xml-stylesheet href="XSL/JavaXML.explorer-html.xsl" type="text/xsl"
                 media="explorer"?>
<?xml-stylesheet href="XSL/JavaXML.wml.xsl" type="text/xsl"
                 media="wap"?>
<?cocoon-process type="xslt"?>

<!-- Java and XML Contents -->
<book xmlns="http://www.oreilly.com/javaxml2"
      xmlns:ora="http://www.oreilly.com"
>
  <!-- XML table of contents -->
</book>
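With three xml-stylesheet processing instructions in the prolog, the choice of stylesheet comes down to comparing each instruction's media pseudo-attribute with the media type detected for the client, and falling back to the instruction that has no media attribute. The sketch below illustrates that selection with the StAX parser bundled with the JDK. It rests on assumptions: the class name StylesheetSelectionSketch and its methods are hypothetical, the regular expression handles only double-quoted pseudo-attributes, the sample document omits the DOCTYPE so that no DTD needs to be resolved, and Cocoon's own selection logic may differ in detail.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration of media-based stylesheet selection; this is not
// taken from Cocoon's source code.
class StylesheetSelectionSketch {

    // Same prolog as contents.xml, minus the DOCTYPE declaration.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>"
      + "<?xml-stylesheet href=\"XSL/JavaXML.html.xsl\" type=\"text/xsl\"?>"
      + "<?xml-stylesheet href=\"XSL/JavaXML.explorer-html.xsl\" type=\"text/xsl\" media=\"explorer\"?>"
      + "<?xml-stylesheet href=\"XSL/JavaXML.wml.xsl\" type=\"text/xsl\" media=\"wap\"?>"
      + "<?cocoon-process type=\"xslt\"?>"
      + "<book xmlns=\"http://www.oreilly.com/javaxml2\"/>";

    // Matches name="value" pairs inside processing instruction data.
    static final Pattern PSEUDO_ATTR = Pattern.compile("(\\w+)\\s*=\\s*\"([^\"]*)\"");

    // Returns the value of the named pseudo-attribute, or null if it is absent.
    static String pseudoAttr(String piData, String name) {
        Matcher m = PSEUDO_ATTR.matcher(piData);
        while (m.find()) {
            if (m.group(1).equals(name)) {
                return m.group(2);
            }
        }
        return null;
    }

    // Returns the href of the stylesheet whose media matches; otherwise the
    // href of the first stylesheet declared without a media pseudo-attribute.
    static String stylesheetFor(String xml, String media) {
        String fallback = null;
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    break; // the prolog ends at the root element
                }
                if (event != XMLStreamConstants.PROCESSING_INSTRUCTION
                        || !"xml-stylesheet".equals(reader.getPITarget())
                        || reader.getPIData() == null) {
                    continue;
                }
                String data = reader.getPIData();
                String piMedia = pseudoAttr(data, "media");
                String href = pseudoAttr(data, "href");
                if (piMedia == null) {
                    if (fallback == null) {
                        fallback = href;
                    }
                } else if (piMedia.equals(media)) {
                    return href;
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read the prolog", e);
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(stylesheetFor(SAMPLE, "wap"));
        System.out.println(stylesheetFor(SAMPLE, "netscape"));
    }
}
```

A request classified as wap selects XSL/JavaXML.wml.xsl, while a media type with no dedicated instruction, such as netscape, falls back to XSL/JavaXML.html.xsl.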
Now you need to create this newly referenced stylesheet for WAP devices. The Wireless Markup Language (WML) is typically used when building a stylesheet for a WAP device. WML is a variant on HTML, but has a slightly different method of representing different pages. When a wireless device requests a URL, the returned response must be within a wml element. In that root element, several cards can be defined, each through the WML card element. The device downloads multiple cards at one time (often referred to as a deck) so that it does not have to go back to the server for the additional screens. Example 10-5 shows a simple WML page using these constructs.
<wml>
  <card id="index" title="Home Page">
    <p align="left">
      <i>Main Menu</i><br />
      <a href="#title">Title Page</a><br />
      <a href="#myPage">My Page</a><br />
    </p>
  </card>

  <card id="title" title="My Title Page">
    <p>
      Welcome to my Title Page!<br />
      So happy to see you.
    </p>
  </card>

  <card id="myPage" title="Hello World">
    <p align="center">
      Hello World!
    </p>
  </card>
</wml>
This simple example serves requests with a menu, and two screens accessed from links within that menu. The complete WML 1.1 specification is available online, along with all other related WAP specifications, at http://www.wapforum.org/what/technical_1_1.htm. You can also pick up a copy of Learning WML and WML Script by Martin Frost (O'Reilly). Additionally, the UP.SDK can be downloaded from http://www.phone.com/products/upsdk.html; this is a software emulation of a wireless device that allows testing of your WML pages. With this software, you can develop an XSL stylesheet to output WML for WAP devices, and test the results by pointing your UP.SDK browser to http://<hostname>:<port>/cocoon/contents.xml.
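Because a deck is ordinary XML, a quick well-formedness check can catch an unclosed tag before you load the page into an emulator. The following sketch uses the DOM parser bundled with the JDK to parse a deck and list its card ids. The class name WmlDeckCheckSketch and the method cardIds are hypothetical, and the check covers well-formedness only; it does not validate against the WML DTD.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for checking a WML deck before testing it on a device
// or emulator; it checks well-formedness only, not validity against a DTD.
class WmlDeckCheckSketch {

    // A small deck with a menu card and one target card.
    static final String DECK =
        "<wml>"
      + "<card id=\"index\" title=\"Home Page\">"
      + "<p><a href=\"#title\">Title Page</a></p>"
      + "</card>"
      + "<card id=\"title\" title=\"My Title Page\">"
      + "<p>Welcome to my Title Page!</p>"
      + "</card>"
      + "</wml>";

    // Parses the deck and returns the id of every card, in document order.
    // Throws IllegalArgumentException when the deck is not well-formed XML.
    static List<String> cardIds(String wml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(wml)));
            NodeList cards = doc.getElementsByTagName("card");
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < cards.getLength(); i++) {
                ids.add(((Element) cards.item(i)).getAttribute("id"));
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalArgumentException("Deck is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Lists the cards that links such as #title can point to.
        System.out.println(cardIds(DECK));
    }
}
```

Comparing the returned ids with the href targets in the menu card is an easy way to spot a link that points at a card that does not exist.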
Because phone displays are much smaller than computer screens, you want to show only a subset of the information in our XML table of contents. Example 10-6 is an XSL stylesheet that outputs three cards in WML. The first card is a menu with links to the other two cards. The second card generates a table of contents listing from our contents.xml document. The third card is a simple copyright screen. This stylesheet can be saved as JavaXML.wml.xsl in the XSL/ subdirectory of your Cocoon context.
<?xml version="1.0"?>

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:javaxml2="http://www.oreilly.com/javaxml2"
                xmlns:ora="http://www.oreilly.com"
                exclude-result-prefixes="javaxml2 ora"
>

  <xsl:template match="javaxml2:book">
    <xsl:processing-instruction name="cocoon-format">
      type="text/wml"
    </xsl:processing-instruction>
    <wml>
      <card id="index" title="{javaxml2:title}">
        <p align="center">
          <i><xsl:value-of select="javaxml2:title"/></i><br />
          <a href="#contents">Contents</a><br/>
          <a href="#copyright">Copyright</a><br/>
        </p>
      </card>

      <xsl:apply-templates select="javaxml2:contents" />

      <card id="copyright" title="Copyright">
        <p align="center">
          Copyright 2000, O'Reilly &amp; Associates
        </p>
      </card>
    </wml>
  </xsl:template>

  <xsl:template match="javaxml2:contents">
    <card id="contents" title="Contents">
      <p align="center">
        <i>Contents</i><br />
        <xsl:for-each select="javaxml2:chapter">
          <xsl:value-of select="@number" />.
          <xsl:value-of select="@title" /><br />
        </xsl:for-each>
      </p>
    </card>
  </xsl:template>

</xsl:stylesheet>
Other than the WML tags, most of this example should look familiar. There is also a processing instruction for Cocoon, with the target specified as cocoon-format. The data sent, type="text/wml", instructs Cocoon to send the result of this stylesheet with a content header specifying that the output is text/wml (instead of the normal text/html or text/plain). There is one other important addition, an attribute added to the root element of the stylesheet:
<?xml version="1.0"?>

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:javaxml2="http://www.oreilly.com/javaxml2"
                xmlns:ora="http://www.oreilly.com"
                exclude-result-prefixes="javaxml2 ora"
>
By default, any XML namespace declarations other than the XSL namespace are added to the root element of the transformation output. In this example, the root element of the transformed output, wml, would have the namespace declarations associated with the javaxml2 and ora prefixes added to it:
<wml xmlns:javaxml2="http://www.oreilly.com/javaxml2"
     xmlns:ora="http://www.oreilly.com"
>
  <!-- WML content -->
</wml>
This addition causes a WAP browser to report an error, as xmlns:javaxml2 and xmlns:ora are not allowed attributes for the wml element. WAP browsers are not as forgiving as HTML browsers, and the rest of the WML content would not be shown. However, you must declare the namespace so the XSL stylesheet can handle template matching for the input document, which does use the javaxml2-associated namespace. To handle this problem, XSL allows the attribute exclude-result-prefixes to be added to the xsl:stylesheet element. The namespace prefixes listed in this attribute will not be added to the transformed output, which is exactly what you want. Your output would now look like this:
<wml>
  <!-- WML content -->
</wml>
This is understood perfectly by a WAP browser. If you've downloaded the UP.SDK browser, you can point it to your XML table of contents, and see the results. Figure 10-6 shows the main menu that results from the transformation using the WML stylesheet when a WAP device requests the contents.xml file through Cocoon.
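The effect of exclude-result-prefixes can also be observed without a WAP browser by running a small transformation through the JAXP API bundled with the JDK, once with the attribute and once without. This is a sketch under stated assumptions: the class name ExcludePrefixesSketch is hypothetical, the stylesheet is a one-template stand-in for JavaXML.wml.xsl, and the JDK's built-in XSLT processor is used instead of the one configured in Cocoon, so serialization details may differ slightly.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demonstration of exclude-result-prefixes; it uses the JDK's
// built-in XSLT processor rather than Cocoon.
class ExcludePrefixesSketch {

    // Minimal input document in the javaxml2 namespace.
    static final String XML =
        "<book xmlns='http://www.oreilly.com/javaxml2'>"
      + "<title>Java and XML</title></book>";

    // Builds a one-template stand-in for JavaXML.wml.xsl, with or without
    // the exclude-result-prefixes attribute on xsl:stylesheet.
    static String stylesheet(boolean exclude) {
        return "<xsl:stylesheet version='1.0'"
             + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
             + " xmlns:javaxml2='http://www.oreilly.com/javaxml2'"
             + " xmlns:ora='http://www.oreilly.com'"
             + (exclude ? " exclude-result-prefixes='javaxml2 ora'" : "")
             + ">"
             + "<xsl:template match='javaxml2:book'>"
             + "<wml><card id='index' title='{javaxml2:title}'/></wml>"
             + "</xsl:template>"
             + "</xsl:stylesheet>";
    }

    // Transforms the sample document and returns the serialized output.
    static String transform(boolean exclude) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet(exclude))));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(XML)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without exclusion: " + transform(false));
        System.out.println("with exclusion:    " + transform(true));
    }
}
```

Without the attribute, the wml element in the output carries the xmlns:javaxml2 and xmlns:ora declarations; with it, the element is emitted bare, while the javaxml2:book template still matches because the prefix remains declared inside the stylesheet.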
WARNING: In the UP.SDK browser versions that I tested, the browser would not resolve the entity reference OReillyCopyright. I had to comment this line out in my XML to make the examples work. You will probably have to do the same, until the simulator fixes this bug.
Figure 10-7 shows the generated table of contents, accessed by clicking the "Link" button when the "Contents" link is indicated in the display.
Visit http://www.openwave.com and http://www.wapforum.org for more information on WML and WAP; both sites have extensive online resources for wireless device development.
By now, you should have a pretty good idea of the variety of output that can be created with Cocoon. With a minimal amount of effort and an extra stylesheet, the same XML document can be served in multiple formats to multiple types of clients; this is one of the reasons the web publishing framework is such a powerful tool. Without XML and a framework like this, separate sites would have to be created for each type of client. Now that you have seen how flexible the generation of output is when using Cocoon, I will move on to how Cocoon provides technology that allows for dynamic creation and customization of the input to these transformations.
Copyright © 2002 O'Reilly & Associates. All rights reserved.