Before we leave the topic of linking, we'll discuss one more useful technique. So far, all of this chapter's examples have been structured nicely. When there was a relationship between two pieces of information, we had an id and refid pair to match them. What happens if the XML document you're transforming isn't written that way? Fortunately, we can use the key() function and a new function, generate-id(), to create structure where there isn't any.
For our example here, we'll take out all of the id and refid attributes that have served us well so far. This may be a contrived example, but it demonstrates how we can use the key() and generate-id() functions to generate links between parts of our document.
In our new sample document, we've stripped out the references that neatly tied things together before:
<?xml version="1.0" ?>
<!DOCTYPE glossary SYSTEM "unstructuredglossary.dtd">
<glossary>
  <glentry>
    <term>applet</term>
    <defn>
      An application program, written in the Java programming language,
      that can be retrieved from a web server and executed by a web browser.
      A reference to an applet appears in the markup for a web page, in the
      same way that a reference to a graphics file appears; a browser
      retrieves an applet in the same way that it retrieves a graphics file.
      For security reasons, an applet's access rights are limited in two
      ways: the applet cannot access the file system of the client upon
      which it is executing, and the applet's communication across the
      network is limited to the server from which it was downloaded.
      Contrast with <refterm>servlet</refterm>.
    </defn>
  </glentry>
  <glentry>
    <term>demilitarized zone</term>
    <defn>
      In network security, a network that is isolated from, and serves as a
      neutral zone between, a trusted network (for example, a private
      intranet) and an untrusted network (for example, the Internet). One or
      more secure gateways usually control access to the DMZ from the
      trusted or the untrusted network.
    </defn>
  </glentry>
  <glentry>
    <term>DMZ</term>
    <defn>
      See <refterm>demilitarized zone</refterm>.
    </defn>
  </glentry>
  <glentry>
    <term>pattern-matching character</term>
    <defn>
      A special character such as an asterisk (*) or a question mark (?)
      that can be used to represent zero or more characters. Any character
      or set of characters can replace a pattern-matching character.
    </defn>
  </glentry>
  <glentry>
    <term>servlet</term>
    <defn>
      An application program, written in the Java programming language,
      that is executed on a web server. A reference to a servlet appears in
      the markup for a web page, in the same way that a reference to a
      graphics file appears. The web server executes the servlet and sends
      the results of the execution (if there are any) to the web browser.
      Contrast with <refterm>applet</refterm>.
    </defn>
  </glentry>
  <glentry>
    <term>wildcard character</term>
    <defn>
      See <refterm>pattern-matching character</refterm>.
    </defn>
  </glentry>
</glossary>
To generate cross-references between the <refterm> elements and the associated <term> elements, we'll need to do three things:
1. Define a key for all terms. We'll use this key to find terms that match the text of the <refterm> element.
2. Generate a new ID for each <term> we find.
3. For each <refterm>, use the key() function to find the <term> element that matches the text of <refterm>. Once we've found the matching <term>, we call generate-id() to find the newly created ID.
We'll go through the relevant parts of the stylesheet. First, we define the key:
<xsl:key name="terms" match="term" use="."/>
Notice that we use the value of the <term> element itself as the lookup value for the key. Given a string, we can find all <term> elements with that same text.
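To see what this key gives us on its own, here is a minimal sketch of a stylesheet that does nothing but look up one term, assuming it is run against the sample glossary above. The hard-coded lookup string 'servlet' and the plain-text output are assumptions made for illustration; they are not part of the chapter's stylesheet.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- The key from the text: index every <term> by its own string value -->
  <xsl:key name="terms" match="term" use="."/>

  <xsl:template match="/">
    <!-- key() returns every <term> whose text is exactly "servlet" -->
    <xsl:for-each select="key('terms', 'servlet')">
      <!-- ".." steps from the <term> up to its parent <glentry> -->
      <xsl:value-of select="normalize-space(../defn)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

If no <term> has that exact text, key() returns an empty node-set and the xsl:for-each produces nothing.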
Second, we need to generate a named anchor point for each <term> element:
<xsl:template match="glentry">
  <p>
    <b>
      <a name="{generate-id(term)}">
        <xsl:value-of select="term"/>
        <xsl:text>: </xsl:text>
      </a>
    </b>
    <xsl:apply-templates select="defn"/>
  </p>
</xsl:template>
Third, we find the appropriate reference for a given <refterm>. Given the text of a <refterm>, we can use the key() function to find the <term> that matches. Passing the <term> to the generate-id() function returns the same ID generated when we created the named anchor for that <term>:
<xsl:template match="refterm">
  <a href="#{generate-id(key('terms', .))}">
    <xsl:value-of select="."/>
  </a>
</xsl:template>
Our generated HTML output creates cross-references similar to those in our earlier stylesheets:
<h1>Glossary Listing: applet - wildcard character</h1>
<p>
  <b><a name="N11">applet: </a></b>
  An application program, written in the Java programming language,
  that can be retrieved from a web server and executed by a web browser.
  A reference to an applet appears in the markup for a web page, in the
  same way that a reference to a graphics file appears; a browser
  retrieves an applet in the same way that it retrieves a graphics file.
  For security reasons, an applet's access rights are limited in two
  ways: the applet cannot access the file system of the client upon
  which it is executing, and the applet's communication across the
  network is limited to the server from which it was downloaded.
  Contrast with <a href="#N53">servlet</a>.
</p>
...
<p>
  <b><a name="N53">servlet: </a></b>
  An application program, written in the Java programming language,
  that is executed on a web server. A reference to a servlet appears in
  the markup for a web page, in the same way that a reference to a
  graphics file appears. The web server executes the servlet and sends
  the results of the execution (if there are any) to the web browser.
  Contrast with <a href="#N11">applet</a>.
</p>
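Putting the three pieces together, a complete stylesheet might look like the following minimal sketch. Only the key and the glentry and refterm templates come from the text; the root template, including the way it builds the heading from the first and last terms, is an assumption made to produce output shaped like the listing above. The actual ID strings (N11, N53, and so on) depend on the XSLT processor.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Step 1: index every <term> by its text -->
  <xsl:key name="terms" match="term" use="."/>

  <!-- Assumed wrapper: the text does not show this template -->
  <xsl:template match="/">
    <html>
      <head>
        <title>Glossary Listing</title>
      </head>
      <body>
        <h1>
          <xsl:text>Glossary Listing: </xsl:text>
          <xsl:value-of select="glossary/glentry[1]/term"/>
          <xsl:text> - </xsl:text>
          <xsl:value-of select="glossary/glentry[last()]/term"/>
        </h1>
        <xsl:apply-templates select="glossary/glentry"/>
      </body>
    </html>
  </xsl:template>

  <!-- Step 2: a named anchor for each term -->
  <xsl:template match="glentry">
    <p>
      <b>
        <a name="{generate-id(term)}">
          <xsl:value-of select="term"/>
          <xsl:text>: </xsl:text>
        </a>
      </b>
      <xsl:apply-templates select="defn"/>
    </p>
  </xsl:template>

  <!-- Step 3: a link from each reference to the matching anchor -->
  <xsl:template match="refterm">
    <a href="#{generate-id(key('terms', .))}">
      <xsl:value-of select="."/>
    </a>
  </xsl:template>

</xsl:stylesheet>
```

No template is needed for <defn>: the built-in template rules copy its text through and apply the refterm template to any <refterm> children.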
Using the key() and generate-id() functions, we've been able to create IDs and references automatically. This approach isn't perfect; we have to make sure the text of the <refterm> element matches the text of the <term> exactly.
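One way to soften the exact-match requirement is sketched below, under two assumptions that go beyond the text: that differences in leading, trailing, or repeated whitespace should be ignored, and that a reference with no matching term should be written as plain text rather than as a dead link. The variable name target is hypothetical. The glentry template is unchanged from the one shown earlier and is omitted here.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Index terms by their whitespace-normalized text -->
  <xsl:key name="terms" match="term" use="normalize-space(.)"/>

  <xsl:template match="refterm">
    <!-- Normalize the reference the same way before looking it up -->
    <xsl:variable name="target"
        select="key('terms', normalize-space(.))"/>
    <xsl:choose>
      <!-- A non-empty node-set is true: we found a matching <term> -->
      <xsl:when test="$target">
        <a href="#{generate-id($target)}">
          <xsl:value-of select="."/>
        </a>
      </xsl:when>
      <!-- No match: emit the text without a link -->
      <xsl:otherwise>
        <xsl:value-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

This still treats terms that differ in case or spelling as different terms; it only removes whitespace as a source of mismatches.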
This example, like all of the examples we've shown so far, uses a single input file. A more likely scenario is that we have one XML document that contains terms, and we want to reference definitions in a second XML document that contains definitions, but no IDs. We can combine the technique we've described here with the document() function to import a second XML document and generate links between the two. We'll talk about the document() function in a later chapter; for now, just remember that there are ways to use more than one XML input document in your transformations.
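As a preview only, here is a minimal sketch of how that combination might look. It assumes a hypothetical second file named definitions.xml that uses the same <glossary>, <glentry>, <term>, and <defn> structure, and it assumes both documents are rendered into the same HTML page in a single transformation so that the generated IDs agree. The variable names defs and wanted are also hypothetical. Because key() searches only the document that contains the context node, the sketch uses xsl:for-each to move the context into the second document before the lookup.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <xsl:key name="terms" match="term" use="."/>

  <!-- Hypothetical file name, resolved relative to this stylesheet -->
  <xsl:variable name="defs" select="document('definitions.xml')"/>

  <xsl:template match="/">
    <html>
      <body>
        <!-- Content of the main document, with its <refterm> links -->
        <xsl:apply-templates/>
        <!-- Definitions pulled in from the second document -->
        <xsl:apply-templates select="$defs/glossary/glentry"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="glentry">
    <p>
      <b>
        <a name="{generate-id(term)}">
          <xsl:value-of select="term"/>
          <xsl:text>: </xsl:text>
        </a>
      </b>
      <xsl:apply-templates select="defn"/>
    </p>
  </xsl:template>

  <xsl:template match="refterm">
    <!-- Save the reference text before the context changes -->
    <xsl:variable name="wanted" select="string(.)"/>
    <!-- Switch context so key() searches definitions.xml -->
    <xsl:for-each select="$defs">
      <a href="#{generate-id(key('terms', $wanted))}">
        <xsl:value-of select="$wanted"/>
      </a>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```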
To close out the topic of linking, we'll go over the details of the generate-id() function. This function takes an optional node-set as its argument and works as follows:
Within a single transformation, every call to generate-id() against a given node returns the same ID; the ID doesn't change while that transformation runs. If you run the transformation again, there's no guarantee that generate-id() will produce the same ID for that node. All calls against the node in the second run will agree with one another, but not necessarily with the ID from the first run.
TIP: The generate-id() function is not required to check if an ID it generates duplicates an ID that's already in the document. In other words, if your document has an attribute of type ID with a value of sdk3829a, there's a possibility that an ID returned by generate-id() will also be sdk3829a. It's not likely, but be aware that it could happen.
If you invoke generate-id() against two different nodes, the two generated IDs will be different.
Given a node-set, generate-id() returns an ID for the node in the node-set that occurs first in document order.
If the node-set you pass to the function is empty (for example, you invoke generate-id(fleeber) and the context node has no <fleeber> children), generate-id() returns an empty string.
If no node-set is passed in (you invoke generate-id()), the function generates an ID for the context node.
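The rules above can be checked with a small test stylesheet. This is a sketch under stated assumptions: it is meant to be run against the sample glossary (any document with at least two <term> elements and no <fleeber> elements would do), and the variable names first and second are hypothetical. It prints one labeled line per rule, and on a conforming XSLT 1.0 processor each line should end in true.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- The first and second <term> elements in document order -->
    <xsl:variable name="first" select="(//term)[1]"/>
    <xsl:variable name="second" select="(//term)[2]"/>

    <xsl:text>same node, same ID: </xsl:text>
    <xsl:value-of
        select="generate-id($first) = generate-id($first)"/>

    <xsl:text>&#10;different nodes, different IDs: </xsl:text>
    <xsl:value-of
        select="generate-id($first) != generate-id($second)"/>

    <!-- //term selects every term; only the first one counts -->
    <xsl:text>&#10;node-set uses first node in document order: </xsl:text>
    <xsl:value-of
        select="generate-id(//term) = generate-id($first)"/>

    <xsl:text>&#10;empty node-set gives empty string: </xsl:text>
    <xsl:value-of select="generate-id(//fleeber) = ''"/>

    <!-- Make $first the context node, then call with no argument -->
    <xsl:for-each select="$first">
      <xsl:text>&#10;no argument means the context node: </xsl:text>
      <xsl:value-of select="generate-id() = generate-id(.)"/>
    </xsl:for-each>

    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

The first comparison also illustrates a common idiom: because different nodes always get different IDs, comparing two generate-id() results is a way to test whether two expressions select the same node.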
Copyright © 2002 O'Reilly & Associates. All rights reserved.