

B.3. Other

W3C XML Schema is inspiring some new technologies all its own.

B.3.1. PSVI Serialization

The PSVI (Post Schema Validation Infoset) is one of the most obscure features of W3C XML Schema. Although it is present throughout the Recommendation, its specification is scattered across small sections describing the "PSVI contribution" of each feature. These contributions are described in plain text even when they are very abstract and difficult to visualize, like those of the ID/IDREF or unique/key/keyref tables. They may eventually be exposed through APIs that haven't been specified yet.

This level of abstraction appears to be due to the organization of the W3C into independent working groups. Since the charter of the W3C XML Schema Working Group is hidden in the members-only section of the W3C web site, we can only guess that it did not include defining a processing model for schema validation and focused solely on defining the language itself.

However, the effects of this lack of formal specification are similar to those of the unpublished API practices that some software vendors are famous for: the lack of a concrete description makes the PSVI hard for most users to understand, and it creates a kind of "vendor lock-in," since generating the PSVI with a tool other than a W3C XML Schema processor involves emulating these unspecified APIs and may prove difficult for many developers.

Four years of XML have taught us that there is an easy way to serialize abstract concepts, and the definition of an XML serialization for the PSVI would have a lot of advantages. Assuming the format is simple enough, it would let us visualize what the PSVI is, allow us to process a PSVI using the standard set of XML APIs (DOM, SAX, and friends) and tools (including XSLT), make it easy to include in XML processing pipelines, allow us to save it for reuse, and permit us to generate it from any application or tool able to generate XML documents.
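To make the argument concrete, here is a minimal sketch of what processing a serialized PSVI with an ordinary XML API could look like. Everything about the format is an assumption made for illustration: the psvi namespace URI, the psvi:type and psvi:validity attributes, and the sample order document are hypothetical and are not taken from the Recommendation or from any published proposal. Only the validity values ("valid" and "invalid") mirror those of the PSVI itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for an imagined lightweight PSVI serialization.
PSVI_NS = "http://example.org/ns/hypothetical-psvi"

# An instance document decorated with two PSVI properties per element:
# the type it was validated against and the validation outcome.
ANNOTATED = """\
<order xmlns:psvi="http://example.org/ns/hypothetical-psvi"
       psvi:type="orderType" psvi:validity="invalid">
  <date psvi:type="xs:date" psvi:validity="valid">2002-06-01</date>
  <quantity psvi:type="xs:positiveInteger" psvi:validity="invalid">-3</quantity>
</order>
"""


def invalid_elements(doc_text):
    """Return (element name, type name) pairs for elements marked invalid."""
    root = ET.fromstring(doc_text)
    return [
        (el.tag, el.get(f"{{{PSVI_NS}}}type"))
        for el in root.iter()  # document order, root included
        if el.get(f"{{{PSVI_NS}}}validity") == "invalid"
    ]


print(invalid_elements(ANNOTATED))
```

Because the annotations are plain attributes, no schema-aware API is needed to read them: any tool that can parse XML can consume the result, which is the point made above.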

A proposal has been written by Richard Tobin and Henry S. Thompson and informally published on the W3C web site (http://www.w3.org/2001/05/serialized-infoset-schema.html), but the format is heavyweight and difficult to read.

B.3.2. APIs

Even though the PSVI is produced by W3C XML Schema processors and used by XPath/XSLT 2.0 and XQuery 1.0, no API has been defined to communicate between these applications. The traditional XML APIs (DOM and SAX) have not yet been adapted to support this additional information. The most advanced open implementation in this area seems to be the Xerces Native Interface (XNI; see http://xml.apache.org/xerces2-j/xni.html), which is a general framework for adding information to the stream of basic events supported by SAX. While it is more generic than it needs to be to support the PSVI, XNI can be used when working with Xerces to expose the information from the PSVI. There is also a Microsoft implementation that similarly exposes information from the PSVI.

The need is there and applications will follow soon; the general-purpose XML APIs (DOM, SAX, and friends) need to take the PSVI into account if they do not want to be replaced by new APIs that will become de facto standards!

B.3.3. Schema Extensions: Error Messages

While the extension mechanisms through foreign attributes and xs:annotation are very flexible, it might be useful to define a set of commonly used schema extensions that could be interoperable across schema processors. The principle would be similar to the EXSLT extensions (see http://www.exslt.org) proposed by an informal group of XSLT experts, which are now supported by a number of XSLT processors.

The error messages sent by schema processors are often very obscure and difficult for an end user to understand. A schema designer can often provide context-aware messages that are much clearer. Associated with a template, an extension for error messages could look like the following (the namespace URI is just an example):

<xs:simpleType name="dateTimeWithTimezone">
  <xs:restriction base="xs:dateTime">
    <xs:pattern value=".+T.+(Z|[+-].+)">
      <xs:annotation>
        <xs:appinfo>
          <exsd:error xmlns:exsd="http://dyomedea.com/ns/esxd">
            This date should specify a timezone.
          </exsd:error>
        </xs:appinfo>
      </xs:annotation>
    </xs:pattern>
  </xs:restriction>
</xs:simpleType>

or (simpler but less extensible):

<xs:simpleType name="dateTimeWithTimezone">
  <xs:restriction base="xs:dateTime">
    <xs:pattern value=".+T.+(Z|[+-].+)"
      exsd:error="This date should specify a timezone."
      xmlns:exsd="http://dyomedea.com/ns/esxd"/>
  </xs:restriction>
  </xs:restriction>
</xs:simpleType>
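How a processor might consume such an extension can be sketched in a few lines. The following is only an illustration, not an existing tool: the function names facet_error_messages and failed_messages are hypothetical, the xs:schema wrapper is added so that the fragment parses on its own, and the check uses Python's re module as a stand-in for a real schema processor (W3C XML Schema regular expressions are not identical to Python's, although this particular pattern behaves the same in both).

```python
import re
import xml.etree.ElementTree as ET

NS = {
    "xs": "http://www.w3.org/2001/XMLSchema",
    "exsd": "http://dyomedea.com/ns/esxd",  # example URI used in the text
}

# The first variant shown above, wrapped in xs:schema to be parseable alone.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:exsd="http://dyomedea.com/ns/esxd">
  <xs:simpleType name="dateTimeWithTimezone">
    <xs:restriction base="xs:dateTime">
      <xs:pattern value=".+T.+(Z|[+-].+)">
        <xs:annotation>
          <xs:appinfo>
            <exsd:error>
              This date should specify a timezone.
            </exsd:error>
          </xs:appinfo>
        </xs:annotation>
      </xs:pattern>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""


def facet_error_messages(schema_text):
    """Map (type name, facet name, facet value) to its custom message.

    Both variants are supported: the exsd:error element inside xs:appinfo
    and the exsd:error foreign attribute on the facet itself.
    """
    root = ET.fromstring(schema_text)
    messages = {}
    for simple_type in root.iter(f"{{{NS['xs']}}}simpleType"):
        for restriction in simple_type.findall("xs:restriction", NS):
            for facet in restriction:
                text = facet.get(f"{{{NS['exsd']}}}error")
                if text is None:
                    el = facet.find("xs:annotation/xs:appinfo/exsd:error", NS)
                    text = el.text if el is not None else None
                if text:
                    key = (simple_type.get("name"),
                           facet.tag.rpartition("}")[2],
                           facet.get("value"))
                    # Collapse the whitespace introduced by indentation.
                    messages[key] = " ".join(text.split())
    return messages


def failed_messages(value, type_name, messages):
    """Return the custom messages of the pattern facets that value fails."""
    return [
        message
        for (name, facet, pattern), message in messages.items()
        # Schema patterns match the whole value, hence fullmatch.
        if name == type_name and facet == "pattern"
        and not re.fullmatch(pattern, value)
    ]


messages = facet_error_messages(SCHEMA)
print(failed_messages("2002-06-01T12:00:00", "dateTimeWithTimezone", messages))
```

A value without a timezone, such as 2002-06-01T12:00:00, fails the pattern and yields the designer's message instead of a generic facet violation; 2002-06-01T12:00:00Z passes and yields nothing.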



Copyright © 2002 O'Reilly & Associates. All rights reserved.