The datatypes covered in this section are shown in Figure 4-4.
The W3C Recommendation, "XML Schema Part 2: Datatypes," provides new confirmation of how difficult it is to fix time.
The support for date and time datatypes relies entirely on a subset of the ISO 8601 standard, which is the only format supported by W3C XML Schema. The purpose of ISO 8601 is to eliminate the risk of confusion between the various date and time formats used in different countries. In other words, W3C XML Schema does not support these local date and time formats, and imposes the use of ISO 8601 for any datatype that has the semantics of a date or time. While this is a good thing for interchange formats, it is more questionable when XML is used to define user interfaces, since, as we will see, ISO 8601 is not very user friendly. Variations that use the names of the months or a different order of year, month, and day are not the only victims of this decision: ISO 8601 imposes the use of the Gregorian (Christian) calendar to the exclusion of calendars used by other cultures or religions.
ISO 8601 describes several formats to define dates, times, periods, and recurring dates, with different levels of precision and indeterminacy. After many discussions, W3C XML Schema selected a subset of these formats and created a primitive datatype for each format that is supported.
The indeterminacy allowed in some of these formats adds a lot of difficulty, especially when comparisons or arithmetic are involved. For instance, it is possible to define a point in time without specifying the time zone, which is then considered undetermined. This undetermined time zone is assumed to be the same throughout the document (and between the schema and the instance documents), so comparing two datetimes that both lack a time zone is not an issue. The problem arises when you need to compare two points in time, one with a time zone and the other without. The result of this comparison is undetermined if the values are too close, since the one without a time zone may lie anywhere between -13 hours and +12 hours of Coordinated Universal Time (UTC). Thus, the support of these datetime datatypes introduces a notion of "partial order relation."
Another caveat with ISO 8601 is that time zones are only supported through the time difference from UTC, which ignores the notion of summer time. For instance, if an application working with British time wants to specify the time zone--and we have seen that this is necessary to be able to compare datetimes--the application needs to know whether a date falls in summer (British Summer Time, or BST, is one hour ahead of UTC) or in winter (the time zone is then UTC). ISO 8601 ignores the "named time zones" that switch to daylight saving time (such as PST, BST, or WET) and that we use in our day-to-day life; leaving out the time zone can be seen as a somewhat dangerous shortcut to specify that a datetime is in your "local time," whatever it is.
TIP: The value space of xs:dateTime is considered to be the moment of time itself. The time zone that defines the value (when there is one) is considered meaningless, which is a problem for some applications that complain that even though 2002-01-18T12:00:00+00:00 and 2002-01-18T11:00:00-01:00 refer to the same "moment of time," they carry different time zone information, which should make its way into the value space.
Valid values for xs:dateTime include:
2001-10-26T21:32:52
2001-10-26T21:32:52+02:00
2001-10-26T19:32:52Z
2001-10-26T19:32:52+00:00
-2001-10-26T21:32:52
2001-10-26T21:32:52.12679
The following values are invalid:
2001-10-26 (all the parts must be specified)
2001-10-26T21:32 (all the parts must be specified)
2001-10-26T25:32:52+02:00 (the hours part (25) is out of range)
01-10-26T21:32 (all the parts must be specified)
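The lexical rules illustrated by these examples can be approximated with a regular expression followed by range checks. The following Python sketch is only an illustration: the name is_xs_datetime is hypothetical, the treatment of leap years for negative years is an assumption, and the check does not try to cover every corner case of the Recommendation (for instance, the permitted range of time zone offsets is not verified).

```python
import calendar
import re

# Lexical shape only: optional sign, year of four or more digits,
# two-digit fields, optional fractional seconds, optional time zone.
DATETIME_RE = re.compile(
    r"(-?\d{4,})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(?:\.\d+)?"
    r"(?:Z|[+-]\d{2}:\d{2})?"
)

DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def is_xs_datetime(text):
    """Return True if text looks like a valid xs:dateTime (simplified)."""
    match = DATETIME_RE.fullmatch(text)
    if match is None:
        return False
    year, month, day, hour, minute, second = (int(g) for g in match.groups())
    if not 1 <= month <= 12:
        return False
    max_day = DAYS_IN_MONTH[month - 1]
    # Assumption: the usual Gregorian leap year rule is applied to every year.
    if month == 2 and calendar.isleap(year):
        max_day = 29
    return 1 <= day <= max_day and hour <= 23 and minute <= 59 and second <= 59


print(is_xs_datetime("2001-10-26T21:32:52+02:00"))  # True
print(is_xs_datetime("2001-10-26T25:32:52+02:00"))  # False
```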
Among the valid examples given above, three represent the same value (that is, the same moment in time):
2001-10-26T21:32:52+02:00
2001-10-26T19:32:52Z
2001-10-26T19:32:52+00:00
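One way to convince yourself that these three lexical forms denote the same moment is to normalize them to UTC. The short Python sketch below does this with the standard datetime module; the helper name to_utc is hypothetical, and replacing "Z" with "+00:00" is a workaround for Python versions whose fromisoformat() does not accept a trailing "Z".

```python
from datetime import datetime, timezone


def to_utc(lexical):
    """Parse an xs:dateTime that carries a time zone and express it in UTC."""
    # Rewrite "Z" as an explicit offset so that older Python versions can parse it.
    parsed = datetime.fromisoformat(lexical.replace("Z", "+00:00"))
    return parsed.astimezone(timezone.utc)


values = [
    "2001-10-26T21:32:52+02:00",
    "2001-10-26T19:32:52Z",
    "2001-10-26T19:32:52+00:00",
]
normalized = {to_utc(v) for v in values}
print(len(normalized))  # 1: the three forms collapse to a single moment
```

Note that the original offset is lost in the process, which mirrors the way the value space ignores the time zone that was used to write the value.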
The first one (2001-10-26T21:32:52), which doesn't include a time zone specification, is considered to have an indeterminate value between 2001-10-26T21:32:52-14:00 and 2001-10-26T21:32:52+14:00. Because of daylight saving time, the range of offsets actually in use is subject to national regulations and may change. The range was between -13:00 and +12:00 when the Recommendation was published, but the Working Group has kept a margin to accommodate possible changes in the regulations.
Despite the indeterminacy of the time zone when none is specified, the W3C XML Schema Recommendation considers that the values of datetimes without time zones implicitly refer to the same undetermined time zone and can be compared with each other. While this is fine for "local" applications that operate in a single time zone, this is a source of potential confusion and errors for world-wide applications, or even for applications that calculate a duration between moments that fall on either side of a daylight saving change within a single time zone.
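The partial order described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a conforming implementation: the function name compare_datetimes is hypothetical, values are represented as Python datetime objects (naive when no time zone was given), and the 14-hour margin is the one mentioned in the text.

```python
from datetime import datetime, timedelta, timezone

MAX_OFFSET = timedelta(hours=14)  # margin kept by the Working Group


def compare_datetimes(p, q):
    """Return -1, 0, or 1, or None when the order is indeterminate."""
    p_has_tz = p.tzinfo is not None
    q_has_tz = q.tzinfo is not None
    if p_has_tz == q_has_tz:
        # Both have a time zone, or both share the same undetermined one.
        return (p > q) - (p < q)
    if p_has_tz:
        earliest_q = q.replace(tzinfo=timezone(MAX_OFFSET))   # q read as +14:00
        latest_q = q.replace(tzinfo=timezone(-MAX_OFFSET))    # q read as -14:00
        if p < earliest_q:
            return -1
        if p > latest_q:
            return 1
        return None  # the values are too close to be ordered
    # Only q has a time zone: swap the operands and invert the result.
    swapped = compare_datetimes(q, p)
    return None if swapped is None else -swapped


with_tz = datetime.fromisoformat("2001-10-26T21:32:52+02:00")
same_day = datetime.fromisoformat("2001-10-26T21:32:52")
two_days_later = datetime.fromisoformat("2001-10-28T21:32:52")
print(compare_datetimes(with_tz, same_day))        # None
print(compare_datetimes(with_tz, two_days_later))  # -1
```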
Valid values for xs:date include:
2001-10-26
2001-10-26+02:00
2001-10-26Z
2001-10-26+00:00
-2001-10-26
-20000-04-01
The following values are invalid:
2001-10 (all the parts must be specified)
2001-10-32 (the days part (32) is out of range)
2001-13-26+02:00 (the month part (13) is out of range)
01-10-26 (the century part is missing)
xs:date represents a day identified by a Gregorian calendar date (and could have been called "gYearMonthDay"). xs:gYearMonth ("g" for Gregorian) is a Gregorian calendar month and xs:gYear is a Gregorian calendar year. These three datatypes are fixed periods of time and optional time zones may be specified for each of them. The only differences between them really are their length (1 day, 1 month, and 1 year) and their format (i.e., their lexical spaces).
The format of xs:gYearMonth is the format of xs:date without the day part. Valid values for xs:gYearMonth include:
2001-10
2001-10+02:00
2001-10Z
2001-10+00:00
-2001-10
-20000-04
The following values are invalid:
2001 (the month part is missing)
2001-13 (the month part is out of range)
2001-13-26+02:00 (the month part is out of range)
01-10 (the century part is missing)
The format of xs:gYear is the format of xs:gYearMonth without the month part. Valid values for xs:gYear include:
2001
2001+02:00
2001Z
2001+00:00
-2001
-20000
The following values are invalid:
01 (the century part is missing)
2001-13 (a month part is not allowed, and 13 would be out of range in any case)
This support of time periods is very restrictive: these periods can only match the Gregorian calendar day, month, or year, and cannot have an arbitrary length or start time.
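To make the idea of fixed periods concrete, the following Python sketch maps a value of xs:date, xs:gYearMonth, or xs:gYear to a half-open interval of calendar dates. The function name period_bounds is hypothetical, and the sketch assumes a positive year and no time zone in the lexical form.

```python
from datetime import date, timedelta


def period_bounds(lexical):
    """Return (start, end) dates for YYYY-MM-DD, YYYY-MM, or YYYY; end is exclusive."""
    parts = [int(p) for p in lexical.split("-")]
    if len(parts) == 3:      # xs:date: one day
        start = date(*parts)
        end = start + timedelta(days=1)
    elif len(parts) == 2:    # xs:gYearMonth: one month
        year, month = parts
        start = date(year, month, 1)
        end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    else:                    # xs:gYear: one year
        start = date(parts[0], 1, 1)
        end = date(parts[0] + 1, 1, 1)
    return start, end


print(period_bounds("2001-10-26"))
print(period_bounds("2001-10"))
print(period_bounds("2001"))
```

The three calls differ only in the length of the interval they return, which is the point made above: the start and the length of a period cannot be chosen freely.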
NOTE: Although 01:20:15 is commonly used to represent a duration of 1 hour, 20 minutes, and 15 seconds, a different format has been chosen to represent durations.
Valid values for xs:time include:
21:32:52
21:32:52+02:00
19:32:52Z
19:32:52+00:00
21:32:52.12679
The following values are invalid:
21:32 (all the parts must be specified)
25:25:10 (the hour part is out of range)
-10:00:00 (the hour part is out of range)
1:20:10 (all the digits must be supplied)
This support of a recurring point in time is also very limited: the recurrence period must be a Gregorian calendar day and cannot be arbitrary.
xs:gDay is a period of a Gregorian calendar day recurring each Gregorian calendar month. The lexical representation of xs:gDay is ---DD with an optional time zone specification. Valid values for xs:gDay include:
---01
---01Z
---01+02:00
---01-04:00
---15
---31
The following values are invalid:
--30- (the format must be "---DD")
---35 (the day is out of range)
---5 (all the digits must be supplied)
15 (missing the leading "---")
The rules of arithmetic between dates and durations apply in this case, and days are "pinned" in the range for each month. In our example, ---31, the selected dates will be January 31st, February 28th (or 29th), March 31st, April 30th, etc.
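The pinning rule is easy to reproduce. The Python sketch below lists the dates selected by a day-of-month value for each month of a given year; the function name dates_for_g_day is hypothetical, and time zones are ignored.

```python
import calendar
from datetime import date


def dates_for_g_day(day, year):
    """Return the date selected in each month of year, pinned to the month's last day."""
    selected = []
    for month in range(1, 13):
        last_day = calendar.monthrange(year, month)[1]
        selected.append(date(year, month, min(day, last_day)))
    return selected


for d in dates_for_g_day(31, 2001)[:4]:
    print(d)
# 2001-01-31
# 2001-02-28
# 2001-03-31
# 2001-04-30
```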
xs:gMonthDay is a period of a Gregorian calendar day recurring each Gregorian calendar year. The lexical representation of xs:gMonthDay is --MM-DD with an optional time zone specification. Valid values for xs:gMonthDay include:
--05-01
--11-01Z
--11-01+02:00
--11-01-04:00
--11-15
--02-29
The following values are invalid:
-01-30- (the format must be --MM-DD)
--01-35 (the day part is out of range)
--1-5 (all the digits must be supplied)
01-15 (the leading -- is missing)
xs:gMonth is a period of a Gregorian calendar month recurring each Gregorian calendar year. The lexical representation of xs:gMonth defined in the Recommendation is --MM-- with an optional time zone specification. The W3C XML Schema Working Group has acknowledged that this was an error and that the format --MM defined by ISO 8601 should be used instead. It has not been decided yet if the format described in the Recommendation will be forbidden or only deprecated, but it is advised to use the format --MM (assuming that the tools you are using already support it). Valid values for xs:gMonth include:
--05
--11Z
--11+02:00
--11-04:00
--02
The following values are invalid:
-01- (the format must be --MM)
--13 (the month is out of range)
--1 (both digits must be provided)
01 (the leading -- is missing)
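The three recurring datatypes (xs:gDay, xs:gMonthDay, and xs:gMonth) differ only in which fields they keep, so their lexical forms can be checked in the same way. The Python sketch below is a simplified illustration: the function names are hypothetical, xs:gMonth is checked against the --MM form recommended above, and the range of the time zone offset is not verified.

```python
import re

TZ = r"(?:Z|[+-]\d{2}:\d{2})?"  # optional time zone
G_DAY_RE = re.compile(r"---(\d{2})" + TZ)
G_MONTH_DAY_RE = re.compile(r"--(\d{2})-(\d{2})" + TZ)
G_MONTH_RE = re.compile(r"--(\d{2})" + TZ)

# Longest possible length of each month (February counts 29 days).
MAX_DAYS = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def is_g_day(text):
    match = G_DAY_RE.fullmatch(text)
    return match is not None and 1 <= int(match.group(1)) <= 31


def is_g_month_day(text):
    match = G_MONTH_DAY_RE.fullmatch(text)
    if match is None:
        return False
    month, day = int(match.group(1)), int(match.group(2))
    return 1 <= month <= 12 and 1 <= day <= MAX_DAYS[month - 1]


def is_g_month(text):
    match = G_MONTH_RE.fullmatch(text)
    return match is not None and 1 <= int(match.group(1)) <= 12


print(is_g_day("---01-04:00"), is_g_month_day("--02-29"), is_g_month("--13"))
# True True False
```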
The lexical space of xs:duration is the format defined by ISO 8601 under the form PnYnMnDTnHnMnS, in which the capital letters are delimiters that can be omitted when the corresponding member is not used. An important difference from the format used for xs:dateTime is that none of these members is mandatory and none of them is restricted to a range. This gives flexibility to choose the units that will be used and to combine several of them--for instance, P1Y2MT123S (1 year, 2 months, and 123 seconds). This flexibility has a price: such a duration is not completely defined, since a year may have 365 or 366 days, and a period of two months lasts between 59 and 62 days. Durations cannot always be compared and the order between durations is partial. We will see, in the next chapter, that user-defined datatypes can be derived from xs:duration, which can restrict the components used to express durations and ensure that these indeterminacies do not happen.
Since the value of a duration is fixed as soon as you give it a starting point, the schema Working Group has identified four datetimes:
1696-09-01T00:00:00Z
1697-02-01T00:00:00Z
1903-03-01T00:00:00Z
1903-07-01T00:00:00Z
These cause the greatest deviations when durations mixing day, month, and other components are added. The Working Group has determined that the comparison of durations is undefined if--and only if--the result of the comparison is different when each of these dates is used as a starting point.
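This rule can be sketched in Python as follows. The sketch makes several assumptions that are not part of the Recommendation: a duration is represented as a hypothetical pair (months, seconds), where years are folded into months and days, hours, and minutes into seconds; the function names are invented for the illustration; and the four starting points are treated as naive UTC datetimes.

```python
import calendar
from datetime import datetime, timedelta

REFERENCE_STARTS = [
    datetime(1696, 9, 1),
    datetime(1697, 2, 1),
    datetime(1903, 3, 1),
    datetime(1903, 7, 1),
]
DAY = 86400  # seconds in a day


def add_duration(start, months, seconds):
    """Add whole months first (pinning the day), then the remaining seconds."""
    total = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(total, 12)
    last_day = calendar.monthrange(year, month_index + 1)[1]
    shifted = start.replace(year=year, month=month_index + 1,
                            day=min(start.day, last_day))
    return shifted + timedelta(seconds=seconds)


def compare_durations(a, b):
    """Return -1, 0, or 1, or None if the four starting points disagree."""
    outcomes = set()
    for start in REFERENCE_STARTS:
        end_a = add_duration(start, *a)
        end_b = add_duration(start, *b)
        outcomes.add((end_a > end_b) - (end_a < end_b))
    return outcomes.pop() if len(outcomes) == 1 else None


print(compare_durations((1, 0), (0, 30 * DAY)))  # P1M vs. P30D: None
print(compare_durations((1, 0), (0, 27 * DAY)))  # P1M vs. P27D: 1
```

One month is longer than 27 days whatever the starting point, so that comparison is defined; it may be shorter than, equal to, or longer than 30 days, so that one is not.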
Valid values for xs:duration include:
PT1004199059S
PT130S
PT2M10S
P1DT2S
-P1Y
P1Y2M3DT5H20M30.123S
The following values are invalid:
1Y (the leading P is missing)
P1S (the T separator is missing)
P-1Y (all parts must be positive)
P1M2Y (the parts order is significant and Y must precede M)
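The lexical rules behind these examples (a leading P, components in a fixed order, a T before any time component, and a sign allowed only in front of the whole value) fit in a single regular expression. The Python sketch below is one possible way to write it; the name is_xs_duration is hypothetical.

```python
import re

DURATION_RE = re.compile(
    r"-?P(?=.)"                      # optional sign, P, then at least one component
    r"(?:\d+Y)?(?:\d+M)?(?:\d+D)?"   # date part: years, months, days, in that order
    r"(?:T(?=.)"                     # T must be followed by a time component
    r"(?:\d+H)?(?:\d+M)?(?:\d+(?:\.\d+)?S)?)?"
)


def is_xs_duration(text):
    """Return True if text matches the PnYnMnDTnHnMnS lexical form."""
    return DURATION_RE.fullmatch(text) is not None


print(is_xs_duration("P1Y2M3DT5H20M30.123S"))  # True
print(is_xs_duration("P1S"))                   # False: the T separator is missing
```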
Copyright © 2002 O'Reilly & Associates. All rights reserved.