WebDAV properties are expressed in XML in PROPFIND and PROPPATCH requests and responses. The first piece to put in place is how those properties are named and expressed in XML. This section attempts to build property representation from the ground up, combining rules about how to represent property names and property values. Then when I show complete PROPFIND and PROPPATCH request and response bodies in XML in Sections 7.2 and 7.3, all the pieces will be in place to understand those examples.

7.1.1 Basic Property Value Example

A property value is represented in WebDAV messages as the text contents of an XML element. The element name is the property name. The element namespace is the property namespace (see Listing 7-1).

Listing 7-1 Basic property name/value example.

    <D:getlastmodified xmlns:D="DAV:">Thu, 16 Aug 2001 23:24:33 GMT</D:getlastmodified>

In this example, getlastmodified is the property name, DAV: is the property namespace, and the value is a string formatted as a date.

7.1.2 Property Name Only

Sometimes only the property name will appear (PROPFIND requests, Section 7.2.1). When this is required, the property name element is shown the same way but without a value. For example, the getlastmodified property is named like this:

    <D:prop xmlns:D="DAV:">
      <D:getlastmodified/>
    </D:prop>

XML marshals empty elements two ways, so it's also possible to see:

    <D:getlastmodified></D:getlastmodified>

An XML parser will treat these two as equivalent, so the WebDAV implementation doesn't have to worry about both.

7.1.3 Empty Property Values

An empty property value is different from a property that does not exist. When a property exists on a resource but has no value, it can appear empty. The formatting of empty property values appears identical to showing property names, but the context is different (this is used in PROPFIND responses, Section 7.2.2). The following example is excerpted from a larger response.
The status shows that the property value was returned successfully; therefore, the value must be empty (see Listing 7-2).

Listing 7-2 Empty property value.

    <D:prop>
      <D:resourcetype/>
    </D:prop>
    <D:status>HTTP/1.1 200 OK</D:status>

Again, the equivalent XML syntax may be used to express the empty value as:

    <D:resourcetype></D:resourcetype>

7.1.4 Making Property Values Safe

XML needs a way to hold any kind of text without changing the XML parsing or making the XML document invalid. This is done by making the text "safe." In XML documents, < and > are the only control characters, and & is used as an escape character, so these three characters are the only ones that must be treated specially. Any property value containing these characters must be made safe to keep the XML document parsable and valid. Otherwise, the XML document may be unparsable or the recipient may misinterpret which characters make up the property value.

There are two ways to make text safe for XML. One is to wrap the text in a special begin and end string, unlikely to occur naturally in text. This is called encapsulation. The other method, called escaping, replaces each illegal character with a string that can be used to restore the original character when the text is removed from the XML.

Encapsulation

XML defines a way of encapsulating text that may contain illegal characters: The text is preceded by <![CDATA[ and followed by ]]>. CDATA sections cannot nest and may not include ]]>. A property named transit has its value kelowna-->penticton encapsulated as:

    <x:transit><![CDATA[kelowna-->penticton]]></x:transit>

Character Escaping

Text can also be made safe for XML by escaping each illegal character individually. Characters are escaped with the same mechanism used in HTML. Angle brackets (< and >) are replaced with &lt; and &gt;, respectively. Natural occurrences of the ampersand character (&) must be replaced with the string &amp;.

A property named transit with a value of kelowna-->penticton is escaped as:

    <x:transit>kelowna--&gt;penticton</x:transit>
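The two techniques can be compared side by side with a short Python sketch. It uses only the standard library; the namespace URI bound to the x: prefix and the encapsulate helper are assumptions made for illustration, not part of WebDAV.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical namespace for the x: prefix; the text does not specify one.
NS_DECL = 'xmlns:x="http://example.com/ns"'


def encapsulate(value):
    """Wrap a value in a CDATA section; reject values containing ']]>'."""
    if "]]>" in value:
        raise ValueError("CDATA sections may not contain ']]>'")
    return "<![CDATA[" + value + "]]>"


value = "kelowna-->penticton"

# The same property value, made safe in the two ways described above.
escaped_doc = f"<x:transit {NS_DECL}>{escape(value)}</x:transit>"
cdata_doc = f"<x:transit {NS_DECL}>{encapsulate(value)}</x:transit>"

# A parser hands back the original text in both cases.
escaped_text = ET.fromstring(escaped_doc).text
cdata_text = ET.fromstring(cdata_doc).text
```

Because the parser undoes either form, code that reads property values through an XML parser should not need to know which method the sender chose.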
When property values are set by the client, the client may choose to encapsulate or encode the value when sending it to the server. The server may or may not use the same method for making the value safe when it returns the property value, so the client must be prepared to accept a different encoding than the one used when the property value was set.

7.1.5 Storing Property Value Text

Servers must store property names, namespaces, and values and the language of the property if it was provided by the client. There are several approaches to storing properties, and countless variations exist.
Since servers may choose any variation on any of these options, and since clients may also submit property values that have been transformed (made safe) multiple times, clients should be prepared to encounter strings like any of the following examples. The third and fourth examples are technically illegal because they contain both kinds of escaping, but they might occur anyway.
7.1.6 Whitespace

Between XML elements, it doesn't matter how many whitespace characters (tabs, carriage returns, new lines, or spaces) are included. However, whitespace does matter inside XML text element values. Thus, a string property could have a value of a single space, two spaces, or no spaces and these are all different, valid values.

This sometimes causes confusion when whitespace characters are added for readability in testing or debugging. For example, if a test application put spaces before or after a date, inside the element, the recipient could find this to be an invalid date value, since date values are not supposed to have spaces. For the date property named getlastmodified, the following representation would be invalid unless the spaces were removed:

    <D:getlastmodified> 2001-05-11T17:33:11Z </D:getlastmodified>

For empty property values in particular, this can cause confusion. The following example is not an empty value for the resourcetype property; it is an illegal value consisting of whitespace:

    <D:resourcetype> </D:resourcetype>

Implementors should think carefully before adding or stripping leading or trailing whitespace. That's why in this book I've been very careful, adding whitespace to improve readability, but only where it doesn't change the meaning of the example. The character is used when a new line couldn't be avoided, even though it shouldn't be considered part of the example.

7.1.7 Internationalization

Property names and property values must both be internationalizable. They may contain characters such as accented characters, Arabic or Hebrew script, Chinese characters, and so on. The XML body of a WebDAV message may use one of several character sets, including the required character sets UTF-8 and UTF-16. Thus, any Unicode character may be represented in an XML document and included in a WebDAV property name or property value. XML supports Unicode via the UTF-8 and UTF-16 encodings.
The recipient may have to convert string properties from the XML encoding character set to an internal representation, but in some languages this is automatic. Properties must be stored in a format compatible with their character set.
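As a rough illustration of the conversion step, the following Python sketch parses the same property from a UTF-8 body and a UTF-16 body. The property name, namespace, and value are hypothetical; the point is only that the parser delivers the same internal string regardless of the encoding on the wire.

```python
import xml.etree.ElementTree as ET

# Hypothetical property; the name, namespace, and value are assumptions.
value = "Zoë 東京"
body = f'<x:author xmlns:x="http://example.com/ns">{value}</x:author>'

# The same document serialized in the two required character encodings.
# Python's "utf-16" codec writes a byte order mark, which the parser uses.
utf8_bytes = ('<?xml version="1.0" encoding="UTF-8"?>' + body).encode("utf-8")
utf16_bytes = ('<?xml version="1.0" encoding="UTF-16"?>' + body).encode("utf-16")

# The parser converts both byte streams to the same Python string.
text_from_utf8 = ET.fromstring(utf8_bytes).text
text_from_utf16 = ET.fromstring(utf16_bytes).text
```

In Python the internal representation is a Unicode string, so the conversion is automatic; in other environments the recipient may need to transcode explicitly.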
Careless handling of character sets may lead to problems:
7.1.8 XML-Valued Properties

Some WebDAV property values are strings intended to be parsed as XML. These values contain one or more self-contained XML elements. If the value is not well-formed XML or is incomplete, then the sender has no choice but to encapsulate or escape the value. If the value is well-formed and complete, then the sender might choose to put the value directly into the XML stream.

Let's take the example of an XML-formatted value:

    <home>555-1234</home><work>555-4321</work>

We'll put this inside a property named phone in the http://example.com/contacts namespace:

    <x:phone xmlns:x="http://example.com/contacts">
      <home>555-1234</home><work>555-4321</work></x:phone>

That was easy, but only because the inner value does not use namespaces, and there's no need to handle prefixes. When namespaces are used, namespace prefixes must be chosen so that they are unique within a scope. The scope of a namespace declaration includes the element where the declaration is placed and every element in the hierarchy underneath, but not any part of the document outside that XML element. If the namespace is defined on the root element, it applies to all elements inside the document.

If the property value uses a new namespace not already declared within the scope, a new prefix must be chosen. For example, we'll modify the preceding example so that the home and work elements are defined in the http://example.com/contacts/phonetypes namespace.

    <x:phone xmlns:x="http://example.com/contacts"
             xmlns:y="http://example.com/contacts/phonetypes">
      <y:home>555-1234</y:home><y:work>555-4321</y:work>
    </x:phone>

A sender might be tempted to apply a simple rule: "Always declare a new prefix for every namespace appearing in the value." However, XML scoping rules prevent this if the same namespace is already declared in the same scope. For example, when the resourcetype property, in the DAV: namespace, takes a value that includes the DAV: namespace, the same prefix must be reused:

    <D:resourcetype xmlns:D="DAV:"><D:collection/></D:resourcetype>

It would be incorrect for a sender to attempt simply to encapsulate an XML value:

    <D:resourcetype xmlns:D="DAV:"><![CDATA[<D:collection/>]]></D:resourcetype>
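The difference between the two resourcetype forms above becomes visible as soon as they are parsed. The following Python sketch, with a hypothetical describe helper, shows what a recipient would likely see: a child element in the first case and plain text in the second.

```python
import xml.etree.ElementTree as ET

xml_valued = '<D:resourcetype xmlns:D="DAV:"><D:collection/></D:resourcetype>'
encapsulated = (
    '<D:resourcetype xmlns:D="DAV:">'
    "<![CDATA[<D:collection/>]]>"
    "</D:resourcetype>"
)


def describe(doc):
    """Return (child element names, text content) for a property element."""
    prop = ET.fromstring(doc)
    return [child.tag for child in prop], prop.text


# The XML-valued form has a child element in the DAV: namespace and no text.
xml_children, xml_text = describe(xml_valued)

# The encapsulated form has no children; the markup is just a string.
cdata_children, cdata_text = describe(encapsulated)
```

ElementTree reports namespaced names as {namespace}localname, so the prefix chosen by the sender does not matter to the recipient; only the namespace it is bound to does.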
WebDAV implementations have to go to some trouble to put XML-formatted properties into legal XML documents. Servers can have trouble storing XML-formatted property values such that the property is reconstructed together with its namespaces without any prefix collisions. A WebDAV server needs to detect whether the client has sent an XML-formatted value for a custom property so that later the server knows whether to do prefix correction when it marshals the value in XML. Luckily, XML parsers can do that easily.

7.1.9 Date and Time Properties

Date and time properties, such as creationdate, are represented as strings in an ISO 8601 format subset recommended by the IETF [RFC3339]. The format allows dates alone, times alone, or dates and times together and can include time zone information. Some examples of values for a timestamp property such as creationdate are:

    1997-12-01T17:42:21-08:00
    2001-05-11T17:33:11Z

The part up to the "T" is the date. The part after the "T" and before the hyphen or "Z" is the time. The last piece is the time zone, either the GMT ("Zulu") time zone or a number of hours offset from GMT. In the first example, the time zone is eight hours before GMT; a plus sign would appear if the time zone were after GMT.

Although the ISO 8601 format can be parsed by humans, it's typically transformed into a more readable format for actual display; for example, "7:33 a.m., Saturday May 5, 2001." When the date is formatted for display, some precision may be omitted.

Note that although ISO 8601 allows date and time formats that are incomplete (e.g., the date without a time specified, or the time without a date specified), WebDAV makes further restrictions on timestamps; the date must be fully specified and the time must be fully specified, including time zone.

The getlastmodified property is a little different from regular date/time representations. Since it's defined by its relationship to the Last-Modified header in HTTP/1.1, it uses the same format, even though this format has interoperability and internationalization problems and is no longer recommended for IETF protocols. The format for the Last-Modified header and thus the getlastmodified property is defined in RFC2616. An example of the format is Tue, 15 Nov 1994 12:45:26 GMT.

Timestamp Interoperability Challenges

Dates and times are difficult to do interoperably, particularly when the precision can vary. Some sample problems:
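Returning to the two timestamp formats described in this section, a client might parse them along the following lines. This is a minimal Python sketch using only the standard library; the function names are hypothetical, and the only WebDAV restriction it enforces is that a time zone must be present.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_webdav_timestamp(text):
    """Parse an ISO 8601 / RFC 3339 timestamp that must carry a time zone."""
    # Older Python versions do not accept a trailing "Z", so normalize it.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        raise ValueError("timestamp must include a time zone")
    return parsed


def parse_getlastmodified(text):
    """Parse the HTTP date format shared by Last-Modified and getlastmodified."""
    return parsedate_to_datetime(text)


created = parse_webdav_timestamp("1997-12-01T17:42:21-08:00")
modified = parse_getlastmodified("Tue, 15 Nov 1994 12:45:26 GMT")

# Normalizing to UTC makes values with different offsets directly comparable.
created_utc = created.astimezone(timezone.utc)
```

Converting every parsed value to a single time zone before comparing or storing it is one way to sidestep offset-related mismatches, although it does nothing about differences in precision.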