Hack 84 Create an Atom Document

   


Atom is gaining ground as a feed format, and we should be paying attention to it. This hack guides you through the creation of an Atom document from a template.

Atom is an emerging syndication format for newsfeeds, an attempt to evolve from and improve on the current RSS scene. You can find a draft of the specification at http://www.mnot.net/drafts/draft-nottingham-atom-format-02.html. The Atom forum folks are also creating an Atom API to support Atom newsfeeds (http://bitworking.org/projects/atom/draft-gregorio-09.html).

The goals of the forum are that Atom will be 100 percent vendor-neutral, freely extensible by anybody, cleanly and thoroughly specified, and implemented by everybody (http://www.intertwingly.net/wiki/pie/RoadMap). The Atom spec is certainly more concise than its competitors, and I like that. More and more aggregators are supporting Atom, so I'll say with some confidence that Atom is headed in the right direction, if not to the head of the pack (all in due time, of course).

The following template may be used to create your own Atom feed. atom.xml is a brief example of an Atom document:

<feed version="0.3" xmlns="http://purl.org/atom/ns#" xml:lang="en">
  <title>Wy'east Communications</title>
  <link rel="alternate" type="text/html"
   href="http://www.wyeast.net/"/>
  <author>
    <name>Mike Fitzgerald</name>
  </author>
  <tagline>Wy'east Communication is an XML consultancy.</tagline>
  <modified>2004-08-14T02:43:00-07:00</modified>
  <entry>
    <title>Legend of Wy'east</title>
    <link rel="alternate" type="text/html"
     href="http://www.wyeast.net/wyeast.html"/>
    <id>http://www.wyeast.net/wyeast.html</id>
    <issued>2004-08-14T02:43:00-07:00</issued>
    <modified>2004-08-14T02:43:00-07:00</modified>
  </entry>
</feed>

The document element of an Atom document is feed. This element and its children must be in the namespace http://purl.org/atom/ns# (either a default namespace or prefixed). RSS 0.91 and RSS 2.0 avoid namespaces, but Atom cannot do so. feed must have a version attribute; currently, the appropriate value is 0.3, but I expect that to change by the time you are reading this. An xml:lang attribute (http://www.w3.org/TR/2004/REC-xml-20040204/#sec-lang-tag) is recommended but not mandatory. Values of xml:lang must conform to RFC 3066, "Tags for the Identification of Languages" (http://www.ietf.org/rfc/rfc3066.txt).
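If you would rather generate the document element from a script than type it by hand, the rules above can be sketched in Python with the standard xml.etree.ElementTree module. This is only an illustration: the function name make_feed and its defaults are my own, and the version value 0.3 is the one current in this hack.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://purl.org/atom/ns#"  # Atom 0.3 namespace named in the text
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def make_feed(lang="en", version="0.3"):
    """Return an empty feed element in the Atom namespace (hypothetical helper)."""
    # Serialize the Atom namespace as the default namespace (no prefix).
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element("{%s}feed" % ATOM_NS, {"version": version})
    # xml:lang is recommended but not mandatory.
    feed.set(XML_LANG, lang)
    return feed

feed = make_feed()
xml_text = ET.tostring(feed, encoding="unicode")
print(xml_text)
```

The printed element carries the xmlns declaration, the version attribute, and xml:lang, which is everything the feed start tag in atom.xml has.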

The order of the child elements of feed doesn't matter in Atom.


A feed element must contain exactly one title element, which is a title for the resource. It should match the content of the title element in an HTML document, if that is the kind of resource the Atom document is talking about. This corresponds to the title element used in RSS 0.91, RSS 1.0, and RSS 2.0.

A feed element must also contain exactly one link element that has a rel attribute with a value of alternate. (Unlike the link elements of Atom's predecessors, Atom link elements must be empty; that is, they must not have any content.) feed can have more than one link element child, but the rel attribute of each additional link must have a value other than alternate; for example, start, next, or prev (rel attribute values are specified at http://bitworking.org/projects/atom/draft-gregorio-09.html#rfc.section.5.4). The type attribute is mandatory and provides a media type (such as text/html) for the linked resource according to the guidelines in RFC 2045, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies" (http://www.ietf.org/rfc/rfc2045.txt). In addition, a link element must have an href attribute that contains a URI for the resource in question. Finally, link may have a title attribute giving the title of the link.
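The link rules are easy to get wrong when a feed grows, so here is a rough Python check of them, again using xml.etree.ElementTree. The function check_links is hypothetical, it covers only the rules described above (one alternate link, mandatory type and href, no content), and the page2.html URL is made up to provoke a complaint.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://purl.org/atom/ns#}"  # Atom 0.3 namespace in ElementTree notation

def check_links(parent):
    """Return a list of problems found in the link children of a feed or entry."""
    problems = []
    links = parent.findall(ATOM + "link")
    alternates = [link for link in links if link.get("rel") == "alternate"]
    if len(alternates) != 1:
        problems.append("expected one rel='alternate' link, found %d" % len(alternates))
    for link in links:
        rel = link.get("rel")
        # type and href are both required on every link.
        for attr in ("type", "href"):
            if not link.get(attr):
                problems.append("link rel=%r is missing %s" % (rel, attr))
        # link elements must be empty: no child elements, no text.
        if len(link) or (link.text or "").strip():
            problems.append("link rel=%r must be empty" % rel)
    return problems

# page2.html is a made-up URL; the second link deliberately omits type.
feed = ET.fromstring(
    '<feed version="0.3" xmlns="http://purl.org/atom/ns#">'
    '<link rel="alternate" type="text/html" href="http://www.wyeast.net/"/>'
    '<link rel="next" href="http://www.wyeast.net/page2.html"/>'
    '</feed>'
)
problems = check_links(feed)
print(problems)
```

An empty list means the links passed these particular checks; it does not mean the whole feed is valid.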

A feed element must have exactly one author element, or, as an alternative, all the entry element children of feed must have an author element. An author element must have exactly one name child element, and may also have one url and/or one email child. url contains a URI associated with the named author, and email is an RFC 822 (http://www.ietf.org/rfc/rfc822.txt) email address for the same author. The order of the child elements of author is not significant. An author element is said to be a Person construct.
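Because author and contributor share the Person construct, one small helper can build both. The following Python sketch assumes xml.etree.ElementTree; the helper name person is my own invention, and the url value simply reuses the site address from atom.xml.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://purl.org/atom/ns#}"  # Atom 0.3 namespace in ElementTree notation

def person(tag, name, url=None, email=None):
    """Build a Person construct such as author or contributor (hypothetical helper)."""
    elem = ET.Element(ATOM + tag)
    # Exactly one name child is required.
    ET.SubElement(elem, ATOM + "name").text = name
    # url and email are optional, at most one of each.
    if url is not None:
        ET.SubElement(elem, ATOM + "url").text = url
    if email is not None:
        ET.SubElement(elem, ATOM + "email").text = email
    return elem

author = person("author", "Mike Fitzgerald", url="http://www.wyeast.net/")
print([child.tag for child in author])
```

The order in which the children are appended is arbitrary, since the order of the child elements of a Person construct is not significant.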

The document contains one tagline element, an optional element that holds a description of the feed and is a child of feed. tagline is a Content construct. A Content construct may have a type attribute and a mode attribute. type has a media type as a value, following the guidelines in RFC 2045 (http://www.ietf.org/rfc/rfc2045.txt). A mode attribute can have one of three values: xml (the default if mode is not present), escaped, or base64. If the value is xml, it means that the element's content is XML, such as XHTML (namespace qualified); if escaped, the content is an escaped string that must be unescaped by the feed processor; if base64, the content is base64-encoded (see also RFC 2045).
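To make the three modes concrete, here is a minimal Python sketch of how a feed processor might read a Content construct. The function content_text is hypothetical, the sample elements leave out the Atom namespace for brevity, and the base64 branch assumes the decoded bytes are UTF-8 text, which the spec does not promise.

```python
import base64
import xml.etree.ElementTree as ET

def content_text(elem):
    """Return a text view of a Content construct, honoring its mode attribute."""
    mode = elem.get("mode", "xml")  # xml is the default when mode is absent
    if mode == "xml":
        # Inline markup: gather the text of the element and its descendants.
        return "".join(elem.itertext())
    if mode == "escaped":
        # The XML parser has already turned &lt; into <, so elem.text
        # is the unescaped string.
        return elem.text or ""
    if mode == "base64":
        # Assumption: the payload is UTF-8 text once decoded.
        return base64.b64decode(elem.text or "").decode("utf-8")
    raise ValueError("unknown mode: %r" % mode)

inline = ET.fromstring("<tagline>An <b>XML</b> consultancy</tagline>")
escaped = ET.fromstring('<content mode="escaped">&lt;b&gt;Legend&lt;/b&gt;</content>')
encoded = ET.Element("content", {"mode": "base64"})
encoded.text = base64.b64encode("Legend of Wy'east".encode("utf-8")).decode("ascii")

for sample in (inline, escaped, encoded):
    print(content_text(sample))
```

Note that in escaped mode the result is still a string of markup; what to do with it (render it, strip it, sanitize it) is up to the consuming application.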

In addition, feed must contain exactly one modified element that contains a Date construct. A Date construct is formed according to the Date and Time Formats note published by the W3C (http://www.w3.org/TR/NOTE-datetime).
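A Date construct like the one in atom.xml can be produced from Python's datetime module, whose isoformat() output matches the complete date-plus-time form of the W3C note. The helper name atom_date is hypothetical, and refusing naive datetimes is my own choice, made because that form of the note requires a time zone designator.

```python
from datetime import datetime, timedelta, timezone

def atom_date(dt):
    """Format an aware datetime as a W3C date-time string (hypothetical helper)."""
    if dt.tzinfo is None:
        raise ValueError("a time zone is required, e.g. -07:00")
    # Drop microseconds so the result looks like 2004-08-14T02:43:00-07:00.
    return dt.replace(microsecond=0).isoformat()

pacific_daylight = timezone(timedelta(hours=-7))
stamp = atom_date(datetime(2004, 8, 14, 2, 43, tzinfo=pacific_daylight))
print(stamp)  # 2004-08-14T02:43:00-07:00
```

The printed value is the same string used for modified and issued in atom.xml.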

A feed element may also contain these optional elements:

  • One or more contributor elements, each of which, like author, is a Person construct

  • One id element that contains a globally unique identifier, which must be a URI and must not change; for example, a tag URI (http://www.taguri.org/)

  • One generator element that states what software generated the feed

  • One copyright element containing a copyright statement (a Content construct)

  • One info element explaining the feed format (a Content construct)

6.6.1 Feed Entries

An Atom document may contain zero or more entry elements as children of feed. An entry element roughly corresponds to an item element in the earlier RSS specs. entry may contain any namespace-qualified child elements. As with feed, entry must contain a title element, a link element, and an author element, which have the same specs as those in feed.

Each entry element must have modified and issued child elements, which are both Date constructs. The difference is that issued states when the entry was issued, and modified states the last time the entry was modified. An entry may also have a created child (also a Date construct) which gives the entry's creation date.

Finally, an entry element may have one or more contributor child elements, defined just as in feed. It may also have a summary element that contains a summary, excerpt, or abstract of the entry, and one or more content children (Content constructs) that carry the actual payload of the entry in XML, escaped, or base64 form rather than just linking to it.
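The required children of entry can be checked with a few lines of Python. This sketch uses xml.etree.ElementTree and a hypothetical function, missing_in_entry; it treats author as required in the entry only when feed has no author of its own, following the alternative described earlier, and it looks only for the presence of each element, not its content.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://purl.org/atom/ns#}"  # Atom 0.3 namespace in ElementTree notation

def missing_in_entry(entry, feed_has_author):
    """Return the names of required children that an entry lacks."""
    required = ["title", "link", "modified", "issued"]
    if not feed_has_author:
        # Without a feed-level author, every entry needs its own.
        required.append("author")
    return [name for name in required if entry.find(ATOM + name) is None]

# The author and entry from atom.xml, trimmed to the relevant parts.
feed = ET.fromstring("""<feed version="0.3" xmlns="http://purl.org/atom/ns#">
  <author><name>Mike Fitzgerald</name></author>
  <entry>
    <title>Legend of Wy'east</title>
    <link rel="alternate" type="text/html"
     href="http://www.wyeast.net/wyeast.html"/>
    <id>http://www.wyeast.net/wyeast.html</id>
    <issued>2004-08-14T02:43:00-07:00</issued>
    <modified>2004-08-14T02:43:00-07:00</modified>
  </entry>
</feed>""")

has_author = feed.find(ATOM + "author") is not None
report = [missing_in_entry(e, has_author) for e in feed.findall(ATOM + "entry")]
print(report)
```

An empty inner list means the entry has every child this sketch looks for.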

6.6.2 See Also

  • General information about Atom: http://www.atomenabled.org

  • Wiki site for developers: http://www.intertwingly.net/wiki/pie/FrontPage



XML Hacks: 100 Industrial-Strength Tips and Tools
ISBN: 0596007116
Year: 2006
Pages: 156
