Writing RSS 2.0 Documents


Here is our RSS document in RSS 2.0 format, as created by NewzAlert Composer in Chapter 3. Note that NewzAlert Composer wraps text in the XML markup <![CDATA[ and ]]>, which tells XML parsers to treat the enclosed text as literal character data rather than as markup. You don't need that markup in your own RSS 2.0 documents unless your text contains characters such as < or & that would otherwise have to be escaped; in most cases you can place the text directly in your document.

    <?xml version="1.0"?>
    <rss version="2.0">
      <channel>
        <generator>NewzAlert Composer v1.70.6, Copyright (c) 2004-2005
          Castle Software Ltd, http://www.NewzAlert.com</generator>
        <lastBuildDate>Thu, 08 Dec 2005 14:01:27 -0500</lastBuildDate>
        <pubDate>Thu, 08 Dec 2005 14:01:34 -0500</pubDate>
        <title>Steve's News</title>
        <description><![CDATA[This feed contains news from Steve!]]></description>
        <link>http://www.rssmaniac.com/steve</link>
        <language>en-us</language>
        <copyright>(c) 2006</copyright>
        <managingEditor>Steve</managingEditor>
        <image>
          <title>Steve's News</title>
          <url>http://www.rssmaniac.com/steve/Image.jpg</url>
          <link>http://www.rssmaniac.com/steve</link>
          <description>Steve's News</description>
          <width>144</width>
          <height>36</height>
        </image>
        <item>
          <title>Steve shovels the snow</title>
          <description><![CDATA[It snowed once again. Time to shovel!]]></description>
          <pubDate>Thu, 08 Dec 2005 08:39:51 -0500</pubDate>
          <link>http://www.rssmaniac.com/steve</link>
        </item>
      </channel>
    </rss>


RSS 2.0 is really the successor to RSS 0.91 and 0.92, not to RSS 1.0. Here's what's different in RSS 2.0:

  • Channels are no longer limited to 15 items: there is no limit to the number of items in a channel.

  • RSS still places restrictions on the first non-whitespace characters of the data in <link> and <url> elements, but they are looser than before. The URLs in these elements must begin with an IANA-registered URI scheme, such as http://, https://, news://, mailto:, or ftp:// (no longer just http:// or ftp://).

  • The <!DOCTYPE> declaration is no longer required.

  • You use <rss version="2.0"> instead of <rss version="0.91"> as the document element.

  • There are new elements and attributes.
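To make the DOCTYPE and document-element changes concrete, here is a rough before-and-after sketch of the top of a feed. The first fragment opens an RSS 0.91 file with the Netscape DOCTYPE declaration; the second is the RSS 2.0 equivalent. The channel content is omitted in both, and the https:// link in the second fragment is an assumption for illustration (the feed in this chapter uses http://).

```xml
<?xml version="1.0"?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
  "http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91">
  <channel>
    <!-- channel content omitted -->
  </channel>
</rss>
```

```xml
<?xml version="1.0"?>
<!-- No DOCTYPE declaration is needed in RSS 2.0 -->
<rss version="2.0">
  <channel>
    <!-- Assumed https:// variant of the chapter's link -->
    <link>https://www.rssmaniac.com/steve</link>
    <!-- other channel content omitted -->
  </channel>
</rss>
```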

Read on for an in-depth look at the new elements and attributes in RSS 2.0.

The <channel> Element

Here's a list of the required channel elements in RSS 2.0. Note that the <language> element, which was required in RSS 0.91, is now an optional child element in the <channel> element.

  • <title>

  • <link>

  • <description>
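As a minimal sketch, a channel that contains nothing but the three required elements might look like the following. The values are borrowed from the sample feed above; the example assumes an empty channel (one with no <item> elements) is acceptable, which is how I read the specification, although a real feed would normally include items.

```xml
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <!-- The three required channel elements -->
    <title>Steve's News</title>
    <link>http://www.rssmaniac.com/steve</link>
    <description>This feed contains news from Steve!</description>
  </channel>
</rss>
```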

Here are the optional <channel> child elements in RSS 2.0:

  • <language>

  • <copyright>

  • <managingEditor>

  • <webMaster>

  • <pubDate>

  • <lastBuildDate>

  • <category>

  • <generator>

  • <docs>

  • <cloud>

  • <ttl>

  • <image>

  • <rating>

  • <textInput>

  • <skipHours>

  • <skipDays>

Of these optional elements, the <category>, <generator>, <cloud>, and <ttl> elements are new and did not appear in version 0.91.

The <channel> Element's <category> Element

As you subscribe to more and more feeds in an RSS reader, you might want a reader that lets you organize your feeds into various folders or categories, as RSSReader does. The <category> element lets you suggest a category for your feed, and you can include as many <category> elements as you want. Here's a category element in our RSS 2.0 document:

    <?xml version="1.0"?>
    <rss version="2.0">
      <channel>
        <generator>NewzAlert Composer v1.70.6, Copyright (c) 2004-2005
          Castle Software Ltd, http://www.NewzAlert.com</generator>
        <lastBuildDate>Thu, 08 Dec 2005 14:01:27 -0500</lastBuildDate>
        <pubDate>Thu, 08 Dec 2005 14:01:34 -0500</pubDate>
        <title>Steve's News</title>
        <category>Newspapers</category>
            .
            .
            .


This element functions something like the categories on the back of books that are used for filing purposes in bookstores (Fiction/Mystery and the like).

This element is optional in the <channel> element and has one optional attribute, domain, a string (usually a URL) that identifies the categorization scheme the category belongs to.
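Here is a sketch of a <category> element that uses the domain attribute. The URL http://www.rssmaniac.com/categories is a made-up taxonomy address used only for illustration, and the forward slash in the value follows the specification's convention for naming a location within a hierarchical taxonomy.

```xml
<!-- domain value is hypothetical; it names the taxonomy this category comes from -->
<category domain="http://www.rssmaniac.com/categories">Newspapers/Local</category>
```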

The <channel> Element's <generator> Element

The optional <generator> element holds text identifying the program that was used to create the file. For example, our RSS 2.0 document was created by NewzAlert Composer.

    <?xml version="1.0"?>
    <rss version="2.0">
      <channel>
        <generator>NewzAlert Composer v1.70.6, Copyright (c) 2004-2005
          Castle Software Ltd, http://www.NewzAlert.com</generator>
        <lastBuildDate>Thu, 08 Dec 2005 14:01:27 -0500</lastBuildDate>
        <pubDate>Thu, 08 Dec 2005 14:01:34 -0500</pubDate>
        <title>Steve's News</title>
            .
            .
            .


This element is an optional child element of the <channel> element and has no attributes.

The <channel> Element's <cloud> Element

The <cloud> element lets you interact with a Web application (also called a cloud) that supports the rssCloud interface. At its fastest, RSS feeds are assumed to be updated hourly, but sometimes that's not fast enough. For that reason, you can register with a cloud for faster updates. Programs that are registered with the cloud are notified immediately of updates. You can learn more about clouds at http://blogs.law.harvard.edu/tech/soapMeetsRss#rsscloudInterface.

There are five required attributes in this element: domain, the domain of the cloud; port, the server port over which communication should take place; path, the path to the cloud's handler on the server; registerProcedure, the name of the procedure to call on the cloud to register for notifications; and protocol, the online communication protocol, which is usually XML-RPC (RPC stands for Remote Procedure Call) or SOAP (Simple Object Access Protocol).

    <?xml version="1.0"?>
    <rss version="2.0">
      <channel>
        <generator>NewzAlert Composer v1.70.6, Copyright (c) 2004-2005
          Castle Software Ltd, http://www.NewzAlert.com</generator>
        <lastBuildDate>Thu, 08 Dec 2005 14:01:27 -0500</lastBuildDate>
        <pubDate>Thu, 08 Dec 2005 14:01:34 -0500</pubDate>
        <title>Steve's News</title>
        <cloud domain="rssmaniac.com" port="80" path="/RPC2"
          registerProcedure="register" protocol="soap"/>
            .
            .
            .


The <channel> Element's <ttl> Element

The last new optional element is the <ttl> element. TTL stands for "time to live," which is the number of minutes a channel can be cached before it should be refreshed from the source. It has always been assumed that channels are refreshed hourly, but if you're writing your own feed, that's going to be a tough one (unless you're a total fanatic, of course).

Here's how you might set the time for refreshes to two hours (120 minutes) in an RSS 2.0 document:

    <?xml version="1.0"?>
    <rss version="2.0">
      <channel>
        <generator>NewzAlert Composer v1.70.6, Copyright (c) 2004-2005
          Castle Software Ltd, http://www.NewzAlert.com</generator>
        <lastBuildDate>Thu, 08 Dec 2005 14:01:27 -0500</lastBuildDate>
        <pubDate>Thu, 08 Dec 2005 14:01:34 -0500</pubDate>
        <title>Steve's News</title>
        <ttl>120</ttl>
            .
            .
            .


The <item> Element

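There are also changes to the RSS 2.0 <item> element. RSS 0.91 required every item to have a <title> and a <link>. Strictly speaking, the RSS 2.0 specification makes every child of <item> optional, as long as at least one of <title> or <description> is present, but it is still good practice to give each item these two child elements: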

  • <title>

  • <link>

The optional child elements of the <item> element are the following:

  • <description>

  • <author>

  • <category>

  • <comments>

  • <enclosure>

  • <guid>

  • <pubDate>

  • <source>

Which ones are new? The optional <author>, <category>, <comments>, <enclosure>, <guid>, and <source> elements.
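One point worth a quick sketch: the RSS 2.0 specification asks only that an item contain at least one of <title> or <description>, so a short, untitled item along these lines should still be acceptable. Treat it as an illustration rather than a recommendation, since a title and a link make an item far easier for readers to present.

```xml
<!-- An item with a description but no title or link -->
<item>
  <description>It snowed once again. Time to shovel!</description>
  <pubDate>Thu, 08 Dec 2005 08:39:51 -0500</pubDate>
</item>
```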

The <item> Element's <author> Element

The <author> element encloses not the author's name, but his or her email. The idea is that readers of your feed can get in touch with the author of a particular item, especially if that author is not the author of the whole channel.

    <item>
      <title>Steve shovels the snow</title>
      <author>steve@rssmaniac.com</author>
      <description><![CDATA[It snowed once again. Time to shovel!]]></description>
      <pubDate>Thu, 08 Dec 2005 08:39:51 -0500</pubDate>
      <link>http://www.rssmaniac.com/steve</link>
    </item>


This is a much-needed element if you have a number of people writing for the same feed, as is the case more and more. If a reader has contact information only for the channel, he or she would have to contact the channel staff in order to reach a particular individual. Providing the author's email on an item-by-item basis is a better idea.

This optional element has no child elements or attributes.
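If you'd like readers to see a name as well, the example in the RSS 2.0 specification puts the name in parentheses after the email address. A sketch using this chapter's sample author (the display name "Steve" is assumed here) would be:

```xml
<!-- Email address first, then an optional name in parentheses -->
<author>steve@rssmaniac.com (Steve)</author>
```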

The <item> Element's <category> Element

The <category> child element of the <item> element lets you list a category for the item, which makes it easier to organize the items in your feed. This optional element might hold a category such as <category>Music</category> that tells readers where your item fits in. You can include as many <category> elements as you like.

    <item>
      <title>Steve shovels the snow</title>
      <author>steve@rssmaniac.com</author>
      <description><![CDATA[It snowed once again. Time to shovel!]]></description>
      <pubDate>Thu, 08 Dec 2005 08:39:51 -0500</pubDate>
      <link>http://www.rssmaniac.com/steve</link>
      <category>Snow</category>
      <category>Labor</category>
      <category>Weather</category>
    </item>


The <category> element has one optional attribute, domain. Its value is a string, usually a URL, that identifies the categorization scheme the category belongs to.

The <item> Element's <comments> Element

You might think that the <item> element's <comments> element contains comments about an item, but that's not the way it was designed. Instead, this element is supposed to contain a URL to a page of comments for this item.

    <item>
      <title>Steve shovels the snow</title>
      <author>steve@rssmaniac.com</author>
      <description><![CDATA[It snowed once again. Time to shovel!]]></description>
      <pubDate>Thu, 08 Dec 2005 08:39:51 -0500</pubDate>
      <link>http://www.rssmaniac.com/steve</link>
      <comments>http://www.rssmaniac.com/steve/comments.html</comments>
    </item>


This optional element has no child elements and no attributes.

The <item> Element's <enclosure> Element

The <enclosure> child element of the <item> element is a critical one. This is the element that makes podcasting possible (see Chapter 7, "Podcasting: Adding Multimedia to Your Feeds").

    <item>
      <title>Steve shovels the snow</title>
      <author>steve@rssmaniac.com</author>
      <description><![CDATA[It snowed once again. Time to shovel!]]></description>
      <pubDate>Thu, 08 Dec 2005 08:39:51 -0500</pubDate>
      <link>http://www.rssmaniac.com/steve</link>
      <enclosure url="http://www.rssmaniac.com/steve/shoveling.mp3"
        length="4823902" type="audio/mpeg" />
    </item>


This optional child element lets you include MP3 files, for example, in your feed. There are three required attributes in this element. The url attribute gives the online location of the enclosure, the length attribute gives its length in bytes, and the type attribute gives its MIME type, such as "audio/mpeg" for MP3 files (for a list of the possible MIME types, see http://www.iana.org/assignments/media-types/).

In this element, the URL must begin with "http://".

The <item> Element's <guid> Element

It's possible (especially if you have multiple authors writing for your feed) that you could send out two items with the same title. Some RSS readers might just look at the title and decide that they've already read the item and go on to the next. What can you do to prevent this?

You can give each item in your feed a globally unique identifier, or guid. A <guid> element is composed of a text string that uniquely identifies an item. RSS readers can check an item's <guid> and know for certain which items are different.

A <guid> element can contain any text string, as long as it's unique. For example, it could be a URL on your Web site (and because URLs are unique, they work fine, as long as you don't reuse them). You place your guid in a <guid> element.

    <item>
      <title>Steve shovels the snow</title>
      <author>steve@rssmaniac.com</author>
      <description><![CDATA[It snowed once again. Time to shovel!]]></description>
      <pubDate>Thu, 08 Dec 2005 08:39:51 -0500</pubDate>
      <link>http://www.rssmaniac.com/steve</link>
      <guid>http://www.rssmaniac.com/48393.html</guid>
    </item>


Another possibility is simply a large random number (that's the way Microsoft creates guids used in the volume labels of newly formatted disks) or a large random string of characters. If there are enough random digits or letters in your guid, the odds are high that your guid will not match any other guid.

    <item>
      <title>Steve shovels the snow</title>
      <author>steve@rssmaniac.com</author>
      <description><![CDATA[It snowed once again. Time to shovel!]]></description>
      <pubDate>Thu, 08 Dec 2005 08:39:51 -0500</pubDate>
      <link>http://www.rssmaniac.com/steve</link>
      <guid isPermaLink="false">JFWE0-F980D-V04MV-WVR05-9E0FK-EV3R3-TIVK4</guid>
    </item>


There are no rules for creating your own guid in RSS 2.0; it just must be a unique text string.

The <guid> element has an attribute named isPermaLink, which you can set to "true" or "false." If you set this attribute to "true," the RSS reader knows that the guid string is a URL, and a permalink at that (a URL that won't change over time, or at least not as rapidly as other URLs).
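The two cases can be sketched side by side, reusing the guid values from the examples above. One detail to keep in mind: according to the RSS 2.0 specification, isPermaLink is optional and is treated as "true" when omitted, so a guid that is not a URL should say isPermaLink="false" explicitly.

```xml
<!-- The guid is a permanent URL for the item -->
<guid isPermaLink="true">http://www.rssmaniac.com/48393.html</guid>

<!-- The guid is an opaque identifier, not a URL -->
<guid isPermaLink="false">JFWE0-F980D-V04MV-WVR05-9E0FK-EV3R3-TIVK4</guid>
```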

The <item> Element's <source> Element

In RSS 2.0, items can also contain a <source> element. Containing the title of the RSS channel that originated the item, this element is useful when one feed contains items from a variety of feeds (for example, when you create a new feed based on search terms). Anytime your feed contains items from other feeds or sources, it's a good idea to use the <source> element.

    <item>
      <title>Steve shovels the snow</title>
      <author>steve@rssmaniac.com</author>
      <description><![CDATA[It snowed once again. Time to shovel!]]></description>
      <pubDate>Thu, 08 Dec 2005 08:39:51 -0500</pubDate>
      <link>http://www.rssmaniac.com/steve</link>
      <source url="http://www.rssmaniac.com/steve/news.xml">Steve's News</source>
    </item>


This element has a required attribute: the url attribute, which gives the XML file source of the feed (in other words, the URL you can use to read the feed). So if you use this element, you can list not just the title of the feed the item came from, but also the URL of that feed (of course, ask permission before including other people's items in your feed).

Extending RSS 2.0

Just as with RSS 1.0, you can extend RSS 2.0. If you want to create and add your own XML elements to an RSS 2.0 document, that's fine; you just have to make sure that your new elements are in their own namespace.

For example, I might want to add a new namespace that uses the prefix steve to an RSS document. Here's how that would work (remember that you have to assign a unique string to your namespace prefix).

    <?xml version="1.0"?>
    <rss version="2.0"
      xmlns:steve="http://www.rssmaniac.com/steve/about.html">
      <channel>
        <generator>NewzAlert Composer v1.70.6, Copyright (c) 2004-2005
          Castle Software Ltd, http://www.NewzAlert.com</generator>
        <lastBuildDate>Thu, 08 Dec 2005 14:01:27 -0500</lastBuildDate>
        <pubDate>Thu, 08 Dec 2005 14:01:34 -0500</pubDate>
        <title>Steve's News</title>
        <description><![CDATA[This feed contains news from Steve!]]></description>
            .
            .
            .
    </rss>


Now you're free to create and add your own elements, as long as you use your new namespace's prefix:

    <?xml version="1.0"?>
    <rss version="2.0"
      xmlns:steve="http://www.rssmaniac.com/steve/about.html">
      <channel>
        <generator>NewzAlert Composer v1.70.6, Copyright (c) 2004-2005
          Castle Software Ltd, http://www.NewzAlert.com</generator>
        <lastBuildDate>Thu, 08 Dec 2005 14:01:27 -0500</lastBuildDate>
        <pubDate>Thu, 08 Dec 2005 14:01:34 -0500</pubDate>
        <steve:affiliation>RSS MegaGigaCo, Inc.</steve:affiliation>
        <title>Steve's News</title>
        <description><![CDATA[This feed contains news from Steve!]]></description>
        <link>http://www.rssmaniac.com/steve</link>
        <language>en-us</language>
        <copyright>(c) 2006</copyright>
        <managingEditor>Steve</managingEditor>
        <image>
          <title>Steve's News</title>
          <url>http://www.rssmaniac.com/steve/Image.jpg</url>
          <link>http://www.rssmaniac.com/steve</link>
          <description>Steve's News</description>
          <width>144</width>
          <height>36</height>
        </image>
        <item>
          <title>Steve shovels the snow</title>
          <description><![CDATA[It snowed once again. Time to shovel!]]></description>
          <pubDate>Thu, 08 Dec 2005 08:39:51 -0500</pubDate>
          <link>http://www.rssmaniac.com/steve</link>
        </item>
      </channel>
    </rss>


Note that even though you can add new XML elements to RSS 2.0 documents this way, there is no guarantee that RSS readers will know how to handle them. Readers that don't recognize your new elements will generally just skip over them, unless you use your own specialized software, written to handle those elements.

Note

Unlike an RSS 1.0 document, an RSS 2.0 document does not define a default namespace for the entire document. The reason is RSS 2.0's backward compatibility with RSS 0.91 and 0.92, which do not contain a default namespace. Thus an RSS 0.91 or 0.92 document is also a valid RSS 2.0 document.




Secrets of RSS
ISBN: 0321426223
EAN: 2147483647
Year: 2004
Pages: 110
