4.5. Creating RSS 2.0 Feeds

RSS 0.91 and 0.92 feeds are created in the same way; the additional elements found in 0.92 are well-handled by the existing RSS tools.

Of course, you can always hand-code your RSS feed. Doing so certainly gets you on top of the standard, but it's neither convenient, quick, nor recommended. Ordinarily, feeds are created by a small program in one of the scripting languages: Perl, PHP, Python, etc. Many CMSs already create RSS feeds automatically, but you may want to create a feed in another context. Hey, you might even write your own CMS!

There are various ways to create a feed, all of which are used in real life:


XML transformation

Running a transformation on an XML master document converts the relevant parts into RSS. This technique is used in Apache AxKit-based systems, for example.


Templates

You can substitute values within an RSS feed template. This technique is used within most weblogging platforms, for example.


An RSS-specific module or class within a scripting language

This method is used within hundreds of little ad hoc scripts across the Net, for example.
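
To make the template approach a little more concrete, here is a minimal sketch in Python, one of the scripting languages mentioned above. The template text, the render_feed function, and the sample values are all hypothetical; the one detail that matters in any language is that values are XML-escaped before they are substituted.

```python
from string import Template
from xml.sax.saxutils import escape

# Hypothetical templates: $name placeholders are filled in at render time.
ITEM_TEMPLATE = Template(
    "<item>\n"
    "<title>$title</title>\n"
    "<link>$link</link>\n"
    "<description>$description</description>\n"
    "</item>\n"
)

FEED_TEMPLATE = Template(
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<rss version="2.0">\n'
    "<channel>\n"
    "<title>$title</title>\n"
    "<link>$link</link>\n"
    "<description>$description</description>\n"
    "$items"
    "</channel>\n"
    "</rss>\n"
)


def render_feed(channel, items):
    """Fill the templates, XML-escaping every value first."""
    def escaped(values):
        return {key: escape(value) for key, value in values.items()}

    rendered_items = "".join(
        ITEM_TEMPLATE.substitute(escaped(item)) for item in items
    )
    return FEED_TEMPLATE.substitute(escaped(channel), items=rendered_items)


print(render_feed(
    {"title": "The Title of the Feed",
     "link": "http://www.oreilly.com/example/",
     "description": "An example feed built from a template"},
    [{"title": "Questions & Answers",
      "link": "http://www.oreilly.com/example/entry1",
      "description": "blah blah"}],
))
```

Without the escaping step, the ampersand in the item title would produce a feed that is not well-formed XML.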

We'll look at all three methods, but let's start with the third, using an RSS-specific module. In this case, it's Perl's XML::RSS.

4.5.1. Creating RSS with Perl Using XML::RSS

The XML::RSS module is one of the key tools in the Perl RSS world. It is built on top of XML::Parser (the basis for many Perl XML modules) and is object-oriented. Actually, XML::RSS also supports the creation of the older versions of RSS, plus RSS 1.0, and it can parse existing feeds, but in this section we will deal only with its 2.0 creation capabilities.

Incidentally, XML::RSS is an open source project. You can lend a hand, and grab the latest version, at http://sourceforge.net/projects/perl-rss.

Example 4-8 and Example 4-9 show a simple Perl script and the feed it creates.

Example 4-8. A sample XML::RSS script
#!/usr/bin/perl

use warnings;
use strict;
use XML::RSS;

my $rss = new XML::RSS( version => '2.0' );

$rss->channel(
    title         => 'The Title of the Feed',
    link          => 'http://www.oreilly.com/example/',
    language      => 'en',
    description   => 'An example feed created by XML::RSS',
    lastBuildDate => 'Tue, 14 Sep 2004 14:30:58 GMT',
    docs          => 'http://blogs.law.harvard.edu/tech/rss',
);

$rss->image(
    title       => 'Oreilly',
    url         => 'http://meerkat.oreillynet.com/icons/meerkat-powered.jpg',
    link        => 'http://www.oreilly.com/example/',
    width       => 88,
    height      => 31,
    description => 'A nice logo for the feed'
);

$rss->textinput(
    title       => "Search",
    description => "Search the site",
    name        => "query",
    link        => "http://www.oreilly.com/example/search.cgi"
);

$rss->add_item(
    title       => "Example Entry 1",
    link        => "http://www.oreilly.com/example/entry1",
    description => 'blah blah',
);

$rss->add_item(
    title       => "Example Entry 2",
    link        => "http://www.oreilly.com/example/entry2",
    description => 'blah blah'
);

$rss->add_item(
    title       => "Example Entry 3",
    link        => "http://www.oreilly.com/example/entry3",
    description => 'blah blah'
);

print $rss->as_string;

Example 4-9. The resultant RSS 2.0 feed
<?xml version="1.0" encoding="UTF-8"?>

<rss version="2.0" xmlns:blogChannel="http://backend.userland.com/blogChannelModule">

<channel>
<title>The Title of the Feed</title>
<link>http://www.oreilly.com/example/</link>
<description>An example feed created by XML::RSS</description>
<language>en</language>
<lastBuildDate>Tue, 14 Sep 2004 14:30:58 GMT</lastBuildDate>
<docs>http://blogs.law.harvard.edu/tech/rss</docs>

<image>
<title>Oreilly</title>
<url>http://meerkat.oreillynet.com/icons/meerkat-powered.jpg</url>
<link>http://www.oreilly.com/example/</link>
<width>88</width>
<height>31</height>
<description>A nice logo for the feed</description>
</image>

<item>
<title>Example Entry 1</title>
<link>http://www.oreilly.com/example/entry1</link>
<description>blah blah</description>
</item>

<item>
<title>Example Entry 2</title>
<link>http://www.oreilly.com/example/entry2</link>
<description>blah blah</description>
</item>

<item>
<title>Example Entry 3</title>
<link>http://www.oreilly.com/example/entry3</link>
<description>blah blah</description>
</item>

<textInput>
<title>Search</title>
<description>Search the site</description>
<name>query</name>
<link>http://www.oreilly.com/example/search.cgi</link>
</textInput>
</channel>
</rss>

After the required Perl module declaration, you create a new instance of XML::RSS, like so:

my $rss = new XML::RSS (version => '2.0');

The new method returns a reference to the new XML::RSS object. It can take three arguments, two of which are of interest here:

new XML::RSS (version=>$version, encoding=>$encoding);

The version attribute refers to the version of RSS you want to make (either '2.0' or '1.0', or, if you fancy being a bit retro, '0.91'), and the encoding attribute sets the encoding of the XML declaration. The default encoding, as with XML, is UTF-8.

The rest of the script is quite self-explanatory. The methods channel, image, textinput, and add_item all add new elements and associated values to the feed you are creating, and the print $rss->as_string; prints out the result. You can also call the $rss->save method to save the created feed as a file.
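
If you are curious about roughly what such a module does on your behalf, here is a small analogue in Python, built on the standard library's xml.etree.ElementTree. The FeedBuilder class and its method names are hypothetical, chosen only to mirror the calls in Example 4-8; this is a sketch of the general idea, not a description of how XML::RSS is implemented.

```python
import xml.etree.ElementTree as ET


class FeedBuilder:
    """Hypothetical stand-in mirroring the channel/add_item/as_string calls."""

    def __init__(self, version="2.0"):
        self.root = ET.Element("rss", version=version)
        self.channel_element = ET.SubElement(self.root, "channel")

    def channel(self, **fields):
        # Each keyword becomes a child element of <channel>.
        for name, value in fields.items():
            ET.SubElement(self.channel_element, name).text = value

    def add_item(self, **fields):
        # Each call appends one <item> with its own child elements.
        item = ET.SubElement(self.channel_element, "item")
        for name, value in fields.items():
            ET.SubElement(item, name).text = value

    def as_string(self):
        # Serialize the tree; ElementTree escapes &, < and > in text.
        return ET.tostring(self.root, encoding="unicode")

    def save(self, path):
        # Write the feed to a file, with an XML declaration.
        ET.ElementTree(self.root).write(
            path, encoding="UTF-8", xml_declaration=True
        )


rss = FeedBuilder(version="2.0")
rss.channel(
    title="The Title of the Feed",
    link="http://www.oreilly.com/example/",
    description="An example feed",
)
rss.add_item(
    title="Example Entry 1",
    link="http://www.oreilly.com/example/entry1",
    description="blah blah",
)
print(rss.as_string())
```

The point of the design is the same as with XML::RSS: the script deals in names and values, and the object takes care of nesting, escaping, and serialization.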

4.5.1.1 guid, Permalink or not

XML::RSS does support the two guid isPermaLink options, but in a slightly less predictable way than the other element functions. To set guid isPermaLink="true", you should do this:

$rss->add_item(
    title       => "Example Entry 1",
    link        => "http://www.oreilly.com/example/entry1",
    description => 'blah blah',
    permaLink   => "http://www.oreilly.com/example/entry1", 
);

However, to set guid isPermaLink="false", you should do this:

$rss->add_item(
    title       => "Example Entry 1",
    link        => "http://www.oreilly.com/example/entry1",
    description => 'blah blah',
    guid        => "http://www.example.com/guidsRus/348324327", 
);
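
In the finished feed, the two calls differ only in the isPermaLink attribute of the guid element. As a quick illustration of the two forms, here is a sketch using Python's ElementTree; the make_guid helper is hypothetical and exists only to show the markup.

```python
import xml.etree.ElementTree as ET


def make_guid(value, is_permalink):
    """Build a <guid> element as a string (hypothetical helper)."""
    guid = ET.Element(
        "guid", isPermaLink="true" if is_permalink else "false"
    )
    guid.text = value
    return ET.tostring(guid, encoding="unicode")


print(make_guid("http://www.oreilly.com/example/entry1", True))
# <guid isPermaLink="true">http://www.oreilly.com/example/entry1</guid>

print(make_guid("http://www.example.com/guidsRus/348324327", False))
# <guid isPermaLink="false">http://www.example.com/guidsRus/348324327</guid>
```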

4.5.1.2 Module support under XML::RSS

As you can see, XML::RSS always includes the namespace declaration for the blogChannel module. You can also use it to include other modules within your feed.

In Example 4-8, we passed known strings to the module. It's really not of much use as a script; you need to add a more dynamic form of data, or the feed will be very boring indeed. We do an awful lot of this sort of thing in Chapter 10, so let's leave Perl until then, and move on to another language.

4.5.2. Creating RSS 2.0 with PHP

Great RSS 2.0 support for PHP is to be found in the feedcreator.class by Kai Blankenhorn at http://www.bitfolge.de. Unlike the previous section's XML::RSS, feedcreator.class can only create RSS feeds; it can't parse them. No matter: it's very good indeed at creating them.

As illustrated in Example 4-10, the function for each feed element is named after the element, so it behaves pretty much as you would expect.

Example 4-10. A PHP script using FeedCreator that produces RSS 2.0
<?php 
include("feedcreator.class.php"); 

$rss = new UniversalFeedCreator( ); 
$rss->title = "Example Feed"; 
$rss->description = "This is the feed description"; 
$rss->link = "http://www.example.com/"; 

// Image section
$image = new FeedImage( ); 
$image->title = "example logo"; 
$image->url = "http://www.example.com/images/logo.gif"; 
$image->link = "http://www.example.com"; 
$image->description = "Visit Example.com!"; 
$rss->image = $image; 

// Item Loop
$item = new FeedItem( ); 
$item->title = "Entry one"; 
$item->link = "http://www.example.com/entryone"; 
$item->description = "This is the content of the first entry"; 
$item->author = "Ben Hammersley"; 
$rss->addItem($item);  
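// End Item Loop

// Write the feed out as RSS 2.0 (the same call is discussed under "Caching and saving")
echo $rss->saveFeed("RSS2.0", "index.xml");

?>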

This is a very simple script. As you can see from the resulting feed, Example 4-11, it produces only one item. We'll be using it for more complicated things later on in the book, so in the meantime, once we've passed over the example output, we'll take a quick look at some of the special features.

Example 4-11. An RSS 2.0 feed created with PHP
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- generator="FeedCreator 1.7.1" -->
<rss version="2.0">
    <channel>
        <title>Example Feed</title>
        <description>This is the feed description</description>
        <link>http://www.example.com/</link>
        <lastBuildDate>Thu, 16 Sep 2004 20:16:22 +0100</lastBuildDate>
        <generator>FeedCreator 1.7.1</generator>
        <image>
            <url>http://www.example.com/images/logo.gif</url>
            <title>example logo</title>
            <link>http://www.example.com</link>
            <description>Visit Example.com!</description>
        </image>
        <item>
            <title>Entry one</title>
            <link>http://www.example.com/entryone</link>
            <description>This is the content of the first entry</description>
            <author>Ben Hammersley</author>
        </item>
    </channel>
</rss>

4.5.2.1 Caching and saving

One advantage that the FeedCreator class has over Perl's XML::RSS is the built-in caching mechanism. PHP is mostly used, in this case, as a way of dynamically building a feed upon request, perhaps from a database. However, when a feed gets too popular, that might cause too much of a server load. You can have your script store a cache file and serve that instead of running itself by adding this line:

$rss->useCached( );

This saves the dynamically created feed to a cache, and serves the cached copy instead if it is less than one hour old. Remember to place this line directly beneath $rss = new UniversalFeedCreator( ); so that the cache is checked before the rest of the script runs; otherwise, you'll waste precious processor cycles.
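
The logic behind this kind of cache is simple enough to sketch in a few lines. The following Python version is an illustration of the general technique under stated assumptions, not FeedCreator's actual code: the cached_feed function, its parameters, and the build_feed callable are all hypothetical, and the one-hour timeout simply matches the behavior described above.

```python
import os
import time

CACHE_TIMEOUT = 3600  # seconds; one hour, as described above


def cached_feed(cache_path, build_feed, timeout=CACHE_TIMEOUT, now=None):
    """Return the cached feed if it is fresh, else rebuild and re-cache it.

    build_feed is a hypothetical callable that returns the feed as a string.
    """
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(cache_path)
    except OSError:
        age = None  # no cache file yet

    if age is not None and age < timeout:
        # Fresh cache: skip the expensive build entirely.
        with open(cache_path, encoding="utf-8") as cache_file:
            return cache_file.read()

    # Stale or missing cache: build the feed and store it for next time.
    feed = build_feed()
    with open(cache_path, "w", encoding="utf-8") as cache_file:
        cache_file.write(feed)
    return feed
```

Note that the freshness check comes first, before any feed-building work is done, which is exactly why the position of the useCached line matters.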

You can also explicitly save the file with a command like this:

echo $rss->saveFeed("RSS2.0", "index.xml");

4.5.2.2 Dates

Because, caching notwithstanding, the feeds are usually produced dynamically, FeedCreator declares the channel/lastBuildDate element automatically at the time of creation. You can, of course, specify it explicitly, as you can with pubDate. FeedCreator allows the use of RFC 822 (Mon, 20 Jan 03 18:05:41 +0400), ISO 8601 (2003-01-20T18:05:41+04:00), and Unix (1043082341) time values.
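
To show how the three notations relate, here is a sketch in Python that normalizes each of them to the RFC 822 style used in RSS 2.0 date elements. The to_rfc822 helper is hypothetical and is not part of FeedCreator. One assumption to note: a Unix time value carries no offset of its own, so this sketch reports it in UTC.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def to_rfc822(value):
    """Normalize an RFC 822 string, ISO 8601 string, or Unix time.

    Hypothetical helper; returns an RFC 822-style date string.
    """
    if isinstance(value, (int, float)):
        # Unix time has no offset, so UTC is assumed here.
        moment = datetime.fromtimestamp(value, tz=timezone.utc)
    else:
        try:
            moment = datetime.fromisoformat(value)  # ISO 8601
        except ValueError:
            # Not ISO 8601: treat as RFC 822 (two-digit years accepted).
            moment = parsedate_to_datetime(value)
    return format_datetime(moment)


print(to_rfc822("Mon, 20 Jan 03 18:05:41 +0400"))
# Mon, 20 Jan 2003 18:05:41 +0400
print(to_rfc822("2003-01-20T18:05:41+04:00"))
# Mon, 20 Jan 2003 18:05:41 +0400
print(to_rfc822(1043082341))
# Mon, 20 Jan 2003 17:05:41 +0000
```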

4.5.2.3 Namespaced modules

This is the major drawback with the class. You can't, as of Version 1.7.1 at least, create a feed with modules in it. If you're set on doing that (perhaps with some groovy special in-house application in mind), you'll need to hack at the class's code. It is licensed under the GPL, so go right ahead.
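
For anyone planning such a hack, it helps to be clear about what the output needs to contain: a namespace declaration plus prefixed elements. Here is a minimal sketch in Python's ElementTree, using the Dublin Core creator element purely as an illustrative module element; the item_with_creator helper is hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core namespace URI
ET.register_namespace("dc", DC_NS)  # serialize with the dc: prefix


def item_with_creator(title, creator):
    """Build an <item> carrying one namespaced element (hypothetical helper)."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    # ElementTree addresses namespaced elements as {namespace}name.
    ET.SubElement(item, "{%s}creator" % DC_NS).text = creator
    return ET.tostring(item, encoding="unicode")


print(item_with_creator("Entry one", "Ben Hammersley"))
```

In a complete feed, the xmlns:dc declaration would normally sit on the root rss element rather than on each item.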

4.5.3. Creating RSS 2.0 with Ruby

Since Version 1.8.2, Ruby has shipped with Kouhei Sutou's RSS parsing and creation library. At time of writing, however, Ruby has only reached 1.8.2.preview.3, and documentation is hard to come by. The only documentation for the new RSS classes is found at:

http://www.cozmixng.org/~rwiki/?cmd=view;name=RSS+Parser%3A%3ATutorial.en

in a potentially unreliable translation from the Japanese original.

Having said that, the library does seem very complete indeed, with support for the parsing and writing of both RSS 1.0 and 2.0. At time of writing, the tutorial just mentioned was growing rapidly and being completed by the library's author. Ruby programmers should check the URL for changes.

4.5.4. Serving RSS 2.0

Although, or perhaps because, there is no official word within the specification regarding this, the growing standard for serving RSS 2.0 is with a MIME type of application/xml. Dave Winer prefers text/xml for the way that it causes the file to display itself nicely inside Internet Explorer. Using application/xml is more correct, but it causes browsers to download the file instead of displaying it. Really advanced users are looking at application/rss+xml, but currently no standard exists. It's up to you, but certainly, it should not be served with any other MIME type. text/plain is right out.