So far, you have
In this section, you continue to
Chances are you won't need the entire feed, only certain elements of it. In this case, you want to grab the <title> for the feed itself, as well as the <link> of the web page for the feed. The lastBuildDate tag can be used to check whether anything new has been posted, and the ttl (Time To Live) tag notes that the feed should be cached for 5 minutes. You won't save the TTL, but keep it in mind when figuring out how often to hit the page. The image tag can be ignored; you don't need that for now. Finally, you want the item tag and everything underneath it (title, link, guid, pubDate, description).
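The selection described above can be sketched in a few lines. This is a minimal illustration written in Python with the standard library rather than in PHP with SimpleXML; the sample feed, the function name extract_feed, and the ITEM_FIELDS constant are all hypothetical and stand in for the real Yahoo! data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like an RSS 2.0 feed; not the real Yahoo! data.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/news</link>
    <lastBuildDate>Mon, 01 Jan 2024 12:00:00 GMT</lastBuildDate>
    <ttl>5</ttl>
    <image><url>http://example.com/logo.gif</url></image>
    <item>
      <title>First story</title>
      <link>http://example.com/news/1</link>
      <guid>http://example.com/news/1</guid>
      <pubDate>Mon, 01 Jan 2024 11:00:00 GMT</pubDate>
      <description>Summary of the first story.</description>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/news/2</link>
      <guid>http://example.com/news/2</guid>
      <pubDate>Mon, 01 Jan 2024 11:30:00 GMT</pubDate>
      <description>Summary of the second story.</description>
    </item>
  </channel>
</rss>"""

# The item subelements the text says to keep.
ITEM_FIELDS = ("title", "link", "guid", "pubDate", "description")


def extract_feed(xml_text):
    """Keep only the channel and item elements of interest."""
    channel = ET.fromstring(xml_text).find("channel")
    feed = {
        # Direct children of <channel> only; <ttl> and <image> are skipped.
        "title": channel.findtext("title"),
        "link": channel.findtext("link"),
        "lastBuildDate": channel.findtext("lastBuildDate"),
        "items": [],
    }
    for item in channel.findall("item"):
        feed["items"].append({name: item.findtext(name) for name in ITEM_FIELDS})
    return feed


feed = extract_feed(SAMPLE_RSS)
print(feed["title"], len(feed["items"]))
```

Comparing the stored lastBuildDate with the one in a freshly fetched copy is one way to decide whether the items need to be reprocessed at all.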
Some elements will repeat in the feed (item, for example, in RSS feeds), whereas others will have only one instance. Some subelements usually appear only once but may repeat in some instances (think of a listing of books: most have only one author, but some, like this one, have more than one), so your code should be ready to deal with this. Because this is indeed an RSS feed, the item tag repeats.
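One way to be ready for a subelement that may or may not repeat is to always read it as a list, so a single occurrence and several occurrences follow the same code path. The sketch below is in Python rather than PHP, and the book listing, element names, and function name are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical book listing: one book has a single author, the other has two.
BOOKS_XML = """<books>
  <book><title>Solo Work</title><author>A. Writer</author></book>
  <book><title>Joint Work</title><author>B. First</author><author>C. Second</author></book>
</books>"""


def authors_by_title(xml_text):
    """Map each book title to a list of authors, however many there are."""
    root = ET.fromstring(xml_text)
    result = {}
    for book in root.findall("book"):
        # findall() returns a list even when only one <author> is present.
        result[book.findtext("title")] = [a.text for a in book.findall("author")]
    return result


print(authors_by_title(BOOKS_XML))
```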
Look at what type of data the feed is providing. You may need to either add tags yourself (encapsulate the content of a feed in <p></p> tags, for example), or strip tags so the content displays better within your templates. Remember to view the document's source for this, because your browser will automatically
The Yahoo! feed appears to provide all of its data in plain text, with appropriate
http://news.yahoo.com/news?tmpl=index&amp;cid=1209
You will need to change &amp; to simply & if you want this to be a functional link.
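As a small illustration of that conversion, the following Python sketch (not PHP) decodes the entity in a link copied from the raw feed source. The variable names are hypothetical. Note that an XML parser normally decodes entities for you, so this step mainly matters when you work with the raw text.

```python
from html import unescape

# Link as it appears in the raw feed source, with the ampersand written as an entity.
raw_link = "http://news.yahoo.com/news?tmpl=index&amp;cid=1209"

# unescape() turns &amp; back into a plain ampersand.
usable_link = unescape(raw_link)
print(usable_link)
```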
Document encoding is rather important; often (thanks to the magic that is PHP) we pretty much ignore it and trust that everything will work well, when in fact, using things like SimpleXML, this simply (if you will pardon the pun) isn't the case. Content providers are often careful with the information they provide, to ensure it is of the appropriate type, but often still manage to send characters encoded in another format when including
For your purposes, the Yahoo! feed declares itself to be ISO-8859-1, and all the content provided appears to be encoded correctly.
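A feed's declared encoding can be checked against its actual bytes before parsing. The following is a rough sketch in Python rather than PHP, under several assumptions: the helper names are hypothetical, the declaration is read with a simple regular expression, byte order marks are ignored, and ISO-8859-1 is used as the fallback only because it accepts any byte sequence.

```python
import re


def declared_encoding(raw, default="utf-8"):
    """Read the encoding named in the XML declaration, if there is one."""
    match = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', raw)
    return match.group(1).decode("ascii") if match else default


def decode_feed(raw, fallback="iso-8859-1"):
    """Decode feed bytes, returning the text and the encoding actually used."""
    encoding = declared_encoding(raw)
    try:
        return raw.decode(encoding), encoding
    except (UnicodeDecodeError, LookupError):
        # The declaration was wrong or unknown; fall back and report it.
        return raw.decode(fallback), fallback


# Declares UTF-8 but contains a lone ISO-8859-1 byte (0xE9).
mislabeled = b'<?xml version="1.0" encoding="UTF-8"?><t>caf\xe9</t>'
text, used = decode_feed(mislabeled)
print(used)
```

When the returned encoding differs from the declared one, that is a good moment to log the mismatch rather than silently carry on.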
| Note |
Don't kid yourself and think that big professional sites don't make these mistakes — they do! While writing this book, Amazon corrected an error where it was returning ISO-8859-1 characters in its REST-based API while declaring the stream to be UTF-8. Test your script well, trap errors, and if the script fails while trying to grab a feed, have it notify someone, and continue to use the cached copy. Don't trust foreign data, ever! |
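The advice in the note (trap errors, notify someone, keep using the cached copy) might look something like the sketch below. It is written in Python rather than PHP, and load_feed, the fetch and notify callables, and the dictionary used as a cache are all hypothetical stand-ins for whatever download, alerting, and storage code you already have.

```python
import xml.etree.ElementTree as ET


def load_feed(fetch, cache, notify):
    """Return a parsed feed, falling back to the cached copy on failure.

    fetch  -- callable returning the feed as text (assumed to raise OSError
              on network or file problems)
    cache  -- dict holding the last good copy under the key "raw"
    notify -- callable that reports a problem to a human
    """
    try:
        raw = fetch()
        root = ET.fromstring(raw)  # raises ParseError on malformed XML
    except (OSError, ET.ParseError) as exc:
        notify(f"Feed refresh failed: {exc}")
        return ET.fromstring(cache["raw"])
    # Only overwrite the cache once the new copy has parsed cleanly.
    cache["raw"] = raw
    return root


cache = {"raw": "<rss><channel><title>Cached</title></channel></rss>"}
problems = []
root = load_feed(lambda: "<rss><broken", cache, problems.append)
print(root.findtext("channel/title"), len(problems))
```

Updating the cache only after a successful parse means one bad response never replaces a known-good copy.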
Look around the site for copyright information and restrictions regarding the feed you want to consume. Also look for more detailed information about what will be included in the feed (in terms of HTML formatting,
Browsing the Yahoo! site indicates that the feed will be provided in the RSS format, and lays out the terms of use for the content of the feed. You can view this information at http://news.yahoo.com/rss.
Reuters also provides a feed, as well as clear terms of use. If you are planning to make use of a feed in any commercial context, be
| Note |
What are the terms of use?
Reuters offers RSS as a free service to any individual user or non-profit organization that would like to access it for non-commercial use. For all other usage requests, contact us. By accessing our RSS service you are indicating your understanding and agreement that you will not use Reuters RSS for commercial purposes. Reuters
|