Del.icio.us


Del.icio.us is a social bookmarking tool, through which you can bookmark sites you like, share those bookmarks with friends, and tag those bookmarks to be searchable by all. Del.icio.us also has an API that can be used to access bookmarks already made or to add new bookmarks.

Restrictions on Use

In order to use the API, you must be a registered user of Del.icio.us, and requests must not be performed more than once per second. You should also set a unique User-Agent for your application: something identifying your application and its version. Common User-Agents (like PHP's default setting, for example) are likely to be throttled occasionally because they are abused. Monitor the responses you receive for a 503 error document, and throttle your application's requests whenever you get one. Finally, ensure that any applications you release based on the site do not add or modify bookmarks without the user's consent, and try emailing the site's owner before you release anything. This will help ensure that your application isn't harmful and doesn't end up getting throttled.
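The once-per-second limit and the 503 rule can be sketched as two small helpers. This is only a minimal sketch: the function names secondsToWait() and nextBackoff(), the doubling strategy, and the 300-second ceiling are assumptions made for illustration and are not part of the Del.icio.us API.

```php
<?php
// Hypothetical helper: how long to pause so that consecutive requests are
// at least $minInterval seconds apart (the API asks for one per second at most).
function secondsToWait($lastRequestAt, $now, $minInterval = 1.0)
{
    $elapsed = $now - $lastRequestAt;
    return ($elapsed >= $minInterval) ? 0.0 : $minInterval - $elapsed;
}

// Hypothetical helper: the delay to use after a 503 response. The delay is
// doubled each time, starting at 1 second and never exceeding $maxDelay.
function nextBackoff($currentDelay, $maxDelay = 300)
{
    return min($maxDelay, max(1, $currentDelay * 2));
}

// Example: the previous request went out a quarter of a second ago.
$pause = secondsToWait(100.00, 100.25);
printf("Pause for %.2f seconds before the next request\n", $pause);
```

Before every request, sleep for the value returned by secondsToWait(); after a 503, sleep for nextBackoff() of the previous delay instead and reset the delay once a request succeeds.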

Basic HTTP Authentication

All calls against the Del.icio.us API require the use of basic HTTP authentication. This involves including your credentials along with the HTTP header sent with the request. These are the same credentials (username and password) used to access the Del.icio.us site. In order to properly send your credentials, you need to encode your username and password with Base 64 encoding. Luckily, PHP has a function for this express purpose:

 $authorization = base64_encode("username:password"); 

For the pktloss:twiddle credentials used in the examples that follow, the encoded string looks like this:

    cGt0bG9zczp0d2lkZGxl

This encoding must not be mistaken for encryption; the encoded string can easily be turned back into the original string with the base64_decode() function. The username-colon-password format is dictated by the HTTP specification, and is not a requirement of the base64_encode() function itself. This authentication information is included in the HTTP headers in this format:

    Authorization: Basic cGt0bG9zczp0d2lkZGxl

With this authentication information in hand, a request can be successfully sent.
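The encoding step and its reversibility can be shown in a few lines. The helper name basicAuthHeader() is an assumption made for this sketch; base64_encode() and base64_decode() are the standard PHP functions discussed in this section.

```php
<?php
// Hypothetical helper: build the Authorization header line for basic
// HTTP authentication from a username and password.
function basicAuthHeader($username, $password)
{
    return "Authorization: Basic " . base64_encode($username . ":" . $password);
}

$header = basicAuthHeader("pktloss", "twiddle");
echo $header, "\n";

// Anyone who sees the header can recover the credentials, which is why
// this is encoding and not encryption.
$encoded = substr($header, strlen("Authorization: Basic "));
echo base64_decode($encoded), "\n";
```

Because the credentials are trivially recoverable, treat any log or capture that contains this header as if it contained the password itself.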

Obtaining Your Own Bookmarks

The API provides the ability to retrieve all of the bookmarks you have set to date, with the special note that this functionality should be used sparingly. If you plan on integrating your Del.icio.us bookmarks with your own site, avoid fetching the full list on every page load. The API also provides a last-update call, which you can use to determine when you last added a bookmark, so the full list needs to be fetched again only when something has changed.
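One way to honor the request to fetch sparingly is to compare the last-update time with the time your own copy was saved. The sketch below makes two assumptions that should be checked against the live API: that the last-update call returns a single element carrying a time attribute in the same timestamp format as the post elements shown later in this section, and the helper name bookmarksChanged(), which is invented for this example.

```php
<?php
// ASSUMPTION: the last-update call answers with something like
//   <update time="2005-05-16T03:10:20Z" />
// The element name and attribute are not confirmed by this chapter.
function bookmarksChanged($updateXml, $cachedAt)
{
    $update = simplexml_load_string($updateXml);
    if ($update === false || !isset($update['time'])) {
        return true; // unknown state: play safe and treat as changed
    }
    $lastUpdate = strtotime((string) $update['time']);
    return $lastUpdate === false || $lastUpdate > $cachedAt;
}

$response = '<update time="2005-05-16T03:10:20Z" />';
$cachedAt = strtotime("2005-05-16T04:00:00Z");
echo bookmarksChanged($response, $cachedAt)
    ? "Fetch the full list again\n"
    : "The cached list is still current\n";
```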

This code differs only slightly from the earlier examples in which sockets were used to obtain a remote file. The only change this time is the inclusion of the basic authentication header:

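    <?php
    $endpoint = "http://del.icio.us/api/posts/get?";
    $authorization = base64_encode("pktloss:twiddle");
    $url_info = parse_url($endpoint);
    $host = $url_info['host'];
    // The query component may be missing when nothing follows the "?"
    $query = isset($url_info['query']) ? $url_info['query'] : "";
    $path = $url_info['path'] . "?" . $query;
    $data = "";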

The endpoint for the call is set, the username and password are base 64-encoded, and the URL is parsed into its component parts.

    $fp=fsockopen($host, 80);
    fputs($fp, "POST " . $path . " HTTP/1.1\r\n");
    fputs($fp, "Host: " . $host ."\r\n");
    fputs($fp, "Authorization: Basic $authorization\r\n");
    fputs($fp, "Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain\r\n");
    fputs($fp, "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n");
    fputs($fp, "Connection: close\r\n");
    fputs($fp, "User-Agent: PReinheimer Test App v0.1\r\n");
    fputs($fp, "Content-Type: application/x-www-form-urlencoded\r\n");
    fputs($fp, "Content-Length: " . strlen($data) . "\r\n\r\n");
    fputs($fp, "$data");

Here the request itself is sent. Note the Authorization header, which is required to access this API, as well as the User-Agent header, which identifies the application. The agent string here is pretty meaningless; remember to set something for your own applications, and try to make it more indicative of your application.

    $response="";
    while(!feof($fp))
    {
      $response.=fgets($fp, 128);
    }
    fclose($fp);
    list($http_headers, $http_content)=explode("\r\n\r\n", $response);
    echo "Headers: \n" . $http_headers . "\n\n";
    echo "Content: \n" . $http_content;
    ?>

Finally, the server's response is pulled from the socket and parsed into its component (header and content) parts. The output is as follows:

    Headers:
    HTTP/1.1 200 OK
    Date: Mon, 16 May 2005 19:18:13 GMT
    Server: Apache/1.3.33 (Debian GNU/Linux) mod_gzip/1.3.26.1a mod_perl/1.29 AuthMySQL/4.3.9-2
    Vary: *
    Connection: close
    Transfer-Encoding: chunked
    Content-Type: text/xml

    Content:
    3a4
    <?xml version='1.0' standalone='yes'?>
    <posts dt="2005-05-16" tag="" user="pktloss">
      <post href="http://bixdata.com/" description="BixData - Performance Monitor, Process Viewer, Critical Notifications"
        hash="d80779ee4c81361239bdb03f608169e0" others="1" tag="monitor server software" time="2005-05-16T03:10:20Z" />
      <post href="http://moonsoar.com/" description="MoonSoar.com"
        hash="4f73d86034dabcd288029a3a7aca4a10" others="2" tag="blogs" time="2005-05-16T02:32:03Z" />
      <post href="http://shiflett.org/" description="Chris Shiflett: Home"
        hash="37e86256d39dd6de808d8ab9e8f1f46d" others="33" tag="blogs" time="2005-05-16T03:09:01Z" />
      <post href="http://www.acmqueue.org/" description="ACM Queue - Developer Tools, Hardware, Security, Open Source, Enterprise Search, Data Management, Virtual Machines, Wireless"
        hash="630611753b9481be8d0e5ec1dda2441d" others="29" tag="system:unfiled" time="2005-05-16T03:08:46Z" />
    </posts>
    0

The HTTP headers are pretty standard. The only thing to note here is the response code (200 in this case; 404 Not Found is probably the one people are most familiar with). As mentioned earlier, the people who run Del.icio.us would like you to keep your code watching for a 503 Service Unavailable error, which indicates that you are using the service too often and must throttle back. The content contains a nice XML document outlining the Del.icio.us bookmarks I have set up.
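Watching for a particular response code is easier when the code is extracted from the status line as a number. The following is a minimal sketch; the function name httpStatusCode() is an assumption made for this example, and the input is the same kind of header string produced by the code above.

```php
<?php
// Hypothetical helper: pull the numeric status code out of the first line
// of a raw HTTP header block, or return null if there is no status line.
function httpStatusCode($http_headers)
{
    if (preg_match('/^HTTP\/\d+(?:\.\d+)?\s+(\d{3})/', $http_headers, $matches)) {
        return (int) $matches[1];
    }
    return null;
}

$code = httpStatusCode("HTTP/1.1 503 Service Unavailable\r\nConnection: close");
if ($code === 503) {
    echo "Throttled: slow down before retrying\n";
} else if ($code === 200) {
    echo "OK\n";
}
```

Comparing a number avoids false matches when the text "200 OK" or "503" happens to appear elsewhere in the headers.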

Note 

One of the first things I usually do when I start working with an XML document in code is to stick it into SimpleXML and print_r the result. This is spectacularly unhelpful this time around because print_r doesn't reveal attributes when displaying SimpleXML objects. So echo'ing out the document will need to suffice.
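If a quick dump of the attributes is still wanted, they can be copied into a plain array first. This is a small sketch using SimpleXML's attributes() method; the helper name attributesOf() and the shortened sample document are assumptions made for illustration.

```php
<?php
// Hypothetical helper: copy an element's attributes into an ordinary
// associative array of strings, which print_r() displays without trouble.
function attributesOf($element)
{
    $result = array();
    foreach ($element->attributes() as $name => $value) {
        $result[$name] = (string) $value;
    }
    return $result;
}

$xml = simplexml_load_string(
    '<posts user="pktloss">'
    . '<post href="http://shiflett.org/" description="Chris Shiflett: Home" tag="blogs" />'
    . '</posts>'
);
foreach ($xml->post as $post) {
    print_r(attributesOf($post));
}
```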

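Getting that XML into a SimpleXML object takes a little bit of work. Because the response uses chunked transfer encoding (note the Transfer-Encoding: chunked header), the document is wrapped in chunk markers: a hexadecimal chunk length (3a4) before it and a terminating 0 after it. These extra characters prevent simplexml_load_string() from handling the response, so you need to remove them. When the document arrives as a single chunk, as it does here, the leading marker sits alone on the first line, so it can be handled by dumping the first line. The trailing 0 can be removed by cutting off everything after the final closing angle bracket: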

    $firstLine = strpos($http_content, "\n");
    $http_content = substr($http_content, $firstLine + 1);
    $lastLine = strrpos($http_content, ">");
    $http_content = substr($http_content, 0, $lastLine + 1);

The first newline character is located and its position recorded for use when trimming the start of the string. The final closing angle bracket is then located (strrpos() finds the last occurrence of a character in a string), and everything after it is cut off.
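The trimming shown here relies on the whole document arriving in one chunk. A longer response may be split into several chunks, each preceded by its own length line, and a more general approach is to decode the chunked body properly. The function below is a sketch of that idea, not the author's method; the name decodeChunked() is an assumption, and it ignores any trailer headers that might follow the final chunk.

```php
<?php
// Hypothetical helper: reassemble a body sent with Transfer-Encoding: chunked.
// Each chunk is "<hex length>[;extensions]\r\n<data>\r\n"; a length of 0 ends the body.
function decodeChunked($body)
{
    $decoded = "";
    $offset = 0;
    $length = strlen($body);
    while ($offset < $length) {
        $lineEnd = strpos($body, "\r\n", $offset);
        if ($lineEnd === false) {
            break; // malformed input: no complete length line left
        }
        $sizeLine = substr($body, $offset, $lineEnd - $offset);
        $parts = explode(";", $sizeLine, 2); // drop optional chunk extensions
        $size = hexdec(trim($parts[0]));
        if ($size == 0) {
            break; // terminating chunk
        }
        $decoded .= substr($body, $lineEnd + 2, $size);
        $offset = $lineEnd + 2 + $size + 2; // skip the data and its trailing CRLF
    }
    return $decoded;
}

$body = "5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n";
echo decodeChunked($body), "\n";
```

Applied to the response above, the decoded string would be the XML document alone, ready for simplexml_load_string().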

    $xml = simplexml_load_string($http_content);
    foreach($xml as $post)
    {
      echo "Site: {$post['description']} at {$post['href']} has the following tag(s): {$post['tag']}\n";
    }

Finally, the SimpleXML object can be created and iterated through. The output looks like this:

    Site: BixData - Performance Monitor, Process Viewer, Critical Notifications at http://bixdata.com/ has the following tag(s): monitor server software
    Site: MoonSoar.com at http://moonsoar.com/ has the following tag(s): blogs
    Site: Chris Shiflett: Home at http://shiflett.org/ has the following tag(s): blogs
    Site: ACM Queue - Developer Tools, Hardware, Security, Open Source, Enterprise Search, Data Management, Virtual Machines, Wireless at http://www.acmqueue.org/ has the following tag(s): system:unfiled

Adding a Caching Layer

Since the owners of Del.icio.us asked so nicely, it would be appropriate to add a small caching layer into any applications built to ensure that both your machine and the Del.icio.us server aren't bogged down with repeated requests. This caching layer will have two functions: At the very bottom sits baseCall(), which takes care of actually making needed requests to the API, and returns either the resulting document, the word THROTTLE indicating it received a request to slow down incoming requests, or null indicating that it received some other error. Above baseCall() sits callDelicious(), which looks at each request to call the API, and tries to load the results of a recent identical request from the database. Failing that, it passes the request off to baseCall() and records the results.

The baseCall() function is remarkably similar to the code already presented:

    function baseCall($endpoint)
    {
      $authorization = base64_encode("pktloss:twiddle");
      $url_info = parse_url($endpoint);
      $host = $url_info['host'];
      // The query component may be missing when nothing follows the "?"
      $query = isset($url_info['query']) ? $url_info['query'] : "";
      $path = $url_info['path'] . "?" . $query;
      $data = "";
      $fp=fsockopen($host, 80);
      fputs($fp, "POST " . $path . " HTTP/1.1\r\n");
      fputs($fp, "Host: " . $host ."\r\n");
      fputs($fp, "Authorization: Basic $authorization\r\n");
      fputs($fp, "Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain\r\n");
      fputs($fp, "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n");
      fputs($fp, "Connection: close\r\n");
      fputs($fp, "User-Agent: PReinheimer Test App v0.1\r\n");
      fputs($fp, "Content-Type: application/x-www-form-urlencoded\r\n");
      fputs($fp, "Content-Length: " . strlen($data) . "\r\n\r\n");
      fputs($fp, "$data");
      $response="";
      while(!feof($fp))
      {
        $response.=fgets($fp, 128);
      }
      fclose($fp);
      list($http_headers, $http_content)=explode("\r\n\r\n", $response);
      // strpos() returns a position or false, so compare against false explicitly
      if (strpos($http_headers, "200 OK") !== false)
      {
        // Strip the chunk markers around the document before parsing it
        $firstLine = strpos($http_content, "\n");
        $http_content = substr($http_content, $firstLine + 1);
        $lastLine = strrpos($http_content, ">");
        $http_content = substr($http_content, 0, $lastLine + 1);
        $xml = simplexml_load_string($http_content);
        return $xml;
      }else if (strpos($http_headers, "503 Service Unavailable") !== false)
      {
        return "THROTTLE";
      }else
      {
        return NULL;
      }
    }

The only difference between this code and the code presented earlier is the function encapsulation, and the examination of the response for either the 200 OK or 503 Service Unavailable response codes. If the response is OK, the document is parsed and returned; on a 503, the word THROTTLE is returned, and if neither response is received, NULL is returned.

The callDelicious() function is a little more complicated, as should be expected, because it attempts to handle several different situations.

    function callDelicious($endpoint, $parameters)
    {
      foreach ($parameters AS $parameter)
      {
        // Append to $endpoint; PHP variable names are case sensitive
        $endpoint .= $parameter[0] . "=" . $parameter[1] . "&";
      }
      $key = md5($endpoint);
      // "d" gives a zero-padded day of the month
      $today = date("Y-m-d H:i:s", time() - 5 * 60);
      $query = "SELECT `key`, `xml` FROM 11_delicious_cache WHERE `key` = '$key' AND
        `tstamp` > '$today' ORDER BY `tstamp` DESC LIMIT 1";
      $result = getAssoc($query, 0);

The full endpoint URL for the REST call is determined and then hashed with the MD5 algorithm. Hashing the URL results in a short, SQL-safe string, which is ideal for this situation. The database is then checked for a cached copy of this endpoint; this code accepts only a copy saved within the last 5 minutes (5 minutes * 60 seconds). Depending on your needs, you may want to tweak this time up or down. Realistically, if I were using this as a plugin component on my blog, I would probably set the time to 1 hour.
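The three values computed in this first fragment (the full endpoint, its hash, and the cutoff time) can be tried out on their own. The sketch below is a standalone illustration, not a replacement for the function being built; the helper names buildEndpoint() and cacheCutoff() are assumptions, and the time zone is fixed to UTC only so that the example behaves the same everywhere.

```php
<?php
date_default_timezone_set("UTC"); // assumption: fixed zone for a reproducible example

// Hypothetical helper: append name/value pairs (already URL-encoded by the
// caller, as in this chapter) to an endpoint that ends with "?".
function buildEndpoint($endpoint, array $parameters)
{
    $pairs = array();
    foreach ($parameters as $parameter) {
        $pairs[] = $parameter[0] . "=" . $parameter[1];
    }
    return $endpoint . implode("&", $pairs);
}

// Hypothetical helper: the oldest timestamp a cached copy may carry and
// still count as fresh, formatted for comparison in a SQL query.
function cacheCutoff($now, $maxAge = 300)
{
    return date("Y-m-d H:i:s", $now - $maxAge);
}

$full = buildEndpoint("http://del.icio.us/api/posts/get?", array(array("tag", "blogs")));
echo $full, "\n";
echo md5($full), "\n";            // 32 hexadecimal characters, safe inside SQL quotes
echo cacheCutoff(time()), "\n";   // five minutes before now
```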

      if (isset($result['xml']))
      {
        $xml = simplexml_load_string($result['xml']);

If a recent cached copy was found, create the SimpleXML object to be returned later.

      }else
      {
        $xml = baseCall($endpoint);

Because a cached copy was not located, the baseCall function is leaned on to retrieve a recent version.

        // Use a strict comparison: a loosely compared SimpleXML object can also equal null
        if ($xml === null)
        {
          //Record Error?
          $xml = "THROTTLE";
        }

Unfortunately, it seems that baseCall() was not able to retrieve a more recent version of that particular endpoint. Depending on how and where your code is used, you may want to note this error for further examination, or cross your fingers and hope that it was a temporary connectivity issue that will resolve itself. Either way, the result is treated like a THROTTLE response, so the function will attempt to locate an older version of the endpoint.

        if ($xml === "THROTTLE")
        {
          $query = "SELECT `key`, `xml` FROM 11_delicious_cache WHERE `key` = '$key'
            ORDER BY `tstamp` DESC LIMIT 1";
          $result = getAssoc($query, 0);
          if (isset($result['xml']))
          {
            $safeXML = mysql_real_escape_string($result['xml']);
            // $key already holds the MD5 hash of the endpoint
            $insertQuery = "REPLACE INTO 11_delicious_cache (`key`, `xml`, `tstamp`)
              VALUES ('$key', '$safeXML', null)";
            insertQuery($insertQuery);
            $xml = simplexml_load_string($result['xml']);

If the 503 error was received, the server has requested that the frequency of your connections back off, so the database is checked again for any existing copy of this endpoint. Hopefully there is one in there, even if it is older than 5 minutes. If one is found, it is updated in the database to have a current timestamp. This should prevent any more calls being made to this endpoint for the next 5 minutes. The old cached copy is used to create a SimpleXML object to be returned when the function finishes.

          }else
          {
            $xml = null;
          }

In this case, either the THROTTLE response or an unknown response was received, and unfortunately there was no cached copy of this request in the database. There really isn't anything else to try, so null is returned.

        }else if (is_object($xml))
        {
          $safeXML = mysql_real_escape_string($xml->asXML());
          // $key already holds the MD5 hash of the endpoint
          $insertQuery = "REPLACE INTO 11_delicious_cache (`key`, `xml`, `tstamp`)
            VALUES ('$key', '$safeXML', null)";
          insertQuery($insertQuery);

Here, finally, a good response from the API is handled. The XML object returned by baseCall() is turned back into a well-formed XML string, escaped for use within a SQL query, and saved to the database.

        }else
        {
          $xml = null;
        }
      }
      return $xml;
    }

This last case should not happen, but it's a good idea to allow for the unexpected within your decision blocks. Finally, $xml is returned, which hopefully contains a SimpleXML object.
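Because the function is presented in fragments, it may help to see the same decision flow in one piece. The sketch below is a simplified stand-in that can run without a database or network access: an array replaces the cache table, a callback replaces baseCall(), and plain XML strings replace SimpleXML objects. The name cachedCall() and its parameters are assumptions made for this illustration.

```php
<?php
// Simplified stand-in for callDelicious():
//   $cache maps key => array('xml' => string, 'tstamp' => int)
//   $fetch is a callable returning an XML string, "THROTTLE", or null
function cachedCall($key, $fetch, array &$cache, $now, $ttl = 300)
{
    // 1. A fresh cached copy wins outright.
    if (isset($cache[$key]) && $cache[$key]['tstamp'] > $now - $ttl) {
        return $cache[$key]['xml'];
    }
    // 2. Otherwise ask the remote service and record a good answer.
    $result = call_user_func($fetch);
    if ($result !== null && $result !== "THROTTLE") {
        $cache[$key] = array('xml' => $result, 'tstamp' => $now);
        return $result;
    }
    // 3. Throttled or failed: reuse a stale copy and re-stamp it, so the
    //    service is left alone for another $ttl seconds.
    if (isset($cache[$key])) {
        $cache[$key]['tstamp'] = $now;
        return $cache[$key]['xml'];
    }
    // 4. Nothing cached and nothing fetched.
    return null;
}

$cache = array();
$fetch = function () { return "<tags/>"; };
echo cachedCall("demo", $fetch, $cache, time()), "\n";
```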

This caching method has a few pros and cons, which may not be immediately visible when reading the disjoint segments this page layout provides.

Pros:

  • Like any good caching layer, this one allows upper layers to completely ignore the fact that they are getting cached in the first place.

  • This pair of functions is modular enough that they should work well for any REST-based API.

  • Because of the way the database key is generated (including the full endpoint, rather than just the parameters), the same function and database table can be used for multiple REST APIs.

Cons:

  • While this pair ensures that identical queries are not run one after another, it is still very easy to run too many queries too quickly: simply run different queries. This could be a problem if many queries are called by the same block of code. If this happens, it is probably time to restructure your code so that queries are updated by the clock, not by page load. At the very least, consider changing the cache duration for different endpoints; that way they won't all expire at once.

  • Furthermore, with the throttled issue, a better approach than just updating the current endpoint to the current timestamp would be to refresh all endpoints to the present. This would essentially prevent any REST calls that have been made before from happening again for as long as you let caches live.

  • MD5 is case sensitive, whereas the API may or may not be. This could result in endpoints the API considers identical to be cached separately (and hence, allowed to be called more often).
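The last point can be softened by normalizing the endpoint before hashing it. The sketch below lowercases only the scheme and host, which are case insensitive in any URL, and sorts the query parameters on the assumption that the API does not care about their order. Both the helper name cacheKey() and that ordering assumption are illustrative and should be verified before relying on them.

```php
<?php
// Hypothetical helper: derive a cache key from a normalized endpoint.
function cacheKey($endpoint)
{
    $parts = parse_url($endpoint);
    $scheme = isset($parts['scheme']) ? strtolower($parts['scheme']) : "http";
    $host = isset($parts['host']) ? strtolower($parts['host']) : "";
    $path = isset($parts['path']) ? $parts['path'] : "/"; // path case is left alone
    $pairs = array();
    if (isset($parts['query']) && $parts['query'] !== "") {
        // ASSUMPTION: parameter order does not change the API's answer
        $pairs = explode("&", trim($parts['query'], "&"));
        sort($pairs);
    }
    return md5($scheme . "://" . $host . $path . "?" . implode("&", $pairs));
}

$a = cacheKey("HTTP://Del.icio.us/api/posts/get?tag=blogs&dt=2005-05-16");
$b = cacheKey("http://del.icio.us/api/posts/get?dt=2005-05-16&tag=blogs&");
echo ($a === $b) ? "Same cache entry\n" : "Different cache entries\n";
```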

Retrieving a List of Used Tags

One of the nice things about Del.icio.us is that it allows you to associate tags with each of your bookmarks. One thing I hate about tags in general is when I accidentally use multiple tags that essentially mean the same thing ("blog" versus "blogs," for example). As such, it is usually a good idea to look at the tag(s) used already before adding a tag to something else. The API provides an endpoint for this call at http://del.icio.us/api/tags/get?, and using the functions just created, calling it couldn't be easier:

    require("../common_db.php");
    $endpoint = "http://del.icio.us/api/tags/get?";
    $parameters = array();
    $xml = callDelicious($endpoint, $parameters);

I've snuck the requirement for my database functions in there (it is needed for the caching functions), but other than that, all that has been set is the endpoint. The output looks like this:

    <?xml version="1.0" standalone="yes"?>
    <tags>
      <tag count="2" tag="blogs"/>
      <tag count="1" tag="monitor"/>
      <tag count="1" tag="server"/>
      <tag count="1" tag="software"/>
      <tag count="1" tag="system:unfiled"/>
    </tags>

That lists each of the tags used, as well as the number of times that specific tag has been used.
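Turning that document into a simple array of counts makes it easier to work with. This is a minimal sketch that parses a shortened copy of the output above; the helper name tagCounts() is an assumption made for illustration.

```php
<?php
// Hypothetical helper: map each tag name to the number of times it was used.
function tagCounts($xmlString)
{
    $counts = array();
    $xml = simplexml_load_string($xmlString);
    if ($xml === false) {
        return $counts; // unparsable input yields an empty list
    }
    foreach ($xml->tag as $tag) {
        // Attributes are SimpleXML objects, so cast them to plain values
        $counts[(string) $tag['tag']] = (int) $tag['count'];
    }
    return $counts;
}

$sample = '<?xml version="1.0" standalone="yes"?>'
    . '<tags><tag count="2" tag="blogs"/><tag count="1" tag="monitor"/></tags>';
$counts = tagCounts($sample);
arsort($counts); // most-used tags first
foreach ($counts as $tag => $count) {
    echo "$tag ($count)\n";
}
```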

Adding a Tag to Del.icio.us

Being able to access all those tags is of little use if you can't add a few of your own. Sure, the Del.icio.us website and bookmarks make this easy, but it's always more fun to do it through code. With a simple form it's easy enough to accomplish:

    if (isset($_POST['method']) && $_POST['method'] == "add")
    {
      $endPoint = "http://del.icio.us/api/posts/add?";
      $parameters = array();
      $parameters[] = array('url', urlencode($_POST['url']));
      $parameters[] = array('extended', urlencode($_POST['extended']));
      $parameters[] = array('tags', urlencode($_POST['tags']));
      $parameters[] = array('description', urlencode($_POST['description']));
      // T and Z are date format characters, so they are escaped to appear literally;
      // gmdate() is used because the trailing Z denotes UTC
      $parameters[] = array('dt', gmdate('Y-m-d\TH:i:s\Z'));
      $xml = callDelicious($endPoint, $parameters);
    }else
    {
      echo <<< htmlCodeBlock
      <form method="post">
      <input type="hidden" name="method" value="add">
      URL: <input type="text" name="url"><br>
      Extended: <input type="text" name="extended"><br>
      Description:<input type="text" name="description"><br>
      Tags:<input type="text" name="tags"><br>
      <br>
      <input type="submit">
      </form>
    htmlCodeBlock;
    }

When the code is first called, a brief form is printed out, which allows the user to enter any desired information about the bookmark they wish to add. The second time around, the parameters are pulled from the $_POST variable, URL-encoded, and sent off to the API.
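The parameter list can also be assembled by a small function, which makes the timestamp format easy to check in isolation. This is a sketch only: the helper name buildAddParameters() is an assumption, and the pair format simply mirrors the array-of-pairs structure that callDelicious() expects in this chapter.

```php
<?php
// Hypothetical helper: turn submitted form fields into the array of
// name/value pairs used by callDelicious(), URL-encoding each value.
function buildAddParameters(array $form, $now)
{
    $parameters = array();
    foreach (array('url', 'extended', 'tags', 'description') as $field) {
        $value = isset($form[$field]) ? $form[$field] : "";
        $parameters[] = array($field, urlencode($value));
    }
    // Escaped T and Z are emitted literally; gmdate() keeps the time in UTC
    $parameters[] = array('dt', gmdate('Y-m-d\TH:i:s\Z', $now));
    return $parameters;
}

$form = array('url' => 'http://shiflett.org/', 'tags' => 'blogs php');
foreach (buildAddParameters($form, time()) as $pair) {
    echo $pair[0], "=", $pair[1], "\n";
}
```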

Note 

Note the placement of the closing heredoc tag at the very beginning of the line. Remember, before PHP 7.3 you can't have any white space before that tag or it won't work; PHP 7.3 and later also accept an indented closing tag.

That code works well enough, but why not display the tags already used to the users so they can make more intelligent choices about which tags to use?

    {
      $endPoint = "http://del.icio.us/api/tags/get?";
      $xml = callDelicious($endPoint, array());
      $usedTags = array();
      foreach($xml as $tag)
      {
        // Cast to string so that sort() compares text rather than SimpleXML objects
        $usedTags[] = (string) $tag['tag'];
      }
      sort($usedTags);
      $usedTags = implode(" ", $usedTags);
      echo <<< htmlCodeBlock
      <form method="post">
      <input type="hidden" name="method" value="add">
      URL: <input type="text" name="url"><br>
      Extended: <input type="text" name="extended"><br>
      Description:<input type="text" name="description"><br>
      Tags:<input type="text" name="tags"><br>
      Previously Used Tags: $usedTags<br>
      <br>
      <input type="submit">
      </form>
    htmlCodeBlock;
    }

Using the API, the list of previously used tags is retrieved, sorted alphabetically, and displayed to the user. This helps avoid near-duplicate tags. The resulting form looks like the image in Figure 11-4.

Figure 11-4
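Showing the existing tags leaves the comparison to the user. A submitted tag could also be checked in code for a likely near-duplicate such as "blog" versus "blogs". The sketch below uses a deliberately crude heuristic (ignore case and trailing letters "s"), which is an assumption made for illustration and will misjudge some words; the helper names tagStem() and findSimilarTag() are likewise invented for this example.

```php
<?php
// Crude normalization: lowercase the tag and strip trailing "s" characters.
function tagStem($tag)
{
    return rtrim(strtolower(trim($tag)), "s");
}

// Hypothetical helper: return an already used tag that differs from
// $newTag only by case or a plural "s", or null if there is none.
function findSimilarTag($newTag, array $usedTags)
{
    foreach ($usedTags as $used) {
        if ($used !== $newTag && tagStem($used) === tagStem($newTag)) {
            return $used;
        }
    }
    return null;
}

$usedTags = array("blogs", "monitor", "server", "software");
$similar = findSimilarTag("blog", $usedTags);
if ($similar !== null) {
    echo "Did you mean the existing tag \"$similar\"?\n";
}
```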

Delicious Conclusion

Finally, here is a heading title that looks like a pun. The Del.icio.us social bookmarking tools are becoming wildly popular; particularly, it seems, among PHP programmers. Incorporating portions of your social bookmarks into your website is an easy way to keep fresh and interesting content displayed. The ability to add bookmarks would plug in well with blogging software. Along with looking for a trackback RPC on any URLs entered into a blog entry, the software could offer the user the opportunity to add that URL as a bookmark, if not already present.




Professional Web APIs with PHP. eBay, Google, PayPal, Amazon, FedEx, Plus Web Feeds
ISBN: 764589547
EAN: N/A
Year: 2006
Pages: 130
