Got a number of scenic or strategically placed webcams you watch daily? Or would you like to ensure that your coworkers are actually doing the work you've assigned them? Keep on top of your pictorial problems with Python.

Keeping track of a large number of active webcams is a thankless task: half the time the images haven't changed, and the rest of the time it takes as long to refresh them all, or wait for them to refresh, as it does to look at and mentally process the images themselves.

This hack alleviates your grief by automatically downloading images from webcams every 15 seconds, but only if they've been updated, so that we don't waste bandwidth. It's also the only Python script in the entire book and, as such, earns special recognition.

To tell the program which URLs to download, we have to put them in a file, one per line. The program looks for this list by default at URIs.txt, but this can be changed both in the source and on the command line. The program puts each picture in its own file, after producing an index file (which defaults to webcams.html) so that we can quickly and easily browse all of the downloaded images in one go.

The Code

Save the following code as getcams.py:

    #!/usr/bin/python
    """
    getcams.py - Archiving Your Favorite Web Cams
    Sean B. Palmer, <http://purl.org/net/sbp/>, 2003-07.
    License: GPL 2; share and enjoy!

    Usage:
       python getcams.py [ <filename> ]
          <filename> defaults to URIs.txt
    """

    import urllib2, time
    from urllib import quote
    from email.Utils import parsedate

    # # # # # # # # # # # # # # # # #
    # Configurable stuff
    #

    # download how often, in seconds
    seconds = 15

    # what file we should write to
    index = 'webcams.html'

    # End of configurable stuff!
    # # # # # # # # # # # # # # # # #

    def quoteURI(uri):
        # Turn a URI into a filename.
        return quote(uri, safe='')

    def makeHTML(uris):
        # Create an HTML index so that we
        # can look at the archived piccies.
        print "Creating a webcam index at", index
        f = open(index, 'w')
        print >> f, '<html xmlns="http://www.w3.org/1999/xhtml" >'
        print >> f, '<head><title>My Webcams</title></head>'
        print >> f, '<body>'
        for uri in uris:
            # We use the URI of the image for the filename, but we have
            # to hex encode it first so that our operating systems are
            # happy with it. The filename itself now contains '%' signs,
            # so we escape those once more for use in the HTML link.
            link = quoteURI(uri).replace('%', '%25')
            # Now we make the image, and provide a link to the original.
            print >> f, '<p><img src="%s" alt=" " /><br />' % link
            print >> f, '-<a href="%s">%s</a></p>' % (uri, uri)
        print >> f, '</body>'
        print >> f, '</html>'
        f.close()
        print "Done creating the index!\n"

    metadata = {}

    def getURI(uri):
        print "Trying", uri
        # Try to open the URI--we're not downloading it yet.
        try: u = urllib2.urlopen(uri)
        except Exception, e: print " ...failed:", e
        else:
            # Get some information about the URI; we do this
            # to find out whether it's been updated yet.
            # Note: 'content-size' is not a standard HTTP header (the
            # standard one is Content-Length), so it is usually None.
            info = u.info()
            meta = (info.get('last-modified'), info.get('content-size'))
            print " ...got metadata:", meta
            if metadata.get(uri) == meta:
                print " ...not downloading: no update yet"
            else:
                # The image has been updated, so let's download it.
                metadata[uri] = meta
                print " ...downloading; type: %s; size: %s" % \
                   (info.get('content-type', '?'), info.get('content-size', '?'))
                data = u.read()
                open(quoteURI(uri), 'wb').write(data)
                print " ...done! %s bytes" % len(data)

                # Save an archived version for later.
                t = parsedate(info.get('last-modified'))
                archv = quoteURI(uri) + '-' + \
                   time.strftime('%Y%m%dT%H%M%S', t) + '.jpg'
                open(archv, 'wb').write(data)
            u.close()

    def doRun(uris):
        for uri in uris:
            startTime = time.time()
            getURI(uri)
            finishTime = time.time()
            timeTaken = finishTime - startTime
            print "This URI took", timeTaken, "seconds\n"
            timeLeft = seconds - timeTaken # time until the next run
            if timeLeft > 0: time.sleep(timeLeft)

    def main(argv):
        # We need a list of URIs to download. We require them to be
        # in a file; the next line defaults the filename to URIs.txt
        # if it can't gather one from the command line.
        fn = (argv + [None])[0] or 'URIs.txt'
        data = open(fn).read()
        uris = data.splitlines()

        # Now make an index, and then
        # continuously download the piccies.
        makeHTML(uris)
        while 1: doRun(uris)

    if __name__=="__main__":
        import sys
        # If the user asks for help, give it to them!
        # Otherwise, just run the program as usual.
        if sys.argv[1:] in (['--help'], ['-h'], ['-?']):
            print __doc__
        else: main(sys.argv[1:])

Running the Hack

Here's a typical run, invoked from the command line:

    % python getcams.py
    Creating a webcam index at webcams.html
    Done creating the index!

    Trying http://example.org/webcams/someplace.jpg
     ...got metadata: ('Thu, 10 Jul 2003 15:50:38 GMT', None)
     ...downloading; type: image/jpeg; size: ?
     ...done! 32594 bytes
    This URI took 8.2480000257 seconds

    Trying http://example.org/webcams/phenomic.jpg
     ...got metadata: ('Thu, 10 Jul 2003 11:35:51 GMT', None)
     ...not downloading: no update yet
    This URI took 1.30099999905 seconds

The code, complicated though it looks, consists of only a few stages:
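Read from the listing, the flow appears to be: load the URI list, write the HTML index, then loop over the URIs, fetching each one's headers and downloading the image only when those headers differ from the last ones seen. That comparison is the heart of the hack, and it can be sketched on its own. The version below is a minimal Python 3 illustration rather than the script's own code: needs_download and quote_uri are hypothetical names, the headers are plain dictionaries standing in for a real HTTP response (whose header lookup is case-insensitive), and the standard Content-Length header is assumed in place of the listing's content-size.

```python
from urllib.parse import quote


def quote_uri(uri):
    """Turn a URI into a single filename-safe string, like quoteURI in the listing."""
    return quote(uri, safe='')


def needs_download(seen, uri, headers):
    """Return True when the headers differ from those last recorded for this URI.

    `seen` maps each URI to its (Last-Modified, Content-Length) pair and is
    updated in place whenever a change is detected.
    """
    meta = (headers.get('Last-Modified'), headers.get('Content-Length'))
    if seen.get(uri) == meta:
        return False          # same metadata as last time: skip the download
    seen[uri] = meta          # remember the new metadata for the next pass
    return True


# Stand-in header dictionaries; a real run would take these from the response.
seen = {}
uri = 'http://example.org/webcams/someplace.jpg'
first = {'Last-Modified': 'Thu, 10 Jul 2003 15:50:38 GMT'}
later = {'Last-Modified': 'Thu, 10 Jul 2003 15:51:02 GMT'}

print(needs_download(seen, uri, first))   # True: first sighting
print(needs_download(seen, uri, first))   # False: headers unchanged
print(needs_download(seen, uri, later))   # True: new timestamp
print(quote_uri(uri))
```

Keeping the cache in a dictionary keyed by URI mirrors the module-level metadata dictionary in the listing; passing it in as an argument simply makes the check easier to try out in isolation.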
Hacking the Hack

The code has a number of limitations:
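Two assumptions are visible in the listing itself: the archive step expects every response to carry a Last-Modified header, and it always appends .jpg to the archived filename. The following Python 3 sketch shows one possible way to relax both. It is an illustration under stated assumptions, not part of the original script: archive_name, the EXTENSIONS table, and the now parameter are hypothetical, and falling back to the current time when no usable date is supplied is a design choice made here.

```python
import time
from email.utils import parsedate
from urllib.parse import quote

# Hypothetical mapping from Content-Type to a file extension.
EXTENSIONS = {'image/jpeg': '.jpg', 'image/png': '.png', 'image/gif': '.gif'}


def archive_name(uri, last_modified=None, content_type=None, now=None):
    """Build an archive filename from the URI, a timestamp, and an extension.

    Falls back to the current UTC time (or `now`, in seconds since the epoch)
    when Last-Modified is missing or unparseable, and to '.jpg' when the
    content type is unknown.
    """
    parsed = parsedate(last_modified) if last_modified else None
    if parsed is None:
        parsed = time.gmtime(now)          # gmtime(None) means "right now"
    stamp = time.strftime('%Y%m%dT%H%M%S', parsed)

    # Drop any parameters such as "; charset=..." before the lookup.
    mime = (content_type or '').split(';')[0].strip().lower()
    return quote(uri, safe='') + '-' + stamp + EXTENSIONS.get(mime, '.jpg')


print(archive_name('http://example.org/webcams/someplace.jpg',
                   'Thu, 10 Jul 2003 15:50:38 GMT', 'image/jpeg'))
print(archive_name('http://example.org/webcams/other', None, 'image/png', now=0))
```

The timestamp format is the one used in the listing, so archives produced either way sort together by name.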
Other than these limitations, the code is safe.

Sean B. Palmer