11.11 Module: Fetching Latitude/Longitude Data from the Web

Credit: Will Ware

Given a list of cities, Example 11-1 fetches their latitudes and longitudes from one web site (http://www.astro.ch, a database used for astrology, of all things) and uses them to dynamically build a URL for another web site (http://pubweb.parc.xerox.com), which, in turn, creates a map highlighting the cities against the outlines of continents. Maybe someday a program will be clever enough to load the latitudes and longitudes as waypoints into your GPS receiver.

The code can be vastly improved in several ways. The main fragility of the recipe comes from relying on the exact format of the HTML page returned by the www.astro.com site, particularly in the rather clumsy for x in inf.readlines() loop in the findcity function. If this format ever changes, the recipe will break. You could change the recipe to use htmllib.HTMLParser instead, and be a tad more immune to modest format changes. This helps only a little, however. After all, HTML is meant for human viewers, not for automated parsing and extraction of information. A better approach would be to find a site serving similar information in XML (including, quite possibly, XHTML, the XML/HTML hybrid that combines the strengths of both of its parents) and parse the information with Python's powerful XML tools (covered in Chapter 12).
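For instance, a minimal sketch of the htmllib approach might look like the following (the CellText class and the table_cells function are illustrative names, not part of the recipe, and the sketch still assumes that the values of interest arrive inside table cells):

import htmllib, formatter, urllib

class CellText(htmllib.HTMLParser):
    """ Collect the text of every table cell, instead of splitting
    the raw HTML on '<' and '>'. """
    def __init__(self):
        htmllib.HTMLParser.__init__(self, formatter.NullFormatter())
        self.cells = []
    def start_td(self, attrs):
        self.save_bgn()                  # start buffering character data
    def end_td(self):
        if self.savedata is not None:    # guard against a stray </td>
            self.cells.append(self.save_end().strip())

def table_cells(url):
    parser = CellText()
    parser.feed(urllib.urlopen(url).read())
    parser.close()
    return parser.cells

Because the parser dispatches on tags rather than on the raw text layout, extra attributes or whitespace in the page no longer matter; only a change in the table structure itself would break it.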

However, despite this defect, this recipe still stands as an example of the kind of opportunity already afforded today by existing services on the Web, without having to wait for the emergence of commercialized web services.

Example 11-1. Fetching latitude/longitude data from the Web
import string, urllib, re, os, exceptions, webbrowser

JUST_THE_US = 0

class CityNotFound(exceptions.Exception): pass

def xerox_parc_url(marklist):
    """ Prepare a URL for the xerox.com map-drawing service, with marks
    at the latitudes and longitudes listed in list-of-pairs marklist. """
    avg_lat, avg_lon = max_lat, max_lon = marklist[0]
    marks = ["%f,%f" % marklist[0]]
    for lat, lon in marklist[1:]:
        marks.append(";%f,%f" % (lat, lon))
        avg_lat = avg_lat + lat
        avg_lon = avg_lon + lon
        if lat > max_lat: max_lat = lat
        if lon > max_lon: max_lon = lon
    avg_lat = avg_lat / len(marklist)
    avg_lon = avg_lon / len(marklist)
    if len(marklist) == 1:
        max_lat, max_lon = avg_lat + 1, avg_lon + 1
    diff = max(max_lat - avg_lat, max_lon - avg_lon)
    D = {'height': 4 * diff, 'width': 4 * diff,
         'lat': avg_lat, 'lon': avg_lon,
         'marks': ''.join(marks)}
    if JUST_THE_US:
        url = ("http://pubweb.parc.xerox.com/map/db=usa/ht=%(height)f" +
               "/wd=%(width)f/color=1/mark=%(marks)s/lat=%(lat)f/" +
               "lon=%(lon)f/") % D
    else:
        url = ("http://pubweb.parc.xerox.com/map/color=1/ht=%(height)f" +
               "/wd=%(width)f/color=1/mark=%(marks)s/lat=%(lat)f/" +
               "lon=%(lon)f/") % D
    return url

def findcity(city, state):
    Please_click = re.compile("Please click")
    city_re = re.compile(city)
    state_re = re.compile(state)
    url = ("""http://www.astro.ch/cgi-bin/atlw3/aq.cgi?expr=%s&lang=e"""
           % (string.replace(city, " ", "+") + "%2C+" + state))
    lst = []
    found_please_click = 0
    inf = urllib.FancyURLopener().open(url)
    for x in inf.readlines():
        x = x[:-1]
        if Please_click.search(x) != None:
            # Here is one assumption about unchanging structure
            found_please_click = 1
        if (city_re.search(x) != None and
            state_re.search(x) != None and
            found_please_click):
            # Pick apart the HTML pieces
            L = []
            for y in string.split(x, '<'):
                L = L + string.split(y, '>')
            # Discard any pieces of zero length
            lst.append(filter(None, L))
    inf.close()
    try:
        # Here's a few more assumptions
        x = lst[0]
        lat, lon = x[6], x[10]
    except IndexError:
        raise CityNotFound("not found: %s, %s" % (city, state))
    def getdegrees(x, dividers):
        if string.count(x, dividers[0]):
            x = map(int, string.split(x, dividers[0]))
            return x[0] + (x[1] / 60.)
        elif string.count(x, dividers[1]):
            x = map(int, string.split(x, dividers[1]))
            return -(x[0] + (x[1] / 60.))
        else:
            raise CityNotFound("Bogus result (%s)" % x)
    return getdegrees(lat, "ns"), getdegrees(lon, "ew")

def showcities(citylist):
    marklist = []
    for city, state in citylist:
        try:
            lat, lon = findcity(city, state)
            print ("%s, %s:" % (city, state)), lat, lon
            marklist.append((lat, lon))
        except CityNotFound, message:
            print "%s, %s: not in database? (%s)" % (city, state, message)
    url = xerox_parc_url(marklist)
    # Print URL
    # os.system('netscape "%s"' % url)
    webbrowser.open(url)

# Export a few lists for test purposes
citylist = (("Natick", "MA"),
            ("Rhinebeck", "NY"),
            ("New Haven", "CT"),
            ("King of Prussia", "PA"))
citylist1 = (("Mexico City", "Mexico"),
             ("Acapulco", "Mexico"),
             ("Abilene", "Texas"),
             ("Tulum", "Mexico"))
citylist2 = (("Munich", "Germany"),
             ("London", "England"),
             ("Madrid", "Spain"),
             ("Paris", "France"))

if __name__ == '__main__':
    showcities(citylist1)

11.11.1 See Also

Documentation for the standard library module htmllib in the Library Reference; information about the Xerox PARC map viewer is at http://www.parc.xerox.com/istl/projects/mapdocs/; AstroDienst hosts a worldwide server of latitude/longitude data (http://www.astro.com/cgi-bin/atlw3/aq.cgi).


