Hack 21. Map Health Code Violations with RDFMapper

With this simple web service, you can build interactive Flash maps from arbitrary data sources in RSS or RDF.

RDFMapper is a web service that searches an RDF file for resources with geographic locations and returns a map overlaid with dots representing the located resources. Clicking on a dot displays information about the clicked resource. Arbitrary images can be treated as maps, so the service can be used for any kind of image annotation.

One big advantage of using RDF for spatial annotation is that it allows data of different types to be mingled freely. RDF is built from a series of "vocabularies": for example, there are vocabularies for weblog posts (RSS), for personal descriptions (FOAF), for restaurants (ChefMoz), for time and events (RDFCalendar), for geometry (RDFGeom2d), and for geography (RDFIG Geo). These vocabularies can be mixed together in the same RDF document. Any kind of RDF content can be annotated with a geographical location by dropping in latitude and longitude properties from the Geo vocabulary at http://www.w3.org/2003/01/geo/wgs84_pos#, which expresses both values in WGS84. RDFMapper can then make instant maps of the content. The RDF approach integrates geodata into any domain where it is relevant, without requiring any changes to schemas or data models.

There are innumerable sources of RDF on the Web. Among them are weblogs whose "feeds" are in RSS 1.0 (the RDF dialect of RSS), and descriptions of people and their relationships encoded using the FOAF ("friend of a friend") namespace. Many other sources exist. (For example, the contents of the DMOZ open directory, http://dmoz.org/, are available in RDF.)

But it's easy to roll your own, as Dav Coleman chose to do when making maps of health code violations in San Francisco. On its web site, the city publishes lists of all food service providers, along with enticing details of the violations they've committed: Unsafe Food Source, Vermin, Personal Hygiene, and so on. He used a few Perl scripts to scrape the site, extract the health code details and the street addresses of the restaurants, and locate them using a geocoding service such as http://geocoder.us/ (see [Hack #79] for details on how to do this easily yourself). He published the results as an RDF file (available at http://mappinghacks.com/data/sfhealth.rdf), giving the name and location of each restaurant for which a health code violation had been reported.
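
The last step of such a pipeline, turning scraped and geocoded records into geo-annotated RDF, can be sketched in a few lines. This is not Dav Coleman's code (his scripts were written in Perl); it is a minimal Python illustration. The record fields, the sample data, and the use of Dublin Core (dc:title, dc:description) for the name and violation are assumptions made for the example; only the geo:lat and geo:long properties come from the Geo vocabulary described earlier.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
GEO_NS = "http://www.w3.org/2003/01/geo/wgs84_pos#"
DC_NS = "http://purl.org/dc/elements/1.1/"  # assumption: Dublin Core for labels


def records_to_rdf(records):
    """Return an RDF/XML string with one rdf:Description per record."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("geo", GEO_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    for rec in records:
        node = ET.SubElement(
            root,
            f"{{{RDF_NS}}}Description",
            {f"{{{RDF_NS}}}about": rec["url"]},
        )
        ET.SubElement(node, f"{{{DC_NS}}}title").text = rec["name"]
        ET.SubElement(node, f"{{{DC_NS}}}description").text = rec["violation"]
        # The two properties that make the resource mappable.
        ET.SubElement(node, f"{{{GEO_NS}}}lat").text = str(rec["lat"])
        ET.SubElement(node, f"{{{GEO_NS}}}long").text = str(rec["long"])
    return ET.tostring(root, encoding="unicode")


# Made-up sample record, standing in for scraped and geocoded data.
SAMPLE = [
    {
        "url": "http://example.org/restaurants/1",
        "name": "Example Cafe",
        "violation": "Vermin",
        "lat": 37.77,
        "long": -122.42,
    }
]

if __name__ == "__main__":
    print(records_to_rdf(SAMPLE))
```

Because the location is just two more properties on each resource, the same function would work unchanged if the records carried additional descriptive fields from other vocabularies.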

The following HTML invokes RDFMapper to map the contents of this RDF file. It sends a set of key-value pairs to the web service, which returns a map in .swf (Macromedia Flash) format:

<a href="javascript:document.forms['form1'].submit();">Submit</a>

When this invisible form is submitted by clicking on the "Submit" link, it performs an HTTP POST to the RDFMapper web service. RDFMapper puts together a map and sends it back to the client.
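
The same POST can be assembled outside a browser. The sketch below only builds the request and does not send it. The endpoint URL and the basemap URL are placeholders, since neither is given in the text, and pairing extractTopicLocation with extractor and itemGen1 with itemGen is an assumption based on the function names mentioned later in this hack. The parameter names themselves (basemap, content, extractor, itemGen) are the ones the service is described as accepting.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder: the real RDFMapper endpoint is not given in the text.
SERVICE_URL = "http://example.org/rdfmapper"


def build_request(basemap, content, extractor, item_gen):
    """Build (but do not send) a form-encoded POST to the mapping service."""
    params = {
        "basemap": basemap,
        "content": content,
        "extractor": extractor,
        "itemGen": item_gen,
    }
    body = urlencode(params).encode("ascii")
    return Request(
        SERVICE_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


req = build_request(
    basemap="http://example.org/basemaps/sf.xml",  # placeholder base map
    content="http://mappinghacks.com/data/sfhealth.rdf",
    extractor="extractTopicLocation",  # assumed pairing
    item_gen="itemGen1",  # assumed pairing
)

if __name__ == "__main__":
    print(req.get_method(), req.full_url)
    print(req.data.decode("ascii"))
```

Passing the request to urllib.request.urlopen would submit it; against the real service, the response would be the .swf map described above.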

Figure 2-22 shows the RDFMapper visualization of the RDF file of health code violations. You can view information about individual restaurants by clicking on the appropriate dot on the map. This could be a "how to lie with maps" story: the only metadata displayed visually is that a violation of some kind occurred at each location. It could be one relatively innocuous citation for "Holding Temperatures" a degree or two too warm, or a string of violations including Vermin and Contaminated Equipment; on this map, they all look the same. Still, it is a good illustration of using RDF as an intermediary format: with the help of quick conversion scripts, RDFMapper can also map web data that is not already published as RDF.

Figure 2-22. The RDFMapper map of San Francisco health code violations

2.9.1. The RDFMapper Web Service in Depth

The basemap parameter passed to the RDFMapper service specifies the URL of a file in RDFMap format describing the map that is to appear in the background. This RDFMap file specifies the URL of the image and the parameters of the geographic projection that applies to the image. The image is either a JPEG or a .swf file. A catalog of maps suitable for use with RDFMapper can be found at http://www.mapbureau.com/viewer. You can also create your own base maps from existing images by encoding the relevant projection data in RDFMap format.

The content parameter specifies the URL of the RDF file to be mapped.

Most of the remaining parameters denote functions defined in the Fabl programming language, an open source language designed specifically for manipulating RDF. (Fabl is implemented within RDF as well; see http://fabl.net for more details.) Parameters naming Fabl functions may take one of two forms: a bare function name, or a function name qualified by the location of a Fabl code file.

In the former case, the function is taken from a library of utility functions for RDFMapper, available at http://www.mapbureau.com/libsrc/rdfmapper_utils-2.0.fbl. In the latter case, the function is taken from the specified Fabl code file. See the rdfmapper_utils-2.0.fbl file for documentation of the utilities used in the previous example, such as itemGen1, extractTopicLocation, and so on. If, as is often the case, the functions appearing in the library suit the purposes of an application, there is no need to develop new versions.

The extractor parameter denotes a Fabl function that, when given an RDF resource, returns a geom2d:Point representing its location, or nil. RDFMapper maps everything (that is, all resources) in the content file for which extractor returns a non-nil value. Here's a sample extractor:

geom2d:Point function extractGeoLatLong(Resource x) {
  var geom2d:Point rs;
  if ((count(x,geo:lat)>0) && (count(x,geo:long)>0))
  {
    rs = new(geom2d:Point);
    rs . geom2d:x = x.geo:long;
    rs . geom2d:y = x.geo:lat;
    return rs;
  }
  return nil ~ geom2d:Point;
}

In English, this code reads: "If geo:lat and geo:long each take on at least one value within some resource x, then extract the corresponding values, package them as a geom2d:Point, and return that Point."
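
For readers who don't know Fabl, the same logic can be approximated in Python. This is an analogue for illustration, not part of RDFMapper: it assumes the content is RDF/XML in which each resource is a direct child of the root element and carries geo:lat and geo:long as child elements with numeric text. A real RDF parser would handle the other serializations that this shortcut ignores.

```python
import xml.etree.ElementTree as ET

GEO_NS = "http://www.w3.org/2003/01/geo/wgs84_pos#"


def extract_geo_lat_long(resource):
    """Return (x, y) = (longitude, latitude), or None if either is absent."""
    lat = resource.find(f"{{{GEO_NS}}}lat")
    lon = resource.find(f"{{{GEO_NS}}}long")
    if lat is None or lon is None:
        return None
    return (float(lon.text), float(lat.text))


def located_resources(rdf_xml):
    """Collect a point for every resource the extractor can locate."""
    root = ET.fromstring(rdf_xml)
    points = []
    for resource in root:
        point = extract_geo_lat_long(resource)
        if point is not None:  # unlocated resources are simply skipped
            points.append(point)
    return points


# Made-up sample: one located resource and one without coordinates.
SAMPLE_RDF = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
  <rdf:Description rdf:about="http://example.org/a">
    <geo:lat>37.77</geo:lat>
    <geo:long>-122.42</geo:long>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/b"/>
</rdf:RDF>
"""

if __name__ == "__main__":
    print(located_resources(SAMPLE_RDF))
```

As in the Fabl version, longitude becomes x and latitude becomes y, and a resource missing either property yields no point and is left off the map.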

Additional details about RDFMapper, including a complete description of all parameters, can be found at http://www.mapbureau.com/rdfmapper/2.0.

2.9.2. Hacking RDFMapper

Although RDFMapper is designed to plot locations from geo-annotated RSS and RDF, there is a workaround for cases where a hosted online journal service, such as Blogger, does not let you add new tags and namespaces to your RSS feed.

RDFMapper's extractLocation function gets around this issue by scanning the content of posts for text of the form 124.12346.16. Such text can be hidden from the reader of the weblog by embedding it in a span of the form:

 124.12346.16

It's not pretty, but it works! The following RDFMapper parameters will map a Blogger weblog:

name="basemap" value="http://www.mapbureau.com/basemaps/astoria.0.xml"
name="content" value="http://rdfmapperexample.blogspot.com/rss/rdfmapperexample.xml"
name="extractor" value="extractLocation"
name="itemGen" value="itemGen0"

Using this hack, any weblog with an RSS feed can be mapped with RDFMapper. In fact, RDFMapper is a generalization of an earlier service called Blogmapper, which mapped weblogs but not other varieties of RDF. If you just want to map blog posts, then you might have a look at Blogmapper itself, at http://blogmapper.com/.

Chris Goad
