What Is the Semantic Web? | Semantics in Business Systems: The Savvy Managers Guide (The Savvy Managers Guides)

One of the confusions, and also a source of perceived instability on the Semantic Web front, is the number of seemingly overlapping initiatives and their rapid ascendancy and decline. If you examine this more closely, though, it is more encouraging. First, the acronyms are changing faster than the underlying concepts. Second, many of these standards are not contradictory, but operate at different levels. Third, as someone once said, "With death there is hope." The marketplace is winnowing out these ideas and settling on a few useful ones.

Figure 14.1 shows some of the key Semantic Web technologies and their interrelationships. I will describe most of them briefly and focus on the Resource Description Framework (RDF) and Notation3, because RDF is the base technology of the Semantic Web:

Knowledge interchange format (KIF)—This is the interface format that knowledge engineers and those in the artificial intelligence (AI) community use to exchange rules.
Dublin Core—The Dublin Core (from the not so romantic Dublin, Ohio) is metadata for authored materials. It covers books, music, articles, Web pages, and the like. As such, it represents a large portion of what the Web deals with, and it has been widely adopted as the ontology of choice for things such as authorship, title, publish date, copyright holder, and so on. By convention, Dublin Core tags are preceded by dc:.
DARPA agent markup language (DAML)—DAML is a schema language that was developed by the U.S. Defense Department. It was often combined with OIL (DAML+OIL), but is now being superseded by OWL.
Ontology inference layer (OIL)—OIL primarily adds inferencing to a schema.
Resource description framework (RDF)—Described below.
Notation3—An abbreviated format for RDF, described below.
RDF schema (RDFS)—An extension to RDF to cover schema concepts such as class, subclass, and so on.
Web ontology language (OWL)—The heir apparent to RDF, DAML+OIL, and RDFS.

Figure 14.1: Some key Semantic Web technologies (the lines represent "begats" relationships).

Resource Description Framework and Notation3

RDF is the base technology for the Semantic Web. It is a modeling language, and it models a world (the so-called real world or any world of concepts, documents, and ideas). The model is conceptually simple and sound, because it is based on the mathematics of model theory. The model should look familiar by now: two ovals connected by an arrow (Figure 14.2). The RDF model is structurally similar to entity/attribute/value or term/fact/term; what makes it different is that each part of the triple (the subject, the property [predicate], or the object) can be a direct reference to the resource that describes that item. The resources are coded as uniform resource identifiers (URIs) so that a single unambiguous definition can be found.

Figure 14.2: Basic RDF model.
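
To make the subject/predicate/object structure concrete, here is a minimal Python sketch of a set of triples and a pattern lookup over it. It is an illustration only, not an RDF library: the `Triple` type, the `match` helper, and every `http://example.org/` URI are hypothetical names invented for this sketch.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the thing being described
    predicate: str  # URI of the property
    obj: str        # URI of another resource, or a literal value


# Hypothetical base URI, used only for illustration.
EX = "http://example.org/"

graph = {
    Triple(EX + "doc/report", EX + "terms/author", EX + "people/alice"),
    Triple(EX + "doc/report", EX + "terms/title", "Quarterly Report"),
    Triple(EX + "people/alice", EX + "terms/name", "Alice"),
}


def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every field that is not None."""
    return sorted(
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    )


# Everything the graph says about the report document.
about_report = match(graph, subject=EX + "doc/report")
```

Because the author of the report is itself a URI, the same lookup can be repeated with that URI as the subject, which is how a chain of triples becomes a graph.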

For example, we can model fairly concrete things such as the fact that Tim Bray wrote an article titled "What Is RDF?"^[111] Figure 14.3 shows this conceptually; however, in RDF terms we should convert each of the terms to resources, as in Figure 14.4 and the equivalent listing in Figure 14.5.

Figure 14.3: Tim Bray and RDF.

Figure 14.4: Tim Bray and RDF as resources.

    <rdf:Description about='http://www.textuality.com/RDF/Why-RDF.html'>
      <Author>Tim Bray</Author>
      <Home-Page rdf:resource='http://www.textuality.com'/>
    </rdf:Description>

Figure 14.5: Tim Bray and RDF as resources.
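
As a rough illustration of how the listing in Figure 14.5 maps onto triples, the following Python sketch reads it with the standard library XML parser. It is not a conforming RDF/XML parser and reads the listing literally. Two assumptions are mine: the `xmlns:rdf` declaration is added so that the fragment parses on its own (the figure omits it), and the `description_to_triples` helper is a hypothetical name.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The listing from Figure 14.5. The xmlns:rdf attribute is an addition
# (an assumption of this sketch) so the rdf: prefix resolves.
LISTING = """
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 about='http://www.textuality.com/RDF/Why-RDF.html'>
  <Author>Tim Bray</Author>
  <Home-Page rdf:resource='http://www.textuality.com'/>
</rdf:Description>
"""


def description_to_triples(xml_text):
    """Turn one rdf:Description element into (subject, predicate, object) tuples."""
    node = ET.fromstring(xml_text.strip())
    # The listing uses an unqualified 'about'; also accept rdf:about.
    subject = node.get("about") or node.get(f"{{{RDF_NS}}}about")
    triples = []
    for child in node:
        resource = child.get(f"{{{RDF_NS}}}resource")
        if resource is not None:
            # The object is another resource, identified by its URI.
            triples.append((subject, child.tag, resource))
        else:
            # The object is a literal string.
            triples.append((subject, child.tag, (child.text or "").strip()))
    return triples


triples = description_to_triples(LISTING)
for triple in triples:
    print(triple)
```

Each child element yields one triple whose subject is the document URI. The `Author` element produces a literal object, while `Home-Page` produces a URI object.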

I pulled this example from an article that is almost 5 years old. Several things about the syntax and how we think about RDF have changed over this period. However, I think this example illustrates both what RDF is meant to do and some of the issues that people have when they first come to it. (The diagram was not in the original article. I annotated the word "author" with the prefix "dc:" because it is more in line with current usage, as the meaning of the word "author" would have been defined in the Dublin Core ontology.)

The first part of this makes a lot of sense, and I think it shows where the real strength of RDF lies. In other modeling disciplines we would talk about the article "What Is RDF?" However, the article is a document. It is stored on a computer. We can just point at it. We don't have to talk about it. This is potentially brilliant.

Except that it already underscores some problems. I read this article from a hard copy, so I went to the URL/URI shown, and it wasn't there. I was redirected to the XML.com site, but not to this specific document; I was redirected to the home page. I had to find the document again. This is a separate but related problem that we have had with the current Web since its inception. I can almost hear Tim Berners-Lee,^[112] as well as Jakob Nielsen, harping on the "URLs must stay active forever" theme.^[113] The theme matters here because once we start building our knowledge lattices on top of explicit references to specific URIs, the system stops working if a URI changes.

It's the other end of the graph that is of real interest. Physical, real-world objects, which are the easiest to model in traditional modeling, are suddenly difficult. What should we find at the end of this link? We should find Tim Bray, but of course that's not possible; he is flesh and blood. What we do find is a home page that sort of has something to do with him, and a link to a page that has some more information, including how to contact him (phone and email). But this hardly seems satisfactory. This isn't Tim Bray. This is how to contact Tim Bray. The specifications and current state of this technology are not very precise about what we should find at the final terminus for a reference to a person, but the person's home page hardly seems definitive, and it is only useful in a limited number of cases.

RDF isn't limited to this type of simplistic association. We can relate concepts to concepts and things to concepts. At the risk of overload, let me introduce the alternative notation for RDF, called Notation3, and show a part of the example I will introduce later.

Figure 14.6 was extracted from a sample genealogy ontology.^[114] The first two lines refer to the ontologies used (there were more ontologies in this example). In each case, the link shows where the ontology resides (note to the Semantic Web: we're going to need a more sophisticated versioning system than what was used here—file folders with dates), and the prefix to use in the rest of this document when referring to that ontology. For example, the OWL ("owl:") ontology is where we will find the definition of "inverseOf."

    @prefix owl: <http://www.w3.org/2002/07/owl#>.
    @prefix gc: <http://www.daml.org/2001/01/gedcom/gedcom#>.
    gc:parent owl:inverseOf gc:child.
    gc:grandparent owl:inverseOf gc:grandchild.

Figure 14.6: A snippet of a genealogy ontology in Notation3.

The second half of the listing contains two examples of the Notation3 (n3) notation. Each line is an RDF triple (subject, predicate, object). This is much more concise and easier to read than the XML form of RDF shown in Figure 14.5.
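
The prefix mechanism and the meaning of "inverseOf" can be sketched in a few lines of Python. This handles only the tiny subset of Notation3 that appears in Figure 14.6 (`@prefix` lines and one-line statements) and is not a general parser. The helper names and the `http://example.org/` individuals are hypothetical, and the inverse rule shown is my reading of owl:inverseOf: if p is the inverse of q, then a statement (s, p, o) implies (o, q, s).

```python
SNIPPET = """
@prefix owl: <http://www.w3.org/2002/07/owl#>.
@prefix gc: <http://www.daml.org/2001/01/gedcom/gedcom#>.
gc:parent owl:inverseOf gc:child.
gc:grandparent owl:inverseOf gc:grandchild.
"""

OWL_INVERSE = "http://www.w3.org/2002/07/owl#inverseOf"
GC = "http://www.daml.org/2001/01/gedcom/gedcom#"


def expand(term, prefixes):
    """Replace a 'prefix:local' name with its full URI."""
    prefix, _, local = term.partition(":")
    return prefixes[prefix] + local


def parse_n3_subset(text):
    """Parse @prefix lines and one-line 'subject predicate object.' statements."""
    prefixes, triples = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith("."):
            line = line[:-1].rstrip()  # drop the statement terminator
        parts = line.split()
        if parts[0] == "@prefix":
            prefixes[parts[1].rstrip(":")] = parts[2].strip("<>")
        else:
            triples.append(tuple(expand(term, prefixes) for term in parts))
    return prefixes, triples


def apply_inverses(schema, facts):
    """Add (o, q, s) for every fact (s, p, o) where p and q are inverses."""
    inverse = {}
    for s, p, o in schema:
        if p == OWL_INVERSE:
            inverse[s], inverse[o] = o, s  # the relationship is symmetric
    derived = set(facts)
    for s, p, o in facts:
        if p in inverse:
            derived.add((o, inverse[p], s))
    return derived


prefixes, schema = parse_n3_subset(SNIPPET)

# Hypothetical individuals, for illustration only.
facts = {("http://example.org/ann", GC + "parent", "http://example.org/bob")}
derived = apply_inverses(schema, facts)
```

After this runs, `derived` holds the original statement plus the mirrored one that uses `gc:child`, even though only the `gc:parent` statement was asserted.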

^[111]Tim Bray, "What Is RDF?" Available at http://www.xml.com/pub/a/2001/01/24/rdf.html.

^[112]Tim Berners-Lee, "Cool URIs Don't Change," 1998. Available at http://www.w3.org/Provider/Style/URI.html.

^[113]Jakob Nielsen, "Web Pages Must Live Forever," Nov 29, 1998. Available at http://useit.com/alertbox/981129.html.

^[114]Jos De Roo, http://www.agfa.com/w3c/euler/gedcom-relations.n3.