More on Relations Versus Types | Database in Depth: Relational Theory for Practitioners

Chapter 2 was called Relations Versus Types. However, I wasn't in a position in that chapter to explain the most important difference between those two concepts; now I am, and I will.

I've shown that the database at any given time can be thought of as a collection of true propositions: for example, the proposition "Supplier S1 is under contract, is named Smith, has status 20, and is located in city London." More specifically, I've shown that the argument values appearing in such a proposition (S1, Smith, 20, and London, in the example) are, precisely, the attribute values from the corresponding tuple, and of course each such attribute value is a value of the associated type. It follows that:

Types are sets of things we can talk about;

relations are (true) statements about those things.

In other words, types give us our vocabulary (the things we can talk about), and relations give us the ability to say things about the things we can talk about. (There's a nice analogy here that might help: Types are to relations as nouns are to sentences.) For example, if we limit our attention to suppliers only, for simplicity, we see that:

The things we can talk about are supplier numbers, names, integers, and character strings, and nothing else.
The things we can say are things of the form "The supplier with the specified supplier number is under contract, has the specified name, has the status denoted by the specified integer, and is located in the city denoted by the specified character string," and nothing else. (Nothing else, that is, except for things that are logically implied by things we can say. For example, given the things we already know we can say about supplier S1, we can also say things like "Supplier S1 is under contract, is named Smith, has status 20, and is located in some city," where the city is left unspecified. And if you're thinking that what I've just told you is very reminiscent of, and probably has some deep connection to, relational projection as well, you'd be absolutely right.)
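To make the vocabulary-versus-statements distinction concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the names `Supplier`, `SUPPLIERS`, `proposition`, `project`, and `is_supplier_number` are hypothetical, the rule for what counts as a supplier number is invented, and the second row is extra sample data not taken from the discussion above.

```python
from typing import NamedTuple


# Types: sets of values we can talk about. The supplier-number rule below
# (an "S" followed by digits) is an assumption made for illustration only.
def is_supplier_number(value: object) -> bool:
    return isinstance(value, str) and value[:1] == "S" and value[1:].isdigit()


class Supplier(NamedTuple):
    """One tuple; each attribute value is a value of the associated type."""
    sno: str      # supplier number
    sname: str    # name
    status: int   # integer
    city: str     # character string


# A relation: a set of tuples, each standing for one true proposition.
SUPPLIERS = frozenset({
    Supplier("S1", "Smith", 20, "London"),   # the example from the text
    Supplier("S2", "Jones", 10, "Paris"),    # illustrative extra row
})


def proposition(t: Supplier) -> str:
    """Spell out the proposition that a tuple stands for."""
    return (f"Supplier {t.sno} is under contract, is named {t.sname}, "
            f"has status {t.status}, and is located in city {t.city}.")


def project(relation, attrs):
    """Projection: keep only the named attributes of every tuple.

    The result is a set, so duplicate tuples collapse automatically.
    """
    return frozenset(tuple(getattr(t, a) for a in attrs) for t in relation)


for t in sorted(SUPPLIERS):
    print(proposition(t))

# Projecting away city gives tuples that stand for the weaker propositions
# "... and is located in some city", which the originals logically imply.
print(sorted(project(SUPPLIERS, ("sno", "sname", "status"))))
```

The type check and the relation are deliberately separate objects in the sketch: the first only delimits what values may appear, while the second is what actually asserts something about those values.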

The foregoing state of affairs has at least three important corollaries. To be specific, in order to represent "some portion of the real world" (as I put it in the previous section):

Types and relations are both necessary: without types, we have nothing to talk about; without relations, we can't say anything.
Types and relations taken together are sufficient, as well as necessary: we don't need anything else, logically speaking. (Well, we do need relvars, in order to reflect the fact that the real world changes over time, but we don't need them to represent the situation at any given time.)
Types and relations aren't the same thing. Beware of anyone who tries to pretend they are! In fact, pretending a type is just a special kind of relation is precisely what certain commercial products try to do (though they don't usually talk in such terms), and I hope it's clear that any product that's founded on such a logical error is doomed to eventual failure. The products I have in mind aren't relational products, of course; typically, they're products that support objects in the object-oriented sense, or products that try somehow to marry such objects and SQL tables. (In fact, at least one of the products I have in mind has indeed already failed.) Further details are beyond the scope of this book, however.
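The parenthetical remark about relvars in the second corollary can be sketched in a few lines of Python. This is an informal analogy, not a definition: the names `s_before` and `s` are hypothetical, an immutable `frozenset` stands in for a relation value, an ordinary variable stands in for a relvar, and the second row is made-up sample data.

```python
# A relation is a value: like the integer 3, it never changes.
s_before = frozenset({("S1", "Smith", 20, "London")})

# A relvar is a variable whose current value is a relation.
s = s_before

# "Updating" the relvar means assigning it a new relation value;
# the old value is left untouched. (The second row is illustrative.)
s = s | {("S2", "Jones", 10, "Paris")}

print(len(s_before), len(s))
```

At any given moment the situation is fully described by the relation value that `s` currently holds; the variable is needed only because that value is replaced as the real world changes.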

I'd like to wind up this section by offering a slightly more formal perspective on some of what I've been saying. I've said a database can be thought of as a collection of true propositions. In fact, a database, together with the operators that apply to the propositions represented in that database (or to sets of such propositions, rather), is a logical system. And when I say "a logical system," I mean a formal system (like Euclidean geometry, for example) that has axioms ("given truths") and rules of inference by which we can prove theorems ("derived truths") from those axioms. Indeed, it was Codd's very great insight, when he first invented the relational model back in 1969, that (despite the name) a database isn't really just a collection of data; rather, it's a collection of facts, or in other words true propositions. Those propositions (the given ones, that is, which is to say the ones represented by the base relvars) are the axioms of the logical system under discussion. The inference rules are essentially the rules by which new propositions can be derived from the given ones; in other words, they're the rules that tell us how to apply the operators of the relational algebra. Thus, when the system evaluates some relational expression (in particular, when it responds to some query), it's really deriving new truths from given ones; in effect, it's proving a theorem!
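The idea that evaluating a query amounts to proving a theorem can be sketched as follows. This is a toy illustration under stated assumptions, not a real relational engine: `HEADING`, `AXIOMS`, `restrict`, and `project` are hypothetical names, only two algebra operators are modeled, and the second tuple is invented sample data.

```python
# Axioms: the tuples of a base relation, each a given true proposition
# "Supplier sno is under contract, is named sname, has status status,
# and is located in city city."
HEADING = ("sno", "sname", "status", "city")
AXIOMS = frozenset({
    ("S1", "Smith", 20, "London"),
    ("S2", "Jones", 10, "Paris"),   # illustrative extra row
})


def restrict(heading, body, predicate):
    """Restriction: keep the tuples for which the predicate holds."""
    kept = frozenset(t for t in body if predicate(dict(zip(heading, t))))
    return heading, kept


def project(heading, body, attrs):
    """Projection: keep only the named attributes of each tuple."""
    positions = [heading.index(a) for a in attrs]
    return tuple(attrs), frozenset(tuple(t[i] for i in positions) for t in body)


# Query: in which cities is some supplier with status above 15 located?
h, b = restrict(HEADING, AXIOMS, lambda row: row["status"] > 15)
h, theorems = project(h, b, ("city",))

# Each result tuple stands for a derived truth of the form
# "Some supplier with status greater than 15 is located in city <city>."
for (city,) in sorted(theorems):
    print(f"Some supplier with status greater than 15 is located in city {city}.")
```

Nothing in the result was stored as such; every tuple in `theorems` follows from the axioms by applying the two operators, which is the sense in which the evaluation is a derivation.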

Once we understand the foregoing, we can see that the whole apparatus of formal logic becomes available for use in attacking "the database problem." In other words, questions such as:

What should the database look like to the user?
What should integrity constraints look like?
What should the query language look like?
How can we best implement queries?
More generally, how can we best evaluate database expressions?
How should results be presented to the user?
How should we design the database in the first place?

(and others like them) all become, in effect, questions in logic that are susceptible to logical treatment and can be given logical answers.

Of course, it goes without saying that the relational model supports the foregoing perception very directly, which is why, in my opinion, that model is rock solid, and "right," and will endure. It's also why, again in my opinion, other "data models" are simply not in the same ballpark. Indeed, I seriously question whether those other "models" deserve to be called models at all, in the same sense that the relational model can be called a model. Certainly most of them are ad hoc to a degree, instead of being firmly founded, as the relational model is, in set theory and predicate logic. I'll expand on these issues in Chapter 8.