Chapter Two. Relations Versus Types | Database in Depth: Relational Theory for Practitioners

The title of this chapter is "Relations Versus Types," but most of it has to do with types. The point is, the relational model certainly requires a supporting type system, but it has very little to say about the nature of that system. Why does it require one? Because relations (and relvars) are defined in terms of types; that is, every attribute of every relation (and of every relvar) is defined to be of some type. For example, attribute STATUS of the suppliers relvar S might be of type INTEGER. If it is, then every relation s that's a possible value for relvar S must also have a STATUS attribute of type INTEGER, which means in turn that every tuple in such a relation s must have a STATUS value that's an integer.

I'll be discussing such matters in more detail later in this chapter. For now, let me just say that, with certain important exceptions (which I'll also be discussing later), relational attributes can be defined on any types whatsoever, implying among other things that those types can be as complex as we like (as we'll also see later). In particular, such attributes can be defined on either system-defined (that is, built-in) or user-defined types. For our running example, I'll assume the attributes have types as follows (note that some of the attributes have the same name as the types they're defined on and others don't):

     Suppliers              Parts                 Shipments
     SNO    : SNO           PNO    : PNO          SNO : SNO
     SNAME  : NAME          PNAME  : NAME         PNO : PNO
     STATUS : INTEGER       COLOR  : COLOR        QTY : QTY
     CITY   : CHAR          WEIGHT : WEIGHT
                            CITY   : CHAR

I'll also assume, where it makes any difference, that types INTEGER (integers) and CHAR (character strings of arbitrary length) are system-defined and the others are user-defined.
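To make the idea of attributes being defined on types a little more concrete, here's a minimal sketch in Python (not a language this chapter otherwise uses). It models the heading of the suppliers relvar S as a mapping from attribute names to types and checks whether a given tuple, or a whole relation, conforms to that heading. Everything in it is an assumption made for illustration: the wrapper classes SNO and NAME stand in for the user-defined types, Python's int and str stand in for INTEGER and CHAR, and the names S_HEADING, conforms, and is_value_for are hypothetical.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the user-defined types SNO and NAME.
@dataclass(frozen=True)
class SNO:
    value: str


@dataclass(frozen=True)
class NAME:
    value: str


# Heading of relvar S: attribute name -> type.
# Assumption: int models INTEGER and str models CHAR.
S_HEADING = {"SNO": SNO, "SNAME": NAME, "STATUS": int, "CITY": str}


def conforms(tup: dict, heading: dict) -> bool:
    """True if tup has exactly the heading's attributes, each of the declared type."""
    return tup.keys() == heading.keys() and all(
        type(tup[name]) is heading[name] for name in heading
    )


def is_value_for(relation: list, heading: dict) -> bool:
    """True if every tuple in the relation conforms to the heading."""
    return all(conforms(tup, heading) for tup in relation)


good = {"SNO": SNO("S1"), "SNAME": NAME("Smith"), "STATUS": 20, "CITY": "London"}
bad = dict(good, STATUS="20")  # STATUS is a string here, not an integer

print(conforms(good, S_HEADING))             # True
print(conforms(bad, S_HEADING))              # False
print(is_value_for([good, bad], S_HEADING))  # False
```

The sketch models a relation as a plain list for brevity; a real relation is a set of tuples with no duplicates and no ordering, which this code makes no attempt to enforce.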

By the way, SQL in particular does have a built-in type called INTEGER, as I'm sure you know. It also has a built-in type called CHAR, but (a) that type denotes fixed-length strings, not arbitrary-length ones, and (b) the length in question, n say, usually has to be specified along with the CHAR specification, like this: CHAR(n).^[*] (CHAR without such a length specification is shorthand for CHAR(1), which is not a very useful default, it might be thought.) SQL also allows users to define their own types.

^[*] SQL does have a varying-length character-string type, called VARCHAR, but even there a maximum length has to be specified.
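The difference between fixed-length and varying-length strings can be modeled roughly in Python, as in the sketch below. The helper names char_n and varchar_n are hypothetical, and the rules they apply (pad with spaces up to n, reject anything longer than n) are a simplification I'm assuming for illustration; actual SQL products differ in the details of padding, truncation, and error reporting.

```python
def char_n(value: str, n: int = 1) -> str:
    """Rough model of CHAR(n): the result always has exactly n characters.

    The default n = 1 mirrors CHAR with no length specified.
    """
    if len(value) > n:
        raise ValueError(f"value too long for CHAR({n})")
    return value.ljust(n)  # pad on the right with spaces


def varchar_n(value: str, n: int) -> str:
    """Rough model of VARCHAR(n): any length up to the maximum n, no padding."""
    if len(value) > n:
        raise ValueError(f"value too long for VARCHAR({n})")
    return value


print(repr(char_n("Paris", 10)))     # 'Paris     '
print(repr(varchar_n("Paris", 10)))  # 'Paris'
```

Under this model, char_n("Paris") raises an error, because a five-character string doesn't fit in CHAR(1).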

In the interests of historical accuracy, I should now say that when Codd first defined the relational model, he said relations were defined over domains, not types. In fact, however, domains and types are exactly the same thing. Now, you can take this claim as a position statement on my part, if you like, but I want to present arguments in the next two sections to support that position. I'll start with the relational model as Codd originally defined it; thus, I'll use the term domain, not type, until further notice. There are two major topics I want to discuss, one per section:

Domain-constrained comparisons and "domain check override": I hope this part of the discussion will persuade you that domains really are types.
Data value atomicity and first normal form: And I hope this part will persuade you that those types can be arbitrarily complex.