More on Foreign Keys | Database in Depth: Relational Theory for Practitioners

I explained the basic idea of foreign keys in Chapter 1, but here's a precise definition (note the reliance on tuple equality once again):

Definition: Let R1 and R2 be relvars, not necessarily distinct, and let K be a key for R1. Let FK be a subset of the heading of R2 that, possibly after some attribute renaming, involves exactly the same attributes as K. Then FK is a foreign key if and only if, at all times, every tuple in R2 has an FK value that is equal to the K value in some (necessarily unique) tuple in R1 at the time in question.

As we know, in the suppliers-and-parts database, {SNO} and {PNO} are foreign keys in relvar SP, referencing the sole candidate key in fact, the primary key in relvar S and relvar P, respectively. Here's another example:

     VAR EMP BASE RELATION       { ENO ENO, ..., MNO ENO, ... }         KEY { ENO }         FOREIGN KEY { RENAME ( MNO AS ENO ) } REFERENCES EMP ;

Attribute MNO here denotes the employee number of the manager of the employee identified by ENO; thus, the "referencing relvar" (R2 in the definition) and the "referenced relvar" (R1 in the definition) in this example are one and the same. For example, the EMP tuple for employee E3 might include an MNO value of E2, which constitutes a reference to the EMP tuple for employee E2. But foreign key values, like candidate key values, are tuples ; conceptually, therefore, we have to rename the MNO attribute in the foreign key specification, in order for the tuple equality comparison to be valid. (What tuple equality comparison? The one that's implicit in the process of checking the foreign key constraint recall that tuples must certainly be of the same type if they're to be tested for equality, and "same type" means they must have the same attribute names.)

As an aside, I should mention that the relational model as originally formulated required foreign keys to match not just some candidate key but, very specifically, the primary key in the referenced relvar. However, I gave my reasons in Chapter 1 for not insisting that some candidate key always be chosen and made primary; accordingly, therefore, I don't want to insist that foreign keys always match primary keys specifically. (I agree with SQL on this one.)

Now, SQL supports not just foreign keys as such but also certain associated referential actions such as CASCADE (which can be specified as part of either an ON DELETE clause or an ON UPDATE clause). For example, the CREATE TABLE statement for shipments might include the following:

     FOREIGN KEY ( SNO ) REFERENCES S ( SNO ) ON DELETE CASCADE

Given this specification, an attempt to delete a specific supplier will cascade to delete all shipments for that supplier as well. I mention this point for the following reasons:

First, such specifications might be useful in practice, but they aren't part of the relational model as such.
But that's not necessarily a problem! The relational model is the foundation of the database field, but it's only the foundation. There's no reason why additional features shouldn't be built on top of, or alongside, that foundation just so long as those additions don't violate the prescriptions of the model, of course (and are in the spirit of the model and can be shown to be useful, I suppose I should add). To elaborate:
1. Type theory provides the most obvious example. We saw in Chapter 2 that "types are orthogonal to tables," but we also saw that full and proper type support in relational systems is highly desirable, to say the very least.
2. By way of a second example, the relational model has almost nothing to say about recovery and concurrency controls, but this fact obviously doesn't mean that relational systems shouldn't provide such controls. (Actually it could be argued that the relational model does say something about such matters implicitly, because it relies on the DBMS to implement updates properly and not to lose data but it doesn't prescribe anything specific.)

One final remark to close this section: I've discussed foreign keys because they're of considerable pragmatic importance, and also because they're part of the model as originally defined. But I think I should stress the point that they aren't truly fundamental they're really just shorthand for certain integrity constraints that are commonly required in practice, as we'll see in Chapter 6.^[*] (In fact, much the same could be said for candidate keys as well, but in that case the practical benefits of providing a shorthand are overwhelming.)

^[*] Precisely for this reason, in fact, explicit support for them is currently omitted from Tutorial D. However, I'm sure an "industrial-strength" version of that language would support them, and I'm taking the liberty in this book of pretending such support is already there.