More on Candidate Keys


I explained the basic idea of candidate keys in Chapter 1, but now I want to make the concept more precise. Here's a definition:

Definition: Let K be a subset of the heading of relvar R. Then K is a candidate key (or just key for short) for R if and only if it possesses both of the following properties:


Uniqueness

No possible value for R contains two distinct tuples with the same value for K.


Irreducibility

No proper subset of K has the uniqueness property.

NOTE

In accordance with usual practice, throughout this book I take statements of the form "B is a subset of A" and "A is a superset of B" to include the possibility that A and B might be equal. If I want to exclude that possibility, I'll talk explicitly in terms of proper subsets and supersets.

Now, the uniqueness property is self-explanatory, but I need to say a bit more about the irreducibility property. Consider relvar S and the set of attributes {SNO,CITY} let's call it SK which is certainly a subset of the heading of S that has the uniqueness property (no relation that's a possible S value ever has two distinct tuples with the same SK value). But it doesn't have the irreducibility property, because we could discard the CITY attribute and what's left, the set {SNO}, would still have the uniqueness property. So we don't regard SK as a key, because it's "too big." By contrast, {SNO} is irreducible, and it's a key.

Why do we want keys to be irreducible? One reason is that if we were to specify a "key" that was not irreducible, the DBMS wouldn't be able to enforce the uniqueness constraint properly. For example, suppose we lied and told the DBMS that {SNO,CITY} was a key. Then it couldn't enforce the constraint that supplier numbers are "globally" unique; instead, it could enforce only the weaker constraint that supplier numbers are "locally" unique, in the sense that they're unique within a given city. So this is one reason (not the only one) why we require keys not to include any attributes that aren't needed for unique identification purposes.

Now, all of the relvars we've seen so far have had just one key. Here, by contrast, are some with two or more (they're meant to be self-explanatory). Note the overlapping nature of the keys in the second and third examples.

     VAR TAX_BRACKET BASE RELATION       { LOW MONEY, HIGH MONEY, PERCENTAGE INTEGER }         KEY { LOW }         KEY { HIGH }         KEY { PERCENTAGE } ;     VAR ROSTER BASE RELATION       { DAY DAY_OF_WEEK, TIME TIME_OF_DAY, GATE GATE, PILOT NAME }         KEY { DAY, TIME, GATE }         KEY { DAY, TIME, PILOT } ;     VAR MARRIAGE BASE RELATION       { SPOUSE_A NAME, SPOUSE_B NAME, DATE_OF_MARRIAGE DATE }         /* assume no polygamy and no couple marrying */         /* each other more than once ...             */         KEY { SPOUSE_A, DATE_OF_MARRIAGE }         KEY { DATE_OF_MARRIAGE, SPOUSE_B }         KEY { SPOUSE_B, SPOUSE_A } ;

I'll close this section with a few miscellaneous points. First, note that the key concept applies to relvars, not relations. Why? Because to say something is a key is to say a certain integrity constraint is in effect a certain uniqueness constraint, to be specific and integrity constraints apply to variables, not values. (Integrity constraints constrain updates, and updates apply to variables, not values. See Chapter 6 for further discussion.)

Second, if R is a relvar, then R certainly does have at least one key. The reason is that every possible value of R is a relation and therefore contains no duplicate tuples, by definition; at the very least, therefore, the combination of all of the attributes of R certainly has the uniqueness property.[*] Thus, either that combination also has the irreducibility property or there's some proper subset of that combination that does. Either way, there's something that's both unique and irreducible.

[*] We can't say the same for SQL tables, of course SQL tables allow duplicate rows and so might have no key at all.

Third, note that key values are tuples. In the case of relvar S, for example, with its sole key {SNO}, the value of that key for some specific tuple say, that for supplier S1 is:

     TUPLE { SNO SNO('S1') }

(Recall from Chapter 3 that every subset of a tuple is a tuple in turn.) Of course, in practice we would usually say, informally, that the key value in this example is just S1 or SNO('S1'), rather but it really isn't.

Following on from the previous point: it should now be clear that the key concept, like so much else in the relational model, relies on the fundamental concept of tuple equality. That is, in order to enforce the uniqueness constraint, we need to be able to tell when two key values are equal, and that's precisely a matter of tuple equality even when, as in the case of relvar S, the tuples in question are of degree one and "look like" simple scalar values.

My final point has to do with the notion of functional dependency. I don't want to get into a lot of detail regarding that concept here I'll do that in Chapter 7 but you're probably familiar with it anyway; all I want to do here is call your attention to the following. Let K be a key for relvar R, and let A be any attribute of R. Then R necessarily satisfies the functional dependency:

     K  To elaborate briefly: in general, the functional dependency K  R have the same value for K, they also have the same value for A. But if two tuples have the same value for K, where K is a key, then by definition they're the very same tuple! and so they must have the same value for A. In other words, loosely: we always have "functional dependency arrows" out of keys to everything else in the relvar. Again, I'll revisit these ideas in Chapter 7.



Database in Depth
Database in Depth: Relational Theory for Practitioners
ISBN: 0596100124
EAN: 2147483647
Year: 2006
Pages: 127
Authors: C.J. Date

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net