More on Views | Database in Depth: Relational Theory for Practitioners

A view, also called a virtual relvar, is a relvar that doesn't have separate existence in its own right but looks to the user as if it did. Here's a definition:

Definition: A view or virtual relvar V is a relvar whose value at time t is the result of evaluating a certain relational expression at that time t. The expression in question is specified when V is defined and must mention at least one relvar.

The following are a couple of examples, "London suppliers" and "non-London suppliers" (Tutorial D on the left, SQL on the right):

     VAR LS VIRTUAL                    |  CREATE VIEW LS AS     ( S WHERE CITY = 'London' ) ;     |  ( SELECT S.*     WHERE  S.SNO IN                   |  FROM   S                                       |  WHERE  S.CITY = 'London' ) ;     VAR NLS VIRTUAL                   |  CREATE VIEW NLS AS     ( S WHERE CITY  'London ) ;  |                  ( SELECT S.*                                       |  FROM   S                                          WHERE  S.CITY <> 'London' ) ;     VAR NLS VIRTUAL                   |  CREATE VIEW NLS AS     ( S WHERE CITY  'London ) ;  |                  ( SELECT S.*                           FROM   S    |                                       |   WHERE  S.CITY <> 'London' ) ;

The parentheses in all of these examples are unnecessary but not wrong. I include them for clarity.

View Retrievals

To repeat, views are meant to look to the user as if they had their own separate existence; in other words, they're supposed to "look and feel" just like base relvars so far as the user is concerned. In particular, the user should be able to operate on views as if they were base relvars, and the DBMS should be able to map those user operations into suitable operations on the base relvars in terms of which the views are ultimately defined. I say "ultimately" here because (of course) one thing we can do, if views really do behave just like base relvars, is define further views on top of them, as in this SQL example:

     CREATE VIEW LS_STATUS       AS ( SELECT LS.SNO, LS.STATUS            FROM   LS ) ;

Mapping read-only operations is straightforward. For example, suppose we issue this SQL query on view LS:

     SELECT LS.SNO     FROM   LS     WHERE  LS.STATUS > 10

First, the DBMS replaces the reference to the view in the FROM clause by the expression that defines that view, yielding:

     SELECT LS.SNO     FROM ( SELECT S.*            FROM   S            WHERE  S.CITY = 'London' ) AS LS     WHERE  LS.STATUS > 10

This expression can now be simplified to:

     SELECT S.SNO     FROM   S     WHERE  S.CITY = 'London'     AND    S.STATUS > 10

The reason the foregoing process works is precisely because of the closure property of the relational algebra. Closure implies, among other things, that wherever we're allowed to have the name of something for example, in a query we can always have a more general expression that evaluates to a thing of the appropriate type.^[*] In the FROM clause, for example, we can have an SQL table name; thus we can also have a more general SQL table expression, and that's why we're allowed to substitute the expression that defines the view LS for the name LS in the example.

^[*] I would much have preferred to use the more formal term object in this sentence in place of the very vague term thing, but object has become a loaded term in computing contexts.

By the way, it's worth mentioning that the foregoing process didn't always work in early versions of SQL (to be specific, in versions of the SQL standard prior to 1992), because those early versions failed to support closure properly. As a result, certain innocuous-looking queries against certain innocuous-looking tables (actually views) failed and failed, moreover, in ways that were hard to explain. Here's a simple example:

     CREATE VIEW V       AS ( SELECT S.CITY, SUM ( S.STATUS ) AS ST            FROM   S            GROUP  BY S.CITY ) ;     SELECT V.CITY     FROM   V     WHERE  V.ST > 25

This example failed in the SQL standard prior to 1992. And although the standard has now been fixed, it doesn't follow that all of the products have! And indeed there's at least one major product that still hasn't, at the time of writing (early 2005).

View Updates

I turn now to update operations. Before I get into specifics, I want to look at the London and non-London supplier views again (and now I'll switch to Tutorial D):

     VAR LS  VIRTUAL ( S WHERE CITY = 'London' ) ;     VAR NLS VIRTUAL ( S WHERE CITY  'London ) ;

The important point here is as follows: instead of S being the base relvar and LS and NLS views, LS and NLS could be base relvars and S could be a view, like this:

     VAR LS BASE RELATION       { SNO SNO, SNAME NAME, STATUS INTEGER, CITY CHAR }         KEY { SNO } ;     VAR NLS BASE RELATION       { SNO SNO, SNAME NAME, STATUS INTEGER, CITY CHAR }         KEY { SNO } ;     VAR S VIRTUAL ( LS UNION NLS ) ;

NOTE

In order to achieve complete equivalence, we would also have to specify certain constraints in particular, constraints to the effect that every CITY value in LS is London and every CITY value in NLS is not London but I omit such details here. See Chapter 6 for further discussion of such matters.

The message of this example is that, to a very large extent, which relvars are base and which virtual is arbitrary from which it follows that there must be no arbitrary and unnecessary distinctions between base and virtual relvars. This state of affairs is referred to as The Principle of Interchangeability (of base and virtual relvars). Here are some implications:

Like base relvars, views are subject to integrity constraints. (We usually think of integrity constraints as applying just to base relvars, but The Principle of Interchangeability shows that this position isn't really tenable.)
In particular, views have candidate keys (and so I should perhaps have included some key specifications in my view examples prior to this point; Tutorial D permits such specifications, but SQL doesn't). They might also have foreign keys, and foreign keys might refer to them.
I didn't mention this point in Chapter 1, but the "entity integrity" rule is supposed to apply specifically to base relvars, not views. It thereby violates The Principle of Interchangeability. (Of course, I reject that rule anyway, because it has to do with nulls.)
Many SQL products, and the SQL standard, provide some kind of "row ID" feature. If that feature applies to base tables and not to views which in practice is quite likely it violates The Principle of Interchangeability.^[*] Of course, row IDs as such aren't part of the relational model, but that fact in itself doesn't mean they shouldn't be supported. But I observe as an important aside that if those row IDs are regarded as some kind of object ID in the object-oriented sense (as they are, most unfortunately, in the SQL standard, as well as in most of the major SQL products), then they're definitely prohibited! Object IDs are effectively pointers, and the relational model explicitly prohibits pointers.
^[*] It might violate The Information Principle, too (see Chapter 8).

And, to revert to the main point of this discussion, we must be able to update views because if we can't, then that fact in itself constitutes the clearest possible violation of The Principle of Interchangeability.

As you probably know, SQL's support for this requirement is quite weak, both in the standard and in the major commercial products. It does at least typically include support for updates on views defined as simple restrictions and/or projections of a single underlying base relvar (though even here there are some problems). For example, consider the following view (which is essentially identical to one we saw in Chapter 1):

     CREATE VIEW SST_PARIS       AS ( SELECT S.SNO, S.STATUS            FROM   S            WHERE  S.CITY = 'Paris' ) ;

This view is a projection of a restriction of base table S, and so we might, for example, be able to perform the following DELETE on it:

     DELETE     FROM   SST_PARIS     WHERE  SST_PARIS.STATUS > 15 ;

This DELETE maps to:

     DELETE     FROM   S     WHERE  S.CITY = 'Paris'     AND    S.STATUS > 15 ;

But few products provide support for updating views that are much more sophisticated than this one.

Unfortunately, I'm now straying into an area in which there's still a certain amount of controversy. My own opinion is that the view updating problem has largely been solved (that is, the theory exists); however, not everybody agrees with me, and in any case a detailed discussion of the subject requires rather more background than it's possible to include in a book of this nature. Thus, I'm afraid the best I can do here is refer you to another book Databases, Types, and the Relational Model: The Third Manifesto, Third Edition (Addison-Wesley, 2006), by C. J. Date and Hugh Darwen, where the subject is examined in depth if you want more specifics.

Miscellaneous Points

There are a few more things I need to say in order to finish up this section. First of all, it's well known, but worth mentioning anyway, that views serve two rather different purposes:

The user who actually defines view V is, obviously, aware of the expression X in terms of which V is defined. That user can use the name V wherever the expression X is intended, but such uses are basically just shorthand.
By contrast, a user who's merely informed that V exists and is available for use is supposed (at least ideally) not to be aware of the expression X; to that user, in fact, V is supposed to look and feel just like a base relvar, as we've already seen. And it's this second use of views that's the really important one, and the one I've been concentrating on, tacitly, throughout this section so far.

Second, when I explained what a view was at the beginning of this section, I said the relational expression that defined the view had to mention at least one relvar. Why? Because if it doesn't, the "virtual relvar" won't be a relvar at all! I mean, it won't be a variable, and it certainly won't be updatable. Instead, it'll be a relation constant, or what we might call a "relcon." For example (to invent some syntax on the fly):

     CONST PERIODIC_TABLE ( RELATION {           TUPLE { ELEMENT 'Hydrogen', SYMBOL 'H' , ATOMICNO  1 },           TUPLE { ELEMENT 'Helium'  , SYMBOL 'He', ATOMICNO  2 },           ...           TUPLE { ELEMENT 'Uranium' , SYMBOL 'U' , ATOMICNO 92 }                                     } ) ;

While it certainly might be desirable to provide some kind of "relcon" functionality along the foregoing lines, I don't think we should think of such things as relvars. I don't think it helps the cause of understanding to pretend that constants are variables.

Third, an unfortunate terminological clash is arising as I write, certainly in the academic world, and to some extent in the commercial world also. Recall from Chapter 1 that a view can be thought of as a derived relvar. Well, there's another kind of derived relvar, too, called a snapshot. As the name might suggest, a snapshot, although it's derived, is real, not virtual meaning it's represented not just by its definition in terms of other relvars but also, at least conceptually, by its own separate copy of the data. For example (to invent some syntax again):

     VAR LSS SNAPSHOT ( S WHERE CITY = 'London' )         REFRESH EVERY DAY ;

Defining a snapshot is just like executing a query, except that:

The result of the query is saved in the database under the specified name (LSS in the example) as a read-only relvar (read-only, that is, apart from the periodic refresh; see the next bullet item).
Periodically (EVERY DAY in the example) the snapshot is refreshed its current value is discarded, the query is executed again, and the result of that new execution becomes the new snapshot value.

In the example, therefore, snapshot LSS represents the data as it was at most 24 hours ago.

Snapshots are important in data warehouses, distributed systems, and many other contexts. In all cases, the rationale is that many applications can tolerate, or might even require, data "as of " some particular point in time. Reporting and accounting applications are a case in point; such applications typically require the data to be frozen at an appropriate moment (for example, at the end of an accounting period), and snapshots allow such freezing to occur without locking out other applications.

So far, so good. The problem is that snapshots have come to be known (at least in some circles) not as snapshots at all but as materialized views. But snapshots aren't views! Indeed, the whole point about views, at least so far as the relational model is concerned, is that they aren't materialized; as we've seen, operations on views are implemented by mapping them into suitable operations on the underlying base relvars. Thus, "materialized view" is simply a contradiction in terms. Worse yet, the unqualified term view is now often taken to mean a "materialized view" specifically again, at least in some circles and so we're in danger of no longer having a good term to mean a view in the original sense. In this book, of course, I do use the term view in its original sense, but be warned that it doesn't always have that meaning elsewhere. Caveat lector.