Section 1.1. The Relational View of Data

1.1. The Relational View of Data

A database is nothing but a model of a small part of a real-life situation. As any representation, a database is always an imperfect model, and a very narrow depiction of a rich and complex reality. There is rarely a single way to represent some business activity, but rather several variants that in a technical sense will be semantically correct. However, for a given set of processes to apply, there is usually one representation that best meets the business requirement.

The relational model is thus named, not because you can relate tables to one another (a popular misconception), but as a reference to the relationships between the columns in a table. These are the relationships that give the model its name; in other words, relational means that if several values belong to the same row in a table, they are related. The way columns are related to each other defines a relation, and a relation is a table (more exactly, a table represents one relation).

The business requirements determine the scope of the real-world situation that is to be modeled. Once you have defined the scope, you can proceed to identify the data that you need to properly record business activity. If we say that you are a used car dealer and want to model the cars you have for sale (for instance to advertise them on a web site), items such as make, model, version, style (sedan, coupe, convertible...), year, mileage, and price may be the very first pieces of information that come to mind. But potential buyers may want to learn about many more characteristics to be able to make an informed choice before settling for one particular car. For instance:

General state of the vehicle (even if we don't expect anything but "excellent")
Safety equipment
Manual or automatic transmission
Color (body and interiors), metallic paintwork or not, upholstery, hard or soft top, perhaps a picture of the car
Seating capacity, trunk capacity, number of doors
Power steering, air conditioning, audio equipment
Engine capacity, cylinders, horsepower and top speed, brakes (everyone isn't a car enthusiast who would know technical specifications from the car description)
Fuel, consumption, tank capacity
Current location of the car (may matter to buyers if the site lists cars available from a number of physical places)
And so on.. .

If we decide to model the available cars into a database, then each row in a table summarizes a particular statement of factfor instance, that there is for sale a 1964 pink Cadillac Coupe DeVille that has already been driven twenty times around the Earth.

Through relational operations, such as joins, and also by filtering, selection of particular attributes, or computations applied to attributes (say computing from consumption and tank capacity how many miles we can drive without refueling), we can derive new factual statements. If the original statements are true, the derived statements will be true.

Whenever we are dealing with knowledge, we start with facts that we accept as truths that need no proof (in mathematics these are known as axioms , but this argument is by no means restricted to mathematics and you could call those unproved true facts principles in other disciplines). It is possible to build upon these true facts (proving theorems in mathematics) to derive new truths. These truths themselves may form the foundations from which further new truths emerge.

Relational databases work in exactly the same way. It is absolutely no accident that the relational model is mathematically based. The relations we define (which once again means, for an SQL database, the tables we create) represent facts that we accept, a priori, as true. The views we define, and the queries we write, are new truths that we prove.

The coherence of the relational model is a critically important concept to grasp. Because of the inherent mathematical stability of the principles that underlie relational data modeling , we can be totally confident that the result of any query of our original database will indeed generate equally valid factsif we respect the relational principles. Some of the key principles of the relational theory are that a relation, by definition, contains no duplicate, and that row ordering isn't significant. As you shall see in Chapter 4, SQL allows developers to take a number of liberties with the relational theory, liberties that may be the reasons for either surprising results or the failure of a database optimizer to perform efficiently.

There is, however, considerable freedom in the choice of our basic truths. Sometimes the exercise of this freedom can be done very badly. For example, wouldn't it be a little tedious if every time someone went to buy some apples, the grocer felt compelled to prove all Newtonian physics before weighing them? What must be thought of a program where the most basic operation requires a 25-way join?

We may use much data in common with our suppliers and customers. However, it is likely that, if we are not direct competitors, our view of the same data will be different, reflecting our particular perspective on our real-life situation. For example, our business requirements will differ from those of our suppliers and customers, even though we are all using the same data. One size doesn't fit all. A good design is a design that doesn't require crazy queries.

Modeling is the projection of business requirements.