Understanding the Relational Database Model

The relational database model was a huge leap forward from the network database model. Instead of relying on a parent-child or owner-member relationship, the relational model allows any file to be related to any other by means of a common field. Suddenly, the complexity of the design was greatly reduced because changes could be made to the database schema without affecting the system's ability to access data. And because access was not by means of paths to and from files, but from a direct relationship between files, new relations between these files could be easily added.

In 1970, when E. F. Codd developed the model, it was thought to be impractical. The increased ease of use comes at a large performance penalty, and the hardware in those days was not able to implement the model. Since then, of course, hardware has taken huge strides to where today even the simplest computers can run sophisticated relational database management systems.

Relational databases go hand in hand with the development of SQL, which is covered in Part I, "Using MySQL." The simplicity of SQL—where even a novice can learn to perform basic queries in a short period of time—is a large part of the reason for the popularity of the relational model.

Tables 7.1 and 7.2 relate to each other through the stock_code field. Any two tables can relate to each other simply by creating a field they have in common.

Table 7.1: The Product Table
Stock_code	Description	Price
A416	Nails, box	$0.14
C923	Drawing pins, box	$0.08

Table 7.2: The Invoice Table
Invoice_code	Invoice_line	Stock_code	Quantity
3804	1	A416	10
3804	2	C923	15

Introducing Basic Terms

The relational model uses certain terms to describe its components. If you've gone through Part I, "Using MySQL," you'll be familiar with many of them:

Data are the values kept in the database. On their own, the data mean very little. CA 684-213 is an example of data in a DMV (Division of Motor Vehicles) database.
Information is processed data. For example, CA 684-213 is the car registration number of Lyndon Manson in a DMV database.
A database is a collection of tables.
Each table is made up of records (the horizontal rows in the table, also called tuples). Each record should be unique, and can be stored in any order in the table.
Each record is made up of fields (which are the vertical columns of the table, also called attributes). Basically, a record is one fact (for example, one customer or one sale).
These fields can be of various types. MySQL has many types, as you saw in Chapter 2 ("Data Types and Table Types"), but generally the types fall into three kinds: character, numeric, and date. For example, a customer name is a character field, a customer's birthday is a date field, and a customer's number of children is a numeric field.
The range of allowed values for a field is called the domain (also called a field specification). For example, a credit_card field may be limited to only the values Mastercard, Visa, and Amex.
A field is said to contain a null value when it contains nothing at all. Null fields can create complexities in calculations and have consequences for data accuracy. For this reason, many fields are specifically set not to contain null values.
A key accesses specific records in a table.
An index is a mechanism to improve the performance of a database. Indexes are often confused with keys. They are, strictly speaking, part of the physical structure, and keys are part of the logical structure. You'll often see the terms used interchangeably, however.
A view is a virtual table made up of a subset of the actual tables.
A one-to-one (1:1) relationship is where for each instance of the first table in a relationship, only one instance of the second table exists. An example of this would be a case where a chain of stores carries a vending machine. Each vending machine can only be in one store, and each store carries only one vending machine (see Figure 7.3).

Figure 7.3: A one-to-one relationship
A one-to-many (1:M) relationship is where for each instance of the first table in the relationship, many instances of the second table exist. This is a common kind of relationship. An example is the relationship between a sculptor and their sculptures. Each sculptor may have created many sculptures, but each sculpture has been created by only one sculptor (see Figure 7.4).

Figure 7.4: A one-to-many relationship
A many-to-many (M:N) relationship occurs where, for each instance of the first table, there are many instances of the second table, and for each instance of the second table, there are many instances of the first. For example, a student can have many lecturers, and a lecturer many students (see Figure 7.5).

Figure 7.5: A many-to-many relationship
A mandatory relationship exists where for each instance of the first table in a relationship, one or more instances of the second must exist. For example, for a music group to exist, there must exist at least one musician in that group.
An optional relationship is where for each instance of the first table in a relationship, there may exist instances of the second. For example, if an author can be listed in the database without having written a book (in other words, a prospective author), that relationship is optional. The reverse isn't necessarily true, though; for example, for a book to be listed, it must have an author.
Data integrity refers to the condition where data is accurate, valid, and consistent. An example of poor integrity would be if a customer telephone number is stored differently in two different locations. Another is where a course record contains a reference to a lecturer who is no longer present at the school. In Chapter 8, "Database Normalization," you'll learn a technique that assists you to minimize the risk of these sorts of problems: database normalization.

Now that you've been introduced to some of the basic terms, the next section will cover table keys, a fundamental aspect of relational databases, in more detail.

Introducing Table Keys

A key, as the term itself indicates, unlocks access to the tables. If you know the key, you know how to identify specific records and the relationships between tables.

A candidate key is a field, or combination of fields, that uniquely identifies a record. It cannot contain a null value, and its value must be unique. (With duplicates, you would no longer be identifying a unique record.)

A primary key is a candidate key that has been designated to identify unique records in the table throughout the database structure. As an example, Table 7.3 shows the customer table.

Table 7.3: The Custpmer Table
customer_code	first_name	surname	telephone_number
1	John	Smith	448-2143
2	Charlotte	Smith	448-2143
3	John	Smith	9231-5312

At first glance, there are two possible candidate keys for this table. Either customer_ code or a combination of first_name, surname, and telephone_number would suffice. It is always better to choose the candidate key with the least number of fields for the primary key, so you would choose customer_code in this example. Upon reflection, there is also the possibility of the second combination not being unique. The combination of first_name, surname, and telephone_number could in theory be duplicated, such as where a father has a son of the same name who is contactable at the same telephone number. This system would have to expressly exclude this possibility for these three fields to be considered for the status of primary key.

There may be many John Smiths in the average English-speaking country, but you avoid confusion by assigning each a unique number. Once a primary key has been created, the remaining candidate keys are labeled as alternate keys.

Introducing Foreign Keys

You already know that a relation between two tables is created by assigning a common field to the two tables. This common field must be a primary key to the one table. Consider a relation between a customer table and a sale table. The relation is not much good if instead of using the primary key, customer_code, in the sale table, you use another field that is not unique, such as the customer's first name. You could never know for sure which customer made the sale in that case. So, in Figure 7.6, customer_code is called the foreign key in the sale table; in other words, it is the primary key in a foreign table.

click to expand
Figure 7.6: Setting foreign keys

Foreign keys allow for something called referential integrity. What this means is that if a foreign key contains a value, this value refers to an existing record in the related table. For example, take a look at Table 7.4 and Table 7.5.

Table 7.4: The Lecturer Table
code	firstname	surname
1	Anne	Cohen
2	Leonard	Clark
3	Vusi	Cave

Table 7.5: The Course Table
Course title	Lecturer
Introduction to Programming	1
Information Systems	2
Systems Software	3

Referential integrity exists here, as all the lecturers in the course table exist in the lecturer table. However, let's assume Anne Cohen leaves the institution, and you remove her from the lecturer table. In a situation where referential integrity is not enforced, she would be removed from the lecturer table, but not from the course table, shown in Table 7.6 and Table 7.7.

Table 7.6: The Lecturer Table
code	firstname	surname
2	Leonard	Clark
3	Vusi	Cave

Table 7.7: The Course Table
Course title	Lecturer
Introduction to Programming	1
Information Systems	2
Systems Software	3

Now, when you look up who lectures Introduction to Programming, you are sent to a nonexistent record. This is called poor data integrity.

Foreign keys also allow cascading deletes and updates. For example, if Anne Cohen leaves, taking the Introduction to Programming course with her, all trace of her can be removed from both the lecturer and course table by using one statement. The delete "cascades" through the relevant tables, removing all relevant records. Since version 3.23.44, MySQL has supported checking of foreign key with the InnoDB table type, and cascading deletes have been supported since version 4.0.0. Remember, though, that enforcing referential integrity does have a performance cost. Though without it, it becomes the responsibility of the application to maintain data integrity.

Foreign keys can contain null values, indicating that no relationship exists.

Introducing Views

Views are virtual tables. They are only a structure and contain no data. Their purpose is to allow a user to see a subset of the actual data. Views are one of the most frequent MySQL feature requests, and are due to be implemented in version 5.

A view can consist of a subset of one table. For example, Table 7.8 is a subset of the full table, shown in Table 7.9.

Table 7.8: The Student View
Student View
First_name Surname Grade

Table 7.9: The Student Table
Student
Student_id First_name Surname Grade Address Telephone

This view could be used to allow other students to see their fellow students' marks but not allow them access to personal information.

Or a view could be a combination of a number of tables, such as the view shown in Table 7.10. It's a combination of Table 7.11, Table 7.12, and Table 7.13.

Table 7.10: The Student Grade View
Student Grade View
First_name Surname Course description Grade

Table 7.11: The Student Table
Student
Student_id First_name Surname Address Telephone

Table 7.12: The Course Table
Course
Course_id
Course description

Table 7.13: The Grade Table
Grade
Student_id Course_id Grade

Views are also useful for security. In larger organizations, where many developers may be working on a project, views allow developers access to only the data they need. What they don't need, even if it is in the same table, is hidden from them, safe from being seen or manipulated. It also allows queries to be simplified for developers. For example, without the view, a developer would have to retrieve the fields in the view with the following sort of query:

SELECT first_name,surname,course_description,grade FROM student,  grade, course WHERE grade.student_id = student.student_id AND  grade.course_id = course.course_id;

With the view, a developer could do the same with the following:

SELECT first_name,surname,course_description,grade FROM student_grade_view;

This is much more simple for a junior developer who hasn't yet learned how to do joins, and it's just less hassle for a senior developer, too!