Relational databases have the capability to split data across several tables in a structured way so data is never repeated. In general, a column in one table that can uniquely identify a particular row is known as a primary key and a column in another table that references that table's primary key is known as a foreign key.
Consider the tables in the sample database. The book table has a primary key named book_id, which is a unique numeric value for each book record in the table. Similarly, the primary key on the person table is person_id. The loan table contains details of books loaned to people and references the two other tables. It contains foreign key fields named book_id and person_id that contain numeric values corresponding to rows in their respective tables.
When a book is loaned, therefore, the loan table does not have to duplicate any of the information about either the person or the book. The only extra information needed in this table is the date of the loan, stored in the date_lent column.
To combine data from two or more tables to find the detail of records referenced in a foreign key, you have to perform a table join. The most common way to do this is by specifying a list of tables in the FROM clause of a SELECT statement and using the WHERE clause to indicate the relationship between the tables.
The following query returns all the loan data, joining the loan table to both person and book, so that information from those tables can be displayed.
SELECT loan.date_lent, person.name, book.title FROM loan, book, person WHERE loan.book_id = book.book_id AND loan.person_id = person.person_id;
The prefixes used in this query to indicate which table each column refers to is known as qualifying the column. This is essential when the same column name applies to more than one table. If the columns are not properly qualified, MySQL does not make any assumptions about which table a query is referring to and returns an error. It is good practice to always qualify your columns to create clear and easy-to-read queries.
The output from this query looks like the following.
+------------+---------+-----------------------+ | date_lent | name | title | +------------+---------+-----------------------+ | 2005-06-05 | Yvette | Green Eggs and Ham | | 2005-07-17 | Thies | In the Night Kitchen | | 2005-07-17 | Thies | How to Be a Grouch | | 2005-09-10 | harmony | Jacob Two-Two Meets | | | | the Hooded Fang | | 2005-10-18 | Thies | Green Eggs and Ham | +------------+---------+-----------------------+
Additional filters can be applied to a query that contains a join by adding more conditions with the AND keyword.
It is vital to include a WHERE clause when joining tables to indicate the relationship between them. Otherwise, every row from the first table is joined with each row from the second table, creating a data set that is very large and usually not useful!
The JOIN Keyword
Another syntax for joins uses the JOIN keyword, with the relationship between two tables given in a separate ON clause. This can make queries more readable because the join conditions are kept separate to any filters in the WHERE clause. It also allows for different types of joins to be performed, such as an outer join.
The following query is identical in behavior to the previous example, using the JOIN syntax:
SELECT loan.date_lent, person.name, book.title FROM loan JOIN book ON loan.book_id = book.book_id JOIN person ON loan.person_id = person.person_id
In the previous examples, column names were qualified by using the table name as a prefix. Often, and particularly when using a database with longer table names, this can become cumbersome. You might want to use aliases to create a shorter name to refer to the tables in a query.
Use AS to give a table an alias. The AS keyword appears immediately after a table name in the FROM clause or a JOIN clause, as shown in the following example:
SELECT l.date_lent, p.name, b.title FROM loan AS l JOIN book AS b ON l.book_id = b.book_id JOIN person AS p ON l.person_id = p.person_id
The alias is then used as the prefix to qualify a column name, both in the SELECT clause and any conditions in the query. When you use aliases, you must use the alias throughout the queryyou cannot also qualify a column using the full table name prefix.
The joins in the previous example are known as inner joins. A row is only returned by the query if there is data in both tables in the join that meets the join criteria.
Another type of join is an outer join, where every row is returned from one table regardless of whether there is a corresponding row in the second table. When this happens, values returned from the second table that do not have corresponding rows in the first table are all given NULL values.
The following query finds the most recent loan date for each person in the database using an outer join:
SELECT p.name, MAX(l.date_lent) FROM person p LEFT OUTER JOIN loan l ON p.person_id = l.person_id GROUP BY p.name;
Because one row in the person table does not have any associated loan data, the loan date displayed is NULL.
The output is as follows:
+---------+------------------+ | name | MAX(l.date_lent) | +---------+------------------+ | harmony | 2005-09-10 | | Lenz | NULL | | Thies | 2005-10-18 | | Yvette | 2005-06-05 | +---------+------------------+
Just like you can give a table name an alias using AS, you can give a column an alias. Look back at the output from the previous example, and you will see that the second column's heading contains the function used to generate that value.
By changing the first line of the query to
SELECT p.name, MAX(l.date_lent) AS max_date_lent
the column heading shows the alias name rather than the column calculation.
When displaying a query using mysql, this is purely a cosmetic issue, but when you develop applications using one of MySQL's APIs, you will usually want to reference the data returned in a computed column using a simple name.
Similarly, suppose you execute a query that returns two columns from different tables but the columns have the same name. You should use a column alias to give each one a unique name so you can identify each one in your application.