Creating an Advanced Recordset | Dreamweaver Mx: Advanced Php Web Development

Chapter 7 - Advanced SQL Usage

byGareth Downes-Powellet al.
Wrox Press 2003

We saw in Chapter 5 the general structure of the SELECT statement. SELECT is used to retrieve rows of data selected from one or more tables. In this chapter, we will show you a lot of SQL examples. You can try them on whichever tool you prefer, but for consistency, we will show you only the SQL query and its results.

As well as retrieving data from tables, SELECT may also be used to retrieve rows without any reference to a table. For example:

    SELECT 1+2;

will give you:

This example shows you one thing: even if in most of the cases where you will use the SELECT statement, you will specify at least two pieces of information - what you want to select and from where you want to select it - this is not entirely true. This example will compute the expression '1+2' and return a recordset made of one line and one row containing the value '3'.

The SELECT Statement

The Syntax of SELECT

The SELECT syntax can be summarized as follows:

    SELECT    [DISTINCT | DISTINCTROW]    select_expression, ...    [FROM table_references]    [WHERE where_definition]    [GROUP BY {unsigned_integer | col_name | formula} [ASC | DESC], ...]    [HAVING where_definition]    [ORDER BY {unsigned_integer | col_name | formula} [ASC | DESC], ...]    [LIMIT [offset,] rows]

As you can see, the SELECT statement is structured in different parts that will tell the SQL server not only which rows to retrieve, but also how to show them and order them.

In the previous example, the 'select_expression' was the expression '1+2'.

As we saw on the previous example, the 'select_expression' is the only non-optional element. This is a total must and non-negotiable rule. All other parts may be omitted. But it is very unlikely that you will be in situation where you can omit the 'FROM' part, because then you can't retrieve any database information.

Important

Note that even though the parts of the SELECT syntax that are enclosed with [...] characters are optional, the order of these parts is mandatory. You cannot invert or reorder them; otherwise you will get a syntax error.

Using DISTINCT

The keywords DISTINCT or DISTINCTROW are shortcuts for not selecting the same row twice. It may happen that when you select some fields only once in your query, that the result contains the exact same row multiple times. Using the DISTINCT keyword will return only the rows that are different for at least one column.

As an example, imagine we have the table ' exampletable' that contains the following data:

	Var1	Var2	Var3
1	Abc	324	78
2	Dcg	65	78
3	Abc	324	78

As you can see row 1 and 3 have the exact same values. Using a 'SELECT DISTINCT ...' statement would return only rows 1 and 2.

Choosing Columns

The SELECT statement allows you to specify which columns you want to retrieve. You should not forget that when talking about the recordset's column we don't necessarily talk about the table's column.

In the previous example we had: SELECT 1+2. In this example, the select_expression was '1+2' and defined one recordset column without using a table. Suppose we have the following:

    SELECT 1+2, 2*3;

The result will be:

1+2	2*3
3	6

As you see, we specified two columns.

Basically the purpose of the select_expression is to define the column of the recordset. It is a list of column_expressions separated by the "," character. We will now analyze how to build the column_expression.

Selecting Table Columns

To select a table column you just need to use the name of the column.

Example: We want to select the columns ID, price, bed, and number from the table room:

    SELECT ID, price, bed, number FROM room

which will give you:

ID	price	bed	number
1	34	2	100
2	56	3	101

Note that this is pure example data.

Further in this chapter, we will see how to retrieve data from multiple tables with one query. The rule given above stays true in multiple table context except in one case: if the tables you are selecting from have a column name in common, and you are selecting one of these columns in your 'select_expression' you must precede the column name by the table's name and character.

For example, in the table 'room' and the table bookings we have the column 'ID' in common. Suppose now that we want to retrieve the room's ID and the bookings' ID. You must use the following query:

    SELECT room.ID, bookings.ID FROM room, bookings

Using MySQL Functions

In the above section, we saw that the simpliest form of the 'column_expression' is just the column name. In this section, we will see that 'column_expression' can be a very complex expression using many MySQL functions.

For example, suppose that the hotel stores the price of the room in pounds sterling. But, for our application, we would like to show the price of the room in a different currency. We would like to know the price of the room in Euros and Dollars. Let's say the coefficient for converting pounds in Euros is 1.61 and the coefficient to convert pounds to Dollars is 1.45. We will modify the first query as follows:

    SELECT ID, price*1.61,price*1.45,bed, number FROM room

which will give you:

ID	*price1.61**	*price1.45**	bed	number
1	54.74	49.30	2	100
2	90.16	81.20	3	101

We could even go further. You want to make sure of the currency you're working on. And therefore, you want to add the currency symbol in your recordset. We will use the following query:

    SELECT ID, concat(price*1.61,'*'), concat(price*1.45,'$'), bed, number FROM room

which will give you:

ID	*concat(price1.61 ,'')*	*concat(price1.45,'$')**	bed	number
1	54.74*	49.30$	2	100
2	90.16*	81.20$	3	101

In this example, we used the function concat() to concatenate a list of strings into one.

Recordset Column Aliasing

In the last example, you can see that the column names of the recordset are taken from the select_expression. The second column expression was 'concat (price*1.61)' and so was the recordset column name. If your column_expression uses a very complex expression, the column name may grow very long. The recordset's column name can be aliased with the keyword AS. So we can modify the previous query to make the results look nicer:

    SELECT ID, concat(price*1.61,'*') as priceeuro,concat(price*1.45,'$') as    pricedollar,bed, number FROM room

which will give you:

ID	priceeuro	pricedollar	bed	number
1	54.74*	49.30$	2	100
2	90.16*	81.20$	3	101

As you can see, using the as theprice keyword we change the resulting column name. This is extremely important especially in our context: web application development. When you will be making reference to the different columns of your recordset it will be much easier for you if all your columns have clean and concise names.

The column name is also important when using the HAVING keyword, this will be covered later.

Note

Always place the as keyword at the very end of the column_expression.

Selecting All Columns of a Table At Once

There is a shortcut in SQL that you can use to select all the columns of a table at once: the * element. The * element will return all the columns of a table. For example:

    SELECT * FROM room

which will give you:

ID	price	bed	number
1	34	2	100
2	56	3	101

The * may seem useful at first but you cannot specify the order in which you would like the columns to appear. You also cannot specify any expression on the column, or rename the column of the recordset. The name is taken directly from the table, so be careful when using the * element.

Tables, Joining Tables, and Foreign Keys

In the previous example, we introduced the FROM keyword. When selecting the value from the table room, we used the FROM room element.

In the previous examples, we have used the FROM keyword several times. It is now time to define it thoroughly. The FROM keyword tells MySQL which table you want to take values from. The simplest usage of the FROM keyword is to define the name of the table you're working on.

In the example about the price of the room, we used the following FROM syntax:

    SELECT ID, price, bed, number FROM room

The 'FROM room' tells MySQL that you want to retrieve the column ID, price, bed, and number from the table 'room'. SQL allows you to do much more powerful queries: you can use multiple tables in one query; you can also JOIN tables into virtually any new table.

Multiple Table References

In the FROM section, tables are referenced by a list of table names separated by the "," character. When using a multiple table reference, MySQL will operate what we call a scalar product of the two tables. Suppose we have the following tables:

Table 1	Table 2
Field 1	Field 2
String 1	String 2
F11	F21
S11	S21
F12	F22
S12	S22

The scalar product of these two tables consists in having each row of the second table appended to each row of the first one. As an example the following query:

    SELECT * FROM table1, table2

will give you:

Field1	String1	Field2	Strlng2
F11	S11	F21	S21
F11	S11	F22	S22
F12	S12	F21	S21
F12	S12	F22	S22

As you can see the number of rows you get is the product of the number of rows of the first table and the second while the number of columns you get is the sum of the columns from each table. The values of the different columns are not multiplied or added in any way, they stay the same. Using the multiple table reference is rare and will generally not give you the result you expected, so we will focus on other forms of JOIN in this book.

Linking Tables

To explain the concept of table linking we will use the tables: room, bookings and clients. Our goal will be to have a historical list of the booking for each room.

We will know the room ID and for one specific room we will get the list of each booking, the start date, the end date, the client's name, and the price paid for this reservation. In order to achieve that goal, we will use the concept of Foreign Keys. A foreign key is a set of columns that will, for each row of a table, contain the value of the primary key of another table.

Note

MySQL doesn't support enforced foreign keys. You will have to remember the relationship between each table and ensure yourself the correctness of your table. This is not as hard as you may think.

When looking at the table 'room' and 'bookings' you will see a special column: the room ID. In the room table it is called ID and in the table bookings it is called roomID. The following diagram shows you the table definition:

click to expand

The diagram shows that the column roomID in the table 'bookings' represents the room's ID to which this booking refers. In this case, we could say the column roomID is a foreign key to the table 'room' because it is linked to the identifier column of the room.

In the note below we stated that MySQL doesn't enforce the foreign keys. And indeed, you have to remember that the column roomID in the table 'bookings' is a foreign key to the table room. You have no way to tell MySQL explicitly about the foreign key.

The foreign key concept is what actually makes a database a relational. When using the column 'roomID' to link to the room's ID, you are creating a relation that links these two tables and you set the meaning of the column roomID.

For example, if we have a booking that starts the 22nd of March 2002 and ends the 24^th of March 2002 for client 1 and this client had room 2, then we will have the following row in the bookings table:

ID	clientID	roomID	startDate	endDate
.	.	.	.	.
23	1	2	2002-3-22	2002-3-24
.	.	.	.	.

Now suppose we want to know the list of booking for each room, we will use the following statement:

    SELECT room.ID, bookings.ID, clientID, roomID, startDate, endDate FROM room    LEFT JOIN bookings ON room.ID = roomID

We will review the syntax of the LEFT JOIN statement later on, but here we are implementing the concept of the roomID foreign key in this specific statement, thanks to the 'LEFT JOIN bookings ON room.ID = roomID' part of the query. This query will give you:

ID	ID	clientID	roomID	startDate	endDate
.	.	.	.	.	.
2	23	1	2	2002-3-22	2002-3-24
.	.	.	.	.	.

There we have something extremely interesting, even if it doesn't appear sol We have a recordset that gives us not just the room information or the bookings information, but both in one recordset.

If we wanted to have a journal of bookings for each room, then we have it. But our goal was to also have the client information. The following diagram shows another part of the relationship of our table:

click to expand

As you have probably already guessed, we have another foreign key: the clientID column. To get information from the table client, we will do as we did for the first query: we will use the LEFT JOIN keyword on the clients table. The new SQL query will look like this:

    SELECT room.ID, bookings.ID, clientID,roomID,startDate,endDate, clients.firstname,    clients.lastname FROM room LEFT JOIN bookings ON room.ID = roomID LEFT JOIN    clients ON bookings.clientID=clients.ID

Note that this query will return the rooms that have not been booked already. This is due to the nature of the LEFT JOIN. Further we will see how to retrieve only rooms that have been booked at least once, with the INNER JOIN syntax.

In this query, we have modified two parts of the query: the select_expression now has the clients.firstname and clients.lastname. We have also added a new LEFT JOIN ... ON ... sequence. This time we are joining the table clients and we link the client.ID to the clientID that can be found in the bookings table. It works exactly the same way as when we linked the bookings table to the room table.

We have linked the table bookings to the table room because the meaning of the column booking.roomID was the room's ID. Therefore we do exactly the same by linking the table bookings to the table clients, because the columun booking.clientID was the client's ID.

To summarize, the LEFT JOIN tablename ON clause will link two tables by checking that the condition clause is TRUE. In our example,e we had the two clauses following LEFT JOIN:

LEFT JOIN bookings ON room.ID = roomID
LEFT JOIN clients ON bookings.clientID = client.ID

The interesting thing in the LEFT JOIN is that it allows you to construct queries with multiple tables thanks to the relationship you defined when constructing your table.

Joining Tables and Foreign Keys

In the previous section we introduced the concept of joining tables. In the short term, joining can be thought of as a way to retrieve information from multiple tables at once. Joining tables is the most powerful concept in SQL. You've often heard about Relational databases or relational database management system (RDBMS). The relational part comes from this very concept.

Here we will review the JOIN syntaxes. In the previous series of examples, we used the LEFT JOIN syntax to join the room table with the bookings and clients table. We have also reviewed multiple table references, which are also a kind of joining.

Here are the other kinds, and their syntax:

    table_reference, table_reference    table_reference [CROSS] JOIN table_reference    table_reference INNER JOIN table_reference join_condition    table_reference STRAIGHT_JOIN table_reference    table_reference LEFT [OUTER] JOIN table_reference join_condition    table_reference LEFT [OUTER] JOIN table_reference    table_reference NATURAL [LEFT [OUTER]] JOIN table_reference    table_reference RIGHT [OUTER] JOIN table_reference join_condition    table_reference RIGHT [OUTER] JOIN table_reference    table_reference NATURAL [RIGHT [OUTER]] JOIN table_reference

Note that the elements enclosed with [...] are optional.

In this group of different JOIN syntax, the table_reference expression means the expression that you will construct to refer to the table you want to join. The expression table_reference is defined as:

    table_name [[AS] alias] [USE INDEX (key_list)] [IGNORE INDEX (key_list)]

And the expression join_condition is defined as:

    ON conditional_expr    OR USING (column_list)

We already have presented the multiple table reference as well as the LEFT JOIN syntax. The LEFT JOIN syntax is the most frequently used flavor of JOIN. Note that the OUTER keyword and the RIGHT keyword are syntactic keywords.

Simple Joining of Tables

At the beginning of this section we introduced the multiple table reference. The CROSS JOIN and the JOIN keywords do exactly the same type of joining. For example, the statement:

    SELECT * FROM room, bookings

could be rewritten as:

    SELECT * FROM room JOIN bookings

INNER JOIN

The INNER JOIN is used when you want the conditional clause to be always TRUE. That may sound strange. In the previous example, we used the LEFT JOIN because we wanted to have a listing of all rooms. We therefore used the LEFT JOIN.

Suppose that we now want to see only the rooms that have been reserved at least once. Being reserved at least once means that we have an entry in the table bookings. By using the INNER JOIN we ensure not only that the room.ID and bookings.roomID are equal, but that we take only the rows of the table bookings for which the column roomID exists in the table room and for the column ID.

You should never have any conditions in the ON part that are used to restrict which rows you have in the result set. If you want to restrict which rows should be in the result, you have to do this in the WHERE clause.

NATURAL JOIN

The NATURAL JOIN or the NATURAL LEFT JOIN is equivalent to an INNER JOIN or LEFT JOIN where the ON clause would list all the existing columns in both tables. That mainly serves as a shortcut. For example, suppose we have two tables: A and B. Table A has the column c1 and C2, and table B has the column C1 and C2 also. A following NATURAL JOIN statement:

    SELECT * FROM A NATURAL JOIN B

could be rewritten in:

    SELECT * FROM A INNER JOIN B ON A.C1=B.C1 AND A.C2=B.C2

USING

There is another shortcut that is very useful: the USING element. The following example:

    SELECT * FROM A LEFT JOIN B USING (C1, C2)

is semantically equivalent to:

    SELECT * FROM A LEFT JOIN B ON A.C1 = B.C1 AND A.C2 = B.C2

To be able to use the USING shortcut, the column name must be the same. So if you plan to use it, you should name the column you want to have in the USING clause with identical names in both tables.

For example, we cannot use the USING keyword with the Hotel database the way it has been designed. If we wanted to do so, we should rename the column ID in the table room to be 'roomID'. Then we would have two columns named roomID: one in the room table, the other in the table bookings. Then, we would use a LEFT JOIN with a USING clause like this one:

    SELECT * FROM room LEFT JOIN bookings USING (roomID)

And that would give us the equivalent statement:

    SELECT * FROM room LEFT JOIN bookings ON room.roomID=bookings.roomID

Table Aliasing

Tables may be aliased for easier use. By using table aliasing you can also join the same table multiple times. The following example shows you how to do that:

    SELECT * FROM room JOIN bookings JOIN room as room2

Filtering Rows

The WHERE part of a SELECT statement is used when you want to filter the recordset. The core of a search engine resides in the WHERE part of a query. It is important to note that the WHERE clause works with table columns only. If you did make some computation on the table columns and aliased them to be able to filter the recordset according to the new value, you must use the HAVING clause. The where_expression is built as a logical expression that will return either TRUE of FALSE.

The Values of TRUE and FALSE

Anything different from 0 is always TRUE. The only possible value for FALSE is 0

Building Logical Expressions

Here is a short example of a logical expression:

    SELECT * FROM room WHERE ID > 1 AND startDate='2002-31-12'

The logical expression is: ID > 1 AND startDate='2002-31-12'

To build your logical expression you will use operators. There are four types of operators:

The parenthesis
The comparison operators
The logical operators
The flow control operators

Some of these operators will return the NULL value. This value cannot be compared to TRUE or FALSE, because it has its own meaning. If you want to know if a value is NULL or not, use the is NULL operator or IS NOT NULL.

Parenthesis

Use the (...) to force the order of evaluation of the subexpression contained in the parenthesis, for example:

    SELECT * FROM ROOM WHERE (3 * 2) -2

will give you all the rows in the table room because (3*2) - 2 evaluates to 4, therefore TRUE, while this one:

    SELECT * FROM ROOM WHERE 3 * (2-2)

will return no results because 3*(2-2) evaluates to 0.

Comparison Operators:

The comparison operators always return the value 0 (FALSE) or 1 (TRUE) or eventually NULL. We will talk about the NULL value in the flow control operators section. These functions work both for strings and numbers. Strings are automatically converted to numbers and numbers to strings as needed.

The following table shows you the list of comparison operators and their meanings as well as an example when the returned value is TRUE or FALSE. Note that the word 'expr' is an abbreviation of 'expression'.

Operator	Meaning	TRUE example	FALSE example
=	Equality	'1' = 1	1 = 2
	Checks if both arguments are equal	'ab' = 'ab'	'a' = 'ab'
!=	Different	'1' != 0	1 <>0
<>	Returns TRUE if the arguments are different
<=	Less or equal than	1 <= 2	3 <= 1
>=	Greater or equal than	2 >= 1	1 >= 3
<	Strictly less than	1 < 2	2 < 1
>	Strictly greater than	2 > 1	1 > 2
IS NULL	Returns TRUE if the value is the NULL value.	NULL IS NULL Note that this is only case where this operator is TRUE.	1 IS NULL
IS NOT NULL	Returns TRUE if the value is not NULL	1 IS NOT NULL	NULL IS NOT NULL. Note that this is only case where this operator is FALSE.
expr BETWEEN min AND max	Checks if the expr value is situated between min and max values.	2 BETWEEN 1 AND 3	4 BETWEEN 1 AND 3
expr IN (value, ...)	Checks if the expr is contained in the value list. This operator is extremely useful when used with SET and ENUM column type.	1 IN (1,2,4)	3 IN (1,2,4)
expr NOT IN (value, ...)	Checks if the expression is not in the value list.	3 IN (1,2,4)	1 IN (1,2,4)
ISNULL(expr)	Checks if the expr is NULL	ISNULL(1/0)	ISNULL(0/1)
COALESCE (list)	Returns the first non-null item in the list.	This operator will not return a true of FALSE value. Instead it will return either NULL if there is no non-NULL value or the first value of the list that is non-NULL.
INTERVAL (X,n1, n2, n3, ...)	Returns 0 if X<n1, 1 if X<n2 and so on.	This operator will not return TRUE or FALSE but instead the numerical position of the first i where X<ni. Note that n1<n2<n3...is required for this function to work correctly.

Logical Operators

There are three logical operators. They return either TRUE or FALSE or eventually NULL. For logical operators the NULL value is in most cases the same as a FALSE value.

Operator

Meaning

TRUE example

FALSE example

! expr

NOT expr

This operator negates the logical value of expr.

! 0

! 1

AND

Returns TRUE only if both arguments are TRUE.

1 AND 2

0 AND 2

Returns TRUE if at least one of the two arguments is TRUE.

1 OR 0

0 OR 0

At the beginning of the filtering section, we introduced a simple query to show you a logical expression:

     SELECT * FROM room WHERE ID > 1 AND startDate='2002-31-12'

There we see the AND operator, which is used to tell MySQL that we want both of the conditions: ID > 1 and startDate= '2002-31-12' to be TRUE. If we want only one of these conditions, we would write the query like this:

     SELECT * FROM room WHERE ID > 1 OR startDate='2002-31-12'

Flow Control Operators

The flow control operators are extremely helpful when you want to build a powerful query. They act as mini branching control in your query.

The IFNUL(expr1, expr2) operator returns expr1 if expr1 is not NULL otherwise it will return expr2.
The NULLIF (expr1, expr2) operator returns NULL if expr1 = expr2, else returns expr1.
The IF (expr1, expr2, expr3) is the most powerful of the flow control operators. It allows you to build some tests quickly into your query. If expr1 is TRUE, it will return expr2 else it will return the expr3.

We have seen that the LEFT JOIN returns the values of the first table and each row of the second table for which the ON clause is TRUE. However, it may happen that the first table has no counterpart value in the second table. That would be the case if a room has never been booked. There wouldn't be any bookings row that would have the column roomID pointing at that room.

In this case a NULL value is used for each column value of the second table. Suppose we want to return the value 'never booked' when there is no booking value for that room, we would use the following query:

     SELECT room.ID, IF(bookings.roomID IS NULL, ' never booked','at least once') as     booked FROM room LEFT JOIN bookings ON room.ID=bookings.roomID

Note that in the IF operator, expr1 is evaluated as an integer value, which means that if you are testing floating-point or string values, you should do so using a comparison operation:

     IF(0.1,1,0)

which will return 0 because in the IF context, 0.1 will be converted into an integer and then the expr1 will be FALSE. When testing floating point, use:

     IF(0.1<>0,1,0)

which will return 1.

Using LIKE

The expr1 LIKE expr2 operator compares expr1 and expr2 and returns TRUE if expr1 is like expr2 and FALSE otherwise. The 'like' word means that you can use special characters in 'expr2' to tell MySQL about the level of likeliness of your comparison. These characters are the following:

%: means any string of characters of any size.
_ (underscore): means any character, but only one.

The following table will show examples of the LIKE operator and the results:

Statement	Value
SELECT 'a' LIKE 'b'	FALSE. Because there is no character % or _, and 'a' is not equal to 'b'.
SELECT 'a' LIKE '_'	TRUE. The expression '_' will match any string made of only one character.
SELECT 'a' LIKE '%'	TRUE. Note that the expression '%' will always return TRUE.
SELECT 'myroom' LIKE '%room'	TRUE. The expression '%room' matches all strings that with by the sequence 'room'.
SELECT 'myroom' LIKE 'my%'	TRUE. This is the contrary of the previous example, it matches every sequence beginning with 'my'.
SELECT 'myroom' LIKE '%theroom%'	FALSE. The sequence 'theroom' cannot be found in the value 'myroom'.

The LIKE operator is extremely useful to make a quick search of specific keywords in your database, and is therefore used often in web site design. If you are comparing case-sensitive strings with any of the standard operators (=, <>..., but not LIKE) end space will be ignored.

Grouping

The GROUP BY keyword allows you to regroup the rows of a recordset according to the recordset column names into one single row.

This means that for example, if you have a recordset that has 50 rows, but one column of this recordset can take only four different values, grouping your recordset on that column will return a new recordset containing only four rows.

The grouping is done on the recordset column name and not necessarily on the table column name. If you selected some complex expression from multiple table fields and if this expression is aliased with the AS keyword, you can group the rows by using the aliased column name.

How Grouping Works

We will illustrate the concept of grouping with the following exercise: We want to know for each table how many reservations have been registered. To do that, we will have to use the JOIN of two tables: room to get a list of each row and bookings to know about the bookings that are present in the database.

First of all, we will use an example of data contained in these two tables:

     SELECT * FROM room

has:

ID	price	bed	Number
1	34	2	101
2	56	3	102

while the table bookings:

     SELECT * FROM bookings

has:

ID	roomID	clientID	startDate	endDate
23	2	1	2002-3-22	2002-3-24
24	2	2	2002-3-27	2002-4-1

As you can see, in the table bookings the column roomID has two occurrences of the value 2. The ID of the first room is not present, which means it has never been booked. Therefore, we know the results we would like to have: room 2 has had 2 bookings while room 1 has 0 reservations.

Now we will build the query to get all this information at once:

     SELECT room.ID as roomID,price, bed,number,bookings.ID as bookingID FROM room LEFT     JOIN bookings ON room.ID=roomID

which will return the following:

roomID	Price	Bed	Number	bookingID
1	34	2	100	NULL
2	56	3	101	23
2	56	3	101	24

We are almost done. The important point is the recordset column roomID. The recordset contains the following values for this column: 1, 2, and 2. We will insert a GROUP BY sequence to group the rows of the whole recordset that have the same value for the column roomID. And finally we are going to introduce the count() function. The SQL query becomes:

     SELECT room.ID as roomID,price, bed,number,bookings.ID as bookingID, count(roomID)     as bookCount FROM room LEFT JOIN bookings ON room.ID=roomID GROUP BY room.ID

This will give us the expected results:

roomID	price	bed	number	bookingID	bookCount
1	34	2	100	NULL	0
2	56	3	101	23	2

As you can see, the recordset column bookCount contains the number of bookings for each room. But the power of this query is that it not only requires one SQL statement, but that it also contains each piece of information about the room. So, you don't have to perform multiple queries in order to know about the room and the number of reservations.

Note

The choice of the column on which you will group items is extremely important. We have seen that the LEFT JOIN allows you to retrieve results even if there is nothing in the left table. When this is the case, the column of the second table will have the value NULL. But if you group on a column that may have the NULL value, you have good chances that the result will be wrong or unexpected.

Implicit Grouping

MySQL optimizes the use of the GROUP BY by doing an implicit grouping if you use some grouping function in your query. If you use a group function in a statement containing no GROUP BY clause, it is equivalent to grouping on all rows. For example:

     SELECT count(*) FROM room INNER JOIN bookings ON roomID=room.ID

will give you the number of rooms that have at least one booking. In this case, the number is 2. See the section about the group by function to see other examples of implicit grouping.

GROUP BY Syntax

The following syntax shows you the syntax for the GROUP BY structure:

     GROUP BY {unsigned_integer | col_name | formula) [ASC | DESC], ...

This syntax above shows you that the GROUP BY expression can be made of a list of grouping elements. It means that you can group the column successively with different column names. For example, the following query:

     SELECT * FROM room GROUP BY number, price

will group your recordset first by the number of the room, and then with the price column.

The unsigned_integer expression is the column number of your recordset section starting from 1. The col_name expression represents one of the recordset column names. If your recordset column is made of a complex subexpression you must alias it with the AS keyword and use that alias as the column name. The formula expression means a formula computed from anything that is valid SQL.

For example, you could use the following:

     SELECT * FROM room INNER JOIN bookings ON roomID=room.ID GROUP BY room.ID/2

Be careful that if your formula is an unsigned integer it will be used like a column number. If you really want to group rows by an integer, then use a formula: 0+19 for example.

This example will show you the difference between the types of grouping. Note that we supposed that the data in the table bookings are the same as for the previous example:

     SELECT * FROM bookings GROUP BY 2

which will give you:

ID	roomID	clientID	startDate	endDate
23	2	1	2002-3-22	2002-3-24

Because the groupment has been made on an unsigned integer, which means the second column of your recordset. As the table bookings has the value '2' for the roomID in all the rows, the result is only one row.

     SELECT * FROM bookings GROUP by clientID

which will give you:

ID	room ID	clientID	startDate	endDate
23	2	1	2002-3-22	2002-3-24
24	2	2	2002-3-27	2002-4-1

Because the column clientID has two different values, the grouping doesn't do anything, and you get the whole content of your table room. Now the following query:

     SELECT * FROM bookings GROUP BY 0+1

which will give you:

ID	roomID	clientID	startDate	endDate
23	2	1	2002-3-22	2002-3-24

because the group by expression, '0+1', is a formula and always evaluates to the same value: 1, therefore, all the rows of your recordset are grouped together.

Functions To Use in GROUP BY

In the previous part we introduced the function count(). We will now review a number of functions that imply the use of a GROUP BY clause.

count(expr)

The count (expr) returns a count of the number of non-NULL values in the rows retrieved by a SELECT statement. Note that it doesn't return the sum of the values, just the number of rows. In the example about the reservation of a table, we introduced the function count().

     SELECT room.ID as roomID,price, bed,number,bookings.ID as bookingID, count(roomID)     as bookCount FROM room LEFT JOIN bookings ON room.ID=roomID GROUP BY room.ID

As you can see on this statement, the count(roomID) will return the number of rows that have the same roomID, thanks to the grouping made on room.ID.

The expression COUNT(*) is somewhat different in that it returns a count of the number of rows retrieved, whether or not they contain NULL values.COUNT(*) is optimized to return very quickly if the SELECT retrieves from one table, no other columns are retrieved, and there is no WHERE clause. For example:

     select COUNT(*) from room

will return the number of rows contained in the table room.

count(DISTINCT ...)

The count (DISTINCT exprl, ...) function returns a count of the number of different non-NULL values.

avg(expr)

The avg(expr) returns the average value of the grouped column value. For example, if we want to know the average price of the room, we will use the following query:

     SELECT avg(price) FROM room

which will give you the average price of the rooms. You may notice that we computed the average value on the column price, but did not specify any groupment clause. This is an example of the implicit grouping that we saw above. The implicit grouping will group all the rows into one single row. Therefore, the average will be computed on all rows, which is exactly what we want.

min(expr) and max(expr)

The min(expr) and max(expr) returns the minimum and the maximum value of the grouped value. For example, if we want to know the lowest priced room as well as the highest priced room, we will do as follows:

     SELECT min(price),max(price) FROM room

which will give you the following result:

min(price)	max(price)
34	56

sum(expr)

The sum(expr) function is used to compute the sum of all grouped values. This could let us know, for example, the amount of money the hotel could make, if all the rooms are booked.

     SELECT sum(price) FROM room

Will give you 90 as the result.

Filtering Using HAVING

With the WHERE clause, we have already had a chance to look at how to filter a recordset, but the WHERE clause can only work on the table column level. If you use a complex expression as a recordset column, you won't be able to use a WHERE clause on that column.

SQL provides a second mechanism to address this issue: the HAVING clause. HAVING works almost the same way as the WHERE clause, except that it works on the recordset column while WHERE works at the table columns level. To illustrate the HAVING clause, try to execute the following query:

     SELECT year(startDate) as startyear ,year(endDate) as endyear FROM bookings WHERE     startyear=2002

Executing this query will result in a syntax error telling you that the column 'startyear' is unknown in the WHERE clause. Now try to execute this query:

     SELECT year(startDate) as startyear ,year(endDate) as endyear FROM bookings HAVING     startyear=2002

This will return the correct recordset. The HAVING expression is built with the exact same syntax as the WHERE clause. You can use the entire WHERE operators and functions.

Sorting the Recordset

The ORDER BY clause is used for ordering the rows retrieved in a specific order. The syntax for the ORDER BY is as follows:

     [ORDER BY {unsigned_integer | col_name | formula} [ASC | DESC] ,...]

The unsigned_integer expression is, like in the GROUP BY, relative to the recordset column and starts from 1. The following example shows you how to order a recordset by the second column:

     SELECT * FROM room ORDER by 2 DESC

which will give you:

ID	price	bed	Number
2	56	3	102
1	34	2	101

Note that the DESC keyword has been used to reverse the order of the sort. The ASC keyword is used to order the recordset in normal order. It is optional and you can omit it. The col_name expression represents the name of one of the recordset columns. We could have written the previous example as:

     SELECT * FROM room ORDER by price DESC

If you want to order by the number of beds, use the following query:

     SELECT * FROM room ORDER by bed

The formula expression represents a formula computed from anything that is valid SQL. For example, you could use the following:

     SELECT * FROM room ORDER BY rand()

This example will return a recordset ordered by random values. Another example: In the GROUP BY part we have learned how to group the room to retrieve the bookings. The following query will show you how to order the result by the room number.

     SELECT room.ID as roomID,price, bed,number,bookings.ID as bookingID, count(roomID)     as count FROM room LEFT JOIN bookings ON room.ID=roomID GROUP BY room.ID ORDER BY     room.number

which will give you:

roomID	price	bed	number	bookingID	count
1	34	2	100	NULL	0
2	56	3	101	23	2

Limiting the Number of Rows

Sometimes the number of rows you will get from a SELECT statement will be huge and you will want to limit the amount of rows. Other times you will just be interested in the first 10 rows. You may even want to implement recordset navigation by selecting small portions of a larger recordset. To do this, we will use the LIMIT clause. The syntax of the LIMIT clause is shown below:

     [LIMIT [offset,] rows]

The offset is the first row that you want to limit, and the rows element is the number of rows you want to limit. To illustrate the LIMIT concept, we will use a table with the following content:

Table: example
ID	Value
1	10
2	3
3	5
4	78
5	82
6	9

The following queries will show you the result of the LIMIT clause:

     SELECT * FROM EXAMPLE LIMIT 2

which will give you:

ID	Value
1	10
2	3

     SELECT * FROM EXAMPLE LIMIT 2,3

which will give you:

ID	Value
3	5
4	78
5	82

A Lottery Example

An interesting application of the LIMIT clause is when you want to pick a random row. Suppose that in our hotel, each month a room is picked randomly and its current occupant is given a free day.

To select the room, we will use the following query:

     SELECT * from room ORDER BY rand() limit 1

This query will select all rows from the table room, sort them in a random order, then will just output the first row.