8.4. Advanced SQLIn this section, we'll introduce database concepts that, while not strictly necessary for developing your web sites, can increase performance and give your queries more flexibility. 8.4.1. IndexesIndexes work the same way that an index of a book works. If you were to look for the keyword "create table" without an index, you'd need to spend a lot of time scanning through the pages of the book looking for a section that might be relevant. Then you'd have to scan the entire section. This certainly isn't an efficient use of your time or the database's. The solution is an index at the end. The data in an index is sorted in order and organized to make finding a specific value as quickly as possible. Because the values are sorted, if you're looking for something specific, the database can stop looking when it finds a value larger than the item you're looking for. You face the same problems as a book does, though. If an index is so great, why not index everything? There are numerous reasons:
So some intelligent decisions about which fields to index in your tables have to be made. Each index requires its own datafile for storage, which can add a bit of processing time when the contents of an indexed field changes in the database. 8.4.1.1. When indexes are usedIf you do a simple SELECT statement with a WHERE clause, an index won't be used. There are three major areas where an index can be used:
Just remember, indexes have to be defined before they can be used. 8.4.1.2. Where to specify the indexDatabase indexes can be specified as part of the CREATE TABLE command or they can be added to an existing table by using special SQL commands. If the index is created as part of the CREATE TABLE command, then it's specified at the end of the code block like this: UNIQUE `authind` (`author`) This UNIQUE command creates an index on the author name field. To create the same index using a SQL statement, use the code in Example 8-3. Example 8-3. Creating a simple index
which returns the following: Query OK, 4 rows affected (0.11 sec) Records: 4 Duplicates: 0 Warnings: 0 Now to describe the table: DESCRIBE `authors`; This gives you this information: +-----------+--------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-----------+--------------+------+-----+---------+----------------+ | author_id | int(11) | | PRI | NULL | auto_increment | | title_id | int(11) | | | 0 | | | author | varchar(125) | YES | UNI | NULL | | +-----------+--------------+------+-----+---------+----------------+ 3 rows in set (0.00 sec) Notice the new value of UNI in the key column for author. 8.4.1.3. Multicolumn indexesIt's also possible to create MySQL indexes that use more than one column. A multicolumn unique index can be used to make sure that the combination of two or more keys is unique. The best columns to index are those that are likely to be used in the WHERE clause, especially if you know that certain combinations of keys will be used. Those are good columns to add to a multicolumn index. Unique indexes, similar to primary indexes, are also unique. Only one primary index is allowed per table. However, you can have as many unique indexes as your heart desires; there is no parameter on this. We're going to do a query with a specific WHERE clause and then use EXPLAIN to get details about how it was processed by MySQL: SELECT * FROM `authors` WHERE `author` = 'Arnold Robbins'; This returns: +-----------+----------+----------------+ | author_id | title_id | author | +-----------+----------+----------------+ | 3 | 2 | Arnold Robbins | +-----------+----------+----------------+ 1 row in set (0.00 sec) Use the EXPLAIN keyword on a database that doesn't have an index defined for the authors table: EXPLAIN SELECT * `authors` WHERE `author` = 'Arnold Robbins'; EXPLAIN, in turn, gives you this output: +----+-------------+---------+------+---------------+------+---------+------+--- ---+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | ro ws | Extra | +----+-------------+---------+------+---------------+------+---------+------+--- ---+-------------+ | 1 | SIMPLE | authors | ALL | NULL | NULL | NULL | NULL | 4 | Using where | +----+-------------+---------+------+---------------+------+---------+------+--- ---+-------------+ 1 row in set (0.00 sec) The EXPLAIN output provides a wealth of information about how MySQL processed the query. It tells you that:
After creating a unique index on authors called authind using the syntax from Example 8-3, rerun the EXPLAIN query: +----+-------------+---------+-------+---------------+---------+---------+------ -+------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+-------+---------------+---------+---------+------ -+------+-------+ | 1 | SIMPLE | authors | const | authind | authind | 126 | const | 1 | | +----+-------------+---------+-------+---------------+---------+---------+------ -+------+-------+ 1 row in set (0.12 sec) Notice that many of the values have changed regarding the indexing. Typing ref would mean that rows with matching index values are read from this table for matches.
The comparison shows that adding the index saves a lot of processing time, even for this small table. 8.4.2. Selecting Using the LEFT JOIN ON ClauseWe've discussed performing joins in our SELECT statements using the WHERE clause, but there's another way to join tables. Instead of using the WHERE keyword, LEFT JOIN ON can be used to perform a left or outer join. A left join simply allows you to query two tables that are linked together by a relationship but allows one of the tables to return rows even if there isn't a matching row in the other table. Using the bookstore tables as an example, you might want to create a query that returns users and their purchases but also lists users who have yet to purchase anything. Using the syntax: SELECT fields FROM left_table LEFT JOIN right_table ON left_table.field_id = right_table.field_id; your goal could be accomplished like this: SELECT * FROM `users` LEFT JOIN `purchases` ON `users`.`user_id` = `purchases`.`user_id`; When doing a normal database query that links two tables, if both tables do not include the key values for the field being joined, nothing is returned for the entry. 8.4.3. Using Database FunctionsJust like there are functions in PHP, you can also use functions within your MySQL queries. We'll discuss several categories of functions starting with string functions. The other major categories you'll learn about are date and time modification functions. 8.4.3.1. String functionsSince you'll frequently work with strings, MySQL provides many functions for doing a variety of tasks. You'll generally use the string functions with data that is being returned from a query. However, it's possible to use them without even referencing a table. 8.4.3.1.1. ConcatenationJust like the process of putting strings together with the PHP operator (.), which is a period, MySQL can paste together strings from data fields with the CONCAT function. For example, if you want to return a single field that combines the title with the number of pages, you could use CONCAT. Example 8-4 shows how this is done. Example 8-4. Using CONCAT to put fields together
Concatenating returns: +----------------------------------------+ | concat(title,' has ',pages,' pages.') | +----------------------------------------+ | Linux in a Nutshell has 476 pages. | | Classic Shell Scripting has 256 pages. | +----------------------------------------+ 2 rows in set (0.02 sec) The result is a string that's ready for displaying straight from the SQL query.
The CONCAT function pastes together as many fields as you give it. 8.4.3.1.2. Concatenation with a predefined separatorSometimes you might want to consistently put the same character or string between fields you're concatenating. This can be used for building a table export list. To do this, use the CONCAT_WS function. For example, to return all of the fields in the authors table with commas as separators, you would use: SELECT CONCAT_WS(',',`author_id`,`title_id`,`author`) FROM `authors`; This returns the following: +------------------------------------------+ | CONCAT_WS(',',author_id,title_id,author) | +------------------------------------------+ | 1,1,Ellen Siever | | 2,1,Aaron Weber | | 3,2,Arnold Robbins | | 4,2,Nelson Beebe | +------------------------------------------+ 4 rows in set (0.01 sec) The separator could have been a space, which is useful for putting first and last name fields together for display. 8.4.3.1.3. Calculate a string lengthTo calculate the length of a string, use the LENGTH function, as shown in Example 8-5. Example 8-5. Calculating the length of a string
This returns: +-----------------------------------------------------+ | CONCAT(title,' has ',LENGTH(title), ' characters.') | +-----------------------------------------------------+ | Linux in a Nutshell has 19 characters. | | Classic Shell Scripting has 23 characters. | +-----------------------------------------------------+ 2 rows in set (0.02 sec) Example 8-5 shows the usage of LENGTH and CONCAT together. 8.4.3.1.4. Changing strings to upper- or lowercaseIf you want to change the case of a string to all upper- or lowercase letters, you can use the UCASE and LCASE functions. For example, to covert the book title to all uppercase and then all lowercase, use the code in Example 8-6. Example 8-6. Changing the case of the title
Example 8-6 returns: +-------------------------+-------------------------+ | UCASE(title) | LCASE(title) | +-------------------------+-------------------------+ | LINUX IN A NUTSHELL | linux in a nutshell | | CLASSIC SHELL SCRIPTING | classic shell scripting | +-------------------------+-------------------------+ 2 rows in set (0.03 sec) 8.4.3.1.5. Trimming and padding stringsWhen working with forms, it's sometimes necessary to pad the length of a string to improve its display. The padding can be dots or some other character. VARCHAR type strings, in particular, are variable in length. The two functions that perform padding are LPAD and RPAD, they pad from the left and right, respectively. They both take three arguments: the string to pad, the size of the pad, and what character to use as padding. For example, we'll do a left pad on the title field of books to make it a uniform 30 characters with a period (.) as the padding character: SELECT LPAD(`title`,30,'.') FROM `books`; This returns your values all at the righthand margin: +--------------------------------+ | LPAD(title,30,'.') | +--------------------------------+ | ...........Linux in a Nutshell | | .......Classic Shell Scripting | +--------------------------------+ 2 rows in set (0.00 sec) This looks somewhat like the formatting you see in a table of contents. To trim spaces or tabs (also known as whitespace) from a string, use LTRIM to remove them from the left and RTRIM to remove them from the right. To trim nonwhitespace characters, use the trIM function. It uses a syntax that's slightly different, because you're leading trimming: TRIM(LEADING FROM string); For trailing trimming, use: TRIM(TRAILING FROM string); In Example 8-7, LEADING is used to remove the leading zeros. Example 8-7. Using the LEADING option to remove zeros
Example 8-7 returns: +-------------------------------------------+ | TRIM(LEADING '0' from '0000Example00000') | +-------------------------------------------+ | Example00000 | +-------------------------------------------+ 1 row in set (0.00 sec) To remove the trailing zeros, use the code in Example 8-8. Example 8-8. Using TRIM with the TRAILING option
Example 8-8 returns: +--------------------------------------------+ | TRIM(TRAILING '0' from '0000Example00000') | +--------------------------------------------+ | 0000Example | +--------------------------------------------+ 1 row in set (0.01 sec) Notice that while Examples 8-7 and 8-8 don't reference any tables in the SELECT statements, they're still valid queries. 8.4.3.1.6. String location and positionSometimes you'll want to know whether a string is within a string and what its position is in that string. To locate a string within a string, use the LOCATE() function. It takes the string to look for and the string to search in as its arguments. Example 8-9 shows how the location of a string is returned from a database field. Example 8-9. Looking for the string in our author names
Example 8-9 returns: +-------------------+---------------------+ | author | LOCATE(on',`author`)| +-------------------+---------------------+ | Aaron Weber | 4 | | Arnold Robbins | 0 | | Ellen Siever | 0 | | Nelson Beebe | 5 | +-------------------+---------------------+ 4 rows in set (0.01 sec) The author names that don't contain the string on return a position of 0, indicating that the string was not found.
Keep in mind only the first occurrence of a string is matched, similar to a Find in an application. LOCATE() can also take a third argument to start looking at a position other than the start of the string. 8.4.3.1.7. Cutting up stringsThe substring functions provide a way to extract a portion of a string. All that's needed is the string to work with, the position to start from, and how many characters to extract. Use the LEFT, RIGHT, and SUBSTRING functions to do the extraction.
For example, if a database has phone numbers stored in a 10-digit string without any formatting, the numbers could be displayed with the formatting by using the code in Example 8-10. Example 8-10. Adding the formatting to a phone number using LEFT, RIGHT,and SUBSTR
These commands return: +------------------------------------------------------------------------------- -----------+ | CONCAT('(',LEFT('6128238193',3),')',SUBSTR('6128238193',4,3),'-',RIGHT('612823 8193', 4)) | +------------------------------------------------------------------------------- -----------+ | (612)823-8193 | +------------------------------------------------------------------------------- -----------+ 1 row in set (0.02 sec) Example 8-10 shows how all three of these functions work together to reformat a phone number. The phone number could just as easily have been a database field instead of the number in the example. 8.4.3.1.8. Search and replace functionAnother useful function is the REPLACE function. It does what the name implies, exactly like find/replace in a word-processing application. It takes a source string, a search string, and a replacement string and returns the string with the replacement. For example, suppose you wanted to replace "Avenue" with "Ave" in an address, but only for the current query. Here's how it's done: SELECT REPLACE('2323 Fulerton Avenue', 'Avenue', 'Ave.'); The REPLACE function displays: +---------------------------------------------------+ | REPLACE('2323 Fulerton Avenue', 'Avenue', 'Ave.') | +---------------------------------------------------+ | 2323 Fulerton Ave. | +---------------------------------------------------+ 1 row in set (0.00 sec) Now that we've shown you just about all you could imagine you'll do with strings, it's time to work with dates and times. 8.4.3.2. Date and time functionsAgain, you're dealing with territory that PHP had functions to work with, but what if you'd like to query for purchases from the last 30 days? It's nice to be able to do the date and time arithmetic in the query. The date and time functions can be used with or without a database table in the query. We'll show you both in the following examples. 8.4.3.2.1. Days, months, years, and weeksGiven a certain date, it's hard to remember if that day was a Tuesday or a Thursday. MySQL provides functions that tell you without having to do any of the thinking ourselves. How convenient! You could plot what day you were born just by establishing the date and year. PHP provides two very similar functions to do the calculation. The WEEKDAY function takes a date as its argument and returns a number. The number represents the day of the week with Monday being 0. You could also use the DAYOFWEEK function, which, confusingly enough, does exactly the same thing but numbers the days differently, starting with Saturday as 1. Table 8-3 lists how each function numbers days of the week.
For example, to find out what day of the week was October 12, 1964, use the WEEKDAY function in Example 8-11. Example 8-11. Using WEEKDAY to get the day of the week
This then tells you: +-----------------------+ | WEEKDAY('1964-10-12') | +-----------------------+ | 0 | +-----------------------+ 1 row in set (0.00 sec) Which means October 12, 1964 was a Monday. Pretty cool stuff! It may seem a bit odd to return a number for the day of the week, so there's a function to return the day as its name. The DAYNAME function works like DAYOFWEEK or WEEKDAY but returns a string with the name instead, as shown in Example 8-12. Example 8-12. Using DAYNAME to get the day of the week as a name
As you can see, an alpha answer returns: +-----------------------+ | DAYNAME('1964-10-12') | +-----------------------+ | Monday | +-----------------------+ 1 row in set (0.00 sec) Which proves that we were right in Example 8-11! Similar to the DAYOFWEEK function are DAYOFMONTH and DAYOFYEAR. They take a date as their input and return a number. DAYOFMONTH returns the day of the month, and DAYOFYEAR returns days since the beginning of the calendar year, as demonstrated in Example 8-13. Example 8-13. Finding days since the start of the year
From your DAYOFYEAR function, it returns the following: +-----------------------+-------------------------+ | DAYOFYEAR('2006-1-1') | DAYOFYEAR('2006-12-24') | +-----------------------+-------------------------+ | 1 | 358 | +-----------------------+-------------------------+ 1 row in set (0.00 sec) Just like the relationship between DAYOFWEEK and DAYNAME, MONTH and MONTHNAME return the numeric month or its name, as shown in Example 8-14. Example 8-14. Using MONTH and MONTHNAME on the purchases table
Example 8-14 returns: +------------+------------+----------------+ | day | MONTH(`day`) | MONTHNAME(`day`) | +------------+------------+----------------+ | 2005-02-15 | 2 | February | | 2005-02-10 | 2 | February | +------------+------------+----------------+ 2 rows in set (0.00 sec) If you want to find the week number for a certain date, you can use the WEEK function. It takes a date as its argument and returns the week number. SELECT WEEK('2006-12-24'); This returns: +--------------------+ | WEEK('2006-12-24') | +--------------------+ | 52 | +--------------------+ 1 row in set (0.00 sec) This probably seems pretty easy compared to a lot of the information we've provided. Remember, though, based on how the calendar lays out, some years can have 53 weeks. 8.4.3.2.2. Hours, minutes, and secondsWhen working with datetime, timestamp, or time data types, a specific time is stored in the field. MySQL provides several functions to manipulate these times. They take the logical names: HOUR, MINUTE, and SECOND. HOUR takes a time as an argument and returns the hour from 0 to 23. MINUTE returns the minute of a time from 0 to 59, and similarly, SECOND returns the second in the same range, as shown in Example 8-15. Example 8-15. Using HOUR and MINUTE on a time
Example 8-15 returns: +--------------------------------------------------+ | CONCAT_WS(':',hour('4:46:45'),MINUTE('4:46:45')) | +--------------------------------------------------+ | 4:46 | +--------------------------------------------------+ 8.4.3.2.3. Dates and times arithmeticMySQL provides the functions DATE_ADD and DATE_SUB to allow you to add and subtract days from dates. Their syntax is: DATE_ADD(date,INTERVAL expression type) DATE_SUB(date,INTERVAL expression type) The type can be one of those listed in Table 8-4.
For example, if you want to calculate the date of the current day minus 12, you would write code like that in Example 8-16. Example 8-16. Using DATE_SUB to subtract days
This returns (your time will be different based on when you run this query): +----------------------------------+ | date_sub(NOW(), INTERVAL 12 day) | +----------------------------------+ | 2005-11-03 04:27:09 | +----------------------------------+ 1 row in set (0.00 sec) The NOW function returns the current time; we'll discuss this and some other special date/time functions shortly. In Example 8-16, the value after INTERVAL can be any expression that returns the format the type is expecting from Table 8-4. Since Version 3.23, MySQL also supports the syntax of using + and - with dates, as in Example 8-17. Example 8-17. Using the minus operator on a date
Example 8-17 returns: +------------------------+ | NOW()- INTERVAL 12 DAY | +------------------------+ | 2005-11-03 04:32:30 | +------------------------+ 1 row in set (0.01 sec) It's really all the same command but with an abbreviated syntax. 8.4.3.2.4. NOW functionThe NOW function returns the current date and time according to the setting of your computer's system date and time. So, if your computer clock is off, the data from NOW will be as well. MySQL provides several functions for returning the current date or time, or the current date and time together. CURDATE and CURRENT_DATE both return the date in 'YYYY-MM-DD' format. SELECT CURDATE(); This returns: +------------+ | CURDATE() | +------------+ | 2005-11-15 | +------------+ 1 row in set (0.00 sec) Use CURTIME or CURRENT_TIME to return the current time in the format 'HH:MM:SS': SELECT CURTIME(); Computer setting for date and time returns: +-----------+ | CURTIME() | +-----------+ | 04:44:50 | +-----------+ 1 row in set (0.00 sec) In addition to the NOW function, you can use SYSDATE and CURRENT_TIMESTAMP to return the current date and time in the 'YYYY-MM-DD HH:MM:SS' format: SELECT SYSDATE(); Military formatted data is returned: +---------------------+ | SYSDATE() | +---------------------+ | 2005-11-15 04:45:14 | +---------------------+ 1 row in set (0.00 sec) Last but not least, MySQL provides the ability to display dates and times in a variety of formats. 8.4.3.2.5. Formatting for displayTo display a date in a custom format, use the DATE_FORMAT function. It takes a date or timestamp as its input and a format string. In Table 8-5, the format strings are shown.
If you use any other characters in the format string, they appear as they are, as shown in Example 8-18. Example 8-18. Using DATE_FORMAT with a string to place colons between the segments
Adding colons displays: +------------------------------------------------+ | DATE_FORMAT('2006-12-24 09:09:23', '%h:%i:%s') | +------------------------------------------------+ | 09:09:23 | +------------------------------------------------+ 1 row in set (0.01 sec) At this point, all the basics have been covered. In our next chapter, we'll walk through using PHP to connect and work with MySQL data. |