Module 5: Advanced Reporting | MySQL: Essential Skills

This module will expand on the report-generating skills learned in the previous module by explaining the most commonly used MySQL functions, including those for universal-, control flow “, aggregate-, mathematic-, string-, date-, and null- related manipulations of data. You will use the ORDER BY and GROUP BY clauses to focus your data output, and the JOIN commands to incorporate data from different tables into the same report.

Critical Skill 5.1 Use Functions in MySQL

In this skill, you will learn how to take advantage of many of the most-used functions in the MySQL arsenal. These functions allow you to quickly access and manipulate the data in your database tables. Using functions, you can define both simple and complex problems, and the MySQL functions will do the work for you.

Before you learn about the functions, you should be aware of how MySQL handles data in a mixed context. For instance, what if you were to give MySQL a nonsensical statement like this:

 SELECT "3 bears"+"3 blind mice"+17;

MySQL would not give you an error for trying to add two strings to a number. Instead, it would ignore the string portion of the information, add the numeric portion of the information together, and return an answer of 23.

The MySQL database tries to interpret strings as numbers, if the context implies numbers, or as dates, if the context implies dates. It does this because MyISAM does not function with transactions, so it cannot roll back erroneous operations on a transactional level. Therefore, MySQL operates with a kind of doing-its-best philosophy. Occasionally, this may lead you to places you never intended to go when you place data in a mixed context; however, the side effect of this philosophy is that conversion functions are not necessary. MySQL will turn numbers into strings or strings into numbers automatically when the context requires it.

Using Universal Functions

Two functions within MySQL are so pervasive that they are universal: the IN function and the IF function. They are both applied to string and number values, used in the SELECT command, and involve kinds of comparisons.

IN Function

The IN function allows you to specify a comparison value for choosing table fields. It has the following syntax:

 WHERE <  comparison_field  > IN(  comparison_value  1[,  comparison_value2  ,);

When the comparison matches, the fields specified in the SELECT command are chosen and returned to the command line. The comparison value does not need to be one of the fields specified in the SELECT portion of the command; it just needs to be in the WHERE IN portion.

For example, using the duck_cust table, to get a list with the customer number and first and last names for all the customers in a particular set of states, use the following code:

 SELECT cust_num, cust_first, cust_last, cust_state FROM duck_cust WHERE cust_state IN("WA","CA","AZ");

The comparison works with number fields as well as strings. For example, the WHERE < comparison_field > IN value could be a list of customer numbers, like this:

 SELECT cust_first, cust_last, cust_state FROM duck_cust WHERE cust_num IN(3,7,13,4,2);

As you can see, you can use the IN function only when you know the comparison values. It is practical when you don t need to list too many values. In theory, the list can be as long as you need it to be, but the longer the list, the less elegant the code and the more opportunity for mistakes in typing the comparison data into the command.

IF Function

The IF function allows you to set conditional evaluations with corresponding true and false values. Its syntax is as follows :

 IF(<  Boolean_test  >, <  value_if_true  >, <  value_if_false  >)

The Boolean_test must evaluate to either a true (anything nonzero) or false (zero) value. The value_if_true and value_if_false can be a string, number, or date.

You can use the IF function more than once within a SELECT command by separating the functions with the standard comma separators.

The IF function is often used for creating columns for matrix-style reports or for counting instances of something (as opposed to adding up what those instances contain).

As an example, let s see how you could use the IF function to get information from a table with data about U.S. states. This table lists the population of each of the states last year and ten years ago, along with fields representing the capital city, number of congressional representatives, and region. Table 5-1 shows the table s fields and their defined types.

Table 5-1: Field Names and Types for the StatesPop Table
Field	Type
StateName	CHAR(15)
PopNow	INT
DecadePop	INT
NumReps	TINYINT
StateCap	CHAR(20)
Region	CHAR(15)

Suppose that you have received a request for a report that shows whether the population of each state is over or under 10 million, both now and a decade ago, listing the states by their regions . The following code will produce that report:

 SELECT StateName AS State, Region, IF(PopNow<10000000, "<10M",">10M") AS Current, IF(DecadePop<1000000, "<10M",">10M") AS "Decade Ago", FROM StatesPop ORDER BY Region;

This gives you a report that has the following headers: State, Region, Current, and Decade Ago. The first two columns would hold the data from the StatesPop table, and the next two columns, depending on the range of their population, would hold either <10M or >10M.

As the code stands, if a state s population happens to be exactly 10 million, it will be placed in the greater-than 10 million category. This very well might be just fine for the type of data you are dealing with (population figures) and the generalized information required by the requested report. However, if you need a more detailed report, you can use more IF functions and more columns for the results. The following code gives you a report with more information.

 SELECT StateName AS State, Region, IF(PopNow<1000000, PopNow,"") AS "<1M Now", IF(PopNow>1000000 AND PopNow<5000000, PopNow,"") AS "<5M Now", IF(PopNow>5000000 AND PopNow<10000000, PopNow,"") AS "<10M Now ", IF(PopNow>10000000 AND PopNow<15000000, PopNow,"") AS "<15M Now ", IF(PopNow>15000000, PopNow,"") AS ">15M Now ", IF(DecadePop<1000000, "<1M","") AS "<1M Then", IF(DecadePop>1000000 AND DecadePop<5000000, "<5M","") AS "<1M Then", IF(DecadePop>5000000 AND DecadePop<10000000, "<10M","") AS "<5M Then", IF(DecadePop>10000000 AND DecadePop<15000000, "<15M","") AS "<15M Then", IF(DecadePop>15000000, ">15M","") AS ">15M Then" FROM StatesPop ORDER BY Region;

This code returns a report that has 12 columns. For each state, it will display the state name and region, with population figures in Now and Then sections. All of the other columns in the Now and Then sections will remain blank, as indicated by the pair of double quotation marks with nothing between them.

IF functions can become as complicated as the permutations of your data and your grasp of conditional comparisons allow.

Using Control Flow Functions

Control flow functions allow you to set a value or condition, called a CASE, which is either compared against another value or evaluated as to its condition. You also specify a WHEN result for matching values or true conditions, and sometimes an alternate ELSE result for nonmatching values or false conditions. Control flow functions are normally used in situations where there is a specific set of results that apply to all, or nearly all, of the data. The IF function discussed in the previous section can also be considered a control flow function.

There are essentially two variations of the syntax for CASE WHEN: one for comparing values and one for evaluating conditions.

The comparing values syntax looks like this:

 CASE <  value  > WHEN <  compare_value  > THEN <  result1  >  [WHEN <  compare_value2  > THEN <  result2  > ...]  [ELSE <  nonmatch_result  >] END

The CASE value is specified in the code. The WHEN compare_value is drawn from the record of the table being read. The result is also specified in the code. As implied by the portion of the WHEN clause syntax in the square brackets, theoretically, you can have an unlimited number of WHEN clauses to compare the CASE value against. The square brackets around the ELSE clause indicate that it is also an optional part of the code. If it makes more sense in a particular situation for there to be no ELSE clause in the event of the CASE comparison not matching, you can omit the ELSE clause. The END is required in the syntax, and omitting it will cause a syntax error on the command line.

The conditional syntax is generally similar but the operational details of it differ :

 CASE WHEN <  condition  > THEN <  true_result1  >  [WHEN <  condition2  > THEN <  true_result2  > ...]  [ELSE <  false_result  >] END

The CASE no longer has a specified value that the WHEN compares against. The WHEN has a condition, which is either true or false. If the condition is true, the true_result is used in the output. If the condition is false, and a false_result is supplied in the ELSE clause, it is used in the output.

In both variations of the CASE WHEN syntax, if no ELSE clause is specified and the comparison does not match or the condition is false, the NULL value will be returned.

Here is an example of using the comparison value form of CASE WHEN:

 SELECT CASE 3 WHEN 1 THEN "One" WHEN 2 THEN "Two" ELSE "Three" END;

Figure 5-1 shows the resulting report. The value returned was the string Three, since the CASE value 3 did not match either of the comparison values (1 or 2) supplied in the WHEN clauses.

Figure 5-1: Using CASE WHEN for comparing values

The following is an example of using CASE WHEN for evaluating conditions:

 SELECT CASE WHEN 9>7 THEN "TRUE" ELSE "FALSE" END AS "Results";

As shown in Figure 5-2, the value 9 is greater than the value 7, so the string TRUE was returned. The header for the report was Results, because it was specified using AS.

Figure 5-2: Using CASE WHEN for evaluating conditions

This example shows both variants of CASE WHEN with no ELSE clause:

 SELECT CASE "ferret" WHEN "cat" THEN "meow" WHEN "dog" THEN "woof" END; SELECT CASE WHEN 4=5 THEN "large value of 4" END AS "results";

Figure 5-3 shows the results. Since the string "ferret" was not matched by the strings "cat" or "dog", and the condition 4=5 was evaluated as false, both lines of code returned NULL.

Figure 5-3: Using CASE WHEN without ELSE

Note	The field type of the returned value is always the same as the field type of the result value following the first THEN in the command.

Using Aggregate Functions

This section describes aggregate , or collective, functions. These functions combine numbers by adding, averaging, or counting instances; give the high and low ranges of numbers; and provide a measurement of probability.

SUM, AVG, and COUNT

The SUM, AVG, and COUNT functions do what their names imply. Each action is performed on an expression, which can be as simple as a name indicating a column of fields or a more complex statement that has internal functions of its own.

The SUM function does simple addition of the content values in a column and returns a result. If there are no values in the specified column to add, it returns zero. However, if it references an empty set, where there are no rows in the specified table, SUM returns NULL. Here is an example:

 SELECT SUM(num_sold) FROM Jan_sales;

The AVG function figures the averages of the specified expression, which means it adds all the value s contents together, and then divides that result by the number of instances added together in the first result. Here is an example:

 SELECT student_name, AVG(quiz_scores) FROM class_scores;

The COUNT function adds the instances of the record field, as opposed to adding the value of the data in them. If the COUNT function is supplied with an expression (a field name, for instance), it will add one to its total for every non-NULL record field; however, if it is supplied with the asterisk wildcard (*), it starts at zero and adds one with each record it processes. This means that it does not matter whether the record field holds any data; it counts it regardless. Here is an example:

 SELECT COUNT(renter_name) FROM apt_listings; SELECT COUNT(*) FROM apt_listings;

The first of the two preceding commands would only add one to the COUNT if the rental listing showed a renter s name, so the count would end up showing the number of renters in the apartment complex. The second command would add one to the count for every record in the apt_listings table, so the count would end up showing how many apartments were in the complex, whether or not they were rented at the time.

MAX, MIN, and STD or STDDEV

The MIN and MAX functions evaluate the given fields and return the minimum or maximum value contained in them, respectively. They will evaluate both number and string types. Here is an example using a numeric field type:

 SELECT MIN(rent_amount) from apt_listings; SELECT MAX(rent_amount) from apt_listings;

This example would return the minimum rent amount and the maximum rent amount from the apt_listings table, giving you the costs of the cheapest and most expensive apartments.

Here is an example using a string field type:

 SELECT MIN(StateName) FROM StatesPop; SELECT MAX(StateName)FROM StatesPop;

This would return the first state name in an alphabetized list for MIN and the last state name in an alphabetized list for MAX.

Note

In the case of columns defined as ENUM and SET types, comparison is done using their string value, rather than their position relative to the set, as one might assume. MySQL has stated the intention to change a future version so that ENUM and SET are compared relative to their set position, but they do not function that way yet.

The STD and STDDEV functions are used for figuring standard deviation. They both function in the same way. The STDDEV function is provided only for compatibility with the Oracle database.

Calculating with Mathematical Functions

MySQL offers a wide variety of mathematical functions. Here, we will look at some of the more commonly used mathematical functions.

Trigonometry Functions

Simply put, trigonometry is the study of the properties of triangles . It is used in a variety of scientific disciplines like astronomy, geography, engineering. and physics. More specifically , the functions of trigonometry are used in calculus, linear algebra, and statistics.

DEGREES, RADIANS, and PI The DEGREES and RADIANS functions deal with converting values. The DEGREES function returns the value of X converted into radians. The RADIANS function returns the value of X converted into degrees. Their syntax is as follows:

 DEGREES(  X  ) RADIANS(  X  )

When you use the results of these two functions in a SELECT command ”for instance, with a beginning value of 90 degrees ”they will toggle their results between the two measurements. For example, RADIANS, given a value of 90 degrees, returns 1.5707963267949, and DEGREES, when given the value 1.5707963267949 radians, returns 90 degrees.

The function PI returns the value of pi. MySQL displays six decimal places as its default; however, internally, it uses the full double precision for computations . The function has parentheses, but no value is placed in them, as shown in the following syntax:

 PI()

The SELECT PI(); command returns the value 3.141593.

SIN, COS, TAN, and COT The SIN function is the trigonometric function sine. The COS function is the trigonometric function cosine. The TAN function is the trigonometric function tangent. The COT function is the trigonometric function cotangent. Their syntax is as follows:

 SIN(  X  ) COS(  X  ) TAN(  X  ) COT(  X  )

In all of these functions, the value X is given in radians.

ASIN, ACOS, ATAN, and ATAN2 The ASIN function returns the arc sine of the value of X , meaning that the sine of the value it returns is the X supplied in the function. The ACOS function returns the arc cosine of the value of X , meaning that the cosine of the value it returns is the X supplied in the function. The ATAN function returns the arc tangent of the value of X, meaning that the tangent of the value it returns is the value X supplied in the function. Their syntax is as follows:

 ASIN(  X  ) ACOS(  X  ) ATAN(  X  )

The ATAN function can also have two values input: X and Y . This returns the arc tangent of the two variables . As shown in the following syntax, MySQL allows you to use either the ATAN or ATAN2 notation when using this function, but the results are identical.

 ATAN(  X  ,  Y  ) ATAN2(  X  ,  Y  )

Note	The ASIN function returns NULL if the supplied X value is not in the range of “1 to 1.

Logarithms and Powers

Logarithms started out as a way to multiply by adding before the advent of calculators . Powers are a convenient way of expressing large numbers by defining them as a smaller number raised to a stated power. For instance, 3 ^{4 is equal to 3} 3 3 3 or working it out the long way, 3 3=9, 9 3=27, 27 3=81. So, 3 ^4=81.

SQRT The SQRT function returns the nonnegative square root of the supplied X value.

 SQRT(  X  )

LOG, EXP, and POW The LOG function returns the natural logarithm of the value X when it is supplied with only one value. When two values are supplied, it returns the logarithm of X for an arbitrary base supplied as the value B . That makes it the equivalent of LOG(X)/LOG(B).

 LOG(  X  ) LOG(  B  ,  X  )

Note	The two-value, arbitrary base version of LOG was added in MySQL version 4.0.3. In earlier versions, you must use LOG(X)/LOG(B) to reach the same result. The LOG2 function was also added in MySQL version 4.0.3. In earlier versions, you must use LOG(X)/LOG(2) to reach the same result.

The LOG10 function returns the logarithm of the supplied value X in base 10. The LOG2 function returns the logarithm of the supplied value X in base 2. LOG2 is useful for figuring out how much space, in bits, it takes to store a number.

 LOG10(  X  ) LOG2(  X  )

The EXP function returns the value of e, which is defined as the base of natural logarithms, raised to the power of the supplied value X .

 EXP(  X  )

The POW and POWER functions raise the value of X to the power of Y . This operation enables you to notate powers without the use of superscript notation. So, POW(2, 3) is equal to 2 ^3.

 POW(  X  ,  Y  ) POWER(  X  ,  Y  )

MOD and % The MOD function returns the modulo; that is, the remainder of the supplied value X divided by the supplied value M . You can also use the percent symbol (%) in place of MOD, which is the operator used for modulo in the C programming language.

 MOD(  X  ,  M  )  X  %  M

MySQL indicates that this function is safe to use with BIGINT values.

Note	In MySQL version 4.1 only, the syntax X MOD M is allowed, but it does not function in any version before or after.

ABS and SIGN The ABS function returns the absolute value of the supplied value X .

 ABS(  X  )

The SIGN function returns the sign of the supplied value X as “1, 0, or 1, depending on whether the value X is negative, zero, or positive, respectively. The SIGN function is used to test whether a given number X is positive, negative or zero. So, if the value of X were -42, the SIGN function would return a -1 value. If the value of X were 0, the SIGN function would return a 0 value. If the value of X were 9, the SIGN function would return a 1 value.

 SIGN(  X  )

Rounding Functions

MySQL provides a lot of ways to deal with rounding numbers. The function ROUND is paradoxically the one that gives you the least control over how it goes about figuring your result.

ROUND, CEILING, FLOOR, and TRUNCATE ROUND returns the supplied value X rounded to the nearest integer, and it has a two-variable form that allows you to specify, with the value D , the number of decimal places you want to round to. The syntax for the two versions of ROUND is as follows:

 ROUND(  X  ) ROUND(  X  ,  D  )

However, when the value in question is halfway between two integers (.5), the direction ROUND rounds is controlled by the C Library implementation, which varies depending on which operating system you are using. Some of operating systems always round up; some always round down; and some always round toward zero.

If you want more control over how your number is rounded, or you don t want to worry about how the particular operating system you re using deals with the 0.5 issue, you would be better off using TRUNCATE, CEILING, or FLOOR.

The CEILING function always rounds upwards. That means it returns the smallest integer value that is not less than the supplied value X .

 CEILING(  X  ) CEIL(  X  )

Note	In MySQL version 4.0.6, the alias CEIL was added; however, the return value is converted to BIGINT.

The FLOOR function always rounds downward. That means that it returns the largest integer value not greater than the value of the supplied X .

 FLOOR(  X  )

Note	The return value of the FLOOR function is converted to BIGINT.

The TRUNCATE function simply returns the supplied value X truncated to the number of decimal places equal to the supplied value D . If the supplied value for D is zero, there will be no decimal places returned.

 TRUNCATE(  X  ,  D  )

In the case of a negative number supplied for the D value, the whole part of the number is zeroed out. For instance, the command TRUNCATE(113, -2) returns the value 100; TRUNCATE(113, -1) returns the value 110.

Table 5-2 shows the rounding functions and results of using these functions with both positive and negative numbers. Remember that if you try the ROUND example with the .5 value on your system, you may get a different answer, depending on whether your computer s halfway default is to round up or down.

Table 5-2: Rounding Functions on Positive and Negative Numbers
Positive	Result	Negative	Result
ROUND(7.5)	7	ROUND(-7.5)	“8
ROUND(7.53)	8	ROUND(-7.53)	“8
ROUND(7.49)	7	ROUND(-7.49)	“7
CEILING(7.5)	8	CEILING(-7.5)	“8
FLOOR(7.5)	7	FLOOR(-7.5)	“8
TRUNCATE(7.5, 0)	7	TRUNCATE(-7.5, 0)	“7

Random Function

The RAND function has two versions: one with a specified value N , and one with no specified value. If no value N is specified, then RAND returns a value between zero and one. If a value N is specified, it is used as a seed value, which results in a repeatable sequence.

 RAND() RAND(  N  )

The RAND function is meant to be a quick and easy way to generate returns that are random enough for a given situation, and portable between identical versions of MySQL, rather than being a perfect random generator.

Random Numbers If you use RAND with a specified N value, the result will be repeatable. For instance, the value returned for RAND(3) is, and will always be, 0.90576975597606. If no N value is specified, the return will be a more or less random number between zero and one.

If you run three RAND()commands, the return values may differ widely within the function s given range; however, the range between zero and one is a small enough sample, statistically speaking, for results to appear patterned . You can receive the same result twice in a row, for instance. The RAND function is not perfectly random, but it is serviceable for most requirements. Any time two commands in the RAND( X ) syntax have the same X value, the result is always the same.

Random Strings The RAND function can also be used to display a more or less random selection from a SELECT command by using it as the variable in an ORDER BY clause. The following example results in the random output of a single instance of the field cust_duckname from the duck_cust table:

 SELECT cust_duckname FROM duck_cust ORDER BY RAND() LIMIT 1;

As long as your MySQL version is newer than 3.23, you can also use the RAND function with a wildcard SELECT statement, instead of needing to specify a column. The following example displays three adequately random, entire records from the duck_cust table s contents:

 SELECT * FROM duck_cust ORDER BY RAND() LIMIT 3;

Note	If you use the RAND function in a WHERE clause, the RAND will be reevaluated every time the WHERE is executed.

Manipulating Strings with String Functions

String functions manipulate strings in some of the same ways mathematical functions manipulate numbers. They can add and subtract data from existing strings, make new strings, and edit the data already contained in strings. This section will cover the main string functions, and then cover those functions used to compare strings and manipulate the data in them.

Main String Functions

The main string functions are the ones most often used to handle strings and the information held in them. They add strings or portions of strings together, pick out parts of strings based on position or delimiters, search for strings in strings, change strings in other strings, and calculate the length of strings.

CONCAT The CONCAT function is short for concatenate, which means to link together in a series or chain. Its syntax allows for one or more parameters that it will link together and return a result of a single string.

 CONCAT(<  string1  >[, <  string2  >,  ])

If numeric data is supplied as an argument in CONCAT, it automatically converts the data into its equivalent string form. Also, if any of the supplied data is NULL, CONCAT returns the result NULL. Figure 5-4 shows some basic examples of the CONCAT function.

Figure 5-4: Basic examples of the CONCAT function

The CONCAT_WS function, which stands for CONCAT With Separator, allows you to use the first argument to indicate a separator for the rest of the arguments, and this separator will appear between the arguments in the resulting string.

 CONCAT_WS(<  separator  >, <  string1  >[, <  string2  >,  ])

Unlike CONCAT, if a NULL appears in the strings supplied to the CONCAT_WS function, it is simply ignored. If, however, NULL is used as the indicated separator, a NULL value is returned. Figure 5-5 shows examples of using CONCAT_WS.

Figure 5-5: Examples of the CONCAT_WS function

SUBSTRING and SUBSTRING_INDEX The SUBSTRING and SUBSTRING_INDEX functions both pick part of a string to return in a string result, but they operate based on different criteria. SUBSTRING makes its pick based on position within the string, and SUBSTRING_INDEX makes its pick based on delimiters and their occurrence within the string. The following syntax shows four variations of SUBSTRING and the single syntax for SUBSTRING_INDEX.

 SUBSTRING(<  string  >, <  position  >) SUBSTRING(<  string  > FROM <  position  >) SUBSTRING(<  string  >, <  position  >, <  length  >) SUBSTRING(<  string  > FROM <  position  > FOR <  length  >) SUBSTRING_INDEX(<  string  >, <  delimiter  >, <  count  >)

The first two SUBSTRING syntax formats give the same result; the only difference is how they are written. These functions count in from the start of the string to the amount of the specified position , and beginning there, return the remainder of the string.

The third and fourth SUBSTRING syntax formats are also variations that produce the same result. SUBSTRING functions with a length parameter count in to the place indicated by the position parameter, but they count from that place using the length parameter, and then return the string that lies between those two specified places. Figure 5-6 shows examples of the SUBSTRING function variations.

Figure 5-6: Examples of the SUBSTRING function

The SUBSTRING_INDEX function works in a similar manner, but it counts the specified delimiter instead of position within the string . If the count value is positive, SUBSTRING_INDEX returns the portion of the string that is to the left of, or before, the final delimiter . If the count value is negative, it returns everything to the right, or behind, the final delimiter . When the count is a positive number, the function counts the appearances of the delimiter starting from the left, or beginning, of the string. When the count is a negative number, SUBSTRING_INDEX counts the appearances of the delimiter starting from the right, or end, of the string. Figure 5-7 shows examples of using SUBSTRING_INDEX with both positive and negative count values.

Figure 5-7: Examples of SUBSTRING_INDEX

INSTR, REPLACE, and LENGTH The INSTR function, short for In String, returns the numeric position of the beginning of specified substring within the specified string .

 INSTR(<  string  >, <  substring  >)

The REPLACE function takes three values: string , from_string , and to_string . It finds the portion of the string that matches the from_string value and replaces it with the to_string value, and then returns the new version of the string . If it does not find the specified from_string value, it returns the string value unaltered.

 REPLACE(<  string  >, <  from_string  >, <  to_string  >);

The LENGTH function takes in a string and returns the numeric value of the length. If a character within the string takes multiple bytes to store, LENGTH counts those bytes separately.

 LENGTH(<  string  >)

Figure 5-8 shows examples of INSTR, REPLACE, and LENGTH.

Figure 5-8: Examples of the INSTR, REPLACE, and LENGTH functions

Comparison String Functions

Comparison string functions allow you to compare two strings and get a return that tells you whether they are alike, not alike, or sometimes, partially alike.

LIKE The LIKE function, in its most basic form, allows you to compare one specified string against another, resulting in the return of a one if true or zero if false. You can also use two wildcard values: percent (%) to match any number of characters, including zero characters , and underscore (_) to match one, and only one, character. If you are trying to search for one of those two wildcard symbols, you must use the escape symbol, the backslash (\), in front of the wildcard symbol you are comparing against. If you want to use a backslash in a comparison string, you can specify a different escape symbol with the ESCAPE clause.

 <  string  > LIKE <  pattern_string  > [ESCAPE <'  alternate_escape_character  '>]

Figure 5-9 shows examples of the LIKE function, its wildcard symbols, and setting a different escape symbol.

Figure 5-9: Examples of the LIKE function

STRCMP The STRCMP function, short for String Comparison, takes two arguments, and returns a number depending on whether they partially match or don t match. If the two strings match, STRCMP returns a value of zero. If the first argument is a smaller value than the second argument, according to the current sort order, STRCMP returns a value of negative one. For any other comparison result, STRCMP returns a value of one.

 STRCMP(<  string1  >, <  string2  >)

Note	Starting with MySQL version 4.0, STRCMP ceased to be case-sensitive unless one or both of the arguments are binary strings.

Figure 5-10 shows examples resulting in the three return values of STRCMP:

Figure 5-10: Examples of the STRCMP function

In the first example, the strings "duck" and "Duck" return a value of 0, because STRCMP perceives them as matching since it is not case-sensitive.
In the second example, the strings "duck" and "wolf" return a value of -1, because the first string "duck" comes before the second string "wolf" in an alphanumeric sort.
In the third example, the strings "duck" and "bat" return a value of 1, because they do not match, nor is the first string smaller according to the sort order.
In the fourth example, the two strings from the third example are in the reverse order, making "bat" the first string and "duck" the second string, and STRCMP returns a value of -1, because the first string comes before the second string in the sort order.

Manipulating String Functions

A variety of string functions allow you to manipulate the data in useful ways. The ones described here are some of the most useful string functions.

REVERSE and REPEAT The REVERSE function takes in a string and returns it in reverse order.

 REVERSE(<  string  >)

The REPEAT function takes in a string and a count and returns a string with the contents of the specified string repeated count times.

 REPEAT(<  string  >, <  count  >)

For example, the command SELECT REVERSE("desserts"); returns the string value stressed. The command SELECT REPEAT("Love", 3); returns the string value LoveLoveLove.

LPAD, RPAD, UPPER, and LOWER The LPAD and RPAD functions, short for Left Padding and Right Padding, respectively, take in a string , length , and padding_string , and return a string that contains the specified string enlarged using the padding_string until it equals the value of length . The LPAD function pads to the left of the string, and the RPAD function pads to the right of the string.

 LPAD(<  string  >, <  length  >, <  padding_string  >) RPAD(<  string  >, <  length  >, <  padding_string  >)

The UPPER and LOWER functions take in a string and convert it to all uppercase or all lowercase characters, respectively.

 UPPER(<  string  >) LOWER(<  string  >)

Figure 5-11 shows examples of the LPAD, RPAD, UPPER, and LOWER functions.

Figure 5-11: Examples of the LPAD, RPAD, UPPER, and LOWER functions

TRIM, LTRIM, and RTRIM The TRIM function takes in a remove_string and a string , and trims the remove_string off the specified string at the beginning, end, or both, using the reserved word LEADING, TRAILING, or BOTH. If no specifiers are provided, the default is BOTH. If no remove_string is supplied, TRIM removes any spaces at the beginning, end, or both.

 TRIM(<  string  >) TRIM([[LEADINGTRAILINGBOTH] <  remove_string  > FROM] <  string  >)

If you are removing spaces from only one side of the string or the other, you can also use LTRIM or RTRIM, which take in a string and remove any spaces from the left or right side of the string, respectively.

 LTRIM(<  string  >) RTRIM(<  string  >)

Figure 5-12 shows examples of the TRIM, LTRIM, and RTRIM functions.

Figure 5-12: Examples of the TRIM, LTRIM, and RTRIM functions

LEFT, MID, and RIGHT The LEFT and RIGHT functions take in a string and length , and return the leftmost or rightmost length of the string , respectively. LEFT counts in to the value of length from the left, and RIGHT counts in to the value of length from the right.

 LEFT(<  string  >, <  length  >) RIGHT(<  string  >, <  length  >)

The MID function works exactly like the SUBSTRING(< string >, < position >, < length >) function, with the added bonus that it is shorter to type.

 MID(<  string  >, <  position  >, <  length  >)

Given the string "John Paul George Ringo", the LEFT, RIGHT, and MID functions work as follows:

SELECT LEFT("John Paul George Ringo", 4) will return the string value "John", because it starts at the left of the string and selects the first four characters it finds.
SELECT RIGHT("John Paul George Ringo", 5) will return the string value "Ringo", because it starts at the right of the string and selects the first five characters it finds.
SELECT MID("John Paul George Ringo", 11, 6) will return the string value "George", because it counts in from the left of the string 11 places, starts counting anew from that eleventh place, and selects the next six characters it finds.

CHAR and ASCII The CHAR function interprets the arguments as integers and returns the characters given by the ASCII code for those numbers. Null values are ignored.

 CHAR(  N  ,)

The ASCII function takes in a string and returns the ASCII code value for the leftmost character in the string. If a numeric value is supplied instead of a string, MySQL will automatically convert it to its string equivalent. This function is case-sensitive.

 ASCII(<  string  >)

As an example, the command SELECT CHAR(74, 101, 116) returns the string value "Jet". The command SELECT ASCII("James") returns the ASCII code value for the leftmost character of the string, in this case, 74. If either "Jet" or "James" had a lowercase j , the corresponding ASCII number would be 106.

Using Date Functions

The date functions allow you to manipulate time-related information. You can use them to specify date and time formats, and to get date and time information.

DATE_FORMAT

The DATE_FORMAT function takes in a date and a format , and then returns a string containing the date displayed in accordance with the specified format .

 DATE_FORMAT(<  date  >, <  format  >)

You indicate the formatting by using a selection of symbols from a set of 22 specifiers, as shown in Table 5-3.

Table 5-3: DATE_FORMAT Specifiers
Day
Specifier	Description
%a	Day of the week in 3-letter abbreviation (Sun, Mon, etc.)
%D	Day of the month abbreviated (1 ^st , 2 ^nd , etc.)
%d	Day of the month in 2-digit numeric indicator (00-31)
%e	Day of the month in numeric indicator (0-31)
%j	Day of the year in 3-digit display (001-366)
%W	Day of the week in full word (Sunday, Monday, etc.)
%w	Day of the week in numeric form (1=Sunday, 2=Monday, etc.)
Month
%b	Month of the year in 3-letter abbreviation (Jan, Feb, etc.)
%c	Month of the year in numeric indicator (0-12)
%M	Month of the year in full word (January, February, etc.)
%m	Month of the year in 2-digit numeric indicator (00 “12)
Year
%X	Year in 4-digit numeric display, beginning with Sunday, for use with %V
%x	Year in 4-digit numeric display, beginning with Sunday, for use with %v
%Y	Year in 4-digit numeric display
%y	Year in 2-digit numeric display
Week
%U	Week of the year (00 “53), beginning with Sunday
%u	Week of the year (00 “53), beginning with Monday
%V	Week of the year (01 “53), beginning with Sunday, for use with %X
%v	Week of the year (01 “53), beginning with Sunday, for use with %x
Hour
%H	Hour in 2-digit, 24-hour time (00 “23)
%h, %I	Hour 2-digit, in 12-hour time (01 “12)
%k	Hour in 24-hour time (0 “23)
%l	Hour in 12-hour time (1 “12)
Minute
%I	Minute in 2-digit display (00 “59)
%p	Before noon or after noon (AM or PM)
Second
%f	Microseconds in 6-digit display (000000 “999999)
%S, %s	Seconds in 2-digit numeric indicator (00 “59)
Time
%r	Time in 12-hour display, formatted HH : MM : SS , with AM or PM
%T	Time in 24-hour display, formatted HH : MM : SS
Percent
%%	Display of the percent symbol (%)

Figure 5-13 shows examples of the DATE_FORMAT functions using the specifiers %W, %M, %d, and %Y, which refer to the day of the week, the month name, the date, and the four-digit year, respectively.

Figure 5-13: Examples of the DATE_FORMAT function

NOW and INTERVAL

The NOW function does not take any arguments. It returns the current date and time in the following string format: YYYY - MM - DD HH : MM : SS . If you add a + 0, it returns the current date and time in the following numeric format: YYYYMMDDHHMMSS .

 NOW() NOW() + 0

You can add the INTERVAL clause to the NOW function by using the plus sign (+) and one of the following words with a count and date/time modifier: YEAR, MONTH, DAY, HOUR, MINUTE, or SECOND.

 NOW() + INTERVAL <  count  > <  date/time_modifier  >

For example, NOW() + INTERVAL 8 DAY; returns the date and time eight days from the moment the command is run.

Caution

The INTERVAL modifiers are singular words, even if referring to multiple instances. Changing the INTERVAL modifiers to YEARS, MONTHS, DAYS, HOURS, MINUTES, or SECONDS will result in an error message.

DAYNAME, DAYOFWEEK, and DAYOFYEAR

The DAYNAME function takes in a string containing a date in the YYYY - MM - DD format, or a numeric date in the YYYYMMDD format, and returns the day of the week for the specified date . Any other format returns a NULL.

 DAYNAME(<  date  >)

For example, the string version of the DAYNAME command SELECT DAYNAME ("2048-08-19") returns a value of Wednesday. The numeric version SELECT DAYNAME(20911204) returns a value of Tuesday.

The DAYOFWEEK function takes in a date , in either the string or numeric format, and returns a numeric value for the day of the week instead of a string value. It begins counting the day of the week with Sunday: 1=Sunday, 2=Monday, 3=Tuesday, 4=Wednesday, 5=Thursday, 6=Friday, and 7=Saturday.

 DAYOFWEEK(<  date  >)

The DAYOFYEAR function takes in a date , in either the string or numeric format, and returns a numeric value for the day of the year. It begins counting on January 1, so 1=January 1 and so on until 365=December 31, unless it is a leap year, in which case, 366=December 31.

 DAYOFYEAR(<  date  >)

For example, SELECT DAYOFWEEK("2048-08-19") returns the value 4, which corresponds with Wednesday. The command SELECT DAYOFYEAR(20911231) returns the value 365, because it is the last day of the year, given that 2091 is not a leap year.

Getting Information with Null-related Functions

Null-related functions are used to control the flow of data into your output. They allow you to make a return of a NULL value into a communicative output that has more meaning than a blank field.

IFNULL

The IFNULL function takes in two expressions, which can be string or numeric in format. If expression1 is not NULL, the IFNULL function returns expression1 . If expression1 is NULL, the IFNULL function returns expression2 . The return value does not alter the expression s type. For example, if you specify a string in the IFNULL function, a string is returned.

 IFNULL(<  expression1  >, <  expression2  >)

NULLIF

The NULLIF function also takes in two expressions of either type. If expression1 is equal to expression2 , NULL is returned; otherwise , expression1 is returned. In the NULLIF function, the expression2 is never a return value.

 NULLIF (<  expression1  >, <  expression2  >)

You can obtain the same result by using a CASE WHEN statement like CASE WHEN expression2 = expression2 THEN NULL ELSE expression1 , but plainly, the NULLIF function is the simpler, more efficient way of expressing the condition.

Figure 5-14 shows examples of the IFNULL and NULLIF functions. Since the syntax is almost identical, it s important to verify that you are using the one you actually need.

Figure 5-14: Examples of the IFNULL and NULLIF functions

Progress Check

1.	When do you use the control flow functions?
2.	What actions do aggregate functions perform?
3.	What main string function joins one or more strings together into a single string?
4.	Which string functions compare?
5.	Which two string functions change case?
6.	What is the function used to control the way a date and/or time looks in the output?
7.	The syntax of the command CASE WHEN expression2 = expression2 THEN NULL ELSE is the equivalent to what more concise function?

Answers

1.	The control flow functions are used in situations where a specific set of results applies to all, or nearly all, of the data.
2.	The aggregate functions combine numbers by adding, averaging, or counting instances; give the high and low ranges of numbers; and provide a measurement of probability.
3.	The CONCAT function allows for one or more parameters that it will link together to return a result of a single string.
4.	The LIKE and STRCMP functions do comparisons.
5.	The UPPER and LOWER string functions change case to all uppercase or all lowercase, respectively.
6.	The DATE_FORMAT function allows you to control the look of a date and/or time in your output.
7.	It is equivalent to the NULLIF function.