Section 12.1. Conditions in Formulas | Excel 2010: The Missing Manual

12.1. Conditions in Formulas

Chapter 7 gave you a first look at how to use conditional logic when writing Excel formulas. The basic principles are easy: you construct a condition using the logical operators like <, >, =, and <>, and then use this condition with a conditional function . So far, you've only considered one conditional functionIF( )which performs different actions depending on the result of a calculation.

For example, the following formula carries out the operation in either the second or third argument depending on the value of cell A20:

 =IF(A20>10000, A20*5%, A20*3%)

Translation: for values greater than 10,000, Excel executes the formula A20*5% ; otherwise , it carries out the second formula. If, for example, A20 contains the dollar amount of a sales invoice, you can use this formula to determine the commission for a sales person. If the sale exceeds the magic $10,000 amount, a larger 5 percent commission kicks in.

12.1.1. IF( ): Building Conditional Formulas

IF( ) is one of the most useful conditional functions, but it's not the only one. In fact, you'll find five more logical functions in the Logical category in the Insert Function dialog box (choose Insert Function). These include:

TRUE( ) and FALSE( ) . These functions don't take any arguments. The TRUE( ) function simply returns the logical value true , and the FALSE( ) function returns false . You might use these functions to enter logical values directly into a cell or formula, but you won't need them very often.
AND( ) . This function accepts two conditions and returns true if both of them are true. If either condition is false, the AND( ) function returns false .
OR( ) . This function accepts two conditions and returns true if either one of them is true. The OR( ) function returns false only if both conditions are false.
NOT( ) . This function accepts a conditional argument and reverses it. For example, consider the formula =NOT(ISBLANK(A10)) . Assuming A10 is empty, the ISBLANK( ) function returns true ; by contrast, the NOT( ) formula returns the reverse false . The NOT( ) function is rarely required, but sometimes it helps make your logic a little clearer.

On their own, these logical functions don't accomplish much. However, you can combine them in some interesting ways with the IF( ) function. For example, imagine you want to make sure the 5 percent rate (from the sales commission example) kicks in only if the sale exceeds $10,000 and the sale occurs after the year 2003, when the new commission rules came into effect. You can rewrite the earlier formula to take this fact into account using the AND( ) function. If the invoice date is in cell B4, here's what the formula would look like:

 =IF(AND(A20>10000,YEAR(B4)>2003), A20*5%, A20*3%)

Similarly, you might encounter a situation where you want to alter your logic so that the higher commission rate kicks in if at least one of two different criteria is met. In this case, you'd use the OR( ) function.

Tip: The information functions described in Chapter 11 are also quite useful in conditional statements. The "IS" functions, like ISERROR( ) and ISBLANK( ) especially lend themselves to these statements.

You can also choose between more than two options by nesting multiple IF( ) statements in a single formula. (Nesting is a technique that lets you put one function inside another.) For example, the following formula uses a commission of 2 percent if the sale is under $500, a commission of 5 percent if the total is above $10,000, and a 3 percent commission for all other cases.

 =IF(A20<500, A20*2%, IF(A20>10000, A20*5%, A20*3%))

The formula begins by checking the first condition (whether A20 is less than $500). If it is, Excel carries out the first expression, A20*2% . If A20 is not less than $500, Excel moves on to the second expression, which is actually another IF( ) function. It then checks the second condition and chooses between the two remaining commission levels.

Excel allows up to eight nested IF( ) statements in one formula. (This limitation actually applies to all Excel functions. You can't nest functions more than eight levels deep.) Nesting IF( ) statements may get the job done, but it can lead to some extremely complicated formulas, so tread carefully .

Note: In some situations, nesting multiple IF( ) statements may cause the formula to become too complex and error-prone . In these cases, it's often easier to build a custom function that analyzes the information and calculates the commission rate using full-fledged VBA (Visual Basic for Applications) code. You'll see an example of that in Chapter 26.

Along with the functions in the Logical category, Excel also includes two functions in the Math & Trig category that use conditions. These are the COUNTIF( ) and SUMIF( ) functions, as described in the next two sections.

12.1.2. COUNTIF( ): Counting Only the Cells You Specify

To understand the purpose of COUNTIF( ) and SUMIF( ), you need to remember that the COUNT( ) and SUM( ) functions devour everything in their path . But what if you want to pick out specific cells in a range and count or sum only these cells? You might try to use COUNT( ) or SUM( ) in conjunction with an IF( ) statement, but that doesn't solve the problem. For example, consider the following formula:

 =IF(ISBLANK(A1), 0, COUNT(A1:A10))

This formula checks to see if cell A1 is blank. If it is, the formula returns 0. If it isn't blank, the formula returns the count of all occupied cells from A1 to A10. As you can see, the actual counting operation is an all-or-nothing affair. Excel either counts all of the cells or ignores them all. There's no way to count just some of the cells (or, similarly, add just some of the cells). The COUNTIF( ) and SUMIF( ) functions address this issue by letting you specify a condition for each cell .

The COUNTIF( ) function is the most straightforward of the two. It takes two parameters: the range of cells you want to count, and the criteria that a cell needs to satisfy in order to be counted:

 COUNTIF(range, criteria)

The criteria argument is the key to unlocking the real power of COUNTIF( ). The formula tests every cell in your range to see if it meets your criteria, and counts only if it does. With the criteria you can:

Test if a cell matches a specific value.
Test if a cell is greater or less than a specific number.
Test if a cell matches, or is greater or less than, a number in another cell.
Test if a cell contains text that matches a simple pattern.

For example, consider the list of products shown in Figure 12-1, which stretches from row 2 to 42. Counting the number of products is easyyou just use the plain- vanilla count function:

 =COUNT(A2:A42)

This formula returns 41 , which is the total number of non-blank cells in the range.

Figure 12-1. This worksheet shows a list of products. The COUNT( ) function makes it easy to count how many products there are, but COUNTIF( ) gives you the additional ability to count only those products that reach a set price threshold or are in a certain category.

Now, what if you want to count how many products are priced over $500? This challenge calls for the COUNTIF( ) function. Here's the formula:

 =COUNTIF(C2:C42, ">500")

Note that in this case, the formula counts the cells in column C. That's because column C contains the price information for each product, which you need in order to evaluate the condition. (When using the COUNTIF( ) function, the condition in the second argument is always a string, which means you need to make sure to place it inside quotation marks.)

To understand how the criteria argument works, you need to realize that it's not a logical condition like the ones used with the IF( ) statement. Instead, it's a snippet of text that contains part of a condition. When COUNTIF( ) springs into action, it creates a full condition for each cell in its assigned range. For example, in the formula shown earlier, the criteria is >500 . Each time the COUNTIF( ) function tests a cell, it uses this criteria to generate a full-fledged condition on the fly. For example, the first cell in the range is C2, so the condition becomes C2>500 . If the condition is true (and for C2, in Figure 12-1, it is), COUNTIF( ) counts the cell.

Using this logic, you can easily construct other conditional counting formulas. For example, here's how you'd count all the products in the Travel category (column D) by examining the cells in the category column:

 =COUNTIF(D2:D42,"=Travel")

Tip: If your condition uses the equal sign, you can omit it. For example, COUNTIF( ) assumes that the condition " Travel " is equivalent to " =Travel ".

UNDER THE HOOD
How Excel Compares Different Types of Data

The type of data that's in a cellordinary numbers, text, or datesinfluences how Excel compares it in a condition. For example, if you use the greater than (>) and less than (<) symbols with numbers , Excel performs the expected numeric comparison. However, if you use these symbols with text, Excel performs a letter-by-letter alphabetic comparison, which means Excel considers the word apple less than banana because the first letter, a, occurs before b in the alphabet. Comparisons are always case-insensitive, so tRaVeL matches Travel .

Date and time comparisons work the same as numeric comparisons, although it helps to remember that Excel stores dates as a number of days (see Sidebar 2.1 for an explanation) and times as a fractional number between 0 and 1. In other words, Excel deems that dates or times in the past are smaller than more recent dates in a comparison.

Finally, Excel lets you use the asterisk (*) wildcard when you're making text comparisons. The asterisk stands for any sequence of characters . Thus, the criteria " =T* " matches any cell that contains text that starts with the letter T. Here's how you'd use it with the COUNTIF( ) function:

 =COUNTIF(D2:D42,"=T*")

This formula matches any category that starts with the letter T, which includes Travel and Tools.

You can even draw the information you want to use in your condition from another cell. In this case, you simply need to use the text concatenation operator (&) to join together the cell value with the conditional operator you want to use. (See Section 10.1 for an explanation of how concatenation works.) For example, if the reader of your spreadsheet enters the category name in cell G1, you could count matching products using the following formula:

 =COUNTIF(D2:D42,"=" & G1)

This formula joins the equal sign to whatever text is in cell G1. Thus, if G1 has the text Tools, the criteria becomes " =Tools ".

You can use a similar technique to utilize a function in the criteria argument. For example, here's a formula that counts the number of products that are above the average price:

 =COUNTIF(C2:C42, ">" & AVERAGE(C2:C42))

12.1.3. SUMIF( ): Adding Only the Cells You Specify

The SUMIF( ) function follows the same principle as COUNTIF( ). The only difference is that it accepts an optional third argument:

 SUMIF(test_range, criteria, [sum_range])

The first argument is the range of cells you want the criteria to test, the second is the criteria itself, and the third is the range of cells you want to sum. So if the first cell in test_range passes the test (that is, causes the criteria to evaluate to true ), the function adds whatever the first cell is in the sum_range to the total sum. Both test_range and sum_range must have the same number of cellsusually they'll be different columns in the same table.

For example, imagine you want to calculate the sum of all the products in the Travel category, as shown in Figure 12-1. In this case, you'd want to test the cells from the category column (D2:D42) to see whether they contain the word Travel; these cells comprise your test range. But then you would want to add the cells from the price column (C2:C42), and those cells would be your sum range. Here's the formula:

 =SUMIF(D2:D42, "=Travel", C2:C42)

If you omit the sum_range argument, the SUMIF( ) function uses the same range for testing and for summing. You could use this approach to calculate the total value of all products with a price above $500:

 =SUMIF(C2:C42,">500")

Figure 12-2 shows how SUMIF( ) and COUNTIF( ) can work together.

12.1.4. SUMPRODUCT( ): Multiplying Two Ranges

Both the COUNTIF( ) and SUMIF( ) functions suffer from one limitation: they can evaluate cells using only one criteria. There's a solution, however, in the form of a quirky, often overlooked function called SUMPRODUCT( ).

SUMPRODUCT( ) was originally designed to multiply two ranges and add up all the results. In order to perform this feat, the ranges must have the exact same dimensions. To calculate the result, Excel multiplies each value in one range with the corresponding value in the second range. Finally, all of these values are added together for a grand total.

Excel gurus soon discovered that the SUMPRODUCT( ) function had a use far beyond these humble beginnings. When combined with a little conditional logic, SUMPRODUCT( ) can filter out all sorts of information. This ability has given it the reputation as Excel's "magic formula."

Figure 12-2. You can also use the COUNTIF( ) and SUMIF( ) functions in more complicated formulas, like the one shown here. The formula in B13 ignores all scores of 0, which represent students that didn't take the test. The result of 78 percent is clearly quite a bit higher than the average of 68 percent, returned by the AVERAGE( ) function, but it's a more realistic representation of student performance.

To understand how SUMPRODUCT( ) works, you first need to understand a few interesting principles:

Excel can convert true and false values into numbers. In fact, it performs this conversion automatically when you use them in a formula, converting TRUE into 1 and FALSE into 0.
As math geeks already know, 1 and 0 are special values in multiplication. If you multiply something by 1, it remains unchanged. If you multiply it by 0, it disappears. That means that with a little craftiness, you can use 1 and 0 to filter out just the information you want from a column.

For example, consider the price list example shown in Figure 12-1, with product prices in column C. Say you want to show only those products whose prices are greater than $500. You could do it by adding a new column that applies a filter using conditional logic, as shown in Figure 12-3. After inserting the new column, you'd place this formula in cell E2:

 =C2*(C2>500)

When Excel evaluates this formula, it begins with the portion in parentheses. It tests if C2 (the price) is greater than $500. If it is, this condition returns a value of true . If not, this condition returns false . Either way, Excel automatically converts the Boolean TRUE and FALSE values to their numerical equivalent: 1 and 0, respectively.

In other words, if the price is less than $500, the formula in E2 evaluates to . If the price is greater than $500, the condition returns true , and the formula in E2 carries out the formula ( C2*1 ), which simply multiplies the price listed in C1 by 1.

If you copy this formula to all the cells in E2, you'll end up with a column that shows the prices for all products above $500, as shown in Figure 12-3.

Figure 12-3. You can filter out cells with the help of conditional arithmetic. Just create the condition and multiply it by the original cell value. In this example, prices under $500 "disappear" from column E because the condition evaluates to false, which Excel automatically converts to 0.

What you've just seen are the underpinnings of how the SUMPRODUCT( ) function works. SUMPRODUCT( ) performs the same calculation as in Figure 12-3it multiplies two values, one of which is the original cell value, and one of which is a condition that evaluates to true or false . The difference is that SUMPRODUCT( ) works on a whole range of cells, and then it adds all the results together.

Using SUMPRODUCT( ), you can create a formula that calculates the total of all products above $500. Here's what it would look like:

 =SUMPRODUCT((C2:C42)*(C2:C42>500))

In order to perform this calculation, SUMPRODUCT( ) multiplies two tables of numbers. The first table (C2:C42) is the column of prices. The second table (C2:C42>500) is the column of true or false values. Excel creates this table by examining each cell in the range of cells from C2 to C42. Those cells that contain prices that are less than $500 evaluate to 0 and, when multiplied with themselves, "disappear" since their product is 0.

Note: When using conditional arithmetic or the SUMPRODUCT( ) function, it's important to remember to include parentheses to ensure all the conditions are evaluated separately. For example, C2*(C2 >500) is not the same as C2*C2 >500 . In the second example, Excel performs the multiplication before it performs the comparison.

So far, the SUMPRODUCT( ) function isn't doing anything that you can't already accomplish with SUMIF( ). Life becomes more interesting if you want to evaluate more than one condition. For example, the following formula calculates the total of all products between $500 and $1000a feat that's impossible with SUMIF( ).

 =SUMPRODUCT((C2:C42)*(C2:C42>500)*(C2:C42<1000))

In this case, two condition tables are created and multiplied (one for prices above $500 and one for prices below $1,000). The price makes it into the final sum only if it meets both criteria. There's no practical limit to how many conditions you can use with the SUMPRODUCT( ) function, which makes it tremendously powerful.

In addition, you can use SUMPRODUCT( ) just as a counting tool (that is, a tool that counts the number of items in a set, rather than summing up their values). In this case, you simply need to multiply the conditions you want to use. Each true condition evaluates to 1, and the complete expression evaluates to 1, if all the conditions are true, or to 0 if any one of the conditions is false. As a result, the SUMPRODUCT( ) formula ends up counting just the items that meet all the conditions you've specified. For example, here's a formula that counts the number of products that fall between $500 and $1000:

 =SUMPRODUCT((C2:C42>500)*(C2:C42<1000))

Some Excel gurus use SUMPRODUCT( ) to create reports . One example is shown in the two tables in Figure 12-4. The first table (rows 1 through 8) has the source data, which is a list of dates that various contractors worked, and the total number of hours they logged. The second table (rows 12 through 14) uses the SUMPRODUCT( ) function to extract information from the source table, to figure out things like how many hours each programmer has spent programming or testing.

For example, cell B14 is designed to answer the question: how many hours has Ella spent on testing? The following SUMPRODUCT( ) formula solves this question using two conditions. First, it scans column C in order to find all rows that have Ella's name. Then, it scans column B to determine what type of work Ella did. Finally, these two condition tables are multiplied with the number of hours spent working. Here's the final formula:

 =SUMPRODUCT((C2:C8=B12)*(B2:B8=A14)*(D2:D8))

The formula returns the total number of hours from rows that matched both conditions, indicating that Ella has done testing for a total of 16 hours.

Figure 12-4. SUM-PRODUCT( ) can be a godsend when boiling down large amounts of information. Of course, if you have an extremely large list, using multiple conditions could affect performance, because Excel must build a separate table for each condition in order to evaluate the formula.