# Section 8.2. Groups of Numbers

### 8.2. Groups of Numbers

A common spreadsheet task is distilling a few important pieces of information out of several pages of data. For example, say you want to hunt through a column looking for minimums and maximums, in order to find the lowest -priced product or best sales quarter. Or maybe you want to calculate averages, means, and percentile rankings to help grade a class of students. In either case, Excel provides a number of useful functions. Most of these are part of the Statistical group, although the SUM( ) function is actually part of the Math & Trig group .

#### 8.2.1. SUM( ): Summing Up Numbers

Almost every Excel program in existence has been called on at least once to do the same thing: add a group of numbers. This task falls to the wildly popular SUM( ) function, which simply adds everything in it. The SUM( ) function takes up to 30 parameters , each of which can be a single cell reference or a range of cells .

Tip: You can specify more than 30 cell references by adding extra parentheses to create subgroups, like so: =SUM((A1,A2),(A3,A4)) . This formula uses the power of ranges. Even though it has the effect of adding four cells' values, it's really adding two ranges , each with two distinct cells. Excel considers each range as a single argument.

Here's a SUM( ) formula that adds two cells:

` =SUM(A1,A2) `

And here's a SUM( ) formula that adds the range of 11 cells from A2 to A12:

` =SUM(A2:A12) `

And here's a SUM( ) formula that adds a range of cells along with a separately referenced cell, and two literal values:

` =SUM(A2:A12,B5,429.1,35000) `

Note: The SUM( ) function automatically ignores any cells in its range with text content or any that are blank. However, SUM( ) will add up calendar dates (which are actually specially formatted numbers, as you saw in Sidebar 2.1). Therefore, make sure you don't sum a range of cells that includes a date.
 POWER USERS' CLINIC Creating Formulas Quickly with AutoSum Summing a row or column of values is so common that Excel has a button dedicated to just that purpose: the AutoSum button on the Standard toolbar. You can find this button by looking for the Greek letter S (capital sigma), which means sum to mathematicians. When you click the AutoSum button, Excel makes an educated guess about what cells you want to total. For example, if you're at the end of a row, Excel assumes you want to add all the numeric values in all the columns on the left. If you're at the bottom of a column of numbers, Excel assumes you want to add these values instead. When you click AutoSum, Excel creates a formula that uses SUM( ) with the cell reference it thinks you need, and it highlights the reference (as shown here). At this point, you can hit Enter to accept the formula as is, type in a new reference, or resize the range with the mouse. The AutoSum feature isn't just for summing. You can also use it to calculate averages, counts, maximums, and minimums. To do so, don't click the AutoSum button directly. Instead, on the right side of the AutoSum button, click the drop-down arrow to get a list of functions. Next , choose the function that you need. Excel inserts this function into your cell, and uses the same auto-guess strategy to pick out the nearby cells for the calculation.

#### 8.2.2. COUNT( ), COUNTA( ), and COUNTBLANK( ): Counting Items in a List

Sometimes you don't need to add a series of cells, but instead want to know how many items there are total (in a list, for instance). That's the purpose of Excel's straightforward counting functions, COUNT( ), COUNTA( ), and COUNTBLANK( ). These functions operate similarly to the SUM( ) function in that they take up to 30 arguments, each of which can be a cell reference or a range of cells.

The COUNT( ) function counts the number of cells that have numeric input (including dates). The COUNTA( ) function counts cells with any kind of content. And finally, the COUNTBLANK( ) function gives you the number of empty cells (see Figure 8-2).

Here's how you might use the COUNT( ) function with a range of cells:

` =COUNT(A2:A12) `

For an example of when COUNT( ) comes in handy, consider the following formula, which determines the average of a group of cells, without requiring that you manually input the total number of values:

` =SUM(A2:A12)/COUNT(A2:A12) `

 Figure 8-2. This worksheet shows the counting functions in action on a range of cells, from A1 to A9. The function names are in column C; the formulas containing the functions are in column D. The COUNT( ) function includes the date cell A9, while the COUNTA( ) function includes all cells that have any kind of information in them.

This formula finds the average of all the cells in the range A2:12 that have values. Because it uses COUNT( ), the average remains correct even if the range includes empty cells (which Excel simply ignores). Of course, Excel already includes an AVERAGE( ) function that can perform this calculation for you, but the COUNT( ) function is still useful in longs lists of data that might contain missing information. For example, imagine you create a worksheet with a list of customer charges. One of the columns in your list is Discount Amount. Using the COUNT( ) function on the discount column, you can find out how many customers got a discount in your list. If you want to get a little fancier, you could use COUNT( ) on the Customer Name column to find out how many rows are in the table. Then you can divide the number of discount sales by the number of total sales to find out the percentage of the time that you sell something for a reduced price.

You can also combine the counting functions to calculate some additional pieces of potentially useful information. If you subtract COUNT( ) from COUNTA( ), for example (that is, if your formula reads something like =COUNTA(A1:A10) - COUNT(A1:A10) ), you'll end up with the number of text cells in the range.

Note: You can also use SUMIF( ) and COUNTIF( ) functions to sum or count cells that meet specific criteria. For more information on using conditional logic, as well as the SUMIF( ) and COUNTIF( ) functions, see pages Section 12.1.3 and Section 12.1.2, respectively.

#### 8.2.3. MAX( ) and MIN( ): Finding Maximum and Minimum Values

The MAX( ) and MIN( ) functions pick the largest or smallest value out of a series of cells. This is a great tool for picking important values (best-selling products, failing students, or historically high temperatures ) out of a long list of information. As with COUNT( ) and SUM( ), the MAX( ) and MIN( ) functions accept up to 30 cell references or ranges.

For example, the following formula examines four cells (A2, A3, B2, and B3) and tells you the largest value:

` =MAX(A2:B3) `

The MAX( ) and MIN( ) functions ignore any non-numeric content, which includes text, empty cells, and Boolean (true or false) values.

Dates are included in MAX( ) and MIN( ) calculations because Excel stores them internally as the number of days that have passed since a particular date. (For Windows versions of Excel, this date is January 1, 1900; see Sidebar 2.1 for details on how this works.) For this reason, it makes little sense to use MAX( ) or MIN( ) on a range that includes both dates and ordinary numeric data. On the other hand, you might want to use MAX( ) to find the most recent (i.e., the largest) date or MIN( ) to find the oldest (or smallest) date. Just make sure you format the cell containing the formula using the Date number format (Section 4.1.2), so that Excel will display the date it has identified.

Excel also provides MAXA( ) and MINA( ) functions, which work just like MAX( ) and MIN( ), except for the way in which they handle text and Booleans. MAXA( ) and MINA( ) always assume TRUE values equal 1 and FALSE values equal 0. All text values are treated as 0. For example, if you have a list of top professionals with a Salary column, and some cells say "Undisclosed", it makes sense to use MAX( ) and MIN( ) to ignore these cells altogether. On the other hand, if you have a list of items you've tried to auction off on eBay with a Sold For column, and some cells say "No Bids", it might make sense to use MINA( ) and MAXA( ) to treat this text as a 0 value.

#### 8.2.4. LARGE( ), SMALL( ), and RANK( ): Ranking Your Numbers

MAX( ) and MIN( ) allow you to grab the largest and smallest numbers, but what if you want to grab something in between? For example, you might want to create a top-10 list of best-selling products, instead of picking out just the single best. In these cases, consider the LARGE( ) and SMALL( ) functions, which identify values that are not quite the highest or lowest. In fact, these functions even let you specify how far from the top and bottom you want to look. Here's how they work.

Both the LARGE( ) and SMALL( ) functions require two arguments: the range you want to search, and the item's position in the list. The list position is where the item would fall if the list were ordered from largest to smallest (for LARGE( )), or from smallest to largest (for SMALL( )). Here's what LARGE( ) looks like:

` LARGE(range, position) `

For example, if you specify a position of 1 with the LARGE( ) function, you'll get the largest item on the list, which is the same result as using MAX( ). If you specify a position of 2, as in the following formula, you'll get the second largest value:

` =LARGE(A2:A12, 2) `

Here's an example formula that adds the three largest entries in a range:

` =LARGE(A2:A12,1) + LARGE(A2:A12,2) + LARGE(A2:A12,3) `

Assuming the range A2:A12 contains a list of monthly expenses, this formula gives you the total of your three most extravagant splurges.

SMALL( ) performs the opposite task by identifying the number that is the smallest, second-smallest, and so on. For example, the following formula gives you the second-smallest number:

` =SMALL(A2:A12, 2) `

And finally, Excel lets you approach this problem in reverse. Using the RANK( ) function, you can find where a specific value falls in the list. The RANK( ) function requires two parts : the number you're looking for and the range you're searching. In addition, you can supply a third parameter that specifies how Excel should order the values before searching. (By default, Excel searches values from highest to lowest.) Here's what the RANK( ) formula looks like:

` RANK(number, range, [order_type]) `

For example, imagine you have a range of cells from A2 to A12 that represent scores on a test. Somewhere in this range is a score of 77. You want to know how this compares to the other marks, so you create the following formula using the RANK( ) function:

` =RANK(77, A2:A12) `

If this formula works out to 5, if means that 77 is the fifth-highest score in the range you indicated. But what if more than one student scored 77? Excel handles duplicates in much the same way as tied times are dealt with in races. If there are three students who score 77, for example, they all tie for fifth place. But the next lowest grade (say, 75) gets a rank of eightthree positions down the list.

If you want to rank values in the reverse order, you need to include a third argument of 0. This orders them from lowest to highest (so that the grade at rank 1 will be the lowest):

` =RANK(77, A2:A12, 0) `

Note: If you use RANK( ) to rank a number that doesn't exist in the data series, you'll receive the #N/A error code in the cell, indicating that Excel can't rank the number because it's not present.

One neat trick is to rank all your numbers in a list. For example, assume you have a list of grades in cells A2 to A12. You want to show the rank of each grade in the corresponding cells B2 to B12. Calculate the rank for A2 like so:

` =RANK(A2, \$A:\$A) `

This formula uses absolute references for the list. That means you can copy this formula to cells B3 through B12 to calculate the rank for the remaining grades, and Excel automatically adjusts the formula to what you need.

#### 8.2.5. AVERAGE( ) and MEDIAN( ): Finding Average or Median Values

Excel makes it easy for you to find the average value for a set of numbers. The AVERAGE( ) function doesn't accomplish anything you couldn't do on your own using COUNT( ) and SUM( ) together, but, well, you could also bake your own bread. Bottom line: the AVERAGE function can be a real timesaver.

The AVERAGE( ) function uses just one argument: the cell range you want to average:

` =AVERAGE(A2:A12) `

The AVERAGE( ) function ignores all empty cells or text values. For example, if the preceding formula had only three numbers in the range A2:A12, Excel would add these values and divide them by 3. If this isn't the behavior you want, you can use the AVERAGEA( ) function, which counts all text cells as though they contain the number 0 (but it still ignores blank cells). (For an example of when text should and shouldn't be treated as 0 values, see Section 8.2.4.)

Note: In some cases, you might want to perform an average that ignores certain values. For example, when determining the average score of all students on a test, you might want to disregard students who scored 0 because they may have been absent. In this case, you can use the conditional summing technique described on Section 12.1.3.

Excel can also help you identify the median value for a set of numbers. If you were to order your range of numbers from lowest to highest, the median would be the number that falls in the middle. (If there are an even amount of numbers then the two middle numbers are averaged to generate the median.)

You calculate the median (see Figure 8-3) in the same way that you calculate an average:

` =MEDIAN(A2:A12) `

Tip: In the Standard toolbar, Excel offers a quick shortcut for five of its most popular functionsSUM( ), AVERAGE( ), COUNT( ), MAX( ), and MIN( ). Just click the button to perform a quick AutoSum on any group of selected cells. Or, if you click the triangle to the right of the button, you'll see a list where you can quickly unleash any of the other four functions.

#### 8.2.6. MODE( ): Finding Numbers that Frequently Occur in a List

MODE( ) is an interesting function that gives you the most frequently occurring value in a data series. For example, if you use MODE( ) with the worksheet shown in Figure 8-3 (like this: =MODE(B2:B10 )), it gives you the number 78. That's because the score 78 occurs twice, unlike all the other scores, which just appear once.

 Figure 8-3. When comparing test scores and class grades, medians are often used instead of averages. The reason? Averages can be unnaturally skewed if there are one or more values that are extremely high or low. For example, if one student turns in a blank paper for a test and receives a 0, the average might be unnaturally low, and not truly representative of a class's performance. This is particularly true when averaging a small set of numbers. Usually, however, the median is quite close to the average.

MODE( ) does have a few limitations, though. It ignores text values and empty cells, so you can't use it to get the most common text entry. It also returns the error code #N/A if no value repeats at least twice. And if there's more than one value that repeats the same number of times, you'll get the highest number.

#### 8.2.7. PERCENTILE( ) and PERCENTRANK( ): Advanced Ranking Functions

If you really want to dissect the test scores from the worksheet in Figure 8-3, you don't need to stop with simple averages, medians, and rankings. You can take a closer look at the overall grade distribution by using the PERCENTILE( ) and PERCENTRANK( ) functions.

Percentiles are often used to split groups of students into categories. (Technically, a percentile is a value on a scale of 1 to 100, indicating the percentage of scorers in any given ranking who are equal to, or below, the scorer in question.) For example, you might decide that the top 10 percent of all the students in your class will gain admission into an advanced class the following semester. In other words, a student's final grade must be better than 90 percent of his or her classmates. Students at this prestigious level occupy the 90 th percentile.

That's where the PERCENTILE( ) function comes in very handy. You supply the cell range, the percentile (as a fraction from 0 to 1), and the function reveals the numeric grade the student needs to match that percentile. For example, here's how you'd calculate the minimum grade needed to enter the 90 th percentile (assuming the grades are in cells B2 to B10):

` =PERCENTILE(B2:B10, 0.9) `

Percentiles make the most sense when you have a large number of values because then the distribution becomes the most regular. In other words, there's a wide range of different test scores that are well spread out, with no " clumping " that might skew your results. Percentiles don't work as well with small amounts of data. For example, if you have fewer than 10 students and try to calculate the grade needed to enter the 90 th percentile, you'll come up with a grade higher than any student has scored so far (and possibly higher than 100%).

PERCENTRANK( ) performs the inverse task. It takes a range of data and a number from 1 to 100, and gives you the fraction indicating the number's percentile. The following formula, for instance, returns the percentile of the grade 81 (as it compares to a list of other grades):

` =PERCENTRANK(B2:B13,81) `

Using the data shown in Figure 8-4, this formula gives you the fraction 0.727, indicating that an 81 falls into the 72 nd percentile (and almost hits the 73rd percentile). That is, it's a better score than 72 percent of all the other students, and a lower score than the other 28 percent. In this case, you can also apply the rounding functions explained earlier in the chapter, because percentiles are (by convention) whole numbers. You can use ROUND( ) to round the result of the formula to the closest percentile (73rd), or use TRUNC( ) to return the highest percentile that the student actually reached (72 nd ), which is the conventional approach.

#### 8.2.8. FREQUENCY( ): Identifying to Which Category a Number Belongs

The FREQUENCY( ) function is the last statistical function you'll learn about in this section. Like the functions you've seen so far, FREQUENCY( ) helps you analyze numbers based on the way they're distributed within a group of other numbers. For example, PERCENTILE( ) and PERCENTRANK( ) let you examine how a single test score ranks in comparison to other students' scores. Functions like AVERAGE( ), RANK( ), and MEDIAN( ) give you other tools to compare how different values stack up against one another.

FREQUENCY( ) is different from these functions in one key way: it doesn't compare values against each other. Instead, it lets you define multiple ranges, and then, after chewing through a list of numbers, tells you how many values on the list fall into each range. For example, you could use FREQUENCY( ) to examine a list of incomes in order to see how many belong in a collection of income ranges that you've identified.

 Figure 8-4. Percentiles make a lot of sense when you're trying to analyze how one value fits into a large range. For example, using these student scores, you can tell that students need an 89.2 to enter the 90 th percentile. Ideally, you'd use a much larger series of data to calculate percentiles as a small sample like this generates widely varying percentiles each time you enter one new score. Keep in mind that percentiles also work wonders for figuring wages , survey data, and even physical characteristics like heights and weights.

FREQUENCY( ) is a little more complicated than the other functions you've seen in this chapter because it gives you what's known as an array (a list of separate values). When a function gives you an array, it returns multiple results that must be displayed in different cells.

In order to use the FREQUENCY( ) function properly, you first need to create an array formula , which displays its results in a group of cells. (Put another way: a plain- vanilla formula occupies one cell; an array formula spans multiple cells.)

A worksheet filled with student test scores, such as the one shown in Figure 8-5, provides an ideal place to learn how the FREQUENCY( ) function works, showing you how many students aced a test and how many botched it. Here's how to use the FREQUENCY( ) function to generate the results shown in cells F6 through F10:

1. Figure out which cells have the numbers you want to analyze.

If you were creating this worksheet, after filling the first two columns with student names and scores, you'd enter the words Grade, Maximum , and Frequency in columns D, E, and F. You'd also enter the relevant grade ranges (in cells D6 to D10) and maximum scores (E6 to E10). The ranges and scores entered here are the traditional values; but of course you could enter any range you wanted.

The FREQUENCY( ) function is interested in two groups of numbers. B2 through B12 are the student scores being analyzed . E6 through E10 represent the ranges that Excel is comparing each student score against. (E6 covers scores between 0 and 49, E7 covers 50 to 59, and so on.) Step 3 explains how these cell ranges become part of the FREQUENCY( ) function as arguments. The array formula itself goes into cells F6 to F10, as explained in the next step.

 Figure 8-5. This worksheet shows the completed analysis performed by the FREQUENCY( ) function. It indicates that four students received a C, while only two received an A. As with any other formula, if you modify the scores or the category maximums, you'd see the results change automatically. If you want to change your FREQUENCY( ) formula, make sure you select all the cells in the array, so you update all of them.

Tip: If you want to calculate the values that fall in a range with no upper maximum (one that simply captures all values from the previous maximum up), just leave the cell that defines the maximum value blank. This category must be the last one you define. For example, in the student grade example, you could leave cell E10 blank, although there's no reason to take this step because you know a student can't score higher than 100.
2. Select the cells where you want to enter the formula (which is also where the results will appear).

One result appears in each cell. In this example, you'd select cells F6 to F10.

3. Type in the formula.

The FREQUENCY( ) function requires two arguments: the cells you want to analyze (B2:B13) and the value ranges the scores you're comparing against (E6:E10). You can manually type in the function arguments, or you can enter them by clicking the worksheet itself.

Here's the actual formula used in this example:

` =FREQUENCY(B2:B13,E6:E10) `

4. When you've finished entering the formula, press Ctrl+Shift+Enter.

You can't simply press Enter, because that will enter only the formula into the first selected cell. Instead, you need to press Ctrl+Shift+Enter to copy the formula into all the result cells. Once you've done this, you'll see the results shown in Figure 8-6.

 Figure 8-6. The QUOTIENT( ) and MOD( ) functions let you break a number down into two pieces. In this case, the worksheet uses QUOTIENT( ) to find out how many pizzas you can afford with a set amount of money, and MOD( ) to find out how much money you'll have left over.

Excel 2010: The Missing Manual
ISBN: 1449382355
EAN: 2147483647
Year: 2003
Pages: 185