Preparing Data for Pivot Tables

   

There are some special considerations to keep in mind when you specify the records you want to use in pivot tables. These issues don't apply to external data ranges unless you then use the external data range as the source for a pivot table. The idea is to avoid putting records into the pivot table if they contain null values in a field that you want to use to group records.

Grouping on Date and Time Fields

It often happens that you want to analyze data in a pivot table using a date or time of day as a row, column, or page field. For example, you might want to know the average number of traffic accidents during each hour of the day or during each day of the week.

When the time and/or date that something occurs is important, it's typical to record the occurrence's exact time of day and, often, its specific date. But when you want to analyze the data, it's atypical to care about the exact minute when the event occurred. At least in the realm of descriptive statistics, you seldom pay attention to the fact that two traffic accidents occurred at 4:37 p.m. while one occurred at 4:38 p.m.

So, you would like to group your data according to broader categories than minutes half-hour or one-hour brackets, perhaps. Other analyses, such as tracking a company's revenues, typically rely on even broader categories, such as months and quarters.

Excel pivot tables have a very useful capability that can help out here. If you establish a time or date field in a pivot table's row or column area, you can subsequently create groups based on that field, groups defined by hours, or days, or months, and so on. Figure 6.11 shows an example.

Figure 6.11. A pivot table based on date or time values seldom provides any useful information before you've grouped its dates or times.

graphics/06fig11.jpg


In Figure 6.11, the time at which traffic accidents were reported is shown in a worksheet range beginning in A1, adjacent to a pivot table based on that data range. The summary capabilities of pivot tables have not yet been brought to bear the table just replicates the information in the underlying data range.

Now suppose that you'd like to see the number of accidents on an hourly basis. To do so, you take these steps:

  1. Click any cell in the pivot table's row field, and choose Data, Group and Outline, Group (depending on your version, you can also right-click a cell in the row field and choose Group and Show Detail, Group). The Grouping dialog box shown in Figure 6.12 appears.

    Figure 6.12. The starting and ending values are based on the smallest and largest values in the data source.

    graphics/06fig12.gif


  2. Because the Row field you selected is a date/time field, the dialog box offers by default to group the field into months. Click the Months item to deselect it, and then click the Hours item to select it, as shown in Figure 6.12.

  3. Click OK to group the row field as shown in Figure 6.13.

    Figure 6.13. The format of a grouped field in a pivot table is based on, but not necessarily identical to, the format of the underlying data source.

    graphics/06fig13.jpg


Now a pattern starts to emerge one that was obscured by all the detail in the row field before you grouped it. It becomes apparent that most traffic accidents occur during rush hours. Just as you suspected.

NOTE

If you're using Excel 97, you can group in this way only if you make the field you want to group on a Row field you can't do this with a Column or Page field in Excel 97. After you've created the groups, you can pivot the table to make the Row field either a Column or a Page field. In subsequent versions, you can begin by grouping on a Column field, but (as of Excel 2003) not a Page field. Nevertheless, the logic of the worksheet's dimensions mean that it's sensible to start it off as a Row field because you can accommodate many more ungrouped values that way.


Grouping on Other Numeric Fields

Figure 6.14 shows a similar situation. You have revenue data by date for the theatrical release of a movie that has been appearing in theaters for a few weeks. You couldn't care less about the specific day (or time of day, for that matter) on which the revenue was realized, but you're interested in the dollar amount that was realized by week.

Figure 6.14. Here you'll want to group by date instead of time.

graphics/06fig14.jpg


You begin by creating a pivot table that uses Revenue Date as a Row field and Revenue Amount as a Data field. Then you select the Revenue Date field, choose Data, Group and Outline, clear Month by clicking it in the By list box, and select Days (refer to Figure 6.12). The Number of Days spinner becomes available and you use it to specify seven days. When you click OK, the pivot table is reconfigured to appear as in Figure 6.15.

Figure 6.15. Pivot tables' grouping options don't include Week to group by week, you have to specify seven days.

graphics/06fig15.gif


This is a terrific feature in pivot tables one that makes them so valuable in analyzing time-dependent events. Whether you use Excel 97 and base charts on pivot tables, or use a later version to create pivot charts, the time and date grouping capability is a great way to provide visual analyses.

TIP

You aren't restricted to just one grouping level. You could, for example, choose both Months and Years and you should probably do so if you're analyzing data such as corporate revenues that span more than one year.


Avoiding Null Values

But there's a problem. Pivot tables are unable to group on fields that contain null values. For example, suppose that the underlying data sets shown in Figure 6.11 through 6.15 had an empty cell in, say, A207. The resulting pivot table would look like the one in Figure 6.16.

Figure 6.16. Notice the blank value in C210:D210.

graphics/06fig16.jpg


As soon as you try to group on the field with the blank value, Excel displays the warning message Cannot group that selection, and you will not see the Grouping dialog box. Excel does not say why it can't group that selection, but the reason is usually that there's a null value somewhere in the field you're trying to group.

You can hide that blank value, either with Field Settings or by right-clicking it and selecting Hide from the shortcut menu. That doesn't help if you try again to group the field, you still get the warning message instead of the Grouping dialog box. You can remove the null value from the worksheet list and then refresh the pivot table's data, again to no effect. With a date or time field, after you've admitted a null value to the pivot table, you're not going to be able to group on that field.

The solution is to remove that record from the worksheet list before you create the pivot table. Or, if you've based the pivot table on an external data source, you can use Microsoft Query's Criteria pane to prevent any null values from entering the table in the first place (see Figure 6.17).

Figure 6.17. The criterion specifying Is Not Null prevents any records with a null value on that field from being returned to the workbook.

graphics/06fig17.gif


Therefore, if you need to group date or time values in a pivot table's field, it's a good idea to set up your data source so that it will provide no null values on that field.

graphics/arrow_icon.gif Basing a pivot table on an external data source is covered in detail in "Importing Data to Pivot Tables," p. 104.


Avoiding Null Values in Other Grouped Fields

There are occasions on which you might want to group values other than dates and times. Suppose that you wanted to examine the ages of customers, or clients, or patients in general, people you work with. Their age distribution appears in Figure 6.18.

Figure 6.18. Excel does not recognize the values in column A as dates or times.

graphics/06fig18.gif


When you click one of the cells in the pivot table and choose Data, Group and Outline, Group, the Grouping dialog box shown in Figure 6.19 appears.

Figure 6.19. You can lump many records into one group at the top or bottom of the range by changing the Starting At or Ending At values.

graphics/06fig19.gif


Excel can recognize and divide a date or time field into standard categories such as year, month, day, and minute. In contrast, Excel has to rely on the user to provide brackets for an arbitrary scale, as in Figure 6.19. You know the numbers represent years, but Excel doesn't.

This sort of numeric grouping brings with it the same problem that accompanies dates and times: A null value in the data source causes Excel to say it can't group that field when you ask it to. So, it's wise to use the same sort of criterion, Is Not Null, as shown in Figure 6.17.

There is one difference, though. With a field such as the one shown in Figure 6.18, you don't need to start from scratch if you allow a null value into the pivot table. Just remove it from the table's data source whether a worksheet list or an external data source and refresh the pivot table's data. Now you'll be able to group on that field.



Managing Data with Microsoft Excel
Managing Data with Microsoft Excel
ISBN: 789731002
EAN: N/A
Year: 2003
Pages: 134

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net