Preparing Your Data for Pivot Table Reporting


When you have a family portrait taken, the photo-grapher takes the time to make sure the lighting is right, the poses are natural, and everyone smiles their best smile. This preparation ensures that the resulting photo is effective in its purposeto show a beautiful, happy family.

When you create a pivot table report, you're a photographer, taking a snapshot of your data. Taking time to make sure your data looks its best will ensure that your pivot table report is effective in accomplishing the task at hand.

One of the benefits of working in a spreadsheet is that you have the flexibility of laying out your data to suit your needs. Indeed, the layout you choose will depend heavily on the task at hand. However, many of the data layouts that are used for presentations are not appropriate when used as the source data for a pivot table report.

As you read the next section, which goes into preparing your data, keep in mind that pivot tables have only one hard rule as it pertains to your data source. Your data source must have column headingsthat is, labels in the first row of your data describing the information in each column. If this is not the case, your pivot table report cannot be created.

However, just because your pivot table report is created successfully does not mean that it's effective. A host of things can go wrong as a result of bad data preparation, from inaccurate reporting to problems with grouping and sorting.

Let's look at a few of the things you can do to ensure that you end up with a viable pivot table report.

Ensure Your Data Is in a Tabular Layout

A perfect format for the source data in a pivot table is data in a tabular layout. In tabular layout, there are no blank rows or columns. Every column has a heading. Every field has a value in every row. Columns do not contain repeating groups of data.

The data shown in Figure 2.1 is an example of data that is structured properly for a pivot table. There are headings for each column. Even though the values in B2:B6 are all the same model, the model number appears in each cell. Month data is organized down the page instead of across the columns.

Figure 2.1. This data is structured properly for use as a pivot table source.


Tabular layouts are database-centric, meaning that you would most commonly find these types of layouts in databases. These layouts are designed to store and maintain large amounts of data in a well-structured, scalable format.

Use Unique Headings That Occupy Only a Single Row of Data

Every column should have a unique heading that is one row high. Depending on your manager, this can be problematic. You might work for a manager who demands that the heading "Gross Margin" be split with "Gross" in row 1 and "Margin" in row 2.

To overcome this problem, enter a heading in row 1 that is double the height of a normal row. Type the word "Gross." Before leaving the cell, type Alt+Enter followed by "Margin." The result will be a single cell that contains two lines of data.

Avoid Storing Data in Section Headings

Examine the data in Figure 2.2. This spreadsheet shows a report of sales by month and model for the North region of a company. Because the data in rows 2 through 24 all pertains to the North region, the author of the worksheet put a single cell with "North" in B1. This is effective for display of the data, but not effective when used as a pivot table data source.

Figure 2.2. Region and model data is not formatted properly in this dataset.


There are a few other issues with the layout in Figure 2.2. All the data in rows 2 through 6 applies to Model 2500P, so this value is entered once in A2, then formatted vertically and combined with Merge Cells to create an interesting look for the report. Again, although this is a cool format, it is not useful for pivot table reporting. In addition, there are no clear column headings. You can guess that column A is Model, column B is Month, and column C is Sales, but for Excel to create a pivot table, this information must be included in the first row of the data.

Avoid Repeating Groups as Columns

The format shown in Figure 2.3 is very common. A time dimension is presented across several columns. Although it is possible to create a pivot table from this data, it will not be ideal.

Figure 2.3. This matrix format is very common but not effective for pivot tables. The month field is spread across several columns of the report.


This format is showing a data column called "month" spread across several columns. Sales data is reported at the intersection of the model located in column B and the month in row 2.

Also in Figure 2.3, the region column should contain a heading such as "Region" in A2, and the value "North" should be repeated in cells A3:A8.

Eliminate Gaps and Blank Cells in Your Data Source

Delete all empty columns within your data source. An empty column in the middle of your data source will cause your pivot table to fail on creation because the blank column, in most cases, will not have a column name.

Delete all empty rows within your data source. Empty rows may cause you to inadvertently leave out a large portion of your data range, making your pivot table report incomplete.

Fill in as many blank cells in your data source as possible. Although this is not required to create a workable pivot table, blank cells in and of themselves are generally errors waiting to happen. So a good practice is to represent missing values with some logical missing-value code wherever possible.

Apply Appropriate Type Formatting to Your Fields

Formatting your fields appropriately will help avoid a whole host of possible issues, from inaccurate reporting to problems with grouping and sorting.

Make certain that any fields that will be used in calculations are explicitly formatted as a number, currency, or any other format that is appropriate for use in mathematical functions. Fields containing dates should also be formatted as any one of the available date formats.

Summary of Good Data Source Design

The attributes of an effective tabular design are as follows:

  • The first row of your data source is made up of field labels or headings that describe the information in each column.

  • Each column in your data source represents a unique category of data.

  • Each row in your data source represents individual items in each column.

  • None of the column names in your data source double as data items that will be used as filters or query criterion (that is, names of months, dates, years, names of locations, names of employees).

CASE STUDY: Cleaning Up Data for Pivot Table Analysis

The worksheet shown in Figure 2.4 is a great looking report. However, it cannot be effectively used as a data source for a pivot table. Can you identify the problems with this dataset?

1.

The model information does not have its own column. Model information appears in the Region column. To correct this problem, insert a new column for Model and include the model number on every row.

2.

There are blank columns and rows in the data. Column D in the figure should be deleted. The blank rows between models (such as rows 7 and 15) should be deleted.

3.

Blank cells are used to present the data in an outline format. The person reading this worksheet would probably assume that cells B10:B11 are in the New England market and cell A11 is in the North region. These blank cells need to be filled in with the values from above.

TIP

There is a trick to filling in the blank cells. Select the entire range of data. Select GoTo from the Edit menu. Choose the Special button in the GoTo dialog box. In the GoTo Special dialog box, select Blanks. With all the blank cells selected, type a formula of = <up arrow> then type Ctrl+Enter to fill this formula in all blank cells. Remember to copy and paste special values to convert the formulas to values.

4.

The worksheet presents one data columnthe data containing monthas several columns in the worksheet. Columns E through J need to be reformatted as two col umns. Place the month name in one column and the sales for that month in the next column. This step either requires a fair amount of copying and pasting or a few lines of VBA macro code.

Figure 2.4. Someone spent a lot of time formatting this report to look good, but what problems prevent it from being used as a data source for a pivot table?


TIP

For a great book on learning VBA macro programming, read VBA and Macros for Microsoft Excel by Bill Jelen and Tracy Syrstad (ISBN: 0789731290, Que Publishing). It is another book in the Business Solutions series.


After you make the four changes described here, the data is ready for use as a pivot table data source. As you can see in Figure 2.5, there are headings for every column. There are no blank cells, rows, or columns in the data. The monthly data is now presented down column E instead of across several columns.

Figure 2.5. This data will take up six times as many rows, but it is perfectly formatted for pivot table analysis.




    Pivot Table Data Crunching
    Pivot Table Data Crunching for Microsoft Office Excel 2007
    ISBN: 0789736012
    EAN: 2147483647
    Year: 2003
    Pages: 140

    flylib.com © 2008-2017.
    If you may any questions please contact us: flylib@qtcs.net