When you have a family portrait taken, the photo-grapher takes the time to make sure the lighting is right, the poses are natural, and everyone smiles their best smile. This preparation ensures that the resulting photo is effective in its purposeto show a beautiful, happy family.
When you create a pivot table report, you're a photographer, taking a snapshot of your data. Taking time to make sure your data looks its best will ensure that your pivot table report is effective in accomplishing the task at hand.
One of the benefits of working in a spreadsheet is that you have the flexibility of laying out your data to suit your needs. Indeed, the layout you choose will depend heavily on the task at hand. However, many of the data layouts that are used for presentations are not appropriate when used as the source data for a pivot table report.
As you read the next section, which goes into preparing your data, keep in mind that pivot tables have only one hard rule as it pertains to your data source. Your data source must have column headingsthat is, labels in the first row of your data describing the information in each column. If this is not the case, your pivot table report cannot be created.
However, just because your pivot table report is created successfully does not mean that it's effective. A host of things can go wrong as a result of bad data preparation, from inaccurate reporting to problems with grouping and sorting.
Let's look at a few of the things you can do to ensure that you end up with a viable pivot table report.
Ensure Your Data Is in a Tabular Layout
A perfect format for the source data in a pivot table is data in a tabular layout. In tabular layout, there are no blank rows or columns. Every column has a heading. Every field has a value in every row. Columns do not contain repeating groups of data.
The data shown in Figure 2.1 is an example of data that is structured properly for a pivot table. There are headings for each column. Even though the values in B2:B6 are all the same model, the model number appears in each cell. Month data is organized down the page instead of across the columns.
Figure 2.1. This data is structured properly for use as a pivot table source.
Tabular layouts are database-centric, meaning that you would most commonly find these types of layouts in databases. These layouts are designed to store and maintain large amounts of data in a well-structured, scalable format.
Use Unique Headings That Occupy Only a Single Row of Data
Every column should have a unique heading that is one row high. Depending on your manager, this can be problematic. You might work for a manager who demands that the heading "Gross Margin" be split with "Gross" in row 1 and "Margin" in row 2.
To overcome this problem, enter a heading in row 1 that is double the height of a normal row. Type the word "Gross." Before leaving the cell, type Alt+Enter followed by "Margin." The result will be a single cell that contains two lines of data.
Avoid Storing Data in Section Headings
Examine the data in Figure 2.2. This spreadsheet shows a report of sales by month and model for the North region of a company. Because the data in rows 2 through 24 all pertains to the North region, the author of the worksheet put a single cell with "North" in B1. This is effective for display of the data, but not effective when used as a pivot table data source.
Figure 2.2. Region and model data is not formatted properly in this dataset.
There are a few other issues with the layout in Figure 2.2. All the data in rows 2 through 6 applies to Model 2500P, so this value is entered once in A2, then formatted vertically and combined with Merge Cells to create an interesting look for the report. Again, although this is a cool format, it is not useful for pivot table reporting. In addition, there are no clear column headings. You can guess that column A is Model, column B is Month, and column C is Sales, but for Excel to create a pivot table, this information must be included in the first row of the data.
Avoid Repeating Groups as Columns
The format shown in Figure 2.3 is very common. A time dimension is presented across several columns. Although it is possible to create a pivot table from this data, it will not be ideal.
Figure 2.3. This matrix format is very common but not effective for pivot tables. The month field is spread across several columns of the report.
This format is showing a data column called "month" spread across several columns. Sales data is reported at the intersection of the model located in column B and the month in row 2.
Also in Figure 2.3, the region column should contain a heading such as "Region" in A2, and the value "North" should be repeated in cells A3:A8.
Eliminate Gaps and Blank Cells in Your Data Source
Delete all empty columns within your data source. An empty column in the middle of your data source will cause your pivot table to fail on creation because the blank column, in most cases, will not have a column name.
Delete all empty rows within your data source. Empty rows may cause you to inadvertently leave out a large portion of your data range, making your pivot table report incomplete.
Fill in as many blank cells in your data source as possible. Although this is not required to create a workable pivot table, blank cells in and of themselves are generally errors waiting to happen. So a good practice is to represent missing values with some logical missing-value code wherever possible.
Apply Appropriate Type Formatting to Your Fields
Formatting your fields appropriately will help avoid a whole host of possible issues, from inaccurate reporting to problems with grouping and sorting.
Make certain that any fields that will be used in calculations are explicitly formatted as a number, currency, or any other format that is appropriate for use in mathematical functions. Fields containing dates should also be formatted as any one of the available date formats.
Summary of Good Data Source Design
The attributes of an effective tabular design are as follows:
CASE STUDY: Cleaning Up Data for Pivot Table Analysis
The worksheet shown in Figure 2.4 is a great looking report. However, it cannot be effectively used as a data source for a pivot table. Can you identify the problems with this dataset?
Figure 2.4. Someone spent a lot of time formatting this report to look good, but what problems prevent it from being used as a data source for a pivot table?
For a great book on learning VBA macro programming, read VBA and Macros for Microsoft Excel by Bill Jelen and Tracy Syrstad (ISBN: 0789731290, Que Publishing). It is another book in the Business Solutions series.
After you make the four changes described here, the data is ready for use as a pivot table data source. As you can see in Figure 2.5, there are headings for every column. There are no blank cells, rows, or columns in the data. The monthly data is now presented down column E instead of across several columns.
Figure 2.5. This data will take up six times as many rows, but it is perfectly formatted for pivot table analysis.