Refine the Fields


After you make a first stab at putting fields in tables, the next step is to refine your fields. A few classic slip-ups in creating fields are easily noticed. I prefer "slip-ups" instead of "mistakes" or "blunders" because they initially appear to be perfectly logical solutions to the database problems you face. But they violate database principles and will vastly reduce the effectiveness of your database.

Calculated Fields

No value in a table should depend on any other value in the table for its own value. This basic principle of database design might seem to offer more confusion than enlightenment. Perhaps this restatement will make things a little clearer: You must be able to change the value of any field without affecting any other field.

An example should make it plainer still. Your company offers certain discounts on orders for fast payment, closeout sales, volume sales, and so on. The discounts can vary product by product within a single order (see Figure 2.2). The unit price for the same product can vary from order to order as well.

Figure 2.2. This selection of records from the Order Details table of the Northwind database shows that the company offers various discounts on its products, and the discounts can vary within a single order. Although it is not indicated in the figure, the same product can have various discounts, depending on the order.


Assume that the unit price for Product A in Order #101 is $1,000. Assume that the discount for Product A in Order #101 is 10%. Both of these values would be entered and stored in your database. But you would not store the actual amount of the discountthat is, 10% of $1,000, or $100. The $100 amount depends on its value entirely from other fields, Unit Price and Discount. Access makes it easy to perform such calculations in other objects, so there is no need to store such calculated values in tables.

Are there any calculated values in the fields listed in Table 2.9? Years on Air arouses suspicion. If you know the values for Year Started and Year Ended, you can calculate the Years on Air. The Years on Air field is unnecessary and should thus be eliminated from the database.

Multipart Fields

A field should contain one distinct item within any value. A multipart field contains two or more kinds of values. Recall the following from the section on database principles:

  • Each row of each column should have only one value.

  • Data should be broken into its smallest components.

  • A field should contain values of only one data type, such as Date, Number, and so on.

A multipart field breaks the first two of these rules and often breaks the third rule as well. Look at the table in Figure 2.2. Suppose that, instead of three separate fields, the quantity, the unit price, and the discount were all included in a single field. This field would contain three distinct items. It would be difficult to sort, edit, and delete; it would also be impossible to perform calculations on the quantities.

Multipart fields are sometimes a lot harder to recognize than this example would indicate. For example, consider product codes. They often appear as one long string of alphanumeric characters. But they can comprise several distinct items, including product category, product subcategory, manufacturing location, and so on. Whether or not you decide to break up the product code into its constituent elements depends on your database needs. Nonetheless, you should be aware that you are dealing with a multipart field.

Consider the fields of the Classic TV database (see Table 2.9). Do any of the columns appear to be multipart fields? Some of the fields that provide extensive descriptionssuch as Network Notes in the Networks table and Synopsis in the Programs tablemight seem to be multipart fields. But just as a sentence isn't a run-on simply because it's long, a field isn't a multipart field simply because there's a lot of information in it. The data in the Network Notes, Synopsis, and similar columns are perfectly acceptable as fields with the Memo data typein other words, a field with the potential to store lots of text in each row, yet in sum containing only one value.

The Classic TV database has, in fact, two multipart fields. The less egregious error is in the Actors table. Currently, only one column is provided for the actor's name. An actor's name comprises at least two separate items, first name and last name.

The way to resolve a multipart field is to break the value into its components parts and assign each part its own field. Thus, you'll have both First Name and Last Name fields for actor names.

Q&A

Q1:

Must a person's name always be two or more fields?

A1:

I admit, the decision isn't always clear cut. Note that the Classic TV database also contains a Network Founder field (see Table 2.6). I don't suggest that you divide that field into two. In a table of just a few networks, it seems unlikely that you'll need to manipulate the founders' first and last names discretely.

The Contact Name field in the Customers table in Access's Northwind sample database also uses a single field for both first and last names. But here I think the table design is unfortunate because you can enter only one contact name for each customer. If you had several contacts for one customer, which is not at all unlikely, you could find separate first name and last name fields useful for sorting and easier data manipulation.

I'm not arguing that you always need to put first and last names into separate fields, but you'll want to be cautious about this violation of the "one field, one value" principle.


Besides Actor Name, one other field violates the multipart field ruleand much more egregiously. In the Programs table, the Character Names/Occupations field clearly contains two types of data: the name of the character and the role (see Table 2.10). This multipart field presents definite problems for finding and retrieving data.

Table 2.10. Programs

Program Name

Character Names/Occupations

<<other fields>>

The Andy Griffith Show

Andy Taylor/Sheriff; Barney Fife/Deputy Sheriff

 

Bewitched

Samantha Stevens/Witch; Darrin Stevens/Ad executive

 

The Lucy Show

Lucy Carmichael/Widow; Vivian Bagley/Divorcee

 

Married with Children

Al Bundy/Shoe Salesman; Peg Bundy/Housewife

 


For example, suppose you wanted to find all the characters who had the occupation of baker. If character names and occupations were in the same field, accomplishing this task would be extremely inefficient. Moreover, what if there were characters named Richard Baker or Sarah Baker? Sure, that's a contrived example. But I think you can see the threat that multipart fields would pose to data integrity in a database of any size and complexity. Clearly, character names and character occupations should be separated into two fields.

To renew the now-tiresome question of people's names, it can be debated whether Character Names should be further divided between the character's first name and last name. It could be helpful for finding all the names of the Cartwright boys on Bonanza, or the first name of Bob Hartley's wife on The Bob Newhart Show. On the other hand, character names such as Aunt Bea on The Andy Griffith Show or Uncle Fester on The Addams Family feel like single values. For now, let's keep Character Names as one field; we'll make two fields out of it a little later.

There's another big problem with the Character Names/Occupations field. Even after segregating character names and occupations, you're left with only one field for all the character names and only one field for occupations. If there are several lead characters in a show, as there almost always are, you'll have a bunch of names in one field and several occupations in the other. (You might want to sneak a peak at Table 2.15 to see this table, and then return here.)

Table 2.15. Programs

Multivalue Fields

Program Name

Character Names

Character Occupations

<<other fields>>

The Andy Griffith Show

Andy Taylor, Barney Fife, Gomer Pyle

Sheriff, Deputy Sheriff, Gas Station Attendant

 

Happy Days

Richie Cunningham, Arthur Fonzarelli

Student, Motorcyclist

 

Bewitched

Samantha Stevens, Darrin Stevens

Witch, Ad Executive

 

The Lucy Show

Lucy Carmichael, Vivian Bagley

Widow, Divorcee

 

Married with Children

Al Bundy, Peg Bundy

Shoe Salesman, Homemaker

 


To resolve this problem, though, you'll need to understand how keys work, so let me first talk about this essential database topic.

NOTE

The changes to the tables made in the "Calculated Fields" and "Multipart Fields" sections are incorporated in Table 2.11.

Table 2.11. Classic TV Database

Current List of Fields

Programs

Genres

Networks

Actors

Name

Name

Official Name

First Name

Year Started

Description

Popular Name

Last Name

Year Ended

 

Founder

Gender

Location

 

Network Notes

Biography

Synopsis

   

Program Notes

   

Character Names

   

Character Occupations






Hands-On Microsoft Access(c) A Practical Guide to Improving Your Access Skills
Hands-On Microsoft Access: A Practical Guide to Improving Your Access Skills
ISBN: 0321245458
EAN: 2147483647
Year: 2005
Pages: 169
Authors: Bob Schneider

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net