Refine the Fields
After you make a first stab at
putting
fields in tables, the
next
step is to refine your fields. A few classic slip-ups in creating fields are easily noticed. I prefer "slip-ups" instead of "mistakes" or "blunders" because they initially appear to be
perfectly
logical solutions to the database problems you face. But they
violate
database principles and will vastly reduce the effectiveness of your database.
Calculated Fields
No value in a table should depend on any other value in the table for its own value.
This basic principle of database design might seem to offer more confusion than enlightenment. Perhaps this restatement will make things a little clearer: You must be able to change the value of any field without
affecting
any other field.
An example should make it plainer still. Your company offers certain discounts on orders for fast payment, closeout sales, volume sales, and so on. The
discounts
can vary product by product within a single order (see Figure 2.2). The unit price for the same product can vary from order to order as well.
Assume that the unit price for Product A in Order #101 is $1,000. Assume that the discount for Product A in Order #101 is 10%. Both of these values would be entered and stored in your database. But you would
not
store the actual amount of the discountthat is, 10% of $1,000, or $100. The $100 amount depends on its value
entirely
from other fields, Unit Price and Discount. Access makes it easy to perform such calculations in other objects, so there is no need to store such
calculated values
in tables.
Are there any calculated values in the fields listed in Table 2.9?
Years on Air
arouses suspicion. If you know the values for
Year Started
and
Year Ended
, you can calculate the
Years on Air
. The
Years on Air
field is unnecessary and should thus be eliminated from the database.
Multipart Fields
A field should contain one distinct item within any value.
A
multipart field
contains two or more kinds of values. Recall the following from the section on database principles:
-
Each row of each column should have only one value.
-
Data should be broken into its smallest
components
.
-
A field should contain values of only one data type, such as
Date
,
Number
, and so on.
A multipart field breaks the first two of these rules and often breaks the third rule as well. Look at the table in Figure 2.2. Suppose that, instead of three separate fields, the quantity, the unit price, and the discount were all included in a single field. This field would contain three distinct items. It would be difficult to
sort
, edit, and delete; it would also be
impossible
to perform calculations on the
quantities
.
Multipart fields are sometimes a lot harder to recognize than this example would
indicate
. For example, consider product codes. They often appear as one long string of
alphanumeric
characters
. But they can comprise several distinct items, including product category, product
subcategory
, manufacturing location, and so on. Whether or not you decide to break up the product code into its
constituent
elements depends on your database needs. Nonetheless, you should be aware that you are dealing with a multipart field.
Consider the fields of the Classic TV database (see Table 2.9). Do any of the
columns
appear to be multipart fields? Some of the fields that provide
extensive
descriptionssuch as
Network Notes
in the Networks table and
Synopsis
in the Programs tablemight seem to be multipart fields. But just as a
sentence
isn't a
run-on
simply because it's long, a field isn't a multipart field simply because there's a lot of information in it. The data in the
Network Notes
,
Synopsis
, and similar columns are perfectly acceptable as fields with the
Memo
data typein other words, a field with the potential to store lots of text in each row, yet in sum containing only one value.
The Classic TV database has, in fact, two multipart fields. The less egregious error is in the Actors table. Currently, only one column is provided for the actor's name. An actor's name comprises at least two separate items, first name and last
name
.
The way to resolve a multipart field is to break the value into its components
parts
and assign each part its own field. Thus, you'll have both
First Name
and
Last Name
fields for actor
names
.
|
Q1:
|
Must a person's name
always
be two or more fields?
|
|
A1:
|
I admit, the decision isn't always clear cut. Note that the Classic TV database also contains a
Network Founder
field (see Table 2.6). I don't suggest that you divide that field into two. In a table of just a few networks, it seems
unlikely
that you'll need to manipulate the founders' first and last names discretely.
The
Contact Name
field in the Customers table in Access's Northwind sample database also uses a single field for both first and last names. But here I think the table design is unfortunate because you can enter only one contact name for each customer. If you had several contacts for one customer, which is not at all unlikely, you could find separate first name and last name fields useful for sorting and easier data manipulation.
I'm not arguing that you always need to put first and last names into separate fields, but you'll want to be cautious about this violation of the "one field, one value" principle.
|
|
Besides
Actor Name
, one other field
violates
the multipart field ruleand much more egregiously. In the Programs table, the
Character Names/Occupations
field clearly contains two types of data: the name of the character and the role (see Table 2.10). This multipart field
presents
definite problems for finding and retrieving data.
Table 2.10. Programs
|
Program Name
|
Character Names/Occupations
|
<<other fields>>
|
|
The Andy Griffith Show
|
Andy Taylor/Sheriff; Barney Fife/Deputy Sheriff
|
|
|
Bewitched
|
Samantha Stevens/Witch; Darrin Stevens/Ad executive
|
|
|
The Lucy Show
|
Lucy Carmichael/Widow; Vivian Bagley/Divorcee
|
|
|
Married with Children
|
Al Bundy/Shoe Salesman; Peg Bundy/Housewife
|
|
For example, suppose you wanted to find all the characters who had the
occupation
of
baker
. If character names and occupations were in the same field, accomplishing this task would be extremely inefficient. Moreover, what if there were characters named Richard Baker or Sarah Baker? Sure, that's a contrived example. But I think you can see the threat that multipart fields would pose to data integrity in a database of any
size
and complexity. Clearly, character names and character occupations should be separated into two fields.
To renew the now-tiresome question of people's names, it can be debated whether
Character Names
should be further divided between the character's first name and last name. It could be helpful for finding all the names of the Cartwright boys on
Bonanza
, or the first name of Bob Hartley's wife on
The Bob Newhart Show
. On the other hand, character names such as Aunt Bea on
The Andy Griffith Show
or Uncle Fester on
The Addams Family
feel like single values. For now, let's keep
Character Names
as one field; we'll make two fields out of it a little later.
There's another big problem with the
Character Names/Occupations
field. Even after segregating character names and occupations, you're left with only one field for all the character names and only one field for occupations. If there are several lead characters in a show, as there almost always are, you'll have a bunch of names in one field and several occupations in the other. (You might want to
sneak
a peak at Table 2.15 to see this table, and then return here.)
Table 2.15. Programs
|
Multivalue Fields
|
|
Program Name
|
Character Names
|
Character Occupations
|
<<other fields>>
|
|
The Andy Griffith Show
|
Andy Taylor, Barney Fife, Gomer Pyle
|
Sheriff, Deputy Sheriff, Gas Station Attendant
|
|
|
Happy Days
|
Richie Cunningham, Arthur Fonzarelli
|
Student, Motorcyclist
|
|
|
Bewitched
|
Samantha Stevens, Darrin Stevens
|
Witch, Ad Executive
|
|
|
The Lucy Show
|
Lucy Carmichael, Vivian Bagley
|
Widow, Divorcee
|
|
|
Married with Children
|
Al Bundy, Peg Bundy
|
Shoe Salesman, Homemaker
|
|
{% if main.adsdop %}{% include 'adsenceinline.tpl' %}{% endif %}
To resolve this problem, though, you'll need to understand how
keys
work, so let me first talk about this essential database topic.
NOTE
The changes to the tables made in the "Calculated Fields" and "Multipart Fields" sections are incorporated in Table 2.11.
Table 2.11. Classic TV Database
|
Current List of Fields
|
|
Programs
|
Genres
|
Networks
|
Actors
|
|
Name
|
Name
|
Official Name
|
First Name
|
|
Year Started
|
Description
|
Popular Name
|
Last Name
|
|
Year Ended
|
|
Founder
|
Gender
|
|
Location
|
|
Network Notes
|
Biography
|
|
Synopsis
|
|
|
|
|
Program Notes
|
|
|
|
|
Character Names
|
|
|
|
|
Character Occupations
|
|