When you set out to design a database system, there are two concepts you simply must be familiar with before you can say you have a solid planning foundation. You need to know the types of things your system will track, and you need to know the characteristics of each of those things. In a recipe list, for example, you track one kind of thing: recipes. A recipe's characteristics are, for example, recipe name, recipe type, calories, ingredients, and cooking time. You could draw it out like this:
Recipe
Here is one thing followed by a collection of its characteristics. A bigger database system may store information about several kinds of things, each with its own set of characteristics. For example, if I want to write a database system for a motorcycle company, I might want to track information about motorcycles, customers, and sales. Now I have three kinds of things, each with its own set of characteristics.
In database design terminology, the things in your database system are called entities. Each entity is a specific, distinct kind of thing, about which you need to track information. This system tracks data about three distinct kinds of things. And each kind of thing has certain characteristics, which in the technical jargon are called attributes. The motorcycle example includes three entities, and each has some specific number of attributes (see Table 5.1).
Motorcycle |
Customer |
Sale |
---|---|---|
Model Number |
First Name |
Customer Name |
Model Year |
Last Name |
Date |
Vehicle ID Number |
Birth Date |
Amount |
Factory Serial Number |
Street Address |
|
Accessories |
City |
|
Manufacturer |
State |
|
Model Name |
ZIP |
The first indispensable step in solid database design is to determine what entities (things) your proposed system needs to track, and what the attributes (characteristics) of each entity are. It's not just the first step, thoughit's also the third, fifth, seventh, and so forth. Your list of things and their characteristics will inevitably change during your analysis, sometimes quite frequently. This is not a bad thing. It's a natural part of database design. You'll inevitably revisit and refine your list of entities and their attributes several times in the course of designing the system.
Roughly speaking, an entity is a class of things that all look more or less alike. In other words, from a database standpoint, you track many instances of an entity, and you track the same kind of information about each one. In a banking system, you'd probably have an entity called Customer because a banking database wants to keep track of many different customers, and wants to record roughly the same kinds of data about each one. (You'll always want to know a customer's birth date, Social Security number, home address, and the like.)
Attributes, on the other hand, refer to the kinds of information you track about each entity. If Customer is an entity in our banking database, birth date, home address, and Social Security number are among the attributes of a customer.
It won't surprise you to learn that entities often correspond to actual database tables, and attributes often correspond to database fields. More likely than not, a banking database will have a Customer table with fields for date of birth, address, and Social Security number.
Note
The entities in these diagrams are purely abstract things. They may or may not translate directly into database tables. Your FileMaker solution may (and almost certainly will) end up with tables that aren't represented on your design diagram.
It's fairly easy to represent entities and attributes in the graphical notation of an ERD. Sometimes it's more convenient to draw an entity without showing any of its attributes, in which case you can draw it in a simple box, as shown in Figure 5.1.
Figure 5.1. A simple preliminary ERD showing entities for customers and accounts, with no attributes shown.
Sometimes it's appropriate to show entities with some or all of their attributes, in which case you can add the attributes as shown in Figure 5.2.
Figure 5.2. An ERD showing entities for customers and accounts, with attributes shown.
Entities Versus Attributes: A Case Study
The focus of this chapter is in taking descriptions of real-world problems and turning them into usable ERDs. As was noted earlier, your first step in trying to model a problem into an ERD is sorting out the entities from the attributes. To see how to tackle this, let's begin with an example of a simple process description:
Maurizio's Fish Shack is ready to go digital. Maurizio sells fish out of his storefront but he's not worried about electronically recording his sales to consumers just yet. He just wants to keep track of all the fish he buys wholesale. Every time he buys a load of fish, he wants to know the kind, the quantity, the cost of the purchase, and the vendor he bought it from. This will give him a better handle on how much he's buying and from whom, and may help him negotiate some volume discounts. |
Now you know the basics of Maurizio's business. Next you need to develop a list of potential entities. Here are some possibilities:
Fish |
Load of fish |
Purchase |
Storefront |
Variety |
Vendor |
Sale |
Quantity |
Volume discount |
Consumer |
Cost |
Tip
Usually the rule of thumb to apply when coming up with a list of possible entities is to pull out every word that's a noun; in other words, every word that represents a specific thing.
These are typically referred to as candidate entities, in that they all represent possible entities in the system. But are they all entities? You can immediately cross "storefront," "sale," and "consumer" off the list, for the simple reason that the process description already says that these are parts of his business that Maurizio doesn't want to automate at this time. That leaves us with the following potential entities:
Fish |
Cost |
Load of fish |
Purchase |
Variety |
Vendor |
Quantity |
Volume discount |
Well, "fish" and "load of fish" look like they refer to the same thing. According to the process description, a load of fish is actually a quantity of fish that Maurizio bought to resell. Put in those terms, it's clearly the same thing as a purchase. Now the list looks like this:
Purchase (of fish) |
Quantity |
Vendor |
Variety |
Cost |
Volume discount |
These all seem like reasonable things to track in a database system. But are they all entities? Remember that an entity is a kind of thing. The thing will probably appear many times in a database, and the system will always track a coherent set of information about the thing. Put that way, a purchase of fish sounds like an entity. You'll record information about many fish purchases in Maurizio's database.
What about something like "cost"? The "cost" in the process description refers to the price Maurizio paid for a load of fish, so cost isn't really an entity. It's the price paid for one load of fish. It's actually a piece of information about a fish purchase because each fish purchase has an associated cost. The same is also true for "variety" and "quantity." These are all attributes of the "purchase" entity.
Then you get to "vendor." A vendor is clearly a category of thing; you'll probably want to store information about many vendors in this database, so you can consider a vendor to be an entity. This leaves "volume discount." Well, that one's a bit tricky. It probably applies to a vendor, and might reasonably be called an attribute of a vendor. If you assume that each vendor may offer a discount of some kind, it makes sense for it to be an attribute of a vendor.
Figure 5.3 shows what the fledgling ERD for this system might look like, with the two entities from the process description and their various attributes.
Figure 5.3. An ERD showing entities for fish purchases and vendors, with attributes shown.
A few things are noteworthy about this diagram. Notice that the entities are called Purchase and Vendor, instead of Purchases and Vendors. When naming entities, it's preferable to name them in the singular, rather than the plural. (You're trying to answer the question "each instance of this thing is a....") In FileMaker, we usually extend that convention to the database table that ends up being built for each entity.
Design as an Iterative Process
Your general task when designing a database (or indeed any piece of software) is to take a set of things in a real-world problem domain and translate them into corresponding things in the software domain. In your software, you create a simplified model of reality. Concepts like "fish purchase" and "fish vendor" in the problem domain turn into concepts like "purchase entity" and "vendor entity" in a design, and may ultimately turn into things like "purchase table" and "vendor table" in the finished database.
But this translation (from problem domain to software) is not a one-way street. It's rare that there's a single, unambiguous software model that corresponds perfectly to a real-world problem. Usually, your software constructs are approximations of the real world, and how you arrive at those approximations depends a lot on the goal toward which you're working.
In general, software design follows an iterative path, meaning you perform a similar set of steps over and over again until you end up with something that's "close enough." For example, in your initial reading of the design problem, you might miss an entity or two. Or you might create entities you don't really need on later examination. Later, as you do more work on the project and learn more about the problem domain, you may revise your understanding of the model. Some entities might disappear and become attributes of other entities. Or some attributes might turn out to be entities in their own right. You might find it's possible to combine two similar entities into one. Or you might find out that one entity really needs to be split in two. We're not trying to make you feel uncertain or hesitant about your design decisions. Just recognize that it's not imperative, or necessarily even possible, to get the design exactly right the first time. You'll revisit your design assumptions frequently over the course of the design process, and this is a natural part of the process.
Part I: Getting Started with FileMaker 8
FileMaker Overview
Using FileMaker Pro
Defining and Working with Fields
Working with Layouts
Part II: Developing Solutions with FileMaker
Relational Database Design
Working with Multiple Tables
Working with Relationships
Getting Started with Calculations
Getting Started with Scripting
Getting Started with Reporting
Part III: Developer Techniques
Developing for Multiuser Deployment
Implementing Security
Advanced Interface Techniques
Advanced Calculation Techniques
Advanced Scripting Techniques
Advanced Portal Techniques
Debugging and Troubleshooting
Converting Systems from Previous Versions of FileMaker Pro
Part IV: Data Integration and Publishing
Importing Data into FileMaker Pro
Exporting Data from FileMaker
Instant Web Publishing
FileMaker and Web Services
Custom Web Publishing
Part V: Deploying a FileMaker Solution
Deploying and Extending FileMaker
FileMaker Server and Server Advanced
FileMaker Mobile
Documenting Your FileMaker Solutions