3 4
Once you have identified the system requirements, you are ready to develop a logical data model. The data model is essentially an extension of the system requirements. When creating the data model, you are organizing the requirements into a logical representation of the database. The data model includes definitions of entities, their attributes, and entity constraints. The model also includes definitions of the relationships between entities and the constraints on those relationships. This lesson describes how to develop a data model by identifying the entities, their attributes and constraints, and their relationships.
When you gather system requirements for a database design, one of the steps that you take is to define the types of data that the database will contain. These types of data can be separated into categories that represent a logical division of information. In most instances, each category translates to a table object within the database. Normally, there is a set of primary objects, and after they are identified, the related objects become more apparent.
For example, in the Pubs database, one of the primary objects is the Titles table. One of the objects related to the Titles table is the RoySched table, which provides information about the royalty schedules associated with each book. Another object is the TitleAuthor table, which matches authors to books.
By using the categories of data defined in the system requirements, you can start to create a map of the table objects within your new database. For example, suppose you are designing a database for a hotel's reservation system. During the process of gathering system requirements, you identify several categories of data, including rooms, guests, and reservations. As a result, you add tables to your database design that match each of these categories, as shown in Figure 3.9.
Figure 3.9 The primary objects in a database design: the Rooms table, the Reservations table, and the Guests table.
When identifying the business rules for this system, you determined that the hotel has eight types of rooms and that regular guests prefer a certain type of room. As a result, the Rooms table and the Guests table will each include a room type attribute. You decide to create a table for room types, as shown in Figure 3.10.
Figure 3.10 The hotel's reservation database, which includes the RoomType table.
Now, the Rooms table and the Guests table can reference the RoomType table without having to repeat a room description for each room and each guest. In addition, as room types change, you can update the information in one location, rather than having to update multiple tables and records.
Before you can complete the process of defining table objects within the database, you must define the relationships between the tables. Whenever you identify a many-to-many relationship, you will have to add a junction table. Relationships are discussed in more detail later in this lesson.
After you have defined all of the tables that you can define at this point, you can define the columns (attributes) for those tables. Again, you will be taking this information directly from the system requirements in which you identified which types of data should be included with each category of information.
Using the earlier hotel database example, suppose that you determined during the process of gathering system requirements that the Guests category of data should include information about the guests' first names, last names, addresses, telephone numbers, and room preferences. As a result, you plan to add columns to the Guests table for each of these types of information. You also plan to add a unique identifier for each guest, as is the case with any normalized entity. Figure 3.11 shows the Guests table with all of the columns that the table will contain.
Figure 3.11 The Guests table and its attributes.
After you have defined the tables and their columns, you should define the relationships between the tables. Through this process, you might discover that you need to modify the design that you have created to this point.
Start by choosing one of the primary tables and selecting the entities that have relationships to that table. Referring once more to the hotel database used in earlier examples, assume that the system requirements state that all reservations must include room and guest information. Rooms, guests, and reservations are the categories of data. As a result, you can deduce that a relationship exists between rooms and reservations and between guests and reservations. Figure 3.12 shows the relationships between these objects. A line connecting the two tables signifies a relationship. Notice that a relationship also exists between the Rooms table and the RoomType table and between the Guests table and the RoomType table.
Figure 3.12 The relationships that exist between tables in the hotel's reservation database.
Once you establish that a relationship exists between tables, you must define the type of relationship. In Figure 3.12, each relationship (line) is marked at each end (where it connects to the table) with the number 1 or with an infinity symbol (∞). The 1 refers to the one side of a relationship, and the infinity symbol refers to the many side of a relationship.
NOTE
To determine the types of relationships that exist between tables, you should look at the types of data that each table contains and the types of interchange between them. For example, a relationship exists between the Guests table and the Reservations table. The relationship exists because guests must be included in reservation information. According to the business rules, a guest can make one or more reservations, but each reservation record can include the name of only one guest, usually the person who is making the reservation. As a result, a one-to-many relationship exists between the two tables: one guest to many reservations.
A relationship also exists between the Reservations table and the Rooms table. According to the business rules, a reservation can be made for one or more rooms, and a room can be included in one or more reservations (on different dates). In this case, a many-to-many relationship exists: many reservations to many rooms. In a normalized database design, however, many-to-many relationships must be modified by adding a junction table and creating one-to-many relationships between each original table and the junction table, as shown in Figure 3.13.
Figure 3.13 The RoomReserv table as a junction table between the Rooms table and the Reservations table.
At this point in the database design process, you should have the entities, their attributes, and the relationships between entities mapped. Now, you must identify the constraints on the data that will be stored in your tables. Most of your work was already completed when you identified the business rules as you gathered system requirements. As stated previously, business rules include all constraints on a system, including data integrity and security. For this stage of the design process, your focus will be on the constraints specific to the data. You will take the data-related business rules and refine and organize them. You should try to organize the constraints based on the objects that you created in the database, and you should word them in a way that reflects those objects.
Returning again to the database design in Figure 3.13, suppose that one of the business rules is stated as follows: "A guest record can, but is not required to, include one of the predefined room type preferences but cannot include any other room type preference." When defining the data constraints, you should reference the relevant tables and columns and separate them so that they each focus on a single instruction:
When possible, you should organize data constraints according to tables and their columns. In some cases, a constraint applies to the table as a whole, to more than one table, to a relationship between tables, or to data security. In these cases, try to organize the constraints in a way that is the most logical and the most relevant to the project you are working on. The goal of identifying the data constraints is to have a clear road map when creating database objects and their relationships and enforcing data integrity.
In this exercise, you will take the steps necessary to create a logical data model. Much of this exercise involves drawing the tables, entities, and relationships that make up the database. Although you can use a drawing program such as Visio to create these objects, paper and a pencil are all that you really need. If you like, you can later transfer your model to a drawing program. In addition, you will need paper and a pencil to write the data constraints. You can also write these directly to a word processing document or a text document. Whatever method you choose, you should save the result for subsequent exercises. To perform this exercise, you will use the book shop scenario from Exercise 2 in Lesson 3.
Each category represents one of the primary table objects in your database design.
Your drawing should include five tables.
Your next step will be to identify any related tables. At this point, designing a database becomes a little more complicated. A good source to use for determining related tables is the list of business rules that you identified when you gathered the system requirements. Essentially, you are looking for subcategories of information or business rules that lead you to believe that additional tables are necessary. Remember, you can modify the database design as you identify relationships between tables and constraints on data.
For consistency, use the following names for your new tables: OrderStatus, FormOfPayment, Positions, and BookCondition.
You should now have 10 tables.
For each category of data, you defined which information should be included with each category. This information makes up your columns.
For consistency, use the following labels for column names:
Table | Columns |
---|---|
Books | TitleID, AuthorID, Publisher, PubDate, Edition, Cost, SRP, ConditionID, Sold |
BookCondition | ConditionID, ConditionName, Description |
Authors | AuthorID, FirstName, LastName, YearBorn, YearDied, Description |
Employees | EmployeeID, FirstName, LastName, Address1, Address2, City, State, Zip, Phone, DOB, HireDate, PositionID |
Positions | PositionID, Title, JobDescrip |
Customers | CustomerID, FirstName, LastName, Phone, Address1, Address2, City, State, Zip |
Orders | OrderID, CustomerID, EmployeeID, Amount, OrderDate, DeliveryDate, PaymentID, StatusID |
OrderStatus | StatusID, StatusDescrip |
FormOfPayment | PaymentID, PaymentDescrip |
BookOrders | OrderID, BookID |
Notice that the Employees table does not include a column for books purchased and dates of purchases. Because each customer can purchase more than one book, you would not include the information here. You could create a table to store this information, but it would be unnecessary because it would duplicate information that already exists in a database (information that can be derived through views or ad hoc queries).
You are looking for direct relationships. For example, the Books table has a direct relationship with the BookCondition table. BookCondition data applies directly to Books data. In addition, Authors data is directly related to Book data (authors write books). There is also a direct relationship between Books data and BookOrders data (orders include the books being sold).
Notice that there is no direct relationship between the Books table and the Orders table. The relationship between the two tables is indirect and is expressed through the BookOrders table.
Your database design should look similar to the schema in Figure 3.14.
Figure 3.14 Identifying the relationships between tables in the logical data model.
To determine the type of relationship, think in terms of the data associated with each object. For example, a relationship exists between employees and the orders that they generate. An employee can create many orders, but only one employee can create an order. Therefore, a one-to-many relationship exists between the Orders table and the Employees table (one employee can create many orders). The Employees table is on the one side of the relationship, and the Orders table is on the many side.
Your database should now look similar to the schema in Figure 3.15.
Figure 3.15 Identifying the types of relationships between tables in the logical data model.
Which relationship is many-to-many?
You are deleting the relationship between the two tables because a direct relationship no longer exists. Instead, an indirect relationship is created through the BookAuthors table. In addition, the AuthorID column is no longer necessary in the Books table because the book/author relationship is expressed in the BookAuthors table.
Your database design should now look similar to the schema in Figure 3.16.
Figure 3.16 Adding the BookAuthors table to the logical data model.
To which object(s) does this business rule apply?
What are the data constraints?
What are the data constraints for your database design?
The data model includes definitions of entities, their attributes, and entity constraints. The model also includes definitions of the relationships between entities and the constraints on those relationships. One of the first steps that you must take toward creating a data model is to identify the types of data that the database will contain. These types of data can be separated into categories that represent a logical division of information. In most instances, each category translates to a table object within the database. Once you have defined the tables and their columns, you should define the relationship between the tables. To determine the type of relationship that exists between tables, you should look at the types of data that each table contains and the types of interchanges between them. Once you determine the relationships between tables, you must identify the constraints on the data that will be stored in your tables. You should organize data constraints according to tables and their columns (when possible).