Entity Relationship Modeling


  • Define entities. Considerations include entity composition.

    • Specify entity, attributes, and relationships in a logical model.

Relational databases are databases in which data is organized into related objects rather than tied to a file. Each of the objects contained in a database is related to the others in some way. You'll often see the acronym RDBMS as an alternative to Relational Database Management System. Data RDBMS, such as SQL Server, also store other tidbits such as database security, database structure, and all other objectsincluding triggers, views, and stored procedurespertaining to that database.

Based on the paper written by E.F. Codd in 1970, "A Relational Model of Data for Large Shared Data Banks," relational databases store sets of data in relations (often called tables). The tables are often related to one another, but this is not required.

The relational database design, also known as the Entity-Relationship model, was first developed by the database engineer Charles Bachman in 1960 and then later called the Entity-Relational model (ER model) in 1976 by Peter Chen. The Entity Relational model allows a database to be set up in a fairly simple and organized manner. The simplicity of a data model gives it the advantage of flexibility; this benefit enables you to easily create and enhance databases based on business requirements on the fly. Many other model formats are also commonly used, each having similar characteristics for diagramming a potential database structure. See Figure 2.1 for a sample ER model designed using Visio and a portion of the Northwind sample database.

Figure 2.1. ER Source for Northwind.

graphics/02fig01.gif

Modeling data into entities is simple and any number of data modeling tools are available to assist in this design. Microsoft has the Visio product, which is an excellent design tool for a variety of purposes including databases, programming, flowcharting , network design, and more. Many third-party products are also available at little or no cost. However, the most commonly used toolsERWIN and Embarcadero's productsare fairly expensive. Any of these tools will give you the desired functionality to document a logical design. A summary of some of the many products available and some related information is presented in Table 2.1. This chapter uses the Lloyd's Hospital case study as the medium to build a practical and efficient data model.

IN THE FIELD: ERWIN

One of the most popular tools for ER model design used in a lot of business environments is ERWIN, now owned by Computer Associates.

Table 2.1. Database Design Software

Product

Source

License

SmartDraw

www.smartdraw.com

Free 30-day full version

RFFlow

www.rff.com

Feature limited free trial

CaseStudio

www.casestudio.com

Feature limited free trial

DeZign

www.datanamic.com

Feature limited free trial

PDQ Lite

www.patton-patton.com

Free 15-day full version

Visio

www.microsoft.com

$7.50 30-day full version CD

An ER data model consists of three main components that are present in any model format:

  • Entities . In data modeling, entities can be closely compared to the physical element of a table in SQL Server. An entity is a collection of related data elements such as customers or products.

  • Attributes . Attributes are the characteristics given to an entity such as Salary and FirstName. Attributes of an entity represent a set of properties of a data element within an entity. They will become the columns or fields in the physical database.

  • Relationships . Relationships show a logical link between two or more entities. A customer entity may be related to orders; a product entity may be related to sales.

The following three steps are involved in creating a basic data model:

  1. Identify and create entities.

  2. Create entity properties (attributes).

  3. Identify relationships (one-to-many, one-to-one, many-to-many).

Entities

Entities are the main objects in an ER model because they contain all attributes and participate in relationships. Entities represent the things, places, people, and concepts involved in a real-world situation.

An easy way to identify candidates for entities is to keep a lookout for the nouns present in a case scenario. This method is not surprising because nouns resemble all the things an entity can represent. For example, look at this excerpt from the case study: "The pharmacy is available to all our doctors and medicine can be bought by any of our registered patients ."

From the excerpt above, you can apparently determine entity candidates by looking at the nouns: Medicine, Patient, and Doctor.

Four steps need to be undertaken when determining entities for an ER diagram. The following four steps should be carefully observed because they are related to the first part of the Microsoft exam objective mentioned at the beginning of this chapterdefining entities.

  1. Write down key nouns from the case study.

  2. Identify which nouns are really needed as entities.

  3. Draw the entities.

  4. Write a brief description of each entity.

Now you can attempt to complete the first two steps of the data modeling process using the chapter case scenario.

STEP BY STEP

2.1 ER Modeling

Try out the following steps yourself, and then compare your results to the results listed following this Step by Step.

  1. Write down the key nouns from the case scenario on a scrap piece of paper.

  2. Identify which nouns are really going to be needed as entities.

Answer to Step by Step 2.1 : The following are key nouns found in the case scenario representing a solution to Step by Step 2.1:

Patients

Doctors

Surgeons

Nurses

Training

Employees

Medicine

Labs

Machinery

Academic Faculty

Courses

Trainees

A standard documentation technique is to draw entities as rectangular boxes with enough space to hold many attributes. At this point, taking things one step at a time, you aren't quite ready to complete the attributes and relationships for your model. After you have identified the major nouns, you are ready to begin a diagram or sketch for the design. Rough paper or any of the previously mentioned software packages can be used for the design. These diagrams make up what is known as a data model diagram , better known as an entity-relationship diagram .(ERD)

STEP BY STEP

2.2 ER Modeling

To complete the entity portion of an ER model, you need to begin documenting the model. Try out these next two steps now to begin putting the picture together.

  1. Draw the entities on paper and then, if desired, transfer them to computer.

  2. Write a brief description of each entity.

Answer to Step by Step 2.2 : The following are the nouns that would most logically be made into entities. These nouns should have enough significance to be used; the other nouns possibly would be contained within some of these entities.

Patient

Machinery

Employee

Course

Medicine

Trainee

Designing a good database at this stage still does not require you to acquire the knowledge of how databases are implemented in SQL Server; rather it depends on how well you can comprehend a given scenario and its elements. The next section takes a look at entity composition and the defining of attributes.

Defining Attributes

Identifying attributes is the next step in ensuring a successful data modeling process, and moreover is a part of the Microsoft exam objectives for this chapter, defining entity composition, and the sub-objective, specify entity attributes.

Attributes are characteristics or properties defined within a single entity, and they correspond to real-world properties of a place, thing, or concept. An attribute candidate is usually found by examining sentences in the case study that are in close proximity to its residing entity. For instance, take a look at this excerpt from the Lloyd's Hospital case study: "We have 300 employees, most of whom work with a permanent contract, but some work on an hour-to- hour basis." Think of attributes as properties of entities. Therefore, from the excerpt above, you probably should have guessed Contract as being a good candidate. because it is a property of an employee.

Some common issues in database design are:

  • Should this be an entity or an attribute?

  • Should this be an entity or a relationship?

One might model the Contract as an entity, and later make the relationship Employees work under a Contract. This is not the model used in this chapter, but it is worth mentioning that there are many different ways to model the same information. Each variation of a model may work and be correct. This is part of what makes data modeling a painstaking process that requires patience. Whatever model is chosen , try not to jump ahead to the next stage of modeling until you have considered some alternatives.

EXAM TIP

Attributes on the Exam On the exam you will not have to dream up your own attributes, but you may have to select from a list of potential attributes and place them into the correct entity.


Try to find out the attributes that fit each of the Patient, Employee, Medicine, Machinery, Course, and Trainee entities. Test your results with the results provided in Figure 2.2. If you missed an attribute or added an extra attribute that isn't found below, don't feel that it is wrong. Attribute decision is a hard choice and is varied from person to person, depending on your perspective on the scenario. Also remember that this simulation is missing a lot of information, so there are likely to be many more attributes than those listed.

Figure 2.2. Attributes implementation in an Entity-Relationship diagram (ER diagram).

graphics/02fig02.gif

Naming Attributes

For the case study of Lloyd's Hospital, attributes are defined as shown in Figure 2.2.

To create and identify attributes is a developed skill; there is no set method for doing soyou just need to practice. However, there are some good guidelines available that will help you in creating and choosing the best attributes for your data model. The first is how you name your entities. A good name makes an attribute look professional and helps in its readability. Here are some good guidelines that help in naming entities. Remember to always follow these to keep your design consistent.

  • An entity or an attribute should be named in its singular form, thereby implying that it is only a single instance. An instance is a single occurrence of an entity.

  • The use of underscores ( _ ) in naming attributes is not a good habit because special characters have particular meanings in some software packages, and the mixture of text and other characters is difficult to type. Try to distinguish a word from another by using mixed case, as in DeliveryService rather than Delivery_Service.

  • Entity and attribute names should not exceed 128 characters because a table in SQL Server is restricted to that. In fact, names should be kept as small as possible while still providing a meaningful object title. Remember, entities will become tables in SQL Server.

  • Entity names should be unique.

  • Attributes within the same entity should not be the same as any other object name; any naming in SQL Server should be kept unique. Chapter 3 goes into more detail on this point.

Appropriate naming conventions are often developed as a corporate or development standard within an organization. Often mechanisms for shortening names or using common prefixing or suffixing may all be part of a programming team's standard.

Attribute Makeup

Now that you understand how to properly name attributes, it's time to look at how to break attributes down, a process more commonly known as decomposing attributes . Decomposing an attribute takes an attribute from its original form and divides it into its components. A good example of this is the breaking down of the Address attribute. An Address attribute may easily be broken down into attributes that store data as follows :

  • Street. Stores the street address of the user .

  • City. Stores where the user lives.

  • State. Stores which state the user lives in.

  • Zip. Stores the user's zip code.

This process of decomposing an attribute helps you develop a normalized database structure as defined later in this chapter during the discussion of the physical design. Decomposing is also a function of usage as well. If, for example, a person's name is only needed in its full form, then a decision may be made to not break it up into the separate attributes of first and last name.

Decomposing an attribute provides many benefits in contrast to general, built attributes that are not decomposed. First off, data integrity is improved. Data integrity is a measurement of how well data is kept consistent and flawless throughout the whole ER model.

When attributes are decomposed, different methods of ensuring data integrity can be applied to the broken-down segments rather than the attribute as a whole. For example, a Zip attribute can be checked as an integer and City as a string. Secondly, decomposing aids in sorting and improving data retrieval performance.

A generic attribute Address contains the street, city, state, and zip. To just get the state of a customer in Washington, you have to select the whole Address attribute and parse it to find Washington, thereby degrading performance because of the redundant data retrieved. If you just have four separate attributes, you can select the State column and get your results more quickly. This same rule applies to updating data. It's easier to update a single part of Address rather than parsing and then updating the whole Address attribute.

In most cases, an attribute is identified for each entity that will act as a key. This attribute could be a customer number, product number, or simply an attribute to serve no other purpose but to act as a key identifier. When an attribute is added solely for the sake of being an identifier, it is known as a surrogate key . Whether you use a natural or surrogate key, that key will represent an important part in establishing relationships. The term Primary Key is assigned to this attribute. Even though SQL Server does not mandate the use of a Primary Key, one should be defined simply as a matter of good practice.

REVIEW BREAK: General Recap on Data Modeling

When designing a data model, you do not need to know how databases are actually implemented in SQL Server. In fact, the primary goal at this stage is picking the facts out of a given problem definition. A case scenario in whatever form it takes will describe the current situation, problems, and changes a firm is currently undergoing. You need to deduce all tables, columns, and relationships from this preparation.

Entities become the tables when a logical design is transferred to a physical database model. Each entity is made up of a number of properties or attributes that will later become the column definitions of the table. A relationship between entities creates a logical link between the entities and in the physical design will represent the Primary Key and Foreign Key relationship of a constraint.



MCSE Training Guide (70-229). Designing and Implementing Databases with MicrosoftR SQL ServerT 2000 Enterprise Edition
MCSE Training Guide (70-229). Designing and Implementing Databases with MicrosoftR SQL ServerT 2000 Enterprise Edition
ISBN: N/A
EAN: N/A
Year: 2003
Pages: 228

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net