Design Process Overview | Professional UML with Visual Studio. NET. Unmasking Visio for Enterprise Architects

Chapter 8 - Database Modeling with Visio for Enterprise Architects

byAndrew Filevet al.?
Wrox Press ©2002


	Team FLY

We begin by looking at where the details of the rest of the chapter fit into the development of a business solution. When we first set out to build a project, we have an idea of what the solution will entail and how it will be built. It's getting there by extracting the details of our idea that is the aim of the design process. In order to prevent us from getting lost in these details, however, our aim in the next section is to provide a bigger picture, so that you can have an idea of how and why VS.NET's data-modeling capabilities can help you develop better solutions.

Database Modeling

When a problem is presented to us, we first need to understand the nature of it. The only ones who can do this are the ones who have the problem in the first place. In order to help them, we need them to communicate it to us. The main concern at this point is this communication. But how can we communicate effectively when the parties come from different backgrounds? Finding a common ground that is simple and intuitive between the technical domain and the problem domain is what a modeling solution hopes to accomplish. Neither party has to know the details of what the other side is doing when they can both agree that the model is correct. In this way, a model is like a contract (interface in object-oriented (OO) parlance) between the two groups. This will become clearer later on in the chapter when we see, as an example, that as long as they can both agree on the logical ER diagram, the programmer can develop their part of the solution independent of the physical database and the DBA can perform their optimization and administration however they please. Now replace DBA and Programmer with Programmer and Domain expert and you'll have an idea of where ORM fits in.

A concise description of the problem creates documentation. This is another benefit of modeling: documentation as an artifact is a by-product of the modeling process. Documentation should also serve as a defined deliverable, which can be used to measure progress of the project and minimize ambiguity and assumptions made on the part of both sides. The less assumptions made by both sides, the less risk is involved.

Visual Studio .NET for Enterprise Architects will greatly speed the iteration of the design steps and insulate the architect from the risk involved in changing scope or design decisions later on in the development cycle. One way this is accomplished is through the automation of mapping from one design phase to the other. While the improvements brought forth in this new tool are many, the database modeling aspects represent a tremendous leap in the development of effective databases. Most of us have used tools that can create DBMS systems based on logical diagrams and models. Many different tools exist that accomplish only this. In fact, most DBMS systems come with a suite of tools that will allow the architect to map out the database using the most popular of these models, the Entity Relationship diagram (ERD). These diagrams have become the de facto standard today and represent one layer of abstraction above the physical database. They can be used in a number of ways to directly manipulate or create scripts that can update the system to reflect changes made in the source model. This process has been greatly refined in the latest version of Visio for Enterprise Architects.

Although this is the way it has been done for some time and it is still part of the process of creating the system, Entity Relationship Diagrams were only developed to ease one aspect of the development cycle, namely the logical design phase. In order to address the other steps, VS.NET now tightly integrates Object Role Modeling (ORM).

ORM has been around since the mid 1970's and is at a level of maturity where a formal specification has been adopted for some time. It was developed alongside UML as a data-centric modeling language and like UML, helps nearly all phases of the development process. It uses natural language to help gather and analyze requirements with subject matter experts, provides a modeling language to perform conceptual design, and maps well to Entity Relationship Diagrams to provide the logistics of the implementation of the solution. Like UML, it is flexible and broad enough to cover any situation, and yet finely detailed enough to be useful and convey most of its information to the next step of the design process.

Object Role Modeling (ORM)

ORM grew out of a combination of Binary Relationship Modeling, which studied the different types of relationships that could exist between two objects, and Natural Language Information Analysis Method (NIAM), which sought to dissect language to define objects involved and the relationships between them. The goal was to create a modeling language that was based on natural language, was intuitive, and mapped easily to existing database systems. In its current incarnation, developed by Dr. Terry Halprin it is supported by the latest version of Visio available with Visual Studio EA (see www.microsoft.com/office/visio for more details) as FORML (Formal Object Role Modeling Language). The result is a comprehensive modeling language that provides easy-to-understand language coupled with intuitive, easy-to-understand diagrams that map quite well to the logical design structures.

The benefits of its use are numerous. It greatly speeds up the requirements-gathering process by describing objects and the roles they play instead of trying to jump directly to creating tables with attributes to store the information. It provides stability and reliability by accommodating underlying changes easily and allowing the model to be populated with sample data to check the accuracy of the model. It is complete, in that almost any business requirement may be notated using the many constructs of the language. Finally, it helps comprehension so that the ideas conveyed by the model may be easily expressed to non-technical business stakeholders. The combination of all of these benefits makes ORM a very useful tool in most if not all of the steps of the design process.

Because ORM is closely tied to UML from an object-oriented perspective, it offers some of the same benefits and can actually help better refine a UML model.

It can be used to model many different levels of abstraction. For example, the entire system can be mapped on a single page showing only the major subsystems and how they interact with each other. Then it can be used to build the subsystems with a fine degree of detail and control over most of the implementation aspects. Even at this level, it still holds true to its name by representing conceptual objects as they are, not fitted into an implementation-specific modeling solution such as static structure or ER diagrams. Even at this level, its object-oriented nature shields users from implementation and technical terms such as tables and columns.

What is ORM?

ORM is a fact-based approach to modeling a solution. What does this mean? It means that we define the system or domain by defining the facts about what makes up the system. Instead of talking about objects and attributes or tables and columns, we talk about the roles an object plays in the system or what it does with another object. This allows a high-level discussion of the system, but don't think everything needs to be kept at a high level. ORM provides a level of detail specific enough to clearly define the rules and constraints that make up the system.

By expressing the parts that make up the system in simple sentences like these, we're able to begin the process of formalizing our design.

Elementary Facts

At its base level, ORM is composed of elementary facts. These are simple declarative statements about objects, and the roles they play. They shouldn't contain any extraneous information at this point, simply a subject and a predicate. Some examples from a system we're going to look at are:

Flight departs from Gate
Terminal has a Gate
Airline owns a Terminal

Sometimes it is useful to express the inverse of the facts so that the statement is even clearer:

Gate is departed from by Flight
Gate is of Terminal
Terminal is owned by Airline

The shorthand way to represent this is with a slash between the two forms of the predicate. We read these left to right, and then right to left so that none of the information above is lost:

Flight departs from/is departed from by Gate (read as: Flight departs from Gate; Gate is departed from by Flight)
Terminal has a/is of Gate
Airline owns/is owned by Terminal

Here, the objects are the nouns in each of the facts and are capitalized so that they can be easily distinguished. Since there are two nouns in each of these facts, they are called binary facts. The number of objects participating in a predicate is the arity of the predicate. The default arity is binary, but we can also have unary facts that describe the state of an object (for example, Gate is occupied), ternary facts that we'll see later, and even quaternary facts (with an arity of 1, 3, and 4, respectively). The predicate of the sentence defines the roles (arrives, has, and owns) that the objects fulfill in the fact.

This is the verbal aspect of ORM and is a great way to express information about the system. Another method is used in concert with this verbalization. This is the graphical depiction of these elementary facts using ORM notation.

ORM Notation

To visually represent the facts above, we use ORM notation.

Objects and Roles

Objects are depicted as ovals, while the roles are depicted as boxes as shown:

click to expand

We can see that each object fulfills one role above represented by a box. These boxes together represent the predicate of the fact. The entire diagram together depicts the fact type.

Constraints

Constraints are the bread and butter of the ORM. They are how we define the rules of the system and further refine the relationships between objects represented by their roles. For example, we could say that an airline must own at least one terminal in order to be considered an airline. This is known as a Mandatory Constraint and it asserts that in order to exist, the object must participate in this role. Mandatory Constraints are depicted in ORM notation by a dot on the line connecting the object to its role. Coming from the other side of the predicate, we could assert that every terminal is owned by at most one airline. This is a uniqueness constraint stating that the Terminal object can only play the role once or not at all. This is represented by an arrow over the role played by the Terminal object. You can see this in the diagram below:

click to expand

These are simple constraints that we've entered to demonstrate some of the notation. We'll show some more complex constraints in a bit, but first let's take a look at a procedure that has been designed to help us come up with the objects, roles, and constraints. This is the Conceptual Schema Design Procedure (CSDP).

The Conceptual Schema Design Procedure

This procedure was created specifically for ORM modeling to help create ORM diagrams. The steps of the procedure are as follows:

Transform familiar information examples into elementary facts, and apply quality checks
Draw the fact types, and apply a population check
Check for entity types that should be combined and note any arithmetic derivations
Add uniqueness constraints and check fact types
Add mandatory role constraints and check for logical derivations
Add value, set comparison, and subtyping constraints
Add other constraints and perform final checks

We'll go through each of these steps with a straightforward example. Suppose we need to create a system that manages a number of the day-to-day operations of an airline. This would be quite a large system, so we'll break it down into more manageable parts. We've been told that the part they're having the most problems with is how to get the right meals to the right flights. This will be the domain or Universe of Discourse we'll begin with.

Step one of the procedure accepts as its input requirements and examples from the users of the system. When we say examples, we mean just that. One of the benefits of ORM and this procedure is that it accepts example data from the users and can even use it to generate parts of the model. One of the best ways to obtain this type of data is with a report or input screen that the users expect the system to fill with data. This offers an extremely intuitive way for the users to specify requirements. People are familiar with reports and they'll usually know what kind of information they will need from the system and what kind of information they will need to enter.

We meet with the people currently responsible for the part of the system we're beginning with and they provide us with an auditing report that the system would produce. We quickly scan through and removed some of the monotony and repetition of the report to give a clearer picture of the types of information needed:

Flight Number	Departure Time	Terminal	Gate	Seat	Meal Type
5468	8:30 AM	A	A34	B12	Kosher
5468	8:30 AM	A	A34	B13	Regular
5572	9:00 AM	C	C28	E23	Vegetarian
5695	9:30 AM	C	C15	F11	Regular
5433	10:30 AM	C	C12	D15	Regular
5211	11:15 AM	C	C12	F11	Regular

So applying the method described in the first step of the CSDP we derive the following fact types:

Flight has Flight Number
Flight has Departure Time
Flight departs Gate
Gate has GateCode
Terminal has Gate
Terminal has TerminalLetter
Flight has Seat
Seat has SeatCode
Meal Type is delivered to Seat

This takes us all of five minutes after we've got the report above. We quickly write them down and show them to the client for some quick verification that these really do capture most of the objects in the system. This is another great feature of ORM. Because it uses natural language, we don't need our clients to learn anything at all just yet. We simply make some statements about the system and see if they nod their heads.

Objects Types: Entity and Value Types

At this point, let's discuss the different types of objects that will play a role in our facts above: object types and value types.

Objects are anything that can take part in a fact. They are what the system will be composed of and can be real-world physical objects (Terminals, Gates) or more abstract (Takeoff Times). A value object represents an object that will not be broken down further and usually represents a numeric or string constant. Think of value objects as the primitive data types in a system. They are usually restricted to participating in one role with an entity type object.

An entity object on the other hand represents a more complicated real-world object that can play many different roles with both values and other entity types. Entity objects are represented by solid ovals while value objects are dotted ovals.

click to expand

Entity objects must have a reference scheme that identifies the objects within their roles.

The reference scheme is used so that you can get a clearer picture of which object instance is playing in the role instance.

The experts of the system will help with defining the reference scheme for your entity object types. For example, we find that:

Flight is referenced by its FlightNumber
Gate can be referenced by its GateCode
Terminal is referenced by its TerminalLetter
Seat is referenced by its SeatCode

The reference scheme is depicted in the diagram in parentheses below the Entity object name, as shown for our Terminal object:

Value types that do not serve as reference schemes for entity types are constants. They have an implicit reference scheme that is omitted from the diagram. For example, an instance of the TerminalLetter object could be 'B'.

After coming up with our reference schemes for our entity objects, we have the following fact types:

Flight (FlightNumber) has Departure Time
Flight (FlightNumber) departs Gate (GateCode)
Terminal (TerminalLetter) has Gate (GateCode)
Flight (FlightNumber) has Seat (SeatCode)
Meal Type is delivered to Seat (SeatCode)

The next step in the CSDP is to draw the fact types. Since we're going to use Visio to do this, it will help to understand some things about how Visio structures and organizes our source models.

Visio Data Projects

When Visio is launched, it assumes you would like to begin a new modeling project or diagram and presents you with a number of modeling solutions. Selecting the Databases category presents you with the choices shown below (depending on which models you chose to install in the Visio setup application). The diagrams that we are concerned with throughout this chapter are the Database Model Diagram, the ER Source Model, and the ORM Source Model. The cornerstone of the Visio Data Project will be the Database Model Diagram (DMD).

click to expand

The DMD is mainly an Entity Relationship (ER) diagram with project management capabilities. The drawing surface itself is a standard ER diagram with standard ER shapes that are used to map out your logical design. However, unlike the standard ER Source Model, it has a number of features that make up the structure of a data project. Most of this functionality is accessed through the database menu. Opposite, we show a standard ER source model's database (right) menu compared with that of the Database Model Diagram on the left.

click to expand

The DMD is the only diagram from which a database can be generated or updated. Think of the DMD as a staging area for the project. As we'll go into detail later, while ORM is a great tool for gathering/analyzing requirements and creating a conceptual design, it wouldn't be a good idea to directly create a physical database from it. There has to be an intermediate step depicting some of the logical design details. As we'll see, at interim steps in the conceptual design, the project can be built to produce the logical design as encapsulated by the DMD. Here it can be given final implementation details before the generation or update of the database takes place. We'll go into each of the differences in detail in a while, but for completeness, we'll briefly explain what each represents.

The Show Related Tables menu choice is merely a UI change; it will add to the drawing surface all the tables in the model that are directly related to the currently selected table. The model menu includes the Model Error Check function and a new function called Refresh that will refresh the model with any changes in the underlying database. The Project sub-menu is the most important change from ER diagram to data project. It provides the functionality to add existing models to the data project and maintain the synchronization between the source models and the DMD.

Creating the Data Project

We noted earlier that a data project's functionality is based around the Database Model Diagram. This is best explained in terms of the Project submenu of the Database menu pictured above. As you can see, the Project menu contains functionality to build the project, manage the project documents, and maintain the source models. The first step is usually to add a source model to the project. Since this is a completely new system that we intend to design, we will begin by adding a new ORM source model so that we can get back to our example. Clicking this option prompts you to name and save your new source model. After it is finished saving, you should see the project window partially covering your drawing. Your new file should appear in the project window; double-clicking it will open your new ORM diagram alongside your DMD.

You now have all of the files you need to create your solution in Visio. The ORM source model will facilitate requirements gathering, analysis, and creation of the conceptual design. The DMD encompasses an ER diagram that will accommodate all of the logical design requirements and allow the detail needed to generate or update the physical database you plan to use.

click to expand

Now that we have a data project that includes a new blank ORM source model, let's get back to our example.

Step 2 - Drawing the Fact Types

There are many ways we can draw fact types in Visio. We're going to show how all of them work so you can decide which one you like best. The first thing most people notice when they create a diagram in Visio is the stencil shapes. Looking in the stencil, we notice that there are only three shapes. This is very misleading because the ORM notation provides many shapes we can use to display information about the system.

As we'll see later on, most of the shapes will be displayed on the diagram depending on properties and information we set using dialogs Visio provides. To get an idea of most, if not ALL of the shapes used to represent relationships in ORM, take a look at the stencil of an ORM diagram instead of an ORM source model. ORM diagrams are used to merely draw the diagram using the shapes. They cannot be used to generate the logical model as source diagrams can.

It turns out that dragging shapes from the stencil is one of the most tedious ways to define objects and fact types. A much better way is the Business Rules Window

The Business Rules Window

The Business Rules window is usually located as a tab at the bottom of the screen. This area can be occupied by windows and editors that can all be selected from the View submenu of the Database menu pictured below:

The Business Rules window is composed of two panes: the Object Types pane where we can create and define some of the properties of our objects, and the Fact Types pane where we do the same thing with our facts.

Taking our first example fact: Flight departs Gate, let's begin by adding the two objects in the fact: Flight and Gate. Double-click the first row to create a new entry. Name the object type Flight in the object types column, then double-click to add a new object type and name it Gate. Don't worry about the other columns just yet.

So now we have two objects, let's define the relationship (or fact) type between them. Bring the Fact Types tab forward and double-click the row to add a new fact type. The fact type editor is displayed:

click to expand

The Fact Editor

The Fact Editor consists of five tabs. The first of these, the Fact tab, describes the basic elements of the fact including the objects and the relationships they take part in. You can choose to enter the fact in Freeform or Guided input styles. Guided is the default (you can change this in the properties; we'll show you how later) and allows you to select from your list of objects on each side and add the verbiage or wording of the roles each object plays. Notice that when the Guided input style is being used, the arity of the fact can be selected from the drop-down next to the Guided radio button. This lets Visio know how many boxes to put on the form for your object names and relationships. You can also enter the inverse of the relationship in the text box provided. Let's do this with our example so that the Fact Editor resembles the one overleaf:

click to expand

Freeform input allows you to simply type a fact into the editor. You can use either Capitalized or Brackets to distinguish your objects from the rest of fact by choosing from the drop-down next to the Freeform radio button. The freeform editor really is the best way because it is faster, easier, and allows you to enter your facts in the most intuitive way, as a sentence. In freeform input, the inverse is entered into the predicate by delimiting it with a slash.

We'll enter the first fact using guided input and then use freeform input to enter another fact. To enter our first fact select the Flight object from the first Object Name drop-down and type departs from in the Relationship text box. Next, select Gate from the second Object Name drop-down so that we have a completed fact. Fill in the inverse: is departed from in the Inverse Relationship textbox. Press the Apply button and the Fact Editor saves your new fact to the model and clears itself so that you can enter a new fact.

Now we'll enter the second fact type represented by the statement Flight has Seat using the Freeform input style. Select FreeForm and go ahead and leave the dropdown with Capitalized selected. Enter Flight has a/is in Seat exactly as shown next:

click to expand

On your screen you will see that the objects are highlighted in red and the predicate is blue.

Moving onto the Object tab of the Fact Editor we see one way we can define the reference scheme for the objects in our fact.

click to expand

The Fact Editor Object tab shows a box that has the objects that take part in this fact. Choosing an object in this box displays its attributes in the rest of the form so that you can modify them.

You can change the object kind or type using the Object kind drop-down box. The three available choices are: Entity, Value, and External. We've already gone over Entity and Value types. External simply means that the object is defined in another diagram somewhere else. This is extremely useful when you have a larger system where different teams are working on different parts of the system. If an object has been better defined elsewhere, then we can refer to it in our diagram as external and need only be concerned with the roles it fulfills in our system.

The Entity attributes box is where we define our reference scheme. As can be seen here, we have defined the FlightNumber to be the reference scheme for our Flight Entity object. The Ref Type dropdown allows you to better specify how the reference scheme identifies the entity object. The most common Ref Type is Identification, but you can also choose Measurement and Formatting.

click to expand

Click OK to close the Fact Editor. Looking in the Object Types Pane of the Business Rules window, we see that the editor has created our Seat object for us with the reference scheme we specified.

Let's quickly add the rest of the fact types so that we can move on to the next step of the CSDP. Choosing Freeform input style, enter the rest of our elementary facts, pressing Apply after each one in the fact editor:

Flight (FlightNumber) has/describes DepartureTime
Terminal (TerminalLetter) has/is in Gate (GateCode)
Meal Type (MealType) is delivered to/receives Seat (SeatCode)

After entering these facts, you should have the following facts and objects just like the Business Rules window below:

click to expand

The object types should also match what you see here:

click to expand

At this point it is really a good idea to save our drawing. Click Save from the File menu and save your file.

Now let's look at yet another tool Visio provides that helps us complete the steps of the CSDP. Remember, we still have to do a population check to make sure we have represented all the data in the report. To facilitate this, Visio provides automated reporting of our model so far.

Visio Reporting

To help with the steps of the CSDP and to better communicate our progress and ideas to the client access the different reports by clicking Report... fromthe Database menu.

click to expand

The New Report Wizard will present you with a list of the different report types it provides. Even though we've only touched the surface of ORM so far, this is a good time to bring up this feature because it will help at this and the next stage of the CSDP.

click to expand

Report	Description
Constraint Type Report	Summarizes the different types of constraints in your model. You can select which constraint types you wish to include and what attributes you would like to report on. You can also describe how you would like what attributes you would like to sort, and group. As we'll see later, most of this information is pulled from the Add Constraint Editor.
Fact Type Report	Summarizes the information that is pulled from the Fact Type Editor form. It gives a report on the 'arity' of the facts you have entered so far, some of the simple constraints you have entered for the facts, and the example data you have entered.
Object Type Report	Summarizes all of the object types you have entered so far. Great resource to use when doing population checks and combining entities that are the same in different parts of the system.
Supertype Report	Provides a number of reporting options and figures to help understand the Subtype relationships and subset constraints of your model. Includes a hierarchical tree.

It's just a few simple steps to generate these reports but they provide a powerful way of checking your model and looking for discrepancies at this stage of the project. In fact, I would advise the reader to go ahead and bring up each of the reports and just click Finish (this will accept the defaults) in the first step. It only takes a minute and can give you a good idea of the type of information these reports provide.

Step 3 of the CSDP

Step 3 is the combination of entities and any objects that can be derived arithmetically.

Keep in mind that our example so far has been of one part of a fairly simple system. We may have two or more teams working on other parts of the system and probably a large portion of them will come up with, for example a flight object. Other parts of the system will want to describe other facts that include a flight, such as Pilot flies Plane. We don't want to duplicate efforts at this point. If another part of the system is defining the object with more detail than we need, we can simply mark it as external and define our own roles for it. This is definitely where the reports described in the last section can really come in handy.

Arithmetic derivations are just that. Any objects that can be derived arithmetically from a combination of others should probably not be objects in their own right. For example, suppose we created an object representing the total number of passengers aboard our plane, but we already knew the number of First Class Passengers and the number of Coach Passengers. The object representing the total number of passengers would be unnecessary, since we would simply add First Class and Coach Passengers to come up with the total.

Thus we perform this step on our model so far and are satisfied that our Flight, Gate, and Seat objects don't need to be combined and that there are no arithmetic derivations in our model. The next step in the CSDP is the addition of some constraints to our model.

Constraints

Constraints are the most important part of an ORM model. They help to further specify the business rules of your data model. We'll begin our discussion of constraints by looking at some of the other tabs of the Fact Editor.

The Examples tab is for adding example data to the model. This is one of the key benefits of ORM diagrams. Example data and test populations can, and should, be loaded directly into the model for every fact type. The integrity of the model can then be checked for validity against the example data. Constraints can even be derived from the example data.

Let's try this by starting with the following fact: Flight (FlightNumber) departs Gate (GateCode). With this fact selected, right-click the fact in the Business Rules Editor and select Edit Fact.... Then, click the Examples tab. These rows are where you will add your sample data that represents instances that will eventually be stored in the database. Add the sample data as shown below:

click to expand

Highlighting row 1 and looking in the bottom window, we see Flight 5468 departs from Gate A34. We mentioned earlier in the chapter that verbalization of facts is very important in ORM. Here is a prime example. We can use this to verify that our data and our facts really do meet the situation. We should make sure that we enter information that really represents the system we want to model. Notice that each Flight is unique and that each Gate is unique or that one Flight will depart from one gate and that one Gate is departed from by one Flight. It's important to remember that this doesn't mean that a Gate can only be used once. That's a Primary uniqueness constraint, and is explained below. Let's add these constraints to the fact using the Constraints tab of the Fact Editor:

click to expand

Here you can add the following constraints:

Constraint	Description
Uniqueness Constraint	Specify that an instance of one or more of the objects in the role will be unique within the role. For example, One Flight leaves One Gate.
Mandatory Constraint	Specify that the object must play that role in order for the role to exist. In other words, a Flight must depart from a Gate if it is to exist as a flight.
Frequency Constraint	Specify the number of times, if any, an object must fulfill the role. Here it is just one, but in other situations it could be more. For example, Four Flight attendants are required to serve a Flight.

Enter these constraints using the Constraints tab so that they'll be reflected in our model. Visio provides questions to help us conceptualize the constraint. Clicking the Rephrase button will grammatically rephrase this question to further illustrate the constraints that relate to the fact.

The drop-downs above the Rephrase button present some of the ways we can answer the question that will allow Visio to set the constraints for us.

click to expand

Constraint Question #1 relates to the role that Flight plays in the predicate while Constraint Question #2 will help us set constraints on the role that Gate plays. We will set Exactly One for both Constraint questions as we said above.

The Primary Uniqueness drop-down allows you to specify a primary uniqueness constraint over one of the roles in this fact. This means for example, that the object can only play in one of these roles ever. A Match (the one you use to light fires) object, or a Thirty-First Birthday object; these can only be used once. In other words, if there were a primary uniqueness constraint over the fact in the screenshot above, only one instance of Gate would ever exist for this role. A Flight could only depart from one gate once, then no other flights could depart from that same Gate - definitely not the situation we want here.

Now our Constraint tab should look like this:

click to expand

Notice how Visio has verbalized our fact and its constraints. This is a great quality check to make sure that our constraints are correct.

Another good way to verify our constraints is back in the Examples tab. Let's go back there and click the Analyze button:

click to expand

This causes Visio to analyze the example data and make a suggestion about what constraints the data implies. If you have already entered constraints that do not match the data, an error will appear. I wanted to show how to add a constraint using the Constraints tab, but if we had done this first, we could have bypassed the Constraints tab and applied the constraints automatically after entering our example data by clicking Apply UC Constraints. This is another very good reason for adding example data to every fact.

Our fact now looks like this:

click to expand

Notice the bars with arrows over the roles we have defined uniqueness constraints for, and the dots indicating mandatory roles.

The Advanced tab of the Fact Editor allows you to set some of the other properties of the role.

click to expand

Notes can be defined for pretty much anything in Visio. They can help document things that may need to be added later and are stored in the model.

The Objectify / Nest Fact as: textbox allows you to turn this fact into a full-fledged object that can then take part in its own roles. For example, let's say that we want to objectify the fact: Flight departs Gate as a Departure that can then take part in a role with a Runway object. We could type Departure into the textbox provided to accomplish this. ORM represents a nested fact as a soft rectangle around the predicate as shown:

click to expand

The Departure object is now an entity type with all of the properties of any other entity type.

The Derivation radio buttons and textbox allow you to add derivation rules for this fact type. A derivation rule is a way to enter more information than is formally allowed in the model about the way that the relationship is carried out. For example, let's suppose that there was a way to derive some other object in the system from the gate and flight. We could enter it here as a documented feature of the system that we'll enter later. The Derived radio button indicates that the derivation exists while the Derived and stored radio button indicates that the derivation exists and the value will be stored in the database.

External fact type allows you to specify your object as an external object type. Remember, this stipulates that the fact is defined in another part of the system and that we just want to show a relationship between it and our system without explicitly defining it.

Steps 4 to 7 of the CSDP

The rest of the CSDP presents a recommended order for creating the rest of the constraint types in the model. In this section, we'll define and give some examples of these constraints. To do this, we'll be taking a look at a better way to add constraints than the Fact Editor - the Add Constraint form. We're also going to end our above example here to look at a situation in which some of these more complex constraints occur.

Traveler, TicketPrice, and FrequentFlierMiles

We're going to look at the situation of how a traveler is charged for their flight. This will involve a Traveler entity object, a FrequentFilerMiles entity object and a TicketPrice entity object. We're also going to see a unary fact type that describes whether a Traveler is an employee of the carrier of the flight.

Our small part of the system is represented by three facts. Create these facts in the same manner as you did in previous examples:

Traveler pays TicketPrice
Traveler uses FrequentFlierMiles
Traveler is employee

Let's not worry about the uniqueness constraints that we can create in the Fact Editor for these roles. The Fact Editor is limited to creating constraints for the particular fact that it is modifying. The Add Constraint form can create constraints between facts. Here are our facts so far in ORM notation:

click to expand

Here's the Fact Editor window showing the unary fact:

click to expand

Here's an explanation of the constraint we'll add first: We know that a Traveler will not pay the TicketPrice if they are using FrequentFlierMiles or are employed by the carrier; this is an exclusion constraint. We also know that they must fall into one of the categories. They must be an employee, pay the TicketPrice, or use FrequentFlierMiles to take part in the system; this is a mandatory or inclusive constraint.

Add Constraint Form

The Add Constraint form allows an easier and better way for you to add constraints to your ORM diagram. To create our constraints, we'll select all three of the predicates of our diagram by dragging a box around them, or holding the Shift key (normally we would use Ctrl for this purpose in Windows!). Select Add Constraints... to bring up the Add Constraint editor.

click to expand

The Add Constraint Editor

click to expand

The Constraint type drop-down allows you to select the kind of constraint you would like to define for the roles you select. The Primary checkbox tells Visio that the constraint provides a primary reference for the role instance (much like we explained earlier with the Fact Editor). The roles we had selected are displayed in the first box so that we may select them to participate in this constraint. Finally, the box at the bottom will verbalize the constraint as we create it to help walk us through the process.

Select Exclusion as our constraint type and select the roles on the left-hand side, which represent those roles fulfilled by the Traveler object as shown:

click to expand

Notice the verbalizer box at the bottom has grammatically stated our constraint. This is such a wonderful feature!

Also notice the Number of Roles at each end box that has appeared. This allows you to specify the number the roles that will be included over more than one instance of the constraint.

Click OK to add the constraint so that we now have the following diagram:

click to expand

Notice that the circled X is used to indicate an exclusion constraint. Now we know that a Traveler instance cannot take part in more than one role, or said another way, they are disjunctive. Now let's add the other constraint. Select the predicates again and select Add Constraint to bring up the Add Constraints dialog. Select a Mandatory constraint type and choose the roles as shown:

click to expand

Click OK to add the constraint so that the diagram now looks like this:

click to expand

Remember that the notation for a mandatory role is a dot. Visio has drawn the dot on top of our exclusion constraint's X. You can right-click the constraint and select Split X / OR constraint to show them separately. The technical term for this type of constraint is an inclusive-OR constraint or a mandatory disjunction of the roles.

Here's a short explanation of the rest of the constraints (that we haven't looked at already) that you can add in the constraint type editor:

Constraint	Description
Subset	Much like a subtype relation that we'll look at later. It denotes that the population of one role must be a subset of the population fulfilling another role. In other words, a member of one role must also be a member in another role. For example, Travelers using FrequentFlierMiles must also play the role of Participating in the FrequentFlierProgram. The difference between this and an Equality constraint, is that Travelers participating in the FrequentFlierProgram don't necessarily have to use their FrequentFlierMiles.
Equality	Specifies that the populations of the roles must be equal.
Index	These constraints apply less to conceptual design and more to the performance advantages of an index defined in the physical database. They don't really specify any concepts but are included in ORM notation by a circled I.
Ring	There are many types of ring constraints; the Visio documentation actually has some good examples and definitions for all of them.

Although there is plenty of documentation in Visio on them, it's worth looking at an example of a Ring Constraint. Suppose we are to build a program for the review of pilots on the planes such that pilots review other pilots:

The constraint has to stipulate that a pilot cannot review themself. Select the predicate and bring up the Add Constraints editor, select a Ring constraint type and then select both sides of the role to show the Ring Constraint Editor:

click to expand

We'll select an Irreflexive constraint from the Ring Type: drop-down and click OK. Notice that the bottom of the Add Constraint Editor once again helps out by verbalizing the relationship: No Pilot reviews itself.

ORM notation depicts this with a circle to depict its cyclical nature and ir for irreflexive. The end result Oir is shown below:

Creating the Conceptual, Logical, and Physical Database

We'll begin by creating an ORM source model. To create our new ORM source diagram, we select a new ORM Source Model diagram from the Database category of the main drawing menu. We now have a blank drawing surface with which to begin modeling. The simple example we are going to build is that of a part of an airport reservations system. We are only going to be concerned with a simple seating assignment data store. A report of some of our elementary facts is shown below:

Fact1 - Airline owns/is owned by airplane
Fact2 - Airplane has a seat
Fact3 - Seat has a row number
Fact4 - Seat has a location code (aisle, window, middle)
Fact5 - Traveler sits in seat
Fact6 - Traveler rides in airplane

Creating the Object Types

We're already familiar with the Business Rules editor, so let's just go ahead and create the objects below:

Object Name	Reference Mode	Kind	Data Type
Airline	Code	Entity	Char(10)
Airplane	Id	Entity	Char(10)
Traveler	Id	Entity	int
Seat	Code	Entity	char(5)
RowNumber	Number	Entity	int
LocationCode	Code	Entity	char(1)

Our business rules editor should now look like this:

click to expand

Creating the Predicate Types

Now that we have our objects, we will define the roles that they play in Fact 1. Remember, we can add facts by either creating them in the Fact types pane of the Business Rules editor, or by dragging a predicate object to the surface. Either way, the Fact Editor will help define the predicate as shown.

click to expand

We will also want to define some simple constraints on this relationship. Here we need to think about how we are going to define what is and isn't allowed by the system. It is critically important, that we must not let our own assumptions guide us here. Once again, let's ask our point of contact for the system for advice about how the final product really should work. We don't want to define the constraints too strictly to allow for all combinations and relationships that will need to be depicted in the system. For this relationship, we decide that an airline can own more than one airplane at a time, an airplane must belong to one airline, and that an airline has to own at least one plane. This is depicted by the arrow over the is owned by side of the predicate and the black dot on the connector to our airplane and airline objects. Our Fact Editor will look as shown:

click to expand

Remember that there is no primary uniqueness constraint because that doesn't apply here. Instances of the objects and their uniqueness are not defined by this role.

Fact2 states that Airplanes have Seats, Seats are in Airplanes. This is another simple constraint that we define as:

click to expand

The Fact Editor's Constraints tab looks like this:

click to expand

Primary uniqueness constraint

The other fact types are pretty much the same except that one thing has been left out that I would like to mention. A seat has a Row and a Location. One might have just taken the same road as for all the other constraints and defined these predicates like this:

click to expand

That's perfectly fine, and would probably result in a model that would work. Both of these are mandatory and their values actually define each seat. This is because the reference scheme for the Seat (SeatNumber) is a combination or composite of the reference schemes of the Location (LocationCode) and the Row (RowNumber). The point is that a seat doesn't necessarily have a need for its own unique identifier because this is provided by the roles it fulfills with the Location and Row objects. How do we go about adding this relationship (called a composite primary reference scheme) to the diagram? The best way to these types of constraints happens to be the Add Constraint form. Remember, to display it, choose both predicates either by dragging a box diagonally around them, or by holding down the Shift key and clicking both of them. Then either click on one of the shapes, right-click and select Add Constraints... or do the same from the Database menu.

Here is our old friend the Add Constraint dialog again. Click each of the sides of the predicates shown so that they look like the diagram below and check the Primary checkbox. Notice the verbalizer window reads the constraint for you as you click each role.

click to expand

Now clear the reference mode of the seat object using the Business Rules editor.

click to expand

Your diagram should now look like this: the circled P represents the primary uniqueness constraint. This means that a seat is primarily defined by the unique relationships it plays with LocationCode and Rownumber.

click to expand

The Verbalizer grammatically states this:

click to expand

We need to enter two more facts to define our complete model. Once again we just open the Fact Editor and type: Traveler rides in Airplane in the Fact tab (using the Freeform input method). We then define the constraints as shown below:

click to expand

Next, we define the fact: Traveler sits in Seat. Type this into the Fact tab and define the constraints for this fact as shown below:

click to expand

By now, your full diagram should look something like this (Remember, a great way to add the shapes is to right-click and select Show relationships):

click to expand

Before moving to the next section, it is a good idea to do a model error check. You can do this by selecting Model Error Check from the Database menu:

click to expand

This will perform a conceptual validation of your model. It will inform you of any problems with your model such as a lack of reference schemes or constraints that contradict the samples you've entered. All of this information will appear in the Output window. I've entered some faulty examples for one of the facts to demonstrate this. Otherwise, our model generates no errors and no warnings.

click to expand

You can double-click the error/warning row and Visio will highlight the offending shape on the diagram. Since we have no errors or warnings in our current model (I'll correct the faulty examples I've entered before we continue!), we're ready to move one step closer to generating the database. That next step is:

Building the Logical Model

Now that we have a good diagram of the objects in our part of the system, it is time to see if our work has truly paid off. One of the best things about using an automated modeling tool like Visio is that you can regenerate the model at will in a few seconds. If you don't like the results, you can always go back and change things, then generate it again.

At this point, I can't stress enough the need to get the Visio SR-1 from Microsoft. It fixes a number of really annoying and potentially destructive bugs in database diagramming and updating logic that can lead to, at the least, headaches, and at the most corrupted models! Do yourself a favor and get it from: http://msdn.microsoft.com/downloads/default.asp?URL=/downloads/sample.asp?url=/MSDN-FILES/027/001/906/msdncompositedoc.xml

We begin by creating our Database Model Diagram and adding our ORM diagram to it by selecting Add Existing Document from the Database menu. The project window should come up at this point and we should see that our diagram is now part of our data project. Select Build from the Database menu and Visio begins to generate the logical model. The Output window will show the status as well as any errors and warnings that reflect mistakes or inconsistencies in your model. Remember, we can double-click the row to highlight the offending shape. After Visio is done generating your model, you'll see the tables it has generated in the Tables and Views Window. You can see in the Tables and Views window opposite, that it has generated three tables: Airplane, Seat, and Traveler.

click to expand

This illustrates a point. When you first use Visio to generate a logical diagram from you ORM, you will sometimes wonder why it has chosen to do certain things. One of the aspects that seems counterintuitive is that Visio didn't decide to put Airlines into their own table. They are an entity object aren't they? Why wouldn't Visio represent them with an entity in an ER diagram? The answer has to do with the algorithm Visio uses to map the objects. The simple fact is that it is optimized, while our perceptions (what we think should be generated) are not. We didn't define any value types or other roles extending our airline object. Visio therefore, has taken the simplest and best way of representing them, by showing them as a required column in our Airplane table.

Let's see how Visio defined the relationships between our objects. The easiest way to do this is to drag one of the tables to the drawing surface and select Show related tables, either from the context (right-click) menu or the Database menu. Visio adds the shapes to the drawing and lays them out for us as seen below:

click to expand

At first glance it doesn't look pretty. Even though many databases allow column names to have spaces, this still isn't necessarily a good practice. Why is Row by itself as a field name while Location has its reference scheme included (LocationCode)? Also, what's with the Rides Airplane Id of the Traveler table?

The answer to these kinds of problems lies back on our ORM diagram in the document properties. Although at first, they can seem confusing, changing these settings can save a lot of time and help generate truly usable ER diagrams. I will try to explain each of the properties and how each one can affect your ER diagram. Go back to the ORM model by double-clicking it in the Project window, or the Window menu if it's already open in Visio. Open the document properties by choosing Database | Options | Document.

The first tab of the ORM Document Options Window simply asks if you want to show constraints that only impact on the physical model. This pretty much means indexes that you define for the sole purpose of speeding performance.

The Abbreviation tab is where things start to get interesting. It displays some common default words and the abbreviations that Visio inserts in place of other words when the logical model is generated. Any of these words appearing anywhere in your model will be replaced by that abbreviation. This is how Visio maps your objects/predicates to fields and tables. If we don't do this, many of the sits in, rides, and such like, can make their way into the logical model. Notice that most of the words have no abbreviation at all. This is not by accident; the simple fact is that most of the time you don't want any of these "conceptual" words to even appear into your model. Let's add some of our own. Enter in the words: sits, rides, and owned into the bottom of the list so that your list now looks like this:

click to expand

This also explains our Row, RowNumber problem - the field names Location Code and Row are derived from the ORM reference schemes LocationCode and RowNumber. If you look in the Abreviation tab we've just added to, number is there. Erasing this will solve our issue.

This scenario illustrates a great way to find out the nuances of what's going on with the names Visio uses when you build your model. If when you are building your ER diagram you're not sure why Visio did chose a certain name, it usually relates to some of these document properties.

Just to show exactly what this fixed, let's rebuild the logical model. Press OK and open your DMD. Build the data project again and you should get what I have below.

click to expand

As you can see, the extraneous words were removed! The spaces still remain, but we'll show how to get rid of them below.

click to expand

From the Prefixes tab, we can set prefixes that Visio will use in the naming of our columns and tables. For columns, we can choose to have no prefix, a prefix based on the first few letters of the table name, or a custom prefix on a per table basis. This can go a long way to creating uniformity in the names of objects in your database. We can also specify a custom prefix for our table names. Hopefully in a later version of Visio, there might be an option to have a column prefix based on the column datatype. Either way, this is an extremely useful feature that can go a long way in setting some consistency across your architected database solution without most of the work.

The Suffix tab allows you to define a custom suffix to append to the end of column and table names.

click to expand

The Capitalization tab allows you to specify the rules to apply to the capitalization of both column and table names. You can choose to force all upper or lower case punctuation or have it determined by the first letter.

click to expand

The Miscellaneous tab has a number of options that we can use to tailor our logical diagram. It is here that we find the reason behind the spaces in the column names of our ER diagram. Looking at the Spacing Character label in the above screenshot, we see that the Other radio button is selected by default. This is very misleading and I'm not sure why it's been designed this way, but inside the Other textbox, there is a [space] character - of course, you can't see it! Usually one would want to set it to something else. We'll choose None for the Spacing Character, although an underscore could do nicely too (This would make LocationCode become Location_Code); this is really a matter of personal preference.

The Reference mode options are useful but are also misleading in some cases. The option I have found to be the best is to use the reference mode as the column name. This sets whatever you put in the reference mode box to be the name of the column that represents the primary key in the table represented by your object, or the column name that represents your object (remember how the primary key of the Airline table was named?). The reason why this is misleading is because Visio seems to have done this. I usually set it to Add to object type name, and then make sure I name my columns with the name of the object type plus an ID or Code.

The Maximum name length can be used to restrict the maximum length of column names. The Use predicate text... checkbox tells Visio to use the predicate verbiage to help name relationships and columns.

Finally, the Pluralize table names checkbox is very useful and I usually set it. It will take your objects (which are singular in the conceptual view) and pluralize them so that your airplane object becomes an Airplanes table, for example. I usually set them to be the defaults for all documents I create. After setting these properties, and building the model again, the end result appears below.

click to expand

We now have a fairly complete logical design created from our conceptual design without having to define one table or column! The benefits that we have realized as developers are numerous. Not only have we collected requirements about the specification of the system we need to build and a conceptual design of our system, but we also have a very good start at logical design! The truly great part is that we can make changes to our conceptual design and rebuild the logical model with one click on a menu, thus insulating us from the risk of downstream changes.

Meanwhile, it is good to keep in mind that our data model is not dependent on any vendor-specific DBMS. We have used SQL server's driver here because that is the database that we will generate this model to, but we could just as easily have chosen another vendor's driver. We could do so right now as a matter of fact, by opening our ORM diagram and choosing Driver... from the Database | Optionsmenu.

In fact, when we do, we see the screen overleaf and find that there are a number of properties we can define that specify how our model will map to any database we choose. Specifically, we can specify the way we want our data types to map and how we want DDL scripts to be generated.

click to expand

Generating the Database

Now that we have our logical model defined, we can begin to generate our physical database. The process is exceedingly easy since Visio provides a wizard to walk us through the process. Select Generate... from the Database menu of the data project to begin. You are presented with the screen below:

click to expand

The checkboxes provided allow you to decide exactly how you want the database to be generated. If you would like to generate a DDL script, you can select that checkbox and choose where you would like to save the text file. Clicking Generate new database will instruct Visio that you are indeed ready to create the actual database. The Store current database... checkbox is extremely useful as we'll see later. It instructs Visio to take a snapshot of the database so that it will be easier to update and replicate changes back to the model at a later date. Make the choices to suit your needs and click Next.

The next screen lets you configure your data source that points to the server or file that you want the database to be generated in. Remember the "data source" is really just a pointer to the database you want to generate, don't get confused by the wording. You can have Visio create the database or use one that already exists. In this example, using the SQL Server driver, you must specify a data source to a SQL Server you can connect to. Clicking new brings up the Create New Data Source wizard:

click to expand

After you have configured your data source to point to your server, Give your database a name in the Database Name textbox and hit Next. The wizard now has a summary screen that shows you the tables you are about to create in your new database. Hit Next and Visio does a final physical validation of the database to make sure that the model is consistent and that there are no errors.

click to expand

Clicking Finish on the wizard starts the generation process. If you have chosen to create a new database and you didn't create one in the data source creation wizard, Visio will ask you for a logical file name for the database and transaction log as well as the path and other physical characteristics. When the generation process is done, Visio asks if you would like to see the generated DDL script.

You now have a completely generated database that should exactly match your logical model. You can verify this by opening SQL Server's Enterprise Manager or Visual Studio .NET's Server Explorer and seeing for yourself - the database listed with all the others

click to expand

The first thing to do when you have verified that your database is set up the way you like, is go back to your data project and save it. You'll be presented with this message box:

click to expand

This dialog can come up many times when you are working on a project and can be intimidating at first, mostly due to its wording. Here's what it means - the collective model is just another name for the data project or Database Model Diagram. In this instance, changes were made to it that you would like to keep, changes made by Visio when you generated the database. When it stores the database model, it marks all the shapes in your model as reverse engineered. This is some kind of internal trick to help Visio merge and synchronize your model with what's in the database. If you choose to build your project, your source models are used to build the logical design in the DMD, overwriting anything in the DMD. So, in this instance, we will choose Yes to update the source models so that the shapes in our ORM will also be marked as reverse engineered, but there will be many times when you're working back and forth between the two that you may choose not to.

Organization of the Data Projects

Another point to bring up is third sentence in the dialog. This is a true statement and can be a point of confusion and even aggravation.

Working with multiple source models in this version of Visio just isn't a good idea at this point. In fact there isn't really all that much need. Keep in mind that the DMD is just an ER diagram so you really don't need any of those. Visio won't update multiple source models with new objects, only change existing object names. A logical diagram (provided by the DMD) and an ORM diagram are all you need to model incredibly complex systems.

Also, when your diagrams get very large, there are many ways to spread them out and organize them. You can right-click on the tab at the bottom of the current drawing page in either diagram to create a new page representing a conceptual group of objects and relationships:

click to expand

Also, another extremely useful feature for organizing your drawings is the folder system of the Business Rules window. You can right-click to create a new folder, and then drag fact types into that folder as shown below:

click to expand

Refreshing the Model from the Database

The database is now in the hands of the DBA and in action in the real world. No matter how good the design, there always seems to be someone or something that wants to change it. Sometimes it's simply a data type or column name change, or the addition of an index you may not have thought of to increase performance. Sometimes the change can be downright huge such as combining tables to decrease joins. You can be sure that a well-normalized schema will be lost on some and it may be beyond your control to prevent it. Luckily, you can at least find out what's gone on and update your documentation to reflect the changes. The reporting feature mentioned later can also be of great help here. Also, a good idea is to make sure that you have backups of your existing model so that you can always revert to your pristine design idea.

Let's suppose that the "powers that be" have decided that Airlines are now going to own terminals in airports. This isn't that far-fetched after all - we had already broached the subject that Airlines should possibly have been in their own table. The problem comes that, instead of coming back to you to architect the solution, they just take it upon themselves to create the Airlines and Terminals tables themselves. Here is what they have done:

Created an Airlines table with an airlineCode primary key
Created a Terminals table with a terminalID primary key
Added an airlineCode column to the Terminals table

Your first step is to open up your diagram and choose refresh from the model submenu of the Database menu:

click to expand

The wizard asks for a data source. You should already have one configured, but if not, create one as described in the Generating the Database section and click Next. Visio searches for conflicts between your model and the way it thinks the database is, and the way the physical database has changed. It should find two conflicts pictured below:

click to expand

We will want to refresh our model with these changes, so make sure the top of the hierarchy is highlighted and click the Refresh model radio button. Visio generates the tables for us in our data project. We realize this by looking at the Tables and views window. We should be able to have Visio lay them out on the diagram by right-clicking the Airplanes table and selecting Show related tables. This doesn't work and, to our horror, we see why - no foreign key constraints were defined!

We save our model and the Update Source Models dialog comes up. Click Yes so that our ORM and ER diagram will be updated (you could also just choose Update Source models from the Database | Project menu). Opening our ORM we affirm that two new object types are there, but no new fact types. You can see that there has been minimal impact on our ORM model besides the two new objects. Some of the reference mode names have changed due to the fact that our changes to the document properties were not used for the new objects.

We also notice another problem. Because there were no constraints and the names didn't exactly match up, the airlineCode field that was added in the physical database is now a value type object.

This shows a problem with the current implementation of Visio's reverse engineering. Visio doesn't use the properties of our document or model that we've set during the reverse engineering process. This is a major flaw because it causes Visio to corrupt the model when you refresh it from the data source. I have no doubt that this will be fixed in later releases. However, let's bear in mind that ORM is a conceptual design procedure and wasn't really intended to map exactly to the underlying database model. It is better to use an ER diagram for the kind of maintenance duties that we must perform.

We mentioned reverse engineering the database in the previous paragraph - now it's time to have a deeper look at this process.


	Team FLY