When the ancient Greek philosopher Heraclitus said that "everything is in a state of flux," he was not thinking of data models, but he might as well have been. No matter how good the initial data model and the resulting physical schema, it is almost inevitable you will need to make at least some changes during the life of the database. Good initial modeling will minimize the overall need for changes, and lessen the severity of those changes, but you should still expect to make changes along the way.
Changes to data structures are disruptive, and can have a far reaching impact on existing applications, and the members of a software development team. To minimize this disruption, you need tools that allow you to manage change smoothly and reliably. VEA has some very good facilities for managing database changes.
Making changes at the conceptual level in your ORM source model(s), and then propagating those changes through the logical model and to the physical database schema is the most effective method. The next best alternative is to make changes directly in the logical model and generate them into the physical schema.
The worst alternative is to make changes in the physical database itself, and then update the logical and conceptual models. The toolset in VEA supports all these alternatives, and they'll all be discussed in this chapter. First, we'll discuss managing changes in the modeling environment, and then changes in the physical schema.
Note |
Years of observation as a consultant have shown that even the most experienced designers are capable of making truly bone-headed decisions if they bypass the conceptual model when changing existing data structures. ORM conceptual models excel at exposing assumptions, and are easy to validate with users. Because changes usually come about as a result of newly discovered business requirements, changing the ORM model first makes a lot of sense. |
Use the facts and constraints listed in Table 16-1 to create a small ORM source model that will be used for the example. The first column shows the fact type as you should type it into the fact editor. The second column shows how to answer the question on the Constraints tab of the Fact Editor. Except where specified, only answer Constraint Question #1 on the tab. The third column shows the verbalization of the constraint. Compare the text in the third column to the output of the Verbalizer window to ensure that you have applied the proper constraints.
Fact Type |
Fact Editor Constraint |
Constraint Verbalization |
---|---|---|
Patient has Family Name |
Exactly One |
Each Patient has some FamilyName. |
Patient has PhoneNr |
Zero or One |
Each Patient has at most one PhoneNr. |
Patient was born in Country |
Zero or One |
Each Patient was born in at most one Country. |
Country has Country Name / Country Name belongs to Country |
Exactly One (question #1), Zero or One (question #2) |
Each Country has some CountryName. |
The graphical representation of your ORM model should look like Figure 16-1.
Assign the data types and lengths shown in Table 16-2 to the objects in your model. Save your model to a file called Patient_Simple_ORM.
Figure 16-1: Graphical ORM model.
Model Object Type |
MS SQL Server Physical Data Type |
---|---|
Country |
char(2) |
CountryName |
varchar(30) |
FamilyName |
varchar(30) |
Patient |
smallint |
PhoneNr |
varchar(14) |
Create an empty database model diagram and add Patient_Simple_ORM to the database project list. If necessary, refer to Chapter 7 for detailed instructions on building database projects with ORM source models. After building the database project you should have a small database model diagram that looks like Figure 16-2.
Figure 16-2: Logical Model.
Your column names may have slightly different names or be in a different order to that shown in Figure 16-2. Chapter 7 discusses controlling automatic column name generation from the ORM level, and Chapter 12 discusses renaming and re-ordering columns in the database model diagram. Save your database diagram model to a file called Patient_Simple_Build.
Caution |
If you change a table or column name, or reorder any columns in the database model diagram, you will be prompted to migrate the changes back to the ORM source model. See Figure 10-26 at the end of Chapter 10 for a screen shot of the migration prompt and a more detailed explanation. |
Generate a physical database schema from the database model diagram Patient_Simple_Build. Generating physical schemas is explained in Chapter 11. For this exercise, do not create the schema through a DDL script, but instead use an ODBC connection. This physical schema will be used later in the chapter to demonstrate change management, so create the physical schema before proceeding with the following example.
Fact Type |
Fact Editor Constraint |
Constraint Verbalization |
---|---|---|
Patient is allergic to Drug |
Zero or More (question #1); Zero or More (question #2) |
It is possible that some Patient is allergic to more than one Drug and that more than one Patient is allergic to some Drug. |
Figure 16-3: ORM model with new drug allergy fact.
Caution |
Under certain conditions, you may be prompted to migrate changes from the logical model to the ORM source model. As a general rule, answer No if this prompt appears while you are attempting to build a project. |
Migrating changes from a database model diagram to an ORM source model means altering the ORM source model to the consistent with the database model diagram.
Usually, you are building a project because you just made changes to the ORM source model, thus temporarily causing the ORM model to be inconsistent with the existing database model diagram. If you answer Yes to the migrate question, VEA will push information from the old database model diagram back into the ORM source model, thus undoing the ORM changes that you just made.
The general rule of answering No only applies while building a project. For instance, when saving a logical database model diagram that you have just modified by renaming columns or tables, or adding trigger code, it makes sense to migrate changes back to the source model.
Figure 16-4: Result of a project build after changing the ORM model.
Until now, the only changes you have migrated back from a database model diagram to an ORM source model have been name changes. However, you can also make actual structural changes in the database model diagram and migrate them back to the ORM model.
Note |
Making structural changes in the database model diagram and migrating them back to the ORM source model is like eating soup with a fork. You can do it, but it's messy. It is much better to make the changes at the conceptual ORM level and push them down into the database model diagram through a project build. Try to limit your changes in the database model diagram to things that can be done only at the logical level, like writing triggers and procedures, creating views, renaming tables etc. Even most column names can be controlled from the ORM name generation rules, although some columns must be explicitly renamed at the logical level. |
Generally, the bigger the change at the logical level, the uglier the migrated result will be in the conceptual model. The conceptual ORM level contains more semantic information than the logical level of the database model diagram. You can always go from a state of greater information to lesser information, as occurs during a project build. When going from the logical to the conceptual level, some of the information is simply not available.
For instance, ORM supports a verbalized predicate for every relationship, regardless of whether the objects involved will become tables or columns at the logical level. However, at the logical level, predicate information is available only for relationships between tables. Logical information that does not originate at the ORM level is thus tagged with ugly "placeholder" predicates.
ORM also provides more than one way to model certain situations, even though the end logical result is the same. This is a great advantage in creating understandable data models that can be validated with users. However, when pushing logical information up into ORM, Visio has to arbitrarily choose an ORM modeling technique, whether or not it is well suited to the particular situation.
The simplest way to understand these issues is to go through an example. Make a copy of the backup ORM source model that you created at the beginning of section 16.2. Do not use the model that you used in section 16.2, and do not use your actual backup model. Call the copy Copy_Patient_Simple_ORM.
Figure 16-5: ORM model after migrating a table from the logical level.
The new object type Patient_Drug_Allergy corresponds to the table you created in step three. The object type uses an external primary reference constraint, which is explained in section 5.2 of Chapter 5. Figure 16-5 and Figure 16-3 are conceptually equivalent, but Figure 16-3 is a more natural expression of the facts, and easier to interpret. You can make Figure 16-5 easier to read by putting meaningful words in the predicates of the new fact types.
Regardless of how you make changes to your logical database model diagram, those changes must be propagated into the physical database schema. For this example, we'll use the ORM source model and the generated database model diagram from section 16.2
You generated a physical in Section 16.2 prior to changing the length of the CountryName object, and prior to adding the drug allergy fact type. In this example, you'll use VEA's Update Wizard to update the physical schema with the changes you made to your model.
Figure 16-6: Update Wizard, page one.
Figure 16-7: Update Wizard, page two.
Figure 16-8: Database Connection Prompt.
Figure 16-9: Update wizard page three, no conflicts.
How can there be "no conflicts" when you know for a fact that the physical schema doesn't have the new Patient_Drug_Allergy table, and that CountryName in the physical schema is only 30 characters long instead of the new 44 characters ?
The message "no conflicts" does not refer to a comparison between the physical schema and the current model, but between the physical schema and an image of the last known state of the physical schema, which VEA extracted and stored in the current model during the initial physical schema generation. Because the physical schema has not been changed since the initial generation, it matches VEA's database image, and there is no conflict. This comparison will be explained further in section 16.6.
Figure 16-10: Ready to make changes to the physical schema.
Figure 16-11: Update wizard, page five.
VEA's "Refresh" feature allows modelers to capture changes that were made at the physical database level and automatically incorporate them into the model. You will find it more effective to change a physical schema by first changing the logical model and then propagating the change to the physical schema. As it is, people in even the most disciplined environments make physical schema changes that bypass the data model.
To show the refresh feature in action, we must first introduce some changes into the physical schema of which the logical model is unaware. Let's simulate a situation where somebody changed the length of one column, and added an entirely new column without telling the data modeler:
To refresh the database model diagram, follow these steps:
Figure 16-12: Refresh Wizard conflicts, tree collapsed .
Figure 16-13: Refresh Wizard conflicts, tree expanded.
Figure 16-14: Choosing to refresh the logical model.
Figure 16-15: About to update the model.
Figure 16-16: Refreshed database model diagram.
From the previous two sections, it may appear that the Update Wizard is designed solely to propagate model changes to the physical schema, while the Refresh Wizard is designed to propagate changes from the physical schema back to the logical model. Actually, the Update Wizard can handle bi-directional changes in a single operation. This operation is known as three-way synchronization .
One may reasonably ask, "If the wizard is synchronizing the physical schema and the logical model, why is it called ˜three-way synchronization' instead of ˜two-way synchronization ?" Perhaps a more accurate description would be "three way comparison, yielding two way synchronization". To reliably synchronize the model and the physical schema, VEA must compare three things: The previous state of the physical schema (which should match the previous state of the logical model), the current state of the logical model, and the current state of the physical schema.
Synchronization involves managing the transition of multiple objects from one state to another. As long as the physical schema cannot be changed through means other than the Update Wizard, things are straightforward. Assuming a newly created model, here is the sequence of steps:
However, the physical schema can be changed through other means. Let's insert a step 2.5 and assume that someone made changes to the physical schema using database administration tools. Now the physical schema is not in state A, but in an unknown state X.
If, in step four, the tool merely issues DDL based upon the transition from state A to state M, the results will be unpredictable. The DDL might work if state X is not too much different from state A, or the DDL may fail completely. Even if the DDL succeeds, the physical schema will still not be in state M, but in some new state that is a mix of state M and state X. Worse, the new mixed state may break applications that depend on the database. No one will know exactly what the new state is, so fixing theproblem will be very difficult. Reliable synchronization requires the model to comprehend the physical schema's full state just prior to synchronization.
The solution is for VEA to compare the physical schema's current state (X) with physical schema's previous state A, which is also the previous state of the logical model. Conflicts between state X and state A must be resolved by adjusting the physical schema, the model, or both. Only after all conflicts are resolved so that the model fully comprehends the physical schema can the tool issue the appropriate commands to reliably synchronize the schemas.
Figure 16-17 shows a graphical representation of the major steps.
Figure 16-17: Resolving conflicts and synchronizing Schemas.
In step A of Figure 16-17, the physical schema has information that is unknown to the logical model and vice versa. Upon conflict resolution in step B, the logical model knows everything about the physical schema, though the physical schema may not fully comprehend the logical model. With the conflicts resolved, the tool can issue the proper DDL to reliably synchronize the physical schema with the logical model (step C).
There is one big problem with this solution: The critical comparison of the previous physical schema and logical model state (A) and the current physical schema state (X) cannot be done because state A no longer exists. The physical schema is in state X and the model is in state M. Visio avoids this problem by storing an image of state A in the model immediately after it first generates the physical schema. Now that all the pieces are in place, here is the new sequence of steps:
1.5 |
VEA extracts an image of state A from the physical schema and stores the image in the model file. |
2.5 |
Someone else changes the physical schema. The physical schema is now in state X. |
3.5 |
The Update Wizard compares state A and state X and reports all conflicts. For each conflict, you choose to either update the model or to update the physical schema. The logical model now fully comprehends the state of the physical schema. |
3.6 |
Through its database driver and an ODBC DSN, the tool issues DDL commands to synchronize the physical schema with the logical model. |
3.7 |
As a final step, VEA automatically extracts an image of the new physical schema, and replaces the stored image of state A with this new image. VEA is now ready for the next time you use the Update Wizard. |
Now that you've slogged your way through the abstract example of three-way synchronization, let's see it in action. Since you just refreshed your model, it is in the same state (call it state A) as the physical schema. VEA has stored an image of state A in your model:
The physical schema is now in state X.
Figure 16-18: Database conflicts.
Figure 16-19: Refresh the model for the PhoneNr change of length.
Figure 16-20: Update the database with the Patient_Drug_Allergy table.
Figure 16-21: Update Wizard model update confirmation.
Figure 16-22: Update Wizard physical database update confirmation.
Note that you made changes only to the ORM source model and, yet the tool will issue commands to the database to actually create a table. The three-way synchronization feature recognized that the table Patient_Drug_Allergy had been dropped from the physical level. Instead of trying to change a non-existent object, the Wizard issues a CREATE TABLE statement. Click Finish and your physical schema will be synchronized with your logical model. Remember to save your logical model and answer Yes to the migration question.
When you create a physical schema from your logical model (see chapter 11), VEA automatically produces the DDL necessary for creating tables and views. However, most databases provide their own procedural language for writing functions and procedures. You can write stored procedures to enforce constraints that cannot be handled by key structures, or to enforce business rules.
Contrary to the advice of many web application design books and self proclaimed application design experts, executing business logic in stored procedures is often an excellent idea. Writing business logic in stored procedures does not violate n- tier architecture, and well written procedures will not become "entangled" with the database.
It is important to keep business logic in a separate layer, but the physical place where that logic resides is not a guiding principal of n-tier architecture. Instead of executing all business logic in a physically separate application tier that must constantly query and write to the database as it processes business logic, it often makes sense write those rules into procedures right on the database. The application server merely supplies the correct arguments while invoking the procedure. All the heavy lifting is done by code that has local access to the data in a secure environment on a very powerful machine.
Even if you don't buy into the concept of using stored procedures to enforce business rules, over time you will end up writing significant amounts of code for things like archive triggers, complex views and the like. The VEA Synchronization and Transfer features can help you manage your code.
Thankfully code synchronization has nothing to do with three-way synchronization, so the explanation will be much simpler, and it won't require any diagrams.
Many organizations tightly manage the database code written by their database administrators and developers. In these workplaces, any database code must first be put into a script that follows standard formats and tested for accuracy. Once the script is known to work, it is stored under source code control. Often, the basic DDL is separated from the trigger and procedure code.
This extra management work may sound bureaucratic, but it often makes a lot of sense. Moving among environments (development, test, and production) is much easier and less error prone if all the objects are scripted first. Restoring after a crash or setting up another environment is much easier if different versions of the scripts are clearly identifiable. As a side effect of these policies, you often have to provide trigger and procedure scripts in separate files that can be managed apart from the main data model.
The need to keep your code in separate files can lead to a quandary : On the one hand, your database code needs to reside in separate files, and on the other hand, the most natural place to store and work on your code is in the modeling tool. Code synchronization and mirror files and solve this issue.
A mirror file is a plain-text file that resides outside your model. You associate the mirror file with database code objects in VEA. Once the mirror file and database code object are associated, VEA will synchronize the mirror file and database code object on demand. The synchronization can go in either direction, file to code object, or code object to file. Let's take a look at an example:
Figure 16-23: Invoking the code editor.
Figure 16-24: Code editor properties pane.
Figure 16-25: Code synchronization window.
Figure 16-26: Code synchronization window after sync operation.
Use this feature if you must have the same data model support more than one server. In most work situations, this single model, multiple back end scenario would be extremely rare.
Code objects are tied to the default VEA database driver. If you change from say, the MS SQL Server drive to the Oracle Server driver, your code objects will "disappear." The code objects will still be in the model, they are just not available with the alternate driver. This makes sense, because Oracle PL/SQL code is significantly different from TSQL.
VEA gives you the option to selectively copy (or move if you so desire ), code objects from one environment to another. Let's say you are re-hosting an Oracle database onto MS SQL Server. You might start by reverse engineering the existing database and all the procedure and trigger code. You would then switch to the MS SQL Server driver to adapt the tables. However, once you switched drivers, you would notice that all the trigger and procedure code stayed with the Oracle driver, and is not available to you.
Using Code Transfer, you can copy the Oracle code into the MS SQL Server version of the model. From that point, you can keep the code separate and work on converting the Oracle code to SQL Server. Fortunately, the Code Transfer feature is a lot easier to use than it is to explain, so let's take a look:
Figure 16-27: Changing Drivers to Oracle.
Figure 16-28: Code transfer window.
Note |
Remember three important points when using Code Transfer:
|
It's time to dig out the DDL script that you saved in step seven of section 16.4. As you recall, you generated that script the first time you used the Update Wizard to propagate a change from the logical model to the physical schema. The script modified the existing country table by lengthening the column from 30 to 44 characters . If you can't find the script on your computer, don't despair, because the salient portion is reproduced in Figure 16-29. To save space, the "go" command and extra blank lines between each SQL statement have been removed.
Figure 16-29: Fragment of a change script.
The purpose of this script is to modify a column in an existing table. Some database engines allow a column to be lengthened by a simple ALTER TABLE statement, but some engines require that the table be dropped and rebuilt with the new desired column length. The VEA tool is not tied to a specific database engine, so the designers have chosen to have the script drop and re-create the table, which will work with any database server. The VEA generated script is designed to preserve any data in the existing table.
Lines one and two of the script drop the foreign key constraint in the Patient table that points to Country Lines four through eleven of the script create a backup table called Country_IMO and insert the rows from the current table into the backup table.
Line 13 drops the current table. The DROP TABLE statement would have failed if the foreign key were not removed from the Patient table in lines one and two. Lines 15 through 22 create a new Country table with the correct data length for CountryName, and insert the rows from the backup table. Line 24 drops the backup table because it is no longer needed. Finally, the foreign key in the Patient table is restored in line 31.
VEA's automatic data migration can save you a lot of work, but there are some important caveats to remember:
With the exception of the last section, this chapter has focused on making updates via a direct connection to the database, instead of running DDL scripts. Using a direct connection can save you a lot of time, but remember: Speed Kills!
There are many cases where using a script is preferable to an automated update. First among these cases are any changes to a production database. Using any type of automated tool that issues DDL to modify the schema of a production database is very risky. Only software salesman and outplacement consultants trying to drum up new business will tell you differently.
Always use a script that you have thoroughly tested to modify a production database. It goes without saying that you do the modifications during a planned time when the database will be unavailable to users. You should also consider writing a "back out" script that will restore the database to its original state if something goes wrong with your changes.
Modifications to Production are not the only concern. Even changes in the Development environment can benefit from scripting. While you are making the actual modifications to the physical schema in the Development environment, software coding has to stop. For most projects, this means two to ten people will not work during the changes. A well executed schema change should take only minutes, but problems can turn those minutes into hours. Ten people wasting half a day is an expensive proposition. Using a script that you have thoroughly tested is a good way to guarantee that the changes go smoothly and quickly.
VEA can generate new database scripts without ever connecting a database, but to generate scripts for modifying a database, VEA must connect to a server. On the surface, it appears that you cannot benefit from VEA's Update Wizard technology without letting the tool update your database automatically.
Actually, there's great news. It's quite easy to set up things so that the Update Wizard generates all your change scripts, but you still execute the scripts yourself after testing. All you need in MS SQL Server terms is two small databases, called "Model" and "Scratch." The equivalent Oracle term for what you need is "account."
Once you have set up your new databases (or accounts), follow these steps:
These steps may seem somewhat involved, but the actual execution is very smooth and quick. The biggest advantages are:
Overview of Database Modeling and the Database Modeling Tool
The Conceptual Modeling Solution (ORM)
The Logical Modeling Solution (ER and Relational)
Managing Database Projects