9.6 Structure-Level Remedies

A number of data inaccuracy issues come out of structure analysis. Some of these are indictments of the data source and should lead to remedies in the source systems. Others can be handled when moving data.

Fixing Source Systems

Structural issues in source systems generally do not cause problems with their applications. They merely invite inaccurate data to be entered without detection. The common remedy is to add checks on data entry, update, and delete operations so that new data cannot violate the rules. This is generally easy to do with relational database systems, but more difficult with others.
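As a rough illustration of such checks in a relational system, the sketch below declares primary key and foreign key constraints and shows the database rejecting an insert, a duplicate key, and a delete that would break the rules. It uses Python's standard sqlite3 module; the customer and orders tables and the try_insert helper are hypothetical names chosen for the example, not anything from a particular source system.

```python
import sqlite3

def build_demo_db() -> sqlite3.Connection:
    """Create a hypothetical two-table schema with declared constraints."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off unless it is requested.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customer ("
        " customer_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
    )
    conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
    return conn

def try_statement(conn: sqlite3.Connection, sql: str, params: tuple) -> bool:
    """Run one statement; return False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

conn = build_demo_db()
# Entry that satisfies the rules.
ok_order = try_statement(conn, "INSERT INTO orders VALUES (?, ?)", (100, 1))
# Entry that references a customer that does not exist.
orphan_order = try_statement(conn, "INSERT INTO orders VALUES (?, ?)", (101, 99))
# Entry that reuses an existing primary key value.
duplicate_key = try_statement(conn, "INSERT INTO customer VALUES (?, ?)", (1, "Dup"))
# Delete that would leave order 100 without a parent.
delete_parent = try_statement(conn, "DELETE FROM customer WHERE customer_id = ?", (1,))
```

Only the first statement is accepted. The other three fail with an integrity error, which is the detection the application alone would not have provided.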

start sidebar

A common story involving relational systems is one in which the original application developers specified extensive database functionality to enforce structural constraints, only to have the database administrators remove the definitions because of performance problems in processing transactions or in performing loads. Additionally, if primary key/foreign key synonym chains get too long and involved, the constraints must be turned off for the load and all data checked after the load completes. The check step is often left out. This means that it is not sufficient to specify the constraints; it is also important to ensure that they are enforced all of the time.

end sidebar
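The load scenario in the sidebar can be sketched as follows: enforcement is switched off for the load, and the check step that is so often skipped is run explicitly afterward. This is a minimal sketch using Python's sqlite3 module and SQLite's foreign_key_check pragma; other database systems have their own mechanisms. The dept and emp tables and the load_then_check function are hypothetical.

```python
import sqlite3

def load_then_check(rows):
    """Bulk load with enforcement off, then report foreign key violations."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE emp ("
        " emp_id INTEGER PRIMARY KEY,"
        " dept_id INTEGER REFERENCES dept(dept_id))"
    )
    conn.execute("INSERT INTO dept VALUES (10)")
    conn.commit()

    # Enforcement is disabled for the duration of the load.
    conn.execute("PRAGMA foreign_keys = OFF")
    conn.executemany("INSERT INTO emp VALUES (?, ?)", rows)
    conn.commit()

    # The post-load check: each result row is
    # (child table, rowid, parent table, constraint index).
    violations = conn.execute("PRAGMA foreign_key_check").fetchall()

    # Enforcement goes back on for normal processing.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.close()
    return [(table, rowid, parent) for table, rowid, parent, _fk in violations]

# Employee 2 points at department 99, which does not exist.
violations = load_then_check([(1, 10), (2, 99)])
```

Here the load itself succeeds silently, and only the explicit check reveals that employee 2 has no matching department.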

Most violations uncovered by structure analysis trace back to poor original database design. It is very difficult to get an IT department to want to change the structure of a database that is in operation. The cascading effects can be large, creating an expensive project with a high degree of risk. When changes cannot be made, it is doubly important to document the structure of the source systems to help fend off errors in using the data.

You can always add batch checkers to look for rule violations outside transaction processing. Often this is a good compromise between tearing up a running system and doing nothing.
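A batch checker of this kind might look like the sketch below, which scans extracted rows for two structure rule violations: duplicate primary key values and foreign key values with no matching parent. The function names and the sample invoice data are hypothetical, and the rows are assumed to have already been read from the source into Python dictionaries.

```python
from collections import Counter

def find_duplicate_keys(rows, key):
    """Return key values that appear on more than one row."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)

def find_orphans(child_rows, fk, parent_rows, pk):
    """Return child rows whose non-null foreign key matches no parent key."""
    parent_keys = {row[pk] for row in parent_rows}
    return [
        row for row in child_rows
        if row[fk] is not None and row[fk] not in parent_keys
    ]

# Hypothetical extract from a running system.
customers = [{"id": 1}, {"id": 2}, {"id": 2}]
invoices = [
    {"invoice": "A", "customer_id": 1},
    {"invoice": "B", "customer_id": 7},
    {"invoice": "C", "customer_id": None},
]

duplicates = find_duplicate_keys(customers, "id")
orphans = find_orphans(invoices, "customer_id", customers, "id")
```

Because the checker only reads the data, it can run on a schedule outside transaction processing without touching the application.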

Issues Involved in Moving Data

The most interesting quality issues come up for projects that try to extract the data from source systems and move it to new, different target systems. The mismatches on structure issues can make or break a project. Assuming you identify the source structure properly and map it correctly to target structures, there can still be issues left to deal with.

One issue is structural mismatches between source and target regarding primary keys. It may be that the data just cannot make the trip to the other side because it lacks the structural compatibility to find its proper place at the target. When this occurs you have to consider either scrapping the project or changing the target design. Sometimes you have no choice about going ahead with a project, and some creative design is required to fix the problems. A profile of the source system's structure at the level of detail described in this chapter is the ideal input for coming up with a solution.

Another area for remedies is to ferret out and deal with data inaccuracies when moving data. Ideally, you would want to fix the source systems. However, when this is not possible or will take a long time to complete, screening data for structure rule violations as it is being moved will at least keep junk from getting into target systems.
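Screening during the move can be sketched as a filter that passes clean rows on to the target and sets aside violators together with the reason for rejection. The screen_rows function, its parameters, and the sample rows are hypothetical; a real data movement process would apply whatever structure rules were established for the source during profiling.

```python
def screen_rows(rows, key, fk, parent_keys):
    """Split rows into those safe to load and those violating structure rules."""
    seen = set()
    accepted, rejected = [], []
    for row in rows:
        if row.get(key) is None:
            rejected.append((row, "missing primary key"))
        elif row[key] in seen:
            rejected.append((row, "duplicate primary key"))
        elif row.get(fk) not in parent_keys:
            rejected.append((row, "orphaned foreign key"))
        else:
            seen.add(row[key])
            accepted.append(row)
    return accepted, rejected

# Hypothetical rows on their way to a target system.
source_rows = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 1, "customer_id": 10},    # repeats order_id 1
    {"order_id": 2, "customer_id": 55},    # customer 55 is unknown
    {"order_id": None, "customer_id": 10}, # no key at all
    {"order_id": 3, "customer_id": 11},
]

accepted, rejected = screen_rows(
    source_rows, key="order_id", fk="customer_id", parent_keys={10, 11}
)
reasons = [reason for _row, reason in rejected]
```

Keeping the rejected rows and their reasons, rather than silently dropping them, gives the source system's owners a concrete list of inaccuracies to investigate.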

A final benefit of structure analysis lies in designing proper data movement processes. The logic discussed for data model development is the basis for building a proper series of data transformation and merge steps that will result in a correct target. Trying to build a design for data movement without this road map just invites errors.



Data Quality: The Accuracy Dimension (The Morgan Kaufmann Series in Data Management Systems)
ISBN: 1558608915
Year: 2003
Pages: 133
Authors: Jack E. Olson