EntityRelationship Model Validation

Team-Fly    

 
Requirements Analysis: From Business Views to Architecture
By David C. Hay
Table of Contents
Chapter 3.  Column One: Data


Entity/Relationship Model Validation

When delivering a data model, it is important to validate it. This happens from two directions. First, someone should proofread it to be sure that it follows the modeling rules and conventions. This means first making sure that the syntax is correct, and then making sure that the positional conventions were followed. Then it means making sure that industry standard conventions for representing standard situations were at least consulted. There should be an explicit reason for each departure from the semantic standard.

Second, it should be reviewed with subject matter experts to ensure that the assertions made in the model are in fact true in the business.

Finally, it should be tested for intellectual rigor. Is it consistent and coherent ? This is trickier to do.

John Sharp has in fact devised a way to validate the coherence and correctness of a model systematically, using sample data and focusing on the use of unique identifiers. He has presented it in numerous articles for the Journal of Conceptual Modeling (on the World Wide Web at: http://www.inconcept.com/JCM ) .This involves setting up a table of attributes and populating it with sample data, varying one attribute at a time.

In an example from a course that he teaches, Figure 3.39 shows the Barker version of a model presented in the Federal IDEF1X standard [FIPS, 1993]. Some of the entity types have been slightly renamed for clarity. Dr. Sharp uses this to demonstrate his technique.

Figure 3.39. Validation Exercise1.

graphics/03fig39.gif

Note that the unique identifiers for each entity type are shown in the diagram as a combination of attributes (denoted by the octothorpe symbol, #) and relationships (denoted by a line across the relationship). Table 3.17 shows the identifiers of each entity type.

Table 3.17. Unique Identifiers

Entity type

Attribute

Relationship

PERSON

"ID"

 

SEMESTER

"ID"

 

COURSE

"Course Number"

 

COURSE OFFERING

 

during SEMESTER

   

occurrence of COURSE

CLASS SECTION

"Section Number"

 
   

of COURSE OFFERING

CLASS ENROLLMENT

 

of PERSON

   

in CLASS SECTION

Dr. Sharp's technique begins by expressing a sentence that represents an occurrence of the entity type. In this example, such a sentence might be "Sally ( PERSON 234) received an A in Section 2 of Geology 101 (GE101) during the Fall semester of 1999 (F99)." This is represented by the first row of Table 3.18.

Table 3.18. Validation Table

Semester ID

Course ID

Section Number

Person ID

Grade

Expert

Model

F99

GE101

2

234

A

   

(another)

GE101

2

234

A

Y

Y

F99

(another)

2

234

A

Y

Y

F99

GE101

(another)

234

A

N

Y

F99

GE101

2

(another)

A

Y

Y

F99

GE101

2

234

(another)

N

N

The relation is arranged so that the attributes represent the predicates of a particular entity type. (Predicate, recall, means either an attribute or a relationship to another entity type.)

The rows are arranged to ask, in each case, if a particular predicate can have any other value for the predicate than the one used in the base sentence. For example, in Row Two, is it possible that Sally ( PERSON 234) could receive an A in Section 2 of Geology 101 (GE101) during another semester ?

At the right are two columns , indicating in one case the answer by an expert in the area under discussion, and in the other case the answer dictated by the model. If the answers don't match, the model must be wrong.

In the model, since "Semester ID" is part of the unique identifier of COURSE OFFERING, A COURSE OFFERING , can be for as many SEMESTERS as we want. The expert also says that, at least under current rules, a PERSON can take the same SECTION of the same COURSE more than once. (Note that the rules may change.) Similarly, the model and the expert agree, as they do in all the rows but onethe fourth.

In the fourth row, the question is, can Sally ( PERSON 234) receive an A in another section of Geology 101 (GE101) during the Fall semester of 1999? It is highly unlikely that she is taking two sections at once, but this is not prevented by the model. In this case, the model says that, since SECTION is part of the identifier of "grade" (in GRADED ENROLLMENT) , then it is possible to take as many sections of a course as you want, and there is nothing to prevent the sections from being in the same semester. However, the expert points out that this is not possible. You can take only one section of a course at a time. Hence, the model is wrong, or it is at least missing a business rule.

The corrected model is shown in Figure 3.40. Note that CLASS ENROLLMENT is now identified only by COURSE OFFERING , not by CLASS SECTION. It is still a fact that the CLASS ENROLLMENT is in a particular CLASS SECTION , but now a given occurrence of CLASS ENROLLMENT in a COURSE OFFERING may be in only one CLASS SECTION. Note that a business rule must be still be specified: Since each CLASS ENROLLMENT must be in a CLASS SECTION that must be of a COURSE OFFERING, and each CLASS ENROLLMENT must be in a COURSE OFFERING directly, the CLASS OFFERING for a single CLASS ENROLLMENT must be the same for both paths.

Figure 3.40. Validation Exercise2.

graphics/03fig40.gif

This validation procedure is a very important way to verify that your model says exactly what you think it does. In addition, it identifies business rules that may not be able to be portrayed in the model at all. Note that this validation cannot be done with an object model, since object models do not specify which attributes and relationships are unique identifiers.


Team-Fly    
Top
 


Requirements Analysis. From Business Views to Architecture
Requirements Analysis: From Business Views to Architecture
ISBN: 0132762005
EAN: 2147483647
Year: 2001
Pages: 129
Authors: David C. Hay

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net