Graphically Oriented Approaches | Semantics in Business Systems: The Savvy Managers Guide (The Savvy Managers Guides)

With graphics, the first question is, "Do we adopt something someone else has built, or start from scratch?" The argument in favor of using an existing graphic depiction is that people will be familiar with it. The argument against is that, precisely because they are familiar with it, we allow them to confuse the semantic with the nonsemantic. Graphic notations by their nature emphasize some aspects of a problem and deemphasize others. For example, as we will see, Unified Modeling Language (UML) graphics emphasize the class structure, which suits us if we are preparing to implement an object-oriented system. Entity relationship models emphasize the persistent stored data and help shepherd it through the normalization process.

Requirements of a Semantic Graphic Model

Many graphic tools and modeling approaches are currently available. We discuss each of the major ones in this chapter. However, none of them is really adequate for the problem. This section outlines the requirements of a notation approach and tool, as a point of reference for the following tool evaluation.

Many of the notations currently available either don't scale to industrial-sized problems or lack features that make the models easy to use. A graphic semantic model editor would need to support the following:

Classes and instances—One of the most difficult requirements is the need to represent what is traditionally "instance level" data (specific instantiated objects), as well as class or schema level data. The reason we need one tool and graphic representation to represent both is that increasingly we will find that a model will mix and match both levels. Classes or templates at one level will be instances at another. Any approach that doesn't allow for this will require us to shift models and representations at points where we don't want to.
Semantic primitives and static types—We need some way to indicate which semantic primitive type each instance, class, or type belongs to. We posit that there are more than 20 and fewer than 100 semantic primitives, and every other instance or class "is" one of them. Further, they are primitive enough that there is no temptation to include the capability for an instance to change its type over its lifetime. To put it another way, there are a limited number of fundamentally semantically different base classes. There will probably be too many of them to distinguish by color or shape, so we will probably have to rely on some sort of labeling.
Dynamic types or classes—There are far, far more dynamic classes. These include state, status, and temporal change or behavior. For example, a request for materials may become a requisition, may later (dynamically) become an open order, and may eventually become a closed order. Its properties change as it morphs through these transitions. We need to be able to indicate at an instance level which dynamic types an object is. It is not as important that we be able to trace from the type to the instances. Note that an instance can be many dynamic types at once, and it might have been any particular type in its lifetime (we might want to show this). A typical application will have hundreds of dynamic types, and some will have thousands. Each one will correspond to a "type" in business rules.
Properties—For any instance (or class, dynamic type, or template) we need to be able to indicate what properties it can take on, and potentially what its current properties are. Each property needs to have its name, value, and type. Most instances will have few properties and many (3 to 12) relationships (which are a special form of property).
Relationships—We need to show what relationships are allowed (at the template level), including cardinality, and in some cases we may need to show what relationships are instantiated. We also need to indicate what "kind" of relationship it is (containment, assessment, reference, etc.) and indicate whether versioning or effectivity is in effect.
Templates—A template is a semantic primitive. It is a special kind of generative rule. It creates new instances or constellations of instances. It can be created by other templates. We need to be able to indicate what static type of instances this template creates, as well as what dynamic types this can or does give them. We will have hundreds of templates. Many won't need to be shown in a diagram because they won't be interestingly different from their peers.
Generating properties on templates—A generating property on a template is one that executes to create new data for the instance that the template is creating. We have to be able to distinguish between the properties of the template and those that it engenders in its offspring (analogous to class variable and instance variable in object-oriented systems). We need to be able to show the concept of default, and whether it is a lazy default or an ambitious one. (This has bearing if we later disconnect the item from the dynamic type.)
Rules—We need a way to indicate rules, as well as which instances or types they govern. (Note that in most cases it has to be at a higher level of abstraction than instance.) A typical system will have thousands of rules, so we need ways to show and hide them, and perhaps have ways to show just one type of rule at a time, or centered on one object.
Behavior—We need a way to indicate what an instance or type can "do." This is analogous to methods in object-oriented systems, but they are dynamic and can be changed at the instance level. Behavior will be reified and represented as data in the model.
Volumes, subsetting, and navigation—The most important issue is that most models will consist of tens of thousands of things and relationships that need to be represented. There is nothing currently available that will deal well with that level of complexity on a single diagram.
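
To make a few of these requirements concrete, here is a minimal sketch in Python of a structure that holds classes and instances in one representation, with a fixed static primitive type, changeable dynamic types, properties, and typed relationships. It is an illustration only, not a notation or tool proposed in this chapter. The names Node, Relationship, and become, and the primitive labels "Agreement" and "Party," are assumptions invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    kind: str        # e.g. "containment" or "reference"
    target: "Node"


@dataclass
class Node:
    """One structure for both classes and instances (hypothetical design)."""
    name: str
    primitive: str   # static semantic primitive; never changed after creation
    dynamic_types: set = field(default_factory=set)   # types it is right now
    past_types: list = field(default_factory=list)    # types it has left behind
    properties: dict = field(default_factory=dict)    # name -> (value, type)
    relationships: list = field(default_factory=list)
    instance_of: "Node | None" = None   # a class at one level is a Node too

    def become(self, new_type, replacing=None):
        """Add a dynamic type, optionally retiring an earlier one."""
        if replacing is not None and replacing in self.dynamic_types:
            self.dynamic_types.remove(replacing)
            self.past_types.append(replacing)
        self.dynamic_types.add(new_type)


# The request-for-materials example: one instance, several dynamic types over time.
request_class = Node("MaterialsRequest", primitive="Agreement")
supplier = Node("supplier-3", primitive="Party")
req = Node("request-17", primitive="Agreement", instance_of=request_class)
req.properties["quantity"] = (40, "integer")
req.relationships.append(Relationship("reference", supplier))

req.become("Requisition")
req.become("OpenOrder", replacing="Requisition")
req.become("ClosedOrder", replacing="OpenOrder")
```

Because request_class is itself a Node, it could in turn be an instance of something else, which is the mixing of levels the first requirement asks for. The sketch leaves out templates, rules, and behavior entirely.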

To build these new modeling tools and approaches, we will need to borrow from other disciplines. Cartography deals with a great deal of complexity by rendering special-purpose views. Cartographers have the advantage of one fixed view (the physical land structure) against which many other values can be juxtaposed. Tim Bray (one of the codevelopers of XML) has started a company, called Antarctica,^[70] to build content maps that are analogous to geographic maps.

We may need to borrow from the biosciences, which have done incredible work in the area of visualizing complex protein structures as a way to understand and predict biologic behavior. Most of our systems will not be as complex as organic molecules, but the biosciences do have the advantage of modeling a physical system that they can test things against. Our system models are not models of physical structures.

We may need to adopt some techniques such as hyperbolic trees to help us with information density. More broadly, to show complex temporal relationships, we will be greatly aided by animation and three-dimensional visualization. One of the advantages of three-dimensional representation is that it allows one more way to hold things in proximity, which is one of the things we want to do when we model. But no matter what we do, we are going to have to find ways for defining and using relevant subsets of the model.
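
One simple way to define a relevant subset, offered here only as an assumption about what such a mechanism might look like, is to keep everything within a fixed number of relationships of a chosen focus object. The sketch below does this with a breadth-first search in Python. The function name neighborhood and the sample edge list are hypothetical.

```python
from collections import deque


def neighborhood(edges, focus, max_hops):
    """Return the set of nodes within max_hops relationships of focus."""
    adjacency = {}
    for a, b in edges:
        # Relationships are treated as undirected for navigation purposes.
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    distance = {focus: 0}
    queue = deque([focus])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return set(distance)


# Hypothetical model fragment, given as pairs of related things.
edges = [
    ("Customer", "Order"),
    ("Order", "LineItem"),
    ("LineItem", "Product"),
    ("Product", "Warehouse"),
]
view = neighborhood(edges, "Order", 1)
```

With a limit of one hop, the view contains Order, Customer, and LineItem, and leaves Product and Warehouse off the diagram. A real tool would presumably also filter by relationship kind or by type, which this sketch does not attempt.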

Using Entity Relationship Modeling to Model Semantics

In entity relationship (ER) modeling, we would define the key entities in the problem domain. Figure 10.11, a simplified ER model of the Swetsville sales system, shows Sculpture, Location, and Customer and the relationships that would join them: ItemLocation, SalesOrder, and SalesLineItem. Each of the entities would have attributes as shown.

Figure 10.11: Relational version of sales order.

Each of the boxes would eventually become a table. The terms in the gray areas are the entity names. The rest are the attributes of each entity, which will become columns in the tables. Lines represent where foreign keys will establish relationships. In other words, the way we know that an order is going to a particular customer is by "joining" the SalesOrder table to the Customer table via the Customer/CustomerID relation. If things go well, for any given order there will be exactly one matching customer, and therefore one delivery address.
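
The join described above can be sketched with Python's built-in sqlite3 module. Since the figure is not reproduced here, the column names other than Customer and CustomerID (OrderID, Name, DeliveryAddress) and the sample rows are assumptions made for the example, not the actual Swetsville design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (
        CustomerID      INTEGER PRIMARY KEY,
        Name            TEXT,
        DeliveryAddress TEXT
    );
    CREATE TABLE SalesOrder (
        OrderID  INTEGER PRIMARY KEY,
        -- Foreign key: the line drawn between the two boxes in the diagram.
        Customer INTEGER REFERENCES Customer(CustomerID)
    );
    INSERT INTO Customer VALUES (1, 'A. Buyer', '12 Example Street');
    INSERT INTO SalesOrder VALUES (100, 1);
""")

# Join SalesOrder to Customer to find where order 100 should be delivered.
rows = conn.execute("""
    SELECT o.OrderID, c.Name, c.DeliveryAddress
    FROM SalesOrder AS o
    JOIN Customer AS c ON o.Customer = c.CustomerID
    WHERE o.OrderID = ?
""", (100,)).fetchall()
```

Here the query returns a single row for order 100, which is the "exactly one matching customer" case. Nothing in the join itself guarantees that outcome; it depends on the key constraints and on the data being clean.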

Note that most of the design choices here are completely arbitrary. There is a body of knowledge, and there are many patterns for better designs, but nothing in the methodology itself steers us toward a better design or away from a worse one.

Also, to be fair, the more advanced practitioners of ER, and especially of Extended or Enhanced Entity Relationship (EER) modeling, produce designs that look very much like object-oriented designs, but they eventually have to reduce them to tables and rows. Most relational modelers that I have come in contact with in industry go straight to a relational design, as close as possible to the target table design.

^[70]See http://antarctica.net/ for further information.