We have some idea of where our sources are; let's now look briefly at project planning issues. The best way to think about these issues is in relation to the development activities. Figure 9.1 shows the key tasks in a traditional waterfall-style application development project. I will describe where semantic elicitation fits in each of the major stages.
Figure 9.1: Typical tasks in an application development and implementation project (waterfall method).
The most powerful time to elicit semantic information is when you are doing your architectural planning, in particular the application architecture. Information uncovered at this stage has a profound impact on decisions regarding the size and boundaries of applications. As we will discuss in Chapter 13, it will have a major impact on the types of reusable services you implement later on.
It is typical for semantic analysis done at the time of architectural planning to uncover shareable services and major areas of overlap between existing systems. These opportunities can lead to consolidation of applications and reuse of services, both of which save considerable development and maintenance effort.
Another good time for semantic discovery is in the project definition and scope phase. This occurs when the projects are being approved, and there is considerable involvement with users and analysts.
Another excellent time to elicit semantic meaning is during the requirements gathering phase, for two reasons: the requirements will be expressed in terms of semantics, and you still have a great deal of leverage at this point in the project. At the requirements phase, you are still refining the scope of the system and still determining what is important and what is going to be done.
Semantics help the requirements process considerably. Figure 9.2 is an excerpt from a requirements document for a psychiatric hospital system. The paragraph shown in Figure 9.2, which is just one of hundreds, is rife with semantics. Some of the semantic entities are straightforward, such as physician, county, and admission, but some are not. For example, unless you had considerable background in the health insurance industry, it would take some interrogation before you would realize that a "utilization review" is a form of authorization. Semantic analysis is consistent with and will clarify other requirements gathering approaches such as functional decomposition, use case design, scenario-based design, and quality function deployment.
The proposed system must capture and store the clinical data included on the hospital's Admission Notification form (Exhibit I). The required data includes, but is not limited to the following:
Date and time of admission
Admitting diagnosis
Type of admission (e.g., civil, involuntary, voluntary)
Court commitment (e.g., 72 hour)
Identification that patient is in seclusion or under restraint
Admitting county and Certified Mental Health Professional
Admitting Physician/institution
Identification that patients rights have been read
Utilization review requirements, including authorized days and authorization number
Indication that the liability notice, release and Assignment form has been signed.
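One way to record what the interrogation of such a paragraph turns up is a small glossary of "is a kind of" links. The sketch below is only an illustration, not a prescribed method: the names `IS_A`, `broader_concepts`, and `is_kind_of` are hypothetical, and only the link from "utilization review" to "authorization" comes from the discussion above. The other links are assumptions made for the example.

```python
# Hypothetical is-a links for terms in the requirements excerpt.
# Only "utilization review" -> "authorization" is stated in the text;
# the other links are illustrative assumptions.
IS_A = {
    "utilization review": "authorization",
    "admitting physician": "physician",
    "physician": "person",
    "admitting county": "county",
}


def broader_concepts(term, is_a=IS_A):
    """Return the chain of broader concepts for a term, nearest first."""
    chain = []
    seen = {term}
    while term in is_a:
        term = is_a[term]
        if term in seen:  # guard against accidental cycles in the glossary
            break
        seen.add(term)
        chain.append(term)
    return chain


def is_kind_of(term, concept, is_a=IS_A):
    """True if `term` is `concept` or one of its narrower terms."""
    return term == concept or concept in broader_concepts(term, is_a)
```

With a glossary like this, a reviewer can ask whether "utilization review" is a kind of "authorization" and get an explicit answer, rather than relying on health insurance background that the reader may not have.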
Most projects have a phase where they do most of the packaging of logic into buildable modules and do the logical design of the databases. This is another productive time to introduce semantics into the project. In some ways the best time is when some work has been done on the conceptual design, but not so much as to create a commitment to the design. Once people start defining and designing screens and reports, they often are reluctant to entertain changes, even if they are constructive (see sidebar).
We worked on a system for a federal agency. A General Accounting Office (GAO) report had revealed that the agency was losing, by a conservative estimate, $1 million per day in interest and receipts of money due. We had worked on a similar state system and were called in to assist with the design of a new system to address the problems. After a bit of looking around we discovered that one of the problems the agency had focused on was how to avoid a situation that had occurred in the past: large (multimillion-dollar) checks getting lost while they were being processed. (There was a complex process to reconcile these checks and make sure they were for the right amount.) We suggested that they photocopy the check, put it in the bank, and do the reconciliation to the copy.
Unfortunately, this had not occurred to any agency personnel earlier, and they were deeply into the conceptual design of what they called the "report check tracking system." This was a system that was intended to keep track of the movement of a check from desk to desk as it was being processed, such that it wouldn't get lost. We might have been able to dissuade them from this, except that for the 2 years they had been working on this problem, this was the only area where they had made any real progress (the substantive problems, as you might imagine, were elsewhere).
At conceptual design time, designers will often have designed a great deal of variety into their models. For example, a recent design we worked with had entities and tables for licenses, permits, and apprenticeships. As we semantically modeled these we found that they were far more alike than different, and therefore we created an abstract entity (called permission) that subsumed the three. Although there were still some views for setting up the differences between these entities and tables, most of the other processes that dealt with them were reduced in number and complexity by two thirds.
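The consolidation described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the design we actually delivered: the attributes (`holder`, `issued_on`, `expires_on`) and the names `PermissionKind` and `active_permissions` are hypothetical, chosen only to show one process serving all three kinds of permission.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class PermissionKind(Enum):
    LICENSE = "license"
    PERMIT = "permit"
    APPRENTICESHIP = "apprenticeship"


@dataclass
class Permission:
    """Abstract entity subsuming licenses, permits, and apprenticeships."""
    kind: PermissionKind
    holder: str        # assumed attribute, for illustration only
    issued_on: date    # assumed attribute
    expires_on: date   # assumed attribute

    def is_active(self, on: date) -> bool:
        return self.issued_on <= on <= self.expires_on


def active_permissions(permissions, on):
    """One process for all kinds, instead of one per entity."""
    return [p for p in permissions if p.is_active(on)]
```

The point of the sketch is that `active_permissions` is written once. With three separate entities, the same logic would typically be written three times and maintained three times.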
This hints at how great the payoff can be from semantic modeling. Even greater payoffs were hinted at in the chapter on metadata. In most cases we've found that semantic investigation has led to the discovery of areas to which we could apply metadata design. In one case, a client had built 24 fairly similar "modules," each for a disease-specific health care outcome. This approach, besides requiring new development every time they wanted to introduce a new disease protocol, prevented any combined reporting, any comorbid analysis, or any modification to the modules. After some analysis it became apparent that a single, more generalized "instrument" could be designed in which the protocol and the questions included in each instrument were metadata.
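To make the idea of a generalized instrument concrete, here is a minimal sketch in which protocols and questions are plain data. Everything in it is hypothetical: the question ids, wording, value ranges, and the two protocol names are invented for illustration and do not come from the client's system.

```python
# All question ids, wording, ranges, and protocol names are hypothetical.
QUESTIONS = {
    "pain_level": {"text": "Rate your pain today (0-10).", "min": 0, "max": 10},
    "sleep_hours": {"text": "Hours slept last night?", "min": 0, "max": 24},
    "inhaler_uses": {"text": "Rescue inhaler uses this week?", "min": 0, "max": 100},
}

# A protocol is just an ordered list of question ids, so adding a disease
# means adding a row of metadata rather than building a new module.
PROTOCOLS = {
    "asthma": ["inhaler_uses", "sleep_hours"],
    "arthritis": ["pain_level", "sleep_hours"],
}


def build_instrument(protocol):
    """Return the (question id, question text) pairs for one protocol."""
    return [(qid, QUESTIONS[qid]["text"]) for qid in PROTOCOLS[protocol]]


def invalid_answers(protocol, answers):
    """Return ids of questions that are unanswered or out of range."""
    bad = []
    for qid in PROTOCOLS[protocol]:
        spec = QUESTIONS[qid]
        value = answers.get(qid)
        if value is None or not spec["min"] <= value <= spec["max"]:
            bad.append(qid)
    return bad


def shared_questions(protocol_a, protocol_b):
    """Questions two protocols have in common, usable for combined reporting."""
    return [qid for qid in PROTOCOLS[protocol_a] if qid in PROTOCOLS[protocol_b]]
```

Because every protocol draws on the same pool of question definitions, questions shared between diseases are identifiable by id, which is what makes combined reporting and comorbid analysis possible in a design of this kind.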
For reasons discussed earlier, once past conceptual design, few projects will entertain fundamental changes to the design, even if the changes are beneficial. However, occasionally a project will find that its assumptions have been violated midway through development, and semantic analysis can be applied to find out how to correct the problem.
There is great opportunity to apply semantic analysis to the task of systems integration. The main reason is that pretty much everyone expects interfaces, or other integration points, to be harder to build than planned and to require a great deal of rework in test, conversion, and even postproduction support. Developers are ready to hear the argument that the source of the difficulty is semantics: specifically, that the semantics of the system they are interfacing with are not what they are purported to be. Further, they know that their problems are not syntactical, mechanical, or technical, because they typically work through these problems early on.
Semantic modeling of a target system will proceed differently than for new development. Specifically, the investigation of an existing system will rely far more on profiling the existing data than on interviewing users. However, the result is generally the same.
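As a rough illustration of what data profiling involves, the sketch below summarizes one column and flags values that fall outside its documented domain. It is a simplified example, not a full profiling tool: the function name `profile_column`, the `admission_type` column, and the sample rows are all hypothetical.

```python
from collections import Counter


def profile_column(rows, column, documented_values=None):
    """Summarize one column and flag values outside its documented domain."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v not in (None, "")]
    counts = Counter(present)
    profile = {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
    }
    if documented_values is not None:
        profile["undocumented"] = sorted(set(counts) - set(documented_values))
    return profile


# Hypothetical extract from an existing admissions table.
rows = [
    {"admission_type": "voluntary"},
    {"admission_type": "voluntary"},
    {"admission_type": "civil"},
    {"admission_type": "72HR"},  # a court commitment stored as a type
    {"admission_type": ""},
]
profile = profile_column(
    rows, "admission_type", {"civil", "involuntary", "voluntary"}
)
```

In this invented data, the `undocumented` entry surfaces a value that the documentation does not mention. Findings of that kind are where the actual semantics of an existing system tend to diverge from the purported ones.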
You should also take this opportunity to model the aspects of the new system (if there is one) to make sure the semantics on the other side of the interface are what they are purported to be. In this case, you will not have data from its use to rely on and will have to work primarily from specs and code.