How Applications Create Meaning | Semantics in Business Systems: The Savvy Managers Guide (The Savvy Managers Guides)

Let's take a closer look at how applications create order out of chaos. Four major components are needed:

Schema—the structure of the end result
Constraints—rules about allowable information
Production rules—rules about generating information based on information supplied
Query—strategies for seeking out new information

Schema

The goal of the application is to create a structured and consistent set of data that is usable by downstream processes. The three most common forms of expressing structured and consistent data are the following:

A database schema
A transaction
A message or document schema

What these have in common is a formal structure that has the potential to have semantic meaning. By "has the potential" we mean that the schema may carry meaning, but nothing guarantees it. The extent to which that meaning is precise, consistent, and accurate is, in most applications, a judgment call. The degree to which another system could understand and rely on the semantics is unknown. However, the potential for semantic meaning is greater than if the data were in an unstructured format such as a word processing document.

A database schema might have a customer table, an order table, an order line table, and an inventory item table. The application takes the unstructured data from the environment and enables a user to put the "right" data in the "right" fields. The application contains some logic to move the data from the entered fields to the correct fields in the database, transaction, or message.
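As a minimal sketch of the four tables just described, the following uses Python's built-in sqlite3 module. The table names, column names, and the shape of the entry form are assumptions made for illustration; they are not taken from any particular application.

```python
import sqlite3

# Hypothetical table and column names that mirror the four tables named in the text.
# SQLite only enforces the REFERENCES clauses when "PRAGMA foreign_keys = ON" is set.
SCHEMA = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE inventory_item (
    item_id     INTEGER PRIMARY KEY,
    description TEXT NOT NULL
);
CREATE TABLE sales_order (           -- "order" is a reserved word in SQL
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    order_date  TEXT NOT NULL
);
CREATE TABLE order_line (
    order_id    INTEGER NOT NULL REFERENCES sales_order(order_id),
    line_no     INTEGER NOT NULL,
    item_id     INTEGER NOT NULL REFERENCES inventory_item(item_id),
    quantity    INTEGER NOT NULL,
    PRIMARY KEY (order_id, line_no)
);
"""


def create_database() -> sqlite3.Connection:
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def enter_order(conn: sqlite3.Connection, form: dict) -> int:
    """Move values from a (hypothetical) entry form into the right fields."""
    cur = conn.execute(
        "INSERT INTO sales_order (customer_id, order_date) VALUES (?, ?)",
        (form["customer_id"], form["order_date"]),
    )
    order_id = cur.lastrowid
    for line_no, (item_id, quantity) in enumerate(form["lines"], start=1):
        conn.execute(
            "INSERT INTO order_line VALUES (?, ?, ?, ?)",
            (order_id, line_no, item_id, quantity),
        )
    return order_id
```

The point of the sketch is the second function: the application, not the user, decides which entered value lands in which column.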

A transaction schema is a description of the layout of a transaction that can alter the data stored in the database. A transaction for an inventory adjustment might have the date when the adjustment was deemed to be necessary, the inventory identifier, the inventory location, the quantity by which to adjust the balance, the amount by which to adjust the value, and the reason for making the adjustment. Transaction schemas have been most formalized in business-to-business environments for electronic data interchange, as we discuss in Chapter 12, but are also highly formalized for internal transactions.
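The inventory adjustment above could be laid out roughly as follows. This is only a sketch: the field names, the types, and the `apply_adjustment` helper are assumptions, and a real system would post the transaction against a database rather than a dictionary.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class InventoryAdjustment:
    """Layout of one adjustment transaction (field names are illustrative)."""
    adjustment_date: date    # when the adjustment was deemed necessary
    item_id: str             # inventory identifier
    location: str            # inventory location
    quantity_delta: int      # quantity by which to adjust the balance
    value_delta: Decimal     # amount by which to adjust the value
    reason: str              # why the adjustment is being made


def apply_adjustment(balances: dict, txn: InventoryAdjustment) -> None:
    """Alter the stored (quantity, value) for the item at the given location."""
    key = (txn.item_id, txn.location)
    quantity, value = balances.get(key, (0, Decimal("0")))
    balances[key] = (quantity + txn.quantity_delta, value + txn.value_delta)
```

A transaction built this way describes a change to stored data without being the stored data itself, which is the distinction the text draws between a transaction schema and a database schema.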

Message schemas, which we will take up in detail in Chapter 11, concern defining complex structures of documents or interapplication messages. The message or document schema is tagged in line with the instance data, which makes it verbose, but at the same time makes it easier to express more complex structures, and to express them in the presence of schema evolution.
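To illustrate what "tagged in line with the instance data" buys us, here is a small sketch using XML and Python's standard xml.etree.ElementTree module. The element names and both sample messages are invented for the example. The reader looks up values by tag, so a later version of the message that adds an element does not break it.

```python
import xml.etree.ElementTree as ET

# Two versions of a hypothetical order message. Version 2 adds a <giftWrap>
# element and a second line; every value carries its own tag.
MESSAGE_V1 = """
<order>
  <customer>12335</customer>
  <line><item>W-1</item><quantity>3</quantity></line>
</order>
"""

MESSAGE_V2 = """
<order>
  <customer>12335</customer>
  <giftWrap>true</giftWrap>
  <line><item>W-1</item><quantity>3</quantity></line>
  <line><item>W-2</item><quantity>1</quantity></line>
</order>
"""


def read_order(xml_text: str) -> dict:
    """Extract the fields this reader knows about and ignore the rest."""
    root = ET.fromstring(xml_text)
    return {
        "customer": root.findtext("customer"),
        "lines": [
            (line.findtext("item"), int(line.findtext("quantity")))
            for line in root.findall("line")
        ],
    }
```

The cost is the verbosity the text mentions: every value is wrapped in its tag, in every message.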

Constraints

Another way that the application imposes order is through the expression of constraints. A simple constraint is a range check. For example, the number of wheels on a vehicle must be between 2 and 16, or the number of current spouses must be 1 or 0. Constraints are semantically interesting in two ways. First, their existence says that unless a new set of information passes the constraint tests, we'll disregard the information. "Unless you supply a fax number, I'm not going to accept your order." Occasionally this is exactly the intent of the requirements of the application. For example, we may not want to give someone a white paper until we have a valid email address. In this case, as we'll expand on later, the constraint ("must have a valid email address") is tied to an action ("send the white paper"). Most of the time, however, the constraint is there because it is far harder for downstream processes to deal with ambiguous or incomplete information. This may or may not be the actual intent of the requirements. Sometimes we would rather have some information, some indication that something happened, even if we don't have all the i's dotted and t's crossed.
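The two attitudes toward a failed constraint can be sketched in a few lines. The `strict` flag and the function names below are assumptions introduced for the example: in strict mode the record is disregarded, and in lenient mode it is kept along with a note about what is wrong with it.

```python
from typing import Optional


def check_range(record: dict, field: str, low: int, high: int) -> Optional[str]:
    """Return a description of the violation, or None if the value passes."""
    value = record.get(field)
    if value is None:
        return f"{field} is missing"
    if not (low <= value <= high):
        return f"{field} must be between {low} and {high}, got {value}"
    return None


def accept(record: dict, strict: bool = True):
    """Apply the range check from the text: a vehicle has 2 to 16 wheels.

    Returns (record_or_None, problems). In strict mode a failing record is
    disregarded; otherwise it is kept and the problems are reported with it.
    """
    problems = [p for p in (check_range(record, "wheels", 2, 16),) if p]
    if problems and strict:
        return None, problems
    return record, problems
```

Which mode is appropriate depends on whether the requirement really is "no record without valid data" or merely "downstream processing is easier with valid data."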

The other aspect of constraints that is semantically interesting is how they are expressed. Generally, they are described in the form of a rule. A simple rule combines three parts: a single semantic property (e.g., "zipcode"), a predicate to express validity ("must be numeric"), and an action to take when the predicate fails ("abort update transaction"). Constraints can also involve multiple properties. A classic example is the "foreign key" constraint, which says that for a property of one entity to be valid, its value must exist as a "primary" key on another entity. For example, the product number on a sales order must exist in the inventory table. Constraints can be complex and can be applied at one of three times: before data entry (the application offers only valid choices), during data entry (interactive validation), or after data entry (when the transaction is being posted or even later).
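A rule built from a property, a predicate, and an action might be sketched as follows. The `Rule` class, the `AbortUpdate` exception, and the sample inventory keys are all hypothetical names introduced for this illustration. The second rule imitates a foreign key check by testing membership in a set of known inventory keys; a relational database would normally enforce this itself.

```python
from dataclasses import dataclass
from typing import Any, Callable


class AbortUpdate(Exception):
    """Raised to abort the update transaction."""


@dataclass
class Rule:
    prop: str                               # the semantic property, e.g. "zipcode"
    predicate: Callable[[Any], bool]        # the validity test
    description: str                        # human-readable form of the predicate
    action: Callable[["Rule", dict], None]  # what to do when the test fails


def abort_update(rule: Rule, record: dict) -> None:
    raise AbortUpdate(f"{rule.prop} {rule.description}")


# Stand-in for the primary keys of the inventory table.
INVENTORY_KEYS = {"W-1", "W-2"}

RULES = [
    Rule("zipcode", lambda v: isinstance(v, str) and v.isdigit(),
         "must be numeric", abort_update),
    Rule("product_number", lambda v: v in INVENTORY_KEYS,
         "must exist in the inventory table", abort_update),
]


def post(record: dict, rules: list) -> dict:
    """Check every rule after data entry, at the time the record is posted."""
    for rule in rules:
        if not rule.predicate(record.get(rule.prop)):
            rule.action(rule, record)
    return record
```

Because the action is a separate part of the rule, the same predicate could be paired with a milder action, such as logging a warning, without changing the check itself.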

Production Rules

Much of an application is really production rules. By production rules, we mean strategies for producing more data from existing data and an event. A simple production rule is "clone," and the simplest case would be to make a copy (e.g., making a copy of a document, or making a copy of a shape in a drawing package).

The most common production rules in business systems create new records. However, the more interesting production rules are more elaborate than that. A class in an object-oriented language is a production rule. Sending the "new" message to a class causes it to execute code (the production rule) that instantiates (creates an instance of) the class to which the message was sent, and it often involves creating instances of other associated classes.
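In Python the "new" message is simply a call on the class. The sketch below, with invented class names, shows a constructor acting as a production rule: from a customer identifier and a list of items it produces an order plus one associated line object per item.

```python
class OrderLine:
    def __init__(self, line_no: int, item_id: str, quantity: int):
        self.line_no = line_no
        self.item_id = item_id
        self.quantity = quantity


class Order:
    def __init__(self, customer_id: str, items: list):
        # The production rule: supplied data in, a constellation of objects out.
        self.customer_id = customer_id
        self.lines = [
            OrderLine(line_no, item_id, quantity)
            for line_no, (item_id, quantity) in enumerate(items, start=1)
        ]


# Calling the class plays the role of sending it the "new" message.
order = Order("12335", [("W-1", 3), ("W-2", 1)])
```

Note that the line numbers were never supplied by the caller; the rule generated them.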

More flexible production rules come from applications that allow users to build complex constellations of objects such that when the appropriate event is fired, they create a derived set of instances. For example, if we build a standard subproject in a project management system and then instantiate it, the production rules will create a new project.
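One simple way to sketch this kind of user-built production rule is a deep copy of a template. The `Task` structure and the `instantiate` function are assumptions for the example, not a description of how any particular project management product works.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    duration_days: int
    subtasks: list = field(default_factory=list)


def instantiate(template: Task, project_name: str) -> Task:
    """Produce a new, independent project from a user-built template."""
    project = copy.deepcopy(template)   # copies the whole tree of subtasks
    project.name = project_name
    return project


# A template assembled by a user rather than written by a programmer.
SITE_SURVEY = Task("Site survey (template)", 0, [
    Task("Schedule visit", 1),
    Task("Inspect site", 2),
    Task("Write report", 3),
])
```

The user defines the constellation once; each instantiation derives a full set of new instances from it, and changes to one instance leave the template and the other instances untouched.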

Query

In most applications, aiding a user in finding information is an exercise in establishing which parameters are relevant, soliciting values for those parameters, and executing queries against predetermined tables in the database. Imagine what an application would be like if every time users needed information, they had to find it themselves. Queries ("select name from customer table where customerid = '12335'") are used throughout applications for validation and to get information to present to users.

Queries cover the following areas:

Scope—Where should we look for our values? Historically this has been the database to which you are attached and a specific set of tables, but it could easily be broader or narrower.
Filtering—Of the potential returned items, which do we exclude? In relational databases, items that do not satisfy the "where" clause are excluded. Typically this is a set of predicates evaluated against each candidate row.
Navigation—What related data can you get to from the data you've initially selected? In relational databases this is done with joins. A rule-based approach specifies which rules govern the possible navigation and which of those rules were chosen.
Projection—Of the returned entities, which properties do we want to present? In relational databases, this is what is in the select clause. The return set can itself have an interesting structure.
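The four areas can be seen side by side in a single SQL statement. The sketch below uses Python's sqlite3 module with made-up tables and data; the comments mark which part of the query corresponds to which area.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE sales_order (order_id INTEGER PRIMARY KEY, customer_id TEXT, total REAL);
INSERT INTO customer VALUES ('12335', 'Acme Supply'), ('20001', 'Bolt Works');
INSERT INTO sales_order VALUES (1, '12335', 250.0), (2, '12335', 40.0), (3, '20001', 99.0);
""")

QUERY = """
SELECT c.name, o.order_id, o.total                      -- projection
FROM customer AS c                                      -- scope
JOIN sales_order AS o ON o.customer_id = c.customer_id  -- navigation
WHERE c.customer_id = ? AND o.total >= ?                -- filtering
ORDER BY o.order_id
"""


def orders_for(customer_id: str, minimum_total: float) -> list:
    """The application solicits the parameter values, then runs the fixed query."""
    return conn.execute(QUERY, (customer_id, minimum_total)).fetchall()
```

For example, `orders_for("12335", 100)` returns `[('Acme Supply', 1, 250.0)]`: the 40.0 order is filtered out, and only the three projected columns come back.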

Even Web-based searches, which seem to have no structure to them, are highly structured. The ability of Google or other search engines to find information rapidly based on keywords is made possible only through the use of various indexes and prefetched queries.