10.5 Code Instrumentation

We can measure software systems either invasively, by modifying the code to insert software instrumentation points, or noninvasively, by monitoring the activity of the code through hardware probes. It is far cheaper and easier to modify the code than it is to modify the hardware for measurement purposes. Should we choose to insert our measurement probes into the software, however, we will continue to pay a performance price for the life of the software. Life is full of trade-offs.

Should we choose to obtain our measurements from software instrumentation points, there are two ways to do this. First, we can insert new code in the source code of the program modules to do our work. We do not always have access to the source code, however. This does not preclude measurement. The second way is to modify the binary code of a program to capture the information that we need. In either case we need only know that we have reached a certain point in the program. Each instrumentation point in the code will contain its own unique identifier. When that instrumentation point is encountered, it will transmit its identifier to a receiving module, together with whatever data we wish to capture at that point.

10.5.1 Source Code Instrumentation Process

A software probe or instrumentation point at the source code level is simply a function call to a new source code function module that will capture the telemetry from the instrumentation point. We will probably want to know which instrumentation point was transmitting to us and perhaps capture some data as well. In C, this instrumentation point might look something like:

                         Clic(Point_No, Data);

We will add the module being called, Clic, to the source code base to capture the information supplied in the argument list. Each instrumentation point will transmit its own telemetry. The first argument, Point_No, will be a unique number for this instrumentation point so that we can identify the source of the telemetry. The second argument, Data, will possibly be a data structure that captures the essential information we wish to record at the point of call.
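As a minimal sketch of what such a receiving module might look like, suppose that Clic simply appends each event to a fixed-size in-memory log. The ClicData and ClicRecord types, the log capacity, and the choice to pass Data by pointer are assumptions made for illustration; the text does not prescribe them.

```c
#include <assert.h>
#include <stddef.h>

#define LOG_CAPACITY 1024   /* hypothetical size of the in-memory log */

/* Hypothetical payload; the contents of Data are left open in the text. */
typedef struct {
    long value;
} ClicData;

typedef struct {
    int      point_no;  /* which instrumentation point fired */
    ClicData data;      /* copy of the data captured at that point */
} ClicRecord;

static ClicRecord clic_log[LOG_CAPACITY];
static size_t     clic_count = 0;   /* number of records stored so far */

/* Record one telemetry event. Events are silently dropped once the log
   is full. Returns 1 so that the call can also appear inside a predicate
   without changing its outcome. */
int Clic(int point_no, const ClicData *data)
{
    assert(point_no >= 0);
    if (clic_count < LOG_CAPACITY) {
        clic_log[clic_count].point_no = point_no;
        if (data != NULL)
            clic_log[clic_count].data = *data;
        else
            clic_log[clic_count].data.value = 0;
        clic_count++;
    }
    return 1;
}
```

A probe would then be written as Clic(7, &d); for some ClicData d, or as Clic(7, NULL); when only the point number is of interest.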

There are several different strategies that can be used to place software instrumentation points in source code. The number of points will be a factor in the instrumentation process. A large number of points can be expected to significantly degrade the performance of the software that is being measured. On the other hand, if too few points are used, then we may not achieve the degree of resolution that we need for our monitoring purpose.

The location of instrumentation points in the code will be a function of what we wish to learn from the measuring process. Instrumentation points can be placed in the source code essentially at random, or they can be systematically placed according to the structure of the program. With random instrumentation, probes are simply inserted at random points throughout the system. If we merely want to capture the essential behavior of the software, then random instrumentation should serve this function quite well. In this case, we will determine the degree of resolution that we wish to achieve in this monitoring process and choose the minimum number of points that will achieve this resolution. For some types of statistical monitoring processes, this random probe insertion process will work quite well.

Another approach to the placement of software probes is by module location. This will permit us to track the transition of program activity throughout the call graph representation of the program. There are two different instrumentation strategies that can be employed. The first is to instrument the software at the entry to each module. This instrumentation strategy will permit the frequency count of each module to be developed over an observation interval, which in turn will permit us to generate profiles of activity among program modules. Consider this from the point of view of the data collection module. What we wish to do is to accumulate the frequency count for each module execution. To do this we will set up a vector in the data collection module with one element for each instrumented module. Each time a module is executed during an observation interval, we will increase the frequency count for that module by one. Periodically, the data collection module will transmit the execution profile out of the system for analysis or storage.
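The frequency vector just described might be sketched as follows. The number of modules, the use of the point number as a direct index into the vector, and the emit_profile helper are assumptions for illustration; here the profile is written to a stream and then cleared for the next observation interval.

```c
#include <assert.h>
#include <stdio.h>

#define NUM_MODULES 4   /* hypothetical number of instrumented modules */

/* One frequency count per instrumented module. */
static unsigned long profile[NUM_MODULES];

/* Entry probe: the point number doubles as the module index.
   The data argument is ignored in this sketch. */
void Clic(int point_no, const void *data)
{
    (void)data;
    assert(point_no >= 0 && point_no < NUM_MODULES);
    profile[point_no]++;
}

/* Hypothetical helper: transmit the current execution profile as one
   line of counts and reset the vector to zeros. */
void emit_profile(FILE *out)
{
    for (int i = 0; i < NUM_MODULES; i++) {
        fprintf(out, "%lu%c", profile[i], i + 1 < NUM_MODULES ? ' ' : '\n');
        profile[i] = 0;
    }
}
```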

The problem with instrumenting a single point at the beginning of each module is that it is not possible to know when control has passed out of a module. That is, we cannot know with any degree of certainty the calling sequence of modules. If we do need to have access to this information, then we will have to instrument the return statements from each program module as well. This is somewhat more difficult than just putting some executable code at the beginning of a program module. There may be multiple return statements in a single C program module. Perhaps the easiest way to insert our Clic probes is to wrap the return statement with a do-while as follows:

     do {Clic(Point_No, Data); return ((a+b/c));}while(0);

We have replaced one statement (the return) with another statement (the do-while). This means that we will have to run the code through an instrumentation preprocessor before we compile it.
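To see what the output of such a preprocessor might look like, consider the following sketch of a module with an entry probe and two wrapped return statements. The function ratio, the point numbers 10 through 12, and the trace buffer are hypothetical and serve only to make the calling sequence visible.

```c
#include <assert.h>

#define TRACE_CAPACITY 64   /* hypothetical size of the event trace */

static int trace[TRACE_CAPACITY];   /* point numbers in firing order */
static int trace_len = 0;

/* Probe: append the point number to the trace; data is unused here. */
void Clic(int point_no, const void *data)
{
    (void)data;
    assert(trace_len < TRACE_CAPACITY);
    trace[trace_len++] = point_no;
}

/* Hypothetical instrumented module.
   Point 10 marks entry; points 11 and 12 mark the two exits. */
int ratio(int a, int b, int c)
{
    Clic(10, 0);
    if (c == 0)
        do { Clic(11, 0); return (a); } while (0);
    do { Clic(12, 0); return ((a + b / c)); } while (0);
}
```

Because the do-while is a single statement, it can stand wherever the original return stood, including as the unbraced body of an if. Note that the exit probe fires before the return expression is evaluated, so any function calls inside that expression will appear in the telemetry after the exit event.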

The execution profile can be emitted by our Clic module at the end of a real-time observation interval, say every millisecond. This will allow us to see the distribution of program activity as a function of time. Unfortunately, these temporal observations are generally of little utility in measuring software behavior, in that different machines, both within and among hardware architectures, differ greatly in their performance characteristics. To eliminate this uncontrolled source of variation, a superior strategy is to standardize each profile vector so that each profile represents the same total frequency count. That is the basic notion behind the concept of an epoch, which represents the transition from one observation point to the next. We can, for example, set the observation interval to 1000 epochs. Every 1000 epochs we will emit the execution profile and reset the contents of the new execution profile to zeros.
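A sketch of epoch-based emission follows, under the assumption that each probe firing counts as one epoch. The number of points and the last_profile buffer, which stands in for transmitting the profile out of the system, are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define NUM_POINTS         4      /* hypothetical number of probes */
#define EPOCHS_PER_PROFILE 1000   /* observation interval, in epochs */

static unsigned long profile[NUM_POINTS];       /* profile being built */
static unsigned long last_profile[NUM_POINTS];  /* most recent emission */
static unsigned long epochs = 0;                /* epochs in this interval */
static unsigned long profiles_emitted = 0;

/* Stand-in for transmission: copy the profile out, then zero it. */
static void emit_profile(void)
{
    memcpy(last_profile, profile, sizeof profile);
    memset(profile, 0, sizeof profile);
    profiles_emitted++;
}

/* Each probe firing is one epoch. After EPOCHS_PER_PROFILE firings the
   profile is emitted and a new one is started. */
void Clic(int point_no, const void *data)
{
    (void)data;
    assert(point_no >= 0 && point_no < NUM_POINTS);
    profile[point_no]++;
    if (++epochs == EPOCHS_PER_PROFILE) {
        emit_profile();
        epochs = 0;
    }
}
```

Every emitted profile sums to exactly EPOCHS_PER_PROFILE, so profiles gathered on machines of very different speeds remain directly comparable.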

The objective in dynamic measurement is to minimize the number of instrumentation points and still be able to obtain the information needed for decision making. In some circumstances, however, it will not be possible to get the necessary resolution on the phenomenon we wish to study at the module entry and exit level, or the call graph level. For those cases demanding increased resolution on program behavior, we may choose to instrument the control structure within each program module. We will call this the flowgraph instrumentation level.

To perform flowgraph instrumentation we need to capture information relating to the decision paths of the module that is executing. Experience has shown that instrumenting cycles can create some serious data bandwidth problems and provide little or no valuable information about program behavior. Thus, we choose to instrument only the decision paths in if statements or case statements. We will instrument the path from the if statements for the true predicate outcome. We could also instrument the false or else path but the two paths are mutually exclusive. It is questionable whether the incremental information to be gained from instrumenting both paths merits the additional bandwidth in telemetry.

In C we will modify the predicate clause to contain our software probe. Consider the following if statement:

      if (a < b || c == d) <stmt>;

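We will insert the Clic probe in the predicate clause so that it fires only when the original predicate is true. This requires that Clic return a nonzero value, so that the outcome of the predicate is unchanged:

      if ((a < b || c == d) && Clic(Point_No, Data)) <stmt>;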

For the case statement, we can simply add a statement to each case statement as follows:

     case 'b' : Clic(Point_No, Data); <stmt>
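The two forms of decision-path instrumentation can be exercised with a small sketch. It assumes a Clic that returns a nonzero value, and it places the probe after the parenthesized predicate so that the probe fires only on the true outcome. The functions classify and dispatch, the point numbers, and the counters are hypothetical.

```c
#include <assert.h>

static int hits = 0;         /* number of probe firings */
static int last_point = -1;  /* most recent point number seen */

/* Probe that returns 1 so it can be chained into a predicate with &&
   without changing the outcome. The data argument is unused here. */
int Clic(int point_no, const void *data)
{
    (void)data;
    assert(point_no >= 0);
    hits++;
    last_point = point_no;
    return 1;
}

/* if statement: short-circuit evaluation means the probe is reached
   only when the original predicate is true. */
int classify(int a, int b, int c, int d)
{
    if ((a < b || c == d) && Clic(20, 0))
        return 1;
    return 0;
}

/* case statement: one probe at the head of each instrumented case. */
int dispatch(char key)
{
    switch (key) {
    case 'a': Clic(21, 0); return 1;
    case 'b': Clic(22, 0); return 2;
    default:  return 0;
    }
}
```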

In terms of human effort, it is very cheap to instrument software with these software probes. We can write a tool that will preprocess the code and drop the probes in. Having elected to take this cheap initial solution, however, we will continue to pay a relatively high performance cost over the life of the software. If performance is an issue, then hardware instrumentation should be seriously considered as a viable alternative to software instrumentation.

10.5.2 Binary Code Instrumentation

Binary code can also be modified to insert instrumentation points. In this case we are somewhat more constrained. Without a painful amount of reverse-engineering it will not be possible to capture much data. We will have to be satisfied just learning which instrumentation points fire as the code is executed. For most purposes this level of instrumentation will be more than adequate.

To instrument binary code we must be aware that transitions among program modules at runtime will be implemented by p-capturing instructions. These are instructions that capture the contents of the program counter before they alter it. Within the Intel X86 architecture, for example, there are the absolute CALL instructions. These instructions will push the return address onto the stack (the EIP register for a near call, the CS and EIP registers for a far call) and take the jump to the appropriate function module address.

Before we modify the binary code base, we must first monitor the activity of the system to find areas of memory that are used as heap space for dynamic memory allocation by the running system. Our objective in this analysis is to identify a place where we can insinuate our own monitoring code and not affect the operation of the system we are monitoring.

The next step in the process is to identify and capture program call statements (opcodes 9A and FF in the case of Pentium X86; because opcode FF is shared with other instructions, the ModR/M byte that follows it must also be examined). We will substitute these function calls with a call to our own module. The function call that we are replacing will be placed at the end of our telemetry capture module. It will be the last thing we execute in our own code. We will not, of course, ever execute a return statement. We must be careful not to destroy or alter the return information (the CS and EIP register contents) so that the user's function will return control to the appropriate place in the code. In addition, any registers that are altered in our telemetry capture module must be saved and restored before we exit the procedure.
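The first part of this step, locating candidate call instructions, can be illustrated with a deliberately naive sketch that scans a buffer of code bytes for opcode 9A, or for opcode FF followed by a ModR/M byte whose reg field is 2 or 3 (the two CALL encodings under FF). The function names are hypothetical. A byte-by-byte scan of this kind cannot tell an opcode from an operand or a data byte, so a real tool would have to decode instructions from known instruction boundaries; the sketch shows only the pattern being sought.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if the byte at code[i] looks like an absolute CALL:
   9A (far call, direct) or FF with ModR/M reg field 2 or 3
   (near or far call, indirect). Naive: no instruction decoding. */
static int looks_like_abs_call(const uint8_t *code, size_t len, size_t i)
{
    assert(i < len);
    if (code[i] == 0x9A)
        return 1;
    if (code[i] == 0xFF && i + 1 < len) {
        unsigned reg = (code[i + 1] >> 3) & 7u;  /* bits 5..3 of ModR/M */
        return reg == 2 || reg == 3;
    }
    return 0;
}

/* Store the offsets of up to max_out candidate call sites in out and
   return how many were found. */
size_t find_call_candidates(const uint8_t *code, size_t len,
                            size_t *out, size_t max_out)
{
    size_t n = 0;
    for (size_t i = 0; i < len && n < max_out; i++)
        if (looks_like_abs_call(code, len, i))
            out[n++] = i;
    return n;
}
```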

For embedded code, this process can sometimes be a little tricky. Some embedded code is time dependent. We might accidentally insert our probe in a sequence of code that must execute under tight time constraints. When additional code is added to this sequence, it is quite possible that the code will fail as a result. In modern software systems, this is much less of a factor than it was in the past.