Symptoms | Preventative Programming Techniques: Avoid and Correct Common Mistakes (Charles River Media Programming)


Many of the symptoms of cut-and-paste are obvious, but it is still easy to overlook them. Of course, if you are the one doing the cut-and-paste, the symptoms are usually obvious when you select the cut operation and then apply the paste operation in your favorite text editor. Using these editor tools alone is not necessarily bad, which is why we will cover the various indications of the improper use of cut-and-paste along with a few subtle clues to instances of cut-and-paste that might otherwise be missed.

The Bug Came Back

Here is an all too common scenario that occurs while debugging an application. A bug is reported and assigned to a programmer. The programmer looks at the bug, and then replicates the tester’s description to reproduce the bug. Once he reproduces the error, the code can be traced through and the location of the failure determined. The responsible code is then changed to fix the problem, and the bug is submitted as fixed. The tester can then reproduce the situation and mark the fix as verified. This should be the end of a successful resolution to a bug.

However, when cut-and-paste is used during development, several complications can arise after the problem is fixed. The most common occurrence is the reemergence of the bug with a different method of reproducing it. Even in the best-case scenario, this means that the programmer and tester will need to go through the entire fix-and-verify cycle again. This includes finding the exact location of the new error in the code, changing the code to eliminate this error, and then testing to ensure that the error was fixed without introducing new problems. While one small consolation is that knowledge from fixing the original error can reduce the time required to change the code, finding the error and testing that it is fixed are much more time consuming and do not benefit nearly as much from knowledge of the previous error.

Take the vector cross product as an example, where vector1 and vector2 are three-dimensional vector objects:

   Vector cross_product = new Vector(
      vector1.y * vector2.z - vector1.z * vector2.y,
      vector1.x * vector2.z - vector1.z * vector2.x,
      vector1.x * vector2.y - vector1.y * vector2.x);

This is only necessary for certain vector calculations, so a programmer might decide to use it directly in the code without adding it to the vector class. However, another programmer comes along later and finds another location that needs the functionality. Not sure if he wants to modify the vector class, he cuts and pastes the code into the new location. This might happen a few more times before it is time to test the application.

Testing begins, and the following bug is reported:

   Beam graphics appear to be distorted in appearance.

After a couple of hours of searching, this leads the programmer to find a bug in his code, where the second component has its terms in the wrong order:

   // ...
   Vector beam_up = new Vector(
      beam.y * camera.z - beam.z * camera.y,
      beam.x * camera.z - beam.z * camera.x,
      beam.x * camera.y - beam.y * camera.x);
   // ...

This is promptly changed to:

   // ...
   Vector beam_up = new Vector(
      beam.y * camera.z - beam.z * camera.y,
      beam.z * camera.x - beam.x * camera.z,
      beam.x * camera.y - beam.y * camera.x);
   // ...

The code is submitted, retested, and verified as correct. However, this code still exists in several other places across the application. Another bug report is guaranteed to occur, and the debugging process will begin again. Perhaps another programmer fields this report, and spends an hour to find the bug in his code:

   // ...
   Vector spark = new Vector(
      object1.y * object2.z - object1.z * object2.y,
      object1.x * object2.z - object1.z * object2.x,
      object1.x * object2.y - object1.y * object2.x);
   // ...

Then fixes it:

   // ...
   Vector spark = new Vector(
      object1.y * object2.z - object1.z * object2.y,
      object1.z * object2.x - object1.x * object2.z,
      object1.x * object2.y - object1.y * object2.x);
   // ...

This process will be required for each instance in which the original code was cut-and-pasted. Each occurrence wastes development time that could have been spent on more productive work or on finishing the project earlier.

That is not the end, however, as things can get even worse. The programmer might ignore the bug thinking that he has already fixed it, not realizing that another occurrence of the code exists. In the earlier example, this could occur easily if there were two different types of beams and another programmer had created the second beam type by cut-and-pasting the code from the first beam type. Alternatively, the tester might see it as a recurrence of the original bug and resubmit the original bug report without updating the procedure to reproduce the problem. This could also easily occur in the case where there are two separate occurrences of code for displaying beams. Worse, the new bug could be assigned to a different programmer who does not even have the knowledge of the original fix to help him. Moreover, when cut-and-paste is prevalent in an application, the bug could recur several more times with different methods of reproduction. Repeatedly fixing the same bug is an obvious sign of cut-and-paste that makes the already difficult debugging process exasperating.
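As a rough illustration of why a single shared copy avoids the repeated fix-and-verify cycle, the following sketch moves the cross product into the vector type itself. The names Vector3 and cross are hypothetical stand-ins, not the application's actual Vector class, and the formula is the corrected one from the fixes above.

```java
// Hypothetical stand-in for the three-dimensional Vector class in the text.
final class Vector3 {
    final double x, y, z;

    Vector3(double x, double y, double z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    // The only copy of the cross product formula. A mistake here is
    // fixed once, and every caller picks up the fix.
    Vector3 cross(Vector3 other) {
        return new Vector3(
            y * other.z - z * other.y,
            z * other.x - x * other.z,
            x * other.y - y * other.x);
    }
}

final class CrossProductDemo {
    public static void main(String[] args) {
        // Assumed example values; the beam and camera names follow the text.
        Vector3 beam = new Vector3(1, 0, 0);
        Vector3 camera = new Vector3(0, 1, 0);

        // Call sites name the operation instead of repeating the formula.
        Vector3 beamUp = beam.cross(camera);
        System.out.println(beamUp.x + ", " + beamUp.y + ", " + beamUp.z);
    }
}
```

With this arrangement, the beam and spark code would each reduce to one call, so a bug report against either one leads to the same method.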

Search and Replace

A direct consequence of cut-and-paste is the use of search-and-replace for making changes to the code that has been created with cut-and-paste. While there are other uses of search-and-replace, if you find yourself using it often on a project, chances are good that cut-and-paste is also in heavy use. One of the major problems of the basic search-and-replace functionality is the automation with which it operates. Normally automation is extremely useful for improving development speed and efficiency, but in this case, the programmer could overlook changes that are invalid or problematic because the process is automated.

When search-and-replace is used, make sure that each instance is carefully examined to ensure that the results are those desired. Do not use the replace all feature even if it is available, as this will blindly replace code without your knowledge. This is like closing your eyes and using a machine gun to try to save a hostage; chances are high that more harm will be caused than good. However, it is equally important to determine why search-and-replace is necessary. If the root cause is cut-and-paste, then code should be refactored rather than just replaced. Even if cut-and-paste is not involved, there might be refactoring tools, which we will discuss later, to help.
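To make the danger of a blind replace-all concrete, here is a small sketch using Java's standard String methods in place of an editor. The source line and the identifiers width, size, and bandwidth are invented for the illustration.

```java
final class BlindReplaceDemo {
    // Mimics an editor's "replace all": every occurrence of the text
    // "width" is changed, including the one inside another identifier.
    static String blind(String source) {
        return source.replace("width", "size");
    }

    // A more careful variant that only matches "width" as a whole word.
    // It still cannot tell whether each match should really change.
    static String wholeWord(String source) {
        return source.replaceAll("\\bwidth\\b", "size");
    }

    public static void main(String[] args) {
        String line = "bandwidth = width * 2";
        System.out.println(blind(line));      // bandsize = size * 2
        System.out.println(wholeWord(line));  // bandwidth = size * 2
    }
}
```

Even the whole-word version is only a mechanical safeguard; each replacement still needs to be reviewed by a person.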

What Does That Mean: CAP Symptoms in Documentation

Poor documentation is an illness in its own right, but a particular type of documentation error can indicate the presence of cut-and-paste code. Have you ever read a comment that seems to go with the code below it, except that it gets one fact blatantly wrong, or sometimes it doesn’t even seem to relate at all to the code below it? Chances are that you might find a similar but modified version of the code and an exact duplicate of the comment elsewhere in the code base. Programmers often cut-and-paste code and modify it slightly without updating the comments. Of course, the code does not have to be cut-and-pasted for the comment to get out of date. Searching for a distinctive part of the comment elsewhere in the code can quickly resolve this question, and the search is worthwhile when it uncovers cut-and-paste code that requires refactoring.

To illustrate this, let us start with the following code:

   // Compute the average of the current frame
   // and the next frame.
   float midframe = (frame[current] + frame[current + 1]) / 2;

Now, this code could be cut-and-pasted to a new section and then slightly modified:

   // Compute the average of the current frame
   // and the next frame.
   float midframe = (frame[current] + frame[current - 1]) / 2;

Notice that the comment was not updated along with the change to the code. This can easily lead to numerous problems, especially if the difference between the code and comment is subtle. Another programmer reading this could easily take the comment at face value and use this code with misleading expectations. However, by searching for the comment, the original code can be found and a comparison will easily show that the code differs. From this, it directly follows that the code must be looked at more closely.
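The comment search described above can be automated in a rough way. The sketch below is a hypothetical helper, not a tool from this book: it reports every whole-line // comment that appears more than once, taking the sources as a map from file name to file contents so that it needs no file system access. It ignores block comments and trailing comments, and it will also flag harmless repeats, so its output is only a list of places to inspect.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DuplicateCommentFinder {
    // Maps each "//" comment line that occurs more than once to the
    // "file:line" locations where it was found.
    static Map<String, List<String>> find(Map<String, String> sources) {
        Map<String, List<String>> locations = new LinkedHashMap<>();
        for (Map.Entry<String, String> source : sources.entrySet()) {
            String[] lines = source.getValue().split("\n");
            for (int i = 0; i < lines.length; i++) {
                String line = lines[i].trim();
                // Only whole-line comments with some text are considered.
                if (line.startsWith("//") && line.length() > 2) {
                    locations.computeIfAbsent(line, key -> new ArrayList<>())
                             .add(source.getKey() + ":" + (i + 1));
                }
            }
        }
        // Keep only the comments that were seen in at least two places.
        locations.values().removeIf(where -> where.size() < 2);
        return locations;
    }
}
```

Each reported pair is a candidate for a side-by-side comparison like the one between the two midframe computations.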

Obscure Bug Hunt

Most of us have tracked down off-by-one errors, memory overwrites, and other mistakes caused by one wrong element in thousands or millions of lines of code. These tiny errors are insidious, causing frustration and hours of lost time. Cut-and-paste makes it easy to make such an error, leading to obscure and hard-to-track bugs, often long after the programmer who wrote the code has forgotten how that section works. So, how does cut-and-paste contribute to this problem? Let us go over a typical cut-and-paste scenario to see what can happen.

A programmer discovers the need for some new functionality that is very similar to another section of code that he knows about. Looking at the code, he determines how it can be modified to fit his needs and proceeds to cut-and-paste it into his own code. Just then, he notices another change that should be made. Since the change is quick, he goes ahead and makes it. Now he compiles the code and runs one simple test case that works as expected. Finished with that task, he moves on, having placed a subtle time bomb in the application. He forgot to make the change to the new code that he cut-and-pasted and, as luck would have it, the test case did not test for that change.

To show how easily this can happen, just consider code to determine the midpoint of a line. Start with the computation for the x coordinate:

   midpoint.x = (current_line.endpoint1.x +
      current_line.endpoint2.x) / 2;

Now, cut and paste:

   midpoint.x = (current_line.endpoint1.x +
      current_line.endpoint2.x) / 2;
   midpoint.x = (current_line.endpoint1.x +
      current_line.endpoint2.x) / 2;

And begin to modify to compute the y coordinate:

   midpoint.x = (current_line.endpoint1.x +
      current_line.endpoint2.x) / 2;
   midpoint.y = (current_line.endpoint1.x +
      current_line.endpoint2.y) / 2;

Oops, the telephone rings. After answering the telephone, you move on to the next section of code. Unfortunately, you have left a small error waiting to appear once testing begins. This error is particularly insidious if proper unit testing is not being done, leaving the error to be discovered much later.

The more code that is copied during a cut-and-paste operation, the more likely a small change will be forgotten. The more cut-and-paste that is performed, the more likely these tiny errors will creep into the code. Writing proper tests before copying the code can help somewhat, but that is more like treating the symptoms rather than the disease. It is easy to miss something when creating test cases, and cut-and-paste requires more test cases than properly written code does.
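One possible way to shrink the surface for this kind of slip is sketched below. The Point and Line classes and the average helper are assumptions made for the illustration; only the endpoint1 and endpoint2 names come from the example above. The arithmetic is written once, so each coordinate needs a single short call rather than a pasted and hand-edited expression.

```java
// Hypothetical two-dimensional point type.
final class Point {
    final double x, y;

    Point(double x, double y) {
        this.x = x;
        this.y = y;
    }
}

// Hypothetical line type with the endpoint names used in the text.
final class Line {
    final Point endpoint1, endpoint2;

    Line(Point endpoint1, Point endpoint2) {
        this.endpoint1 = endpoint1;
        this.endpoint2 = endpoint2;
    }

    // The averaging arithmetic lives in one place.
    private static double average(double a, double b) {
        return (a + b) / 2;
    }

    Point midpoint() {
        return new Point(
            average(endpoint1.x, endpoint2.x),
            average(endpoint1.y, endpoint2.y));
    }
}
```

A wrong field can still be passed to average, so a test remains useful. Endpoints whose x and y values differ, such as (0, 10) and (4, 20) with an expected midpoint of (2, 15), would expose the half-edited line shown earlier, whereas a symmetric test case might not.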

Too Many Directions

Imagine these instructions to create a new build of an application:

1. Make changes to the code and/or assets.
2. Update the version number in the manifest file.
3. Update the version number in the application descriptor file.
4. If image tile assets have changed, update the width and height of the images in the source code.
5. If map assets have changed, update the map size in the source code.
6. Build and verify.

A short list, but if you look closely, there are three extra steps that should not be necessary. Each of these steps is a chance for human error to occur and wreck the build. The reason for these three extra steps lies in the build design, which forces cut-and-paste updating of the application. For example, the build process should automatically update the version number in the application descriptor file based on what is in the manifest file. Likewise, the assets already contain information on their size that should either be read programmatically at run time or preprocessed by the build process. These changes would result in a much shorter list of actions:

1. Make changes to the code and/or assets.
2. Update the version number in the manifest file.
3. Build and verify.

This is a symptom of forced cut-and-paste programming that often appears later in development or when reusing code from a previous project. Eliminating the instructions that require cut-and-paste should be a top priority on any project. Risk is greatly reduced by removing human interaction from repetitive tasks. Now some of you might believe that you can assign a less expensive employee to handle these repetitive tasks and avoid losing money that way, but this fallacy falls flat in practice. This person will likely become a bottleneck for making changes and is still likely to make mistakes that break the application. Always remember, a little time up front will save a lot of time later.
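As one hedged sketch of how a build step might remove the duplicated version number, the code below reads the version from the manifest text and rewrites the matching line of the descriptor. The attribute name App-Version and the assumption that the descriptor uses the same "Key: value" line format are hypothetical; a real project would substitute its own key and file layout. The manifest is parsed with the standard java.util.jar.Manifest class, which expects every line, including the last, to end with a newline.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

final class DescriptorVersionSync {
    // Assumed attribute name; replace with the key the project really uses.
    static final String VERSION_KEY = "App-Version";

    // Reads the version from the manifest, the single source of truth.
    static String readVersion(String manifestText) {
        try {
            Manifest manifest = new Manifest(new ByteArrayInputStream(
                manifestText.getBytes(StandardCharsets.UTF_8)));
            String version = manifest.getMainAttributes().getValue(VERSION_KEY);
            if (version == null) {
                throw new IllegalStateException(VERSION_KEY + " is missing");
            }
            return version;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the descriptor text with its version line replaced by the
    // manifest's version, so nobody has to edit the descriptor by hand.
    static String syncDescriptor(String manifestText, String descriptorText) {
        String version = readVersion(manifestText);
        StringBuilder out = new StringBuilder();
        for (String line : descriptorText.split("\n")) {
            if (line.startsWith(VERSION_KEY + ":")) {
                line = VERSION_KEY + ": " + version;
            }
            out.append(line).append('\n');
        }
        return out.toString();
    }
}
```

The same idea applies to the asset sizes: a build step or start-up routine that reads the dimensions from the assets themselves would remove the need to repeat them in the source code.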

Information Duplication

It is important to remember that cut-and-paste problems apply to more than just code. In addition, cut-and-paste is not the only method of creating duplicate information. It is easy for two programmers to create similar pieces of code on the same project without realizing it, simply because of poor communication. What we are really talking about is removing duplicate information that people must otherwise keep in sync by hand. Any time there are two occurrences of similar code, there is a potential for error. Later, we will talk about techniques for fixing these instances in already existing code, but first we will examine how these poor coding practices can be avoided altogether.
