17.3. goto

cc2e.com/1785

You might think the debate related to gotos is extinct, but a quick trip through modern source-code repositories like SourceForge.net shows that the goto is still alive and well and living deep in your company's server. Moreover, modern equivalents of the goto debate still crop up in various guises, including debates about multiple returns, multiple loop exits, named loop exits, error processing, and exception handling.

The Argument Against gotos

The general argument against gotos is that code without gotos is higher-quality code. The famous letter that sparked the original controversy was Edsger Dijkstra's "Go To Statement Considered Harmful" in the March 1968 Communications of the ACM. Dijkstra observed that the quality of code was inversely proportional to the number of gotos the programmer used. In subsequent work, Dijkstra has argued that code that doesn't contain gotos can more easily be proven correct.

Code containing gotos is hard to format. Indentation should be used to show logical structure, and gotos have an effect on logical structure. Using indentation to show the logical structure of a goto and its target, however, is difficult or impossible.

Use of gotos defeats compiler optimizations. Some optimizations depend on a program's flow of control residing within a few statements. An unconditional goto makes the flow harder to analyze and reduces the ability of the compiler to optimize the code. Thus, even if introducing a goto produces an efficiency at the source-language level, it may well reduce overall efficiency by thwarting compiler optimizations.

Proponents of gotos sometimes argue that they make code faster or smaller. But code containing gotos is rarely the fastest or smallest possible. Donald Knuth's marvelous, classic article "Structured Programming with go to Statements" gives several examples of cases in which using gotos makes for slower and larger code (Knuth 1974).

In practice, the use of gotos leads to the violation of the principle that code should flow strictly from top to bottom. Even if gotos aren't confusing when used carefully, once gotos are introduced, they spread through the code like termites through a rotting house. If any gotos are allowed, the bad creep in with the good, so it's better not to allow any of them.

Overall, experience in the two decades that followed the publication of Dijkstra's letter showed the folly of producing goto-laden code. In a survey of the literature, Ben Shneiderman concluded that the evidence supports Dijkstra's view that we're better off without the goto (1980), and many modern languages, including Java, don't even have gotos.

The Argument for gotos

The argument for the goto is characterized by an advocacy of its careful use in specific circumstances rather than its indiscriminate use. Most arguments against gotos speak against indiscriminate use. The goto controversy erupted when Fortran was the most popular language. Fortran had no presentable loop structures, and in the absence of good advice on programming loops with gotos, programmers wrote a lot of spaghetti code. Such code was undoubtedly correlated with the production of low-quality programs, but it has little to do with the careful use of a goto to make up for a gap in a modern language's capabilities.

A well-placed goto can eliminate the need for duplicate code. Duplicate code leads to problems if the two sets of code are modified differently. Duplicate code increases the size of source and executable files. The bad effects of the goto are outweighed in such a case by the risks of duplicate code.

The goto is useful in a routine that allocates resources, performs operations on those resources, and then deallocates the resources. With a goto, you can clean up in one section of code. The goto reduces the likelihood of your forgetting to deallocate the resources in each place you detect an error.
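The single-cleanup-section idea can be sketched in C++ roughly as follows. This is a minimal illustration, not code from any particular project: the ReadHead() routine, the CopyStatus codes, and the choice of a file handle plus a heap buffer as the two resources are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical status codes, invented for this sketch.
enum class CopyStatus { Success, OpenError, AllocError, ReadError };

// Reads up to bufferSize bytes from the named file and reports how many
// bytes were read. Every error path jumps to the one cleanup section.
CopyStatus ReadHead( const char *path, std::size_t bufferSize, std::size_t *bytesRead ) {
   // Declare everything before the first goto: C++ does not allow a jump
   // past the initialization of a variable that is still in scope.
   CopyStatus status = CopyStatus::Success;
   std::FILE *file = nullptr;
   char *buffer = nullptr;
   *bytesRead = 0;

   file = std::fopen( path, "rb" );
   if ( file == nullptr ) {
      status = CopyStatus::OpenError;
      goto CLEANUP;
   }

   buffer = static_cast<char *>( std::malloc( bufferSize ) );
   if ( buffer == nullptr ) {
      status = CopyStatus::AllocError;
      goto CLEANUP;
   }

   *bytesRead = std::fread( buffer, 1, bufferSize, file );
   if ( std::ferror( file ) ) {
      status = CopyStatus::ReadError;
   }

CLEANUP:
   // The only place resources are released, whichever path led here.
   std::free( buffer );            // freeing a null pointer is a no-op
   if ( file != nullptr ) {
      std::fclose( file );
   }
   return status;
}
```

Without the label, each early exit would need its own copy of the release code, which is exactly the duplication the goto is meant to avoid here.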

Cross-Reference

For details on using gotos in code that allocates resources, see "Error Processing and gotos" in this section. See also the discussion of exception handling in Section 8.4, "Exceptions."

In some cases, the goto can result in faster and smaller code. Knuth's 1974 article cited a few cases in which the goto produced a legitimate gain.

Good programming doesn't mean eliminating gotos. Methodical decomposition, refinement, and selection of control structures automatically lead to goto-free programs in most cases. Achieving goto-less code is not the aim but the outcome, and putting the focus on avoiding gotos isn't helpful.

Decades' worth of research with gotos failed to demonstrate their harmfulness. In a survey of the literature, B. A. Sheil concluded that unrealistic test conditions, poor data analysis, and inconclusive results failed to support the claim of Shneiderman and others that the number of bugs in code was proportional to the number of gotos (1981). Sheil didn't go so far as to conclude that using gotos is a good idea; rather, he concluded that the experimental evidence against them was not conclusive.

The evidence suggests only that deliberately chaotic control structure degrades [programmer] performance. These experiments provide virtually no evidence for the beneficial effect of any specific method of structuring control flow.
B. A. Sheil

Finally, the goto has been incorporated into many modern languages, including Visual Basic, C++, and the Ada language, the most carefully engineered programming language in history. Ada was developed long after the arguments on both sides of the goto debate had been fully developed, and after considering all sides of the issue, Ada's engineers decided to include the goto.

The Phony goto Debate

A primary feature of most goto discussions is a shallow approach to the question. The arguer on the "gotos are evil" side presents a trivial code fragment that uses gotos and then shows how easy it is to rewrite the fragment without gotos. This proves mainly that it's easy to write trivial code without gotos.

The arguer on the "I can't live without gotos" side usually presents a case in which eliminating a goto results in an extra comparison or the duplication of a line of code. This proves mainly that there's a case in which using a goto results in one less comparison, which is not a significant gain on today's computers.

Most textbooks don't help. They provide a trivial example of rewriting some code without a goto as if that covers the subject. Here's a disguised example of a trivial piece of code from such a textbook:

C++ Example of Code That's Supposed to Be Easy to Rewrite Without gotos

    do {
       GetData( inputFile, data );
       if ( eof( inputFile ) ) {
          goto LOOP_EXIT;
       }
       DoSomething( data );
    } while ( data != -1 );
    LOOP_EXIT:

The book quickly replaces this code with goto-less code:

C++ Example of Supposedly Equivalent Code, Rewritten Without gotos

    GetData( inputFile, data );
    while ( ( !eof( inputFile ) ) && ( ( data != -1 ) ) ) {
       DoSomething( data );
       GetData( inputFile, data );
    }

This so-called "trivial" example contains an error. In the case in which data equals -1 entering the loop, the translated code detects the -1 and exits the loop before executing DoSomething(). The original code executes DoSomething() before the -1 is detected. The programming book trying to show how easy it is to code without gotos translated its own example incorrectly. But the author of that book shouldn't feel too bad; other books make similar mistakes. Even the pros have difficulty translating code that uses gotos.

Here's a faithful translation of the code with no gotos:

C++ Example of Truly Equivalent Code, Rewritten Without gotos

    do {
       GetData( inputFile, data );
       if ( !eof( inputFile )) {
          DoSomething( data );
       }
    } while ( ( data != -1 ) && ( !eof( inputFile ) ) );

Even with a correct translation of the code, the example is still phony because it shows a trivial use of the goto. Such cases are not the ones for which thoughtful programmers choose a goto as their preferred form of control.
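The discrepancy between the textbook rewrite and the original is easy to check with a small harness. The sketch below is one way to do it, under stated assumptions: FakeInput is a hypothetical stand-in for inputFile, eof() is assumed to become true only after a read finds nothing left, and a counter stands in for DoSomething().

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the input file: a list of values plus a cursor.
struct FakeInput {
   std::vector<int> values;
   std::size_t next = 0;
   bool hitEof = false;
};

// Reads the next value; sets the end-of-file flag when nothing is left.
void GetData( FakeInput &input, int &data ) {
   if ( input.next < input.values.size() ) {
      data = input.values[ input.next++ ];
   } else {
      input.hitEof = true;
   }
}

bool eof( const FakeInput &input ) { return input.hitEof; }

// Each routine returns the number of times DoSomething() would have run.
int CountWithGoto( FakeInput input ) {
   int calls = 0;
   int data = 0;
   do {
      GetData( input, data );
      if ( eof( input ) ) {
         goto LOOP_EXIT;
      }
      ++calls;                       // stands in for DoSomething( data )
   } while ( data != -1 );
LOOP_EXIT:
   return calls;
}

int CountTextbookRewrite( FakeInput input ) {
   int calls = 0;
   int data = 0;
   GetData( input, data );
   while ( !eof( input ) && data != -1 ) {
      ++calls;                       // never reached for the -1 value itself
      GetData( input, data );
   }
   return calls;
}

int CountFaithfulRewrite( FakeInput input ) {
   int calls = 0;
   int data = 0;
   do {
      GetData( input, data );
      if ( !eof( input ) ) {
         ++calls;
      }
   } while ( data != -1 && !eof( input ) );
   return calls;
}
```

For an input whose first value is -1, the goto version and the faithful rewrite each process that value once, whereas the textbook rewrite processes nothing. For inputs that never contain -1, all three agree, which is why the error is easy to miss.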

It would be hard at this late date to add anything worthwhile to the theoretical goto debate. What's not usually addressed, however, is the situation in which a programmer fully aware of the goto-less alternatives chooses to use a goto to enhance readability and maintainability.

The following sections present cases in which some experienced programmers have argued for using gotos. The discussions provide examples of code with gotos and code rewritten without gotos and evaluate the tradeoffs between the versions.

Error Processing and gotos

Writing highly interactive code calls for paying a lot of attention to error processing and cleaning up resources when errors occur. The following code example purges a group of files. The routine first gets a group of files to be purged, and then it finds each file, opens it, overwrites it, and erases it. The routine checks for errors at each step.

Visual Basic Code with gotos That Processes Errors and Cleans Up Resources

    ' This routine purges a group of files.
    Sub PurgeFiles( ByRef errorState As Error_Code )
       Dim fileIndex As Integer
       Dim fileToPurge As Data_File
       Dim fileList As File_List
       Dim numFilesToPurge As Integer

       MakePurgeFileList( fileList, numFilesToPurge )

       errorState = FileStatus_Success
       fileIndex = 0
       While ( fileIndex < numFilesToPurge )
          fileIndex = fileIndex + 1
          If Not ( FindFile( fileList( fileIndex ), fileToPurge ) ) Then
             errorState = FileStatus_FileFindError
             GoTo END_PROC       ' <-- 1
          End If

          If Not OpenFile( fileToPurge ) Then
             errorState = FileStatus_FileOpenError
             GoTo END_PROC       ' <-- 2
          End If

          If Not OverwriteFile( fileToPurge ) Then
             errorState = FileStatus_FileOverwriteError
             GoTo END_PROC       ' <-- 3
          End If

          If Not Erase( fileToPurge ) Then
             errorState = FileStatus_FileEraseError
             GoTo END_PROC       ' <-- 4
          End If
       Wend

    END_PROC:       ' <-- 5
       DeletePurgeFileList( fileList, numFilesToPurge )
    End Sub

(1) Here's a GoTo.
(2) Here's a GoTo.
(3) Here's a GoTo.
(4) Here's a GoTo.
(5) Here's the GoTo label.

This routine is typical of circumstances in which experienced programmers decide to use a goto. Similar cases come up when a routine needs to allocate and clean up resources like database connections, memory, or temporary files. The alternative to gotos in those cases is usually duplicating code to clean up the resources. In such cases, a programmer might balance the evil of the goto against the headache of duplicate-code maintenance and decide that the goto is the lesser evil.

You can rewrite the previous routine in several ways to avoid gotos, and each way involves tradeoffs. The possible rewrite strategies follow:

Rewrite with nested if statements. To rewrite with nested if statements, nest the if statements so that each is executed only if the previous test succeeds. This is the standard, textbook programming approach to eliminating gotos. Here's a rewrite of the routine using the standard approach:

Visual Basic Code That Avoids gotos by Using Nested ifs

    ' This routine purges a group of files.
    Sub PurgeFiles( ByRef errorState As Error_Code )
       Dim fileIndex As Integer
       Dim fileToPurge As Data_File
       Dim fileList As File_List
       Dim numFilesToPurge As Integer

       MakePurgeFileList( fileList, numFilesToPurge )

       errorState = FileStatus_Success
       fileIndex = 0
       While ( fileIndex < numFilesToPurge And errorState = FileStatus_Success )       ' <-- 1
          fileIndex = fileIndex + 1

          If FindFile( fileList( fileIndex ), fileToPurge ) Then
             If OpenFile( fileToPurge ) Then
                If OverwriteFile( fileToPurge ) Then
                   If Not Erase( fileToPurge ) Then
                      errorState = FileStatus_FileEraseError
                   End If
                Else ' couldn't overwrite file
                   errorState = FileStatus_FileOverwriteError
                End If
             Else ' couldn't open file
                errorState = FileStatus_FileOpenError
             End If
          Else ' couldn't find file
             errorState = FileStatus_FileFindError       ' <-- 2
          End If
       Wend
       DeletePurgeFileList( fileList, numFilesToPurge )
    End Sub

(1) The While test has been changed to add a test for errorState.
(2) This line is 13 lines away from the If statement that invokes it.

Cross-Reference

This routine could also be rewritten with break and no gotos. For details on that approach, see "Exiting Loops Early" in Section 16.2.

For people used to programming without gotos, this code might be easier to read than the goto version, and if you use it, you won't have to face an inquisition from the goto goon squad.

The main disadvantage of this nested-if approach is that the nesting level is deep, very deep. To understand the code, you have to keep the whole set of nested ifs in your mind at once. Moreover, the distance between the error-processing code and the code that invokes it is too great: the code that sets errorState to FileStatus_FileFindError, for example, is 13 lines from the if statement that invokes it.

Cross-Reference

For more details on indentation and other coding layout issues, see Chapter 31, "Layout and Style." For details on nesting levels, see Section 19.4, "Taming Dangerously Deep Nesting."

With the goto version, no statement is more than four lines from the condition that invokes it. And you don't have to keep the whole structure in your mind at once. You can essentially ignore any preceding conditions that were successful and focus on the next operation. In this case, the goto version is more readable and more maintainable than the nested-if version.

Rewrite with a status variable. To rewrite with a status variable (also called a state variable), create a variable that indicates whether the routine is in an error state. In this case, the routine already uses the errorState status variable, so you can use that.

Visual Basic Code That Avoids gotos by Using a Status Variable

    ' This routine purges a group of files.
    Sub PurgeFiles( ByRef errorState As Error_Code )
       Dim fileIndex As Integer
       Dim fileToPurge As Data_File
       Dim fileList As File_List
       Dim numFilesToPurge As Integer

       MakePurgeFileList( fileList, numFilesToPurge )

       errorState = FileStatus_Success
       fileIndex = 0
       While ( fileIndex < numFilesToPurge ) And ( errorState = FileStatus_Success )       ' <-- 1
          fileIndex = fileIndex + 1

          If Not FindFile( fileList( fileIndex ), fileToPurge ) Then
             errorState = FileStatus_FileFindError
          End If

          If ( errorState = FileStatus_Success ) Then       ' <-- 2
             If Not OpenFile( fileToPurge ) Then
                errorState = FileStatus_FileOpenError
             End If
          End If

          If ( errorState = FileStatus_Success ) Then       ' <-- 3
             If Not OverwriteFile( fileToPurge ) Then
                errorState = FileStatus_FileOverwriteError
             End If
          End If

          If ( errorState = FileStatus_Success ) Then       ' <-- 4
             If Not Erase( fileToPurge ) Then
                errorState = FileStatus_FileEraseError
             End If
          End If
       Wend
       DeletePurgeFileList( fileList, numFilesToPurge )
    End Sub

(1) The While test has been changed to add a test for errorState.
(2) The status variable is tested.
(3) The status variable is tested.
(4) The status variable is tested.

The advantage of the status-variable approach is that it avoids the deeply nested if-then-else structures of the first rewrite and is thus easier to understand. It also places the action following the if-then-else test closer to the test than the nested-if approach did, and it completely avoids else clauses.

Understanding the nested-if version requires some mental gymnastics. The status-variable version is easier to understand because it closely models the way people think about the problem. You find the file. If everything is OK, you open the file. If everything is still OK, you overwrite the file. If everything is still OK…

The disadvantage of this approach is that using status variables isn't as common a practice as it should be. Document their use fully, or some programmers might not understand what you're up to. In this example, the use of well-named enumerated types helps significantly.

Rewrite with try-finally. Some languages, including Visual Basic and Java, provide a try-finally statement that can be used to clean up resources under error conditions.

To rewrite using the try-finally approach, enclose the code that would otherwise need to check for errors inside a try block, and place the cleanup code inside a finally block. The try block specifies the scope of the exception handling, and the finally block performs any resource cleanup. The finally block will always be called regardless of whether an exception is thrown and regardless of whether the PurgeFiles() routine Catches any exception that's thrown.

Visual Basic Code That Avoids gotos by Using try-finally

    ' This routine purges a group of files. Exceptions are passed to the caller.
    Sub PurgeFiles()
       Dim fileIndex As Integer
       Dim fileToPurge As Data_File
       Dim fileList As File_List
       Dim numFilesToPurge As Integer

       MakePurgeFileList( fileList, numFilesToPurge )

       Try
          fileIndex = 0
          While ( fileIndex < numFilesToPurge )
             fileIndex = fileIndex + 1
             FindFile( fileList( fileIndex ), fileToPurge )
             OpenFile( fileToPurge )
             OverwriteFile( fileToPurge )
             Erase( fileToPurge )
          Wend
       Finally
          DeletePurgeFileList( fileList, numFilesToPurge )
       End Try
    End Sub

This approach assumes that all function calls throw exceptions for failures rather than returning error codes.

The advantage of the try-finally approach is that it is simpler than the goto approach and doesn't use gotos. It also avoids the deeply nested if-then-else structures.

The limitation of the try-finally approach is that it must be implemented consistently throughout a code base. If the previous code were part of a code base that used error codes in addition to exceptions, the exception code would be required to set error codes for each possible error, and that requirement would make the code about as complicated as the other approaches.
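C++ has no finally clause, but a destructor that runs when a scope is left gives a comparable cleanup guarantee. The sketch below shows one way that might look. ScopeGuard and PurgeOneFile() are hypothetical names invented for the illustration, and clearing the list stands in for DeletePurgeFileList(). It is a sketch of the idea, not a translation of the Visual Basic routine.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: runs the stored cleanup action when it goes out of
// scope, whether the scope is left normally or by an exception.
class ScopeGuard {
public:
   explicit ScopeGuard( std::function<void()> cleanup )
      : cleanup_( std::move( cleanup ) ) {}
   ~ScopeGuard() { cleanup_(); }
   ScopeGuard( const ScopeGuard & ) = delete;
   ScopeGuard &operator=( const ScopeGuard & ) = delete;
private:
   std::function<void()> cleanup_;
};

// Hypothetical stand-in for the find/open/overwrite/erase steps:
// an empty name is treated as a file that can't be found.
void PurgeOneFile( const std::string &name ) {
   if ( name.empty() ) {
      throw std::runtime_error( "cannot find file" );
   }
}

// Purges each file in the list. Exceptions are passed to the caller.
void PurgeFiles( std::vector<std::string> &fileList ) {
   // Plays the role of the Finally block: the list is released on every
   // exit path from this routine.
   ScopeGuard deleteList( [&fileList] { fileList.clear(); } );

   for ( const std::string &name : fileList ) {
      PurgeOneFile( name );
   }
}
```

As with try-finally, this assumes the called routines report failure by throwing rather than by returning error codes.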

Comparison of the Approaches

Each of the four methods has something to be said for it. The goto approach avoids deep nesting and unnecessary tests but of course has gotos. The nested-if approach avoids gotos but is deeply nested and gives an exaggerated picture of the logical complexity of the routine. The status-variable approach avoids gotos and deep nesting but introduces extra tests. And the try-finally approach avoids both gotos and deep nesting but isn't available in all languages.

Cross-Reference

For a complete list of techniques that can be applied to situations like this, see "Summary of Techniques for Reducing Deep Nesting" in Section 19.4.

The try-finally approach is the most straightforward in languages that provide try-finally and in code bases that haven't already standardized on another approach. If try-finally isn't an option, the status-variable approach is slightly preferable to the goto and nested-if approaches because it's more readable and it models the problem better, but that doesn't make it the best approach in all circumstances.

Any of these techniques works well when applied consistently to all the code in a project. Consider all the tradeoffs, and then make a projectwide decision about which method to favor.

gotos and Sharing Code in an else Clause

One challenging situation in which some programmers would use a goto is the case in which you have two conditional tests and an else clause and you want to execute code in one of the conditions and in the else clause. Here's an example of a case that could drive someone to goto:

C++ Example of Sharing Code in an else Clause with a goto

    if ( statusOk ) {
       if ( dataAvailable ) {
          importantVariable = x;
          goto MID_LOOP;
       }
    }
    else {
       importantVariable = GetValue();

       MID_LOOP:

       // lots of code
       ...
    }

This is a good example because it's logically tortuous: it's nearly impossible to read as it stands, and it's hard to rewrite correctly without a goto. If you think you can easily rewrite it without gotos, ask someone to review your code! Several expert programmers have rewritten it incorrectly.

You can rewrite the code in several ways. You can duplicate code, put the common code into a routine and call it from two places, or retest the conditions. In most languages, the rewrite will be a tiny bit larger and slower than the original, but it will be extremely close. Unless the code is in a really hot loop, rewrite it without thinking about efficiency.

The best rewrite would be to put the // lots of code part into its own routine. Then you can call the routine from the places you would otherwise have used as origins or destinations of gotos and preserve the original structure of the conditional. Here's how it looks:

C++ Example of Sharing Code in an else Clause by Putting Common Code into a Routine

    if ( statusOk ) {
       if ( dataAvailable ) {
          importantVariable = x;
          DoLotsOfCode( importantVariable );
       }
    }
    else {
       importantVariable = GetValue();
       DoLotsOfCode( importantVariable );
    }

Normally, writing a new routine is the best approach. Sometimes, however, it's not practical to put duplicated code into its own routine. In this case, you can work around the impractical solution by restructuring the conditional so that you keep the code in the same routine rather than putting it into a new routine:

C++ Example of Sharing Code in an else Clause Without a goto

    if ( ( statusOk && dataAvailable ) || !statusOk ) {
       if ( statusOk && dataAvailable ) {
          importantVariable = x;
       }
       else {
          importantVariable = GetValue();
       }

       // lots of code
       ...
    }

This is a faithful and mechanical translation of the logic in the goto version. It tests statusOk two extra times and dataAvailable one extra time, but the code is equivalent. If retesting the conditionals bothers you, notice that the value of statusOk doesn't need to be tested twice in the first if test. You can also drop the test for dataAvailable in the second if test.
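Applying both of those simplifications gives an outer test of dataAvailable || !statusOk and an inner test of just statusOk. Because rewrites of this fragment are easy to get wrong, it's worth checking the result against the goto version for all four combinations of the two flags. The sketch below does that under a few assumptions: Outcome and the k... constants are hypothetical names that record which assignment ran, and a flag stands in for the "lots of code" section.

```cpp
// Records which assignment ran and whether the shared section ran.
// All names here are invented for the illustration.
struct Outcome {
   int importantVariable;
   bool sharedCodeRan;
};

const int kUnset = 0;
const int kFromX = 1;          // stands in for "importantVariable = x"
const int kFromGetValue = 2;   // stands in for "importantVariable = GetValue()"

// The original structure, with the goto into the else clause.
Outcome WithGoto( bool statusOk, bool dataAvailable ) {
   Outcome result = { kUnset, false };
   if ( statusOk ) {
      if ( dataAvailable ) {
         result.importantVariable = kFromX;
         goto MID_LOOP;
      }
   }
   else {
      result.importantVariable = kFromGetValue;

   MID_LOOP:
      result.sharedCodeRan = true;   // stands in for "lots of code"
   }
   return result;
}

// The restructured conditional with the redundant tests removed.
Outcome Simplified( bool statusOk, bool dataAvailable ) {
   Outcome result = { kUnset, false };
   if ( dataAvailable || !statusOk ) {
      if ( statusOk ) {
         result.importantVariable = kFromX;
      }
      else {
         result.importantVariable = kFromGetValue;
      }
      result.sharedCodeRan = true;   // stands in for "lots of code"
   }
   return result;
}
```

The two routines return the same Outcome for every combination of statusOk and dataAvailable. In particular, when statusOk is true and dataAvailable is false, neither version assigns importantVariable or runs the shared code, which is the case incorrect rewrites most often miss.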

Cross-Reference

Another approach to this problem is to use a decision table. For details, see Chapter 18, "Table-Driven Methods."

Summary of Guidelines for Using gotos

Use of gotos is a matter of religion. My dogma is that in modern languages, you can easily replace nine out of ten gotos with equivalent sequential constructs. In these simple cases, you should replace gotos out of habit. In the hard cases, you can still exorcise the goto in nine out of ten cases: You can break the code into smaller routines, use try-finally, use nested ifs, test and retest a status variable, or restructure a conditional. Eliminating the goto is harder in these cases, but it's good mental exercise and the techniques discussed in this section give you the tools to do it.

In the remaining one case out of 100 in which a goto is a legitimate solution to the problem, document it clearly and use it. If you have your rain boots on, it's not worth walking around the block to avoid a mud puddle. But keep your mind open to goto-less approaches suggested by other programmers. They might see something you don't.

Here's a summary of guidelines for using gotos:

Use gotos to emulate structured control constructs in languages that don't support them directly. When you do, emulate them exactly. Don't abuse the extra flexibility the goto gives you.
Don't use the goto when an equivalent built-in construct is available.
Measure the performance of any goto used to improve efficiency. In most cases, you can recode without gotos for improved readability and no loss in efficiency. If your case is the exception, document the efficiency improvement so that goto-less evangelists won't remove the goto when they see it.
Cross-Reference

For details on improving efficiency, see Chapter 25, "Code-Tuning Strategies," and Chapter 26, "Code-Tuning Techniques."
Limit yourself to one goto label per routine unless you're emulating structured constructs.
Limit yourself to gotos that go forward, not backward, unless you're emulating structured constructs.
Make sure all goto labels are used. Unused labels might be an indication of missing code, namely the code that goes to the labels. If the labels aren't used, delete them.
Make sure a goto doesn't create unreachable code.
If you're a manager, adopt the perspective that a battle over a single goto isn't worth the loss of the war. If the programmer is aware of the alternatives and is willing to argue, the goto is probably OK.