7.5. How to Use Routine Parameters | Code Complete: A Practical Handbook of Software Construction, Second Edition

< Free Open Study >

Interfaces between routines are some of the most error-prone areas of a program. One often-cited study by Basili and Perricone (1984) found that 39 percent of all errors were internal interface errors, that is, errors in communication between routines. Here are a few guidelines for minimizing interface problems:

Put parameters in input-modify-output order. Instead of ordering parameters randomly or alphabetically, list the parameters that are input-only first, input-and-output second, and output-only third. This ordering implies the sequence of operations happening within the routine: inputting data, changing it, and sending back a result. Here are examples of parameter lists in Ada:

Cross-Reference

For details on documenting routine parameters, see "Commenting Routines" in Section 32.5. For details on formatting parameters, see Section 31.7, "Laying Out Routines."

Ada Example of Parameters in Input-Modify-Output Order

procedure InvertMatrix(
   originalMatrix: in Matrix;       <-- 1
   resultMatrix: out Matrix
);
...
procedure ChangeSentenceCase(
   desiredCase: in StringCase;
   sentence: in out Sentence
);
...
procedure PrintPageNumber(
   pageNumber: in Integer;
   status: out StatusType
);

(1) Ada uses in and out keywords to make input and output parameters clear.

This ordering convention conflicts with the C-library convention of putting the modified parameter first. The input-modify-output convention makes more sense to me, but if you consistently order parameters in some way, you will still do the readers of your code a service.

Consider creating your own in and out keywords. Other modern languages don't support the in and out keywords as Ada does. In those languages, you might still be able to use the preprocessor to create your own in and out keywords:

C++ Example of Defining Your Own In and Out Keywords

#define IN
#define OUT
void InvertMatrix(
   IN Matrix originalMatrix,
   OUT Matrix *resultMatrix
);
...
void ChangeSentenceCase(
   IN StringCase desiredCase,
   IN OUT Sentence *sentenceToEdit
);
...
void PrintPageNumber(
   IN int pageNumber,
   OUT StatusType &status
);

In this case, the IN and OUT macro-keywords are used for documentation purposes. To make the value of a parameter changeable by the called routine, the parameter still needs to be passed as a pointer or as a reference parameter.

Before adopting this technique, be sure to consider a pair of significant drawbacks. Defining your own IN and OUT keywords extends the C++ language in a way that will be unfamiliar to most people reading your code. If you extend the language this way, be sure to do it consistently, preferably projectwide. A second limitation is that the IN and OUT keywords won't be enforceable by the compiler, which means that you could potentially label a parameter as IN and then modify it inside the routine anyway. That could lull a reader of your code into assuming code is correct when it isn't. Using C++'s const keyword will normally be the preferable means of identifying input-only parameters.
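As a minimal sketch of the const alternative, the InvertMatrix declaration might be written as follows. The Matrix type here is a hypothetical stand-in (a 2-by-2 array of doubles), and the bool return value is an assumption added for illustration; neither comes from the example above.

```cpp
#include <array>

// Hypothetical stand-in for the Matrix type used in the examples above.
using Matrix = std::array<std::array<double, 2>, 2>;

// originalMatrix is input-only: it is passed by const reference, so the
// compiler rejects any attempt to modify it inside the routine.
// resultMatrix is output-only: it is passed by non-const reference.
// Returns false when the matrix is singular and cannot be inverted.
bool InvertMatrix(const Matrix& originalMatrix, Matrix& resultMatrix) {
    const double determinant =
        originalMatrix[0][0] * originalMatrix[1][1] -
        originalMatrix[0][1] * originalMatrix[1][0];
    if (determinant == 0.0) {
        return false;
    }
    resultMatrix[0][0] =  originalMatrix[1][1] / determinant;
    resultMatrix[0][1] = -originalMatrix[0][1] / determinant;
    resultMatrix[1][0] = -originalMatrix[1][0] / determinant;
    resultMatrix[1][1] =  originalMatrix[0][0] / determinant;
    // originalMatrix[0][0] = 0.0;  // would not compile: the parameter is const
    return true;
}
```

Unlike the IN macro, the const qualifier is checked by the compiler, so a reader can rely on it.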

If several routines use similar parameters, put the similar parameters in a consistent order. The order of routine parameters can be a mnemonic, and inconsistent order can make parameters hard to remember. For example, in C, the fprintf() routine is the same as the printf() routine except that it adds a file as the first argument. A similar routine, fputs(), is the same as puts() except that it adds a file as the last argument. This is an aggravating, pointless difference that makes the parameters of these routines harder to remember than they need to be.

On the other hand, the routine strncpy() in C takes the arguments target string, source string, and maximum number of bytes, in that order, and the routine memcpy() takes the same arguments in the same order. The similarity between the two routines helps in remembering the parameters in either routine.

Use all the parameters. If you pass a parameter to a routine, use it. If you aren't using it, remove the parameter from the routine interface. Unused parameters are correlated with an increased error rate. In one study, 46 percent of routines with no unused variables had no errors, and only 17 to 29 percent of routines with more than one unreferenced variable had no errors (Card, Church, and Agresti 1986).

This rule to remove unused parameters has one exception. If you're compiling part of your program conditionally, you might compile out parts of a routine that use a certain parameter. Be nervous about this practice, but if you're convinced it works, that's OK too. In general, if you have a good reason not to use a parameter, go ahead and leave it in place. If you don't have a good reason, make the effort to clean up the code.
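The conditional-compilation exception might look like the following sketch. The ENABLE_TRACE macro, the RecordSample routine, and its parameters are all hypothetical names chosen for illustration. The [[maybe_unused]] attribute assumes a C++17 compiler; it marks the parameter as intentionally unreferenced in some builds.

```cpp
#include <cstdio>

// Hypothetical routine: 'label' is referenced only in tracing builds.
// [[maybe_unused]] (C++17) suppresses the unused-parameter warning in builds
// where ENABLE_TRACE is not defined.
int RecordSample(int sampleValue, [[maybe_unused]] const char* label) {
#ifdef ENABLE_TRACE
    std::printf("%s: %d\n", label, sampleValue);
#endif
    return sampleValue * 2;
}
```

Marking the parameter explicitly tells the next reader that the omission is deliberate rather than an oversight.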

Put status or error variables last. By convention, status variables and variables that indicate an error has occurred go last in the parameter list. They are incidental to the main purpose of the routine, and they are output-only parameters, so it's a sensible convention.
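Here is one way the ordering conventions so far could combine in C++. The AppendPageNumber routine and the StatusType values are hypothetical; treat this as a sketch rather than a prescribed layout.

```cpp
#include <string>

// Hypothetical status type; the names are illustrative only.
enum class StatusType { Success, InvalidPageNumber };

// Parameters in input-modify-output order, with the status variable last.
void AppendPageNumber(
    int pageNumber,           // input only
    std::string& footerText,  // modified: the page number is appended
    StatusType& status        // output only: reports success or failure
) {
    if (pageNumber < 1) {
        status = StatusType::InvalidPageNumber;
        return;
    }
    footerText += " - page " + std::to_string(pageNumber);
    status = StatusType::Success;
}
```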

Don't use routine parameters as working variables. It's dangerous to use the parameters passed to a routine as working variables. Use local variables instead. For example, in the following Java fragment, the variable inputVal is improperly used to store intermediate results of a computation:

Java Example of Improper Use of Input Parameters

int Sample( int inputVal ) {
   inputVal = inputVal * CurrentMultiplier( inputVal );
   inputVal = inputVal + CurrentAdder( inputVal );
   ...
   return inputVal;       <-- 1
}

(1) At this point, inputVal no longer contains the value that was input.

In this code fragment, inputVal is misleading because by the time execution reaches the last line, inputVal no longer contains the input value; it contains a computed value based in part on the input value, and it is therefore misnamed. If you later need to modify the routine to use the original input value in some other place, you'll probably use inputVal and assume that it contains the original input value when it actually doesn't.

How do you solve the problem? Can you solve it by renaming inputVal? Probably not. You could name it something like workingVal, but that's an incomplete solution because the name fails to indicate that the variable's original value comes from outside the routine. You could name it something ridiculous like inputValThatBecomesWorkingVal or give up completely and name it x or val, but all these approaches are weak.

A better approach is to avoid current and future problems by using working variables explicitly. The following code fragment demonstrates the technique:

Java Example of Good Use of Input Parameters

int Sample( int inputVal ) {
   int workingVal = inputVal;
   workingVal = workingVal * CurrentMultiplier( workingVal );
   workingVal = workingVal + CurrentAdder( workingVal );
   ...        <-- 1
   ...
   return workingVal;
}

(1) If you need to use the original value of inputVal here or somewhere else, it's still available.

Introducing the new variable workingVal clarifies the role of inputVal and eliminates the chance of erroneously using inputVal at the wrong time. (Don't take this reasoning as a justification for literally naming a variable inputVal or workingVal. In general, inputVal and workingVal are terrible names for variables, and these names are used in this example only to make the variables' roles clear.)

Assigning the input value to a working variable emphasizes where the value comes from. It eliminates the possibility that a variable from the parameter list will be modified accidentally. In C++, this practice can be enforced by the compiler using the keyword const. If you designate a parameter as const, you're not allowed to modify its value within a routine.
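A C++ rendering of the same routine shows the const enforcement. The bodies of CurrentMultiplier and CurrentAdder are invented here purely so that the sketch compiles; the Java fragment does not define them.

```cpp
// Hypothetical helpers standing in for the routines called in the Java example.
int CurrentMultiplier(int value) { return value > 0 ? 2 : 1; }
int CurrentAdder(int value) { return value % 2 == 0 ? 10 : 0; }

// Declaring the parameter const makes the compiler reject any assignment to it.
int Sample(const int inputVal) {
    int workingVal = inputVal;
    workingVal = workingVal * CurrentMultiplier(workingVal);
    workingVal = workingVal + CurrentAdder(workingVal);
    // inputVal = workingVal;  // would not compile: inputVal is const
    return workingVal;
}
```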

Document interface assumptions about parameters. If you assume the data being passed to your routine has certain characteristics, document the assumptions as you make them. It's not a waste of effort to document your assumptions both in the routine itself and in the place where the routine is called. Don't wait until you've written the routine to go back and write the comments; you won't remember all your assumptions. Even better than commenting your assumptions, use assertions to put them into code.

Cross-Reference

For details on interface assumptions, see the introduction to Chapter 8, "Defensive Programming." For details on documentation, see Chapter 32, "Self-Documenting Code."

What kinds of interface assumptions about parameters should you document?

Whether parameters are input-only, modified, or output-only
Units of numeric parameters (inches, feet, meters, and so on)
Meanings of status codes and error values if enumerated types aren't used
Ranges of expected values
Specific values that should never appear
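The kinds of assumptions listed above might be documented and asserted as in the following sketch. The ComputeArea routine, its units, and its ranges are hypothetical choices made for illustration.

```cpp
#include <cassert>

// Hypothetical routine illustrating documented interface assumptions.
//
// widthInMeters, heightInMeters: input only; units are meters; each must be
//   greater than 0 and no more than 1000.
// areaInSquareMeters: output only.
void ComputeArea(double widthInMeters, double heightInMeters,
                 double& areaInSquareMeters) {
    // The assertions restate the documented ranges in executable form.
    assert(widthInMeters > 0.0 && widthInMeters <= 1000.0);
    assert(heightInMeters > 0.0 && heightInMeters <= 1000.0);
    areaInSquareMeters = widthInMeters * heightInMeters;
}
```

Keep in mind that the standard assert macro is compiled out when NDEBUG is defined, so these checks document and verify assumptions during development; they do not replace error handling in production builds.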

Limit the number of a routine's parameters to about seven. Seven is a magic number for people's comprehension. Psychological research has found that people generally cannot keep track of more than about seven chunks of information at once (Miller 1956). This discovery has been applied to an enormous number of disciplines, and it seems safe to conjecture that most people can't keep track of more than about seven routine parameters at once.

In practice, how much you can limit the number of parameters depends on how your language handles complex data types. If you program in a modern language that supports structured data, you can pass a composite data type containing 13 fields and think of it as one mental "chunk" of data. If you program in a more primitive language, you might need to pass all 13 fields individually.

If you find yourself consistently passing more than a few arguments, the coupling among your routines is too tight. Design the routine or group of routines to reduce the coupling. If you are passing the same data to many different routines, group the routines into a class and treat the frequently used data as class data.
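As a rough sketch of that last suggestion, suppose several routines each needed the same four values. The Rectangle class below is hypothetical; it stores the values once as class data so the individual routines need no parameters for them.

```cpp
// Hypothetical class: the four values that would otherwise be passed to every
// routine are stored once as class data.
class Rectangle {
public:
    Rectangle(double left, double top, double width, double height)
        : left_(left), top_(top), width_(width), height_(height) {}

    // None of these routines needs parameters for the rectangle's own data.
    double Area() const { return width_ * height_; }
    double RightEdge() const { return left_ + width_; }
    double BottomEdge() const { return top_ + height_; }

private:
    double left_;
    double top_;
    double width_;
    double height_;
};
```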

Cross-Reference

For details on how to think about interfaces, see "Good Abstraction" in Section 6.2.

Consider an input, modify, and output naming convention for parameters. If you find that it's important to distinguish among input, modify, and output parameters, establish a naming convention that identifies them. You could prefix them with i_, m_, and o_. If you're feeling verbose, you could prefix them with Input_, Modify_, and Output_.

Pass the variables or objects that the routine needs to maintain its interface abstraction. There are two competing schools of thought about how to pass members of an object to a routine. Suppose you have an object that exposes data through 10 access routines and the called routine needs three of those data elements to do its job.

Proponents of the first school of thought argue that only the three specific elements needed by the routine should be passed. They argue that doing this will keep the connections between routines to a minimum; reduce coupling; and make them easier to understand, reuse, and so on. They say that passing the whole object to a routine violates the principle of encapsulation by potentially exposing all 10 access routines to the routine that's called.

Proponents of the second school argue that the whole object should be passed. They argue that the interface can remain more stable if the called routine has the flexibility to use additional members of the object without changing the routine's interface. They argue that passing three specific elements violates encapsulation by exposing which specific data elements the routine is using.

I think both these rules are simplistic and miss the most important consideration: what abstraction is presented by the routine's interface? If the abstraction is that the routine expects you to have three specific data elements, and it is only a coincidence that those three elements happen to be provided by the same object, then you should pass the three specific data elements individually. However, if the abstraction is that you will always have that particular object in hand and the routine will do something or other with that object, then you truly do break the abstraction when you expose the three specific data elements.

If you're passing the whole object and you find yourself creating the object, populating it with the three elements needed by the called routine, and then pulling those elements out of the object after the routine is called, that's an indication that you should be passing the three specific elements rather than the whole object. (In general, code that "sets up" for a call to a routine or "takes down" after a call to a routine is an indication that the routine is not well designed.)

If you find yourself frequently changing the parameter list to the routine, with the parameters coming from the same object each time, that's an indication that you should be passing the whole object rather than specific elements.
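The distinction between the two abstractions can be sketched in C++. The Employee class and both routines are hypothetical, and the object is reduced to three access routines to keep the example short.

```cpp
#include <string>
#include <utility>

// Hypothetical object that exposes its data through access routines.
class Employee {
public:
    Employee(std::string name, double hourlyRate, double hoursWorked)
        : name_(std::move(name)),
          hourlyRate_(hourlyRate),
          hoursWorked_(hoursWorked) {}

    const std::string& Name() const { return name_; }
    double HourlyRate() const { return hourlyRate_; }
    double HoursWorked() const { return hoursWorked_; }

private:
    std::string name_;
    double hourlyRate_;
    double hoursWorked_;
};

// Abstraction: "compute pay from a rate and a number of hours." The values
// only happen to come from an Employee, so they are passed individually.
double GrossPay(double hourlyRate, double hoursWorked) {
    return hourlyRate * hoursWorked;
}

// Abstraction: "describe this employee." The routine works on the object as a
// whole, so the whole object is passed.
std::string PayrollLine(const Employee& employee) {
    const double pay = GrossPay(employee.HourlyRate(), employee.HoursWorked());
    return employee.Name() + ": " + std::to_string(static_cast<int>(pay));
}
```

GrossPay can be reused with values that never belonged to an Employee, whereas PayrollLine can start using additional members of the object without a change to its interface.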

Use named parameters. In some languages, you can explicitly associate formal parameters with actual parameters. This makes parameter usage more self-documenting and helps avoid errors from mismatching parameters. Here's an example in Visual Basic:

Visual Basic Example of Explicitly Identifying Parameters

Private Function Distance3d( _
   ByVal xDistance As Coordinate, _       <-- 1
   ByVal yDistance As Coordinate, _         |
   ByVal zDistance As Coordinate _       <-- 1
)
   ...
End Function
...
Private Function Velocity( _
   ByVal latitude as Coordinate, _
   ByVal longitude as Coordinate, _
   ByVal elevation as Coordinate _
)
   ...
   Distance = Distance3d( xDistance := latitude, yDistance := longitude, _       <-- 2
      zDistance := elevation )
   ...
End Function

(1) Here's where the formal parameters are declared.
(2) Here's where the actual parameters are mapped to the formal parameters.

This technique is especially useful when you have longer-than-average lists of identically typed arguments, which increases the chances that you can insert a parameter mismatch without the compiler detecting it. Explicitly associating parameters may be overkill in many environments, but in safety-critical or other high-reliability environments the extra assurance that parameters match up the way you expect can be worthwhile.
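C++ has no named-argument syntax, so the Visual Basic technique can't be used directly. One substitute, which is a different technique with a similar effect, is to give each parameter its own small wrapper type. The XDistance, YDistance, and ZDistance types below are hypothetical; the sketch only shows how a swapped argument becomes a compile-time error.

```cpp
#include <cmath>

// Hypothetical wrapper types: one per parameter, so each argument is labeled
// at the call site and a swapped argument fails to compile.
struct XDistance { double value; };
struct YDistance { double value; };
struct ZDistance { double value; };

double Distance3d(XDistance x, YDistance y, ZDistance z) {
    return std::sqrt(x.value * x.value + y.value * y.value + z.value * z.value);
}

double DistanceToSamplePoint() {
    // Each argument names the formal parameter it is meant for.
    return Distance3d(XDistance{2.0}, YDistance{3.0}, ZDistance{6.0});
    // Distance3d(YDistance{3.0}, XDistance{2.0}, ZDistance{6.0}) would not compile.
}
```

As with named parameters, the extra ceremony is probably justified only where identically typed arguments are easy to confuse and mistakes are costly.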

Make sure actual parameters match formal parameters. Formal parameters, also known as "dummy parameters," are the variables declared in a routine definition. Actual parameters are the variables, constants, or expressions used in the actual routine calls.

A common mistake is to put the wrong type of variable in a routine call, for example, using an integer when a floating-point value is needed. (This is a problem only in weakly typed languages like C when you're not using full compiler warnings. Strongly typed languages such as C++ and Java don't have this problem.) When arguments are input only, this is seldom a problem; usually the compiler converts the actual type to the formal type before passing it to the routine. If it is a problem, usually your compiler gives you a warning. But in some cases, particularly when the argument is used for both input and output, you can get stung by passing the wrong type of argument.

Develop the habit of checking types of arguments in parameter lists and heeding compiler warnings about mismatched parameter types.
