11.1. Considerations in Choosing Good Names


You can't give a variable a name the way you give a dog a name: because it's cute or it has a good sound. Unlike the dog and its name, which are different entities, a variable and a variable's name are essentially the same thing. Consequently, the goodness or badness of a variable is largely determined by its name. Choose variable names with care.

Here's an example of code that uses bad variable names:

Java Example of Poor Variable Names

x = x - xx;
xxx = fido + SalesTax( fido );
x = x + LateFee( x1, x ) + xxx;
x = x + Interest( x1, x );

What's happening in this piece of code? What do x1, xx, and xxx mean? What does fido mean? Suppose someone told you that the code computed a total customer bill based on an outstanding balance and a new set of purchases. Which variable would you use to print the customer's bill for just the new set of purchases?

Here's a version of the same code that makes these questions easier to answer:

Java Example of Good Variable Names

balance = balance - lastPayment;
monthlyTotal = newPurchases + SalesTax( newPurchases );
balance = balance + LateFee( customerID, balance ) + monthlyTotal;
balance = balance + Interest( customerID, balance );

In view of the contrast between these two pieces of code, a good variable name is readable, memorable, and appropriate. You can use several general rules of thumb to achieve these goals.

The Most Important Naming Consideration

The most important consideration in naming a variable is that the name fully and accurately describe the entity the variable represents. An effective technique for coming up with a good name is to state in words what the variable represents. Often that statement itself is the best variable name. It's easy to read because it doesn't contain cryptic abbreviations, and it's unambiguous. Because it's a full description of the entity, it won't be confused with something else. And it's easy to remember because the name is similar to the concept.

For a variable that represents the number of people on the U.S. Olympic team, you would create the name numberOfPeopleOnTheUsOlympicTeam. A variable that represents the number of seats in a stadium would be numberOfSeatsInTheStadium. A variable that represents the maximum number of points scored by a country's team in any modern Olympics would be maximumNumberOfPointsInModernOlympics. A variable that contains the current interest rate is better named rate or interestRate than r or x. You get the idea.

Note two characteristics of these names. First, they're easy to decipher. In fact, they don't need to be deciphered at all because you can simply read them. But second, some of the names are long, too long to be practical. I'll get to the question of variable-name length shortly.

Table 11-1 shows several examples of variable names, good and bad:

Table 11-1. Examples of Good and Bad Variable Names
Purpose of Variable	Good Names, Good Descriptors	Bad Names, Poor Descriptors
Running total of checks written to date	runningTotal, checkTotal	written, ct, checks, CHKTTL, x, x1, x2
Velocity of a bullet train	velocity, trainVelocity, velocityInMph	velt, v, tv, x, x1, x2, train
Current date	currentDate, todaysDate	cd, current, c, x, x1, x2, date
Lines per page	linesPerPage	lpp, lines, l, x, x1, x2

The names currentDate and todaysDate are good names because they fully and accurately describe the idea of "current date." In fact, they use the obvious words. Programmers sometimes overlook using the ordinary words, which is often the easiest solution. Because they're too short and not at all descriptive, cd and c are poor names. current is poor because it doesn't tell you what is current. date is almost a good name, but it's a poor name in the final analysis because the date involved isn't just any date, but the current date; date by itself gives no such indication. x, x1, and x2 are poor names because they're always poor names. x traditionally represents an unknown quantity; if you don't want your variables to be unknown quantities, think of better names.

Names should be as specific as possible. Names like x, temp, and i that are general enough to be used for more than one purpose are not as informative as they could be and are usually bad names.

Problem Orientation

A good mnemonic name generally speaks to the problem rather than the solution. A good name tends to express the what more than the how. In general, if a name refers to some aspect of computing rather than to the problem, it's a how rather than a what. Avoid such a name in favor of a name that refers to the problem itself.

A record of employee data could be called inputRec or employeeData. inputRec is a computer term that refers to computing ideas: input and record. employeeData refers to the problem domain rather than the computing universe. Similarly, for a bit field indicating printer status, bitFlag is a more computerish name than printerReady. In an accounting application, calcVal is more computerish than sum.
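
The contrast can be sketched in Java. This is a minimal illustration, not code from any real accounting system: the class name, both method names, and the invoiceAmounts parameter are hypothetical. The two methods do the same arithmetic; only the second tells the reader what is being added up.

```java
class ProblemOrientedNames {

    // Computer-oriented names: they describe how the data is held,
    // not what it means in the accounting problem.
    static double calcVal(double[] inputRec) {
        double val = 0.0;
        for (double x : inputRec) {
            val += x;
        }
        return val;
    }

    // Problem-oriented names: the same computation, named for the
    // problem domain (hypothetical invoice amounts).
    static double sumInvoiceAmounts(double[] invoiceAmounts) {
        double sum = 0.0;
        for (double invoiceAmount : invoiceAmounts) {
            sum += invoiceAmount;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] invoiceAmounts = { 120.00, 75.50, 4.50 };
        System.out.println(sumInvoiceAmounts(invoiceAmounts));
    }
}
```

A reader who meets calcVal( inputRec ) at a call site has to open the method to learn what it computes; sumInvoiceAmounts( invoiceAmounts ) answers that question in place.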

Optimum Name Length

The optimum length for a name seems to be somewhere between the lengths of x and maximumNumberOfPointsInModernOlympics. Names that are too short don't convey enough meaning. The problem with names like x1 and x2 is that even if you can discover what x is, you won't know anything about the relationship between x1 and x2. Names that are too long are hard to type and can obscure the visual structure of a program.

Gorla, Benander, and Benander found that the effort required to debug a program was minimized when variables had names that averaged 10 to 16 characters (1990). Programs with names averaging 8 to 20 characters were almost as easy to debug. The guideline doesn't mean that you should try to make all of your variable names 10 to 16 characters long. It does mean that if you look over your code and see many names that are shorter, you should check to be sure that the names are as clear as they need to be.
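
One way to act on that last point is a quick, informal check of the names in a piece of code. The sketch below is an assumption about how such a check might look, not a tool described in the study: the class, its method names, and the minLength parameter are all hypothetical. It computes the average name length and lists the names that fall below a chosen threshold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NameLengthCheck {

    // Average number of characters per name; 0.0 for an empty list.
    static double averageLength(List<String> variableNames) {
        if (variableNames.isEmpty()) {
            return 0.0;
        }
        int characterTotal = 0;
        for (String variableName : variableNames) {
            characterTotal += variableName.length();
        }
        return (double) characterTotal / variableNames.size();
    }

    // Names shorter than minLength, in their original order.
    // These are candidates for a second look, not automatic renames.
    static List<String> namesWorthReviewing(List<String> variableNames, int minLength) {
        List<String> shortNames = new ArrayList<>();
        for (String variableName : variableNames) {
            if (variableName.length() < minLength) {
                shortNames.add(variableName);
            }
        }
        return shortNames;
    }

    public static void main(String[] args) {
        List<String> variableNames =
            Arrays.asList("x1", "np", "teamMemberCount", "numSeatsInStadium");
        System.out.println(averageLength(variableNames));
        System.out.println(namesWorthReviewing(variableNames, 10));
    }
}
```

A flagged name is only a prompt to ask whether the name is clear enough; some short names are perfectly reasonable.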

You'll probably come out ahead by taking the Goldilocks-and-the-Three-Bears approach to naming variables, as Table 11-2 illustrates.

Table 11-2. Variable Names That Are Too Long, Too Short, or Just Right
Too long:	numberOfPeopleOnTheUsOlympicTeam
	numberOfSeatsInTheStadium
	maximumNumberOfPointsInModernOlympics
Too short:	n, np, ntm
	n, ns, nsisd
	m, mp, max, points
Just right:	numTeamMembers, teamMemberCount
	numSeatsInStadium, seatCount
	teamPointsMax, pointsRecord

The Effect of Scope on Variable Names

Cross-Reference

Scope is discussed in more detail in Section 10.4, "Scope."

Are short variable names always bad? No, not always. When you give a variable a short name like i, the length itself says something about the variable: namely, that the variable is a scratch value with a limited scope of operation.

A programmer reading such a variable should be able to assume that its value isn't used outside a few lines of code. When you name a variable i, you're saying, "This variable is a run-of-the-mill loop counter or array index and doesn't have any significance outside these few lines of code."

A study by W. J. Hansen found that longer names are better for rarely used variables or global variables and shorter names are better for local variables or loop variables (Shneiderman 1980). Short names are subject to many problems, however, and some careful programmers avoid them altogether as a matter of defensive-programming policy.
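
A minimal Java sketch of that trade-off follows. The class, the constant, and the method are hypothetical names chosen for illustration. The constant is visible throughout the class, so its name is spelled out; the loop index lives for three lines, so i is enough.

```java
class ScopeAndNameLength {

    // Visible everywhere in the class and read far from its
    // declaration, so the name has to explain itself.
    static final int POINTS_AWARDED_PER_WIN = 3;

    static int leaguePointsTotal(int[] winsPerTeam) {
        int pointsTotal = 0;
        // i is a scratch array index; nothing outside this loop depends on it.
        for (int i = 0; i < winsPerTeam.length; i++) {
            pointsTotal += winsPerTeam[i] * POINTS_AWARDED_PER_WIN;
        }
        return pointsTotal;
    }

    public static void main(String[] args) {
        int[] winsPerTeam = { 2, 0, 5 };
        System.out.println(leaguePointsTotal(winsPerTeam));
    }
}
```

If the loop body grew or the index started to be used after the loop, a longer name such as teamIndex would probably serve the reader better.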

Use qualifiers on names that are in the global namespace. If you have variables that are in the global namespace (named constants, class names, and so on), consider whether you need to adopt a convention for partitioning the global namespace and avoiding naming conflicts. In C++ and C#, you can use the namespace keyword to partition the global namespace.

C++ Example of Using the namespace Keyword to Partition the Global Namespace

namespace UserInterfaceSubsystem {
   ...
   // lots of declarations
   ...
}
namespace DatabaseSubsystem {
   ...
   // lots of declarations
   ...
}

If you declare an Employee class in both the UserInterfaceSubsystem and the DatabaseSubsystem, you can identify which you wanted to refer to by writing UserInterfaceSubsystem::Employee or DatabaseSubsystem::Employee. In Java, you can accomplish the same thing by using packages.
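
The Java version of this idea can be sketched with two classes from the standard library rather than the Employee classes above, because a two-package Employee example would need several source files. java.util.Date and java.sql.Date share the simple name Date, and the package prefix identifies which one is meant. The wrapper class and its method names are hypothetical.

```java
class PackageQualifiedNames {

    // Both classes have the simple name "Date"; the package
    // qualifier is what keeps them apart.
    static String utilDateName() {
        return java.util.Date.class.getName();
    }

    static String sqlDateName() {
        return java.sql.Date.class.getName();
    }

    public static void main(String[] args) {
        // Fully qualified names play the role that
        // Subsystem::Employee plays in the C++ example.
        java.util.Date utilDate = new java.util.Date(0L);
        java.sql.Date sqlDate = new java.sql.Date(0L);
        System.out.println(utilDate.getClass().getName());
        System.out.println(sqlDate.getClass().getName());
    }
}
```

In everyday code you would usually import one of the two classes and fully qualify only the other, which keeps most references short.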

In languages that don't support namespaces or packages, you can still use naming conventions to partition the global namespace. One convention is to require that globally visible classes be prefixed with a subsystem mnemonic. The user interface employee class might become uiEmployee, and the database employee class might become dbEmployee. This minimizes the risk of global-namespace collisions.

Computed-Value Qualifiers in Variable Names

Many programs have variables that contain computed values: totals, averages, maximums, and so on. If you modify a name with a qualifier like Total, Sum, Average, Max, Min, Record, String, or Pointer, put the modifier at the end of the name.

This practice offers several advantages. First, the most significant part of the variable name, the part that gives the variable most of its meaning, is at the front, so it's most prominent and gets read first. Second, by establishing this convention, you avoid the confusion you might create if you were to use both totalRevenue and revenueTotal in the same program. The names are semantically equivalent, and the convention would prevent their being used as if they were different. Third, a set of names like revenueTotal, expenseTotal, revenueAverage, and expenseAverage has a pleasing symmetry. A set of names like totalRevenue, expenseTotal, revenueAverage, and averageExpense doesn't appeal to a sense of order. Finally, the consistency improves readability and eases maintenance.

An exception to the rule that computed values go at the end of the name is the customary position of the Num qualifier. Placed at the beginning of a variable name, Num refers to a total: numCustomers is the total number of customers. Placed at the end of the variable name, Num refers to an index: customerNum is the number of the current customer. The s at the end of numCustomers is another tip-off about the difference in meaning. But, because using Num so often creates confusion, it's probably best to sidestep the whole issue by using Count or Total to refer to a total number of customers and Index to refer to a specific customer. Thus, customerCount is the total number of customers and customerIndex refers to a specific customer.
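
The conventions in this section can be pulled together in a short Java sketch. Everything here is illustrative: the class, the method names, and the sample data are hypothetical, and summarizeRevenue assumes a non-empty array. Computed-value qualifiers sit at the end of each name, and Count and Index replace the ambiguous Num.

```java
class ComputedValueQualifiers {

    // Returns { revenueTotal, revenueAverage, revenueMax }.
    // Assumes monthlyRevenue has at least one element.
    static double[] summarizeRevenue(double[] monthlyRevenue) {
        double revenueTotal = 0.0;
        double revenueMax = Double.NEGATIVE_INFINITY;
        for (double revenue : monthlyRevenue) {
            revenueTotal += revenue;
            revenueMax = Math.max(revenueMax, revenue);
        }
        double revenueAverage = revenueTotal / monthlyRevenue.length;
        return new double[] { revenueTotal, revenueAverage, revenueMax };
    }

    // customerCount is how many customers there are;
    // customerIndex identifies one of them. Returns -1 if not found.
    static int findCustomerIndex(String[] customerNames, String wantedName) {
        int customerCount = customerNames.length;
        for (int customerIndex = 0; customerIndex < customerCount; customerIndex++) {
            if (customerNames[customerIndex].equals(wantedName)) {
                return customerIndex;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        double[] summary = summarizeRevenue(new double[] { 100.0, 200.0, 300.0 });
        System.out.println(summary[0] + " " + summary[1] + " " + summary[2]);
        System.out.println(findCustomerIndex(new String[] { "Ada", "Grace", "Linus" }, "Grace"));
    }
}
```

Because every name leads with revenue or customer, related variables sort and scan together, and no reader has to wonder whether a Num means a count or a position.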

Common Opposites in Variable Names