12.9. Creating Your Own Types (Type Aliasing) | Code Complete: A Practical Handbook of Software Construction, Second Edition


Programmer-defined data types are one of the most powerful capabilities a language can give you to clarify your understanding of a program. They protect your program against unforeseen changes and make it easier to read, all without requiring you to design, construct, or test new classes. If you're using C, C++, or another language that allows user-defined types, take advantage of them!

Cross-Reference

In many cases, it's better to create a class than to create a simple data type. For details, see Chapter 6, "Working Classes."

To appreciate the power of type creation, suppose you're writing a program to convert coordinates in an x, y, z system to latitude, longitude, and elevation. You think that double-precision floating-point numbers might be needed but would prefer to write a program with single-precision floating-point numbers until you're absolutely sure. You can create a new type specifically for coordinates by using a typedef statement in C or C++ or the equivalent in another language. Here's how you'd set up the type definition in C++:

C++ Example of Creating a Type

typedef float Coordinate; // for coordinate variables

This type definition declares a new type name, Coordinate, that's functionally the same as the type float; the compiler treats the two as synonyms. To use the new type, you declare variables with it just as you would with a predefined type such as float. Here's an example:

C++ Example of Using the Type You've Created

Routine1( ... ) {
   Coordinate latitude;      // latitude in degrees
   Coordinate longitude;     // longitude in degrees
   Coordinate elevation;     // elevation in meters from earth center
   ...
}
...
Routine2( ... ) {
   Coordinate x;   // x coordinate in meters
   Coordinate y;   // y coordinate in meters
   Coordinate z;   // z coordinate in meters
   ...
}

In this code, the variables latitude, longitude, elevation, x, y, and z are all declared to be of type Coordinate.

Now suppose that the program changes and you find that you need to use double-precision variables for coordinates after all. Because you defined a type specifically for coordinate data, all you have to change is the type definition. And you have to change it in only one place: in the typedef statement. Here's the changed type definition:

C++ Example of Changed Type Definition

typedef double Coordinate; // for coordinate variables       <-- 1

(1)The original float has changed to double.
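
As a minimal sketch of why the one-line change is enough, the fragment below routes every coordinate through the alias. It uses the C++11 `using` spelling, which means the same thing as the typedef above. The helper `MetersToKilometers` is a hypothetical routine introduced only for illustration; it is not part of the original example.

```cpp
#include <type_traits>

// Single point of change: switch double back to float here and every
// declaration that uses Coordinate follows along.
using Coordinate = double;   // same meaning as: typedef double Coordinate;

// Hypothetical helper, written purely in terms of Coordinate so that it
// never needs editing when the underlying type changes.
Coordinate MetersToKilometers( Coordinate meters ) {
   return meters / 1000;
}

// Compile-time check that the alias is the type named in its definition.
static_assert( std::is_same<Coordinate, double>::value,
               "Coordinate should currently be double" );
```

Because Coordinate is only another name for the underlying type, nothing else in the program has to mention float or double.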

Here's a second example, this one in Pascal. Suppose you're creating a payroll system in which employee names are a maximum of 30 characters long. Your users have told you that no one ever has a name longer than 30 characters. Do you hard-code the number 30 throughout your program? If you do, you trust your users a lot more than I trust mine! A better approach is to define a type for employee names:

Pascal Example of Creating a Type for Employee Names

Type
   employeeName = array[ 1..30 ] of char;

When a string or an array is involved, it's usually wise to define a named constant that indicates the length of the string or array and then use the named constant in the type definition. You'll find many places in your program in which to use the constant; this is just the first place in which you'll use it. Here's how it looks:

Pascal Example of Better Type Creation

Const
   NAME_LENGTH = 30;       <-- 1
   ...
Type
   employeeName = array[ 1..NAME_LENGTH ] of char;       <-- 2

(1)Here's the declaration of the named constant.
(2)Here's where the named constant is used.
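
The same idea can be sketched in C++, assuming C++11 or later. The type name `EmployeeName` and the routine `ClearName` are hypothetical, chosen for illustration; the point is only that the constant is declared once and then reused by both the type definition and the code that works with it.

```cpp
#include <array>
#include <cstddef>

// Named constant for the length, declared in exactly one place.
constexpr std::size_t NAME_LENGTH = 30;

// The type is defined in terms of the constant, not the literal 30.
using EmployeeName = std::array<char, NAME_LENGTH>;

// Hypothetical routine that reuses the same constant as its loop limit.
void ClearName( EmployeeName &name ) {
   for ( std::size_t i = 0; i < NAME_LENGTH; ++i ) {
      name[ i ] = ' ';
   }
}
```

If the maximum name length ever changes, only the definition of NAME_LENGTH needs to be edited.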

A more powerful example would combine the idea of creating your own types with the idea of information hiding. In some cases, the information you want to hide is information about the type of the data.

The coordinates example in C++ is about halfway to information hiding. If you always use Coordinate rather than float or double, you effectively hide the type of the data. In C++, this is about all the information hiding the language does for you. For the rest, you or subsequent users of your code have to have the discipline not to look up the definition of Coordinate. C++ gives you figurative, rather than literal, information-hiding ability.

Other languages, such as Ada, go a step further and support literal information hiding. Here's how the Coordinate code fragment would look in an Ada package that declares it:

Ada Example of Hiding Details of a Type Inside a Package

package Transformation is
   type Coordinate is private;       <-- 1
   ...

(1)This statement declares Coordinate as private to the package.

Here's how Coordinate looks in another package, one that uses it:

Ada Example of Using a Type from Another Package

with Transformation;
...
procedure Routine1(...) ...
   latitude: Coordinate;
   longitude: Coordinate;
begin
   -- statements using latitude and longitude
   ...
end Routine1;

Notice that the Coordinate type is declared as private in the package specification. That means that the only part of the program that knows the definition of the Coordinate type is the private part of the Transformation package. In a development environment with a group of programmers, you could distribute only the package specification, which would make it harder for a programmer working on another package to look up the underlying type of Coordinate. The information would be literally hidden. Languages like C++ that require you to distribute the definition of Coordinate in header files undermine true information hiding.

These examples have illustrated several reasons to create your own types:

To make modifications easier. It's little work to create a new type, and it gives you a lot of flexibility.
To avoid excessive information distribution. Hard typing spreads data-typing details around your program instead of centralizing them in one place. This is an example of the information-hiding principle of centralization discussed in Section 6.2.
To increase reliability. In Ada, you can define types such as type Age is range 0..99. The compiler then generates run-time checks to verify that any variable of type Age is always within the range 0..99.
To make up for language weaknesses. If your language doesn't have the predefined type you want, you can create it yourself. For example, C before the C99 standard doesn't have a boolean or logical type. This deficiency is easy to compensate for by creating the type yourself:
```
typedef int Boolean;
```

Why Are the Examples of Creating Your Own Types in Pascal and Ada?

Pascal and Ada have gone the way of the stegosaurus and, in general, the languages that have replaced them are more usable. In the area of simple type definitions, however, I think C++, Java, and Visual Basic represent a case of three steps forward and one step back. An Ada declaration like

currentTemperature: INTEGER range 0..212;

contains important semantic information that a statement like

int temperature;

does not. Going a step further, a type declaration like

type Temperature is range 0..212;
...
currentTemperature: Temperature;

allows the compiler to ensure that currentTemperature is assigned only to other variables with the Temperature type, and very little extra coding is required to provide that extra safety margin.

Of course, a programmer could create a Temperature class to enforce the same semantics that were enforced automatically by the Ada language, but the step from creating a simple data type in one line of code to creating a class is a big step. In many situations, a programmer would create the simple type but would not step up to the additional effort of creating a class.
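
To give a sense of the extra effort involved, here is a minimal sketch of such a class in C++. It is one possible design under stated assumptions, not the only one: the member names `MIN_VALUE`, `MAX_VALUE`, and `Value` are hypothetical, and the choice to throw `std::out_of_range` on a bad value is an assumption made for illustration. Unlike the Ada declaration, the check here happens only at run time, when an object is constructed.

```cpp
#include <stdexcept>

// Hypothetical class that approximates Ada's "range 0..212" at run time.
class Temperature {
public:
   static constexpr int MIN_VALUE = 0;
   static constexpr int MAX_VALUE = 212;

   // explicit: a plain int can't silently become a Temperature.
   explicit Temperature( int value ) : value_( value ) {
      if ( value < MIN_VALUE || value > MAX_VALUE ) {
         throw std::out_of_range( "Temperature outside 0..212" );
      }
   }

   int Value() const { return value_; }

private:
   int value_;   // always within MIN_VALUE..MAX_VALUE once constructed
};
```

Even this small version takes more than a dozen lines to do what the one-line Ada type declaration does, which is the gap the paragraph above describes.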

Guidelines for Creating Your Own Types

Cross-Reference

In each case, consider whether creating a class might work better than a simple data type. For details, see Chapter 6, "Working Classes."

Keep these guidelines in mind as you create your own "user-defined" types:

Create types with functionally oriented names. Avoid type names that refer to the kind of computer data underlying the type. Use type names that refer to the parts of the real-world problem that the new type represents. In the previous examples, the definitions created well-named types for coordinates and names, both of which are real-world entities. Similarly, you could create types for currency, payment codes, ages, and other aspects of real-world problems.

Be wary of creating type names that refer to predefined types. Type names like BigInteger or LongString refer to computer data rather than the real-world problem. The big advantage of creating your own type is that it provides a layer of insulation between your program and the implementation language. Type names that refer to the underlying programming-language types poke holes in that insulation. They don't give you much advantage over using a predefined type. Problem-oriented names, on the other hand, buy you easy modifiability and data declarations that are self-documenting.

Avoid predefined types. If there is any possibility that a type might change, avoid using predefined types anywhere but in typedef or type definitions. It's easy to create new types that are functionally oriented, and it's hard to change data in a program that uses hard-wired types. Moreover, use of functionally oriented type declarations partially documents the variables declared with them. A declaration like Coordinate x tells you a lot more about x than a declaration like float x. Use your own types as much as you can.

Don't redefine a predefined type. Changing the definition of a standard type can create confusion. For example, if your language has a predefined type Integer, don't create your own type called Integer. Readers of your code might forget that you've redefined the type and assume that the Integer they see is the Integer they're used to seeing.

Define substitute types for portability. In contrast to the advice that you not change the definition of a standard type, you might want to define substitutes for the standard types so that on different hardware platforms you can make the variables represent exactly the same entities. For example, you can define a type INT32 and use it instead of int, or a type LONG64 instead of long. Originally, the capitalized types would simply be synonyms for the built-in types they replace. But when you moved the program to a new hardware platform, you could redefine the capitalized versions so that they match the data types on the original hardware.

Be sure not to define types that are easily mistaken for predefined types. It would be possible to define INT rather than INT32, but you're better off creating a clean distinction between types you define and types provided by the language.
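
A minimal sketch of such substitute types in C++ follows, assuming C++11 or later and a platform where int is 32 bits wide. The `static_assert` checks are an addition for illustration rather than part of the advice above; they make the build fail if the definitions stop matching their names. Defining LONG64 as `long long` rather than `long` is also an assumption, made because `long` is only 32 bits on some platforms.

```cpp
#include <climits>

// Project-defined substitutes for the built-in integer types.
// On a new platform, only these two definitions would need to change.
using INT32  = int;
using LONG64 = long long;   // assumption: long is too narrow on some platforms

// Fail the build, rather than misbehave at run time, if the sizes drift.
static_assert( sizeof( INT32 ) * CHAR_BIT == 32, "INT32 must be exactly 32 bits" );
static_assert( sizeof( LONG64 ) * CHAR_BIT == 64, "LONG64 must be exactly 64 bits" );
```

The standard header `<cstdint>` offers fixed-width names such as `std::int32_t` and `std::int64_t` on platforms that support them, which serve the same purpose.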

Consider creating a class rather than using a typedef. Simple typedefs can go a long way toward hiding information about a variable's underlying type. In some cases, however, you might want the additional flexibility and control you'll achieve by creating a class. For details, see Chapter 6, "Working Classes."

cc2e.com/1206

Cross-Reference

For a checklist that applies to general data issues rather than to issues with specific types of data, see the checklist on page 257 in Chapter 10, "General Issues in Using Variables." For a checklist of considerations in naming variables, see the checklist on page 288 in Chapter 11, "The Power of Variable Names."

Checklist: Fundamental Data

Numbers in General

Does the code avoid magic numbers?
Does the code anticipate divide-by-zero errors?
Are type conversions obvious?
If variables with two different types are used in the same expression, will the expression be evaluated as you intend it to be?
Does the code avoid mixed-type comparisons?
Does the program compile with no warnings?

Integers

Do expressions that use integer division work the way they're meant to?
Do integer expressions avoid integer-overflow problems?

Floating-Point Numbers

Does the code avoid additions and subtractions on numbers with greatly different magnitudes?
Does the code systematically prevent rounding errors?
Does the code avoid comparing floating-point numbers for equality?

Characters and Strings

Does the code avoid magic characters and strings?
Are references to strings free of off-by-one errors?
Does C code treat string pointers and character arrays differently?
Does C code follow the convention of declaring strings to be length CONSTANT+1?
Does C code use arrays of characters rather than pointers, when appropriate?
Does C code initialize strings to NULLs to avoid endless strings?
Does C code use strncpy() rather than strcpy()? And strncat() and strncmp()?

Boolean Variables

Does the program use additional boolean variables to document conditional tests?
Does the program use additional boolean variables to simplify conditional tests?

Enumerated Types

Does the program use enumerated types instead of named constants for their improved readability, reliability, and modifiability?
Does the program use enumerated types instead of boolean variables when a variable's use cannot be completely captured with true and false?
Do tests using enumerated types test for invalid values?
Is the first entry in an enumerated type reserved for "invalid"?

Named Constants

Does the program use named constants for data declarations and loop limits rather than magic numbers?
Have named constants been used consistently, rather than as named constants in some places and as literals in others?

Arrays

Are all array indexes within the bounds of the array?
Are array references free of off-by-one errors?
Are all subscripts on multidimensional arrays in the correct order?
In nested loops, is the correct variable used as the array subscript, avoiding loop-index cross-talk?

Creating Types

Does the program use a different type for each kind of data that might change?
Are type names oriented toward the real-world entities the types represent rather than toward programming-language types?
Are the type names descriptive enough to help document data declarations?
Have you avoided redefining predefined types?
Have you considered creating a new class rather than simply redefining a type?
