Managing Project Complexity | C++ For Artists: The Art, Philosophy, And Science Of Object-Oriented Programming

Large projects differ from small projects in many ways. Large projects have more of everything: more variables, more user-defined types, more functions, more lines of code, and more complexity. There are two types of complexity: conceptual complexity and physical complexity.

Try to imagine a lot of something, like the number of dump truck loads required to move Mount Everest to North Carolina. Imagining large numbers poses a certain amount of conceptual complexity. Large software projects are very conceptually complex and many such projects end in failure because the conceptual complexity became impossible to manage. Object-oriented analysis and design (OOAD) techniques were developed to help tame conceptual complexity.

Conceptual complexity is accompanied by physical complexity. Large software development projects usually have many people working on many parts of the code at the same time. To ensure success, software developers adopt development standards. Development standards are rules developers must follow to ensure other developers can understand their work. Development standards may address issues like file naming, file location, configuration management, commenting requirements, and many, many other smart things to do to tame physical complexity.

Later in the book you will be taught how to tame conceptual complexity. This section presents you with a few smart things to do to help you manage the physical complexity of your projects. No project is too small to benefit from the techniques presented below. It is a good idea to develop good project management habits early in your programming career.

A word of warning: You could ignore the advice given here and still manage to get small, simple projects to run, but if you try to structure large projects like small, simple projects, cramming all your code into one long file, you will doom yourself to failure. Formulate good programming habits now. Bad programming habits are hard to break and will end up breaking you in the long run.

Split Even Simple Projects Into Multiple Source Code Files

One of the first programming skills you must learn to help manage physical complexity is how to create multiple file projects. Your simplest programming project will have three files: a header file, an implementation file, and a main file. Larger projects will have more. As a rule of thumb you will have one header file and one implementation file for each class or abstract data type you declare. There will be only one main file which contains the main() function.

I will discuss these files and what goes into each one in more detail below, but first, I want to tell you why you want to learn the skill of developing multi-file projects. The following discussion about class interfaces may be somewhat advanced for novice readers. Fear not! Classes are discussed in great detail later in the book.

Separating a Class’s Interface from its Implementation

When you design a system using object-oriented techniques you model the system’s functionality by identifying objects within the system and how they interact with each other. Each object will have a certain behavior associated with it, along with an interface that allows other objects to access that behavior.

An object will belong to a class of objects. A class of objects is modeled in C++ using the struct or class construct. When you declare a new user-defined type representing an object in the system you are modeling, you will create a new class. In this class you will declare a set of public methods. It is this set of public methods that becomes the interface to objects of that class. Because the class declaration contains the prototypes for the public class interface functions, and is therefore considered the interface to class objects, you will put class declarations in header files. I will talk more about header files below.

After you have declared the interface to a class of objects you need to define class behavior. You define class behavior by implementing the class member functions you declared in the class declaration. All class member functions declared for a class will be defined in a separate implementation file. I will talk more about implementation files below too.

If all this talk of classes, objects, and interfaces makes little or no sense to you now, just hang in there. It is all covered in much greater detail later in the book.

Benefits of Separating Interface from Implementation

You reap many benefits by declaring a class in one file and defining its behavior in another. I will talk about a few of those benefits now.

Makes Large Project File Management Easier

The larger the project, the more source code files it will contain. Putting each class declaration into its own header file and its implementation in a separate implementation file allows you to adopt a simple file naming convention. Namely, name the file the same name as the class it contains suffixed by either an “h”, meaning header, or “cpp”, meaning C++ implementation file. Giving your files the same names as the classes they contain makes finding them among tens, hundreds, or even thousands of files a heck of a lot easier.

Increases Portability

Portability refers to the ability of source code to be ported to another computer system. Although seamless portability is difficult to achieve without serious prior planning, you can make it easier to achieve by keeping platform or operating system dependent code separate. Putting class declarations and implementations in separate files helps you do just that.

Allows You to Create Class Libraries

Putting class declarations and implementations in different files will let you create class libraries. With a class library you can share the interface to your class or classes along with the compiled implementation code. You keep the C++ source code to the implementation and thereby protect your rights to your hard work.

Helpful Preprocessor Directives

Before compiling your source code, a C++ compiler will preprocess your code. It does this by invoking a program called the preprocessor. The preprocessor performs macro substitution, conditional compilation and filename inclusion. You tell the preprocessor what to do by putting preprocessor directives in your source code.

While there are many different preprocessor directives available for your use, you need only learn four of them to help you create and manage multiple file projects and thus help you manage the physical complexity of your projects. These are #ifndef, #define, #endif, and #include. As your C++ expertise grows you will find many other uses for these directives, as well as uses for other preprocessor directives not covered in this section.

#ifndef, #define, #endif

You can use these three preprocessor directives together to perform conditional compilation of your header files or source code. The purpose of using them in a header file is to prevent the header's contents from being included more than once in the same source file, which happens easily when one header includes another. Multiple inclusion is a problem because header files usually contain class bodies and other type definitions, and a type may be defined only once in any one source file. (A plain function prototype may legally be repeated, but a class body may not.) Multiple definitions make compilers unhappy!

The best way to illustrate their usage is by example. The C++ source code shown in Listing 1.1 represents a small header file called test.h that declares one function prototype named test().

Listing 1.1: test.h

#ifndef TEST_H
#define TEST_H

void test();

#endif

The #ifndef directive stands for “if not defined”. It is followed by an identifier, in this case TEST_H. The #define directive means exactly that, “define”. It is followed by the same identifier. The #endif directive stands for “end if”. It signals the end of the #ifndef preprocessor directive. The body of the header file appears between the #ifndef and #endif directives. This includes the #define directive and the function prototype test().

Remember that the purpose of the preprocessor directives is to communicate with the C++ preprocessor. What will happen in this case is the preprocessor will encounter the #ifndef directive and its accompanying identifier. If the identifier TEST_H has not been previously defined then the #define directive will be executed next, defining TEST_H, followed by the declaration of test().

On the other hand, if TEST_H has been previously defined, then everything between the #ifndef and #endif will be ignored by the preprocessor.
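
To see the guard at work without creating several files, the sketch below pastes the contents of a hypothetical header, point.h, twice into one source file. That is roughly what the preprocessor ends up doing when the same header is reached twice through different #include lines. The names Point, POINT_H, and coordinateSum are assumptions made up for this illustration; they do not come from the listings in this chapter.

```cpp
// ---- first inclusion of the hypothetical header point.h ----
#ifndef POINT_H            // POINT_H is not defined yet, so the body is kept
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif

// ---- second inclusion of the same header ----
#ifndef POINT_H            // POINT_H is defined now, so everything up to
#define POINT_H            // the matching #endif is skipped
struct Point {
    int x;
    int y;
};
#endif

// Uses the single surviving definition of Point.
int coordinateSum(const Point& p) {
    return p.x + p.y;
}
```

If you delete the #ifndef, #define, and #endif lines from this sketch, the compiler sees two definitions of Point in one source file and reports an error.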

#include

Use the #include directive to perform file inclusion. There are essentially two ways to use the #include directive: #include <filename> and #include "filename". Substitute the name of the header file you wish to include for the word filename.

The first usage, #include <filename>, will instruct the preprocessor to search in a number of directory locations as defined in your development environment. Most development environments let you customize this search sequence. If found, the entire #include line is replaced with the contents of filename.

The second usage, #include "filename", acts much like the first, except that the preprocessor typically checks first for filename in a user default directory, often the directory containing the file that holds the #include line. If filename is not found there, the preprocessor then searches a list of predefined search locations.

The Final Word on Preprocessor Directive Behavior

The behavior of many C++ language features is implementation dependent, meaning the exact behavior is left up to the compiler writer. The search paths of the #include directives will be different for each development environment. To learn where your compiler is searching for header files and, more importantly, how to make it find your header files when you create them, consult your compiler documentation.

Project File Format

Your projects will be composed of many header and implementation files and one main file. This section shows you the general format of each file and what goes into each one. I will use the declaration of a simple class as an example.

Header File

Listing 1.2 shows the contents of a file named firstclass.h.

Listing 1.2: firstclass.h

#ifndef FIRSTCLASS_H
#define FIRSTCLASS_H

class FirstClass{
  public:
    FirstClass();
    virtual ~FirstClass();
  private:
    static int object_count;
};

#endif

Several conventions used here are worth noting. First, the name of the header file, firstclass.h, reflects the name of the class declaration it contains in lowercase letters with the suffix “h”. Second, the identifier FIRSTCLASS_H is capitalized. The name of the identifier is the name of the file with the “.” replaced with the underscore character “_”. Doing these two simple little things makes your programming life easier by making it easy to locate your class header files and taking the guesswork out of generating identifier names for the #ifndef and #define statements.

Header files can contain other stuff besides class declarations. The following table will prove invaluable in helping you remember what you should and shouldn’t put in header files.

Table 1-1: Header File Contents
Header Files Can Contain...	Examples
Comments	// C++-style comments /* C-style comments */
Include Directives	#include <helloworld.h> #include "helloworld.h"
Macro Definitions	#define ARRAY_SIZE 100
Conditional Compilation Directives	#ifndef FIRSTCLASS_H
Name Declarations	class FirstClass;
Enumerations	enum PenState {up, down};
Constant Definitions	const int ARRAY_SIZE = 100;
Data Declarations	extern int count;
Inline Function Definitions	inline static int getObjectCount(){ return object_count; }
Function Declarations	extern float getPay();
Template Declarations	template<class T> class MyClass;
Template Definitions	template<class T> class MyClass{ };
Type Definitions	class MyClass{ };
Named Namespaces	namespace MyNameSpace{ }
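
As a rough illustration of Table 1-1, here is a sketch of what a small header might look like when it combines several of the allowed items. The file name pen.h and the names Canvas, MAX_PEN_WIDTH, clampPenWidth, and largerOf are hypothetical, chosen only for this example.

```cpp
#ifndef PEN_H
#define PEN_H

// Hypothetical header pen.h: everything in here is safe to include
// from many implementation files.

class Canvas;                         // name declaration only

enum PenState { up, down };           // enumeration

const int MAX_PEN_WIDTH = 10;         // constant definition

// Inline function definition: keeps a width between 1 and MAX_PEN_WIDTH.
inline int clampPenWidth(int width) {
    if (width < 1) return 1;
    if (width > MAX_PEN_WIDTH) return MAX_PEN_WIDTH;
    return width;
}

// Template definition: returns the larger of two values.
template<class T>
T largerOf(T a, T b) {
    return (a > b) ? a : b;
}

#endif
```

An implementation file that includes this header could then call clampPenWidth(25) and get back 10, the value of MAX_PEN_WIDTH.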

It is just as helpful to know what you should not put in a header file. The following table offers some advice.

Table 1-2: What Not To Put In A Header File
Header Should Not Contain...	Examples
Ordinary Function Definitions	float getPay() {return itsPay; }
Data Definition	double d;
Aggregate Definitions	int my_array[ ] = { 3, 2, 1};
Unnamed Namespaces	namespace { }
Exported Template Definitions	export template<class T> void setVal(T t) { }
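
The reasoning behind Table 1-2 is that a header's contents are copied into every implementation file that includes it, so an ordinary definition placed in a header ends up defined once per including file, and the linker then complains about duplicates. The sketch below shows the usual split under that assumption: declarations in the header, definitions in exactly one implementation file. The file names payroll.h and payroll.cpp, and the names employee_count, HOURLY_RATE, and the hours_worked parameter, are hypothetical. Both parts are shown in one listing so the sketch compiles by itself.

```cpp
// ---- payroll.h (hypothetical header): declarations only ----
#ifndef PAYROLL_H
#define PAYROLL_H

extern int employee_count;          // data declaration: no storage reserved
float getPay(float hours_worked);   // function declaration: no body

#endif

// ---- payroll.cpp (hypothetical implementation file): definitions ----
// In a real project this file would begin with #include "payroll.h".

const float HOURLY_RATE = 20.0f;    // used only inside this file

int employee_count = 0;             // the one and only data definition

float getPay(float hours_worked) {  // the one and only function definition
    return hours_worked * HOURLY_RATE;
}
```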

Implementation File

Now that FirstClass is declared in firstclass.h, definitions must be given for each of the member functions. In this case there are two functions to define: the constructor, FirstClass(), and the destructor, ~FirstClass(). C++ implementation files are suffixed with “cpp”. Name the implementation file the same name as the header file and add the “cpp” suffix to the filename. Thus, the implementation file for FirstClass is named firstclass.cpp. The code for firstclass.cpp is given in Listing 1.3.

Listing 1.3: firstclass.cpp

#include "firstclass.h"
#include <iostream>
using namespace std;  // lets the code below write cout and endl unqualified

/**********************************
 Initialize classwide static variables first
********************************** */
int FirstClass::object_count = 0;

/**********************************
 Define member functions
*********************************** */
FirstClass::FirstClass(){
  object_count++;
  cout<<"There is/are: "<<object_count
      <<" FirstClass object(s)!"<<endl;
}

FirstClass::~FirstClass(){
  if((--object_count) == 0)
    cout<<"Destroyed last FirstClass object!"<<endl;
  else
    cout<<"There are: "<<object_count
        <<" FirstClass objects left!"<<endl;
}

Main File

The main file is a C++ implementation file but instead of defining class member functions it contains the main() function. It has the same suffix, “cpp”, as any other implementation file. I recommend naming this file main.cpp. This makes finding your main file an easy task.

Listing 1.4: main.cpp

#include "firstclass.h"

int main(){
  FirstClass f1, f2, f3;
}

That’s it! Main files, and the main() function, should be kept short.
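
Listings 1.2 through 1.4 can be hard to picture as one program when you first meet them. The sketch below condenses the same three-file layout into a single listing, with comments marking where each file would begin. It uses a hypothetical class named Sprite rather than FirstClass, and it adds a hypothetical static function, liveCount(), so the object count can be checked without reading console output. The function countInsideScope() is also an assumption; it stands in for the body of main().

```cpp
// ---- sprite.h (hypothetical header) ----
#ifndef SPRITE_H
#define SPRITE_H

class Sprite {
  public:
    Sprite();
    ~Sprite();
    static int liveCount();   // how many Sprite objects exist right now
  private:
    static int object_count;
};

#endif

// ---- sprite.cpp (hypothetical implementation file) ----
// In a real project this file would begin with #include "sprite.h".

int Sprite::object_count = 0;   // define the static member exactly once

Sprite::Sprite()  { ++object_count; }
Sprite::~Sprite() { --object_count; }

int Sprite::liveCount() { return object_count; }

// ---- main.cpp (hypothetical main file) ----
// Stand-in for the body of main(): creates three objects in a scope and
// reports the count while they are still alive.
int countInsideScope() {
    Sprite s1, s2, s3;
    return Sprite::liveCount();
}
```

The three objects are destroyed when countInsideScope() returns, so the count drops back to zero afterward.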

Commenting

A well-commented program will be easier to understand, not only for you but also for others who read your code. There are two ways to comment source code. The first way involves adding explicit comment lines to your source code by way of comment delimiters, of which there are two styles: C and C++. The second way to comment your code is to write self-commenting code. This may sound complicated but it is easy to do. Besides making your code easier to read, writing self-commenting code reduces the need to rely on the first way of commenting. It also increases code reliability because you will find problems more easily if your code is easy to read and understand.

C-Style Comments

Add C-style comments to your code by enclosing text between two sets of delimiters: “/*” and “*/”. For example:

Listing 1.5: C-style comments

/* *********************************
 This is a C style comment
*********************************** */

Everything between the /* and the */ is ignored by the compiler. Different programmers have different commenting styles. A word of advice: Programmers are often passionate about how they do business. Rise above the pettiness of arguing commenting issues with fellow programmers. Doing so is a complete waste of mental energy.

However, when using C-style comments keep a few things in mind. They are best used to insert blocks of comments. They can be used to insert one line of comments but C++-style comments are better suited for this purpose as you will see below.

I recommend aligning the /* and */ along the left margin as is shown in the example. You will be less likely to forget the */ and save yourself a lot of wondering why half your program doesn’t compile!

Lastly, I also recommend you avoid the urge to make a cute little box out of whatever character you choose to use as a border, as in Listing 1.6. It looks tidy, only now, if you want to add a line to your comment, you have to fiddle around with adding hyphens at the beginning and end of each line.

Listing 1.6: C-style comments

/* -------------------------------------------------
-      This is also a C style comment             -
-                                                 -
---------------------------------------------------- */

C++-style Comments

Listing 1.7: C++-style comment

// This is a C++ style comment

As you can see, a C++-style comment begins with two slash characters. It can appear anywhere in your program and tells the compiler to ignore everything that appears to its right, up to the end of the line. Another example...

Listing 1.8: C++ comment clutter

class TestClass{
   public: // public section
      TestClass(); // constructor
      virtual ~TestClass(); //destructor
}; //end of TestClass

...shows how to use C++-style comments to really clutter up your code, which leads into a good piece of advice: use them sparingly!

To avoid the need to add comments to your source code in the first place I recommend strongly that you read the next section and take notes.

Write Self-Commenting Code: Give Identifiers Meaningful Names

Self-commenting source code puts the joy back into programming. Self-commenting source code is easier to write, easier to read, easier to maintain, and, if you do happen to make a mistake, your mistake will be easier to find if your source code is self-commenting. How do you self-comment source code?

Essentially, you select names for identifiers that make sense in the context of your program. An identifier is a string of characters that names a variable, constant, function, type, or other entity within your program.

How you form identifier names is as important as what you name them. Here’s some guidance for naming variables, constants, and functions.

Variables

Use lower case letters when declaring variables. Separate each word of a multi-word identifier with an underscore character. Writing variables in lower case will make it easy to spot them in your program. Naming them something that makes sense will remind you of their purpose. The following table gives a few examples, both good and bad, of variable names.

Table 1-3: Good vs. Bad Variable Names
Variable Declaration	Comment
int a;	Bad! What the @#%^ does “a” stand for?
int mother_in_law_count;	Good! Although you are counting mothers-in-law, at least you know what you are counting.
Student *s[100];	Bad! How will someone else know that s is an array of pointers to students if they don’t see the declaration?
Student *student_pointers[100];	Good! Now they’ll know what’s supposed to be in each array element.

Constants

Use upper case letters when defining constants. Separate each word of a multi-word constant with the underscore character. The following table offers a few examples, both good and bad, of constant names.

Table 1-4: Good vs. Bad Constant Naming
Constant Declaration	Comment
const int a = 3;	Bad! What does a stand for? Is a a variable or a constant?
const int MAX_ARRAY_SIZE = 100;	Good!
#define object_count 25	Bad! The word count sounds like it might change in the future. Because it is lower case it looks like a variable.
#define MAX_OBJECT_COUNT 25	Good! Now it is clear this is a constant and this is the maximum number of objects allowed.

Functions

Start function names with lower case letters. Join multi-word function names together and capitalize the first letter of each additional word. Functions do things. Verbs denote action. Choose function names that indicate the action the function performs. The following table gives a few examples of function names.

Table 1-5: Function Naming
Function Declaration	Comment
void printScreen();	Good!
int getObjectCount();	Good!
void print();	Bad! Print what?
void setPenPositionUp();	Good! No mistaking what this function is supposed to do!
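
To show the three naming conventions working together, here is a small sketch. The names MAX_STUDENT_COUNT, countPassingScores, and the parameter and variable names are all hypothetical; the point is only how the different styles let you tell a constant, a function, and a variable apart at a glance.

```cpp
// Constant: upper case, words separated by underscores.
const int MAX_STUDENT_COUNT = 100;

// Function: starts with a lower case verb; each additional word is capitalized.
int countPassingScores(const int scores[], int score_count, int passing_score) {
    // Variables: lower case, words separated by underscores.
    int passing_count = 0;
    for (int score_index = 0;
         score_index < score_count && score_index < MAX_STUDENT_COUNT;
         ++score_index) {
        if (scores[score_index] >= passing_score) {
            ++passing_count;
        }
    }
    return passing_count;
}
```

The loop needs almost no comments, because each name already says what it holds or does.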

Adopt A Convention And Stick With It

The identifier naming recommendations presented here represent a convention. If you choose to adopt the styles suggested here, fine. If you don’t, that’s fine too. Whatever naming convention you choose to adopt I recommend you stick with it and be consistent. Don’t start naming variables one way and then change the way you name them in the middle of your program. Nothing will confuse you faster than naming inconsistency.

Restrict The Number of Global Variables

Global variables tend to pollute the global namespace and lead to the production of tightly coupled code. Tightly coupled code is bad juju, as you will learn below.

Minimize Coupling, Maximize Cohesion

Repeat aloud several times: minimize coupling, maximize cohesion, minimize coupling, maximize cohesion. Good. Practice a few times on your own while I explain why you want to follow this mantra.

Coupling

Coupling refers to the degree to which each module in your source code is affected in any way by making a change to another module. Coupling can be loose, tight, or anywhere in between. You want to keep coupling as loose as possible. How does coupling occur?

One way to couple modules and not even realize you are doing it is through the reckless use of global variables. Modules can also be coupled to other modules, as is the case when one function depends on the services of another function.

It takes considerable knowledge and skill to keep coupling low across a group of code modules, and some coupling is unavoidable because modules must cooperate to do useful work. For now, be aware that if your code is too tightly coupled, you will break it over there when you make a change here.
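
A minimal sketch of the difference, assuming a made-up drawing example: the first pair of functions is coupled through a global variable, while the second pair communicates only through parameters and return values. All of the names (g_pen_width, widenPenGlobal, strokeAreaGlobal, widenPen, strokeArea) are hypothetical.

```cpp
// Tightly coupled: both functions depend on a shared global variable,
// so calling one silently changes what the other returns.
int g_pen_width = 1;

void widenPenGlobal() { g_pen_width *= 2; }
int strokeAreaGlobal(int length) { return length * g_pen_width; }

// Loosely coupled: each function receives everything it needs as
// parameters and hands its result back to the caller.
int widenPen(int pen_width) { return pen_width * 2; }
int strokeArea(int length, int pen_width) { return length * pen_width; }
```

With the global version, the result of strokeAreaGlobal(10) depends on whether some other part of the program called widenPenGlobal() first. With the parameter version, strokeArea(10, 1) gives the same answer no matter what else has run.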

Cohesion

Cohesion refers to the degree to which the code in each module contributes to the purpose and function of that module. The rule of thumb is to maximize cohesion. All code belonging to a function should exist to implement that function. Don’t do anything surprising or mysterious in a function because it happens to be a convenient place to do it at the time.
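
As one possible illustration of cohesion, the sketch below keeps calculation and output in separate functions rather than letting a single function compute an average and also print it because printing happened to be convenient there. The names computeAverage and printAverage are hypothetical.

```cpp
#include <iostream>

// Does one job: computes the average of the values it is given.
float computeAverage(const int values[], int value_count) {
    if (value_count <= 0) return 0.0f;   // avoid dividing by zero
    int total = 0;
    for (int value_index = 0; value_index < value_count; ++value_index) {
        total += values[value_index];
    }
    return static_cast<float>(total) / value_count;
}

// Does one job: prints an average that has already been computed.
void printAverage(float average) {
    std::cout << "Average: " << average << std::endl;
}
```

Because computeAverage() has no side effects, it can be reused anywhere an average is needed, whether or not the program wants anything printed.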
