2.3. Variables

Imagine that we are given the problem of computing 2 to the power of 10. Our first attempt might be something like

       #include <iostream>
       int main()
       {
           // a first, not very good, solution
           std::cout << "2 raised to the power of 10: ";
           std::cout << 2*2*2*2*2*2*2*2*2*2;
           std::cout << std::endl;
           return 0;
       }

This program solves the problem, although we might double- or triple-check to make sure that exactly 10 literal instances of 2 are being multiplied. Otherwise, we're satisfied. Our program correctly generates the answer 1,024.

We're next asked to compute 2 raised to the power of 17 and then to the power of 23. Changing our program each time is a nuisance. Worse, it proves to be remarkably error-prone. Too often, the modified program produces an answer with one too few or too many instances of 2.

An alternative to the explicit brute force power-of-2 computation is twofold:

Use named objects to perform and print each computation.
Use flow-of-control constructs to provide for the repeated execution of a sequence of program statements while a condition is true.

Here, then, is an alternative way to compute 2 raised to the power of 10:

       #include <iostream>
       int main()
       {
           // local objects of type int
           int value = 2;
           int pow = 10;
           int result = 1;
           // repeat calculation of result until cnt is equal to pow
           for (int cnt = 0; cnt != pow; ++cnt)
               result *= value;   // result = result * value;
           std::cout << value
                     << " raised to the power of "
                     << pow << ": \t"
                     << result << std::endl;
           return 0;
       }

value, pow, result, and cnt are variables that allow for the storage, modification, and retrieval of values. The for loop allows for the repeated execution of our calculation until it's been executed pow times.

Exercises Section 2.3

Exercise 2.11:
Write a program that prompts the user to input two numbers, the base and exponent. Print the result of raising the base to the power of the exponent.

Key Concept: Strong Static Typing

C++ is a statically typed language, which means that types are checked at compile time. The process by which types are checked is referred to as type-checking.

In most languages, the type of an object constrains the operations that the object can perform. If the type does not support a given operation, then an object of that type cannot perform that operation.

In C++, whether an operation is legal or not is checked at compile time. When we write an expression, the compiler checks that the objects used in the expression are used in ways that are defined by the type of the objects. If not, the compiler generates an error message; an executable file is not produced.

As our programs, and the types we use, get more complicated, we'll see that static type checking helps find bugs in our programs earlier. A consequence of static checking is that the type of every entity used in our programs must be known to the compiler. Hence, we must define the type of a variable before we can use that variable in our programs.

2.3.1. What Is a Variable?

A variable provides us with named storage that our programs can manipulate. Each variable in C++ has a specific type, which determines the size and layout of the variable's memory; the range of values that can be stored within that memory; and the set of operations that can be applied to the variable. C++ programmers tend to refer to variables as "variables" or as "objects" interchangeably.

Lvalues and Rvalues

We'll have more to say about expressions in Chapter 5, but for now it is useful to know that there are two kinds of expressions in C++:

lvalue (pronounced "ell-value"): An expression that is an lvalue may appear as either the left-hand or right-hand side of an assignment.
rvalue (pronounced "are-value"): An expression that is an rvalue may appear on the right- but not left-hand side of an assignment.

Variables are lvalues and so may appear on the left-hand side of an assignment. Numeric literals are rvalues and so may not be assigned. Given the variables:
```
       int units_sold = 0;
       double sales_price = 0, total_revenue = 0;
```

it is a compile-time error to write either of the following:

       // error: arithmetic expression is not an lvalue
       units_sold * sales_price = total_revenue;
       // error: literal constant is not an lvalue
       0 = 1;

Some operators, such as assignment, require that one of their operands be an lvalue. As a result, lvalues can be used in more contexts than can rvalues. The context in which an lvalue appears determines how it is used. For example, in the expression

       units_sold = units_sold + 1;

the variable units_sold is used as the operand to two different operators. The + operator cares only about the values of its operands. The value of a variable is the value currently stored in the memory associated with that variable. The effect of the addition is to fetch that value and add one to it.

The variable units_sold is also used as the left-hand side of the = operator. The = operator reads its right-hand side and writes to its left-hand side. In this expression, the result of the addition is stored in the storage associated with units_sold; the previous value in units_sold is overwritten.

In the course of the text, we'll see a number of situations in which the use of an rvalue or lvalue impacts the behavior and/or the performance of our programs, in particular when passing and returning values from a function.

Exercises Section 2.3.1

Exercise 2.12:
Distinguish between an lvalue and an rvalue; show examples of each.

Exercise 2.13:
Name one case where an lvalue is required.

Terminology: What Is an object?

C++ programmers tend to be cavalier in their use of the term object. Most generally, an object is a region of memory that has a type. More specifically, evaluating an expression that is an lvalue yields an object.

Strictly speaking, some might reserve the term object to describe only variables or values of class types. Others might distinguish between named and unnamed objects, always referring to variables when discussing named objects. Still others distinguish between objects and values, using the term object for data that can be changed by the program and using the term value for those that are read-only.

In this book, we'll follow the more colloquial usage that an object is a region of memory that has a type. We will freely use object to refer to most of the data manipulated by our programs regardless of whether those data have built-in or class type, are named or unnamed, or are data that can be read or written.

2.3.2. The Name of a Variable

The name of a variable, its identifier, can be composed of letters, digits, and the underscore character. It must begin with either a letter or an underscore. Upper- and lowercase letters are distinct: Identifiers in C++ are case-sensitive. The following defines four distinct identifiers:

       // declares four different int variables
       int somename, someName, SomeName, SOMENAME;

There is no language-imposed limit on the permissible length of a name, but out of consideration for others that will read and/or modify our code, it should not be too long.

For example,

       gosh_this_is_an_impossibly_long_name_to_type

is a really bad identifier name.

C++ Keywords

C++ reserves a set of words for use within the language as keywords. Keywords may not be used as program identifiers. Table 2.2 lists the complete set of C++ keywords.

Table 2.2. C++ Keywords
`asm`	`do`	`if`	`return`	`try`
`auto`	`double`	`inline`	`short`	`typedef`
`bool`	`dynamic_cast`	`int`	`signed`	`typeid`
`break`	`else`	`long`	`sizeof`	`typename`
`case`	`enum`	`mutable`	`static`	`union`
`catch`	`explicit`	`namespace`	`static_cast`	`unsigned`
`char`	`export`	`new`	`struct`	`using`
`class`	`extern`	`operator`	`switch`	`virtual`
`const`	`false`	`private`	`template`	`void`
`const_cast`	`float`	`protected`	`this`	`volatile`
`continue`	`for`	`public`	`throw`	`wchar_t`
`default`	`friend`	`register`	`true`	`while`
`delete`	`goto`	`reinterpret_cast`

C++ also reserves a number of words that can be used as alternative names for various operators. These alternative names are provided to support character sets that do not support the standard set of C++ operator symbols. These names, listed in Table 2.3, also may not be used as identifiers:

Table 2.3. C++ Operator Alternative Names
`and`	`bitand`	`compl`	`not_eq`	`or_eq`	`xor_eq`
`and_eq`	`bitor`	`not`	`or`	`xor`

In addition to the keywords, the standard also reserves a set of identifiers for use in the library. Identifiers cannot contain two consecutive underscores, nor can an identifier begin with an underscore followed immediately by an uppercase letter. Certain identifiers (those that are defined outside a function) may not begin with an underscore.

Conventions for Variable Names

There are a number of generally accepted conventions for naming variables. Following these conventions can improve the readability of a program.

A variable name is normally written in lowercase letters. For example, one writes index, not Index or INDEX.
An identifier is given a mnemonic name, that is, a name that gives some indication of its use in a program, such as on_loan or salary.
An identifier containing multiple words is written either with an underscore between each word or by capitalizing the first letter of each embedded word. For example, one generally writes student_loan or studentLoan, not studentloan.

The most important aspect of a naming convention is that it be applied consistently.

Exercises Section 2.3.2

Exercise 2.14:
Which, if any, of the following names are invalid? Correct each identified invalid name.
       (a) int double = 3.14159;
       (b) char _;
       (c) bool catch-22;
       (d) char 1_or_2 = '1';
       (e) float Float = 3.14f;

2.3.3. Defining Objects

The following statements define five variables:

       int units_sold;
       double sales_price, avg_price;
       std::string title;
       Sales_item curr_book;

Each definition starts with a type specifier, followed by a comma-separated list of one or more names. A semicolon terminates the definition. The type specifier names the type associated with the object: int, double, std::string, and Sales_item are all names of types. The types int and double are built-in types, std::string is a type defined by the library, and Sales_item is a type that we used in Section 1.5 (p. 20) and will define in subsequent chapters. The type determines the amount of storage that is allocated for the variable and the set of operations that can be performed on it.

Multiple variables may be defined in a single statement:

       double salary, wage;    // defines two variables of type double
       int month,
           day, year;          // defines three variables of type int
       std::string address;    // defines one variable of type std::string

Initialization

A definition specifies a variable's type and identifier. A definition may also provide an initial value for the object. An object defined with a specified first value is spoken of as initialized. C++ supports two forms of variable initialization: copy-initialization and direct-initialization. The copy-initialization syntax uses the equal (=) symbol; direct-initialization places the initializer in parentheses:

       int ival(1024);     // direct-initialization
       int ival = 1024;    // copy-initialization

In both cases, ival is initialized to 1024.

Although, at this point in the book, it may seem obscure to the reader, in C++ it is essential to understand that initialization is not assignment. Initialization happens when a variable is created and gives that variable its initial value. Assignment involves obliterating an object's current value and replacing that value with a new one.

Many new C++ programmers are confused by the use of the = symbol to initialize a variable. It is tempting to think of initialization as a form of assignment. But initialization and assignment are different operations in C++. This concept is particularly confusing because in many other languages the distinction is irrelevant and can be ignored. Moreover, even in C++ the distinction rarely matters until one attempts to write fairly complex classes. Nonetheless, it is a crucial concept and one that we will reiterate throughout the text.

There are subtle differences between copy- and direct-initialization when initializing objects of a class type. We won't completely explain these differences until Chapter 13. For now, it's worth knowing that the direct syntax is more flexible and can be slightly more efficient.

Using Multiple Initializers

When we initialize an object of a built-in type, there is only one way to do so: We supply a value, and that value is copied into the newly defined object. For built-in types, there is little difference between the direct and the copy forms of initialization.

For objects of a class type, there are initializations that can be done only using direct-initialization. To understand why, we need to know a bit about how classes control initialization.

Each class may define one or more special member functions (Section 1.5.2, p. 24) that say how we can initialize variables of the class type. The member functions that define how initialization works are known as constructors. Like any function, a constructor can take multiple arguments. A class may define several constructors, each of which must take a different number or type of arguments.

As an example, we'll look a bit at the string class, which we'll cover in more detail in Chapter 3. The string type is defined by the library and holds character strings of varying sizes. To use strings, we must include the string header. Like the IO types, string is defined in the std namespace.

The string class defines several constructors, giving us various ways to initialize a string. One way we can initialize a string is as a copy of a character string literal:

       #include <string>
       // alternative ways to initialize string from a character string literal
       std::string titleA = "C++ Primer, 4th Ed.";
       std::string titleB("C++ Primer, 4th Ed.");

In this case, either initialization form can be used. Both definitions create a string object whose initial value is a copy of the specified string literal.

However, we can also initialize a string from a count and a character. Doing so creates a string containing the specified character repeated as many times as indicated by the count:

       std::string all_nines(10, '9');   // all_nines= "9999999999"

In this case, the only way to initialize all_nines is by using the direct form of initialization. It is not possible to use copy-initialization with multiple initializers.

Initializing Multiple Variables

When a definition defines two or more variables, each variable may have its own initializer. The name of an object becomes visible immediately, and so it is possible to initialize a subsequent variable to the value of one defined earlier in the same definition. Initialized and uninitialized variables may be defined in the same definition. Both forms of initialization syntax may be intermixed:

       #include <string>
       // ok: salary defined and initialized before it is used to initialize wage
       double salary = 9999.99,
              wage(salary + 0.01);
       // ok: mix of initialized and uninitialized
       int interval,
           month = 8, day = 7, year = 1955;
       // ok: both forms of initialization syntax used
       std::string title("C++ Primer, 4th Ed."),
                   publisher = "A-W";

An object can be initialized with an arbitrarily complex expression, including the return value of a function:

       double price = 109.99, discount = 0.16;
       double sale_price = apply_discount(price, discount);

In this example, apply_discount is a function that takes two values of type double and returns a value of type double. We pass the variables price and discount to that function and use its return value to initialize sale_price.

2.3.4. Variable Initialization Rules

When we define a variable without an initializer, the system sometimes initializes the variable for us. What value, if any, is supplied depends on the type of the variable and may depend on where it is defined.

Initialization of Variables of Built-in Type

Whether a variable of built-in type is automatically initialized depends on where it is defined. Variables defined outside any function body are initialized to zero. Variables of built-in type defined inside the body of a function are uninitialized. Using an uninitialized variable for anything other than as the left-hand operand of an assignment is undefined. Bugs due to uninitialized variables can be hard to find. As we cautioned on page 42, you should never rely on undefined behavior.

Exercises Section 2.3.3

Exercise 2.15:
What, if any, are the differences between the following definitions:
       int month = 9, day = 7;
       int month = 09, day = 07;
If either definition contains an error, how might you correct the problem?

Exercise 2.16:
Assuming calc is a function that returns a double, which, if any, of the following are illegal definitions? Correct any that are identified as illegal.
       (a) int car = 1024, auto = 2048;
       (b) int ival = ival;
       (c) std::cin >> int input_value;
       (d) double salary = wage = 9999.99;
       (e) double calc = calc();

We recommend that every object of built-in type be initialized. It is not always necessary to initialize such variables, but it is easier and safer to do so until you can be certain it is safe to omit an initializer.

Caution: Uninitialized Variables Cause Run-Time Problems

Using an uninitialized object is a common program error, and one that is often difficult to uncover. The compiler is not required to detect a use of an uninitialized variable, although many will warn about at least some uses of uninitialized variables. However, no compiler can detect all uses of uninitialized variables.

Sometimes, we're lucky and using an uninitialized variable results in an immediate crash at run time. Once we track down the location of the crash, it is usually pretty easy to see that the variable was not properly initialized.

Other times, the program completes but produces erroneous results. Even worse, the results can appear correct when we run our program on one machine but fail on another. Adding code to the program in an unrelated location can cause what we thought was a correct program to suddenly start to produce incorrect results.

The problem is that uninitialized variables actually do have a value. The compiler puts the variable somewhere in memory and treats whatever bit pattern was in that memory as the variable's initial state. When interpreted as an integral value, any bit pattern is a legitimate value, although the value is unlikely to be one that the programmer intended. Because the value is legal, using it is unlikely to lead to a crash. What it is likely to do is lead to incorrect execution and/or incorrect calculation.

Initialization of Variables of Class Type

Each class defines how objects of its type can be initialized. Classes control object initialization by defining one or more constructors (Section 2.3.3, p. 49). As an example, we know that the string class provides at least two constructors. One of these constructors lets us initialize a string from a character string literal and another lets us initialize a string from a character and a count.

Each class may also define what happens if a variable of the type is defined but an initializer is not provided. A class does so by defining a special constructor, known as the default constructor. This constructor is called the default constructor because it is run "by default": if there is no initializer, then this constructor is used. The default constructor is used regardless of where a variable is defined.

Most classes provide a default constructor. If the class has a default constructor, then we can define variables of that class without explicitly initializing them. For example, the string type defines its default constructor to initialize the string as an empty string, that is, a string with no characters:

       std::string empty;  // empty is the empty string; empty =""

Some class types do not have a default constructor. For these types, every definition must provide explicit initializer(s). It is not possible to define variables of such types without giving an initial value.

Exercises Section 2.3.4

Exercise 2.17:
What are the initial values, if any, of each of the following variables?
       std::string global_str;
       int global_int;
       int main()
       {
           int local_int;
           std::string local_str;
           // ...
           return 0;
       }

2.3.5. Declarations and Definitions

As we'll see in Section 2.9 (p. 67), C++ programs typically are composed of many files. In order for multiple files to access the same variable, C++ distinguishes between declarations and definitions.

A definition of a variable allocates storage for the variable and may also specify an initial value for the variable. There must be one and only one definition of a variable in a program.

A declaration makes known the type and name of the variable to the program. A definition is also a declaration: When we define a variable, we declare its name and type. We can declare a name without defining it by using the extern keyword. A declaration that is not also a definition consists of the object's name and its type preceded by the keyword extern:

       extern int i;   // declares but does not define i
       int i;          // declares and defines i

An extern declaration is not a definition and does not allocate storage. In effect, it claims that a definition of the variable exists elsewhere in the program. A variable can be declared multiple times in a program, but it must be defined only once.

A declaration may have an initializer only if it is also a definition because only a definition allocates storage. The initializer must have storage to initialize. If an initializer is present, the declaration is treated as a definition even if the declaration is labeled extern:

       extern double pi = 3.1416; // definition

Despite the use of extern, this statement defines pi. Storage is allocated and initialized. An extern declaration may include an initializer only if it appears outside a function.

Because an extern that is initialized is treated as a definition, any subsequent definition of that variable is an error:

       extern double pi = 3.1416; // definition
       double pi;                 // error: redefinition of pi

Similarly, a subsequent extern declaration that has an initializer is also an error:

       extern double pi = 3.1416; // definition
       extern double pi;          // ok: declaration not definition
       extern double pi = 3.1416; // error: redefinition of pi

The distinction between a declaration and a definition may seem pedantic but in fact is quite important.

In C++ a variable must be defined exactly once and must be defined or declared before it is used.

Any variable that is used in more than one file requires declarations that are separate from the variable's definition. In such cases, one file will contain the definition for the variable. Other files that use that same variable will contain declarations for, but not a definition of, that same variable.

Exercises Section 2.3.5

Exercise 2.18:
Explain the meaning of each of these instances of name:
       extern std::string name;
       std::string name("exercise 3.5a");
       extern std::string name("exercise 3.5a");

2.3.6. Scope of a Name

Every name in a C++ program must refer to a unique entity (such as a variable, function, type, etc.). Despite this requirement, names can be used more than once in a program: A name can be reused as long as it is used in different contexts, from which the different meanings of the name can be distinguished. The context used to distinguish the meanings of names is a scope. A scope is a region of the program. A name can refer to different entities in different scopes.

Most scopes in C++ are delimited by curly braces. Generally, names are visible from their point of declaration until the end of the scope in which the declaration appears. As an example, consider this program, which we first encountered in Section 1.4.2 (p. 14):

       #include <iostream>
       int main()
       {
           int sum = 0;
           //  sum values from 1 up to 10 inclusive
           for (int val = 1; val <= 10; ++val)
               sum += val;   // equivalent to sum = sum + val
           std::cout << "Sum of 1 to 10 inclusive is "
                     << sum << std::endl;
           return 0;
       }

This program defines three names and uses two names from the standard library. It defines a function named main and two variables named sum and val. The name main is defined outside any curly braces and is visible throughout the program. Names defined outside any function have global scope; they are accessible from anywhere in the program. The name sum is defined within the scope of the main function. It is accessible throughout the main function but not outside of it. The variable sum has local scope. The name val is more interesting. It is defined in the scope of the for statement (Section 1.4.2, p. 14). It can be used in that statement but not elsewhere in main. It has statement scope.

Scopes in C++ Nest

Names defined in the global scope can be used in a local scope; global names and those defined local to a function can be used inside a statement scope, and so on. Names can also be redefined in an inner scope. Understanding what entity a name refers to requires unwinding the scopes in which the names are defined:

       #include <iostream>
       #include <string>
       /*  Program for illustration purposes only:
        *  It is bad style for a function to use a global variable and then
        *  define a local variable with the same name
        */
       std::string s1 = "hello";  // s1 has global scope
       int main()
       {
           std::string s2 = "world"; // s2 has local scope
           // uses global s1; prints "hello world"
           std::cout << s1 << " " << s2 << std::endl;
           int s1 = 42; // s1 is local and hides global s1
           // uses local s1; prints "42 world"
           std::cout << s1 << " " << s2 << std::endl;
           return 0;
       }

This program defines three variables: a global string named s1, a local string named s2, and a local int named s1. The definition of the local s1 hides the global s1.

Variables are visible from their point of declaration. Thus, the local definition of s1 is not visible when the first output is performed. The name s1 in that output expression refers to the global s1. The output printed is hello world. The second statement that does output follows the local definition of s1. The local s1 is now in scope. The second output uses the local rather than the global s1. It writes 42 world.

Programs such as the preceding one are likely to be confusing. It is almost always a bad idea to define a local variable with the same name as a global variable that the function uses or might use. It is much better to use a distinct name for the local.

We'll have more to say about local and global scope in Chapter 7 and about statement scope in Chapter 6. C++ has two other levels of scope: class scope, which we'll cover in Chapter 12, and namespace scope, which we'll see in Section 17.2.

2.3.7. Define Variables Where They Are Used

In general, variable definitions or declarations can be placed anywhere within the program that a statement is allowed. A variable must be declared or defined before it is used.

It is usually a good idea to define an object near the point at which the object is first used.

Defining an object where the object is first used improves readability. The reader does not have to go back to the beginning of a section of code to find the definition of a particular variable. Moreover, it is often easier to give the variable a useful initial value when the variable is defined close to where it is first used.

One constraint on placing declarations is that variables are accessible from the point of their definition until the end of the enclosing block. A variable must be defined in or before the outermost scope in which the variable will be used.

Exercises Section 2.3.6

Exercise 2.19:

What is the value of j in the following program?

       int i = 42;
       int main()
       {
           int i = 100;
           int j = i;
           // ...
       }

Exercise 2.20:

Given the following program fragment, what values are printed?

       int i = 100, sum = 0;
       for (int i = 0; i != 10; ++i)
            sum += i;
       std::cout << i << " " << sum << std::endl;

Exercise 2.21:

Is the following program legal?

       int sum = 0;
       for (int i = 0; i != 10; ++i)
           sum += i;
       std::cout << "Sum from 0 to " << i
                 << " is " << sum << std::endl;