3.4. Introducing Iterators

While we can use subscripts to access the elements in a vector, the library also gives us another way to examine elements: We can use an iterator. An iterator is a type that lets us examine the elements in a container and navigate from one element to another.

The library defines an iterator type for each of the standard containers, including vector. Iterators are more general than subscripts: All of the library containers define iterator types, but only a few of them support subscripting. Because iterators are common to all containers, modern C++ programs tend to use iterators rather than subscripts to access container elements, even on types such as vector that support subscripting.

Caution: Only Subscript Elements that Are Known to Exist!

It is crucially important to understand that we may use the subscript operator (the [] operator) to fetch only elements that actually exist. For example,

      vector<int> ivec;      // empty vector
      cout << ivec[0];       // Error: ivec has no elements!
      vector<int> ivec2(10); // vector with 10 elements
      cout << ivec2[10];     // Error: ivec2 has elements 0...9

Attempting to fetch an element that doesn't exist is a run-time error. As with most such errors, there is no assurance that the implementation will detect it. The result of executing the program is uncertain. The effect of fetching a nonexistent element is undefined: what happens will vary by implementation, but the program will almost surely fail in some interesting way at run time.

This caution applies any time we use a subscript, such as when subscripting a string and, as we'll see shortly, when subscripting a built-in array.

Attempting to subscript elements that do not exist is, unfortunately, an extremely common and pernicious programming error. So-called "buffer overflow" errors are the result of subscripting elements that don't exist. Such bugs are among the most common causes of security problems in PC and other applications.
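One way to honor this caution is to compare an index against size() before subscripting. The following is a minimal sketch, not a library facility: the helper name element_or and its fallback parameter are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of the library): returns v[ix] when that
// element exists, and the caller-supplied fallback value otherwise.
int element_or(const std::vector<int>& v,
               std::vector<int>::size_type ix, int fallback)
{
    if (ix < v.size())   // subscript only elements that are known to exist
        return v[ix];
    return fallback;     // ix is out of range, so v[ix] must not be evaluated
}
```

With a ten-element vector, element_or(v, 9, -1) reads the last element, while element_or(v, 10, -1) returns the fallback instead of touching memory past the end.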

Exercises Section 3.3.2

Exercise 3.13:
Read a set of integers into a vector. Calculate and print the sum of each pair of adjacent elements in the vector. If there is an odd number, tell the user and print the value of the last element without summing it. Now change your program so that it prints the sum of the first and last elements, followed by the sum of the second and second-to-last and so on.

Exercise 3.14:
Read some text into a vector, storing each word in the input as an element in the vector. Transform each word into uppercase letters. Print the transformed elements from the vector, printing eight words to a line.

Exercise 3.15:
Is the following program legal? If not, how might you fix it?
      vector<int> ivec;
      ivec[0] = 42;

Exercise 3.16:
List three ways to define a vector and give it 10 elements, each with the value 42. Indicate whether there is a preferred way to do so and why.

The details of how iterators work are discussed in Chapter 11, but we can use them without understanding them in their full complexity.

Container `iterator` Type

Each of the container types, such as vector, defines its own iterator type:

      vector<int>::iterator iter;

This statement defines a variable named iter, whose type is the type named iterator defined by vector<int>. Each of the library container types defines a member named iterator that is a synonym for the actual type of its iterator.

Terminology: Iterators and Iterator Types

When first encountered, the nomenclature around iterators can be confusing. In part the confusion arises because the same term, iterator, is used to refer to two things. We speak generally of the concept of an iterator, and we speak specifically of a concrete iterator type defined by a container, such as vector<int>.

What's important to understand is that there is a collection of types that serve as iterators. These types are related conceptually. We refer to a type as an iterator if it supports a certain set of actions. Those actions let us navigate among the elements of a container and let us access the value of those elements.

Each container class defines its own iterator type that can be used to access the elements in the container. That is, each container defines a type named iterator, and that type supports the actions of a (conceptual) iterator.

The `begin` and `end` Operations

Each container defines a pair of functions named begin and end that return iterators. The iterator returned by begin refers to the first element, if any, in the container:

      vector<int>::iterator iter = ivec.begin();

This statement initializes iter to the value returned by the vector operation named begin. Assuming the vector is not empty, after this initialization, iter refers to the same element as ivec[0].

The iterator returned by the end operation is an iterator positioned "one past the end" of the vector. It is often referred to as the off-the-end iterator, indicating that it refers to a nonexistent element "off the end" of the vector. If the vector is empty, the iterator returned by begin is the same as the iterator returned by end.

The iterator returned by the end operation does not denote an actual element in the vector. Instead, it is used as a sentinel indicating when we have processed all the elements in the vector.

Dereference and Increment on `vector` Iterators

The operations on iterator types let us retrieve the element to which an iterator refers and let us move an iterator from one element to another.

Iterator types use the dereference operator (the * operator) to access the element to which the iterator refers:

      *iter = 0;

The dereference operator returns the element that the iterator currently denotes. Assuming iter refers to the first element of the vector, then *iter is the same element as ivec[0]. The effect of this statement is to assign 0 to that element.

Iterators use the increment operator (++) (Section 1.4.1, p. 13) to advance an iterator to the next element in the container. Incrementing an iterator is logically similar to incrementing an int object. For ints, the effect is to "add one" to the int's value. For iterators, the effect is to "advance the iterator by one position" in the container. So, if iter refers to the first element, then ++iter denotes the second element.

Because the iterator returned from end does not denote an element, it may not be incremented or dereferenced.
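The dereference and increment operations can be combined as in the sketch below. The function name set_first_two is hypothetical, and the code assumes the vector holds at least two elements so that every iterator it dereferences denotes a real element.

```cpp
#include <cassert>
#include <vector>

// Sketch: write through an iterator, advance it one position, and write
// again. The function name and parameters are illustrative assumptions.
void set_first_two(std::vector<int>& ivec, int first, int second)
{
    assert(ivec.size() >= 2);   // precondition: both elements must exist

    std::vector<int>::iterator iter = ivec.begin();
    *iter = first;    // assigns to the same element as ivec[0]
    ++iter;           // iter now denotes the second element
    *iter = second;   // assigns to the same element as ivec[1]
}
```

The iterator is never advanced to the off-the-end position here, so each dereference is safe under the stated precondition.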

Other Iterator Operations

Another pair of useful operations that we can perform on iterators is comparison: Two iterators can be compared using either == or !=. Iterators are equal if they refer to the same element; they are unequal otherwise.
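As a small illustration of iterator comparison, the sketch below tests whether a vector is empty by comparing the iterators returned by begin and end. The helper name has_no_elements is hypothetical; in real code the vector's own empty member would normally be used instead.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: a vector has no elements exactly when the iterator
// returned by begin equals the off-the-end iterator returned by end.
bool has_no_elements(const std::vector<int>& v)
{
    return v.begin() == v.end();
}
```

For any vector, this comparison should agree with the result of calling empty on that vector.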

A Program that Uses Iterators

Assume we had a vector<int> named ivec and we wanted to reset each of its elements to zero. We might do so by using a subscript:

      // reset all the elements in ivec to 0
      for (vector<int>::size_type ix = 0; ix != ivec.size(); ++ix)
              ivec[ix] = 0;

This program uses a for loop to iterate through the elements in ivec. The for defines an index, which it increments on each iteration. The body of the for sets each element in ivec to zero.

A more typical way to write this loop would use iterators:

      // equivalent loop using iterators to reset all the elements in ivec to 0
      for (vector<int>::iterator iter = ivec.begin();
                                 iter != ivec.end(); ++iter)
          *iter = 0;  // set element to which iter refers to 0

The for loop starts by defining iter and initializing it to refer to the first element in ivec. The condition in the for tests whether iter is unequal to the iterator returned by the end operation. Each iteration increments iter. The effect of this for is to start with the first element in ivec and process in sequence each element in the vector. Eventually, iter will refer to the last element in ivec. After we process the last element and increment iter, it will become equal to the value returned by end. At that point, the loop stops.

The statement in the for body uses the dereference operator to access the value of the current element. As with the subscript operator, the value returned by the dereference operator is an lvalue. We can assign to this element to change its value. The effect of this loop is to assign the value zero to each element in ivec.

Having walked through the code in detail, we can see that this program has exactly the same effect as the version that used subscripts: We start at the first element in the vector and set each element in the vector to zero.

This program, like the one on page 94, is safe if the vector is empty. If ivec is empty, then the iterator returned from begin does not denote any element; it can't, because there are no elements. In this case, the iterator returned from begin is the same as the one returned from end, so the test in the for fails immediately.
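To make the empty-vector case concrete, the iterator loop can be wrapped in a function and called on both an empty and a nonempty vector. This is a sketch; the function name reset_to_zero is an assumption introduced for illustration.

```cpp
#include <cassert>
#include <vector>

// Sketch: the iterator-based reset loop wrapped in a function.
// When ivec is empty, begin() == end(), so the loop body never runs.
void reset_to_zero(std::vector<int>& ivec)
{
    for (std::vector<int>::iterator iter = ivec.begin();
         iter != ivec.end(); ++iter)
        *iter = 0;   // set the element to which iter refers to 0
}
```

Calling reset_to_zero on an empty vector leaves it empty and dereferences nothing, because the loop condition fails on the first test.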

`const_iterator`

The previous program used a vector<int>::iterator to change the values in the vector. Each container type also defines a type named const_iterator, which should be used when reading, but not writing to, the container elements.

When we dereference a plain iterator, we get a nonconst reference (Section 2.5, p. 59) to the element. When we dereference a const_iterator, the value returned is a reference to a const (Section 2.4, p. 56) object. Just as with any const variable, we may not write to the value of this element.

For example, if text is a vector<string>, we might want to traverse it, printing each element. We could do so as follows:

      // use const_iterator because we won't change the elements
      for (vector<string>::const_iterator iter = text.begin();
                                          iter != text.end(); ++iter)
          cout << *iter << endl; // print each element in text

This loop is similar to the previous one, except that we are reading the value from the iterator, not assigning to it. Because we read, but do not write, through the iterator, we define iter to be a const_iterator. When we dereference a const_iterator, the value returned is const. We may not assign to an element using a const_iterator:

      for (vector<string>::const_iterator iter = text.begin();
                                          iter != text.end(); ++iter)
          *iter = " ";     // error: *iter is const

When we use the const_iterator type, we get an iterator whose own value can be changed but that cannot be used to change the underlying element value. We can increment the iterator and use the dereference operator to read a value but not to assign to that value.
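A read-only traversal of this kind might look like the following sketch, which adds up the elements of a vector<int>. The function name sum_elements is hypothetical. Because the parameter is a reference to a const vector, only a const_iterator can be used to walk it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the sum of the elements in v.
// The const_iterator itself changes (it is incremented), but it is used
// only to read elements, never to assign to them.
int sum_elements(const std::vector<int>& v)
{
    int total = 0;
    for (std::vector<int>::const_iterator iter = v.begin();
         iter != v.end(); ++iter)
        total += *iter;   // reading through a const_iterator is allowed
    return total;
}
```

Replacing the loop body with an assignment such as *iter = 0 would be rejected by the compiler, for the reason described above.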

A const_iterator should not be confused with an iterator that is const. When we declare an iterator as const we must initialize the iterator. Once it is initialized, we may not change its value:

      vector<int> nums(10);  // nums is nonconst
      const vector<int>::iterator cit = nums.begin();
      *cit = 1;               // ok: cit can change its underlying element
      ++cit;                  // error: can't change the value of cit

A const_iterator may be used with either a const or nonconst vector, because it cannot write an element. An iterator that is const is largely useless: Once it is initialized, we can use it to write the element it refers to, but cannot make it refer to any other element.

      const vector<int> nines(10, 9);  // cannot change elements in nines
      // error: cit2 could change the element it refers to and nines is const
      const vector<int>::iterator cit2 = nines.begin();
      // ok: it can't change an element value, so it can be used with a const vector<int>
      vector<int>::const_iterator it = nines.begin();
      *it = 10; // error: *it is const
      ++it;     // ok: it isn't const so we can change its value

      // an iterator that cannot write elements
      vector<int>::const_iterator
      // an iterator whose value cannot change
      const vector<int>::iterator

Exercises Section 3.4

Exercise 3.17:
Redo the exercises from Section 3.3.2 (p. 96), using iterators rather than subscripts to access the elements in the vector.

Exercise 3.18:
Write a program to create a vector with 10 elements. Using an iterator, assign each element a value that is twice its current value.

Exercise 3.19:
Test your previous program by printing the vector.

Exercise 3.20:
Explain which iterator you used in the previous programs, and why.

Exercise 3.21:
When would you use an iterator that is const? When would you use a const_iterator? Explain the difference between them.

3.4.1. Iterator Arithmetic

In addition to the increment operator, which moves an iterator one element at a time, vector iterators (but few of the other library container iterators) also support other arithmetic operations. These operations are referred to as iterator arithmetic, and include:

iter + n
iter - n
We can add or subtract an integral value to an iterator. Doing so yields a new iterator positioned n elements ahead of (addition) or behind (subtraction) the element to which iter refers. The result of the addition or subtraction must refer to an element in the vector to which iter refers or to one past the end of that vector. The type of the value added or subtracted ought ordinarily to be the vector's size_type or difference_type (see below).
iter1 - iter2
Computes the difference between two iterators as a value of a signed integral type named difference_type, which, like size_type, is defined by vector. The type is signed because subtraction might have a negative result. This type is guaranteed to be large enough to hold the distance between any two iterators. Both iter1 and iter2 must refer to elements in the same vector or the element one past the end of that vector.

We can use iterator arithmetic to move an iterator to an element directly. For example, we could locate the middle of a vector as follows:

      vector<int>::iterator mid = vi.begin() + vi.size() / 2;

This code initializes mid to refer to the element nearest to the middle of vi. It is more efficient to calculate this iterator directly than to write an equivalent program that increments the iterator one by one until it reaches the middle element.
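The two forms of iterator arithmetic can be seen side by side in the sketch below. Both function names are hypothetical. The first assumes a nonempty vector so that mid denotes a real element; the second only computes a distance and so also works when the vector is empty.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the value of the element nearest the middle
// of vi. Assumes vi is not empty, so that mid may be dereferenced.
int middle_value(const std::vector<int>& vi)
{
    assert(!vi.empty());
    std::vector<int>::const_iterator mid = vi.begin() + vi.size() / 2;
    return *mid;
}

// Hypothetical helper: returns how many positions mid lies past begin.
// Subtracting two iterators yields a signed difference_type.
std::vector<int>::difference_type
offset_of_middle(const std::vector<int>& vi)
{
    std::vector<int>::const_iterator mid = vi.begin() + vi.size() / 2;
    return mid - vi.begin();
}
```

For a five-element vector, vi.size() / 2 is 2, so mid denotes the third element and the computed offset is 2.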

Any operation that changes the size of a vector can make existing iterators invalid. For example, after calling push_back, you should not rely on the value of an iterator into the vector.
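One way to cope with this rule is to remember a position rather than an iterator, and to compute a fresh iterator after the vector has changed size. The following is a sketch under that assumption; the function name append_and_refind and its parameters are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Sketch: append a value, then return an iterator to the element at
// position ix. No iterator obtained before the push_back is used afterward.
// Assumes ix < v.size() on entry.
std::vector<int>::iterator
append_and_refind(std::vector<int>& v,
                  std::vector<int>::size_type ix, int value)
{
    assert(ix < v.size());    // precondition: the element must already exist
    v.push_back(value);       // may invalidate every iterator into v
    return v.begin() + ix;    // fresh iterator computed after the append
}
```

The index survives the append unchanged, whereas an iterator saved before the call might no longer refer to the vector's storage.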

Exercises Section 3.4.1

Exercise 3.22:
What happens if we compute mid as follows:
      vector<int>::iterator mid = (vi.begin() + vi.end()) / 2;