5.1 CONTAINER CLASSES IN C | Programming with Objects: A Comparative Presentation of Object Oriented Programming with C++ and Java

5.1 CONTAINER CLASSES IN C++

The ten main container classes that come with the C++ Standard Template Library (STL) are:

vector: Particularly efficient for random access of data elements using array-like indexing. Also, efficient for addition and deletion of data elements at the back end of a vector. A vector can act both like a regular array and like a linked list. If the insertion of new elements into a vector causes the vector to exceed the memory previously allocated to it, the vector simply migrates to a new location in the memory. To minimize the overhead associated with such migration, one must judiciously choose the amount of memory that is allocated to a vector initially and the incremental additions to this memory as the vector runs out of previously allocated memory.
list: Particularly efficient for all kinds of list operations, meaning insertion and deletion of elements anywhere in a list, including the addition and deletion of elements at both its ends. Accessing elements via array-like indexing is not allowed.
deque: (Stands for double-ended queue and rhymes with ‘check') Combines the best of vector and list with regard to subscripting via the ‘[]' operator and the add/delete operations at both ends of a data sequence stored in a deque. However, the prototypical list operations (insert/erase anywhere in a list) are not as efficient as for a list.
stack: Only supports storage and retrieval at the back end of a sequence of data. That means that the only data element that can be retrieved is the one that was pushed most recently into the stack.
queue: Only supports retrieval at the front end of a sequence of data while new elements are pushed into the back end. This means that of the data elements currently in the container, it is the oldest that is available next for retrieval.
priority_queue: A priority queue is like a queue except that the element that is available for retrieval next has the highest priority associated with it.
map: This container is used to store a sequence of <key, value> pairs. The name map implies a mapping from the keys to the values. Particularly efficient for retrieving a ‘value' for a given ‘key'. The <key, value> pairs are stored in a sorted order using a comparison function whose default meaning corresponds to the ‘<' operator for the key type. As an example, a telephone directory in which any single name can appear only once is an example of a map container. The keys are the names and the values the phone numbers.
set: A set is a degenerate form of a map in that no value need be specified for the keys. Therefore, in contrast with, say, a vector or a list, the list of objects (meaning the keys) is stored in a sorted order.
multimap: A more general form of map. While any single ‘key' can appear only once in the sequence of <key, value> pairs in a map, in a multimap it is possible to have more than one <key, value> pair for the same key. A phone directory in which a name is allowed to appear more than once (because a person is allowed to have more than one phone number) would be an example of a multimap container.
multiset: A multiset is a degenerate form of a multimap in which no value need be specified for the keys.

In addition to these ten, there are other more specialized container classes like bitset, valarray, etc. Our focus will be on the ten main classes listed here. The end of the section includes a few brief comments about the other containers of STL.

Of the main STL containers, vector, list, and deque are known as the sequence containers, because they keep their elements in a sequential order. Sequential order does not imply that the elements are necessarily kept in contiguous segments of memory. It only means that the container supports some kind of a next-element operation that would allow one to go from one element to the next until reaching the end of the data sequence.

The sequence containers that do store their elements in contiguous segments of memory are vector and deque. For that reason, these two containers permit array-like random-access through the subscript operator ‘[]'. The fact these two containers store their data in contiguous segments of memory means that insertions or deletions anywhere except at the two ends of the container must be expensive. Obviously, if you insert a new element in the middle, to create room for the new element you'd need to shuffle all of the existing entries on one side of the new entry. If you delete an entry in the middle, you'd again need to shuffle all of the entries on one side of the deletion point so that there is no gap in the container after the deletion.

Of the two containers with very similar properties, vector and deque, the former has very little memory overhead-meaning memory needed in addition to what is actually required for the storage of data-and has the ‘[]' type of indexing that is almost as efficient as for arrays. On the other hand, a deque has a slightly higher memory overhead and its array-like indexing via the ‘[]' operator is slightly less efficient than that for vector. However, a deque is more efficient than a vector for insertions/deletions at the front of the container.

On the other hand, if you were looking for a container that was efficient for insertions and deletions everywhere, especially in the middle, and you did not care about array-like indexing, you'd choose list. A list container consists of doubly linked nodes that facilitate highly efficient insertions/deletions anywhere. Because of the memory needed for all the nodes, a list has the highest memory overhead of the three sequenced containers.

Table 5.1 should help the reader choose the best container for a given application. The notation O(N) used in this table stands for linear time complexity; it means that the time taken for the operation listed in the first column is linearly proportional to the number of elements in the container. The notation O(1) stands for constant time complexity; it means that the time taken is independent of the number of items already in the container. The ‘+' symbol means that the operation might occasionally have a significant extra overhead associated with it. For example, inserting an additional element at the back of a vector would ordinarily incur a fixed cost, but if we run out of room to grow the vector and the entire vector has to be migrated to some other location in the memory, there would come into play the overhead of having to copy over all of the vector elements to their new locations. The combined notation O(1) + is referred to as amortized constant time.

Table 5.1
Operation and its efficiency	vector	deque	list
Efficiency for array-like access	O(1)	O(1)	O(N)
Efficiency for insert/delete at front	O(N)	O(1)+	O(1)
Efficiency for insert/delete at back	O(1)+	O(1)+	O(1)
Efficiency for insert/delete in middle	O(N)	O(N)	O(1)

The containers stack, queue, and priority_queue are called adapters or adapter containers because their role is to restrict the more general interfaces of the basic sequence containers vector, deque, and list. For example, by using the operations push_back and pop_back, it is perfectly possible to use either a vector or a deque directly as a stack. But if you wanted to make sure that a stack in your program was used only as a stack and in no other way that is possible with, say, a vector, you can use a vector through a stack adapter. Similarly, it is possible to use either a list or a deque as a queue data structure, but if you wanted to suppress all the other non-queue functionality of those two containers, you can use them through the queue adapter. The same goes for the priority_queue adapter container.

The remaining four containers-map, set, multimap, and multiset-are known as the associative containers. These containers allow us to store and retrieve an object on the basis of a key associated with the object. What gets stored in such containers are <key, value> pairs in key-sorted order, usually in the form of a binary tree. Key-order sorting ensures that the object associated with a key can be retrieved in O(log(N)) time, where N is the number of keys in the container. The keys must be unique for a map, meaning that there can be only one <key, value> pair for each key. If it is desired to have multiple entries for the same key, the associative container to use is multimap. The container set is a degenerate form of map in which only the keys are stored. The reason for the name set is that there can be no duplicate keys-a basic requirement of membership in a set in the formal set theory. A multiset is to a multimap what set is to a map. If key-sorted storage of the <key, value> pairs is not necessary, it is also possible to implement a map using a hash table, which is capable of giving a O(1) time retrieval efficiency depending on the data and the hashing function used.^[2].

Of the more specialized containers that we will not be discussing in this book, bitset is used for achieving array-like addressing for the individual bits of a data entity in the memory and valarray for holding numerical data and processing thereof.

All container classes support the assignment operator so that a container can be assigned wholesale to a variable of the same type; the ‘= =' operator that returns true if two containers of the same type have the same number of elements and if the elements in corresponding positions in the two containers satisfy the ‘= =' operator for the element type; and the ‘<' operator if each element in the container on the left of the operator is less than the corresponding element of the container on the right. The meaning of the other comparison operators, ‘! =, ‘>', ‘<=', and ‘>=', is inferred from the overloadings for ‘= =' and ‘<'.

When a class type object is inserted into an STL container, there is the question of what actually gets stored in the container. Is it the object itself? Or, is it a copy of the object? The answer is the latter. The container will copy the object using either the copy constructor or the copy assignment operator defined for the element type. Copy constructors and copy assignment operators are presented in Chapter 11.

In the rest of this section, we will show simple examples for all but two of the ten main containers of STL. We do not show examples for multimap and multiset because their interfaces are similar to those for map and set, respectively. Additionally, we will present the class vector in more detail than others because this container type is used more often than others and because much of what we say about the vector container applies to other containers also. For example, the issues that arise when we store programmer-defined class type objects in a vector are identical to such issues for the other containers.

5.1.1 Vector

A vector is like an array that can grow and shrink as elements are added to or removed from the vector. Since vectors allow for array-like indexing, they can be used very much like arrays. In addition, since vectors allow new elements to be inserted anywhere, they can be used very much like linked lists. The random access efficiency of a vector through array like indexing is comparable to that for arrays since-as for arrays-the elements are held in a contiguous block of memory.

The reader might ask: Doesn't holding all the elements of a vector in one contiguous block of memory prevent an indefinite insertion of new elements into the vector? Won't the vector eventually run into the memory allocated to the other objects in a computer program? In a vector, this problem is taken care of by allowing a vector to migrate elsewhere in memory if that is what it takes to keep all its elements in one contiguous block of memory. The elements residing at the current locations are then copied over into the freshly appropriated memory and the currently used memory deallocated.^[3].

But then the reader might ask: If a vector is allowed to migrate around in the memory as it grows, does that not create an overhead associated with memory allocation, memory deallocation, the copying of elements from one location to another, etc.? This computational overhead associated with a vector can be alleviated by intelligently preallocating memory when a vector is first created. If despite this memory preallocation, a vector has to move to a different place in the memory occasionally, in many applications the computational overhead associated with that would be a small price to pay for the huge convenience afforded by a vector's array-like random access combined with its highly efficient insert/delete operations at its back end, and its linked-list like flexibility for occasional insertions and deletions everywhere else.

The following statement declares vec to be an empty vector whose elements will be of type T:^[4]

     vector<T> vec;

For example, a vector of integers would be declared by

     vector<int> vec;

and a vector of strings by

     vector<string> s_vec;

Basic to the use of a C++ vector is the notion of an iterator. What a pointer is to an ordinary array, an STL iterator is to a vector (and to other STL containers). From the standpoint of usage, an important difference between the two is that there is no such thing as a null iterator. That is, an iterator cannot be initialized by setting it equal to 0.

The following statement declares p to be an iterator for a vector of ints:

     vector<int>::iterator p;

Whereas an array pointer can be initialized by setting it equal to the name of the array, a vector iterator needs more specialized syntax for its initialization. For example, to make the iterator of an integer vector vec point to the first element of the vector, you'd need to say

      vector<int>::iterator p = vec.begin();

and to make it point to one past the last element of the vector, you'd say

      vector<int>::iterator p = vec.end();

Using the begin() and end() function, one can set up a following kind of a while loop for printing out the elements of a vector of ints:

      vector<int>::iterator p = vec.begin();      while ( p != vec.end() )          cout << *p++ << "  ";

This construct shows two additional features of vector iterators: (1) They can be dereferenced in exactly the same manner as array pointers-by using the operator ‘*'. (2) They can be incremented (and decremented) in exactly the same manner as array pointers by using the ‘++' (and ‘−−') operator. The loop iteration comes to a halt when the value of the iterator p reaches one past the end of the vector, a condition detected by the function end().

C++ provides the functions push_back() and pop_back() for adding new elements at the back end of a vector and for popping elements off the back end. Both functions return void and do their assigned jobs by side effect. For example, if we say

      vector<int> vec;      vec.push_back( 34 );      vec.push_back( 23 );      vec.pop_back();        //get rid of the last element - 23

that would push the integers 34 and 23 into the vector vec and then pop the last entry, 23, off the vector. To determine the size of the vector thus created, we could say

     cout << vec.size();     // answer: 1

Note that size() would have returned 2 before we popped off the last entry.

As is the case with array pointers, we could also set the value of a vector element by dereferencing the iterator, as shown below:

     vector<int> vec;     vec.push_back( 34 );     vec.push_back( 23 );     vector<int>::iterator p = vec.begin();     *(p + 1) 0= 52;

The last statement will change the value of the second element from 23 to 52.

As was mentioned at the beginning of this chapter, vectors come with the subscript operator for random access to their elements. So if vec is a vector consisting of N elements, the elements can be designated vec[0], vec[1], ……, vec[N-1]. Therefore, another way to print out all the elements of a vector would be

     int i = 0;     while ( i < vec.size() )         cout << vec[i++] << "  ";

We can even change the value of a vector element by using the subscript operator. For example, if we say

     vec[1] = 49;

the value stored at the second element will be changed to 49.

The following program shows the above-mentioned features of a vector inside a program that you can run (and modify) for practice:

 
 // VectorBasic.cc #include <iostream> #include <vector>                                                           //(A) using namespace std; void print( vector<int> ); int main() {     vector<int> vec;                                                        //(B)     vec.push_back( 34 );                                                    //(C)     vec.push_back( 23 );       // size is now 2     print( vec );              // 34 23     vector<int>::iterator p;                                                //(D)     p = vec.begin();                                                        //(E)     *p = 68;                                                                //(F)     *(p + 1) = 69;                                                          //(G)     //  *(p + 2) = 70;         // WRONG                                     //(H)     print( vec );              // 68 69     vec.pop_back();            // size is now 1                             //(I)         print( vec );            // 68                                      vec.push_back(101);                                                     //(J)     vec.push_back(103);      // size is now 3                               //(K)                              // size is now 3     int i = 0;     while ( i < vec.size() )                                               //(L)         cout << vec[i++] << "  ";     cout << endl;// 68 101 103     vec[0] = 1000;                                                          //(M)     vec[1] = 1001;                                                          //(N)     vec[2] = 1002;                                                          //(O)     print( vec );            // 1000 1001 1002     return 0; } void print( vector<int> v ) {     cout << "\nvector size is: " << v.size() << endl;     vector<int>::iterator p = v.begin();     while ( p != v.end() )         cout << *p++ << "  ";     cout << endl << endl; }

Note in line (A) the inclusion of the header file vector. This file contains the prototypes of all C++ vector related functions. The effect of the statements in lines (B) through (D) has already been explained. In line (E), we initialize the iterator so that it points to the beginning of the vector. In lines (F) and (G), we then show how to alter the values of elements of the vector by dereferencing the iterator. We are able to do so because, by using the push_back() function, we have already created a vector of two elements. So the iterators being dereferenced point to valid locations in the vector. That should also explain why the statement in line (H) is illegal. In line (I), we pop the vector and remove the last element from its back end. The pop_back() returns void but removes the last element as a side effect. After that, the vector has only one element. The while loop in line (L) shows how the elements of a vector can be accessed with the subscript operator. Finally, lines (M) through (O) show how the elements of a vector can be overwritten by using subscript indexing.

It is also possible to declare a vector of a predetermined size, say of size 100 elements, by a declaration like

     vector<int> vec(100);

In such declarations, each element of the vector is initialized to a zero of the appropriate kind. For the case shown, each element will be initialized to the integer 0. When applicable, the default initialization can be changed by supplying it as a second argument to the vector constructor. For example, in the declaration

     vector<int> vec(100, 2);

each of 100 elements of the vector will be initialized to the integer 2.

We will use the next program to illustrate the functions front() and back() and to show the use of the ‘&' operator to obtain the iterator associated with a particular vector element. The functions front() and back(), when invoked on a vector; return the first and the last elements of the vector. This program also illustrates the use of resize() which adds zero-initialized elements to a previously defined vector; and reserve() which increases the capacity of a vector by setting aside uninitialized memory for the vector.

Line (A) of the program below declares a vector with preallocated memory for storing 100 integers. Each element of this vector is initialized by default to 0, as evidenced by the output produced by the sum(v1) statement. Subsequently, in line (B), we push a new element into the vector. Notice that this step acquires memory in addition to what was allocated in line (A). As a result, the size of the vector is now 101. In line (C), we increase the capacity of the vector by invoking reserve; the additional memory allocated to the vector remains uninitialized. Note that increasing the capacity of the vector through reserve does not alter what is returned by the size() function. Lines (D) and (E) demonstrate the behavior of the front() and back() functions. Line (F) shows how a vector iterator can be initialized to point directly to a chosen element in the vector.

By creating a vector of predetermined size but with nonzero initialization, lines (H) through (J) show the behavior of resize with regard to adding additional initialized elements to a vector. Line (H) creates a vector of 150 integers, each element of the vector initialized to the integer 2. In line (I), we resize this vector so that its new size is 500. But the additional elements that are added are initialized to the zero of the appropriate kind-in this case the integer 0.

Line (K) and the statements that immediately follow show how clear() can be used to clear out all the elements of a vector-and the size of the vector made zero-without altering its capacity. As shown in line (L), the same effect can be achieved by invoking resize(0) on the vector.

 
 //VectorFrontBackResize.cc #include <iostream> #include <vector> int sum( vector<int> vec ) {     int result = 0;     vector<int>::iterator p = vec.begin();         while ( p != vec.end() ) result += *p++;        return result; } int main() {     vector<int> v1(100);                                                    //(A)     cout << v1.size() << endl;           // 100     cout << sum( v1 ) << endl;           // 0     v1.push_back( 23 );                                                     //(B)     cout << v1.size() << endl;           // 101     cout << sum( v1 ) << endl;           // 23     v1.reserve( 1000 );                                                     //(C)     cout << v1.capacity() << endl;       // 1000     cout << v1.size() << endl;           // 101     cout << v1[900] << endl;             // undefined     cout << sum( v1 ) << endl;           // 23     cout << v1.front() << endl;          // 0                               //(D)     cout << v1.back() << endl;           // 23                              //(E)     v1.pop_back();     cout << v1[ v1.size() - 1 ] << endl; // 0                               //(F)     vector<int>::iterator p = &v1[50];                                      //(G)     cout << *p << endl;                  // 0     vector<int> v2(150, 2);                                                 //(H)     cout << sum( v2 ) << endl;           // 300     v2.resize( 500 );                                                       //(I)     cout << v2.size() << endl;           // 500     cout << v2[150] << endl;             // 0                               //(J)     v2.clear();                                                             //(K)     cout << v2.empty() << endl;          // true     cout << v2.capacity() << endl;       // 500     cout << v2.size() << endl;           // 0     v2.resize( 0 );                                                         //(L)     cout << v2.capacity() << endl;       // 500     cout << v2.size() << endl;           // 0     return 0; }

Note that the call v1[ v1.size() - 1 ] in line (F) is identical to invoking the function back() on a vector.

As mentioned before, the functions

      begin()      end()

return iterators: the former to the first element of the vector, and the latter to one past the last element. On the other hand, the functions

      front ()      back ()

return the vector elements themselves, the former the first element and the latter the last element.^[5]

5.1.1.1 List Operations on Vectors

We will now show some operations on vectors that highlight the fact that a vector can also be used in much the same manner as a linked list, meaning that elements can be added or removed at the beginning, anywhere in the middle, or at the end of a vector. But the reader needs to be aware of the fact that it is only at the back end of a vector that the list operations can be carried out efficiently, meaning in constant time. List operations anywhere else run the risk of taking time proportional to the number of elements in the vector. The reason for this is that a vector must keep all its elements in one continuous run of the memory. Given a vector of a sufficiently large capacity, adding an element at the back-end requires only that the new element be placed in the memory segment next to where the vector currently ends. But inserting an element anywhere else would require that all of the downstream elements be shuffled over in order to create space for the new element.

The functions that are used commonly for inserting new elements into a vector and for deleting existing elements are insert() and erase(). Both these functions take iterator arguments. The iterator must point to the position where a new element is to be inserted or from where an old element is to be deleted. As you would expect, when insert() is used for inserting a new element, the size of the vector increases by one; and when erase() is used for deleting an element, the size of the vector decreases by one. If the location of an item that needs to be deleted is not known, one can invoke the find() function to obtain the iterator that points to the first occurrence of the item. The iterator returned by find() can then be passed to the erase() function for the deletion of the item.

The next program illustrates the list operations on a vector. Even more importantly, this program shows that when you modify a vector structurally with a list operation, any previous initializations of an iterator become invalid. This is illustrated with lines (A) through (E) of the program below. Line (A) creates a vector of five integers, all initialized to 0. In line (B), we declare and initialize an iterator to the vector. When this iterator is dereferenced in line (C), we obtain the value stored in the first element of the vector-the number 0. We then invoke a list operation in line (D) by inserting a new element, 9, at the location pointed to by the iterator value in the first argument to insert. When we dereference the previously initialized iterator again in line (D), we can expect unpredictable behavior from the program, probably some garbage output if not a program crash. When we re-initialize the iterator in line (F), the problem disappears and we see the expected output in line (G).

Line (H) illustrates the invocation of erase to delete an element of the vector that is pointed to by the iterator argument to the function-in this case the very first element. Line (I) demonstrates inserting an element in the middle of a vector, and line (J) the deletion of an element in the middle of a vector. Lines (K) and (L) show the same pair of operations at the end of the vector.

Lines (M) through (R) show how find can be used in conjunction with erase to delete vector elements holding a particular item of data whose location is not known in advance. Each invocation of find returns the iterator value pointing to first occurrence of the data item supplied to the function for its last argument.

 
 //VectorInsertEraseSort.cc #include <iostream> #include <vector> #include <algorithm>     // for the find() and sort() generic functions using namespace std; void print( vector<int> ); int main() {     vector<int> vec(5);                                                    //(A)     print( vec );                                       // 0 0 0 0 0     vector<int>::iterator p =  vec.begin();                                //(B)     cout << *p << endl;                                 // 0               //(C)     vec.insert( vec.begin(), 9 );                                          //(D)     print( vec );                                       // 9 0 0 0 0 0     cout << *p << endl;                                 // ERROR           //(E)     p = vec.begin();                                                       //(F)     cout << *p << endl;                                 // 9               //(G)     vec.erase( vec.begin() );                                              //(H)     print( vec );                                       // 0 0 0 0 0     vec.insert( vec.begin() + 2, 8 );                                      //(I)     print( vec );                                       // 0 0 8 0 0 0     vec.erase( vec.begin() + 2 );                                          //(J)     print( vec );                                       // 0 0 0 0 0     vec.insert( vec.end(), 7 );                                            //(K)     print( vec );                                       // 0 0 0 0 0 7     vec.erase( vec.end() - 1 );                                            //(L)     print( vec );                                       // 0 0 0 0 0     vec.insert( vec.begin() + 3, 6 );                                      //(M)     print( vec );                                       // 0 0 0 6 0 0     vec.erase( find( vec.begin(), vec.end(), 6 ) );                        //(N)     print( vec );                                       // 0 0 0 0 0     vec.insert( vec.begin() + 1, 3 );                                      //(O)     vec.insert( vec.begin() + 5, 3 );                                      //(P)     print( vec );                                       // 0 3 0 0 0 3 0     vec.erase( find( vec.begin(), vec.end(), 3 ) );                        //(Q)     vec.erase( find( vec.begin(), vec.end(), 3 ) );                        //(R)     print( vec );                                       // 0 0 0 0 0     vec[0] = 23;                                                           //(S)     vec[1] = 2;     vec[2] = 16;     vec[3] = 45;     vec[4] = 16;     print( vec );                                        // 23 2 16 45 16     sort( vec.begin(), vec.end() );                                        //(T)     print( vec );                                        // 2 16 16 23 45     return 0; } void print( vector<int> v ) {      vector<int>::iterator p = v.begin();      while ( p != v.end() )          cout << *p++ << " ";      cout << endl; }

The above program also illustrates the sorting of a vector by the STL generic algorithm sort declared in the algorithm header file of the C++ Standard Library. The two arguments passed to sort in line (T) are the two iterator values, first pointing to the beginning of the vector and the second to one past the end. (We could have sorted just a portion of the vector by supplying appropriate iterator values to sort.).

5.1.1.2 Vector of Class Type Objects

All of our discussion so far, although stated for the case of vectors of integers, carries over to the case of vectors of other data types, including class types. For a class type object to be stored in a container, at the minimum the class should support a no-arg constructor so that memory appropriated for a container can be properly initialized. The class must also possess overload definitions for the ‘= =' and ‘<' operators so that the objects stored in a container can be compared by the various STL algorithms. STL is capable of inferring the meaning for the other comparison operators from these two.

As an example of a container of a class type object, you could declare a vector of strings by

     vector<string> vec;

And then you could push into the vector either string objects or objects that the system can covert automatically into string objects. For example, you could say

     vec.push_back( "hello" );     vec.push_back( "hi" );     vec.insert( vec.end(), "greetings" );

where the last statement is equivalent to invoking push_back( "greetings" ). The argument supplied to the push_back function is first converted into a string object and then entered into the vector. This works because the function push_back has been suitably overloaded to accept a const char* argument.

A vector of strings can be sorted by the sort function declared in the algorithm header file of the C++ standard library just as a vector of ints was sorted in the previous subsection. We could, for example, say

     sort( vec.begin(), vec.end() );

By default, the system will sort a vector of strings according to the lexicographic order dictated by the ASCII code associated with the characters. A different sorting order can be obtained by supplying one more argument to the sort function that establishes the comparison criterion to be used. To fully appreciate the nature of this third argument, you'd need to first understand operator overloading in C++, a topic that will be discussed in Chapter 12. Sections 12.10 and 12.11 of that chapter discuss sorting of class type objects with the user-specified criteria for sorting supplied through operator overload definitions.

You can also create a vector of a data type that you may define yourself. Given a programmer-defined class

     class X {         int p;     public:         //     };

let's say that we want to store objects of type X in a vector. To create a vector of this data type, you would say

     vector<X> vec;

or, if you wanted to create a vector with a preallocated capacity of, say five, you'd say

     vector<X> vec( 5 );

You can push an X object into a vector by

     vector<X> vec;     X x1(...);     vec.push_back( x1 );

where the second statement is a call to one of the constructors of class X. While the invocation of push_back would work fine, as it has in our earlier examples, we are faced with the following interesting question: What exactly is being inserted into the vector? Is it the object x1 itself, or is it a copy of the object? The answer: A copy of the object.^[6]

The following program provides a working example of a vector of a type defined by a programmer in line (A). Lines (B) and (C) provide overload definitions for the operators ‘= =' and ‘<'. The syntax used for these overload definitions will be explained in Chapter 12.

In main, we first create an empty vector for object of type X in line (F). We then create three X object in lines (G) through (I); these are subsequently pushed into the vector in the next three lines of code.^[7]

To demonstrate that what gets stored in the container are the copies of the objects and not the original objects directly, in line (J) we change the state of one of the objects. But this state change is not reflected in the printout statements for the vector elements just before and just after line (J).

Line (K) demonstrates the declaration of a vector of predetermined size and the initialization of the vector elements by the no-arg constructor for class X. Lines (L) and (M) demonstrate the resizing of a vector and the changing of its capacity, respectively. While the memory acquired through resizing is initialized according to the no-arg constructor for X, the additional memory reserved by changing the capacity of the vector remains uninitialized, as evidenced by the output in line (N).

The sort function invoked in line (O) on a vector of X objects uses the ‘<' operator that is overloaded for the class in line (B). Another way to invoke sort would be via its 3-argument version, shown in the commented out line (P), that for its third argument requires a function object that knows how to compare objects of type X. The commented out code for X_Comparator in line (D) creates such a function object through the overloading of the ‘()' operator in line (E). Chapter 12 will explain what is meant by a function object and by the overloading of the overloaded ‘()' operator.

 
 //VectorForClassType.cc #include <iostream> #include <vector> #include <algorithm>      // for sort() using namespace std; class X {                                                              //(A)     int p; public:     X() { p = 42; }     X( int q ) { p = q; }     int getp() const { return p; }     void changeState( int pp ) { p = pp; } }; //Chapter 12 explains the syntax shown for the two //operator overloadings for class X: bool operator<( const X& x1, const X& x2 ) {                           //(B)        return x1.getp() < x2.getp(); } bool operator==( const X& x1, const X& x2 ) {                          //(C)      return x1.getp() == x2.getp(); } // An alternative way of sorting a vector of objects // of type X would be to invoke a 3-argument sort with // the third argument set to the function object // X_Comparator(). This function object would correspond // to the overloading of the ‘()' operator. See Chapter // 12 for the overloading of this operator. // class X_Comparator {                                                //(D) // public: //    bool operator() ( const X& x1, const X& x2 ) const {             //(E) //       return x1.getp() < x2.getp(); //    } // }; void print( vector<X> ); int main() {      vector<X> vec;                                                    //(F)      X x1( 2 );                                                        //(G)      X x2( 3 );                                                        //(H)      X x3( 5 );                                                        //(I)      vec.push_back( x1 );      vec.push_back( x3 );      vec.push_back( x2 );      print( vec );                  // 2 5 3      x2.changeState( 1000 );                                           //(J)      //change made to x2 in line (J) does not affect copy of x2 in vec:      print( vec);                   // 2 5 3      //vector elements initialized by X's no-arg      constructor:      vector<X> vec_2( 5 );                                             //(K)      print( vec_2 );                // 42 42 42 42 42      vec_2.resize( 7 );                                                //(L)      print( vec_2 );                // 42 42 42 42 42 42 42      //uninitialized increase in the vector capacity:      vec_2.reserve( 10 );                                               //(M)      cout << vec_2.capacity() << endl;         // 10      print ( vec_2 );             // 42 42 42 42 42 42 42                                   // size still returns 7      cout << vec_2[ 8 ].getp() << endl;   // undefined                  //(N)      //set up vector for sorting:      vec_2[0] = X(12);      vec_2[1] = X(36);      vec_2[2] = X(3);      vec_2[3] = X(56);      vec_2[4] = X(2);      sort( vec_2.begin(), vec_2.end() );                                    //(O)      // The commented out statement in line (P) below is an      // alternative way of sorting a vector of objects of      // type X. In the 3-argument invocation of sort, the      // third argument is a function-object that corresponds      // to the overloading of the '()' operator for the      // X_Comparator class. This overloading was shown earlier      // in the commented out line (E).      // sort( vec_2.begin(), vec_2.end(), X_Comparator() );              //(P)      print( vec_2 );            // 2 3 12 36 42 42 56      vec_2.clear();      print( vec_2 );                   // vec_2 is now empty      cout << vec_2.capacity() << endl;// 10      return 0; } void print( vector<X> v ) {      cout << "\nvector size is: " << v.size() << endl;      vector<X>::iterator p = v.begin();      while ( p != v.end() )           cout <<    (*p++).getp() << "  ";      cout << endl << endl; }

Note again that when we create a vector of a designated size, as in

     vector<X> vecPresized( 5 );

each element of the vector is initialized by invoking the no-arg constructor of the class. Since the no-arg constructor for X sets the data member p to 42, and since we print out the vector by printing out the value of the data member p for each element of the vector, we get the sequences of the number 42, as shown in the output displayed below line (K).

When an element is erased for a vector of class type objects, the destructor for the class is invoked to carry out the erasure. Also, if the erasure of a vector element requires that all the higher-indexed elements be shuffled down in memory (so that all of the vector elements can stay in contiguous segments of memory), it is the copy assignment operator for the class that if defined is called upon to do the job. A brief introduction to destructors was provided in Chapter 3. Destructors and copy assignment constructors are fully discussed in Chapter 11.

5.1.1.3 Using an Array to Initialize a Vector

We will now show how an array can be used to initialize a vector. To initialize a vector with an array, the vector constructor takes two arguments, both pointers, one pointing to the beginning of the array and the other to one past the end of the array. So an int array named data:

     int data[] = {11, 12, 23, 34};

can be used to initialize an int vector by

     vector<int> vec( data, &data[ size ] );        //(A)

Note that the pointer that is the second argument to the constructor points to a position that is past the end of the array. The pointer to the end of the array is given by

     &data[ size - 1 ]

but if you use this for the second argument in the invocation shown at (A) above, you'll be excluding the last element of the array from the vector. The following example shows a small program containing the above statements for vector initialization:

 
 //VectorInitArray.cc #include <iostream> #include <vector> using namespace std; void print( vector<int> ); int main() {     int data[] = {11, 12, 23, 34};     int size = sizeof( data ) / sizeof( data[0] );     vector<int> vec( data, &data[ size ] );     print( vec );           // 11 12 23 34     return 0; } void print( vector<int> vec ) {     vector<int>::iterator p = vec.begin();     while ( p < vec.end() )       cout << *p++ << " ";     cout << endl; }

5.1.2 Deque

A deque has all the functionality of a vector and then some. As mentioned before, a vector is inefficient for insert/delete operations at the front of the sequence, because if you insert a new element at the front, the vector has to shuffle all the other elements in the memory to their next location. On the other hand, in a deque the insert/delete operations at the front are just as efficient as they are at the back. So, whereas a vector provides us with the efficient push_back and pop_back operations at the back, a deque has these two and also push_front and pop_front for equally efficient operations at the front. However, note that if an application does not require efficient insert/delete operations at the front of a data sequence, you are better off with a vector because of its lower memory overhead and more efficient array indexing.

The main reason for the next program is to illustrate the push_front and pop_front operations made available for deque and the fact that these two operations work the same way at the front of the container as the push_back and pop_back operations at the back. The program also shows some of the features of deque that it shares with the vector container.

Line (A) of the program below creates an empty deque. Lines (B) and (C) then use push_back to place two items in the deque. We then use push_front in lines (D) and (E) to push two more items into the deque. Whether the items were pushed in from the front or from the back is reflected in the order in which the items appear when displayed by the print statement next. Lines (F) and (G) demonstrate the popping off of the items from the two ends of the container. Lines (H) through (L) demonstrate list operations on a deque-these work the same way as for a vector. Finally, line (M) shows how a deque can be sorted using the STL algorithm sort. All of the remarks we made earlier in the chapter concerning this sort function for the case of vectors apply here also.

 
 //DequeFront.cc #include <string> #include <deque> #include <algorithm>         // for sort, find using namespace std; void print( deque<string> ); int main() {     deque<string> animals;                                                  //(A)     animals.push_back( "yak" );                                             //(B)     animals.push_back( "zebra" );                                           //(C)         animals.push_front( "cat" );                                            //(D)     animals.push_front( "canary" );                                         //(E)     print(animals);               // canary cat yak zebra     animals.pop_front();                                                    //(F)     animals.pop_back();                                                     //(G)     print(animals);               // cat yak     //list operations on a deque:     animals.erase(find( animals.begin(), animals.end(), "cat" ));           //(H)     print(animals);               // yak     animals.insert( animals.begin(), "canary" );                            //(I)     print(animals);               // canray yak     int sz = animals.size();      // 2     animals.resize( 5 );          // size() will now return 5               //(J)     animals[sz] = "fox";          // animals[2] = "fox"                     //(K)     animals[sz+1] = "elephant";   // animals[3] = "elephant"     animals[sz+2] = "cat";        // animals[4] = "cat"     print( animals );             // canary yak fox elephant cat     animals.erase( animals.begin() + 2 ); // remove "fox"                   //(L)     print( animals );             // canary yak elephant cat     //sorting a deque:     sort( animals.begin(), animals.end() );                                 //(M)     print( animals );           // canary cat elephant yak     return 0; } void print( deque<string> d ) {     typedef deque<string>::const_iterator CI;                 //(N)     cout << "The number of items in the deque:" << d.size() << endl;      for ( CI iter = d.begin(); iter != d.end(); iter++ )          cout << *iter << " ";     cout << endl << endl; }

A couple of additional features to note about the above program are the use of typedef and const_iterator in line (N). The typedef shown allows us to create an abbreviated and more convenient synonym for an existing type.^[8] A const_iterator, as we will mention in Section 13.2.2 of Chapter 13, does not permit any modification of the element it points to, in much the same way as a const pointer will not allow you to change the object it points to.

5.1.3 List

If your application requires frequent insertions of new data items anywhere-front, back, or anywhere else in the middle-in a sequence and array-like indexing for accessing the data items is not important, then you need to use a list. In contrast with a vector and a deque, a list is stored in the form of a linked list, which makes it efficient to insert new elements anywhere-front, back, or in the middle. But because a linked list is not stored in a contiguous run of the memory, array-like indexing is not provided for accessing list elements.

The following program illustrates some of the more common operations on a list of strings, these being

 push_back      : for attaching a new element at the back pop_back       : for removing the last element in a list remove         : for removing an element anywhere (first occ. only) push_front     : for attaching a new element at the front pop_front      : for removing the first element sort           : for sorting unique         : for removing duplicates splice         : for splicing one list into another merge          : for merging two lists size           : for determining the number of elements begin          : iterator pointing to the first element end            : iterator pointing to one past the last element

The program starts in line (A) by declaring animals to be an empty list<string>. Lines (B) through (F) push string items into the list, the last intentionally a duplicate to later demonstrate the usefulness of the unique operation. Each call to push_back creates a new list node by side effect while returning a void. Line (H) shows how the last item in the list can be popped off by pop_back, whereas line (I) shows how the first occurrence of a designated item can be gotten rid of by value using the remove function. Both these calls return void and eliminate nodes by side effect. Lines (J) and (K) illustrate the push_front and pop_front operations for attaching a new node at the front of the list and for removing the first node of a list. As with push_back and pop_back, these two operations return void and do their job through side effect. Line (L) shows how a new node can be inserted at a specified place in a list.

An STL list has its own sort function, which is then invoked on animals in line (M). The sort function uses by default the meaning of the ‘<' operator for the string type for comparing the data items in the list. There is also a two-argument version of sort which can be supplied with your own comparison criterion. The member function unique invoked in line (N) eliminates any duplicates in a list. By default, it uses the ‘= =' operator to compare its argument with the items in a list to identify any duplicates in the list. A two-argument version of unique allows you to specify a predicate for the uniqueness criterion for class type objects. Both sort and unique return void.

The rest of the program illustrates splicing from one list into another and the merging of two lists. For these demonstrations we create a second list in a block of code starting in line (O). The location where the splicing is to occur on the invoked list is specified by the first argument to splice in line (P). The second argument is the name of the list from which an item is to be taken for splicing. And the last argument is the location in the argument list which points to the data item to be spliced in. Note that the argument list loses the node which was spliced into the invoking list. The merge function shown in line (R) assumes that both the list on which it is invoked and the argument list are sorted. The argument list is merged into the invoked list in such a way that the result remains sorted. The argument list will be empty after a merge.

As evidenced by the invocation of begin() and end() in lines (L) and (P), a list supports iterators. As with other containers, begin() returns an iterator to the first item in a list and end() returns it to just beyond the last item. Again, as with other containers, it is an error to dereference the iterator returned by end().

 
 //ListOps.cc #include <string> #include <list> using namespace std; void print( list<string>& ); int main() {     list<string> animals;                                                   //(A)         animals.push_back( "cheetah" );                                         //(B)     animals.push_back( "lion" );                                            //(C)     animals.push_back( "cat" );                                             //(D)     animals.push_back( "fox" );                                             //(E)     animals.push_back( "elephant" );                                        //(F)     animals.push_back( "cat" );      // duplicate cat                       //(G)          print( animals );                // cheetah lion cat fox                                      // elephant cat      animals.pop_back();                                                    //(H)      print( animals );               // cheetah lion cat fox                                      // elephant      animals.remove( "lion" );       // first occurrence of lion            //(I)      print( animals );               // cheetah cat fox elephant      animals.push_front( "lion" );                                          //(J)      print( animals );               // lion cheetah cat fox elephant      animals.pop_front( );                                                  //(K)      print( animals );               // cheetah cat fox elephant      animals.insert( animals.end(), "cat" );                                //(L)      print( animals );               // cheetah cat fox elephant cat      animals.sort();                                                        //(M)      print( animals );               // cat cat cheetah elephant fox      animals.unique();                                                      //(N)      print( animals );               // cat cheetah elephant fox      //another list needed for demonstrating splicing and merging: list<string> pets;//(O)      pets.push_back( "cat" );      pets.push_back( "dog" );      pets.push_back( "turtle" );      pets.push_back( "bird" );      animals.splice( animals.begin(), pets, pets.begin() );                 //(P)      print( animals );               // cat cat cheetah elephant fox      print( pets );                  // dog turtle bird      pets.sort();                    // bird dog turtle                     //(Q)      animals.merge( pets );                                                 //(R)      cout << pets.empty() << endl;                                           // true                             //(S)      print( animals );               // bird cat cat cheetah                //(T)                                      // dog elephant fox                                      // turtle      return 0; } void print( list<string>& li ) {                       //(U)     typedef list<string>::const_iterator CI;     cout << "The number of items in the list:"     << li.size() << endl;;     for ( CI iter = li.begin(); iter != li.end(); iter++ )         cout << *iter << " ";     cout << endl << endl; }

The reader will also notice that a list<string> object is passed by reference to the print function in line (U)-that is a consequence of declaring the parameter type to be list<string>&. As we will explain in Chapter 9, this is a more efficient way to pass a large object to a function (provided this mode of argument passing is used with care). Object reference in C++ is presented in Chapter 8.

5.1.4 Stack

We will now present the first of the sequence container adapters. These are container classes that are basically the same as the sequence containers we have discussed so far, but with restricted interfaces.

To explain what we mean by a "sequence container with a restricted interface", let's say you need the stack data structure in a computer program. A stack works on the principle of "last in, first out" (LIFO), as shown pictorially in Figure 5.1. Based on the earlier discussion on vectors, it should be obvious that we can use a vector as a stack. A vector provides us with all the necessary operations for a stack: push_back, pop_back, back, and empty-push_back for pushing an item into a stack; pop_back to get rid of the item most recently pushed into a stack; back for retrieving the value of the item at the top of a stack; and, finally, empty for ascertaining whether a stack is empty. But using a vector directly as a stack makes the program somewhat less perspicuous in comparison with a program that uses a container that was called stack.

click to expand
Figure 5.1

Now imagine a stack class that in its internals uses a vector to store the data, but only makes available to the programmer the functions push and pop for doing what the names imply, a function top for retrieving the value of the element at the top of the stack, and empty to figure out if the stack is empty. Someone perusing the program would know immediately as to what logic was being implemented through the stack container, making the program easier to understand and maintain. This would amount to using vector with a restricted interface.

The STL stack container class has been defined in such a way that it can act as a wrapper class for any container that supports the operations push_back, pop_back, back, and either empty or size, if not both. What that means is that if you were to write your own container class equipped with these functions, the STL stack container would be able to adapt to it (hence the name "adapter class").

Shown below is a program that illustrates how an STL stack can be used. Note the constructor:

      stack< string, vector<string> > s;

This constructor invocation says that the stack object s will use vector<string> as the underlying container. If no underlying container is specified, as for example in

      stack< string > s;

the stack uses a deque by default-deque<string> for the example shown. The functions push and pop perform their duties as side effects-both return void. The function push of stack basically calls push_back on the underlying container, and the function pop calls pop_back on the underlying container. The function top returns a value obtained by invoking back on the underlying container.

 
 //StackOps.cc #include <string> #include <stack> #include <vector> using namespace std; int main() {      stack< string, vector<string> > s;      s.push( "me" );      s.push( "to" );      s.push( "talk" );      while ( !s.empty() ) {           cout << s.top() << " ";                     // talk to me           s.pop();      }      return 0; }

5.1.5 Queue

This is the second of the adapter containers. Just as stack can adapt to any underlying container that provided a certain functionality, queue can adapt to any underlying container that supports push_back, pop_front, front, back, size, and empty. The queue itself provides to the user the functions push, pop, front, back, size, and empty. The push of queue invokes push_back of the underlying container and the pop of queue invokes pop_front of the underlying container. The rest of the functions of queue directly invoke functions of the same name on the underlying container. Figure 5.2 captures the control action of a queue.

click to expand
Figure 5.2

The following program shows how one can use the queue container. Note the constructor call

     queue<string> q;

which declares q to be a queue<string> whose underlying container is a deque<string> by default. If you wanted to use some other container, say, a list container as the underlying container, the constructor invocation would become

     queue< string, list<string> > q;

 
 //QueueOps.cc #include <string> #include <queue> using namespace std; int main() {     queue<string> q;     q.push( "roses" );     q.push( "are" );     q.push( "red" );     while ( !q.empty() ) {         cout << q.front() << " ";           // roses are red         q.pop();      }      return 0; }

5.1.6 Priority_Queue

This is the last of the adapter containers. A priority_queue works very much like a queue except that the item that will be popped off next is the one with the highest-priority. So whenever anything is pushed into or popped off a priority_queue, the item with the highest priority is brought to the front of the queue. The priority of an item is usually established on the basis of a programmer-supplied comparison criterion for the queue items. If a comparison criterion is not explicitly supplied, the system will try to use either the ‘<' operator if defined for the items in the queue or, if applicable, the less function object from the functional header file of the C++ Standard Library. Function objects, discussed in Chapter 12, correspond to classes for which the operator ‘()' has been overloaded.

The program shown below stores pair objects in a priority_queue.^[9] Each pair consists of a string and an unsigned int, the latter serving as a priority number for the former. We also define a Prioritize function object by overloading the ‘()' operator. As already stated, the topic of function objects is treated in detail in Chapter 12.

Note the call to the constructor:

    priority_queue< pair< string, unsigned int >,         vector <pair< string, unsigned int > >, Prioritize > pq;

which declares pq to be a priority_queue based on a vector< pair<string, unsigned int> > for its underlying container. Since a priority_queue can adapt to any underlying container that supports front, push_back, pop_back, and empty, one can also use a list or a deque for the underlying container.

 
 //PriorityQueueOps.cc #include <string> #include <queue> using namespace std; class Prioritize { public:      // As explained in Chapter 12, the overloading of the      // '()' operator shown below makes available a function      // object Prioritize() that is subsequenlty supplied as      // the third argument to the priority_queue constructor.      int operator() ( const pair<string, unsigned int>& p1,                       const pair<string, unsigned int>& p2 ) {          return p1.second < p2.second;      } }; int main() {      priority_queue< pair< string, unsigned int >,         vector <pair< string, unsigned int > >, Prioritize >   pq;      pq.push( pair<string, int>( "go to lunch", 2) );      pq.push( pair<string, int>( "go to bathroom", 10 ) );      pq.push( pair<string, int>( "take a nap", 1 ) );      while ( !pq. empty() ) { //(A)          cout << pq.top().first << endl;          pq.pop();      }      return 0; }

The output of the program produced by the while loop in line (A) is

     go to bathroom         go to lunch     take a nap

5.1.7 Map

A C++ map is a sequence of <key, value> pairs in which every key is unique. So a phone book in which every name appeared only once could be represented by a map-each distinct name would serve as a key and the phone-number associated with that name would be the corresponding value. A map can be thought of as a mapping from the keys to the values.

As mentioned at the beginning of Section 5.1, all the <key, value> pairs in a map are stored in a key-sorted ascending order in a binary tree to allow for fast retrieval. The order is maintained as new pairs are added to the container. As one would expect, generating and maintaining a key-sorted order requires that the container have access to a less-than predicate for the key type.^[10] This usually is not a problem for the commonly used case of the string type for the keys-since the operator ‘<' is defined for the string type. But should you decide to use this container with your own class type for the keys, you'd need to make sure that a suitable less-than operation is defined for your class type.^[11]

To illustrate how map is used, let's try to write a C++ program that generates a word histogram for a text file.^[12] First of all, we must include the header file for this container, as shown in line (A) in the program that follows. The container itself is declared as in line (B), which declares hist to be an empty map to begin with. After the creation of this container (and after setting up an input file stream in line (C)), the histogram itself is constructed by exactly two lines of code in lines (D) and (E), repeated here for convenience:

     while ( in >> word )         hist[ word ]++;

The logic that is embedded in the second of these two lines is as follows: For a word just read from the source file, we first evaluate hist[ word ]. If word is not already a key in the histogram, the container would go ahead and create an entry for it, setting its value to 0 of the appropriate kind. In our case, since the value part of each <key, value> pair is supposed to be an integer, the value set for such a word will be the integer 0. After doing all this work, hist[ word ] will return 0, which would then get incremented to 1 by the postfix increment operator. Through this mechanism, for a word that was not already in the histogram, we'd get a count of 1 inserted in the histogram. On the other hand, if the word was already as a key in the histogram, hist[ word ] would return the current value for that key-meaning the current count. The postfix increment operator would then bump up the count by one.

Lines (H) and (I) print out the contents of the map container. We scan through the sequence of <key, value> pairs in the container and then output the keys and the values in the form of a table. The scanning is accomplished by first defining a local typedef for an iterator to the map container in line (G) and then setting up a for loop for printing out the keys and the corresponding values through the pair data members first and second in line (I). Initializing the iterator and setting up a terminating condition for the loop are very much like what we saw earlier for vectors. Here is the program:

 
 //MapHist.cc #include <string> #include <map>                                                              //(A) #include <fstream> using namespace std; int main() {     map<string, int> hist;                                                  //(B)     ifstream in( "inFile" );                                                //(C)     string word;     while ( in >> word )                                                    //(D)     hist[ word ]++;                                                         //(E)     in.close();                                                             //(F)     typedef map<string, int>::const_iterator CI;                            //(G)     for ( CI iter = hist.begin(); iter != hist.end(); ++iter )              //(H)         cout << iter->first << ‘\t' << iter->second << endl;                //(I)     return 0; }

The default choice of the ‘<' operator as a criterion for key-ordering the elements of a map is not always appropriate. Consider the case when the keys are of type char*. Strings of type char* should not be compared with the ‘<' operator; instead, they should be compared using the strcmp function. When ‘<' is not suitable, the programmer must provide an alternative ordering criterion in the form of a function object (see Section 12.10 of Chapter 12). Here is Stroustrup's example of a function object that invokes the strcmp function:

     class Cstring_less {     public:          bool operator() ( const char* p, const char* q ) const {             return strcmp( p, q ) < 0;           }      };

We can now declare a map for holding <C_style_string, value> pairs by

     map< char*, int, Cstring_less > assoc_list;

For this container, the Cstring_less function object will be used to build a tree representation from the items in the container. When the user defines a comparison function in the form of a function object, all other relational operators are derived from that comparison function. To illustrate, if a user-defined comparison function is denoted cmp(x, y), the following predicate could then be used for testing for an equivalence:

     !cmp(x,y) && !cmp(y,x)

This implies that equivalent keys need not be equal in a map container. For example, as stated by Stroustrup [54], an associative container that uses case-insensitive comparison as its comparison criterion will consider the strings "Last," "lAst," "lAst," "laSt," and "lasT" equivalent, even though the operator ‘= =' for strings considers them different.

Finally, if it is not necessary for the <key, value> pairs to be stored in a key-sorted order, you may obtain better performance by using a hash-table version of the mapping container called hash_map if the hashing function used is effective for the keys.

5.1.8 Set

Each object can appear only once in a set-no duplicate elements allowed. For example, the names of all of your friends would constitute a set-assuming that the same name was not shared by two or more friends.

The following example illustrates some of the basic functionality of a set. In line (A), we include the header file for this container. Line (B) declares animals as a set of strings. Lines (C) through (F) illustrate how new elements can be inserted into a set. In line (G), we try to insert the object "cat" into the set a second time, but as the output of the print loop in line (I) shows, the set retains only one "cat". Line (J) illustrates the use of erase() to remove a set element. Lines (H) and (K) show the use of the size () function to determine the number of elements in a set. The iteration loops in lines (I) and (L) show that iterators work for a set the same way as for the other containers we have already seen.

 
 //SetOps.cc #include <string> #include <set>                                                             //(A) using namespace std; int main() {     set<string> animals;                                                   //(B)     animals.insert( "cheetah" );                                           //(C)     animals.insert( "lion" );                                              //(D)     animals.insert( "cat" );                                               //(E)     animals.insert( "elephant" );                                          //(F)     animals.insert( "cat" );                    //attempting a duplicate   //(G)     cout << animals.size() << endl;;            // 4                       //(H)     typedef set<string>::const_iterator CI;     for (CI iter = animals.begin();                                        //(I)           iter != animals.end();           iter++)        cout << *iter << " "; // cat cheetah elephant lion      animals.erase( "lion" );                                               //(J)      cout << animals.size() << endl;;// 3                                   //(K)      for ( CI iter = animals.begin();                                       //(L)           iter != animals.end();           iter++ )      cout << *iter << " "; // cat cheetah elephant      return 0; }

5.1.9 Generic Algorithms

In addition to the containers, STL also provides us with algorithms that can be used with the container classes for searching, sorting, counting, merging, filling, comparing, swapping, deleting, partitioning, and much more.

The most significant thing to note about the algorithms is that they only require iterators for their arguments and that they do not need to know about the containers directly. As an example, suppose you initialize two iterators by having them point to two chosen locations in a container; if you supply those iterators to an STL sort algorithm, it would sort the elements between the two iterators. You would not need to tell the sort function directly about the container. This would work even if the container class was created by you, as long as it satisfied certain iterator requirements. For this reason, the STL algorithms are also known as generic algorithms. Whether or not a certain algorithm will work with a given container depends on whether or not the container supports the iterator types needed by the algorithm.

There are 60 different algorithms. Of these, 23 are nonmutating algorithms because they do not alter the contents of a container. The remaining 37 are mutating. Examples of nonmutating algorithms are min_element, which returns an iterator that points to the minimum element in a sequence; find, which returns an iterator that points to the first occurrence of a value in a sequence; binary_search, which performs a binary search for a given value in an ordered container, and so on. Examples of mutating algorithms are fill for assigning values to specified elements in a sequence; sort for in-place sorting of the elements within a specified range in a container, and so on.

With regard to the sorting algorithms provided by STL, although the reader has already seen examples of sorting in some of the example code we have shown in this section, Chapter 12 will present in greater detail how the contents of a C++ container can be sorted. For the sorting of class type objects, there are basically two approaches and they both require operator overloading:

You can overload the ‘<' operator for the element type. By default, the generic sort algorithm uses the overloaded definition of this operator for ordering the elements. But, as we already mentioned earlier in this chapter, the default use of the ‘<' operator is not always appropriate for sorting.
One can define what is known as a function object that is then supplied as an argument to the sorting algorithm. Through the overload definition supplied for the ‘()' operator, the function object tells the sorter how to compare two class type objects.

Chapter 12 will discuss these alternatives in detail.

^[2]An ideal hashing function will allocate a unique memory address to every single key in an associative container like map. When that can be done, a key and its associated value can be retrieved in constant time (meaning in time that is independent of the number of <key, value> pairs in the container) since one would not have to search through the keys. When this ideal condition cannot be satisfied, two or more keys may get mapped to the same hash code. When that happens, only one such <key, value> pair will be assigned the address corresponding to that hash code. But stored along with the chosen <key, value> pair will be pointers to the other pairs whose keys got assigned the same hash code.

^[3]Therefore, it makes no sense to write functions that have pointers to vectors or to vector elements. Functions that cause a vector to get resized explicitly or implicitly are capable of moving an entire vector to a new location in the memory. These include resize(), which is an explicit call for resizing a vector, and functions like like push_back() and insert(), which resize a vector implicitly.

^[4]The syntax used here is that for a template class, a concept that was introduced in Chapter 3 but is presented more fully in Chapter 13.

^[5]Strictly speaking, the functions front() and back() return references to the first and the last elements. Object reference in C++ is presented in Chapter 8.

^[6]This copy is made using either a programmer-defined copy constructor or a system-supplied default copy constructor for the class. Copy constructors are discussed in Chapter 11.

^[7]More accurately speaking, it is the copies of the three objects that are pushed into the container.

^[8]We could also have used a macro definition to create a shorthand for a long type name by replacing line (N) with

     #define CI deque<string>::const_iterator

but a typedef is more appropriate here because a typedef name obeys the same scoping rules as ordinary variables. A typedef name defined inside a function would not be visible outside the function. Besides, as we will mention in Section 7.12 of Chapter 7, the use of macros is discouraged in C++.

^[9]The pair class, actually a struct, is convenient for encapsulating a pair of different types that belong together because they stand for two mutually related attributes. It is defined in the utility header file of the C++ Standard Library.

^[10]This would ordinarily be provided by overloading the ‘<' operator for the data type of the keys Overloading of C++ operators is discussed in Chapter 12.

^[11]If the overload definition of the ‘<' is not suitable for the kind of sorting that is desired with a map container, it is also possible to invoke the map constructor with a separate comparison function.

^[12]Based on an example provided by Stroustrup [54].