Section 10.4. The set Type

10.4. The `set` Type

A map is a collection of a keyvalue pairs, such as an address and phone number keyed to an individual's name. In contrast, a set is simply a collection of keys. For example, a business might define a set named bad_checks, to hold the names of individuals who have issued bad checks to the company. A set is most useful when we simply want to know whether a value is present. Before accepting a check, for example, that business would query bad_checks to see whether the customer's name was present.

With two exceptions, set supports the same operations as map:

All the common container operations listed in Section 10.2 (p. 358).
The constructors described in Table 10.3 (p. 360).
The insert operations described in Table 10.5 (p. 365).
The count and find operations described in Table 10.6 (p. 367).
The erase operations described in Table 10.7 (p. 369).

The exceptions are that set does not provide a subscript operator and does not define mapped_type. In a set, the value_type is not a pair; instead it and key_type are the same type. They are each the type of the elements stored in the set. These differences reflect the fact that set holds only keys; there is no value associated with the key. As with map, the keys of a set must be unique and may not be changed.

Exercises Section 10.4

Exercise 10.21:
Explain the difference between a map and a set. When might you use one or the other?

Exercise 10.22:
Explain the difference between a set and a list. When might you use one or the other?

10.4.1. Defining and Using `set`s

To use a set, we must include the set header. The operations on sets are essentially identical to those on maps.

As with map, there can be only one element with a given key in a set. When we initialize a set from a range of elements or insert a range of elements, only one element with a given key is actually added:

      // define a vector with 20 elements, holding two copies of each number from 0 to 9      vector<int> ivec;      for (vector<int>::size_type i = 0; i != 10; ++i) {          ivec.push_back(i);          ivec.push_back(i); // duplicate copies of each number      }      // iset holds unique elements from ivec      set<int> iset(ivec.begin(), ivec.end());      cout << ivec.size() << endl;      // prints 20      cout << iset.size() << endl;      // prints 10

We first create a vector of ints named ivec that has 20 elements: two copies of each of the integers from 0 through 9 inclusive. We then use all the elements from ivec to initialize a set of ints. That set has only ten elements: one for each distinct element in ivec.

Adding Elements to a `set`

We can add elements to a set by using the insert operation:

      set<string> set1;         // empty set      set1.insert("the");       // set1 now has one element      set1.insert("and");       // set1 now has two elements

Alternatively, we can insert a range of elements by providing a pair of iterators to insert. This version of insert works similarly to the constructor that takes an iterator paironly one element with a given key is inserted:

      set<int>    iset2; //    empty set      iset2.insert(ivec.begin(), ivec.end());     // iset2 has 10 elements

Like the map operations, the version of insert that takes a key returns a pair containing an iterator to the element with this key and a bool indicating whether the element was added. The one that takes an iterator pair returns void.

Fetching an Element from a `set`

There is no subscript operator on sets. To fetch an element from a set by its key, we use the find operation. If we just want to know whether the element is present, we could also use count, which returns the number of elements in the set with a given key. Of course, for set that value can be only one (if the element is present) or zero (if it is not):

      iset.find(1)     // returns iterator that refers to the element with key == 1      iset.find(11)    // returns iterator == iset.end()      iset.count(1)    // returns 1      iset.count(11)   // returns 0

Just as we cannot change the key part of a map element, the keys in a set are also const. If we have an iterator to an element of the set, all we can do is read it; we cannot write through it:

      // set_it refers to the element with key == 1      set<int>::iterator set_it = iset.find(1);      *set_it = 11;               // error: keys in a set are read-only      cout << *set_it << endl;    // ok: can read the key

10.4.2. Building a Word-Exclusion Set

On page 369 we removed a given word from our word_count map. We might want to extend this approach to remove all the words in a specified file. That is, our word-count program should count only words that are not in a set of excluded words. Using set and map, this program is fairly straightforward:

      void restricted_wc(ifstream &remove_file,                         map<string, int> &word_count)      {          set<string> excluded; // set to hold words we'll ignore          string remove_word;          while (remove_file >> remove_word)              excluded.insert(remove_word);          // read input and keep a count for words that aren't in the exclusion set          string word;          while (cin >> word)             // increment counter only if the word is not in excluded             if (!excluded.count(word))                  ++word_count[word];      }

This program is similar to the word-count program on page 363. The difference is that we do not bother to count the common words.

The function starts by reading the file it was passed. That file contains the list of excluded words, which we store in the set named excluded. When the first while completes, that set contains an entry for each word in the input file.

The next part of the program looks a lot like our original word-count program. The important difference is that before counting each word, we check whether the word is in the exclusion set. We do this check in the if inside the second while:

      // increment counter only if the word is not in excluded      if (!excluded.count(word))

The call to count returns one if word is in excluded and zero otherwise. We negate the return from count so that the test succeeds if word is not in excluded. If word is not in excluded, we update its value in the map.

As in the previous version of our word count program, we rely on the fact that subscripting a map inserts an element if the key is not already in the map. Hence, the effect of

      ++word_count[word];

is to insert word into word_count if it wasn't already there. If the element is inserted, its value is initially set to 0. Regardless of whether the element had to be created, the value is then incremented.

Exercises Section 10.4.2

Exercise 10.23:
Write a program that stores the excluded words in a vector instead of in a set. What are the advantages to using a set?

Exercise 10.24:
Write a program that generates the non-plural version of a word by stripping the 's' off the end of the word. Build an exclusion set to recognize words in which the trailing 's' should not be removed. Two examples of words to place in this set are success, class. Use this exclusion set to write a program that strips plural suffixes from its input but leaves words in the exclusion set unchanged.

Exercise 10.25:
Define a vector of books you'd like to read within the next six months and a set of titles that you've read. Write a program that chooses a next book for you to read from the vector, provided that you have not yet read it. When it returns the selected title to you, it should enter the title in the set of books read. If in fact you end up putting the book aside, provide support for removing the title from the set of books read. At the end of our virtual six months, print the set of books read and those books that were not read.

10.4. The set Type