Section 11.4. Structure of Generic Algorithms

11.4. Structure of Generic Algorithms

Just as there is a consistent design pattern behind the containers, there is a common design underlying the algorithms. Understanding the design behind the library makes it easier to learn and easier to use the algorithms. Because there are more than 100 algorithms, it is much better to understand their structure than to memorize the whole list of algorithms.

The most fundamental property of any algorithm is the kind(s) of iterators it expects. Each algorithm specifies for each of its iterator parameters what kind of iterator can be supplied. If a parameter must be a random-access iterator, then we can provide an iterator for a vector or a deque, or we can supply a pointer into an array. Iterators on the other containers cannot be used with such algorithms.

A second way is to classify the algorithms is as we did in the beginning of this chapter. We can categorize them by what actions they take on the elements:

Some are read-only and leave element values and ordering unchanged.
Others assign new values to specific elements.
Others move values from one element to another.

As we'll see in the remainder of this section, there are two additional patterns to the algorithms: One pattern is defined by the parameters the algorithms take; the other is defined by two function naming and overloading conventions.

11.4.1. Algorithm Parameter Patterns

Superimposed on any other classification of the algorithms is a set of parameter conventions. Understanding these parameter conventions can aid in learning new algorithmsby knowing what the parameters mean, you can concentrate on understanding the operation the algorithm performs. Most of algorithms take one of the following four forms:

      alg (beg, end, other parms);      alg (beg, end, dest, other parms);      alg (beg, end, beg2, other parms);      alg (beg, end, beg2, end2, other parms);

where alg is the name of the algorithm, and beg and end denote the range of elements on which the algorithm operates. We typically refer to this range as the "input range" of the algorithm. Although nearly all algorithms take an input range, the presence of the other parameters depends on the work being performed. The common ones listed heredest, beg2 and end2are all iterators. When used, these iterators fill similar roles. In addition to these iterator parameters, some algorithms take additional, noniterator parameters that are algorithm-specific.

Algorithms with a Single Destination Iterator

A dest parameter is an iterator that denotes a destination used to hold the output. Algorithms assume that it is safe to write as many elements as needed.

When calling these algorithms, it is essential to ensure that the output container is sufficiently large to hold the output, which is why they are frequently called with insert iterators or ostream_iterators. If we call these algorithms with a container iterator, the algorithm assumes there are as many elements as needed in that container.

If dest is an iterator on a container, then the algorithm writes its output to existing elements within the container. More commonly, dest is bound to an insert iterator (Section 11.3.1, p. 406) or an ostream_iterator. An insert iterator adds elements to the container, ensuring that there is enough space. An ostream_iterator writes to an output stream, again presenting no problem regardless of how many elements are written.

Algorithms with a Second Input Sequence

Algorithms that take either beg2 alone or beg2 and end2 use these iterators to denote a second input range. These algorithms typically use the elements from the second range in combination with the input range to perform a computation. When an algorithm takes both beg2 and end2, these iterators are used to denote the entire second range. That is, the algorithm takes two completely specified ranges: the input range denoted by beg and end, and a second input range denoted by beg2 and end2.

Algorithms that take beg2 but not end2 treat beg2 as the first element in the second input range. The end of this range is not specified. Instead, these algorithms assume that the range starting at beg2 is at least as large as the one denoted by beg, end.

As with algorithms that write to dest, algorithms that take beg2 alone assume that the sequence beginning at beg2 is as large as the range denoted by beg and end.

11.4.2. Algorithm Naming Conventions

The library uses a set of consistent naming and overload conventions that can simplify learning the library. There are two important patterns. The first involves algorithms that test the elements in the input range, and the second applies to those that reorder elements within the input range.

Distinguishing Versions that Take a Value or a Predicate

Many algorithms operate by testing elements in their input range. These algorithms typically use one of the standard relational operators, either == or <. Most of the algorithms provide a second version that allows the programmer to override the use of the operator and instead to supply a comparison or test function.

Algorithms that reorder the container elements use the < operator. These algorithms define a second, overloaded version that takes an additional parameter representing a different operation to use to order the elements:

      sort (beg, end);         // use < operator to sort the elements      sort (beg, end, comp);   // use function named comp to sort the elements

Algorithms that test for a specific value use the == operator by default. These algorithms provide a second namednot overloadedversion with a parameter that is a predicate (Section 11.2.3, p. 402). Algorithms that take a predicate have the suffix _if appended:

      find(beg, end, val);       // find first instance of val in the input range      find_if(beg, end, pred);   // find first instance for which pred is true

These algorithms both find the first instance of a specific element in the input range. The find algorithm looks for a specific value; the find_if algorithm looks for a value for which pred returns a nonzero value.

The reason these algorithms provide a named version rather than an over-loaded one is that both versions take the same number of parameters. In the case of the ordering algorithms, it is easy to disambiguate a call based solely on the number of parameters. In the case of algorithms that look for a specific element, the number of parameters is the same whether testing for a value or testing a predicate. Overloading ambiguities (Section 7.8.2, p. 269) would therefore be possible, albeit rare, and so the library provides two named versions for these algorithms rather than relying on overloading.

Distinguishing Versions that Copy from Those that Do Not

Independently of whether an algorithm tests its elements, the algorithm may re-arrange elements within the input range. By default, such algorithms write the rearranged elements back into their input range. These algorithms also provide a second, named version that writes to a specified output destination. These algorithms append _copy to their names:

      reverse(beg, end);      reverse_copy(beg, end, dest);

The reverse function does what its name implies: It reverses the order of the elements in the input sequence. The first version reverses the elements in the input sequence itself. The second version, reverse_copy, makes a copy of the elements, placing them in reverse order in the sequence that begins at dest.