Finding Things in Strings

Problem

You want to search a string for something. Maybe it's a single character, another string, or one of (or not of) an unordered set of characters. And, for your own reasons, you have to find it in a particular way, such as the first or last occurrence, or the first or last occurrence relative to a particular index.

Solution

Use one of basic_string's "find" member functions. Almost all start with the word "find," and their name gives you a pretty good idea of what they do. Example 4-15 shows how some of the find member functions work.

Example 4-15. Searching strings

#include <string>
#include <iostream>

int main( ) {
   std::string s = "Charles Darwin";

   std::cout << s.find("ar") << '\n';            // Search from the
                                                 // beginning
   std::cout << s.rfind("ar") << '\n';           // Search from the end

   std::cout << s.find_first_of("swi")           // Find the first of
             << '\n';                            // any of these chars

   std::cout << s.find_first_not_of("Charles")   // Find the first
             << '\n';                            // that's not in this
                                                 // set

   std::cout << s.find_last_of("abg") << '\n';   // Find the first of
                                                 // any of these chars
                                                 // starting from the
                                                 // end

   std::cout << s.find_last_not_of("aDinrw")     // Find the first
             << '\n';                            // that's not in this
                                                 // set, starting from
                                                 // the end
}

Each of the find member functions is discussed in more detail in the "Discussion" section.

Discussion

There are six different find member functions for finding things in strings, each of which provides four overloads. The overloads accept a basic_string, a charT* (charT is the character type), or a single charT. Each has a basic_string::size_type parameter pos that lets you specify the index where the search should begin, and there is one charT* overload with a size_type parameter n that uses only the first n characters of the array pointed to as the search string or character set.

It's hard to keep track of all of these member functions, so Table 4-2 gives a quick reference of each function and its parameters.

Table 4-2. Member functions for searching strings

Member function

Description

size_type find (const basic_string& str,
 size_type pos = 0) const;
size_type find (const charT* s,
 size_type pos,
 size_type n) const;
size_type find (const charT* s,
 size_type pos = 0) const;
size_type find (charT c,
 size_type pos = 0) const;

Returns the index of the first instance of a character or substring, starting at the beginning or the index indicated by the pos parameter. If n is specified, then only the first n characters of s are used as the search string.

size_type rfind ( ... )

Find the last instance of a character or substring that begins at or before pos (by default, the end of the string). In other words, do the same thing as find, but search backward from the end of the string.

size_type find_first_of ( ... )

Find the first occurrence of any of the characters in the set that is provided as a basic_string or character pointer. If n is specified, then only the first n characters in the set are considered.

size_type find_last_of ( ... )

Find the last occurrence of any of the characters in the set that is provided as a basic_string or character pointer. If n is specified, then only the first n characters in the set are considered.

size_type find_first_not_of ( ... )

Find the first occurrence of a character that is not one of the characters in the set that is provided as a basic_string or character pointer. If n is specified, then only the first n characters in the set are considered.

size_type find_last_not_of ( ... )

Find the last occurrence of a character that is not one of the characters in the set that is provided as a basic_string or character pointer. If n is specified, then only the first n characters in the set are considered.

All of these member functions return the index of the occurrence of what you are looking for as a value of type basic_string::size_type. If the search fails, they return basic_string::npos, a special value that indicates search failure. npos is defined as -1 converted to size_type, which, because size_type is unsigned, is the largest value the type can hold. Rather than comparing against -1 or some other literal, test for equality with npos; this keeps the comparison correct whatever the width of size_type and makes your intent clear, since by comparing to npos you are explicitly checking for search failure and not some magic number.

With this variety of searching algorithms, you should be able to find what you're looking for, and if not, to use them in your own algorithms. If basic_string doesn't provide what you need, however, look in <algorithm> before you roll your own. The standard algorithms operate on sequences by using iterators and, nearly as often, function objects. Conveniently, basic_strings provide iterators for easy traversal, so it is trivial to plug string iterators into standard algorithms. Say you want to find the first occurrence of the same character twice in a row. You can use the adjacent_find function template to find two equal, adjacent elements in a string ("adjacent" means that their positions differ by one iterator, i.e., that *iter == *(iter + 1)).

std::string s = "There was a group named Kiss in the 70s";

std::string::iterator p =
 std::adjacent_find(s.begin( ), s.end( ));

The result is an iterator that points to the first of the adjacent elements, or s.end( ) if the string contains no such pair.

If you have to write your own algorithm for operating on strings, don't use a basic_string like you would a C-style string by using operator[] to get at each item. Take advantage of the existing member functions. Each of the find functions takes a size_type parameter that indicates the index where the search should proceed from. Using the find functions repeatedly, you can advance through the string as you see fit. Consider Example 4-16, which counts the number of unique characters in a string.

Example 4-16. Counting unique characters

#include <string>
#include <iostream>

template<typename T>
int countUnique(const std::basic_string<T>& s) {
   using std::basic_string;

   basic_string<T> chars;   // Each distinct character seen so far

   for (typename basic_string<T>::const_iterator p = s.begin( );
        p != s.end( ); ++p) {
      if (chars.find(*p) == basic_string<T>::npos)
         chars += *p;       // First time this character appears
   }
   return(static_cast<int>(chars.length( )));
}

int main( ) {
   std::string s = "Abracadabra";

   std::cout << countUnique(s) << '\n';
}

The find functions come in handy quite often. Keep them at the top of the list when you have to find things in strings.

C++ Cookbook