Regex


Header: "boost/regex.hpp"

A regular expression is encapsulated in an object of type basic_regex. We will look closer at the options for how regular expressions are compiled and parsed in subsequent sections, but let's first take a cursory look at basic_regex and the three important algorithms that are the bulk of this library.

    namespace boost {
      template <class charT,
                class traits=regex_traits<charT> >
      class basic_regex {
      public:
        explicit basic_regex(
          const charT* p,
          flag_type f=regex_constants::normal);
        bool empty() const;
        unsigned mark_count() const;
        flag_type flags() const;
      };

      typedef basic_regex<char> regex;
      typedef basic_regex<wchar_t> wregex;
    }

Members

    explicit basic_regex(
      const charT* p,
      flag_type f=regex_constants::normal);

This constructor accepts a character sequence that contains the regular expression, and an argument denoting which options to use for the regular expression (for example, whether it should ignore case). If the regular expression in p isn't valid, an exception of type bad_expression, or regex_error, is thrown. Note that these two names refer to the same exception; at the time of this writing, the rename from bad_expression has not yet been made, but the next version of Boost.Regex will change it to regex_error.
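
The behavior on an invalid pattern can be sketched as follows. To compile without Boost installed, this sketch uses std::regex from <regex>, the standardized descendant of Boost.Regex; the assumption is that the Boost version looks the same with boost:: names and "boost/regex.hpp". The helper is_valid_pattern is a hypothetical name, not part of either library.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not a library function): reports whether
// `pattern` compiles as a regular expression.
bool is_valid_pattern(const std::string& pattern) {
    try {
        std::regex re(pattern);  // throws std::regex_error if the pattern is invalid
        (void)re;                // the object is only built to test the pattern
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}
```

Calling is_valid_pattern("(unclosed") reports failure because the opening parenthesis is never closed, whereas a pattern such as "\\d{3}" compiles.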

 bool empty() const;  

This member is a predicate that returns true if the instance of basic_regex does not contain a valid regular expression; that is, it has been assigned an empty character sequence.

 unsigned mark_count() const;  

mark_count returns the number of marked subexpressions in the regex. A marked subexpression is a part of the regular expression enclosed within parentheses. The text that matches a subexpression can be retrieved after calling one of the regular expression algorithms.
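
As a small illustration of counting marked subexpressions, the sketch below uses std::regex (assumed here as a stand-in for boost::regex so that it compiles without Boost; both expose a mark_count member). The helper count_groups is hypothetical.

```cpp
#include <regex>

// Hypothetical helper: returns the number of marked (parenthesized)
// subexpressions in `pattern`.
unsigned count_groups(const char* pattern) {
    std::regex re(pattern);
    return re.mark_count();
}
```

With a date pattern such as "(\\d{4})-(\\d{2})-(\\d{2})" there are three pairs of parentheses, so three marked subexpressions are reported; a pattern without parentheses has none.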

 flag_type flags() const; 

Returns a bitmask containing the option flags that are set for this basic_regex. Examples of flags are icase, which means that the regular expression is ignoring case, and JavaScript, indicating that the syntax for the regex is the one used in JavaScript.
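
A minimal sketch of inspecting the option flags, again using std::regex as an assumed stand-in for boost::regex (the flag lives in std::regex_constants here, and in boost::regex_constants in Boost). The helper ignores_case is a hypothetical name.

```cpp
#include <regex>

// Hypothetical helper: true if `re` was compiled with the icase option.
bool ignores_case(const std::regex& re) {
    // flags() returns a bitmask; test the icase bit explicitly.
    return (re.flags() & std::regex_constants::icase)
           == std::regex_constants::icase;
}
```

A regex constructed as std::regex("abc", std::regex_constants::icase) has the bit set, while one constructed with the default options does not.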

    typedef basic_regex<char> regex;
    typedef basic_regex<wchar_t> wregex;

Rather than declaring variables of type basic_regex, you'll typically use one of these two typedefs. These two, regex and wregex, are shorthands for the two character types, similar to how string and wstring are shorthands for basic_string<char> and basic_string<wchar_t>. This similarity is no coincidence, as a regex is, in a way, a container for a special type of string.

Free Functions

    template <class charT, class Allocator, class traits>
      bool regex_match(
        const charT* str,
        match_results<const charT*,Allocator>& m,
        const basic_regex<charT,traits>& e,
        match_flag_type flags = match_default);

regex_match determines whether a regular expression (the argument e) matches the whole character sequence str. It is mainly used for validating text. Note that the regular expression must match everything in the parsed sequence, or the function returns false. If the sequence is successfully matched, regex_match returns true.
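
The validation use case might look like the sketch below. It uses std::regex_match as an assumed stand-in for boost::regex_match so that it compiles without Boost; the validator is_phone_number and its NNN-NNNN format are hypothetical, chosen only for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical validator: accepts only strings of the form NNN-NNNN.
bool is_phone_number(const std::string& text) {
    static const std::regex re("\\d{3}-\\d{4}");
    // regex_match succeeds only if the entire string matches the pattern.
    return std::regex_match(text, re);
}
```

Because the whole input must match, "555-1234" is accepted, but "call 555-1234" is rejected even though it contains a valid number.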

    template <class charT, class Allocator, class traits>
      bool regex_search(
        const charT* str,
        match_results<const charT*,Allocator>& m,
        const basic_regex<charT,traits>& e,
        match_flag_type flags = match_default);

regex_search is similar to regex_match, but it does not require that the whole character sequence be matched for success. You use regex_search to find a sub-sequence of the input that matches the regular expression e.
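
Finding a sub-sequence and reading back a marked subexpression can be sketched like this. std::regex_search and std::smatch are used as assumed stand-ins for boost::regex_search and boost::smatch; the helper first_number is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first run of digits in `text`,
// or an empty string if there is none.
std::string first_number(const std::string& text) {
    static const std::regex re("(\\d+)");
    std::smatch m;  // match_results specialized for std::string iterators
    if (std::regex_search(text, m, re)) {
        return m[1].str();  // m[0] is the whole match, m[1] the first subexpression
    }
    return std::string();
}
```

The search stops at the first match, so for "order 42 shipped, 7 pending" only the 42 is returned.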

    template <class traits, class charT>
      basic_string<charT> regex_replace(
        const basic_string<charT>& s,
        const basic_regex<charT,traits>& e,
        const basic_string<charT>& fmt,
        match_flag_type flags = match_default);

regex_replace searches through a character sequence for all matches of the regular expression e. Every time the algorithm makes a successful match, it formats the matched string according to the argument fmt. By default, any text that is not matched is unchanged; that is, the text is part of the output but is not altered.
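
A short sketch of formatting each match with fmt follows. It uses std::regex_replace as an assumed stand-in for boost::regex_replace, and relies on $1, $2, and $3 in the format string referring to the marked subexpressions. The helper reorder_dates is a hypothetical name.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: rewrites every YYYY-MM-DD date in `text`
// as DD/MM/YYYY and leaves all other text untouched.
std::string reorder_dates(const std::string& text) {
    static const std::regex re("(\\d{4})-(\\d{2})-(\\d{2})");
    // $1, $2, $3 stand for the three marked subexpressions of each match.
    return std::regex_replace(text, re, std::string("$3/$2/$1"));
}
```

Text outside the matches is copied to the output as is, so only the date portion of the input changes.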

There are several overloads for all three of these algorithms: one accepting a const charT* (charT is the character type), another accepting a const basic_string<charT>&, and one overload that takes two bidirectional iterators as input arguments.
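
The three input forms can be compared side by side in a sketch like the one below, shown for regex_match only and using std::regex as an assumed stand-in for boost::regex. The helper count_matching_overloads is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: runs the same match through the C-string,
// std::string, and iterator-pair overloads and counts the successes.
int count_matching_overloads(const char* text, const std::regex& re) {
    const std::string s(text);
    int hits = 0;
    if (std::regex_match(text, re)) ++hits;                // const char*
    if (std::regex_match(s, re)) ++hits;                   // const std::string&
    if (std::regex_match(s.begin(), s.end(), re)) ++hits;  // iterator pair
    return hits;
}
```

All three forms describe the same character sequence, so they should agree: either every overload matches or none does.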



    Beyond the C++ Standard Library: An Introduction to Boost
    ISBN: 0321133544
    Year: 2006
    Pages: 125
