Autocorrect Text as a Buffer Changes

Problem

You have a class that represents some kind of text field or document, and as text is appended to it, you want to automatically correct misspelled words the way Microsoft Word's Autocorrect feature does.

Solution

Using a map, defined in <map>, strings, and a variety of standard library features, you can implement this with relatively little code. Example 4-31 shows how to do it.

Example 4-31. Autocorrect text

#include <iostream>
#include <string>
#include <cctype>
#include <map>

using namespace std;

typedef map<string, string> StrStrMap;

// Class for holding text fields
class TextAutoField {

public:
   TextAutoField(StrStrMap* const p) : pDict_(p) {}
   ~TextAutoField( ) {}

   void append(char c);
   void getText(string& s) {s = buf_;}

private:
   TextAutoField( );         // Not implemented: a field needs a dictionary
   string buf_;
   StrStrMap* const pDict_;
};

// Append with autocorrect
void TextAutoField::append(char c) {

   // The <cctype> functions need a value representable as unsigned char
   const unsigned char uc = static_cast<unsigned char>(c);

   if ((isspace(uc) || ispunct(uc)) &&      // Only do the auto-
       buf_.length( ) > 0 &&                // correct when ws or
       !isspace(static_cast<unsigned char>( // punct is entered
          buf_[buf_.length( ) - 1]))) {

      // Find where the last word starts (just past the last whitespace)
      string::size_type i = buf_.find_last_of(" \f\n\r\t\v");
      i = (i == string::npos) ? 0 : i + 1;

      string tmp = buf_.substr(i, buf_.length( ) - i);
      StrStrMap::const_iterator p = pDict_->find(tmp);

      if (p != pDict_->end( )) {            // Found it, so erase
         buf_.erase(i, buf_.length( ) - i); // and replace
         buf_ += p->second;
      }
   }
   buf_ += c;
}

int main( ) {

   // Set up the map
   StrStrMap dict;
   TextAutoField txt(&dict);

   dict["taht"] = "that";
   dict["right"] = "wrong";
   dict["bug"] = "feature";

   string tmp = "He's right, taht's a bug.";
   cout << "Original: " << tmp << '\n';

   for (string::iterator p = tmp.begin( );
        p != tmp.end( ); ++p) {
      txt.append(*p);
   }

   txt.getText(tmp);

   cout << "Corrected version is: " << tmp << '\n';
}

The output of Example 4-31 is:

Original: He's right, taht's a bug.
Corrected version is: He's wrong, that's a feature.

 

Discussion

strings and maps are handy for situations when you have to keep track of string associations. TextAutoField is a simple text buffer that uses a string to hold its data. What makes TextAutoField interesting is its append method, which "listens" for whitespace or punctuation, and does some processing when either one occurs.

To make this autocorrect behavior a reality, you need two things. First, you need a dictionary of sorts that contains the common misspelling of a word and the associated correct spelling. A map stores key-value pairs, where the key and value can be of almost any type (with the default comparison, the key type must support operator<), so it's an ideal candidate. At the top of Example 4-31, there is a typedef for a map of string pairs:

typedef map<string, string> StrStrMap;

See Recipe 4.18 for a more detailed explanation of maps. TextAutoField stores a pointer to the map, because most likely you would want a single dictionary for use by all fields.
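
To illustrate why a pointer to a shared map is convenient, here is a minimal sketch. FieldSketch and hasCorrection are hypothetical names invented for this illustration and are not part of Example 4-31. The sketch assumes the map is created before the fields and outlives them.

```cpp
#include <map>
#include <string>

typedef std::map<std::string, std::string> StrStrMap;

// Hypothetical stand-in for a text field; it only stores the dictionary
// pointer, the way TextAutoField does.
class FieldSketch {
public:
    explicit FieldSketch(const StrStrMap* p) : pDict_(p) {}

    // True if the shared dictionary has a correction for word
    bool hasCorrection(const std::string& word) const {
        return pDict_->count(word) != 0;
    }

private:
    const StrStrMap* pDict_;   // not owned; the map must outlive the field
};
```

If two FieldSketch objects are constructed with the address of the same map, an entry added to that map afterward is visible to both of them, because neither holds its own copy. The cost of this design is the lifetime requirement noted in the comment.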

Assuming client code puts something meaningful in the map, append just has to periodically do lookups in the map. In Example 4-31, append waits for whitespace or punctuation to do its magic. You can test a character for whitespace with isspace, or for punctuation by using ispunct, both of which are defined in <cctype> for narrow characters (take a look at Table 4-3).
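
The character test can be sketched in isolation. isWordBoundary is a hypothetical helper name, not something from Example 4-31, and the sketch assumes the default "C" locale. It converts the character to unsigned char first, because passing a negative char value to the <cctype> functions is undefined behavior.

```cpp
#include <cctype>

// Hypothetical helper: true if c should trigger an autocorrect lookup,
// that is, if it is whitespace or punctuation.
bool isWordBoundary(char c) {
    // Convert first: the <cctype> functions expect a value that is
    // representable as unsigned char (or EOF).
    const unsigned char uc = static_cast<unsigned char>(c);
    return std::isspace(uc) != 0 || std::ispunct(uc) != 0;
}
```

Under this assumption the apostrophe counts as punctuation, which is why the word "taht" in "taht's" is looked up as soon as the apostrophe is appended.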

The code that does a lookup requires some explanation if you are not familiar with using iterators and find methods on STL containers. The string tmp contains the last chunk of text that was appended to the TextAutoField. To see if it is a commonly misspelled word, look it up in the dictionary like this:

StrStrMap::const_iterator p = pDict_->find(tmp);

if (p != pDict_->end( )) {

The important point here is that map::find returns an iterator that points to the pair containing the matching key, if it was found. If not, an iterator pointing to one past the end of the map is returned, which is exactly what map::end returns (this is how all STL containers that support find work). If the word was found in the map, erase the old word from the buffer and replace it with the correct version:

buf_.erase(i, buf_.length( ) - i);
buf_ += p->second;

Append the character that started the process (either whitespace or punctuation) and you're done.
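
The lookup-and-replace step described above can also be written as a standalone function, which makes it easier to try out on its own. This is only a sketch: correctLastWord is a hypothetical name, and it takes the buffer and dictionary as parameters instead of using class members.

```cpp
#include <map>
#include <string>

typedef std::map<std::string, std::string> StrStrMap;

// Hypothetical helper: if the last whitespace-delimited word in buf is a key
// in dict, replace it with the mapped value. Returns true if a replacement
// was made.
bool correctLastWord(std::string& buf, const StrStrMap& dict) {
    // Index of the first character after the last whitespace, or 0 if the
    // buffer contains no whitespace at all
    std::string::size_type i = buf.find_last_of(" \f\n\r\t\v");
    i = (i == std::string::npos) ? 0 : i + 1;

    // find returns dict.end() when the key is absent
    StrStrMap::const_iterator p = dict.find(buf.substr(i));
    if (p == dict.end()) {
        return false;
    }

    buf.erase(i);        // drop everything from index i to the end
    buf += p->second;    // p->first is the key, p->second the correction
    return true;
}
```

The sketch uses find and not operator[] for the lookup, since dict[key] on a non-const map would insert a missing key with an empty string as its value and the dictionary would grow with every unknown word.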

See Also

Recipe 4.17, Recipe 4.18, and Table 4-3

C++ Cookbook