Using Word as a Spell Checker from Unmanaged C++

Because automation is a COM technology, you can use the #import technique to reduce the amount of code you need to write to access Word and give it instructions. The structure of the console application looks like this:

 #include "stdafx.h"
 // import type library, dll etc
 // other include files needed for code to compile
 int _tmain(int argc, _TCHAR* argv[])
 {
    ::CoInitialize(NULL);
    {
       // create smart pointer
       // use it to access Word functionality
    }
    ::CoUninitialize();
    return 0;
 }

The brace brackets that start after CoInitialize() and end before CoUninitialize() are there for scoping; they ensure that the smart pointer has gone out of scope (triggering the destructor) before COM is cleaned up.
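The effect of those scoping braces can be seen in a small, portable sketch that runs without Windows. Everything here is a hypothetical stand-in: FakeCoInitialize, FakeCoUninitialize, and FakeSmartPtr only imitate the roles of the real COM calls and of a smart pointer generated by #import, and they record the order in which things happen.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the COM runtime, so the ordering can be
// observed without Windows. None of these names come from the COM API.
static bool g_comActive = false;
static std::vector<std::string> g_events;

void FakeCoInitialize()
{
    g_comActive = true;
    g_events.push_back("COM initialized");
}

void FakeCoUninitialize()
{
    g_comActive = false;
    g_events.push_back("COM uninitialized");
}

// Plays the role of a smart pointer generated by #import: it releases
// its interface in the destructor, which still needs COM to be alive.
class FakeSmartPtr
{
public:
    FakeSmartPtr()
    {
        assert(g_comActive);
        g_events.push_back("pointer created");
    }
    ~FakeSmartPtr()
    {
        assert(g_comActive);   // releasing after shutdown would be a bug
        g_events.push_back("pointer released");
    }
};

// Same shape as the _tmain skeleton: the inner braces end the
// pointer's lifetime before COM is shut down.
std::vector<std::string> RunScopedDemo()
{
    g_events.clear();
    FakeCoInitialize();
    {
        FakeSmartPtr ptr;
        g_events.push_back("pointer used");
    }   // ptr is destroyed here, while COM is still initialized
    FakeCoUninitialize();
    return g_events;
}
```

Without the inner braces, ptr would live until the end of the function, so its destructor would run after the uninitialize call, and the assertion in the destructor would fail.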

This simple application just prompts the user for a sentence, pulls it apart with strtok(), and passes the pieces to Word for a spell-check, one word at a time. The general approach looks like this:

 cout << "Enter a sentence and press enter:" << endl;
 char testSentence[1000];
 cin.getline(testSentence,999);
 char* word = strtok(testSentence," \t,() ");
 while (word != NULL)
 {
    // check this word, offer suggestions if any
    word = strtok(NULL," \t,() ");
 }


Why am I using strtok() in this example? It's an old-fashioned C function, after all. The answer is simple: It works, and it doesn't bloat your application. There's a Tokenize() function in the ATL version of CString that does much the same thing, but you have to bring a lot of other code in to get access to that. Oddly, the STL doesn't offer an equivalent. In managed code, the Split() method of the String class takes care of things for you. If you're working in unmanaged C++ and need to work with strings, don't forget those old-fashioned C functions: They just might be all you need.
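If you would rather not have strtok() modify your buffer, a few lines of std::string code can do a similar job. The following is only a sketch of one possible alternative, not the approach used in this chapter; SplitWords is a hypothetical helper name. Like strtok(), it treats every character in the delimiter string as a separator and skips runs of separators.

```cpp
#include <string>
#include <vector>

// Splits 'text' on any character found in 'delims'. Empty tokens are
// skipped, which matches how strtok() treats runs of delimiters.
// Unlike strtok(), the input string is left unchanged.
std::vector<std::string> SplitWords(const std::string& text,
                                    const std::string& delims)
{
    std::vector<std::string> words;
    std::string::size_type start = text.find_first_not_of(delims);
    while (start != std::string::npos)
    {
        std::string::size_type end = text.find_first_of(delims, start);
        if (end == std::string::npos)
        {
            // last token runs to the end of the string
            words.push_back(text.substr(start));
            break;
        }
        words.push_back(text.substr(start, end - start));
        start = text.find_first_not_of(delims, end);
    }
    return words;
}
```

Calling SplitWords(sentence, " \t,()") returns the words in order, and each one could then be handed to the spell checker in a loop.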

Finding Your Way Around

With the overall structure in place, a few questions leap to mind:

  • Which type library or DLL should be named in the #import statement?

  • Which progid do you pass to the constructor of the smart pointer?

  • What are the functions offered by the automation classes associated with Word, what parameters do they take, and where can you find documentation?

The answers to these questions, for Word 2003, are in this section, of course, but you need to be able to answer them for any automation server you plan to use. Automation servers usually come with documentation. It can be quite complicated trying to work out these pieces of information using tools alone, and it's simple if the provider of the application just tells you. For completeness, and to remove any air of mystery or magic, this section shows you where all the information can be found.

To automate Word, you need to use the #import statement to bring in three files:

  • MSO.DLL The Microsoft Office library. This "sets the stage" for defining the types you will actually work with. Omit this #import and the others will cause compiler errors.

  • VBE6EXT.olb The object library for the Visual Basic Editor, which in fact applies to all VBA hosts, including Office. Omit this #import and the others will cause compiler errors.

  • MSWORD.OLB The object library for Word. This defines the smart pointers you will actually be using.

If you have no MSDN library and no Google, you can use the OLE/COM object viewer, which comes with Visual Studio .NET 2003, to discover that these are the files you need. You can bring it up from within Visual Studio on the Tools menu. It shows you all the COM components that are in the Registry, arranged by categories. The first place to look is in the Automation Objects section of this tool.

Unfortunately, not all automation servers are in the Registry. Search your hard drive for files with the extensions .DLL, .TLB, and .OLB with names that remind you of the product you're trying to automate, in folders that also remind you of the product. (Use the Search option on the Start menu.) For example, a file called MSWORD.OLB sounds like a really good fit when you're looking for something related to Microsoft Word. Finding it in Program Files\Microsoft Office\OFFICE11 makes the connection even stronger. So you've found the first file you'll need, and for some automation servers it might be the only one you'll need. The next step is to look in it and learn a little more about it.

In the OLE/COM object viewer, choose File, View TypeLib. Browse to the folder where you found the likely file, and open it. The viewer shows pages and pages of information for MSWORD.OLB, and it starts like this:

 // Generated .IDL file (by the OLE/COM Object Viewer)
 //
 // typelib filename: MSWORD.OLB
 [
   uuid(00020905-0000-0000-C000-000000000046),
   version(8.3),
   helpstring("Microsoft Word 11.0 Object Library"),
   helpfile("VBAWD10.CHM"),
   helpcontext(00000000)
 ]
 library Word
 {
     // TLib :     // TLib : Microsoft Visual Basic for Applications Extensibility 5.3 : {0002E157-0000-0000-C000-000000000046}
     importlib("VBE6EXT.OLB");
     // TLib : Microsoft Office 11.0 Object Library : {2DF8D04C-5BFA-101B-BDE5-00AA0044DE52}
     importlib("MSO.DLL");
     // TLib : OLE Automation : {00020430-0000-0000-C000-000000000046}
     importlib("stdole2.tlb");

The helpstring in this IDL confirms that this object library is for Word in Office 11 (an internal name for Office 2003). You've found the right object library. The importlib statements tell you about other libraries on which this one depends. Write down their names; you're going to need them shortly.

After the importlib statements comes a long list of all the interfaces defined in this library, including _Application and _Document. Often when a library has a lot of interfaces, some start with an underscore, perhaps just to get them to the top of alphabetical lists, where they're easy to find. The OLE/COM object viewer lists these interfaces in the tree view on the left as well; scroll past all the typedef enum entries and you'll find some interfaces. Expand _Application and you'll find pages of methods; this is a big interface. There are no search commands in the OLE/COM object viewer, so try this: Click in the right pane, use Ctrl+A to select all the text and Ctrl+C to copy it. Open a Notepad instance, and paste in all the text. Now you can use Notepad's find capabilities.

If you search for Spelling, you'll find a lot of typedefs and options, but more importantly a method called CheckSpelling, with IDL that looks like this:

 HRESULT CheckSpelling(
   [in] BSTR Word,
   [in, optional] VARIANT* CustomDictionary,
   [in, optional] VARIANT* IgnoreUppercase,
   [in, optional] VARIANT* MainDictionary,
   [in, optional] VARIANT* CustomDictionary2,
   [in, optional] VARIANT* CustomDictionary3,
   [in, optional] VARIANT* CustomDictionary4,
   [in, optional] VARIANT* CustomDictionary5,
   [in, optional] VARIANT* CustomDictionary6,
   [in, optional] VARIANT* CustomDictionary7,
   [in, optional] VARIANT* CustomDictionary8,
   [in, optional] VARIANT* CustomDictionary9,
   [in, optional] VARIANT* CustomDictionary10,
   [out, retval] VARIANT_BOOL* prop);

IDL reads a lot like C++: This is a method declaration and it includes all the parameters. The square brackets hold attributes of the parameters and are fairly self-explanatory. This method takes one required parameter (the word to check) and a dozen optional ones. Through the retval parameter, it returns a Boolean value indicating whether the word passed in was spelled correctly.

Scrolling up from that line will reveal that the method is in the _Application interface. You've discovered what you need to know to code against the Word object model through COM.

Accessing the Automation Server

Create a Win32 console application called Word. Add this line, after the #include of stdafx.h:

 #import "C:\Program Files\Microsoft Office\OFFICE11\msword.olb" \
           rename("ExitWindows","WordExitWindows")

Make sure the path you type here corresponds to the folder where you found MSWORD.OLB. The #import directive is qualified here by an attribute. The rename("ExitWindows","WordExitWindows") attribute deals with a conflict that occurs whenever you're automating Office products: This function is declared in more than one place. It doesn't matter what you rename it to; you just want to eliminate the name conflict.

Build this application and you'll get compiler errors. That's why you wrote down the libraries imported by this one, as revealed by the OLE/COM object viewer. Add another #import statement, before the one that imports MSWORD.OLB, importing VBE6EXT.OLB, because it's named first in the IDL. Use Start, Search to find the path to the file. Most developers like to impose a namespace name, like this:

 #import "C:\Program Files\Common Files\Microsoft Shared\VBA\VBA6\VBE6EXT.olb" \
           rename_namespace("VBE6")

Build the project again and you'll get some more errors, so add another #import statement, before the other two, bringing in the next importlib mentioned in the IDL:

 #import "C:\Program Files\Common Files\Microsoft Shared\OFFICE11\mso.dll" \
           rename_namespace("Office2003")

Build one more time and you should have only a handful of warnings about duplicate names. There are two ways to suppress these warnings: either add auto_rename attributes to the #import statements, or suppress the warnings with a pragma. Add this line before the import of MSWORD.OLB:

 #pragma warning (disable:4278) 

Now the project will build without errors or warnings, and it's ready to have actual code added to it. Remember this process when you use a #import in a project of your own that uses automation, and no one told you which file to import.

Creating a Smart Pointer and Calling Word Methods

Because you know you're going to use the _Application interface, create an instance of the _ApplicationPtr smart pointer object to simplify accessing the methods. This smart pointer is created for you by the #import statement, which makes a smart pointer for every interface in the type library, with Ptr at the end of the name. The constructor takes a string containing the progid of the object you need to create. To discover the progid, you need to look in the OLE/COM object viewer again. One of the first lines in the file defines the library:

 library Word 

After all the interfaces are a number of coclass statements (search for them in the IDL you copied into Notepad). The first two look like this:

 [
   uuid(000209F0-0000-0000-C000-000000000046),
   helpcontext(0x000009b9),
   appobject
 ]
 coclass Global {
     [default] interface _Global;
 };

 [
   uuid(000209FF-0000-0000-C000-000000000046),
   helpcontext(0x00000970)
 ]
 coclass Application {
     [default] interface _Application;
     [source] dispinterface ApplicationEvents;
     [source] dispinterface ApplicationEvents2;
     [source] dispinterface ApplicationEvents3;
     [default, source] dispinterface ApplicationEvents4;
 };

For each coclass, the IDL lists the interfaces it supports. The Application coclass supports the _Application interface (not surprisingly) along with some others. Therefore, the progid to pass to the constructor is Word.Application. Whenever you aren't sure which progid to use, check the IDL for the coclass names. Find the coclass that implements the desired interface, and then build the progid from the library name and the coclass name.

That means the line of code to create the smart pointer is this:

 Word::_ApplicationPtr ap("Word.Application");

From here it's just a matter of calling the CheckSpelling method. It takes a BSTR and returns an HRESULT, but it also has a retval parameter called prop that is a VARIANT_BOOL*. Thanks to the smart pointers and other behind-the-scenes code generated by the #import statement, you can think of it as taking an ordinary char* string and returning a bool. You call it like this:

 bool spellingOK = ap->CheckSpelling("helloo"); 
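Treating the result as a bool works because any nonzero value converts to true. The sketch below illustrates the one trap worth knowing about, under the assumption that the local typedef and constants mirror the Windows SDK definitions (VARIANT_BOOL is a short, VARIANT_TRUE is -1, VARIANT_FALSE is 0). FakeCheckSpelling and IsSpelledCorrectly are hypothetical names used only so the example runs without Word.

```cpp
#include <string>

// Local stand-ins for the Windows typedef and constants. Assumption:
// they mirror the SDK, where VARIANT_BOOL is a short, VARIANT_TRUE is
// -1, and VARIANT_FALSE is 0.
typedef short VariantBool;
const VariantBool kVariantTrue  = -1;
const VariantBool kVariantFalse = 0;

// Hypothetical replacement for the generated CheckSpelling wrapper.
// It "knows" a single word so the example runs without Word.
VariantBool FakeCheckSpelling(const std::string& word)
{
    return (word == "hello") ? kVariantTrue : kVariantFalse;
}

// Converting by comparing against the false value is safe. Comparing
// against 1 would not be, because the true value is -1.
bool IsSpelledCorrectly(const std::string& word)
{
    return FakeCheckSpelling(word) != kVariantFalse;
}
```

Assigning the result straight to a bool, as the line above does, relies on the same nonzero-means-true conversion; the thing to avoid is a test such as result == 1.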

If the function returns true, Word believes the spelling is okay. If it returns false, Word can't find the word in the dictionary. You might want to get some suggestions, in that case. A little trawling around in the IDL will get you what you need. Use the techniques shown earlier and you'll discover:

  • A method of the _Application interface called GetSpellingSuggestions() that returns a SpellingSuggestions**

  • An interface called SpellingSuggestions, which is what GetSpellingSuggestions() returns

  • A not-very-exciting list of properties on the SpellingSuggestions interface, except for an Item property, which returns a SpellingSuggestion**

  • A SpellingSuggestion interface with a Name property that returns a BSTR

If the word you are checking is in a variable called word, this code will check it and offer suggestions:

 bool spellingOK = ap->CheckSpelling(word);
 if (!spellingOK)
 {
    cout << word << " is not recognized by Word. Word suggests:" << endl;
    Word::SpellingSuggestionsPtr sugg;
    sugg = ap->GetSpellingSuggestions(word);
    int suggcount = sugg->GetCount();
    for (int i = 1; i <= suggcount; i++)
    {
       Word::SpellingSuggestionPtr suggestedword = sugg->Item(i);
       if (suggestedword)
       {
          cout << suggestedword->GetName() << endl;
       }
    }
    if (suggcount == 0)
       cout << "No suggestions." << endl;
 }

At this point, you should not only understand this code, and know what it makes Word do, but know how to walk up to any type library that you believe offers automation and learn what is in it and how to use it, without relying on magic strings discovered by Googling late into the night.

Putting It All Together

There are many steps involved in putting together this automation client. You have seen code snippets that import the automation files, create and use smart pointers, prompt the user for input, and write results back to the screen. Listing 9.1 shows the entire console application.

Listing 9.1 Word.cpp
 // Word.cpp : Defines the entry point for the console application.
 //
 #include "stdafx.h"
 #import "C:\Program Files\Common Files\Microsoft Shared\OFFICE11\mso.dll" \
           rename_namespace("Office2003")
 #import "C:\Program Files\Common Files\Microsoft Shared\VBA\VBA6\VBE6EXT.olb" \
           rename_namespace("VBE6")
 #pragma warning (disable:4278)
 #import "C:\Program Files\Microsoft Office\OFFICE11\msword.olb" \
           rename("ExitWindows","WordExitWindows")
 #include <iostream>
 using namespace std;
 #include <string>
 int _tmain(int argc, _TCHAR* argv[])
 {
    ::CoInitialize(NULL);
    {
       Word::_ApplicationPtr ap("Word.Application");
       //to get suggestions, there must be a document
       if (ap->Documents->Count == 0)
          ap->Documents->Add();
       cout << "Enter a sentence and press enter:" << endl;
       char testSentence[1000];
       cin.getline(testSentence,999);
       char* word = strtok(testSentence," \t,() ");
       while (word != NULL)
       {
          bool spellingOK = ap->CheckSpelling(word);
          if (!spellingOK)
          {
             cout << word << " is not recognized by Word. Word suggests:"
                  << endl;
             Word::SpellingSuggestionsPtr sugg;
             sugg = ap->GetSpellingSuggestions(word);
             int suggcount = sugg->GetCount();
             for (int i = 1; i <= suggcount; i++)
             {
                Word::SpellingSuggestionPtr suggestedword = sugg->Item(i);
                if (suggestedword)
                {
                   cout << suggestedword->GetName() << endl;
                }
             }
             if (suggcount == 0)
                cout << "No suggestions." << endl;
          }
          word = strtok(NULL," \t,() ");
       }
       _variant_t v = Word::wdDoNotSaveChanges;
       ap->Quit(&v);
    }
    ::CoUninitialize();
    return 0;
 }

There are three concepts in this code that were not presented earlier:

  • It is an oddity of Word that you don't get spelling suggestions unless there is a document open. This code creates one by using the Add() method of the Documents property of the application.

  • Release versions of this code sometimes blow up if Word is not fully initialized before requests are made of it. (In debug, the time you take to step through code gives Word lots of time to come up.) Therefore, the code that creates the smart pointer and opens an empty document comes before the prompt that asks the user for a sentence. Make sure you put the creation of the smart pointer as early as possible in your code.

  • The last piece of work before unloading COM is quitting Word. If you don't, there will be a lot of WINWORD.EXE entries left running (use Task Manager to see them), and your system might get bogged down. The Quit() method takes a VARIANT*, and this code uses the helper class _variant_t to create the VARIANT in a single line.
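The wrappers generated by #import normally report a failed call by throwing a _com_error exception, so an error part-way through the loop would skip the call to Quit() and leave Word running. One way to guard against that is a small scope guard, sketched here in portable form. QuitGuard, FakeWordApp, and CheckSentence are hypothetical names, and the fake Quit() takes no arguments, whereas the real one takes VARIANT* parameters.

```cpp
#include <stdexcept>

// Hypothetical stand-in for the Word application object. The real
// Quit() takes VARIANT* arguments; this one only records the call.
struct FakeWordApp
{
    bool quitCalled;
    FakeWordApp() : quitCalled(false) {}
    void Quit() { quitCalled = true; }
};

// Calls Quit() on the wrapped object when the guard goes out of scope,
// including when an exception unwinds the stack.
template <typename App>
class QuitGuard
{
public:
    explicit QuitGuard(App& app) : app_(app) {}
    ~QuitGuard()
    {
        try { app_.Quit(); }
        catch (...) { }   // a destructor should not let exceptions escape
    }
private:
    QuitGuard(const QuitGuard&);              // not copyable
    QuitGuard& operator=(const QuitGuard&);
    App& app_;
};

// Simulates a spell-check pass that may fail part-way through.
void CheckSentence(FakeWordApp& app, bool failMidway)
{
    QuitGuard<FakeWordApp> guard(app);
    if (failMidway)
        throw std::runtime_error("simulated automation failure");
    // normal spell-checking work would go here
}
```

In the real application, a guard like this would be declared inside the inner braces, after the smart pointer, so that it is destroyed first and Quit() runs while the pointer is still valid and COM is still initialized.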

Try running the application (if you start it without debugging [Ctrl+F5], the console window will pause and remind you to press any key to continue), and enter a sentence with a mix of correctly and incorrectly spelled words. For example, enter this sentence:

 Tihs sentence hsa some misteaks 

You should see this output:

 Tihs is not recognized by Word. Word suggests:
 This
 Tins
 Ties
 Tics
 Tips
 hsa is not recognized by Word. Word suggests:
 has
 misteaks is not recognized by Word. Word suggests:
 mistakes
 misspeaks
 misdeals
 mistake
 mistreats
 mistrals

There you have it: a simple application that uses the power of Word to check spelling and offer suggestions. Think about other capabilities offered by Office applications: they're almost limitless. Your code can tap into them, just this simply.

Microsoft Visual C++ .NET 2003 Kick Start
ISBN: 0672326000
EAN: 2147483647
Year: 2002
Pages: 141
Authors: Kate Gregory