Office 2003 Research Services


Microsoft Office 2003 offers yet another translation opportunity. Office 2003 includes a feature called Research Services that enables you to perform various kinds of research from within an Office application. You can carry out the research without leaving the application, and you can paste the results into your document in context. You can also write your own Research Services in .NET and have Office use them just like one of its own.

But this isn't what is of most interest to us with regard to machine translation. One of the built-in Research Services is the Translation service. To see it in action, start an Office 2003 application, such as Word, and select Tools, Research.... A new pane called "Research" is added on the right side of the window. Drop down the combo box (which initially says "All Reference Books") and select Translation. In the "Search for:" text box, enter a phrase to translate. In the two combo boxes below, select the "From" language and the "To" language. Click the green arrow next to the "Search for:" text box; the text is then translated (see Figure 9.8).

Figure 9.8. Using the Microsoft Office 2003 Research Pane to Perform Translations (Translation Services Provided by WorldLingo.)


It is this translation facility that you can harness for your own automatic translation.

If you search on MSDN (http://msdn.microsoft.com) or the Microsoft Office Web site (http://office.microsoft.com), you will find a fair amount of information on creating your own research services and integrating them into Office. However, you won't find any information on how to consume research services, because Microsoft expects Office 2003 to be the only consumer of Office 2003 Research Services. All that you need to know is in this section. You might also find it useful to download the Microsoft Office 2003 Research Services SDK, which contains some background information on the subject.

Most Office 2003 Research Services are simply Web services. As such, given the Web service's URL, its WSDL, and the format of its messages, we can use the Web service just like any other Web service. Figure 9.9 shows the list of Research Services that are contained within the Registry at HKEY_CURRENT_USER\Software\Microsoft\Office\11.0\Common\Research\Sources.

Figure 9.9. Microsoft Office 2003 Research Services Registry Entries


Research Services is an "install on demand" option, so you will need to use the translation facility once before the Registry is populated. A "source" is a provider of research information. Microsoft Office Online Services (shown in Figure 9.9) is an example of one such provider. From this entry, you can see the URL of the Web service (http://office.microsoft.com/Research/query.asmx). Providers provide services. If you expand the source's keys, you can see the list of services (see Figure 9.10).

Figure 9.10. Microsoft Office 2003 Services Registry Entries


This service is a translation service that translates from "English (U.S.)" to "French (France)". The kind of service is specified in the CategoryID, which is 0x36120000 (907149312) for translation services (this is the REFERENCE_TRANSLATION constant in the Office 2003 Research Service SDK). Of particular interest here is the SourceData entry, which is in the following format:

 <FromLCID>/<ToLCID>/<ResultType> 


In the entry in Figure 9.10, the FromLCID is 1033 (which is the locale ID for "English (U.S.)"), the ToLCID is 1036 (which is the locale ID for "French (France)"), and the ResultType is 4. The result type is "1" for keyword translators and "2" for whole-document translators; "4" is not documented but appears to be for keyword/sentence translators. For our purposes, we are interested in "1" and "4".
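The SourceData format is simple enough to parse by splitting on the slash. The following is a minimal sketch; the TranslationSourceData class and its member names are hypothetical helpers invented for this illustration and are not part of Office or the SDK.

```csharp
using System;
using System.Globalization;

// Hypothetical helper type; the name and shape are not from the SDK.
public sealed class TranslationSourceData
{
    public int FromLcid { get; private set; }
    public int ToLcid { get; private set; }
    public int ResultType { get; private set; }

    // Parses "<FromLCID>/<ToLCID>/<ResultType>", for example "1033/1036/4".
    public static TranslationSourceData Parse(string sourceData)
    {
        if (sourceData == null)
            throw new ArgumentNullException("sourceData");

        string[] parts = sourceData.Trim().Split('/');
        if (parts.Length != 3)
            throw new FormatException(
                "Expected <FromLCID>/<ToLCID>/<ResultType>: " + sourceData);

        TranslationSourceData result = new TranslationSourceData();
        result.FromLcid = int.Parse(parts[0], CultureInfo.InvariantCulture);
        result.ToLcid = int.Parse(parts[1], CultureInfo.InvariantCulture);
        result.ResultType = int.Parse(parts[2], CultureInfo.InvariantCulture);
        return result;
    }

    // True for result types 1 and 4, the two that are of interest here.
    public bool IsKeywordOrSentenceTranslator
    {
        get { return ResultType == 1 || ResultType == 4; }
    }
}
```

For the entry in Figure 9.10, TranslationSourceData.Parse("1033/1036/4") yields a FromLcid of 1033, a ToLcid of 1036, and a ResultType of 4.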

From this information, you could read through the list of providers collecting a list of services that have a CategoryID of 0x36120000 and a SourceData that has a result type of either 1 or 4.
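One way to collect such a list might look like the sketch below. It is an illustration only: the ResearchServiceEntry and TranslationServiceFinder names are hypothetical, and the Registry walk assumes that each service is a direct subkey of its source key, that CategoryID is stored as a DWORD, and that SourceData is stored as a string. The Registry part runs on Windows only; the filter itself is plain in-memory code.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Win32;

// Hypothetical record for one cached service entry.
public sealed class ResearchServiceEntry
{
    public string SourceKeyName;   // Registry key name of the provider
    public string ServiceKeyName;  // Registry key name of the service
    public int CategoryId;         // the CategoryID value
    public string SourceData;      // the SourceData value
}

public static class TranslationServiceFinder
{
    // REFERENCE_TRANSLATION: 0x36120000 == 907149312.
    public const int ReferenceTranslation = 0x36120000;

    private const string SourcesPath =
        @"Software\Microsoft\Office\11.0\Common\Research\Sources";

    // Keeps only translation services whose result type is 1 or 4.
    public static List<ResearchServiceEntry> FilterTranslators(
        IEnumerable<ResearchServiceEntry> entries)
    {
        List<ResearchServiceEntry> result = new List<ResearchServiceEntry>();
        foreach (ResearchServiceEntry entry in entries)
        {
            if (entry.CategoryId != ReferenceTranslation ||
                entry.SourceData == null)
                continue;

            string[] parts = entry.SourceData.Split('/');
            if (parts.Length != 3)
                continue;

            string resultType = parts[2].Trim();
            if (resultType == "1" || resultType == "4")
                result.Add(entry);
        }
        return result;
    }

    // Windows only. Assumed layout: Sources\<source>\<service>.
    public static List<ResearchServiceEntry> ReadFromRegistry()
    {
        List<ResearchServiceEntry> entries = new List<ResearchServiceEntry>();
        using (RegistryKey sources =
            Registry.CurrentUser.OpenSubKey(SourcesPath))
        {
            if (sources == null)
                return entries; // the cache has not been populated yet

            foreach (string sourceName in sources.GetSubKeyNames())
            {
                using (RegistryKey source = sources.OpenSubKey(sourceName))
                {
                    if (source == null) continue;
                    foreach (string serviceName in source.GetSubKeyNames())
                    {
                        using (RegistryKey service =
                            source.OpenSubKey(serviceName))
                        {
                            if (service == null) continue;
                            object category = service.GetValue("CategoryID");
                            if (!(category is int)) continue;

                            ResearchServiceEntry entry =
                                new ResearchServiceEntry();
                            entry.SourceKeyName = sourceName;
                            entry.ServiceKeyName = serviceName;
                            entry.CategoryId = (int) category;
                            entry.SourceData =
                                service.GetValue("SourceData") as string;
                            entries.Add(entry);
                        }
                    }
                }
            }
        }
        return entries;
    }
}
```

On a machine where the translation facility has been used at least once, TranslationServiceFinder.FilterTranslators(TranslationServiceFinder.ReadFromRegistry()) would return the candidate services. Keeping the filter separate from the Registry walk means it can also be applied to entries obtained some other way.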

The information contained in the Registry is simply a cache of the information returned by calling the provider's Web service's Registration method. If you already know the URL of the Web service and want to know the list of services that it provides, an alternative to reading through the Registry is to call the Registration method and read through the result.


By default, three providers of translation services are included with Office 2003:

  • internal:LocalTranslation

  • Microsoft Office Online Services

  • WorldLingo

The "internal:LocalTranslation" provider is a set of Win32 DLLs and is not a Web service. You can find the DLLs in "%CommonProgramFiles%\Microsoft Shared\TRANSLAT". They are installed on demand, so they won't be present until you have translated English to/from French and/or English to/from Spanish. Because this provider is not a Web service and its functions are undocumented, I have chosen to ignore it.

At first sight, the Microsoft Office Online Services looks like a good source of machine translation. The URL in the Registry can be used as is in Visual Studio's ASP.NET Web Service Wizard to generate a Web service reference because the Web service returns the WSDL that describes the Web service. Unfortunately, the Web service itself suffers from two problems. First, the Web service is more of a translation dictionary than a keyword translator. For example, if you translate Stop into German, the result (after all the HTML formatting has been removed) is this:

1. (-pp-) intransitives Verb (an)halten, stehen bleiben (auch Uhr und so weiter), stoppen; aufhören; besonders Brt. bleiben; stop dead plötzlich oder abrupt stehen bleiben; stop at nothing vor nichts zurückschrecken; stop short of doing, stop short at something zurückschrecken vor (Dativ); transitives Verb anhalten, stoppen; aufhören mit; ein Ende machen oder setzen (Dativ); Blutung stillen; Arbeiten, Verkehr und so weiter zum Erliegen bringen; etwas verhindern; jemanden abhalten (from von), hindern (from an Dativ); Rohr und so weiter verstopfen (auch stop up); Zahn füllen, plombieren; Scheck sperren (lassen); stop by vorbeischauen; stop in vorbeischauen (at bei); stop off umgangssprachlich: kurz Halt machen; stop over kurz Halt machen; Zwischenstation machen;

Clearly, this is the kind of definition that you would expect to find in a dictionary, but it is virtually useless for machine translation.

Second, it translates only single words; it cannot translate a sentence or a phrase. It is almost completely meaningless to translate words one by one and string them together, so these services are of no use to us.

WorldLingo Translation Services

The third provider, WorldLingo, is the only viable option that is installed by default. The complete source code to use with this provider is included with this book. Because it is long, I focus only on the most important parts.

The first problem in using the WorldLingo services is that the WorldLingo server doesn't expose the WSDL for the Web service. You can't simply put http://www.worldlingo.com/wl/msoffice11 into Visual Studio's ASP.NET Web Service wizard; the process needs to be a little lower level. Instead, you can use an HttpWebRequest object to send an HTTP request to the server and read the WebResponse object that is returned. SendRequest sends a SOAP request to a URL:

 protected string SendRequest(string url, string soapPacket)
 {
     HttpWebRequest httpWebRequest =
         (HttpWebRequest) WebRequest.Create(url);
     httpWebRequest.ContentType = "text/xml; charset=utf-8";
     httpWebRequest.Headers.Add(
         "SOAPAction: urn:Microsoft.Search/Query");
     httpWebRequest.Method = "POST";
     httpWebRequest.ProtocolVersion = HttpVersion.Version10;

     Stream stream = httpWebRequest.GetRequestStream();
     StreamWriter streamWriter = new StreamWriter(stream);
     streamWriter.Write(soapPacket);
     streamWriter.Close();

     WebResponse webResponse = httpWebRequest.GetResponse();
     Stream responseStream = webResponse.GetResponseStream();
     StreamReader responseStreamReader =
         new StreamReader(responseStream);
     return responseStreamReader.ReadToEnd();
 }

This would be used something like this:

 string responsePacket = SendRequest(
     "http://www.worldlingo.com/wl/msoffice11", queryPacket);


The Web service has a method called Query that accepts a single parameter that is a string of XML. The XML contains the translation request, including the "from" language, the "to" language, and the text to be translated. The aforementioned Microsoft Office 2003 Research Services SDK has the structure of this XML packet. At first sight, the Research Services Class Library (RSCL, also available from http://msdn.microsoft.com) includes QueryRequest and QueryResponse classes that might help. These classes are wrappers to build and read the XML used with the Query method. Unfortunately, they are designed for use by developers, not consumers, of Research Services; consequently, they enable you to read the query XML and to create the response XML. This doesn't help because we want to create the query XML and read the response XML.
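Because Query takes its XML as a single string parameter, the query XML has to be escaped and placed inside a SOAP envelope before it is posted. The following is a minimal sketch of that wrapping step. The SoapPacketBuilder class is a hypothetical helper; the urn:Microsoft.Search namespace is inferred from the SOAPAction header, and the parameter element name queryXml is an assumption that should be checked against the SDK.

```csharp
using System.Security;

// Hypothetical helper; not part of the SDK or the RSCL.
public static class SoapPacketBuilder
{
    // Wraps the query XML in a SOAP 1.1 envelope that calls Query.
    public static string WrapQuery(string queryXml)
    {
        return
            "<?xml version=\"1.0\" encoding=\"utf-8\"?>" +
            "<soap:Envelope xmlns:soap=" +
            "\"http://schemas.xmlsoap.org/soap/envelope/\">" +
            "<soap:Body>" +
            // Assumed method namespace and parameter name.
            "<Query xmlns=\"urn:Microsoft.Search\">" +
            "<queryXml>" +
            // The query is passed as text, so its markup must be escaped.
            SecurityElement.Escape(queryXml) +
            "</queryXml>" +
            "</Query>" +
            "</soap:Body>" +
            "</soap:Envelope>";
    }
}
```

The result of WrapQuery is the kind of string that would be passed as the SOAP packet when posting to the service URL.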

To create the query XML, I wrote a GetQueryXml method, which can be called something like this:

 GetQueryXml("The monkey is in the tree", service.Id, "(11.0.6360)") 


We pass the string to translate, the GUID of the service that performs the translation, and a build number. The GUID of the service identifies the from/to language pair. GetQueryXml then builds the necessary XML using XmlTextWriter according to the schema defined in the SDK.
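To give a feel for what such a method involves, here is a heavily simplified sketch and not the implementation that accompanies the book. The namespace urn:Microsoft.Search.Query and the QueryPacket, Query, QueryId, Context, and QueryText names are my recollection of the SDK schema and should be treated as assumptions; a real request is likely to need further elements that the SDK describes. The service ID is taken as a string here, which is also an assumption.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical, simplified builder; verify every name against the SDK.
public static class QueryXmlBuilder
{
    // Assumed namespace of the query schema.
    private const string QueryNs = "urn:Microsoft.Search.Query";

    public static string GetQueryXml(
        string text, string serviceId, string build)
    {
        StringWriter stringWriter = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(stringWriter);

        writer.WriteStartElement("QueryPacket", QueryNs);
        writer.WriteAttributeString("revision", "1");
        writer.WriteAttributeString("build", build);

        // The service GUID selects the from/to language pair.
        writer.WriteStartElement("Query", QueryNs);
        writer.WriteAttributeString("domain", serviceId);
        writer.WriteElementString(
            "QueryId", QueryNs, Guid.NewGuid().ToString("B"));

        writer.WriteStartElement("Context", QueryNs);
        writer.WriteStartElement("QueryText", QueryNs);
        writer.WriteAttributeString("type", "STRING");
        writer.WriteString(text); // the writer escapes special characters
        writer.WriteEndElement(); // QueryText
        writer.WriteEndElement(); // Context

        writer.WriteEndElement(); // Query
        writer.WriteEndElement(); // QueryPacket
        writer.Close();

        return stringWriter.ToString();
    }
}
```

Passing the namespace to every element keeps all elements in the same default namespace, so the writer declares it once on the root element.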

The return value of the SendRequest method is the response from the Web service. Again, this is an XML string using the QueryResponse schema defined in the SDK. The Response element of this XML contains the translated text. Unfortunately, this translated text is formatted for display in an Office application, so it contains HTML formatting that must be removed first. With this done, we have our translated text.
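Removing the HTML formatting can be approximated with a regular expression and an entity decode. The sketch below is one possible approach, not the code that ships with the book; the ResponseTextCleaner name is hypothetical, and a regular expression is only a rough substitute for a real HTML parser.

```csharp
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for cleaning up display-formatted text.
public static class ResponseTextCleaner
{
    public static string StripHtml(string html)
    {
        if (string.IsNullOrEmpty(html))
            return string.Empty;

        // Replace each tag with a space so adjacent words stay separated.
        string withoutTags = Regex.Replace(html, "<[^>]+>", " ");

        // Turn entities such as &amp; or &#39; back into characters.
        string decoded = WebUtility.HtmlDecode(withoutTags);

        // Collapse the runs of white space left behind by the tags.
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }
}
```

Note that replacing tags with spaces can split a word if a tag appears in the middle of it; if the responses you receive do that, replace tags with an empty string instead.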




.NET Internationalization: The Developer's Guide to Building Global Windows and Web Applications
ISBN: 0321341384
Year: 2006
Pages: 213
