Through a comparative study, the previous section has shown the undeniable advantages of creating satellite DLLs. This section will demonstrate how you can offer an MUI solution within your own applications by using satellite DLLs, whether you're dealing with Win32 applications, Web content, or the .NET Framework.
Assuming that you have already separated your resources from your core functional language binary, creating an MUI solution is as easy as following these basic steps:
Before examining the technical implementation of MUI in Win32 applications, it's important to underscore that the only safe assumption for a fallback language is the UI language of the operating system. If you do not offer a localized solution for that language, then instead of defaulting to a given language (such as English), you will need to enable the user to select a language. For example, if the UI of the operating system is set to French, and yet you don't have localized files for French, you might be tempted to default to English resource files. This might be an accepted solution in Canada, where French and English coexist. But if a French user lives in Belgium, for example, odds are that his or her second language is Dutch rather than English.
Detecting the system's UI language is handled differently across various versions of Windows. In Windows Me, Windows 2000, and the Windows XP family of products, the new GetUserDefaultUILanguage API returns the language ID for the current UI language of the operating system. (In Windows MUI versions it is possible to have different users with different UI languages selected.) You can enumerate all available UI languages on the system by calling the EnumUILanguages API.
In Windows 95, Windows 98, and Windows 98 Second Edition, the UI language is stored in the registry at HKCU\Control Panel\Desktop\ResourceLocale. This value holds the language ID (LANGID) of the UI as a hexadecimal string (for example, 00000409 for English).
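As a minimal sketch of the conversion step only, the following portable C function turns a ResourceLocale-style string into a numeric language ID. Reading the value from the registry (for example, with RegQueryValueEx) is left out, and the function name langid_from_hex is a hypothetical helper, not part of any Windows API.

```c
#include <assert.h>
#include <stdlib.h>

/* Converts a ResourceLocale-style string such as "00000409" to a 16-bit
   language ID. Returns 0 if the text is not a valid hexadecimal number
   or does not fit in 16 bits. (Hypothetical helper for illustration.) */
unsigned short langid_from_hex(const char *text)
{
    char *end = NULL;
    unsigned long value;

    assert(text != NULL);
    value = strtoul(text, &end, 16);

    /* Reject empty input, trailing garbage, and out-of-range values. */
    if (end == text || *end != '\0' || value > 0xFFFFul)
        return 0;
    return (unsigned short)value;
}
```

Because 0 is also the value of the neutral language, a caller would treat a return value of 0 as "not detected" and fall back to asking the user.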
In Windows NT 3.5x and Windows NT 4, which lack both a relevant API and consistent registry entries, the safest way to check the language of the operating system is to look at the version stamp of Ntdll.dll. The language of this DLL is the same as the language of the UI. The only exceptions to this approach are with Arabic, Hebrew, or Thai versions of Windows NT 4, where version stamping can help you detect an operating system enabled to support these languages. Thus the steps for checking the UI language of the operating system are:
The following code sample demonstrates how detection of the system's UI language would work on Windows NT 3.5x and Windows NT 4 platforms:
HMODULE hLib;

hLib = LoadLibrary(TEXT("ntdll.dll"));

// Failed to load the file.
if (hLib == NULL)
    return (FALSE);

// For East Asian countries, Ntdll.dll contains both English and
// localized resources. Therefore, only take into consideration
// the information for the non-English version.
EnumResourceLanguages(hLib, RT_VERSION, MAKEINTRESOURCE(1),
    EnumResLangProc, 0);
FreeLibrary(hLib);

// If you only have English version stamping, you might still be
// dealing with an enabled language (Arabic, Hebrew, or
// Thai). Only true for Windows NT 4.
if (g_wLangID == US_LANG_ID)
{
    UINT uiACP;

    uiACP = GetACP();
    switch (uiACP)
    {
        // Thai code page activated; this is a Thai-enabled system.
        case 874:
            g_wLangID = MAKELANGID(LANG_THAI, SUBLANG_DEFAULT);
            break;

        // Hebrew code page; this is a Hebrew-enabled system.
        case 1255:
            g_wLangID = MAKELANGID(LANG_HEBREW, SUBLANG_DEFAULT);
            break;

        // Arabic code page; this is an Arabic-enabled system.
        case 1256:
            g_wLangID = MAKELANGID(LANG_ARABIC, SUBLANG_ARABIC_SAUDI_ARABIA);
            break;

        default:
            break;
    }
}

// In our resource enumeration, we only keep non-English-
// language stampings.
BOOL CALLBACK EnumResLangProc(HMODULE hModule, LPCTSTR lpszType,
    LPCTSTR lpszName, WORD wIDLanguage, LONG_PTR lParam)
{
    if (!lpszName)
        return FALSE;

    if (wIDLanguage != US_LANG_ID)
        g_wLangID = wIDLanguage;

    return TRUE;
}
All these techniques for detecting the system's UI language return a language ID composed of a primary language and a sublanguage. (For more information on primary languages and sublanguages, see Appendix D, "Table of Language Identifiers.") Two exceptions to this rule are Traditional Chinese (predominant in Taiwan) and Simplified Chinese (predominant in mainland China), which share the same primary language and yet have different localized versions. The same is true for Brazilian Portuguese and European Portuguese.
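The split between primary language and sublanguage can be sketched in portable C. The functions below mirror the bit layout used by the Win32 MAKELANGID, PRIMARYLANGID, and SUBLANGID macros (primary language in the low 10 bits, sublanguage in the high 6 bits); the function names themselves are illustrative stand-ins so that the sketch compiles without the Windows headers.

```c
#include <assert.h>

/* Combines a primary language and a sublanguage into a language ID,
   using the same layout as the Win32 MAKELANGID macro. */
unsigned short make_langid(unsigned primary, unsigned sub)
{
    assert(primary <= 0x3FFu && sub <= 0x3Fu);
    return (unsigned short)((sub << 10) | primary);
}

/* Low 10 bits: the primary language (what PRIMARYLANGID extracts). */
unsigned primary_lang(unsigned short langid)
{
    return langid & 0x3FFu;
}

/* High 6 bits: the sublanguage (what SUBLANGID extracts). */
unsigned sub_lang(unsigned short langid)
{
    return (unsigned)langid >> 10;
}
```

With these helpers, 0x0404 (Traditional Chinese) and 0x0804 (Simplified Chinese) both yield the primary language 0x04 and differ only in the sublanguage; 0x0416 (Brazilian Portuguese) and 0x0816 (European Portuguese) likewise share the primary language 0x16. This is why comparing primary languages alone is not enough to pick the right satellite for these languages.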
Now that you have identified the default UI language of the operating system, you need to load the satellite corresponding to that language. This process can be simplified by using generic naming conventions for your resource DLLs. If you only have one or two resource files, the suggested approach is to name them with the language ID, such as Myres409.dll for the 0x409 English (United States) satellite DLL. If your application has several resource files, a more practical solution is to create subdirectories named with the corresponding language IDs and place all satellite DLLs for the same language together. Figure 6-5 shows the file distribution for the GlobalDev sample application, which can be found in the Samples subdirectory on the companion CD.
Figure 6-5 - File distribution for the GlobalDev sample application, where all resource files follow the convention GRes<LANGID>.dll.
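The naming convention can be sketched in portable C as a pair of helpers: one builds a satellite file name from a language ID, and the other recovers the language ID from a file name. Both function names are hypothetical, and the ".dll" extension check and case-insensitive prefix match are assumptions made for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if s begins with prefix, ignoring case. */
static int starts_with_nocase(const char *s, const char *prefix)
{
    for (; *prefix != '\0'; s++, prefix++)
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
    return 1;
}

/* Builds "<prefix><LANGID in hex>.dll", for example "Myres409.dll".
   Returns 1 on success, 0 if the buffer is too small. */
int satellite_file_name(char *out, size_t size,
                        const char *prefix, unsigned short langid)
{
    int n;

    assert(out != NULL && prefix != NULL);
    n = snprintf(out, size, "%s%X.dll", prefix, (unsigned)langid);
    return n >= 0 && (size_t)n < size;
}

/* Recovers the language ID from a name that follows the convention.
   Returns 0 if the name does not match "<prefix><hex digits>.dll". */
unsigned short langid_from_file_name(const char *name, const char *prefix)
{
    const char *digits;
    char *end = NULL;
    unsigned long value;

    assert(name != NULL && prefix != NULL);
    if (!starts_with_nocase(name, prefix))
        return 0;

    digits = name + strlen(prefix);
    if (!isxdigit((unsigned char)*digits))
        return 0;

    value = strtoul(digits, &end, 16);
    if (value == 0 || value > 0xFFFFul)
        return 0;

    /* Only accept the expected extension, with nothing after it. */
    if (!starts_with_nocase(end, ".dll") || end[4] != '\0')
        return 0;
    return (unsigned short)value;
}
```

For example, satellite_file_name with the prefix "Myres" and the language ID 0x0409 produces "Myres409.dll", and langid_from_file_name("GRes40C.dll", "GRes") yields 0x040C.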
With this approach, by extracting the LANGID part of the file name and then using it to call GetLocaleInfo, you can find out the native name of each language resource file. The following code sample detects all language resource files currently available, makes sure that the languages are supported, and displays the list to a user:
int nIndex = 0;
WIN32_FIND_DATA wfd;
HANDLE hFindFile;

// The naming convention for resource DLLs is as follows:
// GRes[LANGID].dll.
// Find all available resource DLLs in the current directory
// by enumerating gres*.* files.
hFindFile = FindFirstFile(TEXT("gres*.*"), &wfd);
if (hFindFile != INVALID_HANDLE_VALUE)
{
    do
    {
        LANGID wFileLang;
        TCHAR szLangName[32];

        // Skip first four letters ("GRes") of file name and convert
        // the rest to a LANGID.
        wFileLang = (LANGID) _tcstoul(wfd.cFileName + 4, NULL, 16);

        // Since more languages might be offered than the user has
        // support for, only list available and supported languages.
        if (IsValidLocale(wFileLang, LCID_INSTALLED))
        {
            // Get the native language name.
            GetLocaleInfo(MAKELCID(wFileLang, SORT_DEFAULT),
                LOCALE_SNATIVELANGNAME, szLangName, 32);

            // Add the new language to the list of UI languages.
            SendDlgItemMessage(hDlg, IDC_LANGUAGES, CB_INSERTSTRING,
                nIndex, (LPARAM) szLangName);
            nIndex++;
        }
    }
    // Look for the next resource DLL.
    while (FindNextFile(hFindFile, &wfd));

    // Release the search handle.
    FindClose(hFindFile);
}
In the previous code sample, adding new languages is transparent to the executable, since all you need to do is to put a new language resource DLL in the directory. The application enumerates the new file, finds its native language name, and adds this name to the list of supported languages. (See Figure 6-6.)
Figure 6-6 - List of available UI languages created by the previous code sample.
Just as creating an MUI solution has its rewards for Win32 applications, offering this type of solution for Web content will have its own payoffs. Multilingual Web content will obviously be able to reach more people. Multinational corporations with operations around the world can more effectively accommodate employees who speak different languages when an Internet or intranet site offers multilingual capabilities. These are just a few instances for which an MUI solution can help, though many more examples could be given.
Offering Web sites that provide content in the user's preferred language follows the same principles as providing multilingual content for Win32 applications:
Internet Explorer offers its own MUI solution (called "pluggable UI," which also uses satellite DLLs), where the browser UI language on English Windows 98, for example, can be set to Swedish. On Windows 2000 and Windows XP (which also offer their own MUI solution for the operating system UI), if the browser's MUI files are available, Internet Explorer tries to keep its UI language synchronized with the UI language of the operating system to avoid confusion.
Dynamic generation of language-specific Web interfaces from XML data is not hard; at least, it is no harder than using XML to store UI data for a single language. The traditional model is to create an XML file containing all necessary data, and then either bind the XML data to an HTML object on the page or apply an XSL transformation. The client gets a nicely formatted page, and your data, which is isolated from the code and UI, is easy to maintain. A multilingual scenario does not require much change to this traditional model. You will basically need to replace one single-language XML file with a set of language-specific files, which are loaded according to the settings of the client. An important step in building a UI according to the user's preferences is getting those preferences in the first place. There are several methods you can use:
The advantage of the last two methods is that you can use them at a session's initialization step (for example, in the Session_OnStart procedure of Global.asa). However, these two solutions might be less flexible than the first method, in which you query values in a client-side script. In effect, the last two methods are equivalent to making an educated guess about the user's preferred language. Suppose the browser language is set to Finnish and your Web pages are not available in that language. You might default to English, not realizing that the user's second language is Russian, not English. Furthermore, there is not much advantage to the last two methods if the Session object is disabled in the first place.
Rather than selecting one particular algorithm as the preferred solution for all scenarios, it is wiser to base your solution upon the specific business case at hand. As you will see in the sample code that follows, the decision on which language to use to display content is isolated within the GetSessionLanguage() function. In order to make GetSessionLanguage() universal, the function returns the language ID. However, it would be better to set the Session's UI language variable. As a general rule, you should set a language for each user's session. This way you allow different languages for different users, and you can confine your decision about which UI language to use to a single place: the initialization code. Some applications might avoid using Session states, in which case a language ID returned as a string comes in handy. Note also that GetSessionLanguage() cannot be made completely universal; it will depend on the naming conventions you adopt for the XML files in your product, the set of languages you have localized into, and the default language of your application.
The following code illustrates one possible mechanism for dynamic construction of a language-specific UI. To make things more general, it is not specified whether the code is executed in a Global.asa initialization of a session or later. Error handling was also removed from the code, but you should be able to handle errors that might arise, such as when the XML file for a particular language cannot be found.
<%@LANGUAGE = "VBScript" %>
<%
' Function GetSessionLanguage() is omitted for simplicity. It
' would be application-specific anyway. You can find a sample of
' this type of function in the Samples\MultiLanguageWeb\
' BrowserSniff subdirectory on the companion CD. The function
' implements language detection based on the value of
' HTTP_ACCEPT_LANGUAGE.
' It is assumed that the function returns a language ID string
' that is never empty.
' The default language of the site is returned if the language
' of the user's choice is not available.

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' InitializeUI() detects the language to be loaded,
' stores the language ID in a session variable,
' loads the proper resource XML file and
' language-independent XSL template,
' generates the UI, and writes it to the client.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub InitializeUI()
    ' Returns a non-empty string
    Session("ui_language") = GetSessionLanguage()

    ' If your languages cannot be presented in one code page, you
    ' can set the Session.Codepage and Response.Charset values
    ' here, based on the language.
    ' But always handling text in UTF-8 might be a better solution.
    Session.Codepage = 65001
    Response.Charset = "utf-8"

    Set Session("xmlData") = Server.CreateObject("Msxml.DOMDocument")
    Set Session("xmlStyle") = Server.CreateObject("Msxml.DOMDocument")
    Session("xmlData").load(Server.MapPath("xml/" & Session("ui_language") & "/uiRes.xml"))
    Session("xmlStyle").load(Server.MapPath("xml/uiTemplate.xsl"))
    Response.Write(Session("xmlData").transformNode(Session("xmlStyle")))
End Sub
...
%>
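The kind of decision that a function such as GetSessionLanguage() has to make can be sketched independently of ASP. The following portable C function is an illustration only, not the companion CD sample: it walks an Accept-Language style header from left to right and returns the first language the site supports, trying the full tag first and then its primary subtag. It assumes the header lists languages in order of preference and ignores q-values and wildcards; the name pick_language and its parameters are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns the supported entry equal to the first n characters of tag
   (ignoring case), or NULL if there is none. */
static const char *find_supported(const char *tag, size_t n,
                                  const char *const *supported, size_t count)
{
    size_t i, k;

    for (i = 0; i < count; i++) {
        if (strlen(supported[i]) != n)
            continue;
        for (k = 0; k < n; k++)
            if (tolower((unsigned char)tag[k]) !=
                tolower((unsigned char)supported[i][k]))
                break;
        if (k == n)
            return supported[i];
    }
    return NULL;
}

/* Picks the first supported language named in an Accept-Language style
   header such as "fi, ru;q=0.8, en;q=0.5". Returns fallback if nothing
   matches. */
const char *pick_language(const char *accept, const char *const *supported,
                          size_t count, const char *fallback)
{
    const char *p = accept;

    assert(supported != NULL || count == 0);
    while (p != NULL && *p != '\0') {
        const char *end, *hit;
        size_t len, primary;

        while (*p == ' ' || *p == ',')
            p++;
        end = p;
        while (*end != '\0' && *end != ',' && *end != ';' && *end != ' ')
            end++;
        len = (size_t)(end - p);

        if (len > 0) {
            /* Try the full tag ("fr-CA"), then the primary subtag ("fr"). */
            hit = find_supported(p, len, supported, count);
            if (hit == NULL) {
                primary = 0;
                while (primary < len && p[primary] != '-')
                    primary++;
                if (primary < len)
                    hit = find_supported(p, primary, supported, count);
            }
            if (hit != NULL)
                return hit;
        }

        /* Skip any parameters (such as ";q=0.8") up to the next comma. */
        p = end;
        while (*p != '\0' && *p != ',')
            p++;
    }
    return fallback;
}
```

For a site localized into English, Russian, and French, the header "fi, ru;q=0.8, en;q=0.5" yields "ru" rather than the English default, which is the situation described earlier for the Finnish user whose second language is Russian.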
The XML file can contain all the information needed for on-the-fly generation of localized content. If you need to apply directionality to your localized pages, use language-specific graphics or URLs. In the previous code sample, if English, French, German, and Japanese resources are defined, the structure of the "resource" tree looks like this. (See Figure 6-7.)
The assumption is that the same information is kept in each language version of the XML resources, thus preserving the same schema among the languages. With this model you can use the same XSL transformation on all of the files, which will simplify your code. Adding new language resources becomes easy as well.
Figure 6-7 - Structure of resource tree.
When dealing with MUI in the context of the .NET Framework, you will need to consider the role of the CurrentUICulture property, as well as how resources are generated and loaded. The following sections explore these topics.
The Microsoft .NET Framework offers a whole new approach to the way resource files are created and loaded. Through the support offered by the CLR, you can provide an MUI solution in your .NET applications much more easily. Moreover, with the .NET Framework you can implement MUI in a way that is practically transparent. The CurrentUICulture property is a vital part of resource handling.
In the new naming conventions of the .NET Framework, the CurrentUICulture property of the CultureInfo class (from the System.Globalization namespace) is a per-thread setting. You can retrieve CurrentUICulture by querying the Thread.CurrentUICulture property, and you can change CurrentUICulture by setting Thread.CurrentUICulture. Thread.CurrentUICulture is used by the ResourceManager class to look up culture-specific resources at run time.
The CultureInfo class specifies a unique name for each culture based on the Request for Comments (RFC) 1766 standard. This standard uses the format <languagecode2>-<country/regioncode2>, where <languagecode2> is a lowercase two-letter culture code associated with a language, derived from International Organization for Standardization (ISO) 639-1, and where <country/regioncode2> is an uppercase two-letter subculture code derived from ISO 3166. As discussed in Chapter 4, the invariant culture, which is not associated with any particular language, country, or region, is the root of all cultures. A neutral culture is associated with a language and can be used for resources. (The neutral culture is the equivalent of the primary language ID in the Win32 programming paradigm.) For instance, "fr" is a neutral culture indicating that the language you are dealing with is French, but the neutral culture makes no provision about the actual location (country or region) in which French is being used (France, Belgium, Canada, and so on).
A specific culture is associated with both a language and a region and provides formatting-specific information for that particular location. So, for example, "fr-FR" indicates that you are dealing with the French language and with regional settings that are appropriate for France. (See Chapter 4.) Figure 6-8 summarizes this hierarchy, using German and English as examples. Within the neutral culture of German, for instance, are German in Austria, Switzerland, Germany, Liechtenstein, and Luxembourg. The neutral cultures, German and English, are both part of the invariant culture.
Figure 6-8 - Hierarchy of invariant, neutral, and specific cultures.
The CurrentUICulture property is set by the framework when the application starts so that it matches the user's default UI language. Alternatively, you can set CurrentUICulture explicitly in your application's code. The following code example sets the CurrentUICulture property to the neutral culture "de" for German.
Thread.CurrentThread.CurrentUICulture = new CultureInfo("de");
You can also set the CurrentUICulture property to a specific culture. The following code example sets the CurrentUICulture property to the specific culture "de-DE" for German in Germany:
Thread.CurrentThread.CurrentUICulture = new CultureInfo("de-DE");
As you have seen, the CurrentUICulture property can help you handle resources in a culture-specific manner. The following sections will show how resources are created and loaded.
As in the Win32 programming model, in the .NET Framework your resources should be separated from the rest of your code. Resources in .NET are defined in .resx files; these files consist of XML-based entries that specify objects and strings within XML tags. One advantage of a .resx file is that it can be opened with a text editor (such as Notepad or Microsoft Word) and then written to, parsed, and manipulated.
When viewing a .resx file, you can actually see the binary form of an embedded object (a picture, for example) when this binary information is a part of the resource manifest. Apart from this binary information, a .resx file is completely readable and maintainable. A .resx file contains a standard set of header information, which describes the format of the resource entries and specifies the versioning information for the XML that is used to parse the data. Following the header information, each entry is described as a name/value pair. A name/value pair in the .resx format is wrapped in XML code, which describes string or object values. When a string is added to a .resx file, the name of the string is embedded in a <data> tag, and the value is enclosed in a <value> tag.
The resource generation process is straightforward and includes three steps (although the first and second steps are optional):
Figure 6-9 - The .NET resource generation process.
The ResourceManager class provides convenient access to culturally appropriate resources at run time. This class manages multiple resources from a common source that has a particular root name, like the hierarchy shown earlier in Figure 6-8. ResourceManager objects provide fallback resource lookup to region-independent and neutral cultures when specific localized resources are not provided. Resource-loading requests are based on the culture that is associated with the current thread. This culture can be set through the Thread.CurrentThread.CurrentUICulture property.
The ResourceManager class provides two methods, GetString and GetObject, which enable an application to load either a string resource or any object (that can be serialized) from an assembly. Both the GetString and the GetObject methods support two types of overloads:
If the resource is not localized for the culture that is requested (either explicitly through the second overload or implicitly, by using CurrentUICulture), the lookup will fall back using the culture's Parent property, stopping after looking in the default resources.
When you package your application's resources, you must name them using the resource-naming conventions that the CLR expects. The run time identifies a resource by its culture signature, or name, as discussed in "CurrentUICulture" earlier in this chapter. The .NET Framework uses a hub and spoke model to package and deploy resources. This model requires that you place resources in specific locations, so that they can be easily located and used. If you do not compile and name resources as expected, or if you do not place them in the correct locations, the CLR will not be able to locate them.
If your application includes resources for specific cultures (such as "de-DE"), place each specific culture in its own directory. Do not place specific cultures in subdirectories of their respective neutral culture's directory. (For example, do not put the "de-DE" resources in the "de" folder.) If you do so, the application will not be able to find the localized resource.
The hub and spoke model for packaging and deploying resources uses a fallback process to locate appropriate resources. If an application user requests a ResourceSet that is unavailable, the CLR searches the hierarchy of cultures looking for an appropriate fallback resource that most closely matches the user's request, and raises an exception only as a last resort. At each level of the hierarchy, if an appropriate resource is found, the run time uses it. If the resource is not found, the search continues at the next level.
For instance, if an application is localized in French using the "fr-FR" culture for its resources satellite assembly, and if the application is looking for a string that cannot be found in "fr-FR," it will then fall back to the French culture (if available) and try to find the string in the "fr" satellite assembly. If the string is not available at that level, it will then fall back to the default resources in the main assembly. Finally, if the string resource is not found at that level, an exception will be raised. Figure 6-10 illustrates the fallback process.
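The order of this search can be sketched in plain C rather than with the .NET API. The sketch below is a simplified model and not how the CLR is implemented: it assumes that a culture's parent can be found by cutting the name at its last hyphen ("fr-FR" becomes "fr", and "fr" becomes the empty name, which stands for the default resources), whereas the real parent relationship is defined by the CultureInfo.Parent property. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct entry {
    const char *key;
    const char *value;
};

/* One set of resources; culture "" stands for the default resources. */
struct resource_set {
    const char *culture;
    const struct entry *entries;
    size_t count;
};

/* Looks up key for the given culture, falling back from the specific
   culture to the neutral culture and then to the default resources.
   Returns NULL where the CLR would raise an exception. */
const char *lookup_string(const struct resource_set *sets, size_t set_count,
                          const char *culture, const char *key)
{
    char name[32];
    size_t i, j;

    assert(sets != NULL && culture != NULL && key != NULL);
    if (strlen(culture) >= sizeof name)
        return NULL;
    strcpy(name, culture);

    for (;;) {
        char *dash;

        /* Search the resource set for the current level, if one exists. */
        for (i = 0; i < set_count; i++) {
            if (strcmp(sets[i].culture, name) != 0)
                continue;
            for (j = 0; j < sets[i].count; j++)
                if (strcmp(sets[i].entries[j].key, key) == 0)
                    return sets[i].entries[j].value;
        }

        if (name[0] == '\0')
            return NULL;        /* default resources already searched */

        /* Move one level up the hierarchy. */
        dash = strrchr(name, '-');
        if (dash != NULL)
            *dash = '\0';       /* "fr-FR" -> "fr" */
        else
            name[0] = '\0';     /* "fr" -> default resources */
    }
}
```

Given sets for "fr-FR", "fr", and the default resources, a key present only in the "fr" set is still found when the requested culture is "fr-FR", and a culture with no resources at all (for example, "de-DE") ends up with the default strings.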
Figure 6-10 - The .NET resource fallback mechanism.
The final MUI solution to consider applies to console or text-mode applications. Again, you must think about how resources are handled and loaded within this particular context.
A Win32-based console application finds and loads resources in the same manner as any Win32 code. However, when handling resources in console (or text-mode) code, you must take into account some specifics of the Windows console subsystem. An obvious requirement for any code is that it produce a UI that is easily readable. Additionally, the UI language and settings must be based on the user's preferences. Unfortunately, there is no guarantee that the UI language can be represented in the console output code page. If it cannot, loading and displaying resources in the language of the user's UI does no good, because the console will not be able to display the output.
The easiest way to prevent the UI from breaking is to make resource loading dependent upon the console code page. Picking which language resource files to load can be tricky, though. One possible method, which can be applied to many scenarios, involves the following steps:
The advantage of loading English resources is that they can always be displayed; but you should be aware that, by doing so, you are losing cultural accuracy. You can also select another approach, such as setting the language that you are going to load to match the console code page, regardless of the UI language. The catch here is that the UI language is a per-user setting, and the console code page follows the system locale; the user cannot always change the system locale. Remember also that setting the ThreadLocale affects locale-sensitive operations such as date formatting, and should be used cautiously. If you use multilingual resource sections and change the thread-locale value to select the language to load, do it only for the time resources are loaded, and restore the original value after that.
Console applications depend heavily on text resources, often building the output at run time from string table entries, using the functions of the printf family for run-time parameter substitution. In many cases, this practice makes localizability of the code difficult. When multiple parameters have to be substituted at run time, their relative positions might differ from one language to another, but the order of the printf arguments cannot be changed without code modifications. Using message tables together with the FormatMessage API eliminates this problem.
Message tables are a Win32 resource that uses sequential numbers rather than escape letters to mark replacement parameters, making it convenient to store messages that contain several (up to 99) replacement parameters. The FormatMessage API function will substitute variables according to each placemarker's numeric label and not according to its position in the string. Localizers can freely change a string's word order, and FormatMessage will still return correct results.
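To see why numbered placemarkers make word order irrelevant, consider the following portable C sketch of the substitution step. It is a deliberately simplified stand-in for illustration, not the FormatMessage implementation: it handles only string inserts of the form %1 through %99 and the %% escape, and the function name format_inserts is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Expands %1..%99 in tmpl using args[0..nargs-1]; "%%" yields a literal
   percent sign. Returns the length written, or -1 if the buffer is too
   small or an insert number has no matching argument. */
int format_inserts(char *out, size_t size, const char *tmpl,
                   const char *const *args, size_t nargs)
{
    size_t used = 0;

    assert(out != NULL && tmpl != NULL && size > 0);
    while (*tmpl != '\0') {
        const char *piece = tmpl;   /* default: copy one character */
        size_t len = 1;

        if (tmpl[0] == '%' && isdigit((unsigned char)tmpl[1])) {
            /* Numbered insert: the number, not the position, picks
               the argument. */
            size_t index = (size_t)(tmpl[1] - '0');

            tmpl += 2;
            if (isdigit((unsigned char)*tmpl))
                index = index * 10 + (size_t)(*tmpl++ - '0');
            if (index == 0 || index > nargs)
                return -1;
            piece = args[index - 1];
            len = strlen(piece);
        } else if (tmpl[0] == '%' && tmpl[1] == '%') {
            tmpl += 2;              /* copy a single '%' */
        } else {
            tmpl += 1;
        }

        /* Always leave room for the terminating null character. */
        if (len >= size - used)
            return -1;
        memcpy(out + used, piece, len);
        used += len;
    }
    out[used] = '\0';
    return (int)used;
}
```

With the same argument array in both cases, a template such as "%1 is a %2 image." and a translated template that mentions %2 before %1 each receive the right values, so the calling code does not change when a localizer reorders the sentence.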
Message tables are defined using message compiler (.mc) files and are compiled into resource files using the message compiler, mc.exe. The format of the message table is designed so that multilingual error messages are easier to interpret. To accomplish this, the message table includes flags for error severity and for the facility that caused an error. For instance, the message text file contains a header that defines names and language identifiers used by the message definitions in the body of the file. The header can contain the following statements:
These flags are optional and can be omitted if the message table is used to store ordinary text resources. The following code sample illustrates how multilingual message tables with no "redundant flags" are created:
;// SAMPLE.MC
LanguageNames = (English=0x409:MSG00409)
LanguageNames = (German=0x407:MSG00407)

MessageId=1
SymbolicName=IDS_NOFILE
Language=English
Cannot open file %1
.
Language=German
Die Datei %1 kann nicht geöffnet werden.
.

MessageId=2
SymbolicName=IDS_OTHERIMAGE
Language=English
%1 is a %2 image.
.
Language=German
%2-Abbild ein %1 ist.
.
The following code shows how the previous message table can be used in a program. The language ID here is set to 0, so the resources will be loaded according to the ThreadLocale settings.
TCHAR lpBuf[60];
LPVOID lppArgs[10];

// Fill lppArgs with the insert strings before making the call.
DWORD len = FormatMessage(
    FORMAT_MESSAGE_FROM_HMODULE | FORMAT_MESSAGE_ARGUMENT_ARRAY,
    NULL,                            // Message source - NULL stands for current module
    idMsg,                           // ID of the message to be loaded
    0,                               // Language ID
    lpBuf,                           // Destination buffer
    sizeof(lpBuf) / sizeof(TCHAR),   // Size of the buffer, in TCHARs
    (va_list *) lppArgs              // Array of message inserts - list of strings
);
Important
Also, most C run-time text display and input APIs (depending on the C run-time implementation) are still code-page-dependent. It's strongly suggested that you use Windows console APIs for display and input in order to take advantage of an interface that is fully Unicode-based.