Internationalization and Localization


Often you may need to provide localized versions of your application or component, so that it can be used effectively in environments where U.S. English is not the primary language. You have probably already seen similar requirements with existing projects; the requirement to provide local versions of applications is certainly not unique to .NET.

Definitions

Deploying a component for other locales involves two steps: internationalization and localization.

Internationalization (also known as i18n, because there are 18 letters between the initial i and final n in the word "internationalization") enables a program to work effectively in all supported locales. Traditionally, it entails two steps:

  • Converting the application to use Unicode or other "non-ASCII" string technology. Fortunately, you can usually avoid this step in .NET; all .NET strings are implemented using Unicode, so they are automatically capable of storing all characters supported by .NET.

  • Replacing all embedded strings with string resources. You should be able to change the culture used by an application without changing the application source code. Suppose your application has code similar to the following:

     MsgBox("Can not open the file") 

    You must change this code so that the specific text in the message box can be translated without changing the source code itself. This modification generally involves some sort of string resource concept: messages are given IDs (often numeric), and the source code refers to these messages by their IDs. The ID-to-string mapping is maintained elsewhere and can be localized independently of the application source code.

    .NET applications must perform this chore just like any other application framework. Although using string literals directly in your source code is a common and very simple practice, it does not translate into localizable applications.
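The ID-to-string idea can be sketched in a few lines. This is only an illustration of the concept, not the .NET resource mechanism described later in this section: the MessageCatalog class, its message ID, and the French translation are hypothetical names invented for the sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration: message IDs mapped to text, per language.
public static class MessageCatalog
{
    public const int CannotOpenFile = 1;

    // The ID-to-string tables live apart from the application logic,
    // so a translator can add a language without touching that logic.
    private static readonly Dictionary<string, Dictionary<int, string>> Tables =
        new Dictionary<string, Dictionary<int, string>>
        {
            { "en", new Dictionary<int, string> { { CannotOpenFile, "Cannot open the file" } } },
            { "fr", new Dictionary<int, string> { { CannotOpenFile, "Impossible d'ouvrir le fichier" } } }
        };

    // Look up a message by ID, falling back to English when the
    // requested language has no table or no entry for the ID.
    public static string Get(int id, string language)
    {
        Dictionary<int, string> table;
        string text;
        if (Tables.TryGetValue(language, out table) && table.TryGetValue(id, out text))
            return text;
        return Tables["en"][id];
    }

    public static void Main()
    {
        Console.WriteLine(Get(CannotOpenFile, "fr"));
        Console.WriteLine(Get(CannotOpenFile, "de")); // no German table: English text
    }
}
```

The calling code refers only to the ID, so adding a language means adding a table rather than editing application logic.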

Localization (also known as l10n, for the same reasons that internationalization is called i18n) is the process of actually translating your application. Unlike the internationalization step, localization must be performed once for every translated version of the application. Often the developers of the application do not perform this step; instead, experts in the target culture take care of localization.

Existing Technologies: Separation of Code and User Interfaces

The problem of providing localized versions of an application predates .NET, so not surprisingly a number of strategies have evolved over time to solve it. Obviously, programmers want to minimize the effort required to create localized applications, both during initial development of the application and during ongoing maintenance and localization throughout the application's life.

A time-honored technique for dealing with this requirement is to separate the code (that is, the application logic) from any user interface (that is, anything displayed to or collected from the user). Traditional Windows developers have had a number of tools at their disposal to assist in the development of such applications. Windows allows resources (dialog or menu definitions, text strings, or any binary resource) to be embedded directly inside executable and DLL files, and this approach remains the most popular technique for localization. Typically, an application vendor will provide a DLL or executable file with the application code containing the resources for the application's primary language (generally U.S. English). The vendor will then create separate DLL files containing resources for each supported language (that is, for each localization).

.NET Localization Concepts

The building blocks for localization in .NET are assemblies and cultures. This chapter has already discussed assemblies. When applications are localized, it is common practice to split resources from the main code and build them into separate culture-specific assemblies (known in .NET as "satellite assemblies").

The class library provides some helper classes for locating resources from these localized assemblies. If you structure your resources correctly, .NET will search for the "best" resource for the given culture. If it cannot locate a resource specific to that culture, it will use the default (or culture-neutral) resource.

Cultures

A culture is defined as the user's language, and optionally the user's location. Cultures are identified by culture tags of the format primary[-secondary], where the primary tag is the language code and the optional secondary tag is the country/region code. The language codes are defined in ISO 639, "Codes for the Representation of Names of Languages"; the country/region codes are defined in ISO 3166, "Codes for the Representation of Names of Countries." Valid codes include "en" (specifying a primary language tag of English and no secondary tag) and "en-AU" (specifying English plus a secondary country/region code of Australia).
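As a minimal sketch of how these tags surface in code, the following uses the CultureInfo and RegionInfo classes from System.Globalization to split a tag into its language and country/region parts. The CultureTagDemo class and its Describe method are hypothetical names used only for this illustration.

```csharp
using System;
using System.Globalization;

public static class CultureTagDemo
{
    // Describe a culture tag: its language code and, when the tag
    // names a specific culture, its country/region code.
    public static string Describe(string tag)
    {
        CultureInfo ci = new CultureInfo(tag);
        string language = ci.TwoLetterISOLanguageName;
        if (ci.IsNeutralCulture)
            return language + " (no region)";
        RegionInfo region = new RegionInfo(ci.Name);
        return language + " + " + region.TwoLetterISORegionName;
    }

    public static void Main()
    {
        Console.WriteLine(Describe("en"));
        Console.WriteLine(Describe("en-AU"));
    }
}
```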

If the end user's culture is known, an application can take advantage of a large set of user preferences (for example, date formatting, day and month names, currency symbols, and so forth). Furthermore, the application can determine the most appropriate resources for that user.

For many of the culture-specific preferences, .NET will provide high-level functions that allow you to ignore the intricacies of these preferences. For example, if you need to display a date to the user, you should never build your own date format. While it may be tempting to write code similar to

 Console.WriteLine("Today is {0}/{1}/{2}", dt.Month, dt.Day, dt.Year);

this code works correctly only in the United States. You should always attempt to locate a function that does culture-aware formatting, such as the following:

 Console.WriteLine("Today is {0:d}", dt);

This code will automatically apply the appropriate separators and day-month-year ordering for the user's culture.
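To make the effect visible without changing the machine's settings, the sketch below formats one date under two explicitly named cultures. The DateFormatDemo class and ShortDate helper are hypothetical. The exact separators and padding come from the culture data installed on the system, so the comments describe only the ordering.

```csharp
using System;
using System.Globalization;

public static class DateFormatDemo
{
    // Format a date with the short date pattern ("d") of a named culture.
    public static string ShortDate(DateTime dt, string cultureName)
    {
        return dt.ToString("d", CultureInfo.GetCultureInfo(cultureName));
    }

    public static void Main()
    {
        DateTime dt = new DateTime(2002, 12, 25);
        Console.WriteLine(ShortDate(dt, "en-US")); // month first
        Console.WriteLine(ShortDate(dt, "en-AU")); // day first
    }
}
```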

To assist you in using your own localized resources, .NET defines the concept of a satellite assembly and provides the ResourceSet and ResourceManager classes. A satellite assembly contains only resources for a specific culture; as noted earlier, it is created during the localization process. Each supported culture generally has its own satellite assembly, supplementing the primary, or parent, assembly. The ResourceSet and ResourceManager classes, along with some file and directory naming conventions for these satellite assemblies, automatically provide the most culturally appropriate resources for the application.

Resource Fallback Process

.NET provides some naming and directory structure conventions that the resource fallback process uses to provide this cultural sensitivity. File naming conventions apply to satellite assemblies: for the parent assembly MyApp.dll, for example, satellite assemblies are named MyApp.resources.dll. Directory naming conventions apply to each culture-specific version of the satellite assembly, that is, to the directory containing each MyApp.resources.dll file.
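The two conventions combine into a predictable file location. The following sketch only builds that path as a string; it does not load anything. The SatellitePathDemo class, the SatellitePath helper, and the "app" directory are hypothetical names chosen for the illustration.

```csharp
using System;
using System.IO;

public static class SatellitePathDemo
{
    // Build the conventional location of a satellite assembly:
    // <application directory>/<culture name>/<parent name>.resources.dll
    public static string SatellitePath(string appDirectory, string parentAssembly, string cultureName)
    {
        string baseName = Path.GetFileNameWithoutExtension(parentAssembly);
        return Path.Combine(Path.Combine(appDirectory, cultureName), baseName + ".resources.dll");
    }

    public static void Main()
    {
        Console.WriteLine(SatellitePath("app", "MyApp.dll", "en-AU"));
    }
}
```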

The resource fallback process aims to always provide some resource, even if it cannot find an exact match for the requested culture. For most applications, a default (or neutral) set of resources (most often English resources) is built into the primary assembly. If the search of satellite assemblies for a culture-specific version of a resource fails, this culture-neutral resource is generally better than no resource at all!

The process of locating a resource involves the following steps:

  1. Search the GAC for the exact satellite assembly. If the named assembly exists in the cache with the exact culture specified and the resource exists in the assembly, use this resource. Otherwise, the search continues.

  2. Search the directory of the currently executing assembly for a directory with the same name as the specified culture. If the named assembly exists in this directory and the resource exists, use this resource. Otherwise, the search continues.

  3. Search the GAC again for the "parent culture" of the specified culture. If a resource is found, use it. The "parent culture" is the fallback culture for the specified culture. For example, if the specified culture is "fr-CA" (for French, Canada), the parent culture is "fr," meaning region-independent French. There may be many parents: although any culture can have only one parent, this parent may itself have a parent, and so on.

  4. Search the directory of the currently executing assembly again, this time for the parent culture. If the resource is found, use it. Otherwise, the search continues.

  5. Repeat Steps 3 and 4, using the next parent culture if one exists.

  6. If no other resource is available, use the default resource. If no default resource is available, throw an exception.
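The order in which cultures are tried can be sketched by walking the CultureInfo.Parent chain. This illustration lists only the cultures that would be searched; it does not probe the GAC or any directory. The FallbackChainDemo class and the "(default)" label are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class FallbackChainDemo
{
    // List the cultures tried, in order, for a requested culture:
    // the culture itself, then each parent, then the default resources.
    public static List<string> FallbackChain(string cultureName)
    {
        List<string> chain = new List<string>();
        CultureInfo ci = CultureInfo.GetCultureInfo(cultureName);
        // The invariant culture has an empty name and ends the parent chain.
        while (ci.Name.Length > 0)
        {
            chain.Add(ci.Name);
            ci = ci.Parent;
        }
        chain.Add("(default)");
        return chain;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" -> ", FallbackChain("fr-CA").ToArray()));
    }
}
```

For "fr-CA" the chain is the specific culture, then region-independent French, then the default resources.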

Example: A Localized Application

Let's demonstrate these concepts with a simple, contrived example. This example is kept as small as possible to maintain a focus on the key internationalization concepts rather than the details of the application itself. The simple application loads two strings (one to say "Hello" and one to say "Goodbye") and prints them to the console. It supports a command-line option to specify the culture to use for the strings.

The C# code that implements this application is quite simple; the source file, UseCulture.cs, is shown in Listing 5.5. The .NET class library hides the complexities of locating the correct satellite assembly.

Listing 5.5 A simple localized application
 using System;
 using System.Reflection;
 using System.Resources;
 using System.Globalization;

 class UseLocale
 {
   public static void Main(string[] args)
   {
     string value;
     // First set up the default culture.
     CultureInfo ci = null;
     // If a command-line argument has been given,
     // assume it is a culture name.
     if (args.Length > 0)
     {
       ci = new CultureInfo(args[0]);
       Console.WriteLine("(using culture {0})", ci);
     }
     // Create the resource manager that does
     // all the hard work.
     ResourceManager rm = new ResourceManager("strings",
                       Assembly.GetExecutingAssembly());
     // Load the specific string for the culture.
     value = rm.GetString("hello", ci);
     Console.WriteLine(value);
     value = rm.GetString("goodbye", ci);
     Console.WriteLine(value);
   }
 }

The program in Listing 5.5 supports specification of a culture on the command line. If no command-line option is given, the CultureInfo variable remains null; in that case ResourceManager looks up resources for the current UI culture (CultureInfo.CurrentUICulture) and falls back to the culture-neutral resources when no satellite assembly matches. Next, the code creates a resource manager for the resources identified by the name strings. It then loads two string values, using the keys hello and goodbye, respectively.

This is all well and good, but it implies the existence of a resource named strings that contains the keys hello and goodbye. Let's see how they are defined.

Defining Resources

The .NET Framework comes with a tool called the Resource Generator (resgen.exe), which takes a text file and creates a resource file suitable for use with the ResourceManager class. This module is then linked or embedded into the assembly. As discussed previously, the culture-neutral resources are provided in the primary (or parent) assembly, and a satellite assembly is created for each supported culture.

The format of the text file is quite simple. All lines beginning with the semicolon symbol (;) are considered comments. Every other line follows the format key=value, where key is used by the source code to extract the text, and value is the string returned.
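To make the format concrete, here is a small reader for it. This is a sketch of the format as described above, not a description of how resgen.exe itself parses files (the real tool may handle escapes and other details differently). The StringsFileDemo class and Parse method are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class StringsFileDemo
{
    // Parse lines of the form key=value, skipping blank lines and
    // lines that begin with a semicolon.
    public static Dictionary<string, string> Parse(IEnumerable<string> lines)
    {
        Dictionary<string, string> result = new Dictionary<string, string>();
        foreach (string raw in lines)
        {
            string line = raw.Trim();
            if (line.Length == 0 || line[0] == ';')
                continue;
            int eq = line.IndexOf('=');
            if (eq <= 0)
                continue; // not a key=value line; ignored in this sketch
            result[line.Substring(0, eq).Trim()] = line.Substring(eq + 1);
        }
        return result;
    }

    public static void Main()
    {
        string[] lines = { "; a comment", "hello=Hello", "goodbye=Goodbye" };
        foreach (KeyValuePair<string, string> entry in Parse(lines))
            Console.WriteLine("{0} -> {1}", entry.Key, entry.Value);
    }
}
```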

The text file for the culture-neutral strings is shown in strings.txt (see Listing 5.6). Listing 5.6 includes a few comment lines, followed by trivial definitions for hello and goodbye. As you can see from the C# code in Listing 5.5, these definitions act as the keys used in the source code.

Listing 5.6 strings.txt holding culture-neutral string resources
 ; Sample strings for the localization example.
 ; Define 2 strings: hello and goodbye.
 ; The other culture-specific string files may override one
 ; or both of these strings.
 hello=Hello
 goodbye=Goodbye

Building the Assembly

As with the other examples in this chapter, let's use a makefile to build the assembly. The complete makefile is included with the example source code, but Listing 5.7 shows the relevant entries for the primary assembly.

Listing 5.7 makefile portion that builds the primary assembly
 UseCulture.exe: UseCulture.cs strings.resources
     csc /resource:strings.resources UseCulture.cs

 strings.resources: strings.txt
     resgen strings.txt strings.resources

The code in Listing 5.7 results in a single assembly, UseCulture.exe, that is built from a single source file and that has one embedded resource file.

The first block indicates that building UseCulture.exe depends on UseCulture.cs (the source code) and strings.resources (a file containing the compiled string resources). The C# compiler is invoked, and the string resources are embedded in the assembly.

The second block indicates how strings.resources is built. It depends on strings.txt (the message definition file) and invokes the Resource Generator tool to generate the compiled file. In this example, it is important that the resource name be strings.resources, as strings is the name passed to the ResourceManager class that locates the resources. Executing resgen.exe with no arguments will print usage options for the tool.

Defining a Satellite Assembly

To complete the example, you must define a satellite assembly for at least one culture. As two of the authors of this book are Australian, we decided that it is appropriate to support our own unique culture (the en-AU culture) in our own application.

As described in the "Resource Fallback Process" section, the simplest way of achieving this result is first to create a subdirectory with the name of the supported culture, and then to place a satellite assembly in that directory. That is exactly the approach followed here.

As this satellite assembly contains only resources, no C# code is required. All that is needed is a localized strings.txt file. Listing 5.8 shows the Australian version of strings.txt.

Listing 5.8 en-AU\strings.txt holding localized string resources
 ; Sample strings for the localization example.
 ; Australian English version - override only hello.
 hello=G'Day

In Listing 5.8, only the hello string is overridden. If all goes according to plan, the culture-neutral version of goodbye should still be used for this culture.

To build the satellite assembly, the makefile contains the entries shown in Listing 5.9.

Listing 5.9 makefile portion that builds the satellite assembly
 en-AU/UseCulture.resources.dll: en-AU/strings.txt
     resgen en-AU/strings.txt en-AU/strings.en-AU.resources
     al /embed:en-AU/strings.en-AU.resources \
        /culture:en-AU /out:en-AU/UseCulture.resources.dll

As you can see, the target to be built is an assembly named UseCulture.resources.dll in the en-AU directory. As mentioned previously, both the name of the file and the directory are critical for this example to work. To create the assembly itself, you must perform two discrete steps:

  1. Compile the string resources.

  2. Create the assembly itself.

Let's use the Resource Generator tool to create the module, and then the Assembly Linker tool to create the final assembly. Note that the module name is strings.en-AU.resources; once again, this name is critical to ensure that the ResourceManager can locate the resource.

At this point, running nmake should produce the UseCulture.exe and en-AU\UseCulture.resources.dll assemblies. You can then run the example code to ensure that it works as expected.

First, let's execute the application with no command-line arguments. On a system whose UI culture has no matching satellite assembly, this should cause the application to use the culture-neutral resources:

 C:\> UseCulture.exe
 Hello
 Goodbye

To test the Australian culture, pass en-AU on the command line:

 C:\> UseCulture.exe en-AU
 (using culture en-AU)
 G'Day
 Goodbye

You can specify any valid culture, even if no direct support for that culture is available. For example, specifying fr-CA for Canadian French will still result in the culture-neutral resource being used:

 C:\> UseCulture.exe fr-CA
 (using culture fr-CA)
 Hello
 Goodbye

If you specify an invalid culture string, the program will throw an exception. Obviously, a robust application should catch this exception and insist that the user specify a valid culture string.
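One way to handle this is sketched below. It assumes that the failure surfaces as an ArgumentException or a subclass of it, which is how the CultureInfo constructor reports an unusable name. Which names are rejected depends on the runtime and operating system, and some runtimes accept any well-formed tag. The CultureArgDemo class and TryParseCulture helper are hypothetical.

```csharp
using System;
using System.Globalization;

public static class CultureArgDemo
{
    // Return the culture named by the argument, or null when the
    // name is rejected as invalid.
    public static CultureInfo TryParseCulture(string name)
    {
        try
        {
            return new CultureInfo(name);
        }
        catch (ArgumentException)
        {
            return null;
        }
    }

    public static void Main(string[] args)
    {
        string name = args.Length > 0 ? args[0] : "en-AU";
        CultureInfo ci = TryParseCulture(name);
        if (ci == null)
            Console.WriteLine("'{0}' is not a valid culture name.", name);
        else
            Console.WriteLine("(using culture {0})", ci);
    }
}
```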

Note that in the general case, passing a null CultureInfo is not the clearest choice. When the culture argument is null, ResourceManager.GetString looks up resources for the current UI culture (CultureInfo.CurrentUICulture), so the result depends on the user's settings without the code saying so. To make that choice explicit, change the line of code from

 CultureInfo ci = null;

to

 CultureInfo ci = CultureInfo.CurrentUICulture;

To request the culture-neutral resources regardless of the user's settings, use CultureInfo.InvariantCulture instead. (The example program used null simply for purposes of demonstration.)

Onward: Application Domains

This concludes the overview of creating assemblies. This section has shown how the use of assemblies and assembly caches attempts to solve many of the issues associated with versioning of applications. The next section returns to the concept of application domains, which were introduced in Chapter 4. Assemblies and application domains are tied together in a simple way: at runtime, assemblies are always loaded into application domains. Now that you have a better understanding of assemblies, it should be easier to see their relationship to application domains.



Programming in the .NET Environment
ISBN: 0201770180
Year: 2002
Pages: 146
