Globalization of the Test

The first step in testing the world-readiness of a product is to verify that its functionality is globalized. As you've seen, this is done by modifying legacy test cases, that is, by globalizing the test. Globalization represents functionality that is an intrinsic, inseparable part of a system or product. Testing every system component of multiple products separately for globalization would require far too many resources and hours. By globalizing the test, you instead check for world-readiness at the same time you check the general functionality of a core product's entire system.

Although Part II, "Globalization," explored the concept of globalization in depth, in order to help you verify that a product's functionality has been globalized, some central ideas of globalization are restated here. Among these ideas, an application or a component is considered globalized if:

  • It is Unicode-based; that is, any Unicode text can be handled successfully.
  • Correct encodings are selected for text conversions, if conversion is needed to interact with legacy systems.
  • The code follows the user's culture-specific settings.

Whenever a piece of code handles text or deals with locale-sensitive functionality, it has to be tested for proper functionality from the perspective of world-readiness. Code that has been globalized can process data (text) in multiple scripts and can adjust itself to the environment (in terms of locale settings, for instance). Verifying functionality can be planned as a separate test pass, but it is better to fit it into regular functional testing. (For a detailed description of steps required for code globalization, see Part II.)

Practical guidelines for globalization of your test go beyond breaking the code-page dependency in the test data and checking the cultural accuracy (which means matching the user-locale settings). To make testing more effective in searching for globalization bugs, target specific functionality and create an environment where this functionality is likely to break. Here are some of the areas of functionality that testers should verify:

  • Multilingual text processing with no data loss. Unicode data that does not match locale settings should never be converted to single-language encodings, hence preserving all data.
  • Correct encoding conversions. When the data, for certain design reasons, needs to be converted to or from a nonmultilingual encoding, the correct encoding should be selected when multiple choices are possible. These choices include Original Equipment Manufacturer (OEM) code pages versus Windows code pages, UTF-8 encoding versus OEM encoding, and so on. For double-byte text handling, when the source or target encoding is multibyte, bugs in code that handles text in multibyte encodings should be visible even in an otherwise globalized application. Also, make sure that all conversions to non-Unicode encodings are justified; do not let incorrect assumptions about encoding limitations corrupt your text data.
  • Proper handling of locale settings. The system locale should affect only encoding conversions. User-locale settings should define the date, time, number and currency formatting, calendar and sorting settings, and nothing else. Input locales should be properly handled, such that the input locale can be selected for any supported language, and this language can be entered.

(See "Creating the Test Environment" later in this chapter for specific ways to target and detect the problems just listed.)
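The "no data loss" check in the list above can be sketched in a few lines of Python. This is only an illustration, not part of any product under test: it assumes Windows-1252 (`cp1252`) as the single-language encoding, and the helper name `survives_code_page` is hypothetical.

```python
def survives_code_page(text: str, code_page: str) -> bool:
    """Return True if text round-trips through code_page unchanged."""
    # errors="replace" mimics a lossy conversion: unmappable characters become "?".
    converted = text.encode(code_page, errors="replace").decode(code_page)
    return converted == text

sample = "Grüße, Привет"  # Latin text followed by Cyrillic text

# Converting to a Western European code page destroys the Cyrillic part.
lossy = sample.encode("cp1252", errors="replace").decode("cp1252")
print(lossy)

print(survives_code_page(sample, "cp1252"))  # the Cyrillic letters are lost
print(survives_code_page(sample, "utf-8"))   # a Unicode encoding keeps everything
```

A test case built this way fails as soon as any step in the data path silently converts multilingual text to a single code page.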

You've seen the importance of globalizing your test as well as particular areas of functionality in which to search for globalization bugs. How do you now go about preparing a globalized test?

Preparing the Test

Before testing is performed, there are several items to consider in preparation. For example, you'll need to determine priority levels for components that are going to be tested, choose a test platform, and create a proper testing environment.

Prioritizing the Components

Some components are more likely to have globalization problems than others, so part of globalizing the test means adapting your testing practices accordingly. Assign a globalization priority to all components that are going to be tested. Top priority goes to:

  • Code designed to be used on Windows 2000 and Windows XP, and on Microsoft Windows 95, Microsoft Windows 98, and Microsoft Windows Millennium Edition (Me). Top priority also goes to code that interfaces with components written for Windows 95/98/Me or earlier platforms.
  • Components that handle text data in non-Unicode encodings. Some of them can be identified as those calling non-Unicode functions of the Microsoft Win32 application programming interface (API). Network components or components with console output are assumed to be in this area. Components using standard C run-time (CRT) libraries to handle text input/output (I/O) belong to this group, too.
  • Components that extensively handle text data.
  • Applications using files for data storage or data exchange (Windows metafiles, Microsoft Jet database files, Group Policy Engine, security configuration tools, and so on).
  • Components that have had many globalization problems in the past.

After determining which components are more likely to have serious globalization errors, choose a viable test platform in accordance with your particular circumstances.

Choosing a Test Platform

Choosing the operating system for your globalized test is another important step. Among the factors that can affect your decision are the functionality of your system and required level of support, testers' ability to work on a localized version of the operating system, and availability of the operating system for testing.

Even if you are targeting a broader range of operating systems, Windows XP can be a good fit for your primary test platform. No other operating system gives you the same flexibility with locale settings, in addition to native support for the broadest range of languages and locales. Windows 95 and Windows 98, for example, keep a tight rein on your ability to change the system locale. Windows XP, on the other hand, gives you freedom to adjust system and user locales. In fact, by adjusting these settings, even when you run a localized application on an English version of Windows XP, the application will still behave as though it is running on a localized platform. Thus you can simulate varying language and regional environments on one machine, according to which system and user locales you set. This would normally be possible only by having multiple computers running different language versions of the operating system.

You can also use other platforms that differ from English Windows XP. For instance, Windows XP Multilingual User Interface Pack, henceforth referred to as MUI, is especially useful if your code implements a multilingual UI and has to adjust to the UI settings of the operating system. This approach is a more easily implemented alternative to installing multiple localized versions of the operating system. (For more information on MUI, see Chapter 6, "Multilingual User Interface [MUI].")

Another test platform you can use is the localized build of the target operating system. If the tested application is to be localized into certain languages, these languages are the obvious choice for the platform. If the target operating system is Windows 2000 or Windows XP, other language versions of the operating system can be used too, even if the UI language does not match the language of your application. By using this configuration, you check how your application interacts with a localized system, where names of the system folders, built-in accounts, fonts, and other system objects might be different from what you get with an English or MUI system. Once you've chosen a test platform, the next step is to create the test environment.

Creating the Test Environment

Choosing a platform for the test is not enough to guarantee that the test is globalized and that no globalization problems are left in the code. The static properties of the platform (such as names of the system folders, built-in account names, and so on) are only one part of the picture; you'll also need to consider the properties that can be manipulated, that is, the platform's dynamic properties. For instance, you can configure language and cultural preferences of Windows through different settings of system and user locales. You should, in fact, tune these settings for the best outcome of your test. While this chapter has identified general areas of specific functionality to target, you have not yet seen how to target them. Therefore, the following list digs deeper into those areas, giving you optimal test settings to use for detecting some typical globalization problems. (For more information, see the "Sample International Testing Cases" at the end of this chapter. This checklist gives you further areas of functionality that you can test, along with showing you how they can be verified, the potential problems associated with them, and when the test is applicable.)

Non-Unicode data path cannot handle double-byte data properly. The problems in this area can be detected if you select an East Asian system locale such as Japanese, Chinese, or Korean, and use data encoded with double-byte characters in the corresponding code page.
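As a hedged illustration of the kind of defect this setting exposes, the Python sketch below assumes code page 932 (Japanese) and a hypothetical 5-byte buffer limit. Truncating by byte count, as code written for single-byte text often does, cuts a double-byte character in half.

```python
text = "日本語"                      # three characters
encoded = text.encode("cp932")       # two bytes per character in code page 932
print(len(text), len(encoded))

BUFFER_SIZE = 5                      # hypothetical byte limit in legacy code
truncated = encoded[:BUFFER_SIZE]    # cuts the third character in half

try:
    truncated.decode("cp932")
    split_detected = False
except UnicodeDecodeError:
    # A lead byte was left at the end without its trail byte.
    split_detected = True
print(split_detected)

# Truncating by characters before encoding keeps the data well formed.
safe = text[:2].encode("cp932")
print(safe.decode("cp932"))
```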

Components interacting with non-Unicode applications select incorrect code page (Windows versus OEM) for character conversion. This problem can be detected with the system locale in cases where Windows and OEM encodings do not coincide. European locales, like German or Russian, are good candidates. The test data must contain characters whose code-point values are different in Windows and OEM code pages.
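The following Python sketch shows why the test data must contain characters that differ between the two code pages. It assumes a German system locale with Windows code page 1252 and OEM code page 850; the variable names are illustrative only.

```python
ANSI = "cp1252"  # Windows code page assumed for a German system locale
OEM = "cp850"    # OEM code page assumed for the same locale

ch = "ä"
ansi_bytes = ch.encode(ANSI)
oem_bytes = ch.encode(OEM)
print(ansi_bytes.hex(), oem_bytes.hex())  # the same letter has two byte values

# Simulate the bug: data written with the Windows code page, read with the OEM one.
misread = ansi_bytes.decode(OEM)
print(misread == ch)

# Pure ASCII test data hides the bug because both code pages agree on it.
ascii_agrees = "abc".encode(ANSI) == "abc".encode(OEM)
print(ascii_agrees)
```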

The component is unable to process and follow user-locale settings. Select the locales where special rules must be applied to functionality that your application exposes. For instance, for special sorting rules, select Turkish (test with dotless "I"). For details, see "Globalizing Your Test Data" later in this chapter. Select Chinese for applications with different sort orders (and use Chinese data).

For calendar processing, select locales with non-Gregorian calendars. Finally, for date, time, number, and currency formatting, select locales with formatting rules different from those accepted in the environment where the application is developed. Try locales with different time and date separators, since too often programmers assume the colon or slash to be the only possibilities. For example, the Italian (Italy) locale uses a period (.) as a time separator, while the Italian (Switzerland) locale uses a period as a date separator. In addition, make sure that data (such as files and databases) saved under one locale setting can be properly read and used under another locale, and that the data shown matches the client's settings.
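The separator assumption mentioned above is easy to reproduce. The Python sketch below is a simplified illustration: the lookup table and both function names are hypothetical, only the Italian (Italy) entry comes from the text, and a real application would query the operating system for the current user-locale settings instead of using a table.

```python
# Hypothetical lookup table standing in for the user-locale settings.
TIME_SEPARATORS = {
    "en-US": ":",  # assumption for illustration
    "it-IT": ".",  # Italian (Italy) uses a period, as noted above
}

def parse_time_hardcoded(text):
    """Buggy: assumes the colon is the only possible time separator."""
    hours, minutes = text.split(":")
    return int(hours), int(minutes)

def parse_time(text, locale_name):
    """Split on the separator defined for the given locale."""
    hours, minutes = text.split(TIME_SEPARATORS[locale_name])
    return int(hours), int(minutes)

print(parse_time("14.30", "it-IT"))

try:
    parse_time_hardcoded("14.30")
except ValueError as error:
    print("hard-coded parser failed:", error)
```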

It is also a good idea to set the user locale different from the system locale. A non-Unicode component will fail to display when all of the following are true:

  • Your application picks an incorrect locale for data formatting.
  • The mechanisms used to display locale-specific data treat text as Unicode.
  • The date or time has to be displayed using symbols not in the current code page.

Having a non-Unicode component that fails to display is as bad as picking an incorrect format.

For distributed applications, data is not successfully passed between different locales. To detect this problem, set up a network with a mixed environment where some systems have an East Asian (for example, Japanese) system locale and others have a European system locale. Having the application run in a distributed, mixed network verifies that data can be successfully passed between different locales. Additionally (though not really within the scope of globalization), it might also be wise to test how your distributed system works when the time-zone settings are different among computers in your system.

You've determined which code, components, and applications are most likely to present globalization problems, decided upon a test platform, and manipulated settings to create an optimal testing environment for detecting specific problems. Following the guidelines just presented will ensure you have considered vital issues when preparing to carry out a globalized test.

Carrying Out the Test

In order to carry out a globalized test, you will need to globalize your test data, making sure your test covers functionality-specific areas. You will also need to be able to recognize globalization problems such as loss of functionality, as well as code that fails to follow locale conventions as defined by the current user locale.

Globalizing Your Test Data

After the environment has been set for globalized testing, your regular test cases must be run with special attention paid to potential globalization problems. The test data must match the test. Generally, it must be multilingual Unicode text that doesn't match the system locale. Use a mix of scripts (Latin, Cyrillic, Thai, and so on) or Unicode-only languages (Armenian, Georgian, Indic languages) to discover code-page dependency.

Some particular areas of functionality might require special test data. For example, if the application converts the case of letters, you can verify the functionality with scripts that do not have uppercase and lowercase symbols, such as East Asian scripts or Hebrew. You can also use text with special locale-specific rules for handling case conversion, such as Turkish text with "dotless I." In Turkish both the "dotted" and "dotless" letter "I" exist. "Dotless I" converted to lowercase becomes "ı", while "dotted İ" converted to lowercase becomes "i".
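The Turkish rule can be turned into test data with a short Python sketch. Python's built-in `str.lower()` and `str.upper()` do not consult the locale, so the two helpers below (hypothetical names) apply the Turkish pairs explicitly; they are an illustration of the expected results, not a complete Turkish casing implementation.

```python
def turkish_lower(text):
    """Lowercase text using the Turkish rules for the letter I."""
    # Map the two Turkish-specific letters first, then use the default mapping.
    return text.replace("I", "ı").replace("İ", "i").lower()

def turkish_upper(text):
    """Uppercase text using the Turkish rules for the letter I."""
    return text.replace("ı", "I").replace("i", "İ").upper()

print("I".lower())                  # default, locale-independent result
print(turkish_lower("I"))           # dotless uppercase I to dotless lowercase
print(turkish_lower("İ"))           # dotted uppercase to dotted lowercase
print(turkish_lower("DİYARBAKIR"))

# Scripts without case, such as Hebrew, must pass through unchanged.
print("שלום".lower() == "שלום")
```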

Another example of an area requiring functionality-specific test data is the sorting procedure, for which there are some specific examples of data you can use. First, Latin text must be sorted differently based on the differences between the Modern Spanish and Traditional Spanish sort order. The letter combination "ch" is considered a unique compression under Traditional Spanish sort order. However, under the Modern Spanish sort order, "ch" is considered as just the separate letter "c" followed by "h". Traditional Spanish sort order also defines the combination "ll" as a unique letter, while Modern Spanish sort order considers "ll" to be just the letter "l" followed by another letter "l".
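To build expected results for such a test, the two sort orders can be modeled with simple sort keys. The Python sketch below is a simplification under stated assumptions: it handles only unaccented lowercase letters plus "ñ", and the names `modern_key` and `traditional_key` are hypothetical.

```python
MODERN = "abcdefghijklmnñopqrstuvwxyz"
MODERN_RANK = {letter: i for i, letter in enumerate(MODERN)}

# Traditional order treats "ch" and "ll" as letters placed after "c" and "l".
TRADITIONAL = "a b c ch d e f g h i j k l ll m n ñ o p q r s t u v w x y z".split()
TRADITIONAL_RANK = {letter: i for i, letter in enumerate(TRADITIONAL)}

def modern_key(word):
    """Rank every character separately."""
    return [MODERN_RANK[ch] for ch in word.lower()]

def traditional_key(word):
    """Rank "ch" and "ll" as single letters."""
    word = word.lower()
    key, i = [], 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in ("ch", "ll"):
            key.append(TRADITIONAL_RANK[pair])
            i += 2
        else:
            key.append(TRADITIONAL_RANK[word[i]])
            i += 1
    return key

words = ["cuna", "chico", "dama", "luz", "llama", "mano"]
modern_sorted = sorted(words, key=modern_key)
traditional_sorted = sorted(words, key=traditional_key)
print(modern_sorted)
print(traditional_sorted)
```

The same word list produces two different results, which is what makes it useful as test data for a sort routine that claims to follow the user locale.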

A further example of test data involves the sorting order for Turkish "dotless I" versus "dotted I." Under Turkish locale settings, "dotless I" (the uppercase "I" or the lowercase "ı") precedes "dotted I" (the uppercase "İ" or lowercase "i"). In other words, in Turkish sort order "I" precedes "İ", and "ı" precedes "i".
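A plain code-point sort gets this order wrong, because "ı" (U+0131) has a higher code point than every ASCII letter. The Python sketch below contrasts the two results; it assumes lowercase input only, and `turkish_key` is a hypothetical helper based on the Turkish alphabet order.

```python
# Lowercase Turkish alphabet; dotless "ı" comes immediately before dotted "i".
TURKISH = "abcçdefgğhıijklmnoöprsştuüvyz"
TURKISH_RANK = {letter: i for i, letter in enumerate(TURKISH)}

def turkish_key(word):
    """Sort key following Turkish alphabet order (lowercase input assumed)."""
    return [TURKISH_RANK[ch] for ch in word]

words = ["jale", "ip", "ılık", "hız"]

by_code_point = sorted(words)                 # what a locale-unaware sort does
by_turkish = sorted(words, key=turkish_key)   # the order a Turkish user expects
print(by_code_point)
print(by_turkish)
```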

Some characters in your application have the potential to cause functionality problems. If the application has special handling rules for specific values of characters, some characters incorporated into the data stream will be processed as special control codes. Characters whose code points contain byte sequences that coincide with these special control codes can cause functionality problems.

For example, suppose an application has to parse a string that presumably contains a folder name. The algorithm will have to search for a backslash (the standard delimiter, or path separator, that Windows uses to denote folders). If the algorithm is designed to search for the byte value of the delimiter, it will have to stop at symbol 0x5C, the byte representation for the backslash. However, 0x5C is a valid trail byte in East Asian code pages, so the algorithm will fail on certain strings containing those characters. This will not happen if the algorithm is designed to search for the backslash character itself instead of its byte value; that is, if the algorithm steps through the string with the CharNext( ) function and compares each character, not each byte, against the delimiter.
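The failure can be reproduced outside Win32 with a short Python sketch. It assumes code page 932 (Japanese), in which the character "表" is encoded as the bytes 0x95 0x5C. Decoding before splitting plays the role that character-by-character stepping with CharNext( ) plays in the description above.

```python
path = "C:\\表\\data.txt"
encoded = path.encode("cp932")

# The second byte of the encoded character equals the backslash byte 0x5C.
print("表".encode("cp932"))

# Byte-oriented search: treats the trail byte as a path separator.
naive_parts = encoded.split(b"\\")
print(len(naive_parts), naive_parts)

# Character-oriented search: decode first, then look for the backslash character.
parts = encoded.decode("cp932").split("\\")
print(len(parts), parts)
```

Folder names containing such characters make good test data for any component that parses paths in a non-Unicode data path.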

When planning your test, put greater importance on test cases that deal with the direct or indirect I/O operations on strings. You should also give special priority to tests involving text and formatted data (such as numbers, monetary values, date, and time). Direct input happens when a user enters text in the UI or the text is fetched from some external storage, like a text file or database. Examples of indirect string input are host names, user names, and folder names that include current user names, and so on. If your application deals with information of this kind, make sure this data is multilingual when your tests are running.

In a small number of cases the range of characters in the test input might be somewhat limited, in accordance with the design limitations. For example, only characters that match the system locale can be used in NetBIOS-related tests. Set these tests in a globalized environment. (See "Creating the Test Environment" earlier in this chapter.) It might be hard to enter all of these test inputs manually if you do not know the languages in which you are preparing your test data. A simple Unicode text generator can be very helpful at this point. (You can find a sample of this utility in the Samples subdirectory of the companion CD.)
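A generator of this kind can be very small. The Python sketch below is not the utility from the companion CD; it is a minimal illustration that assumes a handful of code-point ranges containing only assigned letters, and the name `generate_test_string` is hypothetical.

```python
import random

# Inclusive code-point ranges of assigned letters, chosen for illustration only.
SCRIPT_RANGES = {
    "Latin": (0x0061, 0x007A),
    "Cyrillic": (0x0410, 0x044F),
    "Armenian": (0x0561, 0x0586),
    "Georgian": (0x10D0, 0x10F0),
    "Thai": (0x0E01, 0x0E2E),
    "Devanagari": (0x0915, 0x0939),
}

def generate_test_string(chars_per_script=4, seed=None):
    """Return space-separated runs of random letters, one run per script."""
    rng = random.Random(seed)  # a fixed seed makes a failing test reproducible
    runs = []
    for first, last in SCRIPT_RANGES.values():
        run = "".join(chr(rng.randint(first, last)) for _ in range(chars_per_script))
        runs.append(run)
    return " ".join(runs)

sample = generate_test_string(seed=1)
print(sample)
```

Because the output mixes several scripts, it cannot be represented in any single legacy code page, which is the property the test data needs in order to expose code-page dependency.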

In addition to globalizing test data that will catch problems and validate functionality, as you conduct your test and view the results you should be able to recognize how some of the globalization problems mentioned earlier in "Creating the Test Environment" manifest themselves.

Recognizing the Problems

The most serious globalization problem is functionality loss. This problem can surface either immediately, such as when a system locale is changed, or later, such as when accessing input data (non-Latin character input).

Some functionality problems manifest themselves in the form of display problems. The following are some common ones that you might see:

  • Question marks (????) appearing instead of displayed text indicate problems in Unicode-to-code-page conversion.
  • Random characters appearing instead of readable text indicate that the code page-based code is using the wrong code page.
  • Default glyphs such as boxes, vertical bars, or tildes appearing instead of text indicate that the selected font cannot display some of the characters.

It might be hard to find certain problems in display or print results without having adequate shaping, layout, or script knowledge. Testing for these sorts of problems is language-specific and often cannot be executed without language expertise. On the other hand, this type of test might be limited to code inspection. For instance, if the text output is formed and displayed using standard text-handling mechanisms, like Uniscribe for complex scripts (see Chapter 24, "Uniscribe"), complex-script processing in text output can be considered safe.

Another area of potential problems is code failing to follow locale conventions as defined by the current user locale. Make sure that locale-sensitive data in your application (numbers, dates, time, currency, and calendars) is displayed according to the current settings in the Regional And Language Options property sheet of your computer. (For information on locale awareness, see Chapter 4, "Locale and Cultural Awareness.")

However, Regional And Language Options does not cover all locale-specific functionality. For example, you cannot see the current sort order there. Thus it is important to have a test plan covering all aspects of functionality related to locale before you start your test. You can use the National Language Support (NLS) and the .NET Framework documentation as a starting point for this plan. Use this documentation for finding exactly what locale information you'll need to retrieve dynamically, and then apply those requirements to your project.

As a final note regarding globalized testing, when planning and running tests always remember that globalization is wider in scope than just supporting one particular language. The test you use to verify that code has been globalized might not seem to duplicate a realistic usage scenario; that is, the settings of a particular globalized test may look a bit unnatural. Think about cars being tested in extreme weather conditions. Obviously the deserts of Arizona in midsummer or the frozen Alaskan tundra in midwinter don't represent typical driving conditions. However, by covering the widest range of conditions possible, the test will produce more comprehensive results. Similarly, your test should cover the widest range of possibilities you can think of, including potential markets you might never have considered or even heard of, as well as the multiple ways in which your product might be used.

Once you've verified that a product's functionality has been globalized through globalizing the existing testing process, you will need to determine whether the application is going to be localized. If so, the next step is localizability testing. Using techniques such as pseudo-localization and the pseudo-mirroring test; reviewing code, UI, and documentation; and performing pilot localization are all tasks that comprise localizability testing.

Microsoft Corporation - Developing International Software
ISBN: 0735615837
EAN: 2147483647
Year: 2003
Pages: 198