Casing


Latin script languages have a concept of upper and lower case, which all developers are familiar with. However, not all languages have this concept or implement it in the same way. Some languages (e.g., Japanese) are case-less. Others exist only as upper case (e.g., Khutsuri) or lower case (e.g., Nushkuri). For these languages, no case conversions should occur.

If you intend to support Azeri or Turkish, you should be aware of the special case of the letter I. Most Latin script languages have two I characters: a Capital Letter I (without a dot on top) and a Small Letter i (with a dot on top). Azeri and Turkish have two more I characters: a Capital Letter İ (with a dot on top) and a Small Letter ı (without a dot on top). You need to be aware of this because the rules for converting between upper and lower case among these four letters are different for the Azeri and Turkish cultures than they are for other cultures. (These special rules are known as Turkic Casing Rules.) Tables 6.4 and 6.5 show the effects of converting each of the four letters between upper and lower case using the "en" and "tr" cultures, respectively.

Table 6.4. Upper- and Lowercasing I Using English Culture

Character Name                    Character  Unicode Code Point  Upper Character  Upper Unicode Code Point  Lower Character  Lower Unicode Code Point
Latin Capital Letter I            I          U+0049              I                U+0049                    i                U+0069
Latin Small Letter I              i          U+0069              I                U+0049                    i                U+0069
Latin Capital Letter I With Dot   İ          U+0130              İ                U+0130                    i                U+0069
Latin Small Letter I Without Dot  ı          U+0131              I                U+0049                    ı                U+0131

Table 6.5. Upper- and Lowercasing I Using Turkish Culture

Character Name                    Character  Unicode Code Point  Upper Character  Upper Unicode Code Point  Lower Character  Lower Unicode Code Point
Latin Capital Letter I            I          U+0049              I                U+0049                    ı                U+0131
Latin Small Letter I              i          U+0069              İ                U+0130                    i                U+0069
Latin Capital Letter I With Dot   İ          U+0130              İ                U+0130                    i                U+0069
Latin Small Letter I Without Dot  ı          U+0131              I                U+0049                    ı                U+0131
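
The conversions listed in Tables 6.4 and 6.5 can be checked with a short console program. The following is a minimal sketch, not code from the book: the class name TurkicCasingDemo and its helper methods are hypothetical, and it assumes a runtime that has culture data for "en" and "tr" available (that is, it is not running in an invariant-globalization configuration). The results for U+0130 and U+0131 under the "en" culture may vary between framework versions, so the sketch prints them rather than asserting them.

```csharp
using System;
using System.Globalization;

// Hypothetical demo class: prints the upper- and lowercase forms of the
// four I characters under the "en" and "tr" cultures.
public static class TurkicCasingDemo
{
    // Formats a character as a code point label, e.g. 'I' -> "U+0049".
    public static string CodePoint(char c)
    {
        return "U+" + ((int)c).ToString("X4");
    }

    // Builds one row: the character's code point, followed by the code points
    // of its upper- and lowercase forms under the given culture.
    public static string Row(char c, CultureInfo culture)
    {
        TextInfo textInfo = culture.TextInfo;
        return string.Format("{0}  upper {1}  lower {2}",
            CodePoint(c),
            CodePoint(textInfo.ToUpper(c)),
            CodePoint(textInfo.ToLower(c)));
    }

    public static void Main()
    {
        // I, i, capital I with dot (U+0130), small dotless i (U+0131)
        char[] letters = { 'I', 'i', '\u0130', '\u0131' };

        foreach (string name in new[] { "en", "tr" })
        {
            CultureInfo culture = new CultureInfo(name);
            Console.WriteLine("Culture: " + name);
            foreach (char c in letters)
            {
                Console.WriteLine("  " + Row(c, culture));
            }
        }
    }
}
```

Working with code points rather than printing the characters themselves avoids depending on the console's encoding, which may not be able to display İ and ı.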


From these two tables, it can be seen that the Turkish lowercase equivalent of Latin Capital I is not the same as the English lowercase equivalent, and the Turkish uppercase equivalent of Latin Small I is not the same as the English uppercase equivalent. The problem is illustrated in the following code fragment:

    CultureInfo cultureInfo = new CultureInfo("en");
    string test = "Delphi is in italics";
    string testUpper = "DELPHI IS IN ITALICS";
    if (test.ToUpper(cultureInfo).CompareTo(testUpper) == 0)
        Text = "Equal";
    else
        Text = "Not equal";


The two strings are equal if the culture is "en", but they are not equal if the culture is "tr", because the Turkish uppercase of i is İ (U+0130), not I. How you handle this difference in code depends on the nature of the strings being compared. If you were comparing a company name typed by a user, a case-insensitive comparison using String.Compare and passing the CurrentCulture would be the safest choice. If, however, you were comparing a string that could be considered a programmatic element against a known string (say, an XML tag name), you should use the invariant culture to perform the comparison, to ensure that culture-specific casing rules do not change the success of the comparison.
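
The two comparison strategies described above might be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the class ComparisonDemo and the methods SameUserText and SameIdentifier are hypothetical names, and the Turkish results assume the runtime has culture data for "tr" available. The culture is passed in explicitly so that the behavior can be tried without changing the current thread's culture; in an application you would typically pass CultureInfo.CurrentCulture for user-entered text.

```csharp
using System;
using System.Globalization;

// Hypothetical helper class contrasting culture-sensitive and
// invariant-culture case-insensitive comparisons.
public static class ComparisonDemo
{
    // For text typed by a user: compare case-insensitively using the
    // casing rules of the supplied culture (typically the current culture).
    public static bool SameUserText(string a, string b, CultureInfo culture)
    {
        return string.Compare(a, b, true, culture) == 0;
    }

    // For programmatic elements such as XML tag names: compare
    // case-insensitively using the invariant culture, so the result does
    // not depend on the user's culture.
    public static bool SameIdentifier(string a, string b)
    {
        return string.Compare(a, b, true, CultureInfo.InvariantCulture) == 0;
    }

    public static void Main()
    {
        CultureInfo turkish = new CultureInfo("tr");

        // Under Turkic casing rules the i in "title" uppercases to U+0130,
        // so the uppercased string no longer matches "TITLE".
        Console.WriteLine("title".ToUpper(turkish) == "TITLE");

        // The invariant-culture comparison treats the tag names as equal
        // regardless of the user's culture.
        Console.WriteLine(SameIdentifier("title", "TITLE"));

        // The culture-sensitive comparison applies Turkish rules instead.
        Console.WriteLine(SameUserText("title", "TITLE", turkish));
    }
}
```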




.NET Internationalization: The Developer's Guide to Building Global Windows and Web Applications
ISBN: 0321341384
Year: 2006
Pages: 213
