Sorting Localized Strings

Problem

You have a sequence of strings that contain non-ASCII characters, and you need to sort them according to local convention.

Solution

The locale class has built-in support for comparing strings in a given locale by overloading operator( ). You can use an instance of the locale class as your comparison functor when you call any standard function that takes a functor for comparison. Example 13-8 performs the same comparison by hand, using the collate facet that operator( ) relies on.
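As a minimal sketch of handing a locale object straight to sort, consider the helper below. The name sortWithLocale is hypothetical and exists only for illustration; the comparison itself is the standard std::locale::operator( ), which compares two strings with the locale's collate facet.

```cpp
#include <algorithm>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: sort strings in place, using the locale object
// itself as the comparison functor. std::sort copies the locale and calls
// its operator( ), which returns true if the first string collates
// before the second in that locale.
void sortWithLocale(std::vector<std::string>& v, const std::locale& loc) {
    std::sort(v.begin(), v.end(), loc);
}
```

Passing locale::classic( ) gives the "C" locale's ordering on any implementation. Passing a named locale gives that locale's ordering, provided your platform supports the name.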

Example 13-8. Locale-specific sorting

#include <iostream>
#include <locale>
#include <string>
#include <vector>
#include <algorithm>

using namespace std;

bool localeLessThan (const string& s1, const string& s2) {

 const collate<char>& col =
 use_facet<collate<char> >(locale( )); // Use the global locale

 const char* pb1 = s1.data( );
 const char* pb2 = s2.data( );

 return (col.compare(pb1, pb1 + s1.size( ),
 pb2, pb2 + s2.size( )) < 0);
}

int main( ) {

 // Create two strings, one with a German character
 string s1 = "diät";
 string s2 = "dich";

 vector<string> v;
 v.push_back(s1);
 v.push_back(s2);

 // Sort without a comparison functor. This uses string's operator<,
 // which compares raw character values and ignores the locale.
 sort(v.begin( ), v.end( ));
 for (vector<string>::const_iterator p = v.begin( );
 p != v.end( ); ++p)
 cout << *p << endl;

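 // Set the global locale to German, and then sort. The name "german" is
 // platform-specific; locale( ) throws runtime_error if it is unsupported.
 locale::global(locale("german"));
 sort(v.begin( ), v.end( ), localeLessThan);
 for (vector<string>::const_iterator p = v.begin( );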
 p != v.end( ); ++p)
 cout << *p << endl;
}

The first sort compares raw character codes, and the code for ä is greater than the code for c, so the output looks like this:

dich
diät

The second sort uses the proper ordering according to German semantics, and it is just the opposite:

diät
dich

 

Discussion

Sorting becomes more complicated when you're working in different locales, and the standard library provides a facet for exactly this purpose. The facet collate provides a member function compare that works much like strcmp: it returns -1 if the first string is less than the second, 0 if they are equal, and 1 if the first string is greater than the second (strcmp guarantees only the sign of its result). Unlike strcmp, collate::compare uses the collation rules of the target locale.
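To see the three-way result in isolation, here is a small sketch that wraps collate::compare for a locale passed in by the caller. The function name collateCompare is hypothetical; use_facet and collate<char>::compare are the standard facilities the discussion describes.

```cpp
#include <locale>
#include <string>

// Hypothetical wrapper: three-way comparison of two strings using the
// collate facet of the given locale. The result is negative, zero, or
// positive, in the manner of strcmp.
int collateCompare(const std::string& s1, const std::string& s2,
                   const std::locale& loc) {
    const std::collate<char>& col =
        std::use_facet<std::collate<char> >(loc);
    return col.compare(s1.data(), s1.data() + s1.size(),
                       s2.data(), s2.data() + s2.size());
}
```

Taking the locale as a parameter, rather than reading the global locale, lets you compare under several locales in the same program without changing global state.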

Example 13-8 presents the function localeLessThan, which returns true if the first argument is less than the second according to the global locale. The most important part of the function is the call to compare:

col.compare(pb1, // Pointer to the first char
 pb1 + s1.size( ), // Pointer to one past the last char
 pb2,
 pb2 + s2.size( ))

Depending on the execution character set of your implementation, Example 13-8 may or may not produce the results I showed earlier. But if you want to ensure string comparison works in a locale-specific manner, you should use collate::compare. Of course, the standard does not require an implementation to support any locales other than "C", and locale names such as "german" are implementation-defined (the locale constructor throws runtime_error for a name it does not recognize), so be sure to test for all the locales you support.
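One way to guard against an unsupported locale name is sketched below. The helper name localeOrClassic and the choice to fall back to the "C" locale are my assumptions for illustration, and your application may prefer to report the error instead. The only library behavior relied on is that the locale constructor throws runtime_error when it cannot create the named locale.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: try to construct the named locale, and fall back
// to the classic "C" locale if the implementation does not support it.
std::locale localeOrClassic(const std::string& name) {
    try {
        return std::locale(name.c_str());
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}
```

You could call this once at startup for each locale name you intend to support and log any name that falls back, so that a missing locale shows up in testing rather than as a subtly wrong sort order.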

C++ Cookbook