Sorting Localized Strings


You have a sequence of strings that contain non-ASCII characters, and you need to sort them according to local convention.


The locale class has built-in support for comparing strings in a given locale: it overloads operator( ) to compare two strings using the locale's collation rules. You can therefore use an instance of the locale class as your comparison functor when you call any standard function that takes a functor for comparison. (See Example 13-8.)

Example 13-8. Locale-specific sorting


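#include <algorithm>
#include <iostream>
#include <locale>
#include <string>
#include <vector>

using namespace std;

bool localeLessThan (const string& s1, const string& s2) {

   const collate<char>& col =
      use_facet<collate<char> >(locale( )); // Use the global locale

   const char* pb1 = s1.data( );
   const char* pb2 = s2.data( );

   return (col.compare(pb1, pb1 + s1.size( ),
                       pb2, pb2 + s2.size( )) < 0);
}

int main( ) {

   // Create two strings, one with a German character
   string s1 = "diät";
   string s2 = "dich";

   vector<string> v;
   v.push_back(s1);
   v.push_back(s2);

   // Sort without a locale-aware comparison: string's operator<
   // compares raw character values and ignores the locale.
   sort(v.begin( ), v.end( ));
   for (vector<string>::const_iterator p = v.begin( );
        p != v.end( ); ++p)
      cout << *p << endl;

   // Set the global locale to German, and then sort. Locale names are
   // platform-specific: "german" is assumed here (typical on Windows);
   // POSIX systems usually use a name such as "de_DE".
   locale::global(locale("german"));
   sort(v.begin( ), v.end( ), localeLessThan);
   for (vector<string>::const_iterator p = v.begin( );
        p != v.end( ); ++p)
      cout << *p << endl;
}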

The first sort compares raw character values (the ASCII convention), and therefore the output looks like this:

dich
diät

The second sort uses the proper ordering according to German conventions, and it is just the opposite:

diät
dich



Sorting becomes more complicated when you're working in different locales, and the standard library solves this problem. The facet collate provides a member function compare that works like strcmp: it returns -1 if the first string is less than the second, 0 if they are equal, and 1 if the first string is greater than the second. Unlike strcmp, collate::compare uses the character semantics of the target locale.

Example 13-8 presents the function localeLessThan, which returns true if the first argument is less than the second according to the global locale. The most important part of the function is the call to compare:

col.compare(pb1,              // Pointer to the first char
            pb1 + s1.size( ), // Pointer to one past the last char
            pb2,
            pb2 + s2.size( ))

Depending on the execution character set of your implementation, Example 13-8 may or may not produce the results shown earlier. But if you want to ensure that string comparison works in a locale-specific manner, you should use collate::compare. Of course, the standard does not require an implementation to support any locales other than "C", so be sure to test with every locale you intend to support.

