Sorting Localized Strings

Problem

You have a sequence of strings that contain non-ASCII characters, and you need to sort according to local convention.

Solution

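The locale class has built-in support for comparing strings in a given locale: it overloads operator( ) to compare two strings using the locale's collate facet. You can therefore use an instance of the locale class as your comparison functor when you call any standard function that takes a functor for comparison. Example 13-8 shows the equivalent approach of calling the collate facet directly from a comparison function.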

Example 13-8. Locale-specific sorting

#include <iostream>
#include <locale>
#include <string>
#include <vector>
#include <algorithm>

using namespace std;

bool localeLessThan (const string& s1, const string& s2) {

 const collate<char>& col =
 use_facet<collate<char> >(locale( )); // Use the global locale

 const char* pb1 = s1.data( );
 const char* pb2 = s2.data( );

 return (col.compare(pb1, pb1 + s1.size( ),
 pb2, pb2 + s2.size( )) < 0);
}

int main( ) {

 // Create two strings, one with a German character
 string s1 = "diät";
 string s2 = "dich";

 vector<string> v;
 v.push_back(s1);
 v.push_back(s2);

 // Sort without a comparison function. This uses string's operator<,
 // which orders by raw character values and ignores the locale.
 sort(v.begin( ), v.end( ));
 for (vector<string>::const_iterator p = v.begin( );
 p != v.end( ); ++p)
 cout << *p << endl;

 // Set the global locale to German, and then sort. Locale names are
 // platform-specific; many POSIX systems use a name such as "de_DE".
 locale::global(locale("german"));
 sort(v.begin( ), v.end( ), localeLessThan);
 for (vector<string>::const_iterator p = v.begin( );
 p != v.end( ); ++p)
 cout << *p << endl;
}

The first sort follows ASCII sorting convention, and therefore the output looks like this:

dich
diät

The second sort uses the proper ordering according to German semantics, and it is just the opposite:

diät
dich

 

Discussion

Sorting becomes more complicated when you're working in different locales, and the standard library provides the collate facet to handle it. The facet has a member function compare that works much like strcmp: it returns -1 if the first string is less than the second, 0 if they are equal, and 1 if the first string is greater than the second. (strcmp only promises a negative, zero, or positive value.) Unlike strcmp, collate::compare takes each string as a pair of pointers marking its beginning and end, and it uses the character semantics of the target locale.

Example 13-8 presents the function localeLessThan, which returns true if the first argument is less than the second according to the global locale. The most important part of the function is the call to compare:

col.compare(pb1, // Pointer to the first char
 pb1 + s1.size( ), // Pointer to one past the last char
 pb2,
 pb2 + s2.size( ))

Depending on the execution character set of your implementation, Example 13-8 may or may not produce the results I showed earlier. But if you want to ensure string comparison works in a locale-specific manner, you should use collate::compare. Of course, the standard does not require an implementation to support any locales other than "C", and the names of other locales vary by platform, so be sure to test for all the locales you support.

C++ Cookbook