Wide Character Support


Supporting larger character sets is an important aspect of internationalizing C. C99 (known as C9X while it was in draft) provides a wchar_t type and a host of related functions for dealing with larger character sets.

The stddef.h header file defines wchar_t as an integer type; the exact type is implementation dependent. Its intended use is to hold characters from an extended character set that is a superset of the basic character set. By definition, the char type is sufficient to handle the basic character set. The wchar_t type may need more bits to handle a greater range of code values. For example, char might be an 8-bit byte and wchar_t might be a 16-bit unsigned short.
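Because the width and signedness of wchar_t are implementation dependent, it can be useful to check them on the system at hand. The following is a minimal sketch that relies only on sizeof and CHAR_BIT; the helper names wchar_bits() and wchar_is_signed() are hypothetical and not part of any library.

```c
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* wchar_t, size_t */

/* Hypothetical helper: number of bits in this implementation's wchar_t. */
size_t wchar_bits(void)
{
    return sizeof(wchar_t) * CHAR_BIT;
}

/* Hypothetical helper: 1 if wchar_t is a signed type here, 0 otherwise. */
int wchar_is_signed(void)
{
    /* Converting -1 to an unsigned type yields a large positive value. */
    return (wchar_t)-1 < 0;
}
```

A program could print the result with printf("%zu\n", wchar_bits()). The value differs from one platform to another, which is exactly why code should not assume a particular width.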

Wide-character constants and string literals are indicated with an L prefix, and you can use the %lc and %ls specifiers to display wide-character data:

    wchar_t wch = L'I';
    wchar_t w_arr[20] = L"am wide!";
    printf("%lc %ls\n", wch, w_arr);

If, for example, wchar_t is implemented as a 2-byte unit, then the 1-byte code for 'I' would be stored in the low-order byte of wch. Characters not from the standard set might require both bytes to hold the character code. You could use hexadecimal escape sequences, for example, to indicate characters whose code values exceed the char range:

 wchar_t w = L'\x2f48'; /* 16-bit code value */ 

An array of wchar_t values can hold a wide-character string, with each element holding a single wide-character code. A wchar_t value with a code value of 0 is the wchar_t equivalent of the null character, and it is termed a null wide character. It is used to terminate wide-character strings.
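To make the role of the null wide character concrete, here is a small sketch that walks a wide-character string until it reaches the terminator. The function name wide_length() is hypothetical; it does by hand what the standard wcslen() function from wchar.h already provides.

```c
#include <stddef.h>   /* wchar_t, size_t */

/* Hypothetical helper: count the elements that precede the
   terminating null wide character. */
size_t wide_length(const wchar_t *ws)
{
    size_t n = 0;

    while (ws[n] != L'\0')   /* L'\0' is the null wide character */
        n++;
    return n;
}
```

The count is in wchar_t elements, not bytes, so the storage needed for a string of length n is (n + 1) * sizeof(wchar_t).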

You can use the %lc and %ls specifiers to read wide characters:

    wchar_t wch;
    wchar_t w_arr[20];

    puts("Enter your grade:");
    scanf("%lc", &wch);
    puts("Enter your first name:");
    scanf("%19ls", w_arr);   /* width leaves room for the null wide character */

The wchar.h header file offers further wide-character support. In particular, it provides wide-character I/O functions, wide-character conversion functions, and wide-character string manipulation functions. For the most part, they are wide-character equivalents of existing functions. For example, you can use fwprintf() and wprintf() for output and fwscanf() and wscanf() for input. The main differences are that these functions require a wide-character control string and they deal with input and output streams of wide characters. For example, the following displays information as a sequence of wide characters:

    const wchar_t *pw = L"Points to a wide-character string";   /* literal should not be modified */
    int dozen = 12;
    wprintf(L"Item %d: %ls\n", dozen, pw);
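The same wide-character control string can be used to format into an array rather than to a stream. The sketch below uses swprintf(), which is also declared in wchar.h and takes the capacity of the destination array as its second argument. The wrapper name format_item() is hypothetical.

```c
#include <stddef.h>   /* size_t */
#include <wchar.h>    /* swprintf */

/* Hypothetical helper: format an item line into buf, which can hold
   n wide characters including the terminating null wide character.
   Returns the number of wide characters written (not counting the
   terminator), or a negative value if the result does not fit. */
int format_item(wchar_t *buf, size_t n, int number, const wchar_t *text)
{
    return swprintf(buf, n, L"Item %d: %ls", number, text);
}
```

Unlike the narrow-character sprintf(), swprintf() always takes a size argument, so checking the return value is enough to detect a destination that is too small.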

Similarly, there are getwchar(), putwchar(), fgetws(), and fputws() functions. The header defines a WEOF macro that plays the same role that EOF does for byte-oriented I/O. It's required to be a value that does not correspond to a valid character. Because it is possible that all values of wchar_t type are valid characters, the library defines a wint_t type that can encompass all wchar_t values plus WEOF.
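The relationship between wint_t and WEOF mirrors the familiar int and EOF idiom for getc(). As a rough sketch, the loop below reads wide characters from a stream with fgetwc() and stops at WEOF; the function name count_wide_chars() is hypothetical.

```c
#include <stdio.h>    /* FILE */
#include <wchar.h>    /* fgetwc, wint_t, WEOF */

/* Hypothetical helper: count the wide characters read from fp
   until fgetwc() reports WEOF. */
long count_wide_chars(FILE *fp)
{
    wint_t wc;        /* wint_t, not wchar_t, so that WEOF is representable */
    long count = 0;

    while ((wc = fgetwc(fp)) != WEOF)
        count++;
    return count;
}
```

Storing the return value in a wchar_t instead of a wint_t would risk confusing a valid character with WEOF, just as storing the result of getc() in a char can with EOF.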

There are equivalents to the string.h library functions. For example, wcscpy(ws2, ws1) copies the wide-character string pointed to by ws1 to the wide-character array pointed to by ws2. Similarly, there is a wcscmp() function for comparing wide strings, and so on.
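Like strcpy(), wcscpy() does not check that the destination is large enough. One possible way to guard the copy, shown here only as a sketch, is to compare wcslen() of the source against the destination's capacity first. The helper name copy_wide() is hypothetical.

```c
#include <stddef.h>   /* size_t */
#include <wchar.h>    /* wcslen, wcscpy */

/* Hypothetical helper: copy src into dest only if it fits, counting
   the terminating null wide character. dest_size is measured in
   wchar_t elements. Returns 1 on success, 0 if dest is too small. */
int copy_wide(wchar_t *dest, size_t dest_size, const wchar_t *src)
{
    if (wcslen(src) + 1 > dest_size)
        return 0;
    wcscpy(dest, src);
    return 1;
}
```

After a successful copy, wcscmp(dest, src) returns 0, just as strcmp() would for two equal byte strings.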

The wctype.h header file adds character-classification functions to the mix. For example, iswdigit() returns true if its wide-character argument is a digit, and the iswblank() function returns true if its argument is a blank. The standard values for a blank are a space, written as L' ', and a horizontal tab, written as L'\t'.
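As an illustration of the wctype.h classification functions, the sketch below scans a wide-character string and tallies digits and blanks with iswdigit() and iswblank(). The helper names count_digits() and count_blanks() are hypothetical.

```c
#include <stddef.h>   /* size_t */
#include <wchar.h>    /* wchar_t */
#include <wctype.h>   /* iswdigit, iswblank */

/* Hypothetical helper: number of decimal digits in ws. */
size_t count_digits(const wchar_t *ws)
{
    size_t n = 0;

    for (; *ws != L'\0'; ws++)
        if (iswdigit(*ws))
            n++;
    return n;
}

/* Hypothetical helper: number of blanks (space or tab) in ws. */
size_t count_blanks(const wchar_t *ws)
{
    size_t n = 0;

    for (; *ws != L'\0'; ws++)
        if (iswblank(*ws))
            n++;
    return n;
}
```

For the string L"Room 101\tB2", count_digits() finds four digits and count_blanks() finds two blanks, one space and one tab.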



C Primer Plus (5th Edition)
ISBN: 0672326965
Year: 2000
Pages: 314
Authors: Stephen Prata
