Section 19.0. Introduction



While everyone who programs in PHP has to learn some English eventually to get a handle on its function names and language constructs, PHP can create applications that speak just about any language. Some applications need to be used by speakers of many different languages. Taking an application written for French speakers and making it useful for German speakers is made easier by PHP's support for internationalization and localization.

Internationalization (often abbreviated I18N[1]) is the process of taking an application designed for just one locale and restructuring it so that it can be used in many different locales. Localization (often abbreviated L10N[2]) is the process of adding support for a new locale to an internationalized application.

[1] The word "internationalization" has 18 letters between the first "i" and the last "n."

[2] The word "localization" has 10 letters between the first "l" and the last "n."

A locale is a group of settings that describe text formatting and language customs in a particular area of the world. The settings are divided into six categories:


LC_COLLATE

These settings control text sorting: which letters go before and after others in alphabetical order.


LC_CTYPE

These settings control mapping between uppercase and lowercase letters as well as which characters fall into the different character classes, such as alphanumeric characters.


LC_MONETARY

These settings describe the preferred format of currency information, such as what character to use as a decimal point and how to indicate negative amounts.


LC_NUMERIC

These settings describe the preferred format of numeric information, such as how to group numbers and what character is used as a thousands separator.


LC_TIME

These settings describe the preferred format of time and date information, such as names of months and days and whether to use 24- or 12-hour time.


LC_MESSAGES

This category contains text messages used by applications that need to display information in multiple languages.

There is also a metacategory, LC_ALL, that encompasses all the categories.
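The categories above correspond to constants that PHP's setlocale() function accepts. The following is a minimal sketch, not a definitive recipe: it switches every category to the "C" locale (chosen here only because it should exist on any system) and then reports the current setting of each category. The $categories and $current variable names are illustrative, and LC_MESSAGES is checked with defined() because PHP provides that constant only when it was built with libintl.

```php
<?php
// Map each category name to its PHP constant.
$categories = [
    'LC_COLLATE'  => LC_COLLATE,
    'LC_CTYPE'    => LC_CTYPE,
    'LC_MONETARY' => LC_MONETARY,
    'LC_NUMERIC'  => LC_NUMERIC,
    'LC_TIME'     => LC_TIME,
];

// LC_MESSAGES is only defined when PHP was compiled with libintl.
if (defined('LC_MESSAGES')) {
    $categories['LC_MESSAGES'] = LC_MESSAGES;
}

// LC_ALL sets every category at once. "C" is assumed to be available.
setlocale(LC_ALL, 'C');

// Passing "0" returns the current setting without changing it.
$current = [];
foreach ($categories as $name => $constant) {
    $current[$name] = setlocale($constant, '0');
    echo "$name: {$current[$name]}\n";
}
```

Because each category can be set independently, an application could, for example, format numbers according to one locale while sorting text according to another.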

A locale name generally has three components. The first, an abbreviation that indicates a language, is mandatory: for example, "en" for English or "pt" for Portuguese. Next, after an underscore, comes an optional country specifier, which distinguishes between countries that speak different versions of the same language: for example, "en_US" for U.S. English and "en_GB" for British English, or "pt_BR" for Brazilian Portuguese and "pt_PT" for the Portuguese of Portugal. Last, after a period, comes an optional character set specifier: for example, "zh_TW.Big5" for Chinese as used in Taiwan, encoded in the Big5 character set. While most locale names follow these conventions, some don't. One difficulty in using locales is that they can be arbitrarily named. Finding and setting a locale is discussed in Recipes 19.1 through 19.3.
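The three-part naming convention can be illustrated with a small parser. This is a sketch under stated assumptions: parse_locale_name() is a hypothetical helper, not a PHP built-in, and its pattern accepts only the conventional language[_COUNTRY][.charset] shape. Names that don't follow the convention, such as "C", are reported as null rather than guessed at.

```php
<?php
// Hypothetical helper: split a conventional locale name into its parts.
// Returns null when the name does not follow the usual convention.
function parse_locale_name(string $name): ?array
{
    // language (mandatory), _COUNTRY (optional), .charset (optional)
    $pattern = '/^([a-z]{2,3})(?:_([A-Z]{2}))?(?:\.([A-Za-z0-9-]+))?$/';

    if (!preg_match($pattern, $name, $m)) {
        return null;
    }

    return [
        'language' => $m[1],
        // Unmatched optional groups are either missing or empty strings.
        'country'  => ($m[2] ?? '') !== '' ? $m[2] : null,
        'charset'  => ($m[3] ?? '') !== '' ? $m[3] : null,
    ];
}

foreach (['pt', 'pt_BR', 'zh_TW.Big5', 'C'] as $name) {
    echo $name, ' => ', json_encode(parse_locale_name($name)), "\n";
}
```

A parser like this is useful only for names that follow the convention. Whether a given locale is actually installed is a separate question that depends on the system.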

Different techniques are necessary for correct localization of plain text, dates and times, and currency. Localization can also be applied to external entities your program uses, such as images and included files. Localizing these kinds of content is covered in Recipes 19.4 through 19.8.

Systems for dealing with large amounts of localization data are discussed in Recipes 19.9 and 19.10. Recipe 19.9 shows some simple ways to manage the data, and Recipe 19.10 introduces GNU gettext, a full-featured set of tools that provides localization support.
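As a rough illustration of what "simple ways to manage the data" can look like, the sketch below stores translations in a nested array keyed by locale name. It is not taken from the recipes themselves: the $messages catalog, its keys, and the translate() function are all hypothetical, and falling back to en_US is an assumption made for the example.

```php
<?php
// Hypothetical message catalog: locale name => (message key => translation).
$messages = [
    'en_US' => ['greeting' => 'Hello',     'farewell' => 'Goodbye'],
    'fr_FR' => ['greeting' => 'Bonjour',   'farewell' => 'Au revoir'],
    'de_DE' => ['greeting' => 'Guten Tag'], // 'farewell' not yet translated
];

// Look up a message for a locale, falling back first to a default locale
// and finally to the key itself so that a missing entry stays visible.
function translate(array $messages, string $locale, string $key, string $fallback = 'en_US'): string
{
    return $messages[$locale][$key]
        ?? $messages[$fallback][$key]
        ?? $key;
}

echo translate($messages, 'fr_FR', 'greeting'), "\n"; // Bonjour
echo translate($messages, 'de_DE', 'farewell'), "\n"; // Goodbye (fallback)
echo translate($messages, 'de_DE', 'unknown'), "\n";  // unknown
```

An array like this is easy to start with but becomes hard to maintain as the number of messages and locales grows, which is where a dedicated tool set becomes attractive.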

Recipes 19.11 through 19.13 discuss how to make sure your programs work well with a variety of character encodings so they can handle strings such as à l'Opéra-Théâtre as well as text in non-Latin scripts. One way to do this is to have all text your programs process be encoded as UTF-8. This encoding scheme can handle the Western characters in the familiar ISO-8859-1 encoding as well as characters for other writing systems around the world. These recipes focus on using UTF-8 to provide a seamless, language-independent experience for your users.
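To see why encoding matters, consider the string à l'Opéra-Théâtre. The sketch below assumes the mbstring extension is loaded and a PHP version that supports \u{...} escapes (PHP 7 or later). It compares the byte count that strlen() reports with the character count that mb_strlen() reports, then converts the text to ISO-8859-1 for comparison. The variable names are illustrative.

```php
<?php
// "à l'Opéra-Théâtre" written with Unicode escapes so the bytes are
// UTF-8 regardless of how this source file is saved.
$text = "\u{e0} l'Op\u{e9}ra-Th\u{e9}\u{e2}tre";

$bytes = strlen($text);             // strlen() counts bytes
$chars = mb_strlen($text, 'UTF-8'); // mb_strlen() counts characters

echo "UTF-8 bytes: $bytes, characters: $chars\n"; // 21 bytes, 17 characters

// In ISO-8859-1 each of these characters occupies exactly one byte.
$latin1 = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
$latin1Bytes = strlen($latin1);
echo "ISO-8859-1 bytes: $latin1Bytes\n"; // 17

// The ISO-8859-1 bytes are not valid UTF-8, so mixing encodings breaks text.
$latin1IsUtf8 = mb_check_encoding($latin1, 'UTF-8');
var_dump($latin1IsUtf8); // bool(false)
```

The four accented characters take two bytes each in UTF-8, which accounts for the difference between the two counts. Functions that assume one byte per character can therefore truncate or miscount UTF-8 text.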

PHP 6, still in development when these words are being written, has greatly enhanced support for Unicode, including more efficient operations on multibyte strings and a completely revamped locale system. Andrei Zmievski's "PHP 6 and Unicode" talk, available at http://www.gravitonic.com/talks/, has an overview of the Unicode-related changes coming in PHP 6.




PHP Cookbook, 2nd Edition
PHP Cookbook: Solutions and Examples for PHP Programmers
ISBN: 0596101015
Year: 2006
Pages: 445
