As we saw in Chapter 1, a string is a collection of text characters that serve a variety of purposes in PHP and HTML. However, there are tens of thousands of characters that can be seen on most modern computers, and learning how to work with them and have your web applications handle them correctly takes some effort. We will continue our discussion of strings by first looking at how PHP handles strings internally. Then we will look at the various ways computers represent characters, and learn how to handle them in PHP. We will finish by looking at some ways in which we can manipulate strings.
How PHP Interprets Strings
In short, PHP represents a string as a sequence of 8-bit character codes, which it assumes belong to the ISO-8859-1 character set (more details later in this chapter). This implies that PHP can distinguish only 256 characters at a given time, which would certainly be distressing to readers who regularly use an alphabet requiring more than that, such as Chinese or Hindi.
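To make the byte-oriented view concrete, here is a minimal sketch that uses only the core functions strlen() and bin2hex(). The sample word and the choice of UTF-8 as its encoding are assumptions made for illustration. The accented letter is written as explicit byte escapes so that the result does not depend on how the script file itself is saved.

```php
<?php
// "café" encoded as UTF-8 (an assumption for this sketch). The "é" is
// spelled out as the two bytes 0xC3 0xA9.
$word = "caf\xC3\xA9";

// strlen() counts 8-bit codes (bytes), not the characters a reader sees.
$byteCount = strlen($word);

// bin2hex() shows the raw bytes PHP is storing for the string.
$hex = bin2hex($word);

// Indexing is also byte by byte: offset 3 is only the first half of "é".
$fourthByte = bin2hex($word[3]);

echo $byteCount, "\n";  // 5
echo $hex, "\n";        // 636166c3a9
echo $fourthByte, "\n"; // c3
```

A reader sees four characters, but PHP reports five, because it is counting 8-bit codes rather than characters.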
Fortunately, the long answer reveals that this is not as great a barrier as one might fear: PHP does not do much to your strings, and even provides an extension called mbstring to help us with this problem. As long as you are careful with how you use strings, to what functions you pass them, and how you send them to and from pages in your web application, there is little standing in the way of you and a fully globalized web application.
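As a small preview of what mbstring offers, the sketch below compares the byte count with a character count. It assumes the string holds UTF-8 text and that the mbstring extension is installed. Because mbstring is optional, the code checks for it first and leaves the results as null when it is missing.

```php
<?php
// "café" as UTF-8 bytes (the encoding is an assumption for this sketch).
$word = "caf\xC3\xA9";

$charCount = null;
$lastCharHex = null;

// mbstring is an optional extension, so check for it before calling it.
if (function_exists('mb_strlen')) {
    // mb_strlen() counts characters in the encoding we name explicitly.
    $charCount = mb_strlen($word, 'UTF-8');

    // mb_substr() extracts whole characters: here the fourth one, "é",
    // which comes back as both of its bytes.
    $lastCharHex = bin2hex(mb_substr($word, 3, 1, 'UTF-8'));

    echo strlen($word), " bytes, ", $charCount, " characters\n"; // 5 bytes, 4 characters
    echo $lastCharHex, "\n";                                     // c3a9
} else {
    echo "mbstring is not available on this installation\n";
}
```

Passing the encoding explicitly avoids relying on whatever default the installation is configured with.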
Before we show you how to do this, we will spend some time looking at what we mean by "character sets," and how computers deal with different alphabets.