Strings, or pieces of text, can account for 50 percent or more of the objects created during the execution of a typical Java application. While Strings are objects, they are made up of sequences of individual characters. Java represents a character as a primitive type known as char. Since a char value is of a primitive type (like an int value), remember that you cannot send messages to it.

Characters

Java includes a char type that represents letters, digits, punctuation marks, diacriticals, and other special characters. Java bases its character set on a standard known as Unicode 4.0. The Unicode standard is designed to accommodate virtually all character variations in the major languages of the world. More information regarding the standard can be found at http://www.unicode.org.

Java uses two bytes to store each character. Two bytes is 16 bits, which means that a char can represent 2^16, or 65,536, distinct values. While that may seem like a lot, it's not enough to support everything in the Unicode standard. You probably won't need to concern yourself with supporting anything over the two-byte range, but if you do, Java allows you to work with characters as int values. An int is four bytes, so the billions of characters it can support should be sufficient until the Federation requires us to incorporate the Romulan alphabet.

You can represent character literals in Java in a few ways. The simplest form is to embed the actual character between single quotes (tics).

    char capitalA = 'A';
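To make the remark about working with characters as int values concrete, here is a minimal sketch. The class and method names are hypothetical, chosen only for illustration; the calls on Character and String are standard library methods. It assumes the character U+1D11E (a musical G clef) as an example of something that does not fit in two bytes.

```java
public class SupplementaryCharacterSketch {
    // True when the code point does not fit in a single 16-bit char.
    static boolean needsTwoChars(int codePoint) {
        return Character.isSupplementaryCodePoint(codePoint);
    }

    // Number of 16-bit char values needed to store the code point.
    static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        int capitalA = 'A';     // a char widens to an int without a cast
        int gClef = 0x1D11E;    // assumed example: a character above the two-byte range

        System.out.println(needsTwoChars(capitalA)); // false
        System.out.println(needsTwoChars(gClef));    // true

        // Stored in a String, the single character occupies two char values.
        String text = new String(Character.toChars(gClef));
        System.out.println(text.length());                         // 2
        System.out.println(text.codePointCount(0, text.length())); // 1
    }
}
```

For everyday text that stays within the two-byte range, a plain char is all you need; the int-based methods only matter once such characters show up in your data.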
Characters are essentially numerics. Each char value maps to a corresponding integer from 0 through 65,535. Here is a test snippet that shows how the character 'A' has a numeric value of 65 (its Unicode equivalent).

    assertEquals(65, capitalA);

Not all characters can be directly entered via the keyboard. You can represent Unicode characters using the Unicode escape sequence, \u (the u must be lowercase), followed by a 4-digit hex number.

    assertEquals('\u0041', capitalA);

Additionally, you may represent characters as an octal (base 8) escape sequence of up to three digits.

    assertEquals('\101', capitalA);

The highest possible character literal that you may represent as an octal sequence is '\377', which is equivalent to 255.

Most older languages (for example, C) treat characters as single bytes. The most well-known standard for representing characters in a single-byte character set (SBCS), the American Standard Code for Information Interchange (ASCII), is defined by ANSI X3.4.[1] The first 128 characters of Unicode map directly to their ASCII counterparts.
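The three literal forms above can be compared side by side. The following is a small sketch rather than a definitive listing; the class and method names are hypothetical, and it uses plain println calls instead of a test framework so that it runs on its own.

```java
public class CharLiteralSketch {
    static char plain()        { return 'A'; }       // the character itself
    static char unicode()      { return '\u0041'; }  // hex 41 is decimal 65
    static char octal()        { return '\101'; }    // octal 101 is decimal 65
    static char highestOctal() { return '\377'; }    // largest octal escape

    public static void main(String[] args) {
        System.out.println((int) plain());            // 65
        System.out.println(plain() == unicode());     // true
        System.out.println(plain() == octal());       // true
        System.out.println((int) highestOctal());     // 255

        // Because a char is numeric, arithmetic on it works;
        // the cast converts the int result back to a char.
        char next = (char) (plain() + 1);
        System.out.println(next);                     // B
    }
}
```

All three forms compile to the same char value, so which one you choose is a matter of readability.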
Special Characters

Java defines several special characters that you can use for things such as output formatting. Java represents the special characters with an escape sequence that consists of the backslash character (\) followed by a mnemonic. The table below summarizes the char literals that represent these special characters.
Since the tic character and the backslash character have special meanings with respect to char literals, you must represent them with an escape sequence. You may also escape (i.e., prefix with the escape character \) the double quote character, but you are not required to do so.
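The escaping rules just described can be sketched as follows. The class and method names are hypothetical; the escape sequences themselves (\', \\, \", \t, \n) are standard Java.

```java
public class EscapeSequenceSketch {
    static char singleQuote()        { return '\''; } // the tic must be escaped
    static char backslash()          { return '\\'; } // the backslash must be escaped
    static char escapedDoubleQuote() { return '\"'; } // escaping is allowed here
    static char plainDoubleQuote()   { return '"'; }  // but it is not required

    public static void main(String[] args) {
        System.out.println(singleQuote());   // '
        System.out.println(backslash());     // \
        System.out.println(escapedDoubleQuote() == plainDoubleQuote()); // true

        // Tab and newline escapes used for simple output formatting.
        System.out.println("Name\tScore\nAda\t95");
    }
}
```

Inside a String literal the rule is reversed: the double quote must be escaped and the tic need not be.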