Section 4.8. Questions and Answers


The Unicode web site contains a Frequently Asked Questions (FAQ) section, divided into topics and categories, at http://www.unicode.org/faq/. You will probably find it very useful, especially if you take some time now to look at its table of contents, so that you roughly know what to expect there. The following list of questions and answers does not try to compete with the Unicode FAQ. Rather, it discusses some general questions in some depth, partly dealing with the same questions as the FAQ but explaining the answers in a more tutorial-like manner.

4.8.1. Where Can I Find Tools for Using Unicode?

Software tools for Unicode, such as Unicode-capable word processors, editors, subroutine libraries, and converters, exist both as commercial products and as freeware under varying license conditions. Many tools have been developed for a particular environment, such as Windows XP, Macintosh, or Linux, though there are also tools that have been implemented for several environments. You may also encounter more or less obsolete tools that support some old version of Unicode only, although even an old tool might be sufficient for a limited purpose. Thus, there are probably many places to look, and the choice depends on your goals and resources. The Unicode FAQ points to two resources at the Unicode site in its answer to the question:


Useful Resources (http://www.unicode.org/onlinedat/resources.html)

This link list contains the following parts: Fonts and Keyboards, Linguistics and Script Specialty Sites, Organizations and Other Standards, and Using Unicode. You will find tools for Unicode through the Using Unicode section, though it is rather mixed, and many links point to sites that just exemplify Unicode use.


Unicode Enabled Products (http://www.unicode.org/onlinedat/products.html)

The page presents a large sample list of products (in a broad sense) that are more or less Unicode-enabled, divided into categories: Databases and Repertoires, Fonts and Printing Software, Internationalization Libraries, Operating Systems, Programming Languages and IDEs, Search Engines, Standards, Translation Systems, and Other Systems and Products. As you may guess, the list partly exists to demonstrate how widely Unicode can be used. If you intend to create Unicode-enabled software, the Internationalization Libraries part is a good starting point for finding suitable building blocks.

Although the "Other Systems and Products" part of the latter resource also contains many Unicode editors and word processors, you get a better picture of such software from the resource mentioned in Chapter 2: http://www.alanwood.net/unicode/utilities.html.

4.8.2. Why Do People Call Unicode a 16-Bit Code?

Unicode was originally designed to be a 16-bit code, it can be represented in a 16-bit encoding (UTF-16), and all widely used characters are in the BMP range, where code numbers can be presented as 16-bit integers. Before Unicode Version 3.1 (March 2001), all characters were in the BMP, so that although the structure of Unicode allowed a much wider code space, only a 16-bit subspace was in use.
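To make the 16-bit boundary concrete, here is a minimal Python sketch that tests whether a character's code number fits in 16 bits. The helper names plane_of and in_bmp are chosen here for illustration only; the sketch assumes nothing beyond the fact that the BMP is the range U+0000 to U+FFFF and that the full code space extends to U+10FFFF.

```python
BMP_MAX = 0xFFFF        # highest code number that fits in 16 bits
UNICODE_MAX = 0x10FFFF  # highest code number in the Unicode code space


def plane_of(ch: str) -> int:
    """Return the plane number of a single character (0 means the BMP)."""
    return ord(ch) >> 16


def in_bmp(ch: str) -> bool:
    """Return True if the character's code number fits in 16 bits."""
    return ord(ch) <= BMP_MAX


# Latin capital A, the euro sign, and MUSICAL SYMBOL G CLEF (U+1D11E)
for ch in ["A", "\u20AC", "\U0001D11E"]:
    print(f"U+{ord(ch):04X} plane={plane_of(ch)} in_bmp={in_bmp(ch)}")
```

The first two characters fall in plane 0, while the third lies in plane 1 and therefore cannot be expressed as a single 16-bit integer.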

Besides, people read books, articles, and messages that call Unicode a 16-bit code. The idea has the properties of a very successful meme (an idea that people receive and pass forward): within a certain scope (information technology), the idea is simple, easy to understand and remember, and it sounds new and interesting.

Yet it would be incorrect to say that Unicode is a 16-bit code in practice, or for most practical purposes. It's not a 16-bit encoding: Unicode is widely used in UTF-8, an encoding based on 8-bit code units. It's not a 16-bit coding space either: planes outside the BMP have increasing importance.
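The difference between encodings can be checked directly. The following Python sketch counts how many code units one character needs in UTF-8, UTF-16, and UTF-32; the function name code_units is a hypothetical helper introduced for this illustration, and the big-endian codec variants are used only because they do not prepend a byte order mark.

```python
def code_units(ch: str) -> dict:
    """Count the code units needed for one character in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),            # 8-bit code units
        "utf-16": len(ch.encode("utf-16-be")) // 2,  # 16-bit code units
        "utf-32": len(ch.encode("utf-32-be")) // 4,  # 32-bit code units
    }


# A, e with acute, the euro sign, and a character outside the BMP
for ch in ["A", "\u00E9", "\u20AC", "\U0001D11E"]:
    print(f"U+{ord(ch):04X}", code_units(ch))
```

Characters in the BMP take one 16-bit unit in UTF-16 but one to three bytes in UTF-8, whereas a character outside the BMP takes two 16-bit units (a surrogate pair) in UTF-16 and four bytes in UTF-8. In neither encoding does "one character, one 16-bit number" hold in general.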

4.8.3. How Can I Have a Character Added to Unicode?

If you would like to have a character, or a collection of characters, added to Unicode, you will likely analyze the issue and find out that you can use existing characters. For example, proposals to add new precomposed characters (combinations of a base character and one or more diacritics) will almost surely be rejected. If you know a character that looks different from any existing Unicode characters, it is probably a variant of an existing character and should be treated that way. It may well be a common character in an uncommon font. If you think your company's symbol counts as a character, the Unicode Consortium will most probably disagree. Ligatures and typographic variants will normally not be accepted either.
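As an illustration of why a new precomposed character is rarely needed, the following Python sketch writes two accented letters as a base character followed by a combining diacritic and then normalizes them with the standard unicodedata module. It relies on the assumption, labeled in the comments, that Unicode has no precomposed "q with macron"; the variable names are illustrative only.

```python
import unicodedata

# "e" + COMBINING ACUTE ACCENT: a precomposed equivalent (U+00E9) exists.
e_acute = "e\u0301"
# "q" + COMBINING MACRON: assumed here to have no precomposed equivalent.
q_macron = "q\u0304"

composed_e = unicodedata.normalize("NFC", e_acute)
composed_q = unicodedata.normalize("NFC", q_macron)

# Number of code points in each string after composition
print(len(composed_e), [f"U+{ord(c):04X}" for c in composed_e])
print(len(composed_q), [f"U+{ord(c):04X}" for c in composed_q])
```

The second string stays a two-character sequence after normalization, yet it is still a valid representation of the accented letter, which is the usual answer to a request for a new precomposed character.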

If you still think you have a character that needs to be encoded, check the instructions on submitting characters on the Unicode web site. Their basic content is that a proposal must be sent in writing and it must contain:

  • At least one image of the proposed character, normally from a printed source (and including several images will help in illustrating the character)

  • Substantial documentation that justifies the proposal (explaining, among other things, how the character is used in texts and why it needs to be recognized as different from existing Unicode characters)

  • Identification of the sponsor(s), with contact information (postal and email address and phone number)

You should normally first send an informal query on the matter to the public Unicode discussion list (email list), described at http://www.unicode.org/consortium/distlist.html. You might also take a look at the document registry at http://std.dkuug.dk/jtc1/sc2/wg2/ to see what proposals look like and how detailed they are.

4.8.4. How Can I Check That I've Understood the Principles?

The principles of Unicode aren't something you need to learn by heart. Rather, you learn them when you read more about Unicode and work with it. Still, it might be a good idea to sit down and check whether you can write down the 10 principles of Unicode. Specify each of them for yourself with a word or two that name the principle, and then write a simple sentence that says something about it, maybe just an example. Then check your list against the list given in the section "Design Principles" in this chapter, or against the description in the Unicode standard (in Version 4.1, it's in section 2.2).

As a different test, read the very short description of Unicode, "What is Unicode?" or one of its translations at http://www.unicode.org/standard/WhatIsUnicode.html. Read it with a critical mind, and ask yourself the following questions:

  • If you had to use the description as a basis when talking about Unicode, could you back up any general statement there with at least one concrete example? (This tests your general understanding of Unicode, not just this chapter.)

  • If you had to explain Unicode at elementary school, which parts of the description would you omit?

  • The description says that "Unicode provides a unique number for every character." What does "every character" really mean here?

  • Name at least three essential problems of using Unicode that are not mentioned in the description.



Unicode Explained
ISBN: 059610121X
Year: 2006
Pages: 139
