E.1. Introduction

The use of inconsistent character encodings (i.e., numeric values associated with characters) in the developing of global software products causes serious problems, because computers process information as numbers. For instance, the character "a" is converted to a numeric value so that a computer can manipulate that piece of data. Many countries and corporations have developed their own encoding systems that are incompatible with the encoding systems of other countries and corporations. For example, the Microsoft Windows operating system assigns the value 0xC0 to the character "A with a grave accent"; the Apple Macintosh operating system assigns that same value to an upside-down question mark. This results in the misrepresentation and possible corruption of data when data is not processed as intended.

In the absence of a widely implemented universal character-encoding standard, global software developers had to localize their products extensively before distribution. Localization includes the language translation and cultural adaptation of content. The process of localization usually includes significant modifications to the source code (such as the conversion of numeric values and the underlying assumptions made by programmers), which results in increased costs and delays releasing the software. For example, some English-speaking programmers might design global software products assuming that a single character can be represented by one byte. However, when those products are localized for Asian markets, the programmer's assumptions are no longer valid; thus, the majority, if not the entirety, of the code needs to be rewritten. Localization is necessary with each release of a version. By the time a software product is localized for a particular market, a newer version, which needs to be localized as well, may be ready for distribution. As a result, it is cumbersome and costly to produce and distribute global software products in a market where there is no universal character-encoding standard.

In response to this situation, the Unicode Standard, an encoding standard that facilitates the production and distribution of software, was created. The Unicode Standard outlines a specification to produce consistent encoding of the world's characters and symbols. Software products that handle text encoded in the Unicode Standard need to be localized, but the localization process is simpler and more efficient because the numeric values need not be converted and the assumptions made by programmers about the character encoding are universal. The Unicode Standard is maintained by a nonprofit organization called the Unicode Consortium, whose members include Apple, IBM, Microsoft, Oracle, Sun Microsystems, Sybase and many others.

When the Consortium envisioned and developed the Unicode Standard, they wanted an encoding system that was universal, efficient, uniform and unambiguous. A universal encoding system encompasses all commonly used characters. An efficient encoding system allows text files to be parsed easily. A uniform encoding system assigns fixed values to all characters. An unambiguous encoding system represents a given character in a consistent manner. These four terms are referred to as the Unicode Standard design basis.

E 2 Unicode Transformation Formats



    Introduction to Computers, the Internet and Visual C#

    Introduction to the Visual C# 2005 Express Edition IDE

    Introduction to C# Applications

    Introduction to Classes and Objects

    Control Statements: Part 1

    Control Statements: Part 2

    Methods: A Deeper Look


    Classes and Objects: A Deeper Look

    Object-Oriented Programming: Inheritance

    Polymorphism, Interfaces & Operator Overloading

    Exception Handling

    Graphical User Interface Concepts: Part 1

    Graphical User Interface Concepts: Part 2


    Strings, Characters and Regular Expressions

    Graphics and Multimedia

    Files and Streams

    Extensible Markup Language (XML)

    Database, SQL and ADO.NET

    ASP.NET 2.0, Web Forms and Web Controls

    Web Services

    Networking: Streams-Based Sockets and Datagrams

    Searching and Sorting

    Data Structures



    Appendix A. Operator Precedence Chart

    Appendix B. Number Systems

    Appendix C. Using the Visual Studio 2005 Debugger

    Appendix D. ASCII Character Set

    Appendix E. Unicode®

    Appendix F. Introduction to XHTML: Part 1

    Appendix G. Introduction to XHTML: Part 2

    Appendix H. HTML/XHTML Special Characters

    Appendix I. HTML/XHTML Colors

    Appendix J. ATM Case Study Code

    Appendix K. UML 2: Additional Diagram Types

    Appendix L. Simple Types


    Visual C# How to Program
    Visual C# 2005 How to Program (2nd Edition)
    ISBN: 0131525239
    EAN: 2147483647
    Year: 2004
    Pages: 600

    Flylib.com © 2008-2020.
    If you may any questions please contact us: flylib@qtcs.net