Encodings in the .NET Framework

Glossary


  • ASP.NET : Stands for Microsoft Active Server Pages for the .NET Framework. The new generation of Active Server Pages (ASP) files written in a managed language on the Common Language Runtime (CLR) using the .NET Framework. Also known as "ASP+" and "ASPX."

The .NET Framework is a platform for building, deploying, and running Web services and applications that provides a highly productive, standards-based, multilanguage environment for integrating existing or legacy investments with next-generation applications and services. The .NET Framework uses Unicode UTF-16 to represent characters, although in some cases it uses UTF-8 internally. The System.Text namespace provides classes that allow you to encode and decode characters, with support that includes the following encodings:

  • Unicode UTF-16 encoding. Use the UnicodeEncoding class to convert characters to and from UTF-16 encoding.
  • Unicode UTF-8 encoding. Use the UTF8Encoding class to convert characters to and from UTF-8 encoding.
  • ASCII encoding. ASCII encodes the Latin alphabet as single 7-bit characters. Because this encoding only supports character values from U+0000 through U+007F, in most cases it is inadequate for internationalized applications. You can use the ASCIIEncoding class to convert characters to and from ASCII encoding whenever you need to interoperate with legacy encodings and systems.
  • Windows/ISO Encodings. The System.Text.Encoding class provides support for a wide range of Windows/ISO encodings.
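The difference between these classes is easiest to see by encoding the same string with each one. The following is a minimal sketch, not part of the original sample set; the class name EncodingClassesSketch and its helper methods are hypothetical names chosen for illustration.

```csharp
using System;
using System.Text;

public static class EncodingClassesSketch
{
    // Returns the number of bytes the given encoding produces for the text.
    public static int ByteCount(Encoding encoding, string text)
    {
        return encoding.GetBytes(text).Length;
    }

    // Encodes the text and decodes it again with the same encoding.
    public static string RoundTrip(Encoding encoding, string text)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    public static void Main()
    {
        // Three ASCII letters followed by U+00E9 (e with acute accent).
        string sample = "caf\u00e9";

        Encoding utf16 = new UnicodeEncoding(); // little-endian UTF-16
        Encoding utf8 = new UTF8Encoding();
        Encoding ascii = new ASCIIEncoding();

        Console.WriteLine("UTF-16 bytes: {0}", ByteCount(utf16, sample));
        Console.WriteLine("UTF-8 bytes:  {0}", ByteCount(utf8, sample));
        Console.WriteLine("ASCII bytes:  {0}", ByteCount(ascii, sample));

        // U+00E9 lies outside U+0000 through U+007F, so ASCII cannot
        // represent it and the round trip does not return the original.
        Console.WriteLine("ASCII round trip is lossless: {0}",
            RoundTrip(ascii, sample) == sample);
    }
}
```

UTF-16 uses two bytes for each of the four characters, UTF-8 uses one byte for each ASCII letter and two for U+00E9, and ASCII substitutes a question mark for the character it cannot represent.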

Encoding Support for Code Pages

The use of Unicode in the .NET Framework simplifies the development of world-ready applications because you no longer need to reference a code page. The .NET Framework nevertheless supports data encoded using code pages. You can use the Encoding.GetEncoding(Int32) method to create a target encoding object for a specified code page; pass the code page number as the Int32 parameter. The following code example creates an encoding object, enc, for code page 1252.

 [Visual Basic]
 Dim enc As Encoding = Encoding.GetEncoding(1252)

 [C#]
 Encoding enc = Encoding.GetEncoding(1252);

After you create an encoding object that corresponds to a specified code page, you can use the object to perform other operations supported by the System.Text.Encoding class.
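As an illustration of such operations, the sketch below queries an encoding object for code page 1252 and uses it to size and decode data. It assumes the .NET Framework, where code page 1252 is available by default; on .NET Core and later versions you would first have to register CodePagesEncodingProvider. The class name CodePageOperationsSketch and the Decode helper are hypothetical.

```csharp
using System;
using System.Text;

public static class CodePageOperationsSketch
{
    // Decodes bytes that are assumed to be stored in the given code page.
    public static string Decode(byte[] bytes, int codePage)
    {
        return Encoding.GetEncoding(codePage).GetString(bytes);
    }

    public static void Main()
    {
        Encoding enc = Encoding.GetEncoding(1252);

        // Properties that identify the encoding.
        Console.WriteLine("Code page: {0}", enc.CodePage);
        Console.WriteLine("Web name:  {0}", enc.WebName);

        // GetByteCount reports the buffer size needed before encoding.
        string text = "Price: \u20ac" + "5";
        Console.WriteLine("Bytes needed: {0}", enc.GetByteCount(text));

        // In code page 1252 the byte 0x80 is the euro sign (U+20AC)
        // and 0x35 is the digit 5.
        string decoded = Decode(new byte[] { 0x80, 0x35 }, 1252);
        Console.WriteLine("First character is the euro sign: {0}",
            decoded[0] == '\u20ac');
    }
}
```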

The one additional type of support introduced to ASP.NET is the ability to clearly distinguish between file, request, and response encodings. You can set each of these encodings in code, in page directives, or in a configuration file, as follows.

In code:

 Response.ContentEncoding=<value>
 Request.ContentEncoding=<value>
 File.ContentEncoding=<value>

In page directives:

 <%@Page ResponseEncoding=<value>%>
 <%@Page RequestEncoding=<value>%>
 <%@Page FileEncoding=<value>%>

In a configuration file:

 <configuration>
   <globalization
     fileEncoding=<value>
     requestEncoding=<value>
     responseEncoding=<value>
   />
 </configuration>

The following code example in C# uses the Encoding.GetEncoding method to create a target encoding object for a specified code page. The Encoding.GetBytes method is then called on the target encoding object to convert a Unicode string to its byte representation in the target encoding, and the resulting bytes are displayed for each code page.

 using System;
 using System.IO;
 using System.Globalization;
 using System.Text;

 public class Encoding_UnicodeToCP
 {
    public static void Main()
    {
       // Convert ASCII characters to bytes.
       // Display the string's byte representation in the
       // specified code page.
       // Code page 1252 represents Latin characters.
       PrintCPBytes("Hello, World!",1252);
       // Code page 932 represents Japanese characters.
       PrintCPBytes("Hello, World!",932);

       // Convert Japanese characters to bytes.
       PrintCPBytes("\u307b,\u308b,\u305a,\u3042,\u306d",1252);
       PrintCPBytes("\u307b,\u308b,\u305a,\u3042,\u306d",932);
    }

    public static void PrintCPBytes(string str, int codePage)
    {
       Encoding targetEncoding;
       byte[] encodedChars;

       // Get the encoding for the specified code page.
       targetEncoding = Encoding.GetEncoding(codePage);

       // Get the byte representation of the specified string.
       encodedChars = targetEncoding.GetBytes(str);

       // Print the bytes.
       Console.WriteLine
          ("Byte representation of '{0}' in Code Page '{1}':", str,
              codePage);
       for (int i = 0; i < encodedChars.Length; i++)
          Console.WriteLine("Byte {0}: {1}", i, encodedChars[i]);
    }
 }
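Not every string can be represented in every code page. Code page 1252, for example, has no Japanese characters, so the conversion replaces them with substitute bytes. One way to detect this, sketched below under the assumption that both code pages are installed (as they are by default on the .NET Framework), is to encode the string, decode the bytes again, and compare the result with the original. The class name LossyConversionCheck and the method IsRepresentable are hypothetical.

```csharp
using System;
using System.Text;

public static class LossyConversionCheck
{
    // Returns true when the string survives an encode/decode round trip
    // through the specified code page without any change.
    public static bool IsRepresentable(string str, int codePage)
    {
        Encoding enc = Encoding.GetEncoding(codePage);
        byte[] bytes = enc.GetBytes(str);
        return enc.GetString(bytes) == str;
    }

    public static void Main()
    {
        string latin = "Hello, World!";
        string japanese = "\u307b,\u308b,\u305a,\u3042,\u306d";

        // ASCII text is representable in both code pages.
        Console.WriteLine("Latin in 1252:    {0}", IsRepresentable(latin, 1252));
        Console.WriteLine("Latin in 932:     {0}", IsRepresentable(latin, 932));

        // Hiragana exist in code page 932 but not in code page 1252.
        Console.WriteLine("Japanese in 1252: {0}", IsRepresentable(japanese, 1252));
        Console.WriteLine("Japanese in 932:  {0}", IsRepresentable(japanese, 932));
    }
}
```

A round-trip comparison of this kind is a simple check rather than the only option; it is useful before you write user data to a file or protocol that is limited to a single code page.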

To specify the encoding to use for response characters in an ASP.NET application, set the HttpResponse.ContentEncoding property to the encoding object returned by the appropriate property or method. The following code example illustrates how to set HttpResponse.ContentEncoding.

 // Explicitly set the encoding to UTF-8.
 Response.ContentEncoding = Encoding.UTF8;

 // Set ContentEncoding using the name of an encoding.
 Response.ContentEncoding = Encoding.GetEncoding(name);

 // Set ContentEncoding using a code page number.
 Response.ContentEncoding = Encoding.GetEncoding(codepageNumber);
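The name and codepageNumber placeholders above stand for any values that Encoding.GetEncoding accepts, and a name and a number can identify the same encoding. The following console sketch, which runs outside ASP.NET and uses the hypothetical names EncodingLookupSketch and SameEncoding, compares the two lookup styles.

```csharp
using System;
using System.Text;

public static class EncodingLookupSketch
{
    // Returns true when an encoding name and a code page number
    // resolve to encodings with the same code page.
    public static bool SameEncoding(string name, int codepageNumber)
    {
        return Encoding.GetEncoding(name).CodePage ==
               Encoding.GetEncoding(codepageNumber).CodePage;
    }

    public static void Main()
    {
        // "utf-8" and 65001 both identify UTF-8.
        Console.WriteLine(SameEncoding("utf-8", 65001));

        // "shift_jis" and 932 both identify the Japanese code page.
        Console.WriteLine(SameEncoding("shift_jis", 932));

        // The static Encoding.UTF8 property needs no lookup at all.
        Console.WriteLine(Encoding.UTF8.WebName);
    }
}
```

Whichever form you use, the resulting object can be assigned to Response.ContentEncoding in the same way.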

The last major area of discussion involves encodings in console or text-mode programming. In the section that follows, you'll find information on using the Win32 API and C run-time (CRT) library functions, CRT console input/output (I/O), and Win32 text-mode I/O, should you need to deal with this sort of application.



Microsoft Corporation - Developing International Software
ISBN: 0735615837
Year: 2003
Pages: 198
