International Features

IIS and its related technologies, such as ASP, are key to developing a non-English Web site. The international features of IIS include its code-page settings, request collections, Uniform Resource Identifier (URI) processing, and logging, along with the other features discussed in the sections that follow.

Code-Page Settings

Glossary


  • Global.asa: A file typically containing scripts that initialize application or session variables, connect to databases, send cookies, and perform other operations pertaining to the ASP application or to the user's session with an ASP application as a whole.
  • Indexing Service: A base service of Windows NT, Windows 2000, and Windows XP that extracts content from files and constructs an indexed catalog to facilitate efficient and rapid searching. Indexing Service can extract both text and property information from files on the local host and on remote, networked hosts.

When the server receives a request for an ASP file, it processes server-side scripts contained in the file to build the Web page that is sent to the browser. In IIS 6, ASP and the script engines it supports use Unicode internally. If you author all of your pages in the default code page of the Web server, ASP automatically converts strings to Unicode. (Again, this is true only for IIS 6.) If your script was not created for the Web server's or the browser's default code page, you need to specify the code page. This will allow strings to be correctly converted as they are passed between the script and the ASP engine, between the ASP engine and the browser, and between the ASP engine and COM components.

To specify the code page for an ASP page, you can use the Response.CodePage property, the Session.CodePage property, the AspCodePage metabase property, the @CODEPAGE directive, locale ID (LCID) settings, and the HTTP Charset attribute. (For more information on Response.CodePage and Session.CodePage, see Chapter 3, "Unicode.")

IIS does not offer its own code-page or locale support, such as National Language Support (NLS) features, so it uses the support available within Windows. Consequently, when you select a code page (charset) or locale for an ASP page, session, or response, you must have the appropriate Windows support installed on the server computer.

Response.CodePage Property

When a script is executed, Response.CodePage determines how characters are encoded. If your ASP page runs on IIS 5.1 or later, it is always better to set the value of Response.CodePage explicitly. This way, you eliminate the implicit behavior that can cause text transformations you do not expect. If Response.CodePage is not set explicitly in a Web page, then it is set implicitly according to this hierarchy:

  • If sessions are enabled and Session.CodePage is set explicitly, Session.CodePage sets Response.CodePage.
  • If @CODEPAGE is defined at the top of the page, @CODEPAGE sets Response.CodePage.
  • If the AspCodePage metabase property for the application is set to a value other than zero, AspCodePage sets Response.CodePage.
  • Otherwise, the Web server's default system code page sets Response.CodePage.

Response.CodePage allows applications that don't enable session state to specify a code page dynamically. For run-time execution of ASP, the following logic for initializing Response.CodePage is used.

If session state is disabled:

 if (@CODEPAGE defined)
    Response.CodePage = @CODEPAGE value
 else if (AspCodePage property present)
    Response.CodePage = value at AspCodePage
 else
    Response.CodePage = CP_ACP

Note


CP_ACP is a Windows constant that represents the active Windows (sometimes called "ANSI") code page of the system, which in this context is the Web server.

If session state is enabled:

 if (Session.CodePage set explicitly (via script execution))
    Response.CodePage = Session.CodePage
 else if (@CODEPAGE defined)
    Response.CodePage = @CODEPAGE value
 else if (AspCodePage property present (implicitly sets Session.CodePage))
    Response.CodePage = value at AspCodePage
 else
    Response.CodePage = CP_ACP
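The two cases above can be folded into one function. The following Python sketch only models the lookup order described in this section; it is not ASP code, and the function name, parameter names, and the default system code page of 1252 are assumptions made for illustration.

```python
from typing import Optional

# Assumption: the server's active Windows code page (CP_ACP) resolves to 1252.
SYSTEM_CODE_PAGE = 1252


def resolve_response_code_page(
    session_enabled: bool,
    session_code_page: Optional[int] = None,  # Session.CodePage, if set by script
    at_codepage: Optional[int] = None,        # @CODEPAGE directive, if present
    asp_code_page: int = 0,                   # AspCodePage; 0 is treated as "not set"
    system_code_page: int = SYSTEM_CODE_PAGE,
) -> int:
    """Return the implicit value of Response.CodePage for one request."""
    # Session.CodePage is consulted only when session state is enabled.
    if session_enabled and session_code_page is not None:
        return session_code_page
    if at_codepage is not None:
        return at_codepage
    if asp_code_page != 0:
        return asp_code_page
    return system_code_page


if __name__ == "__main__":
    # Session state on, Session.CodePage set to UTF-8: it wins over @CODEPAGE.
    print(resolve_response_code_page(True, session_code_page=65001, at_codepage=932))
    # Session state off: the same Session.CodePage value is ignored.
    print(resolve_response_code_page(False, session_code_page=65001, at_codepage=932))
```

Setting Response.CodePage explicitly in the page bypasses this lookup entirely, which is why the text recommends doing so.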

Session.CodePage Property

If session state is enabled and the value of Response.CodePage is not set explicitly, the value of Session.CodePage is used during script execution to dynamically specify how the strings coming back from the script engine are to be sent to the client.

AspCodePage Metabase Property

Residing at the application level in the metabase, AspCodePage acts as the default value for any of the initial values of Session.CodePage, Response.CodePage, and @CODEPAGE. However, this default value can be overridden by @CODEPAGE when compiling an ASP page. The compile-time code page for Global.asa is specified by using AspCodePage.

@CODEPAGE Directive

The @CODEPAGE value is used to compile the ASP file, for example to convert the contents of the ASP file to Unicode, as required by the script-engine interface. Specifying a code page in this manner ensures that your literal strings are converted correctly. The value is also used for run-time execution, but has no life beyond the executing script. For compiling the ASP page, the order of precedence is:

 if (@CODEPAGE defined)
    use @CODEPAGE value
 else if (AspCodePage property is present)
    use value at AspCodePage
 else
    use CP_ACP
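To see why the compile-time code page matters for literal strings, the sketch below picks a code page in the order just given and uses it to decode the raw bytes of a page. It is a Python model rather than ASP; the helper names, the small code-page-to-codec table, and the fallback value of 1252 are assumptions for illustration.

```python
from typing import Optional

# Assumption: a small, illustrative mapping from Windows code pages to Python codecs.
PYTHON_CODECS = {932: "cp932", 1252: "cp1252", 65001: "utf-8"}


def compile_time_code_page(
    at_codepage: Optional[int],
    asp_code_page: int = 0,        # 0 is treated as "not set"
    system_code_page: int = 1252,  # assumed value behind CP_ACP
) -> int:
    """Pick the code page used to read the ASP source file."""
    if at_codepage is not None:
        return at_codepage
    if asp_code_page != 0:
        return asp_code_page
    return system_code_page


def source_to_unicode(raw: bytes, code_page: int) -> str:
    """Convert the bytes of a source file to Unicode text."""
    return raw.decode(PYTHON_CODECS[code_page])


if __name__ == "__main__":
    # A page saved in code page 932 that contains a Japanese string literal.
    raw = 'Response.Write "日本語"'.encode("cp932")
    print(source_to_unicode(raw, compile_time_code_page(932)))   # literal survives
    print(source_to_unicode(raw, compile_time_code_page(None)))  # literal is garbled
```

The second call falls back to the assumed system code page, so the same bytes come out as unrelated Western European characters.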

However, you should be aware that if Response.CodePage is initialized by Server.Execute using a different value for @CODEPAGE than that of the calling ASP, the encoding is not reset to the calling ASP upon return from the child ASP. Instead, the encoding stays in the new code page.

LCID Settings

The same basic logic that Response.CodePage, Session.CodePage, and @CODEPAGE use applies to the LCID settings Response.LCID, Session.LCID, the AspLCID metabase property, and @LCID. If Session.LCID is not explicitly set in a page, it is implicitly set by the AspLCID metabase property. If AspLCID is not set, or if it is set to zero, Session.LCID is set by the default system locale. Session.LCID can be set multiple times in one Web page and used to format data each time. (For more information on LCIDs, see Chapter 4, "Locale and Cultural Awareness.")
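As a rough model of the Session.LCID fallback and of formatting the same value under several locales within one page, consider the following Python sketch. The function names and the default system LCID of 1033 are assumptions, and the date patterns are simplified, zero-padded stand-ins for the real Windows locale data.

```python
from datetime import date
from typing import Optional

# Assumption: simplified short-date patterns keyed by LCID
# (1033 = English, United States; 1031 = German, Germany; 1041 = Japanese, Japan).
SHORT_DATE_PATTERNS = {1033: "%m/%d/%Y", 1031: "%d.%m.%Y", 1041: "%Y/%m/%d"}


def resolve_session_lcid(
    explicit_lcid: Optional[int],
    asp_lcid: int = 0,        # AspLCID; 0 or unset falls through
    system_lcid: int = 1033,  # assumed default system locale
) -> int:
    """Return the effective Session.LCID."""
    if explicit_lcid is not None:
        return explicit_lcid
    if asp_lcid != 0:
        return asp_lcid
    return system_lcid


def format_short_date(value: date, lcid: int) -> str:
    """Format a date using the illustrative pattern for the given LCID."""
    return value.strftime(SHORT_DATE_PATTERNS[lcid])


if __name__ == "__main__":
    day = date(2003, 1, 5)
    # Changing the LCID between calls mimics resetting Session.LCID within a page.
    for lcid in (resolve_session_lcid(None), 1031, 1041):
        print(lcid, format_short_date(day, lcid))
```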

HTTP Charset Attribute

The Charset attribute, which specifies the character set to be used in an HTML page, remains unchanged for IIS 6. The Indexing Service uses the character set specified by Charset to properly display text in the HTML page. It is recommended that you set Response.CodePage, Response.CharSet, and the Charset attribute to the same value. In a file written in Japanese, you would set the Charset attribute to shift_jis as shown in the following example:

 <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=shift_jis">
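The reason for keeping the declared charset and the actual encoding in step can be shown with a short Python sketch. It is not IIS code: the helper names are hypothetical, and the page is reduced to a bare skeleton.

```python
def content_type_meta(charset: str) -> str:
    """Build the META element that declares the page's character set."""
    return f'<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset={charset}">'


def render_page(body: str, charset: str) -> bytes:
    """Encode the page with the same charset that the META element declares."""
    html = (
        f"<html><head>{content_type_meta(charset)}</head>"
        f"<body>{body}</body></html>"
    )
    return html.encode(charset)


if __name__ == "__main__":
    page = render_page("日本語", "shift_jis")
    # A reader that honors the declared charset recovers the text.
    print("日本語" in page.decode("shift_jis"))
    # A reader that assumes a Western European encoding sees different characters.
    print("日本語" in page.decode("latin-1"))
```

When the bytes and the declaration disagree, a browser or indexer that trusts the declaration decodes the page the way the second call does.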

Another international feature of IIS involves request collections. The following section discusses how these collections work.

Request Collections

Because customers were once obliged to choose non-Unicode character sets instead of UTF-8 to build content on IIS 4 and IIS 5, and because ASP execution on IIS 4 and IIS 5 didn't fully support UTF-8, support for Windows code pages is still necessary. The processing code page used by the request collections is set with the Response.CodePage property. In addition, UTF-8 support has been added for the five types of request collections in ASP: ClientCertificate, Cookies, Form, QueryString, and ServerVariables.

ClientCertificate

Windows XP security fully supports ClientCertificate encoded in UTF-8.

Cookies

IIS 6 will convert cookie parameters to and from Unicode using the current Response.CodePage. This affects NAME, VALUE, PATH, and DOMAIN elements in the cookie. If an element of the cookie contains nonalphanumeric characters, it will be URL-encoded with % escaping and hexadecimal codes after WideCharToMultiByte conversion.
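The order of operations (convert from Unicode to the response code page first, then percent-escape the resulting bytes) can be sketched in Python as follows. This is a model of the description above, not the IIS implementation; the function name and the code-page-to-codec table are assumptions.

```python
# Assumption: a small, illustrative mapping from Windows code pages to Python codecs.
PYTHON_CODECS = {932: "cp932", 1252: "cp1252", 65001: "utf-8"}


def encode_cookie_element(text: str, response_code_page: int) -> str:
    """Convert text to the response code page, then %-escape the bytes."""
    # Step 1: Unicode to multibyte (the role WideCharToMultiByte plays on Windows).
    raw = text.encode(PYTHON_CODECS[response_code_page])
    # Step 2: keep ASCII letters and digits, escape every other byte as %XX.
    parts = []
    for byte in raw:
        char = chr(byte)
        if byte < 0x80 and char.isalnum():
            parts.append(char)
        else:
            parts.append(f"%{byte:02X}")
    return "".join(parts)


if __name__ == "__main__":
    # The same text yields different escapes under different code pages.
    print(encode_cookie_element("日本", 932))
    print(encode_cookie_element("日本", 65001))
```

Because the escapes depend on the code page, the reader of the cookie has to decode it with the same Response.CodePage that wrote it.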

Form

All properties of the Form collection support Windows code pages and UTF-8.

QueryString

All properties of the QueryString collection support Windows code pages and UTF-8.

ServerVariables

In IIS 6, all file names and metabase paths are in Unicode. This is a change from IIS 4 and IIS 5, where file names and paths were dependent on CP_ACP. In addition, the core Web server maintains them as Unicode, and ASP now passes them through to the scripting engines as Unicode. To provide for these changes, the ServerVariables in Table 15-2 now also support Unicode. (Go to http://msdn.microsoft.com to learn more about each variable.)

Table 15-2 Unicode-supported ServerVariables.

 APP_POOL_ID              LOGON_USER
 APPL_MD_PATH             PATH_INFO
 APPL_PHYSICAL_PATH       PATH_TRANSLATED
 AUTH_USER                REMOTE_HOST
 CERT_ISSUER              REMOTE_USER
 CERT_SERVER_ISSUER       SCRIPT_NAME
 CERT_SERVER_SUBJECT      SCRIPT_TRANSLATED
 CERT_SUBJECT             SERVER_NAME
 HTTPS_SERVER_ISSUER      UNMAPPED_REMOTE_USER
 HTTPS_SERVER_SUBJECT     URL
 INSTANCE_META_PATH

URI Processing

Another international feature of IIS 6 is URI processing. ASP can now receive code page-based or UTF-8 URIs from the client. When a client sends a URI that contains characters from the double-byte character set (DBCS), IIS first tries to resolve the URI as UTF-8. If IIS cannot connect to that URI, the URI is assumed to be encoded in the active Windows code page of the server. If it's not a UTF-8 string, MultiByteToWideChar is called to convert the URI string to Unicode. When a client sends a UTF-8 URI, IIS looks at the URI and tries to determine if it is a UTF-8 URI. If it is, the URI is simply converted back to a Unicode string.

On servers where the active code page is a DBCS encoding, the algorithm first tries to resolve the URI as a DBCS string in the current system locale, and not as a UTF-8 string. The rest of the algorithm is not changed.
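The fallback order can be approximated with the Python sketch below. One simplification is an assumption of this sketch and not something the text states: a candidate encoding is treated as failing when the bytes are not valid in it, whereas the text describes IIS falling back when it cannot resolve the URI. The function and parameter names are hypothetical.

```python
from urllib.parse import unquote_to_bytes


def decode_request_uri(
    uri: str,
    server_codec: str = "cp1252",   # assumed codec for the server's active code page
    code_page_first: bool = False,  # True models a server whose code page is DBCS
) -> str:
    """Turn a %-escaped request URI into Unicode text."""
    raw = unquote_to_bytes(uri)
    # Default order: UTF-8 first, then the server's code page.
    candidates = ["utf-8", server_codec]
    if code_page_first:
        candidates.reverse()
    for codec in candidates:
        try:
            return raw.decode(codec)
        except UnicodeDecodeError:
            continue
    raise ValueError(f"URI is not valid in any candidate encoding: {uri!r}")


if __name__ == "__main__":
    print(decode_request_uri("/%E6%97%A5%E6%9C%AC.asp"))    # valid UTF-8
    print(decode_request_uri("/caf%E9.asp"))                # falls back to cp1252
    print(decode_request_uri("/%93%FA%96%7B.asp", "cp932", code_page_first=True))
```

Trying the code page first on DBCS servers matters because many UTF-8 byte sequences also happen to be valid DBCS sequences, so the order of the attempts decides the result.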

Other Methods and Components

IIS 6 now supports Windows "ANSI" code-page or UTF-8 parameters for the methods and components shown in Table 15-3. (Go to http://msdn.microsoft.com to learn more about these methods and components.)

Table 15-3 Methods and components supported by UTF-8.

 Server.Execute           ContRot.dll
 Server.MapPath           NextLink.dll
 Server.Transfer          FileSystemObject
 Server.URLEncode         Logging Utility
 Server-Side Includes     Page Counter Component
 Global.asa               Permission Checker Component
 AdRot.dll                Tools Component

IIS 6 also supports many new Unicode server-support functions. UNICODE_ server variables are not exposed to ASP. These variables should be used in programs that directly access Internet Server Application Programming Interface (ISAPI) extensions, a topic that is beyond the scope of this book. If these variables are used in an ASP page, an error will be returned. Since ASP developers operate in a Unicode world already, there is no need to support this new functionality.

Logging has also been updated to handle content that is increasingly international. The following section explains the support that logging offers.

Logging

To be more internationally flexible, IIS 6 features the ability to write out log files in UTF-8. By default, this feature is turned off, and files are encoded as the active Windows "ANSI" Code Page (ACP). There is a check box at the server level that tells the system whether to use UTF-8 when logging. Once the Web or FTP services have been started, this switch cannot be changed. To modify its setting, these services must be stopped. The names of Web and FTP services in the list of Windows services appear, respectively, as "World Wide Web Publishing Service" and the "FTP Publishing Service."

It is important to be aware that there are a few limitations associated with IIS 4 and IIS 5 in terms of their international support. The following will show you what these issues are and how to handle them.



Microsoft Corporation - Developing International Software
Developing International Software
ISBN: 0735615837
EAN: 2147483647
Year: 2003
Pages: 198
