13.5 Dynamic Language Negotiation


Now let's push the envelope yet a little farther (perhaps off the edge of the table) with a servlet that tailors its output to match the language preferences of the client. This allows the same URL to serve its content to readers across the globe in their native tongues.

13.5.1 Language Preferences

There are two ways a servlet can know the language preferences of the client. First, the browser can send the information as part of its request. Newer browsers, beginning with Netscape Navigator 4 and Microsoft Internet Explorer 4, allow users to specify their preferred languages. With Netscape Navigator 4 and 6, this is done under Edit > Preferences > Navigator > Languages. With Microsoft Internet Explorer 4, it's done under View > Internet Options > General > Languages; Internet Explorer 5 moves the option from under View to under Tools.

A browser sends the user's language preferences to the server using the Accept-Language HTTP header. The value of this header specifies the language or languages that the client prefers to receive. Note that the HTTP specification allows this preference to be ignored. An Accept-Language header value looks something like the following:

en, es, de, ja, zh-TW

This indicates the client user reads English, Spanish, German, Japanese, and Chinese appropriate for Taiwan. By convention, languages are listed in order of preference. Each language may also include a q-value that indicates, on a scale from 0.0 to 1.0, an estimate of the user's preference for that language. The default q-value is 1.0 (maximum preference). An Accept-Language header value including q-values looks like this:

en, es;q=0.8, de;q=0.7, ja;q=0.3, zh-TW;q=0.1

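This header value means essentially the same thing as the previous example: the languages are ranked in the same order, and the q-values simply make the relative strength of each preference explicit.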
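To make the header format concrete, here is a minimal sketch of a parser that turns an Accept-Language value into a list of Locale objects, most preferred first. The class name AcceptLanguageParser is hypothetical (it is not part of com.oreilly.servlet), and the code assumes Java 8 or later. Entries with equal q-values keep their header order, following the listing convention described above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of com.oreilly.servlet.
final class AcceptLanguageParser {

  // One language from the header together with its q-value.
  private static final class Entry {
    final Locale locale;
    final double q;
    Entry(Locale locale, double q) { this.locale = locale; this.q = q; }
  }

  // Returns the locales named in the header, most preferred first.
  static List<Locale> parse(String header) {
    List<Entry> entries = new ArrayList<>();
    if (header == null) return new ArrayList<>();

    for (String part : header.split(",")) {
      String item = part.trim();
      int semi = item.indexOf(';');
      String tag = (semi == -1 ? item : item.substring(0, semi)).trim();
      String param = (semi == -1 ? "" : item.substring(semi + 1)).trim();

      // Skip empty items and the "*" wildcard, which names no language.
      if (tag.isEmpty() || tag.equals("*")) continue;

      double q = 1.0;  // the default q-value
      if (param.toLowerCase(Locale.ROOT).startsWith("q=")) {
        try {
          q = Double.parseDouble(param.substring(2).trim());
        }
        catch (NumberFormatException e) {
          q = 0.0;  // assumption: an unreadable q-value means "not acceptable"
        }
      }
      if (q <= 0.0) continue;  // q=0 marks a language as not acceptable

      entries.add(new Entry(Locale.forLanguageTag(tag), q));
    }

    // List.sort is stable, so equal q-values keep their header order.
    entries.sort(Comparator.comparingDouble((Entry e) -> e.q).reversed());

    List<Locale> result = new ArrayList<>();
    for (Entry e : entries) result.add(e.locale);
    return result;
  }

  public static void main(String[] args) {
    System.out.println(parse("en, es;q=0.8, de;q=0.7, ja;q=0.3, zh-TW;q=0.1"));
  }
}
```

Sorting by q-value rather than trusting header order means a header such as "ja;q=0.3, en" still yields English first.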

The second way a servlet can know the language preferences of the client is by asking. For example, a servlet might generate a form that asks which language the client prefers. Thereafter, it can remember and use the answer, perhaps using the session-tracking techniques discussed in Chapter 7.

13.5.2 Charset Preferences

In addition to an Accept-Language HTTP header, a browser may send an Accept-Charset header that tells the server which charsets it understands. An Accept-Charset header value may look something like this:

iso-8859-1, utf-8

This indicates the browser understands ISO-8859-1 and UTF-8. If the Accept-Charset header isn't sent or if its value contains an asterisk (*), it can be assumed the client accepts all charsets. Note that the current usefulness of this header is limited: few browsers yet send the header, and those browsers that do tend to send a value that contains an asterisk.
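The rule just described (no header, or a header containing an asterisk, means every charset is acceptable) can be sketched as a small check. The class and method names here are hypothetical, and the sketch deliberately ignores any q-values attached to the charsets, so a charset listed with q=0 would be wrongly reported as acceptable.

```java
// Hypothetical helper for illustration; not part of com.oreilly.servlet.
final class AcceptCharsetCheck {

  // Returns true if the Accept-Charset value permits the given charset.
  // Simplification: q-values are ignored.
  static boolean accepts(String acceptCharset, String charset) {
    if (acceptCharset == null) return true;  // header not sent: assume all

    for (String part : acceptCharset.split(",")) {
      String name = part.trim();
      int semi = name.indexOf(';');
      if (semi != -1) {
        name = name.substring(0, semi).trim();  // cut off any q-value
      }
      // Charset names are case-insensitive; "*" matches everything.
      if (name.equals("*") || name.equalsIgnoreCase(charset)) return true;
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(accepts("iso-8859-1, utf-8", "UTF-8"));
    System.out.println(accepts("iso-8859-1, utf-8", "Shift_JIS"));
  }
}
```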

13.5.3 Resource Bundles

Using Accept-Language (and, in some cases, Accept-Charset), a servlet can determine the language in which it will speak to each client. But how can a servlet efficiently manage several localized versions of a page? One answer is to use Java's built-in support for resource bundles.

A resource bundle holds a set of localized resources appropriate for a given locale. For example, a resource bundle for the French locale might contain a French translation of all the phrases output by a servlet. Then, when the servlet determines it wants to speak French, it can load that resource bundle and use the stored phrases. All resource bundles extend java.util.ResourceBundle. A servlet can load a resource bundle using the static method ResourceBundle.getBundle( ):

public static final ResourceBundle ResourceBundle.getBundle(String bundleName, Locale locale)

A servlet can pull phrases from a resource bundle using the getString( ) method of ResourceBundle:

public final String ResourceBundle.getString(String key)

A resource bundle can be created in several ways. For servlets, the most useful technique is to put a special properties file in the server's classpath that contains the translated phrases. The file should be specially named according to the pattern bundlename_language.properties or bundlename_language_country.properties. For example, use Messages_fr.properties for a French bundle or Messages_zh_TW.properties for a Chinese/Taiwan bundle. The file should contain US-ASCII characters in the following format:

name1=value1
name2=value2
...

Each line may also contain whitespace and Unicode escapes. The information in this file can be loaded automatically by the getBundle( ) method.
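As a quick illustration of the file format, the sketch below feeds properties-style text to java.util.PropertyResourceBundle and reads the phrases back with getString( ). Reading from a string instead of a real Messages_fr.properties file is an assumption made only so the example runs on its own; the class name and the sample keys are hypothetical, and the Reader-based constructor requires Java 6 or later.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical demo class; a real servlet would call ResourceBundle.getBundle()
// and let it find the .properties file on the classpath.
final class BundleFormatDemo {

  // Builds a bundle from text in the .properties format.
  static ResourceBundle fromText(String propertiesText) {
    try {
      return new PropertyResourceBundle(new StringReader(propertiesText));
    }
    catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // The same two lines could be stored in Messages_fr.properties.
    // Whitespace around the '=' is ignored, and \\u00c0 in this Java string
    // is the six-character Unicode escape as it would appear in the file.
    String text = "greeting = Bonjour\n"
                + "farewell=\\u00c0 bient\\u00f4t\n";

    ResourceBundle bundle = fromText(text);
    System.out.println(bundle.getString("greeting"));
    System.out.println(bundle.getString("farewell"));
  }
}
```

The Unicode escapes are decoded when the text is loaded, so getString( ) returns the accented characters themselves rather than the escape sequences.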

13.5.4 Writing to Each His Own

Example 13-8 demonstrates the use of Accept-Language, Accept-Charset, and resource bundles with a servlet that says "Hello World" to each client in that client's own preferred language. Here's a sample resource bundle properties file for English, which you would store in HelloBabel_en.properties somewhere searched by the server's class loader (such as WEB-INF/classes):

greeting=Hello world

And here's a resource bundle for Japanese, to be stored in HelloBabel_ja.properties:

greeting=\u4eca\u65e5\u306f\u4e16\u754c

This HelloBabel servlet uses the com.oreilly.servlet.LocaleNegotiator class that contains the black box logic to determine which Locale, charset, and ResourceBundle should be used. Its code is shown in the next section.

Example 13-8. A Servlet Version of the Tower of Babel
import java.io.*;
import java.util.*;
import java.text.*;
import javax.servlet.*;
import javax.servlet.http.*;

import com.oreilly.servlet.LocaleNegotiator;
import com.oreilly.servlet.ServletUtils;

public class HelloBabel extends HttpServlet {

  public void doGet(HttpServletRequest req, HttpServletResponse res)
                               throws ServletException, IOException {
    try {
      String bundleName = "HelloBabel";
      String acceptLanguage = req.getHeader("Accept-Language");
      String acceptCharset = req.getHeader("Accept-Charset");

      LocaleNegotiator negotiator =
        new LocaleNegotiator(bundleName, acceptLanguage, acceptCharset);

      Locale locale = negotiator.getLocale();
      String charset = negotiator.getCharset();
      ResourceBundle bundle = negotiator.getBundle();  // may be null

      res.setContentType("text/plain; charset=" + charset);
      res.setHeader("Content-Language", locale.getLanguage());
      res.setHeader("Vary", "Accept-Language");

      PrintWriter out = res.getWriter();

      DateFormat fmt = DateFormat.getDateTimeInstance(DateFormat.LONG,
                                                      DateFormat.LONG,
                                                      locale);

      if (bundle != null) {
        out.println("In " + locale.getDisplayLanguage() + ":");
        out.println(bundle.getString("greeting"));
        out.println(fmt.format(new Date()));
      }
      else {
        out.println("Bundle could not be found.");
      }
    }
    catch (Exception e) {
      log(ServletUtils.getStackTraceAsString(e));
    }
  }
}

This servlet begins by setting the name of the bundle it wants to use, and then it retrieves its Accept-Language and Accept-Charset headers. It creates a LocaleNegotiator, passing in this information, and then asks the negotiator which Locale, charset, and ResourceBundle it is to use. Notice that a servlet may ignore the returned charset in favor of the UTF-8 encoding; just remember that UTF-8 is not as widely supported as the charsets normally returned by LocaleNegotiator. Next, the servlet sets its headers: its Content-Type header specifies the charset, Content-Language specifies the locale's language, and the Vary header indicates to the client (if by some chance it should care) that this servlet can vary its output based on the client's Accept-Language header.

Once the headers are set, the servlet generates its output. It first gets a PrintWriter to match the charset. Then it says, in the default language (usually English), which language the greeting is to be in. Next, it retrieves and outputs the appropriate greeting from the resource bundle. And lastly, it prints the date and time appropriate to the client's locale. If the resource bundle is null, as happens when there are no resource bundles to match the client's preferences, the servlet simply reports that no bundle could be found.

13.5.5 The LocaleNegotiator Class

The code for LocaleNegotiator is shown in Example 13-9. Its helper class, LocaleToCharsetMap, is shown in Example 13-10. If you are happy to treat the locale negotiator as a black box, feel free to skip this section.

LocaleNegotiator works by scanning through the client's language preferences looking for any language for which there is a corresponding resource bundle. Once it finds a correspondence, it uses LocaleToCharsetMap to determine the charset. If there's any problem, it tries to fall back to U.S. English. The logic ignores the client's charset preferences.

The most complicated aspect of the LocaleNegotiator code is having to deal with the unfortunate behavior of ResourceBundle.getBundle( ). The getBundle( ) method attempts to act intelligently. If it can't find a resource bundle that is an exact match to the specified locale, it tries to find a close match. The problem, for our purposes, is that getBundle( ) considers the resource bundle for the default locale to be a close match. Thus, as we loop through client languages, it's difficult to determine when we have an exact resource bundle match and when we don't. The workaround is to first fetch the ultimate fallback resource bundle, then use that reference later to determine when there is an exact match. This logic is encapsulated in the getBundleNoFallback( ) method.
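On newer Java releases (Java 6 and later) there is another way to attack the same problem, sketched below. This is not the technique LocaleNegotiator uses: it relies on ResourceBundle.Control.getNoFallbackControl( ) to stop getBundle( ) from trying the default locale, and on ResourceBundle.getLocale( ) to report which bundle was really loaded. The class and method names are hypothetical, and the behavior differs slightly from getBundleNoFallback( ): a bundle file without a language suffix is never counted as a match here, even when the requested language is the default locale's language.

```java
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

// Hypothetical alternative for illustration; not the book's LocaleNegotiator.
final class ExactBundleLookup {

  // Returns a bundle only if one exists for the requested language,
  // otherwise null.
  static ResourceBundle findExact(String bundleName, Locale loc) {
    // This Control tells getBundle() not to fall back to the default locale.
    ResourceBundle.Control noFallback =
      ResourceBundle.Control.getNoFallbackControl(
        ResourceBundle.Control.FORMAT_DEFAULT);
    try {
      ResourceBundle bundle =
        ResourceBundle.getBundle(bundleName, loc, noFallback);

      // getLocale() reports the locale of the bundle that was actually
      // loaded. The base bundle (no suffix) reports an empty language.
      if (bundle.getLocale().getLanguage().equals(loc.getLanguage())) {
        return bundle;
      }
    }
    catch (MissingResourceException e) {
      // No bundle of any kind under this name
    }
    return null;  // no match
  }

  public static void main(String[] args) {
    // "HelloBabel" is the bundle name used by Example 13-8; whether a
    // match is found depends on which properties files are on the classpath.
    System.out.println(findExact("HelloBabel", Locale.JAPANESE) != null);
  }
}
```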

Example 13-9. The LocaleNegotiator Class
package com.oreilly.servlet;

import java.io.*;
import java.util.*;

import com.oreilly.servlet.LocaleToCharsetMap;

public class LocaleNegotiator {

  private ResourceBundle chosenBundle;
  private Locale chosenLocale;
  private String chosenCharset;

  public LocaleNegotiator(String bundleName,
                          String languages,
                          String charsets) {

    // Specify default values:
    //   English language, ISO-8859-1 (Latin-1) charset, English bundle
    Locale defaultLocale = new Locale("en", "US");
    String defaultCharset = "ISO-8859-1";
    ResourceBundle defaultBundle = null;
    try {
      defaultBundle = ResourceBundle.getBundle(bundleName, defaultLocale);
    }
    catch (MissingResourceException e) {
      // No default bundle was found. Flying without a net.
    }

    // If the client didn't specify acceptable languages, we can keep
    // the defaults.
    if (languages == null) {
      chosenLocale = defaultLocale;
      chosenCharset = defaultCharset;
      chosenBundle = defaultBundle;
      return;  // quick exit
    }

    // Use a tokenizer to separate acceptable languages
    StringTokenizer tokenizer = new StringTokenizer(languages, ",");

    while (tokenizer.hasMoreTokens()) {
      // Get the next acceptable language.
      // (The language can look something like "en;q=0.91")
      String lang = tokenizer.nextToken();

      // Get the locale for that language
      Locale loc = getLocaleForLanguage(lang);

      // Get the bundle for this locale. Don't let the search fall back
      // to match other languages!
      ResourceBundle bundle = getBundleNoFallback(bundleName, loc);

      // The returned bundle is null if there's no match. In that case
      // we can't use this language since the servlet can't speak it.
      if (bundle == null) continue;  // on to the next language

      // Find a charset we can use to display that locale's language.
      String charset = getCharsetForLocale(loc, charsets);

      // The returned charset is null if there's no match. In that case
      // we can't use this language since the servlet can't encode it.
      if (charset == null) continue;  // on to the next language

      // If we get here, there are no problems with this language.
      chosenLocale = loc;
      chosenBundle = bundle;
      chosenCharset = charset;
      return;  // we're done
    }

    // No matches, so we let the defaults stand
    chosenLocale = defaultLocale;
    chosenCharset = defaultCharset;
    chosenBundle = defaultBundle;
  }

  public ResourceBundle getBundle() {
    return chosenBundle;
  }

  public Locale getLocale() {
    return chosenLocale;
  }

  public String getCharset() {
    return chosenCharset;
  }

  private Locale getLocaleForLanguage(String lang) {
    Locale loc;
    int semi, dash;

    // Cut off any q-value that might come after a semi-colon
    if ((semi = lang.indexOf(';')) != -1) {
      lang = lang.substring(0, semi);
    }

    // Trim any whitespace
    lang = lang.trim();

    // Create a Locale from the language. A dash may separate the
    // language from the country.
    if ((dash = lang.indexOf('-')) == -1) {
      loc = new Locale(lang, "");  // No dash, no country
    }
    else {
      loc = new Locale(lang.substring(0, dash), lang.substring(dash+1));
    }

    return loc;
  }

  private ResourceBundle getBundleNoFallback(String bundleName, Locale loc) {

    // First get the fallback bundle -- the bundle that will be selected
    // if getBundle() can't find a direct match. This bundle can be
    // compared to the bundles returned by later calls to getBundle() in
    // order to detect when getBundle() finds a direct match.
    ResourceBundle fallback = null;
    try {
      fallback =
        ResourceBundle.getBundle(bundleName, new Locale("bogus", ""));
    }
    catch (MissingResourceException e) {
      // No fallback bundle was found.
    }

    try {
      // Get the bundle for the specified locale
      ResourceBundle bundle = ResourceBundle.getBundle(bundleName, loc);

      // Is the bundle different than our fallback bundle?
      if (bundle != fallback) {
        // We have a real match!
        return bundle;
      }
      // So the bundle is the same as our fallback bundle.
      // We can still have a match, but only if our locale's language
      // matches the default locale's language.
      else if (bundle == fallback &&
            loc.getLanguage().equals(Locale.getDefault().getLanguage())) {
        // Another way to match
        return bundle;
      }
      else {
        // No match, keep looking
      }
    }
    catch (MissingResourceException e) {
      // No bundle available for this locale
    }

    return null;  // no match
  }

  protected String getCharsetForLocale(Locale loc, String charsets) {
    // Note: This method ignores the client-specified charsets
    return LocaleToCharsetMap.getCharset(loc);
  }
}

Example 13-10. The LocaleToCharsetMap Class
package com.oreilly.servlet;

import java.util.*;

public class LocaleToCharsetMap {

  private static Hashtable map;

  static {
    map = new Hashtable();

    map.put("ar", "ISO-8859-6");
    map.put("be", "ISO-8859-5");
    map.put("bg", "ISO-8859-5");
    map.put("ca", "ISO-8859-1");
    map.put("cs", "ISO-8859-2");
    map.put("da", "ISO-8859-1");
    map.put("de", "ISO-8859-1");
    map.put("el", "ISO-8859-7");
    map.put("en", "ISO-8859-1");
    map.put("es", "ISO-8859-1");
    map.put("et", "ISO-8859-1");
    map.put("fi", "ISO-8859-1");
    map.put("fr", "ISO-8859-1");
    map.put("he", "ISO-8859-8");
    map.put("hr", "ISO-8859-2");
    map.put("hu", "ISO-8859-2");
    map.put("is", "ISO-8859-1");
    map.put("it", "ISO-8859-1");
    map.put("iw", "ISO-8859-8");
    map.put("ja", "Shift_JIS");
    map.put("ko", "EUC-KR");     // Requires JDK 1.1.6
    map.put("lt", "ISO-8859-2");
    map.put("lv", "ISO-8859-2");
    map.put("mk", "ISO-8859-5");
    map.put("nl", "ISO-8859-1");
    map.put("no", "ISO-8859-1");
    map.put("pl", "ISO-8859-2");
    map.put("pt", "ISO-8859-1");
    map.put("ro", "ISO-8859-2");
    map.put("ru", "ISO-8859-5");
    map.put("sh", "ISO-8859-5");
    map.put("sk", "ISO-8859-2");
    map.put("sl", "ISO-8859-2");
    map.put("sq", "ISO-8859-2");
    map.put("sr", "ISO-8859-5");
    map.put("sv", "ISO-8859-1");
    map.put("tr", "ISO-8859-9");
    map.put("uk", "ISO-8859-5");
    map.put("zh", "GB2312");
    map.put("zh_TW", "Big5");
  }

  public static String getCharset(Locale loc) {
    String charset;

    // Try for a full name match (may include country)
    charset = (String) map.get(loc.toString());
    if (charset != null) return charset;

    // If a full name didn't match, try just the language
    charset = (String) map.get(loc.getLanguage());
    return charset;  // may be null
  }
}

13.5.6 System-Provided Locales

Beginning with Servlet API 2.2, a servlet can get and set the preferred locale of the client using a few convenience methods. ServletRequest has a new getLocale( ) method that returns a Locale object indicating the client's most preferred locale. For HTTP servlets, the preference is based on the Accept-Language header. There's also a getLocales( ) method that returns an Enumeration of Locale objects indicating all the acceptable locales for the client, with the most preferred first. Accompanying these methods is a setLocale(Locale loc) method added to ServletResponse that allows a servlet to specify the locale of the response. The method automatically sets the Content-Language header and the Content-Type charset value. The setLocale( ) method should be called after setContentType( ) and before getWriter( ) (because it modifies the content type and impacts the PrintWriter creation). For example:

public void doGet(HttpServletRequest req, HttpServletResponse res)
              throws ServletException, IOException {
  res.setContentType("text/html");
  Locale locale = req.getLocale();
  res.setLocale(locale);
  PrintWriter out = res.getWriter();

  // Write output based on locale.getLanguage()
}

While these methods allow for appealing code, they provide no assistance in determining whether or not a Locale is supported by the web application. To make this determination, you need extra logic like LocaleNegotiator.
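A minimal sketch of that extra logic follows. It walks an Enumeration of locales, such as the one returned by getLocales( ), and returns the first whose language the web application supports. The class name, the pick( ) method, and the idea of keeping the supported languages in a Set are all assumptions made for illustration; the example takes a plain Enumeration so it can run without the Servlet API, and assumes Java 7 or later.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; not part of the Servlet API.
final class SupportedLocalePicker {

  // Returns the first preferred locale whose language is supported,
  // or the fallback if none of them is.
  static Locale pick(Enumeration<Locale> preferred,
                     Set<String> supportedLanguages,
                     Locale fallback) {
    while (preferred.hasMoreElements()) {
      Locale candidate = preferred.nextElement();
      if (supportedLanguages.contains(candidate.getLanguage())) {
        return candidate;
      }
    }
    return fallback;
  }

  public static void main(String[] args) {
    // Assumed: the application ships English and Japanese bundles.
    Set<String> supported = new HashSet<>(Arrays.asList("en", "ja"));

    // Stand-in for req.getLocales(): the client prefers German, then Japanese.
    Enumeration<Locale> prefs =
      Collections.enumeration(Arrays.asList(Locale.GERMAN, Locale.JAPAN));

    System.out.println(pick(prefs, supported, Locale.US));
  }
}
```

Inside a servlet, the result of pick( ) could be passed to res.setLocale( ) in place of req.getLocale( ). With older versions of the Servlet API, getLocales( ) returns an untyped Enumeration, so a cast may be needed.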


Last updated on 3/20/2003
Java Servlet Programming, 2nd Edition, © 2001 O'Reilly


