Foreign Languages


Once upon a time in the computer industry, if you were in an English-speaking country and wanted to write a document in a language other than English, you had almost no chance of doing so without special software support.

Thankfully, today's software is far more internationally aware. The World Wide Web is the most logical place to look for evidence of this, and indeed the number of non-English sites is growing rapidly.

Tip

For more character-encoding advice, see "WaSP Asks the W3C" (www.webstandards.org/learn/askw3c/dec2002.html).


To reach a wider audience, the Zen Garden solicited volunteer translations (www.mezzoblue.com/zengarden/translations). As they came in, a few problems became abundantly clear: Working with foreign languages requires at least a rudimentary knowledge of character encoding, and translation is much more of an art than a science.

Character Encoding

Modern operating systems are quite good about working with non-English characters. A base installation of Windows XP or Mac OS X won't always include full support for the wide range of possible human languages, but language packs are available on the install CDs that enable proper rendering of many foreign characters.

Being able to display the characters is only half the battle; copying and pasting text from a source into an HTML document works only if the character encodings match. Text that uses shift_jis (a Japanese encoding) as its base encoding will not display properly in a document encoded with utf-8 (a Unicode encoding), for example. It's important to ensure that your HTML document's character encoding matches the text you're working with.
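The mismatch described above can be reproduced in a few lines of Python. This is only a minimal sketch: the sample string and the helper name decode_with are illustrative assumptions, not anything taken from the Zen Garden itself.

```python
def decode_with(raw: bytes, encoding: str) -> str:
    """Decode bytes, substituting U+FFFD for anything invalid in `encoding`."""
    return raw.decode(encoding, errors="replace")

original = "日本語"                        # sample Japanese text (illustrative)
raw = original.encode("shift_jis")         # bytes as a shift_jis source would store them

matched = decode_with(raw, "shift_jis")    # reader assumes the correct encoding
mismatched = decode_with(raw, "utf-8")     # reader assumes the wrong encoding

print(matched == original)      # the text survives when the encodings agree
print("\ufffd" in mismatched)   # replacement characters show up when they do not
```

The bytes are identical in both cases; only the encoding assumed by the reader changes, which is exactly what happens when pasted text and the document's declared encoding disagree.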

Encoding may be set at the server level, as a default for all pages on your site. Even so, it's important to specify the character encoding within your HTML using a meta tag:

 <meta http-equiv="content-type" content="text/html; charset=iso-8859-1" /> 

The Zen Garden uses a value of iso-8859-1, a standardized encoding that covers most Western European languages. The various translations use different encodings, but in hindsight the best choice would have been to use the value UTF-8 for all of them.
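One way to keep the declared encoding and the actual bytes from drifting apart is to generate both from a single setting. The sketch below assumes a hypothetical helper, build_page, that writes the meta tag shown above and then encodes the page with the same charset; it is not how the Zen Garden pages were produced.

```python
import html

def build_page(body_text: str, charset: str = "utf-8") -> bytes:
    """Return an HTML page whose meta tag and byte encoding agree.

    Hypothetical helper for illustration only.
    """
    markup = (
        "<html><head>"
        f'<meta http-equiv="content-type" content="text/html; charset={charset}" />'
        "</head><body>"
        f"<p>{html.escape(body_text)}</p>"   # escape &, <, > in the text
        "</body></html>"
    )
    # Encoding with the declared charset raises UnicodeEncodeError if the
    # text contains characters that the charset cannot represent.
    return markup.encode(charset)

page = build_page("Le jardin zen / 禅の庭")   # French and Japanese in one UTF-8 page
```

Calling build_page with Japanese text and charset="iso-8859-1" fails immediately with a UnicodeEncodeError rather than producing a page that displays incorrectly later.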

UTF-8 is an encoding of Unicode, the international character set standard. The advantage of using it is that multiple languages with differing character sets, like French, Japanese, Arabic, and Greek, can all coexist within the same document.
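To illustrate the point about coexistence, the following sketch checks which encodings can represent a string that mixes the four languages named above. The sample text and the helper can_encode are assumptions made for the example.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Report whether every character in `text` exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# French, Japanese, Arabic, and Greek in a single string (illustrative sample)
mixed = "Français, 日本語, العربية, Ελληνικά"

for name in ("utf-8", "iso-8859-1", "shift_jis", "iso-8859-7"):
    print(name, can_encode(mixed, name))

# UTF-8 also round-trips the text without loss
round_trip = mixed.encode("utf-8").decode("utf-8")
```

Of the four encodings tried here, only utf-8 can hold the whole string; each of the others covers some of the languages and rejects the rest.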

The only real caveat against its widespread use on the Web right now is that some older server-side software and authoring tools that aren't internationally aware won't support it. Modern browsers handle it without problems, so the obstacle lies on the server side rather than the user side. As well, UTF-8 creates larger files than necessary when working with Asian languages: most Chinese, Japanese, and Korean characters take three bytes each in UTF-8, compared with two in region-specific encodings, so documents written primarily in Chinese might benefit from a more specific character encoding.
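The size difference is easy to measure. The sketch below compares byte counts for a short Chinese sample in utf-8 and in gb2312 (used here as one example of a more specific encoding; the source does not name one), alongside an English sample for contrast. The sample strings and the helper encoded_size are illustrative assumptions.

```python
def encoded_size(text: str, encoding: str) -> int:
    """Return the number of bytes `text` occupies in `encoding`."""
    return len(text.encode(encoding))

chinese = "中文网页"      # four Chinese characters (illustrative sample)
english = "Zen Garden"   # ten ASCII characters

chinese_utf8 = encoded_size(chinese, "utf-8")     # 3 bytes per character
chinese_gb = encoded_size(chinese, "gb2312")      # 2 bytes per character
english_utf8 = encoded_size(english, "utf-8")     # 1 byte per character
english_latin1 = encoded_size(english, "iso-8859-1")

print(chinese_utf8, chinese_gb)        # 12 8
print(english_utf8, english_latin1)    # 10 10
```

For this sample the UTF-8 version is half again as large as the gb2312 version, while plain English text costs nothing extra. Whether that overhead matters in practice depends on how much of a page is text rather than markup.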

The lesson learned is that, because it can handle multiple languages in a single document, UTF-8 is the encoding of choice for dealing with language on the Web, provided your tools support it.

Translation Discrepancies

Because the Zen Garden relies on volunteers for translations, the quality of each varies. Native speakers of each language currently available often send comments with suggested fixes; obvious grammar and spelling flaws are easily resolved, but suggestions for alternate wording are much more difficult to implement.

Even if a professional translation service is used, human languages aren't directly interchangeable. A phrase may be interpreted in different ways, each shaded by the interpreter's experience and cultural context. For example, does the phrase "That will never fly" refer to a bad idea or a faulty airplane? Is "enlightenment" a Zen concept or a weight-loss technique? Context might clear up the ambiguity, but only if the context is properly understood.

Human language is less precise than computer language, so the problem persists. The best that can be hoped for is a consensus among multiple speakers of the language; factor in regional dialects and variations that have evolved over time (are you going to spell it color or use the British colour?), and it's apparent that even consensus is hard to reach, especially if you don't speak the language in question!

The lesson learned here is that translating a document is no easy task, and perfection is perhaps unattainable. As a result, the Zen Garden now has a light-hearted disclaimer stating that errors are to be expected, and we're willing to live with that.



    The Zen of CSS Design: Visual Enlightenment for the Web
    Year: 2005
    Pages: 117
