11.3 Internationalization

   

Starting with version 3.0, Samba supports Unicode "on the wire," so filenames and other text containing characters from international character sets work without any additional effort on your part.

11.3.1 Internationalization Options

Samba 2.2.x has a limited ability to speak foreign tongues: if you need to support filenames containing characters that aren't in standard ASCII, the options shown in Table 11-3 can help.

Table 11-3. Internationalization options

Option            Parameters                  Function                                                  Default  Scope
----------------  --------------------------  --------------------------------------------------------  -------  ------
client code page  Described in this section   Sets a code page to expect from clients                   850      Global
character set     Described in this section   Translates code pages into alternate Unix character sets  None     Global
coding system     Described in this section   Translates code page 932 into an Asian character set      None     Global
valid chars       string (set of characters)  Adds individual characters to a code page                 None     Global

11.3.1.1 client code page

The character sets on Windows platforms hark back to the original concept of a code page. DOS and Windows clients use these code pages to determine the rules for mapping lowercase letters to uppercase letters. You can instruct Samba to use any of several code pages with the global client code page option, which should match the code page in use on the client. This option loads a code page definition file and can take the values specified in Table 11-4.

Table 11-4. Valid code pages with Samba 2.0

Code page  Definition
---------  ---------------------------------
437        MS-DOS Latin (United States)
737        Windows 95 Greek
850        MS-DOS Latin 1 (Western European)
852        MS-DOS Latin 2 (Eastern European)
861        MS-DOS Icelandic
866        MS-DOS Cyrillic (Russian)
932        MS-DOS Japanese Shift-JIS
936        MS-DOS Simplified Chinese
949        MS-DOS Korean Hangul
950        MS-DOS Traditional Chinese

You can set the client code page as follows:

    [global]
        client code page = 852

The default value of this option is 850, for MS-DOS Latin 1. If the code pages listed earlier are not sufficient, you can use the make_smbcodepage tool that comes with Samba (by default in /usr/local/samba/bin) to create your own SMB code pages.
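To see why the client and server need to agree on a code page, the short Python sketch below decodes the same byte under two of the code pages in Table 11-4. It relies on Python's built-in cp437 and cp850 codecs, not on Samba's code page definition files, and the function name is made up for this example, so treat it as an illustration of the idea and not of how Samba works internally.

```python
def decode_under_code_pages(raw, code_pages=("cp437", "cp850")):
    """Decode the same bytes under each code page and return the results."""
    return {page: raw.decode(page) for page in code_pages}


# Byte 0x9B is a cent sign in code page 437 but a lowercase o-slash in 850,
# so only the second interpretation has a distinct uppercase form.
results = decode_under_code_pages(b"\x9b")
for page, text in results.items():
    print(page, repr(text), "uppercases to", repr(text.upper()))
```

A server that assumed the wrong code page would both display the character incorrectly and apply the wrong case mapping to it.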

11.3.1.2 character set

The global character set option can be used to convert filenames offered through a DOS code page (see the previous section, Section 11.3.1.1) to equivalents that can be represented by Unix character sets other than those in the United States. For example, if you want to convert the Western European MS-DOS character set on the client to a Western European Unix character set on the server, you can use the following in your configuration file:

    [global]
        client code page = 850
        character set = ISO8859-1

Note that you must include a client code page option to specify the character set from which you are converting. The valid character sets (and their matching code pages) that Samba accepts are listed in Table 11-5.

Table 11-5. Valid character sets

Character set  Matching code page  Definition
-------------  ------------------  -------------------------------
ISO8859-1      850                 Western European Unix
ISO8859-2      852                 Eastern European Unix
ISO8859-5      866                 Russian Cyrillic Unix
ISO8859-7      737                 Greek Unix
KOI8-R         866                 Alternate Russian Cyrillic Unix

By default, the character set option is not set, so no conversion takes place.
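The kind of conversion that the character set option asks for can be pictured with a few lines of Python. This is only a sketch under stated assumptions: it uses Python's own cp850 and iso8859-1 codecs as stand-ins for Samba's translation tables, and the function name and sample filename are hypothetical.

```python
def dos_to_unix_name(raw_name, client_code_page="cp850",
                     character_set="iso8859-1"):
    """Re-encode a filename from the client's code page to a Unix character set."""
    return raw_name.decode(client_code_page).encode(character_set)


# A filename containing e-acute as a code page 850 client would send it:
# e-acute is byte 0x82 in code page 850.
dos_bytes = b"caf\x82.txt"
unix_bytes = dos_to_unix_name(dos_bytes)
print(unix_bytes)  # the e-acute is now byte 0xE9, its ISO8859-1 value
```

Characters that exist in the code page but not in the target character set (the box-drawing characters of code page 850, for instance) would make this sketch raise UnicodeEncodeError, which is one reason the pairing of code page and character set matters.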

11.3.1.3 coding system

The coding system option is similar to the character set option. However, its purpose is to determine how to convert a Japanese Shift JIS code page into an appropriate Unix character set. To use this option, the client code page option described previously must be set to 932. The valid coding systems that Samba accepts are listed in Table 11-6.

Table 11-6. Valid coding-system parameters

Coding system  Definition
-------------  --------------------------------------------------------
SJIS           Standard Shift JIS
JIS8           Eight-bit JIS codes
J8BB           Eight-bit JIS codes
J8BH           Eight-bit JIS codes
J8@B           Eight-bit JIS codes
J8@J           Eight-bit JIS codes
J8@H           Eight-bit JIS codes
JIS7           Seven-bit JIS codes
J7BB           Seven-bit JIS codes
J7BH           Seven-bit JIS codes
J7@B           Seven-bit JIS codes
J7@J           Seven-bit JIS codes
J7@H           Seven-bit JIS codes
JUNET          JUNET codes
JUBB           JUNET codes
JUBH           JUNET codes
JU@B           JUNET codes
JU@J           JUNET codes
JU@H           JUNET codes
EUC            EUC codes
HEX            Three-byte hexadecimal code
CAP            Three-byte hexadecimal code (Columbia AppleTalk Program)
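As a rough illustration of what a coding system conversion involves, the Python sketch below re-encodes a Shift JIS filename as EUC. It assumes that the EUC entry in Table 11-6 corresponds to Python's euc_jp codec and that code page 932 corresponds to Python's cp932 codec; the function name is hypothetical, and none of this is Samba's own code.

```python
def sjis_to_euc(raw_name):
    """Re-encode a Shift JIS (code page 932) filename as EUC-JP."""
    return raw_name.decode("cp932").encode("euc_jp")


# Hiragana "a" is the two bytes 0x82 0xA0 in Shift JIS
# and the two bytes 0xA4 0xA2 in EUC-JP.
sjis_bytes = b"\x82\xa0.txt"
euc_bytes = sjis_to_euc(sjis_bytes)
print(euc_bytes.hex())
```

The ASCII part of the name passes through unchanged; only the double-byte Japanese characters are remapped.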

11.3.1.4 valid chars

The valid chars option can be used to add individual characters to a code page. You can use this option as follows:

    valid chars =
    valid chars = 0450:0420 0x0A20:0x0A00
    valid chars = A:a

Separate the characters in the list with spaces. If a colon appears between two characters or their numerical equivalents, the value to the left of the colon is treated as the uppercase character and the value to the right as the lowercase character. You can represent characters either as literals (if you can type them) or by their octal, hexadecimal, or decimal Unicode equivalents.

If you use this option, it must be listed after the client code page option for the code page to which you wish to add the character.

   


Using Samba: A File and Print Server for Linux, Unix & Mac OS X, 3rd Edition
ISBN: 0596007698
Year: 2003
Pages: 475
