Evolving rendering needs are being documented by Unicode and are influenced by XSL-FO.
- The rendering algorithm is governed primarily by the Unicode Bidirectional Algorithm.
- Nuances are described by the XSL-FO 1.0 Recommendation (Unicode BIDI Algorithm, Section 5.8) in conjunction with the CSS definition.
- Here, we do not attempt to repeat the detailed algorithms described in the above documents, but only give a general overview.

There are three categories of direction strength for Unicode characters.
- Strong characters are from language groups with well-defined left-to-right or right-to-left character-progression directions.
- Weak characters are digits, currency symbols, and some punctuation characters.
  - This includes mirroring characters, whose presentation on the canvas may be different from their representation in the data.
    - Mirroring characters include "(", ")", "[", "]", "<", ">", "{", "}", "«", "»", etc.
    - A mirrored character is rendered in the writing direction of adjacent text, which may require it to be flipped by the formatter.
    - The stylesheet writer is not responsible for doing the flipping.
  - This includes U+202C Pop Directional Formatting (PDF) for embedding levels.
- Neutral characters are white-space characters and some separator characters.

The embedding algorithm is based on isolating groups of sequences in "embedding levels."
- An embedded sequence sets a progression direction for the members of the sequence.
- The stylesheet writer may need to introduce embedding levels to keep a sequence of characters together.
- Using the bidi-override formatting object creates an embedding level.
  - The progression direction of the level is either the specified direction or, if none is specified, the current writing-mode direction.
  - Using a unicode-bidi value of "embed" doesn't change the inherent writing direction of the individual characters in the embedding level.
  - Using a unicode-bidi value of "bidi-override" forces the characters to render in the direction of the embedding level, ignoring the inherent properties of the characters.
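
As a minimal sketch of the difference between the two values, the fragment below wraps the same content twice inside a left-to-right block container. It reuses the &hebrew-test; entity from Example 11-1 and, like the other examples in this chapter, assumes that the XSL-FO namespace is the default namespace. The explicit direction="rtl" and the label text are illustrative additions, not part of the book's test file. The comments name the Unicode control characters that each form corresponds to under the CSS definition of unicode-bidi.

```xml
<block-container>
  <!-- unicode-bidi="embed" with direction="rtl" corresponds to wrapping the
       content in RLE ... PDF: a right-to-left level is opened, the Hebrew
       letters keep their inherent direction, and the digits "15" still
       read as the number 15 -->
  <block>embed: <bidi-override unicode-bidi="embed" direction="rtl"
    >&hebrew-test; 15</bidi-override> end</block>

  <!-- unicode-bidi="bidi-override" with direction="rtl" corresponds to
       RLO ... PDF: every character is forced right-to-left, so even the
       digits are laid out in reverse order -->
  <block>override: <bidi-override unicode-bidi="bidi-override" direction="rtl"
    >&hebrew-test; 15</bidi-override> end</block>
</block-container>
```

With direction="ltr" the same two values would correspond to LRE ... PDF and LRO ... PDF instead.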

Embedding levels are indicated in the resulting stream using the LRE, RLE, LRO, RLO, and PDF Unicode characters.

The resulting stream of characters is grouped according to the Unicode directionality controls.
- The grouping of characters crosses boundaries of embedding levels.
- The act of grouping weak characters with strong characters gives direction to the weak characters.
- Weak characters are influenced by their proximity to strong characters, so they become strongly directed themselves.
  - Weak characters between two strong characters of the same direction adopt that direction.
  - Weak characters preceding a strong character without white space interruption adopt the strong character's direction.
  - Weak characters following a strong character and any intervening white space adopt the strong character's direction.
- The characters are rendered after all of the characters have been assigned a direction.

Examples in this book include sequences of right-to-left language text, as defined in Example 11-1.

Example 11-1 Sample Unicode sequences of right-to-left language text

    <!ENTITY hebrew-test "בדיקה עברית">
    <!ENTITY arabic-test1 "إختبا">
    <!ENTITY arabic-test2 "ر عربي">
    <!ENTITY arabic-test "&arabic-test1;&arabic-test2;">

Consider a detailed example of mixing sequences of different directions in Example 11-2.
- Lines in the test are grouped differently to illustrate how groups are ordered in the writing direction for rendering before the characters found in each group are rendered:
  - test "12" is left-to-right with embedded sequences of right-to-left text;
  - test "23" is right-to-left with content identical to test "12";
  - test "34" is almost identical to test "23" except for a space introduced after "89";
  - test "45" is right-to-left but the language text inside is in groups without overriding direction;
  - test "56" is left-to-right but the language text inside is in groups that override direction;
  - test "67" is right-to-left but the language text inside is in groups that override direction.
- Note that where the direction isn't specified, it is inferred from the writing mode.
- Weak punctuation characters separate the strong script characters.
- The first three tests illustrate the differences in the assignment of direction to weak characters.
  - Note how the space introduced after "89" near the end of test "34" changes the direction assigned to the "89" characters, compared to the "89" characters in test "23".
- Embedding groups arrange the groups of left-to-right sequences:
  - in test "34", the English text is shown to the left of the French text;
  - in test "45", the French text is shown to the left of the English text.
- Overriding the direction results in an improper presentation of language text:
  - in test "56", the Hebrew and Arabic sequences are inappropriately presented;
  - in test "67", the English and French sequences are inappropriately presented.

Example 11-2 Controlling bidirectionality using grouping

    <block-container>
    <block>12 - English Test 13 , Test Français 14
    + &hebrew-test; 15 = &arabic-test; 16 / 89end</block>
    </block-container>
    <block-container writing-mode="rl-tb">
    <block>23 - English Test 13 , Test Français 14
    + &hebrew-test; 15 = &arabic-test; 16 / 89end</block>
    </block-container>
    <block-container writing-mode="rl-tb">
    <block>34 - English Test 13 , Test Français 14
    + &hebrew-test; 15 = &arabic-test; 16 / 89 end</block>
    </block-container>
    <block-container writing-mode="rl-tb">
    <block>45 - <bidi-override unicode-bidi="embed"
    >English Test 13</bidi-override>
    , <bidi-override unicode-bidi="embed"
    >Test Français 14</bidi-override>
    + <bidi-override unicode-bidi="embed"
    >&hebrew-test; 15</bidi-override>
    = <bidi-override unicode-bidi="embed"
    >&arabic-test; 16</bidi-override>
    / 89 end</block></block-container>
    <block-container>
    <block>56 - <bidi-override unicode-bidi="bidi-override"
    >English Test 13</bidi-override>
    , <bidi-override unicode-bidi="bidi-override"
    >Test Français 14</bidi-override>
    + <bidi-override unicode-bidi="bidi-override"
    >&hebrew-test; 15</bidi-override>
    = <bidi-override unicode-bidi="bidi-override"
    >&arabic-test; 16</bidi-override>
    / 89 end</block></block-container>
    <block-container writing-mode="rl-tb">
    <block>67 - <bidi-override unicode-bidi="bidi-override"
    >English Test 13</bidi-override>
    , <bidi-override unicode-bidi="bidi-override"
    >Test Français 14</bidi-override>
    + <bidi-override unicode-bidi="bidi-override"
    >&hebrew-test; 15</bidi-override>
    = <bidi-override unicode-bidi="bidi-override"
    >&arabic-test; 16</bidi-override>
    / 89 end</block></block-container>

Figure 11-1 illustrates the on-screen interpretation of Example 11-2.
- The line annotations are only for illustrative purposes; they visualise the groupings of characters and the character writing directions.
- This test does not incorporate explicit use of Unicode directionality characters.
- Note how the grouping of characters for strength purposes goes over the bounds of embedding levels.

Figure 11-1. Example of bidirectionality
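
To experiment with the weak-character behaviour of tests "23" and "34" in isolation, a fragment needs to be wrapped in a complete XSL-FO document before a formatter will accept it. The following is only a sketch of such a wrapper: the page geometry and the master name "test-page" are hypothetical choices, the block content is a shortened variant of the test lines rather than the book's own file, and the entity declaration is copied from Example 11-1.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
  <!-- Same declaration as in Example 11-1 -->
  <!ENTITY hebrew-test "בדיקה עברית">
]>
<!-- The XSL-FO namespace is the default, so element names carry no prefix -->
<root xmlns="http://www.w3.org/1999/XSL/Format">
  <layout-master-set>
    <!-- Hypothetical page geometry; any page size will do -->
    <simple-page-master master-name="test-page"
                        page-width="210mm" page-height="297mm">
      <region-body/>
    </simple-page-master>
  </layout-master-set>
  <page-sequence master-reference="test-page">
    <flow flow-name="xsl-region-body">
      <!-- "89end": no white space between the digits and the Latin letters -->
      <block-container writing-mode="rl-tb">
        <block>&hebrew-test; 15 / 89end</block>
      </block-container>
      <!-- "89 end": a space separates the digits from the Latin letters -->
      <block-container writing-mode="rl-tb">
        <block>&hebrew-test; 15 / 89 end</block>
      </block-container>
    </flow>
  </page-sequence>
</root>
```

Comparing where "89" lands in the two rendered blocks shows how a single space changes which strong characters the digits are grouped with.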