1.10. Summaries
The following summaries use very concise language and are hard to understand in isolation. However, once you have read the text of this chapter, you may find them useful and return to them later. The terminology related to characters varies quite a lot, so the summaries also help you check how this book names things.
1.10.1. Summary of Definitions
The following is a list of terms you may come across:
-
Character
-
A basic unit of textual information, as an abstract concept, as opposed to the stylistic and typographic variation between shapes that can be identified as the same character.
-
Character code
-
A mapping, often presented in tabular form, that defines a one-to-one correspondence between the characters in a character repertoire and a set of nonnegative integers.
-
Character encoding
-
A method (algorithm) for presenting characters in digital form by mapping sequences of code numbers of characters into sequences of octets. Encodings have names, which can be registered.
-
Code number
-
The integer assigned to a character in a character code. Synonyms: code position, code value, code element, code point, code set value, code.
-
Character repertoire
-
A collection of distinct characters. No specific internal presentation in computers or data transfer is assumed. The repertoire per se does not even define an ordering for the characters; ordering for sorting and other purposes is to be specified separately. A character repertoire is usually defined by specifying names of characters and a sample (or reference) presentation of characters in visible form.
-
Glyph
-
A basic unit of visual rendering of characters, i.e., a particular visible presentation of a character, or part of a character, or a pair or sequence of characters.
-
Octet
-
A sequence of eight binary digits (0 and 1) treated as a unit.
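The relationships among these terms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three sample characters are arbitrary, Unicode is taken as the character code (through the built-in `ord()`), UTF-8 is taken as the encoding, and the helper name `encode` is made up for this example.

```python
# A small character repertoire: a collection of distinct characters, no order implied.
repertoire = {"A", "\u00e9", "\u20ac"}  # A, e with acute accent, euro sign

# A character code assigns each character a nonnegative integer (its code number).
# Unicode is assumed as the character code here, through the built-in ord().
character_code = {ch: ord(ch) for ch in repertoire}


def encode(ch, encoding="utf-8"):
    """Map a character to the list of octets a named encoding produces for it."""
    return list(ch.encode(encoding))


# Sorting by code number is one possible ordering; the repertoire itself defines none.
for ch in sorted(repertoire, key=character_code.get):
    octets = encode(ch)
    bits = " ".join(f"{octet:08b}" for octet in octets)
    print(f"U+{character_code[ch]:04X} -> {len(octets)} octet(s): {bits}")
```

Each printed line shows one code number followed by the octets that represent it, which makes the difference between a code number and its encoded form visible: the same character code can yield one, two, or three octets per character in this encoding.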
1.10.2. Summary of Concept Levels
We can consider a character, say @, at different conceptual levels, or levels of abstraction:
-
Character as an abstraction
-
The idea of a particular character, in the mind of an individual and in social usage. For example, whatever @ suggests to you, or your friends, or people in your country.
-
Character as defined in a specification
-
A particular definition of a character, aimed at making the idea explicit and communicable. The definition can show some glyph(s) for the character, name it (e.g., "commercial at"), describe it verbally, and list its properties in some general framework.
-
Coded character
-
A character as defined in a specification together with its code number in some system of such numbers. In most systems, the number of @ is 40 in hexadecimal. The number can be used as a concise way of referring to the character, often using some special notation like 0x40 or U+0040.
-
Encoded representation
-
A particular internal representation of the code number of a character, and hence of the character. This depends on the encoding used. For the @ character, the representation could be the octet 40 (hex) alone, i.e., the bit sequence 01000000. In another encoding, however, it could consist of two octets, 00 and 40.
-
Glyph
-
A rendering of a character, e.g., @ or @ or @ as drawn in different fonts.
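The middle levels for @ can be inspected directly in Python. The sketch below assumes Unicode as the specification and the system of code numbers, and picks ASCII and big-endian UTF-16 as two example encodings; the helper name `octets_hex` is made up for this illustration. The abstraction and glyph levels have no counterpart in the code, since they concern ideas and visual shapes rather than numbers.

```python
import unicodedata

ch = "@"

# Character as defined in a specification: Unicode assigns it a name.
name = unicodedata.name(ch)

# Coded character: the code number, written in two common notations.
code_number = ord(ch)
hex_notation = hex(code_number)
u_notation = f"U+{code_number:04X}"


def octets_hex(character, encoding):
    """Return the encoded octets as space-separated two-digit hex values."""
    return " ".join(f"{octet:02X}" for octet in character.encode(encoding))


# Encoded representation: depends on the encoding chosen.
one_octet = octets_hex(ch, "ascii")        # a single octet
two_octets = octets_hex(ch, "utf-16-be")   # two octets, most significant first
bit_sequence = f"{code_number:08b}"        # the single octet written out in bits

print(name, hex_notation, u_notation)
print("ASCII:", one_octet, "| UTF-16BE:", two_octets, "| bits:", bit_sequence)
```

The single-octet and two-octet results correspond to the two encoded representations described in the list, and the bit string shows how the octet 40 (hex) is laid out.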