6.5. Characters


Characters are atomic objects representing letters, digits, special symbols such as $ or -, and certain nongraphic control characters such as space and newline. Characters are written with a #\ prefix. For most characters, the prefix is followed by the character itself. The written character representation of the letter A, for example, is #\A. The characters newline and space may be written in this manner as well, but they can also be written as #\newline or #\space.
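As a small illustration of the written syntax, the sketch below builds a list of character objects and checks each one with char?, the standard type predicate for characters. The name sample-chars is hypothetical and used only for this example.

```scheme
;; sample-chars is a hypothetical name used only for illustration.
(define sample-chars
  (list #\A #\7 #\$ #\space #\newline))

;; Every element is a character object.
(map char? sample-chars)   ; => (#t #t #t #t #t)

;; A one-character string and a symbol are not characters.
(char? "A")                ; => #f
(char? 'A)                 ; => #f
```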

This section describes the operations that deal primarily with characters. See also the following section on strings and Chapter 7 on input and output for other operations relating to characters.

(char=? char1 char2 char3 ...)

procedure

(char<? char1 char2 char3 ...)

procedure

(char<=? char1 char2 char3 ...)

procedure

(char>? char1 char2 char3 ...)

procedure

(char>=? char1 char2 char3 ...)

procedure

returns: #t if the relation holds, #f otherwise

These predicates behave in a similar manner to the numeric predicates =, <, >, <=, and >=. For example, char=? returns #t when its arguments are equivalent characters, and char<? returns #t when its arguments are monotonically increasing character values.

Independent of the particular representation employed, the following relationships are guaranteed to hold.

  • The lower-case letters #\a through #\z are in order from low to high; e.g., #\d is less than #\e.

  • The upper-case letters #\A through #\Z are in order from low to high; e.g., #\Q is less than #\R.

  • The digits #\0 through #\9 are in order from low to high; e.g., #\3 is less than #\4.

  • All digits precede all lower-case letters, or all lower-case letters precede all digits.

  • All digits precede all upper-case letters, or all upper-case letters precede all digits.

The tests performed by char=?, char<?, char>?, char<=?, and char>=? are case-sensitive. That is, the character #\A is not equivalent to the character #\a according to these predicates.

The ANSI/IEEE standard includes only two-argument versions of these procedures. The more general versions are mentioned in the Revised⁵ Report.

    (char>? #\a #\b) ⇒ #f
    (char<? #\a #\b) ⇒ #t
    (char<? #\a #\b #\c) ⇒ #t
    (let ((c #\r))
      (char<=? #\a c #\z)) ⇒ #t
    (char<=? #\Z #\W) ⇒ #f
    (char=? #\+ #\+) ⇒ #t
    (or (char<? #\a #\0)
        (char<? #\0 #\a)) ⇒ #t
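Because the ANSI/IEEE standard provides only the two-argument forms, code that must run on such a system can express the multi-argument test with a small helper. The following is a minimal sketch; chars-ascending? is a hypothetical name, and it takes its characters as a list rather than as separate arguments.

```scheme
;; chars-ascending? is a hypothetical helper, not a standard procedure.
;; It returns #t if the characters in the list are strictly increasing,
;; using only the two-argument form of char<?.
(define chars-ascending?
  (lambda (chars)
    (or (null? chars)
        (null? (cdr chars))
        (and (char<? (car chars) (cadr chars))
             (chars-ascending? (cdr chars))))))

(chars-ascending? (list #\a #\b #\c))   ; => #t
(chars-ascending? (list #\a #\c #\b))   ; => #f
```

The examples compare only lower-case letters, whose relative order is guaranteed, so the results do not depend on the character set in use.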

(char-ci=? char1 char2 char3 ...)

procedure

(char-ci<? char1 char2 char3 ...)

procedure

(char-ci>? char1 char2 char3 ...)

procedure

(char-ci<=? char1 char2 char3 ...)

procedure

(char-ci>=? char1 char2 char3 ...)

procedure

returns: #t if the relation holds, #f otherwise

These predicates are identical to the predicates char=?, char<?, char>?, char<=?, and char>=? except that they are case-insensitive. This means that when two letters are compared, case is unimportant. For example, char=? considers #\a and #\A to be distinct values; char-ci=? does not.

The ANSI/IEEE standard includes only two-argument versions of these procedures. The more general versions are mentioned in the Revised⁵ Report.

    (char-ci<? #\a #\B) ⇒ #t
    (char-ci=? #\W #\w) ⇒ #t
    (char-ci=? #\= #\+) ⇒ #f
    (let ((c #\R))
      (list (char<=? #\a c #\z)
            (char-ci<=? #\a c #\z))) ⇒ (#f #t)
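One way to picture the case-insensitive predicates is as the case-sensitive ones applied after both arguments have been folded to a single case. The sketch below does this with char-downcase; folded-char=? and folded-char<? are hypothetical names, and folding to lower case is an assumption made for illustration, not a statement of how any implementation defines char-ci=? or char-ci<?.

```scheme
;; folded-char=? and folded-char<? are hypothetical helpers.
;; Assumption: both arguments are folded to lower case before comparing.
(define folded-char=?
  (lambda (c1 c2)
    (char=? (char-downcase c1) (char-downcase c2))))

(define folded-char<?
  (lambda (c1 c2)
    (char<? (char-downcase c1) (char-downcase c2))))

(folded-char=? #\W #\w)   ; => #t
(folded-char=? #\= #\+)   ; => #f
(folded-char<? #\a #\B)   ; => #t
```

Comparisons between a letter and a nonletter may come out differently depending on which case is chosen for folding, so the sketch should be relied upon only when comparing letters with letters.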

(char-alphabetic? char)

procedure

returns: #t if char is a letter, #f otherwise

    (char-alphabetic? #\a) ⇒ #t
    (char-alphabetic? #\T) ⇒ #t
    (char-alphabetic? #\8) ⇒ #f
    (char-alphabetic? #\$) ⇒ #f

(char-numeric? char)

procedure

returns: #t if char is a digit, #f otherwise

    (char-numeric? #\7) ⇒ #t
    (char-numeric? #\2) ⇒ #t
    (char-numeric? #\X) ⇒ #f
    (char-numeric? #\space) ⇒ #f
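Predicates such as char-alphabetic? and char-numeric? are often used to select characters from a larger sequence. The sketch below defines keep-chars, a hypothetical helper that retains the characters in a list satisfying a predicate; string->list and list->string are the standard string conversion procedures.

```scheme
;; keep-chars is a hypothetical helper: it returns a list of the
;; characters in chars for which pred returns a true value.
(define keep-chars
  (lambda (pred chars)
    (cond
      ((null? chars) '())
      ((pred (car chars))
       (cons (car chars) (keep-chars pred (cdr chars))))
      (else (keep-chars pred (cdr chars))))))

(list->string
  (keep-chars char-numeric? (string->list "tel: 555-0134")))   ; => "5550134"
(list->string
  (keep-chars char-alphabetic? (string->list "R5RS!")))        ; => "RRS"
```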

(char-lower-case? letter)

procedure

returns: #t if letter is lower-case, #f otherwise

If letter is not alphabetic, the result is unspecified.

    (char-lower-case? #\r) ⇒ #t
    (char-lower-case? #\R) ⇒ #f
    (char-lower-case? #\8) ⇒ unspecified

(char-upper-case? letter)

procedure

returns: #t if letter is upper-case, #f otherwise

If letter is not alphabetic, the result is unspecified.

    (char-upper-case? #\r) ⇒ #f
    (char-upper-case? #\R) ⇒ #t
    (char-upper-case? #\8) ⇒ unspecified

(char-whitespace? char)

procedure

returns: #t if char is whitespace, #f otherwise

Whitespace consists of spaces and newlines and possibly other nongraphic characters, depending upon the Scheme implementation and the underlying operating system.

    (char-whitespace? #\space) ⇒ #t
    (char-whitespace? #\newline) ⇒ #t
    (char-whitespace? #\Z) ⇒ #f
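The classification predicates can be combined into a single procedure. In the sketch below, classify-char is a hypothetical name, and the symbols it returns are chosen only for illustration. Because the result of char-upper-case? is unspecified for nonletters, it is called only after char-alphabetic? has succeeded. The sketch also assumes that every letter is either upper-case or lower-case.

```scheme
;; classify-char is a hypothetical helper that names the class of c.
(define classify-char
  (lambda (c)
    (cond
      ;; Test for a letter first, so that char-upper-case? is
      ;; never applied to a nonalphabetic character.
      ((char-alphabetic? c)
       (if (char-upper-case? c) 'upper 'lower))
      ((char-numeric? c) 'digit)
      ((char-whitespace? c) 'whitespace)
      (else 'other))))

(map classify-char (list #\a #\T #\8 #\space #\$))
;; => (lower upper digit whitespace other)
```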

(char-upcase char)

procedure

returns: the upper-case character equivalent to char

If char is a lower-case character, char-upcase returns the upper-case equivalent. If char is not a lower-case character, char-upcase returns char.

    (char-upcase #\g) ⇒ #\G
    (char-upcase #\Y) ⇒ #\Y
    (char-upcase #\7) ⇒ #\7

(char-downcase char)

procedure

returns: the lower-case character equivalent to char

If char is an upper-case character, char-downcase returns the lower-case equivalent. If char is not an upper-case character, char-downcase returns char.

    (char-downcase #\g) ⇒ #\g
    (char-downcase #\Y) ⇒ #\y
    (char-downcase #\7) ⇒ #\7
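Since char-upcase and char-downcase return nonletters unchanged, they can be mapped over every character of a string without testing each character first. The sketch below shows one way to do so; upcase-string and downcase-string are hypothetical names, chosen to avoid clashing with similar procedures that some implementations provide.

```scheme
;; upcase-string and downcase-string are hypothetical helpers that
;; convert a string one character at a time.
(define upcase-string
  (lambda (s)
    (list->string (map char-upcase (string->list s)))))

(define downcase-string
  (lambda (s)
    (list->string (map char-downcase (string->list s)))))

(upcase-string "Scheme 48")     ; => "SCHEME 48"
(downcase-string "Scheme 48")   ; => "scheme 48"
```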

(char->integer char)

procedure

returns: an exact integer representation for char

char->integer is useful for performing table lookups, with the integer representation of char employed as an index into a table. The integer representation of a character is typically the integer code supported by the operating system for character input and output.

Although the particular representation employed depends on the Scheme implementation and the underlying operating system, the rules regarding the relationship between character objects stated above under the description of char=? and its relatives hold for the integer representations of characters as well.

The following examples assume that the integer representation is the ASCII code for the character.

    (char->integer #\h) ⇒ 104
    (char->integer #\newline) ⇒ 10
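The sketch below illustrates both points. The first expression relies only on the guaranteed ordering of lower-case letters, so its value does not depend on the character set. The procedure digit->number is a hypothetical helper that additionally assumes the codes for #\0 through #\9 are consecutive, as they are in ASCII; the ordering rules alone do not guarantee this.

```scheme
;; Holds for any representation, since #\d is less than #\e.
(< (char->integer #\d) (char->integer #\e))   ; => #t

;; digit->number is a hypothetical helper.
;; Assumption: the digit characters have consecutive integer codes
;; (true for ASCII), so the offset from #\0 is the digit's value.
(define digit->number
  (lambda (c)
    (- (char->integer c) (char->integer #\0))))

(digit->number #\7)   ; => 7, under the ASCII assumption
```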

The definition of make-dispatch-table below shows how the integer codes returned by char->integer may be used portably to associate values with characters in vector-based dispatch tables, even though the exact correspondence between characters and their integer codes is unspecified.

make-dispatch-table accepts two arguments: an association list (see assv in Section 6.3) associating characters with values and a default value for characters without associations. It returns a lookup procedure that accepts a character and returns the associated (or default) value. make-dispatch-table builds a vector that is used by the lookup procedure. This vector is indexed by the integer codes for the characters and contains the associated values. Slots in the vector between indices for characters with defined values are filled with the default value. The code works even if char->integer returns negative values or both negative and nonnegative values, although the table can get large if the character codes are not tightly packed.

    (define make-dispatch-table
      (lambda (alist default)
        (let ((codes (map char->integer (map car alist))))
          (let ((first-index (apply min codes))
                (last-index (apply max codes)))
            (let ((n (+ (- last-index first-index) 1)))
              (let ((v (make-vector n default)))
                (for-each
                  (lambda (i x) (vector-set! v (- i first-index) x))
                  codes
                  (map cdr alist))
                ;; table is built; return the table lookup procedure
                (lambda (c)
                  (let ((i (char->integer c)))
                    (if (<= first-index i last-index)
                        (vector-ref v (- i first-index))
                        default)))))))))

    (define-syntax define-dispatch-table
      ;; define-dispatch-table associates sets of characters in strings
      ;; with values in a call to make-dispatch-table.
      (syntax-rules ()
        ((_ default (str val) ...)
         (make-dispatch-table
           (append (map (lambda (c) (cons c 'val))
                        (string->list str))
                   ...)
           'default))))

    (define t
      (define-dispatch-table
        unknown
        ("abcdefghijklmnopqrstuvwxyz" letter)
        ("ABCDEFGHIJKLMNOPQRSTUVWXYZ" letter)
        ("0123456789" digit)))

    (t #\m) ⇒ letter
    (t #\0) ⇒ digit
    (t #\*) ⇒ unknown

(integer->char int)

procedure

returns: the character object corresponding to the exact integer int

This procedure is the functional inverse of char->integer. It is an error for int to be outside the range of valid integer character codes.

The following examples assume that the integer representation is the ASCII code for the character.

    (integer->char 48) ⇒ #\0
    (integer->char 101) ⇒ #\e
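The inverse relationship can be checked directly by converting a character to its integer code and back. The sketch below also defines next-char, a hypothetical helper that returns the character whose code is one greater than that of its argument. Its result depends on the character set, and the value shown assumes ASCII.

```scheme
;; Converting to an integer and back yields the original character.
(char=? #\e (integer->char (char->integer #\e)))   ; => #t

;; next-char is a hypothetical helper.
;; Assumption: the code following that of c is itself a valid character
;; code; otherwise the call to integer->char is an error.
(define next-char
  (lambda (c)
    (integer->char (+ (char->integer c) 1))))

(next-char #\a)   ; => #\b, under the ASCII assumption
```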




The Scheme Programming Language
ISBN: 026251298X
Year: 2003
Pages: 98
