Recipe 2.1. Determining the Kind of Character a char Contains

Problem

You have a variable of type char and wish to determine the kind of character it contains: a letter, digit, number, punctuation character, control character, separator character, symbol, whitespace, or surrogate character. Similarly, you have a string variable and want to determine the kind of character at one or more positions within that string.

Solution

To determine the kind of character a char contains, use the built-in static methods on the System.Char structure shown here:

    Char.IsControl            Char.IsDigit
    Char.IsLetter             Char.IsNumber
    Char.IsPunctuation        Char.IsSeparator
    Char.IsSurrogate          Char.IsSymbol
    Char.IsWhiteSpace
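As a quick illustration, the following sketch runs each of these tests against a single character and reports every test that succeeds. The CharChecks class and its Describe method are hypothetical names introduced for this example; only the Char.Is* calls come from the framework.

```csharp
using System;
using System.Collections.Generic;

public static class CharChecks
{
    // Hypothetical helper (not part of the framework): returns the names of
    // all Char.Is* tests from the Solution that succeed for the character.
    public static string Describe(char c)
    {
        List<string> kinds = new List<string>();
        if (Char.IsControl(c))     kinds.Add("Control");
        if (Char.IsDigit(c))       kinds.Add("Digit");
        if (Char.IsLetter(c))      kinds.Add("Letter");
        if (Char.IsNumber(c))      kinds.Add("Number");
        if (Char.IsPunctuation(c)) kinds.Add("Punctuation");
        if (Char.IsSeparator(c))   kinds.Add("Separator");
        if (Char.IsSurrogate(c))   kinds.Add("Surrogate");
        if (Char.IsSymbol(c))      kinds.Add("Symbol");
        if (Char.IsWhiteSpace(c))  kinds.Add("WhiteSpace");
        return String.Join(", ", kinds.ToArray());
    }
}
```

A character can satisfy more than one test: Describe('7') yields "Digit, Number" and Describe('\t') yields "Control, WhiteSpace". The order of the tests therefore matters in any method that returns only the first match.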

Discussion

The following examples demonstrate how to use the methods shown in the Solution section in a function to return the kind of a character. First, create an enumeration to define the various types of characters:

    public enum CharKind
    {
        Letter,
        Number,
        Punctuation,
        Unknown
    }

Next, create a method that contains the logic to determine the type of a character and to return a CharKind enumeration value indicating that type:

    public static CharKind GetCharKind(char theChar)
    {
        if (Char.IsLetter(theChar))
        {
            return CharKind.Letter;
        }
        else if (Char.IsNumber(theChar))
        {
            return CharKind.Number;
        }
        else if (Char.IsPunctuation(theChar))
        {
            return CharKind.Punctuation;
        }
        else
        {
            return CharKind.Unknown;
        }
    }

If, however, a character in a string needs to be evaluated, use the overloaded static methods on the Char structure that accept a string and an index. The following GetCharKindInString method adapts GetCharKind to accept a string variable and a character position in that string. The character position determines which character in the string is evaluated.

    public static CharKind GetCharKindInString(string theString, int charPosition)
    {
        if (Char.IsLetter(theString, charPosition))
        {
            return CharKind.Letter;
        }
        else if (Char.IsNumber(theString, charPosition))
        {
            return CharKind.Number;
        }
        else if (Char.IsPunctuation(theString, charPosition))
        {
            return CharKind.Punctuation;
        }
        else
        {
            return CharKind.Unknown;
        }
    }

The GetCharKind method accepts a character as a parameter and performs a series of tests on that character using the Char type's built-in static methods. The CharKind enumeration defines the kinds of characters these methods report, and each method returns one of its values; a character that is not a letter, number, or punctuation character is reported as Unknown.
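To classify every character in a string, the string overloads can be called in a loop. The sketch below repeats the CharKind enumeration and GetCharKindInString in compact form so that it compiles on its own; the CharKindDemo class and the GetCharKinds method are hypothetical names added for illustration.

```csharp
using System;
using System.Collections.Generic;

public enum CharKind { Letter, Number, Punctuation, Unknown }

public static class CharKindDemo
{
    // Same tests, in the same order, as GetCharKindInString in the recipe.
    public static CharKind GetCharKindInString(string theString, int charPosition)
    {
        if (Char.IsLetter(theString, charPosition)) return CharKind.Letter;
        if (Char.IsNumber(theString, charPosition)) return CharKind.Number;
        if (Char.IsPunctuation(theString, charPosition)) return CharKind.Punctuation;
        return CharKind.Unknown;
    }

    // Hypothetical helper: classifies each position of the string in order.
    public static List<CharKind> GetCharKinds(string theString)
    {
        List<CharKind> kinds = new List<CharKind>();
        for (int i = 0; i < theString.Length; i++)
        {
            kinds.Add(GetCharKindInString(theString, i));
        }
        return kinds;
    }
}
```

For the input "a1, " this returns Letter, Number, Punctuation, and Unknown, because the trailing space matches none of the three tests. The string overloads throw an ArgumentOutOfRangeException when the position is not a valid index, so callers that accept arbitrary positions may want to check the bounds first.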

Table 2-1 describes each of the static Char methods.

Table 2-1. Char methods

Char method     Description
-----------     -----------
IsControl       A control code: \u007F, or a character in the range \u0000 through \u001F or \u0080 through \u009F.
IsDigit         Any decimal digit, such as 0 through 9.
IsLetter        Any alphabetic letter.
IsNumber        Any decimal digit or other numeric character, such as a fraction or a Roman numeral. Hexadecimal letters (A through F) are letters, not numbers.
IsPunctuation   Any punctuation character.
IsSeparator     A space separating words, a line separator, or a paragraph separator.
IsSurrogate     Any surrogate character in the range \uD800 through \uDFFF.
IsSymbol        Any mathematical, currency, or other symbol character. Includes characters that modify surrounding characters.
IsWhiteSpace    Any space character and the following characters: \u0009, \u000A, \u000B, \u000C, \u000D, \u0085, \u2028, and \u2029.
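The difference between IsDigit and IsNumber is easiest to see by testing a few characters. This is a minimal sketch; the DigitVersusNumber class and Compare method are hypothetical names. It relies on the framework's Unicode classification, under which IsDigit accepts only decimal digits while IsNumber also accepts other numeric characters.

```csharp
using System;

public static class DigitVersusNumber
{
    // Hypothetical helper: reports both test results for one character.
    public static string Compare(char c)
    {
        return "IsDigit=" + Char.IsDigit(c) + ", IsNumber=" + Char.IsNumber(c);
    }
}
```

Compare('5') gives "IsDigit=True, IsNumber=True", Compare('\u00BD') (the fraction one half) gives "IsDigit=False, IsNumber=True", and Compare('A') gives "IsDigit=False, IsNumber=False".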


The following code example determines whether the fifth character (the charPosition parameter is zero-based) in the string is a number:

    if (GetCharKindInString("abcdefg", 4) == CharKind.Number) {…}

The Char structure provides a few extra Is* methods that augment the methods in Table 2-1; the surrogate-specific ones were added in Version 2.0 of the .NET Framework. If the character in question is a letter (i.e., the IsLetter method returns true), you can determine if the letter is uppercase or lowercase by using the methods in Table 2-2.

Table 2-2. Upper- and lowercase Char methods

Char method     Description
-----------     -----------
IsLower         A character that is lowercase
IsUpper         A character that is uppercase
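A short sketch of how these two methods might be combined follows; the CaseChecks class and GetCase method are hypothetical names. Characters that have no case, such as digits, fail both tests.

```csharp
using System;

public static class CaseChecks
{
    // Hypothetical helper: names the case of a character, or "none"
    // when the character is neither uppercase nor lowercase.
    public static string GetCase(char c)
    {
        if (Char.IsUpper(c)) return "upper";
        if (Char.IsLower(c)) return "lower";
        return "none";
    }
}
```

GetCase('A') returns "upper", GetCase('a') returns "lower", and GetCase('1') returns "none".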


If the character in question is a surrogate (i.e., the IsSurrogate method returns true), you can use the methods in Table 2-3 to get more information on the surrogate character.

Table 2-3. Surrogate Char methods

Char method       Description
-----------       -----------
IsHighSurrogate   A character that is in the range \uD800 to \uDBFF
IsLowSurrogate    A character that is in the range \uDC00 to \uDFFF


In addition to these surrogate methods, an additional method, IsSurrogatePair, returns true only if two characters form a surrogate pair; that is, a high surrogate immediately followed by a low surrogate.
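Surrogate pairs matter when counting or indexing characters outside the Basic Multilingual Plane, because each such character occupies two char values in a string. The following sketch counts Unicode code points rather than char values; the SurrogateDemo class and CountCodePoints method are hypothetical names, and an unpaired surrogate is simply counted as one code point.

```csharp
using System;

public static class SurrogateDemo
{
    // Hypothetical helper: counts code points, treating each valid
    // high/low surrogate pair as a single character.
    public static int CountCodePoints(string text)
    {
        int count = 0;
        for (int i = 0; i < text.Length; i++)
        {
            // Skip the low surrogate when the current char starts a valid pair.
            if (i + 1 < text.Length && Char.IsSurrogatePair(text[i], text[i + 1]))
            {
                i++;
            }
            count++;
        }
        return count;
    }
}
```

For the string "a\U0001F600b", Length is 4 but CountCodePoints returns 3, because the middle character is stored as the pair \uD83D and \uDE00. The order of the pair matters: Char.IsSurrogatePair('\uDE00', '\uD83D') returns false.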

The final addition to this group of methods is the IsLetterOrDigit method, which returns true only if the character in question is either a letter or a digit. To determine which of the two the character is, use the IsLetter and IsDigit methods.
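For example, IsLetterOrDigit can filter a string down to its alphanumeric characters. This is a sketch only; the AlphanumericFilter class and KeepLettersAndDigits method are hypothetical names.

```csharp
using System;
using System.Text;

public static class AlphanumericFilter
{
    // Hypothetical helper: copies only letters and digits to the result.
    public static string KeepLettersAndDigits(string text)
    {
        StringBuilder result = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (Char.IsLetterOrDigit(c))
            {
                result.Append(c);
            }
        }
        return result.ToString();
    }
}
```

KeepLettersAndDigits("C# 2.0!") returns "C20": the number sign, space, period, and exclamation mark are dropped.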

See Also

See the "Char Structure" topic in the MSDN documentation.



C# Cookbook