Summary
This chapter covered the basic control structures that C# provides for implementing flow control logic within an application. You've seen the syntax and usage of each control structure and learned when each should be used. In addition, we touched on the foreach statement, which is covered in Chapter 5.
Further Reading
Programming C#, Third Edition, by Jesse Liberty, O'Reilly. ISBN: 0596004893.
The Applied Microsoft .NET Framework Programming in C# Collection, Microsoft Press. ISBN: 0735619751.
Chapter 4. Strings and Regular Expressions
IN BRIEF
This chapter covers .NET strings and their manipulation, along with the very powerful feature of regular expressions.
The first section deals with the .NET String class. You will learn about the workings of .NET strings, formatting, and manipulation of .NET strings.
The second section deals with the StringBuilder class. You will learn efficient ways to handle string concatenation and manipulation.
The final section deals with regular expressions. You will learn to apply this powerful engine for matching, grouping, validating, and replacing strings.
WHAT YOU NEED
STRINGS AND REGULAR EXPRESSIONS AT A GLANCE
String Basics
The .NET Framework finally
Understanding the
| Specifier | Type | Format | Input | Output |
|---|---|---|---|---|
| c | Currency | {0:c} | 250.25 | $250.25 |
| c | Currency | {0:c} | -250.25 | -$250.25 |
| d | Decimal (whole number) | {0:d} | 250 | 250 |
| d | Decimal (whole number) | {0:d} | -250 | -250 |
| e | Scientific | {0:e} | 3.14 | 3.140000e+000 |
| e | Scientific | {0:e} | -3.14 | -3.140000e+000 |
| f | Fixed point | {0:f} | 3.14 | 3.14 |
| f | Fixed point | {0:f} | -3.14 | -3.14 |
| g | General | {0:g} | 3.14 | 3.14 |
| g | General | {0:g} | -3.14 | -3.14 |
| n | Number with commas for thousands | {0:n} | 25000 | 25,000.00 |
| n | Number with commas for thousands | {0:n} | -25000 | -25,000.00 |
| p | Percent | {0:p} | .25 | 25.00% |
| p | Percent | {0,2:p} | .25555 | 25.56% |
| X | Uppercase hexadecimal | {0:X} | 15 | F |
| x | Lowercase hexadecimal | {0:x} | 15 | f |
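To make the specifiers in the table concrete, here is a minimal sketch that applies several of them. The class name StandardNumericFormats and the helper Show are hypothetical names chosen for illustration. The sketch passes CultureInfo.InvariantCulture so that separators do not depend on the machine's regional settings; currency and percent output are left out because they vary by culture.

```csharp
using System;
using System.Globalization;

class StandardNumericFormats
{
    // Hypothetical helper: applies one composite format string
    // to one value using the invariant culture.
    public static string Show(string format, object value)
    {
        return string.Format(CultureInfo.InvariantCulture, format, value);
    }

    static void Main()
    {
        Console.WriteLine(Show("{0:d}", 250));    // 250
        Console.WriteLine(Show("{0:d5}", 250));   // 00250 (padded to five digits)
        Console.WriteLine(Show("{0:e}", 3.14));   // 3.140000e+000
        Console.WriteLine(Show("{0:f}", 3.14));   // 3.14
        Console.WriteLine(Show("{0:n}", 25000));  // 25,000.00
        Console.WriteLine(Show("{0:X}", 255));    // FF
        Console.WriteLine(Show("{0:x}", 255));    // ff
    }
}
```

A precision digit after the specifier, as in d5, is optional; when it is omitted, the culture's default number of decimal places applies.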
| Specifier | Type | Format | Input | Output |
|---|---|---|---|---|
| 0 | Zero placeholder | {0:00.0000} | 3.14 | 03.1400 |
| # | Digit placeholder | {0:(#).##} | 3.14 | (3).14 |
| . | Decimal point | {0:0.0} | 3.14 | 3.1 |
| , | Thousand separator | {0:0,0} | 2500.25 | 2,500 |
| ,. | Number scaling | {0:0,.} | 2000 | 2 (scales by 1000) |
| % | Percent | {0:0%} | 25 | 2500% (multiplies by 100 and adds percent sign) |
| ; | Section separator | {Positive-Format};{Negative-Format};{Zero-Format} | | |
With the exception of the section separator (;), custom numeric formatting is largely self-explanatory. The section separator allows up to three different format specifications, chosen by the value being formatted: the first applies to positive values, the second to negative values, and the third to zero. For instance, if you want negative values to appear in parentheses, the following formatting could be used:
string result = string.Format("{0:$##,###.00;$(##,###.00);$-.--}", amount);
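The three-part format described above can be sketched as follows. The class SectionSeparatorDemo, the method FormatAmount, and the particular format string are assumptions made for illustration, not the book's listing. The invariant culture is passed explicitly so the group and decimal separators are predictable.

```csharp
using System;
using System.Globalization;

class SectionSeparatorDemo
{
    // Hypothetical helper. The format has three sections separated by ';':
    // positive values ; negative values ; zero.
    public static string FormatAmount(double amount)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "{0:$#,##0.00;($#,##0.00);-}", amount);
    }

    static void Main()
    {
        Console.WriteLine(FormatAmount(1234.5));   // $1,234.50
        Console.WriteLine(FormatAmount(-1234.5));  // ($1,234.50)
        Console.WriteLine(FormatAmount(0));        // -
    }
}
```

Note that the negative section formats the absolute value, so the minus sign appears only if the section itself contains one.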
The next common data type for formatting is the DateTime struct within .NET. There are many options when it comes to formatting dates, and Tables 4.3 and 4.4 list the various formatting specifiers.
| Specifier | Description | Format | Result Using System.DateTime.Now |
|---|---|---|---|
| d | Short date | {0:d} | 4/17/2004 |
| D | Long date | {0:D} | Saturday, April 17, 2004 |
| t | Short time | {0:t} | 11:50 AM |
| T | Long time | {0:T} | 11:50:30 AM |
| f | Full date and time | {0:f} | Saturday, April 17, 2004 11:51 AM |
| F | Long full date and time | {0:F} | Saturday, April 17, 2004 11:51:45 AM |
| g | Default date and time | {0:g} | 4/17/2004 11:53 AM |
| G | Long default date and time | {0:G} | 4/17/2004 11:53:45 AM |
| M or m | Month day | {0:M} | April 17 |
| R or r | RFC1123 date string | {0:r} | Sat, 17 Apr 2004 11:55:17 GMT |
| s | Sortable date string (ISO 8601) | {0:s} | 2004-04-17T11:56:22 |
| u | Universal sortable date pattern | {0:u} | 2004-04-17 11:58:11Z |
| U | Universal full date and time pattern | {0:U} | Saturday, April 17, 2004 3:58:32 PM |
| Y or y | Year month pattern | {0:Y} | April, 2004 |
| Specifier | Description | Format |
|---|---|---|
| d | Displays the day of the month as a number (1-31) | {0:%d} |
| dd | Displays the day of the month with a leading zero for single-digit days | {0:dd} |
| ddd | Displays the abbreviated day of the week | {0:ddd} |
| dddd | Displays the full name of the day of the week | {0:dddd} |
| f, ff, fff, ffff... | Displays fractions of a second in one or more digits | {0:%f} or {0:ff} |
| g or gg | Displays the era, such as B.C. or A.D. | {0:%g} |
| h | Displays the hour from 1-12 | {0:%h} |
| hh | Displays the hour from 1-12 with a leading zero | {0:hh} |
| H | Displays the hour in military format, 0-23 | {0:%H} |
| HH | Displays the hour in military format, 0-23, with a leading zero for single-digit hours | {0:HH} |
| m | Displays the minute as an integer | {0:%m} |
| mm | Displays the minute as an integer with a leading zero for single-digit minute values | {0:mm} |
| M | Displays the month as an integer | {0:%M} |
| MM | Displays the month as an integer with a leading zero for single-digit month values | {0:MM} |
| MMM | Displays the abbreviated month name | {0:MMM} |
| MMMM | Displays the full name of the month | {0:MMMM} |
| s | Displays the seconds as an integer | {0:%s} |
| ss | Displays the seconds as an integer with a leading zero for single-digit second values | {0:ss} |
| t | Displays the first character of the A.M. or P.M. designator | {0:%t} |
| tt | Displays the full A.M. or P.M. designator | {0:tt} |
| y | Displays two-digit year, with no leading zero | {0:%y} |
| yy | Displays two-digit year | {0:yy} |
| yyyy | Displays four-digit year | {0:yyyy} |
| zz | Displays the time zone offset in hours | {0:zz} |
| : | Time separator | {0:hh:mm:tt} |
| / | Date separator | {0:MM/dd/yyyy} |

A custom specifier used on its own must be prefixed with %, because a single letter is otherwise interpreted as one of the standard date format specifiers.
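As a rough illustration of how the date specifiers combine, the sketch below formats one fixed date in several ways. The class DateFormatDemo and the helper Show are hypothetical names, and a fixed DateTime is used instead of DateTime.Now so the results are repeatable. The invariant culture is assumed so that month and day names come out in English.

```csharp
using System;
using System.Globalization;

class DateFormatDemo
{
    // Hypothetical helper: formats a date with the invariant culture.
    public static string Show(string format, DateTime value)
    {
        return string.Format(CultureInfo.InvariantCulture, format, value);
    }

    static void Main()
    {
        // April 17, 2004, 11:50:30 in the morning
        DateTime when = new DateTime(2004, 4, 17, 11, 50, 30);

        Console.WriteLine(Show("{0:s}", when));                   // 2004-04-17T11:50:30
        Console.WriteLine(Show("{0:MM/dd/yyyy}", when));          // 04/17/2004
        Console.WriteLine(Show("{0:dddd, MMMM d, yyyy}", when));  // Saturday, April 17, 2004
        Console.WriteLine(Show("{0:hh:mm tt}", when));            // 11:50 AM
        Console.WriteLine(Show("{0:HH:mm:ss}", when));            // 11:50:30
        Console.WriteLine(Show("{0:%d}", when));                  // 17 (day of the month alone)
    }
}
```

The last line uses the % prefix because a lone d would be read as the standard short date specifier rather than the custom day specifier.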
It is often necessary to include in a string special characters such as tab (\t), newline (\n), or even the \ character itself. To insert such a character, precede it with the escape character (\). A literal backslash is therefore written as \\, as in the following:
string escapeMe = string.Format( "C:\\SAMS\\Code" );
With the escape character in place, the value of escapeMe is "C:\SAMS\Code".
FORMATTING NOTES
If you don't want to use the double-backslash ( \\ ) syntax, C# provides a special shortcut that you can use. Preceding a string literal with the @ symbol makes it a verbatim string, in which backslashes are taken literally, enabling you to write code that looks like this:
string myFile = @"C:\SAMS\Code\File.txt";
As a special note, the { character can also cause difficulty when it appears in a string that contains format items. To display the { character itself, use {{ to escape it (and }} for the } character). This matters only when the string is passed to a formatting method, as in the following:
string myString = string.Format( "{{x}} = {0}", x );
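The escaping rules above can be checked with a short sketch. The class EscapeDemo and its members are hypothetical names used only for illustration.

```csharp
using System;

class EscapeDemo
{
    // Same path written twice: with escaped backslashes and as a verbatim string.
    public static readonly string Escaped = "C:\\SAMS\\Code";
    public static readonly string Verbatim = @"C:\SAMS\Code";

    // Doubled braces produce literal braces; {0} is still a format item.
    public static string Braces(int x)
    {
        return string.Format("{{x}} = {0}", x);
    }

    static void Main()
    {
        Console.WriteLine(Escaped);              // C:\SAMS\Code
        Console.WriteLine(Escaped == Verbatim);  // True
        Console.WriteLine("Name\tValue");        // Name and Value separated by a tab
        Console.WriteLine(Braces(42));           // {x} = 42
    }
}
```

Both literal forms compile to the same string; the verbatim form is usually easier to read for file paths.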
One of the most common string-processing requirements is locating substrings within a string. The System.String class provides several methods for locating substrings:
| Method | Description |
|---|---|
| EndsWith | Used to determine whether a string ends with a specific substring. Returns true or false. |
| IndexOf | Returns the first index (zero-based) location of the supplied substring or character. Returns -1 if the substring is not found. |
| IndexOfAny | Returns the first index (zero-based) location of any character in the supplied array of characters. Returns -1 if none of the characters is found. |
| LastIndexOf | Returns the last index of the specified substring or character. Returns -1 if the substring is not found. |
| LastIndexOfAny | Returns the last index of any character in the supplied array of characters. Returns -1 if none of the characters is found. |
| StartsWith | Returns true if the string starts with the specified substring. |
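A brief sketch of the search methods listed above follows. The class SearchDemo and the sample sentence are assumptions for illustration. StringComparison.Ordinal is passed for the string searches so the results do not depend on culture-specific comparison rules.

```csharp
using System;

class SearchDemo
{
    public const string Text = "The quick brown fox";

    static void Main()
    {
        Console.WriteLine(Text.IndexOf("quick", StringComparison.Ordinal));  // 4
        Console.WriteLine(Text.IndexOf("cat", StringComparison.Ordinal));    // -1 (not found)
        Console.WriteLine(Text.IndexOf('o'));                                // 12
        Console.WriteLine(Text.LastIndexOf('o'));                            // 17
        Console.WriteLine(Text.IndexOfAny(new char[] { 'x', 'q' }));         // 4 ('q' comes first)
        Console.WriteLine(Text.LastIndexOfAny(new char[] { 'q', 'b' }));     // 10 ('b' comes last)
        Console.WriteLine(Text.StartsWith("The", StringComparison.Ordinal)); // True
        Console.WriteLine(Text.EndsWith("fox", StringComparison.Ordinal));   // True
    }
}
```

Because a failed search returns -1 rather than throwing, the result should be checked before it is used as an index.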
Just as format specifiers support padding, the String class provides a set of padding methods that pad a string with spaces or a specified character. Padding can be added to the left or right of the target string. The code in Listing 4.3 shows how to pad a string to 20 characters in length with leading spaces.
string leftPadded = "Left Padded";
Console.WriteLine("123456789*123456789*123456789*");
Console.WriteLine( leftPadded.PadLeft(20, ' ' ) );
The output of the code in Listing 4.3 is as follows:
123456789*123456789*123456789*
         Left Padded
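Padding works with any character, not only spaces, and on either side. The following sketch is an illustration with a hypothetical class name, PaddingDemo, and made-up sample values.

```csharp
using System;

class PaddingDemo
{
    // Pads a number on the left with zeros up to the given width.
    public static string ZeroPad(int value, int width)
    {
        return value.ToString().PadLeft(width, '0');
    }

    // Pads a label on the right with dots up to the given width.
    public static string DotFill(string label, int width)
    {
        return label.PadRight(width, '.');
    }

    static void Main()
    {
        Console.WriteLine(ZeroPad(42, 5));        // 00042
        Console.WriteLine(DotFill("Name", 10));   // Name......
        Console.WriteLine(ZeroPad(123456, 5));    // 123456 (already longer, unchanged)
    }
}
```

The width argument is the total length of the result, so a string that is already at least that long is returned unchanged.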
Sometimes it is necessary to remove characters from a string, and this is the purpose of trimming. The Trim method removes spaces or specified characters from both the start and end of a string. By default, the Trim method removes white space.
In addition, there are two other trimming methods. TrimStart removes white space or a list of specified characters from the beginning of a string. TrimEnd removes white space or a list of specified characters from the end of a string.
You can access the Trim method and others like it on any string variable, as shown here:
string myTrimmedString = myString.Trim();
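The difference between the three trimming methods is easiest to see side by side. In this sketch the class TrimDemo and the sample string are assumptions for illustration.

```csharp
using System;

class TrimDemo
{
    public const string Raw = "  ***Hello***  ";

    static void Main()
    {
        // Default: only white space is removed, from both ends.
        Console.WriteLine("[" + Raw.Trim() + "]");                // [***Hello***]

        // With a character list, every listed character is removed from both ends.
        Console.WriteLine("[" + Raw.Trim(' ', '*') + "]");        // [Hello]

        // TrimStart and TrimEnd work on one end only.
        Console.WriteLine("[" + Raw.TrimStart(' ', '*') + "]");   // [Hello***  ]
        Console.WriteLine("[" + Raw.TrimEnd(' ', '*') + "]");     // [  ***Hello]
    }
}
```

As with every String method, the original string is left untouched and a new string is returned.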
To replace characters or substrings in a string, use the Replace method. For instance, to remove display formatting from a phone number such as (919) 555-1212, the following code can be used:
string phoneNumber = "(919) 555-1212";
string fixedPhoneNumber =
    phoneNumber.Replace( "(", "" ).Replace( ")", "" ).Replace( "-", "" )
               .Replace( " ", "" );
Console.WriteLine( fixedPhoneNumber );
Notice how the Replace method is used. Each call to Replace returns a new string rather than modifying the original, so the calls are chained: each one operates on the result of the previous call until all unwanted characters have been removed.
REPLACING WITH EMPTY STRINGS
When you want to remove a character and replace it with nothing, you must use the string version of Replace with the empty string "" rather than the empty character '' notation; otherwise, the compiler will report an error about an empty character literal.
String splitting comes in handy for parsing comma-separated values or any other string with known separator characters. The Split method requires nothing more than a character parameter that denotes where to split the string. The result of this operation is an array of strings in which each element is a substring of the original string. To split a comma-separated list such as apple, orange, banana, use the following:
string fruit = "apple,orange,banana";
string[] fruits = fruit.Split( ',' );
foreach (string fruitName in fruits)
    Console.WriteLine(fruitName);
//Result
//fruits[0] -> apple
//fruits[1] -> orange
//fruits[2] -> banana
The last two major methods of the String class involve changing the case of a string. The case can be changed to uppercase or lowercase and results in a new string of the specified case. Remember that strings are immutable and any action that modifies a string results in a new string. Therefore, the following takes place:
string attemptToUpper = "attempt to upper";
attemptToUpper.ToUpper( );
//attemptToUpper is still all lower case
To see the effect of the ToUpper() method, the result string has to be assigned to a variable. The following illustrates the proper use of ToUpper() :
string allLower = "all lower";
string ALL_UPPER = allLower.ToUpper( );
//ALL_UPPER -> "ALL LOWER"
To improve performance, the StringBuilder class is designed to manage an array of characters via direct manipulation. Such an implementation eliminates the need to constantly allocate new strings.
The most basic use of the StringBuilder class is to perform string concatenation, which is the process of building a result string from various other strings and values until the final string is complete. The StringBuilder class provides an Append method. The Append method is used to append values to the end of the current string. Values can be integer, boolean, char, string, DateTime, and a list of others. In fact, the Append method has 19 overloads in order to accommodate any value you need to append to a string.
In addition to appending values to the current string, StringBuilder also provides the ability to append formatted strings. The format specifiers are the same specifiers listed in the previous section. The AppendFormat method is provided in order to avoid calls to string.Format(...) and the unnecessary creation of additional strings.
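A minimal sketch of Append and AppendFormat working together is shown below. The class AppendDemo, the method Build, and the sample values are hypothetical. The AppendFormat overload that accepts an IFormatProvider is used with the invariant culture so the decimal separator is predictable.

```csharp
using System;
using System.Globalization;
using System.Text;

class AppendDemo
{
    // Builds one string from values of several different types.
    public static string Build()
    {
        StringBuilder sb = new StringBuilder();
        sb.Append("Total: ");   // string overload
        sb.Append(3);           // int overload
        sb.Append(' ');         // char overload
        sb.Append(true);        // bool overload, appends "True"
        // Formats directly into the builder, avoiding a separate string.Format call.
        sb.AppendFormat(CultureInfo.InvariantCulture, " {0:f2}", 3.14159);
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Build());  // Total: 3 True 3.14
    }
}
```

The final string is only materialized once, when ToString is called.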
The insertion of strings is another useful method provided by the StringBuilder class. The Insert method takes two parameters. The first parameter specifies the zero-based index at which to begin the insertion. The second parameter is the value to insert at the specified location. Similar to the Append method, the Insert method provides 18 different overloads in order to support various data types for insertion into the string. Listing 4.4 shows the usage of the Insert method.
using System;
using System.Text;
namespace Listing_4_4 {
    class Class1 {
        [STAThread]
        static void Main(string[] args) {
            StringBuilder stmtBuilder = new StringBuilder( "SELECT FROM MYTABLE" );
            Console.Write( "Enter Columns to select from MYTABLE: ");
            string columns = Console.ReadLine( ); //FirstName, LastName
            stmtBuilder.Insert( 7, columns );
            //insert a space after the column names
            stmtBuilder.Insert( 7 + columns.Length, " " );
            //SELECT FirstName, LastName FROM MYTABLE
            Console.WriteLine( stmtBuilder.ToString( ) );
        }
    }
}
You might run across a need to generate strings based on templates where certain tokens (substrings) are later replaced with values. In fact, this is how Visual Studio .NET works. There is a template file from which each project is created. Each new source file is generated from a template, and various tokens are replaced with actual values.
Using the Replace method, it is possible to create template strings, such as SQL statements, and replace tokens with actual values as demonstrated by the following code:
StringBuilder selectStmtTemplate = new StringBuilder();
selectStmtTemplate.Append( "SELECT $FIELDS FROM $TABLE" );
selectStmtTemplate.Replace( "$FIELDS", fieldList );
selectStmtTemplate.Replace( "$TABLE", tableName );
The Remove method allows for sections of the underlying string to be completely removed from the StringBuilder object. The Remove method takes two parameters. The first parameter specifies the zero-based index of the position denoting the starting point. The second parameter specifies the length or number of characters to remove.
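The two parameters of Remove can be sketched as follows. The class RemoveDemo, the method DropText, and the sample SQL text are assumptions for illustration. The starting index is looked up rather than hard-coded so the example does not depend on counting characters by hand.

```csharp
using System;
using System.Text;

class RemoveDemo
{
    // Removes the first occurrence of 'unwanted' from 'source', if present.
    public static string DropText(string source, string unwanted)
    {
        StringBuilder sb = new StringBuilder(source);
        int start = source.IndexOf(unwanted, StringComparison.Ordinal);
        if (start >= 0)
        {
            // First argument: zero-based start index. Second: number of characters.
            sb.Remove(start, unwanted.Length);
        }
        return sb.ToString();
    }

    static void Main()
    {
        string stmt = "SELECT FirstName, LastName FROM MYTABLE";
        Console.WriteLine(DropText(stmt, ", LastName"));  // SELECT FirstName FROM MYTABLE
    }
}
```

Remove throws an ArgumentOutOfRangeException if the index or length falls outside the builder's contents, which is why the sketch checks the search result first.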