Let's Talk Strings

Arguably, string manipulation and file manipulation are the two most important facets of computer science. Almost all programs use either or both. Now that we're familiar with file manipulation using the .NET Framework, let's spend some time looking at the new string functionality. Also, in our review of how to work with strings, you will start to see patterns in the .NET Framework classes. Once you master a few of them, the rest follow the same conventions and soon you will feel at home working with them.

It's been said that everything in computer science can be ultimately reduced to string manipulation. While this statement may or may not be true, strings are incredibly important in even the smallest program. The adept string capabilities of classic Visual Basic were the envy of programmers in all languages. Using the classic Visual Basic built-in functions Mid, Left, and Right made string handling a breeze. In many languages, such as C and C++, strings are actually a chore. They're handled as an array, you have to access them just so, and you have to index through them. In fact, most programmers starting out with C++ cut their teeth on the language by writing classes just to handle strings. In short, string manipulation was work.

The good news about strings in Visual Basic .NET is that we have the same capability to manipulate strings as we had in Visual Basic 6. However, we have to change the way we work with strings because they have grown up in Visual Basic .NET. We all became dependent on the string manipulation functions built into the Visual Basic 6 language. While they were simple, elegant, and usually fast, they were built into the compiler and specific to the language.

What's New in Strings?

The String data type comes from the System.String class. Like the File class, the String is a sealed class, so you cannot inherit from it. Sealing the String class permits the system to perform behind-the-scenes string optimization algorithms. Perhaps the most dramatic new notion associated with strings is that an instance of the String class is considered immutable, meaning that a string cannot be modified after it has been created. Wait a moment. If a string is immutable, how can we delete a section from the middle of the string or trim leading or trailing blanks? In fact, how will we use all of the other string manipulation methods? Manipulation implies change.

In Visual Basic .NET, all of the string-manipulation methods appear to modify a string, but they actually leave the original untouched and return a new string containing the modification. When the result is assigned back to the same variable, the original string is simply discarded. The end result is transparent to programmers, so why do we care?

There are several reasons. An immutable string makes threading, ownership, and aliasing of a string object much simpler, for example. Also, .NET maintains a pool of literal strings within the memory space of the running program (known as the application domain). All literal strings in the program are automatically part of the pool. This system permits sophisticated algorithms to merge any duplicate strings. Because merged duplicates share a single object, pooled strings with the same value can be compared by reference (checking the memory location) instead of by value (checking the actual characters of the string).
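Both ideas can be seen in a few lines. The following is a minimal sketch, not code from the book: the module name StringPoolSketch and the helper functions Tidy and SameObject are invented for illustration. It shows that a string method returns a new string while the original keeps its value, and that two identical literals refer to one pooled object while a string built at run time does not.

```vbnet
Option Strict On
Imports System

Module StringPoolSketch
    ' Hypothetical helper: returns a trimmed, upper-cased version of sInput.
    ' sInput itself is not changed.
    Function Tidy(ByVal sInput As String) As String
        Return sInput.Trim().ToUpper()
    End Function

    ' Hypothetical helper: True when both variables refer to the
    ' same String object in memory.
    Function SameObject(ByVal sA As String, ByVal sB As String) As Boolean
        Return Object.ReferenceEquals(sA, sB)
    End Function

    Sub Main()
        ' Immutability: Tidy hands back a new string and leaves the original alone.
        Dim sOriginal As String = "  hot sauce  "
        Dim sTidy As String = Tidy(sOriginal)
        Console.WriteLine("[" & sOriginal & "]") ' Still has its leading and trailing blanks
        Console.WriteLine("[" & sTidy & "]")     ' The new, modified string

        ' Pooling: two identical literals refer to one pooled object.
        Dim sLiteral1 As String = "burrito"
        Dim sLiteral2 As String = "burrito"
        ' Built from characters at run time, so it is a separate object.
        Dim sBuilt As String = New String("burrito".ToCharArray())

        Console.WriteLine(SameObject(sLiteral1, sLiteral2))             ' Same pooled object
        Console.WriteLine(SameObject(sLiteral1, sBuilt))                ' Same value, different object
        Console.WriteLine(sLiteral1 = sBuilt)                           ' Value comparison still matches
        Console.WriteLine(SameObject(sLiteral1, String.Intern(sBuilt))) ' Intern returns the pooled object
    End Sub
End Module
```

The practical point is that a reference comparison is only a safe shortcut when both strings are known to come from the pool; for everything else, compare values.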

Uninitialized Strings

The first thing to learn about strings is that we should give them a value before we use them. When a string variable is dimensioned and not given a value, it holds Nothing (no string object at all), although Visual Basic .NET treats it as an empty string in many operations. Consider the following line.

Dim sString1 As String

You can think of sString1 as a reference variable that points to a string. However, the variable is currently uninitialized because it isn't referring to a string. This oversight was a common misstep in classic Visual Basic.

Now let's say that we try to use the variable in an innocent way, such as displaying it.

MessageBox.Show(sString1)

Remember that a string is a reference data type, so an unassigned string variable holds Nothing, the NULL of other languages, just like any other reference variable. However, Visual Basic .NET treats a Nothing string as an empty string, or "", when you display it, concatenate it with &, or compare it with the = operator, so the message box above simply appears empty and we do not get the dreaded "Attempt to dereference a null object reference" error message. (In object-oriented programming, dereferencing means attempting to get something from a memory location; think of a reference variable as a pointer in C++. This error message tells us that we attempted to grab something by referencing a memory location, but the object is NULL. Oops.) The safety net has limits, though: calling a member directly on the unassigned variable, such as sString1.Length, still raises a NullReferenceException, so give a string a value before calling its methods.
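A short sketch of how an unassigned string variable behaves in practice. The module name NothingStringSketch and the SafeLength helper are hypothetical names used only for this illustration; the compiler may also warn that sString1 is used before it has been assigned a value.

```vbnet
Option Strict On
Imports System

Module NothingStringSketch
    ' Hypothetical helper: returns the length of a string,
    ' treating Nothing as an empty string.
    Function SafeLength(ByVal sValue As String) As Integer
        If sValue Is Nothing Then
            Return 0
        End If
        Return sValue.Length
    End Function

    Sub Main()
        Dim sString1 As String ' Declared but never assigned

        Console.WriteLine(sString1 Is Nothing)  ' The variable refers to no string object
        Console.WriteLine(sString1 = "")        ' The = operator treats Nothing as ""
        Console.WriteLine("[" & sString1 & "]") ' Concatenation also treats it as ""
        Console.WriteLine(SafeLength(sString1)) ' Guarded access avoids a null-reference error
    End Sub
End Module
```

Calling sString1.Length directly in place of SafeLength would fail at run time, which is why assigning an initial value (even "") is the safer habit.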

Working with Strings

Because our strings are objects, they have various manipulation methods built in. All of the handy string manipulation functions in Visual Basic 6 are now methods of the String object. For example, if we want to concatenate two strings, we could write code something like the following. The results are shown in Figure 5-9.

Dim sString1 As String = "Don't try to shift gears while trying"
Dim sString2 As String = " to put hot sauce on your burrito."
'Concat is a shared method, so call it on the String class itself
sString1 = String.Concat(sString1, sString2)

Figure 5-9

The results of our concatenated strings.

Of course, the following two statements accomplish the same thing:

sString1 &= sString2
sString1 = sString1 + sString2

None of these examples really modifies sString1 at all. A new string containing the combined text is created on the fly and assigned to sString1, and the original string is left for the garbage collector. If your program is going to do quite a bit of string manipulation, there will be a lot of creating and throwing away of strings. As you might guess, this process can be slow. If you really need to do some industrial strength string manipulation, use the StringBuilder class. StringBuilder objects are convenient for situations in which it is desirable to modify a string, perhaps by removing, replacing, or inserting characters, without creating a new string for each modification. The methods contained within this class do not return a new StringBuilder object (unless specified otherwise). In the next chapter, we will be using the StringBuilder class to build a fun program that mimics a Rogerian psychologist.
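As a rough sketch of the StringBuilder approach (the module name StringBuilderSketch and the BuildList helper are made up for this example), the code below appends, replaces, and inserts text in a single buffer and converts it to a String only at the end.

```vbnet
Option Strict On
Imports System
Imports System.Text

Module StringBuilderSketch
    ' Hypothetical helper: builds a comma-separated list of the
    ' numbers 1 through iCount in one StringBuilder buffer.
    Function BuildList(ByVal iCount As Integer) As String
        Dim sbList As New StringBuilder()
        Dim i As Integer
        For i = 1 To iCount
            If sbList.Length > 0 Then
                sbList.Append(", ")
            End If
            sbList.Append(i)
        Next
        Return sbList.ToString()
    End Function

    Sub Main()
        Dim sbText As New StringBuilder("Don't try to shift gears")
        sbText.Append(" while trying")    ' Add text to the end of the buffer
        sbText.Replace("shift", "change") ' Replace text inside the same buffer
        sbText.Insert(0, "Please: ")      ' Insert text at the front
        Console.WriteLine(sbText.ToString())
        Console.WriteLine(BuildList(5))
    End Sub
End Module
```

For a handful of concatenations the & operator is perfectly fine; StringBuilder tends to pay off inside loops or when a string is edited many times.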

The String class has several methods that permit you to get substrings, insert and delete substrings, split a string into an array of substrings, find a string's length, and perform many other operations. For example, the Split method takes a delimiter and breaks a string into an array of substrings. Likewise, the Join method returns a concatenated string from an array of substrings (similar to those created by the Split method). I've written many string parsers over the years to extract fields embedded in a string sent from a legacy mainframe somewhere. The Split procedure was added in the Visual Basic 6 language as a built-in function. Now in Visual Basic .NET you can use a single line of code to split a string into substrings.
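A minimal sketch of Split and Join working together. The record layout, the module name SplitJoinSketch, and the ParseRecord helper are assumptions made for the illustration, not part of any real mainframe format.

```vbnet
Option Strict On
Imports System

Module SplitJoinSketch
    ' Hypothetical helper: breaks a delimited record into its fields.
    Function ParseRecord(ByVal sRecord As String, ByVal cDelimiter As Char) As String()
        Return sRecord.Split(cDelimiter)
    End Function

    Sub Main()
        Dim sRecord As String = "1001|Connell|Seattle" ' Made-up sample record
        Dim sFields() As String = ParseRecord(sRecord, "|"c)

        Console.WriteLine(sFields.Length)             ' Number of fields found
        Console.WriteLine(sFields(1))                 ' Second field (arrays start at 0)
        Console.WriteLine(String.Join(", ", sFields)) ' Fields glued back together
    End Sub
End Module
```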

To determine the length of a string, simply use its Length property. Because the MessageBox class is expecting a string, you have to call the ToString method to convert the numeric value if you want to display the string's length in a message box. (The length returned by the Length property is in characters, not bytes.) You can see how this strong typing will save headaches when our program is released.

MessageBox.Show(sString1.Length.ToString) 'Displays 71

Likewise, finding substrings in a string is also a breeze. If you are looking for a character or substring in a string, call the IndexOf method. If the substring (or character) is present, IndexOf returns the index of the first occurrence of the substring or character. If the substring is not present, IndexOf returns -1.

Dim sString1 As String = "Don't try to shift gears while trying"
MessageBox.Show(sString1.IndexOf("s").ToString) 'Displays 13

The Substring method corresponds to the classic Visual Basic Mid$ function. The Substring method has two overloads. One constructs and returns a substring starting at the specified index to the end of the string. The other extracts and returns a substring at a starting index of the specified length. (Remember that strings start at index 0, not 1.)
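The two Substring overloads, combined with IndexOf, can be sketched as follows. The module name SubstringSketch and the TextAfter helper are hypothetical; the helper uses an ordinal search so the result does not depend on the current culture.

```vbnet
Option Strict On
Imports System

Module SubstringSketch
    ' Hypothetical helper: returns the text that follows the first
    ' occurrence of sMarker, or "" when sMarker is not found.
    Function TextAfter(ByVal sSource As String, ByVal sMarker As String) As String
        Dim iPos As Integer = sSource.IndexOf(sMarker, StringComparison.Ordinal)
        If iPos < 0 Then
            Return ""
        End If
        Return sSource.Substring(iPos + sMarker.Length)
    End Function

    Sub Main()
        Dim sString1 As String = "Don't try to shift gears while trying"

        Console.WriteLine(sString1.Substring(13))        ' From index 13 to the end
        Console.WriteLine(sString1.Substring(13, 5))     ' Five characters starting at index 13
        Console.WriteLine(TextAfter(sString1, "gears ")) ' Everything after the marker
    End Sub
End Module
```

Checking the IndexOf result for -1 before calling Substring matters, because passing a negative start index raises an ArgumentOutOfRangeException.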

Copying and Cloning a String

The Copy method of a string object makes a duplicate of a specified string. If the original string is empty, the copy of the string is also empty.

sString2 = String.Copy(sString1)
MessageBox.Show(sString2) 'The contents of sString1 are copied
                          ' to sString2

Now let's say we want to use the Clone method of our sString1 object. What's the difference between the Copy method and the Clone method? Whereas Copy makes a duplicate string, Clone simply returns a reference to the same string.

sString2 = sString1.Clone().ToString
MessageBox.Show(sString2) 'sString2 contains a reference
                          ' to sString1
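The difference between the two methods only shows up when the references are compared, as in this small sketch. The module name CopyCloneSketch and the SameObject helper are illustrative, and recent .NET versions flag String.Copy as obsolete, so expect a compiler warning there.

```vbnet
Option Strict On
Imports System

Module CopyCloneSketch
    ' Hypothetical helper: True when both variables refer to the same object.
    Function SameObject(ByVal oA As Object, ByVal oB As Object) As Boolean
        Return Object.ReferenceEquals(oA, oB)
    End Function

    Sub Main()
        Dim sString1 As String = "You can see the morning"
        Dim sCopy As String = String.Copy(sString1)   ' A new object with the same characters
        Dim sClone As String = CStr(sString1.Clone()) ' The very same object

        Console.WriteLine(SameObject(sString1, sCopy))  ' Copy is a separate object
        Console.WriteLine(SameObject(sString1, sClone)) ' Clone is the original object
        Console.WriteLine(sString1.Equals(sCopy))       ' The values match
        Console.WriteLine(sString1.Equals(sClone))      ' The values match
    End Sub
End Module
```

Because strings are immutable, sharing a reference through Clone is harmless: nothing can change the shared string behind your back.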

The Equals method returns True if we compare sString1 to sString2 after we either copy or clone sString1 to sString2. We can also use the Equals method to compare two unrelated strings.

Dim sString1 As String = "You can see the morning, " & _
    "but I can see the light."
Dim sString2 As String = "You can see the morning, " & _
    "but I can see the light."
MessageBox.Show(sString1.Equals(sString2).ToString) 'Returns True

As you might imagine, we can also use the comparison operator on strings.

If sString1 = sString2 Then
    MessageBox.Show("Strings are equal")
Else
    MessageBox.Show("Strings are not equal")
End If

Microsoft .NET provides the handy curly bracket ({}) formatting characters that allow you to insert variables into strings. For example, we might have a program that dynamically presents output. We can use the {} characters to serve as placeholders for variables—{0} represents the first variable in a comma-separated list, {1} represents the second variable, and so on, as shown in the following example.

Dim iAdd As Integer = 2
Dim sString As String = String.Format("{0} and {0} = {1}", _
    iAdd, iAdd + iAdd)
MessageBox.Show(sString)

The output from this code is shown in Figure 5-10.

Figure 5-10

The {} formatting characters serve as placeholders for variables.

Using the {} characters is much easier than having to build a string and manually concatenate the variables, as was required in classic Visual Basic.

Dim sString As String = iAdd & " and " & iAdd & " = " & _
    (iAdd + iAdd)

In addition, you can use the {} characters along with other formatting characters to format a string.

Dim iCost As Integer = 954
Dim sString As String = String.Format("Your total is {0:C}.", _
    iCost)
MessageBox.Show(sString, "String Format Placeholders", _
    MessageBoxButtons.OK, MessageBoxIcon.Information)

The output from this code is shown in Figure 5-11.

Figure 5-11

Formatting characters make it easy to enhance string output.
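Placeholders can also carry a width and a format code. The sketch below is an assumption-laden illustration: the module name FormatSketch, the Describe helper, and the sample values are invented, and the invariant culture is passed explicitly so that the separators do not vary with the machine's regional settings (the C currency format shown earlier does depend on those settings).

```vbnet
Option Strict On
Imports System
Imports System.Globalization

Module FormatSketch
    ' Hypothetical helper: formats one line of an order summary.
    '   {0,-8}  item name, left-aligned in a field eight characters wide
    '   {1:D3}  quantity, padded with leading zeros to three digits
    '   {2:N2}  price, with thousands separators and two decimal places
    Function Describe(ByVal sItem As String, ByVal iQty As Integer, _
                      ByVal dPrice As Decimal) As String
        Return String.Format(CultureInfo.InvariantCulture, _
            "{0,-8}|{1:D3}|{2:N2}", sItem, iQty, dPrice)
    End Function

    Sub Main()
        Console.WriteLine(Describe("burrito", 7, 1234.5D))
    End Sub
End Module
```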

For more information about formatting strings see the topics "Creating New Strings" and "Picture Numeric Format Strings" in the Visual Basic .NET help file.

How to Efficiently Use the Help File

All programmers, from novice to guru, will often rely on a product's help file. While this sidebar is about searching the .NET help files, the techniques described here are also helpful when using Internet search engines. Once you know the tricks, you can find whatever you want very quickly, either in the help file or on the Internet.

Let's say you want to find whether the members of the File class provide some particular functionality that you need. If you simply search for the word file, you are soon greeted with 500 or more help entries. Looking through this large number of entries to find what you are looking for would be quite a daunting chore.

If you type +file +member, you force the search engine to display only pages that contain both the word file and the word member. Or, if you type +file member, all pages with the word file will be displayed, whether or not they contain the word member. Although these criteria yield better search results, neither criterion ensures the words are even within a few paragraphs of each other. In many cases this type of search can be helpful, but not when you are looking for a very specific item. If you search for +"file member" in quotation marks, however, you force the search engine to display pages where the two words are together, and your search is much more fruitful.

In general, use lowercase in your terms. Many Internet search engines are case sensitive to uppercase letters—if you enter an initial-capped word, such as File, only pages that contain the initial-capped word File are returned. If you search for file with a lowercase f, pages that contain both File and file are returned. The IDE search engine, however, will always change whatever you enter to lowercase. Spend some time poking around the help file. It's probably the best investment of time to start really learning Visual Basic .NET.



Coding Techniques for Microsoft Visual Basic .NET
ISBN: 0735612544
EAN: 2147483647
Year: 2002
Pages: 123
Authors: John Connell
