Chapter 6: Strings, IO, Formatting, and Parsing


This chapter focuses on the various API-related topics that were added to the exam for Java 5. J2SE comes with an enormous API, and a lot of your work as a Java programmer will revolve around using this API. The exam team chose to focus on APIs for I/O, formatting, and parsing. Each of these topics could fill an entire book. Fortunately, you won't have to become a total I/O or regex guru to do well on the exam. The intention of the exam team was to include just the basic aspects of these technologies, and in this chapter we cover more than you'll need to get through the String, I/O, formatting, and parsing objectives on the exam.

Certification Objective: String, StringBuilder, and StringBuffer (Exam Objective 3.1)

3.1 Discuss the differences between the String, StringBuilder, and StringBuffer classes.

Everything you needed to know about Strings for the SCJP 1.4 exam, you'll need to know for the SCJP 5 exam. Plus, Sun added the StringBuilder class to the API to provide faster, non-synchronized StringBuffer capability. The StringBuilder class has exactly the same methods as the old StringBuffer class, but StringBuilder is faster because its methods aren't synchronized. Both classes give you String-like objects that handle some of the String class's shortcomings (like immutability).

The String Class

This section covers the String class, and the key concept to understand is that once a String object is created, it can never be changed—so what is happening when a String object seems to be changing? Let's find out.

Strings Are Immutable Objects

We'll start with a little background information about strings. You may not need this for the test, but a little context will help. Handling "strings" of characters is a fundamental aspect of most programming languages. In Java, each character in a string is a 16-bit Unicode character. Because Unicode characters are 16 bits (not the skimpy 7 or 8 bits that ASCII provides), a rich, international set of characters is easily represented in Unicode.

In Java, strings are objects. Just like other objects, you can create an instance of a String with the new keyword, as follows:

 String s = new String(); 

This line of code creates a new object of class String, and assigns it to the reference variable s. So far, String objects seem just like other objects. Now, let's give the String a value:

 s = "abcdef"; 

As you might expect, the String class has about a zillion constructors, so you can use a more efficient shortcut:

 String s = new String("abcdef"); 

And just because you'll use strings all the time, you can even say this:

 String s = "abcdef"; 

There are some subtle differences between these options that we'll discuss later, but what they have in common is that they all create a new String object, with a value of "abcdef", and assign it to a reference variable s. Now let's say that you want a second reference to the String object referred to by s:

 String s2 = s;    //   refer s2 to the same String as s 

So far so good. String objects seem to be behaving just like other objects, so what's all the fuss about? Immutability! (What the heck is immutability?) Once you have assigned a String a value, that value can never change: it's immutable, frozen solid, won't budge, fini, done. (We'll talk about why later, don't let us forget.) The good news is that while the String object is immutable, its reference variable is not, so to continue with our previous example:

 s = s.concat(" more stuff");  // the concat() method 'appends'
                               // a literal to the end

Now wait just a minute, didn't we just say that Strings were immutable? So what's all this "appending to the end of the string" talk? Excellent question: let's look at what really happened.

The VM took the value of String s (which was "abcdef"), and tacked " more stuff" onto the end, giving us the value "abcdef more stuff". Since Strings are immutable, the VM couldn't stuff this new value into the old String referenced by s, so it created a new String object, gave it the value "abcdef more stuff", and made s refer to it. At this point in our example, we have two String objects: the first one we created, with the value "abcdef", and the second one with the value "abcdef more stuff". Technically there are now three String objects, because the literal argument to concat, " more stuff", is itself a new String object. But we have references only to "abcdef" (referenced by s2) and "abcdef more stuff" (referenced by s).

What if we didn't have the foresight or luck to create a second reference variable for the "abcdef" String before we called s = s.concat(" more stuff");? In that case, the original, unchanged String containing "abcdef" would still exist in memory, but it would be considered "lost." No code in our program has any way to reference it; it is lost to us. Note, however, that the original "abcdef" String didn't change (it can't, remember, it's immutable); only the reference variable s was changed, so that it would refer to a different String. Figure 6-1 shows what happens on the heap when you reassign a reference variable. Note that the dashed line indicates a deleted reference.

Figure 6-1: String objects and their reference variables

To review our first example:

 String s = "abcdef";   // create a new String object, with
                        // value "abcdef", refer s to it
 String s2 = s;         // create a 2nd reference variable
                        // referring to the same String

 // create a new String object, with value "abcdef more stuff",
 // refer s to it. (Change s's reference from the old String
 // to the new String.) (Remember s2 is still referring to
 // the original "abcdef" String.)
 s = s.concat(" more stuff");
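If you'd like to watch the two references part ways, here is a minimal runnable sketch of the same example. The class name ImmutabilityDemo and the helper sameObject() are ours, not part of the Java API; the helper simply wraps the == comparison, which tests whether two references point at the same object.

```java
class ImmutabilityDemo {
    // Hypothetical helper: true only if both references point at the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String s = "abcdef";
        String s2 = s;                          // second reference, same object
        System.out.println(sameObject(s, s2));  // true

        s = s.concat(" more stuff");            // s now refers to a new String
        System.out.println(sameObject(s, s2));  // false
        System.out.println(s2);                 // abcdef
        System.out.println(s);                  // abcdef more stuff
    }
}
```

The original object never changed; s2 still sees "abcdef" after s has moved on.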

Let's look at another example:

 String x = "Java";
 x.concat(" Rules!");
 System.out.println("x = " + x);   // the output is "x = Java"

The first line is straightforward: create a new String object, give it the value "Java", and refer x to it. Next the VM creates a second String object with the value "Java Rules!" but nothing refers to it. The second String object is instantly lost; you can't get to it. The reference variable x still refers to the original String with the value "Java". Figure 6-2 shows creating a String without assigning a reference to it.

Figure 6-2: A String object is abandoned upon creation

Let's expand this current example. We started with

 String x = "Java";
 x.concat(" Rules!");
 System.out.println("x = " + x);   // the output is: x = Java

Now let's add

 x.toUpperCase();
 System.out.println("x = " + x);   // the output is still:
                                   // x = Java

(We actually did just create a new String object with the value "JAVA", but it was lost, and x still refers to the original, unchanged String "Java".) How about adding

 x.replace('a', 'X');
 System.out.println("x = " + x);   // the output is still:
                                   // x = Java

Can you determine what happened? The VM created yet another new String object, with the value "JXvX" (replacing the a's with X's), but once again this new String was lost, leaving x to refer to the original unchanged and unchangeable String object, with the value "Java". In all of these cases we called various String methods to create a new String by altering an existing String, but we never assigned the newly created String to a reference variable.

But we can put a small spin on the previous example:

 String x = "Java";
 x = x.concat(" Rules!");           // Now we're assigning the
                                    // new String to x
 System.out.println("x = " + x);    // the output will be:
                                    // x = Java Rules!

This time, when the VM runs the second line, a new String object is created with the value of "Java Rules!", and x is set to reference it. But wait, there's more—now the original String object, "Java", has been lost, and no one is referring to it. So in both examples we created two String objects and only one reference variable, so one of the two String objects was left out in the cold. See Figure 6-3 for a graphic depiction of this sad story. The dashed line indicates a deleted reference.

Figure 6-3: An old String object is abandoned

Let's take this example a little further:

 String x = "Java";
 x = x.concat(" Rules!");
 System.out.println("x = " + x);    // the output is:
                                    // x = Java Rules!

 x.toLowerCase();                   // no assignment, create a
                                    // new, abandoned String
 System.out.println("x = " + x);    // no assignment, the output
                                    // is still: x = Java Rules!

 x = x.toLowerCase();               // create a new String,
                                    // assigned to x
 System.out.println("x = " + x);    // the assignment causes the
                                    // output: x = java rules!

The preceding discussion contains the keys to understanding Java String immutability. If you really, really get the examples and diagrams, backwards and forwards, you should get 80 percent of the String questions on the exam correct.

We will cover more details about Strings next, but make no mistake—in terms of bang for your buck, what we've already covered is by far the most important part of understanding how String objects work in Java.

We'll finish this section by presenting an example of the kind of devilish String question you might expect to see on the exam. Take the time to work it out on paper (as a hint, try to keep track of how many objects and reference variables there are, and which ones refer to which).

 String s1 = "spring ";
 String s2 = s1 + "summer ";
 s1.concat("fall ");
 s2.concat(s1);
 s1 += "winter ";
 System.out.println(s1 + " " + s2);

What is the output? For extra credit, how many String objects and how many reference variables were created prior to the println statement?

Answer: The result of this code fragment is "spring winter  spring summer " (every literal ends with a space, so there are two spaces in the middle and one at the end). There are two reference variables, s1 and s2. There were a total of eight String objects created as follows: "spring ", "summer " (lost), "spring summer ", "fall " (lost), "spring fall " (lost), "spring summer spring " (lost), "winter " (lost), "spring winter " (at this point "spring " is lost). Only two of the eight String objects are not lost in this process.
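If you want to check your paper answer, the fragment can be wrapped in a small runnable class. This is just a sketch: the class name SeasonsPuzzle and the method run() are ours, and main() prints the result between square brackets so the spacing is visible (each literal ends with a space, so the line has two spaces in the middle and one at the end).

```java
class SeasonsPuzzle {
    // Runs the exam fragment and returns what println would have printed.
    static String run() {
        String s1 = "spring ";
        String s2 = s1 + "summer ";
        s1.concat("fall ");      // new String created, result discarded
        s2.concat(s1);           // new String created, result discarded
        s1 += "winter ";         // s1 now refers to "spring winter "
        return s1 + " " + s2;
    }

    public static void main(String[] args) {
        System.out.println("[" + run() + "]");  // [spring winter  spring summer ]
    }
}
```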

Important Facts About Strings and Memory

In this section we'll discuss how Java handles String objects in memory, and some of the reasons behind these behaviors.

One of the key goals of any good programming language is to make efficient use of memory. As applications grow, it's very common for String literals to occupy large amounts of a program's memory, and there is often a lot of redundancy within the universe of String literals for a program. To make Java more memory efficient, the JVM sets aside a special area of memory called the "String constant pool." When the compiler encounters a String literal, it checks the pool to see if an identical String already exists. If a match is found, the reference to the new literal is directed to the existing String, and no new String literal object is created. (The existing String simply has an additional reference.) Now we can start to see why making String objects immutable is such a good idea. If several reference variables refer to the same String without even knowing it, it would be very bad if any of them could change the String's value.

You might say, "Well that's all well and good, but what if someone overrides the String class functionality; couldn't that cause problems in the pool?" That's one of the main reasons that the String class is marked final. Nobody can override the behaviors of any of the String methods, so you can rest assured that the String objects you are counting on to be immutable will, in fact, be immutable.

Creating New Strings

Earlier we promised to talk more about the subtle differences between the various methods of creating a String. Let's look at a couple of examples of how a String might be created, and let's further assume that no other String objects exist in the pool:

 String s = "abc";     //  creates one String object and one
                       //  reference variable

In this simple case, "abc" will go in the pool and s will refer to it.

 String s = new String("abc");  // creates two objects,
                                // and one reference variable

In this case, because we used the new keyword, Java will create a new String object in normal (non-pool) memory, and s will refer to it. In addition, the literal "abc" will be placed in the pool.
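The difference between the two forms shows up when you compare references with ==. The following is a small sketch under our own names (StringPoolDemo and its three methods are illustrative). It also uses String.intern(), a real String method that this section doesn't otherwise cover; it returns the pooled String with the same value.

```java
class StringPoolDemo {
    // Two identical literals refer to one pooled object.
    static boolean literalsShared() {
        String a = "abc";
        String b = "abc";
        return a == b;
    }

    // new String(...) always creates a separate object with an equal value.
    static boolean newIsDistinctButEqual() {
        String a = "abc";
        String c = new String("abc");
        return a != c && a.equals(c);
    }

    // intern() hands back the pooled object for that value.
    static boolean internReturnsPooled() {
        String a = "abc";
        String c = new String("abc");
        return a == c.intern();
    }

    public static void main(String[] args) {
        System.out.println(literalsShared());         // true
        System.out.println(newIsDistinctButEqual());  // true
        System.out.println(internReturnsPooled());    // true
    }
}
```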

Important Methods in the String Class

The following methods are some of the more commonly used methods in the String class, and also the ones that you're most likely to encounter on the exam.

  • charAt() Returns the character located at the specified index

  • concat() Appends one String to the end of another ("+" also works)

  • equalsIgnoreCase() Determines the equality of two Strings, ignoring case

  • length() Returns the number of characters in a String

  • replace() Replaces occurrences of a character with a new character

  • substring() Returns a part of a String

  • toLowerCase() Returns a String with uppercase characters converted to lowercase

  • toString() Returns the value of a String

  • toUpperCase() Returns a String with lowercase characters converted to uppercase

  • trim() Removes whitespace from the ends of a String

Let's look at these methods in more detail.

public char charAt(int index) This method returns the character located at the String's specified index. Remember, String indexes are zero-based—for example,

 String x = "airplane";
 System.out.println( x.charAt(2) );       // output is 'r'

public String concat(String s) This method returns a String with the value of the String passed in to the method appended to the end of the String used to invoke the method—for example,

 String x = "taxi";
 System.out.println( x.concat(" cab") ); // output is "taxi cab"

The overloaded + and += operators perform functions similar to the concat() method—for example,

 String x = "library";
 System.out.println( x + " card");    // output is "library card"

 String x = "Atlantic";
 x += " ocean";
 System.out.println( x );             // output is "Atlantic ocean"

In the preceding "Atlantic ocean" example, notice that the value of x really did change! Remember that the += operator is an assignment operator, so line 2 is really creating a new String, "Atlantic ocean", and assigning it to the x variable. After line 2 executes, the original String x was referring to, "Atlantic", is abandoned.

public boolean equalsIgnoreCase(String s) This method returns a boolean value (true or false) depending on whether the value of the String in the argument is the same as the value of the String used to invoke the method. This method will return true even when characters in the String objects being compared have differing cases, for example,

 String x = "Exit";
 System.out.println( x.equalsIgnoreCase("EXIT"));   // is "true"
 System.out.println( x.equalsIgnoreCase("tixe"));   // is "false"

public int length() This method returns the length of the String used to invoke the method—for example,

 String x = "01234567";
 System.out.println( x.length() );      // output is 8

public String replace(char oldChar, char newChar) This method returns a String whose value is that of the String used to invoke the method, updated so that any occurrence of the char in the first argument is replaced by the char in the second argument, for example,

 String x = "oxoxoxox";
 System.out.println( x.replace('x', 'X') );     // output is
                                                // "oXoXoXoX"

public String substring(int begin) public String substring(int begin, int end) The substring() method is used to return a part (or substring) of the String used to invoke the method. The first argument represents the starting location (zero-based) of the substring. If the call has only one argument, the substring returned will include the characters to the end of the original String. If the call has two arguments, the substring returned will end with the character located in the nth position of the original String where n is the second argument. Unfortunately, the ending argument is not zero-based, so if the second argument is 7, the last character in the returned String will be in the original String's 7 position, which is index 6 (ouch). Let's look at some examples:

image from book
Exam Watch

Arrays have an attribute (not a method), called length. You may encounter questions in the exam that attempt to use the length() method on an array, or that attempt to use the length attribute on a String. Both cause compiler errors—for example,

 String x = "test";
 System.out.println( x.length );     // compiler error

or

 String[] x = new String[3];
 System.out.println( x.length() );   // compiler error
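For contrast, here is a minimal sketch of the two forms that do compile. The class name LengthDemo and both helper methods are ours, chosen only to keep the method call and the attribute side by side.

```java
class LengthDemo {
    // A String reports its size through the length() method.
    static int stringLength(String s) {
        return s.length();
    }

    // An array reports its size through the length attribute (no parentheses).
    static int arrayLength(String[] a) {
        return a.length;
    }

    public static void main(String[] args) {
        String x = "test";
        String[] names = new String[3];
        System.out.println(stringLength(x));     // 4
        System.out.println(arrayLength(names));  // 3
    }
}
```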

image from book

 String x = "0123456789";         // as if by magic, the value
                                  // of each char
                                  // is the same as its index!
 System.out.println( x.substring(5) );     // output is "56789"
 System.out.println( x.substring(5, 8));   // output is "567"

The first example should be easy: start at index 5 and return the rest of the String. The second example should be read as follows: start at index 5 and return the characters up to and including the 8th position (index 7).
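One way to keep the two-argument form straight is to notice that the returned String always has end minus begin characters. Here is a short sketch that checks this; SubstringDemo and its helper between() are our own names, and between() is just a thin wrapper around substring(begin, end).

```java
class SubstringDemo {
    // Hypothetical wrapper around the two-argument substring().
    static String between(String s, int begin, int end) {
        return s.substring(begin, end);
    }

    public static void main(String[] args) {
        String x = "0123456789";
        System.out.println(x.substring(5));              // 56789
        System.out.println(between(x, 5, 8));            // 567
        System.out.println(between(x, 5, 8).length());   // 3, which is 8 - 5
        System.out.println(between(x, 0, x.length()));   // 0123456789
    }
}
```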

public String toLowerCase() This method returns a String whose value is the String used to invoke the method, but with any uppercase characters converted to lowercase—for example,

 String x = "A New Moon";
 System.out.println( x.toLowerCase() );    // output is
                                           // "a new moon"

public String toString() This method returns the value of the String used to invoke the method. What? Why would you need such a seemingly "do nothing" method? All objects in Java must have a toString() method, which typically returns a String that in some meaningful way describes the object in question. In the case of a String object, what more meaningful way than the String's value? For the sake of consistency, here's an example:

 String x = "big surprise";
 System.out.println( x.toString() );     // output -
                                         // reader's exercise

public String toUpperCase() This method returns a String whose value is the String used to invoke the method, but with any lowercase characters converted to uppercase—for example,

 String x = "A New Moon";
 System.out.println( x.toUpperCase() );    // output is
                                           // "A NEW MOON"

public String trim() This method returns a String whose value is the String used to invoke the method, but with any leading or trailing blank spaces removed — for example,

 String x = "         hi        ";
 System.out.println( x + "x" );             // result is
                                            // "         hi        x"
 System.out.println( x.trim() + "x");       // result is "hix"

The StringBuffer and StringBuilder Classes

The java.lang.StringBuffer and java.lang.StringBuilder classes should be used when you have to make a lot of modifications to strings of characters. As we discussed in the previous section, String objects are immutable, so if you choose to do a lot of manipulations with String objects, you will end up with a lot of abandoned String objects in memory. (Even in these days of gigabytes of RAM, it's not a good idea to waste precious memory on discarded String objects.) On the other hand, objects of type StringBuffer and StringBuilder can be modified over and over again without leaving behind a great effluence of discarded String objects.

On the Job 

A common use for StringBuffers and StringBuilders is file I/O when large, ever-changing streams of input are being handled by the program. In these cases, large blocks of characters are handled as units, and StringBuffer objects are the ideal way to handle a block of data, pass it on, and then reuse the same memory to handle the next block of data.

StringBuffer vs. StringBuilder

The StringBuilder class was added in Java 5. It has exactly the same API as the StringBuffer class, except StringBuilder is not thread safe. In other words, its methods are not synchronized. (More about thread safety in Chapter 9.) Sun recommends that you use StringBuilder instead of StringBuffer whenever possible because StringBuilder will run faster (and perhaps jump higher). So apart from synchronization, anything we say about StringBuilder's methods holds true for StringBuffer's methods, and vice versa. The exam might use these classes in the creation of thread-safe applications, and we'll discuss how that works in Chapter 9.
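To see what "the same API" means in practice, here is a small sketch that performs identical calls on both classes. The class BuilderVsBuffer and its two methods are our own illustrative names; only the declared type differs between them.

```java
class BuilderVsBuffer {
    // Uses the unsynchronized StringBuilder.
    static String withBuilder() {
        StringBuilder sb = new StringBuilder("abc");
        sb.append("def").reverse();
        return sb.toString();
    }

    // Same calls on the synchronized StringBuffer.
    static String withBuffer() {
        StringBuffer sb = new StringBuffer("abc");
        sb.append("def").reverse();
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withBuilder());  // fedcba
        System.out.println(withBuffer());   // fedcba
    }
}
```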

Using StringBuilder and StringBuffer

In the previous section, we saw how the exam might test your understanding of String immutability with code fragments like this:

 String x = "abc";
 x.concat("def");
 System.out.println("x = " + x);     //  output is "x = abc"

Because no new assignment was made, the new String object created with the concat() method was abandoned instantly. We also saw examples like this:

 String x = "abc";
 x = x.concat("def");
 System.out.println("x = " + x);    // output is "x = abcdef"

We got a nice new String out of the deal, but the downside is that the old String "abc" has been lost in the String pool, thus wasting memory. If we were using a StringBuffer instead of a String, the code would look like this:

 StringBuffer sb = new StringBuffer("abc");
 sb.append("def");
 System.out.println("sb = " + sb);     // output is "sb = abcdef"

All of the StringBuffer methods we will discuss operate on the value of the StringBuffer object invoking the method. So a call to sb.append("def"); is actually appending "def" to itself (StringBuffer sb). In fact, these method calls can be chained to each other, for example,

 StringBuilder sb = new StringBuilder("abc");
 sb.append("def").reverse().insert(3, "---");
 System.out.println( sb );              // output is "fed---cba"

Notice that in each of the previous two examples there was a single call to new, so in each example we weren't creating any extra objects. Each example needed only a single StringXxx object to execute.
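Chaining works because each of these methods returns the very object it was invoked on. The sketch below checks that directly; ChainingDemo and its method names are ours. Note that inserting "---" at offset 3 adds exactly three dashes and no spaces.

```java
class ChainingDemo {
    // append() returns the same StringBuilder it was called on.
    static boolean appendReturnsSameObject() {
        StringBuilder sb = new StringBuilder("abc");
        StringBuilder returned = sb.append("def");
        return returned == sb;
    }

    // The chained example: every call modifies the one StringBuilder.
    static String chained() {
        StringBuilder sb = new StringBuilder("abc");
        sb.append("def").reverse().insert(3, "---");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(appendReturnsSameObject());  // true
        System.out.println(chained());                  // fed---cba
    }
}
```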

Important Methods in the StringBuffer and StringBuilder Classes

The following method returns a StringXxx object with the argument's value appended to the value of the object that invoked the method.

public synchronized StringBuffer append(String s) As we've seen earlier, this method will update the value of the object that invoked the method, whether or not the return is assigned to a variable. This method will take many different arguments, including boolean, char, double, float, int, long, and others, but the most likely use on the exam will be a String argument, for example,

 StringBuffer sb = new StringBuffer("set ");
 sb.append("point");
 System.out.println(sb);       // output is "set point"

 StringBuffer sb2 = new StringBuffer("pi = ");
 sb2.append(3.14159f);
 System.out.println(sb2);      // output is "pi = 3.14159"

public StringBuilder delete(int start, int end) This method returns a StringBuilder object and updates the value of the StringBuilder object that invoked the method call. In both cases, a substring is removed from the original object. The starting index of the substring to be removed is defined by the first argument (which is zero-based), and the ending index of the substring to be removed is defined by the second argument (but it is one-based)! Study the following example carefully:

 StringBuilder sb = new StringBuilder("0123456789");
 System.out.println(sb.delete(4,6));      // output is "01236789"
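As with substring(), it can help to think of delete(start, end) as removing end minus start characters, beginning at index start. A quick sketch to confirm it, with DeleteDemo and deleteRange() as our own illustrative names:

```java
class DeleteDemo {
    // Hypothetical helper: deletes [start, end) from a copy of s.
    static String deleteRange(String s, int start, int end) {
        return new StringBuilder(s).delete(start, end).toString();
    }

    public static void main(String[] args) {
        String original = "0123456789";
        String result = deleteRange(original, 4, 6);
        System.out.println(result);                               // 01236789
        System.out.println(original.length() - result.length());  // 2, which is 6 - 4
    }
}
```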

image from book
Exam Watch

The exam will probably test your knowledge of the difference between String and StringBuffer objects. Because StringBuffer objects are changeable, the following code fragment will behave differently than a similar code fragment that uses String objects:

 StringBuffer sb = new StringBuffer("abc");
 sb.append("def");
 System.out.println( sb );

In this case, the output will be: "abcdef"

image from book

public StringBuilder insert(int offset, String s) This method returns a StringBuilder object and updates the value of the StringBuilder object that invoked the method call. In both cases, the String passed in to the second argument is inserted into the original StringBuilder starting at the offset location represented by the first argument (the offset is zero-based). Again, other types of data can be passed in through the second argument (boolean, char, double, float, int, long, and so on), but the String argument is the one you're most likely to see:

 StringBuilder sb = new StringBuilder("01234567");
 sb.insert(4, "---");
 System.out.println( sb );          //   output is  "0123---4567"

public synchronized StringBuffer reverse() This method returns a StringBuffer object and updates the value of the StringBuffer object that invoked the method call. In both cases, the characters in the StringBuffer are reversed, the first character becoming the last, the second becoming the second to the last, and so on:

 StringBuffer sb = new StringBuffer("A man a plan a canal Panama");
 sb.reverse();
 System.out.println(sb); // output: "amanaP lanac a nalp a nam A"

public String toString() This method returns the value of the StringBuffer object that invoked the method call as a String:

 StringBuffer sb = new StringBuffer("test string");
 System.out.println( sb.toString() );  // output is "test string"

That's it for StringBuffers and StringBuilders. If you take only one thing away from this section, it's that unlike Strings, StringBuffer objects and StringBuilder objects can be changed.
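Here is one last side-by-side sketch of that takeaway, reusing the two-reference idea from the start of the chapter. The names AliasDemo, stringAlias(), and builderAlias() are ours. With a String, the second reference keeps seeing the old value; with a StringBuilder, both references see the change because there is only one object and it really was modified.

```java
class AliasDemo {
    // String: concat() makes a new object, so s2 still sees "abc".
    static String stringAlias() {
        String s = "abc";
        String s2 = s;
        s = s.concat("def");
        return s2;
    }

    // StringBuilder: append() changes the shared object, so sb2 sees "abcdef".
    static String builderAlias() {
        StringBuilder sb = new StringBuilder("abc");
        StringBuilder sb2 = sb;
        sb.append("def");
        return sb2.toString();
    }

    public static void main(String[] args) {
        System.out.println(stringAlias());   // abc
        System.out.println(builderAlias());  // abcdef
    }
}
```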

image from book
Exam Watch

Many of the exam questions covering this chapter's topics use a tricky (and not very readable) bit of Java syntax known as "chained methods." A statement with chained methods has this general form:

 result = method1().method2().method3(); 

In theory, any number of methods can be chained in this fashion, although typically you won't see more than three. Here's how to decipher these "handy Java shortcuts" when you encounter them:

  1. Determine what the leftmost method call will return (let's call it x).

  2. Use x as the object invoking the second (from the left) method. If there are only two chained methods, the result of the second method call is the expression's result.

  3. If there is a third method, the result of the second method call is used to invoke the third method, whose result is the expression's result, for example:

 String x = "abc";
 String y = x.concat("def").toUpperCase().replace('C','x'); // chained methods
 System.out.println("y = " + y); // result is "y = ABxDEF"

Let's look at what happened. The literal def was concatenated to abc, creating a temporary, intermediate String (soon to be lost), with the value abcdef. The toUpperCase() method created a new (soon to be lost) temporary String with the value ABCDEF. The replace() method created a final String with the value ABxDEF, and referred y to it.
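If a chain is hard to read, you can always unroll it into one statement per call. The sketch below does that for the example above and confirms that both forms give the same answer. ChainUnrolled and the variable names step1, step2, and step3 are ours.

```java
class ChainUnrolled {
    // The chained form.
    static String chained() {
        String x = "abc";
        return x.concat("def").toUpperCase().replace('C', 'x');
    }

    // The same work, one method call per line.
    static String stepByStep() {
        String x = "abc";
        String step1 = x.concat("def");          // abcdef
        String step2 = step1.toUpperCase();      // ABCDEF
        String step3 = step2.replace('C', 'x');  // ABxDEF
        return step3;
    }

    public static void main(String[] args) {
        System.out.println("y = " + chained());     // y = ABxDEF
        System.out.println("y = " + stepByStep());  // y = ABxDEF
    }
}
```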

image from book




SCJP Sun Certified Programmer for Java 5 Study Guide Exam 310-055
ISBN: 0072253606
Year: 2006
Pages: 131
