Using the StringTokenizer Class

   


Sometimes a String must be broken into pieces, called tokens. Where the string is split depends on which delimiters are specified for the tokenization process. By default, whitespace characters (space, tab, newline, carriage return, and form feed) separate the tokens, but you can specify any other characters as delimiters.

Imagine that you had to parse a text file and separate each word so that you could later determine whether a particular word was in your search list and how many times it appeared. Scanning a large String by hand for the beginnings and endings of words would be tedious. Fortunately, Java provides this feature in the StringTokenizer class.

The StringTokenizer class is in the java.util package. You must import the java.util package, or at least the StringTokenizer class, into your application if you plan to use it. You can construct a StringTokenizer object in several ways, but the easiest is to supply the string you want to tokenize as the constructor's single argument, like this:

 StringTokenizer tokenizer = new StringTokenizer("One Two Three Four Five"); 

This type of string tokenizer uses whitespace characters as the separators (called delimiters) between the tokens. You can also supply a different set of delimiters as a second constructor argument. To get a token, you call either the nextElement or the nextToken method. The nextElement method returns an Object, whereas nextToken returns a String. The following is an example of getting the next token as a String:

 String token = tokenizer.nextToken(); 
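The two-argument constructor mentioned above can be sketched as follows. This is a minimal illustration rather than part of the book's listings: the class name DelimiterExample and the helper method split are hypothetical, and the code assumes a JDK recent enough to support generics and the diamond operator. Every character in the second argument is treated as a delimiter on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class DelimiterExample {

    // Hypothetical helper: collects every token of text, splitting on any
    // single character contained in the delimiters string.
    public static List<String> split(String text, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text, delimiters);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Both ',' and ';' act as delimiters here.
        System.out.println(split("One,Two;Three", ",;"));  // [One, Two, Three]

        // nextElement returns the same kind of token, typed as Object.
        StringTokenizer tokenizer = new StringTokenizer("alpha|beta", "|");
        Object first = tokenizer.nextElement();
        String second = tokenizer.nextToken();
        System.out.println(first + " " + second);          // alpha beta
    }
}
```

Note that adjacent delimiters do not produce empty tokens: the tokenizer skips over runs of delimiter characters.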

Each time you call nextToken, you get the next token in the string. Usually, you extract tokens using a while loop. To control the while loop, you call the hasMoreTokens method, which returns true as long as there are more tokens in the string. A typical tokenizer loop might look like this:

 while ( tokenizer.hasMoreTokens() ) {
    String token = tokenizer.nextToken();
 }

As you might have guessed, the StringTokenizer keeps track of its current position in the String that it is tokenizing. Certain methods, such as nextToken, advance the current position to the start of the next token. The tokenizer extracts a substring of the String that it is tokenizing and returns it from the nextToken method. The original String is not modified during this process.
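The behavior described above can be observed with a short sketch. The class name TokenizerPositionExample and the helper throwsWhenExhausted are hypothetical names chosen for illustration. The sketch also shows what happens if nextToken is called after the position has moved past the last token: a NoSuchElementException is thrown, which is why loops are normally guarded by hasMoreTokens.

```java
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

public class TokenizerPositionExample {

    // Hypothetical helper: consumes every token, then reports whether one
    // further nextToken call throws NoSuchElementException.
    public static boolean throwsWhenExhausted(String text) {
        StringTokenizer tokenizer = new StringTokenizer(text);
        while (tokenizer.hasMoreTokens()) {
            tokenizer.nextToken();
        }
        try {
            tokenizer.nextToken();
            return false;
        } catch (NoSuchElementException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        String source = "One Two Three";
        StringTokenizer tokenizer = new StringTokenizer(source);

        // Each call advances the tokenizer's internal position.
        String first = tokenizer.nextToken();
        String second = tokenizer.nextToken();
        System.out.println(first + ", " + second);        // One, Two

        // The String being tokenized is left untouched.
        System.out.println(source);                       // One Two Three

        System.out.println(throwsWhenExhausted(source));  // true
    }
}
```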

You can also determine how many tokens are in the string by calling the countTokens() method:

 StringTokenizer tokenizer = new StringTokenizer("One Two Three Four Five");
 int count = tokenizer.countTokens();

In this example, count equals 5.
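One detail worth knowing is that countTokens reports how many tokens remain from the current position, not how many the string held originally, and calling it does not advance the position. The following sketch illustrates this; the class name CountTokensExample and the helper remainingAfter are hypothetical.

```java
import java.util.StringTokenizer;

public class CountTokensExample {

    // Hypothetical helper: consumes the given number of tokens, then
    // returns how many tokens countTokens still reports.
    public static int remainingAfter(String text, int consumed) {
        StringTokenizer tokenizer = new StringTokenizer(text);
        for (int i = 0; i < consumed; i++) {
            tokenizer.nextToken();
        }
        return tokenizer.countTokens();
    }

    public static void main(String[] args) {
        String text = "One Two Three Four Five";
        System.out.println(remainingAfter(text, 0));  // 5
        System.out.println(remainingAfter(text, 2));  // 3
        System.out.println(remainingAfter(text, 5));  // 0
    }
}
```

If you need the total number of tokens, call countTokens before extracting any of them.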

Getting All the Tokens from a File

Listing 8.5 shows a Java application that reads a file line by line and counts every token in the file that matches a word that you pass in on the command line.

Listing 8.5 WordSearchExample.java: An Application That Tokenizes Strings

 import java.util.StringTokenizer;
 import java.io.*;

 public class WordSearchExample
 {
   private String fileName = null;

   // Constructor that stores the name of the file to search
   public WordSearchExample( String fileName )
   {
     this.fileName = fileName;
   }

   // Private Accessor
   private String getFilename()
   {
     return fileName;
   }

   // Method that actually performs the search
   public int doSearch( String word )
   {
     LineNumberReader reader = null;
     int count = 0;
     StringTokenizer tokenizer = null;
     String currentString = null;
     String tempString = null;

     try
     {
       reader = new LineNumberReader( new FileReader( getFilename() ) );
       while( (currentString = reader.readLine()) != null )
       {
         // No sense tokenizing an empty string
         if ( currentString.equals( "" ) )
           continue;

         System.out.println( "Searching: " + currentString );
         tokenizer = new StringTokenizer( currentString );
         while( tokenizer.hasMoreTokens() )
         {
           // Tokens are never empty, so a plain equals check is enough
           tempString = tokenizer.nextToken();
           if ( tempString.equals( word ) )
           {
             System.out.println(
               "Found on line " + reader.getLineNumber() + ": " + currentString );
             count++;
           }
         }
       }
     }
     catch( IOException ex )
     {
       System.out.println(
         "Problem locating or opening the file: " + getFilename() );
     }
     finally
     {
       // Release the file handle whether or not the search succeeded
       if ( reader != null )
       {
         try { reader.close(); } catch( IOException ignored ) { }
       }
     }

     // return the number of instances that were found
     return count;
   }

   // Main Method
   public static void main( String[] args )
   {
     if ( args.length != 2 )
     {
       System.out.println( "Usage: java WordSearchExample <Filename> <Word>" );
       System.exit( 1 );
     }

     // Get the strings passed in from the command line
     String fileName = args[0];
     String word = args[1];

     // Create an instance of the Example Class
     WordSearchExample example = new WordSearchExample( fileName );
     int count = example.doSearch( word );
     System.out.println( count + " instances of the word: " + word + " found!" );
   }
 }

Here is the output when the word String is searched for in the text file using the WordSearchExample from Listing 8.5:

 C:\jdk1.3se_book\classes>java WordSearchExample c:\jdk1.3se_book\sample.txt String
 Searching: String handling in C or C++ (the languages that inspired Java) is infamously clunky.
 Found on line 1: String handling in C or C++ (the languages that inspired Java) is infamously clunky.
 Searching: Java solves that problem the same way many C++ programmers do: by creating a String class.
 Found on line 2: Java solves that problem the same way many C++ programmers do: by creating a String class.
 Searching: Java's String class enables your programs to manage text strings effortlessly, using
 Found on line 3: Java's String class enables your programs to manage text strings effortlessly, using
 Searching: statements similar to those used in simpler languages such as BASIC or Pascal. Java
 Searching: also makes it easy to handle fonts, which determine the way that your text strings
 Searching: appear onscreen.
 3 instances of the word: String found!
 C:\jdk1.3se_book\classes>

Play around with this example. With little modification to the source code, you could extend it to search for multiple words in a file, to ignore case when comparing words, or to match substrings within each token.
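As one possible starting point for the case-insensitive and substring extensions, the comparison inside the token loop could be pulled out into a method such as the one sketched here. This is not part of Listing 8.5: the class name WordMatchExample, the method countMatches, and its two boolean flags are all hypothetical, and the sketch works on a single line of text rather than a file.

```java
import java.util.Locale;
import java.util.StringTokenizer;

public class WordMatchExample {

    // Hypothetical helper: counts the tokens in line that match word.
    // ignoreCase - compare without regard to upper or lower case
    // substring  - count a token if it merely contains word
    public static int countMatches(String line, String word,
                                   boolean ignoreCase, boolean substring) {
        String target = ignoreCase ? word.toLowerCase(Locale.ROOT) : word;
        int count = 0;
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            String token = tokenizer.nextToken();
            String candidate = ignoreCase ? token.toLowerCase(Locale.ROOT) : token;
            boolean matches = substring ? candidate.contains(target)
                                        : candidate.equals(target);
            if (matches) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String line = "Java's String class enables your programs to manage text strings";
        System.out.println(countMatches(line, "String", false, false));  // 1
        System.out.println(countMatches(line, "string", true, false));   // 1
        System.out.println(countMatches(line, "string", true, true));    // 2
    }
}
```

Lowercasing both sides is only one way to ignore case; String.equalsIgnoreCase would serve equally well for the exact-match branch.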

   


Special Edition Using Java 2 Standard Edition
ISBN: 0789724685
Year: 1999
Pages: 353
