The package sjm.parse.tokens uses a TokenString class to hold the results of tokenizing a string. A TokenString is similar to a String, but it contains a series of tokens rather than a series of characters. Like String objects, TokenString objects are immutable, meaning that they cannot change after they are created. Figure 9.17 shows the TokenString class.

Figure 9.17. The TokenString class. A TokenString is essentially an array of Tokens. Like String, TokenString is immutable, so there is never a need to copy a TokenString.

The TokenAssembly class hides the fact that it relies on class TokenString. The TokenStringSource class, on the other hand, returns TokenString objects. If you use TokenStringSource to break up an input stream, you must understand the collaboration of several token-related classes. This example shows a collaboration of instances of these classes:

    package sjm.examples.tokens;

    import sjm.parse.*;
    import sjm.parse.tokens.*;

    /**
     * This class shows a collaboration of objects from classes
     * <code>Tokenizer</code>, <code>TokenStringSource</code>,
     * <code>TokenString</code>, <code>TokenAssembly</code>.
     */
    public class ShowTokenString {

        public static void main(String args[]) {

            // a parser that counts words
            Parser w = new Word().discard();
            w.setAssembler(new Assembler() {
                public void workOn(Assembly a) {
                    if (a.stackIsEmpty()) {
                        a.push(new Integer(1));
                    } else {
                        Integer i = (Integer) a.pop();
                        a.push(new Integer(i.intValue() + 1));
                    }
                }
            });

            // a repetition of the word counter
            Parser p = new Repetition(w);

            // consume token strings separated by semicolons
            String s = "I came; I saw; I left in peace;";
            Tokenizer t = new Tokenizer(s);
            TokenStringSource tss = new TokenStringSource(t, ";");

            // count the words in each token string
            while (tss.hasMoreTokenStrings()) {
                TokenString ts = tss.nextTokenString();
                TokenAssembly ta = new TokenAssembly(ts);
                Assembly a = p.completeMatch(ta);
                System.out.println(
                    ts + " (" + a.pop() + " words)");
            }
        }
    }

Running this class shows the word count of each semicolon-delimited section:

    I came (2 words)
    I saw (2 words)
    I left in peace (4 words)

In this example,

- A Tokenizer object breaks the input into tokens.
- A TokenStringSource object divides the tokens into TokenString objects.
- A TokenString variable holds a succession of TokenString objects from the input.
- A TokenAssembly variable holds a TokenAssembly object that wraps around a TokenString object.
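
The two ideas at work here, an immutable sequence of tokens and a source that cuts a token stream at a delimiter, can be sketched without the sjm library. The sketch below is illustrative only: TokenStringSketch, SimpleTokenString, and split are hypothetical names, not part of sjm.parse.tokens. It treats tokens as plain strings and uses whitespace splitting as a crude stand-in for a real tokenizer.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch only: hypothetical stand-ins for the ideas behind
 * TokenString and TokenStringSource, not the sjm classes themselves.
 */
public class TokenStringSketch {

    /** An immutable sequence of tokens, simplified here to plain strings. */
    public static final class SimpleTokenString {
        private final String[] tokens;

        public SimpleTokenString(List<String> tokens) {
            // Defensive copy: later changes to the caller's list
            // cannot alter this object.
            this.tokens = tokens.toArray(new String[0]);
        }

        public int length() {
            return tokens.length;
        }

        public String tokenAt(int i) {
            return tokens[i];
        }

        @Override
        public String toString() {
            return String.join(" ", tokens);
        }
    }

    /** Divides the words of input into token strings at each delimiter. */
    public static List<SimpleTokenString> split(String input, String delimiter) {
        // Pad the delimiter with spaces so that it becomes a token of its own
        // (an assumption that replaces real tokenizing rules).
        String padded = input.replace(delimiter, " " + delimiter + " ");
        List<SimpleTokenString> result = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String token : padded.trim().split("\\s+")) {
            if (token.equals(delimiter)) {
                // A delimiter closes the current token string.
                result.add(new SimpleTokenString(current));
                current = new ArrayList<>();
            } else if (!token.isEmpty()) {
                current.add(token);
            }
        }
        // Keep any trailing tokens that were not followed by a delimiter.
        if (!current.isEmpty()) {
            result.add(new SimpleTokenString(current));
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "I came; I saw; I left in peace;";
        for (SimpleTokenString ts : split(s, ";")) {
            System.out.println(ts + " (" + ts.length() + " words)");
        }
    }
}
```

Because SimpleTokenString copies its input and exposes no mutators, instances can be shared freely, which is the same reason the text gives for never needing to copy a TokenString. Unlike the example above, this sketch counts words directly rather than running a parser over an assembly.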