The discussion of primitive data types in Chapter 5 mentions that you can create a string using an array of type char. However, Java has a better way of dealing with strings—the String class in the java.lang package, which is always imported, and hence always available to you. You have already seen examples that use this String class, so now we will discuss it in more detail, as well as the methods that are supplied with it.
Why does Java use a class for strings instead of a built-in data type? This is because built-in data types or primitives are restricted to the few operators built into the language for math and manipulation purposes (like + and -). You spend so much time preparing, converting to and from, manipulating, and querying strings, that you gain significant programmer productivity from a class (versus a built-in primitive data type) given all the methods that can be supplied with it for common string operations.
For example, how many times have you written justification, trimming, and conversion code for text strings? Why not have language-supplied methods to save you the drudgery? In RPG, such language-supplied functionality comes in the form of op-codes and built-in functions. In an object-oriented language, it comes in the form of methods on a simple self-contained class. Indeed, it is through string classes in object-oriented languages that you first start to appreciate the power and elegance of objects. This great ability to encapsulate the useful methods or functions commonly needed by programmers into a class data type drives home the potential of objects and OO. For example, to support a new operation, you need only add a new method, versus designing a new operator into the language.
An important note about strings in Java is that the language designers slightly relaxed their strict object-oriented rules to allow strings to be concatenated directly with a built-in operator—the plus operator (+). They also allowed an intuitive means of instantiating strings that does not force you to use formal object-instantiation syntax (as they did for compile-time arrays).
This underscores the importance of strings and string manipulation to every programmer and their invariable prominence in every program. The more intuitive and convenient it is to use strings in a language, the more accepted that language is by programmers. Certainly, Java's goal is to "keep it simple." If you prefer to stick to more formal rules, though, you can do that, too. In other words, if you prefer to instantiate a string in the formal way and use a method call, concat, to concatenate two strings, instead of just using the + operator, that's fine. For example, you can use an intuitive style like this:
String text1 = "George";
String text2 = "Phil";
String finalText = text1 + " and " + text2;
System.out.println(finalText);
Alternatively, you can use a formal style like this:
String text1 = new String("George");
String text2 = new String("Phil");
String finalText = new String(text1);
finalText = finalText.concat(" and ");
finalText = finalText.concat(text2);
System.out.println(finalText);
The output of both examples is "George and Phil," as you would expect. These examples show that there are two ways to initialize strings—by implicitly equating them to string literals or by explicitly allocating an instance of the String class using the new operator. Once you create your strings, you can manipulate them using the many methods supplied with the String class. These samples also highlight the two means of adding strings together in Java, either via the intuitive plus operator or via the concat method of the String class. Note that in the latter, the string passed as a parameter is appended to the string represented by the String object. The actual String target object is not affected. Rather, a new String object is created and returned. Thus, the method call has no side effects.
Can you guess the output of the following?
String finalText = "George";
finalText.concat(" and Phil");
System.out.println(finalText);
The answer is "George", not "George and Phil" as you might initially expect, because the string returned by concat is discarded rather than assigned back to finalText. Do not get caught by this common mistake. Another important consideration is string equality in Java. You cannot use the equality operator (==) to compare the contents of two string objects, like this:
if (text1 == text2)
Rather, you must use the equals method, which returns true if the target string and the passed-in string are equivalent, like this:
if (text1.equals(text2)) ...
This is the single most common mistake when using the String class. The problem is that natural instantiation and the plus operator for concatenation tend to make you think of strings as primitive data types in Java. However, string variables are actually object reference variables that refer to instances of the String class. Like all object reference variables, they contain the memory address of the class instance, so the equality operator only tells you whether the two variables refer to the same instance. Two strings created separately can hold identical text and still be different instances. The operator does not compare the lengths or the individual characters; the code inside the equals method is required for that. You are all the more prone to this pitfall as an RPG IV programmer, because RPG IV has a free-form IF op-code syntax that does allow you to compare two strings (alphabetic fields) using the equality operator, as in:
IF STRING1 = STRING2
Take care in your Java coding to avoid this bug, as you will not notice it for a while, given that the compiler will not complain about it. Equality testing of object references is legal, after all, for those cases when you want to know if two variables actually do refer to the same allocated instance in memory.
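To see the difference for yourself, here is a minimal sketch. The class name StringEquality and the helper methods sameObject and sameText are our own illustrative names, not part of the Java class library. Both strings are created with the new operator so that they are guaranteed to be separate instances:

```java
public class StringEquality {
    // Hypothetical helper: true only when both variables refer to the same instance.
    public static boolean sameObject(String a, String b) {
        return a == b;
    }

    // Hypothetical helper: true when both strings hold the same characters.
    public static boolean sameText(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String text1 = new String("George");
        String text2 = new String("George");
        System.out.println(sameObject(text1, text2)); // false: two separate instances
        System.out.println(sameText(text1, text2));   // true: same characters
        System.out.println(sameObject(text1, text1)); // true: one and the same instance
    }
}
```

The first line of output is the one that surprises people: the text is identical, yet the == test fails because it compares references rather than characters.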
RPG does not have a pure string data type similar to the String class in Java. In fact, in RPG III, you were quite restricted in character field functionality. You only had room for six character literals, and so you often had to resort to compile-time arrays to code longer literals. You had the MOVE, MOVEA, and MOVEL op-codes, and the CAT, CHECK, CHEKR, COMP, SCAN, SUBST, and XLATE op-codes.
In RPG IV, life is much better! You define a string field as a fixed-length character field:
DmyString S 20A INZ('Anna Lisa')
In this example, the field myString is defined as a 20-character alphanumeric field and is given an initial value of 'Anna Lisa' using the INZ keyword. The keyword area of the D-spec is from column 44 to 79, giving lots of room, and literals that don't fit even in this can be continued by placing a plus or minus character in the last position and continuing the literal on the next line.
Once you have defined a character field, you can perform operations on it by using the same op-codes as in RPG III, plus the EVAL and EVALR op-codes, and many handy built-in functions, as you shall see. Also, you can use the plus operator for concatenation and the free-form IF statement can do comparisons of strings. You can assign a string to a variable using the traditional MOVE and MOVEL op-codes or the new free-format EVAL or EVALR op-codes with the assignment statement, as shown in Listing 7.1.
Listing 7.1: Assigning Strings with EVAL and EVALR in RPG IV
D string1         S             10A
D string2         S             10A
C                   EVAL      string1 = 'abc'
C                   EVALR     string2 = 'abc'
The EVAL op-codes are only for assignment, but are preferred to the older MOVE op-codes because they have a free-format factor two from column 36 to 79, and can continue onto the next lines, where they get all columns from 8 to 79. This means you can write expressions that are as long as you want! The difference between EVAL and EVALR is that the latter right-justifies the source value in the target field. So, in the example in Listing 7.1, string1 contains 'abc       ' while string2 contains '       abc'.
In addition to assigning literals or other fields to string fields, you can assign special figurative constants to them to initialize the values. Specifically, you can assign *BLANKS to blank out the whole field, or *ALL'x...' to assign and repeat whatever literal you specify for x..., for example:
C                   EVAL      string1 = *BLANKS
C                   EVAL      string2 = *ALL'AB'
In this example, string1 becomes '          ' (ten blanks) while string2 becomes 'ABABABABAB'. Note that these figurative constants can also be specified at declaration time on the D-spec as the parameter to the INZ keyword.
RPG III and IV use single quotes to delimit a string literal, while Java uses double quotes. Also, strings in RPG are fixed in length and always padded with blanks if needed to achieve this length. So, if you display the contents of the myString field in the example, you would see 'Anna Lisa           ', padded out to 20 characters. This is different from Java, where the size of the string is always exactly as long as the text last assigned to it. You never explicitly specify the length of a string in Java; this is done implicitly for you. For example:
String name = "Phil";   // new string, length four
name = "George";        // assigned to a new value, length six
The length of a string is exactly the length of its current contents, which can be changed with an assignment statement. The new value is never padded or truncated by Java. If you want padding with blanks, you have to explicitly specify the blanks in your literal, like this:
String myString = "Phil    "; // new string, length eight
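If you need RPG-style fixed-width padding often, you would not want to type the blanks by hand each time. The following is only a sketch of one way to do it; the class name PadDemo and the method padRight are hypothetical names of our own, not methods supplied with the String class:

```java
public class PadDemo {
    // Hypothetical helper: appends blanks until the text is at least 'width' long.
    // Text that is already long enough is returned unchanged (no truncation).
    public static String padRight(String text, int width) {
        String result = text;
        while (result.length() < width) {
            result = result + " ";
        }
        return result;
    }

    public static void main(String[] args) {
        String name = "Phil";
        System.out.println(name.length());            // 4: exactly the text assigned
        String padded = padRight(name, 8);
        System.out.println("'" + padded + "'");       // 'Phil    '
        System.out.println(padded.length());          // 8
    }
}
```

Note that padRight returns a new string each time through the loop; the original name variable still holds the four-character value.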
In fact, the idea of dynamically sized string fields that always hold the exact text they have been assigned, instead of being padded, is so wonderful that RPG itself now supports it as well. As of V4R2, you can code the VARYING keyword on a character field to have it behave similarly to Java, as in:
DmyString         S             20A   INZ('Anna Lisa')
D                                     VARYING
You still have to code a length, but that is used only as the maximum so the compiler knows how much memory to allocate. If you used DSPLY to print the contents of this field to the console and put quotes around the result, you would see 'Anna Lisa'—exactly the value it was initialized to, with no padding. This is nice.
In both RPG and Java, you might want to include an embedded quote. Since these are the delimiters, you have to use special syntax to embed them. In RPG, you double-up the embedded quote, like this:
DwithQuote S 20A INZ('Phil''s name')
In Java, you use the same backslash escape sequence you saw in Chapter 5:
String withQuote = "Phil says \" do you get it? \"";
Finally, remember that every character in Java requires two bytes of storage, even in String objects. This does not affect you as a programmer, other than to make life much easier when supporting multiple languages or countries, since you do not have to worry about codepage conversions or CCSIDs. We mention it because RPG IV also now has this capability, as of V4R4. Rather than using an A for the data type column, you can code a C, identifying this field as containing Unicode characters. You can convert between character and Unicode fields, and vice versa, using regular EVAL, EVALR, MOVE, and MOVEL operations, or the new %UCS2 built-in function. We don't cover the details of Unicode fields in RPG IV here, but if you are writing international applications, you should have a look. Further, if you are planning to have RPG code that calls Java code, or vice versa, then the Unicode data type is a perfect match for Java's String objects when passing parameters between the languages.
Table 7.1 compares all available string-manipulation op-codes and built-in functions in RPG IV to those methods available in Java. Java offers more functionality than shown here, as you will see shortly.
RPG Op-code | RPG Built-in | Description | Java String Method(s) |
---|---|---|---|
CAT (or + operator) | | Concatenate two strings | concat method or + operator |
SUBST | %SUBST | Extract a substring from a string | substring method |
SCAN | %SCAN | Scan for a substring | indexOf method |
 | %TRIM | Trim beginning and ending blanks | trim method |
 | %TRIML | Trim leading blanks | Not available |
 | %TRIMR | Trim trailing blanks | Not available |
 | %LEN | Return length of string | length method |
XLATE | | Translate a string | No xlate match, but there are toUpperCase and toLowerCase methods |
CHECK | | Check for characters | Not available |
CHECKR | | Check in reverse | Not available |
 | %CHAR | Convert various types to an outputable string | valueOf method |
 | %REPLACE | Allows replacement of a substring with another | replace method, but only replaces individual characters |
The next sections examine each of the operations listed in Table 7.1 and give examples of them for both RPG and Java. Where there is no matching method in Java, as in the case of XLATE and CHECK, you will learn how to write a method yourself to simulate the function.
Let's start with string concatenation for both RPG and Java, looking at an example to illustrate the use of this function. Suppose you have two fields: one contains a person's first name and the other contains the last name. You need to concatenate the two fields and print out the result. This is easy to do in both languages, since both support concatenating strings. In RPG, you use the CAT op-code, as shown in Listing 7.2.
Listing 7.2: Concatenating Strings in RPG Using the CAT Op-code
D first           S             10A   INZ('Mike')
D last            S             10A   INZ('Smith')
D name            S             20A   INZ(' ')
C*    Factor1       OpCode    Factor2       Result
C     first         CAT       last:1        name
C     name          DSPLY
This example uses two fields, first to represent the first name and last to represent the last name. It declares and initializes these fields right on the D-spec. However, in more complex applications, these fields may be read from the screen or from a file on disk. They can even be passed in via the command line. The C-spec uses the CAT op-code to concatenate field first to field last in factors one and two. The result of the concatenation is placed in the field name, which is specified in the result column. Notice also that :1 is specified in factor two in order to tell the compiler to insert one blank between the field values when concatenating them. The DSPLY op-code displays the value of name, which is "Mike Smith", as you would expect.
Listing 7.3 illustrates the same example written in Java.
Listing 7.3: Concatenating Strings in Java Using the concat String Method
public class Concat {
    public static void main(String args[]) {
        String first, last, name;
        first = "Mike";
        last = "Smith";
        name = first.concat(" ").concat(last);
        System.out.println("The name is: " + name);
    }
} // end class Concat
This example declares all of the string variables (first, last, and name) with the String class. As mentioned earlier, assigning a string literal to a String variable creates and initializes the object, so no new keyword is required to instantiate it. Because you want to append a blank to the first name and then add the last name to that, two concat operations are necessary; there is no equivalent to RPG's :1 trick. The double concatenation could have been done in two steps, but we chose to do it in one. We can do this because the concat method returns the concatenated string, which can then be used directly as the object of a subsequent concat operation. Notice that we use the string to be concatenated as the object in the concat call, and pass in as a parameter the string to concatenate to it. The result returned from the call is placed in the name variable and then displayed using the println method. Double quotes are required around the blank literal even though it is only one character, because concat accepts only a String parameter; a single-quoted character literal such as ' ' would not compile.
Did you notice another concatenation in the previous example? If you did, you have a sharp eye. The argument to println concatenates the string literal "The name is: " with the object reference variable name using the plus operator. This is another way of concatenating two strings. In fact, this is a fast way of concatenating two strings in an expression for both RPG and Java. RPG supports the same + operator for concatenation in an expression. Table 7.2 replaces the CAT op-code of the previous example with the EVAL op-code, and the concat Java method with the + operator.
RPG IV | Java |
---|---|
C EVAL name = first + ' ' + last | name = first + " " + last; |
Clearly, the use of the plus operator is a plus for programmers!
Next, let's take a look at the substring op-code in RPG and the corresponding substring method in Java. In RPG, you use the SUBST operation code to extract a substring from a string starting at a specified location for a specified length, as shown in Listing 7.4.
Listing 7.4: Substringing Strings in RPG Using the SUBST Op-code
D*                                         12345678901234567890123456789
DWhyJava          S             30A   INZ('Because Java is for RPG pgmrs')
D first           S              4A
D second          S              6A
D third           S              3A
D sayWhat         S             15A
C     4             SUBST     WhyJava:9     first
C     6             SUBST     WhyJava:14    second
C     3             SUBST     whyJava:21    third
C                   EVAL      sayWhat =
C                             first+' '+second+' '+third
C     sayWhat       DSPLY
C                   EVAL      *INLR = *ON
This example takes a string with the value "Because Java is for RPG pgmrs" and retrieves different strings from it to make up the string "Java is for RPG". To do this, it declares three different character fields and a field to store the results. As the example illustrates, the SUBST op-code takes the number of characters to substring in factor one, and takes the source as well as the starting position for the retrieval in factor two. For example, the first SUBST operation retrieves the value Java and places it in the result field first. When all of the values have been retrieved, the concatenation operator concatenates all fields. Finally, the resulting field is displayed using the DSPLY operation code. The result is "Java is for RPG".
Because of RPG's fixed-field padding rules, we had to declare all the fields to be the exact length needed to hold the result. As an alternative, we could have coded them all to be a maximum length, such as 20, and then used the VARYING keyword on their D-spec definitions.
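Listing 7.5 illustrates an equivalent example, written in Java, using the substring method of the String class. The parameters for the substring method have the beginning index value as the first parameter and the ending index as the second parameter. There are some subtle differences compared to RPG: character positions are zero-based rather than one-based, the second parameter is an ending position rather than a length, and that ending position is one past the last character you want extracted.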
Otherwise, the logic is similar to RPG's and easy enough to follow. There is also a second version of the substring method, which takes as input only one parameter—the starting position (again, zero-based). This returns a string containing all characters from that starting position to the end of the target string.
Listing 7.5: Substringing Strings in Java Using the substring String Method
public class Substring {
    public static void main(String args[]) {
        String whyJava, first, second, third, sayWhat;
        //         01234567890123456789012345678901234
        whyJava = "Because Java is for RPG Programmers";
        first = whyJava.substring(8,12);
        second = whyJava.substring(13,19);
        third = whyJava.substring(20,23);
        sayWhat = first + " " + second + " " + third;
        System.out.println(sayWhat);
    }
} // end class Substring
Why is it that the second parameter has to be one past the actual ending column? It turns out this makes some processing easier. For example, you don't normally know exactly what column to start the substring operation in, and what column to end it in. Instead, you will often determine this programmatically, by searching for a specific delimiting character such as a blank or comma or dollar sign. This is done in Java using the indexOf method that you'll see shortly, and it is very simple to use. You give it the character to find, and it returns the zero-based position of that character (and you can specify what position to start the search from). Well, when you are searching for the ending delimiter, you will subsequently be substringing up to the character position right before that delimiter, so this funny rule of Java's substring method makes it a little easier. For example, we could have written the first substring example from Listing 7.5 as this:
first = whyJava.substring(8, whyJava.indexOf(' ', 8));
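As a slightly fuller sketch of this delimiter-driven style, the hypothetical helper below (the class name DelimiterDemo and the method wordAt are our own inventions for illustration) extracts the word that begins at a given zero-based position by searching for the next blank. It also handles the case where no blank follows, in which indexOf returns -1:

```java
public class DelimiterDemo {
    // Hypothetical helper: returns the text from 'start' up to, but not
    // including, the next blank. If no blank follows, returns the rest.
    public static String wordAt(String text, int start) {
        int end = text.indexOf(' ', start); // zero-based position of next blank, or -1
        if (end == -1) {
            return text.substring(start);   // one-parameter form: to end of string
        }
        return text.substring(start, end);  // 'end' is one past the last wanted character
    }

    public static void main(String[] args) {
        String whyJava = "Because Java is for RPG Programmers";
        System.out.println(wordAt(whyJava, 8));   // Java
        System.out.println(wordAt(whyJava, 20));  // RPG
        System.out.println(wordAt(whyJava, 24));  // Programmers
    }
}
```

Because the position returned by indexOf is exactly the "one past the end" value that substring expects, no adjustment by one is needed in either direction.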
We don't want to string you out, so we'll move on now to the next topic.
In RPG IV, you can also use the %SUBST built-in function to accomplish the same thing in expressions. The syntax of this is %SUBST(string:start{:length}). The parameters are the same as the op-code SUBST, as you see in the following example:
EVAL secondWord = %SUBST('RPG USERS':5:5)
EVAL secondWord = %SUBST('RPG USERS':5)
In both cases, the result is "USERS". Note the similarities to Java's substring method, notwithstanding the "gotchas" mentioned.
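If you find yourself thinking in RPG terms, you could wrap the conversion in a small helper. The following is only a sketch under our own naming (SubstDemo and subst are hypothetical, not part of the String class); it accepts a one-based start and a length, as %SUBST does, and maps them onto Java's zero-based start and exclusive end:

```java
public class SubstDemo {
    // Hypothetical helper mirroring %SUBST(string:start:length).
    // 'start' is one-based; 'length' is a character count.
    public static String subst(String source, int start, int length) {
        int begin = start - 1;                          // one-based to zero-based
        return source.substring(begin, begin + length); // end index is exclusive
    }

    // Hypothetical helper mirroring %SUBST(string:start): rest of the string.
    public static String subst(String source, int start) {
        return source.substring(start - 1);
    }

    public static void main(String[] args) {
        System.out.println(subst("RPG USERS", 5, 5)); // USERS
        System.out.println(subst("RPG USERS", 5));    // USERS
    }
}
```

No error checking is done here; an out-of-range start or length simply surfaces as the exception that substring itself throws.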
One of the more commonly used functions in almost all languages is the ability to search one string for the occurrence of another. For example, you might have a string like "Java is for RPG users" and you want to find the position of the substring "RPG". Once you know this, you can simply extract the characters found after it. By searching for substrings, you can avoid hard-coding the substring parameters when you don't know the positions at compile time. You can use the SCAN operation code for this, as Listing 7.6 illustrates.
Listing 7.6: Scanning Strings for Substrings in RPG Using the SCAN Op-code
D*                                         123456789012345678901
Dstr              S             40A   INZ('Java is for RPG users')
Didx              S              3P 0
C     'RPG'         SCAN      str           idx
C     idx           DSPLY
C                   EVAL      idx = %SCAN('RPG':str)
C     idx           DSPLY
C                   EVAL      *INLR = *ON
This example defines a string field named str and initializes it to "Java is for RPG users". It finds the location of the substring "RPG" in the main string using SCAN, placing the desired substring in factor one, the source string in factor two, and the field that receives the numeric index value in the result column. When the operation is executed, idx will contain the position where the substring was found, which is 13 in this example. Note that RPG allows you to specify the start location for the search in the second part of factor two. If the start location is not specified, as in this example, the default is to start at the first character. This example also shows the %SCAN built-in function, which offers the same functionality as the op-code, but can be used in free-format expressions. The second DSPLY operation also results in 13. You could specify an optional :start parameter on the %SCAN function.
For Java, this is a simple operation, since Java supplies an indexOf method in its String class, as shown in Listing 7.7. This method takes one or two parameters. The first is the string you are looking for, and the optional second is the start location of the search (the character position). Again, this is a zero-based position, not a one-based position as in RPG. If you do not specify the start location, as with RPG, the default start value will be set to the first character position (that is, zero). In the example, the value that is printed is 12 (zero-based, again, so 12 is the thirteenth character). The indexOf method returns -1 if the given substring is not found in the input String object.
Listing 7.7: Scanning Strings for Substrings in Java Using the indexOf String Method
public class Scan {
    public static void main(String args[]) {
        //                       012345678901234567890
        String str = new String("Java is for RPG users");
        int idx = str.indexOf("RPG");
        System.out.println("RPG occurs at: " + idx);
    }
}
In addition to indexOf, Java also supplies a handy lastIndexOf method, which will search backwards for a given substring. Again, it has an optional second parameter for specifying where to start the search, but this time the search continues backward from that start position.
Finally, both indexOf and lastIndexOf support either a string parameter, as you have already seen, or a single character parameter when searching for an individual character, as in:
int dollarPos = myString.indexOf('$');
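To tie these variations together, here is a brief sketch that exercises the start-position parameter of both indexOf and lastIndexOf. The class name ScanDemo and the helper countOccurrences are hypothetical names of our own; the helper simply restarts the search just past each match until indexOf returns -1:

```java
public class ScanDemo {
    // Hypothetical helper: counts non-overlapping occurrences of 'target' in 'text'.
    public static int countOccurrences(String text, String target) {
        if (target.length() == 0) {
            return 0; // guard: an empty target would otherwise match forever
        }
        int count = 0;
        int pos = text.indexOf(target);
        while (pos != -1) {
            count++;
            pos = text.indexOf(target, pos + target.length()); // resume after the match
        }
        return count;
    }

    public static void main(String[] args) {
        //            012345678901234567890
        String str = "Java is for RPG users";
        System.out.println(str.indexOf("RPG"));         // 12
        System.out.println(str.indexOf(' ', 5));        // 7: first blank at or after position 5
        System.out.println(str.lastIndexOf(' '));       // 15: last blank in the string
        System.out.println(str.lastIndexOf(' ', 14));   // 11: searching backward from position 14
        System.out.println(countOccurrences(str, " ")); // 4
    }
}
```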
Trimming is the process of removing leading or trailing blanks from a string, and both languages have built-in support for it. In RPG, three built-in functions make it easy to trim blanks. Listing 7.8 shows the %TRIM built-in function.
Listing 7.8: Trimming Blanks in RPG Using the %TRIM Built-in Function
D leftright       S             40A   INZ(' Java is for-
D                                      RPG users ')
D temp            S             40A
C*
C                   EVAL      temp = %TRIM(leftright) + '.'
C     temp          DSPLY
C                   EVAL      *INLR = *ON
This example uses %TRIM on the EVAL op-code to trim both leading and trailing blanks in the field leftright, placing the result in temp. Note that it concatenates a period to the trimmed string so that you can see the trimmed result before RPG pads it back out to the declared length of 40. After the operation, the field contains the following:
"Java is for RPG users.                  "
In Java, this task is also easy to accomplish, using the appropriately named trim method, as shown in Listing 7.9.
Listing 7.9: Trimming Blanks in Java Using the trim String Method
public class Trim {
    public static void main(String args[]) {
        String str = " Java is for RPG users ";
        str = str.trim();
        System.out.println("Trimmed: '" + str + "'");
    }
} // end class Trim
The result is "Trimmed: 'Java is for RPG users'". Note again that Java does not pad strings out to some pre-declared length, so we did not have to concatenate a period.
Easy stuff. However, what if you only want to remove leading blanks? Or trailing blanks? In RPG it is very easy, since the language supports two additional built-ins, %TRIML and %TRIMR, for trimming left (leading) and right (trailing) blanks. However, Java has only the trim method and, unfortunately, no counterparts to %TRIML and %TRIMR. It is not brain surgery or VCR programming to write your own code to do this, however, and you will do so by the end of the chapter, after learning about the StringBuffer class.
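In the meantime, here is one possible sketch that uses only String methods. The names TrimSides, trimLeft, and trimRight are hypothetical, and this version may well differ from the one developed later in the chapter. It relies on the charAt method, which returns the character at a zero-based position, and it removes only blank characters, whereas trim also removes other whitespace and control characters:

```java
public class TrimSides {
    // Hypothetical counterpart to %TRIML: removes leading blanks only.
    public static String trimLeft(String text) {
        int start = 0;
        while (start < text.length() && text.charAt(start) == ' ') {
            start++;
        }
        return text.substring(start);
    }

    // Hypothetical counterpart to %TRIMR: removes trailing blanks only.
    public static String trimRight(String text) {
        int end = text.length();
        while (end > 0 && text.charAt(end - 1) == ' ') {
            end--;
        }
        return text.substring(0, end); // 'end' is one past the last kept character
    }

    public static void main(String[] args) {
        String str = "  Java is for RPG users  ";
        System.out.println("'" + trimLeft(str) + "'");  // 'Java is for RPG users  '
        System.out.println("'" + trimRight(str) + "'"); // '  Java is for RPG users'
    }
}
```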
Determining the length of a string, to decide if it is empty or needs truncation or padding, is a simple task in both languages. In RPG IV, you simply use the %LEN built-in function, specifying a field or string literal or built-in function as a parameter, as shown in Listing 7.10.
Listing 7.10: Getting the Length of a Field in RPG IV
D aString         S             40A   INZ('       Java is for-
D                                      RPG users       ')
D len1            S              9  0
D len2            S              9  0
C*
C                   EVAL      len1 = %LEN(aString)
C                   EVAL      len2 = %LEN(%TRIM(aString))
C     len1          DSPLY
C     len2          DSPLY
C                   EVAL      *INLR = *ON
This code shows two examples of the %LEN built-in function—one takes a character field as input and the other takes a nested built-in function call %TRIM as input. The displayed output of this program is 40 and 21. Why 40 for the first one, even though the string on the INZ keyword is 35 long? Because the field is declared as 40 characters long. If you were to add the VARYING keyword to the definition of aString, you would get 35 from the %LEN built-in function.
In Java, you invoke the length method, as shown in Listing 7.11. When you run this example, you get 35 and 21. Remember that in Java, characters are two bytes long, because they are Unicode characters. However, the length returned by the length method is the number of characters, not the number of bytes—the latter is actually two times the former. The same is true for RPG Unicode fields and the %LEN built-in function. All String class methods that take or return an index number deal with the character position, not the byte offset, so you rarely need to worry about the fact that characters are two bytes long.
Listing 7.11: Getting the Length of a String in Java
public class Length {
    public static void main(String args[]) {
        // 35 characters in all: the 21-character text plus surrounding blanks
        String aString = "       Java is for RPG users       ";
        int len1 = aString.length();
        int len2 = aString.trim().length();
        System.out.println(len1);
        System.out.println(len2);
    }
} // end class Length
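If you are curious about the character-versus-byte distinction, the following sketch makes it visible. The class name LengthVsBytes and the helper utf16ByteCount are hypothetical names of our own. The helper asks for the string's bytes in a two-byte Unicode encoding (UTF-16, big-endian, chosen here because it adds no byte-order mark), which is an assumption made purely for illustration:

```java
import java.nio.charset.StandardCharsets;

public class LengthVsBytes {
    // Hypothetical helper: number of bytes when the text is encoded as UTF-16.
    public static int utf16ByteCount(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String word = "Java";
        System.out.println(word.length());        // 4 characters
        System.out.println(utf16ByteCount(word)); // 8 bytes, two per character
    }
}
```

In everyday code you work with length and character positions only; the byte count matters mainly when text leaves the program, for example when it is written to a file or passed to another language.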
So far, you have seen RPG op-codes compared to available Java methods. As you recall from Table 7.1, a few RPG op-codes or built-ins are simply not available in Java. For example, the XLATE op-code in RPG has no apparent equivalent in Java—at least, not yet. What to do in this case? Write your own method! First, let's review the RPG support. Listing 7.12 shows an example of the RPG XLATE op-code.
Listing 7.12: Translating Characters in RPG Using the XLATE Op-code
D from            C                   CONST('GPR4')
D to              C                   CONST('VAJA')
D source          S              4A   INZ('RPG4')
D target          S              4A
C     from:to       XLATE     source        target
C     target        DSPLY
C                   EVAL      *INLR = *ON
As you know, XLATE translates the source string in factor two to another sequence of characters, depending on the from and to strings specified in factor one. The result of this translation is placed in the result field. In particular, all characters in the source string with a match in the from string are translated to the corresponding characters in the to string. The rule is that the lengths of the from, to, and source strings must all be the same. The example translates R to J, P to A, G to V, and 4 to A, respectively. With the value RPG4 in the source string specified in factor two, the result is JAVA after the operation. Note that we decided to make the from and to fields constants, using RPG IV's syntax for constants.
Java has no corresponding method in its String class, but it does have a related method named replace, which takes two character parameters as input. It replaces all occurrences of the first character in the target string object with the second character. It sounds similar to RPG's XLATE, except that replace only replaces a single character, not a string of characters. Not to worry: you can write your own Java method that emulates RPG's XLATE op-code by using the replace method repeatedly, once for each character in a given string of from characters.
What is interesting is that you cannot extend String (discussed in a later chapter) because the Java language designers made it final, preventing this. Thus, String augmentation methods like this will be created as traditional, standalone functions—that is, they will be defined as static, and take as parameters whatever they need. But even static methods must exist in a class, so we create them in an arbitrary class named RPGString, which is where all of the remaining methods in this chapter will go.
Listing 7.13 is the Java equivalent to RPG's XLATE op-code. (For consistency, we have named the method xlate.) The required parameters are, of course, the source string to be translated, followed by the from and to strings, and finally the start position, where the translation should start. To be consistent with Java's string methods, this start position is zero-based. To be consistent with RPG, the start position value should be optional, defaulting to the first character if not passed. To support an optional parameter at the end of the parameter list in a Java method, simply supply a second method with the same name that does not specify or accept that last parameter, which we have done. This second overloaded method can simply call the first full version of the method and pass in the default value for the missing parameter. In this case, this is zero for the first character.
Listing 7.13: Translating Characters in Java Using the xlate Method
public class RPGString {
    public static String xlate(String source, String fromChars,
                               String toChars, int start) {
        String resultString;
        // minimal input error checking
        if (fromChars.length() != toChars.length())
            return new String("BAD INPUT!");
        if (start > source.length() || start < 0)
            return new String("BAD INPUT!");
        // first off, get the substring to be xlated...
        resultString = source.substring(start);
        // xlate each fromChars char to same pos in toChars
        for (int i = 0; i < fromChars.length(); i++)
            resultString = resultString.replace(fromChars.charAt(i),
                                                toChars.charAt(i));
        // now append xlated part to non-xlated part
        resultString = source.substring(0,start) + resultString;
        return resultString;
    } // end xlate method

    public static String xlate(String source, String fromChars,
                               String toChars) {
        return xlate(source, fromChars, toChars, 0);
    } // end xlate method two
} // end RPGString class
The code in the first and primary xlate method is reasonably straightforward—you first check to make sure the input is valid, then create a substring of the source that excludes the characters before the given start position. Next, for every character in the from string, you use the String class replace method to replace all occurrences of that character with the character in the corresponding position of the to string. Finally, you append that to the substring of the source up to the start position, and return this resulting string. To get an individual character out of a string, you must use the charAt method and supply the zero-based index of the character.
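One subtlety of translating with repeated replace calls is worth noting: when a character appears in both the from and to strings (say, when swapping 'a' and 'b'), a later pass re-translates characters produced by an earlier pass. The following is a minimal sketch of a single-pass alternative. It is not part of the original listing; the names XlateSketch and xlateOnce are hypothetical, and bad input is reported with an exception rather than a "BAD INPUT!" string.

```java
class XlateSketch {
    // Hypothetical single-pass variant: each source character is looked up
    // once in fromChars, so a translated character is never translated again.
    static String xlateOnce(String source, String fromChars,
                            String toChars, int start) {
        if (fromChars.length() != toChars.length())
            throw new IllegalArgumentException("from/to lengths differ");
        if (start < 0 || start > source.length())
            throw new IllegalArgumentException("bad start position");
        char[] chars = source.toCharArray();
        for (int i = start; i < chars.length; i++) {
            int pos = fromChars.indexOf(chars[i]);
            if (pos != -1)
                chars[i] = toChars.charAt(pos);
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        // Swapping 'a' and 'b'; sequential replace calls would give "aaaa"
        System.out.println(xlateOnce("abab", "ab", "ba", 0)); // prints "baba"
    }
}
```

For from and to strings that share no characters, this sketch and the replace-based version should give the same result.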
To test this, we supply a main method in our class so that we can call it from the command line and see the results. (Supplying a main method with test cases for each of your handwritten classes is a good idea, by the way.) Listing 7.14 shows the test case, which tests both versions of the method: first without specifying a start position, and then with a start position.
Listing 7.14: Testing the xlate Method
public static void main(String args[])
{
    /*-------------------*/
    /* Test xlate method */
    /*-------------------*/
    //         "012345678901234567890";
    String src="RPGP is for you Joo!";
    String from = "RPG";
    String to = "JAV";
    System.out.println("Input string : '" + src + "'");
    src=RPGString.xlate(src, from, to);
    System.out.println("Output string1: '" + src + "'");
    from = "J";
    to = "t";
    src=RPGString.xlate(src, from, to, 16);
    System.out.println("Output string2: '" + src + "'");
} //end main method
Note the calls to xlate are qualified with the class name RPGString. Because main is in the same class as the methods being called, this is not necessary. However, we did this to illustrate how code in any other class would have to look. The example translates the characters in RPG to the corresponding characters in JAVA, and then translates the character J to the character t, starting at position 16 (again, zero-based). If we started at zero, the first J would be translated, which is not what we want. The final result is as follows:
Input string : 'RPGP is for you Joo!'
Output string1: 'JAVA is for you Joo!'
Output string2: 'JAVA is for you too!'
One function you'll often require is string translation to uppercase or lowercase. There is no language-supplied function for this in RPG, but there is in Java. However, you can accomplish this task in RPG using, once again, the XLATE op-code, using all lowercase characters for the from string and all uppercase characters for the to string, as shown in Listing 7.15.
Listing 7.15: Translating Case in RPG Using the XLATE Op-code
D LOWER           C                   'abcdefghijklmnopqrstuvwxyz'
D UPPER           C                   'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
D string          S             30A   INZ('Java is for rpg users')
C     string        DSPLY
C     LOWER:UPPER   XLATE     string        string
C     string        DSPLY
C                   EVAL      *INLR = *ON
Two named constant strings, LOWER and UPPER, are defined to contain all the lowercase characters and their matching uppercase characters. To illustrate how this works, a field named string is defined containing the string "Java is for rpg users". Next, the XLATE op-code is used with the value 'LOWER:UPPER' in factor one and the string variable in the result column. After executing this operation, the result is the whole string in uppercase: 'JAVA IS FOR RPG USERS'.
Need we say this is a great opportunity for a procedure? It could be named, say, UpperCase, and take a string as input and return the uppercase version. Of course, it would only support character fields of a single fixed length unless you used VARYING length fields and specified the OPTIONS(*VARSIZE) keyword for the procedure parameter.
In Java, converting strings from uppercase to lowercase and vice versa is even simpler, as Java supplies intuitive methods to do this. The first is toUpperCase, which translates the target String object to all uppercase. The second is toLowerCase, which translates the target String object to all lowercase. For example, see Listing 7.16.
Listing 7.16: Translating Case in Java Using the toUpperCase and toLowerCase Methods
String str = new String("Java is for rpg users");
str = str.toUpperCase();
System.out.println("String in uppercase: " + str);
str = str.toLowerCase();
System.out.println("String in lowercase: " + str);
Compiling and running this example results in the following:
String in uppercase: JAVA IS FOR RPG USERS
String in lowercase: java is for rpg users
Note: The RPG example does not handle international characters, such as those containing an umlaut, while the Java methods do. That is because Java characters are Unicode-based, so they inherently support international characters.
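As a small illustration of this point, the sketch below uppercases a string containing an umlaut. The class name is hypothetical, the character is written as a Unicode escape so the source file encoding does not matter, and passing Locale.ROOT is our own choice so that the result does not depend on the machine's default locale.

```java
import java.util.Locale;

class UmlautCaseSketch {
    // Uppercase without depending on the default locale
    static String upper(String text) {
        return text.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String name = "m\u00fcller"; // \u00fc is a lowercase u with an umlaut
        // The result holds \u00dc, the uppercase U with an umlaut
        System.out.println(upper(name));
    }
}
```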
As with translating characters, RPG has language support to easily handle checking for the existence of characters, while Java does not have a supplied method. RPG has two op-codes, CHECK and CHECKR. These operations verify that each character in the base string specified in factor two is among the characters in the search string specified in factor one. Each character in the base string is compared with all of the characters in the search string. If a match exists, the next character is verified. Otherwise, the index value indicating the position of the unmatched character in the base string is placed in the result field, and the search is stopped. If a match is found for all characters in the base string, zero is returned in the result field. In the case of CHECK, verification of characters begins at the leftmost character, whereas for CHECKR, verification starts at the rightmost character. Listing 7.17 shows two examples, one for CHECK and the other for CHECKR.
Listing 7.17: Verifying Character Existence in RPG Using the CHECK and CHECKR Op-codes
D NUMBERS         C                   CONST('0123456789')
D pos             S              9  0
D base            S              7A   INZ('*22300*')
C     NUMBERS       CHECK     base:2        pos
C     pos           DSPLY
C     NUMBERS       CHECKR    base:6        pos
C     pos           DSPLY
C                   EVAL      *INLR = *ON
Checking characters is most commonly used to check whether a field that should be numeric contains non-numeric characters, or vice versa. The example in Listing 7.17 checks to see if a string of numeric digits contains any non-numeric characters. It starts by defining the set of numeric digits, zero through nine, and storing them in the constant field NUMBERS. The character field base is initialized to the string "*22300*" and is the field to be checked. After executing the CHECK operation, the value in the result field is 7. It is not 1, as you may have expected, because the second part of factor two, which is the start position, contains 2. This tells the compiler to start verification at the second position.
The following CHECKR operation uses parameters similar to those of the CHECK op-code, except that the start position is specified as six. If a start position were not specified, it would default to the ending character position. The result after executing the CHECKR is one in the pos result field. Note also that the result field for both operations can be a numeric array.
As mentioned earlier, Java does not have methods similar to CHECK and CHECKR for character verification. As with character translation, you need to write your own methods to take care of this. Listing 7.18 contains the code to accomplish character verification. (Note that the same class name RPGString is used as in the previous example, thus building up a number of useful static string methods in this same class.)
Listing 7.18: Verifying Character Existence in Java Using check Methods
public static int check(String search, String base, int start)
{
    // minimal error checking
    if (start >= base.length() || start < 0)
        return -2;
    // scan each char of base for match in search...
    for (int idx = start; idx < base.length(); idx++)
        if (search.indexOf(base.charAt(idx)) == -1)
            return idx;
    // return constant indicating match found for all
    return -1;
}

public static int check(String search, String base)
{
    return check(search, base, 0);
}
Two check methods are defined to simulate RPG's CHECK op-code: one takes a starting position index and the other does not. The latter simply calls the former with zero for the starting position. The algorithm first checks the validity of the input parameters, then scans each character in the given base string for an occurrence in the given search string. If all characters have a match, the special constant -1 is returned. Otherwise, the index position of the first non-matching character in the base string is returned. To be consistent with Java String class methods, the methods accept a zero-based starting position and return a zero-based index position. Because of this, they cannot return zero when all characters match, as RPG does, because zero is a valid index position. For this reason, they return -1.
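If you would rather see RPG-style results, the two conventions are easy to bridge, because adding one to a zero-based index gives the one-based position and also turns the -1 "all matched" value into RPG's zero. The sketch below shows the idea. The wrapper name checkRpgStyle is hypothetical, the check method is repeated only so that the example stands alone, and throwing an exception for a bad start position is our own choice.

```java
class CheckSketch {
    // Same logic as the zero-based check method described above
    static int check(String search, String base, int start) {
        if (start >= base.length() || start < 0)
            return -2;
        for (int idx = start; idx < base.length(); idx++)
            if (search.indexOf(base.charAt(idx)) == -1)
                return idx;
        return -1;
    }

    // Hypothetical wrapper: one-based start and result, 0 when all match
    static int checkRpgStyle(String search, String base, int start) {
        int result = check(search, base, start - 1);
        if (result == -2)
            throw new IllegalArgumentException("bad start position");
        return result + 1; // -1 (all matched) becomes 0
    }

    public static void main(String[] args) {
        // Mirrors the RPG CHECK example: start at position 2 of "*22300*"
        System.out.println(checkRpgStyle("0123456789", "*22300*", 2)); // prints "7"
        System.out.println(checkRpgStyle("0123456789", "22300", 1));   // prints "0"
    }
}
```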
Listing 7.19 defines two more methods, this time to simulate CHECKR with and without a starting position parameter. These are similar to the check methods; the only differences are the loop in the first method and the default start position in the second. Basically, you need to loop backwards through the base string, and you need to default to the last character position when no start position parameter is passed.
Listing 7.19: Verifying Character Existence from the Right in Java with checkR Methods
public static int checkR(String search, String base, int start)
{
    // minimal error checking
    if (start >= base.length() || start < 0)
        return -2;
    // scan each char of base for match in search...
    for (int idx = start; idx >= 0; idx--)
        if (search.indexOf(base.charAt(idx)) == -1)
            return idx;
    // return constant indicating match found for all
    return -1;
}

public static int checkR(String search, String base)
{
    return checkR(search, base, base.length()-1);
}
Listing 7.20 shows the code in main to test these methods.
Listing 7.20: Testing the check and checkR Methods in Java
String digits = "0123456789";
String test = "*22300*";
int result;
result = RPGString.check(digits, test);
System.out.println("result is: " + result);
result = RPGString.check(digits, test, 1);
System.out.println("result is: " + result);
result = RPGString.checkR(digits, test);
System.out.println("result is: " + result);
result = RPGString.checkR(digits, test, 5);
System.out.println("result is: " + result);
Compiling and running it gives the following:
result is: 0
result is: 6
result is: 6
result is: 0
The usual purpose in using the RPG CHECK op-code is to verify that a given string contains numeric data. This is possible in Java with our new check method. However, for completeness, we show you another way. The Character class discussed in Chapter 5 is a class version of the char data type in Java. You will find in this class a number of worthwhile methods, including one named isDigit. This is a static method that takes any single character and returns true if the character is a digit (zero to nine, as well as the digit characters of other Unicode scripts). So, to test a whole string, you can simply call this method for each of the characters, as shown in the isNumeric method in Listing 7.21.
Listing 7.21: A Method for Testing if a String Is All Numeric Digits
public static boolean isNumeric(String inputString) { boolean allNumeric = true; for (int idx=0; idx
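The listing above breaks off partway through its for statement. Going by the description that precedes it, a minimal sketch of how such a method could be written follows. The loop condition, loop body, and the class name IsNumericSketch are our assumptions rather than the original code. Note that, written this way, an empty string is reported as numeric because the loop never finds a non-digit.

```java
class IsNumericSketch {
    // Returns true when every character of inputString is a digit.
    static boolean isNumeric(String inputString) {
        boolean allNumeric = true;
        // stop at the first non-digit character
        for (int idx = 0; idx < inputString.length() && allNumeric; idx++)
            if (!Character.isDigit(inputString.charAt(idx)))
                allNumeric = false;
        return allNumeric;
    }

    public static void main(String[] args) {
        System.out.println(isNumeric("22300"));   // prints "true"
        System.out.println(isNumeric("*22300*")); // prints "false"
    }
}
```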
Recall the discussion at the beginning of the chapter about the concat method, and how it does not affect the String object on which you invoke it, but rather returns a new String object. You have seen that this is also true of other string manipulation methods like toUpperCase and replace. This is because the String class is immutable—that is, you cannot change a String object, you can only use methods that return new String objects. In many cases, the original string object is no longer used and is swept up by the garbage collector.
This read-only behavior of strings can have performance implications for calculations that do a lot of string manipulating. For example, this is true of any code that builds up a string by concatenating characters inside a loop. For this reason, Java supplies a second string class named StringBuffer that is mutable—it can be changed directly using supplied methods. This class is completely independent of the String class. That is, although some methods are common between the two, StringBuffer also has its own unique set of methods for altering the object directly, which you will see shortly.
If you need to dynamically change the strings in your method, you should use StringBuffer instead of String. Both classes support methods to convert back and forth between them. For example, you can use a StringBuffer object to do your string manipulation, and then, once the string is complete, convert it back to a String object using the toString method supplied in StringBuffer for this purpose. In fact, this conversion back and forth between String and StringBuffer classes has the added advantage of allowing you to use methods available in both classes by simply converting from one class to the other. You will almost always want to accept and return String objects, not StringBuffer objects, from your methods, so this conversion is often done at the beginning and end of your method. For example, methods for significant string manipulations might follow this format:
public String workOnString(String input)
{
    StringBuffer workString = new StringBuffer(input);
    // do manipulation work on the workString variable
    return workString.toString();
}
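As one concrete instance of this format, the sketch below builds up a string by appending inside a loop, which is exactly the kind of work where a mutable scratchpad pays off. The names RepeatSketch and repeat are illustrative only.

```java
class RepeatSketch {
    // Accepts and returns a String, but does the loop work in a StringBuffer
    static String repeat(String input, int times) {
        StringBuffer workString = new StringBuffer();
        for (int i = 0; i < times; i++)
            workString.append(input);   // changes workString in place
        return workString.toString();
    }

    public static void main(String[] args) {
        System.out.println(repeat("ab", 3)); // prints "ababab"
    }
}
```

Doing the same with the String concat method or the plus operator would create a new String object on every pass through the loop.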
How do you declare a string using the StringBuffer class? You must use the formal way, with the new operator, optionally specifying a string literal or String object as input:
StringBuffer aName = new StringBuffer("Angelica Farr");
There are no language extensions to allow intuitive instantiation like '= "this is a string"' as there are for Strings. Similarly, there are no language extensions for easy concatenation of StringBuffer objects using the plus sign as there are for Strings. (Of course, a StringBuffer object can still be an operand of + when the other operand is a String, as in the parameter to System.out.println; like all data types, it is converted to a string first.) To concatenate strings to a StringBuffer object, use the append method:
StringBuffer quotedName = new StringBuffer("George");
quotedName.append(" and ").append("Phil");
Notice how this method does have a side effect on the object it works against, so you do not need to assign the result to another variable as you would with the concat method in the String class. This method returns the current StringBuffer object, so you can string together multiple method calls in one statement, as shown here. The append method is also convenient in that there are many overloaded versions of it supporting all the primitive data types as the parameter, and conversion to a string is done for you, for example:
boolean flag = true;
StringBuffer output = new StringBuffer("flag value = ").append(flag);
System.out.println(output); // "flag value = true"
This results in the output "flag value = true". The append method also accepts String objects as input. In fact, it will accept any object as input! For objects, it simply calls the object's toString method to convert it to a string.
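To illustrate the last point, here is a small sketch in which a user-written class supplies its own toString method and is then appended to a StringBuffer. The Part class and its fields are invented purely for this illustration.

```java
class AppendObjectSketch {
    // Hypothetical class with its own toString method
    static class Part {
        private final String id;
        private final int quantity;

        Part(String id, int quantity) {
            this.id = id;
            this.quantity = quantity;
        }

        public String toString() {
            return id + " x " + quantity;
        }
    }

    static String describe(Part part) {
        StringBuffer output = new StringBuffer("part: ");
        output.append(part);                          // uses part.toString()
        output.append(", in stock = ").append(true);  // boolean version
        return output.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(new Part("A100", 3)));
        // prints "part: A100 x 3, in stock = true"
    }
}
```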
You do not always want to change your string by appending to it; sometimes you want to insert new strings into the middle of it. The StringBuffer class supports this with an insert method, with a number of overloaded versions similar to append, allowing all manner of data types to be inserted after being converted to string format. All versions of the insert method take an integer insertion-point index as the first parameter and the actual string or other data type to be inserted as the second parameter, as in:
StringBuffer string1 = new StringBuffer("GORE");
string1.insert(1,"E");
string1.insert(4,"G");
System.out.println(string1);
This results in the output "GEORGE". Notice that the insertion point is a zero-based index: the new text is placed immediately before the character currently at that index. In addition to append and insert, there are setCharAt and charAt methods for changing a particular character value in place and retrieving the character value at a specified zero-based position. A method named getChars can copy out a substring, but into a character array, not a String. This could, however, be converted to a StringBuffer by using the version of append or insert that accepts a character array as input.
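In the StringBuffer class, the single-character accessors are named charAt and setCharAt. The following sketch exercises them together with getChars and the character-array version of append; the class and variable names are made up for the example.

```java
class CharAccessSketch {
    static String demo() {
        StringBuffer name = new StringBuffer("GEORGE");
        char first = name.charAt(0);      // 'G'
        name.setCharAt(0, 'J');           // buffer now holds "JEORGE"
        char[] middle = new char[3];
        name.getChars(1, 4, middle, 0);   // copies "EOR" (indexes 1 to 3)
        StringBuffer result = new StringBuffer();
        // append has versions for char, char[], and StringBuffer
        result.append(first).append(middle).append(name);
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "GEORJEORGE"
    }
}
```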
There is also an interesting method named reverse that reverses the content of a string, such that "Java" would become "avaJ". Presumably, there is a use for this somewhere! Maybe it's used for writing out the letters "ECNALUBMA" on the front of ambulances!
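For the curious, a minimal sketch of the ambulance case follows. The class and method names are ours.

```java
class ReverseSketch {
    // Convert to StringBuffer, reverse in place, convert back to String
    static String mirror(String text) {
        return new StringBuffer(text).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(mirror("AMBULANCE")); // prints "ECNALUBMA"
    }
}
```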
StringBuffer objects support the notion of capacity: a buffer length that is greater than or equal to the length of the string contained in the StringBuffer. Behind the scenes, the StringBuffer class uses an array of characters to hold the string. The array is given an initial default size, and as the string grows, the array often needs to be reallocated with a bigger size. This behind-the-scenes work is done for you, but there are methods to explicitly set the size (i.e., the capacity) of this buffer. You can thereby optimize performance by predicting the final size you will eventually require, minimizing the need for costly reallocations. It is by judicious use of capacity planning that you can most benefit from using a StringBuffer as a scratchpad to build up a computed string.
When instantiating an empty StringBuffer, you can specify the initial capacity by passing in an integer value, like this:
StringBuffer largeString = new StringBuffer(255);
Note that the default capacity for an empty StringBuffer object is 16. Aside from setting the initial capacity at instantiation time, you can also use the ensureCapacity method to ensure that the current buffer is at least as large as the number you pass as an argument. If it is not, the capacity is grown to at least the size you specified. Despite the method name, ensureCapacity does not return a boolean value; in fact, it does not return anything. There is also a method for returning the current capacity, which is named capacity. It takes no arguments, and returns an integer value. This notion of capacity and its two methods is also available in other classes in Java that contain growable lists, such as the Vector class discussed in Chapter 6.
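The sketch below shows these two methods side by side. The class and helper names are hypothetical, and the only behavior relied on is that a request below the current capacity changes nothing while a larger request leaves the capacity at least as big as requested.

```java
class CapacitySketch {
    // Returns the capacity of a fresh, empty StringBuffer after asking
    // for at least the given minimum
    static int capacityAfterEnsure(int minimum) {
        StringBuffer buffer = new StringBuffer(); // default capacity
        buffer.ensureCapacity(minimum);           // returns nothing
        return buffer.capacity();
    }

    public static void main(String[] args) {
        System.out.println(capacityAfterEnsure(10));         // prints "16"
        System.out.println(capacityAfterEnsure(100) >= 100); // prints "true"
    }
}
```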
While you have ensureCapacity and capacity methods for working with a StringBuffer object's buffer size, you also have setLength and length methods for working with the actual string's size. This is always less than or equal to the capacity. You can use setLength to grow or shrink the string's size, effectively padding it (with null characters, which are hex zeros) or truncating it. Note that if you set the length of the string to be greater than the capacity, the capacity is automatically grown, just as it is when you grow a string past its capacity using append. On the other hand, if you truncate a string, the capacity is not reduced. Listing 7.22 will help you see the difference between capacity and length.
Listing 7.22: The Difference between Length and Capacity in the StringBuffer Class
public class TestStringBuffer
{
    public static void main(String args[])
    {
        StringBuffer test1 = new StringBuffer(20); // capacity
        test1.append("12345678901234567890");      // string
        System.out.println();
        System.out.println("String = \"" + test1 + "\"");
        System.out.println("Capacity = " + test1.capacity());
        System.out.println("Length = " + test1.length());
        test1.setLength(50); // string length
        System.out.println("-------------------------------");
        System.out.println("String = \"" + test1 + "\"");
        System.out.println("Capacity = " + test1.capacity());
        System.out.println("Length = " + test1.length());
        test1.setLength(10); // string length
        System.out.println("-------------------------------");
        System.out.println("String = \"" + test1 + "\"");
        System.out.println("Capacity = " + test1.capacity());
        System.out.println("Length = " + test1.length());
    } // end main method
} // end TestStringBuffer class
Running this class results in the following:
String = "12345678901234567890"
Capacity = 20
Length = 20
-------------------------------
String = "12345678901234567890
Capacity = 50
Length = 50
-------------------------------
String = "1234567890"
Capacity = 50
Length = 10
The length is always the current number of characters held in the buffer, while the capacity is the maximum number of characters the buffer can hold without having to be resized internally. Notice how calling setLength with a value of 50 extends the actual string itself to be 50 characters long, but it uses the null character (all zeros) to pad, so you never see the ending quote. You'd have to subsequently replace all those null characters with blanks to get what you probably wanted. Also notice how the capacity always increases in size when needed, while it never decreases in size automatically.
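Replacing the padding nulls with blanks takes only a short loop. The following is a sketch of one way to do it. The padRight name and its decision to leave the string alone when the requested length is not longer than the current one are our assumptions.

```java
class PadSketch {
    // Hypothetical helper: grow text to the given length, padding with blanks
    static String padRight(String text, int length) {
        StringBuffer temp = new StringBuffer(text);
        int oldLength = temp.length();
        if (length > oldLength) {
            temp.setLength(length);          // new positions hold '\u0000'
            for (int idx = oldLength; idx < length; idx++)
                temp.setCharAt(idx, ' ');    // swap each null for a blank
        }
        return temp.toString();
    }

    public static void main(String[] args) {
        System.out.println("\"" + padRight("abc", 6) + "\""); // prints "abc   " in quotes
    }
}
```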
Now let's go back to the trim operation and see how to implement trim-right and trim-left functionality in Java. A previous section showed how both RPG and Java have built-in functions for simultaneously trimming both leading and trailing blanks. It also mentioned that RPG has built-in functions for explicitly stripping either trailing-only or leading-only blanks, using the %TRIMR or %TRIML functions. In Java, however, you must implement this functionality yourself if you need it, which you will see shortly. First, Listing 7.23 reviews these built-in functions in RPG.
Listing 7.23: Trimming Leading and Trailing Blanks in RPG with %TRIML and %TRIMR
D input           S             16A   INZ(' Java for U ')
D result          S             16A
C                   EVAL      result = %TRIML(input) + '.'
C     result        DSPLY
C                   EVAL      result = %TRIMR(input) + '.'
C     result        DSPLY
C                   EVAL      *INLR = *ON
The input string is " Java for U ". Predictably, the result after %TRIML is "Java for U .", and the result after %TRIMR is " Java for U." Note that the concatenating of the period after the trim right lets you see the result before RPG pads the result field back to its declared length. That would not be necessary if the VARYING keyword had been defined on the result field. It is that easy to trim leading or trailing blanks in RPG, since the language directly supports it.
Java, on the other hand, has no supplied methods for this in either its String or StringBuffer classes, so you must write your own. With the use of the StringBuffer class previously discussed, however, this is not very difficult. You again create two static methods, pass in as a parameter the string to operate on, and place the methods in the RPGString class. To be consistent with RPG, call the two methods trimR and trimL. Because you will be doing a reasonable amount of manipulation on the strings, start out in both cases by creating a temporary StringBuffer object from the given String object, and in both methods end by using the toString method of StringBuffer to convert the scratchpad object back to a String that can be returned. The trimR method is the easier of the two, as you just need to find the last non-blank character and truncate the StringBuffer at that point, using the setLength method. Listing 7.24 shows this. Note that this method has to test for the case when it is given a string that is all blanks, in which case it just does setLength(0).
Listing 7.24: Trimming Trailing Blanks in Java with a trimR Method
public static String trimR(String input)
{
    if (input.length() == 0) // error checking
        return input;
    StringBuffer temp = new StringBuffer(input);
    int idx = temp.length()-1;
    // find last non-blank character
    while ( (idx >= 0) && (temp.charAt(idx) == ' ') )
        idx--;
    // truncate string
    if (idx >= 0)
        temp.setLength(idx+1);
    else
        temp.setLength(0);
    return temp.toString();
} // end trimR method
The trimL method is a little more complicated because it involves shifting the characters left, from the first non-blank character. This is best accomplished by brute-force, character-by-character copying. The most efficient way to do this is to use a StringBuffer object that has been initialized to a sufficient capacity, as with the temp2 variable in Listing 7.25.
Listing 7.25: Trimming Leading Blanks in Java with a trimL Method
public static String trimL(String input)
{
    if (input.length() == 0) // error checking
        return input;
    StringBuffer temp1 = new StringBuffer(input);
    int idx, idx2;
    // find first non-blank character
    idx = 0;
    while ( (idx < temp1.length()) && (temp1.charAt(idx) == ' ') )
        idx++;
    // copy the remainder if a non-blank character was found
    if (idx < temp1.length())
    {
        // copy characters to new object
        int newSize = temp1.length() - idx;
        StringBuffer temp2 = new StringBuffer(newSize);
        for (idx2 = 0; idx2 < newSize; idx2++, idx++)
            temp2.append(temp1.charAt(idx));
        return temp2.toString();
    }
    else
    {
        // input was all blanks
        temp1.setLength(0);
        return temp1.toString();
    }
} // end trimL method
Again, some additional complexity is added by the need to handle the case when an all-blank string is given as input.
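If the copying loop seems like a lot of work, a shorter variation is possible. This is a sketch of our own, not the approach taken in the listing: once the index of the first non-blank character is known, the one-parameter substring method of String returns everything from that index onward, and it returns an empty string when the index equals the length, which covers the all-blank case without a separate branch. The class name is hypothetical.

```java
class TrimLeftSketch {
    static String trimL(String input) {
        int idx = 0;
        // find first non-blank character
        while (idx < input.length() && input.charAt(idx) == ' ')
            idx++;
        // idx == input.length() for empty or all-blank input, giving ""
        return input.substring(idx);
    }

    public static void main(String[] args) {
        System.out.println("\"" + trimL("  Java  ") + "\""); // prints "Java  " in quotes
    }
}
```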
As usual, we write test-case code in the main method to drive and demonstrate these new methods, including the all-blank test case, which is shown in Listing 7.26.
Listing 7.26: Testing the trimL and trimR Methods
System.out.println("-----------------------");
System.out.println("Testing trimR method...");
System.out.println("-----------------------");
String paddedString = " Java For RPG Programmers ";
String trimmedRight = RPGString.trimR(paddedString);
System.out.println("\"" + trimmedRight + "\"");
String blankString = " ";
trimmedRight = RPGString.trimR(blankString);
System.out.println("\"" + trimmedRight + "\"");
System.out.println("-----------------------");
System.out.println("Testing trimL method...");
System.out.println("-----------------------");
paddedString = " Java For RPG Programmers ";
String trimmedLeft = RPGString.trimL(paddedString);
System.out.println("\"" + trimmedLeft + "\"");
trimmedLeft = RPGString.trimL(blankString);
System.out.println("\"" + trimmedLeft + "\"");
The result of compiling and running this is what you would expect:
-----------------------
Testing trimR method...
-----------------------
" Java For RPG Programmers"
""
-----------------------
Testing trimL method...
-----------------------
"Java For RPG Programmers "
""
Easy enough? Not really, we admit, but then again, now you can simply call these methods. However, for completeness, we should also mention there are alternatives that are less efficient but easier to code. To trim only leading or only trailing blanks, for example, you could use the String trim method if you first take care to add a non-blank character to the end of the string you want to protect before the trim operation, then remove it after, like this:
String input = " a test ";
String trimmedLeft, trimmedRight;
trimmedLeft = (input + '.').trim();
trimmedLeft = trimmedLeft.substring(0,trimmedLeft.length()-1);
trimmedRight = ('.' + input).trim();
trimmedRight = trimmedRight.substring(1);
Often, when writing string-parsing code, you will want to extract individual words. Java recognizes this need and supplies a utility class in the java.util package named StringTokenizer that does this automatically. This is a good class to know about, as it can save significant coding effort in those cases where a word-by-word extraction of a given string is required. It is instantiated by specifying the String object to parse. Subsequent iteration through the words, or tokens, is accomplished by the two methods hasMoreTokens and nextToken, as shown in Listing 7.27.
Listing 7.27: Testing the StringTokenizer Class in Java
// requires "import java.util.StringTokenizer;" at the top of the source file
public static void main(String args[])
{
    String inputString = "Mary had a little lamb";
    StringTokenizer tokens = new StringTokenizer(inputString);
    String nextToken;
    System.out.println();
    while (tokens.hasMoreTokens())
    {
        nextToken = tokens.nextToken();
        System.out.println(nextToken);
    }
}
Running this gives the following:
Mary
had
a
little
lamb
What delimits or separates words or tokens? By default, it is white space (the blank, tab, newline, carriage-return, and form-feed characters), but this can be explicitly specified at instantiation time, by entering all delimiting characters as a string, for example:
String sample = " $123,456.78 ";
StringTokenizer words = new StringTokenizer(sample, " $,.");
This specifies four delimiter characters: the blank, dollar sign, comma, and period. You can also specify delimiters as part of the nextToken method call, in the event they are different per token. For your information, the above little example yields the tokens "123", "456", and "78".
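To see those tokens for yourself, the sketch below collects them into an array, using the countTokens method of StringTokenizer to size the array up front. The class and method names are hypothetical.

```java
import java.util.StringTokenizer;

class TokenSketch {
    // Splits text on any of the given delimiter characters
    static String[] split(String text, String delimiters) {
        StringTokenizer words = new StringTokenizer(text, delimiters);
        String[] tokens = new String[words.countTokens()];
        for (int idx = 0; words.hasMoreTokens(); idx++)
            tokens[idx] = words.nextToken();
        return tokens;
    }

    public static void main(String[] args) {
        String[] tokens = split(" $123,456.78 ", " $,.");
        for (int idx = 0; idx < tokens.length; idx++)
            System.out.println(tokens[idx]); // prints 123, 456, and 78 on separate lines
    }
}
```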
This same functionality requires a little more work in RPG, as you have to write it yourself. However, the code is not so difficult, as shown in Listing 7.28. Since you are a seasoned RPG IV programmer by now, we will not dissect this example, but rather leave that to you. In fact, we recommend that you turn this into a reusable procedure.
Listing 7.28: Scanning for Delimiter Characters in RPG
D formula         C                   'A * 2 / 3 - Num'
D tempstr         S             10A
D start           S              2P 0 INZ(1)
D end             S              2P 0 INZ(0)
C                   DOW       (start <= %LEN(formula))
C                   EVAL      end = %SCAN(' ':formula:start)
C                   IF        end = 0
C                   EVAL      end = %LEN(formula)+1
C                   ENDIF
C                   EVAL      tempstr=
C                             %SUBST(formula:start:end-start)
C     tempstr       DSPLY
C                   EVAL      start=end+1
C                   ENDDO
C                   EVAL      *INLR = *ON
There are a number of remaining methods in the String class that offer additional functionality beyond what RPG supplies. Rather than describe them all, we leave them to your own discovery. However, Table 7.3 provides a brief summary of some of the more interesting ones. Refer to the JDK documentation for the java.lang.String class for more detailed information.
Method | Description |
---|---|
compareTo(String) | Compares two strings lexicographically. |
copyValueOf(char[],int,int) | Returns a string that is equivalent to the specified range of the character array. |
endsWith(String) and startsWith(String) | Tests if this string ends with or starts with the given substring. |
equals(String) and equalsIgnoreCase(String) | Tests if this string is equal to the given string, optionally ignoring case. |
getBytes() | Converts this string to a byte array. |
getChars(int, int, char[], int) | Copies a substring into the destination character array, starting at the given offset. |
regionMatches(int,String, int,int) | Tests if two string regions are equal. |
toCharArray() | Converts this string to a new character array. |
toLowerCase() and toUpperCase() | Folds all of the characters in this string to lowercase or uppercase. |
valueOf(xxx) | Takes as input a primitive data type value and converts it to a string. |
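To give a feel for a few of these methods, here is a short sketch. The class name, the wordAt helper, and the sample string are invented for the illustration.

```java
class StringMethodsSketch {
    // true when word occurs in text exactly at the given zero-based offset
    static boolean wordAt(String text, int offset, String word) {
        return text.regionMatches(offset, word, 0, word.length());
    }

    public static void main(String[] args) {
        String title = "Java for RPG Programmers";
        System.out.println(title.startsWith("Java"));     // prints "true"
        System.out.println(title.endsWith("users"));      // prints "false"
        System.out.println("RPG".compareTo("Java") > 0);  // prints "true": R sorts after J
        System.out.println(title.equalsIgnoreCase("JAVA FOR RPG PROGRAMMERS")); // prints "true"
        System.out.println(wordAt(title, 9, "RPG"));      // prints "true"
        System.out.println(String.valueOf(3.5) + String.valueOf(true)); // prints "3.5true"
    }
}
```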
This chapter introduced you to the following concepts:
We hope this discussion of strings did not tie you in knots!