Chapter 4. The Java Language

CONTENTS

4.1 Text Encoding
4.2 Comments
4.3 Types
4.4 Statements and Expressions
4.5 Exceptions
4.6 Assertions
4.7 Arrays

In this chapter, we introduce the framework of the Java language and some of its fundamental facilities. We don't try to provide a full language reference here; instead, we'll lay out the basic structures of Java with special attention to how it differs from other languages. For example, we'll take a close look at arrays in Java, because they are significantly different from those in some other languages. We won't, on the other hand, spend too much time explaining basic language constructs such as loops and control structures. Nor will we talk much about Java's object-oriented side here, as that's covered in detail in Chapter 5 through Chapter 7. As always, we'll try to provide meaningful examples to illustrate how to use Java in everyday programming tasks.

4.1 Text Encoding

Java is a language for the Internet. Since the people of the Net speak and write in many different human languages, Java must be able to handle a large number of languages as well. One of the ways in which Java supports international access is through Unicode character encoding. Unicode uses a 16-bit character encoding; it's a worldwide standard that supports the scripts (character sets) of most languages.^[1]

Java source code can be written using the Unicode character encoding and stored either in its full 16-bit form or with ASCII-encoded Unicode character values. This makes Java a friendly language for non-English-speaking programmers who can use their native alphabet for class, method, and variable names.

The Java char type and String objects also support Unicode. But if you're concerned about having to labor with two-byte characters, you can relax. The String API makes the character encoding transparent to you. Unicode is also ASCII-friendly; the first 256 characters are defined to be identical to the first 256 characters in the ISO8859-1 (Latin-1) encoding; if you stick with these values, there's really no distinction between the two.

Most platforms can't display all currently defined Unicode characters. As a result, Java programs can be written with special Unicode escape sequences. A Unicode character can be represented with this escape sequence:

\uxxxx

xxxx is a sequence of exactly four hexadecimal digits. The escape sequence indicates an ASCII-encoded Unicode character. This is also the form Java uses to output Unicode characters in an environment that doesn't otherwise support them.

Java stores and manipulates characters and strings internally as Unicode values. Java also comes with classes to read and write Unicode-formatted character streams.
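
As a small sketch of how escapes behave in practice, the following class compares an escaped character with its literal form and checks that an escape inside a string counts as a single character. The class name UnicodeEscapes and its two helper methods are our own inventions for illustration:

```java
class UnicodeEscapes {
    // '\u0041' is the Unicode escape for the Latin capital letter A.
    static boolean escapeMatchesLiteral() {
        return '\u0041' == 'A';
    }

    // The escape below denotes one character, so the string has length 1.
    static int smileyLength() {
        String smiley = "\u263a";
        return smiley.length();
    }

    public static void main(String[] args) {
        System.out.println(escapeMatchesLiteral());
        System.out.println(smileyLength());
    }
}
```

Because the compiler translates escapes before it does anything else, they are recognized everywhere in a source file, including inside comments.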

4.2 Comments

Java supports both C-style block comments delimited by /* and */ and C++-style line comments indicated by //:

/*  This is a
        multiline
            comment.    */

// This is a single-line comment
// and so // is this

As in C, block comments can't be nested. Single-line comments are delimited by the end of a line; extra // indicators inside a single line have no effect. Line comments are useful for short comments within methods; they don't conflict with wrapping block comment indicators around large chunks of code during development.

4.2.1 javadoc Comments

By convention, a block comment beginning with /** indicates a special doc comment. A doc comment is designed to be extracted by automated documentation generators, such as the Java SDK's javadoc program. A doc comment is terminated by the next */, just as with a regular block comment. Leading spacing and the first * on each line are ignored; lines beginning with @ are interpreted as special tags for the documentation generator.

Here's an example:

/**
 * I think this class is possibly the most amazing thing you will
 * ever see. Let me tell you about my own personal vision and
 * motivation in creating it.
 * <p>
 * It all began when I was a small child, growing up on the
 * streets of Idaho. Potatoes were the rage, and life was good...
 *
 * @see PotatoPeeler
 * @see PotatoMasher
 * @author John 'Spuds' Smith
 * @version 1.00, 19 Dec 1996
 */

javadoc creates HTML documentation for classes by reading the source code and pulling out the embedded comments. The author and version information is presented in the output, and the @see tags make hypertext links to the appropriate class documentation.

The compiler also looks at the doc comments; in particular, it is interested in the @deprecated tag, which means that the method has been declared obsolete and should be avoided in new programs. The fact that a method is deprecated is noted in the compiled class file so a warning message can be generated whenever you use a deprecated feature in your code (even if the source isn't available).

Doc comments can appear above class, method, and variable definitions, but some tags may not be applicable to all of these. For example, a variable declaration can contain only the @see and @deprecated tags. Table 4-1 summarizes the tags used in doc comments.

Table 4-1. Doc comment tags
Tag	Description	Applies to
@see	Associated class name	Class, method, or variable
@author	Author name	Class
@version	Version string	Class
@param	Parameter name and description	Method
@return	Description of return value	Method
@exception	Exception name and description	Method
@deprecated	Declares an item to be obsolete	Class, method, or variable
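
To show how the method-level tags fit together, here is a minimal sketch of a documented class. PotatoCounter, its methods, and the author name are hypothetical and exist only to carry the tags; they are not part of any real API:

```java
/**
 * Hypothetical utility class, used only to illustrate doc comment tags.
 *
 * @author  A. N. Author
 * @version 1.0
 */
class PotatoCounter {
    /**
     * Estimates how many whole potatoes fit in a sack.
     *
     * @param sackWeight   weight of the sack, in kilograms
     * @param potatoWeight average weight of one potato, in kilograms
     * @return the estimated number of whole potatoes
     * @exception IllegalArgumentException if potatoWeight is not positive
     */
    static int estimate(double sackWeight, double potatoWeight) {
        if (potatoWeight <= 0) {
            throw new IllegalArgumentException("potatoWeight must be positive");
        }
        return (int) (sackWeight / potatoWeight);
    }

    /**
     * Older entry point that assumes a quarter-kilogram potato.
     *
     * @deprecated use estimate instead
     */
    static int guess(double sackWeight) {
        return estimate(sackWeight, 0.25);
    }

    public static void main(String[] args) {
        System.out.println(estimate(10.0, 0.25));
    }
}
```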

4.3 Types

The type system of a programming language describes how its data elements (variables and constants) are associated with actual storage. In a statically typed language, such as C or C++, the type of a data element is a simple, unchanging attribute that often corresponds directly to some underlying hardware phenomenon, such as a register or a pointer value. In a more dynamic language such as Smalltalk or Lisp, variables can be assigned arbitrary elements and can effectively change their type throughout their lifetime. A considerable amount of overhead goes into validating what happens in these languages at runtime. Scripting languages such as Perl achieve ease of use by providing drastically simplified type systems in which only certain data elements can be stored in variables, and values are unified into a common representation, such as strings.

Java combines the best features of both statically and dynamically typed languages. As in a statically typed language, every variable and programming element in Java has a type that is known at compile time, so the runtime system doesn't normally have to check the type validity of assignments while the code is executing. Unlike C or C++, Java also maintains runtime information about objects and uses this to allow truly safe runtime polymorphism and casting (using an object as a type other than its declared type).

Java data types fall into two categories. Primitive types represent simple values that have built-in functionality in the language; they are fixed elements, such as literal constants and numbers. Reference types (or class types) include objects and arrays; they are called reference types because they are always accessed "by reference," as we'll explain shortly.

4.3.1 Primitive Types

Numbers, characters, and boolean values are fundamental elements in Java. Unlike some other (perhaps more pure) object-oriented languages, they are not objects. For those situations where it's desirable to treat a primitive value as an object, Java provides "wrapper" classes (see Chapter 10). One major advantage of keeping primitive values separate from objects is that the Java compiler can more readily optimize their usage.

Another important portability feature of Java is that primitive types are precisely defined. For example, you never have to worry about the size of an int on a particular platform; it's always a 32-bit, signed, two's complement number. Table 4-2 summarizes Java's primitive types.

Table 4-2. Java primitive data types
Type	Definition
boolean	`true` or `false`
char	16-bit Unicode character
byte	8-bit signed two's complement integer
short	16-bit signed two's complement integer
int	32-bit signed two's complement integer
long	64-bit signed two's complement integer
float	32-bit IEEE 754 floating-point value
double	64-bit IEEE 754 floating-point value

Those of you with a C background may notice that the primitive types look like an idealization of C scalar types on a 32-bit machine, and you're absolutely right. That's how they're supposed to look. The 16-bit characters were forced by Unicode, and ad hoc pointers were deleted for other reasons. But overall, the syntax and semantics of Java primitive types are meant to fit a C programmer's mental habits.

4.3.1.1 Floating-point precision

Floating-point operations in Java follow the IEEE 754 international specification, which means that the result of floating-point calculations is normally the same on different Java platforms. However, since Version 1.2, Java has allowed for extended precision on platforms that support it. This can introduce extremely small-valued and arcane differences in the results of high-precision operations. Most applications would never notice this, but if you want to ensure that your application produces exactly the same results on different platforms, use the special keyword strictfp as a class modifier on the class containing the floating-point manipulation.

4.3.1.2 Variable declaration and initialization

Variables are declared inside of methods or classes in C style. For example:

int foo;
double d1, d2;
boolean isFun;

Variables can optionally be initialized with an appropriate expression when they are declared:

int foo = 42;
double d1 = 3.14, d2 = 2 * 3.14;
boolean isFun = true;

Variables that are declared as instance variables in a class are set to default values if they aren't initialized. (In this case, they act much like static variables in C or C++.) Numeric types default to the appropriate flavor of zero, characters are set to the null character (\0), and boolean variables have the value false. Local variables declared in methods, on the other hand, must be explicitly initialized before they can be used.
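
The difference between the two kinds of variables can be sketched as follows. The class name Defaults and its fields are made up for this illustration; the String field is included on the assumption that you'd like to see a reference type too, which defaults to null:

```java
class Defaults {
    // Instance variables with no initializer receive default values.
    int count;
    double ratio;
    char letter;
    boolean flag;
    String name;   // reference types default to null

    public static void main(String[] args) {
        Defaults d = new Defaults();
        System.out.println(d.count + " " + d.ratio + " " + d.flag + " " + d.name);

        int local;
        // System.out.println(local);   // would not compile: local is not initialized
        local = 1;
        System.out.println(local);
    }
}
```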

4.3.1.3 Integer literals

Integer literals can be specified in octal (base 8), decimal (base 10), or hexadecimal (base 16). A decimal integer is specified by a sequence of digits beginning with one of the characters 1-9:

int i = 1230;

Octal numbers are distinguished from decimal numbers by a leading zero:

int i = 01230;             // i = 664 decimal

As in C, a hexadecimal number is denoted by the leading characters 0x or 0X (zero "x"), followed by digits and the characters a-f or A-F, which represent the decimal values 10-15:

int i = 0xFFFF;            // i = 65535 decimal

Integer literals are of type int unless they are suffixed with an L, denoting that they are to be produced as a long value:

long l = 13L;
long l = 13;       // equivalent: 13 is converted from type int

(The lowercase character l ("el") is also acceptable but should be avoided because it often looks like the numeral 1.)

When a numeric type is used in an assignment or an expression involving a type with a larger range, it can be promoted to the larger type. For example, in the second line of the previous example, the number 13 has the default type of int, but it's promoted to type long for assignment to the long variable. Certain other numeric and comparison operations also cause this kind of arithmetic promotion. Going the other way, a numeric value can't be assigned to a variable of a type with a smaller range without an explicit (C-style) cast. (The one exception is an integer constant that the compiler can see fits in the smaller type, as in byte b = 13;.)

int i = 13;
byte b = i;          // Compile-time error, explicit cast needed
byte b = (byte) i;   // OK

Conversions from floating-point to integer types always require an explicit cast because of the potential loss of precision.
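
Here is a short sketch of the three kinds of conversion just described. The class and method names (Conversions, widen, narrow, truncate) are our own, chosen for illustration:

```java
class Conversions {
    // Widening: an int is promoted to long without a cast.
    static long widen(int i) {
        long l = i;
        return l;
    }

    // Narrowing: the cast keeps only the low-order 8 bits of the int.
    static byte narrow(int i) {
        return (byte) i;
    }

    // Floating point to integer: the cast discards the fractional part.
    static int truncate(double d) {
        return (int) d;
    }

    public static void main(String[] args) {
        System.out.println(widen(13));
        System.out.println(narrow(300));
        System.out.println(truncate(3.99));
    }
}
```

Note that a narrowing cast doesn't complain when the value doesn't fit; narrow(300) quietly yields 44, the value of the low-order eight bits. The cast tells the compiler that you accept that risk.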

4.3.1.4 Floating-point literals

Floating-point values can be specified in decimal or scientific notation. Floating-point literals are of type double unless they are suffixed with an f or F denoting that they are to be produced as a float value:

double d = 8.31;
double e = 3.00e+8;
float f = 8.31F;
float g = 3.00e+8F;

4.3.1.5 Character literals

A literal character value can be specified either as a single-quoted character or as an escaped ASCII or Unicode sequence:

char a = 'a';
char newline = '\n';
char smiley = '\u263a';

4.3.2 Reference Types

In Java, as in other object-oriented languages, you create new complex data types from primitives by creating a class that defines a new type in the language. For instance, if we create a new class called Foo in Java, we are also implicitly creating a new type called Foo. The type of an item governs how it's used and where it's assigned. An item of type Foo can, in general, be assigned to a variable of type Foo or passed as an argument to a method that accepts a Foo value.

In an object-oriented language like Java, a type is not necessarily just a simple attribute. Reference types are related in the same way as the classes they represent. Classes exist in a hierarchy, where a subclass is a specialized kind of its parent class. The corresponding types have the same relationship, where the type of the child class is considered a subtype of the parent class. Because child classes always extend their parents and have, at a minimum, the same functionality, an object of the child's type can be used in place of an object of the parent's type. For example, if you create a new class, Cat, that extends Animal, there is a new type Cat that is considered a subtype of Animal. Objects of type Cat can then be used anywhere an object of type Animal can be used; an object of type Cat is said to be assignable to a variable of type Animal. This is called subtype polymorphism and is one of the primary features of an object-oriented language. We'll look more closely at classes and objects in Chapter 5.

Primitive types in Java are used and passed "by value." In other words, when a primitive value is assigned or passed as an argument to a method, it's simply copied. Reference types, on the other hand, are always accessed "by reference." A reference is simply a handle or a name for an object. What a variable of a reference type holds is a reference to an object of its type (or of a subtype, as described earlier). A reference is like a pointer in C or C++, except that its type is strictly enforced and the reference value itself is a primitive entity that can't be examined directly. A reference variable can't be created or changed other than through assignment to an appropriate object. When references are assigned or passed to methods, they are copied by value. If you are familiar with C, you can think of a reference as a pointer type that is automatically dereferenced whenever it's mentioned.

Let's run through an example. We specify a variable of type Foo, called myFoo, and assign it an appropriate object:^[2]

Foo myFoo = new Foo( );
Foo anotherFoo = myFoo;

myFoo is a reference-type variable that holds a reference to the newly constructed Foo object. (For now, don't worry about the details of creating an object; we'll cover that in Chapter 5.) We create a second Foo type variable, anotherFoo, and assign it to the same object. There are now two identical references: myFoo and anotherFoo. If we change things in the state of the Foo object itself, we see the same effect by looking at it with either reference.

We can pass an object to a method by specifying a reference-type variable (in this case, either myFoo or anotherFoo) as the argument:

myMethod( myFoo );

An important, but sometimes confusing, distinction to make at this point is that the reference itself is passed by value. That is, the argument passed to the method (a local variable from the method's point of view) is actually a third copy of the reference. The method can alter the state of the Foo object itself through that reference, but it can't change the caller's notion of the reference to myFoo. That is, the method can't change the caller's myFoo to point to a different Foo object; it can change only its own. For those occasions when we want a method to have the side effect of changing a reference passed to it, we have to wrap that reference in another object to provide a layer of indirection.
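
The following sketch makes the distinction concrete. The Foo class here is a minimal stand-in with a single field, and FooHolder is a hypothetical wrapper of the kind just described; neither is defined elsewhere in this chapter:

```java
class ReferenceDemo {
    // Minimal stand-in for the Foo class used in the text; the field is an assumption.
    static class Foo {
        int value;
    }

    // Hypothetical wrapper object that provides the extra layer of indirection.
    static class FooHolder {
        Foo foo;
    }

    // Changes the state of the object that the caller's reference points to.
    static void mutate(Foo foo) {
        foo.value = 42;
    }

    // Reassigns only the method's own copy of the reference.
    static void reassign(Foo foo) {
        foo = new Foo();
        foo.value = 99;
    }

    // Replaces the reference stored inside the holder, which the caller can see.
    static void replace(FooHolder holder) {
        holder.foo = new Foo();
        holder.foo.value = 99;
    }

    public static void main(String[] args) {
        Foo myFoo = new Foo();
        mutate(myFoo);
        reassign(myFoo);
        System.out.println(myFoo.value);   // still the value set by mutate()

        FooHolder holder = new FooHolder();
        holder.foo = myFoo;
        replace(holder);
        System.out.println(holder.foo == myFoo);
    }
}
```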

Reference types always point to objects, and objects are always defined by classes. However, two special kinds of reference types specify the type of object they point to in a slightly different way. Arrays in Java have a special place in the type system. They are a special kind of object automatically created to hold a series of some other type of object, known as the base type. Declaring an array-type reference implicitly creates the new class type, as you'll see in the next section.

Interfaces are a bit sneakier. An interface defines a set of methods and a corresponding type. Any object whose class declares that it implements the interface, and supplies all of its methods, can be treated as an object of that type. Variables and method arguments can be declared to be of interface types, just like class types, and any object that implements the interface can be assigned to them. This allows Java to cross the lines of the class hierarchy in a type-safe way.
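
As a rough illustration, the sketch below defines a hypothetical Speaker interface and two otherwise unrelated classes that implement it. All of the names are invented for this example:

```java
class InterfaceDemo {
    // Hypothetical interface: one method and a corresponding type.
    interface Speaker {
        String speak();
    }

    // Two otherwise unrelated classes that both implement Speaker.
    static class Cat implements Speaker {
        public String speak() { return "Meow"; }
    }

    static class Radio implements Speaker {
        public String speak() { return "News at ten"; }
    }

    // Accepts any object whose class implements Speaker.
    static String announce(Speaker s) {
        return "It says: " + s.speak();
    }

    public static void main(String[] args) {
        Speaker s = new Cat();
        System.out.println(announce(s));
        System.out.println(announce(new Radio()));
    }
}
```

Cat and Radio share no parent class other than Object, yet both can be passed to announce() because both are of type Speaker.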

4.3.3 A Word About Strings

Strings in Java are objects; they are therefore a reference type. String objects do, however, have some special help from the Java compiler that makes them look more like primitive types. Literal string values in Java source code are turned into String objects by the compiler. They can be used directly, passed as arguments to methods, or assigned to String type variables:

System.out.println( "Hello World..." );
String s = "I am the walrus...";
String t = "John said: \"I am the walrus...\"";

The + symbol in Java is overloaded to provide string concatenation as well as numeric addition. Along with its sister +=, this is the only overloaded operator in Java:

String quote = "Four score and " + "seven years ago,";
String more = quote + " our" + " fathers" +  " brought...";

Java builds a single String object from the concatenated strings and provides it as the result of the expression. We discuss the String class in Chapter 9.
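
One detail worth seeing in code is what happens when + meets a mixture of strings and numbers, which we haven't spelled out above: a number next to a String is converted to text, and the operands are combined from left to right. The class ConcatDemo and its methods are hypothetical names for this sketch:

```java
class ConcatDemo {
    // The String comes first, so each number is appended as text.
    static String textFirst() {
        return "Total: " + 1 + 2;
    }

    // 1 + 2 is evaluated first as numeric addition; the sum is then concatenated.
    static String numbersFirst() {
        return 1 + 2 + " items";
    }

    // += appends to the current value and stores the new String.
    static String appended() {
        String s = "Four score";
        s += " and seven";
        return s;
    }

    public static void main(String[] args) {
        System.out.println(textFirst());
        System.out.println(numbersFirst());
        System.out.println(appended());
    }
}
```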

4.4 Statements and Expressions

Although Java's means of declaring methods is quite different from C++, Java's statement and expression syntax is similar to C. Again, the creators of Java came from this background, and the intention was to make the low-level details of Java easily accessible to C programmers. Java statements appear inside methods and classes; they describe all activities of a Java program. Variable declarations and assignments, such as those in the previous section, are statements, as are basic language structures such as conditionals and loops. Expressions produce values; an expression is evaluated to produce a result, to be used as part of another expression or in a statement. Method calls, object allocations, and, of course, mathematical expressions are examples of expressions. Technically, since variable assignments can be used as values for further assignments or operations (in somewhat questionable programming style), they can be considered to be both statements and expressions.

One of the tenets of Java is to keep things simple and consistent. To that end, when there are no other constraints, evaluations and initializations in Java always occur in the order in which they appear in the code from left to right, top to bottom. We'll see this rule used in the evaluation of assignment expressions, method calls, and array indexes, to name a few cases. In some other languages, the order of evaluation is more complicated or even implementation-dependent. Java removes this element of danger by precisely and simply defining how the code is evaluated. This doesn't mean you should start writing obscure and convoluted statements, however. Relying on the order of evaluation of expressions is a bad programming habit, even when it works. It produces code that is hard to read and harder to modify. Real programmers, however, are not made of stone, and you may catch us doing this once or twice when we can't resist the urge to write terse code.
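
If you'd like to watch the left-to-right rule at work, here is a minimal sketch. The operand() helper is our own device: it records its name in a log before returning its value, so the log shows the order in which the operands were evaluated. (This is exactly the sort of code we just told you not to write in earnest.)

```java
class EvalOrder {
    // Collects the names of operands in the order they are evaluated.
    static final StringBuilder log = new StringBuilder();

    static int operand(String name, int value) {
        log.append(name);
        return value;
    }

    // Multiplication binds tighter, but operands are still evaluated left to right.
    static String trace() {
        log.setLength(0);
        int result = operand("a", 1) + operand("b", 2) * operand("c", 3);
        return log.toString() + "=" + result;
    }

    // The left operand is read before the embedded assignment changes i.
    static int leftOperandFirst() {
        int i = 1;
        return i + (i = 5) * 2;
    }

    public static void main(String[] args) {
        System.out.println(trace());
        System.out.println(leftOperandFirst());
    }
}
```

Precedence decides how the operands are grouped (1 + (2 * 3)), but it doesn't change the order in which they are evaluated.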

4.4.1 Statements

As in C or C++, statements and expressions in Java appear within a code block. A code block is syntactically a series of statements surrounded by an open curly brace ({) and a close curly brace (}). The statements in a code block can contain variable declarations:

{
    int size = 5;
    setName("Max");
    ...
}

Methods, which look like C functions, are in a sense code blocks that take parameters and can be called by their names, for example, setUpDog:

void setUpDog( String name ) {
    int size = 5;
    setName( name );
    ...
}

Variable declarations are limited in scope to their enclosing code block. That is, they can't be seen outside of the nearest set of braces:

{
    int i = 5;
}

i = 6;           // Compile-time error, no such variable i

In this way, code blocks can be used to arbitrarily group other statements and variables. The most common use of code blocks, however, is to define a group of statements for use in a conditional or iterative statement.

Since a code block is itself collectively treated as a statement, we define a conditional like an if/else clause as follows:

if ( condition )
    statement;
[ else
    statement; ]

Thus, the if clause has the familiar (to C/C++ programmers) functionality of taking two different forms. Here's one:

if ( condition )
    statement;

Here's the other:

if ( condition )  {
    [ statement; ]
    [ statement; ]
    [ ... ]
}

The condition is a boolean expression. You can't use an integer expression or a reference type like you can in C. In other words, while i==0 is legitimate, i is not (unless i itself is boolean).

In the second form, the statement is a code block, and all its enclosed statements are executed if the conditional succeeds. Any variables declared within that block are visible only to the statements within the successful branch of the condition. Like the if/else conditional, most of the remaining Java statements are concerned with controlling the flow of execution. They act for the most part like their namesakes in other languages.

The do and while iterative statements have the familiar functionality; their conditional test is also a boolean expression:

while ( condition )
    statement;

do
    statement;
while ( condition );

The for statement also looks like it does in C:

for ( initialization; condition; incrementor )
    statement;

The variable initialization expression can declare a new variable which is then limited to the scope of the for statement:

for (int i = 0; i < 100; i++ ) {
    System.out.println( i );
    int j = i;
    ...
}

Java does not in general support the C comma operator, which groups multiple expressions into a single expression. However, you can use multiple comma-separated expressions in the initialization and increment sections of the for loop. For example:

for (int i = 0, j = 10; i < j; i++, j-- ) {
    ...
}

The Java switch statement takes an integer type (or an argument that can be automatically promoted to an integer type) and selects among a number of alternative case branches:

switch ( int expression ) {
    case int expression :
        statement;
    [ case int expression :
        statement;
    ...
    default :
        statement;  ]
}

No two of the case expressions can evaluate to the same value. As in C, an optional default case can be specified to catch unmatched conditions. Normally, the special statement break is used to terminate a branch of the switch:

switch ( retVal ) {
    case myClass.GOOD :
        // something good
        break;
    case myClass.BAD :
        // something bad
        break;
    default :
        // neither one
        break;
}

The Java break statement and its friend continue perform unconditional jumps out of a loop or conditional statement. They differ from the corresponding statements in C by taking an optional label as an argument. Enclosing statements, such as code blocks and iterators, can be labeled with identifier statements:

one:
    while ( condition ) {
        ...
        two:
            while ( condition ) {
                ...
                // break or continue point
            }
        // after two
    }
// after one

In this example, a break or continue without argument at the indicated position would have the normal, C-style effect. A break would cause processing to resume at the point labeled "after two"; a continue would immediately cause the two loop to return to its condition test.

The statement break two at the indicated point would have the same effect as an ordinary break, but break one would break both levels and resume at the point labeled "after one." Similarly, continue two would serve as a normal continue, but continue one would return to the test of the one loop. Multilevel break and continue statements remove the remaining justification for the evil goto statement in C/C++.
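
Here is a small, runnable sketch of both labeled forms. The class, the method names, and the labels search and outer are all hypothetical; any identifier can serve as a label:

```java
class LabeledBreak {
    // Returns the smallest i >= 2 for which some j >= 2 gives i * j == target, or -1.
    static int smallestFactor(int target) {
        int factor = -1;
        search:
        for (int i = 2; i < target; i++) {
            for (int j = 2; j < target; j++) {
                if (i * j == target) {
                    factor = i;
                    break search;     // leaves both loops at once
                }
            }
        }
        return factor;
    }

    // Counts the pairs (i, j) with j < i, skipping ahead to the next i once j reaches i.
    static int countLowerPairs(int limit) {
        int count = 0;
        outer:
        for (int i = 0; i < limit; i++) {
            for (int j = 0; j < limit; j++) {
                if (j >= i) {
                    continue outer;   // resumes with the outer loop's next iteration
                }
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(smallestFactor(15));
        System.out.println(countLowerPairs(4));
    }
}
```

Without the label, the break in smallestFactor() would leave only the inner loop, and the search would carry on with the next value of i.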

There are a few Java statements we aren't going to discuss right now. The try, catch, and finally statements are used in exception handling, as we'll discuss later in this chapter. The synchronized statement in Java is used to coordinate access to statements among multiple threads of execution; see Chapter 8 for a discussion of thread synchronization.

4.4.1.1 Unreachable statements

On a final note, we should mention that the Java compiler flags "unreachable" statements as compile-time errors. An unreachable statement is one that the compiler determines can never be executed. Of course, there may be plenty of code in your program that never actually runs, but the compiler flags only the statements it can "prove" are unreachable through simple checking at compile time. For example, a method with an unconditional return statement in the middle of it causes a compile-time error. So does a method in which statements follow a loop that has a constant true condition and no break:

while ( true )
    statement;
// unreachable statements

The condition of an if statement is deliberately exempt from this check, so a constant test such as if (1 < 2) return; followed by more statements still compiles.

4.4.2 Expressions

An expression produces a result, or value, when it is evaluated. The value of an expression can be a numeric type, as in an arithmetic expression; a reference type, as in an object allocation; or the special type void, which is the declared type of a method that doesn't return a value. In the last case, the expression is evaluated only for its side effects, that is, the work it does aside from producing a value. The type of an expression is known at compile time. The value produced at runtime is either of this type or, in the case of a reference type, a compatible (assignable) subtype.

4.4.2.1 Operators

Java supports almost all standard C operators. These operators also have the same precedence in Java as they do in C, as shown in Table 4-3.

Table 4-3. Java operators
Precedence	Operator	Operand type	Description
1	++, --	Arithmetic	Increment and decrement
1	+, -	Arithmetic	Unary plus and minus
1	~	Integral	Bitwise complement
1	!	Boolean	Logical complement
1	`(` `type` `)`	Any	Cast
2	*, /, %	Arithmetic	Multiplication, division, remainder
3	+, -	Arithmetic	Addition and subtraction
3	+	String	String concatenation
4	<<	Integral	Left shift
4	>>	Integral	Right shift with sign extension
4	>>>	Integral	Right shift with no extension
5	<, <=, >, >=	Arithmetic	Numeric comparison
5	`instanceof`	Object	Type comparison
6	==, !=	Primitive	Equality and inequality of value
6	==, !=	Object	Equality and inequality of reference
7	&	Integral	Bitwise AND
7	&	Boolean	Boolean AND
8	^	Integral	Bitwise XOR
8	^	Boolean	Boolean XOR
9	|	Integral	Bitwise OR
9	|	Boolean	Boolean OR
10	&&	Boolean	Conditional AND
11	||	Boolean	Conditional OR
12	?:	NA	Conditional ternary operator
13	=	Any	Assignment
13	*=, /=, %=, +=, -=, <<=, >>=, >>>=, &=, ^=, |=	Any	Assignment with operation

There are a few operators missing from the standard C collection. As we said, Java doesn't support the comma operator for combining expressions, although the for statement allows you to use it in the initialization and increment sections. Java doesn't allow direct pointer manipulation, so it doesn't support the address-of (&), dereference (*), and sizeof operators that are familiar to C/C++ programmers. We should also note that the percent (%) operator is not strictly a modulo, but a remainder: its result takes the sign of the left operand, so it can be negative.
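
The difference between a remainder and a true modulo shows up only with negative operands, as this sketch illustrates. The class RemainderDemo is a made-up name; Math.floorMod() is a standard library method (available in Java 8 and later) that we bring in here for comparison, not something discussed elsewhere in this chapter:

```java
class RemainderDemo {
    // The % operator: the result takes the sign of the left operand.
    static int remainder(int a, int b) {
        return a % b;
    }

    // A floor-based modulo: the result takes the sign of the right operand.
    static int modulo(int a, int b) {
        return Math.floorMod(a, b);
    }

    public static void main(String[] args) {
        System.out.println(remainder(-7, 3));
        System.out.println(remainder(7, -3));
        System.out.println(modulo(-7, 3));
    }
}
```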

Java also adds some new operators. As we've seen, the + operator can be used with String values to perform string concatenation. Because all integral types in Java are signed values, the >> operator performs an arithmetic right shift with sign extension. The >>> operator treats the operand as an unsigned number and performs a logical right shift, filling the vacated high-order bits with zeros instead of extending the sign. The new operator, as in C++, is used to create objects; we will discuss it in detail shortly.
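
The two right-shift operators agree on positive numbers and part ways on negative ones. A brief sketch, using the hypothetical class name ShiftDemo:

```java
class ShiftDemo {
    // >> copies the sign bit into the vacated high-order positions.
    static int signedShift(int value, int bits) {
        return value >> bits;
    }

    // >>> always fills the vacated high-order positions with zeros.
    static int unsignedShift(int value, int bits) {
        return value >>> bits;
    }

    public static void main(String[] args) {
        System.out.println(signedShift(-8, 1));
        System.out.println(unsignedShift(-8, 1));
        System.out.println(unsignedShift(-1, 28));
    }
}
```

Shifting -8 right by one place with >> gives -4, as you'd expect from dividing by two. With >>> the sign bit is replaced by a zero, so the same bit pattern turns into a large positive number.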

4.4.2.2 Assignment

While variable initialization (i.e., declaration and assignment together) is considered a statement, with no resulting value, variable assignment alone is also an expression:

int i, j;          // statement
i = 5;             // both expression and statement

Normally, we rely on assignment for its side effects alone, but, as in C, an assignment can be used as a value in another part of an expression:

j = ( i = 5 );

Again, relying on order of evaluation extensively (in this case, using compound assignments in complex expressions) can make code very obscure and hard to read. Do so at your own peril.

4.4.2.3 The null value

The expression null can be assigned to any reference type. It has the meaning of "no reference." A null reference can't be used to reference anything and attempting to do so generates a NullPointerException at runtime.
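
The sketch below shows both the failure and the usual defense, which is to compare the reference with null before using it. The class and method names are hypothetical, and the try/catch construct used to trap the exception is covered later in this chapter:

```java
class NullDemo {
    // Returns true if using the reference raises a NullPointerException.
    static boolean dereferenceFails(String s) {
        try {
            s.length();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Tests the reference against null before using it.
    static int safeLength(String s) {
        return (s == null) ? 0 : s.length();
    }

    public static void main(String[] args) {
        String nothing = null;
        System.out.println(dereferenceFails(nothing));
        System.out.println(safeLength(nothing));
    }
}
```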

4.4.2.4 Variable access

The dot (.) operator has multiple meanings. It can retrieve the value of an instance variable (of some object) or a static variable (of some class). It can also specify a method to be invoked on an object or class. Using the dot (.) to access a variable in an object is an expression that results in the value of the variable accessed. This can be either a numeric type or a reference type:

int i;
String s;
i = myObject.length;
s = myObject.name;

A reference-type expression can be used in further evaluations, by selecting variables or calling methods within it:

int len = myObject.name.length( );
int initialLen = myObject.name.substring(5, 10).length( );

Here we have found the length of our name variable by invoking the length() method of the String object. In the second case, we took an intermediate step and asked for a substring of the name string. The substring method of the String class also returns a String reference, for which we ask the length. Compounding operations like this is also called "chaining," which we'll mention again later.

4.4.2.5 Method invocation

A method invocation is essentially a function call: an expression that results in a value. The value's type is the return type of the method. Thus far, we have seen methods invoked by name:

System.out.println( "Hello World..." );
int myLength = myString.length( );

Here we invoked the methods println() and length() on different objects. Selecting which method is invoked, however, can be more complicated than it appears, because Java allows method overloading and overriding (multiple methods with the same name); the details are discussed in Chapter 5.

Like the result of any expression, the result of a method invocation can be used in further evaluations, as we saw earlier. You can allocate intermediate variables to make it absolutely clear what your code is doing, or you can opt for brevity where appropriate; it's all a matter of coding style. The two following code snippets are equivalent:

int initialLen = myObject.name.substring(5, 10).length( );

String temp1 = myObject.name;
String temp2 = temp1.substring(5, 10);
int initialLen = temp2.length( );

4.4.2.6 Object creation

Objects in Java are allocated with the new operator:

Object o = new Object( );

The argument to new is the constructor for the class. The constructor is a method that always has the same name as the class. The constructor specifies any required parameters to create an instance of the object. The value of the new expression is a reference of the type of the created object. Objects always have one or more constructors, though they may not always be accessible to you.

We'll look at object creation in detail in Chapter 5. For now, just note that object creation is a type of expression and that the resulting object reference can be used in general expressions. In fact, because the binding of new is "tighter" than that of dot (.), you can easily create a new object and invoke a method in it, without assigning the object to a reference type variable:

int hours = new Date(  ).getHours( );

The Date class is a utility class that represents a point in time; constructed with no arguments, it holds the current time. Here we create a new instance of Date with the new operator and call its getHours() method to retrieve the current hour as an integer value. The Date object reference lives long enough to service the method call and is then cut loose and garbage-collected at some point in the future.

Calling methods in object references in this way is, again, a matter of style. It would certainly be clearer to allocate an intermediate variable of type Date to hold the new object and then call its getHours() method. However, combining operations like this is common.
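As a small illustration of the two styles, the sketch below creates an object and calls a method on it, first in a single expression and then through an intermediate variable. We have swapped in StringBuilder for Date here (our choice, not the text's) only because its result doesn't depend on the clock; NewExpressionDemo is a hypothetical class name.

```java
class NewExpressionDemo {
    // Create an object and immediately invoke a method on the new reference
    static String inline() {
        return new StringBuilder( "Java" ).reverse().toString();
    }

    // The same steps with an intermediate variable holding the reference
    static String withVariable() {
        StringBuilder sb = new StringBuilder( "Java" );
        return sb.reverse().toString();
    }

    public static void main( String[] args ) {
        System.out.println( inline() + " " + withVariable() );   // avaJ avaJ
    }
}
```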

4.4.2.7 The instanceof operator

The instanceof operator can be used to determine the type of an object at runtime. It compares an object against a particular type. instanceof returns a boolean value that indicates whether an object is an instance of a specified class or a subclass of that class:

    boolean b;
    Object str = "foo";              // declared as Object so that every test below compiles
    b = ( str instanceof String );   // true
    b = ( str instanceof Object );   // also true
    b = ( str instanceof Date );     // false, not a Date or subclass

instanceof also correctly reports if the object is of the type of an array or a specified interface (as we'll discuss later):

if ( foo instanceof byte[] )     ...

It is also important to note that the value null is not considered an instance of any type. So the following test returns false, no matter what the declared type of the variable:

    String s = null;
    if ( s instanceof String )
        // false, won't happen
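The following sketch gathers these cases (class, interface, array, and null) into one runnable class. The name InstanceofDemo and its helper methods are ours, chosen for illustration; the parameters are declared as Object so that each test is legal at compile time.

```java
import java.io.Serializable;

class InstanceofDemo {
    static boolean isString( Object o )       { return o instanceof String; }
    static boolean isByteArray( Object o )    { return o instanceof byte[]; }
    static boolean isSerializable( Object o ) { return o instanceof Serializable; }

    public static void main( String[] args ) {
        Object text = "foo";
        Object bytes = new byte[4];
        String nothing = null;

        System.out.println( isString( text ) );         // true
        System.out.println( isSerializable( text ) );   // true: String implements Serializable
        System.out.println( isByteArray( bytes ) );     // true
        System.out.println( isByteArray( text ) );      // false
        System.out.println( isString( nothing ) );      // false: null is never an instance
    }
}
```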

4.5 Exceptions

Java's roots are in embedded systems software that runs inside specialized devices such as hand-held computers, cellular phones, and fancy toasters. In those kinds of applications, it's especially important that software errors be handled robustly. Most users would agree that it's unacceptable for their phone to simply crash or for their toast (and perhaps their house) to burn because their software failed. Given that we can't eliminate the possibility of software errors, it's a step in the right direction to recognize and deal with anticipated application-level errors in a methodical way.

Dealing with errors in a language such as C is entirely the responsibility of the programmer. There is no help from the language itself in identifying error types, and there are no tools for dealing with them easily. In C, a routine generally indicates a failure by returning an "unreasonable" value (e.g., the idiomatic -1 or null). As the programmer, you must know what constitutes a bad result and what it means. It's often awkward to work around the limitations of passing error values in the normal path of data flow.^[3] An even worse problem is that certain types of errors can legitimately occur almost anywhere, and it's prohibitive and unreasonable to explicitly test for them at every point in the software.

Java offers an elegant solution to these problems through exceptions. (Java exception handling is similar to, but not quite the same as, exception handling in C++.) An exception indicates an unusual condition or an error condition. Program control becomes unconditionally transferred or "thrown" to a specially designated section of code where it's caught and handled. In this way, error handling is orthogonal to (or outside) the normal flow of the program. We don't have to have special return values for all our methods; errors are handled by a separate mechanism. Control can be passed long distance from a deeply nested routine and handled in a single location when that is desirable, or an error can be handled immediately at its source. There are still some standard methods that return -1 as a special value, but these are generally limited to situations where we are expecting a special value.^[4]

A Java method is required to specify the checked exceptions it can throw (i.e., the ones that it doesn't catch itself; Section 4.5.5 explains which exceptions are checked); this means that the compiler can make sure we handle them. In this way, the information about what errors a method can produce is promoted to the same level of importance as its argument and return types. You may still decide to punt and ignore obvious errors, but in Java you must do so explicitly.

4.5.1 Exceptions and Error Classes

Exceptions are represented by instances of the class java.lang.Exception and its subclasses. Subclasses of Exception can hold specialized information (and possibly behavior) for different kinds of exceptional conditions. However, more often they are simply "logical" subclasses that serve only to identify a new exception type. Figure 4-1 shows the subclasses of Exception in the java.lang package. It should give you a feel for how exceptions are organized. Most other packages define their own exception types, which usually are subclasses of Exception itself or of its important subclass RuntimeException.

Another important exception class is IOException in the package java.io. The IOException class has many subclasses for typical I/O problems (such as FileNotFoundException) and networking problems (such as MalformedURLException). Network exceptions belong to the java.net package. Another important descendant of IOException is RemoteException, which belongs to the java.rmi package. It is used when problems arise during remote method invocation (RMI). Throughout this book we'll mention the exceptions you need to be aware of as we run into them.

Figure 4-1. The java.lang.Exception subclasses

An Exception object is created by the code at the point where the error condition arises. It can be designed to hold whatever information is necessary to describe the exceptional condition and also includes a full stack trace for debugging. (A stack trace is the list of all the methods called in order to reach the point where the exception was thrown.) The Exception object is passed as an argument to the handling block of code, along with the flow of control. This is where the terms "throw" and "catch" come from: the Exception object is thrown from one point in the code and caught by the other, where execution resumes.

The Java API also defines the java.lang.Error class for unrecoverable errors. The subclasses of Error in the java.lang package are shown in Figure 4-2. A notable Error type is AssertionError, which is used by the Java assert language statement to indicate a failure. We'll talk about that a bit later. A few other packages define their own subclasses of Error, but subclasses of Error are much less common (and less useful) than subclasses of Exception. You generally needn't worry about these errors in your code (i.e., you do not have to catch them); they are intended to indicate fatal problems or virtual machine errors. An error of this kind usually causes the Java interpreter to display a message and exit. You are actively discouraged from trying to catch or recover from them because they are supposed to indicate a fatal program bug, not a routine condition.

Figure 4-2. The java.lang.Error subclasses

Both Exception and Error are subclasses of Throwable. Throwable is the base class for objects that can be "thrown" with the Java language throw statement. In general, you should extend only Exception, Error, or one of their subclasses.

4.5.2 Exception Handling

The try/catch guarding statements wrap a block of code and catch designated types of exceptions that occur within it.

    try {
        readFromFile("foo");
        ...
    }
    catch ( Exception e ) {
        // Handle error
        System.out.println( "Exception while reading file: " + e );
        ...
    }

In this example, exceptions that occur within the body of the try portion of the statement are directed to the catch clause for possible handling. The catch clause acts like a method; it specifies an argument of the type of exception it wants to handle and, if it's invoked, it receives the Exception object as an argument. Here we receive the object in the variable e and print it along with a message.

A try statement can have multiple catch clauses that specify different types (subclasses) of Exception:

    try {
        readFromFile("foo");
        ...
    }
    catch ( FileNotFoundException e ) {
        // Handle file not found
        ...
    }
    catch ( IOException e ) {
        // Handle read error
        ...
    }
    catch ( Exception e ) {
        // Handle all other errors
        ...
    }

The catch clauses are evaluated in order, and the first possible (assignable) match is taken. At most one catch clause is executed, which means that the exceptions should be listed from most specific to least specific. In the previous example, we anticipate that the hypothetical readFromFile() can throw two different kinds of exceptions: one for a file not found and another for a more general read error. Any subclass of Exception is assignable to the parent type Exception, so the third catch clause acts like the default clause in a switch statement and handles any remaining possibilities.

One beauty of the try/catch scheme is that any statement in the try block can assume that all previous statements in the block succeeded. A problem won't arise suddenly because a programmer forgot to check the return value from some method. If an earlier statement fails, execution jumps immediately to the catch clause; later statements are never executed.
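To watch the matching rule at work, the sketch below throws whatever exception it is handed and reports which catch clause took it. CatchOrderDemo and handle() are hypothetical names of our own; the clause ordering is the one shown in the example above.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

class CatchOrderDemo {
    // Throws the given exception and reports which catch clause handled it
    static String handle( Exception toThrow ) {
        try {
            throw toThrow;
        }
        catch ( FileNotFoundException e ) {
            return "file not found";
        }
        catch ( IOException e ) {
            return "read error";
        }
        catch ( Exception e ) {
            return "other";
        }
    }

    public static void main( String[] args ) {
        System.out.println( handle( new FileNotFoundException( "foo" ) ) );  // file not found
        System.out.println( handle( new IOException( "disk" ) ) );           // read error
        System.out.println( handle( new IllegalStateException() ) );         // other
    }
}
```

A FileNotFoundException is also an IOException and an Exception, but only the first matching clause runs. Had we listed the IOException clause first, the compiler would have rejected the FileNotFoundException clause as unreachable.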

4.5.3 Bubbling Up

What if we hadn't caught the exception? Where would it have gone? Well, if there is no enclosing try/catch statement, the exception pops to the top of the method in which it appeared and is, in turn, thrown from that method up to its caller. If that point in the calling method is within a try clause, control passes to the corresponding catch clause. Otherwise the exception continues propagating up the call stack, from one method to its caller. In this way, the exception bubbles up until it's caught, or until it pops out of the top of the program, terminating it with a runtime error message. There's a bit more to it than that because, in this case, the compiler might have reminded us to deal with it, but we'll get back to that in a moment.

Let's look at another example. In Figure 4-3, the method getContent() invokes the method openConnection() from within a try/catch statement. In turn, openConnection() invokes the method sendRequest(), which calls the method write() to send some data.

Figure 4-3. Exception propagation

In this figure, the second call to write() throws an IOException. Since sendRequest() doesn't contain a try/catch statement to handle the exception, it's thrown again from the point where it was called in the method openConnection(). Since openConnection() doesn't catch the exception either, it's thrown once more. Finally it's caught by the try statement in getContent() and handled by its catch clause.
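The call chain described for Figure 4-3 can be sketched in code as follows. The method names come from the figure's description; their bodies, and the class name PropagationDemo, are our own assumptions made just to produce a failure on the second write().

```java
import java.io.IOException;

class PropagationDemo {
    // Assumed behavior: the second write fails
    static void write( int n ) throws IOException {
        if ( n == 2 )
            throw new IOException( "write " + n + " failed" );
    }

    static void sendRequest() throws IOException {
        write( 1 );
        write( 2 );   // throws; there is no try/catch here, so the exception leaves sendRequest()
        write( 3 );   // never reached
    }

    static void openConnection() throws IOException {
        sendRequest();   // no try/catch here either, so the exception keeps going
    }

    static String getContent() {
        try {
            openConnection();
            return "ok";
        }
        catch ( IOException e ) {
            return "caught: " + e.getMessage();
        }
    }

    public static void main( String[] args ) {
        System.out.println( getContent() );   // caught: write 2 failed
    }
}
```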

4.5.4 Exception Stack Traces

Since an exception can bubble up quite a distance before it is caught and handled, we may need a way to determine exactly where it was thrown. All exceptions can dump a stack trace that lists their method of origin and all the nested method calls it took to arrive there. Most commonly the user sees this when it is printed using the printStackTrace() method.

    try {
        // complex task
    } catch ( Exception e ) {
        // dump information about exactly where the exception occurred
        e.printStackTrace( System.err );
        ...
    }

Java 1.4 introduces methods that allow you to retrieve the stack trace information programmatically, using the Throwable getStackTrace() method. This method returns an array of StackTraceElement objects, each of which represents a method call on the stack. You can ask a StackTraceElement for details about that method's location using the methods getFileName(), getClassName(), getMethodName(), and getLineNumber().
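Here is a minimal sketch of walking the trace by hand. StackTraceDemo and its inner(), outer(), and capture() methods are hypothetical; only getStackTrace() and the StackTraceElement accessors come from the API described above. The file names and line numbers printed depend on how the class was compiled, so we don't show sample output.

```java
class StackTraceDemo {
    static void inner() throws Exception {
        throw new Exception( "boom" );
    }

    static void outer() throws Exception {
        inner();
    }

    // Run outer() and hand back the trace recorded by the exception
    static StackTraceElement[] capture() {
        try {
            outer();
            return new StackTraceElement[0];   // not reached: outer() always throws
        } catch ( Exception e ) {
            return e.getStackTrace();
        }
    }

    public static void main( String[] args ) {
        // The first element is the method that threw; later ones are its callers
        for ( StackTraceElement frame : capture() )
            System.out.println( frame.getClassName() + "." + frame.getMethodName()
                + " (" + frame.getFileName() + ":" + frame.getLineNumber() + ")" );
    }
}
```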

4.5.5 Checked and Unchecked Exceptions

We explained earlier how Java forces us to be explicit about our error handling. But it's not realistic to require that every conceivable type of error be handled in every situation. So Java exceptions are divided into two categories: checked and unchecked. Most application-level exceptions are checked, which means that any method that throws one, either by generating it itself (as we'll discuss later) or by ignoring one that occurs within it, must declare that it can throw that type of exception in a special throws clause in its method declaration. We haven't yet talked in detail about declaring methods; we'll cover that in Chapter 5. For now all you need to know is that methods have to declare the checked exceptions they can throw or allow to be thrown.

Again in Figure 4-3, notice that the methods openConnection() and sendRequest() both specify that they can throw an IOException. If we had to throw multiple types of exceptions, we could declare them separated with commas:

    void readFile( String s ) throws IOException, InterruptedException {
        ...
    }

The throws clause tells the compiler that a method is a possible source of that type of checked exception and that anyone calling that method must be prepared to deal with it. The caller may use a try/catch block to catch it, or it may declare that it can throw the exception itself.

In contrast, exceptions that are subclasses of either the class java.lang.RuntimeException or the class java.lang.Error are unchecked. See Figure 4-1 for the subclasses of RuntimeException. (Subclasses of Error are generally reserved for serious class loading or runtime system problems.) It's not a compile-time error to ignore the possibility of these exceptions; methods don't have to declare they can throw them. In all other respects, unchecked exceptions behave the same as other exceptions. We are free to catch them if we wish; we simply aren't required to.

Checked exceptions are intended to cover application-level problems such as missing files and unavailable hosts. As good programmers (and upstanding citizens), we should design software to recover gracefully from these kinds of conditions. Unchecked exceptions include problems such as "out of memory" and "array index out of bounds." While these may indicate application-level programming errors, they can occur almost anywhere and usually aren't possible to recover from. Fortunately, because there are unchecked exceptions, you don't have to wrap every one of your array-index operations in a try/catch statement.

To sum up, checked exceptions are problems a reasonable application should try to handle gracefully; unchecked exceptions (runtime exceptions or errors) are problems from which we would not normally expect our software to recover. Error types are those explicitly intended to be conditions that we should never try to handle or recover from.
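The difference shows up directly in method declarations. In the sketch below (all class and method names are hypothetical), readConfig() must declare its IOException, whereas elementAt() can let an ArrayIndexOutOfBoundsException escape without saying a word about it.

```java
import java.io.IOException;

class CheckedDemo {
    // Checked: must be declared, and callers must catch it or declare it themselves
    static void readConfig( boolean present ) throws IOException {
        if ( !present )
            throw new IOException( "config missing" );
    }

    // Unchecked: a bad index throws ArrayIndexOutOfBoundsException, no throws clause needed
    static int elementAt( int[] data, int index ) {
        return data[index];
    }

    // The compiler insists that we deal with the IOException here
    static String tryRead( boolean present ) {
        try {
            readConfig( present );
            return "read";
        } catch ( IOException e ) {
            return "recovered: " + e.getMessage();
        }
    }

    // Catching an unchecked exception is allowed, just not required
    static boolean outOfBounds( int[] data, int index ) {
        try {
            elementAt( data, index );
            return false;
        } catch ( ArrayIndexOutOfBoundsException e ) {
            return true;
        }
    }

    public static void main( String[] args ) {
        System.out.println( tryRead( false ) );                // recovered: config missing
        System.out.println( outOfBounds( new int[3], 5 ) );    // true
    }
}
```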

4.5.6 Throwing Exceptions

We can throw our own exceptions: either instances of Exception or one of its existing subclasses, or our own specialized exception classes. All we have to do is create an instance of the Exception and throw it with the throw statement:

throw new Exception( );

Execution stops and is transferred to the nearest enclosing try/catch statement. (There is little point in keeping a reference to the Exception object we've created here.) An alternative constructor lets us specify a string with an error message:

throw new Exception("Something really bad happened");

You can retrieve this string by using the Exception object's getMessage() method. Often, though, you can just refer to the object itself; in the first example in the earlier section, Section 4.5.2, an exception's string value is automatically provided to the println() method.

By convention, all types of Exception have a String constructor like this. The earlier String message is somewhat facetious and vague. Normally you won't be throwing a plain old Exception but a more specific subclass. For example:

    public void checkRead( String s ) {
        if ( new File(s).isAbsolute( ) || (s.indexOf("..") != -1) )
            throw new SecurityException(
                "Access to file : "+ s +" denied.");
    }

In this code, we partially implement a method to check for an illegal path. If we find one, we throw a SecurityException, with some information about the transgression.

Of course, we could include whatever other information is useful in our own specialized subclasses of Exception. Often, though, just having a new type of exception is good enough because it's sufficient to help direct the flow of control. For example, if we are building a parser, we might want to make our own kind of exception to indicate a particular kind of failure:

    class ParseException extends Exception {
        ParseException( ) {
            super( );
        }
        ParseException( String desc ) {
            super( desc );
        }
    }

See Chapter 5 for a full description of classes and class constructors. The body of our Exception class here simply allows a ParseException to be created in the conventional ways we've created exceptions previously (either generically or with a simple string description). Now that we have our new exception type, we can guard like so:

    // Somewhere in our code
    ...
    try {
        parseStream( input );
    } catch ( ParseException pe ) {
        // Bad input...
    } catch ( IOException ioe ) {
        // Low-level communications problem
    }

As you can see, although our new exception doesn't currently hold any specialized information about the problem (it certainly could), it does let us distinguish a parse error from an arbitrary I/O error in the same chunk of code.
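To make the parser example concrete, here is a minimal, self-contained sketch. The text never defines parseStream(), so the version below, which accepts only decimal digits, is purely an assumption for illustration, as are the ParseDemo class and the describe() helper.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ParseDemo {
    static class ParseException extends Exception {
        ParseException( String desc ) {
            super( desc );
        }
    }

    // Hypothetical parser: reads a decimal number, rejecting anything but digits
    static int parseStream( Reader input ) throws ParseException, IOException {
        int value = 0;
        int c;
        while ( (c = input.read()) != -1 ) {
            if ( c < '0' || c > '9' )
                throw new ParseException( "unexpected character: " + (char) c );
            value = value * 10 + (c - '0');
        }
        return value;
    }

    // Separate handlers let us tell bad input apart from an I/O failure
    static String describe( String text ) {
        try {
            return "parsed " + parseStream( new StringReader( text ) );
        } catch ( ParseException pe ) {
            return "bad input: " + pe.getMessage();
        } catch ( IOException ioe ) {
            return "I/O problem: " + ioe.getMessage();
        }
    }

    public static void main( String[] args ) {
        System.out.println( describe( "123" ) );   // parsed 123
        System.out.println( describe( "12x" ) );   // bad input: unexpected character: x
    }
}
```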

4.5.6.1 Chaining exceptions

Sometimes you'll want to take some action based on an exception and then turn around and throw a new exception in its place. This is common when building frameworks, where low-level detailed exceptions are handled and represented by higher level exceptions that can be managed more easily. For example you might want to catch an IOException in a communication package, possibly perform some cleanup, and ultimately throw a higher level exception of your own, maybe something like LostServerConnection.

You can do this in the obvious way by simply catching the exception and then throwing a new one. But then you lose important information, including the stack trace of the original "causal" exception. To deal with this, you can use the technique of exception chaining. This means that you include the causal exception in the new exception that you throw. Java 1.4 adds explicit support for exception chaining. Base exception types can be constructed with an exception as an argument or the standard String message and an exception:

throw new Exception( "Here's the story...", causalException );

You can get access to this exception later with the getCause() method, which returns the causal exception. More importantly, Java automatically prints both exceptions and their respective stack traces if you print the exception or if it is shown to the user.

You can add this kind of constructor to your own exception subclasses as well (delegating to the parent constructor). However, since this is (at least formally) a recent addition to Java, many preexisting exception types do not provide this kind of constructor. You can still take advantage of this pattern by using the Throwable method initCause() to set the causal exception explicitly after constructing your exception and before throwing it:

    try {
        // ...
    } catch ( IOException cause ) {
        Exception e =
            new IOException("What we have here is a failure to communicate...");
        e.initCause( cause );
        throw e;
    }
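Putting the pieces together, the sketch below wraps a low-level IOException in a higher-level exception and later recovers the original with getCause(). LostServerConnection is named after the example mentioned earlier in this section; its definition, and the other names here, are our own assumptions.

```java
import java.io.IOException;

class ChainDemo {
    // Hypothetical higher-level exception with a chaining constructor
    static class LostServerConnection extends Exception {
        LostServerConnection( String message, Throwable cause ) {
            super( message, cause );   // delegate to the parent's chaining constructor
        }
    }

    // Stand-in for a low-level operation that fails
    static void lowLevelSend() throws IOException {
        throw new IOException( "socket closed" );
    }

    static void send() throws LostServerConnection {
        try {
            lowLevelSend();
        } catch ( IOException e ) {
            // Wrap the low-level exception, keeping it as the cause
            throw new LostServerConnection( "could not reach server", e );
        }
    }

    // Returns the causal exception recorded in the higher-level one
    static Throwable causeOfFailure() {
        try {
            send();
            return null;
        } catch ( LostServerConnection e ) {
            return e.getCause();
        }
    }

    public static void main( String[] args ) {
        System.out.println( causeOfFailure().getMessage() );   // socket closed
    }
}
```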

4.5.7 try Creep

The try statement imposes a condition on the statements that it guards. It says that if an exception occurs within it, the remaining statements are abandoned. This has consequences for local variable initialization. If the compiler can't determine whether a local variable assignment we placed inside a try/catch block will happen, it won't let us use the variable:

    void myMethod( ) {
        int foo;

        try {
            foo = getResults( );
        }
        catch ( Exception e ) {
            ...
        }

        int bar = foo;  // Compile-time error -- foo may not have been initialized
    }

In this example, we can't use foo in the indicated place because there's a chance it was never assigned a value. One obvious option is to move the assignment to bar inside the try statement:

    try {
        foo = getResults( );

        int bar = foo;  // Okay because we get here only
                        // if previous assignment succeeds
    }
    catch ( Exception e ) {
        ...
    }

Sometimes this works just fine. However, now we have the same problem if we want to use bar later in myMethod(). If we're not careful, we might end up pulling everything into the try statement. The situation changes if we transfer control out of the method in the catch clause:

    try {
        foo = getResults( );
    }
    catch ( Exception e ) {
        ...
        return;
    }

    int bar = foo;  // Okay because we get here only
                    // if previous assignment succeeds

The compiler is smart enough to know that if an error had occurred in the try clause we wouldn't have reached the bar assignment. Your code will dictate its own needs; you should just be aware of the options.

4.5.8 The finally Clause

What if we have some cleanup to do before we exit our method from one of the catch clauses? To avoid duplicating the code in each catch branch and to make the cleanup more explicit, use the finally clause. A finally clause can be added after a try and any associated catch clauses. Any statements in the body of the finally clause are guaranteed to be executed, no matter why control leaves the try body (whether an exception was thrown or not):

    try {
        // Do something here
    }
    catch ( FileNotFoundException e ) {
        ...
    }
    catch ( IOException e ) {
        ...
    }
    catch ( Exception e ) {
        ...
    }
    finally {
        // Clean up here
    }

In this example, the statements at the cleanup point are executed eventually, no matter how control leaves the try. If control transfers to one of the catch clauses, the statements in finally are executed after the catch completes. If none of the catch clauses handles the exception, the finally statements are executed before the exception propagates to the next level.

If the statements in the try execute cleanly, or if we perform a return, break, or continue, the statements in the finally clause are executed. To perform cleanup operations, we can even use try and finally without any catch clauses:

    try {
        // Do something here
        return;
    }
    finally {
        System.out.println("Whoo-hoo!");
    }

Exceptions that occur in a catch or finally clause are handled normally; the search for an enclosing try/catch begins outside the offending try statement.
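To confirm the order in which things happen, the following sketch records each step in a list. FinallyDemo and run() are hypothetical names; the point is simply that "finally" is logged on both the normal-return path and the exception path.

```java
import java.util.ArrayList;
import java.util.List;

class FinallyDemo {
    static List<String> run( boolean fail ) {
        List<String> log = new ArrayList<>();
        try {
            log.add( "try" );
            if ( fail )
                throw new IllegalStateException( "oops" );
            return log;               // the finally clause still runs before we return
        }
        catch ( IllegalStateException e ) {
            log.add( "catch" );
        }
        finally {
            log.add( "finally" );
        }
        return log;
    }

    public static void main( String[] args ) {
        System.out.println( run( false ) );   // [try, finally]
        System.out.println( run( true ) );    // [try, catch, finally]
    }
}
```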

4.5.9 Performance Issues

We mentioned at the beginning of this section that there are methods in the core Java APIs that still return "out of bounds" values such as -1 or null instead of throwing Exceptions. Why is this? Well, for some it is simply a matter of convenience; where a special value is easily discernible, we may not want to have to wrap those methods in try/catch blocks.

But there is also a performance issue. Because of the way the Java virtual machine is implemented, guarding against an exception being thrown (using a try) is free. It doesn't add any overhead to the execution of your code. However, throwing an exception is not free. When an exception is thrown, Java has to locate the appropriate try/catch block and perform other time-consuming activities at runtime.

The result is that you should throw exceptions only in truly "exceptional" circumstances and try to avoid using them for expected conditions, especially when performance is an issue. For example, if you have a loop, it may be better to perform a small test on each pass and avoid throwing the exception, rather than throwing it frequently. On the other hand, if the exception is thrown only once in a gazillion times, you may want to eliminate the overhead of the test code and not worry about the cost of throwing that exception.
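The two approaches might look like this for an array lookup. This is only a sketch with hypothetical names; it makes no claim about actual timings, which depend on the virtual machine and on how often the index is bad.

```java
class LookupDemo {
    // Test first: a small check on every call, but no exception is ever thrown
    static int getOrDefault( int[] data, int index, int fallback ) {
        if ( index < 0 || index >= data.length )
            return fallback;
        return data[index];
    }

    // Rely on the exception: no explicit test in our code, but each miss pays for a throw
    static int getOrDefaultByCatch( int[] data, int index, int fallback ) {
        try {
            return data[index];
        } catch ( ArrayIndexOutOfBoundsException e ) {
            return fallback;
        }
    }

    public static void main( String[] args ) {
        int[] data = { 10, 20, 30 };
        System.out.println( getOrDefault( data, 1, -1 ) );          // 20
        System.out.println( getOrDefaultByCatch( data, 7, -1 ) );   // -1
    }
}
```

Both methods return the same results; they differ only in where the cost falls. If out-of-range indexes are routine, the first form avoids repeated throws; if they are vanishingly rare, the second avoids the explicit test.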

4.6 Assertions

An assertion is a simple pass/fail test of some condition, performed while your application is running. Assertions can be used to check the "sanity" of your code, anywhere you believe certain conditions are guaranteed by correct program behavior. They are distinct from other kinds of tests because they check conditions that should never be violated: if the assertion fails, the application is to be considered broken and generally halts with an appropriate error message. Assertions are supported directly by the Java language so that they can be turned on or off at runtime to remove any performance penalty of including them in your code.

Using assertions to test for the correct behavior of your application is a simple but powerful technique for ensuring software quality. It fills a gap between those aspects of software that can be checked automatically by the compiler and those more generally checked by "unit tests" and human testing. Assertions test assumptions about program behavior and make them guarantees (at least while they are activated).

Explicit support for assertions was added in Java 1.4. However, if you've written much code in any language, you have probably used assertions in some form. For example, you may have written something like the following:

    if ( !condition )
        throw new AssertionError("fatal error: 42");

An assertion in Java is equivalent to this example but performed with the assert language keyword. It takes a boolean condition and an optional expression value. If the assertion fails, an AssertionError is thrown, which usually causes Java to bail out of the application.

The optional expression may evaluate to either a primitive or object type. Either way, its sole purpose is to be turned into a string and shown to the user if the assertion fails; most often you'll use a string message explicitly. Here are some examples:

    assert false;
    assert ( array.length > min );
    assert a > 0 : a;
    assert foo != null : "foo is null!";

In the event of failure, the first two assertions print only a generic message, whereas the third prints the value of a and the last prints the foo is null! message.

Again, the important thing about assertions is not just that they are more terse than the equivalent if condition but that they can be enabled or disabled when you run the application. Disabling assertions means that their test conditions are not even evaluated, so there is no performance penalty for including them in your code (other than, perhaps, space in the class files when they are loaded).

Assertions are supported only in Java 1.4 and later and (for now) require passing a special switch to the compiler so it recognizes the assert keyword. So to use the assert examples, you'll have to compile using the -source compiler switch and specify 1.4 as the language version. For example:

% javac -source 1.4 MyApplication.java

Assertions were implemented this way to provide some migration time for existing applications with their own methods named assert, which will now be illegal. In some future release, assertions will be recognized by default.

4.6.1 Enabling and Disabling Assertions

Assertions are turned on or off at runtime. When disabled, assertions still exist in the class files but are not executed and consume no time. You can enable and disable assertions for an entire application or on a package-by-package or even class-by-class basis. By default, assertions are turned off in Java 1.4. To enable them for your code, use the Java flag -ea or -enableassertions:

% java -ea MyApplication

To turn on assertions for a particular class, append the class name like so:

% java -ea:com.oreilly.examples.Myclass MyApplication

To turn on assertions just for particular packages, append the package name with a trailing ellipsis (three dots) like so:

% java -ea:com.oreilly.examples... MyApplication

When you enable assertions for a package, Java also enables them for all subordinate packages (e.g., com.oreilly.examples.text). However, you can become more selective by using the corresponding -da or -disableassertions flag to negate individual packages or classes. You can combine all this to achieve arbitrary groupings like this:

% java -ea:com.oreilly.examples... -da:com.oreilly.examples.text \
      -ea:com.oreilly.examples.text.MonkeyTypewriters MyApplication

This example enables assertions for the com.oreilly.examples package as a whole, excludes the package com.oreilly.examples.text, then turns assertions on for just one class, MonkeyTypewriters, in that package.
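When experimenting with these flags, it can help to check from inside a class whether assertions are actually enabled for it. One common idiom, sketched here with a hypothetical class name, relies on the fact that an assert condition is evaluated only when assertions are on:

```java
class AssertStatusDemo {
    static boolean assertionsEnabled() {
        boolean enabled = false;
        // The assignment inside the condition happens only when assertions are enabled
        assert enabled = true;
        return enabled;
    }

    public static void main( String[] args ) {
        System.out.println( "assertions enabled: " + assertionsEnabled() );
    }
}
```

Run with java -ea AssertStatusDemo, this reports true; run without the flag, it reports false.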

4.6.2 Using Assertions

An assertion enforces a rule about something that should be unchanging in your code and would otherwise go unchecked. You can use an assertion for added safety anywhere you want to verify your assumptions about program behavior that can't be checked by the compiler.

A common situation that cries out for an assertion is testing for multiple conditions or values where one should always be found. In this case, a failing assertion as the default or "fall through" behavior indicates the code is broken. For example, suppose we have a value called direction that should always contain either the constant value LEFT or RIGHT:

    if ( direction == LEFT )
        doLeft(  );
    else if ( direction == RIGHT )
        doRight(  );
    else
        assert false : "bad direction";

The same applies to the default case of a switch:

    switch ( direction ) {
        case LEFT:
            doLeft(  );
            break;
        case RIGHT:
            doRight(  );
            break;
        default:
            assert false;
    }

In general, you should not use assertions for checking the validity of arguments to methods because you want that behavior to be part of your application, not just a test for quality control that can be turned off. The conditions that the input to a method must satisfy are called its pre-conditions, and you should usually throw an exception if they are not met; this elevates the pre-conditions to part of the method's "contract" with the user. However, checking the results of your methods before returning them is always valid; the conditions those results must satisfy are called post-conditions.

Sometimes determining what is or is not a pre-condition depends on your point of view. For example, when a method is used internally within a class, pre-conditions may already be guaranteed by the methods that call it. Public methods of the class should probably throw exceptions when their pre-conditions are violated, but a private method might use assertions because its callers are always closely related code that should obey the correct behavior.
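As a sketch of this division of labor, consider the hypothetical class below (AccountDemo and all of its members are our own invention). The public method enforces its pre-conditions with an exception, while the private helper, which is called only from code we control, uses assertions for its pre- and post-conditions.

```java
class AccountDemo {
    private int balance;

    // Public API: a violated pre-condition is part of the contract, so throw
    public void deposit( int amount ) {
        if ( amount <= 0 )
            throw new IllegalArgumentException( "bad amount: " + amount );
        balance += amount;
    }

    public void withdraw( int amount ) {
        if ( amount <= 0 || amount > balance )
            throw new IllegalArgumentException( "bad amount: " + amount );
        debit( amount );
    }

    // Private helper: callers are our own code, so assertions document our assumptions
    private void debit( int amount ) {
        assert amount > 0 && amount <= balance : "debit pre-condition violated";
        balance -= amount;
        assert balance >= 0 : "negative balance";   // post-condition
    }

    public int getBalance() {
        return balance;
    }
}
```

Turning assertions off changes nothing about how AccountDemo treats its callers: bad arguments to withdraw() are still rejected with an exception.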

Finally, note that assertions can not only test simple expressions but perform complex validation as well. Remember that anything you place in the condition expression of an assert statement is not evaluated when assertions are turned off. You can make helper methods for your assertions, containing arbitrary amounts of code. And, although it suggests a dangerous programming style, you can even use assertions that have side effects to capture values for use by later assertions, all of which will be disabled when assertions are turned off. For example:

    int savedValue = -1;   // initialized so that the later use compiles
    assert ( savedValue = getValue( ) ) != -1;
    // Do work...
    assert checkValue( savedValue );

Here, in the first assert, we use helper method getValue() to retrieve some information and save it for later. Then after doing some work, we check the saved value using another assertion, perhaps comparing results. When assertions are disabled we'll no longer save or check the data. Note that it's necessary for us to be somewhat cute and make our first assert condition into a boolean by checking for a known value. Again, using assertions with side effects is a bit dangerous because you have to be careful that those side effects are only seen by other assertions. Otherwise, you'll be changing your application behavior when you turn them off.

4.7 Arrays

An array is a special type of object that can hold an ordered collection of elements. The type of the elements of the array is called the base type of the array; the number of elements it holds is a fixed attribute called its length. Java supports arrays of all primitive and reference types.

The basic syntax of arrays looks much like that of C or C++. We create an array of a specified length and access the elements with the index operator, []. Unlike arrays in those languages, however, arrays in Java are true, first-class objects. An array is an instance of a special Java array class and has a corresponding type in the type system. This means that to use an array, as with any other object, we first declare a variable of the appropriate type and then use the new operator to create an instance of it.

Array objects differ from other objects in Java in three respects:

Java implicitly creates a special array class type for us whenever we declare an array type variable. It's not strictly necessary to know about this process in order to use arrays, but it helps in understanding their structure and their relationship to other objects in Java.
Java lets us use the [] operator to access array elements, so that arrays look as we expect. We could implement our own classes that act like arrays, but we would have to settle for having methods such as get() and put() instead of using the special [] notation.
Java provides a corresponding special form of the new operator that lets us construct an instance of an array and specify its length with the [] notation or initialize it from a structured list of values.
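
As a small peek at the first point, the sketch below asks an array object about its own class at runtime. The class and method names (ArrayClassPeek, baseTypeName) are made up for this example; getClass(), isArray(), and getComponentType() are standard methods of Object and Class.

```java
class ArrayClassPeek {
    // Returns the name of the element type if obj is an array, or null otherwise.
    static String baseTypeName(Object obj) {
        if (!obj.getClass().isArray())
            return null;
        return obj.getClass().getComponentType().getName();
    }

    public static void main(String[] args) {
        // Arrays can be passed wherever an Object is expected.
        System.out.println(baseTypeName(new int[3]));       // int
        System.out.println(baseTypeName(new String[2]));    // java.lang.String
        System.out.println(baseTypeName("not an array"));   // null
    }
}
```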

4.7.1 Array Types

An array-type variable is denoted by a base type followed by the empty brackets, []. Alternatively, Java accepts a C-style declaration, with the brackets placed after the array name.

The following are equivalent:

    int [] arrayOfInts;
    int arrayOfInts [];

In each case, arrayOfInts is declared as an array of integers. The size of the array is not yet an issue, because we are declaring only the array-type variable. We have not yet created an actual instance of the array class, with its associated storage. It's not even possible to specify the length of an array when declaring an array-type variable.

An array of objects can be created in the same way:

    String [] someStrings;
    Button someButtons [];

4.7.2 Array Creation and Initialization

The new operator is used to create an instance of an array. After the new operator, we specify the base type of the array and its length, with a bracketed integer expression:

    arrayOfInts = new int [42];
    someStrings = new String [ number + 2 ];

We can, of course, combine the steps of declaring and allocating the array:

    double [] someNumbers = new double [20];
    Component widgets [] = new Component [12];

As in C, array indices start with zero. Thus, the index of the first element of someNumbers[] is 0, and the index of the last element is 19. After creation, the array elements are initialized to the default values for their type. For numeric types, this means the elements are initially zero:

    int [] grades = new int [30];
    grades[0] = 99;
    grades[1] = 72;
    // grades[2] == 0

The elements of an array of objects are references to the objects, not actual instances of the objects. The default value of each element is therefore null, until we assign instances of appropriate objects:

    String names [] = new String [4];
    names [0] = new String( );
    names [1] = "Boofa";
    names [2] = someObject.toString( );
    // names[3] == null

This is an important distinction that can cause confusion. In many other languages, the act of creating an array is the same as allocating storage for its elements. In Java, a newly allocated array of objects actually contains only reference variables, each with the value null.^[5] That's not to say that there is no memory associated with an empty array; there is memory needed to hold those references (the empty "slots" in the array). Figure 4-4 illustrates the names array of the previous example.

Figure 4-4. A Java array

figs/lj2.0404.gif

names is a variable of type String[] (i.e., a string array). This particular String[] object contains four String type variables. We have assigned String objects to the first three array elements. The fourth has the default value null.
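
The difference between allocating the array and filling its slots can be checked directly. The following is a minimal sketch; the helper countEmptySlots() is a hypothetical name introduced here, not something Java provides.

```java
class ArrayDefaults {
    // Counts the slots that still hold the default value null.
    static int countEmptySlots(Object[] array) {
        int empty = 0;
        for (int i = 0; i < array.length; i++)
            if (array[i] == null)
                empty++;
        return empty;
    }

    public static void main(String[] args) {
        String[] names = new String[4];
        System.out.println(countEmptySlots(names));   // 4: only references exist so far

        names[0] = new String();
        names[1] = "Boofa";
        System.out.println(countEmptySlots(names));   // 2

        int[] grades = new int[30];
        System.out.println(grades[2]);                // 0: numeric default
    }
}
```

Calling a method through an empty slot, such as names[3].length(), would fail at runtime with a NullPointerException, because there is no String object there yet.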

Java supports the C-style curly braces {} construct for creating an array and initializing its elements:

    int [] primes = { 2, 3, 5, 7, 7+4 };    // primes[2] == 5

An array object of the proper type and length is implicitly created, and the values of the comma-separated list of expressions are assigned to its elements.

We can use the {} syntax with an array of objects. In this case, each expression must evaluate to an object that can be assigned to a variable of the base type of the array, or the value null. Here are some examples:

    String [] verbs = { "run", "jump", someWord.toString( ) };
    Button [] controls = { stopButton, new Button("Forwards"),
        new Button("Backwards") };
    // All types are subtypes of Object
    Object [] objects = { stopButton, "A word", null };

The following are equivalent:

    Button [] threeButtons = new Button [3];
    Button [] threeButtons = { null, null, null };

4.7.3 Using Arrays

The size of an array object is available in the public variable length:

    char [] alphabet = new char [26];
    int alphaLen = alphabet.length;             // alphaLen == 26

    String [] musketeers = { "one", "two", "three" };
    int num = musketeers.length;                // num == 3

length is the only accessible field of an array; it is a variable, not a method. (Don't worry, the compiler tells you when you accidentally use parentheses, as if it were a method; everyone does now and then.)

Array access in Java is just like array access in C; you access an element by putting an integer-valued expression between brackets after the name of the array. The following example creates an array of Button objects called keyPad and then fills the array with Button objects:

    Button [] keyPad = new Button [ 10 ];
    for ( int i=0; i < keyPad.length; i++ )
        keyPad[ i ] = new Button( Integer.toString( i ) );

Attempting to access an element that is outside the range of the array generates an ArrayIndexOutOfBoundsException. This is a type of RuntimeException, so you can either catch and handle it yourself, if you really expect it, or ignore it, as we've already discussed:

    String [] states = new String [50];

    try {
        states[0] = "California";
        states[1] = "Oregon";
        ...
        states[50] = "McDonald's Land";  // Error: array out of bounds
    }
    catch ( ArrayIndexOutOfBoundsException err ) {
        System.out.println( "Handled error: " + err.getMessage( ) );
    }
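
When an out-of-range index is an expected possibility rather than a bug, testing the index against length is usually cleaner than catching the exception. Here is one way that might look; getOrDefault() is a hypothetical helper written for this example.

```java
class BoundsDemo {
    // Returns the element at index, or fallback if index is out of range.
    static String getOrDefault(String[] array, int index, String fallback) {
        if (index < 0 || index >= array.length)
            return fallback;
        return array[index];
    }

    public static void main(String[] args) {
        String[] states = { "California", "Oregon" };
        System.out.println(getOrDefault(states, 1, "unknown"));   // Oregon
        System.out.println(getOrDefault(states, 2, "unknown"));   // unknown

        try {
            states[2] = "Washington";     // valid indices are 0 and 1
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught out-of-bounds access");
        }
    }
}
```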

It's a common task to copy a range of elements from one array into another. Java supplies the arraycopy() method for this purpose; it's a utility method of the System class:

System.arraycopy(source,sourceStart,destination,destStart,length);

The following example doubles the size of the names array from an earlier example:

    String [] tmpVar = new String [ 2 * names.length ];
    System.arraycopy( names, 0, tmpVar, 0, names.length );
    names = tmpVar;

A new array, twice the size of names, is allocated and assigned to a temporary variable tmpVar. The arraycopy() method is then used to copy the elements of names to the new array. Finally, the new array is assigned to names. If there are no remaining references to the old array object after names has been reassigned, the old array becomes eligible for garbage collection.
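
The same idea can be wrapped in reusable methods. The sketch below uses only System.arraycopy(); the class ArrayGrowth and its methods doubled() and removeAt() are hypothetical names invented for this illustration. The second method relies on the documented behavior that arraycopy() copies correctly even when source and destination are the same array and the ranges overlap.

```java
class ArrayGrowth {
    // Returns a new array twice as long, holding the same elements up front.
    static String[] doubled(String[] original) {
        String[] bigger = new String[2 * original.length];
        System.arraycopy(original, 0, bigger, 0, original.length);
        return bigger;
    }

    // Removes the element at index by shifting later elements left one slot.
    // The last slot is cleared so that no stale reference lingers there.
    static void removeAt(String[] array, int index) {
        System.arraycopy(array, index + 1, array, index, array.length - index - 1);
        array[array.length - 1] = null;
    }

    public static void main(String[] args) {
        String[] names = { "a", "b", "c" };
        names = doubled(names);
        System.out.println(names.length);   // 6
        System.out.println(names[3]);       // null

        removeAt(names, 0);
        System.out.println(names[0]);       // b
    }
}
```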

4.7.4 Anonymous Arrays

Often it is convenient to create "throw-away" arrays, arrays that are used in one place and never referenced anywhere else. Such arrays don't need to have a name because you never need to refer to them again in that context. For example, you may want to create a collection of objects to pass as an argument to some method. It's easy enough to create a normal, named array; but if you don't actually work with the array (if you use the array only as a holder for some collection), you shouldn't have to. Java makes it easy to create "anonymous" (i.e., unnamed) arrays.

Let's say you need to call a method named setPets(), which takes an array of Animal objects as its argument. Provided Cat and Dog are subclasses of Animal, here's how to call setPets() using an anonymous array:

    Dog pokey = new Dog ("gray");
    Cat boojum = new Cat ("grey");
    Cat simon = new Cat ("orange");
    setPets ( new Animal [] { pokey, boojum, simon });

The syntax looks just like the initialization of an array in a variable declaration. We implicitly define the size of the array and fill in its elements using the curly-brace notation. However, since this is not a variable declaration, we have to explicitly use the new operator to create the array object.

You can use anonymous arrays to simulate variable-length argument lists (called varargs in C), a feature of many programming languages that Java did not provide before Java 5.0. The advantage of anonymous arrays over C-style variable-length argument lists is that the former allow stricter type checking; the compiler always knows exactly what arguments are expected, and therefore it can verify that method calls are correct.
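
A method that accepts "any number" of values through a single array parameter could be sketched as follows. The class AnonymousArrays and its join() method are hypothetical; StringBuilder assumes Java 5.0 or later (StringBuffer works the same way on older versions).

```java
class AnonymousArrays {
    // Accepts any number of words packed into a single array argument
    // and returns them separated by single spaces.
    static String join(String[] words) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0)
                result.append(' ');
            result.append(words[i]);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // Each call builds its array on the spot; no named variable is needed.
        System.out.println(join(new String[] { "one", "two", "three" }));  // one two three
        System.out.println(join(new String[] { "solo" }));                 // solo
    }
}
```

Because the parameter type is String[], the compiler rejects a call such as join(new Object[] { "one" }), which is the type checking benefit described above.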

4.7.5 Multidimensional Arrays

Java supports multidimensional arrays in the form of arrays of array type objects. You create a multidimensional array with C-like syntax, using multiple bracket pairs, one for each dimension. You also use this syntax to access elements at various positions within the array. Here's an example of a multidimensional array that represents a chess board:

    ChessPiece [][] chessBoard;
    chessBoard = new ChessPiece [8][8];
    chessBoard[0][0] = new ChessPiece( "Rook" );
    chessBoard[1][0] = new ChessPiece( "Pawn" );
    ...

Here chessBoard is declared as a variable of type ChessPiece[][] (i.e., an array of ChessPiece arrays). This declaration implicitly creates the type ChessPiece[] as well. The example illustrates the special form of the new operator used to create a multidimensional array. It creates an array of ChessPiece[] objects and then, in turn, makes each element into an array of ChessPiece objects. We then index chessBoard to specify values for particular ChessPiece elements. (We'll neglect the color of the pieces here.)

Of course, you can create arrays with more than two dimensions. Here's a slightly impractical example:

    Color [][][] rgbCube = new Color [256][256][256];
    rgbCube[0][0][0] = Color.black;
    rgbCube[255][255][0] = Color.yellow;
    ...

As in C, we can specify a partial index of a multidimensional array to get an array-type object with fewer dimensions. In our example, the variable chessBoard is of type ChessPiece[][]. The expression chessBoard[0] is valid and refers to the first element of chessBoard, which, in Java, is of type ChessPiece[]. For example, we can create a row for our chess board:

    ChessPiece [] startRow =  {
        new ChessPiece("Rook"), new ChessPiece("Knight"),
        new ChessPiece("Bishop"), new ChessPiece("King"),
        new ChessPiece("Queen"), new ChessPiece("Bishop"),
        new ChessPiece("Knight"), new ChessPiece("Rook")
    };

    chessBoard[0] = startRow;
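
Because a partial index yields an ordinary array object, a row can be handed to any method that expects a one-dimensional array, and changes made through that reference show up in the enclosing array. The following sketch uses int arrays to stay self-contained; SharedRows and fillRow() are hypothetical names.

```java
class SharedRows {
    // Works on any one-dimensional int array, including a row of a 2-D array.
    static void fillRow(int[] row, int value) {
        for (int i = 0; i < row.length; i++)
            row[i] = value;
    }

    public static void main(String[] args) {
        int[][] grid = new int[2][3];

        fillRow(grid[1], 5);              // grid[1] is of type int[]
        System.out.println(grid[1][2]);   // 5
        System.out.println(grid[0][2]);   // 0: the other row is untouched

        int[] alias = grid[0];            // same array object, not a copy
        alias[0] = 7;
        System.out.println(grid[0][0]);   // 7
    }
}
```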

We don't necessarily have to specify the dimension sizes of a multidimensional array with a single new operation. The syntax of the new operator lets us leave the sizes of some dimensions unspecified. The size of at least the first dimension (the most significant dimension of the array) has to be specified, but the sizes of any number of trailing, less significant array dimensions may be left undefined. We can assign appropriate array-type values later.

We can create a checkerboard of boolean values (which is not quite sufficient for a real game of checkers) using this technique:

    boolean [][] checkerBoard;
    checkerBoard = new boolean [8][];

Here, checkerBoard is declared and created, but its elements, the eight boolean[] objects of the next level, are left empty. Thus, for example, checkerBoard[0] is null until we explicitly create an array and assign it, as follows:

    checkerBoard[0] = new boolean [8];
    checkerBoard[1] = new boolean [8];
    ...
    checkerBoard[7] = new boolean [8];

The code of the previous two examples is equivalent to:

boolean [][] checkerBoard = new boolean [8][8];

One reason we might want to leave dimensions of an array unspecified is so that we can store arrays given to us by another method.

Note that since the length of the array is not part of its type, the arrays in the checkerboard do not necessarily have to be of the same length. That is, multidimensional arrays don't have to be rectangular. Here's a defective (but perfectly legal, to Java) checkerboard:

    checkerBoard[2] = new boolean [3];
    checkerBoard[3] = new boolean [10];

And here's how you could create and initialize a triangular array:

    int [][] triangle = new int [5][];
    for (int i = 0; i < triangle.length; i++) {
        triangle[i] = new int [i + 1];
        for (int j = 0; j < i + 1; j++)
            triangle[i][j] = i + j;
    }
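
A more familiar use of the same non-rectangular layout is Pascal's triangle, where row i naturally has i + 1 entries. This is only a sketch of the technique; the class and method names are invented for the example.

```java
class Pascal {
    // Builds the first 'rows' rows of Pascal's triangle as a jagged array.
    static int[][] pascal(int rows) {
        int[][] t = new int[rows][];          // only the first dimension is sized
        for (int i = 0; i < rows; i++) {
            t[i] = new int[i + 1];            // each row gets its own length
            t[i][0] = 1;
            t[i][i] = 1;
            for (int j = 1; j < i; j++)
                t[i][j] = t[i - 1][j - 1] + t[i - 1][j];
        }
        return t;
    }

    public static void main(String[] args) {
        int[][] t = pascal(5);
        System.out.println(t[4].length);   // 5
        System.out.println(t[4][2]);       // 6
    }
}
```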

4.7.6 Inside Arrays

We said earlier that arrays are instances of special array classes in the Java language. If arrays have classes, where do they fit into the class hierarchy and how are they related? These are good questions; however, we need to talk more about the object-oriented aspects of Java before answering them. That's the subject of the next chapter. For now, take it on faith that arrays fit into the class hierarchy.

[1]  For more information about Unicode, see http://www.unicode.org. Ironically, one of the scripts listed as "obsolete and archaic" and not currently supported by the Unicode standard is Javanese, a historical language of the people of the island of Java.
[2]  The comparable code in C++ would be:
[3]  The somewhat obscure setjmp( ) and longjmp( ) statements in C can save a point in the execution of code and later return to it unconditionally from a deeply buried location. In a limited sense, this is the functionality of exceptions in Java.
[4]  For example, the getHeight( ) method of the Image class returns -1 if the height isn't known yet. No error has occurred; the height will be available in the future. In this situation, throwing an exception would be inappropriate.
[5]  The analog in C or C++ is an array of pointers to objects. However, pointers in C or C++ are themselves fixed-size values (typically four or eight bytes on current platforms). Allocating an array of pointers is, in actuality, allocating the storage for some number of those pointer objects. An array of references is conceptually similar, although references are not themselves objects. We can't manipulate references or parts of references other than by assignment, and their storage requirements (or lack thereof) are not part of the high-level Java language specification.