1.4 Organization of the Java Virtual Machine

The JVM is divided into four conceptual data spaces:

Class area, where the code and constants are kept
Java stack, which keeps track of which methods have been called and the data associated with each method invocation
Heap, where objects are kept
Native method stacks, for supporting native methods

1.4.1 Class Area

The class area stores the classes that are loaded into the system. Method implementations are kept in a space called the method area, and constants are kept in a space called the constant pool. The class definitions also serve as templates for objects. Objects are stored in the heap (see section 1.4.3).

A class loader (see chapter 8) introduces new classes into the class area. When a class is no longer used, it may be garbage collected.

Classes have several properties:

Superclass
List of interfaces (possibly empty)
List of fields
List of methods and implementations, stored in the method area
List of constants, stored in the constant pool
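
Several of these properties can be inspected from a running program through reflection. The following is a minimal sketch, not part of the original example: the nested types borrow the names GamePlayer and ChessPlayer from this chapter, but the Player interface, the members, and the describe helper are assumptions made for illustration. The constant pool is not exposed through the standard reflection API, so it does not appear here.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

class ClassProperties {
    // Hypothetical types that borrow the chapter's class names; their members are invented.
    interface Player { String getPlayerMove(); }

    static abstract class GamePlayer implements Player {
        String name;
    }

    static class ChessPlayer extends GamePlayer {
        int color;
        public String getPlayerMove() { return "e2-e4"; }
    }

    // Summarizes the properties of a loaded class that reflection can see.
    static String describe(Class<?> c) {
        StringBuilder sb = new StringBuilder();
        Class<?> sup = c.getSuperclass();  // null for java.lang.Object and for interfaces
        sb.append("superclass: ").append(sup == null ? "(none)" : sup.getSimpleName()).append('\n');
        // getInterfaces() lists only the interfaces this class names directly, so the list may be empty.
        sb.append("interfaces: ").append(c.getInterfaces().length).append('\n');
        for (Field f : c.getDeclaredFields()) {
            sb.append("field: ").append(f.getType().getSimpleName())
              .append(' ').append(f.getName()).append('\n');
        }
        for (Method m : c.getDeclaredMethods()) {
            sb.append("method: ").append(m.getName()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(ChessPlayer.class));
    }
}
```

In this sketch ChessPlayer reports an empty interface list, because Player is implemented by its superclass rather than by ChessPlayer itself.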

All properties of classes are immutable. That is, there's no way to change any property of a class once it has been brought into the system. This makes the machine more stable, since you know that a method will have the same code each time it's invoked, and each object of a particular class has the same fields as every other object of that class.^[3]

^[3] If a class is no longer used, then it may be unloaded and reloaded at a later time. The new definition may be different. However, this does not cause inconsistencies between objects, because the original definition of the class would not be unloaded if instances of it still existed.

Fields come in two varieties: static and nonstatic. For nonstatic fields, there is a copy of the field in each object. For static fields, there is a single copy of the field for the entire class.
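
The difference between the two kinds of fields can be seen in a few lines of Java. This is an illustrative sketch; the Token class and its fields are hypothetical and do not appear elsewhere in this book.

```java
class FieldKinds {
    // Hypothetical class for illustration.
    static class Token {
        static int created;  // static: a single copy shared by the whole class
        int id;              // nonstatic: a separate copy in every Token object

        Token() {
            created++;       // updates the one shared counter
            id = created;    // stores a value in this object's own slot
        }
    }

    public static void main(String[] args) {
        Token a = new Token();
        Token b = new Token();
        // Each object keeps its own id, while both see the same created count.
        System.out.println(a.id + " " + b.id + " " + Token.created);  // prints "1 2 2"
    }
}
```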

See Figure 1.1 for a picture of a class area. It depicts two classes: GamePlayer and ChessPlayer. Each has fields and methods. Each field and method has a descriptor and a list of properties. The descriptor gives the type of value a field may hold, or the parameter types and return type of a method. The constant pool has been omitted from the figure. For more about the constant pool, see chapter 9.

Figure 1.1. Class area


For methods that don't have the abstract property, there is a method implementation. Method implementations are defined in terms of instructions, which are discussed in chapter 2.

1.4.2 Java Stack

Each time a method is invoked, a new data space called a stack frame is created. Collectively, the stack frames are called the Java stack. The stack frame on top of the stack is called the active stack frame.

Each stack frame has an operand stack, an array of local variables, and a pointer to the currently executing instruction. This instruction pointer is called the program counter (PC); it points into the method area, at the current instruction. Ordinarily, the program counter moves from one instruction to the subsequent instruction, but some instructions (like goto) cause the program counter to move to some other place within the method.
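
To make the three parts of a stack frame concrete, here is a toy model in Java. It is a sketch under loud assumptions: the opcodes, their numeric values, and their encoding are invented for this illustration and are not real JVM instructions (which are discussed in chapter 2). What it does show is a frame holding an operand stack, a local variable array, and a program counter that normally steps forward but jumps when a branch instruction executes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyFrame {
    // Invented opcodes for illustration only; these are NOT real JVM bytecodes.
    static final int PUSH = 0;   // PUSH n : push the constant n
    static final int LOAD = 1;   // LOAD i : push locals[i]
    static final int STORE = 2;  // STORE i: pop a value into locals[i]
    static final int ADD = 3;    // ADD    : pop two values, push their sum
    static final int GOTO = 4;   // GOTO t : move the program counter to offset t
    static final int IF_LT = 5;  // IF_LT t: pop b, pop a; jump to t if a < b
    static final int HALT = 6;   // HALT   : stop

    // Counts locals[0] up to 3. The comments give the offset of each instruction.
    static final int[] COUNT_TO_THREE = {
        LOAD, 0,    //  0
        PUSH, 3,    //  2
        IF_LT, 7,   //  4: continue the loop while locals[0] < 3
        HALT,       //  6
        LOAD, 0,    //  7
        PUSH, 1,    //  9
        ADD,        // 11
        STORE, 0,   // 12
        GOTO, 0     // 14: jump back to the test
    };

    final int[] code;
    final int[] locals;
    final Deque<Integer> operandStack = new ArrayDeque<>();
    int pc = 0;  // offset of the current instruction within code

    ToyFrame(int[] code, int maxLocals) {
        this.code = code;
        this.locals = new int[maxLocals];
    }

    void run() {
        while (true) {
            switch (code[pc]) {
                case PUSH:  operandStack.push(code[pc + 1]); pc += 2; break;
                case LOAD:  operandStack.push(locals[code[pc + 1]]); pc += 2; break;
                case STORE: locals[code[pc + 1]] = operandStack.pop(); pc += 2; break;
                case ADD:   operandStack.push(operandStack.pop() + operandStack.pop()); pc += 1; break;
                case GOTO:  pc = code[pc + 1]; break;
                case IF_LT: {
                    int b = operandStack.pop();
                    int a = operandStack.pop();
                    pc = (a < b) ? code[pc + 1] : pc + 2;
                    break;
                }
                case HALT:  return;
                default:    throw new IllegalStateException("unknown opcode at " + pc);
            }
        }
    }

    public static void main(String[] args) {
        ToyFrame frame = new ToyFrame(COUNT_TO_THREE, 1);
        frame.run();
        System.out.println("locals[0] = " + frame.locals[0] + ", pc = " + frame.pc);
    }
}
```

When the toy program halts, locals[0] holds 3, the operand stack is empty, and pc rests at offset 6, the HALT instruction.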

The top frame of the Java stack, the active frame, shows the current place of execution. Only the operand stack and local variable array in the active stack frame can be used. When a method is invoked, a new Java stack frame is created and becomes the top of the Java stack. The program counter is saved as part of the old Java stack frame. The new Java stack frame has its own program counter, which points to the beginning of the called method.

When the newly called method returns, the active stack frame disappears and the stack frame below it becomes the active frame again. The program counter is set to the instruction after the method call, and the calling method continues.
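
The growth and shrinkage of the Java stack can be observed from Java itself. The sketch below uses the standard Thread.getStackTrace method to list the methods that currently have frames on the stack; the class and method names (CallChain, frames, inner, outer) are hypothetical. The exact frames reported at the top and bottom of the list may vary between JVM implementations.

```java
import java.util.ArrayList;
import java.util.List;

class CallChain {
    // Returns the method names of the frames on the current thread's stack,
    // most recently invoked method first.
    static List<String> frames() {
        List<String> names = new ArrayList<>();
        for (StackTraceElement e : Thread.currentThread().getStackTrace()) {
            names.add(e.getMethodName());
        }
        return names;
    }

    // Each call below adds one frame on top of its caller's frame.
    static List<String> inner() { return frames(); }
    static List<String> outer() { return inner(); }

    public static void main(String[] args) {
        // While frames() runs, the frame for inner sits directly above the frame for outer.
        System.out.println(outer());
        // By now those frames are gone, and main's frame is the active one again.
    }
}
```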

Figure 1.2 shows a Java stack that has two stack frames. The first entry on the stack is at the bottom. It shows a call to the method main in the class GamePlayer. The program counter points to the instruction nine bytes from the beginning of the method.

Figure 1.2. Java stack and heap


Above that is the active stack frame. It belongs to a call to the method getPlayerMove in the class ChessPlayer, currently at instruction 17. It has two items on its operand stack and three slots in its local variable array. Two of those slots are uninitialized. Slot 0 contains a reference to an object in the heap. References are represented by arrows that point to the objects. The heap is discussed in more detail in section 1.4.3.

1.4.3 Heap

Objects are stored in the heap. Each object is associated with a class in the class area. Each object also has a number of slots for storing fields; there is one slot for each nonstatic field in the class, one for each nonstatic field in the superclass, and so on. An example is shown in Figure 1.3. This heap contains a chess player named Pooky. A real heap has thousands or millions of objects in it.

Figure 1.3. Heap


The chess player is an object of class ChessPlayer. The name field is a reference to an object whose class is java/lang/String. This string contains a field called data, which points to an array of characters. ([C is the descriptor for an array of characters. See section 2.5 for more about descriptors.) This array is five characters long, containing the letters P, o, o, k, and y.

Another field of ChessPlayer is color, which is an int. This doesn't point to another object. Instead, it just holds the number 1.
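
The rule that an object has one slot for each nonstatic field of its class and of every superclass can be checked with reflection. This is a sketch with assumptions: the text does not say which class declares name, so placing it in GamePlayer is a guess made for illustration, and the static field count and the instanceFields helper are invented.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class HeapSlots {
    // Hypothetical definitions; which class declares each field is an assumption.
    static class GamePlayer {
        static int count;  // static: lives with the class, not in each object
        String name;       // holds a reference to a String object on the heap
    }

    static class ChessPlayer extends GamePlayer {
        int color;         // holds the int value itself, not a reference
    }

    // Lists the nonstatic fields an instance of c carries, walking from c
    // up through its superclasses.
    static List<String> instanceFields(Class<?> c) {
        List<String> names = new ArrayList<>();
        for (Class<?> k = c; k != null; k = k.getSuperclass()) {
            for (Field f : k.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers()) && !f.isSynthetic()) {
                    names.add(f.getName());
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        ChessPlayer pooky = new ChessPlayer();
        pooky.name = "Pooky";
        pooky.color = 1;
        System.out.println(instanceFields(ChessPlayer.class));
    }
}
```

The list for ChessPlayer contains color and name but not count, since static fields are not stored in the object.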

1.4.4 Native Method Stacks

Native methods are used like other JVM methods, except that instead of being implemented using JVM instructions, they are implemented in some other language. They allow the programmer to handle situations that cannot be handled completely in Java, such as interfacing to platform-dependent features or integrating legacy code written in other languages.

When native methods are compiled into machine code, they usually use a stack to keep track of their state. The JVM provides a native method stack that the native methods can use. Native method stacks are often called "C stacks," because the most common way to implement native methods is to write them in C and compile them into native code.

Native methods do not exist on all JVM implementations, and different implementations have different standards for them, so they are not always portable. A common standard, the Java Native Interface (JNI), is often but not always available. This book focuses on JVM issues, so native methods are largely ignored. For more information about implementing native methods, consult the JNI documentation and the documentation for your particular JVM implementation.
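
From the Java side, a native method is only a declaration. The sketch below assumes a hypothetical method named beep for which no native library is supplied. It shows that the method carries the native property, and that invoking it without an implementation fails with UnsatisfiedLinkError.

```java
import java.lang.reflect.Modifier;

class NativeDemo {
    // Declared here, implemented elsewhere. The name "beep" is hypothetical,
    // and no native library for it is provided in this sketch.
    static native void beep(int durationMillis);

    // Reports whether the named method of this class has the native property.
    static boolean isDeclaredNative(String methodName, Class<?>... params) {
        try {
            int mods = NativeDemo.class.getDeclaredMethod(methodName, params).getModifiers();
            return Modifier.isNative(mods);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("beep is native: " + isDeclaredNative("beep", int.class));
        try {
            beep(100);  // no implementation has been loaded, so this cannot be linked
        } catch (UnsatisfiedLinkError e) {
            System.out.println("no native implementation available");
        }
    }
}
```

In a real program, the implementation would be made available before the first call, typically by loading a library with System.loadLibrary.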

1.4.5 Example

This example revisits the Hello, world code shown earlier and shows in more detail how it uses the Java stack, method area, and heap.

Figure 1.4 shows the state of the system after executing getstatic and ldc. There is one frame on the Java stack. The top of the operand stack holds a reference to the Hello, world string, and the next stack slot holds a reference to the object referred to by System.out. There is one local variable, which points to the argument array. The program counter points to the invokevirtual instruction.
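
For readers who think in Java source, the program under discussion corresponds roughly to the class below. The comments sketch the instructions a typical Java compiler emits for the single statement; the exact listing depends on the compiler and on the notation of the disassembler used, so treat it as an approximation rather than the earlier listing itself.

```java
class HelloWorld {
    public static void main(String[] args) {
        // Approximate instructions for the statement below:
        //   getstatic      System.out                 (pushes a reference to a PrintStream)
        //   ldc            "Hello, world"             (pushes a reference to the string)
        //   invokevirtual  PrintStream.println(String) (pops both, calls the method)
        //   return
        System.out.println("Hello, world");
    }
}
```

The argument array args occupies local variable 0 of the frame for main, which is the one local variable mentioned above.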

Figure 1.4. Before executing `invokevirtual`


Figure 1.5 shows what happens when the invokevirtual instruction is executed. The operands to the instruction are popped off the operand stack. A new Java stack frame is created. The old stack frame is now inactive. It cannot be changed again until it becomes the active stack frame, which happens only when the newly called method terminates. (In the diagram, the reference lines from the old stack frame are dotted to make them easier to follow.)

Figure 1.5. While executing `invokevirtual`


The new stack frame points to the first instruction of the println method. Notice that the first two entries in the local variable array are initialized to the parameters of the method call: the object on which println was invoked and the string to print. Additional local variables may be present, depending on how the method is implemented. These are uninitialized. The new operand stack is empty.

The JVM now executes the instructions in println until println returns. Figure 1.6 shows what happens after the call to println returns. The new stack frame has been removed, so the previous stack frame is now the active stack frame. The parameters to the method have been removed from the operand stack, so the stack is now empty. The program counter has been moved to the next instruction. The JVM will continue with that instruction. Notice that two of the objects no longer have lines pointing to them. The storage for these objects may be reclaimed by the JVM and used for new objects. This is called garbage collection, and it is discussed in section 1.5.

Figure 1.6. After executing `invokevirtual`
