The official arbiter of what constitutes a JVM is The Java Virtual Machine Specification, by Tim Lindholm and Frank Yellin. The specification defines three things: an instruction set, the class file format, and a verification algorithm.
1.2.1 Instruction Set
The executable programs running in the JVM are expressed in terms of instructions called bytecodes, resembling the machine code of most computer architectures. The JVM instruction set is designed around a stack-based architecture with special object-oriented instructions.
The instructions are stored in the class file in a binary format that is easy for a computer to read but unintelligible to human beings. Throughout this book we use a language called Oolong. The format of an Oolong program is nearly equivalent to a class file, but words and numbers are used in place of the binary values.
Here is a segment of Oolong code to compute 2 + 3:
    bipush 2    ; Push the number 2
    bipush 3    ; Push the number 3
    iadd        ; Add them together
Initially, the operand stack is empty.
Each bipush instruction pushes its argument onto the operand stack. After the first two instructions, the operand stack holds 2 and 3, with 3 on top.
The iadd instruction expects to find two numbers on the operand stack. It pops these numbers off, replacing them with their sum, so the stack now holds the single value 5.
This is the result we were looking for.
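The same push, push, add sequence can be mimicked in ordinary Java. This is only a sketch of the stack discipline, not of how a JVM is implemented; the class name OperandStackDemo and the method addTwoAndThree are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class OperandStackDemo {
    // Simulates: bipush 2, bipush 3, iadd
    static int addTwoAndThree() {
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack, initially empty
        stack.push(2);                // bipush 2
        stack.push(3);                // bipush 3
        int right = stack.pop();      // iadd pops the top value (3)
        int left = stack.pop();       // then the value beneath it (2)
        stack.push(left + right);     // and pushes their sum
        return stack.peek();          // the only value left on the stack
    }

    public static void main(String[] args) {
        System.out.println(addTwoAndThree()); // prints 5
    }
}
```

Note that iadd takes no operands of its own: everything it needs is already on the stack, which is what "stack-based architecture" means in practice.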
One thing that distinguishes the JVM instruction set from the instruction sets of most CPUs is the way the JVM works with memory. Most computers view memory as a vast array of bytes. If you want to build an object, you allocate a collection of contiguous bytes. Different locations within this collection of bytes are the different parts of the object's state. To call a function, you jump to the location in memory where that function is located.
The JVM doesn't permit byte-level access to memory. Instead, it has instructions for allocating objects, invoking methods, and retrieving and modifying fields in those objects. For example, this code gets an object from a field, then calls a method on that object, passing a string as a parameter:
    getstatic java/lang/System/out Ljava/io/PrintStream;
    ldc "Hello, world"
    invokevirtual java/io/PrintStream/println (Ljava/lang/String;)V
The first instruction retrieves the value of the out field from the class java/lang/System. The value of this field is an object that must be of the class java/io/PrintStream (or a subclass).
The second instruction pushes the constant string Hello, world onto the stack. The string is itself an object, of the class java/lang/String. The stack now holds the PrintStream object with the string on top of it.
The final instruction invokes a method. The name of the method is println; its definition can be found in the class java/io/PrintStream. It expects an argument of type java/lang/String on the stack, and it returns nothing. It also expects an object of class java/io/PrintStream to be on the stack below the argument; this is the target of the method invocation. The instruction calls the method, which prints Hello, world.
The method call removes both the argument and the target of the method invocation. The stack is now empty.
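For comparison, here is the Java statement that a compiler would typically translate into those three instructions. The class name HelloWorld is chosen only for illustration, and the comments map each part of the statement to the instruction described above.

```java
public class HelloWorld {
    public static void main(String[] args) {
        // System.out      -> getstatic: fetch the PrintStream held in the out field
        // "Hello, world"  -> ldc: push the constant string
        // println(...)    -> invokevirtual: call println on the PrintStream,
        //                    passing the string as its argument
        System.out.println("Hello, world");
    }
}
```

The left-to-right order of the source statement matches the order on the stack: the target object is pushed first, and the argument ends up above it.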
1.2.2 class File Format
The Java Virtual Machine Specification defines a binary format, called the class file, which represents a class as a stream of bytes. The Java platform has methods for converting class files into classes in the JVM.
The term "class file" is slightly misleading. Data in class file format does not have to be stored in a file. It can be stored in a database, sent across the network, packaged in a Java archive (JAR) file, or kept in a variety of other ways.
The key to using class files is the class ClassLoader, which is part of the Java platform. Many different subclasses of ClassLoader are available, which load from databases, across the network, from JAR files, and so on. Java-supporting web browsers have a subclass of ClassLoader that can load class files over the Internet.
If you store your information in some nonstandard format (such as compressed) or in a nonstandard place (such as in a database), you can write your own subclass of ClassLoader. We'll discuss how to do this in chapter 8.
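As a small preview, a subclass might look something like the sketch below. It assumes the class file bytes have already been fetched (and decompressed, if necessary) into a map keyed by class name; the class InMemoryClassLoader and its field classBytes are hypothetical names. The methods findClass and defineClass are part of java.lang.ClassLoader.

```java
import java.util.Map;

// Hypothetical loader: finds class file bytes in an in-memory map
// keyed by class name, instead of reading them from a file.
public class InMemoryClassLoader extends ClassLoader {
    private final Map<String, byte[]> classBytes;

    public InMemoryClassLoader(Map<String, byte[]> classBytes) {
        this.classBytes = classBytes;
    }

    // loadClass calls this only after the parent loader fails to find the class.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = classBytes.get(name);
        if (bytes == null) {
            throw new ClassNotFoundException(name);
        }
        // defineClass turns bytes in class file format into a class in the JVM.
        return defineClass(name, bytes, 0, bytes.length);
    }
}
```

Overriding findClass rather than loadClass keeps the usual delegation to the parent loader, so standard classes such as java.lang.String are still loaded in the normal way.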
1.2.3 Verification
In order to ensure that certain parts of the machine are kept safe from tampering, the JVM has a verification algorithm to check every class. The purpose of verification is to ensure that programs follow a set of rules that are designed to protect the security of the JVM.
Programs can try to subvert the security of the JVM in a variety of ways. For example, they might try overflowing the stack, hoping to corrupt memory they are not allowed to access. They might try to cast an object inappropriately, hoping to obtain pointers to forbidden memory. The verification algorithm ensures that this does not happen by tracing through the code to check that objects are always used according to their proper types.
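The idea of tracing through code while tracking types can be illustrated with a toy checker. This is not the JVM's verification algorithm: it handles only straight-line code, and the three instruction names (push_int, push_ref, iadd), the class ToyVerifier, and the maxStack parameter are all invented for this sketch. It follows the types that would be on the stack, without running anything, and rejects code that overflows or underflows the stack or adds something that is not an int.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ToyVerifier {
    enum Type { INT, REF }

    // Returns true if the code never underflows or overflows the stack
    // and never applies iadd to anything other than two ints.
    static boolean verify(String[] code, int maxStack) {
        Deque<Type> stack = new ArrayDeque<>(); // tracks types, not values
        for (String insn : code) {
            switch (insn) {
                case "push_int":
                    stack.push(Type.INT);
                    break;
                case "push_ref":
                    stack.push(Type.REF);
                    break;
                case "iadd":
                    if (stack.size() < 2) return false;         // underflow
                    if (stack.pop() != Type.INT) return false;  // wrong type
                    if (stack.pop() != Type.INT) return false;  // wrong type
                    stack.push(Type.INT);
                    break;
                default:
                    return false;                               // unknown instruction
            }
            if (stack.size() > maxStack) return false;          // overflow
        }
        return true;
    }
}
```

With these rules, the sequence push_int, push_int, iadd is accepted with a stack limit of 2, while push_int, push_ref, iadd is rejected because iadd would be applied to a reference.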
The verification algorithm is discussed in detail in chapter 6.