7.3 The Class File Format


As explained earlier and depicted in Figure 7.1, the standardization of the Java class file format is what enables Java to serve as a truly mobile code language. Reviewing the details of the class file definition often reveals features of Java from a different perspective, and readers familiar with Java are often struck by the richness of the fields in this file. This section is a short, simplified description of the Java class file format. The complete JVM specification is available from Sun Microsystems.

To understand the Java class file concept, it helps to see how it improves on other file formats and which issues it was designed to address. Compiled binary executables for different platforms, such as *.exe files for Windows, usually differ not only in the instruction set, libraries, and APIs they target, but also in the file format used to represent the program code. For instance, Windows executables are encoded using the PE file format (which is derived from COFF), whereas Linux uses the ELF file format; deployed iTV set-top boxes run either Windows-based or Linux-based applications, but rarely support both simultaneously. Because Java aims at cross-platform binary compatibility, it introduced the class file format, which is a universal file format for all Java programs.

Although a portion of each class file consists of fixed-length data, most of the class file's content consists of complex, variable-length, nested data structures. The components of a class file are listed in Table 7.1. An example source file is shown in Example 7.1 and its compiled class file in Example 7.2.

Table 7.1. Class File Format

| Field | Length (bytes) | Description |
|---|---|---|
| magic | 4 | Always 0xCAFEBABE. |
| minor_version | 2 | The minor version number of the JVM specification. For example, 0x0003 for JDK 1.0.2. |
| major_version | 2 | The major version number of the JVM specification. For example, 0x002D for JDK 1.0.2. |
| constant_pool_count | 2 | One more than the number of items in the constant pool table. |
| constant_pool | * | The table whose entries are constants used in this file. Each entry is a cp_info structure, whose first byte, the tag, determines its length and structure (see section 7.3.1). |
| access_flags | 2 | A mask of modifiers used in the class or interface declaration. |
| this_class | 2 | A valid index into the constant_pool table. The constant_pool entry at that index is a CONSTANT_Class_info structure representing the class or interface defined by this class file. |
| super_class | 2 | A valid index into the constant_pool table (it is zero for the java.lang.Object class only). The constant_pool entry at that index is a CONSTANT_Class_info structure representing the superclass of the class defined by this class file. Neither the superclass nor any of its superclasses may be a final class. |
| interfaces_count | 2 | The number of direct superinterfaces of this class or interface type. |
| interfaces | * | The direct superinterfaces of this class or interface, each given as a 2-byte constant_pool index. |
| fields_count | 2 | The number of field_info structures in the fields table. The field_info structures represent all fields, both class variables and instance variables, declared by this class or interface type. |
| fields | * | The list of fields declared by this class, each specified using a field_info structure. |
| methods_count | 2 | The number of method_info structures in the methods table. |
| methods | * | The list of methods implemented by this class, each specified using a method_info structure. |
| attributes_count | 2 | The number of attributes in the attributes table of this class. |
| attributes | * | The list of attributes for this class. |

An asterisk (*) in the Length column denotes a variable-length item.

Example 7.1 Java source code of 162 chars.

    class Act {
        public static void doMathForever() {
            int i = 0;
            while (true) {
                i += 1;
                i *= 2;
            }
        }
    }

Example 7.2 The compiled 307-byte class file.

    CA FE BA BE 00 03 00 2D 00 11 07 00 0C 07 00 0D 0A 00 01 00
    04 0C 00 0E 00 10 01 00 0D 43 6F 6E 73 74 61 6E 74 56 61 6C
    75 65 01 00 0D 64 6F 4D 61 74 68 46 6F 72 65 76 65 72 01 00
    0A 45 78 63 65 70 74 69 6F 6E 73 01 00 0F 4C 69 6E 65 4E 75
    6D 62 65 72 54 61 62 6C 65 01 00 0A 53 6F 75 72 63 65 46 69
    6C 65 01 00 0E 4C 6F 63 61 6C 56 61 72 69 61 62 6C 65 73 01
    00 04 43 6F 64 65 01 00 10 6A 61 76 61 2F 6C 61 6E 67 2F 4F
    62 6A 65 63 74 01 00 03 41 63 74 01 00 06 3C 69 6E 69 74 3E
    01 00 0B 73 6E 69 70 65 74 2E 6A 61 76 61 01 00 03 28 29 56
    00 00 00 02 00 01 00 00 00 00 00 02 00 09 00 06 00 10 00 01
    00 0B 00 00 00 30 00 02 00 01 00 00 00 0C 03 3B 84 00 01 1A
    05 68 3B A7 FF F9 00 00 00 01 00 08 00 00 00 12 00 04 00 00
    00 04 00 02 00 06 00 05 00 07 00 09 00 05 00 00 00 0E 00 10
    00 01 00 0B 00 00 00 1D 00 01 00 01 00 00 00 05 2A B7 00 03
    B1 00 00 00 01 00 08 00 00 00 06 00 01 00 00 00 02 00 01 00
    09 00 00 00 02 00 0F

The first 8 bytes of a class file specify a magic number followed by the minor and major version numbers. The magic bytes are always set to 0xCAFEBABE and serve only to distinguish a class file from a file in any other format. Most operating systems allow a file name's extension to be changed without regard to the file's content, so JVMs do not rely on the file name extension *.class; instead, they inspect the magic number to determine the file's type.

The version bytes identify the version of the class file format to which this file conforms. iTV standards often specify the range of version numbers supported by compliant implementations. Typically, a JVM may have difficulty reading a class file whose version numbers are more recent than the most recent version that JVM supports. To identify such cases, each JVM implementation should indicate the range of class file versions it is able to process.
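
The header check described above can be sketched in Java as follows. This is a minimal illustration, not the code of any particular JVM: the class name ClassFileHeader and the isSupported range policy are hypothetical, and the input is the first 8 bytes of Example 7.2.

```java
import java.nio.ByteBuffer;

// Hypothetical helper (not part of any JVM API): reads the fixed 8-byte header.
final class ClassFileHeader {
    static final int MAGIC = 0xCAFEBABE;

    final int minorVersion;
    final int majorVersion;

    private ClassFileHeader(int minorVersion, int majorVersion) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    // Parses magic, minor_version, and major_version.
    // ByteBuffer reads multi-byte values in big-endian order by default.
    static ClassFileHeader parse(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("fewer than 8 header bytes");
        }
        ByteBuffer in = ByteBuffer.wrap(classBytes);
        if (in.getInt() != MAGIC) {
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = in.getShort() & 0xFFFF; // treat as unsigned 16-bit
        int major = in.getShort() & 0xFFFF;
        return new ClassFileHeader(minor, major);
    }

    // Assumed policy for illustration: accept a closed range of major versions.
    boolean isSupported(int minMajor, int maxMajor) {
        return majorVersion >= minMajor && majorVersion <= maxMajor;
    }

    public static void main(String[] args) {
        byte[] start = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
                        0x00, 0x03, 0x00, 0x2D};
        ClassFileHeader h = parse(start);
        System.out.println(h.majorVersion + "." + h.minorVersion); // prints 45.3
    }
}
```

Checking the magic number before anything else lets a loader reject a renamed non-class file without attempting to interpret the rest of its content.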

It is generally necessary to parse the entire file to read any one piece of data, because the location of the data of interest is often not known until the variable-length data preceding it has been read. Typically, a JVM reads a class file once and places the data in memory, often reorganizing it for efficient access rather than for a small footprint. For this reason, a surprisingly large amount of the code of any JVM is concerned with the interpretation, mapping, and possibly caching of this class file format.

7.3.1 Constant Pool

As per Table 7.1, the major_version field is immediately followed by the constant_pool_count (2 bytes) and the constant_pool table. This table is a list of (constant_pool_count - 1) variant entries. Each entry begins with an 8-bit tag specifying the type of constant; the tag is needed because the constant data structure is a variant. The tag is followed by the data bytes carrying the constant's value, and the syntax and semantics of these bytes depend on the value of the tag. The following types of constants are typically supported (see the JVM release notes for version-specific variation):

  • CONSTANT_Utf8 : Having a tag value of 0x01, it consists of 2 bytes giving the length in bytes in big-endian format, followed by a string in UTF-8 (Unicode) format.

  • CONSTANT_Integer : Having a tag value of 0x03, it consists of 4 bytes representing an integer in big-endian format.

  • CONSTANT_Float : Having a tag value of 0x04, it consists of 4 bytes representing a float in big-endian format.

  • CONSTANT_Long : Having a tag value of 0x05, it consists of 8 bytes representing a long in big-endian format.

  • CONSTANT_Double : Having a tag value of 0x06, it consists of 8 bytes representing a double in big-endian format.

  • CONSTANT_Class : Having a tag value of 0x07, it specifies a class name constant.

  • CONSTANT_String : Having a tag value of 0x08, it specifies the index of a CONSTANT_Utf8 entry.

  • CONSTANT_Fieldref : Having a tag value of 0x09, it specifies the name and type of a Field, and the class of which it is a member.

  • CONSTANT_Methodref : Having a tag value of 0x0A, it specifies the name and type of a Method, and the class of which it is a member.

  • CONSTANT_InterfaceMethodref : Having a tag value of 0x0B, it specifies the name and type of an Interface Method, and the Interface of which it is a member.

  • CONSTANT_NameAndType : Having a tag value of 0x0C, it specifies the name and type entry for a field, method, or interface method.
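
Because the tag determines how many bytes follow it, a reader can step through the table entry by entry. The following Java sketch illustrates this for the tags listed above; the class and method names are hypothetical, and the payload sizes for the reference-style entries (2 bytes for CONSTANT_Class and CONSTANT_String, 4 bytes for the remaining four) are taken from the JVM specification rather than from this section. The sample bytes are the first four entries of Example 7.2 followed by the Utf8 entry "Act".

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical walker over raw constant pool bytes (illustration only).
final class ConstantPoolWalker {

    // Returns the number of payload bytes that follow the 1-byte tag.
    // The buffer must be positioned just after the tag.
    static int payloadLength(int tag, ByteBuffer in) {
        switch (tag) {
            case 0x01: // Utf8: 2-byte length, then that many bytes
                return 2 + (in.getShort(in.position()) & 0xFFFF);
            case 0x03: case 0x04: // Integer, Float
                return 4;
            case 0x05: case 0x06: // Long, Double
                return 8;
            case 0x07: case 0x08: // Class, String: one 2-byte index
                return 2;
            case 0x09: case 0x0A: case 0x0B: case 0x0C: // two 2-byte indexes
                return 4;
            default:
                throw new IllegalArgumentException("unknown tag " + tag);
        }
    }

    // Collects the tag of every entry found in the given bytes.
    static List<Integer> readTags(byte[] poolBytes) {
        ByteBuffer in = ByteBuffer.wrap(poolBytes);
        List<Integer> tags = new ArrayList<>();
        while (in.hasRemaining()) {
            int tag = in.get() & 0xFF;
            int length = payloadLength(tag, in);
            in.position(in.position() + length); // skip the payload
            tags.add(tag);
        }
        return tags;
    }

    public static void main(String[] args) {
        byte[] sample = {
            0x07, 0x00, 0x0C,             // Class
            0x07, 0x00, 0x0D,             // Class
            0x0A, 0x00, 0x01, 0x00, 0x04, // Methodref
            0x0C, 0x00, 0x0E, 0x00, 0x10, // NameAndType
            0x01, 0x00, 0x03, 0x41, 0x63, 0x74 // Utf8 "Act"
        };
        System.out.println(readTags(sample)); // prints [7, 7, 10, 12, 1]
    }
}
```

Note that this sketch only counts physical entries; the JVM specification additionally states that Long and Double constants occupy two consecutive index positions, which a full parser would have to account for when numbering items.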

Example 7.2 depicts a class file with 0x10 (16 decimal) entries; its constant_pool_count field holds 0x0011, one more than the number of entries. The items in this table can be of various types and of variable length. The first entry is referred to as Constant Pool item 1, and the 16th entry (or kth entry in general) is referred to as Constant Pool item 16 (or k in general). An index of an item within the Constant Pool table is 2 bytes, and therefore index values cannot exceed 2^16 - 1 = 65535.

Constant pool items frequently refer to other items, so determining the type or value of a constant often requires following a chain of references; the item that is referenced first is not necessarily the one holding the actual value. For example, a CONSTANT_Class item should refer to a CONSTANT_Utf8 item specifying the class name, and a CONSTANT_NameAndType item should refer to a pair of CONSTANT_Utf8 items, the first specifying the name and the second specifying the type.
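
The reference chain can be illustrated with a small in-memory model. The sketch below is an assumption-laden simplification: the class PoolResolver and its storage scheme (a String for a Utf8 item, an int array of indexes for Class and NameAndType items) are hypothetical and chosen only for brevity. The sample items are those found at the same indexes in Example 7.2.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory constant pool (illustration only).
// Utf8 items are stored as String, Class items as int[]{nameIndex},
// NameAndType items as int[]{nameIndex, descriptorIndex}.
final class PoolResolver {
    private final Map<Integer, Object> pool = new HashMap<>();

    void putUtf8(int index, String value) { pool.put(index, value); }
    void putClass(int index, int nameIndex) { pool.put(index, new int[] {nameIndex}); }
    void putNameAndType(int index, int nameIndex, int descriptorIndex) {
        pool.put(index, new int[] {nameIndex, descriptorIndex});
    }

    // End of every chain: the item must be a Utf8 string.
    String utf8(int index) {
        Object item = pool.get(index);
        if (!(item instanceof String)) {
            throw new IllegalStateException("item " + index + " is not CONSTANT_Utf8");
        }
        return (String) item;
    }

    // Follows Class -> Utf8.
    String className(int classIndex) {
        int[] refs = references(classIndex, 1);
        return utf8(refs[0]);
    }

    // Follows NameAndType -> (Utf8 name, Utf8 descriptor).
    String nameAndType(int index) {
        int[] refs = references(index, 2);
        return utf8(refs[0]) + ":" + utf8(refs[1]);
    }

    private int[] references(int index, int expectedCount) {
        Object item = pool.get(index);
        if (!(item instanceof int[]) || ((int[]) item).length != expectedCount) {
            throw new IllegalStateException("item " + index + " has an unexpected type");
        }
        return (int[]) item;
    }

    public static void main(String[] args) {
        PoolResolver p = new PoolResolver();
        p.putClass(2, 13);
        p.putNameAndType(4, 14, 16);
        p.putUtf8(13, "Act");
        p.putUtf8(14, "<init>");
        p.putUtf8(16, "()V");
        System.out.println(p.className(2));   // prints Act
        System.out.println(p.nameAndType(4)); // prints <init>:()V
    }
}
```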

7.3.2 Class Declaration

The class declaration section consists of the information provided in the declaration of this class. This includes the access_flags, this_class, and super_class fields. The access flags typically supported are as follows (see the JVM release notes for version specific variation):

  • ACC_PUBLIC : Setting the bit 0x0001 indicates that the class is accessible from classes outside its package.

  • ACC_FINAL : Setting the bit 0x0010 indicates that no subclasses are allowed.

  • ACC_SUPER : Setting the bit 0x0020 indicates that superclass methods require special treatment on invocation.

  • ACC_INTERFACE : Setting the bit 0x0200 indicates that the class file specifies an interface as opposed to a class.

  • ACC_ABSTRACT : Setting the bit 0x0400 indicates that the class is abstract and may not be instantiated directly. Instantiating this class requires instantiating a non-abstract subclass.
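
Since access_flags is a bit mask, each flag can be tested independently with a bitwise AND. The following Java sketch decodes a class-level mask into flag names; the class ClassAccessFlags is hypothetical, and the bit values are those of the JVM specification (in particular, ACC_SUPER is taken to be 0x0020).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical decoder for the class-level access_flags mask (illustration only).
final class ClassAccessFlags {
    static final int ACC_PUBLIC    = 0x0001;
    static final int ACC_FINAL     = 0x0010;
    static final int ACC_SUPER     = 0x0020; // value per the JVM specification
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT  = 0x0400;

    // Returns the names of all flags set in the mask, in ascending bit order.
    static List<String> decode(int flags) {
        List<String> names = new ArrayList<>();
        if ((flags & ACC_PUBLIC) != 0)    names.add("ACC_PUBLIC");
        if ((flags & ACC_FINAL) != 0)     names.add("ACC_FINAL");
        if ((flags & ACC_SUPER) != 0)     names.add("ACC_SUPER");
        if ((flags & ACC_INTERFACE) != 0) names.add("ACC_INTERFACE");
        if ((flags & ACC_ABSTRACT) != 0)  names.add("ACC_ABSTRACT");
        return names;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0421)); // prints [ACC_PUBLIC, ACC_SUPER, ACC_ABSTRACT]
        System.out.println(decode(0x0000)); // prints []
    }
}
```

The class file in Example 7.2 carries an access_flags value of 0x0000, which decodes to an empty list because the class Act is declared without any modifiers.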

The value of the 2-byte field this_class is a valid index into the constant pool table, pointing to an item (tag=7) that names either a class or an interface. For a class, the value of the 2-byte super_class field is either 0 or an index pointing to a class name; neither the superclass nor any of its superclasses may have its ACC_FINAL flag set. For java.lang.Object, the value of super_class is 0. For an interface, the super_class field indexes the constant pool entry referring to the java.lang.Object class.

7.3.3 Super-interfaces and Classes

A Java interface is a list of methods that an implementing class provides. Interfaces may have super- and sub-interfaces. The super-interfaces section specifies the list of direct super-interfaces of this class or interface; the list of sub-interfaces is implicit. A class may implement up to 65535 interfaces, each of which may in turn inherit up to 65535 interfaces. For example, a class could implement both a mouse listener and a key listener interface to capture both mouse and keyboard events. These listener interfaces list the methods that the implementing class provides as callbacks for capturing these events. The field interfaces_count specifies how many direct super-interfaces this class has. The interfaces field contains one 2-byte constant pool index for each of these direct super-interfaces.

7.3.4 Fields

The fields section specifies the list of all class and instance variables of this class. A class may have up to 65535 fields, each specified using a field_info structure. The following access flags are typically supported (see the JVM release notes for version specific variation):

  • ACC_PUBLIC : Setting the bit 0x0001 indicates that this field is public and may be accessed from outside the package.

  • ACC_PRIVATE : Setting the bit 0x0002 indicates that this field is accessible only within this class.

  • ACC_PROTECTED : Setting the bit 0x0004 indicates that this field is accessible only within this class, its subclasses, and other classes in the same package.

  • ACC_STATIC : Setting the bit 0x0008 indicates that this field is static. This means that exactly one copy of this field exists regardless of how many instances of this class exist in memory; in particular, this field is accessible even when there are no instances of this class in memory.

  • ACC_FINAL : Setting the bit 0x0010 indicates that this field cannot be assigned a new value after it has been initialized. This means that the field is constant.

  • ACC_VOLATILE : Setting the bit 0x0040 indicates that this field cannot be cached, because its value may be changed by another thread between accesses.

  • ACC_TRANSIENT : Setting the bit 0x0080 indicates that this field is not serialized and de-serialized when using serialization (e.g., for purposes of data transmission or storage).
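
The field flags listed above use the same bit values as the constants of the standard java.lang.reflect.Modifier class, so that class can be used to render a field's access_flags as source-level modifiers. The sketch below relies on that correspondence; the wrapper class FieldFlagsDemo and the mask 0x00DF (the union of the seven field flags listed above) are introduced here for illustration only.

```java
import java.lang.reflect.Modifier;

// Hypothetical wrapper showing field access_flags rendered as Java modifiers.
final class FieldFlagsDemo {
    // Union of the field flags listed above:
    // 0x0001 | 0x0002 | 0x0004 | 0x0008 | 0x0010 | 0x0040 | 0x0080
    static final int FIELD_FLAG_MASK = 0x00DF;

    // Keeps only the field-related bits and lets Modifier format them.
    static String describe(int accessFlags) {
        return Modifier.toString(accessFlags & FIELD_FLAG_MASK);
    }

    public static void main(String[] args) {
        // ACC_PUBLIC | ACC_STATIC | ACC_FINAL
        System.out.println(describe(0x0019)); // prints public static final
        // ACC_PRIVATE | ACC_VOLATILE
        System.out.println(Modifier.isVolatile(0x0042)); // prints true
    }
}
```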

Table 7.2. The field_info Structures Used to Represent Fields of a Class

| Field Name | Description |
|---|---|
| access_flags | A 2-byte mask of modifiers used to describe access permission to and properties of a field. |
| name_index | This 2-byte field is a valid index into the constant_pool table to a CONSTANT_Utf8 (tag=1) item. This item represents a valid Java field name stored as a simple (not fully qualified) name. |
| descriptor_index | This 2-byte field is a valid index into the constant_pool table to a CONSTANT_Utf8 (tag=1) item. This item represents a valid Java field descriptor. |
| attributes_count | This 2-byte field specifies the number of additional attributes; up to 65535 attributes can be specified. |
| attributes | A table of attribute_info structures; the attribute_info structure is specified in Table 7.3. |

7.3.5 Methods

Each method, and each instance initialization method <init>, is described by a variable-length method_info structure. The structure has four components:

  • access_flags : This is a mask of modifiers used to describe access permission to and properties of a method or instance initialization method. At most one of the flags ACC_PUBLIC, ACC_PROTECTED, and ACC_PRIVATE may be set for any method. Class and instance methods may not use ACC_ABSTRACT together with ACC_FINAL, ACC_NATIVE, or ACC_SYNCHRONIZED (i.e., native and synchronized methods require an implementation). A class or instance method may not use ACC_PRIVATE with ACC_ABSTRACT (i.e., a private method cannot be overridden, so such a method could never be implemented or used). A class or instance method may not use ACC_STATIC with ACC_ABSTRACT (i.e., a static method is implicitly final and thus cannot be overridden, so such a method could never be implemented or used). Each interface method is implicitly abstract, and has its ACC_ABSTRACT flag set. Each interface method is implicitly public, and has its ACC_PUBLIC flag set.

  • name_index : This is a valid index into the constant_pool table pointing to a CONSTANT_Utf8 (tag=1) item representing either one of the special internal method names, <init> or <clinit>, or a valid Java method name stored as a simple (not fully qualified) name.

  • descriptor_index : This is a valid index into the constant_pool table pointing to a CONSTANT_Utf8 representing a valid Java method descriptor.

  • attributes_count, attributes : The value of attributes_count indicates the number of additional attributes of this method that are listed in the table specified by the attributes field. Each entry in that table is a variable-length attribute_info structure (see Table 7.3) that specifies either a Code or an Exceptions attribute.
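
The restrictions on combining method access flags, as stated in the access_flags item above, can be expressed as a small validity check. The following Java sketch is one possible formulation; the class MethodFlagRules is hypothetical, and the bit values for ACC_SYNCHRONIZED (0x0020) and ACC_NATIVE (0x0100) are taken from the JVM specification since they are not listed in this section.

```java
// Hypothetical checker for the method flag combinations described above.
final class MethodFlagRules {
    static final int ACC_PUBLIC       = 0x0001;
    static final int ACC_PRIVATE      = 0x0002;
    static final int ACC_PROTECTED    = 0x0004;
    static final int ACC_STATIC       = 0x0008;
    static final int ACC_FINAL        = 0x0010;
    static final int ACC_SYNCHRONIZED = 0x0020; // value per the JVM specification
    static final int ACC_NATIVE       = 0x0100; // value per the JVM specification
    static final int ACC_ABSTRACT     = 0x0400;

    static boolean isValid(int flags) {
        // At most one of public, private, protected may be set.
        int visibility = flags & (ACC_PUBLIC | ACC_PRIVATE | ACC_PROTECTED);
        if (Integer.bitCount(visibility) > 1) {
            return false;
        }
        // An abstract method has no implementation, so it cannot also be
        // final, native, synchronized, private, or static.
        int excludedByAbstract =
            ACC_FINAL | ACC_NATIVE | ACC_SYNCHRONIZED | ACC_PRIVATE | ACC_STATIC;
        if ((flags & ACC_ABSTRACT) != 0 && (flags & excludedByAbstract) != 0) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid(0x0009)); // public static: prints true
        System.out.println(isValid(0x0410)); // abstract final: prints false
    }
}
```

The value 0x0009 is the access_flags value of the doMathForever method in Example 7.2.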

7.3.6 Attributes

Attributes are used in the class file, field_info, method_info, and code_attribute structures of the class file format. All attributes have the general attribute_info format described in Table 7.3. Of the predefined attributes, the Code, ConstantValue, and Exceptions attributes are typically supported by JVM implementations. Use of the remaining predefined attributes is optional, and JVMs may silently ignore those attributes.

Table 7.3. The attribute_info Structure Used to Represent Attributes

| Name | Description |
|---|---|
| attribute_name_index | This 2-byte field is a valid index into the constant_pool table to a CONSTANT_Utf8 (tag=1) item. This item represents the name of the attribute (for example, "Code"). |
| attribute_length | This 4-byte field specifies the number of bytes of data in the info field below. |
| info | This field has attribute_length bytes and specifies the data bytes in this attribute. |
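
Because every attribute carries its own length, a reader can load or skip an attribute without understanding its content. The following Java sketch reads one attribute_info structure generically; the class AttributeInfo is a hypothetical helper, and the sample bytes are the SourceFile attribute that ends Example 7.2.

```java
import java.nio.ByteBuffer;

// Hypothetical generic reader for one attribute_info structure.
final class AttributeInfo {
    final int nameIndex; // constant pool index of the attribute's name
    final byte[] info;   // raw attribute data, attribute_length bytes long

    private AttributeInfo(int nameIndex, byte[] info) {
        this.nameIndex = nameIndex;
        this.info = info;
    }

    // Reads name index (2 bytes), length (4 bytes), then the info bytes.
    static AttributeInfo read(ByteBuffer in) {
        int nameIndex = in.getShort() & 0xFFFF;
        long length = in.getInt() & 0xFFFFFFFFL; // unsigned 32-bit length
        if (length > in.remaining()) {
            throw new IllegalArgumentException("truncated attribute");
        }
        byte[] info = new byte[(int) length];
        in.get(info);
        return new AttributeInfo(nameIndex, info);
    }

    public static void main(String[] args) {
        // 00 09 | 00 00 00 02 | 00 0F
        byte[] sourceFileAttr = {0x00, 0x09, 0x00, 0x00, 0x00, 0x02, 0x00, 0x0F};
        AttributeInfo a = read(ByteBuffer.wrap(sourceFileAttr));
        int sourceFileIndex = ByteBuffer.wrap(a.info).getShort() & 0xFFFF;
        System.out.println(a.nameIndex);   // prints 9
        System.out.println(a.info.length); // prints 2
        System.out.println(sourceFileIndex); // prints 15
    }
}
```

In Example 7.2, constant pool item 9 is the string "SourceFile" and item 15 is "snipet.java", so this attribute records the name of the source file.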

The following attributes are pre-defined:

  • SourceFile : The SourceFile attribute is an optional fixed-length attribute specifying the source file from which this class file was compiled. The value of the attribute_name_index points to a CONSTANT_Utf8_info (tag=1), representing the string "SourceFile". The value of the attribute_length field is 2. The value of the info field is a valid index into the constant_pool table pointing to a CONSTANT_Utf8_info (tag=1) representing the name of the source file from which this class file was compiled.

  • ConstantValue : The ConstantValue attribute is a fixed-length attribute representing the value of a constant field that is (explicitly or implicitly) static; that is, the ACC_STATIC bit in the flags item of the field_info structure is set. The field is not required to be final. There can be no more than one ConstantValue attribute in the attributes table of a given field_info structure. The constant field represented by the field_info structure is assigned the value referenced by its ConstantValue attribute as part of its initialization. The value of the attribute_name_index points to a CONSTANT_Utf8_info (tag=1), representing the string "ConstantValue". The value of the attribute_length field is 2. The value of the info field is a valid index into the constant_pool table specifying the constant value represented by this attribute.

  • Code : The Code attribute is a variable-length attribute used in the attributes table of method_info structures. A Code attribute contains the JVM instructions and auxiliary information for a single Java method, instance initialization method, or class or interface initialization method. There is exactly one Code attribute in each method_info structure, except for native and abstract methods, which have none. The value of the attribute_name_index points to a CONSTANT_Utf8_info (tag=1), representing the string "Code". The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.

  • Exceptions : The Exceptions attribute is a variable-length attribute used in the attributes table of a method_info structure. The Exceptions attribute indicates which checked exceptions a method may throw. The value of the attribute_name_index points to a CONSTANT_Utf8_info (tag=1), representing the string "Exceptions".

  • LineNumberTable : This is an optional variable-length attribute in the attributes table of a Code attribute. The value of the attribute_name_index item points to a CONSTANT_Utf8 (tag=1), representing the string "LineNumberTable". It may be used by debuggers to determine which part of the JVM code array corresponds to a given line number in the original Java source file. If LineNumberTable attributes are present in the attributes table of a given Code attribute, then they may appear in any order. Furthermore, multiple LineNumberTable attributes may together represent a given line of a Java source file; that is, LineNumberTable attributes need not be one-to-one with source lines.

    An entry in this table has the fields start_pc and line_number, and constitutes a mapping from a code array index to a source code line number.

    • start_pc : This field indicates the index into the code array at which the code for a new line in the original Java source file begins.

    • line_number : This specifies the corresponding line number in the original Java source file.

  • LocalVariableTable : This is an optional variable-length attribute in the attributes table of a Code attribute. It may be used by debuggers to determine the value of a given local variable during the execution of a method. Its attribute_name_index points to a CONSTANT_Utf8 (tag=1) representing the string "LocalVariableTable". If LocalVariableTable attributes are present in the attributes table of a given Code attribute, then they may appear in any order. There may be no more than one LocalVariableTable attribute per local variable in the Code attribute.

    An entry in this table has the fields start_pc, length, name_index, descriptor_index, and index, which essentially define a state of the method at any given point during its execution.

    • start_pc, length : The given local variable has a value at indices into the code array in the interval [start_pc, start_pc+length] inclusive.

    • name_index, descriptor_index : These point to a CONSTANT_Utf8 structure representing a valid Java local variable name stored as a simple name or a descriptor for a Java local variable. Java local variable descriptors have the same form as field descriptors.

    • index : This is the offset, in words, of the given local variable from the beginning of the structure containing all local variables. If the local variable at an index is a two-word type (double or long), it occupies both index and index+1.
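
The start_pc to line_number mapping of a LineNumberTable can be illustrated as follows. This Java sketch assumes the usual layout of the attribute's info bytes, a 2-byte entry count followed by 2-byte start_pc and line_number pairs, and uses the table of the doMathForever method from Example 7.2. The class LineNumberLookup and its lookup rule (pick the entry with the greatest start_pc not exceeding the given code index) are a hypothetical illustration of how a debugger might use the table.

```java
import java.nio.ByteBuffer;

// Hypothetical LineNumberTable reader and lookup (illustration only).
final class LineNumberLookup {

    // Parses the info bytes into rows of {start_pc, line_number}.
    static int[][] parse(byte[] info) {
        ByteBuffer in = ByteBuffer.wrap(info);
        int count = in.getShort() & 0xFFFF;
        int[][] table = new int[count][2];
        for (int i = 0; i < count; i++) {
            table[i][0] = in.getShort() & 0xFFFF; // start_pc
            table[i][1] = in.getShort() & 0xFFFF; // line_number
        }
        return table;
    }

    // Returns the source line for a code array index, or -1 if none applies.
    static int lineFor(int[][] table, int pc) {
        int bestStart = -1;
        int bestLine = -1;
        for (int[] entry : table) {
            if (entry[0] <= pc && entry[0] > bestStart) {
                bestStart = entry[0];
                bestLine = entry[1];
            }
        }
        return bestLine;
    }

    public static void main(String[] args) {
        // 00 04 | 00 00 00 04 | 00 02 00 06 | 00 05 00 07 | 00 09 00 05
        byte[] info = {0x00, 0x04,
                       0x00, 0x00, 0x00, 0x04,
                       0x00, 0x02, 0x00, 0x06,
                       0x00, 0x05, 0x00, 0x07,
                       0x00, 0x09, 0x00, 0x05};
        int[][] table = parse(info);
        System.out.println(lineFor(table, 0)); // prints 4
        System.out.println(lineFor(table, 3)); // prints 6
        System.out.println(lineFor(table, 9)); // prints 5
    }
}
```

The last entry maps code index 9, the backward branch of the loop, to an earlier source line than the entries before it, which shows that line numbers need not increase with the code index.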



ITV Handbook. Technologies and Standards
ISBN: 0131003127
Year: 2003
Pages: 170
