2.3 Object-Relational Mapping


Object-relational (O/R) mapping is a technique for mapping between Java classes and relational databases. Relationships between Java objects, classes, and their fields have to be mapped to relations, tables, and columns in a relational database.

An object-oriented domain model of business entities is sometimes compromised for technical reasons like performance or space requirements when the application matures. In this case, object-relational mapping is also a technique to reengineer complex, existing database models. The object-oriented "is-a" and "has-a" relationships can only be poorly mapped to relational databases, where a record or related information often spans a number of tables and has to be assembled by join operations.

2.3.1 Classes versus tables

One talks about O/R mapping if tables and columns of a database are connected with classes and fields of an object-oriented language, so that an instance of a class can be used to access or modify data of one or more rows/columns of that database.

Figure 2-1 shows a Java class Author that has a countryCode and name field. This information can be put into an AUTHORS table that has corresponding COUNTRY and NAME columns:

Figure 2-1. The object versus the relational view of the world.

graphics/02fig01.gif
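
The mapping in Figure 2-1 can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the AuthorMappingSketch wrapper, the toRow and fromRow helpers, and the use of a Map as a stand-in for a table row are hypothetical and not part of any JDO or JDBC API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AuthorMappingSketch {
    // Plain Java class from Figure 2-1; the field names follow the text.
    static class Author {
        final String countryCode;
        final String name;

        Author(String countryCode, String name) {
            this.countryCode = countryCode;
            this.name = name;
        }
    }

    // Parameterized INSERT for the AUTHORS table; values are bound separately.
    static final String INSERT_SQL =
        "INSERT INTO AUTHORS (COUNTRY, NAME) VALUES (?, ?)";

    // Object -> row: maps each field to its column, in column order.
    static Map<String, Object> toRow(Author author) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("COUNTRY", author.countryCode);
        row.put("NAME", author.name);
        return row;
    }

    // Row -> object: the reverse direction, used after a SELECT.
    static Author fromRow(Map<String, Object> row) {
        return new Author((String) row.get("COUNTRY"), (String) row.get("NAME"));
    }

    public static void main(String[] args) {
        Author author = new Author("CH", "Example Author");
        Map<String, Object> row = toRow(author);
        System.out.println(INSERT_SQL + " with values " + row.values());
        System.out.println(fromRow(row).name);
    }
}
```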

Instances of the Author class are plain Java objects that contain the countryCode and name data during runtime. The instance data corresponds to table rows, which are accessed by SQL (Structured Query Language) queries and INSERT, UPDATE, or DELETE operations. Starting with this basic, apparently simple mapping scheme, the details can get more complex rapidly:

  1. The two worlds have different type systems.

  2. Java supports inheritance; most relational database systems don't.

  3. Deletion in Java is governed by a garbage collector, while relational databases have an explicit delete operation. Combining the two raises the question of when a persistent object is actually deleted.

  4. True objects are ultimately always "found" by following a reference from another object. Relational databases, however, have a different view: A table is a set of rows, on which query operations can be performed.

  5. Naming conventions are different: Class names can be long, are case sensitive, and include a hierarchical package name; attribute names can also be long. SQL table and column labels, however, are often of (very) limited size. The class-to-table and attribute-to-column name mapping thus needs to be addressed. Automatic name-mapping schemes and manual specification of table and column names are both possible, and JDO implementations and schema generators often provide both.

  6. Ambiguities exist in mapping references to relations.

Such issues are examined in more detail over the next few pages.

2.3.2 String, date, and other type mappings

The Java type system is different from SQL's type system. Basic types like Java short, int, float, and double do not directly map to exact SQL types, and vice versa. Even simple types like Java String and Date have no exact equivalent in the relational database world because they are classes, and in principle, classes are mapped to tables. In Figure 2-2, the Publication class has a title field that refers to a String object at runtime, but in a database, columns can be declared to store character data.

Figure 2-2. A Java string is really a reference to an independent object, but is generally mapped to a column of a table, not a row in a STRING table.

graphics/02fig02.gif

Date types are even harder to map. In Java, the Date class counts milliseconds since January 1, 1970, and stores them internally in a 64-bit long value. In SQL, there are many different flavors of date and time, or combinations of them, available as column types.
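
As a small illustration of the date problem, the standard java.sql package offers several classes that wrap the same millisecond value for different SQL column types. The sketch below uses only java.util.Date, java.sql.Timestamp, and java.sql.Date; the DateMappingSketch class and its method names are hypothetical, and which column type is appropriate depends on the schema.

```java
import java.sql.Timestamp;
import java.util.Date;

class DateMappingSketch {
    // java.util.Date -> value for a SQL TIMESTAMP column (date and time).
    static Timestamp toTimestamp(Date date) {
        return new Timestamp(date.getTime());
    }

    // java.util.Date -> value for a SQL DATE column; only the calendar
    // date part is meaningful for such a column.
    static java.sql.Date toSqlDate(Date date) {
        return new java.sql.Date(date.getTime());
    }

    // Value read from a TIMESTAMP column -> java.util.Date.
    static Date fromTimestamp(Timestamp timestamp) {
        return new Date(timestamp.getTime());
    }

    public static void main(String[] args) {
        Date now = new Date();
        System.out.println(toTimestamp(now));
        System.out.println(toSqlDate(now));
    }
}
```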

Other mapping issues arise when the default object-to-table mapping is undesirable, often purely for efficiency reasons in the relational model. For example, Java code often uses object references to model an "enum" pattern that would generally be stored as a simple number column. Even simpler, a String field of a Java class sometimes should be represented as a small one- or two-character column, with some sort of fixed back-and-forth mapping of some code, currency symbol, and so on. (Such issues are more likely to arise when an existing relational schema must be used, but such mappings are sometimes desired even with new schemas.)
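
A fixed back-and-forth mapping between a Java value and a short column code might look like the following sketch. It assumes a Java version with enum types, and the Binding enum with its two-character codes is purely hypothetical.

```java
class CodeMappingSketch {
    // Hypothetical enum: each constant carries the short code stored in the column.
    enum Binding {
        HARDCOVER("HC"),
        PAPERBACK("PB");

        final String code;

        Binding(String code) {
            this.code = code;
        }

        // Column value -> Java constant.
        static Binding fromCode(String code) {
            for (Binding binding : values()) {
                if (binding.code.equals(code)) {
                    return binding;
                }
            }
            throw new IllegalArgumentException("Unknown binding code: " + code);
        }
    }

    public static void main(String[] args) {
        System.out.println(Binding.PAPERBACK.code);
        System.out.println(Binding.fromCode("HC"));
    }
}
```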

2.3.3 Inheritance mapping

Object models using inheritance can be mapped to relational models by different mapping schemes. The class hierarchy shown in Figure 2-3 is used throughout the next paragraphs to explain these schemes.

Figure 2-3. Sample class inheritance.

graphics/02fig03.gif

The first scheme maps every field of every class of the class hierarchy into a single, plain table. An additional TYPE column indicates whether a record is a Publication, Book, or CD object. Because an instance of Publication can be a Publication, a Book, or a CD, the table contains lots of empty fields. If the instance is a Book, the CD fields are not used and vice versa. Figure 2-4 shows this table structure.

Figure 2-4. A "flat" inheritance mapping with an entire class hierarchy in one table.

graphics/02fig04.gif
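
Reading objects back from such a flat table means switching on the TYPE column. The sketch below shows one way this could look. The PUBID, TITLE, and TYPE columns come from the text; the isbn and trackCount fields, the ISBN and TRACKS columns, the TYPE values, and the use of a Map as a row are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class FlatMappingSketch {
    static class Publication {
        long pubId;
        String title;
    }

    static class Book extends Publication {
        String isbn;      // hypothetical Book-only field
    }

    static class CD extends Publication {
        int trackCount;   // hypothetical CD-only field
    }

    // Builds the right subclass from one row of the single table.
    static Publication fromRow(Map<String, Object> row) {
        String type = (String) row.get("TYPE");
        Publication result;
        switch (type) {
            case "BOOK":
                Book book = new Book();
                book.isbn = (String) row.get("ISBN");
                result = book;
                break;
            case "CD":
                CD cd = new CD();
                cd.trackCount = ((Number) row.get("TRACKS")).intValue();
                result = cd;
                break;
            case "PUBLICATION":
                result = new Publication();
                break;
            default:
                throw new IllegalArgumentException("Unknown TYPE: " + type);
        }
        // Superclass columns are filled the same way for every type.
        result.pubId = ((Number) row.get("PUBID")).longValue();
        result.title = (String) row.get("TITLE");
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("TYPE", "CD");
        row.put("PUBID", 2L);
        row.put("TITLE", "Example Title");
        row.put("TRACKS", 12);
        row.put("ISBN", null); // Book column stays empty for a CD row
        System.out.println(fromRow(row).getClass().getSimpleName());
    }
}
```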

A second scheme is at the other extreme: it puts the fields of each class in a separate table. A fourth table, BOOKCDPUB, defines the relations of the instances. Figure 2-5 is an example of such a mapping scheme. This scheme should be used if Book and CD instances are rarely expected and a huge number of plain Publication objects exist in the database.

Figure 2-5. A "fat" inheritance mapping with one table for each class, plus an additional "forward join" table (rarely used by products in this form).

graphics/02fig05.gif

A third scheme saves the extra table: if the Publication class is abstract or no pure Publication objects exist, the Book and CD relation fields BOOKID and CDID can be moved into the PUBLICATIONS table. This scheme, shown in Figure 2-6, is frequently used in applications.

Figure 2-6. An inheritance mapping with one table for each concrete class, and "forward join foreign keys" from superclass to all possible subclasses.

graphics/02fig06.gif

It is a typical model with key/foreign-key relations that also illustrates the ambiguity of "is-a" and "has-a" references. One cannot tell from the table declarations whether Books and CDs are part of the Publication data or just referenced by it.

Figure 2-7 shows a fourth mapping scheme that eliminates the extra table by using backward references. Notice that the BOOKID and CDID columns are both primary keys of their respective tables and foreign keys to PUBLICATIONS.PUBID. The insert, update, and delete operations for Book and CD objects are faster and less disk space is used for each object, although queries can become quite complex, for instance, if Book objects are searched by title. A sensible and frequently seen optimization is to provide some sort of TYPE (or CLASS) column in PUBLICATIONS that allows determining the exact type of a Publication instance, i.e., whether it is "just" a Publication, a Book, or a CD instance, without further queries. Notice that in the previous mapping scheme this was determined by the presence of non-NULL BOOKID or CDID foreign keys in the PUBLICATIONS table.

Figure 2-7. An inheritance mapping with one table for each concrete class, and "backward join foreign key" from each subclass to the superclass.

graphics/02fig07.gif
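
To see why a search by title needs a join in this scheme, consider the statement a mapping layer might generate. The sketch assumes that TITLE lives in PUBLICATIONS and that BOOKS holds only the Book-specific columns; the class and constant names are hypothetical.

```java
class BackwardJoinSketch {
    // The title is stored in the superclass table, so finding Book objects
    // by title requires joining BOOKS to PUBLICATIONS on the shared key.
    static final String FIND_BOOKS_BY_TITLE =
        "SELECT P.PUBID, P.TITLE, B.* "
        + "FROM BOOKS B JOIN PUBLICATIONS P ON B.BOOKID = P.PUBID "
        + "WHERE P.TITLE = ?";

    public static void main(String[] args) {
        System.out.println(FIND_BOOKS_BY_TITLE);
    }
}
```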

Figure 2-8 shows yet another possible mapping scheme in which one table per concrete class is used. The definitions of the attributes of the superclass are duplicated in each subclass, but no data is duplicated, because an object is always a Publication, Book, or CD. If the Publication class is not abstract, a PUBLICATIONS table, with only PUBID and TITLE columns, and storing only instances of real Publication objects (no subclass instances), can be used; if Publication is abstract, then that table is not necessary.

Figure 2-8. An inheritance mapping with separate tables for each concrete class.

graphics/02fig08.gif

This mapping has very good performance characteristics for most query operations, except for queries on the Publication superclass "with subclasses," which cannot be expressed as a table join and lead to separate queries or a UNION. (It is conceivable, purely to optimize performance for this specific case, to duplicate some columns of records of the subclass tables [BOOKS and CDS] into the superclass PUBLICATIONS table to speed up the respective queries; in this case, a persistence framework would have to ensure consistency.)
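
A query on the superclass "with subclasses" could be assembled roughly as follows. The sketch assumes that every concrete-class table repeats the PUBID and TITLE columns under the same names, which the figure may or may not show; UnionQuerySketch and selectWithSubclasses are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class UnionQuerySketch {
    // One SELECT per concrete-class table, combined with UNION ALL.
    // UNION ALL suffices because each object is stored in exactly one table.
    static String selectWithSubclasses(List<String> tables) {
        return tables.stream()
            .map(table -> "SELECT PUBID, TITLE FROM " + table)
            .collect(Collectors.joining(" UNION ALL "));
    }

    public static void main(String[] args) {
        System.out.println(
            selectWithSubclasses(Arrays.asList("PUBLICATIONS", "BOOKS", "CDS")));
    }
}
```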

2.3.4 Security

When using O/R mapping, other problems result from different access restrictions to data. In Java, public, protected, private, and package access defines the rules for field read and write access, as well as method invocation.

Relational database systems have much more flexible access policies, based on user or group security control settings. Mapping problems occur, for example, if a column of a table must not be read by a user, but the class does contain a field for that column. Security issues may therefore influence the way columns are mapped to fields and classes, for example so that an illegal field access results in a NullPointerException.

2.3.5 Query language translation

Finding relational records is usually done by expressing a search in SQL. However, SQL is not well suited to express filtering expressions on object graphs.

Most O/R mapping tools thus have some form of non-SQL query facility. Queries are then translated into SQL before being sent to the database for execution. This facility generally comes either in the flavor of another string-based query language that mapping tools parse or directly as some sort of tree of Criteria, Operation, and other objects.

While simple cases are again easy to translate, the more advanced scenarios rapidly become fairly interesting exercises. For example, if the object query language, such as JDOQL, allows sorting on attributes of referenced objects, then the query translation must find out which tables to join, when to use outer joins (if available), and so on.
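
The following sketch hints at what such a translation involves for a single ordering expression. It is not how any particular JDO implementation works. The Reference metadata class, the alias names, and the rule that a field name maps to the upper-cased column name are all assumptions; only the BOOKS and AUTHORS tables and the AUTHORID key are taken from this chapter.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class QueryTranslationSketch {
    // Hypothetical mapping metadata for one reference field of Book.
    static final class Reference {
        final String table, alias, foreignKey, primaryKey;

        Reference(String table, String alias, String foreignKey, String primaryKey) {
            this.table = table;
            this.alias = alias;
            this.foreignKey = foreignKey;
            this.primaryKey = primaryKey;
        }
    }

    static final Map<String, Reference> BOOK_REFERENCES = new HashMap<>();
    static {
        BOOK_REFERENCES.put("author",
            new Reference("AUTHORS", "A", "AUTHORID", "AUTHORID"));
    }

    // Translates an ordering path such as "author.name" into SQL.
    static String orderBooksBy(String path) {
        String base = "SELECT B.* FROM BOOKS B";
        int dot = path.indexOf('.');
        if (dot < 0) {
            // Direct field of Book: no join is needed.
            return base + " ORDER BY B." + path.toUpperCase(Locale.ROOT);
        }
        Reference ref = BOOK_REFERENCES.get(path.substring(0, dot));
        if (ref == null) {
            throw new IllegalArgumentException("Unknown reference in: " + path);
        }
        String column = path.substring(dot + 1).toUpperCase(Locale.ROOT);
        // Outer join keeps books whose reference is NULL in the result.
        return base + " LEFT OUTER JOIN " + ref.table + " " + ref.alias
            + " ON B." + ref.foreignKey + " = " + ref.alias + "." + ref.primaryKey
            + " ORDER BY " + ref.alias + "." + column;
    }

    public static void main(String[] args) {
        System.out.println(orderBooksBy("author.name"));
    }
}
```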

2.3.6 Referential integrity, deletion, and so on

Pure Java objects exist as long as at least one other object references them. This familiar Java feature is known as garbage collection. The days of C and other low-level programming languages with the problem called "dangling pointers" (references to objects in previously de-allocated memory) that led to screaming developers are long gone when using Java.

However, persistent objects can usually be deleted explicitly. Relational databases have long had their own mechanism (referential integrity constraints) to prevent "dangling records," i.e., foreign-key references to non-existent primary keys.

Again, marrying these two different views is an issue that a mapping layer needs to address, be it by explicitly supporting relational referential integrity constraints and performing correct statement ordering if required, maybe by supporting cascaded delete options, or by explicitly not supporting relational referential integrity.
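
"Correct statement ordering" for deletes means removing referencing rows before the rows they point to. One way to derive such an order is a depth-first traversal of the foreign-key graph, as in the sketch below. The input map and the example foreign keys (BOOKS and CDS referencing PUBLICATIONS, BOOKS referencing AUTHORS) are assumptions, and cyclic references are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DeleteOrderingSketch {
    // foreignKeys: table -> tables it references via foreign keys.
    // Returns the tables ordered so that every table comes before the
    // tables it references, which is a safe order for DELETE statements.
    static List<String> deleteOrder(Map<String, List<String>> foreignKeys) {
        List<String> order = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        for (String table : foreignKeys.keySet()) {
            visit(table, foreignKeys, visited, order);
        }
        Collections.reverse(order);
        return order;
    }

    private static void visit(String table, Map<String, List<String>> foreignKeys,
                              Set<String> visited, List<String> order) {
        if (!visited.add(table)) {
            return; // already handled
        }
        for (String referenced
                : foreignKeys.getOrDefault(table, Collections.emptyList())) {
            visit(referenced, foreignKeys, visited, order);
        }
        // Post-order: referenced tables are added before the referencing one;
        // the caller reverses the list afterwards.
        order.add(table);
    }

    public static void main(String[] args) {
        Map<String, List<String>> foreignKeys = new LinkedHashMap<>();
        foreignKeys.put("BOOKS", Arrays.asList("PUBLICATIONS", "AUTHORS"));
        foreignKeys.put("CDS", Arrays.asList("PUBLICATIONS"));
        System.out.println(deleteOrder(foreignKeys));
    }
}
```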

2.3.7 Transparent persistence in O/R mapping

After the Java object model is mapped to columns and tables, an O/R mapping implementation must be able to read and write objects by some means. The basic operations on databases are query, insert, update, and delete. The fundamental idea of transparent persistence is to hide direct database access. A further objective is to minimize database access. This can be achieved by reading known data from a memory cache and updating only modified objects or even fields.

O/R mapping solutions provide tools that create the required code behind field access methods to transparently execute the appropriate SQL statements. Although it is possible to implement all the methods by hand, doing so is error-prone. Especially when the model changes from one application version to another, or when the model becomes more complex, many project teams spend more time implementing JDBC and SQL mapping code than implementing application logic. A closer look at this problem is taken in the JDO and JDBC chapter.
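
To give an impression of the kind of code such tools generate, the sketch below tracks modified fields in the setters and builds an UPDATE that touches only those columns. It is a hand-written illustration, not the output of a real enhancer; the TrackedAuthor class, the dirty map, and the AUTHORID key column in the WHERE clause are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class DirtyTrackingSketch {
    static class TrackedAuthor {
        private String name;
        private String countryCode;
        // Columns modified since the last flush, with their new values.
        private final Map<String, Object> dirty = new LinkedHashMap<>();

        String getName() { return name; }
        String getCountryCode() { return countryCode; }

        void setName(String name) {
            this.name = name;
            dirty.put("NAME", name);
        }

        void setCountryCode(String countryCode) {
            this.countryCode = countryCode;
            dirty.put("COUNTRY", countryCode);
        }

        // Builds an UPDATE for the modified columns only,
        // or returns null when nothing has changed.
        String buildUpdateSql() {
            if (dirty.isEmpty()) {
                return null;
            }
            StringJoiner assignments = new StringJoiner(", ");
            for (String column : dirty.keySet()) {
                assignments.add(column + " = ?");
            }
            return "UPDATE AUTHORS SET " + assignments + " WHERE AUTHORID = ?";
        }

        // Called after the statement was executed successfully.
        void markClean() {
            dirty.clear();
        }
    }

    public static void main(String[] args) {
        TrackedAuthor author = new TrackedAuthor();
        author.setName("Example Author");
        System.out.println(author.buildUpdateSql());
    }
}
```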

A brief example of what happens is given below. Assuming that an instance of a Book already exists in memory, the following list shows how an O/R implementation retrieves an Author object. The Java code looks like this:

 
 Book book = ... already in memory...;
 Author author = book.getAuthor();

The O/R mapping implementation might perform something similar to these steps:

  • Take the book's AUTHORID .

  • Create an Author instance.

  • Create a JDBC statement like

     
     SELECT * FROM AUTHORS WHERE AUTHORID = id

  • Copy the necessary fields of the result record into the author instance. This step includes a more or less complex attribute-to-field mapping due to the different type systems.

  • Return the author instance.
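
The steps above can be sketched in Java as follows. To keep the example self-contained, the JDBC query is replaced by a lookup function that returns a row as a Map. That function, the LazyLoadSketch wrapper, and the NAME column are assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

class LazyLoadSketch {
    static class Author {
        long id;
        String name;
    }

    static class Book {
        private final long authorId; // value of the book's AUTHORID column
        // Stand-in for "SELECT * FROM AUTHORS WHERE AUTHORID = id".
        private final LongFunction<Map<String, Object>> authorQuery;

        Book(long authorId, LongFunction<Map<String, Object>> authorQuery) {
            this.authorId = authorId;
            this.authorQuery = authorQuery;
        }

        Author getAuthor() {
            // Run the query for the book's AUTHORID.
            Map<String, Object> row = authorQuery.apply(authorId);
            if (row == null) {
                return null; // no matching record
            }
            // Create the instance and copy the needed columns into its fields.
            Author author = new Author();
            author.id = authorId;
            author.name = (String) row.get("NAME");
            return author;
        }
    }

    public static void main(String[] args) {
        Map<Long, Map<String, Object>> authors = new HashMap<>();
        Map<String, Object> row = new HashMap<>();
        row.put("NAME", "Example Author");
        authors.put(7L, row);

        Book book = new Book(7L, id -> authors.get(id));
        System.out.println(book.getAuthor().name);
    }
}
```

Note that this naive version creates a new Author instance on every call to getAuthor().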

2.3.8 Identities

Each object in Java has an implicit object identity based on its memory location within the JVM. This identity is used to express relationships with other objects in memory. However, object identity cannot be guaranteed across space and time. It is thus not suitable for identifying relationships in a datastore.

A relational database's view of relationships is based on primary and foreign keys. It is essential that a one-to-one association exists between objects in memory and records in a database, or else it becomes impossible to identify the appropriate rows for query, update, and delete operations. Nobody would expect two consecutive calls to book.getAuthor() to return different objects. Therefore, the pseudocode from the above example must be extended to satisfy this constraint:

  • Take the book's AUTHORID .

  • Check if an instance corresponding to AUTHORID already exists in memory. If yes, return that instance; else continue.

  • Create an Author instance.

  • Create a JDBC statement like

     
     SELECT * FROM AUTHORS WHERE AUTHORID = id

  • Copy the necessary fields of the result record into the author instance.

  • Return the author instance.
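
The extended steps amount to keeping a cache keyed by the primary key. A minimal sketch, again with a lookup function standing in for the JDBC query, might look like this. The AuthorLoader class, its queryCount field, and the NAME column are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

class IdentityMapSketch {
    static class Author {
        final long id;
        String name;

        Author(long id) {
            this.id = id;
        }
    }

    static class AuthorLoader {
        // One in-memory instance per AUTHORID.
        private final Map<Long, Author> cache = new HashMap<>();
        // Stand-in for "SELECT * FROM AUTHORS WHERE AUTHORID = id".
        private final LongFunction<Map<String, Object>> authorQuery;
        int queryCount = 0; // only here to make the caching effect visible

        AuthorLoader(LongFunction<Map<String, Object>> authorQuery) {
            this.authorQuery = authorQuery;
        }

        Author load(long authorId) {
            // Return the existing instance if this AUTHORID was loaded before.
            Author cached = cache.get(authorId);
            if (cached != null) {
                return cached;
            }
            queryCount++;
            Map<String, Object> row = authorQuery.apply(authorId);
            if (row == null) {
                return null; // no matching record
            }
            Author author = new Author(authorId);
            author.name = (String) row.get("NAME");
            cache.put(authorId, author);
            return author;
        }
    }

    public static void main(String[] args) {
        Map<Long, Map<String, Object>> authors = new HashMap<>();
        Map<String, Object> row = new HashMap<>();
        row.put("NAME", "Example Author");
        authors.put(7L, row);

        AuthorLoader loader = new AuthorLoader(id -> authors.get(id));
        System.out.println(loader.load(7L) == loader.load(7L));
    }
}
```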



Core Java Data Objects
ISBN: 0131407317
EAN: 2147483647
Year: 2003
Pages: 146
