Portability Versus Performance

It's important to consider whether there is any real likelihood of changing your database technology. This isn't the same thing as changing database vendor. For example, changing from a Microsoft SQL Server relational database to an Oracle database should have a far smaller impact than changing from a relational database to an object database. Nor is it the same thing as changing a database schema within the same database; schema changes are likely enough that we should be able to cope with them.

Organizations make a heavy investment in databases, both in license costs and staffing. Then they fill them with valuable data. They aren't going to simply throw this investment away. In practice, it's rare for J2EE applications to be ported between fundamentally different databases. In fact, it's more likely that an organization would move from one application server to another (even one application server platform, such as J2EE, to another, such as .NET, or vice versa), than move from an RDBMS to an ODBMS. Even a switch between RDBMS vendors is uncommon in practice, due to the investment involved.

Why is this question important? Because it has implications for the way in which we approach persistence. The unfortunate results of the assumption that database portability is paramount are (a) a lukewarm interest in an application's target database; and (b) effort being wasted achieving what for most applications is a non-goal: complete portability between databases. The assumption that organizations will view J2EE as the center of their data access strategy flaws much thinking on persistence in J2EE.

Consider a realistic scenario. A company has spent hundreds of thousands of dollars on an Oracle installation. The company employs several Oracle DBAs, as several applications in its software suite besides J2EE applications use the database. The J2EE team doesn't liaise with the DBAs, and insists on developing "portable" applications that don't take advantage of any of Oracle's features. Clearly, this is a poor strategy in terms of the company's strategic investment in Oracle and actual business need.

If it's clear that your organization is committed to a particular database technology (and very likely a particular vendor), the next question is whether to take advantage of vendor-specific functionality.

The answer is yes, if that functionality can deliver real benefits, such as improved performance. We should never reject such a possibility because we aim to achieve an application with 100% code portability; instead, we should ensure that we have a portable design that isolates any non-portable features behind Java interfaces (remember that good J2EE practice is based on good OO practice). Total portability of code often indicates a design that cannot be optimized for any platform.
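As a minimal sketch of isolating a non-portable feature behind a Java interface, consider primary key generation. All names here (`KeyGenerator`, `OracleSequenceKeyGenerator`, `InMemoryKeyGenerator`, and the sequence `account_seq`) are hypothetical and chosen only for illustration; the Oracle implementation assumes that such a sequence exists in the target schema.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical portable interface: callers never see vendor-specific SQL.
interface KeyGenerator {
    long nextKey() throws SQLException;
}

// Oracle-specific implementation. Assumes a sequence named account_seq exists.
class OracleSequenceKeyGenerator implements KeyGenerator {
    static final String SQL = "SELECT account_seq.NEXTVAL FROM DUAL";
    private final Connection connection;

    OracleSequenceKeyGenerator(Connection connection) {
        this.connection = connection;
    }

    @Override
    public long nextKey() throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(SQL);
             ResultSet rs = ps.executeQuery()) {
            if (!rs.next()) {
                throw new SQLException("Sequence returned no row");
            }
            return rs.getLong(1);
        }
    }
}

// Database-independent stand-in, for example for unit tests.
// Keys are unique only within a single JVM.
class InMemoryKeyGenerator implements KeyGenerator {
    private final AtomicLong counter = new AtomicLong();

    @Override
    public long nextKey() {
        return counter.incrementAndGet();
    }
}
```

Code that depends only on `KeyGenerator` stays portable; moving to another database would mean supplying one more implementation class rather than touching the callers.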

We need abstraction to achieve portability. The question is at what level we should achieve that abstraction.

Let's consider an example, with two alternative implementation approaches illustrating different levels of abstraction.

Abstraction using a DAO
We can decouple business logic from data access code and achieve a portable design by deciding that "the AccountManager session bean will use an implementation of a data access interface that can return value objects for all accounts with a balance over a specified amount and transactions in the last month totaling more than a specified amount". We've deferred the implementation of data access to the DAO without imposing any constraints on how it should go about it.
Abstraction using CMP entity beans
An attempt at complete code portability is to say, "The Account entity bean will use CMP. Its local home interface will have a findByAccountBalanceAndTransactionTotal() method to return entities meeting these criteria. This method will rely on an ejbSelectByAccountBalance() method that returns entities meeting the balance criteria, which is backed by an EJB QL query that's not RDBMS-specific. The findByAccountBalanceAndTransactionTotal() method will iterate over the collection of entities returned by the ejbSelectByAccountBalance() method, navigating the associated collection of Transaction entities for each to add their values."

This roundabout approach is necessary because EJB QL (as of EJB 2.0) does not support aggregate functions, probably because these are considered a relational concept. I think I got that algorithm right, but I'd definitely be writing unit tests to check the implementation (of course, this would be relatively difficult, as the code would run only inside an EJB container)!

Let's assume that we're developing the application in question to work with an RDBMS.

The first of these two approaches can be implemented using the capabilities of the database. The data access interface offers a high level of abstraction. Its implementation will most likely use JDBC, and the logic in the query can efficiently be implemented in a single SQL query. Porting the application to another database would at most involve reimplementing the same simple DAO interface using another persistence API (in fact, the SQL query would probably be portable between RDBMSs).
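The DAO approach might be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the interface and class names (`AccountDao`, `JdbcAccountDao`, `AccountValue`, `DataAccessException`), the table and column names (`account`, `account_transaction`, `transaction_date`, `amount`), and the choice to pass the month boundaries in as parameters are all hypothetical.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.Date;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;

// Hypothetical value object returned to the business tier.
final class AccountValue {
    final long id;
    final BigDecimal balance;

    AccountValue(long id, BigDecimal balance) {
        this.id = id;
        this.balance = balance;
    }
}

// Hypothetical unchecked exception, so the interface does not expose JDBC.
class DataAccessException extends RuntimeException {
    DataAccessException(String message, Throwable cause) {
        super(message, cause);
    }
}

// The portable part of the design: no persistence API appears here.
interface AccountDao {
    List<AccountValue> findByBalanceAndTransactionTotal(
            BigDecimal minBalance, BigDecimal minTotal,
            LocalDate from, LocalDate to);
}

// One possible implementation: the whole criterion is a single SQL query.
class JdbcAccountDao implements AccountDao {
    // Assumed schema: account(id, balance),
    // account_transaction(account_id, amount, transaction_date).
    static final String SQL =
          "SELECT a.id, a.balance "
        + "FROM account a JOIN account_transaction t ON t.account_id = a.id "
        + "WHERE a.balance > ? "
        + "AND t.transaction_date >= ? AND t.transaction_date < ? "
        + "GROUP BY a.id, a.balance "
        + "HAVING SUM(t.amount) > ?";

    private final DataSource dataSource;

    JdbcAccountDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public List<AccountValue> findByBalanceAndTransactionTotal(
            BigDecimal minBalance, BigDecimal minTotal,
            LocalDate from, LocalDate to) {
        List<AccountValue> result = new ArrayList<>();
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(SQL)) {
            ps.setBigDecimal(1, minBalance);
            ps.setDate(2, Date.valueOf(from));   // inclusive start of period
            ps.setDate(3, Date.valueOf(to));     // exclusive end of period
            ps.setBigDecimal(4, minTotal);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    result.add(new AccountValue(rs.getLong(1), rs.getBigDecimal(2)));
                }
            }
        } catch (SQLException e) {
            throw new DataAccessException("Account query failed", e);
        }
        return result;
    }
}
```

The aggregation happens in the database (`GROUP BY` with `HAVING SUM(...)`), so only the matching accounts cross the wire. A session bean would hold a reference typed as `AccountDao` and never see the SQL.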

The second approach adds significant overhead because it forces us to perform the abstraction at too low a level. As a result, we can't use the RDBMS efficiently. We must use an EJB QL query that is much less powerful than the SQL query, and are forced to pull too much data out of the database and perform data operations in Java in the J2EE server. The result is greater complexity and much poorer performance.
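To make the cost concrete, the following sketch mimics the shape of the entity bean algorithm with plain Java objects. It is not real CMP code, which would need an EJB container and deployment descriptors; the classes `AccountEntity`, `TransactionEntity`, and `AccountHomeSketch` are hypothetical stand-ins, and the in-memory list stands in for the database table.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Stand-in for a Transaction entity.
class TransactionEntity {
    final BigDecimal amount;

    TransactionEntity(String amount) {
        this.amount = new BigDecimal(amount);
    }
}

// Stand-in for an Account entity with a container-managed relationship.
class AccountEntity {
    final long id;
    final BigDecimal balance;
    final List<TransactionEntity> transactions = new ArrayList<>();

    AccountEntity(long id, String balance) {
        this.id = id;
        this.balance = new BigDecimal(balance);
    }
}

class AccountHomeSketch {
    private final List<AccountEntity> table;

    AccountHomeSketch(List<AccountEntity> table) {
        this.table = table;
    }

    // Stand-in for ejbSelectByAccountBalance(): the only part a
    // non-aggregating query language can express.
    List<AccountEntity> selectByAccountBalance(BigDecimal minBalance) {
        List<AccountEntity> matches = new ArrayList<>();
        for (AccountEntity account : table) {
            if (account.balance.compareTo(minBalance) > 0) {
                matches.add(account);
            }
        }
        return matches;
    }

    // The aggregation runs in Java: every transaction of every candidate
    // account must be loaded and summed in the application tier.
    List<AccountEntity> findByAccountBalanceAndTransactionTotal(
            BigDecimal minBalance, BigDecimal minTotal) {
        List<AccountEntity> result = new ArrayList<>();
        for (AccountEntity account : selectByAccountBalance(minBalance)) {
            BigDecimal total = BigDecimal.ZERO;
            for (TransactionEntity tx : account.transactions) {
                total = total.add(tx.amount);
            }
            if (total.compareTo(minTotal) > 0) {
                result.add(account);
            }
        }
        return result;
    }
}
```

Even in this simplified form, the nested loop touches every transaction of every account that passes the balance test; with real entity beans each of those navigations may also involve loading entity state from the database.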

In the business situation I've described, the fact that the second approach delivers total code portability is of no real benefit. The first approach gives us an optimal solution now, and we've neatly isolated the little code we'd need to reimplement if the back-end ever changes significantly.

There are many things that we can do in EJB QL without this kind of pain, so this example shows the EJB QL abstraction in its worst light. It illustrates the fact that inflexible pursuit of code portability can result in highly inefficient data access.

Important

If it's impossible for an application to take advantage of worthwhile database-specific functionality, the application has a poor architecture.