Introduction


The Resource tier often comprises a diverse set of enterprise applications such as databases, ERP, CRM, and legacy systems. It is common to develop point-to-point integrations to each of these applications, which in the end yields a convoluted system landscape. Adding the Java EE and .NET technology disparity increases the complexity of the integration task. Therefore, as a best practice, it is valuable to abstract out the back-end integration by building a so-called Integration tier. This Integration tier provides cohesive access to the corresponding Resource tier in a way that is transparent to the rest of the business logic. In the context of Java EE .NET integration, it is particularly important to factor out the integration logic to avoid a brittle architecture and allow individual services to interconnect using Service Oriented Architecture (SOA).

This chapter demonstrates how to achieve interoperability with back-end services using synchronous integration and how to overcome Java and .NET technology mismatches. The first strategy involves building an interoperable Data Access Layer that allows Java EE and .NET applications to access back-end applications. Additionally, the chapter discusses design techniques related to legacy integration that are present in most enterprise Resource tiers.

Those familiar with Java EE and .NET persistence layers, connection pooling techniques, and object-relational mapping mechanisms can jump straight to the "Data Access Adapter Strategy" section. Otherwise, it might be useful to learn the basics of Java and .NET database connectivity.

Java EE Persistence Mechanism

There are a few options available to Java developers to build database connectivity. Depending on the application requirements, they might select Plain Old Java Objects (POJOs) to connect to the back-end systems. Alternatively, if the system relies on Enterprise JavaBeans (EJBs), there are two flavors of persistence mechanisms to choose from: Bean Managed Persistence (BMP), where database access is programmatically managed, or Container Managed Persistence (CMP), which relies on the underlying application server or third-party integrated CMP solutions to connect to the database and perform object-to-relational mapping. Both BMP and the POJO model directly leverage Java Database Connectivity (JDBC) and allow users to construct SQL statements manually. With CMP, as well as with Java Data Objects (JDO), which is another option available to assemble a persistence layer, developers do not have to manually construct SQL queries. With JDO, an XML mapping outlines how Java objects map to a relational database. The XML mapping can be defined with tools that are available with commercial and open source JDO implementations. With EJB CMP, the mapping-properties element contains the object-to-relational database mapping. With both of these approaches, the Data Access Layer is transparent at development time and configured at deployment time.

JDBC APIs facilitate traditional database Create/Read/Update/Delete (CRUD) operations and support embedded SQL statements that can be passed as method arguments. JDBC also supports stored procedures and prepared statements as well as SQL3 data types, scrollable result sets, and programmatic and batch updates. There are numerous providers of JDBC drivers; see [JDBC_Drivers]. Depending on the database, the developer might have to choose one of four different types of JDBC drivers:

  • Type 1 drivers provide the Java bridging technology to access a database driver. The JDBC-ODBC bridge is an example of such a driver.

  • Type 2 drivers are native API drivers that allow Java applications to make JNI calls to the corresponding low-level (for example, C/C++) methods for database access.

  • Type 3 drivers are based on the networking APIs and use sockets to communicate between the client application and middleware on the database server, which translates the client call to the corresponding database driver call.

  • Type 4 drivers are the most popular because they are pure Java drivers that use Java sockets to communicate to the database. This type of driver implements the JDBC specification in Java and is the simplest to configure.

The JDBC APIs are provided with the java.sql library and the optional package, javax.sql, which includes the JDBC 2.0 Standard Extension API.

From a programming standpoint, connecting to a database starts with a JNDI look up call, where the logical name assigned to the data source would be passed:

 DataSource dataSource = (DataSource)context.lookup("java:comp/env/jdbc/SupplyChainDB"); 


After obtaining a data source, the next step is to get the database connection that is either retrieved from a connection pool or newly created:

 Connection conn = dataSource.getConnection();
 Statement stmt = conn.createStatement();


In cases where a JNDI context is not present, you can use DriverManager to obtain a connection:

 DriverManager.getConnection() 


The next step is to execute an SQL statement. The JDBC API supports regular SQL statements that can be embedded into Java applications in the following manner:

 ResultSet rs = stmt.executeQuery("SELECT descr, qty FROM ship_goods"); 


Updating database information can be achieved with the executeUpdate() operation:

 String query1 =
     "INSERT INTO ship_goods VALUES ('847SWS', 'SunFire V40z', 3)";
 stmt.executeUpdate(query1);


In addition to the embedded Java SQL statements, JDBC supports prepared statements using PreparedStatement, which allow passing parameters into a query at the time of statement execution:

 PreparedStatement stmt =
     conn.prepareStatement("update ship_goods set qty = ?");


The statement is invoked with the corresponding parameter passed at the time of invocation. The value of the parameter has to be bound to the corresponding data type, such as Integer:

 stmt.setInt(1, item.getQty());


The prepared statement is now ready for execution:

 stmt.execute();
 stmt.close();
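Because prepared-statement parameter indices are 1-based and must match the number of `?` placeholders, an off-by-one binding surfaces only as a run-time SQLException. The helper below is a hypothetical illustration of that rule, not part of the JDBC API; it merely counts placeholders the way a driver conceptually does:

```java
// Hypothetical helper illustrating 1-based parameter binding rules;
// not part of the JDBC API.
public class PlaceholderCheck {

    // Count '?' placeholders that appear outside quoted string literals.
    static int countPlaceholders(String sql) {
        int count = 0;
        boolean inLiteral = false;
        for (char c : sql.toCharArray()) {
            if (c == '\'') inLiteral = !inLiteral;
            else if (c == '?' && !inLiteral) count++;
        }
        return count;
    }

    // A binding index is valid when 1 <= index <= placeholder count.
    static boolean isValidIndex(String sql, int index) {
        return index >= 1 && index <= countPlaceholders(sql);
    }

    public static void main(String[] args) {
        String sql = "update ship_goods set qty = ? where id = ?";
        System.out.println(countPlaceholders(sql)); // 2
        System.out.println(isValidIndex(sql, 1));   // true
        System.out.println(isValidIndex(sql, 3));   // false: only two markers
    }
}
```

With the single-parameter update statement shown earlier, only index 1 is valid, which is why binding at index 3 would fail.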


Database stored procedures can be invoked via CallableStatement. In the example that follows, three parameters have to be passed to the shipGoods() stored procedure at the time of execution. This is shown in Listing 6-1.

Listing 6-1. JDBC Callable Statement

 CallableStatement stmt =
     connection.prepareCall("{call shipGoods(?, ?, ?)}");
 stmt.setString(1, item.getId());
 stmt.setString(2, item.getDescr());
 stmt.setInt(3, item.getQty());
 stmt.execute();

It is important to note that an insert, delete, or update operation has to be enclosed within the scope of the transaction that either commits or rolls back, as shown in Listing 6-2.

Listing 6-2. Transaction Management

 try {
     userTxn.begin();
     stmt.executeUpdate("update …");
     userTxn.commit();
 } catch (SQLException exc) {
     try {
         // In case of an error, roll back the transaction
         userTxn.rollback();
     } catch (Exception e) {
         // process exception
     }
 } finally {…}

JDBC ResultSet and RowSet

To retrieve and iterate over a result set, JDBC provides the ResultSet interface, which allows parsing the result of the executed query (Listing 6-3):

Listing 6-3. ResultSet Example

 ResultSet rs = stmt.executeQuery("SELECT descr, qty FROM ship_goods");
 while (rs.next())
     System.out.println(rs.getString("descr") + "      "
         + rs.getInt("qty"));

To preview information stored by the result set, the JDBC APIs provide the ResultSetMetaData and RowSetMetaData interfaces. Previewing the number of records may be useful when a result set is large and has to be processed gradually. The javax.sql.RowSet is an extension to the ResultSet that complies with the JavaBean component model. Property manipulation and event notification are supported as part of that model. Another useful feature relates to the fact that a RowSet can be serialized for remote processing. For a scrollable and updateable result set, CachedRowSet, which is a JavaBean, should be considered. A CachedRowSet can be easily populated with database data and treated as a "disconnected" component from the data source. A CachedRowSet can be sent to a remote client application for updates without maintaining an open database connection. This model is more scalable and offers a greater degree of flexibility. Aside from the CachedRowSet, the JDBC 3.0 API defines the JdbcRowSet interface that allows a ResultSet to be represented in the form of a JavaBean. A WebRowSet extends CachedRowSet with Web application functionality. Specifically, a WebRowSet can read and write a RowSet object in a well-formed XML format. A FilteredRowSet, as implied by its name, is used for filtering on inbound and outbound RowSet read and write operations. A JoinRowSet is another type of RowSet that acts as a container combining multiple RowSet objects. See [JavaSQLDescr] for more details on RowSet functionality.
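The value of a "disconnected" RowSet is easier to see with a toy model. The DisconnectedRows class below is a hypothetical sketch, not the javax.sql.rowset API: rows are copied into a serializable structure while connected, and the copy remains fully usable after the connection is closed, which is the core idea behind CachedRowSet.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the "disconnected" idea behind CachedRowSet:
// rows are copied into plain serializable objects, so no database
// connection needs to stay open while a client works with them.
public class DisconnectedRows implements Serializable {
    private final List<Map<String, Object>> rows = new ArrayList<>();

    public void addRow(Map<String, Object> row) {
        rows.add(new LinkedHashMap<>(row)); // defensive copy of the row
    }

    public int size() { return rows.size(); }

    public Object get(int rowIndex, String column) {
        return rows.get(rowIndex).get(column);
    }

    public static void main(String[] args) {
        DisconnectedRows cache = new DisconnectedRows();
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("descr", "SunFire V40z");
        row.put("qty", 3);
        cache.addRow(row); // in real code this happens while connected
        // ... the connection is closed here; the cache remains usable ...
        System.out.println(cache.get(0, "descr") + " " + cache.get(0, "qty"));
    }
}
```

Being Serializable, such an object can be shipped to a remote client, mirroring how a CachedRowSet travels without an open connection.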

JDBC also supports retrieval of the database metadata via DatabaseMetaData, ResultSetMetaData, and RowSetMetaData classes.

After data have been retrieved and corresponding database operations have been completed, it is time to close the database connection.

Closing Connections

When manipulating data using the JDBC API, it is important to close any database connections, statements, and result sets to prevent leaking resources. It is wise to close the database connection in the finally block to ensure that the closing logic is executed regardless of the program success or failure.

Listing 6-4. Closing JDBC Connection

finally {
    if (stmt != null)
        try {
            stmt.close();
        } catch (SQLException sqlex) { }
    if (resultSet != null)
        try {
            resultSet.close();
        } catch (SQLException sqlex) { }
    if (conn != null)
        try {
            conn.close();
        } catch (SQLException sqlex) { }
}
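The nested try/catch blocks in Listing 6-4 are repetitive and easy to get wrong. One common refactoring, sketched below with a hypothetical closeQuietly helper (not a JDBC API), funnels every close call through a single method that swallows the checked exception; on later Java versions, Connection, Statement, and ResultSet all fit the AutoCloseable signature used here:

```java
// Illustrative cleanup helper; on Java 7 and later, Connection,
// Statement, and ResultSet all implement AutoCloseable.
public class JdbcCleanup {

    // Closes a resource, ignoring close-time failures; null-safe.
    static boolean closeQuietly(AutoCloseable resource) {
        if (resource == null) return false;
        try {
            resource.close();
            return true;
        } catch (Exception ignored) {
            return false; // close failures are deliberately swallowed
        }
    }

    public static void main(String[] args) {
        // Stub resources standing in for a Connection/Statement/ResultSet.
        AutoCloseable ok = () -> { };
        AutoCloseable failing = () -> { throw new Exception("close failed"); };
        System.out.println(closeQuietly(ok));      // true
        System.out.println(closeQuietly(failing)); // false, exception hidden
        System.out.println(closeQuietly(null));    // false, null-safe
    }
}
```

The finally block then shrinks to three one-line calls: closeQuietly(resultSet); closeQuietly(stmt); closeQuietly(conn);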

The next sections look at database access functionality available under .NET.

Accessing Data via ADO.NET

On the .NET platform, ADO.NET is the main mechanism to access a database. Similar to JDBC drivers, ADO.NET providers enable access to the underlying data sources. There are three main types of .NET Framework providers. They are SQL Server, OLEDB, and ODBC providers, each having individual pros and cons, which will be discussed here.

The ADO.NET provider for SQL Server belongs to the class of managed providers that enable native connectivity to the underlying data source. Managed providers are executed within the CLR boundaries, which is to say, managed by the CLR. Managed providers are very appealing to .NET developers due to their inherent OOP support and optimal performance. The .NET Framework also allows connecting to a database via OLEDB and ODBC. These providers are commonly used to connect to legacy databases that do not have managed providers. The ODBC API is implemented in C, which makes it difficult to apply to the object-oriented .NET application model. Another drawback of the ODBC API is that it is strictly designed to work with relational databases, and therefore it is difficult to extend these APIs to connect to non-relational resources. Finally, OLEDB requires low-level programming that involves excessive complexity at development time. Both ODBC and OLEDB require corresponding .NET providers to facilitate application access to the underlying data sources. The ODBC .NET provider wraps the native ODBC driver. Similarly, the .NET OLEDB provider abstracts out the OLEDB data sources. Adding a layer of indirection between a driver and an application tends to degrade overall application performance. Therefore, .NET managed providers, such as the SQL provider, help to overcome the limitations of OLEDB and ODBC solutions. Figure 6-1 depicts the three main choices for .NET developers to connect to data sources.

Figure 6-1. .NET Data source architecture


Microsoft offers .NET managed providers for SQL Server and Oracle. In addition, managed providers for Oracle, SQL Server, DB2, and Sybase are available from DataDirect and OpenLink Software (see [DataDirect] and [OpenLink]). An example demonstrated later in the chapter uses the Microsoft SQL Server with an SQL managed provider and the Core Lab Software Development's MySql database managed provider.

ADO.NET introduces managed, object-oriented APIs, in contrast to the unmanaged COM-based ADO APIs. The ADO.NET API is intended for Web application development, supporting disconnected result sets and providing intrinsic XML support. A result set retrieved from the database can be easily represented in XML format and transferred between applications for processing. ADO.NET inherits some of its logic from the ActiveX Data Objects (ADO) APIs, designed for COM applications, which simplifies ADO developers' transition to .NET development. Connecting to a database is fairly straightforward, as discussed next.

Connecting to a Data Source

The libraries that support ADO.NET are bundled within the System.Data.dll assembly. The API comprises five namespaces: Common, OleDb, SqlClient, SqlTypes, and Microsoft.Data.Odbc. The SQL Server System.Data.SqlClient namespace defines data provider classes to connect to the Microsoft SQL Server database. Third-party providers enable access to databases such as MySql [MySQL_DotNet], used by the sample application.

Figure 6-2 highlights how to establish a connection to the database.

Figure 6-2. Database connectivity overview


A Connection object, or in this case the MySqlConnection and SqlConnection, provides connectivity to the underlying data source. A DataAdapter object uses the Connection to retrieve information by executing a corresponding command. The Command object, in this example SqlCommand and MySqlCommand, is used to execute standard CRUD operations.

The first step is to create a Connection object with the corresponding host, database name, and database user ID and password information that is passed in the form of a connection string:

 string connString =
     "Data Source=(local);uid=sa;pwd=admin;database=Warehouse_B";
 SqlConnection conn = new SqlConnection(connString);


Next is to compose an SQL command that inserts some data into the ship_goods table:

 string query =
     "INSERT INTO ship_goods (id, descr, qty) " +
     "VALUES ('01JKY977Y', 'SunFire V40z', 2)";


The Command object, created next, executes the previous SQL call:

 SqlCommand command = new SqlCommand(query);
 command.Connection = conn;


Next is to open the connection and execute the command:

 conn.Open();
 command.ExecuteNonQuery();


ADO.NET allows using a Command object directly to execute CRUD operations, or using the Command in conjunction with a DataAdapter to retrieve data into a DataSet for disconnected processing. The code snippet that follows shows how to populate a DataSet object using SqlDataAdapter:

 SqlDataAdapter adapter = new SqlDataAdapter(sql, conn);
 DataSet dataset = new DataSet("ShipGoodsDescr");
 adapter.Fill(dataset, "Descr");


The disconnected DataSet can then be passed to a remote client for display or further processing. As can be seen from the preceding example, Connection, Command, DataReader, and DataAdapter comprise the core classes of the ADO.NET namespace. The Command object executes not only queries but also stored procedures, and returns DataSet or DataReader objects, discussed hereafter.

Similar to the JDBC model, ADO.NET allows the execution of stored procedures. For that the command type needs to be set as follows:

 command.CommandType = CommandType.StoredProcedure;


The corresponding parameters can then be passed to the stored procedure. As a way to optimize the performance of stored procedures, Microsoft introduced a new component called Data Access Application Block (DAAB).

Data Access Application Block

The goal behind the DAAB component is similar to the JDBC objective, which is to develop code that is portable across multi-vendor databases such as SQL Server, Oracle, and DB2. DAAB introduces a layer of abstraction over database access, thus hiding database specifics from .NET developers. In terms of stored procedure optimization, DAAB defines the ParameterCache class that caches the necessary parameters associated with a stored procedure and avoids future lookup calls to the database when subsequent stored procedure invocations are performed. Another nice feature of DAAB is abstraction over the database name. A logical name for a database, similar to the JNDI name, can be used by an application in lieu of hard-coded details of the underlying data source. Application configuration files store database specifics, which allows a developer to easily switch between databases without having to recompile the code. The DatabaseFactory object retrieves required configuration parameters at run time to access the corresponding database. See [DAAB] for additional information.
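The parameter-caching idea behind DAAB's ParameterCache can be sketched in a few lines. The StoredProcParameterCache class below is hypothetical Java used only to illustrate the technique (DAAB itself is a .NET library): the first call for a given procedure pays the metadata-discovery cost, and later calls are served from the cache.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of DAAB-style parameter caching: the parameter
// list of a stored procedure is discovered once (an expensive database
// round trip in real code) and reused on subsequent invocations.
public class StoredProcParameterCache {
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    int discoveryCalls = 0; // counts cache misses, for illustration only

    // 'discover' stands in for the database metadata lookup.
    List<String> parametersFor(String procName,
                               Function<String, List<String>> discover) {
        return cache.computeIfAbsent(procName, name -> {
            discoveryCalls++;
            return discover.apply(name);
        });
    }

    public static void main(String[] args) {
        StoredProcParameterCache cache = new StoredProcParameterCache();
        Function<String, List<String>> discover =
            name -> List.of("id", "descr", "qty"); // pretend metadata query
        cache.parametersFor("shipGoods", discover); // miss: hits "database"
        cache.parametersFor("shipGoods", discover); // hit: served from cache
        System.out.println(cache.discoveryCalls);   // 1
    }
}
```

Only one discovery call is made no matter how many times shipGoods is invoked, which is exactly the saving ParameterCache provides.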

ADO.NET DataReader and DataSet

As part of the System.Data namespace, ADO.NET provides two main methodologies to fetch data: a connected model and a disconnected model. The connected model uses a DataReader to fetch rows of data through an active connection to a data source and is somewhat similar to the JDBC ResultSet. The disconnected model uses the DataAdapter as a bridge between the database and a DataSet (which resembles the JDBC RowSet), offering disconnected, updateable, scrollable access to the data. A DataSet can be used to pass information to a remote client application for processing. Because retrieved data often have to be preordered or filtered, the DataView object can be used to represent different views of the same DataSet. A DataViewManager embraces a collection of views, which simplifies data management.

Listing 6-5 shows how to retrieve data from a database.

Listing 6-5. ADO.NET DataReader Example

 string sqlSelect = "SELECT descr FROM ship_goods";
 SqlCommand cmd = new SqlCommand(sqlSelect, conn);
 SqlDataReader reader = cmd.ExecuteReader();
 while (reader.Read())
     Console.Write(reader.GetString(0));

The DataReader is also useful in situations where you need to execute multiple queries as a single batch job, as shown in Listing 6-6:

Listing 6-6. DataReader Batch Job

 string batch =
     "SELECT * FROM ship_goods; " +
     "SELECT * FROM purchase_order; " +
     "SELECT * FROM invoice";
 SqlCommand command = new SqlCommand(batch, conn);
 SqlDataReader reader = command.ExecuteReader();
 // reader.NextResult() advances to the next result set in the batch

From a best practices standpoint, the DataReader should be used with the Command object for rapid data access, as it provides only forward-only, read-only functionality. The DataReader does not allow modifying the fetched data. On the other hand, the DataAdapter is used in conjunction with the DataSet to enable modifications to the retrieved data.

ADO.NET provides a rich set of data structures to store relational data in an object format. These objects include DataTable, DataColumnCollection, DataColumn, DataRowCollection, DataRow, ConstraintCollection, and Constraint. ADO.NET also supports access to the underlying resource's metadata, such as DataColumn information.

Closing ADO.NET Connection

After the data have been fetched, it's important not to forget to close the database connection:

 reader.Close(); conn.Close(); 


See [ADO.NET_Examples], [ADO.NET_C#], and [DataAccess_ArchitectureGuide] for more details on best practices developing ADO.NET applications.

After having walked through Java and .NET database APIs, the next step is to explore the connection pooling functionality.

Connection Pooling with .NET and Java EE

Both Java and .NET support a connection-pooling mechanism that allows reusing database connections. The purpose behind a connection pool is to enhance application performance, reduce underlying resources utilization, and simplify the programming model.

In .NET you need to close or dispose of the connection to ensure that the connection is returned to the pool. Common parameters that can be set include the initial pool size, connection lifetime, maximum pool size, and minimum pool size. See [ConnPooling_.NET] for more details on setting up a connection pool under .NET. Defining a connection pool in Java is similar. A connection pool can be established for both Java EE and standalone Java applications; the former leverage the pool of resources provided by Java EE application servers. See [ConnPooling_Java] for more details on setting up a connection pool in Java. Listing 6-7 defines a connection pool in Java.

Listing 6-7. Connection Pool in JDBC

 Context ctx = new InitialContext();
 ConnectionPoolDataSource dataSource =
     (ConnectionPoolDataSource) ctx.lookup("jdbc/Warehouse_A");
 PooledConnection pcon =
     dataSource.getPooledConnection("root", "admin");

As has already been shown, connection initialization starts with a JNDI lookup call. The object returned by looking up the connection pool's logical name is narrowed to the javax.sql.ConnectionPoolDataSource type. A connection is then obtained from the getPooledConnection() method.
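The mechanics that a ConnectionPoolDataSource hides can be illustrated with a toy pool. The SimplePool class below is a hypothetical sketch, not a production pool (it has no validation, timeouts, or size limits): releasing an object returns it for reuse, mirroring how closing a pooled JDBC connection hands it back to the pool rather than tearing down the physical connection.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Toy object pool sketching the reuse behavior behind JDBC connection
// pooling; real pools add validation, timeouts, and size limits.
public class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    int created = 0; // how many "physical connections" were opened

    public SimplePool(Supplier<T> factory) { this.factory = factory; }

    public T borrow() {
        if (!idle.isEmpty()) return idle.pop(); // reuse an idle object
        created++;
        return factory.get();                   // otherwise open a new one
    }

    // Analogous to calling close() on a pooled connection.
    public void release(T obj) { idle.push(obj); }

    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(StringBuilder::new);
        StringBuilder first = pool.borrow();  // pool empty: creates one
        pool.release(first);                  // returned to the pool
        StringBuilder second = pool.borrow(); // reused, not re-created
        System.out.println(first == second);  // true
        System.out.println(pool.created);     // 1
    }
}
```

This is why forgetting to close a pooled connection is costly: the object is never released, so the pool keeps opening new physical connections.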

Along with connection-pooling functionality, both Java and .NET applications leverage the persistence mechanism, which is explored next.

Object-Relational Mapping

Most enterprise applications use a mapping between the domain object model and the database relational model. There is often no direct correspondence that can be drawn easily between these two models, and application developers have to translate relational schema into an object-oriented class hierarchy.
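That hand translation is worth seeing once. The sketch below is hypothetical (the ShipGoodsItem class is invented for illustration; the column names come from the ship_goods examples earlier in the chapter): it maps one relational row, represented here as a map of column names to values, into a domain object. This per-field drudgery is exactly what ORM tools automate.

```java
import java.util.Map;

// Hand-written object-relational mapping for one table; ORM tools
// generate or configure exactly this kind of column-to-field translation.
public class ShipGoodsItem {
    private final String id;
    private final String descr;
    private final int qty;

    public ShipGoodsItem(String id, String descr, int qty) {
        this.id = id;
        this.descr = descr;
        this.qty = qty;
    }

    // Maps one ship_goods row (column name -> value) to a domain object.
    static ShipGoodsItem fromRow(Map<String, Object> row) {
        return new ShipGoodsItem(
            (String) row.get("id"),
            (String) row.get("descr"),
            ((Number) row.get("qty")).intValue());
    }

    public String getDescr() { return descr; }
    public int getQty() { return qty; }

    public static void main(String[] args) {
        Map<String, Object> row =
            Map.of("id", "847SWS", "descr", "SunFire V40z", "qty", 3);
        ShipGoodsItem item = ShipGoodsItem.fromRow(row);
        System.out.println(item.getDescr() + " x" + item.getQty());
    }
}
```

Multiply this by every table, association, and type conversion in a schema, and the appeal of declarative mapping becomes clear.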

To simplify the tedious translation routine between these two models, there are object-relational mapping (ORM) tools available for both the Java and .NET platforms [.NET_ORM_Tools]. With the Java EE architecture, developers can leverage Entity Beans with CMP or BMP to perform mapping between the object and relational layers. The complexity of the EJB persistence model resulted in the development of a simpler Java Data Objects (JDO) specification and open source solutions such as Hibernate. Third-party products, including Oracle TopLink and SolarMetric Kodo, offer tools that shield application developers from the underlying SQL calls.

Considering the importance of the persistence layer for any enterprise application and the complexity of EJBs, a new persistence model has been introduced in the EJB 3.0 specification, developed as part of JSR 220. This specification defines a persistence layer using Plain Old Java Objects (POJOs), thus no longer requiring an EJB to implement the home interface. The idea behind this specification is to simplify and standardize the Java persistence APIs. This new model leverages best practices from Java Data Objects (JDO) and Hibernate, both of which are commonly used in the industry to develop a persistence layer.

Common best practices such as Data Transfer Objects (DTO) and Service Locator patterns are used by the EJB 3.0 specification. Additionally, the specification uses annotations and the EJB QL query language. The EJB 3.0 specification intends to steer Java developers toward a cohesive persistence model and will be available as part of the Java EE 5.0 platform. See [JDO], [Hibernate], [JSR_220], [TopLink], and [Kodo] for more details.

Version 2.0 of the .NET Framework introduces ObjectSpaces, which enable O/R mapping. Object-to-relational schema mapping with ObjectSpaces encompasses three parts: a Relational Schema Definition (RSD), an Object Schema Definition (OSD), and a Mapping Schema that links the other two schemas. To initialize ObjectSpaces, only the Mapping Schema, which defines the mapping rules, needs to be passed, as shown in Listing 6-8.

Listing 6-8. .NET Mapping Schema

 <m:Mappings>
     <m:Map SourceVariable="ship_goods" TargetSelect="Retailer.ShipGoods">
         <m:FieldMap SourceField="ID" TargetField="ID" />
         <m:FieldMap SourceField="QTY" TargetField="Quantity" />
         <m:FieldMap SourceField="DESCR" TargetField="Description" />
     </m:Map>
 </m:Mappings>

More can be learned about ObjectSpaces in [ObjectSpaces]; samples, documentation, and tools can be downloaded as part of the WinFX SDK [WinFX_SDK].

When building a persistence layer in Java or .NET, there are numerous resources available to guide developers through the process and to determine the optimal O/R mapping tool. Interestingly, a popular open source Java ORM tool, Hibernate, is being ported to .NET; see [NHibernate] for more details. Among commercial solutions, LLBLGen Pro and EntityBroker offer O/R mapping solutions under .NET. See [LLBLGen_Pro] and [EntityBroker]. For an additional list of tools, please refer to [ORM_Tools_Java] and [ORM_Tools_.NET].

After a lengthy introduction to various data access mechanisms, it is time to move on with this chapter's main objective: Resource tier interoperability. First is a brief discussion of the business scenario of the developed application, next is a review of a popular interoperability strategy, and finally a demonstration of the application.

Business Scenario Overview

The WS-I Supply Chain Management sample application continues to be the framework in these illustrations, and a Source Goods use case has been selected to demonstrate effective ways to connect to a back-end infrastructure. The Source Goods scenario is depicted in Figure 6-3:

Figure 6-3. Business scenario sequence diagram


The main application processes a ShipGoods request, for which the application has to build integration points with multiple back-end warehousing systems. Depending on goods availability, the application submits requests to multiple warehouse systems until the order has been fulfilled. For the purpose of demonstrating how to connect to heterogeneous back-end environments, we will assume that Warehouse_A represents an RDBMS system such as Oracle or MySql, which is actually used for the code sample. Warehouse_B is represented with Microsoft SQL Server. Warehouse_C is a legacy system that has to be accessed in case the first two ShipGoods requests fail. Individual warehousing systems are accessed through the Data Access Layer (DAL) that will be developed in both the Java and .NET applications. First is a look at how to build the DAL from a retailer system implemented in Java and then how to build the same layer with .NET technologies.
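The warehouse-by-warehouse sourcing logic described above can be sketched independently of the transport and data access details. The class and interface names below are hypothetical, not taken from the WS-I sample code: each warehouse is tried in order until one can fulfill the ShipGoods request.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the Source Goods flow: warehouses are tried in
// order (RDBMS-backed A, SQL Server-backed B, legacy C) until one can
// ship the requested quantity.
public class SourceGoods {

    // Stands in for a Data Access Layer call against one warehouse.
    record Warehouse(String name, int stock) {
        boolean shipGoods(int qty) { return stock >= qty; }
    }

    static Optional<String> source(List<Warehouse> warehouses, int qty) {
        for (Warehouse w : warehouses) {
            if (w.shipGoods(qty)) return Optional.of(w.name());
        }
        return Optional.empty(); // order could not be fulfilled
    }

    public static void main(String[] args) {
        List<Warehouse> warehouses = List.of(
            new Warehouse("Warehouse_A", 0),   // out of stock
            new Warehouse("Warehouse_B", 2),   // too few items
            new Warehouse("Warehouse_C", 10)); // legacy system fulfills it
        System.out.println(source(warehouses, 3).orElse("unfulfilled"));
    }
}
```

In the chapter's scenario, each shipGoods call would go through the Data Access Layer to the warehouse's own back end; only the ordering-and-fallback loop is shown here.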




Java EE and .NET Interoperability: Integration Strategies, Patterns, and Best Practices
ISBN: 0131472232
Pages: 170
