In this part of the chapter, we will discuss how to "prep" a querythat is, how to pave the way, create the right environment, and optimize the conditions so that the query will run well. We'll follow a query through all its phases: making the statement, getting metadata, setting result set flags, fetching, and using the getXXX methods . We know this is a lot of trouble for one little SELECT. For many situations you'd be better off following this adviceJust do it! Our painstaking approach is an illustration of all the possible optimizations at each stage, rather than an order that you must actually follow for your own queries. Once you've scanned through our illustration, though, you'll find many occasions to employ a subset of these steps. Query StatementWe begin our illustration with the step of writing an SQL statement. You can make a SELECT statement with Statement and execute it with executeQuery . Here's an example: Statement stmt1 = con.CreateStatement(); ResultSet rs1 = stmt1.executeQuery( "SELECT * FROM Table1"); ... Statement stmt2 = con.CreateStatement(); ResultSet rs2 = stmt2.executeQuery( "SELECT * FROM Table2"); Stop. Why did we make two stmt s? Are they independent? If stmt2 doesn't have to follow stmt1 , then you can set up a second thread and run it in parallel. The idea of "Use two threads" is applicable in practically any case; we'll just mention it once and leave repetitions to your imagination from now on. Note, however, that not all ODBC drivers are reentrant"Use two threads" is inappropriate if you're using a JDBC-ODBC bridge. If stmt2 does have to follow stmt1 , then you should not make stmt2 at all. Instead, just reuse stmt1 , like this: Statement stmt1 = con.CreateStatement(); ResultSet rs1 = stmt1.executeQuery( "SELECT * FROM Table1"); ... ResultSet rs2 = stmt1.executeQuery( "SELECT * FROM Table2"); We're more concerned about reusing objects than a Java programmer normally is for two reasons:
Query SyntaxUse consistent, standard, trick-free SQL syntax. You can, of course, use {escape} additions for portability. Some drivers try to cache recently used SQL statements, so it's handy if they can find the times that the syntax is duplicated . getBestRowIdentifierSeveral methods in the DatabaseMetaData class can be called "catalog functions" because each one has an exact analog among the ODBC catalog functions. The two most interesting methods are getIndexInfo , which is equivalent to the ODBC function SQLStatistics , and getBestRowIdentifier , which is equivalent to ODBC's SQLSpecialColumns . We discussed the importance of SQLSpecialColumns in Chapter 12, "ODBC," so we'll just add two remarks about the getBestRowIdentifier method here.
The general tips for all catalog functions are:
If you use getBestRowIdentifier in order to give the user a choice of columns to use for searching, you will probably have to call getIndexInfo and getColumns as well. But you can get most of the data using a single call if you know how the DBMS stores catalog metadata. Remember that the best row identifier is often the DBMS-specific ROWID pseudocolumn. This will not appear in a list created by getColumns . ResultSetMetaDataYou should retrieve information about a result set before you get the result set itself. You can do this with a prepared statement. Here's an example: PreparedStatement pstmt = con.preparedStatement( "SELECT * FROM Table1"); ResultSetMetaData rsmd = pstmt.getMetaData(); int i = rsmd.getColumnCount(); This example would be a failure if "preparing" and "executing" took the same amount of time. Luckily, most DBMSs will prepare only when you ask them to prepare. So this goes quickly. Once you have an rsmd and a column count, you can check each individual column to see whether there should be some change in the strategy. The important methods for this purpose are:
The decisions you can make with the type, precision, and nullable information include:
Bottom line: There is still time to abort or revise the query at this stage. All that's happened so far is a prepare. If you decide to continue, keep the rsmd . There are other rsmd methods that are useful for displays.
Query SettingsSome settings apply to stmt or pstmt objects. These settings affect the way that result sets work. In this section, we'll look at three of them: Scroll Type , Concurrency Type , and FetchSize . Scroll TypeThe Scroll Type setting has three options: con.createStatement("SELECT * FROM Table1", ResultSet.TYPE_FORWARD_ONLY); con.createStatement("SELECT * FROM Table1", ResultSet.TYPE_SCROLL_INSENSITIVE); con.createStatement("SELECT * FROM Table1", ResultSet.TYPE_SCROLL_SENSITIVE); The default scroll type is TYPE_FORWARD_ONLY, and that's definitely the way to leave it if you can tolerate going through the result set from start to end, without skipping or retreating. The other two scroll types are scroll cursors and are both expensive and hard to implement. You will find that scroll cursors are troublesome with some DBMSs. INSENSITIVE cursors are for high concurrency situations, and their performance is good if the transactions are read-only. What happens behind the scenes is that somebody (probably the JDBC driver) either creates a temporary local table on the server, or fetches the entire result set into a local cache off the server. The effect is that the initial query execution, or possibly the first fetch after the query execution, will be slow. On the other hand, all subsequent fetches will be blazingly fast because it's all coming out of a nearby cache. Don't use an insensitive cursor for a single-row query, because the driver will waste time copying to a cache when it could send directly to you. And don't use an insensitive cursor if the query contains LOBs, because caches lose their value if they overflow the available memory. SENSITIVE cursors are for low concurrency situations when you want to see whether any other user changes a row while you have it. What happens behind the scenes is that the driver retrieves a list of all the best row identifier values (probably ROWIDs or primary keys), and stores them in cache. When the application asks for a fetch of, for example, Row #14, the driver looks up the value associated with that rowsay it's ROWID #123456. Then the driver selects a single row from the server with: ... WHERE ROWID = 123456 The server returns that row, and the driver passes it on to the application. Thus you can watch a screen change dynamically as other users change the rows you selected. You should not use a sensitive cursor if a call to getBestRowIdentifier tells you that there is no fast, persistent, unique column lookup possible for the table from which you're selecting. Concurrency typeThe Concurrency Type setting has two options: con.createStatement("SELECT * FROM Table1", ResultSet.CONCUR_READ_ONLY); con.createStatement("SELECT * FROM Table1", ResultSet.CONCUR_UPDATABLE); The default concurrency type is CONCUR_READ_ONLY. The CONCUR_UPDATABLE concurrency type should be used only if you intend to allow updates of rows that have been fetched , because indicating that a result set is "updatable" will affect the DBMS's locking strategy. Specifically, a DBMS that supports update locks will use them; see Chapter 15, "Locks." FetchSizeThe FetchSize setting has one option: stmt.setFetchSize(n); The overhead for a network message is high, so there's a gain if somebody fetches, say, 25 rows at a time, rather than 1 row at a time. The "somebody" in this case has to be the driver and the DBMS, acting in concert. The way to change the number of rows fetched at once is: boolean b = stmt.setFetchSize(25); // or rs.setFetchSize(25) There are two fetch sizes: the default FetchSize that the DBMS starts off with, and the DBMS's maximum FetchSize . If you ask for more than the maximum, then boolean b becomes false . The default FetchSize for each of the Big Eight is one row; the maximum FetchSize is 999 rows. The standard advice is that you should not call setFetchSize because the DBMS must have a reason for setting the FetchSize default where it is. And that is true. However, at the time that the DBMS is installed, there is no way of knowing how large your rows are. The important factor that affects speed is the number of bytes retrieved, not the number of rows. So if your result set has very small or very large rows, then you have a reason to change the number of rows per fetch. A block fetch is a fetch that gets multiple rows at once. When using block fetches, keep this in mind: If you fetch 25 rows at a time and some other user changes row #24 while you're working, row #24 is out of date by the time your application sees it. You can use the refreshRow method to compensate for this. The Bottom Line: Query PreppingYou can make a SELECT statement with Statement and execute it with execute Query . Using two threads is applicable in practically all cases. But not all ODBC drivers are reentrantusing two threads is inappropriate when you're using a JDBC-ODBC bridge. If a statement stmt2 has to follow another statement stmt1 then don't make stmt2 at all. Just reuse stmt1 . It's important to reuse objects because (a) garbage collection won't do a complete cleanup and (b) reusing an object causes an automatic close of the existing result set. A close must happen, otherwise, memory leaks are possible. Use consistent, standard, trick-free SQL syntax. Some drivers try to cache recently used SQL statements, so it's handy if they can find the times that the syntax is duplicated. The general tips for all catalog functions are (a) avoid search patterns or blank string arguments, (b) cache the results, and (c) make sure you need to know the information before using a catalog function. Remember that the best row identifier is often the DBMS-specific ROWID pseudocolumn. This will not appear in a list created by getColumns . Get information about a result set before you get the result set itself, using ResultSetMetaData . The default scroll type is TYPE_FORWARD_ONLY. Leave the setting there unless you can't tolerate going through the result set from start to end, without skipping or retreating. Scroll cursors are both expensive and hard to implement. You will find that scroll cursors are troublesome with some DBMSs. Use insensitive cursors for high concurrency situations. Their performance is especially good for read-only transactions. Don't use an insensitive cursor for a single-row query. Don't use an insensitive cursor if the query contains LOBs. Use sensitive cursors for low concurrency situations when you want to see whether any other user changes a row while you have it. Don't use a sensitive cursor if a call to getBestRowIdentifier tells you that there is no fast, persistent, unique column lookup possible for the table from which you're selecting. The default concurrency type is CONCUR_READ_ONLY. Leave the setting there unless you intend to allow updates of rows that have been fetched. Saying that a result set is updatable will affect the DBMS's locking strategy. Specifically, a DBMS that supports update locks will use them. Fetch multiple rows at once (a block fetch) whenever possible. If your result set has very small or very large rows, you have a reason to change the number of rows per block fetch. If you fetch 25 rows at a time and some other user changes row #24 while you're working, row #24 is out of date by the time your application sees it. Use the refreshRow method to compensate for this. |