Fetching Techniques

Team-Fly

Some of the data access code I've seen in the past 25 years seems quite fetching—other code is downright ugly (sorry 'bout that). But we do need to touch on techniques that, at least on the surface, seem like good ways to quickly populate and offload Recordset objects. For example, fetching Recordset data row-by-row often seems expensive and downright unnecessary. However, offloading rows into a Variant array or delimited string can also be costly. As a means to pass data from tier to tier, the Variant array can be especially costly. We'll discuss this more completely in Chapter 8, but consider that Variant arrays transmit their data in padded Unicode data structures. These behemoth arrays take considerably longer (almost three times as long) to move from tier to tier when compared to Recordsets.

As I see it, there are two parts to the fetching operation. (There are probably three but I can't think of a third.) First, fetching completes membership population. This, in turn, releases share locks on the server. This is good—good for scalable applications and for those other clients waiting to perform updates on the rows and pages (and perhaps tables) that you have share locked. Second, fetching can "offload" the Recordset to another control, to a file, to a stream, or to a named memory area. This is often necessary to display, transport, or persist the Recordset. But it's not always needed. Sometimes we can leave the data in the Recordset, make changes, and send the whole Recordset (or just the changes) to another layer.

Basically, there are many techniques used to populate and unload Recordsets:

  • Use the MoveNext method until EOF changes to True, or use the MoveLast method. While these techniques do not unload the Recordset, they do complete population.
  • Use the GetRows or GetString method. These methods both unload and populate in a single operation.
  • Assign the Recordset to a "complex"[1] bound control. The control completes population as you scroll.
  • Assign the Recordset to a "smart" unbound control, such as the MSHFlexGrid.
  • Let ADO self-populate.
  • Assign the ActiveConnection property to Nothing. ADO ensures Recordset population is complete before cutting off the connection.

Remember that ADO self-populates if left to its own devices. That is, after the first set of rows arrives and control returns to your application (whether or not you're running asynchronously), ADO continues to populate your Recordset behind the scenes on its own thread.

Note 

Recordset population should never be a problem, as long as you don't fetch any more than a few CacheSize rows (a couple of hundred). Any more than that and you have to start worrying about membership population to free server-side locks, memory consumed on the client, and whether your client will run out of patience with you and your application.

ADO begins (and tries to end the population process) by prefetching a set of rows before returning control to the client. By default, this initial set contains 50 rows. You can control the number of records in the initial fetch by setting the "Initial Fetch Size" property in the ADO Recordset Properties collection.

Displaying Rows with the MSHFlexGrid Control

One of the easiest and fastest ways to get the Recordset values in front of the user is to set the MSHFlexGrid control's Recordset property. I've been using this approach for some time now, and it seems to work fine—except in one situation. When using server-side cursors, you cannot inspect the Recordset object's EOF (or almost any other) property[2] before assigning the Recordset to the grid. If you do, you'll get a 30023 error from the grid. (Sigh.) I submitted a bug report on this issue, but they aren't going to fix it. You'll have to switch to client-side cursors for this to work—or don't touch the EOF property.

After the data is displayed in the grid, as shown in Figure 7-1, you'll notice that the headings are filled in for you, but the column widths will usually need to be adjusted to match your Recordset data. To fix this, I usually loop through the grid when it's used for the first time and assign appropriate widths. After you set the ColWidth property for each column, setting the Recordset property has no further effect on the column width settings. I use the TextWidth function in Visual Basic to return the correct number of twips to assign to the ColWidth property. This function accepts a string whose width is translated into twips. I use a variable-length string of M characters because the M character is a maximum width character, and when using proportional fonts, this can make a big difference. For example, the number of twips is far less if you use a string of period or space characters than it is when using a wider character. Using a string of M characters makes it easier to estimate the number of twips needed for an appropriate number of characters in each field.

click to expand
Figure 7-1: Set the MSHFlexGrid1.Recordset to your Recordset to display the rowset.

The Choose function in the following example returns an appropriate number of M characters based on the width of each Field in the Recordset.

 i = 0 With MSHFlexGrid1     .Row = 0     For Each fld In rs.Fields         .ColWidth(i) = TextWidth(String(Choose(i + 1, 12, 7, 10, 7, 80), "M"))         i = i + 1     Next fld End With 

Displaying Rows Using the Immediate Window

Another way of seeing the Recordset data while debugging the application is to dump the rows to the Immediate window, as shown in Figure 7-2:

click to expand
Figure 7-2: Use Debug.Print to display the Recordset field names and values in the Immediate window.

I use the following routine to show the Field names followed by individual row values. While this is pretty intuitive, it might be handy to dump the Recordset while debugging.

 For Each fld In rs.Fields     Debug.Print fld.Name, Next fld Debug.Print Do Until rs.EOF     For Each fld In rs.Fields         Debug.Print fld.Value,     Next fld     Debug.Print     rs.MoveNext Loop 

Asynchronous Fetching

If you use the adAsyncFetch option as an Options argument on the Open method, ADO continues to fetch rows behind the scenes after synchronously fetching the initial set. This wasn't implemented in ADO until version 2.5, despite being an acceptable Options argument for some time. However, if you execute the MoveLast method on a scrollable cursor, you disable the asynchronous operation—your code blocks while ADO dutifully fetches the rowset.

IMHO 

The Move methods (all of them) should include an option to support asynchronous operations, to prevent blocking no matter what. Perhaps someday.

It is apparent that server-side cursors were never intended for asynchronous Recordsets—specifically adAsyncFetchNonBlocking and adAsyncFetch. However, server-side cursors do work (meaning they free up the client to do something else) with adAsyncExecute and a Static cursor type. However, they still block, because you cannot reference the data in the Recordset until the cursor is fully populated.

In ADO, you can get an asynchronous operation through the ADO Connection or Command objects. To enable this option, use one of the following enumerations with the Open or Execute methods:

  • adAsyncExecute creates a second thread and begins executing the command on that thread. When the provider finishes executing the command, ADO uses notification (it fires an event) to inform your code that the query has completed executing.
  • adAsyncFetch, when using client cursors, fetches the results on a second thread in the background after the execution of the command completes. A notification occurs (an event fires) when the entire result set has been fetched.

The event handlers are activated if you declare the ADO Connection and Recordset variables using the WithEvents option, as shown in the following code:

 Dim WithEvents Cn as ADODB.Connection Dim WithEvents Rs as ADODB.Recordset 

Once you declare the Connection object and Recordset object using WithEvents, Visual Basic adds the objects to the object drop-down list in the IDE (as shown in Figure 7-3) and exposes the event handler prototypes. The asynchronous operations fire the ConnectComplete and ExecuteComplete events on the Connection object when the asynchronous operation is done.

click to expand
Figure 7-3: The Visual Basic IDE exposes the Recordset Events.

When you request events on the Recordset object, asynchronous operations fire the following events:

  • FetchProgress: This method is called periodically during a lengthy asynchronous operation to report how many more rows have currently been retrieved (fetched) into the Recordset.

  • FetchComplete: This method is called after all the records in a lengthy asynchronous operation have been retrieved (fetched) into the Recordset.

When you haven't set up event handlers, use the State property to indicate completion of asynchronous operations—check the adStateExecuting and adStateFetching State property bit flags to determine whether the asynchronous operations are still progressing.

Do Until RS.EOF Loops

Quite a few programs use Do Until or Do While loops to fetch and offload the data rows until RS.EOF returns True. This is not one of the 7 deadly computer sins (a topic for another time), but it is one of the top 20 sins. Sometimes you don't have a choice, but consider what's going on in the loop:

  • Are you moving the first row to a set of controls on the form and waiting for the user to press Next so that you can execute MoveNext? This is not a good idea—at least not until you have completed population. If you can, execute a MoveLast and then a MoveFirst[3] after opening the Recordset. This completes population. If this seems to take a long time, you've fetched too many rows.
  • Are you testing each row for a specific value and on selected (or each) rowperforming some sort of update operation? Again, this is not a good idea, especially if you are running against SQL Server, Oracle, or any intelligent backend. These systems can filter and process sets of rows much faster than your client can. Just read up on stored procedures.
  • Are you simply dumping rows to a delimited string, XML structure, or Variant array? This is a waste of time, as there are built-in functions to do this for you.
  • Are you dumping the Recordset to a grid control record-by-record? Again, this is not necessary, because the MSHFlexGrid control and many other third-party controls support Recordset properties that can be set to your Recordset object. Even if they don't support a Recordset property, they might support a Clip property that can accept a delimited string as constructed by the GetString method (which is discussed next).

Fetching Recordsets with GetRows and GetString

ADO supports two methods for offloading the entire contents of, or a subset of, the Recordset to either a Variant array or a delimited string. When I first started working with RDO (which supports similar functionality), these methods seemed very cool—fast, easy, and powerful. However, the bloom is off the rose with these methods, as far as I'm concerned, now that I know their impact and how they were implemented.

GetRows extracts some or all of the Recordset rows and sends the rows to a Variant array. The following code uses the default settings to extract the entire Recordset:

 vaArray = rs.GetRows 

In much the same way, the GetString method extracts some or all of the Recordset rows, but GetString constructs a "pseudo" delimited string. I say pseudo because IMHO a delimited string includes quotes around the string elements of the string; GetString, however, does not put quotes around the strings—it simply separates them with the field delimiter (such as tabs or commas). This makes it harder to deal with the delimited string when it arrives. For example, you'll find it tough to construct correct delimited strings from fields that contain quotes. You can specify the interfield delimiter and the interrecord delimiter, but if you want to pass the delimited string to a control that supports a Clip property, you can leave the syntax as is—the default settings are fine.

 strDelimited = rs.GetString 

In RDO, you had to provide a very large number (well beyond the number of rows expected) to extract all of the rows from your Recordset using the GetRows and GetClipString methods; but by default, ADO extracts all rows. You can fetch a few rows at a time with either method by using the Rows argument. However, only the GetRows method lets you select a starting position in the Recordset using the Start argument.

Here are some additional examples. First, we create a Recordset with three Fields. Then, we move down five rows and capture a bookmark. And then we reposition to the top of the Recordset.

 rs.Open "select title, year_published, ISBN from titles where title like 'H%'", cn rs.Move 5 vaBookmark = rs.Bookmark rs.MoveFirst 

In the next part of the code, we dump all fields from the 100+ rows into the array. If you use −1 for the Rows argument, ADO fetches all remaining rows, starting at the current record pointer.

 vaArray = rs.GetRows rs.MoveFirst 

In this next case, we start at the first Recordset row and extract 10 rows, but just the Title field.

 vaArray = rs.GetRows(10, , "title") rs.MoveFirst 

The Start argument is not a starting row number, but it can tell ADO to start at various points in the Recordset, assuming the Recordset supports bookmarks. The Start argument can take the following values in addition to a normal Bookmark value:

  • adBookmarkCurrent—start at the current record.
  • adBookmarkFirst—start at the first record.
  • adBookmarkLast—start at the last record, thus returning only one record.
  • A bookmark Reposition to a previously saved bookmark.

This next code example extracts five rows starting at the bookmarked row (which is row 5), and dumps just the Year_Published field.

 vaArray = rs.GetRows(5, vaBookmark, "Year_Published") rs.MoveFirst 

The next line extracts just the ISBN and Title fields from all rows.

 vaArray = rs.GetRows(Fields:=Array("ISBN", "Title")) 

To reposition the CRP to the beginning of the Recordset, I used the MoveFirst method before fetching the rows. However, you don't have to do this, and if the Recordset is opened as forward-only, you won't be able to—at least not without rerunning the query.

GetRows does not "remove" rows from the Recordset, nor is the method affected by the Sort property, but it is affected by the current Filter property setting. That is, if you set the Filter property before using GetRows (or GetString), you get just the rows in the Recordset that satisfy the Filter criteria.

Using the RecordCount Property

If you want to know how many Records met the criteria in your query's WHERE clause, you can check the RecordCount property. However, RecordCount is not implemented for the default Recordset (RO/FO) or for any forward-only cursor—it always returns −1. If RecordCount does not seem to work on your scrollable cursor, try using MoveLast to force population, but be prepared for a wait while all rows are fetched. Remember, you can also examine the array boundaries of the Variant array constructed with the GetRows method if you want to return a valid row count.

Tip 

In some cases, ADO executes the MoveLast method to fully populate the cursor when you reference the RecordCount property. This means that if you open the View Locals window, the RecordCount property is referenced and ADO might execute the MoveLast method. If you aren't ready for a long wait while the Recordset is being populated, don't check the RecordCount property.

Switching to a scrollable cursor just to get the RecordCount does not make sense, because of the considerable overhead imposed by ADO to manage scrolling. On the other hand, some folks recommend adding a SELECT COUNT(*) to the query to return the number of qualifying rows. However, this makes the server work twice as hard on the query—once to get the count, and again to fetch the rows. COUNT(*) is not necessarily accurate anyway, because the row count can change between queries. If you surround the two queries (the COUNT(*) and the row-returning query) in a transaction, you prevent changes to the database while you're fetching, but this imposes additional scalability problems. If you use adUseClient for the CursorLocation, ADO creates a Static cursor, which supports the RecordCount property without problems.

If you use Return @@ROWCOUNT or return the @@ROWCOUNT in the Return status or OUTPUT parameter, these also require that the cursor be populated before the parameters are returned. That is, ADO needs to have read the last row from the server. If there are not very many rows in the cursor (50 or less), ADO will have completed population by the time you get control and you'll be able to pick off the OUTPUT parameters or Return status (an integer). However, if there are more than CacheSize rows to return, you have to wait at least until population is complete before accessing the OUTPUT parameters or Return status. In any case, remember that the return status and OUTPUT parameters are not available with some providers until you close the Recordset. This seems to throw another monkey wrench in this strategy.

Tip 

When using Forward-only cursors, ADO can run a tally of records visited, but only if you set the CacheSize property to something greater than 1. As you move forward through the Recordset, RecordCount reflects the current tally.When you reach the end, it equals the number of rows fetched.

Working with Variant Arrays

For those of you not familiar with Variant arrays, here are a couple of tips. First, the UBound function can be very helpful. It returns the dimensions of the array; but remember that Variant arrays are indexed by columns and rows, not by rows and columns as you might expect. When you use the UBound function to see how many rows and columns resulted from your GetRows method call, use this syntax:

 lngRowsInArray = UBbound(vaArray,2) intColumnsInArray = UBound(vaArray,1) 

After you know the dimensions of the Variant array, you can extract the data, like this:

 Sub DumpRows(lrs As Variant) Dim i As Integer, j As Integer For i = 0 To UBound(lrs, 2)     For j = 0 To UBound(lrs, 1)         Debug.Print lrs(j, i),        ' Note col, row     Next j     Debug.Print Next i Debug.Print String(25, "-") 

If you're not sure about how the array elements are numbered, you can simply check out the Locals Window in the Visual Basic IDE—it dumps the entire array for you, as shown in Figure 7-4. Notice how the array boundaries are shown. Figure 7-4 shows three columns (0, 1, and 2), and 1 row (0-based).

click to expand
Figure 7-4: Examining the Locals Window GetRows Variant Array

When using these techniques, remember that executing GetRows on an empty Recordset will result in a trappable error being raised.

 3021 ADODB.Recordset Either BOF or EOF is True, or the current record has been deleted. Requested operation requires a current record. 

In these cases, it is a good idea to assign Empty to the variant variable. So, whenever you are expecting data in a variant, check to see if it is Empty before referencing array elements.

[1]Complex bound controls handle multiple rows of a Recordset. For example, a Grid control is a complex bound control. Simple bound controls handle only one row, such as a TextBox control.

[2]David Sceppa and Rick Nasci fought for some time to get this "bug" fixed. The ADO folks sent back a long paragraph describing how this was a "by design" feature.

[3]If your code executes a MoveFirst just after the Recordset opens, you might trigger another execution of the query for no purpose.


Team-Fly


ADO Examples and Best Practices
Ado Examples and Best Practices
ISBN: 189311516X
EAN: 2147483647
Year: 2000
Pages: 106

Similar book on Amazon

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net