Reducing the amount of memory that your Access database and its code occupy may result in both it and other Windows applications running faster. But, if fast execution is a real coding priority, there are other methods you can consider:
Use constants
Use specific object types (early binding)
Use variables, not properties
Avoid slow structures
Beware of IIf
Use integer arithmetic where possible
Use inline code
Use DoEvents judiciously
Use the Requery method, not the Requery action
Use Me
Speed up database operations
We'll look at these in more detail and provide some code samples that prove the point. The code samples have been included so that you can gauge the impact of these techniques for yourself. After all, computers differ greatly in terms of the amount of RAM, processor speed, cache size, disk speed, and so on. These differences are reflected in the performance of VBA on those machines. Don't take my word for it; try out these examples for yourself!
Looking at the results below, you might be forgiven for wondering whether it is worth bothering with some of the improvements. After all, if it only takes three milliseconds to perform an operation, why bother going to the effort of optimizing your code so that it only takes one millisecond? Who's going to notice? Well, although it's fair to say that the difference may not be noticeable in a single line of code, it might be if that code is executed many times. For example, you may be looping through all the controls on a form to check for a certain condition. Or you might be looping through all the records in a recordset. Get into the habit of coding efficiently all the time - then it won't be a struggle to do it when it really matters.
The simplest method for timing how long a piece of code takes to execute is to use the Timer function. For example, we could use the following sample of code to time how long it takes for a For...Next structure to loop through 100,000 values.
Sub TimeLoop()
    Dim sngStart As Single
    Dim sngEnd As Single
    Dim lngLoop As Long

    sngStart = Timer
    For lngLoop = 0 To 100000
    Next lngLoop
    sngEnd = Timer
    Debug.Print "It took " & Format$(sngEnd - sngStart, "0.000") & " secs."
End Sub
The Timer function returns a Single indicating the number of seconds elapsed since midnight. In the example above, we save the value of Timer in the variable sngStart before the loop commences:
sngStart = Timer
And then save the value of Timer in the variable sngEnd when the loop has finished:
sngEnd = Timer
By subtracting one from the other, we can determine how long it took for the loop to execute:
Debug.Print "It took " & Format$(sngEnd - sngStart, "0.000") & " secs."
Now you could include this code in all of the procedures that you want to time. However, if you will be doing a lot of code timings, you might want to set up a test harness like this:
Open a standard code module and type in the procedure TestPerformance listed below:
Sub TestPerformance(lngIterations As Long, intRuns As Integer)
    Dim intLoop As Integer
    Dim lngLoop As Long
    Dim sngStart As Single
    Dim sngCorrection As Single
    Dim sngTime As Single
    Dim sngTimeTotal As Single

    sngStart = Timer
    For lngLoop = 1 To lngIterations
        EmptyRoutine        '<--- This is an empty procedure
    Next
    sngCorrection = Timer - sngStart

    For intLoop = 1 To intRuns
        sngStart = Timer
        For lngLoop = 1 To lngIterations
            TestProc        '<--- This is the procedure we are testing
        Next
        sngTime = Timer - sngStart - sngCorrection
        Debug.Print "Run " & intLoop & ": " & sngTime & " seconds"
        sngTimeTotal = sngTimeTotal + sngTime
    Next

    Debug.Print
    Debug.Print "Correction : " & Round(sngCorrection, 3) & " seconds"
    Debug.Print
    Debug.Print "AverageTime: " & Round(sngTimeTotal / intRuns, 3) & " seconds"
End Sub
Also, create two procedures called EmptyRoutine and TestProc, looking like this:
Sub EmptyRoutine()
End Sub

Sub TestProc()
    Dim sngTime As Single
    sngTime = Timer
    Do While sngTime + 5 > Timer
        DoEvents
    Loop
End Sub
Now, open the Immediate window, type the following to run the TestPerformance procedure, and hit Enter:

TestPerformance 10, 5
The results of running the TestProc procedure 10 times in 5 different test runs will (eventually) be shown in the Immediate window.
How It Works
The TestPerformance procedure accepts two arguments. The first of these is the number of times that the procedure we want to test (TestProc) will be executed in each run; the second argument indicates the number of runs that we want to time.
The important part of this procedure is a loop that runs the code to be tested a certain number of times:
For lngLoop = 1 To lngIterations
    TestProc
Next
In this case, TestProc is the name of the procedure that we wish to test. When you want to time a different procedure, simply substitute its name at this point of the TestPerformance procedure.
The difference between the time at the start and end of this loop is then displayed. However, the loop itself has its own overhead. For this reason, we also use another loop to monitor the overhead that is incurred by executing the loop.
sngStart = Timer
For lngLoop = 1 To lngIterations
    EmptyRoutine
Next
sngCorrection = Timer - sngStart
EmptyRoutine is simply a procedure that contains no code.
The overhead that is recorded is then stored in a variable called sngCorrection and is used to correct the eventual timings that the function displays after each of the test runs:
sngTime = Timer - sngStart - sngCorrection
Debug.Print "Run " & intLoop & ": " & sngTime & " seconds"
Don't forget, if you do not want to build the TestPerformance procedure yourself, you can find the code for it - and everything else in this chapter - in the Performance.mdb database.
You will notice that the code we test is in a For...Next loop which iterates many times. We need to do this because, although the Timer function is very accurate, each procedure executes extremely quickly, and so executing many iterations increases the accuracy of our measurement.
The sample timings shown in this chapter were produced on a 2GHz Pentium machine with a 70GB IDE disk (7.2ms access) and 512MB RAM. These timings can also be found in the Performance.xls spreadsheet accompanying this book.
Important: If you want to test these code samples for yourself, just substitute the name of the procedure that you want to time in place of TestProc in the TestPerformance procedure.
When testing the execution speed of your code, you should make the test conditions as realistic as possible. In other words, test the application on the same specification of computer that will be used in production, load up any other applications that the users will be running, use realistic volumes of data, and (if possible) have realistic numbers of users accessing the shared data in the database. Also, run your tests three or four times to see if performance varies from run to run. You might find that the first run of the following examples gives varying results, as the code and any data have to be loaded into memory before they can be run. On subsequent runs, the code and data should already be in memory, so the speed will improve and the results will be more consistent. Bear in mind that you are looking for relative improvements in timings when testing a block of code; there is usually no single target figure to aim for, just an improvement over previous recordings.
Anyway, that's enough about timing our code. Let's get on with looking at methods for improving our code's performance.
If you use values in your code that do not change, then you should consider assigning them to constants rather than to variables. As well as the overhead incurred by dimensioning the variable, whenever a variable is encountered in a line of VBA code, Access needs to access the memory location that contains the variable in order to determine its value. In contrast, the value of a constant is resolved and written into the code at compile time. As a result, reading values from constants generally works out slightly quicker than reading values from variables. The following code illustrates this: the first procedure uses constants only, and should be quicker; the second uses string variables.
Like variables, constants can be declared at the procedure or at the module level and can have either Public or Private scope.
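For example, declarations like the following illustrate the three scopes (the names here are invented purely for illustration - they are not part of the chapter's sample database):

'Module-level constants, declared at the top of a standard module
Public Const conAppTitle = "Ice Cream Shop"    'Visible throughout the project
Private Const conMaxRetries = 3                'Visible in this module only

Sub ShowTitle()
    'Procedure-level constant, visible only within this procedure
    Const conPrompt = "Welcome to "
    MsgBox conPrompt & conAppTitle
End Sub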
Sub ConstantsNotVariables()
    Const sDummy = "This is a string which is saved as a constant " & _
                   "rather than as a variable. This means that it is " & _
                   "resolved and written into the code at compile time."
    Const sDummy2 = "This is a string which is saved as a constant " & _
                    "rather than as a variable. This means that it is " & _
                    "resolved and written into the code at compile time."
    If sDummy = sDummy2 Then
    End If
End Sub

Sub VariablesNotConstants()
    Dim sDummy As String
    Dim sDummy2 As String

    sDummy = "This is a string which is saved as a variable " & _
             "rather than as a constant. This means that it has to be " & _
             "retrieved every time it needs to be used."
    sDummy2 = "This is a string which is saved as a variable " & _
              "rather than as a constant. This means that it has to be " & _
              "retrieved every time it needs to be used."
    If sDummy = sDummy2 Then
    End If
End Sub
To test this out, add the previous procedures to a module and then modify the TestPerformance procedure we created earlier to call the VariablesNotConstants procedure. Execute it from the Immediate window (View | Immediate Window) by typing TestPerformance 30000, 5 . Make a note of the timings and the average. Now modify TestPerformance to run the ConstantsNotVariables procedure and run it again.
The results below confirm our beliefs. When run through 30,000 iterations of each procedure, using constants was 1.6 times faster than using variables.
Procedure | Iterations | Elapsed Time | Improvement |
---|---|---|---|
VariablesNotConstants | 30,000 | 93 ms | |
ConstantsNotVariables | 30,000 | 58 ms | 1.6 times |
Although the use of 'loose' data types such as Variant and Object can make it easier to write generic code, they are usually slower to use than more specific data types.
Previously, we looked at the difference between using early and late binding. The two procedures that we looked at were GetObjectArea (which used late binding), and GetShapeArea (which used early binding).
Function GetObjectArea(obj As Object, _
                       Side1 As Double, _
                       Side2 As Double, _
                       Side3 As Double) As Double
    obj.Construct Side1, Side2, Side3
    GetObjectArea = obj.Area
End Function
Although this code will run, it is not very efficient. Because the object variable obj has been declared As Object , Access does not know what type of object it is. This in turn means that whenever we try to inspect or set a property value, or invoke a method against that object at runtime, Access must first check whether or not that property or method is appropriate for the object.
Function GetShapeArea(shp As Shape, _
                      Side1 As Double, _
                      Side2 As Double, _
                      Side3 As Double) As Double
    shp.Construct Side1, Side2, Side3
    GetShapeArea = shp.Area
End Function
The second procedure uses what is known as early binding. This time around, Access knows what type of object shp is. This means that it can determine at compile time which properties and methods are appropriate to it. Because Access only has to perform this check once, and it does it prior to runtime, the difference in execution speed at run time between code using the two methods can be very significant.
A secondary advantage of early binding is that because Access can determine which properties and methods are appropriate at compile time, any errors in your code which result from misspelling property or method names are caught at compile time rather than appearing as run-time errors. In addition, without early binding, you won't see the auto-complete features of VBA, such as Auto List Members and Auto Quick Info.
The procedures we use to test these two techniques are shown below:
Sub UseLateBinding()
    Dim tri As Triangle
    Set tri = New Triangle
    GetObjectArea tri, 5, 6, 7
End Sub

Sub UseEarlyBinding()
    Dim shp As shpTriangle
    Set shp = New shpTriangle
    GetShapeArea shp, 5, 6, 7
End Sub

Procedure | Iterations | Elapsed Time | Improvement |
---|---|---|---|
UseLateBinding | 30,000 | 2053 ms | |
UseEarlyBinding | 30,000 | 406 ms | 5.1 times |
You can realize similar performance benefits if you use variables to refer to forms, controls, and properties. If you are going to refer to a form, report, or control more than once in a procedure, you should create an object variable for the object and then refer to that instead of the object itself.
The following procedure opens the Switchboard form and determines whether the name of various command buttons on that form is "Blobby". If any of the buttons does have that name, then code execution will stop on the line that performs the comparison for that button.
The Assert method of the Debug object is useful for debugging and causes code execution to stop when a given expression (like frm.cmdExit.Name <> "Blobby") evaluates to False.
Sub UseFormVariables()
    Dim frm As Form_frmSwitchboard
    Set frm = Forms("frmSwitchboard")
    Debug.Assert frm.cmdExit.Name <> "Blobby"
    Debug.Assert frm.cmdIceCreams.Name <> "Blobby"
    Debug.Assert frm.cmdIngredients.Name <> "Blobby"
    Debug.Assert frm.cmdMaintenance.Name <> "Blobby"
    Debug.Assert frm.cmdReports.Name <> "Blobby"
    Debug.Assert frm.cmdSuppliers.Name <> "Blobby"
    Set frm = Nothing
End Sub
Alternatively, we could have rewritten this code to take advantage of the With structure.
Sub UseFormVariablesAndWith()
    Dim frm As Form_frmSwitchboard
    Set frm = Forms("frmSwitchboard")
    With frm
        Debug.Assert .cmdExit.Name <> "Blobby"
        Debug.Assert .cmdIceCreams.Name <> "Blobby"
        Debug.Assert .cmdIngredients.Name <> "Blobby"
        Debug.Assert .cmdMaintenance.Name <> "Blobby"
        Debug.Assert .cmdReports.Name <> "Blobby"
        Debug.Assert .cmdSuppliers.Name <> "Blobby"
    End With
    Set frm = Nothing
End Sub
Code written using either of these two syntaxes will execute considerably faster than code that uses the long-hand syntax.
Sub UseLongHandSyntax()
    Debug.Assert Forms!frmSwitchboard!cmdExit.Name <> "Blobby"
    Debug.Assert Forms!frmSwitchboard!cmdIceCreams.Name <> "Blobby"
    Debug.Assert Forms!frmSwitchboard!cmdIngredients.Name <> "Blobby"
    Debug.Assert Forms!frmSwitchboard!cmdMaintenance.Name <> "Blobby"
    Debug.Assert Forms!frmSwitchboard!cmdReports.Name <> "Blobby"
    Debug.Assert Forms!frmSwitchboard!cmdSuppliers.Name <> "Blobby"
End Sub
In this situation the With...End With syntax is only fractionally faster than the first method which simply uses object variables. This is because of the overhead involved in setting up the With structure. However, you would find that if you were to add more and more references to the specified object between the With and End With statements, then this structure would become even more efficient.
Procedure | Iterations | Elapsed Time | Improvement |
---|---|---|---|
UseLongHandSyntax | 30,000 | 5582 ms | |
UseFormVariables | 30,000 | 3687 ms | 1.5 times |
UseFormVariablesAndWith | 30,000 | 3634 ms | 1.015 times |
Another way to make your VBA code run faster is to avoid using slow structures. What does this mean? Well, most languages offer the programmer several different methods of performing a single task. If real execution speed is a coding priority, you should test each of these different methods for speed and decide which to use accordingly. For example, in Chapter 13, we discovered that there are two methods of determining the area of a triangle, given the lengths of all three sides.
Function Area1(dblSide1 As Double, _
               dblSide2 As Double, _
               dblSide3 As Double) As Double
    Dim dblAngle1 As Double
    dblAngle1 = ((dblSide2 ^ 2) + (dblSide3 ^ 2) - (dblSide1 ^ 2)) _
                / (2 * dblSide2 * dblSide3)
    dblAngle1 = Atn(-dblAngle1 / Sqr(-dblAngle1 * dblAngle1 + 1)) + 2 * Atn(1)
    Area1 = dblSide2 * dblSide3 * Sin(dblAngle1) * 0.5
End Function
Function Area2(dblSide1 As Double, _
               dblSide2 As Double, _
               dblSide3 As Double) As Double
    Dim dblSemiPerim As Double
    dblSemiPerim = (dblSide1 + dblSide2 + dblSide3) / 2
    Area2 = Sqr((dblSemiPerim) * _
                (dblSemiPerim - dblSide1) * _
                (dblSemiPerim - dblSide2) * _
                (dblSemiPerim - dblSide3))
End Function
Both functions return identical results, but the first of these executes noticeably faster. In this situation, there would seem to be little reason not to choose the first method. However, in other situations you might find that the faster method conflicts with another of the project's coding priorities, so you may have to compromise. Even if this is the case, though, the time spent timing the code will not have been wasted, as you will be able to use this knowledge in future projects.
Procedure | Iterations | Elapsed Time | Improvement |
---|---|---|---|
Area2 (Heron's Formula) | 30,000 | 61 ms | |
Area1 (Trigonometrical Method) | 30,000 | 7 ms | 8.7 times |
Other examples of potentially slow structures are now described.
The Immediate If ( IIf ) function is often viewed as a quick and easy way to return one of two values depending on whether an expression evaluates to True . We looked at this function in Chapter 4. Its syntax is:
value = IIf(Expression, TruePart, FalsePart)
TruePart is returned if Expression is True , and FalsePart is returned if Expression is False . This is the same as writing:
If Expression Then
    value = TruePart
Else
    value = FalsePart
End If
However, the key difference between the two formats is that the IIf function will always evaluate both TruePart and FalsePart , whereas the normal If structure will only evaluate the part which is returned. To see the implications of this, consider these two portions of code:
Function IIfTest(lngNumber As Long)
    Dim lngRetVal As Long
    lngRetVal = IIf(lngNumber = 5, 10, _
        DMin("Quantity", "tblSales", "AmountPaid > 180"))
End Function

Function IfTest(lngNumber As Long)
    Dim lngRetVal As Long
    If lngNumber = 5 Then
        lngRetVal = 10
    Else
        lngRetVal = DMin("Quantity", "tblSales", "AmountPaid > 180")
    End If
End Function
Both of these procedures do the same thing. They evaluate the variable lngNumber and, if it is equal to 5, the procedure sets the value of lngRetVal to 10. If it isn't, the procedure sets the value of lngRetVal to a value that it looks up in the tblSales table.
The difference between the procedures is that the first one will always look up the record from tblSales whether it's required or not. So whenever these procedures are called with lngNumber equal to 5, the first one will be considerably slower.
Procedure | Iterations | Elapsed Time | Improvement |
---|---|---|---|
IIfTest (10) | 3,000 | 38366 ms | |
IfTest (10) | 3,000 | 38320 ms | 1.001 times |
IIfTest (5) | 3,000 | 37668 ms | |
IfTest (5) | 3,000 | 2 ms | 18834 times |
The speed with which arithmetic calculations are performed depends on the data type of the variables concerned and the type of operation being performed.
In general, however, Integer and Long variables are faster than Single and Double variables. These, in turn, are faster than Currency variables. Variant variables are considerably slower, with most operations taking typically twice as long as with other data types.
Although the difference in execution times for a single operation is very small, it will become more noticeable for repeated operations (such as within large loops).
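The text doesn't include a timed sample for this point, but a pair of procedures like the following (written for this discussion, not taken from the sample database) could be dropped into the TestPerformance harness to compare Variant and Long arithmetic for yourself:

Sub VariantArithmetic()
    Dim varLoop As Variant
    Dim varResult As Variant
    For varLoop = 1 To 1000
        varResult = varLoop * 2 + 1    'Every operation must inspect the Variant's subtype
    Next
End Sub

Sub LongArithmetic()
    Dim lngLoop As Long
    Dim lngResult As Long
    For lngLoop = 1 To 1000
        lngResult = lngLoop * 2 + 1    'Pure Long arithmetic - no type inspection needed
    Next
End Sub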
Another useful tip is to use integer division wherever possible. Integer division uses the backslash (\) operator rather than the forward slash (/) and always returns an integer, which is faster than the Double that normal division always returns.
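For instance (a quick sketch, not one of the chapter's timed samples):

Dim dblResult As Double
Dim lngResult As Long

dblResult = 7 / 2     'Normal division - returns 3.5 as a Double
lngResult = 7 \ 2     'Integer division - returns 3, with no floating-point work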
Earlier on, we noted that variables could be passed as arguments to procedures by reference or by value. When a variable is passed by reference (the default), the procedure that is called is passed a pointer to the memory location of the variable being passed. In contrast, when a variable is passed by value, a copy of the variable is made and is passed to the procedure. Although passing variables by value has its uses, it is fractionally slower than passing by reference.
Both of these methods are slower, however, than placing the code inline (within the body of the original procedure). The downside of inline code is that it is more difficult to maintain if you have the same code appearing inline in multiple procedures. In addition, having the same code appearing inline in multiple procedures will increase the memory footprint of your code. But if your chief coding priority is execution speed, you should seriously consider using in-line code, particularly if this code is frequently called or within a loop structure.
The following procedures can be used to illustrate the difference between the three methods described above:
Sub TestPassingByValue()
    Dim dbl1 As Double
    Dim dbl2 As Double
    dbl1 = 1234
    'Placing the argument in parentheses passes it by value
    dbl2 = FourthPower((dbl1))
End Sub

Sub TestPassingByReference()
    Dim dbl1 As Double
    Dim dbl2 As Double
    dbl1 = 1234
    'Passing by reference
    dbl2 = FourthPower(dbl1)
End Sub

Sub TestPassingInLine()
    Dim dbl1 As Double
    Dim dbl2 As Double
    dbl1 = 1234
    'Inline coding
    dbl2 = dbl1 ^ 4
End Sub

Function FourthPower(dblVal As Double)
    FourthPower = dblVal ^ 4
End Function

Procedure | Iterations | Elapsed Time | Improvement |
---|---|---|---|
TestPassingByValue | 30,000 | 57 ms | |
TestPassingByReference | 30,000 | 45 ms | 1.27 times |
TestPassingInLine | 30,000 | 15 ms | 3.8 times |
When a VBA procedure is running, it will act very selfishly and hog the Access limelight unless you tell it otherwise. For example, if you were to run the following portion of code, you would find that Access was locked up until the code finished running.
Sub NoDoEvents()
    Dim lngCounter As Long
    Dim i As Integer
    For lngCounter = 1 To 1000000
        i = Rnd * 12
    Next lngCounter
End Sub
This routine takes approximately 0.17 seconds to execute on my computer. But it is considered good etiquette (and common sense) to yield control to Windows every so often. For example, while your routine is running you may wish to cancel it, pause it for a moment, or do something else. If your routine ignored all your requests, you wouldn't be very happy. What you need is a way to allow other events to be processed while your routine runs.
This can be achieved with the DoEvents statement. This instructs Windows to process any messages or keystrokes that are currently queued. In the following portion of code, whenever the loop reaches the DoEvents statement, control passes to Windows which checks to see whether any other application has any messages or keystrokes waiting to be processed. It then passes control back to the procedure.
Sub AllDoEvents()
    Dim lngCounter As Long
    Dim i As Integer
    For lngCounter = 1 To 1000000
        i = Rnd * 12
        DoEvents
    Next lngCounter
End Sub
Although this is good practice, it does take up time. In fact, this version of the routine, which checks for events on every pass through the loop, takes over 13 minutes to run! If you want to use the DoEvents statement, do so sparingly. The portion of code shown above can be rewritten like this:
Sub SomeDoEvents()
    Dim lngCounter As Long
    Dim i As Integer
    For lngCounter = 1 To 1000000
        i = Rnd * 12
        If lngCounter Mod 50000 = 0 Then DoEvents
    Next lngCounter
End Sub
Now, control passes to Windows every 50,000 loops. This means 20 times in the third of a second or so that the loop now takes to finish, which leaves you with a well-behaved and yet fast bit of code. The DoEvents adds only about 0.16 seconds to the execution time of this code.
Procedure | Iterations | Elapsed Time | Difference |
---|---|---|---|
NoDoEvents | 1 x 1,000,000 | 169 ms | |
AllDoEvents | 1 x 1,000,000 | 801554 ms | +13.36 min |
SomeDoEvents | 1 x 1,000,000 | 328 ms | +0.159 sec |
Obviously, the number of DoEvents statements that you actually use will vary depending on the degree of interactivity that the procedure demands. For example, a procedure that runs for 10 minutes at the dead of night when everyone is in their beds and while no other applications are running will typically require less interactivity (and thus fewer DoEvents statements) than a procedure which updates the screen to display real-time stock prices to a user.
Whenever you yield control to Windows from within a procedure, you should always write your code in such a way that the procedure will not be executed again from a different part of your code before the first call returns. If it is (this is called reentrancy), your code will probably not work the way you intended it to, and the application may either hang or crash.
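One common defensive pattern (sketched here with invented names - it is not code from the sample database) is to guard the procedure with a Static flag so that a re-entrant call simply returns:

Sub ProcessRecords()
    Static fRunning As Boolean
    Dim lngCounter As Long

    If fRunning Then Exit Sub    'A re-entrant call bails out immediately
    fRunning = True

    For lngCounter = 1 To 1000000
        If lngCounter Mod 50000 = 0 Then DoEvents    'Control may re-enter here
    Next lngCounter

    fRunning = False
End Sub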
Another method of speeding up your procedures is to avoid using the Requery action to display up-to-date values in a form or control. Use the Requery method instead. This is quicker as it simply reruns the query behind the form or control instead of closing it, reopening it and then rerunning it, as the Requery action does. The actual measured difference will depend on the size and complexity of the form, and the underlying query.
DoCmd.Requery ctlText.Name    'This is slow
ctlText.Requery               'This is much quicker
When you use the Me keyword to refer to a form within an event procedure, Access only searches the local namespace (that is, the objects that are currently available in the open form) for the form. This means that the form is found more quickly than if the full reference to the form is specified.
Forms!frmFoo.BackColor = QBColor(9)    'This is slow
Me.BackColor = QBColor(9)              'This is quicker
Whereas the optimizations you can realize through changing the syntax of your VBA are sometimes marginal, optimizing your database calls almost always leads to substantial performance benefits. The reason for that is simple. Database calls generally take longer to execute than normal VBA statements because they involve accessing the hard disk as opposed to changing the contents of the computer's memory, so a 10% improvement in performance in both will be more noticeable in the case of the database call. We'll look below at some of the ways you can improve the way that VBA code interacts with your database.
When you run your performance testing, you should replicate both the expected data volumes and the conditions of the production environment as closely as you can. It may not always be possible to obtain a copy of live data to perform your testing against, so you may need to generate test data.
There are two stages to producing a set of test data: creating the test table(s) and populating the test table(s). To see how easy it can be to build large volumes of test data, let's try it out for ourselves.
Create a new database in which to store the test data that we will create. Call the new database PerformanceXP.mdb.
In the PerformanceXP.mdb database, switch to VBA by hitting ALT+F11 .
Display the References dialog by selecting References from the Tools menu. When the dialog appears, uncheck the reference to the Microsoft ActiveX Data Objects 2.5 Library and check the reference to the Microsoft DAO 3.6 Object Library. Then hit the OK button.
Next, create a new module by selecting Module from the Insert menu.
Add the following procedure to the new module. This will be used to create the test table.
Sub BuildPerformanceTable()
On Error GoTo BuildPerformanceTable_Error

    Dim db As Database
    Dim tdf As TableDef
    Dim fld As Field
    Dim idx As Index

    Set db = CurrentDb()

    'Try to delete the tblPerformance table
    db.TableDefs.Delete ("tblPerformance")

    'Create a new tabledef object
    Set tdf = db.CreateTableDef("tblPerformance")

    'Create and save an ID counter field
    Set fld = tdf.CreateField("ID", dbLong)
    fld.Attributes = fld.Attributes Or dbAutoIncrField
    tdf.Fields.Append fld

    'Create and save an unindexed 255-character text field
    Set fld = tdf.CreateField("UnindexedText", dbText, 255)
    tdf.Fields.Append fld

    'Create and save an indexed 255-character text field
    Set fld = tdf.CreateField("IndexedText", dbText, 255)
    tdf.Fields.Append fld
    Set idx = tdf.CreateIndex("TextIndex")
    Set fld = idx.CreateField("IndexedText")
    idx.Fields.Append fld
    tdf.Indexes.Append idx

    'Create and save an integer field
    Set fld = tdf.CreateField("Num1in100", dbInteger)
    tdf.Fields.Append fld

    'Create and save another integer field
    Set fld = tdf.CreateField("Num1in1000", dbInteger)
    tdf.Fields.Append fld

    'Create and save (yet) another integer field
    Set fld = tdf.CreateField("Num1in10000", dbInteger)
    tdf.Fields.Append fld

    'Create and save a Yes/No field
    Set fld = tdf.CreateField("YesNo", dbBoolean)
    tdf.Fields.Append fld

    'Save the tabledef into the database
    db.TableDefs.Append tdf

BuildPerformanceTable_Exit:
    Exit Sub

BuildPerformanceTable_Error:
    Select Case Err
        Case 3265   'Item not found in this collection
            Resume Next
        Case Else
            MsgBox "The following unexpected error occurred:" & vbCrLf & _
                   Err.Description & " (Error " & Err.Number & ")", vbCritical
            Resume BuildPerformanceTable_Exit
    End Select
End Sub
Next, add the following procedure to the new module. This will be used to insert the data into the newly created table:
Sub PopulatePerformanceTable(lngRecords As Long)
    Dim lngRecordLoop As Long
    Dim intLoop As Long
    Dim recPerformance As Recordset
    Dim strText As String

    Set recPerformance = CurrentDb.OpenRecordset("tblPerformance", _
                                                 dbOpenDynaset, _
                                                 dbAppendOnly)

    For lngRecordLoop = 1 To lngRecords

        'Prepare to add new record
        recPerformance.AddNew

        'Add string up to 5 chars long into [UnindexedText] field
        strText = ""
        For intLoop = 1 To (1 + Int(5 * Rnd))
            strText = strText & Chr$(65 + Int(26 * Rnd))
        Next intLoop
        recPerformance("UnindexedText") = strText

        'Add string up to 255 chars long into [IndexedText] field
        strText = ""
        For intLoop = 1 To (1 + Int(255 * Rnd))
            strText = strText & Chr$(65 + Int(24 * Rnd))
        Next intLoop
        recPerformance("IndexedText") = strText

        'Add integer between 1 and 100 into [Num1in100] field
        recPerformance("Num1in100") = 1 + Int(100 * Rnd)

        'Add integer between 1 and 1000 into [Num1in1000] field
        recPerformance("Num1in1000") = 1 + Int(1000 * Rnd)

        'Add integer between 1 and 10000 into [Num1in10000] field
        recPerformance("Num1in10000") = 1 + Int(10000 * Rnd)

        'Add True or False into [YesNo] field
        recPerformance("YesNo") = (Rnd < 0.5)

        'Save new record
        recPerformance.Update
    Next

    recPerformance.Close

    MsgBox lngRecords & " rows were added to the tblPerformance table." & _
           vbCrLf & _
           "It now has " & CurrentDb.TableDefs("tblPerformance").RecordCount & _
           " records."
End Sub
Check that there are no compilation errors in the code that you have just typed in by selecting Compile PerformanceData from the Debug menu.
Now make sure that the Immediate window is visible by hitting Ctrl+G. Run the procedure to create the test table by typing the following into the Immediate window and hitting Enter:

BuildPerformanceTable
Check that the table has been created by switching back to Access. You can do this by hitting Alt+F11. The new table should appear in the Database window. If it does not appear, you might have to refresh the Database window by hitting F5.
Now, switch back to VBA and populate the table with test data. To do this, type the following into the Immediate window and hit Enter:

PopulatePerformanceTable 10000
When the procedure has finished executing, a message box will inform you of the fact.
Switch back to Access and open the table. It should contain 10,000 records of random-looking data.
Finally, save the module and take a look at the size of the database in Explorer. You should see that it is a little under 700KB. Not a bad amount of test data for five minutes' work!
How It Works
Don't be daunted by the length of the two procedures we use in this example. They are actually fairly straightforward. The first of these, the BuildPerformanceTable procedure, builds an empty table with the following structure:
Field | Datatype |
---|---|
ID | AutoNumber |
UnindexedText | Text(255) |
IndexedText | Text(255) |
Num1in100 | Integer |
Num1in1000 | Integer |
Num1in10000 | Integer |
YesNo | Yes/No |
It creates this table by using DAO, which is the reason that we replace the default reference to ADO with a reference to DAO in step 3.
The first task in the BuildPerformanceTable procedure is to delete any existing tables with the name tblPerformance . Although not necessary in the exercise above, this step is useful if you are going to execute this procedure more than once.
db.TableDefs.Delete ("tblPerformance")
However, it could be that there was no existing table with that name (such as when this procedure is run for the first time), in which case this line of code would normally cause run-time error 3265 to occur.
That is why our error handler contains a test for that specific error code.
```vba
BuildPerformanceTable_Error:
    Select Case Err
        Case 3265    'Item not found in this collection
            Resume Next
        Case Else
            MsgBox "The following unexpected error occurred:" & vbCrLf & _
                   Err.Description & " (Error " & Err.Number & ")", vbCritical
            Resume BuildPerformanceTable_Exit
    End Select
```
If that error does occur, we simply ignore it and resume execution at the next line of code in the main body of the procedure. Any other error causes a generic message box to display the error code and description.
The next step is to create the table and append the seven fields to it. This is all fairly straightforward and should hold few surprises for you.
If you aren't quite sure how this table creation process works, it is described in much more detail in Chapter 8.
Once we have created the table, we need to populate it with random data. That is what the PopulatePerformanceTable procedure does.
If you have read the previous chapters in this book, then there should be little to surprise you in this procedure. The first step is to open the Recordset that will be used to insert new records into:
```vba
Set recPerformance = CurrentDb.OpenRecordset("tblPerformance", dbOpenDynaset, _
                                             dbAppendOnly)
```
The important feature to notice here is the fact that the Recordset has been opened with the dbAppendOnly flag. This indicates that an empty Recordset should be opened for the sole purpose of adding new records. This speeds up the opening of the Recordset , especially in situations where the underlying table contains a large number of records.
The next step is to prepare to insert the specified number of records. The basic structure for adding new records looks like this:
```vba
For lngRecordLoop = 1 To lngRecords

    'Prepare to add new record
    recPerformance.AddNew
    .
    .
    .
    'Save new record
    recPerformance.Update

Next
```
If you remember back as far as Chapter 6, you will recall that there are three steps to adding records to a Recordset . First, a new record is placed into the copy buffer:
recPerformance.AddNew
Next, the fields in the copy buffer are amended. Finally, the contents of the copy buffer are appended to the table.
recPerformance.Update
That's straightforward enough, but how do we generate the random data that we will insert into the table? The key to it is the Rnd function.

The Rnd function generates a pseudo-random number that is greater than or equal to 0 and less than 1. We can take advantage of this to define both a random length and a random value for the data to be inserted. Inserting a random integer between 1 and, say, 100 is simple enough:
recPerformance("Num1in100") = 1 + Int(100 * Rnd)
The Int function truncates numbers rather than rounding them. So, if a number generated by the Rnd function satisfies this condition:
0 ≤ r < 1
then (100*Rnd) yields this:
0 ≤ 100 * r < 100
and Int(100*Rnd) yields an integer such that:
0 ≤ Int(100 * r) ≤ 99
To yield an integer between 1 and 100, rather than between 0 and 99, we simply add 1 .
1 + Int(100 * Rnd)
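The same pattern generalizes to any range. The helper below is our own illustrative addition (the function name is not from the chapter) that wraps the technique:

```vba
'Illustrative helper (not part of the chapter's code): returns a
'pseudo-random integer between lngLow and lngHigh inclusive.
'Rnd yields 0 <= r < 1, so scaling by the width of the range and
'truncating with Int covers every value in the range exactly once.
Function RandomBetween(lngLow As Long, lngHigh As Long) As Long
    RandomBetween = lngLow + Int((lngHigh - lngLow + 1) * Rnd)
End Function
```

Calling RandomBetween(1, 100) is equivalent to the 1 + Int(100 * Rnd) expression above.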
So adding random numbers poses no problems. But how do we add random text strings? The answer is to use the Chr$ function. This returns a character based on its ANSI code. The American National Standards Institute (ANSI) character set used by Microsoft Windows contains 256 characters. The first 32 characters (from 0 to 31) represent special characters such as tab and backspace characters. The next 96 characters (from 32 to 127) correspond to the letters and symbols on a standard U.S. keyboard. The final 128 characters represent special characters, such as letters in international alphabets, accents, currency symbols, and fractions.
If you look at the ANSI character set, you will notice that the letters A-Z have ANSI codes ranging from 65-90. So, if we want to generate a random letter between A and Z we need to generate an integer between 65 and 90 and pass that number to the Chr$ function. That is just what this line does:
strText = strText & Chr$(65 + Int(26 * Rnd))
If we want our string values to be of random length between 1 and 5 characters, we simply need to execute this line of code a random number of times between 1 and 5.
```vba
For intLoop = 1 To (1 + Int(5 * Rnd))
    strText = strText & Chr$(65 + Int(26 * Rnd))
Next intLoop
```
You can also determine a character's ANSI code by using the Asc function, for example, Asc("A") returns 65 .
In some situations you might want to generate a reproducible series of random numbers. You can do this by reseeding the random number generator, which is achieved by passing a negative number to the Rnd function. For example, the following procedure will always generate the same sequence of numbers.
```vba
Sub PseudoRandom()
    Dim i As Integer
    Rnd (-2)
    For i = 1 To 5
        Debug.Print Rnd
    Next
End Sub
```
To generate a different series of reproducible random numbers, simply pass a different negative number to the Rnd function.
Of course, the numbers returned by Rnd - just like the numbers returned by any mathematical algorithm - are not truly random. They are pseudo-random. However, they are random enough for the purpose of generating test data!
To fill the table with random data, we simply need to run the PopulatePerformanceTable procedure, passing to it as an argument the number of records that we wish to insert. When the procedure has finished executing, it displays a message box explaining the number of records it has added and the total number of records now in the table.
Be careful not to get carried away when running this procedure. On a PC with a 300 MHz Pentium processor and 64MB of memory this procedure will add records at a rate of nearly 250 per second. Put another way, that means that the database will grow at a rate of around 10MB per minute!
Something else to be wary of is the fact that deleting and recreating tables can quickly lead to database bloat. That is a condition that arises when unused space is not reclaimed as objects are deleted, and it manifests itself in ever increasing database sizes. To reduce database bloat, you should compact the database regularly, a task described in more detail later on in this chapter. Fortunately, the auto-compact feature in Access 2002 means that bloated databases will automatically be compacted when they are closed.
So now we have a relatively straightforward method for generating large volumes of test data at great speed. Now let's look at some ways that we can improve access to that data.
Adding an index to a field can be an excellent way of improving the performance of searches on that field. Although adding indexes slows updates and increases locking contention , very often this overhead is more than offset by the performance benefits gained if the fields are frequently used for query searches.
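If you want to experiment with indexes on the test table, a DAO sketch along these lines would add one. This is our own illustration, not code from the chapter; the field name matches the tblPerformance structure above, and the index name is an assumption:

```vba
'Sketch: add an index on the IndexedText field of tblPerformance.
'Assumes a DAO reference and that no index of this name exists yet.
Sub AddIndexedTextIndex()
    Dim tdf As DAO.TableDef
    Dim idx As DAO.Index

    Set tdf = CurrentDb.TableDefs("tblPerformance")

    'Create the index, append the field it covers, then append the
    'index itself to the table's Indexes collection
    Set idx = tdf.CreateIndex("idxIndexedText")
    idx.Fields.Append idx.CreateField("IndexedText")
    tdf.Indexes.Append idx
End Sub
```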
The following procedure counts the number of records in which the IndexedText field begins with the letter X , using the set of 10,000 records that we created in the previous exercise:
```vba
Sub IndexedSQL()
    Dim strSQL As String
    Dim rec As Recordset

    strSQL = "SELECT Count(*) FROM tblPerformance WHERE IndexedText LIKE 'X*'"
    Set rec = CurrentDb.OpenRecordset(strSQL, dbOpenDynaset)
    rec.Close
End Sub
```
Whereas this procedure runs the same query against the UnindexedText field:
```vba
Sub UnindexedSQL()
    Dim strSQL As String
    Dim rec As Recordset

    strSQL = "SELECT Count(*) FROM tblPerformance WHERE UnindexedText LIKE 'X*'"
    Set rec = CurrentDb.OpenRecordset(strSQL, dbOpenDynaset)
    rec.Close
End Sub
```
On my computer the indexed search took 188ms the first time it was run and only 38ms for subsequent executions. By way of contrast, the unindexed search took over 5 seconds the first time and only 38ms subsequently.
| Operation | Elapsed Time | Improvement |
|---|---|---|
| UnindexedSQL (1st run) | 5375 ms | |
| IndexedSQL (1st run) | 188 ms | 28.66 times |
| UnindexedSQL (subsequent runs) | 38 ms | |
| IndexedSQL (subsequent runs) | 38 ms | 1.3 times |
Important: The vast difference between the first and subsequent executions of these searches shows the impact of caching. The first time that these searches are performed, the data has to be physically read in from disk, whereas for subsequent searches the data only needs to be read in from memory - a substantially faster operation.
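To reproduce timings like these on your own machine, you can wrap the two procedures in a simple harness based on the Timer function introduced at the start of this chapter. The harness itself is our own sketch:

```vba
'Sketch: time the indexed and unindexed searches using Timer.
'Run it twice to see the effect of caching on the second pass.
Sub CompareSearches()
    Dim sngStart As Single

    sngStart = Timer
    IndexedSQL
    Debug.Print "Indexed:   " & Format$(Timer - sngStart, "0.000") & " secs."

    sngStart = Timer
    UnindexedSQL
    Debug.Print "Unindexed: " & Format$(Timer - sngStart, "0.000") & " secs."
End Sub
```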
Another way of increasing the performance of data access in code is to use a more efficient type of Recordset object, a subject we have already looked at in Chapter 6.
```vba
Sub TestSnapshot()
    Dim strSQL As String
    Dim rec As Recordset

    strSQL = "SELECT * FROM tblPerformance WHERE IndexedText > 'N*'"
    Set rec = CurrentDb.OpenRecordset(strSQL, dbOpenSnapshot)
    rec.MoveLast
    rec.Close
End Sub
```
The procedure above returns all of the records from the tblPerformance table that have a value in the IndexedText field which begins with an N or any letter alphabetically after N .
Because dynaset-type Recordset objects only cache a copy of the key values of the result set, they will typically open more quickly than snapshot-type Recordset objects in situations where the result set is larger than, say, 500 records.
```vba
Sub TestDynaset()
    Dim strSQL As String
    Dim rec As Recordset

    strSQL = "SELECT * FROM tblPerformance WHERE IndexedText > 'N*'"
    Set rec = CurrentDb.OpenRecordset(strSQL, dbOpenDynaset)
    rec.MoveLast
    rec.Close
End Sub
```
Where the result set is extremely large, or where the base tables are located on the other side of a slow network, the difference can be quite significant.
The following table indicates the results of running these two procedures against a tblPerformance table with 30,000 rows (of which approximately half are returned by the query).
| Procedure | Elapsed Time | Improvement |
|---|---|---|
| TestSnapshot (1st run) | 8266 ms | |
| TestDynaset (1st run) | 734 ms | 11.3 times |
| TestSnapshot (subsequent runs) | 72 ms | |
| TestDynaset (subsequent runs) | 16 ms | 4.5 times |
If you cannot use a dynaset-type Recordset object, you might find that you achieve slightly better performance if you use a forward-only Recordset object.
These only allow you to scroll downwards through a recordset and you cannot use certain methods (for example, MoveLast ) against recordsets created like this. Although this means that you cannot use forward-only recordsets in all situations, the fact that they do not need a complicated cursoring mechanism means that they will often outperform conventional recordsets.
In a multi-user environment, you might also see some small advantage from using read-only recordsets.
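As a sketch (the procedure name is our own), opening the same query forward-only and read-only just means changing the arguments to OpenRecordset:

```vba
'Sketch: a forward-only, read-only recordset over the same query.
'Forward-only recordsets support MoveNext but not MoveLast or
'MovePrevious, so we walk the records rather than jumping to the end.
Sub TestForwardOnly()
    Dim strSQL As String
    Dim rec As Recordset

    strSQL = "SELECT * FROM tblPerformance WHERE IndexedText > 'N*'"
    Set rec = CurrentDb.OpenRecordset(strSQL, dbOpenForwardOnly, dbReadOnly)
    Do Until rec.EOF
        rec.MoveNext
    Loop
    rec.Close
End Sub
```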
Still better performance gains, however, can be achieved by opening an append-only recordset. These recordsets are empty and can only be used to add new records, rather than for inspecting existing records. Because append-only recordsets do not require any records to be retrieved, they will typically open significantly faster than fully-populated recordsets, especially when the base table contains many records.
The performance of any of the Recordset types described in this section is highly dependent on the size of the base table, the restrictiveness of the criteria, and which fields are indexed. In short, there is no substitute for performance testing on your own database.
Each record in a recordset is automatically assigned a Bookmark when the recordset is opened. If you know that you will want to move back to the record that you are currently on, you should save that record's Bookmark to a variable. By setting the Recordset object's Bookmark property to the value saved in the variable, you can return to the record far more quickly than you could with any of the Find or Seek methods.
Bookmarks are stored as arrays of byte data and so should be saved as byte arrays rather than strings. Although you can save a Bookmark to a string variable, comparing a record's Bookmark to a Bookmark stored in a string variable will not yield correct results, unless the comparison is performed as a binary comparison. You can also use variants for bookmarks, but the extra work VBA has to do with the variant type will mean performance is not as good as using a byte array. For more information on using Bookmarks and performing binary comparison, refer to Chapter 6.
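A minimal sketch of the byte-array technique (the procedure and variable names are our own):

```vba
'Sketch: save the current record's Bookmark in a byte array and
'return to it later - much faster than a Find or Seek.
Sub ReturnViaBookmark()
    Dim rec As Recordset
    Dim abytMark() As Byte

    Set rec = CurrentDb.OpenRecordset("tblPerformance", dbOpenDynaset)

    abytMark = rec.Bookmark      'remember where we are

    rec.MoveLast                 'wander off elsewhere in the recordset

    rec.Bookmark = abytMark      'jump straight back to the saved record
    rec.Close
End Sub
```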
In Chapter 6 we examined DAO in detail, and in Chapter 21 we will examine ADO in detail, including a discussion of when to use ADO in preference to DAO. One of the factors to consider when making that choice is performance. So let's look at how ADO performs compared to DAO when using the Jet database engine. First, let's compare updating data in a loop using DAO against using ADO. Before the following code will work, you need to make sure that you have a reference to Microsoft ActiveX Data Objects 2.7 by selecting Tools | References in the VBA editor:
```vba
Sub ADOUpdates()
    Dim strSQL As String
    Dim con As ADODB.Connection
    Dim rec As New ADODB.Recordset

    Set con = CurrentProject.Connection
    strSQL = "SELECT * FROM tblPerformance WHERE IndexedText > 'N%'"
    rec.Open strSQL, con, adOpenKeyset, adLockOptimistic
    Do Until rec.EOF
        rec("Num1in10000") = rec("Num1in10000") * 0.9
        rec("Num1in1000") = rec("Num1in1000") * 0.9
        rec("Num1in100") = rec("Num1in100") * 0.9
        rec.Update
        rec.MoveNext
    Loop
    rec.Close
End Sub

Sub DAOUpdates()
    Dim strSQL As String
    Dim rec As DAO.Recordset

    strSQL = "SELECT * FROM tblPerformance WHERE IndexedText > 'N*'"
    Set rec = CurrentDb.OpenRecordset(strSQL, dbOpenDynaset, dbOptimistic)
    Do Until rec.EOF
        rec.Edit
        rec("Num1in10000") = rec("Num1in10000") * 0.9
        rec("Num1in1000") = rec("Num1in1000") * 0.9
        rec("Num1in100") = rec("Num1in100") * 0.9
        rec.Update
        rec.MoveNext
    Loop
    rec.Close
End Sub
```
Note that the adOpenKeyset cursor type is equivalent to the dbOpenDynaset recordset type, and that adLockOptimistic equates to dbOptimistic . Note also that ADO does not require a call to the Edit method before a record is modified.
| Operation | Elapsed Time | Improvement |
|---|---|---|
| ADOUpdates | 569 ms | |
| DAOUpdates | 425 ms | 1.34 times |
For this test we can see that DAO performs better than ADO. In fact DAO makes much more efficient use of the Microsoft Jet database engine than ADO can. One of the reasons for this is that the internal ADO calls made to retrieve schema information about the table you are querying are relatively inefficient when applied against Jet, and that adds overhead. Also in ADO, every Connection object uses a separate Jet session, whereas, by default, DAO objects use the same Jet session. Each session has an overhead and a separate read cache, so we can see that ADO requires more work to get started, and that the read cache, which makes subsequent runs so much faster, cannot be shared between each connection.
Microsoft states that for many operations against Jet databases DAO is 5 to 10 times faster than ADO, and that for queries and updates against tables with a large number of columns ADO is 30 to 80 percent slower than the equivalent query using DAO (see Q225048 - http://support.microsoft.com/default.aspx?scid=kb;en-us;Q225048 ).