Three Approaches to Client/Server Interactions

Three approaches suggest themselves for changing and updating data in typical ordering scenarios, such as the one we have already considered. We'll examine each in turn before making our choice.

Classic Client/Server Computing

Let's get back to the ordering scenario we considered earlier, where we had to find the customer, find the thing to order, and then place the order. Here, our three pieces of server-side work—two reads and an update—would be three separate database transactions/interactions. We pull the data down for the picklist with the first read. We pick a customer ID and use that as a parameter on the next read. We make all changes in the user interface, and when the user is happy, he or she clicks the OK button (or whatever user interface standard we are following). The data goes back—perhaps as parameters on a stored procedure call or an action query—and all the work is done within the database server to make the changes as one database transaction. This is how most early client/server systems were built—it was frequently the model with terminal front-ended applications.

What happens in this approach if something goes wrong? For instance, suppose a business rule, implemented in the stored procedure we called, makes sure that the order we have just input doesn't take the customer's total outstanding orders over their credit limit. Perhaps we even checked for this in the user interface by retrieving the current total order value and credit limit for the customer. In the meantime, however, some other order has also been processed, and that has tipped the balance.

Clearly, the database doesn't get updated. The stored procedure returns something to tell the user interface this, and makes sure anything it did to the database is now rolled back (perhaps including assigning order numbers). This model has advantages:

  • There are no locks on the database until we actually do the insert/update work, and then only for the briefest time we can get away with—for example, while the stored procedure is running in the database.
  • Our two read routines can be reused in other pieces of functionality within the system. (It is quite likely that the customer picklist read—and the customer details read if it is not too order-processing specific—can be used in a number of places in the system.)
  • If we needed to tweak the work process or the user interface, we wouldn't have to rewrite our functionality, just reorder it. Thus if we wanted to take the order details before we checked on the customer, we could do it.

This model aims at making sure that all requests on the server are read-only ones, until the user hits OK. Then all the actions (updates, inserts, deletes) are done in one database transaction. We at TMS like this approach—it isn't a classic for nothing. We have to admit, however, that it won't always work—just because 8 million flies choose it doesn't make it right for every situation.

Journal/Temporary Entry

Let's look at the stock checking step in our hypothetical application: find the order item with its stock level. We weren't saying, "Lock that stock level until we have finished completing our order." If we don't lock the stock level, we can obviously get situations where, when we first checked, there was enough stock to place the order, but by the time we input our order stock levels have fallen. (We are successful—orders are coming in all the time!) If that is the case, again our poor old stored procedure is going to have to check, raise an error, and offer the customer—via the user interface—the opportunity either to accept a back-order for that item, to drop the order line, or to drop the whole order.

We could implement this step differently: when we check stock levels, we could say lock, or temporarily allocate, that stock, so that when our order clerk says, "We have that item in stock," it remains true when the order is placed. This can be done as the usual kind of transaction—insert/update records in this allocated stock table. The problem comes if the link between the ordering clerk and this stock allocation table entry goes down before the order is placed and the stock committed.

This problem is usually dealt with by grouping the stock allocation(s) and the order committal into a standard database transaction so that either they both work, or neither does. This opens us up to locking problems, and also to issues of deciding exactly when the transaction has failed. Has it, for instance, failed if upon committing the order we get a rollback because the customer's credit limit has been exceeded? Does this circumstance mean that all the stock for all the order lines should no longer be allocated? This can mean doing the stock allocations and all the stock checks all over again if there is some process for authorizing such orders or upping the customer's credit limit.

Stateful/Transactional/Conversational Servers

Different IT shops have different names for these sorts of servers, but the implications are the same. Essentially, if we go for a solution to the situation above, where we roll back through transactions, our transactions can easily span multiple interactions between the ordering clerk and the database. This is how cursors are often used—even when it isn't necessary. It is also how transactions are frequently used. This means that the data may be locked for an extended period—take the extreme case of the ordering clerk dying of a heart attack brought on by the stress of continual uncertainty (due to locking problems) about the length of a task. Suppose also that our system copes with such problems as this, power failures, and so on by writing state data to the local disk for later recovery.

Even if our system users don't die mid-transaction, we are facing a situation of potentially long locks. Add to this a database server that has page-level locking, and make the system one with a high transaction rate and concurrent user numbers, and you have a problem waiting to occur. If we are going to lock, we have to do it in a way that interferes with as few users as possible, so that reading data without the intention of updating is still allowed. This is usually done with software locks (the application writes a value when it gets the data, and checks the value as it puts it back) or timestamp fields.

Locking and Unlocking

"I shall be happy to give you any information in my power."

Sir Arthur Conan Doyle,

The Naval Treaty:

The Memoirs of Sherlock Holmes

What we are dealing with here is protection against conflicting updates that might break the business rules and corrupt the data. Of course, this is not a very new problem; it has often been faced in computing before. Clearly even Shakespeare faced this problem, and was awakened by nightmares of inappropriate data-locking strategies.

Some data changes are all or nothing, replacement changes: I look at a customer record, and I change the surname. Some are not: I look at a balance figure on an account record, and change it with a financial transaction. The change this time is incremental; it uses the existing data to help make the change. If the existing data is the wrong data, the resultant change will be wrong. So if another ordering clerk committed an order for that customer faster than I did, my transaction cannot be allowed to complete.

A pessimistic scenario

Either I or the other user should be stopped from starting a transaction while the other one is (potentially) already in a transaction. Good—only one of us is working on the data at a time. Bad—only one of us can work on the data at a time. Imagine a system with three major customer accounts. Each customer has many different people who can phone in orders at the same time, and we have many ordering clerks to allow them to do this. Imagine ten orders for one of our large corporate customers being taken concurrently by ten order clerks. The first one in can lock the data; the other nine must wait. The result? Frustration.

An optimistic scenario

Let's take the same task: ten orders are being taken concurrently. For each order, we read the data and get some sort of indicator of where we are in sequence. All our order clerks go for it—hell-for-leather Online Transaction Processing (OLTP). The first process to complete an order checks to see if the sequence token it is holding allows it to commit the changes. The token says "Yes, you are in the correct sequence to change the data," so the transaction commits. The second order process checks its sequence token and finds that the data has changed. The second order process then has to check the data again, to see if it is going to break the business rules that apply now that the server data has changed. If no rules are broken, the process commits the order. If committing the order would break the rules, the process has to make a decision about which value should be in the database (the value that is there now, the value that was there before the first order was committed, or the value after this second order is committed). Otherwise the process has to refer the decision back to the second ordering clerk. The choices include: making the change, rolling back the change it was going to make and thus starting all over again (leaving the data as it was after the first transaction), or even forcing the first transaction to roll back too, so that the data is as it was before any ordering clerks got to work on it.
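
One common way to implement this optimistic check is to read a version value (a counter or timestamp column) along with the data, and then make the UPDATE conditional on that value still being current. The following is a minimal sketch only, not the chapter's sample code; the table, column, and DSN names are invented:

Public Function TryUpdateBalance(ByVal lngAccountID As Long, _
                                 ByVal curNewBalance As Currency, _
                                 ByVal lngVersionWeRead As Long) As Boolean
    Dim cn As ADODB.Connection
    Dim cmd As ADODB.Command
    Dim lngRowsAffected As Long

    Set cn = New ADODB.Connection
    cn.Open "DSN=Orders;UID=clerk;PWD=secret;"

    Set cmd = New ADODB.Command
    Set cmd.ActiveConnection = cn
    'The WHERE clause includes the version value we read earlier, so the
    'update touches nothing if another clerk got in first.
    cmd.CommandText = "UPDATE Account SET Balance = ?, VersionStamp = VersionStamp + 1 " & _
                      "WHERE AccountID = ? AND VersionStamp = ?"
    cmd.Parameters.Append cmd.CreateParameter("NewBalance", adCurrency, adParamInput, , curNewBalance)
    cmd.Parameters.Append cmd.CreateParameter("AccountID", adInteger, adParamInput, , lngAccountID)
    cmd.Parameters.Append cmd.CreateParameter("Version", adInteger, adParamInput, , lngVersionWeRead)

    cmd.Execute lngRowsAffected

    'Zero rows affected means the data changed since we read it: re-read,
    're-check the business rules, and refer back to the clerk if need be.
    TryUpdateBalance = (lngRowsAffected = 1)

    cn.Close
End Function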

Optimistic or pessimistic?

What sort of data is it? Some fields, or even records, might be the sort of data in which no business rules can apply. In that case the last update must always be the one that defines what data is now in the database. It might be that currently there are no rules, in which case there can be no incorrect sequence. It can be that the data is not volatile or only rarely worked on concurrently. Usually, if you must lock, you must also strive to minimize the time you are locking, and avoid locking over user interactions if you can.

The Trouble with Teraflops

Imagine a database with 50,000 account records, perhaps for a public utility. Imagine a user requirement has been allowed to slip through the change control net —perhaps it's even genuinely needed—that allows users to search for customers by the first letter of their surnames. Chances are we will get about 2000 hits when we enter "D."

The prototype established that they want to see the hits back in a picklist that shows Name and Full Address. Table 12-1 describes the data fields returned for a single hit.

Field Size (bytes)
ID 4
Surname 50
ForeName 50
AddressLine1 50
AddressLine2 50
AddressLine3 50
Town/City 50
County 30
Postcode InCode 4
Postcode OutCode 4
TOTAL 342

Table 12-1 The fields in the customer surname query

So every time we do one of these searches, we can expect around 684 KB across the network for each search (342 bytes for each of the 2000 hits). We might consider sending all 2000 hits back in one go. This way, all 2000 have to arrive before the user interface code can load them in the picklist (probably a nice resource-hungry grid control). We could send the rows one at a time, and try to load the first in the user interface as soon as it arrives.

The issue is one of scalability. System constraints are preventing us from doing what we wish to do. The approach we adopt is bound to be a compromise between cosseting bandwidth and providing responsiveness to the user. Since we would actually prefer never to send more than 100 rows back at a time, we can design away the problem.

If the user really must scroll down hundreds of rows, we can send back the first 100, and then indicate there are more. Then—if we really must—we will get the others (although I would really prefer to make them narrow their query) in the background while our user is still looking at the first 100. The alternative of bringing the next lot when the user asks for it (usually by scrolling down a grid or list box in the user interface) is very awkward as a user interface, since the chances are high that the user can scroll faster than we can get the data. Even so, there are likely to be strangely jerky and unexpected delays for the user.
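
As an illustration of the first-100 approach (a sketch only, not the chapter's sample code; the control names, DSN, and table are invented), ADO's MaxRecords property lets us cap what the server sends back:

Dim rs As ADODB.Recordset

Set rs = New ADODB.Recordset
rs.MaxRecords = 100                      'Ask the provider for no more than 100 hits.
rs.CursorLocation = adUseClient
rs.Open "SELECT ID, Surname, Forename, AddressLine1, Town " & _
        "FROM Customer WHERE Surname LIKE 'D%' ORDER BY Surname", _
        "DSN=Utility;UID=reader;PWD=secret;", adOpenStatic, adLockReadOnly

Do While Not rs.EOF
    lstPicklist.AddItem rs!Surname & ", " & rs!Forename
    rs.MoveNext
Loop

If rs.RecordCount = 100 Then
    lblMore.Caption = "More than 100 matches - please narrow the search."
End If

rs.Close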

We could consider setting up a special index on the server—a covering index for all the fields we need in the picklist, but the index only shows every 50th row. Thus we bring down the first 100, and if they want more we fetch the index. They choose from the index, and we bring down the 50 that fall between two index entries. Tree views are quite good for this. The new hierarchical cursors in ADO, and the Hierarchical FlexGrid control, are particularly interesting here. If users want more data than we are prepared to send, we could send down the index and let them narrow things down from that. Other user interface solutions are tabs, with each tab representing an index section, like a phone or address book.

Another question that needs consideration is whether to bring the details of the records back in the picklist, or only the data that is needed for display to do the picking. Suppose that the full size of the record is 700 bytes. This would mean moving 1.4 MB for each picklist. Sounds like a bad idea—imagine the effect on the network.

Once we have the list, how do we work with it? The simplest method is to create objects on an as-needed basis. That is, an object is created and populated with data only after the user has chosen an item from the picklist. It almost never makes sense to create an object and load its data for each entry in a picklist. Our user interface might behave as if that is what we have done, but in reality it will be all a front. There are a number of options for implementation here.

When to Open and Close Connections

One of the most expensive things you can do with a database (at least in terms of time) is opening a connection to it. Try it out with the RDOvsADO sample project in the samples directory for this chapter. This expense in time has led to most database applications, one way or another, opening connections and keeping them open for extended periods, rather than taking the hit of opening the connections more frequently. You'll often see the strategy of a Visual Basic application opening its database connection on application startup, holding it as a Public property of the application (a Global variable in old-time Visual Basic speak), using it everywhere throughout the data access work, and finally closing the connection only when the application shuts down. It once was a good strategy, but the world is changing.

If applications didn't use connections this way, they implemented some kind of connection pooling, wherein a pool of already-open connections was used to supply any application that asked for one. This is because, although database connections are expensive, they are also finite. To begin with, we usually must manage them, if only for licensing reasons; in addition, each connection uses up memory on both the client and the server machines. Thus when certain database access technologies require multiple connections per user or application to carry out a task, they can severely impact application performance.

One of the interesting aspects of building middle-tier objects that talk to the database (rather than allowing an application on the client to open its own database connection) is that objects in the middle-tier can open the connections. They might even be on the same machine as the database server, and they can provide connection pooling. When Microsoft Transaction Server (MTS) was introduced as a middleware provider of both Object Request Brokering services and Transaction Process Monitoring, one of its subsidiary roles was to provide pooling of shared resources. Currently ODBC connection pooling is already provided by MTS. Even without MTS, ODBC 3.5 now provides connection pooling itself. (See Figure 12-1.)

Figure 12-1 Connection pooling with ODBC 3.5

The effect of connection pooling can best be seen with a project that creates a number of components, each of which opens and uses a connection.

With connection pooling in place it becomes viable to change the connection opening and holding strategy. It can even be worthwhile to open a connection every time one is required, use it, and then close it, because in reality the connection will not be opened and closed but pooled, so as to avoid the cost of repeated connection openings. However, the pool can be limited in size so that the minimum number of active useful connections exist, but idle connections using resources without doing work are all but eradicated.
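
Here is a minimal sketch of the "open it, use it, close it" style that pooling makes viable (the DSN and SQL are invented for illustration; the close simply returns the connection to the pool):

Public Sub MoveCustomersToNewManager(ByVal lngOldID As Long, ByVal lngNewID As Long)
    Dim cn As ADODB.Connection

    Set cn = New ADODB.Connection
    cn.Open "DSN=Sales;UID=app;PWD=secret;"   'Cheap if a pooled connection is reused.

    cn.Execute "UPDATE Clients SET Account_Manager_ID = " & lngNewID & _
               " WHERE Account_Manager_ID = " & lngOldID

    cn.Close                                  'Hands the connection back to the pool.
    Set cn = Nothing
End Sub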

Deciding Where to Put Transaction Scope

Where should transaction scope lie? Who or what will control it and coordinate commits and rollbacks?

What is transaction scope?

With technologies such as DAO, RDO, and ADO, the transactional scope is limited to some database object or other.

With DAO that object is the Workspace object. Transactions are always global to the Workspace object and aren't limited to only one Connection or Database object. Thus if you want multiple transactional scopes you need to have more than one Workspace object.

In RDO, transactions are always global to the rdoEnvironment object and aren't limited to only one database or resultset. So with RDO, to have multiple transaction scopes you need multiple rdoEnvironment objects.

If you want to have simultaneous transactions with overlapping, non-nested scopes, you can create additional rdoEnvironment objects to contain the concurrent transactions.

With ADO, the transactional scope is on the Connection object—if the provider supports transactions (you can check for a Transaction DDL property [DBPROP_SUPPORTEDTXNDDL] in the Connection's Properties collection). In ADO, transactions may also be nested, and the BeginTrans method returns a value indicating what level of nesting you've got. In this case, calling one of the standard transaction methods affects only the most recently opened transaction. But if there are no open transactions, you get an error.

ADO can also automatically start new transactions when you call one of the standard close methods for a transaction. For a given connection, you can check the Attributes property, looking to see if the property has a value greater than zero. It can be the sum of adXactCommitRetaining (131072)—indicating that a call to CommitTrans automatically starts a new transaction, and adXactAbortRetaining (262144)—indicating that a call to RollbackTrans automatically starts a new transaction.
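
A minimal sketch of checking this (an illustration only; the DSN is invented):

Dim cn As ADODB.Connection
Dim lngLevel As Long

Set cn = New ADODB.Connection
cn.Open "DSN=Sales;UID=app;PWD=secret;"

If (cn.Attributes And adXactCommitRetaining) <> 0 Then
    Debug.Print "CommitTrans will automatically start a new transaction."
End If
If (cn.Attributes And adXactAbortRetaining) <> 0 Then
    Debug.Print "RollbackTrans will automatically start a new transaction."
End If

lngLevel = cn.BeginTrans    'Returns 1 for the outermost transaction.
'... do some transactional work ...
cn.CommitTrans              'Affects only the most recently opened transaction.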

Cooperating components and transaction scope

Regardless of which object provides transactional scope and methods, the salient point is that there is only one such object for each data access library. Whatever the object, all code that needs to cooperate in the same transaction has to gain access to that same instance of the object. This is a small issue in a two-tier architecture, where all the components are on the same machine (and more often than not where there is only one client executable). There are no process boundaries or machine boundaries being crossed, and the place where the transaction-owning object will be instantiated is a no-brainer.

In a tiered system with a number of cooperating components—perhaps on different machines—some means has to be found of passing object references to the shared transaction-owning object. This is one of the reasons why it typically doesn't make the best sense to build data access code and transaction scope into individual business objects in a tiered system.

Frequently, compromises in design have to be made to allow the passing of this data access object so that a number of business components (which should have no database access specific knowledge or abilities) can cooperate in the same transaction. This is where Microsoft Transaction Server (MTS) solves the problem with a degree of elegance with its Context object and the CreateInstance, SetComplete, and SetAbort methods. Taking a lesson from this, a good design pattern is to create a class with the purpose of acting as shared transaction manager for cooperating components, or else to use MTS to do the same thing.

Such strategies allow the grouping of components on different machines in the same transaction, and moving the point of transactional control to the place where the sequence of interactions is also controlled. Frequently this is at some functional location, quite near to the user interface. (See Chapter 2 for more detailed information about business objects.)

Another issue arises here, namely that of stateful and stateless work. MTS prefers statelessness, where everything can be done to complete a unit of work in as few calls as possible between client and server. However, transactional work is by its very nature stateful and built from a series of interactions. This harks back to the three approaches to client/server interaction we considered earlier. This is also discussed in Chapter 2.

Let's Get Tiering

We've already mentioned tiering in terms of cooperating components and transaction scope. So, we're going to mold most of the transactional functionality to fit into Microsoft's Distributed interNet Applications Architecture (DNA). (What, you mean you didn't know they'd trademarked DNA? Does that mean you're not paying the franchising fee either?) The old TLA Lab must have really burnt some midnight oil to make that fit. Dolly the sheep country. Please note, this doesn't mean that your application has to be a Web-based application to benefit from the DNA architecture. In fact, as a developer with lots of client/server experience etched in scar tissue on my body, one of the things I was most encouraged about was that (when looked at the right way), Web-based systems and client/server systems are very similar. Thus I didn't have to unlearn everything I'd sacrificed the last five years to learn. When I discovered this a year or so ago, I was much cheered, and sang Welsh songs of triumph involving sheep, beer, and rain.

A Web-based system is of necessity structured in pieces, since it almost never makes sense to put all the work on the client and send everything to it. A Web-based system always has to be aware of the network and scalability. A Web-based system also needs to be as stateless and disconnected as it can get away with. This sounds very client/server-like.

If we are going to consider component-based distributed systems now, we should make a few basic decisions:

Where to put the data access code?

The tiered architecture includes a Data Services tier. Often this is taken to be the database or data source itself, but TMS would argue that this tier also includes the data access code, which acts as the interface to the system's data sources. Consider the options.

One option is for every business object class to implement its own data access code. This may mean the same code in many places, which is a maintenance problem; or, if there are many disparate data sources, a lack of consistency.

However, the bulk of data access code, regardless of its data storage format, is likely to provide the same service, and thus provide the same interface. Imagine the advantages of uniformity: all business objects in the same system, regardless of their business rules, using the same data access interface. Thus there is uniformity for the creation of all generic business services (Create, Read, Update, Delete—the timeless CRUD matrix). This makes it possible to write wizards and add-ins in Visual Basic that generate business objects and all of their common functionality. Since there can be many such business object classes in a system, this can represent a major saving to the project. If a template for the data access class that the business object classes use is created, the internals of supporting different end data sources can be encapsulated behind a standard interface, protecting the business objects from changes of data source, and protecting the data access code from changes of business rule. If data access is provided by a specialized class, this also makes it possible for that class to be bundled as a separate component, which will allow the logical tiers to be physically deployed in any combination. This can have scalability advantages.

The responsibility of the data access service becomes to interface to all supported data sources in as efficient a way as possible and to provide a common interface to any data service consumers. This allows a developer to wrap up RDO or ADO code in a very simple, common interface, but to make the internal implementation as specific, complex, and arcane as performance demands. Such code is also reusable across all projects that use a given type of data source—a handy component to have in the locker if implemented internally in RDO (because it should work with any ODBC compliant data source), and even more generic if it uses ADO internally. But again, as new unique data sources are required, either new implementations of the component (which have new internal implementations but support the same public interface) can be produced or else extra interfaces or parameter-driven switches can be used to aim at other types of data source.

Essentially, a request for data or a request to modify data comes in to a business object (for example, "Move all of Account Manager Dante's Customers to Smolensky"), probably via a method call from an action object in a Visual Basic front-end program. The business object's method should find the SQL string (or other Data Manipulation Language [DML] syntax) that corresponds to the task required. Once the business object has obtained the SQL string "Update Clients Set Account_Manager_ID = ? where Account_Manager_ID = ?" (or more likely the name of a stored procedure that does this), it passes this and the front-end-supplied parameters 12 and 32 to an instance of a DAO (data access) object. The DAO object would choose to link with the data source using ADO/RDO, and depending on the format of the DML, it might also convert its data type. The DAO object would then update the data in the remote RDBMS, and pass back a "successful" notification to the business object (its client).
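
A minimal sketch of how such a business object method might look (all class and method names here—the ROOS and the data access component's interface—are invented for illustration, not the chapter's sample code):

Public Sub MoveCustomers(ByVal lngFromManagerID As Long, _
                         ByVal lngToManagerID As Long)
    Dim sSQL As String
    Dim oROOS As SQLROOS.Repository        'Hypothetical SQL ROOS class.
    Dim oDAO As DataAccess.Server          'Hypothetical data access class.

    Set oROOS = New SQLROOS.Repository
    'Returns, say, "Update Clients Set Account_Manager_ID = ? where Account_Manager_ID = ?"
    sSQL = oROOS.GetSQL("MoveCustomersToNewManager")

    Set oDAO = New DataAccess.Server
    oDAO.Execute sSQL, lngToManagerID, lngFromManagerID   'Parameters fill the ? placeholders.

    Set oDAO = Nothing
    Set oROOS = Nothing
End Sub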

This gives a preferred application structure, as shown in Figure 12-2:

Figure 12-2 Where to put data access code

The Client creates a reference to a procedural object and calls a method on it. In order to carry out the work of that method, the procedural object needs to call more than one business object in a transaction. To do this it creates a reference to a Data Access component, perhaps in its Class_Initialize. It tells the data access component via a method or property that it wants to start a transaction. Then it creates references to each of the business objects and passes the data access component reference to each of them. The procedural object also maintains its own reference to the data access component's object. This is so that they will all use the same connection, and thus be in the same transactional context with the database. The business objects carry out their interactions with the database via the same data access component and thus the same database transaction. Once the business objects have finished with the data access component they should set it to nothing. To do this, in their Class_Terminate event the business objects must check to see that they still have a live reference and, when they find one, set it to nothing.

The procedural object can now (perhaps in its Class_Terminate) commit the transaction (if no errors have occurred) and set the data access object's reference to nothing, effectively closing it down. This is the "Do it once, do it right, and forget it forever" principle in action.
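
A minimal sketch of such a procedural object (every class and method name here is invented for illustration):

Private moDAC As DataAccess.Server         'The shared data access component.

Private Sub Class_Initialize()
    Set moDAC = New DataAccess.Server
    moDAC.BeginTransaction                 'One connection, one transaction.
End Sub

Public Sub TakeOrder(ByVal lngCustomerID As Long, ByVal vOrderLines As Variant)
    Dim oCustomer As Business.Customer
    Dim oOrder As Business.Order

    Set oCustomer = New Business.Customer
    Set oOrder = New Business.Order

    'Hand both business objects the same data access component so that their
    'work shares the same connection and therefore the same transaction.
    Set oCustomer.DataAccess = moDAC
    Set oOrder.DataAccess = moDAC

    oCustomer.CheckCreditLimit lngCustomerID
    oOrder.Create lngCustomerID, vOrderLines

    Set oCustomer = Nothing
    Set oOrder = Nothing
End Sub

Private Sub Class_Terminate()
    moDAC.CommitTransaction                'Or RollbackTransaction if an error occurred.
    Set moDAC = Nothing
End Sub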

What About SQL?

SQL (or other DML if the data source is very obscure) has to be kept somewhere. Often it is scattered through a system—perhaps bundled into each business object. This is clearly not a good policy. What if the data source changes in type or structure? What if the type of DML changes? The maintenance burden can be high. At minimum, even if the SQL is easy to find, every business object must be recompiled. The TMS answer is something called a SQL ROOS (Resource Only Object Server). At TMS we use these as central, easy-to-replace-and-modify repositories for business rule SQL, or anything else that is textual (error messages, interface text, and so on). Internally the mechanism used is a very standard Windows one, the resource file (RES) that holds a string table. The ROOS makes calls to this string table, using appropriate offsets to pull out the correct text, and loads the method's parameters for return to a calling business server. Many Windows programs use these resource files, either compiled into them, or often compiled and linked to as separate DLLs. Visual Basic has a restriction in this regard: we can only include one RES file per project. To get around this, and to make string maintenance straightforward without having to recompile a whole application, we put our RES files into separate DLLs, as illustrated in Figure 12-3. In this way many DLLs, and therefore many RES files, can be referenced and used in the same project. See the SQLROOS sample on the companion CD for a simple example. The Commenter add-in also uses a RES file for loading pictures onto Visual Basic's Edit toolbar.

Figure 12-3 Using multiple ROOS's in a system.

For a given request, the SQL ROOS might return the following information:

  1. The SQL statement (including "?" placeholders for parameters)
  2. The number of parameters being used in the SQL statement
  3. The type (input, output, input/output) of each parameter
  4. The kind of query being done (straight select, prepared statement, stored procedure, multiple resultset, insert, update, delete, or other action/batch query)

It makes sense to have a ROOS template project—although some things might change from ROOS to ROOS (mainly the contents of the RES file), the bulk of the code will be exactly the same. Also remember that if you use a ROOS template project, change the base address for each ROOS DLL (and indeed each DLL) you intend to deploy on the same machine. If you have missed this step, check out the DLL Base Address entry on the Compile tab of the Project Properties dialog box, and the associated Help.
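
To make the ROOS idea concrete, here is a minimal sketch of the kind of method one might expose (the offsets and the output parameters are invented; LoadResString simply reads the DLL's own RES string table):

Public Function GetQuery(ByVal lngQueryID As Long, _
                         ByRef lngParamCount As Long, _
                         ByRef lngQueryKind As Long) As String
    Const OFFSET_SQL As Long = 1000        'String table slot holding the SQL text.
    Const OFFSET_PARAMS As Long = 2000     'Slot holding the parameter count.
    Const OFFSET_KIND As Long = 3000       'Slot holding the kind of query.

    GetQuery = LoadResString(OFFSET_SQL + lngQueryID)
    lngParamCount = CLng(LoadResString(OFFSET_PARAMS + lngQueryID))
    lngQueryKind = CLng(LoadResString(OFFSET_KIND + lngQueryID))
End Function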

So when a business object server has to execute one of its methods that requires retrieval of remote data via a business rule, it uses the SQL ROOS to provide that rule (in the form of SQL which typically calls a stored procedure). It then adds any data parameters, and uses the retrieved SQL and the data parameters as parameter arguments on a method call to the data access server. The data access server in turn submits the SQL and returns a notification of success/failure, any returned parameters or a resultset (depending on the kind of operation being undertaken) to the business object server. If necessary, the business object server returns any data to its base client. (See Figure 12-4.)

Figure 12-4 Using a SQL ROOS

Static Lookup Data

Static data presents a challenge and an opportunity (I don't usually go in for management-speak, so I really mean this). The more normalized a data source is, the more joins are likely to be needed in SQL to produce the data for a given instance of a business entity. These kinds of lookup joins can kill performance, and with some data technologies you can reach critical mass when the query becomes too complicated to support. When pulling back an entity it is also common to have to bring back all the possible lookup values and display them as options to the current setting in a ListBox or ComboBox. Ditto when adding a new instance of a business entity.

Since this data is lookup data it is usually static, so it might be an option (especially for lookups that are frequently used throughout the application) to load or preload the data once, locally, on the client, and use this data to provide the lookup options. This also ends the necessity of making all the joins—you are effectively joining in the user interface.

Thus it can sometimes be advantageous for performance to have (largely) static data loaded locally on the client PC in a Jet MDB file database, rather than pulling it across a network every time we need to display it. We have to address the problem of when the static data does change and we need to refresh the static local copy. We could have a central source to check, to compare the state of our data with the central data. This could take the form of a table on an RDBMS, which has a record for each lookup table, held in the data source. When that relatively static data is changed, a date value should be set (indicating the date and time of the change). When the client application starts up, one of its initialization actions should be to check this table and compare each of the lookup table dates with the dates for its own local copies of the lookup tables. If it finds a date difference, it should update its local data.
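
A minimal sketch of that startup check (the table, columns, and helper routines are invented for illustration):

Public Sub RefreshStaleLookups()
    Dim rs As ADODB.Recordset

    Set rs = New ADODB.Recordset
    rs.Open "SELECT TableName, LastChanged FROM LookupTableStatus", _
            "DSN=Sales;UID=app;PWD=secret;", adOpenForwardOnly, adLockReadOnly

    Do While Not rs.EOF
        'GetLocalCopyDate and DownloadLookupTable are hypothetical helpers
        'working against the local Jet MDB copies.
        If rs!LastChanged > GetLocalCopyDate(rs!TableName) Then
            DownloadLookupTable rs!TableName
        End If
        rs.MoveNext
    Loop

    rs.Close
End Sub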

If the data is extremely unlikely to change with great frequency, or if working practices mean that data rarely changes during the working day, it is safe to assume that application startup is frequent enough to check for changed lookup data status. However, there is risk here. It might be necessary to design a mechanism that can check the timeliness of our static data every time we would normally have downloaded it. Even then, there is a performance gain: the more static the data, the higher the gain. We would only download the changed tables, and then perhaps only the changes, not all the lookups.

When you have the static data, where do you put it? Such static data can be stored in an MDB and accessed via DAO code, or in flat files, or now in persistable classes (more on this in a little while). It can be either loaded by code directly into interface controls, or accessed via DAO code and then the resulting resultset attached to a data control, which is in turn bound to interface controls. (Although in this second option, there is a tradeoff of performance, code volume/ease of coding, and resource usage.)

Tiers and Data

Tiered systems built with Visual Basic are essentially systems of class instances provided by a number of physical components. The components can be deployed on a number of machines interacting to produce system behavior. This means the class instances share their data with the system through class interface methods, properties, and events. Figure 12-5 illustrates how data and objects fit together at the simplest level.

Figure 12-5 Objects and their data.

A class is a template of functionality—it is a generic functional wrapper around data variables. It comes to life when you create an instance of it and give it some specific data to hold in its variables. It is possible to have data-only classes composed only of properties. It is equally possible to have functionality-only classes composed entirely of methods. In both cases a developer's first reaction to such classes should be to question their design. However, such classes are the exception rather than the rule.

Objects, State, and Data: The Buxom Server

In an object, the data can ultimately come from the two ends of a tiered system: the user (or some other form of) input, or some persistent data storage (usually a database). Some data is transient, ephemeral, short-lived stuff—the data equivalent of a mayfly. Most, however, requires storage and has persistence. It is typically a design goal of scalable COM-based systems that the objects they are made of are as stateless as possible. In other words, specific instances of a class (and thus specific sets of data) stay "live" and in memory for the briefest time. There are many good reasons for this. Principally this is so that we minimize the impact of idle instantiated objects on server system resources such as CPU and memory. At TMS, for any sizable system we typically would attempt to assess the likely impact of this on a server, as early as possible in the design process (see the scalability example spreadsheet in the samples directory for this chapter).

There is always a compromise to be made between the benefits of preinstantiation and the overhead of idle instances. Consider the case of the Buxom server—a publicly instantiable class provided by an ActiveX EXE server, sitting on a machine (The Bar) remote from its clients (thirsty punters like you and I). Its purpose is to stand as the software equivalent of a bartender in a system. Consider the process of ordering a round of drinks from an instance of the Buxom Server.

Preinstantiation: Your Object Hiring Policy

When a customer walks up to the bar with tongue hanging out, the management can have taken one of two diametrically opposed attitudes to hiring.

Hire, Serve, and Fire Management can see the customer and go out and hire a new Buxom server, because there is someone to serve. This will entail advertising, or perhaps going through a staffing agency; interviewing, hiring, perhaps even training. Eventually, there will be a Buxom server to serve the customer. Imagine the customer is typically English (used to queuing and waiting and therefore uncomplaining), so he is still there when the Buxom server arrives in her new uniform to take the order. The Buxom server serves the customer, and as soon as the cash is in the register, management fires her.

What I've just described is the equivalent of setting the class Instancing property of the Buxom server to SingleUse. This means that each time a customer wants to get a round of drinks from a Buxom server, we let COM handle the request by starting up a new instance of the ActiveX EXE component, which passes the customer an object reference to a new Buxom server instance. The customer uses the server's properties and methods to get a round of drinks, and then sets the object reference to Nothing. The reference to the Buxom server instance drops to zero, COM unloads the class instance and the component that provides it from memory. They are only in memory when a customer uses them. This is good for server resource management, but tough on the customer's patience. This was a very Visual Basic 4 implementation, since the alternative of MultiUse instancing without multithreading led to blocked requests for service.

Preinstantiation: a staff pool Management has estimated (the trouble of course with estimates is that they have a habit of transforming into promises) that a pool of four permanent bartenders will be enough to handle the volume of customers. They may have to be a little flexible and augment the staff with temporary help at peak periods (happy hour?). So they have already hired. Now when a customer comes up to the bar—as long as it isn't a terribly busy time—an instance of a Buxom server says, "Hi, what can I get you?" This time the customer doesn't have to wait for the hiring process to be completed before an order for drinks can be taken.

This is the equivalent of creating a pool manager (did you ever look at that sample?), which holds object references to instances of the Buxom server class and keeps them instantiated when they aren't busy. You can build a separate, generic pool manager, or a multithreaded ActiveX Server that only provides Buxom servers. Typically pool managers start as standalone ActiveX programs, which have public methods for getting a server and a form to keep the pool-managing component itself alive. Or, they use Sub Main in a novel way to provide a message loop, rather than just processing sequentially. Now the Buxom server must have teardown code to clean its properties (its state) after usage, so that when it is kept alive, it can be used by each customer as if it were newly instantiated. This will avoid the large component instantiation overhead, but will place some overhead of idle instances on server resources. However, with good algorithms to manage the pool size dynamically, the overhead can be kept to a minimum, and is at least under control.
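
A minimal sketch of the heart of such a pool manager (class names, pool size, and methods are invented for illustration):

Private mcolIdle As Collection              'Preinstantiated, idle Buxom servers.

Private Sub Class_Initialize()
    Dim i As Integer
    Set mcolIdle = New Collection
    For i = 1 To 4                          'The estimated permanent staff.
        mcolIdle.Add New BuxomServer
    Next i
End Sub

Public Function GetServer() As BuxomServer
    If mcolIdle.Count > 0 Then
        Set GetServer = mcolIdle(1)         'Hand out an idle instance.
        mcolIdle.Remove 1
    Else
        Set GetServer = New BuxomServer     'Temporary help at peak periods.
    End If
End Function

Public Sub ReturnServer(ByVal oServer As BuxomServer)
    oServer.Teardown                        'Clean its state so it looks newly instantiated.
    mcolIdle.Add oServer
End Sub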

This is the stateless basis of MTS systems. MTS, however, doesn't exactly do preinstantiation, but rather keeps instances that have been used but are now idle alive and waiting for customers for a period of time before allowing them to close and drop out of memory. This is how most Visual Basic systems scale without infinite hardware—by minimizing the number of "live" objects in the system at any given time and thus the amount of resources being used. This is also how— from the database developer's point of view—we can minimize numbers of database locks and connections: by combining object pooling and database connection pooling.

State and the Buxom server

Let's look a little closer at the task of ordering a round of drinks—something I feel well qualified to discuss. We go up to the bar and catch the attention of a Buxom server (we have an object reference to her). This is an art form in itself in some bars. We begin the process of ordering, in answer to her polite, "Hi, what can I get you?" We use her OrderDrink method to ask for a pint of Scruttocks Old Dirigible (American translation: a microbrew). She pulls our pint. We wait, since we are synchronously bound. We order a gin and tonic. "Do you want ice and lemon in that?" We don't know—Marmaduke has only just joined the development team. We can probably guess he does, but courtesy and politics dictate that we ask (he's project manager). We tell the Buxom server that we'll check and get back to her. Only thing is, he is sitting in the back room with the others, and the bar is very crowded. We fight our way through, check on his ice and lemon preference (he wants both), and fight our way back. All this time our Buxom server has been idle, consuming system resources but unable to service another order without abandoning her state data.

We carry on with our order: more drinks and some snacks (pork rinds). Then we hand over the money, get our change, and even ask for a tray to carry it all on. We have been interacting with a stateful, conversational server, passing data back and forth. We have been calling methods, or even directly interacting with our Buxom server's properties (I said we'd try and spice up data access at the outset). Since the Buxom server was instantiated on one side of the bar, and we are on the other, we have also been incurring network overhead with each exchange. For long periods one or other of us has also been idle, due to the latency of the network and the crowds in the bar. What's the alternative?

Stateless drinking: maximizing your drinking time

So instead of the unnecessary idling, we implement a single method in the Buxom server, called OrderRound. The OrderRound method carries all the data with it: an indication of the drinks and snacks we want, the Booleans to indicate we want a tray and that there will be no tip (stateless and mean), even the cash. The returns from OrderRound are our round and also some change, we hope. What advantages does this have?

To start, one call and one set of network overhead. Minimal idle time—in fact, we could even make it asynchronous by telling the Buxom server where we're sitting, and she could deliver the drinks (or one of those awful public address systems, firing an event to say our burgers are cooked—there's no table service in most of the places I frequent—far too upscale). As soon as the Buxom server is free, she can dump the state of our order—all taken in as parameters on one method anyway, so no global data required—and free herself up for the next customer. Such a system will scale better, since a single Buxom server ought to be able to turn around more orders in a given period of time.
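
What might OrderRound look like? A minimal sketch (the parameter shapes and the helper routines are invented for illustration):

Public Function OrderRound(ByVal vDrinks As Variant, _
                           ByVal vSnacks As Variant, _
                           ByVal bWantTray As Boolean, _
                           ByVal bNoTip As Boolean, _
                           ByVal curCash As Currency, _
                           ByRef curChange As Currency) As Variant
    'Everything needed to complete the order arrives in one call, so no state
    'has to be held on the server between calls on behalf of this customer.
    curChange = curCash - PriceOf(vDrinks, vSnacks, bWantTray)   'PriceOf is hypothetical.
    If Not bNoTip Then curChange = curChange - 0.5               'A token tip (illustrative).
    OrderRound = AssembleRound(vDrinks, vSnacks)                 'As is AssembleRound.
End Function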

This is the ultimate statelessness. This is what MTS systems crave (using SetComplete to indicate order completion), although it can be an awkward way of thinking for some developers. It is also what most database access techniques would ideally like to achieve. In truth, most of the time we will be somewhere in between highly stateful and stateless, but at least we know what to push toward in our design.

So the responsibilities of a data access strategy or design don't begin and end at getting data out and into the database; they have to include the dataflow (now there's an IT system term you don't hear so much anymore) around the system. If that system is composed of a number of components spread across a network, an intranet, or even the Internet, this dataflow aspect can take on a great deal of significance, since it is likely to be where performance is won or lost.

Still Not Sure What Data Is Where Objects Are Concerned?

So what is data? (Nothing like getting the fundamentals out of the way early—then there's lots of time for beer.) Funny how, on a software project, if you don't get the basics right the first time, they come back to haunt you (and steal your drinking time) later in the development cycle. In the context of this chapter, Data is sadly not a character from Star Trek—the Next Generation (I know, you don't get the time to watch it, you're debugging at the moment), but variables holding specific values. Anyone reading this who has already worked on a tiered system will know that, should you ever come to physically distribute your tiers of objects—that is to have cooperating objects working across more than one machine to constitute the system—you are faced with an object-or-data, reference-or-variable dilemma. Which should you use, raw variables or object references (and through them properties and methods)? To the purist there is no contest—use the objects. But we are pragmatists. Our jobs are on the line if excellent theory produces appalling performance, so we have to think a little before jumping.

Data—the Currency of an Object System

The flow of data from object to object in a distributed system is the currency of the system. It's the oil that turns the cogs of functionality. Where is an object? An object is instantiated on a given machine, and that is always where it stays (well, almost always—see "Remoting" later). One object manipulates another via an object reference, making calls on that object reference to its properties and methods. Those properties and methods—the object's interface—provide access to the object's uniqueness (its data). This is a paradigm much easier to grasp and use than an API (application programming interface) is. The object reference is essentially a pointer (4 bytes) to the object. The object doesn't move; therefore referring to an in-process object is fast, an out-of-process object is slower, and a cross-machine object much, much slower. If you want to assess the difference for yourself, the TimeIt 3 application (see the TimeIt subdirectory in the samples directory for this chapter) has some test cases that will allow you to try this, or you can add your own. You can even try the Application Performance Explorer that ships with Visual Basic. In the cross-machine scenario, the pain in the butt is that each call to a property or a method is a piece of traffic across the network (consider also the effect of ByVal and ByRef choices for parameters on methods, as illustrated in the table below). This is going to slow your application down.

Location of component Type of argument Recommendation
In-process Large strings (not modified) Use ByRef because it uses a pointer rather than creating a copy and passing it.
Out-of-process Large strings (not modified) Use ByVal because ByRef data is copied into the component's address space and then copied back again when the call returns, which doubles the network traffic.
Out-of-process Object reference Use ByVal because this means you have one-way cross-process marshalling, which is fast.
Out-of-process Object reference that is going to be replaced Use ByRef so that the original reference can be changed.
Out-of-process Most things Use ByVal—if the parameter is declared ByRef, you can't avoid the marshalling overhead by parenthesizing the argument or using ByVal when calling the method.

Table 12-2 Talking with components: ByVal or ByRef ?

Discussions of how to make a system faster also throw an interesting sidelight on the evolution of a programming language. In early versions of a language, your biggest goal is to make it work at all. Often there are only one or two ways to achieve the functionality in the early versions of a language, and to achieve those you have to put in some hours. There are lots of clearly wrong answers and a few clearly right answers. So if it works, you must have built it right. Then features are added to the language. These sometimes add functionality but more often diversify—more ways to skin the same old cats.

Developers are diverse too, and they react differently to the widening of their choices. Some stay with the same old tried, trusted, and understood ways. Others try every new feature, compare each to the old ways, and make decisions about how and when they will use them. Still others read a bit and wait to be directed by industry opinion. It is hard because Visual Basic has grown exponentially. But enough of sidetracks and musings.

How does this affect the data access strategy task we've set ourselves? Well, unless you are very lucky, your data source will be large, shared, on a different machine, and dangerous. You can create your data-retrieving objects locally (on the user's client machine, the sink client) or remotely via a component on the server. The old computer adage, "do the work near the data," may influence you here, as will whether you have middle-tier objects out on the network. If you have remote objects, you have to consider whether to pass object references or variables, or something else.

Remoting

Efficient passing of data between processes, and particularly across machine boundaries, is critical to the success of scalable, distributed, tiered Visual Basic systems. We've already (I hope) established that passing data in variables that are grouped together on methods—rather than lots of property touching—is the way to go. We also looked at when to use ByVal and ByRef on parameters. Now we have to look at how we can get the most efficient performance out of remoting. Marshalling data across machines and process boundaries attracts a heavy penalty from COM, so in Visual Basic 4 and 5 a few ways established themselves as favored. They were used particularly for passing large amounts of data, which would otherwise have ended up being collections of objects. Let's remind ourselves of those ways first.

Variant arrays and GetRows An efficient way of passing larger amounts of data across process and machine boundaries is as Variant arrays. It is possible to make your own arrays using the Array function. But Variant arrays gained in popularity mainly because they were coded directly into the data access interfaces that most Visual Basic developers use. GetRows exists on RDO's rdoResultset objects and both DAO's and ADO's Recordset objects. The ADO version is more flexible than the DAO and RDO implementations—check out its extra parameters. Using GetRows in the data access code looks like this:

Public Function GiveMeData() As Variant
    Dim rsfADO  As New ADODB.Recordset  'The Recordset object
    Dim vResult As Variant              'to put data in.

    'strSQL and strCon are assumed to be set up elsewhere (module-level constants, say).
    rsfADO.Open strSQL, strCon   'Open a connection.
    vResult = rsfADO.GetRows     'Park the results in a Variant array.
    GiveMeData = vResult         'Send the Variant array back.
End Function

Finally the Variant array reaches the user interface (having perhaps passed through a number of hands, classes, and components) and is used.

Public Sub GetData()
    Dim nRowCount As Integer    'A row count holder.
    Dim nColCount As Integer    'A column count holder.
    Dim sCurrentRow As String   'A string to build each row in.
    Dim vData As Variant        'A Variant to hold our data.

    'Call the server and get back a Variant array.
    'oDataServer is assumed to be a reference to the data server class.
    vData = oDataServer.GiveMeData
    'For each row (rows are dimension 2 of the array).
    For nRowCount = LBound(vData, 2) To UBound(vData, 2)
        sCurrentRow = "" 'Initialize the current row.
        'For each column in the row.
        For nColCount = LBound(vData, 1) To UBound(vData, 1)
            'Add the column data to the current row.
            sCurrentRow = sCurrentRow & Space(2) & _
                CStr(vData(nColCount, nRowCount))
        Next nColCount
        'Add the row to a list box - or use it somehow.
        lstDataRows.AddItem sCurrentRow
    Next nRowCount
End Sub

In each case, GetRows creates a two-dimensional array, which it puts into a Variant. The first array subscript identifies the column and the second identifies the row number. The only downside of GetRows is that, when you have grown used to manipulating collections of objects and accessing their data through their properties, going back to arrays feels clumsy. It is a specific and increasingly unusual implementation. As a result, at TMS we often use a DLL that turns a two-dimensional Variant array into either a recordset object again (either our own implementation, or less frequently one of the standalone implementations available in ADO and RDO) on the client-side, purely for ease of coding. (See the Gendata and DataArray projects in the samples directory for this chapter. These projects use GetRows with ADO, and a custom implementation of the recordset.)

UDTs and LSet Some people (I was not among them) so mourned the fact that a user-defined type (UDT) could not be passed as a parameter to a class method that they came up with work-arounds, to which they became devoted. The UDT has its Visual Basic roots in file access, particularly where files were structured as records.

Imagine a UDT for an Account record:

Public Type AccountStruct
    AccountID As Long
    Balance As Currency
    Status As AccountStatus
    AccountTitle As String
    OverdraftAmount As Currency
    AvailableFunds As Currency
End Type

Since a UDT cannot be passed as a parameter, it has first to be converted. Therefore, a corresponding user-defined type that only has one member, but is the same size as the original is defined:

Type Passer
    Buffer As String * 36
End Type

Caution


Beware, when sizing, that in 32-bit Windows strings are Unicode: each character takes 2 bytes. So if you need an odd number of bytes (such as 17, because you have a Byte member in the original type), you will actually end up with a spare byte.

We then copy the original user-defined type into this temporary one. We can do this with LSet. In the Client component we have:

Dim pass As Passer
Dim oUDTer As udtvb5er

Set oUDTer = New udtvb5er
'myorig is assumed to be an AccountStruct variable populated earlier.
LSet pass = myorig
oUDTer.GetUDT pass.Buffer

In the server we need the same user-defined types defined, along with the following method in our udtvb5er class:

Public Sub GetUDT(x As String)
    Dim neworig As AccountStruct
    Dim newpass As Passer

    newpass.Buffer = x
    LSet neworig = newpass
    MsgBox neworig.AccountID & Space(2) & Trim(neworig.AccountTitle) _
        & Space(2) & "Current Balance: " & CStr(neworig.Balance)
End Sub

To do this we are copying from one type to the other, passing the data as a string, and then reversing the process at the other end. However, as Visual Basic's Help warns, using LSet to copy a variable of one user-defined type into a variable of a different user-defined type is not recommended. Copying data of one data type into space reserved for a different data type can have unpredictable results. When you copy a variable from one user-defined type to another, the binary data from one variable is copied into the memory space of the other, without regard for the data types specified for the elements.

Note
There is an example project in the code samples for this chapter, called UDTPass.vbg. This is not a recommendation, merely a recap of what was possible!

Visual Basic 6 and remoting

With a few exceptions, this chapter has up until now dealt in long-term truths rather than cool new Visual Basic 6 features. I haven't done this to hide things from you deliberately. However, passing data from one thing to another is an area where Visual Basic 6 has added quite a few new features, and it's pretty confused in there at the moment. I've attempted to sort them out for you here.

New ways of passing variables (arrays and UDTs) Visual Basic 6 has made it possible to return arrays from functions. In the spirit of variable passing, rather than property touching, this is likely to be of some help. The new syntax looks like this in the function you are calling:

Public Function GetIDs() As Long()
    Dim x() As Long
    Dim curAccount As Account
    Dim iCount As Integer

    ReDim x(1 To mCol.Count)
    iCount = 1
    For Each curAccount In mCol
        x(iCount) = curAccount.AccountID
        iCount = iCount + 1
    Next curAccount
    GetIDs = x
End Function

From the client's view it is like this:

Private Sub cmdGetIDs_Click()
    Dim x() As Long
    Dim i As Integer

    x() = oAccounts.GetIDs
    For i = LBound(x()) To UBound(x())
        lstIDs.AddItem CStr(x(i))
    Next i
    lstIDs.Visible = True
End Sub

This example has been included in the project group UDTGrp.vbg in the samples directory for this chapter.

Visual Basic 6 has also added the ability to have public UDTs and to pass them between components. Thus a UDT structure such as we looked at earlier:

Public Type AccountStruct
    AccountID As Long
    Balance As Currency
    Status As AccountStatus
    AccountTitle As String
    OverdraftAmount As Currency
    AvailableFunds As Currency
End Type

can be passed back and forth between client and server. In the server class's code it looks like this:

Public Function GetallData() As AccountStruct
    Dim tGetallData As AccountStruct

    tGetallData.AccountID = AccountID
    tGetallData.Balance = Balance
    tGetallData.Status = Status
    tGetallData.AccountTitle = AccountTitle
    tGetallData.OverdraftAmount = OverdraftAmount
    tGetallData.AvailableFunds = AvailableFunds
    GetallData = tGetallData
End Function

Public Sub SetAllData(tAccount As AccountStruct)
    AccountID = tAccount.AccountID
    Balance = tAccount.Balance
    Status = tAccount.Status
    AccountTitle = tAccount.AccountTitle
    OverdraftAmount = tAccount.OverdraftAmount
End Sub

and called like this from the client:

Dim oAccount As Account
Dim tAccount As AccountStruct

For Each oAccount In oAccounts
    tAccount = oAccount.GetallData
    lstAccounts.AddItem tAccount.AccountID & _
        Space(4 - (Len(CStr(tAccount.AccountID)))) & _
        tAccount.Balance & Space(3) & tAccount.OverdraftAmount
Next oAccount
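Going the other way, pushing a locally modified structure back into the server object is just as direct. A minimal sketch, reusing the names above (the new overdraft value is purely illustrative):

tAccount.OverdraftAmount = 500
oAccount.SetAllData tAccount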

Remoting ADO recordsets

The combination of ADO and the Remote Data Service (RDS) client-side library—intended for speedy, lightweight, disconnected data access for Web applications—can be particularly useful for any distributed client/server system, regardless of its user interface type. The client needs a reference to the Microsoft ActiveX Data Objects Recordset 2.0 Library, while the server has a reference to the Microsoft ActiveX Data Objects 2.0 Library. At its simplest, the client code looks like this:

Private oDD As DataDonor

Private Sub cmdGetData_Click()
    Dim oRS As ADOR.Recordset

    Set oDD = New DataDonor
    Set oRS = oDD.GiveData
    Set oDD = Nothing
    Set oDataGrid.DataSource = oRS
End Sub

While in the server component the DataDonor Class's code looks like this:

Public Function GiveData() As ADODB.Recordset
    'A very boring query we can use on any SQL Server.
    Const strSQL As String = "SELECT * FROM authors"

    'We'll use this DSN.
    Const strCon As String = _
        "DSN=pubsit;UID=lawsond;PWD=lawsond;" & _
        "DATABASE=pubs;APP=DudeCli;"

    Dim ors As New ADODB.Recordset

    ors.LockType = adLockBatchOptimistic
    ors.CursorLocation = adUseClient
    ors.CursorType = adOpenStatic
    ors.Open strSQL, strCon

    Set ors.ActiveConnection = Nothing
    Set GiveData = ors
End Function

In order to create a disconnected recordset, you must create a Recordset object that uses a client-side cursor that is either a static or keyset cursor (adOpenStatic or adOpenKeyset) with a lock type of adLockBatchOptimistic.

(This example is in the samples directory for this chapter, as RemRset.vbg).

If you return a disconnected recordset from a function, either as the return value or as an output parameter, the recordset copies its data to its caller. If the caller is in a different process, or on a different machine, the recordset marshals the data it is holding to the caller's process. In doing so it compresses the data to avoid occupying substantial network bandwidth, which makes it an ideal way to send large amounts of data to a client machine. Remoting ADO recordsets is well worth doing.

The end result is a recordset that has been instantiated on the server and then physically passed down to the client. It is no longer connected to the database at all, but it can still be used as a recordset, and if changes are made to it, the recordset can be passed back to a server and reassociated with a database connection so that the updates take effect. Because the data travels down in one lump, rather than a network round-trip occurring each time a column or field is referenced for its data, the penalty of continued network overhead is avoided, and this approach is a strong contender as a replacement for passing Variant arrays.

Returning changes from a disconnected recordset: batch updating

When a recordset is disconnected and has been remoted to a different machine, it is possible to make changes to it using its AddNew, Update, and Delete methods. In fact, it is one of the few times when it makes sense to use these methods on a cursor, since we are not using up the normal resources or actually talking to the database. When you have finished changing things, you pass the recordset back to a component that has a live connection to the database. That component uses the UpdateBatch method to apply all your changes in a single batch.

Public Sub ReconUpdate(rs As ADODB.Recordset)
    Dim conn As ADODB.Connection

    Set conn = New ADODB.Connection
    conn.Open "DSN=Universe"
    Set rs.ActiveConnection = conn
    rs.UpdateBatch
End Sub

Do not fear that a score of other untrustworthy users may have updated the same records as you! Just as with batch updating in RDO, you have all the tools you need to sort out collisions, if they occur. However, beware if you are expecting to marshal a recordset from the middle tier to the client machine to resolve any conflicts there. The three versions of the data (the value, the original value, and the underlying value) are not all marshaled, so you need to do some work yourself. (See Q177720 in the Knowledge Base for more on this.)
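The following is only a sketch of the kind of collision handling involved on the component with the live connection. The filter and resync constants are standard ADO 2.0 constants, but the surrounding code is illustrative rather than part of the sample projects:

On Error Resume Next    ' UpdateBatch raises an error if any rows conflict.
rs.UpdateBatch
On Error GoTo 0

' Narrow the recordset down to just the rows that collided with
' someone else's changes.
rs.Filter = adFilterConflictingRecords
If Not rs.EOF Then
    ' Refresh the underlying values so that each conflicting row can be
    ' inspected and resolved before trying again.
    rs.Resync adAffectGroup, adResyncUnderlyingValues
End If
rs.Filter = adFilterNone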

Creatable ADO recordsets

ADO recordsets can act as a standard interface, even when data is not being accessed from a database using an OLE DB provider, since ADO recordsets can be created independently and filled with data by direct code manipulation. Here is the server code for this in a class:

Private rs As ADODB.Recordset

Private Sub Class_Initialize()
    Dim strPath As String, strName As String
    Dim i As Integer

    ' Create an instance of the Recordset.
    Set rs = New ADODB.Recordset
    ' Set the properties of the Recordset.
    With rs
        .Fields.Append "DirID", adInteger
        .Fields.Append "Directory", adBSTR, 255
        .CursorType = adOpenStatic
        .LockType = adLockOptimistic
        .Open
    End With
    ' Loop through the directories and populate
    ' the Recordset.
    strPath = "D:\"
    strName = Dir(strPath, vbDirectory)
    i = 0
    Do While strName <> ""
        If strName <> "." And strName <> ".." Then
            If (GetAttr(strPath & strName) And _
                vbDirectory) = vbDirectory Then
                i = i + 1
                With rs
                    .AddNew
                    .Fields.Item("DirID") = i
                    .Fields.Item("Directory") = strName
                    .Update
                End With
            End If
        End If
        strName = Dir
    Loop
    ' Return to the first record.
    rs.MoveFirst
End Sub

This code is in the DataAware.vbp sample project in the samples directory for this chapter.
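The sample class keeps the fabricated recordset in a private module-level variable, so a client needs some way of getting at it. A minimal sketch, assuming we add a simple read-only property to the class (the property name Directories and the class name DirLister are illustrative, not part of the sample):

' In the class: hand the fabricated recordset out to callers.
Public Property Get Directories() As ADODB.Recordset
    Set Directories = rs
End Property

' In the client: walk it exactly as if it had come from a database.
Dim oLister As New DirLister

With oLister.Directories
    Do While Not .EOF
        Debug.Print .Fields("DirID").Value, .Fields("Directory").Value
        .MoveNext
    Loop
End With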

Persisting recordsets

ADO Recordset objects can be saved to a file by using their Save method. This can be valuable if you need to store data for longer than a single run of a program but cannot do so in a data source. Imagine a user who has made changes to a recordset and then cannot reconnect to a data source such as a remote database. Persisting data can also be useful for a disconnected recordset, since the connection and the application can be closed while the recordset remains available on the client computer. The code for persisting a Recordset object looks like this:

rsDudes.Save "c:\tms\dudestuff.dat", adPersistADTG 

and to get the data back out of the file again, the following code would do it:

rsDudes.Open " c:\tms\dudestuff.dat " 

Files and persistable classes

Visual Basic 6 gives classes a capability that some other ActiveX instantiables (such as ActiveX documents and User Controls) have had for a version already: the capability to persist their properties through the PropertyBag object. This allows us to store a class's properties between instances. The Persistable property, in conjunction with the PropertyBag object, lets a class instance be persisted almost anywhere: a file, the Registry, a database, a word-processor document, or a spreadsheet cell.

Why persist? Most components have properties; one of the great annoyances of Visual Basic's Class_Initialize event procedure is that you can't get parameters into it. Typically Class_Initialize is used to set up default values for a class instance. The default values you use are frozen in time when you compile the component, unless you use something like INI settings, Registry entries, files, or command line arguments to vary them. This is where the Visual Basic 6 class's Persistable property comes in, allowing you to save a component's values between runs. To be persistable, a class has to be public and createable. When you set a class's Persistable property to Persistable, three new events are added to the class: ReadProperties, WriteProperties, and InitProperties. Just like in an ActiveX User Control, you can mark a property as persistable by invoking the PropertyChanged method in a Property Let or Property Set procedure, as in the following example:

Private mBattingAverage As Double

Public Property Let BattingAverage(newAverage As Double)
    mBattingAverage = newAverage
    PropertyChanged "BattingAverage"
End Property

Calling the PropertyChanged method marks the property as dirty. The WriteProperties event will fire when the class is terminated if any property in the class has called PropertyChanged. Then we use the events and the PropertyBag object almost the same way as in a User Control.
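To make that concrete, here is a minimal sketch of what those event procedures might look like in the Batsman class; the default value of 0 is an assumption made for illustration:

Private Sub Class_InitProperties()
    ' Fires the first time an instance is created, when there is no saved state.
    mBattingAverage = 0
End Sub

Private Sub Class_WriteProperties(PropBag As PropertyBag)
    ' Fires at termination if PropertyChanged has marked anything as dirty.
    PropBag.WriteProperty "BattingAverage", mBattingAverage, 0
End Sub

Private Sub Class_ReadProperties(PropBag As PropertyBag)
    ' Fires when an instance is re-created from a saved PropertyBag.
    mBattingAverage = PropBag.ReadProperty("BattingAverage", 0)
End Sub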

There is a twist, however: we need a second instance of a PropertyBag object, so that when the object goes away and takes its own PropertyBag with it, there is still a persisted set of properties. The following code shows persisting an object to a text file, but remember that objects can be persisted wherever you like, even in a database:

Private pb As PropertyBag      ' Declare a PropertyBag object.
Private oBatsman As Batsman    ' Declare a Cricketer.

Private Sub Form_Unload(Cancel As Integer)
    Dim varTemp As Variant

    ' Instantiate the second PropertyBag object.
    Set pb = New PropertyBag
    ' Save the object to the PropertyBag using WriteProperty.
    pb.WriteProperty "FirstManIn", oBatsman
    ' Assign the Contents of the PropertyBag to a Variant.
    varTemp = pb.Contents
    ' Save to a text file.
    Open "C:\tms\FirstBat.txt" For Binary As #1
    Put #1, , varTemp
    Close #1
End Sub

The Contents property of the PropertyBag object contains the Batsman object stored as an array of bytes. In order to save it to a text file, you must first convert it to a data type that a text file understands—here, that data type is a Variant.

Depersisting an object

Once our Batsman is contained inside a text file (or any other type of storage), it is easy to send him wherever we want. We could take the FIRSTBAT.TXT file and send it, with perhaps an entire cricket scorecard, to our newspaper's office to be incorporated into a report of the day's play. The code to reuse the Batsman object would look something like this:

Private pb As PropertyBag     ' Declare a PropertyBag object.
Private oBatsman As Batsman   ' Declare a Batsman object.

Private Sub Form_Load()
    Dim varTemp As Variant
    Dim byteArr() As Byte

    ' Instantiate the PropertyBag object.
    Set pb = New PropertyBag
    ' Read the file contents into a Variant.
    Open "C:\tms\FirstBat.txt" For Binary As #1
    Get #1, , varTemp
    Close #1
    ' Assign the Variant to a Byte array.
    byteArr = varTemp
    ' Assign to the PropertyBag Contents property.
    pb.Contents = byteArr
    ' Instantiate the object from the PropertyBag.
    Set oBatsman = pb.ReadProperty("FirstManIn")
End Sub

It isn't the same object being created in one place and reused in another; it is an exact copy of the object. This ability to "clone," or copy, an object for reuse offers a lot of potential.


