Creating a Data Factory


Simply put, a data factory is a class that exposes methods and properties that abstract not only the provider used, but also common operations such as the instantiation and population of connection, command, and parameter objects. In this way, the data factory abstracts as much of the ADO.NET code as possible from the rest of the application.

Note

MSDN recently published an article that described a data factory for SQL Server called a "Data Access Application Block." The primary difference between that approach and the one described today is that the Data Access Application Block is SQL Server-specific and is therefore neither provider nor database independent.


Obviously, this approach has several benefits, foremost among them being the reduction in the amount of code that you have to write and the ability to dynamically decide which provider to use. However, in addition to those benefits, the data factory can be designed to include second-level benefits such as

  • Database Independence. Not only is it possible to abstract the provider by using interface-based programming, but a data factory can also abstract all the database-specific syntax from your application by introducing the concept of "statements" that map to particular database-specific SQL statements.

  • Caching. After the data factory has created a particular command object and its parameters based on a logical statement, the factory can then cache it, thereby allowing it to be reused. In this way, performance is increased because the data factory never has to create the same command object twice.

To illustrate the functionality that the data factory provides to your data access classes, Table 18.1 includes a description of each of its public methods and properties and their purposes. The remainder of today will be devoted to looking at a couple of key aspects of the factory, including how the provider and statement are abstracted and how caching is implemented.

Tip

Because the entire data factory class is more than 1,000 lines of code, not all of it can be shown here. To get the entire class, download the code files associated with this book from the Sams Web site at www.samspublishing.com.


Table 18.1. Public Members of the Data Factory Class

Properties
  CacheFilePath - Specifies the path where the statement files can be found
  Connection - Returns the connection object
  Provider - Specifies the provider to use
  UseCache - Specifies whether to cache any new statements and whether to use those already in the cache

Methods
  BeginTransaction - Starts a transaction and returns it
  CreateDataAdapter - Creates and returns a data adapter based on the statements passed to it
  CreateSqlFiles - Overloaded; shared method that creates XML statement files asynchronously for all stored procedures in a SQL Server database
  CreateSqlFile - Shared method that creates an XML statement file for a particular SQL Server stored procedure
  ExecuteDataReader - Overloaded; returns a data reader based on the statement passed to it and optionally can use transactions and command behaviors
  ExecuteNonQuery - Overloaded; executes a statement and optionally uses a transaction and returns a return value and output parameters
  ExecuteScalar - Overloaded; executes a statement and returns the value; optionally returns the return value and uses a transaction
  ExecuteSqlXmlReader - Executes a statement that uses the FOR XML functionality of SQL Server; throws an exception if the instance isn't using SqlClient
  GetDataSet - Overloaded; returns a DataSet based on the statement and optionally uses a transaction
  GetDataTable - Overloaded; returns a DataTable based on the statement and optionally uses a transaction
  GetProviders - Shared method that returns an array containing the intrinsically supported providers
  GetXsltResults - Transforms the results of the statement using the given stylesheet
  RemoveStatement - Shared method that removes a particular statement from the internal cache
  SyncDataSet - Synchronizes the given DataSet using the given data adapter

Events
  SqlFilesCreated - Raised when the asynchronous CreateSqlFiles method completes
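
To give a sense of how these members fit together before diving into the implementation, the following sketch shows a data access method wrapping a statement in a transaction. The statement name, parameter names, and the exact overloads of BeginTransaction and ExecuteNonQuery are assumptions for illustration only; see the downloadable code for the actual signatures.

' Hypothetical usage sketch; statement name, parameters, and overloads are assumed
Dim df As New DataFactory(_connect, "SqlClient")
df.CacheFilePath = New DirectoryInfo(".")

Dim trans As IDbTransaction = df.BeginTransaction()
Try
  Dim parms As New HybridDictionary()
  parms.Add("isbn", "06723238691")
  parms.Add("quantity", 2)
  ' Execute a logical statement within the transaction (assumed overload)
  df.ExecuteNonQuery("SaveOrderItem", parms, trans)
  trans.Commit()
Catch e As Exception
  trans.Rollback()
  Throw
End Try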

The data factory class itself is written in VB. Because it will be used by the entire ComputeBooks organization, it is placed in the ComputeBooks.Data namespace. The class is also sealed (marked as NotInheritable) and imports several namespaces in addition to those that are imported by default in all VB projects. Listing 18.1 shows the resulting declaration of the class.

Listing 18.1 The DataFactory class. This class is used internally in data access classes to abstract much of the ADO.NET code that must be written.
Option Strict On

Imports System.Collections.Specialized
Imports System.Xml
Imports System.Xml.Xsl
Imports System.IO
Imports System.Data.SqlClient
Imports System.Data.OleDb
Imports System.Reflection
Imports System.Data.Common

Namespace ComputeBooks.Data

  Public NotInheritable Class DataFactory

      ' Public and private members

  End Class

End Namespace

Abstracting the Provider

As with the ProviderFactory class discussed yesterday, one of the main jobs of the data factory is to abstract the provider. However, the DataFactory class does this using a different technique.

The basic approach used is that when an instance of the DataFactory is created, it is passed the connection string and optionally the provider in the constructor (the provider may also be set using the Provider property). This triggers the _createProviderTypes private method that creates five Type objects for the particular provider specified. These Type objects represent the connection, command, data reader, parameter, and data adapter, and are used to dynamically instantiate the provider-specific types when needed. In fact, the method then also creates an instance of the connection object to be exposed through the Connection property.

The interesting aspect of _createProviderTypes, however, is that if the Provider property isn't set to "SqlClient" or "OleDb," it will look for an XML file called DataFactory.config and read other provider information it contains. For example, if the DataFactory were going to use a custom provider created by ComputeBooks, the DataFactory.config file would look as shown in Listing 18.2.

Listing 18.2 Configuring the DataFactory class. This file can be used to specify additional providers that the class supports.
<?xml version="1.0" encoding="utf-8" ?>
<DataFactory>
  <Provider name="Cbks" assembly="ComputeBooksData.dll">
    <Connection type="ComputeBooks.Data.CbksConnection" />
    <Command type="ComputeBooks.Data.CbksCommand" />
    <Parameter type="ComputeBooks.Data.CbksParameter" />
    <DataAdapter type="ComputeBooks.Data.CbksDataAdapter" />
    <DataReader type="ComputeBooks.Data.CbksDataReader" />
  </Provider>
</DataFactory>

Listing 18.3 shows the complete _createProviderTypes method.

Listing 18.3 Creating provider types. This private method creates the provider-specific Type objects used by the class.
Private Sub _createProviderTypes()
  ' Provider and connection string are set so instantiate the connection object
  ' May need to read from the XML file
  Dim xmlConfig As New XmlDocument()
  Dim provNodes As XmlNodeList
  Dim prov As XmlNode
  Dim provAssembly As [Assembly]

  Select Case Me.Provider
    Case "SqlClient"
      _conType = GetType(SqlConnection)
      _comType = GetType(SqlCommand)
      _drType = GetType(SqlDataReader)
      _daType = GetType(SqlDataAdapter)
      _parmType = GetType(SqlParameter)
      _paramtypes = _sqlParamTypes
    Case "OleDb"
      _conType = GetType(OleDbConnection)
      _comType = GetType(OleDbCommand)
      _drType = GetType(OleDbDataReader)
      _daType = GetType(OleDbDataAdapter)
      _parmType = GetType(OleDbParameter)
      _paramtypes = _oledbParamTypes
    Case Else
      _paramtypes = _otherParamtypes
      ' Load a provider dynamically
      Try
        xmlConfig.Load("DataFactory.config")  ' Relative path
        provNodes = xmlConfig.GetElementsByTagName("Provider")
        For Each prov In provNodes
          If prov.Attributes("name").Value = Me.Provider Then
            ' Load the assembly
            _provAssembly = prov.Attributes("assembly").Value
            provAssembly = [Assembly].LoadFrom(_provAssembly)
            ' Load the data provider types
            If Not prov.SelectSingleNode("Connection") Is Nothing Then
              _sconType = prov.SelectSingleNode( _
                "Connection").Attributes("type").Value
              _conType = provAssembly.GetType(_sconType, True, True)
            End If
            If Not prov.SelectSingleNode("Command") Is Nothing Then
              _scomType = prov.SelectSingleNode( _
                "Command").Attributes("type").Value
              _comType = provAssembly.GetType(_scomType, True, True)
            End If
            If Not prov.SelectSingleNode("DataAdapter") Is Nothing Then
              _sdaType = prov.SelectSingleNode( _
                "DataAdapter").Attributes("type").Value
              _daType = provAssembly.GetType(_sdaType, True, True)
            End If
            If Not prov.SelectSingleNode("DataReader") Is Nothing Then
              _sdrType = prov.SelectSingleNode( _
                "DataReader").Attributes("type").Value
              _drType = provAssembly.GetType(_sdrType, True, True)
            End If
            If Not prov.SelectSingleNode("Parameter") Is Nothing Then
              _sparmType = prov.SelectSingleNode( _
                "Parameter").Attributes("type").Value
              _parmType = provAssembly.GetType(_sparmType, True, True)
            End If
          End If
        Next
      Catch e As Exception
        _throwException( _
          "Could not load the provider, check the DataFactory.config file", e)
      End Try
  End Select

  ' Create an instance of the connection object
  Try
    If Not _conType Is Nothing Then
      _connection = CType(Activator.CreateInstance(_conType, _
        False), IDbConnection)
      _connection.ConnectionString = _connect
    End If
  Catch e As Exception
    _throwException("Could not create connection object.", e)
  End Try
End Sub

You'll first note in Listing 18.3 that a simple Select Case statement is used if the provider is set to either "SqlClient" or "OleDb." If so, the GetType method is used to simply return the types and store them in private Type variables. However, in the Case Else block, an XmlDocument object is used to load the DataFactory.config file into memory, where each Provider element is traversed to see whether it matches the Provider property. If so, the shared LoadFrom method of the System.Reflection.Assembly class is invoked to load the assembly that contains the provider by creating an Assembly object.

Note

You'll notice in Listing 18.3 that the Assembly class is referenced using brackets [] both in the declaration and when calling the shared method. This is required in VB because Assembly is also a reserved keyword.


Next, each of the five Type objects is populated by navigating to the appropriate XmlNode in the document and using the GetType method of the Assembly object. If the DataFactory.config file is in error, the private _throwException method is called. As you probably guessed, the variables such as _conType and _sconType are declared at the class level as Type and String objects, respectively.

After the types have been specified, the connection object is created and its connection string is set using the shared CreateInstance method of the System.Reflection.Activator class. Note that the connection object is cast to the IDbConnection interface using the CType method because even though the actual object is of type SqlConnection , for example, the connection will be referenced by the interface. This is the technique that is used throughout the data factory class to dynamically create provider objects.
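
The private _throwException method called here (and throughout the listings that follow) isn't included in this excerpt. A minimal sketch, assuming it simply wraps the error in the DataFactoryException type that the data access classes catch later in Listing 18.12, might look like the following; the constructor signature of DataFactoryException is an assumption.

Private Sub _throwException(ByVal message As String, ByVal innerException As Exception)
  ' Sketch only: wrap the error in the factory's own exception type
  ' (the real DataFactoryException constructor may differ)
  Throw New DataFactoryException(message, innerException)
End Sub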

Handling Data Types

You've probably also noticed in Listing 18.3 that the _paramTypes variable is populated for each provider. For example, if SqlClient is the provider, the variable is set to _sqlParamTypes. If OleDb is the provider, the variable is set to _oledbParamTypes, and if there is a custom provider, the variable is set to _otherParamTypes. Each of these three variables refers to a shared Hashtable that maps the names of data types to the values they correspond to in the SqlDbType, OleDbType, and DbType enumerations, respectively. All four of the variables are declared at the class level as follows:

Private Shared _sqlParamTypes As New Hashtable()
Private Shared _oledbParamTypes As New Hashtable()
Private Shared _otherParamtypes As New Hashtable()
Private _paramtypes As Hashtable

The Hashtable objects themselves are populated in the shared constructor of the class, a small snippet of which is shown in Listing 18.4.

Listing 18.4 Referencing data types. The shared constructor populates the Hashtable objects that map the names of data types to the values in the enumerations.
Shared Sub New()
  ' Setup the types used by the providers
  _sqlParamTypes.Add("bigint", SqlDbType.BigInt)
  _sqlParamTypes.Add("bit", SqlDbType.Bit)
  _sqlParamTypes.Add("varchar", SqlDbType.VarChar)
  _sqlParamTypes.Add("char", SqlDbType.Char)
  _sqlParamTypes.Add("string", SqlDbType.NVarChar)
  _sqlParamTypes.Add("integer", SqlDbType.Int)
  ' [Others omitted]

  ' OleDb types
  _oledbParamTypes.Add("BigInt", OleDbType.BigInt)
  _oledbParamTypes.Add("Boolean", OleDbType.Boolean)
  _oledbParamTypes.Add("BSTR", OleDbType.BSTR)
  _oledbParamTypes.Add("string", OleDbType.VarWChar)
  _oledbParamTypes.Add("int", OleDbType.Integer)
  _oledbParamTypes.Add("long", OleDbType.BigInt)
  ' [Others omitted]

  ' Other provider types
  _otherParamtypes.Add("string", DbType.String)
  _otherParamtypes.Add("integer", DbType.Int32)
  _otherParamtypes.Add("short", DbType.UInt16)
  _otherParamtypes.Add("boolean", DbType.Boolean)
  _otherParamtypes.Add("date", DbType.Date)
  ' [Others omitted]
End Sub

The purpose of _paramTypes and of the Hashtable objects to which it refers is twofold. First, it allows the DataFactory class to associate the appropriate data types with parameter objects, and second, it allows database-independent statement files to be written. We'll discuss these files later.

Abstracting the Statement

The creation of the provider Type objects and the connection object occurs when an instance of the DataFactory class is created. When one of its methods, such as GetDataSet, is invoked, the class must then use the types to execute the method.


However, in addition to abstracting the provider, the DataFactory class also abstracts the SQL that's executed against the provider using the concept of statements. A statement is simply the definition of the SQL and the set of parameters that need to be executed. In this way, a statement is analogous to a command object, but isn't provider or database specific.

By abstracting the statement and the specification of parameters, the methods in the DataFactory class, and in the data access classes that use it, needn't be concerned with database specifics. This independence is achieved through statement files: XML files that encapsulate each specific statement executed by your application.

For example, a method in a data access class written in C# that needs to return all the books written by a particular author could be coded as follows:

DataFactory df = new DataFactory(_connect, "SqlClient");
df.CacheFilePath = new DirectoryInfo(".");

HybridDictionary parms = new HybridDictionary();
parms.Add("author", "Fox, Dan");

DataSet ds = df.GetDataSet("GetTitles", parms);

In this small code snippet, a new instance of the DataFactory class is created and is passed both a connection string and the hardcoded value of "SqlClient" specifying that SQL Server will be used.

Note

Obviously, your classes could also read the provider value from a configuration file, as was done with the connection string on Day 9, "Using Connections and Transactions."
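
For example, with appSettings entries for both values, the client snippet above needn't hardcode either one. This is only a sketch, and the key names are assumptions; it assumes an Imports System.Configuration statement in the calling class.

' Sketch: read the connection string and provider from the application's .config file
Dim connect As String = ConfigurationSettings.AppSettings("ComputeBooksConnect")
Dim provider As String = ConfigurationSettings.AppSettings("ComputeBooksProvider")
Dim df As New DataFactory(connect, provider)

The corresponding appSettings section might look like this:

<configuration>
  <appSettings>
    <add key="ComputeBooksConnect"
      value="server=(local);database=ComputeBooks;trusted_connection=yes" />
    <add key="ComputeBooksProvider" value="SqlClient" />
  </appSettings>
</configuration>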


The CacheFilePath property is then set, which specifies the directory in which to search for statement files (in this case, the default directory). The GetDataSet method is then passed a HybridDictionary, which contains the parameters as name-value pairs, along with the string "GetTitles", which is the name of the statement. Internally, the DataFactory class translates the "GetTitles" string into the command object it represents by reading its information from a statement file.

Note

The HybridDictionary class can be found in the System.Collections.Specialized namespace and is interesting because when the collection is small, it stores its items in a ListDictionary. As the number of items grows, it switches to using a Hashtable. As a result, this class is useful when you don't know ahead of time how many elements the collection will contain.


Because this process is the meat of the DataFactory class, we'll take each aspect of statements in turn, including parsing the statement file, using the statement cache, and creating statement files.

Parsing the Statement File

As mentioned previously, the statement is abstracted into a statement file that specifies the SQL and parameters needed to execute the statement. Listing 18.5 shows an example of a statement file for the GetTitles statement.

Listing 18.5 A statement file. This file abstracts the calling of the usp_GetTitles stored procedure in a SQL Server database.
<?xml version="1.0" encoding="utf-8" ?>
<DataFactory>
  <Statement name="GetTitles" type="StoredProcedure">
    <Sql>usp_GetTitles</Sql>
    <Parameters>
      <Param name="author" SqlName="@author" type="string"
        maxLength="30" direction="Input" />
      <Param name="title" SqlName="@title" type="string"
        maxLength="100" direction="Input" />
    </Parameters>
  </Statement>
</DataFactory>

Although the file is straightforward, note that the Sql element specifies the database-specific SQL syntax to use and can therefore also contain inline SQL statements such as the following:

 <Sql>SELECT * FROM Titles WHERE author = @author</Sql> 

After the GetDataSet method is invoked as in the earlier code snippet, the DataFactory ultimately calls its private _getStatement method, which returns an IDbCommand object with its parameters and connection fully specified. Figure 18.1 shows the process flow for this method.

Figure 18.1. Returning a statement. This diagram shows the process flow of the _getStatement method that either pulls a statement out of the cache or creates it dynamically.


As you can see from Figure 18.1, the _getStatement method initially determines whether the statement has already been executed and therefore is in the shared cache. If so, and if the UseCache property is set to True, the statement is pulled out of the cache and processed, as you'll see in the next section.

However, if the statement is not in the cache, as would happen the very first time the "GetTitles" statement is executed, the method must load and parse the statement file and create the command and parameter objects. It does this by invoking a private _getStatementFromFile method that looks in the path specified by CacheFilePath for a file with the name statement.config, where statement is the name of the statement (GetTitles.config in this case). If the file is found, it is loaded and parsed using an XmlTextReader as shown in Listing 18.6.

Listing 18.6 Parsing the statement file. This method loads and parses the statement file.
Private Function _getStatementFromFile(ByVal statement As String) As Statement
  Dim found As Boolean
  Dim s As Statement
  Dim xlr As XmlTextReader
  Dim fileName, temp As String

  ' Get the file name
  fileName = Me.CacheFilePath.FullName & "\" & statement & ".config"
  If Not File.Exists(fileName) Then
    _throwException("File for statement " & statement & _
      " does not exist.", Nothing)
  Else
    xlr = New XmlTextReader(fileName)
  End If

  Try
    xlr.WhitespaceHandling = WhitespaceHandling.None
    xlr.MoveToContent()
    Do While xlr.Read()
      Select Case xlr.Name
        Case "Statement"
          If xlr.GetAttribute("name") = statement Then
            ' Found it
            found = True
            s = New Statement()
            s.Parms = New ArrayList()
            s.CommandType = CType(System.Enum.Parse(s.CommandType.GetType, _
              xlr.GetAttribute("type"), True), CommandType)
            xlr.Read()
            s.SQL = xlr.ReadElementString("Sql")
            s.Name = statement
          End If
        Case "Param"
          If found Then
            Dim p As New Parm()
            p.Name = xlr.GetAttribute("name")
            p.SQLName = xlr.GetAttribute("SqlName")
            p.Type = xlr.GetAttribute("type")
            temp = xlr.GetAttribute("maxLength")
            If Not temp Is Nothing Then p.maxLength = CInt(temp)
            temp = Nothing
            temp = xlr.GetAttribute("SourceColumn")
            If Not temp Is Nothing Then p.SourceColumn = temp
            p.Direction = CType(System.Enum.Parse(p.Direction.GetType, _
              xlr.GetAttribute("direction"), True), ParameterDirection)
            s.Parms.Add(p)
          End If
      End Select
    Loop
    If _daSwitch.Enabled Then
      Trace.WriteLine("Successfully read " & fileName)
    End If
    Return s   ' success
  Catch e As Exception
    _throwException("Could not parse " & fileName, e)
  Finally
    xlr.Close()
  End Try
End Function

In Listing 18.6, the method first builds the file path and determines whether the file exists using the System.IO.File class. If so, it loads the file using the XmlTextReader object and parses it using a Do Loop and the Read method. The Select Case statement then looks for Statement and Param elements in the statement file, which are then parsed and loaded into Statement and Parm structures. The structures are declared as Friend within the assembly that contains the DataFactory class as follows:

Friend Structure Statement
  Public Name As String
  Public SQL As String
  Public CommandType As CommandType
  Public Parms As ArrayList
  Public Command As IDbCommand
End Structure

Friend Structure Parm
  Public Name As String
  Public SQLName As String
  Public Type As String
  Public Direction As ParameterDirection
  Public maxLength As Integer
  Public SourceColumn As String
End Structure

As you might expect, the Parms field of the structure holds an ArrayList used to contain instances of the Parm structure. After the structures are populated, the Statement structure s is returned from the method. The Command field of the Statement structure is used to reference the fully instantiated command object with its parameters.

Note

Just as in the base classes we discussed yesterday, the DataFactory class uses a BooleanSwitch to control tracing output so that you can generate a log file.
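
The _daSwitch variable that appears in the listings isn't declared in this excerpt. A sketch of a declaration consistent with its use follows; the switch name and description are assumptions:

' Sketch: class-level trace switch (the actual name and description may differ)
Private Shared _daSwitch As New BooleanSwitch("DataFactoryTrace", _
  "Controls trace output from the DataFactory class")

The switch can then be turned on in the application's configuration file:

<configuration>
  <system.diagnostics>
    <switches>
      <add name="DataFactoryTrace" value="1" />
    </switches>
  </system.diagnostics>
</configuration>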


Using the Statement Cache

To minimize the number of times the DataFactory class needs to access the file system, and to avoid having to create command and parameter objects repeatedly, the class stores statements that have already been accessed in a shared Hashtable.

As shown in Figure 18.1, the _getStatement method first checks the cache to see whether the statement has already been loaded. It does this by checking a Hashtable referred to with the _procs variable. Just as with the Hashtable objects used to store the data types for each provider, the _procs variable points to a Hashtable that is provider specific. This is required because multiple instances of the DataFactory using different providers might be used in the same application. This is implemented by declaring _procs and a synchronized Hashtable at the class level like so:

Private Shared _provCache As Hashtable = Hashtable.Synchronized(New Hashtable())
Private _procs As Hashtable

Note

The Hashtable class contains a shared Synchronized method that automatically creates a thread-safe wrapper for the Hashtable class. This means that multiple threads can add to the Hashtable concurrently.


Then, when the Provider property is set, the _provCache Hashtable is inspected to see whether a Hashtable already exists for the provider. If not, a new synchronized Hashtable is created and added to _provCache. Listing 18.7 shows the definition of the Provider property.

Listing 18.7 Setting the provider. The Set block of the Provider property creates or references the provider-specific Hashtable .
Public Property Provider() As String
  ' Sets up the provider to use
  Get
    Return _provider
  End Get
  Set(ByVal Value As String)
    ' See if there is a cache for this provider
    If _provCache.ContainsKey(Value) Then
      _procs = CType(_provCache(Value), Hashtable)
    Else
      _provCache.Add(Value, Hashtable.Synchronized(New Hashtable()))
      _procs = CType(_provCache(Value), Hashtable)
    End If
    _provider = Value
    _createProviderTypes()
  End Set
End Property

As you can see in Listing 18.7, the provider cache is first checked with the ContainsKey method to see whether a Hashtable exists in its collection for the provider (specified by Value). If so, the _procs variable simply references it; if not, a new Hashtable is created and added to _provCache. In either case, the _createProviderTypes method is then executed to create the provider-specific Type objects, as shown in Listing 18.3.

The _getStatement method in Listing 18.8 then uses the _procs variable and the _getStatementFromFile method to implement the algorithm shown in Figure 18.1.

Listing 18.8 Creating and caching a statement. This method implements the process flow shown in Figure 18.1.
Private Function _getStatement(ByVal statement As String, _
  ByVal parms As HybridDictionary, ByVal setParmValue As Boolean) As IDbCommand

  Dim s As Statement
  Dim com, newCom As IDbCommand
  Dim p As Parm
  Dim newParm As IDbDataParameter

  ' See if it's in the cache
  If Me.UseCache AndAlso _procs.ContainsKey(statement) Then
    If _daSwitch.Enabled Then
      Trace.WriteLine("Cache hit for " & statement)
    End If
    ' Pull it out of the cache
    s = CType(_procs.Item(statement), Statement)
    newCom = CType(_cloneObject(s.Command), IDbCommand)
  Else
    s = _getStatementFromFile(statement)
    ' Build the command, add the parameters
    com = CType(Activator.CreateInstance(_comType, False), IDbCommand)
    com.CommandText = s.SQL
    com.CommandType = s.CommandType
    ' Now add the parameters
    For Each p In s.Parms
      Dim args(1) As Object
      args(0) = p.SQLName
      ' Get the type
      If _paramtypes.ContainsKey(p.Type) Then
        args(1) = _paramtypes(p.Type)
      Else
        _throwException("Invalid type in statement " & statement, Nothing)
      End If
      ' Create the parameter object
      newParm = CType(Activator.CreateInstance(_parmType, args), _
        IDbDataParameter)
      ' Set its properties
      newParm.Direction = p.Direction
      If Not p.SourceColumn Is Nothing Then
        newParm.SourceColumn = p.SourceColumn
      End If
      If p.maxLength > 0 Then newParm.Size = p.maxLength
      newParm.Value = DBNull.Value
      ' Add it to the collection
      com.Parameters.Add(newParm)
    Next
    s.Command = com
    ' Add it to the cache
    If Me.UseCache Then
      _procs.Add(statement, s)
      ' Clone the object that was just created
      newCom = CType(_cloneObject(com), IDbCommand)
      If _daSwitch.Enabled Then
        Trace.WriteLine("Added " & statement & " to cache.")
      End If
    Else
      newCom = com
    End If
  End If

  ' Populate the parameters
  For Each p In s.Parms
    If setParmValue AndAlso parms.Contains(p.Name) Then
      CType(newCom.Parameters(p.SQLName), IDataParameter).Value = _
        parms.Item(p.Name)
    End If
  Next

  ' Return the command with the populated parameters
  newCom.Connection = _connection
  Return newCom
End Function

The _getStatement method first checks the cache (_procs) for the statement, and if it finds the statement, it extracts the Command field and passes it to the private _cloneObject method to make a copy of the command object.

Note

The _cloneObject method simply casts for the ICloneable interface and, if it exists, uses its Clone method to create a copy of the object. This works for both SqlCommand and OleDbCommand objects because they both support the interface and perform deep copies (copies that also copy the object's collections, such as the parameters collection). In order for custom providers to be used with the DataFactory , therefore, their command objects would need to do likewise.
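
The _cloneObject method itself isn't listed in this excerpt. Based on the description in this note, a minimal sketch might look like the following; the behavior for objects that don't implement ICloneable is an assumption:

Private Function _cloneObject(ByVal o As Object) As Object
  ' Use ICloneable to copy the provider object when the interface is supported
  If TypeOf o Is ICloneable Then
    Return CType(o, ICloneable).Clone()
  End If
  ' Sketch only: the real implementation might instead throw for non-cloneable objects
  Return o
End Function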


This must be done so that multiple instances of the DataFactory running on multiple threads won't be competing to work with the same command object. This is especially the case because each command object must have its parameter values set differently each time a method is called.

If the statement isn't found in the cache, the appropriate command object is created using the CreateInstance method of the Activator class. Its CommandText and CommandType properties are then set to the values in the Statement structure. Next, the parameters are created by traversing the Parms ArrayList of the Statement structure. The interesting aspect of this code is that when the parameter object is created with the CreateInstance method, it is passed an array (args) as the second argument. This array is mapped to the constructor of the class, for example OleDbParameter, and contains the name of the parameter and its data type. Note that the parameter type is retrieved from the _paramtypes Hashtable we discussed earlier.

After the parameter object is created, its Direction, SourceColumn, Size, and Value properties are all set and the parameter is added to the collection. The new command object is then referenced in the Command field of the Statement structure.

Note

As you can tell from this code, although some providers such as SqlClient don't require you to create parameter objects you don't use, the DataFactory does in fact create all the parameter objects defined in the statement file regardless of whether they will ultimately be used. If the parameters have default values in the stored procedure, a client needn't populate its HybridDictionary object with values for all the parameters.


If the cache is in use, the statement is added to the cache for future use and a new command is created that will actually be used during this execution.

For this particular execution, then, the parameter objects need to be populated with values. This is accomplished by traversing the Parms ArrayList and determining whether the HybridDictionary passed into the method from the client (which contains author = "Fox, Dan", for example) contains parameters of the same name. If so, the Value property of the corresponding parameter object is set to the value from the HybridDictionary. This technique was used so that clients needn't add the parameters to the HybridDictionary in any particular order and needn't provide values for all the parameters.

Finally, the command object is associated with the private connection object and is returned. At this point, a fully populated command object with the appropriate parameter values has been created. It can then be used to execute the command through the ExecuteScalar or ExecuteDataReader method, or to populate a DataSet , DataTable , or data reader.

Creating Statement Files

Of course, using statement files implies that someone or something must create a statement file for each stored procedure or SQL statement executed against the data store. In particularly large applications, this can be a daunting task.

To automate the process, the DataFactory exposes shared CreateSqlFile and CreateSqlFiles methods, which create a file for a single SQL Server stored procedure and for all the stored procedures in a database, respectively. Both methods are overloaded and end up calling the private _createSqlFile and _createSqlFiles methods.

Most of the work is accomplished in the _createSqlFile method, which is passed the SqlConnection object, the name of the stored procedure, the directory to create the file in, and a flag indicating whether to populate the SourceColumn attribute. The method relies on the sp_procedure_params_rowset system stored procedure, and uses a SqlDataReader to read the procedure metadata returned and an XmlTextWriter object to write out the statement file, as shown in Listing 18.9.

Listing 18.9 Creating statement files. This method creates a statement file for a single stored procedure.
Private Shared Sub _createSqlFile(ByVal con As SqlConnection, _
  ByVal procName As String, ByVal cacheDir As DirectoryInfo, _
  ByVal defaultSourceColumn As Boolean)

  ' Writes the file for a stored procedure in SQL Server 2000
  Dim com As New SqlCommand("sp_procedure_params_rowset", con)
  Dim dr As SqlDataReader
  Dim statement As String
  Dim fs As FileStream
  Dim xlr As XmlTextWriter
  Dim closeOnFinish As CommandBehavior = CommandBehavior.Default

  com.CommandType = CommandType.StoredProcedure
  com.Parameters.Add("@procedure_name", procName)

  ' Take off the prefix if one exists
  If Left(procName, Len(ProcPrefix)) = ProcPrefix Then
    statement = Mid(procName, 5)
  Else
    statement = procName
  End If

  Dim fileName As String = cacheDir.FullName & "\" & statement & ".config"

  Try
    If con.State = ConnectionState.Closed Then
      closeOnFinish = CommandBehavior.CloseConnection
      con.Open()
    End If
    dr = com.ExecuteReader(closeOnFinish)

    ' Open the file
    If File.Exists(fileName) Then
      File.Delete(fileName)
    End If
    fs = New FileStream(fileName, FileMode.CreateNew)
    xlr = New XmlTextWriter(fs, Text.Encoding.Default)
    xlr.Indentation = 2
    xlr.Formatting = Formatting.Indented

    xlr.WriteStartDocument()
    xlr.WriteComment("Generated by the DataFactory at " & _
      Now.ToLongDateString())
    xlr.WriteStartElement("DataFactory")
    xlr.WriteStartElement("Statement")
    xlr.WriteAttributeString("name", statement)
    xlr.WriteAttributeString("type", "StoredProcedure")
    xlr.WriteElementString("Sql", procName)
    xlr.WriteStartElement("Parameters")

    Do While dr.Read
      xlr.WriteStartElement("Param")
      xlr.WriteAttributeString("name", Mid(dr("PARAMETER_NAME").ToString(), 2))
      xlr.WriteAttributeString("SqlName", dr("PARAMETER_NAME").ToString())
      xlr.WriteAttributeString("type", dr("TYPE_NAME").ToString())
      If defaultSourceColumn Then
        xlr.WriteAttributeString("SourceColumn", _
          Mid(dr("PARAMETER_NAME").ToString(), 2))
      End If
      If Not IsDBNull(dr("CHARACTER_MAXIMUM_LENGTH")) Then
        xlr.WriteAttributeString("maxLength", _
          dr("CHARACTER_MAXIMUM_LENGTH").ToString)
      End If
      Select Case dr("PARAMETER_TYPE")
        Case 4
          xlr.WriteAttributeString("direction", "ReturnValue")
        Case 1
          xlr.WriteAttributeString("direction", "Input")
        Case 2
          xlr.WriteAttributeString("direction", "InputOutput")
        Case 3
          xlr.WriteAttributeString("direction", "Output")
      End Select
      xlr.WriteEndElement() ' Param
    Loop
    xlr.WriteEndDocument()
  Catch e As Exception
    Throw New Exception("Could not create XML for " & procName, e)
  Finally
    If Not dr Is Nothing Then dr.Close()
    If Not xlr Is Nothing Then xlr.Flush()
    If Not fs Is Nothing Then fs.Close()
  End Try
End Sub

You'll notice that after setting up the call to the stored procedure, the method uses the shared field ProcPrefix to strip any naming-convention prefix from the procedure name, producing a more generic statement name and therefore file name. Although it can be changed, by default ProcPrefix is set to "usp_", a common naming convention for user-defined SQL Server stored procedures.
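
The declaration of ProcPrefix isn't reproduced here; a shared field along these lines would match that description (the exact declaration may differ):

' Sketch: the default prefix stripped from stored procedure names
Public Shared ProcPrefix As String = "usp_"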

The remainder of the method simply executes the stored procedure and traverses the results while using the methods of the XmlTextWriter to write out the statement file.

A client could then call one of the overloaded signatures like so:

DataFactory.CreateSqlFile(connect, "usp_GetSchedule",
  new DirectoryInfo(@"C:\ComputeBooks"), true);

This would result in the file GetSchedule.config being created in the ComputeBooks directory.

One of the most interesting aspects of these shared methods, however, is that the CreateSqlFiles method operates asynchronously so that the statement files can be created on a separate thread. This is accomplished using an asynchronous delegate declared at the class level like so:

Private Delegate Sub WriteSqlFiles(ByVal connect As String, _
  ByVal cachedir As DirectoryInfo, _
  ByVal defaultSourceColumn As Boolean)

The CreateSqlFiles method can then use the delegate to execute the _createSqlFiles method on a background thread like so:

Public Shared Sub CreateSqlFiles(ByVal connect As String, _
  ByVal cacheDir As DirectoryInfo)
  ' Create files for each of the stored procs in
  ' the database on a separate thread
  Dim async As New AsyncCallback(AddressOf _createSqlCallback)
  Dim createFiles As New WriteSqlFiles(AddressOf _createSqlFiles)
  createFiles.BeginInvoke(connect, cacheDir, False, async, Nothing)
  ' Now this thread is free
End Sub

Here, the _createSqlCallback method will be called when the _createSqlFiles method completes on the background thread. The _createSqlCallback method simply calls EndInvoke on the delegate and raises the SqlFilesCreated event.
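
The callback itself isn't shown here. A minimal sketch, assuming SqlFilesCreated is declared as a shared event with no arguments (the actual signature in the downloadable code may differ), might look like this:

Private Shared Sub _createSqlCallback(ByVal ar As IAsyncResult)
  ' Recover the original delegate so that EndInvoke can be called
  Dim result As Runtime.Remoting.Messaging.AsyncResult = _
    CType(ar, Runtime.Remoting.Messaging.AsyncResult)
  Dim createFiles As WriteSqlFiles = CType(result.AsyncDelegate, WriteSqlFiles)
  createFiles.EndInvoke(ar)
  ' Signal any listeners that the statement files have been written
  RaiseEvent SqlFilesCreated()
End Sub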

Note

For more information on using asynchronous delegates, see the "Asynchronous Design Pattern Overview" topic in the online documentation or my articles on .NET patterns on InformIT.com.


Using ASP.NET Caching

Although not shown today, the DataFactory class can also be used to take advantage of ASP.NET caching so that a site can be updated dynamically. If the DataFactory class were used from an ASP.NET application, an item associated with each statement could be added to the System.Web.Caching.Cache object using its Add or Insert methods. When items are added to the cache, they can be associated with a file dependency (a CacheDependency object) so that when the file changes, the item is automatically removed from the cache. When this happens, your ASP.NET application can be notified through the CacheItemRemovedCallback delegate. It can then call the shared RemoveStatement method of the DataFactory to remove the statement from the DataFactory's cache as well. In this way, an administrator can re-create statement files using an administrative Web page that calls the shared methods discussed in this section, and the application will automatically use the new versions of the files the next time the statement is executed.
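
A sketch of that pattern follows, written as members of an ASP.NET Page or Global.asax class with an Imports System.Web.Caching statement. The cache key, file name, and the argument expected by RemoveStatement are assumptions made for illustration.

' Sketch: watch a statement file and purge the DataFactory cache when it changes
Private Sub WatchStatementFile()
  Dim fileName As String = Server.MapPath("GetTitles.config")
  Cache.Insert("GetTitles", fileName, New CacheDependency(fileName), _
    Cache.NoAbsoluteExpiration, Cache.NoSlidingExpiration, _
    CacheItemPriority.Normal, _
    New CacheItemRemovedCallback(AddressOf StatementFileChanged))
End Sub

Private Sub StatementFileChanged(ByVal key As String, ByVal value As Object, _
  ByVal reason As CacheItemRemovedReason)
  ' The file was re-created, so purge the statement from the DataFactory as well
  ' (assumes RemoveStatement accepts the statement name)
  DataFactory.RemoveStatement(key)
End Sub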

Using the DataFactory

Now that you've seen how the DataFactory class is implemented internally, a look at several of the public methods it exposes will illustrate how the implementation is used. First, consider the ExecuteScalar method, which exposes three signatures that include the statement, parameters, and optionally an Integer passed in by reference to catch the return value, and a transaction object to associate with the command. All three public methods ultimately call the private _executeScalar method shown in Listing 18.10.

Listing 18.10 Implementing the DataFactory . This private method is called by the public ExecuteScalar methods.
Private Function _executeScalar(ByVal statement As String, _
  ByVal parms As HybridDictionary, ByRef returnVal As Integer, _
  ByVal transaction As IDbTransaction) As Object

  ' Return a single value using ExecuteScalar
  Dim com As IDbCommand
  Dim val As Object
  Dim leaveOpen As Boolean = False

  ' Get the command
  com = _getStatement(statement, parms, True)
  If Not transaction Is Nothing Then
    com.Transaction = transaction
  End If

  Try
    If com.Connection.State = ConnectionState.Closed Then
      com.Connection.Open()
    Else
      leaveOpen = True
    End If
    val = com.ExecuteScalar()
    Dim p As IDataParameter
    For Each p In com.Parameters
      If p.Direction = ParameterDirection.ReturnValue Then
        returnVal = CInt(p.Value)
        Exit For
      End If
    Next
    Return val
  Catch e As Exception
    _throwException( _
      "Failed to execute ExecuteScalar method for statement " & statement, e)
  Finally
    If Not leaveOpen Then com.Connection.Close()
  End Try
End Function

As you can see in Listing 18.10, the method relies on the private _getStatement method to return an IDbCommand object with fully populated parameters. The command object is then associated with a transaction if one is passed to the method. Because the connection object might already be open (for example, if a transaction is active), the method then checks to see whether the connection needs to be opened. After calling the ExecuteScalar method and capturing the returned value, the parameters are traversed to catch the return value. Finally, the value returned from ExecuteScalar is returned to the caller and the connection is closed if necessary.

In another example, the _getDataSet method shown in Listing 18.11 is called by the public GetDataSet methods. Unlike in _executeScalar, the retrieval of the command object is wrapped in the private _setupDataAdapter method, which dynamically creates the data adapter and populates its SelectCommand with the command object built from the statement. After the data adapter is created, the DataSet is filled and its ExtendedProperties collection is populated with metadata that your data access classes might be able to use.

Listing 18.11 Returning a DataSet . This private method is used by the public GetDataSet methods.
Private Function _getDataSet(ByVal statement As String, _
  ByVal parms As HybridDictionary, _
  ByVal transaction As IDbTransaction) As DataSet

  ' Returns a DataSet given the statement
  Dim da As IDataAdapter
  Dim ds As New DataSet()

  da = _setupDataAdapter(statement, parms, transaction)
  Try
    ' Fill and return
    da.MissingMappingAction = MissingMappingAction.Passthrough
    da.MissingSchemaAction = MissingSchemaAction.AddWithKey
    da.Fill(ds)
    ds.DataSetName = statement
    ds.ExtendedProperties.Add("Creator", "DataFactory class")
    ds.ExtendedProperties.Add("TimeCreated", Now.ToShortTimeString)
    ds.ExtendedProperties.Add("Statement", statement)
    ds.ExtendedProperties.Add("Parameters", parms)
    Return ds
  Catch e As Exception
    _throwException("Could not fill DataSet", e)
  End Try
End Function
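
The _setupDataAdapter helper called in Listing 18.11 isn't reproduced in this excerpt. A minimal sketch, assuming it simply combines _getStatement with the dynamically created data adapter type, might look like the following:

Private Function _setupDataAdapter(ByVal statement As String, _
  ByVal parms As HybridDictionary, _
  ByVal transaction As IDbTransaction) As IDataAdapter

  ' Build the fully populated command for the statement
  Dim com As IDbCommand = _getStatement(statement, parms, True)
  If Not transaction Is Nothing Then
    com.Transaction = transaction
  End If

  ' Create the provider-specific data adapter and assign its SelectCommand
  Dim da As IDbDataAdapter = CType(Activator.CreateInstance(_daType, False), _
    IDbDataAdapter)
  da.SelectCommand = com
  Return da
End Function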

By using these public methods, the methods of a data access class such as the _getTitles method shown yesterday in Listing 17.4 can be simplified as shown in Listing 18.12.

Listing 18.12 Using the DataFactory . This method uses the DataFactory to retrieve titles.
Private Function _getTitles(ByVal author As String, ByVal title As String, _
  ByVal isbn As String, ByVal lowPubDate As Date, _
  ByVal catID As Guid) As DataSet

  Dim parms As New HybridDictionary()

  Try
    If Not isbn Is Nothing AndAlso isbn.Length > 0 Then
      parms.Add("isbn", isbn)
    Else
      If Not title Is Nothing AndAlso title.Length > 0 Then
        parms.Add("title", title)
      End If
      If Not author Is Nothing AndAlso author.Length > 0 Then
        parms.Add("author", author)
      End If
      If Not lowPubDate.Equals(Nothing) Then
        parms.Add("lowPubDate", lowPubDate)
      End If
      If Not catID.Equals(Guid.Empty) Then
        parms.Add("catID", catID)
      End If
    End If
    Return _df.GetDataSet("GetTitles", parms)
  Catch e As DataFactoryException
    Call MyBase.ThrowComputeBookException("GetTitles Failed", e)
  End Try
End Function

Note that the _df variable is a class-level private variable that references an instance of the DataFactory class created in the constructor of the Books class. The key differences to note between Listing 18.12 and the original method in Listing 17.4 are as follows:

  • When using the DataFactory , the method doesn't need to create or call any provider-specific objects. This is provider independence.

  • When using the DataFactory , the method doesn't need to know which stored procedure is executed to return the data nor the actual names of the parameters exposed by the procedure. This is database independence.

  • When using the DataFactory , the number of lines of code in the method was reduced by 35%.

Measuring Performance

The obvious issue with using a data factory approach like the one presented today is the cost of creating type objects for the provider dynamically and reading the statement files from the file system. Keep in mind that the performance hit incurred when creating the provider types happens only when the DataFactory is instantiated. Further, reading a particular statement file should occur only once in the entire application, which will have a negligible effect on performance.

In fact, in tests conducted with Microsoft Application Center Test (ACT) on a sample ASP.NET application using the DataFactory, it was discovered that after the application was running, performance was actually slightly better using the DataFactory than when not using it. Performance could be enhanced further by caching the provider-specific Type objects in a shared data structure as well. This would be especially effective if the DataFactory used the ODBC or a custom .NET Data Provider because the DataFactory.config file wouldn't need to be parsed repeatedly.
