Questions That Should Be Asked More Frequently | Microsoft ADO.NET (Core Reference) (PRO-Developer)

Application Design Considerations

Creating a helpful and intuitive user interface is just one of many facets of building an effective Windows-based application. Let's discuss some important application design considerations.

Fetching Only the Data You Need

As you develop your application, it's important that you consider how your database will grow. Executing a SELECT ... FROM MyTable query when your application starts up might seem fine while you're developing the application, but as the table grows in size, fetching the results of the query will take more time. The more data you retrieve, the more time your application will take to retrieve that data.

Take our sample order entry application. As the application starts, it issues queries to retrieve information for all the orders that a customer has placed. Is that the right call? Perhaps retrieving all orders for a customer is overkill. Maybe the users of the application are primarily interested in viewing the orders that have not yet shipped. Maybe the application should fetch only orders that a particular customer has placed in the past three months.

The application's environment can also be a factor in determining which data to fetch. Perhaps the user needs to be able to download data onto a laptop using a 28-Kbps modem, access and modify that data off line while at a remote site, and then reconnect using the same modem at the end of the day to transmit changes back to the database. You won't want to waste any bandwidth because the pipeline is so thin, but because of the environment, the application will require you to download all the necessary data from the database.

Updating Strategies

The sample application caches changes and relies on optimistic locking to submit updates. Let's discuss some other updating strategies.

Immediate Updates vs. Cached Updates

Whether you decide to submit changes immediately or cache those changes and submit them later should depend on what's appropriate for your application.

When the user modifies an order in the sample application, the application does not immediately submit the change to the database. The application relies on ADO.NET to cache the update until the user clicks the Submit Changes button.

We could easily change the application so that it submits the change to an order when the user clicks the Update button. When the user clicks the Edit button, the application will allow the user to modify the order and its line items. If the user clicks the Cancel button, the application will discard the changes. If the user clicks the Update button instead, the application will save the changes and then use the DataAdapter to submit the changes to the database.

One benefit of working with data off line and caching changes is that the application does not need to communicate with the database as frequently. However, the longer the user waits to submit cached changes, the greater the chances that another user will have modified the same data in the database, which can cause the update attempt to fail.

You should weigh the pros and cons of each approach to determine what's appropriate for your application. If the users of our sample application will handle incoming phone orders from customers with slow-moving inventory, caching the changes will probably suffice. But that approach isn't appropriate for an airline ticket reservation system. You wouldn't want a user to try to save a traveler's itinerary only to discover that the last seat on the return flight was sold while the traveler was trying to find his or her frequent flyer number.

Refetching Before Allowing Changes

When you retrieve data into a disconnected structure such as a DataSet, the data can become stale. But unlike a carton of milk, your DataSet does not come with an expiration date. The DataSet does not fire an event when the corresponding rows in your database change. By the time the user modifies data in your application and attempts to submit the changes to your database, another user might have modified that same row of data and the update attempt will fail.

Take our sample application. It fetches data when the application starts. The user might click the Edit button within seconds of starting the application. The longer the application is open, the more stale the data becomes. In fact, the user might wait minutes or hours before modifying the contents of a row.

By the time the user clicks the Edit button, another user might have modified the corresponding row in the database. If you're developing an application that might face this scenario, you might want to refetch the contents of the corresponding row from the database when the user clicks the Edit button.

To refetch the contents of the row, you can create a DataAdapter that executes a parameterized query that looks like this:

SELECT ... FROM MyTable WHERE KeyCol = ?

If you've set the PrimaryKey property of your DataTable, the DataAdapter will update the contents of the DataRow with data from your database. Remember that this query will not generate an exception if another user has deleted the corresponding row in the database. In that case, the query will just return no rows. The DataAdapter object's Fill method returns an integer that contains the number of rows that the database fetched. If the Fill method returns 0, you know that the row no longer exists in the database. You can trap for that scenario and elegantly inform the user that the row that he or she wanted to edit no longer exists.

ADO.NET and Pessimistic Locking

Even if you submit changes to the database as soon as they're made, rather than cache them, and you refetch the contents of a row before allowing the user to modify the contents, the user's update attempt might fail because the data on the server is not locked. You can use pessimistic locking to make sure the update attempt will succeed.

Pessimistic locking is a powerful and somewhat dangerous feature. Be afraid. Be very afraid.

This is an advanced topic, intended for developers who really understand the impact of locking data on their servers. Only a small number of applications require pessimistic locking, such as airline reservation systems.

I don't recommend using pessimistic locking as a general way of avoiding failed update attempts. For most applications, it's better to have optimistic update attempts occasionally fail than to have queries fail because data on the server is locked.

Updating data using pessimistic locking involves locking the row in the database before editing its contents, which ensures that the update attempt will not fail as a result of changes made by another user. The ADO.NET object model is designed to submit changes through optimistic updates. As you learned in Chapter 10 and Chapter 11, the DataAdapter object lets you submit pending changes stored in a DataSet. The DataSet object does not lock data on the database when you modify the contents of a DataRow object. There are no properties you can set on ADO.NET objects to achieve pessimistic locking, at least not in the initial release of ADO.NET.

You can still achieve pessimistic locking through the use of Transaction objects. However, you'll probably need to do more than issue a simple SELECT query from within a transaction. Whether issuing a SELECT query from within a transaction locks the rows returned by the query depends on the database, the type of query, and the isolation level for the transaction.

A transaction's isolation level controls how, or if, the work performed on a transaction will affect work performed on other transactions. SQL Server uses "read committed" as its default transaction isolation level. With this isolation level, rows are locked once they're modified in the transaction. Simply retrieving the contents of a row in a SELECT query will not lock the row. If, however, you use an isolation level of "repeatable read" or "serializable," you lock the rows you retrieve in a SELECT query.

Some databases support the use of locking hints in a query. With SQL Server, you can issue the following query to lock a row of data in a transaction, regardless of the transaction's isolation level:

SELECT CustomerID, CompanyName, ContactName, Phone FROM Customers     WITH (UPDLOCK) WHERE CustomerID = 'ALFKI'

See your database's documentation to find out what types of transaction isolation levels and locking hints it supports.

The following code snippet pessimistically locks a row in an MSDE database through the use of an OleDbTransaction object and the use of locking hints in the SELECT query. As soon as the code retrieves the results of the query, the row of data is locked on the server. You can set a breakpoint after the call to the DataAdapter object's Fill method to verify that the data is locked. At this point, you can examine the contents of the row in an ad hoc query tool such as SQL Server's Query Analyzer, but attempts to modify the contents of the row will fail.

Visual Basic .NET

Dim strConn, strSQL As String strConn = "Provider=SQLOLEDB;Data Source=(local)\NetSDK;" & _           "Initial Catalog=Northwind;Trusted_Connection=Yes;" strSQL = "SELECT CustomerID, CompanyName FROM Customers " & _          "WITH (UPDLOCK) WHERE CustomerID = 'ALFKI'" Dim cn As New OleDbConnection(strConn) cn.Open() Dim txn As OleDbTransaction = cn.BeginTransaction Dim cmd As New OleDbCommand(strSQL, cn, txn) Dim da As New OleDbDataAdapter(cmd) Dim cb As New OleDbCommandBuilder(da) Dim tbl As New DataTable() da.Fill(tbl) Dim row As DataRow = tbl.Rows(0) row("CompanyName") = "Modified" da.Update(tbl) txn.Rollback() cn.Close()

Visual C# .NET

string strConn, strSQL; strConn = "Provider=SQLOLEDB;Data Source=(local)\\NetSDK;" +           "Initial Catalog=Northwind;Trusted_Connection=Yes;"; strSQL = "SELECT CustomerID, CompanyName FROM Customers " +          "WITH (UPDLOCK) WHERE CustomerID = 'ALFKI'"; OleDbConnection cn = new OleDbConnection(strConn); cn.Open(); OleDbTransaction txn = cn.BeginTransaction(); OleDbCommand cmd = new OleDbCommand(strSQL, cn, txn); OleDbDataAdapter da = new OleDbDataAdapter(cmd); OleDbCommandBuilder cb = new OleDbCommandBuilder(da); DataTable tbl = new DataTable(); da.Fill(tbl); DataRow row = tbl.Rows[0]; row["CompanyName"] = "Modified"; da.Update(tbl); txn.Rollback(); cn.Close();


	I use a CommandBuilder in the sample code solely for the sake of brevity. Chapter 11 discusses why you're better off supplying your own updating logic.

Connection Strategies

You can choose between two connection strategies. The appropriate approach will depend on the parameters of your application.

Connecting and Disconnecting

The simplest approach to connecting to your database is to let DataAdapter objects open the connection implicitly. The sample application uses this approach. The DataAdapter objects implicitly open the connection on calls to the DataAdapter object's Fill and Update methods and close the Connection object at the end of the calls. This approach is simple, but it's not always the best approach.

Depending on the responsiveness of your network and database, opening a connection to your database can be time consuming. You might be able to improve the performance of your application by opening a connection to your database when the application starts and keeping the connection open for the lifetime of the application, but this approach also has its drawbacks. It works well for small numbers of users but might not be appropriate for large numbers of simultaneous users. Also, this approach is really applicable only in two-tier applications in which the application can connect to the database directly. If your application relies on middle-tier components to communicate with your database, this approach isn't feasible.

The best approach is a kind of hybrid of these two approaches. You explicitly open a connection to your database but only when you need it. This approach is similar to the one in which you let the DataAdapter objects manage the state of your connection. The application executes the following code when it starts up:

Visual Basic .NET

daProducts.Fill(dsChapter13.Products) daEmployees.Fill(dsChapter13.Employees) daOrders.SelectCommand.Parameters(0).Value = strCustomerID daOrders.Fill(dsChapter13.Orders) daDetails.SelectCommand.Parameters(0).Value = strCustomerID daDetails.Fill(dsChapter13.Order_Details)

Visual C# .NET

daProducts.Fill(dsChapter13.Products); daEmployees.Fill(dsChapter13.Employees); daOrders.SelectCommand.Parameters[0].Value = strCustomerID; daOrders.Fill(dsChapter13.Orders); daDetails.SelectCommand.Parameters[0].Value = strCustomerID; daDetails.Fill(dsChapter13.Order_Details)

Each call to the Fill method of a DataAdapter object implicitly opens and closes the Connection object associated with the DataAdapter. This means that the code opens and closes the Connection object four times. Calling the Connection object's Open method before the calls to the DataAdapter objects' Fill methods will improve the performance slightly. Explicitly opening the Connection object before calls to the DataAdapter objects' methods also allows you to group the changes you submit to the database in a single transaction.

If I were pressed to recommend a single, general approach to managing connection state, this is the one I'd recommend.

Connection Pooling

Connection pooling can greatly improve the performance of your multi-tiered applications. In fact, because connection pooling is enabled by default, you might be taking advantage of connection pooling without even realizing it. The only connections that are reused are the ones with the same connection string and credentials, so you're not really using connection pooling unless the middle-tier components use the same connection string and credentials.

Some developers rely on their database to enforce security in their multi-tiered applications. Because the middle-tier components use the users' credentials to connect to the database, applications that use this approach will not benefit from connection pooling. To get the full benefit of connection pooling, you should have your middle-tier components rely on their own specific credentials. Use network security to make sure that only users who have the appropriate credentials can access the middle-tier components.

Although connection pooling is primarily geared toward multi-tiered applications, it can also improve the performance of your simple two-tiered application. When the sample application closes its Connection object, implicitly or explicitly, the actual connection to the database is cached in the connection pool. If the application reopens the Connection object before the actual connection times out in the pool, the connection will be reused.

If you're working with the OLE DB .NET Data Provider and you don't want to use connection pooling in your application, add the following attribute to your connection string:

OLE DB Services=-4;

Developers working with the SQL Client .NET Data Provider can use the following attribute to ensure that their connections are not pooled when the Connection object is closed:

Pooling=False;

Working with BLOB Data

You'll get better performance by storing BLOB data in files on your server and storing the location of those files in your database. Operating systems are better suited to working with files. Storing that same data in a database is less efficient. SQL Server 2000, for example, breaks up BLOB data that's greater than 8 KB in size into multiple pages in the database. That means that storing the contents of a 40-KB file involves separating the contents of that file into five separate pieces.

While I'm not a big fan of storing BLOB data in databases, I can definitely see the appeal. Storing some data in a database and other data in files increases the number of technologies involved. Keeping the data secure and backing up your data becomes more complex.

In case you do decide to store BLOB data in your database, here are some tips for working with BLOB data in ADO.NET.

Delaying BLOB Fetching

If your query fetches a hundred rows and the query includes BLOB columns, do you really want to retrieve all that data along with the results of your query? SQL Server BLOB columns can contain up to 2 GB of data. Do you know how much BLOB data your query will return?

One way to improve the performance of your application is to avoid fetching BLOB data from your database until you need it. Fetch the non-BLOB data ahead of time, and then fetch the BLOB data as necessary. This technique is especially helpful when the user will access the BLOB data only for the currently visible row.

Handling BLOBs in DataSet Objects

Accessing and modifying the contents of a BLOB column in a DataSet is actually very straightforward. ADO.NET stores BLOBs of text as strings and binary BLOBs as byte arrays. The DataRow object does not expose GetChunk or AppendChunk methods as in previous data access models. You must retrieve or modify the entire contents of the column.

You treat a BLOB of text just as you would any other text-based column.

Visual Basic .NET

Dim row As DataRow Dim strBlob As String  'Accessing the contents of a BLOB of text strBlob = CStr(row("TextBlob")) 'Modifying the contents of a BLOB of text row("TextBlob") = strBlob

Visual C# .NET

DataRow row; string strBlob;  //Accessing the contents of a BLOB of text strBlob = (string) row["TextBlob"]; //Modifying the contents of a BLOB of text row["TextBlob"] = strBlob;

Similarly, you treat binary BLOB columns just as you would smaller binary columns.

Visual Basic .NET

Dim row As DataRow Dim aBinaryBlob As Byte()  'Accessing the contents of a BLOB of text aBinaryBlob = CType(row("BinaryBlob"), Byte()) 'Modifying the contents of a BLOB of text row("BinaryBlob") = aBinaryBlob

Visual C# .NET

DataRow row; Byte[] aBinaryBlob;  //Accessing the contents of a BLOB of text aBinaryBlob = (Byte[]) row["BinaryBlob"]; //Modifying the contents of a BLOB of text row["BinaryBlob"] = aBinaryBlob;

Handling BLOBs Using DataReader Objects

The DataReader object offers you a choice: you can access the contents of a BLOB column all at once, or you can fetch data from the column in chunks.

The following code snippet uses a single call to the DataReader to retrieve the contents of a BLOB column:

Visual Basic .NET

Dim cmd As OleDbCommand Dim rdr As OleDbDataReader Dim intTextBlobColumnNo, intBinaryBlobColumnNo As Integer Dim strTextBlob As String Dim aBinaryBlob As Byte()  rdr = cmd.ExecuteReader(CommandBehavior.SequentialAccess) Do While rdr.Read     strTextBlob = rdr.GetString(intTextBlobColumnNo)     aBinaryBlob = CType(rdr(intBinaryBlobColumnNo), Byte()) Loop rdr.Close

Visual C# .NET

OleDbCommand cmd; OleDbDataReader rdr; int intTextBlobColumnNo, intBinaryBlobColumnNo; string strTextBlob; Byte[] aBinaryBlob;  rdr = cmd.ExecuteReader(CommandBehavior.SequentialAccess); while (rdr.Read()) {     strTextBlob = rdr.GetString(intTextBlobColumnNo);     aBinaryBlob = (Byte[]) rdr[intBinaryBlobColumnNo]; } rdr.Close()


	The preceding code snippet retrieves the contents of the text BLOB column using the strongly typed GetString method but retrieves binary BLOB data by implicitly calling the untyped Item property and then converting the return value to a byte array. The DataReader object does expose a GetBytes method, but it returns data in chunks rather than in a single call.

BLOB columns can be rather large. Storing the entire contents of a BLOB column in a single string or byte array might not be the best idea if the column contains a couple hundred megabytes of data. In such cases, your best bet is to fetch the BLOB data a chunk at a time, write the contents to the hard drive, and access the contents when appropriate.

The DataReader object exposes two methods—GetBytes and GetChars—that let you retrieve binary data in chunks. The following code snippet demonstrates how you can use the GetBytes method to retrieve the contents of a binary BLOB column from a DataReader in 8-KB chunks and write that data to a file. You can follow the same logic to retrieve text BLOB data using the GetChars method instead.

Visual Basic .NET

'Add the following line of code at the beginning of the code module. Imports System.IO Dim cmd As OleDbCommand Dim rdr As OleDbDataReader Dim intBlobColumnNo As Integer = 1 Dim intChunkSize As Integer = 8192 Dim intOffset As Integer = 0 Dim intBytesReturned As Integer Dim aBinaryBlob(intChunkSize) As Byte Dim strPathToFile As String = "C:\GetBytes.jpg" Dim filOutput As New FileStream(strPathToFile, FileMode.Create) rdr = cmd.ExecuteReader(CommandBehavior.SequentialAccess) rdr.Read() Do     intBytesReturned = CInt(rdr.GetBytes(intBlobColumnNo, intOffset, _                                          aBinaryBlob, 0, intChunkSize))     If (intBytesReturned > 0) Then         filOutput.Write(aBinaryBlob, 0, intBytesReturned)     End If     intOffset += intBytesReturned Loop Until intBytesReturned <> intChunkSize filOutput.Close() rdr.Close()

Visual C# .NET

//Add the following line of code at the beginning of the code module. using System.IO; OleDbCommand cmd; OleDbDataReader rdr; int intBinaryBlobCol = 1; int intChunkSize = 8192; int intOffset = 0; int intBytesReturned; Byte[] aBinaryBlob = new Byte[intChunkSize]; string strPathToFile = "C:\\GetBytes.jpg"; FileStream filOutput = new FileStream(strPathToFile, FileMode.Create); rdr = cmd.ExecuteReader(CommandBehavior.SequentialAccess); rdr.Read(); do {     intBytesReturned = (int) rdr.GetBytes(intBinaryBlobCol, intOffset,                                            aBinaryBlob, 0, intChunkSize);     if (intBytesReturned > 0)         filOutput.Write(aBinaryBlob, 0, intBytesReturned);     intOffset += intBytesReturned; } while (intBytesReturned == intChunkSize); filOutput.Close(); rdr.Close();

Binary BLOBs in the Northwind Database

You might have noticed that the Northwind database includes BLOB columns. For example, the Employees table includes a column called Photo, which contains a picture of the employee.

Unfortunately, the Photo column also contains some extra bytes in the form of an Access OLE header. This header allows Microsoft Access to know what type of data the BLOB column contains—such as a .jpg file, a Word document, or an Excel spreadsheet. As a result, if you try to fetch the contents of the Photo column using ADO.NET (or ADO, or RDO, or...) you won't be able to load that data into a PictureBox control or view the contents of the file in an imaging program such as Paint.

How can you discard the Access OLE header so that you're left with just the desired data? In short, you can't. The format of the Access OLE header is proprietary and is not documented.

However, the companion CD includes a sample application called LoadEmployeePhotos that can help you replace the default contents of the Photo column in the Employees table with pure .jpg images. In the application's directory, you'll find .jpg files that contain pictures of the employees. The application loads this data into the Northwind database by executing a series of parameterized queries.

You can also look to this application as an example of how to load the contents of files into a database using parameterized queries.

A Sample BLOB Application

Now that you have actual pictures in the Employees table, let's take a brief look at a sample application that retrieves this data and displays it on a Windows form. This application, ShowEmployeesPhotos, which is on the companion CD, retrieves employee information from the Northwind database into a DataSet. Figure 13-11 shows the user interface for the sample application.

Figure 13-11

Displaying binary BLOB data and displaying it in a PictureBox control

To build the application more quickly, I relied on the CurrencyManager and bound controls to manage the current position and simplify the process of displaying employee information.

The DataSet that the sample application uses has two separate DataTable objects—one for the non-BLOB data and one for the BLOB data. The child DataTable also includes the primary key column (EmployeeID) to simplify the process of moving from a row in the parent DataTable to the corresponding row in the child DataTable. I added a column called FetchedPhoto to the parent DataTable to keep track of whether I've fetched the photo for that employee.

When the application starts up, it retrieves non-BLOB employee information (EmployeeID, LastName, FirstName, and so on) from the Employees table. The application then retrieves the contents of the Photo column the first time the user navigates to a particular employee. (The photos are small in size—only 22 KB—so the application wouldn't incur a large performance hit if it had loaded this data on startup, especially if the size of the table remains small.) This approach can greatly improve the performance of applications in which the user can view only a fraction of the rows that the application retrieves.

User Interfaces Built with ADO.NET Power

You've seen how the data binding features that are intrinsic to Windows forms can help you build user interfaces quickly and easily. You've also learned that you can achieve greater control over your user interface by relying on your own code rather than on bound controls. You now also know the pros and cons of different strategies for connecting to your database, querying your database, submitting updates, and working with BLOB data.