Datasets, Adapters, and Sources


Now that we have seen a couple of no-coding-required examples, let's take a peek under the hood and see how data binding actually works. Several players make data binding work, many of which can be seen on the component tray or design surface:

  • A back-end data source, such as an Access database, a remote SQL Server database, a Web service, or some other data storage and retrieval system, is where the data ultimately lives.

  • A dataset is a disconnected local cache of some portion of the back-end data source.

  • An adapter connects the dataset to the back-end data source, both to fill the dataset from the back-end source and to update the back end with any changes. There usually is one adapter per table, which is why we saw two adapters in the preceding example.

  • A binding source acts as the intermediary between the user interface control and the dataset. Although it is possible to bind a control directly to a dataset, as discussed later in this chapter, it usually is more convenient to go through a dedicated binding source object.

  • A data-bindable control provides a user interface element that enables the user to read or write the data.

The back-end data source is represented in a VSTO project by the connection string passed to the adapter; everything else is represented by a member of the customized host item (the worksheet or document) class.

Let's take a look at these different components in more detail.

Data Sources and Security Best Practices

As you probably noticed in the Connection Wizard, all the information required to connect to the back-end data source is stored in a connection string generated by the wizard. It typically looks something like this:

    Server=MyDataServer; Database=Customers;     Integrated Security=true; 


That is, it says where the database is located, what it is called, and how the user should be authenticated. All this information is potentially sensitive! Use caution when embedding connection strings in your programs. Remember, even without the source code it is very easy to figure out which strings are embedded in a managed application. This particularly applies to connection strings in which, instead of using Windows NT integrated security, you simply embed UserID=eric;Password=BigSecret123 directly.

Furthermore, hard-coded embedded strings in your source code make it hard for developers, testers, end users, and database administrators to update your application should the database connection information change over time. As discussed earlier in this chapter, Visual Studio gives you the option of embedding the connection string in the configuration file. The automatically generated configuration file in our example above looks something like Listing 17.1.

Listing 17.1. A Typical Database Connection String in a Configuration File

<?xml version="1.0" encoding="utf-8" ?> <configuration>   <configSections>   </configSections>   <connectionStrings>     <add name=   "ExcelWorkbook11.Properties.Settings.NorthwindConnectionString"       connectionString="Provider=Microsoft.Jet.OLEDB.4.0;Data       Source="C:\Program Files\Microsoft       Office\OFFICE11\SAMPLES\Northwind.mdb"       providerName="System.Data.OleDb" />   </connectionStrings> </configuration> 


It is also a good idea to use the "principle of least privilege." This is one of the fundamental principles of secure design: Grant exactly as much privilege as you need to get the job doneno more, no less. If your user needs to be able to read from the database but not write to it, for example, do not specify a connection string that gives the user administrator rights to the database. Instead, choose a connection string that specifies a username and password with read-only access. That way, if the username and password are ever compromised, at least the attacker does not get administrator access out of it.

Better still, do not use stored user IDs and passwords at all. Some databases use integrated Windows authentication, so the logged-on user can use his already-authenticated credentials seamlessly. Or if your database system requires a username and password, make the user type them rather than store them. As you'll see later in this chapter, when we discuss adapters, you can change manually the connection string used by the adapter before it fills the dataset. That way, you could ask the user to type his user ID and password and then generate a new connection string from that information.

Datasets

The cornerstone of the VSTO 2005 data model, and of ADO.NET in general, is the dataset. We should motivate the existence of datasets by describing the old way of doing data access. Back in the 20th century, you typically communicated with a database via "ADO Classic" something like this:

  1. Create and open a connection to a database.

  2. Create and execute a database command (such as SELECT partnumber FROM invoices WHERE price>100).

  3. Enumerate the resulting record set.

  4. Close the connection.

This approach worked fairly well, but it had several drawbacks. The principal drawbacks were consequences of the fact that this model requires a live connection to a database. If there are going to be many live connections, the server needs to be scalable and robust, which can be expensive. Therefore, to minimize load upon the server, we want connections to be short-lived. But because the connection is open while the user is enumerating the record set, the connection is typically open for quite some timeas long as the user is working with the data.

Furthermore, even if the server-side expense of keeping connections open is unimportant, this model does not work well in a world where you want to be able to work with your data even if you temporarily lack network connectivity.

A Disconnected Strategy

Database connections are both expensive and necessary; therefore, they must be managed carefully. In a typical ADO application, much developer effort is expended writing code to ensure that the connection is open for as little time as possible while still meeting the needs of the application's users. ADO.NET addresses the problems of ADO by going straight to the root; if we cannot make connections inexpensive, we can at least make them less necessary. ADO.NET, therefore, is fundamentally a disconnected strategy. A typical ADO.NET scenario goes something like this:

  1. Create a DataAdapter to manage the connection to a specific database or other data source.

  2. Set properties on the adapter that tell it what query to execute against the database.

  3. Create a dataset to be filled.

  4. Invoke a method on the adapter to take care of the details of opening a connection, executing the query, saving the results in the dataset, and closing the connection as soon as possible.

  5. Work with the data in the now-disconnected dataset.

  6. When you finish working with the data, invoke a method on the adapter to reopen the connection to the database and update it with any changes.

And indeed, as you will see later in the chapter, when we discuss adapters, VSTO does exactly this on your behalf.

Because the dataset acts much like the original database, the connection need be open only as long as it takes to fill the dataset. After the data has been copied to the dataset, you can query and manipulate the dataset for as long as you want without worrying that you are consuming a valuable database connection.

Furthermore, there is no reason why the data used to fill the dataset has to come from a connected database; you could fill the dataset from an XML file, or write a program to add tables and rows to build one from scratch. Datasets have no knowledge of where the data they contain comes from; if you need it, all that knowledge is encapsulated in the adapter.

Note

The foregoing is not to say that old-fashioned connected data access is impossible, or even discouraged, in ADO.NET; the DataReader class allows for traditional always-connected access to a database. Neither Windows Forms controls nor VSTO 2005 host items/host controls can use DataReaders for data binding, however, so we speak of them no more in this book.


Typed and Untyped Datasets

In Solution Explorer in the Word or Excel projects we created earlier, you will find a NorthwindDataSet.xsd file containing the database schema. This is an XML document that describes the tables, columns, and relationships that make up the dataset. One of the child nodes in the Solution Explorer tree view is NorthwindDataSet.Designer.cs. This file contains the automatically generated code for the dataset and table adapters.

The first line of the declaration is interesting:

Partial Class NorthwindDataSet

The generated class is partial so that if you need to add your own extensions to it, you can do so in a separate file; it is a bad idea to edit automatically generated files. More important, this dataset extends the System.Data .DataSet class. A System.Data.DataSet consists of a collection of data tables. As you would expect, data tables consist of a collection of data columns and data rows. Each class exposes various collections as properties that allow you to navigate through the dataset.

System.Data.DataSet is not an abstract class; you can create instances and fill them from any back-end data source. But that would be an untyped dataset; the NorthwindDataSet is a typed dataset. Untyped datasets give you great flexibility but are so general that they are somewhat harder to use.

If you were to fill an untyped dataset with data from the Northwind database file, for example, you could access a particular datum with an expression such as this:

name = myDataSet.Tables("Products").Rows(1)("ProductName") 


But that flexibility comes at a cost: You can accidentally pass in a bad table name or a bad column name, or make a bad assumption about the type of the data stored in a column. Because none of the structure of the tables or types of the columns is known at compile time, the compiler is unable to verify that the code will run without throwing exceptions. Also, the IntelliSense engine is unable to provide any hints about the dataset's structure while you are developing the code.

Typed datasets mitigate these problems. A typed dataset is a class that extends the dataset base class; it has all the flexible, untyped features of a regular untyped dataset but also has compile-time strongly typed properties that expose the tables by name. A typed dataset also defines typed data table and data row subclasses.

As you can see from the NorthwindDataSet.Designer.vb file, the typed dataset has public properties that enable you to write much more straightforward code, such as this:

name = myDataSet.Products(1).ProductName 


Typed datasets extend untyped datasets in many ways; some of the most important are as follows:

  • Tables are exposed as read-only properties typed as instances of typed data tables.

  • Tables have read-only properties for each column.

  • Tables have an indexer that returns a typed data row.

  • Event delegates for row change events pass typed change event arguments. Each row type has a row-changing, a row-changed, a row-deleting, and a row-deleted event. (You may be wondering where the row-adding and row-added events are. The changing/changed events pass a DataRowAction enumerated type to indicate whether the row in question was newly created.)

  • Tables provide methods for adding and removing typed data rows.

  • Rows provide getters, setters, and nullity testers for each column.

In short, it is almost always a good idea to use a typed dataset. Weakly typed code is harder to read, harder to reason about, and harder to maintain.

Adapters

Take a look at the Startup event handler in either the Word or the Excel example above. Visual Studio has generated the code in Listing 17.2 automatically on your behalf.

Listing 17.2. Autogenerated Table-Filling Code

Public Class Sheet1  Private Sub Sheet1_Startup(ByVal sender As Object, _    ByVal e As System.EventArgs) Handles Me.Startup    'TODO: Delete this line of code to remove the default    'AutoFill for 'NorthwindDataSet.Products'.    If Me.NeedsFill("NorthwindDataSet") Then      Me.ProductsTableAdapter.Fill(Me.NorthwindDataSet.Products)    End If    'TODO: Delete this line of code to remove the default    'AutoFill for 'NorthwindDataSet.Suppliers'.    If Me.NeedsFill("NorthwindDataSet") Then      Me.SuppliersTableAdapter.Fill( _        Me.NorthwindDataSet.Suppliers)    End If  End Sub End Class 


We discuss what exactly NeedsFill is for in more detail when we discuss data caching later in this chapter and in Chapter 18, "Server Data Scenarios." But for now, this should look fairly straightforward: If the two tables need to be filled from the back-end data source, the adapters fill the appropriate tables.

There are a number of reasons why you might want to not fill the data tables automatically in the Startup event, which is why the comment points out that you can remove the auto-generated code. As mentioned earlier in this chapter, you may want to require that the user enter a database password before attempting to fill the dataset. You can generate a new connection string and then set the adapter's Connection.ConnectionString property.

Or perhaps you want to give the user the option of whether to connect to the back end. If the user is on an expensive or slow connection, the user may want to skip downloading a large chunk of data. For any number of reasons, you may not want to connect right away or use the default connection string, so Visual Studio allows you to modify this startup code.

Visual Studio generates strongly typed custom adapters at the same time that it generates the typed dataset. If you show hidden files and read through the hidden generated adapter code in NorthwindDataSet.Designers.vb, you will see that the generated adapter has been hard-coded to connect to the database specified by the connection string in the configuration file. The bulk of the generated adapter code consists of the query code to handle reading from the back-end data store into the typed dataset and then taking any changes in the dataset and updating or deleting the appropriate rows in the store.

The adapter takes care of all the details of opening the connection, executing the query, copying the data into the dataset, and closing the connection. At this point, we have a local copy of the data, which we can use to our heart's content without worrying about taxing the server further.

When you are done editing the local copy of the data in the dataset, you can use the adapter to update the database with the changes by calling the Update method of the adapter. Then the adapter will take care of making the additions, changes, and deletions to the back-end database.

Note

By default, the adapter assumes that you want optimistic concurrencythat is, other users will be able to update the database unless you are in the process of updating the database. Other concurrency models are possible but beyond the scope of this text. If you want either pessimistic concurrency (that is, the database remains locked the whole time that you have the offline dataset) or destructive concurrency (that is, the database is never locked, even when multiple people are writing at once), consult a reference on ADO.NET to see how to configure your adapter appropriately.


Using Binding Sources As Proxies

Why does Visual Studio bind the controls to a BindingSource proxy object, rather than binding controls directly to the data table?

The reason is because the control can bind to the proxy even if the data in the table is not currently available. Perhaps the data table is going to be derived from a call to a Web service, which will not happen until long after the initialization is complete, or until the user types his password or clicks a button to start the database connection.

The proxy object is created when the customization starts, and controls can be bound to it even if no "real" data is available. When the real data is available, the binding source updates the controls. It is essentially just a thin "shim" that makes it easier to set up bindings before all the data is available.

As you saw in the examples, multiple controls can share the same binding source and, therefore, have the same currencythat is, when one control is updated, every other control linked to the same binding source is also updated automatically. Controls on different worksheets or even on the actions pane can share binding sources and thereby share currency. You will learn about currency management in more detail later in this chapter.

Data-Bindable Controls

The last piece of the data binding puzzle is the host control or Windows Forms control on the spreadsheet or document that actually displays the bound data. There are two flavors of data-bindable controls: simple and complex. Controls that can bind a single datum to a particular property are simple-data-bindable. Controls that can bind multiple rows and/or columns are complex-data-bindable.

In the preceding examples, the list object in Excel and the combo box and data grid Windows Forms controls are complex-data-bindable; the list object and data grid display multiple rows and columns from a table, and the combo box displays multiple rows from a single column. The bookmark and named range controls, by contrast, are simple-data-bindable; only a single datum is bound to the Value property of the named range.

All the Windows Forms controls are simple-data-bindable, as are almost all the Word and Excel host items and host controls. (There is one exception: The Word XMLNodes host control is neither simple- nor complex-data-bindable.) Of the host items and host controls, only Excel's list object is complex-data-bindable.

The behind-the-scenes mechanisms by which controls implement data binding and manage currency are fairly complex; we cover them in more detail toward the end of this chapter. But first, now that we have gotten a little context as to what all these parts are and how they relate, let's take a look at a somewhat more labor-intensive way to do data binding in Excel. This time, we are going to write a few lines of code.




Visual Studio Tools for Office(c) Using Visual Basic 2005 with Excel, Word, Outlook, and InfoPath
Visual Studio Tools for Office: Using Visual Basic 2005 with Excel, Word, Outlook, and InfoPath
ISBN: 0321411757
EAN: 2147483647
Year: N/A
Pages: 221

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net