Adults should put in their work the same seriousness that children put in their games.
-Friedrich Nietzsche
Almost all data-driven applications look alike. They need to fetch data across different tables, join columns, and filter data based on some user-specific criteria. Data-driven applications must also format data in such a way that it can be easily (and pleasantly) consumed by end-users. In this context, the organization of the output, pagination, and editing models are key issues. More often than not, data is made of cross-related tables in which parent/child relationships are necessary to define the data model and to extract a specific subset of information (for example, detail views and subtotals) quickly and easily.
All distributed applications must be scalable to some extent. In general, Web applications are more sensitive to the issue of scalability than desktop, client/server applications. If a Web application is deployed over the Internet and made open to anonymous users—for example, a shopping Web site—providing a good, if not optimal, level of scalability is a sort of survival test. Scalability is related to performance, and performance in turn affects scalability, but the two are not synonymous. For example, an application with a relatively long response time can still have an overall software architecture that scales well, either up or out, to avoid performance degradation as the number of users grows. On the other hand, you will often run across extremely fast and super-optimized applications whose designs are not very scalable. A general law of performance tuning states that you must know the underpinnings of a system to devise tricks to increase the throughput. In summary, optimizing performance involves making assumptions and choosing average values as indicators of the typical workload. In contrast, scalability refers to the system's ability to maintain the same level of performance as the number of concurrent users and the workload grow.
In general, caching data does not seem to be an optimal choice for all Web applications because it consumes valuable resources on a per-user or per-application basis. As a result, having less memory available to the Web server might lead to performance degradation. An application that needs to be only fast should be as stateless as possible and, consequently, avoid caching state information. On the other hand, if you're developing an application that needs to be both fast and scalable, you should carefully consider caching data at some level to reduce the workload on middle-tier processing components. In addition, the more tiered your system is, the more options you have to scale it out as long as disconnected caches of data can be moved across the tiers. In the area of speed and scalability, ADO.NET helps considerably. ADO.NET doesn't perform magic or provide a ready-to-use solution for every scenario. The true power of ADO.NET is the set of tools it supplies for you to build highly scalable and reasonably fast Web applications suited to your needs.
The DataSet and its satellite classes (DataRelation in particular) are efficient and effective tools that primarily offer options to accomplish common tasks in real-world, data-driven Web applications. After evaluating the characteristics and expectations of your particular application, you should choose the right combination of old-style and new-style programming. Do not trust those who depict ADO.NET as a magic wand that makes Web applications fast and scalable with nothing more than a simple touch. And also don't trust those who completely disregard ADO.NET. At a minimum, ADO.NET provides you with new and powerful tools. It's up to you, then, to decide in any given situation whether they can help or not.
This chapter is divided into three parts, the first of which is about the ADO.NET best practices for executing cross-table commands—the typical commands of a real Web application. The other parts tackle two specific issues—subtotaling and master/detail views—whose effective implementation depends to some extent on the execution of compound queries.
Even the simplest Web application has a rather complex and interrelated database structure. While a database with dozens of related tables, views, constraints, and triggers is not that common, I have yet to see a (nontoy) Web application with fewer than a handful of related tables. As a result, the need for Web applications to execute commands against a multitude of tables is more concrete than ever and, therefore, must be properly addressed.
The problem with more complicated databases has to do with both queries and updates. Users should be able to access and filter related rows, such as all orders generated within a given period by a particular customer. They also must be able to update or delete rows in a way that does not violate the consistency and integrity of the database. Some databases resort to server-side constraints to block undesired changes. The fact that most database administrators use this strategy to stay on the safe side is no excuse for writing client code that could potentially corrupt the database. Ideally, client applications should submit changes that respect the database structure and its internal update rules (for example, relationships and constraints).
In light of the ADO.NET capabilities, we can recognize two ways of conducting multiple queries that span across multiple tables—an old way and a new way. The old-fashioned way is typical of ADO and any former data-access technology. Based on the SQL JOIN command, this well-known programming technique returns a single data structure that is quite easy to filter but hard to update and is populated with redundant data.
The alternative approach, touted by the ADO.NET disconnected model, exploits the capabilities of the DataSet object—that is, a multitable cache of data. The idea is that you split the original query into two or more distinct queries and gather the results in separate tables within the same DataSet object. The queries can be run individually or grouped in a stored procedure or in a batch. Using split queries results in a much more compact data set that can be easily managed via the programming interface of the DataSet and DataRelation objects. Separate queries managed through ADO.NET support cascading changes and batch updates, and they don't require synchronization code to be written on your end to retrieve the children of a parent row. On the downside, filtering records over a relation is not particularly easy.
Using Joined Queries
Suppose you need to obtain, grouped by customer, all orders issued in a given year. You don't want all the orders, though, only those with at least 30 items. In addition, you want the name of the employee who processed the order and the name of the company that issued it. The query can be written like this:
SELECT o.CustomerID, od.OrderID, o.OrderDate, o.ShippedDate,
    SUM(od.Quantity * od.UnitPrice) AS price, c.CompanyName, e.LastName
FROM Orders o
    INNER JOIN Customers c ON c.CustomerID=o.CustomerID
    INNER JOIN Employees e ON e.EmployeeID=o.EmployeeID
    INNER JOIN [Order Details] od ON o.OrderID=od.OrderID
WHERE (YEAR(o.OrderDate) = 1997)
GROUP BY o.CustomerID, c.CompanyName, od.OrderID, o.OrderDate,
    o.ShippedDate, e.LastName
HAVING (SUM(od.Quantity) > 30)
ORDER BY o.CustomerID, od.OrderID
The query spans the Northwind tables—Orders, Order Details, Customers, and Employees. It retrieves all orders with more than 30 items and is grouped by customer and filtered by year. Viewed through Microsoft SQL Server Query Analyzer, the result set looks like Figure 8-1.
Figure 8-1: The result of a query that makes intensive use of JOIN statements.
You can immediately see how much redundant data the result set contains. The query counts about 300 records, and each of them has a high likelihood of containing a duplicate customer or employee name. The main characteristic of a join query is that it returns data as a monolithic block. As mentioned, this is good news and bad news at the same time. The good news is the data is held in the same block of memory, works great for presentation purposes, and can be further filtered by simply adding some criteria to the object that represents it—either an ADO Recordset or an ADO.NET DataTable. The bad news is, the result set does not make good use of memory and might raise scalability issues. Also, keeping changes synchronized with the database is not trivial. Finding a match between an element in the result set and a table record is challenging, and this makes it hard to keep the data synchronized and even to apply changes.
ADO hierarchical recordsets were an attempt to better sort things out. They gave developers a chance to design a hierarchical query through a made-to-measure (and, frankly, horrible) syntax. As a result, a tree of parent and child Recordset objects reduced the redundancy of data and made synchronization simpler. The main drawbacks of ADO hierarchical recordsets are the inability to access more than one data source and the overly simple update model. To this list, we should then add the annoyance of working with the notorious SHAPE language, which is needed to build such recordsets.
Another subtle point in favor of joined queries is the inherent transaction support that SQL Server (and in general any database management system) guarantees to individual commands. A complex query, especially if it includes subqueries or views, is made of different low-level operations that run in the context of the same implicit transaction. This guarantees that no concurrency issue would ever be raised, even for highly volatile databases. We'll return to this point later in the chapter in the "Respecting Data Integrity" section.
Paging Through the Results of Any Query
In Chapter 7, "Paging Through Data Sources," we discussed various ways of paging through the results of a query. In particular, we focused on building queries that return a page of data. However, we used very simple commands to query just a single table. The paging techniques we discussed can be applied to any query, even though they might sometimes require adjustments. In general, the pattern used to retrieve a page is as follows:
SELECT * FROM
  (SELECT TOP PageSize * FROM
    (SELECT TOP PageSize*PageIndex * FROM
      (query ORDER BY KeyColumn ASC) AS t0) AS t1
   ORDER BY KeyColumn DESC) AS t2
ORDER BY KeyColumn ASC
Note that if you're going to sort by multiple columns you must add the DESC qualifier to all column names. If the query placeholder is a subquery, it can contain an inner ORDER BY clause only if a TOP clause is present. If you don't actually need a TOP clause, use TOP 100 PERCENT.
To greatly simplify coding, you could also define a view. Adapting the random paging sample we saw in Chapter 7 to work on the aforementioned orders query requires only minor changes. (See joins.aspx in the book samples.)
Using Distinct Queries
An alternative approach made particularly attractive by ADO.NET is separating an otherwise nested and complex query into multiple simpler queries. There's a double advantage to doing so. First, the amount of data returned to the client is significantly smaller. Second, the functionality of the ADO.NET DataSet object makes managing multiple tables straightforward.
Next, I'll show you a way to split the preceding query into smaller and relatively independent pieces. The original command is broken into three subqueries: one that returns a filtered view of orders and their details, one that returns the related employees, and one that returns the related customers.
The first query (shown below) selects all orders containing at least 30 items. In this slightly revised version, the query doesn't join with the Customers and Employees tables to get foreign information about the customer and employee who participated in the order.
SELECT o.CustomerID, od.OrderID, o.OrderDate, o.ShippedDate,
    SUM(od.Quantity * od.UnitPrice) AS price
FROM Orders o
    INNER JOIN [Order Details] od ON o.OrderID=od.OrderID
WHERE (YEAR(o.OrderDate) = 1997)
GROUP BY o.CustomerID, od.OrderID, o.OrderDate, o.ShippedDate
HAVING (SUM(od.Quantity) > 30)
ORDER BY o.CustomerID, od.OrderID
The other two queries serve only the purpose of providing extra information about the involved employees and customers. In their simplest form, the queries are written as follows:
SELECT employeeid, lastname FROM employees
SELECT customerid, companyname FROM customers
Of course, in this case all employees and customers will be selected, which is not exactly what a joined query would return. However, depending on the structure and goals of the application, this shortcut might be acceptable. In general, employees and customers represent data that has low volatility and, as such, is ideal for caching. The issue here is the size of the tables. If your company has thousands of employees and customers, caching entire tables might not be the best approach.
A query that returns only the employees and the customers referenced by the first query might be much more complex to write, however. It mostly depends on the complexity of the search criteria. The more sophisticated the criteria are, the more complex the child queries become and, consequently, the higher the cost of running them. As usual, fine-tuning the query subsystem of an application is a delicate art.
Let's continue with the example and analyze what's needed to obtain only the employees and customers involved with the orders selected by the first query. The criteria set for these queries require you to group the information by order and compute the total quantity of items per order (more than 180 in the following sample). Given this subset of orders, you can then extract distinct employees or customers:
SELECT DISTINCT * FROM
    (SELECT c.customerid, c.companyname
     FROM Customers c
         INNER JOIN Orders o ON o.customerid=c.customerid
         INNER JOIN [Order Details] od ON o.OrderID=od.OrderID
     WHERE (YEAR(o.OrderDate) = 1997)
     GROUP BY o.orderid, c.customerid, c.companyname
     HAVING (SUM(od.Quantity) > 180)
    ) AS t1
Similarly, for employees you use the following third query:
SELECT DISTINCT * FROM
    (SELECT e.employeeid, e.lastname, e.title
     FROM employees e
         INNER JOIN Orders o ON o.employeeid=e.employeeid
         INNER JOIN [Order Details] od ON o.OrderID=od.OrderID
     WHERE (YEAR(o.OrderDate) = 1997)
     GROUP BY o.orderid, e.employeeid, e.lastname, e.title
     HAVING (SUM(od.Quantity) > 180)
    ) AS t1
As you can see, in this case we're going to run three complex queries rather than just one. In other circumstances, filtering the satellite tables is considerably easier and faster. As we'll also see throughout this chapter, ADO.NET is not a magic solution and effective SQL commands are now, more than ever, essential for creating effective applications.
Note |
Many Web applications cache global data in global objects such as Cache, Application, or Session. (We'll review these objects in detail in Chapter 14, "ASP.NET State Management.") A warning, though, must be raised here. Suppose you decide to split a complex query into separate queries and also decide to keep child queries simple by creating look-up tables with employee and customer information. Where do you cache that data? If you store data in Session or Cache—it depends on the required scope—be aware that both objects might be subject to unexpected clean-up because of process recycling (see Chapter 2), low memory conditions, or both. You should always check whether the cache is still there and reload it if necessary. In summary, consider that look-up tables might need to be frequently refreshed. |
All the queries you obtained can be grouped in a SQL batch or stored procedure and handed off to a data adapter for actual population. You get back a DataSet with as many tables as there are queries in the statement. You can access and index tables individually and create in-memory relations between tables. Note that relationships must be set up manually because no feature in ADO.NET provides for automatic reading and processing of database relationship objects and constraints. For more information about ADO.NET relations, see Chapter 5, "The ADO.NET Object Model."
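In code, handing the batch off to a data adapter looks like the following sketch. The batch text and the connection are assumed to be already set up; the explicit table mappings are needed because, by default, the result sets past the first would be named Table1 and Table2.

SqlDataAdapter adapter = new SqlDataAdapter(cmdBatchText, conn);
// Map the three result sets of the batch to meaningful table names
adapter.TableMappings.Add("Table", "Orders");
adapter.TableMappings.Add("Table1", "Customers");
adapter.TableMappings.Add("Table2", "Employees");
DataSet data = new DataSet();
adapter.Fill(data);    // one DataTable per query in the batch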
Tip |
Relations also offer a quick way to simulate in-memory JOINs. Suppose you have two tables (named Orders and Employees) filled with information on orders and employees, respectively. To create a relation between the two tables based on the common field employeeid, you would use the following code:
DataRelation rel = new DataRelation("Emp2Ord",
    employees.Columns["employeeid"], orders.Columns["employeeid"]);
data.Relations.Add(rel);
Finally, to add a new column to the Orders table that points to the lastname column in the parent table according to the relation, you would use the following code:
orders.Columns.Add("EmployeeName", typeof(string), "Parent(Emp2Ord).lastname");
To read the last name of the employee, you would simply refer to the newly created EmployeeName column on the in-memory Orders table. |
Respecting Data Integrity
Another issue to consider is data consistency. A complex query that runs as a single SELECT statement works in the context of an implicit transaction and looks like an atomic piece of code. When you split that into separate queries, you should also consider the impact that concurrency might have on the state of the database. What if a customer is concurrently deleted or an order is modified between the subqueries? It might or might not be a problem for the application. In case it is, though, you must consider protecting the subqueries in a transaction so that other users are prevented from updating or inserting rows while the queries are executing.
A transaction is a unit of work that executes according to four principles—atomicity, consistency, isolation, and durability—often cumulatively known as ACID. For our purposes, the key element is isolation. The isolation property refers to the database's ability to shield each running transaction from the effects of other concurrently running transactions. If a transacted query is running at the same time someone else is updating the database, the final results can vary depending on the level of isolation set. Normally each transacted operation can have the level of isolation it needs. If the application needs absolute data consistency and cannot afford phantom rows, the only appropriate isolation level is serializable. A serializable transaction unlocks the tables only upon completion. With this level of isolation, reading uncommitted data (a process known as dirty reads) and getting phantom rows is not possible. Phantom rows refer to any situation in which transactions work on misaligned sets of data. A phantom row is a row that one transaction added but that is unknown to another concurrent transaction. Likewise, a row visible in one transaction but physically deleted by another is an example of a phantom row. Serializable transactions should be the last resort in database programming because they seriously affect the overall concurrency of the system.
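If you do decide to protect the split queries, a sketch like the following illustrates the idea (the connection, the command text, and the DataSet are assumed to be already defined):

conn.Open();
SqlTransaction tran = conn.BeginTransaction(IsolationLevel.Serializable);
try
{
    SqlDataAdapter adapter = new SqlDataAdapter(cmdOrders, conn);
    adapter.SelectCommand.Transaction = tran;
    adapter.Fill(ds, "Orders");
    // ...fill Customers and Employees with the other two queries...
    tran.Commit();
}
catch
{
    tran.Rollback();
    throw;
}
finally
{
    conn.Close();
}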
If you need the power of serializable transactions without its burden, you can opt for a sort of compromise between joined and distinct queries. You first run the query by using nested JOINs as needed. Next, once the result set has been downloaded in an ADO.NET environment, you process the original DataTable and split its contents into smaller tables. On the server, the query works as a monolithic piece of code; the application, though, sees it as the union of distinct but related tables.
SELECT o.customerid, c.companyname, e.employeeid, e.lastname,
    od.orderid, o.orderdate, o.shippeddate,
    SUM(od.quantity*od.unitprice) AS price
FROM Orders AS o
    INNER JOIN Customers AS c ON c.customerid=o.customerid
    INNER JOIN Employees AS e ON e.employeeid=o.employeeid
    INNER JOIN [Order Details] AS od ON o.orderid=od.orderid
WHERE Year(o.orderdate) = @TheYear AND od.orderid=o.orderid
GROUP BY o.customerid, c.companyname, od.orderid, o.orderdate,
    o.shippeddate, e.employeeid, e.lastname
HAVING SUM(od.quantity) > @TheAmount
ORDER BY o.customerid
The next listing shows the structure of the code that works on the preceding query:
DataSet SplitData(DataSet ds)
{
    // Assumes that the DataSet has one table named Orders. The 2nd
    // column is CompanyName; the 4th column is LastName
    DataSet _dataset = new DataSet();

    // Make a full worker copy of the DataSet
    _dataset = ds.Copy();

    // Extract tables
    CreateCustomers(_dataset, ds);
    CreateEmployees(_dataset, ds);

    // Remove the columns moved into the distinct tables (companyname and
    // lastname) from Orders. Removing lastname first keeps the index of
    // companyname valid.
    _dataset.Tables["Orders"].Columns.RemoveAt(3);   // lastname
    _dataset.Tables["Orders"].Columns.RemoveAt(1);   // companyname

    return _dataset;
}
The code makes a full deep copy of the source DataSet, which will be used as the Orders table in the resulting new DataSet. This new DataSet then adds dynamically created tables for customers and employees. Finally, the columns now included in child tables are removed from the Orders table. The following code shows how the satellite Customers table is created:
// Create Customers
static void CreateCustomers(DataSet _dataset, DataSet orig)
{
    DataTable _customers = orig.Tables["Orders"].Clone();
    _customers.TableName = "Customers";

    // Remove unneeded columns (no data is involved).
    // Only the first two columns (CustomerID and CompanyName) are kept
    while (_customers.Columns.Count > 2)
        _customers.Columns.RemoveAt(2);

    // Fill with data
    string customerKey = "";
    foreach (DataRow row in _dataset.Tables["Orders"].Rows)
    {
        // Rows are already sorted by CustomerID
        if (customerKey != row["customerid"].ToString())
        {
            // select distinct
            _customers.ImportRow(row);
            customerKey = row["customerid"].ToString();
        }
    }

    // Add to the DataSet
    _dataset.Tables.Add(_customers);
}
Note |
A similar approach can also be used if you have an existing stored procedure to retrieve the data and want to manage its return data through distinct in-memory tables. If you can't modify the existing stored procedure, this approach is a viable solution. |
When it comes to updating cross-related tables, the order in which each constituent statement executes is critical. For example, you can't delete an order without also deleting all its details. Likewise, you aren't normally allowed to add an invoice detail row without first adding its parent row. When related tables are involved, changes flow from the parent to the children in different ways, depending on the type of change—update, delete, or insert.
In general, there are two ways of conducting table updates—through a direct connection or batch update. In the direct connection case, you typically open a connection and execute the stored procedure or the SQL batch with all the logic and data you need to apply. In the batch update case, you typically yield to a middle-tier component, which takes care of submitting data to the database, possibly in an asynchronous way. This approach in a Web scenario requires you to serialize the data to be submitted from the Web server layer down to an internal tier, which holds a physical connection to the database. The availability of an easy-to-serialize object such as the DataSet and the powerful batch update mechanism of data adapters give you the tools to build highly scalable applications. But how can you effectively use these tools?
Cascading Changes
As mentioned in Chapter 5, a foreign key constraint set on a DataTable restricts the action performed when a value in a column is either deleted or updated. In a parent/child relationship between two tables, deleting a value from the parent table can affect the child rows in various ways. Typically, the child rows are also deleted. However, in some cases, the cascading action sets the child rows to null or default values. The action that works best for you can be programmatically selected using the programming interface of the ForeignKeyConstraint class.
What really matters, though, is that ADO.NET provides the tools to configure the runtime environment to automatically cascade changes from parent to child rows. Such a mechanism is important because it allows you to work on a DataSet object, enter changes, and if needed, have them propagated through the tree. When you have gone through all the changes, ADO.NET guarantees that all the tables in the DataSet have been properly updated. In addition, the DataSet is a disconnected and serializable object and is ideal for delivering the new image of the data to the middle-tier component for physical updates.
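As a quick sketch, the following code configures a constraint so that deleting a customer row also removes its child orders from the in-memory DataSet (the table and column names follow the samples used so far):

ForeignKeyConstraint fk = new ForeignKeyConstraint("Cust2Ord",
    data.Tables["Customers"].Columns["customerid"],
    data.Tables["Orders"].Columns["customerid"]);
fk.DeleteRule = Rule.Cascade;    // delete the child rows along with the parent
fk.UpdateRule = Rule.Cascade;    // propagate key changes to the child rows
data.Tables["Orders"].Constraints.Add(fk);
data.EnforceConstraints = true;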
The Batch Update
Data submission is the process in which all in-memory changes are passed on to the back-end system for permanent storage. In ADO.NET, submission does not consist of a block of data being sent to the database—Microsoft SQL Server 2000 or any other—in a single shot. An ADO.NET batch update just executes individual statements on the target system, one for each change that needs to be submitted. For the most part, statements will be SQL statements.
The DataSet object can submit data to the database in batch mode by using the data adapter's Update method. Data can be submitted only on a per-table basis. The Update method prepares and calls a tailor-made statement for each inserted, updated, or deleted row in the specified DataTable object.
adapter.Update(dataSet, tableName);
The cascading changes feature guarantees that all the tables in the DataSet are in sync and consistent. The batch update is its natural complement, as it allows you to submit changes, one table at a time, to the database.
Rows are scanned and processed according to their natural order (their position in the table's Rows collection). If you need to process rows in a particular order, you must split the overall update process into various subprocesses, each working on the selected bunch of rows you need. For example, if you have parent/child related tables, you might want to start by modifying rows in both tables. Next, you delete rows in the child table, and after that, you delete rows in the parent table. Finally, you insert new rows in the parent table and finish with insertions into the child table. The following code snippet shows how to select and submit only the rows that have been added to a table:
// Submit all the rows that have been added to a given table
DataRow[] arrayOfRows = table.Select("", "", DataViewRowState.Added);
adapter.Update(arrayOfRows);
This solution is made possible by the fact that one of the Update overloads takes an array of DataRow objects, an overload that provides for the greatest flexibility.
Important |
When serializing a DataSet object across the tiers of a system for update purposes, you're much better off using the subset of the DataSet that contains the changed rows. You get this smaller version by using the GetChanges method. If you're batch-updating from within the same AppDomain, though, this technique won't result in a significant improvement in performance. |
Detecting and Resolving Update Conflicts
Data updates are based on a decidedly optimistic vision of concurrency. What happens if, by the time you attempt to apply your changes to the back-end system, someone else has modified the same records? Technically speaking, in this case you have a data conflict. How to handle conflicts is strictly application-specific, but the reasonable options can be summarized in three points: first-win, in which the conflicting change is simply canceled and the values already in the database are preserved; last-win, in which your change is applied anyway and overwrites whatever is in the database; and ask-the-user, in which the application detects the conflict and lets the user decide.
For Web applications, you normally choose one of the first two options and try to ensure that the update always ends without going back to the user for further interaction.
Important |
Conflicts are not detected automatically by some special feature built into the ADO.NET runtime. Conflicts are caught only if the commands used to carry the batch update have certain characteristics. In particular, conflict-aware commands have quite a long WHERE clause, one in which the primary key is first checked to verify the identity of the row and then a number of other columns are checked against the values they had when the DataSet was originally built. You control the columns in the WHERE clause in two ways: either by writing your own commands (as shown in the upcoming text) or by setting the SelectCommand property of the data adapter. (See Chapter 5.) An alternative way of detecting conflicts is by using a timestamp column. You add a timestamp column to a table and set it with the current time whenever you update the table. This trick greatly simplifies both the operations needed to check whether a conflict is going to happen (forward checking) and the command itself. Checking a single field is much easier than checking multiple columns! |
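To make the previous point concrete, here is a sketch of a conflict-aware update command. The parameter names and the choice of columns are assumptions of this example; the key is that the WHERE clause checks the original values read when the DataSet was built.

SqlCommand cmd = new SqlCommand(
    "UPDATE employees SET lastname=@newLastName " +
    "WHERE employeeid=@empID AND lastname=@origLastName", conn);
SqlParameter p;
p = cmd.Parameters.Add("@newLastName", SqlDbType.NVarChar, 20);
p.SourceColumn = "lastname";
p.SourceVersion = DataRowVersion.Current;
p = cmd.Parameters.Add("@empID", SqlDbType.Int);
p.SourceColumn = "employeeid";
p.SourceVersion = DataRowVersion.Original;
p = cmd.Parameters.Add("@origLastName", SqlDbType.NVarChar, 20);
p.SourceColumn = "lastname";
p.SourceVersion = DataRowVersion.Original;
adapter.UpdateCommand = cmd;
// If another user has already changed the row, no record matches the WHERE
// clause and the adapter raises a DBConcurrencyException for that row.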
Stored Procedures in Batch Updates
One of the biggest drawbacks of the ADO batch update was that it was impossible for the user to intervene and modify the commands being used for the update. As discussed in Chapter 5, ADO.NET allows you to configure the command—insert, update, or delete—to use for each type of change and gives you a chance to link its parameters to source columns. The following code shows how to use a stored procedure for all the updated rows of the table:
adapter.UpdateCommand = new SqlCommand();
adapter.UpdateCommand.CommandText = "sp_myUpdate";
adapter.UpdateCommand.CommandType = CommandType.StoredProcedure;
adapter.UpdateCommand.Connection = conn;

SqlParameter p1 = new SqlParameter("@empID", SqlDbType.Int);
p1.SourceVersion = DataRowVersion.Original;
p1.SourceColumn = "employeeid";
adapter.UpdateCommand.Parameters.Add(p1);

SqlParameter p2 = new SqlParameter("@firstName", SqlDbType.NVarChar, 10);
p2.SourceVersion = DataRowVersion.Current;
p2.SourceColumn = "firstname";
adapter.UpdateCommand.Parameters.Add(p2);

SqlParameter p3 = new SqlParameter("@lastName", SqlDbType.NVarChar, 20);
p3.SourceVersion = DataRowVersion.Current;
p3.SourceColumn = "lastname";
adapter.UpdateCommand.Parameters.Add(p3);
Notice the pair of properties, SourceColumn and SourceVersion, that allows you to define what column and what version of the column should be bound to the parameter. The possibility of using a stored procedure adds an unprecedented level of flexibility to the batch update, especially when cross-referenced tables are involved.
Using Transactions in Batch Updates
To execute transacted commands, you can either use the transaction-related statements of the database management system (DBMS) you're working with or rely on the .NET data provider's abstraction of a transaction. There's virtually no difference between the two approaches, as the .NET abstraction of a transaction simply exists in a managed wrapper that triggers a call to the underlying transactional infrastructure. When you call the BeginTransaction method on the connection object of a .NET data provider, a new transaction command executes over the connection. Let's consider the following snippet:
conn.Open();
SqlTransaction tran = conn.BeginTransaction();
tran.Commit();
conn.Close();
If you spy on the process by using a profiler tool, you'll see the following sequence of commands hitting the database:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED; BEGIN TRANSACTION
COMMIT TRANSACTION
So the ADO.NET transaction object is useful because it makes the code you're writing less dependent on the physical structure of the database, but no significant difference occurs under the hood.
Once created, the transaction must be associated with the command or the commands you are about to run. For batch updates, this means the UpdateCommand, InsertCommand, and DeleteCommand must share the same transactional context.
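A minimal sketch of the plumbing follows; the connection, the adapter, and its commands are assumed to be already configured:

conn.Open();
SqlTransaction tran = conn.BeginTransaction();
adapter.InsertCommand.Transaction = tran;
adapter.UpdateCommand.Transaction = tran;
adapter.DeleteCommand.Transaction = tran;
try
{
    adapter.Update(dataSet, "Orders");
    tran.Commit();
}
catch
{
    tran.Rollback();
    throw;
}
finally
{
    conn.Close();
}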
Retrieving Server-Generated Values
A frequent problem of cross-table updates is that you have to use some values generated by the database for further statements within the same stored procedure or SQL batch. A typical situation in which this problem arises is when triggers are involved as well as identity columns. If you're using direct commands or stored procedures, there's not much that ADO.NET does for you. You must address the issue using the tools the database provides you with. Typically, using a subsequent query that reads critical values back fits the bill. The values can be exposed through output parameters or additional result sets. If identity columns are involved, you use the identity functions of the DBMS to retrieve them. For example, with SQL Server 2000, you use the SCOPE_IDENTITY function.
If you're using batch updates, ADO.NET provides a couple of built-in mechanisms that enable you to know the values actually stored in the database as a result of the update. The first mechanism is based on the UpdatedRowSource property of the command object. The other mechanism leverages the RowUpdated event on the data adapter object. Let's examine how to use each to retrieve the value of an identity column when a new record is added.
The UpdatedRowSource property tells the command object to look for values returned by the INSERT and UPDATE statements and to assign them to the DataRow object being updated in the batch. The net effect of this property is that, at the end of the batch, each inserted or updated row in the DataTable contains exactly the values of its counterpart in the database. If identity columns or triggers are around, the original, in-memory values of the row are automatically updated with the actual values stored in the database. Sounds great? It is, indeed, but for it to happen, a bit of work is required on your side. In particular, you have to define the insert and the update commands as stored procedures or SQL batches so that they execute the appropriate query to read values back. In other words, the ADO.NET batch update provides some handy facilities, but there's only one way to get server-generated values—reading them back with a query.
The UpdatedRowSource property is designed to look for output parameters, the first row of the next query, both, or neither. By default, the option to look for both is selected. The following batch inserts a new row into the Employees table, whose employeeid column is an identity column. Next, it retrieves the last identity value generated in the scope.
INSERT INTO employees (lastname, firstname) VALUES (@LName, @FName)
SELECT SCOPE_IDENTITY() AS employeeid
If you assign the previous statements to the InsertCommand property of a data adapter, the UpdatedRowSource property will catch the newly generated identity value and pass it along to the DataRow object being updated. By setting the UpdatedRowSource property to Both (the default) or FirstReturnedRecord, each inserted row processed through the batch update will have its employeeid column updated. You don't have to write any additional code.
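For clarity, here is a sketch of the wiring; setting UpdatedRowSource is shown only to make the intent explicit, since Both is already the default:

adapter.InsertCommand = new SqlCommand(
    "INSERT INTO employees (lastname, firstname) VALUES (@LName, @FName); " +
    "SELECT SCOPE_IDENTITY() AS employeeid", conn);
adapter.InsertCommand.Parameters.Add("@LName", SqlDbType.NVarChar, 20).SourceColumn = "lastname";
adapter.InsertCommand.Parameters.Add("@FName", SqlDbType.NVarChar, 10).SourceColumn = "firstname";
adapter.InsertCommand.UpdatedRowSource = UpdateRowSource.Both;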
Caution |
It's highly recommended that you don't use this technique if commands are to be autogenerated using a command builder. The statements generated by a command builder don't include the required additional query and are generated when the data adapter's Update method is called. At this time, it is too late for editing the command text to add the extra query. You should not use the UpdatedRowSource property if you are using command builders. |
The SCOPE_IDENTITY function returns the last identity value inserted into an identity column in the same stored procedure, trigger, function, or batch. If you're using SQL Server 7.0 or earlier, use @@IDENTITY instead. SCOPE_IDENTITY, IDENT_CURRENT, and @@IDENTITY are similar functions in that they return values inserted into identity columns. For more information on the difference between these features, see the SQL Server Books Online.
Whenever the data adapter is going to update a row within a batch update, it fires a couple of events—RowUpdating and RowUpdated. RowUpdating fires before the update takes place and RowUpdated fires immediately after. The following code shows how to specify an event handler for the RowUpdated event.
adapter.RowUpdated += new SqlRowUpdatedEventHandler(RowUpdated);
The event passes back a data structure of type SqlRowUpdatedEventArgs in which the Row property returns an updated row. (We're using SQL Server as the data provider; the code would have to be adapted for other providers.) The RowUpdated event is provided mostly for databases that don't support stored procedures, multiple queries, and output parameters. Want an example? Microsoft Access.
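The handler below is a sketch of the typical pattern: when a row has just been inserted, read the identity value back on the same connection and copy it into the row. The SQL Server types are used only to match the registration code above; the technique matters most for providers, such as the one for Access, that cannot return the value through output parameters or extra queries.

void RowUpdated(object sender, SqlRowUpdatedEventArgs e)
{
    if (e.Status == UpdateStatus.Continue &&
        e.StatementType == StatementType.Insert)
    {
        // Runs on the same connection (and transaction) used by the insert
        SqlCommand cmd = new SqlCommand("SELECT @@IDENTITY",
            e.Command.Connection, e.Command.Transaction);
        // Assumes the employeeid column is writable in the in-memory table
        e.Row["employeeid"] = Convert.ToInt32(cmd.ExecuteScalar());
    }
}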
Important |
Bear in mind that the automatic update of the DataRow occurs only during batch updates and only if you add output parameters to the command (for example, a stored procedure) or additional queries to retrieve particular fields. Also note that the name of the retrieved column must match the name of the DataTable column. That's why we aliased SCOPE_IDENTITY in the earlier sample. |
In ASP.NET, the DataGrid is the ideal tool for building feature-rich data reports. As we've seen in Chapter 6, "Creating Bindable Grids of Data," and Chapter 7, it provides for paging, sorting, and advanced customization of the data displayed. Despite its adeptness at building feature-rich reports, the DataGrid does not have an ad hoc feature to automatically compute functions on the values listed in a given column. In Chapter 6, we demonstrated how to add a footer to a table built with the DataGrid and how to modify the structure of the footer to display totals. In computing totals for a column, though, we exploited the programming interface of the DataTable and DataRelation objects.
Keep in mind that a total is a scalar value you display only in a single place—typically, a cell in the footer bar. Displaying a total doesn't in any way affect the rows you display in the grid. Unfortunately, the same can't be said for displaying column subtotals—they do affect rows you display in the grid.
A subtotal is a non–data-bound row that is interspersed with the data rows of the grid. Like a footer bar, it can have any structure and normally displays summary information that relates to the records found after the beginning of the table or the last subtotal. For example, if your grid displays orders grouped by customers, a subtotal row could count the number and total amount of all orders issued by each customer. The DataGrid control doesn't provide any built-in facility that works in this scenario, but with an appropriate combination of SQL and ASP.NET coding you can make it happen.
The key point to understand about DataGrid controls is that they don't distinguish between rows in the bound data source. All the bound rows are displayed and any attempt to insert fake rows programmatically is destined to fail or, even worse, to violate the consistency of the output. On the other hand, bound rows can be displayed with different styles, even radically different styles.
As a result, any subtotaling rows you want to display must be part of the data source you associate with the grid. These rows must be easily recognizable so that the ItemDataBound and ItemCreated event handlers can apply any necessary styling. How can you add summary rows to the data source? There are two possibilities. You can either ask the database to compute subtotal rows, or you can add blank rows and populate them with ad hoc data. In many cases, the database subtotaling capabilities will match your expectations. If you need to control the whole process of aggregation, or if the database you're targeting doesn't provide subtotaling support, you should opt for a pure ADO.NET approach.
The WITH ROLLUP Operator
Subtotaling makes particular sense if you're also grouping data. SQL Server provides the WITH ROLLUP operator as an extension to the GROUP BY clause of a SELECT statement. The operator works by adding summary rows to the result set according to a specified template. The summary row will have the same layout as all other columns, but you can customize the content of each column to some extent. Let's consider a sample query: getting all orders issued in a given year (say, 1997) grouped by customers. The SQL query to start with is the following:
SELECT o.customerid, od.orderid, SUM(od.quantity*od.unitprice) AS price
FROM Orders o, [Order Details] od
WHERE YEAR(o.orderdate)=1997 AND od.orderid=o.orderid
GROUP BY o.customerid, od.orderid
ORDER BY o.customerid, od.orderid
Figure 8-2 shows the results as they appear in SQL Query Analyzer. The arrows indicate where we would like to have subtotal rows.
Figure 8-2: A result set that groups order information by customer.
Let's rewrite the query to use the WITH ROLLUP operator and compute order subtotals per customer:
SELECT o.customerid AS MyCustomerID,
    CASE GROUPING(od.orderid)
        WHEN 0 THEN od.orderid
        ELSE 99999
    END AS MyOrderID,
    SUM(od.quantity*od.unitprice) AS OrderAmount
FROM Orders o, [Order Details] od
WHERE Year(orderdate) = 1997 AND od.orderid=o.orderid
GROUP BY o.customerid, od.orderid WITH ROLLUP
ORDER BY o.customerid, MyOrderID
The first thing to notice is that the GROUP BY clause has been extended to include the WITH ROLLUP operator. The second column—originally od.orderid—has been replaced with a CASE...WHEN construct, which is a simple flow controller. The resulting MyOrderID column contains the corresponding od.orderid value if the value returned by GROUPING is 0; otherwise, it contains 99999. This brings us to the role of the GROUPING function.
GROUPING is an aggregate function that takes the name of a column in the GROUP BY clause and returns an additional column with 0/1 values. A value of 0 means the row has been generated by the query; a value of 1 indicates the row is a summary row added by the ROLLUP operator. The following code snippet shows how to add an extra column to the query. The IsSummary column indicates whether the row is a subtotal.
SELECT GROUPING(od.orderid) IsSummary,
In the rollup query shown earlier, no extra column is actually returned to the user, but the orderid column is assigned a special value (99999 or whatever value you decide to use) if the row contains summary values.
The result set of the WITH ROLLUP query contains summary rows characterized by the 99999 value in the MyOrderID field. (If you can't figure out a neutral value for a column, just add an additional 0/1 column based on the output of the GROUPING function.) An appropriate order in the query makes any summary rows appear as the last row of each group. Displaying subtotals is now a simple matter of styling the grid.
Figure 8-3: Rollup information in a grouped query.
Note |
The ROLLUP operator is not the only aggregator available in SQL Server 2000. SQL Server 2000 also supports COMPUTE BY, although it is used only for backward compatibility. Contentwise, the result set generated by a ROLLUP operation is identical to that returned by a COMPUTE BY statement. Other differences, though, tip the scales in favor of ROLLUP. For example, ROLLUP returns a single result set. In contrast, COMPUTE BY returns multiple result sets that increase the complexity of code necessary to handle the data. In addition, only ROLLUP can be used in a server cursor, and the query optimizer can sometimes generate more efficient execution plans for it. The CUBE operator, on the other hand, has programmatic characteristics that assimilate it with ROLLUP. The result set generated, though, is quite different because CUBE explodes data and produces a superset of groups with cross-tabulation of every column to the value of other columns. |
A Pure ADO.NET Approach
If the database doesn't support any aggregate operator that groups data, you can calculate the needed values yourself and insert the summary rows manually with a bit of ADO.NET coding. A possible way of doing that entails the steps shown in the following paragraphs.
Fetch any needed data. You do this by executing a couple of queries against the database. The first retrieves all orders to be displayed but is limited to returning the customer ID, order ID, and total amount of the order.
SELECT customerid, od.orderid, SUM(od.quantity*od.unitprice) AS price
FROM Orders o, [Order Details] od
WHERE Year(orderdate) = @nYear AND od.orderid=o.orderid
GROUP BY od.orderid, o.customerid
ORDER BY o.customerid
The second query selects all distinct customers and for each of them counts the orders and computes the total:
SELECT c.customerid, c.CompanyName,
    Total=SUM(od.quantity*od.unitprice),
    TotalOrders=COUNT(DISTINCT od.OrderID)
FROM Customers c, [Order Details] od, Orders o
WHERE Year(orderdate)=@nYear AND c.CustomerID=o.CustomerID
    AND o.OrderID=od.OrderID
GROUP BY c.CompanyName, c.CustomerID
ORDER BY c.customerid
Once the two tables are in memory, you programmatically add empty rows to the orders table and create a sorted view of the table so that the empty rows show up as the last row of each customer group. Next, you can customize a little bit of the text shown by the additional rows so that the final output looks like Figure 8-4. The full source code for the Subtotal sample is available in the book samples.
Figure 8-4: A console application that adds summary rows to the tables of a DataSet.
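The Subtotal sample itself isn't listed in the text; the following rough sketch conveys the idea. Table and column names follow the two queries above, while the extra IsSummary column used to sort summary rows to the bottom of each group is an assumption of this example, not necessarily the technique the sample uses.

DataTable orders = ds.Tables["Orders"];
DataTable customers = ds.Tables["Customers"];

// Flag column: 0 for data rows, 1 for the summary rows added below
orders.Columns.Add("IsSummary", typeof(int));
foreach (DataRow row in orders.Rows)
    row["IsSummary"] = 0;

// One summary row per customer, filled with the precomputed totals
foreach (DataRow cust in customers.Rows)
{
    DataRow summary = orders.NewRow();
    summary["customerid"] = cust["customerid"];
    summary["price"] = cust["Total"];
    summary["IsSummary"] = 1;
    orders.Rows.Add(summary);
}

// The view lists each customer's orders first and its summary row last
DataView view = new DataView(orders);
view.Sort = "customerid ASC, IsSummary ASC, orderid ASC";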
Let's take the result set shown in Figure 8-3 and display it through a DataGrid control. The output looks like Figure 8-5. The rows with a black background are the summary rows added by the ROLLUP operator. Notice, though, that the highlighted background in the figure is only a graphical effect on the picture. The DataGrid control has no built-in logic to detect a summary row. Adding such logic is exactly the goal of this section.
Figure 8-5: Displaying the output of a ROLLUP query using a DataGrid control.
Having summary rows in the data source is not enough. We must be able to distinguish summary rows from ordinary bound rows and render the summary rows with different visual styles.
When the DataGrid is rendered in HTML, it fires a couple of events to the page—ItemCreated and ItemDataBound. As we saw in Chapter 6, by hooking those events you can analyze the data object being drawn and set its styles. In doing so, you test the value of the MyOrderID column against the well-known value of 99999 and decide whether that row is to be rendered as a summary row.
Formatting the Summary Row
When the ItemCreated event fires, no data has yet been bound to the DataGrid item, so you have no way of knowing whether the item is going to host a summary row. To verify that the current row is a summary row, we need to access the underlying data object and check its MyOrderID column. For this reason, ItemCreated is not the ideal place to apply summary-specific styles, so let's hook ItemDataBound instead. The following code shows the ItemDataBound event handler for the sample page shown in Figure 8-6.
Figure 8-6: Summary rows display with different styles and a different layout.
void ItemDataBound(object sender, DataGridItemEventArgs e)
{
    CustomizeItem(e);
}

void CustomizeItem(DataGridItemEventArgs e)
{
    DataRowView drv = (DataRowView) e.Item.DataItem;
    if (drv != null)
    {
        // Current row is a summary row
        if ((int) drv["MyOrderID"] == 99999)
        {
            // Yellow background and bold font
            e.Item.BackColor = Color.Yellow;
            e.Item.Font.Bold = true;

            // Remove the cell with the Order ID
            e.Item.Cells.RemoveAt(1);

            // Span the Customer ID cell to cover two cells
            e.Item.Cells[0].ColumnSpan = 2;
            e.Item.Cells[1].HorizontalAlign = HorizontalAlign.Right;
        }
    }
}
Summary rows are given a different background color and a boldface font. In addition, we also modify the structure of the row by removing the second cell (the one containing the order ID) and enlarging the first to span over two logical cells. The last column, which contains the total amount of the order, is then right-aligned. Note that after a cell is removed, you must reassign any properties (for example, horizontal alignment) to all the cells.
Setting Up the Footer
As you can see in Figure 8-6, the first row in the result set contains the total of all orders selected by the query. If you're going to display a list of orders, that piece of information simply doesn't belong among the grid's rows. You can remove the first row of the DataTable at any time before you bind it to the DataGrid.
// Fill the DataSet
DataSet ds = new DataSet();
da.Fill(ds, "Orders");

// Remove the first row but cache the total of orders
decimal total = (decimal) ds.Tables["Orders"].Rows[0]["total"];
ds.Tables["Orders"].Rows.RemoveAt(0);
ViewState["TotalAmountOfOrders"] = total;
To remove a row from a table, you call either the Remove or the RemoveAt method on the Rows collection. Remove takes the DataRow object to detach, whereas RemoveAt takes the 0-based index of the row to remove. The row is physically detached from the table, but it's not destroyed as an object until it goes out of scope.
The Total column on the first row contains the total amount of all the orders issued by all customers in the specified year. That information might be valuable as well, but it's better to display it in the footer. To have it available for each request, we cache it in the page view state or in any other global repository:
void CustomizeFooter(DataGridItemEventArgs e)
{
    e.Item.Cells[2].HorizontalAlign = HorizontalAlign.Right;
    e.Item.Cells[2].Text = String.Format("{0:c}",
        ViewState["TotalAmountOfOrders"]);
}
Figure 8-7 shows the total of all orders displayed in the footer.
Figure 8-7: The footer shows the total of all orders.
Adding Statistical Information
So far we've used a summary row in which the customerid column simply repeated the ID of the customer above it. While this is a possibility, more useful information can be provided. For example, you could count the orders that a given customer has issued in the year and spice the row up with some statistics, such as a calculation of the orders as a percentage of the total volume for the year:
void CustomizeItem(DataGridItemEventArgs e)
{
    DataRowView drv = (DataRowView) e.Item.DataItem;
    if (drv != null)
    {
        // Current row is a summary row
        if ((int) drv["MyOrderID"] == 99999)
        {
            // Yellow background and bold font
            e.Item.BackColor = Color.Yellow;
            e.Item.Font.Bold = true;

            // Remove the cell with the Order ID
            e.Item.Cells.RemoveAt(1);

            // Set up the alignment for the last column
            e.Item.Cells[1].HorizontalAlign = HorizontalAlign.Right;

            // Count the orders for the customer
            DataTable dtOrders = drv.Row.Table;
            int orderCount = (int) dtOrders.Compute("Count(customerid)",
                "customerid='" + drv["customerid"] + "'") - 1;

            // Calculate the percentage of the total volume
            decimal amount = (decimal) drv["price"];
            decimal total = (decimal) ViewState["TotalAmountOfOrders"];
            decimal percent = 100 * amount / total;

            // Set up the first cell
            e.Item.Cells[0].ColumnSpan = 2;
            e.Item.Cells[0].Style["font-weight"] = "400";
            e.Item.Cells[0].Style["font-size"] = "80%";
            string msg = String.Format("{0} order(s) [{1:n}%]",
                orderCount, percent);
            e.Item.Cells[0].Text = msg;
        }
    }
}
There are many ways to count the orders for each customer. Probably the most effective way consists of calculating them once and caching the few bytes somewhere in the view state or session state. In the following sample code, we count them each time using the Compute method of the DataTable object. (This is definitely not a smart choice for a real application, but it gives us a chance to show an interesting use of the ADO.NET model.)
int count = (int) dtOrders.Compute("Count(customerid)",
    "customerid='" + drv["customerid"] + "'") - 1;
Of course, you must subtract one from the total because it also includes the summary row.
Finally, as the code shows, you can use custom visual settings for each cell in the row. The first cell, in fact, is displayed without boldface type and with a font size set to 80 percent of the default, as shown in Figure 8-8.
Figure 8-8: Statistical information shown in the summary row.
The summary information can take up any number of rows in the grid. However, if multiple rows are necessary, you should insert additional rows yourself using the approach discussed earlier.
Applications based on interrelated tables often need to provide a hierarchical navigation mechanism to enable users to sort through various parent/child table relationships. For example, a user can pick a customer from a menu and then see all the orders the selected customer has issued. Next, the user can click on a particular order and have all its details displayed. When such parent/child relationships exist, a child table is always associated with a particular element on the parent table. What, then, would be an effective way of retrieving and displaying parent/child information in an ASP.NET application?
ADO.NET provides a rich in-memory, database-like infrastructure that makes it particularly easy to define relationships between tables and extract child rows from parent rows. The downside is that a roundtrip occurs whenever a user clicks on a row to go one level down or up. As a result, the page posts back and is completely rebuilt from scratch. The hierarchical data that is necessary to fill the page is either cached on the Web server or downloaded over and over again from the database. As discussed in Chapter 6 and Chapter 7, caching is an excellent choice if you can afford it. Otherwise, retrieving small pieces of data, and retrieving them only when you need them, is an alternative technique that consumes a minimum amount of server resources. If you're caching data across page requests, do it using the DataSet object and create as many relationships as needed. Doing this will allow you to retrieve the related rows quickly and effectively.
In the rest of this chapter, we'll build a master/detail application that addresses issues you'll encounter in the real world. The application contains three related grids that display customers, orders for a customer, and the details of an order. We'll use the SelectedIndexChanged event to command a more specific query and drill down into the data. We'll also use different techniques for paging and sorting the data. The different views of the data are implemented using inline frames.
The sample page we'll consider is made of a master grid that pages through a set of customers. We'll use custom paging and also add sorting capabilities to the DataGrid. Figure 8-9 shows the grid in action with an applied filter that selects only the customers based either in Italy or Spain.
Figure 8-9: A sortable and pageable grid that lets you select a customer. By clicking on the customer name, you can see the orders the customer has issued.
The grid retrieves only the records that fit in the current page, based on the current sort expression. This is a combination of features we didn't fully cover in Chapter 6 and Chapter 7 when we explored the paging and sorting mechanisms of the grid. If you plan to use custom paging and sorting together, you can't use a client-side mechanism (for example, a DataView) to sort records. The DataView in fact has limited visibility over the set of records and would actually sort only the records in the current page. Sometimes this behavior is acceptable, sometimes not. If you want full sorting—that is, if you want to page through customers in a given order—the query command used to retrieve a page of records must be slightly enhanced.
Custom Paging and Sorting
To get a page of records, we use a query made of three nested SELECT statements that select the right subset and then re-order it. (See Chapter 7.) It's useful to recall the scheme we discussed earlier in this chapter in the "Paging Through the Results of Any Query" sidebar. The placeholders in boldface are elements you must configure properly.
SELECT * FROM
  (SELECT TOP size * FROM
    (SELECT TOP size*index * FROM
      (query ORDER BY KeyCol ASC) AS t0) AS t1
   ORDER BY KeyCol DESC) AS t2
ORDER BY KeyCol ASC
The query element is any SELECT statement that defines the set of records to be displayed. It can be as complex as needed and can include any clauses that make sense with a SELECT statement. In its simplest form, the query placeholder is the name of the table. As you can see, the KeyCol placeholder indicates the column (or the columns) to sort by. Each occurrence of the KeyCol placeholder is associated with a sort direction. If you request a default ascending order, only the second occurrence must be set to DESC. We fully explained the reason for this in Chapter 6. Of course, if you want to invert the direction, all direction placeholders must be reversed too. Note that if multiple columns are used, you must reverse the direction of each of them.
In this way, what you get from SQL Server is a properly sorted result set that can be bound to the grid without further manipulations. The following listing shows the new version of the BindData method:
void BindData() {
    string sortby = (string) ViewState["DataSortExpression"];
    DataTable _data = CreateDataSource(1 + grid.CurrentPageIndex, sortby);
    grid.DataSource = _data;
    grid.DataBind();
}
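The CreateDataSource method isn't shown here; the following is a minimal sketch of how it could build and run the paging query, assuming the Northwind customers table, a fixed page size, and a single-column sort. The connection string and the country filter are illustrative assumptions.

// Minimal sketch of CreateDataSource (uses System.Data and System.Data.SqlClient).
// Page size, filter, and connection string are assumptions for illustration.
DataTable CreateDataSource(int pageIndex, string sortby)
{
    const int pageSize = 15;
    if (sortby == null || sortby.Length == 0)
        sortby = "customerid";   // default sort column (assumed key)

    string query = "SELECT * FROM customers WHERE country IN ('Italy','Spain')";

    // Same scheme as the sidebar; the innermost ORDER BY is placed next to a
    // TOP clause, as SQL Server requires for derived tables. Swap ASC and
    // DESC throughout to page in descending order. In production, validate
    // sortby against a list of known column names.
    string cmdText = String.Format(
        "SELECT * FROM " +
        "  (SELECT TOP {0} * FROM " +
        "    (SELECT TOP {1} * FROM ({2}) AS t0 ORDER BY {3} ASC) AS t1 " +
        "   ORDER BY {3} DESC) AS t2 " +
        "ORDER BY {3} ASC",
        pageSize, pageSize * pageIndex, query, sortby);

    SqlDataAdapter adapter = new SqlDataAdapter(cmdText,
        "SERVER=localhost;DATABASE=northwind;UID=sa;");
    DataTable data = new DataTable();
    adapter.Fill(data);
    return data;
}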
As we mentioned in Chapter 6, sorting is a subtle and complex operation that can add unnoticed overhead to your Web application. Going down to the database just to sort would probably cost more than it's worth, but if we're already using the database to page through a data set, asking it to return sorted data is a minor addition and probably the most effective approach.
Note
The Oracle ROWNUM operator numbers the rows in the result set before the sort is applied. So if you want rows sorted in some way, you must nest queries so that the innermost retrieves and sorts data and the outermost just ensures rows are correctly numbered. While I'm speaking of Oracle ROWNUM, I should mention that the next version of SQL Server is expected to support a similar feature but, at the time of this writing, the form, implementation, and syntax have not been decided upon.
The Selection Mechanism
When the user selects a row in the master grid, the SelectedIndexChanged event fires and updates the SelectedIndex property with the 0-based index of the row. As mentioned in Chapter 6, when the user clicks on a DataGrid column with a command name of select, two events are raised: ItemCommand and then SelectedIndexChanged. This sequence of events simplifies the implementation of the item deselection. While processing ItemCommand, the SelectedIndex property has not yet been updated. So you store the selected index in a global variable and compare it with the new selected index. If the two match, the user clicked two consecutive times on the same item. At this point, you deselect the item.
void ItemCommand(object sender, DataGridCommandEventArgs e) {
    switch(e.CommandName) {
        case "select":
            m_currentSelectedIndex = grid.SelectedIndex;
            break;
    }
}

void SelectedIndexChanged(object sender, EventArgs e) {
    if (m_currentSelectedIndex == grid.SelectedIndex) {
        UnselectItem();
        return;
    }
    SelectItem();
}

void SelectItem() {
    string custID = (string) grid.DataKeys[grid.SelectedIndex];
    string url = String.Format("orderslist.aspx?custid={0}", custID);
    ordersView.Attributes["Src"] = url;
}
When a customer is selected, the page refreshes the orders frame that points to a second page named orderslist.aspx. The page receives the ID of the customer as a query parameter, performs a query against the orders, and finally lists all the orders associated with that customer.
Note that if you can afford caching all the data, including customers and orders, you can easily retrieve the orders for a customer using a relation between the tables.
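For example, assuming a DataSet with both tables has been parked in the Cache, a sketch of the lookup might read as follows; the slot, table, column, and relation names are all assumptions made for illustration.

// Minimal sketch, assuming a cached DataSet with Customers and Orders tables.
DataSet ds = (DataSet) Cache["CustomersAndOrders"];
if (!ds.Relations.Contains("Cust2Orders"))
    ds.Relations.Add("Cust2Orders",
        ds.Tables["Customers"].Columns["customerid"],
        ds.Tables["Orders"].Columns["customerid"]);

// Locate the selected customer and grab its child rows
DataRow[] match = ds.Tables["Customers"].Select("customerid = '" + custID + "'");
if (match.Length > 0) {
    DataRow[] orders = match[0].GetChildRows("Cust2Orders");
    // ...bind or otherwise process the orders rows as needed
}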
Using Inline Frames
Inline frames are browser windows that can be placed anywhere in the HTML code by using the <iframe> tag. The <iframe> tag is part of the HTML 4.0 standard. Note, though, that not all browsers support this element, which has been part of the Internet Explorer arsenal since version 4.0. Old Netscape Navigator 4.x browsers support a similar element, named <ilayer>, with a nearly identical overall goal and syntax.
There are no ASP.NET server controls that map the functionality of the <iframe> tag. If you mark it with the runat attribute, the ASP.NET runtime will render it using an instance of the HtmlGenericControl class. The following code declares a frame with a solid, single-line, flat border and no scrolling capabilities:
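<!-- Reconstructed declaration; the id matches the ordersView reference used
     in the code-behind, and the styling values are assumptions that match
     the description (flat, single-line, solid border; no scrolling). -->
<iframe runat="server" id="ordersView" frameborder="0" scrolling="no"
    style="border:solid 1px black;"></iframe>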
To set the URL that the frame will display, you use the Src attribute. Note, though, that you can't set the Src attribute through a property because the HtmlGenericControl class doesn't provide for that. You must set it as an explicit HTML attribute using the Attributes collection.
ordersView.Attributes["Src"] = url;
To unselect a customer and clear the page with its orders displayed, you simply make the frame point to the about:blank URL. The about protocol is supported by most browsers, including old versions of Netscape browsers.
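Assuming the same ordersView frame used earlier, a minimal UnselectItem reduces to clearing the grid selection and blanking the frame (this is a sketch; the exact body isn't shown above):

// Sketch of UnselectItem: clear the selection and blank the child frame
void UnselectItem() {
    grid.SelectedIndex = -1;
    ordersView.Attributes["Src"] = "about:blank";
}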
The orderslist.aspx page has a structure that is nearly identical to its parent page (cust2orders.aspx). It gets the customer ID from the query string and uses that to perform an ad hoc query against the database. The orders are then displayed through a pageable DataGrid control. One of the columns of the grid is a ButtonColumn with a command name of select, which makes the items displayed truly selectable.
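In the grid's declaration, such a column might look like the following; the data field and header text are assumptions based on the figure.

<%-- Sketch of the selectable column; DataTextField and HeaderText are
     assumptions based on Figure 8-10. --%>
<asp:ButtonColumn DataTextField="orderid" HeaderText="Order"
    CommandName="select" />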
Normally the text of a ButtonColumn is a hyperlink element rendered with the default styles—that is, underlining and visited-link or active-link colors. To make it look like normal yet selectable text, you can resort to the following style declaration:
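<style>
    /* Reconstructed style; selectors and colors are assumptions that match
       the described effect (red underline only on hover). */
    a { color:black; text-decoration:none; }
    a:hover { color:red; text-decoration:underline; }
</style>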
The text will be underlined in red only when the mouse hovers over the text, as shown in Figure 8-10.
Figure 8-10: A pageable grid that shows the orders for a customer. By clicking on the order ID, you can see the details of that order.
Smart Caching
The orders DataGrid is pageable but does not use custom paging. It also does not cache all orders. It caches only the orders the grid needs to display—that is, the orders that relate to a given customer. The pagination takes place automatically within the DataGrid and is completely handled by the control.
Because the role of the grid is to display the full list of orders for a customer, chances are that concurrent users working in different sessions might be accessing the same set of records. For this reason, the population of the orders DataGrid occurs in two steps. First, the code attempts to locate a slot in the ASP.NET global Cache object that contains the orders for that customer. If one is found, no database access is performed and the records are retrieved from memory. Otherwise, a query is run and the results are copied into the Cache for further access by the same, or other, users.
void Page_Init(object sender, EventArgs e) {
    // Store the current customer ID
    string tmp = (string) Request.QueryString["CustID"];
    m_currentCustomerID = tmp.Substring(0, 5);

    // Store the name of the DataSet as used within Cache
    m_dataSetName = String.Format(m_dataSetName, m_currentCustomerID);
}
The Init event fires at the beginning of the page request lifecycle. All the global data you set here is visible to all the other methods that are called as the request processing proceeds. In particular, we need to retrieve data from Cache to set up the footer. The footer is configured during the DataGrid's ItemCreated event, which occurs before the page's Load event. For this reason, we need to store the name of the cache slot and the customer ID into global variables at the very beginning of the request processing.
Note
Each processing of a request results in a sequence of methods—event handlers, actually—being called one after the next. Any global variables you set at any point in this cycle work as expected. You can't, on the other hand, set global variables and expect that they will survive the request. If you need this to happen, use the ViewState object or any other global data container such as Session or Cache.
The naming convention used to recognize the slot within Cache that contains the orders of a particular customer is illustrated in the following line:
private string m_dataSetName = "Orders [{0}]";
The code in Page_Init replaces the placeholder with the actual customer ID read out of the query string. Also note that only the first five characters are taken into account—the known maximum length of a customer ID. As we'll see in Chapter 15, "ASP.NET Security," this simple trick can save you from a lot of trouble with malicious code and SQL injection attacks. The following code is used to populate the orders grid:
void DataFromSourceToMemory() {
    // Attempt to locate the orders from the cache in case other users
    // already retrieved them. Otherwise, read from the database
    DataSet ds;
    ds = (DataSet) Cache[m_dataSetName];
    if (ds == null)
        ds = PhysicalDataRead();

    // Stores orders in the cache
    Cache[m_dataSetName] = ds;
}
The DataSet we're caching for each accessed customer is made of two tables—the orders table and a table with further customer information. The second table holds only one record with registration data about the customer. Note that the DataSet is obtained using a SQL batch—that is, a couple of concatenated SQL statements. Both statements require the use of a parameter—the customer ID. Because the parameter is the same in both cases—and thanks to the SQL Server .NET provider's named parameters—you specify it only once. This wouldn't have been possible with position-based parameters.
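A sketch of how PhysicalDataRead could run such a batch follows; the statements, column lists, table names, and connection string are illustrative assumptions, while the single @custid parameter shared by both statements is the point of interest.

// Sketch of PhysicalDataRead (uses System.Data and System.Data.SqlClient).
// One batch, one named parameter referenced by both statements.
DataSet PhysicalDataRead()
{
    string cmdText =
        "SELECT * FROM orders WHERE customerid=@custid; " +
        "SELECT companyname, address, city, country " +
        "  FROM customers WHERE customerid=@custid";

    SqlDataAdapter adapter = new SqlDataAdapter(cmdText,
        "SERVER=localhost;DATABASE=northwind;UID=sa;");
    adapter.SelectCommand.Parameters.Add("@custid",
        SqlDbType.NChar, 5).Value = m_currentCustomerID;

    DataSet ds = new DataSet();
    adapter.Fill(ds);                      // fills tables Table and Table1
    ds.Tables["Table"].TableName = "Orders";         // assumed table names
    ds.Tables["Table1"].TableName = "CustomerInfo";
    return ds;
}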
Adding Templated Columns
In the Northwind database, orders are registered with a required date and a ship date. Users, though, might appreciate a simple flag positioned beside an order to denote whether the order shipped on time or not. Later, if needed, they could drill down and find date-related details. If you don't plan to show more than a Boolean flag to signify the delay of an order, you can modify the query to make SQL Server return a computed Boolean column. Otherwise, you query for both the shippeddate and requireddate columns and create an in-memory computed column to store the Boolean flag.
// Get the data
DataSet _data = new DataSet();
adapter.Fill(_data);
_data.Tables["Table"].TableName = "Orders";

// Add a custom column comparing ShippedDate and RequiredDate
_data.Tables["Orders"].Columns.Add("Delayed", typeof(bool),
    "shippeddate > requireddate");
How can you display this information? Although effective, a True or False string might not be a particularly friendly solution. In addition, rows with the most common value (say False, meaning that the order shipped on time) shouldn't display anything at all. A better solution would be to display a new column that flags only the orders that were delayed. What about templated columns? The following code shows the declaration of a templated column in which the content of each cell is an elaboration of the newly added Delayed column:
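<asp:TemplateColumn HeaderText="Delayed">
    <ItemStyle HorizontalAlign="Center" Font-Name="Wingdings 2" />
    <ItemTemplate>
        <%-- Sketch of the template. The "P" character is assumed to map to
             the check mark in Wingdings 2; the ToString comparison also
             covers rows whose Delayed value is DBNull. --%>
        <%# DataBinder.Eval(Container.DataItem, "Delayed").ToString() == "True" ? "P" : "" %>
    </ItemTemplate>
</asp:TemplateColumn>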
In particular, the template displays a Wingdings symbol (the checkmark) or the empty string based on the Boolean value in the source column. Of course, for this to happen you must have the Wingdings 2 font installed.
Figure 8-11: The last column shows a check symbol to indicate the corresponding order shipped after being delayed.
Listing the Order Details
The orderslist.aspx page contains a child frame that points to the orderdetails.aspx page. This page is aimed at showing the details of a particular order. The page receives the order ID through the query string, downloads the few records that describe the items in the order, and populates a nonpageable DataGrid control.
The orderdetails.aspx page is requested when the user clicks to select an order. When this happens, a label is also updated with further information about the ship date and the required date of shipment for the order. The difference between the two dates is expressed in days.
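Assuming the two dates are read from the order's DataRow, the day count is a plain date subtraction; the row, column, and label names here are illustrative.

// Sketch: express the gap between ship date and required date in days
DateTime shipped  = (DateTime) orderRow["shippeddate"];
DateTime required = (DateTime) orderRow["requireddate"];
TimeSpan diff = shipped - required;
lblShipInfo.Text = String.Format("Shipped {0} day(s) {1} the required date",
    Math.Abs(diff.Days), diff.Days > 0 ? "after" : "before");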
The various totals you see on the footer of the grids are calculated using the Compute method on the DataTable object.
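For example, a per-order total could be obtained along these lines; the expression column and the column names are assumptions based on the Northwind Order Details table.

// Sketch: aggregate line totals with DataTable.Compute.
DataTable details = ds.Tables["OrderDetails"];          // assumed table name
if (!details.Columns.Contains("total"))
    details.Columns.Add("total", typeof(decimal), "quantity * unitprice");
object sum = details.Compute("Sum(total)", "");          // empty filter: all rows
lblTotal.Text = String.Format("{0:c}", sum);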
The complexity of the code we examined in this chapter is close to the practical threshold beyond which one normally starts looking at ways to implement reusability and encapsulation. The master/detail views and subtotaling applications we've built so far make intensive use of certain DataGrid features whose supporting code I simply cut and pasted from one source file to the next. There has to be a better way of reusing code in ASP.NET! Actually, a (much) better way does exist, and we'll start looking at it in Chapter 10 and get into more detail in Chapter 18 and Chapter 19.
Real-world programming is a powerful mix of best practices, design patterns, and those finer touches that always mark the difference between good and not-quite-as-good applications. However, a deep knowledge and understanding of the platform is a must. Speaking of ASP.NET in particular, the required knowledge spans the world of .NET and touches on the land of SQL.
Many real-world applications are based on a web of interrelated and interconnected databases. Designing effective queries and update strategies is critical. Keeping an eye on the overall scalability is a must. Providing awesome, user-friendly features is just what you live (and work) for. Right? SQL is the glue that holds databases and data-driven applications together, but SQL—in the real world—is not a complete standard. Yes, there was once the SQL-92 standard, but since then each vendor has added feature upon feature to provide better functionality. So, to write effective code, you should know the SQL engine you're using. And in ASP.NET, you also must know the client-side engine that interacts with the back end. This engine is ADO.NET.
ADO.NET doesn't add magic to your arsenal of tricks and techniques. It doesn't reduce database programming to mere declarative programming, in which you build working code by simply pointing and clicking. Several times in this chapter (and even more often in ADO.NET books such as David Sceppa's Microsoft ADO.NET [Microsoft Press, 2002]), you'll find descriptions of tools and classes that kindly and generously do most of the work for you. In the same sources, you normally find notes that warn you against using those tools and classes in real-world applications. The command-builder object and the batch update, for example, greatly simplify the update process, but you must know what they do and how they do it. Chances are that, to fix your application, you'll just need to work around their standard behavior. ADO.NET can do a lot of things for you, but it can do even more if you know how to instruct it.
This chapter discussed a few general points typical of all Web applications of some complexity. In the next chapter, we'll move away from the rather ubiquitous DataGrid control to consider more flexible alternatives, such as the Repeater and DataList controls. The next chapter opens Part III of the book, which is entirely devoted to system-provided ASP.NET rich controls.