Several types of master-detail and parent-child relationships pose unique challenges to creating a high-performance Web application. There are two main master-detail problems: highly nested structures such as category trees and large master-detail relationships.
Most applications have a tree structure somewhere in them. Generally they are used as a method of navigating a Web site or organizing a collection. The data structure behind them is generally the same and usually wrong. Look at Table 19-1 and see if you can find the problem with this structure.
DESCRIPTION | DATA TYPE |
---|---|
CategoryID | Int |
Description | Char(20) |
ParentCategoryID | Int |
CategoryID is the primary key for this table. The Description field is the name of the category displayed to the user so they know what they're seeing. ParentCategoryID is a subcategory of CategoryID. Generally the root-level category has a ParentCategoryID of 0 or -1 to indicate that it's a root-level category and that there's no parent category.
The problem with Table 19-1 is that there isn't a way to get the entire tree without executing multiple Select statements. To get all of the categories in this tree you'll have to commit another scalability mistake—using recursion. The only way to successfully get the whole tree is to recursively select everything that has a ParentCategoryID equal to the current CategoryID and so on. If your tree structure gets malformed, major problems can happen on your Web server and/or database server depending on exactly how you coded your statement.
The second problem with Table 19-1 is that to get decent performance, you'll have to have an index for CategoryID and a clustered index for ParentCategoryID, which is unnecessary with the alternate method described in Table 19-2.
DESCRIPTION | DATA TYPE |
---|---|
CategoryID | Int |
Description | Char(20) |
ParentCategoryID | Int |
RootCategoryID | Int |
Don't despair if you're using a system that uses this technique; it's a common programming error. The important thing is that there's a solution. Table 19-2 shows a minor modification to the table that makes a major difference in performance.
This structure adds a RootCategoryID column, which is the category at the base of this entire tree. This column gives you a simple way to load the entire tree in a single Select statement with no recursion needed on the database server, and with a little fancy code on your Web server, you can avoid recursion there as well.
An additional advantage of this structure is the level of optimization afforded by a single clustered index on the RootCategoryID. Because you won't be using the CategoryID in the WHERE clause of any of your Select statements, you won't even need it to be indexed—just the RootCategoryID column. Figure 19-5 shows a populated VideoCategory table, the high performance way.
Figure 19-5: The data in the VideoCategory table
One of the mantras of good database design is to never have repetitive data, a mantra we're breaking here with good reason. In this case, it's necessary. Listing 19-7 shows the VideoCategory, VideoCategoryData, and VideoCategoryDataAccess objects.
Listing 19-7: VideoCategory and Related Objects
Imports System Imports System.Data Imports System.Data.SqlClient Imports System.Data.Common Namespace EnterpriseVB.VideoStore.Data Public Class VideoCategory Protected data As DataTable Protected index As Integer Protected subCategories As New ArrayList() Protected Friend ReadOnly Property MyData() As DataRow() Get Dim myRow() As DataRow = {Me.data.Rows(index)} Return myRow End Get End Property Public Sub New(ByRef data As VideoCategoryData, ByVal index As Integer) Me.data = data Me.index = index End Sub Public Sub New() Me.data = New VideoCategoryData() data.Rows.Add(data.NewRow()) Me.index = 0 End Sub Public Function GetColumn(ByRef ColumnName As String) As Object Return data.Rows(index)(ColumnName) End Function Public Function SetColumn(ByRef ColumnName As String, ByRef _ ColumnValue As Object) data.Rows(index)(ColumnName) = ColumnValue End Function Public Property CategoryID() As Int32 Get If (GetColumn("CategoryID").GetType() Is _ Type.GetType("System.DBNull")) Then Return -1 End If Return CType(GetColumn("CategoryID"), Int32) End Get Set(ByVal Value As Int32) SetColumn("CategoryID", Value) End Set End Property Public Property Description() As String Get Return "" + GetColumn("Description") End Get Set(ByVal Value As String) SetColumn("Description", Value) End Set End Property Public Property ParentCategoryID() As Int32 Get If (GetColumn("ParentCategoryID").GetType() Is _ Type.GetType("System.DBNull")) Then Return -1 End If Return CType(GetColumn("ParentCategoryID"), Decimal) End Get Set(ByVal Value As Int32) SetColumn("ParentCategoryID", Value) End Set End Property Public Property RootCategoryID() As Int32 Get If (GetColumn("RootCategoryID").GetType() Is _ Type.GetType("System.DBNull")) Then Return -1 End If Return CType(GetColumn("RootCategoryID"), Int32) End Get Set(ByVal Value As Int32) SetColumn("RootCategoryID", Value) End Set End Property Public Function AddSubCategory(ByRef subCat As VideoCategory) Me.subCategories.Add(subCat) End Function Public Function FindCategoryByID(ByVal catID As Integer) As VideoCategory Return Me.FindCategoryByID(Me, catID) End Function Friend Function FindCategoryByID(ByRef cat As VideoCategory, _ ByVal catID As Integer) As VideoCategory If (catID = cat.CategoryID) Then Return cat End If Dim i As Integer For i = 0 To cat.CountSubCategories() - 1 If (FindCategoryByID(cat.GetSubCategory(i), catID) Is Nothing) Then Else Return cat.GetSubCategory(i) End If Next Return Nothing End Function Public Function CountSubCategories() As Int32 Return Me.subCategories.Count End Function Public Function GetSubCategory(ByVal i As Int32) As VideoCategory Return CType(Me.subCategories(i), VideoCategory) End Function End Class Public Class VideoCategoryData Inherits DataTable Public Sub New() MyBase.New("VideoCategory") Me.Columns.Add("CategoryID", Type.GetType("System.Int32")) Me.Columns.Add("Description", Type.GetType("System.String")) Me.Columns.Add("ParentCategoryID", Type.GetType("System.Int32")) Me.Columns.Add("RootCategoryID", Type.GetType("System.Int32")) End Sub End Class Public Class VideoCategoryDataAccess Public connectionString As String Protected adapter As SqlDataAdapter Protected loadAll As SqlDataAdapter Public Sub New() connectionString = "Your Connection String;" adapter = New SqlDataAdapter() adapter.SelectCommand = New SqlCommand("VideoCategoryGetTree", New _ SqlConnection(connectionString)) adapter.SelectCommand.CommandType = CommandType.StoredProcedure adapter.SelectCommand.Parameters.Add _ ("@RootCategoryID", SqlDbType.Decimal, 0, "RootCategoryID") End Sub Public Function GetCategoryTree(ByVal rootCategoryID As Int32) _ As VideoCategory Dim data As New VideoCategoryData() adapter.SelectCommand.Parameters("@RootCategoryID").Value = _ rootCategoryID adapter.Fill(data) Dim categories As VideoCategory() categories = GetVideoCategoryArrayFromData(data) Dim i As Integer Dim x As Integer If (categories.Length = 0) Then Return Nothing End If For i = 0 To categories.Length - 1 For x = 0 To categories.Length - 1 If (categories(i).CategoryID = _ categories(x).ParentCategoryID) Then categories(i).AddSubCategory(categories(x)) End If Next Next Return categories(0) End Function Public Shared Function GetVideoCategoryArrayFromData(ByRef data As _ VideoCategoryData) As VideoCategory() Dim vArray(data.Rows.Count - 1) As VideoCategory Dim i As Integer For i = 0 To (data.Rows.Count - 1) vArray(i) = New VideoCategory(data, i) Next i Return vArray End Function End Class End Namespace
The real magic of this object happens in the GetCategoryTree() method of the DAC. It's here that the category tree is assembled without recursion and with a single query. The DAC returns a single VideoCategory object that you can then work with and display. The net effect of this is you have a high-performance category tree and you'll never have to write and data access methods for the VideoCategory object no matter how many times you use it.
The VideoCategory object models the complete tree by providing a GetSubCategory() method to retrieve the subcategories of the current category. Internally, the VideoCategory object stores its subcategories in an ArrayList that it manages.
The VideoCategory object also provides a FindCategoryByID() method that does a search on the tree and returns the requested VideoCategory object. When a specific category gets referenced, either by the user selecting something or by some other method, it will likely be by the CategoryID. Without a method such as this, you would likely have to make another round trip to the database to load a category again that you already have in memory.
Figure 19-6 shows the output of a test harness. A test harness is a simple application used to exercise the features of a component. Test harnesses can be helpful in doing object-level or component-level debugging.
Figure 19-6: The output of the test harness
The method described in the "Highly Nested Structures" section works great on most category trees with a reasonable amount of categories, but if you have a tree with potentially hundreds of categories, you'll need to use an alternate method.
For large structures you need to be able to load the tree in logical blocks. For example, look at the large structure-optimized version of the category tree in Table 19-3.
DESCRIPTION | DATA TYPE |
---|---|
CategoryID | Int |
Description | Char(20) |
ParentCategoryID | Int |
GroupID | Int |
In this table's structure you can load the tree one block at a time and assemble the tree structure using your mapped objects in memory.
The stored procedure in Listing 19-8 allows you to request all the children of a specific category and get all of the categories in that group in a single block. The result is you can display a collapsed tree to the user, and as they expand the tree, the server won't have to hit the database for each expanded category.
Listing 19-8: The GetBlockedCategory Stored Procedure
CREATE PROCEDURE dbo. GetBlockedCategory @CategoryID int AS BEGIN SELECT CategoryID, Description, ParentCategoryID, GroupID FROM BlockedTree WHERE GroupID= (SELECT GroupID FROM Category WHERE ParentCategoryID=@CategoryID) END
The only additional concern you have is writing the corresponding Visual Basic .NET code to take these blocks and dynamically assemble them on to your tree using a simple lazy-loading technique. (If you don't know what lazy loading is, don't worry—we cover it in the "Implementing Lazy Loading" section.)
Figure 19-7 shows the data in the BlockedTree table. Seeing this table with a sample of data in it should help you understand the intent of the structure.
Figure 19-7: The BlockedTree table with data
The test harness in this example uses a simple custom-coded method of displaying a tree, but there's a better way to do it. Microsoft provides the outstanding TreeView control for just this task in their Internet Explorer Web controls package at http://msdn.microsoft.com/library/default.asp?url=/workshop/webcontrols/overview/treeview.asp. Don't let the name fool you—it's not only for Internet Explorer. In fact, it works quite well in Netscape 4.x and 6, thanks to some clever coding by Microsoft.
The Internet Explorer Web controls package is free and includes several other useful Web components, such as a tabbed view and a docking-capable toolbar that will have your Web applications looking like desktop applications!
Using recursion to walk through the tree is not the most high-performance method of moving through a tree structure. As an alternative, you can create an index that looks like Table 19-4.
CATEGORYID | DESCRIPTION | PARENTCATEGORYID | INDEX |
---|---|---|---|
1 | Video Tapes | -1 | 1.0.0 |
2 | Horror | 1 | 1.2.0 |
3 | Comedy | 1 | 1.3.0 |
4 | Romantic Comedy | 3 | 1.3.4 |
5 | Drama | 1 | 1.5.0 |
The index is a string that shows the category at each stop of the tree. In Table 19-4, the root CategoryID is 1; therefore, all of the first numbers in the index are 1. The second number is the next level down in the tree, which is different for everything except the Romantic Comedy category. Romantic Comedy is a subcategory of Comedy, so they share the second number 3. The last number is 0 for everything but Romantic Comedy because only it is nested three levels deep.
The index would not actually be stored in the database but instead assembled in memory as a way to quickly iterate through the tree without recursion. The reason why it's not a good idea to necessarily store the index in the database is because you would have to use the substring function as part of your subquery, which would cause a table scan.
Implementing an index such as this is both a good and a bad thing depending on how you look at it. The good part is that once the tree is assembled, the tree can be iterated through and searched quickly and without recursion. The bad part is there's a small upfront cost in building the index. Because .NET is stateless and will forget the tree as soon as it's assembled, we included the "quick-and-dirty" method over building the full index for the examples in this chapter.
If you plan on caching a tree or implementing some type of method that reuses the same structure, we recommend adding an index.