Understanding Problematic Master-Detail Relationships


Several types of master-detail and parent-child relationships pose unique challenges to creating a high-performance Web application. There are two main master-detail problems: highly nested structures such as category trees and large master-detail relationships.

Working with Highly Nested Structures

Most applications have a tree structure somewhere in them. Generally they are used as a method of navigating a Web site or organizing a collection. The data structure behind them is generally the same and usually wrong. Look at Table 19-1 and see if you can find the problem with this structure.

Table 19-1: The VideoCategory Table, the Poor Performance Way

DESCRIPTION

DATA TYPE

CategoryID

Int

Description

Char(20)

ParentCategoryID

Int

CategoryID is the primary key for this table. The Description field is the name of the category displayed to the user so they know what they're seeing. ParentCategoryID is a subcategory of CategoryID. Generally the root-level category has a ParentCategoryID of 0 or -1 to indicate that it's a root-level category and that there's no parent category.

The problem with Table 19-1 is that there isn't a way to get the entire tree without executing multiple Select statements. To get all of the categories in this tree you'll have to commit another scalability mistake—using recursion. The only way to successfully get the whole tree is to recursively select everything that has a ParentCategoryID equal to the current CategoryID and so on. If your tree structure gets malformed, major problems can happen on your Web server and/or database server depending on exactly how you coded your statement.

The second problem with Table 19-1 is that to get decent performance, you'll have to have an index for CategoryID and a clustered index for ParentCategoryID, which is unnecessary with the alternate method described in Table 19-2.

Table 19-2: The VideoCategory Table, the High Performance Way

DESCRIPTION

DATA TYPE

CategoryID

Int

Description

Char(20)

ParentCategoryID

Int

RootCategoryID

Int

Don't despair if you're using a system that uses this technique; it's a common programming error. The important thing is that there's a solution. Table 19-2 shows a minor modification to the table that makes a major difference in performance.

This structure adds a RootCategoryID column, which is the category at the base of this entire tree. This column gives you a simple way to load the entire tree in a single Select statement with no recursion needed on the database server, and with a little fancy code on your Web server, you can avoid recursion there as well.

An additional advantage of this structure is the level of optimization afforded by a single clustered index on the RootCategoryID. Because you won't be using the CategoryID in the WHERE clause of any of your Select statements, you won't even need it to be indexed—just the RootCategoryID column. Figure 19-5 shows a populated VideoCategory table, the high performance way.

click to expand
Figure 19-5: The data in the VideoCategory table

One of the mantras of good database design is to never have repetitive data, a mantra we're breaking here with good reason. In this case, it's necessary. Listing 19-7 shows the VideoCategory, VideoCategoryData, and VideoCategoryDataAccess objects.

Listing 19-7: VideoCategory and Related Objects

start example
 Imports System Imports System.Data Imports System.Data.SqlClient Imports System.Data.Common Namespace EnterpriseVB.VideoStore.Data     Public Class VideoCategory         Protected data As DataTable         Protected index As Integer         Protected subCategories As New ArrayList()         Protected Friend ReadOnly Property MyData() As DataRow()             Get                 Dim myRow() As DataRow = {Me.data.Rows(index)}                 Return myRow             End Get         End Property         Public Sub New(ByRef data As VideoCategoryData, ByVal index As Integer)             Me.data = data             Me.index = index         End Sub         Public Sub New()             Me.data = New VideoCategoryData()             data.Rows.Add(data.NewRow())             Me.index = 0         End Sub         Public Function GetColumn(ByRef ColumnName As String) As Object             Return data.Rows(index)(ColumnName)         End Function         Public Function SetColumn(ByRef ColumnName As String, ByRef _ ColumnValue As Object)             data.Rows(index)(ColumnName) = ColumnValue         End Function         Public Property CategoryID() As Int32             Get                 If (GetColumn("CategoryID").GetType() Is _ Type.GetType("System.DBNull")) Then                     Return -1                 End If                 Return CType(GetColumn("CategoryID"), Int32)             End Get             Set(ByVal Value As Int32)                 SetColumn("CategoryID", Value)             End Set         End Property         Public Property Description() As String             Get                 Return "" + GetColumn("Description")             End Get             Set(ByVal Value As String)                 SetColumn("Description", Value)             End Set         End Property         Public Property ParentCategoryID() As Int32             Get                 If (GetColumn("ParentCategoryID").GetType() Is _ Type.GetType("System.DBNull")) Then                     Return -1                 End If                 Return CType(GetColumn("ParentCategoryID"), Decimal)             End Get             Set(ByVal Value As Int32)                 SetColumn("ParentCategoryID", Value)             End Set         End Property         Public Property RootCategoryID() As Int32             Get                 If (GetColumn("RootCategoryID").GetType() Is _ Type.GetType("System.DBNull")) Then                     Return -1                 End If                 Return CType(GetColumn("RootCategoryID"), Int32)             End Get             Set(ByVal Value As Int32)                 SetColumn("RootCategoryID", Value)             End Set         End Property         Public Function AddSubCategory(ByRef subCat As VideoCategory)             Me.subCategories.Add(subCat)         End Function         Public Function FindCategoryByID(ByVal catID As Integer) As VideoCategory             Return Me.FindCategoryByID(Me, catID)         End Function         Friend Function FindCategoryByID(ByRef cat As VideoCategory, _ ByVal catID As Integer) As VideoCategory             If (catID = cat.CategoryID) Then                 Return cat             End If             Dim i As Integer             For i = 0 To cat.CountSubCategories() - 1                 If (FindCategoryByID(cat.GetSubCategory(i), catID) Is Nothing) Then                 Else                     Return cat.GetSubCategory(i)                 End If             Next             Return Nothing         End Function         Public Function CountSubCategories() As Int32             Return Me.subCategories.Count         End Function         Public Function GetSubCategory(ByVal i As Int32) As VideoCategory             Return CType(Me.subCategories(i), VideoCategory)         End Function     End Class     Public Class VideoCategoryData         Inherits DataTable         Public Sub New()             MyBase.New("VideoCategory")             Me.Columns.Add("CategoryID", Type.GetType("System.Int32"))             Me.Columns.Add("Description", Type.GetType("System.String"))             Me.Columns.Add("ParentCategoryID", Type.GetType("System.Int32"))             Me.Columns.Add("RootCategoryID", Type.GetType("System.Int32"))         End Sub     End Class     Public Class VideoCategoryDataAccess         Public connectionString As String         Protected adapter As SqlDataAdapter         Protected loadAll As SqlDataAdapter         Public Sub New()             connectionString = "Your Connection String;"             adapter = New SqlDataAdapter()             adapter.SelectCommand = New SqlCommand("VideoCategoryGetTree", New _ SqlConnection(connectionString))             adapter.SelectCommand.CommandType = CommandType.StoredProcedure             adapter.SelectCommand.Parameters.Add _             ("@RootCategoryID", SqlDbType.Decimal, 0, "RootCategoryID")         End Sub         Public Function GetCategoryTree(ByVal rootCategoryID As Int32) _ As VideoCategory             Dim data As New VideoCategoryData()             adapter.SelectCommand.Parameters("@RootCategoryID").Value = _ rootCategoryID             adapter.Fill(data)             Dim categories As VideoCategory()             categories = GetVideoCategoryArrayFromData(data)             Dim i As Integer             Dim x As Integer             If (categories.Length = 0) Then                 Return Nothing             End If             For i = 0 To categories.Length - 1                 For x = 0 To categories.Length - 1                     If (categories(i).CategoryID = _ categories(x).ParentCategoryID) Then                         categories(i).AddSubCategory(categories(x))                     End If                 Next             Next             Return categories(0)         End Function         Public Shared Function GetVideoCategoryArrayFromData(ByRef data As _ VideoCategoryData) As VideoCategory()             Dim vArray(data.Rows.Count - 1) As VideoCategory             Dim i As Integer             For i = 0 To (data.Rows.Count - 1)                 vArray(i) = New VideoCategory(data, i)             Next i             Return vArray         End Function     End Class End Namespace 
end example

The real magic of this object happens in the GetCategoryTree() method of the DAC. It's here that the category tree is assembled without recursion and with a single query. The DAC returns a single VideoCategory object that you can then work with and display. The net effect of this is you have a high-performance category tree and you'll never have to write and data access methods for the VideoCategory object no matter how many times you use it.

The VideoCategory object models the complete tree by providing a GetSubCategory() method to retrieve the subcategories of the current category. Internally, the VideoCategory object stores its subcategories in an ArrayList that it manages.

The VideoCategory object also provides a FindCategoryByID() method that does a search on the tree and returns the requested VideoCategory object. When a specific category gets referenced, either by the user selecting something or by some other method, it will likely be by the CategoryID. Without a method such as this, you would likely have to make another round trip to the database to load a category again that you already have in memory.

Figure 19-6 shows the output of a test harness. A test harness is a simple application used to exercise the features of a component. Test harnesses can be helpful in doing object-level or component-level debugging.


Figure 19-6: The output of the test harness

Working with a Much Larger Tree

The method described in the "Highly Nested Structures" section works great on most category trees with a reasonable amount of categories, but if you have a tree with potentially hundreds of categories, you'll need to use an alternate method.

For large structures you need to be able to load the tree in logical blocks. For example, look at the large structure-optimized version of the category tree in Table 19-3.

Table 19-3: The BlockedTree Table

DESCRIPTION

DATA TYPE

CategoryID

Int

Description

Char(20)

ParentCategoryID

Int

GroupID

Int

In this table's structure you can load the tree one block at a time and assemble the tree structure using your mapped objects in memory.

The stored procedure in Listing 19-8 allows you to request all the children of a specific category and get all of the categories in that group in a single block. The result is you can display a collapsed tree to the user, and as they expand the tree, the server won't have to hit the database for each expanded category.

Listing 19-8: The GetBlockedCategory Stored Procedure

start example
 CREATE PROCEDURE dbo. GetBlockedCategory @CategoryID int AS BEGIN SELECT CategoryID, Description, ParentCategoryID, GroupID FROM BlockedTree WHERE GroupID=     (SELECT GroupID FROM Category WHERE ParentCategoryID=@CategoryID) END 
end example

The only additional concern you have is writing the corresponding Visual Basic .NET code to take these blocks and dynamically assemble them on to your tree using a simple lazy-loading technique. (If you don't know what lazy loading is, don't worry—we cover it in the "Implementing Lazy Loading" section.)

Figure 19-7 shows the data in the BlockedTree table. Seeing this table with a sample of data in it should help you understand the intent of the structure.

click to expand
Figure 19-7: The BlockedTree table with data

start sidebar
Using a Better Tree Control

The test harness in this example uses a simple custom-coded method of displaying a tree, but there's a better way to do it. Microsoft provides the outstanding TreeView control for just this task in their Internet Explorer Web controls package at http://msdn.microsoft.com/library/default.asp?url=/workshop/webcontrols/overview/treeview.asp. Don't let the name fool you—it's not only for Internet Explorer. In fact, it works quite well in Netscape 4.x and 6, thanks to some clever coding by Microsoft.

The Internet Explorer Web controls package is free and includes several other useful Web components, such as a tabbed view and a docking-capable toolbar that will have your Web applications looking like desktop applications!

end sidebar

A Final Note on Tree Optimization

Using recursion to walk through the tree is not the most high-performance method of moving through a tree structure. As an alternative, you can create an index that looks like Table 19-4.

Table 19-4: An Indexed Table

CATEGORYID

DESCRIPTION

PARENTCATEGORYID

INDEX

1

Video Tapes

-1

1.0.0

2

Horror

1

1.2.0

3

Comedy

1

1.3.0

4

Romantic Comedy

3

1.3.4

5

Drama

1

1.5.0

The index is a string that shows the category at each stop of the tree. In Table 19-4, the root CategoryID is 1; therefore, all of the first numbers in the index are 1. The second number is the next level down in the tree, which is different for everything except the Romantic Comedy category. Romantic Comedy is a subcategory of Comedy, so they share the second number 3. The last number is 0 for everything but Romantic Comedy because only it is nested three levels deep.

The index would not actually be stored in the database but instead assembled in memory as a way to quickly iterate through the tree without recursion. The reason why it's not a good idea to necessarily store the index in the database is because you would have to use the substring function as part of your subquery, which would cause a table scan.

Implementing an index such as this is both a good and a bad thing depending on how you look at it. The good part is that once the tree is assembled, the tree can be iterated through and searched quickly and without recursion. The bad part is there's a small upfront cost in building the index. Because .NET is stateless and will forget the tree as soon as it's assembled, we included the "quick-and-dirty" method over building the full index for the examples in this chapter.

If you plan on caching a tree or implementing some type of method that reuses the same structure, we recommend adding an index.




Applied ADO. NET(c) Building Data-Driven Solutions
Applied ADO.NET: Building Data-Driven Solutions
ISBN: 1590590732
EAN: 2147483647
Year: 2006
Pages: 214

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net