Section 7.3. Practical Implementation of Trees


7.3. Practical Implementation of Trees

The following subsections provide examples of each of the three hierarchy models. In each case, rows have been inserted into the example tables in the same order (ordered by commander) in an attempt to divorce the physical order of the rows from the expected result. Remember that the design is questionable, and that the purpose is to show in as simple a way as possible how to handle trees according to the model under discussion.

7.3.1. Adjacency Model

The following table describes the hierarchical organization of an army using the adjacency model . The table name I've chosen to use is, appropriately enough, ADJACENCY_MODEL. Each row in the table describes a military unit. The parent_id points upward in the tree to the enclosing unit:

      Name                            Null?    Type      ------------------------------- -------- --------------      ID                              NOT NULL NUMBER      PARENT_ID                                NUMBER      DESCRIPTION                     NOT NULL VARCHAR2(120)      COMMANDER                                VARCHAR2(120) 

Table ADJACENCY_MODEL has three indexes: a unique index on id (the primary key), an index on parent_id, and an index on commander. Here are a few sample lines from ADJACENCY_MODEL:

     ID  PARENT_ID DESCRIPTION                  COMMANDER     --- --------- ---------------------------- -----------------------------     435         0 French Armée du Nord of 1815 Emperor Napoleon Bonaparte     619       435 III Corps                    Général de Division Dominique                                                Vandamme     620       619 8th Infantry Division        Général de Division Baron                                                Etienne-Nicolas Lefol     621       620 1st Brigade                  Général de Brigade Billard                                                (d.15th)     622       621 15th Rgmt Léger              Colonel Brice     623       621 23rd Rgmt de Ligne           Colonel Baron Vernier     624       620 2nd Brigade                  Général de Brigade Baron                                                Corsin     625       624 37th Rgmt de Ligne           Colonel Cornebise     626       620 Division Artillery     627       626 7/6th Foot Artillery         Captain Chauveau 

7.3.2. Materialized Path Model

Table MATERIALIZED_PATH_MODEL stores the same hierarchy as ADJACENCY_MODEL but with a different representation. The (id, parent_id) pair of columns associating adjacent nodes is replaced with a single materialized_path column that records the full "ancestry" of the current row:

      Name                                Null?    Type      ----------------------------------- -------- ----------------      MATERIALIZED_PATH                   NOT NULL VARCHAR2(25)      DESCRIPTION                         NOT NULL VARCHAR2(120)      COMMANDER                                    VARCHAR2(120) 

Table MATERIALIZED_PATH_MODEL has two indexes, a unique index on materialized_path (the primary key), and an index on commander. In a real case, the choice of the path as the primary key is, of course, a very poor one, since people or objects rarely have as a defining characteristic their position in a hierarchy. In a proper design, there should be at least some kind of id, as in table ADJACENCY_MODEL. I have suppressed it simply because I had no use for it in my limited tests.

However, my questionable choice of materialized_path as the key was also made with the idea of checking in that particular case the benefit of the special implementations discussed in Chapter 5, in particular, what happens when the table that describes a tree happens to map the tree structure of an index? In fact, in this particular example such mapping makes no difference.

Here are the same sample lines as in the adjacency model, but with the materialized path:

     MATERIALIZED_PATH DESCRIPTION                  COMMANDER     ----------------- ---------------------------- --------------------------     F                 French Armée du Nord of 1815 Emperor Napoleon Bonaparte     F.3               III Corps                    Général de Division                                                    Dominique Vandamme     F.3.1             8th Infantry Division        Général de Division Baron                                                    Etienne-Nicolas Lefol     F.3.1.1           1st Brigade                  Général de Brigade Billard                                                    (d.15th)     F.3.1.1.1         15th Rgmt Léger              Colonel Brice     F.3.1.1.2         23rd Rgmt de Ligne           Colonel Baron Vernier     F.3.1.2           2nd Brigade                  Général de Brigade Baron                                                    Corsin     F.3.1.2.1         37th Rgmt de Ligne           Colonel Cornebise     F.3.1.3           Division Artillery     F.3.1.3.1         7/6th Foot Artillery         Captain Chauveau 

7.3.3. Nested Sets Model (After Celko)

With the nested set model , we have two columns, left_num and right_num, which describe how each row relates to other rows in the hierarchy. I'll show shortly how those two numbers are used to specify a hierarchical position:

      Name                                Null?    Type      ----------------------------------- -------- -------------      DESCRIPTION                                  VARCHAR2(120)      COMMANDER                                    VARCHAR2(120)      LEFT_NUM                            NOT NULL NUMBER      RIGHT_NUM                           NOT NULL NUMBER 

Table NESTED_SETS_MODEL has a composite primary key, (left_num, right_num) plus an index on commander. As with the materialized path model, this is a poor choice but it is adequate for our present tests.

It is probably time now to explain how the mysterious numbers, left_num and right_num, are obtained. Basically, one starts from the root of the tree, assigning 1 to left_num for the root node. Then all child nodes are recursively visited, as shown in Figure 7-1, and a counter increases at each call. You can see the counter on the line in the figure. It begins with 1 for the root node and increases by one as each node is visited.

Figure 7-1. How nested sets numbers are assigned


Say that we visit a node for the very first time. For instance, in the example of Figure 7-1, after having assigned the integer 1 to the left_num value of the 1st Corps node, we encounter (for the first time) the node 1st British Guards Division. We increase our counter and assign 2 to left_num. Then we visit the node's children, encountering for the first time 1st Guards Brigade and assigning the value of our counter, 3 at this stage, to left_num. But this node, on this example, has no child. Because there is no child, we increment our counter and assign its value to right_num, which in this case takes the value 4. Then we move on to the node's sibling, 2nd Guards Brigade. It is the same story with this sibling. Finally, we returnour second visitto the parent node 1st British Guards Division and can assign the new value of our counter, which has now reached 7, to its right_num. We then proceed to the next sibling, 3rd Anglo-German Division, and so on.

As mentioned earlier, you can see that the [left_num, right_num] pair of any node is enclosed within the [left_num, right_num] pair of any of its ascendantshence the name of nested sets. Since, however, we have three independent trees (the Anglo-Dutch, Prussian, and French armies), which is called in technical terms a forest, I have had to create an artificial top level that I have called Armies of 1815. Such an artificial top level is not required by the other models.

Here is what we get from our example after having computed all numbers:

     DESCRIPTION                  COMMANDER                  LEFT_NUM RIGHT_NUM     ---------------------------- -------------------------- -------- ----------     Armies of 1815                                                 1       1622     French Armée du Nord of 1815 Emperor Napoleon Bonaparte      870       1621     III Corps                    Général de Division            1237       1316                                  Dominique Vandamme     8th Infantry Division        Général de Division Baron      1238       1253                                  Etienne-Nicolas Lefol     1st Brigade                  Général de Brigade Billard     1239       1244                                  (d.15th)     15th Rgmt Léger              Colonel Brice                  1240       1241     23rd Rgmt de Ligne           Colonel Baron Vernier          1242       1243     2nd Brigade                  Général de Brigade Baron       1245       1248                                  Corsin     37th Rgmt de Ligne           Colonel Cornebise              1246       1247     Division Artillery                                          1249       1252     7/6th Foot Artillery         Captain Chauveau               1250       1251 

The rows in our sample that are at the bottom level in the hierarchy can be spotted by noticing that right_num is equal to left_num + 1.

The author of this clever method claims that it is much better than the adjacency model because it operates on sets and that is what SQL is all about. This is perfectly true, except that SQL is all about unbounded sets, whereas his method relies on finite sets, in that you must count all nodes before being able to assign the right_num value of the root. And of course, whenever you insert a node somewhere, you must renumber both the left_num and right_num values of all the nodes that should be visited after the new node, as well as the right_num value of all its ascendants. The necessity to modify many other items when you insert a new item is exactly what happens when you store an ordered list into an array: as soon as you insert a new value, you have to shift, on average, half the array. The nested set model is imaginative, no doubt, but a relational nightmare, and it is difficult to imagine worse in terms of denormalization. In fact, the nested sets model is a pointer-based solution, the very quagmire from which the relational approach was designed to escape.




The Art of SQL
The Art of SQL
ISBN: 0596008945
EAN: 2147483647
Year: N/A
Pages: 143

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net