20.5. Binary Search Trees
Binary trees are a data structure that impose an order on inserted nodes: items less than a node are stored in the left subtree, and items greater than a node are inserted in the right. At the bottom, the subtrees are empty. Because of this structure, binary trees naturally support quick, recursive traversalsat least ideally, every time you follow a link to a subtree, you divide the search space in half.[*]
Binary trees are named for the implied branch-like structure of their subtree links. Typically, their nodes are implemented as a triple of values: (LeftSubtree, NodeValue, RightSubtree). Beyond that, there is fairly wide latitude in the tree implementation. Here we'll use a class-based approach:
Instead of coding distinct search functions, binary trees are constructed with "smart" objects (class instances) that know how to handle insert/lookup and printing requests and pass them to subtree objects. In fact, this is another example of the object-oriented programming (OOP) composition relationship in action: tree nodes embed other tree nodes and pass search requests off to the embedded subtrees. A single empty class instance is shared by all empty subtrees in a binary tree, and inserts replace an EmptyNode with a BinaryNode at the bottom (see Example 20-13).
Example 20-13. PP3E\Dstruct\Classics\btree.py
As usual, BinaryTree can contain objects of any type that support the expected interface protocolhere, > and < comparisons. This includes class instances with the _ _cmp_ _ method. Let's experiment with this module's interfaces. The following code stuffs five integers into a new tree, and then searches for values 0 . . . 9:
C:\...\PP3E\Dstruct\Classics>python >>> from btree import BinaryTree >>> x = BinaryTree( ) >>> for i in [3,1,9,2,7]: x.insert(i) ... >>> for i in range(10): print (i, x.lookup(i)), ... (0, 0) (1, 1) (2, 1) (3, 1) (4, 0) (5, 0) (6, 0) (7, 1) (8, 0) (9, 1)
To watch this tree grow, add a print statement after each insert. Tree nodes print themselves as triples, and empty nodes print as *. The result reflects tree nesting:
>>> y = BinaryTree( ) >>> y * >>> for i in [3,1,9,2,7]: ... y.insert(i); print y ... ( *, 3, * ) ( ( *, 1, * ), 3, * ) ( ( *, 1, * ), 3, ( *, 9, * ) ) ( ( *, 1, ( *, 2, * ) ), 3, ( *, 9, * ) ) ( ( *, 1, ( *, 2, * ) ), 3, ( ( *, 7, * ), 9, * ) )
At the end of this chapter, we'll see another way to visualize trees in a GUI (which means you're invited to flip ahead now). Node values in this tree object can be any comparable Python objectfor instance, here is a tree of strings:
>>> z = BinaryTree( ) >>> for c in 'badce': z.insert(c) ... >>> z ( ( *, 'a', * ), 'b', ( ( *, 'c', * ), 'd', ( *, 'e', * ) ) ) >>> z = BinaryTree( ) >>> for c in 'abcde': z.insert(c) ... >>> z ( *, 'a', ( *, 'b', ( *, 'c', ( *, 'd', ( *, 'e', * ) ) ) ) )
Notice the last result here: if items inserted into a binary tree are already ordered, you wind up with a linear structure and lose the search-space partitioning magic of binary trees (the tree grows in right branches only). This is a worst-case scenario, and binary trees generally do a good job of dividing values in practice. But if you are interested in pursuing this topic further, see a data structures text for tree-balancing techniques that automatically keep the tree as dense as possible.
Also note that to keep the code simple, these trees store a value only and lookups return a 1 or (true or false). In practice, you sometimes may want to store both a key and an associated value (or even more) at each tree node. Example 20-14 shows what such a tree object looks like, for any prospective lumberjacks in the audience.
Example 20-14. PP3E\Dstruct\Classics\btree-keys.py
Here is this script's self-test code at work; nodes simply have more content this time around:
C:\...\PP3E\Dstruct\Classics>python btree-keys.py ( ( *, 'aaa'=2, * ), 'bbb'=1, ( *, 'ccc'=3, * ) ) 2 3 ( ( *, 'aaa'=5, * ), 'bbb'=1, ( *, 'ccc'=3, ( *, 'ddd'=4, * ) ) )