Class 9: Binary Search Trees

Definitions

 

Binary search tree is a binary tree that stores keys and satellite data. The keys are stored in such a way to satisfy the following property:

 

 

The above property allows us to print out all keys in a BST in sorted order by an inorder tree walk algorithm (left, root, right). Similar algorithms include preorder tree walk (root, left, right), and postorder tree walk (left, right, root). For example:

 

Inorder-Tree-Walk (x)

1     if x <> NIL

2           Inorder-Tree-Walk(left[x])

3           print key[x]

4           Inorder-Tree-Walk(right[x])

 

It takes Q(n) to walk an n-nodeBST, since the procedure is called exactly twice for each node of the tree (left and right child) + printing the node. More formally:

 

Theorem 12.1

 

If x is the root of an n-node subtree, then Inorder-Tree-Walk(x) takes Q(n) time.

 

Proof. On an empty subtree Inorder-Tree-Walk(x) takes a constant time (to compare to NIL), so T(0) = c for some positive constant c.

 

For n>0, suppose the procedure is called on a node x, whose left subtree has k nodes, and the right subtree has (n-k-1) nodes. The time to perform the algorithm is:

 

            T(n) = T(k) + T(n-k-1) + d,

 

where d is time to print x.

 

We use the substitution method to show that T(n) = (c+d)n + c. For n=0: T(0) = c. For n>0 we have:

 

T(n) = T(k) + T(n-k-1) + d =

 

            ((c+d)k + c) + ((c+d)(n-k-1) + c) + d =

 

            ck + dk + c +cn –ck –c + dn –dk –d + c + d =

 

            cn + dn + c =

 

            (c+d)n + c       ¢

Querying

 

We use the following procedure to search for a node in BST.

 

Tree-Search(x,k)

Input: root x and key k

Output: pointer to a node with key k, or NIL if not found

1     if x = NIL or k=key[x]

2           return x

3     if k < key[x]

4           return Tree-Search(left[x],k)

5     else

6           return Tree-Search(right[x],k)

 

The nodes encountered form a path downward from the root of the tree, hence the running time of search is O(h), where h is the height of the tree.

 

The same algorithm can be written iteratively:

 

Iterative-Tree-Search(x,k)

1     while x <> NIL and k <> key[x]

2           if k < key[x]

3                 x = left[x]

4           else

5                 x = right[x]

6     return x         

 

Minimum and maximum elements in BST can be found by following the left, or the right subtree respectively:

 

Tree-Minimum(x)

1     while left[x] <> NIL

2           x = left[x]

3     return x

 

Tree-Maximum(x)

1     while right[x] <> NIL

2           x = right[x]

3     return x

 

If all keys are distinct, the successor of the node x is the smallest key greater than key[x]. The following algorithm finds the successor without ever comparing keys.

 

Tree-Successor(x)

1     if right[x] <> NIL

2           return Tree-Minimum(right[x])

3     y = parent[x]

4     while y <> NIL and x = right[y]

5           x = y

6           y = parent[y]

7     return y

 

When the right subtree is empty, the successor of x is the lowest ancestor of x, whose left child is also ancestor of x (lines 3-7). To find such ancestor, we find the first one that is not on the "right subtree" path to x.

 

The running time of the above algorithm is O(h) since we either follow a path down or up from the node.

 

Modifying

 

The following procedure inserts a node z (key[z] = v, left[z]=right[z]=NIL) into a tree T. The algorithm works like Tree-Search by tracing the path downward. The node will always be inserted as a leaf (or root if the tree is empty).

 

Tree-Insert(T,z)

1     y = NIL

2     x = root[T]

3     while x <> NIL

4           y = x

5           if key[z] < key[x]

6                 x = left[x]

7           else

8                 x = right[x]

9      parent[z] = y

10    if y = NIL // Tree T is empty

11          root[T] = z

12    else

13          if key[z] < key[y]

14                left[y] = z

15          else

16                right[y] = z

 

The above algorithm runs in O(h) time on a tree of height h.

 

The algorithm to delete a node from a BST takes a pointer to the node. It considers the following three cases:

 

 

Tree-Delete(T,z)

1     if left[z] = NIL and right[z] = NIL

2           y = z

3     else

4           y = Tree-Successor(z)

5     // y is the node to "splice out"

6     if left[y] <> NIL

7           x = left[y]

8     else

9           x = right[y]

10    // x will be a new child of y's parent

11    if x <> NIL

12          parent[x] = parent[y] // x has a new parent

13    if parent[y] = NIL

14          root[T] = x

15    else

16          if y = left[parent[y]] // y was a left child

17                left[parent[y]] = x

18          else

19                right[parent[y]] = x

20    // copy y to the new place if necessary

21    if y <> z

22          key[z] = key[y]

23          <copy y's satellite data into z>

24    return y

 

The procedure runs in O(h) as it takes constant time to change the pointers after the sliceable node is found, which in the worst-case takes the same time as the Tree-Successor algorithm.