315 Notes, Wednesday, May 7, 2008

Dynamic Programming

We will conclude our discussions of the subject with the well-studied domain of ADT StaticDictionary.

Optimal Binary Search Tree (OBST)

In the static case of a search structure (no insertions or deletions), we assume that we have available the statistics of searches (as frequency, or probability of element accesses) to guide us in the construction of a binary search tree (BST) to implement the StaticDictionary. The formal description of the problem (of constructing an optimal binary search tree) is as follows:

We are given the sequence p_1, p_2, ..., p_n of probabilities of accessing the elements whose keys have ordinals (positions in key order) 1, 2, ..., n, respectively. The goal is a binary search tree with n nodes that minimizes the expected access time.

Let us call the expected access time for a given binary search tree T its cost, c(T). This cost is the sum of node depths weighted by the probability of access: c(T)=∑_{1≤i≤n}d(i)p_i, where d(i) is the depth of node i in T. We notice that if k is the root of the tree T and T', T'' are its left and right principal subtrees, then d(i)=1+d'(i) for i<k and d(j)=1+d''(j) for j>k (where d'(i) and d''(j) are the depths of i and j in T' and T'', respectively). Assuming the root is counted at depth 1, so that d(k)=1, we can express the cost of T by the costs of T' and T'':

c(T) = c(T') + c(T'') + w(T)

(where w(T), the weight of T, is the sum of access probabilities for the nodes of T.)
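
As a quick numerical check, the sketch below evaluates both sides of the relation c(T) = c(T') + c(T'') + w(T) on a three-node tree. It assumes the root is counted at depth 1; the tuple representation of trees and the names cost and weight are hypothetical, chosen only for this illustration.

```python
# Hypothetical representation: a tree is None or a tuple (left, key, right),
# where key is an ordinal in 1..n and p[key] is its access probability.

def cost(tree, p, depth=1):
    """Sum of depth * probability over all nodes (assumption: root has depth 1)."""
    if tree is None:
        return 0.0
    left, key, right = tree
    return depth * p[key] + cost(left, p, depth + 1) + cost(right, p, depth + 1)

def weight(tree, p):
    """Sum of the access probabilities of all nodes in the tree."""
    if tree is None:
        return 0.0
    left, key, right = tree
    return p[key] + weight(left, p) + weight(right, p)

# Example probabilities (made up for illustration) and a tree rooted at key 2.
p = {1: 0.2, 2: 0.5, 3: 0.3}
left = (None, 1, None)
right = (None, 3, None)
tree = (left, 2, right)

lhs = cost(tree, p)                                     # c(T)
rhs = cost(left, p) + cost(right, p) + weight(tree, p)  # c(T') + c(T'') + w(T)
print(lhs, rhs)
```

Both values agree up to floating-point rounding: pushing each subtree one level down adds its weight to the cost, and the root contributes its own probability once.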

Now it is time to state the Principle of Optimality: an optimal solution has optimal components. In the OBST case, it means that if T is optimal, then T' and T'' have to be optimal for the sets of keys they contain. The immediate implication for the OBST construction algorithm is that only information about optimal solutions to subproblems is necessary to find the optimal T. (If we do not know which k is the optimal root, we can try all possible k's, 1≤k≤n, and, given the optimal costs of the corresponding subtrees, find in O(n) time the one that minimizes the cost.)
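
One way to turn this observation into a procedure is sketched below: solve the subproblems for all contiguous key ranges in order of increasing size, trying every root in each range. This is a minimal illustration and not necessarily the formulation these notes go on to give. It assumes the cost relation c(T) = c(T') + c(T'') + w(T) with the root at depth 1, and the names optimal_bst, prefix, c, and root are hypothetical.

```python
def optimal_bst(p):
    """p[0..n-1]: access probabilities of the keys in sorted order.

    Returns (optimal cost, root table). Ranges are half-open: (i, j) stands
    for keys i..j-1. Assumption: the root of a tree is counted at depth 1.
    """
    n = len(p)
    # prefix[j] - prefix[i] is the weight w of the keys i..j-1.
    prefix = [0.0] * (n + 1)
    for i, x in enumerate(p):
        prefix[i + 1] = prefix[i] + x

    # c[i][j]: cost of an optimal tree on keys i..j-1; an empty range costs 0.
    c = [[0.0] * (n + 1) for _ in range(n + 1)]
    # root[i][j]: index of a root achieving c[i][j].
    root = [[None] * (n + 1) for _ in range(n + 1)]

    for size in range(1, n + 1):
        for i in range(n - size + 1):
            j = i + size
            # Try every key k in the range as the root; only the optimal
            # costs of the two sub-ranges are needed for the comparison.
            best_k = min(range(i, j), key=lambda k: c[i][k] + c[k + 1][j])
            c[i][j] = c[i][best_k] + c[best_k + 1][j] + (prefix[j] - prefix[i])
            root[i][j] = best_k
    return c[0][n], root

best_cost, root = optimal_bst([0.2, 0.5, 0.3])
print(best_cost, root[0][3])
```

There are O(n^2) ranges and each one examines O(n) candidate roots, so this sketch runs in O(n^3) time and uses O(n^2) space. The root table is enough to rebuild the tree recursively from root[0][n].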

Following the Principle of Optimality, we can state the Dynamic Programming algorithm design paradigm: