Optimal Binary Search Tree

6 min readAug 25, 2019

In Binary Search Tree (BST)we know that for each node in the tree, left-sub tree of that particular node contains lesser value than the parent node and similarly for right-sub tree that it contains higher value than the parent node.

Basically, we know the key values of the each node, but let say we also know the frequencies of each node in the process of searching, means that how much time particular node is being searched. So, by knowing the frequencies of each node and their key values we can find the overall cost of searching.

In many applications the cost of searching is very important. So, it is required that the overall cost of searching should be as less as possible. And we know that search time of BST is more than the Balanced Binary Search Tree, as Balanced Binary Search tree has less number of levels than the BST. And there is one way which can further reduce the cost than the Balanced BST, which is Optimal Binary Search Tree . Let us understand that by following example.

As there are 3 different keys, so we can have total 5 various BST by changing order of keys. And which van be found by, where “n” is the number of keys.

So following are the various possible Binary Search Trees of the above data. And also the overall cost for searching for each BST.

The cost is computed by multiplying the each node’s frequency with the level of tree( Here we are assuming that the tree starts from level 1 ) and then add them to compute the overall cost of BST.

As we can see in Figure 2 there are various 5 types of arrangements are possible. And as we have discussed earlier that, it is possible to further reduce the cost of Balanced BST which is specified in following figure 3.

As it is shown in above figure that 2nd BST is balanced and the 4th BST is not balanced, though it’s cost is less than the cost of Balanced BST and its cost is the least among all, so it is our Optimal Binary Search Tree for the given data in Figure 1. So here our idea to generate the Optimal Binary Search Tree is that, the nodes whose frequencies are more should appear in the lower levels of Tree .i.e, In our example node with key 40 is having highest frequency and which appears at level 1 which is our Optimal BST.

Here note that: if the number of nodes are less then we can find optimal BST by checking all possible arrangements, but if the nodes are greater than 3 like 4,5,6….. then respectively 14,42,132….. , different BSTs are possible so by checking all arrangements to find Optimal Cost may lead to extra overhead. So we will see now another approach to solve the problem of Optimal BST using Dynamic Programming Approach.

Using Dynamic Approach:-

where c[ i , j ] in which c stands for cost, “ pi ” stands for frequency of node i. We will use matrix form to solve this problem.Problem Statement is as following and we have to find Optimal BST for that:

SOLUTION
Step: 1 :: According to above formula: 2 we derived the following matrix, that c[i,i] = Pi and for i>j → c[i,j] = 0;

Step: 2 :: So now compute value of C[0,1] as following and put it in table.

and the table after this step will look like…

Step: 3 :: Do same for all elements like c[1,2],c[2,3],c[0,2],c[1,3],c[0,3] and check your answer with the following table……..

Step:4 :: So now we want to arrange the nodes in such a way that they will cost the minimum value which we derived previously. For that we have to choose root(parent node)in every group of nodes. See in step 2: for node 0 and node 1 the parent node will be decided by the value of K for which we took the minimum cost, in our example for value of k=0, we are able to find the minimum value SO parent node will be node 0 and similarly do for others and make table as following which will help you to arrange nodes very easily.

Step: 5:: After creating table as shown above go from BOTTOM → UP. As shown in table Node 2 appears at bottom of the table so it is root of our BST. In our example the keys are sorted so we can say that node 3 will appear at the right sub-tree of node 2 , and node 0 and node 1 will appear at the left sub-tree of the node 2 as shown below.

So now there is one question that “ Which node is parent node between node 0 and node 1??” For that again Go from Bottom →Up and see that node 0 is parent node , because when we were computing value of c[0,1] at that time we came to know that between 0,1 node 0 is parent node. And as in our example the keys are sorted so node 1 will appear in the right sub-tree of node 0(because key value of node 1 is higher than the key value of node 0). So our Optimal Binary Search Tree for our example will look like…….