We have discussed several possible implementations of this ADT, none entirely satisfactory from the point of view of the complexity of its operations. Today, we introduce and analyze an ingenious and efficient implementation by a collection of trees.
Each set is represented by a tree with nodes representing the members of the set. Information about the set is stored in (available in constant time from) the root of the tree; each node (except for the root) points to its parent in the tree. Nodes representing particular elements are accessible (through an indexing mechanism) in constant time. The Union of two sets can be implemented by making the root of one tree the parent of the root of the other. Clearly, this can be done in constant time. The Find operation can be implemented by traversing the parent pointers from the relevant node. The complexity of this operation is thus proportional to the depth of the node ("the length of the find-path.")
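The representation described above can be sketched in a few lines. This is only a minimal illustration, assuming the elements are the integers 0..n-1 so that a plain list serves as the indexing mechanism; the class name NaiveForest and the convention that a root is its own parent are assumptions of this sketch, not part of the notes.

```python
class NaiveForest:
    """Disjoint sets as a forest of parent pointers, with no balancing."""

    def __init__(self, n):
        # Assumption: elements are 0..n-1; parent[x] == x marks x as a root.
        self.parent = list(range(n))

    def find(self, x):
        # Follow parent pointers to the root; the cost is the depth of x.
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, root_a, root_b):
        # Both arguments are assumed to be roots of two different trees.
        # A single pointer assignment, hence constant time.
        self.parent[root_b] = root_a


def depth(forest, x):
    """Length of the find-path from x to its root."""
    steps = 0
    while forest.parent[x] != x:
        x = forest.parent[x]
        steps += 1
    return steps
```

With this sketch, calling union(i, i - 1) for i = 1, 2, ..., n-1 builds a single chain, so the find-path of element 0 has length n-1.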
It is easy to construct a worst-case scenario in which the length of the find-path is proportional to the size of the tree. We can attempt to "balance the tree", for instance, by keeping the height of the tree available in constant time and using it to increase the height of the union only when both component trees have the same height. (Recall Fibonacci Trees from the discussion of AVL trees.) This leads to a logarithmic bound on the height of the trees (by induction, a tree of height h built this way has at least 2^h nodes) and thus on the complexity of Find.
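A possible rendering of this height-guided Union is sketched below. It assumes integer elements 0..n-1 and a height of 0 for a single-node tree; the class name HeightForest is hypothetical, and unlike the description above, union here accepts arbitrary elements and looks up their roots first.

```python
class HeightForest:
    """Forest of parent pointers with Union guided by tree height."""

    def __init__(self, n):
        self.parent = list(range(n))  # parent[x] == x marks a root
        self.height = [0] * n         # meaningful only at roots

    def find(self, x):
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra  # already in the same set
        # Make ra the root of the taller (or equally tall) tree.
        if self.height[ra] < self.height[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        # The height grows only when both trees had the same height.
        if self.height[ra] == self.height[rb]:
            self.height[ra] += 1
        return ra
```

Merging eight singletons pairwise (then the pairs, then the quadruples) produces a tree of height 3, while merging them one at a time into the same set never exceeds height 1.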
An additional "eager" operation improves on this bound even further: after traversing the find-path, compress it by making every traversed node point directly to the root as its parent. This is called path compression; the procedure is "eager" because we hope to make future Finds more efficient. Besides, it comes at "no cost", as the time of this operation is again proportional to the length of the find-path. The only question is, how much more efficient?
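Path compression can be sketched as a two-pass Find: one pass to locate the root, a second to re-point the traversed nodes. The function name find_compress and the list-based parent array (with parent[x] == x at a root) are assumptions made for this illustration.

```python
def find_compress(parent, x):
    """Return the root of x and compress the find-path of x."""
    # First pass: walk up the parent pointers to locate the root.
    root = x
    while parent[root] != root:
        root = parent[root]
    # Second pass: make every node on the find-path a child of the root.
    while x != root:
        next_node = parent[x]
        parent[x] = root
        x = next_node
    return root
```

Both passes visit the same nodes, so the total work remains proportional to the length of the find-path. For the chain parent = [0, 0, 1, 2, 3], the call find_compress(parent, 4) returns 0 and leaves every node pointing directly at 0.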
First, we implement Union guided by the sizes of the component trees (this parameter, stored at the root of each tree, will also be used in our analysis): always make the root of the larger tree the parent of the root of the smaller tree. The sum of the two sizes then becomes associated with the new overall root; the size recorded at the other root never changes after this operation.
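The size-guided Union might look as follows. This is a sketch under stated assumptions: elements are the integers 0..n-1, ties are broken in favor of the first argument's root, and Find uses path compression; the class name SizeForest is hypothetical.

```python
class SizeForest:
    """Forest of parent pointers with Union by size and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))  # parent[x] == x marks a root
        self.size = [1] * n           # number of nodes in the tree rooted at x

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Compress the find-path: point every traversed node at the root.
        while x != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra  # already in the same set
        # Make ra the root of the larger tree (assumption: ties keep ra).
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        # Only the surviving root's size is updated; size[rb] is frozen.
        self.size[ra] += self.size[rb]
        return ra
```

Note that size[rb] is deliberately left untouched once rb stops being a root, matching the remark that the other root's size never changes after the operation.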