Union-find
Imagine we have a graph (e.g., a network) like this
and we are interested in reachability queries. For example
Is g reachable from a? — NO
Is there a path between i and h? — YES
However, to make things more interesting, our graph is not static
and we can link two nodes.
Link f and j:
Is there a path between e and j? — YES
Is there a path between c and i? — NO
Link b and g:
Is there a path between c and i now? — YES
Link a and h:
Link a and d:
Is there a path between e and h? — NO
Is there a path between d and g? — YES
This problem can be formulated a little more abstractly: given a collection of disjoint sets, we would like to support the following two operations:
union(A,B)
– replaces sets A and B with their unionfind(x)
– find set X where x belongs.
In our network application, the nodes reachable from each other form disjoint sets (called connected components). By linking two nodes from two different components, we get one big component – their union. Nodes x and y are connected by a path if and only if they belong to the same component, i.e., if find(x)=find(y)
.
Link f and j: union C and E
Link b and g: union A and D
Link a and h; link a and d
Applications
Kruskal’s algorithm
Unification
Other:
Gilbert et al. (1994) An efficient algorithm to compute row and column counts for sparse Cholesky factorization. SIAM Journal on Matrix Analysis and Applications, 15:1075.
Gabow, H. and Tarjan, R. (1985). A linear-time algorithm for a special case of disjoint set union. Journal of computer and system sciences, 30(2):209–221.