UWB and UW Seal
   
Clark F. Olson
Publications
By type:
Journal papers
Conference papers
Book chapters
Technical reports
NASA tech briefs
Theses
By subject:
Curve detection
Image matching
Indexing
Ordnance recognition
Pose clustering
Robot localization
Theory
Parallel Algorithms for Hierarchical Clustering
Clark F. Olson
Technical report UCB/CSD-94-786, Computer Science Division, University of California at Berkeley, 1994.
Hierarchical clustering is a common method used to determine clusters of similar data points in multi-dimensional spaces. O(n^2) algorithms, where n is the number of points to cluster, have long been known for this problem [Sibson, 1973; Defays, 1977; Day and Edelsbrunner, 1984]. This paper discusses parallel algorithms to perform hierarchical clustering using various distance metrics. I describe O(n) time algorithms for clustering using the single link, average link, complete link, centroid, median, and minimum variance metrics on an n node CRCW PRAM and O(n\log n) algorithms for these metrics (except average link and complete link) on n/log n node butterfly networks or trees. Thus, optimal efficiency is achieved for a significant number of processors using these distance metrics. A general algorithm is given that can be used to perform clustering with the complete link and average link metrics on a butterfly. While this algorithm achieves optimal efficiency for the general class of metrics, it is not optimal for the specific cases of complete link and average link clustering.