On the Parallel Complexity of Hierarchical Clustering and
CC-Complete Problems
Dr. Raymond Greenlaw
School of Computing
Armstrong Atlantic State University
and
Dr. Sanpawat Kantabutra
Department of Computer Science
Chiang Mai University
On the Parallel Complexity of Hierarchical Clustering and CC-Complete Problems — Greenlaw and Kantabutra — 2
Outline
• Introduction
• Preliminaries
• Algorithms for Hierarchical Clustering
• Complexity of Hierarchical Clustering
• CC-Complete Problems
• Conclusions and Open Problems
• References
• Acknowledgments
Introduction
• Clustering is a division of data into groups of ‘similar’ objects, where each group is given a more-compact representation.
• Used to model very large data sets.
• Points are more similar to their own cluster than to points in other clusters.
Introduction
• Useful tool in data mining, where immense data sets that are difficult to store and to manipulate are involved.
• Study the parallel complexity of the hierarchical clustering problem.
• Builds a tree of clusters.
• Sibling clusters in this tree partition the points associated with their parent.
• Can explore data using various levels of granularity.
Introduction
• Two widely studied models
– Bottom-Up: Starts with single-point clusters and then recursively merges two or more of the most-‘appropriate’ clusters.
– Top-Down: Starts with one large cluster consisting of all the data points and then recursively splits the most-‘appropriate’ cluster.
• In both methods, the process continues until a desired stopping condition is met, such as a required number of clusters or a diameter bound of the ‘largest’ cluster.
Introduction
• A variety of sequential versions of hierarchical-clustering methods have been studied:
– Cure (Guha et al.): Bottom-Up, good for clusters having arbitrary shapes or outliers
– Chameleon (Karypis et al.): Bottom-Up, relies heavily on graph partitioning
– Principal Direction Divisive Partitioning (Boley): Top-Down, good for document collections
– Hierarchical Divisive Bisecting k-means (Steinbach et al.): Top-Down
Introduction
• Address the parallel complexity of hierarchical clustering.
• Describe known sequential algorithms for top-down and bottom-up hierarchical clustering.
• Parallelize the top-down approach: when n points are to be clustered, provide an $O(\log n)$-time, $n^2$-processor CREW-PRAM algorithm that computes the same output as the corresponding sequential algorithm.
Introduction
• Define a natural decision problem based on bottom-up hierarchical clustering and add this Hierarchical Clustering Problem (HCP) to the list of CC-complete problems, adding a data mining problem for the first time.
• Show that HCP is one of the computationally most-difficult problems in the Comparator Circuit Value Problem (CCVP) class.
Introduction
• Demonstrate that the HCP is very unlikely to have an NC algorithm.
• In sharp contrast, give an NC algorithm for the top-down sequential approach.
• Parallel complexities of top-down and bottom-up are different, unless CC equals NC.
Preliminaries
• Interested in relating the complexity of hierarchical clustering to that of a problem involving Boolean circuits containing comparator gates.
• Comparator gates have two output wires, the first outputting the minimum and the second outputting the maximum of its two inputs.
• Each output has a maximum fanout of one.
Preliminaries
• Based on the comparator gate
• Basis for an entire complexity class
Comparator Circuit Value Problem (CCVP)
• Given: An encoding of a Boolean circuit composed of comparator gates, inputs x1,…,xn, and a designated output y.
• Problem: Is output y of the circuit TRUE on input x1,…,xn?
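The CCVP definition above can be illustrated with a small sequential evaluator. This is only a sketch under an assumed encoding that is not taken from the slides: the circuit is a list of wires, and each comparator gate is a pair of wire indices `(i, j)` that overwrites wire `i` with the minimum and wire `j` with the maximum. Reusing wires this way keeps every output at fanout one. The function name `eval_comparator_circuit` is hypothetical.

```python
def eval_comparator_circuit(inputs, gates, output_wire):
    """Evaluate a comparator circuit on Boolean inputs.

    inputs: list of bools, one per wire (assumed encoding).
    gates: list of (i, j) wire-index pairs, applied in the given order.
           After a gate, wire i holds the minimum (AND) of the two values
           and wire j holds the maximum (OR).
    output_wire: index of the designated output y.
    """
    wires = list(inputs)
    for i, j in gates:
        lo = min(wires[i], wires[j])
        hi = max(wires[i], wires[j])
        wires[i], wires[j] = lo, hi
    return wires[output_wire]


if __name__ == "__main__":
    # Three wires and two comparator gates.
    print(eval_comparator_circuit([True, False, True], [(0, 1), (1, 2)], 1))
```

Evaluating gate by gate like this is inherently sequential; the question the slides address is how far such evaluations can be sped up in parallel.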
Preliminaries
• Let P denote the class of all languages decidable in polynomial time.
• Let NC denote the class of all languages decidable in poly-logarithmic time using a polynomial number of processors on a PRAM.
• Let RNC denote the randomized version of NC.
• Let NLOG denote the class of languages decidable in non-deterministic logarithmic space.
• Let CC denote the class of problems that are NC many-one reducible to CCVP.
Algorithms for Hierarchical Clustering
Sequential Algorithms – Bottom-Up
• Input: set of points, distance function, bound B, and desired number of clusters k
• Output: set of clusters
• Pair up all points starting with the two closest ones, then the next remaining two closest ones, and so on, until all are paired.
• Next, repeatedly merge the sets of points X and Y minimizing $d_{\min}(X, Y)$ over all remaining sets, until only k sets remain.
Algorithms for Hierarchical Clustering
Sequential Algorithms – Bottom-Up (cont.)
• Assumed that the number of input points is even.
• There are no restrictions placed on the distance function.
• In the first phase of the algorithm, points are clustered whose distance is less than or equal to B.
• Operates in polynomial time.
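The two phases of the bottom-up algorithm can be sketched as follows. This is a minimal sketch, not the authors' implementation, and it rests on assumptions the slides do not state: ties are broken by point index, points with no available partner within distance B stay as singleton clusters, and d_min(X, Y) is taken to be the smallest distance between a point of X and a point of Y. The names `first_phase` and `second_phase` are hypothetical.

```python
from itertools import combinations


def first_phase(points, dist, B):
    """Greedily pair the closest remaining points whose distance is <= B."""
    pairs = sorted(
        combinations(range(len(points)), 2),
        key=lambda p: (dist(points[p[0]], points[p[1]]), p),  # ties by index (assumption)
    )
    used, clusters = set(), []
    for i, j in pairs:
        if dist(points[i], points[j]) > B:
            break  # every later pair is even farther apart
        if i not in used and j not in used:
            used.update((i, j))
            clusters.append({i, j})
    # Assumption: points left without a partner remain singleton clusters.
    clusters.extend({i} for i in range(len(points)) if i not in used)
    return clusters


def second_phase(clusters, points, dist, k):
    """Merge the two clusters minimizing d_min until only k clusters remain."""
    clusters = [set(c) for c in clusters]

    def d_min(X, Y):
        return min(dist(points[a], points[b]) for a in X for b in Y)

    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: d_min(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[a] |= clusters[b]  # a < b, so deleting b keeps index a valid
        del clusters[b]
    return clusters


if __name__ == "__main__":
    pts = [0, 1, 5, 6, 20, 21]
    d = lambda a, b: abs(a - b)
    paired = first_phase(pts, d, B=2)
    print(second_phase(paired, pts, d, k=2))
```

Walking the pairs in sorted order and skipping any pair with an already-used endpoint is the same as repeatedly picking the closest pair among the points that are still unpaired.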
Algorithms for Hierarchical Clustering
Sequential Algorithms – Top-Down
• Function v(G) takes a graph as its argument and returns a set that consists of the vertices from G.
• Input: set of points, a distance function, and the desired number of clusters k
• Output: set of clusters
• All points start in the same cluster.
Algorithms for Hierarchical Clustering
• Compute a minimum-cost spanning tree.
• Form clusters by repeatedly removing the highest-cost edge from what remains of a minimum-cost spanning tree of the graph corresponding to the initial set of points with respect to the distance function, until exactly k sets have been formed.
• Runs in polynomial time.
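The top-down procedure above can be sketched sequentially as follows. The slides do not say how the minimum-cost spanning tree is computed; this sketch uses Kruskal's algorithm with a union-find structure as one possible choice, breaks ties by point index, and assumes 1 ≤ k ≤ n. The function name `top_down_clusters` is hypothetical.

```python
from itertools import combinations


def top_down_clusters(points, dist, k):
    """Cluster by deleting the k - 1 highest-cost edges of a spanning tree."""
    n = len(points)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Kruskal's algorithm on the complete graph over the points (assumed choice).
    edges = sorted(
        (dist(points[i], points[j]), i, j) for i, j in combinations(range(n), 2)
    )
    mst = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            mst.append((w, i, j))

    # The tree has n - 1 edges; dropping the k - 1 costliest leaves n - k.
    kept = sorted(mst)[: n - k]

    # Rebuild components from the kept edges only.
    parent = list(range(n))
    for _, i, j in kept:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())


if __name__ == "__main__":
    pts = [0, 1, 5, 6, 20, 21]
    print(top_down_clusters(pts, lambda a, b: abs(a - b), k=3))
```

Removing the highest-cost tree edges one at a time and removing the k - 1 costliest in a single step yield the same components, which is why the loop over removals collapses into one slice here.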
Algorithms for Hierarchical Clustering
• Top-Down and Bottom-Up have different parallel complexities, unless CC equals NC.
• Prove that the exact same clusters as produced by the Sequential (Top-Down) Hierarchical Clustering Algorithm can be computed in NC.
• A natural decision problem based on the Sequential (Bottom-Up) Hierarchical Clustering Algorithm is CC-complete.
Algorithms for Hierarchical Clustering
• Since a CC-complete problem is very unlikely to have an NC algorithm and a problem with an NC algorithm is very unlikely to be CC-complete, the parallel complexities of these two sequential algorithms are different.
• For a fast parallel algorithm for hierarchical clustering, the algorithm should be based on the Top-Down approach.
Algorithms for Hierarchical Clustering
• Theorem: Let n denote the number of points to be clustered. The Parallel (Top-Down) Hierarchical Clustering Algorithm can be implemented in $O(\log n)$ time using $n^2$ processors on the CREW PRAM.
• This algorithm is an NC algorithm, which means that the clusters can be computed very fast in parallel.
• Any reasonable decision problem based on this algorithm will be in NC.
Complexity of Hierarchical Clustering
Hierarchical Clustering Problem (HCP)
• Given: A set S of n points in $\mathbb{R}^d$, a distance function $d_S : S \times S \to \mathbb{N}$, the number of clusters $k \le n/2$ with $k \in \mathbb{N}$, a distance bound B, and two points $x, y \in S$.
• Problem: Are x and y with $d_S(x, y) \le B$ in the same cluster C after the first phase of the Sequential (Bottom-Up) Hierarchical Clustering Algorithm?
Complexity of Hierarchical Clustering
• No restrictions are placed on the properties the distance function must satisfy, but the distances themselves must be natural numbers.
• This version of the problem easily reduces to the version where the weights come from $\mathbb{R}^+$.
• Not concerned with the distance between a point and itself; k is the number of clusters to be formed.
• x and y are required to be no further apart than the distance bound B.
Complexity of Hierarchical Clustering
Lexicographically First Maximal Matching Problem (LFMMP)
• Given: An undirected graph G = (V, E) with an ordering on its edges plus a distinguished edge $e \in E$.
• Problem: Is e in the lexicographically first maximal matching of G?
• A matching is maximal if it cannot be extended.
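The lexicographically first maximal matching is the one produced by scanning the edges in their given order and taking each edge whose endpoints are both still unmatched. A minimal sequential sketch, assuming the graph is supplied as an ordered list of vertex pairs (the function names are hypothetical):

```python
def lex_first_maximal_matching(edges):
    """edges: list of (u, v) pairs listed in the given edge ordering."""
    matched, matching = set(), []
    for u, v in edges:
        if u not in matched and v not in matched:
            matched.update((u, v))
            matching.append((u, v))
    return matching


def lfmmp(edges, e):
    """Decide whether the distinguished edge e is in the matching."""
    return e in lex_first_maximal_matching(edges)


if __name__ == "__main__":
    path = [(1, 2), (2, 3), (3, 4)]
    print(lex_first_maximal_matching(path))
    print(lfmmp(path, (2, 3)))
```

On the path 1-2-3-4 with edges ordered left to right, the first edge is taken, the second is blocked by vertex 2, and the third is taken.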
Complexity of Hierarchical Clustering
• LFMMP is CC-complete [Cook 1982, Mayr and Subramanian 1992].
• Theorem: The Hierarchical Clustering Problem is NC many-one reducible to the Lexicographically First Maximal Matching Problem, that is, HCP $\le_m^{NC}$ LFMMP.
• HCP is in CC.
Complexity of Hierarchical Clustering
• Theorem: The Lexicographically First Maximal Matching Problem is NC many-one reducible to the Hierarchical Clustering Problem, that is, LFMMP $\le_m^{NC}$ HCP.
• Proof Sketch: Let $G = (V = \{1,\dots,n\}, E)$, $\phi : E \to \{1,\dots,|E|\}$ be an ordering on E, and $e = \{u,v\} \in E$ be an instance of the LFMMP. Construct an instance of HCP: a set S of n points $p_1,\dots,p_n$ in $\mathbb{R}^d$, a distance function $d_S : S \times S \to \mathbb{N}$, a number of clusters $k \le n/2$ with $k \in \mathbb{N}$, a bound B, and $x, y \in S$.
Complexity of Hierarchical Clustering
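• Proof (cont.): Let $S = \{1,\dots,n,n+1,\dots,2n\}$. Let $V' = S - V$. Define the distance function between each pair of points in S as follows:
$$
d_S(x,y) =
\begin{cases}
\phi(\{x,y\}) & \text{if } \{x,y\} \in E \\
2|E| & \text{if } x \in V \text{ and } y \in V' \text{ or vice versa} \\
3|E| & \text{if } x \in V',\ y \in V', \text{ and } x \neq y \\
4|E| & \text{if } x \le n,\ y \le n,\ x \neq y, \text{ and } \{x,y\} \notin E
\end{cases}
$$
• Let B = |E|, k = n, and take u and v as our two points x and y.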
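The construction in this proof sketch can be written out as a small instance builder. It is only a sketch: the function name `lfmmp_to_hcp` and the tuple it returns are illustrative choices, the edge ordering is taken to be the position of each edge in the input list, and distances are defined only for distinct points, matching the remark that the distance from a point to itself is not of concern.

```python
def lfmmp_to_hcp(n, edges, e):
    """Build an HCP instance from an LFMMP instance.

    n: number of vertices, labelled 1..n.
    edges: list of (u, v) pairs in the given order (rank 1 is first).
    e: the distinguished edge (u, v).
    Returns (S, d_S, k, B, x, y).
    """
    m = len(edges)
    rank = {frozenset(edge): r for r, edge in enumerate(edges, start=1)}

    def d_S(x, y):
        # Defined for x != y only.
        pair = frozenset((x, y))
        if pair in rank:
            return rank[pair]       # edge of G: its position in the ordering
        x_in_V, y_in_V = x <= n, y <= n
        if x_in_V != y_in_V:
            return 2 * m            # one point in V, the other in V'
        if not x_in_V:
            return 3 * m            # both points in V'
        return 4 * m                # both in V but not joined by an edge

    S = list(range(1, 2 * n + 1))
    u, v = e
    return S, d_S, n, m, u, v


if __name__ == "__main__":
    S, d_S, k, B, x, y = lfmmp_to_hcp(4, [(1, 2), (2, 3), (3, 4)], (2, 3))
    print(k, B, d_S(1, 2), d_S(1, 3), d_S(1, 5), d_S(5, 6))
```

Only pairs that are edges of G receive a distance of at most B = |E|, so the first phase of the bottom-up algorithm can pair only along edges, visiting them in increasing order of their rank.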
Complexity of Hierarchical Clustering
• Theorem: The Hierarchical Clustering Problem is CC-complete.
• This expands the list of CC-complete problems and adds the first clustering/data mining problem to the class.
CC-Complete Problems
Comparator Circuit Value Problem (CCVP)
• Given: An encoding of a Boolean circuit composed of comparator gates, inputs x1,…,xn, and a designated output y.
• Problem: Is output y of the circuit TRUE on input x1,…,xn?
• References: [Cook 1982, Mayr and Subramanian 1992]
CC-Complete Problems
Lexicographically First Maximal Matching Problem (LFMMP)
• Given: An undirected graph G = (V, E) with an ordering on its edges plus a distinguished edge $e \in E$.
• Problem: Is e in the lexicographically first maximal matching of G?
• References: [Cook 1982, Mayr and Subramanian 1992]
• Remarks: Resembles the Lexicographically First Maximal Independent Set Problem, which is P-complete.
CC-Complete Problems
Stable Marriage Problem (SMP)
• Given: A set of n men and a set of n women. For each person, a ranking of the opposite sex according to their preference for a marriage partner.
• Problem: Does the given instance of the problem have a set of marriages that is stable? The set is stable if there is no unmatched pair {m, w} such that both m and w prefer each other to their current partners.
CC-Complete Problems
Stable Marriage Problem (SMP)
• References: [Mayr and Subramanian 1992, Subramanian 1989]
• Remarks: If the preference lists are complete, there is always a solution. Several variations of the SMP are also known to be equivalent to the CCVP. The Male-Optimal Stable Marriage Problem finds a matching in which no man could do any better in a stable marriage.
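The stability condition and the male-optimal matching mentioned above can be illustrated with a short sketch. The slides name no algorithm; the sketch uses the standard men-proposing Gale–Shapley procedure, which is known to return the male-optimal stable matching when preference lists are complete. The dictionary-based input format and the function names are assumptions made for illustration.

```python
def gale_shapley(men_prefs, women_prefs):
    """Men-proposing Gale-Shapley; lists are complete, best partner first."""
    rank = {
        w: {m: r for r, m in enumerate(prefs)} for w, prefs in women_prefs.items()
    }
    next_choice = {m: 0 for m in men_prefs}  # next woman each man will propose to
    engaged = {}                             # woman -> current partner
    free = list(men_prefs)
    while free:
        m = free.pop()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        if w not in engaged:
            engaged[w] = m
        elif rank[w][m] < rank[w][engaged[w]]:
            free.append(engaged[w])          # w trades up; old partner is free again
            engaged[w] = m
        else:
            free.append(m)                   # w rejects m
    return {m: w for w, m in engaged.items()}


def is_stable(matching, men_prefs, women_prefs):
    """matching: man -> woman. True if no pair prefers each other to their partners."""
    husband = {w: m for m, w in matching.items()}
    for m, prefs in men_prefs.items():
        for w in prefs:
            if w == matching[m]:
                break                        # later women are less preferred by m
            wp = women_prefs[w]
            if wp.index(m) < wp.index(husband[w]):
                return False                 # m and w both prefer each other
    return True


if __name__ == "__main__":
    men = {"a": ["x", "y"], "b": ["x", "y"]}
    women = {"x": ["b", "a"], "y": ["a", "b"]}
    match = gale_shapley(men, women)
    print(match, is_stable(match, men, women))
```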
CC-Complete Problems
Stable Marriage Stable Pair Problem (SMSPP)
• Given: A set of n men and n women, for each person a ranking of the opposite sex according to their preference for a marriage partner, and a designated couple Alice and Bob.
• Problem: Are Alice and Bob a stable pair for the given instance of the problem? That is, is it the case that Alice and Bob are married to each other in some stable marriage?
• References: [Mayr and Subramanian 1992, Subramanian 1989]
CC-Complete Problems
Stable Marriage Minimum Regret Problem (SMMRP)
• Given: A set of n men and n women, for each person a ranking of the opposite sex according to their preference for a marriage partner, and a natural number k, 1 ≤ k ≤ n.
• Problem: Is there a stable marriage in which every person has regret at most k? The regret of a person in a stable marriage is the position of her mate on her preference list.
CC-Complete Problems
Stable Marriage Minimum Regret Problem (SMMRP)
• References: [Mayr and Subramanian 1992, Subramanian 1989]
• Remarks: The goal in this problem is to minimize the maximum regret of any person.
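The regret measure is easy to compute for a given marriage. The sketch below only evaluates the maximum regret of one matching; it does not search over stable marriages. It assumes positions are counted from 1, in line with the bound 1 ≤ k ≤ n, and that men and women have distinct names. The function name `max_regret` is hypothetical.

```python
def max_regret(matching, men_prefs, women_prefs):
    """matching: man -> woman. Returns the largest regret of any person."""
    partner = dict(matching)
    partner.update({w: m for m, w in matching.items()})
    prefs = {**men_prefs, **women_prefs}
    # Regret of p = 1-based position of p's partner on p's preference list.
    return max(prefs[p].index(partner[p]) + 1 for p in partner)


if __name__ == "__main__":
    men = {"a": ["x", "y"], "b": ["x", "y"]}
    women = {"x": ["b", "a"], "y": ["a", "b"]}
    print(max_regret({"a": "y", "b": "x"}, men, women))
```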
CC-Complete Problems
Telephone Connection Problem (TCP)
• Given: A telephone line with a fixed channel capacity k, a natural number l, and a sequence of calls (s1, f1),…,(sn, fn), where si (fi) denotes the starting (respectively, finishing) time of the i-th call. The i-th call can be serviced at time si if the number of calls being served at that time is less than k. If the call cannot be served, it is discarded. When a call is completed, the channel is freed up.
CC-Complete Problems
Telephone Connection Problem (TCP)
• Problem: Is the l-th call serviced?
• References: [Ramachandran and Wang 1991]
• Remarks: There is an $O(\min(\sqrt{n}, k) \log n)$-time EREW-PRAM algorithm that uses n processors for solving the TCP.
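A direct sequential simulation makes the TCP definition concrete. This is a sketch under assumptions the slides leave open: calls are handled in order of start time with ties broken by index, and a call that finishes exactly at time s has already freed its channel for a call starting at s. It is a plain simulation, not the EREW-PRAM algorithm cited in the remark, and the function names are hypothetical.

```python
import heapq


def serviced_calls(k, calls):
    """Return a list of booleans: whether each call (s_i, f_i) is serviced."""
    active = []                      # min-heap of finishing times of calls in service
    result = [False] * len(calls)
    order = sorted(range(len(calls)), key=lambda i: (calls[i][0], i))
    for i in order:
        s, f = calls[i]
        while active and active[0] <= s:
            heapq.heappop(active)    # completed calls free their channels
        if len(active) < k:
            heapq.heappush(active, f)
            result[i] = True         # otherwise the call is discarded
    return result


def tcp(k, l, calls):
    """Decide whether the l-th call (1-based) is serviced."""
    return serviced_calls(k, calls)[l - 1]


if __name__ == "__main__":
    calls = [(0, 5), (1, 2), (5, 6)]
    print(serviced_calls(1, calls))
    print(tcp(1, 2, calls))
```

With capacity 1, the second call arrives while the first is still in service and is discarded, while the third starts as the first finishes and is serviced.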
CC-Complete Problems
Internal Diffusion Limited Aggregation Prediction Problem (IDLAPP)
• Given: A time T and a list of moves (t,i,s), one for each time 0 ≤ t ≤ T, indicating that at time t particle i, if still active, will visit site s, plus a designated site d and a designated particle p. A particle is active if it is still moving within the cluster, that is, the particle has not yet stuck to the cluster because all of the sites that it has visited so far were occupied already.
CC-Complete Problems
Internal Diffusion Limited Aggregation Prediction Problem (IDLAPP)
• Problem: Is site d occupied and is particle p active at time T?
• References: [Moore and Machta 2000]
CC-Complete Problems
Internal Diffusion Limited Aggregation Prediction Square Lattice Problem
• Given: A time T and a list of moves (t,i,s) on a square lattice, one for each time 0 ≤ t ≤ T, indicating that at time t particle i, if still active, will visit site s, plus a designated site d and a designated particle p.
• Problem: Is site d on the square lattice occupied and is particle p active at time T?
• References: [Moore and Machta 2000]
CC-Complete Problems
Hierarchical Clustering Problem (HCP)
• Given: A set S of n points in $\mathbb{R}^d$, a distance function $d_S : S \times S \to \mathbb{N}$, the number of clusters $k \le n/2$ with $k \in \mathbb{N}$, a distance bound B, and two points $x, y \in S$.
• Problem: Are x and y with $d_S(x, y) \le B$ in the same cluster C after the first phase of the Sequential (Bottom-Up) Hierarchical Clustering Algorithm?
• Reference: [This work 2006]
Conclusions
• A natural decision problem based on bottom-up hierarchical clustering is CC-complete.
• Top-down hierarchical clustering is in NC.
• Brings the number of known CC-complete problems to ten, and shows that the HCP is unlikely to have an NC algorithm.
• Fast parallel algorithms for hierarchical clustering should be based on a top-down approach.
Open Problems
• Is Euclidean HCP CC-complete? (It is in CC.)
• Determine the complexity of the second phase of the Sequential (Bottom-Up) Hierarchical Clustering Algorithm.
• Add new problems to the class of CC-complete problems.
References
• [Blumenthal 1953] Theory and Applications of Distance Geometry. Oxford University Press.
• [Boley 1998] Principal direction divisive partitioning. Data Mining and Knowledge Discovery, 2(4):325–344.
• [Chong, Han, and Lam 2001] Concurrent threads and optimal parallel minimum spanning tree algorithms. Journal of the ACM, 48(2):297–323.
• [Cole 1988] Parallel merge sort. SIAM Journal on Computing, 17(4):770–785.
• [Cook 1985] A taxonomy of problems with fast parallel algorithms. Information and Control, 64(1–3):2–22.
• [Dash, Petrutiu, and Scheuermann 2004] Efficient parallel hierarchical clustering. Lecture Notes in Computer Science, 3149:363–371.
• [Feder 1992] A new fixed point approach to stable networks and stable marriages. Journal of Computer and System Sciences, 45(2):233–284.
• [Gibbons 1985] Algorithmic Graph Theory. Cambridge University Press.
• [Greenlaw 1992] A model classifying algorithms as inherently sequential with applications to graph searching. Information and Computation, 97(2):133–149.
References
• [Greenlaw, Hoover, and Ruzzo 1995] Limits to Parallel Computation: P-Completeness Theory. Oxford University Press.
• [Guha, Rastogi, and Shim 1998] Cure: An efficient clustering algorithm for large databases. In ACM SIGMOD, pages 378–385, Seattle, WA. Association for Computing Machinery.
• [Jain and Dubes 1988] Algorithms for Clustering Data. Prentice-Hall.
• [Karypis, Han, and Kumar 1999] CHAMELEON: A hierarchical clustering algorithm using dynamic modeling. IEEE Computer, 32(8):68–75.
• [Kaufman and Rousseeuw 1990] Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley and Sons.
• [Li 1990] Parallel algorithms for hierarchical clustering and clustering validity. IEEE Trans. Pattern Analysis and Machine Intelligence, 12(11):1088–1092.
• [Li and Fang 1989] Parallel clustering algorithms. Parallel Computing, 11(3):275–290.
• [Mayr and Subramanian 1992] The complexity of circuit value and network stability. Journal of Computer and System Sciences, 44(2):302–323.
References
• [Moore and Machta 2000] Internal diffusion-limited aggregation: Parallel algorithms and complexity. Journal of Statistical Physics, 99(3/4):661–690.
• [Olson 1995] Parallel algorithms for hierarchical clustering. Parallel Computing, 21(8):1313–1325.
• [Pólya, Tarjan, and Woods 1983] Notes on Introductory Combinatorics. Birkhäuser, Boston.
• [Rajasekaran 2005] Efficient parallel hierarchical clustering algorithms. IEEE Transactions on Parallel and Distributed Systems, 16(6):497–502.
• [Ramachandran and Wang 1991] Parallel algorithms and complexity results for telephone link simulation. In Proceedings of the Third IEEE Symposium on Parallel and Distributed Processing, pages 378–375, Dallas, TX, December. IEEE.
• [Reif (ed) 1993] Synthesis of Parallel Algorithms. Morgan Kaufmann.
• [Sairam, Vitter, and Tamassia 1993] A complexity theoretic approach to incremental computation. In Finkel, Enjalbert, and Wagner, editors, STACS 93: 10th Annual Symposium on Theoretical Aspects of Computer Science, volume 665 of Lecture Notes in Computer Science, pages 640–649, Würzburg, Germany, February. Springer-Verlag.
References
• [Steinbach, Karypis, and Kumar 2000] A comparison of document clustering techniques. In 6th ACM SIGKDD World Text Mining Conference, Boston, MA. Association for Computing Machinery.
• [Subramanian 1989] A new approach to stable matching problems. Technical Report STAN-CS-90-1275. Stanford University, Department of Computer Science.
• [Subramanian 1990] The Computational Complexity of the Circuit Value and Network Stability Problems. PhD thesis, Stanford University. Department of Computer Science Technical Report, STAN-CS-90-1311.
• [Tsai, Horng, Lee, Tsai, and Kao 1997] Parallel hierarchical clustering algorithms on processor arrays with a reconfigurable bus system. Pattern Recognition, 30(5):801–815.
• [Wu, Horng, and Tsai 2000] Efficient parallel algorithms for hierarchical clustering on arrays with reconfigurable optical buses. Journal of Parallel and Distributed Computing, 60(9):1137–1153.
Acknowledgments
• Computer Science Department at Chiang Mai University, Thailand
• Fulbright Commissions of Thailand and the United States
• Jim Hoover and Larry Ruzzo for material from [Greenlaw, Hoover, and Ruzzo 1995]