{"id":366,"date":"2013-12-06T15:11:23","date_gmt":"2013-12-06T20:11:23","guid":{"rendered":"http:\/\/homepages.uc.edu\/~yaozo\/wordpress\/?p=366"},"modified":"2013-12-06T15:11:23","modified_gmt":"2013-12-06T20:11:23","slug":"hierarchical-clustering-algorithms","status":"publish","type":"post","link":"https:\/\/zhuoyao.net\/index.php\/2013\/12\/06\/hierarchical-clustering-algorithms\/","title":{"rendered":"Hierarchical Clustering Algorithms"},"content":{"rendered":"<p align=\"justify\"><span style=\"font-family: Arial, Helvetica, sans-serif; font-size: xx-small;\"><em>How They Work<\/em><\/span><span style=\"font-family: 'Times New Roman', Times, serif;\"><br \/>\nGiven a set of N items to be clustered, and an N*N distance (or similarity) matrix, the basic process of hierarchical clustering (defined by\u00a0<a href=\"http:\/\/home.deib.polimi.it\/matteucc\/Clustering\/tutorial_html\/hierarchical.html#johnson\">S.C. Johnson in 1967<\/a>) is this:<\/span><\/p>\n<ol>\n<li><span style=\"font-family: 'Times New Roman', Times, serif;\">Start by assigning each item to a cluster, so that if you have N items, you now have N clusters, each containing just one item. Let the distances (similarities) between the clusters be the same as the distances (similarities) between the items they contain.<\/span><\/li>\n<li><span style=\"font-family: 'Times New Roman', Times, serif;\">Find the closest (most similar) pair of clusters and merge them into a single cluster, so that now you have one fewer cluster.<\/span><\/li>\n<li><span style=\"font-family: 'Times New Roman', Times, serif;\">Compute distances (similarities) between the new cluster and each of the old clusters.<\/span><\/li>\n<li><span style=\"font-family: 'Times New Roman', Times, serif;\">Repeat steps 2 and 3 until all items are clustered into a single cluster of size N. 
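Taken together, the four steps above amount to a short program. Here is a minimal Python sketch (our own illustration, not part of the original post; the `linkage` argument selects the distance rule used in step 3):

```python
# Naive agglomerative hierarchical clustering over an N*N distance matrix,
# following the four steps above. Roughly O(N^3); for illustration only.

def agglomerate(dist, linkage="single"):
    """Return the merge history as (cluster_a, cluster_b, level) tuples."""
    agg = min if linkage == "single" else max   # step-3 rule
    clusters = [(i,) for i in range(len(dist))]  # step 1: N singleton clusters
    merges = []

    def cdist(a, b):
        # Cluster-to-cluster distance from member-to-member distances.
        return agg(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:  # step 4: repeat until one cluster remains
        # Step 2: find the closest (least dissimilar) pair of clusters.
        level, p, q = min((cdist(clusters[p], clusters[q]), p, q)
                          for p in range(len(clusters))
                          for q in range(p + 1, len(clusters)))
        merges.append((clusters[p], clusters[q], level))
        merged = clusters[p] + clusters[q]
        # Step 3 is implicit: cdist is recomputed on the next iteration.
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return merges
```

With the Italian-cities distance matrix used later in this post, this sketch reproduces the merge levels 138, 219, 255, 268, 295.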
(*)<\/span><\/li>\n<\/ol>\n<p align=\"justify\"><span style=\"font-family: 'Times New Roman', Times, serif;\">Step 3 can be done in different ways, which is what distinguishes\u00a0<em>single-linkage<\/em>\u00a0from\u00a0<em>complete-linkage<\/em>\u00a0and\u00a0<em>average-linkage<\/em>\u00a0clustering.<br \/>\nIn\u00a0<em>single-linkage<\/em>\u00a0clustering (also called the\u00a0<em>connectedness<\/em>\u00a0or\u00a0<em>minimum<\/em>\u00a0method), we consider the distance between one cluster and another cluster to be equal to the\u00a0<span style=\"text-decoration: underline;\">shortest<\/span>\u00a0distance from any member of one cluster to any member of the other cluster. If the data consist of similarities, we consider the similarity between one cluster and another cluster to be equal to the\u00a0<span style=\"text-decoration: underline;\">greatest<\/span>\u00a0similarity from any member of one cluster to any member of the other cluster.<br \/>\nIn\u00a0<em>complete-linkage<\/em>\u00a0clustering (also called the\u00a0<em>diameter<\/em>\u00a0or\u00a0<em>maximum<\/em>\u00a0method), we consider the distance between one cluster and another cluster to be equal to the\u00a0<span style=\"text-decoration: underline;\">greatest<\/span>\u00a0distance from any member of one cluster to any member of the other cluster.<br \/>\nIn\u00a0<em>average-linkage<\/em>\u00a0clustering, we consider the distance between one cluster and another cluster to be equal to the\u00a0<span style=\"text-decoration: underline;\">average<\/span>\u00a0distance from any member of one cluster to any member of the other cluster.<br \/>\nA variation on average-link clustering is the UCLUS method of\u00a0<a href=\"http:\/\/home.deib.polimi.it\/matteucc\/Clustering\/tutorial_html\/hierarchical.html#dandrade\">R. 
D&#8217;Andrade (1978)<\/a>, which uses the\u00a0<span style=\"text-decoration: underline;\">median<\/span>\u00a0distance; the median is much more robust to outliers than the average.<\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: 'Times New Roman', Times, serif;\">This kind of hierarchical clustering is called\u00a0<em>agglomerative<\/em>\u00a0because it merges clusters iteratively. There is also a\u00a0<em>divisive<\/em>\u00a0hierarchical clustering which does the reverse by starting with all objects in one cluster and subdividing them into smaller pieces. Divisive methods are not generally available, and have rarely been applied.<\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: 'Times New Roman', Times, serif;\">(*) Of course there is no point in having all the N items grouped in a single cluster but, once you have the complete hierarchical tree, if you want k clusters you just have to cut the k-1 longest links.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p align=\"justify\"><span style=\"font-family: Arial, Helvetica, sans-serif; font-size: xx-small;\"><em>Single-Linkage Clustering: The Algorithm<\/em><\/span><span style=\"font-family: 'Times New Roman', Times, serif;\"><br \/>\nLet\u2019s now take a deeper look at how Johnson\u2019s algorithm works in the case of single-linkage clustering.<br \/>\nThe algorithm is an agglomerative scheme that erases rows and columns in the proximity matrix as old clusters are merged into new ones.<\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: 'Times New Roman', Times, serif;\">The N*N proximity matrix is D = [d(i,j)]. The clusterings are assigned sequence numbers 0, 1, &#8230;, (N-1) and L(k) is the level of the kth clustering. 
A cluster with sequence number m is denoted (m) and the proximity between clusters (r) and (s) is denoted d[(r),(s)].<\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: 'Times New Roman', Times, serif;\">The algorithm is composed of the following steps:<\/span><\/p>\n<table width=\"75%\" border=\"1\" align=\"center\">\n<tbody>\n<tr>\n<td>\n<ol>\n<li><em><span style=\"font-family: 'Times New Roman', Times, serif;\">Begin with the disjoint clustering having level L(0) = 0 and sequence number m = 0.<br \/>\n<\/span><\/em><\/li>\n<li><em><span style=\"font-family: 'Times New Roman', Times, serif;\">Find the least dissimilar pair of clusters in the current clustering, say pair (r), (s), according to<\/span><br \/>\n<span style=\"font-family: 'Times New Roman', Times, serif;\"><br \/>\nd[(r),(s)] = min d[(i),(j)]<\/span><br \/>\n<span style=\"font-family: 'Times New Roman', Times, serif;\"><br \/>\nwhere the minimum is over all pairs of clusters in the current clustering.<br \/>\n<\/span><\/em><\/li>\n<li><em><span style=\"font-family: 'Times New Roman', Times, serif;\">Increment the sequence number: m = m + 1. Merge clusters (r) and (s) into a single cluster to form the next clustering m. Set the level of this clustering to<\/span><br \/>\n<span style=\"font-family: 'Times New Roman', Times, serif;\"><br \/>\nL(m) = d[(r),(s)]<br \/>\n<\/span><\/em><\/li>\n<li><em><span style=\"font-family: 'Times New Roman', Times, serif;\">Update the proximity matrix, D, by deleting the rows and columns corresponding to clusters (r) and (s) and adding a row and column corresponding to the newly formed cluster. The proximity between the new cluster, denoted (r,s), and old cluster (k) is defined in this way:<\/span><br \/>\n<span style=\"font-family: 'Times New Roman', Times, serif;\"><br \/>\nd[(k), (r,s)] = min { d[(k),(r)], d[(k),(s)] }<br \/>\n<\/span><\/em><\/li>\n<li><em><span style=\"font-family: 'Times New Roman', Times, serif;\">If all objects are in one cluster, stop. 
Else, go to step 2.<\/span><\/em><\/li>\n<\/ol>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p align=\"justify\">\n<p align=\"justify\"><span style=\"font-family: Arial, Helvetica, sans-serif; font-size: xx-small;\"><em>An Example<\/em><\/span><span style=\"font-family: 'Times New Roman', Times, serif;\"><br \/>\nLet\u2019s now see a simple example: a hierarchical clustering of distances in kilometers between some Italian cities. The method used is single-linkage.<\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: 'Times New Roman', Times, serif;\"><strong>Input distance matrix<\/strong>\u00a0(L = 0 for all the clusters):<\/span><\/p>\n<table width=\"35%\" border=\"1\" align=\"center\">\n<tbody>\n<tr>\n<td><\/td>\n<td>\n<div align=\"center\"><strong>BA<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>FI<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>MI<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>NA<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>RM<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>TO<\/strong><\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>BA<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<td>\n<div align=\"center\">662<\/div>\n<\/td>\n<td>\n<div align=\"center\">877<\/div>\n<\/td>\n<td>\n<div align=\"center\">255<\/div>\n<\/td>\n<td>\n<div align=\"center\">412<\/div>\n<\/td>\n<td>\n<div align=\"center\">996<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>FI<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">662<\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<td>\n<div align=\"center\">295<\/div>\n<\/td>\n<td>\n<div align=\"center\">468<\/div>\n<\/td>\n<td>\n<div align=\"center\">268<\/div>\n<\/td>\n<td>\n<div align=\"center\">400<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>MI<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">877<\/div>\n<\/td>\n<td>\n<div 
align=\"center\">295<\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<td>\n<div align=\"center\">754<\/div>\n<\/td>\n<td>\n<div align=\"center\">564<\/div>\n<\/td>\n<td>\n<div align=\"center\">138<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>NA<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">255<\/div>\n<\/td>\n<td>\n<div align=\"center\">468<\/div>\n<\/td>\n<td>\n<div align=\"center\">754<\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<td>\n<div align=\"center\">219<\/div>\n<\/td>\n<td>\n<div align=\"center\">869<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>RM<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">412<\/div>\n<\/td>\n<td>\n<div align=\"center\">268<\/div>\n<\/td>\n<td>\n<div align=\"center\">564<\/div>\n<\/td>\n<td>\n<div align=\"center\">219<\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<td>\n<div align=\"center\">669<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>TO<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">996<\/div>\n<\/td>\n<td>\n<div align=\"center\">400<\/div>\n<\/td>\n<td>\n<div align=\"center\">138<\/div>\n<\/td>\n<td>\n<div align=\"center\">869<\/div>\n<\/td>\n<td>\n<div align=\"center\">669<\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p align=\"center\"><img decoding=\"async\" alt=\"\" src=\"http:\/\/home.deib.polimi.it\/matteucc\/Clustering\/tutorial_html\/images\/italia01.gif\" \/><\/p>\n<p align=\"justify\"><span style=\"font-family: 'Times New Roman', Times, serif;\">The nearest pair of cities is MI and TO, at distance 138. These are merged into a single cluster called &#8220;MI\/TO&#8221;. The level of the new cluster is L(MI\/TO) = 138 and the new sequence number is m = 1.<br \/>\n<\/span><span style=\"font-family: 'Times New Roman', Times, serif;\">Then we compute the distance from this new compound object to all other objects. 
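Concretely, each new distance is the smaller of the two old ones; in Python (a small illustration of our own, using rows MI and TO of the matrix above):

```python
# Single-linkage update: the distance from the merged cluster "MI/TO"
# to each remaining city is the smaller of the two old distances.
mi = {"BA": 877, "FI": 295, "NA": 754, "RM": 564}
to = {"BA": 996, "FI": 400, "NA": 869, "RM": 669}

mi_to = {city: min(mi[city], to[city]) for city in mi}
print(mi_to)  # {'BA': 877, 'FI': 295, 'NA': 754, 'RM': 564}
```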
In single link clustering the rule is that the distance from the compound object to another object is equal to the shortest distance from any member of the cluster to the outside object. So the distance from &#8220;MI\/TO&#8221; to RM is chosen to be 564, which is the distance from MI to RM, and so on.<\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: 'Times New Roman', Times, serif;\">After merging MI with TO we obtain the following matrix:<\/span><\/p>\n<p align=\"justify\">\n<table width=\"35%\" border=\"1\" align=\"center\">\n<tbody>\n<tr>\n<td><\/td>\n<td>\n<div align=\"center\"><strong>BA<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>FI<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>MI\/TO<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>NA<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>RM<\/strong><\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>BA<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<td>\n<div align=\"center\">662<\/div>\n<\/td>\n<td>\n<div align=\"center\">877<\/div>\n<\/td>\n<td>\n<div align=\"center\">255<\/div>\n<\/td>\n<td>\n<div align=\"center\">412<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>FI<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">662<\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<td>\n<div align=\"center\">295<\/div>\n<\/td>\n<td>\n<div align=\"center\">468<\/div>\n<\/td>\n<td>\n<div align=\"center\">268<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>MI\/TO<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">877<\/div>\n<\/td>\n<td>\n<div align=\"center\">295<\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<td>\n<div align=\"center\">754<\/div>\n<\/td>\n<td>\n<div align=\"center\">564<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>NA<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">255<\/div>\n<\/td>\n<td>\n<div 
align=\"center\">468<\/div>\n<\/td>\n<td>\n<div align=\"center\">754<\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<td>\n<div align=\"center\">219<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>RM<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">412<\/div>\n<\/td>\n<td>\n<div align=\"center\">268<\/div>\n<\/td>\n<td>\n<div align=\"center\">564<\/div>\n<\/td>\n<td>\n<div align=\"center\">219<\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p align=\"center\"><img decoding=\"async\" alt=\"\" src=\"http:\/\/home.deib.polimi.it\/matteucc\/Clustering\/tutorial_html\/images\/italia02.gif\" \/><\/p>\n<p align=\"justify\"><span style=\"font-family: 'Times New Roman', Times, serif;\">min d(i,j) = d(NA,RM) = 219 =&gt; merge NA and RM into a new cluster called NA\/RM<br \/>\nL(NA\/RM) = 219<br \/>\nm = 2<\/span><\/p>\n<p>&nbsp;<\/p>\n<table width=\"35%\" border=\"1\" align=\"center\">\n<tbody>\n<tr>\n<td><\/td>\n<td>\n<div align=\"center\"><strong>BA<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>FI<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>MI\/TO<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>NA\/RM<\/strong><\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>BA<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<td>\n<div align=\"center\">662<\/div>\n<\/td>\n<td>\n<div align=\"center\">877<\/div>\n<\/td>\n<td>\n<div align=\"center\">255<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>FI<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">662<\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<td>\n<div align=\"center\">295<\/div>\n<\/td>\n<td>\n<div align=\"center\">268<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>MI\/TO<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">877<\/div>\n<\/td>\n<td>\n<div align=\"center\">295<\/div>\n<\/td>\n<td>\n<div 
align=\"center\">0<\/div>\n<\/td>\n<td>\n<div align=\"center\">564<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>NA\/RM<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">255<\/div>\n<\/td>\n<td>\n<div align=\"center\">268<\/div>\n<\/td>\n<td>\n<div align=\"center\">564<\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p align=\"center\"><img decoding=\"async\" alt=\"\" src=\"http:\/\/home.deib.polimi.it\/matteucc\/Clustering\/tutorial_html\/images\/italia03.gif\" \/><\/p>\n<p><span style=\"font-family: 'Times New Roman', Times, serif;\">min d(i,j) = d(BA,NA\/RM) = 255 =&gt; merge BA and NA\/RM into a new cluster called BA\/NA\/RM<br \/>\nL(BA\/NA\/RM) = 255<br \/>\nm = 3<\/span><\/p>\n<p>&nbsp;<\/p>\n<table width=\"35%\" border=\"1\" align=\"center\">\n<tbody>\n<tr>\n<td><\/td>\n<td>\n<div align=\"center\"><strong>BA\/NA\/RM<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>FI<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>MI\/TO<\/strong><\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>BA\/NA\/RM<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<td>\n<div align=\"center\">268<\/div>\n<\/td>\n<td>\n<div align=\"center\">564<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>FI<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">268<\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<td>\n<div align=\"center\">295<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>MI\/TO<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">564<\/div>\n<\/td>\n<td>\n<div align=\"center\">295<\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p align=\"center\"><img decoding=\"async\" alt=\"\" src=\"http:\/\/home.deib.polimi.it\/matteucc\/Clustering\/tutorial_html\/images\/italia04.gif\" \/><\/p>\n<p><span style=\"font-family: 'Times New Roman', Times, serif;\">min 
d(i,j) = d(BA\/NA\/RM,FI) = 268 =&gt; merge BA\/NA\/RM and FI into a new cluster called BA\/FI\/NA\/RM<br \/>\nL(BA\/FI\/NA\/RM) = 268<br \/>\nm = 4<\/span><\/p>\n<p>&nbsp;<\/p>\n<table width=\"35%\" border=\"1\" align=\"center\">\n<tbody>\n<tr>\n<td><\/td>\n<td>\n<div align=\"center\"><strong>BA\/FI\/NA\/RM<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>MI\/TO<\/strong><\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>BA\/FI\/NA\/RM<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<td>\n<div align=\"center\">295<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><strong>MI\/TO<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\">295<\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p align=\"center\"><img decoding=\"async\" alt=\"\" src=\"http:\/\/home.deib.polimi.it\/matteucc\/Clustering\/tutorial_html\/images\/italia05.gif\" \/><\/p>\n<p align=\"justify\"><span style=\"font-family: 'Times New Roman', Times, serif;\">Finally, we merge the last two clusters at level 295.<\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: 'Times New Roman', Times, serif;\">The process is summarized by the following hierarchical tree:<\/span><\/p>\n<p align=\"center\"><img decoding=\"async\" alt=\"\" src=\"http:\/\/home.deib.polimi.it\/matteucc\/Clustering\/tutorial_html\/images\/image057.gif\" \/><\/p>\n<p>&nbsp;<\/p>\n<p align=\"justify\"><em><span style=\"font-family: Arial, Helvetica, sans-serif; font-size: xx-small;\">Problems<\/span><\/em><span style=\"font-family: 'Times New Roman', Times, serif;\"><br \/>\nThe main weaknesses of agglomerative clustering methods are:<\/span><\/p>\n<ul>\n<li><span style=\"font-family: 'Times New Roman', Times, serif;\">they do not scale well: time complexity of at least\u00a0<em>O(n<sup>2<\/sup>)<\/em>, where n is the number of total objects;<\/span><\/li>\n<li><span style=\"font-family: 'Times New Roman', Times, serif;\">they 
can never undo what was done previously.<\/span><\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p align=\"justify\"><span style=\"font-family: Arial, Helvetica, sans-serif; font-size: xx-small;\"><em>Bibliography<\/em><\/span><\/p>\n<div align=\"justify\">\n<ul>\n<li><span style=\"font-family: 'Times New Roman', Times, serif;\"><a name=\"johnson\"><\/a>S. C. Johnson (1967): &#8220;Hierarchical Clustering Schemes&#8221;\u00a0<em>Psychometrika<\/em>, 32:241-254<\/span><\/li>\n<li><span style=\"font-family: 'Times New Roman', Times, serif;\"><a name=\"dandrade\"><\/a>R. D&#8217;Andrade (1978): &#8220;U-Statistic Hierarchical Clustering&#8221;\u00a0<em>Psychometrika<\/em>, 43:59-67<\/span><\/li>\n<li><span style=\"font-family: 'Times New Roman', Times, serif;\">Andrew Moore: \u201cK-means and Hierarchical Clustering &#8211; Tutorial Slides\u201d<br \/>\n<a href=\"http:\/\/www-2.cs.cmu.edu\/~awm\/tutorials\/kmeans.html\">http:\/\/www-2.cs.cmu.edu\/~awm\/tutorials\/kmeans.html<\/a><\/span><\/li>\n<li><span style=\"font-family: 'Times New Roman', Times, serif;\">Osmar R. Za\u00efane: \u201cPrinciples of Knowledge Discovery in Databases &#8211; Chapter 8: Data Clustering\u201d<br \/>\n<a href=\"http:\/\/www.cs.ualberta.ca\/~zaiane\/courses\/cmput690\/slides\/Chapter8\/index.html\">http:\/\/www.cs.ualberta.ca\/~zaiane\/courses\/cmput690\/slides\/Chapter8\/index.html<\/a><\/span><\/li>\n<li><span style=\"font-family: 'Times New Roman', Times, serif;\">Stephen P. 
Borgatti: \u201cHow to explain hierarchical clustering\u201d<br \/>\n<a href=\"http:\/\/www.analytictech.com\/networks\/hiclus.htm\">http:\/\/www.analytictech.com\/networks\/hiclus.htm<\/a><\/span><\/li>\n<li><span style=\"font-family: 'Times New Roman', Times, serif;\">Maria Irene Miranda: \u201cClustering methods and algorithms\u201d<br \/>\n<a href=\"http:\/\/www.cse.iitb.ac.in\/dbms\/Data\/Courses\/CS632\/1999\/clustering\/dbms.html\">http:\/\/www.cse.iitb.ac.in\/dbms\/Data\/Courses\/CS632\/1999\/clustering\/dbms.html<\/a><\/span><\/li>\n<\/ul>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>How They Work Given a set of N items to be clustered, and an N*N distance (or similarity) matrix, the basic process of hierarchical clustering&hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[24],"tags":[],"class_list":["post-366","post","type-post","status-publish","format-standard","hentry","category-statistics"],"_links":{"self":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts\/366","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/comments?post=366"}],"version-history":[{"count":0,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts\/366\/revisions"}],"wp:attachment":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/media?parent=366"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/categories?post=366"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/zhuoyao
.net\/index.php\/wp-json\/wp\/v2\/tags?post=366"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}