function linkList = aglomCluster(data, clusterMethod, distanceMetric, numClusters)
%% aglomCluster(data, clusterMethod, distanceMetric, numClusters)
% This function performs agglomerative clustering on a given data set,
% allowing the interpretation of hierarchical data, and plots a
% dendrogram. It returns linkList, the linkage matrix produced by linkage.
%
% data is a matrix in which each row is an observation and each column is
% a feature.
% clusterMethod
%    * 'average'    Unweighted average distance (UPGMA)
%    * 'centroid'   Centroid distance (UPGMC), appropriate for Euclidean
%                   distances only
%    * 'complete'   Furthest distance
%    * 'median'     Weighted center of mass distance (WPGMC), appropriate
%                   for Euclidean distances only
%    * 'single'     Shortest distance
%    * 'ward'       Inner squared distance (minimum variance algorithm),
%                   appropriate for Euclidean distances only (default)
%    * 'weighted'   Weighted average distance (WPGMA)
% distanceMetric
%    * 'euclidean'   Euclidean distance (default).
%    * 'seuclidean'  Standardized Euclidean distance. Each coordinate
%      difference between rows in X is scaled by dividing by the
%      corresponding element of the standard deviation S=nanstd(X). To
%      specify another value for S, use D=pdist(X,'seuclidean',S).
%    * 'cityblock'   City block metric.
%    * 'minkowski'   Minkowski distance. The default exponent is 2. To
%      specify a different exponent, use D = pdist(X,'minkowski',P), where P
%      is a scalar positive value of the exponent.
%    * 'chebychev'   Chebychev distance (maximum coordinate difference).
%    * 'mahalanobis' Mahalanobis distance, using the sample covariance
%      of X as computed by nancov. To compute the distance with a different
%      covariance, use D = pdist(X,'mahalanobis',C), where the matrix C is
%      symmetric and positive definite.
%    * 'cosine'      One minus the cosine of the included angle between points
%      (treated as vectors).
%    * 'correlation' One minus the sample correlation between points
%      (treated as sequences of values).
%    * 'spearman'    One minus the sample Spearman's rank correlation between
%      observations (treated as sequences of values).
%    * 'hamming'     Hamming distance, which is the percentage of coordinates
%      that differ.
%    * 'jaccard'     One minus the Jaccard coefficient, which is the
%      percentage of nonzero coordinates that differ.
% numClusters is the maximum number of leaf nodes (collapsed clusters)
% shown in the dendrogram; if 0 (default), the full tree is drawn with one
% leaf per observation.

% Fill in defaults for any omitted arguments.
if (nargin < 2)
    clusterMethod = 'ward';
end
if (nargin < 3)
    distanceMetric = 'euclidean';
end
if (nargin < 4)
    numClusters = 0;
end

% Pairwise distances between observations (rows of data).
distMap = pdist(data, distanceMetric);
% Build the hierarchical cluster tree from the distance vector.
linkList = linkage(distMap, clusterMethod);
% Plot the tree; the outputs of dendrogram were unused, so none are requested.
dendrogram(linkList, numClusters, 'Orientation', 'left');

end
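
%% Example usage (illustrative sketch, not part of the original file)
% The lines below suggest one way aglomCluster might be called. They are
% left as comments so that this function file stays valid. The matrix X
% and the variable names Z and labels are hypothetical; the example
% assumes the Statistics and Machine Learning Toolbox is available.
%
%   rng(0);                                  % make the random data repeatable
%   X = [randn(10,2); randn(10,2) + 5];      % 20 observations, 2 features
%   Z = aglomCluster(X, 'average', 'cityblock', 0);  % draws the full dendrogram
%   size(Z)                                  % linkage matrix is (m-1)-by-3, here 19-by-3
%   labels = cluster(Z, 'maxclust', 2);      % cut the tree into two flat clusters
%
% Calling cluster on the returned linkage matrix is one possible follow-up
% step when flat cluster labels are needed in addition to the plot.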