sfx-subgrouping: code/aglomCluster.m annotate

annotate code/aglomCluster.m @ 37:d9a9a6b93026 tip

Add README

author	DaveM
date	Sat, 01 Apr 2017 17:03:14 +0100
parents	4af6fc2100e8
children

rev	line source
DaveM@9	1 function linkList = aglomCluster(data, clusterMethod, distanceMetric, numClusters)
DaveM@8	2 %% aglomCluster(data, clusterMethod, distanceMetric, numClusters)
DaveM@8	3 % This function performs agglomerative clustering on a given data set,
DaveM@8	4 % allowing the interpretation of hierarchical data, and plots a
DaveM@8	5 % dendrogram.
DaveM@8	6 %
DaveM@8	7 % data is a matrix in which each row is an observation and each column is
DaveM@8	8 % a feature. clusterMethod is one of:
DaveM@8	9 % * 'average' Unweighted average distance (UPGMA)
DaveM@8	10 % * 'centroid' Centroid distance (UPGMC), appropriate for Euclidean
DaveM@8	11 % distances only
DaveM@8	12 % * 'complete' Furthest distance
DaveM@8	13 % * 'median' Weighted center of mass distance (WPGMC), appropriate
DaveM@8	14 % for Euclidean distances only
DaveM@8	15 % * 'single' Shortest distance
DaveM@8	16 % * 'ward' Inner squared distance (minimum variance algorithm),
DaveM@8	17 % appropriate for Euclidean distances only (default)
DaveM@8	18 % * 'weighted' Weighted average distance (WPGMA)
DaveM@8	19 % distanceMetric is one of:
DaveM@8	20 % * 'euclidean' Euclidean distance (default).
DaveM@8	21 % * 'seuclidean' Standardized Euclidean distance. Each coordinate
DaveM@8	22 % difference between rows in X is scaled by dividing by the
DaveM@8	23 % corresponding element of the standard deviation S=nanstd(X). To
DaveM@8	24 % specify another value for S, use D=pdist(X,'seuclidean',S).
DaveM@8	25 % * 'cityblock' City block metric.
DaveM@8	26 % * 'minkowski' Minkowski distance. The default exponent is 2. To
DaveM@8	27 % specify a different exponent, use D = pdist(X,'minkowski',P), where P
DaveM@8	28 % is a scalar positive value of the exponent.
DaveM@8	29 % * 'chebychev' Chebychev distance (maximum coordinate difference).
DaveM@8	30 % * 'mahalanobis' Mahalanobis distance, using the sample covariance
DaveM@8	31 % of X as computed by nancov. To compute the distance with a different
DaveM@8	32 % covariance, use D = pdist(X,'mahalanobis',C), where the matrix C is
DaveM@8	33 % symmetric and positive definite.
DaveM@8	34 % * 'cosine' One minus the cosine of the included angle between points
DaveM@8	35 % (treated as vectors).
DaveM@8	36 % * 'correlation' One minus the sample correlation between points
DaveM@8	37 % (treated as sequences of values).
DaveM@8	38 % * 'spearman' One minus the sample Spearman's rank correlation between
DaveM@8	39 % observations (treated as sequences of values).
DaveM@8	40 % * 'hamming' Hamming distance, which is the percentage of coordinates
DaveM@8	41 % that differ.
DaveM@8	42 % * 'jaccard' One minus the Jaccard coefficient, which is the
DaveM@8	43 % percentage of nonzero coordinates that differ.
DaveM@8	44 % numClusters is the maximum number of leaf nodes shown in the dendrogram;
DaveM@8	45 % if 0 (default), dendrogram displays the complete tree
DaveM@8	46
DaveM@8	47 if (nargin<2)
DaveM@8	48     clusterMethod = 'ward';
DaveM@8	49 end
DaveM@8	50 if (nargin<3)
DaveM@8	51     distanceMetric = 'euclidean';
DaveM@8	52 end
DaveM@8	53 if (nargin<4)
DaveM@8	54     numClusters = 0;
DaveM@8	55 end
DaveM@8	56
DaveM@8	57 distMap = pdist(data, distanceMetric);
DaveM@8	58 linkList = linkage(distMap, clusterMethod);
DaveM@36	59 [~,T] = dendrogram(linkList,numClusters,'Orientation','left');
DaveM@8	60
DaveM@8	61
DaveM@8	62 end
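
A minimal usage sketch for the function listed above, assuming the Statistics and Machine Learning Toolbox is installed (it provides pdist, linkage, dendrogram and cluster). The synthetic data, the variable names and the follow-up call to cluster are illustrative assumptions and are not part of the repository.

```matlab
% Hypothetical usage sketch; the data below is synthetic.
rng(1);                                   % fix the seed so the sketch is repeatable
data = [randn(10, 3); randn(10, 3) + 5];  % 20 observations, 3 features, two loose groups

% Average linkage on city block distances; passing 0 asks dendrogram
% for the complete tree. The call also draws the dendrogram.
linkList = aglomCluster(data, 'average', 'cityblock', 0);

% linkage returns an (m-1)-by-3 matrix for m observations.
disp(size(linkList));

% Assumption: cut the tree into at most 2 flat clusters with cluster().
labels = cluster(linkList, 'maxclust', 2);
```

Calling aglomCluster(data) alone falls back to the defaults set in the function body: 'ward' linkage, 'euclidean' distance and numClusters = 0.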
