function [ idx ] = voicingByClustering( x, fs, noOfFrames, frameLength )
% attempt to classify the voiced/unvoiced frames using k-means
% clustering with the short term energy and spectral centroid
% as feature vectors
% returns an array containing voicing decision for frames
% Useful only for speech frames
% Window length and step (in seconds):
win = frameLength/fs;
step = win;

% calculate the short term energy
Eor = ShortTimeEnergy(x, win*fs, step*fs );
% calculate the spectral centroid
Cor = SpectralCentroid(x, win*fs, step*fs, fs );

% dataFileName = '../../../../../Results/VUVgrouping.txt';
% dataFileID = fopen( dataFileName, 'w' );

noOfClusters = 2; %voiced, unvoiced
data = [Eor Cor];
idArray = zeros(1, length(Eor));

% myColours = ['r.'; 'm.'; 'c.'; 'w.'; 'g.'; 'y.'; 'b.'];

[idx, ctrs] = kmeans( data, noOfClusters, 'Replicates', 100, ...
    'start', 'sample', 'Distance', 'cityblock');

% we don't know which group will be classed as voiced
% or unvoiced.
% assume that the number of voiced frames is more than unvoiced

noOfAFrames = length( find( idx == 1 ));
noOfBFrames = length( find( idx == 2 ));
if( noOfAFrames < noOfBFrames )
    voicedIdx = 2;
    unvoicedIdx = 1;
else
    voicedIdx = 1;
    unvoicedIdx = 2;
end


% now re-number the idx array so all voiced frames = 1 and
% unvoiced = 2

voicedPos = find( idx == voicedIdx );
unvoicedPos = find( idx == unvoicedIdx );

%replace the idx
idx( voicedPos ) = 1;
idx( unvoicedPos ) = 2;

end