function [sim, dissim, confidence] = sim_from_comparison_naive(comparison, comparison_ids, symmetrical)
%
% [sim, dissim, confidence] = sim_from_comparison_naive(comparison, comparison_ids, symmetrical)
%
% derives symmetric, absolute similarity measurements
% from relative MagnaTagATune comparisons
% naive implementation for first tests of the ITML algorithm
%

% reindex comparison for simpler evaluation
% makro_prepare_comparison

% ---
% analyse the number of comparisons for each pair of songs
% ---
[num_compares] = get_comparison_stats(comparison, comparison_ids);

% ---
% in comparison, the outlying piece is highlighted.
% thus, we naively assume that
% a. the two remaining pieces are more similar to each other.
% b. the outlier is dissimilar to both of the other pieces
% ---
[outsort, outidx] = sort(comparison(:,4:6),2,'ascend');

% ---
% similarity of the two non-outliers a, b
% they are similar if both of them have scores much smaller
% than the outlier c:
% score(a,b) = 1 - (max(a,b)/c)
%
% dissimilarity: clip b is considered more different from clip c than
% a, as clip a seems to share some properties with both songs
% dissim(b,c) = 0.5 + b/(2c)
% ---

sim = sparse(numel(comparison_ids),numel(comparison_ids));
dissim = sparse(numel(comparison_ids),numel(comparison_ids));
for i = 1:size(comparison,1)

    % get the clip ids of the two non-outliers and of the outlier c
    simpair = comparison(i,outidx(i,1:2));
    c = comparison(i,outidx(i,3));

    % we want a triangular similarity matrix:
    % order the pair by clip id and reorder its votes accordingly
    [simpair, simidx] = sort(simpair);
    outsort(i,1:2) = outsort(i,simidx);

    % ---
    % save the distance between the second biggest vote and the max vote.
    % max() is taken explicitly, as the reordering above may have
    % swapped the two smaller votes
    % NOTE: we bias the vote by dividing by the total number of
    % comparisons for the particular pair of clips
    % ---
    sim(simpair(1), simpair(2)) = sim(simpair(1), simpair(2)) + ...
        (1 - max(outsort(i,1:2)) / outsort(i,3)) * (1 / num_compares(simpair(1),simpair(2)));

    dissim(simpair(1:2), c) = 0.5 + (outsort(i,1:2) ./ (2 * outsort(i,3)));
end

% ---
% mirror to make matrix symmetrical
% ---
if nargin == 3 && symmetrical
    sim = sim + sim';
    dissim = dissim + dissim';
end

% ---
% TODO: use number of votes and std or similar to
% rate the confidence for each similarity measurement
% ---
confidence = [];
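
% ---
% Worked example (illustrative sketch only, not taken from the dataset).
% Assumptions: each comparison row is laid out as
%   [id_a id_b id_c votes_a votes_b votes_c]
% with ids already reindexed to 1..numel(comparison_ids), and
% num_compares(1,2) == 1 for the pair in question.
%
%   row = [1 2 3   1 2 5]
%
% The votes sort to [1 2 5], so clip 3 is the outlier c and
% clips 1 and 2 form the similar pair (a,b):
%
%   sim(1,2)    = (1 - max(1,2)/5) * (1/1) = 1 - 0.4     = 0.6
%   dissim(1,3) = 0.5 + 1/(2*5)            = 0.5 + 0.1   = 0.6
%   dissim(2,3) = 0.5 + 2/(2*5)            = 0.5 + 0.2   = 0.7
%
% With symmetrical == true the same values would also appear
% at sim(2,1), dissim(3,1) and dissim(3,2).
% ---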