annotate do_correlation.m @ 6:e2337cd691b1 tip

Finished writing the Matlab code to replicate all observations made in the article. Added the article to the repository. Renamed the two main scripts ("1-get_mirex_estimates.rb" and "2-generate_smith2013_ismir.m") to not have dashes (since this was annoying within Matlab). Added new Michael Jackson figure.
author Jordan Smith <jordan.smith@eecs.qmul.ac.uk>
date Wed, 05 Mar 2014 01:02:26 +0000
parents 624231da830b
children
rev   line source
jordan@2 1 function [asig pval a a_] = do_correlation(megacube, songs, metrics, algos, algo_groups, merge_algos, merge_songs, merge_dsets, metric_labels, bonferroni)
jordan@2 2
jordan@2 3 % function [asig pval a a_] = do_correlation(megacube, songs, metrics, algos, algo_groups, merge_algos, merge_songs, merge_dsets, metric_labels, bonferroni)
jordan@2 4 %
jordan@2 5 % Function to compute, analyze and plot a Kendall tau correlation matrix between metrics.
jordan@2 6 % Example usage:
jordan@2 7 % To reproduce the first experiment (Fig. 1a), call:
jordan@2 8 % do_correlation(megacube, lab_measures, sind_manual1, [1:9], -1, 0, 1, -1, s_manual1)
jordan@2 9 %
jordan@2 10 % MEGACUBE is the giant (N songs) x (M metrics) x (L algorithms) matrix of evaluation results.
jordan@2 11 % SONGS, METRICS and ALGOS are the indices into these three dimensions desired.
jordan@2 12 % ALGO_GROUPS indicates groups of algorithms that should be averaged together rather than counted separately.
jordan@2 13 % (this has not yet been implemented)
jordan@2 14 % Set MERGE_ALGOS > 0 in order to compute the median score across algorithms.
jordan@2 15 % Set MERGE_SONGS > 0 in order to compute the median score across songs.
jordan@2 16 % MERGE_DSETS is also not yet implemented.
jordan@2 17 % METRIC_LABELS is a matrix of strings, one for each of the METRICS, for use in plotting.
jordan@2 18 % Set BONFERRONI > 0 in order to apply a Bonferroni correction, with BONFERRONI as the family-wise significance level. (Default value: 0.05.)
jordan@2 19 % Note a few hard-coded decisions, such as:
jordan@2 20 % - uncorrected significance level hard coded as 0.05 (used only when BONFERRONI <= 0).
jordan@2 21 % - in the image, decision that |tau| >= 0.8 is strong, |tau| >= 0.33 is weak, and |tau| < 0.33 is nothing.
jordan@2 22
jordan@2 23 % Defaults and hard coding values:
jordan@2 24 if nargin<10,
jordan@2 25 bonferroni = 0.05;
jordan@2 26 end
jordan@2 27 significant_p = 0.05;
jordan@2 28 maxtau = 0.8;
jordan@2 29 mintau = 0.33;
jordan@2 30
jordan@2 31
jordan@2 32
jordan@2 33 tmpcube = megacube(songs,metrics,algos);
jordan@2 34
jordan@2 35 % if exist('algo_groups'),
jordan@2 36 % for i=1:length(algo_groups),
jordan@2 37 % merge the groups somehow...
jordan@2 38 % end
jordan@2 39 % end
jordan@2 40
jordan@2 41 if merge_algos>0, % If we merge algorithms, take the median score across algorithms.
jordan@2 42 tmpcube = median(tmpcube,3);
jordan@2 43 elseif merge_songs>0, % If we merge songs, take the median score across songs.
jordan@2 44 tmpcube = median(tmpcube,1); % Then, resize the matrix to be 2-d:
jordan@2 45 tmpcube = transpose(reshape(tmpcube,size(tmpcube,2),size(tmpcube,3)));
jordan@2 46 end
jordan@2 47
jordan@2 48 % Compute Kendall tau correlation:
jordan@2 49 [a pval] = corr(tmpcube,'type','Kendall');
jordan@2 50 % Apply bonferroni correction:
jordan@2 51 m = length(a)*(length(a)-1)/2;
jordan@2 52 asig = pval<significant_p;
jordan@2 53 if bonferroni>0,
jordan@2 54 fprintf('Bonferroni applied.\n')
jordan@2 55 asig = (pval*m)<bonferroni; % This is the matrix of values that are significant.
jordan@2 56 end
jordan@2 57 a_ = (abs(a)>=maxtau) + (abs(a)>=mintau);
jordan@2 58 a_ = tril(a_,-1);
jordan@2 59
jordan@2 60 % A contains the correlation values themselves.
jordan@2 61 % ASIG is a binary matrix that states whether the correlation is statistically significant.
jordan@2 62 % A_ is a matrix of 0s, 1s and 2s that says whether a correlation is qualitatively strong (2), qualitatively weak (1), or nada (0). The sign of the correlation is carried by A, not A_.
jordan@2 63 % Sometimes values will be statistically significant, but qualitatively insignificant. We do not want to bother looking at these, so
jordan@2 64 % let us make our pretty picture carefully.
jordan@2 65
jordan@2 66 % The values we display will always be straight from A. The colour we display, to emphasize the strong correlations,
jordan@2 67 % should be the element-wise product of A, ASIG, and A_.
jordan@2 68 % Also:
jordan@2 69 % Iff |tau|>=0.33 (a_>0), include text.
jordan@2 70 % Iff tau is significant (asig=1), include background.
jordan@2 71 % Iff |tau|>=0.8 (a_=2), put in bold.
jordan@2 72 % Iff |tau|>=0.8 AND tau is significant, invert the color of the text (because the background colour will be darker).
jordan@2 73
jordan@2 74 img = a_.*a.*asig;
jordan@2 75 img = img(2:end,1:end-1); % ignore the diagonal
jordan@2 76 clf
jordan@2 77 imagesc(img, [-1 1])
jordan@2 78 for i=1:length(a_),
jordan@2 79 for j=1:length(a_),
jordan@2 80 if a_(i,j)>0,
jordan@2 81 % |tau| is >=0.33 so we definitely write the value. Need to determine font weight and colour.
jordan@2 82 % if |tau|>=.8, put in bold
jordan@2 83 if abs(a_(i,j))>1,
jordan@2 84 fontw = 'bold';
jordan@2 85 else
jordan@2 86 fontw = 'normal';
jordan@2 87 end
jordan@2 88 if abs(a_(i,j))>1 && asig(i,j)==1,
jordan@2 89 textcolor = [1 1 1];
jordan@2 90 else
jordan@2 91 textcolor = [0 0 0];
jordan@2 92 end
jordan@2 93 % h = text(j-.35,i-1,num2str(a(i,j),2),'Color',textcolor);
jordan@2 94 h = text(j,i-1,sprintf('%.2f',a(i,j)),'Color',textcolor,'FontWeight',fontw,'FontSize',12,'HorizontalAlignment','center');
jordan@2 95 set(h,'HorizontalAlignment','center')
jordan@2 96 end
jordan@2 97 end
jordan@2 98 end
jordan@2 99 cmap_el = transpose([linspace(.3,1,50)]);
jordan@2 100 cmap = repmat(cmap_el,1,3);
jordan@2 101 cmap = [cmap; flipud(cmap)];
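jordan@2 102 % Alternatively (this overrides the grey map above): red-white-blue map, red for negative and blue for positive values: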
jordan@2 103 cmap = [ones(size(cmap_el)) cmap_el cmap_el; flipud([cmap_el cmap_el ones(size(cmap_el))])];
jordan@2 104 colormap(cmap);
jordan@2 105
jordan@2 106 set(gca,'YTickLabel',metric_labels(2:end),'YTick',(1:length(a)-1),'FontAngle','italic','FontSize',12)
jordan@2 107 set(gca,'XTickLabel',metric_labels(1:end-1),'XTick',(1:length(a)-1),'FontAngle','italic','FontSize',12)
jordan@2 108 % set(gcf,'Position',[1000,1000,700,300])
jordan@2 109 % set(gca,'XTickLabel',metric_labels(2:2:end),'YTick',(1:length(a)/2))
jordan@2 110
jordan@2 111 % axis([0.5, length(a)-.5, 1.5, length(a)+.5])
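
The Bonferroni and strength-thresholding steps of do_correlation.m can be tried in isolation. The following is a minimal sketch and not part of the repository: the 3x3 tau and p-value matrices are made-up numbers standing in for the output of corr(...,'type','Kendall'), and alpha is a hypothetical name standing in for the BONFERRONI argument.

```matlab
% Hypothetical Kendall tau matrix and p-values for three metrics.
% These numbers are invented for illustration; they are not results from the article.
a    = [ 1.00  0.85 -0.40;
         0.85  1.00  0.10;
        -0.40  0.10  1.00];
pval = [0.000 0.001 0.012;
        0.001 0.000 0.600;
        0.012 0.600 0.000];

alpha  = 0.05;   % family-wise significance level (assumed name for BONFERRONI)
maxtau = 0.8;    % |tau| >= maxtau counts as strong
mintau = 0.33;   % |tau| >= mintau counts as weak

n = size(a,1);
m = n*(n-1)/2;                 % number of distinct metric pairs (3 here)
asig = (pval*m) < alpha;       % Bonferroni: scale each p-value by the number of tests

a_ = (abs(a)>=maxtau) + (abs(a)>=mintau);   % 0 = none, 1 = weak, 2 = strong
a_ = tril(a_,-1);              % keep each pair once and drop the diagonal

img = a_ .* a .* asig;         % signed, strength-weighted value used for the cell colour
disp(img)
```

With these inputs, the pair (2,1) is strong and significant, so img(2,1) = 2 * 0.85 = 1.7, while the pair (3,1) is weak and significant, so img(3,1) = -0.4. Because the function passes the range [-1 1] to imagesc, any strong significant cell lands outside that range and is drawn at the saturated end of the colormap.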