annotate notes/transcriptionMultipleTemplates-annotated.m @ 7:f3d14ba3d8e2

Add Smaragdis-Raj PLCA paper
author Chris Cannam
date Wed, 19 Mar 2014 17:32:12 +0000
parents ed9e20b6b165
children 30715e32c5cc
rev   line source
Chris@2 1 function [ph pz sumY] = transcriptionMultipleTemplates(filename,iter,sz,su)
Chris@2 2
Chris@2 3
Chris@2 4 % Load note templates
Chris@2 5 load('noteTemplatesBassoon'); W(:,:,1) = noteTemplatesBassoon;
Chris@2 6 load('noteTemplatesCello'); W(:,:,2) = noteTemplatesCello;
Chris@2 7 load('noteTemplatesClarinet'); W(:,:,3) = noteTemplatesClarinet;
Chris@2 8 load('noteTemplatesFlute'); W(:,:,4) = noteTemplatesFlute;
Chris@2 9 load('noteTemplatesGuitar'); W(:,:,5) = noteTemplatesGuitar;
Chris@2 10 load('noteTemplatesHorn'); W(:,:,6) = noteTemplatesHorn;
Chris@2 11 load('noteTemplatesOboe'); W(:,:,7) = noteTemplatesOboe;
Chris@2 12 load('noteTemplatesTenorSax'); W(:,:,8) = noteTemplatesTenorSax;
Chris@2 13 load('noteTemplatesViolin'); W(:,:,9) = noteTemplatesViolin;
Chris@2 14 load('noteTemplatesSptkBGCl'); W(:,:,10) = noteTemplatesSptkBGCl;
Chris@2 15
Chris@2 16
Chris@2 17 %pitchActivity = [14 16 30 40 20 21 38 24 35 1; 52 61 69 76 56 57 71 55 80 88]';
Chris@2 18 pitchActivity = [16 16 30 40 20 21 38 24 35 16; 52 61 69 73 56 57 71 55 73 73]';
Chris@2 19
Chris@2 20
Chris@5 21 %% this turns W into W0, a 10x88 cell array in which W0{instrument,note}
Chris@5 22 %% is the 545x1 template for the given instrument and note number.
Chris@2 23 W = permute(W,[2 1 3]);
Chris@2 24 W0 = squeeze(num2cell(W,1))';
Chris@5 25
Chris@2 26 clear('noteTemplatesBassoon','noteTemplatesCello','noteTemplatesClarinet','noteTemplatesFlute','noteTemplatesGuitar',...
Chris@2 27 'noteTemplatesHorn','noteTemplatesOboe','noteTemplatesTenorSax','noteTemplatesViolin','noteTemplatesSptkBGCl','W');
Chris@2 28
Chris@2 29
Chris@2 30 % Compute CQT
Chris@5 31
Chris@5 32 %% The CQT parameters are hardcoded in computeCQT. It has frequency
Chris@5 33 %% range 27.5 -> samplerate/3, 60 bins per octave, a 'q' of 0.8 (lower
Chris@5 34 %% than the maximum, and default, value of 1), 'atomHopFactor' 0.3
Chris@5 35 %% rather than the default 0.25 (why?), Hann window, default sparsity
Chris@5 36 %% threshold.
Chris@5 37
Chris@5 38 %% for a 43.5 second 44.1 kHz audio file, intCQT will be a 545x30941
Chris@5 39 %% array, one column every 0.0014 seconds.
Chris@2 40 [intCQT] = computeCQT(filename);
Chris@5 41
Chris@5 42 %% X takes every 7.1128th column of intCQT (index rounded to nearest),
Chris@5 43 %% i.e. about 100 columns per second, and is then transposed, giving
Chris@5 44 %% 4350x545 in this case (one row per 10ms frame)
Chris@2 45 X = intCQT(:,round(1:7.1128:size(intCQT,2)))';
Chris@5 46
Chris@5 47 %% median filter to reduce noise -- I think this is essentially the
Chris@5 48 %% same as Xue's method for devuvuzelation
Chris@2 49 noiseLevel1 = medfilt1(X',40);
Chris@2 50 noiseLevel2 = medfilt1(min(X',noiseLevel1),40);
Chris@2 51 X = max(X-noiseLevel2',0);
Chris@5 52
Chris@5 53 %% take every 4th row. We had 100 per second (10ms) so this is 40ms as
Chris@5 54 %% the comment says. I am guessing we denoised at a higher resolution
Chris@5 55 %% for better denoising, though still not at the original resolution,
Chris@5 56 %% for speed. Y is now 1088x545 in our example and looks pretty clean
Chris@5 57 %% as a contour plot.
Chris@2 58 Y = X(1:4:size(X,1),:); % 40ms step
Chris@5 59
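Chris@5 60 %% a 1x1088 array containing the sum of each column of Y', i.e. the sum
Chris@5 61 %% over all 545 bins for each 40ms frame. Not used again in this
Chris@5 62 %% function, only returned to the caller.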
Chris@2 62 sumY = sum(Y');
Chris@5 63
Chris@2 64 clear('intCQT','X','noiseLevel1','noiseLevel2');
Chris@2 65
Chris@2 66 fprintf('%s','done');
Chris@2 67 fprintf('\n');
Chris@2 68 fprintf('%s',['Estimating F0s...........']);
Chris@2 69
Chris@2 70 % For each 100-frame segment (4sec at the 40ms step), perform SIPLCA with fixed W0
Chris@2 71 ph = zeros(440,size(Y,1));
Chris@2 72 pz = zeros(88,size(Y,1));
Chris@2 73
Chris@2 74 for j=1:floor(size(Y,1)/100)
Chris@2 75
Chris@2 76 x=[zeros(2,100); Y(1+(j-1)*100:j*100,:)'; zeros(2,100)];
Chris@2 77 [w,h,z,u,xa] = cplcaMT( x, 88, [545 1], 10, W0, [], [], [], iter, 1, 1, sz, su, 0, 1, 1, 1, pitchActivity);
Chris@2 78
Chris@2 79 H=[]; for i=1:88 H=[H; h{i}]; end;
Chris@2 80 ph(:,1+(j-1)*100:j*100) = H;
Chris@2 81 Z=[]; for i=1:88 Z=[Z z{i}]; end;
Chris@2 82 pz(:,1+(j-1)*100:j*100) = Z';
Chris@2 83 perc = 100*(j/(floor(size(Y,1)/100)+1));
Chris@2 84 fprintf('\n');
Chris@2 85 fprintf('%.2f%% complete',perc);
Chris@2 86 end;
Chris@2 87
Chris@2 88 len=size(Y,1)-j*100; % Final segment
Chris@2 89
Chris@2 90 if (len >0)
Chris@2 91 x=[zeros(2,len); Y(1+j*100:end,:)'; zeros(2,len)];
Chris@2 92 [w,h,z,u,xa] = cplcaMT( x, 88, [545 1], 10, W0, [], [], [], iter, 1, 1, sz, su, 0, 1, 1, 1, pitchActivity);
Chris@2 93 fprintf('\n');
Chris@2 94 fprintf('100%% complete');
Chris@2 95
Chris@2 96 H=[]; for i=1:88 H=[H; h{i}]; end;
Chris@2 97 ph(:,1+j*100:end) = H;
Chris@2 98 Z=[]; for i=1:88 Z=[Z z{i}]; end;
Chris@2 99 pz(:,1+j*100:end) = Z';
Chris@5 100 end;
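
The frame counts quoted in the notes above (30941 CQT columns, 4350 rows in X, 1088 rows in Y) and the way the main loop splits Y into 100-frame segments can be checked with a little arithmetic. The following is a minimal Python sketch, not part of the MATLAB source: the helper names `colon_count` and `segment_bounds` are hypothetical, and it assumes the input has at least one full 100-frame segment.

```python
import math

def colon_count(start, step, stop):
    """Number of elements in the MATLAB range start:step:stop (positive step)."""
    if stop < start:
        return 0
    return math.floor((stop - start) / step) + 1

def segment_bounds(n_frames, seg_len=100):
    """1-based inclusive (first, last) frame ranges: the full segments of the
    main loop, followed by the shorter 'Final segment' if any frames remain."""
    n_full = n_frames // seg_len
    bounds = [(1 + (j - 1) * seg_len, j * seg_len) for j in range(1, n_full + 1)]
    if n_frames - n_full * seg_len > 0:
        bounds.append((1 + n_full * seg_len, n_frames))
    return bounds

cqt_columns = 30941                            # example figure from the notes
x_rows = colon_count(1, 7.1128, cqt_columns)   # X = intCQT(:,round(1:7.1128:end))'
y_rows = colon_count(1, 4, x_rows)             # Y = X(1:4:size(X,1),:)
segments = segment_bounds(y_rows)

print(x_rows, y_rows, len(segments), segments[-1])
# prints: 4350 1088 11 (1001, 1088)
```

For the 43.5 second example this gives ten full segments of 100 frames (about 4 seconds each at the 40ms step) plus a final segment of 88 frames, which is the case handled by the `len > 0` branch at the end of the listing.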