annotate notes/cplcaMT-annotated.m @ 7:f3d14ba3d8e2

Add Smaragdis-Raj PLCA paper
author Chris Cannam
date Wed, 19 Mar 2014 17:32:12 +0000
parents 0d181e07c778
children 30715e32c5cc
rev   line source
Chris@5 1 function [w,h,z,u,xa] = cplcaMT( x, K, T, R, w, h, z, u, iter, sw, sh, sz, su, lw, lh, lz, lu, pa)
Chris@5 2 % function [w,h,z,u,xa] = cplcaMT( x, K, T, R, w, h, z, u, iter, sw, sh, sz, su, lw, lh, lz, lu, pa)
Chris@5 3 %
Chris@5 4 % Perform multiple-source, multiple-template SIPLCA for transcription
Chris@5 5 %
Chris@5 6 % Inputs:
Chris@5 7 % x input distribution
Chris@5 8 % K number of components
Chris@5 9 % T size of components
Chris@5 10 % R number of sources
Chris@5 11 % w initial value of p(w) [default = random]
Chris@5 12 % h initial value of p(h) [default = random]
Chris@5 13 % z initial value of p(z) [default = random]
Chris@5 14 % iter number of EM iterations [default = 10]
Chris@5 15 % sw sparsity parameter for w [default = 1]
Chris@5 16 % sh sparsity parameter for h [default = 1]
Chris@5 17 % sz sparsity parameter for z [default = 1]
Chris@5 18 % lw flag to update w [default = 1]
Chris@5 19 % lh flag to update h [default = 1]
Chris@5 20 % lz flag to update z [default = 1]
Chris@5 21 % pa source-component activity range [Rx2]
Chris@5 22 %
Chris@5 23 % Outputs:
Chris@5 24 % w p(w) - spectral bases
Chris@5 25 % h p(h) - pitch impulse
Chris@5 26 % z p(z) - mixing matrix for p(h)
Chris@5 27 % xa approximation of input
Chris@5 28
Chris@5 29 % Emmanouil Benetos 2011, based on cplca code by Paris Smaragdis
Chris@5 30
Chris@5 31
Chris@5 32 %% for the transcription application,
Chris@5 33 %% x -> noise-reduced constant Q. In the application this is a 2-sec,
Chris@5 34 %% 100-col segment with 2 zeros at top and bottom, so 549x100
Chris@5 35 %% K -> 88, number of notes
Chris@5 36 %% T -> [545 1], a two-element array: 545 is the length of each
Chris@5 37 %% template, but why 1?
Chris@5 38 %% R -> 10, number of instruments
Chris@5 39 %% w -> a 10x88 cell array, in which w{instrument,note} is a 545x1
Chris@5 40 %% array containing the template for the given instrument and note
Chris@5 41 %% number
Chris@5 42 %% h -> empty
Chris@5 43 %% z -> empty
Chris@5 44 %% u -> empty
Chris@5 45 %% iter -> a parameter for the program, 12 in the mirex submission
Chris@5 46 %% sw -> 1
Chris@5 47 %% sh -> 1
Chris@5 48 %% sz -> 1.1
Chris@5 49 %% su -> 1.3, not documented above, presumably sparsity for u
Chris@5 50 %% lw -> 0, don't update w
Chris@5 51 %% lh -> 1, do update h
Chris@5 52 %% lz -> 1, do update z
Chris@5 53 %% lu -> 1, not documented above, presumably do update u
Chris@5 54 %% pa -> a 10x2 array in which pa(instrument,1) is the lowest note
Chris@5 55 %% expected for that instrument and pa(instrument,2) is the highest
Chris@5 56
Chris@5 57
Chris@5 58 % Sort out the sizes
Chris@5 59
Chris@5 60 wc = 2*size(x)-T; %% works out to 553x199
Chris@5 61 hc = size(x)+T-1; %% works out to 1093x100
Chris@5 62
Chris@5 63 % Default training iterations
Chris@5 64 if ~exist( 'iter')
Chris@5 65 iter = 10;
Chris@5 66 end
Chris@5 67
Chris@5 68
Chris@5 69 % Initialize
Chris@5 70 sumx = sum(x); %% for later normalisation
Chris@5 71
Chris@5 72 if ~exist( 'w') || isempty( w)
Chris@5 73 %% doesn't happen, w was provided (it's the template data)
Chris@5 74 w = cell(R, K);
Chris@5 75 for k = 1:K
Chris@5 76 for r=1:R
Chris@5 77 w{r,k} = rand( T);
Chris@5 78 w{r,k} = w{r,k} / sum( w{r,k}(:));
Chris@5 79 end
Chris@5 80 end
Chris@5 81 end
Chris@5 82 if ~exist( 'h') || isempty( h)
Chris@5 83 %% does happen, h was not provided
Chris@5 84 h = cell(1, K);
Chris@5 85 for k = 1:K
Chris@5 86 h{k} = rand( size(x)-T+1);
Chris@5 87 h{k} = h{k};
Chris@5 88 end
Chris@5 89 %% h is now a 1x88 cell, h{note} is a 5x100 array of random values.
Chris@5 90 %% The 5 comes from the height of the CQ array minus the length of
Chris@5 91 %% a template, plus 1. I guess this is space to allow for the
Chris@5 92 %% 5-bins-per-semitone pitch shift.
Chris@5 93 end
Chris@5 94 if ~exist( 'z') || isempty( z)
Chris@5 95 %% does happen, z was not provided
Chris@5 96 z = cell(1, K);
Chris@5 97 for k = 1:K
Chris@5 98 z{k} = rand( size(x,2),1);
Chris@5 99 z{k} = z{k};
Chris@5 100 end
Chris@5 101 %% z is a 1x88 cell, z{note} is a 100x1 array of random values.
Chris@5 102 end
Chris@5 103 if ~exist( 'u') || isempty( u)
Chris@5 104 %% does happen, u was not provided
Chris@5 105 u = cell(R, K);
Chris@5 106 for k = 1:K
Chris@5 107 for r=1:R
Chris@5 108 if( (pa(r,1) <= k && k <= pa(r,2)) )
Chris@5 109 u{r,k} = ones( size(x,2),1);
Chris@5 110 else
Chris@5 111 u{r,k} = zeros( size(x,2),1);
Chris@5 112 end
Chris@5 113 end;
Chris@5 114 end
Chris@5 115 %% u is a 10x88 cell, u{instrument,note} is a 100x1 double containing
Chris@5 116 %% all 1s if note is in-range for instrument and all 0s otherwise
Chris@5 117 end
Chris@5 118
Chris@6 119 fh = cell(1, K); %% 1x88
Chris@6 120 fw = cell(R, K); %% 10x88
Chris@5 121 for k = 1:K
Chris@5 122 fh{k} = ones(wc) + 1i*ones(wc);
Chris@5 123 for r=1:R
Chris@5 124 fw{r,k} = ones(wc) + 1i*ones(wc);
Chris@5 125 end;
Chris@5 126 end;
Chris@5 127
Chris@6 128 %% now fh is a 1x88 cell, and fh{note} is a 553x199 array initialised
Chris@6 129 %% with all complex values 1 + 1i
Chris@6 130
Chris@6 131 %% fw is a 10x88 cell, and fw{instrument,note} is a 553x199 array
Chris@6 132 %% likewise
Chris@5 133
Chris@5 134
Chris@5 135 % Make commands for subsequent multidim operations and initialize fw
Chris@6 136
Chris@5 137 fnh = 'c(hc(1)-(T(1)+(1:size(h{k},1))-2),hc(2)-(T(2)+(1:size(h{k},2))-2))';
Chris@5 138 xai = 'xa(1:size(x,1),1:size(x,2))';
Chris@5 139 flz = 'xbar(end:-1:1,end:-1:1)';
Chris@5 140
Chris@5 141 for k = 1:K
Chris@5 142 for r=1:R
Chris@5 143 if( (pa(r,1) <= k && k <= pa(r,2)) )
Chris@6 144
Chris@6 145 %% fftn(X,siz) takes an N-dimensional FFT (same number of
Chris@6 146 %% dimensions as siz) of X, padding or truncating X
Chris@6 147 %% beforehand so that it is of size siz. Here w{r,k} is the
Chris@6 148 %% 545x1 template for instrument r and note k, and wc is
Chris@6 149 %% 553x199.
Chris@6 150
Chris@6 151 %% I believe this is equivalent to performing a 553-point
Chris@6 152 %% FFT of each column of the input (with w{r,k} in the first
Chris@6 153 %% 545 elements of the first column of that input) and then
Chris@6 154 %% a 199-point FFT of each row of the result.
Chris@6 155
Chris@6 156 %% The output is of course complex.
Chris@6 157
Chris@5 158 fw{r,k} = fftn( w{r,k}, wc);
Chris@5 159 end;
Chris@5 160 end;
Chris@5 161 end;
Chris@5 162
Chris@5 163 % Iterate
Chris@5 164 for it = 1:iter
Chris@5 165
Chris@5 166 %disp(['Iteration: ' num2str(it)]);
Chris@5 167
Chris@5 168 % E-step
Chris@5 169 xa = eps;
Chris@6 170 for k = 16:73 %% overall note range found in instrument set
Chris@5 171 fh{k} = fftn( h{k}, wc);
Chris@5 172 for r=1:R
Chris@5 173 if( (pa(r,1) <= k && k <= pa(r,2)) )
Chris@5 174 xa1 = abs( real( ifftn( fw{r,k} .* fh{k})));
Chris@5 175 xa = xa + xa1(1:size(x,1),1:size(x,2)) .*repmat(z{k},1,size(x,1))'.*repmat(u{r,k},1,size(x,1))';
Chris@5 176 clear xa1;
Chris@5 177 end
Chris@5 178 end
Chris@5 179 end
Chris@5 180
Chris@5 181 xbar = x ./ xa;
Chris@5 182 xbar = eval( flz);
Chris@5 183 fx = fftn( xbar, wc);
Chris@5 184
Chris@5 185
Chris@5 186 % M-step
Chris@5 187 for k = 16:73
Chris@5 188
Chris@5 189
Chris@5 190 % Update h, z, u
Chris@5 191 nh=eps;
Chris@5 192 for r=1:R
Chris@5 193 if( (pa(r,1) <= k && k <= pa(r,2)) )
Chris@5 194 c = abs( real( ifftn( fx .* fw{r,k} )));
Chris@5 195 nh1 = eval( fnh);
Chris@5 196 nh1 = nh1 .*repmat(u{r,k},1,size(h{k},1))';
Chris@5 197 nh = nh + nh1;
Chris@5 198
Chris@5 199 nhu = eval( fnh);
Chris@5 200 nhu = nhu .* h{k};
Chris@5 201 nu = sum(nhu)';
Chris@5 202 nu = u{r,k} .* nu + eps;
Chris@5 203 if lu
Chris@5 204 u{r,k} = nu;
Chris@5 205 end;
Chris@5 206
Chris@5 207 end;
Chris@5 208 end
Chris@5 209 nh = h{k} .* (nh.^sh);
Chris@5 210 nz = sum(nh)';
Chris@5 211 nz = z{k} .* nz + eps;
Chris@5 212
Chris@5 213
Chris@5 214 % Assign and normalize
Chris@5 215 if lh
Chris@5 216 h{k} = nh;
Chris@5 217 end
Chris@5 218 if lz
Chris@5 219 z{k} = nz;
Chris@5 220 end
Chris@5 221
Chris@5 222
Chris@5 223 end
Chris@5 224
Chris@5 225 % Normalize z over t
Chris@5 226 if lz
Chris@5 227 Z=[]; for i=1:K Z=[Z z{i}]; end;
Chris@5 228 Z = Z.^sz;
Chris@5 229 Z(1:end,1:15)=0;
Chris@5 230 Z(1:end,74:88)=0;
Chris@5 231 Z = Z./repmat(sum(Z,2),1,K); z = num2cell(Z,1); %figure; imagesc(imrotate(Z,90));
Chris@5 232 end
Chris@5 233
Chris@5 234 % Normalize u over z,t
Chris@5 235 if lu
Chris@5 236 U=[]; for r=1:R U(r,:,:) = cell2mat(u(r,:)); end;
Chris@5 237 for i=1:size(U,2) for j=1:size(U,3) U(:,i,j) = U(:,i,j).^su; U(:,i,j) = U(:,i,j) ./ sum(U(:,i,j)); end; end;
Chris@5 238 U0 = permute(U,[2 1 3]); u = squeeze(num2cell(U0,1));
Chris@5 239 end
Chris@5 240
Chris@5 241 % Normalize h over z,t
Chris@5 242 H=[]; for k=1:K H(k,:,:) = cell2mat(h(k)); end; H0 = permute(H,[2 1 3]);
Chris@5 243 for i=1:size(H0,2) for j=1:size(H0,3) H0(:,i,j) = sumx(j)* (H0(:,i,j) ./ sum(H0(:,i,j))); end; end;
Chris@5 244 h = squeeze(num2cell(squeeze(H0),[1 3])); for k=1:K h{k} = squeeze(h{k}); end;
Chris@5 245
Chris@5 246 %figure; imagesc(imrotate(xa',90));
Chris@5 247
Chris@5 248 end
Chris@5 249
Chris@5 250 %figure; imagesc(imrotate(xa',90));
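
The size arithmetic and the FFT-based reconstruction in the E-step can be checked on a small example. The sketch below is not part of cplcaMT: the 12x7 input and 8x1 template are hypothetical toy sizes chosen to keep it fast, and conv2 is used only as an independent reference. It assumes that the product of zero-padded FFTs of size wc reproduces ordinary 2-D convolution of a template with its activation, which holds here because wc is at least as large as the full convolution output, so nothing wraps around.

```matlab
% Toy stand-ins (hypothetical sizes, not the 549x100 application input)
x = rand(12, 7);          % plays the role of the constant-Q segment
T = [8 1];                % template size: 8 bins tall, 1 frame wide

wc  = 2*size(x) - T;      % FFT size, as computed at the top of cplcaMT
hsz = size(x) - T + 1;    % size of each h{k}, as in the initialisation

w = rand(T);              % one template, normalised to sum to 1
w = w / sum(w(:));
h = rand(hsz);            % one random activation

% E-step style reconstruction: multiply zero-padded FFTs, invert, crop
xa1 = real(ifftn(fftn(w, wc) .* fftn(h, wc)));
xa1 = xa1(1:size(x,1), 1:size(x,2));

% Independent reference: direct 2-D convolution (same size as x here)
ref = conv2(w, h);
err = max(abs(xa1(:) - ref(:)));
```

With the application sizes quoted in the annotations (x is 549x100 and T is [545 1]), the same formulas give wc = [553 199] and h{k} of size 5x100.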