idamnjanovic@8: function [p,s] = SMALL_swipe(X,fs, f, plim,dlog2p,dERBs,woverlap,sTHR) ivan@128: %% Modified SWIPEP Pitch estimation using SWIPE'. idamnjanovic@24: % This is modified swipep MATLAB code that is working directly in spectral idamnjanovic@24: % domain and uses only one window size. The results are suboptimal idamnjanovic@24: % comparing to original code. It is also converted to SWIPE which uses all ivan@128: % the harmonics of the signal. ivan@128: ivan@128: % ivan@128: % idamnjanovic@24: % idamnjanovic@24: % SWIPEP Pitch estimation using SWIPE'. idamnjanovic@8: % P = SWIPEP(X,Fs,[PMIN PMAX],DT,DLOG2P,DERBS,STHR) estimates the pitch idamnjanovic@8: % of the vector signal X every DT seconds. The sampling frequency of idamnjanovic@8: % the signal is Fs (in Hertz). The spectrum is computed using a Hann idamnjanovic@8: % window with an overlap WOVERLAP between 0 and 1. The spectrum is idamnjanovic@8: % sampled uniformly in the ERB scale with a step size of DERBS ERBs. The idamnjanovic@8: % pitch is searched within the range [PMIN PMAX] (in Hertz) with samples idamnjanovic@8: % distributed every DLOG2P units on a base-2 logarithmic scale of Hertz. idamnjanovic@8: % The pitch is fine-tuned using parabolic interpolation with a resolution idamnjanovic@8: % of 1 cent. Pitch estimates with a strength lower than STHR are treated idamnjanovic@8: % as undefined. idamnjanovic@8: % idamnjanovic@8: % [P,T,S] = SWIPEP(X,Fs,[PMIN PMAX],DT,DLOG2P,DERBS,WOVERLAP,STHR) idamnjanovic@8: % returns the times T at which the pitch was estimated and the pitch idamnjanovic@8: % strength S of every pitch estimate. idamnjanovic@8: % idamnjanovic@8: % P = SWIPEP(X,Fs) estimates the pitch using the default settings PMIN = idamnjanovic@8: % 30 Hz, PMAX = 5000 Hz, DT = 0.001 s, DLOG2P = 1/48 (48 steps per idamnjanovic@8: % octave), DERBS = 0.1 ERBs, WOVERLAP = 0.5, and STHR = -Inf. idamnjanovic@8: % idamnjanovic@8: % P = SWIPEP(X,Fs,...,[],...) uses the default setting for the parameter idamnjanovic@8: % replaced with the placeholder []. idamnjanovic@8: % idamnjanovic@8: % REMARKS: (1) For better results, make DLOG2P and DERBS as small as idamnjanovic@8: % possible and WOVERLAP as large as possible. However, take into account idamnjanovic@8: % that the computational complexity of the algorithm is inversely idamnjanovic@8: % proportional to DLOG2P, DERBS and 1-WOVERLAP, and that the default idamnjanovic@8: % values have been found empirically to produce good results. Consider idamnjanovic@8: % also that the computational complexity is directly proportional to the idamnjanovic@8: % number of octaves in the pitch search range, and therefore , it is idamnjanovic@8: % recommendable to restrict the search range to the expected range of idamnjanovic@8: % pitch, if any. (2) This code implements SWIPE', which uses only the idamnjanovic@8: % first and prime harmonics of the signal. To convert it into SWIPE, idamnjanovic@8: % which uses all the harmonics of the signal, replace the word idamnjanovic@8: % PRIMES with a colon (it is located almost at the end of the code). idamnjanovic@8: % However, this may not be recommendable since SWIPE' is reported to idamnjanovic@8: % produce on average better results than SWIPE (Camacho and Harris, idamnjanovic@8: % 2008). idamnjanovic@8: % idamnjanovic@8: % EXAMPLE: Estimate the pitch of the signal X every 10 ms within the idamnjanovic@8: % range 75-500 Hz using the default resolution (i.e., 48 steps per idamnjanovic@8: % octave), sampling the spectrum every 1/20th of ERB, using a window idamnjanovic@8: % overlap factor of 50%, and discarding samples with pitch strength idamnjanovic@8: % lower than 0.2. Plot the pitch trace. idamnjanovic@8: % [x,Fs] = wavread(filename); idamnjanovic@8: % [p,t,s] = swipep(x,Fs,[75 500],0.01,[],1/20,0.5,0.2); idamnjanovic@8: % plot(1000*t,p) idamnjanovic@8: % xlabel('Time (ms)') idamnjanovic@8: % ylabel('Pitch (Hz)') idamnjanovic@8: % idamnjanovic@8: % REFERENCES: Camacho, A., Harris, J.G, (2008) "A sawtooth waveform idamnjanovic@8: % inspired pitch estimator for speech and music," J. Acoust. Soc. Am. idamnjanovic@8: % 124, 1638-1652. idamnjanovic@24: ivan@128: % ivan@128: % Centre for Digital Music, Queen Mary, University of London. ivan@128: % This file copyright 2009 Ivan Damnjanovic. ivan@128: % ivan@128: % This program is free software; you can redistribute it and/or ivan@128: % modify it under the terms of the GNU General Public License as ivan@128: % published by the Free Software Foundation; either version 2 of the ivan@128: % License, or (at your option) any later version. See the file ivan@128: % COPYING included with this distribution for more information. ivan@128: %% ivan@128: idamnjanovic@8: if ~ exist( 'plim', 'var' ) || isempty(plim), plim = [30 5000]; end idamnjanovic@8: %if ~ exist( 'dt', 'var' ) || isempty(dt), dt = 0.001; end idamnjanovic@8: if ~ exist( 'dlog2p', 'var' ) || isempty(dlog2p), dlog2p = 1/48; end idamnjanovic@8: if ~ exist( 'dERBs', 'var' ) || isempty(dERBs), dERBs = 0.05; end idamnjanovic@8: % if ~ exist( 'woverlap', 'var' ) || isempty(woverlap) idamnjanovic@8: % woverlap = 0.5; idamnjanovic@8: % elseif woverlap>1 || woverlap<0 idamnjanovic@8: % error('Window overlap must be between 0 and 1.') idamnjanovic@8: % end idamnjanovic@8: if ~ exist( 'sTHR', 'var' ) || isempty(sTHR), sTHR = -Inf; end idamnjanovic@8: %t = [ 0: dt: length(x)/fs ]'; % Times idamnjanovic@8: % Define pitch candidates idamnjanovic@8: log2pc = [ log2(plim(1)): dlog2p: log2(plim(2)) ]'; idamnjanovic@8: pc = 2 .^ log2pc; idamnjanovic@8: S = zeros( length(pc), 1 ); % Pitch strength matrix idamnjanovic@8: % Determine P2-WSs idamnjanovic@8: %logWs = round( log2( 8*fs ./ plim ) ); idamnjanovic@8: ws = [2822];%2.^[ logWs(1): -1: logWs(2) ]; % P2-WSs idamnjanovic@8: pO = 8 * fs ./ ws; % Optimal pitches for P2-WSs idamnjanovic@8: % Determine window sizes used by each pitch candidate idamnjanovic@8: d = 1 + log2pc - log2( 8*fs./ws(1) ); idamnjanovic@8: % Create ERB-scale uniformly-spaced frequencies (in Hertz) idamnjanovic@8: fERBs = erbs2hz([ hz2erbs(min(pc)/4): dERBs: hz2erbs(fs/2) ]'); idamnjanovic@8: for i = 1 : length(ws) idamnjanovic@8: %dn = max( 1, round( 8*(1-woverlap) * fs / pO(i) ) ); % Hop size idamnjanovic@8: % Zero pad signal idamnjanovic@8: %xzp = [ zeros( ws(i)/2, 1 ); x(:); zeros( dn + ws(i)/2, 1 ) ]; idamnjanovic@8: % Compute spectrum idamnjanovic@8: %w = hanning( ws(i) ); % Hann window idamnjanovic@8: %o = max( 0, round( ws(i) - dn ) ); % Window overlap idamnjanovic@8: %[ X, f, ti ] = specgram( xzp, ws(i), fs, w, o ); idamnjanovic@8: % Select candidates that use this window size idamnjanovic@8: if length(ws) == 1 idamnjanovic@8: j=[1:size(pc)]'; k = []; idamnjanovic@8: elseif i == length(ws) idamnjanovic@8: j=find(d-i>-1); k=find(d(j)-i<0); idamnjanovic@8: elseif i==1 idamnjanovic@8: j=find(d-i<1); k=find(d(j)-i>0); idamnjanovic@8: else idamnjanovic@8: j=find(abs(d-i)<1); k=1:length(j); idamnjanovic@8: end idamnjanovic@8: % Compute loudness at ERBs uniformly-spaced frequencies idamnjanovic@8: fERBs = fERBs( find( fERBs > pc(1)/4, 1, 'first' ) : end ); idamnjanovic@8: L = sqrt( max( 0, interp1( f, X, fERBs, 'spline', 0) ) ); idamnjanovic@8: % Compute pitch strength idamnjanovic@8: Si = pitchStrengthAllCandidates( fERBs, L, pc ); idamnjanovic@8: % Interpolate pitch strength at desired times idamnjanovic@8: % if size(Si,2) > 1 idamnjanovic@8: % warning off MATLAB:interp1:NaNinY idamnjanovic@8: % Si = interp1( ti, Si', t, 'linear', NaN )'; idamnjanovic@8: % warning on MATLAB:interp1:NaNinY idamnjanovic@8: % else idamnjanovic@8: % Si = repmat( NaN, length(Si),1 ); idamnjanovic@8: % end idamnjanovic@8: % Add pitch strength to combination idamnjanovic@8: % lambda = d( j(k) ) - i; idamnjanovic@8: mu = ones( size(j) ); idamnjanovic@8: % mu(k) = 1 - abs( lambda ); idamnjanovic@8: S(j,:) = S(j,:) + repmat(mu,1,size(Si,2)) .* Si; idamnjanovic@8: end idamnjanovic@8: % Fine tune pitch using parabolic interpolation idamnjanovic@8: p = repmat( NaN, size(S,2), 1 ); idamnjanovic@8: s = repmat( NaN, size(S,2), 1 ); idamnjanovic@8: for j = 1 : size(S,2) idamnjanovic@8: [ s(j), i ] = max( S(:,j), [], 1 ); idamnjanovic@8: if s(j) < sTHR, continue, end idamnjanovic@8: if i == 1 || i == length(pc) idamnjanovic@8: p(j) = pc(i); idamnjanovic@8: else idamnjanovic@8: I = i-1 : i+1; idamnjanovic@8: tc = 1 ./ pc(I); idamnjanovic@8: ntc = ( tc/tc(2) - 1 ) * 2*pi; idamnjanovic@8: c = polyfit( ntc, S(I,j), 2 ); idamnjanovic@8: ftc = 1 ./ 2.^[ log2(pc(I(1))): 1/12/100: log2(pc(I(3))) ]; idamnjanovic@8: nftc = ( ftc/tc(2) - 1 ) * 2*pi; idamnjanovic@8: [s(j) k] = max( polyval( c, nftc ) ); idamnjanovic@8: p(j) = 2 ^ ( log2(pc(I(1))) + (k-1)/12/100 ); idamnjanovic@8: % if (p(j)-pc(I(1)))<0.75*abs(p(j)-pc(I(2))) idamnjanovic@8: % p(j)=pc(I(1)); idamnjanovic@8: % elseif (pc(I(3))-p(j))<0.75*abs(p(j)-pc(I(2))) idamnjanovic@8: % p(j)=pc(I(3)); idamnjanovic@8: % else idamnjanovic@8: p(j)=pc(I(2)); idamnjanovic@8: % end idamnjanovic@8: end idamnjanovic@8: end idamnjanovic@8: idamnjanovic@8: function S = pitchStrengthAllCandidates( f, L, pc ) idamnjanovic@8: % Create pitch strength matrix idamnjanovic@8: S = zeros( length(pc), size(L,2) ); idamnjanovic@8: % Define integration regions idamnjanovic@8: k = ones( 1, length(pc)+1 ); idamnjanovic@8: for j = 1 : length(k)-1 idamnjanovic@8: k(j+1) = k(j) - 1 + find( f(k(j):end) > pc(j)/4, 1, 'first' ); idamnjanovic@8: end idamnjanovic@8: k = k(2:end); idamnjanovic@8: % Create loudness normalization matrix idamnjanovic@8: N = sqrt( flipud( cumsum( flipud(L.*L) ) ) ); idamnjanovic@8: for j = 1 : length(pc) idamnjanovic@8: % Normalize loudness idamnjanovic@8: warning off MATLAB:divideByZero idamnjanovic@8: NL = L(k(j):end,:) ./ repmat( N(k(j),:), size(L,1)-k(j)+1, 1); idamnjanovic@8: warning on MATLAB:divideByZero idamnjanovic@8: % Compute pitch strength idamnjanovic@8: idamnjanovic@8: S(j,:) = pitchStrengthOneCandidate( f(k(j):end), NL, pc(j) ); idamnjanovic@8: end idamnjanovic@8: idamnjanovic@8: function S = pitchStrengthOneCandidate( f, NL, pc ) idamnjanovic@8: n = fix( f(end)/pc - 0.75 ); % Number of harmonics idamnjanovic@8: if n==0, S=NaN; return, end idamnjanovic@8: k = zeros( size(f) ); % Kernel idamnjanovic@8: % Normalize frequency w.r.t. candidate idamnjanovic@8: q = f / pc; idamnjanovic@8: % Create kernel idamnjanovic@8: for i = [ 1:n] % primes(n) ] idamnjanovic@8: a = abs( q - i ); idamnjanovic@8: % Peak's weigth idamnjanovic@8: p = a < .25; idamnjanovic@8: k(p) = cos( 2*pi * q(p) ); idamnjanovic@8: % Valleys' weights idamnjanovic@8: v = .25 < a & a < .75; idamnjanovic@8: k(v) = k(v) + cos( 2*pi * q(v) ) / 2; idamnjanovic@8: end idamnjanovic@8: % Apply envelope idamnjanovic@8: k = k .* sqrt( 1./f ); idamnjanovic@8: % K+-normalize kernel idamnjanovic@8: k = k / norm( k(k>0) ); idamnjanovic@8: % Compute pitch strength idamnjanovic@8: S = k' * NL; idamnjanovic@8: idamnjanovic@8: function erbs = hz2erbs(hz) idamnjanovic@8: erbs = 21.4 * log10( 1 + hz/229 ); idamnjanovic@8: idamnjanovic@8: function hz = erbs2hz(erbs) idamnjanovic@8: hz = ( 10 .^ (erbs./21.4) - 1 ) * 229;