holger@0: classdef CAdaptInstrSpec holger@0: % CAdaptInstrSpec - Estimation of a filter curve that enables the holger@0: % adaptation of instrument templates from one recording to another. The holger@0: % beta-divergence is used as a cost function between the original and the holger@0: % adapted spectra. All spectra need to be provided on a logarithmic holger@0: % frequency axis (for now, an extension to linear frequency axes should holger@0: % be straightforward). For further details on the estimation method, see holger@0: % the references below. holger@0: % holger@0: % PROPERTIES holger@0: % no public properties holger@0: % holger@0: % METHODS holger@0: % CAdaptInstrSpec - constructor for CAdaptInstrSpec object holger@0: % setH - sets filter curve 'h' holger@0: % getH - returns estimate of filter curve ''h'' holger@0: % getSmoothedH - returns smoothed and interpolated version of the holger@0: % filter curve ''h'' holger@0: % updateH - performs single update of filter curve ''h'' holger@0: % estimateSpectra - estimates spectra based on current estimate of holger@0: % the filter curve. holger@0: % compBetaDivergence - compute beta-divergence between original and holger@0: % estimated spectra holger@0: % holger@0: % For further help on the methods, type 'help CAdaptInstrSpec.[methodName]' holger@0: % holger@0: % holger@0: % References: holger@0: % holger@0: % [1] H. Kirchhoff, S. Dixon, A. Klapuri. Missing spectral templates holger@0: % estimation for user-assisted music transcription. IEEE International holger@0: % Conference on Acoustics, Speech and Signal Processing, Vancouver, holger@0: % Canada, 2013, submitted. holger@0: % [2] H. Kirchhoff, S. Dixon, and A. Klapuri. Cross-recording adaptation of holger@0: % musical instrument spectra. Technical Report C4DM-TR-11-2012, holger@0: % Queen Mary University of London, 2012. holger@0: % http://www.eecs.qmul.ac.uk/~holger/C4DM-TR-11-2012 holger@0: holger@0: % Copyright (C) 2012 Holger Kirchhoff holger@0: % holger@0: % This program is free software; you can redistribute it and/or holger@0: % modify it under the terms of the GNU General Public License holger@0: % as published by the Free Software Foundation; either version 2 holger@0: % of the License, or (at your option) any later version. holger@0: % holger@0: % This program is distributed in the hope that it will be useful, holger@0: % but WITHOUT ANY WARRANTY; without even the implied warranty of holger@0: % MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the holger@0: % GNU General Public License for more details. holger@0: % holger@0: % You should have received a copy of the GNU General Public License holger@0: % along with this program; if not, write to the Free Software holger@0: % Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. holger@0: holger@0: holger@0: properties (Access='private') holger@0: holger@0: h = []; % filter transfer function holger@0: maxDevFromMedianInDB = 10; % maximum deviation from mean values holger@0: numCepstralCoeffs = 20; holger@0: regularisationParam = 0.001; holger@0: holger@0: spectra_DB = []; % basis functions estimated from database (e.g. RWC) holger@0: spectra_data = []; % basis functions derived from analysis spectrogram holger@0: W_DB = []; % spectra_DB reduced to peak amplitudes only holger@0: W_data = []; % spectra_data reduced to peak amplitudes only holger@0: W_data_est = []; % estimated basis functions (W_data * h) holger@0: f0Idcs_DB = []; % pitch values of columns in W_DB holger@0: f0Idcs_data = []; % pitch values of columns in W_data holger@0: numF0Idcs_DB = 0; holger@0: numF0Idcs_data = 0; holger@0: holger@0: commonF0Idcs = []; % midi pitch values that occur both in f0Idcs_DB and in f0Idcs_data holger@0: commonF0IdcsIdcs_DB = []; % pitch indices in f0Idcs_DB that also occur in f0Idcs_data holger@0: commonF0IdcsIdcs_data = []; % pitch indices in f0Idcs_data that also occur in f0Idcs_DB holger@0: numCommonShifts = 0; holger@0: holger@0: zeroIdcsW_data = []; % indices in W_data that are zero (required for numerical reasons) holger@0: zeroIdcsW_data_est = []; % indices in W_DB that are zero holger@0: zeroIdcsH = []; % indices in h that are zero holger@0: holger@0: maxNumFreqs = 0; holger@0: holger@0: costFctName = ''; holger@0: costFctNames = {'LS', 'KL', 'IS', 'BD'}; holger@0: holger@0: beta = 0; % parameter beta for beta divergence holger@0: holger@0: end % properties holger@0: holger@0: methods holger@0: holger@0: function obj = CAdaptInstrSpec(spectra_DB, spectra_data, f0Idcs_DB, f0Idcs_data, numBinsPerSemitone, costFctName, varargin) holger@0: % CAdaptInstrSpec - constructor of CAdaptInstrSpec class holger@0: % holger@0: % myObj = CAdaptInstrSpec(spectra_DB, spectra_data, f0Idcs_DB, holger@0: % f0Idcs_data, numBinsPerSemitone, costFctName) holger@0: % constructs a CAdaptInstrSpec object. holger@0: % holger@0: % Parameters: holger@0: % spectra_DB - matrix containing in its columns the the holger@0: % database templates that are to be adapted holger@0: % spectra_data - matrix containing the spectra estimated holger@0: % from the recording to which the database holger@0: % spectra should be adapted holger@0: % f0Idcs_DB - f0-indices corresponding to the columns holger@0: % in spectra_DB holger@0: % f0Idcs_data - f0-indices corresponding to the columns holger@0: % in spectra_data holger@0: % numBinsPerSemitone - pitch resolution of the constant-Q holger@0: % spectra holger@0: % costFctName - name of cost function. available cost holger@0: % functions are: holger@0: % 'LS' - least squares error holger@0: % 'KL' - generalised Kullback-Leibler div. holger@0: % 'IS' - Itakura-Saito divergence holger@0: % 'BD' - beta divergence holger@0: % holger@0: % If 'BD' is selected as the cost function, the parameter beta holger@0: % has to be provided by myObj = CAdaptInstrSpec(..., 'beta', holger@0: % betaValue) where betaValue is a real, finite scalar. holger@0: holger@0: holger@0: assert(length(f0Idcs_DB) == size(spectra_DB,2), 'number of values in ''f0Idcs_DB'' must be the same as number of columns in W_DB'); holger@0: assert(length(f0Idcs_data) == size(spectra_data,2), 'number of values in ''f0Idcs_data'' must be the same as number of columns in W_data'); holger@0: assert(size(spectra_DB,1) == size(spectra_data,1), 'number of frequency bins (rows) in W_DB and W_data must be the same'); holger@0: holger@0: % FIXME: check that inputs are valid! holger@0: holger@0: %% set member variables holger@0: obj.f0Idcs_DB = f0Idcs_DB; holger@0: obj.f0Idcs_data = f0Idcs_data; holger@0: obj.spectra_DB = spectra_DB; holger@0: obj.spectra_data = spectra_data; holger@0: holger@0: obj.maxNumFreqs = size(spectra_DB, 1); holger@0: obj.numF0Idcs_DB = size(spectra_DB, 2); holger@0: obj.numF0Idcs_data = size(spectra_data,2); holger@0: holger@0: [obj.commonF0Idcs, obj.commonF0IdcsIdcs_DB, obj.commonF0IdcsIdcs_data] = intersect(f0Idcs_DB, f0Idcs_data); holger@0: obj.numCommonShifts = length(obj.commonF0IdcsIdcs_DB); holger@0: holger@0: holger@0: %% reduce spectra to partial amplitudes only holger@0: obj.W_DB = obj.noteSpec2partialSpec(spectra_DB, f0Idcs_DB, numBinsPerSemitone); holger@0: obj.W_data = obj.noteSpec2partialSpec(spectra_data, f0Idcs_data, numBinsPerSemitone); holger@0: holger@0: %% adjust amplitudes in database spectra at common pitches holger@0: obj.W_DB(:, obj.commonF0IdcsIdcs_DB) = obj.adjustPartialPositions(obj.W_DB(:, obj.commonF0IdcsIdcs_DB), ... holger@0: obj.W_data(:, obj.commonF0IdcsIdcs_data), ... holger@0: obj.commonF0Idcs, numBinsPerSemitone); holger@0: holger@0: holger@0: %% find zero-entries in W_data and W_data_est holger@0: obj.zeroIdcsW_data = (obj.W_data(:,obj.commonF0IdcsIdcs_data) == 0); holger@0: obj.zeroIdcsW_data_est = (obj.W_DB(:,obj.commonF0IdcsIdcs_DB) == 0); % zero where W_DB is zero (see computeW_data_est) holger@0: obj.zeroIdcsH = (sum(obj.W_data,2) == 0) | (sum(obj.W_DB(:,obj.commonF0IdcsIdcs_DB),2) == 0); holger@0: holger@0: assert(ischar(costFctName), ~any(strcmpi(costFctName, obj.costFctNames)), ... holger@0: 'Argument ''costFctName'' must be a string, and must match one of the implemented cost function names.'); holger@0: obj.costFctName = costFctName; holger@0: holger@0: holger@0: %% set beta & cost function name holger@0: switch obj.costFctName holger@0: case 'LS' holger@0: obj.beta = 2; holger@0: obj.costFctName = 'BD'; holger@0: case 'KL' holger@0: obj.beta = 1; holger@0: obj.costFctName = 'BD'; holger@0: case 'IS' holger@0: obj.beta = 0; holger@0: obj.costFctName = 'BD'; holger@0: case 'BD' holger@0: %check if beta was set holger@0: if isempty(varargin) % FIXME: if more optional arguments are added later, use MATLAB's inputparser holger@0: error('When ''BD'' is used as the cost function, beta needs to be set.'); holger@0: elseif ~strcmp(varargin{1}, 'beta') holger@0: error('Name/value pair for ''beta'' not found.') holger@0: end holger@0: beta = varargin{2}; holger@0: validateattributes(beta, {'numeric'}, {'scalar', 'real', 'finite', 'nonnan'}, 'CSourceFilter', 'beta', 5) holger@0: obj.beta = beta; holger@0: end holger@0: holger@0: %% initialise h holger@0: obj.h = zeros(obj.maxNumFreqs,1); holger@0: obj.h(~obj.zeroIdcsH) = 1; holger@0: holger@0: %% compute initial WEst holger@0: obj = computeW_data_est(obj); holger@0: holger@0: end holger@0: holger@0: holger@0: function obj = updateH(obj) holger@0: % updateH - perform single update of filter curve ''h'' holger@0: % holger@0: % myObj = myObj.updateH applies the update functions to the holger@0: % filter curve ''h''. holger@0: holger@0: %% get spectra at common f0 indices holger@0: W_data = obj.W_data(:, obj.commonF0IdcsIdcs_data); holger@0: W_DB = obj.W_DB(:, obj.commonF0IdcsIdcs_DB); holger@0: W_data_est = obj.W_data_est; holger@0: holger@0: holger@0: %% compute W_data * W_data_est^(beta-2) holger@0: nomMatrix = W_data .* W_data_est .^ (obj.beta-2); holger@0: holger@0: % fix divide by 0 holger@0: if obj.beta < 2 % if beta < 2, exponent of WEst^(beta-2) is negative -> division holger@0: maxRatio = max(max(nomMatrix( ~obj.zeroIdcsW_data_est ))); holger@0: nomMatrix(obj.zeroIdcsW_data & obj.zeroIdcsW_data_est) = 1; holger@0: nomMatrix(~obj.zeroIdcsW_data & obj.zeroIdcsW_data_est) = maxRatio; holger@0: end holger@0: holger@0: %% compute W_data^(beta-1) holger@0: denomMatrix = W_data_est .^ (obj.beta-1); holger@0: holger@0: % fix divide by 0 holger@0: if obj.beta < 1 % if beta < 1, exponent of WEst^(beta-1) is negative -> division holger@0: maxRatio = max(max(denomMatrix(~obj.zeroIdcsW_data_est))); holger@0: denomMatrix(obj.zeroIdcsW_data_est) = maxRatio; holger@0: end holger@0: holger@0: %% multiply by W_DB holger@0: nomMatrix = nomMatrix .* W_DB; holger@0: denomMatrix = denomMatrix .* W_DB; holger@0: holger@0: holger@0: %% compute nominator and denominator holger@0: nom = sum(nomMatrix, 2); holger@0: denom = sum(denomMatrix, 2); holger@0: holger@0: %% compute ratio holger@0: ratio = nom ./ denom; holger@0: ratio((nom==0) & (denom==0)) = 1; holger@0: ratio((nom~=0) & (denom==0)) = max(ratio); holger@0: holger@0: %% apply update holger@0: obj.h(~obj.zeroIdcsH) = obj.h(~obj.zeroIdcsH) .* ratio(~obj.zeroIdcsH); holger@0: holger@0: %% recompute WEst holger@0: obj = computeW_data_est(obj); holger@0: end holger@0: holger@0: holger@0: function [spectra shiftVals] = estimateSpectra(obj, shiftVals) holger@0: % estimateSpectra - computes basis functions from the current holger@0: % estimates for e and h holger@0: % holger@0: % [spectra f0Idcs] = myObj.estimateSpectra(f0Idcs) estimates the holger@0: % spectra at the f0 indices provided by ''f0Idcs'' by applying holger@0: % the current estimate of the filter curve to the spectra in holger@0: % ''spectra_DB'' (see constructor). Spectra are only returned holger@0: % for those f0Idcs that exist in ''f0Idcs_DB'' specified in the holger@0: % constructor. The output is a matrix ''spectra'' containing the holger@0: % estimated spectra and a vector ''f0Idcs'' containing the holger@0: % corresponding f0 indices. holger@0: holger@0: %FIXME: check that input variable 'shiftVals' is correct holger@0: holger@0: % find values in shiftVals that also exist in obj.f0Idcs_DB holger@0: [commonShiftVals, shiftIdcs_DB, dummy] = intersect(obj.f0Idcs_DB, shiftVals); holger@0: numCommonShiftVals = length(commonShiftVals); holger@0: holger@0: h = obj.getSmoothedH(); holger@0: %h = obj.getH(); holger@0: spectra = obj.spectra_DB(:,shiftIdcs_DB) .* repmat(h, 1, numCommonShiftVals); holger@0: end holger@0: holger@0: function h = getH(obj) holger@0: % getH - get filter curve ''h'' holger@0: % holger@0: % myH = myObj.getH() returns the member variable ''h''. holger@0: holger@0: h = obj.h; holger@0: end holger@0: holger@0: function h = getSmoothedH(obj) holger@0: % getH - get smoothed version of filter curve ''h'' holger@0: % holger@0: % myH = myObj.getSmoothedH() returns a smoothed version of the holger@0: % filter curve ''h''. Smoothing is done by applying the discrete holger@0: % cepstrum spectral envelope algorithm from Diemo Schwartz to holger@0: % the filter curve ''h''. holger@0: holger@0: nonZeroFreqIdcsH = find(~obj.zeroIdcsH); holger@0: holger@0: % select nonzero entries from h holger@0: h = obj.h; holger@0: h_nonzero = h(nonZeroFreqIdcsH); holger@0: h_nonzero_DB = 20*log10(h_nonzero); holger@0: holger@0: holger@0: % correct outliers that are more than 10 dB above or below median holger@0: medianInDB = median(h_nonzero_DB); holger@0: holger@0: idcs = h_nonzero_DB > medianInDB + obj.maxDevFromMedianInDB; holger@0: h_nonzero(idcs) = 10^( (medianInDB + obj.maxDevFromMedianInDB)/20 ); holger@0: holger@0: idcs = h_nonzero_DB < medianInDB - obj.maxDevFromMedianInDB; holger@0: h_nonzero(idcs) = 10^( (medianInDB - obj.maxDevFromMedianInDB)/20 ); holger@0: holger@0: holger@0: % setup vector containing frequencies for cosine approx. holger@0: w = (1:obj.maxNumFreqs)' / obj.maxNumFreqs * pi; holger@0: holger@0: % select w at nonzero entries of h holger@0: w_nonzero = w(nonZeroFreqIdcsH); holger@0: holger@0: % copy first and last nonzero entry to boundaries holger@0: if nonZeroFreqIdcsH(1) ~= 1 holger@0: h_nonzero = [h_nonzero(1); h_nonzero]; holger@0: w_nonzero = [w(1); w_nonzero]; holger@0: end holger@0: if nonZeroFreqIdcsH(end) ~= obj.maxNumFreqs holger@0: h_nonzero = [h_nonzero; h_nonzero(end)]; holger@0: w_nonzero = [w_nonzero; w(end)]; holger@0: end holger@0: holger@0: holger@0: % apply cosine approximation to h (discrete cepstrum) holger@0: coeffs = dceps(h_nonzero, w_nonzero, obj.numCepstralCoeffs, obj.regularisationParam); holger@0: h = idceps(coeffs, w); holger@0: holger@0: end holger@0: holger@0: function obj = setH(obj, h) holger@0: % setH - set member variable h holger@0: % holger@0: % myObj = myObj.setH(myH) sets the member variable h to myH. holger@0: % myH must be a non-negative column vector of length [number of holger@0: % frequencies]. holger@0: holger@0: obj.h = h; holger@0: obj = computeW_data_est(obj); holger@0: end holger@0: holger@0: function betaDiv = compBetaDivergence(obj) holger@0: % compBetaDivergence - computes beta divergence based on the holger@0: % current estiates holger@0: % holger@0: % betaDiv = myObj.compBetaDivergence() returns the holger@0: % beta-divergence between the instrument spectra from the holger@0: % recording and the adapted database spectra based on the value holger@0: % for beta specified by the cost function in the constructor. holger@0: holger@0: W_data = obj.W_data(~obj.zeroIdcsW_data_est); holger@0: W_data_est = obj.W_data_est(~obj.zeroIdcsW_data_est); holger@0: holger@0: switch obj.beta holger@0: case 0 holger@0: betaDivMat = W_data ./ W_data_est - log(W_data ./ W_data_est) - 1; holger@0: holger@0: case 1 holger@0: betaDivMat = W_data .* log(W_data ./ W_data_est) + W_data - W_data_est; holger@0: holger@0: otherwise holger@0: betaDivMat = (W_data .^ obj.beta) / (obj.beta * (obj.beta-1)) ... holger@0: + (W_data_est .^ obj.beta) / obj.beta ... holger@0: - (W_data .* (W_data_est .^ (obj.beta-1))) / (obj.beta-1); holger@0: end holger@0: holger@0: betaDiv = sum(betaDivMat(:)); holger@0: end holger@0: holger@0: end % methods holger@0: holger@0: holger@0: methods (Access = private) holger@0: holger@0: function obj = computeW_data_est(obj) holger@0: % computes basis functions from the current estimates for s, e and h holger@0: holger@0: if ~isempty(obj.h) holger@0: obj.W_data_est = obj.W_DB(:,obj.commonF0IdcsIdcs_DB) .* repmat(obj.h, 1, obj.numCommonShifts); holger@0: end holger@0: end holger@0: holger@0: end % methods (Access = private) holger@0: holger@0: holger@0: methods (Access = private, Static) holger@0: holger@0: function partialSpectra = noteSpec2partialSpec(noteSpectra, f0Idcs, numBinsPerSemitone) holger@0: % goes through all note spectra, extracts the partial amplitudes holger@0: % and writes them to their absolute frequency positions holger@0: holger@0: % initialize matrix for result holger@0: [numFreqs numPitches] = size(noteSpectra); holger@0: partialSpectra = zeros(numFreqs, numPitches); holger@0: holger@0: % get (ideal) relative partial positions holger@0: maxNumPartials = floor(freqIdx2PartialIdx(numFreqs, numBinsPerSemitone)); holger@0: relF0IdcsOfPartials = partialIdx2FreqIdx((1:maxNumPartials)', numBinsPerSemitone); holger@0: meansF0Idcs = geomean( [relF0IdcsOfPartials(1:end-1)'; relF0IdcsOfPartials(2:end)'] )'; holger@0: relLowerBoundOfPartials = [1; floor(meansF0Idcs)+1]; holger@0: relUpperBoundOfPartials = [floor(meansF0Idcs); numFreqs]; holger@0: holger@0: % go through all spectra holger@0: for pitchIdx = 1:numPitches holger@0: holger@0: currF0Idx = f0Idcs(pitchIdx); holger@0: currNumPartials = floor(freqIdx2PartialIdx(numFreqs-currF0Idx+1, numBinsPerSemitone)); holger@0: holger@0: % go through partials holger@0: for partialIdx = 1:currNumPartials holger@0: holger@0: % find maximum with partial range holger@0: lowerBound = currF0Idx-1 + relLowerBoundOfPartials(partialIdx); holger@0: upperBound = min(currF0Idx-1 + relUpperBoundOfPartials(partialIdx), numFreqs); holger@0: [maxAmpl maxIdx] = max(noteSpectra(lowerBound:upperBound, pitchIdx)); holger@0: holger@0: % write to result matrix holger@0: partialSpectra(lowerBound-1+maxIdx, pitchIdx) = maxAmpl; holger@0: holger@0: end holger@0: end holger@0: end % noteSpec2partialSpec holger@0: holger@0: holger@0: function W_DB = adjustPartialPositions(W_DB, W_data, f0Idcs, numBinsPerSemitone) holger@0: % adjust the positions of the partials in W_DB to those in W_data holger@0: holger@0: assert(size(W_DB,2) == size(W_data,2), '''W_DB'' and ''W_data'' must contain the same number of columns'); holger@0: assert(length(f0Idcs) == size(W_DB,2), 'Number of elements in ''f0Idcs'' must be equal to number of columns in ''W_DB'''); holger@0: holger@0: [numFreqs numPitches] = size(W_DB); holger@0: holger@0: % get (ideal) relative partial positions holger@0: maxNumPartials = floor(freqIdx2PartialIdx(numFreqs, numBinsPerSemitone)); holger@0: relF0IdcsOfPartials = partialIdx2FreqIdx((1:maxNumPartials)', numBinsPerSemitone); holger@0: meansF0Idcs = geomean( [relF0IdcsOfPartials(1:end-1)'; relF0IdcsOfPartials(2:end)'] )'; holger@0: relLowerBoundOfPartials = [-ceil(numBinsPerSemitone/2); floor(meansF0Idcs)+1]; % make 1st bound -numBinsPerSemitone/2 to allow 1st partial to deviate below ideal position holger@0: relUpperBoundOfPartials = [floor(meansF0Idcs); numFreqs]; holger@0: holger@0: % go through all spectra holger@0: for pitchIdx = 1:numPitches holger@0: holger@0: currF0Idx = f0Idcs(pitchIdx); holger@0: currNumPartials = compNumPartials(numFreqs-currF0Idx+1, numBinsPerSemitone); holger@0: holger@0: % go through partials holger@0: for partialIdx = 1:currNumPartials holger@0: holger@0: % compute bounds for partial range holger@0: lowerBound = max(currF0Idx-1 + relLowerBoundOfPartials(partialIdx), 1); holger@0: upperBound = min(currF0Idx-1 + relUpperBoundOfPartials(partialIdx), numFreqs); holger@0: holger@0: % get indices of partial in both W_DB and W_data holger@0: idx_DB = find(W_DB(lowerBound:upperBound, pitchIdx)); holger@0: idx_data = find(W_data(lowerBound:upperBound, pitchIdx)); holger@0: holger@0: % set partial amplitude in W_DB to frequency index of W_data holger@0: partAmpl = W_DB(lowerBound-1+idx_DB, pitchIdx); holger@0: W_DB(lowerBound-1+idx_DB, pitchIdx) = 0; holger@0: W_DB(lowerBound-1+idx_data, pitchIdx) = partAmpl; holger@0: end holger@0: end holger@0: end % adjustPartialPositions holger@0: end holger@0: holger@0: % methods (Static) holger@0: % holger@0: % function Xshift = shiftSpectra(X, shiftVals) holger@0: % % shifts each spectrum (column in X) down by the amount specified in holger@0: % % shiftVals holger@0: % holger@0: % assert(size(X,2) == length(shiftVals), 'number values in ''shiftVals'' must be the same as number of columns in ''X'''); holger@0: % holger@0: % [maxNumFreqs numShifts] = size(X); holger@0: % Xshift = zeros(maxNumFreqs, numShifts); holger@0: % holger@0: % for pitchIdx = 1:numShifts holger@0: % phi = shiftVals(pitchIdx); holger@0: % Xshift(1:maxNumFreqs-phi, pitchIdx) = X(phi+1:maxNumFreqs, pitchIdx); holger@0: % end holger@0: % end holger@0: % end % methods (Static) holger@0: holger@0: end