wolffd@0: function [aCoeff,resid,pitch,G,parcor,stream] = proclpc(data,sr,L,fr,fs,preemp) wolffd@0: % USAGE: [aCoeff,resid,pitch,G,parcor,stream] = proclpc(data,sr,L,fr,fs,preemp) wolffd@0: % wolffd@0: % This function computes the LPC (linear-predictive coding) coefficients that wolffd@0: % describe a speech signal. The LPC coefficients are a short-time measure of wolffd@0: % the speech signal which describe the signal as the output of an all-pole wolffd@0: % filter. This all-pole filter provides a good description of the speech wolffd@0: % articulators; thus LPC analysis is often used in speech recognition and wolffd@0: % speech coding systems. The LPC parameters are recalculated, by default in wolffd@0: % this implementation, every 20ms. wolffd@0: % wolffd@0: % The results of LPC analysis are a new representation of the signal wolffd@0: % s(n) = G e(n) - sum from 1 to L a(i)s(n-i) wolffd@0: % where s(n) is the original data. a(i) and e(n) are the outputs of the LPC wolffd@0: % analysis with a(i) representing the LPC model. The e(n) term represents wolffd@0: % either the speech source's excitation, or the residual: the details of the wolffd@0: % signal that are not captured by the LPC coefficients. The G factor is a wolffd@0: % gain term. wolffd@0: % wolffd@0: % LPC analysis is performed on a monaural sound vector (data) which has been wolffd@0: % sampled at a sampling rate of "sr". The following optional parameters modify wolffd@0: % the behaviour of this algorithm. wolffd@0: % L - The order of the analysis. There are L+1 LPC coefficients in the output wolffd@0: % array aCoeff for each frame of data. L defaults to 13. wolffd@0: % fr - Frame time increment, in ms. The LPC analysis is done starting every wolffd@0: % fr ms in time. Defaults to 20ms (50 LPC vectors a second) wolffd@0: % fs - Frame size in ms. The LPC analysis is done by windowing the speech wolffd@0: % data with a rectangular window that is fs ms long. Defaults to 30ms wolffd@0: % preemp - This variable is the epsilon in a digital one-zero filter which wolffd@0: % serves to preemphasize the speech signal and compensate for the 6dB wolffd@0: % per octave rolloff in the radiation function. Defaults to .9378. wolffd@0: % wolffd@0: % The output variables from this function are wolffd@0: % aCoeff - The LPC analysis results, a(i). One column of L numbers for each wolffd@0: % frame of data wolffd@0: % resid - The LPC residual, e(n). One column of sr*fs samples representing wolffd@0: % the excitation or residual of the LPC filter. wolffd@0: % pitch - A frame-by-frame estimate of the pitch of the signal, calculated wolffd@0: % by finding the peak in the residual's autocorrelation for each frame. wolffd@0: % G - The LPC gain for each frame. wolffd@0: % parcor - The parcor coefficients. The parcor coefficients give the ratio wolffd@0: % between adjacent sections in a tubular model of the speech wolffd@0: % articulators. There are L parcor coefficients for each frame of wolffd@0: % speech. wolffd@0: % stream - The LPC analysis' residual or excitation signal as one long vector. wolffd@0: % Overlapping frames of the resid output combined into a new one- wolffd@0: % dimensional signal and post-filtered. wolffd@0: % wolffd@0: % The synlpc routine inverts this transform and returns the original speech wolffd@0: % signal. wolffd@0: % wolffd@0: % This code was graciously provided by: wolffd@0: % Delores Etter (University of Colorado, Boulder) and wolffd@0: % Professor Geoffrey Orsak (Southern Methodist University) wolffd@0: % It was first published in wolffd@0: % Orsak, G.C. et al. "Collaborative SP education using the Internet and wolffd@0: % MATLAB" IEEE SIGNAL PROCESSING MAGAZINE Nov. 1995. vol.12, no.6, pp. wolffd@0: % 23-32. wolffd@0: % Modified and debugging plots added by Kate Nguyen and Malcolm Slaney wolffd@0: wolffd@0: % A more complete set of routines for LPC analysis can be found at wolffd@0: % http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html wolffd@0: wolffd@0: % (c) 1998 Interval Research Corporation wolffd@0: wolffd@0: wolffd@0: if (nargin<3), L = 13; end wolffd@0: if (nargin<4), fr = 20; end wolffd@0: if (nargin<5), fs = 30; end wolffd@0: if (nargin<6), preemp = .9378; end wolffd@0: wolffd@0: [row col] = size(data); wolffd@0: if col==1 data=data'; end wolffd@0: wolffd@0: nframe = 0; wolffd@0: msfr = round(sr/1000*fr); % Convert ms to samples wolffd@0: msfs = round(sr/1000*fs); % Convert ms to samples wolffd@0: duration = length(data); wolffd@0: speech = filter([1 -preemp], 1, data)'; % Preemphasize speech wolffd@0: msoverlap = msfs - msfr; wolffd@0: ramp = [0:1/(msoverlap-1):1]'; % Compute part of window wolffd@0: wolffd@0: wolffd@0: for frameIndex=1:msfr:duration-msfs+1 % frame rate=20ms wolffd@0: frameData = speech(frameIndex:(frameIndex+msfs-1)); % frame size=30ms wolffd@0: nframe = nframe+1; wolffd@0: autoCor = xcorr(frameData); % Compute the cross correlation wolffd@0: autoCorVec = autoCor(msfs+[0:L]); wolffd@0: wolffd@0: % Levinson's method wolffd@0: err(1) = autoCorVec(1); wolffd@0: k(1) = 0; wolffd@0: A = []; wolffd@0: for index=1:L wolffd@0: numerator = [1 A.']*autoCorVec(index+1:-1:2); wolffd@0: denominator = -1*err(index); wolffd@0: k(index) = numerator/denominator; % PARCOR coeffs wolffd@0: A = [A+k(index)*flipud(A); k(index)]; wolffd@0: err(index+1) = (1-k(index)^2)*err(index); wolffd@0: end wolffd@0: wolffd@0: aCoeff(:,nframe) = [1; A]; wolffd@0: parcor(:,nframe) = k'; wolffd@0: wolffd@0: % Calculate the filter wolffd@0: % response wolffd@0: % by evaluating the wolffd@0: % z-transform wolffd@0: if 0 wolffd@0: gain=0; wolffd@0: cft=0:(1/255):1; wolffd@0: for index=1:L wolffd@0: gain = gain + aCoeff(index,nframe)*exp(-i*2*pi*cft).^index; wolffd@0: end wolffd@0: gain = abs(1./gain); wolffd@0: spec(:,nframe) = 20*log10(gain(1:128))'; wolffd@0: plot(20*log10(gain)); wolffd@0: title(nframe); wolffd@0: drawnow; wolffd@0: end wolffd@0: wolffd@0: % Calculate the filter response wolffd@0: % from the filter's impulse wolffd@0: % response (to check above). wolffd@0: if 0 wolffd@0: impulseResponse = filter(1, aCoeff(:,nframe), [1 zeros(1,255)]); wolffd@0: freqResp = 20*log10(abs(fft(impulseResponse))); wolffd@0: plot(freqResp); wolffd@0: end wolffd@0: wolffd@0: errSig = filter([1 A'],1,frameData); % find excitation noise wolffd@0: wolffd@0: G(nframe) = sqrt(err(L+1)); % gain wolffd@0: autoCorErr = xcorr(errSig); % calculate pitch & voicing wolffd@0: % information wolffd@0: [B,I] = sort(autoCorErr); wolffd@0: num = length(I); wolffd@0: if B(num-1) > .3*B(num) wolffd@0: pitch(nframe) = abs(I(num) - I(num-1)); wolffd@0: else wolffd@0: pitch(nframe) = 0; wolffd@0: end wolffd@0: wolffd@0: resid(:,nframe) = errSig/G(nframe); wolffd@0: if(frameIndex==1) % add residual frames using a wolffd@0: stream = resid(1:msfr,nframe); % trapezoidal window wolffd@0: else wolffd@0: stream = [stream; wolffd@0: overlap+resid(1:msoverlap,nframe).*ramp; wolffd@0: resid(msoverlap+1:msfr,nframe)]; wolffd@0: end wolffd@0: if(frameIndex+msfr+msfs-1 > duration) wolffd@0: stream = [stream; resid(msfr+1:msfs,nframe)]; wolffd@0: else wolffd@0: overlap = resid(msfr+1:msfs,nframe).*flipud(ramp); wolffd@0: end wolffd@0: end wolffd@0: stream = filter(1, [1 -preemp], stream)';