function KL = kldiv(varValue,pVect1,pVect2,varargin)
%KLDIV Kullback-Leibler or Jensen-Shannon divergence between two distributions.
%  KLDIV(X,P1,P2) returns the Kullback-Leibler divergence between two
%  distributions specified over the M variable values in vector X. P1 is a
%  length-M vector of probabilities representing distribution 1, and P2 is a
%  length-M vector of probabilities representing distribution 2. Thus, the
%  probability of value X(i) is P1(i) for distribution 1 and P2(i) for
%  distribution 2. The Kullback-Leibler divergence is given by:
%
%    KL(P1(x),P2(x)) = sum[P1(x).*log2(P1(x)./P2(x))]
%
%  Logarithms are base 2, so the result is in bits.
%
%  If X contains duplicate values, a warning message is issued, and these
%  values are treated as distinct values. (That is, the actual values do
%  not enter into the computation, but the probabilities for the two
%  duplicate values are considered as probabilities corresponding to
%  two unique values.) The elements of probability vectors P1 and P2 must
%  each sum to 1 +/- .00001.
%
%  A "log of zero" warning will be thrown for zero-valued probabilities.
%  Handle this however you wish. Adding 'eps' or some other small value
%  to all probabilities seems reasonable. (Renormalize if necessary.)
%
%  KLDIV(X,P1,P2,'sym') returns a symmetric variant of the Kullback-Leibler
%  divergence, given by [KL(P1,P2)+KL(P2,P1)]/2. See Johnson and Sinanovic
%  (2001).
%
%  KLDIV(X,P1,P2,'js') returns the Jensen-Shannon divergence, given by
%  [KL(P1,Q)+KL(P2,Q)]/2, where Q = (P1+P2)/2. See the Wikipedia article
%  for "Kullback–Leibler divergence". This is equal to 1/2 the so-called
%  "Jeffrey divergence." See Rubner et al. (2000).
%
%  EXAMPLE: Let the event set and probability vectors be as follows:
%    X  = [1 2 3 3 4]';
%    P1 = ones(5,1)/5;
%    P2 = [0 0 .5 .2 .3]' + eps;
%
%  Note that the event set here has duplicate values (two 3's). These
%  will be treated as DISTINCT events by KLDIV. If you want them to be
%  treated as the SAME event, you will need to collapse their
%  probabilities together before running KLDIV. One way to do this is to
%  use UNIQUE to find the set of unique events and then sum the
%  probabilities for each instance of each unique event (see the sketch
%  after this example). Here, we just leave the duplicate values to be
%  treated independently (the default):
%    KL = kldiv(X,P1,P2);
%    KL =
%       19.4899
%
%  Note also that we avoided the log-of-zero warning by adding 'eps'
%  to all probability values in P2. We didn't need to renormalize
%  because we're still within the sum-to-one tolerance.
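%
%  For instance, one possible sketch of the collapsing step, using UNIQUE
%  and ACCUMARRAY (the variable names uX, uP1, uP2 are illustrative):
%    [uX,i,j] = unique(X);     % j maps each element of X to a unique event
%    uP1 = accumarray(j,P1);   % sum the probabilities that share an event
%    uP2 = accumarray(j,P2);
%    KL  = kldiv(uX,uP1,uP2);  % the two 3's now count as a single event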
%
%  REFERENCES:
%  1) Cover, T.M. and J.A. Thomas. "Elements of Information Theory," Wiley,
%     1991.
%  2) Johnson, D.H. and S. Sinanovic. "Symmetrizing the Kullback-Leibler
%     distance." IEEE Transactions on Information Theory (Submitted).
%  3) Rubner, Y., Tomasi, C., and Guibas, L.J., 2000. "The Earth Mover's
%     distance as a metric for image retrieval." International Journal of
%     Computer Vision, 40(2): 99-121.
%  4) Kullback–Leibler divergence. Wikipedia, The Free Encyclopedia.
%
%  See also: MUTUALINFO, ENTROPY

% Warn about duplicate events: they are treated as distinct values.
if ~isequal(unique(varValue),sort(varValue))
    warning('KLDIV:duplicates','X contains duplicate values. Treated as distinct values.')
end
if ~isequal(size(varValue),size(pVect1)) || ~isequal(size(varValue),size(pVect2))
    error('All inputs must have the same dimension.')
end
% Check that the probabilities sum to 1 (within tolerance):
if (abs(sum(pVect1) - 1) > .00001) || (abs(sum(pVect2) - 1) > .00001)
    error('Probabilities don''t sum to 1.')
end

if ~isempty(varargin)
    switch varargin{1}
        case 'js'
            % Jensen-Shannon: average KL to the midpoint distribution Q.
            logQvect = log2((pVect2+pVect1)/2);
            KL = .5 * (sum(pVect1.*(log2(pVect1)-logQvect)) + ...
                sum(pVect2.*(log2(pVect2)-logQvect)));

        case 'sym'
            % Symmetrized KL: average of the two directed divergences.
            KL1 = sum(pVect1 .* (log2(pVect1)-log2(pVect2)));
            KL2 = sum(pVect2 .* (log2(pVect2)-log2(pVect1)));
            KL = (KL1+KL2)/2;

        otherwise
            error(['Last argument "' varargin{1} '" not recognized.'])
    end
else
    % Standard (directed) Kullback-Leibler divergence, in bits.
    KL = sum(pVect1 .* (log2(pVect1)-log2(pVect2)));
end
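
% A brief usage sketch for the two variants, reusing the X, P1, P2 defined
% in the help example above (KLsym and KLjs are illustrative names):
%   KLsym = kldiv(X,P1,P2,'sym');  % [KL(P1,P2)+KL(P2,P1)]/2
%   KLjs  = kldiv(X,P1,P2,'js');   % [KL(P1,Q)+KL(P2,Q)]/2, Q = (P1+P2)/2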