camir-ismir2012: core/tools/kldiv.m annotate

annotate core/tools/kldiv.m @ 0:cc4b1211e677 tip

initial commit to HG from Changeset: 646 (e263d8a21543) added further path and more save "camirversion.m"

author	Daniel Wolff
date	Fri, 19 Aug 2016 13:07:06 +0200

rev	line source
Daniel@0	1 function KL = kldiv(varValue,pVect1,pVect2,varargin)
Daniel@0	2 %KLDIV Kullback-Leibler or Jensen-Shannon divergence between two distributions.
Daniel@0	3 % KLDIV(X,P1,P2) returns the Kullback-Leibler divergence between two
Daniel@0	4 % distributions specified over the M variable values in vector X. P1 is a
Daniel@0	5 % length-M vector of probabilities representing distribution 1, and P2 is a
Daniel@0	6 % length-M vector of probabilities representing distribution 2. Thus, the
Daniel@0	7 % probability of value X(i) is P1(i) for distribution 1 and P2(i) for
Daniel@0	8 % distribution 2. The Kullback-Leibler divergence is given by:
Daniel@0	9 %
Daniel@0	10 % KL(P1(x),P2(x)) = sum[P1(x) * log2(P1(x)/P2(x))]
Daniel@0	11 %
Daniel@0	12 % If X contains duplicate values, there will be a warning message, and these
Daniel@0	13 % values will be treated as distinct values. (I.e., the actual values do
Daniel@0	14 % not enter into the computation, but the probabilities for the two
Daniel@0	15 % duplicate values will be considered as probabilities corresponding to
Daniel@0	16 % two unique values.) The elements of probability vectors P1 and P2 must
Daniel@0	17 % each sum to 1 +/- .00001.
Daniel@0	18 %
Daniel@0	19 % A "log of zero" warning will be thrown for zero-valued probabilities.
Daniel@0	20 % Handle this however you wish. Adding 'eps' or some other small value
Daniel@0	21 % to all probabilities seems reasonable. (Renormalize if necessary.)
Daniel@0	22 %
Daniel@0	23 % KLDIV(X,P1,P2,'sym') returns a symmetric variant of the Kullback-Leibler
Daniel@0	24 % divergence, given by [KL(P1,P2)+KL(P2,P1)]/2. See Johnson and Sinanovic
Daniel@0	25 % (2001).
Daniel@0	26 %
Daniel@0	27 % KLDIV(X,P1,P2,'js') returns the Jensen-Shannon divergence, given by
Daniel@0	28 % [KL(P1,Q)+KL(P2,Q)]/2, where Q = (P1+P2)/2. See the Wikipedia article
Daniel@0	29 % for "Kullback–Leibler divergence". This is equal to 1/2 the so-called
Daniel@0	30 % "Jeffrey divergence." See Rubner et al. (2000).
Daniel@0	31 %
Daniel@0	32 % EXAMPLE: Let the event set and probability sets be as follows:
Daniel@0	33 % X = [1 2 3 3 4]';
Daniel@0	34 % P1 = ones(5,1)/5;
Daniel@0	35 % P2 = [0 0 .5 .2 .3]' + eps;
Daniel@0	36 %
Daniel@0	37 % Note that the event set here has duplicate values (two 3's). These
Daniel@0	38 % will be treated as DISTINCT events by KLDIV. If you want these to
Daniel@0	39 % be treated as the SAME event, you will need to collapse their
Daniel@0	40 % probabilities together before running KLDIV. One way to do this
Daniel@0	41 % is to use UNIQUE to find the set of unique events, and then
Daniel@0	42 % iterate over that set, summing probabilities for each instance of
Daniel@0	43 % each unique event. Here, we just leave the duplicate values to be
Daniel@0	44 % treated independently (the default):
Daniel@0	45 % KL = kldiv(X,P1,P2);
Daniel@0	46 % KL =
Daniel@0	47 % 19.4899
Daniel@0	48 %
Daniel@0	49 % Note also that we avoided the log-of-zero warning by adding 'eps'
Daniel@0	50 % to all probability values in P2. We didn't need to renormalize
Daniel@0	51 % because we're still within the sum-to-one tolerance.
Daniel@0	52 %
Daniel@0	53 % REFERENCES:
Daniel@0	54 % 1) Cover, T.M. and J.A. Thomas. "Elements of Information Theory," Wiley,
Daniel@0	55 % 1991.
Daniel@0	56 % 2) Johnson, D.H. and S. Sinanovic. "Symmetrizing the Kullback-Leibler
Daniel@0	57 % distance." IEEE Transactions on Information Theory (Submitted).
Daniel@0	58 % 3) Rubner, Y., Tomasi, C., and Guibas, L. J., 2000. "The Earth Mover's
Daniel@0	59 % distance as a metric for image retrieval." International Journal of
Daniel@0	60 % Computer Vision, 40(2): 99-121.
Daniel@0	61 % 4) <a href="matlab:web('http://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence','-browser')">Kullback–Leibler divergence</a>. Wikipedia, The Free Encyclopedia.
Daniel@0	62 %
Daniel@0	63 % See also: MUTUALINFO, ENTROPY
Daniel@0	64
Daniel@0	65 if ~isequal(unique(varValue),sort(varValue)),
Daniel@0	66 warning('KLDIV:duplicates','X contains duplicate values. Treated as distinct values.')
Daniel@0	67 end
Daniel@0	68 if ~isequal(size(varValue),size(pVect1)) || ~isequal(size(varValue),size(pVect2)),
Daniel@0	69 error('All inputs must have same dimension.')
Daniel@0	70 end
Daniel@0	71 % Check probabilities sum to 1:
Daniel@0	72 if (abs(sum(pVect1) - 1) > .00001) || (abs(sum(pVect2) - 1) > .00001),
Daniel@0	73 error('Probabilities don''t sum to 1.')
Daniel@0	74 end
Daniel@0	75
Daniel@0	76 if ~isempty(varargin),
Daniel@0	77 switch varargin{1},
Daniel@0	78 case 'js',
Daniel@0	79 logQvect = log2((pVect2+pVect1)/2);
Daniel@0	80 KL = .5 * (sum(pVect1.*(log2(pVect1)-logQvect)) + ...
Daniel@0	81 sum(pVect2.*(log2(pVect2)-logQvect)));
Daniel@0	82
Daniel@0	83 case 'sym',
Daniel@0	84 KL1 = sum(pVect1 .* (log2(pVect1)-log2(pVect2)));
Daniel@0	85 KL2 = sum(pVect2 .* (log2(pVect2)-log2(pVect1)));
Daniel@0	86 KL = (KL1+KL2)/2;
Daniel@0	87
Daniel@0	88 otherwise
Daniel@0	89 error(['Last argument' ' "' varargin{1} '" ' 'not recognized.'])
Daniel@0	90 end
Daniel@0	91 else
Daniel@0	92 KL = sum(pVect1 .* (log2(pVect1)-log2(pVect2)));
Daniel@0	93 end
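
The help text suggests collapsing duplicate events with UNIQUE and adding 'eps' to zero probabilities before calling KLDIV. The following is a minimal usage sketch of that preparation, not part of the committed file. It assumes kldiv.m is on the MATLAB path, reuses the event set from the help-text example, and swaps the per-event loop described in the help text for ACCUMARRAY, which is a choice made here for brevity. The variable names Xu, idx, P1u, P2u, KLs and JS are hypothetical.

```matlab
% Event set from the help-text example; P2 is left with exact zeros here.
X  = [1 2 3 3 4]';
P1 = ones(5,1)/5;
P2 = [0 0 .5 .2 .3]';

% Collapse duplicate events: idx maps each entry of X to its unique value,
% and accumarray sums the probabilities that share the same index.
[Xu, ~, idx] = unique(X);
P1u = accumarray(idx, P1);
P2u = accumarray(idx, P2);

% Smooth zero probabilities with eps, then renormalize so that each
% vector sums to 1 (well inside the 1e-5 tolerance checked by kldiv).
P1u = (P1u + eps) / sum(P1u + eps);
P2u = (P2u + eps) / sum(P2u + eps);

% All three variants use log2 in the source, so results are in bits.
KL  = kldiv(Xu, P1u, P2u);          % KL(P1,P2)
KLs = kldiv(Xu, P1u, P2u, 'sym');   % [KL(P1,P2)+KL(P2,P1)]/2
JS  = kldiv(Xu, P1u, P2u, 'js');    % Jensen-Shannon divergence
```

Because Xu contains no repeated values, the 'KLDIV:duplicates' warning should not fire in this sketch. The Jensen-Shannon value computed with base-2 logarithms lies between 0 and 1, which makes it easier to compare across distribution pairs than the unbounded KL value.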
