core/tools/kldiv.m @ 0:e9a9cd732c1e (tip)

first hg version after svn

author:   wolffd
date:     Tue, 10 Feb 2015 15:05:51 +0000
parents:  (none)
children: (none)
function KL = kldiv(varValue,pVect1,pVect2,varargin)
%KLDIV Kullback-Leibler or Jensen-Shannon divergence between two distributions.
% KLDIV(X,P1,P2) returns the Kullback-Leibler divergence between two
% distributions specified over the M variable values in vector X. P1 is a
% length-M vector of probabilities representing distribution 1, and P2 is a
% length-M vector of probabilities representing distribution 2. Thus, the
% probability of value X(i) is P1(i) for distribution 1 and P2(i) for
% distribution 2. The Kullback-Leibler divergence is given by:
%
%    KL(P1(x),P2(x)) = sum[P1(x).log(P1(x)/P2(x))]
%
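% In MATLAB terms (and as implemented below), the sum uses base-2
% logarithms, so the divergence is reported in bits. The default
% computation is equivalent to:
%
%    KL = sum(P1 .* (log2(P1) - log2(P2)));
%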
% If X contains duplicate values, a warning message will be issued, and
% these values will be treated as distinct values. (I.e., the actual
% values do not enter into the computation, but the probabilities for the
% two duplicate values will be considered as probabilities corresponding
% to two unique values.) The elements of probability vectors P1 and P2
% must each sum to 1 +/- .00001.
%
% A "log of zero" warning will be thrown for zero-valued probabilities.
% Handle this however you wish. Adding 'eps' or some other small value
% to all probabilities seems reasonable. (Renormalize if necessary.)
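%
% A minimal smoothing sketch along those lines (P stands for any
% probability vector with zero entries):
%
%    P = P + eps;       % avoid log(0)
%    P = P ./ sum(P);   % renormalize so P sums to 1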
%
% KLDIV(X,P1,P2,'sym') returns a symmetric variant of the Kullback-Leibler
% divergence, given by [KL(P1,P2)+KL(P2,P1)]/2. See Johnson and Sinanovic
% (2001).
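%
% By that definition, the 'sym' result can equivalently be obtained from
% two plain calls (KLsym is an illustrative name):
%
%    KLsym = (kldiv(X,P1,P2) + kldiv(X,P2,P1)) / 2;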
%
% KLDIV(X,P1,P2,'js') returns the Jensen-Shannon divergence, given by
% [KL(P1,Q)+KL(P2,Q)]/2, where Q = (P1+P2)/2. See the Wikipedia article
% for "Kullback–Leibler divergence". This is equal to 1/2 the so-called
% "Jeffrey divergence." See Rubner et al. (2000).
%
% EXAMPLE: Let the event set and probability sets be as follows:
%    X = [1 2 3 3 4]';
%    P1 = ones(5,1)/5;
%    P2 = [0 0 .5 .2 .3]' + eps;
%
% Note that the event set here has duplicate values (two 3's). These
% will be treated as DISTINCT events by KLDIV. If you want these to
% be treated as the SAME event, you will need to collapse their
% probabilities together before running KLDIV. One way to do this
% is to use UNIQUE to find the set of unique events, and then
% iterate over that set, summing probabilities for each instance of
% each unique event (a sketch of this collapsing step follows the
% example below). Here, we just leave the duplicate values to be
% treated independently (the default):
%    KL = kldiv(X,P1,P2);
%    KL =
%        19.4899
%
% Note also that we avoided the log-of-zero warning by adding 'eps'
% to all probability values in P2. We didn't need to renormalize
% because we're still within the sum-to-one tolerance.
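%
% A sketch of the collapsing step described above, using ACCUMARRAY in
% place of an explicit loop (assumes column vectors as in this example;
% uX, uP1, uP2, idx, and KLu are illustrative names):
%
%    [uX,~,idx] = unique(X);     % unique events and index map into X
%    uP1 = accumarray(idx,P1);   % sum probabilities per unique event
%    uP2 = accumarray(idx,P2);
%    KLu = kldiv(uX,uP1,uP2);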
%
% REFERENCES:
% 1) Cover, T.M. and Thomas, J.A. "Elements of Information Theory," Wiley,
%    1991.
% 2) Johnson, D.H. and Sinanovic, S. "Symmetrizing the Kullback-Leibler
%    distance." IEEE Transactions on Information Theory (Submitted).
% 3) Rubner, Y., Tomasi, C., and Guibas, L.J., 2000. "The Earth Mover's
%    distance as a metric for image retrieval." International Journal of
%    Computer Vision, 40(2): 99-121.
% 4) <a href="matlab:web('http://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence','-browser')">Kullback–Leibler divergence</a>. Wikipedia, The Free Encyclopedia.
%
% See also: MUTUALINFO, ENTROPY

% Warn if X contains duplicates; they are treated as distinct events:
if ~isequal(unique(varValue),sort(varValue))
    warning('KLDIV:duplicates','X contains duplicate values. Treated as distinct values.')
end
% All three inputs must have matching dimensions:
if ~isequal(size(varValue),size(pVect1)) || ~isequal(size(varValue),size(pVect2))
    error('All inputs must have the same dimension.')
end
% Check that probabilities sum to 1 (within the .00001 tolerance):
if (abs(sum(pVect1) - 1) > .00001) || (abs(sum(pVect2) - 1) > .00001)
    error('Probabilities don''t sum to 1.')
end

if ~isempty(varargin)
    switch varargin{1}
        case 'js'
            % Jensen-Shannon divergence: mean KL to the midpoint
            % distribution Q = (P1+P2)/2.
            logQvect = log2((pVect2+pVect1)/2);
            KL = .5 * (sum(pVect1.*(log2(pVect1)-logQvect)) + ...
                sum(pVect2.*(log2(pVect2)-logQvect)));

        case 'sym'
            % Symmetric variant: mean of the two directed divergences.
            KL1 = sum(pVect1 .* (log2(pVect1)-log2(pVect2)));
            KL2 = sum(pVect2 .* (log2(pVect2)-log2(pVect1)));
            KL = (KL1+KL2)/2;

        otherwise
            error(['Last argument "' varargin{1} '" not recognized.'])
    end
else
    % Default: plain (asymmetric) Kullback-Leibler divergence in bits.
    KL = sum(pVect1 .* (log2(pVect1)-log2(pVect2)));
end