Metric Learning to Rank (mlr-1.0)
http://www-cse.ucsd.edu/~bmcfee/code/mlr/

Brian McFee <bmcfee@cs.ucsd.edu>, 2010.

This code is distributed under the GNU GPL license. See LICENSE for details,
or http://www.gnu.org/licenses/gpl-3.0.txt .


INTRODUCTION
------------
This package contains the MATLAB code for Metric Learning to Rank (MLR).
The latest version of this software can be found at the URL above.

The software included here implements the algorithm described in

[1] McFee, Brian and Lanckriet, G.R.G. Metric learning to rank.
    Proceedings of the 27th Annual International Conference
    on Machine Learning (ICML), 2010.

Please cite this paper if you use this code.


INSTALLATION
------------

1. Requirements

This software requires MATLAB R2007a or later. Because it makes extensive use
of the "bsxfun" function, earlier versions of MATLAB will not work.

If you have an older MATLAB installation, you can install an alternative bsxfun
implementation from

    http://www.mathworks.com/matlabcentral/fileexchange/23005

However, this may not work, and it will certainly be slower than the native
bsxfun implementation.


2. Compiling MEX functions

MLR includes two auxiliary functions "cummax" and "binarysearch" to accelerate
certain steps of the optimization procedure. The source code for these
functions is located in the "util" subdirectory.

A makefile is provided, so that (on Unix systems) you can simply type

    cd util
    make
    cd ..

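If make is not available (e.g., on Windows), the MEX functions can also be
built from within MATLAB. The following is only a sketch: the source file
names cummax.c and binarysearch.c are assumptions, so check the "util"
directory for the actual file names before running it.

    % Sketch: build the MEX helpers from inside MATLAB.
    % File names below are assumptions; adjust to match the sources in util/.
    cd util
    mex cummax.c
    mex binarysearch.c
    cd ..
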

3. Running the demo

To test the installation, run

    mlr_demo

from within MATLAB. The demo generates a random training/test split of the
Wine data set

    http://archive.ics.uci.edu/ml/datasets/Wine

and learns a metric with MLR. Both the native and the learned metrics are
displayed in a figure with a scatter plot.


TRAINING
--------

There are several modes of operation for training metrics with MLR. In the
simplest mode, training data is contained in a matrix X (each column is a
training vector), and Y contains the labels/relevance for the training data
(see below).

    [W, Xi, D] = mlr_train(X, Y, C, ...)

        X = d-by-n data matrix
        Y = either an n-by-1 vector of labels
            OR
            an n-by-2 cell array where
                Y{q,1} contains relevant indices for q, and
                Y{q,2} contains irrelevant indices for q

        C >= 0  slack trade-off parameter (default=1)

        W  = the learned metric, i.e., the inner product matrix of the learned
             space can be computed by X' * W * X
        Xi = slack value on the learned metric (see [1])
        D  = diagnostics

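As a concrete illustration, the following sketch trains a metric with the
default settings on randomly generated data (the data and labels here are
hypothetical, not part of the package):

    % Hypothetical data: 100 points in 10 dimensions, 3 classes
    X = randn(10, 100);
    Y = ceil(3 * rand(100, 1));

    % Learn a metric with slack trade-off C = 1
    [W, Xi, D] = mlr_train(X, Y, 1);

    % Inner products (and hence distances) in the learned space
    K = X' * W * X;
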

By default, MLR optimizes for Area Under the ROC Curve (AUC). This can be
changed by setting the "LOSS" parameter to one of several ranking loss
measures:

    [W, Xi, D] = mlr_train(X, Y, C, LOSS)

        where LOSS is one of:
            'AUC':    Area under ROC curve (default)
            'KNN':    KNN accuracy*
            'Prec@k': Precision-at-k
            'MAP':    Mean Average Precision
            'MRR':    Mean Reciprocal Rank
            'NDCG':   Normalized Discounted Cumulative Gain

    *Note: KNN is correct only for binary classification problems; in
    practice, Prec@k is usually a better alternative.


For KNN/Prec@k/NDCG, a threshold k may be set to determine the truncation of
the ranked list. This can be done by setting the k parameter:

    [W, Xi, D] = mlr_train(X, Y, C, LOSS, k)

        where k is the number of neighbors for Prec@k or NDCG
        (default=3)

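For example, to optimize Precision-at-k with a truncation of k = 5 (a sketch,
reusing the hypothetical X and Y from above):

    [W, Xi, D] = mlr_train(X, Y, 1, 'Prec@k', 5);
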

By default, MLR regularizes the metric W by the trace, i.e., the 1-norm of the
eigenvalues. This can be changed to one of several alternatives:

    [W, Xi, D] = mlr_train(X, Y, C, LOSS, k, REG)

        where REG defines the regularization on W, and is one of:
            0: no regularization
            1: 1-norm: trace(W)        (default)
            2: 2-norm: trace(W' * W)
            3: Kernel: trace(W * X),   assumes X is square
                                       and positive-definite

The last setting, "kernel", is appropriate for regularizing metrics learned
from kernel matrices.

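A sketch of the kernelized setting: K below is a hypothetical n-by-n
positive-definite kernel matrix over the training points (here a Gaussian
kernel), passed to mlr_train in place of X:

    % Hypothetical Gaussian kernel over the columns of X (bandwidth 1)
    sq = sum(X.^2, 1);
    K  = exp(-(bsxfun(@plus, sq', sq) - 2 * (X' * X)));

    % Train with the default AUC loss, k = 3, and kernel regularization (REG = 3)
    [W, Xi, D] = mlr_train(K, Y, 1, 'AUC', 3, 3);
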

W corresponds to a linear projection matrix (rotation and scaling). To learn a
restricted model which may only scale, but not rotate, the data, W can be
constrained to diagonal matrices by setting the "Diagonal" parameter to 1:

    [W, Xi, D] = mlr_train(X, Y, C, LOSS, k, REG, Diagonal)

        Diagonal = 0:   learn a full d-by-d W            (default)
        Diagonal = 1:   learn a diagonally-constrained W (d-by-1)

    Note: the W returned in this case will be the d-by-1 vector corresponding
    to the main diagonal of a full metric, not the full d-by-d matrix.

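For instance (continuing the hypothetical example), a diagonal metric can be
trained and, if a full matrix form is needed, expanded with diag():

    % Learn a diagonally-constrained metric; w is d-by-1
    w = mlr_train(X, Y, 1, 'AUC', 3, 1, 1);

    % Expand to a full d-by-d matrix, e.g. for computing X' * W * X
    W = diag(w);
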


Finally, we provide a stochastic gradient descent implementation to handle
large-scale problems. Rather than estimating gradients from the entire
training set, this variant uses a random subset of size B (see below) at each
call to the cutting plane subroutine. This results in faster, but less
accurate, optimization:

    [W, Xi, D] = mlr_train(X, Y, C, LOSS, k, REG, Diagonal, B)

        where B > 0 enables stochastic optimization with batch size B

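For example, a sketch with a batch size of B = 50 and all other parameters at
their default values:

    [W, Xi, D] = mlr_train(X, Y, 1, 'AUC', 3, 1, 0, 50);
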


TESTING
-------

Once a metric has been trained by "mlr_train", you can evaluate performance
across all measures by using the "mlr_test" function:

    Perf = mlr_test(W, test_k, Xtrain, Ytrain, Xtest, Ytest)

        W      = d-by-d positive semi-definite matrix
        test_k = vector of k-values to use for KNN/Prec@k/NDCG
        Xtrain = d-by-n matrix of training data
        Ytrain = n-by-1 vector of training labels
                 OR
                 n-by-2 cell array where
                     Ytrain{q,1} contains relevant indices (in 1..n) for point q
                     Ytrain{q,2} contains irrelevant indices (in 1..n) for point q
        Xtest  = d-by-m matrix of testing data
        Ytest  = m-by-1 vector of test labels, or m-by-2 cell array
                 If using the cell version, indices correspond to
                 the training set, and must lie in the range (1..n)


The output structure Perf contains the mean score for:
    AUC, KNN, Prec@k, MAP, MRR, NDCG,
as well as the effective dimensionality of W, and
the best-performing k-value for KNN, Prec@k, and NDCG.

For information retrieval settings, consider Xtrain/Ytrain as the corpus
and Xtest/Ytest as the queries.

By testing with

    Wnative = eye(size(Xtrain, 1))

you can quickly evaluate the performance of the native (Euclidean) metric.

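For example, the following sketch (with hypothetical train/test splits)
compares a learned metric W against the native Euclidean metric:

    % Hypothetical split: Xtrain/Ytrain is the corpus, Xtest/Ytest are the queries
    test_k = [1 3 5];

    Perf_learned = mlr_test(W, test_k, Xtrain, Ytrain, Xtest, Ytest);

    Wnative     = eye(size(Xtrain, 1));
    Perf_native = mlr_test(Wnative, test_k, Xtrain, Ytrain, Xtest, Ytest);
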

FEEDBACK
--------

Please send any bug reports, source code contributions, etc. to

    Brian McFee <bmcfee@cs.ucsd.edu>