Metric Learning to Rank (mlr-1.0)
http://www-cse.ucsd.edu/~bmcfee/code/mlr/

Brian McFee, 2010.

This code is distributed under the GNU GPL license.  See LICENSE for details,
or http://www.gnu.org/licenses/gpl-3.0.txt .



INTRODUCTION
------------
This package contains the MATLAB code for Metric Learning to Rank (MLR).
The latest version of this software can be found at the URL above.

The software included here implements the algorithm described in

    [1] McFee, Brian and Lanckriet, G.R.G.  Metric learning to rank.
        Proceedings of the 27th annual International Conference
        on Machine Learning (ICML), 2010.

Please cite this paper if you use this code.



INSTALLATION
------------

1. Requirements

This software requires MATLAB R2007a or later.  Because it makes extensive use
of the "bsxfun" function, earlier versions of MATLAB will not work.

If you have an older MATLAB installation, you can install an alternative bsxfun
implementation from

    http://www.mathworks.com/matlabcentral/fileexchange/23005

However, this may not work, and it will certainly be slower than the native
bsxfun implementation.


2. Compiling MEX functions

MLR includes two auxiliary functions, "cummax" and "binarysearch", to
accelerate certain steps of the optimization procedure.  The source code for
these functions is located in the "util" subdirectory.

A makefile is provided, so that (on Unix systems) you can simply type

    cd util
    make
    cd ..
3. Running the demo

To test the installation, run

    mlr_demo

from within MATLAB.  The demo generates a random training/test split of the
Wine data set

    http://archive.ics.uci.edu/ml/datasets/Wine

and learns a metric with MLR.  Both the native and learned metrics are
displayed in a figure with a scatter plot.


TRAINING
--------

There are several modes of operation for training metrics with MLR.  In the
simplest mode, training data is contained in a matrix X (each column is a
training vector), and Y contains the labels/relevance for the training data
(see below).

    [W, Xi, D] = mlr_train(X, Y, C, ...)

        X = d-by-n data matrix
        Y = either an n-by-1 vector of labels,
            OR
            an n-by-2 cell array where
                Y{q,1} contains relevant indices for q, and
                Y{q,2} contains irrelevant indices for q

        C >= 0 slack trade-off parameter (default = 1)

        W  = the learned metric, i.e., the inner product matrix of the
             learned space can be computed by X' * W * X
        Xi = slack value on the learned metric (see [1])
        D  = diagnostics

By default, MLR optimizes for Area Under the ROC Curve (AUC).
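As a sketch of the basic usage above (the data here is randomly generated for
illustration, and is not part of the package):

    % Illustrative placeholders: 200 points in 10 dimensions, 3 classes
    X = randn(10, 200);     % d-by-n data matrix
    Y = randi(3, 200, 1);   % n-by-1 label vector
    C = 1;                  % slack trade-off parameter

    % Train a metric; optimizes AUC by default
    [W, Xi, D] = mlr_train(X, Y, C);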
This can be changed by setting the "LOSS" parameter to one of several ranking
loss measures:

    [W, Xi, D] = mlr_train(X, Y, C, LOSS)

        where LOSS is one of:
            'AUC':    Area under the ROC curve (default)
            'KNN':    KNN accuracy*
            'Prec@k': Precision-at-k
            'MAP':    Mean Average Precision
            'MRR':    Mean Reciprocal Rank
            'NDCG':   Normalized Discounted Cumulative Gain

    *Note: KNN is correct only for binary classification problems; in
    practice, Prec@k is usually a better alternative.


For KNN/Prec@k/NDCG, a threshold k may be set to determine the truncation of
the ranked list.  This can be done by setting the k parameter:

    [W, Xi, D] = mlr_train(X, Y, C, LOSS, k)

        where k is the number of neighbors for Prec@k or NDCG
        (default = 3)


By default, MLR regularizes the metric W by the trace, i.e., the 1-norm of the
eigenvalues.  This can be changed to one of several alternatives:

    [W, Xi, D] = mlr_train(X, Y, C, LOSS, k, REG)

        where REG defines the regularization on W, and is one of:
            0: no regularization
            1: 1-norm: trace(W)         (default)
            2: 2-norm: trace(W' * W)
            3: Kernel: trace(W * X),    assumes X is square
                                        and positive-definite

The last setting, "Kernel", is appropriate for regularizing metrics learned
from kernel matrices.


W corresponds to a linear projection matrix (rotation and scaling).
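For instance, a learned W can be used either directly, to compute distances
under the learned metric, or as an explicit projection of the data (a sketch;
the variable names x1, x2, L, and Xproj are illustrative, not part of the
package):

    % Distance between two column vectors x1, x2 under the learned metric W
    d_learned = (x1 - x2)' * W * (x1 - x2);

    % Equivalently, since W is positive semi-definite, factor W = L' * L
    % and project the data; Euclidean distances in Xproj then match d_learned
    [V, S] = eig(W);
    L      = sqrt(max(S, 0)) * V';   % clip tiny negative eigenvalues
    Xproj  = L * X;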
To learn a restricted model which may only scale, but not rotate, the data, W
can be constrained to diagonal matrices by setting the "Diagonal" parameter
to 1:

    [W, Xi, D] = mlr_train(X, Y, C, LOSS, k, REG, Diagonal)

        Diagonal = 0: learn a full d-by-d W (default)
        Diagonal = 1: learn a diagonally-constrained W (d-by-1)

    Note: the W returned in this case will be the d-by-1 vector corresponding
    to the main diagonal of a full metric, not the full d-by-d matrix.



Finally, we provide a stochastic gradient descent implementation to handle
large-scale problems.  Rather than estimating gradients from the entire
training set, this variant uses a random subset of size B (see below) at each
call to the cutting-plane subroutine.  This results in faster, but less
accurate, optimization:

    [W, Xi, D] = mlr_train(X, Y, C, LOSS, k, REG, Diagonal, B)

        where B > 0 enables stochastic optimization with batch size B



TESTING
-------

Once a metric has been trained by "mlr_train", you can evaluate performance
across all measures by using the "mlr_test" function:

    Perf = mlr_test(W, test_k, Xtrain, Ytrain, Xtest, Ytest)

        W      = d-by-d positive semi-definite matrix
        test_k = vector of k-values to use for KNN/Prec@k/NDCG
        Xtrain = d-by-n matrix of training data
        Ytrain = n-by-1 vector of training labels,
                 OR
                 n-by-2 cell array where
                     Y{q,1} contains relevant indices (in 1..n) for point q
                     Y{q,2} contains irrelevant indices (in 1..n) for point q
        Xtest  = d-by-m matrix of testing data
        Ytest  = m-by-1 vector of testing labels, or m-by-2 cell array.
                 If using the cell version, indices correspond to the
                 training set, and must lie in the range (1..n).


The output structure Perf contains the mean score for
    AUC, KNN, Prec@k, MAP, MRR, NDCG,
as well as the effective dimensionality of W, and the best-performing k-value
for KNN, Prec@k, and NDCG.

For information retrieval settings, consider Xtrain/Ytrain as the corpus and
Xtest/Ytest as the queries.

By testing with

    Wnative = eye(size(X,1))

you can quickly evaluate the performance of the native (Euclidean) metric.


FEEDBACK
--------

Please send any bug reports, source code contributions, etc. to

Brian McFee