changeset 30:eba57dbe56f3

Updates
author Paulo Chiliguano <p.e.chiilguano@se14.qmul.ac.uk>
date Wed, 02 Sep 2015 20:56:37 +0100
parents b1c54790ed97
children ca0615744a53
files Report/abstract/abstract.tex Report/chapter2/background.tex Report/chapter3/ch3.tex Report/chapter4/evaluation.tex Report/chapter5/results.tex Report/chapter6/conclusions.tex Report/chiliguano_msc_finalproject.blg Report/chiliguano_msc_finalproject.lof Report/chiliguano_msc_finalproject.pdf Report/chiliguano_msc_finalproject.synctex.gz Report/chiliguano_msc_finalproject.toc Report/references.bib slides/chiliguano_msc_project_slides.tex
diffstat 13 files changed, 526 insertions(+), 187 deletions(-) [+]
line wrap: on
line diff
--- a/Report/abstract/abstract.tex	Tue Sep 01 11:29:38 2015 +0100
+++ b/Report/abstract/abstract.tex	Wed Sep 02 20:56:37 2015 +0100
@@ -1,5 +1,12 @@
 \begin{abstract}
 
-This is an abstract.
+%Background
+There is a vast range of Internet resources available today, including songs, albums, playlists and podcasts, that a user cannot discover without a tool to filter the items the user might consider relevant. Several recommendation techniques have been developed since the Internet explosion to achieve this filtering task. In an attempt to recommend relevant songs to users, we propose a hybrid recommender that considers real-world user information and a high-level representation of audio data. We use a deep learning technique, a convolutional deep neural network, to represent the audio data at an abstract level. As our main contribution, we investigate a state-of-the-art technique, the estimation of distribution algorithm, to capture the listening behaviour of an individual from the features of the songs that are interesting to the user. The designed hybrid music recommender outperforms a traditional content-based recommender in prediction accuracy.
+%Aims
+%Method
+
+%Results
+
+%Conclusions
 
 \end{abstract}
\ No newline at end of file
--- a/Report/chapter2/background.tex	Tue Sep 01 11:29:38 2015 +0100
+++ b/Report/chapter2/background.tex	Wed Sep 02 20:56:37 2015 +0100
@@ -100,10 +100,11 @@
 One disadvantage of CB filtering is that personal reviews are not considered in the recommendation process, because this technique is limited to explicit representation of items~\parencite{1242}. Moreover, some representations limit the description to certain aspects only~\parencite{Lops2011}.
 
 Another limitation of CB might be the collection of external data due to restricted access, e.g., the Million Song Dataset \parencite{Bertin-Mahieux2011} does not provide audio data due to copyright restrictions\footnote{http://labrosa.ee.columbia.edu/millionsong/pages/can-i-contact-you-privately-get-audio} and some preview clips are not available in the 7digital UK music catalogue. 
-%Content-based methods build user profiles by analysing the users' rated items. Each profile is then processed to be correlated with another item, which has not been rated, to compute the interest of the user on this object. \parencite{Lops2011}
+
+In our project, a CB recommender is used as the baseline against which improved performance in music recommendation is shown. Please refer to Section~\ref{sec:recresults} for more detail.
 
 \subsection{Item Representation}
-Items require an accurate description to achieve upstanding results for recommending items to users \Autocite{1242}. In majority of the content-based filtering systems, item attributes are textual features extracted from web resources. \parencite{Lops2011}
+Items require an accurate description to achieve good results when recommending items to users~\Autocite{1242}. In the majority of content-based filtering systems, item attributes are textual features extracted from web resources~\parencite{Lops2011}.
 
 In our approach, we describe the songs in terms of n-dimensional vectors. Each dimension in the vector represents the probability that the song belongs to a music genre. The probability estimation is obtained from a music classifier implemented with a deep learning technique. The song representation process is illustrated in section~\ref{subsec:genre}.
 
@@ -193,7 +194,7 @@
 \end{equation}
 where \emph{k} is the number of classes.
 \subsubsection{Music Feature Learning}
-\textcite{Sigtia20146959} examined and compared DNNs to discover features from the GTZAN dataset and the ISMIR 2004 genre classification dataset\footnote{http://ismir2004.ismir.net/genre\_contest/}, using rectifier linear units (ReLUs) and dropout regularisation. ReLUs provides better convergence without pre-training. Dropout regularisation reduces the problem of overfitting.
+\textcite{Sigtia20146959} examined and compared DNNs to discover features from the GTZAN dataset and the ISMIR 2004 genre classification dataset\footnote{http://ismir2004.ismir.net/genre\_contest/}, using rectified linear units (ReLUs) and dropout regularisation. The ReLU activation function is defined as $\max(x,0)$. ReLUs provide better convergence without pre-training, and dropout regularisation reduces the problem of overfitting.
 
 First, the GTZAN dataset was divided into four 50/25/25 train/validation/test splits. For each audio clip of the dataset, they calculated the Fast Fourier Transform (FFT) on frames of length 1,024 samples (22,050 Hz sampling rate) with a window overlap of 50\%. Next, they used the magnitude of each FFT frame, resulting in a 513-dimensional vector. Then, each feature dimension was normalised to have zero mean and unit standard deviation.
 
@@ -253,9 +254,9 @@
 \subsection{A Hybrid Recommendation Model Based on EDA}
 \textcite{Liang2014781} exploited a permutation EDA to model user profiles in a hybrid model for movie recommendation using the MovieLens 1M dataset\footnote{http://grouplens.org/datasets/movielens/}.
 
-A movie, \emph{i}, is described using a vector, $t_i=\{(k_1,w_1),\ldots ,(k_n,w_n)\}$, where the keywords $k_n$ and weights $w_n$ are calculated with term frequency-inverse document frequency (TF-IDF) technique. A user is initially represented by a set,  $S_u$, of $(i, r_{u,i}$) tuples, where, $r_{u,i}$ is the rating of the movie \emph{i} given by user \emph{u}. The keywords in every $S_u$ set are embedded in a new set, $D_u$.
+A movie, \emph{i}, is described using a vector, $t_i=\{(k_1,w_1),\ldots ,(k_n,w_n)\}$, where the keywords $k_n$ and weights $w_n$ are calculated with the term frequency-inverse document frequency (TF-IDF) technique. A user is initially represented by a set, $S_u=\{(t_1, r_{u,1}),\ldots,(t_i, r_{u,i})\vert r_{u,i}>\bar r_{u}\}$, where $r_{u,i}$ is the rating of the movie \emph{i} given by user \emph{u}, and $\bar r_{u}$ is a threshold. The keywords in every $S_u$ set are embedded in a new set, $D_u$.
 
-The goal is to learn the user profile, $profile_u$, by minimisation of the fitness function, defined by Equation~\eqref{eq:fitness}
+The goal is to learn the user profile, $profile_u=\{(k_1,w_1),\ldots ,(k_n,w_n)\}$, by minimisation of the fitness function, defined by Equation~\eqref{eq:fitness}
 \begin{equation}
 fitness(profile_u) =\sum_{i\in S_u}\log(r_{u,i}\times sim(profile_u,t_i))
 \label{eq:fitness}
--- a/Report/chapter3/ch3.tex	Tue Sep 01 11:29:38 2015 +0100
+++ b/Report/chapter3/ch3.tex	Wed Sep 02 20:56:37 2015 +0100
@@ -1,6 +1,8 @@
 \chapter{Methodology}
 \label{ch:methodology}
-The methodology used to develop our hybrid music recommender consists of three main stages. First, the collection of real world user-item data corresponding to the play counts of specific songs and the fetching of audio clips of the unique identified songs in the dataset. Secondly, the implementation of the CDNN to represent the audio clips in terms of music genre probabilities as n-dimensional vectors. Finally, we investigate a permutation EDA and a continuous EDA to model user profiles based on the rated songs above a threshold. Every stage of our hybrid recommender is entirely developed in Python 2.7\footnote{https://www.python.org/download/releases/2.7/}, although, they are implemented in different platforms, e.g., OS X (v10.10.4) for the most part of the implementation, Ubuntu (14.04 LTS installed on VirtualBox 5.0.0) for intermediate time-frequency representation and CentOS (Linux release 7.1.1503) for the data preprocessing and CDNN implementation.
+The methodology used to develop our hybrid music recommender consists of four main stages. First, the collection of real-world user-item data corresponding to the play counts of specific songs, and the fetching of audio clips for the uniquely identified songs in the dataset. Secondly, the implementation of the CDNN to represent the audio clips as n-dimensional vectors of music genre probabilities. Next, a permutation EDA and a continuous EDA are investigated to model user profiles based on the songs rated above a threshold. Finally, the process of top-N recommendation for the baseline and the hybrid recommender is described.
+
+Every stage of our hybrid recommender is developed entirely in Python 2.7\footnote{https://www.python.org/download/releases/2.7/}, although the stages run on different platforms: OS X (v10.10.4) for most of the implementation, Ubuntu (14.04 LTS installed on VirtualBox 5.0.0) for the intermediate time-frequency representation, and CentOS (Linux release 7.1.1503) for the data preprocessing and CDNN implementation.
 
 \section{Data collection}
 The Million Song Dataset \parencite{Bertin-Mahieux2011} is a collection of audio features and metadata for a million contemporary popular music tracks, which provides ground truth for evaluation research in MIR. This collection is complemented by the Taste Profile subset, which provides 48,373,586 triplets, each consisting of an anonymised user ID, an Echo Nest song ID and a play count. We choose this dataset because it is publicly available and contains enough data for user modelling and recommender evaluation.
@@ -29,7 +31,7 @@
 
 Considering the limited number of requests allowed by the Echo Nest API and the 7digital API (see Section~\ref{sec:musicservices}), the process of fetching data for 1,500 song IDs takes at least 8 hours, resulting in a total of 640 MP3 files.
 
-Additionally, the script accumulates the Echo Nest song identifier, track ID, artist name, song title and the 7digital preview audio URL for each downloaded track in a text file only if the audio clip is available for download. The generated text file is used for the preprocessing of the cleaned taste profile dataset in Subsection~\ref{rating}. The flowchart of the script is shown in Figure~\ref{fig:fetchaudio}
+Additionally, the script accumulates the Echo Nest song identifier, track ID, artist name, song title and the 7digital preview audio URL for each downloaded track in a text file, only if the audio clip is available for download. The generated text file is used for the preprocessing of the cleaned taste profile dataset in subsection~\ref{subsec:rating}. The flowchart of the script is shown in Figure~\ref{fig:fetchaudio}.
 \begin{figure}[ht!]
 	\centering
 	\includegraphics[width=0.9\textwidth]{chapter3/fetch_audio.png}
@@ -52,7 +54,7 @@
 
 First, a list of absolute paths corresponding to the songs in the repository is generated. The sequence of paths in the list is modified by random shuffling. This new sequence of absolute paths is saved in a text file.
 
-Second, for every path in the text file of randomised absolute paths, a fragment equivalent to 3 seconds of the associated audio clip is loaded at a sampling rate of 22,050 Hz and converted to mono channel. For every fragment, a mel-scaled power spectrogram with 128 bands is computed from windows of 1,024 samples with a hop size of 512 samples. Hence, the spectrogram is converted to logarithmic scale in dB using peak power as reference. The functions \textit{load}, \textit{feature.melspectrogram} and \textit{logamplitude}, correspondingly to load an audio clip, spectrogram computation and logarithmic conversion, from the LibROSA\footnote{https://bmcfee.github.io/librosa/index.html} package are used.
+Second, for every path in the text file of randomised absolute paths, a fragment equivalent to 3 seconds of the associated audio clip is loaded at a sampling rate of 22,050 Hz and converted to mono. For every fragment, a mel-scaled power spectrogram with 128 bands is computed from windows of 1,024 samples with a hop size of 512 samples, resulting in a spectrogram of 130 frames with 128 components. The spectrogram is then converted to logarithmic scale in dB using peak power as reference. The LibROSA\footnote{https://bmcfee.github.io/librosa/index.html} functions \textit{load}, \textit{feature.melspectrogram} and \textit{logamplitude} are used to load an audio clip, compute the spectrogram and perform the logarithmic conversion, respectively.
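+
+A minimal sketch of this step is given below (illustrative only; the \textit{clip\_spectrogram} helper is a hypothetical name, not part of the actual script):
+\begin{verbatim}
+import numpy as np
+import librosa
+
+def clip_spectrogram(path):
+    # load 3 seconds of audio, mono, at 22,050 Hz
+    y, sr = librosa.load(path, sr=22050, mono=True, duration=3.0)
+    # mel-scaled power spectrogram: 128 bands, 1,024-sample
+    # windows, 512-sample hop (~130 frames for 3 seconds)
+    S = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024,
+                                       hop_length=512, n_mels=128)
+    # convert to dB, using peak power as reference
+    return librosa.logamplitude(S, ref_power=np.max)
+\end{verbatim}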
 
 To handle audio with LibROSA functions, it is recommended to use the Samplerate\footnote{https://pypi.python.org/pypi/scikits.samplerate/} package for efficient resampling. In our project, we attempted to use the cross-platform SoX\footnote{http://sox.sourceforge.net/} tool without success, due to operating system restrictions. Alternatively, we use the cross-platform FFmpeg\footnote{https://www.ffmpeg.org/} and \textit{libmp3lame0}\footnote{http://packages.ubuntu.com/precise/libmp3lame0} packages for efficient resampling.
 
@@ -65,10 +67,10 @@
 \section{Data preprocessing}
 In order to obtain suitable representations for users' interest in the taste profile dataset and for the songs' spectrograms, additional processing of the data is necessary.
 \subsection{Rating from implicit user feedback}
-\label{rating}
-First, the text file of the downloaded MP3 metadata (see subsection~\ref{subsec:spectrogram}) is used to retain the triplets, from the cleaned taste profile subset, that contain the song IDs of the available audio clips. A reduced taste profile dataset with 4,685 triplets is obtained.
+\label{subsec:rating}
+First, the text file of the downloaded MP3 metadata (see subsection~\ref{subsec:spectrogram}) is used to retain the triplets, from the cleaned taste profile subset, that contain the song IDs of the available audio clips. A reduced taste profile dataset with 4,685 triplets, corresponding to 53 users, is obtained.
 
-The reduced taste profile dataset represent the user listening habits as implicit feedback, i.e., play counts of songs, it is necessary to normalise the listening habits as explicit feedback, i.e., range of values $[1\ldots5]$ that indicate how much a user likes a song. Normalisation of play counts is computed with the complementary cumulative distribution of play counts of a user, following the procedure given by \textcite{1242}. Songs in the top 80–100\% of the distribution get a rating of 5, songs in the 60–80\% range get a 4, songs in the 40-60\% range get a 3, songs in the 20-40\% get a 2 and songs in the 0-20\% range get a rating of 1. An exception for this allocation of ratings comes out when the coefficient of variation, given by Equation~\eqref{eq:cv}:
+The reduced taste profile dataset represents the user listening habits as implicit feedback, i.e., play counts of songs; it is necessary to normalise the listening habits into explicit feedback, i.e., a range of values $[1\ldots5]$ that indicates how much a user likes a song. Normalisation of play counts is computed with the complementary cumulative distribution of the play counts of a user, following the procedure given by \textcite{1242}. Songs in the top 80--100\% of the distribution get a rating of 5, songs in the 60--80\% range get a 4, songs in the 40--60\% range get a 3, songs in the 20--40\% range get a 2 and songs in the 0--20\% range get a rating of 1. An exception to this allocation of ratings arises when the coefficient of variation, given by Equation~\eqref{eq:cv}:
 \begin{equation}
 CV=\frac{\sigma}{\mu}
 \label{eq:cv}
@@ -76,47 +78,140 @@
 where $\sigma$ is the standard deviation and $\mu$ is the mean of the play counts of a user, is less than or equal to $0.5$. In that case, every song gets a rating of 3.
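+
+A minimal sketch of this normalisation, with a hypothetical \textit{ratings\_from\_counts} helper, is:
+\begin{verbatim}
+import numpy as np
+
+def ratings_from_counts(counts):
+    # counts: play counts of one user's songs
+    counts = np.asarray(counts, dtype=float)
+    if counts.std() / counts.mean() <= 0.5:  # CV exception
+        return np.full(len(counts), 3, dtype=int)
+    # rank fraction of each song in the user's distribution
+    pct = np.argsort(np.argsort(counts)) / float(len(counts))
+    # map 0-20% -> 1, 20-40% -> 2, ..., 80-100% -> 5
+    return 1 + np.minimum((pct / 0.2).astype(int), 4)
+\end{verbatim}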
 
 \subsection{Standardise time-frequency representation}
+\label{subsec:normalised}
 The logarithmic mel-scaled power spectrograms obtained in subsection~\ref{subsec:spectrogram} are normalised to have zero mean and unit variance in each frequency band, using the \textit{fit} and \textit{transform} methods of the \textit{StandardScaler} class from the Scikit-learn~\parencite{scikit-learn} package, as a common requirement of several machine learning classifiers.
 
+Additionally, the GTZAN normalised spectrograms dataset is split into 3 subsets: 500 spectrograms for training, 250 spectrograms for validation and 250 spectrograms for testing. Each spectrogram is saved as a tuple \textit{(spectrogram, tag)} in a cPickle file, where the tag is the number of the music genre: 0 for blues, 1 for classical, 2 for country, 3 for disco, 4 for hiphop, 5 for jazz, 6 for metal, 7 for pop, 8 for reggae and 9 for rock.
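+
+A minimal sketch of this standardisation and serialisation step (the names \textit{train\_specs}, \textit{train\_tags} and the output file name are hypothetical):
+\begin{verbatim}
+import cPickle
+import numpy as np
+from sklearn.preprocessing import StandardScaler
+
+# train_specs: list of (130, 128) spectrograms
+# train_tags: genre numbers 0..9
+scaler = StandardScaler().fit(np.vstack(train_specs))
+train_norm = [scaler.transform(S) for S in train_specs]
+with open('gtzan_train.pkl', 'wb') as f:
+    cPickle.dump(zip(train_norm, train_tags), f)
+\end{verbatim}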
+
 \section{Algorithms}
 \label{sec:algorithms}
 The hybrid music recommender approach in this project can be considered an implementation of the feature augmentation and meta-level methods presented in subsection~\ref{subsec:hybridrecommender}. First, user profiles are generated using the rating matrix and the song vector representation. Next, the generated model is the input of a CB recommender to produce \emph{top-N} song recommendations. The general model of our hybrid recommender is shown in Figure~\ref{fig:generalhybrid}.
 \begin{figure}[ht!]
 	\centering
 	\includegraphics[width=\textwidth]{chapter3/General_model_hybrid_recommender.png}
-	\caption{Diagram of hybrid music recommender}
+	\caption{Diagram of the hybrid music recommender}
 	\label{fig:generalhybrid}
 \end{figure}
 
 \subsection{Probability of music genre representation}
 \label{subsec:genre}
-To represent an audio file in a 10-dimensional vector, whose dimensions corresponds to the 10 music genres specified in the GTZAN dataset, a CDNN is implemented using Theano library. For intensive computation processes, such as convolution, the implementation on equipment with Graphical Processing Unit (GPU) acceleration is recommended. In this project, a CentOS (Linux release 7.1.1503) server with a Tesla K40c\footnote{http://www.nvidia.com/object/tesla-servers.html} GPU is exploited.
+To represent an audio file as a 10-dimensional vector, whose dimensions correspond to the 10 music genres specified in the GTZAN dataset, a CDNN is implemented using the Theano library. For computationally intensive processes, such as convolution, implementation on equipment with Graphical Processing Unit (GPU) acceleration is recommended. In this project, a CentOS (Linux release 7.1.1503) server with a Tesla K40c\footnote{http://www.nvidia.com/object/tesla-servers.html} GPU is exploited.
 
-The code for logistic regression, multilayer perceptron and deep convolutional network designed for character recognition of MNIST\footnote{http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz} dataset, available on~\textcite{1_deeplearning.net_2015} is adapted to our purpose of music genre classification
+The scripts for logistic regression, multilayer perceptron and deep convolutional network designed for character recognition on the MNIST\footnote{http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz} dataset, available on~\textcite{1_deeplearning.net_2015}, are adapted to our purpose of music genre classification. ReLU and dropout functions are defined in the deep convolutional network script.
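+
+A sketch of these two definitions in Theano (the function names and the random generator argument are illustrative choices, not the exact project code):
+\begin{verbatim}
+import theano
+import theano.tensor as T
+from theano.tensor.shared_randomstreams import RandomStreams
+
+def relu(x):
+    return T.maximum(x, 0.0)
+
+def dropout(rng, x, p=0.2):
+    # rng: a numpy.random.RandomState instance;
+    # zero each unit with probability p during training
+    srng = RandomStreams(rng.randint(999999))
+    mask = srng.binomial(n=1, p=1 - p, size=x.shape,
+                         dtype=theano.config.floatX)
+    return x * mask
+\end{verbatim}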
 
 %Deep belief network is a probabilistic model that has one observed layer and several hidden layers.
 \subsubsection{CDNN architecture}
 \begin{figure}[ht!]
 	\centering
 	\includegraphics[width=\textwidth]{chapter3/CDNN.png}
-	\caption{Diagram of hybrid music recommender}
+	\caption{Diagram of CDNN for music genre classification~\parencite{kereliuk15}}
 	\label{fig:cdnn}
 \end{figure}
-The input of the CDNN consist of the 128-bands spectrograms obtained in subsection~\ref{subsec:spectrogram}. The batch size considered is 20 frames.
-Each convolutional layer consists of 10 kernels and ReLUs activation units. In the first convolutional layer the pooling size is 4 and in the second layer the pooling size is 2. The filters analyses the frames along the frequency axis to consider every Mel components with a hop size of 4 frames in the time axis. Additionally, there is a hidden MLP with 500 units.
+An architecture similar to the CDNN for music genre classification in~\parencite{kereliuk15} is recreated in our project. A batch size of 20 and a dropout rate of 0.20 for the convolutional layer units are used.
 
+Initially, the 2-dimensional normalised spectrograms (130 frames$\times$128 frequency bands) obtained in subsection~\ref{subsec:normalised} must be reshaped into a 4-dimensional tensor compatible with the input of the first convolutional layer (batch size$\times$1$\times$130$\times$128).
 
-The classification of genre for each frame is returned by negative log likelihood estimation of a logistic stochastic gradient descent (SGD) layer.
+The first convolutional layer consists of 32 filters, each with a size of 8 frames, followed by max-pooling downsampling by 4 to reduce the size of the spectrogram along the time axis. The size of the resulting spectrogram is 30$\times$128 and the output of this first convolutional layer is a 4-dimensional tensor of size 20$\times$32$\times$30$\times$128.
 
+The second convolutional layer also consists of 32 filters, each with a size of 8 frames, followed by max-pooling downsampling by 4 to reduce the size of the spectrogram obtained in the first layer. The size of the new spectrogram is 5$\times$128 and the output of this second convolutional layer is a 4-dimensional tensor of size 20$\times$32$\times$5$\times$128.
+
+Following the convolution process, the output tensor of the second convolutional layer is flattened to feed the fully connected MLP. The MLP consists of 500 ReLUs.
+
+Finally, the classification of music genre is accomplished with a logistic regression layer over the 500 output values from the MLP. This output layer consists of 10 units with a softmax activation function (see Equation~\eqref{eq:softmax}).
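+
+The tensor shapes through the network can be traced as follows (a sketch, assuming valid convolutions; pooled sizes are rounded down):
+\begin{verbatim}
+# input     : (20, 1, 130, 128)
+# conv1     : 32 filters of 8 frames -> (20, 32, 123, 128)
+# max-pool 4 along time              -> (20, 32, 30, 128)
+# conv2     : 32 filters of 8 frames -> (20, 32, 23, 128)
+# max-pool 4 along time              -> (20, 32, 5, 128)
+# flatten                            -> (20, 32*5*128)
+# MLP       : 500 ReLUs              -> (20, 500)
+# softmax   : 10 genres              -> (20, 10)
+\end{verbatim}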
 
 \subsubsection{Learning parameters}
-In our testing, we obtained a 38.8 \% of classification error after 9 trials using the GTZAN dataset. More details of classification results are shown in Table~\ref{table:genre}.
+The weights and biases of the units of the CDNN are the parameters to be learnt by stochastic gradient descent (SGD) to minimise a cost function. The cost function is the negative log likelihood of the prediction in the output layer given the target values, i.e., the music genre ground truth.
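+
+This cost is the standard negative log likelihood of the deeplearning.net logistic regression layer; in Theano it can be sketched as:
+\begin{verbatim}
+import theano.tensor as T
+
+def negative_log_likelihood(p_y_given_x, y):
+    # mean over the batch of -log P(target genre | input)
+    nll = T.log(p_y_given_x)[T.arange(y.shape[0]), y]
+    return -T.mean(nll)
+\end{verbatim}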
 
-%\subsubsection{Probability of genre representation}
+The CDNN is run for 200 epochs for training, validation and testing, each epoch equivalent to 50 iterations. The number of iterations corresponds to the ratio between the number of spectrograms (1,000 for the GTZAN dataset) and the batch size.
+
+According to \textcite{bengio2012practical}, the patience value is the minimum number of training examples to process before early stopping is considered. In our project, the patience value is set at 1,000.
+
+In our testing, after 9 trials of the CDNN, we obtain a best classification error of 38.8\% using the spectrograms corresponding to the GTZAN dataset (see Table~\ref{table:genre}). The weights and biases for this best classification error are saved in a cPickle file to be applied as initial parameters of the CDNN for vector representation.
+
+\subsubsection{Vector representation}
+The CDNN script is adapted to produce the vector representation of the spectrograms. This CDNN uses the weights and biases learnt in the genre classification process as initial parameters.
+
+A 10-dimensional vector is produced by the softmax output layer. Each dimension corresponds to a music genre and each value represents the probability that a song belongs to a specific music genre, given the normalised spectrogram at the input layer.
 
 \subsection{User profile modelling}
 \label{subsec:profile}
-\subsubsection{Permutation EDA}
-\subsubsection{Continuous Univariate Marginal Distribution Algorithm}
+To model user profiles from the triplets in the normalised taste profile dataset, we adapt the permutation EDA (see algorithm~\ref{alg:hybrideda} on page~\pageref{alg:hybrideda}) and the continuous EDA (see algorithm~\ref{alg:umda} on page~\pageref{alg:umda}). For both EDAs, we consider the following:
+\begin{itemize}
+	\item User representation $S_u=\{(t_1, r_{u,1}),\ldots,(t_i, r_{u,i})\vert r_{u,i}>\bar r_{u}\}$.
+	\item Rating threshold $\bar r_{u}=2$, assuming that a user does not like songs with ratings of 1 or 2 out of 5.
+	\item The stopping criterion is the maximum number of generations, limited to 250.
+\end{itemize}
 
-\subsection{Top-N songs recommendation}
\ No newline at end of file
+
+\subsubsection{Modelling with Permutation EDA}
+In the case of the permutation EDA, the genre tags (0 for blues, 1 for classical, 2 for country, 3 for disco, 4 for hiphop, 5 for jazz, 6 for metal, 7 for pop, 8 for reggae and 9 for rock) are considered as the keywords $k_n$ in the set $D_u$, and the weights $w_{n,i}$ are 50 evenly spaced samples over the interval $[0.1, 0.9]$; thus, the size of the set $K_u$ is $N=500$ and the initial probability is $c_{n,i}=1/500$.
+
+The population size is equal to $u=53$, i.e., the number of users in the normalised taste profile dataset. Instead of using the Monte Carlo method to generate the initial population of $profile_u$, 10 tuples $(k_n,w_{n,i})$ from $K_u$ are randomly sampled for each user. The number of top individuals, $M$, is half the total number of users. The process of sampling new individuals is preserved. The adapted permutation EDA for user modelling is illustrated in Algorithm~\ref{alg:permutationeda}:
+
+\begin{algorithm}[ht!]
+	\caption{Calculate $profile_u$ for users in taste profile}
+	\begin{algorithmic} 
+		\REQUIRE set $D_u$, weights $w_{n,i}$
+		\REQUIRE population size $u$, MAXGEN
+		\REQUIRE $M = Round(u/2)$
+		%\ENSURE $y = x^n$
+		%\STATE Random selection of keywords $k_n$ from $D_u$
+		\STATE Assign a weight $w_{n,i}$ to each $k_n$ to build a set~$K_u$ of size~$N$
+		\STATE Assign a probability $c_{n,i}=1/N$ to each $(k_n,w_{n,i})$
+		\STATE Generate initial population of $profile_u$
+		\WHILE{$generation <$ MAXGEN}
+		\STATE Compute each $fitness(profile_u)$
+		\STATE Rank individuals by their fitness value
+		\STATE Select top $M < N$ individuals
+		\STATE Update $c_{n,i}$ by counting the occurrences of $(k_n,w_{n,i})$ in the $M$ individuals profiles
+		\STATE Generate $profile_u$ by random sampling according to updated $c_{n,i}$
+		\STATE $generation\leftarrow generation+1$
+		\ENDWHILE
+		\RETURN $profile_u$
+	\end{algorithmic}
+	\label{alg:permutationeda}
+\end{algorithm}
+
+The time elapsed for modelling user profiles with the permutation EDA is approximately 7.82 seconds.
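+
+A condensed Python sketch of one generation of this EDA (hypothetical helper; the probabilities are smoothed so that every tuple remains samplable):
+\begin{verbatim}
+import numpy as np
+
+def eda_generation(population, fitness, K, M):
+    # population: list of index arrays into K,
+    # 10 (genre, weight) tuples per profile;
+    # fitness: callable returning a profile's fitness
+    ranked = sorted(population, key=fitness)  # minimise
+    counts = np.zeros(len(K))
+    for profile in ranked[:M]:       # top M individuals
+        for n in profile:
+            counts[n] += 1
+    c = (counts + 1.0) / (counts + 1.0).sum()
+    # sample new profiles according to the updated c
+    new_pop = [np.random.choice(len(K), size=10,
+                                replace=False, p=c)
+               for _ in population]
+    return new_pop, c
+\end{verbatim}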
+
+\subsubsection{Modelling with $UMDA_c^G$}
+The $UMDA_c^G$ algorithm is adapted to select the top $M_{sel}$ individuals by using the fitness function (Equation~\eqref{eq:fitness} on page~\pageref{eq:fitness}) exploited by the permutation EDA. The population size is $M=53$ users and the selection parameter is $\tau=0.5$. Here, $x_i$ represents the probability value of the music genre dimension, \emph{i}, in the $profile_u$ vector.
+
+In each generation, \emph{t}, the mean value $\mu_{i,t}$ and the variance $\sigma_{i,t}^2$ are computed for every dimension, $i$, across the $M_{sel}$ individual vectors. For each dimension, i.e., music genre, a normal distribution is calculated with the corresponding mean value and variance, to estimate the individual vectors of the next generation.
+\begin{algorithm}[ht!]
+	\caption{Framework for $UMDA_c^G$ to model users}
+	\begin{algorithmic} 
+		\REQUIRE population size $M$
+		\REQUIRE selection parameter $\tau$
+		\STATE Generate $M$ individuals at random
+		\STATE $M_{sel}\leftarrow M\cdot\tau$
+		\STATE $t \leftarrow 0$
+		\WHILE{$t <$ MAXGEN}
+		\STATE Compute each $fitness(profile_u)$
+		\STATE Rank individuals by their fitness value
+		\STATE Select top $M_{sel}$ individuals 
+		\STATE $\mu_{i,t}\leftarrow\frac{1}{M_{sel}}\sum_{j=1}^{M_{sel}}x_i^j$
+		\STATE $\sigma_{i,t}^2\leftarrow\frac{1}{M_{sel}-1}\sum_{j=1}^{M_{sel}}(x_i^j-\mu_{i,t})^2$ 
+		\STATE $p_t({x_{i}}\vert \mu_{i,t},\sigma_{i,t}^2)\leftarrow\frac{1}{\sqrt{2\pi}\sigma_{i,t}}\exp(-\frac{1}{2}(\frac{x_i-\mu_{i,t}}{\sigma_{i,t}})^2)$
+		\STATE Sample $M$ individuals from $p_t({x_{i}}\vert \mu_{i,t},\sigma_{i,t}^2)$
+		\STATE $t\leftarrow t+1$
+		\ENDWHILE
+	\end{algorithmic}
+	\label{alg:umda2}
+\end{algorithm}
+The time elapsed for modelling user profiles with the continuous EDA is approximately 4.20 seconds.
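+
+A NumPy sketch of one $UMDA_c^G$ generation (illustrative; \textit{pop} is assumed to be an $M\times10$ array of genre-probability profiles):
+\begin{verbatim}
+import numpy as np
+
+def umda_step(pop, fitness, tau=0.5):
+    m_sel = int(len(pop) * tau)
+    order = np.argsort([fitness(x) for x in pop])
+    sel = pop[order[:m_sel]]         # top M_sel individuals
+    mu = sel.mean(axis=0)            # mean per genre
+    sigma = sel.std(axis=0, ddof=1)  # std per genre
+    # sample the next generation from univariate normals
+    return np.random.normal(mu, sigma, size=pop.shape)
+\end{verbatim}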
+\subsection{Top-N songs recommendation}
+The final stage of the implemented recommender systems is to generate a list of song recommendations according to the similarity values computed with Equation~\eqref{eq:wpearson} (see page~\pageref{eq:wpearson}).
+\subsubsection{Top-N recommendations in CB baseline}
+The list of recommendations in a CB recommender is given by the similarities between the items that a user has already rated and the new items. It is assumed that the user has not seen the new items before.
+
+First, the similarity matrix between every pair of items in the training set is computed. Only the $k=30$ most similar items are kept for each item. Next, for each song that a user rated above the threshold (rating $>$ 2), the $k$ neighbours are retrieved as a list of candidate items. Each list is normalised to have a maximum value of 1. The candidate lists are then merged; for repeated candidates, the similarity values are summed. The $N$ candidates with the highest similarity values are recommended to a user.
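+
+A sketch of this aggregation (hypothetical names; \textit{neighbours[song]} is assumed to hold the $k=30$ precomputed (item, similarity) pairs):
+\begin{verbatim}
+from collections import defaultdict
+
+def cb_top_n(liked_songs, neighbours, N):
+    scores = defaultdict(float)
+    for song in liked_songs:          # songs rated > 2
+        cands = neighbours[song]
+        top_sim = max(s for _, s in cands)
+        for item, sim in cands:
+            # normalise each list to a maximum of 1,
+            # summing similarities of repeated candidates
+            scores[item] += sim / top_sim
+    ranked = sorted(scores, key=scores.get, reverse=True)
+    return ranked[:N]
+\end{verbatim}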
+
+\subsubsection{Top-N recommendations in hybrid music recommender}
+In our hybrid music model (see Figure~\ref{fig:generalhybrid} on page~\pageref{fig:generalhybrid}), the content-based filtering computes the similarity between a user interest profile and each song vector in the test set. The songs are ranked in descending order and the first $N$ songs of this ranking are recommended.
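+
+In sketch form (assuming a hypothetical \textit{similarity} helper implementing Equation~\eqref{eq:wpearson}):
+\begin{verbatim}
+# profile: 10-dim user vector; vectors[s]: 10-dim song vector
+ranked = sorted(test_songs,
+                key=lambda s: similarity(profile, vectors[s]),
+                reverse=True)
+recommendations = ranked[:N]
+\end{verbatim}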
+
+In our project, we experiment with different values of $N$, obtaining the best results with the hybrid music recommender based on the permutation EDA in all the experiments. Refer to Section~\ref{sec:recresults} for detailed evaluation results.
+
+\section{Summary}
+In this chapter, we presented the collection and preprocessing of the taste profile subset used to model user profiles with EDAs. We also presented the time-frequency representation of the audio content that feeds a CDNN, in order to obtain a 10-dimensional vector of the probabilities of a song belonging to each music genre. Finally, we presented the adapted architecture of the CDNN and the EDAs for hybrid recommendation. In the following chapter, we introduce the evaluation method and the experiments used to evaluate our hybrid recommender approach.
\ No newline at end of file
--- a/Report/chapter4/evaluation.tex	Tue Sep 01 11:29:38 2015 +0100
+++ b/Report/chapter4/evaluation.tex	Wed Sep 02 20:56:37 2015 +0100
@@ -28,33 +28,44 @@
 
 \subsection{Decision based metrics}
 Our hybrid recommender produces a list of songs for each user; hence, it is necessary to evaluate the recommendations with metrics derived from the \textit{confusion matrix}, which reflects the categorisation of test items as true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). In this project we consider the following metrics \parencite{1242}:
-\subsubsection{Precision}
-\begin{equation}
-Precision = \frac{TP} {TP+FP}\label{eq:1}
-\end{equation}
+\begin{itemize}
+	\item \textbf{Precision} is the ratio of correct positive predictions.
+	\begin{equation}
+	Precision = \frac{TP} {TP+FP}\label{eq:1}
+	\end{equation}
+	\item \textbf{Recall} is the ratio of positive instances predicted as positive.
+	\begin{equation}
+	Recall = \frac{TP} {TP+FN}\label{eq:2}
+	\end{equation}
+	\item \textbf{F1 measure} is the harmonic mean of precision and recall.
+	\begin{equation}
+	F1 = \frac{2 \times Precision \times Recall} {Precision+Recall}\label{eq:3}
+	\end{equation}
+	\item \textbf{Accuracy} is the ratio of correct predictions.
+	\begin{equation}
+	Accuracy = \frac{TP+TN} {TP+FP+TN+FN}\label{eq:4}
+	\end{equation}
+\end{itemize}
+
+
 %Text \eqref{eq:1}
-\subsubsection{Recall}
-\begin{equation}
-Recall = \frac{TP} {TP+FN}\label{eq:2}
-\end{equation}
+
+
 %Text \eqref{eq:2}
-\subsubsection{F1}
-\begin{equation}
-Recall = \frac{2 \times Precision \times Recall} {Precision+Recall}\label{eq:3}
-\end{equation}
-\subsubsection{Accuracy}
-\begin{equation}
-Recall = \frac{TP+TN} {TP+FP+TN+FN}\label{eq:4}
-\end{equation}
+
 
 \section{Evaluation method}
 The hybrid music recommender system proposed in this project is evaluated through an offline experiment, and the results are presented with the decision-based metrics described in the previous section.
 
 \subsection{Training set and test set}
-For the purpose of evaluate and compare the performance, we use a random sample of 20 \% of the total size of the cleaned Taste Profile subset for testing and the rest 80 \% is used to train both the hybrid recommender approach and content-based recommender baseline. The sampling process is repeated for ten times.
+The normalised taste profile dataset (refer to subsection~\ref{subsec:rating}) is split into a training and a test set. For each user in the dataset, a random sample corresponding to 20\% of the user's total number of ratings is assigned to the test set, and the remaining 80\% is assigned to the training set. The split process is repeated 10 times, resulting in a total of 10 training and 10 test sets.
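+
+A minimal sketch of the per-user split (hypothetical helper):
+\begin{verbatim}
+import random
+
+def split_user(rated_items, test_frac=0.2):
+    items = list(rated_items)
+    random.shuffle(items)
+    cut = int(round(len(items) * test_frac))
+    return items[cut:], items[:cut]  # train, test
+\end{verbatim}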
+
+\subsection{Top-N evaluation}
+For each song in the user's test set, we check whether the song is included in the list of top-N recommendations.
+
+If the test song is in the top-N recommendations and the rating of the test song is above the threshold (rating $>$ 2), it is counted as a true positive; otherwise, it is counted as a false positive.
+
+If the test song is not in the top-N recommendations and the rating of the test song is above the threshold, it is counted as a false negative; otherwise, it is counted as a true negative.
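+
+In sketch form, the per-user counting can be written as (hypothetical helper):
+\begin{verbatim}
+def confusion(test_ratings, top_n_items, threshold=2):
+    tp = fp = fn = tn = 0
+    for song, rating in test_ratings.items():
+        recommended = song in top_n_items
+        liked = rating > threshold
+        if recommended and liked:
+            tp += 1
+        elif recommended:
+            fp += 1
+        elif liked:
+            fn += 1
+        else:
+            tn += 1
+    return tp, fp, fn, tn
+\end{verbatim}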
 
 %\subsection{Evaluation measures}
 %Because the dataset does not include explicit ratings, hence, the number of plays of tracks are considered as users' behaviours,
-
-
-
--- a/Report/chapter5/results.tex	Tue Sep 01 11:29:38 2015 +0100
+++ b/Report/chapter5/results.tex	Wed Sep 02 20:56:37 2015 +0100
@@ -7,7 +7,7 @@
 %\end{minipage}
 
 \section{Genre classification results}
-A total of 9 trials were executed to train, validate and test the CDNN for genre classification using a 3 second frame of each file of the GTZAN dataset. We obtained the following results showed in Table~\ref{table:genre}.
+A total of 9 trials are executed for training, validating and testing the CDNN using the normalised spectrograms of the GTZAN dataset (see subsection~\ref{subsec:normalised}). The results are shown in Table~\ref{table:genre}.
 \begin{table}[h!]
 	\caption{Genre classification results} % title of Table
 	\centering % used for centering table
@@ -33,6 +33,8 @@
 For the initial trial, the error is higher because the weight and bias values for each unit of the layers in the deep learning classifier are randomly initialised.
 
 \section{Recommender evaluation results}
+\label{sec:recresults}
+In general, the results demonstrate that the hybrid music recommender based on the permutation EDA performs better than both the CB recommender and the hybrid approach based on the continuous EDA. Nevertheless, the recall values are low in all cases. In Table~\ref{table:recn5}, the results of top-5 recommendation are shown.
 \begin{table}[h!]
 	\caption{Evaluation of recommender systems (N=5)} % title of Table
 	\centering % used for centering table
@@ -51,6 +53,7 @@
 	\label{table:recn5} % is used to refer this table in the text
 \end{table}
 
+In Table~\ref{table:recn10}, the results of top-10 song recommendation are shown. In this case, the precision value improves for the CB recommender, while the accuracy values for all recommender systems tend to decrease.
 \begin{table}[h!]
 	\caption{Evaluation of recommender systems (N=10)} % title of Table
 	\centering % used for centering table
@@ -69,6 +72,7 @@
 	\label{table:recn10} % is used to refer this table in the text
 \end{table}
 
+In Table~\ref{table:recn20}, the results of top-20 song recommendation are shown. In this case, the recall values rise for all the recommender systems compared with the top-5 and top-10 recommendations, but precision and accuracy tend to decrease. From this, we can deduce that our hybrid recommender approaches could improve recall without sacrificing the precision already reached if $N$ is set to a value between 10 and 20.
 \begin{table}[h!]
 	\caption{Evaluation of recommender systems (N=20)} % title of Table
 	\centering % used for centering table
@@ -85,4 +89,4 @@
 		\end{tabular}
 	\end{adjustbox}
 	\label{table:recn20} % is used to refer this table in the text
-\end{table}
\ No newline at end of file
+\end{table}
--- a/Report/chapter6/conclusions.tex	Tue Sep 01 11:29:38 2015 +0100
+++ b/Report/chapter6/conclusions.tex	Wed Sep 02 20:56:37 2015 +0100
@@ -1,12 +1,13 @@
 \chapter{Conclusion}
 \label{ch:conclusion}
-%``Representations of music directly from the temporal or spectral domain can be very sensitive to small time and frequency deformations''. \parencite{zhang2014deep}
+The aim of our project has been the design and implementation of a hybrid music recommender in order to mitigate the cold-start problem in content-based recommender systems. We investigated several types of hybridisation in recommender systems to choose a suitable architecture (shown in Figure~\ref{fig:generalhybrid}) for the available datasets. To represent real-world users and raw waveforms, we decided to investigate and implement state-of-the-art techniques.
+
+Despite their success in the computer vision field, we found in our project that convolutional deep neural networks achieve results similar to long-established music genre classification approaches in the music information retrieval field.
+
+Due to the natural selection concept associated with estimation of distribution algorithms, we investigated and adopted these optimisation techniques for modelling users' listening behaviour in terms of the music genre probabilities of the songs they listened to.
+
+On the other hand, we found that a limited number of genres for song representation leads to coarse predictions according to decision-based metrics.
 
 \section{Future work}
 
-\begin{itemize}
-	\item Download more audio data from 7digital catalog.
-	\item Consider another high-level representation instead of music genres or extend the number of genres.
-	\item Predict rating values of items from users' neighbourhood to evaluate the performance of our hybrid recommender and compare it with a traditional collaborative filtering baseline.
-	\item Optimisation of the profile using latent vectors.
-\end{itemize}
\ No newline at end of file
+In the future, we intend to enhance our hybrid music recommender by considering a wider range of music genres, or latent vectors, for item representation. We shall investigate several configurations of convolutional deep neural networks and different types of deep learning techniques, particularly unsupervised learning approaches, for a better high-level representation of audio waveforms. In addition, we will continue investigating the fascinating estimation of distribution algorithms, considering other fitness functions to optimise, to model user profiles in recommender systems. Finally, we also consider the evaluation of the hybrid recommender with an online experiment.
\ No newline at end of file
--- a/Report/chiliguano_msc_finalproject.blg	Tue Sep 01 11:29:38 2015 +0100
+++ b/Report/chiliguano_msc_finalproject.blg	Wed Sep 02 20:56:37 2015 +0100
@@ -21,15 +21,15 @@
 Reallocated singl_function (elt_size=4) to 100 items from 50.
 Reallocated singl_function (elt_size=4) to 100 items from 50.
 Reallocated singl_function (elt_size=4) to 100 items from 50.
-Reallocated field_info (elt_size=4) to 11672 items from 5000.
+Reallocated field_info (elt_size=4) to 11811 items from 5000.
 Database file #1: chiliguano_msc_finalproject-blx.bib
 Database file #2: references.bib
 Warning--I'm ignoring Putzke2014519's extra "keywords" field
---line 262 of file references.bib
+--line 271 of file references.bib
 Warning--I'm ignoring Putzke2014519's extra "keywords" field
---line 263 of file references.bib
+--line 272 of file references.bib
 Warning--I'm ignoring Putzke2014519's extra "keywords" field
---line 264 of file references.bib
+--line 273 of file references.bib
 Biblatex version: 3.0
 Name 1 in "Hypebot.com," has a comma at the end for entry 1_hypebot.com_2015
 while executing---line 2513 of file biblatex.bst
@@ -417,45 +417,45 @@
 while executing---line 2659 of file biblatex.bst
 Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015
 while executing---line 2659 of file biblatex.bst
-You've used 48 entries,
+You've used 49 entries,
             6047 wiz_defined-function locations,
-            1473 strings with 26491 characters,
-and the built_in function-call counts, 171582 in all, are:
-= -- 5752
-> -- 7367
-< -- 1506
-+ -- 3605
-- -- 4025
-* -- 14453
-:= -- 11734
+            1478 strings with 26643 characters,
+and the built_in function-call counts, 174590 in all, are:
+= -- 5881
+> -- 7454
+< -- 1538
++ -- 3628
+- -- 4049
+* -- 14650
+:= -- 11899
 add.period$ -- 0
-call.type$ -- 48
-change.case$ -- 678
+call.type$ -- 49
+change.case$ -- 697
 chr.to.int$ -- 212
-cite$ -- 94
-duplicate$ -- 19117
-empty$ -- 17502
-format.name$ -- 3793
-if$ -- 36491
+cite$ -- 96
+duplicate$ -- 19464
+empty$ -- 17841
+format.name$ -- 3842
+if$ -- 37190
 int.to.chr$ -- 0
-int.to.str$ -- 108
+int.to.str$ -- 111
 missing$ -- 0
-newline$ -- 1716
-num.names$ -- 2073
-pop$ -- 15454
+newline$ -- 1737
+num.names$ -- 2125
+pop$ -- 15772
 preamble$ -- 1
-purify$ -- 856
+purify$ -- 876
 quote$ -- 0
-skip$ -- 8980
+skip$ -- 9164
 stack$ -- 0
-substring$ -- 3245
-swap$ -- 6795
-text.length$ -- 1420
-text.prefix$ -- 47
+substring$ -- 3326
+swap$ -- 6909
+text.length$ -- 1444
+text.prefix$ -- 48
 top$ -- 1
-type$ -- 1803
+type$ -- 1841
 warning$ -- 0
-while$ -- 1039
+while$ -- 1058
 width$ -- 0
-write$ -- 1667
+write$ -- 1687
 (There were 192 error messages)
--- a/Report/chiliguano_msc_finalproject.lof	Tue Sep 01 11:29:38 2015 +0100
+++ b/Report/chiliguano_msc_finalproject.lof	Wed Sep 02 20:56:37 2015 +0100
@@ -16,7 +16,7 @@
 \defcounter {refsection}{0}\relax 
 \contentsline {figure}{\numberline {2.6}{\ignorespaces Schematic representation of a deep neural network\nobreakspace {}\parencite {1_brown_2014}\relax }}{19}{figure.caption.15}
 \defcounter {refsection}{0}\relax 
-\contentsline {figure}{\numberline {2.7}{\ignorespaces Convolutional deep neural network LeNet model \parencite {1_deeplearning.net_2015}\relax }}{21}{figure.caption.17}
+\contentsline {figure}{\numberline {2.7}{\ignorespaces Convolutional deep neural network LeNet model \parencite {1_deeplearning.net_2015}\relax }}{22}{figure.caption.17}
 \defcounter {refsection}{0}\relax 
 \contentsline {figure}{\numberline {2.8}{\ignorespaces Flowchart of estimation of distribution algorithm \parencite {Ding2015451}\relax }}{23}{figure.caption.19}
 \defcounter {refsection}{0}\relax 
@@ -28,9 +28,9 @@
 \defcounter {refsection}{0}\relax 
 \contentsline {figure}{\numberline {3.3}{\ignorespaces Flowchart for time-frequency representation process\relax }}{32}{figure.caption.22}
 \defcounter {refsection}{0}\relax 
-\contentsline {figure}{\numberline {3.4}{\ignorespaces Diagram of hybrid music recommender\relax }}{36}{figure.caption.23}
+\contentsline {figure}{\numberline {3.4}{\ignorespaces Diagram of the hybrid music recommender\relax }}{36}{figure.caption.23}
 \defcounter {refsection}{0}\relax 
-\contentsline {figure}{\numberline {3.5}{\ignorespaces Diagram of hybrid music recommender\relax }}{37}{figure.caption.25}
+\contentsline {figure}{\numberline {3.5}{\ignorespaces Diagram of CDNN for music genre classification\nobreakspace {}\parencite {kereliuk15}\relax }}{37}{figure.caption.25}
 \defcounter {refsection}{0}\relax 
 \addvspace {10\p@ }
 \defcounter {refsection}{0}\relax 
Binary file Report/chiliguano_msc_finalproject.pdf has changed
Binary file Report/chiliguano_msc_finalproject.synctex.gz has changed
--- a/Report/chiliguano_msc_finalproject.toc	Tue Sep 01 11:29:38 2015 +0100
+++ b/Report/chiliguano_msc_finalproject.toc	Wed Sep 02 20:56:37 2015 +0100
@@ -42,17 +42,17 @@
 \defcounter {refsection}{0}\relax 
 \contentsline {subsubsection}{Hybrid music recommender}{16}{section*.13}
 \defcounter {refsection}{0}\relax 
-\contentsline {section}{\numberline {2.5}Deep Learning}{17}{section.2.5}
+\contentsline {section}{\numberline {2.5}Deep Learning}{18}{section.2.5}
 \defcounter {refsection}{0}\relax 
 \contentsline {subsection}{\numberline {2.5.1}Deep Neural Networks}{18}{subsection.2.5.1}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsubsection}{Music Feature Learning}{19}{section*.16}
+\contentsline {subsubsection}{Music Feature Learning}{20}{section*.16}
 \defcounter {refsection}{0}\relax 
 \contentsline {subsection}{\numberline {2.5.2}Convolutional Deep Neural Networks}{20}{subsection.2.5.2}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsubsection}{Deep content-based music recommendation}{21}{section*.18}
+\contentsline {subsubsection}{Deep content-based music recommendation}{22}{section*.18}
 \defcounter {refsection}{0}\relax 
-\contentsline {section}{\numberline {2.6}Estimation of Distribution Algorithms}{22}{section.2.6}
+\contentsline {section}{\numberline {2.6}Estimation of Distribution Algorithms}{23}{section.2.6}
 \defcounter {refsection}{0}\relax 
 \contentsline {subsection}{\numberline {2.6.1}A Hybrid Recommendation Model Based on EDA}{24}{subsection.2.6.1}
 \defcounter {refsection}{0}\relax 
@@ -76,52 +76,54 @@
 \defcounter {refsection}{0}\relax 
 \contentsline {subsection}{\numberline {3.2.2}Standardise time-frequency representation}{35}{subsection.3.2.2}
 \defcounter {refsection}{0}\relax 
-\contentsline {section}{\numberline {3.3}Algorithms}{35}{section.3.3}
+\contentsline {section}{\numberline {3.3}Algorithms}{36}{section.3.3}
 \defcounter {refsection}{0}\relax 
 \contentsline {subsection}{\numberline {3.3.1}Probability of music genre representation}{36}{subsection.3.3.1}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsubsection}{CDNN architecture}{36}{section*.24}
+\contentsline {subsubsection}{CDNN architecture}{37}{section*.24}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsubsection}{Learning parameters}{37}{section*.26}
+\contentsline {subsubsection}{Learning parameters}{38}{section*.26}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsection}{\numberline {3.3.2}User profile modelling}{38}{subsection.3.3.2}
+\contentsline {subsubsection}{Vector representation}{39}{section*.27}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsubsection}{Permutation EDA}{38}{section*.27}
+\contentsline {subsection}{\numberline {3.3.2}User profile modelling}{39}{subsection.3.3.2}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsubsection}{Continuous Univariate Marginal Distribution Algorithm}{38}{section*.28}
+\contentsline {subsubsection}{Modelling with Permutation EDA}{40}{section*.28}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsection}{\numberline {3.3.3}Top-N songs recommendation}{38}{subsection.3.3.3}
+\contentsline {subsubsection}{Modelling with $UMDA_c^G$}{40}{section*.29}
 \defcounter {refsection}{0}\relax 
-\contentsline {chapter}{\numberline {4}Experiments}{39}{chapter.4}
+\contentsline {subsection}{\numberline {3.3.3}Top-N songs recommendation}{41}{subsection.3.3.3}
 \defcounter {refsection}{0}\relax 
-\contentsline {section}{\numberline {4.1}Evaluation for recommender systems}{39}{section.4.1}
+\contentsline {subsubsection}{Top-N recommendations in CB baseline}{42}{section*.30}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsection}{\numberline {4.1.1}Types of experiments}{39}{subsection.4.1.1}
+\contentsline {subsubsection}{Top-N recommendations in hybrid music recommender}{43}{section*.31}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsection}{\numberline {4.1.2}Evaluation strategies}{40}{subsection.4.1.2}
+\contentsline {section}{\numberline {3.4}Summary}{43}{section.3.4}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsection}{\numberline {4.1.3}Decision based metrics}{41}{subsection.4.1.3}
+\contentsline {chapter}{\numberline {4}Experiments}{44}{chapter.4}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsubsection}{Precision}{41}{section*.29}
+\contentsline {section}{\numberline {4.1}Evaluation for recommender systems}{44}{section.4.1}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsubsection}{Recall}{41}{section*.30}
+\contentsline {subsection}{\numberline {4.1.1}Types of experiments}{44}{subsection.4.1.1}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsubsection}{F1}{41}{section*.31}
+\contentsline {subsection}{\numberline {4.1.2}Evaluation strategies}{45}{subsection.4.1.2}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsubsection}{Accuracy}{41}{section*.32}
+\contentsline {subsection}{\numberline {4.1.3}Decision based metrics}{46}{subsection.4.1.3}
 \defcounter {refsection}{0}\relax 
-\contentsline {section}{\numberline {4.2}Evaluation method}{42}{section.4.2}
+\contentsline {section}{\numberline {4.2}Evaluation method}{47}{section.4.2}
 \defcounter {refsection}{0}\relax 
-\contentsline {subsection}{\numberline {4.2.1}Training set and test set}{42}{subsection.4.2.1}
+\contentsline {subsection}{\numberline {4.2.1}Training set and test set}{47}{subsection.4.2.1}
 \defcounter {refsection}{0}\relax 
-\contentsline {chapter}{\numberline {5}Results}{43}{chapter.5}
+\contentsline {subsection}{\numberline {4.2.2}Top-N evaluation}{47}{subsection.4.2.2}
 \defcounter {refsection}{0}\relax 
-\contentsline {section}{\numberline {5.1}Genre classification results}{43}{section.5.1}
+\contentsline {chapter}{\numberline {5}Results}{48}{chapter.5}
 \defcounter {refsection}{0}\relax 
-\contentsline {section}{\numberline {5.2}Recommender evaluation results}{44}{section.5.2}
+\contentsline {section}{\numberline {5.1}Genre classification results}{48}{section.5.1}
 \defcounter {refsection}{0}\relax 
-\contentsline {chapter}{\numberline {6}Conclusion}{45}{chapter.6}
+\contentsline {section}{\numberline {5.2}Recommender evaluation results}{49}{section.5.2}
 \defcounter {refsection}{0}\relax 
-\contentsline {section}{\numberline {6.1}Future work}{45}{section.6.1}
+\contentsline {chapter}{\numberline {6}Conclusion}{51}{chapter.6}
 \defcounter {refsection}{0}\relax 
-\contentsline {chapter}{References}{46}{section.6.1}
+\contentsline {section}{\numberline {6.1}Future work}{52}{section.6.1}
+\defcounter {refsection}{0}\relax 
+\contentsline {chapter}{References}{53}{section.6.1}
--- a/Report/references.bib	Tue Sep 01 11:29:38 2015 +0100
+++ b/Report/references.bib	Wed Sep 02 20:56:37 2015 +0100
@@ -1,3 +1,12 @@
+@incollection{bengio2012practical,
+	title={Practical recommendations for gradient-based training of deep architectures},
+	author={Bengio, Yoshua},
+	booktitle={Neural Networks: Tricks of the Trade},
+	pages={437--478},
+	year={2012},
+	publisher={Springer}
+}
+
 @article{scikit-learn,
 	title={Scikit-learn: Machine Learning in {P}ython},
 	author={Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V.
--- a/slides/chiliguano_msc_project_slides.tex	Tue Sep 01 11:29:38 2015 +0100
+++ b/slides/chiliguano_msc_project_slides.tex	Wed Sep 02 20:56:37 2015 +0100
@@ -1,100 +1,309 @@
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Beamer Presentation
+% LaTeX Template
+% Version 1.0 (10/11/12)
+%
+% This template has been downloaded from:
+% http://www.LaTeXTemplates.com
+%
+% License:
+% CC BY-NC-SA 3.0 (http://creativecommons.org/licenses/by-nc-sa/3.0/)
+%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+%----------------------------------------------------------------------------------------
+%	PACKAGES AND THEMES
+%----------------------------------------------------------------------------------------
+
 \documentclass{beamer}
-%
-% Choose how your presentation looks.
-%
-% For more themes, color themes and font themes, see:
-% http://deic.uab.es/~iblanes/beamer_gallery/index_by_theme.html
-%
-\mode<presentation>
-{
-  \usetheme{Frankfurt}      % or try Darmstadt, Madrid, Warsaw, ...
-  \usecolortheme{rose} % or try albatross, beaver, crane, ...
-  \usecolortheme{seahorse}
-  \usefonttheme[onlymath]{serif}  % or try serif, structurebold, ...
-  \setbeamertemplate{navigation symbols}{}
-  \setbeamertemplate{caption}[numbered]
-} 
 
-\usepackage[english]{babel}
-\usepackage[utf8x]{inputenc}
+\mode<presentation> {
 
-\title[Your Short Title]{Hybrid music recommender using content-based and social information}
-\author{Paulo Esteban Chiliguano Torres}
-\institute{School of Electronic Engineering and Computer Science\\Queen Mary University of London}
-\date{September 3rd, 2015}
+% The Beamer class comes with a number of default slide themes
+% which change the colors and layouts of slides. Below this is a list
+% of all the themes, uncomment each in turn to see what they look like.
+
+%\usetheme{default}
+%\usetheme{AnnArbor}
+%\usetheme{Antibes}
+%\usetheme{Bergen}
+%\usetheme{Berkeley}
+%\usetheme{Berlin}
+%\usetheme{Boadilla}
+%\usetheme{CambridgeUS}
+%\usetheme{Copenhagen}
+%\usetheme{Darmstadt}
+%\usetheme{Dresden}
+%\usetheme{Frankfurt}
+%\usetheme{Goettingen}
+%\usetheme{Hannover}
+%\usetheme{Ilmenau}
+%\usetheme{JuanLesPins}
+%\usetheme{Luebeck}
+\usetheme{Madrid}
+%\usetheme{Malmoe}
+%\usetheme{Marburg}
+%\usetheme{Montpellier}
+%\usetheme{PaloAlto}
+%\usetheme{Pittsburgh}
+%\usetheme{Rochester}
+%\usetheme{Singapore}
+%\usetheme{Szeged}
+%\usetheme{Warsaw}
+
+% As well as themes, the Beamer class has a number of color themes
+% for any slide theme. Uncomment each of these in turn to see how it
+% changes the colors of your current slide theme.
+
+%\usecolortheme{albatross}
+%\usecolortheme{beaver}
+%\usecolortheme{beetle}
+%\usecolortheme{crane}
+%\usecolortheme{dolphin}
+%\usecolortheme{dove}
+%\usecolortheme{fly}
+%\usecolortheme{lily}
+%\usecolortheme{orchid}
+%\usecolortheme{rose}
+%\usecolortheme{seagull}
+\usecolortheme{seahorse}
+%\usecolortheme{whale}
+%\usecolortheme{wolverine}
+
+%\setbeamertemplate{footline} % To remove the footer line in all slides uncomment this line
+%\setbeamertemplate{footline}[page number] % To replace the footer line in all slides with a simple slide count uncomment this line
+
+\setbeamertemplate{navigation symbols}{} % To remove the navigation symbols from the bottom of all slides uncomment this line
+}
+
+\usepackage{graphicx} % Allows including images
+\usepackage{booktabs} % Allows the use of \toprule, \midrule and \bottomrule in tables
+
+%----------------------------------------------------------------------------------------
+%	TITLE PAGE
+%----------------------------------------------------------------------------------------
+
+\title[Hybrid music recommender]{Hybrid music recommender using content-based and social information} % The short title appears at the bottom of every slide, the full title is only on the title page
+
+\author{Paulo Esteban Chiliguano Torres} % Your name
+\institute[QMUL] % Your institution as it will appear on the bottom of every slide, may be shorthand to save space
+{School of Electronic Engineering and Computer Science\\
+Queen Mary University of London \\ % Your institution for the title page
+\medskip
+%\textit{john@smith.com} % Your email address
+}
+\date{September 1st, 2015} % Date, can be changed to a custom date
 
 \begin{document}
 
 \begin{frame}
-  \titlepage
+\titlepage % Print the title page as the first slide
 \end{frame}
 
-% Uncomment these lines for an automatically generated outline.
-\begin{frame}{Outline}
-  \tableofcontents
+\begin{frame}
+\frametitle{Outline} % Table of contents slide, comment this block out to remove it
+\tableofcontents % Throughout your presentation, if you choose to use \section{} and \subsection{} commands, these will automatically be printed on this slide as an outline of your presentation
 \end{frame}
 
-\section{Introduction}
+%----------------------------------------------------------------------------------------
+%	PRESENTATION SLIDES
+%----------------------------------------------------------------------------------------
 
-\begin{frame}{Introduction}
 
+\section{Motivation}
+\begin{frame}
+	\textit{``Music doesn't have any special meaning; it depends what it's attached to.''} (Oliver Sacks, 1933--2015)
+\end{frame}
+\begin{frame}
+\frametitle{Aim and Motivations}
+Design and implement a hybrid music recommender to mitigate the cold-start problem in a content-based recommendation strategy. 
 \begin{itemize}
-  \item Your introduction goes here!
-  \item Use \texttt{itemize} to organize your main points.
+\pause \item Implement a convolutional deep neural network (CDNN) to obtain a high-level representation of an audio file.
+\pause \item Investigate Estimation of Distribution Algorithms (EDAs) to model user profiles in terms of probabilities of music genre preferences.
+\end{itemize}
+\end{frame}
+
+
+\subsection{Related work}
+
+\begin{frame}
+\frametitle{Recommender Systems}
+Hybrid music recommender (Yoshii et al. 2008)
+\begin{itemize}
+	\item ``bag of timbres'' to represent acoustic features.
+	\item Three-way aspect model: ``unobserved'' genre
+\end{itemize}
+\pause Deep content-based music recommendation (van den Oord et al. 2013)
+\begin{itemize}
+	\item CDNN for latent vector representation
+	\item Million Song Dataset
+\end{itemize}
+\pause Hybrid recommender based on EDA (Liang et al. 2014)
+\begin{itemize}
+	\item TF-IDF for item attributes
+	\item Movielens dataset
+	\item MovieLens dataset
 \end{itemize}
 
-\vskip 1cm
 
-\begin{block}{Examples}
-Some examples of commonly used commands and features are included, to help you get started.
-\end{block}
 
 \end{frame}
 
-\section{Some \LaTeX{} Examples}
-
-\subsection{Tables and Figures}
-
-\begin{frame}{Tables and Figures}
-
+\section{Hybrid music recommendation}
+\subsection{Design}
+\begin{frame}
+\frametitle{Hybrid music recommender design}
+Fundamental tasks:
 \begin{itemize}
-\item Use \texttt{tabular} for basic tables --- see Table~\ref{tab:widgets}, for example.
-\item You can upload a figure (JPEG, PNG or PDF) using the files menu. 
-\item To include it in your document, use the \texttt{includegraphics} command (see the comment below in the source code).
+	\item User modelling
+	\item Information filtering
+\end{itemize}
+Required data:
+\begin{itemize}
+	\item User-item matrix: Taste Profile subset (53 users)
+	\item Audio clips: 7digital UK catalogue (640 clips)
+\end{itemize}
+Song representation:
+\begin{itemize}
+	\item 10-dimensional vector
+	\item Probability of belonging to each music genre (sketched below)
 \end{itemize}
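+% A minimal sketch of the encoding (notation ours), assuming the CDNN ends in
+% a softmax layer over the 10 GTZAN genres, so the entries sum to one:
+\centerline{$\mathbf{x}_s = (p_1, \ldots, p_{10}), \quad p_i = P(\mathrm{genre}_i \mid s), \quad \textstyle\sum_{i=1}^{10} p_i = 1$}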
 
-% Commands to include a figure:
-%\begin{figure}
-%\includegraphics[width=\textwidth]{your-figure's-file-name}
-%\caption{\label{fig:your-figure}Caption goes here.}
-%\end{figure}
 
-\begin{table}
-\centering
-\begin{tabular}{l|r}
-Item & Quantity \\\hline
-Widgets & 42 \\
-Gadgets & 13
-\end{tabular}
-\caption{\label{tab:widgets}An example table.}
-\end{table}
-
 \end{frame}
 
-\subsection{Mathematics}
-
-\begin{frame}{Readable Mathematics}
-
-Let $X_1, X_2, \ldots, X_n$ be a sequence of independent and identically distributed random variables with $\text{E}[X_i] = \mu$ and $\text{Var}[X_i] = \sigma^2 < \infty$, and let
-$$S_n = \frac{X_1 + X_2 + \cdots + X_n}{n}
-      = \frac{1}{n}\sum_{i}^{n} X_i$$
-denote their mean. Then as $n$ approaches infinity, the random variables $\sqrt{n}(S_n - \mu)$ converge in distribution to a normal $\mathcal{N}(0, \sigma^2)$.
-
+\subsection{Architecture}
+\begin{frame}
+	\frametitle{Hybrid music recommender approach}
+	\begin{itemize}
+		\item Feature augmentation: the output of one recommender is fed as an extra input feature to another
+		\item Meta-level: the model learned by one recommender becomes the input to another
+	\end{itemize}
+	\begin{figure}[ht!]
+		\centering
+		\includegraphics[width=\textwidth]{hybrid.png}
+	\end{figure}
 \end{frame}
 
-\section{Conclusions}
+\subsection{Item and user representation}
 \begin{frame}
-	a
+	\frametitle{Probability of music genre}
+	\begin{figure}[ht!]
+		\centering
+		\includegraphics[width=\textwidth]{CDNN.png}
+		\caption{CDNN for music genre classification (Kereliuk et al. 2015)}
+	\end{figure}
 \end{frame}
 
-\end{document}
+\begin{frame}
+	\frametitle{Estimation of Distribution Algorithms (EDAs)}
+	\begin{figure}[ht!]
+		\centering
+		\includegraphics[width=0.5\textwidth]{eda.png}
+		\caption{Flowchart for EDA (Ding et al. 2015)}
+	\end{figure}
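+	% A compact restatement of the flowchart loop, in standard EDA notation
+	% (our summary, not taken from the flowchart itself): select the fittest
+	% individuals, estimate a probability model from them, sample the next population.
+	\centerline{$D_l \xrightarrow{\text{select}} D_l^{Se} \xrightarrow{\text{estimate}} p_l(\mathbf{x}) \xrightarrow{\text{sample}} D_{l+1}$}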
+\end{frame}
+\begin{frame}
+	\frametitle{User profile modelling}
+	With permutation EDA:
+	\begin{itemize}
+	\item 10 genre tags (GTZAN) treated as keywords
+	\item 50 weights, evenly spaced over the interval $[0.1, 0.9]$ (see below)
+	\end{itemize}
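+	% A sketch of the weight grid, assuming both endpoints are included (notation ours):
+	\centerline{$w_k = 0.1 + (k - 1)\,\tfrac{0.9 - 0.1}{49}, \qquad k = 1, \ldots, 50$}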
+\end{frame}
+
+\begin{frame}
+	\frametitle{User profile modelling}
+	With continuous EDA (sketched below):
+	\begin{itemize}
+		\item Each genre considered as a dimension
+		\item Compute mean and covariance for each dimension along individuals
+		\item Sample from normal distribution
+	\end{itemize}
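+	% A sketch of the update step, assuming one independent Gaussian per genre
+	% dimension, fitted over the $N$ selected individuals (notation ours):
+	\begin{equation*}
+	\hat{\mu}_j = \frac{1}{N}\sum_{n=1}^{N} x_j^{(n)}, \qquad
+	\hat{\sigma}_j^2 = \frac{1}{N}\sum_{n=1}^{N} \bigl(x_j^{(n)} - \hat{\mu}_j\bigr)^2, \qquad
+	x_j^{\mathrm{new}} \sim \mathcal{N}\bigl(\hat{\mu}_j, \hat{\sigma}_j^2\bigr)
+	\end{equation*}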
+\end{frame}
+
+\section{Results}
+\subsection{Music genre classifier}
+\begin{frame}
+	\frametitle{Genre classification}
+\begin{table}[h!]
+	\caption{Genre classification results} % title of Table
+	\centering % used for centering table
+	\begin{tabular}{c c c c c} % five centred columns
+		\hline\hline
+		Trial & Validation error (\%) & Test error (\%) & Iterations & Time elapsed (min) \\ [0.5ex]
+		\hline
+		1 & 58.0 & 65.2 & 650 & 7.00 \\ % inserting body of the table
+		2 & 37.6 & 46.0 & 2150 & 13.07 \\
+		3 & 39.6 & 46.0 & 700 & 7.54 \\
+		4 & 35.6 & 36.8 & 550 & 6.01 \\
+		5 & 36.4 & 40.0 & 250 & 5.47 \\
+		6 & 40.4 & 44.8 & 150 & 5.41 \\
+		7 & 32.4 & 40.4 & 800 & 8.64 \\
+		8 & 36.0 & 38.8 & 250 & 5.42 \\
+		9 & 34.0 & 38.8 & 850 & 9.14 \\ [1ex] % [1ex] adds vertical space
+		\hline %inserts single line
+	\end{tabular}
+	\label{table:genre} % is used to refer this table in the text
+\end{table}
+\end{frame}
+
+\subsection{Hybrid recommender}
+\begin{frame}
+\frametitle{Top-N recommendation}
+\begin{figure}[ht!]
+	\centering
+	\includegraphics[width=0.9\textwidth]{a.png}
+\end{figure}
+\end{frame}
+
+
+
+%------------------------------------------------
+
+
+
+
+\section{Conclusions and future work}
+\begin{frame}
+\frametitle{Conclusions and future work}
+\begin{itemize}
+\item CDNNs produce results comparable to long-established music genre classifiers
+\item The hybrid recommender based on permutation EDA outperforms the CB baseline
+\item Future work: investigate unsupervised deep learning
+\item Future work: online evaluation of the recommender
+\end{itemize}
+\end{frame}
+
+
+
+%------------------------------------------------
+
+\begin{frame}
+\Huge{\centerline{Questions?}}
+\end{frame}
+
+%----------------------------------------------------------------------------------------
+
+\end{document} 
\ No newline at end of file