Repository: hybrid-music-recommender-using-content-based-and-social-information (Mercurial)
changeset 28:a95e656907c3
Updated report
author | Paulo Chiliguano <p.e.chiilguano@se14.qmul.ac.uk> |
---|---|
date | Mon, 31 Aug 2015 02:43:54 +0100 |
parents | ae650489d3a8 |
children | b1c54790ed97 |
files | Report/chapter1/introduction.tex Report/chapter2/background.tex Report/chapter3/ch3.tex Report/chapter4/evaluation.tex Report/chapter5/results.tex Report/chapter6/conclusions.tex Report/chiliguano_msc_finalproject.blg Report/chiliguano_msc_finalproject.lof Report/chiliguano_msc_finalproject.pdf Report/chiliguano_msc_finalproject.synctex.gz Report/chiliguano_msc_finalproject.tex Report/chiliguano_msc_finalproject.toc Report/references.bib |
diffstat | 13 files changed, 331 insertions(+), 89 deletions(-) |
--- a/Report/chapter1/introduction.tex Sun Aug 30 15:49:27 2015 +0100 +++ b/Report/chapter1/introduction.tex Mon Aug 31 02:43:54 2015 +0100 @@ -19,7 +19,7 @@ A convolutional deep neural network (CDNN), which is a deep learning model, is employed to describe the time-frequency content of each audio clip with a n-dimensional vector, whose dimensions represent the probability of a clip to belong to an specific music genre. In this project, we bound the number of music genres to 10. -As a primary contribution of this project, estimation of distribution algorithms (EDA) are investigated to model user profiles in terms of probabilities of music genres preferences. The algorithms use play count and the content vector of each song in the user's collection to optimise the profile. In addition, the content vector values is are treated as discrete and continuous variables, for evaluation purposes. +As a primary contribution of this project, estimation of distribution algorithms (EDAs) are investigated to model user profiles in terms of probabilities of music genres preferences. The algorithms use play count and the content vector of each song in the user's collection to optimise the profile. In addition, each dimension in the content vector is treated as a discrete and continuous variable, for evaluation purposes. To our knowledge, this is the first approach that uses a continuous EDA for user profile modelling in recommender systems. Each user profile then is compared with the vector representation of an audio clip to compute the similarity value between them. Recommendations for an user are built up by selecting the clips with highest similarity values. @@ -27,4 +27,4 @@ \section{Thesis outline} -The rest of the report is organised as follows: Chapter 2 provides an overview in recommender systems. Recommendation process, associated challenges, and related work based on state-of-the-art techniques are discussed. 
In Chapter 3, we present our proposed hybrid recommendation approach and describe the stages and algorithms in detail. The experiments and evaluation protocols are to assess the performance of the hybrid recommender presented in Chapter 4. We proceed to discuss and analyse the results from the conducted experiments to evaluate the proposed hybrid music recommender. In Chapter 6, we present the conclusions and some thoughts for further research. \ No newline at end of file +The rest of the report is organised as follows: Chapter~\ref{ch:background} provides an overview in recommender systems. Recommendation process, associated challenges, and related work based on state-of-the-art techniques are discussed. In Chapter~\ref{ch:methodology}, we present our proposed hybrid recommendation approach and describe the stages and algorithms in detail. The experiments and evaluation protocols are to assess the performance of the hybrid recommender presented in Chapter~\ref{ch:experiments}. In Chapter~\ref{ch:results}, we proceed to discuss and analyse the results from the conducted experiments to evaluate the proposed hybrid music recommender. In Chapter~\ref{ch:conclusion}, we present the conclusions and some thoughts for further research. \ No newline at end of file
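The introduction.tex hunk above describes comparing each user profile with the vector representation of an audio clip and recommending the clips with the highest similarity values. A minimal Python sketch of that ranking step could look as follows. It assumes cosine similarity (the measure the background chapter cites for the related movie recommender) and NumPy arrays; the function names `cosine_similarity` and `top_n_recommendations` and the toy data are hypothetical and are not taken from the repository.

```python
import numpy as np


def cosine_similarity(profile, clips):
    """Cosine similarity between one profile vector and each row of `clips`."""
    profile = np.asarray(profile, dtype=float)
    clips = np.asarray(clips, dtype=float)
    norms = np.linalg.norm(clips, axis=1) * np.linalg.norm(profile)
    # Guard against zero vectors: their similarity is defined here as 0 (assumption).
    safe = np.where(norms > 0, norms, 1.0)
    return np.where(norms > 0, clips.dot(profile) / safe, 0.0)


def top_n_recommendations(profile, clips, n=3):
    """Return the indices of the n clips most similar to the profile, best first."""
    sims = cosine_similarity(profile, clips)
    order = np.argsort(-sims, kind="stable")
    return order[:n].tolist()


# Toy example with 3 genres instead of the 10 used in the report.
clips = np.array([
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.8, 0.1, 0.1],
])
profile = np.array([1.0, 0.0, 0.0])
print(top_n_recommendations(profile, clips, n=2))
```

With this toy profile, which puts all its weight on the first genre, the two clips dominated by that genre (indices 0 and 3) rank highest.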
--- a/Report/chapter2/background.tex Sun Aug 30 15:49:27 2015 +0100 +++ b/Report/chapter2/background.tex Mon Aug 31 02:43:54 2015 +0100 @@ -1,4 +1,5 @@ \chapter{Background} +\label{ch:background} Recommender systems create opportunities and challenges for industry to understand consumption behaviour of users. In particular, for music industry, the development of recommender systems could improve digital music sales \parencite{ringen_2015}, and also, it could assist the listeners to discover new music through their habits \parencite{1_hypebot.com_2015}. However, when there is no priori information of a new introduced item in a recommender system, known as the \textit{cold-start problem}, popular songs could be favoured in recommendation process instead of items in the \textit{long tail}, i.e., songs that do not have enough ratings. Usually, content-based recommender systems are used to solve the cold-start problem because similarities between items are based on the content without regarding the ratings \parencite{Park200811}. Another solution to address the cold-start problem is to combine recommendation techniques to boost the strengths of each technique in an hybrid architecture. \parencite{melville2010recommender} In this chapter, we present the importance of online social networks and music services platforms for retrieving user-item information, in conjunction with related work on music recommender systems. Subsequently, a novel approach of an hybrid recommendation model based on estimation of distribution algorithms (EDAs) is introduced and examined. @@ -175,7 +176,7 @@ These advantages of deep learning methods enable us to learn abstractions from music low-level content in order to reduce the \textit{semantic gap}~\parencite{Celma2006} in MIR. Additionally, feature extraction does not require significant domain knowledge compared to \textit{hand-crafted} engineering. Nonetheless, deep learning implementations require a lot of data. 
\subsection{Deep Neural Networks} -A deep neural network (DNN) \parencite{hinton2012deep} is defined as a feed-forward artificial neural network with more than one layer of hidden units between the input and the output layer (see Figure~\ref{fig:dnn}). +A deep neural network (DNN) \parencite{hinton2012deep} is defined as a feed-forward artificial neural network (ANN), or multi-layer perceptron (MLP), with more than one layer of hidden units between the input and the output layer (see Figure~\ref{fig:dnn}). \begin{figure}[ht!] \centering \includegraphics[width=\textwidth]{chapter2/dnn.png} @@ -197,18 +198,28 @@ \end{equation} where \emph{k} is the number of classes. \subsubsection{Music Feature Learning} -\textcite{Sigtia20146959} examined and compared deep neural networks to discover features from the GTZAN dataset and the ISMIR 2004 genre classification dataset\footnote{http://ismir2004.ismir.net/genre\_contest/}, using rectifier linear units (ReLUs) and dropout regularisation. The GTZAN dataset was divided into four 50/25/25 train, validation, test parts. +\textcite{Sigtia20146959} examined and compared DNNs to discover features from the GTZAN dataset and the ISMIR 2004 genre classification dataset\footnote{http://ismir2004.ismir.net/genre\_contest/}, using rectifier linear units (ReLUs) and dropout regularisation. ReLUs provides better convergence without pre-training. Dropout regularisation reduces the problem of overfitting. -For each audio clip, they calculated the Fast Fourier Transform (FFT) on frames of length 1024 samples (22050 kHz sampling rate) with a window overlap of 50\%. Next, they used the magnitude of each FFT frame resulting in a 513 dimensional vector. And then, each feature dimension is normalised to have zero mean and unit standard deviation. +First, the GTZAN dataset was divided into four 50/25/25 train, validation, test parts. 
For each audio clip of the dataset, they calculated the Fast Fourier Transform (FFT) on frames of length 1,024 samples (22,050 kHz sampling rate) with a window overlap of 50\%. Next, they used the magnitude of each FFT frame resulting in a 513 dimensional vector. And then, each feature dimension is normalised to have zero mean and unit standard deviation. For the deep neural network, the 500 hidden units were trained with stochastic gradient descent (SGD) with a learning rate of 0.01, a patience of 10 and a dropout rate of 0.25. The system classifies the GTZAN data with an accuracy of 83$\pm$1.1\%, a value of the same order of results obtained with hand-crafted features. \subsection{Convolutional Deep Neural Networks} -A convolutional deep neural network (CDNN)~\parencite{Bengio-et-al-2015-Book} is a type of artificial neural network that uses convolution operation instead of matrix multiplication for processing data that has grid-like topology, designed to recognize visual patterns directly from pixel images. LeNet-5~\parencite{1_lecun_2015} (see Figure~\ref{fig:lenet}) is one model of convolutional network designed for recognition of handwritten and machine-printed characters. +Inspired in the behaviour of animal visual processing system~\parencite{1_deeplearning.net_2015}, a convolutional deep neural network (CDNN)~\parencite{Bengio-et-al-2015-Book, 1_ufldlstanfordedu_2015} is a type of MLP that uses convolution operation instead of matrix multiplication for processing data that has grid-like topology. CDNNs are designed to recognize visual patterns directly from pixel images. In general, the architecture of a CDNN is based on: +\begin{itemize} + \item \textbf{Sparse connectivity:} the inputs of hidden units in an upper layer \emph{m} are from a subset of units in a lower layer \emph{m$-$1}. + \item \textbf{Shared weights:} each hidden unit in a layer share the same weight vector and bias. The layer with this parametrisation form a \textit{feature map}. 
+ %\subitem To preserve the information about the input it is suggested to keeping the total number of activations (number of feature maps times number of pixel positions) + \item \textbf{Convolutional layers:} A feature map is obtained by convolution of the input image with a linear filter, adding a bias term and then applying a non-linear function. Each convolutional (hidden) layer is composed of multiple feature maps. + %\subitem The trick is thus to find the right level of “granularity” (i.e. filter shapes) in order to create abstractions at the proper scale, given a particular dataset. + \item \textbf{Max-pooling:} is a non-linear down-sampling to divide the input image into a set of non-overlapping rectangles and, for each rectangle, the maximum value is returned. + %Typical values are 2x2 or no max-pooling. Very large input images may warrant 4x4 pooling in the lower-layers. +\end{itemize} -\begin{figure}[h!] +LeNet-5~\parencite{1_lecun_2015} is one model of CDNN designed for recognition of handwritten and machine-printed characters. In Figure~\ref{fig:lenet}, the LeNet model is illustrated. The lower-layers are composed of convolution and max-pooling layers and the upper-layer is a fully-connected MLP. The input to MLP is the set of all features maps at the layer below. +\begin{figure}[ht!] \centering \includegraphics[width=1\textwidth]{chapter2/mylenet.png} \caption{Convolutional deep neural network LeNet model \parencite{1_deeplearning.net_2015}} @@ -216,18 +227,99 @@ \end{figure} \subsubsection{Deep content-based music recommendation} -\textcite{NIPS2013_5004} used a CDNN to predict latent vectors for each audio track in a content-based recommendation system. +\textcite{NIPS2013_5004} proposed to use a latent factor model for CB recommendation and the implementation of a CDNN to predict the latent factors from music audio. To obtain 50-dimension latent vectors, they used a weighted matrix factorisation (WMF) algorithm on the Taste Profile Subset. 
Also, they retrieved audio clips for over 99\% of the songs in the dataset from 7digital.com. -Weighted matrix factorisation -Demostrating a better performance compared to MFCC attributes -similar architecture CNN +To train the CDNN, the latent vectors obtained through the WMF are used as ground truth. The input of the CDNN is a log-compressed mel-spectrogram with 128 components computed from windows of 1,024 samples and a hop size of 512 samples (sampling rate of 22,050 Hz) for each audio clip. The duration of audio clips is limited to 3 seconds. + +They used 10-fold cross validation and obtained an average area under the ROC curve (AUC) of 0.86703 for prediction based on the latent factor vectors, outperforming the bag-of-timbres approach. \section{Estimation of Distribution Algorithms} -Estimation of distribution algorithms (EDAs) \parencite{pelikan2015estimation} are optimisation techniques by constructing a probabilistic model from a sample of solutions, generating a new population and leading to an optimal solution \parencite{Santana:Bielza:Larrañaga:Lozano:Echegoyen:Mendiburu:Armañanzas:Shakya:2009:JSSOBK:v35i07}. +Inspired in natural selection of species, an estimation of distribution algorithm (EDA) \parencite{pelikan2015estimation, Ding2015451, Santana:Bielza:Larrañaga:Lozano:Echegoyen:Mendiburu:Armañanzas:Shakya:2009:JSSOBK:v35i07} is an optimisation technique that estimates a probabilistic model from a sample of promising individuals, which is used to generate a new population and leading to an optimal solution of an objective function, called the \textit{fitness} function, until a termination criteria, e.g., maximisation, minimisation, maximum number of generations, is satisfied. These algorithms were applied to solve complex problems such as load balancing for mobile networks~\parencite{Hejazi15} or software reliability prediction~\parencite{Jin2014113}. In Figure~\ref{fig:eda} we show the general flowchart of an EDA. 
-\textcite{Liang2014781} exploited an EDA to model user profiles by using weighted featured vectors of keywords from a set of items that the user had rated above a threshold. +\begin{figure}[ht!] + \centering + \includegraphics[width=0.5\textwidth]{chapter2/eda.png} + \caption{Flowchart of estimation of distribution algorithm \parencite{Ding2015451}} + \label{fig:eda} +\end{figure} -These algorithms were applied in complex problems such as load balancing for mobile networks \parencite{Hejazi15} or software reliability prediction +According to~\textcite{pelikan2015estimation}, the main components of an EDA are a selection operator, a class of probabilistic models for modelling and sampling, and a replacement operator for combining the old population with the offspring. Also, regarding the types of distributions that an EDA are able to capture~\parencite{pelikan2015estimation} can be categorised in four broad groups: +\begin{itemize} + \item \textbf{Discrete variables EDAs}, where candidate solutions are represented by fixed-length strings of a finite cardinality. + \item \textbf{Permutation EDAs}, where candidate solutions are represented by permutations over a given set of elements. + \item \textbf{Real-valued vectors (continuous) EDAs}, where candidate solutions are mapped from real-valued variables into a discrete domain or the probabilistic model defined on real-valued variables are considered. + \item \textbf{Genetic programming EDAs}. +\end{itemize} + +The advantages of using EDAs include the discovery of problem-specific features or reducing the memory requirements. However, it is time consuming to build explicit probabilistic models.~\parencite{pelikan2015estimation} + +In our project, we investigate permutation EDAs and continuous EDAs for user profile modelling. 
+ +\subsection{A Hybrid Recommendation Model Based on EDA} +\textcite{Liang2014781} exploited a permutation EDA to model user profiles in an hybrid model for movie recommendation using the MovieLens 1M dataset\footnote{http://grouplens.org/datasets/movielens/}. A movie, \emph{i}, is described using keywords and weights vector, $t_i$, calculated by term frequency-inverse document frequency (TF-IDF) technique. A user is initially represented by a set, $S_u$, of \textit{(movie, rating)} tuples. The keywords of every $S_u$ set are embedded in a new set, $D_u$. + +The goal is to learn the user profile, $profile_u$, by minimisation of the fitness function, defined by Equation~\eqref{eq:fitness} +\begin{equation} +fitness(profile_u) =\sum_{i\in S_u}\log(r_{u,i}\times sim(profile_u,t_i)) +\label{eq:fitness} +\end{equation} +where, $r_{u,i}$ is the rating of the movie \emph{i} given by user \emph{u}, and $sim(profile_u,t_i)$ is computed by the cosine similarity coefficient, defined by Equation~\eqref{eq:cossim} +\begin{equation} +sim(profile_u, t_i)=cos(profile_u, t_i) =\frac{profile_u\cdot t_i}{\Vert profile_u\Vert\times\Vert t_i\Vert} +\label{eq:cossim} +\end{equation} + +The pseudocode of EDA implemented by \textcite{Liang2014781} is delineated by Algorithm~\ref{algo:hybrideda}, where MAXGEN is the maximum number of generations. +\begin{algorithm}[ht!] 
+ \caption{Calculate $profile_u$} + \begin{algorithmic} + \REQUIRE set $D_u$, weights $w_{n,i}$ + \REQUIRE population size $N$, MAXGEN + %\ENSURE $y = x^n$ + \STATE Random selection of keywords $k_n$ from $D_u$ + \STATE Assign a weight $w_{n,i}$ to each $k_n$ to build a set~$K_u$ of size~$N$ + \STATE Assign a probability $c_{n,i}=1/N$ to each $(k_n,w_{n,i})$ + \STATE Generate initial population of $profile_u$ by Monte Carlo method + \WHILE{$generation <$ MAXGEN} + \STATE Compute each $fitness(profile_u)$ + \STATE Rank individuals by their fitness value + \STATE Select top $M < N$ individuals + \STATE Update $c_{n,i}$ by counting the occurrences of $(k_n,w_{n,i})$ on the $M$ individuals + \STATE Generate $profile_u$ by random sampling according to updated $c_{n,i}$ + \ENDWHILE + \RETURN $profile_u$ + \end{algorithmic} + \label{algo:hybrideda} +\end{algorithm} + +\subsection{Continuous Univariate Marginal Distribution Algorithm} + +\textcite{gallagher2007bayesian} presented the continuous univariate marginal distribution algorithm ($UMDA_c^G$) as an extension of a discrete variable EDA. The general pseudocode of the ($UMDA_c^G$) is delineated in Algorithm~\ref{algo:umda}, where $x_i\in \textbf{x}$ represent the \emph{i} parameter of \textbf{\emph{x}} individual solution. + +\begin{algorithm}[ht!] 
+ \caption{Framework for $UMDA_c^G$} + \begin{algorithmic} + \REQUIRE population size $M$ + \REQUIRE selection parameter $\tau$ + %\ENSURE $y = x^n$ + \STATE $t \leftarrow 0$ + \STATE Generate $M$ individuals at random + \WHILE{$t <$ stopping criteria} + \STATE $M_{sel}\leftarrow M\cdot\tau$ + \STATE Select $M_{sel}$ individuals + \STATE $\mu_{i,t}\leftarrow\frac{1}{M_{sel}}\sum_{j=1}^{M_{sel}}x_i^j$ + \STATE $\sigma_{i,t}^2\leftarrow\frac{1}{M_{sel}-1}\sum_{j=1}^{M_{sel}}(x_i^j-\mu_{i,t})^2$ + \STATE $p_t({x_{i}}\vert \mu_{i,t},\sigma_{i,t}^2)\leftarrow\frac{1}{\sqrt{2\pi}\sigma_{i,t}}\exp(-\frac{1}{2}(\frac{x_i-\mu_{i,t}}{\sigma_{i,t}})^2)$ + \STATE Sample $M$ individuals from $p_t({x_{i}}\vert \mu_{i,t},\sigma_{i,t}^2)$ + \STATE $t\leftarrow t+1$ + \ENDWHILE + \RETURN $profile_u$ + \end{algorithmic} + \label{algo:umda} +\end{algorithm} + + +To our knowledge, this is the first work to use a continuous EDA for user profile modelling in a recommender system. \section{Summary} In this chapter, previous work on recommender systems has been reviewed and novelty techniques to representing acoustical features and to model user profiles has been presented. The next steps are to collect the dataset by crawling online social information, to extract the acoustical features of a collection of songs to represent them as n-dimensional vectors, to model the user profiles by using EDAs, and therefore, to return a list of song recommendations. \ No newline at end of file
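The $UMDA_c^G$ listing added in the background.tex hunk above can be sketched in Python as shown below. This is an illustrative sketch under stated assumptions, not the implementation used in the repository: the function name `umda_c_gaussian`, truncation selection, a fixed number of generations as the stopping criterion, uniform initialisation in [0, 1), the small floor added to the standard deviation, and the toy fitness function are all choices made for this example.

```python
import numpy as np


def umda_c_gaussian(fitness, dim, pop_size=100, tau=0.5, generations=50, seed=0):
    """Maximise `fitness` with a univariate Gaussian EDA (UMDA_c^G-style sketch)."""
    rng = np.random.RandomState(seed)
    # Generate M individuals at random (uniform in [0, 1) is an assumption).
    population = rng.uniform(0.0, 1.0, size=(pop_size, dim))
    # M_sel = M * tau; at least 2 so the unbiased variance is defined.
    n_sel = max(2, int(pop_size * tau))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in population])
        # Truncation selection (assumption): keep the M_sel fittest individuals.
        selected = population[np.argsort(-scores)[:n_sel]]
        mu = selected.mean(axis=0)
        # Unbiased variance (divide by M_sel - 1), as in the algorithm listing;
        # the tiny floor avoids a zero standard deviation when sampling.
        sigma = np.sqrt(selected.var(axis=0, ddof=1)) + 1e-12
        # Sample M new individuals from independent Gaussians, one per dimension.
        population = rng.normal(mu, sigma, size=(pop_size, dim))
    scores = np.array([fitness(ind) for ind in population])
    return population[np.argmax(scores)]


# Toy fitness (hypothetical): negative squared distance to a target vector.
target = np.array([0.5, 0.3, 0.2])


def toy_fitness(x):
    return -np.sum((x - target) ** 2)


best = umda_c_gaussian(toy_fitness, dim=3)
print(best)
```

In the report's setting the fitness would instead score a candidate user profile against the songs in the user's collection, for example in the spirit of Equation `eq:fitness`; the toy fitness here only demonstrates that the estimate-and-sample loop moves the population towards a known optimum.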
--- a/Report/chapter3/ch3.tex Sun Aug 30 15:49:27 2015 +0100 +++ b/Report/chapter3/ch3.tex Mon Aug 31 02:43:54 2015 +0100 @@ -1,4 +1,5 @@ \chapter{Methodology} +\label{ch:methodology} The methodology used to develop our hybrid music recommender consists of three main stages. First, the collection of real world user-item data corresponding to the play counts of specific songs and the fetching of audio clips of the unique identified songs in the dataset. Secondly, the implementation of the deep learning algorithm to represent the audio clips in terms of music genre probabilities as n-dimensional vectors. Finally, we investigate estimation of distribution algorithms to model user profiles based on the rated songs above a threshold. Every stage of our hybrid recommender is entirely done in Python 2.7\footnote{https://www.python.org/download/releases/2.7/}. \section{Data collection} @@ -6,10 +7,9 @@ \subsection{Taste Profile subset cleaning} Due to potential mismatches\footnote{http://labrosa.ee.columbia.edu/millionsong/blog/12-2-12-fixing-matching-errors} between song ID and track ID on the Echo Nest database, it is required to filter out the wrong matches in the Taste Profile subset. The cleaning process is illustrated in Figure~\ref{fig:taste_profile} - -\begin{figure}[h!] +\begin{figure}[ht!] \centering - \includegraphics[width=1\textwidth]{chapter3/taste_profile.png} + \includegraphics[width=0.8\textwidth]{chapter3/taste_profile.png} \caption{Cleaning of the taste profile subset} \label{fig:taste_profile} \end{figure}
--- a/Report/chapter4/evaluation.tex Sun Aug 30 15:49:27 2015 +0100 +++ b/Report/chapter4/evaluation.tex Mon Aug 31 02:43:54 2015 +0100 @@ -1,5 +1,5 @@ \chapter{Experiments} - +\label{ch:experiments} In order to evaluate the performance of a recommender system, there are several scenarios to be considered depending on the structure of the dataset and the prediction accuracy. It is therefore necessary to determine a suitable experiment for the evaluation of our proposed hybrid music recommender that employs a rating matrix and vector representation of songs as inputs to produce \textit{top-N} song recommendations. In addition, the performance of our hybrid approaches is compared with a pure content-based recommender algorithm.
--- a/Report/chapter5/results.tex Sun Aug 30 15:49:27 2015 +0100 +++ b/Report/chapter5/results.tex Mon Aug 31 02:43:54 2015 +0100 @@ -1,4 +1,5 @@ \chapter{Results} +\label{ch:results} %\begin{minipage}{\textwidth} %\centering %\includegraphics[scale=.6]{"Project Images/Figure_GA02".png}
--- a/Report/chapter6/conclusions.tex Sun Aug 30 15:49:27 2015 +0100 +++ b/Report/chapter6/conclusions.tex Mon Aug 31 02:43:54 2015 +0100 @@ -1,4 +1,5 @@ \chapter{Conclusion} +\label{ch:conclusion} Data is not strong enough \section{Future work}
--- a/Report/chiliguano_msc_finalproject.blg Sun Aug 30 15:49:27 2015 +0100 +++ b/Report/chiliguano_msc_finalproject.blg Mon Aug 31 02:43:54 2015 +0100 @@ -21,15 +21,15 @@ Reallocated singl_function (elt_size=4) to 100 items from 50. Reallocated singl_function (elt_size=4) to 100 items from 50. Reallocated singl_function (elt_size=4) to 100 items from 50. -Reallocated field_info (elt_size=4) to 11116 items from 5000. +Reallocated field_info (elt_size=4) to 11672 items from 5000. Database file #1: chiliguano_msc_finalproject-blx.bib Database file #2: references.bib Warning--I'm ignoring Putzke2014519's extra "keywords" field ---line 208 of file references.bib +--line 251 of file references.bib Warning--I'm ignoring Putzke2014519's extra "keywords" field ---line 209 of file references.bib +--line 252 of file references.bib Warning--I'm ignoring Putzke2014519's extra "keywords" field ---line 210 of file references.bib +--line 253 of file references.bib Biblatex version: 3.0 Name 1 in "Hypebot.com," has a comma at the end for entry 1_hypebot.com_2015 while executing---line 2513 of file biblatex.bst @@ -103,6 +103,30 @@ while executing---line 2513 of file biblatex.bst Name 1 in "Deeplearning.net," has a comma at the end for entry 1_deeplearning.net_2015 while executing---line 2513 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2513 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2513 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2513 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2513 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2513 of 
file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2513 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2513 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2513 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2513 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2513 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2513 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2513 of file biblatex.bst Name 1 in "Hypebot.com," has a comma at the end for entry 1_hypebot.com_2015 while executing---line 2517 of file biblatex.bst Name 1 in "Hypebot.com," has a comma at the end for entry 1_hypebot.com_2015 @@ -121,6 +145,12 @@ while executing---line 2517 of file biblatex.bst Name 1 in "Deeplearning.net," has a comma at the end for entry 1_deeplearning.net_2015 while executing---line 2517 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2517 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2517 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2517 of file biblatex.bst Name 1 in "Blog.seagatesoft.com," has a comma at the end for entry 1_blogseagatesoftcom_2015 while executing---line 2523 of file biblatex.bst Name 1 in 
"Blog.seagatesoft.com," has a comma at the end for entry 1_blogseagatesoftcom_2015 @@ -139,6 +169,24 @@ while executing---line 2523 of file biblatex.bst Name 1 in "Hypebot.com," has a comma at the end for entry 1_hypebot.com_2015 while executing---line 2523 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2523 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2523 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2523 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2527 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2527 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2527 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2527 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2527 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2527 of file biblatex.bst Name 1 in "Hypebot.com," has a comma at the end for entry 1_hypebot.com_2015 while executing---line 2527 of file biblatex.bst Name 1 in "Hypebot.com," has a comma at the end for entry 1_hypebot.com_2015 @@ -193,6 +241,12 @@ while executing---line 2531 of file biblatex.bst Name 1 in "Hypebot.com," has a comma at the end for entry 1_hypebot.com_2015 while executing---line 2531 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for 
entry 1_ufldlstanfordedu_2015 +while executing---line 2531 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2531 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2531 of file biblatex.bst Name 1 in "Blog.seagatesoft.com," has a comma at the end for entry 1_blogseagatesoftcom_2015 while executing---line 2537 of file biblatex.bst Name 1 in "Blog.seagatesoft.com," has a comma at the end for entry 1_blogseagatesoftcom_2015 @@ -211,6 +265,24 @@ while executing---line 2537 of file biblatex.bst Name 1 in "Hypebot.com," has a comma at the end for entry 1_hypebot.com_2015 while executing---line 2537 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2537 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2537 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2537 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2541 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2541 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2541 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2541 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015 +while executing---line 2541 of file biblatex.bst +Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 
1_ufldlstanfordedu_2015
+while executing---line 2541 of file biblatex.bst
 Name 1 in "Hypebot.com," has a comma at the end for entry 1_hypebot.com_2015
 while executing---line 2541 of file biblatex.bst
 Name 1 in "Hypebot.com," has a comma at the end for entry 1_hypebot.com_2015
@@ -265,6 +337,12 @@
 while executing---line 2595 of file biblatex.bst
 Name 1 in "Deeplearning.net," has a comma at the end for entry 1_deeplearning.net_2015
 while executing---line 2595 of file biblatex.bst
+Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015
+while executing---line 2595 of file biblatex.bst
+Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015
+while executing---line 2595 of file biblatex.bst
+Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015
+while executing---line 2595 of file biblatex.bst
 Reallocated singl_function (elt_size=4) to 100 items from 50.
 Reallocated wiz_functions (elt_size=4) to 9000 items from 6000.
 Name 1 in "Blog.seagatesoft.com," has a comma at the end for entry 1_blogseagatesoftcom_2015
@@ -321,45 +399,63 @@
 while executing---line 2659 of file biblatex.bst
 Name 1 in "Hypebot.com," has a comma at the end for entry 1_hypebot.com_2015
 while executing---line 2659 of file biblatex.bst
-You've used 44 entries,
+Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015
+while executing---line 2659 of file biblatex.bst
+Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015
+while executing---line 2659 of file biblatex.bst
+Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015
+while executing---line 2659 of file biblatex.bst
+Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015
+while executing---line 2659 of file biblatex.bst
+Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015
+while executing---line 2659 of file biblatex.bst
+Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015
+while executing---line 2659 of file biblatex.bst
+Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015
+while executing---line 2659 of file biblatex.bst
+Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015
+while executing---line 2659 of file biblatex.bst
+Name 1 in "Ufldl.stanford.edu," has a comma at the end for entry 1_ufldlstanfordedu_2015
+while executing---line 2659 of file biblatex.bst
+You've used 48 entries,
  6047 wiz_defined-function locations,
- 1447 strings with 25348 characters,
-and the built_in function-call counts, 155530 in all, are:
-= -- 5278
-> -- 6585
-< -- 1386
-+ -- 3193
-- -- 3553
-* -- 12948
-:= -- 10546
+ 1474 strings with 26412 characters,
+and the built_in function-call counts, 169804 in all, are:
+= -- 5715
+> -- 7284
+< -- 1512
++ -- 3554
+- -- 3970
+* -- 14112
+:= -- 11579
 add.period$ -- 0
-call.type$ -- 44
-change.case$ -- 635
-chr.to.int$ -- 191
-cite$ -- 86
-duplicate$ -- 17340
-empty$ -- 15936
-format.name$ -- 3370
-if$ -- 33222
+call.type$ -- 48
+change.case$ -- 682
+chr.to.int$ -- 212
+cite$ -- 94
+duplicate$ -- 18879
+empty$ -- 17377
+format.name$ -- 3662
+if$ -- 36209
 int.to.chr$ -- 0
-int.to.str$ -- 100
+int.to.str$ -- 109
 missing$ -- 0
-newline$ -- 1536
-num.names$ -- 1909
-pop$ -- 14117
+newline$ -- 1665
+num.names$ -- 2080
+pop$ -- 15408
 preamble$ -- 1
-purify$ -- 789
+purify$ -- 849
 quote$ -- 0
-skip$ -- 8181
+skip$ -- 8915
 stack$ -- 0
-substring$ -- 3007
-swap$ -- 6157
-text.length$ -- 1287
-text.prefix$ -- 43
+substring$ -- 3256
+swap$ -- 6714
+text.length$ -- 1411
+text.prefix$ -- 47
 top$ -- 1
-type$ -- 1651
+type$ -- 1803
 warning$ -- 0
-while$ -- 947
+while$ -- 1040
 width$ -- 0
-write$ -- 1491
-(There were 144 error messages)
+write$ -- 1616
+(There were 192 error messages)
--- a/Report/chiliguano_msc_finalproject.lof Sun Aug 30 15:49:27 2015 +0100
+++ b/Report/chiliguano_msc_finalproject.lof Mon Aug 31 02:43:54 2015 +0100
@@ -18,11 +18,13 @@
 \defcounter {refsection}{0}\relax
 \contentsline {figure}{\numberline {2.7}{\ignorespaces Schematic representation of a deep neural network\nobreakspace {}\parencite {1_brown_2014}\relax }}{19}{figure.caption.16}
 \defcounter {refsection}{0}\relax
-\contentsline {figure}{\numberline {2.8}{\ignorespaces Convolutional deep neural network LeNet model \parencite {1_deeplearning.net_2015}\relax }}{21}{figure.caption.18}
+\contentsline {figure}{\numberline {2.8}{\ignorespaces Convolutional deep neural network LeNet model \parencite {1_deeplearning.net_2015}\relax }}{22}{figure.caption.18}
+\defcounter {refsection}{0}\relax
+\contentsline {figure}{\numberline {2.9}{\ignorespaces Flowchart of estimation of distribution algorithm \parencite {Ding2015451}\relax }}{23}{figure.caption.20}
 \defcounter {refsection}{0}\relax
 \addvspace {10\p@ }
 \defcounter {refsection}{0}\relax
-\contentsline {figure}{\numberline {3.1}{\ignorespaces Cleaning of the taste profile subset\relax }}{24}{figure.caption.20}
+\contentsline {figure}{\numberline {3.1}{\ignorespaces Cleaning of the taste profile subset\relax }}{29}{figure.caption.21}
 \defcounter {refsection}{0}\relax
 \addvspace {10\p@ }
 \defcounter {refsection}{0}\relax
--- a/Report/chiliguano_msc_finalproject.tex Sun Aug 30 15:49:27 2015 +0100
+++ b/Report/chiliguano_msc_finalproject.tex Mon Aug 31 02:43:54 2015 +0100
@@ -34,6 +34,9 @@
 \usepackage{adjustbox}
+\usepackage{algorithm}
+\usepackage{algorithmic}
+
 \begin{document}
 %\setlength{\TPHorizModule}{200mm}
--- a/Report/chiliguano_msc_finalproject.toc Sun Aug 30 15:49:27 2015 +0100
+++ b/Report/chiliguano_msc_finalproject.toc Mon Aug 31 02:43:54 2015 +0100
@@ -50,68 +50,72 @@
 \defcounter {refsection}{0}\relax
 \contentsline {subsection}{\numberline {2.5.2}Convolutional Deep Neural Networks}{20}{subsection.2.5.2}
 \defcounter {refsection}{0}\relax
-\contentsline {subsubsection}{Deep content-based music recommendation}{21}{section*.19}
+\contentsline {subsubsection}{Deep content-based music recommendation}{22}{section*.19}
 \defcounter {refsection}{0}\relax
-\contentsline {section}{\numberline {2.6}Estimation of Distribution Algorithms}{21}{section.2.6}
+\contentsline {section}{\numberline {2.6}Estimation of Distribution Algorithms}{23}{section.2.6}
 \defcounter {refsection}{0}\relax
-\contentsline {section}{\numberline {2.7}Summary}{22}{section.2.7}
+\contentsline {subsection}{\numberline {2.6.1}A Hybrid Recommendation Model Based on EDA}{24}{subsection.2.6.1}
 \defcounter {refsection}{0}\relax
-\contentsline {chapter}{\numberline {3}Methodology}{23}{chapter.3}
+\contentsline {subsection}{\numberline {2.6.2}Continuous Univariate Marginal Distribution Algorithm}{26}{subsection.2.6.2}
 \defcounter {refsection}{0}\relax
-\contentsline {section}{\numberline {3.1}Data collection}{23}{section.3.1}
+\contentsline {section}{\numberline {2.7}Summary}{26}{section.2.7}
 \defcounter {refsection}{0}\relax
-\contentsline {subsection}{\numberline {3.1.1}Taste Profile subset cleaning}{24}{subsection.3.1.1}
+\contentsline {chapter}{\numberline {3}Methodology}{28}{chapter.3}
 \defcounter {refsection}{0}\relax
-\contentsline {subsection}{\numberline {3.1.2}Fetching audio data}{25}{subsection.3.1.2}
+\contentsline {section}{\numberline {3.1}Data collection}{28}{section.3.1}
 \defcounter {refsection}{0}\relax
-\contentsline {subsection}{\numberline {3.1.3}Intermediate time-frequency representation for audio signals}{26}{subsection.3.1.3}
+\contentsline {subsection}{\numberline {3.1.1}Taste Profile subset cleaning}{29}{subsection.3.1.1}
 \defcounter {refsection}{0}\relax
-\contentsline {section}{\numberline {3.2}Data preprocessing}{27}{section.3.2}
+\contentsline {subsection}{\numberline {3.1.2}Fetching audio data}{30}{subsection.3.1.2}
 \defcounter {refsection}{0}\relax
-\contentsline {section}{\numberline {3.3}Algorithms}{27}{section.3.3}
+\contentsline {subsection}{\numberline {3.1.3}Intermediate time-frequency representation for audio signals}{31}{subsection.3.1.3}
 \defcounter {refsection}{0}\relax
-\contentsline {subsection}{\numberline {3.3.1}Music genre classifier}{27}{subsection.3.3.1}
+\contentsline {section}{\numberline {3.2}Data preprocessing}{32}{section.3.2}
 \defcounter {refsection}{0}\relax
-\contentsline {subsubsection}{CNN network architecture}{27}{section*.21}
+\contentsline {section}{\numberline {3.3}Algorithms}{32}{section.3.3}
 \defcounter {refsection}{0}\relax
-\contentsline {subsection}{\numberline {3.3.2}User profile modelling}{28}{subsection.3.3.2}
+\contentsline {subsection}{\numberline {3.3.1}Music genre classifier}{32}{subsection.3.3.1}
 \defcounter {refsection}{0}\relax
-\contentsline {subsubsection}{Permutation EDA}{28}{section*.22}
+\contentsline {subsubsection}{CNN network architecture}{32}{section*.22}
 \defcounter {refsection}{0}\relax
-\contentsline {subsubsection}{Continuous Univariate Marginal Distribution Algorithm}{28}{section*.23}
+\contentsline {subsection}{\numberline {3.3.2}User profile modelling}{33}{subsection.3.3.2}
 \defcounter {refsection}{0}\relax
-\contentsline {subsection}{\numberline {3.3.3}Song recommendation}{28}{subsection.3.3.3}
+\contentsline {subsubsection}{Permutation EDA}{33}{section*.23}
 \defcounter {refsection}{0}\relax
-\contentsline {chapter}{\numberline {4}Experiments}{29}{chapter.4}
+\contentsline {subsubsection}{Continuous Univariate Marginal Distribution Algorithm}{33}{section*.24}
 \defcounter {refsection}{0}\relax
-\contentsline {section}{\numberline {4.1}Evaluation for recommender systems}{29}{section.4.1}
+\contentsline {subsection}{\numberline {3.3.3}Song recommendation}{33}{subsection.3.3.3}
 \defcounter {refsection}{0}\relax
-\contentsline {subsection}{\numberline {4.1.1}Types of experiments}{29}{subsection.4.1.1}
+\contentsline {chapter}{\numberline {4}Experiments}{34}{chapter.4}
 \defcounter {refsection}{0}\relax
-\contentsline {subsection}{\numberline {4.1.2}Evaluation strategies}{30}{subsection.4.1.2}
+\contentsline {section}{\numberline {4.1}Evaluation for recommender systems}{34}{section.4.1}
 \defcounter {refsection}{0}\relax
-\contentsline {subsection}{\numberline {4.1.3}Decision based metrics}{31}{subsection.4.1.3}
+\contentsline {subsection}{\numberline {4.1.1}Types of experiments}{34}{subsection.4.1.1}
 \defcounter {refsection}{0}\relax
-\contentsline {subsubsection}{Precision}{31}{section*.24}
+\contentsline {subsection}{\numberline {4.1.2}Evaluation strategies}{35}{subsection.4.1.2}
 \defcounter {refsection}{0}\relax
-\contentsline {subsubsection}{Recall}{31}{section*.25}
+\contentsline {subsection}{\numberline {4.1.3}Decision based metrics}{36}{subsection.4.1.3}
 \defcounter {refsection}{0}\relax
-\contentsline {subsubsection}{F1}{31}{section*.26}
+\contentsline {subsubsection}{Precision}{36}{section*.25}
 \defcounter {refsection}{0}\relax
-\contentsline {subsubsection}{Accuracy}{31}{section*.27}
+\contentsline {subsubsection}{Recall}{36}{section*.26}
 \defcounter {refsection}{0}\relax
-\contentsline {section}{\numberline {4.2}Evaluation method}{32}{section.4.2}
+\contentsline {subsubsection}{F1}{36}{section*.27}
 \defcounter {refsection}{0}\relax
-\contentsline {subsection}{\numberline {4.2.1}Training set and test set}{32}{subsection.4.2.1}
+\contentsline {subsubsection}{Accuracy}{36}{section*.28}
 \defcounter {refsection}{0}\relax
-\contentsline {chapter}{\numberline {5}Results}{33}{chapter.5}
+\contentsline {section}{\numberline {4.2}Evaluation method}{37}{section.4.2}
 \defcounter {refsection}{0}\relax
-\contentsline {section}{\numberline {5.1}Genre classification results}{33}{section.5.1}
+\contentsline {subsection}{\numberline {4.2.1}Training set and test set}{37}{subsection.4.2.1}
 \defcounter {refsection}{0}\relax
-\contentsline {section}{\numberline {5.2}Recommender evaluation results}{34}{section.5.2}
+\contentsline {chapter}{\numberline {5}Results}{38}{chapter.5}
 \defcounter {refsection}{0}\relax
-\contentsline {chapter}{\numberline {6}Conclusion}{35}{chapter.6}
+\contentsline {section}{\numberline {5.1}Genre classification results}{38}{section.5.1}
 \defcounter {refsection}{0}\relax
-\contentsline {section}{\numberline {6.1}Future work}{35}{section.6.1}
+\contentsline {section}{\numberline {5.2}Recommender evaluation results}{39}{section.5.2}
 \defcounter {refsection}{0}\relax
-\contentsline {chapter}{References}{36}{section.6.1}
+\contentsline {chapter}{\numberline {6}Conclusion}{40}{chapter.6}
+\defcounter {refsection}{0}\relax
+\contentsline {section}{\numberline {6.1}Future work}{40}{section.6.1}
+\defcounter {refsection}{0}\relax
+\contentsline {chapter}{References}{41}{section.6.1}
--- a/Report/references.bib Sun Aug 30 15:49:27 2015 +0100
+++ b/Report/references.bib Mon Aug 31 02:43:54 2015 +0100
@@ -1,3 +1,46 @@
+@inproceedings{gallagher2007bayesian,
+  title={Bayesian inference in estimation of distribution algorithms},
+  author={Gallagher, Marcus and Wood, Ian and Keith, Jonathan and Sofronov, George},
+  booktitle={Evolutionary Computation, 2007. CEC 2007. IEEE Congress on},
+  pages={127--133},
+  year={2007},
+  organization={IEEE}
+}
+@ARTICLE{Ding2015451,
+  author={Ding, C. and Ding, L. and Peng, W.},
+  title={Comparison of effects of different learning methods on estimation of distribution algorithms},
+  journal={Journal of Software Engineering},
+  year={2015},
+  volume={9},
+  number={3},
+  pages={451-468},
+  doi={10.3923/jse.2015.451.468},
+  note={cited By 0},
+  url={http://www.scopus.com/inward/record.url?eid=2-s2.0-84924609049&partnerID=40&md5=e6419e97e218f8ef1600e3d21e6a9e36},
+  document_type={Article},
+  source={Scopus},
+}
+
+@ARTICLE{Jin2014113,
+  author={Jin, C. and Jin, S.-W.},
+  title={Software reliability prediction model based on support vector regression with improved estimation of distribution algorithms},
+  journal={Applied Soft Computing Journal},
+  year={2014},
+  volume={15},
+  pages={113-120},
+  doi={10.1016/j.asoc.2013.10.016},
+  note={cited By 0},
+  url={http://www.scopus.com/inward/record.url?eid=2-s2.0-84889065631&partnerID=40&md5=6ba595eee679fa8355329646504b3ae3},
+  document_type={Article},
+  source={Scopus},
+}
+@online{1_ufldlstanfordedu_2015,
+  author={Ufldl.stanford.edu,},
+  title={Unsupervised Feature Learning and Deep Learning Tutorial},
+  url={http://ufldl.stanford.edu/tutorial/supervised/ConvolutionalNeuralNetwork/},
+  urldate={2015-8-30},
+  year={2015}
+}
 @online{1_lecun_2015,
   author={LeCun, Yann},
   title={MNIST Demos on Yann LeCun's website},