# HG changeset patch
# User alo
# Date 1466034054 -3600
# Node ID 483df26a74d87dfa9005aaf4c8f11f6bfce99dae
# Parent  e5f5ddcb72d2d009c9f4b51af33151bca807e60c
minor corrections and changes to rebuttal

diff -r e5f5ddcb72d2 -r 483df26a74d8 musicweb.tex
--- a/musicweb.tex	Thu Jun 16 00:18:12 2016 +0100
+++ b/musicweb.tex	Thu Jun 16 00:40:54 2016 +0100
@@ -229,6 +229,13 @@
 	\label{fig:layers}
 \end{figure}
 
+\begin{figure}[!ht]
+	\centering
+	\includegraphics[scale=0.5]{graphics/mw_flow.pdf}%\vspace{-5pt}
+	\caption{MusicWeb architecture}\vspace{-10pt}
+	\label{fig:architecture}
+\end{figure}
+
 The global MusicBrainz identifiers enable convenient and concise means to disambiguate between potential duplicates or irregularities in metadata across resources, a problem which is all too common in systems relying on named entities. Besides identifiers, the MusicBrainz infrastructure is also used for the search functionality of MusicWeb. However, in order to query any information in DBpedia, the MusicBrainz identifiers need to be associated with a DBpedia resource, which is a different kind of identifier. This mapping is achieved by querying the Sameas.org co-reference service to retrieve the corresponding DBpedia URIs. The caveat in this process is that Sameas does not actually keep track of MusicBrainz artist URIs, however, by substituting the domain for the same artist's URI in the BBC domain\footnote{\url{http://www.bbc.co.uk/music/artists/}}, the service can get around this obstacle. Once the DBpedia artist identity is determined, the service proceeds to construct the majority of the profile, including the biography and most of the linking categories to other artists. The standard categories available include associated artists and artists from the same hometown, while music group membership and artist collaboration links are queried from MusicBrainz. The core of the Semantic Web linking functionality is provided by categories from YAGO. The Spotify\footnote{historically the Echonest} and Last.fm APIs are used for recommendations that are based on different similarity calculations, thus providing recommendations that do not necessarily overlap.
 
 \section{Artist similarity}
diff -r e5f5ddcb72d2 -r 483df26a74d8 rebuttal.txt
--- a/rebuttal.txt	Thu Jun 16 00:18:12 2016 +0100
+++ b/rebuttal.txt	Thu Jun 16 00:40:54 2016 +0100
@@ -1,14 +1,14 @@
 Thank you very much for the reviews, they have been very useful and have given us great suggestions as to how to move forward. As suggested by one of the reviewers, we have been contemplating a journal article which would include a more thorough evaluation, including discovery span and user engagement, testing of individual components related to different modalities in our discovery application and testing the system holistically. This constitutes future work.
 
-Many of the comments can and would be addressed in a camera ready version of the paper: description of figures, a better diagram of the architecture, unifying naming criteria, the name of the guitarist for Pantera (Dimebag Darrell), etc. We would be very happy to discuss other technical details, which we thought might clutter the content of the paper, which had to be kept necessarily short. These include infrastructure, triple stores, rest services, databases, deployment, expected web traffic and performance benchmarking, concurrent data mining and big data handling.
+Many of the comments can and would be addressed in a camera ready version of the paper: description of figures, a better diagram of the architecture, unifying naming criteria, the name of the guitarist for Pantera (Dimebag Darrell), etc. We would be very happy to discuss other technical details, which we thought might clutter the content of the paper, which had to be kept necessarily short. These include infrastructure, triple stores, REST services, databases, deployment, expected web traffic and performance benchmarking, concurrent data mining and big data handling, as well as clarification of our preference for the Manhattan distance metric over alternatives.
 Also, we could write many pages about difficulties encountered, overcome as well as ongoing. There are many issues which we have tried to address: reliability of services, NLP entity extraction and topic modelling, identifying relevant texts (using metadata, tagging and non-curated raw text), as well as content based matching.
 There is an excellent suggestion to relate two artists through indirect connections. This is a very good comment and we should definitely take it into account in future work. However, our current focus is on finding sets of artists that are in the overlap of certain categories. In this sense, using direct links seems more appropriate. Using indirect links would result in navigation by multiple categories at once which may be less clear to the user.
 
-We would be to include more detailed descriptions of the methodology implemented, with more details provided on the specific algorithms and the rationale behind their use. In the case of computing artists' similarity, the main purpose of the content-based analysis is to establish musically overlapping factors in two artists’ repertoire rather than computing similarity directly.
+We would be happy to include more detailed descriptions of the methodology implemented, with more details provided on the specific algorithms and the rationale behind their use. In the case of computing artists' similarity, the main purpose of the content-based analysis is to establish musically overlapping factors in two artists’ repertoire rather than computing similarity directly.
 
- We do feel, however, that there are certain misunderstandings which we perhaps failed to make clear:
+We do feel, however, that there are certain misunderstandings which we perhaps failed to make clear:
 
-- The application is not a music recommendation system. The paper presents an emerging application for music artist discovery. In contrast with conventional recommendation system that provide recommendation by similarity, the focus here is on novelty and serendipity which are commonly identified as important requirements in music recommendation systems[Celma, 2010], presenting problems that are yet to be addressed successfully. Typical systems employ collaborative filtering or similarities in curated metadata sources. These approaches do not reach artists in the long tail of distributions computed from listening habits or preferences in social networks, while using curated metadata doesn’t scale to large catalogues and fails to reach new artists.
+- The application is not a music recommendation system. The paper presents an emerging application for music artist discovery. In contrast with conventional recommendation systems that provide recommendations by similarity, the focus here is on novelty and serendipity, which are commonly identified as important requirements in music recommendation systems [Celma, 2010] and present problems that are yet to be addressed successfully. Typical systems employ collaborative filtering or similarities in curated metadata sources. These approaches do not reach artists in the long tail of distributions computed from listening habits or preferences in social networks, while using curated metadata doesn’t scale to large catalogues and fails to reach new artists.
 In the camera ready version, a more thorough account on requirements gleaned from music recommendation will be provided together with specific problems in the domain of artist discovery. We will enumerate how our system meets some of these requirements and provide an assessment for cases where they aren’t yet successfully met. We believe that thorough testing is outside the scope of the present submission. The system’s ability to dynamically create interesting links between artists proves an initial hypothesis that Linked Data combined with text and music content processing can provide a faceted browsing experience that is relevant in music artist discovery.
 - The system demonstrates a first experiment in artist discovery allowing users to navigate the vast space of music artists by combining multiple modalities. The faceted browsing interface allows users to choose a direction most relevant to their information seeking task (i.e. cultural links, overlaps of certain musical factors between artists, typical mood, etc.). As mentioned earlier, MusicWeb is not a music recommendation system. Consequently, techniques applied in that domain are less relevant in our case and direct comparison with recommendation methods in algorithms would therefore be somewhat moot.