changeset 46:ec79a3225b29

discussion
author mariano
date Sun, 01 May 2016 04:50:52 +0100
parents fae1615d7b8f
children 20598a0f7dcd f94a152a553a
files musicweb.tex
diffstat 1 files changed, 5 insertions(+), 4 deletions(-)
--- a/musicweb.tex	Sun May 01 04:07:10 2016 +0100
+++ b/musicweb.tex	Sun May 01 04:50:52 2016 +0100
@@ -7,8 +7,7 @@
 \usepackage{courier}
 \usepackage{adjustbox}
 \usepackage{url}
-\usepackage{subcaption}
-\captionsetup{compatibility=false}
+\usepackage{parskip}
 \usepackage[ngerman,english]{babel}
 \usepackage[utf8]{inputenc}
 \newcommand{\cmark}{\fontsize{14}{14}\textbullet\selectfont}
@@ -283,7 +282,7 @@
 Texts (or abstracts, in the case of research publications where the body is not available) are subjected to semantic analysis. Each text is first tokenised and a bag of words is extracted from it. This bag of words is used to query the alchemy\footnote{AlchemyAPI is used under license from IBM Watson.} language analysis service for:
 \begin{itemize}
 \item Named entity recognition. The entity recogniser provides a list of names that appear mentioned in the text together with a measure of relevance. They can include toponyms, institutions, publications and persons. MusicWeb is interested in identifying artists, so every person mentioned is checked against the database. If the person is not included in MusicWeb's database then three resources are checked: dbpedia, musicbrainz and freebase. All three resources identify musicians using the yago ontology. It is important to align the artist properly, since the modeling process is largely unsupervised, and wrong identifications can skew the model. Musicians identified in texts are stored and linked to the artist that originated the query. MusicWeb then offers a link to either of them as ``appearing together in article''.
-\item Keyword extraction. Non-managed texts and research that don't include tags or keywords. Keywords are checked against wordnet for hypernyms and stored. Artists that share keywords or hypernyms are considered to be relevant to the same topic in the literature.
+\item Keyword extraction. Keywords are extracted from non-managed texts and research papers that don't include tags or keywords. These keywords are checked against wordnet for hypernyms and stored. Artists that share keywords or hypernyms are considered to be relevant to the same topic in the literature.
 \end{itemize}
 MusicWeb also offers links between artists who appear in different articles by the same author, as well as in the same journal.
 
@@ -316,7 +315,7 @@
 While automatic feature extraction has significantly enhanced organisation and categorisation of large music collections, it is still rather challenging to derive high level semantic information relating to mood or genre. Complementing signal processing and machine learning methods with crowd-sourced social tagging data from platforms like Last.fm can enrich and inform understanding of general listening habits and connections between artists.
 Mood-based similarity is another experimental enhancement to MusicWeb. This method uses the Semantic Web version of the ILM10K music mood dataset, which covers over 4000 unique artists. The dataset is based on crowd-sourced mood tag statistics from Last.fm users, which have been transformed to numerical coordinates in a Cartesian space. Every track in the collection is associated with 2-dimensional coordinates reflecting energy and pleasantness respectively. The similarity between artists in this case is measured by first calculating the location of the target artist in the mood space by averaging the coordinates of all the associated tracks. The same procedure is repeated for all other artists, which then enables computing the Manhattan distance between the target and each of the other artists and using the resulting ranking as the similarity metric. This process is illustrated by the example SPARQL query in Listing \ref{lst:sparql1}.
 \vspace{-10pt}
-\noindent\begin{minipage}{\textwidth}
+\noindent\begin{minipage}[!ht]{\textwidth}
 	\begin{lstlisting}[ style = sparql, label=lst:sparql1, tabsize=4,  caption={example SPARQL query to retrieve similar artists from the ILM10K mood dataset by Manhattan distance of valence-arousal coordinates.} ]
 SELECT ?artist ?mbid
 WHERE
@@ -349,6 +348,8 @@
 
 
 \section{Discussion}\label{sec:discussion}
+Interacting with MusicWeb can be a surprising experience. Often, the artists visited are similar to one another. It is not unexpected that Rihanna and Chris Brown are linked because they are both mentioned in the same news item. Or, for instance, that Schumann, Von Weber and Berlioz are all identified in the same musicology paper. It often happens, however, that the user begins by searching for an artist and, following some of the links offered, ends up listening to a completely different style of music. One such journey, for example, started with Madonna. By following only links suggested by the fact that both artists appear in the same text, the user is directed to Theodor W. Adorno, then to Gustav Mahler, and finally to Pierre Boulez. Taking two steps back, from Adorno the user can choose to visit Nietzsche's page, both being German composers who are also philosophers, or Albert Einstein's, since both are Jewish people who migrated to the U.S. fleeing Nazi Germany. One final example of following semantic links: a user who searches for Bach can then go to Mozart's page. From there they can proceed to Glenn Gould's, on to Thelonious Monk and finally John Coltrane.
+This kind of discovery path would be very unlikely in any recommender system based on user profile data.
 
 \section{Conclusions}\label{sec:conclusions}