Mercurial > hg > soundsoftware-icassp-2012
changeset 18:6cb784212511
More small updates
author | Chris Cannam |
---|---|
date | Sun, 25 Sep 2011 12:58:26 +0100 |
parents | 56627b8fcf4d |
children | 94c9700135db |
files | cannam.tex |
diffstat | 1 files changed, 60 insertions(+), 39 deletions(-) |
--- a/cannam.tex Sun Sep 25 12:20:00 2011 +0100
+++ b/cannam.tex Sun Sep 25 12:58:26 2011 +0100
@@ -112,7 +112,7 @@
 Reproducible Research Repository\footnote{\tt http://rr.epfl.ch/},
 designed to promote reproducible research by requiring the authors of
 a paper to upload the code and data used in the experiments. Readers
-can also comment on a publication and evaluate the reproducibility of
+can then comment on a publication and evaluate the reproducibility of
 the work.
 
 Although the Reproducible Research principle proposes a comprehensive
@@ -122,8 +122,9 @@
 \section{Understanding real-world limitations on software practice}
 \label{sec:researchsoft}
 
-[We are going to propose three barriers to reuse -- lack of education
- \& confidence; lack of tools \& facilities; platform
+[We are going to propose four barriers to reuse -- lack of education
+ \& confidence; lack of tools \& facilities; lack of incentive,
+ because code is not measured as publications are; platform
  incompatibilities and code rot. We need to give the facts and
  figures supporting these as barriers and then identify them.]
 
@@ -145,13 +146,25 @@
 complete responses. There were responses from at least 16 different
 institutions.
 
-Although 44\% of respondents said that they took steps to ensure
-reproducibility of their work, their accompanying comments suggested
-various interpretations of the meaning of reproducibility. A common
-theme was that code would be made available on personal request; some
-respondents said that they documented code in order to be able to
-reproduce the results themselves, or that they were planning to
-publish software or data rather than having actually done so.
+Although 80\% of respondents reported developing software themselves
+during research and 40\% of those said that they took steps to ensure
+reproducibility of their publications, the accompanying comments
+showed that this did not necessarily involve the publication of code.
+Respondents referred to using standard, publicly-available datasets
+and calibration procedures when performing measurements; to
+documenting code and data so that they or other researchers in their
+group could reproduce the results later; or to making code available
+on personal request. All of these are worthwhile actions but they do
+not suggest widespread use of a full reproducible compendium approach.
+
+Developing collaboratively even within a research group also seems to
+be the exception rather than the rule. 51\% of respondents who
+developed software said that their code did not leave their own
+computer, and 59\% said they did not use version control software
+(typically used in software practice to facilitate collaborative
+development). This is also consistent with the finding by Hannay et
+al that scientists usually developed and used software on their own
+desktop computers rather than dedicated processing servers.
 
 Our respondents cited as obstacles to the publication of code lack of
 time, copyright restrictions, and the potential for future commercial
@@ -161,11 +174,6 @@
 independence and competition, and quality concerns as typical
 inhibiting factors for open sharing of data and code.
 
-Our survey found that most respondents kept code on their own machines
-and did not develop collaboratively. This is consistent with the
-Hannay study, which found that scientists typically developed and used
-software on their personal computers rather than dedicated servers
-(TODO: check this, does it say anything about sharing code?).
 
 [This suggests that there are cultural and technical [lack of
  facilities \& awareness of how to use them] barriers... this makes
@@ -196,14 +204,12 @@
 available to later researchers. For example, in the well-known
 subject of beat tracking, the method of Scheirer et al~\cite{scheirer}
 was developed in \CC{} for a legacy platform and is now only available
-informally; Goto et al~\cite{goto} was written for a now-defunct
-parallel architecture and never publicly released; Hainsworth
-\cite{hainsworth} was written in MATLAB with a non-portable DLL
-component and only runs on a single platform; Klapuri et al
-\cite{klapuri} was written in MATLAB with a platform-specific
-extension and is not widely distributed for reasons of commercial
-confidentiality; and methods from several other researchers have not
-been published as code.
+informally; Goto et al~\cite{goto} was written for a parallel
+architecture no longer in wide use and never publicly released;
+Hainsworth \cite{hainsworth} was written in MATLAB with a
+Windows-specific DLL component and only runs on a single platform;
+Klapuri et al \cite{klapuri} is not widely distributed; and methods
+from several other researchers have not been published as code.
 
 \section{Sustainable software: a bottom-up approach}
 \label{sec:philosophy}
@@ -298,9 +304,7 @@
 section \ref{sec:lackoffacilities}. The site is implemented using our
 own custom version of the Redmine\footnote{\tt http://redmine.org/}
 project management application, with Mercurial version control. Any
-UK researcher in the field can register and start their own
-collaborative projects using the version control, wiki, issue
-tracking, and other services provided.
+UK researcher in the field can register and start their own projects.
 
 Four aspects of our code site contribute to sustainability and utility
 for researchers:
@@ -329,7 +333,7 @@
  public, or private to a group of collaborating researchers; work can
  also be started privately and made public later. At the time of
  writing, 57\% of projects hosted at the site are private, and the
- average number of members in private projects is 1.97.
+ average of the numbers of members per private project is 1.97.
 \item {\em Tracking external projects} --- Researchers who use code
  hosting or project management facilities elsewhere can also make use
  of our site as a nexus for relevant projects, as the site does not
@@ -355,8 +359,12 @@
 
 \subsection{Barrier to reuse: Platform incompatibilities}
 
-introductory note here: the barrier is that software that is published
-is not always usable
+We observed in \ref{sec:researchsoft} that researchers in this field
+choose to use many platforms and programming languages to carry out
+their work. Although the most common (MATLAB) is widely available in
+signal processing groups, it is a commercial platform that is not
+widely available to researchers in other fields related to audio and
+music such as computational musicology or music therapy.
 
 \subsubsection{Sonic Visualiser and Vamp Plugins}
 \label{subsubsec:sv}
@@ -431,20 +439,33 @@
 
 Following this publication, we worked with the author of the paper and
 code to develop a C++ implementation of the method and turn it into a
-highly usable Vamp plugin for chord estimation, named Chordino. This
-code has been made available at
-http://code.soundsoftware.ac.uk/projects/nnls-chroma --- a Web page
-which also links the code with its associated publication. Although
-it has been updated since publication, the plugin includes a mode in
-which it uses the same method as that submitted to the MIREX
-evaluation and as a consequence, although this paper still lacks true
-``one-click'' reproducibility, a high degree of openness and effective
-reusability have been achieved even though the process did not begin
-until after the initial publication.
+Vamp plugin for chord estimation, named Chordino. This code and its
+revision history are available through our code site\footnote{\tt
+ http://code.soundsoftware.ac.uk/projects/nnls-chroma} and thereby
+linked with the associated publication. Although the code has been
+updated since release, the plugin includes a mode in which it uses the
+same method as that submitted to the MIREX evaluation. As a
+consequence, although this paper still lacks a true reproducibility
+compendium, a high degree of openness and effective reusability have
+been achieved even though the process did not begin until after the
+initial publication.
 
 %\subsection{Auditory Image Models}
 %\label{subsubsec:aim}
 
+\section{Recommendations}
+\label{sec:recommendations}
+
+[1. Software development training -- pick small battles -- testing,
+ provenance etc]
+
+[2. Version control and related facilities -- provide, encourage
+ people to use, make it simple]
+
+[3. Don't feel bad]
+
+[4. ???]
+
 \section{Conclusions and Future Work}
 \label{sec:conclusions}