changeset 14:7b4c0e52878e

fill in a few more blanks
author Chris Cannam
date Thu, 22 Sep 2011 17:04:49 +0100
parents bba6c067444c
children d166744ca3d8
files cannam.tex
diffstat 1 files changed, 87 insertions(+), 60 deletions(-)
--- a/cannam.tex	Thu Sep 22 16:24:37 2011 +0100
+++ b/cannam.tex	Thu Sep 22 17:04:49 2011 +0100
@@ -1,7 +1,7 @@
 \documentclass{article}
 \usepackage{spconf,amsmath,graphicx}
 \def\CC{{C\nolinebreak[4]\hspace{-.05em}\raisebox{.4ex}{\tiny\bf ++}}}
-
+\raggedbottom
 
 \title{The title of the scientific paper: A long, explanatory subtitle}
 
@@ -92,7 +92,7 @@
 special session was organised at the ICASSP 2007 international signal
 processing conference, and special issues of IEEE Signal Processing
 Magazine and Computing in Science and Engineering concerning this
-subject both appeared in 2009 \cite{vandewalle2009}. The IEEE Signal
+subject both appeared in 2009~\cite{vandewalle2009}. The IEEE Signal
 Processing society now encourages Reproducible Research, allowing
 links from the online journal repository IEEEXplore to the code and
 data so that other researchers can reproduce the results. Actions such
@@ -112,36 +112,33 @@
 solution to the problem of code dissemination, in practice take-up in
 this field appears limited.  
 
-In order to better understand the reality  faced by the audio and
-music research community, we conducted a survey on software usage and
-development~\cite{ssamrsurvey}. 
-In October 2010 we opened an online survey, advertised to a
-number of senior researchers in other groups around the UK. This
-survey asked for detailed information about the software usage and authorship practices of
-researchers, with the aim of obtaining a number of individual case
-points for further examination as well as some broad numerical
-results. The survey, started in November 2010, closed in April 2011,
-with 54 complete and 23 partially complete responses. There were
-responses from at least 16 different institutions.
+In order to better understand the reality faced by the audio and music
+research community, we conducted an online survey on software usage
+and development~\cite{ssamrsurvey}.  This survey opened in October
+2010 and was advertised to a number of senior researchers in other
+groups around the UK. We asked for detailed information about the
+software usage, authorship and publication practices of researchers,
+with the aim of obtaining a number of individual case points for
+further examination as well as some broad numerical results. The
+survey closed in April 2011, with 54 complete and 23 partially
+complete responses. There were responses from at least 16 different
+institutions.
 
-
-There was also a section on Reproducible Research. In this section the
-majority of respondents said either that they took no steps to ensure
-reproducibility or that they only made code or data available on
-request.  Obstacles cited included lack of time, copyright
-restrictions, and the potential for commercial use of the code
-\cite{ssamrsurvey}.  In addition to these, a broader case study by the
-UK Research Information Network into science research across several
-subject areas also identified lack of evidence of benefits, cultures
-of independence and competition, and concerns about quality as typical
-factors inhibiting open sharing of data and code
-\cite{rin2010}. Intuitively, undertaking reproducible research takes
-effort early in the research cycle, before the benefits are
-necessarily apparent and while the value of the reserch is still
-unclear, and can be perceived as delaying the production of ``real''
-research. Once research results have been produced and a paper
-written, there is little apparent incentive to make the research
-reproducible.
+The majority of our respondents said either that they took no steps to
+ensure reproducibility in their publications or that they only made
+code or data available on request.  Obstacles cited included lack of
+time, copyright restrictions, and the potential for commercial use of
+the code.  A broader case study by the UK Research Information Network
+into science research across several subject areas also identified
+lack of evidence of benefits, cultures of independence and
+competition, and concerns about quality as typical factors inhibiting
+open sharing of data and code~\cite{rin2010}. Intuitively, undertaking
+reproducible research takes effort early in the research cycle, before
+the benefits are necessarily apparent and while the value of the
+research is still unclear, and can be perceived as delaying the
+production of ``real'' research. Once research results have been
+produced and a paper written, there is little apparent incentive to
+make the research reproducible.
 
 In many of the fields within this community, researchers lack the
 skills or desire to write their own code or to make someone else's
@@ -159,14 +156,14 @@
 used, including MATLAB and numerous MATLAB toolboxes, C++, Max/MSP,
 OpenFrameworks, Juce, HTK and MPTK, SuperCollider, Clojure and
 R. Recent publications from our group have also made use of Python
-\cite{fazekas} and Prolog \cite{raimond}.
+\cite{fazekas} and Prolog~\cite{raimond}.
 
 As a consequence of the lack of publication and variety of platforms
 used, software developed in earlier research is not always readily
 available to later researchers.  For example, in the well-known
-subject of beat tracking, the method of Scheirer et al \cite{scheirer}
-was developed in \CC for a legacy platform and is now only available
-informally; Goto et al \cite{goto} was written for a now-defunct
+subject of beat tracking, the method of Scheirer et al~\cite{scheirer}
+was developed in \CC{} for a legacy platform and is now only available
+informally; Goto et al~\cite{goto} was written for a now-defunct
 parallel architecture and never publicly released; Hainsworth
 \cite{hainsworth} was written in MATLAB with a non-portable DLL
 component and only runs on a single platform; Klapuri et al
@@ -201,7 +198,14 @@
 ultimate concern of our present work, therefore, is sustainability and
 reusability rather than reproducibility.
 
-\subsection{Education and Confidence With Code}
+We cannot address all possible barriers to software publication and
+reuse, but following section \ref{subsec:researchsoft} we identify
+three that we may be able to help overcome: lack of confidence in code
+quality and of comfort with collaborative development; lack of
+facilities and tools to support such development; and reusability
+problems caused by platform incompatibilities.
+
+\subsection{Education and Confidence with Code}
 
 introductory note here: the barrier is that people lack software
 development skills
@@ -211,18 +215,22 @@
 
 In November 2010 we organised an Autumn School for researchers,
 presented by Dr Greg Wilson and based on the Software Carpentry
-materials \cite{softwarecarpentry}.  This week-long residential course
+materials~\cite{softwarecarpentry}.  This week-long residential course
 for 20 audio and music researchers from groups around the UK taught
 fundamentals of software development and good practice including
 version control, unit testing and test-driven development, Python
-syntax and structure, and managing small result databases with sqlite.
+syntax and structure, and managing experimental datasets with sqlite.
 
 \subsubsection{Videos and Tutorials}
 
+We have made available all of the teaching material from the Autumn
+School in video form\footnote{\tt
+  http://soundsoftware.ac.uk/autumnschool2010video} and we have
+started work on tutorial material on various subjects (todo: what can
+we say about this?)
+
 \subsection{Facilities and Tools}
 
-% this is awfully fuzzy
-
 Researchers will not make use of version control and collaborative
 development facilities if they are unaware that they exist.  An
 informal poll of attendees at the Autumn School (section
@@ -301,37 +309,56 @@
 
 Sonic Visualiser was developed at the Centre for Digital Music from
 2005 onwards as a visualisation and analysis tool for audio
-recordings, particularly of music \cite{cannam2006}.  Noting the lack
+recordings, particularly of music~\cite{cannam2006}.  Noting the lack
 of a modular way to release audio analysis methods for use by the
 general public, in 2006 we developed the Vamp plugin system
-\cite{vamp} and implemented it in Sonic Visualiser and the subsequent
-Sonic Annotator \cite{cannam2010}.  The Vamp system has since been
-used by the Centre and others with some success for publishing working
-methods.
+\cite{vamp} and implemented it using \CC{} in Sonic Visualiser and the
+subsequent Sonic Annotator~\cite{cannam2010}.  The Vamp system has
+since been used by the Centre and others with some success for
+publishing working methods.
+
+Although this format is of course not suitable for all methods, using
+a plugin format for publication has some advantages.  It permits a
+working algorithm to be converted directly to a unit of code which can
+be used in real applications, without the need to develop a custom
+interface.  A published interface supported by more than one host
+program ensures that the published code is not dependent on the
+continued distribution of a single application.  While \CC{} is not
+always appropriate for research code, it is widely used and
+understood, and we have also provided a Vamp interface for plugins
+implemented in Python~\cite{vampy}.
 
 \subsubsection{Hands-on Help} % porting, etc
 
-% something about MATLAB?
-
+Another approach to dealing with platform incompatibilities is to
+provide a service that can be used to respond to demand for research
+software and to maintain it accordingly.  That is, besides encouraging
+people to publish and if necessary maintain their own code, and
+teaching people how to update and fix the code they need to use, we
+can also locate code that researchers need and help them to obtain it
+in a working state.  We intend during late 2011 and early 2012 to
+visit a number of research groups, identify ``lost'' or troublesome
+code, and provide development expertise to make it available where
+possible.
 
 \section{Case studies}
 \label{sec:casestudies}
 
-\subsection{Beat Trackers}
+\subsection{Beat tracking}
 
 In section \ref{subsec:researchsoft} we gave examples of beat tracking
-methods whose implementations had effectively been lost.  However, we
+methods whose implementations are not readily available.  However, we
 have made some successful attempts at sustainability for other
 methods. An early work from C4DM in audio onset detection
 \cite{bello2003} was used in later work on beat tracking
 \cite{davies2005}. This algorithm, originally in MATLAB, was ported to
-SuperCollider by Collins at Cambridge and in our group into a
+SuperCollider by Collins at Cambridge, and in our group into a
 cross-platform C++ Vamp plugin. It also inspired a real time Max/MSP
-beat tracking system \cite{robertson2007} and was used in
+beat tracking system~\cite{robertson2007} and was used in
 beat-synchronous audio effects, developed in Matlab and ported to
-real-time VST plugins \cite{stark2008}.  Nearly all of these
-implementations are now available, linked to their references, through
-the SoundSoftware code repository described in section \ref{sec:codesite}.
+real-time VST plugins~\cite{stark2008}.  Nearly all of these
+implementations are now available, with their references, through the
+SoundSoftware code repository described in section \ref{sec:codesite}.
 
 
 \subsection{Chordino and NNLS Chroma}
@@ -339,14 +366,14 @@
 
 % Note that in this case the author did _not_ follow a RR methodology, and the code is not referred to in the paper.  The link between code and publication must be made after the fact.
 
-In \cite{mauch2010}, Mauch describes a method for improving the
+In~\cite{mauch2010}, Mauch describes a method for improving the
 automatic recognition of chords whose fundamental frequencies are
 easily confused with other partials.  This is a traditional
 publication which makes no reference to any published code or test
 data.  Although no formal attempt was made initially toward
 reproducibility, some independent evaluation was carried out through
 the submission of a MATLAB implementation of the method to the annual
-MIREX evaluation exchange \cite{mirex}.
+MIREX evaluation exchange~\cite{mirex}.
 
 Following this publication, we worked with the author of the paper and
 code to develop a C++ implementation of the method and turn it into a
@@ -361,11 +388,11 @@
 reusability have been achieved even though the process did not begin
 until after the initial publication.
 
-\subsection{Auditory Image Models}
-\label{subsubsec:aim}
+%\subsection{Auditory Image Models}
+%\label{subsubsec:aim}
 
-\section{Future work}
-\label{sec:future}
+\section{Conclusions and Future Work}
+\label{sec:conclusions}
 
 %%\section{Appendix: SoundSoftware survey 2010}
 %%\label{sec:survey}