# HG changeset patch
# User Chris Cannam
# Date 1316984306 -3600
# Node ID cfd78be7676ba4b494c060fc34f5e26b8fecd928
# Parent 94c9700135db498e6268443bbed52a1738711fab
Another big batch of updates, working this time mostly on the Barriers sections. Also add autumn school followup survey spreadsheet.

diff -r 94c9700135db -r cfd78be7676b autumn-school-survey.xls
Binary file autumn-school-survey.xls has changed
diff -r 94c9700135db -r cfd78be7676b cannam.tex
--- a/cannam.tex Sun Sep 25 21:01:17 2011 +0100
+++ b/cannam.tex Sun Sep 25 21:58:26 2011 +0100
@@ -179,13 +179,13 @@
 Not only does software often go un-published; software that is
 published often does not work for future users. Partly this is
 because of platform variations. In many of the fields within this
-community, researchers lack the skills or desire to grapple with
-someone else's code. Where they do work with code, they use a variety
-of platforms and batch and real-time environments: respondents to our
-survey named MATLAB and numerous of its toolboxes, Max/MSP, C++ and
-OpenFrameworks, Juce, HTK and MPTK, SuperCollider, Python, Clojure and
-R among technologies used. And even code that can be run may produce
-incorrect results \cite{merali2010} owing to limited testing.
+community, researchers lack the skills or desire to grapple with code
+if it will not immediately run on a platform they have available.
+Where they produce code, they use a variety of platforms and batch and
+real-time environments: respondents to our survey named MATLAB and
+numerous of its toolboxes, Max/MSP, C++ and OpenFrameworks, Juce, HTK
+and MPTK, SuperCollider, Python, Clojure and R among technologies
+used.
 
 As a consequence of these obstacles to publication and the variety
 of platforms used, software developed in earlier research is often
@@ -203,9 +203,10 @@
 \label{sec:philosophy}
 
 Our approach to this problem is to facilitate incremental improvements
-to the way software is managed during research, by identifying
-practical barriers to software reuse and providing means to reduce or
-eliminate them.
+to the way software is managed during research, by identifying those
+practical barriers to software reuse that admit straightforward
+removal or mitigation through simple educational and technical
+measures.
 
 While we support the goal of reproducible research and aim to
 encourage open publication of code and data linked with paper
@@ -226,57 +227,60 @@
 ultimate concern of our present work, therefore, is sustainability and
 reusability rather than reproducibility.
 
-[TODO: reword this following reword to sec \ref{sec:researchsoft}]
-
 We cannot address all possible barriers to software publication and
-reuse, but following section \ref{sec:researchsoft} we identify
-that we may be able to help in overcoming: lack of confidence in code
-quality and of comfort with collaborative development; lack of
-facilities and tools to support such development; and reusability
+reuse, but following section \ref{sec:researchsoft} we identify four
+that may be approachable: lack of confidence in code quality and of
+comfort with collaborative development; lack of facilities and tools
+to support such development; lack of incentive to distribute software
+given the academic focus on paper publications; and reusability
 problems caused by platform incompatibilities.
 
 \subsection{Barrier to reuse: Lack of education and confidence with code}
+\label{sec:lackofeducation}
 
-introductory note here: the barrier is that people lack software
-development skills
-
-\subsubsection{SoundSoftware/Software Carpentry Autumn School}
-\label{sec:autumnschool}
+In section \ref{sec:researchsoft} we noted that research software
+developers are largely self-trained, and Merali \cite{merali2010}
+provides a number of examples of unfortunate outcomes caused by lack
+of education and experience in software development.
 
 In November 2010 we organised an Autumn School for researchers,
 presented by Dr Greg Wilson and based on the Software Carpentry
 materials~\cite{softwarecarpentry}. This week-long residential course
 for 20 audio and music researchers from groups around the UK taught
-fundamentals of software development and good practice including
-version control for software, unit testing and test-driven
-development, Python syntax and structure, and managing experimental
-datasets with sqlite.
+fundamentals of software development including version control for
+software, unit testing and test-driven development, Python syntax and
+structure, and managing experimental datasets with sqlite. We have
+made available all of the teaching material from the Autumn School in
+online videos.\footnote{\tt
+ http://soundsoftware.ac.uk/autumnschool2010video}
 
-\subsubsection{Videos and Tutorials}
-
-We have made available all of the teaching material from the Autumn
-School in video form\footnote{\tt
- http://soundsoftware.ac.uk/autumnschool2010video} and we have
-started work on tutorial material on various subjects (todo: what can
-we say about this?)
+A subsequent online poll of attendees~\cite{autumnschoolsurvey}
+suggests that training in even the most basic software development
+skills may be well received by and beneficial to researchers.
+Attendees identified program design, testing and validation, and
+provenance and reproducibility as particularly valuable areas covered,
+and these are areas in which the simplest possible introductions to
+program structure, test-driven development, and version control can
+provide sufficient provocation for the researcher to re-evaluate their
+own practices.
 
 \subsection{Barrier to reuse: Lack of facilities and tools}
 \label{sec:lackoffacilities}
 
 Researchers will not make use of version control and collaborative
 development facilities that are not available to them, or of whose
-existence they are not aware. An informal poll of attendees at the
-Autumn School (section \ref{sec:autumnschool}) showed that few of them
-were aware of such facilities being provided by their institutions.
-This is consistent with the feedback to our survey and with experience
-in our own group, where version control has been used sporadically and
-set up in an ad-hoc fashion.
+existence they are not aware. Few of the attendees at our Autumn
+School (section \ref{sec:lackofeducation}) were aware of such
+facilities being provided by their institutions, and in our survey
+(section \ref{sec:researchsoft}) only a minority of respondents made
+use of them. This is consistent with experience in our own research
+group, where version control has been used only sporadically.
 
-Attendees at the Autumn School also reported difficulty getting
-started with the complex user interfaces available for version
-control. Nonetheless, version control was identified by attendees in
-debriefing as the most compelling subject taught during the course,
-suggesting that lack of awareness may be the main barrier to uptake. (TODO: link to this)
+Attendees at the Autumn School also reported difficulty during the
+course in getting started with the complex user interfaces available
+for version control.
Nonetheless, version control was amongst the
+areas identified subsequently as most valuable, suggesting that lack
+of awareness may be the main barrier to uptake.
 
 \subsubsection{SoundSoftware Code Site}
 \label{sec:codesite}
@@ -288,11 +292,12 @@
 is unable to help them or if they have a need to work with researchers
 at other institutions who would not be permitted access to their
 institution's facilities. The existence of this site also addresses
-shortcomings in our own group's use of version control mentioned in
-section \ref{sec:lackoffacilities}. The site is implemented using our
-own custom version of the Redmine\footnote{\tt http://redmine.org/}
-project management application, with Mercurial version control. Any
-UK researcher in the field can register and start their own projects.
+shortcomings in our own group's use of version control
+(section~\ref{sec:lackoffacilities}). The site is implemented using a
+custom version of the Redmine\footnote{\tt http://redmine.org/}
+project management application, together with Mercurial version
+control. Any UK researcher in the field can register and start their
+own projects.
 
 Four aspects of our code site contribute to sustainability and utility
 for researchers:
@@ -313,7 +318,7 @@
 
 \begin{enumerate}
 \item {\em Focus} --- The focus of the site on audio and music
- research may make it easier to locate and obtain code for reuse.
+ research may make it easier to locate and obtain code.
 \item {\em Linking publications with code} --- Users can associate
  publication records with their projects so that readers can
  immediately see what publications are related to the code.
@@ -340,10 +345,18 @@
 \label{sec:easyhg}
 
 Our attempt to address the difficulties faced in learning version
-control user interfaces is EasyMercurial, an application we developed
-based on existing work [citation needed] in order to provide a user
-interface that we could teach easily to researchers across multiple
-operating system platforms.
+control user interfaces is EasyMercurial,\footnote{\tt
+ http://easyhg.org} an application we developed based on existing
+work [citation needed] in order to provide a user interface that we
+could teach easily to researchers across multiple operating system
+platforms.
+
+\subsection{Barrier to reuse: Lack of incentive for publication}
+
+[TODO: Code site helps here -- it makes it more likely your
+ publication will be cited because people who are looking first and
+ foremost for code implementations will have a handy reference right
+ in front of them]
 
 \subsection{Barrier to reuse: Platform incompatibilities}
 
@@ -351,32 +364,31 @@
 choose to use many platforms and programming languages to carry out
 their work. Although the most common (MATLAB) is widely available in
 signal processing groups, it is a commercial platform that is not
-widely available to researchers in other fields related to audio and
-music such as computational musicology or music therapy.
+widely used in other fields related to audio and music such as
+computational musicology or music therapy.
 
 \subsubsection{Plugins}
 \label{sec:plugins}
+One method of giving code a better chance to survive may be to produce
+a ``plugin'' that can be used in applications that researchers in
+related fields might already be using. This permits a working
+algorithm to be converted directly to a unit of code which can be used
+in real applications, without the need to develop a custom interface.
+A published interface supported by more than one host program ensures
+that the published code is not dependent on the continued distribution
+of a single application. Code that uses the plugin format of a
+successful application is relatively likely to be understood by other
+developers.
 
-Sonic Visualiser was developed at the Centre for Digital Music from
-2005 onwards as a visualisation and analysis tool for audio
-recordings, particularly of music~\cite{cannam2006}.
Noting the lack
-of a modular way to release audio analysis methods for use by the
-general public, in 2006 we developed the Vamp plugin system
-\cite{vamp} and implemented it using \CC{} in Sonic Visualiser and the
-subsequent Sonic Annotator~\cite{cannam2010}. The Vamp system has
-since been used by the Centre and others with some success for
-publishing working methods.
-
-Although this format is of course not suitable for all methods, using
-a plugin format for publication has some advantages. It permits a
-working algorithm to be converted directly to a unit of code which can
-be used in real applications, without the need to develop a custom
-interface. A published interface supported by more than one host
-program ensures that the published code is not dependent on the
-continued distribution of a single application. While \CC{} is not
-always appropriate for research code, it is widely used and
-understood, and we have also provided a Vamp interface for plugins
-implemented in Python \cite{vampy}.
+At C4DM we have had some success with plugins in standard audio
+processing formats such as VST, as well as in writing externals for
+modular systems such as Max/MSP or SuperCollider. For audio analysis
+methods, in 2006 we developed the Vamp plugin system \cite{vamp} and
+implemented it using \CC{} in our applications Sonic
+Visualiser~\cite{cannam2006} and the subsequent Sonic
+Annotator~\cite{cannam2010}. This system has also been used by the
+Centre and others with some success for publishing working methods.
+In section~\ref{sec:casestudies} we discuss some examples.
 
 \subsubsection{Hands-on Help}
 % porting, etc
@@ -386,10 +398,13 @@
 people to publish and if necessary maintain their own code, and
 teaching people how to update and fix the code they need to use, we
 can also locate code that researchers need and help them to obtain it
-in a working state.
We intend during late 2011 and early 2012 to
-visit a number of research groups, identify ``lost'' or troublesome
-code, and provide development expertise to make it available where
-possible.
+in a working state. Although the effort involved in updating old code
+is higher than that involved in improving it at the time it is
+written, this approach permits resources to be aimed directly at the
+code that current researchers actually want to use. We intend during
+late 2011 and early 2012 to visit a number of research groups,
+identify ``lost'' or troublesome code, and provide development
+expertise to make it available where possible. (TODO: move that last sentence to Future Work and refer to it here?)
 
 \section{Case studies}
 \label{sec:casestudies}
@@ -451,12 +466,24 @@
 people to use, make it simple -- use our facilities if you like --
 encourage people to use it for papers as well!]
 
-[3. Don't feel bad]
+[3. Produce plugins and deliberately target end-user applications]
 
-[4. ???]
+[4. Don't feel bad]
 
-\section{Conclusions and Future Work}
-\label{sec:conclusions}
+[5. ???]
+
+\section{Future Work}
+\label{sec:future}
+
+[More study to find out how far these ``improvements'' actually help]
+
+[Visits to find code we can take under our wing]
+
+[Follow-up to Autumn School]
+
+[More training, videos, tutorials etc]
+
+[EasyHg evaluation]
 
 %%\section{Appendix: SoundSoftware survey 2010}
 %%\label{sec:survey}