# HG changeset patch
# User Chris Cannam
# Date 1331311089 0
# Node ID 8c92f88fa920be927ec7f161c5481a1c0b40c0ab
# Parent 17c67727560b67c9db1a972f4597aa4ec8d9f269
Update following feedback from Marco, Simon and Luis

diff -r 17c67727560b -r 8c92f88fa920 tutorial.pdf
Binary file tutorial.pdf has changed
diff -r 17c67727560b -r 8c92f88fa920 tutorial.tex
--- a/tutorial.tex Thu Mar 08 17:46:16 2012 +0000
+++ b/tutorial.tex Fri Mar 09 16:38:09 2012 +0000
@@ -32,27 +32,27 @@

 \section{Motivation}

-The requirement to develop and reuse software is almost universal in
-audio and music informatics research. Many methods, including most of
-those published at ISMIR, are developed in tandem with software
-implementations, and many of them are too complex or too fundamentally
+The need to develop and reuse software to process data is almost
+universal in music informatics research. Many methods, including most
+of those published at ISMIR, are developed in tandem with software
+implementations, and some of them are too complex or too fundamentally
 software-based to be reproduced readily from a published paper alone.
 For this reason, it is helpful for sustainable research to have
 software and data published along with papers.

-In practice, however, non-publication of code and data is still the
-norm and research software is commonly lost after publication of the
+In practice, non-publication of code and data is still the norm and
+research software is commonly lost following publication of the
 associated methods.

-Our survey by the Sound Software project (http://soundsoftware.ac.uk)
-of UK audio and music researchers in 2011 found that even among those
-respondents who reported both developing software during research and
-taking steps to reproducibility for their publications, only 35\%
-reported having in fact published any of their code.
Our respondents
-cited as obstacles to the publication of code lack of time, copyright
-restrictions, and the potential for future commercial use. A broader
-study into science research across several subject areas by the UK
-Research Information Network additionally identified the lack of
+For the Sound Software project\footnote{\tt http://soundsoftware.ac.uk/} we
+carried out a survey of UK audio and music researchers in 2011. Of
+those respondents who reported both developing software during
+research and taking steps to reproducibility for their publications,
+only 35\% reported having in fact published any of their
+code. Respondents cited as obstacles to publication of code lack of
+time, copyright restrictions, and the potential for future commercial
+use. A broader study of research across several subject areas by the
+UK Research Information Network additionally identified lack of
 evidence of benefits, cultures of independence and competition, and
 quality concerns as inhibiting factors.

@@ -79,34 +79,40 @@
 This tutorial will be in three parts:

 \begin{itemize}
-\item An {\bf introduction and overview} discussing the motivation for reusable software and data in research and providing an overview of some methods, tools and facilities available to researchers for this purpose;
-\item A {\bf hands-on session} in which attendees are encouraged to try out some of these methods in code;
+\item An {\bf introduction and overview} discussing the motivation for
+ reusable software and data in research and providing an overview of
+ some methods, tools and facilities available to researchers for this
+ purpose;
+\item A {\bf hands-on session} in which attendees are encouraged to
+ try out some of these methods in code;
 \item A review and discussion of {\bf practical issues} in ensuring
- that publication actually occurs, relevant also to research group
- leaders.
+ that publication of data and code actually occurs, relevant also to
+ research group leaders.
 \end{itemize}

 \subsection{Introduction and overview}

-In the first part of the tutorial, we will first discuss problems
-faced by researchers in developing and reusing software and data in
-their research, and their consequences for scientific work.
+In the first part of the tutorial, we will first set out motivations
+for publishing software and code, and then discuss problems faced by
+researchers in trying to do so, taking into account their consequences
+for scientific rigour.

-We then give an overview of {\bf software tools, facilities and
- methods} available for researchers to assist with collaborative
-development and software publication, including:
+We then give an overview of software tools, facilities and methods
+available for researchers to assist with collaborative development and
+software publication, including:

 \begin{itemize}
 \item Version control software: The concepts; practical advantages; overview of Mercurial, Git, Subversion; hosting facilities such as Github, Bitbucket, or (for UK researchers) our own code.soundsoftware.ac.uk;
 \item Unit testing and managing provenance and reproducibility for code;
 \item Data management: principles, repositories, and versioning;
-\item Software licences: commonly-used open-source licences; the pros and cons of GPL and BSD licensing schemes
+\item Software and data licences: commonly-used open-source licences;
+ pros and cons of GPL and BSD licensing schemes; Creative Commons
 \end{itemize}

 \subsection{Hands-on session}

-The second part of the tutorial is a hands-on session in which
-attendees will get the opportunity to work through an example using
+The second part of the tutorial will be a hands-on session in which
+attendees will get the opportunity to work through an example with
 real code.
 A ``toy'' music informatics programming problem will be presented, with
@@ -117,12 +123,13 @@
  preference) using a very simple unit testing regime;
 \item Place the code under version control using a local repository
  in a distributed version control system;
-\item Tag the code and make a record of the software version and its
- corresponding output data version;
-\item Tweak the algorithm and record the updated versions accordingly;
+\item Tag the code and make a record associating the software version
+ with its output data version;
+\item Tweak the algorithm and record the updated software and data
+ versions accordingly;
 \item Place the resulting software under a standard open-source
  software licence;
-\item Follow a simple ``release procedure'' to produce a source code
+\item Follow a simple ``release procedure'' to produce a code and data
  release.
 \end{itemize}

@@ -138,29 +145,31 @@
 \begin{itemize}
 \item Publication mechanisms for reproducible research:
 \begin{itemize}
-\item Open-access journal papers
-\item Self-archiving
-\item Technical reports
-\item Copyright issues relating to journal or book publication
-\item Publishing software in such a way that its relationship with the written publication is apparent
-\item Associating specific versions of software or data with a publication
+\item Open-access journal papers;
+\item Self-archiving;
+\item Technical reports;
+\item Copyright issues relating to journal or book publication;
+\item Mechanisms for associating software with the paper publication;
+\item Identifying specific versions of software or data with a
+ publication.
 \end{itemize}
 \item Publication policies for research group leaders:
 \begin{itemize}
-\item Why publish software?
-\item Institutional assistance with publication barriers
-\item The research community
+\item Why publish software and data?;
+\item What software and data should be published, and when?
+\item Institutional assistance with publication barriers;
+\item The research community.
 \end{itemize}
 \end{itemize}

 \section{Intended and expected audience}

 The primary audience for this tutorial is researchers within the music
-informatics community who have to develop or reuse software and data
-during their day-to-day research.
+informatics community who develop or reuse software and data during
+their day-to-day research.

 We believe that an overwhelming majority of material submitted to
-ISMIR required software to be developed during research. Given results
+ISMIR requires software to be developed during research. Given results
 showing that most researchers are self-taught in software development,
 and in light of the reasons researchers report as to why they do not
 publish software, we think that a large proportion of the active
@@ -175,8 +184,7 @@

 \subsection{Experience in this area}

-The presenters manage the Sound Software project\footnote{\tt
- http://soundsoftware.ac.uk/} and Sustainable Management of Digital
+The presenters manage the Sound Software project and Sustainable Management of Digital
 Music Research Data project\footnote{\tt
  http://rdm.c4dm.eecs.qmul.ac.uk/} in the Centre for Digital Music
 (C4DM) at Queen Mary University of London.
@@ -210,7 +218,13 @@
 from Instituto Superior T\'ecnico in Lisbon, where he specialized in
 digital signal processing with a focus on speech synthesis.

-{\bf Dr Marco Fabiani} (to be provided by Marco)
+{\bf Dr Marco Fabiani} is a post-doctoral Research Assistant at C4DM
+working on the Sustainable Management of Digital Music Research Data
+project. He recently completed his PhD in Computer Science - Speech
+and Music Communication (KTH, Stockholm) with a thesis on interactive
+computer-based music performance, and has worked on topics including
+audio signal processing, music information retrieval, HCI, and sound
+perception.

 {\bf Prof Mark Plumbley} is Director of C4DM and leads the Sound
 Software initiative.
His work in audio signal analysis includes beat
@@ -233,8 +247,17 @@
 ISMIR 2006 tutorial on Computational Rhythm Description.

 \section{Any special requirements}
+
+Attendees will be encouraged to bring and use laptops, so sufficient
+space and network capacity would be welcome.
+
+It would be nice to separate the three parts of the tutorial with
+coffee and biscuit breaks!
+
 \section{Contact information}

+Please contact Chris Cannam, {\tt chris.cannam@eecs.qmul.ac.uk}.
+
 \end{document}

 %%% Local Variables: