Mercurial > hg > soundsoftware-ismir-2012
changeset 1:17c67727560b
Some updates following first reviews from Mark
author | Chris Cannam |
---|---|
date | Thu, 08 Mar 2012 17:46:16 +0000 |
parents | d2cf7cdf7882 |
children | 8c92f88fa920 |
files | tutorial.pdf tutorial.tex |
diffstat | 2 files changed, 150 insertions(+), 91 deletions(-) |
--- a/tutorial.tex Thu Mar 08 13:56:24 2012 +0000
+++ b/tutorial.tex Thu Mar 08 17:46:16 2012 +0000
@@ -3,11 +3,23 @@
 \onecolumn
 \raggedbottom
 
+%\oddsidemargin -6.2truemm
+%\evensidemargin -6.2truemm
+
+\topmargin 0truept
+\headheight 0truept
+\headsep 0truept
+%\footheight 0truept
+%\footskip 0truept
+\textheight 229truemm
+%\textwidth 178truemm
+
 
 \title{ISMIR Tutorial Proposal}
-\author{Mark D. Plumbley, Chris Cannam, and Lu\'{i}s A. Figueira\\
+\author{Chris Cannam, Lu\'{i}s A. Figueira, Marco Fabiani,\\
+Mark D. Plumbley, and Simon Dixon\\
 Centre for Digital Music, Queen Mary University of London\\
- {\tt\small \{mark.plumbley, chris.cannam, luis.figueira\}@eecs.qmul.ac.uk}}
+ {\tt \{firstname.lastname\}@eecs.qmul.ac.uk}}
 
 \begin{document}
 %
@@ -18,83 +30,108 @@
 
 {\bf Reusable software and reproducibility in music informatics research}
 
+\section{Motivation}
+
+The requirement to develop and reuse software is almost universal in
+audio and music informatics research. Many methods, including most of
+those published at ISMIR, are developed in tandem with software
+implementations, and many of them are too complex or too fundamentally
+software-based to be reproduced readily from a published paper
+alone. For this reason, it is helpful for sustainable research to have
+software and data published along with papers.
+
+In practice, however, non-publication of code and data is still the
+norm and research software is commonly lost after publication of the
+associated methods.
+
+Our survey by the Sound Software project (http://soundsoftware.ac.uk)
+of UK audio and music researchers in 2011 found that even among those
+respondents who reported both developing software during research and
+taking steps to reproducibility for their publications, only 35\%
+reported having in fact published any of their code. Our respondents
+cited as obstacles to the publication of code lack of time, copyright
+restrictions, and the potential for future commercial use. A broader
+study into science research across several subject areas by the UK
+Research Information Network additionally identified the lack of
+evidence of benefits, cultures of independence and competition, and
+quality concerns as inhibiting factors.
+
+We identified a number of barriers to the publication of software and
+data, including:
+
+\begin{itemize}
+\item lack of education in software development and consequently of
+ confidence that code is of publishable quality;
+\item lack of facilities and tools to enable collaborative development
+ and to support a familiarity with sharing and publishing code;
+\item lack of incentive to distribute software and data;
+\item reusability problems caused by platform incompatibilities.
+\end{itemize}
+
+During this tutorial we will discuss these problems, outline a
+possible course of action that researchers and research groups can
+take in order to mitigate each of these barriers, and present a
+practical, hands-on session in which attendees can familiarise
+themselves with some of the tools and methods involved and gain
+confidence with using them for their own work.
+
 \section{Outline of the tutorial content}
 
 This tutorial will be in three parts:
 \begin{itemize}
-\item An {\bf introduction and overview} discussing the motivation for reusable software in research and providing an overview of some methods, tools and facilities available to researchers for this purpose;
-\item A practical {\bf hands-on example} section in which attendees are encouraged to try out some of these methods in code;
-\item A {\bf review and discussion} on the subject of publications policy, relevant also to research group leaders.
+\item An {\bf introduction and overview} discussing the motivation for reusable software and data in research and providing an overview of some methods, tools and facilities available to researchers for this purpose;
+\item A {\bf hands-on session} in which attendees are encouraged to try out some of these methods in code;
+\item A review and discussion of {\bf practical issues} in ensuring
+ that publication actually occurs, relevant also to research group
+ leaders.
 \end{itemize}
 
 \subsection{Introduction and overview}
 
-In the first part of the tutorial we will discuss some of the problems
-faced by researchers in developing and reusing software in their
-research, and their consequences for scientific work. We will then
-outline some of the options available for researchers to help overcome
-barriers to software reuse.
+In the first part of the tutorial, we will first discuss problems
+faced by researchers in developing and reusing software and data in
+their research, and their consequences for scientific work.
 
-\subsubsection{The present situation and what we can do about it}
-
-We will talk about findings which show limited levels of collaborative
-development and software publication in audio and music research, and
-present and discuss research and survey data pointing to some common
-causes for the lack of publication and eventual disappearance of
-research software code.
-
-For example, our survey of UK audio and music researchers in 2011
-found that even among those respondents who reported both developing
-software during research and taking steps to reproducibility for their
-publications, only 35\% reported having in fact published any of their
-code. Our respondents cited as obstacles to the publication of code
-lack of time, copyright restrictions, and the potential for future
-commercial use. (A broader study into science research across several
-subject areas by the UK Research Information Network additionally
-identified the lack of evidence of benefits, cultures of independence
-and competition, and quality concerns as inhibiting factors.)
-
-We will identify a number of barriers to the publication of code,
-including the lack of education and confidence with code, lack of
-facilities and tools to support collaborative development, lack of
-incentive to distribute software, and reusability problems caused by
-platform incompatibilities. We will outline a possible course of
-action that researchers and research groups can take in order to
-mitigate each of these barriers, including focused small-scale
-training programmes, the use of version control software, and actions
-that create an association between published software and citeable
-publications.
-
-\subsubsection{Software, tools, and facilities}
-
-This section will present an overview of methods, tools and facilities available to researchers to assist with collaborative development and software publication, including:
+We then give an overview of {\bf software tools, facilities and
+ methods} available for researchers to assist with collaborative
+development and software publication, including:
 
 \begin{itemize}
-\item Version control software: The concepts; practical advantages; overview of Mercurial, Git, Subversion; hosting facilities such as Github, Bitbucket, or (for UK researchers) our own code.soundsoftware.ac.uk
-\item Unit testing and managing provenance and reproducibility for code
+\item Version control software: The concepts; practical advantages; overview of Mercurial, Git, Subversion; hosting facilities such as Github, Bitbucket, or (for UK researchers) our own code.soundsoftware.ac.uk;
+\item Unit testing and managing provenance and reproducibility for code;
+\item Data management: principles, repositories, and versioning;
 \item Software licences: commonly-used open-source licences; the pros and cons of GPL and BSD licensing schemes
-\item Data management: principles and repositories
 \end{itemize}
 
-\subsection{Hands-on examples}
+\subsection{Hands-on session}
 
-In this section, the second of the three, attendees will get the opportunity to work through an example using real code.
+The second part of the tutorial is a hands-on session in which
+attendees will get the opportunity to work through an example using
+real code.
 
-A toy MIR-related programming problem will be presented, and attendees will pair up to:
+A ``toy'' music informatics programming problem will be presented, with
+sample code and data available, and attendees will pair up to:
 
 \begin{itemize}
-\item Implement it in either Python or MATLAB/Octave using a very simple unit testing regime;
-\item Place the code under version control using a local repository in a distributed version control system;
-\item Make the appropriate changes to place the result under a standard open-source software licence;
-\item Tag the code and follow a simple "release procedure" to produce a source code package.
+\item Implement it in Python or MATLAB/Octave (according to their
+ preference) using a very simple unit testing regime;
+\item Place the code under version control using a local repository in
+ a distributed version control system;
+\item Tag the code and make a record of the software version and its
+ corresponding output data version;
+\item Tweak the algorithm and record the updated versions accordingly;
+\item Place the resulting software under a standard open-source
+ software licence;
+\item Follow a simple ``release procedure'' to produce a source code
+ release.
 \end{itemize}
 
-\subsection{Review and discussion}
+\subsection{Practical issues}
 
-Following a review of the results of the hands-on example, we will
-open out the discussion into the wider field of reproducible
-publication, and into areas of policy and actions that research groups
-and research leaders may wish to consider.
+The third part of the tutorial will open out the discussion into the
+wider field of reproducible publication, and into areas of policy and
+actions that research groups and research leaders may wish to
+consider.
 
 This section will therefore cover:
 
@@ -110,7 +147,7 @@
 \end{itemize}
 \item Publication policies for research group leaders:
 \begin{itemize}
-\item Why publish?
+\item Why publish software?
 \item Institutional assistance with publication barriers
 \item The research community
 \end{itemize}
@@ -119,8 +156,8 @@
 \section{Intended and expected audience}
 
 The primary audience for this tutorial is researchers within the music
-informatics community who have to develop or reuse software during
-their day-to-day research.
+informatics community who have to develop or reuse software and data
+during their day-to-day research.
 
 We believe that an overwhelming majority of material submitted to
 ISMIR required software to be developed during research. Given results
@@ -134,44 +171,66 @@
 research group leaders, because of its implications in terms of both
 institutional and group policy and guidance for research students.
 
-\section{Short biography of the presenter(s)}
+\section{Short biography of the presenters}
 
-The presenters manage the Sound Software project
-(http://soundsoftware.ac.uk/), an initiative to assist researchers in
-audio and music fields in the UK to manage software code in a more
-sustainable manner. Based at the Centre for Digital Music (C4DM) at
-Queen Mary, University of London, they have extensive experience in
-audio and music research (including music information retrieval) and
-software development, and have given workshops on sustainable software
-development in research at the C4DM and elsewhere in the UK.
+\subsection{Experience in this area}
 
-Mark Plumbley is Director of C4DM and leads the Sound Software
-initiative. His work in audio signal analysis includes beat tracking,
-music transcription, source separation and object coding, using
-techniques such as neural networks, independent component analysis,
-sparse representations and Bayesian modelling. Prof Plumbley is Chair
+The presenters manage the Sound Software project\footnote{\tt
+ http://soundsoftware.ac.uk/} and Sustainable Management of Digital
+Music Research Data project\footnote{\tt
+ http://rdm.c4dm.eecs.qmul.ac.uk/} in the Centre for Digital Music
+(C4DM) at Queen Mary University of London.
+
+The Sound Software project is an EPSRC-funded initiative to assist
+researchers to manage software code in a more sustainable manner,
+based at the C4DM but with the whole UK audio and music research
+community as its focus.
+
+The Sustainable Management of Digital Music Research Data project is a
+JISC-funded pilot data-management project focusing on data published
+by the C4DM.
+
+The presenters have extensive experience in audio and music research
+and software development, and have given workshops on sustainable
+software development in research at the C4DM and elsewhere in the UK.
+
+\subsection{The presenters}
+
+{\bf Chris Cannam} is the principal developer for the Sound Software
+project and code hosting site.\footnote{\tt
+ http://code.soundsoftware.ac.uk} He is a software developer with
+many years of commercial and open-source cross-platform development
+experience. While at C4DM he has developed software including the
+widely-used Sonic Visualiser audio analysis and visualisation
+application.
+
+{\bf Lu\'is Figueira} is a software developer with several years of
+experience with C/C++, Ruby on Rails, Scheme, Web technologies and
+databases. He has an MSc in Electrotechnical and Computer Engineering
+from Instituto Superior T\'ecnico in Lisbon, where he specialized in
+digital signal processing with a focus on speech synthesis.
+
+{\bf Dr Marco Fabiani} (to be provided by Marco)
+
+{\bf Prof Mark Plumbley} is Director of C4DM and leads the Sound
+Software initiative. His work in audio signal analysis includes beat
+tracking, music transcription, source separation and object coding,
+using techniques such as neural networks, independent component
+analysis, sparse representations and Bayesian modelling. He is Chair
 of the International Independent Component Analysis Steering
 Committee, a member of the IEEE Machine Learning in Signal Processing
 Technical Committee, and an Associate Editor for IEEE Transactions on
 Neural Networks. He leads the ICA Research Network and Digital Music
 Research Network.
 
-Chris Cannam is a software developer with 15 years commercial and
-extensive open-source and cross-platform development experience. While
-at the C4DM he has worked on the widely-used Sonic Visualiser audio
-analysis and visualisation application; Sonic Annotator, a tool for
-batch extraction of meaningful features from audio files; the Vamp
-plugin API for audio feature extraction, and many plugins using this
-API; and tools and ontologies for music description using RDF within
-the Semantic Web.
-
-Luis Figueira is a software developer with more than 5 years of
-experience with C/C++, Ruby on Rails, Scheme, Web technologies and
-databases. He has an MSc in Electrotechnical and Computers Engineering
-from Instituto Superior Técnico in Lisbon, where he specialized in
-digital signal processing with a focus on speech synthesis. Luis has
-recently worked in a speech technology spin-off and an open-source web
-development company.
+{\bf Dr Simon Dixon} leads the Music Informatics area of C4DM and the
+Sustainable Management of Digital Music Research Data project. His
+research interests cover various aspects of music informatics,
+including high-level music signal analysis and the representation of
+musical knowledge. He has been General Co-Chair of the Dagstuhl
+Seminar on Multimodal Music Processing and Computer Music Modeling and
+Retrieval, Programme Co-Chair for ISMIR 2007, and co-presenter of the
+ISMIR 2006 tutorial on Computational Rhythm Description.
 
 \section{Any special requirements}
 \section{Contact information}