% tutorial.tex, revision 1:17c67727560b
% Some updates following first reviews from Mark
% author: Chris Cannam
% date: Thu, 08 Mar 2012 17:46:16 +0000
% parents: d2cf7cdf7882; children: 8c92f88fa920
\documentclass{article}
\usepackage{amsmath,graphicx}
\onecolumn
\raggedbottom

%\oddsidemargin  -6.2truemm
%\evensidemargin -6.2truemm

\topmargin 0truept
\headheight 0truept
\headsep 0truept
%\footheight 0truept
%\footskip 0truept
\textheight 229truemm
%\textwidth 178truemm

\title{ISMIR Tutorial Proposal}

\author{Chris Cannam, Lu\'{i}s A. Figueira, Marco Fabiani,\\
Mark D. Plumbley, and Simon Dixon\\
Centre for Digital Music, Queen Mary University of London\\
  {\tt \{firstname.lastname\}@eecs.qmul.ac.uk}}

\begin{document}
%
\maketitle
%

\section{Title}

{\bf Reusable software and reproducibility in music informatics research}

\section{Motivation}

The requirement to develop and reuse software is almost universal in
audio and music informatics research.  Many methods, including most of
those published at ISMIR, are developed in tandem with software
implementations, and many of them are too complex or too fundamentally
software-based to be reproduced readily from a published paper
alone. For this reason, it is helpful for sustainable research to have
software and data published along with papers.

In practice, however, non-publication of code and data is still the
norm and research software is commonly lost after publication of the
associated methods.

A 2011 survey of UK audio and music researchers by our Sound Software
project (http://soundsoftware.ac.uk) found that, even among those
respondents who reported both developing software during research and
taking steps towards reproducibility for their publications, only 35\%
reported having in fact published any of their code. Respondents
cited lack of time, copyright restrictions, and the potential for
future commercial use as obstacles to publishing code. A broader
study into science research across several subject areas by the UK
Research Information Network additionally identified the lack of
evidence of benefits, cultures of independence and competition, and
quality concerns as inhibiting factors.

We identified a number of barriers to the publication of software and
data, including:

\begin{itemize}
\item lack of education in software development and consequently of
  confidence that code is of publishable quality;
\item lack of facilities and tools to enable collaborative development
  and to support a familiarity with sharing and publishing code;
\item  lack of incentive to distribute software and data;
\item  reusability problems caused by platform incompatibilities.
\end{itemize}

During this tutorial we will discuss these problems, outline a
possible course of action that researchers and research groups can
take in order to mitigate each of these barriers, and present a
practical, hands-on session in which attendees can familiarise
themselves with some of the tools and methods involved and gain
confidence with using them for their own work.

\section{Outline of the tutorial content}

This tutorial will be in three parts:
\begin{itemize}
\item An {\bf introduction and overview} discussing the motivation for reusable software and data in research and providing an overview of some methods, tools and facilities available to researchers for this purpose;
\item A {\bf hands-on session} in which attendees are encouraged to try out some of these methods in code;
\item A review and discussion of {\bf practical issues} in ensuring
  that publication actually occurs, relevant also to research group
  leaders.
\end{itemize}

\subsection{Introduction and overview}

In the first part of the tutorial, we will begin by discussing problems
faced by researchers in developing and reusing software and data in
their research, and their consequences for scientific work.

We will then give an overview of {\bf software tools, facilities and
  methods} available for researchers to assist with collaborative
development and software publication, including:

\begin{itemize}
\item Version control software: concepts; practical advantages; an overview of Mercurial, Git and Subversion; hosting facilities such as GitHub, Bitbucket, or (for UK researchers) our own code.soundsoftware.ac.uk;
\item Unit testing and managing provenance and reproducibility for code;
\item Data management: principles, repositories, and versioning;
\item Software licences: commonly used open-source licences; the pros and cons of GPL and BSD licensing schemes.
\end{itemize}
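
As a minimal illustration of what managing provenance can mean in
practice, the following Python sketch stores a checksum of some output
data alongside an identifier for the software version that produced
it. The function name {\tt provenance\_record}, the record fields and
the version string are assumptions made for this example only, not
part of any existing tool.

\begin{verbatim}
import hashlib
import json


def provenance_record(output_bytes, software_version):
    """Describe one output: the code version used and a checksum."""
    digest = hashlib.sha256(output_bytes).hexdigest()
    return {"software_version": software_version,
            "output_sha256": digest}


if __name__ == "__main__":
    # Hypothetical output data and version tag, for illustration only.
    record = provenance_record(b"0.0 0.5 1.0\n", "v1.0")
    print(json.dumps(record, indent=2, sort_keys=True))
\end{verbatim}

Keeping such a record next to each set of results makes it possible to
check later whether a given output still matches the code version it
is claimed to come from.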

\subsection{Hands-on session}

The second part of the tutorial is a hands-on session in which
attendees will get the opportunity to work through an example using
real code.

A ``toy'' music informatics programming problem will be presented, with
sample code and data available, and attendees will pair up to:

\begin{itemize}
\item Implement it in Python or MATLAB/Octave (according to their
  preference) using a very simple unit testing regime;
\item Place the code under version control using a local repository in
  a distributed version control system;
\item Tag the code and make a record of the software version and its
  corresponding output data version;
\item Tweak the algorithm and record the updated versions accordingly;
\item Place the resulting software under a standard open-source
  software licence;
\item Follow a simple ``release procedure'' to produce a source code
  release.
\end{itemize}
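
As a rough indication of what a very simple unit testing regime might
look like, the following Python sketch tests a stand-in problem:
counting zero crossings in a list of samples. The choice of problem,
the function name {\tt count\_zero\_crossings} and the test values are
assumptions made for illustration only; the toy problem used in the
session may differ.

\begin{verbatim}
def count_zero_crossings(samples):
    """Count sign changes between consecutive non-zero samples."""
    crossings = 0
    previous = 0
    for s in samples:
        if s == 0:
            continue  # zeros carry no sign, so skip them
        if previous != 0 and (s > 0) != (previous > 0):
            crossings += 1
        previous = s
    return crossings


def test_count_zero_crossings():
    # Each assertion checks one small, easily verified case.
    assert count_zero_crossings([]) == 0
    assert count_zero_crossings([1, 2, 3]) == 0
    assert count_zero_crossings([1, -1, 1, -1]) == 3
    assert count_zero_crossings([1, 0, -1]) == 1


if __name__ == "__main__":
    test_count_zero_crossings()
    print("all tests passed")
\end{verbatim}

Re-running the tests after each change to the algorithm gives a quick
check that behaviour which should stay the same has done so, before
the new version is committed and tagged.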

\subsection{Practical issues}

The third part of the tutorial will open out the discussion into the
wider field of reproducible publication, and into areas of policy and
actions that research groups and research leaders may wish to
consider.

This section will therefore cover:

\begin{itemize}
\item    Publication mechanisms for reproducible research:
\begin{itemize}
\item        Open-access journal papers
\item        Self-archiving
\item        Technical reports
\item        Copyright issues relating to journal or book publication
\item        Publishing software in such a way that its relationship with the written publication is apparent
\item        Associating specific versions of software or data with a publication
\end{itemize}
\item    Publication policies for research group leaders:
\begin{itemize}
\item        Why publish software?
\item        Institutional assistance with publication barriers
\item        The research community
\end{itemize}
\end{itemize}

\section{Intended and expected audience}

The primary audience for this tutorial is researchers within the music
informatics community who have to develop or reuse software and data
during their day-to-day research.

We believe that an overwhelming majority of material submitted to
ISMIR required software to be developed during research. Given results
showing that most researchers are self-taught in software development,
and in light of the reasons researchers report as to why they do not
publish software, we think that a large proportion of the active
researchers represented at ISMIR, regardless of subject focus, will find
the material in our tutorial of interest.

Our tutorial is also highly relevant to research supervisors and
research group leaders, because of its implications in terms of both
institutional and group policy and guidance for research students.

\section{Short biography of the presenters}

\subsection{Experience in this area}

The presenters manage the Sound Software project\footnote{\tt
  http://soundsoftware.ac.uk/} and the Sustainable Management of Digital
Music Research Data project\footnote{\tt
  http://rdm.c4dm.eecs.qmul.ac.uk/} in the Centre for Digital Music
(C4DM) at Queen Mary University of London.

The Sound Software project is an EPSRC-funded initiative to assist
researchers to manage software code in a more sustainable manner,
based at the C4DM but with the whole UK audio and music research
community as its focus.

The Sustainable Management of Digital Music Research Data project is a
JISC-funded pilot data-management project focusing on data published
by the C4DM.

The presenters have extensive experience in audio and music research
and software development, and have given workshops on sustainable
software development in research at the C4DM and elsewhere in the UK.

\subsection{The presenters}

{\bf Chris Cannam} is the principal developer for the Sound Software
project and code hosting site.\footnote{\tt
  http://code.soundsoftware.ac.uk} He is a software developer with
many years of commercial and open-source cross-platform development
experience. While at C4DM he has developed software including the
widely used Sonic Visualiser audio analysis and visualisation
application.

{\bf Lu\'is Figueira} is a software developer with several years of
experience with C/C++, Ruby on Rails, Scheme, Web technologies and
databases. He has an MSc in Electrotechnical and Computer Engineering
from Instituto Superior T\'ecnico in Lisbon, where he specialised in
digital signal processing with a focus on speech synthesis.

{\bf Dr Marco Fabiani} (to be provided by Marco)

{\bf Prof Mark Plumbley} is Director of C4DM and leads the Sound
Software initiative. His work in audio signal analysis includes beat
tracking, music transcription, source separation and object coding,
using techniques such as neural networks, independent component
analysis, sparse representations and Bayesian modelling. He is Chair
of the International Independent Component Analysis Steering
Committee, a member of the IEEE Machine Learning in Signal Processing
Technical Committee, and an Associate Editor for IEEE Transactions on
Neural Networks. He leads the ICA Research Network and Digital Music
Research Network.

{\bf Dr Simon Dixon} leads the Music Informatics area of C4DM and the
Sustainable Management of Digital Music Research Data project. His
research interests cover various aspects of music informatics,
including high-level music signal analysis and the representation of
musical knowledge. He has been General Co-Chair of the Dagstuhl
Seminar on Multimodal Music Processing and Computer Music Modeling and
Retrieval, Programme Co-Chair for ISMIR 2007, and co-presenter of the
ISMIR 2006 tutorial on Computational Rhythm Description.

\section{Any special requirements}
\section{Contact information}

\end{document}

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: t
%%% End: