changeset 91:0cc3066d8abe
Paper: Introduction + Interface
author    Brecht De Man <b.deman@qmul.ac.uk>
date      Mon, 27 Apr 2015 19:38:27 +0100
parents   e100556bd44f
children  985020f428bb
files     docs/SMC15/smc2015template.bbl docs/SMC15/smc2015template.bib docs/SMC15/smc2015template.tex
diffstat  3 files changed, 60 insertions(+), 50 deletions(-)
--- a/docs/SMC15/smc2015template.bbl Mon Apr 27 16:49:32 2015 +0100
+++ b/docs/SMC15/smc2015template.bbl Mon Apr 27 19:38:27 2015 +0100
@@ -39,13 +39,11 @@
 \bibitem{durr2015implementation}
 G.~Durr, L.~Peixoto, M.~Souza, R.~Tanoue, and J.~D. Reiss, ``Implementation
   and evaluation of dynamic level of audio detail,'' in \emph{Audio Engineering
-  Society Conference: 56th International Conference: Audio for Games}, February
-  2015.
+  Society Conference: 56th International Conference: Audio for Games}, 2015.
 
 \bibitem{deman2014a}
 B.~De~Man and J.~D. Reiss, ``Adaptive control of amplitude distortion
-  effects,'' in \emph{53rd Conference of the Audio Engineering Society},
-  January 2014.
+  effects,'' in \emph{53rd Conference of the Audio Engineering Society}, 2014.
 
 \bibitem{mushram}
 E.~Vincent, M.~G. Jafari, and M.~D. Plumbley, ``Preliminary guidelines for
@@ -56,7 +54,7 @@
 J.~D. Reiss and C.~Uhle, ``Determined source separation for microphone
   recordings using {IIR} filters,'' in \emph{129th Convention of the Audio
   Engineering Society}.\hskip 1em plus 0.5em minus 0.4em\relax Audio
-  Engineering Society, November 2010.
+  Engineering Society, 2010.
 
 \bibitem{song2013b}
 Y.~Song, S.~Dixon, and A.~R. Halpern, ``Do online social tags predict perceived
@@ -76,7 +74,7 @@
 \bibitem{deman2014b}
 B.~De~Man and J.~D. Reiss, ``{APE}: {A}udio {P}erceptual {E}valuation toolbox
   for {MATLAB},'' in \emph{136th Convention of the Audio Engineering Society},
-  April 2014.
+  2014.
 
 \bibitem{beaqlejs}
 S.~Kraft and U.~Z{\"o}lzer, ``{BeaqleJS}: {HTML5} and {JavaScript} based
@@ -93,29 +91,39 @@
 \emph{AIA/DAGA Conference on Acoustics, Merano (Italy)}, 2013.
 
 \bibitem{whisper}
-S.~Ciba, A.~Wlodarski, and H.-J. Maempel, ``Whis{PER} -- a new tool for
+S.~Ciba, A.~Wlodarski, and H.-J. Maempel, ``Whis{PER} -- {A} new tool for
   performing listening tests,'' in \emph{126th Convention of the Audio
-  Engineering Society}, May 7-10 2009.
+  Engineering Society}, 2009.
+
+\bibitem{opaque}
+J.~Berg, ``{OPAQUE} -- {A} tool for the elicitation and grading of audio
+  quality attributes,'' in \emph{118th Convention of the Audio Engineering
+  Society}, 2005.
+
+\bibitem{guineapig}
+J.~Hynninen and N.~Zacharov, ``{GuineaPig} - {A} generic subjective test system
+  for multichannel audio,'' in \emph{106th Convention of the Audio Engineering
+  Society Convention}, 1999.
 
 \bibitem{mushra}
 \emph{Method for the subjective assessment of intermediate quality level of
   coding systems}.\hskip 1em plus 0.5em minus 0.4em\relax Recommendation {ITU-R
   BS.1534-1}, 2003.
 
-\bibitem{guineapig}
-J.~Hynninen and N.~Zacharov, ``{GuineaPig} - {A} generic subjective test system
-  for multichannel audio,'' in \emph{106th Convention of the Audio Engineering
-  Society Convention}, 1999.
-
 \bibitem{mason2015compression}
 A.~Mason, N.~Jillings, Z.~Ma, J.~D. Reiss, and F.~Melchior, ``Adaptive audio
   reproduction using personalized compression,'' in \emph{Audio Engineering
   Society Conference: 57th International Conference: The Future of Audio
-  Entertainment Technology -- Cinema, Television and the Internet}, March 2015.
+  Entertainment Technology -- Cinema, Television and the Internet}, 2015.
 
 \bibitem{bech}
 S.~Bech and N.~Zacharov, \emph{Perceptual Audio Evaluation - Theory, Method
   and Application}.\hskip 1em plus 0.5em minus 0.4em\relax John Wiley \& Sons,
   2007.
 
+\bibitem{deman2015a}
+B.~De~Man, M.~Boerum, B.~Leonard, G.~Massenburg, R.~King, and J.~D. Reiss,
+  ``Perceptual evaluation of music mixing practices,'' in \emph{138th
+  Convention of the Audio Engineering Society}, 2015.
+
 \end{thebibliography}
--- a/docs/SMC15/smc2015template.bib Mon Apr 27 16:49:32 2015 +0100
+++ b/docs/SMC15/smc2015template.bib Mon Apr 27 19:38:27 2015 +0100
@@ -23,7 +23,6 @@
 @conference{deman2014b,
     Author = {De Man, Brecht and Joshua D. Reiss},
     Booktitle = {136th Convention of the Audio Engineering Society},
-    Month = {April},
     Title = {{APE}: {A}udio {P}erceptual {E}valuation toolbox for {MATLAB}},
     Year = {2014}}
 
@@ -46,7 +45,6 @@
 @conference{mason2015compression,
     Author = {Mason, Andrew and Jillings, Nick and Ma, Zheng and Reiss, Joshua D. and Melchior, Frank},
     Booktitle = {Audio Engineering Society Conference: 57th International Conference: The Future of Audio Entertainment Technology -- Cinema, Television and the Internet},
-    Month = {March},
     Title = {Adaptive Audio Reproduction Using Personalized Compression},
     Year = {2015}}
 
@@ -69,7 +67,6 @@
 @inproceedings{uhlereiss,
     Author = {Reiss, Joshua D. and Uhle, Christian},
     Booktitle = {129th Convention of the Audio Engineering Society},
-    Month = {November},
     Organization = {Audio Engineering Society},
     Title = {Determined Source Separation for Microphone Recordings Using {IIR} Filters},
     Year = {2010}}
@@ -78,7 +75,6 @@
 @conference{deman2014a,
     Author = {De Man, Brecht and Joshua D. Reiss},
     Booktitle = {53rd Conference of the Audio Engineering Society},
-    Month = {January},
     Title = {Adaptive Control of Amplitude Distortion Effects},
     Year = {2014}}
 
@@ -97,8 +93,7 @@
 @conference{whisper,
     Author = {Simon Ciba and Andr{\'e} Wlodarski and Hans-Joachim Maempel},
     Booktitle = {126th Convention of the Audio Engineering Society},
-    Month = {May 7-10},
-    Title = {Whis{PER} -- A new tool for performing listening tests},
+    Title = {Whis{PER} -- {A} new tool for performing listening tests},
     Year = {2009}}
 
 @conference{scale,
@@ -130,7 +125,6 @@
 @conference{durr2015implementation,
     Author = {Durr, Gabriel and Peixoto, Lys and Souza, Marcelo and Tanoue, Raisa and Reiss, Joshua D.},
     Booktitle = {Audio Engineering Society Conference: 56th International Conference: Audio for Games},
-    Month = {February},
     Title = {Implementation and Evaluation of Dynamic Level of Audio Detail},
     Year = {2015}}
 
@@ -152,3 +146,14 @@
     Title = {{GuineaPig} - {A} generic subjective test system for multichannel audio},
     Year = {1999}}
 
+@conference{opaque,
+    Author = {Berg, Jan},
+    Booktitle = {118th Convention of the Audio Engineering Society},
+    Title = {{OPAQUE} -- {A} Tool for the Elicitation and Grading of Audio Quality Attributes},
+    Year = {2005}}
+
+@conference{deman2015a,
+    Author = {De Man, Brecht and Boerum, Matt and Leonard, Brett and Massenburg, George and King, Richard and Reiss, Joshua D.},
+    Booktitle = {138th Convention of the Audio Engineering Society},
+    Title = {Perceptual Evaluation of Music Mixing Practices},
+    Year = {2015}}
--- a/docs/SMC15/smc2015template.tex Mon Apr 27 16:49:32 2015 +0100
+++ b/docs/SMC15/smc2015template.tex Mon Apr 27 19:38:27 2015 +0100
@@ -9,6 +9,8 @@
 \usepackage{ifpdf}
 \usepackage[english]{babel}
 \usepackage{cite}
+\usepackage{enumitem}
+\setitemize{noitemsep,topsep=0pt,parsep=0pt,partopsep=0pt}
 
 \hyphenation{Java-script}
 
@@ -151,9 +153,9 @@
 
 %NICK: examples of what kind of audio applications HTML5 has made possible, with references to publications (or website)\\
 
-Perceptual evaluation of audio plays an important role in a wide range of research including audio quality \cite{schoeffler2013impact,repp}, sound synthesis \cite{de2013real,durr2015implementation}, audio effect design \cite{deman2014a}, source separation \cite{mushram,uhlereiss}, music and emotion analysis \cite{song2013b,song2013a}, and many others \cite{friberg2011comparison}. % codec design?
+Perceptual evaluation of audio plays an important role in a wide range of research on audio quality \cite{schoeffler2013impact,repp}, sound synthesis \cite{de2013real,durr2015implementation}, audio effect design \cite{deman2014a}, source separation \cite{mushram,uhlereiss}, music and emotion analysis \cite{song2013b,song2013a}, and many others \cite{friberg2011comparison}. % codec design?
 
-This work is based in part on the APE audio perceptual evaluation interface for MATLAB \cite{deman2014b}. An important drawback of this toolbox is the need to have MATLAB to create a test and even to run (barring the use of an executable generated by MATLAB), and limited compatibility with both earlier and newer versions of MATLAB, which makes it hard to maintain. On the other hand, a web application generally has the advantage of running in most browsers on most applications.
+%This work is based in part on the APE audio perceptual evaluation interface for MATLAB \cite{deman2014b}. An important drawback of this toolbox is the need to have MATLAB to create a test and even to run (barring the use of an executable generated by MATLAB), and limited compatibility with both earlier and newer versions of MATLAB, which makes it hard to maintain. On the other hand, a web application generally has the advantage of running in most browsers on most applications. % IMPORTANT
 
 %[TO ADD: other interfaces for perceptual evaluation of audio, browser-based or not!] \\
 
@@ -162,9 +164,8 @@
 
 % to add: OPAQUE, Rumsey's repertory grid technique
 
-
 \begin{table}[htdp]
-\caption{Available audio perceptual evaluation interfaces}
+\caption{Available audio perceptual evaluation tools}
 \begin{center}
 \begin{tabular}{|*{3}{l|}}
 % order?
@@ -184,38 +185,33 @@
 \label{tab:interfaces}
 \end{table}%
 
-Various perceptual audio interfaces are already available, see Table \ref{tab:interfaces}.
-Many are MATLAB-based, useful for easily processing and visualising the data produced by the listening tests, but requiring the application to be installed to run or - in the case of an executable created with MATLAB - at least create the test.
-Furthermore, compatibility is limited across different versions of MATLAB.
+Various listening test design tools are already available, see Table \ref{tab:interfaces}. A few other listening test tools, such as OPAQUE \cite{opaque} and GuineaPig \cite{guineapig}, are described but not available to the public at the time of writing.
+
+Many are MATLAB-based, useful for easily processing and visualising the data produced by the listening tests, but requiring MATLAB to be installed to run or - in the case of an executable created with MATLAB - at least create the test.
+Furthermore, compatibility is usually limited across different versions of MATLAB.
 Similarly, Max requires little or no programming background but it is proprietary software as well, which is especially undesirable when tests need to be deployed at different sites.
 More recently, BeaqleJS \cite{beaqlejs} makes use of the HTML5 audio capabilities and comes with a number of predefined, established test interfaces such as ABX and MUSHRA \cite{mushra}.
 %
-Another listening test tool, GuineaPig \cite{guineapig}, is not available to the public at the time of writing.
+
+A browser-based perceptual evaluation tool for audio has a number of advantages. First of all, it doesn't need any other software than a browser, meaning deployment is very easy and cheap. As such, it can also run on a variety of devices and platforms. The test can be hosted on a central server with subjects all over the world, who can simply go to a webpage. This means that multiple participants can take the test simultaneously, potentially in their usual listening environment if this is beneficial for the test. Naturally, the constraints on the listening environment and other variables still need to be controlled if they are important to the experiment. Depending on the requirements a survey or a variety of tests preceding the experiment could establish whether remote participants and their environments are adequate for the experiment at hand.
+
+The Web Audio API is a high-level JavaScript Application Programming Interface (API) designed for real-time processing of audio inside the browser through various processing nodes\footnote{http://webaudio.github.io/web-audio-api/}. Various web sites have used the Web Audio API for creative purposes, such as drum machines and score creation tools\footnote{http://webaudio.github.io/demo-list/},
+others from the list show real-time captured audio processing such as room reverberation tools and a phase vocoder from the system microphone. The BBC Radiophonic Workshop shows effects used on famous TV shows such as Doctor Who, being simulated inside the browser\footnote{http://webaudio.prototyping.bbc.co.uk/}.
+Another example is the BBC R\&D personalised compressor which applies a dynamic range compressor on a radio station that dynamically adjusts the compressor settings to match the listener's environment \cite{mason2015compression}.
+
+
 % [How is this one different from all these?] improve
 % FLEXIBLE (reference (not) appropriate)
-
-Furthermore, the option to provide free-text comment fields allows for tests with individual vocabulary methods, as opposed to only allowing quantitative scales associated to a fixed set of descriptors.
-
-
-% ENVIRONMENT
-There are a number of advantages to building a web audio based listening test environment. The ability to easily deploy a flexible and scalable testing environment that requires no proprietary software to run makes the web audio evaluation tool a very flexible testing tool. The ability to host a single test server and create multiple clients not only allows multiple participants to be involved in a trial simultaneously, but also permits participants to be located anywhere in the world. There are also less user experience issues, since all users should have some experience with using existing web technologies.
-
+In contrast with the tools listed above, we aim to provide an environment in which a variety of multi-stimulus tests can be designed, with a wide range of configurability, while keeping setup and collecting results as straightforward as possible. For instance, the option to provide free-text comment fields allows for tests with individual vocabulary methods, as opposed to only allowing quantitative scales associated to a fixed set of descriptors.
 
 % EASE OF USE: no need to go in the code
 To make the tool accessible to a wide range of researchers, we aim to offer maximum functionality even to those with little or no programming background. The tool we present can set up a listening test without reading or adjusting any code, provided no new types of interfaces need to be created.
-We present a browser-based perceptual evaluation tool from which any kind of multiple stimulus audio evaluation tool where subjects need to rank, rate, select, or comment on different audio samples can be built.
 %In this paper, we provide a listening test back end that allows for easy set up of a wide variety of listening tests, highly flexible yet very simple and not requiring any programming skills.
-The Web Audio API is a high-level JavaScript Application Programming Interface (API) designed for real-time processing of audio inside the browser through various processing nodes\footnote{http://webaudio.github.io/web-audio-api/}. Various web sites have used the Web Audio API for either creative purposes, such as drum machines and score creation tools\footnote{http://webaudio.github.io/demo-list/},
-others from the list show real-time captured audio processing such as room reverberation tools and a phase vocoder from the system microphone. The BBC Radiophonic Workshop shows effects used on famous TV shows such as Doctor Who, being simulated inside the browser\footnote{http://webaudio.prototyping.bbc.co.uk/}.
-Another example is the BBC R\&D personalised compressor which applies a dynamic range compressor on a radio station that dynamically adjusts the compressor settings to match the listener's environment \cite{mason2015compression}.
-
-We present a browser-based perceptual evaluation tool from which any kind of multiple stimulus audio evaluation tool where subjects need to rank, rate, select, or comment on different audio samples can be built.
+% ENVIRONMENT
 %In this paper, we provide a listening test back end that allows for easy set up of a wide variety of listening tests, highly flexible yet very simple and not requiring any programming skills.
+Specifically, we present a browser-based perceptual evaluation tool from which any kind of multiple stimulus audio evaluation tool where subjects need to rank, rate, select, or comment on different audio samples can be built. We also include an example of the multiple stimulus user interface included with the APE tool \cite{deman2014b}, which presents the subject with a number of axes on which a number of markers, corresponding to audio samples, can be moved to reflect any subjective quality, as well as corresponding comment boxes. However, other graphical user interfaces can be put on top of the engine that we provide with minimal or no modifications.
 Examples of this are the MUSHRA test \cite{mushra}, single or multiple stimulus evaluation with a two-dimensional interface (such as valence and arousal dimensions), or simple annotation (using free-form text, check boxes, radio buttons or drop-down menus) of one or more audio samples at a time.
-In some cases, such as method of adjustment, where the audio is processed by the user \cite{bech}, or AB test \cite{bech}, where the interface does not show all audio samples to be evaluated at once, the back end of the tool needs to be modified as well.
-
-There are a number of advantages to building a web audio based listening test environment. The ability to easily deploy a flexible and scalable testing environment that requires no proprietary software to run makes the web audio evaluation tool a very flexible testing tool. The ability to host a single test server not only allows multiple participants to be involved in a trial simultaneously, but also permits participants to be located anywhere in the world. There are also less user experience issues, since all users should have some experience with using existing web technologies.
-
+In some cases, such as method of adjustment, where the audio is processed by the user, or AB test, where the interface does not show all audio samples to be evaluated at once \cite{bech}, the back end of the tool needs to be modified as well.
 
 In the following sections, we describe the included interface in more detail, discuss the implementation, and cover considerations that were made in the design process of this tool.
 
@@ -229,10 +225,11 @@
 
 \section{Interface}\label{sec:interface}
 
-At this point, we have implemented the interface of the MATLAB-based APE (Audio Perceptual Evaluation) toolbox \cite{deman2014b}. This shows one marker for each simultaneously evaluated audio fragment on one or more horizontal axes, that can be moved to rate or rank the respective fragments in terms of any subjective quality, as well as a comment box for every marker, and any extra text boxes for extra comments.
-The reason for such an interface, where all stimuli are presented on a single rating axis (or multiple axes if multiple subjective qualities need to be evaluated), is that it urges the subject to consider the rating and/or ranking of the stimuli relative to one another, as opposed to comparing each individual stimulus to a given reference, as is the case with e.g. a MUSHRA test \cite{mushra}. See Figure \ref{fig:interface} for an example of the interface, with eleven fragments and one axis. %? change if a new interface is shown
+At this point, we have implemented the interface of the MATLAB-based APE (Audio Perceptual Evaluation) toolbox \cite{deman2014b}. This shows one marker for each simultaneously evaluated audio fragment on one or more horizontal axes, that can be moved to rate or rank the respective fragments in terms of any subjective property, as well as a comment box for every marker, and any extra text boxes for extra comments.
+The reason for such an interface, where all stimuli are presented on a single rating axis (or multiple axes if multiple subjective qualities need to be evaluated), is that it urges the subject to consider the rating and/or ranking of the stimuli relative to one another, as opposed to comparing each individual stimulus to a given reference, as is the case with e.g. a MUSHRA test \cite{mushra}. As such, it is ideal for any type of test where the goal is to carefully compare samples against each other, like perceptual evaluation of different mixes of music recordings \cite{deman2015a} or sound synthesis models \cite{durr2015implementation}, as opposed to comparing results of source separation algorithms \cite{mushram} or audio with lower data rate \cite{mushra} to a high quality reference signal.
+See Figure \ref{fig:interface} for an example of the interface, with eleven fragments and one axis. %? change if a new interface is shown
 
-However, the back end of this test environment allows for many more established and novel interfaces for listening tests, particularly ones where the subject only assesses audio without manipulating it (i.e. method of adjustment), which would require additional features to be implemented.
+%For instance, the option to provide free-text comment fields allows for tests with individual vocabulary methods, as opposed to only allowing quantitative scales associated to a fixed set of descriptors.
 
 \begin{figure*}[ht]
 \begin{center}
@@ -383,7 +380,7 @@
 
 Further work may include the development of other common test designs, such as MUSHRA \cite{mushra}, 2D valence and arousal rating, and others. We will add functionality to assist with setting up large-scale tests with remote subjects, so this becomes straightforward and intuitive. In addition, we will keep on improving and expanding the tool, and highly welcome feedback and contributions from the community.
 
-The source code of this tool can be found on \url{code.soundsoftware.ac.uk/projects/webaudioevaluationtool}.
+The source code of this tool can be found on \\ \texttt{code.soundsoftware.ac.uk/projects/}\\ \texttt{webaudioevaluationtool}.
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
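Note: the Web Audio API paragraph added in this changeset describes building real-time audio graphs from processing nodes in the browser. The sketch below is a minimal, generic illustration of that node-graph model (source node routed through a gain node to the output), not code from the Web Audio Evaluation Tool itself; the fragment URL and gain value are placeholder assumptions, written here in TypeScript.

    // Minimal Web Audio API sketch: decode one audio fragment and play it
    // through a gain node, roughly as a listening test page might do per stimulus.
    async function playFragment(url: string, volume: number): Promise<void> {
      const ctx = new AudioContext();
      // Fetch and decode the stimulus into an AudioBuffer.
      const encoded = await (await fetch(url)).arrayBuffer();
      const buffer = await ctx.decodeAudioData(encoded);
      // Node graph: buffer source -> gain -> destination (the speakers).
      const source = ctx.createBufferSource();
      source.buffer = buffer;
      const gain = ctx.createGain();
      gain.gain.value = volume; // e.g. attenuate a fragment for level matching
      source.connect(gain);
      gain.connect(ctx.destination);
      source.start(); // begin playback immediately
    }

    // Hypothetical usage: "fragment.wav" is a placeholder stimulus file.
    playFragment("fragment.wav", 0.8);

Larger graphs of the same kind (filters, compressors, analysers) underlie the creative and broadcast examples cited in the diff above.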