comparison docs/WAC2016/WAC2016.tex @ 340:90b9b077475a WAC2016

Added screen shots, edited bib. Some edits to intro, abstract and conclusion.
author Nicholas Jillings <nicholas.jillings@eecs.qmul.ac.uk>
date Thu, 15 Oct 2015 21:17:14 +0100
parents b52911d1df64
children cccb3387d375
% is the number that will appear on the first page PLUS the
% number that will appear in the \additionalauthors section.

\maketitle
\begin{abstract}

Perceptual listening tests are commonplace in audio research and a vital form of evaluation. Many tools exist to run such tests; however, many operate only one test type and are therefore limited, whilst most require proprietary software. Using Web Audio, the Web Audio Evaluation Tool (WAET) addresses these concerns by providing one toolbox which can be configured to run many different tests through a web browser, without needing proprietary software or computer programming knowledge. In this paper, the role of the Web Audio API in providing WAET's key functionality is shown. The paper also highlights less common features available to web-based tools, such as an easy remote testing environment and in-browser analytics.

\end{abstract}


\section{Introduction}

% [...]

\end{table*}
%
%Selling points: remote tests, visualisation, create your own test in the browser, many interfaces, few/no dependencies, flexibility

%[Talking about what we do in the various sections of this paper. Referring to \cite{waet}. ]
To meet the need for a cross-platform, versatile and easy-to-use listening test tool, we previously developed the Web Audio Evaluation Tool \cite{waet}, which at the time of its inception was capable of running a listening test in the browser from an XML configuration file, with one particular interface, and of storing the results as XML. We have now expanded this into a tool with which a wide range of listening test types can easily be constructed and set up remotely, without any need to manually alter code or configuration files, and which allows visualisation of the collected results in the browser. In this paper, we discuss these different aspects and explore possible future improvements.

\begin{figure}[tb]
\centering
\includegraphics[width=.5\textwidth]{interface.png}
\caption{A simple example of a multi-stimulus, single-attribute, single-rating-scale test with a reference stimulus and comment fields.}
% [...]

Although WAET uses a sparse subset of the Web Audio API functionality, its performance comes directly from using it. Listening tests can convey large amounts of information beyond the perceptual relationship between the audio fragments. With WAET it is possible to record which parts of the audio fragments were listened to and when, at what point in the audio stream the participant switched to a different fragment, and how a fragment's rating was adjusted over time within a session, to name a few. Not only does this allow evaluation of a wealth of perceptual aspects, but it also helps detect poor participants whose results are potentially not representative.
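As an illustration of how such metrics can be gathered (a minimal sketch, not WAET's actual implementation), playback events can be timestamped against the Web Audio clock, from which listening patterns and fragment switches are reconstructed afterwards:

\begin{verbatim}
// Illustrative sketch only: timestamp listening events
// against the Web Audio clock for later analysis.
var audioContext = new AudioContext();
var sessionLog = [];

function logEvent(fragmentId, action) {
    sessionLog.push({
        fragment: fragmentId,           // which fragment
        action: action,                 // e.g. 'play', 'stop', 'rate'
        time: audioContext.currentTime  // session-wide clock
    });
}
// A switch from fragment A to B then appears in the log as
// a 'stop' on A immediately followed by a 'play' on B.
\end{verbatim}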

One of the key initial design parameters for WAET was to make the tool as open as possible to non-programmers, and to this end all of the user-modifiable options are included in a single XML document. This document is called the specification document, and it can be designed either by writing the XML manually (or modifying an existing document or template) or by using our included test creators. These are standalone HTML pages which do not require any server or internet connection and help the user build the test specification document. The first (test\_create.html) is for simpler tests and guides the user step by step. It supports adding media through drag and drop and offers a clutter-free interface. The advanced version is for more complex tests where raw XML manipulation is not wanted but the same freedom is required (whilst keeping a safety net). Both tools support automatic verification to ensure the XML file is valid, highlighting entries which are either incorrect and would cause an error, or options which should be removed because they are blank.
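For illustration, a minimal specification document could take the following shape. The audioholder/audioelement nesting follows the structure described below; the remaining element and attribute names are simplified stand-ins rather than the tool's exact schema:

\begin{verbatim}
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical sketch; names are illustrative,
     not the exact WAET schema. -->
<waet>
  <setup interface="APE" returnURL="save.php"/>
  <audioholder id="page-1" randomiseOrder="true">
    <audioelement id="A" url="media/fragment-A.wav"/>
    <audioelement id="B" url="media/fragment-B.wav"/>
  </audioholder>
</waet>
\end{verbatim}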

The basic test creator, shown in Figure \ref{fig:test_create}, utilises the Web Audio API to perform quick playback checks and also provides loudness normalisation techniques inspired by \cite{ape}. These are calculated offline by accessing the raw audio samples exposed from the buffer, before being applied to the audio element as a gain attribute (a sketch follows below). This allows the test to perform loudness normalisation without needing to edit any audio files. Equally, the gain can be modified in either editor using an HTML5 slider or number box.

\begin{figure}[h!]
\centering
\includegraphics[width=.45\textwidth]{test_create_2.png}
\caption{Screenshot of the test creator tool, using drag and drop to build the specification document.}
\label{fig:test_create}
\end{figure}
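To sketch the idea (assuming a simple RMS-based measure here, purely for illustration, rather than whichever measure the tool actually implements), the decoded buffer's samples are scanned offline and the correction is kept as a gain value, so no audio file is ever edited:

\begin{verbatim}
// Illustrative sketch: derive a normalisation gain from the
// decoded samples; applied later as the element's gain attribute.
function normalisationGain(buffer, targetRms) {
    var samples = buffer.getChannelData(0); // raw samples, channel 0
    var sum = 0;
    for (var i = 0; i < samples.length; i++) {
        sum += samples[i] * samples[i];
    }
    var rms = Math.sqrt(sum / samples.length);
    return rms > 0 ? targetRms / rms : 1;
}
\end{verbatim}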

%Describe and/or visualise audioholder-audioelement-... structure.
The specification document contains the URL of the audio fragments for each test page. These fragments are downloaded asynchronously during the test and decoded offline by the Web Audio decoder. The resulting buffers are assigned to custom Audio Object nodes, each of which tracks the fragment buffer, the playback bufferSourceNode, the XML information including its unique test ID, the interface object(s) associated with the fragment, and any metric or data collection objects. The Audio Objects are controlled by an over-arching custom Audio Context node (not to be confused with the Web Audio context). This parent JS node allows session-wide control of the Audio Objects, including starting and stopping playback of specific nodes.
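A minimal sketch of this loading stage (simplified from the description above; the Audio Object is reduced here to a plain object):

\begin{verbatim}
// Illustrative sketch: fetch a fragment asynchronously, decode
// it with the Web Audio decoder, and wrap it in an "Audio Object".
var audioContext = new AudioContext();

function loadFragment(url, id) {
    return fetch(url)
        .then(function (response) { return response.arrayBuffer(); })
        .then(function (data) {
            return new Promise(function (resolve, reject) {
                audioContext.decodeAudioData(data, resolve, reject);
            });
        })
        .then(function (buffer) {
            return { id: id, buffer: buffer, source: null };
        });
}
\end{verbatim}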

The only issue with this model is the bufferSourceNode in the Web Audio API, which the standard defines as a `use once' object: once a bufferSourceNode has been played it must be discarded, as it cannot be instructed to play its buffer again. Therefore, on each start request a new bufferSourceNode must be created and linked with the stored buffer. This is odd behaviour for such a simple object, and the only alternative is the HTML5 audio element, which cannot be started synchronously at a given time and is therefore not suited.
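A sketch of the resulting playback routine (again simplified; \texttt{audioContext} and \texttt{audioObject} are as in the loading sketch above):

\begin{verbatim}
// Illustrative sketch: bufferSourceNodes are single-use, so a new
// one is created for every start request and fed the stored buffer.
function playFragment(audioObject, when) {
    var source = audioContext.createBufferSource();
    source.buffer = audioObject.buffer;       // reusable decoded data
    source.connect(audioContext.destination);
    source.start(when);                       // synchronous start time
    audioObject.source = source;              // handle for stop()
}
\end{verbatim}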
% [...]

%%%% \item 2 point Scale - Better or Worse - (not sure how to default this - they default everything to better, which is an interesting choice)
%%%% \end{itemize}
%%%% \end{itemize}

% Build your own test

\begin{comment}
{ \bf A screenshot would be nice.

Established tests (see below) included as `presets' in the build-your-own-test page. }
\end{comment}
% [...]

\section{Concluding remarks and future work}
\label{sec:conclusion}

The code and documentation can be pulled or downloaded from our online repository, available at \url{code.soundsoftware.ac.uk/projects/webaudioevaluationtool}.

\cite{schoeffler2015mushra} gives a `checklist' for subjective evaluation of audio systems. The Web Audio Evaluation Tool meets most of its requirements, including remote testing, crossfading between audio streams, collecting browser information, utilising UI elements, and working with various audio formats including uncompressed PCM (WAV).
% remote
% language support (not explicitly stated)
% crossfades
% choosing speakers/sound device from within browser? --- NOT POSSIBLE, can only determine channel output counts and its up to the hardware to determine
% collect information about software and sound system
% buttons, scales, ... UI elements
% must be able to load uncompressed PCM

The use of the Web Audio API is therefore key to WAET meeting these and other requirements for performing perceptual evaluation tests. Combined with the power of the HTML DOM environment, which gives the ability to interact with all on-page elements, this creates a powerful and flexible tool capable of performing a multitude of tests out of the box, whilst further tests can easily be built on top of the framework provided.
\begin{comment}
[What can we not do? `Method of adjustment', as in \cite{schoeffler2015mushra} is another can of worms, because, like, you could adjust lots of things (volume is just one of them, that could be done quite easily). Same for using input signals like the participant's voice. Either leave out, or mention this requires modification of the code we provide.]
\end{comment}

%
% The following two commands are all you need in the
% initial runs of your .tex file to
% produce the bibliography for the citations in your paper.
\bibliographystyle{ieeetr}
\bibliography{WAC2016} % WAC2016.bib is the name of the Bibliography in this case
% You must have a proper ".bib" file
% and remember to run:
% latex bibtex latex latex
% to resolve all references